There are certain snippets of code that come up again and again across everyday tasks. For example, whenever I work on a sentiment analysis task, I use the same script to clean the text column properly. That doesn't mean you should never write new logic for a new task; it simply means you can reuse a script on the same kind of problem where it has already given you the desired results. So if you are looking for Python scripts you can reuse while solving real problems, this article is for you. In this article, I will take you through five useful Python scripts that you can use in your projects.
5 Useful Python Scripts
Below are all the useful Python scripts that you will learn about in this article:
- removing duplicates
- text cleaning
- web scraping
- converting image to an array
- annotating graphs
Now let's go through all these useful Python scripts one by one.
Removing Duplicates
Suppose you have a list of names that contains duplicates. In most problems, you should remove the duplicates before doing anything else. Here is a Python function you can use to remove duplicate values from any list:
def remove(items):
    list1 = []
    for i in items:
        if i not in list1:
            list1.append(i)
    return list1

a = ["Aman", "Akanksha", "Aman", "Shiwangi", "Sajid"]
print(remove(a))
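If preserving the original order matters and you prefer a one-liner, the standard library offers a common idiom: since Python 3.7, `dict` keys keep insertion order, so `dict.fromkeys` can deduplicate a list without an explicit loop.

```python
a = ["Aman", "Akanksha", "Aman", "Shiwangi", "Sajid"]

# dict keys are unique and (since Python 3.7) keep insertion order,
# so this removes duplicates while preserving the original order
unique = list(dict.fromkeys(a))
print(unique)  # ['Aman', 'Akanksha', 'Shiwangi', 'Sajid']
```

Both approaches give the same result; the loop version is easier to extend if you later need custom comparison logic.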
Cleaning Text
Text cleaning is one of the most important steps when working on a data science task. When the data consists of opinions, it is usually full of informal language, URLs, and spelling errors. Here is a Python function that you can apply to the text column of your dataset:
import re
import string
import nltk
from nltk.corpus import stopwords

nltk.download('stopwords')
stemmer = nltk.SnowballStemmer("english")
stopword = set(stopwords.words('english'))

def clean(text):
    text = str(text).lower()
    text = re.sub(r'\[.*?\]', '', text)                               # bracketed text
    text = re.sub(r'https?://\S+|www\.\S+', '', text)                 # URLs
    text = re.sub(r'<.*?>+', '', text)                                # HTML tags
    text = re.sub('[%s]' % re.escape(string.punctuation), '', text)   # punctuation
    text = re.sub(r'\n', '', text)                                    # newlines
    text = re.sub(r'\w*\d\w*', '', text)                              # words containing digits
    text = " ".join(word for word in text.split(' ') if word not in stopword)
    text = " ".join(stemmer.stem(word) for word in text.split(' '))
    return text
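In practice you would apply such a function to an entire text column with pandas' `apply`. The sketch below uses a simplified stand-in cleaner (just lowercasing plus URL and punctuation removal) so it runs without downloading any NLTK data; the `review` column name and the sample sentences are made up for illustration.

```python
import re
import string

import pandas as pd

def basic_clean(text):
    # simplified stand-in for clean(): lowercase, drop URLs and punctuation
    text = str(text).lower()
    text = re.sub(r'https?://\S+|www\.\S+', '', text)
    text = re.sub('[%s]' % re.escape(string.punctuation), '', text)
    return " ".join(text.split())

df = pd.DataFrame({"review": ["GREAT movie!!!", "Visit https://example.com now"]})
df["review"] = df["review"].apply(basic_clean)
print(df["review"].tolist())  # ['great movie', 'visit now']
```

The full `clean()` function plugs into `apply` in exactly the same way once the NLTK resources are downloaded.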
Scraping Tables from a Website
If you want to collect data from a table on a webpage, then this Python script is for you. It also stores the scraped data in a CSV file:
import csv
from urllib.request import urlopen

import pandas as pd
from bs4 import BeautifulSoup

html = urlopen("https://bit.ly/3jpMFRW")
soup = BeautifulSoup(html, "html.parser")
table = soup.find_all("table", {"class": "wikitable"})[0]
rows = table.find_all("tr")

with open("Dataset.csv", "wt+", newline="") as f:
    writer = csv.writer(f)
    for tr in rows:
        writer.writerow([cell.get_text() for cell in tr.find_all(["td", "th"])])

data = pd.read_csv("Dataset.csv")
print(data.head())
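To see the row-extraction logic in isolation, without a network call, here is the same pattern run on a small inline HTML table; the table contents are invented for the example.

```python
from bs4 import BeautifulSoup

html = """
<table class="wikitable">
  <tr><th>Country</th><th>Capital</th></tr>
  <tr><td>France</td><td>Paris</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
# each <tr> becomes one list of cell texts, headers and data cells alike
rows = [[cell.get_text() for cell in tr.find_all(["td", "th"])]
        for tr in soup.find("table").find_all("tr")]
print(rows)  # [['Country', 'Capital'], ['France', 'Paris']]
```

The same nested loop works on any page once you point `find_all` at the right table class.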
Converting Image to an Array
To analyze the features of an image, you first need to convert it into an array. Below is how you can convert any image into an array using Python:
from tensorflow.keras.utils import load_img, img_to_array

img = load_img("image.png")

# convert the loaded PIL image to a NumPy array
data = img_to_array(img)
print(data)
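If Keras is not installed, Pillow and NumPy can do the same conversion. The sketch below builds a tiny in-memory image instead of reading "image.png" from disk, so it is fully self-contained.

```python
import numpy as np
from PIL import Image

# a tiny 2x2 red image as a stand-in for loading "image.png" from disk
img = Image.new("RGB", (2, 2), color=(255, 0, 0))

# np.asarray converts the PIL image to a (height, width, channels) array
data = np.asarray(img)
print(data.shape)  # (2, 2, 3)
```

The resulting array has one row per pixel row and one channel axis, which is the layout most image-processing libraries expect.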
Annotation of Graphs
When plotting data on charts, annotations make it easier to see which feature each data point corresponds to. Here is a simple example of how you can annotate a chart using Python:
import matplotlib.pyplot as plt

x = [3, 5, 7, 5, 4]
y = [5, 3, 4, 5, 2]
labels = ["Jan", "Feb", "Mar", "April", "May"]

plt.scatter(x, y)
for i, label in enumerate(labels):
    plt.annotate(label, (x[i] + 0.10, y[i]), fontsize=10)
plt.show()
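Annotations can also carry arrows, which is handy for calling out a single point. The sketch below highlights one point from the same data (the label text and its offset are arbitrary choices) and renders off-screen with the Agg backend, so it works even without a display.

```python
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

x = [3, 5, 7, 5, 4]
y = [5, 3, 4, 5, 2]

fig, ax = plt.subplots()
ax.scatter(x, y)

# draw an arrow from the label position (xytext) to the point (xy)
ax.annotate("highest value", xy=(7, 4), xytext=(4.5, 4.6),
            arrowprops=dict(arrowstyle="->"), fontsize=10)
fig.savefig("annotated.png")
```

`xy` is the point being annotated and `xytext` is where the label is drawn; `arrowprops` connects the two.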
Summary
So these were some of the Python scripts that you can reuse while solving real problems. Most of them relate to data science, since that is where Python sees the most use today. I hope you liked this article on five useful Python scripts. Feel free to ask your valuable questions in the comments section below.