Dropping rows and columns is a common data preparation step: it lets you filter a dataset down to the parts you actually need. When we are working on a data science task, we usually store the data in a Pandas DataFrame, so if you don’t know how to delete rows and columns from a DataFrame, this article is for you. In this article, I will walk you through how to drop rows and columns from a Pandas DataFrame in Python.
Drop Rows and Columns of a Pandas DataFrame
To delete rows or columns of a DataFrame, the drop() method in pandas is used. Anything you delete is removed only from the DataFrame in memory and not from the original dataset it was loaded from. Below is the initial dataset that I will be using for this tutorial:
import pandas as pd
data = pd.read_csv("https://raw.githubusercontent.com/amankharwal/Website-data/master/social.csv")
print(data.head())
   Age  EstimatedSalary  Purchased
0   19            19000          0
1   35            20000          0
2   26            43000          0
3   27            57000          0
4   19            76000          0
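One point worth checking before moving on: by default, drop() returns a new DataFrame and leaves the one it was called on unchanged, which is why the result has to be assigned to a variable. Below is a minimal sketch of that behaviour. The small hand-built DataFrame named sample is an assumption that stands in for the downloaded dataset, so the example runs without a network connection.

```python
import pandas as pd

# Hypothetical stand-in for the tutorial's dataset (first three rows only)
sample = pd.DataFrame({
    "Age": [19, 35, 26],
    "EstimatedSalary": [19000, 20000, 43000],
    "Purchased": [0, 0, 0],
})

# drop() returns a new DataFrame; `sample` itself is not modified
trimmed = sample.drop([0])

print(len(sample))   # the original still has all three rows
print(len(trimmed))  # the returned DataFrame has one row fewer
```

If you do not assign the result, the dropped rows will still be there the next time you look at the DataFrame.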
Now let’s see how to drop rows and columns from a dataset step by step.
Dropping Rows
If you want to drop all the rows containing missing values, then you can use the dropna() method instead of the drop() method:
data = data.dropna() #dropping null values
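By default, dropna() removes every row that has at least one missing value, which can be stricter than you want. As a hedged sketch, the example below shows the how and subset parameters of dropna() on a made-up DataFrame called people. The data is an assumption for illustration only and is not taken from the tutorial's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps, invented for this example
people = pd.DataFrame({
    "Age": [19, np.nan, 26, np.nan],
    "EstimatedSalary": [19000, 20000, np.nan, np.nan],
})

# Default: drop rows with at least one missing value
any_missing = people.dropna()

# how="all": drop a row only if every value in it is missing
all_missing = people.dropna(how="all")

# subset: only look for missing values in the listed columns
age_missing = people.dropna(subset=["Age"])

print(any_missing.index.tolist())
print(all_missing.index.tolist())
print(age_missing.index.tolist())
```

Which variant fits depends on whether a partially filled row is still useful for your analysis.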
If you want to drop some particular rows, then you can write their indexes inside a list as shown below:
data = data.drop([0, 1])  # dropping rows by index
print(data.head())  # the new transformed data will start from index no. 2
   Age  EstimatedSalary  Purchased
2   26            43000          0
3   27            57000          0
4   19            76000          0
5   27            58000          0
6   27            84000          0
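As the output shows, the remaining rows keep their original index labels after a drop. Two follow-ups that often come up are renumbering the index and dropping rows by a condition instead of by hard-coded labels. The sketch below illustrates both, assuming a small hand-built DataFrame named rows. The Age threshold of 20 is an arbitrary value chosen for the example.

```python
import pandas as pd

# Hypothetical stand-in for the first five rows of the tutorial's dataset
rows = pd.DataFrame({
    "Age": [19, 35, 26, 27, 19],
    "EstimatedSalary": [19000, 20000, 43000, 57000, 76000],
})

# Drop the first two rows, then renumber the index starting from 0
renumbered = rows.drop([0, 1]).reset_index(drop=True)

# Drop rows by condition: select the matching rows and pass their index labels
without_under_20 = rows.drop(rows[rows["Age"] < 20].index)

print(renumbered)
print(without_under_20)
```

Passing drop=True to reset_index() discards the old index instead of keeping it as a new column.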
Dropping Columns
Below is how you can drop a single column from a Pandas DataFrame:
data = data.drop(columns="Purchased")  # dropping a single column (axis=1 is not needed with columns=)
If you want to drop multiple columns from a DataFrame, then write all those columns inside a list as shown below:
data = data.drop(columns=["EstimatedSalary", "Age"])  # dropping multiple columns (axis=1 is not needed with columns=)
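There are two equivalent ways to tell drop() that the labels are column names: the columns keyword, or positional labels combined with axis=1. The sketch below compares them and also shows the errors="ignore" option, which skips labels that do not exist instead of raising a KeyError. The DataFrame named table and the missing "Gender" column are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical stand-in for the tutorial's dataset
table = pd.DataFrame({
    "Age": [19, 35],
    "EstimatedSalary": [19000, 20000],
    "Purchased": [0, 0],
})

# Two equivalent spellings for dropping a column
by_keyword = table.drop(columns="Purchased")
by_axis = table.drop("Purchased", axis=1)

# "Gender" is not in the table; errors="ignore" skips it silently
tolerant = table.drop(columns=["Purchased", "Gender"], errors="ignore")

print(by_keyword.equals(by_axis))
print(tolerant.columns.tolist())
```

The columns keyword is usually easier to read, because the intent is visible without remembering which axis number refers to columns.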
Summary
So this is how you can drop unnecessary rows and columns from a dataset using Python: dropna() removes rows with missing values, and drop() removes specific rows or columns. Both are useful in data preparation when you want to filter a dataset down to what matters. I hope you liked this article on dropping rows and columns of a Pandas DataFrame in Python. Feel free to ask your valuable questions in the comments section below.