Drop Rows and Columns of a Pandas DataFrame in Python

Dropping rows and columns is a common step in data preparation: it removes the data you don't need so that the rest is easier to filter and understand. In a data science task, the data is usually stored in a Pandas DataFrame, so if you don't know how to delete rows and columns from a DataFrame, this article is for you. In this article, I will walk you through how to drop rows and columns from a Pandas DataFrame in Python.

Drop Rows and Columns of a Pandas DataFrame

To delete rows or columns of a DataFrame, use the drop() method in pandas. Anything you drop is removed only from the DataFrame in memory; the original dataset (here, the CSV file the data was read from) is not changed. Below is the initial dataset that I will be using for this tutorial:

import pandas as pd
data = pd.read_csv("https://raw.githubusercontent.com/amankharwal/Website-data/master/social.csv")
print(data.head())
   Age  EstimatedSalary  Purchased
0   19            19000          0
1   35            20000          0
2   26            43000          0
3   27            57000          0
4   19            76000          0
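
As a minimal sketch of how drop() behaves: it returns a new DataFrame rather than modifying the one it is called on, which is why the examples in this tutorial assign the result back to data. The frame below is a hypothetical stand-in (the name sample and the three rows copied from the output above are for illustration only), so the sketch runs without downloading the CSV:

```python
import pandas as pd

# Hypothetical stand-in for the tutorial dataset (first three rows only)
sample = pd.DataFrame({
    "Age": [19, 35, 26],
    "EstimatedSalary": [19000, 20000, 43000],
    "Purchased": [0, 0, 0],
})

# drop() returns a new DataFrame; `sample` itself is left unchanged
trimmed = sample.drop(columns="Purchased")

print(list(sample.columns))   # still includes "Purchased"
print(list(trimmed.columns))  # "Purchased" is gone
```

If you do not assign the result to a variable, the dropped version is simply discarded.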

Now let’s see how to drop rows and columns from a dataset step by step.

Dropping Rows

If you want to drop all the rows that contain missing values, use the dropna() method instead of the drop() method:

data = data.dropna()  # drop every row that has at least one missing value
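
The dataset used in this tutorial may not contain any missing values, so here is a small sketch on hypothetical data (the frame sample and its NaN entries are made up for illustration) showing what dropna() removes by default and how its subset parameter limits the check to chosen columns:

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values, for illustration only
sample = pd.DataFrame({
    "Age": [19, np.nan, 26],
    "EstimatedSalary": [19000, 20000, np.nan],
    "Purchased": [0, 0, 1],
})

# Default: drop every row that has at least one missing value
complete_rows = sample.dropna()

# Only look at the "Age" column when deciding which rows to drop
age_known = sample.dropna(subset=["Age"])

print(complete_rows.index.tolist())  # [0]
print(age_known.index.tolist())      # [0, 2]
```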

If you want to drop particular rows, pass their index labels in a list as shown below:

data = data.drop([0, 1])  # drop the rows labelled 0 and 1
print(data.head())  # the remaining rows now start at index 2
   Age  EstimatedSalary  Purchased
2   26            43000          0
3   27            57000          0
4   19            76000          0
5   27            58000          0
6   27            84000          0
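
Two details of the output above are worth a short sketch: drop() matches index labels rather than positions, and the remaining rows keep their old labels unless you renumber them. The example below uses a hypothetical four-row frame (sample, with values copied from the first rows of the dataset) and also shows one possible way to drop rows that meet a condition by passing their labels to drop():

```python
import pandas as pd

# Hypothetical stand-in for the tutorial dataset (first four rows only)
sample = pd.DataFrame({
    "Age": [19, 35, 26, 27],
    "EstimatedSalary": [19000, 20000, 43000, 57000],
})

# drop() matches index labels, not positions
remaining = sample.drop([0, 1])
print(remaining.index.tolist())   # [2, 3]

# Optional: renumber the rows from 0 after dropping
renumbered = remaining.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1]

# Drop rows that meet a condition by passing their labels to drop()
over_25 = sample.drop(sample[sample["Age"] <= 25].index)
print(over_25.index.tolist())     # [1, 2, 3]
```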

Dropping Columns

Below is how you can drop a single column from a Pandas DataFrame:

data = data.drop(columns="Purchased")  # drop a single column (axis=1 is not needed with columns=)

If you want to drop multiple columns from a DataFrame, then write all those columns inside a list as shown below:

data = data.drop(columns=["EstimatedSalary", "Age"])  # drop multiple columns
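
As a brief sketch of the options for dropping columns, the example below uses a hypothetical two-row frame (sample) to show that the columns= keyword and the positional form with axis=1 give the same result, and that errors="ignore" skips labels that are not present instead of raising a KeyError. The column name "NotAColumn" is made up for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the tutorial dataset (first two rows only)
sample = pd.DataFrame({
    "Age": [19, 35],
    "EstimatedSalary": [19000, 20000],
    "Purchased": [0, 0],
})

# Two equivalent spellings for dropping one column
by_keyword = sample.drop(columns="Purchased")
by_axis = sample.drop("Purchased", axis=1)
print(by_keyword.equals(by_axis))  # True

# errors="ignore" skips labels that do not exist instead of raising KeyError
safe = sample.drop(columns=["Purchased", "NotAColumn"], errors="ignore")
print(list(safe.columns))  # ['Age', 'EstimatedSalary']
```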

Summary

So this is how you can drop unnecessary rows and columns from a dataset using Python. Dropping them is part of data preparation and leaves you with only the data you need for analysis. I hope you liked this article on dropping rows and columns of a Pandas DataFrame in Python. Feel free to ask your questions in the comments section below.

Aman Kharwal