Pandas Profiling in Python

Pandas Profiling is an open-source Python library for quick exploratory data analysis in a few lines of code. Exploratory data analysis is an important step in any data science task, and this library saves a lot of time when you explore a new dataset. If you've never used the Pandas Profiling library, this article is for you. In this article, I will present a tutorial on the Pandas Profiling library in Python.

Pandas Profiling

It is important to explore the data you are working with in any data science task. This process is called exploratory data analysis. It is usually done with data visualization tools like Tableau and Google Data Studio, or with Python libraries like Matplotlib, Seaborn, and Plotly. Exploring a dataset by hand takes a long time, and this is where the Pandas Profiling library in Python comes in. It helps you explore your entire dataset in just a few lines of code. Besides exploratory data analysis, you can also use this library to create reports, as it provides built-in functions for generating and saving reports of your analysis.
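To make the time saving concrete, here is a minimal sketch of the kind of manual checks you would otherwise write yourself with plain pandas. The toy DataFrame and the manual_overview helper are hypothetical and only for illustration; they are not part of the Pandas Profiling library.

```python
import pandas as pd

# Hypothetical toy data standing in for a real dataset.
data = pd.DataFrame({
    "price": [199.0, 349.0, None, 999.0],
    "ram_gb": [2, 4, 4, 8],
    "brand": ["A", "B", "B", "C"],
})

def manual_overview(df: pd.DataFrame) -> pd.DataFrame:
    """Collect the type, missing-value count, and distinct count of each column."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "distinct": df.nunique(),  # NaN is not counted as a distinct value
    })

overview = manual_overview(data)
print(overview)

# Summary statistics for the numeric columns.
print(data.describe())
```

Every extra check (correlations, duplicates, histograms) means more code like this, which is the work a profiling report does for you in one call.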

I hope you now understand what the Pandas Profiling library is and why it is used. In the section below, I will take you through a tutorial on how to use this library for exploratory data analysis using Python.

Pandas Profiling in Python (Tutorial)

If you have never used this Python library before, you can install it by running the pip command below in your terminal or command prompt:

  • pip install pandas-profiling
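If you are not sure whether the installation worked, a quick check like the sketch below can help. It only looks for the package and does not import it. Note that newer releases of this project are published under the name ydata-profiling (import name ydata_profiling), so the sketch looks for both names. The find_profiling_module helper is a hypothetical name used only for illustration.

```python
import importlib.util

# Import names to look for, newest first.
# "ydata_profiling" is the newer name of the same project.
CANDIDATES = ("ydata_profiling", "pandas_profiling")

def find_profiling_module():
    """Return the first installed import name, or None if neither is installed."""
    for name in CANDIDATES:
        if importlib.util.find_spec(name) is not None:
            return name
    return None

module_name = find_profiling_module()
if module_name is None:
    print("Not installed yet; run the pip command first.")
else:
    print(f"Found {module_name}")
```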

You can use this Python library in any code editor, but I recommend a Jupyter or Google Colab notebook, as the reports generated by this library are displayed inline there and are easier to read.

Now let’s see how you can use this Python library to explore your dataset. For this task, I will first import the necessary Python libraries and the dataset that we want to explore:

import pandas as pd
from pandas_profiling import ProfileReport

# Load the mobile prices dataset directly from GitHub
data = pd.read_csv("https://raw.githubusercontent.com/amankharwal/Website-data/master/mobile_prices.csv")

And now, you just need to write these two lines of code, and you will see the complete exploratory data analysis of your dataset:

profile = ProfileReport(data, title="Pandas Profiling Report", explorative=True)
profile
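You will see a complete report in the output, with an overview of the dataset, statistics and distributions for each variable, correlations, and a summary of missing values.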

To save the report of your exploratory data analysis as an HTML file, you just need to execute the code mentioned below:

profile.to_file("your_report.html")
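If you generate reports often, you may want to keep them together in one folder. The sketch below wraps the to_file call shown above in a small helper. The save_report function, the "reports" folder, and the parameter names are all hypothetical choices for illustration; the only library call it relies on is to_file.

```python
from pathlib import Path

def save_report(profile, output_dir="reports", name="your_report", suffix=".html"):
    """Write a report into output_dir and return the path of the saved file.

    `profile` is assumed to be a ProfileReport like the one created above.
    """
    folder = Path(output_dir)
    folder.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    target = folder / f"{name}{suffix}"
    profile.to_file(target)
    return target
```

With the profile object from this tutorial, calling save_report(profile) would write reports/your_report.html. As far as I know, to_file picks the output format from the file extension, so check the documentation of your installed version before passing a suffix other than .html.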

Summary

So this is how you can use the Pandas Profiling library in Python for faster exploratory data analysis. In simple words, this Python library helps you explore your complete dataset in just a few lines of code. I hope you liked this tutorial on the Pandas Profiling library in Python. Feel free to ask your questions in the comments section below.

Aman Kharwal