Box Plot using Python

A box plot is a statistical data visualization technique for analyzing the distribution of the numerical values in a dataset. It shows quartile 1, the median, and quartile 3 of a feature, together with whiskers that reach to the minimum and maximum values (or, when outliers are present, to the most extreme values that are not outliers). In this article, I'll walk you through how to visualize a box plot using the Python programming language.

Box Plot

The box portion of a box plot contains three lines:

  1. the top line represents quartile 3 of the data points, which means that 75% of the data lies below this point;
  2. the middle line represents the median value of the data points, which means that 50% of the data lies below this point;
  3. the bottom line represents quartile 1 of the data points, which means that 25% of the data lies below this point.

The two lines that extend above and below the box are known as whiskers. When there are no outliers, the upper whisker ends at the maximum value and the lower whisker ends at the minimum value. Many plotting libraries, including Plotly by default, stop each whisker at the most extreme value within 1.5 times the interquartile range (quartile 3 minus quartile 1) of the box and draw anything beyond that as individual outlier points.
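To make the parts of a box plot concrete, here is a minimal sketch that computes them by hand with Python's standard library. The sample values are made up for illustration and are not taken from the dataset used later. The `method="inclusive"` setting and the 1.5 times IQR rule for whiskers are assumptions that match a common convention; different tools use slightly different quartile definitions, so the exact numbers can vary.

```python
import statistics

# Hypothetical sample values, made up for illustration (not from the dataset).
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]

# Quartiles, using linear interpolation between neighbouring data points.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Fences: points further than 1.5 * IQR from the box are treated as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

inside = [v for v in values if lower_fence <= v <= upper_fence]
outliers = [v for v in values if v < lower_fence or v > upper_fence]

# Whiskers end at the most extreme values that are still inside the fences.
lower_whisker, upper_whisker = min(inside), max(inside)

print("Q1:", q1, "median:", median, "Q3:", q3)
print("whiskers:", lower_whisker, upper_whisker)
print("outliers:", outliers)
```

For this sample, the box runs from 3.25 to 7.75 with the median at 5.5, the upper whisker stops at 9, and 50 is reported as an outlier instead of stretching the whisker.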

I hope you now understand what a box plot shows you about a numerical feature of a dataset. In the section below, I will take you through how to visualize a box plot using Python.

Box Plot using Python

I will start by importing pandas and loading a dataset that we can use to visualize box plots using Python:

import pandas as pd
data = pd.read_csv("https://raw.githubusercontent.com/amankharwal/Website-data/master/Advertising.csv")
print(data.head())
   Unnamed: 0     TV  Radio  Newspaper  Sales
0           1  230.1   37.8       69.2   22.1
1           2   44.5   39.3       45.1   10.4
2           3   17.2   45.9       69.3    9.3
3           4  151.5   41.3       58.5   18.5
4           5  180.8   10.8       58.4   12.9
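The `Unnamed: 0` column in the output is only a row counter stored in the file. One way to keep it out of the features, sketched here, is to pass `index_col=0` to `pd.read_csv`. So that the example runs offline, it rebuilds the five printed rows as CSV text. The empty first header field is an assumption about how the file is written, inferred from the `Unnamed: 0` label, and the name `sample` is only illustrative.

```python
import io

import pandas as pd

# The five rows printed above, rebuilt as CSV text so the sketch runs offline.
# Assumption: the file's first header field is empty, which pandas shows as "Unnamed: 0".
csv_text = """,TV,Radio,Newspaper,Sales
1,230.1,37.8,69.2,22.1
2,44.5,39.3,45.1,10.4
3,17.2,45.9,69.3,9.3
4,151.5,41.3,58.5,18.5
5,180.8,10.8,58.4,12.9
"""

# index_col=0 uses the first column as the row index instead of a feature.
sample = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(sample.columns.tolist())
print(sample["TV"].describe())
```

The same `index_col=0` argument can be passed when reading the file from its URL.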

Here is how you can visualize a box plot of the TV column using the plotly library:

import plotly.express as px
fig = px.box(data, y="TV")
fig.show()

So this is how you can easily visualize box plots using Python.

Summary

A box plot represents quartile 1, the median, and quartile 3 of a feature, with whiskers that reach to the minimum and maximum values or, when outliers are present, to the most extreme values that are not outliers. Together these help you understand the distribution of the numerical features of a dataset. I hope you liked this article on visualizing a box plot using Python. Feel free to ask your valuable questions in the comments section below.

Aman Kharwal