IPL Analysis with Python

Cricket is an important part of the culture in India and the Indian Premier League (IPL) matches are one of the most important events in India. In this article, I will introduce you to a data science project on IPL analysis with Python.

Data Science Project on IPL Analysis

IPL is a major professional cricket league in India, contested by eight teams representing different cities in India. This IPL analysis task focuses on analyzing the performance of the eight competing IPL teams.


Team performance is visualized graphically using the Plotly library in Python, which makes the results easier to interpret. Visual analysis of performance data can help in selecting players for future matches and provides additional information about player and team profiles.

IPL Analysis with Python

Now let’s start the task of IPL analysis with Python by importing the necessary Python libraries and the dataset:
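The original loading code is not shown here, so the following is only a minimal sketch of this step. It assumes two CSV sources (one row per match, one row per delivery) with column names such as `id`, `season`, `match_id` and `total_runs`; these names and the helper `load_ipl_data` are assumptions for illustration, not taken from the article. The in-memory samples stand in for the real files so the snippet runs on its own.

```python
import io

import pandas as pd


def load_ipl_data(matches_source, deliveries_source):
    """Read the match-level and ball-level CSV sources into DataFrames.

    Both arguments may be file paths or file-like objects.
    """
    matches = pd.read_csv(matches_source)
    deliveries = pd.read_csv(deliveries_source)
    return matches, deliveries


# Tiny in-memory stand-ins for the real CSV files (hypothetical rows).
sample_matches = io.StringIO(
    "id,season,team1,team2,winner\n"
    "1,2008,Mumbai Indians,Chennai Super Kings,Mumbai Indians\n"
    "2,2008,Delhi Daredevils,Deccan Chargers,Deccan Chargers\n"
)
sample_deliveries = io.StringIO(
    "match_id,inning,batting_team,batsman_runs,total_runs\n"
    "1,1,Mumbai Indians,4,4\n"
    "1,1,Mumbai Indians,0,1\n"
    "2,1,Delhi Daredevils,6,6\n"
)

matches, deliveries = load_ipl_data(sample_matches, sample_deliveries)
print(matches.shape, deliveries.shape)
```

With real data, the two arguments would simply be the paths to the downloaded CSV files.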

Delhi Daredevils were renamed Delhi Capitals in December 2018, and Sunrisers Hyderabad replaced Deccan Chargers in 2012 and debuted in 2013. I treat each of these pairs as the same team in this IPL analysis task. Now let’s start with some data preparation:
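One way to treat the renamed and replaced franchises as the same team is to map the old names onto the new ones before any counting. The sketch below assumes the team names appear in columns called `team1`, `team2`, `toss_winner` and `winner`; those column names and the helper `unify_team_names` are assumptions for illustration.

```python
import pandas as pd

# Old franchise name -> name used throughout the analysis.
TEAM_ALIASES = {
    "Delhi Daredevils": "Delhi Capitals",
    "Deccan Chargers": "Sunrisers Hyderabad",
}

# Columns assumed to hold team names; missing ones are skipped.
TEAM_COLUMNS = ["team1", "team2", "toss_winner", "winner"]


def unify_team_names(matches):
    """Return a copy of `matches` with old team names replaced."""
    cleaned = matches.copy()
    for column in TEAM_COLUMNS:
        if column in cleaned.columns:
            cleaned[column] = cleaned[column].replace(TEAM_ALIASES)
    return cleaned


raw = pd.DataFrame(
    {
        "team1": ["Delhi Daredevils", "Mumbai Indians"],
        "team2": ["Deccan Chargers", "Delhi Capitals"],
        "winner": ["Deccan Chargers", "Mumbai Indians"],
    }
)
cleaned = unify_team_names(raw)
print(cleaned)
```

Working on a copy keeps the raw frame intact, which makes it easy to check the mapping against the original names.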

Let’s start by looking at the number of matches played in every season of the IPL:
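A rough sketch of how the per-season match count could be computed and charted, assuming a match-level frame with `id` and `season` columns (assumed names). The plotting helper imports Plotly lazily and is not called here, so the counting step runs even without Plotly installed.

```python
import pandas as pd


def matches_per_season(matches):
    """Count the matches in each season, ordered by season."""
    return matches.groupby("season")["id"].count().rename("matches")


def plot_matches_per_season(counts):
    """Build a Plotly bar chart from the per-season counts."""
    # Imported here so that counting works without Plotly installed.
    import plotly.express as px

    frame = counts.reset_index()  # columns: season, matches
    return px.bar(frame, x="season", y="matches",
                  title="IPL matches in every season")


# Hypothetical sample: two matches in 2008, three in 2009.
sample = pd.DataFrame({"id": [1, 2, 3, 4, 5],
                       "season": [2008, 2008, 2009, 2009, 2009]})
counts = matches_per_season(sample)
print(counts)
```

Calling `plot_matches_per_season(counts).show()` would then display the chart.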

Figure: IPL matches in every season

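The year 2013 has the most matches. The likely reason is the larger league in those years: there were 10 teams in 2011 and 9 in 2012 and 2013, which increased the number of fixtures. Super overs are a less convincing explanation, since a super over is a tie-breaker played within a match and does not add to the match count.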

Matches Played Vs Wins

Now let’s have a look at the number of matches played by each team and the number of matches won by them:
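A minimal sketch of this comparison, assuming each match row names its two sides in `team1` and `team2` and the victor in `winner` (assumed column names). The output columns `Total Matches` and `wins` are chosen to match the names used in the winning-percentage snippet later in this article.

```python
import pandas as pd


def matches_played_and_won(matches):
    """Return one row per team with matches played and matches won."""
    # Every match involves two teams, so stack both columns before counting.
    played = pd.concat([matches["team1"], matches["team2"]]).value_counts()
    wins = matches["winner"].value_counts()
    summary = pd.DataFrame({"Total Matches": played, "wins": wins})
    # Teams without a single win are missing from `wins`; treat that as zero.
    summary["wins"] = summary["wins"].fillna(0).astype(int)
    return summary.sort_values("Total Matches", ascending=False)


# Hypothetical sample of four matches between three teams.
sample = pd.DataFrame(
    {
        "team1": ["MI", "CSK", "MI", "RCB"],
        "team2": ["CSK", "RCB", "RCB", "CSK"],
        "winner": ["MI", "CSK", "MI", "CSK"],
    }
)
matches_played = matches_played_and_won(sample)
print(matches_played)
```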

Figure: total matches vs. wins for each IPL team

Now let’s analyze the winning percentage of all IPL teams:

Figure: winning percentage of IPL teams

So MI, SRH and RCB are the top three teams with the highest winning percentage. Let’s look at the winning percentage of these three teams:

win_percentage = (matches_played['wins'] / matches_played['Total Matches'] * 100).round(1)
win_percentage.sort_values(ascending=False).head(3)
MI     59.1
SRH    53.3
RCB    50.8
dtype: float64

The next step in IPL analysis is to have a look at the venues where the most matches have been played:
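Counting matches per venue can be sketched as follows, assuming the match-level data has a `venue` column (an assumed name); the helper `top_venues` is hypothetical.

```python
import pandas as pd


def top_venues(matches, n=10):
    """Return the n venues that hosted the most matches."""
    return matches["venue"].value_counts().head(n)


# Hypothetical sample: six matches across three venues.
sample = pd.DataFrame(
    {
        "venue": ["Eden Gardens", "Eden Gardens", "Eden Gardens",
                  "Wankhede Stadium", "Wankhede Stadium",
                  "Feroz Shah Kotla"]
    }
)
venues = top_venues(sample, n=2)
print(venues)
```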

Figure: venues with the most IPL matches

So Eden Gardens, M Chinnaswamy, Wankhede and Feroz Shah Kotla are the stadiums with the most matches because most of each season’s eliminators, playoffs and finals were held there. Now let’s have a look at the most preferred decision taken by teams after winning the toss:
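The toss preference can be summarised as the share of each decision, assuming a `toss_decision` column holding values such as `bat` and `field` (both the column name and the values are assumptions about the dataset).

```python
import pandas as pd


def toss_decision_share(matches):
    """Return the percentage of tosses after which each decision was taken."""
    share = matches["toss_decision"].value_counts(normalize=True) * 100
    return share.round(1)


# Hypothetical sample: three captains chose to field, one chose to bat.
sample = pd.DataFrame({"toss_decision": ["field", "field", "bat", "field"]})
decisions = toss_decision_share(sample)
print(decisions)
```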

Figure: decisions taken by teams after winning the toss

Runs Per Season

In this section of IPL analysis with Python, we will analyze the runs per season. Let’s start by looking at the average and total runs of all the seasons:
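A possible way to get total and average runs per season is to attach the season to every delivery and then aggregate. This is a sketch only: it assumes a match-level frame with `id` and `season`, and a ball-level frame with `match_id` and `total_runs` (assumed column names), and it takes "average" to mean runs per match.

```python
import pandas as pd


def runs_per_season(matches, deliveries):
    """Return total runs, match count and average runs per match by season."""
    # Attach the season to every delivery through the match identifier.
    merged = deliveries.merge(matches[["id", "season"]],
                              left_on="match_id", right_on="id")
    by_season = merged.groupby("season")
    summary = pd.DataFrame(
        {
            "total_runs": by_season["total_runs"].sum(),
            "matches": by_season["match_id"].nunique(),
        }
    )
    summary["avg_runs_per_match"] = summary["total_runs"] / summary["matches"]
    return summary


# Hypothetical sample: two matches in 2008 and one in 2009.
sample_matches = pd.DataFrame({"id": [1, 2, 3],
                               "season": [2008, 2008, 2009]})
sample_deliveries = pd.DataFrame(
    {
        "match_id": [1, 1, 2, 2, 2, 3, 3],
        "total_runs": [4, 1, 6, 0, 2, 1, 1],
    }
)
season_runs = runs_per_season(sample_matches, sample_deliveries)
print(season_runs)
```

The resulting frame has one row per season, so it can be passed to a bar or line chart directly.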

Figure: total and average runs in the IPL by season

Now let’s have a look at the distribution of runs over the years, split into three categories: 6s, 4s and the remaining runs:
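The three-way split could be sketched like this, assuming the ball-level data has `batsman_runs` and `total_runs` columns (assumed names). Treating every delivery with 4 or 6 batsman runs as a boundary is an approximation, since a few such deliveries may have been run rather than hit to the rope.

```python
import pandas as pd


def runs_distribution(matches, deliveries):
    """Split each season's runs into sixes, fours and all other runs."""
    merged = deliveries.merge(matches[["id", "season"]],
                              left_on="match_id", right_on="id")
    total = merged.groupby("season")["total_runs"].sum()
    sixes = (merged[merged["batsman_runs"] == 6]
             .groupby("season")["batsman_runs"].sum())
    fours = (merged[merged["batsman_runs"] == 4]
             .groupby("season")["batsman_runs"].sum())
    # Seasons without any six (or four) are missing; count them as zero.
    dist = pd.DataFrame({"total": total, "sixes": sixes,
                         "fours": fours}).fillna(0)
    dist["others"] = dist["total"] - dist["sixes"] - dist["fours"]
    return dist[["sixes", "fours", "others"]].astype(int)


# Hypothetical sample: one match per season.
sample_matches = pd.DataFrame({"id": [1, 2], "season": [2008, 2009]})
sample_deliveries = pd.DataFrame(
    {
        "match_id": [1, 1, 1, 1, 2, 2, 2],
        "batsman_runs": [6, 4, 1, 0, 4, 4, 2],
        "total_runs": [6, 4, 1, 1, 4, 4, 2],
    }
)
distribution = runs_distribution(sample_matches, sample_deliveries)
print(distribution)
```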

Figure: runs distribution in the IPL

We can see just a slight increase in runs from boundaries over the years. Finally, we will look at the highest runs scored by teams over the years:
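One reading of "highest runs scored by teams over the years" is the highest single-innings total in each season. The sketch below follows that reading and assumes ball-level columns `match_id`, `inning`, `batting_team` and `total_runs` (assumed names). It does not filter out super-over innings, which real data might require.

```python
import pandas as pd


def highest_score_per_season(matches, deliveries):
    """Return the highest innings total in each season and the team behind it."""
    # Sum the runs of every innings.
    innings = deliveries.groupby(
        ["match_id", "inning", "batting_team"], as_index=False
    )["total_runs"].sum()
    innings = innings.merge(matches[["id", "season"]],
                            left_on="match_id", right_on="id")
    # Pick the row holding the largest innings total within each season.
    best_rows = innings.groupby("season")["total_runs"].idxmax()
    columns = ["season", "batting_team", "total_runs"]
    return innings.loc[best_rows, columns].reset_index(drop=True)


# Hypothetical sample: one match per season, two innings each.
sample_matches = pd.DataFrame({"id": [1, 2], "season": [2008, 2009]})
sample_deliveries = pd.DataFrame(
    {
        "match_id": [1, 1, 1, 1, 2, 2, 2, 2],
        "inning": [1, 1, 2, 2, 1, 1, 1, 2],
        "batting_team": ["MI", "MI", "CSK", "CSK",
                         "RCB", "RCB", "RCB", "MI"],
        "total_runs": [4, 6, 1, 2, 6, 6, 1, 4],
    }
)
best = highest_score_per_season(sample_matches, sample_deliveries)
print(best)
```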

Figure: IPL highest scores

I hope you liked this article on a data science project on IPL analysis with Python. Feel free to ask your valuable questions in the comments section below.

Aman Kharwal
