
Understanding GDP
Gross domestic product (GDP) at current prices is the GDP at the market value of goods and services produced in a country during a year.
In other words, GDP measures the monetary value of final goods and services produced by a country/state in a given period of time.
GDP can be broadly divided into goods and services produced by three sectors: the primary sector (agriculture), the secondary sector (industry), and the tertiary sector (services).
GDP at current prices is also known as nominal GDP. Real GDP, by contrast, takes into account the price changes that may have occurred due to inflation. This means that real GDP is nominal GDP adjusted for inflation.
I will use nominal GDP for this article. I will also treat the financial year 2015-16 as the base year, as most of the data required for this exercise is available for that period.
This GDP analysis covers Indian states only. If you want to see a GDP analysis for the world, you can click below.
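To make the inflation adjustment concrete, here is a minimal sketch. It assumes the usual GDP deflator convention (base year = 100); the function name `real_gdp` and the numbers are hypothetical and are not taken from the dataset used in this article.

```python
def real_gdp(nominal_gdp: float, deflator: float) -> float:
    """Return real GDP from nominal GDP and a GDP deflator (base year = 100)."""
    return nominal_gdp * 100.0 / deflator


# Hypothetical numbers, for illustration only
nominal = 1100.0   # GDP at current prices
deflator = 110.0   # price level is 10% above the base year

real = real_gdp(nominal, deflator)
print(f"Nominal GDP: {nominal:.1f}, real GDP: {real:.1f}")
```

When the deflator is above 100, real GDP comes out lower than nominal GDP, because part of the nominal figure reflects higher prices rather than higher output.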
Download the data set
Let's start by importing the libraries.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
GDP Analysis of the Indian States
Read the data
data = pd.read_csv('GSDP.csv')
data.head()

# Basic info regarding the data
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11 entries, 0 to 10
Data columns (total 36 columns):
 #   Column                     Non-Null Count  Dtype
---  ------                     --------------  -----
 0   Items Description          11 non-null     object
 1   Duration                   11 non-null     object
 2   Andhra Pradesh             11 non-null     float64
 3   Arunachal Pradesh          9 non-null      float64
 4   Assam                      9 non-null      float64
 5   Bihar                      9 non-null      float64
 6   Chhattisgarh               11 non-null     float64
 7   Goa                        9 non-null      float64
 8   Gujarat                    9 non-null      float64
 9   Haryana                    11 non-null     float64
 10  Himachal Pradesh           7 non-null      float64
 11  Jammu & Kashmir            9 non-null      float64
 12  Jharkhand                  9 non-null      float64
 13  Karnataka                  9 non-null      float64
 14  Kerala                     9 non-null      float64
 15  Madhya Pradesh             11 non-null     float64
 16  Maharashtra                7 non-null      float64
 17  Manipur                    7 non-null      float64
 18  Meghalaya                  11 non-null     float64
 19  Mizoram                    7 non-null      float64
 20  Nagaland                   7 non-null      float64
 21  Odisha                     11 non-null     float64
 22  Punjab                     7 non-null      float64
 23  Rajasthan                  7 non-null      float64
 24  Sikkim                     9 non-null      float64
 25  Tamil Nadu                 11 non-null     float64
 26  Telangana                  11 non-null     float64
 27  Tripura                    7 non-null      float64
 28  Uttar Pradesh              9 non-null      float64
 29  Uttarakhand                9 non-null      float64
 30  West Bengal1               0 non-null      float64
 31  Andaman & Nicobar Islands  7 non-null      float64
 32  Chandigarh                 9 non-null      float64
 33  Delhi                      11 non-null     float64
 34  Puducherry                 11 non-null     float64
 35  All_India GDP              11 non-null     float64
dtypes: float64(34), object(2)
memory usage: 3.2+ KB
# Observe the various columns in the dataset
data.columns
Index(['Items Description', 'Duration', 'Andhra Pradesh ', 'Arunachal Pradesh', 'Assam', 'Bihar', 'Chhattisgarh', 'Goa', 'Gujarat', 'Haryana', 'Himachal Pradesh', 'Jammu & Kashmir', 'Jharkhand', 'Karnataka', 'Kerala', 'Madhya Pradesh', 'Maharashtra', 'Manipur', 'Meghalaya', 'Mizoram', 'Nagaland', 'Odisha', 'Punjab', 'Rajasthan', 'Sikkim', 'Tamil Nadu', 'Telangana', 'Tripura', 'Uttar Pradesh', 'Uttarakhand', 'West Bengal1', 'Andaman & Nicobar Islands', 'Chandigarh', 'Delhi', 'Puducherry', 'All_India GDP'], dtype='object')
# Remove the rows '(% Growth over the previous year)' and 'GSDP - CURRENT PRICES (in Crore)' for the year 2016-17
data = data[data['Duration'] != '2016-17']
data

# Check the total number of null values in each column
data.isnull().sum()

Items Description            0
Duration                     0
Andhra Pradesh               0
Arunachal Pradesh            0
Assam                        0
Bihar                        0
Chhattisgarh                 0
Goa                          0
Gujarat                      0
Haryana                      0
Himachal Pradesh             2
Jammu & Kashmir              0
Jharkhand                    0
Karnataka                    0
Kerala                       0
Madhya Pradesh               0
Maharashtra                  2
Manipur                      2
Meghalaya                    0
Mizoram                      2
Nagaland                     2
Odisha                       0
Punjab                       2
Rajasthan                    2
Sikkim                       0
Tamil Nadu                   0
Telangana                    0
Tripura                      2
Uttar Pradesh                0
Uttarakhand                  0
West Bengal1                 9
Andaman & Nicobar Islands    2
Chandigarh                   0
Delhi                        0
Puducherry                   0
All_India GDP                0
dtype: int64
# Check if any column has all of its values as NaN
data.isnull().all(axis=0)

Items Description            False
Duration                     False
Andhra Pradesh               False
Arunachal Pradesh            False
Assam                        False
Bihar                        False
Chhattisgarh                 False
Goa                          False
Gujarat                      False
Haryana                      False
Himachal Pradesh             False
Jammu & Kashmir              False
Jharkhand                    False
Karnataka                    False
Kerala                       False
Madhya Pradesh               False
Maharashtra                  False
Manipur                      False
Meghalaya                    False
Mizoram                      False
Nagaland                     False
Odisha                       False
Punjab                       False
Rajasthan                    False
Sikkim                       False
Tamil Nadu                   False
Telangana                    False
Tripura                      False
Uttar Pradesh                False
Uttarakhand                  False
West Bengal1                 True
Andaman & Nicobar Islands    False
Chandigarh                   False
Delhi                        False
Puducherry                   False
All_India GDP                False
dtype: bool
# removing West Bengal as the whole column is NaN
data = data.drop('West Bengal1', axis=1)
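The two cleaning steps above (filtering out a year, then dropping a column that is entirely empty) can also be sketched without naming the empty column by hand. This is only an illustration on a hypothetical miniature table; `State A` and `State B` are made-up column names, and `dropna(axis=1, how="all")` is an alternative to the explicit `drop` used in this article.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the GSDP table (values are made up)
df = pd.DataFrame({
    "Duration": ["2014-15", "2015-16", "2016-17"],
    "State A": [100.0, 110.0, np.nan],
    "State B": [np.nan, np.nan, np.nan],
})

# Keep only the years up to 2015-16
df = df[df["Duration"] != "2016-17"]

# Drop every column in which all remaining values are NaN
cleaned = df.dropna(axis=1, how="all")
print(list(cleaned.columns))
```

The advantage of `how="all"` is that a column with only some missing values is kept, so partially reported states are not lost at this stage.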
Calculating the average growth of states for the duration 2013-14, 2014-15 and 2015-16 by taking the mean of the row ‘(% Growth over previous year)’.
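For readers wondering where such a growth row comes from, here is a small sketch. It assumes the row is the usual year-over-year percentage change of GSDP; the source file does not state its exact method, and the GSDP figures below are hypothetical.

```python
import pandas as pd

# Hypothetical GSDP values (in crore) for three consecutive years
gsdp = pd.Series([100.0, 110.0, 121.0], index=["2013-14", "2014-15", "2015-16"])

# Year-over-year growth in percent: (current / previous - 1) * 100
growth_pct = gsdp.pct_change() * 100
print(growth_pct.round(2))
```

The first year has no previous value to compare with, so its growth is NaN. That is one reason a growth row can cover fewer years than the GSDP row.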
# At most one value is missing per state, so we can take the average of the other two numbers
data.iloc[6:].isnull().sum()

Items Description            0
Duration                     0
Andhra Pradesh               0
Arunachal Pradesh            0
Assam                        0
Bihar                        0
Chhattisgarh                 0
Goa                          0
Gujarat                      0
Haryana                      0
Himachal Pradesh             1
Jammu & Kashmir              0
Jharkhand                    0
Karnataka                    0
Kerala                       0
Madhya Pradesh               0
Maharashtra                  1
Manipur                      1
Meghalaya                    0
Mizoram                      1
Nagaland                     1
Odisha                       0
Punjab                       1
Rajasthan                    1
Sikkim                       0
Tamil Nadu                   0
Telangana                    0
Tripura                      1
Uttar Pradesh                0
Uttarakhand                  0
Andaman & Nicobar Islands    1
Chandigarh                   0
Delhi                        0
Puducherry                   0
All_India GDP                0
dtype: int64
# DataFrame used to find the average growth of the states
avg_growth = data.iloc[6:]
avg_growth

avg_growth.columns
Index(['Items Description', 'Duration', 'Andhra Pradesh ', 'Arunachal Pradesh', 'Assam', 'Bihar', 'Chhattisgarh', 'Goa', 'Gujarat', 'Haryana', 'Himachal Pradesh', 'Jammu & Kashmir', 'Jharkhand', 'Karnataka', 'Kerala', 'Madhya Pradesh', 'Maharashtra', 'Manipur', 'Meghalaya', 'Mizoram', 'Nagaland', 'Odisha', 'Punjab', 'Rajasthan', 'Sikkim', 'Tamil Nadu', 'Telangana', 'Tripura', 'Uttar Pradesh', 'Uttarakhand', 'Andaman & Nicobar Islands', 'Chandigarh', 'Delhi', 'Puducherry', 'All_India GDP'], dtype='object')
# Taking only the values for the states
average_growth_values = avg_growth[avg_growth.columns[2:34]].mean()

# Sorting the average growth rate values and then making a dataframe for all the states
average_growth_values = average_growth_values.sort_values()
average_growth_rate = average_growth_values.to_frame(name='Average growth rate')
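It is worth seeing how `mean()` treats the missing values here. The sketch below uses the growth figures of two states from this dataset (Mizoram, which has no 2015-16 value, and Arunachal Pradesh, which has all three); the variable names are my own.

```python
import numpy as np
import pandas as pd

# Growth rates (%) for 2013-14, 2014-15 and 2015-16, taken from the dataset
growth = pd.DataFrame(
    {
        "Mizoram": [23.1, 12.3, np.nan],
        "Arunachal Pradesh": [16.38, 14.79, 12.07],
    },
    index=[7, 8, 9],
)

# pandas skips NaN by default (skipna=True), so Mizoram is averaged over two years
means = growth.mean().sort_values()
print(means)
```

So a state with a missing year is not dropped. Its average is simply based on fewer observations, which matters when comparing it with states that have all three years.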
# plotting the average growth rate for all the states
plt.figure(figsize=(12, 10), dpi=300)
sns.barplot(x=average_growth_rate['Average growth rate'], y=average_growth_values.index, palette='viridis')
plt.xlabel('Average Growth Rate', fontsize=12)
plt.ylabel('States', fontsize=12)
plt.title('Average Growth Rate for all the states', fontsize=13)
plt.show()

Observations:
- An interesting observation from the above plot is that the average growth rate has been highest for the North East states, except for Assam and Meghalaya. This is not what we generally expect, so we should take a further look at these states.
- The average growth rate has been lowest for states like Goa, Odisha, Meghalaya, Sikkim and Jammu & Kashmir.
# top 5 states as per average growth rate
average_growth_rate['Average growth rate'][-5:]

Arunachal Pradesh    14.413333
Manipur              14.610000
Nagaland             16.415000
Tripura              17.030000
Mizoram              17.700000
Name: Average growth rate, dtype: float64
# top 5 states as per average growth rate for the years 2013-14, 2014-15, 2015-16
avg_growth[['Mizoram', 'Tripura', 'Nagaland', 'Manipur', 'Arunachal Pradesh']]

   Mizoram  Tripura  Nagaland  Manipur  Arunachal Pradesh
7     23.1    18.14     21.98    17.83              16.38
8     12.3    15.92     10.85    11.39              14.79
9      NaN      NaN       NaN      NaN              12.07
- We can see that the growth rate for the above states decreased substantially in 2014-15 compared with 2013-14. Because the growth rate was very high in 2013-14, the average is still high for these states.
- In the absence of data for the year 2015-16 for four of these five states (only Arunachal Pradesh has a value), we cannot say definitively that these are high-performing states, as their growth rate decreased in 2014-15.
To find the states that have been growing consistently fast, we need to look at both the mean and the standard deviation of the growth rate for each state.
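The drop between the two years can be checked directly with a row-wise difference. This is a minimal sketch using the figures of three of the states shown above; I have labelled the rows with the financial years instead of the row numbers for readability.

```python
import numpy as np
import pandas as pd

# Growth rates (%) from the table above
growth = pd.DataFrame(
    {
        "Mizoram": [23.1, 12.3, np.nan],
        "Tripura": [18.14, 15.92, np.nan],
        "Arunachal Pradesh": [16.38, 14.79, 12.07],
    },
    index=["2013-14", "2014-15", "2015-16"],
)

# Change in the growth rate (percentage points) relative to the previous year
change = growth.diff()
drop_14_15 = change.loc["2014-15"]
print(drop_14_15.round(2))
```

A negative value means the growth rate slowed down compared with the previous year. It does not mean the economy shrank, since the growth rates themselves are still positive.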
# create a dataframe to store the mean and the standard deviation of the growth rate for various states
describe = pd.DataFrame(avg_growth.describe())
describe = describe.T

# states having mean growth rate greater than 12 and standard deviation less than 2
describe[(describe['mean'] > 12) & (describe['std'] < 2)]

# states having mean growth rate less than 12 and standard deviation greater than 2
describe[(describe['mean'] < 12) & (describe['std'] > 2)]
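To show what these two filters select, here is a small self-contained sketch on hypothetical data. The state names and growth figures are made up; the thresholds (12 for the mean, 2 for the standard deviation) are the ones used above, and `agg` is used instead of `describe` only to keep the output short.

```python
import pandas as pd

# Hypothetical growth rates (%) over three years
growth = pd.DataFrame({
    "State A": [12.5, 13.0, 13.5],   # high and steady
    "State B": [20.0, 10.0, 9.0],    # high on average but volatile
    "State C": [8.0, 12.0, 7.0],     # low on average and volatile
})

# One row per state with its mean and (sample) standard deviation
stats = growth.agg(["mean", "std"]).T

consistent = stats[(stats["mean"] > 12) & (stats["std"] < 2)]
struggling = stats[(stats["mean"] < 12) & (stats["std"] > 2)]
print(consistent)
print(struggling)
```

State B has the same mean as State A but fails the first filter because of its large standard deviation, which is the reason for looking at both statistics together.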

Comparing the average growth rate for the years 2013-14, 2014-15 and 2015-16 with its standard deviation, we can group the states as follows.
States that are growing consistently fast are:
- Andhra Pradesh
- Assam
- Kerala
- Tamil Nadu
- Telangana
States that are struggling are:
- Goa
- Meghalaya
- Odisha
- Jammu & Kashmir
- Jharkhand
Plotting the total GDP of the states for the year 2015-16
# filtering out the data for the year 2015-16 and storing it in a dataframe
total_GDP_15_16 = data[(data['Items Description'] == 'GSDP - CURRENT PRICES (` in Crore)') & (data['Duration'] == '2015-16')]

# carrying out necessary transformation to make the data ready for plotting
total_GDP_15_16_states = total_GDP_15_16[total_GDP_15_16.columns[2:34]].transpose()
total_GDP_15_16_states = total_GDP_15_16_states.rename(columns={4: 'Total GDP of States 2015-16'})
total_GDP_15_16_states = total_GDP_15_16_states.dropna()
total_GDP_15_16_states = total_GDP_15_16_states.sort_values('Total GDP of States 2015-16', ascending=True)

plt.figure(figsize=(10, 8), dpi=600)
sns.barplot(x=total_GDP_15_16_states['Total GDP of States 2015-16'], y=total_GDP_15_16_states.index, palette='plasma')
plt.xlabel('Total GDP of States for 2015-16', fontsize=12)
plt.ylabel('States', fontsize=12)
plt.title('Total GDP of States 2015-16 for all the states', fontsize=12)
plt.show()

Top 5 states in terms of total GDP for the year 2015-16
top_5_eco = total_GDP_15_16_states[-5:]
top_5_eco

                Total GDP of States 2015-16
Andhra Pradesh                     609934.0
Gujarat                            994316.0
Karnataka                         1027068.0
Uttar Pradesh                     1153795.0
Tamil Nadu                        1212668.0
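Slicing the last five rows works because the frame was sorted in ascending order first. As an alternative sketch, `nlargest` picks the top entries without depending on the sort order. The values below are the five figures from the output above; the variable names are my own.

```python
import pandas as pd

# GSDP for 2015-16 (in crore), taken from the output above
gdp_15_16 = pd.Series(
    {
        "Andhra Pradesh": 609934.0,
        "Gujarat": 994316.0,
        "Karnataka": 1027068.0,
        "Uttar Pradesh": 1153795.0,
        "Tamil Nadu": 1212668.0,
    },
    name="Total GDP of States 2015-16",
)

# Largest three values, returned in descending order
top_3 = gdp_15_16.nlargest(3)
print(top_3)
```

Note that `nlargest` returns the largest value first, whereas the slice above lists the states from smallest to largest.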