Indian GDP Analysis with Python

Understanding GDP

Gross domestic product (GDP) at current prices is the GDP at the market value of goods and services produced in a country during a year.

In other words, GDP measures the monetary value of final goods and services produced by a country/state in a given period of time.

GDP can be broadly divided into goods and services produced by three sectors: the primary sector (agriculture), the secondary sector (industry), and the tertiary sector (services).

GDP at current prices is also known as nominal GDP. Real GDP, by contrast, takes into account the price changes caused by inflation: it is nominal GDP adjusted for inflation.
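As a quick illustration of that adjustment, real GDP can be approximated by dividing nominal GDP by a price index (the GDP deflator) expressed relative to a base year of 100. The numbers below are made up for illustration and are not taken from the dataset used in this article.

```python
# Hypothetical numbers, for illustration only (not from the GSDP dataset).
nominal_gdp = 1200.0   # GDP at current prices
deflator = 120.0       # price index, with the base year set to 100

# Real GDP = nominal GDP scaled back to base-year prices.
real_gdp = nominal_gdp / deflator * 100
print(real_gdp)
```

With a deflator above 100, real GDP comes out lower than nominal GDP, because part of the nominal figure reflects higher prices rather than higher output.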

I will use nominal GDP for this article. I will also treat the financial year 2015-16 as the base year, as most of the data required for this exercise is available for that period.

This GDP analysis is based on Indian states only. If you want to see a GDP analysis for the world, you can click below.

Download the data set

Let's start by importing the libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

GDP Analysis of the Indian States

Read the data

data = pd.read_csv('GSDP.csv')
data.head()
# Basic info regarding the data
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11 entries, 0 to 10
Data columns (total 36 columns):
 #   Column                     Non-Null Count  Dtype  
---  ------                     --------------  -----  
 0   Items  Description         11 non-null     object 
 1   Duration                   11 non-null     object 
 2   Andhra Pradesh             11 non-null     float64
 3   Arunachal Pradesh          9 non-null      float64
 4   Assam                      9 non-null      float64
 5   Bihar                      9 non-null      float64
 6   Chhattisgarh               11 non-null     float64
 7   Goa                        9 non-null      float64
 8   Gujarat                    9 non-null      float64
 9   Haryana                    11 non-null     float64
 10  Himachal Pradesh           7 non-null      float64
 11  Jammu & Kashmir            9 non-null      float64
 12  Jharkhand                  9 non-null      float64
 13  Karnataka                  9 non-null      float64
 14  Kerala                     9 non-null      float64
 15  Madhya Pradesh             11 non-null     float64
 16  Maharashtra                7 non-null      float64
 17  Manipur                    7 non-null      float64
 18  Meghalaya                  11 non-null     float64
 19  Mizoram                    7 non-null      float64
 20  Nagaland                   7 non-null      float64
 21  Odisha                     11 non-null     float64
 22  Punjab                     7 non-null      float64
 23  Rajasthan                  7 non-null      float64
 24  Sikkim                     9 non-null      float64
 25  Tamil Nadu                 11 non-null     float64
 26  Telangana                  11 non-null     float64
 27  Tripura                    7 non-null      float64
 28  Uttar Pradesh              9 non-null      float64
 29  Uttarakhand                9 non-null      float64
 30  West Bengal1               0 non-null      float64
 31  Andaman & Nicobar Islands  7 non-null      float64
 32  Chandigarh                 9 non-null      float64
 33  Delhi                      11 non-null     float64
 34  Puducherry                 11 non-null     float64
 35  All_India GDP              11 non-null     float64
dtypes: float64(34), object(2)
memory usage: 3.2+ KB
# Observe the various columns in the dataset
data.columns
Index(['Items  Description', 'Duration', 'Andhra Pradesh ',
       'Arunachal Pradesh', 'Assam', 'Bihar', 'Chhattisgarh', 'Goa', 'Gujarat',
       'Haryana', 'Himachal Pradesh', 'Jammu & Kashmir', 'Jharkhand',
       'Karnataka', 'Kerala', 'Madhya Pradesh', 'Maharashtra', 'Manipur',
       'Meghalaya', 'Mizoram', 'Nagaland', 'Odisha', 'Punjab', 'Rajasthan',
       'Sikkim', 'Tamil Nadu', 'Telangana', 'Tripura', 'Uttar Pradesh',
       'Uttarakhand', 'West Bengal1', 'Andaman & Nicobar Islands',
       'Chandigarh', 'Delhi', 'Puducherry', 'All_India GDP'],
      dtype='object')
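The column list above shows that at least one name carries a trailing space ('Andhra Pradesh '). The article's code never selects that column by name, so this is harmless here, but a minimal sketch of how such names could be cleaned is given below. The toy frame only reuses three column names from the output; it is not the real dataset.

```python
import pandas as pd

# Toy frame reusing three column names exactly as printed above (no data rows).
df = pd.DataFrame(columns=['Items  Description', 'Andhra Pradesh ', 'West Bengal1'])

# Strip leading/trailing whitespace only; spaces inside a name are left untouched.
df.columns = df.columns.str.strip()
print(list(df.columns))
```

Because only the ends of each name are stripped, 'Items  Description' keeps its double space, so later code that selects that column by its original name would still work.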
# Remove the rows '(% Growth over previous year)' and 'GSDP - CURRENT PRICES (` in Crore)' for the year 2016-17
data = data[data['Duration'] != '2016-17']
data
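One detail worth keeping in mind: filtering with a boolean mask keeps the original index labels, which matters later when rows are picked by position or by label. The sketch below assumes the file has 6 GSDP rows (2011-12 to 2016-17) followed by 5 growth rows (2012-13 to 2016-17). That layout is inferred from the outputs in this article, not confirmed by the raw file.

```python
import pandas as pd

# Assumed layout: 6 GSDP rows (2011-12 to 2016-17) followed by
# 5 growth rows (2012-13 to 2016-17), 11 rows in total.
years = ['2011-12', '2012-13', '2013-14', '2014-15', '2015-16', '2016-17']
toy = pd.DataFrame({'Duration': years + years[1:]})

# Same filter as above: drop both 2016-17 rows.
toy = toy[toy['Duration'] != '2016-17']

print(list(toy.index))            # labels 5 and 10 are gone, the rest are unchanged
print(list(toy.iloc[6:].index))   # positional slice, still showing original labels
```

Under this assumption, a positional slice from position 6 onward returns the rows labelled 7, 8 and 9, and the 2015-16 GSDP row keeps the label 4.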
# Check the total number of null values in each column
data.isnull().sum()
Items  Description           0
Duration                     0
Andhra Pradesh               0
Arunachal Pradesh            0
Assam                        0
Bihar                        0
Chhattisgarh                 0
Goa                          0
Gujarat                      0
Haryana                      0
Himachal Pradesh             2
Jammu & Kashmir              0
Jharkhand                    0
Karnataka                    0
Kerala                       0
Madhya Pradesh               0
Maharashtra                  2
Manipur                      2
Meghalaya                    0
Mizoram                      2
Nagaland                     2
Odisha                       0
Punjab                       2
Rajasthan                    2
Sikkim                       0
Tamil Nadu                   0
Telangana                    0
Tripura                      2
Uttar Pradesh                0
Uttarakhand                  0
West Bengal1                 9
Andaman & Nicobar Islands    2
Chandigarh                   0
Delhi                        0
Puducherry                   0
All_India GDP                0
dtype: int64
# Check if any column has all the values as NAN
data.isnull().all(axis=0)
Items  Description           False
Duration                     False
Andhra Pradesh               False
Arunachal Pradesh            False
Assam                        False
Bihar                        False
Chhattisgarh                 False
Goa                          False
Gujarat                      False
Haryana                      False
Himachal Pradesh             False
Jammu & Kashmir              False
Jharkhand                    False
Karnataka                    False
Kerala                       False
Madhya Pradesh               False
Maharashtra                  False
Manipur                      False
Meghalaya                    False
Mizoram                      False
Nagaland                     False
Odisha                       False
Punjab                       False
Rajasthan                    False
Sikkim                       False
Tamil Nadu                   False
Telangana                    False
Tripura                      False
Uttar Pradesh                False
Uttarakhand                  False
West Bengal1                  True
Andaman & Nicobar Islands    False
Chandigarh                   False
Delhi                        False
Puducherry                   False
All_India GDP                False
dtype: bool
# Remove West Bengal, as the whole column is NaN
data = data.drop('West Bengal1', axis = 1)
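As an alternative to naming the column explicitly, pandas can drop every column that is entirely missing in one call. This is a sketch on a made-up frame with hypothetical state names, not the approach used in this article.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one partly missing column and one fully missing column.
toy = pd.DataFrame({
    'State A': [1.0, 2.0, np.nan],        # partly missing: kept
    'State B': [np.nan, np.nan, np.nan],  # entirely missing: dropped
})

# how='all' drops a column only when every value in it is NaN.
cleaned = toy.dropna(axis=1, how='all')
print(list(cleaned.columns))
```

Naming the column, as done above, is more explicit; `dropna(axis=1, how='all')` is handy when you do not know in advance which columns are empty.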

Calculating the average growth of states for 2013-14, 2014-15 and 2015-16 by taking the mean of the rows ‘(% Growth over previous year)’.

data.iloc[6:].isnull().sum() # at most one value is missing per state, so we can take the average of the remaining values
Items  Description           0
Duration                     0
Andhra Pradesh               0
Arunachal Pradesh            0
Assam                        0
Bihar                        0
Chhattisgarh                 0
Goa                          0
Gujarat                      0
Haryana                      0
Himachal Pradesh             1
Jammu & Kashmir              0
Jharkhand                    0
Karnataka                    0
Kerala                       0
Madhya Pradesh               0
Maharashtra                  1
Manipur                      1
Meghalaya                    0
Mizoram                      1
Nagaland                     1
Odisha                       0
Punjab                       1
Rajasthan                    1
Sikkim                       0
Tamil Nadu                   0
Telangana                    0
Tripura                      1
Uttar Pradesh                0
Uttarakhand                  0
Andaman & Nicobar Islands    1
Chandigarh                   0
Delhi                        0
Puducherry                   0
All_India GDP                0
dtype: int64
avg_growth = data.iloc[6:]
avg_growth #dataframe to find the average growth of states
avg_growth.columns
Index(['Items  Description', 'Duration', 'Andhra Pradesh ',
       'Arunachal Pradesh', 'Assam', 'Bihar', 'Chhattisgarh', 'Goa', 'Gujarat',
       'Haryana', 'Himachal Pradesh', 'Jammu & Kashmir', 'Jharkhand',
       'Karnataka', 'Kerala', 'Madhya Pradesh', 'Maharashtra', 'Manipur',
       'Meghalaya', 'Mizoram', 'Nagaland', 'Odisha', 'Punjab', 'Rajasthan',
       'Sikkim', 'Tamil Nadu', 'Telangana', 'Tripura', 'Uttar Pradesh',
       'Uttarakhand', 'Andaman & Nicobar Islands', 'Chandigarh', 'Delhi',
       'Puducherry', 'All_India GDP'],
      dtype='object')
# Taking only the values for the states
average_growth_values = avg_growth[avg_growth.columns[2:34]].mean()
# Sorting the average growth rate values and then making a dataframe for all the states
average_growth_values = average_growth_values.sort_values()
average_growth_rate = average_growth_values.to_frame(name='Average growth rate')
# plotting the average growth rate for all the states
plt.figure(figsize=(12,10), dpi = 300)

sns.barplot(x = average_growth_rate['Average growth rate'], y = average_growth_values.index,palette='viridis')
plt.xlabel('Average Growth Rate', fontsize=12)
plt.ylabel('States', fontsize=12)
plt.title('Average Growth Rate for all the states',fontsize=13)
plt.show()
Observations:
  1. The plot shows something interesting: the average growth rate is highest for the North East states, except for Assam and Meghalaya. This is not what we would generally expect, so these states deserve a closer look.
  2. The average growth rate is lowest for states such as Goa, Odisha, Meghalaya, Sikkim and Jammu & Kashmir.
# top 5 states as per average growth rate
average_growth_rate['Average growth rate'][-5:]
Arunachal Pradesh    14.413333
Manipur              14.610000
Nagaland             16.415000
Tripura              17.030000
Mizoram              17.700000
Name: Average growth rate, dtype: float64
# yearly growth rates of the top 5 states for the years 2013-14, 2014-15, 2015-16
avg_growth[['Mizoram','Tripura','Nagaland','Manipur','Arunachal Pradesh']]
   Mizoram  Tripura  Nagaland  Manipur  Arunachal Pradesh
7     23.1    18.14     21.98    17.83              16.38
8     12.3    15.92     10.85    11.39              14.79
9      NaN      NaN       NaN      NaN              12.07
  1. The growth rate of these states decreased in 2014-15 compared with 2013-14, substantially so for Mizoram, Nagaland and Manipur. Because growth was very high in 2013-14, the average is still high for these states.
  2. Since the data for 2015-16 is missing for four of the five states, we cannot say definitively that these are high-performing states, as their growth rate decreased in 2014-15.
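The effect of the missing 2015-16 values on the averages can be checked directly. The sketch below copies the growth rates of two states from the table above; `mean()` in pandas skips NaN by default, so Mizoram is averaged over two years while Arunachal Pradesh is averaged over three.

```python
import numpy as np
import pandas as pd

# Growth rates (%) copied from the table above; rows are 2013-14, 2014-15, 2015-16.
growth = pd.DataFrame({
    'Mizoram': [23.1, 12.3, np.nan],
    'Arunachal Pradesh': [16.38, 14.79, 12.07],
}, index=['2013-14', '2014-15', '2015-16'])

# mean() skips NaN by default, so Mizoram is averaged over two years only.
print(growth.mean())

# count() shows how many years actually went into each average.
print(growth.count())
```

Printing the count next to the mean makes it clear which averages rest on fewer data points.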

To find the states that have been growing consistently fast, we need to look at both the mean and the standard deviation of the growth rate for each state.

#create a dataframe to store the mean and the standard deviation of the growth rate for various states

describe = pd.DataFrame(avg_growth.describe())
describe = describe.T
# states having mean growth rate greater than 12 and standard deviation less than 2

describe[(describe['mean']>12) & (describe['std']<2)]
# states having mean growth rate less than 12 and standard deviation greater than 2

describe[(describe['mean']<12) & (describe['std']>2)]
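To make the filtering logic concrete, here is a small sketch on made-up growth rates for three hypothetical states. It uses `agg(['mean', 'std'])` instead of `describe()`, which is just one possible way to get the same two statistics; the thresholds are the ones used above.

```python
import pandas as pd

# Hypothetical growth rates (%) for three made-up states over three years.
toy = pd.DataFrame({
    'Steady': [12.5, 13.0, 13.5],
    'Volatile': [20.0, 12.0, 7.0],
    'Slow': [8.0, 8.5, 9.0],
})

# One row per state, with the mean and the sample standard deviation as columns.
stats = toy.agg(['mean', 'std']).T

# High mean and low spread: growing fast and consistently.
consistent = stats[(stats['mean'] > 12) & (stats['std'] < 2)]
print(list(consistent.index))
```

Here 'Volatile' has the same mean as 'Steady' but is filtered out by its large standard deviation, which is the reason for looking at both numbers together.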

Comparing the average growth rate for the years 2013-14, 2014-15 and 2015-16 with its standard deviation, the states that are growing consistently fast are:

  1. Andhra Pradesh
  2. Assam
  3. Kerala
  4. Tamil Nadu
  5. Telangana

States that are struggling are:

  1. Goa
  2. Meghalaya
  3. Odisha
  4. Jammu & Kashmir
  5. Jharkhand

Plotting the total GDP of the states for the year 2015-16

# filtering out the data for the year 2015-16 and storing it in a dataframe
total_GDP_15_16 = data[(data['Items  Description'] == 'GSDP - CURRENT PRICES (` in Crore)') & (data['Duration'] == '2015-16')]
# carrying out necessary transformation to make the data ready for plotting

total_GDP_15_16_states = total_GDP_15_16[total_GDP_15_16.columns[2:34]].transpose()
total_GDP_15_16_states = total_GDP_15_16_states.rename(columns={4: 'Total GDP of States 2015-16'})
total_GDP_15_16_states = total_GDP_15_16_states.dropna()
total_GDP_15_16_states = total_GDP_15_16_states.sort_values('Total GDP of States 2015-16',ascending=True)
plt.figure(figsize=(10,8), dpi = 600)

sns.barplot(x = total_GDP_15_16_states['Total GDP of States 2015-16'], y = total_GDP_15_16_states.index,palette='plasma')
plt.xlabel('Total GDP of States for 2015-16', fontsize=12)
plt.ylabel('States', fontsize=12)
plt.title('Total GDP of States 2015-16 for all the states',fontsize=12)
plt.show()
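The reshaping steps used before this plot (transpose, rename, drop missing values, sort) can also be written as a single chain on a Series. The sketch below runs on a made-up one-row frame with hypothetical state names standing in for total_GDP_15_16; it is an alternative formulation, not the code used in this article.

```python
import numpy as np
import pandas as pd

# Hypothetical single-row frame standing in for total_GDP_15_16.
row = pd.DataFrame(
    {'Duration': ['2015-16'], 'State A': [300.0], 'State B': [np.nan], 'State C': [100.0]},
    index=[4],
)

# Take the single row as a Series, keep the state columns, drop missing values, sort ascending.
gdp = row.iloc[0, 1:].dropna().astype(float).sort_values()
gdp.name = 'Total GDP of States 2015-16'
print(gdp)
```

Note that in either formulation, states with no value for 2015-16 are dropped and therefore do not appear in the plot or in any ranking built from it.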

Top 5 states in terms of total GDP for the year 2015-16

top_5_eco = total_GDP_15_16_states[-5:]
top_5_eco
                Total GDP of States 2015-16
Andhra Pradesh                     609934.0
Gujarat                            994316.0
Karnataka                         1027068.0
Uttar Pradesh                     1153795.0
Tamil Nadu                        1212668.0
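Slicing the last five rows works because the frame was sorted in ascending order first. A sketch of an alternative that does not depend on the sort order is `nlargest`, shown here on the five values copied from the output above (in crore, per the row label in the dataset).

```python
import pandas as pd

# GDP values copied from the output above.
gdp = pd.Series({
    'Andhra Pradesh': 609934.0,
    'Gujarat': 994316.0,
    'Karnataka': 1027068.0,
    'Uttar Pradesh': 1153795.0,
    'Tamil Nadu': 1212668.0,
})

# nlargest returns the largest values first, regardless of the input order.
top_3 = gdp.nlargest(3)
print(top_3)
```

The same method exists on DataFrames as `nlargest(n, columns)`, so it could be applied to total_GDP_15_16_states directly.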
Aman Kharwal