Bubble Plots using Python

Bubble plots extend scatter plots by replacing the dots with bubbles of varying size. Most often, a bubble plot displays the values of three numeric variables: each observation is represented by a circle ("bubble") whose size indicates the value of one variable, while the horizontal and vertical positions of the bubble indicate the values of the two other variables.

Data Preparation for Bubble Plots

For this task, I will use a dataset on Canadian immigration. It contains data from 1980 to 2013 and includes the number of immigrants from 195 countries. To get started, we import the necessary packages and load the dataset.
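
The import step might look like the following minimal sketch. The dataset file itself is not included here, so the sketch builds a tiny stand-in DataFrame with the same column layout. The two rows and their counts are placeholder values for illustration only, not real immigration figures, and the file path in the commented-out line is hypothetical.

```python
import pandas as pd

# Descriptive columns that appear in the dataset, followed by one column per year.
meta_cols = ['Type', 'Coverage', 'OdName', 'AREA', 'AreaName',
             'REG', 'RegName', 'DEV', 'DevName']
years = list(range(1980, 2014))  # 1980 through 2013 inclusive

# Stand-in rows: the yearly counts are PLACEHOLDERS, not real data.
# Descriptive columns that are not supplied are filled with NaN.
rows = [
    {'OdName': 'India', **{y: 1000 + 10 * i for i, y in enumerate(years)}},
    {'OdName': 'Brazil', **{y: 100 + i for i, y in enumerate(years)}},
]
df = pd.DataFrame(rows, columns=meta_cols + years)

# With the real file, a call like this would replace the stand-in above
# (hypothetical path; adjust it to wherever your copy of the data lives):
# df = pd.read_excel('path/to/canada_immigration.xlsx')

print(df.shape)  # (2, 43): 9 descriptive columns plus 34 year columns
```

The year columns are kept as integers rather than strings, which matches the column index of the dataset.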

Now, let’s have a look at the columns of the dataset:

df.columns
Index([    'Type', 'Coverage',   'OdName',     'AREA', 'AreaName',      'REG',
        'RegName',      'DEV',  'DevName',       1980,       1981,       1982,
             1983,       1984,       1985,       1986,       1987,       1988,
             1989,       1990,       1991,       1992,       1993,       1994,
             1995,       1996,       1997,       1998,       1999,       2000,
             2001,       2002,       2003,       2004,       2005,       2006,
             2007,       2008,       2009,       2010,       2011,       2012,
             2013],
      dtype='object')

I will not use all the columns. I will keep only the data needed to create the bubble plots, so I drop the descriptive columns and set the country name (OdName) as the index:

df = df.drop(columns=['Type', 'Coverage', 'AREA', 'AreaName', 'REG', 'RegName', 'DEV', 'DevName']).set_index('OdName')

Data Normalization for Bubble Plots

I will choose the data of India and Brazil for this task:

India = df.loc['India']
Brazil = df.loc['Brazil']

Normalization brings the data into a similar range. The immigration data for India and Brazil have different ranges, so I need to scale both so that their values lie between 0 and 1.

I will simply divide the India data by the maximum value of the India data series, and do the same with the Brazil data series, so that the largest value in each series becomes 1. I will then plot the India and Brazil data against the years, so it will be helpful to have the years in a list.
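
A minimal sketch of this scaling step, assuming the two series are indexed by year. The counts here are placeholder values standing in for `df.loc['India']` and `df.loc['Brazil']`, and the names `norm_india` and `norm_brazil` are chosen for illustration.

```python
import pandas as pd

# The years covered by the dataset, as a plain list.
years = list(range(1980, 2014))

# PLACEHOLDER counts standing in for df.loc['India'] and df.loc['Brazil'].
India = pd.Series([1000 + 10 * i for i in range(len(years))], index=years)
Brazil = pd.Series([100 + i for i in range(len(years))], index=years)

# Divide each series by its own maximum so the largest value becomes 1.
norm_india = India / India.max()
norm_brazil = Brazil / Brazil.max()

print(norm_india.max(), norm_brazil.max())  # 1.0 1.0
```

Note that this is scaling by the maximum, not min-max scaling: the smallest value only reaches 0 if the series itself contains a 0.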

Now let's draw the bubble plots, using the normalized values to set the bubble sizes.
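
One way to draw such a plot is with matplotlib's `scatter`, passing the normalized values through the `s` parameter. This is a sketch under assumptions: the counts are placeholders, and the scale factor of 2000 is an arbitrary choice that makes the largest bubble clearly visible.

```python
import matplotlib.pyplot as plt
import pandas as pd

years = list(range(1980, 2014))

# PLACEHOLDER counts standing in for df.loc['India'] and df.loc['Brazil'].
India = pd.Series([1000 + 10 * i for i in range(len(years))], index=years)
Brazil = pd.Series([100 + i for i in range(len(years))], index=years)

# Bubble sizes: each series scaled by its own maximum.
norm_india = India / India.max()
norm_brazil = Brazil / Brazil.max()

fig, ax = plt.subplots(figsize=(12, 8))
# `s` is the marker area in points squared; 2000 is an arbitrary scale factor.
ax.scatter(years, Brazil, s=norm_brazil * 2000, alpha=0.5, label='Brazil')
ax.scatter(years, India, s=norm_india * 2000, alpha=0.5, label='India')
ax.set_xlabel('Year')
ax.set_ylabel('Number of immigrants')
ax.legend(markerscale=0.3)  # shrink the legend markers so they stay readable
plt.show()
```

Because each series is scaled by its own maximum, bubble sizes are comparable within a country but not between the two countries.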

The size of each bubble gives an idea of the number of immigrants. We can also make this plot multicoloured. To make the colours meaningful, we need sorted copies of the data series. You will see the reason very soon:

# Sorted copies of each series, used as colour values for the bubbles
c_br = sorted(Brazil)
c_in = sorted(India)

Now, let's redraw the bubble plot so that the sorted values set the colours.
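
A possible version of the coloured plot, again as a sketch with placeholder counts. It assumes the sorted values are passed to `scatter` through the `c` parameter and mapped to colours with a colormap. The colormap names and the variable names `c_br` and `c_in` are illustrative choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

years = list(range(1980, 2014))

# PLACEHOLDER counts that rise overall but wobble from year to year.
India = pd.Series([1000 + 10 * i + 50 * (i % 3) for i in range(len(years))], index=years)
Brazil = pd.Series([100 + i + 5 * (i % 4) for i in range(len(years))], index=years)

norm_india = India / India.max()
norm_brazil = Brazil / Brazil.max()

# Sorted copies: colour value k goes to the k-th year, in ascending order.
c_br = sorted(Brazil)
c_in = sorted(India)

fig, ax = plt.subplots(figsize=(12, 8))
# `c` supplies one number per bubble; `cmap` maps those numbers to colours.
ax.scatter(years, Brazil, c=c_br, cmap='Blues', s=norm_brazil * 2000, alpha=0.5)
ax.scatter(years, India, c=c_in, cmap='Oranges', s=norm_india * 2000, alpha=0.5)
ax.set_xlabel('Year')
ax.set_ylabel('Number of immigrants')
plt.show()
```

One effect of sorting is that the colour darkens steadily from the first year to the last, whatever the height of each bubble. Passing the unsorted series as `c` would instead tie the colour to the actual count in each year.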

Now, let's plot the number of immigrants from Brazil per year as a bubble plot to see the trend over the years.
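
A single-country version could look like the sketch below. The counts are placeholders, and using the count itself for the colour along with a colour bar is an illustrative choice, not necessarily how the original figure was drawn.

```python
import matplotlib.pyplot as plt
import pandas as pd

years = list(range(1980, 2014))

# PLACEHOLDER counts standing in for df.loc['Brazil'].
Brazil = pd.Series([100 + i + 5 * (i % 4) for i in range(len(years))], index=years)
norm_brazil = Brazil / Brazil.max()

fig, ax = plt.subplots(figsize=(12, 8))
# Size and colour both encode the yearly count, so the trend stands out.
bubbles = ax.scatter(years, Brazil, s=norm_brazil * 2000,
                     c=Brazil, cmap='viridis', alpha=0.6)
fig.colorbar(bubbles, ax=ax, label='Number of immigrants')
ax.set_xlabel('Year')
ax.set_ylabel('Number of immigrants')
ax.set_title('Immigration from Brazil per year (placeholder data)')
plt.show()
```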

I hope you now understand how to create a bubble plot, and that you liked this article on bubble plots using Python. Feel free to ask your questions in the comments section below. You can also follow me on Medium to learn every topic of Machine Learning.

Aman Kharwal