Bubble plots are an extension of scatter plots in which the dots are replaced with bubbles of varying size. Most often, a bubble plot displays the values of three numeric variables: the size of each observation's circle (“bubble”) represents one variable, while the horizontal and vertical positions of the bubble indicate the values of the other two.
Data Preparation for Bubble Plots
For this task, I will be using a dataset on Canadian immigration. It contains data from 1980 to 2013 and includes the number of immigrants from 195 countries. Now let’s import the necessary packages and dataset to get started with the task:
Now, let’s have a look at the columns of the dataset:
Index([ 'Type', 'Coverage', 'OdName', 'AREA', 'AreaName', 'REG', 'RegName', 'DEV', 'DevName', 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013], dtype='object')
I will not use all the columns. I will keep only the yearly counts needed to create the bubble plots, drop the descriptive columns, and use the country name (OdName) as the index:
df = df.drop(columns=['Type', 'Coverage', 'AREA', 'AreaName', 'REG', 'RegName', 'DEV', 'DevName']).set_index('OdName')
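To see what this step leaves behind, here is a minimal sketch on a tiny stand-in frame. The name `toy`, the placeholder `'?'` cells, and the counts are all made up for illustration and are not taken from the real dataset; only the column names come from the listing above.

```python
import pandas as pd

meta_cols = ['Type', 'Coverage', 'AREA', 'AreaName', 'REG', 'RegName', 'DEV', 'DevName']

# Hypothetical stand-in rows: the metadata cells and the counts are placeholders.
rows = {'OdName': ['India', 'Brazil'], 1980: [10, 20], 1981: [30, 40], 1982: [50, 60]}
rows.update({col: ['?', '?'] for col in meta_cols})
toy = pd.DataFrame(rows)

# Same operation as in the article: drop the descriptive columns, index by country.
toy = toy.drop(columns=meta_cols).set_index('OdName')

print(toy.columns.tolist())       # [1980, 1981, 1982]
print(toy.loc['India'].tolist())  # [10, 30, 50]
```

After this step every remaining column is a year, so selecting a country with `.loc` returns one value per year.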
Data Normalization for Bubble Plots
I will choose the data of India and Brazil for this task:
India = df.loc['India']
Brazil = df.loc['Brazil']
Normalization brings the data into a similar range. The immigration data for India and Brazil have different ranges, so I need to rescale both so that their values lie between 0 and 1.
I will simply divide the India data by the maximum value of the India data series, and do the same with the Brazil data series, so that the largest value in each series becomes 1. I will then plot the India and Brazil data against the years, so it will be helpful to have the years in a list:
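The normalization described above can be sketched as follows. This is only an illustration: the two series are synthetic stand-ins for `df.loc['India']` and `df.loc['Brazil']` (not real counts), and the names `i_normal` and `b_normal` are assumptions of this sketch.

```python
import pandas as pd

# Years 1980 through 2013, matching the integer year columns of the dataset.
years = list(range(1980, 2014))

# Synthetic stand-ins for the two country series; these are NOT real counts.
India = pd.Series(range(1000, 1000 + 100 * len(years), 100), index=years)
Brazil = pd.Series(range(50, 50 + 5 * len(years), 5), index=years)

# Divide each series by its own maximum so that its largest value becomes 1.
i_normal = India / India.max()
b_normal = Brazil / Brazil.max()

print(len(years))                      # 34
print(i_normal.max(), b_normal.max())  # 1.0 1.0
```

Because each series is divided by its own maximum, the two countries end up on the same 0 to 1 scale even though their raw counts differ widely.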
Now let’s draw the bubble plots, using the normalized values described above to set the bubble sizes:
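One way to draw such a plot with Matplotlib is sketched below. The data are synthetic stand-ins rather than the real counts, and the names `i_normal` and `b_normal`, the scale factor of 2000, and the colours are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

years = list(range(1980, 2014))

# Synthetic stand-ins for the India and Brazil series; NOT real counts.
India = pd.Series(range(1000, 1000 + 100 * len(years), 100), index=years)
Brazil = pd.Series(range(50, 50 + 5 * len(years), 5), index=years)
i_normal = India / India.max()
b_normal = Brazil / Brazil.max()

fig, ax = plt.subplots(figsize=(12, 8))
# s is the marker area in points squared; the normalized values (0 to 1)
# are multiplied by an arbitrary factor so the bubbles are visible.
ax.scatter(years, Brazil, s=b_normal * 2000, color='blue', alpha=0.5)    # Brazil
ax.scatter(years, India, s=i_normal * 2000, color='orange', alpha=0.5)  # India
ax.set_xlabel('Year')
ax.set_ylabel('Number of immigrants')
```

Call `plt.show()` afterwards to display the figure. The `alpha` value keeps overlapping bubbles readable.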
We can get an idea of the number of immigrants from the size of the bubbles. We can also make this plot multicoloured. To make the colours a little more meaningful, we need to sort the data series. You will see the reason very soon:
c_br = sorted(Brazil)
c_fr = sorted(India)
Now, let’s redraw the bubble plot so that its colours come from the sorted values:
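A possible version of the multicoloured plot is sketched below, again on synthetic stand-in data. It passes the sorted values to the `c` argument of `scatter`, so the colour runs from one end of the colormap to the other in year order. The colormap names and the size factor are assumptions of this sketch.

```python
import matplotlib.pyplot as plt
import pandas as pd

years = list(range(1980, 2014))

# Synthetic stand-ins for the India and Brazil series; NOT real counts.
India = pd.Series(range(1000, 1000 + 100 * len(years), 100), index=years)
Brazil = pd.Series(range(50, 50 + 5 * len(years), 5), index=years)
i_normal = India / India.max()
b_normal = Brazil / Brazil.max()

# Sorted values used as colour inputs (one number per bubble).
c_br = sorted(Brazil)
c_fr = sorted(India)

fig, ax = plt.subplots(figsize=(12, 8))
# c takes one number per point; cmap maps those numbers to colours.
ax.scatter(years, Brazil, c=c_br, cmap='viridis', s=b_normal * 2000, alpha=0.5)
ax.scatter(years, India, c=c_fr, cmap='plasma', s=i_normal * 2000, alpha=0.5)
ax.set_xlabel('Year')
ax.set_ylabel('Number of immigrants')
```

Note that the i-th sorted value is assigned to the i-th year, so the colour tracks the position in time rather than the count of that particular year.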
Now, let’s plot the number of immigrants from Brazil per year to see the trend over the years using bubble plots:
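A single-country trend plot might look like the sketch below. The Brazil series here is a synthetic stand-in, not the real data, and the title, size factor, and the name `b_normal` are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

years = list(range(1980, 2014))

# Synthetic stand-in for the Brazil series; NOT real counts.
Brazil = pd.Series(range(50, 50 + 5 * len(years), 5), index=years)
b_normal = Brazil / Brazil.max()
c_br = sorted(Brazil)

fig, ax = plt.subplots(figsize=(12, 8))
# Position shows year and count; bubble area repeats the count on a 0 to 1 scale.
bubbles = ax.scatter(years, Brazil, c=c_br, s=b_normal * 2000, alpha=0.5)
ax.set_xlabel('Year')
ax.set_ylabel('Number of immigrants')
ax.set_title('Immigrants from Brazil per year (synthetic example)')
```

Call `plt.show()` to display it. With the real series, rising or falling runs of bubbles make the trend visible at a glance.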
I hope you have now understood how to plot a bubble plot. I hope you liked this article on bubble plots using Python. Feel free to ask your valuable questions in the comments section below. You can also follow me on Medium to learn every topic of Machine Learning.