Amazon Bestselling Books Analysis with Python

In this article, I’m going to introduce you to a data science project on Amazon bestselling books analysis with the Python programming language. The data I’ll be using is a dataset of Amazon’s top 50 bestselling books for each year from 2009 to 2019.

The dataset contains 550 entries (50 books per year over 11 years), and each book has been categorized as fiction or non-fiction using Goodreads.

Data Science Project on Amazon Bestselling Books Analysis with Python

I will start the Amazon bestselling books analysis by importing the necessary Python libraries and the dataset:
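The original import and loading code is not reproduced here, so the following is only a minimal sketch of that step using pandas. The column layout (Name, Author, User Rating, Reviews, Price, Year, Genre), the genre label "Non Fiction" and the file name mentioned in the comment are assumptions; only the User Rating and Genre columns are named in this article. The two rows are placeholders, not real records.

```python
import io

import pandas as pd

# Placeholder rows in the ASSUMED column layout of the bestsellers CSV.
# They are not real records from the dataset.
SAMPLE_CSV = """Name,Author,User Rating,Reviews,Price,Year,Genre
Placeholder Book A,Author One,4.7,1200,8,2009,Fiction
Placeholder Book B,Author Two,4.5,800,12,2009,Non Fiction
"""


def load_books(source):
    """Read the bestsellers CSV from a file path or a file-like object."""
    return pd.read_csv(source)


# With the real data this would be something like
# load_books("bestsellers.csv"), where the file name is hypothetical.
df = load_books(io.StringIO(SAMPLE_CSV))
print(df.shape)
print(df.dtypes)
```

Checking the shape and dtypes right after loading is a quick way to confirm that the year and rating columns were parsed as numbers.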

Data Preparation:

The next step is to prepare the data. Here I will rename the User Rating column to user_rating and then fix some inconsistent spellings in the data:
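A minimal sketch of this preparation step follows. The article does not say which spellings were fixed, so the mapping below uses a hypothetical author name purely for illustration, and the helper name `prepare` is also an assumption.

```python
import pandas as pd


def prepare(df, author_fixes):
    """Return a copy with a snake_case rating column and unified author spellings."""
    out = df.rename(columns={"User Rating": "user_rating"})
    # Strip stray whitespace, then map known variants to one canonical spelling.
    out["Author"] = out["Author"].str.strip().replace(author_fixes)
    return out


# Placeholder data: the same (hypothetical) author spelled two ways.
raw = pd.DataFrame({
    "Name": ["Placeholder Book A", "Placeholder Book B"],
    "Author": ["A. B. Writer", "A.B. Writer"],
    "User Rating": [4.7, 4.5],
})

# Hypothetical mapping: variant spelling -> canonical spelling.
fixes = {"A. B. Writer": "A.B. Writer"}

clean = prepare(raw, fixes)
print(clean.columns.tolist())
print(clean["Author"].unique())
```

Unifying author spellings matters for the later steps, because counts per author would otherwise be split across the variants.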

In the dataset, Genre is a binary categorical variable with two values: fiction and non-fiction. Non-fiction was the more popular category in every year from 2009 to 2019. Of the 351 unique books, 54.4% were non-fiction and 45.6% were fiction.
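The share of each genre among unique books can be computed by dropping repeated titles before counting. This is a sketch under the assumption that the title column is called Name; the rows are placeholders, so the printed percentages are not the article's figures.

```python
import pandas as pd


def genre_share_unique(df):
    """Percentage of unique titles that fall into each genre."""
    # A book that made the list in several years is counted once.
    unique_books = df.drop_duplicates(subset="Name")
    return unique_books["Genre"].value_counts(normalize=True).mul(100).round(1)


# Placeholder data: title "A" appears in two different years.
books = pd.DataFrame({
    "Name": ["A", "A", "B", "C", "D"],
    "Year": [2009, 2010, 2009, 2010, 2010],
    "Genre": ["Fiction", "Fiction", "Non Fiction", "Non Fiction", "Non Fiction"],
})

share = genre_share_unique(books)
print(share)
```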

Non-fiction reached its highest share of the list (66%) in 2015, which was also the lowest year for fiction. Fiction reached its highest share (48%) in 2009, 2013 and 2017, which were the lowest years for non-fiction. Let’s visualize the data according to the genre:

[Figure: Amazon bestselling books by genre]

Now let’s visualize the same genre split for each year:
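The numbers behind a per-year chart can be obtained with a normalized cross-tabulation. The sketch below assumes columns named Year and Genre and uses placeholder rows. It produces only the table of percentages, not the original plot.

```python
import pandas as pd


def genre_share_by_year(df):
    """Percentage of each year's list entries that belong to each genre."""
    # normalize="index" makes every row (year) sum to 1 before scaling to percent.
    return pd.crosstab(df["Year"], df["Genre"], normalize="index").mul(100).round(1)


# Placeholder data: two entries in 2009, four in 2010.
books = pd.DataFrame({
    "Year": [2009, 2009, 2010, 2010, 2010, 2010],
    "Genre": ["Fiction", "Non Fiction",
              "Fiction", "Non Fiction", "Non Fiction", "Non Fiction"],
})

by_year = genre_share_by_year(books)
print(by_year)
```

The resulting table has one row per year and one column per genre, which is a convenient shape for a stacked or grouped bar chart.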

[Figure: Amazon bestselling books by genre for each year]

The bestselling authors are ranked by how often they appear in the top 50 bestselling books of each year from 2009 to 2019. Now let’s look at the top 10 bestselling authors in the fiction and non-fiction categories:
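Ranking authors by appearances within one genre can be sketched as a filter followed by a count. The helper name `top_authors` and the column names are assumptions, and the authors below are placeholders.

```python
import pandas as pd


def top_authors(df, genre, n=10):
    """Authors with the most list appearances within one genre."""
    # value_counts sorts by count in descending order.
    counts = df.loc[df["Genre"] == genre, "Author"].value_counts()
    return counts.head(n)


# Placeholder data: every row is one appearance on a yearly list.
books = pd.DataFrame({
    "Author": ["Author One", "Author One", "Author One",
               "Author Two", "Author Three", "Author Three"],
    "Genre": ["Fiction", "Fiction", "Fiction",
              "Fiction", "Non Fiction", "Non Fiction"],
})

print(top_authors(books, "Fiction"))
print(top_authors(books, "Non Fiction"))
```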

[Figure: top 10 bestselling authors by genre]

The number of appearances counts a book once for every year it made the list, so repeated titles are included. Each author’s number of unique titles and total reviews are shown below:
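One way to build such a per-author summary is sketched below. It assumes columns named Author, Name and Reviews, and it sums reviews over unique titles so that a book listed in several years is not counted twice. That is one possible reading of "total reviews", not necessarily the one used for the original chart. All rows are placeholders.

```python
import pandas as pd


def author_summary(df):
    """Appearances, unique titles and total reviews for each author."""
    # Appearances count every list entry, including repeated titles.
    appearances = df.groupby("Author")["Name"].count().rename("appearances")
    # Reviews are summed over unique titles to avoid double counting.
    unique = df.drop_duplicates(subset=["Author", "Name"])
    per_title = unique.groupby("Author").agg(
        unique_titles=("Name", "count"),
        total_reviews=("Reviews", "sum"),
    )
    return per_title.join(appearances).sort_values("appearances", ascending=False)


# Placeholder data: "Book A" appears on the list in two different years.
books = pd.DataFrame({
    "Author": ["Author One", "Author One", "Author One", "Author Two"],
    "Name": ["Book A", "Book A", "Book B", "Book C"],
    "Reviews": [100, 100, 50, 30],
    "Year": [2009, 2010, 2011, 2009],
})

summary = author_summary(books)
print(summary)
```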

[Figure: top-selling authors with appearances, unique titles and total reviews]

Jeff Kinney is the bestselling author, with 12 appearances on the bestseller lists from 2009 to 2019. The same approach can be used to analyze other sales data that is split into categories.

I hope you liked this article on the data science project on Amazon bestselling books analysis with Python. Feel free to ask your questions in the comments section below.

Aman Kharwal