Video Game Sales Prediction Model with Python

Video game sales analysis is a popular problem statement on Kaggle. You can work on this problem to analyze the sales of more than 16,500 games or you can also train a machine learning model for forecasting video game sales. So if you want to learn how to train a video game sales prediction model, this article is for you. In this article, I’ll walk you through a machine learning task on training a video game sales prediction model using Python.

Video Game Sales Prediction Model using Python

Analyzing sales data for over 16,500 games is a very popular problem statement on Kaggle. You can either solve this problem to find numerous patterns and relationships between factors affecting video game sales, or you can use this dataset to predict future video game sales. So in the section below, I’m going to walk you through how to train a machine learning model for predicting video game sales using Python.

The dataset I’m using for this task contains a list of video games and their sales. Let’s start this task by importing the necessary Python libraries and the dataset:

   Rank                      Name Platform    Year  ... EU_Sales JP_Sales  Other_Sales  Global_Sales
0     1                Wii Sports      Wii  2006.0  ...    29.02     3.77         8.46         82.74 
1     2         Super Mario Bros.      NES  1985.0  ...     3.58     6.81         0.77         40.24 
2     3            Mario Kart Wii      Wii  2008.0  ...    12.88     3.79         3.31         35.82 
3     4         Wii Sports Resort      Wii  2009.0  ...    11.01     3.28         2.96         33.00 
4     5  Pokemon Red/Pokemon Blue       GB  1996.0  ...     8.89    10.22         1.00         31.37

Now let’s see if this dataset contains null values:

print(data.isnull().sum())
Rank              0
Name              0
Platform          0
Year            271
Genre             0
Publisher        58
NA_Sales          0
EU_Sales          0
JP_Sales          0
Other_Sales       0
Global_Sales      0
dtype: int64

Now I’m going to create a new dataset removing the null values:

data = data.dropna()

Before we train the model, let’s take a look at the top 10 best-selling game categories:

Video Game Sales analysis

Now let’s have a look at the correlation between the features of this dataset:

print(data.corr())
sns.heatmap(data.corr(), cmap="winter_r")
plt.show()
correlation

Training Video Game Sales Prediction Model

Now let’s see how to train a machine learning model for predicting video game sales with Python. I’ll prepare the data by storing the features we need to train this model in the x variable and storing the target column in the y variable:

x = data[["Rank", "NA_Sales", "EU_Sales", "JP_Sales", "Other_Sales"]]
y = data["Global_Sales"]

Now let’s split the data and use the linear regression algorithm to train this model:

Summary

This is how we can train a machine learning model to predict video game sales. This is a popular Kaggle problem statement that you can use to improve your skills in working with data and training on the machine learning model. I hope you liked this article on how to train a video game sales prediction model using Python. Feel free to ask your valuable questions in the comments section below.

Aman Kharwal
Aman Kharwal

I'm a writer and data scientist on a mission to educate others about the incredible power of data📈.

Articles: 1500

Leave a Reply