Video game sales analysis is a popular problem statement on Kaggle. You can work on this problem to analyze the sales of more than 16,500 games or you can also train a machine learning model for forecasting video game sales. So if you want to learn how to train a video game sales prediction model, this article is for you. In this article, I’ll walk you through a machine learning task on training a video game sales prediction model using Python.
Video Game Sales Prediction Model using Python
Analyzing sales data for over 16,500 games is a very popular problem statement on Kaggle. You can either solve this problem to find numerous patterns and relationships between factors affecting video game sales, or you can use this dataset to predict future video game sales. So in the section below, I’m going to walk you through how to train a machine learning model for predicting video game sales using Python.
The dataset I’m using for this task contains a list of video games and their sales. Let’s start this task by importing the necessary Python libraries and the dataset:
Rank Name Platform Year ... EU_Sales JP_Sales Other_Sales Global_Sales 0 1 Wii Sports Wii 2006.0 ... 29.02 3.77 8.46 82.74 1 2 Super Mario Bros. NES 1985.0 ... 3.58 6.81 0.77 40.24 2 3 Mario Kart Wii Wii 2008.0 ... 12.88 3.79 3.31 35.82 3 4 Wii Sports Resort Wii 2009.0 ... 11.01 3.28 2.96 33.00 4 5 Pokemon Red/Pokemon Blue GB 1996.0 ... 8.89 10.22 1.00 31.37
Now let’s see if this dataset contains null values:
print(data.isnull().sum())
Rank 0 Name 0 Platform 0 Year 271 Genre 0 Publisher 58 NA_Sales 0 EU_Sales 0 JP_Sales 0 Other_Sales 0 Global_Sales 0 dtype: int64
Now I’m going to create a new dataset removing the null values:
data = data.dropna()
Before we train the model, let’s take a look at the top 10 best-selling game categories:

Now let’s have a look at the correlation between the features of this dataset:
print(data.corr()) sns.heatmap(data.corr(), cmap="winter_r") plt.show()

Training Video Game Sales Prediction Model
Now let’s see how to train a machine learning model for predicting video game sales with Python. I’ll prepare the data by storing the features we need to train this model in the x variable and storing the target column in the y variable:
x = data[["Rank", "NA_Sales", "EU_Sales", "JP_Sales", "Other_Sales"]] y = data["Global_Sales"]
Now let’s split the data and use the linear regression algorithm to train this model:
Summary
This is how we can train a machine learning model to predict video game sales. This is a popular Kaggle problem statement that you can use to improve your skills in working with data and training on the machine learning model. I hope you liked this article on how to train a video game sales prediction model using Python. Feel free to ask your valuable questions in the comments section below.