Salary differs according to the job profile of the person. But generally, it’s the working experience that determines the salary. Salary prediction is a popular problem among the **Data Science** community for complete beginners. So, if you are a beginner in Data Science, you should work on this problem to understand Machine Learning. In this article, I will take you through the task of salary prediction with Machine Learning using Python.

## Salary Prediction with Machine Learning

For salary prediction, we need to find relationships in the data on how the salary is determined. For this task, we need to have a dataset based on salaries. I found a dataset that contains data about how job experience affects salary.

The dataset contains two columns only:

- job experience
- salary

**You can download the dataset from here.**

In the section below, I will introduce how to use Machine Learning to predict the salary based on the job experience.

## Salary Prediction using Python

Let’s start this task by importing the necessary Python libraries and the **dataset**:

import pandas as pd import numpy as np import plotly.express as px import plotly.graph_objects as go data = pd.read_csv("Salary_Data.csv") print(data.head())

YearsExperience Salary 0 1.1 39343.0 1 1.3 46205.0 2 1.5 37731.0 3 2.0 43525.0 4 2.2 39891.0

Let’s check if the dataset has any null values or not:

print(data.isnull().sum())

YearsExperience 0 Salary 0 dtype: int64

The dataset doesn’t have any null values. Let’s have a look at the relationship between the salary and job experience of the people:

figure = px.scatter(data_frame = data, x="Salary", y="YearsExperience", size="YearsExperience", trendline="ols") figure.show()

There is a perfect linear relationship between the salary and the job experience of the people. It means more job experience results in a higher salary.

## Training a Machine Learning Model

As this is a regression analysis problem, we will train a regression model to predict salary with Machine Learning. Here’s how we can split the data into training and test sets before training the model:

from sklearn.model_selection import train_test_split from sklearn.linear_model import LinearRegression x = np.asanyarray(data[["YearsExperience"]]) y = np.asanyarray(data[["Salary"]]) xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2, random_state=42)

Now here’s how we can train the Machine Learning model:

model = LinearRegression() model.fit(xtrain, ytrain)

Now let’s predict the salary of a person using the trained Machine Learning model:

a = float(input("Years of Experience : ")) features = np.array([[a]]) print("Predicted Salary = ", model.predict(features))

Years of Experience : 2 Predicted Salary = [[44169.21365784]]

So this is how you can solve the salary prediction problem as a beginner in Data Science.

### Summary

Salary prediction is a popular problem among the Data Science community for complete beginners. Through this regression analysis, we found a perfect linear relationship between the salary and the job experience of the people. It means more job experience results in a higher salary. I hope you liked this article on the task of salary prediction with Machine Learning using Python. Feel free to ask valuable questions in the comments section below.