# Salary Prediction with Machine Learning

Salary differs according to the job profile of the person. But generally, it’s the working experience that determines the salary. Salary prediction is a popular problem among the Data Science community for complete beginners. So, if you are a beginner in Data Science, you should work on this problem to understand Machine Learning. In this article, I will take you through the task of salary prediction with Machine Learning using Python.

## Salary Prediction with Machine Learning

For salary prediction, we need to find relationships in the data on how the salary is determined. For this task, we need to have a dataset based on salaries. I found a dataset that contains data about how job experience affects salary.

The dataset contains two columns only:

1. job experience
2. salary

In the section below, I will introduce how to use Machine Learning to predict the salary based on the job experience.

## Salary Prediction using Python

Let’s start this task by importing the necessary Python libraries and the dataset:

```import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go

```   YearsExperience   Salary
0              1.1  39343.0
1              1.3  46205.0
2              1.5  37731.0
3              2.0  43525.0
4              2.2  39891.0```

Let’s check if the dataset has any null values or not:

`print(data.isnull().sum())`
```YearsExperience    0
Salary             0
dtype: int64```

The dataset doesn’t have any null values. Let’s have a look at the relationship between the salary and job experience of the people:

```figure = px.scatter(data_frame = data,
x="Salary",
y="YearsExperience",
size="YearsExperience",
trendline="ols")
figure.show()```

There is a perfect linear relationship between the salary and the job experience of the people. It means more job experience results in a higher salary.

## Training a Machine Learning Model

As this is a regression analysis problem, we will train a regression model to predict salary with Machine Learning. Here’s how we can split the data into training and test sets before training the model:

```from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

x = np.asanyarray(data[["YearsExperience"]])
y = np.asanyarray(data[["Salary"]])
xtrain, xtest, ytrain, ytest = train_test_split(x, y,
test_size=0.2,
random_state=42)```

Now here’s how we can train the Machine Learning model:

```model = LinearRegression()
model.fit(xtrain, ytrain)```

Now let’s predict the salary of a person using the trained Machine Learning model:

```a = float(input("Years of Experience : "))
features = np.array([[a]])
print("Predicted Salary = ", model.predict(features))```
```Years of Experience : 2
Predicted Salary =  [[44169.21365784]]```

So this is how you can solve the salary prediction problem as a beginner in Data Science.

### Summary

Salary prediction is a popular problem among the Data Science community for complete beginners. Through this regression analysis, we found a perfect linear relationship between the salary and the job experience of the people. It means more job experience results in a higher salary. I hope you liked this article on the task of salary prediction with Machine Learning using Python. Feel free to ask valuable questions in the comments section below.

##### Aman Kharwal

I'm a writer and data scientist on a mission to educate others about the incredible power of dataðŸ“ˆ.

Articles: 1498