# Ads Click Through Rate Prediction using Python

Ads click-through rate prediction means predicting whether a user will click on an ad. For this task, we train a Machine Learning model to learn the relationship between the characteristics of users and whether they click on ads.

I found an ideal dataset for this task containing data about the user and whether the user clicked on the ad. You can download the dataset from here.

In the section below, I will take you through the task of Ads Click Through Rate Prediction with Machine Learning using Python.

## Ads Click-Through Rate Prediction using Python

Let’s start the task of ads click-through rate prediction by importing the necessary Python libraries and the dataset:

```
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px
import plotly.io as pio
import numpy as np
pio.templates.default = "plotly_white"

# Placeholder file name: point this at the CSV file you downloaded
data = pd.read_csv("ads_click_data.csv")
print(data.head())
```

```
   Daily Time Spent on Site   Age  Area Income  Daily Internet Usage  \
0                     62.26  32.0     69481.85                172.83
1                     41.73  31.0     61840.26                207.17
2                     44.40  30.0     57877.15                172.83
3                     59.88  28.0     56180.93                207.17
4                     49.21  30.0     54324.73                201.58

                         Ad Topic Line             City  Gender  \
0      Decentralized real-time circuit         Lisafort    Male
1       Optional full-range projection  West Angelabury    Male
2  Total 5thgeneration standardization        Reyesfurt  Female
3          Balanced empowering success      New Michael  Female
4  Total 5thgeneration standardization     West Richard  Female

                        Country            Timestamp  Clicked on Ad
0  Svalbard & Jan Mayen Islands  2016-06-09 21:43:05              0
1                     Singapore  2016-01-16 17:56:05              0
3                        Zambia  2016-06-21 14:32:32              0
4                         Qatar  2016-07-21 10:54:35              1
```

The “Clicked on Ad” column contains 0 and 1 values, where 0 means not clicked and 1 means clicked. I’ll transform these values into “No” and “Yes”:

```
data["Clicked on Ad"] = data["Clicked on Ad"].map({0: "No",
                                                   1: "Yes"})
```

## Click Through Rate Analysis

Now let’s analyze the click-through rate based on the time spent by the users on the website:

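```
# color splits the boxes by click status; without it the colour map has no effect
fig = px.box(data,
             x="Daily Time Spent on Site",
             color="Clicked on Ad",
             title="Click Through Rate based on Time Spent on Site",
             color_discrete_map={'Yes':'blue',
                                 'No':'red'})
fig.update_traces(quartilemethod="exclusive")
fig.show()
```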

From the above graph, we can see that the users who spend more time on the website click more on ads. Now let’s analyze the click-through rate based on the daily internet usage of the user:

```
# One box per click status, compared on daily internet usage
fig = px.box(data,
             x="Daily Internet Usage",
             color="Clicked on Ad",
             title="Click Through Rate based on Daily Internet Usage",
             color_discrete_map={'Yes':'blue',
                                 'No':'red'})
fig.update_traces(quartilemethod="exclusive")
fig.show()
```

From the above graph, we can see that the users with high internet usage click less on ads compared to the users with low internet usage. Now let’s analyze the click-through rate based on the age of the users:

```
# One box per click status, compared on age
fig = px.box(data,
             x="Age",
             color="Clicked on Ad",
             title="Click Through Rate based on Age",
             color_discrete_map={'Yes':'blue',
                                 'No':'red'})
fig.update_traces(quartilemethod="exclusive")
fig.show()
```

From the above graph, we can see that users around 40 years old click more on ads compared to users around 27-36 years old. Now let’s analyze the click-through rate based on the income of the users:

```
# One box per click status, compared on area income
fig = px.box(data,
             x="Area Income",
             color="Clicked on Ad",
             title="Click Through Rate based on Income",
             color_discrete_map={'Yes':'blue',
                                 'No':'red'})
fig.update_traces(quartilemethod="exclusive")
fig.show()
```

There’s not much difference, but people from high-income areas click less on ads.
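The box plots compare the two groups visually. If you want numbers to back up what you see, one option is to compare the median of each feature for users who clicked and users who didn't. Here is a minimal sketch of that idea. The `toy` DataFrame contains made-up rows for illustration only; with the real data you would run the same `groupby` on `data`.

```python
import pandas as pd

# Made-up rows for illustration only; they are not taken from the real dataset.
toy = pd.DataFrame({
    "Daily Time Spent on Site": [62.3, 41.7, 44.4, 59.9, 49.2, 70.1],
    "Age": [32, 31, 30, 28, 30, 45],
    "Clicked on Ad": ["No", "No", "No", "No", "Yes", "Yes"],
})

# Median of each numeric feature within each click group
medians = toy.groupby("Clicked on Ad")[["Daily Time Spent on Site", "Age"]].median()
print(medians)
```

A large gap between the "Yes" and "No" rows for a feature suggests that the feature may help the model separate the two groups.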

Now let’s calculate the overall ads click-through rate. Here we need the ratio of users who clicked on the ad to users who were shown the ad (impressions). In this dataset, every row is one user who saw the ad. So let’s see the distribution of users:

`data["Clicked on Ad"].value_counts()`
```
No     5083
Yes    4917
Name: Clicked on Ad, dtype: int64
```

So 4917 out of 10000 users clicked on the ads. Let’s calculate the CTR:

```
click_through_rate = 4917 / 10000 * 100
print(click_through_rate)
```
`49.17`

So the CTR is 49.17%.
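The calculation above types the counts in by hand. As a small sketch, the same percentage can be computed directly from the column, so it stays correct if the data changes. The `clicked` Series below is a made-up stand-in for `data["Clicked on Ad"]`.

```python
import pandas as pd

# Made-up stand-in for the "Clicked on Ad" column (not the real dataset)
clicked = pd.Series(["Yes", "No", "No", "Yes", "No"], name="Clicked on Ad")

# The mean of a boolean mask is the share of rows that are True
click_through_rate = (clicked == "Yes").mean() * 100
print(f"CTR: {click_through_rate:.2f}%")
```

With the real data, replacing `clicked` with `data["Clicked on Ad"]` should reproduce the figure calculated above.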

## Click Through Rate Prediction Model

Now let’s move on to training a Machine Learning model to predict click-through rate. I’ll start by converting the Gender column into numbers, selecting the features, and dividing the data into training and testing sets:

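```
# Encode Gender as a number so the model can use it
data["Gender"] = data["Gender"].map({"Male": 1,
                                     "Female": 0})

# Features: the first seven columns, minus the two free-text columns
x = data.iloc[:, 0:7]
x = x.drop(["Ad Topic Line", "City"], axis=1)
# Target: the "Clicked on Ad" column
y = data.iloc[:, 9]

from sklearn.model_selection import train_test_split
xtrain, xtest, ytrain, ytest = train_test_split(x, y,
                                                test_size=0.2,
                                                random_state=4)
```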
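The split above is random, so the share of "Yes" and "No" labels in the test set can differ a little from the full data. If you want both sets to keep the same class ratio, `train_test_split` accepts a `stratify` argument. Below is a minimal sketch on made-up data; `toy_x` and `toy_y` are hypothetical names used only for this illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up balanced labels (50 "Yes", 50 "No") with a single dummy feature
toy_x = pd.DataFrame({"feature": range(100)})
toy_y = pd.Series(["Yes", "No"] * 50)

# stratify=toy_y asks scikit-learn to keep the class ratio in both splits
xtr, xte, ytr, yte = train_test_split(
    toy_x, toy_y, test_size=0.2, random_state=4, stratify=toy_y
)
print(len(xtr), len(xte))
print(yte.value_counts().to_dict())
```

The classes in this dataset are already close to balanced, so stratifying is optional here.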

Now let’s train the model using the random forest classification algorithm:

```
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()
# Fit on the training set only, so the test set stays unseen for evaluation
model.fit(xtrain, ytrain)
```
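Once a random forest is trained, its `feature_importances_` attribute gives a rough idea of which features it relied on. The sketch below uses random stand-in values and a made-up labelling rule (only the column names mirror the features used in this article), so the numbers it prints say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Random stand-in values; only the column names mirror the article's features
toy_x = pd.DataFrame({
    "Daily Time Spent on Site": rng.uniform(30, 90, 200),
    "Age": rng.integers(18, 60, 200),
    "Area Income": rng.uniform(20000, 80000, 200),
    "Daily Internet Usage": rng.uniform(100, 270, 200),
    "Gender": rng.integers(0, 2, 200),
})
# Made-up rule so the forest has something to learn: users over 40 click
toy_y = np.where(toy_x["Age"] > 40, "Yes", "No")

forest = RandomForestClassifier(random_state=0).fit(toy_x, toy_y)

# Importances are normalised so that they sum to 1
importances = pd.Series(forest.feature_importances_, index=toy_x.columns)
print(importances.sort_values(ascending=False))
```

With the real data, the same two lines on `model` and `x.columns` would rank the five features used for training.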

Now let’s have a look at the accuracy of the model:

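```
from sklearn.metrics import accuracy_score
# Predict on the held-out test set before scoring
y_pred = model.predict(xtest)
# The exact value can vary slightly between runs
print(accuracy_score(ytest, y_pred))
```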
`0.9615`
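Accuracy is a single number, so it doesn't show which kind of mistake the model makes. A confusion matrix breaks the predictions down by true and predicted class. Here is a minimal sketch on synthetic data generated with `make_classification`; it is not the ads dataset, and the names `clf`, `pred` and `cm` are only for this illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not the ads dataset): 500 rows, 5 numeric features
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=4)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_te, pred)
print(cm)
print("accuracy:", accuracy_score(y_te, pred))
```

The diagonal of the matrix counts correct predictions, so accuracy is the diagonal sum divided by the total.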

Now let’s test the model by making predictions:

```
print("Ads Click Through Rate Prediction : ")
a = float(input("Daily Time Spent on Site: "))
b = float(input("Age: "))
c = float(input("Area Income: "))
d = float(input("Daily Internet Usage: "))
# Convert to int so every feature in the array is numeric
e = int(input("Gender (Male = 1, Female = 0) : "))

# Same feature order as the training data
features = np.array([[a, b, c, d, e]])
print("Will the user click on ad = ", model.predict(features))
```
```
Ads Click Through Rate Prediction :
Daily Time Spent on Site: 62.26
Age: 28
Area Income: 61840.26
Daily Internet Usage: 207.17
Gender (Male = 1, Female = 0) : 0
Will the user click on ad =  ['No']
```
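The prediction above is a plain "Yes" or "No". If you would rather see how confident the model is, `predict_proba` returns an estimated probability for each class. The sketch below trains a small forest on random stand-in data with a made-up labelling rule, so its probabilities are illustrative only. It also passes the new user as a DataFrame so that the column names match the training data.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

columns = ["Daily Time Spent on Site", "Age", "Area Income",
           "Daily Internet Usage", "Gender"]

rng = np.random.default_rng(1)
# Random stand-in training data with a made-up labelling rule
train_x = pd.DataFrame(rng.uniform(0, 100, size=(200, 5)), columns=columns)
train_y = np.where(train_x["Age"] > 50, "Yes", "No")
forest = RandomForestClassifier(random_state=0).fit(train_x, train_y)

# One new user, passed as a DataFrame so column names match the training data
new_user = pd.DataFrame([[62.26, 28, 61840.26, 207.17, 0]], columns=columns)
proba = forest.predict_proba(new_user)[0]

# classes_ gives the label that each probability belongs to
print(dict(zip(forest.classes_, proba)))
```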

### Summary

##### Aman Kharwal

I'm a writer and data scientist on a mission to educate others about the incredible power of data📈.


1. #### lemartinezba

Hi, I was trying to run this code, but it says:
`line 68, in <module>`
`print(accuracy_score(ytest,y_pred))`
`NameError: name 'y_pred' is not defined`

How can I define y_pred?
Thanks

• #### Aman Kharwal

`y_pred = model.predict(xtest)`