Ads CTR Forecasting using Python

Ads CTR Analysis is short for Click-Through Rate Analysis for advertisements. It is the process of examining how effective online advertisements are by measuring the rate at which users who see an ad click its link to reach the advertiser’s website. If you want to learn how to perform Ads CTR Analysis, this article is for you. In this article, I’ll take you through the task of Ads CTR Analysis and Forecasting using Python.

Ads CTR Forecasting: Process We Can Follow

Ads CTR Analysis and Forecasting are crucial for businesses to assess the return on investment (ROI) of their advertising efforts and make data-driven decisions to improve ad performance. Below are the steps we can follow for the task of Ads CTR Analysis and Forecasting:

  1. Gather ad data, including the number of ad impressions (how often an ad was shown), the number of clicks, and any other relevant metrics.
  2. Explore the data to understand its characteristics and distribution. Calculate basic statistics, such as the mean CTR (Click-Through Rate) and standard deviation.
  3. Create visualizations, such as line charts or bar graphs, to represent CTR trends over time.
  4. Conduct A/B tests if necessary to compare the performance of different ad variations.
  5. Analyze the CTR data to identify factors that influence ad performance.
  6. Build a forecasting model to predict future CTR values.
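
Step 4 is not demonstrated later in this article, so here is a minimal sketch of how two ad variations could be compared with a two-proportion z-test. The click and impression counts are made-up numbers for illustration, and `two_proportion_ztest` is a hypothetical helper, not part of the dataset or of any library:

```python
import math

def two_proportion_ztest(clicks_a, impressions_a, clicks_b, impressions_b):
    """Two-sided z-test for a difference in CTR between two ad variations."""
    ctr_a = clicks_a / impressions_a
    ctr_b = clicks_b / impressions_b
    # Pooled CTR under the null hypothesis that both ads share one CTR
    pooled = (clicks_a + clicks_b) / (impressions_a + impressions_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / impressions_a + 1 / impressions_b))
    z = (ctr_a - ctr_b) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts for ad A and ad B (not taken from the dataset)
z_stat, p_value = two_proportion_ztest(520, 10_000, 450, 10_000)
print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the gap in CTR between the two variations is unlikely to be due to chance alone.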

So, the process begins with collecting data. I found an ideal dataset for the task of Ads CTR Analysis and Forecasting. You can download the dataset from here.

Ads CTR Forecasting using Python

Let’s get started with the task of Ads CTR Analysis and forecasting by importing the necessary Python libraries and the dataset:

import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.tsa.seasonal import seasonal_decompose
import matplotlib.pyplot as plt

data = pd.read_csv("ctr.csv")
print(data.head())
         Date  Clicks  Impressions
0  2022-10-19    2851        58598
1  2022-10-20    2707        57628
2  2022-10-21    2246        50135
3  2022-10-22    1686        40608
4  2022-10-23    1808        41999

Let’s start by converting the Date column in the DataFrame from a string format to a datetime format and then setting it as the index of the DataFrame:

# Data Preparation
# The dates in the file look like 2022-10-19, so the format string uses dashes
data['Date'] = pd.to_datetime(data['Date'], format='%Y-%m-%d')
data.set_index('Date', inplace=True)
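
Before moving on, it can help to confirm that the parsed index has no gaps, because the time series model used later assumes evenly spaced daily observations. Here is a minimal sketch that uses only the five rows printed above (the names `sample` and `missing_days` are illustrative):

```python
import pandas as pd

# The five rows shown in the output of data.head()
sample = pd.DataFrame({
    "Date": ["2022-10-19", "2022-10-20", "2022-10-21", "2022-10-22", "2022-10-23"],
    "Clicks": [2851, 2707, 2246, 1686, 1808],
    "Impressions": [58598, 57628, 50135, 40608, 41999],
})
sample["Date"] = pd.to_datetime(sample["Date"], format="%Y-%m-%d")
sample = sample.set_index("Date")

# Every calendar day between the first and last date
expected_days = pd.date_range(sample.index.min(), sample.index.max(), freq="D")
missing_days = expected_days.difference(sample.index)

print("Missing days:", len(missing_days))
print("Rows with more clicks than impressions:",
      int((sample["Clicks"] > sample["Impressions"]).sum()))
```

The same two checks can be run on the full DataFrame once the Date column is the index.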

Now, let’s visualize the clicks and impressions over time:

# Visualize Clicks and Impressions
fig = go.Figure()
fig.add_trace(go.Scatter(x=data.index, y=data['Clicks'], mode='lines', name='Clicks'))
fig.add_trace(go.Scatter(x=data.index, y=data['Impressions'], mode='lines', name='Impressions'))
fig.update_layout(title='Clicks and Impressions Over Time')
fig.show()

[Figure: Clicks and Impressions Over Time]

Now, let’s have a look at the relationship between clicks and impressions:

# Create a scatter plot to visualize the relationship between Clicks and Impressions
fig = px.scatter(data, x='Clicks', y='Impressions', title='Relationship Between Clicks and Impressions',
                 labels={'Clicks': 'Clicks', 'Impressions': 'Impressions'})

# Customize the layout
fig.update_layout(xaxis_title='Clicks', yaxis_title='Impressions')

# Show the plot
fig.show()

[Figure: Relationship Between Clicks and Impressions]
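
To put a number on how closely clicks track impressions, one option is the Pearson correlation coefficient. This sketch uses only the five rows printed earlier, so the value is illustrative rather than a result for the full dataset:

```python
import pandas as pd

# The five rows shown in the output of data.head()
sample = pd.DataFrame({
    "Clicks": [2851, 2707, 2246, 1686, 1808],
    "Impressions": [58598, 57628, 50135, 40608, 41999],
})

# Pearson correlation: values near 1 indicate a strong positive linear relationship
correlation = sample["Clicks"].corr(sample["Impressions"])
print(f"Correlation between clicks and impressions: {correlation:.3f}")
```

On the full data, the same call would be data['Clicks'].corr(data['Impressions']).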

So, the relationship between clicks and impressions is roughly linear: days with more ad impressions also tend to get more ad clicks. Now, let’s calculate and visualize CTR over time:

# Calculate and visualize CTR
data['CTR'] = (data['Clicks'] / data['Impressions']) * 100
fig = px.line(data, x=data.index, y='CTR', title='Click-Through Rate (CTR) Over Time')
fig.show()

[Figure: Click-Through Rate (CTR) Over Time]
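
The CTR used here is clicks divided by impressions, multiplied by 100. One detail worth knowing is that the average of the daily CTR values is generally not the same as the overall CTR computed from total clicks and total impressions, because days with more impressions carry more weight in the overall figure. A short sketch on the five rows printed earlier:

```python
import pandas as pd

# The five rows shown in the output of data.head()
sample = pd.DataFrame({
    "Clicks": [2851, 2707, 2246, 1686, 1808],
    "Impressions": [58598, 57628, 50135, 40608, 41999],
})

# Daily CTR in percent
sample["CTR"] = sample["Clicks"] / sample["Impressions"] * 100

# Unweighted mean of the daily values vs. CTR over all impressions
mean_daily_ctr = sample["CTR"].mean()
overall_ctr = sample["Clicks"].sum() / sample["Impressions"].sum() * 100

print(sample["CTR"].round(2).tolist())
print(f"Mean of daily CTRs: {mean_daily_ctr:.3f}%")
print(f"Overall CTR:        {overall_ctr:.3f}%")
```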

Now, let’s have a look at the average CTR by day of the week:

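# Monday is 0 and Sunday is 6
data['DayOfWeek'] = data.index.dayofweek
# Week of the month (1 to 5), derived from the day of the month
data['WeekOfMonth'] = (data.index.day - 1) // 7 + 1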

# EDA based on DayOfWeek
day_of_week_ctr = data.groupby('DayOfWeek')['CTR'].mean().reset_index()
day_of_week_ctr['DayOfWeek'] = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']

fig = px.bar(day_of_week_ctr, x='DayOfWeek', y='CTR', title='Average CTR by Day of the Week')
fig.show()

[Figure: Average CTR by Day of the Week]
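
Assigning the list of day names by position works only when all seven days are present and sorted from Monday to Sunday. A slightly more defensive sketch lets pandas supply the names through day_name() and then orders them explicitly. The 14 CTR values below are made up for illustration:

```python
import pandas as pd

# Hypothetical CTR values for 14 consecutive days starting on a Wednesday
dates = pd.date_range("2022-10-19", periods=14, freq="D")
demo = pd.DataFrame({"CTR": [4.9, 4.7, 4.5, 4.2, 4.3, 4.8, 4.9,
                             5.0, 4.6, 4.4, 4.1, 4.2, 4.7, 4.8]}, index=dates)

day_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Group by the day name, then put the rows in calendar order
ctr_by_day = demo.groupby(demo.index.day_name())["CTR"].mean().reindex(day_order)
print(ctr_by_day)
```

If a day were missing from the data, reindex would leave a NaN for it instead of silently mislabeling the other days.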

Now, let’s compare the CTR on weekdays and weekends:

# Create a new column 'DayCategory' to categorize weekdays and weekends
data['DayCategory'] = data['DayOfWeek'].apply(lambda x: 'Weekend' if x >= 5 else 'Weekday')

# Calculate average CTR for weekdays and weekends
ctr_by_day_category = data.groupby('DayCategory')['CTR'].mean().reset_index()

# Create a bar plot to compare CTR on weekdays vs. weekends
fig = px.bar(ctr_by_day_category, x='DayCategory', y='CTR', title='Comparison of CTR on Weekdays vs. Weekends',
             labels={'CTR': 'Average CTR'})

# Customize the layout
fig.update_layout(yaxis_title='Average CTR')

# Show the plot
fig.show()

[Figure: Comparison of CTR on Weekdays vs. Weekends]
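
A bar chart of two averages does not show whether the gap is larger than ordinary day-to-day noise. One way to check is a permutation test, sketched below on synthetic CTR values. The numbers are generated rather than taken from the dataset, and `permutation_pvalue` is a hypothetical helper:

```python
import numpy as np

def permutation_pvalue(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = np.random.default_rng(seed)
    observed = abs(group_a.mean() - group_b.mean())
    combined = np.concatenate([group_a, group_b])
    count = 0
    for _ in range(n_permutations):
        # Shuffle the labels and recompute the gap between the two groups
        shuffled = rng.permutation(combined)
        diff = abs(shuffled[:len(group_a)].mean() - shuffled[len(group_a):].mean())
        if diff >= observed:
            count += 1
    # Add one to numerator and denominator so the p-value is never exactly zero
    return (count + 1) / (n_permutations + 1)

rng = np.random.default_rng(42)
weekday_ctr = rng.normal(loc=4.6, scale=0.3, size=50)  # synthetic weekday CTRs
weekend_ctr = rng.normal(loc=4.3, scale=0.3, size=20)  # synthetic weekend CTRs

p_value = permutation_pvalue(weekday_ctr, weekend_ctr)
print(f"p-value: {p_value:.4f}")
```

On the real data, the two groups would be the CTR values of the rows labeled 'Weekday' and 'Weekend'.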

Now, let’s compare the total impressions and clicks on weekdays and weekends. Keep in mind that these are sums, and there are five weekdays for every two weekend days:

# Group the data by 'DayCategory' and calculate the sum of Clicks and Impressions for each category
grouped_data = data.groupby('DayCategory')[['Clicks', 'Impressions']].sum().reset_index()

# Create a grouped bar chart to visualize Clicks and Impressions on weekdays vs. weekends
fig = px.bar(grouped_data, x='DayCategory', y=['Clicks', 'Impressions'],
             title='Impressions and Clicks on Weekdays vs. Weekends',
             labels={'value': 'Count', 'variable': 'Metric'},
             color_discrete_sequence=['blue', 'green'],
             barmode='group')  # side-by-side bars; the default stacks them

# Customize the layout
fig.update_layout(yaxis_title='Count')
fig.update_xaxes(title_text='Day Category')

# Show the plot
fig.show()

[Figure: Impressions and Clicks on Weekdays vs. Weekends]
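
Because a week has five weekdays and only two weekend days, weekday totals will be larger even when every individual day performs the same. Comparing per-day averages removes that effect. A minimal sketch with made-up numbers in which every day is identical:

```python
import pandas as pd

dates = pd.date_range("2022-10-17", periods=14, freq="D")  # starts on a Monday
demo = pd.DataFrame({
    "Clicks": [2500] * 14,        # hypothetical: identical clicks every day
    "Impressions": [50000] * 14,  # hypothetical: identical impressions every day
}, index=dates)
demo["DayCategory"] = ["Weekend" if d >= 5 else "Weekday" for d in demo.index.dayofweek]

# Sums are dominated by the number of days in each category
totals = demo.groupby("DayCategory")[["Clicks", "Impressions"]].sum()
# Means describe a typical day in each category
per_day = demo.groupby("DayCategory")[["Clicks", "Impressions"]].mean()

print(totals)
print(per_day)
```

Here the totals differ by a factor of 2.5 while the per-day averages are equal, which is why the averages are the fairer comparison.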

Ads CTR Forecasting

Now, let’s see how to forecast the Ads CTR. CTR depends on impressions, and impressions change over time, so we can use time series forecasting techniques to forecast CTR. Because CTR shows a seasonal pattern, we will use the SARIMA model, which needs values for p, d, and q. Let’s difference the series and plot its ACF and PACF to choose them:

from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Date is already the index, so CTR can be used directly as a time series.
# asfreq('D') marks the index as daily, which statsmodels expects.
time_series = data['CTR'].asfreq('D')

# First-order differencing to remove the trend
differenced_series = time_series.diff().dropna()

# Plot ACF and PACF of differenced time series
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
plot_acf(differenced_series, ax=axes[0])
plot_pacf(differenced_series, ax=axes[1])
plt.show()

[Figure: ACF and PACF of the differenced CTR series]

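Here, p, d, and q are each set to 1. The series was differenced once, which gives d = 1, and the PACF and ACF plots of the differenced series are used to pick the AR order p and the MA order q, respectively. You can learn more about calculating p, d, and q values from here. A SARIMA model also needs a seasonal period s, which is set to 12 here. For daily data with a weekly pattern, s = 7 may also be worth trying.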
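
One way to sanity-check a candidate seasonal period is to compare the autocorrelation of the series at different lags: a lag that matches a real cycle tends to show a clearly higher value. The sketch below uses a synthetic daily series with a built-in weekly cycle, so the series and its parameters are assumptions for illustration only:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
days = np.arange(140)

# Synthetic CTR: constant level + 7-day cycle + small noise
synthetic_ctr = pd.Series(
    4 + 0.3 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 0.05, size=days.size),
    index=pd.date_range("2022-10-19", periods=days.size, freq="D"),
)

# Compare autocorrelation at two candidate seasonal periods
for lag in (7, 12):
    print(f"lag {lag}: autocorrelation = {synthetic_ctr.autocorr(lag=lag):.2f}")
```

On the real data, the same comparison would use time_series.autocorr(lag=...) for each candidate value of s.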

Now, let’s train the forecasting model using SARIMA:

from statsmodels.tsa.statespace.sarimax import SARIMAX

p, d, q, s = 1, 1, 1, 12

model = SARIMAX(time_series, order=(p, d, q), seasonal_order=(p, d, q, s))
results = model.fit()
print(results.summary())
                                    SARIMAX Results                                      
==========================================================================================
Dep. Variable:                                CTR   No. Observations:                  365
Model:             SARIMAX(1, 1, 1)x(1, 1, 1, 12)   Log Likelihood                 -71.365
Date:                            Mon, 23 Oct 2023   AIC                            152.730
Time:                                    07:35:12   BIC                            172.048
Sample:                                10-19-2022   HQIC                           160.418
                                     - 10-18-2023                                         
Covariance Type:                              opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ar.L1          0.5266      0.070      7.513      0.000       0.389       0.664
ma.L1         -0.9049      0.036    -25.361      0.000      -0.975      -0.835
ar.S.L12      -0.1573      0.071     -2.225      0.026      -0.296      -0.019
ma.S.L12      -0.9974      1.099     -0.908      0.364      -3.151       1.156
sigma2         0.0772      0.084      0.917      0.359      -0.088       0.242
===================================================================================
Ljung-Box (L1) (Q):                   5.64   Jarque-Bera (JB):                 1.20
Prob(Q):                              0.02   Prob(JB):                         0.55
Heteroskedasticity (H):               1.14   Skew:                            -0.01
Prob(H) (two-sided):                  0.48   Kurtosis:                         3.28
===================================================================================

Now, here’s how to predict the future CTR values:

# Predict future values
future_steps = 100
predictions = results.predict(len(time_series), len(time_series) + future_steps - 1)
print(predictions)
2023-10-19    3.852350
2023-10-20    3.889426
2023-10-21    3.820260
2023-10-22    3.727494
2023-10-23    3.710360
                ...   
2024-01-22    3.545574
2024-01-23    3.466648
2024-01-24    3.561193
2024-01-25    3.546697
2024-01-26    3.580132
Freq: D, Name: predicted_mean, Length: 100, dtype: float64

Now, let’s visualize the forecasted trend of CTR:

# Create a DataFrame with the original data and predictions
forecast = pd.DataFrame({'Original': time_series, 'Predictions': predictions})

# Plot the original data and predictions
fig = go.Figure()

fig.add_trace(go.Scatter(x=forecast.index, y=forecast['Predictions'],
                         mode='lines', name='Predictions'))

fig.add_trace(go.Scatter(x=forecast.index, y=forecast['Original'],
                         mode='lines', name='Original Data'))

fig.update_layout(title='CTR Forecasting',
                  xaxis_title='Time Period',
                  yaxis_title='CTR (%)',
                  legend=dict(x=0.1, y=0.9),
                  showlegend=True)

fig.show()

[Figure: CTR Forecasting]
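
The model in this article is fitted on all 365 days, so its forecast accuracy is not measured. A common way to measure it is to hold out the most recent days, forecast them, and compare the error against a simple baseline. The sketch below scores a seasonal-naive baseline (repeat the values from one season earlier) on a synthetic series. The data, the 7-day season, and the helper name `seasonal_naive_forecast` are all assumptions for illustration:

```python
import numpy as np
import pandas as pd

def seasonal_naive_forecast(train, horizon, season_length):
    """Repeat the last observed season until the horizon is filled."""
    last_season = train.iloc[-season_length:].to_numpy()
    repeats = int(np.ceil(horizon / season_length))
    return np.tile(last_season, repeats)[:horizon]

# Synthetic daily CTR with a weekly cycle (not the article's data)
rng = np.random.default_rng(7)
days = np.arange(120)
series = pd.Series(
    4 + 0.3 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 0.05, size=days.size),
    index=pd.date_range("2023-01-01", periods=days.size, freq="D"),
)

# Hold out the last 30 days for evaluation
horizon = 30
train, test = series.iloc[:-horizon], series.iloc[-horizon:]

baseline = seasonal_naive_forecast(train, horizon, season_length=7)
mae = np.mean(np.abs(test.to_numpy() - baseline))
print(f"Seasonal-naive MAE over {horizon} held-out days: {mae:.3f}")
```

A SARIMA forecast for the same held-out days can be scored with the same MAE line. If it does not beat the baseline, the model is adding little.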

Summary

So, this is how we can analyze and forecast CTR using Python. Ads Click-Through Rate Analysis and Forecasting help businesses assess the return on investment (ROI) of their advertising efforts and make data-driven decisions to improve ad performance. I hope you liked this article on Ads CTR Analysis and Forecasting using Python. Feel free to ask valuable questions in the comments section below.

Aman Kharwal