SARIMA in Machine Learning

In Machine Learning, a seasonal autoregressive integrated moving average (SARIMA) model is a different step from an ARIMA model based on the concept of seasonal trends. In this article, I will introduce you to the SARIMA model in machine learning.

What is the SARIMA Model?

Seasonal variations of the time series can take into account periodic models, allowing more accurate predictions. Seasonal ARIMA (SARIMA) defines both a seasonal and a non-seasonal component of the ARIMA model, allowing periodic characteristics to be captured.

Also, Read – How To Launch Your Machine Learning Model?

By choosing an appropriate forecasting model, always visualize your data to identify trends, seasons and cycles. If seasonality is a strong feature of the series, consider models with seasonal adjustments such as the SARIMA model.

SARIMA Model in Action

In this article, I will use the number of tourist arrivals in Italy. The data is taken from European statistics: annual data on tourism industries. You can download this dataset from here. First, we import the data set for foreign tourist arrivals in Italy from 2012 to October 2019 and then convert it to a time series.

Let’s get started with the task by importing and reading the data:

import pandas as pd
df = pd.read_csv('IT_tourists_arrivals.csv')
df['date'] = pd.to_datetime(df['date'])
df = df[df['date'] > '2012-01-01']
df.set_index('date', inplace=True)Code language: JavaScript (javascript)

Preliminary analysis

The preliminary analysis is a visual analysis of the time series, to understand its trend and behaviour. First, we create the time series and store it in the variable ts.

ts = df['value']
import matplotlib.pylab as plt
plt.plot(ts)
plt.ylabel('Total Number of Tourists Arrivals')
plt.grid()
plt.tight_layout()
plt.savefig('plots/IT_tourists_arrivals.png')
plt.show()Code language: JavaScript (javascript)
Image for post

Now, let’s tune the parameters:

from statsmodels.tsa.stattools import adfuller
def test_stationarity(timeseries):
    
    dftest = adfuller(timeseries, autolag='AIC')
    dfoutput = pd.Series(dftest[0:4], index=['Test Statistic','p-value','#Lags Used','Number of Observations Used'])
    for key,value in dftest[4].items():
        dfoutput['Critical Value (%s)'%key] = value
    
    critical_value = dftest[4]['5%']
    test_statistic = dftest[0]
    alpha = 1e-3
    pvalue = dftest[1]
    if pvalue < alpha and test_statistic < critical_value:  # null hypothesis: x is non stationary
        print("X is stationary")
        return True
    else:
        print("X is not stationary")
        return FalseCode language: PHP (php)

We now need to transform the time series via the diff () function as many times as the time series becomes stationary:

ts_diff = pd.Series(ts)
d = 0
while test_stationarity(ts_diff) is False:
    ts_diff = ts_diff.diff().dropna()
    d = d + 1Code language: PHP (php)

Now, let’s plot the correlations between the parameters:

from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
plot_acf(ts_trend, lags =12)
plt.savefig('plots/acf.png')
plt.show()Code language: JavaScript (javascript)
SARIMA

Building the SARIMA Model:

Now, let’s build our model by using the SARIMAX method provided by the statsmodels library:

from statsmodels.tsa.statespace.sarimax import SARIMAX
p = 9
q = 1
model = SARIMAX(ts, order=(p,d,q))
model_fit = model.fit(disp=1,solver='powell')
    
fcast = model_fit.get_prediction(start=1, end=len(ts))
ts_p = fcast.predicted_mean
ts_ci = fcast.conf_int()Code language: JavaScript (javascript)

Now, we need to plot the results of our model:

plt.plot(ts_p,label='prediction')
plt.plot(ts,color='red',label='actual')
plt.fill_between(ts_ci.index[1:],
                ts_ci.iloc[1:, 0],
                ts_ci.iloc[1:, 1], color='k', alpha=.2)
plt.ylabel('Total Number of Tourists Arrivals')
plt.legend()
plt.tight_layout()
plt.grid()
plt.savefig('plots/IT_trend_prediction.png')
plt.show()Code language: JavaScript (javascript)
Image for post

Also, Read – Word Embeddings in Machine Learning.

I hope you liked this article on how we can build the SARIMA Model for seasonality effects. Feel free to ask your valuable questions in the comments section below. You can also follow me on Medium to learn every topic of Machine Learning.

Follow Us:

Aman Kharwal
Aman Kharwal

I'm a writer and data scientist on a mission to educate others about the incredible power of data📈.

Articles: 1538

One comment

Leave a Reply