# ArarForecast

Time series forecasting using the ARAR algorithm. Reference: Peter J. Brockwell and Richard A. Davis (2016), *Introduction to Time Series and Forecasting*, Section 10.1, "The ARAR Algorithm".
## Installation

```julia
using Pkg
# dev version
Pkg.add(url = "https://github.com/Akai01/ArarForecast.jl.git")
```
## Usage

Load packages:

```julia
using CSV
using Downloads
using DataFrames
using TimeSeries
using Dates
using ArarForecast
```
Download and load the data:

```julia
dta = CSV.File(Downloads.download("https://raw.githubusercontent.com/Akai01/exampletimeseriesdatasets/main/Data/AirPassengers.csv")) |> DataFrame;

train = filter(row -> row["ds"] < Date(1960,1,1), dta);

test = filter(row -> row["ds"] >= Date(1960,1,1), dta);
```
Create a `TimeArray`:

```julia
train = (date = train[:,"ds"], data = train[:, "y"]);
train = TimeArray(train; timestamp = :date);

test = (date = test[:,"ds"], test = test[:, "y"]);
test = TimeArray(test; timestamp = :date);

length(test)
## 12
```
There are different ways to create a `TimeArray`; see the TimeSeries.jl package documentation.
## Forecasting

```julia
fc = arar(;y = train, h = 12, freq = Month, level = [80, 95]);

typeof(fc)
## ArarForecast.Forecast
```
Plot the forecast object:

```julia
p = ArarForecast.plot(;object = fc)
## Plot{Plots.GRBackend() n=6}

using Plots
Plots.plot(p, test)
```
The accuracy:

```julia
accuracy(fc, test)
## (me = [11.275515996902792], mae = [13.065720800742184], mape = [2.8577953269438026], mdae = [8.718991617233343], rmse = 18.21764747304158)
```
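For readers unfamiliar with the abbreviations in that tuple, the measures can be reproduced by hand. The following is a minimal sketch with hypothetical helper names (the package's own `accuracy` implementation may differ); each measure is a function of the errors `a .- f` between the actual and forecast values:

```julia
using Statistics: median

# Illustrative definitions of the reported error measures
# (hypothetical helpers, not the package's accuracy function).
me(a, f)   = sum(a .- f) / length(a)                     # mean error
mae(a, f)  = sum(abs.(a .- f)) / length(a)               # mean absolute error
mape(a, f) = 100 * sum(abs.((a .- f) ./ a)) / length(a)  # mean absolute percentage error
mdae(a, f) = median(abs.(a .- f))                        # median absolute error
rmse(a, f) = sqrt(sum((a .- f) .^ 2) / length(a))        # root mean squared error
```

Note that `mape` is scale-free (a percentage), which is why it is used for the cross-language comparison below.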
## Benchmark with the R forecast package's auto.arima

Load the data and create a `ts` object:
```r
library(magrittr)
dta = read.csv("https://raw.githubusercontent.com/Akai01/exampletimeseriesdatasets/main/Data/AirPassengers.csv")%>%
  dplyr::mutate(ds = as.Date(ds))

head(dta)
##           ds   y
## 1 1949-01-31 112
## 2 1949-02-28 118
## 3 1949-03-31 132
## 4 1949-04-30 129
## 5 1949-05-31 121
## 6 1949-06-30 135
```
```r
train <- dta%>%dplyr::filter(ds < as.Date("1960-01-01"))
train_ts <- train%>%dplyr::select(y)%>%
  ts(start = c(1949, 1), frequency = 12)

test <- dta%>%dplyr::filter(ds >= as.Date("1960-01-01"))
test_ts <- test%>%dplyr::select(y)%>%
  ts(start = c(1960, 1), frequency = 12)
```
Train and forecast 12 months ahead:

```r
fc <- forecast::auto.arima(train_ts)%>%
  forecast::forecast(h = 12)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
```
Plot the forecast:

```r
forecast::autoplot(fc) + forecast::autolayer(test_ts)

forecast::accuracy(fc$mean, test_ts)
##                ME    RMSE      MAE      MPE     MAPE       ACF1 Theil's U
## Test set 16.98639 23.9317 18.52768 3.933491 4.182395 0.04802038 0.5336134
```
MAPE benchmark on the held-out 1960 data: R `forecast::auto.arima` 4.18 vs. Julia ArarForecast 2.85.
## How does the ARAR algorithm work?

### Memory shortening

The ARAR algorithm first applies a memory-shortening transformation if the underlying process of a given time series Y_{t}, t = 1, 2, ..., n is "long-memory", and then fits an autoregressive model to the shortened series.
The algorithm follows five steps to classify Y_{t} and takes one of the following three actions:

- **L:** declare Y_{t} long-memory and shorten it to Ỹ_{t} = Y_{t} − ϕ̂Y_{t − τ̂}
- **M:** declare Y_{t} moderately long-memory and shorten it to Ỹ_{t} = Y_{t} − ϕ̂_{1}Y_{t − 1} − ϕ̂_{2}Y_{t − 2}
- **S:** declare Y_{t} short-memory.
If Y_{t} is declared L or M, the shortened series is classified again, and the process continues until the transformed series is classified as short-memory. At most three transformations are applied; in practice it is very rare for a series to require more than two.

The classification uses, for each lag τ, the coefficient ϕ̂(τ) of the best-fitting one-lag filter Y_{t} − ϕ̂(τ)Y_{t − τ} and its normalized squared error Err(τ); τ̂ is the lag minimizing Err(τ):

- If Err(τ̂) ≤ 8/n, Y_{t} is a long-memory series.
- If ϕ̂(τ̂) ≥ 0.93 and τ̂ > 2, Y_{t} is a long-memory series.
- If ϕ̂(τ̂) ≥ 0.93 and τ̂ = 1 or 2, Y_{t} is a moderately long-memory series.
- If ϕ̂(τ̂) < 0.93, Y_{t} is a short-memory series.
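The classification rules above can be sketched in a few lines of Julia. This is an illustrative implementation with a hypothetical helper name (`classify_memory`), not the package's internal code; it fits ϕ̂(τ) by least squares for each lag τ ≤ 15 and then applies the four rules:

```julia
# Sketch of the memory-shortening classification (illustrative, not the
# package's implementation). For each lag τ, ϕ̂(τ) minimizes
# Err(ϕ, τ) = Σ (Y_t − ϕ Y_{t−τ})² / Σ Y_t², and τ̂ = argmin Err(τ).
function classify_memory(y::Vector{Float64}; maxlag::Int = 15)
    n = length(y)
    ϕ = zeros(maxlag)
    err = zeros(maxlag)
    for τ in 1:maxlag
        ϕ[τ] = sum(y[τ+1:n] .* y[1:n-τ]) / sum(y[1:n-τ] .^ 2)       # least squares
        err[τ] = sum((y[τ+1:n] .- ϕ[τ] .* y[1:n-τ]) .^ 2) / sum(y[τ+1:n] .^ 2)
    end
    τ̂ = argmin(err)
    if err[τ̂] <= 8 / n || (ϕ[τ̂] >= 0.93 && τ̂ > 2)
        return (:L, τ̂, ϕ[τ̂])   # long memory: shorten with Y_t − ϕ̂ Y_{t−τ̂}
    elseif ϕ[τ̂] >= 0.93         # here necessarily τ̂ = 1 or 2
        return (:M, τ̂, ϕ[τ̂])   # moderately long memory: use an AR(2) filter
    else
        return (:S, τ̂, ϕ[τ̂])   # short memory: stop transforming
    end
end
```

A strongly trending series such as `collect(1.0:200.0)` is classified `:L`, since its one-lag filter at τ̂ = 1 leaves almost no residual error.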
### Subset autoregressive model

Next, the ARAR algorithm fits an autoregressive process to the mean-corrected series X_{t} = S_{t} − S̄, t = k + 1, ..., n, where S_{t}, t = k + 1, ..., n is the memory-shortened version of Y_{t} derived in the steps described above and S̄ is the sample mean of S_{k + 1}, ..., S_{n}.

The fitted model has the following form:
X_{t} = ϕ_{1}X_{t − 1} + ϕ_{l1}X_{t − l1} + ϕ_{l2}X_{t − l2} + ϕ_{l3}X_{t − l3} + Z_{t}
where Z_{t} ∼ WN(0, σ²). For given lags l_{1}, l_{2}, and l_{3}, the coefficients ϕ_{j} are obtained from the Yule-Walker equations

ρ̂(l) = ϕ_{1}ρ̂(l − 1) + ϕ_{l1}ρ̂(l − l_{1}) + ϕ_{l2}ρ̂(l − l_{2}) + ϕ_{l3}ρ̂(l − l_{3}),  l ∈ {1, l_{1}, l_{2}, l_{3}} (with ρ̂(−j) = ρ̂(j)),

and the white-noise variance is σ² = γ̂(0)[1 − ϕ_{1}ρ̂(1) − ϕ_{l1}ρ̂(l_{1}) − ϕ_{l2}ρ̂(l_{2}) − ϕ_{l3}ρ̂(l_{3})], where γ̂(j) and ρ̂(j), j = 0, 1, 2, ..., are the sample autocovariances and autocorrelations of the series X_{t}.
The algorithm computes the coefficients ϕ_{j} for each set of lags with 1 < l_{1} < l_{2} < l_{3} ≤ m, where m is chosen to be 13 or 26, and selects the model for which the Yule-Walker estimate of σ² is minimal.
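The lag search can be sketched as follows. This is an illustrative implementation with a hypothetical helper name (`best_subset_ar`), not the package's API; for every lag triple it solves the 4×4 Yule-Walker system and keeps the model with the smallest estimated innovation variance:

```julia
using LinearAlgebra

# Sketch of the subset-AR lag search (illustrative; not the package's API).
# Solves the 4×4 Yule-Walker system for (ϕ_1, ϕ_l1, ϕ_l2, ϕ_l3) for every
# 1 < l1 < l2 < l3 ≤ m and keeps the model minimizing σ².
function best_subset_ar(x::Vector{Float64}; m::Int = 13)
    n = length(x)
    x̄ = sum(x) / n
    γ = [sum((x[1+h:n] .- x̄) .* (x[1:n-h] .- x̄)) / n for h in 0:m]  # γ̂(0..m)
    ρ = γ ./ γ[1]                                                    # ρ̂(0..m)
    best = (σ² = Inf, lags = (0, 0, 0), ϕ = zeros(4))
    for l1 in 2:m-2, l2 in l1+1:m-1, l3 in l2+1:m
        lags = [1, l1, l2, l3]
        R = [ρ[abs(a - b) + 1] for a in lags, b in lags]  # ρ̂(|l_i − l_j|)
        r = [ρ[l + 1] for l in lags]                      # ρ̂(l_i)
        ϕ = R \ r                                         # Yule-Walker solve
        σ² = γ[1] * (1 - dot(ϕ, r))                       # innovation variance
        if σ² < best.σ²
            best = (σ² = σ², lags = (l1, l2, l3), ϕ = ϕ)
        end
    end
    return best
end
```

The search is cheap: with m = 13 there are only a few hundred lag triples, each requiring one 4×4 solve.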
### Forecasting

The short-memory filter found in the first step has coefficients Ψ_{0}, Ψ_{1}, ..., Ψ_{k} (k ≥ 0) with Ψ_{0} = 1, so the memory-shortened series can be expressed as

S_{t} = Ψ(B)Y_{t} = Y_{t} + Ψ_{1}Y_{t − 1} + ... + Ψ_{k}Y_{t − k},   (1)

where Ψ(B) = 1 + Ψ_{1}B + ... + Ψ_{k}B^{k} is a polynomial in the backshift operator B.
The subset autoregression found in the second step has coefficients ϕ_{1}, ϕ_{l1}, ϕ_{l2} and ϕ_{l3}, so the subset AR model for X_{t} = S_{t} − S̄ is

ϕ(B)X_{t} = Z_{t},   (2)

where Z_{t} is a white-noise series with zero mean and constant variance and ϕ(B) = 1 − ϕ_{1}B − ϕ_{l1}B^{l1} − ϕ_{l2}B^{l2} − ϕ_{l3}B^{l3}. From equations (1) and (2) one obtains

ξ(B)Y_{t} = ϕ(1)S̄ + Z_{t},   (3)

where ξ(B) = Ψ(B)ϕ(B).
Assuming the fitted model in equation (3) is appropriate and Z_{t} is uncorrelated with Y_{j} for all j < t, one can determine the minimum mean squared error linear predictors P_{n}Y_{n + h} of Y_{n + h} in terms of 1, Y_{1}, ..., Y_{n}, for n > k + l_{3}, from the recursions

P_{n}Y_{n + h} = −Σ_{j = 1}^{k + l3} ξ_{j}P_{n}Y_{n + h − j} + ϕ(1)S̄,   h ≥ 1,

with the initial conditions P_{n}Y_{n + h} = Y_{n + h} for h ≤ 0.
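The recursion can be sketched directly. The following is an illustrative helper with hypothetical inputs (none of this is the package's API): `ψ` is the fitted memory-shortening filter with `ψ[1]` = Ψ_{0} = 1, `ϕ` holds (ϕ_{1}, ϕ_{l1}, ϕ_{l2}, ϕ_{l3}), `lags` is (l_{1}, l_{2}, l_{3}), and `S̄` is the sample mean of the shortened series:

```julia
# Sketch of the ARAR forecast recursion (illustrative, hypothetical inputs).
function arar_predict(y::Vector{Float64}, ψ::Vector{Float64},
                      ϕ::Vector{Float64}, lags::NTuple{3,Int},
                      S̄::Float64, h::Int)
    l3 = lags[3]
    # ϕ(B) = 1 − ϕ_1 B − ϕ_l1 B^l1 − ϕ_l2 B^l2 − ϕ_l3 B^l3 as a coefficient vector
    ϕpoly = zeros(l3 + 1)
    ϕpoly[1] = 1.0
    ϕpoly[2] -= ϕ[1]
    for (c, l) in zip(ϕ[2:end], lags)
        ϕpoly[l + 1] -= c
    end
    # ξ(B) = Ψ(B)ϕ(B): polynomial product, i.e. coefficient convolution
    k = length(ψ) - 1
    ξ = zeros(k + l3 + 1)
    for i in 0:k, j in 0:l3
        ξ[i + j + 1] += ψ[i + 1] * ϕpoly[j + 1]
    end
    const_term = sum(ϕpoly) * S̄        # ϕ(1)·S̄
    preds = copy(y)                     # P_n Y_{n+h} = Y_{n+h} for h ≤ 0
    n = length(y)
    for step in 1:h
        val = const_term                # P_n Y_{n+h} = −Σ ξ_j P_n Y_{n+h−j} + ϕ(1)S̄
        for j in 1:(k + l3)
            val -= ξ[j + 1] * preds[n + step - j]
        end
        push!(preds, val)
    end
    return preds[n+1:end]
end
```

For example, with no memory shortening (`ψ = [1.0]`) and only ϕ_{1} nonzero, the recursion reduces to the familiar AR(1) predictor P_{n}Y_{n + 1} = S̄ + ϕ_{1}(Y_{n} − S̄); with a differencing filter `ψ = [1.0, -1.0]` and all ϕ zero, it extrapolates a drift of S̄ per step.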