Time series forecasting using the ARAR algorithm. Reference: Peter J. Brockwell & Richard A. Davis (2016), Introduction to Time Series and Forecasting, Section 10.1, "The ARAR Algorithm".
using Pkg
# dev version
Pkg.add(url = "https://github.com/Akai01/ArarForecast.jl.git")
using CSV
using Downloads
using DataFrames
using TimeSeries
using Dates
using ArarForecast
dta = CSV.File(Downloads.download("https://raw.githubusercontent.com/Akai01/example-time-series-datasets/main/Data/AirPassengers.csv")) |> DataFrame;
train = filter(row -> row["ds"] < Date(1960,1,1), dta);
test = filter(row -> row["ds"] >= Date(1960,1,1), dta);
train = (date = train[:,"ds"], data = train[:, "y"]);
train = TimeArray(train; timestamp = :date);
test = (date = test[:,"ds"], test = test[:, "y"]);
test = TimeArray(test; timestamp = :date);
length(test)
## 12
There are different ways to create a `TimeArray`; see the TimeSeries.jl package documentation.
fc = arar(;y = train, h = 12, freq = Month, level = [80, 95]);
typeof(fc)
## ArarForecast.Forecast
p = ArarForecast.plot(;object = fc)
## Plot{Plots.GRBackend() n=6}
using Plots
Plots.plot(p, test)
accuracy(fc, test)
## (me = [-11.275515996902792], mae = [13.065720800742184], mape = [2.8577953269438026], mdae = [8.718991617233343], rmse = 18.21764747304158)
For comparison, load the data into R and create a `ts` object:
library(magrittr)
dta = read.csv("https://raw.githubusercontent.com/Akai01/example-time-series-datasets/main/Data/AirPassengers.csv")%>%
dplyr::mutate(ds = as.Date(ds))
head(dta)
## ds y
## 1 1949-01-31 112
## 2 1949-02-28 118
## 3 1949-03-31 132
## 4 1949-04-30 129
## 5 1949-05-31 121
## 6 1949-06-30 135
train <- dta%>%dplyr::filter(ds < as.Date("1960-01-01"))
train_ts <- train%>%dplyr::select(-ds)%>%
ts(start = c(1949, 1), frequency = 12)
test <- dta%>%dplyr::filter(ds >= as.Date("1960-01-01"))
test_ts <- test%>%dplyr::select(-ds)%>%
ts(start = c(1960, 1), frequency = 12)
fc <- forecast::auto.arima(train_ts)%>%
forecast::forecast(h = 12)
forecast::autoplot(fc) + forecast::autolayer(test_ts)
forecast::accuracy(fc$mean, test_ts)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set -16.98639 23.9317 18.52768 -3.933491 4.182395 0.04802038 0.5336134
The ARAR algorithm applies a memory-shortening transformation to a given time series $\{Y_t,\ t = 1, 2, \dots, n\}$ if its underlying process is "long-memory", and then fits an autoregressive model to the transformed series.
The algorithm follows five steps to classify $\{Y_t\}$ and takes one of the following three actions:
- L: declare $\{Y_t\}$ long-memory and form the transformed series $\tilde{Y}_t = Y_t - \hat\phi\,Y_{t-\hat\tau}$
- M: declare $\{Y_t\}$ moderately long-memory and form the transformed series $\tilde{Y}_t = Y_t - \hat\phi_1 Y_{t-1} - \hat\phi_2 Y_{t-2}$
- S: declare $\{Y_t\}$ short-memory.
If $\{Y_t\}$ is declared L or M, the transformed series is classified again, and the process repeats until the transformed series is classified as short-memory. At most three memory-shortening transformations are applied; in practice it is very rare for a series to require more than two. The classification uses, for each lag $\tau = 1, \dots, 15$, the coefficient $\hat\phi(\tau)$ minimizing $\mathrm{ERR}(\phi, \tau) = \sum_{t=\tau+1}^{n}\big[Y_t - \phi Y_{t-\tau}\big]^2 \big/ \sum_{t=\tau+1}^{n} Y_t^2$, with $\mathrm{Err}(\tau) = \min_\phi \mathrm{ERR}(\phi, \tau)$ and $\hat\tau$ the lag that minimizes $\mathrm{Err}(\tau)$. The rules are as follows (a minimal code sketch of this classification step follows the list):
- If $\mathrm{Err}(\hat\tau) \le 8/n$, $\{Y_t\}$ is a long-memory series (action L).
- If $\hat\phi(\hat\tau) \ge 0.93$ and $\hat\tau > 2$, $\{Y_t\}$ is a long-memory series (action L).
- If $\hat\phi(\hat\tau) \ge 0.93$ and $\hat\tau = 1$ or $2$, $\{Y_t\}$ is a moderately long-memory series (action M).
- If $\hat\phi(\hat\tau) < 0.93$, $\{Y_t\}$ is a short-memory series (action S).
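To make this concrete, here is a minimal Julia sketch of one memory-shortening pass under the definitions above. The function names (`best_lag`, `shorten_once`, `memory_shorten`) are illustrative helpers, not the ArarForecast.jl API, and the AR(2) fit in the M branch is assumed to be done by ordinary least squares.

# For each lag τ = 1, …, 15, fit Ỹ_t = ϕ(τ)Y_{t−τ} by least squares and
# record the normalized squared error Err(τ); τ̂ is the minimizer.
function best_lag(y::Vector{Float64}; maxlag::Int = 15)
    n = length(y)
    ϕ = zeros(maxlag)
    err = fill(Inf, maxlag)
    for τ in 1:maxlag
        ϕ[τ] = sum(y[t] * y[t-τ] for t in τ+1:n) / sum(y[t-τ]^2 for t in τ+1:n)
        err[τ] = sum((y[t] - ϕ[τ] * y[t-τ])^2 for t in τ+1:n) / sum(y[t]^2 for t in τ+1:n)
    end
    τ̂ = argmin(err)
    return τ̂, ϕ[τ̂], err[τ̂]
end

# One classification pass: returns the label (:L, :M, or :S) and the
# (possibly) memory-shortened series.
function shorten_once(y::Vector{Float64})
    n = length(y)
    τ̂, ϕ̂, e = best_lag(y)
    if e <= 8 / n || (ϕ̂ >= 0.93 && τ̂ > 2)
        return :L, y[τ̂+1:end] .- ϕ̂ .* y[1:end-τ̂]   # Ỹ_t = Y_t − ϕ̂Y_{t−τ̂}
    elseif ϕ̂ >= 0.93                                # τ̂ is 1 or 2
        X = hcat(y[2:end-1], y[1:end-2])            # least-squares AR(2) fit (assumed)
        ϕ₂ = X \ y[3:end]
        return :M, y[3:end] .- X * ϕ₂               # Ỹ_t = Y_t − ϕ̂₁Y_{t−1} − ϕ̂₂Y_{t−2}
    else
        return :S, y
    end
end

# Repeat until the series is classified as short memory, applying at
# most three transformations.
function memory_shorten(y::Vector{Float64})
    s = copy(y)
    for _ in 1:3
        label, s = shorten_once(s)
        label == :S && break
    end
    return s
end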
In the following, we describe how the ARAR algorithm fits an autoregressive process to the mean-corrected series $X_t = S_t - \bar S$, $t = k+1, \dots, n$, where $\{S_t,\ t = k+1, \dots, n\}$ is the memory-shortened version of $\{Y_t\}$ derived from the steps described above and $\bar S$ is the sample mean of $S_{k+1}, \dots, S_n$.
The fitted model has the following form:
$$X_t = \phi_1 X_{t-1} + \phi_{l_1} X_{t-l_1} + \phi_{l_2} X_{t-l_2} + \phi_{l_3} X_{t-l_3} + Z_t,$$
where $Z_t \sim \mathrm{WN}(0, \sigma^2)$. For given lags $l_1$, $l_2$, and $l_3$, the coefficients $\phi_j$ and the white-noise variance $\sigma^2$ can be derived from the Yule-Walker equations

$$\begin{bmatrix} 1 & \hat\rho(l_1 - 1) & \hat\rho(l_2 - 1) & \hat\rho(l_3 - 1) \\ \hat\rho(l_1 - 1) & 1 & \hat\rho(l_2 - l_1) & \hat\rho(l_3 - l_1) \\ \hat\rho(l_2 - 1) & \hat\rho(l_2 - l_1) & 1 & \hat\rho(l_3 - l_2) \\ \hat\rho(l_3 - 1) & \hat\rho(l_3 - l_1) & \hat\rho(l_3 - l_2) & 1 \end{bmatrix} \begin{bmatrix} \phi_1 \\ \phi_{l_1} \\ \phi_{l_2} \\ \phi_{l_3} \end{bmatrix} = \begin{bmatrix} \hat\rho(1) \\ \hat\rho(l_1) \\ \hat\rho(l_2) \\ \hat\rho(l_3) \end{bmatrix}$$

and

$$\sigma^2 = \hat\gamma(0)\left[1 - \phi_1\hat\rho(1) - \phi_{l_1}\hat\rho(l_1) - \phi_{l_2}\hat\rho(l_2) - \phi_{l_3}\hat\rho(l_3)\right],$$

where $\hat\gamma(j)$ and $\hat\rho(j)$, $j = 0, 1, 2, \dots$, are the sample autocovariances and autocorrelations of the series $X_t$.
The algorithm computes the coefficients $\phi_j$ for each set of lags with $1 < l_1 < l_2 < l_3 \le m$, where $m$ is chosen to be 13 or 26, and selects the model for which the Yule-Walker estimate of $\sigma^2$ is minimal. A sketch of this search appears below.
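The following is a minimal Julia sketch of this model-fitting search under the Yule-Walker equations above. The helpers (`acvf`, `yw_subset`, `best_subset_ar`) and the default `m = 13` are illustrative assumptions, not the ArarForecast.jl API.

using Statistics, LinearAlgebra

# Sample autocovariance γ̂(j) of x; ρ̂(j) = γ̂(j)/γ̂(0).
function acvf(x::Vector{Float64}, j::Int)
    μ = mean(x); n = length(x)
    return sum((x[t] - μ) * (x[t+j] - μ) for t in 1:n-j) / n
end

# Solve the 4×4 Yule-Walker system for lags (1, l1, l2, l3); returns the
# coefficients and the Yule-Walker estimate of σ².
function yw_subset(x::Vector{Float64}, l1::Int, l2::Int, l3::Int)
    ρ(j) = acvf(x, j) / acvf(x, 0)
    lags = (1, l1, l2, l3)
    R = [ρ(abs(lags[i] - lags[j])) for i in 1:4, j in 1:4]  # ρ̂(0) = 1 on the diagonal
    r = [ρ(l) for l in lags]
    ϕ = R \ r
    σ² = acvf(x, 0) * (1 - dot(ϕ, r))
    return ϕ, σ²
end

# Search all 1 < l1 < l2 < l3 ≤ m and keep the model with minimal σ².
function best_subset_ar(x::Vector{Float64}; m::Int = 13)
    best = (σ² = Inf, lags = (0, 0, 0), ϕ = Float64[])
    for l1 in 2:m-2, l2 in l1+1:m-1, l3 in l2+1:m
        ϕ, σ² = yw_subset(x, l1, l2, l3)
        σ² < best.σ² && (best = (σ² = σ², lags = (l1, l2, l3), ϕ = ϕ))
    end
    return best
end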
The short-memory filter found in the first step has coefficients $\Psi_0, \Psi_1, \dots, \Psi_k$ ($k \ge 0$) with $\Psi_0 = 1$. In this case the transformed series can be expressed as

$$S_t = \Psi(B)Y_t = Y_t + \Psi_1 Y_{t-1} + \dots + \Psi_k Y_{t-k}, \quad (1)$$

where $\Psi(B) = 1 + \Psi_1 B + \dots + \Psi_k B^k$ is a polynomial in the backshift operator $B$.
If the subset autoregression found in the second step has coefficients $\phi_1$, $\phi_{l_1}$, $\phi_{l_2}$, and $\phi_{l_3}$, then the subset AR model for $X_t = S_t - \bar S$ is

$$\phi(B)X_t = Z_t, \quad (2)$$

where $Z_t$ is a white-noise series with zero mean and constant variance and $\phi(B) = 1 - \phi_1 B - \phi_{l_1}B^{l_1} - \phi_{l_2}B^{l_2} - \phi_{l_3}B^{l_3}$. From equations (1) and (2) one obtains

$$\xi(B)Y_t = \phi(1)\bar S + Z_t, \quad (3)$$
where $\xi(B) = \Psi(B)\phi(B)$.
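Since $\xi(B)$ is a product of two backshift polynomials, its coefficients are simply the convolution of the two coefficient sequences. A small illustrative Julia helper (hypothetical, not part of ArarForecast.jl):

# Multiply two backshift polynomials given as coefficient vectors
# [c0, c1, ..., ck] with c0 = 1: ξ(B) = Ψ(B)ϕ(B) is their convolution.
function polymul(Ψ::Vector{Float64}, ϕ::Vector{Float64})
    ξ = zeros(length(Ψ) + length(ϕ) - 1)
    for i in eachindex(Ψ), j in eachindex(ϕ)
        ξ[i+j-1] += Ψ[i] * ϕ[j]
    end
    return ξ
end

# Example: Ψ(B) = 1 − B and ϕ(B) = 1 − 0.5B give ξ(B) = 1 − 1.5B + 0.5B².
polymul([1.0, -1.0], [1.0, -0.5])
## [1.0, -1.5, 0.5]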
Assuming that the fitted model in equation (3) is appropriate and that $Z_t$ is uncorrelated with $Y_j$ for all $j < t$, one can determine the minimum mean squared error linear predictors $P_nY_{n+h}$ of $Y_{n+h}$ in terms of $1, Y_1, \dots, Y_n$, for $n > k + l_3$, from the recursion

$$P_nY_{n+h} = -\sum_{j=1}^{k+l_3} \xi_j\, P_nY_{n+h-j} + \phi(1)\bar S, \quad h \ge 1,$$

with the initial conditions $P_nY_{n+h} = Y_{n+h}$ for $h \le 0$, where $\xi_j$ denotes the coefficient of $B^j$ in $\xi(B)$.
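A minimal Julia sketch of this recursion, assuming the full coefficient vector of $\xi(B)$ and the constant $\phi(1)\bar S$ have already been computed (the names `arar_predict` and `ϕ1S̄` are illustrative, not the ArarForecast.jl API):

# h-step-ahead forecasts from the recursion
#   P_nY_{n+h} = −Σ_{j=1}^{k+l3} ξ_j P_nY_{n+h−j} + ϕ(1)S̄,  h ≥ 1,
# with P_nY_{n+h} = Y_{n+h} for h ≤ 0. `ξ` holds [1, ξ₁, …, ξ_{k+l3}]
# and `ϕ1S̄` stands for the constant ϕ(1)S̄.
function arar_predict(y::Vector{Float64}, ξ::Vector{Float64}, ϕ1S̄::Float64, h::Int)
    n = length(y)
    hist = copy(y)        # Y₁, …, Yₙ, extended with forecasts as they are made
    preds = Float64[]
    for step in 1:h
        p = ϕ1S̄ - sum(ξ[j+1] * hist[n+step-j] for j in 1:length(ξ)-1)
        push!(hist, p)
        push!(preds, p)
    end
    return preds
end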