
7.4 ARIMA Models (AutoRegressive Integrated Moving Average)

"Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful."— George E. P. Box, Statistician

Classic method for time series forecasting: From theory to practice



Section Objectives

Upon completing this section, you will be able to:

  • Understand the mathematical principles of AR, MA, ARMA, and ARIMA models
  • Master model identification methods (ACF/PACF analysis)
  • Use the Box-Jenkins method for model selection
  • Implement ARIMA model estimation and diagnosis
  • Perform time series forecasting and calculate forecast intervals
  • Master SARIMA (Seasonal ARIMA)
  • Complete ARIMA modeling workflow using Python

ARIMA Family Overview

Model Family Tree

ARIMA Family
├── AR(p) - Autoregressive Model
│   └── Current value depends on past p values

├── MA(q) - Moving Average Model
│   └── Current value depends on past q errors

├── ARMA(p,q) - Autoregressive Moving Average
│   └── Combination of AR + MA

├── ARIMA(p,d,q) - ARMA after differencing
│   └── d times differencing + ARMA(p,q)

└── SARIMA(p,d,q)(P,D,Q)s - Seasonal ARIMA
    └── ARIMA + seasonal components

Why ARIMA?

Characteristics of Time Series:

  • Current values correlated with past values (autocorrelation)
  • OLS regression assumes independent errors; autocorrelated data violate that assumption, so standard OLS inference does not apply directly

Advantages of ARIMA:

  • Explicitly models time dependence structure
  • Flexibly adapts to different data characteristics
  • Strong short-horizon forecasting performance on many univariate series

AR Model (Autoregressive Model)

AR(p) Model Definition

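Mathematical Expression:

$$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t$$

Where:

  • $y_t$: Current value
  • $\phi_1, \ldots, \phi_p$: Autoregressive coefficients
  • $c$: Constant term
  • $\varepsilon_t$: White noise

Lag Operator Representation:

$$\phi(L)\, y_t = c + \varepsilon_t, \qquad \phi(L) = 1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p$$

Where $L$ is the lag operator: $L^k y_t = y_{t-k}$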

AR(1) Model Details

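Simplest AR Model:

$$y_t = c + \phi_1 y_{t-1} + \varepsilon_t$$

Stationarity Condition:

$$|\phi_1| < 1$$

Autocorrelation Function (ACF):

$$\rho_k = \phi_1^k, \qquad k = 0, 1, 2, \ldots$$

Characteristic: Exponential decay (with alternating signs when $\phi_1 < 0$)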
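The exponential decay of the AR(1) autocorrelations, $\rho_k = \phi_1^k$, can be checked by simulation. The following is a minimal sketch using only NumPy; the helper names `simulate_ar1` and `sample_acf`, the coefficient 0.7, and the sample size are illustrative assumptions, not part of the original text.

```python
import numpy as np

def simulate_ar1(phi, n, c=0.0, sigma=1.0, burn_in=200, seed=0):
    """Simulate y_t = c + phi * y_{t-1} + eps_t with Gaussian white noise."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=n + burn_in)
    y = np.empty(n + burn_in)
    y[0] = c / (1.0 - phi)              # start at the stationary mean
    for t in range(1, n + burn_in):
        y[t] = c + phi * y[t - 1] + eps[t]
    return y[burn_in:]                  # drop the burn-in so the start value does not matter

def sample_acf(y, max_lag):
    """Sample autocorrelations rho_0 .. rho_max_lag."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    denom = np.dot(y, y)
    return np.array([np.dot(y[k:], y[:len(y) - k]) / denom
                     for k in range(max_lag + 1)])

phi = 0.7                               # assumed coefficient, |phi| < 1 so the series is stationary
y = simulate_ar1(phi, n=5000)
acf = sample_acf(y, max_lag=5)
theory = phi ** np.arange(6)            # rho_k = phi^k

for k in range(6):
    print(f"lag {k}: sample {acf[k]: .3f}   theory {theory[k]: .3f}")
```

With a few thousand observations the sample values should sit close to the theoretical curve; the gap shrinks roughly with the square root of the sample size.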

AR Model Identification

PACF Cutoff Rule:

  • PACF of AR(p) cuts off after lag p
  • ACF decays gradually

| Model | ACF | PACF |
|-------|-----|------|
| AR(1) | Exponential decay | Lag 1 significant, others insignificant |
| AR(2) | Exponential/sinusoidal decay | Lags 1,2 significant, others insignificant |
| AR(p) | Gradual decay | Cuts off after lag p (lags beyond p insignificant) |
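To see the PACF cutoff numerically, the partial autocorrelations can be computed from the sample ACF with the Durbin-Levinson recursion. This is a sketch under stated assumptions: the AR(2) coefficients (0.5, 0.3), the helper names, and the ±1.96/√n significance band are illustrative choices, and a library routine would normally be used instead of a hand-written recursion.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations rho_0 .. rho_max_lag."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    denom = np.dot(y, y)
    return np.array([np.dot(y[k:], y[:len(y) - k]) / denom
                     for k in range(max_lag + 1)])

def pacf_durbin_levinson(rho, max_lag):
    """Partial autocorrelations for lags 0..max_lag from autocorrelations rho."""
    pacf = np.zeros(max_lag + 1)
    pacf[0] = 1.0
    phi_prev = np.zeros(0)              # coefficients of the AR(k-1) fit
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            phi_curr = np.array([phi_kk])
        else:
            num = rho[k] - np.dot(phi_prev, rho[k - 1:0:-1])
            den = 1.0 - np.dot(phi_prev, rho[1:k])
            phi_kk = num / den
            phi_curr = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf[k] = phi_kk
        phi_prev = phi_curr
    return pacf

# Simulate an AR(2) process with assumed coefficients 0.5 and 0.3
rng = np.random.default_rng(1)
n, burn = 5000, 200
eps = rng.normal(size=n + burn)
y = np.zeros(n + burn)
for t in range(2, n + burn):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + eps[t]
y = y[burn:]

max_lag = 6
pacf = pacf_durbin_levinson(sample_acf(y, max_lag), max_lag)
band = 1.96 / np.sqrt(n)                # approximate 95% band under white noise

for k in range(1, max_lag + 1):
    flag = "significant" if abs(pacf[k]) > band else "-"
    print(f"lag {k}: PACF {pacf[k]: .3f}  {flag}")
```

For this process the theoretical PACF is 0.5 / (1 - 0.3) ≈ 0.714 at lag 1, 0.3 at lag 2, and zero afterwards, so the printed values should drop to noise level from lag 3 on.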

MA Model (Moving Average Model)

MA(q) Model Definition

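Mathematical Expression:

$$y_t = c + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q}$$

Where:

  • $\theta_1, \ldots, \theta_q$: Moving average coefficients
  • $\varepsilon_t$: White noise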

MA(1) Model Details

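Invertibility Condition:

$$|\theta_1| < 1$$

Autocorrelation Function (ACF):

$$\rho_1 = \frac{\theta_1}{1 + \theta_1^2}, \qquad \rho_k = 0 \ \text{for } k \geq 2$$

Characteristic: ACF cuts off after lag 1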
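A short simulation illustrates both points: the MA(1) autocorrelation is $\theta_1 / (1 + \theta_1^2)$ at lag 1 and zero beyond, and $\theta_1$ and $1/\theta_1$ imply the same ACF, which is why the invertibility condition is needed to pin down a unique model. The value 0.6 and the helper names are assumptions chosen for illustration.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations rho_0 .. rho_max_lag."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    denom = np.dot(y, y)
    return np.array([np.dot(y[k:], y[:len(y) - k]) / denom
                     for k in range(max_lag + 1)])

def ma1_rho1(theta):
    """Theoretical lag-1 autocorrelation of an MA(1) process."""
    return theta / (1.0 + theta ** 2)

theta = 0.6                              # assumed coefficient, |theta| < 1 (invertible)
n = 5000
rng = np.random.default_rng(2)
eps = rng.normal(size=n + 1)
y = eps[1:] + theta * eps[:-1]           # y_t = eps_t + theta * eps_{t-1}

acf = sample_acf(y, max_lag=4)
band = 1.96 / np.sqrt(n)                 # approximate 95% band under white noise

print(f"theoretical rho_1: {ma1_rho1(theta):.3f}")
for k in range(1, 5):
    flag = "significant" if abs(acf[k]) > band else "-"
    print(f"lag {k}: sample ACF {acf[k]: .3f}  {flag}")

# theta and 1/theta give the same lag-1 autocorrelation
print(np.isclose(ma1_rho1(theta), ma1_rho1(1.0 / theta)))
```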

MA Model Identification

ACF Cutoff Rule:

  • ACF of MA(q) cuts off after lag q
  • PACF decays gradually

| Model | ACF | PACF |
|-------|-----|------|
| MA(1) | Lag 1 significant, others insignificant | Exponential decay |
| MA(2) | Lags 1,2 significant, others insignificant | Exponential/sinusoidal decay |
| MA(q) | Cuts off after lag q (lags beyond q insignificant) | Gradual decay |

ARIMA Model (Integrated ARMA)

ARIMA(p,d,q) Model Definition

Core Idea: Difference the non-stationary series d times, then build ARMA(p,q)

Steps:

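  1. Difference d times: $w_t = (1 - L)^d y_t$
  2. Build ARMA(p,q) on $w_t$

Mathematical Expression:

$$\phi(L)\,(1 - L)^d\, y_t = c + \theta(L)\, \varepsilon_t$$

where $\phi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p$ and $\theta(L) = 1 + \theta_1 L + \cdots + \theta_q L^q$.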
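The differencing step $(1 - L)^d y_t$ corresponds directly to `numpy.diff`. The sketch below uses a small made-up series with a quadratic trend to show first and second differences, and how a cumulative sum undoes one round of differencing (which is what "integrated" refers to).

```python
import numpy as np

# Made-up series with a quadratic trend, for illustration only
y = np.array([10.0, 12.0, 15.0, 19.0, 24.0, 30.0])

w1 = np.diff(y, n=1)    # (1 - L) y_t   = y_t - y_{t-1}
w2 = np.diff(y, n=2)    # (1 - L)^2 y_t = difference of the differences

print(w1)               # [2. 3. 4. 5. 6.]
print(w2)               # [1. 1. 1. 1.]

# Integrating back: cumulative sum of the differences plus the first original value
restored = np.concatenate(([y[0]], y[0] + np.cumsum(w1)))
print(np.array_equal(restored, y))   # True
```

Each round of differencing shortens the series by one observation, so the ARMA part is fitted on n - d points.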

Determining Differencing Order d

Principle:

  • $d = 0$: Series already stationary
  • $d = 1$: Series has unit root (I(1))
  • $d = 2$: Series has two unit roots (rare)

Methods:

  1. ADF test (see Section 7.2)
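  2. Observe ACF after differencing: it should die out quickly rather than stay close to 1
  3. Avoid over-differencing: a lag-1 autocorrelation near -0.5 and a rising variance after differencing are common warning signs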
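The effect of under- and over-differencing can be seen on a simulated random walk, which has exactly one unit root. This is an informal sketch using only NumPy: the variance and lag-1 autocorrelation heuristics shown here complement, and do not replace, a formal unit root test such as `statsmodels.tsa.stattools.adfuller`. The helper name `lag1_autocorr` is an assumption for illustration.

```python
import numpy as np

def lag1_autocorr(x):
    """Sample autocorrelation at lag 1."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.dot(x[1:], x[:-1]) / np.dot(x, x)

rng = np.random.default_rng(3)
y = np.cumsum(rng.normal(size=5000))     # random walk: d = 1 is the right choice

results = {}
for d in range(3):
    w = np.diff(y, n=d) if d > 0 else y
    results[d] = (np.var(w), lag1_autocorr(w))
    print(f"d={d}: variance {results[d][0]:10.2f}   lag-1 autocorrelation {results[d][1]: .3f}")
```

At d = 0 the lag-1 autocorrelation is close to 1 (non-stationary), at d = 1 it is close to 0, and at d = 2 it falls to roughly -0.5 while the variance roughly doubles, which is the signature of one difference too many.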

Box-Jenkins Modeling Process

Complete Workflow

Box-Jenkins Method
├── 1. Identification
│   ├── Plot time series
│   ├── Stationarity test (ADF)
│   ├── Determine differencing order d
│   ├── Plot ACF/PACF
│   └── Preliminarily determine p, q

├── 2. Estimation
│   ├── Maximum likelihood estimation
│   ├── Fit multiple candidate models
│   └── Calculate AIC/BIC

├── 3. Diagnostic Checking
│   ├── Residual white noise test
│   ├── Ljung-Box test
│   ├── Residual normality test
│   └── Residual ACF/PACF

└── 4. Forecasting
    ├── Point forecast
    ├── Forecast interval
    └── Forecast evaluation

SARIMA Model (Seasonal ARIMA)

SARIMA(p,d,q)(P,D,Q)s Definition

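Complete Form:

$$\phi(L)\,\Phi(L^s)\,(1 - L)^d (1 - L^s)^D\, y_t = c + \theta(L)\,\Theta(L^s)\,\varepsilon_t$$

where $\Phi(L^s)$ and $\Theta(L^s)$ are the seasonal AR and MA polynomials of orders $P$ and $Q$.

Parameter Explanation:

  • $(p, d, q)$: Non-seasonal part
  • $(P, D, Q)$: Seasonal part
  • $s$: Seasonal period (monthly=12, quarterly=4)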

SARIMA Identification

Seasonal Characteristics:

  • ACF/PACF significant at seasonal lags (s, 2s, 3s...)
  • May need seasonal differencing (D=1)
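Both points can be illustrated on synthetic monthly data. In the sketch below, the series (linear trend plus a 12-month sine pattern plus noise) and the helper names are assumptions for illustration. The ACF of the first-differenced series peaks at the seasonal lags, and the seasonal difference $(1 - L^{12}) y_t$ removes the repeating pattern.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations rho_0 .. rho_max_lag."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    denom = np.dot(y, y)
    return np.array([np.dot(y[k:], y[:len(y) - k]) / denom
                     for k in range(max_lag + 1)])

def seasonal_diff(x, s):
    """(1 - L^s) x_t = x_t - x_{t-s}."""
    return x[s:] - x[:-s]

s = 12
t = np.arange(10 * s)                          # ten years of hypothetical monthly data
rng = np.random.default_rng(5)
y = 0.5 * t + 10.0 * np.sin(2 * np.pi * t / s) + rng.normal(size=t.size)

# ACF of the first-differenced series: peaks at s and 2s, trough at s/2
acf = sample_acf(np.diff(y), max_lag=2 * s)
for k in (6, 12, 24):
    print(f"lag {k:2d}: ACF {acf[k]: .3f}")

# Seasonal differencing (D = 1) removes the 12-month pattern
w = seasonal_diff(y, s)
print(f"std before: {np.std(y):.2f}   after seasonal differencing: {np.std(w):.2f}")
print(f"mean after seasonal differencing: {np.mean(w):.2f}")   # about 12 * 0.5 = 6, the yearly trend step
```

What remains after the seasonal difference is a constant (the trend increment over one season) plus noise, which a non-seasonal difference or a constant term can then absorb.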

Model Selection: Information Criteria

AIC, BIC, HQIC

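AIC (Akaike Information Criterion):

$$\mathrm{AIC} = -2 \ln \mathcal{L} + 2k$$

BIC (Bayesian Information Criterion):

$$\mathrm{BIC} = -2 \ln \mathcal{L} + k \ln n$$

HQIC (Hannan-Quinn IC):

$$\mathrm{HQIC} = -2 \ln \mathcal{L} + 2k \ln(\ln n)$$

Where:

  • $\mathcal{L}$: Likelihood function (evaluated at its maximum)
  • $k$: Number of parameters
  • $n$: Sample size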

Selection Principle: Lower value is better

Comparison:

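  • AIC: Lighter penalty (2 per parameter), tends to favor more complex models
  • BIC: More severe parameter penalty ($\ln n$ per parameter, which exceeds 2 once $n \geq 8$), favors parsimonious models
  • Practice: Use both together; when they disagree, let residual diagnostics and out-of-sample accuracy decide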
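A small worked example shows how the criteria can disagree. The log-likelihoods and parameter counts below are hypothetical numbers chosen for illustration (they do not come from a real fit), and `information_criteria` is an assumed helper name.

```python
import math

def information_criteria(loglik, k, n):
    """Return (AIC, BIC, HQIC) from a maximized log-likelihood, k parameters, n observations."""
    aic = -2.0 * loglik + 2.0 * k
    bic = -2.0 * loglik + k * math.log(n)
    hqic = -2.0 * loglik + 2.0 * k * math.log(math.log(n))
    return aic, bic, hqic

n = 200
# Hypothetical fits on the same data; k counts the ARMA coefficients plus the noise variance
small = information_criteria(loglik=-285.0, k=2, n=n)   # e.g. an ARIMA(1,1,0)
large = information_criteria(loglik=-281.5, k=5, n=n)   # e.g. an ARIMA(2,1,2)

for name, (aic, bic, hqic) in (("small", small), ("large", large)):
    print(f"{name}: AIC={aic:.2f}  BIC={bic:.2f}  HQIC={hqic:.2f}")
```

Here the larger model improves the log-likelihood by 3.5, which is enough to win on AIC (573 versus 574) but not on BIC (about 589.5 versus 580.6), since each extra parameter costs $\ln 200 \approx 5.3$ under BIC instead of 2.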

Section Summary

ARIMA Model Family

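| Model | Formula | Identification Feature |
|-------|---------|------------------------|
| AR(p) | $y_t = c + \sum_{i=1}^{p} \phi_i y_{t-i} + \varepsilon_t$ | PACF cutoff |
| MA(q) | $y_t = c + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}$ | ACF cutoff |
| ARMA(p,q) | AR + MA | Both decay |
| ARIMA(p,d,q) | Differencing + ARMA | For non-stationary data |
| SARIMA | ARIMA + seasonality | For seasonal data |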

Box-Jenkins Process

  1. Identification: ACF/PACF, stationarity tests
  2. Estimation: Maximum likelihood estimation, information criteria
  3. Diagnosis: Residual tests, Ljung-Box
  4. Forecasting: Point forecast, forecast intervals

Practice Points

  • Always check stationarity
  • Compare multiple candidate models
  • Diagnose residuals (white noise test)
  • Validate using test set
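Test-set validation can be sketched without any modeling library. In the example below, the simulated AR(1) data, the 800/200 split, and the historical-mean benchmark are assumptions for illustration: parameters are estimated on the training part only, then one-step-ahead forecasts on the test part are compared against the benchmark.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Simulated AR(1) series: y_t = 1.0 + 0.7 * y_{t-1} + eps_t
rng = np.random.default_rng(6)
n = 1000
y = np.zeros(n)
for t in range(1, n):
    y[t] = 1.0 + 0.7 * y[t - 1] + rng.normal()

split = 800
train, test = y[:split], y[split:]

# Estimate c and phi by OLS of y_t on (1, y_{t-1}), using the training data only
X = np.column_stack([np.ones(split - 1), train[:-1]])
c_hat, phi_hat = np.linalg.lstsq(X, train[1:], rcond=None)[0]

# One-step-ahead forecasts on the test set with parameters frozen at their training values
previous = y[split - 1:n - 1]                 # the observation just before each test point
pred_ar = c_hat + phi_hat * previous
pred_mean = np.full(test.shape, train.mean()) # benchmark: always predict the training mean

rmse_ar = rmse(test, pred_ar)
rmse_mean = rmse(test, pred_mean)
print(f"estimated c={c_hat:.2f}, phi={phi_hat:.2f}")
print(f"test RMSE  AR(1): {rmse_ar:.3f}   mean benchmark: {rmse_mean:.3f}")
```

A model is only worth keeping if it beats a simple benchmark out of sample; comparing in-sample fit alone rewards overfitting.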

Next Section Preview

In the next section, we will learn how to use event study methods to evaluate policy effects.


Extended Reading

  1. Box, G. E. P., Jenkins, G. M., & Reinsel, G. C. (2015). Time Series Analysis: Forecasting and Control (5th ed.). Wiley.

  2. Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press.

  3. Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice (3rd ed.). OTexts. Chapters 8-9.

Master ARIMA, predict the future!

Released under the MIT License. Content © Author.