Chapter 7 Introduction (Time Series Analysis and Event Study)
Understanding socioeconomic phenomena through the time dimension: Trends, cycles, and causality
Chapter Objectives
Upon completing this chapter, you will be able to:
- Understand the basic characteristics of time series data (trend, seasonality, cycle)
- Master time series decomposition methods and applications
- Use ARIMA models for forecasting
- Conduct stationarity tests (ADF, KPSS)
- Implement event study methodology
- Evaluate policy effects and shock impacts
- Use Python toolkits (statsmodels, pandas)
Why is Time Series Analysis So Important?
The Time Dimension in Social Sciences
Time series data is everywhere:
- Macroeconomics: GDP, inflation, unemployment rate, interest rates
- Financial Economics: Stock prices, exchange rates, futures prices
- Labor Economics: Employment numbers, wage levels, labor participation rates
- Public Policy: Pre-post policy comparison, policy effect evaluation
- Sociology: Crime rates, birth rates, education enrollment rates
Time Series vs Cross-sectional Data
| Characteristic | Cross-sectional Data | Time Series Data |
|---|---|---|
| Observation Objects | Multiple individuals, single time point | Single/multiple individuals, multiple time points |
| Independence Assumption | Usually satisfied (i.i.d.) | Violated (serial correlation) |
| Typical Issues | Individual differences, omitted variables | Trend, seasonality, autocorrelation |
| Analysis Methods | OLS, cross-sectional regression | ARIMA, VAR, cointegration |
| Causal Inference | RCT, IV, RDD | DID, event study, breakpoint |
Classic Case: Box & Jenkins (1970)
Airline Passenger Data: monthly international airline passenger counts from 1949-1960, the canonical dataset of Box & Jenkins. The code below illustrates the same characteristics (trend, seasonality, noise) with another classic series, the atmospheric CO₂ data shipped with statsmodels.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from statsmodels.datasets import co2
# Load classic dataset
data = co2.load_pandas().data
# Visualization
fig, axes = plt.subplots(2, 1, figsize=(14, 8))
# Original series
axes[0].plot(data.index, data.values, linewidth=1.5)
axes[0].set_title('Atmospheric CO₂ Concentration (1958-2001)', fontsize=14, fontweight='bold')
axes[0].set_ylabel('CO₂ (ppm)')
axes[0].grid(True, alpha=0.3)
# Annual average
yearly = data.resample('Y').mean()
axes[1].plot(yearly.index, yearly.values, linewidth=2, marker='o')
axes[1].set_title('Annual Average CO₂ Concentration', fontsize=14, fontweight='bold')
axes[1].set_ylabel('CO₂ (ppm)')
axes[1].set_xlabel('Year')
axes[1].grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Observed Characteristics:
- Trend: Long-term upward trend
- Seasonality: Annual cyclical fluctuations
- Random Variation: Unpredictable short-term fluctuations
Core Concepts of Time Series
1. Stationarity
Definition: The statistical properties of a time series do not change over time
Strict Stationarity: the joint distribution of $(y_{t_1}, \dots, y_{t_k})$ is invariant to shifts in time
Weak Stationarity / Covariance Stationarity:
- $E[y_t] = \mu$ (constant mean)
- $\mathrm{Var}(y_t) = \sigma^2$ (constant variance)
- $\mathrm{Cov}(y_t, y_{t-k}) = \gamma_k$ (autocovariance depends only on the lag $k$)
Why is Stationarity Important?
- Non-stationary series may lead to spurious regression
- Many time series models (like ARIMA) require stationary series
- Forecasting depends on stable statistical properties
Illustration:
np.random.seed(42)
n = 200
# Stationary series: white noise
stationary = np.random.normal(0, 1, n)
# Non-stationary series: random walk
random_walk = np.cumsum(np.random.normal(0, 1, n))
# Non-stationary series: deterministic trend
trend = np.arange(n) * 0.1 + np.random.normal(0, 1, n)
fig, axes = plt.subplots(3, 1, figsize=(14, 10))
axes[0].plot(stationary)
axes[0].set_title('Stationary Series: White Noise', fontsize=14, fontweight='bold')
axes[0].axhline(y=0, color='r', linestyle='--', alpha=0.5)
axes[0].set_ylabel('Value')
axes[0].grid(True, alpha=0.3)
axes[1].plot(random_walk, color='orange')
axes[1].set_title('Non-stationary Series: Random Walk (Unit Root)', fontsize=14, fontweight='bold')
axes[1].set_ylabel('Value')
axes[1].grid(True, alpha=0.3)
axes[2].plot(trend, color='green')
axes[2].set_title('Non-stationary Series: Deterministic Trend', fontsize=14, fontweight='bold')
axes[2].set_xlabel('Time')
axes[2].set_ylabel('Value')
axes[2].grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
2. Autocorrelation
Definition: Correlation between a time series and its lagged values
Autocorrelation Function (ACF):
$\rho_k = \dfrac{\mathrm{Cov}(y_t, y_{t-k})}{\mathrm{Var}(y_t)} = \dfrac{\gamma_k}{\gamma_0}$
Partial Autocorrelation Function (PACF):
- Correlation between $y_t$ and $y_{t-k}$ after controlling for the intermediate lags $y_{t-1}, \dots, y_{t-k+1}$
- Used to identify AR order
Example:
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
# Generate AR(1) process
np.random.seed(42)
phi = 0.8
ar1 = [0]
for i in range(1, 200):
    ar1.append(phi * ar1[-1] + np.random.normal(0, 1))
fig, axes = plt.subplots(1, 3, figsize=(16, 5))
# Time series plot
axes[0].plot(ar1)
axes[0].set_title(r'AR(1) Series: $y_t = 0.8 y_{t-1} + \epsilon_t$', fontsize=14, fontweight='bold')
axes[0].grid(True, alpha=0.3)
# ACF
plot_acf(ar1, lags=20, ax=axes[1], alpha=0.05)
axes[1].set_title('Autocorrelation Function (ACF)', fontsize=14, fontweight='bold')
# PACF
plot_pacf(ar1, lags=20, ax=axes[2], alpha=0.05)
axes[2].set_title('Partial Autocorrelation Function (PACF)', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()
Interpretation:
- ACF: Exponential decay (AR process characteristic)
- PACF: Lag 1 significant, others insignificant (identified as AR(1))
3. Time Series Model Family
Time Series Models
├── Classical Decomposition Models
│ ├── Additive Model: Y = T + S + R
│ └── Multiplicative Model: Y = T × S × R
│
├── Exponential Smoothing
│ ├── Simple Exponential Smoothing (SES)
│ ├── Holt Linear Trend
│ └── Holt-Winters Seasonal
│
├── ARIMA Family
│ ├── AR (Autoregressive)
│ ├── MA (Moving Average)
│ ├── ARMA
│ ├── ARIMA (with differencing)
│ └── SARIMA (seasonal)
│
├── Vector Models
│ ├── VAR (Vector Autoregression)
│ ├── VECM (Vector Error Correction Model)
│ └── Structural VAR (SVAR)
│
└── Advanced Models
├── GARCH (Volatility Models)
├── State Space Models
    └── Prophet (Facebook)
Event Study Methodology
Definition and Applications
Event Study: Evaluating the impact of a specific event on a variable
Classic Application Scenarios:
Financial Markets:
- Impact of mergers and acquisitions on stock prices
- Impact of earnings announcements on stock returns
- Impact of regulatory policies on market volatility
Public Policy:
- Impact of minimum wage laws on employment
- Impact of environmental regulations on pollution
- Impact of education reforms on test scores
Natural Disasters:
- Impact of earthquakes on economic activity
- Impact of epidemics on consumer behavior
Basic Framework of Event Study
Timeline
├── Estimation Window
│ └── Establish "normal" benchmark
│
├── Event Window
│ ├── Pre-event
│ ├── Event Day
│ └── Post-event
│
└── Post-event Window
    └── Long-term effect evaluation
Core Metrics:
- Abnormal Return (AR): $AR_{it} = R_{it} - E[R_{it} \mid X_t]$ (actual return minus the normal return predicted by the benchmark model)
- Cumulative Abnormal Return (CAR): $CAR_i(t_1, t_2) = \sum_{t=t_1}^{t_2} AR_{it}$
Illustration:
# Simulate event study
np.random.seed(123)
n = 200
event_day = 100
# Normal period returns
normal_returns = np.random.normal(0.001, 0.02, n)
# Add abnormal returns from event day onwards
abnormal_effect = 0.05
normal_returns[event_day:] += abnormal_effect
# Calculate cumulative returns
cumulative_returns = np.cumsum(normal_returns)
fig, axes = plt.subplots(2, 1, figsize=(14, 8))
# Daily returns
axes[0].plot(normal_returns, alpha=0.7)
axes[0].axvline(x=event_day, color='r', linestyle='--', linewidth=2, label='Event Day')
axes[0].axhline(y=0, color='black', linestyle='-', alpha=0.3)
axes[0].set_title('Daily Returns', fontsize=14, fontweight='bold')
axes[0].set_ylabel('Returns')
axes[0].legend()
axes[0].grid(True, alpha=0.3)
# Cumulative returns
axes[1].plot(cumulative_returns, linewidth=2)
axes[1].axvline(x=event_day, color='r', linestyle='--', linewidth=2, label='Event Day')
axes[1].set_title('Cumulative Returns', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Time (Days)')
axes[1].set_ylabel('Cumulative Returns')
axes[1].legend()
axes[1].grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Chapter Structure
Section 1: Time Series Basics
- Characteristics of time series data
- Time series processing in Python (pandas)
- Data stationarity testing (ADF, KPSS, PP)
- Differencing and transformation
- Case: Stationarity analysis of China's GDP growth rate
Section 2: Time Series Decomposition
- Classical decomposition methods (additive/multiplicative models)
- STL decomposition (Seasonal-Trend Decomposition using Loess)
- Seasonal adjustment (X-13ARIMA-SEATS)
- Trend extraction and filtering (HP filter, BK filter)
- Case: Seasonal adjustment of Consumer Price Index (CPI)
Section 3: ARIMA Models
- AR, MA, ARMA models
- Unit root and differencing
- ARIMA model identification, estimation, diagnosis
- Model selection (AIC, BIC)
- Forecasting and forecast intervals
- SARIMA (Seasonal ARIMA)
- Case: Unemployment rate forecasting
Section 4: Event Study Methodology
- Basic framework of event studies
- Normal return models (market model, mean-adjusted model)
- Calculation of abnormal returns
- Significance testing (t-test, rank test)
- Multiple event studies
- Case: Impact of merger announcements on stock prices
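The section's core computation can be sketched as follows: estimate a market model on the estimation window, then compute abnormal returns and their cumulative sum (CAR) over the event window. All data are simulated; a +2%/day abnormal effect is injected for five days, so the true CAR is about 0.10.

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated daily returns: 250-day estimation window + 5-day event window
n_est, n_event = 250, 5
mkt = rng.normal(0.0005, 0.01, n_est + n_event)
alpha, beta = 0.0002, 1.2
stock = alpha + beta * mkt + rng.normal(0, 0.005, n_est + n_event)
stock[n_est:] += 0.02  # inject a +2%/day abnormal effect in the event window

# Market model R_it = alpha + beta * R_mt, estimated on the estimation window only
b, a = np.polyfit(mkt[:n_est], stock[:n_est], 1)  # slope (beta), intercept (alpha)

# Abnormal return = actual - predicted normal return; CAR = their sum
ar = stock[n_est:] - (a + b * mkt[n_est:])
car = ar.sum()
print(f"CAR over event window = {car:.4f}")  # close to the injected 0.10
```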
Section 5: Summary and Review
- Knowledge system summary
- 10 high-difficulty programming exercises
- Classic literature recommendations
- Learning path guide
Python Toolkits
Core Libraries
| Library | Main Functions | Installation |
|---|---|---|
| pandas | Time series data processing | pip install pandas |
| statsmodels | ARIMA, VAR, cointegration | pip install statsmodels |
| scipy | Statistical testing | pip install scipy |
| matplotlib | Visualization | pip install matplotlib |
| seaborn | Advanced visualization | pip install seaborn |
Specialized Libraries
| Library | Main Functions | Installation |
|---|---|---|
| arch | GARCH models | pip install arch |
| pmdarima | Auto ARIMA | pip install pmdarima |
| prophet | Facebook forecasting tool | pip install prophet |
| ruptures | Breakpoint detection | pip install ruptures |
Basic Setup
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller, kpss, acf, pacf
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
# Chinese font settings
plt.rcParams['font.sans-serif'] = ['Arial Unicode MS'] # macOS
# plt.rcParams['font.sans-serif'] = ['SimHei'] # Windows
plt.rcParams['axes.unicode_minus'] = False
# Set style
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['figure.dpi'] = 100
# pandas display settings
pd.set_option('display.max_rows', 100)
pd.set_option('display.max_columns', 20)
pd.set_option('display.width', 1000)
Classic Literature
Must-Read Papers
Box, G. E. P., & Jenkins, G. M. (1970). Time Series Analysis: Forecasting and Control
- Foundational work on time series analysis
- Systematic introduction to ARIMA models
Dickey, D. A., & Fuller, W. A. (1979). "Distribution of the Estimators for Autoregressive Time Series with a Unit Root"
- ADF unit root test
- Standard method for stationarity testing
Fama, E. F., Fisher, L., Jensen, M. C., & Roll, R. (1969). "The Adjustment of Stock Prices to New Information"
- Seminal work on event study methodology
- Empirical support for efficient market hypothesis
Engle, R. F., & Granger, C. W. J. (1987). "Co-integration and Error Correction"
- Cointegration theory (Nobel Prize in Economics)
- Long-run equilibrium relationships
Applied Literature
Card, D., & Krueger, A. B. (1994). "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania"
- Difference-in-differences (DID) method
- Natural experiment design
Bertrand, M., Duflo, E., & Mullainathan, S. (2004). "How Much Should We Trust Differences-In-Differences Estimates?"
- Serial correlation issues in DID
- Clustered standard errors
Learning Path Recommendations
Beginner (1-2 weeks)
Goal: Master basic time series data processing
Learning Content:
- Section 1: Stationarity concepts, ADF test
- First half of Section 2: Time series decomposition
- Practice: Using real data (GDP, CPI)
Recommended Resources:
- Wooldridge (2020): Chapter 10 "Basic Regression Analysis with Time Series Data"
Intermediate Learning (3-4 weeks)
Goal: Independently build ARIMA models
Learning Content:
- Second half of Section 2: STL decomposition
- Section 3: Complete ARIMA modeling process
- Practice: Forecasting unemployment rate, inflation rate
Recommended Resources:
- Stock & Watson (2020): Chapter 14 "Introduction to Time Series Regression"
Advanced Application (5-6 weeks)
Goal: Implement event studies and VAR analysis
Learning Content:
- Section 4: Event study methodology
- Practice: Policy effect evaluation
Recommended Resources:
- Hamilton (1994): Time Series Analysis (Classic textbook)
- Lütkepohl (2005): New Introduction to Multiple Time Series Analysis
Practical Recommendations
Common Pitfalls in Time Series Analysis
Spurious Regression
- Problem: Two non-stationary series may show high correlation but are actually unrelated
- Solution: Test for stationarity, use differencing or cointegration
Over-differencing
- Problem: Unnecessary differencing introduces MA components
- Solution: Use joint ADF and KPSS testing
Ignoring Structural Breaks
- Problem: Policy changes, crises lead to parameter changes
- Solution: Chow Test, Bai-Perron breakpoint testing
Small Sample Issues
- Problem: Time series typically have small sample sizes
- Solution: Interpret cautiously, use robust standard errors
Data Quality Checklist
- [ ] Check for missing values (interpolation vs deletion)
- [ ] Identify outliers (extreme events vs measurement errors)
- [ ] Confirm time unit and frequency (daily, monthly, quarterly)
- [ ] Check for seasonality (holiday effects)
- [ ] Plot time series (trend, breakpoints)
- [ ] Calculate basic statistics (mean, variance, skewness, kurtosis)
Get Started
Ready to explore the mysteries of the time dimension?
Let's begin with Section 1: Time Series Basics!
Remember:
"The best way to predict the future is to understand the past."
Time Series Analysis: Understanding the past, predicting the future, evaluating causality!