7.2 Time Series Basics
"The only relevant test of the validity of a hypothesis is comparison of prediction with experience."— Milton Friedman, 1976 Nobel Laureate in Economics
Understanding the properties of time series data and stationarity testing
Section Objectives
Upon completing this section, you will be able to:
- Understand the core characteristics of time series data
- Process time series data using pandas
- Master the concept of stationarity and its importance
- Implement ADF, KPSS, and PP tests
- Apply differencing and transformation techniques
- Analyze the stationarity of real economic data
Characteristics of Time Series Data
Definition of Time Series
A time series is a set of observations arranged in chronological order:
$$\{y_t\}_{t=1}^{T} = \{y_1, y_2, \ldots, y_T\}$$
Core Characteristics of Time Series
| Characteristic | Description | Example |
|---|---|---|
| Time Dependence | Observations are correlated | Today's stock price depends on yesterday's |
| Trend | Long-term upward or downward movement | GDP's long-term growth trend |
| Seasonality | Fixed periodic fluctuations | Quarterly cycle in retail sales |
| Cyclicality | Non-fixed periodic fluctuations | Business cycles (4-7 years) |
| Randomness | Unpredictable fluctuations | White noise error term |
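As a rough illustration of the table above, the sketch below builds a synthetic monthly series from a trend, a fixed 12-month seasonal pattern, and random noise. The slope, amplitude, and noise scale are arbitrary values chosen for the example, not estimates from any real data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 120                      # ten years of monthly observations (assumed)
t = np.arange(n)

trend = 0.5 * t                               # long-term upward movement
seasonal = 10 * np.sin(2 * np.pi * t / 12)    # fixed 12-month pattern
noise = rng.normal(scale=2.0, size=n)         # unpredictable component

y = 100 + trend + seasonal + noise

# The seasonal term repeats every 12 steps and cancels out over a full year
print(np.allclose(seasonal[:-12], seasonal[12:]))
print(abs(seasonal[:12].sum()) < 1e-9)
```

Cyclicality is left out on purpose: because its period is not fixed, it cannot be written as a simple repeating term like `seasonal`.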
Time Series Processing in Python
Core pandas Time Series Functions
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Create time series data (month-end dates; pandas 2.2+ prefers the alias 'ME' over 'M')
dates = pd.date_range(start='2010-01-01', end='2023-12-31', freq='M')
np.random.seed(42)
values = 100 + np.cumsum(np.random.randn(len(dates)) * 2)
ts = pd.Series(values, index=dates)
print(ts.head())
print(f"\nData type: {type(ts.index)}")
print(f"Frequency: {ts.index.freq}")
print(f"Time range: {ts.index.min()} to {ts.index.max()}")
Time Series Slicing and Indexing
# Select by year
ts_2015 = ts['2015']
# Select by date range
ts_range = ts['2015-01':'2016-12']
# Filter by condition
ts_high = ts[ts > 110]
# Access specific date
value_jan_2015 = ts['2015-01-31']
print(f"2015 average: {ts_2015.mean():.2f}")
print(f"2015-2016 data points: {len(ts_range)}")
Resampling
# Downsample: monthly → quarterly
ts_quarterly = ts.resample('Q').mean()
# Upsample: monthly → daily (forward fill)
ts_daily = ts.resample('D').ffill()
# Downsample: different aggregation functions
ts_q_stats = pd.DataFrame({
'mean': ts.resample('Q').mean(),
'std': ts.resample('Q').std(),
'min': ts.resample('Q').min(),
'max': ts.resample('Q').max()
})
print("Quarterly statistics:")
print(ts_q_stats.head())
Rolling Windows
# Moving average
ts_ma_12 = ts.rolling(window=12).mean()
# Rolling standard deviation
ts_std_12 = ts.rolling(window=12).std()
# Visualization
fig, ax = plt.subplots(figsize=(14, 6))
ax.plot(ts, label='Original Data', alpha=0.6)
ax.plot(ts_ma_12, label='12-Month Moving Average', linewidth=2, color='red')
ax.fill_between(ts.index,
ts_ma_12 - 2*ts_std_12,
ts_ma_12 + 2*ts_std_12,
alpha=0.2, color='red', label='±2σ Interval')
ax.set_title('Time Series with Moving Average', fontsize=14, fontweight='bold')
ax.set_xlabel('Time')
ax.set_ylabel('Value')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Lags and Differences
# Lag operator
ts_lag1 = ts.shift(1)
ts_lag12 = ts.shift(12)
# First-order difference
ts_diff1 = ts.diff(1)
# Seasonal difference
ts_diff12 = ts.diff(12)
# Log difference (approximate growth rate)
ts_log = np.log(ts)
ts_growth = ts_log.diff(1) * 100 # percentage
print(f"Average monthly growth rate: {ts_growth.mean():.3f}%")
print(f"Growth rate standard deviation: {ts_growth.std():.3f}%")
Stationarity
Why is Stationarity Important?
Foundation for Statistical Inference:
- Non-stationary series: Statistical properties change over time, sample statistics unreliable
- Stationary series: Statistical properties constant, sample mean/variance are consistent estimators of population parameters
Spurious Regression:
- Regressing two independent non-stationary series may yield significant but meaningless results
- Granger & Newbold (1974) documented this with simulations; Yule (1926) had earlier called the phenomenon "nonsense correlations"
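The spurious regression problem can be reproduced with a small Monte Carlo sketch. It regresses one simulated series on another, independent one and counts how often the slope looks "significant" at the usual 1.96 threshold. The sample size, number of replications, and helper names (`slope_t_stat`, `rejection_rate`) are choices made for this illustration.

```python
import numpy as np

def slope_t_stat(x, y):
    """t-statistic of the slope in an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

def rejection_rate(make_series, n=200, reps=500, seed=0):
    """Share of replications in which two independent series look related."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        x, y = make_series(rng, n), make_series(rng, n)
        hits += int(abs(slope_t_stat(x, y)) > 1.96)
    return hits / reps

def white_noise(rng, n):
    return rng.normal(size=n)

def random_walk(rng, n):
    return np.cumsum(rng.normal(size=n))

rate_wn = rejection_rate(white_noise)
rate_rw = rejection_rate(random_walk)
print(f"Independent white noise: {rate_wn:.1%} 'significant'")
print(f"Independent random walks: {rate_rw:.1%} 'significant'")
```

For stationary inputs the rate should sit near the nominal 5%, while for independent random walks it is typically far higher, which is why the t-statistic cannot be trusted in that setting.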
Mathematical Definition of Stationarity
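Strictly Stationary:
$$F(y_{t_1}, \ldots, y_{t_k}) = F(y_{t_1+h}, \ldots, y_{t_k+h})$$
holds for all $k$, all time points $t_1, \ldots, t_k$, and all shifts $h$, where $F$ denotes the joint distribution function.
Weakly Stationary/Covariance Stationary:
- Constant expectation: $E[y_t] = \mu$ (constant)
- Constant variance: $\mathrm{Var}(y_t) = \sigma^2$ (constant)
- Autocovariance depends only on lag: $\mathrm{Cov}(y_t, y_{t-k}) = \gamma_k$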
In practical applications, we typically focus on weak stationarity.
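One way to see the constant-variance condition is to simulate many independent paths and measure the variance across paths at different dates. The sketch below does this for an AR(1) process and a random walk; the number of paths, the path length, and the choice to start the AR(1) from its stationary distribution are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
reps, n, rho = 5000, 200, 0.7
shocks = rng.normal(size=(reps, n))

# Random walk: y_t = y_{t-1} + e_t, so Var(y_t) = t * sigma^2 grows with t
rw = np.cumsum(shocks, axis=1)

# AR(1): y_t = rho * y_{t-1} + e_t, started from its stationary distribution
ar = np.empty((reps, n))
ar[:, 0] = shocks[:, 0] / np.sqrt(1 - rho**2)
for t in range(1, n):
    ar[:, t] = rho * ar[:, t - 1] + shocks[:, t]

for t in (49, 199):
    print(f"t={t + 1}: Var AR(1) = {ar[:, t].var():.2f}, "
          f"Var random walk = {rw[:, t].var():.1f}")
print(f"Theoretical AR(1) variance: {1 / (1 - rho**2):.2f}")
```

The AR(1) variance should stay close to $\sigma^2 / (1 - \rho^2)$ at every date, whereas the random walk variance keeps growing in proportion to $t$.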
Visual Judgment of Stationarity
# Generate stationary and non-stationary series
np.random.seed(123)
n = 300
# Stationary series: AR(1) with |ρ| < 1
rho = 0.7
stationary = np.zeros(n)
stationary[0] = np.random.randn()
for t in range(1, n):
    stationary[t] = rho * stationary[t-1] + np.random.randn()
# Non-stationary series: random walk (unit root)
random_walk = np.cumsum(np.random.randn(n))
# Non-stationary series: deterministic trend
trend = np.arange(n) * 0.05 + np.random.randn(n) * 2
# Visualization
from statsmodels.graphics.tsaplots import plot_acf
fig, axes = plt.subplots(3, 2, figsize=(14, 10))
for i, (data, title) in enumerate([(stationary, 'Stationary: AR(1), ρ=0.7'),
                                   (random_walk, 'Non-stationary: Random Walk'),
                                   (trend, 'Non-stationary: Deterministic Trend')]):
    # Time series plot (left column)
    axes[i, 0].plot(data, linewidth=1.5)
    axes[i, 0].set_title(title, fontsize=12, fontweight='bold')
    axes[i, 0].set_ylabel('Value')
    axes[i, 0].grid(True, alpha=0.3)
    # ACF plot (right column)
    plot_acf(data, lags=40, ax=axes[i, 1], alpha=0.05)
    axes[i, 1].set_title(f'ACF: {title.split(":")[1].strip()}', fontsize=12)
axes[2, 0].set_xlabel('Time')
axes[2, 1].set_xlabel('Lag')
plt.tight_layout()
plt.show()
Key Observations:
- ACF of stationary series decays rapidly
- ACF of random walk decays slowly (close to 1)
- ACF of trend series also decays slowly
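The decay patterns listed above can also be checked numerically. The following sketch computes the sample autocorrelation function directly with NumPy for a simulated AR(1) series and a random walk; `sample_acf` is a helper written for this example and uses the standard biased estimator (dividing by the lag-0 sum of squares).

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = d @ d
    return np.array([1.0] + [(d[k:] @ d[:-k]) / denom
                             for k in range(1, max_lag + 1)])

rng = np.random.default_rng(2)
n, rho = 2000, 0.7

# Stationary AR(1): the theoretical ACF at lag k is rho**k
e = rng.normal(size=n)
ar1 = np.empty(n)
ar1[0] = e[0]
for t in range(1, n):
    ar1[t] = rho * ar1[t - 1] + e[t]

# Random walk: the sample ACF stays close to 1 for many lags
rw = np.cumsum(rng.normal(size=n))

acf_ar = sample_acf(ar1, 10)
acf_rw = sample_acf(rw, 10)
print("AR(1) lags 1, 5, 10:      ", np.round(acf_ar[[1, 5, 10]], 3))
print("Random walk lags 1, 5, 10:", np.round(acf_rw[[1, 5, 10]], 3))
```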
Stationarity Tests
1. ADF Test (Augmented Dickey-Fuller Test)
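Null Hypothesis: $H_0$: Series has a unit root (non-stationary)
Alternative Hypothesis: $H_1$: Series has no unit root (stationary)
Test Equation:
$$\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \sum_{i=1}^{p} \delta_i \Delta y_{t-i} + \varepsilon_t$$
Test $H_0: \gamma = 0$ (unit root) vs $H_1: \gamma < 0$ (stationary). In the AR(1) case $\gamma = \rho - 1$; the trend term $\beta t$ is optional, and the lagged differences absorb serial correlation in the errors.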
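To make the test regression concrete, here is a minimal sketch of the simplest Dickey-Fuller case: regress $\Delta y_t$ on a constant and $y_{t-1}$ (no trend, no lagged differences) and take the t-statistic on the $y_{t-1}$ coefficient. The helper names are made up for this example, and -2.86 is used as an approximate large-sample 5% critical value for the constant-only case. The statistic must be compared with Dickey-Fuller critical values rather than normal ones, which is what the Monte Carlo part illustrates.

```python
import numpy as np

def df_t_stat(y):
    """t-statistic on gamma in: dy_t = alpha + gamma * y_{t-1} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])
    se_gamma = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se_gamma

def simulate_ar1(rng, n, rho):
    """AR(1) path started at zero; rho = 1 gives a random walk."""
    y = np.zeros(n)
    e = rng.normal(size=n)
    for t in range(1, n):
        y[t] = rho * y[t - 1] + e[t]
    return y

rng = np.random.default_rng(3)
crit_5pct = -2.86          # approximate 5% Dickey-Fuller critical value (assumed)
reps, n = 300, 300

reject_rw = np.mean([df_t_stat(simulate_ar1(rng, n, 1.0)) < crit_5pct
                     for _ in range(reps)])
reject_ar = np.mean([df_t_stat(simulate_ar1(rng, n, 0.7)) < crit_5pct
                     for _ in range(reps)])
print(f"Rejection rate, random walk (H0 true): {reject_rw:.1%}")
print(f"Rejection rate, AR(1) rho=0.7 (H0 false): {reject_ar:.1%}")
```

The augmented version adds lagged differences to the regression; a library routine such as `adfuller` handles that lag selection and the p-value calculation.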
from statsmodels.tsa.stattools import adfuller
def adf_test(series, name=''):
    """Perform ADF test and print results"""
    result = adfuller(series, autolag='AIC')
    print(f'\n{"="*60}')
    print(f'ADF Test Results: {name}')
    print(f'{"="*60}')
    print(f'ADF Statistic: {result[0]:.4f}')
    print(f'p-value: {result[1]:.4f}')
    print(f'Lags Used: {result[2]}')
    print(f'Number of Observations: {result[3]}')
    print('Critical Values:')
    for key, value in result[4].items():
        print(f' {key}: {value:.4f}')
    if result[1] <= 0.05:
        print(f"\nConclusion: Reject null hypothesis (p={result[1]:.4f}) → Series is stationary ✓")
    else:
        print(f"\nConclusion: Cannot reject null hypothesis (p={result[1]:.4f}) → Series is non-stationary ✗")
    return result
# Test stationary and non-stationary series
adf_test(stationary, 'AR(1) Series')
adf_test(random_walk, 'Random Walk')
adf_test(trend, 'Trend Series')
2. KPSS Test (Kwiatkowski-Phillips-Schmidt-Shin Test)
Null Hypothesis: $H_0$: Series is stationary
Alternative Hypothesis: $H_1$: Series is non-stationary (has a unit root)
⚠️ Note: KPSS has the opposite null hypothesis from ADF!
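Because the null hypothesis is reversed, the KPSS statistic is built differently from the ADF one: it measures how far the partial sums of the demeaned series wander, scaled by a long-run variance estimate. The sketch below is a simplified level-stationarity version. The Bartlett-weighted variance with a fixed truncation of 5 lags is an assumption made here (library implementations choose the lag length by their own rules), and 0.463 is the tabulated 5% critical value for the constant-only case.

```python
import numpy as np

def kpss_level_stat(y, lags):
    """KPSS statistic for level stationarity with a Bartlett-kernel variance."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    e = y - y.mean()              # residuals from a regression on a constant
    s = np.cumsum(e)              # partial sums of the residuals
    lrv = e @ e / n               # long-run variance: lag-0 term
    for k in range(1, lags + 1):
        weight = 1 - k / (lags + 1)
        lrv += 2 * weight * (e[k:] @ e[:-k]) / n
    return (s @ s) / (n**2 * lrv)

rng = np.random.default_rng(5)
crit_5pct = 0.463
reps, n, lags = 300, 300, 5

reject_wn = np.mean([kpss_level_stat(rng.normal(size=n), lags) > crit_5pct
                     for _ in range(reps)])
reject_rw = np.mean([kpss_level_stat(np.cumsum(rng.normal(size=n)), lags) > crit_5pct
                     for _ in range(reps)])
print(f"Rejection rate, white noise (H0 true): {reject_wn:.1%}")
print(f"Rejection rate, random walk (H0 false): {reject_rw:.1%}")
```

Here a large statistic is evidence against stationarity, the mirror image of the ADF test, where a very negative statistic is evidence against a unit root.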
from statsmodels.tsa.stattools import kpss
def kpss_test(series, name='', regression='c'):
    """
    Perform KPSS test
    regression: 'c' (constant), 'ct' (constant+trend)
    """
    result = kpss(series, regression=regression, nlags='auto')
    print(f'\n{"="*60}')
    print(f'KPSS Test Results: {name}')
    print(f'{"="*60}')
    print(f'KPSS Statistic: {result[0]:.4f}')
    print(f'p-value: {result[1]:.4f}')
    print(f'Lags Used: {result[2]}')
    print('Critical Values:')
    for key, value in result[3].items():
        print(f' {key}: {value:.4f}')
    if result[1] >= 0.05:
        print(f"\nConclusion: Cannot reject null hypothesis (p={result[1]:.4f}) → Series is stationary ✓")
    else:
        print(f"\nConclusion: Reject null hypothesis (p={result[1]:.4f}) → Series is non-stationary ✗")
    return result
# Test
kpss_test(stationary, 'AR(1) Series')
kpss_test(random_walk, 'Random Walk')
kpss_test(trend, 'Trend Series', regression='ct')
3. PP Test (Phillips-Perron Test)
Principle: Similar to ADF, but uses non-parametric methods to handle serial correlation and heteroskedasticity
# statsmodels.tsa.stattools has no Phillips-Perron function;
# this version uses the third-party arch package instead
try:
    from arch.unitroot import PhillipsPerron
except ImportError:
    PhillipsPerron = None
def pp_test_custom(series, name=''):
    """Perform PP test (requires the arch package)"""
    if PhillipsPerron is None:
        print("\n⚠️ The arch package is not installed")
        print("pip install arch")
        return None
    result = PhillipsPerron(series)
    print(f'\n{"="*60}')
    print(f'PP Test Results: {name}')
    print(f'{"="*60}')
    print(f'PP Statistic: {result.stat:.4f}')
    print(f'p-value: {result.pvalue:.4f}')
    print('Critical Values:')
    for key, value in result.critical_values.items():
        print(f' {key}: {value:.4f}')
    if result.pvalue <= 0.05:
        print(f"\nConclusion: Reject null hypothesis (p={result.pvalue:.4f}) → Series is stationary ✓")
    else:
        print(f"\nConclusion: Cannot reject null hypothesis (p={result.pvalue:.4f}) → Series is non-stationary ✗")
    return result
pp_test_custom(stationary, 'AR(1) Series')
pp_test_custom(random_walk, 'Random Walk')
Comparison of Test Methods
| Test | Null Hypothesis | Advantage | Use Case |
|---|---|---|---|
| ADF | Non-stationary | Most common, robust | General testing |
| KPSS | Stationary | Complements ADF | Use jointly with ADF |
| PP | Non-stationary | Handles heteroskedasticity | High-frequency financial data |
Best Practice: Use both ADF and KPSS
| ADF | KPSS | Conclusion |
|---|---|---|
| Reject H0 (stationary) | Do not reject H0 (stationary) | ✓ Confirmed stationary |
| Do not reject H0 (non-stationary) | Reject H0 (non-stationary) | ✗ Confirmed non-stationary |
| Reject H0 | Reject H0 | ⚠️ Conflicting results, check data |
| Do not reject H0 | Do not reject H0 | ⚠️ Uncertain, needs further analysis |
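The decision table above can be encoded as a small helper so that the same rule is applied consistently across series. This is only a sketch: the function name `joint_verdict`, the labels it returns, and the 5% default threshold are choices made for this example.

```python
def joint_verdict(adf_pvalue, kpss_pvalue, alpha=0.05):
    """Combine ADF and KPSS p-values following the four-row decision table."""
    adf_rejects = adf_pvalue < alpha      # ADF rejects "has a unit root"
    kpss_rejects = kpss_pvalue < alpha    # KPSS rejects "is stationary"
    if adf_rejects and not kpss_rejects:
        return "stationary"
    if not adf_rejects and kpss_rejects:
        return "non-stationary"
    if adf_rejects and kpss_rejects:
        return "conflicting"
    return "uncertain"

print(joint_verdict(0.01, 0.10))   # ADF rejects, KPSS does not
print(joint_verdict(0.60, 0.01))   # KPSS rejects, ADF does not
```

The p-values would come from `adfuller(...)[1]` and `kpss(...)[1]`, as in the functions defined earlier in this section.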
Differencing and Transformation
First-Order Differencing
# Apply first-order differencing to random walk
rw_diff = pd.Series(random_walk).diff().dropna()
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
axes[0].plot(random_walk)
axes[0].set_title('Original Series: Random Walk (Non-stationary)', fontsize=12, fontweight='bold')
axes[0].set_ylabel('Value')
axes[0].grid(True, alpha=0.3)
axes[1].plot(rw_diff)
axes[1].set_title('After First-Order Differencing (Stationary)', fontsize=12, fontweight='bold')
axes[1].set_ylabel('Δy')
axes[1].axhline(y=0, color='r', linestyle='--', alpha=0.5)
axes[1].grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Test stationarity after differencing
adf_test(rw_diff, 'Random Walk (After First-Order Differencing)')
Seasonal Differencing
$$\Delta_s y_t = y_t - y_{t-s}$$
where $s$ is the seasonal period (e.g., monthly data $s = 12$, quarterly data $s = 4$)
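A noise-free toy series makes the effect of the seasonal difference easy to verify by hand. In the sketch below (all numbers are illustrative), a period-12 sine wave cancels exactly, and a linear trend with slope 0.1 is reduced to the constant 0.1 × 12 = 1.2.

```python
import numpy as np
import pandas as pd

s = 12                                   # seasonal period for monthly data
t = np.arange(60)
seasonal = 5 * np.sin(2 * np.pi * t / s)
y = pd.Series(10 + 0.1 * t + seasonal)   # level + linear trend + seasonality

manual = y - y.shift(s)                  # y_t - y_{t-s} written out
builtin = y.diff(s)                      # pandas shortcut for the same operation

print(np.allclose(manual.dropna(), builtin.dropna()))
print(builtin.dropna().round(6).unique())   # only the constant 0.1 * s remains
print(builtin.isna().sum())                 # the first s observations are lost
```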
# Generate seasonal data
dates = pd.date_range('2010-01', periods=120, freq='M')
seasonal = 10 + 5*np.sin(2*np.pi*np.arange(120)/12) + np.random.randn(120)
trend_seasonal = seasonal + np.arange(120) * 0.1
ts_seasonal = pd.Series(trend_seasonal, index=dates)
# Apply differencing
ts_diff1 = ts_seasonal.diff(1) # First-order difference (detrend)
ts_diff12 = ts_seasonal.diff(12) # Seasonal difference (deseasonalize)
ts_diff1_12 = ts_seasonal.diff(1).diff(12) # Combined differencing
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
axes[0, 0].plot(ts_seasonal)
axes[0, 0].set_title('Original Series (Trend + Seasonality)', fontsize=12, fontweight='bold')
axes[0, 1].plot(ts_diff1)
axes[0, 1].set_title('First-Order Difference (Detrended)', fontsize=12, fontweight='bold')
axes[0, 1].axhline(y=0, color='r', linestyle='--', alpha=0.5)
axes[1, 0].plot(ts_diff12)
axes[1, 0].set_title('Seasonal Difference (Deseasonalized)', fontsize=12, fontweight='bold')
axes[1, 0].axhline(y=0, color='r', linestyle='--', alpha=0.5)
axes[1, 1].plot(ts_diff1_12.dropna())
axes[1, 1].set_title('Combined Differencing (Detrended + Deseasonalized)', fontsize=12, fontweight='bold')
axes[1, 1].axhline(y=0, color='r', linestyle='--', alpha=0.5)
for ax in axes.flat:
    ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Log Transformation
Uses:
- Stabilize variance (log variance approximately constant)
- Convert multiplicative model to additive model
- Approximate growth rate calculation
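The growth rate approximation in the last point rests on $\log(1 + g) \approx g$ for small $g$. The sketch below compares the exact simple growth rate with the log difference for a series that grows by exactly 2% per period, and then shows how the gap widens for a large change; the numbers are chosen purely for illustration.

```python
import numpy as np

levels = 100 * 1.02 ** np.arange(6)          # grows exactly 2% per period

simple = levels[1:] / levels[:-1] - 1        # exact growth rate
log_diff = np.diff(np.log(levels))           # log-difference approximation

print(np.round(simple, 5))                   # 0.02 in every period
print(np.round(log_diff, 5))                 # log(1.02), slightly below 0.02

# The approximation degrades as the change gets larger
big_jump = np.log(150.0) - np.log(100.0)
print(f"50% increase measured as a log difference: {big_jump:.3f}")
```

The same property turns products into sums: if $y_t = T_t \times S_t$, then $\log y_t = \log T_t + \log S_t$, which is what the multiplicative-to-additive point refers to.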
# Generate exponentially growing data (heteroskedastic)
exp_data = 100 * np.exp(0.05 * np.arange(100)) + np.random.randn(100) * np.arange(100) * 0.5
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
axes[0].plot(exp_data)
axes[0].set_title('Original Data (Heteroskedastic)', fontsize=12, fontweight='bold')
axes[0].set_ylabel('Value')
axes[1].plot(np.log(exp_data))
axes[1].set_title('After Log Transformation (Variance Stabilized)', fontsize=12, fontweight='bold')
axes[1].set_ylabel('log(Value)')
for ax in axes:
    ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Calculate growth rate
growth_rate = pd.Series(np.log(exp_data)).diff() * 100
print(f"Average growth rate: {growth_rate.mean():.2f}%")
print(f"Growth rate standard deviation: {growth_rate.std():.2f}%")
Box-Cox Transformation
Automatically find optimal power transformation parameter:
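For reference, the transformation itself is $y^{(\lambda)} = (y^\lambda - 1)/\lambda$ for $\lambda \neq 0$ and $\log y$ for $\lambda = 0$, so the log transformation is a special case. The sketch below writes this out together with its inverse; `box_cox` and `inv_box_cox` are helper names invented for this example, and the sample values are arbitrary.

```python
import numpy as np

def box_cox(x, lam):
    """Box-Cox transform of strictly positive data for a given lambda."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return np.log(x)
    return (x**lam - 1) / lam

def inv_box_cox(z, lam):
    """Inverse transform, needed to map results back to the original scale."""
    z = np.asarray(z, dtype=float)
    if lam == 0:
        return np.exp(z)
    return (lam * z + 1) ** (1 / lam)

x = np.array([1.0, 2.0, 5.0, 10.0])
print(box_cox(x, 1.0))      # lambda = 1 only shifts the data: x - 1
print(box_cox(x, 0.5))      # square-root-type transform
print(box_cox(x, 0.0))      # lambda = 0 is the natural log
print(inv_box_cox(box_cox(x, 0.5), 0.5))   # round trip recovers x
```

SciPy's `boxcox`, used next, additionally estimates $\lambda$ from the data by maximum likelihood when no value is supplied.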
from scipy.stats import boxcox
# Box-Cox transformation
data_positive = exp_data - exp_data.min() + 1 # Ensure positive values
transformed, lambda_opt = boxcox(data_positive)
print(f"Optimal λ parameter: {lambda_opt:.4f}")
fig, axes = plt.subplots(1, 3, figsize=(16, 5))
axes[0].plot(data_positive)
axes[0].set_title('Original Data', fontsize=12, fontweight='bold')
axes[1].plot(transformed)
axes[1].set_title(f'Box-Cox Transformation (λ={lambda_opt:.3f})', fontsize=12, fontweight='bold')
axes[2].hist(data_positive, bins=30, alpha=0.5, label='Original', density=True)
axes[2].hist(transformed, bins=30, alpha=0.5, label='Transformed', density=True)
axes[2].set_title('Distribution Comparison', fontsize=12, fontweight='bold')
axes[2].legend()
for ax in axes:
    ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Complete Case Study: China GDP Growth Rate Stationarity Analysis
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller, kpss
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
# Simulate China-like GDP data (1978-2023, in 100 million yuan); synthetic values, not official statistics
np.random.seed(2024)
years = pd.date_range('1978-12-31', periods=46, freq='Y')  # 46 year-end dates: 1978 through 2023
n = len(years)
# GDP level: exponential growth that slows gradually + cycle + random disturbance
# (trend growth falls from about 12% to about 7.5% per year over the sample)
t = np.arange(n)
log_gdp = 8.5 + 0.12*t - 0.0005*t**2 + 0.2*np.sin(2*np.pi*t/10) + np.random.randn(n)*0.05
gdp = np.exp(log_gdp)
# GDP growth rate
gdp_growth = pd.Series(log_gdp).diff() * 100
gdp_growth = gdp_growth.dropna()
df_gdp = pd.DataFrame({
'year': years,
'gdp': gdp,
'log_gdp': log_gdp,
'growth_rate': [np.nan] + list(gdp_growth)
})
# 1. Visualization
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
# GDP level
axes[0, 0].plot(df_gdp['year'], df_gdp['gdp'], linewidth=2)
axes[0, 0].set_title('China GDP Level (1978-2023)', fontsize=12, fontweight='bold')
axes[0, 0].set_ylabel('GDP (100 million yuan)')
axes[0, 0].grid(True, alpha=0.3)
# log(GDP)
axes[0, 1].plot(df_gdp['year'], df_gdp['log_gdp'], linewidth=2, color='orange')
axes[0, 1].set_title('log(GDP)', fontsize=12, fontweight='bold')
axes[0, 1].set_ylabel('log(GDP)')
axes[0, 1].grid(True, alpha=0.3)
# GDP growth rate
axes[1, 0].plot(df_gdp['year'][1:], df_gdp['growth_rate'][1:], linewidth=2, color='green')
axes[1, 0].axhline(y=df_gdp['growth_rate'].mean(), color='r', linestyle='--',
label=f'Average: {df_gdp["growth_rate"].mean():.2f}%')
axes[1, 0].set_title('GDP Growth Rate', fontsize=12, fontweight='bold')
axes[1, 0].set_ylabel('Growth Rate (%)')
axes[1, 0].legend()
axes[1, 0].grid(True, alpha=0.3)
# Growth rate distribution
axes[1, 1].hist(df_gdp['growth_rate'].dropna(), bins=15, edgecolor='black', alpha=0.7)
axes[1, 1].axvline(x=df_gdp['growth_rate'].mean(), color='r', linestyle='--', linewidth=2)
axes[1, 1].set_title('Growth Rate Distribution', fontsize=12, fontweight='bold')
axes[1, 1].set_xlabel('Growth Rate (%)')
axes[1, 1].set_ylabel('Frequency')
axes[1, 1].grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# 2. Descriptive statistics
print("\n" + "="*60)
print("GDP Growth Rate Descriptive Statistics")
print("="*60)
print(df_gdp['growth_rate'].describe())
# 3. ACF/PACF plots
fig, axes = plt.subplots(3, 2, figsize=(14, 12))
# GDP level
plot_acf(df_gdp['gdp'].dropna(), lags=20, ax=axes[0, 0], alpha=0.05)
axes[0, 0].set_title('ACF: GDP Level', fontsize=12, fontweight='bold')
plot_pacf(df_gdp['gdp'].dropna(), lags=20, ax=axes[0, 1], alpha=0.05)
axes[0, 1].set_title('PACF: GDP Level', fontsize=12, fontweight='bold')
# log(GDP)
plot_acf(df_gdp['log_gdp'].dropna(), lags=20, ax=axes[1, 0], alpha=0.05)
axes[1, 0].set_title('ACF: log(GDP)', fontsize=12, fontweight='bold')
plot_pacf(df_gdp['log_gdp'].dropna(), lags=20, ax=axes[1, 1], alpha=0.05)
axes[1, 1].set_title('PACF: log(GDP)', fontsize=12, fontweight='bold')
# GDP growth rate
plot_acf(df_gdp['growth_rate'].dropna(), lags=20, ax=axes[2, 0], alpha=0.05)
axes[2, 0].set_title('ACF: GDP Growth Rate', fontsize=12, fontweight='bold')
plot_pacf(df_gdp['growth_rate'].dropna(), lags=20, ax=axes[2, 1], alpha=0.05)
axes[2, 1].set_title('PACF: GDP Growth Rate', fontsize=12, fontweight='bold')
plt.tight_layout()
plt.show()
# 4. Stationarity tests
def comprehensive_stationarity_test(series, name):
    """Comprehensive stationarity test"""
    print(f"\n{'='*70}")
    print(f"Stationarity Test: {name}")
    print(f"{'='*70}")
    # ADF test
    adf_result = adfuller(series.dropna(), autolag='AIC')
    print(f"\n【ADF Test】(H0: Non-stationary)")
    print(f" Statistic: {adf_result[0]:.4f}")
    print(f" p-value: {adf_result[1]:.4f}")
    print(f" Critical Value (5%): {adf_result[4]['5%']:.4f}")
    adf_conclusion = "Stationary ✓" if adf_result[1] < 0.05 else "Non-stationary ✗"
    print(f" Conclusion: {adf_conclusion}")
    # KPSS test
    kpss_result = kpss(series.dropna(), regression='c', nlags='auto')
    print(f"\n【KPSS Test】(H0: Stationary)")
    print(f" Statistic: {kpss_result[0]:.4f}")
    print(f" p-value: {kpss_result[1]:.4f}")
    print(f" Critical Value (5%): {kpss_result[3]['5%']:.4f}")
    kpss_conclusion = "Stationary ✓" if kpss_result[1] > 0.05 else "Non-stationary ✗"
    print(f" Conclusion: {kpss_conclusion}")
    # Combined judgment
    print(f"\n【Combined Conclusion】")
    if adf_result[1] < 0.05 and kpss_result[1] > 0.05:
        print(f" Both tests consistently support: Series is stationary ✓")
    elif adf_result[1] >= 0.05 and kpss_result[1] <= 0.05:
        print(f" Both tests consistently support: Series is non-stationary ✗")
    else:
        print(f" Test results are conflicting, further analysis needed ⚠️")
# Test GDP level
comprehensive_stationarity_test(df_gdp['gdp'], 'GDP Level')
# Test log(GDP)
comprehensive_stationarity_test(df_gdp['log_gdp'], 'log(GDP)')
# Test GDP growth rate
comprehensive_stationarity_test(df_gdp['growth_rate'], 'GDP Growth Rate')
# 5. Differencing effect comparison
print(f"\n{'='*70}")
print("First-Order Differencing Effect")
print(f"{'='*70}")
gdp_diff = df_gdp['gdp'].diff().dropna()
comprehensive_stationarity_test(gdp_diff, 'GDP Level (After First-Order Differencing)')
# 6. Visualization summary
fig, axes = plt.subplots(3, 1, figsize=(14, 10))
axes[0].plot(df_gdp['year'], df_gdp['gdp'])
axes[0].set_title('GDP Level (Non-stationary)', fontsize=12, fontweight='bold')
axes[0].set_ylabel('GDP (100 million yuan)')
axes[0].grid(True, alpha=0.3)
axes[1].plot(df_gdp['year'][1:], df_gdp['growth_rate'][1:], color='green')
axes[1].axhline(y=0, color='black', linestyle='-', linewidth=0.8)
axes[1].set_title('GDP Growth Rate (Stationary)', fontsize=12, fontweight='bold')
axes[1].set_ylabel('Growth Rate (%)')
axes[1].grid(True, alpha=0.3)
axes[2].plot(df_gdp['year'][1:], gdp_diff, color='orange')
axes[2].axhline(y=0, color='black', linestyle='-', linewidth=0.8)
axes[2].set_title('GDP First-Order Difference (Stationary)', fontsize=12, fontweight='bold')
axes[2].set_ylabel('Δ GDP')
axes[2].set_xlabel('Year')
axes[2].grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
print("\n" + "="*70)
print("✓ Case analysis complete!")
print("="*70)
print("\nTypical findings (confirm against the test output above; the data here are simulated):")
print(" 1. GDP level is I(1) series (integrated of order 1), non-stationary")
print(" 2. log(GDP) is also I(1) series, non-stationary")
print(" 3. GDP growth rate is I(0) series (stationary)")
print(" 4. First-order differencing makes GDP level stationary")
print("\nPolicy implications:")
print(" - Analyzing GDP growth rate (rather than level) is more reliable")
print(" - Regression analysis requires differencing or cointegration testing")
print(" - Economic shocks' impact on growth rate is temporary")
Section Summary
Core Concepts
| Concept | Definition | Importance |
|---|---|---|
| Stationarity | Statistical properties don't change over time | Foundation for statistical inference |
| Unit Root | AR(1) coefficient=1, series non-stationary | Test for non-stationarity |
| Differencing | Δy_t = y_t - y_{t-1} | Convert I(1) to I(0) |
| Log Transformation | Stabilize variance, approximate growth rate | Handle exponential growth data |
Practice Checklist
- [ ] Plot time series, observe trend/seasonality
- [ ] Plot ACF/PACF, check autocorrelation structure
- [ ] Use both ADF and KPSS tests for stationarity
- [ ] If non-stationary, try differencing or log transformation
- [ ] Verify stationarity of transformed series
- [ ] Record transformation steps (needed for inverse transformation in modeling)
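The last checklist item matters because forecasts made on a transformed series have to be mapped back to the original scale. The sketch below shows the round trip for a log difference using made-up levels: the only extra information needed to undo the transformation is the first log level, which therefore has to be stored.

```python
import numpy as np
import pandas as pd

levels = pd.Series([100.0, 104.0, 103.0, 108.0, 115.0])

# Forward transformation: log, then first difference
log_diff = np.log(levels).diff().dropna()
first_log = np.log(levels.iloc[0])        # must be recorded for the inverse

# Inverse: cumulate the differences, add back the starting point, exponentiate
recovered_tail = np.exp(first_log + log_diff.cumsum())
recovered = pd.concat([levels.iloc[:1], recovered_tail])

print(recovered.round(6).tolist())
```

For a seasonal difference of period $s$, the first $s$ observations play the role that `first_log` plays here.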
Common Errors
❌ Avoid:
- Using only ADF test (may reach wrong conclusions)
- Over-differencing (e.g., differencing I(1) series twice)
- Ignoring seasonal patterns
- Differencing already stationary series
✓ Recommended:
- ADF + KPSS dual testing
- Visual check of transformation effects
- Understand economic meaning of data (e.g., growth rates typically stationary)
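The warning about over-differencing can be illustrated with white noise, which is what an I(1) random walk becomes after one difference. Differencing it again gives $e_t - e_{t-1}$, which has twice the variance and a lag-1 autocorrelation of -0.5 in theory. The sample size and helper name below are choices made for this sketch.

```python
import numpy as np

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation."""
    d = x - x.mean()
    return (d[1:] @ d[:-1]) / (d @ d)

rng = np.random.default_rng(4)
e = rng.normal(size=20000)     # already stationary: white noise
over = np.diff(e)              # an unnecessary extra difference

print(f"Lag-1 autocorrelation before: {lag1_autocorr(e):.3f}")
print(f"Lag-1 autocorrelation after:  {lag1_autocorr(over):.3f}")
print(f"Variance ratio after/before:  {over.var() / e.var():.2f}")
```

A strongly negative lag-1 autocorrelation in a differenced series is therefore a common sign that one difference too many was taken.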
Next Section Preview
In the next section, we will learn how to decompose trend, seasonal, and random components of time series.
Master the foundations of time series!
Extended Reading
Dickey, D. A., & Fuller, W. A. (1979). "Distribution of the estimators for autoregressive time series with a unit root." Journal of the American Statistical Association, 74(366a), 427-431.
Kwiatkowski, D., et al. (1992). "Testing the null hypothesis of stationarity against the alternative of a unit root." Journal of Econometrics, 54(1-3), 159-178.
Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press. (Chapters 15-17)
Enders, W. (2014). Applied Econometric Time Series (4th ed.). Wiley. (Chapter 4)