7.2 Time Series Basics

"The only relevant test of the validity of a hypothesis is comparison of prediction with experience."— Milton Friedman, 1976 Nobel Laureate in Economics

Understanding the properties of time series data and stationarity testing

Section Objectives

Upon completing this section, you will be able to:

Understand the core characteristics of time series data
Process time series data using pandas
Master the concept of stationarity and its importance
Implement ADF, KPSS, and PP tests
Apply differencing and transformation techniques
Analyze the stationarity of real economic data

Characteristics of Time Series Data

Definition of Time Series

A time series is a set of observations arranged in chronological order:

$$\{y_t\}_{t=1}^{T} = \{y_1, y_2, \ldots, y_T\}$$

Core Characteristics of Time Series

Characteristic	Description	Example
Time Dependence	Observations are correlated	Today's stock price depends on yesterday's
Trend	Long-term upward or downward movement	GDP's long-term growth trend
Seasonality	Fixed periodic fluctuations	Quarterly cycle in retail sales
Cyclicality	Non-fixed periodic fluctuations	Business cycles (4-7 years)
Randomness	Unpredictable fluctuations	White noise error term

Time Series Processing in Python

Core pandas Time Series Functions

python

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Create time series data
# Note: pandas 2.2+ deprecates the 'M' alias in favor of 'ME' (month end)
dates = pd.date_range(start='2010-01-01', end='2023-12-31', freq='M')
np.random.seed(42)
values = 100 + np.cumsum(np.random.randn(len(dates)) * 2)

ts = pd.Series(values, index=dates)

print(ts.head())
print(f"\nData type: {type(ts.index)}")
print(f"Frequency: {ts.index.freq}")
print(f"Time range: {ts.index.min()} to {ts.index.max()}")

Time Series Slicing and Indexing

python

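# Select by year (.loc makes the label-based lookup explicit)
ts_2015 = ts.loc['2015']

# Select by date range (both endpoints are included)
ts_range = ts.loc['2015-01':'2016-12']

# Filter by condition
ts_high = ts[ts > 110]

# Access specific date (the index holds month-end timestamps)
value_jan_2015 = ts.loc['2015-01-31']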

print(f"2015 average: {ts_2015.mean():.2f}")
print(f"2015-2016 data points: {len(ts_range)}")

Resampling

python

# Downsample: monthly → quarterly
# Note: pandas 2.2+ deprecates the 'Q' alias in favor of 'QE' (quarter end)
ts_quarterly = ts.resample('Q').mean()

# Upsample: monthly → daily (forward fill)
ts_daily = ts.resample('D').ffill()

# Downsample: different aggregation functions
ts_q_stats = pd.DataFrame({
    'mean': ts.resample('Q').mean(),
    'std': ts.resample('Q').std(),
    'min': ts.resample('Q').min(),
    'max': ts.resample('Q').max()
})

print("Quarterly statistics:")
print(ts_q_stats.head())

Rolling Windows

python

# Moving average
ts_ma_12 = ts.rolling(window=12).mean()

# Rolling standard deviation
ts_std_12 = ts.rolling(window=12).std()

# Visualization
fig, ax = plt.subplots(figsize=(14, 6))
ax.plot(ts, label='Original Data', alpha=0.6)
ax.plot(ts_ma_12, label='12-Month Moving Average', linewidth=2, color='red')
ax.fill_between(ts.index,
               ts_ma_12 - 2*ts_std_12,
               ts_ma_12 + 2*ts_std_12,
               alpha=0.2, color='red', label='±2σ Interval')
ax.set_title('Time Series with Moving Average', fontsize=14, fontweight='bold')
ax.set_xlabel('Time')
ax.set_ylabel('Value')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

Lags and Differences

python

# Lag operator
ts_lag1 = ts.shift(1)
ts_lag12 = ts.shift(12)

# First-order difference
ts_diff1 = ts.diff(1)

# Seasonal difference
ts_diff12 = ts.diff(12)

# Log difference (approximate growth rate)
ts_log = np.log(ts)
ts_growth = ts_log.diff(1) * 100  # percentage

print(f"Average monthly growth rate: {ts_growth.mean():.3f}%")
print(f"Growth rate standard deviation: {ts_growth.std():.3f}%")
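The log difference is only an approximation to the percentage growth rate, and the approximation worsens as changes get larger. The sketch below compares the two for a few hypothetical growth rates (the numbers are illustrative and not taken from the series above).

```python
import numpy as np

# Hypothetical one-period growth rates: 1%, 5%, 20%, 50%
exact = np.array([0.01, 0.05, 0.20, 0.50])

# Log difference for the same change: log(y_t) - log(y_{t-1}) = log(1 + g)
log_diff = np.log1p(exact)

# Gap between the exact rate and its log approximation
approx_error = exact - log_diff

for g, ld, err in zip(exact, log_diff, approx_error):
    print(f"exact: {g:6.2%}   log diff: {ld:6.2%}   gap: {err:6.2%}")
```

For monthly or quarterly macro data, where changes are usually a few percent, the gap is small; for volatile series it may be worth reporting exact percentage changes instead.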

Stationarity

Why is Stationarity Important?

Foundation for Statistical Inference:

Non-stationary series: Statistical properties change over time, sample statistics unreliable
Stationary series: Statistical properties constant, sample mean/variance are consistent estimators of population parameters

Spurious Regression:

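Regressing two independent non-stationary series on each other may yield statistically significant but meaningless results
Granger & Newbold (1974) named this problem "spurious regression"; Yule (1926) had earlier described "nonsense correlations" between time series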
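A small Monte Carlo experiment can make the spurious regression problem concrete. The sketch below is an illustration under assumed settings (500 replications, 100 observations, a 1.96 cutoff); the helper names `slope_t_stat` and `rejection_rate` are hypothetical. It regresses pairs of independent series on each other and counts how often the slope looks significant.

```python
import numpy as np

def slope_t_stat(x, y):
    """t-statistic of the slope in an OLS regression of y on a constant and x."""
    n = len(x)
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - 2)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
n_sims, n_obs = 500, 100

def rejection_rate(make_series):
    """Share of simulations with |t| > 1.96 for two independently drawn series."""
    hits = 0
    for _ in range(n_sims):
        x, y = make_series(), make_series()
        hits += abs(slope_t_stat(x, y)) > 1.96
    return hits / n_sims

# Independent random walks (non-stationary) vs. independent white noise (stationary)
rate_walks = rejection_rate(lambda: np.cumsum(rng.standard_normal(n_obs)))
rate_noise = rejection_rate(lambda: rng.standard_normal(n_obs))

print(f"Independent random walks: {rate_walks:.0%} 'significant'")
print(f"Independent white noise:  {rate_noise:.0%} 'significant'")
```

With stationary inputs the rejection share should sit near the nominal 5%, while for random walks it is typically far higher even though the series are unrelated by construction.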

Mathematical Definition of Stationarity

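Strictly Stationary:

$$F(y_{t_1}, \ldots, y_{t_k}) = F(y_{t_1+h}, \ldots, y_{t_k+h})$$

holds for all $(t_1, \ldots, t_k)$ and $h$, where $F$ denotes the joint distribution function.

Weakly Stationary/Covariance Stationary:

Constant expectation: $E[y_t] = \mu$ (constant)
Constant variance: $\mathrm{Var}(y_t) = \sigma^2$ (constant)
Autocovariance depends only on lag: $\mathrm{Cov}(y_t, y_{t-k}) = \gamma_k$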

In practical applications, we typically focus on weak stationarity.
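As a rough check of these conditions, consider an AR(1) process $y_t = \rho y_{t-1} + \varepsilon_t$ with $|\rho| < 1$ and unit-variance shocks. A standard result gives it variance $1/(1-\rho^2)$ and lag-$k$ autocovariance $\rho^k/(1-\rho^2)$. The sketch below simulates such a process (the sample size, seed, and the helper `autocov` are assumptions made for illustration) and compares sample moments across the two halves of the sample.

```python
import numpy as np

rng = np.random.default_rng(1)
rho, n = 0.7, 20_000

# Simulate y_t = rho * y_{t-1} + e_t with unit-variance shocks
y = np.zeros(n)
shocks = rng.standard_normal(n)
for t in range(1, n):
    y[t] = rho * y[t - 1] + shocks[t]

theoretical_var = 1 / (1 - rho**2)

# Under weak stationarity both halves should show similar mean and variance
first, second = y[: n // 2], y[n // 2 :]
print(f"Theoretical variance:  {theoretical_var:.3f}")
print(f"Variance, first half:  {first.var():.3f}")
print(f"Variance, second half: {second.var():.3f}")

def autocov(x, k):
    """Sample autocovariance at lag k (k >= 1)."""
    x = x - x.mean()
    return np.mean(x[k:] * x[:-k])

# The autocovariance should depend on the lag only: rho**k * variance
for k in (1, 2, 5):
    print(f"lag {k}: sample {autocov(y, k):.3f}, theory {rho**k * theoretical_var:.3f}")
```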

Visual Judgment of Stationarity

python

# Generate stationary and non-stationary series
np.random.seed(123)
n = 300

# Stationary series: AR(1) with |ρ| < 1
rho = 0.7
stationary = np.zeros(n)
stationary[0] = np.random.randn()
for t in range(1, n):
    stationary[t] = rho * stationary[t-1] + np.random.randn()

# Non-stationary series: random walk (unit root)
random_walk = np.cumsum(np.random.randn(n))

# Non-stationary series: deterministic trend
trend = np.arange(n) * 0.05 + np.random.randn(n) * 2

# Visualization
from statsmodels.graphics.tsaplots import plot_acf

fig, axes = plt.subplots(3, 2, figsize=(14, 10))

# Time series plots
for i, (data, title) in enumerate([(stationary, 'Stationary: AR(1), ρ=0.7'),
                                     (random_walk, 'Non-stationary: Random Walk'),
                                     (trend, 'Non-stationary: Deterministic Trend')]):
    axes[i, 0].plot(data, linewidth=1.5)
    axes[i, 0].set_title(title, fontsize=12, fontweight='bold')
    axes[i, 0].set_ylabel('Value')
    axes[i, 0].grid(True, alpha=0.3)

    # ACF plot
    plot_acf(data, lags=40, ax=axes[i, 1], alpha=0.05)
    axes[i, 1].set_title(f'ACF: {title.split(":")[1]}', fontsize=12)

axes[2, 0].set_xlabel('Time')
axes[2, 1].set_xlabel('Lag')

plt.tight_layout()
plt.show()

Key Observations:

ACF of stationary series decays rapidly
ACF of random walk decays slowly (values stay close to 1 at short lags)
ACF of trend series also decays slowly

Stationarity Tests

1. ADF Test (Augmented Dickey-Fuller Test)

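Null Hypothesis: $H_0$: Series has a unit root (non-stationary)
Alternative Hypothesis: $H_1$: Series has no unit root (stationary)

Test Equation:

$$\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \sum_{i=1}^{p} \delta_i \Delta y_{t-i} + \varepsilon_t$$

Test $H_0: \gamma = 0$ (unit root) vs $H_1: \gamma < 0$ (stationary)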

python

from statsmodels.tsa.stattools import adfuller

def adf_test(series, name=''):
    """Perform ADF test and print results"""
    result = adfuller(series, autolag='AIC')

    print(f'\n{"="*60}')
    print(f'ADF Test Results: {name}')
    print(f'{"="*60}')
    print(f'ADF Statistic: {result[0]:.4f}')
    print(f'p-value: {result[1]:.4f}')
    print(f'Lags Used: {result[2]}')
    print(f'Number of Observations: {result[3]}')
    print('Critical Values:')
    for key, value in result[4].items():
        print(f'  {key}: {value:.4f}')

    if result[1] <= 0.05:
        print(f"\nConclusion: Reject null hypothesis (p={result[1]:.4f}) → Series is stationary ✓")
    else:
        print(f"\nConclusion: Cannot reject null hypothesis (p={result[1]:.4f}) → Series is non-stationary ✗")

    return result

# Test stationary and non-stationary series
adf_test(stationary, 'AR(1) Series')
adf_test(random_walk, 'Random Walk')
adf_test(trend, 'Trend Series')

2. KPSS Test (Kwiatkowski-Phillips-Schmidt-Shin Test)

Null Hypothesis: $H_0$: Series is stationary
Alternative Hypothesis: $H_1$: Series is non-stationary (has a unit root)

⚠️ Note: KPSS has the opposite null hypothesis from ADF!

python

from statsmodels.tsa.stattools import kpss

def kpss_test(series, name='', regression='c'):
    """
    Perform KPSS test
    regression: 'c' (constant), 'ct' (constant+trend)
    """
    result = kpss(series, regression=regression, nlags='auto')

    print(f'\n{"="*60}')
    print(f'KPSS Test Results: {name}')
    print(f'{"="*60}')
    print(f'KPSS Statistic: {result[0]:.4f}')
    print(f'p-value: {result[1]:.4f}')
    print(f'Lags Used: {result[2]}')
    print('Critical Values:')
    for key, value in result[3].items():
        print(f'  {key}: {value:.4f}')

    if result[1] >= 0.05:
        print(f"\nConclusion: Cannot reject null hypothesis (p={result[1]:.4f}) → Series is stationary ✓")
    else:
        print(f"\nConclusion: Reject null hypothesis (p={result[1]:.4f}) → Series is non-stationary ✗")

    return result

# Test
kpss_test(stationary, 'AR(1) Series')
kpss_test(random_walk, 'Random Walk')
kpss_test(trend, 'Trend Series', regression='ct')

3. PP Test (Phillips-Perron Test)

Principle: Similar to ADF, but uses non-parametric methods to handle serial correlation and heteroskedasticity

python

# Note: statsmodels does not ship a Phillips-Perron test.
# This version uses the arch package instead (pip install arch).
try:
    from arch.unitroot import PhillipsPerron
except ImportError:
    PhillipsPerron = None

def pp_test_custom(series, name=''):
    """Perform PP test (H0: unit root) using arch.unitroot.PhillipsPerron"""
    if PhillipsPerron is None:
        print("\n⚠️ The arch package is not installed, so the PP test was skipped")
        print("pip install arch")
        return None

    # trend='c' includes a constant; the default lags=None lets arch choose the lag length
    result = PhillipsPerron(series, trend='c')

    print(f'\n{"="*60}')
    print(f'PP Test Results: {name}')
    print(f'{"="*60}')
    print(f'PP Statistic: {result.stat:.4f}')
    print(f'p-value: {result.pvalue:.4f}')
    print(f'Lags Used: {result.lags}')
    print('Critical Values:')
    for key, value in result.critical_values.items():
        print(f'  {key}: {value:.4f}')

    if result.pvalue <= 0.05:
        print(f"\nConclusion: Reject null hypothesis (p={result.pvalue:.4f}) → Series is stationary ✓")
    else:
        print(f"\nConclusion: Cannot reject null hypothesis (p={result.pvalue:.4f}) → Series is non-stationary ✗")

    return result

pp_test_custom(stationary, 'AR(1) Series')
pp_test_custom(random_walk, 'Random Walk')

Comparison of Test Methods

Test	Null Hypothesis	Advantage	Use Case
ADF	Non-stationary	Most common, robust	General testing
KPSS	Stationary	Complements ADF	Use jointly with ADF
PP	Non-stationary	Handles heteroskedasticity	High-frequency financial data

Best Practice: Use both ADF and KPSS

ADF	KPSS	Conclusion
Reject H0 (stationary)	Do not reject H0 (stationary)	✓ Confirmed stationary
Do not reject H0 (non-stationary)	Reject H0 (non-stationary)	✗ Confirmed non-stationary
Reject H0	Reject H0	⚠️ Conflicting results, check data
Do not reject H0	Do not reject H0	⚠️ Uncertain, needs further analysis
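The decision table can be written as a small function. This is a minimal sketch: `combine_adf_kpss` is a hypothetical helper, not part of statsmodels, and the 5% threshold is an assumption that can be changed through `alpha`.

```python
def combine_adf_kpss(adf_pvalue, kpss_pvalue, alpha=0.05):
    """Map a pair of p-values to one of the four verdicts in the table.

    Hypothetical helper for illustration only.
    """
    adf_rejects = adf_pvalue < alpha      # evidence against a unit root
    kpss_rejects = kpss_pvalue < alpha    # evidence against stationarity

    if adf_rejects and not kpss_rejects:
        return "stationary"
    if not adf_rejects and kpss_rejects:
        return "non-stationary"
    if adf_rejects and kpss_rejects:
        return "conflicting"
    return "uncertain"

# One illustrative p-value pair for each row of the table
for adf_p, kpss_p in [(0.001, 0.10), (0.60, 0.01), (0.01, 0.01), (0.30, 0.10)]:
    print(f"ADF p={adf_p:.3f}, KPSS p={kpss_p:.3f} -> {combine_adf_kpss(adf_p, kpss_p)}")
```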

Differencing and Transformation

First-Order Differencing

python

# Apply first-order differencing to random walk
rw_diff = pd.Series(random_walk).diff().dropna()

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

axes[0].plot(random_walk)
axes[0].set_title('Original Series: Random Walk (Non-stationary)', fontsize=12, fontweight='bold')
axes[0].set_ylabel('Value')
axes[0].grid(True, alpha=0.3)

axes[1].plot(rw_diff)
axes[1].set_title('After First-Order Differencing (Stationary)', fontsize=12, fontweight='bold')
axes[1].set_ylabel('Δy')
axes[1].axhline(y=0, color='r', linestyle='--', alpha=0.5)
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Test stationarity after differencing
adf_test(rw_diff, 'Random Walk (After First-Order Differencing)')

Seasonal Differencing

$$\Delta_s y_t = y_t - y_{t-s}$$

where $s$ is the seasonal period (e.g., monthly data $s = 12$, quarterly data $s = 4$)

python

# Generate seasonal data
dates = pd.date_range('2010-01', periods=120, freq='M')
seasonal = 10 + 5*np.sin(2*np.pi*np.arange(120)/12) + np.random.randn(120)
trend_seasonal = seasonal + np.arange(120) * 0.1

ts_seasonal = pd.Series(trend_seasonal, index=dates)

# Apply differencing
ts_diff1 = ts_seasonal.diff(1)        # First-order difference (detrend)
ts_diff12 = ts_seasonal.diff(12)      # Seasonal difference (deseasonalize)
ts_diff1_12 = ts_seasonal.diff(1).diff(12)  # Combined differencing

fig, axes = plt.subplots(2, 2, figsize=(14, 10))

axes[0, 0].plot(ts_seasonal)
axes[0, 0].set_title('Original Series (Trend + Seasonality)', fontsize=12, fontweight='bold')

axes[0, 1].plot(ts_diff1)
axes[0, 1].set_title('First-Order Difference (Detrended)', fontsize=12, fontweight='bold')
axes[0, 1].axhline(y=0, color='r', linestyle='--', alpha=0.5)

axes[1, 0].plot(ts_diff12)
axes[1, 0].set_title('Seasonal Difference (Deseasonalized)', fontsize=12, fontweight='bold')
axes[1, 0].axhline(y=0, color='r', linestyle='--', alpha=0.5)

axes[1, 1].plot(ts_diff1_12.dropna())
axes[1, 1].set_title('Combined Differencing (Detrended + Deseasonalized)', fontsize=12, fontweight='bold')
axes[1, 1].axhline(y=0, color='r', linestyle='--', alpha=0.5)

for ax in axes.flat:
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
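A noise-free example shows what seasonal differencing does to each component. In the sketch below (slope, amplitude, and length are assumed values), the 12-month cycle cancels exactly and the linear trend collapses to a constant equal to the slope times the seasonal period.

```python
import numpy as np
import pandas as pd

t = np.arange(48)
s = 12  # seasonal period for monthly data

# Noise-free series: linear trend (slope 0.1) plus a 12-month cycle
y = pd.Series(0.1 * t + 5 * np.sin(2 * np.pi * t / s))

# y_t - y_{t-12}: the first 12 values are undefined and dropped
seasonal_diff = y.diff(s).dropna()

# The cycle cancels; what remains of the trend is 0.1 * 12
print(seasonal_diff.describe())
```

Because a linear trend survives seasonal differencing only as a constant, a series with a stochastic trend may still need an additional first difference.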

Log Transformation

Uses:

Stabilize variance (when fluctuations grow in proportion to the level, the variance of the logged series is approximately constant)
Convert multiplicative model to additive model
Approximate growth rate calculation

python

# Generate exponentially growing data (heteroskedastic)
exp_data = 100 * np.exp(0.05 * np.arange(100)) + np.random.randn(100) * np.arange(100) * 0.5

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

axes[0].plot(exp_data)
axes[0].set_title('Original Data (Heteroskedastic)', fontsize=12, fontweight='bold')
axes[0].set_ylabel('Value')

axes[1].plot(np.log(exp_data))
axes[1].set_title('After Log Transformation (Variance Stabilized)', fontsize=12, fontweight='bold')
axes[1].set_ylabel('log(Value)')

for ax in axes:
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Calculate growth rate
growth_rate = pd.Series(np.log(exp_data)).diff() * 100
print(f"Average growth rate: {growth_rate.mean():.2f}%")
print(f"Growth rate standard deviation: {growth_rate.std():.2f}%")

Box-Cox Transformation

Automatically find optimal power transformation parameter:

python

from scipy.stats import boxcox

# Box-Cox transformation
data_positive = exp_data - exp_data.min() + 1  # Ensure positive values
transformed, lambda_opt = boxcox(data_positive)

print(f"Optimal λ parameter: {lambda_opt:.4f}")

fig, axes = plt.subplots(1, 3, figsize=(16, 5))

axes[0].plot(data_positive)
axes[0].set_title('Original Data', fontsize=12, fontweight='bold')

axes[1].plot(transformed)
axes[1].set_title(f'Box-Cox Transformation (λ={lambda_opt:.3f})', fontsize=12, fontweight='bold')

axes[2].hist(data_positive, bins=30, alpha=0.5, label='Original', density=True)
axes[2].hist(transformed, bins=30, alpha=0.5, label='Transformed', density=True)
axes[2].set_title('Distribution Comparison', fontsize=12, fontweight='bold')
axes[2].legend()

for ax in axes:
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
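Two properties of the Box-Cox transformation are worth checking in code: with λ = 0 it reduces to the natural log, and it can be undone with `scipy.special.inv_boxcox`. The sketch below uses simulated positive data (an assumption for illustration) to verify both.

```python
import numpy as np
from scipy.stats import boxcox
from scipy.special import inv_boxcox

rng = np.random.default_rng(5)
x = np.exp(rng.normal(loc=3.0, scale=0.5, size=200))  # strictly positive data

# With lmbda supplied, boxcox returns only the transformed array
as_log = boxcox(x, lmbda=0)
print(f"lambda=0 equals log: {np.allclose(as_log, np.log(x))}")

# Let scipy estimate lambda, then map the result back to the original scale
transformed, lam = boxcox(x)
recovered = inv_boxcox(transformed, lam)
print(f"Estimated lambda: {lam:.3f}")
print(f"Round trip recovers data: {np.allclose(recovered, x)}")
```

Keeping the estimated λ (and any shift applied beforehand to make the data positive) is what makes it possible to convert forecasts back to the original units.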

Complete Case Study: China GDP Growth Rate Stationarity Analysis

python

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller, kpss
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

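# Simulate China GDP data (1978-2023, in 100 million yuan)
np.random.seed(2024)
# Use an explicit year-end stop date: end='2023' would stop at 2022-12-31
# Note: pandas 2.2+ deprecates the 'Y' alias in favor of 'YE' (year end)
years = pd.date_range('1978-12-31', '2023-12-31', freq='Y')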
n = len(years)

# GDP level: exponential growth + cycle + random disturbance
t = np.arange(n)
log_gdp = 8.5 + 0.12*t - 0.003*t**2 + 0.2*np.sin(2*np.pi*t/10) + np.random.randn(n)*0.05
gdp = np.exp(log_gdp)

# GDP growth rate
gdp_growth = pd.Series(log_gdp).diff() * 100
gdp_growth = gdp_growth.dropna()

df_gdp = pd.DataFrame({
    'year': years,
    'gdp': gdp,
    'log_gdp': log_gdp,
    'growth_rate': [np.nan] + list(gdp_growth)
})

# 1. Visualization
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# GDP level
axes[0, 0].plot(df_gdp['year'], df_gdp['gdp'], linewidth=2)
axes[0, 0].set_title('China GDP Level (1978-2023)', fontsize=12, fontweight='bold')
axes[0, 0].set_ylabel('GDP (100 million yuan)')
axes[0, 0].grid(True, alpha=0.3)

# log(GDP)
axes[0, 1].plot(df_gdp['year'], df_gdp['log_gdp'], linewidth=2, color='orange')
axes[0, 1].set_title('log(GDP)', fontsize=12, fontweight='bold')
axes[0, 1].set_ylabel('log(GDP)')
axes[0, 1].grid(True, alpha=0.3)

# GDP growth rate
axes[1, 0].plot(df_gdp['year'][1:], df_gdp['growth_rate'][1:], linewidth=2, color='green')
axes[1, 0].axhline(y=df_gdp['growth_rate'].mean(), color='r', linestyle='--',
                  label=f'Average: {df_gdp["growth_rate"].mean():.2f}%')
axes[1, 0].set_title('GDP Growth Rate', fontsize=12, fontweight='bold')
axes[1, 0].set_ylabel('Growth Rate (%)')
axes[1, 0].legend()
axes[1, 0].grid(True, alpha=0.3)

# Growth rate distribution
axes[1, 1].hist(df_gdp['growth_rate'].dropna(), bins=15, edgecolor='black', alpha=0.7)
axes[1, 1].axvline(x=df_gdp['growth_rate'].mean(), color='r', linestyle='--', linewidth=2)
axes[1, 1].set_title('Growth Rate Distribution', fontsize=12, fontweight='bold')
axes[1, 1].set_xlabel('Growth Rate (%)')
axes[1, 1].set_ylabel('Frequency')
axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# 2. Descriptive statistics
print("\n" + "="*60)
print("GDP Growth Rate Descriptive Statistics")
print("="*60)
print(df_gdp['growth_rate'].describe())

# 3. ACF/PACF plots
fig, axes = plt.subplots(3, 2, figsize=(14, 12))

# GDP level
plot_acf(df_gdp['gdp'].dropna(), lags=20, ax=axes[0, 0], alpha=0.05)
axes[0, 0].set_title('ACF: GDP Level', fontsize=12, fontweight='bold')
plot_pacf(df_gdp['gdp'].dropna(), lags=20, ax=axes[0, 1], alpha=0.05)
axes[0, 1].set_title('PACF: GDP Level', fontsize=12, fontweight='bold')

# log(GDP)
plot_acf(df_gdp['log_gdp'].dropna(), lags=20, ax=axes[1, 0], alpha=0.05)
axes[1, 0].set_title('ACF: log(GDP)', fontsize=12, fontweight='bold')
plot_pacf(df_gdp['log_gdp'].dropna(), lags=20, ax=axes[1, 1], alpha=0.05)
axes[1, 1].set_title('PACF: log(GDP)', fontsize=12, fontweight='bold')

# GDP growth rate
plot_acf(df_gdp['growth_rate'].dropna(), lags=20, ax=axes[2, 0], alpha=0.05)
axes[2, 0].set_title('ACF: GDP Growth Rate', fontsize=12, fontweight='bold')
plot_pacf(df_gdp['growth_rate'].dropna(), lags=20, ax=axes[2, 1], alpha=0.05)
axes[2, 1].set_title('PACF: GDP Growth Rate', fontsize=12, fontweight='bold')

plt.tight_layout()
plt.show()

# 4. Stationarity tests
def comprehensive_stationarity_test(series, name):
    """Comprehensive stationarity test"""
    print(f"\n{'='*70}")
    print(f"Stationarity Test: {name}")
    print(f"{'='*70}")

    # ADF test
    adf_result = adfuller(series.dropna(), autolag='AIC')
    print(f"\n【ADF Test】(H0: Non-stationary)")
    print(f"  Statistic: {adf_result[0]:.4f}")
    print(f"  p-value: {adf_result[1]:.4f}")
    print(f"  Critical Value (5%): {adf_result[4]['5%']:.4f}")
    adf_conclusion = "Stationary ✓" if adf_result[1] < 0.05 else "Non-stationary ✗"
    print(f"  Conclusion: {adf_conclusion}")

    # KPSS test
    kpss_result = kpss(series.dropna(), regression='c', nlags='auto')
    print(f"\n【KPSS Test】(H0: Stationary)")
    print(f"  Statistic: {kpss_result[0]:.4f}")
    print(f"  p-value: {kpss_result[1]:.4f}")
    print(f"  Critical Value (5%): {kpss_result[3]['5%']:.4f}")
    kpss_conclusion = "Stationary ✓" if kpss_result[1] > 0.05 else "Non-stationary ✗"
    print(f"  Conclusion: {kpss_conclusion}")

    # Combined judgment
    print(f"\n【Combined Conclusion】")
    if adf_result[1] < 0.05 and kpss_result[1] > 0.05:
        print(f"  Both tests consistently support: Series is stationary ✓")
    elif adf_result[1] >= 0.05 and kpss_result[1] <= 0.05:
        print(f"  Both tests consistently support: Series is non-stationary ✗")
    else:
        print(f"  Test results are conflicting, further analysis needed ⚠️")

# Test GDP level
comprehensive_stationarity_test(df_gdp['gdp'], 'GDP Level')

# Test log(GDP)
comprehensive_stationarity_test(df_gdp['log_gdp'], 'log(GDP)')

# Test GDP growth rate
comprehensive_stationarity_test(df_gdp['growth_rate'], 'GDP Growth Rate')

# 5. Differencing effect comparison
print(f"\n{'='*70}")
print("First-Order Differencing Effect")
print(f"{'='*70}")

gdp_diff = df_gdp['gdp'].diff().dropna()
comprehensive_stationarity_test(gdp_diff, 'GDP Level (After First-Order Differencing)')

# 6. Visualization summary
fig, axes = plt.subplots(3, 1, figsize=(14, 10))

axes[0].plot(df_gdp['year'], df_gdp['gdp'])
axes[0].set_title('GDP Level (Non-stationary)', fontsize=12, fontweight='bold')
axes[0].set_ylabel('GDP (100 million yuan)')
axes[0].grid(True, alpha=0.3)

axes[1].plot(df_gdp['year'][1:], df_gdp['growth_rate'][1:], color='green')
axes[1].axhline(y=0, color='black', linestyle='-', linewidth=0.8)
axes[1].set_title('GDP Growth Rate (Stationary)', fontsize=12, fontweight='bold')
axes[1].set_ylabel('Growth Rate (%)')
axes[1].grid(True, alpha=0.3)

axes[2].plot(df_gdp['year'][1:], gdp_diff, color='orange')
axes[2].axhline(y=0, color='black', linestyle='-', linewidth=0.8)
axes[2].set_title('GDP First-Order Difference (Stationary)', fontsize=12, fontweight='bold')
axes[2].set_ylabel('Δ GDP')
axes[2].set_xlabel('Year')
axes[2].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\n" + "="*70)
print("✓ Case analysis complete!")
print("="*70)
print("\nTypical findings for GDP data (check against the test output above):")
print("  1. GDP level is I(1) series (integrated of order 1), non-stationary")
print("  2. log(GDP) is also I(1) series, non-stationary")
print("  3. GDP growth rate is I(0) series (stationary)")
print("  4. First-order differencing makes GDP level stationary")
print("\nPolicy implications:")
print("  - Analyzing GDP growth rate (rather than level) is more reliable")
print("  - Regression analysis requires differencing or cointegration testing")
print("  - Economic shocks' impact on growth rate is temporary")

Section Summary

Core Concepts

Concept	Definition	Importance
Stationarity	Statistical properties don't change over time	Foundation for statistical inference
Unit Root	AR(1) coefficient=1, series non-stationary	Test for non-stationarity
Differencing	Δy_t = y_t - y_{t-1}	Convert I(1) to I(0)
Log Transformation	Stabilize variance, approximate growth rate	Handle exponential growth data

Practice Checklist

[ ] Plot time series, observe trend/seasonality
[ ] Plot ACF/PACF, check autocorrelation structure
[ ] Use both ADF and KPSS tests for stationarity
[ ] If non-stationary, try differencing or log transformation
[ ] Verify stationarity of transformed series
[ ] Record transformation steps (needed for inverse transformation in modeling)
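The last item matters because forecasts made on a transformed series must be mapped back to levels. As a minimal sketch with made-up values, the following reverses a log-then-difference transformation by cumulating the differences, adding back the first log level, and exponentiating.

```python
import numpy as np
import pandas as pd

levels = pd.Series([100.0, 104.0, 103.0, 110.0, 118.0])  # illustrative values

# Forward transformation: log, then first difference
log_levels = np.log(levels)
log_diff = log_levels.diff().dropna()

# Inverse: cumulate the differences, add the first log level, exponentiate
first_log = log_levels.iloc[0]
rebuilt_tail = np.exp(first_log + log_diff.cumsum())

# Put the first observation back in front to recover the full series
rebuilt = pd.concat([levels.iloc[:1], rebuilt_tail])
print(rebuilt.round(2).tolist())
```

The inverse needs the first observation of the original series, which is why it has to be stored alongside the list of transformations.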

Common Errors

❌ Avoid:

Using only the ADF test (it has low power against near-unit-root alternatives, so failing to reject does not confirm a unit root)
Over-differencing (e.g., differencing I(1) series twice)
Ignoring seasonal patterns
Differencing already stationary series
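The cost of differencing a series that is already stationary can be seen in a short simulation. For white noise, theory says the differenced series has twice the variance and a lag-1 autocorrelation of -0.5. The sketch below (seed and sample size are assumptions) checks this on simulated data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(9)
white_noise = pd.Series(rng.standard_normal(5000))  # already stationary

# Unnecessary first difference
over_differenced = white_noise.diff().dropna()

print(f"Variance before: {white_noise.var():.3f}")
print(f"Variance after:  {over_differenced.var():.3f}")
print(f"Lag-1 autocorrelation after: {over_differenced.autocorr(lag=1):.3f}")
```

A strongly negative lag-1 autocorrelation after differencing is a common sign that the series was over-differenced.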

✓ Recommended:

ADF + KPSS dual testing
Visual check of transformation effects
Understand economic meaning of data (e.g., growth rates typically stationary)

Next Section Preview

In the next section, we will learn how to decompose trend, seasonal, and random components of time series.

Master the foundations of time series!

Extended Reading

Dickey, D. A., & Fuller, W. A. (1979). "Distribution of the estimators for autoregressive time series with a unit root." Journal of the American Statistical Association, 74(366a), 427-431.
Kwiatkowski, D., et al. (1992). "Testing the null hypothesis of stationarity against the alternative of a unit root." Journal of Econometrics, 54(1-3), 159-178.
Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press. (Chapters 15-17)
Enders, W. (2014). Applied Econometric Time Series (4th ed.). Wiley. (Chapter 4)
