Module 9 Difference-in-Differences (DID)
The Gold Standard of Causal Inference: From Natural Experiments to Policy Evaluation
Learning Objectives
Upon completing this module, you will be able to:
- Understand the core intuition and identification logic of difference-in-differences
- Master the fundamental assumptions of DID (parallel trends assumption) and testing methods
- Implement standard DID regressions and event study designs
- Conduct placebo tests and robustness checks
- Handle violations of the parallel trends assumption
- Use Python to implement DID analysis (statsmodels, linearmodels)
- Replicate classic DID studies (Card & Krueger 1994, etc.)
Why is DID a Core Method in Social Sciences?
DID: A Powerful Tool for Quasi-Experimental Design
In social science research, we rarely have opportunities to conduct randomized controlled trials (RCTs). Policymakers don't randomly assign policies for academic research. However, natural experiments provide us with "quasi-random" opportunities:
Classic Scenario: Some regions implement a new policy while others do not
- Treatment Group: Regions/individuals affected by the policy
- Control Group: Regions/individuals not affected by the policy
Core Question: How can we identify the causal effect of the policy from observational data?
DID vs Simple Comparisons
Suppose we want to evaluate the impact of minimum wage increases on employment rates.
❌ Wrong Method 1: Simple Before-After Comparison
Problem: There may be time trends! Even without the policy, employment rates may naturally increase or decrease.
❌ Wrong Method 2: Cross-Sectional Comparison (Treatment vs Control)
Problem: The treatment and control groups may be inherently different (selection bias)!
✅ Correct Method: Difference-in-Differences (DID)
Intuition:
- First difference: Eliminates cross-sectional differences (inherent differences between treatment and control)
- Second difference: Eliminates time trends (common time effects)
- What remains: The causal effect of the policy!
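The two differences can be seen in a toy calculation (all numbers below are hypothetical, chosen only for illustration):

```python
# Hypothetical employment rates for two regions, before and after a policy
treat_before, treat_after = 60.0, 72.0  # treated region
ctrl_before, ctrl_after = 55.0, 58.0    # control region

# First difference: change over time within each group
# (removes fixed level differences between the groups)
d_treat = treat_after - treat_before  # 12.0
d_ctrl = ctrl_after - ctrl_before     # 3.0

# Second difference: removes the common time trend
did = d_treat - d_ctrl
print(did)  # -> 9.0
```

The control group's change (3.0) stands in for the time trend the treated region would have experienced anyway; subtracting it leaves 9.0 as the estimated policy effect.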
Core Intuition of DID
Visualization: Ideal DID
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Font settings (macOS)
plt.rcParams['font.sans-serif'] = ['Arial Unicode MS']
plt.rcParams['axes.unicode_minus'] = False
sns.set_style("whitegrid")
# Simulate data
np.random.seed(42)
time = np.arange(0, 20)
treatment_time = 10
# Control group: stable growth
control = 50 + 2 * time + np.random.normal(0, 2, len(time))
# Treatment group: parallel before policy, jump after
treatment = 55 + 2 * time + np.random.normal(0, 2, len(time))
treatment[treatment_time:] += 15 # Policy effect
# Counterfactual (what would have happened without the policy)
counterfactual = 55 + 2 * time + np.random.normal(0, 2, len(time))
fig, ax = plt.subplots(figsize=(14, 8))
# Plot actual observations
ax.plot(time[:treatment_time], control[:treatment_time],
'o-', color='blue', linewidth=2, label='Control Group', markersize=6)
ax.plot(time[treatment_time:], control[treatment_time:],
'o-', color='blue', linewidth=2, markersize=6)
ax.plot(time[:treatment_time], treatment[:treatment_time],
's-', color='red', linewidth=2, label='Treatment Group', markersize=6)
ax.plot(time[treatment_time:], treatment[treatment_time:],
's-', color='red', linewidth=2, markersize=6)
# Plot counterfactual (dashed line)
ax.plot(time[treatment_time:], counterfactual[treatment_time:],
'--', color='red', linewidth=2, alpha=0.6, label='Counterfactual')
# Mark policy time point
ax.axvline(x=treatment_time, color='green', linestyle='--', linewidth=2, alpha=0.7)
ax.text(treatment_time + 0.3, 45, 'Policy Implementation', fontsize=12, color='green', fontweight='bold')
# Mark DID effect
did_effect_y = treatment[treatment_time + 5]
counterfactual_y = counterfactual[treatment_time + 5]
ax.annotate('', xy=(treatment_time + 5, did_effect_y),
xytext=(treatment_time + 5, counterfactual_y),
arrowprops=dict(arrowstyle='<->', color='purple', lw=3))
ax.text(treatment_time + 5.5, (did_effect_y + counterfactual_y) / 2,
'DID Effect\n(Causal Effect)', fontsize=12, color='purple', fontweight='bold',
bbox=dict(boxstyle='round', facecolor='yellow', alpha=0.3))
ax.set_xlabel('Time', fontsize=14, fontweight='bold')
ax.set_ylabel('Outcome Variable (e.g., Employment Rate)', fontsize=14, fontweight='bold')
ax.set_title('Core Logic of Difference-in-Differences (DID)', fontsize=16, fontweight='bold')
ax.legend(loc='upper left', fontsize=12)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Key Observations:
- Before Policy: Treatment and control groups are parallel (same trend, different levels)
- After Policy: Treatment group jumps, control group continues original trend
- DID Effect: Treatment group actual value - counterfactual value (dashed line)
Mathematical Expression of DID
2×2 DID: The Simplest Case
Data Structure:
| | Before Policy | After Policy | Difference (Δ) |
|---|---|---|---|
| Treatment Group | $\bar{Y}_{T,\text{pre}}$ | $\bar{Y}_{T,\text{post}}$ | $\Delta_T = \bar{Y}_{T,\text{post}} - \bar{Y}_{T,\text{pre}}$ |
| Control Group | $\bar{Y}_{C,\text{pre}}$ | $\bar{Y}_{C,\text{post}}$ | $\Delta_C = \bar{Y}_{C,\text{post}} - \bar{Y}_{C,\text{pre}}$ |
DID Estimator:

$$\hat{\tau}_{DID} = (\bar{Y}_{T,\text{post}} - \bar{Y}_{T,\text{pre}}) - (\bar{Y}_{C,\text{post}} - \bar{Y}_{C,\text{pre}}) = (\bar{Y}_{T,\text{post}} - \bar{Y}_{C,\text{post}}) - (\bar{Y}_{T,\text{pre}} - \bar{Y}_{C,\text{pre}})$$

Intuition:
- First expression: difference over time within each group first (before-after), then difference between groups
- Second expression: difference between groups first (treatment vs control), then difference over time
Regression Form of 2×2 DID
Equivalent regression model:

$$Y_{it} = \beta_0 + \beta_1\,\text{Treated}_i + \beta_2\,\text{Post}_t + \beta_3\,(\text{Treated}_i \times \text{Post}_t) + \varepsilon_{it}$$

Variable Definitions:
- $\text{Treated}_i = 1$ if individual $i$ is in the treatment group, 0 otherwise
- $\text{Post}_t = 1$ if time $t$ is after the policy, 0 otherwise
- $\text{Treated}_i \times \text{Post}_t$: Interaction term, the key to the policy effect!

Parameter Interpretation:
- $\beta_0$: Baseline level of control group before policy
- $\beta_1$: Fixed difference between treatment and control groups before policy (cross-sectional heterogeneity)
- $\beta_2$: Time trend for control group before and after policy (common time effect)
- $\beta_3$: DID Effect (causal effect of policy)
Why is $\beta_3$ the causal effect?

Let's derive the four group means:

| Combination | $\text{Treated}$ | $\text{Post}$ | Interaction | Predicted Value |
|---|---|---|---|---|
| Control-Before | 0 | 0 | 0 | $\beta_0$ |
| Control-After | 0 | 1 | 0 | $\beta_0 + \beta_2$ |
| Treatment-Before | 1 | 0 | 0 | $\beta_0 + \beta_1$ |
| Treatment-After | 1 | 1 | 1 | $\beta_0 + \beta_1 + \beta_2 + \beta_3$ |

Calculate DID:

$$\hat{\tau}_{DID} = [(\beta_0 + \beta_1 + \beta_2 + \beta_3) - (\beta_0 + \beta_1)] - [(\beta_0 + \beta_2) - \beta_0] = \beta_3$$

Conclusion: The interaction term coefficient $\beta_3$ in the regression is the DID effect we want!
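A quick numeric spot-check of this algebra, plugging arbitrary illustrative values into the four cell means:

```python
# Arbitrary illustrative values for the regression coefficients
b0, b1, b2, b3 = 50.0, 10.0, 5.0, 15.0

# The four cell means implied by the regression
treat_before = b0 + b1
treat_after = b0 + b1 + b2 + b3
ctrl_before = b0
ctrl_after = b0 + b2

# The double difference collapses to b3: b0, b1, b2 all cancel
did = (treat_after - treat_before) - (ctrl_after - ctrl_before)
print(did)  # -> 15.0
```

Whatever values you choose for the other coefficients, the result is always b3.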
Core Assumptions of DID
Assumption 1: Parallel Trends Assumption ⭐
Core Assumption: In the absence of policy intervention, the treatment and control groups would follow the same time trend.
Mathematical Expression:

$$E[Y_{it}(0) \mid D_i = 1, \text{Post}] - E[Y_{it}(0) \mid D_i = 1, \text{Pre}] = E[Y_{it}(0) \mid D_i = 0, \text{Post}] - E[Y_{it}(0) \mid D_i = 0, \text{Pre}]$$

where $Y_{it}(0)$ is the potential outcome without treatment and $D_i = 1$ indicates the treatment group.
Plain Language:
- In the counterfactual world without policy, the gap between treatment and control groups remains constant
- In other words, the trends of the two groups are the same (parallel), but levels can differ
Why is it important?
- If parallel trends are violated, DID estimates will be biased
- This is DID's most critical and most frequently questioned assumption
How to test? (discussed in detail in Section 3)
- Pre-trend Test: Check if trends are parallel before policy
- Event Study Plot: Visualize dynamic effects
- Placebo Test: Use fake policy time points
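As a preview of the pre-trend test, here is a minimal sketch on simulated data in which parallel trends hold by construction (all names and parameter values are illustrative, not from a real study):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(0)
# Simulated pre-policy panel: both groups share the same time trend (slope 2)
# and differ only in level, so parallel trends hold by construction.
rows = []
for unit in range(200):
    treated = 1 if unit >= 100 else 0
    for t in range(10):  # 10 pre-policy periods
        y = 50 + 10 * treated + 2 * t + np.random.normal(0, 2)
        rows.append({'unit': unit, 'time': t, 'treated': treated, 'y': y})
pre = pd.DataFrame(rows)

# Pre-trend test: regress y on time, treated, and their interaction.
# Under parallel trends the interaction slope should be close to zero.
m = smf.ols('y ~ time * treated', data=pre).fit(cov_type='HC1')
print(f"Interaction slope: {m.params['time:treated']:.3f}  "
      f"(p = {m.pvalues['time:treated']:.3f})")
```

A small, statistically insignificant interaction slope is consistent with parallel pre-trends; a large or significant one is a warning sign for the DID design.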
Assumption 2: Exogeneity
Assumption: The timing and location of policy implementation are unrelated to potential outcomes (conditional independence).
Mathematical Expression:

$$\{Y_{it}(0), Y_{it}(1)\} \perp D_i \mid X_i$$

where $X_i$ denotes observed covariates.
Plain Language:
- Which regions/individuals are affected by the policy cannot be determined by future outcomes
- Example: If the government raises minimum wages only in regions where employment is about to rise, DID will overestimate the effect
Threats in Practice:
- Anticipation Effects: Firms know about the policy in advance and adjust beforehand
- Reverse Causality: Government selects policy targets based on outcome variables (endogeneity)
Assumption 3: Stable Unit Treatment Value Assumption (SUTVA)
Assumption: The treatment status of one unit does not affect the outcomes of other units (no spillover effects).
Threats:
- Geographic Spillover: Firms in policy regions relocate to non-policy regions
- General Equilibrium Effects: Policy changes market prices, wages, etc.
Classic Applications of DID
Case 1: Card & Krueger (1994) - Minimum Wage and Employment
Research Question: Does raising minimum wage reduce employment?
Background:
- In April 1992, New Jersey raised its minimum wage from $4.25 to $5.05
- Neighboring Pennsylvania did not raise its minimum wage
Research Design:
- Treatment Group: Fast-food restaurants in New Jersey
- Control Group: Fast-food restaurants in Pennsylvania
- Outcome Variable: Full-time equivalent employment (FTE)
Data:
- Pre-Policy: February 1992 (2 months before policy)
- Post-Policy: November 1992 (7 months after policy)
Findings (FTE employment per restaurant, Table 3 of the paper):

| | New Jersey (Treatment) | Pennsylvania (Control) |
|---|---|---|
| Before (Feb 1992) | 20.44 | 23.33 |
| After (Nov 1992) | 21.03 | 21.17 |
| Difference | +0.59 | −2.16 |

$$\hat{\tau}_{DID} = 0.59 - (-2.16) \approx 2.76$$
Conclusion: After the minimum wage increase, FTE employment per restaurant in New Jersey rose by about 2.76 relative to Pennsylvania. Employment did not fall!
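The headline number can be reproduced directly from the four group means published in the paper (Table 3, FTE employment, rounded to two decimals; the paper's 2.76 comes from unrounded data):

```python
# DID from the published group means (Card & Krueger 1994, Table 3)
nj_before, nj_after = 20.44, 21.03   # New Jersey (treatment group)
pa_before, pa_after = 23.33, 21.17   # Pennsylvania (control group)

did = (nj_after - nj_before) - (pa_after - pa_before)
print(f"DID estimate: {did:.2f}")   # prints 2.75 with these rounded means
```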
Impact:
- Challenged traditional economic theory
- Sparked long-term debate about minimum wages
- Became a classic case for the DID method
Case 2: Bertrand et al. (2004) - Statistical Inference Issues in DID
Problem Identified:
- Many DID studies underestimate standard errors
- Reason: Serial Correlation
Solutions:
- Clustered Standard Errors: Cluster at the policy unit level
- Block Bootstrap: Resampling methods
- Randomization Inference
Lesson: DID should focus not only on point estimates but more importantly on statistical inference!
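A minimal sketch of the clustering fix on simulated data with serially correlated errors (the panel structure, AR(1) coefficient, and effect sizes are all made up for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(1)
# Simulated state-by-year panel whose errors are serially correlated
# within state (AR(1)), the situation Bertrand et al. (2004) warn about.
rows = []
for state in range(40):
    treated = 1 if state >= 20 else 0
    e = 0.0
    for year in range(10):
        post = 1 if year >= 5 else 0
        e = 0.8 * e + np.random.normal(0, 1)  # serial correlation within state
        y = 10 + 2 * treated + 1 * post + 3 * treated * post + e
        rows.append({'state': state, 'year': year,
                     'treated': treated, 'post': post, 'y': y})
df = pd.DataFrame(rows)

# Same point estimate, different inference: naive iid standard errors vs.
# standard errors clustered at the policy-unit (state) level
naive = smf.ols('y ~ treated * post', data=df).fit()
clustered = smf.ols('y ~ treated * post', data=df).fit(
    cov_type='cluster', cov_kwds={'groups': df['state']})
print(f"Naive SE:     {naive.bse['treated:post']:.3f}")
print(f"Clustered SE: {clustered.bse['treated:post']:.3f}")
```

With persistent within-state shocks, the clustered standard error is typically noticeably larger than the naive one, which is exactly why naive inference over-rejects.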
Python Implementation: 2×2 DID Example
Simple Example: Simulated Data
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.iolib.summary2 import summary_col
# Set random seed
np.random.seed(123)
# Simulate data
n_units = 50 # Number of units per group
n_periods = 2 # Two periods
# Create dataframe
data = []
for unit in range(n_units * 2):
    treated = 1 if unit >= n_units else 0
    for period in range(n_periods):
        post = 1 if period == 1 else 0
        # Data generating process:
        # Y = 50 + 10*treated + 5*post + 15*(treated*post) + noise
        y = 50 + 10 * treated + 5 * post + 15 * (treated * post) + np.random.normal(0, 5)
        data.append({
            'unit': unit,
            'period': period,
            'treated': treated,
            'post': post,
            'y': y
        })
df = pd.DataFrame(data)
df['treated_post'] = df['treated'] * df['post']
print("=" * 60)
print("Data Preview")
print("=" * 60)
print(df.head(10))
print("\n")
# Calculate means
means = df.groupby(['treated', 'post'])['y'].mean().unstack()
print("=" * 60)
print("2×2 Mean Table")
print("=" * 60)
print(means)
print("\n")
# Manual DID calculation
treated_diff = means.loc[1, 1] - means.loc[1, 0]
control_diff = means.loc[0, 1] - means.loc[0, 0]
did_manual = treated_diff - control_diff
print("=" * 60)
print("Manual DID Calculation")
print("=" * 60)
print(f"Treatment Group Difference: {treated_diff:.3f}")
print(f"Control Group Difference: {control_diff:.3f}")
print(f"DID Effect: {did_manual:.3f}")
print("\n")
# Regression estimate of DID
model = smf.ols('y ~ treated + post + treated_post', data=df).fit(cov_type='HC1')
print("=" * 60)
print("Regression-Based DID")
print("=" * 60)
print(model.summary())
Output Interpretation:
- `treated` coefficient ≈ 10: Inherent difference between treatment and control groups
- `post` coefficient ≈ 5: Time trend (same for both groups)
- `treated_post` coefficient ≈ 15: DID Effect (Policy Causal Effect) ⭐
Module Structure
Section 1: Chapter Introduction (Current)
- Core intuition and ideas of DID
- Mathematical expression and regression form of 2×2 DID
- Core assumptions: parallel trends, exogeneity, SUTVA
- Classic application cases
- Basic Python implementation
Section 2: DID Fundamentals
- Identification logic and causal diagrams of DID
- Multi-period DID (Staggered DID)
- Control variables and covariate adjustment
- Standard error calculation (clustering, robustness)
- Panel data DID implementation
Section 3: Parallel Trends Assumption
- Pre-trend tests
- Event study design
- Dynamic effects plots
- Handling violations of parallel trends
Section 4: Placebo Tests
- Placebo tests
- Fake policy time point tests
- Sample exclusion tests
- Randomization inference
Section 5: Classic Cases and Python Implementation
- Complete replication of Card & Krueger (1994)
- DID evaluation of environmental policies
- Panel data handling techniques
- Visualization best practices
Section 6: Chapter Summary
- DID method summary
- Common pitfalls and considerations
- Exercises
- Literature recommendations
Python Toolkit
Core Libraries
| Library | Main Functions | Installation |
|---|---|---|
| pandas | Data manipulation | pip install pandas |
| statsmodels | OLS regression | pip install statsmodels |
| linearmodels | Panel DID | pip install linearmodels |
| matplotlib | Visualization | pip install matplotlib |
| seaborn | Advanced visualization | pip install seaborn |
Basic Setup
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.iolib.summary2 import summary_col
from linearmodels.panel import PanelOLS
# Font settings (macOS)
plt.rcParams['font.sans-serif'] = ['Arial Unicode MS']
plt.rcParams['axes.unicode_minus'] = False
# Set style
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)
pd.set_option('display.float_format', '{:.4f}'.format)
Essential References
Foundational Papers
Card, D., & Krueger, A. B. (1994). "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania." American Economic Review, 84(4), 772-793.
Bertrand, M., Duflo, E., & Mullainathan, S. (2004). "How Much Should We Trust Differences-In-Differences Estimates?" Quarterly Journal of Economics, 119(1), 249-275.
Callaway, B., & Sant'Anna, P. H. (2021). "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, 225(2), 200-230.
Recommended Textbooks
- Angrist & Pischke (2009). Mostly Harmless Econometrics, Chapter 5
- Cunningham (2021). Causal Inference: The Mixtape, Chapter 9
- Huntington-Klein (2022). The Effect, Chapter 18
Ready to Begin?
DID is one of the foundational methods of modern causal inference. Mastering it will enable you to:
- Evaluate real-world policy effects
- Publish research in top-tier journals
- Provide scientific evidence for policymaking
Remember the core idea:
"The beauty of DID is its simplicity: it differences out fixed differences and common trends, leaving only the treatment effect."
Let's begin with Section 2: DID Fundamentals!
Causal inference starts with DID!