Module 9 Difference-in-Differences (DID)
The Gold Standard of Causal Inference: From Natural Experiments to Policy Evaluation
Learning Objectives
Upon completing this module, you will be able to:
- Understand the core intuition and identification logic of difference-in-differences
- Master the fundamental assumptions of DID (parallel trends assumption) and testing methods
- Implement standard DID regressions and event study designs
- Conduct placebo tests and robustness checks
- Handle violations of the parallel trends assumption
- Use Python to implement DID analysis (statsmodels, linearmodels)
- Replicate classic DID studies (Card & Krueger 1994, etc.)
Why is DID a Core Method in Social Sciences?
DID: A Powerful Tool for Quasi-Experimental Design
In social science research, we rarely have opportunities to conduct randomized controlled trials (RCTs). Policymakers don't randomly assign policies for academic research. However, natural experiments provide us with "quasi-random" opportunities:
Classic Scenario: Some regions implement a new policy while others do not
- Treatment Group: Regions/individuals affected by the policy
- Control Group: Regions/individuals not affected by the policy
Core Question: How can we identify the causal effect of the policy from observational data?
DID vs Simple Comparisons
Suppose we want to evaluate the impact of minimum wage increases on employment rates.
❌ Wrong Method 1: Simple Before-After Comparison
Problem: There may be time trends! Even without the policy, employment rates may naturally increase or decrease.
❌ Wrong Method 2: Cross-Sectional Comparison (Treatment vs Control)
Problem: The treatment and control groups may be inherently different (selection bias)!
✅ Correct Method: Difference-in-Differences (DID)
Intuition:
- First difference: Eliminates cross-sectional differences (inherent differences between treatment and control)
- Second difference: Eliminates time trends (common time effects)
- What remains: The causal effect of the policy!
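The two differences can be seen in a toy calculation (all numbers below are hypothetical, chosen only for illustration):

```python
# Hypothetical employment rates for two regions, before and after a policy
treat_before, treat_after = 60.0, 72.0  # treated region
ctrl_before, ctrl_after = 55.0, 58.0    # control region

# First difference: change over time within each group
# (removes fixed level differences between the groups)
d_treat = treat_after - treat_before  # 12.0
d_ctrl = ctrl_after - ctrl_before     # 3.0

# Second difference: removes the common time trend
did = d_treat - d_ctrl
print(did)  # -> 9.0
```

The control group's change (3.0) stands in for the time trend the treated region would have experienced anyway; subtracting it leaves 9.0 as the estimated policy effect.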
Core Intuition of DID
Visualization: Ideal DID
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Font settings (macOS)
plt.rcParams['font.sans-serif'] = ['Arial Unicode MS']
plt.rcParams['axes.unicode_minus'] = False
sns.set_style("whitegrid")
# Simulate data
np.random.seed(42)
time = np.arange(0, 20)
treatment_time = 10
# Control group: stable growth
control = 50 + 2 * time + np.random.normal(0, 2, len(time))
# Treatment group: parallel before policy, jump after
treatment = 55 + 2 * time + np.random.normal(0, 2, len(time))
treatment[treatment_time:] += 15 # Policy effect
# Counterfactual (what would have happened without the policy)
counterfactual = 55 + 2 * time + np.random.normal(0, 2, len(time))
fig, ax = plt.subplots(figsize=(14, 8))
# Plot actual observations
ax.plot(time[:treatment_time], control[:treatment_time],
'o-', color='blue', linewidth=2, label='Control Group', markersize=6)
ax.plot(time[treatment_time:], control[treatment_time:],
'o-', color='blue', linewidth=2, markersize=6)
ax.plot(time[:treatment_time], treatment[:treatment_time],
's-', color='red', linewidth=2, label='Treatment Group', markersize=6)
ax.plot(time[treatment_time:], treatment[treatment_time:],
's-', color='red', linewidth=2, markersize=6)
# Plot counterfactual (dashed line)
ax.plot(time[treatment_time:], counterfactual[treatment_time:],
'--', color='red', linewidth=2, alpha=0.6, label='Counterfactual')
# Mark policy time point
ax.axvline(x=treatment_time, color='green', linestyle='--', linewidth=2, alpha=0.7)
ax.text(treatment_time + 0.3, 45, 'Policy Implementation', fontsize=12, color='green', fontweight='bold')
# Mark DID effect
did_effect_y = treatment[treatment_time + 5]
counterfactual_y = counterfactual[treatment_time + 5]
ax.annotate('', xy=(treatment_time + 5, did_effect_y),
xytext=(treatment_time + 5, counterfactual_y),
arrowprops=dict(arrowstyle='<->', color='purple', lw=3))
ax.text(treatment_time + 5.5, (did_effect_y + counterfactual_y) / 2,
'DID Effect\n(Causal Effect)', fontsize=12, color='purple', fontweight='bold',
bbox=dict(boxstyle='round', facecolor='yellow', alpha=0.3))
ax.set_xlabel('Time', fontsize=14, fontweight='bold')
ax.set_ylabel('Outcome Variable (e.g., Employment Rate)', fontsize=14, fontweight='bold')
ax.set_title('Core Logic of Difference-in-Differences (DID)', fontsize=16, fontweight='bold')
ax.legend(loc='upper left', fontsize=12)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Key Observations:
- Before Policy: Treatment and control groups are parallel (same trend, different levels)
- After Policy: Treatment group jumps, control group continues original trend
- DID Effect: Treatment group actual value - counterfactual value (dashed line)
Mathematical Expression of DID
2×2 DID: The Simplest Case
Data Structure:
| | Before Policy | After Policy | Difference (Δ) |
|---|---|---|---|
| Treatment Group | $\bar{Y}_{T,\text{pre}}$ | $\bar{Y}_{T,\text{post}}$ | $\Delta_T = \bar{Y}_{T,\text{post}} - \bar{Y}_{T,\text{pre}}$ |
| Control Group | $\bar{Y}_{C,\text{pre}}$ | $\bar{Y}_{C,\text{post}}$ | $\Delta_C = \bar{Y}_{C,\text{post}} - \bar{Y}_{C,\text{pre}}$ |
DID Estimator:

$$\hat{\tau}_{DID} = (\bar{Y}_{T,\text{post}} - \bar{Y}_{T,\text{pre}}) - (\bar{Y}_{C,\text{post}} - \bar{Y}_{C,\text{pre}}) = (\bar{Y}_{T,\text{post}} - \bar{Y}_{C,\text{post}}) - (\bar{Y}_{T,\text{pre}} - \bar{Y}_{C,\text{pre}})$$

Intuition:
- First expression: difference over time within each group first (before-after), then difference between groups
- Second expression: difference between groups first (treatment vs control), then difference over time
Regression Form of 2×2 DID
Equivalent regression model:

$$Y_{it} = \beta_0 + \beta_1\,\text{Treated}_i + \beta_2\,\text{Post}_t + \beta_3\,(\text{Treated}_i \times \text{Post}_t) + \varepsilon_{it}$$

Variable Definitions:
- $\text{Treated}_i = 1$ if individual $i$ is in the treatment group, 0 otherwise
- $\text{Post}_t = 1$ if time $t$ is after the policy, 0 otherwise
- $\text{Treated}_i \times \text{Post}_t$: Interaction term, the key to the policy effect!

Parameter Interpretation:
- $\beta_0$: Baseline level of control group before policy
- $\beta_1$: Fixed difference between treatment and control groups before policy (cross-sectional heterogeneity)
- $\beta_2$: Time trend for control group before and after policy (common time effect)
- $\beta_3$: DID Effect (causal effect of policy)
Why is $\beta_3$ the causal effect?

Let's derive the four group means:

| Combination | $\text{Treated}$ | $\text{Post}$ | Interaction | Predicted Value |
|---|---|---|---|---|
| Control-Before | 0 | 0 | 0 | $\beta_0$ |
| Control-After | 0 | 1 | 0 | $\beta_0 + \beta_2$ |
| Treatment-Before | 1 | 0 | 0 | $\beta_0 + \beta_1$ |
| Treatment-After | 1 | 1 | 1 | $\beta_0 + \beta_1 + \beta_2 + \beta_3$ |

Calculate DID:

$$\hat{\tau}_{DID} = [(\beta_0 + \beta_1 + \beta_2 + \beta_3) - (\beta_0 + \beta_1)] - [(\beta_0 + \beta_2) - \beta_0] = \beta_3$$

Conclusion: The interaction term coefficient $\beta_3$ in the regression is the DID effect we want!
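A quick numeric spot-check of this algebra, plugging arbitrary illustrative values into the four cell means:

```python
# Arbitrary illustrative values for the regression coefficients
b0, b1, b2, b3 = 50.0, 10.0, 5.0, 15.0

# The four cell means implied by the regression
treat_before = b0 + b1
treat_after = b0 + b1 + b2 + b3
ctrl_before = b0
ctrl_after = b0 + b2

# The double difference collapses to b3: b0, b1, b2 all cancel
did = (treat_after - treat_before) - (ctrl_after - ctrl_before)
print(did)  # -> 15.0
```

Whatever values you choose for the other coefficients, the result is always b3.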
Core Assumptions of DID
Assumption 1: Parallel Trends Assumption ⭐
Core Assumption: In the absence of policy intervention, the treatment and control groups would follow the same time trend.
Mathematical Expression:

$$E[Y_{it}(0) \mid D_i = 1, \text{Post}] - E[Y_{it}(0) \mid D_i = 1, \text{Pre}] = E[Y_{it}(0) \mid D_i = 0, \text{Post}] - E[Y_{it}(0) \mid D_i = 0, \text{Pre}]$$

where $Y_{it}(0)$ is the potential outcome without treatment and $D_i = 1$ indicates the treatment group.
Plain Language:
- In the counterfactual world without policy, the gap between treatment and control groups remains constant
- In other words, the trends of the two groups are the same (parallel), but levels can differ
Why is it important?
- If parallel trends are violated, DID estimates will be biased
- This is DID's most critical and most frequently questioned assumption
How to test? (discussed in detail in Section 3)
- Pre-trend Test: Check if trends are parallel before policy
- Event Study Plot: Visualize dynamic effects
- Placebo Test: Use fake policy time points
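As a preview of the pre-trend test, here is a minimal sketch on simulated data in which parallel trends hold by construction (all names and parameter values are illustrative, not from a real study):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(0)
# Simulated pre-policy panel: both groups share the same time trend (slope 2)
# and differ only in level, so parallel trends hold by construction.
rows = []
for unit in range(200):
    treated = 1 if unit >= 100 else 0
    for t in range(10):  # 10 pre-policy periods
        y = 50 + 10 * treated + 2 * t + np.random.normal(0, 2)
        rows.append({'unit': unit, 'time': t, 'treated': treated, 'y': y})
pre = pd.DataFrame(rows)

# Pre-trend test: regress y on time, treated, and their interaction.
# Under parallel trends the interaction slope should be close to zero.
m = smf.ols('y ~ time * treated', data=pre).fit(cov_type='HC1')
print(f"Interaction slope: {m.params['time:treated']:.3f}  "
      f"(p = {m.pvalues['time:treated']:.3f})")
```

A small, statistically insignificant interaction slope is consistent with parallel pre-trends; a large or significant one is a warning sign for the DID design.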
Assumption 2: Exogeneity
Assumption: The timing and location of policy implementation are unrelated to potential outcomes (conditional independence).
Mathematical Expression:

$$\{Y_{it}(0), Y_{it}(1)\} \perp D_i \mid X_i$$

where $X_i$ denotes observed covariates.
Plain Language:
- Which regions/individuals are affected by the policy cannot be determined by future outcomes
- Example: If the government raises minimum wages only in regions where employment is about to rise, DID will overestimate the effect
Threats in Practice:
- Anticipation Effects: Firms know about the policy in advance and adjust beforehand
- Reverse Causality: Government selects policy targets based on outcome variables (endogeneity)
Assumption 3: Stable Unit Treatment Value Assumption (SUTVA)
Assumption: The treatment status of one unit does not affect the outcomes of other units (no spillover effects).
Threats:
- Geographic Spillover: Firms in policy regions relocate to non-policy regions
- General Equilibrium Effects: Policy changes market prices, wages, etc.
Classic Applications of DID
Case 1: Card & Krueger (1994) - Minimum Wage and Employment
Research Question: Does raising minimum wage reduce employment?
Background:
- In April 1992, New Jersey raised its minimum wage from $4.25 to $5.05
- Neighboring Pennsylvania did not raise its minimum wage
Research Design:
- Treatment Group: Fast-food restaurants in New Jersey
- Control Group: Fast-food restaurants in Pennsylvania
- Outcome Variable: Full-time equivalent employment (FTE)
Data:
- Pre-Policy: February 1992 (2 months before policy)
- Post-Policy: November 1992 (7 months after policy)
Findings (FTE employment per restaurant, Table 3 of the paper):

| | New Jersey (Treatment) | Pennsylvania (Control) |
|---|---|---|
| Before (Feb 1992) | 20.44 | 23.33 |
| After (Nov 1992) | 21.03 | 21.17 |
| Difference | +0.59 | −2.16 |

$$\hat{\tau}_{DID} = 0.59 - (-2.16) \approx 2.76$$
Conclusion: After the minimum wage increase, FTE employment per restaurant in New Jersey rose by about 2.76 relative to Pennsylvania. Employment did not fall!
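The headline number can be reproduced directly from the four group means published in the paper (Table 3, FTE employment, rounded to two decimals; the paper's 2.76 comes from unrounded data):

```python
# DID from the published group means (Card & Krueger 1994, Table 3)
nj_before, nj_after = 20.44, 21.03   # New Jersey (treatment group)
pa_before, pa_after = 23.33, 21.17   # Pennsylvania (control group)

did = (nj_after - nj_before) - (pa_after - pa_before)
print(f"DID estimate: {did:.2f}")   # prints 2.75 with these rounded means
```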
Impact:
- Challenged traditional economic theory
- Sparked long-term debate about minimum wages
- Became a classic case for the DID method
Case 2: Bertrand et al. (2004) - Statistical Inference Issues in DID
Problem Identified:
- Many DID studies underestimate standard errors
- Reason: Serial Correlation
Solutions:
- Clustered Standard Errors: Cluster at the policy unit level
- Block Bootstrap: Resampling methods
- Randomization Inference
Lesson: DID should focus not only on point estimates but more importantly on statistical inference!
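A minimal sketch of the clustering fix on simulated data with serially correlated errors (the panel structure, AR(1) coefficient, and effect sizes are all made up for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(1)
# Simulated state-by-year panel whose errors are serially correlated
# within state (AR(1)), the situation Bertrand et al. (2004) warn about.
rows = []
for state in range(40):
    treated = 1 if state >= 20 else 0
    e = 0.0
    for year in range(10):
        post = 1 if year >= 5 else 0
        e = 0.8 * e + np.random.normal(0, 1)  # serial correlation within state
        y = 10 + 2 * treated + 1 * post + 3 * treated * post + e
        rows.append({'state': state, 'year': year,
                     'treated': treated, 'post': post, 'y': y})
df = pd.DataFrame(rows)

# Same point estimate, different inference: naive iid standard errors vs.
# standard errors clustered at the policy-unit (state) level
naive = smf.ols('y ~ treated * post', data=df).fit()
clustered = smf.ols('y ~ treated * post', data=df).fit(
    cov_type='cluster', cov_kwds={'groups': df['state']})
print(f"Naive SE:     {naive.bse['treated:post']:.3f}")
print(f"Clustered SE: {clustered.bse['treated:post']:.3f}")
```

With persistent within-state shocks, the clustered standard error is typically noticeably larger than the naive one, which is exactly why naive inference over-rejects.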
Python Implementation: 2×2 DID Example
Simple Example: Simulated Data
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.iolib.summary2 import summary_col
# Set random seed
np.random.seed(123)
# Simulate data
n_units = 50 # Number of units per group
n_periods = 2 # Two periods
# Create dataframe
data = []
for unit in range(n_units * 2):
    treated = 1 if unit >= n_units else 0
    for period in range(n_periods):
        post = 1 if period == 1 else 0
        # Data generating process:
        # Y = 50 + 10*treated + 5*post + 15*(treated*post) + noise
        y = 50 + 10 * treated + 5 * post + 15 * (treated * post) + np.random.normal(0, 5)
        data.append({
            'unit': unit,
            'period': period,
            'treated': treated,
            'post': post,
            'y': y
        })
df = pd.DataFrame(data)
df['treated_post'] = df['treated'] * df['post']
print("=" * 60)
print("Data Preview")
print("=" * 60)
print(df.head(10))
print("\n")
# Calculate means
means = df.groupby(['treated', 'post'])['y'].mean().unstack()
print("=" * 60)
print("2×2 Mean Table")
print("=" * 60)
print(means)
print("\n")
# Manual DID calculation
treated_diff = means.loc[1, 1] - means.loc[1, 0]
control_diff = means.loc[0, 1] - means.loc[0, 0]
did_manual = treated_diff - control_diff
print("=" * 60)
print("Manual DID Calculation")
print("=" * 60)
print(f"Treatment Group Difference: {treated_diff:.3f}")
print(f"Control Group Difference: {control_diff:.3f}")
print(f"DID Effect: {did_manual:.3f}")
print("\n")
# Regression estimate of DID
model = smf.ols('y ~ treated + post + treated_post', data=df).fit(cov_type='HC1')
print("=" * 60)
print("Regression-Based DID")
print("=" * 60)
print(model.summary())
Output Interpretation:
- `treated` coefficient ≈ 10: Inherent difference between treatment and control groups
- `post` coefficient ≈ 5: Time trend (same for both groups)
- `treated_post` coefficient ≈ 15: DID Effect (Policy Causal Effect) ⭐
Module Structure
Section 1: Chapter Introduction (Current)
- Core intuition and ideas of DID
- Mathematical expression and regression form of 2×2 DID
- Core assumptions: parallel trends, exogeneity, SUTVA
- Classic application cases
- Basic Python implementation
Section 2: DID Fundamentals
- Identification logic and causal diagrams of DID
- Multi-period DID (Staggered DID)
- Control variables and covariate adjustment
- Standard error calculation (clustering, robustness)
- Panel data DID implementation
Section 3: Parallel Trends Assumption
- Pre-trend tests
- Event study design
- Dynamic effects plots
- Handling violations of parallel trends
Section 4: Placebo Tests
- Placebo tests
- Fake policy time point tests
- Sample exclusion tests
- Randomization inference
Section 5: Classic Cases and Python Implementation
- Complete replication of Card & Krueger (1994)
- DID evaluation of environmental policies
- Panel data handling techniques
- Visualization best practices
Section 6: Chapter Summary
- DID method summary
- Common pitfalls and considerations
- Exercises
- Literature recommendations
Python Toolkit
Core Libraries
| Library | Main Functions | Installation |
|---|---|---|
| pandas | Data manipulation | pip install pandas |
| statsmodels | OLS regression | pip install statsmodels |
| linearmodels | Panel DID | pip install linearmodels |
| matplotlib | Visualization | pip install matplotlib |
| seaborn | Advanced visualization | pip install seaborn |
Basic Setup
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.iolib.summary2 import summary_col
from linearmodels.panel import PanelOLS
# Font settings (macOS)
plt.rcParams['font.sans-serif'] = ['Arial Unicode MS']
plt.rcParams['axes.unicode_minus'] = False
# Set style
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)
pd.set_option('display.float_format', '{:.4f}'.format)
Essential References
Foundational Papers
Card, D., & Krueger, A. B. (1994). "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania." American Economic Review, 84(4), 772-793.
Bertrand, M., Duflo, E., & Mullainathan, S. (2004). "How Much Should We Trust Differences-In-Differences Estimates?" Quarterly Journal of Economics, 119(1), 249-275.
Callaway, B., & Sant'Anna, P. H. (2021). "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, 225(2), 200-230.
Recommended Textbooks
- Angrist & Pischke (2009). Mostly Harmless Econometrics, Chapter 5
- Cunningham (2021). Causal Inference: The Mixtape, Chapter 9
- Huntington-Klein (2022). The Effect, Chapter 18
Ready to Begin?
DID is one of the foundational methods of modern causal inference. Mastering it will enable you to:
- Evaluate real-world policy effects
- Publish research in top-tier journals
- Provide scientific evidence for policymaking
Remember the core idea:
"The beauty of DID is its simplicity: it differences out fixed differences and common trends, leaving only the treatment effect."
Let's begin with Section 2: DID Fundamentals!
Causal inference starts with DID!