9.5 Classic Cases and Python Implementation

From Real Research to Reproducible Code

Learning Objectives

Understand the classic application scenarios for DID
Replicate a complete DID workflow using synthetic data
Master code templates that transfer to real datasets

I. Overview of Classic Cases (Conceptual Summary)

Minimum Wage Policy: Takes employment or wages as the outcome and compares changes before and after the policy between affected and unaffected regions.
Environmental Regulation/Tax Policies: Examines impacts on production capacity, emissions, investment, etc.
Education/Healthcare Reforms: Examines impacts on student performance, health indicators, etc.

Reminder: In every case, return to the core question: "Is the parallel trends assumption reasonable?" (see 9.3).
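
As a minimal sketch of the comparison these cases share, the 2x2 DID estimate is the change in the treated group minus the change in the control group. The group means below are made-up numbers for illustration only and are not taken from any study.

```python
import pandas as pd

# Hypothetical group means of the outcome (illustrative values, not real data).
means = pd.DataFrame(
    {
        "group": ["treated", "treated", "control", "control"],
        "period": ["pre", "post", "pre", "post"],
        "y": [20.0, 27.0, 18.0, 21.0],
    }
)
table = means.pivot(index="group", columns="period", values="y")

# Before/after change within each group.
change_treated = table.loc["treated", "post"] - table.loc["treated", "pre"]
change_control = table.loc["control", "post"] - table.loc["control", "pre"]

# DID: the control group's change stands in for the treated group's counterfactual trend.
did_estimate = change_treated - change_control
print(did_estimate)  # 4.0
```

The subtraction only has a causal reading if the control group's change is a valid stand-in for what the treated group would have experienced without the policy, which is exactly the parallel trends question.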

II. Synthetic Data Demonstration: Complete DID from Scratch (Runnable)

python

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Parameters
n_units = 60      # Number of units
n_periods = 10    # Number of time periods
policy_time = 6   # Policy implementation period (starting from 1)

rng = np.random.default_rng(42)
ids = np.arange(n_units)
periods = np.arange(1, n_periods + 1)

data = []
for i in ids:
    treated = 1 if i >= n_units // 2 else 0
    unit_fe = rng.normal(0, 3)
    for t in periods:
        time_fe = 0.5 * t
        post = 1 if t >= policy_time else 0
        tau = 4.0  # True policy effect
        y = 20 + unit_fe + time_fe + tau * treated * post + rng.normal(0, 2)
        data.append({
            'id': i,
            'time': t,
            'treated': treated,
            'post': post,
            'y': y
        })

df = pd.DataFrame(data)

# DID regression (with entity/time fixed effects, clustered standard errors at entity level)
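# The main effects of treated and post are absorbed by C(id) and C(time),
# so only the interaction term is included to avoid perfect collinearity.
model = smf.ols('y ~ treated:post + C(id) + C(time)', data=df) \
            .fit(cov_type='cluster', cov_kwds={'groups': df['id']})
print(model.summary().tables[1])
print('\nDID estimate, ATT (treated:post) =', model.params['treated:post'])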
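
To see what the unit and time fixed effects in the regression above are doing, the following sketch checks a known equivalence: in a balanced panel where all treated units start treatment in the same period, the two-way fixed effects coefficient equals the simple 2x2 difference of group means. The panel here is generated separately with illustrative parameter values (all names are assumptions for this sketch), and it uses only numpy and pandas.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_units, n_periods, policy_time, tau = 40, 8, 5, 4.0  # illustrative values

# Balanced synthetic panel: every unit is observed in every period.
panel = pd.DataFrame(
    [(i, t) for i in range(n_units) for t in range(1, n_periods + 1)],
    columns=["id", "time"],
)
panel["treated"] = (panel["id"] >= n_units // 2).astype(int)
panel["post"] = (panel["time"] >= policy_time).astype(int)
panel["d"] = panel["treated"] * panel["post"]
unit_fe = rng.normal(0, 3, n_units)
panel["y"] = (
    20
    + unit_fe[panel["id"].to_numpy()]
    + 0.5 * panel["time"]
    + tau * panel["d"]
    + rng.normal(0, 2, len(panel))
)


def two_way_demean(s, frame):
    """Remove unit and time means and add back the grand mean (balanced panels only)."""
    return (
        s
        - s.groupby(frame["id"]).transform("mean")
        - s.groupby(frame["time"]).transform("mean")
        + s.mean()
    )


# TWFE coefficient: regress the demeaned outcome on the demeaned treatment dummy.
y_dm = two_way_demean(panel["y"], panel)
d_dm = two_way_demean(panel["d"], panel)
beta_twfe = float((d_dm * y_dm).sum() / (d_dm ** 2).sum())

# 2x2 DID from the four group-by-period means.
cell = panel.groupby(["treated", "post"])["y"].mean()
beta_means = float(
    (cell.loc[(1, 1)] - cell.loc[(1, 0)]) - (cell.loc[(0, 1)] - cell.loc[(0, 0)])
)

print(beta_twfe, beta_means)  # the two numbers agree up to floating-point error
```

The equivalence does not carry over to unbalanced panels or staggered adoption, which is one reason those settings need more care.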

Extensions:

Use an event study to check pre-trends and dynamic effects (see 9.3).
For multi-period or staggered treatment, two-way fixed effects DID can be biased when effects differ across cohorts or over time; consider recent estimators designed for this setting (e.g., Sun & Abraham, Callaway & Sant'Anna implementations).

III. Real Data Example (Operational Template)

Organize data into long format: one row = one unit × one period (with columns id, time, treated, post, y)
Run the pre-trend checks and event study from 9.3 to assess the identifying assumptions
Estimate the basic DID and check robustness (one-way or two-way clustering, controlling for trends)
Run the placebo tests from 9.4: fake treatment dates, fake control groups, leave-one-out, permutation tests, placebo outcomes
Report: main results + pre-trend/event study plots + placebo tests + interpretation
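
Two of the steps above, reshaping to long format and a permutation placebo, can be sketched as follows. This is a minimal illustration under stated assumptions: the wide-format input, its column names (`y_1` ... `y_6`), and the true effect of 3 are all hypothetical, and the placebo uses a simple difference of group means rather than the full regression.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical wide-format input: one row per unit, one outcome column per period.
n_units, n_periods, policy_time = 30, 6, 4
wide = pd.DataFrame({"id": np.arange(n_units)})
wide["treated"] = (wide["id"] < n_units // 2).astype(int)
for t in range(1, n_periods + 1):
    effect = 3.0 * wide["treated"] * (t >= policy_time)  # assumed true effect of 3
    wide[f"y_{t}"] = 10 + 0.5 * t + effect + rng.normal(0, 1, n_units)

# Step 1: reshape to long format (one row = one unit x one period).
long_df = pd.wide_to_long(wide, stubnames="y", i="id", j="time", sep="_").reset_index()
long_df["post"] = (long_df["time"] >= policy_time).astype(int)


def did_of_means(frame, treated_col="treated"):
    """2x2 DID from the four group-by-period means of y."""
    cell = frame.groupby([treated_col, "post"])["y"].mean()
    return float(
        (cell.loc[(1, 1)] - cell.loc[(1, 0)]) - (cell.loc[(0, 1)] - cell.loc[(0, 0)])
    )


observed = did_of_means(long_df)

# Step 4 (one variant): permutation placebo. Shuffle treatment labels across units,
# keeping each unit's label constant over time, and recompute the DID each time.
n_perm = 500
unit_labels = wide.set_index("id")["treated"]
placebo = np.empty(n_perm)
for b in range(n_perm):
    shuffled = pd.Series(rng.permutation(unit_labels.to_numpy()), index=unit_labels.index)
    long_df["fake_treated"] = long_df["id"].map(shuffled)
    placebo[b] = did_of_means(long_df, treated_col="fake_treated")

# Share of placebo estimates at least as extreme as the observed one.
p_value = float(np.mean(np.abs(placebo) >= abs(observed)))
print(observed, p_value)
```

Shuffling at the unit level rather than the row level preserves each unit's time series, which matches the clustering logic used in the main regression.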

Section Summary

Classic DID cases are numerous, but the identification logic always comes back to parallel trends.
Work through the complete workflow on reproducible synthetic data first, then transfer it to real data.
Think in templates: data structure, regression formula, clustering choice, visualization, and robustness checks.

Appendix: Event Study Plot (Consistent with 9.3 Style)

Using the df data constructed above, estimate and plot event study dynamic effects.

python

import matplotlib.pyplot as plt
from linearmodels.panel import PanelOLS

# Construct relative time (policy implementation period as 0, pre-policy negative, post-policy positive)
df['rel_time'] = df['time'] - policy_time

# Generate leads/lags (period -1 as baseline)
min_lead, max_lag = -(policy_time - 1), (n_periods - policy_time)
lead_lag_vars = []
for k in range(min_lead, max_lag + 1):
    if k == -1:
        continue  # Baseline period doesn't get dummy
    col = f'LL_{k}'
    # Dummy = 1 only for treated units at relative time k.
    # (Multiplying rel_time by treated would give control units a value of 0
    # in every period and wrongly switch on LL_0 for them.)
    df[col] = ((df['treated'] == 1) & (df['rel_time'] == k)).astype(int)
    lead_lag_vars.append(col)

# Panel indexing
panel = df.set_index(['id', 'time'])

# Regression (entity and time fixed effects)
model_es = PanelOLS(
    dependent=panel['y'],
    exog=panel[lead_lag_vars],
    entity_effects=True,
    time_effects=True
).fit(cov_type='clustered', cluster_entity=True)

print(model_es.summary)

# Extract coefficients and confidence intervals, construct series including baseline period (-1)
rows = []
for k in range(min_lead, max_lag + 1):
    if k == -1:
        rows.append({'rel_time': k, 'coef': 0.0, 'low': 0.0, 'high': 0.0})
    else:
        name = f'LL_{k}'
        coef = float(model_es.params[name])
        ci = model_es.conf_int().loc[name]  # columns: 'lower', 'upper'
        rows.append({'rel_time': k, 'coef': coef, 'low': float(ci['lower']), 'high': float(ci['upper'])})

es = pd.DataFrame(rows).sort_values('rel_time')

# Plotting
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(es['rel_time'], es['coef'], 'o-', color='navy', label='DID Coefficient')
ax.fill_between(es['rel_time'], es['low'], es['high'], color='navy', alpha=0.25, label='95% CI')
ax.axhline(0, color='black', linestyle='--', linewidth=1)
ax.axvline(0, color='red', linestyle='--', linewidth=1.5, label='Policy Implementation Period')
ax.set_xlabel('Relative Time (pre-policy negative, post-policy positive)')
ax.set_ylabel('Effect')
ax.set_title('Event Study: Dynamic Treatment Effects')
ax.legend()
ax.grid(alpha=0.3)
plt.tight_layout()
plt.show()

