
11.5 Classic Cases and Python Implementation

"A picture is worth a thousand regressions."— David Lee, On RDD Graphical Evidence

Learning RDD best practices from classic papers


Section Overview

This section replicates three classic RDD studies in depth:

  1. Angrist & Lavy (1999): Class size and student achievement (Fuzzy RDD)
  2. Lee (2008): Electoral advantage and re-election (Sharp RDD)
  3. Carpenter & Dobkin (2009): Minimum drinking age and mortality (Sharp RDD)

Each case includes complete Python implementation, data simulation, and visualization.


Case 1: Angrist & Lavy (1999) - Class Size and Student Achievement

Research Background

Research question: Does reducing class size improve student performance?

Challenge:

  • Class size is endogenous (good schools may have smaller classes)
  • Simple OLS regression will be biased

Natural experiment: Maimonides' Rule

  • Israeli law stipulates: Class size cannot exceed 40 students
  • If school has 41 students → Must split into 2 classes (~20 each)
  • If school has 40 students → 1 class (40 students)
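
A quick way to see the rule's sawtooth pattern is to evaluate predicted class size at a few enrollment levels; this is a minimal sketch of the same function implemented in the full simulation below:

python
# Predicted class size under Maimonides' Rule: enrollment divided by the
# number of classes needed to keep every class at or below 40 students.
def predicted_size(enrollment):
    n_classes = (enrollment - 1) // 40 + 1
    return enrollment / n_classes

for e in [39, 40, 41, 80, 81, 120, 121]:
    print(f"enrollment={e:3d} -> predicted class size = {predicted_size(e):.2f}")
# enrollment=40 gives 40.00, but enrollment=41 gives 20.50:
# predicted class size drops sharply at each multiple of 40.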

Research Design (Fuzzy RDD)

Running variable: grade enrollment (total number of students in the grade)

Cutoffs: multiples of 40 (40, 80, 120, ...)

Treatment variable: actual class size

Instrumental variable: predicted class size under Maimonides' Rule

Formula: predicted class size = enrollment / (floor((enrollment - 1) / 40) + 1)

Why Fuzzy RDD?

  • The rule is not perfectly enforced (some schools do not comply)
  • Class size changes discontinuously at the cutoff, but compliance is imperfect, so treatment does not jump deterministically from one value to another
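
The fuzzy estimand is therefore the ratio of two jumps at the cutoff, a standard Wald-type expression (here X is enrollment, D is class size, and Y is the test score):

latex
\tau_{\text{Fuzzy}} =
\frac{\lim_{x \downarrow 40} \mathbb{E}[Y \mid X = x] - \lim_{x \uparrow 40} \mathbb{E}[Y \mid X = x]}
     {\lim_{x \downarrow 40} \mathbb{E}[D \mid X = x] - \lim_{x \uparrow 40} \mathbb{E}[D \mid X = x]}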

Python Implementation (Simulated Data)

python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.formula.api as smf
from scipy import stats

# Setup
np.random.seed(42)
sns.set_style("whitegrid")

# Generate data
n_schools = 500

# School grade enrollment (clustered around multiples of 40)
enrollment = []
for _ in range(n_schools):
    # Randomly choose a "target" enrollment (near multiples of 40)
    target = np.random.choice([40, 80, 120]) + np.random.randint(-15, 16)
    enrollment.append(max(20, target))

enrollment = np.array(enrollment)

# Maimonides' Rule: Predicted class size
def maimonides_rule(n_students):
    """Calculate predicted class size based on Maimonides' Rule"""
    n_classes = int((n_students - 1) / 40) + 1
    return n_students / n_classes

predicted_class_size = np.array([maimonides_rule(n) for n in enrollment])

# Actual class size (with noise, imperfect compliance)
# True class size = predicted + noise
actual_class_size = predicted_class_size + np.random.normal(0, 3, n_schools)
actual_class_size = np.clip(actual_class_size, 15, 45)  # Limit to reasonable range

# Generate test scores (causal effect of class size = -0.3)
# DGP: test_score = 70 + true_effect * class_size + noise
true_effect = -0.3  # each additional student lowers the score by 0.3 points
test_score = 70 + true_effect * actual_class_size + np.random.normal(0, 5, n_schools)

# Create dataframe
df_angrist = pd.DataFrame({
    'enrollment': enrollment,
    'predicted_class_size': predicted_class_size,
    'actual_class_size': actual_class_size,
    'test_score': test_score
})

print("=" * 70)
print("Angrist & Lavy (1999) Data Simulation")
print("=" * 70)
print(df_angrist.head(10))
print(f"\nTrue causal effect: {true_effect:.2f}")

Visualization: Maimonides' Rule

python
# Plot Maimonides' Rule
fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# Subplot 1: Predicted class size
ax1 = axes[0]
enrollment_range = np.arange(20, 160)
predicted_range = [maimonides_rule(n) for n in enrollment_range]

ax1.plot(enrollment_range, predicted_range, color='red', linewidth=2.5,
         label='Maimonides\' Rule (Predicted)')
ax1.scatter(df_angrist['enrollment'], df_angrist['predicted_class_size'],
            alpha=0.3, s=30, color='blue', label='Schools')

# Cutoffs
for cutoff in [40, 80, 120]:
    ax1.axvline(x=cutoff, color='green', linestyle='--', linewidth=1.5, alpha=0.6)

ax1.set_xlabel('Enrollment (Total Students)', fontsize=13, fontweight='bold')
ax1.set_ylabel('Predicted Class Size', fontsize=13, fontweight='bold')
ax1.set_title('Maimonides\' Rule', fontsize=15, fontweight='bold')
ax1.legend(fontsize=11)
ax1.grid(True, alpha=0.3)

# Subplot 2: Actual vs predicted class size
ax2 = axes[1]
ax2.scatter(df_angrist['predicted_class_size'], df_angrist['actual_class_size'],
            alpha=0.5, s=40, edgecolors='black', linewidth=0.5)
ax2.plot([15, 45], [15, 45], 'r--', linewidth=2, label='45° Line (Perfect Compliance)')

ax2.set_xlabel('Predicted Class Size', fontsize=13, fontweight='bold')
ax2.set_ylabel('Actual Class Size', fontsize=13, fontweight='bold')
ax2.set_title('First Stage: Compliance with Maimonides\' Rule',
              fontsize=15, fontweight='bold')
ax2.legend(fontsize=11)
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

Fuzzy RDD Estimation

python
from rdrobust import rdrobust

# Re-center around 40
df_angrist['enrollment_centered'] = df_angrist['enrollment'] - 40

# Fuzzy RDD (using predicted_class_size as instrument)
# First stage: predicted class size vs actual class size
print("\n" + "=" * 70)
print("First Stage")
print("=" * 70)

first_stage = rdrobust(
    y=df_angrist['actual_class_size'],
    x=df_angrist['enrollment'],
    c=40,
    fuzzy=None  # Sharp RDD for first stage
)
print(f"Jump in predicted class size at cutoff: {first_stage.coef[0]:.3f}")
print(f"F-statistic: {first_stage.z[0]**2:.2f}")

# Fuzzy RDD: using instrumental variable
print("\n" + "=" * 70)
print("Fuzzy RDD Estimation")
print("=" * 70)

# Note: rdrobust's fuzzy parameter accepts actual treatment variable
fuzzy_result = rdrobust(
    y=df_angrist['test_score'],
    x=df_angrist['enrollment'],
    c=40,
    fuzzy=df_angrist['actual_class_size']
)

print(f"Causal effect of class size on test scores: {fuzzy_result.coef[0]:.3f}")
print(f"p-value: {fuzzy_result.pval[0]:.4f}")
print(f"95% CI: [{fuzzy_result.ci[0][0]:.3f}, {fuzzy_result.ci[0][1]:.3f}]")
print(f"\nTrue effect: {true_effect:.3f}")

# Manual implementation: 2SLS
print("\n" + "=" * 70)
print("Manual Implementation: 2SLS")
print("=" * 70)

# Restrict sample near cutoff
df_local = df_angrist[np.abs(df_angrist['enrollment_centered']) <= 20].copy()
df_local['D'] = (df_local['enrollment'] >= 40).astype(int)

# First stage
fs_model = smf.ols('actual_class_size ~ D + enrollment_centered + D:enrollment_centered',
                   data=df_local).fit()
df_local['class_size_hat'] = fs_model.predict(df_local)

print("First stage regression:")
print(f"  Cutoff effect (F-stat): {fs_model.fvalue:.2f}")

# Second stage
ss_model = smf.ols('test_score ~ class_size_hat + enrollment_centered',
                   data=df_local).fit()

print("\nSecond stage regression:")
print(f"  Class size effect: {ss_model.params['class_size_hat']:.3f}")
print(f"  SE: {ss_model.bse['class_size_hat']:.3f}")

Angrist & Lavy's findings:

  • Smaller classes raise test scores: a reduction of roughly 10 students is associated with gains of about 0.2-0.3 standard deviations
  • Small-class teaching is beneficial, especially for disadvantaged students

Case 2: Lee (2008) - Electoral Advantage and Re-election

Research Background

Research question: Does incumbency status confer re-election advantage?

Challenge:

  • Incumbency status is endogenous (more capable candidates more likely to win)
  • Observational comparisons have selection bias

Natural experiment: Electoral vote share

  • Vote share ≥ 50% → Win (become incumbent)
  • Vote share < 50% → Lose

Research Design (Sharp RDD)

Running variable: vote margin at election t (vote share minus 50%)

Cutoff: 0 (a vote share of exactly 50%)

Treatment: winning election t (becoming the incumbent)

Outcome: vote share at election t+1

Key intuition:

  • Compare a candidate who received 49.9% of the vote with one who received 50.1%
  • The two are nearly identical (ability, funding, voter support)
  • The only difference: one wins (becomes the incumbent), the other loses
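
Because winning is a deterministic function of the vote margin, the estimand reduces to a single jump at the threshold (m denotes the vote margin at election t):

latex
\tau_{\text{Sharp}} =
\lim_{m \downarrow 0} \mathbb{E}[\text{VoteShare}_{t+1} \mid m]
- \lim_{m \uparrow 0} \mathbb{E}[\text{VoteShare}_{t+1} \mid m]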

Python Implementation (Simulated Data)

python
# Generate data
n_elections = 1000

# Current election vote share (concentrated around 50%)
vote_share_t = 50 + np.random.normal(0, 10, n_elections)

# Whether won election
won_election = (vote_share_t >= 50).astype(int)

# Next election vote share
# Incumbency advantage: winners gain 8 percentage points in next election
incumbency_advantage = 8
vote_share_t1 = (vote_share_t + incumbency_advantage * won_election +
                 np.random.normal(0, 8, n_elections))

# Create dataframe
df_lee = pd.DataFrame({
    'vote_share_t': vote_share_t,
    'vote_margin_t': vote_share_t - 50,  # Center
    'won_election': won_election,
    'vote_share_t1': vote_share_t1
})

# Restrict to reasonable range
df_lee = df_lee[(df_lee['vote_share_t'] >= 30) & (df_lee['vote_share_t'] <= 70)]

print("=" * 70)
print("Lee (2008) Data Simulation")
print("=" * 70)
print(df_lee.head(10))
print(f"\nTrue incumbency advantage: {incumbency_advantage:.1f} percentage points")

Visualization: Electoral RDD

python
# Manual RDD plot via binned scatter (rdrobust's rdplot offers a one-line alternative)
fig, ax = plt.subplots(figsize=(14, 8))

# Binning
# Rename the bin series to avoid a name clash with the aggregated column below
vote_bins = pd.cut(df_lee['vote_margin_t'], bins=30).rename('margin_bin')
df_binned = df_lee.groupby([vote_bins, 'won_election'], observed=True).agg({
    'vote_share_t1': 'mean',
    'vote_margin_t': 'mean'
}).reset_index()

df_binned_left = df_binned[df_binned['won_election'] == 0]
df_binned_right = df_binned[df_binned['won_election'] == 1]

# Scatter plot
ax.scatter(df_binned_left['vote_margin_t'], df_binned_left['vote_share_t1'],
           s=100, alpha=0.7, color='blue', edgecolors='black', linewidths=1.5,
           label='Lost election (t)')
ax.scatter(df_binned_right['vote_margin_t'], df_binned_right['vote_share_t1'],
           s=100, alpha=0.7, color='red', edgecolors='black', linewidths=1.5,
           label='Won election (t)')

# Fitted lines
df_local = df_lee[np.abs(df_lee['vote_margin_t']) <= 15]
df_left = df_local[df_local['won_election'] == 0]
df_right = df_local[df_local['won_election'] == 1]

from sklearn.linear_model import LinearRegression
if len(df_left) > 0:
    lr_left = LinearRegression().fit(
        df_left[['vote_margin_t']], df_left['vote_share_t1']
    )
    X_range_left = np.linspace(df_left['vote_margin_t'].min(), 0, 100).reshape(-1, 1)
    ax.plot(X_range_left, lr_left.predict(X_range_left),
            color='blue', linewidth=3)

if len(df_right) > 0:
    lr_right = LinearRegression().fit(
        df_right[['vote_margin_t']], df_right['vote_share_t1']
    )
    X_range_right = np.linspace(0, df_right['vote_margin_t'].max(), 100).reshape(-1, 1)
    ax.plot(X_range_right, lr_right.predict(X_range_right),
            color='red', linewidth=3)

# Cutoff
ax.axvline(x=0, color='green', linestyle='--', linewidth=2.5)

# Annotation (predict with a DataFrame to match the fitted feature names)
X0 = pd.DataFrame({'vote_margin_t': [0]})
y_left_at_cutoff = lr_left.predict(X0)[0]
y_right_at_cutoff = lr_right.predict(X0)[0]
jump = y_right_at_cutoff - y_left_at_cutoff

ax.annotate('', xy=(0.5, y_right_at_cutoff), xytext=(0.5, y_left_at_cutoff),
            arrowprops=dict(arrowstyle='<->', color='purple', lw=3))
ax.text(1, (y_left_at_cutoff + y_right_at_cutoff) / 2,
        f'Incumbency\nAdvantage\n{jump:.1f} pp',
        fontsize=12, color='purple', fontweight='bold',
        bbox=dict(boxstyle='round', facecolor='yellow', alpha=0.3))

ax.set_xlabel('Vote Margin at t (%)', fontsize=13, fontweight='bold')
ax.set_ylabel('Vote Share at t+1 (%)', fontsize=13, fontweight='bold')
ax.set_title('Lee (2008): Incumbency Advantage in U.S. House Elections',
             fontsize=15, fontweight='bold')
ax.legend(fontsize=11)
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

RDD Estimation

python
# Sharp RDD
print("\n" + "=" * 70)
print("Sharp RDD Estimation")
print("=" * 70)

lee_result = rdrobust(
    y=df_lee['vote_share_t1'],
    x=df_lee['vote_margin_t'],
    c=0
)

print(f"Incumbency advantage: {lee_result.coef[0]:.3f} percentage points")
print(f"Robust p-value: {lee_result.pval[0]:.4f}")
print(f"95% CI: [{lee_result.ci[0][0]:.3f}, {lee_result.ci[0][1]:.3f}]")
print(f"\nTrue advantage: {incumbency_advantage:.1f} percentage points")

# Validity test: density test
print("\n" + "=" * 70)
print("McCrary Density Test (Manipulation?)")
print("=" * 70)

try:
    from rddensity import rddensity
    density_test = rddensity(X=df_lee['vote_margin_t'], c=0)
    print(f"p-value: {density_test.pval[0]:.4f}")
    print(f"Conclusion: {'✓ No manipulation evidence' if density_test.pval[0] > 0.05 else '✗ Possible manipulation'}")
except ImportError:
    print("rddensity package not installed")

Lee's findings:

  • Large incumbency advantage: candidates who barely win are roughly 45 percentage points more likely to win the next election than those who barely lose
  • In terms of vote share, barely winning raises the next-election vote share by about 7-8 percentage points

Case 3: Carpenter & Dobkin (2009) - Minimum Drinking Age

Research Background

Research question: Does minimum legal drinking age (21) reduce youth mortality?

Natural experiment:

  • U.S. law: Drinking becomes legal at the 21st birthday
  • Individuals just before and just after their 21st birthday are nearly identical (except for legal access to alcohol)

Research Design (Sharp RDD)

Running variable: age relative to the 21st birthday

Cutoff: age 21 (the 21st birthday)

Treatment: legal drinking (alcohol becomes legally accessible at 21)

Outcomes:

  • Mortality rate (deaths per 100,000 people)
  • Categories: traffic fatalities, alcohol poisoning, total mortality

Python Implementation (Simulated Data)

python
# Generate data
n_days = 365 * 4  # Ages 19-23, 4 years total
ages = np.linspace(19, 23, n_days)

# Legal drinking
legal_drinking = (ages >= 21).astype(int)

# Mortality rate (per 100,000 people)
# Base mortality increases slightly with age
base_mortality = 80 + 2 * (ages - 19)

# Legal drinking causes mortality jump (effect = 10)
alcohol_effect = 10
mortality_rate = base_mortality + alcohol_effect * legal_drinking + np.random.normal(0, 5, n_days)

# Create dataframe
df_alcohol = pd.DataFrame({
    'age': ages,
    'age_centered': ages - 21,
    'legal_drinking': legal_drinking,
    'mortality_rate': mortality_rate
})

print("=" * 70)
print("Carpenter & Dobkin (2009) Data Simulation")
print("=" * 70)
print(df_alcohol.head(10))
print(f"\nTrue alcohol effect: {alcohol_effect:.1f} deaths/100,000 people")

Visualization

python
fig, ax = plt.subplots(figsize=(14, 8))

# Scatter plot (weekly averages)
df_alcohol['week'] = (df_alcohol['age'] * 52).astype(int)
df_weekly = df_alcohol.groupby('week').agg({
    'age_centered': 'mean',
    'mortality_rate': 'mean'
}).reset_index()

ax.scatter(df_weekly['age_centered'], df_weekly['mortality_rate'],
           s=30, alpha=0.6, color='gray', edgecolors='black', linewidths=0.5)

# Fitted lines
df_left = df_alcohol[df_alcohol['age'] < 21]
df_right = df_alcohol[df_alcohol['age'] >= 21]

lr_left = LinearRegression().fit(df_left[['age_centered']], df_left['mortality_rate'])
lr_right = LinearRegression().fit(df_right[['age_centered']], df_right['mortality_rate'])

X_range_left = np.linspace(-2, 0, 100).reshape(-1, 1)
X_range_right = np.linspace(0, 2, 100).reshape(-1, 1)

ax.plot(X_range_left, lr_left.predict(X_range_left),
        color='blue', linewidth=3, label='Age < 21 (Illegal)')
ax.plot(X_range_right, lr_right.predict(X_range_right),
        color='red', linewidth=3, label='Age ≥ 21 (Legal)')

# Cutoff
ax.axvline(x=0, color='green', linestyle='--', linewidth=2.5)
ax.text(0.05, df_weekly['mortality_rate'].min(), '21st Birthday',
        fontsize=12, color='green', fontweight='bold')

ax.set_xlabel('Age - 21 (years)', fontsize=13, fontweight='bold')
ax.set_ylabel('Mortality Rate (per 100,000)', fontsize=13, fontweight='bold')
ax.set_title('Carpenter & Dobkin (2009): Alcohol and Mortality',
             fontsize=15, fontweight='bold')
ax.legend(fontsize=11)
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

RDD Estimation

python
# Sharp RDD
print("\n" + "=" * 70)
print("Sharp RDD Estimation")
print("=" * 70)

alcohol_result = rdrobust(
    y=df_alcohol['mortality_rate'],
    x=df_alcohol['age_centered'],
    c=0
)

print(f"Effect of legal drinking on mortality: {alcohol_result.coef[0]:.3f} deaths/100,000 people")
print(f"p-value: {alcohol_result.pval[0]:.4f}")
print(f"95% CI: [{alcohol_result.ci[0][0]:.3f}, {alcohol_result.ci[0][1]:.3f}]")
print(f"\nTrue effect: {alcohol_effect:.1f}")

Carpenter & Dobkin's findings:

  • After 21st birthday, mortality jumps about 9%
  • Primarily driven by traffic accidents and alcohol poisoning
  • Minimum drinking age indeed reduces youth mortality

Key Takeaways

Lessons from Classic Cases

  1. Angrist & Lavy (1999):

    • Exemplar of Fuzzy RDD
    • Institutional rule (Maimonides' Rule) as IV
    • First stage must be strong (F > 10)
  2. Lee (2008):

    • Sharp RDD application in political economy
    • Testing for manipulation is essential (e.g., sorting or fraud around the 50% threshold)
    • Surprising findings on the size of the incumbency advantage
  3. Carpenter & Dobkin (2009):

    • RDD evaluation of public health policy
    • Very clear cutoff (legal age)
    • Advantage of high-frequency data

Best Practices Summary

  1. Clear cutoff rule: Seek exogenous, non-manipulable cutoffs
  2. Validity tests: Covariate balance, density test (see the balance-check sketch after this list)
  3. Robustness tests: Multiple bandwidths, polynomial orders
  4. Visualization: RDD plots are core of paper
  5. Causal interpretation: Clearly state LATE meaning
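
As promised in point 2, a covariate balance check treats a predetermined covariate as the outcome: since it is fixed before treatment, it should show no jump at the cutoff. A hedged sketch using a covariate simulated purely for illustration (it is not part of the original studies' data):

python
# Covariate balance sketch: a predetermined covariate should show no jump.
# prior_cov is simulated here only to illustrate the mechanics of the test.
prior_cov = (5 + 0.1 * df_lee['vote_margin_t'].values
             + np.random.normal(0, 1, len(df_lee)))
balance = rdrobust(y=prior_cov, x=df_lee['vote_margin_t'], c=0)
print(f"Jump in predetermined covariate: {balance.coef[0]:.3f} "
      f"(p = {balance.pval[0]:.4f})  -> expect small and insignificant")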

Section Summary

In this section, we replicated:

  • Angrist & Lavy (1999) Fuzzy RDD
  • Lee (2008) Electoral RDD
  • Carpenter & Dobkin (2009) Age cutoff RDD
  • Complete Python implementation and visualization

Key lesson:

"Classic studies are classic not only because questions are important, but because designs are clever and execution rigorous. Learning from these cases is the best way to master RDD."

Next step: In Section 11.6, we will summarize RDD methods and provide practice exercises and literature recommendations.


Learn from classic cases to make your research a classic too!
