11.1 Chapter Introduction (Regression Discontinuity Design)
Local randomization: when nature creates quasi-experiments for us
Learning Objectives
After completing this chapter, you will be able to:
- Understand the core ideas of RDD and the principle of local randomization
- Master the differences and applications of Sharp RDD and Fuzzy RDD
- Implement RDD validity tests (continuity assumption, density test, covariate balance)
- Conduct bandwidth selection and robustness analysis
- Use Python to implement RDD analysis (rdrobust, statsmodels)
- Replicate classic RDD studies (Angrist & Lavy 1999, Lee 2008, etc.)
Why is RDD the "Most Credible" Quasi-Experimental Method?
Starting with the Counterfactual Idea
Josh Angrist's Perspective:
"RDD is the most credible quasi-experimental design, because it mimics a randomized experiment in a local neighborhood of the cutoff."
In causal inference, we care most about the counterfactual question:
- Observed outcome: A student's GPA after receiving a scholarship
- Counterfactual: What would this student's GPA be if they didn't receive the scholarship?
Problem: We can never observe both states simultaneously! (The fundamental problem of causal inference)
RCT's solution:
- Randomly assign treatment, ensuring treatment and control groups are completely comparable
- Average outcome in treatment group - Average outcome in control group = Average Treatment Effect (ATE)
RDD's clever approach: When we cannot conduct randomized experiments, if there exists a cutoff rule, individuals near the cutoff are almost "random"!
The Core Intuition of RDD
Scenario: College Scholarships and Student Performance
Suppose a university has the following rule:
- College entrance exam score ≥ 600 → Receive scholarship
- College entrance exam score < 600 → No scholarship
Research question: Does the scholarship improve students' college GPA?
Intuition:
- A student with 599 points vs a student with 600 points
- These two students are almost identical (ability, family background, study habits, etc.)
- The only difference: one just crossed the cutoff and received a scholarship
- Therefore, the difference in their GPAs can be attributed to the causal effect of the scholarship!
Illustration: An Ideal RDD
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Set font for Chinese characters
plt.rcParams['font.sans-serif'] = ['SimHei', 'DejaVu Sans']
plt.rcParams['axes.unicode_minus'] = False
sns.set_style("whitegrid")
# Set random seed
np.random.seed(42)
# Generate running variable
x = np.linspace(-50, 50, 1000)
cutoff = 0
# Generate outcome variable
# Left of cutoff (untreated)
y_left = 60 + 0.5 * x[x < cutoff] + np.random.normal(0, 3, sum(x < cutoff))
# Right of cutoff (treated): jump of 10 points
y_right = 70 + 0.5 * x[x >= cutoff] + np.random.normal(0, 3, sum(x >= cutoff))
# Fit polynomials (for drawing smooth curves)
from numpy.polynomial import Polynomial
p_left = Polynomial.fit(x[x < cutoff], y_left, deg=2)
p_right = Polynomial.fit(x[x >= cutoff], y_right, deg=2)
# Plot
fig, ax = plt.subplots(figsize=(14, 8))
# Scatter plot
ax.scatter(x[x < cutoff], y_left, alpha=0.4, s=20, color='blue', label='No Scholarship')
ax.scatter(x[x >= cutoff], y_right, alpha=0.4, s=20, color='red', label='Scholarship')
# Fitted curves
x_left_smooth = np.linspace(x.min(), cutoff, 100)
x_right_smooth = np.linspace(cutoff, x.max(), 100)
ax.plot(x_left_smooth, p_left(x_left_smooth), color='blue', linewidth=3, label='Left Fitted Line')
ax.plot(x_right_smooth, p_right(x_right_smooth), color='red', linewidth=3, label='Right Fitted Line')
# Mark cutoff
ax.axvline(x=cutoff, color='green', linestyle='--', linewidth=2.5, alpha=0.8)
ax.text(cutoff + 2, 45, 'Cutoff', fontsize=14, color='green',
fontweight='bold', ha='left')
# Annotate RDD effect
y_left_at_cutoff = p_left(cutoff)
y_right_at_cutoff = p_right(cutoff)
rdd_effect = y_right_at_cutoff - y_left_at_cutoff
ax.annotate('', xy=(cutoff + 0.5, y_right_at_cutoff),
xytext=(cutoff + 0.5, y_left_at_cutoff),
arrowprops=dict(arrowstyle='<->', color='purple', lw=3.5))
ax.text(cutoff + 3, (y_left_at_cutoff + y_right_at_cutoff) / 2,
f'RDD Effect\nτ = {rdd_effect:.1f}',
fontsize=13, color='purple', fontweight='bold',
bbox=dict(boxstyle='round', facecolor='yellow', alpha=0.4))
# Legend and labels
ax.set_xlabel('Running Variable (Exam Score - 600)', fontsize=14, fontweight='bold')
ax.set_ylabel('Outcome Variable (College GPA)', fontsize=14, fontweight='bold')
ax.set_title('Core Logic of Regression Discontinuity Design (RDD)', fontsize=16, fontweight='bold', pad=20)
ax.legend(loc='upper left', fontsize=12)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('rdd_illustration.png', dpi=300, bbox_inches='tight')
plt.show()

Key observations:
- Left of cutoff: Outcome variable follows a smooth curve
- Right of cutoff: Outcome variable follows another smooth curve
- At the cutoff: A clear jump (discontinuity) appears
- RDD effect: The magnitude of the jump is the treatment effect!
Mathematical Expression of RDD
Potential Outcomes Framework
Notation:
- $X_i$: Running Variable (e.g., exam score)
- $c$: Cutoff (e.g., 600 points)
- $D_i \in \{0, 1\}$: Treatment Status
- $Y_i(0)$: Potential outcome (if untreated)
- $Y_i(1)$: Potential outcome (if treated)
- $Y_i$: Observed outcome
Sharp RDD: Treatment Completely Determined by Cutoff
Definition: In a Sharp RDD, treatment status is deterministically assigned by whether the running variable crosses the cutoff:

$$D_i = \mathbf{1}(X_i \geq c)$$

Key assumption: Continuity Assumption

Assume that the potential outcome functions are continuous at the cutoff:

$$\lim_{x \to c^-} E[Y_i(0) \mid X_i = x] = \lim_{x \to c^+} E[Y_i(0) \mid X_i = x]$$

(and similarly for $E[Y_i(1) \mid X_i = x]$).

In plain language: Without treatment, the outcome variable would be smooth at the cutoff (no jump).

Identification strategy:

Observed outcomes combine the two potential outcomes:

$$Y_i = D_i Y_i(1) + (1 - D_i) Y_i(0)$$

RDD Estimator:

$$\tau_{SRD} = \lim_{x \to c^+} E[Y_i \mid X_i = x] - \lim_{x \to c^-} E[Y_i \mid X_i = x]$$

Why is this a causal effect?

By the continuity assumption, the left limit identifies $E[Y_i(0) \mid X_i = c]$ and the right limit identifies $E[Y_i(1) \mid X_i = c]$, so:

$$\tau_{SRD} = E[Y_i(1) - Y_i(0) \mid X_i = c]$$

Important: RDD identifies the average treatment effect at the cutoff, not the overall ATE!
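To make the estimator concrete, here is a minimal sketch (simulated data and an arbitrary window width, not part of the chapter's main example) that approximates the two one-sided limits with sample means inside a narrow window. Note how the slope of the regression function leaks into this naive difference (by roughly slope × window width), which is exactly why practice moves to local linear regression:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-50, 50, 5000)                               # running variable
Y = 50 + 0.5 * X + 10 * (X >= 0) + rng.normal(0, 5, 5000)    # true jump = 10

h = 5  # narrow window around the cutoff (arbitrary choice here)
left = Y[(X < 0) & (X >= -h)]    # observations just below the cutoff
right = Y[(X >= 0) & (X <= h)]   # observations just above the cutoff

# Naive difference in local means: biased upward by about slope * h = 2.5
tau_naive = right.mean() - left.mean()
print(f"Naive local-means estimate: {tau_naive:.2f} (true effect = 10)")
```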
RDD vs RCT: The Perspective of Local Randomization
RDD as a "Local RCT"
Josh Angrist's perspective:
"RDD can be thought of as a local randomized experiment. Near the cutoff, treatment assignment is 'as-if random'."
Intuition:
- Far from cutoff: High-scoring and low-scoring students differ greatly (ability, family background, etc.)
- Close to cutoff: Students with 599 and 600 points are almost identical
- At the cutoff: Treatment assignment is almost random (who gets exactly 600 has a luck component)
Formalization:

Within a small neighborhood of the cutoff, assume treatment is as good as randomly assigned:

$$(Y_i(0), Y_i(1)) \perp D_i \quad \text{for } X_i \in [c - \epsilon, c + \epsilon]$$

This is similar to balance in an RCT: treatment and control groups are similar on all covariates.
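One way to probe local randomization is a covariate balance check in a narrow window: covariates determined before treatment should have similar means on both sides. A minimal sketch on simulated data (the covariate `age` and the window width are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
X = rng.uniform(-50, 50, 5000)                   # running variable
age = 20 + 0.02 * X + rng.normal(0, 2, 5000)     # covariate: smooth in X, no jump at 0

h = 5
below = age[(X < 0) & (X >= -h)]
above = age[(X >= 0) & (X <= h)]

t, p = stats.ttest_ind(above, below)
print(f"Mean age below: {below.mean():.2f}, above: {above.mean():.2f}")
print(f"t = {t:.2f}, p = {p:.3f}  (a large p-value is consistent with local balance)")
```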
RDD vs DID: When to Use Which Method?
| Feature | RDD | DID |
|---|---|---|
| Data requirement | Cross-section or single-period panel | Multi-period panel (at least 2 periods) |
| Identification source | Jump at cutoff | Double difference in time and group |
| Core assumption | Continuity assumption | Parallel trends assumption |
| External validity | Local effect (at cutoff) | May be broader |
| Internal validity | Very high (close to RCT) | Depends on parallel trends |
| Classic cases | Scholarships, elections | Minimum wage, environmental policy |
Rule of thumb:
- If there's a clear cutoff rule → Use RDD
- If there's spatial-temporal variation in policy → Use DID
- If you can conduct random assignment → Do an RCT directly!
Empirical Implementation of Sharp RDD
Linear Regression Approach
The simplest RDD estimation: fit two linear regression lines near the cutoff.
Model:

$$Y_i = \alpha + \tau D_i + \beta_1 (X_i - c) + \beta_2 D_i (X_i - c) + \epsilon_i$$

Parameter interpretation:
- $\tau$: RDD effect (jump at cutoff) ⭐
- $\beta_1$: Slope left of cutoff
- $\beta_2$: Additional slope right of cutoff (total slope to the right = $\beta_1 + \beta_2$)

Key: Center the running variable ($\tilde{X}_i = X_i - c$), so that $\tau$ measures the jump exactly at the cutoff.
Polynomial Approach
Allow the relationship between outcome and running variable to be nonlinear:

$$Y_i = \alpha + \tau D_i + \sum_{k=1}^{p} \beta_k (X_i - c)^k + \sum_{k=1}^{p} \gamma_k D_i (X_i - c)^k + \epsilon_i$$

Warning: High-order polynomials ($p \geq 3$) are prone to overfitting and can produce noisy, misleading jump estimates (Gelman & Imbens 2019)!
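A low-order polynomial fit is easy to express with a patsy formula, interacting the treatment dummy with the polynomial terms so each side of the cutoff gets its own curve. A minimal sketch on simulated data (the quadratic DGP is assumed for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
X = rng.uniform(-50, 50, 2000)
D = (X >= 0).astype(int)
# Mildly curved DGP with a jump of 10 at the cutoff
Y = 50 + 0.5 * X - 0.002 * X**2 + 10 * D + rng.normal(0, 5, 2000)
df = pd.DataFrame({'Y': Y, 'D': D, 'Xc': X})

# Quadratic on each side: D interacted with first- and second-order terms
m = smf.ols('Y ~ D * (Xc + I(Xc**2))', data=df).fit()
print(f"Quadratic RDD estimate: {m.params['D']:.2f} (true effect = 10)")
```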
Local Linear Regression
Modern best practice (Calonico, Cattaneo, Titiunik 2014):
- Choose a bandwidth $h$: use only observations with $|X_i - c| \leq h$
- Kernel weighting: observations closer to the cutoff receive higher weight (e.g., a triangular kernel)
- Fit a local linear regression within the bandwidth:

$$Y_i = \alpha + \tau D_i + \beta_1 (X_i - c) + \beta_2 D_i (X_i - c) + \epsilon_i, \qquad |X_i - c| \leq h$$

Advantages:
- Optimal bias-variance tradeoff
- Minimal functional form assumptions
- Modern packages (like rdrobust) implement this automatically (a hand-rolled sketch follows below)
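For intuition, local linear regression with a triangular kernel can be hand-rolled as weighted least squares. This sketch fixes the bandwidth by hand, whereas rdrobust chooses it optimally and adjusts the inference:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
X = rng.uniform(-50, 50, 3000)
D = (X >= 0).astype(int)
Y = 50 + 0.5 * X + 10 * D + rng.normal(0, 5, 3000)

h = 10                                    # bandwidth (fixed here; rdrobust selects it)
df = pd.DataFrame({'Y': Y, 'D': D, 'Xc': X})
local = df[df['Xc'].abs() <= h].copy()
local['w'] = 1 - local['Xc'].abs() / h    # triangular kernel: weight falls to 0 at |Xc| = h

m = smf.wls('Y ~ D * Xc', data=local, weights=local['w']).fit()
print(f"Local linear (triangular kernel) estimate: {m.params['D']:.2f} (true effect = 10)")
```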
Python Implementation: Simple Example
Simulate Sharp RDD Data
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.formula.api as smf
from scipy import stats
# Setup
np.random.seed(123)
n = 1000
cutoff = 0
# Generate running variable
X = np.random.uniform(-50, 50, n)
# Generate treatment status
D = (X >= cutoff).astype(int)
# Generate outcome variable
# True DGP: Y = 50 + 0.5*X + 10*D + noise
# This means treatment effect = 10
true_effect = 10
Y = 50 + 0.5 * X + true_effect * D + np.random.normal(0, 5, n)
# Create dataframe
df = pd.DataFrame({
'X': X,
'D': D,
'Y': Y,
'X_centered': X - cutoff
})
print("=" * 70)
print("Sharp RDD Simulated Data")
print("=" * 70)
print(f"Sample size: {n}")
print(f"Cutoff: {cutoff}")
print(f"True treatment effect: {true_effect}")
print(f"Treated units: {D.sum()} ({D.sum()/n*100:.1f}%)")
print("\nData preview:")
print(df.head(10))

Visualization: Scatter Plot + Fitted Lines
# Fit separately by group
df_left = df[df['D'] == 0]
df_right = df[df['D'] == 1]
# OLS fit (fit on plain arrays so predicting with plain arrays below
# doesn't trigger sklearn feature-name warnings)
from sklearn.linear_model import LinearRegression
lr_left = LinearRegression().fit(df_left[['X_centered']].values, df_left['Y'].values)
lr_right = LinearRegression().fit(df_right[['X_centered']].values, df_right['Y'].values)
# Predict
X_left_range = np.linspace(df_left['X_centered'].min(), 0, 100).reshape(-1, 1)
X_right_range = np.linspace(0, df_right['X_centered'].max(), 100).reshape(-1, 1)
Y_left_pred = lr_left.predict(X_left_range)
Y_right_pred = lr_right.predict(X_right_range)
# Plot
fig, ax = plt.subplots(figsize=(14, 8))
# Scatter plot (using binning to reduce visual clutter)
bins = 20
df['X_bin'] = pd.cut(df['X_centered'], bins=bins)
# observed=True: keep only bin-by-D combinations that actually occur
df_binned = df.groupby(['X_bin', 'D'], observed=True).agg({'Y': 'mean', 'X_centered': 'mean'}).reset_index()
df_binned_left = df_binned[df_binned['D'] == 0]
df_binned_right = df_binned[df_binned['D'] == 1]
ax.scatter(df_binned_left['X_centered'], df_binned_left['Y'],
s=100, alpha=0.6, color='blue', edgecolors='black', linewidths=1.5,
label='Untreated (binned means)')
ax.scatter(df_binned_right['X_centered'], df_binned_right['Y'],
s=100, alpha=0.6, color='red', edgecolors='black', linewidths=1.5,
label='Treated (binned means)')
# Fitted lines
ax.plot(X_left_range, Y_left_pred, color='blue', linewidth=3, label='Left Fitted Line')
ax.plot(X_right_range, Y_right_pred, color='red', linewidth=3, label='Right Fitted Line')
# Cutoff
ax.axvline(x=0, color='green', linestyle='--', linewidth=2.5, alpha=0.7)
# Annotate effect
y_left_at_cutoff = lr_left.predict([[0]])[0]
y_right_at_cutoff = lr_right.predict([[0]])[0]
estimated_effect = y_right_at_cutoff - y_left_at_cutoff
ax.annotate('', xy=(0.5, y_right_at_cutoff), xytext=(0.5, y_left_at_cutoff),
arrowprops=dict(arrowstyle='<->', color='purple', lw=3))
ax.text(1, (y_left_at_cutoff + y_right_at_cutoff) / 2,
f'Estimated Effect\n= {estimated_effect:.2f}',
fontsize=12, color='purple', fontweight='bold',
bbox=dict(boxstyle='round', facecolor='yellow', alpha=0.3))
ax.set_xlabel('X - Cutoff', fontsize=13, fontweight='bold')
ax.set_ylabel('Y', fontsize=13, fontweight='bold')
ax.set_title(f'Sharp RDD Example (True Effect = {true_effect})',
fontsize=15, fontweight='bold')
ax.legend(loc='upper left', fontsize=11)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

Regression Estimation
# Method 1: Full sample linear RDD
model1 = smf.ols('Y ~ D + X_centered + D:X_centered', data=df).fit()
print("\n" + "=" * 70)
print("Method 1: Full Sample Linear RDD")
print("=" * 70)
print(model1.summary().tables[1])
print(f"\nEstimated RDD effect: {model1.params['D']:.3f}")
print(f"Standard error: {model1.bse['D']:.3f}")
print(f"95% Confidence interval: [{model1.conf_int().loc['D', 0]:.3f}, {model1.conf_int().loc['D', 1]:.3f}]")
# Method 2: Bandwidth restriction (use only observations near cutoff)
bandwidth = 20
df_local = df[np.abs(df['X_centered']) <= bandwidth].copy()
model2 = smf.ols('Y ~ D + X_centered + D:X_centered', data=df_local).fit()
print("\n" + "=" * 70)
print(f"Method 2: Local Linear RDD (bandwidth = {bandwidth})")
print("=" * 70)
print(f"Observations used: {len(df_local)} / {len(df)} ({len(df_local)/len(df)*100:.1f}%)")
print(model2.summary().tables[1])
print(f"\nEstimated RDD effect: {model2.params['D']:.3f}")
print(f"Standard error: {model2.bse['D']:.3f}")
# Comparison
print("\n" + "=" * 70)
print("Effect Estimate Comparison")
print("=" * 70)
print(f"True effect: {true_effect:.3f}")
print(f"Full sample estimate: {model1.params['D']:.3f} (SE = {model1.bse['D']:.3f})")
print(f"Local estimate (h={bandwidth}): {model2.params['D']:.3f} (SE = {model2.bse['D']:.3f})")Output interpretation:
- Both methods should be close to the true effect of 10
- Local estimate typically has larger standard errors (smaller sample size)
- But local estimate has smaller bias (weaker functional form assumptions); a bandwidth-sensitivity check is sketched below
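A simple robustness habit, continuing with the simulated `df` and the `smf` import from above: re-estimate the effect over a grid of bandwidths (the grid here is arbitrary) and check that the estimate is stable:

```python
# Sensitivity check: re-estimate the RDD effect across several bandwidths
for h in [5, 10, 20, 30, 40, 50]:
    sub = df[df['X_centered'].abs() <= h]
    m = smf.ols('Y ~ D + X_centered + D:X_centered', data=sub).fit()
    lo, hi = m.conf_int().loc['D']
    print(f"h = {h:>2}: tau = {m.params['D']:6.3f}, 95% CI [{lo:6.3f}, {hi:6.3f}], n = {len(sub)}")
```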
Fuzzy RDD: Imperfect Cutoffs
What is Fuzzy RDD?
In reality, the cutoff rule may be imperfect:
- Sharp RDD: $P(D_i = 1 \mid X_i \geq c) = 1$ and $P(D_i = 1 \mid X_i < c) = 0$
- Fuzzy RDD: the treatment probability $P(D_i = 1 \mid X_i = x)$ jumps at the cutoff, but is not 0 or 1 on either side
Examples:
- College admission: Cutoff at 600, but special cases (sports, minority bonuses, etc.)
- Medicare: Auto-enrollment at age 65, but some purchase early
Fuzzy RDD Identification
Idea: Use the cutoff as an instrumental variable (IV)!
Two-stage regression:
First stage: Use the cutoff indicator $Z_i = \mathbf{1}(X_i \geq c)$ to predict treatment status:

$$D_i = \gamma_0 + \gamma_1 Z_i + \gamma_2 (X_i - c) + \gamma_3 Z_i (X_i - c) + v_i$$

Second stage: Use the predicted treatment to estimate the effect:

$$Y_i = \alpha + \tau \hat{D}_i + \beta_1 (X_i - c) + \beta_2 Z_i (X_i - c) + \epsilon_i$$

Fuzzy RDD estimator:

$$\tau_{FRD} = \frac{\lim_{x \to c^+} E[Y_i \mid X_i = x] - \lim_{x \to c^-} E[Y_i \mid X_i = x]}{\lim_{x \to c^+} E[D_i \mid X_i = x] - \lim_{x \to c^-} E[D_i \mid X_i = x]}$$
Interpretation:
- Numerator: Jump in outcome at cutoff (Reduced Form)
- Denominator: Jump in treatment at cutoff (First Stage)
- Ratio: Local average treatment effect (LATE)
Connection to IV: Fuzzy RDD is essentially IV estimation, with the cutoff indicator $Z_i = \mathbf{1}(X_i \geq c)$ serving as the instrument!
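The ratio form suggests a direct implementation: estimate the jump in the outcome (reduced form) and the jump in treatment (first stage) with local linear regressions, then divide. A minimal sketch on simulated data with imperfect compliance (the compliance probabilities and bandwidth are made up for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 5000
X = rng.uniform(-50, 50, n)
Z = (X >= 0).astype(int)                  # crossing the cutoff (the instrument)
# Imperfect compliance: crossing raises treatment probability from 0.2 to 0.8
D = (rng.uniform(size=n) < 0.2 + 0.6 * Z).astype(int)
Y = 50 + 0.5 * X + 10 * D + rng.normal(0, 5, n)
df = pd.DataFrame({'Y': Y, 'D': D, 'Z': Z, 'Xc': X})

h = 15
local = df[df['Xc'].abs() <= h]
reduced = smf.ols('Y ~ Z * Xc', data=local).fit().params['Z']   # jump in Y at cutoff
first = smf.ols('D ~ Z * Xc', data=local).fit().params['Z']     # jump in D at cutoff
print(f"Reduced form: {reduced:.2f}, First stage: {first:.2f}")
print(f"Fuzzy RDD (Wald) estimate: {reduced / first:.2f} (true effect = 10)")
```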
Preview of Classic RDD Applications
Case 1: Thistlethwaite & Campbell (1960) - Birth of RDD
Research question: Does receiving a National Merit Award affect students' later scholarship attainment?
Design:
- Cutoff: Threshold on national exam score
- Treatment: Receive merit award
- Outcome: Number of subsequent scholarships
Finding: Significant positive RDD effect
Historical significance: This was the first application of RDD (1960)!
Case 2: Angrist & Lavy (1999) - Class Size and Student Achievement
Research question: Does reducing class size improve student performance?
Background:
- Israel has a rule (Maimonides' Rule): Class size cannot exceed 40 students
- If school has 41 students → Must split into 2 classes (~20 each)
- If school has 40 students → 1 class (40 students)
Design:
- Running variable: Total enrollment in school
- Cutoffs: 40, 80, 120, ... (multiples of 40)
- Treatment: Class size (determined by rule)
- Outcome: Standardized test scores
Finding: Smaller classes significantly improve test scores for fourth and fifth graders; the implied gains from large class-size reductions are on the order of 0.1-0.2 standard deviations
Innovation: Classic application of Fuzzy RDD (rule not perfectly enforced)
Case 3: Lee (2008) - Electoral Advantage and Re-election
Research question: Does incumbency status confer re-election advantage?
Design:
- Cutoff: Vote share = 50%
- Treatment: Become incumbent
- Outcome: Victory (and vote share) in the next election
Key intuition:
- Candidate with 49.9% vs candidate with 50.1%
- These two are almost identical (political strength, funding, voter support, etc.)
- Only difference: One wins, one loses
Finding: A huge incumbency advantage: winning raises the probability of winning the next election by roughly 45 percentage points!
Core Assumptions of RDD
Assumption 1: Continuity Assumption ⭐
Assumption: At the cutoff, all factors except treatment status are continuous.
Mathematical expression: $E[Y_i(0) \mid X_i = x]$ and $E[Y_i(1) \mid X_i = x]$ are continuous in $x$ at $x = c$.

In plain language: Without treatment, the outcome variable would not jump at the cutoff.
How to test? (Section 3 discusses in detail)
- Covariate balance tests: Check if covariates (age, gender, etc.) are balanced on both sides of cutoff
- Density test (McCrary Test): Check if density of running variable is smooth at cutoff
- Placebo tests: Re-estimate the effect at false cutoffs, where no jump should appear (sketched below)
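As a preview of the placebo idea (Section 3 covers it fully), this sketch re-estimates the "effect" at fake cutoffs on simulated data, staying on one side of the true cutoff so the real jump cannot contaminate the test; the estimates should be near zero:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
X = rng.uniform(-50, 50, 3000)
Y = 50 + 0.5 * X + 10 * (X >= 0) + rng.normal(0, 5, 3000)
df = pd.DataFrame({'Y': Y, 'X': X})

# Fake cutoffs away from the true one (0); keep only one side of the real cutoff
for c in [-25, -10, 10, 25]:
    side = df[df['X'] < 0] if c < 0 else df[df['X'] >= 0]
    sub = side.assign(Xc=side['X'] - c, D=(side['X'] >= c).astype(int))
    m = smf.ols('Y ~ D * Xc', data=sub).fit()
    print(f"Placebo cutoff {c:>3}: tau = {m.params['D']:7.3f}, p = {m.pvalues['D']:.3f}")
```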
Assumption 2: No Precise Manipulation
Assumption: Individuals cannot precisely manipulate the running variable to just cross the cutoff.
Threats:
- Exam cheating: Students know 600 is cutoff, cheat to get exactly 600
- Election fraud: Candidates manipulate votes to get just over 50%
- Policy lobbying: Firms lobby government to stay just below regulatory threshold
How to test?
- McCrary density test: Check for abnormal bunching of the running variable at the cutoff (a crude count-based version is sketched below)
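The formal McCrary test compares local polynomial density estimates on each side of the cutoff; as a crude first look, one can simply compare observation counts in symmetric windows. A minimal sketch (window width arbitrary, data simulated without manipulation):

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.uniform(-50, 50, 5000)   # no manipulation: density is smooth through the cutoff

h = 5
n_below = np.sum((X >= -h) & (X < 0))
n_above = np.sum((X >= 0) & (X <= h))
print(f"Counts just below / above cutoff: {n_below} / {n_above} "
      f"(ratio {n_above / n_below:.2f})")
# A ratio far from 1 suggests bunching; the formal test compares estimated
# densities at the cutoff rather than raw counts.
```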
Assumption 3: Local Exclusion
Assumption: Near the cutoff, nothing other than treatment status changes discontinuously; the running variable's own smooth effect on the outcome is absorbed by the regression function.

Threat:
- If crossing the cutoff triggers other changes besides the scholarship (e.g., the same 600-point threshold grants honors placement, or crossing it boosts confidence), the RDD estimate bundles those effects together
Rule of thumb: Choose "exogenous" running variables (birth date, lottery number), and check that no other program uses the same cutoff
Chapter Structure
Section 1: Chapter Introduction (Current)
- Core ideas of RDD and counterfactual framework
- Sharp RDD vs Fuzzy RDD
- Comparison with RCT and DID
- Python basic implementation
Section 2: RDD Fundamentals and Identification
- Mathematical derivation of Sharp RDD
- Fuzzy RDD and instrumental variables
- Local average treatment effect (LATE)
- Linear vs nonparametric methods
Section 3: Continuity Assumption and Validity Tests
- Testing the continuity assumption
- Covariate balance tests
- Density test (McCrary Test)
- Placebo tests
Section 4: Bandwidth Selection and Robustness Tests
- Optimal bandwidth selection (IK, CCT)
- Sensitivity analysis
- Polynomial order selection
- Donut-hole RDD
Section 5: Classic Cases and Python Implementation
- Angrist & Lavy (1999) Class size
- Lee (2008) Electoral advantage
- Carpenter & Dobkin (2009) Minimum drinking age
- Best practices using rdrobust package
Section 6: Chapter Summary
- RDD methodology summary
- Common pitfalls and best practices
- Practice exercises
- Literature recommendations
Python Toolkit
Core Libraries
| Package | Main Functions | Installation |
|---|---|---|
| pandas | Data manipulation | pip install pandas |
| numpy | Numerical computation | pip install numpy |
| statsmodels | OLS regression | pip install statsmodels |
| rdrobust | RDD optimal bandwidth and robust inference | pip install rdrobust |
| rddtools | RDD toolkit | (Install from source) |
| matplotlib | Visualization | pip install matplotlib |
| seaborn | Advanced visualization | pip install seaborn |
Basic Setup
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
import statsmodels.formula.api as smf
from scipy import stats
# Font settings (choose based on OS)
plt.rcParams['font.sans-serif'] = ['SimHei', 'Arial Unicode MS', 'DejaVu Sans']
plt.rcParams['axes.unicode_minus'] = False
# Set style
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 7)
pd.set_option('display.float_format', '{:.4f}'.format)

rdrobust Package Installation
# Install from PyPI
pip install rdrobust
# Or using conda
conda install -c conda-forge rdrobust

Usage example:
from rdrobust import rdrobust, rdbwselect, rdplot
# Automatic bandwidth selection and robust inference
result = rdrobust(y=Y, x=X, c=cutoff)
print(result)
# Plot RDD
rdplot(y=Y, x=X, c=cutoff, nbins=20)

Essential Reading
Foundational Papers
Thistlethwaite, D. L., & Campbell, D. T. (1960). "Regression-discontinuity analysis: An alternative to the ex post facto experiment." Journal of Educational Psychology, 51(6), 309-317.
- Birth of RDD method
Hahn, J., Todd, P., & Van der Klaauw, W. (2001). "Identification and Estimation of Treatment Effects with a Regression-Discontinuity Design." Econometrica, 69(1), 201-209.
- Modern identification theory for RDD
Lee, D. S., & Lemieux, T. (2010). "Regression Discontinuity Designs in Economics." Journal of Economic Literature, 48(2), 281-355.
- Must-read review, the bible of RDD
Methodological Breakthroughs
Imbens, G., & Kalyanaraman, K. (2012). "Optimal Bandwidth Choice for the Regression Discontinuity Estimator." Review of Economic Studies, 79(3), 933-959.
- Optimal bandwidth selection (IK method)
Calonico, S., Cattaneo, M. D., & Titiunik, R. (2014). "Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs." Econometrica, 82(6), 2295-2326.
- Robust inference (CCT method)
Gelman, A., & Imbens, G. (2019). "Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs." Journal of Business & Economic Statistics, 37(3), 447-456.
- Warning: Don't use high-order polynomials!
Classic Applications
Angrist, J. D., & Lavy, V. (1999). "Using Maimonides' Rule to Estimate the Effect of Class Size on Scholastic Achievement." Quarterly Journal of Economics, 114(2), 533-575.
Lee, D. S. (2008). "Randomized Experiments from Non-random Selection in U.S. House Elections." Journal of Econometrics, 142(2), 675-697.
Recommended Textbooks
- Angrist & Pischke (2009). Mostly Harmless Econometrics, Chapter 6
- Cunningham (2021). Causal Inference: The Mixtape, Chapter 6
- Huntington-Klein (2022). The Effect, Chapter 20
Ready to Begin?
RDD is the quasi-experimental method closest to randomized experiments. Master it, and you'll be able to:
- Identify causal effects in the absence of randomized experiments
- Leverage policy rules and natural cutoffs for research
- Publish high-quality causal inference studies
Remember the core idea:
"In the neighborhood of the cutoff, RDD is as good as a randomized experiment. The discontinuity is your friend." — Joshua Angrist
Let's dive into Section 2: RDD Fundamentals and Identification!
Local randomization: a powerful tool for causal inference!