2.4 Average Treatment Effects (ATE/ATT/LATE)
"The central role of the randomized experiment in statistical inference is beyond dispute."— Guido Imbens, 2021 Nobel Laureate in Economics
Understanding the Meaning and Estimation of Different Causal Effects
Section Objectives
- Distinguish between ATE, ATT, ATU, LATE, and other effects
- Understand intention-to-treat (ITT) analysis
- Master treatment effect heterogeneity (CATE)
- Learn methods for handling non-compliance
Types of Average Treatment Effects
Core Question
Why do we need to distinguish different average effects?
Because in reality:
- Not everyone receives their assigned treatment (non-compliance)
- Treatment effects may vary across individuals (heterogeneity)
- The population of interest may differ (full sample vs treated vs compliers)
1️⃣ ATE: Average Treatment Effect
Definition
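Average Treatment Effect (ATE): The average causal effect over the full population, $\text{ATE} = E[Y_i(1) - Y_i(0)]$, where $Y_i(1)$ and $Y_i(0)$ are individual $i$'s potential outcomes with and without treatment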
Meaning: If we randomly select a person to receive treatment, what is the average effect?
Case: Education Training
import pandas as pd
# Assume we have "God's perspective" data: both potential outcomes are visible
data = pd.DataFrame({
    'id': range(5),
    'Y0': [5000, 6000, 5500, 7000, 6500],  # Income without training
    'Y1': [6500, 7200, 6800, 8000, 7800],  # Income with training
})
data['tau'] = data['Y1'] - data['Y0']  # Individual causal effects: 1500, 1200, 1300, 1000, 1300
ATE = data['tau'].mean()
print(f"ATE = {ATE:.0f} USD")  # ATE = 1260 USD
Application Scenarios
- Universal policy rollout: If rolling out a policy nationwide, we care about ATE
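- Randomized experiments: With random assignment and full compliance, the difference in mean outcomes between the treatment and control groups is an unbiased estimate of the ATE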
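To see why random assignment is enough, the following is a minimal simulation sketch. The distributions and sample size are made-up assumptions, not values from this section; the point is only that a difference in observed means lands close to the true ATE when assignment is independent of the potential outcomes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical potential outcomes with heterogeneous individual effects
y0 = rng.normal(5000, 800, n)      # income without training
tau = rng.normal(1300, 400, n)     # individual causal effects
y1 = y0 + tau                      # income with training

# Random assignment: D is independent of (Y0, Y1) by construction
d = rng.binomial(1, 0.5, n)
y_obs = np.where(d == 1, y1, y0)   # only one potential outcome is observed

true_ate = tau.mean()
diff_in_means = y_obs[d == 1].mean() - y_obs[d == 0].mean()

print(f"True ATE:            {true_ate:.1f}")
print(f"Difference in means: {diff_in_means:.1f}")
```

The two numbers differ only by sampling noise. If `d` depended on `tau` or `y0`, the same comparison would no longer estimate the ATE.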
2️⃣ ATT: Average Effect on the Treated
Definition
Average Treatment Effect on the Treated (ATT): The average causal effect for those who actually receive treatment, $\text{ATT} = E[Y_i(1) - Y_i(0) \mid D_i = 1]$
Meaning: For those who actually receive treatment, what is the average effect?
Distinction from ATE
Continuing the example above:
# Assume only the first 3 people received treatment
data['D'] = [1, 1, 1, 0, 0]
# ATT: Calculate average effect only for treated group
ATT = data[data['D'] == 1]['tau'].mean()
print(f"ATT = {ATT:.0f} USD")  # ATT = 1333 USD
# ATE: Average effect for full sample
ATE = data['tau'].mean()
print(f"ATE = {ATE:.0f} USD")  # ATE = 1260 USD
When is ATT ≠ ATE?
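Condition: Treatment effects are heterogeneous and the people who take up treatment differ systematically in their effects (with random assignment and full compliance, ATT = ATE even when effects are heterogeneous)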
Case: Job training
- People who voluntarily attend training may be more motivated
- Their training effects may be larger
- In that case ATT > ATE
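The job-training intuition can be checked with a small simulation. This is a sketch under assumed numbers: `motivation` is a hypothetical standardized trait, and both the effect size and the enrollment rule are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

motivation = rng.normal(0, 1, n)      # hypothetical standardized motivation
tau = 1000 + 300 * motivation         # assumed: effect grows with motivation

# Self-selection: more motivated people are more likely to enroll
p_enroll = 1 / (1 + np.exp(-motivation))
d = rng.binomial(1, p_enroll)

ate = tau.mean()
att = tau[d == 1].mean()
atu = tau[d == 0].mean()

print(f"ATE = {ate:.0f}, ATT = {att:.0f}, ATU = {atu:.0f}")
```

Because enrollment and the size of the effect both rise with motivation, the treated group has above-average effects and the ordering ATT > ATE > ATU appears. Replacing `p_enroll` with a constant removes the gap.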
Application Scenarios
- Program evaluation: Assess how much benefit participants gained
- Volunteer programs: Participants self-select, we care about their benefits
3️⃣ ATU: Average Effect on the Untreated
Definition
Average Treatment Effect on the Untreated (ATU): The average counterfactual effect for those who did not receive treatment, $\text{ATU} = E[Y_i(1) - Y_i(0) \mid D_i = 0]$
Meaning: If we let those who didn't receive treatment receive it, what would be the average effect?
Case
# ATU: Calculate average effect only for control group (counterfactual)
ATU = data[data['D'] == 0]['tau'].mean()
print(f"ATU = {ATU:.0f} USD")  # ATU = 1150 USD
Policy Implications
- Program expansion: If expanding program coverage, new participants' effect may be ATU
- Marginal effect: Expanding from current participants to non-participants, marginal benefit is ATU
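When weighing program expansion, it helps to remember that the ATE is a share-weighted average of ATT and ATU. The sketch below checks this on the five-person training example; the data frame is rebuilt here (under the name `toy`, chosen for this sketch) so the block runs on its own.

```python
import pandas as pd

# Five-person "God's perspective" data from the training example
toy = pd.DataFrame({
    'Y0': [5000, 6000, 5500, 7000, 6500],
    'Y1': [6500, 7200, 6800, 8000, 7800],
    'D':  [1, 1, 1, 0, 0],
})
toy['tau'] = toy['Y1'] - toy['Y0']

share_treated = toy['D'].mean()
att = toy.loc[toy['D'] == 1, 'tau'].mean()
atu = toy.loc[toy['D'] == 0, 'tau'].mean()
ate = toy['tau'].mean()

# ATE = P(D=1) * ATT + P(D=0) * ATU
weighted = share_treated * att + (1 - share_treated) * atu

print(f"ATT = {att:.2f}, ATU = {atu:.2f}, ATE = {ate:.2f}")
print(f"Share-weighted average = {weighted:.2f}")
```

Here ATU is below ATT, so extending the program to current non-participants would bring a smaller average gain than existing participants received.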
4️⃣ LATE: Local Average Treatment Effect
Problem: Non-compliance
Perfect RCT Assumption:
- Assigned to treatment group → 100% receive treatment
- Assigned to control group → 0% receive treatment
Reality:
- One-sided non-compliance
- Assigned to treatment, some don't receive it
- Assigned to control, none receive it
- Two-sided non-compliance
- Assigned to treatment, some don't receive it
- Assigned to control, some obtain treatment
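In practice, the pattern can be read off by cross-checking assignment against actual receipt. The helper below is a hypothetical illustration (the function name and labels are not from any library) that flags which arms contain non-compliance.

```python
import pandas as pd

def noncompliance_pattern(z, d):
    """Hypothetical helper: label the non-compliance pattern
    from assignment z and actual treatment receipt d (both 0/1)."""
    z = pd.Series(z).astype(int).reset_index(drop=True)
    d = pd.Series(d).astype(int).reset_index(drop=True)
    assigned_not_treated = bool(((z == 1) & (d == 0)).any())
    control_treated = bool(((z == 0) & (d == 1)).any())
    if assigned_not_treated and control_treated:
        return "two-sided"
    if assigned_not_treated or control_treated:
        return "one-sided"
    return "full compliance"

z = [1, 1, 1, 0, 0, 0]
print(noncompliance_pattern(z, [1, 1, 1, 0, 0, 0]))  # full compliance
print(noncompliance_pattern(z, [1, 0, 1, 0, 0, 0]))  # one-sided
print(noncompliance_pattern(z, [1, 0, 1, 0, 1, 0]))  # two-sided
```

The sketch also labels the rarer case where only the control arm deviates as one-sided. In real data the distinction matters because one-sided non-compliance rules out always-takers.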
Case: Drug Clinical Trial
# RCT data
rct_data = pd.DataFrame({
'id': range(10),
'Z': [1, 1, 1, 1, 1, 0, 0, 0, 0, 0], # Z = random assignment
'D': [1, 1, 0, 1, 1, 0, 1, 0, 0, 0], # D = actually took drug
'Y': [120, 115, 95, 125, 118, 90, 105, 88, 92, 85] # Blood pressure
})
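print(rct_data)
Observations:
- ID 2: Assigned to treatment (Z=1) but didn't take the drug (D=0) → non-compliance in the treatment arm (a never-taker, if there are no defiers)
- ID 6: Assigned to control (Z=0) but took the drug (D=1) → non-compliance in the control arm (an always-taker, if there are no defiers), so non-compliance here is two-sided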
Four Population Types
Based on potential treatment status under each assignment, $D_i(Z=1)$ and $D_i(Z=0)$, the population can be divided into four types:
| Type | $D_i(Z=1)$ | $D_i(Z=0)$ | Description |
|---|---|---|---|
| Compliers | 1 | 0 | Follow assignment |
| Always-takers | 1 | 1 | Always take treatment |
| Never-takers | 0 | 0 | Never take treatment |
| Defiers | 0 | 1 | Do opposite of assignment |
Assumption: Monotonicity — No defiers
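An individual's type is never observed directly, because only one of the two potential treatment statuses is seen. Under random assignment and monotonicity, though, the population shares of each type can be recovered from take-up rates in the two arms. A minimal sketch using the Z and D columns of the drug-trial data (rebuilt here so the block is self-contained):

```python
import pandas as pd

# Assignment (Z) and actual treatment (D) from the drug-trial example
rct = pd.DataFrame({
    'Z': [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
    'D': [1, 1, 0, 1, 1, 0, 1, 0, 0, 0],
})

p_d1_z1 = rct.loc[rct['Z'] == 1, 'D'].mean()  # take-up when assigned to treatment
p_d1_z0 = rct.loc[rct['Z'] == 0, 'D'].mean()  # take-up when assigned to control

# With no defiers:
share_always = p_d1_z0               # treated even when assigned to control
share_never = 1 - p_d1_z1            # untreated even when assigned to treatment
share_complier = p_d1_z1 - p_d1_z0   # the remainder

print(f"Always-takers: {share_always:.2f}")
print(f"Never-takers:  {share_never:.2f}")
print(f"Compliers:     {share_complier:.2f}")
```

With ten observations these shares are very noisy; the point is the identification logic, not the numbers.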
LATE Definition
Local Average Treatment Effect (LATE): The average causal effect for compliers, $\text{LATE} = E[Y_i(1) - Y_i(0) \mid D_i(Z=1) > D_i(Z=0)]$
Meaning: For those who respond to random assignment, what is the average effect?
LATE Estimation: Instrumental Variables (IV)
Key Insight: Random assignment can serve as an instrument for actual treatment
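IV Estimator (Wald estimator):
$$\widehat{\text{LATE}} = \frac{E[Y \mid Z=1] - E[Y \mid Z=0]}{E[D \mid Z=1] - E[D \mid Z=0]}$$
Numerator: Intention-to-treat effect on the outcome (ITT). Denominator: Effect of assignment on treatment take-up (compliance rate).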
# Calculate LATE
ITT_Y = (rct_data[rct_data['Z'] == 1]['Y'].mean() -
rct_data[rct_data['Z'] == 0]['Y'].mean())
ITT_D = (rct_data[rct_data['Z'] == 1]['D'].mean() -
rct_data[rct_data['Z'] == 0]['D'].mean())
LATE = ITT_Y / ITT_D
print(f"ITT (Y): {ITT_Y:.2f}")  # ITT (Y): 22.60
print(f"ITT (D): {ITT_D:.2f}")  # ITT (D): 0.60
print(f"LATE: {LATE:.2f}")      # LATE: 37.67
Python Implementation: 2SLS (Two-Stage Least Squares)
import statsmodels.api as sm
from statsmodels.sandbox.regression.gmm import IV2SLS
# First stage: D ~ Z
X1 = sm.add_constant(rct_data['Z'])
first_stage = sm.OLS(rct_data['D'], X1).fit()
print("First stage F-statistic:", first_stage.fvalue)
# 2SLS: regress Y on D, instrumenting D with the random assignment Z.
# In this API, `exog` holds the regressors (constant + endogenous D)
# and `instrument` holds the instruments (constant + Z).
iv_model = IV2SLS(
    endog=rct_data['Y'],
    exog=sm.add_constant(rct_data['D']),
    instrument=sm.add_constant(rct_data['Z'])
).fit()
print(iv_model.summary())
5️⃣ ITT: Intention-to-Treat Analysis
Definition
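Intention-to-Treat (ITT): The effect of being assigned to treatment, comparing groups by random assignment rather than by actual treatment received, $\text{ITT} = E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]$
Meaning: The average difference in outcomes between those assigned to treatment (regardless of actual receipt) and those assigned to control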
Importance of ITT
| Advantage | Explanation |
|---|---|
| Preserves randomization | Groups are defined by random assignment $Z$, so there is no selection bias |
| Policy-relevant | Reflects effect of "offering opportunity" (not "forced participation") |
| Conservative estimate | Non-compliers dilute the effect: under the IV assumptions, ITT is no larger in magnitude than LATE |
ITT vs LATE
Derivation:
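A sketch of the standard argument, assuming random assignment of $Z$, the exclusion restriction (assignment affects the outcome only through treatment received), and monotonicity (no defiers). The symbols $\pi_c$, $\pi_a$, $\pi_n$ for the population shares of compliers, always-takers, and never-takers are notation introduced here:

```latex
\begin{aligned}
\text{ITT}_Y &= E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0] \\
             &= E\big[\,Y_i(D_i(1)) - Y_i(D_i(0))\,\big] \\
             &= \pi_c \, E[Y_i(1) - Y_i(0) \mid \text{complier}]
                + \pi_a \cdot 0 + \pi_n \cdot 0 \\
             &= \pi_c \cdot \text{LATE}, \\[4pt]
\text{ITT}_D &= E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0] = \pi_c
\quad\Longrightarrow\quad
\text{LATE} = \frac{\text{ITT}_Y}{\text{ITT}_D}.
\end{aligned}
```

Always-takers and never-takers contribute zero because their treatment status does not change with assignment, so their outcomes do not change either. The whole ITT therefore comes from compliers, scaled down by their share.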
Case: Drug Trial
# Compliance rate = 80%
compliance_rate = 0.8
# True LATE = 25 (effect on compliers)
true_LATE = 25
# ITT = LATE × compliance rate
ITT = true_LATE * compliance_rate
print(f"ITT: {ITT:.2f}") # ITT = 20
# Interpretation: Offering the drug (regardless of whether it is taken) reduces blood pressure by 20 points on average
# For compliers, who take the drug only because it was offered, the effect is 25 points
Treatment Effect Heterogeneity (CATE)
Definition
Conditional Average Treatment Effect (CATE): The average treatment effect conditional on covariates, $\text{CATE}(x) = E[Y_i(1) - Y_i(0) \mid X_i = x]$
Meaning: Treatment effects may differ across subgroups
Case: Education Training Heterogeneity
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Generate data
np.random.seed(42)
n = 1000
data = pd.DataFrame({
'baseline_skill': np.random.normal(50, 15, n),
'treatment': np.random.binomial(1, 0.5, n)
})
# Heterogeneous effects: Higher baseline → larger effect
data['tau'] = 5 + 0.3 * data['baseline_skill'] # Individual effects
data['Y0'] = 50 + 0.8 * data['baseline_skill'] + np.random.normal(0, 10, n)
data['Y1'] = data['Y0'] + data['tau']
data['Y_obs'] = np.where(data['treatment'] == 1, data['Y1'], data['Y0'])
# CATE estimation: difference in means within subgroups
# Divide baseline skill into three equal-width bins
data['skill_group'] = pd.cut(data['baseline_skill'], bins=3,
labels=['Low', 'Medium', 'High'])
cate_results = []
for group in ['Low', 'Medium', 'High']:
    group_data = data[data['skill_group'] == group]
    ATE_group = (group_data[group_data['treatment'] == 1]['Y_obs'].mean() -
                 group_data[group_data['treatment'] == 0]['Y_obs'].mean())
    cate_results.append({'Group': group, 'CATE': ATE_group})
cate_df = pd.DataFrame(cate_results)
print(cate_df)
# Visualization
plt.figure(figsize=(8, 6))
plt.bar(cate_df['Group'], cate_df['CATE'], color=['#3498db', '#2ecc71', '#e74c3c'])
plt.xlabel('Baseline Skill Group')
plt.ylabel('CATE (Treatment Effect)')
plt.title('Treatment Effect Heterogeneity: Higher Baseline → Larger Effect')
plt.grid(axis='y', alpha=0.3)
plt.show()
Machine Learning CATE Estimation
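Before reaching for a forest, a parametric baseline can be a useful comparison point: an OLS regression with a treatment-by-covariate interaction. The sketch below regenerates data with the same assumed effect function as the grouped example, $\tau(x) = 5 + 0.3x$, under its own names (`df`, `skill`, `treat`) so it runs independently. It assumes the CATE is linear in the covariate, which real data need not satisfy.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 5000

df = pd.DataFrame({
    'skill': rng.normal(50, 15, n),
    'treat': rng.binomial(1, 0.5, n),   # randomized treatment
})
# Outcome: baseline depends on skill; treatment adds tau(x) = 5 + 0.3 * skill
df['y'] = (50 + 0.8 * df['skill'] + rng.normal(0, 10, n)
           + df['treat'] * (5 + 0.3 * df['skill']))

# 'treat * skill' expands to treat + skill + treat:skill
fit = smf.ols('y ~ treat * skill', data=df).fit(cov_type='HC3')
b_treat = fit.params['treat']
b_inter = fit.params['treat:skill']

def cate_hat(x):
    """Estimated CATE at covariate value x under the linear-interaction model."""
    return b_treat + b_inter * x

for x in (30, 50, 70):
    print(f"skill = {x}: estimated CATE = {cate_hat(x):.2f}, true = {5 + 0.3 * x:.2f}")
```

If the forest-based estimates and this linear baseline disagree sharply, that is a prompt to check for nonlinearity or overfitting rather than evidence for either one.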
Causal Forest:
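# Using the EconML library (developed by Microsoft)
from econml.dml import CausalForestDML
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
# Train causal forest; the treatment is binary, so it is modeled with a classifier
causal_forest = CausalForestDML(
    model_y=RandomForestRegressor(),
    model_t=RandomForestClassifier(),
    discrete_treatment=True,
    n_estimators=1000
)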
X = data[['baseline_skill']]
causal_forest.fit(Y=data['Y_obs'], T=data['treatment'], X=X)
# Predict CATE for each individual
data['cate_pred'] = causal_forest.effect(X)
# Visualization
plt.figure(figsize=(10, 6))
plt.scatter(data['baseline_skill'], data['cate_pred'], alpha=0.3, s=10)
plt.xlabel('Baseline Skill')
plt.ylabel('Predicted CATE')
plt.title('Individual Treatment Effects Estimated by Causal Forest')
plt.show()
Relationships Between Different Effects
Mathematical Relationships
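The main identities linking these effects can be summarized as follows. This is a sketch of standard results: the first two lines hold by the law of iterated expectations, while the last two rely on the IV assumptions (random assignment, exclusion restriction, monotonicity). $\pi_c$ denotes the share of compliers, a symbol introduced here.

```latex
\begin{aligned}
\text{ATE}  &= P(D=1)\,\text{ATT} + P(D=0)\,\text{ATU} \\
\text{ATE}  &= E_X\big[\text{CATE}(X)\big] \\
\text{ITT}  &= \pi_c \cdot \text{LATE},
              \qquad \pi_c = E[D \mid Z=1] - E[D \mid Z=0] \\
\text{LATE} &= \text{ATT}
              \quad \text{(one-sided non-compliance: no one in the control arm is treated)}
\end{aligned}
```

Since $0 < \pi_c \le 1$, the third line implies that ITT is never larger in magnitude than LATE. No comparable ordering between LATE and ATE holds in general.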
Visualization
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(figsize=(10, 6))
# Illustrative (hypothetical) effect values, chosen only to show a typical ordering
effects = {
'ATE': 10,
'ATT': 12,
'ATU': 8,
'LATE': 15,
'ITT': 9
}
colors = ['#3498db', '#2ecc71', '#e74c3c', '#f39c12', '#9b59b6']
bars = ax.barh(list(effects.keys()), list(effects.values()), color=colors)
# Add value labels
for i, (k, v) in enumerate(effects.items()):
    ax.text(v + 0.5, i, f'{v}', va='center', fontweight='bold')
ax.set_xlabel('Treatment Effect Size')
ax.set_title('Comparison of Different Average Effects')
ax.set_xlim(0, 17)
ax.grid(axis='x', alpha=0.3)
# Add annotations
ax.annotate('LATE > ATT: Compliers have larger effects',
xy=(15, 3), xytext=(13, 4.5),
arrowprops=dict(arrowstyle='->', color='gray'))
ax.annotate('ITT < LATE: Non-compliance causes dilution',
xy=(9, 4), xytext=(11, 3.5),
arrowprops=dict(arrowstyle='->', color='gray'))
plt.tight_layout()
plt.show()
Complete Python Practice: Full Case Study
Case: Online Course RCT (with Non-compliance)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from linearmodels.iv import IV2SLS
# ==== 1. Generate Data ====
np.random.seed(42)
n = 1000
# Individual characteristics
data = pd.DataFrame({
'id': range(n),
'motivation': np.random.normal(50, 15, n), # Learning motivation
'baseline_score': np.random.normal(60, 10, n) # Baseline score
})
# Random assignment (RCT)
data['Z'] = np.random.binomial(1, 0.5, n)
# Actual participation (non-compliance)
# High-motivation individuals more likely to comply
prob_comply = 1 / (1 + np.exp(-(data['motivation'] - 50) / 10))
data['comply'] = np.random.binomial(1, prob_comply)
# Actual treatment received
data['D'] = data['Z'] * data['comply']
# Potential outcomes
# Effect heterogeneity: Higher motivation → larger effect
data['tau'] = 5 + 0.2 * data['motivation']
data['Y0'] = 60 + 0.3 * data['motivation'] + 0.5 * data['baseline_score']
data['Y1'] = data['Y0'] + data['tau'] + np.random.normal(0, 5, n)
data['Y_obs'] = np.where(data['D'] == 1, data['Y1'], data['Y0'])
# ==== 2. Estimate Different Effects ====
print("=" * 60)
print("Estimation of Different Average Effects")
print("=" * 60)
# (1) True ATE (God's perspective)
true_ATE = data['tau'].mean()
print(f"\nTrue ATE: {true_ATE:.2f}")
# (2) Simple comparison (biased! due to self-selection into treatment)
naive = (data[data['D'] == 1]['Y_obs'].mean() -
data[data['D'] == 0]['Y_obs'].mean())
print(f"Simple comparison: {naive:.2f} (biased!)")
# (3) ITT: Compare by random assignment
ITT_Y = (data[data['Z'] == 1]['Y_obs'].mean() -
data[data['Z'] == 0]['Y_obs'].mean())
print(f"ITT: {ITT_Y:.2f}")
# (4) Compliance rate
compliance_rate = (data[data['Z'] == 1]['D'].mean() -
data[data['Z'] == 0]['D'].mean())
print(f"Compliance rate: {compliance_rate:.2%}")
# (5) LATE (IV estimate)
LATE = ITT_Y / compliance_rate
print(f"LATE: {LATE:.2f}")
# ==== 3. Regression Estimation ====
print("\n" + "=" * 60)
print("Regression Estimation")
print("=" * 60)
# ITT regression
X_itt = sm.add_constant(data['Z'])
itt_model = sm.OLS(data['Y_obs'], X_itt).fit(cov_type='HC3')
print("\nITT regression:")
print(f" Coefficient: {itt_model.params['Z']:.2f}")
print(f" SE: {itt_model.bse['Z']:.2f}")
print(f" p-value: {itt_model.pvalues['Z']:.4f}")
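# 2SLS (LATE): instrument actual participation D with random assignment Z
iv_model = IV2SLS(
    dependent=data['Y_obs'],
    exog=data.assign(const=1.0)[['const']],  # intercept only
    endog=data[['D']],
    instruments=data[['Z']]
).fit(cov_type='robust')
print("\n2SLS (LATE):")
print(f" Coefficient: {iv_model.params['D']:.2f}")
print(f" SE: {iv_model.std_errors['D']:.2f}")
# First-stage F-statistic (instrument strength) from the first-stage diagnostics table;
# iv_model.f_statistic is the overall model test, not the first-stage F
first_stage_F = iv_model.first_stage.diagnostics.loc['D', 'f.stat']
print(f" F-stat (first stage): {first_stage_F:.2f}")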
# ==== 4. CATE Estimation ====
print("\n" + "=" * 60)
print("Heterogeneity Analysis (CATE)")
print("=" * 60)
# Group by motivation
data['motivation_group'] = pd.qcut(data['motivation'], q=3,
labels=['Low Motivation', 'Medium Motivation', 'High Motivation'])
for group in ['Low Motivation', 'Medium Motivation', 'High Motivation']:
    group_data = data[data['motivation_group'] == group]
    # ITT by group
    itt_group = (group_data[group_data['Z'] == 1]['Y_obs'].mean() -
                 group_data[group_data['Z'] == 0]['Y_obs'].mean())
    # Compliance rate by group
    comp_group = (group_data[group_data['Z'] == 1]['D'].mean() -
                  group_data[group_data['Z'] == 0]['D'].mean())
    # LATE by group
    late_group = itt_group / comp_group if comp_group > 0 else np.nan
    print(f"\n{group}:")
    print(f" ITT: {itt_group:.2f}")
    print(f" Compliance rate: {comp_group:.2%}")
    print(f" LATE: {late_group:.2f}")
# ==== 5. Visualization ====
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
# Plot 1: Non-compliance situation
comply_counts = data.groupby(['Z', 'D']).size().unstack(fill_value=0)
comply_counts.plot(kind='bar', ax=axes[0, 0], color=['skyblue', 'salmon'])
axes[0, 0].set_xlabel('Random Assignment (Z)')
axes[0, 0].set_ylabel('Count')
axes[0, 0].set_title('Non-compliance Situation')
axes[0, 0].legend(['Did not receive treatment', 'Received treatment'])
axes[0, 0].set_xticklabels(['Control group', 'Treatment group'], rotation=0)
# Plot 2: ITT vs LATE
effects = ['ITT', 'LATE', 'True ATE']
values = [ITT_Y, LATE, true_ATE]
axes[0, 1].bar(effects, values, color=['#3498db', '#e74c3c', '#2ecc71'])
axes[0, 1].set_ylabel('Effect Size')
axes[0, 1].set_title('Comparison of Different Effect Estimates')
axes[0, 1].grid(axis='y', alpha=0.3)
# Plot 3: Heterogeneity (CATE)
cate_data = []
for group in ['Low Motivation', 'Medium Motivation', 'High Motivation']:
    group_data = data[data['motivation_group'] == group]
    itt = (group_data[group_data['Z'] == 1]['Y_obs'].mean() -
           group_data[group_data['Z'] == 0]['Y_obs'].mean())
    cate_data.append(itt)
axes[1, 0].bar(['Low', 'Medium', 'High'], cate_data,
color=['#3498db', '#2ecc71', '#e74c3c'])
axes[1, 0].set_ylabel('ITT')
axes[1, 0].set_title('Treatment Effect Heterogeneity')
axes[1, 0].grid(axis='y', alpha=0.3)
# Plot 4: Scatter plot (motivation vs effect)
treated = data[data['Z'] == 1]
control = data[data['Z'] == 0]
axes[1, 1].scatter(treated['motivation'], treated['Y_obs'],
alpha=0.3, s=10, label='Treatment', color='salmon')
axes[1, 1].scatter(control['motivation'], control['Y_obs'],
alpha=0.3, s=10, label='Control', color='skyblue')
axes[1, 1].set_xlabel('Learning Motivation')
axes[1, 1].set_ylabel('Final Score')
axes[1, 1].set_title('Motivation vs Score (by Assignment Group)')
axes[1, 1].legend()
plt.tight_layout()
plt.savefig('ate_analysis.png', dpi=300, bbox_inches='tight')
plt.show()
print("\n✓ Analysis complete!")
Summary
Core Concept Comparison
| Effect | Definition | Estimator | Application Scenario |
|---|---|---|---|
| ATE | Full sample average effect | Simple difference in means (RCT with full compliance) | Universal policy rollout |
| ATT | Treated group average effect | Matching, DID | Program evaluation |
| ATU | Untreated group average effect | Matching | Program expansion |
| LATE | Compliers average effect | IV/2SLS | RCT with non-compliance |
| ITT | Intention-to-treat effect | Group by assignment | Policy effect (conservative) |
| CATE | Conditional average effect | Grouping/ML | Heterogeneity analysis |
Key Insights
Non-compliance Problem
- ITT preserves randomization but is diluted by non-compliance, so it understates the effect on those who actually take up treatment
- LATE estimates complier effect through IV
- Compliance rate = ITT / LATE
Heterogeneity Analysis
- CATE reveals differential effects across subgroups
- Helps with precision targeting and resource optimization
Effect Selection
- Policymakers care about ITT (effect of offering opportunity)
- Researchers care about LATE (the causal effect of the treatment itself, for compliers)
- Practitioners care about CATE (personalized intervention)
Practice Questions
Understanding question: Why is ITT called a "conservative estimate"? When might ITT be more policy-relevant?
Calculation question: An education RCT finds:
- ITT = 0.2 standard deviations
- Compliance rate = 60%
Questions: What is LATE? If everyone participated (100% compliance), what would be the expected effect?
Design question: You're studying "the causal effect of a fitness app on weight loss."
- (a) Define ATE, ATT, LATE
- (b) How to estimate CATE (age, gender, baseline BMI)
- (c) Which effect do you expect to be largest? Why?
Answer hints
Question 1:
- ITT is "conservative" because it includes non-compliers (dilutes effect)
- Policy relevance: Reflects actual effect of "offering program opportunity" (can't force everyone to participate)
Question 2:
- LATE = ITT / compliance rate = 0.2 / 0.6 = 0.33 standard deviations
- Expected effect with 100% compliance ≈ LATE = 0.33 (assuming LATE = ATE)
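The arithmetic for Question 2 can be checked in a few lines. The loop over alternative compliance rates is an added illustration, not part of the question; it shows how the same ITT implies a different LATE depending on take-up.

```python
itt = 0.2               # ITT in standard deviations (from the question)
compliance_rate = 0.6   # share of compliers (from the question)

late = itt / compliance_rate
print(f"LATE = {late:.2f} SD")

# Same ITT under other hypothetical compliance rates
for rate in (0.4, 0.6, 0.8, 1.0):
    print(f"compliance = {rate:.0%}: implied LATE = {itt / rate:.2f} SD")
```

At 100% compliance the implied LATE equals the ITT, which is the no-dilution case.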
Question 3:
- (c) Expect ATT > ATE > ATU (self-selection effect: those who voluntarily use app have better outcomes)
Next Steps
In the next section, we'll learn about Identification Strategies and Validity, diving deep into core assumptions of causal inference (SUTVA, independence, etc.).
Keep going! 🚀