2.4 Average Treatment Effects (ATE/ATT/LATE)

"The central role of the randomized experiment in statistical inference is beyond dispute."— Guido Imbens, 2021 Nobel Laureate in Economics

Understanding the Meaning and Estimation of Different Causal Effects

Section Objectives

Distinguish between ATE, ATT, ATU, LATE, and other effects
Understand intention-to-treat (ITT) analysis
Master treatment effect heterogeneity (CATE)
Learn methods for handling non-compliance

Types of Average Treatment Effects

Core Question

Why do we need to distinguish different average effects?

Because in reality:

Not everyone receives their assigned treatment (non-compliance)
Treatment effects may vary across individuals (heterogeneity)
The population of interest may differ (full sample vs treated vs compliers)

1️⃣ ATE: Average Treatment Effect

Definition

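Average Treatment Effect (ATE): The average causal effect over the full sample, $\text{ATE} = E[Y_i(1) - Y_i(0)]$, where $Y_i(1)$ and $Y_i(0)$ are individual $i$'s potential outcomes with and without treatment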

Meaning: If we randomly select a person to receive treatment, what is the average effect?

Case: Education Training

python

import pandas as pd

# Assume we have "God's perspective" data: both potential outcomes are visible
data = pd.DataFrame({
    'id': range(5),
    'Y0': [5000, 6000, 5500, 7000, 6500],  # Income without training
    'Y1': [6500, 7200, 6800, 8000, 7800],  # Income with training
})

data['tau'] = data['Y1'] - data['Y0']  # Individual causal effects: 1500, 1200, 1300, 1000, 1300

ATE = data['tau'].mean()
print(f"ATE = {ATE:.0f} USD")  # ATE = 1260 USD

Application Scenarios

Universal policy rollout: If rolling out a policy nationwide, we care about ATE
Randomized experiments: With random assignment and full compliance, the difference in group means is an unbiased estimate of ATE
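Why randomization identifies the ATE can be checked with a small simulation. The sketch below is illustrative only: the potential outcomes, effect sizes, and sample size are made-up assumptions, not data from this section. Because assignment is independent of the potential outcomes, the observed difference in group means lands close to the true ATE.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical potential outcomes with heterogeneous individual effects
y0 = rng.normal(5000, 800, n)        # income without training
tau = rng.normal(1300, 300, n)       # individual treatment effects
y1 = y0 + tau                        # income with training

# Random assignment: d is independent of (y0, y1)
d = rng.binomial(1, 0.5, n)
y_obs = np.where(d == 1, y1, y0)     # only one potential outcome is observed

true_ate = tau.mean()
diff_in_means = y_obs[d == 1].mean() - y_obs[d == 0].mean()

print(f"True ATE:            {true_ate:.1f}")
print(f"Difference in means: {diff_in_means:.1f}")
```

The two numbers differ only by sampling noise, which shrinks as the sample grows.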

2️⃣ ATT: Average Effect on the Treated

Definition

Average Treatment Effect on the Treated (ATT): The average causal effect for those who receive treatment, $\text{ATT} = E[Y_i(1) - Y_i(0) \mid D_i = 1]$

Meaning: For those who actually receive treatment, what is the average effect?

Distinction from ATE

Continuing the example above:

python

# Assume only the first 3 people received treatment
data['D'] = [1, 1, 1, 0, 0]

# ATT: Calculate average effect only for treated group
ATT = data.loc[data['D'] == 1, 'tau'].mean()
print(f"ATT = {ATT:.0f} USD")  # ATT = 1333 USD

# ATE: Average effect for full sample
ATE = data['tau'].mean()
print(f"ATE = {ATE:.0f} USD")  # ATE = 1260 USD

When is ATT ≠ ATE?

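Condition: Treatment effects are heterogeneous and people with different effects differ in how likely they are to be treated (selection on gains). Heterogeneity alone is not enough: under random assignment, ATT = ATE even when effects vary across individuals.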

Case: Job training

People who voluntarily attend training may be more motivated
Their training effects may be larger
Therefore ATT > ATE
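The job-training argument can be sketched numerically. Everything in this example is assumed for illustration (the `motivation` variable, the effect formula, and the enrollment rule are hypothetical): people with larger gains are more likely to enroll, so the treated group's average effect exceeds the population average.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical setup: motivation raises both the gain and the chance of enrolling
motivation = rng.normal(0, 1, n)
tau = 1000 + 400 * motivation                 # individual treatment effects
p_enroll = 1 / (1 + np.exp(-motivation))      # self-selection into training
d = rng.binomial(1, p_enroll)

ate = tau.mean()
att = tau[d == 1].mean()
atu = tau[d == 0].mean()

print(f"ATE = {ate:.0f}, ATT = {att:.0f}, ATU = {atu:.0f}")
```

If `p_enroll` were replaced by a constant (random assignment), the three averages would coincide up to sampling noise.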

Application Scenarios

Program evaluation: Assess how much benefit participants gained
Volunteer programs: Participants self-select, we care about their benefits

3️⃣ ATU: Average Effect on the Untreated

Definition

Average Treatment Effect on the Untreated (ATU): The average counterfactual effect for those who didn't receive treatment, $\text{ATU} = E[Y_i(1) - Y_i(0) \mid D_i = 0]$

Meaning: If we let those who didn't receive treatment receive it, what would be the average effect?

Case

python

# ATU: Calculate average effect only for the untreated group (counterfactual)
ATU = data.loc[data['D'] == 0, 'tau'].mean()
print(f"ATU = {ATU:.0f} USD")  # ATU = 1150 USD

Policy Implications

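Program expansion: When coverage is extended to people who are currently untreated, the expected effect on the new participants is the ATU
Marginal effect: The per-person benefit of expanding from current participants to all non-participants is the ATU, which can differ from the ATT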

4️⃣ LATE: Local Average Treatment Effect

Problem: Non-compliance

Perfect RCT Assumption:

Assigned to treatment group → 100% receive treatment
Assigned to control group → 0% receive treatment

Reality:

One-sided non-compliance
- Assigned to treatment, some don't receive it
- Assigned to control, none receive it
Two-sided non-compliance
- Assigned to treatment, some don't receive it
- Assigned to control, some obtain treatment
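The two patterns above can be told apart directly from the assignment and take-up columns. The helper below is a hypothetical sketch (the function name and the column names `Z` and `D` are assumptions chosen to match the notation used later in this section):

```python
import pandas as pd

def noncompliance_pattern(df, z="Z", d="D"):
    """Label the non-compliance pattern for binary assignment z and binary take-up d."""
    # Someone assigned to treatment did not take it
    treatment_arm_untreated = bool(((df[z] == 1) & (df[d] == 0)).any())
    # Someone assigned to control obtained the treatment anyway
    control_arm_treated = bool(((df[z] == 0) & (df[d] == 1)).any())

    if treatment_arm_untreated and control_arm_treated:
        return "two-sided"
    if treatment_arm_untreated or control_arm_treated:
        return "one-sided"
    return "full compliance"

one_sided = pd.DataFrame({"Z": [1, 1, 1, 0, 0, 0], "D": [1, 0, 1, 0, 0, 0]})
two_sided = pd.DataFrame({"Z": [1, 1, 1, 0, 0, 0], "D": [1, 0, 1, 0, 1, 0]})

print(noncompliance_pattern(one_sided))
print(noncompliance_pattern(two_sided))
```

Note that this only describes what is observed in a sample; it does not reveal which individuals are compliers.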

Case: Drug Clinical Trial

python

# RCT data
rct_data = pd.DataFrame({
    'id': range(10),
    'Z': [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],  # Z = random assignment
    'D': [1, 1, 0, 1, 1, 0, 1, 0, 0, 0],  # D = actually took drug
    'Y': [120, 115, 95, 125, 118, 90, 105, 88, 92, 85]  # Blood pressure
})

print(rct_data)

Observations:

ID 2: Assigned to treatment (Z=1), but didn't take drug (D=0) → non-complier in the treatment arm
ID 6: Assigned to control (Z=0), but took drug (D=1) → non-complier in the control arm, so non-compliance here is two-sided

Four Population Types

Based on potential treatment status $D_i(1)$ (treatment taken when assigned to treatment, Z=1) and $D_i(0)$ (treatment taken when assigned to control, Z=0), the population can be divided into four types:

Type	$D_i(1)$	$D_i(0)$	Description
Compliers	1	0	Follow assignment
Always-takers	1	1	Always take treatment
Never-takers	0	0	Never take treatment
Defiers	0	1	Do opposite of assignment

Assumption: Monotonicity (no defiers), i.e. $D_i(1) \ge D_i(0)$ for every individual
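To make the four types concrete, the sketch below classifies a small made-up population from both potential treatment statuses. This is a "God's perspective" illustration: in real data only one of the two statuses is observed per person, and the column names `D1` and `D0` and the helper `compliance_type` are hypothetical. It also shows why monotonicity matters: without defiers, the difference in take-up between the two assignments equals the share of compliers.

```python
import pandas as pd

# Hypothetical potential treatment statuses for 10 people
pop = pd.DataFrame({
    "D1": [1, 1, 1, 1, 0, 0, 1, 1, 0, 1],  # takes treatment if assigned Z=1
    "D0": [0, 0, 1, 0, 0, 0, 0, 1, 0, 0],  # takes treatment if assigned Z=0
})

def compliance_type(d1, d0):
    """Map a pair of potential treatment statuses to one of the four types."""
    return {
        (1, 0): "complier",
        (1, 1): "always-taker",
        (0, 0): "never-taker",
        (0, 1): "defier",
    }[(int(d1), int(d0))]

pop["type"] = [compliance_type(a, b) for a, b in zip(pop["D1"], pop["D0"])]
shares = pop["type"].value_counts(normalize=True)

# Monotonicity: nobody takes treatment only when assigned to control
monotone = bool((pop["D1"] >= pop["D0"]).all())

# Under monotonicity, E[D(1)] - E[D(0)] equals the complier share
take_up_gap = pop["D1"].mean() - pop["D0"].mean()

print(shares)
print(f"Monotonicity holds: {monotone}")
print(f"Take-up gap: {take_up_gap:.2f}, complier share: {shares['complier']:.2f}")
```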

LATE Definition

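Local Average Treatment Effect (LATE): The average causal effect for compliers, $\text{LATE} = E[Y_i(1) - Y_i(0) \mid D_i(1) > D_i(0)]$, where $D_i(1)$ and $D_i(0)$ are the treatment statuses individual $i$ would have under assignment to treatment and to control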

Meaning: For those who respond to random assignment, what is the average effect?

LATE Estimation: Instrumental Variables (IV)

Key Insight: Random assignment can serve as an instrument for actual treatment

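IV (Wald) Estimator:

$$\widehat{\text{LATE}} = \frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]}$$

Numerator: Intention-to-treat effect on the outcome (ITT)
Denominator: Compliance rate (effect of assignment on treatment take-up)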

python

# Calculate LATE
ITT_Y = (rct_data[rct_data['Z'] == 1]['Y'].mean() -
         rct_data[rct_data['Z'] == 0]['Y'].mean())

ITT_D = (rct_data[rct_data['Z'] == 1]['D'].mean() -
         rct_data[rct_data['Z'] == 0]['D'].mean())

LATE = ITT_Y / ITT_D

print(f"ITT (Y): {ITT_Y:.2f}")
print(f"ITT (D): {ITT_D:.2f}")
print(f"LATE: {LATE:.2f}")

Python Implementation: 2SLS (Two-Stage Least Squares)

python

import statsmodels.api as sm
from statsmodels.sandbox.regression.gmm import IV2SLS

# First stage: D ~ Z (checks that assignment actually moves take-up)
X1 = sm.add_constant(rct_data['Z'])
first_stage = sm.OLS(rct_data['D'], X1).fit()
print("First stage F-statistic:", first_stage.fvalue)

# 2SLS: Y ~ D, using Z as the instrument for D
# exog holds the regressors (constant + endogenous D);
# instrument holds the instruments (constant + Z)
iv_model = IV2SLS(
    endog=rct_data['Y'],
    exog=sm.add_constant(rct_data['D']),
    instrument=sm.add_constant(rct_data['Z'])
).fit()

print(iv_model.summary())
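
With a single binary instrument and no covariates, the 2SLS coefficient coincides with the ratio of the two intention-to-treat differences. The sketch below runs the two stages by hand with NumPy on the same ten observations as the drug trial example, as a way to see what the library is doing. It is meant for intuition only: standard errors from a hand-rolled second stage would be wrong, so a proper IV routine should be used for inference.

```python
import numpy as np

# Same ten observations as the drug trial example
z = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0], dtype=float)
d = np.array([1, 1, 0, 1, 1, 0, 1, 0, 0, 0], dtype=float)
y = np.array([120, 115, 95, 125, 118, 90, 105, 88, 92, 85], dtype=float)

# Stage 1: regress D on a constant and Z, keep the fitted values
Z = np.column_stack([np.ones_like(z), z])
gamma, *_ = np.linalg.lstsq(Z, d, rcond=None)
d_hat = Z @ gamma

# Stage 2: regress Y on a constant and the fitted D
X_hat = np.column_stack([np.ones_like(d_hat), d_hat])
beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)

# Ratio of the two intention-to-treat differences
wald = ((y[z == 1].mean() - y[z == 0].mean()) /
        (d[z == 1].mean() - d[z == 0].mean()))

print(f"2SLS slope: {beta[1]:.2f}")
print(f"Wald ratio: {wald:.2f}")
```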

5️⃣ ITT: Intention-to-Treat Analysis

Definition

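Intention-to-Treat (ITT): The effect of random assignment itself, comparing groups as assigned rather than as actually treated, $\text{ITT} = E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]$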

Meaning: Average difference between assigned to treatment (regardless of actual receipt) vs control

Importance of ITT

Advantage	Explanation
Preserves randomization	Groups are compared by random assignment $Z$, so there is no selection bias
Policy-relevant	Reflects effect of "offering opportunity" (not "forced participation")
Conservative estimate	$|\text{ITT}| \le |\text{LATE}|$: non-compliers dilute the average effect

ITT vs LATE

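Derivation: Assume $Z$ is randomly assigned, affects the outcome only through treatment take-up (exclusion restriction), and there are no defiers (monotonicity). Always-takers and never-takers then take the same treatment under either assignment, so assignment changes the outcome only for compliers. Writing $\pi_c = E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]$ for the share of compliers:

$$\text{ITT} = E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0] = \pi_c \cdot \text{LATE} \quad \Longrightarrow \quad \text{LATE} = \frac{\text{ITT}}{\pi_c}$$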

Case: Drug Trial

python

# Compliance rate = 80%
compliance_rate = 0.8

# True LATE = 25 (effect on compliers)
true_LATE = 25

# ITT = LATE × compliance rate
ITT = true_LATE * compliance_rate
print(f"ITT: {ITT:.2f}")  # ITT = 20

# Interpretation: Offering the drug (regardless of whether it is taken) reduces blood pressure by 20 points on average
# But for compliers, who take it when offered, the effect is 25 points
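
The relationship ITT = LATE × compliance rate can also be checked by simulation rather than arithmetic. The sketch below assumes one-sided non-compliance with 80% compliers and 20% never-takers. The outcome is measured as the reduction in blood pressure, and all numbers are made up to mirror the example above. Never-takers are given a different (unobservable) effect to show that it never enters the estimate.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

z = rng.binomial(1, 0.5, n)                    # random assignment
complier = rng.binomial(1, 0.8, n)             # 80% compliers, 20% never-takers
d = z * complier                               # one-sided non-compliance

# Effect on the reduction in blood pressure: 25 for compliers;
# never-takers would have 10, but they never take the drug
effect = np.where(complier == 1, 25.0, 10.0)
y = rng.normal(0, 10, n) + effect * d

itt_y = y[z == 1].mean() - y[z == 0].mean()
itt_d = d[z == 1].mean() - d[z == 0].mean()
late_hat = itt_y / itt_d

print(f"ITT:             {itt_y:.2f}")
print(f"Compliance rate: {itt_d:.2f}")
print(f"LATE estimate:   {late_hat:.2f}")
```

The estimates should land near 20, 0.8, and 25 respectively, up to sampling noise.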

Treatment Effect Heterogeneity (CATE)

Definition

Conditional Average Treatment Effect (CATE): The average causal effect conditional on covariates, $\tau(x) = E[Y_i(1) - Y_i(0) \mid X_i = x]$

Meaning: Treatment effects may differ across subgroups

Case: Education Training Heterogeneity

python

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Generate data
np.random.seed(42)
n = 1000

data = pd.DataFrame({
    'baseline_skill': np.random.normal(50, 15, n),
    'treatment': np.random.binomial(1, 0.5, n)
})

# Heterogeneous effects: Higher baseline → larger effect
data['tau'] = 5 + 0.3 * data['baseline_skill']  # Individual effects
data['Y0'] = 50 + 0.8 * data['baseline_skill'] + np.random.normal(0, 10, n)
data['Y1'] = data['Y0'] + data['tau']
data['Y_obs'] = np.where(data['treatment'] == 1, data['Y1'], data['Y0'])

# CATE estimation: Grouped regression
# Divide baseline skill into three groups
data['skill_group'] = pd.cut(data['baseline_skill'], bins=3,
                              labels=['Low', 'Medium', 'High'])

cate_results = []
for group in ['Low', 'Medium', 'High']:
    group_data = data[data['skill_group'] == group]
    ATE_group = (group_data[group_data['treatment'] == 1]['Y_obs'].mean() -
                 group_data[group_data['treatment'] == 0]['Y_obs'].mean())
    cate_results.append({'Group': group, 'CATE': ATE_group})

cate_df = pd.DataFrame(cate_results)
print(cate_df)

# Visualization
plt.figure(figsize=(8, 6))
plt.bar(cate_df['Group'], cate_df['CATE'], color=['#3498db', '#2ecc71', '#e74c3c'])
plt.xlabel('Baseline Skill Group')
plt.ylabel('CATE (Treatment Effect)')
plt.title('Treatment Effect Heterogeneity: Higher Baseline → Larger Effect')
plt.grid(axis='y', alpha=0.3)
plt.show()
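
Grouping discards information when the effect varies smoothly with a covariate. One common alternative, sketched here under the assumption that the effect is linear in baseline skill, is a regression with a treatment-by-covariate interaction. The data-generating process copies the one above but uses fresh variable names (`skill`, `treat`, `y`) and a larger sample so that the block runs on its own.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 5000

df = pd.DataFrame({
    "skill": rng.normal(50, 15, n),
    "treat": rng.binomial(1, 0.5, n),
})
tau = 5 + 0.3 * df["skill"]                      # true individual effects
df["y"] = (50 + 0.8 * df["skill"] + rng.normal(0, 10, n)
           + df["treat"] * tau)

# "treat * skill" expands to treat + skill + treat:skill
fit = smf.ols("y ~ treat * skill", data=df).fit(cov_type="HC3")

def cate(x):
    """Estimated CATE at baseline skill x under the linear-interaction model."""
    return fit.params["treat"] + fit.params["treat:skill"] * x

print(f"Estimated effect slope: {fit.params['treat:skill']:.3f}  (true: 0.3)")
for x in (30, 50, 70):
    print(f"CATE at skill={x}: {cate(x):.1f}  (true: {5 + 0.3 * x:.1f})")
```

If the true relationship is not linear, this model is misspecified, which is one motivation for the flexible estimators discussed next.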

Machine Learning CATE Estimation

Causal Forest:

python

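# Using EconML library (developed by Microsoft)
from econml.dml import CausalForestDML
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Train causal forest
# discrete_treatment=True is required because model_t is a classifier
causal_forest = CausalForestDML(
    model_y=RandomForestRegressor(),
    model_t=RandomForestClassifier(),
    discrete_treatment=True,
    n_estimators=1000,
    random_state=42
)

X = data[['baseline_skill']]
causal_forest.fit(Y=data['Y_obs'], T=data['treatment'], X=X)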

# Predict CATE for each individual
data['cate_pred'] = causal_forest.effect(X)

# Visualization
plt.figure(figsize=(10, 6))
plt.scatter(data['baseline_skill'], data['cate_pred'], alpha=0.3, s=10)
plt.xlabel('Baseline Skill')
plt.ylabel('Predicted CATE')
plt.title('Individual Treatment Effects Estimated by Causal Forest')
plt.show()

Relationships Between Different Effects

Mathematical Relationships
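
Two identities tie the effects in this section together. First, the ATE is a weighted average of ATT and ATU: $\text{ATE} = P(D=1)\cdot\text{ATT} + P(D=0)\cdot\text{ATU}$. Second, under the exclusion restriction and monotonicity, $\text{ITT} = \text{LATE} \times \text{compliance rate}$. The sketch below checks the first identity on the five-person training example and the second on illustrative values (LATE of 15 and a compliance rate of 60%, chosen here as assumptions).

```python
import pandas as pd

# Five-person training example with both potential outcomes visible
data = pd.DataFrame({
    "Y0": [5000, 6000, 5500, 7000, 6500],
    "Y1": [6500, 7200, 6800, 8000, 7800],
    "D":  [1, 1, 1, 0, 0],
})
data["tau"] = data["Y1"] - data["Y0"]

ate = data["tau"].mean()
att = data.loc[data["D"] == 1, "tau"].mean()
atu = data.loc[data["D"] == 0, "tau"].mean()
p_treated = data["D"].mean()

# Identity 1: ATE as a weighted average of ATT and ATU
recombined = p_treated * att + (1 - p_treated) * atu
print(f"ATE = {ate:.0f}, ATT = {att:.0f}, ATU = {atu:.0f}")
print(f"p*ATT + (1-p)*ATU = {recombined:.0f}")

# Identity 2: ITT = LATE x compliance rate (illustrative values)
late, compliance = 15.0, 0.6
itt = late * compliance
print(f"ITT = {itt:.1f}")
```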

Visualization

python

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(figsize=(10, 6))

# Effect values (illustrative numbers, not estimates from data)
effects = {
    'ATE': 10,
    'ATT': 12,
    'ATU': 8,
    'LATE': 15,
    'ITT': 9
}

colors = ['#3498db', '#2ecc71', '#e74c3c', '#f39c12', '#9b59b6']
bars = ax.barh(list(effects.keys()), list(effects.values()), color=colors)

# Add value labels
for i, (k, v) in enumerate(effects.items()):
    ax.text(v + 0.5, i, f'{v}', va='center', fontweight='bold')

ax.set_xlabel('Treatment Effect Size')
ax.set_title('Comparison of Different Average Effects')
ax.set_xlim(0, 17)
ax.grid(axis='x', alpha=0.3)

# Add annotations
ax.annotate('LATE > ATT: Compliers have larger effects',
            xy=(15, 3), xytext=(13, 4.5),
            arrowprops=dict(arrowstyle='->', color='gray'))

ax.annotate('ITT < ATE: Non-compliance causes dilution',
            xy=(9, 4), xytext=(11, 3.5),
            arrowprops=dict(arrowstyle='->', color='gray'))

plt.tight_layout()
plt.show()

Complete Python Practice: Full Case Study

Case: Online Course RCT (with Non-compliance)

python

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt  # needed for the plots in step 5
import statsmodels.api as sm
from linearmodels.iv import IV2SLS

# ==== 1. Generate Data ====
np.random.seed(42)
n = 1000

# Individual characteristics
data = pd.DataFrame({
    'id': range(n),
    'motivation': np.random.normal(50, 15, n),  # Learning motivation
    'baseline_score': np.random.normal(60, 10, n)  # Baseline score
})

# Random assignment (RCT)
data['Z'] = np.random.binomial(1, 0.5, n)

# Actual participation (non-compliance)
# High-motivation individuals more likely to comply
prob_comply = 1 / (1 + np.exp(-(data['motivation'] - 50) / 10))
data['comply'] = np.random.binomial(1, prob_comply)

# Actual treatment received
data['D'] = data['Z'] * data['comply']

# Potential outcomes
# Effect heterogeneity: Higher motivation → larger effect
data['tau'] = 5 + 0.2 * data['motivation']
data['Y0'] = 60 + 0.3 * data['motivation'] + 0.5 * data['baseline_score']
data['Y1'] = data['Y0'] + data['tau'] + np.random.normal(0, 5, n)
data['Y_obs'] = np.where(data['D'] == 1, data['Y1'], data['Y0'])

# ==== 2. Estimate Different Effects ====
print("=" * 60)
print("Estimation of Different Average Effects")
print("=" * 60)

# (1) True ATE (God's perspective)
true_ATE = data['tau'].mean()
print(f"\nTrue ATE: {true_ATE:.2f}")

# (2) Simple comparison (biased! due to self-selection into treatment)
naive = (data[data['D'] == 1]['Y_obs'].mean() -
         data[data['D'] == 0]['Y_obs'].mean())
print(f"Simple comparison: {naive:.2f} (biased!)")

# (3) ITT: Compare by random assignment
ITT_Y = (data[data['Z'] == 1]['Y_obs'].mean() -
         data[data['Z'] == 0]['Y_obs'].mean())
print(f"ITT: {ITT_Y:.2f}")

# (4) Compliance rate
compliance_rate = (data[data['Z'] == 1]['D'].mean() -
                   data[data['Z'] == 0]['D'].mean())
print(f"Compliance rate: {compliance_rate:.2%}")

# (5) LATE (IV estimate)
LATE = ITT_Y / compliance_rate
print(f"LATE: {LATE:.2f}")

# ==== 3. Regression Estimation ====
print("\n" + "=" * 60)
print("Regression Estimation")
print("=" * 60)

# ITT regression
X_itt = sm.add_constant(data['Z'])
itt_model = sm.OLS(data['Y_obs'], X_itt).fit(cov_type='HC3')
print("\nITT regression:")
print(f"  Coefficient: {itt_model.params['Z']:.2f}")
print(f"  SE: {itt_model.bse['Z']:.2f}")
print(f"  p-value: {itt_model.pvalues['Z']:.4f}")

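# 2SLS (LATE): the endogenous regressor D is instrumented by Z
exog = pd.DataFrame({'const': np.ones(n)}, index=data.index)
iv_model = IV2SLS(
    dependent=data['Y_obs'],
    exog=exog,
    endog=data[['D']],
    instruments=data[['Z']]
).fit(cov_type='robust')

# First-stage F-statistic for D (iv_model.f_statistic is the overall model F, not the first stage)
first_stage_f = iv_model.first_stage.diagnostics['f.stat'].iloc[0]

print("\n2SLS (LATE):")
print(f"  Coefficient: {iv_model.params['D']:.2f}")
print(f"  SE: {iv_model.std_errors['D']:.2f}")
print(f"  F-stat (first stage): {first_stage_f:.2f}")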

# ==== 4. CATE Estimation ====
print("\n" + "=" * 60)
print("Heterogeneity Analysis (CATE)")
print("=" * 60)

# Group by motivation
data['motivation_group'] = pd.qcut(data['motivation'], q=3,
                                    labels=['Low Motivation', 'Medium Motivation', 'High Motivation'])

for group in ['Low Motivation', 'Medium Motivation', 'High Motivation']:
    group_data = data[data['motivation_group'] == group]

    # ITT by group
    itt_group = (group_data[group_data['Z'] == 1]['Y_obs'].mean() -
                 group_data[group_data['Z'] == 0]['Y_obs'].mean())

    # Compliance rate by group
    comp_group = (group_data[group_data['Z'] == 1]['D'].mean() -
                  group_data[group_data['Z'] == 0]['D'].mean())

    # LATE by group
    late_group = itt_group / comp_group if comp_group > 0 else np.nan

    print(f"\n{group}:")
    print(f"  ITT: {itt_group:.2f}")
    print(f"  Compliance rate: {comp_group:.2%}")
    print(f"  LATE: {late_group:.2f}")

# ==== 5. Visualization ====
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Plot 1: Non-compliance situation
comply_counts = data.groupby(['Z', 'D']).size().unstack(fill_value=0)
comply_counts.plot(kind='bar', ax=axes[0, 0], color=['skyblue', 'salmon'])
axes[0, 0].set_xlabel('Random Assignment (Z)')
axes[0, 0].set_ylabel('Count')
axes[0, 0].set_title('Non-compliance Situation')
axes[0, 0].legend(['Did not receive treatment', 'Received treatment'])
axes[0, 0].set_xticklabels(['Control group', 'Treatment group'], rotation=0)

# Plot 2: ITT vs LATE
effects = ['ITT', 'LATE', 'True ATE']
values = [ITT_Y, LATE, true_ATE]
axes[0, 1].bar(effects, values, color=['#3498db', '#e74c3c', '#2ecc71'])
axes[0, 1].set_ylabel('Effect Size')
axes[0, 1].set_title('Comparison of Different Effect Estimates')
axes[0, 1].grid(axis='y', alpha=0.3)

# Plot 3: Heterogeneity (CATE)
cate_data = []
for group in ['Low Motivation', 'Medium Motivation', 'High Motivation']:
    group_data = data[data['motivation_group'] == group]
    itt = (group_data[group_data['Z'] == 1]['Y_obs'].mean() -
           group_data[group_data['Z'] == 0]['Y_obs'].mean())
    cate_data.append(itt)

axes[1, 0].bar(['Low', 'Medium', 'High'], cate_data,
               color=['#3498db', '#2ecc71', '#e74c3c'])
axes[1, 0].set_ylabel('ITT')
axes[1, 0].set_title('Treatment Effect Heterogeneity')
axes[1, 0].grid(axis='y', alpha=0.3)

# Plot 4: Scatter plot (motivation vs effect)
treated = data[data['Z'] == 1]
control = data[data['Z'] == 0]
axes[1, 1].scatter(treated['motivation'], treated['Y_obs'],
                   alpha=0.3, s=10, label='Treatment', color='salmon')
axes[1, 1].scatter(control['motivation'], control['Y_obs'],
                   alpha=0.3, s=10, label='Control', color='skyblue')
axes[1, 1].set_xlabel('Learning Motivation')
axes[1, 1].set_ylabel('Final Score')
axes[1, 1].set_title('Motivation vs Score (by Assignment Group)')
axes[1, 1].legend()

plt.tight_layout()
plt.savefig('ate_analysis.png', dpi=300, bbox_inches='tight')
plt.show()

print("\n✓ Analysis complete!")
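
A point estimate of LATE is more useful with an uncertainty interval. One simple option, sketched below, is a nonparametric bootstrap of the ratio estimator. The data here are freshly simulated with assumed values (60% compliers, a constant effect of 15) so that the block runs on its own; the helper name `wald` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000

# Hypothetical experiment with one-sided non-compliance
z = rng.binomial(1, 0.5, n)
complier = rng.binomial(1, 0.6, n)
d = z * complier
y = 60 + 15 * d + rng.normal(0, 5, n)     # constant effect of 15 for simplicity

def wald(y, d, z):
    """Ratio of the ITT on the outcome to the ITT on take-up."""
    itt_y = y[z == 1].mean() - y[z == 0].mean()
    itt_d = d[z == 1].mean() - d[z == 0].mean()
    return itt_y / itt_d

point = wald(y, d, z)

# Resample individuals with replacement and re-estimate each time
n_boot = 1000
draws = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, n, n)
    draws[b] = wald(y[idx], d[idx], z[idx])

ci_low, ci_high = np.percentile(draws, [2.5, 97.5])
print(f"LATE estimate: {point:.2f}")
print(f"95% bootstrap CI: [{ci_low:.2f}, {ci_high:.2f}]")
```

When the compliance rate is close to zero the ratio becomes unstable and the percentile bootstrap can behave poorly, so the first-stage strength should be checked first.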

Summary

Core Concept Comparison

Effect	Definition	Estimator	Application Scenario
ATE	Full sample average effect	Simple difference (RCT)	Universal policy rollout
ATT	Treated group average effect	Matching, DID	Program evaluation
ATU	Untreated group average effect	Matching	Program expansion
LATE	Compliers average effect	IV/2SLS	RCT with non-compliance
ITT	Intention-to-treat effect	Group by assignment	Policy effect (conservative)
CATE	Conditional average effect	Grouping/ML	Heterogeneity analysis

Key Insights

Non-compliance Problem
- ITT preserves randomization but is diluted by non-compliance, so it understates the effect of actually receiving treatment
- LATE estimates complier effect through IV
- Compliance rate = ITT / LATE
Heterogeneity Analysis
- CATE reveals differential effects across subgroups
- Helps with precision targeting and resource optimization
Effect Selection
- Policymakers care about ITT (effect of offering opportunity)
- Researchers care about LATE (effect of actually receiving treatment, for compliers)
- Practitioners care about CATE (personalized intervention)

Practice Questions

Understanding question: Why is ITT called a "conservative estimate"? When might ITT be more policy-relevant?
Calculation question: An education RCT finds:
- ITT = 0.2 standard deviations
- Compliance rate = 60%
Questions: What is LATE? If everyone participated (100% compliance), what would be the expected effect?
Design question: You're studying "the causal effect of a fitness app on weight loss."
- (a) Define ATE, ATT, LATE
- (b) How to estimate CATE (age, gender, baseline BMI)
- (c) Which effect do you expect to be largest? Why?

Answer Hints

Question 1:

ITT is "conservative" because it includes non-compliers (dilutes effect)
Policy relevance: Reflects actual effect of "offering program opportunity" (can't force everyone to participate)

Question 2:

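LATE = ITT / compliance rate = 0.2 / 0.6 ≈ 0.33 standard deviations
Expected effect with 100% compliance ≈ 0.33 standard deviations, but only if non-compliers would respond like compliers (i.e. LATE = ATE); LATE by itself says nothing about the non-compliers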
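
The arithmetic for Question 2 can be reproduced in a few lines. The variable names are chosen here for illustration; the numbers come from the question.

```python
itt = 0.2          # ITT in standard deviations
compliance = 0.6   # share of the assigned group that actually participated

late = itt / compliance
print(f"LATE = {late:.2f} SD")

# Sanity check: scaling LATE back by the compliance rate recovers the ITT
assert abs(late * compliance - itt) < 1e-12
```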

Question 3:

Next Steps

In the next section, we'll learn about Identification Strategies and Validity, diving deep into core assumptions of causal inference (SUTVA, independence, etc.).

Keep going! 🚀

