
2.4 Average Treatment Effects (ATE/ATT/LATE)

"The central role of the randomized experiment in statistical inference is beyond dispute."— Guido Imbens, 2021 Nobel Laureate in Economics

Understanding the Meaning and Estimation of Different Causal Effects


Section Objectives

  • Distinguish between ATE, ATT, ATU, LATE, and other effects
  • Understand intention-to-treat (ITT) analysis
  • Master treatment effect heterogeneity (CATE)
  • Learn methods for handling non-compliance

Types of Average Treatment Effects

Core Question

Why do we need to distinguish different average effects?

Because in reality:

  • Not everyone receives their assigned treatment (non-compliance)
  • Treatment effects may vary across individuals (heterogeneity)
  • The population of interest may differ (full sample vs treated vs compliers)

1️⃣ ATE: Average Treatment Effect

Definition

Average Treatment Effect (ATE): The average causal effect for the full sample

Meaning: If we randomly select a person to receive treatment, what is the average effect?
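
In potential-outcomes notation, ATE = E[Y(1) - Y(0)]: the expected treated-versus-untreated difference, averaged over the entire population.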

Case: Education Training

python
import pandas as pd

# Assume we have "God's perspective" data (both potential outcomes observed)
data = pd.DataFrame({
    'id': range(5),
    'Y0': [5000, 6000, 5500, 7000, 6500],  # Income without training
    'Y1': [6500, 7200, 6800, 8000, 7800],  # Income with training
})

data['tau'] = data['Y1'] - data['Y0']  # Individual causal effects

ATE = data['tau'].mean()
print(f"ATE = {ATE:.0f} USD")  # ATE = 1260 USD

Application Scenarios

  • Universal policy rollout: If rolling out a policy nationwide, we care about ATE
  • Randomized experiments: RCT with random assignment naturally estimates ATE

2️⃣ ATT: Average Effect on the Treated

Definition

Average Treatment Effect on the Treated (ATT): The average causal effect for those who receive treatment

Meaning: For those who actually receive treatment, what is the average effect?
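
Formally, ATT = E[Y(1) - Y(0) | D = 1]: the same contrast as the ATE, averaged only over units that actually received treatment.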

Distinction from ATE

Continuing the example above:

python
# Assume only the first 3 people received treatment
data['D'] = [1, 1, 1, 0, 0]

# ATT: average effect among the treated only
ATT = data[data['D'] == 1]['tau'].mean()
print(f"ATT = {ATT:.0f} USD")  # ATT = 1333 USD

# ATE: average effect for the full sample
ATE = data['tau'].mean()
print(f"ATE = {ATE:.0f} USD")  # ATE = 1260 USD

When is ATT ≠ ATE?

Condition: Treatment effects are heterogeneous, and selection into treatment is related to the size of the individual effect

Case: Job training

  • People who voluntarily attend training may be more motivated
  • Their training gains may be larger
  • Therefore ATT > ATE, as the sketch below illustrates
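
This selection story can be made concrete with a small simulation (hypothetical numbers: the individual effect grows with motivation, and motivated workers are more likely to enroll):

python
import numpy as np

np.random.seed(0)
n = 10_000
motivation = np.random.normal(0, 1, n)

# Hypothetical: individual training effect grows with motivation
tau = 1000 + 500 * motivation
# Hypothetical: motivated people are more likely to self-select into training
enroll_prob = 1 / (1 + np.exp(-motivation))
D = np.random.binomial(1, enroll_prob)

print(f"ATE = {tau.mean():.0f} USD")          # ~1000
print(f"ATT = {tau[D == 1].mean():.0f} USD")  # > ATE, because of selection on gains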

Application Scenarios

  • Program evaluation: Assess how much benefit participants gained
  • Volunteer programs: Participants self-select, we care about their benefits

3️⃣ ATU: Average Effect on the Untreated

Definition

Average Treatment Effect on the Untreated (ATU): The average counterfactual effect for those who didn't receive treatment

Meaning: If we let those who didn't receive treatment receive it, what would be the average effect?
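
Formally, ATU = E[Y(1) - Y(0) | D = 0].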

Case

python
# ATU: average (counterfactual) effect among the control group only
ATU = data[data['D'] == 0]['tau'].mean()
print(f"ATU = {ATU:.0f} USD")  # ATU = 1150 USD

Policy Implications

  • Program expansion: If coverage is extended, the effect for the newly included participants is closer to the ATU
  • Marginal effect: When expanding from current participants to non-participants, the ATU is the relevant marginal benefit (see the decomposition sketch below)
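
ATE, ATT, and ATU are tied together by a weighted average: ATE = P(D=1) × ATT + P(D=0) × ATU. Continuing the five-person example above (3 of 5 treated), the decomposition reproduces the ATE exactly:

python
# Weighted-average decomposition, reusing data, ATT and ATU from the example above
p_treated = data['D'].mean()  # 0.6
ATE_check = p_treated * ATT + (1 - p_treated) * ATU
print(f"{p_treated:.1f} * {ATT:.0f} + {1 - p_treated:.1f} * {ATU:.0f} = {ATE_check:.0f} USD")
# 0.6 * 1333 + 0.4 * 1150 = 1260 USD, matching the ATE computed earlier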

4️⃣ LATE: Local Average Treatment Effect

Problem: Non-compliance

Perfect RCT Assumption:

  • Assigned to treatment group → 100% receive treatment
  • Assigned to control group → 0% receive treatment

Reality:

  • One-sided non-compliance
    • Assigned to treatment, some don't receive it
    • Assigned to control, none receive it
  • Two-sided non-compliance
    • Assigned to treatment, some don't receive it
    • Assigned to control, some obtain treatment

Case: Drug Clinical Trial

python
# RCT data
rct_data = pd.DataFrame({
    'id': range(10),
    'Z': [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],  # Z = random assignment
    'D': [1, 1, 0, 1, 1, 0, 1, 0, 0, 0],  # D = actually took drug
    'Y': [120, 115, 95, 125, 118, 90, 105, 88, 92, 85]  # Blood pressure
})

print(rct_data)

Observations:

  • ID 2: Assigned to treatment (Z=1) but didn't take the drug (D=0) → non-complier
  • ID 6: Assigned to control (Z=0) but obtained the drug (D=1) → always-taker; such cases make the non-compliance two-sided

Four Population Types

Based on potential treatment status under each assignment, D(Z=1) and D(Z=0), the population can be divided into:

| Type | D(Z=1) | D(Z=0) | Description |
| --- | --- | --- | --- |
| Compliers | 1 | 0 | Take treatment only if assigned |
| Always-takers | 1 | 1 | Always take treatment |
| Never-takers | 0 | 0 | Never take treatment |
| Defiers | 0 | 1 | Do the opposite of their assignment |

Assumption: Monotonicity — No defiers
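
Under monotonicity, the shares of the three remaining types are identified directly from the observed (Z, D) combinations; a quick sketch using the rct_data above:

python
# Type shares under monotonicity (no defiers)
p_always = rct_data.loc[rct_data['Z'] == 0, 'D'].mean()      # take the drug even when not assigned
p_never = 1 - rct_data.loc[rct_data['Z'] == 1, 'D'].mean()   # refuse the drug even when assigned
p_complier = 1 - p_always - p_never

print(f"Always-takers: {p_always:.0%}")    # 20%
print(f"Never-takers:  {p_never:.0%}")     # 20%
print(f"Compliers:     {p_complier:.0%}")  # 60%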

LATE Definition

Local Average Treatment Effect (LATE): The average causal effect for compliers

Meaning: For those who respond to random assignment, what is the average effect?
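
Formally, LATE = E[Y(1) - Y(0) | D(Z=1) > D(Z=0)], i.e. the effect averaged over compliers only.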

LATE Estimation: Instrumental Variables (IV)

Key Insight: Random assignment can serve as an instrument for actual treatment

IV (Wald) Estimator:

LATE = (E[Y | Z=1] - E[Y | Z=0]) / (E[D | Z=1] - E[D | Z=0])

Numerator: intention-to-treat effect (ITT). Denominator: compliance rate (the first-stage effect of assignment on take-up).

python
# Calculate LATE
ITT_Y = (rct_data[rct_data['Z'] == 1]['Y'].mean() -
         rct_data[rct_data['Z'] == 0]['Y'].mean())

ITT_D = (rct_data[rct_data['Z'] == 1]['D'].mean() -
         rct_data[rct_data['Z'] == 0]['D'].mean())

LATE = ITT_Y / ITT_D

print(f"ITT (Y): {ITT_Y:.2f}")
print(f"ITT (D): {ITT_D:.2f}")
print(f"LATE: {LATE:.2f}")

Python Implementation: 2SLS (Two-Stage Least Squares)

python
import statsmodels.api as sm
from statsmodels.sandbox.regression.gmm import IV2SLS

# First stage: D ~ Z (does assignment predict take-up?)
X1 = sm.add_constant(rct_data['Z'])
first_stage = sm.OLS(rct_data['D'], X1).fit()
print("First stage F-statistic:", first_stage.fvalue)

# 2SLS: regress Y on D, instrumenting D with the random assignment Z
exog = sm.add_constant(rct_data['D'])        # constant + endogenous treatment
instrument = sm.add_constant(rct_data['Z'])  # constant + instrument
iv_model = IV2SLS(rct_data['Y'], exog, instrument).fit()

print(iv_model.summary())

5️⃣ ITT: Intention-to-Treat Analysis

Definition

Intention-to-Treat (ITT): Effect grouped by random assignment (not actual treatment)

Meaning: Average difference between assigned to treatment (regardless of actual receipt) vs control
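
Formally, ITT = E[Y | Z = 1] - E[Y | Z = 0], which randomization identifies directly regardless of compliance.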

Importance of ITT

| Advantage | Explanation |
| --- | --- |
| Preserves randomization | Analysis groups by the random assignment Z, so there is no selection bias |
| Policy-relevant | Reflects the effect of "offering the opportunity" (not "forced participation") |
| Conservative estimate | Smaller in magnitude than the LATE, because non-compliers dilute the effect |
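
As a quick sketch on the drug-trial rct_data above, the ITT is just the difference in mean outcomes by assignment, with a t-test (or a regression of Y on Z) for inference:

python
from scipy import stats

# ITT: compare outcomes by assignment Z, ignoring actual take-up D
assigned = rct_data.loc[rct_data['Z'] == 1, 'Y']
not_assigned = rct_data.loc[rct_data['Z'] == 0, 'Y']

ITT = assigned.mean() - not_assigned.mean()
t_stat, p_value = stats.ttest_ind(assigned, not_assigned)
print(f"ITT = {ITT:.1f}, t = {t_stat:.2f}, p = {p_value:.3f}")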

ITT vs LATE

Derivation: under monotonicity and the exclusion restriction,

ITT = LATE × (E[D | Z=1] - E[D | Z=0]) = LATE × compliance rate

so LATE = ITT / compliance rate, which is exactly the Wald estimator above.

Case: Drug Trial

python
# Compliance rate = 80%
compliance_rate = 0.8

# True LATE = 25 (effect on compliers)
true_LATE = 25

# ITT = LATE × compliance rate
ITT = true_LATE * compliance_rate
print(f"ITT: {ITT:.2f}")  # ITT = 20

# Interpretation: Offering drug (regardless of taking it) reduces blood pressure by 20 points on average
# But for those who actually take it, the effect is 25 points

Treatment Effect Heterogeneity (CATE)

Definition

Conditional Average Treatment Effect (CATE): Conditional average effect given covariates

Meaning: Treatment effects may differ across subgroups
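
Formally, CATE(x) = E[Y(1) - Y(0) | X = x].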

Case: Education Training Heterogeneity

python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Generate data
np.random.seed(42)
n = 1000

data = pd.DataFrame({
    'baseline_skill': np.random.normal(50, 15, n),
    'treatment': np.random.binomial(1, 0.5, n)
})

# Heterogeneous effects: Higher baseline → larger effect
data['tau'] = 5 + 0.3 * data['baseline_skill']  # Individual effects
data['Y0'] = 50 + 0.8 * data['baseline_skill'] + np.random.normal(0, 10, n)
data['Y1'] = data['Y0'] + data['tau']
data['Y_obs'] = np.where(data['treatment'] == 1, data['Y1'], data['Y0'])

# CATE estimation: Grouped regression
# Divide baseline skill into three groups
data['skill_group'] = pd.cut(data['baseline_skill'], bins=3,
                              labels=['Low', 'Medium', 'High'])

cate_results = []
for group in ['Low', 'Medium', 'High']:
    group_data = data[data['skill_group'] == group]
    ATE_group = (group_data[group_data['treatment'] == 1]['Y_obs'].mean() -
                 group_data[group_data['treatment'] == 0]['Y_obs'].mean())
    cate_results.append({'Group': group, 'CATE': ATE_group})

cate_df = pd.DataFrame(cate_results)
print(cate_df)

# Visualization
plt.figure(figsize=(8, 6))
plt.bar(cate_df['Group'], cate_df['CATE'], color=['#3498db', '#2ecc71', '#e74c3c'])
plt.xlabel('Baseline Skill Group')
plt.ylabel('CATE (Treatment Effect)')
plt.title('Treatment Effect Heterogeneity: Higher Baseline → Larger Effect')
plt.grid(axis='y', alpha=0.3)
plt.show()
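
An alternative to splitting the sample into groups is a single regression with a treatment-by-covariate interaction; a minimal sketch on the same simulated data (the interaction coefficient recovers how the effect varies with baseline skill, which the simulation set to 0.3):

python
import statsmodels.formula.api as smf

# CATE via interaction: E[Y | T, X] = a + b*T + c*X + d*(T*X), so the effect at skill x is b + d*x
inter_model = smf.ols('Y_obs ~ treatment * baseline_skill', data=data).fit(cov_type='HC3')

print(inter_model.params[['treatment', 'treatment:baseline_skill']])
# The simulation's true values are 5 (base effect) and 0.3 (interaction)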

Machine Learning CATE Estimation

Causal Forest:

python
# Using the EconML library (developed by Microsoft Research)
from econml.dml import CausalForestDML
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier

# Train a causal forest (binary treatment, so discrete_treatment=True)
causal_forest = CausalForestDML(
    model_y=RandomForestRegressor(),
    model_t=RandomForestClassifier(),
    discrete_treatment=True,
    n_estimators=1000
)

X = data[['baseline_skill']]
causal_forest.fit(Y=data['Y_obs'], T=data['treatment'], X=X)

# Predict CATE for each individual
data['cate_pred'] = causal_forest.effect(X)

# Visualization
plt.figure(figsize=(10, 6))
plt.scatter(data['baseline_skill'], data['cate_pred'], alpha=0.3, s=10)
plt.xlabel('Baseline Skill')
plt.ylabel('Predicted CATE')
plt.title('Individual Treatment Effects Estimated by Causal Forest')
plt.show()

Relationships Between Different Effects

Mathematical Relationships

ATE = P(D=1) × ATT + P(D=0) × ATU

ITT = LATE × compliance rate (under monotonicity and the exclusion restriction)

Visualization

python
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(figsize=(10, 6))

# Illustrative (hypothetical) effect values for comparison, not estimated from data
effects = {
    'ATE': 10,
    'ATT': 12,
    'ATU': 8,
    'LATE': 15,
    'ITT': 9
}

colors = ['#3498db', '#2ecc71', '#e74c3c', '#f39c12', '#9b59b6']
bars = ax.barh(list(effects.keys()), list(effects.values()), color=colors)

# Add value labels
for i, (k, v) in enumerate(effects.items()):
    ax.text(v + 0.5, i, f'{v}', va='center', fontweight='bold')

ax.set_xlabel('Treatment Effect Size')
ax.set_title('Comparison of Different Average Effects')
ax.set_xlim(0, 17)
ax.grid(axis='x', alpha=0.3)

# Add annotations
ax.annotate('LATE > ATT: Compliers have larger effects',
            xy=(15, 3), xytext=(13, 4.5),
            arrowprops=dict(arrowstyle='->', color='gray'))

ax.annotate('ITT < ATE: Non-compliance causes dilution',
            xy=(9, 4), xytext=(11, 3.5),
            arrowprops=dict(arrowstyle='->', color='gray'))

plt.tight_layout()
plt.show()

Complete Python Practice: Full Case Study

Case: Online Course RCT (with Non-compliance)

python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import statsmodels.api as sm
from linearmodels.iv import IV2SLS

# ==== 1. Generate Data ====
np.random.seed(42)
n = 1000

# Individual characteristics
data = pd.DataFrame({
    'id': range(n),
    'motivation': np.random.normal(50, 15, n),  # Learning motivation
    'baseline_score': np.random.normal(60, 10, n)  # Baseline score
})

# Random assignment (RCT)
data['Z'] = np.random.binomial(1, 0.5, n)

# Actual participation (non-compliance)
# High-motivation individuals more likely to comply
prob_comply = 1 / (1 + np.exp(-(data['motivation'] - 50) / 10))
data['comply'] = np.random.binomial(1, prob_comply)

# Actual treatment received
data['D'] = data['Z'] * data['comply']

# Potential outcomes
# Effect heterogeneity: Higher motivation → larger effect
data['tau'] = 5 + 0.2 * data['motivation']
data['Y0'] = 60 + 0.3 * data['motivation'] + 0.5 * data['baseline_score']
data['Y1'] = data['Y0'] + data['tau'] + np.random.normal(0, 5, n)
data['Y_obs'] = np.where(data['D'] == 1, data['Y1'], data['Y0'])

# ==== 2. Estimate Different Effects ====
print("=" * 60)
print("Estimation of Different Average Effects")
print("=" * 60)

# (1) True ATE (God's perspective)
true_ATE = data['tau'].mean()
print(f"\nTrue ATE: {true_ATE:.2f}")

# (2) Simple comparison (biased! due to self-selection into treatment)
naive = (data[data['D'] == 1]['Y_obs'].mean() -
         data[data['D'] == 0]['Y_obs'].mean())
print(f"Simple comparison: {naive:.2f} (biased!)")

# (3) ITT: Compare by random assignment
ITT_Y = (data[data['Z'] == 1]['Y_obs'].mean() -
         data[data['Z'] == 0]['Y_obs'].mean())
print(f"ITT: {ITT_Y:.2f}")

# (4) Compliance rate
compliance_rate = (data[data['Z'] == 1]['D'].mean() -
                   data[data['Z'] == 0]['D'].mean())
print(f"Compliance rate: {compliance_rate:.2%}")

# (5) LATE (IV estimate)
LATE = ITT_Y / compliance_rate
print(f"LATE: {LATE:.2f}")

# ==== 3. Regression Estimation ====
print("\n" + "=" * 60)
print("Regression Estimation")
print("=" * 60)

# ITT regression
X_itt = sm.add_constant(data['Z'])
itt_model = sm.OLS(data['Y_obs'], X_itt).fit(cov_type='HC3')
print("\nITT regression:")
print(f"  Coefficient: {itt_model.params['Z']:.2f}")
print(f"  SE: {itt_model.bse['Z']:.2f}")
print(f"  p-value: {itt_model.pvalues['Z']:.4f}")

# 2SLS (LATE)
iv_model = IV2SLS(
    dependent=data['Y_obs'],
    exog=sm.add_constant(np.ones(n)),
    endog=data[['D']],
    instruments=data[['Z']]
).fit(cov_type='robust')

print("\n2SLS (LATE):")
print(f"  Coefficient: {iv_model.params['D']:.2f}")
print(f"  SE: {iv_model.std_errors['D']:.2f}")
print(f"  F-stat (first stage): {iv_model.f_statistic.stat:.2f}")

# ==== 4. CATE Estimation ====
print("\n" + "=" * 60)
print("Heterogeneity Analysis (CATE)")
print("=" * 60)

# Group by motivation
data['motivation_group'] = pd.qcut(data['motivation'], q=3,
                                    labels=['Low Motivation', 'Medium Motivation', 'High Motivation'])

for group in ['Low Motivation', 'Medium Motivation', 'High Motivation']:
    group_data = data[data['motivation_group'] == group]

    # ITT by group
    itt_group = (group_data[group_data['Z'] == 1]['Y_obs'].mean() -
                 group_data[group_data['Z'] == 0]['Y_obs'].mean())

    # Compliance rate by group
    comp_group = (group_data[group_data['Z'] == 1]['D'].mean() -
                  group_data[group_data['Z'] == 0]['D'].mean())

    # LATE by group
    late_group = itt_group / comp_group if comp_group > 0 else np.nan

    print(f"\n{group}:")
    print(f"  ITT: {itt_group:.2f}")
    print(f"  Compliance rate: {comp_group:.2%}")
    print(f"  LATE: {late_group:.2f}")

# ==== 5. Visualization ====
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Plot 1: Non-compliance situation
comply_counts = data.groupby(['Z', 'D']).size().unstack(fill_value=0)
comply_counts.plot(kind='bar', ax=axes[0, 0], color=['skyblue', 'salmon'])
axes[0, 0].set_xlabel('Random Assignment (Z)')
axes[0, 0].set_ylabel('Count')
axes[0, 0].set_title('Non-compliance Situation')
axes[0, 0].legend(['Did not receive treatment', 'Received treatment'])
axes[0, 0].set_xticklabels(['Control group', 'Treatment group'], rotation=0)

# Plot 2: ITT vs LATE
effects = ['ITT', 'LATE', 'True ATE']
values = [ITT_Y, LATE, true_ATE]
axes[0, 1].bar(effects, values, color=['#3498db', '#e74c3c', '#2ecc71'])
axes[0, 1].set_ylabel('Effect Size')
axes[0, 1].set_title('Comparison of Different Effect Estimates')
axes[0, 1].grid(axis='y', alpha=0.3)

# Plot 3: Heterogeneity (CATE)
cate_data = []
for group in ['Low Motivation', 'Medium Motivation', 'High Motivation']:
    group_data = data[data['motivation_group'] == group]
    itt = (group_data[group_data['Z'] == 1]['Y_obs'].mean() -
           group_data[group_data['Z'] == 0]['Y_obs'].mean())
    cate_data.append(itt)

axes[1, 0].bar(['Low', 'Medium', 'High'], cate_data,
               color=['#3498db', '#2ecc71', '#e74c3c'])
axes[1, 0].set_ylabel('ITT')
axes[1, 0].set_title('Treatment Effect Heterogeneity')
axes[1, 0].grid(axis='y', alpha=0.3)

# Plot 4: Scatter plot (motivation vs effect)
treated = data[data['Z'] == 1]
control = data[data['Z'] == 0]
axes[1, 1].scatter(treated['motivation'], treated['Y_obs'],
                   alpha=0.3, s=10, label='Treatment', color='salmon')
axes[1, 1].scatter(control['motivation'], control['Y_obs'],
                   alpha=0.3, s=10, label='Control', color='skyblue')
axes[1, 1].set_xlabel('Learning Motivation')
axes[1, 1].set_ylabel('Final Score')
axes[1, 1].set_title('Motivation vs Score (by Assignment Group)')
axes[1, 1].legend()

plt.tight_layout()
plt.savefig('ate_analysis.png', dpi=300, bbox_inches='tight')
plt.show()

print("\n✓ Analysis complete!")

Summary

Core Concept Comparison

| Effect | Definition | Estimator | Application Scenario |
| --- | --- | --- | --- |
| ATE | Full-sample average effect | Simple difference of means (RCT) | Universal policy rollout |
| ATT | Average effect on the treated | Matching, DID | Program evaluation |
| ATU | Average effect on the untreated | Matching | Program expansion |
| LATE | Average effect on compliers | IV / 2SLS | RCT with non-compliance |
| ITT | Intention-to-treat effect | Group by assignment | Policy effect (conservative) |
| CATE | Conditional average effect | Subgroup analysis / ML | Heterogeneity analysis |

Key Insights

  1. Non-compliance Problem

    • ITT preserves randomization but is diluted toward zero by non-compliance
    • LATE recovers the complier effect through IV: LATE = ITT / compliance rate
    • Equivalently, compliance rate = ITT / LATE
  2. Heterogeneity Analysis

    • CATE reveals differential effects across subgroups
    • Helps with precision targeting and resource optimization
  3. Effect Selection

    • Policymakers care about ITT (effect of offering opportunity)
    • Researchers care about LATE (true mechanism)
    • Practitioners care about CATE (personalized intervention)

Practice Questions

  1. Understanding question: Why is ITT called a "conservative estimate"? When might ITT be more policy-relevant?

  2. Calculation question: An education RCT finds:

    • ITT = 0.2 standard deviations
    • Compliance rate = 60%

    Questions: What is LATE? If everyone participated (100% compliance), what would be the expected effect?

  3. Design question: You're studying "the causal effect of a fitness app on weight loss."

    • (a) Define ATE, ATT, LATE
    • (b) How to estimate CATE (age, gender, baseline BMI)
    • (c) Which effect do you expect to be largest? Why?

Answer hints:

Question 1:

  • ITT is "conservative" because it includes non-compliers (dilutes effect)
  • Policy relevance: Reflects actual effect of "offering program opportunity" (can't force everyone to participate)

Question 2:

  • LATE = ITT / compliance rate = 0.2 / 0.6 = 0.33 standard deviations
  • Expected effect with 100% compliance ≈ LATE = 0.33 (assuming LATE = ATE)

Question 3:

  • (c) Expect ATT > ATE > ATU (self-selection effect: those who voluntarily use app have better outcomes)

Next Steps

In the next section, we'll learn about Identification Strategies and Validity, diving deep into core assumptions of causal inference (SUTVA, independence, etc.).

Keep going! 🚀


References:

  • Angrist, J. D., Imbens, G. W., & Rubin, D. B. (1996). "Identification of causal effects using instrumental variables". JASA.
  • Imbens, G. W., & Angrist, J. D. (1994). "Identification and estimation of local average treatment effects". Econometrica.
  • Athey, S., & Imbens, G. (2016). "Recursive partitioning for heterogeneous causal effects". PNAS.
