9.4 Placebo Tests
A Critical Method for Validating DID Robustness
Learning Objectives
- Understand the rationale and role of placebo tests
- Master common placebo test designs and applicable scenarios
- Implement common placebo tests using Python
I. Why Placebo Tests Are Needed
In the DID framework, the core identifying assumption is parallel trends. Even after pre-trend tests, we still need placebo tests to verify a counterfactual claim: if the policy were "fake" (implemented at the wrong time, assigned to the wrong group, or evaluated on an unrelated outcome), the estimated "policy effect" should be close to 0 and insignificant. Otherwise, the model may be attributing other factors to the policy.
Common motivations:
- Exclude coincidence (other events happening at the same time point)
- Exclude selectivity (spurious effects caused by differences between treatment and control groups)
- Exclude model specification issues (inappropriate controls, clustering methods, trend specifications, etc.)
II. Common Placebo Test Methods (Recommended Priority Order)
Placebo Time
- Approach: Shift the assumed policy date several periods earlier (or later, using only data unaffected by the real policy) and re-estimate DID.
- Expectation: Coefficients should be close to 0 and insignificant.
- Applicable: When the time dimension is long and there are no other densely spaced interventions.
Placebo Group
- Approach: Randomly designate unaffected units similar to the treatment group as a fake "treatment group" and re-estimate DID.
- Expectation: The DID coefficient distribution is centered at 0.
- Applicable: When there are many cross-sectional units.
Leave-One-Out Robustness (exclude one treated unit/region at a time)
- Approach: Exclude one treated unit at a time and re-estimate (a sketch appears at the end of Section III).
- Expectation: Results are not driven by any single unit.
Randomization Inference (Permutation Test)
- Approach: Keep the time structure, randomly reshuffle the treatment assignment, repeat B times, and form the distribution of "placebo effects" (a sketch appears at the end of Section III).
- Expectation: The true effect lies in the tail of the permutation distribution, yielding an approximate p-value.
Placebo Outcome Variable
- Approach: Select an outcome variable "theoretically unaffected by the policy" and run DID.
- Expectation: Estimated coefficient ≈ 0.
III. Minimal Python Implementation Templates
For quick adoption, below are minimal, replicable "placebo test" code templates (consistent with 9.3 style, UTF-8 encoding).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
# df: contains columns y, treated(0/1), time(integer), post(0/1), id
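# --- Optional: a hypothetical simulated panel for trying out the templates ---
# (illustrative only, not from the original text; replace with your own data)
def make_fake_panel(n_id=40, n_t=10, cut=6, effect=2.0, seed=0):
    """Simulate a balanced panel: units 0-9 treated from period `cut` on,
    with a true effect of `effect` added to treated post-period outcomes."""
    rng = np.random.default_rng(seed)
    ids = np.repeat(np.arange(n_id), n_t)
    time = np.tile(np.arange(n_t), n_id)
    treated = (ids < 10).astype(int)
    post = (time >= cut).astype(int)
    y = 0.1 * ids + 0.3 * time + effect * treated * post + rng.normal(0, 1, n_id * n_t)
    return pd.DataFrame({'id': ids, 'time': time, 'treated': treated, 'post': post, 'y': y})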
def did_once(df):
    """Two-way fixed-effects DID regression (simplest form; extend as needed).

    The treated and post main effects are absorbed by the entity and time
    fixed effects, so only the interaction enters the formula."""
    model = smf.ols('y ~ treated:post + C(id) + C(time)', data=df).fit(
        cov_type='cluster', cov_kwds={'groups': df['id']})
    return model.params.get('treated:post', np.nan)
def placebo_time(df, shift=2):
    """Move the policy date `shift` periods earlier, rebuild post, and
    re-estimate DID on pre-treatment data only."""
    df2 = df.copy()
    true_cut = df2.loc[df2['post'] == 1, 'time'].min()  # true policy period
    fake_cut = true_cut - shift                          # fake, earlier cutoff
    # Keep only pre-treatment periods so the real effect cannot contaminate the placebo
    df2 = df2[df2['time'] < true_cut].copy()
    df2['post'] = (df2['time'] >= fake_cut).astype(int)
    return did_once(df2)
def placebo_group(df, n_draw=200, seed=42):
    """Randomly relabel control units (same number as the true treatment
    group) as a fake "treatment group", repeat n_draw times, and return
    the placebo-effect distribution."""
    rng = np.random.default_rng(seed)
    treated_ids = df.loc[df['treated'] == 1, 'id'].unique()
    control_ids = np.setdiff1d(df['id'].unique(), treated_ids)
    k = len(treated_ids)
    assert k <= len(control_ids), "not enough control units to relabel"
    # Estimate among controls only, so the real effect cannot contaminate the placebo
    df_ctrl = df[~df['id'].isin(treated_ids)].copy()
    fake_ate = []
    for _ in range(n_draw):
        fake_treated = set(rng.choice(control_ids, size=k, replace=False))
        df2 = df_ctrl.copy()
        df2['treated'] = df2['id'].isin(fake_treated).astype(int)
        fake_ate.append(did_once(df2))
    return pd.Series(fake_ate, name='placebo_group_ate')

Usage recommendation: First run the true DID once and record the ATE; then run placebo_time / placebo_group, plot the placebo distribution, and mark where the true ATE falls.
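The leave-one-out and permutation designs from Section II can be implemented in the same style. Below is a minimal sketch under the same assumed column layout; leave_one_out and permutation_test are illustrative names, not library functions.

def leave_one_out(df):
    """Exclude one treated unit at a time and re-estimate DID;
    returns a Series of ATEs indexed by the excluded unit."""
    treated_ids = df.loc[df['treated'] == 1, 'id'].unique()
    ates = {uid: did_once(df[df['id'] != uid]) for uid in treated_ids}
    return pd.Series(ates, name='leave_one_out_ate')

def permutation_test(df, n_perm=500, seed=42):
    """Randomization inference: keep the time structure, reshuffle the
    treatment label across units n_perm times, and compare the true DID
    coefficient with the permutation distribution."""
    rng = np.random.default_rng(seed)
    ids = df['id'].unique()
    k = int((df.groupby('id')['treated'].first() == 1).sum())  # number of treated units
    true_ate = did_once(df)
    perm_ate = []
    for _ in range(n_perm):
        fake_treated = set(rng.choice(ids, size=k, replace=False))
        df2 = df.copy()
        df2['treated'] = df2['id'].isin(fake_treated).astype(int)
        perm_ate.append(did_once(df2))
    perm_ate = pd.Series(perm_ate, name='perm_ate')
    # Two-sided permutation p-value: share of placebo effects at least as extreme
    p_value = float((perm_ate.abs() >= abs(true_ate)).mean())
    return true_ate, perm_ate, p_value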
IV. Interpreting Results and Considerations
- Placebo Time: Across a series of fake time points, coefficients should fluctuate around 0; if any are significant, beware of temporal confounding.
- Placebo Group: The placebo-effect distribution should be centered at 0, and the true ATE should fall in its tail (see the sketch after this list).
- Leave-One-Out: If results remain stable no matter which unit is excluded, the effect is not driven by a single special unit.
- Permutation Test: The permutation p-value can be reported as robustness evidence.
- Placebo Outcome: Clearly explain why this outcome should not be affected by the policy.
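As a concrete illustration of the suggestions above, the following sketch plots a placebo distribution, marks the true ATE, and reports an approximate p-value. It assumes matplotlib is installed and reuses the hypothetical helpers from Section III.

import matplotlib.pyplot as plt

df = make_fake_panel()                    # hypothetical simulated panel
true_ate = did_once(df)                   # true DID estimate
placebo = placebo_group(df, n_draw=200)   # placebo-group effect distribution

# Histogram of placebo effects; the true ATE should sit in the tail
plt.hist(placebo, bins=30)
plt.axvline(true_ate, color='red', linestyle='--', label=f'true ATE = {true_ate:.2f}')
plt.legend()
plt.title('Placebo-group distribution vs. true ATE')
plt.show()

# Approximate two-sided p-value from the placebo distribution
p_val = (placebo.abs() >= abs(true_ate)).mean()
print(f'placebo p-value: {p_val:.3f}')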
Technical details:
- Clustered standard errors: With panel data, cluster at the entity level (or use two-way clustering).
- Multi-period/staggered treatment: Use an event-study specification (see 9.3) or recent multi-period (staggered) DID estimators.
Section Summary
- Placebo tests are "counterfactual stress tests" for DID identifying assumptions.
- Common methods: fake time points, fake control groups, leave-one-out, permutation tests, placebo outcomes.
- The key expectation is "close to 0 and insignificant"; if a placebo effect is significant, return to the identification and specification stage to find the cause.
- When reporting, recommend combining: plots (distributions/event studies), tables (ATE and p-values), and textual explanations.