
9.4 Placebo Tests

A Critical Method for Validating DID Robustness


Learning Objectives

  • Understand the rationale and role of placebo tests
  • Master common placebo test designs and applicable scenarios
  • Implement common placebo tests using Python

I. Why Placebo Tests Are Needed

In the DID framework, the core identifying assumption is parallel trends. Even after pre-trend tests pass, we still need placebo tests to verify a further claim: if the policy were "fake" (assigned at the wrong time, to the wrong group, or evaluated on an unrelated outcome), the estimated "policy effect" should be close to 0 and statistically insignificant. If a fake policy still produces a significant effect, the model is likely attributing other factors to the policy.

Common motivations:

  • Rule out coincidence (other events occurring at the same time point)
  • Rule out selection (spurious effects driven by pre-existing differences between treatment and control groups)
  • Rule out specification problems (inappropriate controls, clustering choices, trend specifications, etc.)

II. Common Placebo Test Designs

  1. Placebo Time

    • Approach: Move the assumed policy implementation time several periods earlier (or later), rebuild the post indicator, and re-estimate the DID; placing the fake cut inside the pre-period is the cleanest variant.
    • Expectation: Coefficients close to 0 and insignificant.
    • Applicable: When the time dimension is long and there are no other closely spaced interventions.
  2. Placebo Group

    • Approach: Randomly designate unaffected units similar to the treated units as a fake "treatment group" and re-estimate.
    • Expectation: The distribution of placebo DID coefficients is centered at 0.
    • Applicable: When there are many cross-sectional units.
  3. Leave-One-Out Robustness (excluding one treated unit/region at a time)

    • Approach: Exclude one treated unit at a time and re-estimate (a sketch appears after the main templates in Section III).
    • Expectation: Results are not driven by any single unit.
  4. Randomization Inference (Permutation Test)

    • Approach: Keep the time structure fixed, randomly reshuffle the treatment assignment, repeat B times, and build a distribution of "fake effects" (a sketch appears after the main templates in Section III).
    • Expectation: The true effect lies in the tail of the permutation distribution, yielding an approximate p-value.
  5. Placebo Outcome Variable

    • Approach: Run the same DID on an outcome that theory says the policy should not affect.
    • Expectation: Estimated coefficient ≈ 0.
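
For the permutation test in particular, the approximate two-sided p-value is commonly computed as

p = (1 + #{ b : |β̂_b| ≥ |β̂_true| }) / (1 + B)

where β̂_b are the B placebo estimates; the +1 in numerator and denominator keeps the p-value strictly positive. The same formula appears in the permutation_test sketch in Section III.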

III. Minimal Python Implementation Templates

For quick adoption, below are minimal, reproducible placebo-test code templates (consistent with the style of 9.3, UTF-8 encoding).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# df: panel with columns y, treated (0/1 ever-treated flag), time (integer period), post (0/1), id (entity)

def did_once(df):
    """Standard two-way fixed-effects DID; returns the coefficient on treated×post.
    The main effects of treated and post are absorbed by the entity/time fixed
    effects, so only the interaction enters the formula."""
    model = smf.ols('y ~ treated:post + C(id) + C(time)', data=df).fit(
        cov_type='cluster', cov_kwds={'groups': df['id']})
    return model.params.get('treated:post', np.nan)

def placebo_time(df, shift=-2):
    """Move the policy cut by `shift` periods, rebuild post, and re-estimate DID.
    A negative shift places the fake cut inside the pre-period; the true
    post-period observations are dropped so the real effect cannot contaminate
    the placebo."""
    df2 = df.copy()
    true_cut = df2.loc[df2['post'] == 1, 'time'].min()  # true policy period
    if shift < 0:
        df2 = df2[df2['time'] < true_cut]  # keep pre-treatment periods only
    df2['post'] = (df2['time'] >= true_cut + shift).astype(int)
    return did_once(df2)

def placebo_group(df, n_draw=200, seed=42):
    """Randomly designate control units as a fake "treatment group" of the same
    size as the true one; repeat n_draw times and return the placebo-effect
    distribution. True treated units are dropped so the real effect cannot leak in."""
    rng = np.random.default_rng(seed)
    control_ids = df.loc[df['treated'] == 0, 'id'].unique()
    k = df.loc[df['treated'] == 1, 'id'].nunique()  # requires enough control units
    base = df[df['treated'] == 0]  # controls only
    fake_ate = []
    for _ in range(n_draw):
        fake_treated = set(rng.choice(control_ids, size=k, replace=False))
        df2 = base.copy()
        df2['treated'] = df2['id'].isin(fake_treated).astype(int)
        fake_ate.append(did_once(df2))
    return pd.Series(fake_ate, name='placebo_group_ate')
```
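
Section II also mentions leave-one-out robustness and randomization inference. Below is a minimal sketch of both, reusing did_once above; the function names leave_one_out and permutation_test are illustrative, not from any library.

```python
def leave_one_out(df):
    """Exclude one treated unit at a time and re-estimate DID; returns one
    estimate per excluded unit (index = excluded id)."""
    treated_ids = df.loc[df['treated'] == 1, 'id'].unique()
    return pd.Series({uid: did_once(df[df['id'] != uid]) for uid in treated_ids},
                     name='leave_one_out_ate')

def permutation_test(df, n_perm=500, seed=42):
    """Randomization inference: keep the time structure, reshuffle which units
    are treated (same number as the true treated group), re-estimate DID each
    time, and return the permutation distribution plus the two-sided p-value
    p = (1 + #{|beta_b| >= |beta_true|}) / (1 + n_perm)."""
    rng = np.random.default_rng(seed)
    ids = df['id'].unique()
    k = df.loc[df['treated'] == 1, 'id'].nunique()
    true_ate = did_once(df)
    perm = []
    for _ in range(n_perm):
        fake = set(rng.choice(ids, size=k, replace=False))
        df2 = df.copy()
        df2['treated'] = df2['id'].isin(fake).astype(int)
        perm.append(did_once(df2))
    perm = pd.Series(perm, name='permutation_ate')
    p_value = (1 + (perm.abs() >= abs(true_ate)).sum()) / (1 + n_perm)
    return perm, p_value
```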

Usage recommendation: first run the true DID once and record the ATE; then run placebo_time / placebo_group (and, if needed, the leave_one_out / permutation_test sketches above), plot the placebo distribution, and mark where the true ATE falls.
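
A minimal usage sketch under the assumptions above (df is the panel described in the code comments; 'y_placebo' is a hypothetical column holding an outcome the policy should not affect):

```python
# 1. True effect
true_ate = did_once(df)

# 2. Placebo time: several fake cuts inside the pre-period
time_placebos = {s: placebo_time(df, shift=s) for s in (-4, -3, -2)}

# 3. Placebo group: empirical two-sided p-value from the fake-treatment draws
fake = placebo_group(df, n_draw=200)
p_group = (1 + (fake.abs() >= abs(true_ate)).sum()) / (1 + len(fake))

# 4. Placebo outcome ('y_placebo' is a hypothetical column name)
placebo_outcome_ate = did_once(df.assign(y=df['y_placebo']))

print(true_ate, time_placebos, p_group, placebo_outcome_ate)
```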


IV. Interpreting Results and Considerations

  • Placebo Time: Across a series of fake time points, coefficients should fluctuate around 0; significant placebo effects signal temporal confounding.
  • Placebo Group: The placebo-effect distribution should be centered at 0, with the true ATE in its tail.
  • Leave-One-Out: If results remain stable no matter which unit is excluded, they are not driven by one special unit.
  • Permutation Test: The permutation p-value can be reported as robustness evidence.
  • Placebo Outcome: Explain clearly why the chosen outcome should not be affected by the policy.
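
To show that the true ATE falls in the tail, a histogram of the placebo distribution with the true estimate marked is usually enough. A minimal matplotlib sketch, assuming fake and true_ate come from the usage example above:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
ax.hist(fake, bins=30, color='lightgray', edgecolor='white')  # placebo estimates
ax.axvline(true_ate, color='red', linestyle='--',
           label=f'true ATE = {true_ate:.3f}')
ax.set_xlabel('placebo DID coefficient')
ax.set_ylabel('frequency')
ax.legend()
plt.tight_layout()
plt.show()
```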

Technical details:

  • Clustered standard errors: Panel data are commonly clustered at the entity level (or two-way clustered); see the sketch below.
  • Multi-period/staggered treatment: Prefer an event-study design (see 9.3) or recent multi-period DID estimators.
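
statsmodels' OLS does not offer two-way clustering directly; the linearmodels package does. A minimal sketch, assuming linearmodels is installed and df has the same columns as above:

```python
from linearmodels.panel import PanelOLS

# PanelOLS expects an (entity, time) MultiIndex
panel = df.set_index(['id', 'time'])
res = PanelOLS.from_formula(
    'y ~ treated:post + EntityEffects + TimeEffects', data=panel
).fit(cov_type='clustered', cluster_entity=True, cluster_time=True)
print(res.params['treated:post'], res.pvalues['treated:post'])
```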

Section Summary

  • Placebo tests are "counterfactual stress tests" for DID identifying assumptions.
  • Common methods: fake time points, fake control groups, leave-one-out, permutation tests, placebo outcomes.
  • The key expectation is "close to 0 and insignificant"; if a placebo effect is significant, return to the identification and specification stage to find the cause.
  • When reporting, combine plots (distributions/event studies), tables (ATEs and p-values), and textual explanation.

Previous: 9.3 Parallel Trends Assumption | Next: 9.5 Classic Cases and Python Implementation
