
5.6 Interpretation and Reporting

"If you torture the data long enough, it will confess to anything."— Ronald Coase, 1991 Nobel Laureate in Economics

From Regression Output to Academic Publication: Professionally Presenting Your Research



Section Objectives

After completing this section, you will be able to:

  • Correctly interpret coefficients from different model forms
  • Distinguish statistical significance from substantive significance
  • Produce publication-grade regression tables
  • Write standardized regression result reports
  • Visualize regression results
  • Understand the limitations of causal inference

The Art of Coefficient Interpretation

Four Classic Model Forms

| Model Form | Equation | Interpretation | Example |
| --- | --- | --- | --- |
| Level-Level | y = β₀ + β₁x | x increases by 1 unit, y increases by β₁ units | Each additional year of education increases wage by 2.5 thousand yuan |
| Log-Level | log(y) = β₀ + β₁x | x increases by 1 unit, y increases by approximately 100·β₁% | Each additional year of education increases wage by 8% |
| Level-Log | y = β₀ + β₁·log(x) | x increases by 1%, y increases by β₁/100 units | GDP increases by 1%, unemployed population decreases by 0.03 million |
| Log-Log | log(y) = β₀ + β₁·log(x) | x increases by 1%, y increases by β₁% (elasticity) | Price increases by 1%, demand decreases by 1.5% |

Level-Level Model

python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt

# Generate data
np.random.seed(42)
n = 200
education = np.random.normal(13, 3, n)
wage = 10 + 2.5 * education + np.random.normal(0, 5, n)

df = pd.DataFrame({'wage': wage, 'education': education})

# Level-Level regression
model_ll = smf.ols('wage ~ education', data=df).fit()
print("Level-Level Model:")
print(model_ll.summary())

# Interpretation
beta_1 = model_ll.params['education']
print(f"\nInterpretation: Each additional year of education increases wage by {beta_1:.2f} thousand yuan/month")

Log-Level Model (Most Commonly Used)

python
# Log-Level regression
df['log_wage'] = np.log(df['wage'])
model_logl = smf.ols('log_wage ~ education', data=df).fit()
print("\nLog-Level Model:")
print(model_logl.summary())

# Interpretation (approximate)
beta_1_log = model_logl.params['education']
print(f"\nApproximate interpretation: Each additional year of education increases wage by approximately {beta_1_log*100:.2f}%")

# Exact interpretation
print(f"Exact interpretation: Each additional year of education increases wage by {(np.exp(beta_1_log)-1)*100:.2f}%")

When to Use Approximate vs Exact:

  • Small coefficients (roughly |β₁| < 0.1): the approximate and exact interpretations are nearly identical
  • Larger coefficients: use the exact interpretation, 100·(e^β₁ − 1)% (see the quick check below)
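
A quick numeric comparison makes the rule of thumb concrete (a minimal sketch; the β values here are illustrative, not taken from the models above):

python
import numpy as np

# Compare the approximate (100·β) and exact (100·(e^β − 1)) percentage interpretations
for beta in [0.01, 0.05, 0.10, 0.30, 0.50]:
    approx = 100 * beta
    exact = 100 * (np.exp(beta) - 1)
    print(f"β = {beta:.2f}: approx = {approx:5.1f}%, exact = {exact:5.1f}%, "
          f"difference = {exact - approx:.2f} pp")

The gap is negligible below about 0.1 and grows quickly beyond that, which is why the exact formula matters for large coefficients.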

Level-Log Model

python
# Level-Log regression
df['log_education'] = np.log(df['education'])
model_llevl = smf.ols('wage ~ log_education', data=df).fit()
print("\nLevel-Log Model:")
print(model_llevl.summary())

# Interpretation
beta_1_llevl = model_llevl.params['log_education']
print(f"\nInterpretation: Education increases by 1%, wage increases by {beta_1_llevl/100:.4f} thousand yuan")
print(f"Or: Education increases by 10%, wage increases by {beta_1_llevl*0.1:.3f} thousand yuan")

Log-Log Model (Elasticity Model)

python
# Log-Log regression
model_loglog = smf.ols('log_wage ~ log_education', data=df).fit()
print("\nLog-Log Model:")
print(model_loglog.summary())

# Interpretation
elasticity = model_loglog.params['log_education']
print(f"\nInterpretation: Education-wage elasticity = {elasticity:.3f}")
print(f"That is: Education increases by 1%, wage increases by {elasticity:.3f}%")

Visualizing Model Comparisons

python
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# 1. Level-Level
axes[0, 0].scatter(df['education'], df['wage'], alpha=0.5)
axes[0, 0].plot(df['education'], model_ll.fittedvalues, 'r-', linewidth=2)
axes[0, 0].set_xlabel('Education (years)')
axes[0, 0].set_ylabel('Wage (thousands)')
axes[0, 0].set_title('Level-Level: Wage = β₀ + β₁·Education')
axes[0, 0].grid(True, alpha=0.3)

# 2. Log-Level
axes[0, 1].scatter(df['education'], df['log_wage'], alpha=0.5)
axes[0, 1].plot(df['education'], model_logl.fittedvalues, 'r-', linewidth=2)
axes[0, 1].set_xlabel('Education (years)')
axes[0, 1].set_ylabel('log(Wage)')
axes[0, 1].set_title('Log-Level: log(Wage) = β₀ + β₁·Education')
axes[0, 1].grid(True, alpha=0.3)

# 3. Level-Log
axes[1, 0].scatter(df['log_education'], df['wage'], alpha=0.5)
axes[1, 0].plot(df['log_education'], model_llevl.fittedvalues, 'r-', linewidth=2)
axes[1, 0].set_xlabel('log(Education)')
axes[1, 0].set_ylabel('Wage (thousands)')
axes[1, 0].set_title('Level-Log: Wage = β₀ + β₁·log(Education)')
axes[1, 0].grid(True, alpha=0.3)

# 4. Log-Log
axes[1, 1].scatter(df['log_education'], df['log_wage'], alpha=0.5)
axes[1, 1].plot(df['log_education'], model_loglog.fittedvalues, 'r-', linewidth=2)
axes[1, 1].set_xlabel('log(Education)')
axes[1, 1].set_ylabel('log(Wage)')
axes[1, 1].set_title('Log-Log: log(Wage) = β₀ + β₁·log(Education)')
axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

Publication-Grade Regression Tables

Stargazer-Style Tables with statsmodels' summary_col

python
from statsmodels.iolib.summary2 import summary_col

# Generate complete data
np.random.seed(123)
n = 500
education = np.random.normal(13, 3, n)
experience = np.random.uniform(0, 30, n)
female = np.random.binomial(1, 0.5, n)
married = np.random.binomial(1, 0.6, n)

log_wage = (1.5 + 0.08*education + 0.03*experience - 0.0005*experience**2
            - 0.15*female + 0.05*married + np.random.normal(0, 0.3, n))

df = pd.DataFrame({
    'log_wage': log_wage,
    'education': education,
    'experience': experience,
    'experience_sq': experience**2,
    'female': female,
    'married': married
})

# Estimate multiple models
model1 = smf.ols('log_wage ~ education', data=df).fit(cov_type='HC3')
model2 = smf.ols('log_wage ~ education + experience + I(experience**2)',
                 data=df).fit(cov_type='HC3')
model3 = smf.ols('log_wage ~ education + experience + I(experience**2) + female',
                 data=df).fit(cov_type='HC3')
model4 = smf.ols('log_wage ~ education + experience + I(experience**2) + female + married',
                 data=df).fit(cov_type='HC3')

# Create comparison table
results_table = summary_col(
    [model1, model2, model3, model4],
    model_names=['(1)', '(2)', '(3)', '(4)'],
    stars=True,
    float_format='%.3f',
    info_dict={
        'N': lambda x: f"{int(x.nobs)}",
        'R²': lambda x: f"{x.rsquared:.3f}",
        'Adj. R²': lambda x: f"{x.rsquared_adj:.3f}"
    }
)

print("Table 1: Wage Determination Equation (Dependent Variable: log(wage))")
print("="*80)
print(results_table)
print("="*80)
print("Note: Robust standard errors (HC3) in parentheses")
print("*** p<0.01, ** p<0.05, * p<0.1")

Output LaTeX Format

python
# Output as LaTeX
latex_table = results_table.as_latex()
print("\nLaTeX code:")
print(latex_table)

# Save to file
with open('regression_table.tex', 'w') as f:
    f.write(latex_table)
print("\nSaved to regression_table.tex")

Custom Table Style

python
# More professional table style
def create_regression_table(models, model_names, dependent_var, note=''):
    """
    Create professional regression table
    """
    results = summary_col(
        models,
        model_names=model_names,
        stars=True,
        float_format='%.4f'
    )

    # Add header and notes
    output = f"\nTable: Regression Analysis of {dependent_var}\n"
    output += "="*90 + "\n"
    output += str(results)
    output += "\n" + "="*90 + "\n"
    output += "Standard errors in parentheses. Robust standard errors (HC3) used.\n"
    output += "*** p<0.01, ** p<0.05, * p<0.1\n"
    if note:
        output += f"\nNote: {note}\n"

    return output

table = create_regression_table(
    [model1, model2, model3, model4],
    ['Model 1', 'Model 2', 'Model 3', 'Model 4'],
    'log(wage)',
    note='Sample includes 500 workers. Models (2)-(4) control for work experience and its square.'
)
print(table)

Writing Regression Result Reports

Complete Report Template

python
def generate_report(model, df, dep_var, title="Regression Analysis Report"):
    """
    Generate complete regression analysis report
    """
    report = f"\n{'='*80}\n"
    report += f"{title:^80}\n"
    report += f"{'='*80}\n\n"

    # 1. Model specification
    report += "1. Model Specification\n"
    report += "-" * 80 + "\n"
    report += f"Dependent variable: {dep_var}\n"
    report += f"Sample size: {int(model.nobs)}\n"
    report += f"Estimation method: OLS (robust standard errors)\n\n"

    # 2. Main findings
    report += "2. Main Findings\n"
    report += "-" * 80 + "\n"
    for var in model.params.index:
        if var == 'Intercept' or var == 'const':
            continue
        coef = model.params[var]
        se = model.bse[var]
        t = model.tvalues[var]
        p = model.pvalues[var]

        # Significance markers
        sig = '***' if p < 0.01 else ('**' if p < 0.05 else ('*' if p < 0.1 else ''))

        report += f"\n{var}:\n"
        report += f"  Coefficient = {coef:.4f}{sig} (SE = {se:.4f})\n"
        report += f"  t statistic = {t:.3f}, p-value = {p:.4f}\n"

        # Interpretation (assuming log-level model)
        if 'log' in dep_var.lower():
            pct_change = (np.exp(coef) - 1) * 100
            report += f"  Interpretation: {var} increases by 1 unit, {dep_var} increases by {pct_change:.2f}%\n"

    # 3. Model fit
    report += "\n3. Model Fit\n"
    report += "-" * 80 + "\n"
    report += f"R² = {model.rsquared:.4f}\n"
    report += f"Adjusted R² = {model.rsquared_adj:.4f}\n"
    report += f"F statistic = {model.fvalue:.2f} (p = {model.f_pvalue:.4f})\n"

    # 4. Diagnostic tests
    report += "\n4. Diagnostic Tests\n"
    report += "-" * 80 + "\n"

    # Heteroskedasticity test
    from statsmodels.stats.diagnostic import het_breuschpagan
    bp_test = het_breuschpagan(model.resid, model.model.exog)
    report += f"Breusch-Pagan test (heteroskedasticity): LM = {bp_test[0]:.3f}, p = {bp_test[1]:.4f}\n"

    # Normality test
    from statsmodels.stats.stattools import jarque_bera
    jb_test = jarque_bera(model.resid)
    report += f"Jarque-Bera test (normality): JB = {jb_test[0]:.3f}, p = {jb_test[1]:.4f}\n"

    # Autocorrelation test
    from statsmodels.stats.stattools import durbin_watson
    dw = durbin_watson(model.resid)
    report += f"Durbin-Watson statistic (autocorrelation): {dw:.3f}\n"

    report += "\n" + "="*80 + "\n"

    return report

# Generate report
report = generate_report(model4, df, 'log(wage)', title="Analysis of Wage Determination Equation")
print(report)

Academic Writing Example

markdown
## Empirical Results

Table 1 reports the estimation results for the wage determination equation. Column (1) shows
a simple regression of log(wage) on education, with an estimated return to education of 8.2%,
significant at the 1% level. This means that each additional year of education is
associated with an average wage increase of 8.2%.

Column (2) adds work experience and its square. The coefficient on experience is 0.030
(p < 0.01), and the coefficient on experience squared is -0.0005 (p < 0.01), indicating
that the wage-experience profile follows an inverted U-shape. The peak occurs at
approximately 30 years of experience. After controlling for experience, the education
return slightly decreases to 7.9% but remains highly significant.

Column (3) further controls for gender. The coefficient on female is -0.147 (p < 0.01),
indicating that after controlling for education and experience, female wages are
approximately 13.7% lower than male wages [= (exp(-0.147)-1)×100%]. This significant
gender wage gap may reflect labor market discrimination or unobserved productivity
differences.

Column (4) is the complete model, adding marital status. The coefficient on married is
0.052 (p < 0.05), indicating that married individuals earn approximately 5.3% more than
unmarried individuals. This "marriage premium" is widely documented in the labor economics
literature (Korenman & Neumark, 1991).

All models use HC3 robust standard errors to correct for potential heteroskedasticity.
Adjusted R² increases from 0.427 in model (1) to 0.583 in model (4), indicating that
added variables significantly improve the model's explanatory power.

Visualizing Regression Results

Coefficient Plot

python
# Extract coefficients and confidence intervals
coefs = model4.params.drop('Intercept')
ci = model4.conf_int(alpha=0.05).drop('Intercept')
ci_lower = ci[0]
ci_upper = ci[1]

# Plot
fig, ax = plt.subplots(figsize=(10, 6))
y_pos = np.arange(len(coefs))

ax.errorbar(coefs, y_pos, xerr=[coefs - ci_lower, ci_upper - coefs],
            fmt='o', markersize=8, capsize=5, capthick=2, linewidth=2)
ax.axvline(x=0, color='red', linestyle='--', linewidth=1.5, alpha=0.7)
ax.set_yticks(y_pos)
ax.set_yticklabels(coefs.index)
ax.set_xlabel('Coefficient Estimate')
ax.set_title('Regression Coefficients with 95% Confidence Intervals')
ax.grid(True, alpha=0.3, axis='x')
plt.tight_layout()
plt.show()

Marginal Effects Plot

python
# Marginal effect of experience (considering quadratic term)
def marginal_effect_exp(exp_values, model):
    beta_exp = model.params['experience']
    beta_exp2 = model.params['I(experience ** 2)']
    return beta_exp + 2 * beta_exp2 * exp_values

exp_range = np.linspace(0, 40, 100)
me = marginal_effect_exp(exp_range, model4)

plt.figure(figsize=(10, 6))
plt.plot(exp_range, me, linewidth=2)
plt.axhline(y=0, color='r', linestyle='--', alpha=0.5)
plt.xlabel('Work Experience (years)')
plt.ylabel('Marginal Effect of Experience (on log(wage))')
plt.title('Marginal Effect of Work Experience on Wage')
plt.grid(True, alpha=0.3)

# Mark peak
peak_exp = -model4.params['experience'] / (2 * model4.params['I(experience ** 2)'])
plt.axvline(x=peak_exp, color='green', linestyle=':', alpha=0.7,
           label=f'Peak at experience = {peak_exp:.1f} years')
plt.legend()
plt.show()

Predicted Wage Distribution

python
# Predicted wages for different groups
scenarios = pd.DataFrame({
    'education': [12, 16, 16, 18],
    'experience': [5, 10, 10, 15],
    'female': [0, 0, 1, 0],
    'married': [0, 1, 1, 1],
    'label': ['High school graduate male', 'College male, married', 'College female, married', 'Graduate male, married']
})

# Predict
scenarios['log_wage_pred'] = model4.predict(scenarios)
scenarios['wage_pred'] = np.exp(scenarios['log_wage_pred'])

# Confidence intervals for the mean predicted log(wage)
predictions = model4.get_prediction(scenarios)
pred_summary = predictions.summary_frame(alpha=0.05)
scenarios['ci_lower'] = np.exp(pred_summary['mean_ci_lower'])
scenarios['ci_upper'] = np.exp(pred_summary['mean_ci_upper'])

# Visualize
fig, ax = plt.subplots(figsize=(10, 6))
y_pos = np.arange(len(scenarios))

ax.barh(y_pos, scenarios['wage_pred'], alpha=0.7)
ax.errorbar(scenarios['wage_pred'], y_pos,
           xerr=[scenarios['wage_pred'] - scenarios['ci_lower'],
                 scenarios['ci_upper'] - scenarios['wage_pred']],
           fmt='none', ecolor='black', capsize=5)

ax.set_yticks(y_pos)
ax.set_yticklabels(scenarios['label'])
ax.set_xlabel('Predicted Wage (thousands/month)')
ax.set_title('Predicted Wages for Different Groups with 95% Confidence Intervals')
ax.grid(True, alpha=0.3, axis='x')
plt.tight_layout()
plt.show()

print("Prediction results:")
print(scenarios[['label', 'wage_pred', 'ci_lower', 'ci_upper']])

Statistical Significance vs Substantive Significance

Problem: Misuse of p-values

Common Misconceptions:

  • "p < 0.001, therefore the effect is very large"
  • "p > 0.05, therefore there is no effect"

Correct Understanding:

  • Statistical significance: Strength of evidence that effect is nonzero
  • Substantive significance: Whether the effect size is practically important

Case Study

python
# Simulate large sample data
np.random.seed(999)
n_large = 10000

education_large = np.random.normal(13, 3, n_large)
# True effect is very small: 0.005 (0.5%)
log_wage_large = 2.5 + 0.005*education_large + np.random.normal(0, 0.3, n_large)

df_large = pd.DataFrame({'log_wage': log_wage_large, 'education': education_large})
model_large = smf.ols('log_wage ~ education', data=df_large).fit()

print("Large sample regression:")
print(f"Sample size: {n_large}")
print(f"Education coefficient: {model_large.params['education']:.6f}")
print(f"p-value: {model_large.pvalues['education']:.6f}")
print(f"95% confidence interval: [{model_large.conf_int().loc['education', 0]:.6f}, "
      f"{model_large.conf_int().loc['education', 1]:.6f}]")

# Substantive meaning
effect_pct = model_large.params['education'] * 100
print(f"\nSubstantive interpretation: Each additional year of education increases wage by {effect_pct:.2f}%")
print("Although statistically significant, the actual effect is extremely small (less than 1%), of little substantive importance")

Assessing Substantive Significance

Standards (vary by field):

  • Cohen's d (effect size)
  • R² increment (see the sketch after the Cohen's d example below)
  • Domain expert judgment
python
# Calculate Cohen's d
def cohens_d(group1, group2):
    n1, n2 = len(group1), len(group2)
    var1, var2 = np.var(group1, ddof=1), np.var(group2, ddof=1)
    pooled_std = np.sqrt(((n1-1)*var1 + (n2-1)*var2) / (n1+n2-2))
    return (np.mean(group1) - np.mean(group2)) / pooled_std

# Case: Gender wage gap
male_wage = df[df['female'] == 0]['log_wage']
female_wage = df[df['female'] == 1]['log_wage']

d = cohens_d(male_wage, female_wage)
print(f"Cohen's d = {d:.3f}")

# Interpretation
if abs(d) < 0.2:
    print("Effect size: Small")
elif abs(d) < 0.5:
    print("Effect size: Medium")
else:
    print("Effect size: Large")

Limitations of Causal Inference

OLS Regression ≠ Causal Effect

Conditions for Causal Inference:

  1. Randomized Controlled Trial (RCT)
  2. Natural Experiment
  3. Instrumental Variables (IV)
  4. Difference-in-Differences (DID)
  5. Regression Discontinuity (RDD)

Limitations of OLS Regression:

  • Omitted variable bias
  • Reverse causality
  • Selection bias

Case: Causal Effect of Education on Wage

Problem: estimate the return to education β₁ in log(wage) = β₀ + β₁·education + u, where unobserved ability is part of the error term u.

Sources of Bias:

  1. Omitted variables: Ability

    • High ability → More education
    • High ability → Higher wage
    • ⇒ the OLS estimate of β₁ is biased upward (see the formula after this list)
  2. Reverse causality: Expected wage → Education choice

  3. Measurement error: Quality differences in education
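
The direction of the ability bias follows from the standard omitted variable bias formula (one included regressor, one omitted variable):

plim β̂₁ = β₁ + β_ability · Cov(education, ability) / Var(education)

Because high ability raises both schooling (Cov > 0) and wages (β_ability > 0), the second term is positive, so OLS overstates the return to education.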

A Classic Identification Strategy: Instrumental Variables

python
# Simulate IV estimation
np.random.seed(2024)
n = 1000

# Latent ability (unobservable)
ability = np.random.normal(0, 1, n)

# Instrument: Birth quarter (Angrist & Krueger, 1991)
# Assume those born later attend more school due to compulsory education laws
birth_quarter = np.random.choice([1, 2, 3, 4], n)
instrument = (birth_quarter == 4).astype(int)

# Education (endogenous)
education_iv = 12 + 1.5*ability + 0.5*instrument + np.random.normal(0, 2, n)

# Wage (true causal effect = 0.05)
log_wage_iv = 2.0 + 0.05*education_iv + 0.20*ability + np.random.normal(0, 0.3, n)

df_iv = pd.DataFrame({
    'log_wage': log_wage_iv,
    'education': education_iv,
    'instrument': instrument,
    'ability': ability  # Unobservable in reality
})

# OLS (biased)
model_ols = smf.ols('log_wage ~ education', data=df_iv).fit()
print("OLS estimate (biased):")
print(f"Education coefficient = {model_ols.params['education']:.4f}")

# IV estimation (consistent)
from linearmodels.iv import IV2SLS
iv_model = IV2SLS.from_formula('log_wage ~ 1 + [education ~ instrument]',
                                data=df_iv).fit()
print("\nIV estimate (consistent):")
print(f"Education coefficient = {iv_model.params['education']:.4f}")

print(f"\nTrue causal effect: 0.05")
print(f"OLS upward bias: {model_ols.params['education'] - 0.05:.4f}")

Complete Case: Publication-Grade Paper

Research Question

Title: Gender Differences in Returns to Education: Evidence from China's Labor Market

Research Questions:

  1. What is the return to education on wages?
  2. Are there gender differences in returns to education?
  3. How do these differences vary across education levels?

Data and Methods

python
# Generate complete dataset
np.random.seed(20250128)
n = 2000

education = np.random.normal(13, 3, n)
experience = np.random.uniform(0, 30, n)
female = np.random.binomial(1, 0.5, n)
region = np.random.choice(['East', 'Central', 'West'], n, p=[0.4, 0.3, 0.3])
married = np.random.binomial(1, 0.6, n)

# DGP: Gender differences in education returns
region_effects = [{'East': 0.15, 'Central': 0.05, 'West': 0}[r] for r in region]
log_wage = (1.5 + 0.08*education + 0.03*experience - 0.0005*experience**2 +
            0.10*female - 0.015*education*female + np.array(region_effects) +
            0.06*married + np.random.normal(0, 0.3, n))

df_final = pd.DataFrame({
    'log_wage': log_wage,
    'education': education,
    'experience': experience,
    'female': female,
    'region': region,
    'married': married
})

# Descriptive statistics
print("Table 2: Descriptive Statistics")
print("="*80)
desc_stats = df_final.describe().T[['mean', 'std', 'min', 'max']]
print(desc_stats)

# By gender
print("\nBy gender:")
print(df_final.groupby('female')[['education', 'experience', 'log_wage']].mean())

Regression Analysis

python
# Models 1-4
m1 = smf.ols('log_wage ~ education', data=df_final).fit(cov_type='HC3')
m2 = smf.ols('log_wage ~ education + experience + I(experience**2)',
             data=df_final).fit(cov_type='HC3')
m3 = smf.ols('log_wage ~ education + experience + I(experience**2) + female',
             data=df_final).fit(cov_type='HC3')
m4 = smf.ols('log_wage ~ education * female + experience + I(experience**2) + C(region) + married',
             data=df_final).fit(cov_type='HC3')

# Output table
print("\nTable 3: Wage Determination Equation")
table = summary_col([m1, m2, m3, m4],
                   model_names=['(1)', '(2)', '(3)', '(4)'],
                   stars=True)
print(table)

Visualizing Main Results

python
# Plot interaction effects
edu_range = np.linspace(6, 20, 50)

# Male
male_pred = m4.predict(pd.DataFrame({
    'education': edu_range,
    'female': 0,
    'experience': 10,
    'region': 'East',
    'married': 1
}))

# Female
female_pred = m4.predict(pd.DataFrame({
    'education': edu_range,
    'female': 1,
    'experience': 10,
    'region': 'East',
    'married': 1
}))

plt.figure(figsize=(10, 6))
plt.plot(edu_range, male_pred, 'b-', linewidth=2, label='Male')
plt.plot(edu_range, female_pred, 'r-', linewidth=2, label='Female')
plt.xlabel('Years of Education')
plt.ylabel('Predicted log(wage)')
plt.title('Gender Differences in Education-Wage Relationship')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# Calculate gender wage gap at different education levels
for edu in [10, 13, 16]:
    gap = (m4.params['female'] +
           m4.params['education:female'] * edu)
    gap_pct = (np.exp(gap) - 1) * 100
    print(f"Education = {edu} years: Gender wage gap = {gap_pct:.1f}%")

Section Summary

Key Points

| Topic | Key Point |
| --- | --- |
| Coefficient Interpretation | Level-Level, Log-Level, Level-Log, Log-Log |
| Significance | Statistical significance ≠ substantive importance |
| Causal Inference | OLS ≠ causation; an identification strategy is needed |
| Academic Writing | Clear, standardized, complete |

Paper Writing Checklist

  • [ ] Clearly state research question
  • [ ] Describe data sources and variable definitions
  • [ ] Report descriptive statistics
  • [ ] Explain estimation method (OLS, IV, robust SE)
  • [ ] Present multiple model specifications
  • [ ] Interpret main coefficients (magnitude, significance, substantive meaning)
  • [ ] Conduct robustness checks
  • [ ] Discuss causal identification strategy
  • [ ] Visualize main results
  • [ ] Discuss limitations

Further Reading

Academic Writing Guides

  1. Angrist & Pischke (2010). "The Credibility Revolution in Empirical Economics"
  2. Abadie (2020). "Statistical Non-Significance in Empirical Economics"
  3. Imbens (2021). "Statistical Significance, p-Values, and the Reporting of Uncertainty"

Causal Inference Classics

  1. Angrist & Pischke (2009). Mostly Harmless Econometrics
  2. Pearl & Mackenzie (2018). The Book of Why
  3. Cunningham (2021). Causal Inference: The Mixtape

Congratulations! You have completed the entire Regression Analysis chapter!

Next Steps: Learn advanced econometric methods (panel data, instrumental variables, difference-in-differences, etc.)

Continue exploring other chapters in StatsPai!
