When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?
When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?
Kosuke Imai (Princeton) In Song Kim (MIT)
Political Methodology Summer Meeting Rice University
July 21, 2016
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 1 / 37 Fixed Effects Regressions in Causal Inference
Linear fixed effects regression models are the primary workhorse for causal inference with longitudinal/panel data
Researchers use them to adjust for unobserved time-invariant confounders (omitted variables, endogeneity, selection bias, ...):
“Good instruments are hard to find ..., so we’d like to have other tools to deal with unobserved confounders. This chapter considers ... strategies that use data with a time or cohort dimension to control for unobserved but fixed omitted variables” (Angrist & Pischke, Mostly Harmless Econometrics)
“fixed effects regression can scarcely be faulted for being the bearer of bad tidings” (Green et al., Dirty Pool)
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 2 / 37 Overview of the Talk
Identify two under-appreciated causal assumptions of unit fixed effects regression estimators: 1 Past treatments do not directly affect current outcome 2 Past outcomes do not directly affect current treatments and time-varying confounders can be relaxed under a selection-on-observables approach New matching framework for causal inference with panel data: 1 propose within-unit matching estimators to relax linearity 2 incorporate various estimators, e.g., the before-and-after estimator 3 establish equivalence between matching estimators and weighted linear fixed effects regression estimators Extend the analysis to two-way fixed effects models, difference-in-differences design, and synthetic control method An empirical illustration: Effects of GATT on trade
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 3 / 37 Linear Regression with Unit Fixed Effects
Balanced panel data with N units and T time periods
Yit : outcome variable Xit : causal or treatment variable of interest
Assumption 1 (Linearity)
Yit = αi + βXit + it
Ui : a vector of unobserved time-invariant confounders
αi = h(Ui ) for any function h(·) A flexible way to adjust for unobservables Average contemporaneous treatment effect:
β = E(Yit (1) − Yit (0))
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 4 / 37 Strict Exogeneity and Least Squares Estimator
Assumption 2 (Strict Exogeneity)
it ⊥⊥{Xi , Ui }
Mean independence is sufficient: E(it | Xi , Ui ) = E(it ) = 0 Least squares estimator based on de-meaning:
N T ˆ X X 2 βFE = arg min {(Yit − Y i ) − β(Xit − X i )} β i=1 t=1
where X i and Y i are unit-specific sample means ATE among those units with variation in treatment:
τ = E(Yit (1) − Yit (0) | Cit = 1) PT where Cit = 1{0 < t=1 Xit < T }.
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 5 / 37 Causal Directed Acyclic Graph (DAG)
Yi1 Yi2 Yi3
Xi1 Xi2 Xi3 arrow = direct causal effect absence of arrows causal assumptions Ui
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 6 / 37 Nonparametric Structural Equation Model (NPSEM)
One-to-one correspondence with a DAG:
Yit = g1(Xit , Ui , it )
Xit = g2(Xi1,..., Xi,t−1, Ui , ηit )
Nonparametric generalization of linear unit fixed effects model: Allows for nonlinear relationships, effect heterogeneity Strict exogeneity holds No arrows can be added without violating Assumptions 1 and 2
Causal assumptions: 1 No unobserved time-varying confounders 2 Past outcomes do not directly affect current outcome 3 Past outcomes do not directly affect current treatment 4 Past treatments do not directly affect current outcome
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 7 / 37 Potential Outcomes Framework
DAG causal structure Potential outcomes treatment assignment mechanism Assumption 3 (No carryover effect) Past treatments do not directly affect current outcome
Yit (Xi1, Xi2,..., Xi,t−1, Xit ) = Yit (Xit )
What randomized experiment satisfies unit fixed effects model?
1 randomize Xi1 given Ui 2 randomize Xi2 given Xi1 and Ui 3 randomize Xi3 given Xi2, Xi1, and Ui 4 and so on
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 8 / 37 Assumption 4 (Sequential Ignorability with Unobservables)
T {Yit (1), Yit (0)}t=1 ⊥⊥ Xi1 | Ui . . T {Yit (1), Yit (0)}t=1 ⊥⊥ Xit0 | Xi1,..., Xi,t0−1, Ui . . T {Yit (1), Yit (0)}t=1 ⊥⊥ XiT | Xi1,..., Xi,T −1, Ui
“as-if random” assumption without conditioning on past outcomes Past outcomes cannot directly affect current treatment
Says nothing about whether past outcomes can directly affect current outcome
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 9 / 37 Past Outcomes Directly Affect Current Outcome
Yi1 Yi2 Yi3
Strict exogeneity still Xi1 Xi2 Xi3 holds Past outcomes do not confound Xit −→ Yit given Ui
Ui No need to adjust for past outcomes
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 10 / 37 Past Treatments Directly Affect Current Outcome
Past treatments as confounders Yi1 Yi2 Yi3 Need to adjust for past treatments Strict exogeneity holds given past treatments and Xi1 Xi2 Xi3 Ui Impossible to adjust for an entire treatment history and Ui at the same time Adjust for a small number Ui of past treatments often arbitrary
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 11 / 37 Past Outcomes Directly Affect Current Treatment
Correlation between Yi1 Yi2 Yi3 error term and future treatments Violation of strict exogeneity
Xi1 Xi2 Xi3 No adjustment is sufficient
Together with the previous assumption no feedback effect Ui over time
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 12 / 37 Instrumental Variables Approach
Yi1 Yi2 Yi3 Instruments: Xi1, Xi2, and Yi1 GMM: Arellano and Bond (1991)
Xi1 Xi2 Xi3 Exclusion restrictions Arbitrary choice of instruments Substantive justification rarely given Ui
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 13 / 37 An Alternative Selection-on-Observables Approach
Absence of unobserved Yi1 Yi2 Yi3 time-invariant confounders Ui past treatments can directly affect current outcome
past outcomes can directly Xi1 Xi2 Xi3 affect current treatment
Comparison across units within the same time rather than across different time periods within the same unit
Marginal structural models can identify the average effect of an entire treatment sequence Trade-off no free lunch
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 14 / 37 Adjusting for Observed Time-varying Confounders
Yi1 Yi2 Yi3 past treatments cannot directly affect current outcome past outcomes Xi1 Xi2 Xi3 cannot directly affect current treatment
adjusting for Zit does not relax these
Zi1 Zi2 Zi3 assumptions past outcomes cannot indirectly affect current treatment through Zit Ui
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 15 / 37 A New Matching Framework
Even if these assumptions are satisfied, the the unit fixed effects estimator is inconsistent for the ATE:
PT PT t=1 Xit Yit t=1(1−Xit )Yit 2 E Ci PT − PT Si p = Xit = 1−Xit βˆ −→ t 1 t 1 6= τ FE 2 E(Ci Si )
2 PT 2 where Si = t=1(Xit − X i ) /(T − 1) is the unit-specific variance Key idea: comparison across time periods within the same unit ˆ The Within-unit matching estimator improves βFE by relaxing the linearity assumption:
N PT PT ! 1 X Xit Yit (1 − Xit )Yit τˆ = C t=1 − t=1 match PN i PT PT i=1 Ci i=1 t=1 Xit t=1(1 − Xit )
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 16 / 37 Constructing a General Matching Estimator
Mit : matched set for observation (i, t) For the within-unit matching estimator,
match 0 0 0 Mit = {(i , t ): i = i, Xi0t0 = 1 − Xit }
A general matching estimator:
N T 1 X X τˆ = D (Y\(1) − Y\(0)) match PN PT it it it i=1 t=1 Dit i=1 t=1
where Dit = 1{#Mit > 0} and Yit if Xit = x Y\it (x) = 1 P 0 0 Y 0 0 if X = 1 − x #Mit (i ,t )∈Mit i t it
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 17 / 37 Before-and-After Design
No time trend for the average potential outcomes:
E(Yit (x) − Yi,t−1(x) | Xit 6= Xi,t−1) = 0 for x = 0, 1
with the quantity of interest E(Yit (1) − Yit (0) | Xit 6= Xi,t−1)
Or just the average potential outcome under the control condition
E(Yit (0) − Yi,t−1(0) | Xit = 1, Xi,t−1 = 0) = 0
This is a matching estimator with the following matched set:
BA 0 0 0 0 Mit = {(i , t ): i = i, t ∈ {t − 1, t + 1}, Xi0t0 = 1 − Xit }
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 18 / 37 It is also the first differencing estimator:
N T ˆ X X 2 βFD = arg min {(Yit − Yi,t−1) − β(Xit − Xi,t−1)} β i=1 t=2 “We emphasize that the model and the interpretation of β are exactly as in [the linear fixed effects model]. What differs is our method for estimating β” (Wooldridge; italics original).
The identification assumptions is very different Slightly relaxing the assumption of no carryover effect But, still requires the assumption that past outcomes do not affect current treatment Regression toward the mean: suppose that the treatment is given when the previous outcome takes a value greater than its mean
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 19 / 37 Matching as a Weighted Unit Fixed Effects Estimator
Any within-unit matching estimator can be written as a weighted unit fixed effects estimator with different regression weights
The proposed within-matching estimator:
N T ˆ X X ∗ ∗ 2 τˆmatch = βWFE = arg min Dit Wit {(Yit − Y i ) − β(Xit − X i )} β i=1 t=1
∗ ∗ where X i and Y i are unit-specific weighted averages, and
T PT if Xit = 1, t0=1 Xit0 Wit = T PT if Xit = 0. t0=1(1−Xit0 )
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 20 / 37 We show how to construct regression weights for different matching estimators (i.e., different matched sets) Idea: count the number of times each observation is used for matching
Benefits: computational efficiency model-based standard errors
robustness matching estimator is consistent even when linear unit fixed effects regression is the true model
specification test (White 1980) null hypothesis: linear fixed effects regression is the true model
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 21 / 37 Linear Regression with Unit and Time Fixed Effects
Model:
Yit = αi + γt + βXit + it
where γt flexibly adjusts for a vector of unobserved unit-invariant time effects Vt , i.e., γt = f (Vt )
Estimator:
N T ˆ X X 2 βFE2 = arg min {(Yit − Y i − Y t + Y ) − β(Xit − X i − X t + X)} β i=1 t=1
where Y t and X t are time-specific means, and Y and X are overall means
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 22 / 37 Understanding the Two-way Fixed Effects Estimator
βFE: bias due to time effects
βFEtime: bias due to unit effects
βpool: bias due to both time and unit effects
ˆ ˆ ˆ ˆ ωFE × βFE + ωFEtime × βFEtime − ωpool × βpool βFE2 = wFE + wFEtime − wpool
with sufficiently large N and T , the weights are given by,
2 ωFE ≈ E(Si ) = average unit-specific variance 2 ωFEtime ≈ E(St ) = average time-specific variance 2 ωpool ≈ S = overall variance
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 23 / 37 Matching and Two-way Fixed Effects Estimators
Problem: No other unit shares the same unit and time Units
4 C T C C T
3 T C T T C
2 C C T C C Time periods
1 T T T C T Two kinds of mismatches 1 Same treatment status 2 Neither same unit nor same time
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 24 / 37 We Can Never Eliminate Mismatches
Units
4 C T C C T
3 T C T T C
2 C C T C C Time periods
1 T T T C T
To cancel time and unit effects, we must induce mismatches No weighted two-way fixed effects model eliminates mismatches
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 25 / 37 Difference-in-Differences Design
Parallel trend assumption:
E(Yit (0) − Yi,t−1(0) | Xit = 1, Xi,t−1 = 0) = E(Yit (0) − Yi,t−1(0) | Xit = Xi,t−1 = 0) 1.0
treatment group ● 0.8
counterfactual 0.6
● 0.4 Average Outcome Average ●
0.2 control group ● 0.0
time t time t+1 Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 26 / 37 General DiD = Weighted Two-Way FE Effects
2 × 2: equivalent to linear two-way fixed effects regression General setting: Multiple time periods, repeated treatments
Units
4 C T C C T
3 T C T T C
2 C C T C C Time periods
1 T T T C T
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 27 / 37 Units
4 0 0 0 0 0
1 1 3 0 2 0 1 2
1 1 2 0 2 0 1 2 Time periods
1 0 0 0 0 0
Fast computation, standard error, specification test Still assumes that past outcomes don’t affect current treatment Baseline outcome difference caused by unobserved time-invariant confounders It should not reflect causal effect of baseline outcome on treatment assignment Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 28 / 37 Synthetic Control Method (Abadie et al. 2010)
One treated unit i∗ receiving the treatment at time T
Quantity of interest: Yi∗T − Yi∗T (0) Create a synthetic control using past outcomes P Weighted average: Y\i∗T (0) = i6=i∗ wˆi YiT Estimate weights to balance past outcomes and past time-varying covariates
A motivating autoregressive model:
> YiT (0) = ρT Yi,T −1(0) + δT ZiT + iT ZiT = λT −1Yi,T −1(0) + ∆T Zi,T −1 + νiT
Past outcomes can affect current treatment No unobserved time-invariant confounders
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 29 / 37 Causal Effect of ETA’s Terrorism THE AMERICAN ECONOMIC REVIEW MARCH 2003
11- -5 / s / a / 210- 5
0 9-
, -Actual with terrorism - -Synthetic without terrorism 1 1955 1960 1965 1970 1975 1980 1985 1990 1995 2000 year
FIGURE1. PERCAPITA GDP FOR THE BASQUECOUNTRI Abadie and Gardeazabal (2003, AER) 4 - - Imai (Princeton) and Kim\ (MIT) Fixed Effects for Causal Inference GDt' gap PolMeth (July 21, 2016) 30 / 37 I -Terrorist activity
year
FIGIJRE2. TERRORISTACTIVITY AND ESTIMATEDGAP
expect terrorism to have a lagged negative ef- of deaths caused by terrorist actions (used as a fect on per capita GDP. In Figure 2, we plotted proxy for overall terrorist activity). As ex- the per capita GDP gap, Y, - YT, as a percent- pected, spikes in terrorist activity seem to be age of Basque per capita GDP, and the number followed by increases in the amplitude of the The main motivating model:
> > Yit (0) = γt + δt Zit + ξ Ui + it
A generalization of the linear two-way fixed effects model How is it possible to adjust for unobserved time-invariant confounders by adjusting for past outcomes?
The key assumption: there exist weights such that X X wi Zit = Zi∗t for all t ≤ T − 1 and wi Ui = Ui∗ i6=i∗ i6=i∗
In general, adjusting for observed confounders does not adjust for unobserved confounders
The same tradeoff as before
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 31 / 37 Effects of GATT Membership on International Trade
1 Controversy Rose (2004): No effect of GATT membership on trade Tomz et al. (2007): Significant effect with non-member participants
2 The central role of fixed effects models: Rose (2004): one-way (year) fixed effects for dyadic data Tomz et al. (2007): two-way (year and dyad) fixed effects Rose (2005): “I follow the profession in placing most confidence in the fixed effects estimators; I have no clear ranking between country-specific and country pair-specific effects.” Tomz et al. (2007): “We, too, prefer FE estimates over OLS on both theoretical and statistical ground”
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 32 / 37 Data and Methods
1 Data Data set from Tomz et al. (2007) Effect of GATT: 1948 – 1994 162 countries, and 196,207 (dyad-year) observations
2 Year fixed effects model:
> ln Yit = αt + βXit + δ Zit + it
Yit : trade volume Xit : membership (formal/participants) Both vs. At most one Zit : 15 dyad-varying covariates (e.g., log product GDP) 3 Assumptions: past membership status doesn’t directly affect current trade volume past trade volume doesn’t affect current membership status Before-and-after increasing trend in trade volume Difference-in-differences after conditional on past outcome?
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 33 / 37 Empirical Results: Formal Membership
Dyad with Both Members vs. One or None Member 0.4 0.3
DID− 0.2 FE caliper
●
0.1 FE ●
● FE ● ● ● 0.0 ● FD ● DID WFE −0.1 WFE Estimated Effects (log of trade) Estimated Effects −0.2 Year Fixed Effects Dyad Fixed Effects Year and Dyad Fixed Effects
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 34 / 37 Empirical Results: Participants Included
Dyad with Both Members vs. One or None Member
● Formal Member Participants FE 0.4
0.3 FE
FE DID− 0.2 FD FE caliper DID− WFE ● caliper 0.1 FE ●
● FE ● ● ● 0.0 ● FD ● WFE DID DID WFE −0.1 WFE Estimated Effects (log of trade) Estimated Effects −0.2 Year Fixed Effects Dyad Fixed Effects Year and Dyad Fixed Effects
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 35 / 37 Concluding Remarks
When should we use linear fixed effects models? Key tradeoff: 1 unobserved time-invariant confounders fixed effects 2 causal dynamics between treatment and outcome selection-on-observables Two key (under-appreciated) causal assumptions of fixed effects: 1 past treatments do not directly affect current outcome 2 past outcomes do not directly affect current treatment A new matching estimator: 1 Within-unit matching estimator no linearity assumption 2 Various causal identification strategies can be incorporated including the before-and-after and difference-in-differences designs 3 Equivalent representation as a weighted linear fixed effects regression estimator R package wfe is available at CRAN
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 36 / 37 Send comments and suggestions to:
[email protected] [email protected]
More information about this and other research:
http://imai.princeton.edu http://web.mit.edu/insong/www
Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference PolMeth (July 21, 2016) 37 / 37