<<

Short Guides to Microeconometrics Kurt Schmidheiny : Fixed and Random Effects 2 Fall 2020 Unversit¨atBasel

We will assume throughout this handout that each individual i is ob- served in all time periods t. This is a so-called balanced panel. The Panel Data: Fixed and Random Effects treatment of unbalanced panels is straightforward but tedious. The T observations for individual i can be summarized as

   0   0    yi1 xi1 zi ui1  .   .   .   .   .   .   .   .  1 Introduction            0   0    yi =  yit  Xi =  xit  Zi =  zi  ui =  uit          In panel data, individuals (persons, firms, cities, ... ) are observed at  .   .   .   .   .   .   .   .  several points in time (days, years, before and after treatment, ...). This y x0 z0 u iT T ×1 iT T ×K i T ×M iT T ×1 handout focuses on panels with relatively few time periods (small T ) and many individuals (large N). and NT observations for all individuals and time periods as This handout introduces the two basic models for the analysis of panel         y1 X1 Z1 u1 data, the fixed effects model and the random effects model, and presents  .   .   .   .   .   .   .   .  consistent for these two models. The handout does not cover                 so-called dynamic panel data models. y =  yi  X =  Xi  Z =  Zi  u =  ui   .   .   .   .  Panel data are most useful when we suspect that the outcome variable  .   .   .   .   .   .   .   .  depends on explanatory variables which are not observable but correlated y X Z u N NT ×1 N NT ×K N NT ×M N NT ×1 with the observed explanatory variables. If such omitted variables are constant over time, panel data estimators allow to consistently estimate The data generation process (dgp) is described by: the effect of the observed explanatory variables. PL1: Linearity

0 0 2 The Econometric Model yit = α + xitβ + ziγ + ci + uit where E[uit] = 0 and E[ci] = 0

Consider the multiple model for individual i = 1, ..., N The model is linear in parameters α, β, γ, effect ci and error uit. who is observed at several time periods t = 1, ..., T PL2: Independence y = α + x0 β + z0γ + c + u N it it i i it {Xi, zi, yi}i=1 i.i.d. (independent and identically distributed) 0 where yit is the dependent variable, xit is a K-dimensional row vector of The observations are independent across individuals but not necessarily 0 time-varying explanatory variables and zi is a M-dimensional row vector across time. This is guaranteed by random sampling of individuals. of time-invariant explanatory variables excluding the constant, α is the PL3: Strict Exogeneity intercept, β is a K-dimensional column vector of parameters, γ is a M- dimensional column vector of parameters, ci is an individual-specific effect E[uit|Xi, zi, ci] = 0 (mean independent) and uit is an idiosyncratic error term.

Version: 26-11-2020, 09:38 3 Short Guides to Microeconometrics Panel Data: Fixed and Random Effects 4

The idiosyncratic error term uit is assumed uncorrelated with the ex- RE3: Identifiability planatory variables of all past, current and future time periods of the 0 a) rank(W ) = K + M + 1 < NT and E[Wi Wi] = QWW is p.d. and same individual. This is a strong assumption which e.g. rules out lagged 0 0 0 finite. The typical element wit = [1 xit zi]. dependent variables. PL3 also assumes that the idiosyncratic error is 0 −1 uncorrelated with the individual specific effect. b) rank(W ) = K + M + 1 < NT and E[Wi Ωv,i Wi] = QWOW is p.d. and finite. Ωv,i is defined below. PL4: Error Variance RE3 assumes that the regressors including a constant are not perfectly a) V [u |X , z , c ] = σ2I, σ2 > 0 and finite i i i i u u collinear, that all regressors (but the constant) have non-zero variance (homoscedastic and no serial correlation) and not too many extreme values. b) V [u |X , z , c ] = σ2 > 0, finite and it i i i u,it The random effects model can be written as Cov[uit, uis|Xi, zi, ci] = 0 ∀s 6= t (no serial correlation) 0 0 yit = α + x β + z γ + vit c) V [ui|Xi, zi, ci] = Ωu,i(Xi, zi) is p.d. and finite it i The remaining assumptions are divided into two sets of assumptions: the where vit = ci +uit. Assuming PL2, PL4 and RE1 in the special versions random effects model and the fixed effects model. PL4a and RE2a leads to   Ωv,1 ··· 0 ··· 0 2.1 The Random Effects Model  . . .   . .. .      In the random effects model, the individual-specific effect is a random Ωv = V [v|X,Z] =  0 Ωv,i 0    variable that is uncorrelated with the explanatory variables.  . .. .   . . .  0 ··· 0 ··· Ω RE1: Unrelated effects v,N NT ×NT with typical element E[ci|Xi, zi] = 0

 2 2 2  RE1 assumes that the individual-specific effect is a that σv σc ··· σc is uncorrelated with the explanatory variables of all past, current and  2 2 2   σc σv ··· σc  Ωv,i = V [vi|Xi, zi] =  . . .  future time periods of the same individual.  . . .. .   . . . .  σ2 σ2 ··· σ2 RE2: Effect Variance c c v T ×T 2 2 2 2 a) V [ci|Xi, zi] = σc < ∞ (homoscedastic) where σv = σc + σu. This special case under PL4a and RE2a is therefore 2 called the equicorrelated random effects model. b) V [ci|Xi, zi] = σc,i(Xi, zi) < ∞ (heteroscedastic) RE2a assumes constant variance of the individual specific effect. 2.2 The Fixed Effects Model

In the fixed effects model, the individual-specific effect is a random vari- able that is allowed to be correlated with the explanatory variables. 5 Short Guides to Microeconometrics Panel Data: Fixed and Random Effects 6

FE1: Related effects and RE3a in samples with a large number of individuals (N → ∞). How- – ever, the pooled OLS is not efficient. More importantly, the FE1 explicitly states the absence of the unrelatedness assumption in RE1. usual standard errors of the pooled OLS estimator are incorrect and tests (t-, F -, z-, Wald-) based on them are not valid. Correct standard errors FE2: Effect Variance can be estimated with the so-called cluster-robust covariance estimator – treating each individual as a cluster (see the handout on “Clustering in FE2 explicitly states the absence of the assumption in RE2. the Linear Model”). Fixed effects model: The pooled OLS estimators of α, β and γ are FE3: Identifiability biased and inconsistent, because the variable ci is omitted and potentially ¨ ¨ 0 ¨ correlated with the other regressors. rank(X) = K < NT and E(XiXi) is p.d. and finite P where the typical elementx ¨it = xit − x¯i andx ¯i = 1/T t xit FE3 assumes that the time-varying explanatory variables are not perfectly 4 Random Effects Estimation collinear, that they have non-zero within-variance (i.e. variation over time The random effects estimator is the feasible generalized for a given individual) and not too many extreme values. Hence, x it (GLS) estimator cannot include a constant or any time-invariant variables. Note that only   the parameters β but neither α nor γ are identifiable in the fixed effects αRE b  −1 −1 −1   0 0 model.  βbRE  = W Ωbv W W Ωbv y. γbRE

3 Estimation with Pooled OLS where W = [ιNT XZ] and ιNT is a NT × 1 vector of ones.

The error Ωv is assumed block-diagonal with equicor- The pooled OLS estimator ignores the panel structure of the data and related diagonal elements Ωv,i as in section 2.1 which depend on the two simply estimates α, β and γ as 2 2 unknown parameters σv and σc only. There are many different ways to   estimate these two parameters. For example, αbP OLS 0 −1 0 T N  βbP OLS  = (W W ) W y   2 1 X X 2 2 2 2 σv = vit , σc = σv − σu γP OLS b NT b b b b b t=1 i=1 where W = [ιNT XZ] and ιNT is a NT × 1 vector of ones. where T N Random effects model: The pooled OLS estimator of α, β and γ is un- 1 X X σ2 = (v − v )2 bu NT − N bit bi biased under PL1, PL2, PL3, RE1, and RE3 in small samples. Addition- t=1 i=1 ally assuming PL4 and normally distributed idiosyncratic and individual- 0 0 PT and vbit = yit − αP OLS − xitβbP OLS − ziγbP OLS and vbi = 1/T t=1 vbit. The specific errors, it is normally distributed in small samples. It is consistent 2 degree of freedom correction in σbu is also asymptotically important when and approximately normally distributed under PL1, PL2, PL3, PL4, RE1, N → ∞. 7 Short Guides to Microeconometrics Panel Data: Fixed and Random Effects 8

Random effects model: We cannot establish small sample properties yields the within model 0 for the RE estimator. The RE estimator is consistent and asymptotically y¨it =x ¨itβ +u ¨it normally distributed under PL1 - PL4, RE1, RE2 and RE3b when the wherey ¨it = yit − y¯i,x ¨itk = xitk − x¯ik andu ¨it = uit − u¯i. Note that number of individuals N → ∞ even if T is fixed. It can therefore be the individual-specific effect ci, the intercept α and the time-invariant approximated in samples with many individual observations N as regressors zi cancel.       αbRE α αbRE The fixed effects estimator or within estimator of the slope coefficient   A      βbRE  ∼ N  β  , Avar  βbRE  β estimates the within model by OLS γbRE γ γbRE −1  ¨ 0 ¨ ¨ 0 2 2 βbFE = X X X y¨ Assuming the equicorrelated model (PL4a and RE2a), σbv and σbc are con- 2 2 sistent estimators of σv and σc , respectively. Then αbRE, βbRE and γbRE are Note that the parameters α and γ are not estimated by the within esti- asymptotically efficient and the asymptotic variance can be consistently mator. estimated as   Random effects model and fixed effects model: The fixed effects esti- α bRE  −1 mator of β is unbiased under PL1, PL2, PL3, and FE3 in small samples. Avar[  β  = W 0Ω−1W  bRE  bv Additionally assuming PL4 and normally distributed idiosyncratic errors, γ bRE it is normally distributed in small samples. Assuming homoscedastic er- Allowing for arbitrary conditional variances and for serial correlation in h i rors with no serial correlation (PL4a), the variance V βbFE|X can be Ωv,i (PL4c and RE2b), the asymptotic variance can be consistently es- unbiasedly estimated as timated with the so-called cluster-robust covariance estimator treating h i  −1 each individual as a cluster (see the handout on “Clustering in the Linear 2 ¨ 0 ¨ Vb βbFE|X = σbu X X Model”). In both cases, the usual tests (z-, Wald-) for large samples can 0 be performed. 2 0 where σbu = ub¨ u/b¨ (NT −N −K) and ub¨it =y ¨it −x¨itβbFE. Note the non-usual In practice, we can rarely be sure about equicorrelated errors and degrees of freedom correction. The usual z- and F -tests can be performed. better always use cluster-robust standard errors for the RE estimator. The FE estimator is consistent and asymptotically normally distributed Fixed effects model: Under the assumptions of the fixed effects model under PL1 - PL4 and FE3 when the number of individuals N → ∞ even (FE1, i.e. RE1 violated), the random effects estimators of α, β and γ are if T is fixed. It can therefore be approximated in samples with many biased and inconsistent, because the variable ci is omitted and potentially individual observations N as correlated with the other regressors. A  h i βbFE ∼ N β, Avar βbFE

5 Fixed Effects Estimation Assuming homoscedastic errors with no serial correlation (PL4a), the P asymptotic variance can be consistently estimated as Subtracting time averagesy ¯i = 1/T t yit from the initial model h i  −1 0 0 Avar[ β = σ2 X¨ 0X¨ yit = α + xitβ + ziγ + ci + uit bFE bu 9 Short Guides to Microeconometrics Panel Data: Fixed and Random Effects 10

2 0 7 Least Squares Dummy Variables Estimator (LSDV) where σbu = ub¨ u/b¨ (NT − N). Allowing for heteroscedasticity and serial correlation of unknown form The least squares dummy variables (LSDV) estimator is pooled OLS in- (PL4c), the asymptotic variance Avar[βbk] can be consistently estimated cluding a set of N −1 dummy variables which identify the individuals and with the so-called cluster-robust covariance estimator treating each indi- hence an additional N − 1 parameters. Note that one of the individual vidual as a cluster (see the handout on “Clustering in the Linear Model”). dummies is dropped because we include a constant. Time-invariant ex- In both cases, the usual tests (z-, Wald-) for large samples can be per- planatory variables, z , are dropped because they are perfectly collinear formed. i with the individual dummy variables. In practice, the idiosyncratic errors are often serially correlated (vio- The LSDV estimator of β is numerically identical with the FE esti- lating PL4a) when T > 2. Bertrand, Duflo and Mullainathan (2004) show mator and therefore consistent under the same assumptions. The LSDV that the usual standard errors of the fixed effects estimator are drastically estimators of the additional parameters for the individual-specific dummy understated in the presence of serial correlation. It is therefore advisable variables, however, are inconsistent as the number of parameters goes to to always use cluster-robust standard errors for the fixed effects estimator. infinity as N → ∞. This so-called incidental parameters problem gener- ally biases all parameters in non-linear fixed effects models like the probit 6 Random Effects vs. Fixed Effects Estimation model. The random effects model can be consistently estimated by both the RE estimator or the FE estimator. We would prefer the RE estimator if we 8 First Difference Estimator can be sure that the individual-specific effect really is an unrelated effect Subtracting the lagged value y from the initial model (RE1 ). This is usually tested by a (Durbin-Wu-)Hausman test. However, i,t−1 the Hausman test is only valid under and cannot include 0 0 yit = α + xitβ + ziγ + ci + uit time fixed effects. The unrelatedness assumption (RE1 ) is better tested by running an yields the first-difference model auxiliary regression (Wooldridge 2010, p. 332, eq. 10.88, Mundlak, 1978): 0 y˙it =x ˙ itβ +u ˙ it 0 0 0 yit = α + x β + z γ + x λ + δt + uit it i i wherey ˙it = yit −yi,t−1,x ˙ it = xit −xi,t−1 andu ˙ it = uit −ui,t−1. Note that P the individual-specific effect ci, the intercept α and the time-invariant where xi = 1/T t xit are the time averages of all time-varying regressors. regressors zi cancel. The first-difference estimator (FD) of the slope co- Include time fixed δt if they are included in the RE and FE estimation. efficient β estimates the first-difference model by OLS. A joint Wald-test on H0: λ = 0 tests RE1. Use cluster-robust standard −1 errors to allow for heteroscedasticity and serial correlation.  ˙ 0 ˙  ˙ 0 βbFD = X X X y˙ Note: Assumption RE1 is an extremely strong assumption and the FE estimator is almost always much more convincing than the RE estimator. Note that the parameters α and γ are not estimated by the FD estimator. Not rejecting RE1 does not mean accepting it. Interest in the effect of a In the special case T = 2, the FD estimator is numerically identical to the time-invariant variable is no sufficient reason to use the RE estimator. FE estimator. 11 Short Guides to Microeconometrics Panel Data: Fixed and Random Effects 12

Random effects model and fixed effects model: The FD estimator is a 11 Implementation in Stata 14 consistent estimator of β under the same assumptions as the FE estimator. Stata provides a series of commands that are especially designed for panel It is less efficient than the FE estimator if uit is not serially correlated (PL4a). data. See help xt for an overview. Stata requires panel data in the so-called long form: there is one line for every individual and every time observation. The very powerful 9 Fixed Effects vs. First Difference Estimation Stata command reshape helps transforming data into this format. Before Given the fixed effects model (PL1, PL2, PL3, FE3 ), both the fixed ef- working with panel data commands, we have to tell Stata the variables fects and the first difference estimator of β are consistent. Hence, the two that identify the individual and the time period. For example, load data estimators should be similar in large samples. In practice, however, the webuse nlswork.dta two estimator often differ substantially. The reason for this is typically and define individuals (variable idcode) and time periods (variable year) a misspecification of the timing in the linear model. PL1 assumes that xtset idcode year changes in xit have only an instantaneous effect on yit at time t. In prac- tice, effects often need several periods to materialize. Such patterns are The fixed effects estimator is calculated by the Stata command xtreg called dynamic treatment effects. In this situation, the first difference es- with the option fe: timator will only pick up the instantaneous effect at time t while the fixed generate ttl_exp2 = ttl_exp^2 effects estimator picks up an average of the dynamic treatment effects. xtreg ln_wage ttl_exp ttl_exp2, fe Note that the effect of time-constant variables like grade is not identified 10 Time Fixed Effects by the fixed effects estimator. The parameter reported as cons in the P Stata output is the average fixed effect 1/N i ci. Stata uses NT − We often also suspect that there are time-specific effects δt which affect N − K − M degrees of freedom for small sample tests. Cluster-robust all individuals in the same way Huber/White standard errors are reported with the vce option: 0 0 xtreg ln_wage ttl_exp ttl_exp2, fe vce(cluster idcode) yit = α + xitβ + ziγ + δt + ci + uit . Since version 10, Stata automatically assumes clustering with robust stan- We can estimate this extended model by including a dummy variable dard errors in fixed effects estimations. So we could also just use for T − 1 time periods with one period serving as the reference period. Assuming a fixed number of time periods T and the number of individuals xtreg ln_wage ttl_exp ttl_exp2, fe vce(robust) N → ∞, both the RE estimator and the FE estimator are consistent using Stata reports small sample t- and F -tests with N − 1 degrees of freedom. time dummy variables under above conditions. Furthermore, Stata multiplies the cluster-robust covariance by N/(N −1) to correct for degrees of freedom in small samples. This is ad-hoc in the cross-section context with large N but useful in the time series context with large T (see Stock and Watson, 2008). 13 Short Guides to Microeconometrics Panel Data: Fixed and Random Effects 14

The random effects estimator is calculated by the Stata command xtreg 12 Implementation in R with the option re: The R package plm provides a series of functions and data structures that xtreg ln_wage grade ttl_exp ttl_exp2, re are especially designed for panel data. Stata reports asymptotic z- and Wald-tests with random effects estima- The plm package works with data stored in a structured dataframe tion. Cluster-robust Huber/White standard errors are reported with: format. The function plm.data transforms data from the so-called long xtreg ln_wage grade ttl_exp ttl_exp2, re vce(cluster idcode) form into the plm structure. Long form data means that there is one line for every individual and every time observation. For example, load data The Hausman test is calculated by > library(foreign) xtreg ln_wage grade ttl_exp ttl_exp2, re > nlswork <- read.dta("http://www.stata-press.com/data/r11/nlswork.dta") estimates store b_re xtreg ln_wage ttl_exp ttl_exp2, fe and define individuals (variable idcode) and time periods (variable year) estimates store b_fe hausman b_fe b_re, sigmamore > library(plm) > pnlswork <- plm.data(nlswork, c("idcode","year")) and the auxiliary regression version by The fixed effects estimator is calculated by the R function plm and its egen ttl_exp_mean = mean(ttl_exp), by(idcode) egen ttl_exp2_mean = mean(ttl_exp2), by(idcode) model option within: regress ln_wage grade ttl_exp ttl_exp2 /// > ffe<-plm(ln_wage~ttl_exp+I(ttl_exp^2), model="within", data = pnlswork) ttl_exp_mean ttl_exp2_mean, vce(cluster idcode) > summary(ffe) test ttl_exp_mean ttl_exp2_mean Note that the effect of time-constant variables like grade is not identi- The pooled OLS estimator with corrected standard errors is calculated fied by the fixed effects estimator. Cluster-robust Huber/White standard with the standard ols command regress: errors are reported with the lmtest package: reg ln_wage grade ttl_exp ttl_exp2, vce(cluster idcode) > library(lmtest) where the vce option was used to report correct cluster-robust Huber/White > coeftest(ffe, vcov=vcovHC(ffe, cluster="group")) standard errors. where the option cluster="group" defines the clusters by the individual The least squares dummy variables estimator is calculated by including identifier in the plm.data dataframe. dummy variables for individuals in pooled OLS. It is numerically only The random effects estimator is calculated by the R function plm and feasibly with relatively few individuals: its model option random: drop if idcode > 50 > fre <- plm(ln_wage~grade+ttl_exp+I(ttl_exp^2), model="random", xi: regress ln_wage ttl_exp ttl_exp2 i.idcode data = pnlswork) where only the first 43 individuals are used in the estimation. The long > summary(fre) list of estimated fixed effects can suppressed by using the areg command: Cluster-robust Huber/White standard errors are reported with the lmtest areg ln_wage ttl_exp ttl_exp2, absorb(idcode) package: 15 Short Guides to Microeconometrics Panel Data: Fixed and Random Effects 16

> library(lmtest) References > coeftest(fre, vcov=vcovHC(fre, cluster="group")) Introductory textbooks The Hausman test is calculated by estimating RE and FE and then com- paring the estimates: Stock, James H. and Mark W. Watson (2020), Introduction to Economet- > phtest(ffe, fre) rics, 4th Global ed., Pearson. Chapter 10. Wooldridge, Jeffrey M. (2009), Introductory : A Modern The pooled OLS estimator with corrected standard errors is calculated by Approach, 4th ed., South-Western Cengage Learning. Ch. 13 and 14. the R function plm and its model option pooled:

> fpo <- plm(ln_wage~grade+ttl_exp+I(ttl_exp^2), model="pooling", Advanced textbooks data = pnlswork) > library(lmtest) Cameron, A. Colin and Pravin K. Trivedi (2005), Microeconometrics: > coeftest(fpo, vcov=vcovHC(fpo, cluster="group")) Methods and Applications, Cambridge University Press. Chapter 21. where the lmtest package was used to report correct cluster-robust Hu- Wooldridge, Jeffrey M. (2010), Econometric Analysis of Cross Section and ber/White standard errors. Panel Data, MIT Press. Chapter 10. The least squares dummy variables estimator is calculated by including dummy variables for individuals in pooled OLS. It is numerically only Companion textbooks feasibly with relatively few individuals: Angrist, Joshua D. and J¨orn-Steffen Pischke (2009), Mostly Harmless > lsdv <- lm(ln_wage~ttl_exp+I(ttl_exp^2)+factor(idcode), data = nlswork, subset=(idcode<=50)) Econometrics: An Empiricist’s Companion, Princeton University Press. > summary(lsdv) Chapter 5. where only the first 43 individuals are used in the estimation. Articles

Manuel Arellano (1987), Computing Robust Standard Errors for Within- Group Estimators, Oxford Bulletin of Economics and , 49, 431-434. Bertrand, M., E. Duflo and S. Mullainathan (2004), How Much Should We Trust Differences-in-Differences Estimates?, Quarterly Journal of Economics, 119(1), 249-275. Mundlak, Y. (1978), On the pooling of time series and cross section data, Econometrica, 46, 69-85. Stock, James H. and Mark W. Watson (2008), Heteroskedasticity-Robust Standard Errors for Fixed Effects Panel Data Regression, Economet- rica, 76(1), 155-174. [advanced]