Panel Data: Fixed and Random Effects

Total Page:16

File Type:pdf, Size:1020Kb

Panel Data: Fixed and Random Effects Short Guides to Microeconometrics Kurt Schmidheiny Panel Data: Fixed and Random Effects 2 Fall 2020 Unversit¨atBasel We will assume throughout this handout that each individual i is ob- served in all time periods t. This is a so-called balanced panel. The Panel Data: Fixed and Random Effects treatment of unbalanced panels is straightforward but tedious. The T observations for individual i can be summarized as 2 3 2 0 3 2 0 3 2 3 yi1 xi1 zi ui1 6 . 7 6 . 7 6 . 7 6 . 7 6 . 7 6 . 7 6 . 7 6 . 7 1 Introduction 6 7 6 7 6 7 6 7 6 7 6 0 7 6 0 7 6 7 yi = 6 yit 7 Xi = 6 xit 7 Zi = 6 zi 7 ui = 6 uit 7 6 7 6 7 6 7 6 7 In panel data, individuals (persons, firms, cities, ... ) are observed at 6 . 7 6 . 7 6 . 7 6 . 7 4 . 5 4 . 5 4 . 5 4 . 5 several points in time (days, years, before and after treatment, ...). This y x0 z0 u iT T ×1 iT T ×K i T ×M iT T ×1 handout focuses on panels with relatively few time periods (small T ) and many individuals (large N). and NT observations for all individuals and time periods as This handout introduces the two basic models for the analysis of panel 2 3 2 3 2 3 2 3 y1 X1 Z1 u1 data, the fixed effects model and the random effects model, and presents 6 . 7 6 . 7 6 . 7 6 . 7 6 . 7 6 . 7 6 . 7 6 . 7 consistent estimators for these two models. The handout does not cover 6 7 6 7 6 7 6 7 6 7 6 7 6 7 6 7 so-called dynamic panel data models. y = 6 yi 7 X = 6 Xi 7 Z = 6 Zi 7 u = 6 ui 7 6 . 7 6 . 7 6 . 7 6 . 7 Panel data are most useful when we suspect that the outcome variable 6 . 7 6 . 7 6 . 7 6 . 7 4 . 5 4 . 5 4 . 5 4 . 5 depends on explanatory variables which are not observable but correlated y X Z u N NT ×1 N NT ×K N NT ×M N NT ×1 with the observed explanatory variables. If such omitted variables are constant over time, panel data estimators allow to consistently estimate The data generation process (dgp) is described by: the effect of the observed explanatory variables. PL1: Linearity 0 0 2 The Econometric Model yit = α + xitβ + ziγ + ci + uit where E[uit] = 0 and E[ci] = 0 Consider the multiple linear regression model for individual i = 1; :::; N The model is linear in parameters α, β, γ, effect ci and error uit. who is observed at several time periods t = 1; :::; T PL2: Independence y = α + x0 β + z0γ + c + u N it it i i it fXi; zi; yigi=1 i.i.d. (independent and identically distributed) 0 where yit is the dependent variable, xit is a K-dimensional row vector of The observations are independent across individuals but not necessarily 0 time-varying explanatory variables and zi is a M-dimensional row vector across time. This is guaranteed by random sampling of individuals. of time-invariant explanatory variables excluding the constant, α is the PL3: Strict Exogeneity intercept, β is a K-dimensional column vector of parameters, γ is a M- dimensional column vector of parameters, ci is an individual-specific effect E[uitjXi; zi; ci] = 0 (mean independent) and uit is an idiosyncratic error term. Version: 26-11-2020, 09:38 3 Short Guides to Microeconometrics Panel Data: Fixed and Random Effects 4 The idiosyncratic error term uit is assumed uncorrelated with the ex- RE3: Identifiability planatory variables of all past, current and future time periods of the 0 a) rank(W ) = K + M + 1 < NT and E[Wi Wi] = QWW is p.d. and same individual. This is a strong assumption which e.g. rules out lagged 0 0 0 finite. The typical element wit = [1 xit zi]. dependent variables. PL3 also assumes that the idiosyncratic error is 0 −1 uncorrelated with the individual specific effect. b) rank(W ) = K + M + 1 < NT and E[Wi Ωv;i Wi] = QW OW is p.d. and finite. Ωv;i is defined below. PL4: Error Variance RE3 assumes that the regressors including a constant are not perfectly a) V [u jX ; z ; c ] = σ2I, σ2 > 0 and finite i i i i u u collinear, that all regressors (but the constant) have non-zero variance (homoscedastic and no serial correlation) and not too many extreme values. b) V [u jX ; z ; c ] = σ2 > 0, finite and it i i i u;it The random effects model can be written as Cov[uit; uisjXi; zi; ci] = 0 8s 6= t (no serial correlation) 0 0 yit = α + x β + z γ + vit c) V [uijXi; zi; ci] = Ωu;i(Xi; zi) is p.d. and finite it i The remaining assumptions are divided into two sets of assumptions: the where vit = ci +uit. Assuming PL2, PL4 and RE1 in the special versions random effects model and the fixed effects model. PL4a and RE2a leads to 0 1 Ωv;1 ··· 0 ··· 0 2.1 The Random Effects Model B . C B . .. C B C B C In the random effects model, the individual-specific effect is a random Ωv = V [vjX; Z] = B 0 Ωv;i 0 C B C variable that is uncorrelated with the explanatory variables. B . .. C @ . A 0 ··· 0 ··· Ω RE1: Unrelated effects v;N NT ×NT with typical element E[cijXi; zi] = 0 0 2 2 2 1 RE1 assumes that the individual-specific effect is a random variable that σv σc ··· σc is uncorrelated with the explanatory variables of all past, current and B 2 2 2 C B σc σv ··· σc C Ωv;i = V [vijXi; zi] = B . C future time periods of the same individual. B . .. C @ . A σ2 σ2 ··· σ2 RE2: Effect Variance c c v T ×T 2 2 2 2 a) V [cijXi; zi] = σc < 1 (homoscedastic) where σv = σc + σu. This special case under PL4a and RE2a is therefore 2 called the equicorrelated random effects model. b) V [cijXi; zi] = σc;i(Xi; zi) < 1 (heteroscedastic) RE2a assumes constant variance of the individual specific effect. 2.2 The Fixed Effects Model In the fixed effects model, the individual-specific effect is a random vari- able that is allowed to be correlated with the explanatory variables. 5 Short Guides to Microeconometrics Panel Data: Fixed and Random Effects 6 FE1: Related effects and RE3a in samples with a large number of individuals (N ! 1). How- { ever, the pooled OLS estimator is not efficient. More importantly, the FE1 explicitly states the absence of the unrelatedness assumption in RE1. usual standard errors of the pooled OLS estimator are incorrect and tests (t-, F -, z-, Wald-) based on them are not valid. Correct standard errors FE2: Effect Variance can be estimated with the so-called cluster-robust covariance estimator { treating each individual as a cluster (see the handout on \Clustering in FE2 explicitly states the absence of the assumption in RE2. the Linear Model"). Fixed effects model: The pooled OLS estimators of α, β and γ are FE3: Identifiability biased and inconsistent, because the variable ci is omitted and potentially ¨ ¨ 0 ¨ correlated with the other regressors. rank(X) = K < NT and E(XiXi) is p.d. and finite P where the typical elementx ¨it = xit − x¯i andx ¯i = 1=T t xit FE3 assumes that the time-varying explanatory variables are not perfectly 4 Random Effects Estimation collinear, that they have non-zero within-variance (i.e. variation over time The random effects estimator is the feasible generalized least squares for a given individual) and not too many extreme values. Hence, x it (GLS) estimator cannot include a constant or any time-invariant variables. Note that only 0 1 the parameters β but neither α nor γ are identifiable in the fixed effects αRE b −1 −1 −1 B C 0 0 model. @ βbRE A = W Ωbv W W Ωbv y: γbRE 3 Estimation with Pooled OLS where W = [ιNT XZ] and ιNT is a NT × 1 vector of ones. The error covariance matrix Ωv is assumed block-diagonal with equicor- The pooled OLS estimator ignores the panel structure of the data and related diagonal elements Ωv;i as in section 2.1 which depend on the two simply estimates α, β and γ as 2 2 unknown parameters σv and σc only. There are many different ways to 0 1 estimate these two parameters. For example, αbP OLS 0 −1 0 T N B βbP OLS C = (W W ) W y @ A 2 1 X X 2 2 2 2 σv = vit ; σc = σv − σu γP OLS b NT b b b b b t=1 i=1 where W = [ιNT XZ] and ιNT is a NT × 1 vector of ones. where T N Random effects model: The pooled OLS estimator of α, β and γ is un- 1 X X σ2 = (v − v )2 bu NT − N bit bi biased under PL1, PL2, PL3, RE1, and RE3 in small samples. Addition- t=1 i=1 ally assuming PL4 and normally distributed idiosyncratic and individual- 0 0 PT and vbit = yit − αP OLS − xitβbP OLS − ziγbP OLS and vbi = 1=T t=1 vbit. The specific errors, it is normally distributed in small samples. It is consistent 2 degree of freedom correction in σbu is also asymptotically important when and approximately normally distributed under PL1, PL2, PL3, PL4, RE1, N ! 1.
Recommended publications
  • Improving the Interpretation of Fixed Effects Regression Results
    Improving the Interpretation of Fixed Effects Regression Results Jonathan Mummolo and Erik Peterson∗ October 19, 2017 Abstract Fixed effects estimators are frequently used to limit selection bias. For example, it is well-known that with panel data, fixed effects models eliminate time-invariant confounding, estimating an independent variable's effect using only within-unit variation. When researchers interpret the results of fixed effects models, they should therefore consider hypothetical changes in the independent variable (coun- terfactuals) that could plausibly occur within units to avoid overstating the sub- stantive importance of the variable's effect. In this article, we replicate several recent studies which used fixed effects estimators to show how descriptions of the substantive significance of results can be improved by precisely characterizing the variation being studied and presenting plausible counterfactuals. We provide a checklist for the interpretation of fixed effects regression results to help avoid these interpretative pitfalls. ∗Jonathan Mummolo is an Assistant Professor of Politics and Public Affairs at Princeton University. Erik Pe- terson is a post-doctoral fellow at Dartmouth College. We are grateful to Justin Grimmer, Dorothy Kronick, Jens Hainmueller, Brandon Stewart, Jonathan Wand and anonymous reviewers for helpful feedback on this project. The fixed effects regression model is commonly used to reduce selection bias in the es- timation of causal effects in observational data by eliminating large portions of variation thought to contain confounding factors. For example, when units in a panel data set are thought to differ systematically from one another in unobserved ways that affect the outcome of interest, unit fixed effects are often used since they eliminate all between-unit variation, producing an estimate of a variable's average effect within units over time (Wooldridge 2010, 304; Allison 2009, 3).
    [Show full text]
  • Linear Mixed-Effects Modeling in SPSS: an Introduction to the MIXED Procedure
    Technical report Linear Mixed-Effects Modeling in SPSS: An Introduction to the MIXED Procedure Table of contents Introduction. 1 Data preparation for MIXED . 1 Fitting fixed-effects models . 4 Fitting simple mixed-effects models . 7 Fitting mixed-effects models . 13 Multilevel analysis . 16 Custom hypothesis tests . 18 Covariance structure selection. 19 Random coefficient models . 20 Estimated marginal means. 25 References . 28 About SPSS Inc. 28 SPSS is a registered trademark and the other SPSS products named are trademarks of SPSS Inc. All other names are trademarks of their respective owners. © 2005 SPSS Inc. All rights reserved. LMEMWP-0305 Introduction The linear mixed-effects models (MIXED) procedure in SPSS enables you to fit linear mixed-effects models to data sampled from normal distributions. Recent texts, such as those by McCulloch and Searle (2000) and Verbeke and Molenberghs (2000), comprehensively review mixed-effects models. The MIXED procedure fits models more general than those of the general linear model (GLM) procedure and it encompasses all models in the variance components (VARCOMP) procedure. This report illustrates the types of models that MIXED handles. We begin with an explanation of simple models that can be fitted using GLM and VARCOMP, to show how they are translated into MIXED. We then proceed to fit models that are unique to MIXED. The major capabilities that differentiate MIXED from GLM are that MIXED handles correlated data and unequal variances. Correlated data are very common in such situations as repeated measurements of survey respondents or experimental subjects. MIXED extends repeated measures models in GLM to allow an unequal number of repetitions.
    [Show full text]
  • Application of Random-Effects Probit Regression Models
    Journal of Consulting and Clinical Psychology Copyright 1994 by the American Psychological Association, Inc. 1994, Vol. 62, No. 2, 285-296 0022-006X/94/S3.00 Application of Random-Effects Probit Regression Models Robert D. Gibbons and Donald Hedeker A random-effects probit model is developed for the case in which the outcome of interest is a series of correlated binary responses. These responses can be obtained as the product of a longitudinal response process where an individual is repeatedly classified on a binary outcome variable (e.g., sick or well on occasion t), or in "multilevel" or "clustered" problems in which individuals within groups (e.g., firms, classes, families, or clinics) are considered to share characteristics that produce similar responses. Both examples produce potentially correlated binary responses and modeling these per- son- or cluster-specific effects is required. The general model permits analysis at both the level of the individual and cluster and at the level at which experimental manipulations are applied (e.g., treat- ment group). The model provides maximum likelihood estimates for time-varying and time-invari- ant covariates in the longitudinal case and covariates which vary at the level of the individual and at the cluster level for multilevel problems. A similar number of individuals within clusters or number of measurement occasions within individuals is not required. Empirical Bayesian estimates of per- son-specific trends or cluster-specific effects are provided. Models are illustrated with data from
    [Show full text]
  • 10.1 Topic 10. ANOVA Models for Random and Mixed Effects References
    10.1 Topic 10. ANOVA models for random and mixed effects References: ST&DT: Topic 7.5 p.152-153, Topic 9.9 p. 225-227, Topic 15.5 379-384. There is a good discussion in SAS System for Linear Models, 3rd ed. pages 191-198. Note: do not use the rules for expected MS on ST&D page 381. We provide in the notes below updated rules from Chapter 8 from Montgomery D.C. (1991) “Design and analysis of experiments”. 10. 1. Introduction The experiments discussed in previous chapters have dealt primarily with situations in which the experimenter is concerned with making comparisons among specific factor levels. There are other types of experiments, however, in which one wish to determine the sources of variation in a system rather than make particular comparisons. One may also wish to extend the conclusions beyond the set of specific factor levels included in the experiment. The purpose of this chapter is to introduce analysis of variance models that are appropriate to these different kinds of experimental objectives. 10. 2. Fixed and Random Models in One-way Classification Experiments 10. 2. 1. Fixed-effects model A typical fixed-effect analysis of an experiment involves comparing treatment means to one another in an attempt to detect differences. For example, four different forms of nitrogen fertilizer are compared in a CRD with five replications. The analysis of variance takes the following familiar form: Source df Total 19 Treatment (i.e. among groups) 3 Error (i.e. within groups) 16 And the linear model for this experiment is: Yij = µ + i + ij.
    [Show full text]
  • Chapter 17 Random Effects and Mixed Effects Models
    Chapter 17 Random Effects and Mixed Effects Models Example. Solutions of alcohol are used for calibrating Breathalyzers. The following data show the alcohol concentrations of samples of alcohol solutions taken from six bottles of alcohol solution randomly selected from a large batch. The objective is to determine if all bottles in the batch have the same alcohol concentrations. Bottle Concentration 1 1.4357 1.4348 1.4336 1.4309 2 1.4244 1.4232 1.4213 1.4256 3 1.4153 1.4137 1.4176 1.4164 4 1.4331 1.4325 1.4312 1.4297 5 1.4252 1.4261 1.4293 1.4272 6 1.4179 1.4217 1.4191 1.4204 We are not interested in any difference between the six bottles used in the experiment. We therefore treat the six bottles as a random sample from the population and use a random effects model. If we were interested in the six bottles, we would use a fixed effects model. 17.3. One Random Factor 17.3.1. The Random‐Effects One‐way Model For a completely randomized design, with v treatments of a factor T, the one‐way random effects model is , :0, , :0, , all random variables are independent, i=1, ..., v, t=1, ..., . Note now , and are called variance components. Our interest is to investigate the variation of treatment effects. In a fixed effect model, we can test whether treatment effects are equal. In a random effects model, instead of testing : ..., against : treatment effects not all equal, we use : 0,: 0. If =0, then all random effects are 0.
    [Show full text]
  • Recovering Latent Variables by Matching∗
    Recovering Latent Variables by Matching∗ Manuel Arellanoy St´ephaneBonhommez This draft: November 2019 Abstract We propose an optimal-transport-based matching method to nonparametrically es- timate linear models with independent latent variables. The method consists in gen- erating pseudo-observations from the latent variables, so that the Euclidean distance between the model's predictions and their matched counterparts in the data is mini- mized. We show that our nonparametric estimator is consistent, and we document that it performs well in simulated data. We apply this method to study the cyclicality of permanent and transitory income shocks in the Panel Study of Income Dynamics. We find that the dispersion of income shocks is approximately acyclical, whereas skewness is procyclical. By comparison, we find that the dispersion and skewness of shocks to hourly wages vary little with the business cycle. Keywords: Latent variables, nonparametric estimation, matching, factor models, op- timal transport. ∗We thank Tincho Almuzara and Beatriz Zamorra for excellent research assistance. We thank Colin Mallows, Kei Hirano, Roger Koenker, Thibaut Lamadon, Guillaume Pouliot, Azeem Shaikh, Tim Vogelsang, Daniel Wilhelm, and audiences at various places for comments. Arellano acknowledges research funding from the Ministerio de Econom´ıay Competitividad, Grant ECO2016-79848-P. Bonhomme acknowledges support from the NSF, Grant SES-1658920. yCEMFI, Madrid. zUniversity of Chicago. 1 Introduction In this paper we propose a method to nonparametrically estimate a class of models with latent variables. We focus on linear factor models whose latent factors are mutually independent. These models have a wide array of economic applications, including measurement error models, fixed-effects models, and error components models.
    [Show full text]
  • Testing for a Winning Mood Effect and Predicting Winners at the 2003 Australian Open
    University of Tasmania School of Mathematics and Physics Testing for a Winning Mood Effect and Predicting Winners at the 2003 Australian Open Lisa Miller October 2005 Submitted in partial fulfilment of the requirements for the Degree of Bachelor of Science with Honours Supervisor: Dr Simon Wotherspoon Acknowledgements I would like to thank my supervisor Dr Simon Wotherspoon for all the help he has given me throughout the year, without him this thesis would not have happened. I would also like to thank my family, friends and fellow honours students, especially Shari, for all their support. 1 Abstract A `winning mood effect’ can be described as a positive effect that leads a player to perform well on a point after performing well on the previous point. This thesis investigates the `winning mood effect’ in males singles data from the 2003 Australian Open. It was found that after winning the penultimate set there was an increase in the probability of winning the match and that after serving an ace and also after breaking your opponents service there was an increase in the probability of winning the next service. Players were found to take more risk at game point but less risk after their service was broken. The second part of this thesis looked at predicting the probability of winning a match based on some previous measure of score. Using simulation, several models were able to predict the probability of winning based on a score from earlier in the match, or on a difference in player ability. However, the models were only really suitable for very large data sets.
    [Show full text]
  • Menbreg — Multilevel Mixed-Effects Negative Binomial Regression
    Title stata.com menbreg — Multilevel mixed-effects negative binomial regression Syntax Menu Description Options Remarks and examples Stored results Methods and formulas References Also see Syntax menbreg depvar fe equation || re equation || re equation ::: , options where the syntax of fe equation is indepvars if in , fe options and the syntax of re equation is one of the following: for random coefficients and intercepts levelvar: varlist , re options for random effects among the values of a factor variable levelvar: R.varname levelvar is a variable identifying the group structure for the random effects at that level or is all representing one group comprising all observations. fe options Description Model noconstant suppress the constant term from the fixed-effects equation exposure(varnamee) include ln(varnamee) in model with coefficient constrained to 1 offset(varnameo) include varnameo in model with coefficient constrained to 1 re options Description Model covariance(vartype) variance–covariance structure of the random effects noconstant suppress constant term from the random-effects equation 1 2 menbreg — Multilevel mixed-effects negative binomial regression options Description Model dispersion(dispersion) parameterization of the conditional overdispersion; dispersion may be mean (default) or constant constraints(constraints) apply specified linear constraints collinear keep collinear variables SE/Robust vce(vcetype) vcetype may be oim, robust, or cluster clustvar Reporting level(#) set confidence level; default is level(95)
    [Show full text]
  • A Simple, Linear, Mixed-Effects Model
    Chapter 1 A Simple, Linear, Mixed-effects Model In this book we describe the theory behind a type of statistical model called mixed-effects models and the practice of fitting and analyzing such models using the lme4 package for R. These models are used in many different dis- ciplines. Because the descriptions of the models can vary markedly between disciplines, we begin by describing what mixed-effects models are and by ex- ploring a very simple example of one type of mixed model, the linear mixed model. This simple example allows us to illustrate the use of the lmer function in the lme4 package for fitting such models and for analyzing the fitted model. We also describe methods of assessing the precision of the parameter estimates and of visualizing the conditional distribution of the random effects, given the observed data. 1.1 Mixed-effects Models Mixed-effects models, like many other types of statistical models, describe a relationship between a response variable and some of the covariates that have been measured or observed along with the response. In mixed-effects models at least one of the covariates is a categorical covariate representing experimental or observational “units” in the data set. In the example from the chemical industry that is given in this chapter, the observational unit is the batch of an intermediate product used in production of a dye. In medical and social sciences the observational units are often the human or animal subjects in the study. In agriculture the experimental units may be the plots of land or the specific plants being studied.
    [Show full text]
  • BU-1120-M RANDOM EFFECTS MODELS for BINARY DATA APPLIED to ENVIRONMENTAL/ECOLOGICAL STUDIES by Charles E. Mcculloch Biometrics U
    RANDOM EFFECTS MODELS FOR BINARY DATA APPLIED TO ENVIRONMENTAL/ECOLOGICAL STUDIES by Charles E. McCulloch Biometrics Unit and Statistics Center Cornell University BU-1120-M April1991 ABSTRACT The outcome measure in many environmental and ecological studies is binary, which complicates the use of random effects. The lack of methods for such data is due in a large part to both the difficulty of specifying realistic models, and once specified, to their computational in tractibility. In this paper we consider a number of examples and review some of the approaches that have been proposed to deal with random effects in binary data. We consider models for random effects in binary data and analysis and computation strategies. We focus in particular on probit-normal and logit-normal models and on a data set quantifying the reproductive success in aphids. -2- Random Effects Models for Binary Data Applied to Environmental/Ecological Studies by Charles E. McCulloch 1. INTRODUCTION The outcome measure in many environmental and ecological studies is binary, which complicates the use of random effects since such models are much less well developed than for continuous data. The lack of methods for such data is due in a large part to both the difficulty of specifying realistic models, and once specified, to their computational intractibility. In this paper we consider a number of examples and review some of the approaches that have been proposed to deal with random effects in binary data. The advantages and disadvantages of each are discussed. We focus in particular on a data set quantifying the reproductive success in aphids.
    [Show full text]
  • Lecture 34 Fixed Vs Random Effects
    Lecture 34 Fixed vs Random Effects STAT 512 Spring 2011 Background Reading KNNL: Chapter 25 34-1 Topic Overview • Random vs. Fixed Effects • Using Expected Mean Squares (EMS) to obtain appropriate tests in a Random or Mixed Effects Model 34-2 Fixed vs. Random Effects • So far we have considered only fixed effect models in which the levels of each factor were fixed in advance of the experiment and we were interested in differences in response among those specific levels . • A random effects model considers factors for which the factor levels are meant to be representative of a general population of possible levels. 34-3 Fixed vs. Random Effects (2) • For a random effect, we are interested in whether that factor has a significant effect in explaining the response, but only in a general way. • If we have both fixed and random effects, we call it a “mixed effects model”. • To include random effects in SAS, either use the MIXED procedure, or use the GLM procedure with a RANDOM statement. 34-4 Fixed vs. Random Effects (2) • In some situations it is clear from the experiment whether an effect is fixed or random. However there are also situations in which calling an effect fixed or random depends on your point of view, and on your interpretation and understanding. So sometimes it is a personal choice. This should become more clear with some examples. 34-5 Random Effects Model • This model is also called ANOVA II (or variance components model). • Here is the one-way model: Yij=µ + α i + ε ij αN σ 2 i~() 0, A independent ε~N 0, σ 2 ij () 2 2 Yij~ N (µ , σ A + σ ) 34-6 Random Effects Model (2) Now the cell means µi= µ + α i are random variables with a common mean.
    [Show full text]
  • Advanced Panel Data Methods Fixed Effects Estimators We Discussed The
    EC327: Advanced Econometrics, Spring 2007 Wooldridge, Introductory Econometrics (3rd ed, 2006) Chapter 14: Advanced panel data methods Fixed effects estimators We discussed the first difference (FD) model as one solution to the problem of unobserved heterogeneity in the context of panel data. It is not the only solution; the leading alternative is the fixed effects model, which will be a better solution under certain assumptions. For a model with a single explanatory variable, yit = β1xit + ai + uit (1) If we average this equation over time for each unit i, we have y¯it = β1x¯it + ¯ai +u ¯it (2) Subtracting the second equation from the first, we arrive at yit − y¯it = β1(xit − x¯it) + (uit − u¯it) (3) defining the demeaned data on [y, x] as the ob- servations of each panel with their mean values per individual removed. This algebra is known as the within transformation, and the estima- tor we derive is known as the within estimator. Just as OLS in a cross-sectional context only “explains” the deviations of y from its mean y¯, the within estimator’s explanatory value is derived from the comovements of y around its individual-specific mean with x around its individual-specific mean. Thus, it matters not if a unit has consistently high or low values of y and x. All that matters is how the variations around those mean values are correlated. Just as in the case of the FD estimator, the within estimator will be unbiased and consis- tent if the explanatory variables are strictly ex- ogenous: independent of the distribution of u in terms of their past, present and future val- ues.
    [Show full text]