Short Guides to Microeconometrics Kurt Schmidheiny Heteroskedasticity in the 2 Fall 2020 Unversit¨atBasel

Note that under OLS2 (i.i.d. ) the errors are unconditionally 2 homoscedastic, V [ui] = σ but allowed to be conditionally heteroscedas- tic, V [u |x ] = σ2. Assuming OLS2 and OLS3c provides that the errors Heteroskedasticity in the Linear Model i i i are also not conditionally autocorrelated, i.e. ∀i 6= j : Cov[ui, uj|xi, xj] =

E[uiuj|xi, xj] = E[ui|xi] · E[uj|xj] = 0 . Also note that the condition-

ing on xi is less restrictive than it may seem: if the conditional

V [ui|xi] depends on other exogenous variables (or functions of them), we 1 Introduction can include these variables in xi and set the corresponding β parameters This handout extends the handout on “The Multiple to zero. 0 model” and refers to its definitions and assumptions in section 2. The variance- of the vector of error terms u = (u1, u2, ..., uN ) This handouts relaxes the assumption (OLS4a) and in the whole sample is therefore shows how the parameters of the linear model are correctly estimated and 0 2 tested when the error terms are heteroscedastic (OLS4b). V [u|X] = E[uu |X] = σ Ω

 2    σ1 0 ··· 0 ω1 0 ··· 0 2 The Econometric Model  0 σ2 ··· 0   0 ω ··· 0   2  2  2  =  . . .  = σ  . . .   . . .. .   . . .. .  Consider the multiple linear regression model for observation i = 1, ..., N  . . . .   . . . .  2 0 0 ··· σN 0 0 ··· ωN 0 yi = xiβ + ui where x0 is a (K + 1)-dimensional row vector of K explanatory variables i 3 A Generic Case: Groupewise Heteroskedasticity and a constant, β is a (K + 1)-dimensional column vector of parameters and ui is a scalar called the error term. Heteroskedasticity is sometimes a direct consequence of the construc- Assume OLS1, OLS2, OLS3 and tion of the . Consider the following linear regression model with homoscedastic errors OLS4: Error Variance 0 yi = xiβ + ui

b) conditional : 2 with V [ui|xi] = V [ui] = σ . V [u |x ] = σ2 = σ2ω = σ2ω(x ) i i i i i Assume that instead of the individual observations y and x only the 2 0 i i and E[ui xixi] = QXΩX is p.d. and finite values yg and xg for g = 1, ..., G groups are observed. The error 2 term in the resulting regression model where ω(.) is a function constant across i. The decomposition of σi into 2 ωi and σ is arbitrary but useful. 0 yg = xgβ + ug

Version: 29-10-2020, 23:34 3 Short Guides to Microeconometrics Heteroskedasticity in the Linear Model 4

2 2 is now conditionally heteroskedastic with V [ug|Ng] = σg = σ /Ng, where 4 Estimation with OLS Ng is a with the number of observations in group g. The parameter β can be estimated with the usual OLS estimator

0 −1 0 βbOLS = (X X) X y

The OLS estimator of β remains unbiased under OLS1, OLS2, OLS3c, OLS4b, and OLS5 in small samples. Additionally assuming OLS3a, it is normally distributed in small samples. It is consistent and approximately normally distributed under OLS1, OLS2, OLS3d, OLS4a or OLS4b and OLS5, in large samples. However, the OLS estimator is not efficient any more. More importantly, the usual standard errors of the OLS estimator and tests (t-, F -, z-, Wald-) based on them are not valid any more.

5 Estimating the Variance of the OLS Estimator

The small sample covariance of βbOLS is under OLS4b

0 −1  0 2  0 −1 V [βbOLS|X] = (X X) X σ ΩX (X X)

2 0 −1 and differs from usual OLS where V [βbOLS|X] = σ (X X) . Conse- 2 0 −1 quently, the usual estimator Vb[βbOLS|X] = σc(X X) is biased. Usual small sample test procedures, such as the F - or t-Test, based on the usual estimator are therefore not valid. The OLS estimator is asymptotically normally distributed under OLS1, OLS2, OLS3d, and OLS5 √ d −1 −1  N(βb − β) −→ N 0,QXX QXΩX QXX

2 0 0 where QXΩX = E[ui xixi] and QXX = E[xixi]. The OLS estimator is therefore approximately normally distributed in large samples as

A   βb ∼ N β, Avar[βb] 5 Short Guides to Microeconometrics Heteroskedasticity in the Linear Model 6

−1 −1 −1 where Avar[βb] = N QXX QXΩX QXX can be consistently estimated as 7 Estimation with GLS/WLS when Ω is Known

" N # 0 −1 X 2 0 0 −1 When Ω is known, β is efficiently estimated with generalized Avar[ [βb] = (X X) ubi xixi (X X) i=1 (GLS) −1  0 −1  0 −1 with some additional assumptions on higher order moments of xi (see βbGLS = X Ω X X Ω y. White 1980). The GLS estimator simplifies in the case of heteroskedasticity to This so-called White or Eicker-Huber-White estimator of the covari- ˜ 0 ˜ −1 ˜ 0 ance matrix is a heteroskedasticity-consistent estimator βbW LS = (X X) X y˜ that does not require any assumptions on the form of heteroscedasticity where (though we assumed independence of the error terms in OLS2 ). Stan-  √   0 √  y1/ ω1 x1/ ω1 dard errors based on the White estimator are often called robust. We can  .  ˜  .  y˜ =  .  , X =  .  perform the usual z- and Wald-test for large samples using the White     √ 0 √ covariance estimator. yN / ωN xN / ωN Note: t- and F -Tests using the White covariance estimator are only and is called (WLS) estimator. The WLS estimator asymptotically valid because the White covariance estimator is consistent of β can therefore be estimated by running OLS with the transformed but not unbiased. It is therefore more appropriate to use large sample variables. tests (z, Wald). Note: the above transformation of the explanatory variables also ap- Bootstrapping (see the handout on “The Bootstrap”) is an alternative √ plies to the constant, i.e.x ˜i0 = 1/ ωi. The OLS regression using the method to estimate a heteroscedasticity robust covariance matrix. transformed variables does not include an additional constant. The WLS estimator minimizes the sum of squared residuals weighted 6 Testing for Heteroskedasticity by 1/ωi:

There are several tests for the assumption that the error term is ho- N N X 0 2 X 1 0 2 S (α, β) = (˜yi − x˜iβ) = (yi − xiβ) → min moskedastic. White (1980)’s test is general and does not presume a par- ω β i=1 i=1 i ticular form of heteroskedasticity. Unfortunately, little can be said about its power and it has poor small sample properties unless the number of The WLS estimator of β is unbiased and efficient (under OLS1, OLS2, regressors is very small. If we have prior knowledge that the variance σ2 i OLS3c, OLS4b, and OLS5 ) and normally distributed additionally assum- is a linear (in parameters) function of explanatory variables, the Breusch- ing OLS3a (normality) in small samples. Pagan (1979) test is more powerful. Koenker (1981) proposes a variant of The WLS estimator of β is consistent, asymptotically efficient and ap- the Breusch-Pagan test that does not assume normally distributed errors. proximately normally distributed under OLS4b (conditional heteroscedas- Note: In practice we often do not test for heteroskedasticity but di- ticity) and OLS2 , OLS1, OLS3d, and a modification of OLS5 rectly report heteroskedasticity-robust standard errors. 7 Short Guides to Microeconometrics Heteroskedasticity in the Linear Model 8

OLS5: Identifiability in the calculations for βbF GLS and Avar[ [βbF GLS]. X0Ω−1X is p.d. and finite The FGLS estimator is consistent and approximately normally dis- −1 0 tributed in large samples under OLS1, OLS2 ({xi, zi, yi} i.i.d.), OLS3d, E[ω xixi] = QXΩ−1X is p.d. and finite i 2 OLS4b, OLS5 and some additional more technical assumptions. If σi is where ωi = ω(xi) is a function of xi. correctly specified, βF GLS is asymptotically efficient and the usual tests The covariance matrix of βbW LS can be estimated as (z, Wald) for large samples are valid; small samples tests are only asymp- 2 ˜ 0 ˜ −1 2 Avar[ [βbW LS] = Vb[βbW LS] = σc(X X) totically valid and nothing is gained from using them. If σi is not correctly 0 0 specified, the usual covariance matrix is inconsistent and tests (z, Wald) where σ2 = u˜ u/˜ (N − K − 1) in small samples and σ2 = u˜ u/N˜ in large c b b c b b invalid. In this case, the White covariance estimator used after FGLS pro- samples and u˜ =y ˜ − X˜β . Usual tests (t-, F -) for small samples b bGLS vides consistent standard errors and valid large sample tests (z, Wald). are valid (under OLS1, OLS2, OLS3a, OLS4b and OLS5 ; usual tests (z, Note: In practice, we often choose a simple model for heteroscedastic- Wald) for large samples are also valid (under OLS1, OLS2, OLS3d, OLS4b ity using only one or two regressors and use robust standard errors. and OLS5 ).

8 Estimation with FGLS/FWLS when Ω is Unknown

In practice, Ω is typically unknown. However, we can model the ωi’s as a function of the data and estimate this relationship. Feasible gener- alized least squares (FGLS) replaces ωi by their predicted values ωbi and calculates then βbF GLS as if ωi were known. A useful model for the error variance is

2 0 σi = V [ui|zi] = exp(ziδ) where zi are L + 1 variables that may belong to xi including a constant and δ is a vector of parameters. We can estimate the auxiliary regression

2 0 ubi = exp(ziδ) + νi 0 by nonlinear least squares (NLLS) where ubi = yi−xiβbOLS or alternatively, 2 0 log(ubi ) = ziδ + νi by (OLS). In both cases, we use the predictions

0 ωbi = exp(ziδb) 9 Short Guides to Microeconometrics Heteroskedasticity in the Linear Model 10

0 Implementation in Stata 15 variance, e.g. ωbi = exp(ziδb)

Stata reports the White covariance estimator with the robust option, e.g. 3. perform a linear regression using the weights 1/ωbi webuse auto.dta For example, regress price mpg weight, vce(robust) matrix list e(V) regress price mpg weight predict e, residual Alternatively, Stata estimates a heteroscedasticity robust covariance generate loge2 = log(e^2) using a nonparametric bootstrap. For example, regress loge2 weight foreign predict zd regress price mpg weight, vce(bootstrap, rep(100)) generate w=exp(zd) matrix list e(V) regress price mpg weight [aweight = 1/w] The White (1980) test for heteroskedasticity is implemented in the post-estimation command Implementation in R estat imtest, white R reports the Eicker-Huber-White covariance after estimation using the The Koenker (1981) version of the Breusch-Pagan (1979) test is im- two packages sandwich and lmtest plemented in the postestimation command estat hettest. For example, > library(foreign) estat hettest weight foreign, iid > auto <- read.dta("http://www.stata-press.com/data/r11/auto.dta") > fm <- lm(price~mpg+weight, data=auto) 2 assumes σi = δ0 + δ1weighti + δ2foreigni and tests H0 : δ1 = δ2 = 0. > library(sandwich) WLS is estimated in Stata using analytic weights. For example, > library(lmtest) > coeftest(fm, vcov=sandwich) regress depvar indepvars [aweight = 1/w] F -tests for one or more restrictions are calculated with the command calculates the WLS estimator assuming ωi is provided in the variable w. waldtest which also 2 2 Recall that we defined σi = σ ωi (mind the squares). The analytic weight > waldtest(fm, "weight", vcov=sandwich) is proportional to the inverse variance of the error term. Stata internally P scales the weights s.t. 1/ωi = N. The reported Root MSE therefore tests H0 : β1 = 0 against HA : β1 6= 0 with Eicker-Huber-White, and 2 P 2 reports σb = (1/N) σbi . > waldtest(fm, .~.-weight-displacement, vcov=sandwich) FGLS/FWLS is carried out step by step: tests H0 : β1 = 0 and β2 = 0 against HA : β1 6= 0 or β2 6= 0. 0 1. Estimate the regression model yi = xiβ + ui by OLS and predict 0 the residuals ubi = yi − xiβb 2. Use the residuals to estimate a regression model for the error vari- 2 0 ance, e.g. log(ubi ) = ziδ + νi, and to predict the individual error 11 Short Guides to Microeconometrics

References

Introductory textbooks

Stock, James H. and Mark W. Watson (2020), Introduction to Economet- rics, 4th Global ed., Pearson. Section 5.4. Wooldridge, Jeffrey M. (2009), Introductory : A Modern Approach, 4th ed., South-Western Cengage Learning. Chapter 8.

Advanced textbooks

Cameron, A. Colin and Pravin K. Trivedi (2005), Microeconometrics: Methods and Applications, Cambridge University Press. Sections 4.5. Hayashi, Fumio (2000), Econometrics, Princeton University Press. Sec- tions 1.6 and 2.8. Wooldridge, Jeffrey M. (2002), Econometric Analysis of Cross Section and , MIT Press. Sections 4.23 and 6.24.

Companion textbooks

Kennedy, Peter (2008), A Guide to Econometrics, 6th ed., Blackwell Pub- lishing. Chapters 8.1-8.3.

Articles

Breusch, T.S. and A.R. Pagan (1979), A Simple Test for Heteroscedastic- ity and Random Coefficient Variation. 47, 987–1007. Koenker, R. (1981), A Note on Studentizing a Test for Heteroskedasticity. 29, 305–326. White, H. (1980), A Heteroscedasticity-Consistent Covariance Matrix Es- timator and a Direct Test for Heteroscedasticity. Econometrica 48, 817–838.