Heteroskedasticity in the Linear Model

Short Guides to Microeconometrics
Kurt Schmidheiny, Universität Basel
Fall 2020
Version: 29-10-2020, 23:35

1 Introduction

This handout extends the handout on "The Multiple Linear Regression Model" and refers to its definitions and assumptions in section 2. It relaxes the homoscedasticity assumption (OLS4a) and shows how the parameters of the linear model are correctly estimated and tested when the error terms are heteroscedastic (OLS4b).

2 The Econometric Model

Consider the multiple linear regression model for observation $i = 1, \ldots, N$

$$y_i = x_i'\beta + u_i$$

where $x_i'$ is a $(K+1)$-dimensional row vector of $K$ explanatory variables and a constant, $\beta$ is a $(K+1)$-dimensional column vector of parameters and $u_i$ is a scalar called the error term.

Assume OLS1, OLS2, OLS3 and

OLS4: Error Variance

b) conditional heteroscedasticity:

$$V[u_i \mid x_i] = \sigma_i^2 = \sigma^2 \omega_i = \sigma^2 \omega(x_i)$$

and $E[u_i^2 x_i x_i'] = Q_{X\Omega X}$ is p.d. and finite

where $\omega(\cdot)$ is a function constant across $i$. The decomposition of $\sigma_i^2$ into $\omega_i$ and $\sigma^2$ is arbitrary but useful.

Note that under OLS2 (i.i.d. sample) the errors are unconditionally homoscedastic, $V[u_i] = \sigma^2$, but allowed to be conditionally heteroscedastic, $V[u_i \mid x_i] = \sigma_i^2$. Assuming OLS2 and OLS3c provides that the errors are also not conditionally autocorrelated, i.e. $\forall i \neq j: Cov[u_i, u_j \mid x_i, x_j] = E[u_i u_j \mid x_i, x_j] = E[u_i \mid x_i] \cdot E[u_j \mid x_j] = 0$. Also note that the conditioning on $x_i$ is less restrictive than it may seem: if the conditional variance $V[u_i \mid x_i]$ depends on other exogenous variables (or functions of them), we can include these variables in $x_i$ and set the corresponding $\beta$ parameters to zero.

The variance-covariance of the vector of error terms $u = (u_1, u_2, \ldots, u_N)'$ in the whole sample is therefore

$$V[u \mid X] = E[uu' \mid X] = \sigma^2 \Omega
= \begin{pmatrix}
\sigma_1^2 & 0 & \cdots & 0 \\
0 & \sigma_2^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma_N^2
\end{pmatrix}
= \sigma^2 \begin{pmatrix}
\omega_1 & 0 & \cdots & 0 \\
0 & \omega_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \omega_N
\end{pmatrix}$$

3 A Generic Case: Groupwise Heteroskedasticity

Heteroskedasticity is sometimes a direct consequence of the construction of the data. Consider the following linear regression model with homoscedastic errors

$$y_i = x_i'\beta + u_i$$

with $V[u_i \mid x_i] = V[u_i] = \sigma^2$. Assume that instead of the individual observations $y_i$ and $x_i$ only the mean values $\bar{y}_g$ and $\bar{x}_g$ for $g = 1, \ldots, G$ groups are observed. The error term in the resulting regression model

$$\bar{y}_g = \bar{x}_g'\beta + \bar{u}_g$$

is now conditionally heteroskedastic with $V[\bar{u}_g \mid N_g] = \sigma_g^2 = \sigma^2 / N_g$, where $N_g$ is a random variable with the number of observations in group $g$.

4 Estimation with OLS

The parameter $\beta$ can be estimated with the usual OLS estimator

$$\hat{\beta}_{OLS} = (X'X)^{-1} X'y$$

The OLS estimator of $\beta$ remains unbiased under OLS1, OLS2, OLS3c, OLS4b, and OLS5 in small samples. Additionally assuming OLS3a, it is normally distributed in small samples. It is consistent and approximately normally distributed under OLS1, OLS2, OLS3d, OLS4a or OLS4b, and OLS5 in large samples. However, the OLS estimator is no longer efficient. More importantly, the usual standard errors of the OLS estimator and the tests (t-, F-, z-, Wald-) based on them are no longer valid.

5 Estimating the Variance of the OLS Estimator

The small sample covariance matrix of $\hat{\beta}_{OLS}$ is under OLS4b

$$V[\hat{\beta}_{OLS} \mid X] = (X'X)^{-1} \left[ X' \sigma^2 \Omega X \right] (X'X)^{-1}$$

and differs from usual OLS where $V[\hat{\beta}_{OLS} \mid X] = \sigma^2 (X'X)^{-1}$. Consequently, the usual estimator $\hat{V}[\hat{\beta}_{OLS} \mid X] = \hat{\sigma}^2 (X'X)^{-1}$ is biased. Usual small sample test procedures, such as the F- or t-test, based on the usual estimator are therefore not valid.

The OLS estimator is asymptotically normally distributed under OLS1, OLS2, OLS3d, and OLS5

$$\sqrt{N}(\hat{\beta} - \beta) \overset{d}{\longrightarrow} N\left(0, \; Q_{XX}^{-1} Q_{X\Omega X} Q_{XX}^{-1}\right)$$

where $Q_{X\Omega X} = E[u_i^2 x_i x_i']$ and $Q_{XX} = E[x_i x_i']$.
The OLS estimator is therefore approximately normally distributed in large samples as

$$\hat{\beta} \overset{A}{\sim} N\left(\beta, \; Avar[\hat{\beta}]\right)$$

where $Avar[\hat{\beta}] = N^{-1} Q_{XX}^{-1} Q_{X\Omega X} Q_{XX}^{-1}$ can be consistently estimated as

$$\widehat{Avar}[\hat{\beta}] = (X'X)^{-1} \left[ \sum_{i=1}^{N} \hat{u}_i^2 x_i x_i' \right] (X'X)^{-1}$$

with some additional assumptions on higher order moments of $x_i$ (see White 1980).

This so-called White or Eicker-Huber-White estimator of the covariance matrix is a heteroskedasticity-consistent covariance matrix estimator that does not require any assumptions on the form of heteroscedasticity (though we assumed independence of the error terms in OLS2). Standard errors based on the White estimator are often called robust. We can perform the usual z- and Wald-test for large samples using the White covariance estimator.

Note: t- and F-tests using the White covariance estimator are only asymptotically valid because the White covariance estimator is consistent but not unbiased. It is therefore more appropriate to use large sample tests (z, Wald).

Bootstrapping (see the handout on "The Bootstrap") is an alternative method to estimate a heteroscedasticity-robust covariance matrix.

6 Testing for Heteroskedasticity

There are several tests for the assumption that the error term is homoskedastic. The test of White (1980) is general and does not presume a particular form of heteroskedasticity. Unfortunately, little can be said about its power and it has poor small sample properties unless the number of regressors is very small. If we have prior knowledge that the variance $\sigma_i^2$ is a linear (in parameters) function of explanatory variables, the Breusch-Pagan (1979) test is more powerful. Koenker (1981) proposes a variant of the Breusch-Pagan test that does not assume normally distributed errors.

Note: In practice we often do not test for heteroskedasticity but directly report heteroskedasticity-robust standard errors.
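The robust standard errors mentioned in the note can be illustrated with a short Python sketch. It computes the White covariance estimator from section 5 in its plain form (often labelled HC0, without any small sample correction) next to the usual OLS covariance estimator. The function name `ols_with_white_se`, the use of NumPy, and the simulated data are assumptions made for illustration only; they are not part of the handout.

```python
import numpy as np


def ols_with_white_se(X, y):
    """OLS coefficients with usual and White (HC0) standard errors.

    X is an (N, K+1) array that already contains the constant column,
    y is a length-N array.
    """
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Usual estimator sigma^2 (X'X)^{-1}; biased under heteroskedasticity.
    sigma2 = resid @ resid / (n - k)
    V_usual = sigma2 * XtX_inv

    # White estimator (X'X)^{-1} [sum_i u_i^2 x_i x_i'] (X'X)^{-1}.
    meat = (X * resid[:, None] ** 2).T @ X
    V_white = XtX_inv @ meat @ XtX_inv

    return beta, np.sqrt(np.diag(V_usual)), np.sqrt(np.diag(V_white))


# Illustrative simulated data (assumption): V[u|x] = x^2.
rng = np.random.default_rng(0)
N = 2000
x = rng.uniform(1.0, 5.0, size=N)
X = np.column_stack([np.ones(N), x])
u = rng.normal(size=N) * x
y = 1.0 + 2.0 * x + u

beta, se_usual, se_white = ols_with_white_se(X, y)
print("beta     ", beta)
print("usual se ", se_usual)
print("White se ", se_white)
```

A z-statistic for a single coefficient is then `beta[j] / se_white[j]`, to be compared with standard normal critical values as recommended above.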

7 Estimation with GLS/WLS when Ω is Known

When $\Omega$ is known, $\beta$ is efficiently estimated with generalized least squares (GLS)

$$\hat{\beta}_{GLS} = \left( X' \Omega^{-1} X \right)^{-1} X' \Omega^{-1} y.$$

The GLS estimator simplifies in the case of heteroskedasticity to

$$\hat{\beta}_{WLS} = (\tilde{X}'\tilde{X})^{-1} \tilde{X}'\tilde{y}$$

where

$$\tilde{y} = \begin{pmatrix} y_1 / \sqrt{\omega_1} \\ \vdots \\ y_N / \sqrt{\omega_N} \end{pmatrix}, \qquad
\tilde{X} = \begin{pmatrix} x_1' / \sqrt{\omega_1} \\ \vdots \\ x_N' / \sqrt{\omega_N} \end{pmatrix}$$

and is called the weighted least squares (WLS) estimator. The WLS estimator of $\beta$ can therefore be computed by running OLS with the transformed variables.

Note: the above transformation of the explanatory variables also applies to the constant, i.e. $\tilde{x}_{i0} = 1 / \sqrt{\omega_i}$. The OLS regression using the transformed variables does not include an additional constant.

The WLS estimator minimizes the sum of squared residuals weighted by $1/\omega_i$:

$$S(\alpha, \beta) = \sum_{i=1}^{N} (\tilde{y}_i - \tilde{x}_i'\beta)^2 = \sum_{i=1}^{N} \frac{1}{\omega_i} (y_i - x_i'\beta)^2 \to \min_{\beta}$$

The WLS estimator of $\beta$ is unbiased and efficient (under OLS1, OLS2, OLS3c, OLS4b, and OLS5) and, additionally assuming OLS3a (normality), normally distributed in small samples.

The WLS estimator of $\beta$ is consistent, asymptotically efficient and approximately normally distributed under OLS1, OLS2, OLS3d, OLS4b (conditional heteroscedasticity), and a modification of OLS5:

OLS5: Identifiability

$X' \Omega^{-1} X$ is p.d. and finite
$E[\omega_i^{-1} x_i x_i'] = Q_{X\Omega^{-1}X}$ is p.d. and finite

where $\omega_i = \omega(x_i)$ is a function of $x_i$.

The covariance matrix of $\hat{\beta}_{WLS}$ can be estimated as

$$\widehat{Avar}[\hat{\beta}_{WLS}] = \hat{V}[\hat{\beta}_{WLS}] = \hat{\sigma}^2 (\tilde{X}'\tilde{X})^{-1}$$

where $\hat{\sigma}^2 = \hat{\tilde{u}}'\hat{\tilde{u}} / (N - K - 1)$ in small samples, $\hat{\sigma}^2 = \hat{\tilde{u}}'\hat{\tilde{u}} / N$ in large samples, and $\hat{\tilde{u}} = \tilde{y} - \tilde{X}\hat{\beta}_{GLS}$. Usual tests (t-, F-) for small samples are valid (under OLS1, OLS2, OLS3a, OLS4b and OLS5); usual tests (z, Wald) for large samples are also valid (under OLS1, OLS2, OLS3d, OLS4b and OLS5).

8 Estimation with FGLS/FWLS when Ω is Unknown

In practice, $\Omega$ is typically unknown.
However, we can model the $\omega_i$'s as a function of the data and estimate this relationship. Feasible generalized least squares (FGLS) replaces the $\omega_i$ by their predicted values $\hat{\omega}_i$ and then calculates $\hat{\beta}_{FGLS}$ as if the $\omega_i$ were known.

A useful model for the error variance is

$$\sigma_i^2 = V[u_i \mid z_i] = \exp(z_i'\delta)$$

where $z_i$ are $L + 1$ variables that may belong to $x_i$, including a constant, and $\delta$ is a vector of parameters. We can estimate the auxiliary regression

$$\hat{u}_i^2 = \exp(z_i'\delta) + \nu_i$$

by nonlinear least squares (NLLS), where $\hat{u}_i = y_i - x_i'\hat{\beta}_{OLS}$, or alternatively,

$$\log(\hat{u}_i^2) = z_i'\delta + \nu_i$$

by ordinary least squares (OLS). In both cases, we use the predictions

$$\hat{\omega}_i = \exp(z_i'\hat{\delta})$$

in the calculations for $\hat{\beta}_{FGLS}$ and $\widehat{Avar}[\hat{\beta}_{FGLS}]$.

The FGLS estimator is consistent and approximately normally distributed in large samples under OLS1, OLS2 ($\{x_i, z_i, y_i\}$ i.i.d.), OLS3d, OLS4b, OLS5 and some additional, more technical assumptions. If $\sigma_i^2$ is correctly specified, $\hat{\beta}_{FGLS}$ is asymptotically efficient and the usual tests (z, Wald) for large samples are valid; small sample tests are only asymptotically valid and nothing is gained from using them. If $\sigma_i^2$ is not correctly specified, the usual covariance matrix is inconsistent and the tests (z, Wald) are invalid.
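The FGLS steps described in this section can be sketched in Python as follows. The sketch uses the log-linear auxiliary regression estimated by OLS and then runs WLS on the variables divided by $\sqrt{\hat{\omega}_i}$, including the constant. The function names `ols` and `fgls_exponential`, the use of NumPy, and the simulated data are illustrative assumptions, not part of the handout. The sketch also assumes that no OLS residual is exactly zero, since the logarithm of the squared residual would then be undefined.

```python
import numpy as np


def ols(X, y):
    """OLS coefficients computed by least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def fgls_exponential(X, y, Z):
    """Feasible WLS under the variance model sigma_i^2 = exp(z_i' delta).

    X: (N, K+1) regressors including the constant.
    Z: (N, L+1) variance regressors including a constant.
    """
    # Step 1: residuals from the first-stage OLS regression.
    u_hat = y - X @ ols(X, y)

    # Step 2: auxiliary regression log(u_hat^2) = z' delta + nu by OLS.
    delta_hat = ols(Z, np.log(u_hat ** 2))
    omega_hat = np.exp(Z @ delta_hat)

    # Step 3: WLS as OLS on transformed variables; the constant column
    # of X is divided by sqrt(omega_hat) as well.
    w = 1.0 / np.sqrt(omega_hat)
    X_t = X * w[:, None]
    y_t = y * w
    beta_fgls = ols(X_t, y_t)

    # Covariance sigma^2 (X~'X~)^{-1} with the large sample sigma^2 = u~'u~ / N.
    u_t = y_t - X_t @ beta_fgls
    sigma2 = u_t @ u_t / len(y)
    V = sigma2 * np.linalg.inv(X_t.T @ X_t)
    return beta_fgls, np.sqrt(np.diag(V)), delta_hat


# Illustrative simulated data (assumption): sigma_i^2 = exp(0.5 * x_i).
rng = np.random.default_rng(1)
N = 5000
x = rng.uniform(0.0, 4.0, size=N)
X = np.column_stack([np.ones(N), x])
Z = X.copy()  # here the variance depends on the same variables as the mean
u = rng.normal(size=N) * np.sqrt(np.exp(0.5 * x))
y = 1.0 + 2.0 * x + u

beta_fgls, se_fgls, delta_hat = fgls_exponential(X, y, Z)
print("beta FGLS", beta_fgls)
print("se FGLS  ", se_fgls)
print("delta hat", delta_hat)
```

Multiplying all $\hat{\omega}_i$ by a common factor changes neither $\hat{\beta}_{FGLS}$ nor the estimated covariance matrix, because the factor is absorbed by $\hat{\sigma}^2$. The standard errors returned here are only appropriate if the variance model is correctly specified, as stated above.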
