Lecture 3: Heteroscedasticity
Chapter 4

In many situations, the Gauss-Markov conditions will not be satisfied. These are:
$$E[\epsilon_i] = 0, \quad i = 1,\dots,n, \qquad \epsilon \perp X, \qquad \mathrm{Var}(\epsilon) = \sigma^2 I_n.$$
We consider the model $Y = X\beta + \epsilon$. Suppose that, conditioned on $X$, $\epsilon$ has covariance matrix
$$\mathrm{Var}(\epsilon \mid X) = \sigma^2 \Psi$$
where $\Psi$ depends on $X$. Recall that the OLS estimator $\hat\beta_{OLS}$ of $\beta$ is:
$$\hat\beta_{OLS} = (X^t X)^{-1} X^t Y = \beta + (X^t X)^{-1} X^t \epsilon.$$
Therefore, conditioned on $X$, the covariance matrix of $\hat\beta_{OLS}$ is:
$$\mathrm{Var}(\hat\beta_{OLS} \mid X) = \sigma^2 (X^t X)^{-1} X^t \Psi X (X^t X)^{-1}.$$
If $\Psi \neq I$, then the proof of the Gauss-Markov theorem, that the OLS estimator is BLUE, breaks down; the OLS estimator is unbiased, but no longer best in the least squares sense.

4.1 Estimation when $\Psi$ is known

If $\Psi$ is known, let $P$ denote a non-singular $n \times n$ matrix such that $P^t P = \Psi^{-1}$. Such a matrix exists, and $P \Psi P^t = I$. Consequently, $E[P\epsilon \mid X] = 0$ and $\mathrm{Var}(P\epsilon \mid X) = \sigma^2 I$, which does not depend on $X$. It follows that the Gauss-Markov conditions are satisfied for $P\epsilon$. The entire model may therefore be transformed:
$$PY = PX\beta + P\epsilon,$$
which may be written as:
$$Y^* = X^*\beta + \epsilon^*.$$
This model satisfies the Gauss-Markov conditions and the resulting estimator, which is BLUE, is:
$$\hat\beta_{GLS} = (X^{*t} X^*)^{-1} X^{*t} Y^* = (X^t \Psi^{-1} X)^{-1} X^t \Psi^{-1} Y.$$
Here GLS stands for Generalised Least Squares. It has covariance:
$$\mathrm{Var}(\hat\beta_{GLS}) = \sigma^2 (X^{*t} X^*)^{-1} = \sigma^2 (X^t \Psi^{-1} X)^{-1}.$$
If $k = \mathrm{rank}(X)$, that is, $X$ is $n \times k$ where $n > k$ (so $k = p + 1$ where $p$ is the number of regressor variables), the variance estimator is:
$$\hat\sigma^2 = \frac{1}{n-k}(Y^* - X^*\hat\beta_{GLS})^t (Y^* - X^*\hat\beta_{GLS}) = \frac{1}{n-k}(Y - X\hat\beta_{GLS})^t \Psi^{-1} (Y - X\hat\beta_{GLS}).$$

4.2 Heteroskedasticity

Heteroskedasticity refers to the special case where $\Psi$ is diagonal but its elements are not all equal. The errors are mutually uncorrelated, but the variance may vary between observations. This is frequently encountered in cross-sectional models. For example, suppose $y_i$ denotes expenditure on food, while $x_i$ is disposable income. Higher income corresponds to higher expenditure on food, and the variation of food expenditure increases as income increases. A suitable model may be:
$$\mathrm{Var}(\epsilon_i \mid x_i) = \sigma^2 \exp\{\alpha_1 x_i\}.$$
Set
$$h_i^2 = \frac{1}{\sigma^2}\,\mathrm{Var}(\epsilon_i \mid X_{i.})$$
where $X_{i.}$ denotes the row-vector of values of the regressor variables for observation $i$. We assume that, for each $i$, $\mathrm{Var}(\epsilon_i \mid X_{i.}) = \mathrm{Var}(\epsilon_i \mid X)$. Under this assumption, $\Psi = \mathrm{diag}(h_1^2,\dots,h_n^2)$. We also replace the Gauss-Markov assumption of zero expectation with $E[\epsilon_i \mid X] = 0$. The BLUE estimator is then:
$$\hat\beta_{GLS} = \left(\sum_{i=1}^n \frac{1}{h_i^2} X_{i.}^t X_{i.}\right)^{-1} \sum_{i=1}^n \frac{1}{h_i^2} X_{i.}^t Y_i.$$
This is a Generalised Least Squares (GLS) estimator and is sometimes referred to as weighted least squares. Furthermore, the covariance structure is:
$$\mathrm{Var}(\hat\beta_{GLS}) = \sigma^2 \left(\sum_{i=1}^n \frac{1}{h_i^2} X_{i.}^t X_{i.}\right)^{-1}$$
and the unbiased estimate of $\sigma^2$ is:
$$\hat\sigma^2 = \frac{1}{n-k} \sum_{i=1}^n \frac{1}{h_i^2}\,(Y_i - X_{i.}\hat\beta_{GLS})^2.$$
If in addition we make assumptions of normality, then, as before, an $F$-test can be used to test a number of linear restrictions. Let $R$ be a $j \times k$ matrix; we are testing $j$ linear restrictions on the parameters. Consider $H_0 : R\beta = q$ versus the alternative $H_1 : R\beta \neq q$. For example, we could test (simultaneously) $\beta_1 + \beta_2 + \beta_4 = 1$ and $\beta_5 = 0$; this represents two restrictions ($j = 2$). Let
$$\xi = (R\hat\beta - q)^t \left(R\,\widehat{\mathrm{Var}}(\hat\beta)\,R^t\right)^{-1} (R\hat\beta - q).$$
Under $H_0$, this statistic has an asymptotic $\chi^2$ distribution with $j$ degrees of freedom. This test is usually referred to as the Wald test.
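The transformation argument of Section 4.1 is easy to check numerically. Below is a minimal sketch in Python (NumPy) on simulated data; all names ($n$, $k$, `beta_true`, `h2`) are illustrative and not from the notes. It computes $\hat\beta_{GLS}$ both via the transformed model $PY = PX\beta + P\epsilon$ and via the closed form $(X^t\Psi^{-1}X)^{-1}X^t\Psi^{-1}Y$, and forms the variance estimate $\hat\sigma^2$.

```python
# A minimal sketch of GLS when Psi is known, using simulated data.
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([1.0, 2.0, -0.5])

# Heteroskedastic errors: Psi is diagonal with known entries h_i^2.
h2 = np.exp(X[:, 1])                     # known variance weights, Psi = diag(h2)
eps = rng.normal(size=n) * np.sqrt(h2)
Y = X @ beta_true + eps

# Route 1: transform by P = diag(1/h_i), so that P'P = Psi^{-1},
# then run OLS on (PY, PX); this is the GLS estimator.
P = np.diag(1.0 / np.sqrt(h2))
Xs, Ys = P @ X, P @ Y
beta_gls = np.linalg.solve(Xs.T @ Xs, Xs.T @ Ys)

# Route 2: the closed form (X' Psi^{-1} X)^{-1} X' Psi^{-1} Y agrees.
Psi_inv = np.diag(1.0 / h2)
beta_gls2 = np.linalg.solve(X.T @ Psi_inv @ X, X.T @ Psi_inv @ Y)
assert np.allclose(beta_gls, beta_gls2)

# Unbiased variance estimate: RSS of the transformed model over n - k.
resid = Ys - Xs @ beta_gls
sigma2_hat = resid @ resid / (n - k)
print(beta_gls, sigma2_hat)
```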
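Continuing the same simulated example, here is a sketch of the Wald statistic $\xi$ for $j = 2$ restrictions; the matrix $R$ and vector $q$ below are invented for illustration (they happen to hold for `beta_true`).

```python
# A minimal sketch of the Wald test: xi = (Rb - q)'(R Var(b) R')^{-1}(Rb - q),
# reusing X, Psi_inv, sigma2_hat and beta_gls from the previous sketch.
from scipy import stats

Vb = sigma2_hat * np.linalg.inv(X.T @ Psi_inv @ X)   # Var(beta_GLS)

# Two illustrative restrictions (j = 2): beta_1 + beta_2 = 1.5 and beta_0 = 1.
R = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 0.0]])
q = np.array([1.5, 1.0])

diff = R @ beta_gls - q
xi = diff @ np.linalg.solve(R @ Vb @ R.T, diff)
p_value = stats.chi2.sf(xi, df=R.shape[0])           # asymptotic chi^2_j under H0
print(xi, p_value)
```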
4.3 Heteroskedasticity: Unknown Variances

Following from the formula for the covariance matrix of $\hat\beta$, a consistent estimator of the $k \times k$ matrix
$$\Sigma = \frac{1}{n} X^t\,\mathrm{diag}(\sigma_i^2)\,X = \frac{1}{n}\sum_{i=1}^n \sigma_i^2\, X_{i.}^t X_{i.}$$
is needed. It turns out that (proof omitted), under very general conditions,
$$S \equiv \frac{1}{n}\sum_{i=1}^n R_i^2\, X_{i.}^t X_{i.}$$
is a consistent estimator for $\Sigma$, where $R_i$ is the OLS residual. Therefore:
$$\widehat{\mathrm{Var}}(\hat\beta_{OLS}) = (X^t X)^{-1}\left(\sum_{i=1}^n R_i^2\, X_{i.}^t X_{i.}\right)(X^t X)^{-1}$$
can be used as an estimate of the true variance of the OLS estimator. Hence inference can be made about $\hat\beta_{OLS}$ without specifying the type of heteroskedasticity. The square roots of the diagonal elements of $\widehat{\mathrm{Var}}(\hat\beta_{OLS})$ are usually referred to as heteroskedasticity-consistent standard errors.

4.3.1 Multiplicative Heteroskedasticity

A common form of heteroskedasticity employed in practice is multiplicative heteroskedasticity. It is assumed that the error variance is related to a number of exogenous variables, gathered in a $j$-vector $z_i$ for observation $i$:
$$\mathrm{Var}(\epsilon_i \mid X_{i.}) = \sigma_i^2 = \sigma^2 \exp\left\{\sum_{k=1}^j \alpha_k z_{ik}\right\}.$$
To compute the EGLS (Estimated Generalised Least Squares) estimator, we need consistent estimators of $\alpha_1,\dots,\alpha_j$. We assume that
$$\log R_i^2 = \text{const} + \sum_{k=1}^j z_{ik}\alpha_k + v_i$$
where $v_i$ is, asymptotically, homoskedastic. One first obtains $(R_i)_{i=1}^n$, the residuals from OLS. Next, regress $(\log R_i^2)_{i=1}^n$ against $z_{i.}$ and a constant. The resulting estimators $\hat\alpha$ of $\alpha$ are consistent. From these, $\hat h_i^2 = \exp\{z_{i.}\hat\alpha\}$ may be computed. Now run OLS on the transformed model:
$$\frac{y_i}{\hat h_i} = \frac{X_{i.}}{\hat h_i}\,\beta + \frac{\epsilon_i}{\hat h_i}.$$
This yields the EGLS estimator $\hat\beta_{EGLS}$ of $\beta$. The scalar $\sigma^2$ can be estimated by:
$$\hat\sigma^2 = \frac{1}{n-k}\sum_{i=1}^n \frac{(y_i - X_{i.}\hat\beta_{EGLS})^2}{\hat h_i^2}$$
and the estimated covariance matrix of $\hat\beta_{EGLS}$ is given by:
$$\widehat{\mathrm{Var}}(\hat\beta_{EGLS}) = \hat\sigma^2 \left(\sum_{i=1}^n \frac{1}{\hat h_i^2}\, X_{i.}^t X_{i.}\right)^{-1}.$$

4.4 Testing for Heteroskedasticity

There are several tests available for heteroskedasticity.

Two different populations. Suppose the sample variance of group A, based on $n_1$ observations, is $s_1^2$ and the sample variance of group B, based on $n_2$ observations, is $s_2^2$. Suppose that the model is $Y = X_1\beta_1 + \epsilon$ for population 1 and $Y = X_2\beta_2 + \epsilon$ for population 2, where $X_1$ is $n_1 \times k_1$, $X_2$ is $n_2 \times k_2$, $n_1 > k_1$, $n_2 > k_2$, $\mathrm{rank}(X_1) = k_1$ and $\mathrm{rank}(X_2) = k_2$. Then, under the null hypothesis $H_0 : \sigma_1^2 = \sigma_2^2$, where $\sigma_1^2$ and $\sigma_2^2$ are the error variances for the models for populations 1 and 2 respectively,
$$\frac{s_1^2}{s_2^2} \sim F_{n_1 - k_1,\, n_2 - k_2}.$$

Testing for Multiplicative Heteroskedasticity. Following the construction above, in the regression of $\log R_i^2$ against $\text{const} + \alpha_1 z_{i1} + \dots + \alpha_k z_{ik}$, the hypothesis $H_0 : \alpha_1 = \dots = \alpha_k = 0$ can be tested in the same way as for least squares regression.

The Breusch-Pagan Test. The Breusch–Pagan test, developed in 1979 by Trevor Breusch and Adrian Pagan, is used to test for heteroskedasticity in a linear regression model. It was independently suggested, with some extension, by R. Dennis Cook and Sanford Weisberg in 1983. It tests whether the variance of the errors from a regression depends on the values of the regressor variables; in that case, heteroskedasticity is present. Suppose that we estimate the regression model $Y = X\beta + \epsilon$ and obtain from this fitted model the residuals $R$. Ordinary least squares constrains these so that their mean is 0, and so, given the assumption that their variance does not depend on the regressor variables, an estimate of this variance can be obtained from the average of the squared values of the residuals.
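As a sketch of the estimator in Section 4.3 (often called the White or sandwich estimator), reusing the simulated $X$ and $Y$ from the GLS sketch in Section 4.1:

```python
# Heteroskedasticity-consistent (White) covariance:
# Var(b_OLS) = (X'X)^{-1} (sum_i R_i^2 x_i x_i') (X'X)^{-1}.
XtX_inv = np.linalg.inv(X.T @ X)
beta_ols = XtX_inv @ X.T @ Y
R_ols = Y - X @ beta_ols                   # OLS residuals R_i

meat = (X * R_ols[:, None] ** 2).T @ X     # sum_i R_i^2 x_i x_i'
V_hc = XtX_inv @ meat @ XtX_inv
hc_se = np.sqrt(np.diag(V_hc))             # heteroskedasticity-consistent std errors
print(hc_se)
```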
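A sketch of the EGLS procedure of Section 4.3.1, continuing the same simulation; taking $z_i$ to be the second column of $X$ is an illustrative choice (it matches how the simulated variances were generated above).

```python
# EGLS under multiplicative heteroskedasticity,
# Var(eps_i) = sigma^2 exp(alpha' z_i), continuing X, Y, R_ols, n, k from above.
Z = np.column_stack([np.ones(n), X[:, 1]])       # constant plus z_i

# Step 1: auxiliary regression of log R_i^2 on (1, z_i) for consistent alpha-hat.
log_R2 = np.log(R_ols ** 2)
alpha_hat = np.linalg.solve(Z.T @ Z, Z.T @ log_R2)
h2_hat = np.exp(Z[:, 1] * alpha_hat[1])          # exp(z_i alpha-hat); the constant is absorbed into sigma^2

# Step 2: OLS on the transformed model y_i/h_i = (X_i./h_i) beta + eps_i/h_i.
w = 1.0 / np.sqrt(h2_hat)
Xw, Yw = X * w[:, None], Y * w
beta_egls = np.linalg.solve(Xw.T @ Xw, Xw.T @ Yw)

# Step 3: sigma^2 and covariance estimates as in the notes.
res_w = Yw - Xw @ beta_egls
sigma2_egls = res_w @ res_w / (n - k)
V_egls = sigma2_egls * np.linalg.inv(Xw.T @ Xw)
print(beta_egls, sigma2_egls)
```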
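Finally, a sketch of the two-population variance-ratio test; the two groups, designs and error scales below are simulated purely for illustration.

```python
# Variance-ratio test: s1^2 / s2^2 ~ F_{n1-k1, n2-k2} under H0: sigma1^2 = sigma2^2.
from scipy import stats

def group_s2(Xg, Yg):
    """Residual variance estimate, RSS / (n - k), from an OLS fit."""
    b = np.linalg.lstsq(Xg, Yg, rcond=None)[0]
    r = Yg - Xg @ b
    return r @ r / (Xg.shape[0] - Xg.shape[1])

n1, n2 = 60, 40
X1 = np.column_stack([np.ones(n1), rng.normal(size=n1)])
X2 = np.column_stack([np.ones(n2), rng.normal(size=n2)])
Y1 = X1 @ np.array([1.0, 2.0]) + rng.normal(size=n1)
Y2 = X2 @ np.array([1.0, 2.0]) + 2.0 * rng.normal(size=n2)  # larger error variance

F = group_s2(X1, Y1) / group_s2(X2, Y2)
# Two-sided p-value for the F ratio with (n1 - k1, n2 - k2) degrees of freedom.
p = 2 * min(stats.f.cdf(F, n1 - 2, n2 - 2), stats.f.sf(F, n1 - 2, n2 - 2))
print(F, p)
```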
If this assumption is not held to be true, a simple model might be that the variance is linearly related to the independent variables. Such a model can be examined by regressing the squared residuals on the independent variables, using an auxiliary regression equation of the form
$$R^2 = Z\gamma + v$$
where $Z$ has rows $z_{i.} = (1, z_{i1},\dots,z_{ik})$ and $\gamma = (\gamma_0,\gamma_1,\dots,\gamma_k)^t$. This is the basis of the Breusch–Pagan test. It is a chi-squared test: under the null hypothesis of homoskedasticity, the test statistic $n\Xi$ (see Step 3 below) is asymptotically $\chi^2$-distributed with $k$ degrees of freedom. If the test statistic has a $p$-value below an appropriate threshold (e.g. $p < 0.05$) then the null hypothesis of homoskedasticity is rejected and heteroskedasticity is assumed.

Procedure. The Breusch–Pagan test is based on models of the type $\sigma_i^2 = h(z_{i.}\gamma)$, where $z_{i.} = (1, z_{i1},\dots,z_{ik})$ are the explanatory variables for observation $i$. The null hypothesis is equivalent to the restriction $H_0 : \gamma_1 = \dots = \gamma_k = 0$. The steps are as follows (a worked sketch on the journal-pricing data appears in the example below):

• Step 1: Apply OLS to the model $Y = X\beta + \epsilon$.
• Step 2: Perform the auxiliary regression:
$$R_i^2 = \gamma_0 + \gamma_1 z_{i1} + \dots + \gamma_k z_{ik} + \eta_i.$$
• Step 3: Compute the coefficient of determination $\Xi = 1 - \frac{Q_{\text{res}}}{Q_{\text{total}}}$, where $Q_{\text{res}}$ and $Q_{\text{total}}$ are the residual and total sums of squares of the auxiliary regression. Under $H_0$, asymptotically, $n\Xi \sim \chi_k^2$.

4.5 Example

This section gives an example where the following model is appropriate:
$$E[\epsilon_i^2 \mid x_i, z_i] = g(z_i^t \alpha)$$
where $z_i$ is an $l$-vector of observations on exogenous or predetermined variables which affect the variance and $\alpha$ is an $l$-vector of parameters. Consider journal prices (a data set found in AER).
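The notes break off here, so what follows is a hedged sketch of how the journal-pricing data might be loaded and fitted in Python. It assumes the Journals data set from the R package AER is reachable through statsmodels' Rdatasets interface (which requires internet access), and the model, log subscriptions on log price per citation, follows the standard textbook treatment of these data rather than anything stated in the notes.

```python
# A hedged sketch: load the AER Journals data and fit OLS with
# heteroskedasticity-consistent (HC0) standard errors.
import numpy as np
import statsmodels.api as sm

journals = sm.datasets.get_rdataset("Journals", "AER").data
y = np.log(journals["subs"])                                  # log library subscriptions
x = sm.add_constant(np.log(journals["price"] / journals["citations"]))  # log price per citation
fit = sm.OLS(y, x).fit(cov_type="HC0")                        # robust standard errors
print(fit.summary())
```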
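Continuing with this fit, the Breusch–Pagan steps from Section 4.4 can be carried out by hand and cross-checked against statsmodels' implementation; using the same regressor as $z_i$ is an illustrative choice.

```python
# Breusch-Pagan by hand (Steps 1-3), then via statsmodels, on the fit above.
from scipy import stats
from statsmodels.stats.diagnostic import het_breuschpagan

resid = fit.resid                          # Step 1: OLS residuals R_i
aux = sm.OLS(resid ** 2, x).fit()          # Step 2: regress R_i^2 on (1, z_i)
Xi = aux.rsquared                          # Step 3: coefficient of determination
stat = len(resid) * Xi                     # n * Xi ~ chi^2_k under H0 (here k = 1)
print(stat, stats.chi2.sf(stat, df=1))

# Cross-check against the library implementation (LM statistic and p-value first).
lm, lm_p, _, _ = het_breuschpagan(resid, x)
print(lm, lm_p)
```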