Heteroskedasticity in Regression

Paul E. Johnson
Department of Political Science
Center for Research Methods and Data Analysis, University of Kansas
2020

Outline

1 Introduction
2 Fix #1: Robust Standard Errors
3 Weighted Least Squares
    Combine Subsets of a Sample
    Random coefficient model
    Aggregate Data
4 Testing for heteroskedasticity
    Categorical Heteroskedasticity
    Checking for Continuous Heteroskedasticity
    Toward a General Test for Heteroskedasticity
5 Appendix: Robust Variance Estimator Derivation
6 Practice Problems

Introduction

Remember the Basic OLS Theory behind the Linear Model

\[ y_i = \beta_0 + \beta_1 x_{1i} + e_i \]

About the error term, we assumed, for all i:

E(e_i) = 0 (errors are "symmetric" above and below)
Var(e_i) = E[(e_i - E(e_i))^2] = \sigma^2 (homoskedasticity: same variance)

Heteroskedasticity: the assumption of homogeneous variance is violated.

Homoskedasticity means

\[ Var(e) = \begin{bmatrix} \sigma_e^2 & 0 & 0 & \cdots & 0 \\ 0 & \sigma_e^2 & 0 & \cdots & 0 \\ 0 & 0 & \sigma_e^2 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \sigma_e^2 \end{bmatrix} \]

Heteroskedasticity depicted in one of these ways

\[ Var(e) = \begin{bmatrix} \sigma_e^2 w_1 & & & 0 \\ & \sigma_e^2 w_2 & & \\ & & \ddots & \\ 0 & & & \sigma_e^2 w_N \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} \frac{\sigma_e^2}{w_1} & & & 0 \\ & \frac{\sigma_e^2}{w_2} & & \\ & & \ddots & \\ 0 & & & \frac{\sigma_e^2}{w_N} \end{bmatrix} \]

I get confused a lot when comparing textbooks because of this problem!

Consequences of Heteroskedasticity 1: \hat{\beta}^{OLS} still unbiased, consistent

OLS estimates of \beta_0 and \beta_1 are still unbiased and consistent.
Unbiased: E[\hat{\beta}^{OLS}] = \beta.
Consistent: as N \to \infty, \hat{\beta}^{OLS} tends to \beta in probability limit.
If the predictive line was "right" before, it is still right now.
However, these are incorrect: the standard error of \hat{\beta}, the RMSE, and the confidence / prediction intervals.

Proof: \hat{\beta}^{OLS} Still Unbiased

Begin with "mean-centered" data. OLS with one variable:

\[ \hat{\beta}_1 = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{\sum x_i (\beta_1 x_i + e_i)}{\sum x_i^2} = \frac{\beta_1 \sum x_i^2 + \sum x_i e_i}{\sum x_i^2} = \beta_1 + \frac{\sum x_i e_i}{\sum x_i^2} \]

Apply the expected value operator to both sides:

\[ E[\hat{\beta}_1] = \beta_1 + E\left(\frac{\sum x_i e_i}{\sum x_i^2}\right) = \beta_1 + \frac{\sum E[x_i e_i]}{\sum x_i^2} \]

Assume x_i is uncorrelated with e_i, so that E[x_i e_i] = 0, and the work is done:

\[ E(\hat{\beta}_1) = \beta_1 \]

Consequence 2: The OLS formula for \widehat{Var}(\hat{\beta}) is wrong

1 The usual formula to estimate Var(\hat{\beta}), namely \widehat{Var}(\hat{\beta}), is wrong. And its square root, std.err.(\hat{\beta}), is wrong.
2 Thus t-tests are WRONG (too big).

Proof: OLS \widehat{Var}(\hat{\beta}) Wrong

The variance of the OLS slope estimator, Var(\hat{\beta}_1), in "mean-centered (or deviations) form":

\[ Var(\hat{\beta}_1) = Var\left(\frac{\sum x_i e_i}{\sum x_i^2}\right) = \frac{Var\left[\sum x_i e_i\right]}{(\sum x_i^2)^2} = \frac{\sum Var(x_i e_i)}{(\sum x_i^2)^2} = \frac{\sum x_i^2 \, Var(e_i)}{(\sum x_i^2)^2} \tag{1} \]

We assume all Var(e_i) are equal, and we put in the MSE as an estimate of it:

\[ \widehat{Var}(\hat{\beta}_1) = \frac{MSE}{\sum x_i^2} \tag{2} \]

Proof: OLS \widehat{Var}(\hat{\beta}) Wrong (page 2)

Instead, suppose the "true" variance is

\[ Var(e_i) = s^2 + s_i^2 \tag{3} \]

(a common minimum variance s^2 plus an additional individualized variance s_i^2). Plug this into (1):

\[ \frac{\sum x_i^2 (s^2 + s_i^2)}{(\sum x_i^2)^2} = \frac{s^2}{\sum x_i^2} + \frac{\sum x_i^2 s_i^2}{(\sum x_i^2)^2} \tag{4} \]

The first term is "roughly" what OLS would calculate for the variance of \hat{\beta}_1. The second term is the additional "true variance" in \hat{\beta}_1 that the OLS formula \widehat{Var}(\hat{\beta}_1) does not include.
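Before moving on, here is a minimal Monte Carlo sketch of consequences 1 and 2 (my own illustration, not from the slides, assuming Python with numpy; the error design echoes equations (7)-(9) later in the deck). It redraws heteroskedastic errors many times, refits the mean-centered OLS slope, and compares the true sampling variance of the slope with the average of the usual estimate (2):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 100, 5000
beta1 = 0.25
x = rng.uniform(20, 80, size=n)      # fixed design across replications
xc = x - x.mean()                    # "mean-centered" data, as in the proofs
sd = 0.05 * (x - x.min()) * 10       # heteroskedastic SDs that grow with x

slopes = np.empty(reps)
var_hats = np.empty(reps)
for r in range(reps):
    y = 3 + beta1 * x + rng.normal(0.0, sd)
    yc = y - y.mean()
    b1 = np.sum(xc * yc) / np.sum(xc**2)        # centered OLS slope
    mse = np.sum((yc - b1 * xc)**2) / (n - 2)
    slopes[r] = b1
    var_hats[r] = mse / np.sum(xc**2)           # the usual formula, eq (2)

print("mean slope estimate:", slopes.mean())    # stays near 0.25: unbiased
print("true sampling variance:", slopes.var())  # in this design, larger than ...
print("avg OLS-estimated variance:", var_hats.mean())  # ... the usual answer
```

In this particular design the individualized variances are largest where x_i^2 is also large, so the second term of (4) is substantial and the usual formula understates the truth, which is what makes the t-ratios too big.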
Consequence 3: \hat{\beta}^{OLS} is Inefficient

1 \hat{\beta}^{OLS} is inefficient: it has higher variance than the "weighted" estimator.
2 Note that to prove an estimator is "inefficient", it is necessary to provide an alternative estimator that has lower variance.
3 WLS: the Weighted Least Squares estimator, \hat{\beta}_1^{WLS}.
4 The sum of squares to be minimized now includes a weight for each case:

\[ SS(\hat{\beta}_0, \hat{\beta}_1) = \sum_{i=1}^{N} W_i (y_i - \hat{y}_i)^2 \tag{5} \]

5 The weights are chosen to "undo" the heteroskedasticity:

\[ W_i = 1 / Var(e_i) \tag{6} \]

Covariance matrix of error terms

This thing is a weighting matrix:

\[ \begin{bmatrix} \frac{1}{Var(e_1)} & & & 0 \\ & \frac{1}{Var(e_2)} & & \\ & & \ddots & \\ 0 & & & \frac{1}{Var(e_N)} \end{bmatrix} \]

It is usually simplified in various ways. Factor out a common parameter, so each individual's error variance is proportional:

\[ \frac{1}{\sigma_e^2} \begin{bmatrix} \frac{1}{w_1} & & & 0 \\ & \frac{1}{w_2} & & \\ & & \ddots & \\ 0 & & & \frac{1}{w_N} \end{bmatrix} \]

Example. Suppose the variance is proportional to x_i

The "truth" is

\[ y_i = 3 + 0.25 x_i + e_i \tag{7} \]

Homoskedastic:

\[ Std.Dev.(e_i) = \sigma_e = 10 \tag{8} \]

Heteroskedastic:

\[ Std.Dev.(e_i) = 0.05 (x_i - \min(x_i)) \sigma_e \tag{9} \]

[Figure: scatterplot of y against x under the heteroskedastic error structure]

Compare Lines of 1000 Fits (Homo vs Heteroskedastic)

[Figure: two panels of 1000 fitted lines over x, "Heteroskedastic Errors" and "Homoskedastic Errors"]

Histograms of Slope Estimates (w/ Kernel Density Lines)

[Figure: histograms with kernel density lines of the slope estimates, "Heteroskedastic Data" and "Homoskedastic Data"]

So, the Big Message Is

Heteroskedasticity inflates the amount of uncertainty in the estimates. It distorts t-ratios.

Fix #1: Robust Standard Errors

Robust estimate of the variance of \hat{\beta}

Replace the OLS formula for \widehat{Var}(\hat{\beta}) with a more "robust" version.
Robust "heteroskedasticity consistent" variance estimators rest on weaker assumptions.
They have no known small sample properties, but they are consistent / asymptotically valid.
Note: this does not "fix" \hat{\beta}^{OLS}. It just gives us more accurate t-ratios by correcting std.err.(\hat{\beta}).

Robust Std.Err. in a Nutshell

Recall the variance-covariance matrix of the errors assumed by OLS:

\[ Var(e) = E(e \cdot e'|X) = \begin{bmatrix} \sigma_e^2 & 0 & \cdots & 0 \\ 0 & \sigma_e^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_e^2 \end{bmatrix} \tag{10} \]

Heteroskedastic Covariance Matrix

If there is heteroskedasticity, we have to allow possibilities like this:

\[ Var(e) = E[e \cdot e'|X] = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_N^2 \end{bmatrix} \]

Robust Std.Err.: Use Variance Estimates

Fill in estimates for the case-specific error variances:

\[ \widehat{Var}(e) = \begin{bmatrix} \hat{\sigma}_1^2 & 0 & \cdots & 0 \\ 0 & \hat{\sigma}_2^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \hat{\sigma}_N^2 \end{bmatrix} \]

Embed those estimates into the larger formula that is used to calculate the robust standard errors.

Famous paper: White, Halbert. (1980). "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity." Econometrica, 48(4), 817-838. An estimator of this type was originally proposed by Huber (1967), but was forgotten.
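As a practical aside (not on the slides), most regression software can produce these robust standard errors directly. A minimal sketch with statsmodels, whose OLS fit method accepts a cov_type argument such as "HC0" for White's estimator; the simulated data here are hypothetical:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(20, 80, size=n)
e = rng.normal(0, 0.05 * (x - x.min()) * 10)   # heteroskedastic errors, as in eq (9)
y = 3 + 0.25 * x + e

X = sm.add_constant(x)
classical = sm.OLS(y, X).fit()                 # usual (homoskedastic) standard errors
robust = sm.OLS(y, X).fit(cov_type="HC0")      # White heteroskedasticity-consistent SEs

print("classical SE of slope:", classical.bse[1])
print("robust SE of slope:   ", robust.bse[1])
```

The point estimates are identical in the two fits; only the standard errors, and hence the t-ratios, change.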
Derivation: Calculating the Robust Estimator of Var(\hat{\beta})

The true variance of the OLS estimator is

\[ Var(\hat{\beta}_1) = \frac{\sum x_i^2 \, Var(e_i)}{(\sum x_i^2)^2} \tag{11} \]

Assuming homoskedasticity, estimate \sigma_e^2 with the MSE:

\[ \widehat{Var}(\hat{\beta}_1) = \frac{MSE}{\sum x_i^2}, \quad \text{and the square root of that is } std.err.(\hat{\beta}_1) \tag{12} \]

The robust versions replace Var(e_i) with other estimates. White's suggestion was

\[ \text{Robust } \widehat{Var}(\hat{\beta}_1) = \frac{\sum x_i^2 \, \hat{e}_i^2}{(\sum x_i^2)^2} \tag{13} \]

where \hat{e}_i^2, the "squared residual", is used in place of the unknown error variance.

Weighted Least Squares

WLS is Efficient

Change OLS:

\[ SS(\hat{\beta}) = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 \]

to WLS:

\[ \text{minimize } SS(\hat{\beta}) = \sum_{i=1}^{N} W_i (y_i - \hat{y}_i)^2 \]

In practice, the weights are guesses about the variance of the error term, consistent with (6):

\[ W_i = \frac{1}{\sigma_i^2} \tag{14} \]

Feasible Weighted Least Squares

Analysis proceeds in 2 steps:

1 A regression is estimated to gather information about the error variance.
2 That information is used to fill in the weight matrix for WLS.

One may revise the weights and re-fit the WLS repeatedly until convergence. (A code sketch of this two-step procedure appears at the end of this section.)

Combine Subsets of a Sample

Data From Categorical Groups

Suppose you separately investigate data for men and women:

\[ \text{men: } y_i = \beta_0 + \beta_1 x_i + e_i \tag{15} \]

\[ \text{women: } y_i = c_0 + c_1 x_i + u_i \tag{16} \]

Then you wonder, "can I combine the data for men and women to estimate one model?"

\[ \text{humans: } y_i = \beta_0 + \beta_1 x_i + \beta_2 sex_i + \beta_3 sex_i x_i + e_i \tag{17} \]

This "manages" the differences of intercept and slope for men and women by adding the coefficients \beta_2 and \beta_3.
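As a hedged illustration (not from the slides), the pooled model (17) can be fit with the statsmodels formula interface, where y ~ x * sex expands to x + sex + x:sex; the tiny data frame and its values are hypothetical:

```python
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical data with a 0/1 sex indicator
df = pd.DataFrame({
    "y":   [10.2, 11.5, 9.8, 14.1, 13.0, 15.2],
    "x":   [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "sex": [0, 0, 0, 1, 1, 1],
})

# y ~ x * sex expands to x + sex + x:sex, i.e. eq (17)
pooled = smf.ols("y ~ x * sex", data=df).fit()
print(pooled.params)   # Intercept, x (beta1), sex (beta2), x:sex (beta3)
```

If the two groups have different error variances, this pooled regression is exactly the categorical heteroskedasticity situation that motivates weighting.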
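Finally, the promised sketch of the two-step feasible WLS procedure. The variance model in step 1 (regressing the log squared residuals on the predictors) is one common choice and an assumption on my part, not something prescribed by the slides; statsmodels' WLS takes weights W_i = 1/\hat{\sigma}_i^2, as in (6):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 200
x = rng.uniform(20, 80, size=n)
y = 3 + 0.25 * x + rng.normal(0, 0.05 * (x - x.min()) * 10)
X = sm.add_constant(x)

# Step 1: OLS to gather information about the error variance
ols = sm.OLS(y, X).fit()

# Assumed variance model: log squared residuals regressed on X
aux = sm.OLS(np.log(ols.resid**2 + 1e-8), X).fit()
sigma2_hat = np.exp(aux.fittedvalues)            # fitted sigma_i^2 for each case

# Step 2: WLS with weights W_i = 1 / sigma_i^2
wls = sm.WLS(y, X, weights=1.0 / sigma2_hat).fit()

print("OLS slope, SE:", ols.params[1], ols.bse[1])
print("WLS slope, SE:", wls.params[1], wls.bse[1])
```

Iterating the two steps (recompute residuals from the WLS fit, re-estimate the variance model, re-fit) implements the "revise until convergence" idea described above.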