Robust and Clustered Standard Errors
Total Page:16
File Type:pdf, Size:1020Kb
Robust and Clustered Standard Errors Molly Roberts March 6, 2013 Molly Roberts Robust and Clustered Standard Errors March 6, 2013 1 / 35 An Introduction to Robust and Clustered Standard Errors Outline 1 An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance GLM’s and Non-constant Variance Cluster-Robust Standard Errors 2 Replicating in R Molly Roberts Robust and Clustered Standard Errors March 6, 2013 2 / 35 An Introduction to Robust and Clustered Standard Errors Outline 1 An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance GLM’s and Non-constant Variance Cluster-Robust Standard Errors 2 Replicating in R Molly Roberts Robust and Clustered Standard Errors March 6, 2013 3 / 35 Errors are the vertical distances between observations and the unknown Conditional Expectation Function. Therefore, they are unknown. Residuals are the vertical distances between observations and the estimated regression function. Therefore, they are known. An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Review: Errors and Residuals Molly Roberts Robust and Clustered Standard Errors March 6, 2013 4 / 35 Residuals are the vertical distances between observations and the estimated regression function. Therefore, they are known. An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Review: Errors and Residuals Errors are the vertical distances between observations and the unknown Conditional Expectation Function. Therefore, they are unknown. Molly Roberts Robust and Clustered Standard Errors March 6, 2013 4 / 35 An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Review: Errors and Residuals Errors are the vertical distances between observations and the unknown Conditional Expectation Function. Therefore, they are unknown. Residuals are the vertical distances between observations and the estimated regression function. Therefore, they are known. Molly Roberts Robust and Clustered Standard Errors March 6, 2013 4 / 35 y = Xβ + u u = y − Xβ Residuals represent the difference between the outcome and the estimated mean. y = Xβ^ + u^ u^ = y − Xβ^ An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Notation Errors represent the difference between the outcome and the true mean. Molly Roberts Robust and Clustered Standard Errors March 6, 2013 5 / 35 An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Notation Errors represent the difference between the outcome and the true mean. y = Xβ + u u = y − Xβ Residuals represent the difference between the outcome and the estimated mean. y = Xβ^ + u^ u^ = y − Xβ^ Molly Roberts Robust and Clustered Standard Errors March 6, 2013 5 / 35 Residuals represent the difference between the outcome and the estimated mean. y = Xβ^ + u^ u^ = y − Xβ^ An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Notation Errors represent the difference between the outcome and the true mean. y = Xβ + u u = y − Xβ Molly Roberts Robust and Clustered Standard Errors March 6, 2013 5 / 35 y = Xβ^ + u^ u^ = y − Xβ^ An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Notation Errors represent the difference between the outcome and the true mean. y = Xβ + u u = y − Xβ Residuals represent the difference between the outcome and the estimated mean. Molly Roberts Robust and Clustered Standard Errors March 6, 2013 5 / 35 An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Notation Errors represent the difference between the outcome and the true mean. y = Xβ + u u = y − Xβ Residuals represent the difference between the outcome and the estimated mean. y = Xβ^ + u^ u^ = y − Xβ^ Molly Roberts Robust and Clustered Standard Errors March 6, 2013 5 / 35 An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Variance of β^ depends on the errors − β^ = X0X 1 X0y − = X0X 1 X0(Xβ + u) − = β + X0X 1 X0u Molly Roberts Robust and Clustered Standard Errors March 6, 2013 6 / 35 An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Variance of β^ depends on the errors − V [β^] = V [β] + V [X0X 1 X0u] − = 0 + V [X0X 1 X0u] − − − − = E[X0X 1 X0uu0X X0X 1] − E[X0X 1 X0u]E[X0X 1 X0u]0 − − = E[X0X 1 X0uu0X X0X 1] − 0 Molly Roberts Robust and Clustered Standard Errors March 6, 2013 7 / 35 An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Variance of β^ depends on the errors (continued) − V [β^] = V [β] + V [X0X 1 X0u] − = 0 + V [X0X 1 X0u] − − − − = E[X0X 1 X0uu0X X0X 1] − E[X0X 1 X0u]E[X0X 1 X0u]0 − − = E[X0X 1 X0uu0X X0X 1] − 0 − − = X0X 1 X0E[uu0]X X0X 1 − − = X0X 1 X0ΣX X0X 1 Molly Roberts Robust and Clustered Standard Errors March 6, 2013 8 / 35 u ∼ Nn(0; Σ) 2 σ2 0 0 ::: 0 3 6 0 σ2 0 ::: 0 7 Σ = Var(u) = E[uu0] = 6 7 6 . 7 4 . 5 0 0 0 : : : σ2 What does this mean graphically for a CEF with one explanatory variable? An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Constant Error Variance and Dependence Under standard OLS assumptions, Molly Roberts Robust and Clustered Standard Errors March 6, 2013 9 / 35 2 σ2 0 0 ::: 0 3 6 0 σ2 0 ::: 0 7 Σ = Var(u) = E[uu0] = 6 7 6 . 7 4 . 5 0 0 0 : : : σ2 What does this mean graphically for a CEF with one explanatory variable? An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Constant Error Variance and Dependence Under standard OLS assumptions, u ∼ Nn(0; Σ) Molly Roberts Robust and Clustered Standard Errors March 6, 2013 9 / 35 What does this mean graphically for a CEF with one explanatory variable? An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Constant Error Variance and Dependence Under standard OLS assumptions, u ∼ Nn(0; Σ) 2 σ2 0 0 ::: 0 3 6 0 σ2 0 ::: 0 7 Σ = Var(u) = E[uu0] = 6 7 6 . 7 4 . 5 0 0 0 : : : σ2 Molly Roberts Robust and Clustered Standard Errors March 6, 2013 9 / 35 An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Constant Error Variance and Dependence Under standard OLS assumptions, u ∼ Nn(0; Σ) 2 σ2 0 0 ::: 0 3 6 0 σ2 0 ::: 0 7 Σ = Var(u) = E[uu0] = 6 7 6 . 7 4 . 5 0 0 0 : : : σ2 What does this mean graphically for a CEF with one explanatory variable? Molly Roberts Robust and Clustered Standard Errors March 6, 2013 9 / 35 An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Evidence of Non-constant Error Variance (4 examples) ● ● ● 1.5 ● ● ● ● ● 2.0 ●● ● ● ● ● ● ● ● ● ● ●● ●●● ● ● ● ●● ●●●● ●● ●● ● ●● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.5 ● ● ● ● ● ● ●● ●● 1.0 ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● y ● ● ● y ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ●●●● ●● ● ● ●● ● ● ● ● ●●● ●●● ● ● ● ● ● ● ● ● ● −0.5 ● ● 0.0 ● ● ● ● ● ● ● ● ● ● ● ● ● −1.5 −1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 x x ● ● ● ● ● ● ● ●●● 1.2 1.0 ● ● ●●● ● ● ●● ●● ● ● ● ●● ● ● ● ●●● ● ● ● ● ● ● ● 0.8 ● ●●● ●● ● ● ● ● ●●● ● ● ● ● ● ● ●● ● 0.8 ● ●● ● 0.6 ● ●●● ● ●● ●●●● ● ● ● ● ●●●●● ● y ● y ● ● ●●● ● ● ●●● ● ● ●● ● 0.4 ● ● ● ●●●● ● ●●● ● ● ● ● ●● ● ●●●● ● ● ● 0.4 ● ●● ● ● ● ● ●● 0.2 ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ●●●● ● ● ● ● ● ● ● ● ● 0.0 ● ● ●● ● ● 0.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 x x Molly Roberts Robust and Clustered Standard Errors March 6, 2013 10 / 35 2 σ2 0 0 ::: 0 3 6 0 σ2 0 ::: 0 7 Var(u) = E[uu0] = 6 7 6 . 7 4 . 5 0 0 0 : : : σ2 In this section we will allow violations of this assumption in the following heteroskedastic form. 2 2 3 σ1 0 0 ::: 0 6 0 σ2 0 ::: 0 7 Var(u) = E[uu0] = 6 2 7 6 . 7 4 . 5 2 0 0 0 : : : σn An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Notation The constant error variance assumption sometimes called homoskedasiticity states that Molly Roberts Robust and Clustered Standard Errors March 6, 2013 11 / 35 In this section we will allow violations of this assumption in the following heteroskedastic form. 2 2 3 σ1 0 0 ::: 0 6 0 σ2 0 ::: 0 7 Var(u) = E[uu0] = 6 2 7 6 . 7 4 . 5 2 0 0 0 : : : σn An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Notation The constant error variance assumption sometimes called homoskedasiticity states that 2 σ2 0 0 ::: 0 3 6 0 σ2 0 ::: 0 7 Var(u) = E[uu0] = 6 7 6 . 7 4 . 5 0 0 0 : : : σ2 Molly Roberts Robust and Clustered Standard Errors March 6, 2013 11 / 35 2 2 3 σ1 0 0 ::: 0 6 0 σ2 0 ::: 0 7 Var(u) = E[uu0] = 6 2 7 6 . 7 4 . 5 2 0 0 0 : : : σn An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Notation The constant error variance assumption sometimes called homoskedasiticity states that 2 σ2 0 0 ::: 0 3 6 0 σ2 0 ::: 0 7 Var(u) = E[uu0] = 6 7 6 . 7 4 . 5 0 0 0 : : : σ2 In this section we will allow violations of this assumption in the following heteroskedastic form. Molly Roberts Robust and Clustered Standard Errors March 6, 2013 11 / 35 An Introduction to Robust and Clustered Standard Errors Linear Regression with Non-constant Variance Notation The constant error variance assumption sometimes called homoskedasiticity states that 2 σ2 0 0 ::: 0 3 6 0 σ2 0 ::: 0 7 Var(u) = E[uu0] = 6 7 6 .