Lesson 7 Heteroskedasticity and Autocorrelation

Pilar González and Susan Orbe

Dpt. Applied Economics III (Econometrics and Statistics)

Pilar González and Susan Orbe | OCW 2014

Learning objectives

• To understand the concepts of heteroskedasticity and serial correlation.

• To identify the consequences of the presence of heteroskedasticity and/or serial correlation on the properties of the OLS estimator.

• To identify the consequences of the presence of heteroskedasticity and/or serial correlation on the inference based on the OLS estimator.

• To detect the presence of heteroskedasticity and/or serial correlation.

• To carry out inference robust to the presence of heteroskedasticity and/or serial correlation based on the OLS estimator.

Contents

1 Multiple Regression Model assumptions.

2 Heteroskedasticity. Concept. Consequences. Detection.

3 Autocorrelation. Concept. Consequences. Detection.

4 Inference using the OLS estimator.

5 Task: T7.

6 Exercises: E7.1, E7.2 and E7.3.


Multiple Regression Model assumptions.

Assumptions.

A.1. The model in the population can be written as:

Yt = β1 + β2X2t + β3X3t + ... + βkXkt + ut,  t = 1, 2, ..., T

A.2. No perfect collinearity

A.3. Zero conditional mean: E(ut|X2, X3, ..., Xk) = 0 ∀t = 1, 2, ..., T

A.4. Homoskedasticity (constant variance): Var(ut|X2, X3, ..., Xk) = σu² ∀t = 1, 2, ..., T.

A.5. No autocorrelation: Cov(ut, us|X2, X3, ..., Xk) = 0 ∀t ≠ s.

A.6. Normality: The errors ut are independent of X and they are identically normally distributed.


Matrix form.

A.1. The model in the population can be written as: Y = Xβ + u

A.2. No perfect collinearity

A.3. Zero conditional mean: E(u|X) = 0

A.4. + A.5. Homoskedasticity (constant variance) + No autocorrelation:

$$
V(u \mid X) = E(uu' \mid X) = \begin{pmatrix} \sigma_u^2 & 0 & \cdots & 0 \\ 0 & \sigma_u^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_u^2 \end{pmatrix} = \sigma_u^2 I_T
$$

A.6. Normality.


Under these assumptions, conditional on X:

A. The OLS estimator β̂ is:
• linear
• unbiased
• efficient in the Gauss-Markov sense.

B. The estimator of the variance of the error term, σ̂u², is unbiased.

C. β̂ ∼ N(β, V(β̂)), where V(β̂) = σu²(X′X)⁻¹.

D. Test statistics:

$$
t = \frac{\hat\beta_j - \beta_j^0}{\hat\sigma_{\hat\beta_j}} \overset{H_0}{\sim} t(T-k) \qquad F = \frac{(SSR_R - SSR_{UR})/q}{SSR_{UR}/(T-k)} \overset{H_0}{\sim} F(q,\, T-k)
$$
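The t statistic above can be computed directly from the OLS formulas. A minimal numpy sketch on simulated data (the data-generating process, seed and variable names are illustrative assumptions, not part of the lesson):

```python
import numpy as np

rng = np.random.default_rng(5)
T, k = 100, 2
x = rng.normal(size=T)
y = 1.0 + 2.0 * x + rng.normal(size=T)      # homoskedastic, no autocorrelation

# OLS coefficients and residuals
X = np.column_stack([np.ones(T), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
u = y - X @ beta

# Unbiased estimator of the error variance: u'u / (T - k)
sigma2_hat = (u @ u) / (T - k)

# V(beta_hat) = sigma^2 (X'X)^{-1}; t statistic for H0: beta_2 = 2
V_hat = sigma2_hat * XtX_inv
t_stat = (beta[1] - 2.0) / np.sqrt(V_hat[1, 1])
print(f"t = {t_stat:.2f}")  # ~ t(T - k) under H0
```

With the null hypothesis true by construction, the statistic should typically fall inside the usual acceptance region.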


Heteroskedasticity.

Concept.

Homoskedasticity = The variance of the disturbances ui is constant ∀i, conditional on X.

Heteroskedasticity = The variance of the disturbances ui is not constant ∀i, conditional on X, because it depends on one or several variables.

The presence of heteroskedasticity is quite frequent when working with cross-section data. Let’s assume a regression model that determines consumption as a function of income. The variance of the error term might be expected to increase as income increases.


Homoskedastic error term

[Figure: conditional densities F(Y|X = X0), ..., F(Y|X = X4) around the regression line, all with the same spread.]


Heteroskedastic error term

[Figure: conditional densities F(Y|X = X0), ..., F(Y|X = X4) around the regression line, whose spread increases with X.]


Concept.

Heteroskedasticity: V(ui|X) = σi², i = 1, 2, ..., N

The covariance matrix of the error term conditional on X is:

$$
V(u \mid X) = E(uu' \mid X) = \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_N^2 \end{pmatrix} = \Sigma
$$

This covariance matrix will be denoted by V(u).


Properties of the OLS estimator.

Conditional on X, the OLS estimator is:
• Linear: β̂ = (X′X)⁻¹X′Y = β + (X′X)⁻¹X′u

• Unbiased: E(β̂|X) = E[(β + (X′X)⁻¹X′u)|X] = β

• Not efficient. Given that the error term is heteroskedastic, assumption A.4 is not satisfied and it is not possible to apply the Gauss-Markov theorem. β̂_OLS does NOT have the smallest variance within the class of linear unbiased estimators. It can be proved that there is another estimator with a smaller variance when the structure of heteroskedasticity is known: the Generalized Least Squares (GLS) estimator, which uses the information about the covariance matrix of u to estimate the coefficients β. The derivation of this estimator is out of the scope of this book.


Inference using the OLS estimator, conditional on X.

When the assumptions of the GLRM are satisfied: β̂ ∼ N(β, σu²(X′X)⁻¹).

When the error term is heteroskedastic, the covariance matrix of the error term is V(u) = Σ. What is then the distribution of the OLS estimator?

β̂ ∼ N[β, (X′X)⁻¹X′ΣX(X′X)⁻¹]

Proof:

$$
V(\hat\beta \mid X) = E[(\hat\beta - E(\hat\beta))(\hat\beta - E(\hat\beta))' \mid X] = E[(\hat\beta - \beta)(\hat\beta - \beta)' \mid X]
= E[(X'X)^{-1}X'uu'X(X'X)^{-1} \mid X] = (X'X)^{-1}X'\Sigma X(X'X)^{-1}
$$


Inference using the OLS estimator conditional on X.

Hypothesis testing based on the t and F statistics is NOT adequate and can be misleading.

The problem is that the usual estimator of V (βˆ)

V̂(β̂) = σ̂u²(X′X)⁻¹

is not appropriate because it is BIASED.

The t and F statistics do not follow the usual Student-t and F-Snedecor distributions.

$$
t = \frac{\hat\beta_j - \beta_j^0}{\hat\sigma_{\hat\beta_j}} \overset{H_0}{\sim}\; ??? \qquad F = \frac{(SSR_R - SSR_{UR})/q}{SSR_{UR}/(N-k)} \overset{H_0}{\sim}\; ???
$$

What are the distributions of the t and F statistics ???


Detection.

A. Graphic analysis.
• Estimate the model by OLS and compute the residuals.
• Plot the OLS residuals against the regressors that may be the cause of the heteroskedasticity.

The plots below show the behaviour of the error term under the homoskedasticity assumption and the behaviour of a heteroskedastic error term.

[Figure: OLS residuals plotted against Xi; left panel (homoskedasticity) shows a band of constant width around zero, right panel (heteroskedasticity) shows a band that widens as Xi grows.]


Detection.

B. Heteroskedasticity tests.
• Goldfeld-Quandt test.
• Breusch-Pagan test.
• White test.

Consider the linear regression model:

Yi = β1 + β2X2i + β3X3i + ... + βkXki + ui i = 1, 2,...,N.

The set-up of the tests to detect heteroskedasticity is always the same:

H0: Homoskedasticity: V(ui) = σi² = σ² ∀i
Ha: Heteroskedasticity: V(ui) = σi²

See Example 7.1 for applications.


Goldfeld-Quandt test.

This simple test is used when it is believed that only one variable Z, usually one of the regressors, is the cause of the heteroskedasticity. That is, it assumes that the heteroskedasticity takes the form σi² = E(ui²) = h(Z), where h is an increasing (or decreasing) function of Z.

Procedure:
1. Identify the variable Z (usually one of the regressors) that is related with σi².
2. Sort the observations by Z (increasing/decreasing).
3. Split the sample into three parts: the first N1 observations, the last N2 observations and the central observations, which are deleted. Usually, one third of the total number of observations is deleted.

4. Estimate by OLS two regressions: one using the first N1 observations and another one using the last N2 observations.

5. Test statistic:

$$
GQ = \frac{SSR_2/(N_2-k_2)}{SSR_1/(N_1-k_1)} \overset{H_0}{\sim} F((N_2-k_2),\,(N_1-k_1))
$$

where SSR1 and SSR2 are the sums of squared residuals of the first and second regressions, respectively.
6. Reject the null hypothesis of homoskedasticity at the α% significance level if:

GQ > Fα((N2 − k2), (N1 − k1))
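The steps above can be sketched in numpy on simulated data. This is a minimal illustration, not the lesson's own example: the data-generating process (error standard deviation proportional to Z), the seed and the variable names are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated cross-section where Var(u) grows with the regressor Z
N = 90
z = np.sort(rng.uniform(1, 10, N))          # step 2: observations sorted by Z
y = 2.0 + 0.5 * z + rng.normal(0, z)        # error s.d. proportional to Z

def ssr(y_sub, z_sub):
    """Sum of squared OLS residuals of y on [1, z]."""
    X = np.column_stack([np.ones(len(z_sub)), z_sub])
    beta, *_ = np.linalg.lstsq(X, y_sub, rcond=None)
    resid = y_sub - X @ beta
    return resid @ resid

# Step 3: keep the first and last thirds, drop the middle third
n1 = n2 = N // 3
ssr1 = ssr(y[:n1], z[:n1])
ssr2 = ssr(y[-n2:], z[-n2:])

# Step 5: GQ statistic, ~ F(n2 - k, n1 - k) under H0, with k = 2 here
k = 2
gq = (ssr2 / (n2 - k)) / (ssr1 / (n1 - k))
print(f"GQ = {gq:.2f}")
```

Because the simulated variance increases with Z, the statistic comes out well above 1 and the homoskedasticity null is rejected against the F critical value.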


Breusch-Pagan test.

This test is based on the assumption that the heteroskedasticity takes the form σi² = E(ui²) = h(zi′α), where
• zi′ = [1, z1i, ..., zpi] is a vector of known variables,
• α = [α0, α1, ..., αp]′ is a vector of unknown coefficients, and
• h(·) is any function that only takes positive values.

The null hypothesis of homoskedasticity is: H0: α1 = α2 = ... = αp = 0

Procedure:
1. Estimate the regression model by OLS, compute the residuals û and estimate the variance of the error term as σ̃² = Σ ûi²/N.
2. Regress ûi²/σ̃² on zi′ by OLS and compute the explained sum of squares (SSE).
3. Test statistic: LM = SSE/2, asymptotically distributed as χ²(p) under H0.
4. Reject the null hypothesis of homoskedasticity at the α% significance level if:

LM = SSE/2 > χ²α(p)
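A numpy sketch of the Breusch-Pagan procedure on simulated data (the design with error s.d. proportional to x, the seed and names are illustrative assumptions; here zi = [1, xi], so p = 1):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200
x = rng.uniform(1, 5, N)
y = 1.0 + 2.0 * x + rng.normal(0, x)        # variance depends on x

# Step 1: OLS residuals and sigma-tilde^2 = sum(u^2)/N
X = np.column_stack([np.ones(N), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
u2 = (y - X @ beta) ** 2
sig2 = u2.mean()

# Step 2: regress u^2/sig2 on z_i = [1, x]; explained sum of squares
g = u2 / sig2
gamma, *_ = np.linalg.lstsq(X, g, rcond=None)
fitted = X @ gamma
sse = ((fitted - g.mean()) ** 2).sum()

# Step 3: LM = SSE/2, ~ chi2(p) under H0 with p = 1 here
lm = sse / 2
print(f"BP LM = {lm:.2f}")
```

For this strongly heteroskedastic design the LM statistic lands far above the 5% χ²(1) critical value of 3.84, so homoskedasticity is rejected.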


White test.

This is a very flexible test because it is not necessary to make any assumption about the structure of the heteroskedasticity. In this sense, it is said to be a robust test.

Procedure:
1. Estimate the regression model by OLS and compute the residuals û.
2. Undertake an auxiliary regression: regress the squared residuals from the original model on a set of regressors containing the original regressors, their cross-products and their squares. Compute the coefficient of determination of this auxiliary regression.
3. Test statistic: LM = NR², asymptotically distributed as χ²(p) under H0, where R² is the coefficient of determination of the auxiliary regression, N is the sample size and p is the number of estimated coefficients in the auxiliary regression minus 1.

4. Reject the null hypothesis of homoskedasticity at the α% significance level if: NR² > χ²α(p).
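The White auxiliary regression can be sketched as follows, for a model with two regressors (simulated data; design, seed and names are illustrative assumptions, so p = 5 auxiliary slopes here):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 300
x2 = rng.uniform(0, 3, N)
x3 = rng.uniform(0, 3, N)
y = 1.0 + x2 + x3 + rng.normal(0, 1 + x2)   # heteroskedastic in x2

# Step 1: squared OLS residuals of the original model
X = np.column_stack([np.ones(N), x2, x3])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
u2 = (y - X @ beta) ** 2

# Step 2: auxiliary regression on levels, squares and the cross-product
Z = np.column_stack([np.ones(N), x2, x3, x2**2, x3**2, x2 * x3])
gamma, *_ = np.linalg.lstsq(Z, u2, rcond=None)
fit = Z @ gamma
r2 = 1 - ((u2 - fit) ** 2).sum() / ((u2 - u2.mean()) ** 2).sum()

# Step 3: LM = N * R^2, ~ chi2(p) under H0 with p = 5
lm = N * r2
print(f"White LM = {lm:.2f}")
```

Comparing the statistic with the 5% χ²(5) critical value of 11.07 rejects homoskedasticity for this design.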


Autocorrelation.

Concept.

Autocorrelation: there are linear relationships among the error terms of different observations:

Cov(ut, us|X) = E(ut us|X) ≠ 0, ∀t, s, t ≠ s

The presence of autocorrelation in the error term is quite frequent when working with time series data. The covariance matrix of the error term is:

$$
V(u \mid X) = E(uu' \mid X) = \begin{pmatrix} \sigma^2 & \sigma_{12} & \sigma_{13} & \cdots & \sigma_{1T} \\ \sigma_{21} & \sigma^2 & \sigma_{23} & \cdots & \sigma_{2T} \\ \sigma_{31} & \sigma_{32} & \sigma^2 & \cdots & \sigma_{3T} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma_{T1} & \sigma_{T2} & \sigma_{T3} & \cdots & \sigma^2 \end{pmatrix} = \Sigma
$$

This covariance matrix will be denoted by V(u).


Consequences of autocorrelation.

A. Properties of the OLS estimator, conditional on X: the OLS estimator is linear and unbiased, but it is NOT efficient in the sense of minimum variance, because the Gauss-Markov theorem cannot be applied under autocorrelation.

It can be proved that there is another estimator with a smaller variance when the structure of the autocorrelation is known: Generalized Least Squares Estimator (GLS).

B. Inference using the OLS estimator. When assumption A.5 is not satisfied and the error term is autocorrelated, the true covariance matrix of the OLS estimator conditional on X is:

V(β̂) = (X′X)⁻¹X′ΣX(X′X)⁻¹

As a consequence, the true distribution of the t and F statistics used for testing linear restrictions is unknown. Inference based on the usual Student-t and F-Snedecor distributions is not adequate and may lead to wrong conclusions.

Detection.

A. Graphic analysis. Procedure: estimate the regression model by OLS and compute the residuals û. Plot the OLS residuals against time to analyse whether there is any pattern in the evolution of the error term that may suggest the presence of autocorrelation.

The plot below shows the temporal evolution of the error terms when there is: • Positive autocorrelation: clusters of positive and negative error terms. • Negative autocorrelation: the error terms alternate sign, negative, positive, negative, ....

[Figure: residuals ût plotted against time; left panel (positive autocorrelation) shows runs of residuals with the same sign, right panel (negative autocorrelation) shows residuals alternating in sign.]


Detection.

B. Autocorrelation tests.
1. Durbin-Watson test (only valid for deterministic regressors).
2. Breusch-Godfrey test.

Consider the linear regression model:

Yt = β1 + β2X2t + β3X3t + ... + βkXkt + ut t = 1, 2,...,T.

The set-up of the autocorrelation tests is:

H0: NO autocorrelation (Cov(ut, us) = E(ut us) = 0, t ≠ s)
Ha: Autocorrelation (Cov(ut, us) = E(ut us) ≠ 0, t ≠ s)

It is ALWAYS necessary to specify the autocorrelation structure under the alternative hypothesis. See Example 7.2 for applications.

Durbin-Watson test.

Ha: First order autocorrelation in the error term, that is, the error term follows an autoregressive process of order 1:

AR(1): ut = ρ ut−1 + vt,  vt ∼ NID(0, σv²),  |ρ| < 1

The presence of autocorrelation depends on the value of ρ.

ρ = 0: No autocorrelation
ρ > 0: Positive autocorrelation
ρ < 0: Negative autocorrelation


Durbin-Watson test. Positive autocorrelation.

Procedure:
1. Hypotheses:
H0: ρ = 0 (NO autocorrelation)
Ha: ut = ρ ut−1 + vt, ρ > 0 (Positive autocorrelation)

2. Test statistic:

$$
DW = \frac{\sum_{t=2}^{T} (\hat u_t - \hat u_{t-1})^2}{\sum_{t=1}^{T} \hat u_t^2} \overset{H_0}{\sim}\; ???
$$

where ût are the OLS residuals of the regression model. It can be proved that DW ≈ 2(1 − ρ̂), where:

$$
\hat\rho = \frac{\sum_{t=2}^{T} \hat u_t \hat u_{t-1}}{\sum_{t=2}^{T} \hat u_{t-1}^2}
$$


Durbin-Watson test. Positive autocorrelation.

Therefore:
ρ̂ = 0 → DW ≈ 2
ρ̂ ∈ (0, 1] → DW ∈ [0, 2)

3. Decision rule: Durbin and Watson did not derive the exact distribution of the test statistic under the null hypothesis because it depends on the sample. However, they tabulated critical values, depending on the sample size and the number of regressors of the regression model, for a given significance level: an upper limit (dU) and a lower limit (dL).

• If DW < dL, the null hypothesis is rejected: there is evidence in the sample of first order positive autocorrelation.

• If DW > dU , the null hypothesis is not rejected: there is no evidence in the sample of first order positive autocorrelation.

• If dL < DW < dU , the test is inconclusive.
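The DW statistic and the approximation DW ≈ 2(1 − ρ̂) can be checked numerically. A sketch on simulated AR(1) errors (the ρ = 0.7 design, seed and names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 200
x = rng.normal(size=T)

# AR(1) errors with rho = 0.7 (positive autocorrelation)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

# OLS residuals
X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta

# DW statistic and rho-hat from the formulas above
dw = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
rho_hat = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)
print(f"DW = {dw:.2f}, 2(1 - rho_hat) = {2 * (1 - rho_hat):.2f}")
```

With positive autocorrelation the statistic falls clearly below 2, and the two quantities differ only by edge terms of order 1/T.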


Durbin-Watson test. Positive autocorrelation.

[Diagram: DW scale from 0 to 2 with cut-offs dL and dU. Below dL: reject H0, evidence of positive autocorrelation. Between dL and dU: uncertainty region. Above dU: do not reject H0, no autocorrelation.]


Durbin-Watson test. Negative autocorrelation.

Procedure:
1. Hypotheses:
H0: ρ = 0 (NO autocorrelation)
Ha: ut = ρ ut−1 + vt, ρ < 0 (Negative autocorrelation)

2. Test statistic:

$$
DW = \frac{\sum_{t=2}^{T} (\hat u_t - \hat u_{t-1})^2}{\sum_{t=1}^{T} \hat u_t^2} \overset{H_0}{\sim}\; ???
$$

It can be proved that DW ≈ 2(1 − ρ̂). Therefore:
ρ̂ = 0 → DW ≈ 2
ρ̂ ∈ [−1, 0) → DW ∈ (2, 4]


Durbin-Watson test. Negative autocorrelation.

Durbin and Watson tabulated the critical values depending on the sample size and the number of regressors of the model for a given significance level. They estimated only an upper limit (dU) and a lower limit (dL).

• If DW > 4 − dL, the null hypothesis is rejected: there is evidence in the sample of first order negative autocorrelation.

• If 2 < DW < 4 − dU , the null hypothesis is not rejected: there is no evidence in the sample of first order negative autocorrelation.

• If 4 − dU < DW < 4 − dL, the test is inconclusive.


Durbin-Watson test. Negative autocorrelation.

[Diagram: DW scale from 2 to 4 with cut-offs 4 − dU and 4 − dL. Below 4 − dU: do not reject H0, no autocorrelation. Between 4 − dU and 4 − dL: uncertainty region. Above 4 − dL: reject H0, evidence of negative autocorrelation.]


Durbin-Watson test.

[Diagram: full DW scale from 0 to 4 with cut-offs dL, dU, 4 − dU and 4 − dL. Below dL: evidence of positive autocorrelation. dL to dU: uncertainty region. dU to 4 − dU: do not reject H0, no autocorrelation. 4 − dU to 4 − dL: uncertainty region. Above 4 − dL: evidence of negative autocorrelation.]


Breusch-Godfrey test.

This test may be used to test for the presence of first order autocorrelation, but it is designed to test for autocorrelation of higher orders. Assume that the error term follows an autoregressive process of order q:

H0: ρ1 = ρ2 = ... = ρq = 0 (NO autocorrelation)
Ha: Autocorrelation of order q

Procedure: 1. Estimate the regression model by OLS and compute the residuals uˆ.

2. Estimate an auxiliary regression of ût on X2t, X3t, ..., Xkt, ût−1, ût−2, ..., ût−q.

3. Test the overall significance of ût−1, ût−2, ..., ût−q using a statistic based on the Lagrange multiplier: LM = TR², asymptotically distributed as χ²(q) under H0, where R² is the coefficient of determination of the auxiliary regression. Reject the null hypothesis at the α% significance level if LM > χ²α(q).
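The Breusch-Godfrey steps can be sketched in numpy for q = 2 lags on simulated AR(1) errors (the design, seed and names are illustrative assumptions; the statistic is computed on the T − q usable observations):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 300
x = rng.normal(size=T)

# AR(1) errors so the test has something to detect
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.normal()
y = 1.0 + x + u

# Step 1: OLS residuals
X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta

# Step 2: auxiliary regression of e_t on x_t, e_{t-1}, e_{t-2}  (q = 2)
q = 2
Z = np.column_stack([np.ones(T - q), x[q:], e[q - 1:-1], e[:-q]])
target = e[q:]
gamma, *_ = np.linalg.lstsq(Z, target, rcond=None)
fit = Z @ gamma
r2 = 1 - ((target - fit) ** 2).sum() / ((target - target.mean()) ** 2).sum()

# Step 3: LM ~ chi2(q) under H0; chi2_0.05(2) = 5.99
lm = (T - q) * r2
print(f"BG LM = {lm:.2f}")
```

For this autocorrelated design the statistic exceeds the 5% χ²(2) critical value of 5.99 by a wide margin, rejecting the null of no autocorrelation.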


Heteroskedasticity and Autocorrelation.

Inference using the OLS estimator conditional on X.

Is there any procedure to make inference using the OLS estimator?

β̂ ∼ N[β, (X′X)⁻¹X′ΣX(X′X)⁻¹]

Idea: to estimate the true covariance matrix of the OLS estimator.

V(β̂) = (X′X)⁻¹X′ΣX(X′X)⁻¹

Not an easy task, because usually little is known about the structure of the matrix Σ. Some econometricians have derived estimators of the covariance matrix V(β̂) with good properties in large samples: White (heteroskedasticity) and Newey-West (autocorrelation).


Inference using the OLS estimator conditional on X.

Denote by V̂_R(β̂) the estimator of the true covariance matrix of the OLS estimator of the coefficients β. This estimator is usually called an estimator robust to heteroskedasticity and/or autocorrelation.

A. Hypothesis on a single coefficient. It can be proved that the asymptotic distribution of the t-statistic under the null is:

$$
t = \frac{\hat\beta_j - \beta_j^0}{\hat\sigma^R_{\hat\beta_j}} \overset{H_0,\,a}{\sim} N(0, 1)
$$

where σ̂R_β̂j is the standard error of the OLS estimator robust to heteroskedasticity and/or autocorrelation. The null hypothesis is rejected at the α% significance level if |t| > N_{α/2}(0, 1).

B. Hypothesis on multiple restrictions. It is necessary to multiply the usual F statistic by the number of restrictions q to obtain a q × F statistic that asymptotically follows a χ²(q) distribution under the null. The null hypothesis is rejected at the α% significance level if q × F > χ²α(q).
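The White heteroskedasticity-robust covariance, the sandwich (X′X)⁻¹X′Σ̂X(X′X)⁻¹ with Σ̂ = diag(ûi²), can be sketched in numpy. Simulated data; the design, seed and names are illustrative assumptions, and the null β2 = 2 is true by construction:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 500
x = rng.uniform(1, 4, N)
y = 1.0 + 2.0 * x + rng.normal(0, x)        # heteroskedastic errors

# OLS coefficients and residuals
X = np.column_stack([np.ones(N), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
u = y - X @ beta

# Sandwich estimator: (X'X)^{-1} [sum u_i^2 x_i x_i'] (X'X)^{-1}
meat = (X * u[:, None] ** 2).T @ X
V_robust = XtX_inv @ meat @ XtX_inv
se_robust = np.sqrt(np.diag(V_robust))

# Robust t statistic for H0: beta_2 = 2, asymptotically N(0, 1) under H0
t_rob = (beta[1] - 2.0) / se_robust[1]
print(f"beta_2 = {beta[1]:.3f}, robust s.e. = {se_robust[1]:.3f}, t = {t_rob:.2f}")
```

Since the null is true, |t| should typically stay below the 5% standard normal critical value of 1.96; the same residuals plugged into the non-robust formula σ̂²(X′X)⁻¹ would give a biased standard error.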
