Chapter 9 Autocorrelation

Chapter 9 Autocorrelation One of the basic assumptions in the linear regression model is that the random error components or disturbances are identically and independently distributed. So in the model y Xu , it is assumed that 2 u ifs 0 Eu(,tts u ) 0 if0s i.e., the correlation between the successive disturbances is zero. 2 In this assumption, when Eu(,tts u ) u , s 0 is violated, i.e., the variance of disturbance term does not remain constant, then the problem of heteroskedasticity arises. When Eu(,tts u ) 0, s 0 is violated, i.e., the variance of disturbance term remains constant though the successive disturbance terms are correlated, then such problem is termed as the problem of autocorrelation. When autocorrelation is present, some or all off-diagonal elements in E(')uu are nonzero. Sometimes the study and explanatory variables have a natural sequence order over time, i.e., the data is collected with respect to time. Such data is termed as time-series data. The disturbance terms in time series data are serially correlated. The autocovariance at lag s is defined as sttsEu( , u ); s 0, 1, 2,... At zero lag, we have constant variance, i.e., 22 0 Eu()t . The autocorrelation coefficient at lag s is defined as Euu()tts s s ;s 0,1,2,... Var() utts Var ( u ) 0 Assume s and s are symmetrical in s , i.e., these coefficients are constant over time and depend only on the length of lag s . The autocorrelation between the successive terms (uu21 and ) (uuuu32 and ),...,(nn and 1 ) gives the autocorrelation of order one, i.e., 1 . Similarly, the autocorrelation between the successive terms (uuuuuu3142 and ),( and )...(nn and 2 ) gives the autocorrelation of order two, i.e., 2 . Econometrics | Chapter 9 | Autocorrelation | Shalabh, IIT Kanpur 1 Source of autocorrelation Some of the possible reasons for the introduction of autocorrelation in the data are as follows: 1. Carryover of effect, at least in part, is an important source of autocorrelation. For example, the monthly data on expenditure on household is influenced by the expenditure of preceding month. The autocorrelation is present in cross-section data as well as time-series data. In the cross-section data, the neighbouring units tend to be similar with respect to the characteristic under study. In time-series data, time is the factor that produces autocorrelation. Whenever some ordering of sampling units is present, the autocorrelation may arise. 2. Another source of autocorrelation is the effect of deletion of some variables. In regression modeling, it is not possible to include all the variables in the model. There can be various reasons for this, e.g., some variable may be qualitative, sometimes direct observations may not be available on the variable etc. The joint effect of such deleted variables gives rise to autocorrelation in the data. 3. The misspecification of the form of relationship can also introduce autocorrelation in the data. It is assumed that the form of relationship between study and explanatory variables is linear. If there are log or exponential terms present in the model so that the linearity of the model is questionable, then this also gives rise to autocorrelation in the data. 4. The difference between the observed and true values of the variable is called measurement error or errors–in-variable. The presence of measurement errors on the dependent variable may also introduce the autocorrelation in the data. Econometrics | Chapter 9 | Autocorrelation | Shalabh, IIT Kanpur 2 Structure of disturbance term: Consider the situation where the disturbances are autocorrelated, 01 n 1 E(') 10n 2 nn12 0 1 11 n 1 12n 0 nn12 1 1 11 n 1 2 12n . u nn12 1 2 Observe that now there are ()nk parameters- 12, ,...,ku , , 12 , ,..., n 1 . These ()nk parameters are to be estimated on the basis of available n observations. Since the number of parameters are more than the number of observations, so the situation is not good from the statistical point of view. In order to handle the situation, some special form and the structure of the disturbance term is needed to be assumed so that the number of parameters in the covariance matrix of disturbance term can be reduced. The following structures are popular in autocorrelation: 1. Autoregressive (AR) process. 2. Moving average (MA) process. 3. Joint autoregression moving average (ARMA) process. 1. Autoregressive (AR) process The structure of disturbance term in the autoregressive process (AR) is assumed as uutt11 2 u t 2 ... qtqt u , i.e., the current disturbance term depends on the q lagged disturbances and 12, ,..., k are the parameters (coefficients) associated with uutt12, ,..., u tq respectively. An additional disturbance term is introduced in ut which is assumed to satisfy the following conditions: Econometrics | Chapter 9 | Autocorrelation | Shalabh, IIT Kanpur 3 E t 0 2 ifs 0 E tts 0if0.s This process is termed as ARq process. In practice, the AR1 process is more popular. 2. Moving average (MA) process: The structure of disturbance term in the moving average (MA) process is utt 11 t .... ptp , i.e., the present disturbance term ut depends on the p lagged values. The coefficients 12, ,..., p are the parameters and are associated with tt12, ,..., tp , respectively. This process is termed as MAp process. 3. Joint autoregressive moving average (ARMA) process: The structure of disturbance term in the joint autoregressive moving average (ARMA) process is uutt11... qtqt u 1 11 ... ptp . This is termed as ARMA q, p process. The method of correlogram is used to check that the data is following which of the processes. The correlogram is a two dimensional graph between the lag s and autocorrelation coefficient s which is plotted as lag s on X -axis and s on y -axis. In MA(1) process utt 11 t 1 2 fors 1 s 11 0for2s 0 1 1 0 i 0i 2,3,... So there is no autocorrelation between the disturbances that are more than one period apart. Econometrics | Chapter 9 | Autocorrelation | Shalabh, IIT Kanpur 4 In ARMA(1,1) process uutttt11 11 1 11 1 1 fors 1 2 s 12111 11s fors 2 2 2212111 u 2 . 11 The autocorrelation function begins at some point determined by both the AR and MA components but thereafter, declines geometrically at a rate determined by the AR component. In general, the autocorrelation function - is nonzero but is geometrically damped for AR process. - becomes zero after a finite number of periods for MA process. The ARMA process combines both these features. The results of any lower order of process are not applicable in higher-order schemes. As the order of the process increases, the difficulty in handling them mathematically also increases. Estimation under the first order autoregressive process: Consider a simple linear regression model yXutnttt01 ,1,2,...,. Assume usi ' follow a first-order autoregressive scheme defined as uuttt 1 where 1,E (t ) 0, 2 ifs 0 E(,tts ) 0if0s for all tn1,2,..., where is the first-order autocorrelation between ut and utt1,1,2,...,. n Now uuttt 1 ()utt21 t 2 tt12 t ... r tr r0 Econometrics | Chapter 9 | Autocorrelation | Shalabh, IIT Kanpur 5 Eu()t 0 222242 Eu(tt ) E ( ) E ( t12 ) E ( t ) ... 24 2' (1 ....) (t s are seriallyindependent) 2 Eu()22 forall. t tu1 2 Euu( ) E22 ... ... tt112123 t t t t t t E ttt 12 ... tt 12 ... E ... 2 tt12 2 u . Similarly, 22 Euu()tt2 u . In general, s 2 Euu()tts u 1 21 n 1 n2 . Euu(') 2 231 n u nn123 n 1 Note that the disturbance terms are no more independent and E(')uu 2 I . The disturbances are nonspherical. Consequences of autocorrelated disturbances: Consider the model with first-order autoregressive disturbances yX u n1 nk k1 n1 uuttt1 ,1,2,..., t n with assumptions Eu() 0,( Euu ') 2 ifs 0 EE()ttts 0,( ) 0ifs 0 where is a positive definite matrix. Econometrics | Chapter 9 | Autocorrelation | Shalabh, IIT Kanpur 6 The ordinary least squares estimator of is bXXXy (')1 ' (')X XXXu1 '( ) bXXXu (')1 ' Eb()0. So OLSE remains unbiased under autocorrelated disturbances. The covariance matrix of b is Vb()()()' Eb b (')X XXEuuXXX11 '(')(') (')XXXXXX11 ' (') 21 u (').XX The residual vector is e y Xb Hy Hu ee'' yHy uHu ' 1 E(')ee Euu (') E u '(') X X X X ' u 21 ntrXXXX u (') ' . ee' Since s2 , so n 1 2 1 E()strXXXX21u (') ' , nn11 so s2 is a biased estimator of 2 . In fact, s2 has a downward bias. Application of OLS fails in case of autocorrelation in the data and leads to serious consequences as overly optimistic view from R2. narrow confidence interval. usual t -ratio and F ratio tests provide misleading results. prediction may have large variances. Since disturbances are nonspherical, so generalized least squares estimate of yields more efficient estimates than OLSE. Econometrics | Chapter 9 | Autocorrelation | Shalabh, IIT Kanpur 7 The GLSE of is ˆ ('X 11XX ) ' 1 y E()ˆ ˆ 211 VXX()u ( ' ). The GLSE is best linear unbiased estimator of . Tests for autocorrelation: Durbin Watson test: The Durbin-Watson (D-W) test is used for testing the hypothesis of lack of first-order autocorrelation in the disturbance term. The null hypothesis is H0 :0 Use OLS to estimate in y Xu and obtain the residual vector eyXbHy where bXXXyHIXXXX(')11 ', (') '. The D-W test statistic is n 2 eett 1 t2 d n 2 et t1 nn n 2 eett11 ee tt tt22 t 2 nn 2. n 22 2 eett e t tt11 t 1 For large n, dr112 dr 2(1 ) where r is the sample autocorrelation coefficient from residuals based on OLSE and can be regarded as the regression coefficient of et on et1 .

Chapter 9 Autocorrelation

Power Comparisons of Five Most Commonly Used Autocorrelation Tests

Understanding Linear and Logistic Regression Analyses

Autocovariance Function Estimation Via Penalized Regression

LECTURES 2 - 3 : Stochastic Processes, Autocorrelation Function

Alternative Tests for Time Series Dependence Based on Autocorrelation Coefficients

3 Autocorrelation

Spatial Autocorrelation: Covariance and Semivariance Semivariance

Modeling and Generating Multivariate Time Series with Arbitrary Marginals Using a Vector Autoregressive Technique

Application of General Linear Models (GLM) to Assess Nodule Abundance Based on a Photographic Survey (Case Study from IOM Area, Paciﬁc Ocean)

The Simple Linear Regression Model

Random Processes

Chapter 2 Simple Linear Regression Analysis the Simple