Beginning Tutorials
Regression with Time Series Errors
David A. Dickey, North Carolina State University
Abstract: The basic assumptions of regression are reviewed. Graphical and statistical methods for checking the assumptions are presented using a sales example. Departures from independence in time series data are emphasized and illustrated in the example. Several products from SAS Institute(TM) for analyzing regressions with time series errors are illustrated. The importance of the stochastic properties of the model input variables is emphasized. Forecasts from several models for the example data are compared.

1. Introduction

Regression is a tool that allows one to model the relationship between a response variable Y, which might be a mail order company's sales, and some explanatory variables usually denoted X_j, where X_1 might be the cost of one item from the company, X_2 the cost of a similar item from a competitor company, and X_3 the number of phone calls coming in to the company's switchboard. A typical regression model for this situation is

Y = β_0 + β_1 X_1 + β_2 X_2 + β_3 X_3 + e

where the regression coefficients, β_j, are unknown.

You would like to estimate these βs, for if you could, you would then have an equation for predicting a future Y from the associated Xs. Notice that even if the regression coefficients were known, such a prediction would require knowledge of future X values. For example, if t represents time and X_t = t, then part of our model consists of a simple linear time trend and there will be no surprises when we try to extend the time sequence 1, 2, 3, ..., n into the future. On the other hand, if X_t is the number of incoming phone calls at time t, then forecasting to time n+1 would require that some value be inserted for X_{n+1}, and this value will itself likely be a forecast. These two examples represent deterministic and stochastic explanatory variables, respectively.

The nature of the X variables will affect the forecast accuracy: obviously a person forecasting with a known future X is better off than one who must estimate that future X. Thus a problem we will need to deal with, if we want to put some sort of error bounds on our forecasts, is the incorporation of our level of uncertainty about the future X values.

The usual way of estimating the βs is the method used in PROC GLM and PROC REG. The method is referred to as ordinary least squares in that it finds estimates b_j of the parameters that minimize the error sum of squares

SSE = Σ_{t=1}^{n} (Y_t - (b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3))²

This SSE is a function of the estimates b_j, and much of the subject of calculus is concerned with finding values of arguments, like these b_j, that minimize a function, SSE in our case. Thus we have mathematical tools, relatively easy to implement on the computer, that allow us to find the minimizing values. This is what PROC REG and PROC GLM are set up to do.
Furthermore, statistical theory allows us to compute measures of uncertainty called standard errors for these b_j estimates and the resulting forecasts if certain conditions are satisfied. Note the expression "if certain conditions are satisfied." It is this with which we are concerned here.

In this paper we review these "certain conditions," indicate why they might be violated when data are taken over time, present methods for checking these conditions, and finally present corrections that can be applied if the conditions are violated. The corrections that we speak of are implemented in SAS Institute's PROC AUTOREG.

Throughout the paper we will use an artificial example in which X_t represents the number of phone calls in week t to a mail order company and Y_t is the number of shipments for that week. Figure 1 shows the data over a 3 year period. We are interested in estimating the company's growth, estimating the number of shipments generated per phone call, and forecasting phone calls and sales two weeks into the future.

2. Checking the usual assumptions

Our model is Y_t = β_0 + β_1 X_t + β_2 t + e_t. We assume

(A) Normality: the errors all come from normal distributions.

(B) Homogeneity: these normal distributions all have mean 0 and the same variance, σ².

(C) Independence: the correlation between e_i and e_j is 0 for i not equal to j.

We can check the normality assumption by drawing histograms and normal probability plots of the residuals. In Figures 2-4 we see a histogram of the residuals, a hanging histogram in which each bar becomes a line segment at the former bar midpoint, this line being hung from the normal curve rather than rising from the horizontal axis, and a plot of the residuals against their normal scores. These are very easy to produce using the following code:

proc capability graphics;
histogram r / normal hanging vref=0;
histogram r / normal;
qqplot r / normal(mu=est sigma=est);

The histograms look reasonably normal and the quantile-quantile plot reasonably straight. PROC CAPABILITY also presents tests of the normality hypothesis, but the theory behind these assumes independence, an assumption we have yet to check.

Not shown is a simple plot of residuals against predicted values. Because this looks uniform, as opposed to megaphone shaped, this check on the homogeneous variance assumption does not give us cause for concern.

The regression and subsequent calculation of residuals was accomplished with this code:

proc reg; model y = t x / dw;
proc reg; model y = t / dw;

where Y is sales, X phone calls, and t week number. The previous residual analysis was from the first regression. The advantage of the second regression is that only future values of t would be needed for forecasting, whereas for the first model we would need to know, or at least estimate, next week's phone calls to forecast sales.

Notice the dw options. These request the "Durbin-Watson" statistic, which is a test for autocorrelation, that is, correlation between successive residuals. Autocorrelation is a commonly occurring violation of the independence assumption when data are taken over time. The option also gives an estimate ρ̂ of the first order autocorrelation. We get dw = 1.407 and ρ̂ = 0.283 for the first model, dw = 0.969 and ρ̂ = 0.497 for the second model.
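For readers outside SAS, the same fit-then-diagnose step can be mimicked. This is a hypothetical Python sketch (numpy only; simulated data stand in for the sales series, with AR(1) errors built in so there is autocorrelation to find), computing the residuals, the lag 1 autocorrelation estimate, and the Durbin-Watson statistic that the dw option reports:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 156
t = np.arange(1.0, n + 1)
x = 100 + 0.3 * t + rng.normal(scale=10, size=n)    # stand-in "phone calls"
e = np.zeros(n)
for i in range(1, n):                               # AR(1) errors, rho = 0.3
    e[i] = 0.3 * e[i - 1] + rng.normal(scale=20)
y = 10 + 0.95 * x + e                               # stand-in "sales"

X1 = np.column_stack([np.ones(n), t, x])            # model y = t x
b1, *_ = np.linalg.lstsq(X1, y, rcond=None)
r = y - X1 @ b1                                     # residuals
rho_hat = np.sum(r[1:] * r[:-1]) / np.sum(r ** 2)   # lag 1 autocorrelation
dw = np.sum(np.diff(r) ** 2) / np.sum(r ** 2)       # Durbin-Watson statistic
```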
3. The Durbin-Watson statistic and first order autocorrelation

The Durbin-Watson statistic is

dw = Σ_{t=2}^{n} (r_t - r_{t-1})² / Σ_{t=1}^{n} r_t²

where r_t is the residual at time t. If e_t represents white noise (an uncorrelated sequence) then we find these expected values:

E{(e_t - e_{t-1})²} = E{e_t² - 2 e_t e_{t-1} + e_{t-1}²} = σ² + 0 + σ²

E{e_t²} = σ²

Thus Σ_{t=2}^{n} (e_t - e_{t-1})² / Σ_{t=1}^{n} e_t² should be near 2; that is, the Durbin-Watson statistic should be near 2 if calculated on a white noise sequence. If there is positive correlation between neighboring e's, then e_t and e_{t-1} would be more alike than in the white noise case, so that e_t - e_{t-1} would be smaller in magnitude and thus dw would move toward 0.

The first order correlation in the residuals is

ρ̂ = Σ_{t=2}^{n} (r_t - r̄)(r_{t-1} - r̄) / Σ_{t=1}^{n} (r_t - r̄)²

which is very close to what one would get by simply inserting r_t and r_{t-1} into the standard formula for a correlation. If, as in our example, the regression contains an intercept, then r̄ = 0. It is well known that if ρ̂ is computed on a white noise series and if the sample size n is reasonably large, then

Z = √n ρ̂ / √(1 - ρ̂²)

is approximately a N(0, 1) random variable, so that for large samples, values of |Z| exceeding 1.96 would give us reason to suspect that autocorrelation is present.

A bit of algebra demonstrates that the Durbin-Watson statistic is roughly equal to 2(1 - ρ̂), so that from our Z we could get a large sample approximate distribution for the Durbin-Watson statistic. The real contribution of Durbin and Watson was to show how to get the exact finite sample distribution of the statistic dw.

Unfortunately the Durbin-Watson theory shows that the exact distribution depends on the values of the X explanatory variables in the regression, so that each new problem encountered would require a new table of critical values. However, if none of the X variables are lagged Y values and the errors are normal, they were able to calculate bounds that hold for all critical values. Thus if you enter the tables of Durbin and Watson for a certain sample size and number of explanatory variables, you will see upper and lower bounds for the true critical value.

A dw to the left of the lower bound is clearly less than the critical value and thus too close to 0 to accept the independence hypothesis under which dw should be near 2. A dw to the right of the upper bound makes it clear that dw is closer to 2 than is the critical value, so we cannot reject independence. A dw between the bounds just tells you that the calculated dw and the critical value are between these numbers, so you have no idea how they are placed relative to each other.

Durbin and Watson also gave a computationally intensive way of computing p-values using the observed Xs. We will see how to get p-values from PROC AUTOREG. It should be noted that the restriction that lagged Ys not be included in the explanatory X variables still holds, so that a model with lagged Ys, explicitly or implicitly among the explanatory variables, would not produce exact p-values using the Durbin-Watson method.
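The near-identity dw ≈ 2(1 - ρ̂) is easy to verify numerically. A small Python check on simulated white noise (an illustrative sketch, not part of the paper's SAS analysis):

```python
import numpy as np

rng = np.random.default_rng(2)
r = rng.normal(size=5000)          # white-noise "residuals"
r = r - r.mean()                   # an intercept in the regression makes r-bar 0

dw = np.sum(np.diff(r) ** 2) / np.sum(r ** 2)       # Durbin-Watson statistic
rho = np.sum(r[1:] * r[:-1]) / np.sum(r ** 2)       # lag 1 autocorrelation
z = np.sqrt(len(r)) * rho / np.sqrt(1 - rho ** 2)   # approx N(0,1) under independence
```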
Because the large sample Z approximation is reasonably good, the tables of Durbin and Watson typically do not extend to very large n values, so for our example residuals we use Z = √n ρ̂ / √(1 - ρ̂²), getting

√156 (0.283) / √(1 - 0.283²) = 3.7

for the first model, and

√156 (0.497) / √(1 - 0.497²) = 7.2

for the second model. Using 1.96 as a critical value, we have strong evidence for autocorrelation in our example.

For our example we have normal residuals with homogeneous variance, but they are clearly not independent for either of our models.

4. Adjusting for autocorrelation

Suppose we have a simple linear regression model

Y_t = β_0 + β_1 X_{1t} + a_t

where, instead of white noise e_t, our error term satisfies a model such as

a_t = α_1 a_{t-1} + α_2 a_{t-2} + ... + α_p a_{t-p} + e_t

This error model is called autoregressive of order p, where the order refers to the number of lagged a's appearing in the equation for a_t. If p = 1, then the first order autocorrelation coefficient from PROC REG is a reasonable estimate of α_1, but in higher order models the relationship between the autoregressive coefficients α and the autocorrelations is much more convoluted.

What happens if we just ignore the autocorrelation?

(A) The estimates of the regression coefficients are still unbiased.

(B) The estimates of the regression coefficients vary more from sample to sample than do the best estimates, but may still be reasonably efficient.

(C) Estimates of standard errors for coefficients, and anything computed from them (t statistics, p-values and confidence intervals, for example), are biased - often badly biased.

Using our simple linear regression and an order 1 error process a_t = ρ a_{t-1} + e_t, we note that the equation holds at both times t and t-1 so that, multiplying through by the autoregressive parameter, we obtain

Y_t = β_0 + β_1 X_{1t} + a_t

ρ Y_{t-1} = ρ β_0 + ρ β_1 X_{1,t-1} + ρ a_{t-1}

and subtracting we have the transformed model

Y_t - ρ Y_{t-1} = β_0 (1 - ρ) + β_1 (X_{1t} - ρ X_{1,t-1}) + (a_t - ρ a_{t-1})

Now we note the following points about this transformed model:

(A) It is a linear model in the transformed variables (in parentheses).

(B) It has the same coefficients β as the original model.

(C) It has error term a_t - ρ a_{t-1}, where we are assuming a_t - ρ a_{t-1} = e_t; that is, this new model satisfies all the usual regression assumptions!

(D) It has n - 1, not n, observations.
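Points (A)-(C) can be checked numerically. In this hypothetical Python sketch (true values β_0 = 10, β_1 = 2, ρ = 0.5 are assumptions of the simulation, not the paper's data), transforming with the true ρ yields a regression with the same βs and whitened errors:

```python
import numpy as np

rng = np.random.default_rng(3)
n, rho = 400, 0.5
b0, b1 = 10.0, 2.0                       # true coefficients (hypothetical)
x = rng.normal(size=n)
a = np.zeros(n)
for i in range(1, n):                    # AR(1) errors
    a[i] = rho * a[i - 1] + rng.normal()
y = b0 + b1 * x + a

# transformed model: Y_t - rho Y_{t-1} = b0 (1 - rho) + b1 (X_t - rho X_{t-1}) + e_t
ys = y[1:] - rho * y[:-1]
xs = x[1:] - rho * x[:-1]
Z = np.column_stack([np.full(n - 1, 1 - rho), xs])
b, *_ = np.linalg.lstsq(Z, ys, rcond=None)   # ordinary least squares on n-1 rows
resid = ys - Z @ b
rho_e = np.sum(resid[1:] * resid[:-1]) / np.sum(resid ** 2)  # ≈ 0: errors whitened
```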
Note that point (C) implies that running an ordinary regression would suffice if we knew ρ or could approximate it well. We can recover the lost observation by noting that

√(1 - ρ²) Y_1 = √(1 - ρ²) β_0 + β_1 √(1 - ρ²) X_{11} + √(1 - ρ²) a_1

This works because Var(a_t) = σ² / (1 - ρ²). What we will do is regress column 1 on columns 2 and 3 in this table, using the first order autocorrelation of the residuals from an initial ordinary least squares regression as an estimate of ρ.

√(1 - ρ²) Y_1      √(1 - ρ²)    √(1 - ρ²) X_1
Y_2 - ρ Y_1        1 - ρ        X_2 - ρ X_1
Y_3 - ρ Y_2        1 - ρ        X_3 - ρ X_2
...                ...          ...
Y_n - ρ Y_{n-1}    1 - ρ        X_n - ρ X_{n-1}

These new estimates of the βs can be used to compute new residuals, a new estimate of ρ, new columns in the table, and the whole process can be iterated until the estimates stabilize.

Alternatively, one can simply notice at the outset that this whole procedure amounts to an attempt to minimize the error sum of squares in a nonlinear model and thus use standard nonlinear search techniques (i.e. full blown maximum likelihood estimation of ρ and the βs simultaneously) instead of the alternating technique above. Either way, we have estimated the model

Y_t = ρ Y_{t-1} + β_0 (1 - ρ) + β_1 (X_{1t} - ρ X_{1,t-1}) + e_t

which is seen to be a model that includes lagged Y's among the explanatory variables.

Now it is possible that the model needs more than 1 lagged residual. The procedure is essentially the same; we just need more terms in the transformation, and more than 1 observation at the beginning needs to be recovered in a special way.

5. PROC AUTOREG for the sales data

We use PROC AUTOREG on our sales data with the following code:

proc autoreg;
model y = t x / nlag=4 backstep dwprob partial;

The output begins with ordinary least squares, used to get the residuals and model them. Next come autocorrelations of the residuals. The lag j autocorrelation is simply an estimate of the correlation between a_t and a_{t-j} computed from the ordinary least squares residuals r_t (think of r_t as an estimate of a_t). The jth partial autocorrelation can be interpreted as an estimate of the last coefficient in the regression of a_t on a_{t-1}, ..., a_{t-j} and thus would be 0 after the appropriate number of autoregressive lags are included in the model. Time series experts use autocorrelations and partial autocorrelations to diagnose the nature of correlation in the residuals.

The drop in the autocorrelations after lag 1 is more dramatic than that in the partial autocorrelations. This suggests that a model other than autoregressive for the error terms might be considered, thus taking us into the realm of PROC ARIMA, which we discuss later.

Estimates of Autocorrelations

Lag    Covariance    Correlation
0      446.1458       1.000000
1      126.196        0.282858
2      -17.9134      -0.040151
3       27.51148      0.061665
4        8.913212     0.019978
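The alternating scheme described above, including the √(1 - ρ²) recovery of the first observation, can be sketched in a few lines of Python. This is an illustrative re-implementation on simulated data (true values β_0 = 5, β_1 = 1.5, ρ = 0.4 are assumptions of the sketch), not the internals of PROC AUTOREG:

```python
import numpy as np

rng = np.random.default_rng(4)
n, rho_true = 300, 0.4
x = rng.normal(size=n)
a = np.zeros(n)
a[0] = rng.normal() / np.sqrt(1 - rho_true ** 2)   # Var(a_t) = sigma^2 / (1 - rho^2)
for i in range(1, n):
    a[i] = rho_true * a[i - 1] + rng.normal()
y = 5.0 + 1.5 * x + a

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)          # initial ordinary least squares
for _ in range(10):                                # iterate until estimates stabilize
    r = y - X @ b
    rho = np.sum(r[1:] * r[:-1]) / np.sum(r ** 2)  # new estimate of rho
    w = np.sqrt(1 - rho ** 2)
    ys = np.concatenate([[w * y[0]], y[1:] - rho * y[:-1]])      # column 1
    Xs = np.vstack([[w, w * x[0]],                               # columns 2 and 3
                    np.column_stack([np.full(n - 1, 1 - rho),
                                     x[1:] - rho * x[:-1]])])
    b, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
```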
Partial Autocorrelations

1     0.282858
2    -0.130610
3     0.123244
4    -0.047635

The next part of the output is a backward elimination of insignificant autoregressive parameters (the αs), starting with the least significant.

Backward Elimination of Autoregressive Terms

Lag    Estimate     t-Ratio    Prob
4      0.047635     0.5821     0.5614
3     -0.123244    -1.5210     0.1304
2      0.130610     1.6188     0.1076

We are left, then, with just a lag 1 autoregressive model. The procedure next summarizes the model:

Estimates of the Autoregressive Parameters

Lag    Coefficient     Std Error    t Ratio
1      -0.28285812     0.077798     -3.636

The autoregressive coefficient -0.2829 divided by its standard error is t = -3.636, and since our sample size is reasonably large, a t exceeding 1.96 in magnitude is considered significant. The error mean square 419.7 estimates σ², the variance of e_t, so our error model is

a_t = 0.2829 a_{t-1} + e_t

The procedure next uses the estimated ρ to get the transformed variables, e.g. Y_t - 0.2829 Y_{t-1}, and runs ordinary least squares on the transformed variables. Because of the transformation, the resulting coefficients, standard errors, etc. are correct and, except for the fact that ρ is estimated instead of known, the X coefficients are the best linear unbiased estimates of the parameters.

Yule-Walker Estimates

SSE              63800.85    DFE          152
MSE              419.7425    Root MSE     20.48762
SBC              1401.124    AIC          1388.924
Reg Rsq          0.7101      Total Rsq    0.7764
Durbin-Watson    1.8850      PROB < DW    0.2013

The SBC and AIC are information criteria that can be used to compare models with differing numbers of parameters. The model delivering the smallest information criterion would be selected. The criteria trade the fit of the model off against its complexity, just as a person might trade the functionality of a new computer against its cost in deciding which machine to purchase.

We observe that the Durbin-Watson statistic on the transformed model is now close to 2, and an approximate p-value larger than 0.05 appears. This is not an exact p-value, since the transformed model implicitly uses an estimated coefficient on lagged Y's to predict the current Y and thus does not satisfy Durbin and Watson's assumptions.

Finally, there are 2 R-square values. This is because, in predicting Y one step ahead, we can use a prediction based on the X's and their coefficients only, or we can add to that a forecast of the next residual based on our autoregressive error model and the most recently observed residuals. The percentages of variation explained under these scenarios are the regression R-square and total R-square, respectively. In that sense, the total R-square is a predictability R-square, while the regression R-square tells how much of the predictability is associated with the X variables, which may be difficult or expensive to obtain - especially future values of them.
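The two-R-square idea can be illustrated numerically. This Python sketch follows the verbal description above on simulated data (it is not PROC AUTOREG's exact computation, whose definitions may differ in detail): predict one step ahead from the X's alone, then add the autoregressive forecast of the next residual, and compare the variation explained:

```python
import numpy as np

rng = np.random.default_rng(6)
n, rho = 500, 0.5
x = rng.normal(size=n)
a = np.zeros(n)
for i in range(1, n):                     # AR(1) errors
    a[i] = rho * a[i - 1] + rng.normal()
y = 3.0 + 2.0 * x + a

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
r = y - X @ b
rho_hat = np.sum(r[1:] * r[:-1]) / np.sum(r ** 2)

pred_reg = X @ b                              # X's and coefficients only
pred_tot = pred_reg[1:] + rho_hat * r[:-1]    # add forecast of the next residual

ss = lambda v: np.sum((v - v.mean()) ** 2)
reg_rsq = 1 - np.sum((y - pred_reg) ** 2) / ss(y)          # "regression R-square"
tot_rsq = 1 - np.sum((y[1:] - pred_tot) ** 2) / ss(y[1:])  # "total R-square"
```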
The model parameters part of the output is given next. A portion of this is shown here:

Variable     DF    B Value      t Ratio    Approx Prob
Intercept    1     5.571991      0.882     0.3793
T            1     0.038574      0.732     0.4656
X            1     0.947411     18.220     0.0001

We see that the time trend T, which seemed to appear in our graph, does not seem important after X is included in our model. Note that this does not say that there is no increase in sales, only that there is no increase beyond what would have been predicted from incoming phone calls. Phone calls X are strongly significant. We might have been happier had the significance results been reversed, since we need to supply next week's phone volume to predict next week's sales in a model involving X.

One confusing phenomenon that often occurs is that, although statistical theory indicates that the estimates from our transformed model should be better than the ordinary least squares estimates that ignore autocorrelation, the ordinary least squares printout shows smaller standard errors than the ones shown in PROC AUTOREG. How can that be, if the PROC AUTOREG method provides better estimates? The answer is simple - the ordinary least squares numbers are not good estimates of the standard errors and are often, but not always, deceptively small. In other words, by ignoring autocorrelation we are using inferior estimates of the parameters, but the standard errors falsely indicate that they are superior.

The three models estimated by PROC AUTOREG, with standard errors in parentheses, are:

Y_t = 5.5720 + 0.0386 t + 0.9474 X_t
     (6.3191)  (0.0527)   (0.052)

Y_t = 7.3681 + 0.9589 X_t
     (5.8111)  (0.0498)

Y_t = 83.8472 + 0.3388 t
     (11.0956)  (0.1222)

Choosing SBC as a criterion, we have

Model      SBC     MSE     ρ̂
X and t    1401    420     -0.2829
X only     1397    418     -0.2873
t only     1607    1658    -0.4972

so the model with X only is clearly preferred.

6. Forecasting

What are our choices in terms of forecasting? One option is to somehow get estimates of future X values. Here are some forecasts of phone volume which actually came from SAS PROC ARIMA. They have been appended to our original data, and you see their Y values are missing.

T      DATE        X          Y
154    12/16/94    153.000    141
155    12/23/94    146.000    159
156    12/30/94    180.000    220
157    01/06/95    157.292    .
158    01/13/95    141.705    .
159    01/20/95    131.008    .
160    01/27/95    123.665    .
161    02/03/95    118.625    .

Do we really think X will be 157.292 next week? Of course not; this is just a forecast, so the use of this X value in computing a forecast for Y will add to the inaccuracy of the forecast. On the other hand, since we stopped in week 156, the use of t = 157 for next week's t in our model will introduce no inaccuracy in the forecast.

The model with only X is

Y_t = 7.3681 + 0.9589 X_t + a_t
     (5.8111)  (0.050)
with a_t = 0.2873 a_{t-1} + e_t, and we are at the last week of year 3 in our data, so t = 156. Standard errors are in parentheses. The last observation was Y_156 = 220 and X_156 = 180, so that the residual, an estimate of a_t, would be r_156 = 220 - (7.3681 + 0.9589(180)) = 40.030, and we predict a_157 as (0.2873)(40.030) = 11.501. Now if we assume X_157 will be 157.292, then we predict Y_157 to be 7.3681 + 0.9589(157.292) + 11.501 = 158.195 + 11.501 = 169.696. This is the first forecast in the PROC AUTOREG output dataset, and an associated standard error (20.9) is used to compute a 95% prediction interval. The problem is that this standard error is computed assuming that X will be exactly 157.292 in the next time interval. Our true level of uncertainty in the forecast of Y would certainly be influenced by the variability in our imputed X value.
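The forecast arithmetic above can be checked directly; a few lines of Python reproduce the numbers in the text:

```python
# one-step forecast arithmetic from the text (model with X only)
b0, b1, rho = 7.3681, 0.9589, 0.2873
y156, x156 = 220.0, 180.0
x157 = 157.292                      # ARIMA forecast of next week's phone calls

r156 = y156 - (b0 + b1 * x156)      # residual, an estimate of a_156
a157 = rho * r156                   # forecast of next week's error
y157 = b0 + b1 * x157 + a157        # forecast of next week's sales: 169.696
```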
7. Using PROC ARIMA

We can model the whole process, X and Y, in PROC ARIMA. We first compute a model for X, which PROC ARIMA estimates as

X_t - 107.6 = 0.6864 (X_{t-1} - 107.6) + e_t

Now PROC ARIMA can fit and forecast the same model as PROC AUTOREG; however, it gives you the option of using the X model to provide forecasts of future X's, and it incorporates the uncertainty in those X's into the Y forecasts. Our forecasts of future X's were from PROC ARIMA, so the forecasts from the two procedures will match, but the forecast error variances will differ. We close by presenting graphs of forecasts and their standard errors from several scenarios.

In Figure 5, the forecasts from PROC AUTOREG and PROC ARIMA with their intervals are overlaid. The difference in widths of the forecast intervals illustrates the magnitude of error that is being ignored when one treats future X's as known when they are in fact forecasts.

Figure 6 shows the forecasts and intervals from the model that uses time t as the explanatory variable. Here we can properly treat future t's as known, but pay a price in that the model does not fit as well. The forecasted X (ARIMA) plot is overlaid on this; it gives the lower forecasts and intervals, and it is interesting to note that although this model fits substantially worse according to our statistical tests, once we admit that there is error in our forecasts of X, our forecasts and error bands are similar in both models. In the long term, the model with t will give forecasts that trend linearly upward, while the forecasts with the ARIMA model will return to the historic mean, as will always happen with a stationary ARIMA model.

Finally, PROC ARIMA allows the fitting of a larger class of error models than autoregressive. A lag 1 moving average model also provides an excellent fit to the error series for the sales data. Using the autoregressive model for X, a linear relationship between Y and X, and an order 1 moving average error term, our estimated model becomes

Y_t = 8.607 + 0.95 X_t + e_t + 0.37 e_{t-1}

X_t - 107.6 = 0.686 (X_{t-1} - 107.6) + u_t

Because the error term e_t + 0.37 e_{t-1} is a moving average, we estimated this model in PROC ARIMA:

proc arima data=a;
i var=x noprint; e p=1 ml;
i var=y crosscor=x nlag=5;
e input=x q=1 ml;
f lead=5 out=out5 id=t;
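The mean-reverting behavior of the fitted AR(1) model for phone calls is easy to reproduce. Iterating the estimated equation forward from the last observed week recovers, to within rounding of the reported coefficient, the X forecasts shown earlier (157.292 down toward 118.625):

```python
# mean-reverting forecasts from the AR(1) model for phone calls:
#   X_t - 107.6 = 0.6864 (X_{t-1} - 107.6) + e_t
mu, phi = 107.6, 0.6864
x = 180.0                        # last observed value, week 156
forecasts = []
for week in range(157, 162):     # weeks 157 through 161
    x = mu + phi * (x - mu)      # the best forecast sets e_t to 0
    forecasts.append(x)
# each step moves a fraction phi of the remaining distance back toward 107.6
```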
Because our input variable is modeled, we will get a crosscorrelation plot which has been prewhitened. It is a plot of correlations between Y at time t and X at time t - j which has been cleared of any indirect correlations caused by autocorrelation in the X series. It is thus a picture of how changes in X are incorporated into the Y series.

The plot of the crosscorrelations for our example will look similar to this:

Crosscorrelations

Lag     Corr
-5     -0.065    .  *|          .
-4      0.043    .   |*         .
-3     -0.090    . **|          .
-2     -0.060    .  *|          .
-1      0.008    .   |          .
 0      0.797    .   |**********
 1      0.057    .   |*         .
 2     -0.010    .   |          .
 3     -0.053    .  *|          .
 4      0.135    .   |***       .
 5     -0.124    . **|          .

"." marks two standard errors

It is seen that there is no lagged correlation, only contemporaneous correlation. The moving average error structure gave error mean square 408.7, as compared to 418.3, so it is the best fit of all models considered here by that criterion.

SAS is the registered trademark of SAS Institute Inc. in the USA and other countries. TM indicates USA registration.