Time Series Regression
Statistics 203: Introduction to Regression and Analysis of Variance
Jonathan Taylor

Today's class
■ Regression with autocorrelated errors.
■ Functional data.

● Autocorrelation
● Durbin-Watson test for autocorrelation
● Correcting for AR(1) in regression model
● Two-stage regression
● Other models of correlation
● More than one time series
● Functional Data
● Scatterplot smoothing
● Smoothing splines
● Kernel smoother

Autocorrelation
■ In the random effects model, outcomes within groups were correlated.
■ Other regression applications also have correlated outcomes (i.e. errors).
■ Common examples: time series data.
■ Why worry? Can lead to underestimates of SE → inflated t's → false positives.

Durbin-Watson test for autocorrelation
■ In regression setting, if noise is AR(1), a simple estimate of $\rho$ is obtained by (essentially) regressing $e_t$ onto $e_{t-1}$:

$$\hat{\rho} = \frac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2}.$$

■ To formally test $H_0 : \rho = 0$ (i.e. whether residuals are independent vs. they are AR(1)), use Durbin-Watson test, based on $d = 2(1 - \hat{\rho})$.
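The estimate of $\rho$ and the Durbin-Watson statistic above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the course's own code: the simulated data, the AR(1) coefficient 0.6, and the helper names `ar1_rho_hat` and `durbin_watson` are all hypothetical. The statistic is computed in its usual form, $\sum_{t=2}^{n}(e_t - e_{t-1})^2 / \sum_{t=1}^{n} e_t^2$, which agrees with $2(1 - \hat{\rho})$ up to an end-effect term that is small for long series.

```python
import numpy as np

def ar1_rho_hat(e):
    """Lag-1 autocorrelation estimate: sum_{t>=2} e_t e_{t-1} / sum_t e_t^2."""
    e = np.asarray(e, dtype=float)
    return np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)

def durbin_watson(e):
    """Durbin-Watson statistic: sum_{t>=2} (e_t - e_{t-1})^2 / sum_t e_t^2."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Hypothetical example: a linear trend with AR(1) noise (true rho = 0.6).
rng = np.random.default_rng(0)
n = 500
x = np.linspace(0.0, 1.0, n)
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = 0.6 * eps[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + eps

# Ordinary least squares fit, then residuals e_t.
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta_hat

rho_hat = ar1_rho_hat(e)
d = durbin_watson(e)
print("rho_hat:", rho_hat)
print("d:", d, " 2(1 - rho_hat):", 2 * (1 - rho_hat))
```

Values of d near 2 are consistent with uncorrelated residuals; values well below 2 point to positive autocorrelation. Critical values for the formal test come from Durbin-Watson tables or software and are not computed here.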
Correcting for AR(1) in regression model
■ If we know $\rho$, it is possible to "pre-whiten" the data and regressors:

$$\tilde{Y}_{i+1} = Y_{i+1} - \rho Y_i, \quad i \ge 1,$$

$$\tilde{X}_{(i+1)j} = X_{(i+1)j} - \rho X_{ij}, \quad i \ge 1;$$

then the model satisfies the "usual" assumptions.
■ For coefficients $\tilde{\beta}$ in the new model: $\beta_0 = \tilde{\beta}_0 / (1 - \rho)$, $\beta_j = \tilde{\beta}_j$.

Two-stage regression
■ Step 1: Fit linear model to unwhitened data.
■ Step 2: Estimate $\rho$ with $\hat{\rho}$.
■ Step 3: Pre-whiten data using $\hat{\rho}$, then refit the model.

Other models of correlation
■ If we have ARMA$(p, q)$ noise then we can also pre-whiten the data and perform OLS, which is equivalent to GLS.
■ If we estimate parameters we can then use a two-stage procedure as in the AR(1) case.
■ OR, we can just use MLE (or REML): R does this. This is similar to iterating the two-stage procedure.

More than one time series
■ Suppose we have $r$ time series $Y_{ij}$, $1 \le i \le r$, $1 \le j \le n_i$.
■ Regression model

$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + \varepsilon_{ij},$$

where the $\beta$'s are common to everyone and

$$\varepsilon_i = (\varepsilon_{i1}, \dots, \varepsilon_{i n_i}) \sim N(0, \Sigma_i), \quad \text{independent across } i.$$

■ We can put all of this into one big regression model and estimate everything. Easy to do in R.

Functional Data
■ Having observations that are time series can be thought of as having a "function" as an observation.
■ Having many time series, e.g. daily temperature in NY, SF, LA, ..., allows one to think of the individual time series as observations.
■ The field "Functional Data Analysis" (Ramsay & Silverman) is a part of statistics that focuses on this type of data.
■ Today we'll think of having one function and what we might do with it.

Scatterplot smoothing
■ When we only have one "function" we can think of fitting a trend as smoothing a scatterplot of pairs $(X_i, Y_i)_{1 \le i \le n}$.
■ Different techniques
◆ B-splines;
◆ Smoothing splines;
◆ Kernel smoothers;
◆ many others.

Smoothing splines
■ We saw early on in the class that we could use B-splines in a regression setting to predict $Y_i$ from $X_i$.
■ Smoothing splines: for $\lambda \ge 0$ and weights $w_i$, $1 \le i \le n$, find the function $f$ with two derivatives that minimizes

$$\sum_{i=1}^{n} w_i \, (Y_i - f(X_i))^2 + \lambda \int (f''(x))^2 \, dx.$$

■ This should remind you of ridge regression: prior is now on functions.
■ Equivalent to saying that we have a Gaussian prior (integrated Brownian motion) on functions and we want the "MAP" estimator based on observing $f$ at the points $X_i$ with measurement errors $\varepsilon_i \sim N(0, 1/w_i)$.

Kernel smoother
■ Given a kernel function $K$ and a bandwidth $h$, the kernel smooth of the scatterplot $(X_i, Y_i)_{1 \le i \le n}$ is defined by the local average

$$\hat{Y}(x) = \frac{\sum_{i=1}^{n} Y_i \, K((x - X_i)/h)}{\sum_{i=1}^{n} K((x - X_i)/h)}.$$

■ Most commonly used kernel: $K(x) = e^{-x^2/2}$.
■ The key parameter is the bandwidth. Much work has been done on choosing an "optimal bandwidth."
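The local-average formula for the kernel smoother translates almost directly into NumPy. The sketch below is an illustration only, assuming the Gaussian kernel $K(x) = e^{-x^2/2}$ from the slide; the function names `gaussian_kernel` and `kernel_smooth`, the simulated sine-curve data, and the bandwidth 0.05 are hypothetical choices, and no attempt is made to select an optimal bandwidth.

```python
import numpy as np

def gaussian_kernel(u):
    """Gaussian kernel K(u) = exp(-u^2 / 2), unnormalized as on the slide."""
    return np.exp(-0.5 * u ** 2)

def kernel_smooth(x_eval, X, Y, h, kernel=gaussian_kernel):
    """Kernel smooth: sum_i Y_i K((x - X_i)/h) / sum_i K((x - X_i)/h).

    Returns one fitted value per entry of x_eval. If h is so small that
    every weight underflows to zero at some x, the result there is NaN.
    """
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # W[k, i] = K((x_eval[k] - X[i]) / h)
    W = kernel((x_eval[:, None] - X[None, :]) / h)
    return (W @ Y) / W.sum(axis=1)

# Hypothetical example: noisy observations of a sine curve.
rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0.0, 1.0, 200))
Y = np.sin(2 * np.pi * X) + 0.3 * rng.normal(size=X.size)

grid = np.linspace(0.0, 1.0, 11)
fit = kernel_smooth(grid, X, Y, h=0.05)
for x0, f0 in zip(grid, fit):
    print(f"x = {x0:.1f}  smoothed Y = {f0: .3f}")
```

A small bandwidth follows the data closely and gives a wiggly curve; a very large bandwidth weights all points almost equally, so the fit flattens toward the overall mean of the $Y_i$.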