Statistics 203: Introduction to Regression and Analysis of Variance
Time Series Regression

Jonathan Taylor

Today's class

■ Regression with autocorrelated errors.
■ Functional data.

Outline:
● Today's class
● Autocorrelation
● Durbin-Watson test for autocorrelation
● Correcting for AR(1) in regression model
● Two-stage regression
● Other models of correlation
● More than one time series
● Functional Data
● Scatterplot smoothing
● Smoothing splines
● Kernel smoother

Autocorrelation

■ In the random effects model, outcomes within groups were correlated.
■ Other regression applications also have correlated outcomes (i.e. errors).
■ Common examples: time series data.
■ Why worry? Can lead to underestimates of SE → inflated t's → false positives.

Durbin-Watson test for autocorrelation

■ In the regression setting, if the noise is AR(1), a simple estimate of $\rho$ is obtained by (essentially) regressing $e_t$ onto $e_{t-1}$:

$$\hat{\rho} = \frac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2}.$$

■ To formally test $H_0 : \rho = 0$ (i.e. whether the residuals are independent vs. AR(1)), use the Durbin-Watson test, based on $d = \sum_{t=2}^{n} (e_t - e_{t-1})^2 / \sum_{t=1}^{n} e_t^2 \approx 2(1 - \hat{\rho})$.
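The estimate of ρ and the Durbin-Watson statistic can be computed directly from a vector of residuals. The following is a minimal sketch, assuming NumPy is available; the function names `ar1_rho_hat` and `durbin_watson` are illustrative and do not come from the slides. It computes only the statistic, not the critical values needed for the formal test.

```python
import numpy as np

def ar1_rho_hat(resid):
    """Estimate rho by (essentially) regressing e_t onto e_{t-1}."""
    e = np.asarray(resid, dtype=float)
    # Numerator runs over t = 2..n, denominator over t = 1..n.
    return np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)

def durbin_watson(resid):
    """Durbin-Watson statistic: squared successive differences over sum of squares."""
    e = np.asarray(resid, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Toy residuals with strong negative autocorrelation.
e = [1.0, -1.0, 1.0, -1.0]
rho = ar1_rho_hat(e)
d = durbin_watson(e)
# d and 2 * (1 - rho) differ only by the end-point term (e_1^2 + e_n^2) / sum(e^2).
print(rho, d, 2 * (1 - rho))
```

Values of d near 2 are consistent with ρ = 0; values well below 2 point to positive autocorrelation, and values well above 2 to negative autocorrelation.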

Correcting for AR(1) in regression model

■ If we know $\rho$, it is possible to "pre-whiten" the data and regressors

$$\tilde{Y}_{i+1} = Y_{i+1} - \rho Y_i, \qquad 1 \le i \le n-1,$$

$$\tilde{X}_{(i+1)j} = X_{(i+1)j} - \rho X_{ij}, \qquad 1 \le i \le n-1,$$

then the model satisfies the "usual" assumptions.
■ For coefficients $\tilde{\beta}$ in the new model, $\beta_0 = \tilde{\beta}_0 / (1 - \rho)$ and $\beta_j = \tilde{\beta}_j$.
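The pre-whitening step with a known ρ can be sketched as follows, assuming NumPy and a regressor matrix `X` without an intercept column; `prewhiten` and `fit_prewhitened` are hypothetical helper names. The intercept is rescaled because subtracting ρ times the previous row turns the constant β0 into β0(1 − ρ).

```python
import numpy as np

def prewhiten(y, X, rho):
    """Return (y_tilde, X_tilde); each has one row fewer than the input."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    return y[1:] - rho * y[:-1], X[1:] - rho * X[:-1]

def fit_prewhitened(y, X, rho):
    """OLS on pre-whitened data, mapped back to the original coefficients."""
    y_t, X_t = prewhiten(y, X, rho)
    A = np.column_stack([np.ones(len(y_t)), X_t])   # add intercept column
    coef, *_ = np.linalg.lstsq(A, y_t, rcond=None)  # beta_tilde
    beta0 = coef[0] / (1 - rho)                     # beta_0 = beta_tilde_0 / (1 - rho)
    return np.concatenate([[beta0], coef[1:]])      # slopes are unchanged

# Noise-free check: the original intercept and slope are recovered.
x = np.arange(10.0) ** 2
y = 2.0 + 3.0 * x
print(fit_prewhitened(y, x, rho=0.5))
```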

Two-stage regression

■ Step 1: Fit the model to unwhitened data.
■ Step 2: Estimate $\rho$ with $\hat{\rho}$.
■ Step 3: Pre-whiten data using $\hat{\rho}$, then refit the model.
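The three steps above can be sketched end to end. This is a minimal illustration under stated assumptions: NumPy only, an intercept added internally, and the residual-based estimate of ρ from the Durbin-Watson slide; the name `two_stage_ar1` is hypothetical.

```python
import numpy as np

def ols(A, y):
    """Least-squares coefficients for design matrix A."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def two_stage_ar1(y, X):
    """Two-stage fit for a regression with AR(1) errors; returns (beta, rho_hat)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)

    # Step 1: fit to unwhitened data and keep the residuals.
    A = np.column_stack([np.ones(len(y)), X])
    resid = y - A @ ols(A, y)

    # Step 2: estimate rho from the residuals.
    rho_hat = np.sum(resid[1:] * resid[:-1]) / np.sum(resid ** 2)

    # Step 3: pre-whiten with rho_hat and refit.
    y_t = y[1:] - rho_hat * y[:-1]
    X_t = X[1:] - rho_hat * X[:-1]
    coef = ols(np.column_stack([np.ones(len(y_t)), X_t]), y_t)

    # Map the intercept back to the original parametrization.
    beta = np.concatenate([[coef[0] / (1 - rho_hat)], coef[1:]])
    return beta, rho_hat
```

The standard errors reported by the refit in Step 3 are the ones that can be trusted; those from Step 1 are the ones that tend to be too small.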

Other models of correlation

■ If we have ARMA(p, q) noise then we can also pre-whiten the data and perform OLS, which is equivalent to GLS.
■ If we estimate parameters we can then use a two-stage procedure as in the AR(1) case.
■ OR, we can just use MLE (or REML): [...] does this. This is similar to iterating the two-stage procedure.
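The equivalence between pre-whitening followed by OLS and GLS can be checked numerically for any known error covariance Σ. The sketch below assumes NumPy and uses an AR(1) covariance only as a convenient example; for ARMA(p, q) noise only the matrix Σ would change. The function names are illustrative.

```python
import numpy as np

def gls(y, A, Sigma):
    """GLS estimate (A' Sigma^{-1} A)^{-1} A' Sigma^{-1} y."""
    Si_A = np.linalg.solve(Sigma, A)
    Si_y = np.linalg.solve(Sigma, y)
    return np.linalg.solve(A.T @ Si_A, A.T @ Si_y)

def whitened_ols(y, A, Sigma):
    """OLS after multiplying data and design by L^{-1}, where Sigma = L L'."""
    L = np.linalg.cholesky(Sigma)
    y_w = np.linalg.solve(L, y)   # whitened response
    A_w = np.linalg.solve(L, A)   # whitened design
    coef, *_ = np.linalg.lstsq(A_w, y_w, rcond=None)
    return coef

def ar1_cov(n, rho, sigma2=1.0):
    """Covariance of a stationary AR(1) series with innovation variance sigma2."""
    idx = np.arange(n)
    return sigma2 / (1 - rho ** 2) * rho ** np.abs(idx[:, None] - idx[None, :])

rng = np.random.default_rng(1)
n = 30
A = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.normal(size=n)
Sigma = ar1_cov(n, 0.7)
print(np.allclose(gls(y, A, Sigma), whitened_ols(y, A, Sigma)))
```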

More than one time series

■ Suppose we have $r$ time series $Y_{ij}$, $1 \le i \le r$, $1 \le j \le n_i$.
■ Regression model

$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + \varepsilon_{ij},$$

where the $\beta$'s are common to everyone and

$$\varepsilon_i = (\varepsilon_{i1}, \dots, \varepsilon_{i n_i}) \sim N(0, \Sigma_i), \quad \text{independent across } i.$$

■ We can put all of this into one big regression model and estimate everything. Easy to do in R.
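One way to write the "one big regression model" is to stack the r series. The display below is a sketch that treats the covariances Σ_i as known; in practice they are estimated, for example by a two-stage procedure or by MLE/REML. The stacked symbols Y, X and Σ are notation introduced here, not taken from the slides.

```latex
\[
Y = \begin{pmatrix} Y_1 \\ \vdots \\ Y_r \end{pmatrix}, \quad
X = \begin{pmatrix} X_1 \\ \vdots \\ X_r \end{pmatrix}, \quad
\Sigma = \begin{pmatrix} \Sigma_1 & & \\ & \ddots & \\ & & \Sigma_r \end{pmatrix},
\qquad Y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, \Sigma),
\]
\[
\hat{\beta}_{\mathrm{GLS}}
= (X^{\top} \Sigma^{-1} X)^{-1} X^{\top} \Sigma^{-1} Y
= \Bigl( \sum_{i=1}^{r} X_i^{\top} \Sigma_i^{-1} X_i \Bigr)^{-1}
  \sum_{i=1}^{r} X_i^{\top} \Sigma_i^{-1} Y_i .
\]
```

Here Y_i is the length-n_i vector for series i and X_i is its n_i × 2 design matrix (a column of ones and the column of X_ij). The block-diagonal form of Σ is what encodes independence across series.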

Functional Data

■ Having observations that are time series can be thought of as having a "function" as an observation.
■ Having many time series, e.g. daily temperature in NY, SF, LA, ..., allows one to think of the individual time series as observations.
■ The field "Functional Data Analysis" (Ramsay & Silverman) is a part of statistics that focuses on this type of data.
■ Today we'll think of having one function and what we might do with it.

Scatterplot smoothing

■ When we only have one "function" we can think of fitting a trend as smoothing a scatterplot of pairs $(X_i, Y_i)_{1 \le i \le n}$.
■ Different techniques
  ◆ B-splines;
  ◆ Smoothing splines;
  ◆ Kernel smoothers;
  ◆ many others.

Smoothing splines

■ We saw early on in the class that we could use B-splines in a regression setting to predict $Y_i$ from $X_i$.
■ Smoothing splines: for $\lambda \ge 0$ and weights $w_i$, $1 \le i \le n$, find the function $f$ with two derivatives that minimizes

$$\sum_{i=1}^{n} w_i \bigl(Y_i - f(X_i)\bigr)^2 + \lambda \int \bigl(f''(x)\bigr)^2 \, dx.$$

■ This should remind you of ridge regression: prior is now on functions.
■ Equivalent to saying that we have a Gaussian prior (integrated Brownian motion) on functions and we want the "MAP" based on observing $f$ at the points $X_i$ with measurement errors $\varepsilon_i \sim N(0, 1/w_i)$.
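The ridge-regression analogy is easiest to see in a discretized version of the criterion. The sketch below is not a true smoothing spline: it assumes equally spaced X_i and replaces the integral of the squared second derivative with a sum of squared second differences of the fitted values, which gives a ridge-type linear system. The name `discrete_smoother` is hypothetical.

```python
import numpy as np

def discrete_smoother(y, lam, w=None):
    """Minimize sum_i w_i (y_i - f_i)^2 + lam * sum (second differences of f)^2.

    A discrete analogue of the smoothing-spline criterion for equally
    spaced design points; the solution solves (W + lam * D'D) f = W y.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    w = np.ones(n) if w is None else np.asarray(w, dtype=float)
    # (n-2) x n second-difference matrix with rows (1, -2, 1).
    D = np.diff(np.eye(n), n=2, axis=0)
    return np.linalg.solve(np.diag(w) + lam * D.T @ D, w * y)

rng = np.random.default_rng(0)
t = np.arange(50, dtype=float)
y = np.sin(t / 8) + rng.normal(scale=0.3, size=t.size)
f_rough = discrete_smoother(y, lam=0.1)     # close to the data
f_smooth = discrete_smoother(y, lam=100.0)  # much smoother trend
```

As in ridge regression, λ = 0 reproduces the data and larger λ shrinks the fit, here toward a straight line, since straight lines have zero penalty.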

Kernel smoother

■ Given a kernel function $K$ and a bandwidth $h$, the kernel smooth of the scatterplot $(X_i, Y_i)_{1 \le i \le n}$ is defined by the local average

$$\hat{Y}(x) = \frac{\sum_{i=1}^{n} Y_i \cdot K\bigl((x - X_i)/h\bigr)}{\sum_{i=1}^{n} K\bigl((x - X_i)/h\bigr)}.$$

■ Most commonly used kernel: $K(x) = e^{-x^2/2}$.
■ The key parameter is the bandwidth. Much work has been done on choosing an "optimal bandwidth."
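The local average above translates almost line by line into code. This is a minimal sketch assuming NumPy; `kernel_smooth` and `gaussian_kernel` are illustrative names, and the bandwidth is left to the caller rather than chosen automatically.

```python
import numpy as np

def gaussian_kernel(u):
    """K(u) = exp(-u^2 / 2); the normalizing constant cancels in the ratio."""
    return np.exp(-u ** 2 / 2)

def kernel_smooth(x_new, X, Y, h, K=gaussian_kernel):
    """Kernel-weighted local average of Y evaluated at each point of x_new."""
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # weights[k, i] = K((x_new[k] - X[i]) / h)
    weights = K((x_new[:, None] - X[None, :]) / h)
    return weights @ Y / weights.sum(axis=1)

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 10, size=100))
Y = np.sin(X) + rng.normal(scale=0.3, size=X.size)
grid = np.linspace(0, 10, 200)
wiggly = kernel_smooth(grid, X, Y, h=0.1)  # small bandwidth follows the noise
smooth = kernel_smooth(grid, X, Y, h=1.0)  # large bandwidth flattens the trend
```

Comparing the two fits shows why the bandwidth is the key parameter: it controls the trade-off between following the data and smoothing over it.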
