Introduction to Time Series Analysis. Lecture 23.
1. Lagged regression models.
2. Cross-covariance function, sample CCF.
3. Lagged regression in the time domain: prewhitening.
4. Lagged regression in the frequency domain: cross spectrum, coherence.
Lagged regression models
Consider a lagged regression model of the form

    Y_t = Σ_{h=−∞}^{∞} β_h X_{t−h} + V_t,

where X_t is an observed input time series, Y_t is the observed output time series, and V_t is a stationary noise process. This is useful for:
• Identifying the (best linear) relationship between two time series.
• Forecasting one time series from the other.
• (We might want β_h = 0 for h < 0.)
Lagged regression models

    Y_t = Σ_{h=−∞}^{∞} β_h X_{t−h} + V_t.

In the SOI and recruitment example, we might wish to identify how the values of the recruitment series (the number of new fish) are related to the Southern Oscillation Index. Or we might wish to predict future values of recruitment from the SOI.
Lagged regression models: Agenda

• Multiple, jointly stationary time series in the time domain: cross-covariance function, sample CCF.
• Lagged regression in the time domain: model the input series, extract the white time series driving it ('prewhitening'), regress with the transformed output series.
• Multiple, jointly stationary time series in the frequency domain: cross spectrum, coherence.
• Lagged regression in the frequency domain: calculate the input's spectral density and the cross-spectral density between input and output, and find the transfer function relating them in the frequency domain. Then the regression coefficients are the inverse Fourier transform of the transfer function.
Cross-covariance

Recall that the autocovariance function of a stationary process {X_t} is

    γ_x(h) = E[(X_{t+h} − μ_x)(X_t − μ_x)].

The cross-covariance function of two jointly stationary processes {X_t} and {Y_t} is

    γ_xy(h) = E[(X_{t+h} − μ_x)(Y_t − μ_y)].

(Jointly stationary = constant means, autocovariances depending only on the lag h, and cross-covariance depending only on h.)
Cross-correlation

The cross-correlation function of jointly stationary {X_t} and {Y_t} is

    ρ_xy(h) = γ_xy(h) / √(γ_x(0) γ_y(0)).

Notice that ρ_xy(h) = ρ_yx(−h) (but ρ_xy(h) ≢ ρ_xy(−h)).

Example: Suppose that Y_t = β X_{t−ℓ} + W_t for {X_t} stationary and uncorrelated with {W_t}, and {W_t} zero mean and white. Then {X_t} and {Y_t} are jointly stationary, with μ_y = β μ_x and

    γ_xy(h) = β γ_x(h + ℓ).

If ℓ > 0, we say x_t leads y_t.
If ℓ < 0, we say x_t lags y_t.
Sample cross-covariance and sample CCF

The sample cross-covariance is

    γ̂_xy(h) = (1/n) Σ_{t=1}^{n−h} (x_{t+h} − x̄)(y_t − ȳ)

for h ≥ 0 (and γ̂_xy(h) = γ̂_yx(−h) for h < 0). The sample CCF is

    ρ̂_xy(h) = γ̂_xy(h) / √(γ̂_x(0) γ̂_y(0)).
Sample cross-covariance and sample CCF

If either of {X_t} or {Y_t} is white, then ρ̂_xy(h) ∼ AN(0, 1/√n).

Notice that we can look for peaks in the sample CCF to identify a leading or lagging relation. (Recall that the ACF of the input series peaks at h = 0.)

Example: The CCF of SOI and recruitment (Figure 1.14 in the text) has a peak at h = −6, indicating that recruitment at time t has its strongest correlation with SOI at time t − 6. Thus, SOI leads recruitment (by 6 months).
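As a concrete illustration, the sample CCF can be computed directly from the definitions above. This is a minimal NumPy sketch (the function name and the simulated series are hypothetical, not from the text):

```python
import numpy as np

def sample_ccf(x, y, max_lag):
    """Sample CCF rho_hat_xy(h) for h = -max_lag, ..., max_lag.

    Uses gamma_hat_xy(h) = (1/n) sum_{t=1}^{n-h} (x_{t+h} - xbar)(y_t - ybar)
    for h >= 0, and gamma_hat_xy(h) = gamma_hat_yx(-h) for h < 0.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xc, yc = x - x.mean(), y - y.mean()
    # sqrt(gamma_hat_x(0) * gamma_hat_y(0))
    denom = np.sqrt((xc @ xc) * (yc @ yc)) / n
    lags = np.arange(-max_lag, max_lag + 1)
    gamma = [(xc[h:] @ yc[:n - h] if h >= 0 else yc[-h:] @ xc[:n + h]) / n
             for h in lags]
    return lags, np.array(gamma) / denom

# If y_t = x_{t-3} (x leads y by 3), the CCF should peak at h = -3.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = np.concatenate([np.zeros(3), x[:-3]])
lags, ccf = sample_ccf(x, y, max_lag=10)
print(lags[np.argmax(ccf)])  # -3
```

The peak at a negative lag mirrors the SOI/recruitment example: the input leads the output.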
Lagged regression in the time domain (Section 5.6)

Suppose we wish to fit a lagged regression model of the form

    Y_t = α(B) X_t + η_t = Σ_{j=0}^{∞} α_j X_{t−j} + η_t,

where X_t is an observed input time series, Y_t is the observed output time series, and η_t is a stationary noise process, uncorrelated with X_t.

One approach (pioneered by Box and Jenkins) is to fit ARIMA models for X_t and η_t, and then find a simple rational representation for α(B).
Lagged regression in the time domain

    Y_t = α(B) X_t + η_t = Σ_{j=0}^{∞} α_j X_{t−j} + η_t.

For example:

    X_t = (θ_x(B)/φ_x(B)) W_t,
    η_t = (θ_η(B)/φ_η(B)) Z_t,
    α(B) = B^d δ(B)/ω(B).

Notice the delay B^d, indicating that Y_t lags X_t by d steps.
Lagged regression in the time domain

How do we choose all of these parameters?

1. Fit θ_x(B), φ_x(B) to model the input series {X_t}.
2. Prewhiten the input series by applying the inverse operator φ_x(B)/θ_x(B):

    Ỹ_t = (φ_x(B)/θ_x(B)) Y_t = α(B) W_t + (φ_x(B)/θ_x(B)) η_t.
Lagged regression in the time domain

3. Calculate the cross-covariance of Ỹ_t with W_t,

    γ_{ỹw}(h) = E[ (Σ_{j=0}^{∞} α_j W_{t+h−j}) W_t ] = σ_w² α_h,

to give an indication of the behavior of α(B) (for instance, the delay).
4. Estimate the coefficients of α(B) and hence fit an ARMA model for the noise series η_t.
Lagged regression in the time domain

Why prewhiten?

The prewhitening step inverts the linear filter X_t = (θ_x(B)/φ_x(B)) W_t. Then the lagged regression is between the transformed Y_t and a white series W_t. This makes it easy to determine a suitable lag.

For example, in the SOI/recruitment series, we treat SOI as the input, estimate an AR(1) model, prewhiten it (that is, compute the inverse of our AR(1) operator and apply it to the SOI series), and consider the cross-correlation between the transformed recruitment series and the prewhitened SOI. This shows a large peak at lag −5 (corresponding to the SOI series leading the recruitment series). Examples 5.7 and 5.8 in the text then consider α(B) = B^5/(1 − ω_1 B).
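The prewhitening recipe can be sketched numerically. This minimal version assumes an AR(1) input model fitted by Yule–Walker; the function name and simulated series are illustrative, not the text's data:

```python
import numpy as np

def prewhitened_ccf(x, y, max_lag):
    """Fit AR(1) to the input x, apply the inverse filter (1 - phi*B)
    to both series, and return the CCF of the transformed output
    against the prewhitened (approximately white) input."""
    x = np.asarray(x, float) - np.mean(x)
    y = np.asarray(y, float) - np.mean(y)
    phi = (x[1:] @ x[:-1]) / (x @ x)   # Yule-Walker estimate of phi
    w = x[1:] - phi * x[:-1]           # prewhitened input
    yt = y[1:] - phi * y[:-1]          # same filter applied to the output
    n = len(w)
    denom = np.sqrt((w @ w) * (yt @ yt))
    lags = np.arange(-max_lag, max_lag + 1)
    ccf = np.array([(w[h:] @ yt[:n - h] if h >= 0 else yt[-h:] @ w[:n + h])
                    for h in lags]) / denom
    return phi, lags, ccf

# AR(1) input; output is a scaled copy of the input delayed by 5 steps,
# so the CCF of the transformed series should peak at lag -5.
rng = np.random.default_rng(1)
n = 2000
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + rng.normal()
y = np.concatenate([np.zeros(5), 2.0 * x[:-5]]) + 0.1 * rng.normal(size=n)
phi, lags, ccf = prewhitened_ccf(x, y, max_lag=10)
print(lags[np.argmax(ccf)])  # -5
```

Without prewhitening, the AR(1) autocorrelation in x would smear the CCF peak across several lags; after filtering, the delay stands out sharply.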
Lagged regression in the time domain
This sequential estimation procedure (φx,θx, then α, then φη,θη) is rather ad hoc. State space methods offer an alternative, and they are also convenient for vector-valued input and output series.
Coherence

To analyze lagged regression in the frequency domain, we'll need the notion of coherence, the analog of cross-correlation in the frequency domain.

Define the cross-spectrum as the Fourier transform of the cross-covariance,

    f_xy(ν) = Σ_{h=−∞}^{∞} γ_xy(h) e^{−2πiνh},
    γ_xy(h) = ∫_{−1/2}^{1/2} f_xy(ν) e^{2πiνh} dν

(provided that Σ_{h=−∞}^{∞} |γ_xy(h)| < ∞).

Notice that f_xy(ν) can be complex-valued. Also, γ_yx(h) = γ_xy(−h) implies f_yx(ν) = f_xy(ν)*.
Coherence

The squared coherence function is

    ρ²_{y,x}(ν) = |f_yx(ν)|² / (f_x(ν) f_y(ν)).

Compare this with the squared correlation ρ²_{y,x} = Cov(Y,X)² / (σ_x² σ_y²). We can think of the squared coherence at a frequency ν as the contribution to squared correlation at that frequency. (Recall the interpretation of spectral density at a frequency ν as the contribution to variance at that frequency.)
Estimating squared coherence

Recall that we estimated the spectral density using the smoothed squared modulus of the DFT of the series:

    f̂_x(ν_k) = (1/L) Σ_{l=−(L−1)/2}^{(L−1)/2} |X(ν_k − l/n)|²
              = (1/L) Σ_{l=−(L−1)/2}^{(L−1)/2} X(ν_k − l/n) X(ν_k − l/n)*.

We can estimate the cross-spectral density using the same kind of smoothed sample estimate:

    f̂_xy(ν_k) = (1/L) Σ_{l=−(L−1)/2}^{(L−1)/2} X(ν_k − l/n) Y(ν_k − l/n)*.
Coherence

Also, we can estimate the squared coherence using these estimates:

    ρ̂²_{y,x}(ν) = |f̂_yx(ν)|² / (f̂_x(ν) f̂_y(ν)).

(Knowledge of the asymptotic distribution of ρ̂²_{y,x}(ν) under the hypothesis of no coherence, ρ_{y,x}(ν) = 0, allows us to test for coherence.)
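In practice the smoothing is often done Welch-style over segments; SciPy's scipy.signal.coherence computes the magnitude-squared coherence this way. A small sketch with simulated series (all parameters are illustrative):

```python
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(0)
n = 4096
x = rng.normal(size=n)
# Output = delayed, scaled input plus independent noise, so the true
# squared coherence is 0.81/(0.81 + 0.09) = 0.9 at every frequency.
y = np.concatenate([np.zeros(2), 0.9 * x[:-2]]) + 0.3 * rng.normal(size=n)

freqs, coh = coherence(x, y, fs=1.0, nperseg=256)
print(coh.mean())  # close to the true value 0.9
```

Note that coherence is unaffected by a pure delay (a delay only changes the phase of f_yx, not its modulus), which is why the 2-step lag does not reduce it.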
Lagged regression models in the frequency domain

Consider a lagged regression model of the form

    Y_t = Σ_{j=−∞}^{∞} β_j X_{t−j} + V_t,

where X_t is an observed input time series, Y_t is the observed output time series, and V_t is a stationary noise process.

We'd like to estimate the coefficients β_j that determine the relationship between the lagged values of the input series X_t and the output series Y_t.
Lagged regression models in the frequency domain

The projection theorem tells us that the coefficients that minimize the mean squared error

    E( Y_t − Σ_{j=−∞}^{∞} β_j X_{t−j} )²

satisfy the orthogonality conditions

    E[ ( Y_t − Σ_{j=−∞}^{∞} β_j X_{t−j} ) X_{t−k} ] = 0,  k = 0, ±1, ±2, ...,

that is,

    Σ_{j=−∞}^{∞} β_j γ_x(k − j) = γ_yx(k),  k = 0, ±1, ±2, ...
Lagged regression models in the frequency domain

We could solve these equations for the β_j using the sample autocovariance and sample cross-covariance. But it is more convenient to use estimates of the spectra and cross-spectrum. (Convolution with {β_j} in the time domain is equivalent to multiplication by the Fourier transform of {β_j} in the frequency domain. Let's verify this.)

We replace the autocovariance and cross-covariance in the orthogonality conditions with the inverse Fourier transforms of the spectral density and cross-spectral density:

    Σ_{j=−∞}^{∞} β_j γ_x(k − j) = γ_yx(k),  k = 0, ±1, ±2, ...
Lagged regression models in the frequency domain

This gives, for k = 0, ±1, ±2, ...,

    ∫_{−1/2}^{1/2} Σ_{j=−∞}^{∞} β_j e^{2πiν(k−j)} f_x(ν) dν = ∫_{−1/2}^{1/2} e^{2πiνk} f_yx(ν) dν,

    ∫_{−1/2}^{1/2} e^{2πiνk} B(ν) f_x(ν) dν = ∫_{−1/2}^{1/2} e^{2πiνk} f_yx(ν) dν,

where B(ν) = Σ_{j=−∞}^{∞} e^{−2πiνj} β_j is the Fourier transform of the coefficient sequence {β_j}. Since the Fourier transform is unique, the orthogonality conditions are equivalent to

    B(ν) f_x(ν) = f_yx(ν).
Lagged regression models in the frequency domain

We can write the mean squared error at the solution as follows. (Why?)

    E[ ( Y_t − Σ_{j=−∞}^{∞} β_j X_{t−j} ) Y_t ]
      = γ_y(0) − Σ_{j=−∞}^{∞} β_j γ_xy(−j)
      = ∫_{−1/2}^{1/2} ( f_y(ν) − B(ν) f_xy(ν) ) dν
      = ∫_{−1/2}^{1/2} f_y(ν) ( 1 − f_yx(ν) f_xy(ν)/(f_x(ν) f_y(ν)) ) dν
      = ∫_{−1/2}^{1/2} f_y(ν) ( 1 − |f_yx(ν)|²/(f_x(ν) f_y(ν)) ) dν
      = ∫_{−1/2}^{1/2} f_y(ν) ( 1 − ρ²_{yx}(ν) ) dν.
Lagged regression models in the frequency domain

    MSE = ∫_{−1/2}^{1/2} f_y(ν) ( 1 − ρ²_{yx}(ν) ) dν.

Thus, ρ²_{yx}(ν) indicates how much of the component of the variance of {Y_t} at frequency ν is accounted for by {X_t}. Compare this with the corresponding decomposition for random variables:

    E(y − βx)² = σ_y² (1 − ρ²_{xy}).
Lagged regression models in the frequency domain

We can estimate the β_j in the frequency domain:

    B̂(ν_k) = f̂_yx(ν_k) / f̂_x(ν_k).

We can approximate the inverse Fourier transform of B̂(ν),

    β̂_j = ∫_{−1/2}^{1/2} e^{2πiνj} B̂(ν) dν,

via the sum

    β̂_j = (1/M) Σ_{k=0}^{M−1} B̂(ν_k) e^{2πiν_k j}.

This gives a periodic sequence; we might truncate it at |j| = M/2.
Lagged regression models in the frequency domain

Here is the approach:
1. Estimate the spectral density and cross-spectral density.
2. Compute the transfer function B̂(ν).
3. Take the inverse Fourier transform to obtain the impulse response function β̂_j.
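The three steps above can be sketched with FFT-based smoothed periodogram estimates. This is a minimal illustration, not the text's exact procedure; the function name, smoothing span L, and simulated series are all hypothetical:

```python
import numpy as np

def lagged_regression_coeffs(x, y, L=21):
    """Estimate beta_j: smooth the periodogram and cross-periodogram
    over L neighboring Fourier frequencies, form the transfer function
    B_hat = f_hat_yx / f_hat_x, and invert the Fourier transform."""
    x = np.asarray(x, float) - np.mean(x)
    y = np.asarray(y, float) - np.mean(y)
    n = len(x)
    X, Y = np.fft.fft(x), np.fft.fft(y)

    def smooth(z):  # circular moving average over L frequencies
        zp = np.concatenate([z[-(L // 2):], z, z[:L // 2]])
        return np.convolve(zp, np.ones(L) / L, mode='valid')

    f_x = smooth(np.abs(X) ** 2 / n)     # f_hat_x(nu_k)
    f_yx = smooth(Y * np.conj(X) / n)    # f_hat_yx(nu_k)
    B = f_yx / f_x                       # transfer function B_hat(nu_k)
    # beta_hat_j = (1/n) sum_k B_hat(nu_k) e^{2 pi i nu_k j}; the result
    # is periodic in j, with negative lags wrapping around to the end.
    return np.real(np.fft.ifft(B))

# y_t = 2 x_{t-3} + noise: beta_hat should be near 2 at j = 3, near 0 elsewhere.
rng = np.random.default_rng(0)
x = rng.normal(size=2048)
y = np.concatenate([np.zeros(3), 2.0 * x[:-3]]) + 0.1 * rng.normal(size=2048)
beta = lagged_regression_coeffs(x, y)
print(np.argmax(np.abs(beta)))  # 3
```

Because the inverse DFT is periodic, a coefficient at a negative lag −j appears at index n − j; in practice one truncates to a window of small |j|.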
Lagged regression models in the frequency domain

It is often useful to consider both representations,

    Y_t = Σ_{j=−∞}^{∞} α_j X_{t−j},    X_t = Σ_{j=−∞}^{∞} β_j Y_{t−j},

since there might be a more parsimonious representation in terms of one than the other. (Just as a small AR model often cannot be well approximated by a small MA model.)
Lagged regression models in the frequency domain

In the X_t = SOI / Y_t = recruitment example (Example 4.23), we obtain

    Y_t = −22X_{t−5} − 15X_{t−6} − 11X_{t−7} − 10X_{t−8} − 7X_{t−9} − ··· + W_t,
    X_t = 0.012Y_{t+4} − 0.018Y_{t+5} + V_t,

and the latter is equivalent to

    (1 − 0.667B)Y_t = −56B^5 X_t + U_t.
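As a sanity check on the stated equivalence: dividing X_t = 0.012Y_{t+4} − 0.018Y_{t+5} + V_t through by −0.018 and relabeling t gives the rational form, and the implied coefficients are easy to verify:

```python
# x_t = 0.012 y_{t+4} - 0.018 y_{t+5} + v_t  rearranges to
# y_{t+5} - (0.012/0.018) y_{t+4} = -(1/0.018) x_t + u_t,
# i.e. (1 - 0.667 B) y_t = -56 B^5 x_t + u_t after relabeling t.
phi = 0.012 / 0.018   # AR coefficient: 0.667
gain = -1 / 0.018     # input coefficient: about -56
print(round(phi, 3), round(gain))  # 0.667 -56
```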