Introduction to Statistical Analysis of Time Series, by Richard A. Davis

Total Pages: 16

File Type: PDF, Size: 1020 KB

Introduction to Statistical Analysis of Time Series
Richard A. Davis, Department of Statistics

Outline
• Modeling objectives in time series
• General features of ecological/environmental time series
• Components of a time series
• Frequency domain analysis: the spectrum
• Estimating and removing seasonal components
• Other cyclical components
• Putting it all together

Time Series: a collection of observations x_t, each one recorded at time t. (Time can be discrete, t = 1, 2, 3, ..., or continuous, t > 0.)

Objectives of Time Series Analysis
• Data compression: provide a compact description of the data.
• Explanation: seasonal factors; relationships with other variables (temperature, humidity, pollution, etc.).
• Signal processing: extracting a signal in the presence of noise.
• Prediction: use the model to predict future values of the time series.

General features of ecological/environmental time series

Example 1. Mauna Loa (CO2, Oct '58 to Sept '90). [Figure: monthly CO2 readings, 1960-1990.] Features: an increasing trend (linear? quadratic?) and a seasonal (monthly) effect.

Example 2. Average-maximum monthly temperature (vegetation = tundra, 1895-1993). [Figure: monthly temperature series; go to ITSM demo.] Features: a seasonal (monthly) effect, with more variability in January than in July:

  Jan:  mean = -0.486, var = 2.637
  July: mean = 21.95,  var = 0.6305
  Sept: mean = 17.25,  var = 1.466

[Figure: September temperatures with fitted line 16.83 + 0.00845 t.]

Components of a time series

Classical decomposition: X_t = m_t + s_t + Y_t, where
• m_t = trend component (slowly changing in time);
• s_t = seasonal component (known period d: d = 24 for hourly data, d = 12 for monthly data);
• Y_t = random noise component (might contain irregular cyclical components of unknown frequency, plus other structure).
(Go to ITSM demo.)

Estimation of the components of X_t = m_t + s_t + Y_t.

Trend m_t:
• filtering; e.g., for monthly data,

  m̂_t = (0.5 x_{t-6} + x_{t-5} + ... + x_{t+5} + 0.5 x_{t+6}) / 12;

• polynomial fitting,

  m̂_t = a_0 + a_1 t + ... + a_k t^k.

Seasonal s_t:
• seasonal (monthly) averages after detrending, standardized so that s_t sums to 0 across the year:

  ŝ_t = (x_t + x_{t+12} + x_{t+24} + ...) / N,  N = number of years;

• harmonic components fit to the time series by least squares:

  ŝ_t = A cos(2πt/12) + B sin(2πt/12).

Irregular component Y_t:

  Ŷ_t = X_t − m̂_t − ŝ_t.

The spectrum and frequency domain analysis

Toy example (n = 6). Define the vectors

  c_0 = (cos(2π·0·1/6), ..., cos(2π·0·6/6))' / √6,
  c_1 = (cos(2π·1·1/6), ..., cos(2π·1·6/6))' / √3,
  s_1 = (sin(2π·1·1/6), ..., sin(2π·1·6/6))' / √3,
  c_2 = (cos(2π·2·1/6), ..., cos(2π·2·6/6))' / √3,
  s_2 = (sin(2π·2·1/6), ..., sin(2π·2·6/6))' / √3,
  c_3 = (cos(π·1), ..., cos(π·6))' / √6.

Then the vector x = (4.24, 3.26, −3.14, −3.24, 0.739, 3.04)' can be written as

  x = 2 c_0 + 5 (c_1 + s_1) − 1.5 (c_2 + s_2) + 0.5 c_3.

Fact: any vector of 6 numbers, x = (x_1, ..., x_6)', can be written as a linear combination of the vectors c_0, c_1, c_2, s_1, s_2, c_3. More generally, any time series x = (x_1, ..., x_n)' of length n (assume n is odd) can be written as a linear combination of the orthonormal basis vectors c_0, c_1, ..., c_[n/2], s_1, ..., s_[n/2]:

  x = a_0 c_0 + a_1 c_1 + b_1 s_1 + ... + a_m c_m + b_m s_m,  m = [n/2],

where, with ω_j = 2πj/n,

  c_0 = n^{−1/2} (1, 1, ..., 1)',
  c_j = (2/n)^{1/2} (cos ω_j, cos 2ω_j, ..., cos nω_j)',
  s_j = (2/n)^{1/2} (sin ω_j, sin 2ω_j, ..., sin nω_j)'.
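The toy example can be checked numerically. Below is a short Python/numpy sketch (not part of the original slides): it builds the six basis vectors, confirms they are orthonormal, reassembles x from the stated coefficients, and recovers those coefficients as inner products.

    import numpy as np

    n = 6
    t = np.arange(1, n + 1)

    # Orthonormal basis for R^6 built from sinusoids at the Fourier frequencies.
    c0 = np.cos(2 * np.pi * 0 * t / n) / np.sqrt(n)      # constant vector
    c1 = np.cos(2 * np.pi * 1 * t / n) / np.sqrt(n / 2)
    s1 = np.sin(2 * np.pi * 1 * t / n) / np.sqrt(n / 2)
    c2 = np.cos(2 * np.pi * 2 * t / n) / np.sqrt(n / 2)
    s2 = np.sin(2 * np.pi * 2 * t / n) / np.sqrt(n / 2)
    c3 = np.cos(2 * np.pi * 3 * t / n) / np.sqrt(n)      # alternating -1, +1

    B = np.column_stack([c0, c1, s1, c2, s2, c3])
    assert np.allclose(B.T @ B, np.eye(n))               # orthonormality

    # The series from the slides, rebuilt from the stated coefficients.
    x = 2 * c0 + 5 * (c1 + s1) - 1.5 * (c2 + s2) + 0.5 * c3
    print(np.round(x, 2))      # [ 4.24  3.26 -3.14 -3.24  0.74  3.04]

    # The DFT coefficients are inner products with the basis vectors.
    coef = B.T @ x
    print(np.round(coef, 2))   # [ 2.    5.    5.   -1.5  -1.5   0.5 ]

    # Sum-of-squares identity: sum x_t^2 = a_0^2 + sum_j (a_j^2 + b_j^2).
    print(np.sum(x**2), np.sum(coef**2))                 # both 58.75

The recovered coefficients (2, 5, 5, -1.5, -1.5, 0.5) and the matching sums of squares (58.75) anticipate the properties and the ANOVA table that follow.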
Properties.

1. The set of coefficients {a_0, a_1, b_1, ...} is called the discrete Fourier transform (DFT) of x:

  a_0 = (x, c_0) = n^{−1/2} Σ_{t=1}^{n} x_t,
  a_j = (x, c_j) = (2/n)^{1/2} Σ_{t=1}^{n} x_t cos(ω_j t),
  b_j = (x, s_j) = (2/n)^{1/2} Σ_{t=1}^{n} x_t sin(ω_j t).

2. Sum of squares:

  Σ_{t=1}^{n} x_t² = a_0² + Σ_{j=1}^{m} (a_j² + b_j²).

3. ANOVA (analysis of variance) table:

  Source       DF  Sum of squares  Periodogram
  ω_0 = 0      1   a_0²            I(ω_0)
  ω_1 = 2π/n   2   a_1² + b_1²     2 I(ω_1)
  ...
  ω_m = 2πm/n  2   a_m² + b_m²     2 I(ω_m)
  Total        n   Σ_t x_t²

Applied to the toy example:

  Source                     DF  Sum of squares
  ω_0 = 0       (period 0)   1   a_0² = 4.0
  ω_1 = 2π/6    (period 6)   2   a_1² + b_1² = 50.0
  ω_2 = 2π·2/6  (period 3)   2   a_2² + b_2² = 4.5
  ω_3 = 2π·3/6  (period 2)   1   a_3² = 0.25
  Total                      6   Σ_t x_t² = 58.75

Test that the period-6 component is significant:

  H_0: X_t = μ + ε_t, {ε_t} ~ independent noise;
  H_1: X_t = μ + A cos(2πt/6) + B sin(2πt/6) + ε_t.

Test statistic: (n−3) I(ω_1) / (Σ_t x_t² − I(0) − 2 I(ω_1)) ~ F(2, n−3) under H_0. Here (6−3)(50/2) / (58.75 − 4 − 50) = 15.79 ⇒ p-value = .003.

The spectrum and frequency domain analysis: examples

Ex. Sinusoid with period 12: x_t = 5 cos(2πt/12) + 3 sin(2πt/12), t = 1, 2, ..., 120.
Ex. Sinusoid with periods 4 and 12.
Ex. Mauna Loa. (ITSM demo.)

Differencing at lag 12

Sometimes a seasonal component with period 12 in the time series can be removed by differencing at lag 12; that is, the differenced series is

  y_t = x_t − x_{t−12}.

Now suppose x_t is the sinusoid with period 12 plus noise,

  x_t = 5 cos(2πt/12) + 3 sin(2πt/12) + ε_t,  t = 1, 2, ..., 120.

Then

  y_t = x_t − x_{t−12} = ε_t − ε_{t−12},

which has correlation at lag 12.

Other cyclical components; searching for hidden cycles

Ex. Sunspots: period ≈ 2π/0.62684 = 10.02 years. Fisher's test ⇒ significance. What model should we use? (ITSM demo.)

Noise

The time series {X_t} is white or independent noise if the sequence of random variables is independent and identically distributed. [Figure: a realization of white noise.] There is a battery of tests for checking whiteness; in ITSM, choose Statistics => Residual Analysis => Tests of Randomness.

Residuals from the Mauna Loa data

[Figure: lagged scatterplots of the residuals r_t, whose first few values are r_1 = −.19, r_2 = −.14, r_3 = −.25, r_4 = −.13, ...] The residuals are strongly correlated at short lags and nearly uncorrelated far apart:

  Cor(X_t, X_{t+1}) = .824
  Cor(X_t, X_{t+2}) = .736
  Cor(X_t, X_{t+3}) = .654
  Cor(X_t, X_{t+25}) = .074

Autocorrelation function (ACF)

[Figure: ACF of the Mauna Loa residuals versus the ACF of white noise, with confidence bounds ±1.96/√n.]

Example: Mauna Loa

[Figure: decomposition of the CO2 series into trend, seasonal, and irregular parts.]

Putting it all together

Strategies for modeling the irregular part {Y_t}:
• fit an autoregressive (AR) process;
• fit a moving average (MA) process;
• fit an ARMA (autoregressive-moving average) process.
In ITSM, choose the best-fitting AR or ARMA model using the menu option Model => Estimation => Preliminary => AR estimation, or Model => Estimation => Autofit. A minimal sketch of this kind of preliminary AR estimation outside ITSM follows.
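Preliminary AR estimation can be sketched with the Yule-Walker equations, which match the model's autocovariances to the sample autocovariances. A minimal Python sketch on simulated data (the AR(2) coefficients 0.7 and -0.2 are illustrative, not from the lecture):

    import numpy as np
    from scipy.linalg import toeplitz

    rng = np.random.default_rng(0)

    # Simulate an AR(2) irregular part: Y_t = 0.7 Y_{t-1} - 0.2 Y_{t-2} + e_t.
    n = 500
    e = rng.normal(size=n)
    y = np.zeros(n)
    for t in range(2, n):
        y[t] = 0.7 * y[t - 1] - 0.2 * y[t - 2] + e[t]

    def yule_walker(y, p):
        """Estimate AR(p) coefficients from the sample autocovariances."""
        y = y - y.mean()
        n = len(y)
        gamma = np.array([y[: n - h] @ y[h:] / n for h in range(p + 1)])
        phi = np.linalg.solve(toeplitz(gamma[:p]), gamma[1:])
        sigma2 = gamma[0] - phi @ gamma[1:]   # innovation variance estimate
        return phi, sigma2

    phi_hat, sigma2_hat = yule_walker(y, p=2)
    print(phi_hat.round(3), round(sigma2_hat, 3))   # near [0.7, -0.2] and 1.0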
How well does the model fit the data?

1. Inspection of the residuals. Are they compatible with white (independent) noise:
• no discernible trend;
• no seasonal component;
• variability that does not change in time;
• no correlation in the residuals or in the squares of the residuals?
Are they normally distributed?

2. How well does the model predict:
• values within the series (in-sample forecasting);
• future values?

3. How well do simulated values from the model capture the characteristics of the observed data? (ITSM demo with Mauna Loa.)

Model refinement and simulation

• Residual analysis can often lead to model refinement.
• Do simulated realizations reflect the key features present in the original data?

Two examples: sunspots and NEE (net ecosystem exchange).

Limitations of the existing models: seasonal components are fixed from year to year, and the models are stationary through the seasons. A refinement is to add intervention components (forest fires, volcanic eruptions, etc.).

Other directions

Structural model formulation for the trend and seasonal components:
• local level model, m_t = m_{t−1} + noise_t;
• seasonal component with noise, s_t = −s_{t−1} − s_{t−2} − ... − s_{t−11} + noise_t;
• observation equation, X_t = m_t + s_t + Y_t + ε_t.
It is easy to add intervention terms in this formulation. Periodic models allow more flexibility in modeling transitions from one season to the next. A simulation sketch of the structural formulation is given below.
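These structural equations are straightforward to simulate. A minimal Python sketch with illustrative noise scales (none of these values come from the lecture; the ARMA irregular part Y_t is omitted for brevity):

    import numpy as np

    rng = np.random.default_rng(1)
    T = 240   # 20 years of hypothetical monthly data

    # Local level trend: m_t = m_{t-1} + noise_t.
    m = 330.0 + np.cumsum(rng.normal(scale=0.05, size=T))

    # Stochastic seasonal: s_t = -(s_{t-1} + ... + s_{t-11}) + noise_t,
    # so any 12 consecutive seasonal terms sum to roughly zero.
    s = np.zeros(T)
    s[:11] = rng.normal(scale=1.0, size=11)   # initial seasonal pattern
    for t in range(11, T):
        s[t] = -s[t - 11:t].sum() + rng.normal(scale=0.05)

    # Observation: X_t = m_t + s_t + eps_t.
    x = m + s + rng.normal(scale=0.1, size=T)
    print(x[:12].round(2))

Because the trend and seasonal components carry their own noise terms, the seasonal pattern can drift from year to year, which is exactly the flexibility the fixed classical decomposition lacks.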
Recommended publications
  • Kalman and Particle Filtering
Abstract: The Kalman and Particle filters are algorithms that recursively update an estimate of the state and find the innovations driving a stochastic process given a sequence of observations. The Kalman filter accomplishes this goal by linear projections, while the Particle filter does so by a sequential Monte Carlo method. With the state estimates, we can forecast and smooth the stochastic process. With the innovations, we can estimate the parameters of the model. The article discusses how to set a dynamic model in a state-space form, derives the Kalman and Particle filters, and explains how to use them for estimation. Since both filters start with a state-space representation of the stochastic processes of interest, section 1 presents the state-space form of a dynamic model; then section 2 introduces the Kalman filter and section 3 develops the Particle filter. For extended expositions of this material, see Doucet, de Freitas, and Gordon (2001), Durbin and Koopman (2001), and Ljungqvist and Sargent (2004). 1. The state-space representation of a dynamic model. A large class of dynamic models can be represented by a state-space form:

  X_{t+1} = φ(X_t, W_{t+1}; γ)   (1)
  Y_t = g(X_t, V_t; γ)           (2)

This representation handles a stochastic process by finding three objects: a vector that describes the position of the system (a state, X_t ∈ X ⊂ R^l) and two functions, one mapping the state today into the state tomorrow (the transition equation, (1)) and one mapping the state into observables, Y_t (the measurement equation, (2)).
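For the linear-Gaussian special case of the state-space form (1)-(2), the Kalman recursions collapse to a few lines. A minimal scalar Python sketch for a local level model (the variances q and r are illustrative, not from the article):

    import numpy as np

    def kalman_filter_local_level(y, q=0.1, r=1.0, m0=0.0, p0=10.0):
        """Scalar Kalman filter for X_{t+1} = X_t + W_{t+1}, Y_t = X_t + V_t,
        with Var(W) = q and Var(V) = r. Returns filtered means, innovations."""
        m, p = m0, p0
        means, innovations = [], []
        for yt in y:
            # Update: project the state estimate onto the new observation.
            v = yt - m                # innovation
            k = p / (p + r)           # Kalman gain
            m, p = m + k * v, (1 - k) * p
            means.append(m)
            innovations.append(v)
            # Predict: propagate through the random-walk transition equation.
            p = p + q
        return np.array(means), np.array(innovations)

    rng = np.random.default_rng(2)
    state = np.cumsum(rng.normal(scale=np.sqrt(0.1), size=200))
    y = state + rng.normal(size=200)
    m_filt, innov = kalman_filter_local_level(y)
    print(np.mean((m_filt - state) ** 2))   # well below the raw noise var of 1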
  • Lecture 19: Wavelet Compression of Time Series and Images
Lecture 19: Wavelet compression of time series and images. © Christopher S. Bretherton, Winter 2014. Ref: Matlab Wavelet Toolbox help. 19.1 Wavelet compression of a time series. The last section of wavelet leleccum notoolbox.m demonstrates the use of wavelet compression on a time series. The idea is to keep the wavelet coefficients of largest amplitude and zero out the small ones. 19.2 Wavelet analysis/compression of an image. Wavelet analysis is easily extended to two-dimensional images or datasets (data matrices), by first doing a wavelet transform of each column of the matrix, then transforming each row of the result (see wavelet image). The wavelet coefficient matrix has the highest-level (largest-scale) averages in the first rows/columns, then successively smaller detail scales further down the rows/columns. The example also shows fine results with 50-fold data compression. 19.3 Continuous Wavelet Transform (CWT). Given a continuous signal u(t) and an analyzing wavelet ψ(x), the CWT has the form

  W(λ, t) = λ^{−1/2} ∫_{−∞}^{∞} ψ((s − t)/λ) u(s) ds.   (19.3.1)

Here λ, the scale, is a continuous variable. We insist that ψ have mean zero and that its square integrates to 1. The continuous Haar wavelet is defined:

  ψ(t) = 1 for 0 < t < 1/2,  −1 for 1/2 < t < 1,  0 otherwise.   (19.3.2)

W(λ, t) is proportional to the difference of running means of u over successive intervals of length λ/2. In practice, for a discrete time series, the integral is evaluated as a Riemann sum using the Matlab wavelet toolbox function cwt.
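The compression idea (transform, keep the largest coefficients, zero the rest, invert) can be sketched without the Matlab toolbox. A Python version using a hand-rolled orthonormal Haar transform on a hypothetical noisy sinusoid:

    import numpy as np

    def haar_dwt(u):
        """Multi-level orthonormal Haar transform of a length-2^k signal."""
        approx, coeffs = u.astype(float).copy(), []
        while len(approx) > 1:
            a = (approx[0::2] + approx[1::2]) / np.sqrt(2)   # running means
            d = (approx[0::2] - approx[1::2]) / np.sqrt(2)   # details
            coeffs.append(d)
            approx = a
        return approx, coeffs

    def haar_idwt(approx, coeffs):
        for d in reversed(coeffs):
            a = np.empty(2 * len(d))
            a[0::2] = (approx + d) / np.sqrt(2)
            a[1::2] = (approx - d) / np.sqrt(2)
            approx = a
        return approx

    rng = np.random.default_rng(3)
    t = np.arange(1024)
    u = np.sin(2 * np.pi * t / 128) + 0.1 * rng.normal(size=1024)

    approx, coeffs = haar_dwt(u)
    # Compression: zero all detail coefficients below an amplitude threshold.
    thresh = np.quantile(np.abs(np.concatenate(coeffs)), 0.98)  # keep ~2%
    coeffs = [np.where(np.abs(d) >= thresh, d, 0.0) for d in coeffs]
    u_hat = haar_idwt(approx, coeffs)
    print(np.sqrt(np.mean((u - u_hat) ** 2)))   # error near the noise level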
  • Alternative Tests for Time Series Dependence Based on Autocorrelation Coefficients
Alternative Tests for Time Series Dependence Based on Autocorrelation Coefficients. Richard M. Levich and Rosario C. Rizzo*. Current Draft: December 1998. Abstract: When autocorrelation is small, existing statistical techniques may not be powerful enough to reject the hypothesis that a series is free of autocorrelation. We propose two new and simple statistical tests (RHO and PHI) based on the unweighted sum of autocorrelation and partial autocorrelation coefficients. We analyze a set of simulated data to show the higher power of RHO and PHI in comparison to conventional tests for autocorrelation, especially in the presence of small but persistent autocorrelation. We show an application of our tests to data on currency futures to demonstrate their practical use. Finally, we indicate how our methodology could be used for a new class of time series models (the Generalized Autoregressive, or GAR, models) that take into account the presence of small but persistent autocorrelation. An earlier version of this paper was presented at the Symposium on Global Integration and Competition, sponsored by the Center for Japan-U.S. Business and Economic Studies, Stern School of Business, New York University, March 27-28, 1997. We thank Paul Samuelson and the other participants at that conference for useful comments. * Stern School of Business, New York University and Research Department, Bank of Italy, respectively. 1. Introduction. Economic time series are often characterized by positive
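The common ingredient of RHO and PHI, the unweighted sum of the first k sample autocorrelations, is easy to compute. The Python sketch below uses a naive N(0, k/n) reference distribution for that sum under independence; the exact standardization used by RHO and PHI should be taken from the paper itself:

    import numpy as np

    def sample_acf(x, max_lag):
        """Sample autocorrelations r_1, ..., r_{max_lag}."""
        x = x - x.mean()
        denom = x @ x
        return np.array([x[:-h] @ x[h:] / denom for h in range(1, max_lag + 1)])

    rng = np.random.default_rng(4)
    # Series with small but persistent autocorrelation (AR(1), phi = 0.05).
    e = rng.normal(size=2000)
    x = e.copy()
    for t in range(1, len(x)):
        x[t] = 0.05 * x[t - 1] + e[t]

    k = 20
    rho_raw = sample_acf(x, k).sum()   # unweighted sum of autocorrelations
    # Under independence each r_h is roughly N(0, 1/n) and nearly independent,
    # so the sum is approximately N(0, k/n).
    z = rho_raw / np.sqrt(k / len(x))
    print(rho_raw, z)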
  • Time Series: Co-Integration
Time Series: Co-integration. See also: Economic Forecasting; Time Series: General; Time Series: Nonstationary Distributions and Unit Roots; Time Series: Seasonal Adjustment. N. H. Chan

Bibliography
Banerjee A, Dolado J J, Galbraith J W, Hendry D F 1993 Co-Integration, Error Correction, and the Econometric Analysis of Non-stationary Data. Oxford University Press, Oxford, UK
Box G E P, Tiao G C 1977 A canonical analysis of multiple time series. Biometrika 64: 355-65
Chan N H, Tsay R S 1996 On the use of canonical correlation analysis in testing common trends. In: Lee J C, Johnson W O, Zellner A (eds.) Modelling and Prediction: Honoring S. Geisser. Springer-Verlag, New York, pp. 364-77
Chan N H, Wei C Z 1988 Limiting distributions of least squares estimates of unstable autoregressive processes. Annals of Statistics 16: 367-401
Engle R F, Granger C W J 1987 Cointegration and error correction: Representation, estimation, and testing. Econometrica 55: 251-76
Watson M W 1994 Vector autoregressions and co-integration. In: Engle R F, McFadden D L (eds.) Handbook of Econometrics Vol. IV. Elsevier, The Netherlands

Time Series: Cycles. Time series data in economics and other fields of social science often exhibit cyclical behavior. For example, aggregate retail sales are high in November and December and follow a seasonal cycle; voter registrations are high before each presidential election and follow an election cycle; and aggregate macroeconomic activity falls into recession every six to eight years and follows a business cycle. In spite of this cyclicality, these series are not perfectly predictable, and the cycles are not strictly periodic.
  • Generating Time Series with Diverse and Controllable Characteristics
GRATIS: GeneRAting TIme Series with diverse and controllable characteristics. Yanfei Kang,* Rob J Hyndman,† and Feng Li‡. Abstract: The explosion of time series data in recent years has brought a flourish of new time series analysis methods, for forecasting, clustering, classification and other tasks. The evaluation of these new methods requires either collecting or simulating a diverse set of time series benchmarking data to enable reliable comparisons against alternative approaches. We propose GeneRAting TIme Series with diverse and controllable characteristics, named GRATIS, with the use of mixture autoregressive (MAR) models. We simulate sets of time series using MAR models and investigate the diversity and coverage of the generated time series in a time series feature space. By tuning the parameters of the MAR models, GRATIS is also able to efficiently generate new time series with controllable features. In general, as a costless surrogate to the traditional data collection approach, GRATIS can be used as an evaluation tool for tasks such as time series forecasting and classification. We illustrate the usefulness of our time series generation process through a time series forecasting application. Keywords: Time series features; Time series generation; Mixture autoregressive models; Time series forecasting; Simulation. 1 Introduction. With the widespread collection of time series data via scanners, monitors and other automated data collection devices, there has been an explosion of time series analysis methods developed in the past decade or two. Paradoxically, the large datasets are often also relatively homogeneous in the industry domain, which limits their use for evaluation of general time series analysis methods (Keogh and Kasetty, 2003; Muñoz et al., 2018; Kang et al., 2017).
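The MAR generating mechanism itself is only a few lines: at each time step a mixture component is drawn, and the next value is generated from that component's AR recursion. A two-component Python sketch with hand-picked (not GRATIS-tuned) parameters:

    import numpy as np

    rng = np.random.default_rng(5)

    # Two-component mixture autoregressive (MAR) model.
    weights = [0.6, 0.4]    # mixing weights (illustrative)
    phi     = [0.9, -0.5]   # AR(1) coefficient of each component
    sigma   = [1.0, 2.0]    # innovation standard deviations

    T = 500
    x = np.zeros(T)
    for t in range(1, T):
        k = rng.choice(2, p=weights)                   # pick a regime
        x[t] = phi[k] * x[t - 1] + rng.normal(scale=sigma[k])
    print(x[:10].round(2))

Varying the weights, coefficients, and variances moves the simulated series around the feature space, which is the tuning idea behind GRATIS.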
  • Monte Carlo Smoothing for Nonlinear Time Series
Monte Carlo Smoothing for Nonlinear Time Series. Simon J. GODSILL, Arnaud DOUCET, and Mike WEST. We develop methods for performing smoothing computations in general state-space models. The methods rely on a particle representation of the filtering distributions, and their evolution through time using sequential importance sampling and resampling ideas. In particular, novel techniques are presented for generation of sample realizations of historical state sequences. This is carried out in a forward-filtering backward-smoothing procedure that can be viewed as the nonlinear, non-Gaussian counterpart of standard Kalman filter-based simulation smoothers in the linear Gaussian case. Convergence in the mean squared error sense of the smoothed trajectories is proved, showing the validity of our proposed method. The methods are tested in a substantial application for the processing of speech signals represented by a time-varying autoregression and parameterized in terms of time-varying partial correlation coefficients, comparing the results of our algorithm with those from a simple smoother based on the filtered trajectories. KEY WORDS: Bayesian inference; Non-Gaussian time series; Nonlinear time series; Particle filter; Sequential Monte Carlo; State-space model. 1. INTRODUCTION. In this article we develop Monte Carlo methods for smoothing in general state-space models. To fix notation, consider the standard Markovian state-space model (West and Harrison 1997):

  x_{t+1} ~ f(x_{t+1} | x_t)   (state evolution density),

with filtering performed recursively via

  p(x_{t+1} | y_{1:t+1}) = g(y_{t+1} | x_{t+1}) p(x_{t+1} | y_{1:t}) / p(y_{t+1} | y_{1:t}).

Similarly, smoothing can be performed recursively backward in time using the smoothing formula

  p(x_t | y_{1:T}) = p(x_t | y_{1:t}) ∫ [ f(x_{t+1} | x_t) p(x_{t+1} | y_{1:T}) / p(x_{t+1} | y_{1:t}) ] dx_{t+1}.
  • Measuring Complexity and Predictability of Time Series with Flexible Multiscale Entropy for Sensor Networks
Article. Measuring Complexity and Predictability of Time Series with Flexible Multiscale Entropy for Sensor Networks. Renjie Zhou 1,2, Chen Yang 1,2, Jian Wan 1,2,*, Wei Zhang 1,2, Bo Guan 3,*, Naixue Xiong 4,*. 1 School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China; [email protected] (R.Z.); [email protected] (C.Y.); [email protected] (W.Z.). 2 Key Laboratory of Complex Systems Modeling and Simulation of Ministry of Education, Hangzhou Dianzi University, Hangzhou 310018, China. 3 School of Electronic and Information Engineering, Ningbo University of Technology, Ningbo 315211, China. 4 Department of Mathematics and Computer Science, Northeastern State University, Tahlequah, OK 74464, USA. * Correspondence: [email protected] (J.W.); [email protected] (B.G.); [email protected] (N.X.); Tel.: +1-404-645-4067 (N.X.). Academic Editor: Mohamed F. Younis. Received: 15 November 2016; Accepted: 24 March 2017; Published: 6 April 2017. Abstract: Measurement of time series complexity and predictability is sometimes the cornerstone for proposing solutions to topology and congestion control problems in sensor networks. As a method of measuring time series complexity and predictability, multiscale entropy (MSE) has been widely applied in many fields. However, sample entropy, the fundamental component of MSE, scores the similarity of two subsequences of a time series as either zero or one, with no in-between values, which causes sudden changes of entropy values even when the time series changes only slightly. This problem becomes especially severe when the time series is short. To solve this problem, we propose flexible multiscale entropy (FMSE), which introduces a novel similarity function that measures the similarity of two subsequences with full-range values from zero to one, and thus increases the reliability and stability of measuring time series complexity.
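For contrast with the smooth similarity function proposed in the article, the rigid zero-or-one matching of standard sample entropy, the building block of MSE, looks as follows in a direct O(n²) Python sketch (the choices m = 2 and r = 0.2·std are conventional, not taken from the paper):

    import numpy as np

    def sample_entropy(x, m=2, r_frac=0.2):
        """Standard SampEn: -ln(A/B), where B counts template pairs of length
        m within tolerance r (a hard 0/1 decision) and A does the same for
        length m + 1."""
        x = np.asarray(x, dtype=float)
        r = r_frac * x.std()
        n_templates = len(x) - m   # same number of templates for both lengths

        def count_pairs(length):
            templates = np.array([x[i:i + length] for i in range(n_templates)])
            count = 0
            for i in range(n_templates - 1):
                # Chebyshev distance from template i to all later templates.
                d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
                count += np.sum(d <= r)   # the rigid zero-or-one similarity
            return count

        B, A = count_pairs(m), count_pairs(m + 1)
        return -np.log(A / B)

    rng = np.random.default_rng(6)
    print(sample_entropy(rng.normal(size=500)))        # irregular: higher
    print(sample_entropy(np.sin(np.arange(500) / 5)))  # regular: lower

FMSE replaces the hard `d <= r` decision with a graded similarity in [0, 1], which is what stabilizes the entropy estimate for short series.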
  • Time Series Analysis 1
Time Series Analysis. 1. Stationary ARMA models. Andrew Lesniewski, Baruch College, New York, Fall 2019. Outline: 1. Basic concepts; 2. Autoregressive models; 3. Moving average models. A time series is a sequence of data points X_t indexed by a discrete set of (ordered) dates t, where −∞ < t < ∞. Each X_t can be a simple number or a complex multi-dimensional object (vector, matrix, higher-dimensional array, or more general structure). We will be assuming that the times t are equally spaced throughout, and denote the time increment by h (e.g. second, day, month). Unless specified otherwise, we will be choosing the units of time so that h = 1. Typically, time series exhibit significant irregularities, which may have their origin either in the nature of the underlying quantity or in imprecision of observation (or both). Examples of time series commonly encountered in finance include: (i) prices, (ii) returns, (iii) index levels, (iv) trading volumes, (v) open interests, (vi) macroeconomic data (inflation, new payrolls, unemployment, GDP, housing prices, ...). For modeling purposes, we assume that the elements of a time series are random variables on some underlying probability space. Time series analysis is a set of mathematical methodologies for analyzing observed time series, whose purpose is to extract useful characteristics of the data. These methodologies fall into two broad categories: (i) non-parametric, where the stochastic law of the time series is not explicitly specified; (ii) parametric, where the stochastic law of the time series is assumed to be given by a model with a finite (and preferably tractable) number of parameters.
  • Visualisation of Financial Time Series by Linear Principal Component Analysis and Nonlinear Principal Component Analysis
Visualisation of financial time series by linear principal component analysis and nonlinear principal component analysis. Thesis submitted at the University of Leicester in partial fulfilment of the requirements for the MSc degree in Financial Mathematics and Computation by HAO-CHE CHEN, Department of Mathematics, University of Leicester, September 2014. Supervisor: Professor Alexander Gorban.

Contents
Declaration iii
Abstract iv
Introduction 1
1. Technical analysis review 5
  1.1 Introduction 5
  1.2 Technical analysis 5
  1.3 The Critics 7
    1.3.1 Market efficiency 8
  1.4 Concepts of Technical analysis 10
    1.4.1 Market price trends 10
    1.4.2 Support and Resistance 12
    1.4.3 Volume 12
    1.4.4 Stock chart patterns 13
      1.4.4.1 Reversal and Continuation 13
      1.4.4.2 Head and shoulders 16
      1.4.4.3 Cup and handle 17
      1.4.4.4 Rounding bottom 18
    1.4.5 Moving Averages (MAs) 19
  1.5 Strategies 20
    1.5.1 Strategy 1 - Check trend day-by-day 22
    1.5.2 Strategy 2 - Check trend with counters 25
    1.5.3 Strategy 3 - Volume and MAs 28
    1.5.4 Conclusion of Strategies 31
2. Linear and nonlinear principal components for visualisation of log-return time series 32
  2.1 Dataset 32
  2.2 The problem of visualisation 33
  2.3 Principal components for visualisation of log-return 33
  2.4 Nonlinear principal components methods 40
    2.4.1 Kernel PCA 41
    2.4.2 Elastic map 41
  2.5 Nonlinear principal components analysis for visualisation of log-return time series 43
  2.6 Conclusion 47
3.
  • Discrete Wavelet Transform-Based Time Series Analysis and Mining
6. Discrete Wavelet Transform-Based Time Series Analysis and Mining. PIMWADEE CHAOVALIT, National Science and Technology Development Agency; ARYYA GANGOPADHYAY, GEORGE KARABATIS, and ZHIYUAN CHEN, University of Maryland, Baltimore County. Time series are recorded values of an interesting phenomenon such as stock prices, household incomes, or patient heart rates over a period of time. Time series data mining focuses on discovering interesting patterns in such data. This article introduces a wavelet-based time series data analysis to interested readers. It provides a systematic survey of various analysis techniques that use discrete wavelet transformation (DWT) in time series data mining, and outlines the benefits of this approach demonstrated by previous studies performed on diverse application domains, including image classification, multimedia retrieval, and computer network anomaly detection. Categories and Subject Descriptors: A.1 [Introductory and Survey]; G.3 [Probability and Statistics]: Time series analysis; H.2.8 [Database Management]: Database Applications - Data mining; I.5.4 [Pattern Recognition]: Applications - Signal processing, waveform analysis. General Terms: Algorithms, Experimentation, Measurement, Performance. Additional Key Words and Phrases: Classification, clustering, anomaly detection, similarity search, prediction, data transformation, dimensionality reduction, noise filtering, data compression. ACM Reference Format: Chaovalit, P., Gangopadhyay, A., Karabatis, G., and Chen, Z. 2011. Discrete wavelet transform-based time series analysis and mining. ACM Comput. Surv. 43, 2, Article 6 (January 2011), 37 pages. DOI = 10.1145/1883612.1883613 http://doi.acm.org/10.1145/1883612.1883613. 1. INTRODUCTION. A time series is a sequence of data that represent recorded values of a phenomenon over time. Time series data constitutes a large portion of the data stored in real world databases [Agrawal et al.
  • Introduction to Time Series Analysis. Lecture 19
Introduction to Time Series Analysis. Lecture 19. 1. Review: Spectral density estimation, sample autocovariance. 2. The periodogram and sample autocovariance. 3. Asymptotics of the periodogram.

Estimating the Spectrum: Outline
• We have seen that the spectral density gives an alternative view of stationary time series.
• Given a realization x_1, ..., x_n of a time series, how can we estimate the spectral density?
• One approach: replace γ(·) in the definition

  f(ν) = Σ_{h=−∞}^{∞} γ(h) e^{−2πiνh}

  with the sample autocovariance γ̂(·).
• Another approach, called the periodogram: compute I(ν), the squared modulus of the discrete Fourier transform (at frequencies ν = k/n).
• These two approaches are identical at the Fourier frequencies ν = k/n.
• The asymptotic expectation of the periodogram I(ν) is f(ν). We can derive some asymptotic properties, and hence do hypothesis testing.
• Unfortunately, the asymptotic variance of I(ν) is constant: it is not a consistent estimator of f(ν).

Review: Spectral density estimation. If a time series {X_t} has autocovariance γ satisfying Σ_{h=−∞}^{∞} |γ(h)| < ∞, then we define its spectral density as

  f(ν) = Σ_{h=−∞}^{∞} γ(h) e^{−2πiνh}   for −∞ < ν < ∞.

Review: Sample autocovariance. Idea: use the sample autocovariance γ̂(·), defined by

  γ̂(h) = (1/n) Σ_{t=1}^{n−|h|} (x_{t+|h|} − x̄)(x_t − x̄),   for −n < h < n,

as an estimate of the autocovariance γ(·), and then use

  f̂(ν) = Σ_{h=−n+1}^{n−1} γ̂(h) e^{−2πiνh}   for −1/2 ≤ ν ≤ 1/2.

Discrete Fourier transform. For a sequence (x_1, ..., x_n), define the discrete Fourier transform (DFT) as (X(ν_0), X(ν_1), ..., X(ν_{n−1})), where

  X(ν_k) = (1/√n) Σ_{t=1}^{n} x_t e^{−2πiν_k t},

and ν_k = k/n (for k = 0, 1, ..., n−1) are called the Fourier frequencies.
  • Principal Component Analysis for Second-Order Stationary Vector Time Series
Submitted to the Annals of Statistics. PRINCIPAL COMPONENT ANALYSIS FOR SECOND-ORDER STATIONARY VECTOR TIME SERIES. By Jinyuan Chang1,*, Bin Guo1,† and Qiwei Yao2,‡. Southwestern University of Finance and Economics1, and London School of Economics and Political Science2. We extend the principal component analysis (PCA) to second-order stationary vector time series in the sense that we seek for a contemporaneous linear transformation for a p-variate time series such that the transformed series is segmented into several lower-dimensional subseries, and those subseries are uncorrelated with each other both contemporaneously and serially. Therefore those lower-dimensional series can be analysed separately as far as the linear dynamic structure is concerned. Technically it boils down to an eigenanalysis for a positive definite matrix. When p is large, an additional step is required to perform a permutation in terms of either maximum cross-correlations or FDR based on multiple tests. The asymptotic theory is established for both fixed p and diverging p when the sample size n tends to infinity. Numerical experiments with both simulated and real data sets indicate that the proposed method is an effective initial step in analysing multiple time series data, which leads to substantial dimension reduction in modelling and forecasting high-dimensional linear dynamical structures. Unlike PCA for independent data, there is no guarantee that the required linear transformation exists. When it does not, the proposed method provides an approximate segmentation which leads to the advantages in, for example, forecasting for future values. The method can also be adapted to segment multiple volatility processes.