<<

02 Stationary

Andrius Buteikis, [email protected], http://web.vu.lt/mif/a.buteikis/

Introduction

All time series may be divided into two big classes - stationary and non-stationary.

A stationary time series is a random process with a constant mean, variance and covariance. Examples of stationary time series:

[Figure: three simulated stationary series over Time - WN (mean = 0), MA(3) (mean = 5) and AR(1) (mean = 5)]

The three example processes fluctuate around their constant mean values. Looking at the graphs, the fluctuations in the first two seem to be constant, while for the third this is not so apparent. If we plot the last time series for a longer time period:

[Figure: the same AR(1) series (mean = 5) plotted over 200 and over 400 time periods]

We can see that the fluctuations are indeed around a constant mean and the variance does not appear to change throughout the period. Some non-stationary time series examples:

- Yt = t + εt, where εt ∼ N(0, 1);
- Yt = εt · t, where εt ∼ N(0, σ²);
- Yt = Σ_{j=1}^t Zj, where the independent variables Zj are either 1 or −1, each with probability 0.5.

The reasons for their non-stationarity are as follows:

- The first time series is not stationary because its mean is not constant: EYt = t depends on t;
- The second time series is not stationary because its variance is not constant: Var(Yt) = t²·σ² depends on t. However, EYt = 0 · t = 0 is constant;
- The third time series is not stationary because, even though EYt = Σ_{j=1}^t (0.5 · 1 + 0.5 · (−1)) = 0, the variance Var(Yt) = E(Yt²) − (E(Yt))² = E(Yt²) = t, where:

E(Yt²) = Σ_{j=1}^t E(Zj²) + 2·Σ_{j≠k} E(Zj Zk) = t · (0.5 · 1² + 0.5 · (−1)²) = t

The sample graphs are provided below:
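A minimal R sketch (with illustrative, arbitrary settings) of how these three non-stationary series could be simulated:

# Sketch: simulate the three non-stationary examples above
set.seed(123)
n <- 50
t <- 1:n
ns1 <- t + rnorm(n)                                 # non-stationary in mean: Yt = t + eps_t
ns2 <- rnorm(n) * t                                 # non-stationary in variance: Yt = eps_t * t
ns3 <- cumsum(sample(c(-1, 1), n, replace = TRUE))  # cumulative sum of +/-1 steps
par(mfrow = c(1, 3))
plot.ts(ns1); plot.ts(ns2); plot.ts(ns3)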

[Figure: the three non-stationary series - non-stationary in mean, non-stationary in variance, no clear tendency]

White noise (WN) - a stationary process of uncorrelated (sometimes we may demand the stronger property of independence) random variables with zero mean and constant variance. White noise is a model of an absolutely chaotic process of uncorrelated observations - it is a process that immediately forgets its past.

How can we know which of the previous three stationary processes are not WN? Two functions help us determine this:

- ACF - Autocorrelation function
- PACF - Partial autocorrelation function

If all the bars (except the 0th in the ACF) are within the blue band, the stationary process is WN.

[Figure: sample ACF (top row) and PACF (bottom row) of the WN, MA(3) and AR(1) series]

The 95% confidence intervals are calculated from:

qnorm(p = c(0.025, 0.975))/sqrt(n)

(more details on the confidence interval calculation are provided later in these slides)

par(mfrow = c(1,2))
set.seed(10)
n = 50
x0 <- rnorm(n)
acf(x0)
abline(h = qnorm(c(0.025, 0.975))/sqrt(n), col = "red")
pacf(x0)
abline(h = qnorm(c(0.025, 0.975))/sqrt(n), col = "red")

[Figure: sample ACF and PACF of the simulated WN series x0, with the ±1.96/√n bounds]

To decide whether a time series is stationary, examine its graph.

To decide whether a stationary time series is WN, examine its ACF and PACF.

Covariance-Stationary Time Series

- In cross-sectional data, different observations were assumed to be uncorrelated;
- In time series we require that there be some dynamics, some persistence, some way in which the present is linked to the past and the future to the present. Having historical data would then allow us to forecast the future.

If we want to forecast a series - at a minimum we would like its mean and covariance structure to be stable over time. In that case, we would say that the series is covariance stationary. There are two requirements for this to be true:

1. The mean of the series is stable over time: EYt = µ;
2. The covariance structure is stable over time.

In general, the (auto)covariance between Yt and Yt−τ is:

γ(t, τ) = cov(Yt, Yt−τ) = E(Yt − µ)(Yt−τ − µ)

If the covariance structure is stable, then the covariance depends on τ but not on t: γ(t, τ) = γ(τ). Note: γ(0) = Cov(Yt, Yt) = Var(Yt) < ∞.

Remark

When observing/measuring time series we obtain numbers y1, ..., yT which are the realization of random variables Y1, ..., YT . Using probabilistic concepts, we can give a more precise definition of a (weak) stationary series:

- If EYt = µ - the process is called mean-stationary;
- If Var(Yt) = σ² < ∞ - the process is called variance-stationary;
- If γ(t, τ) = γ(τ) - the process is called covariance-stationary.

In other words, a time series Yt is stationary if its mean, variance and covariance do not depend on t. If at least one of the three requirements is not met, then the process is non-stationary. Since we often work with the (auto)correlation between Yt and Yt−τ rather than the (auto)covariance (because correlations are easier to interpret), we can calculate the autocorrelation function (ACF):

ρ(τ) = cov(Yt, Yt−τ) / √(Var(Yt)Var(Yt−τ)) = γ(τ)/γ(0)

Note: ρ(0) = 1, |ρ(τ)| ≤ 1. The partial autocorrelation function (PACF) measures the association between Yt and Yt−k :

p(k) = βk, where Yt = α + β1Yt−1 + ... + βkYt−k + εt

The autocorrelation coefficient at lag k, rk, is asymptotically normally distributed, and its variance can be approximated by Var(rk) ≈ 1/T (where T is the number of observations). As such, we want to create lower and upper 95% confidence bounds for the N(0, 1/T) distribution, whose standard deviation is 1/√T. The 95% confidence interval (of a stationary time series) is:

∆ = 0 ± 1.96/√T

In general, the critical value of a standard normal distribution and its confidence interval can be found in these steps:

- Compute α = (1 − Q)/2, where Q is the confidence level;
- To express the critical value as a z-score, find the z_{1−α} value.

For example, if Q = 0.95, then α = 0.025 and the standard normal distribution's 1 − α = 0.975 quantile is z_{0.975} ≈ 1.96.

White Noise

White noise processes are the fundamental building blocks of all stationary time series. We denote it εt ∼ WN(0, σ²) - a zero mean, constant variance and serially uncorrelated (ρ(t, τ) = 0 for τ > 0 and any t) process. Sometimes we demand the stronger property of independence. From the definition it follows that:

- E(εt) = 0;
- Var(εt) = σ² < ∞;
- γ(t, τ) = E(εt − Eεt)(εt−τ − Eεt−τ) = E(εt εt−τ), where:

E(εt εt−τ) = σ², if τ = 0; and E(εt εt−τ) = 0, if τ ≠ 0.

Example: how to check whether a process is stationary.

Let us check whether Yt = εt + β1εt−1, where εt ∼ WN(0, σ²), is stationary:

1. EYt = E(εt + β1εt−1) = 0 + β1 · 0 = 0;
2. Var(Yt) = Var(εt + β1εt−1) = σ² + β1²σ² = σ²(1 + β1²);
3. The autocovariance for τ > 0:

γ(t, τ) = E(Yt Yt−τ) = E(εt + β1εt−1)(εt−τ + β1εt−τ−1)
= Eεtεt−τ + β1Eεtεt−τ−1 + β1Eεt−1εt−τ + β1²Eεt−1εt−τ−1
= β1σ², if τ = 1; and 0, if τ > 1

None of these characteristics depend on t, which means that the process is stationary. This process has a very short memory (i.e. if Yt and Yt+τ are separated by more than one time period, they are uncorrelated). On the other hand, this process is not a WN.

The Lag Operator

The lag operator L is used to lag a time series: LYt = Yt−1. Similarly: L²Yt = L(LYt) = L(Yt−1) = Yt−2, etc. In general, we can write: L^p Yt = Yt−p. Typically, we operate on a time series with a polynomial in the lag operator. A lag operator polynomial of degree m is:

B(L) = β0 + β1L + β2L² + ... + βmL^m

For example, if B(L) = 1 + 0.9L − 0.6L², then:

B(L)Yt = Yt + 0.9Yt−1 − 0.6Yt−2

A well known operator - the first-difference operator ∆ - is a first-order polynomial in the lag operator: ∆Yt = Yt − Yt−1 = (1 − L)Yt, i.e. B(L) = 1 − L. We can also write an infinite-order lag operator polynomial as:

B(L) = β0 + β1L + β2L² + ... = Σ_{j=0}^∞ βj L^j

The General Linear Process

Wold's representation theorem points to the appropriate model for stationary processes.

Wold's Representation Theorem

Let {Yt } be any zero-mean covariance-stationary process. Then we can write it as:

Yt = B(L)εt = Σ_{j=0}^∞ βj εt−j,   εt ∼ WN(0, σ²)

where β0 = 1 and Σ_{j=0}^∞ βj² < ∞. On the other hand, any process of the above form is stationary.

- If β1 = β2 = ... = 0, this corresponds to a WN process. This shows once again that WN is a stationary process.
- If βk = φ^k, then since 1 + φ + φ² + ... = 1/(1 − φ) < ∞ for |φ| < 1, the process Yt = εt + φεt−1 + φ²εt−2 + ... is a stationary process.

In Wold's theorem we assumed a zero mean, though this is not as restrictive as it may seem. Whenever you see Yt, analyse the process Yt − µ, so that the process is expressed in deviations from its mean. The deviation from the mean has a zero mean by construction, so there is no loss of generality in analyzing zero-mean processes. Wold's representation theorem points to the importance of models with infinite distributed (weighted) lags. Infinite distributed lag models are not of immediate practical use in general, since they contain infinitely many parameters; however, this need not be a problem: from the previous slide, βk = φ^k, so the infinite polynomial B(L) depends on only one parameter.

Estimation and Inference for the Mean, ACF and PACF

Suppose we have sample data of a stationary time series but we do not know the true model that generated the data (we only know that it was a polynomial B(L)), nor the mean, ACF or PACF associated with the model. We want to use the data to estimate the mean, ACF and PACF, which we might use to help us decide on a suitable model to fit to the data.

Sample Mean

The mean of a stationary series is EYt = µ. A fundamental principle of estimation, called the analog principle, suggests that we develop estimators by replacing expectations with sample averages. Thus, our estimator of the population mean, given a sample of size T, is the sample mean:

Ȳ = (1/T) Σ_{t=1}^T Yt

Typically, we are not interested in estimating the mean itself, but it is needed for estimating the autocorrelation function.

Sample Autocorrelation

The autocorrelation at displacement, or lag, τ for the covariance stationary series {Yt} is:

ρ(τ) = E[(Yt − µ)(Yt−τ − µ)] / E[(Yt − µ)²]

Application of the analog principle yields a natural estimator of ρ(τ):

ρ̂(τ) = [ (1/T) Σ_{t=1}^T (Yt − Ȳ)(Yt−τ − Ȳ) ] / [ (1/T) Σ_{t=1}^T (Yt − Ȳ)² ]

This estimator is called the sample autocorrelation function (sample ACF). It is often of interest to assess whether a series is reasonably approximated as white noise, i.e. whether all of its autocorrelations are zero in population.

If a series is white noise, then the sample autocorrelations ρ̂(τ), τ = 1, ..., K, in large samples are independent and approximately N(0, 1/T) distributed (i.e. their standard deviation is 1/√T).

Thus, if the series is WN, ~95% of the sample autocorrelations should fall in the interval ±1.96/√T. Exactly the same holds for both the sample ACF and the sample PACF. We typically plot the sample ACF and sample PACF along with their error bands. The aforementioned error bands provide 95% confidence bounds only for the sample autocorrelations taken one at a time. We are often interested in whether a series is white noise, i.e. whether all its autocorrelations are jointly zero. Because of the finite sample size, we can only take a finite number of autocorrelations. We want to test:

H0 : ρ(1) = 0, ρ(2) = 0, ..., ρ(k) = 0

Under the null hypothesis, the Ljung-Box statistic:

Q = T(T + 2) Σ_{τ=1}^k ρ̂²(τ)/(T − τ)

is approximately distributed as a χ²_k random variable. To test the null hypothesis, we calculate the p-value = P(χ²_k > Q): if the p-value < 0.05, we reject the null hypothesis H0 and conclude that Yt is not white noise.
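As an illustration, the statistic can be computed by hand and compared with R's built-in Box.test() (a sketch on an arbitrary simulated WN series):

# Sketch: Ljung-Box statistic computed manually vs. Box.test()
set.seed(10)
T <- 200
x <- rnorm(T)
k <- 10
rho <- acf(x, lag.max = k, plot = FALSE)$acf[-1]    # sample ACF at lags 1..k
Q <- T * (T + 2) * sum(rho^2 / (T - 1:k))
c(Q = Q, p.value = pchisq(Q, df = k, lower.tail = FALSE))
Box.test(x, lag = k, type = "Ljung-Box")            # should match Q and the p-value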

Example: Canadian employment data

We will illustrate the ideas presented above by examining quarterly Canadian employment index data. The data is seasonally adjusted and displays no trend; however, it does appear to be highly serially correlated…

suppressPackageStartupMessages({require("forecast")})
txt1 <- "http://uosis.mif.vu.lt/~rlapinskas/(data%20R&GRETL/"
txt2 <- "caemp.txt"
caemp <- read.csv(url(paste0(txt1, txt2)), header = TRUE, as.is = TRUE)
caemp <- ts(caemp, start = c(1960,1), freq = 4)
tsdisplay(caemp)

[Figure: tsdisplay(caemp) - the caemp series (1960-1995) with its sample ACF and PACF]

- The sample ACF values are large and display a slow one-sided decay;
- The sample PACF values are large at first, but are statistically negligible beyond displacement τ = 2.

We shall once again test the WN hypothesis, this time using the Ljung-Box statistic.

Box.test(caemp, lag =1, type = "Ljung-Box")

##
## Box-Ljung test
##
## data: caemp
## X-squared = 127.73, df = 1, p-value < 2.2e-16

With p < 0.05, we reject the null hypothesis H0 : ρ(1) = 0.

Box.test(caemp, lag = 2, type = "Ljung-Box")

##
## Box-Ljung test
##
## data: caemp
## X-squared = 240.45, df = 2, p-value < 2.2e-16

With p < 0.05, we reject the null hypothesis H0 : ρ(1) = 0, ρ(2) = 0, and so on. We can see that the time series is not a WN. We will now present a few more examples of stationary processes.

Moving-Average (MA) Models

Finite-order moving-average processes are approximations to the Wold representation (an infinite-order process). The fact that all variation in time series, one way or another, is driven by shocks of various sorts suggests the possibility of modelling time series directly as distributed lags of current and past shocks - as moving-average processes.

The MA(1) Process

The first-order moving average or MA(1) process is:

Yt = εt + θεt−1 = (1 + θL)εt,   −∞ < θ < ∞,   εt ∼ WN(0, σ²)

Defining characteristic of an MA process: the current value of the observed series can be expressed as a function of current and lagged unobservable shocks εt. Whatever the value of θ (as long as |θ| < ∞), MA(1) is always a stationary process and:

- E(Yt) = E(εt) + θE(εt−1) = 0;
- Var(Yt) = Var(εt) + θ²Var(εt−1) = (1 + θ²)σ²;
- ρ(τ) = 1, if τ = 0; ρ(τ) = θ/(1 + θ²), if τ = 1; ρ(τ) = 0, otherwise.

Key feature of MA(1): the (sample) ACF has a sharp cutoff beyond τ = 1. We can write MA(1) another way. Since:

Yt = (1 + θL)εt ⇒ εt = (1/(1 + θL)) Yt

recalling the formula of a geometric series, if |θ| < 1:

εt = (1 − θL + θ²L² − θ³L³ + ...)Yt = Yt − θYt−1 + θ²Yt−2 − θ³Yt−3 + ...

and we can express Yt as an infinite AR process:

Yt = θYt−1 − θ²Yt−2 + θ³Yt−3 − ... + εt = Σ_{j=1}^∞ (−1)^{j+1} θ^j Yt−j + εt

Remembering the definition of a PACF we have that for an MA(1) process it will decay gradually to zero.

- If θ < 0, then the pattern of decay will be one-sided;
- If 0 < θ < 1, then the pattern of decay will be oscillating.

An example of how the sample ACF and PACF of MA(1) processes might look:
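A sketch of how such MA(1) series could be simulated and their correlograms drawn (using arima.sim; the sample size is an arbitrary choice):

set.seed(1)
n <- 200
y_pos <- arima.sim(model = list(ma =  0.5), n = n)   # MA(1), theta = 0.5
y_neg <- arima.sim(model = list(ma = -0.5), n = n)   # MA(1), theta = -0.5
par(mfrow = c(2, 2))
acf(y_pos);  pacf(y_pos)
acf(y_neg);  pacf(y_neg)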

[Figure: sample ACF and PACF of simulated MA(1) series with θ = 0.5 and θ = −0.5]

The MA(q) Process

We will now consider a general finite-order moving average process of order q, MA(q):

Yt = εt + θ1εt−1 + ... + θqεt−q = Θ(L)εt,   −∞ < θi < ∞,   εt ∼ WN(0, σ²)

where Θ(L) = 1 + θ1L + ... + θqL^q is the qth-order lag polynomial. The MA(q) process is a generalization of the MA(1) process. Compared to MA(1), MA(q) can capture richer dynamic patterns, which can be used for improved forecasting. The properties of an MA(q) process are parallel to those of an MA(1) process in all respects:

- The finite-order MA(q) process is covariance stationary for any value of its parameters (|θj| < ∞, j = 1, ..., q);
- In the MA(q) case, all autocorrelations in the ACF beyond displacement q are 0 (a distinctive property of the MA process);
- The PACF of the MA(q) decays gradually, in accordance with the infinite autoregressive representation, similar to MA(1): Yt = a1Yt−1 + a2Yt−2 + ... + εt (with certain conditions on the aj).

An example of how the sample ACF and PACF of an MA(3) process might look:

[Figure: sample ACF and PACF of a simulated MA(3) process with θ1 = 1.2, θ2 = 0.65, θ3 = −0.35]

The ACF cuts off beyond τ = 3 and the PACF decays gradually.

Autoregressive (AR) Models

The autoregressive process is also a natural approximation of the Wold representation. We have seen that, under certain conditions, a moving-average process has an autoregressive representation. So, an autoregressive process is, in a sense, the same as a moving average process.

The AR(1) Process

The first-order autoregressive or AR(1) process is:

Yt = φYt−1 + εt,   εt ∼ WN(0, σ²)

or:

(1 − φL)Yt = εt ⇒ Yt = (1/(1 − φL)) εt

Note the special interpretation of the errors, or disturbances, or shocks εt in time series theory: in contrast to regression theory, where they were understood as the summary of all unobserved X's, now they are treated as economic effects which have developed in period t. As we will see when analyzing the ACF, the AR(1) model is capable of capturing much more persistent dynamics (depending on its parameter value) than the MA(1) model, which has a very short memory regardless of its parameter value. Recall that a finite-order moving-average process is always covariance stationary, but that certain conditions must be satisfied for AR(1) to be stationary. The AR(1) process can be rewritten as:

Yt = (1/(1 − φL)) εt = (1 + φL + φ²L² + ...)εt = εt + φεt−1 + φ²εt−2 + ...

This Wold moving-average representation for Yt is convergent if |φ| < 1, thus:

AR(1) is stationary if |φ| < 1

Equivalently, the condition for covariance stationarity is that the root, z1, of the autoregressive lag operator polynomial (i.e. 1 − φz1 = 0 ⇔ z1 = 1/φ) be greater than 1 in absolute value (a similar condition on the roots is important for the AR(p) case). We can also get the above equation by recursively applying the equation of AR(1) to get the infinite MA process:

Yt = φYt−1 + εt = φ(φYt−2 + εt−1) + εt = εt + φεt−1 + φ²Yt−2 = ... = Σ_{j=0}^∞ φ^j εt−j

From the moving average representation of the covariance stationary AR(1) process:

- E(Yt) = E(εt + φεt−1 + φ²εt−2 + ...) = 0;
- Var(Yt) = Var(εt) + φ²Var(εt−1) + ... = σ²/(1 − φ²);

Or, alternatively: when |φ| < 1, the process is stationary, i.e. EYt = m, therefore EYt = φEYt−1 + Eεt ⇒ m = φm + 0 ⇒ m = 0. This allows us to easily estimate the mean of the generalized AR(1) process: if Yt = α + φYt−1 + εt, then m = α/(1 − φ). The ACF and PACF of AR(1) are, in a sense, symmetric to those of MA(1):

- ρ(τ) = φ^τ, τ = 0, 1, 2, ... - the ACF decays exponentially;
- p(τ) = φ, if τ = 1, and p(τ) = 0, if τ > 1 - the PACF cuts off abruptly.

An example of how the sample ACF and PACF of an AR(1) process might look:

[Figure: sample ACF and PACF of a simulated AR(1) process with φ = 0.85]

The AR(p) Process

The general pth order autoregressive process, AR(p), is:

Yt = φ1Yt−1 + φ2Yt−2 + ... + φpYt−p + εt,   εt ∼ WN(0, σ²)

In lag operator form, we write:

Φ(L)Yt = (1 − φ1L − φ2L² − ... − φpL^p)Yt = εt

Similar to the AR(1) case, the AR(p) process is covariance stationary if and only if all the roots zi of the autoregressive lag operator polynomial Φ(z) are outside the complex unit circle:

1 − φ1z − φ2z² − ... − φpz^p = 0 ⇒ |zi| > 1

So:

AR(p) is stationary if all the roots satisfy |zi| > 1

For a quick check of stationarity, use the following rule:

If Σ_{i=1}^p φi ≥ 1, the process isn't stationary.

In the covariance stationary case, we can write the process in the infinite moving average MA(∞) form:

Yt = (1/Φ(L)) εt

- The ACF of the general AR(p) process decays gradually as the lag increases;
- The PACF of the general AR(p) process has a sharp cutoff at displacement p.

An example of how the sample ACF and PACF of the AR(2) process Yt = 1.5Yt−1 − 0.9Yt−2 + εt might look:

[Figure: sample ACF and PACF of the simulated AR(2) process with φ1 = 1.5, φ2 = −0.9]

The corresponding lag operator polynomial is 1 − 1.5L + 0.9L² with two complex conjugate roots: z1,2 = 0.83 ± 0.65i, |z1,2| = √(0.83² + 0.65²) = 1.05423 > 1 - thus the process is stationary. The ACF of an AR(2) is:

ρ(τ) = 1, if τ = 0;
ρ(τ) = φ1/(1 − φ2), if τ = 1;
ρ(τ) = φ1ρ(τ − 1) + φ2ρ(τ − 2), if τ = 2, 3, ...

Because the roots are complex, the ACF oscillates, and because the roots are close to the unit circle, the oscillation damps slowly.

Stationarity and Invertibility

The AR(p) is a generalization of the AR(1) strategy for approximating the Wold representation. The moving-average representation associated with the stationary AR(p) process:

Yt = (1/Φ(L)) εt,   where 1/Φ(L) = Σ_{j=0}^∞ ψj L^j,   ψ0 = 1

depends on p parameters only. This gives us the infinite process from Wold’s Representation Theorem:

Yt = Σ_{j=0}^∞ ψj εt−j

which is known as the infinite moving-average process, MA(∞). Because the AR process is stationary, Σ_{j=0}^∞ ψj² < ∞ and Yt takes finite values. Thus, a stationary AR process can be rewritten as an MA(∞) process.

Stationarity and Invertibility

In some cases the AR form of a stationary process is preferred to the MA form. Just as we can write an AR process as an MA(∞), we can write an MA process as an AR(∞). The necessary definition says that an MA process is called invertible if it can be expressed as an AR process. So, the MA(q) process:

Yt = εt + θ1εt−1 + ... + θqεt−q = Θ(L)εt,   −∞ < θi < ∞,   εt ∼ WN(0, σ²)

is invertible if all the roots of Θ(x) = 1 + θ1x + ... + θqx^q lie outside the unit circle:

1 + θ1x + ... + θqx^q = 0 ⇒ |xi| > 1

Stationarity and Invertibility

Then we can write the process as:

εt = (1/Θ(L)) Yt,   where 1/Θ(L) = Σ_{j=0}^∞ πj L^j,   π0 = 1

εt = Σ_{j=0}^∞ πj Yt−j = Yt + Σ_{j=1}^∞ πj Yt−j

which gives us the infinite-order autoregressive process, AR(∞):

Yt = Σ_{j=1}^∞ π̃j Yt−j + εt,   where π̃j = −πj

Because the MA process is invertible, the infinite series converges to a finite value.

For example, the MA(1) of the form Yt = εt − εt−1 is not invertible, since 1 − x = 0 ⇒ x = 1.

Autoregressive Moving-Average (ARMA) Models

AR and MA models are often combined in attempts to obtain better approximations to the Wold representation. The result is the ARMA(p,q) process. The motivation for using ARMA models is as follows:

- If the random shock that drives an AR process is itself an MA process, then we obtain an ARMA process;
- ARMA processes arise from aggregation - sums of AR processes, sums of AR and MA processes;
- AR processes observed subject to measurement error also turn out to be ARMA processes.

ARMA(1,1) process

The simplest ARMA process that is not a pure AR or pure MA is the ARMA(1,1) process:

Yt = φYt−1 + εt + θεt−1,   εt ∼ WN(0, σ²)

or in lag operator form:

(1 − φL)Yt = (1 + θL)εt

where:

1. |φ| < 1 - required for stationarity;
2. |θ| < 1 - required for invertibility.

If the covariance stationarity conditions are satisfied, then we have the MA representation:

Yt = ((1 + θL)/(1 − φL)) εt = εt + b1εt−1 + b2εt−2 + ...

which is an infinite distributed lag of current and past innovations. Similarly, we can rewrite it in the infinite AR form:

((1 − φL)/(1 + θL)) Yt = Yt + a1Yt−1 + a2Yt−2 + ... = εt

ARMA(p,q) process

A natural generalization of the ARMA(1,1) is the ARMA(p,q) process that allows for multiple moving-average and autoregressive lags. We can write it as:

Yt = φ1Yt−1 + ... + φpYt−p + εt + θ1εt−1 + ... + θqεt−q,   εt ∼ WN(0, σ²)

or:

Φ(L)Yt = Θ(L)εt

- If all the roots of Φ(L) are outside the unit circle, then the process is stationary and has a convergent infinite moving average representation: Yt = (Θ(L)/Φ(L)) εt;
- If all the roots of Θ(L) are outside the unit circle, then the process is invertible and can be expressed as the convergent infinite autoregression: (Φ(L)/Θ(L)) Yt = εt.

An example of an ARMA(1,1) process: Yt = 0.85Yt−1 + εt + 0.5εt−1:

[Figure: sample ACF and PACF of a simulated ARMA(1,1) process with φ = 0.85, θ = 0.5]

ARMA models are often both highly accurate and highly parsimonious. In a particular situation, for example, it might take an AR(5) model to get the same approximation accuracy as could be obtained with an ARMA(1,1), but the AR(5) has five parameters to be estimated, whereas the ARMA(1,1) has only two.

The rule for determining the number of AR and MA terms:
- AR(p) - ACF declines, PACF = 0 if τ > p;
- MA(q) - ACF = 0 if τ > q, PACF declines;
- ARMA(p,q) - both ACF and PACF decline.

Estimation

Autoregressive process parameter estimation

Let's say we want to estimate the parameters of our AR(1) process:

Yt = φYt−1 + εt

- The OLS estimator of φ for the AR(1) case:

φ̂ = ( Σ_{t=1}^T Yt Yt−1 ) / ( Σ_{t=1}^T Yt−1² )

- The Yule-Walker estimator of φ for AR(1) can be calculated by multiplying Yt = φYt−1 + εt by Yt−1 and taking expectations. We get the equation:

γ(1) = φγ(0)

Recall that γ(τ) is the covariance between Yt and Yt−τ. For the AR(p) case, we would need p different equations, i.e.:

γ(k) = φ1γ(k − 1) + ... + φpγ(k − p),   k = 1, ..., p
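A sketch of both estimators for a simulated AR(1) series (φ = 0.7 is an arbitrary illustrative value):

set.seed(42)
y <- arima.sim(model = list(ar = 0.7), n = 500)
phi_ols <- sum(y[-1] * y[-length(y)]) / sum(y[-length(y)]^2)  # OLS: sum(Yt*Yt-1)/sum(Yt-1^2)
phi_yw  <- acf(y, lag.max = 1, plot = FALSE)$acf[2]           # Yule-Walker: gamma(1)/gamma(0) = rho(1)
c(OLS = phi_ols, YuleWalker = phi_yw)
ar.yw(y, order.max = 1, aic = FALSE)$ar                       # built-in Yule-Walker estimator (demeans first)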

Moving-average process parameter estimation

Let's say we want to estimate the parameter of our invertible MA(1) process (i.e. |θ| < 1):

Yt = εt + θεt−1 ⇒ εt = Yt − θYt−1 + ...

Let S(θ) = Σ_{t=1}^T εt² with ε0 = 0. We can find the parameter θ by minimizing S(θ).
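A sketch of this conditional-sum-of-squares idea for a simulated invertible MA(1) (θ = 0.6 is an illustrative value):

set.seed(7)
y <- arima.sim(model = list(ma = 0.6), n = 500)
S <- function(theta, y) {                       # S(theta) = sum of squared epsilon_t
  e <- numeric(length(y))
  e[1] <- y[1]                                  # since epsilon_0 = 0
  for (t in 2:length(y)) e[t] <- y[t] - theta * e[t - 1]
  sum(e^2)
}
optimize(S, interval = c(-0.99, 0.99), y = y)$minimum   # CSS estimate of theta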

ARMA process parameter estimation

For the ARMA(1,1): Yt = φYt−1 + εt + θεt−1, we would need to minimize S(θ, φ) = Σ_{t=1}^T εt² with ε0 = Y0 = 0. For the ARMA(p,q), we would need to minimize S(θ, φ) by setting εk = Yk = 0 for k ≤ 0. We can also estimate the parameters using the maximum likelihood method.

Forecasting

So far we have thought of the information set as containing the available past history of the series, ΩT = {YT, YT−1, ...}, where we imagined the history as having begun in the infinite past. Based upon that information set, we want to find the optimal forecast of Y at some future time T + h.

If Yt is a stationary process, then the forecast tends to the process mean as h increases. Therefore, the forecast is only interesting for several small values of h. Our forecast method is always the same: write out the process for the future time period, T + h and project it on what is known at time T when the forecast is made. We denote the forecast as YT +h|T , h ≥ 1. Point forecasts can be calculated using the following three steps.

1. If needed, expand the equation so that Yt is on the left hand side and all other terms are on the right;
2. Rewrite the equation by replacing t with T + h;
3. On the right hand side of the equation, replace future observations by their forecasts, future errors (εT+j, 0 < j ≤ h) by zero, and past errors by the corresponding residuals.

Forecasting MA(q) process

Consider, for example, an MA(1) process:

Yt = µ + εt + θεt−1,   εt ∼ WN(0, σ²)

We have:

YT+1 = µ + εT+1 + θεT ⇒ YT+1|T = µ + 0 + θεT
YT+2 = µ + εT+2 + θεT+1 ⇒ YT+2|T = µ + 0 + 0
...
YT+h|T = µ

The forecast quickly approaches the (sample) mean of the process and, starting at h = q + 1, coincides with it. As h increases, the accuracy of the forecast diminishes up to h = q + 1, whereupon it becomes constant. An example of an MA(1) process: Yt = εt + 0.5εt−1:
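A sketch of how a forecast plot like the one below could be produced for a fitted MA(1) (assumes the forecast package; simulated data, so the exact picture will differ):

library(forecast)
set.seed(5)
y   <- arima.sim(model = list(ma = 0.5), n = 100)       # MA(1) with theta = 0.5
fit <- Arima(y, order = c(0, 0, 1), include.mean = FALSE)
plot(forecast(fit, h = 20))                             # forecasts level off at the mean after h = 1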

[Figure: forecasts from ARIMA(0,0,1) with zero mean - the point forecast levels off at the mean after one step]

Forecasting AR(p) process

Consider, for example, an AR(1) process:

Yt = φYt−1 + εt,   εt ∼ WN(0, σ²)

We have:

YT+1 = φYT + εT+1 ⇒ YT+1|T = φYT + 0
YT+2 = φYT+1 + εT+2 ⇒ YT+2|T = φYT+1|T + 0 = φ²YT
...
YT+h|T = φ^h YT

The forecast tends to the (sample) mean exponentially fast, but never reaches it. As h increases, the accuracy of the forecast diminishes, approaching a limit but never reaching it. An example of an AR(1) process: Yt = 0.85Yt−1 + εt:

[Figure: forecasts from ARIMA(1,0,0) with zero mean - the point forecast decays towards the mean]

Forecasting ARMA(p,q) process

Consider, for example, an ARMA(1,1) process:

Yt = φYt−1 + εt + θεt−1,   εt ∼ WN(0, σ²)

We have:

YT+1 = φYT + εT+1 + θεT ⇒ YT+1|T = φYT + 0 + θεT
YT+2 = φYT+1 + εT+2 + θεT+1 ⇒ YT+2|T = φYT+1|T + 0 + 0 = φ²YT + φθεT
...
YT+h|T = φ^h YT + φ^{h−1}θεT

Similar to the AR(p) case, the forecast of the ARMA(p,q) process tends to the mean, but never reaches it. An example of an ARMA(1,1) process: Yt = 0.85Yt−1 + εt + 0.5εt−1:

[Figure: forecasts from ARIMA(1,0,1) with zero mean]

- The forecast YT+h|T of an MA(q) process reaches its average in h = q steps and then does not change anymore;
- The forecast YT+h|T of an AR(p) or ARMA(p,q) process tends to the average, but never reaches it. The speed of convergence depends on the coefficients.

Financial Volatility

Consider Yt growing annually at rate r:

Yt = (1 + r)Yt−1 = (1 + r)²Yt−2 = ... = (1 + r)^t Y0 = e^{t·log(1+r)} Y0

The values of Yt lie on an exponential curve:

[Figure: Yt with Y0 = 1 and r = 0.05 - exponential growth over Time]

In order for the model to represent more realistic growth, let us introduce an economic shock component, εt ∼ WN(0, σ²). Thus, our model is now:

Yt = (1 + r + εt)Yt−1 = Π_{s=1}^t (1 + r + εs) · Y0 = e^{Σ_{s=1}^t log(1+r+εs)} · Y0

The values of Yt are again close to the exponential curve:

[Figure: Yt with Y0 = 1, r = 0.05 and εt ∼ WN(0, 0.05²)]

Note: EYt = e^{t·log(1+r)} Y0, thus Yt is not stationary. We can take the differences ∆Yt = Yt − Yt−1, but they are also not stationary. We can also take logarithms and use the equality log(1 + x) ≈ x (from the Taylor expansion of the function around 0):

Ỹt = log Yt = log Y0 + Σ_{s=1}^t log(1 + r + εs) ≈ log Y0 + rt + Σ_{s=1}^t εs

[Figure: log(Yt) over Time]

Ỹt is still not stationary; however, its differences ∆Ỹt = r + εt are stationary.

[Figure: tsdisplay of ∆log(Yt) - the differenced series with its sample ACF and PACF]

The differences, in this case, also have an economic interpretation - they form the series of (logarithmic) returns, i.e. the annual growth of Yt. Stock and bond returns (or similar financial series) can be described as having an average return of r but otherwise being seemingly unpredictable from the past values (i.e. resembling WN): Yt = r + εt, εt ∼ WN(0, σ²). Although the sequence may initially appear to be WN, there is strong evidence to suggest that it is not an independent process.

As such, we shall try to create a model of the residuals et = ε̂t, i.e. the centered returns Yt − Ȳ = Yt − r̂ of real stocks, which possess some interesting empirical properties:

- high volatility events tend to cluster in time (i.e. persistence or inertia of volatility);
- Yt is uncorrelated with its lags, but Yt² is correlated with Yt−1², Yt−2², ...;
- Yt is heavy-tailed, i.e. the right tail of its density decreases slower than that of the Gaussian density (this means that Yt takes big values more often than a Gaussian random variable).

Note: volatility = the conditional standard deviation of the stock return: σt² = Var(rt | Ωt−1), where Ωt−1 is the information set available at time t − 1.

An introductory example:

Let Pt denote the price of a financial asset at time t. Then the log returns Rt = log(Pt) − log(Pt−1) could typically be modeled as a stationary time series. An ARMA model for the series Rt would have the property that the conditional variance of Rt is independent of t. However, in practice this is often not the case. Let's say our Rt data is generated by the following process:

set.seed(346)
n = 1000
alpha = c(1, 0.5)
epsilon = rnorm(mean = 0, sd = 1, n = n)
R.t = NULL
R.t[1] = sqrt(alpha[1]) * epsilon[1]
for(j in 2:n){
  R.t[j] = sqrt(alpha[1] + alpha[2] * R.t[j-1]^2) * epsilon[j]
}

i.e., Rt, t > 1, depends nonlinearly on its past values. If we plot the data and the ACF and PACF plots:

forecast::tsdisplay(R.t)

[Figure: tsdisplay(R.t) - the simulated returns with their sample ACF and PACF]

and perform the Ljung-Box test:

Box.test(R.t, lag = 10, type = "Ljung-Box")$p.value

## [1] 0.9082987

Box.test(R.t, lag = 20, type = "Ljung-Box")$p.value

## [1] 0.3846643

Box.test(R.t, lag = 25, type = "Ljung-Box")$p.value

## [1] 0.4572007

We see that in all cases the p-value > 0.05, so we do not reject the null hypothesis that the autocorrelations are zero. The series appears to be WN. But we know from the data generation code that this is not the case. If we check the ACF and PACF of the squared log-returns, Rt²:

forecast::tsdisplay(R.t^2)

[Figure: tsdisplay(R.t^2) - the squared returns with their sample ACF and PACF]

The squared log-returns are autocorrelated at the first couple of lags. From the Ljung-Box test:

Box.test(R.t^2, lag = 10, type = "Ljung-Box")

##
## Box-Ljung test
##
## data: R.t^2
## X-squared = 174.37, df = 10, p-value < 2.2e-16

Since the p-value < 0.05, we reject the null hypothesis that the autocorrelations of the squared log-returns are zero, i.e. the squared log-returns are autocorrelated.

In comparison, for a simple εt ∼ WN(0, 1) process:

set.seed(123)
epsilon = rnorm(mean = 0, sd = 1, n = 5000)

The εt process is not serially correlated:

par(mfrow = c(1,2))
forecast::Acf(epsilon, lag.max = 20)
forecast::Pacf(epsilon, lag.max = 20)

[Figure: sample ACF and PACF of the simulated epsilon series]

Box.test(epsilon, lag = 10, type = "Ljung-Box")$p.val

## [1] 0.872063

The εt² process is also not serially correlated:

par(mfrow = c(1,2))
forecast::Acf(epsilon^2, lag.max = 20)
forecast::Pacf(epsilon^2, lag.max = 20)

[Figure: sample ACF and PACF of the squared epsilon series]

Box.test(epsilon^2, lag = 10, type = "Ljung-Box")$p.val

## [1] 0.7639204

So, Rt only appeared to be a WN process - the dependence is revealed once we also analyse Rt². The following example stock data contains weekly data for the logarithms of stock prices, log(Pt):

suppressPackageStartupMessages({require(readxl)})
txt1 <- "http://uosis.mif.vu.lt/~rlapinskas/(data%20R&GRETL/"
txt2 <- "stock.xls"
tmp = tempfile(fileext = ".xls")
# Download the file
download.file(url = paste0(txt1, txt2), destfile = tmp, mode = "wb")
# Read it as an excel file
stocks <- read_excel(path = tmp)
plot.ts(stocks$lStock)

[Figure: plot of stocks$lStock over Time]

The differences do not pass the WN checks:

tsdisplay(diff(stocks$lStock))

[Figure: tsdisplay(diff(stocks$lStock)) - the log-return series with its sample ACF and PACF]

Box.test(diff(stocks$lStock), lag = 10)$p.value

## [1] 3.014097e-05

The basic idea behind the study of volatility is that the series is serially uncorrelated, but it is a dependent series. Let us calculate the volatility as ût² from ∆log(Yt) = α + ut:

mdl <- lm(diff(stocks$lStock) ~ 1)
u <- residuals(mdl)
u2 <- u^2
plot.ts(data.frame(diff(stocks$lStock), u2), main = "returns and volatility")

[Figure: "returns and volatility" - the returns diff(stocks$lStock) and their squared residuals u2]

Note the small volatility in stable times and large volatility in fluctuating return periods. We have learned that the AR process is able to model persistence, which, in our case, may be called clustering of volatility. Consider an AR(1) model of volatility (for this example we assume ut² is WN):

ut² = α + φut−1² + wt,   wt ∼ WN

library(forecast)
u2.mdl <- Arima(u2, order = c(1,0,0), include.mean = TRUE)
coef(u2.mdl)

##          ar1    intercept
## 7.335022e-01 9.187829e-06

Remember that for a stationary process ut²: Eut² = µ, so µ = α/(1 − φ). The Arima function reports an intercept; however, if the model has an autoregressive part, the reported value is actually the process mean.

# To get the alpha coefficient of an AR process:
# alpha = mu * (1 - phi)
unname(coef(u2.mdl)[2] * (1 - coef(u2.mdl)[1]))

## [1] 2.448536e-06

The resulting model:

ut² = 0.00000245 + 0.7335ut−1² + wt

might be of great interest to an investor wanting to purchase this stock.

- Suppose an investor has just observed that ut−1² = 0, i.e. the stock price changed by its average amount in period t − 1. The investor is interested in predicting volatility in period t in order to judge the likely risk involved in purchasing the stock. Since the error is unpredictable, the investor ignores it (it could be positive or negative). So, the predicted volatility in period t is 0.00000245.
- If the investor observed ut−1² = 0.0001, then he would have predicted the volatility in period t to be 0.00000245 + 0.00007335 = 7.58e-05, which is almost 31 times bigger.
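The arithmetic in the second bullet can be reproduced directly (a sketch using the estimated coefficients):

alpha <- 2.448536e-06
phi   <- 0.7335
alpha + phi * 0         # predicted volatility when u_{t-1}^2 = 0
alpha + phi * 0.0001    # predicted volatility when u_{t-1}^2 = 0.0001 (about 7.58e-05)
(alpha + phi * 0.0001) / alpha   # roughly 31 times bigger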

This kind of information can be incorporated into financial models of investor behavior.

Weak WN and Strong WN

- A sequence of uncorrelated random variables (with zero mean and constant variance) is called a weak WN;
- A sequence of independent random variables (with zero mean and constant variance) is called a strong WN.

If εt is a strong WN, then so is εt² or any other function of εt.

Let Ωs = F(εs, εs−1, ...) be the set containing all the information on the past of the process.

If εt is a strong WN, then:

- the conditional mean E(εt | Ωt−1) = 0;
- the conditional variance Var(εt | Ωt−1) = E(εt² | Ωt−1) = σ².

Now we shall present a model of a weak WN process (its variance is constant) such that its conditional variance, or volatility, may change over time. The simplest way to model this kind of phenomenon is to use the ARCH(1) model. From the rules for the mean:

E(X + α) = µ + α and the variance

Var(β · X + α) = β² · σ²

we can modify the random variables to have a different mean and variance:

εt ∼ N(0, 1) ⇒ (β · εt + µ) ∼ N(µ, β² · 1)

If we take β = σt, we can have the variance change depending on the time t. We can then specify the volatility (i.e. the standard deviation) as a separate equation and estimate its parameters.

Auto Regressive Conditional Heteroscedastic (ARCH) model

The core idea of the ARCH model is to effectively describe the dependence of volatility on recent (centered) returns rt. The ARCH(1) model can be written as:

rt = εt
εt = σt zt
σt² = E(εt² | Ωt−1) = ω + α1εt−1²

where:

- zt are (0,1) Gaussian or Student (or similarly symmetric) i.i.d. random variables (strong WN);
- ω, α1 > 0;
- E(εt) = 0, Var(εt) = ω/(1 − α1), Cov(εt+h, εt) = 0 for all t ≥ 0 and |h| ≥ 1. Also, Var(εt) ≥ 0 ⇒ 0 ≤ α1 < 1.
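As a quick check of the unconditional variance formula Var(εt) = ω/(1 − α1), a simulation sketch (ω = 1, α1 = 0.5, mirroring the earlier R.t example):

set.seed(346)
n <- 100000
omega <- 1; alpha1 <- 0.5
z   <- rnorm(n)
eps <- numeric(n)
eps[1] <- sqrt(omega) * z[1]
for (t in 2:n) eps[t] <- sqrt(omega + alpha1 * eps[t - 1]^2) * z[t]
c(sample = var(eps), theoretical = omega / (1 - alpha1))   # both should be close to 2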

An ARCH process is stationary. If the returns are not centered, then the first equation is rt = µ + εt.

ARCH(q):

The ARCH process can also be generalized:

rt = µ + εt
εt = σt zt
σt² = ω + α1εt−1² + ... + αqεt−q²

AR(p)-ARCH(q):

It may also be possible that the returns rt themselves are autocorrelated:

rt = µ + φ1rt−1 + ... + φprt−p + εt
εt = σt zt
σt² = ω + α1εt−1² + ... + αqεt−q²

Continuing the stock example (1)

Recall that our ‘naive’ log stock return data volatility model was:

ût² = 0.00000245 + 0.7335ût−1²

Because the coefficient of ut−1² was significant, this could indicate that ut² is an ARCH(1) process.

suppressPackageStartupMessages({library(fGarch)})
mdl.arch <- garchFit(~ garch(1,0), diff(stocks$lStock), trace = FALSE)
mdl.arch@fit$matcoef

##            Estimate   Std. Error  t value     Pr(>|t|)
## mu     1.048473e-03 1.132355e-04 9.259222 0.000000e+00
## omega  2.400242e-06 3.904157e-07 6.147914 7.850864e-10
## alpha1 6.598808e-01 1.571422e-01 4.199260 2.677887e-05

So, our model looks like:

∆log(stockt)^ = µ̂ = 0.001048
σ̂t² = ω̂ + α̂1ε̂t−1² = 2.4 · 10⁻⁶ + 0.660ε̂t−1²

Recall from tsdisplay(diff(stocks$lStock)) that the returns are not WN (they might be an AR(6) process). To find the proper conditional mean model for the returns, we use the auto.arima function.

mdl.ar <- auto.arima(diff(stocks$lStock), max.p = 10, max.q = 0)
mdl.ar$coef # AR(7) model is recommended

##          ar1          ar2          ar3          ar4          ar5
## -0.134997783  0.249189502 -0.095223779 -0.167506460 -0.024943351
##          ar6          ar7    intercept
##  0.159953621 -0.028619401  0.000983335

We combine it with ARCH(1) to create an AR(7)-ARCH(1) model:

mdl.arch.final <- garchFit(~ arma(7,0) + garch(1,0), diff(stocks$lStock), trace = FALSE)
mdl.arch.final@fit$matcoef

##            Estimate   Std. Error    t value     Pr(>|t|)
## mu      1.193945e-03 1.730481e-04  6.8994954 5.218714e-12
## ar1    -1.236738e-01 7.070313e-02 -1.7491979 8.025682e-02
## ar2     8.081154e-02 4.427947e-02  1.8250341 6.799588e-02
## ar3    -3.825929e-02 4.558812e-02 -0.8392383 4.013356e-01
## ar4    -1.069443e-01 3.932896e-02 -2.7192253 6.543502e-03
## ar5     7.208729e-03 3.970051e-02  0.1815777 8.559141e-01
## ar6     1.635547e-01 3.580176e-02  4.5683442 4.915924e-06
## ar7    -1.124515e-01 3.388652e-02 -3.3184725 9.051122e-04
## omega   2.045548e-06 3.566767e-07  5.7350195 9.750115e-09
## alpha1  6.503373e-01 1.721740e-01  3.7772104 1.585947e-04

The Generalized ARCH (GARCH) model

Although the ARCH model is simple, it often requires many parameters to adequately describe the volatility process of an asset return. To reduce the number of coefficients, an alternative model must be sought. If an ARMA type model is assumed for the error variance, then a GARCH(p, q) model should be considered:

rt = µ + εt
εt = σt zt
σt² = ω + Σ_{j=1}^q αj εt−j² + Σ_{i=1}^p βi σt−i²

A GARCH model can be regarded as an application of the ARMA idea to the series εt². Both ARCH and GARCH are (weak) WN processes with a special structure of their conditional variance. Such processes are described by an almost endless family of ARCH models: ARCH, GARCH, TGARCH, GJR-GARCH, EGARCH, GARCH-M, AVGARCH, APARCH, NGARCH, NAGARCH, IGARCH, etc.

Volatility Model Building

Building a volatility model consists of the following steps:

1. Specify a mean equation of rt by testing for serial dependence in the data and, if necessary, build an econometric model (e.g. an ARMA model) to remove any linear dependence.
2. Use the residuals of the mean equation, êt = rt − r̂t, to test for ARCH effects (see the sketch after this list).
3. If ARCH effects are found to be significant, one can use the PACF of êt² to determine the ARCH order (this may not be effective when the sample size is small). Specifying the order of a GARCH model is not easy; only lower order GARCH models are used in most applications, say, GARCH(1,1), GARCH(2,1) and GARCH(1,2) models.
4. Specify a volatility model if the ARCH effects are statistically significant and perform a joint estimation of the mean and volatility equations.
5. Check the fitted model carefully and refine it if necessary.
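A sketch of step 2 for the stock-return example (a Ljung-Box test on the squared residuals plus a hand-rolled LM regression; it reuses the stocks data loaded earlier):

e <- residuals(lm(diff(stocks$lStock) ~ 1))   # residuals of a constant-mean equation
Box.test(e^2, lag = 10, type = "Ljung-Box")   # H0: no autocorrelation in e_t^2
k <- 5
aux <- embed(e^2, k + 1)                      # columns: e_t^2, e_{t-1}^2, ..., e_{t-k}^2
lm_fit <- lm(aux[, 1] ~ aux[, -1])
summary(lm_fit)$fstatistic                    # F-test of H0: alpha_1 = ... = alpha_k = 0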

2 Let t = rt − ˆrt be the residuals of the mean equation. Then t are used to check for conditional (i.e. the ARCH effects). Two tests are available:

2 1. Apply the usual Ljung-Box statistic Q(k) to t . The null hypothesis 2 is that the first k lags of ACF of t are zero: H0 : ρ(1) = 0, ρ(2) = 0, ..., ρ(k) = 0 2. The second test for the conditional heteroscedasticity is the Lagrange Multiplier (LM) test, which is equivalent to the usual F − statistic for testing H0 : α1 = ... = αk = 0 in the :

ε̂t² = α0 + Σ_{j=1}^k αj ε̂t−j² + et,   t = k + 1, ..., T

Continuing the stock example (2)

Going through each of the steps:

tsdisplay(diff(stocks$lStock))

The log-returns are autocorrelated, so we need to specify an ARMA model for the mean equation via auto.arima:

mdl.auto <- auto.arima(diff(stocks$lStock))
rbind(names(mdl.auto$coef)[1:3], names(mdl.auto$coef)[4:6])

##      [,1]  [,2]  [,3]
## [1,] "ar1" "ar2" "ar3"
## [2,] "ma1" "ma2" "intercept"

The output is an ARMA(3,2) model:

rt = µ + φ1rt−1 + φ2rt−2 + φ3rt−3 + εt + θ1εt−1 + θ2εt−2

Now, we examine the residuals of this model:

par(mfrow = c(1,3))
forecast::Acf(mdl.auto$residuals)
forecast::Acf(mdl.auto$residuals^2)
forecast::Pacf(mdl.auto$residuals^2)

[Figure: sample ACF of the residuals, and sample ACF and PACF of the squared residuals of mdl.auto]

We see that the residuals are not autocorrelated; however, the squared residuals are autocorrelated. So, we need to create a volatility model. Because the first lag of the PACF of the squared residuals is significantly different from zero, we specify an ARCH(1) model for the residuals. The final model is an ARMA(3,2)-ARCH(1):

mdl.arch.final <- garchFit(~ arma(3,2) + garch(1,0), diff(stocks$lStock), trace = FALSE)
mdl.arch.final@fit$matcoef

##            Estimate   Std. Error    t value     Pr(>|t|)
## mu     1.980586e-03 3.367634e-04  5.8812393 4.072058e-09
## ar1   -2.743818e-01 1.943599e-01 -1.4117200 1.580324e-01
## ar2   -6.001322e-01 1.365386e-01 -4.3953299 1.106047e-05
## ar3   -1.065850e-01 8.060903e-02 -1.3222460 1.860863e-01
## ma1    1.258717e-01 1.831323e-01  0.6873265 4.918770e-01
## ma2    7.018161e-01 1.486765e-01  4.7204244 2.353530e-06
## omega  2.488709e-06 4.030309e-07  6.1749835 6.617036e-10
## alpha1 6.216022e-01 1.525975e-01  4.0734767 4.631649e-05

mdl.arch.final@fit$ics

##       AIC       BIC       SIC      HQIC
## -9.359004 -9.230203 -9.361846 -9.306918

Finally, we check whether the standardized residuals ŵt = ε̂t/σ̂t and their squares ŵt² are WN:

par(mfrow = c(2,2))
stand.res = mdl.arch.final@residuals / [email protected]
forecast::Acf(stand.res); forecast::Pacf(stand.res)
forecast::Acf(stand.res^2); forecast::Pacf(stand.res^2)

[Figure: sample ACF and PACF of stand.res and of stand.res^2]

Unfortunately, the standardized residuals ŵt still seem to be somewhat autocorrelated. In this case, more complex models should be considered, like the ones mentioned in the GARCH model slide… But this may not be necessary! The relevant tests are performed and provided in the model output:

capture.output(summary(mdl.arch.final))[46:56]

## [1] "Standardised Residuals Tests:"
## [2] "                               Statistic p-Value  "
## [3] " Jarque-Bera Test   R   Chi^2  2.981865  0.2251626"
## [4] " Shapiro-Wilk Test  R   W      0.9941911 0.6029121"
## [5] " Ljung-Box Test     R   Q(10)  14.81308  0.1390265"
## [6] " Ljung-Box Test     R   Q(15)  17.92572  0.2665907"
## [7] " Ljung-Box Test     R   Q(20)  21.14201  0.3888168"
## [8] " Ljung-Box Test     R^2 Q(10)  5.334754  0.8677243"
## [9] " Ljung-Box Test     R^2 Q(15)  8.492303  0.9025344"
## [10] " Ljung-Box Test    R^2 Q(20)  12.02647  0.9151619"
## [11] " LM Arch Test      R   TR^2   8.228338  0.7670416"

We see that the Jarque-Bera Test and Shapiro-Wilk Test p-values > 0.05, so we do NOT reject the null hypothesis of normality of the standardized residuals R. The Ljung-Box Test p-values for the standardized residuals R and R^2 are > 0.05, so the residuals form a WN. Finally, the LM Arch Test p-value > 0.05 shows that there are no more ARCH effects in the residuals. So, our estimated model is correctly specified in the sense that the residual autocorrelation seen in the ACF/PACF plots is relatively weak! To explore the predictions of volatility, we calculate and plot 51 observations from the middle of the data along with the one-step-ahead predictions of the corresponding volatility σ̂t:

d_lstock <- ts(diff(stocks$lStock))
sigma = [email protected]
plot(window(d_lstock, start = 75, end = 125), ylim = c(-0.02, 0.035),
     ylab = "diff(stocks$lStock)",
     main = "returns and their +- 2sigma confidence region")
lines(window(d_lstock - 2*sigma, start = 75, end = 125), lty = 2, col = 4)
lines(window(d_lstock + 2*sigma, start = 75, end = 125), lty = 2, col = 4)

[Figure: the returns and their ±2σ̂t confidence region, observations 80-120]

predict(mdl.arch.final, n.ahead = 2, mse = "cond", plot = T)

[Figure: prediction with confidence intervals - X̂t+h and X̂t+h ± 1.96·MSE]

## meanForecast meanError standardDeviation lowerInterval upperInterval
## 1 0.0008520921 0.002132817 0.002132817 -0.003328152 0.005032337
## 2 0.0010536363 0.002327369 0.002305715 -0.003507924 0.005615196

Data Sources

A useful R package for downloading financial data directly from open sources, like Yahoo Finance, Google Finance, etc., is the quantmod package.

suppressPackageStartupMessages({library(quantmod)})
suppressMessages({ getSymbols("GOOG", from = "2007-01-03", to = "2018-01-01") })
tail(GOOG, 3)

## [1] "GOOG"
##            GOOG.Open GOOG.High GOOG.Low GOOG.Close GOOG.Volume GOOG.Adjusted
## 2017-12-27   1057.39   1058.37  1048.05    1049.37     1271900       1049.37
## 2017-12-28   1051.60   1054.75  1044.77    1048.14      837100       1048.14
## 2017-12-29   1046.72   1049.70  1044.90    1046.40      887500       1046.40

Time plots of the daily closing price and trading volume of Google over the last 365 trading days:

chartSeries(tail(GOOG, 365), theme = "white", name = "GOOG")

[Figure: chartSeries(tail(GOOG, 365)) - GOOG closing price and volume, 2016-07-21 to 2017-12-29]

GOOG.rtn = diff(log(GOOG[, "GOOG.Adjusted"]))
chartSeries(GOOG.rtn, theme = "white", name = "Daily log return data of GOOGLE stocks")

[Figure: daily log returns of GOOGLE stock, 2007-01-04 to 2017-12-29]

Example of getting non-financial data - unemployment rates from FRED:

getSymbols("UNRATE", src = "FRED")

## [1] "UNRATE"

chartSeries(UNRATE, theme = "white", up.col = 'black')

[Figure: US unemployment rate (UNRATE), Jan 1948 - Jan 2018]

Summary of Volatility Modelling (1)

Quite often, the process we want to investigate for the ARCH effects is stationary but not WN.

- Let εt be a weak WN(0, σ²) and consider the model Yt = r + εt, or Yt = β0 + β1Xt + εt, or Yt = α + φYt−1 + εt, or similar.

- Test whether the WN shocks εt form an ARCH process: plot the graph of et² (= ε̂t²) - if εt is an ARCH process, this graph must show a clustering property.

- Further test whether the shocks εt form an ARCH process: test them for normality (the hypothesis must be rejected), e.g. using the Shapiro-Wilk test of normality.

- Further test whether the shocks εt form an ARCH process: draw the correlogram of et - the correlogram must indicate WN, but that of et² must not (it should be similar to the correlogram of an AR(p) process).

Summary of Volatility Modelling (2)

- To formally test whether the shocks εt form an ARCH(q) process, test the null hypothesis H0 : α1 = ... = αq = 0 (i.e. no ARCH in σt² = ω + Σ_{j=1}^q αj εt−j²):
  1. Choose the proper AR(q) model of the auxiliary regression et² = α + α1et−1² + ... + αqet−q² + wt (proper means minimum AIC and WN residuals wt);
  2. To test H0, use the F-test (or the LM test).

- Instead of using an ARCH(q) with a high order q, a more parsimonious description of εt is usually given by GARCH(1,1) (or a similar lower order GARCH process);

- In order to show that the selected ARCH(q) or GARCH(1,1) model is 'good', test whether the standardized residuals ŵt = ε̂t/σ̂t and ŵt² form WN (as they are expected to).