
Introduction to Time Series Analysis. Lecture 3. Peter Bartlett

1. Review: stationarity, linear processes
2. Sample autocorrelation function
3. ACF and prediction
4. Properties of the ACF

1 Mean, Autocovariance, Stationarity

A time series {Xt} has mean function µ_t = E[X_t] and autocovariance function

γ_X(t + h, t) = Cov(X_{t+h}, X_t)
              = E[(X_{t+h} − µ_{t+h})(X_t − µ_t)].

It is stationary if both are independent of t.

Then we write γ_X(h) = γ_X(h, 0). The autocorrelation function (ACF) is

ρ_X(h) = γ_X(h) / γ_X(0) = Corr(X_{t+h}, X_t).

2 Linear Processes

An important class of stationary time series:

X_t = µ + Σ_{j=−∞}^{∞} ψ_j W_{t−j},   where {W_t} ∼ WN(0, σ_w²),

and µ, ψ_j are parameters satisfying Σ_{j=−∞}^{∞} |ψ_j| < ∞.

3 Linear Processes

X_t = µ + Σ_{j=−∞}^{∞} ψ_j W_{t−j}

Examples:

• White noise: ψ_0 = 1.
• MA(1): ψ_0 = 1, ψ_1 = θ.
• AR(1): ψ_0 = 1, ψ_1 = φ, ψ_2 = φ², ...
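These weights make the examples easy to simulate. The following is a minimal numpy sketch (my own illustration, not from the lecture; the function name and the choices θ = 0.7, φ = 0.6, and the 50-term truncation of the AR(1) weights are arbitrary):

    import numpy as np

    def simulate_linear_process(psi, mu=0.0, n=500, sigma_w=1.0, seed=0):
        """Simulate X_t = mu + sum_{j>=0} psi[j] W_{t-j}, where psi[0] multiplies W_t
        and {W_t} is Gaussian white noise WN(0, sigma_w^2)."""
        rng = np.random.default_rng(seed)
        q = len(psi)
        w = rng.normal(0.0, sigma_w, size=n + q - 1)   # extra past noise so every X_t has a full window
        # 'valid' convolution computes sum_j psi[j] * w[t - j] wherever the window fits
        return mu + np.convolve(w, psi, mode="valid")

    x_wn  = simulate_linear_process([1.0])                 # white noise: psi_0 = 1
    x_ma1 = simulate_linear_process([1.0, 0.7])            # MA(1): psi = (1, theta)
    x_ar1 = simulate_linear_process(0.6 ** np.arange(50))  # AR(1): psi_j = phi^j, truncated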

4 Estimating the ACF: Sample ACF

Recall: Suppose that {Xt} is a stationary time series. Its mean is

µ = E[Xt]. Its autocovariance function is

γ(h) = Cov(X_{t+h}, X_t)
     = E[(X_{t+h} − µ)(X_t − µ)].

Its autocorrelation function is ρ(h) = γ(h) / γ(0).

5 Estimating the ACF: Sample ACF

For observations x_1, ..., x_n of a time series, the sample mean is x̄ = (1/n) Σ_{t=1}^{n} x_t. The sample autocovariance function is

γ̂(h) = (1/n) Σ_{t=1}^{n−|h|} (x_{t+|h|} − x̄)(x_t − x̄),   for −n < h < n.

6 Estimating the ACF: Sample ACF

Sample autocovariance function:

γ̂(h) = (1/n) Σ_{t=1}^{n−|h|} (x_{t+|h|} − x̄)(x_t − x̄).

≈ the sample covariance of the pairs (x_1, x_{h+1}), ..., (x_{n−h}, x_n), except that
• we normalize by n instead of n − h, and
• we subtract the full sample mean.
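A minimal numpy sketch of this estimator (my own illustration; the function name and the example are not from the lecture). For i.i.d. data the sample autocorrelations at nonzero lags are approximately N(0, 1/n), which is where the confidence bands on the next slide come from.

    import numpy as np

    def sample_acf(x, max_lag):
        """Sample autocovariance and autocorrelation: normalize by n and
        subtract the full sample mean, as in the definition above."""
        x = np.asarray(x, dtype=float)
        n = len(x)
        d = x - x.mean()
        gamma = np.array([np.dot(d[h:], d[:n - h]) / n for h in range(max_lag + 1)])
        return gamma, gamma / gamma[0]

    rng = np.random.default_rng(0)
    x = rng.normal(size=200)                      # i.i.d. Gaussian noise
    gamma_hat, rho_hat = sample_acf(x, max_lag=20)
    print(rho_hat[1:6], 1.96 / np.sqrt(len(x)))   # nonzero-lag values vs. the ~95% band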

7 Sample ACF for white Gaussian (hence i.i.d.) noise

[Figure: sample ACF of white Gaussian noise, lags −20 to 20; red lines mark the confidence interval.]

8 Sample ACF

We can recognize the sample autocorrelation functions of many non-white (even non-stationary) time series.

Time series:    Sample ACF:
White noise     Zero
Trend           Slow decay
Periodic        Periodic
MA(q)           Zero for |h| > q
AR(p)           Decays to zero exponentially

9 Sample ACF: Trend

[Figure: a time series with trend, plotted for t = 0 to 100.]

10 Sample ACF: Trend

[Figure: sample ACF of the trend series, lags −60 to 60, decaying slowly.] (why?)

11 Sample ACF

Time series:    Sample ACF:
White noise     Zero
Trend           Slow decay
Periodic        Periodic
MA(q)           Zero for |h| > q
AR(p)           Decays to zero exponentially

12 Sample ACF: Periodic

[Figure: a periodic time series, plotted for t = 0 to 100.]

13 Sample ACF: Periodic

[Figure: a periodic signal plus noise, plotted for t = 0 to 100.]

14 Sample ACF: Periodic

[Figure: sample ACF of the periodic series, lags −100 to 100, itself periodic.] (why?)

15 Sample ACF

Time series:    Sample ACF:
White noise     Zero
Trend           Slow decay
Periodic        Periodic
MA(q)           Zero for |h| > q
AR(p)           Decays to zero exponentially

16 ACF: MA(1)

MA(1): X_t = Z_t + θ Z_{t−1}

[Figure: ACF of the MA(1) process, lags −10 to 10: ρ(0) = 1, ρ(±1) = θ/(1 + θ²), zero at all other lags.]
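The lag-one value marked in the figure follows from a short calculation that the slide does not spell out; with {Z_t} ∼ WN(0, σ²),

\[
\gamma(0) = \mathrm{Var}(Z_t + \theta Z_{t-1}) = \sigma^2(1+\theta^2),
\qquad
\gamma(1) = \mathrm{Cov}(Z_{t+1} + \theta Z_t,\; Z_t + \theta Z_{t-1}) = \theta\sigma^2,
\]

and γ(h) = 0 for |h| > 1 because X_{t+h} and X_t then share no Z terms. Hence ρ(1) = γ(1)/γ(0) = θ/(1 + θ²).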

17 Sample ACF: MA(1)

[Figure: ACF and sample ACF of an MA(1) process, lags −10 to 10.]

18 Sample ACF

Time series:    Sample ACF:
White noise     Zero
Trend           Slow decay
Periodic        Periodic
MA(q)           Zero for |h| > q
AR(p)           Decays to zero exponentially

19 ACF: AR(1)

AR(1): X_t = φ X_{t−1} + Z_t

[Figure: ACF of the AR(1) process, lags −10 to 10; the values follow the curve φ^|h|.]
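The curve labelled φ^|h| comes from a one-line recursion (not worked out on the slide); for the stationary AR(1) with |φ| < 1 and h ≥ 1,

\[
\gamma(h) = \mathrm{Cov}(X_{t+h}, X_t)
          = \mathrm{Cov}(\phi X_{t+h-1} + Z_{t+h},\, X_t)
          = \phi\,\gamma(h-1),
\]

since Z_{t+h} is uncorrelated with X_t. Iterating gives γ(h) = φ^h γ(0), and ρ(h) = φ^|h| follows by symmetry.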

20 Sample ACF: AR(1)

[Figure: ACF and sample ACF of an AR(1) process, lags −10 to 10.]

21 Introduction to Time Series Analysis. Lecture 3.
1. Sample autocorrelation function
2. ACF and prediction
3. Properties of the ACF

22 ACF and prediction

[Figure: top, a simulated MA(1) series for t = 0 to 20; bottom, its ACF and sample ACF for lags −10 to 10.]

23 ACF of a MA(1) process

[Figure: scatter plots of (X_t, X_{t+h}) for an MA(1) process at lags h = 0, 1, 2, 3.]

24 ACF and prediction

Best least squares estimate of Y is EY :

min_c E(Y − c)² = E(Y − EY)².

Best least squares estimate of Y given X is E[Y |X]:

min_f E(Y − f(X))² = min_f E[ E[(Y − f(X))² | X] ]
                   = E[ E[(Y − E[Y|X])² | X] ] = E var[Y | X].

Similarly, the best least squares estimate of Xn+h given Xn is f(Xn)= E[Xn+h|Xn].

25 ACF and least squares prediction

Suppose that X = (X_1, ..., X_{n+h}) is jointly Gaussian:

f_X(x) = 1 / ((2π)^{n/2} |Σ|^{1/2}) · exp( −(1/2) (x − µ)′ Σ^{−1} (x − µ) ).

Then the joint distribution of (Xn,Xn+h) is

N( (µ_n, µ_{n+h})′ , ( σ_n²            ρ σ_n σ_{n+h}
                       ρ σ_n σ_{n+h}   σ_{n+h}²      ) ),

and the conditional distribution of X_{n+h} given X_n is

N( µ_{n+h} + ρ (σ_{n+h}/σ_n)(x_n − µ_n),  σ_{n+h}² (1 − ρ²) ).

26 ACF and least squares prediction

So for Gaussian and stationary {Xt}, the best estimate of Xn+h given

Xn = xn is

f(x_n) = µ + ρ(h)(x_n − µ), and the mean squared error is

E(X_{n+h} − f(X_n))² = σ²(1 − ρ(h)²).

Notice:
• Prediction accuracy improves as |ρ(h)| → 1.
• The predictor is linear: f(x) = µ(1 − ρ(h)) + ρ(h)x.
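As a concrete illustration (the numbers here are illustrative choices, not from the lecture): with µ = 0, σ² = 1 and ρ(h) = 0.6,

\[
f(x_n) = 0.6\,x_n,
\qquad
E(X_{n+h} - f(X_n))^2 = 1 - 0.6^2 = 0.64,
\]

so conditioning on X_n removes 36% of the prediction variance.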

27 ACF and least squares

Consider a linear predictor of Xn+h given Xn = xn. Assume first that {Xt} is stationary with EXn = 0, and predict Xn+h with f(xn)= axn. The best linear predictor minimizes

E(X_{n+h} − aX_n)² = E X_{n+h}² − 2a E(X_{n+h} X_n) + a² E X_n²
                   = σ² − 2a γ(h) + a² σ²,

which is minimized over a (set the derivative −2γ(h) + 2aσ² to zero) when a = γ(h)/σ² = ρ(h), that is,

f(x_n) = ρ(h) x_n. For this optimal linear predictor, the mean squared error is

E(X_{n+h} − f(X_n))² = σ² − 2ρ(h)γ(h) + ρ(h)²σ² = σ²(1 − ρ(h)²).

28 ACF and least squares linear prediction

Consider the following linear predictor of Xn+h given Xn = xn, when {Xn} is stationary and EXn = µ:

f(xn)= a(xn − µ)+ b. The linear predictor that minimizes

E(X_{n+h} − (a(X_n − µ) + b))² has a = ρ(h), b = µ, that is,

f(x_n) = ρ(h)(x_n − µ) + µ. For this optimal linear predictor, the mean squared error is again

E(X_{n+h} − f(X_n))² = σ²(1 − ρ(h)²).

29 Least squares prediction of Xn+h given Xn

f(X_n) = µ + ρ(h)(X_n − µ).

E(f(X_n) − X_{n+h})² = σ²(1 − ρ(h)²).

• If {Xt} is stationary, f is the optimal linear predictor.

• If {Xt} is also Gaussian, f is the optimal predictor.
• Linear prediction is optimal for Gaussian time series.
• Over all stationary processes with that value of ρ(h) and σ², the optimal mean squared error is maximized by the Gaussian process.
• Linear prediction needs only second order statistics.

• Extends to longer histories, (X_n, X_{n−1}, ...).
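A small simulation check of the formula above for a Gaussian AR(1) (my own sketch, not part of the lecture; the choices φ = 0.6, lag h = 2, and the sample size are arbitrary):

    import numpy as np

    rng = np.random.default_rng(1)
    phi, n, h = 0.6, 200_000, 2
    z = rng.normal(size=n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + z[t]       # Gaussian AR(1), mu = 0

    sigma2 = x.var()                       # estimate of gamma(0)
    rho_h = phi ** h                       # AR(1) ACF at lag h
    pred = rho_h * x[:-h]                  # f(X_n) = mu + rho(h)(X_n - mu), mu = 0
    mse = np.mean((x[h:] - pred) ** 2)
    print(mse, sigma2 * (1 - rho_h ** 2))  # the two numbers should nearly agree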

30 Introduction to Time Series Analysis. Lecture 3.
1. Sample autocorrelation function
2. ACF and prediction
3. Properties of the ACF

31 Properties of the autocovariance function

For the autocovariance function γ of a stationary time series {Xt},
1. γ(0) ≥ 0 (the variance is non-negative),
2. |γ(h)| ≤ γ(0) (from Cauchy-Schwarz),
3. γ(h) = γ(−h) (from stationarity),
4. γ is positive semidefinite.

Furthermore, any function γ : Z → R that satisfies (3) and (4) is the autocovariance of some stationary time series.

32 Properties of the autocovariance function

A function f : Z → R is positive semidefinite if, for all n, the n × n matrix F_n with entries (F_n)_{i,j} = f(i − j) is positive semidefinite. A matrix F_n ∈ R^{n×n} is positive semidefinite if, for all vectors a ∈ R^n,

a′ F_n a ≥ 0.

To see that γ is psd, consider the variance of (X1,...,Xn)a.
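Spelling out that hint (a one-line calculation the slide leaves to the reader): for any a ∈ R^n,

\[
0 \le \mathrm{Var}\Big(\sum_{i=1}^{n} a_i X_i\Big)
  = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j\,\mathrm{Cov}(X_i, X_j)
  = \sum_{i,j} a_i a_j\,\gamma(i-j)
  = a'\,\Gamma_n\, a,
\]

where Γ_n is the n × n matrix with entries γ(i − j); since this holds for every a, γ is positive semidefinite.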

33 Properties of the autocovariance function

For the autocovariance function γ of a stationary time series {Xt},
1. γ(0) ≥ 0,
2. |γ(h)| ≤ γ(0),
3. γ(h) = γ(−h),
4. γ is positive semidefinite.

Furthermore, any function γ : Z → R that satisfies (3) and (4) is the autocovariance of some stationary time series (in particular, a Gaussian process). For example, (1) and (2) follow from (4).

34 Introduction to Time Series Analysis. Lecture 3.
1. Sample autocorrelation function
2. ACF and prediction
3. Properties of the ACF
