Chapter 9: Nonseasonal Box-Jenkins Models
The concepts of ‘stationary time series’ and ‘nonstationary time series’ are important in the Box-Jenkins methodology.
Stationary time series
A time series $\{y_t\}$ is said to be stationary if the following two conditions are satisfied:
(a) the mean function is constant over time, i.e.,
$$\mu_t = E(y_t) = c \quad \text{for all } t$$
(b) the autocorrelations
$$\rho_{t,s} = \frac{\operatorname{cov}(y_t, y_s)}{\sqrt{\operatorname{var}(y_t)\,\operatorname{var}(y_s)}}$$
are not functions of time, i.e., $\rho_{t,t-k} = \rho_{0,k} = \rho_k$ for all times $t$ and lags $k$. This is equivalent to the condition that the autocovariances $\gamma_{t,s} = \operatorname{cov}(y_t, y_s)$ are also independent of time $t$, i.e., $\gamma_{t,t-k} = \gamma_{0,k} = \gamma_k$ for all $t$ and lags $k$.
In other words, both the autocorrelations $\rho_{t,s}$ and the autocovariances $\gamma_{t,s}$ depend only on the distance between the two time points $s$ and $t$, not on their actual positions.
Note: Since $\gamma_{t,t} = \operatorname{cov}(y_t, y_t) = \operatorname{var}(y_t)$, stationarity also requires that the variance is constant with respect to $t$.
Nonstationary Time Series
If the $n$ values of $y_t$ do not fluctuate around a constant mean, or do not fluctuate with constant variation, then it is reasonable to believe that the time series is not stationary.
[Figure: Random walk with zero mean; $Z_t$ plotted against Time]
A nonstationary series can often be transformed into a stationary one by first differencing:
$$z_t = \nabla y_t = y_t - y_{t-1}$$
The Minitab command for differencing is Stat ▷ Time Series ▷ Difference (lag 1).
(Differencing is like differentiation in calculus.) Since
$$\nabla y_t = y_t - y_{t-1} = \frac{y_t - y_{t-1}}{t - (t-1)},$$
it is similar to the definition of the derivative of a function $f(t)$:
$$f'(t) = \lim_{\Delta \to 0} \frac{f(t+\Delta) - f(t)}{(t+\Delta) - t} = \lim_{\Delta \to 0} \frac{f(t+\Delta) - f(t)}{\Delta}$$
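As a rough illustration of first differencing outside Minitab, the sketch below builds a simulated zero-mean random walk (not the towel sales data) and differences it with NumPy. The seed, series length and variable names are assumptions made only for this example.

```python
import numpy as np

# Simulated data for illustration only (not the towel sales series).
rng = np.random.default_rng(0)
shocks = rng.normal(size=150)      # independent N(0, 1) shocks
y = np.cumsum(shocks)              # random walk: y_t = y_{t-1} + shock_t

# First differences z_t = y_t - y_{t-1}; the result has one fewer value.
z = np.diff(y)

print(len(y), len(z))
print(np.allclose(z, shocks[1:]))  # differencing a random walk recovers its shocks
```

Because the random walk is a cumulative sum of independent shocks, its first differences are just those shocks, which is why differencing removes the nonstationarity in this case.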
[Figure: Time Series Plot of Paper Towel Sales; y plotted against Index]
After first differencing
[Figure: Time Series Plot of first differencing; differenced series plotted against Index]
If this is not sufficient, taking second differences (the first differences of the first differences) of the original series values normally does the job:
$$z_t = \nabla^2 y_t = \nabla y_t - \nabla y_{t-1} = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2})$$
If a time series plot indicates increasing variability, it is common to first transform the series using a square root, quartic root or logarithmic transformation, and then take first differences.
Example: Consider the NCR (New Company Registrations) rates data plotted below:
[Figure: Time Series Plot of NCR; NCR plotted against Index]
The series is clearly not stationary since it has a trend and increasing variability, which means that both $E(y_t)$ and $\operatorname{var}(y_t)$ depend on the time variable $t$.
[Figure: Time Series Plot of lnNCR; lnNCR plotted against Index]
Clearly the log transformation has stabilised the variance somewhat.
Applying differencing on the logged series:
[Figure: Time Series Plot of d1lnNCR; d1lnNCR plotted against Index]
It now appears that the resulting series is stationary.
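The effect of taking logs before differencing can be sketched on a made-up series whose fluctuations are proportional to its level. The NCR data are not reproduced here; the growth rate, the alternating disturbance and the helper `spread_ratio` are all hypothetical choices for illustration.

```python
import numpy as np

# Hypothetical series: 5% growth per period with a disturbance that is
# proportional to the level (alternately 10% above and 10% below trend).
t = np.arange(37)
y = 100 * 1.05 ** t * (1 + 0.1 * (-1.0) ** t)

d_raw = np.diff(y)            # first differences of the raw series
d_log = np.diff(np.log(y))    # first differences of the logged series

def spread_ratio(d):
    """Std. dev. of the second half of d divided by that of the first half."""
    half = d.size // 2
    return d[half:].std() / d[:half].std()

print(spread_ratio(d_raw))    # about 2.4: variability grows with the level
print(spread_ratio(d_log))    # about 1.0: variability is stable after taking logs
```

Differencing alone leaves the increasing variability in place, whereas differencing the logged series gives fluctuations of roughly constant size, which is the pattern seen in the d1lnNCR plot.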
Working Series
The textbook uses $z_b, z_{b+1}, \ldots, z_n$ as the 'working series' obtained from the original series by transformation or differencing. For example, $b = 2$ if $z_t = y_t - y_{t-1}$.
Sample autocorrelation coefficient (SAC)
The sample autocorrelation at lag $k$ is
$$r_k = \frac{\sum_{t=b}^{n-k} (z_t - \bar z)(z_{t+k} - \bar z)}{\sum_{t=b}^{n} (z_t - \bar z)^2}$$
where
$$\bar z = \frac{\sum_{t=b}^{n} z_t}{n - b + 1}$$
The standard error of $r_k$ is
$$s_{r_k} = \begin{cases} \dfrac{1}{(n-b+1)^{1/2}}, & \text{if } k = 1 \\[2ex] \dfrac{\left(1 + 2\sum_{j=1}^{k-1} r_j^2\right)^{1/2}}{(n-b+1)^{1/2}}, & \text{if } k = 2, 3, \ldots \end{cases}$$
The $t_{r_k}$-statistic is
$$t_{r_k} = \frac{r_k}{s_{r_k}}$$
The SAC graph is a graph of the sample autocorrelations (Minitab calls it the ACF plot):
[Figure: Autocorrelation Function for y (original towel sales), with 5% significance limits for the autocorrelations; Autocorrelation plotted against Lag]
Spikes
We say that a spike at lag $k$ exists if $r_k$ is statistically large, say $|t_{r_k}| = |r_k / s_{r_k}| > 2$.
In the Minitab ACF graph, any $r_k$ that lies above or below the confidence bands is considered to be a spike, so you do not need to find the value of $t_{r_k}$.
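The sample autocorrelation, its standard error and the spike rule can be sketched in a few lines of NumPy. The function name `sample_acf` and the simulated input are assumptions for illustration; the input array plays the role of the working series $z_b, \ldots, z_n$, so its length is $n - b + 1$.

```python
import numpy as np

def sample_acf(z, max_lag):
    """Return r_k, s_{r_k} and t_{r_k} for k = 1..max_lag."""
    z = np.asarray(z, dtype=float)
    m = z.size                          # m = n - b + 1 working values
    dev = z - z.mean()
    denom = np.sum(dev ** 2)
    r = np.array([np.sum(dev[:m - k] * dev[k:]) / denom
                  for k in range(1, max_lag + 1)])
    # s_{r_k} = sqrt(1 + 2 * sum_{j<k} r_j^2) / sqrt(m); the sum is empty for k = 1
    cum = np.concatenate(([0.0], np.cumsum(r[:-1] ** 2)))
    se = np.sqrt(1.0 + 2.0 * cum) / np.sqrt(m)
    return r, se, r / se

# Simulated moving-average-type series, for illustration only.
rng = np.random.default_rng(42)
a = rng.normal(size=501)
z = a[1:] + 0.8 * a[:-1]

r, se, t_stat = sample_acf(z, max_lag=10)
spikes = [k + 1 for k, t in enumerate(t_stat) if abs(t) > 2]
print(np.round(r[:3], 3))
print("spikes at lags:", spikes)
```

The threshold of 2 on $|t_{r_k}|$ mirrors the rule stated above; Minitab's 5% significance bands serve the same purpose graphically.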
Cuts off after lag k
We say that the SAC cuts off after lag $k$ if there are no spikes at lags greater than $k$ in the SAC.
Using the SAC to find a stationary time series
For nonseasonal data:
(i) If the SAC of the time series either cuts off fairly quickly or dies down fairly quickly, then the series is considered stationary.
(ii) If the SAC of the time series dies down extremely slowly, then the series is considered nonstationary.
Note that the SAC of the towel sales series refuses to die down quickly, which is a clear sign that the series is nonstationary.
Sample partial autocorrelation $r_{kk}$
This can be thought of as the sample autocorrelation of time series observations separated by a lag of $k$ time units, with the effects of the intervening observations eliminated.
In other words, this measure of correlation is used to identify the extent of relationship between current values of a variable with earlier values of the same variable (values for various time lags) while holding the effects of all other time lags constant.
Consider now the differenced series of the towel sales:
[Figure: Autocorrelation Function for z (differenced series), with 5% significance limits for the autocorrelations; Autocorrelation plotted against Lag]
Here, there is a cut-off at lag 1 so the differenced series is stationary.
Simple Stationary Time Series Models- ARMA
Let $\{a_t\}$ be a sequence of random shocks which describe the effect on $z_t$ of all factors other than $z_{t-1}$. It is more or less the residual error of the forecast (if the residuals $e_t$ are not independent, then we cannot treat $e_t$ as $a_t$).
Note: Most textbooks call $\{a_t\}$ white noise.
Properties of $\{a_t\}$
(i) $a_1, a_2, a_3, \ldots$ are independent
(ii) $a_i \sim N(0, \sigma_a^2)$
(iii) $a_{t+1}$ is independent of $y_t, y_{t-1}, \ldots$
$\{a_t\}$ plays a very important role in Box-Jenkins methodology. Essentially, every stationary Box-Jenkins model can be expressed in terms of the white noise process.
Simple Box-Jenkins Models
Moving Average Models
We write
$$z_t = a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$$
and refer to it as a moving average process of order $q$, denoted by MA(q). (Structurally speaking, MA(q) is a weighted average of the $a_t$ terms, apart from the negative signs.)
The special case: MA(1)
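$$z_t = a_t - \theta_1 a_{t-1}$$
$$E(z_t) = 0$$
$$\operatorname{var}(z_t) = \sigma_a^2 (1 + \theta_1^2)$$
$$\operatorname{cov}(z_t, z_{t+1}) = -\theta_1 \sigma_a^2$$
$$\operatorname{cov}(z_t, z_{t+k}) = 0 \quad \text{for } k \ge 2$$
Thus
$$\rho_1 = \frac{-\theta_1}{1 + \theta_1^2}$$
and all other $\rho_k$ are zero. (Make sure you know how to derive the above.)
Hence the theoretical autocorrelation function (TAC) of an MA(1) "cuts off" after lag 1.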
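MA(2)
$$z_t = a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2}$$
$$E(z_t) = 0$$
$$\operatorname{var}(z_t) = \sigma_a^2 (1 + \theta_1^2 + \theta_2^2)$$
$$\operatorname{cov}(z_t, z_{t+1}) = (-\theta_1 + \theta_1 \theta_2)\, \sigma_a^2$$
$$\operatorname{cov}(z_t, z_{t+2}) = -\theta_2 \sigma_a^2$$
$$\operatorname{cov}(z_t, z_{t+k}) = 0 \quad \text{for } k \ge 3$$
$$\rho_1 = \frac{-\theta_1 + \theta_1 \theta_2}{1 + \theta_1^2 + \theta_2^2}, \qquad \rho_2 = \frac{-\theta_2}{1 + \theta_1^2 + \theta_2^2}$$
and all other $\rho_k$ are zero. Thus the TAC of an MA(2) "cuts off" after lag 2.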
In general, for MA(q)
(i) $\rho_k \ne 0$ for $k = 1, 2, \ldots, q$, and $\rho_k = 0$ for $k > q$
(ii) the TPAC dies down
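The MA(1) and MA(2) results above are special cases of one calculation: the autocovariance at lag $k$ is $\sigma_a^2$ times the sum of products of the weights on the shocks, taken $k$ positions apart. The sketch below assumes that standard result; the function name `ma_acf` and the coefficient values are hypothetical.

```python
import numpy as np

def ma_acf(thetas, max_lag):
    """Theoretical rho_1..rho_max_lag for z_t = a_t - theta_1 a_{t-1} - ... - theta_q a_{t-q}."""
    # Weights on a_t, a_{t-1}, ..., a_{t-q}: 1, -theta_1, ..., -theta_q.
    w = np.concatenate(([1.0], -np.asarray(thetas, dtype=float)))
    gamma0 = np.sum(w ** 2)                      # var(z_t) / sigma_a^2
    rho = []
    for k in range(1, max_lag + 1):
        # cov(z_t, z_{t+k}) / sigma_a^2 = sum_j w_j * w_{j+k}; zero once k > q
        gamma_k = np.sum(w[:-k] * w[k:]) if k < w.size else 0.0
        rho.append(gamma_k / gamma0)
    return np.array(rho)

print(ma_acf([0.5], 3))         # MA(1): only rho_1 is nonzero
print(ma_acf([0.5, 0.3], 4))    # MA(2): rho_1 and rho_2 nonzero, zero afterwards
```

With $\theta_1 = 0.5$ the first call gives $\rho_1 = -0.5/1.25 = -0.4$, in agreement with the MA(1) formula.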
Autoregressive Models
$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t$$
Here the $z_t$ are regressed on themselves (hence the name), but lagged by various amounts. The simplest case is the first order, denoted as AR(1), which takes the form
$$z_t = \phi_1 z_{t-1} + a_t$$
$$E(z_t) = 0$$
$$\operatorname{var}(z_t) = \gamma_0 = \frac{\sigma_a^2}{1 - \phi_1^2},$$
so $|\phi_1| < 1$ is needed to ensure stationarity.
$$\gamma_k = \phi_1 \gamma_{k-1}, \qquad \rho_k = \phi_1^k$$
Thus $\rho_k$ "dies down" exponentially as $k$ increases, oscillating if $\phi_1 < 0$. Thus if the TAC of a series dies down rather than cuts off, we suspect it to be an AR rather than an MA.
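A short numerical sketch of the AR(1) results, using hypothetical values of $\phi_1$; the function names are illustrative only.

```python
import numpy as np

def ar1_acf(phi1, max_lag):
    """Theoretical rho_k = phi1**k for k = 0..max_lag (requires |phi1| < 1)."""
    if abs(phi1) >= 1:
        raise ValueError("AR(1) is stationary only if |phi1| < 1")
    return phi1 ** np.arange(max_lag + 1)

def ar1_variance(phi1, sigma_a2=1.0):
    """gamma_0 = sigma_a^2 / (1 - phi1^2)."""
    return sigma_a2 / (1.0 - phi1 ** 2)

print(ar1_acf(0.8, 4))     # 1, 0.8, 0.64, ...: smooth exponential decay
print(ar1_acf(-0.8, 4))    # 1, -0.8, 0.64, ...: decay with alternating sign
print(ar1_variance(0.8))   # 1 / (1 - 0.64)
```

The second call shows the oscillating pattern that arises when $\phi_1 < 0$.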
Note that AR and MA series are not entirely unrelated. It can be shown that an AR(1) can be expressed as an "infinite" MA series, much like the general linear process. The MA(1) can similarly be expressed as an "infinite" AR series.
Note: a linear process is a time series that has the form
$$y_t = a_t + \psi_1 a_{t-1} + \psi_2 a_{t-2} + \cdots$$
The AR(2) can be written as
$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + a_t$$
$$\rho_1 = \frac{\phi_1}{1 - \phi_2}$$
$$\rho_2 = \phi_1 \rho_1 + \phi_2$$
$$\rho_3 = \phi_1 \rho_2 + \phi_2 \rho_1$$
etc. Thus again the TAC dies down rather than cuts off, though it is difficult at times to tell the difference between the TACs of AR(1) and AR(2).
The theoretical partial autocorrelation function (TPAC) of an AR(2) has nonzero partial autocorrelations at lags 1 and 2 and zero at all lags after lag 2, i.e., it cuts off after lag 2.
In general, for AR(p), the TAC dies down and the TPAC cuts off after lag p.
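The AR(2) autocorrelations follow the recursion $\rho_k = \phi_1 \rho_{k-1} + \phi_2 \rho_{k-2}$ started from $\rho_0 = 1$ and $\rho_1 = \phi_1/(1-\phi_2)$. A minimal sketch, with hypothetical coefficients chosen inside the stationary region:

```python
import numpy as np

def ar2_acf(phi1, phi2, max_lag):
    """Theoretical rho_0..rho_max_lag of a stationary AR(2) via the recursion."""
    rho = np.empty(max_lag + 1)
    rho[0] = 1.0
    if max_lag >= 1:
        rho[1] = phi1 / (1.0 - phi2)
    for k in range(2, max_lag + 1):
        rho[k] = phi1 * rho[k - 1] + phi2 * rho[k - 2]
    return rho

rho = ar2_acf(0.5, 0.3, 10)   # hypothetical coefficients
print(np.round(rho, 4))       # dies down gradually; no cut-off
```

The printed values shrink steadily toward zero without ever becoming exactly zero, which is the "dies down" behaviour described above.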
ARMA(p, q) Mixed autoregressive-moving average models
The model can be written as
$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
or
$$z_t - \phi_1 z_{t-1} - \phi_2 z_{t-2} - \cdots - \phi_p z_{t-p} = a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
i.e., we move the autoregressive part to the left and keep the moving average part on the right.
ARMA(1, 1)
$$z_t = \phi_1 z_{t-1} + a_t - \theta_1 a_{t-1}$$
$$\rho_k = \frac{(1 - \theta_1 \phi_1)(\phi_1 - \theta_1)}{1 - 2\theta_1 \phi_1 + \theta_1^2}\, \phi_1^{k-1}, \quad k \ge 1$$
i.e., the TAC dies down exponentially from $\rho_1$ (not from $\rho_0 = 1$). The TPAC also dies down exponentially.
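The closed-form ARMA(1, 1) autocorrelation can be cross-checked numerically by writing the process as a linear process. The sketch assumes the standard weights $\psi_0 = 1$ and $\psi_j = (\phi_1 - \theta_1)\phi_1^{j-1}$ for $j \ge 1$, which are not given in the text above; the coefficient values and function names are hypothetical.

```python
import numpy as np

def arma11_acf(phi1, theta1, max_lag):
    """rho_1..rho_max_lag from the closed-form ARMA(1,1) expression."""
    rho1 = (1 - theta1 * phi1) * (phi1 - theta1) / (1 - 2 * theta1 * phi1 + theta1 ** 2)
    return rho1 * phi1 ** np.arange(max_lag)

def arma11_acf_from_psi(phi1, theta1, max_lag, n_terms=2000):
    """Same quantity from a truncated linear-process form z_t = sum_j psi_j a_{t-j}."""
    psi = np.empty(n_terms)
    psi[0] = 1.0
    psi[1:] = (phi1 - theta1) * phi1 ** np.arange(n_terms - 1)
    gamma = np.array([np.sum(psi[:n_terms - k] * psi[k:])
                      for k in range(max_lag + 1)])
    return gamma[1:] / gamma[0]

print(arma11_acf(0.7, 0.4, 5))
print(arma11_acf_from_psi(0.7, 0.4, 5))   # agrees up to truncation error
```

Both routes give the same values, and the ratio of successive autocorrelations equals $\phi_1$ from lag 1 onward, which is the exponential decay from $\rho_1$ noted above.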
Summary
We can therefore tentatively produce a Model Identification Chart, as follows, based on the behaviours of the SAC and SPAC of a stationary series.
SAC behaviour       SPAC behaviour      Tentative model
Cuts off after 1    Dies down           MA(1)
Cuts off after 2    Dies down           MA(2)
Dies down           Cuts off after 1    AR(1)
Dies down           Cuts off after 2    AR(2)
Dies down           Dies down           ARMA(1, 1)

This looks relatively obvious, but it isn't as easy in practice as it appears. Note that no process has an ACF and a PACF that both cut off.
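Box-Jenkins Models with a nonzero constant term
MA(q):
$$z_t = \delta + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$$
$$E(z_t) = \mu = \delta$$
AR(p):
$$z_t = \delta + \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t$$
$$\delta = \mu (1 - \phi_1 - \phi_2 - \cdots - \phi_p)$$
$$\mu = \frac{\delta}{1 - \phi_1 - \phi_2 - \cdots - \phi_p}$$
ARMA(p, q):
$$z_t = \delta + \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
$$\delta = \mu (1 - \phi_1 - \phi_2 - \cdots - \phi_p)$$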
Time Series Operations and Representation of ARMA(p, q) Models
Backshift Operator
$$B y_t = y_{t-1}$$
(Push the time series back to the previous position.)
Difference operator
$$\nabla = 1 - B$$
so $\nabla y_t = (1 - B) y_t = y_t - y_{t-1}$. Thus $\nabla$ is generally known as the differencing operator.
$$\nabla^2 y_t = \nabla(\nabla y_t) = \nabla(y_t - y_{t-1}) = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) = y_t - 2 y_{t-1} + y_{t-2}$$
Also $\nabla^d = (1 - B)^d$.
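The operator identities can be checked on a handful of made-up numbers. In the sketch below the backshift operator corresponds to shifting the array by one position, and NumPy's `np.diff` with `n=2` applies the difference twice.

```python
import numpy as np

y = np.array([3.0, 5.0, 4.0, 8.0, 7.0, 11.0])   # made-up values

d1 = y[1:] - y[:-1]                    # (1 - B) y_t = y_t - y_{t-1}
d2 = y[2:] - 2 * y[1:-1] + y[:-2]      # (1 - B)^2 y_t = y_t - 2 y_{t-1} + y_{t-2}

print(d1)    # values 2, -1, 4, -1, 4
print(d2)    # values -3, 5, -5, 5
print(np.array_equal(d1, np.diff(y)), np.array_equal(d2, np.diff(y, n=2)))
```

Each application of $\nabla$ shortens the series by one value, so the $d$-th difference of $n$ observations has $n - d$ values.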
Representation of an ARMA(p, q) model:
AR(p)
$$z_t = \delta + \phi_1 z_{t-1} + \cdots + \phi_p z_{t-p} + a_t$$
$$z_t - \phi_1 z_{t-1} - \cdots - \phi_p z_{t-p} = \delta + a_t$$
which can also be written as
$$(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p)\, z_t = \delta + a_t$$
Define $\phi_p(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$, so
$$\phi_p(B)\, z_t = \delta + a_t$$
MA(q): moving average model of order q
The model is written as
$$z_t = \delta + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
which can also be written as
$$z_t = \delta + (1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q)\, a_t$$
Define $\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$; then
$$z_t = \delta + \theta_q(B)\, a_t$$
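ARMA(p, q): mixed autoregressive-moving average model of order (p, q):
$$z_t = \delta + \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
or
$$z_t - \phi_1 z_{t-1} - \phi_2 z_{t-2} - \cdots - \phi_p z_{t-p} = \delta + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$
$$(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p)\, z_t = \delta + (1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q)\, a_t$$
or
$$\phi_p(B)\, z_t = \delta + \theta_q(B)\, a_t \qquad (*)$$
where $\phi_p(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$ and $\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$.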
In this notation, ARMA(p, 0)= AR(p) and ARMA(0, q) = MA(q).
In such cases one would prefer to write AR(p) and MA(q) instead of ARMA(p, 0) and ARMA(0, q).
Point estimates of the model parameters
Having identified a tentative ARMA model, we must now fit it to the dataset concerned, and in so doing obtain estimates of the parameters of the model. For the ARMA(p, q) model, the parameters are $\theta_i$, $\phi_i$ and $\delta$ (if the constant term is required).
These parameters are commonly estimated by the least squares method (as we understand it, both Minitab and SAS use this approach). The least squares method essentially finds the estimates such that
$$SSE = \sum_t (y_t - \hat y_t)^2$$
is a minimum.
You do not need to know the detailed algorithm. Isn't it nice that the computer packages do it for us?
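To give a feel for what "minimising SSE" means for an MA(1), the sketch below uses a simple conditional least squares grid search on simulated data. This is only an illustration under stated assumptions (starting value $a_0 = 0$, a grid of candidate values, a hypothetical true $\theta_1$); it is not the algorithm that Minitab or SAS actually use.

```python
import numpy as np

def ma1_sse(theta1, z):
    """SSE of one-step residuals for z_t = a_t - theta1 * a_{t-1}, starting from a_0 = 0."""
    a_prev, sse = 0.0, 0.0
    for z_t in z:
        a_t = z_t + theta1 * a_prev    # residual: z_t minus its forecast -theta1 * a_{t-1}
        sse += a_t ** 2
        a_prev = a_t
    return sse

# Simulated MA(1) data with a hypothetical true theta1 = -0.35.
rng = np.random.default_rng(7)
a = rng.normal(size=5001)
z = a[1:] - (-0.35) * a[:-1]

grid = np.linspace(-0.95, 0.95, 191)   # candidate theta1 values, step 0.01
sse = np.array([ma1_sse(th, z) for th in grid])
theta1_hat = grid[np.argmin(sse)]
print(round(float(theta1_hat), 2))
```

Real packages replace the grid with an iterative nonlinear least squares routine and handle the starting values more carefully, but the quantity being minimised has the same form.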
Forecasts
What is the meaning of forecasting? $\hat y_{t+\tau}(t)$ is a point forecast of the series at time $t + \tau$, given that the series has been observed from time 1 to time $t$. Statistically speaking,
$$\hat y_{t+\tau}(t) = E(y_{t+\tau} \mid y_1, y_2, \ldots, y_t)$$
Since ARMA models are built upon the series $\{a_t\}$, the properties of $\{a_t\}$ need to be revisited. In particular, $a_1, a_2, a_3, \ldots$ are independent, and future values of the $a$'s are independent of the present and past values of the $y$'s, i.e., $a_{t+1}$ is independent of $y_t, y_{t-1}, \ldots$.
Example: Paper Towel Sales
It is found that the differenced series can be fitted by an MA(1) model, so
$$z_t = a_t - \theta_1 a_{t-1} \quad (\text{assuming } \delta = 0).$$
Since $z_t = y_t - y_{t-1}$,
$$y_t - y_{t-1} = a_t - \theta_1 a_{t-1}$$
$$y_t = y_{t-1} + a_t - \theta_1 a_{t-1}$$
(This is known as the difference-equation form.)
One-step forecast:
First, we have $y_{t+1} = y_t + a_{t+1} - \theta_1 a_t$, so
$$\hat y_{t+1}(t) = E(y_{t+1} \mid y_1, y_2, \ldots, y_t) = E(y_t + a_{t+1} - \theta_1 a_t \mid y_1, \ldots, y_t) = y_t + 0 - \hat\theta_1 \hat a_t = y_t - \hat\theta_1 \hat a_t$$
since $a_{t+1}$ is independent of $y_1, \ldots, y_t$, so that $E(a_{t+1} \mid y_1, y_2, \ldots, y_t) = E(a_{t+1}) = 0$.
Let $t = 120$ and $\tau = 1$, so
$$\hat y_{121}(120) = y_{120} - \hat\theta_1 \hat a_{120}$$
In the absorbent towel sales example given in Table 9.1, Minitab gives $\hat\theta_1 = -0.3544$.
Final Estimates of Parameters
Type      Coef  SE Coef      T      P
MA 1   -0.3544   0.0864  -4.10  0.000

Differencing: 1 regular difference
Number of observations: Original series 120, after differencing 119
Residuals: SS = 127.367 (backforecasts excluded)
           MS = 1.079  DF = 118

Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag            12     24     36     48
Chi-Square   10.3   18.6   27.5   41.2
DF             11     23     35     47
P-Value     0.500  0.725  0.815  0.710
The last two residuals are $e_{119} = -1.0890$ and $e_{120} = 0.6903$, so $\hat a_{119} = -1.0890$ and $\hat a_{120} = 0.6903$. Thus
$$\hat y_{121}(120) = 15.6453 + 0.3544 \times 0.6903 = 15.8899$$
Using Minitab to forecast, we get

Forecasts from period 120

                  95 Percent Limits
Period  Forecast     Lower     Upper  Actual
121      15.8899   13.8532   17.9267

which is identical.
Two-step forecast:
$$y_{t+2} = y_{t+1} + a_{t+2} - \theta_1 a_{t+1}$$
$$\hat y_{t+2}(t) = \hat y_{t+1}(t) + E(a_{t+2}) - \hat\theta_1 E(a_{t+1}) = \hat y_{t+1}(t)$$
Again, let $t = 120$; then $\hat y_{122}(120) = \hat y_{121}(120) = 15.8899$. However, the prediction interval is wider:

Forecasts from period 120

                  95 Percent Limits
Period  Forecast     Lower     Upper  Actual
121      15.8899   13.8532   17.9267
122      15.8899   12.4609   19.3189
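The forecasts and limits above can be approximately reproduced by hand. The sketch below takes the numbers from the Minitab output and assumes, beyond what the text states, that the limits are forecast $\pm 1.96$ standard errors, with the one-step error being $a_{t+1}$ and the two-step error being $a_{t+2} + (1 - \theta_1) a_{t+1}$. Minitab's exact computation may differ slightly.

```python
import math

theta1_hat = -0.3544        # MA 1 coefficient from the Minitab output
y_120 = 15.6453             # last observed value
a_120 = 0.6903              # last residual
ss, df = 127.367, 118       # residual SS and DF from the Minitab output

s = math.sqrt(ss / df)                  # estimate of sigma_a
f1 = y_120 - theta1_hat * a_120         # one-step forecast
f2 = f1                                 # two-step forecast equals the one-step forecast

# Assumed forecast-error standard deviations for this ARIMA(0,1,1) model:
# one step: s;  two steps: s * sqrt(1 + (1 - theta1)^2)
se1 = s
se2 = s * math.sqrt(1 + (1 - theta1_hat) ** 2)

for period, f, se in [(121, f1, se1), (122, f2, se2)]:
    print(period, round(f, 4), round(f - 1.96 * se, 4), round(f + 1.96 * se, 4))
```

Under these assumptions the limits agree with the Minitab table to within about 0.001. The second interval is wider because the two-step forecast error contains two shocks rather than one.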
Finally, in ARIMA notation, we may write our model that fits the original series as
ARIMA(0,1,1).