Chapter 9: Nonseasonal Box-Jenkins Models

<p>The concepts of 'stationary time series' and 'nonstationary time series' are important in the Box-Jenkins methodology.</p>
<p>Stationary time series</p>
<p>A time series {y_t} is said to be stationary if the following two conditions are satisfied:</p>
<p>(a) the mean function is constant over time, i.e., μ_t = E(y_t) = c for all t;</p>
<p>(b) the autocorrelations ρ_{t,s} = cov(y_t, y_s) / √(var(y_t) var(y_s)) are not functions of time, i.e., ρ_{t,t−k} = ρ_{0,k} = ρ_k for all times t and lags k. This is equivalent to the condition that the autocovariances γ_{t,s} = cov(y_t, y_s) are also independent of time t, i.e., γ_{t,t−k} = γ_{0,k} = γ_k for all t and k.</p>
<p>In other words, both the autocorrelations ρ_{t,s} and the autocovariances γ_{t,s} depend only on the distance between the two time points s and t, not on the actual positions of s and t.</p>
<p>Note: since γ_{t,t} = cov(y_t, y_t) = var(y_t), a stationary time series also necessarily has a variance that is constant with respect to t.</p>
<p>Nonstationary time series</p>
<p>If the n values of y_t do not fluctuate around a constant mean, or do not fluctuate with constant variation, then it is reasonable to believe that the time series is not stationary.</p>
<p>[Figure: time series plot of a random walk with zero mean, Z_t against time t = 1, ..., 150.]</p>
<p>A nonstationary series can be transformed into a stationary one by first differencing: z_t = ∇y_t = y_t − y_{t−1}. 
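First differencing is a one-line transformation, so it is easy to sketch in code. A minimal Python version (the series values below are invented for illustration) shows that differencing once removes a linear trend and differencing twice removes a quadratic one:

```python
def difference(y, d=1):
    """Apply first differencing d times: z_t = y_t - y_{t-1}."""
    z = list(y)
    for _ in range(d):
        z = [z[t] - z[t - 1] for t in range(1, len(z))]
    return z

y = [10, 12, 15, 19, 24]      # hypothetical series with a quadratic trend
print(difference(y))          # first differences: [2, 3, 4, 5]
print(difference(y, d=2))     # second differences: [1, 1, 1]
```

Note that each round of differencing loses one observation, which matches the Minitab output later in the chapter (120 observations in the original towel sales series, 119 after differencing).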
The Minitab command for differencing is Stat ▷ Time Series ▷ Difference (lag 1).</p>
<p>(Differencing is like differentiation in calculus.) We have</p>
<p>∇y_t / ∇t = (y_t − y_{t−1}) / (t − (t−1)) = y_t − y_{t−1},</p>
<p>which is similar to the definition of the derivative of a function f(t):</p>
<p>f′(t) = lim_{Δ→0} [f(t+Δ) − f(t)] / [(t+Δ) − t] = lim_{Δ→0} [f(t+Δ) − f(t)] / Δ.</p>
<p>[Figure: time series plot of paper towel sales, y against index 1 to 120.]</p>
<p>[Figure: time series plot of the first differences of the towel sales series.]</p>
<p>If this is not sufficient, taking second differences (the first differences of the first differences) of the original series values normally does the job:</p>
<p>∇²y_t = ∇(∇y_t) = (y_t − y_{t−1}) − (y_{t−1} − y_{t−2}).</p>
<p>If a time series plot indicates increasing variability, it is common to transform the series first, using either a square root, quartic root or logarithmic transformation, and then take first differences.</p>
<p>Example: Consider the following NCR (New Company Registrations) rates data given below:</p>
<p>[Figure: time series plot of NCR against index 1 to 36.]</p>
<p>The series is clearly not stationary, since it has a trend and increasing variability, which means that both E(y_t) and var(y_t) depend on the time variable t.</p>
<p>[Figure: time series plot of lnNCR, the logged series.]</p>
<p>Clearly the log transformation has stabilised the variance somewhat.</p>
<p>Applying differencing to the logged series:</p>
<p>[Figure: time series plot of d1lnNCR, the first differences of the logged series.]</p>
<p>It now appears that the resulting series is stationary.</p>
<p>Working series</p>
<p>The textbook uses z_b, z_{b+1}, ..., z_n as the 'working series' obtained from the original series by transformation or differencing; b = 2 if z_t = y_t − y_{t−1}.</p>
<p>Sample autocorrelation coefficient (SAC)</p>
<p>The sample autocorrelation at lag k is</p>
<p>r_k = Σ_{t=b}^{n−k} (z_t − z̄)(z_{t+k} − z̄) / Σ_{t=b}^{n} (z_t − z̄)²,</p>
<p>where z̄ = Σ_{t=b}^{n} z_t / (n − b + 1).</p>
<p>The standard error of r_k is</p>
<p>s_{r_k} = 1 / (n − b + 1)^{1/2} if k = 1, and s_{r_k} = [1 + 2 Σ_{j=1}^{k−1} r_j²]^{1/2} / (n − b + 1)^{1/2} if k = 2, 3, ....</p>
<p>The t_{r_k}-statistic is t_{r_k} = r_k / s_{r_k}.</p>
<p>The SAC graph is a graph of the sample autocorrelations (Minitab calls it the ACF plot):</p>
<p>[Figure: autocorrelation function for y (original towel sales), with 5% significance limits, lags 1 to 30.]</p>
<p>Spikes</p>
<p>We say that a spike exists at lag k if r_k is statistically large, say |t_{r_k}| = |r_k / s_{r_k}| > 2.</p>
<p>In the Minitab ACF graph, any r_k that lies above or below the confidence bands is considered a spike, so you do not need to compute the value of t_{r_k}.</p>
<p>Cuts off after k</p>
<p>We say that the SAC cuts off after lag k if there are no spikes at lags greater than k in the SAC.</p>
<p>Using the SAC to identify a stationary time series</p>
<p>For nonseasonal data: (i) if the SAC either cuts off fairly quickly or dies down fairly quickly, the series is considered stationary; (ii) if the SAC dies down extremely slowly, the series is considered nonstationary.</p>
<p>Note that the SAC of the towel sales series refuses to die down quickly, a clear sign that the series is nonstationary.</p>
<p>Sample partial autocorrelation r_kk</p>
<p>This can be thought of as the sample autocorrelation of time series observations separated by a lag of k time units, with the effects of the intervening observations eliminated. 
</p>
<p>In other words, this measure of correlation is used to identify the extent of the relationship between current values of a variable and earlier values of the same variable (values at various time lags), while holding the effects of all other time lags constant.</p>
<p>Consider now the differenced series of the towel sales:</p>
<p>[Figure: autocorrelation function for z (differenced series), with 5% significance limits, lags 1 to 30.]</p>
<p>Here there is a cut-off at lag 1, so the differenced series is stationary.</p>
<p>Simple stationary time series models: ARMA</p>
<p>Let {a_t} be a sequence of random shocks which describes the effect of all factors other than z_{t−1} on z_t. It is more or less the residual error of the forecast (if the residuals e_t are not independent, then we cannot treat e_t as a_t).</p>
<p>Note: most textbooks call {a_t} white noise.</p>
<p>Properties of {a_t}: (i) a_1, a_2, a_3, ... are independent; (ii) a_i ~ N(0, σ_a²); (iii) a_{t+1} is independent of y_t, y_{t−1}, ....</p>
<p>{a_t} plays a very important role in the Box-Jenkins methodology. Essentially, every stationary Box-Jenkins model can be expressed in terms of the white noise process.</p>
<p>Simple Box-Jenkins models</p>
<p>Moving average models</p>
<p>z_t = a_t − θ_1 a_{t−1} − ... − θ_q a_{t−q}</p>
<p>We refer to this as a moving average process of order q, denoted MA(q). (Note that, structurally speaking, an MA(q) is expressed as an averaging of a_t terms, except for the negative signs.)</p>
<p>The special case MA(1): z_t = a_t − θ_1 a_{t−1}, for which</p>
<p>E(z_t) = 0, var(z_t) = σ_a²(1 + θ_1²), cov(z_t, z_{t+1}) = −θ_1 σ_a², and cov(z_t, z_{t+k}) = 0 for k ≥ 2.</p>
<p>Thus ρ_1 = −θ_1 / (1 + θ_1²) and all other ρ_k are zero. (Make sure you know how to derive the above.)</p>
<p>Hence the TAC (theoretical autocorrelation function) of an MA(1) "cuts off" after lag 1.</p>
<p>MA(2): z_t = a_t − θ_1 a_{t−1} − θ_2 a_{t−2}, for which</p>
<p>E(z_t) = 0, var(z_t) = σ_a²(1 + θ_1² + θ_2²), cov(z_t, z_{t+1}) = (−θ_1 + θ_1 θ_2) σ_a², cov(z_t, z_{t+2}) = −θ_2 σ_a², and cov(z_t, z_{t+k}) = 0 for k ≥ 3.</p>
<p>Thus ρ_1 = (−θ_1 + θ_1 θ_2) / (1 + θ_1² + θ_2²), ρ_2 = −θ_2 / (1 + θ_1² + θ_2²), and all other ρ_k are zero, so the TAC of an MA(2) "cuts off" after lag 2.</p>
<p>In general, for MA(q): (i) ρ_k ≠ 0 for k = 1, 2, ..., q and ρ_k = 0 for k > q; (ii) the PAC dies down.</p>
<p>Autoregressive models</p>
<p>z_t = φ_1 z_{t−1} + φ_2 z_{t−2} + ... + φ_p z_{t−p} + a_t</p>
<p>Here the z_t are regressed on themselves (hence, of course, the name), but lagged by various amounts. The simplest case is the first-order model, denoted AR(1), which takes the form</p>
<p>z_t = φ_1 z_{t−1} + a_t,</p>
<p>with E(z_t) = 0 and var(z_t) = γ_0 = σ_a² / (1 − φ_1²), so |φ_1| < 1 is required to ensure stationarity. Also, ρ_0 = 1 and ρ_k = φ_1 ρ_{k−1} = φ_1^k for k ≥ 1.</p>
<p>Thus ρ_k "dies down" exponentially as k increases, oscillating if φ_1 < 0. So if the TAC of a series dies down rather than cuts off, we suspect it to be an AR rather than an MA process.</p>
<p>Note that AR and MA series are not entirely unrelated. It can be shown that an AR(1) can be expressed as an "infinite" MA series, much like the general linear process. The MA(1) can similarly be expressed as an "infinite" AR series.</p>
<p>Note: a linear process is a time series that has the form y_t = a_t + ψ_1 a_{t−1} + ψ_2 a_{t−2} + ....</p>
<p>The AR(2) can be written as z_t = φ_1 z_{t−1} + φ_2 z_{t−2} + a_t, with</p>
<p>ρ_1 = φ_1 / (1 − φ_2), ρ_2 = φ_1 ρ_1 + φ_2, ρ_3 = φ_1 ρ_2 + φ_2 ρ_1, etc.</p>
<p>Thus again the TAC dies down rather than cuts off, though it is at times difficult to tell the difference between the TACs of an AR(1) and an AR(2).</p>
<p>The TPAC has nonzero partial autocorrelations at lags 1 and 2, and zero partial autocorrelations at all lags after lag 2, i.e., it cuts off after lag 2. 
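The theoretical autocorrelations above can be checked by simulation. As a sketch for the MA(1) case (pure Python; the value θ_1 = 0.6 and the sample size are chosen arbitrarily), the sample lag-1 autocorrelation of a long simulated MA(1) series should sit close to ρ_1 = −θ_1 / (1 + θ_1²):

```python
import random

def ma1_rho1(theta1):
    """Theoretical lag-1 autocorrelation of an MA(1): rho_1 = -theta1 / (1 + theta1**2)."""
    return -theta1 / (1 + theta1 ** 2)

def simulate_ma1(theta1, n, seed=0):
    """Simulate z_t = a_t - theta1 * a_{t-1} with independent N(0, 1) shocks a_t."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n + 1)]
    return [a[t] - theta1 * a[t - 1] for t in range(1, n + 1)]

def lag1_corr(z):
    """Sample lag-1 autocorrelation of a series."""
    zbar = sum(z) / len(z)
    num = sum((z[t] - zbar) * (z[t + 1] - zbar) for t in range(len(z) - 1))
    den = sum((x - zbar) ** 2 for x in z)
    return num / den

z = simulate_ma1(0.6, 20000)
print(round(ma1_rho1(0.6), 3))    # -0.441
print(round(lag1_corr(z), 3))     # close to -0.441; sample autocorrelations at lags >= 2 stay near zero
```

The same kind of check works for the AR(1): its sample autocorrelations decay roughly like φ_1^k rather than cutting off.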
</p>
<p>In general, for AR(p), the TAC dies down and the TPAC cuts off after lag p.</p>
<p>ARMA(p, q): mixed autoregressive-moving average models</p>
<p>The model can be written as</p>
<p>z_t = φ_1 z_{t−1} + φ_2 z_{t−2} + ... + φ_p z_{t−p} + a_t − θ_1 a_{t−1} − θ_2 a_{t−2} − ... − θ_q a_{t−q},</p>
<p>or</p>
<p>z_t − φ_1 z_{t−1} − φ_2 z_{t−2} − ... − φ_p z_{t−p} = a_t − θ_1 a_{t−1} − θ_2 a_{t−2} − ... − θ_q a_{t−q},</p>
<p>i.e., we move the autoregressive part to the left and keep the moving average part on the right.</p>
<p>ARMA(1, 1): z_t = φ_1 z_{t−1} + a_t − θ_1 a_{t−1}, for which</p>
<p>ρ_k = [(1 − θ_1 φ_1)(φ_1 − θ_1) / (1 − 2θ_1 φ_1 + θ_1²)] φ_1^{k−1} for k ≥ 1,</p>
<p>i.e., the TAC dies down exponentially from ρ_1 (not from ρ_0 = 1). The TPAC also dies down exponentially.</p>
<p>Summary</p>
<p>We can therefore tentatively produce a model identification chart, as follows, based on the behaviours of the SAC and SPAC of a stationary series.</p>
<p>SAC behaviour      | SPAC behaviour     | Tentative model
Cuts off after 1   | Dies down          | MA(1)
Cuts off after 2   | Dies down          | MA(2)
Dies down          | Cuts off after 1   | AR(1)
Dies down          | Cuts off after 2   | AR(2)
Dies down          | Dies down          | ARMA(1, 1)</p>
<p>This looks relatively obvious, but it is not as easy in practice as it appears. Note that no process has an ACF and a PACF that both cut off.</p>
<p>Box-Jenkins models with a nonzero constant term</p>
<p>MA(q): z_t = δ + a_t − θ_1 a_{t−1} − ... − θ_q a_{t−q}, with E(z_t) = μ = δ.</p>
<p>AR(p): z_t = δ + φ_1 z_{t−1} + φ_2 z_{t−2} + ... + φ_p z_{t−p} + a_t, with δ = μ(1 − φ_1 − φ_2 − ... − φ_p), i.e., μ = δ / (1 − φ_1 − φ_2 − ... − φ_p).</p>
<p>ARMA(p, q): z_t = δ + φ_1 z_{t−1} + ... + φ_p z_{t−p} + a_t − θ_1 a_{t−1} − ... − θ_q a_{t−q}, with δ = μ(1 − φ_1 − φ_2 − ... − φ_p).</p>
<p>Time series operations and representation of ARMA(p, q) models</p>
<p>Backshift operator</p>
<p>B y_t = y_{t−1} (pushes the time series back to the previous position).</p>
<p>Difference operator</p>
<p>∇ = 1 − B, so ∇y_t = (1 − B) y_t = y_t − y_{t−1}. Thus ∇ is generally known as the differencing operator.</p>
<p>∇²y_t = ∇(∇y_t) = ∇(y_t − y_{t−1}) = (y_t − y_{t−1}) − (y_{t−1} − y_{t−2}) = y_t − 2y_{t−1} + y_{t−2}.</p>
<p>Also, ∇^d = (1 − B)^d.</p>
<p>Representation of an ARMA(p, q) model:</p>
<p>AR(p): z_t = δ + φ_1 z_{t−1} + ... + φ_p z_{t−p} + a_t, i.e., z_t − φ_1 z_{t−1} − ... − φ_p z_{t−p} = δ + a_t, which can also be written as</p>
<p>(1 − φ_1 B − φ_2 B² − ... − φ_p B^p) z_t = δ + a_t.</p>
<p>Define φ_p(B) = 1 − φ_1 B − φ_2 B² − ... − φ_p B^p, so that φ_p(B) z_t = δ + a_t.</p>
<p>MA(q): the model z_t = δ + a_t − θ_1 a_{t−1} − θ_2 a_{t−2} − ... − θ_q a_{t−q} can also be written as</p>
<p>z_t = δ + (1 − θ_1 B − θ_2 B² − ... − θ_q B^q) a_t.</p>
<p>Define θ_q(B) = 1 − θ_1 B − θ_2 B² − ... − θ_q B^q; then z_t = δ + θ_q(B) a_t.</p>
<p>ARMA(p, q): z_t = δ + φ_1 z_{t−1} + ... + φ_p z_{t−p} + a_t − θ_1 a_{t−1} − ... − θ_q a_{t−q}, or</p>
<p>(1 − φ_1 B − φ_2 B² − ... − φ_p B^p) z_t = δ + (1 − θ_1 B − θ_2 B² − ... − θ_q B^q) a_t,</p>
<p>i.e., φ_p(B) z_t = δ + θ_q(B) a_t. (*)</p>
<p>In this notation, ARMA(p, 0) = AR(p) and ARMA(0, q) = MA(q). In such cases one would prefer to write AR(p) and MA(q) instead of ARMA(p, 0) and ARMA(0, q).</p>
<p>Point estimation of the model parameters</p>
<p>Having identified a tentative ARMA model, we must now fit it to the dataset concerned, in so doing obtaining estimates of the parameters defined by the model. For the ARMA(p, q) model, the parameters are the θ_i, the φ_i and δ (if the constant term is required).</p>
<p>These parameters are popularly estimated by the least squares method (as we understand it, both Minitab and SAS use this approach). 
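To give a rough feel for what a least squares fit does here (this is only a toy sketch, not what Minitab or SAS actually implement internally): for an MA(1), the shocks can be recovered recursively from the data as a_t = z_t + θ_1 a_{t−1}, starting from a_0 = 0, so each candidate value of θ_1 yields a sum of squared shocks, and the estimate is the candidate that minimises it. The data and the true parameter below are made up:

```python
import random

def css(z, theta1):
    """Conditional sum of squares for an MA(1): recover shocks via a_t = z_t + theta1 * a_{t-1}, a_0 = 0."""
    a_prev, sse = 0.0, 0.0
    for zt in z:
        a_t = zt + theta1 * a_prev
        sse += a_t ** 2
        a_prev = a_t
    return sse

# simulate MA(1) data with true theta1 = 0.5
rng = random.Random(1)
a = [rng.gauss(0, 1) for _ in range(2001)]
z = [a[t] - 0.5 * a[t - 1] for t in range(1, 2001)]

# crude grid search over the invertible range |theta1| < 1
theta_hat = min((t / 100 for t in range(-99, 100)), key=lambda th: css(z, th))
print(theta_hat)    # close to the true value 0.5
```

Real packages minimise essentially the same kind of criterion, but with a proper numerical optimiser and refinements such as backforecasting of the pre-sample shocks.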
The least squares method essentially finds the estimates so that SSE = Σ(y_t − ŷ_t)² is minimised.</p>
<p>You do not need to know the detailed algorithm. Isn't it nice that the computer packages do it for us?</p>
<p>Forecasts</p>
<p>What is the meaning of forecasting? ŷ_{t+τ}(t) is a point forecast of the series at time t + τ, given that the series has been observed from time 1 to time t. Statistically speaking, ŷ_{t+τ}(t) = E(y_{t+τ} | y_1, y_2, ..., y_t).</p>
<p>Since ARMA models are built upon the series {a_t}, the properties of {a_t} need to be revisited. In particular, a_1, a_2, a_3, ... are independent, and future values of the a's are independent of the present and past values of the y's, i.e., a_{t+1} is independent of y_t, y_{t−1}, ....</p>
<p>Example: paper towel sales</p>
<p>It is found that the differenced series can be fitted by an MA(1), so z_t = a_t − θ_1 a_{t−1} (assuming δ = 0). Since z_t = y_t − y_{t−1},</p>
<p>y_t − y_{t−1} = a_t − θ_1 a_{t−1}, i.e., y_t = y_{t−1} + a_t − θ_1 a_{t−1}.</p>
<p>(This is the model in difference-equation form.)</p>
<p>One-step forecast:</p>
<p>First, we have y_{t+1} = y_t + a_{t+1} − θ_1 a_t, so</p>
<p>ŷ_{t+1}(t) = E(y_{t+1} | y_1, y_2, ..., y_t) = E(y_t + a_{t+1} − θ_1 a_t | y_1, ..., y_t) = y_t + 0 − θ̂_1 â_t = y_t − θ̂_1 â_t,</p>
<p>since a_{t+1} is independent of y_1, ..., y_t, so that E(a_{t+1} | y_1, y_2, ..., y_t) = E(a_{t+1}) = 0.</p>
<p>Let t = 120 and τ = 1, so ŷ_121(120) = y_120 − θ̂_1 â_120. In the absorbent towel sales example given in Table 9.1, Minitab gives θ̂_1 = −0.3544:</p>
<p>Final Estimates of Parameters
Type      Coef   SE Coef      T      P
MA 1   -0.3544    0.0864  -4.10  0.000

Differencing: 1 regular difference
Number of observations: Original series 120, after differencing 119
Residuals: SS = 127.367 (backforecasts excluded), MS = 1.079, DF = 118

Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag            12     24     36     48
Chi-Square   10.3   18.6   27.5   41.2
DF             11     23     35     47
P-Value     0.500  0.725  0.815  0.710</p>
<p>The last two residuals are e_119 = −1.0890 and e_120 = 0.6903, so â_119 = −1.0890 and â_120 = 0.6903. Thus</p>
<p>ŷ_121(120) = 15.6453 + 0.3544 × 0.6903 = 15.8899.</p>
<p>Using Minitab to forecast, we get</p>
<p>Forecasts from period 120
                95 Percent Limits
Period  Forecast    Lower    Upper  Actual
   121   15.8899  13.8532  17.9267</p>
<p>which is identical.</p>
<p>Two-step forecast:</p>
<p>y_{t+2} = y_{t+1} + a_{t+2} − θ_1 a_{t+1}, so ŷ_{t+2}(t) = ŷ_{t+1}(t) + E(a_{t+2}) − θ_1 E(a_{t+1}) = ŷ_{t+1}(t).</p>
<p>Again, let t = 120; then ŷ_122(120) = ŷ_121(120) = 15.8899. However, the prediction interval is wider:</p>
<p>Forecasts from period 120
                95 Percent Limits
Period  Forecast    Lower    Upper  Actual
   121   15.8899  13.8532  17.9267
   122   15.8899  12.4609  19.3189</p>
<p>Finally, in ARIMA notation, we may write the model that fits the original series as ARIMA(0, 1, 1).</p>
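The one-step and two-step forecasts above amount to a couple of lines of arithmetic. Reproducing them from the quantities reported in this chapter (θ̂_1 = −0.3544 and â_120 = 0.6903 from the Minitab output, and y_120 = 15.6453 as implied by the worked calculation):

```python
theta1_hat = -0.3544   # Minitab estimate of theta_1
y_120 = 15.6453        # last observed value (implied by the worked example)
a_120 = 0.6903         # last residual, ahat_120

# one-step: yhat_121(120) = y_120 - theta1_hat * ahat_120
y_hat_121 = y_120 - theta1_hat * a_120
print(round(y_hat_121, 4))    # 15.8899

# two-step: yhat_122(120) = yhat_121(120), since E(a_121) = E(a_122) = 0
y_hat_122 = y_hat_121
print(round(y_hat_122, 4))    # 15.8899
```

This matches the Minitab forecasts for periods 121 and 122; only the prediction intervals differ between the two horizons.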
