Revised Chapter 15 in Specifying and Diagnostically Testing Econometric Models (Edition 3) © by Houston H. Stokes 8 February 2015. All rights reserved. Preliminary Draft

Chapter 15

Spectral Analysis of Time Series
15.0 Introduction
15.1 A Brief Treatment of Spectral Analysis Theory
   Table 15.1 Values and Names Calculated by the B34S spectral Command
15.2 Examples
   Table 15.2 Program to Generate AR(1) models
   Figure 15.1 AR(1) Model where φ = .9
   Figure 15.2 AR(1) Model where φ = -.9
   Table 15.3 Program to Analyze Gas Furnace Data using Spectral Methods
   Figure 15.3 Spectral analysis of GASIN
   Figure 15.4 Spectral analysis of GASOUT
   Figure 15.5 Cross spectral analysis of GASIN-GASOUT part 1
   Figure 15.5 Cross spectral analysis of GASIN-GASOUT part 2
15.3 Matrix Command Implementation
   Table 15.4 Matrix Command Implementation of Spectral Analysis
   Table 15.5 Cross Spectral Analysis with the Matrix Command
   Table 15.6 Verification that the sin and cosine vectors are orthogonal
   Table 15.7 Inverse Spectral Examples
15.4 Wavelet Analysis
   Table 15.8 Wavelet basis functions supported
   Table 15.9 Empirically derived factors for four wavelet bases
   Table 15.10 Wavelet Filter For The Nino Series
   Figure 15.6 Raw Nino series and three wavelet smoothed series
15.5 Use of Normalized Cumulative Periodogram to test for white noise
   Table 15.12 Calculating the Normalized Cumulative Periodogram
   Table 15.13 Testing series for white noise using the Normalized Cumulative Periodogram
   Figure 15.7 Normalized Cumulative Periodogram for gasout series
   Figure 15.8 Normalized Cumulative Periodogram for white noise series
15.6 Forecasting using spectral methods
   Table 15.14 A Subroutine to Calculate Forecasts using the FFT of a series
   Table 15.15 Using the FFT to Forecast
15.7 Conclusion

Spectral Analysis of Time Series

15.0 Introduction

For a single series the B34S spectral command calculates the Fourier cosine and sine coefficients, the periodogram and, if weights are supplied, the spectrum. The spectral command provides substantially more capability than the spectral option available under the B34S bjiden and bjest sentences. For multiple series the cspectral paragraph calculates the real part of the cross periodogram, the imaginary part of the cross periodogram, the cospectral density estimate, the quadrature-spectrum estimate, the amplitude, the coherency squared and the phase spectrum. The output from this command is similar to that produced with the SAS/ETS spectra command.1 After a brief discussion of the theory, the spectrum is calculated for generated AR data with positive and negative coefficients. While spectral analysis usually proceeds by a Fourier decomposition of the series into cosine and sine coefficients, it is possible to use OLS methods to make these calculations. The advantage of the OLS approach is that it highlights the relationship between time and frequency domain approaches. The matrix command also contains substantial spectral programming capability that includes spectral, cspectral and fft commands together with complex variable manipulation capability. Use of these features will be discussed later.

Wavelet analysis provides a means by which a series can be filtered in a manner that has certain advantages over the windowed Fourier transformation (WFT). Torrence and Compo (1998) stress that the WFT has the disadvantage of being inefficient since it "imposes a scale or 'response interval' T into the analysis," a problem not found with the wavelet transformation. They advocate using wavelets as a means by which noise can be filtered from a data series. Hastie-Tibshirani-Friedman (2001, 149) note the ability of wavelets to represent a series and provide a means by which a sparse representation can be found. "Wavelet bases are very popular in signal processing and compression, since they are able to represent both smooth and/or locally bumpy functions in an efficient way – a phenomenon dubbed time and frequency localization. In contrast, the traditional Fourier basis allows only frequency localization." In section 15.4 the Torrence-Compo wavelet implementation is discussed and examples are shown.

15.1 A Brief Treatment of Spectral Analysis Theory

The basic idea behind spectral analysis is to decompose the variance in a series by frequency. Assume the model

\[ y_t = a + \phi y_{t-1} + u_t. \tag{15.1-1} \]

If \(\phi > 0\), it will be shown that the series contains low-frequency information, while if \(\phi < 0\), the series contains high-frequency information. Spectral quantities, such as the periodogram and spectrum, can be calculated using frequency-domain methods or time-domain methods such as OLS.

Assume an autocovariance-generating function (see Hamilton (1994) equation [3.6.1])

1 Jenkins-Watts (1968), Anderson (1971) and Bloomfield (1976) provide good references for spectral methods. Hamilton (1994, Chap. 6) provides a very concise treatment, which is further summarized here. For a more modern reference, see Wei (2006, Chapters 11-13).

\[ g_y(z) = \sum_{j=-\infty}^{\infty} \gamma_j z^j, \tag{15.1-2} \]

where \(z\) is a complex scalar and the covariances \(\gamma_j\) are summable. For spectral models assume

\[ z = \cos(\omega) - i\sin(\omega) = e^{-i\omega}. \tag{15.1-3} \]

The population spectrum \(S_y(\omega)\) is equal to the autocovariance-generating function \(g_y(z)\) evaluated at \(z = e^{-i\omega}\) and divided by \(2\pi\):

\[ S_y(\omega) = (2\pi)^{-1} g_y(e^{-i\omega}) = (2\pi)^{-1}\sum_{j=-\infty}^{\infty} \gamma_j e^{-i\omega j}. \tag{15.1-4} \]

Consider an MA(1) model. Once the autocovariance-generating function has been obtained, the spectrum follows directly from it. Assume a model such as

\[ y_t = \mu + e_t + \theta e_{t-1}, \tag{15.1-5} \]

which was discussed in Chapter 7. Here

\[ E(y_t) = \mu + E(e_t) + \theta E(e_{t-1}) \tag{15.1-6} \]

\[
\begin{aligned}
E(y_t-\mu)^2 &= E(e_t + \theta e_{t-1})^2 \\
&= E(e_t^2 + 2\theta e_t e_{t-1} + \theta^2 e_{t-1}^2) \\
&= \sigma^2 + 0 + \theta^2\sigma^2
\end{aligned} \tag{15.1-7}
\]

\[
\begin{aligned}
E(y_t-\mu)(y_{t-1}-\mu) &= E(e_t + \theta e_{t-1})(e_{t-1} + \theta e_{t-2}) \\
&= E(e_t e_{t-1} + \theta e_{t-1}^2 + \theta e_t e_{t-2} + \theta^2 e_{t-1} e_{t-2}) \\
&= 0 + \theta\sigma^2 + 0 + 0
\end{aligned} \tag{15.1-8}
\]

Using (15.1-2), (15.1-7) and (15.1-8) the autocovariance-generating function for this model becomes

\[
\begin{aligned}
g_y(z) &= [\theta\sigma^2]z^{-1} + [(1+\theta^2)\sigma^2]z^{0} + [\theta\sigma^2]z^{1} \\
&= \sigma^2[\theta z^{-1} + 1 + \theta^2 + \theta z] \\
&= \sigma^2(1+\theta z)(1+\theta z^{-1}).
\end{aligned} \tag{15.1-9}
\]


For the general MA(q) model

\[ g_y(z) = \sigma^2(1+\theta_1 z+\theta_2 z^2+\cdots+\theta_q z^q)(1+\theta_1 z^{-1}+\theta_2 z^{-2}+\cdots+\theta_q z^{-q}). \tag{15.1-10} \]

For the AR(1) model

\[ y_t - \mu = (1-\phi B)^{-1} e_t \tag{15.1-11} \]

\[ g_y(z) = \sigma^2/[(1-\phi z)(1-\phi z^{-1})]. \tag{15.1-12} \]

In general, for the ARMA(p,q) model

\[ g_y(z) = \frac{\sigma^2(1+\theta_1 z+\theta_2 z^2+\cdots+\theta_q z^q)(1+\theta_1 z^{-1}+\theta_2 z^{-2}+\cdots+\theta_q z^{-q})}{(1-\phi_1 z-\phi_2 z^2-\cdots-\phi_p z^p)(1-\phi_1 z^{-1}-\phi_2 z^{-2}-\cdots-\phi_p z^{-p})}. \tag{15.1-13} \]

We now show how to write the spectrum of the MA(1) model as a function of the frequency \(\omega\).

From (15.1-3), (15.1-4) and (15.1-9)

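\[
\begin{aligned}
S_y(\omega) &= (2\pi)^{-1}\sigma^2(1+\theta e^{-i\omega})(1+\theta e^{i\omega}) \\
&= (2\pi)^{-1}\sigma^2(1+\theta e^{-i\omega}+\theta e^{i\omega}+\theta^2).
\end{aligned} \tag{15.1-14}
\]

Since

\[
\begin{aligned}
e^{-i\omega}+e^{i\omega} &= \cos(\omega) - i\sin(\omega) + \cos(\omega) + i\sin(\omega) \\
&= 2\cos(\omega)
\end{aligned} \tag{15.1-15}
\]

\[ S_y(\omega) = (2\pi)^{-1}\sigma^2[1+\theta^2+2\theta\cos(\omega)]. \tag{15.1-16} \]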

Since \(\cos(\omega)\) goes from 1 to -1 as \(\omega\) goes from 0 to \(\pi\), equation (15.1-16) indicates that if \(\theta > 0\) (\(\theta < 0\)), the spectrum monotonically decreases (increases) as frequency \(\omega\) increases from 0 to \(\pi\).

The spectrum of the AR(1) model in equation (15.1-11) can be calculated from the covariance-generating function (15.1-12) using (15.1-15) as

\[
\begin{aligned}
S_y(\omega) &= (2\pi)^{-1}\sigma^2/[(1-\phi e^{-i\omega})(1-\phi e^{i\omega})] \\
&= (2\pi)^{-1}\sigma^2/[(1-\phi e^{-i\omega}-\phi e^{i\omega}+\phi^2)] \\
&= (2\pi)^{-1}\sigma^2/[(1+\phi^2-2\phi\cos(\omega))].
\end{aligned} \tag{15.1-17}
\]

Equation (15.1-17) shows that if \(\phi > 0\) (\(\phi < 0\)), the spectrum is monotonically decreasing (increasing) over the range \([0,\pi]\) since the denominator increases (decreases).
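The monotonicity claims for the MA(1) and AR(1) spectra can be checked numerically. The following is a minimal Python sketch, not B34S code; the function names, the 201-point frequency grid and the parameter values (.9, -.9, .5, -.5) are illustrative assumptions rather than anything taken from the B34S implementation.

```python
import math

def ma1_spectrum(omega, theta, sigma2=1.0):
    """Population spectrum of an MA(1) process, equation (15.1-16)."""
    return sigma2 * (1.0 + theta**2 + 2.0 * theta * math.cos(omega)) / (2.0 * math.pi)

def ar1_spectrum(omega, phi, sigma2=1.0):
    """Population spectrum of an AR(1) process, equation (15.1-17)."""
    return sigma2 / (2.0 * math.pi * (1.0 + phi**2 - 2.0 * phi * math.cos(omega)))

def is_decreasing(values):
    return all(a > b for a, b in zip(values, values[1:]))

def is_increasing(values):
    return all(a < b for a, b in zip(values, values[1:]))

# Evenly spaced frequencies on [0, pi] (grid size is an arbitrary choice)
grid = [math.pi * k / 200 for k in range(201)]

ar_pos = [ar1_spectrum(w, 0.9) for w in grid]    # phi > 0: low-frequency power
ar_neg = [ar1_spectrum(w, -0.9) for w in grid]   # phi < 0: high-frequency power
ma_pos = [ma1_spectrum(w, 0.5) for w in grid]
ma_neg = [ma1_spectrum(w, -0.5) for w in grid]

print(is_decreasing(ar_pos), is_increasing(ar_neg))
print(is_decreasing(ma_pos), is_increasing(ma_neg))
```

With a positive coefficient the largest ordinate is at frequency 0, and with a negative coefficient it is at frequency π, which is the pattern the generated AR(1) examples in section 15.2 are designed to show.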

From (15.1-13) the spectrum for the ARMA(p,q) process becomes

\[ S_y(\omega) = \frac{\sigma^2(1+\theta_1 z+\theta_2 z^2+\cdots+\theta_q z^q)(1+\theta_1 z^{-1}+\theta_2 z^{-2}+\cdots+\theta_q z^{-q})}{2\pi(1-\phi_1 z-\phi_2 z^2-\cdots-\phi_p z^p)(1-\phi_1 z^{-1}-\phi_2 z^{-2}-\cdots-\phi_p z^{-p})}. \tag{15.1-18} \]

Following Hamilton (1994), it can be shown that if

\[ (1+\theta_1 z+\theta_2 z^2+\cdots+\theta_q z^q) = (1-\eta_1 z)(1-\eta_2 z)\cdots(1-\eta_q z) \tag{15.1-19} \]

\[ (1-\phi_1 z-\phi_2 z^2-\cdots-\phi_p z^p) = (1-\lambda_1 z)(1-\lambda_2 z)\cdots(1-\lambda_p z) \tag{15.1-20} \]

then (15.1-18) can be written as

\[ S_y(\omega) = \frac{\sigma^2}{2\pi}\,\frac{\prod_{j=1}^{q}\left(1+\eta_j^2-2\eta_j\cos(\omega)\right)}{\prod_{j=1}^{p}\left(1+\lambda_j^2-2\lambda_j\cos(\omega)\right)}. \tag{15.1-21} \]

The spectrum contains all the information concerning the autocovariances. Assuming the sequence of autocovariances \(\gamma_j\) is summable over the range \(j = -\infty, \ldots, \infty\) (i.e., the covariances die out), Hamilton (1994, Appendix 6.A) proves that

\[ \int_{-\pi}^{\pi} S_y(\omega)\, e^{i\omega k}\, d\omega = \gamma_k \tag{15.1-22} \]

or, alternatively,

\[ \int_{-\pi}^{\pi} S_y(\omega)\cos(\omega k)\, d\omega = \gamma_k. \tag{15.1-23} \]

For k=0, equation (15.1-22) becomes

\[ \int_{-\pi}^{\pi} S_y(\omega)\, d\omega = \gamma_0, \tag{15.1-24} \]

which indicates that the area under the population spectrum between \(-\pi\) and \(\pi\) is the variance.
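Equations (15.1-23) and (15.1-24) can be illustrated with the MA(1) spectrum (15.1-16), whose autocovariances are known from (15.1-7) and (15.1-8). The sketch below is in Python and uses a simple midpoint rule; the helper names, the number of integration points and the values θ = .6 and σ² = 2 are assumptions chosen only for illustration.

```python
import math

def ma1_spectrum(omega, theta, sigma2=1.0):
    """Population spectrum of an MA(1) process, equation (15.1-16)."""
    return sigma2 * (1.0 + theta**2 + 2.0 * theta * math.cos(omega)) / (2.0 * math.pi)

def autocovariance_from_spectrum(spectrum, k, n=2000):
    """Approximate the integral in (15.1-23) over [-pi, pi] with a midpoint rule."""
    step = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        omega = -math.pi + (i + 0.5) * step
        total += spectrum(omega) * math.cos(omega * k)
    return total * step

theta, sigma2 = 0.6, 2.0

def spec(omega):
    return ma1_spectrum(omega, theta, sigma2)

gamma0 = autocovariance_from_spectrum(spec, 0)   # variance, (1 + theta^2) * sigma^2
gamma1 = autocovariance_from_spectrum(spec, 1)   # lag-1 autocovariance, theta * sigma^2
gamma2 = autocovariance_from_spectrum(spec, 2)   # zero for an MA(1)

print(gamma0, gamma1, gamma2)
```

The area under the spectrum recovers the variance, and weighting the spectrum by cos(ωk) recovers the autocovariance at lag k, so the spectrum and the autocovariances carry the same information.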

Since the spectrum is symmetric, the portion of the variance of \(y_t\) that is attributed to frequencies less than \(\omega_1\) in absolute value is \(2\int_{0}^{\omega_1} S_y(\omega)\, d\omega\).

Results for a sample of T observations follow directly from the above results for the population. From (15.1-4) we get

\[ \hat{P}_y(\omega) = (2\pi)^{-1}\sum_{j=-T+1}^{T-1} \hat{\gamma}_j e^{-i\omega j}, \tag{15.1-25} \]

where \(\hat{P}_y(\omega)\) is the sample periodogram.2 Since

\[ e^{-i\omega j} = \cos(\omega j) - i\sin(\omega j) \tag{15.1-26} \]

2 \(\hat{P}_y(\omega)\) is called the sample periodogram. The sample periodogram can be smoothed to form the sample spectrum \(\hat{S}_y(\omega)\). Weights are selected to smooth out noise in the estimated sample periodogram. In B34S there must be an odd number of weights. All weights are normalized to sum to 1/(4π). WEIGHTS(1 1 1)$ implies rectangular weighting, while WEIGHTS(1 2 1)$ implies triangular weighting.

and for a covariance stationary process \(\gamma_j = \gamma_{-j}\), we can write (15.1-25) as

\[
\begin{aligned}
\hat{P}_y(\omega) &= (2\pi)^{-1}\sum_{j=-T+1}^{T-1}\hat{\gamma}_j[\cos(\omega j) - i\sin(\omega j)] \\
&= (2\pi)^{-1}\Big\{\hat{\gamma}_0[\cos(0) - i\sin(0)] + \sum_{j=1}^{T-1}\hat{\gamma}_j[\cos(\omega j) + \cos(-\omega j) - i\sin(\omega j) - i\sin(-\omega j)]\Big\},
\end{aligned} \tag{15.1-27}
\]

which, since \(\cos(0) = 1\), \(\sin(0) = 0\), \(\sin(-\theta) = -\sin(\theta)\) and \(\cos(-\theta) = \cos(\theta)\), can be written

\[ \hat{P}_y(\omega) = (2\pi)^{-1}\Big[\hat{\gamma}_0 + 2\sum_{j=1}^{T-1}\hat{\gamma}_j\cos(\omega j)\Big]. \tag{15.1-28} \]

The sample periodogram can be estimated using Fourier methods or OLS. While the OLS approach is slower, it highlights what is being estimated with spectral analysis. The sample equivalent of the spectral representation theorem that states that any covariance-stationary process yt can be written as

\[ y_t = \mu + \int_{0}^{\pi}[\alpha(\omega)\cos(\omega t) + \delta(\omega)\sin(\omega t)]\, d\omega \tag{15.1-29} \]

is

\[ y_t = \hat{\mu} + \sum_{j=1}^{M}[\hat{\alpha}_j\cos(\omega_j(t-1)) + \hat{\delta}_j\sin(\omega_j(t-1))] + e_t. \tag{15.1-30} \]

If T is odd, there will be \(M = (T-1)/2\) frequencies in equation (15.1-30), where \(\omega_1 = 2\pi/T\), \(\omega_2 = 4\pi/T\), and \(\omega_M = 2M\pi/T = (T-1)\pi/T\). If equation (15.1-30) is estimated M times, once for each frequency, then the M sets of coefficients \(\hat{\alpha}_j\) and \(\hat{\delta}_j\) can be used to recover the sample periodogram and other quantities. The reason for this is that all the right-hand-side variables are orthogonal and hence the regression can be done as a sequence of models, each with two variables on the right plus the constant. Define \(\hat{P}_i(\omega_j)\) as the sample periodogram of series i (here \(y_t\)) evaluated at frequency \(\omega_j\); then

\[ \hat{P}_i(\omega_j) = (T/8\pi)(\hat{\alpha}_j^2 + \hat{\delta}_j^2). \tag{15.1-31} \]

The sample variance3 involves summing the periodogram values and is

\[ (1/T)\sum_{t=1}^{T}(y_t - \bar{y})^2 = .5\Big[\sum_{j=1}^{M}(\hat{\alpha}_j^2 + \hat{\delta}_j^2)\Big]. \tag{15.1-32} \]

Since

\[ \hat{\alpha}_j = (2/T)\sum_{t=1}^{T} y_t\cos[\omega_j(t-1)] \tag{15.1-33} \]

\[ \hat{\delta}_j = (2/T)\sum_{t=1}^{T} y_t\sin[\omega_j(t-1)], \tag{15.1-34} \]

(15.1-31) can be written as

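\[ \hat{P}_i(\omega_j) = (1/2\pi T)\Big\{\Big[\sum_{t=1}^{T} y_t\cos[\omega_j(t-1)]\Big]^2 + \Big[\sum_{t=1}^{T} y_t\sin[\omega_j(t-1)]\Big]^2\Big\}. \tag{15.1-35} \]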
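The relationships in (15.1-28) and (15.1-31) through (15.1-34) can be checked on simulated data. The Python sketch below is an illustration only: the white-noise series, the seed, the sample size T = 41 and all function names are assumptions, and the sample autocovariances are computed with divisor T (the large sample formula). It verifies that the cosine, sine and constant regressors are orthogonal at a Fourier frequency, that the periodogram built from the coefficients matches the one built from the autocovariances, and that the variance identity (15.1-32) holds.

```python
import math
import random

random.seed(0)
T = 41                      # odd, so M = (T - 1) / 2
M = (T - 1) // 2
y = [random.gauss(0.0, 1.0) for _ in range(T)]
ybar = sum(y) / T

def freq(j):
    return 2.0 * math.pi * j / T

def alpha_hat(j):           # equation (15.1-33); index t below plays the role of t - 1
    return (2.0 / T) * sum(y[t] * math.cos(freq(j) * t) for t in range(T))

def delta_hat(j):           # equation (15.1-34)
    return (2.0 / T) * sum(y[t] * math.sin(freq(j) * t) for t in range(T))

def cross_products(j):
    """Sums that make up X'X for a regression on a constant, cos and sin."""
    c = [math.cos(freq(j) * t) for t in range(T)]
    s = [math.sin(freq(j) * t) for t in range(T)]
    return (sum(c), sum(s), sum(a * b for a, b in zip(c, s)),
            sum(a * a for a in c), sum(b * b for b in s))

def gamma_hat(k):           # sample autocovariance with divisor T
    return sum((y[t] - ybar) * (y[t - k] - ybar) for t in range(k, T)) / T

def periodogram_from_coefficients(j):       # equation (15.1-31)
    return (T / (8.0 * math.pi)) * (alpha_hat(j) ** 2 + delta_hat(j) ** 2)

def periodogram_from_autocovariances(j):    # equation (15.1-28)
    tail = sum(gamma_hat(k) * math.cos(freq(j) * k) for k in range(1, T))
    return (gamma_hat(0) + 2.0 * tail) / (2.0 * math.pi)

max_gap = max(abs(periodogram_from_coefficients(j) - periodogram_from_autocovariances(j))
              for j in range(1, M + 1))
sample_variance = gamma_hat(0)
half_sum = 0.5 * sum(alpha_hat(j) ** 2 + delta_hat(j) ** 2 for j in range(1, M + 1))

print(cross_products(3))
print(max_gap, sample_variance, half_sum)
```

Because the off-diagonal cross products are zero and the cosine and sine sums of squares both equal T/2, the OLS coefficients of each two-variable regression reduce to the (2/T) sums in (15.1-33) and (15.1-34).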

It can be shown that

\[ 2\hat{P}_i(\omega)/S_i(\omega) \sim \chi^2(2), \tag{15.1-36} \]

which implies that

\[ E[\hat{P}_i(\omega)] = S_i(\omega). \tag{15.1-37} \]

Since \(\chi^2(2)\) has a mean of 2 and a 95% confidence interval of .05 - 7.4, \(\hat{P}_i(\omega)\) is not a good estimate of the population spectrum. Another problem is that (15.1-28) requires that as many parameters (\(\hat{\gamma}_j\)) be estimated as there are observations. The solution is to weight \(\hat{P}_i(\omega)\) to form an estimate of the sample spectrum for series i, where h = (the number of weights minus 1) divided by 2.

\[ \hat{S}_i(\omega_j) = \sum_{m=-h}^{h} w_{m+h+1}\,\hat{P}_i(\omega_{j+m}). \tag{15.1-38} \]

In B34S the WEIGHTS sentence requires that the user set an odd number of weights to be used for smoothing the spectrum. The weights are then normalized to sum to 1/(4*π) or .079577 .

3 Note that the large sample variance formula is used.

Up to 99 weights can be supplied. If the supplied sentence was

weights(1 2 3 2 1) $

then triangular weights of .0088419, .017684, .026526, .017684 and .0088419 would be used, while if the sentence was

weights(1 1 1 1 1)$

then there would be five weights of 1/(20π) or .015915. Note that both sets of weights sum to 1/(4π) or .079577.
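The normalization described above can be reproduced with a few lines of Python. This is a sketch of the arithmetic only; the function name normalize_weights is hypothetical and is not part of B34S.

```python
import math

def normalize_weights(raw):
    """Scale raw weights so that they sum to 1/(4*pi), as described for the WEIGHTS sentence."""
    if len(raw) % 2 == 0:
        raise ValueError("an odd number of weights is required")
    target = 1.0 / (4.0 * math.pi)
    total = float(sum(raw))
    return [target * w / total for w in raw]

triangular = normalize_weights([1, 2, 3, 2, 1])
rectangular = normalize_weights([1, 1, 1, 1, 1])

print(triangular)
print(rectangular)
print(sum(triangular), sum(rectangular))
```

Only the relative sizes of the supplied weights matter, since any common scale factor is removed by the normalization.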

B34S and SAS have slightly different parameterizations of the spectral quantities.

The values calculated by B34S (and SAS) are listed in Table 15.1. Note that \(\omega_j\) in equations (15.1-33) and (15.1-34) is \(2k\pi/T\) in Table 15.1 in the calculation of \(\hat{\alpha}_j\) and \(\hat{\delta}_j\), which are called COSj(k) and SINj(k), respectively. B34S and SAS normalize \((\hat{\alpha}^2 + \hat{\delta}^2)\) by T/2 in place of the (T/8π) listed in equation (15.1-31). If the data contain cycles with frequencies greater than π, these will be aliased and will appear as cycles with frequencies in the range 0 to π. For the lowest frequency \(\omega_1 = 2\pi/T\), the corresponding period is T. In words, this means that if there are T data points, it will be impossible to detect a cycle with a period longer than T. If two uncorrelated series are added, then the autocovariance-generating function of the sum is the sum of the autocovariance-generating functions of each series.

Up until now the analysis has been in terms of one series. We now assume two series, \(x_t\) and \(y_t\), and develop a spectral representation. Expanding the notation, define the population cross spectrum as

\[ S_{xy}(\omega) = (1/2\pi)\sum_{j=-\infty}^{\infty}\gamma_{xy}^{(j)}\{\cos(\omega j) - i\sin(\omega j)\}. \tag{15.1-39} \]

Equation (15.1-39) can be broken into two parts, the population cospectrum \(c_{xy}(\omega)\) defined as

\[ c_{xy}(\omega) = (1/2\pi)\sum_{j=-\infty}^{\infty}\gamma_{xy}^{(j)}\cos(\omega j) \tag{15.1-40} \]

and the population quadrature spectrum defined as

\[ q_{xy}(\omega) = -(1/2\pi)\sum_{j=-\infty}^{\infty}\gamma_{xy}^{(j)}\sin(\omega j), \tag{15.1-41} \]

where

\[ S_{xy}(\omega) = c_{xy}(\omega) + i\,q_{xy}(\omega). \tag{15.1-42} \]

It can be shown that the covariance between x and y is

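\[ \int_{-\pi}^{\pi} S_{xy}(\omega)\, d\omega = E(y_t-\mu_y)(x_t-\mu_x) \tag{15.1-43} \]

or

\[ \int_{-\pi}^{\pi} c_{xy}(\omega)\, d\omega = E(y_t-\mu_y)(x_t-\mu_x) \tag{15.1-44} \]

since

\[ \int_{-\pi}^{\pi} q_{xy}(\omega)\, d\omega = 0. \tag{15.1-45} \]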

The population cospectrum, \(c_{xy}(\omega)\), measures the portion of the covariance between \(x_t\) and \(y_t\) that is attributable to cycles of frequency \(\omega\). Looking now at the sample, define

\[ \hat{\alpha}_{yj} = (2/T)\sum_{t=1}^{T} y_t\cos[\omega_j(t-1)] \tag{15.1-46} \]

\[ \hat{\delta}_{yj} = (2/T)\sum_{t=1}^{T} y_t\sin[\omega_j(t-1)] \tag{15.1-47} \]

\[ \hat{\alpha}_{xj} = (2/T)\sum_{t=1}^{T} x_t\cos[\omega_j(t-1)] \tag{15.1-48} \]

\[ \hat{\delta}_{xj} = (2/T)\sum_{t=1}^{T} x_t\sin[\omega_j(t-1)] \tag{15.1-49} \]

The sample covariance between \(x_t\) and \(y_t\) can be written as

\[ (1/T)\sum_{t=1}^{T}(y_t-\bar{y})(x_t-\bar{x}) = (1/2)\sum_{j=1}^{M}(\hat{\alpha}_{yj}\hat{\alpha}_{xj} + \hat{\delta}_{xj}\hat{\delta}_{yj}), \tag{15.1-49} \]

which implies that the portion of the sample covariance between \(x_t\) and \(y_t\) that is due to common dependence on cycles of frequency \(\omega_j\) is \((1/2)(\hat{\alpha}_{xj}\hat{\alpha}_{yj} + \hat{\delta}_{xj}\hat{\delta}_{yj})\). The sample cross periodogram is the sum of a real component \(\hat{c}_{xy}(\omega_j)\) and an imaginary component \(\hat{q}_{xy}(\omega_j)\)

\[ \hat{P}_{xy}(\omega_j) = \hat{c}_{xy}(\omega_j) + i\,\hat{q}_{xy}(\omega_j) \tag{15.1-50} \]

where

\[ \hat{c}_{xy}(\omega_j) = (T/8\pi)(\hat{\alpha}_{xj}\hat{\alpha}_{yj} + \hat{\delta}_{xj}\hat{\delta}_{yj}) \tag{15.1-51} \]

\[ \hat{q}_{xy}(\omega_j) = (T/8\pi)(\hat{\alpha}_{yj}\hat{\delta}_{xj} - \hat{\delta}_{yj}\hat{\alpha}_{xj}). \tag{15.1-52} \]

The formulas used by B34S and SAS scale by (T/2) in place of (T/8π) in equations (15.1-51) and (15.1-52), or \(RP_{xy}(k) = 4\pi\,\hat{c}_{xy}(\omega_j)\) and \(IP_{xy}(k) = 4\pi\,\hat{q}_{xy}(\omega_j)\).

While the real part of the sample cross periodogram measures to what degree \(x_t\) and \(y_t\) have common cycles, i.e., cycles at the same frequency, the imaginary component of the sample cross periodogram corrects for whether these cycles are in phase or out of phase. If both series shared common cycles, but these cycles were out of phase, the result would be that the contemporaneous covariance would be low. In a manner similar to equation (15.1-38), the real and imaginary parts of the sample cross periodogram can be weighted to form the sample cospectral density estimate \(\hat{C}_{xy}(\omega_j)\) and the sample quadrature spectrum \(\hat{Q}_{xy}(\omega_j)\).

\[ \hat{C}_{xy}(\omega_j) = \sum_{m=-h}^{h} w_{m+h+1}\,\hat{c}_{xy}(\omega_{j+m}) \tag{15.1-53} \]

\[ \hat{Q}_{xy}(\omega_j) = \sum_{m=-h}^{h} w_{m+h+1}\,\hat{q}_{xy}(\omega_{j+m}). \tag{15.1-54} \]

The sample coherence \(\hat{h}_{xy}(\omega_j)\), which measures the correlation between \(x_t\) and \(y_t\) by frequency, is

\[ \hat{h}_{xy}(\omega_j) = [(\hat{C}_{xy}(\omega_j))^2 + (\hat{Q}_{xy}(\omega_j))^2]/[\hat{S}_x(\omega_j)\hat{S}_y(\omega_j)], \tag{15.1-55} \]

while the amplitude \(\hat{A}_{xy}(\omega_j)\) is

\[ \hat{A}_{xy}(\omega_j) = [(\hat{C}_{xy}(\omega_j))^2 + (\hat{Q}_{xy}(\omega_j))^2]^{.5}. \tag{15.1-56} \]

Using (15.1-56), (15.1-55) can be rewritten as

\[ \hat{h}_{xy}(\omega_j) = [\hat{A}_{xy}(\omega_j)]^2/[\hat{S}_x(\omega_j)\hat{S}_y(\omega_j)]. \tag{15.1-57} \]

The phase \(\hat{G}_{xy}(\omega_j)\) between \(x_t\) and \(y_t\) is

\[ \hat{G}_{xy}(\omega_j) = \arctan[\hat{Q}_{xy}(\omega_j), \hat{C}_{xy}(\omega_j)]. \tag{15.1-58} \]

In the next section generated data and the Box-Jenkins (1976) gas furnace data are used to illustrate these spectral magnitudes. Table 15.1 provides a summary of the formulas used in these calculations and the B34S names. Before moving to this discussion it is important to fully document the differences between SAS and B34S.

The B34S spectral command will exactly replicate the SAS procedure spectra with two exceptions. In contrast with SAS, B34S does not print the zero frequency data point (where PERIOD = ∞). There is a "bug" in SAS concerning how weighting is done for the end points of the quadrature spectrum. The SAS Institute has acknowledged this "bug" and provides a switch that will provide the correct results. Since the quadrature spectrum is an intermediate step toward calculating the amplitude, the coherency squared and the phase, these values differ also. To illustrate the problem, assume the Box-Jenkins (1976) gas furnace data is run with WEIGHTS (1 1 1). Here the weights are all .026525. The sum of the three weights is .079575, which is 1/(4π). For frequencies .003378, .006757, .01014, and .013514 the IP values are -33.07, -39.08, -35.49 and -12.04, respectively. The QS values should be -2.791, -2.855, -2.298 and -1.586. SAS produces -1.914, -2.855, -2.298 and -1.586. The number of values differing at the beginning and at the ending is the number of weights - 2, or 1 in this case. At the end points, the correct way to calculate the QS value is to fold. This is illustrated next.

QS(1) =(.026525)*(-33.07)+(.026525)*(-33.07)+(.026525)*(-39.08)
QS(2) =(.026525)*(-33.07)+(.026525)*(-39.08)+(.026525)*(-35.49)
QS(3) =(.026525)*(-39.08)+(.026525)*(-35.49)+(.026525)*(-12.04).

SAS uses the above formulas for QS(2) and QS(3) but for QS(1) adds in the IP value of 0.0 for frequency 0.0, giving

QS(1) =(.026525)*(0.0)+(.026525)*(-33.07)+(.026525)*(-39.08).4

4 The SAS command ALTW gives the "correct" answer but was not made the default to maintain compatibility with older versions of SAS and replicate the old answers. This appears to be a self-serving decision on the part of SAS.
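The two end-point treatments can be compared directly using the IP values quoted above. The Python sketch below only reproduces the arithmetic in the text; the function name and its argument are illustrative assumptions and do not correspond to any SAS or B34S routine.

```python
WEIGHT = 0.026525                        # each of the three normalized weights quoted in the text
ip = [-33.07, -39.08, -35.49, -12.04]    # IP values quoted in the text

def smooth_three(values, left_of_first):
    """Three-term weighted sum; left_of_first stands in for the value before values[0]."""
    padded = [left_of_first] + list(values)
    return [WEIGHT * (padded[i] + padded[i + 1] + padded[i + 2])
            for i in range(len(values) - 1)]

# Folding: the first IP value is reused at the end point
qs_folded = smooth_three(ip, left_of_first=ip[0])
# Default SAS treatment described above: the zero-frequency IP of 0.0 is added in
qs_zero = smooth_three(ip, left_of_first=0.0)

print(qs_folded)
print(qs_zero)
```

Only the first element differs between the two lists, which matches the statement that one value is affected at each end when three weights are used.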

Table 15.1 Values and Names Calculated by the B34S spectral Command
______

FREQ(k)    Frequency from 0 to π if NOBSPP is set = 0. Cycles per observation is NOBSPP*FREQ/2π, where NOBSPP is the number of observations per unit time. Using the default setting of NOBSPP = 1, FREQ is in the range 0 to .5. The number of frequencies calculated goes from 1 to K, where if T is even, K = (T/2) - 1

PERIOD(k)  Period or wavelength ( 1 / FREQ ).

COSj(k)    Cosine transform of Xjt. Defined over range 1 to K.
           COSj(k) = Σ(i=1 to T) (2/T)*Xji*cos(k*(i-1)*2*π/T).

SINj(k)    Sine transform of Xjt. Defined over range 1 to K.
           SINj(k) = Σ(i=1 to T) (2/T)*Xji*sin(k*(i-1)*2*π/T).

Pj(k)      Periodogram of Xjt. Defined over range 1 to K.
           Pj(k) = (T/2)*((COSj(k)**2) + (SINj(k)**2)).

Sj(k)      Spectral density estimate of Xjt. Defined over range 1 to K as
           Sj(k) = Σ(i=-v to v) w(i)*Pj(k+i),
           where v=(p-1)/2 if the WEIGHTS sentence (containing p elements) is present and Sj(k) = Pj(k) if it is not.

RPmn(k)    Real part of cross periodogram of Xmt and Xnt. Defined over range 1 to K as
           RPmn(k) = (T/2)*((COSm(k)*COSn(k))+(SINm(k)*SINn(k))).

IPmn(k)    Imaginary part of cross periodogram of Xmt and Xnt. Defined over range 1 to K as
           IPmn(k) = (T/2)*((COSn(k)*SINm(k))-(SINn(k)*COSm(k))).

CSmn(k)    Cospectral density estimate (real part of cross spectrum) or the weighted real part of the cross periodogram. Defined over range 1 to K as
           CSmn(k) = Σ(i=-v to v) w(i)*RPmn(k+i),
           where v=(p-1)/2 if the WEIGHTS sentence (containing p elements) is present and as CSmn(k) = RPmn(k) if it is not.

QSmn(k)    Quadrature-spectrum (imaginary part of cross spectrum) or the weighted imaginary part of the cross periodogram. Defined over range 1 to K as
           QSmn(k) = Σ(i=-v to v) w(i)*IPmn(k+i),
           where v=(p-1)/2 if the WEIGHTS sentence (containing p elements) is present and QSmn(k) = IPmn(k) if it is not.

Amn(k)     Amplitude (modulus of cross-spectrum). Defined over range 1 to K as
           Amn(k) = ((CSmn(k)**2) + (QSmn(k)**2))**.5.

Kmn(k)     Coherency squared. Defined over range 1 to K as
           Kmn(k) = (Amn(k)**2)/(Sn(k)*Sm(k)).
           If the WEIGHTS sentence is not present, Kmn(k) reduces to 1.0 for all frequencies.

PHmn(k)    Phase spectrum in radians. Defined over range 1 to K as
           PHmn(k) = arctan(QSmn(k),CSmn(k)).
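The cross-spectral formulas in Table 15.1 can be traced through in a short Python sketch. The two simulated series, the seed and the function names are assumptions made for illustration; the formulas for COS, SIN, P, RP, IP, A, K and PH follow the table with no WEIGHTS sentence, so CS = RP, QS = IP and S = P. In that unweighted case the sketch shows numerically why the coherency squared reduces to 1.0 at every frequency.

```python
import math
import random

random.seed(1)
T = 60
x = [random.gauss(0.0, 1.0) for _ in range(T)]             # plays the role of series m
y = [0.5 * x[t] + random.gauss(0.0, 1.0) for t in range(T)]  # plays the role of series n

def cos_sin(series, k):
    """COSj(k) and SINj(k) from Table 15.1; enumerate index i plays the role of (i - 1)."""
    c = sum((2.0 / T) * v * math.cos(k * i * 2.0 * math.pi / T) for i, v in enumerate(series))
    s = sum((2.0 / T) * v * math.sin(k * i * 2.0 * math.pi / T) for i, v in enumerate(series))
    return c, s

def cross_quantities(k):
    cm, sm = cos_sin(x, k)
    cn, sn = cos_sin(y, k)
    pm = (T / 2.0) * (cm**2 + sm**2)            # Pm(k)
    pn = (T / 2.0) * (cn**2 + sn**2)            # Pn(k)
    rp = (T / 2.0) * (cm * cn + sm * sn)        # RPmn(k)
    ip = (T / 2.0) * (cn * sm - sn * cm)        # IPmn(k)
    amplitude = math.sqrt(rp**2 + ip**2)        # Amn(k) with CS = RP and QS = IP
    coherency = amplitude**2 / (pn * pm)        # Kmn(k) with S = P
    phase = math.atan2(ip, rp)                  # PHmn(k), two-argument arctan
    return pm, pn, rp, ip, amplitude, coherency, phase

K = T // 2 - 1                                   # T is even here
coherencies = [cross_quantities(k)[5] for k in range(1, K + 1)]
print(min(coherencies), max(coherencies))
```

The result follows from the identity (ab + cd)² + (bc − ad)² = (a² + c²)(b² + d²), so a WEIGHTS sentence is needed before the coherency carries any information.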

15.2 Examples

Table 15.2 shows the statements to generate AR(1) models of the form of equation (15.1-1), where in Model 1, φ = .9, while in Model 2, φ = -.9. The programming setup illustrates the use of the Macro facility. If the B34SLET statement setting PLOT1 = no is changed to PLOT1 = yes, hard copy graphs will be produced.

Table 15.2 Program to Generate AR(1) models
______
%b34slet plot1 = no $
b34sexec options open('_JUNK.FSV') disp=unknown unit(44)$ b34srun$
b34sexec options clean(44)$ b34srun$
b34sexec options gfactor(.8)$ b34srun$
b34sexec data noob=300 maxlag=1
   heading('Sample AR(1) Data')$
   build noise x1 x2$
   gen noise= rn()$
   gen x1 = lp(1,1,noise) ar(.9) ma(1.0) values(0.0) $
   gen x2 = lp(1,1,noise) ar(-.9) ma(1.0) values(0.0) $
b34srun$
b34sexec spectral list(sin,cos,p,s) nobspp=0
   scafname=spec scaunit=44 output(all)$
   weights( 1 2 3 4 3 2 1)$
   var x1$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec)
   plottype=xyplot$
   plot=(freq sin_1) gposition(1) title("Sine Transform Model 1")$
   plot=(freq cos_1) gposition(2) title("Cos Transform Model 1")$
   plot=(freq p_1 ) gposition(3) title("Periodogram Model 1")$
   plot=(freq s_1 ) gposition(4) title("Spectrum Model 1")$
b34srun$
%b34sif(&plot1.eq.YES.or.&plot1.eq.yes)%then$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec)
   print gport('fig1.wmf') plottype=xyplot$
   plot=(freq sin_1) gposition(1) title("Sine Transform Model 1")$
   plot=(freq cos_1) gposition(2) title("Cos Transform Model 1")$
   plot=(freq p_1 ) gposition(3) title("Periodogram Model 1")$
   plot=(freq s_1 ) gposition(4) title("Spectrum Model 1")$
b34srun$
%b34sendif$
b34sexec options clean(44)$ B34SRUN$
b34sexec spectral list(sin,cos,p,s) nobspp=0
   scafname=spec scaunit=44 output(all)$
   weights( 1 2 3 4 3 2 1)$
   var x2$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec)
   plottype=xyplot$
   plot=(freq sin_1) gposition(1) title("Sine Transform Model 2")$
   plot=(freq cos_1) gposition(2) title("Cos Transform Model 2")$
   plot=(freq p_1 ) gposition(3) title("Periodogram Model 2")$
   plot=(freq s_1 ) gposition(4) title("Spectrum Model 2")$
b34srun$

%b34sif(&plot1.eq.YES.or.&plot1.eq.yes)%then$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec)
   print gport('fig2.wmf') plottype=xyplot$
   plot=(freq sin_1) gposition(1) title("Sine Transform Model 2")$
   plot=(freq cos_1) gposition(2) title("Cos Transform Model 2")$
   plot=(freq p_1 ) gposition(3) title("Periodogram Model 2")$
   plot=(freq s_1 ) gposition(4) title("Spectrum Model 2")$
b34srun$
%b34sendif$

Graphics produced by the code in Table 15.2 are shown in Figures 15.1 and 15.2. Note that, as predicted by theory, all the information in the series where φ = .9 is at low frequency, while when φ = -.9 the information is at high frequency. The rough appearance of the periodogram is smoothed by the WEIGHTS (1 2 3 4 3 2 1).

Figure 15.1 AR(1) Model where φ = .9

Figure 15.2 AR(1) Model where φ = -.9

The next example uses the Box-Jenkins (1976) gas furnace data. First the periodogram is estimated with OLS methods as suggested in equation (15.1-30). The code in Table 15.3 provides an example that uses the MACRO OLSSPEC, which is called as

%b34smcall %olsspec(y=gasout noob=296 ntest=20)$

to produce estimates of the periodogram using the reg command. Next, the spectral command is used to estimate, list and plot spectral and cross-spectral values. These are discussed below. The rather long but complete command file is provided to show how all figures and tables were produced. It is to be stressed that the OLS method of estimating the periodogram is not the preferred way to proceed for production work, but it has much to recommend it as a way of understanding that the periodogram simply represents the explained sum of squares at a frequency.

Table 15.3 Program to Analyze Gas Furnace Data using Spectral Methods
______
b34sexec options include('c:\b34slm\gas.b34')$ b34srun$

%b34smacro olsspec$
/$ Needs to be called as %b34smcall(y=yname noob=_ ntest=_ )
/$ y = yname
/$ noob = number of observations in series
/$ ntest must be set to lt noob/2
b34sexec spectral list(sin,cos,p,s) nobspp=0$
   var %b34seval(&Y)$
   weights(1 2 3 2 1)$
b34srun$
%b34sdo i=1,&ntest$
b34sexec data set $
   %b34sif(&i.eq.1)%then$
   build sin cos freq$
   %b34sendif$
   gen freq=(timespi(2.0)/%b34seval(&noob))*%b34seval(&I)$
   gen sin=sin((kount()-1.)*freq)$
   gen cos=cos((kount()-1.)*freq)$
b34srun$
b34sexec reg$
   model %b34seval(&y)=sin cos$
b34srun$
%b34senddo $
%b34smend$

%b34smcall olsspec(y= gasout noob=296 ntest=20)$
b34sexec options open('_junk.fsv') disp=unknown unit(44)$ b34srun$
b34sexec options clean(44)$ b34srun$
%b34slet plot1=yes$
b34sexec spectral scafname=spec scaunit=44 output=all list=all
   nobspp= 1 plotby( freq ) $
   weights( 1 2 3 4 3 2 1)$
   var gasin gasout $
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec)
   plottype=xyplot$
   plot=(freq sin_1) gposition(1) title("Sine transform Gasin")$
   plot=(freq cos_1) gposition(2) title("Cos transform Gasin")$
   plot=(freq p_1 ) gposition(3) title("Periodogram Gasin ")$
   plot=(freq s_1 ) gposition(4) title("Spectrum Gasin")$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec)
   plottype=xyplot$
   plot=(freq sin_2) gposition(1) title("Sine Transform Gasout")$
   plot=(freq cos_2) gposition(2) title("Cos Transform Gasout")$
   plot=(freq p_2 ) gposition(3) title("Periodogram Gasout")$
   plot=(freq s_2 ) gposition(4) title("Spectrum Gasout")$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec)
   plottype=xyplot$
   plot=(freq rp_1) gposition(1) title("Real Part Cross Period Gasin-Gasout")$
   plot=(freq ip_1) gposition(2) title("Imag Part Cross Period Gasin-Gasout")$
   plot=(freq cs_1) gposition(3) title("Cospectral Density Gasin-Gasout")$
   plot=(freq qs_1) gposition(4) title("Quadrature-Spectrum Gasin-Gasout")$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec)
   Plottype=xyplot$
   Plot=(freq a_1) gposition(1) Title("Amplitude Gasin-Gasout")$
   Plot=(freq k_1) gposition(2) Title("Coherency Gasin-Gasout")$
   Plot=(freq ph_1) gposition(3) Title("Phase Gasin-Gasout")$
b34srun$
/$ graphs to list
b34sexec options gfactor(.8)$ b34srun$
%b34sif(&plot1.eq.yes.or.&plot1.eq.yes)%then$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec)
   print gport('fig3.wmf') plottype=xyplot$
   plot=(freq sin_1) gposition(1) title("Sine Transform Gasin")$
   plot=(freq cos_1) gposition(2) title("Cos Transform Gasin")$
   plot=(freq p_1 ) gposition(3) title("Periodogram Gasin ")$
   plot=(freq s_1 ) gposition(4) title("Spectrum Gasin")$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec)
   print gport('fig4.wmf') plottype=xyplot$
   plot=(freq sin_2) gposition(1) title("Sine Transform Gasout")$
   plot=(freq cos_2) gposition(2) title("Cos Transform Gasout")$
   plot=(freq p_2 ) gposition(3) title("Periodogram Gasout")$
   plot=(freq s_2 ) gposition(4) title("Spectrum Gasout")$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec)
   print gport('fig5.wmf') plottype=xyplot$
   plot=(freq rp_1) gposition(1) title("Real Part Cross Period Gasin-Gasout")$
   plot=(freq ip_1) gposition(2) title("Imag Part Cross Period Gasin-Gasout")$
   plot=(freq cs_1) gposition(3) title("Cospectral Density Gasin-Gasout")$
   plot=(freq qs_1) gposition(4) title("Quadrature-Spectrum Gasin-Gasout")$
b34srun$
b34sexec hrgraphics gformat=(fourgraph) scafname(spec)
   print gport('fig6.wmf') plottype=xyplot$
   plot=(freq a_1) gposition(1) title("Amplitude Gasin-Gasout")$
   plot=(freq k_1) gposition(2) title("Coherency gasin-gasout")$
   plot=(freq ph_1) gposition(3) title("Phase gasin-gasout")$
b34srun$
%b34sendif$

Edited output from running the program in Table 15.3 follows. The periodogram and spectrum for GASOUT for the first ten frequencies are listed below. The regression output that duplicates these values is shown next. Only the first four reg command outputs are shown. Note that the model sums of squares of 442.44887, 406.68593, 529.75342 and 129.39438 are the same as found with the spectral command under P_1. SIN and COS values from the two sources are listed next and found to be the same. Periodogram values are generated from the SIN and COS values using (15.1-31) with scaling (T/2) in place of (T/8π), giving identical results from a time perspective using OLS or from a frequency perspective using Fourier analysis. As noted above, while the OLS approach may be easier to interpret in that it shows how the periodogram relates to the regression diagnostics, it is substantially slower. Note that each periodogram value can be calculated in this way. It is left as an exercise for the reader to validate that the spectral values have been calculated from the periodogram using the scaled (triangular) weights 1, 2, 3, 2, and 1 from equation (15.1-38). For the output listed below, series 1 is GASOUT. Note that for observation 1, period = 296. Frequency in cycles per observation is 1/period = .0033784. The listed frequency, which is in radians, can be obtained as .0033784(2π) = .021227.

 Obs   PERIOD   FREQ          SIN_1         COS_1      P_1     S_1
   1   296.0    0.2123E-01   -1.714        -0.2243     442.4   35.35
   2   148.0    0.4245E-01   -1.482        -0.7423     406.7   33.04
   3   98.67    0.6368E-01    0.6334        1.783      529.8   28.14
   4   74.00    0.8491E-01    0.6519        0.6703     129.4   17.95
   5   59.20    0.1061        0.5667        0.4589     78.69   11.64
   6   49.33    0.1274       -0.1568       -0.3118     18.02   7.946
   7   42.29    0.1486       -1.063         0.7739     255.9   10.93
   8   37.00    0.1698        0.4674       -0.3030     45.92   11.22
   9   32.89    0.1910       -0.4580       -1.249      262.0   11.78
  10   29.60    0.2123        0.5578E-01    0.7187     76.91   7.876

REG Command. Version 1 February 1997

Real*8 space available        10000000
Real*8 space used                  755

OLS Estimation
Dependent variable            GASOUT
Adjusted R**2                 0.1404460150470348
Standard Error of Estimate    2.968754524497961
Sum of Squared Residuals      2582.356504031045
Model Sum of Squares          442.4488675905768
Total Sum of Squares          3024.805371621621
F( 2, 293)                    25.10062379103647
F Significance                0.9999999999132527
1/Condition of XPX            0.5555555555555556
Number of Observations        296
Durbin-Watson                 6.376831838373521E-02

Variable           Coefficient    Std. Error     t
SIN      { 0}      -1.7144070     0.24403012     -7.0253911
COS      { 0}      -0.22433884    0.24403012     -0.91930800
CONSTANT { 0}      53.509122      0.17255535     310.09830


REG Command. Version 1 February 1997

Real*8 space available        10000000
Real*8 space used                  755

OLS Estimation
Dependent variable            GASOUT
Adjusted R**2                 0.1285420898400539
Standard Error of Estimate    2.989240914807500
Sum of Squared Residuals      2618.119445300439
Model Sum of Squares          406.6859263211827
Total Sum of Squares          3024.805371621621
F( 2, 293)                    22.75659665299046
F Significance                0.9999999993494489
1/Condition of XPX            0.5555555555555556
Number of Observations        296
Durbin-Watson                 6.275877065018662E-02

Variable           Coefficient    Std. Error     t
SIN      { 0}      -1.4821769     0.24571409     -6.0321201
COS      { 0}      -0.74231363    0.24571409     -3.0210463
CONSTANT { 0}      53.509122      0.17374610     307.97308

REG Command. Version 1 February 1997

Real*8 space available 10000000 Real*8 space used 755

OLS Estimation
Dependent variable                GASOUT
Adjusted R**2                     0.1695058964630655
Standard Error of Estimate        2.918139078177818
Sum of Squared Residuals          2495.051954119426
Model Sum of Squares              529.7534175021956
Total Sum of Squares              3024.805371621621
F( 2, 293)                        31.10511407826055
F Significance                    0.9999999999994319
1/Condition of XPX                0.5555555555555556
Number of Observations            296
Durbin-Watson                     6.501243564636745E-02

Variable          Coefficient     Std. Error      t
SIN      { 0}     0.63341038      0.23986955      2.6406452
COS      { 0}     1.7827524       0.23986955      7.4321747
CONSTANT { 0}     53.509122       0.16961339      315.47699

REG Command. Version 1 February 1997

Real*8 space available 10000000 Real*8 space used 755

OLS Estimation
Dependent variable                GASOUT
Adjusted R**2                     3.624381523407638E-02
Standard Error of Estimate        3.143556705612574
Sum of Squared Residuals          2895.410987090722
Model Sum of Squares              129.3943845308991
Total Sum of Squares              3024.805371621621
F( 2, 293)                        6.547007460527660
F Significance                    0.9983466201852209
1/Condition of XPX                0.5555555555555556
Number of Observations            296
Durbin-Watson                     5.641171059948612E-02

Variable          Coefficient     Std. Error      t
SIN      { 0}     0.65193021      0.25839877      2.5229618
COS      { 0}     0.67027858      0.25839877      2.5939697
CONSTANT { 0}     53.509122       0.18271552      292.85482
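The link between the regression output and the periodogram can be verified directly. The following Python sketch (illustrative only, not B34S code) takes the SIN and COS coefficients printed above, applies the (T/2) scaling described in the text, and compares the result with the model sum of squares. It also recomputes the F statistic from the two sums of squares. The function names are hypothetical.

```python
T = 296  # number of observations in the gas furnace data

# (SIN coefficient, COS coefficient, model sum of squares, residual sum of squares)
# for the first three frequencies, copied from the REG output above.
fits = [
    (-1.7144070, -0.22433884, 442.4488675905768, 2582.356504031045),
    (-1.4821769, -0.74231363, 406.6859263211827, 2618.119445300439),
    (0.63341038, 1.7827524, 529.7534175021956, 2495.051954119426),
]


def periodogram(a, b, n_obs):
    """Periodogram ordinate from the sine and cosine coefficients, scaled by T/2."""
    return (n_obs / 2.0) * (a * a + b * b)


def f_stat(model_ss, resid_ss, n_obs, k=2):
    """F statistic for k regressors plus a constant."""
    return (model_ss / k) / (resid_ss / (n_obs - k - 1))


results = [(periodogram(a, b, T), f_stat(mss, rss, T)) for a, b, mss, rss in fits]
for (a, b, mss, rss), (p, f) in zip(fits, results):
    print(round(p, 4), round(mss, 4), round(f, 5))
```

Because the sine and cosine regressors are orthogonal to each other and to the constant, the model sum of squares of each regression is the periodogram ordinate at that frequency, up to the rounding of the printed coefficients.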

______

The second set of problems in the code in Table 15.3 involves estimating a cross spectral model using the Box-Jenkins (1976) gas furnace data. Here weights of [1, 2, 3, 4, 3, 2, 1] are used. Actual values are listed for the first 40 frequencies and plots of all values are given. Inspection of Figures 15.3 and 15.4, and especially the two parts of Figure 15.5, together with the listings indicates that GASIN maps to GASOUT at low frequencies. The weighting smooths the periodogram values.

For output listed below, series 1 is GASIN and series 2 is GASOUT. Note that here FREQ = 1/PERIOD.


Obs      PERIOD        FREQ       SIN_1       SIN_2       COS_1       COS_2         P_1         P_2         S_1         S_2
  1       296.0  0.3378E-02      0.4798      -1.714      0.1931     -0.2243       39.60       442.4       2.976       33.99
  2       148.0  0.6757E-02      0.3560      -1.482      0.3565     -0.7423       37.56       406.7       2.675       30.88
  3       98.67  0.1014E-01 -0.4462E-01      0.6334     -0.5042       1.783       37.92       529.8       2.235       26.01
  4       74.00  0.1351E-01     -0.1421      0.6519     -0.2708      0.6703       13.84       129.4       1.714       19.35
  5       59.20  0.1689E-01 -0.9377E-01      0.5667     -0.2222      0.4589       8.612       78.69       1.299       13.83
  6       49.33  0.2027E-01 -0.6245E-01     -0.1568      0.2150     -0.3118       7.418       18.02       1.190       11.03
  7       42.29  0.2365E-01      0.3529      -1.063      0.1057      0.7739       20.08       255.9       1.227       10.46
  8       37.00  0.2703E-01     -0.2504      0.4674 -0.2797E-01     -0.3030       9.395       45.92       1.278       10.20
  9       32.89  0.3041E-01     -0.3567     -0.4580      0.3711      -1.249       39.21       262.0       1.308       10.13
 10       29.60  0.3378E-01      0.1291  0.5578E-01     -0.1646      0.7187       6.479       76.91       1.049       8.601
 11       26.91  0.3716E-01     -0.1486     -0.3376      0.1131     -0.4302       5.165       44.26      0.8214       7.051
 12       24.67  0.4054E-01  0.3193E-02 -0.2289E-01 -0.6891E-01  0.9868E-01      0.7043       1.519      0.6559       6.359
 13       22.77  0.4392E-01      0.1735     -0.9361      0.2199      0.3890       11.61       152.1      0.5668       6.014
 14       21.14  0.4730E-01      0.1826     -0.8758      0.2172      0.3462       11.92       131.3      0.5671       5.550
 15       19.73  0.5068E-01  0.6261E-01  0.8874E-01     -0.1209      0.1638       2.745       5.137      0.5458       4.477
 16       18.50  0.5405E-01      0.1643      0.2742     -0.1507      0.3253       7.355       26.79      0.5440       3.741
 17       17.41  0.5743E-01  0.7021E-01     -0.2818  0.9312E-01      0.2096       2.013       18.26      0.4795       2.919
 18       16.44  0.6081E-01     -0.2711  0.4794E-01     -0.1195     -0.6516       12.99       63.18      0.5033       2.845
 19       15.58  0.6419E-01  0.9543E-01     -0.2331      0.1405      0.4178       4.271       33.87      0.4622       2.880
 20       14.80  0.6757E-01  0.1108E-01     -0.4056  0.9569E-01      0.2647       1.373       34.72      0.3981       2.643
 21       14.10  0.7095E-01 -0.5102E-01      0.2422     -0.2574     -0.4199       10.19       34.78      0.4056       2.405
 22       13.45  0.7432E-01 -0.6070E-02     -0.3126  0.8152E-01      0.1288      0.9889       16.91      0.3545       2.121
 23       12.87  0.7770E-01  0.8882E-01 -0.8514E-01  0.1528E-01      0.2173       1.202       8.063      0.3834       2.207
 24       12.33  0.8108E-01 -0.6353E-01      0.1641     -0.2489     -0.5095       9.764       42.41      0.4592       2.436
 25       11.84  0.8446E-01 -0.6266E-01  0.1842E-01     -0.2016     -0.5336       6.595       42.18      0.4519       2.445
 26       11.38  0.8784E-01      0.1244      0.4029     -0.1371     -0.3746       5.072       44.78      0.4363       2.232
 27       10.96  0.9122E-01     -0.1261     -0.2429      0.1878      0.1310       7.577       11.27      0.3766       1.659
 28       10.57  0.9459E-01  0.7276E-01      0.1863 -0.6586E-01 -0.5671E-01       1.425       5.615      0.2564       1.042
 29       10.21  0.9797E-01 -0.7566E-01     -0.1554  0.5779E-01      0.1007       1.342       5.072      0.1770      0.6215
 30       9.867      0.1014  0.3848E-01  0.4538E-01 -0.2020E-01  0.6868E-02      0.2795      0.3118      0.1213      0.4066
 31       9.548      0.1047      0.1100      0.1857  0.2119E-02  0.6038E-01       1.792       5.645  0.9122E-01      0.4173
 32       9.250      0.1081 -0.8815E-01     -0.2289 -0.3095E-01 -0.5963E-01       1.292       8.278  0.8530E-01      0.4302
 33       8.970      0.1115 -0.6512E-01     -0.2054 -0.3968E-01     -0.1437      0.8606       9.300  0.8654E-01      0.4339
 34       8.706      0.1149 -0.8999E-02 -0.6225E-01  0.6456E-01  0.6460E-01      0.6288       1.191  0.8751E-01      0.3735
 35       8.457      0.1182  0.4304E-01 -0.1184E-01  0.6325E-01  0.5566E-01      0.8663      0.4793  0.8479E-01      0.3037
 36       8.222      0.1216      0.1303      0.2045 -0.2427E-01     -0.1395       2.599       9.069  0.8274E-01      0.2585
 37       8.000      0.1250 -0.1981E-02 -0.8518E-02 -0.4682E-01 -0.6563E-01      0.3251      0.6482  0.6805E-01      0.1779
 38       7.789      0.1284 -0.1171E-02 -0.1834E-01  0.3996E-01  0.3119E-01      0.2366      0.1938  0.5485E-01      0.1505
 39       7.590      0.1318  0.4124E-03  0.4063E-01  0.2228E-01 -0.2612E-02  0.7352E-01      0.2453  0.4527E-01      0.1356
 40       7.400      0.1351 -0.5830E-01  0.2978E-01  0.7638E-01      0.1332       1.367       2.758  0.4053E-01      0.1322

Obs      PERIOD        FREQ        RP_1        IP_1        CS_1        QS_1         A_1         K_1        PH_1
  1       296.0  0.3378E-02      -128.2      -33.07      -9.691      -2.641       10.04      0.9974      -2.876
  2       148.0  0.6757E-02      -117.2      -39.08      -8.722      -2.475       9.067      0.9952      -2.865
  3       98.67  0.1014E-01      -137.2      -35.49      -7.268      -2.123       7.571      0.9862      -2.857
  4       74.00  0.1351E-01      -40.57      -12.04      -5.301      -1.867       5.620      0.9525      -2.803
  5       59.20  0.1689E-01      -22.96      -12.27      -3.648      -1.722       4.034      0.9058      -2.701
  6       49.33  0.2027E-01      -8.471      -7.870      -2.626      -2.071       3.344      0.8515      -2.474
  7       42.29  0.2365E-01      -43.41      -57.05      -2.184      -2.612       3.405      0.9030      -2.267
  8       37.00  0.2703E-01      -16.07      -13.16      -1.991      -2.837       3.466      0.9216      -2.183
  9       32.89  0.3041E-01      -44.43      -91.10      -1.846      -2.990       3.514      0.9317      -2.124
 10       29.60  0.3378E-01      -16.45      -15.09      -1.429      -2.499       2.879      0.9186      -2.090
 11       26.91  0.3716E-01      0.2236      -15.12     -0.9536      -2.084       2.292      0.9066      -2.000
 12       24.67  0.4054E-01      -1.017      0.1868     -0.7065      -1.817       1.950      0.9113      -1.942
 13       22.77  0.4392E-01      -11.38      -40.45     -0.5322      -1.688       1.770      0.9188      -1.876
 14       21.14  0.4730E-01      -12.53      -37.51     -0.4658      -1.639       1.704      0.9226      -1.848
 15       19.73  0.5068E-01      -2.110      -3.106     -0.3087      -1.427       1.460      0.8722      -1.784
 16       18.50  0.5405E-01     -0.5905      -14.02     -0.1028      -1.312       1.316      0.8510      -1.649
 17       17.41  0.5743E-01 -0.3994E-01      -6.062      0.1193      -1.088       1.094      0.8554      -1.462
 18       16.44  0.6081E-01       9.601      -26.99      0.3557      -1.066       1.124      0.8817      -1.249
 19       15.58  0.6419E-01       5.397      -10.75      0.4434     -0.9803       1.076      0.8698      -1.146
 20       14.80  0.6757E-01       3.084      -6.178      0.4637     -0.8184      0.9407      0.8410      -1.055
 21       14.10  0.7095E-01       14.17      -12.40      0.5361     -0.7187      0.8967      0.8244     -0.9299
 22       13.45  0.7432E-01       1.835      -3.655      0.5457     -0.5532      0.7771      0.8031     -0.7922
 23       12.87  0.7770E-01     -0.6276      -3.049      0.6595     -0.4920      0.8228      0.8003     -0.6409
 24       12.33  0.8108E-01       17.22      -10.84      0.8470     -0.4753      0.9712      0.8432     -0.5114
 25       11.84  0.8446E-01       15.75      -5.497      0.8913     -0.3875      0.9719      0.8550     -0.4101
 26       11.38  0.8784E-01       15.02      -1.282      0.8623     -0.3078      0.9156      0.8610     -0.3428
 27       10.96  0.9122E-01       8.177      -4.308      0.6943     -0.2343      0.7327      0.8595     -0.3254
 28       10.57  0.9459E-01       2.559      -1.206      0.4569     -0.1377      0.4772      0.8525     -0.2927
 29       10.21  0.9797E-01       2.601     -0.2014      0.2960 -0.8169E-01      0.3071      0.8571     -0.2693
 30       9.867      0.1014      0.2379     -0.1748      0.2015 -0.5190E-01      0.2081      0.8779     -0.2521
 31       9.548      0.1047       3.043     -0.9250      0.1829 -0.2928E-01      0.1852      0.9012     -0.1588
 32       9.250      0.1081       3.259      0.2703      0.1769 -0.2120E-01      0.1781      0.8645     -0.1193
 33       8.970      0.1115       2.823     -0.1786      0.1732 -0.1209E-01      0.1737      0.8031 -0.6965E-01
 34       8.706      0.1149      0.7001     -0.5087      0.1567 -0.1997E-02      0.1568      0.7518 -0.1274E-01
 35       8.457      0.1182      0.4456     -0.4654      0.1354  0.1177E-01      0.1359      0.7172  0.8671E-01
 36       8.222      0.1216       4.444       1.955      0.1247  0.2623E-01      0.1275      0.7596      0.2073
 37       8.000      0.1250      0.4573  0.3978E-01  0.9227E-01  0.2999E-01  0.9702E-01      0.7774      0.3142
 38       7.789      0.1284      0.1877     -0.1030  0.7148E-01  0.3352E-01  0.7895E-01      0.7553      0.4384
 39       7.590      0.1318 -0.6136E-02      0.1341  0.5326E-01  0.4042E-01  0.6686E-01      0.7280      0.6491
 40       7.400      0.1351       1.249       1.486  0.3813E-01  0.4738E-01  0.6082E-01      0.6905      0.8931

Figure 15.3 Spectral analysis of GASIN

Figure 15.4 Spectral analysis of GASOUT

Figure 15.5 Cross spectral analysis of GASIN-GASOUT part 1

Figure 15.5 Cross spectral analysis of GASIN-GASOUT part 2

15.3 Matrix Command Implementation

The B34S matrix command, discussed in Chapter 16 but used in many chapters, provides a more compact and flexible way to do spectral analysis. For an example, consult the code in Table 15.4, which when run

Table 15.4 Matrix Command Implementation of Spectral Analysis
______
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
/;
/; No weighting here px=sx
/;
call spectral(gasin,sinx,cosx,px,sx,freq);
freq2=freq/(2.0*pi());
period=vfam(1.0/afam(freq2));
call tabulate(freq freq2 period sinx cosx px sx);
/;
/; With weights 1 2 3 2 1 px ne sx
/;
call spectral(gasin,sinx,cosx,px,sx,freq:1 2 3 2 1);
call tabulate(freq freq2 period sinx cosx px sx);
call graph(freq2,sx:heading 'Spectrum of Gasin'
           :plottype xyplot
           :file 'sp_gasin.wmf');
b34srun;

produces the periodogram, spectrum and other intermediate values, as well as a graph that is not shown to save space. The first 20 lines of the tabulation illustrate what is being calculated. The first tabulation shows that if there is no weighting px=sx. Note that dividing FREQ by 2π gives the FREQ value listed by the B34S spectral command in the cross spectral output.

=> CALL SPECTRAL(GASIN,SINX,COSX,PX,SX,FREQ:1 2 3 2 1)$

=> CALL TABULATE(FREQ FREQ2 PERIOD SINX COSX PX SX)$

Obs        FREQ       FREQ2      PERIOD        SINX        COSX          PX          SX
  1  0.2123E-01  0.3378E-02       296.0      0.4798      0.1931       39.60       3.100
  2  0.4245E-01  0.6757E-02       148.0      0.3560      0.3565       37.56       2.840
  3  0.6368E-01  0.1014E-01       98.67 -0.4462E-01     -0.5042       37.92       2.341
  4  0.8491E-01  0.1351E-01       74.00     -0.1421     -0.2708       13.84       1.588
  5      0.1061  0.1689E-01       59.20 -0.9377E-01     -0.2222       8.612       1.117
  6      0.1274  0.2027E-01       49.33 -0.6245E-01      0.2150       7.418      0.9096
  7      0.1486  0.2365E-01       42.29      0.3529      0.1057       20.08       1.253
  8      0.1698  0.2703E-01       37.00     -0.2504 -0.2797E-01       9.395       1.421
  9      0.1910  0.3041E-01       32.89     -0.3567      0.3711       39.21       1.544
 10      0.2123  0.3378E-01       29.60      0.1291     -0.1646       6.479       1.046
 11      0.2335  0.3716E-01       26.91     -0.1486      0.1131       5.165      0.7134
 12      0.2547  0.4054E-01       24.67  0.3193E-02 -0.6891E-01      0.7043      0.4780
 13      0.2760  0.4392E-01       22.77      0.1735      0.2199       11.61      0.6011
 14      0.2972  0.4730E-01       21.14      0.1826      0.2172       11.92      0.6412
 15      0.3184  0.5068E-01       19.73  0.6261E-01     -0.1209       2.745      0.5340
 16      0.3396  0.5405E-01       18.50      0.1643     -0.1507       7.355      0.4994
 17      0.3609  0.5743E-01       17.41  0.7021E-01  0.9312E-01       2.013      0.4752
 18      0.3821  0.6081E-01       16.44     -0.2711     -0.1195       12.99      0.5329
 19      0.4033  0.6419E-01       15.58  0.9543E-01      0.1405       4.271      0.4752
 20      0.4245  0.6757E-01       14.80  0.1108E-01  0.9569E-01       1.373      0.4157
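The three frequency columns in this tabulation are simple functions of the observation index. As a minimal Python sketch (illustrative only; the function name is hypothetical), the k-th Fourier frequency in radians is 2πk/T, FREQ2 is that value divided by 2π, and PERIOD is the reciprocal of FREQ2:

```python
import math

T = 296  # number of observations


def fourier_frequency(k, n_obs):
    """Return (FREQ in radians, FREQ2 in cycles per observation, PERIOD)."""
    freq = 2.0 * math.pi * k / n_obs
    freq2 = freq / (2.0 * math.pi)
    period = 1.0 / freq2
    return freq, freq2, period


rows = [fourier_frequency(k, T) for k in (1, 2, 3)]
for k, (freq, freq2, period) in zip((1, 2, 3), rows):
    print(k, round(freq, 6), round(freq2, 7), round(period, 2))
```

The values for k = 1, 2, 3 agree with the first three rows of the FREQ, FREQ2 and PERIOD columns to the digits printed.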

Cross spectral analysis can easily be done with the matrix command as is shown with the code listed in Table 15.5

Table 15.5 Cross Spectral Analysis with the Matrix Command
______
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
* For sample output See Stokes (1997) page 424;
call cspectral(gasin,gasout,sinx,siny,cosx,cosy,px,py,sx,sy,
               rp,ip,cs,qs,a,k,ph,freq:1 2 3 4 3 2 1);
freq2=freq/(2.0*pi());
period=vfam(1.0/afam(freq2));
call tabulate(freq2,period,sinx,siny,cosx,cosy,px,py,sx,sy);
call tabulate(freq2,period,rp,ip,cs,qs,a,k,ph);
call graph(freq2,a :heading 'Amplitude':plottype xyplot);
call graph(freq2,k :heading 'Coherence':plottype xyplot);
call graph(freq2,ph:heading 'Phase':plottype xyplot);
b34srun;

which runs the same problem as was illustrated earlier. Edited output showing the first 20 lines of each tabulation replicates the calculations shown in the prior section.

=> CALL LOADDATA$

=> * FOR SAMPLE OUTPUT SEE STOKES (1997) PAGE 424$

=> CALL CSPECTRAL(GASIN,GASOUT,SINX,SINY,COSX,COSY,PX,PY,SX,SY,
=>                RP,IP,CS,QS,A,K,PH,FREQ:1 2 3 4 3 2 1)$

=> FREQ2=FREQ/(2.0*PI())$

=> PERIOD=VFAM(1.0/AFAM(FREQ2))$

=> CALL TABULATE(FREQ2,PERIOD,SINX,SINY,COSX,COSY,PX,PY,SX,SY)$

Obs       FREQ2      PERIOD        SINX        SINY        COSX        COSY          PX          PY          SX          SY
  1  0.3378E-02       296.0      0.4798      -1.714      0.1931     -0.2243       39.60       442.4       2.976       33.99
  2  0.6757E-02       148.0      0.3560      -1.482      0.3565     -0.7423       37.56       406.7       2.675       30.88
  3  0.1014E-01       98.67 -0.4462E-01      0.6334     -0.5042       1.783       37.92       529.8       2.235       26.01
  4  0.1351E-01       74.00     -0.1421      0.6519     -0.2708      0.6703       13.84       129.4       1.714       19.35
  5  0.1689E-01       59.20 -0.9377E-01      0.5667     -0.2222      0.4589       8.612       78.69       1.299       13.83
  6  0.2027E-01       49.33 -0.6245E-01     -0.1568      0.2150     -0.3118       7.418       18.02       1.190       11.03
  7  0.2365E-01       42.29      0.3529      -1.063      0.1057      0.7739       20.08       255.9       1.227       10.46
  8  0.2703E-01       37.00     -0.2504      0.4674 -0.2797E-01     -0.3030       9.395       45.92       1.278       10.20
  9  0.3041E-01       32.89     -0.3567     -0.4580      0.3711      -1.249       39.21       262.0       1.308       10.13
 10  0.3378E-01       29.60      0.1291  0.5578E-01     -0.1646      0.7187       6.479       76.91       1.049       8.601
 11  0.3716E-01       26.91     -0.1486     -0.3376      0.1131     -0.4302       5.165       44.26      0.8214       7.051
 12  0.4054E-01       24.67  0.3193E-02 -0.2289E-01 -0.6891E-01  0.9868E-01      0.7043       1.519      0.6559       6.359
 13  0.4392E-01       22.77      0.1735     -0.9361      0.2199      0.3890       11.61       152.1      0.5668       6.014
 14  0.4730E-01       21.14      0.1826     -0.8758      0.2172      0.3462       11.92       131.3      0.5671       5.550
 15  0.5068E-01       19.73  0.6261E-01  0.8874E-01     -0.1209      0.1638       2.745       5.137      0.5458       4.477
 16  0.5405E-01       18.50      0.1643      0.2742     -0.1507      0.3253       7.355       26.79      0.5440       3.741
 17  0.5743E-01       17.41  0.7021E-01     -0.2818  0.9312E-01      0.2096       2.013       18.26      0.4795       2.919
 18  0.6081E-01       16.44     -0.2711  0.4794E-01     -0.1195     -0.6516       12.99       63.18      0.5033       2.845
 19  0.6419E-01       15.58  0.9543E-01     -0.2331      0.1405      0.4178       4.271       33.87      0.4622       2.880
 20  0.6757E-01       14.80  0.1108E-01     -0.4056  0.9569E-01      0.2647       1.373       34.72      0.3981       2.643

=> CALL TABULATE(FREQ2,PERIOD,RP,IP,CS,QS,A,K,PH)$

Obs       FREQ2      PERIOD          RP          IP          CS          QS           A           K          PH
  1  0.3378E-02       296.0      -128.2      -33.07      -9.691      -2.641       10.04      0.9974      -2.876
  2  0.6757E-02       148.0      -117.2      -39.08      -8.722      -2.475       9.067      0.9952      -2.865
  3  0.1014E-01       98.67      -137.2      -35.49      -7.268      -2.123       7.571      0.9862      -2.857
  4  0.1351E-01       74.00      -40.57      -12.04      -5.301      -1.867       5.620      0.9525      -2.803
  5  0.1689E-01       59.20      -22.96      -12.27      -3.648      -1.722       4.034      0.9058      -2.701
  6  0.2027E-01       49.33      -8.471      -7.870      -2.626      -2.071       3.344      0.8515      -2.474
  7  0.2365E-01       42.29      -43.41      -57.05      -2.184      -2.612       3.405      0.9030      -2.267
  8  0.2703E-01       37.00      -16.07      -13.16      -1.991      -2.837       3.466      0.9216      -2.183
  9  0.3041E-01       32.89      -44.43      -91.10      -1.846      -2.990       3.514      0.9317      -2.124
 10  0.3378E-01       29.60      -16.45      -15.09      -1.429      -2.499       2.879      0.9186      -2.090
 11  0.3716E-01       26.91      0.2236      -15.12     -0.9536      -2.084       2.292      0.9066      -2.000
 12  0.4054E-01       24.67      -1.017      0.1868     -0.7065      -1.817       1.950      0.9113      -1.942
 13  0.4392E-01       22.77      -11.38      -40.45     -0.5322      -1.688       1.770      0.9188      -1.876
 14  0.4730E-01       21.14      -12.53      -37.51     -0.4658      -1.639       1.704      0.9226      -1.848
 15  0.5068E-01       19.73      -2.110      -3.106     -0.3087      -1.427       1.460      0.8722      -1.784
 16  0.5405E-01       18.50     -0.5905      -14.02     -0.1028      -1.312       1.316      0.8510      -1.649
 17  0.5743E-01       17.41 -0.3994E-01      -6.062      0.1193      -1.088       1.094      0.8554      -1.462
 18  0.6081E-01       16.44       9.601      -26.99      0.3557      -1.066       1.124      0.8817      -1.249
 19  0.6419E-01       15.58       5.397      -10.75      0.4434     -0.9803       1.076      0.8698      -1.146
 20  0.6757E-01       14.80       3.084      -6.178      0.4637     -0.8184      0.9407      0.8410      -1.055
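The last three columns can be recovered from the smoothed spectra and the co- and quadrature spectra. The Python sketch below (illustrative only) assumes the conventional definitions: amplitude A = sqrt(CS² + QS²), coherence K = (CS² + QS²)/(SX·SY) and phase PH = atan2(QS, CS). These definitions are an assumption here; they are used because they reproduce the listed values, not because they are quoted from the B34S documentation.

```python
import math

# (SX, SY, CS, QS, A, K, PH) for observations 1 and 20, copied from the
# two tabulations above.
listed = {
    1: (2.976, 33.99, -9.691, -2.641, 10.04, 0.9974, -2.876),
    20: (0.3981, 2.643, 0.4637, -0.8184, 0.9407, 0.8410, -1.055),
}


def cross_stats(sx, sy, cs, qs):
    """Amplitude, coherence and phase under the conventional definitions."""
    amplitude = math.hypot(cs, qs)
    coherence = (cs * cs + qs * qs) / (sx * sy)
    phase = math.atan2(qs, cs)
    return amplitude, coherence, phase


computed = {obs: cross_stats(*vals[:4]) for obs, vals in listed.items()}
for obs in sorted(listed):
    print(obs, [round(v, 4) for v in computed[obs]], listed[obs][4:])
```

The high coherence at observation 1 (about .997) is the numerical counterpart of the statement that GASIN maps to GASOUT at low frequencies.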

The graphs, which are not shown, can be easily placed in Word or other software systems.

The OLS approach to obtaining the periodogram works because the sine and cosine vectors at the Fourier frequencies are orthogonal. This can be demonstrated with the code in Table 15.6

Table 15.6 Verification that the sin and cosine vectors are orthogonal
______
b34sexec options ginclude('gas.b34')$ b34srun$
b34sexec matrix;
call loaddata;
call echooff;
count=dfloat(integers(norows(gasout)))-1.;
ncase=(norows(gasout)/2)-1;
per=array(ncase:);
test=array(ncase,2*ncase:);
s1 =array(norows(gasout):);
c1 =array(norows(gasout):);
base=(2.0*pi())/dfloat(norows(gasout));
do i=1,ncase;
s1=sin(count*base*dfloat(i));
c1=cos(count*base*dfloat(i));
is_1=(i-1)*2+1;
ic_1= is_1+1;
test(,is_1)=s1;
test(,ic_1)=c1;
if(i.le.20)then;
call olsq(gasout s1 c1 :print :qr );
call print(' ':);
modelss=%tss-%rss;
call print('%tss-%rss',modelss:);
endif;
if(i.gt.20)call olsq(gasout s1 c1 :qr );
per(i)=%tss-%rss;
enddo;
call print(per);
call graph(per);
call print('Illustrate Orthagonality of sine and cosine vectors':);
call print(ccf(test));
b34srun;

which will produce a 294 by 294 matrix of correlations of the 147 pairs of sine and cosine vectors, where all cross correlations are close to machine zero except along the diagonal, where they are 1.0. Due to space limits this matrix, which contains 86,436 correlations, is not shown here.
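The same property can be illustrated outside B34S. The Python sketch below (an illustration under the assumption T = 296, as in the gas furnace data) builds sine and cosine vectors at a handful of Fourier frequencies and checks that every cross product is at machine zero while each vector has sum of squares T/2.

```python
import math

T = 296


def sin_vec(k):
    return [math.sin(2.0 * math.pi * k * t / T) for t in range(T)]


def cos_vec(k):
    return [math.cos(2.0 * math.pi * k * t / T) for t in range(T)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# A few of the 147 frequencies, including the first and the last.
vectors = []
for k in (1, 2, 3, 147):
    vectors.append(sin_vec(k))
    vectors.append(cos_vec(k))

max_cross = max(abs(dot(vectors[i], vectors[j]))
                for i in range(len(vectors))
                for j in range(i + 1, len(vectors)))
squares = [dot(v, v) for v in vectors]
column_sums = [abs(sum(v)) for v in vectors]
print(max_cross, min(squares), max(squares), max(column_sums))
```

Since each vector also sums to zero, it is orthogonal to the constant as well, which is why each two-regressor OLS model isolates a single periodogram ordinate.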

It is possible to recover the original values of the series from the sine and cosine values using (15.1-30), with only a small error. Table 15.7 shows both a B34S and a SAS setup.

Table 15.7 Inverse Spectral Examples
______
%b34slet dosas=1;
b34sexec options ginclude('gas.b34'); b34srun;
/;
/; Testing FFT
/;
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
call echooff;

subroutine spectest(x,testx,error);
call spectral(x,sinx,cosx,px,sx,freq1);
count=dfloat(integers(norows(x)))-1. ;

/; Test 100% recovery of actual value
adj_freq=freq1/(2.*pi());
period=1.0/afam(adj_freq);
do i=1,norows(x);
sum1=sinx*sin(count(i)*freq1);
sum2=cosx*cos(count(i)*freq1);
test(i)=sum1+sum2;
enddo;
call print('mean(x)',mean(x):);
adj=mean(x);
testx=afam(test)+adj;
error_1=x-testx;
call tabulate(x,testx,error_1);
return;
end;

call spectest(gasin, yhat,error);
call names(all);
call olsq(yhat gasin :print);
call spectest(gasout,yhat,error);
call olsq(yhat gasout :print);

/; Now look at fft
cfft=fft(gasin);
test=fft(cfft:back)/dfloat(norows(gasin));
error=gasin-test;
call tabulate(gasin,test,error);
b34srun;

%b34sif(&dosas.ne.0)%then;
b34sexec options open('testsas.sas') unit(29) disp=unknown$ b34srun$
b34sexec options clean(29) $ b34seend$
b34sexec pgmcall idata=29 icntrl=29$
sas $
* sas commands next ;
pgmcards$
proc spectra out=specgas coef p s;
var gasin gasout;
weights 1 1 1;
run;
proc means data=specgas; run;
proc print data=specgas; run;
b34sreturn$
b34srun $
b34sexec options close(29)$ b34srun$

/$ the next card has to be modified to point to sas location
/$ be sure and wait until sas gets done before letting b34s resume
/$ ***************************************************************
b34sexec options dodos('start /w /r sas testsas' )
                 dounix('sas testsas' ) $ b34srun$
b34sexec options npageout noheader
 writeout(' ','output from sas',' ',' ')
 writelog(' ','output from sas',' ',' ')
 copyfout('testsas.lst')
 copyflog('testsas.log')
/;dodos('erase testsas.sas','erase testsas.lst','erase testsas.log')
 dounix('rm testsas.sas','rm testsas.lst','rm testsas.log') $
b34srun$
b34sexec options header$ b34srun$
%b34sendif;

When the code in Table 15.7 is run, it produces edited output that shows the recovery of GASIN and GASOUT from the sine and cosine values. Note the very small errors, which are validated with OLS.

mean(x) -5.683445945945946E-02

Obs           X       TESTX     ERROR_1
  1     -0.1090     -0.1090  0.7633E-15
  2       0.000 -0.2720E-14  0.2720E-14
  3      0.1780      0.1780  0.4496E-14
  4      0.3390      0.3390  0.3497E-14
  5      0.3730      0.3730  0.1499E-14
  6      0.4410      0.4410 -0.1998E-14
  7      0.4610      0.4610 -0.4441E-15
  8      0.3480      0.3480 -0.1665E-15
  9      0.1270      0.1270 -0.1998E-14
 10     -0.1800     -0.1800 -0.4496E-14
 11     -0.5880     -0.5880 -0.6550E-14
 12      -1.055      -1.055 -0.7327E-14
 13      -1.421      -1.421 -0.4441E-14
 14      -1.520      -1.520 -0.1110E-14
 15      -1.302      -1.302 -0.2442E-14
 16     -0.8140     -0.8140 -0.1998E-14
 17     -0.4750     -0.4750  0.2776E-15
 18     -0.1930     -0.1930  0.4441E-14
 19  0.8800E-01  0.8800E-01  0.8313E-14
 20      0.4350      0.4350  0.1044E-13
. . .
280      0.2510      0.2510  0.1205E-13
281      0.2800      0.2800  0.6550E-14
282       0.000  0.6939E-16 -0.6939E-16
283     -0.4930     -0.4930 -0.1499E-14
284     -0.7590     -0.7590 -0.3442E-14
285     -0.8240     -0.8240 -0.5440E-14
286     -0.7400     -0.7400 -0.4663E-14
287     -0.5280     -0.5280  0.4885E-14
288     -0.2040     -0.2040  0.1138E-14
289  0.3400E-01  0.3400E-01 -0.3775E-14
290      0.2040      0.2040  0.1360E-14
291      0.2530      0.2530 -0.1094E-13
292      0.1950      0.1950 -0.8438E-14
293      0.1310      0.1310 -0.1887E-14
294  0.1700E-01  0.1700E-01  0.2387E-14
295     -0.1820     -0.1820  0.1016E-13
296     -0.2620     -0.2620  0.1221E-14

Ordinary Least Squares Estimation
Dependent variable                YHAT
Centered R**2                     1.000000000000000
Adjusted R**2                     1.000000000000000
Residual Sum of Squares           5.020523454753602E-26
Residual Variance                 1.707661039031837E-28
Standard Error                    1.306775052957408E-14
Total Sum of Squares              339.4936188885142
Log Likelihood                    9043.711769338292
Mean of the Dependent Variable    -5.683445945945921E-02
Std. Error of Dependent Variable  1.072765504078466
Sum Absolute Residuals            2.840377339427970E-12
1/Condition XPX                   0.8358143402344875
Maximum Absolute Residual         4.363176486776865E-14
Number of Observations            296

Variable   Lag   Coefficient       SE                t
GASIN       0    1.0000000         0.70922662E-15    0.14099866E+16
CONSTANT    0    0.38515884E-15    0.76061639E-15    0.50637726

mean(x) 53.50912162162162

Obs           X       TESTX     ERROR_1
  1       53.80       53.80  0.2132E-13
  2       53.60       53.60  0.7105E-14
  3       53.50       53.50  0.7105E-14
  4       53.50       53.50       0.000
  5       53.40       53.40 -0.7105E-14
  6       53.10       53.10  0.7105E-14
  7       52.70       52.70       0.000
  8       52.40       52.40 -0.7105E-14
  9       52.20       52.20  0.9237E-13
 10       52.00       52.00  0.9237E-13
 11       52.00       52.00  0.1066E-12
 12       52.40       52.40  0.8527E-13
 13       53.00       53.00  0.7816E-13
 14       54.00       54.00  0.1066E-12
 15       54.90       54.90  0.1137E-12
 16       56.00       56.00  0.1066E-12
 17       56.80       56.80  0.7105E-14
 18       56.80       56.80 -0.7105E-14
 19       56.40       56.40       0.000
 20       55.70       55.70 -0.1421E-13
. . .
280       54.40       54.40  0.1066E-12
281       53.70       53.70 -0.7105E-14
282       53.30       53.30 -0.3553E-13
283       52.80       52.80 -0.2842E-13
284       52.60       52.60 -0.3553E-13
285       52.60       52.60 -0.5684E-13
286       53.00       53.00 -0.2842E-13
287       54.30       54.30  0.7105E-14
288       56.00       56.00 -0.4263E-13
289       57.00       57.00  0.9237E-13
290       58.00       58.00  0.9948E-13
291       58.60       58.60  0.1066E-12
292       58.50       58.50  0.9237E-13
293       58.30       58.30  0.9948E-13
294       57.80       57.80  0.7816E-13
295       57.30       57.30  0.1208E-12
296       57.00       57.00  0.5684E-13

Ordinary Least Squares Estimation
Dependent variable                YHAT
Centered R**2                     0.9999997207772101
Adjusted R**2                     0.9999997198274727
Residual Sum of Squares           8.445943587647710E-04
Residual Variance                 2.872769927771330E-06
Standard Error                    1.694924755784554E-03
Total Sum of Squares              3024.804527027029
Log Likelihood                    1469.512199422020
Mean of the Dependent Variable    53.50912162163348
Std. Error of Dependent Variable  3.202120339382677
Sum Absolute Residuals            0.4999998603887690
F( 1, 294)                        1052922356.462301
F Significance                    1.000000000000000
1/Condition XPX                   1.214609942288382E-06
Maximum Absolute Residual         1.691397594697719E-03
Number of Observations            296

Variable   Lag   Coefficient       SE                t
GASOUT      0    0.99999972        0.30817805E-04    32448.765
CONSTANT    0    0.14940968E-04    0.16519738E-02    0.90443130E-02

The same calculation is done with the inverse FFT. Note the small errors:

Obs       GASIN        TEST       ERROR
  1     -0.1090     -0.1090  0.3747E-15
  2       0.000 -0.2497E-14  0.2497E-14
  3      0.1780      0.1780  0.4302E-14
  4      0.3390      0.3390  0.3386E-14
  5      0.3730      0.3730  0.1554E-14
  6      0.4410      0.4410 -0.1277E-14
  7      0.4610      0.4610 -0.1554E-14
  8      0.3480      0.3480 -0.9992E-15
  9      0.1270      0.1270 -0.2415E-14
 10     -0.1800     -0.1800 -0.4136E-14
 11     -0.5880     -0.5880 -0.5995E-14
 12      -1.055      -1.055 -0.6217E-14
 13      -1.421      -1.421 -0.3775E-14
 14      -1.520      -1.520 -0.2220E-15
 15      -1.302      -1.302 -0.2220E-15
 16     -0.8140     -0.8140       0.000
 17     -0.4750     -0.4750  0.1721E-14
 18     -0.1930     -0.1930  0.4191E-14
 19  0.8800E-01  0.8800E-01  0.5773E-14
 20      0.4350      0.4350  0.4829E-14
. . .
290      0.2040      0.2040 -0.2720E-14
291      0.2530      0.2530 -0.5607E-14
292      0.1950      0.1950 -0.5079E-14
293      0.1310      0.1310 -0.2914E-14
294  0.1700E-01  0.1700E-01 -0.9957E-15
295     -0.1820     -0.1820 -0.6106E-15
296     -0.2620     -0.2620 -0.6106E-15

From the sine and cosine coefficients the original series was recovered. The OLS regressions of the reconstructed series $\hat{Y}$ on $Y$ tested the conversion and found a coefficient very close to 1.0 and, as expected, a nearly perfect fit.
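The recovery can be illustrated on a short sample. The Python sketch below is not the B34S spectral routine; it is a direct-projection version written for illustration, with hypothetical function names. It computes a cosine and a sine coefficient at each Fourier frequency, counts the Nyquist term once when the sample length is even, and rebuilds the series as the mean plus the sum of the sine and cosine terms. The coefficients differ from those in the tables above because only the first few GASIN values are used.

```python
import math


def sincos_coefficients(x):
    """Cosine and sine coefficients at the Fourier frequencies of x."""
    n_obs = len(x)
    coefs = []
    for k in range(1, n_obs // 2 + 1):
        w = 2.0 * math.pi * k / n_obs
        scale = 1.0 if 2 * k == n_obs else 2.0   # Nyquist term counted once
        a = scale / n_obs * sum(v * math.cos(w * t) for t, v in enumerate(x))
        b = scale / n_obs * sum(v * math.sin(w * t) for t, v in enumerate(x))
        coefs.append((w, a, b))
    return coefs


def reconstruct(x):
    """Rebuild x from its mean and its sine and cosine coefficients."""
    mean = sum(x) / len(x)
    coefs = sincos_coefficients(x)
    return [mean + sum(a * math.cos(w * t) + b * math.sin(w * t)
                       for w, a, b in coefs)
            for t in range(len(x))]


# First nine GASIN values from the listing above (odd sample length).
gasin9 = [-0.109, 0.000, 0.178, 0.339, 0.373, 0.441, 0.461, 0.348, 0.127]
gasin8 = gasin9[:8]                      # even sample length

err_odd = max(abs(a - b) for a, b in zip(reconstruct(gasin9), gasin9))
err_even = max(abs(a - b) for a, b in zip(reconstruct(gasin8), gasin8))
print(err_odd, err_even)
```

The reconstruction errors are at the level of floating-point rounding, the same order of magnitude as the ERROR_1 column for GASIN.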

15.4 Wavelet Analysis

Wavelets provide a means by which a series can be filtered to remove noise.5 The first step is to compute the discrete Fourier transform $\hat{x}_k$, at frequency index $k$, of a series $x_n$, $n = 0, \ldots, N-1$, of $N$ observations, where

$$\hat{x}_k = \frac{1}{N}\sum_{n=0}^{N-1} x_n e^{-2\pi i k n/N} \tag{15.4-1}$$

Define the wavelet function as $\psi(s\omega)$. The wavelet transform is the inverse Fourier transform of the product

$$W_n(s) = \sum_{k=0}^{N-1} \hat{x}_k\, \hat{\psi}^{*}(s\omega_k)\, e^{i\omega_k n\,\delta t} \tag{15.4-2}$$

where $\omega_k = \frac{2\pi k}{N\delta t}$ for $k \le \frac{N}{2}$ and $\omega_k = -\frac{2\pi k}{N\delta t}$ otherwise. The wavelet transforms in (15.4-2) are normalized to have unit energy,

$$\hat{\psi}(s\omega_k) = \left(\frac{2\pi s}{\delta t}\right)^{.5}\hat{\psi}_0(s\omega_k)$$

where the unscaled transforms $\hat{\psi}_0$ have been normalized to have unit energy, or $\int_{-\infty}^{\infty} |\hat{\psi}_0(\omega')|^2\, d\omega' = 1$. At each scale $s$, $\sum_{k=0}^{N-1} |\hat{\psi}(s\omega_k)|^2 = N$.
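The unit-energy normalization can be checked numerically. The Python sketch below (illustrative only; function names are hypothetical) uses the Morlet transform with frequency parameter 6, the sample size and sampling time of the Nino example used later in this section, and the angular frequencies defined above. It assumes scales that are large enough for the Morlet peak to lie well below the highest resolved frequency; at very small scales the sum falls short of N.

```python
import math


def morlet_ft(s, omega, dt, omega0=6.0):
    """Normalized Morlet transform psi_hat(s*omega); zero when omega <= 0."""
    if omega <= 0.0:
        return 0.0
    norm = math.sqrt(2.0 * math.pi * s / dt) * math.pi ** -0.25
    return norm * math.exp(-0.5 * (s * omega - omega0) ** 2)


def angular_frequencies(n_obs, dt):
    """omega_k as defined in the text: positive up to N/2, negative beyond."""
    out = []
    for k in range(n_obs):
        w = 2.0 * math.pi * k / (n_obs * dt)
        out.append(w if k <= n_obs // 2 else -w)
    return out


n_obs, dt = 504, 0.25
omega = angular_frequencies(n_obs, dt)
ratios = {s: sum(morlet_ft(s, w, dt) ** 2 for w in omega) / n_obs
          for s in (1.0, 2.0, 4.0, 8.0)}
print(ratios)
```

Each ratio of the sum of squared transform values to N is very close to one, as the normalization requires.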

Table 15.8 lists the wavelet basis functions and their properties and has been taken from Torrence and Compo (1998) Table 1. $H(\omega)$ is the Heaviside step function, which equals 1 if $\omega > 0$ and 0 otherwise. The DOG is the only real-valued wavelet function of the three.

Table 15.8 Wavelet basis functions supported

Name                                             $\psi_0(\eta)$ and $\hat{\psi}_0(s\omega)$

Morlet ($\omega_0$ = frequency)
    $\psi_0(\eta) = \pi^{-.25} e^{i\omega_0\eta} e^{-\eta^2/2}$
    $\hat{\psi}_0(s\omega) = \pi^{-.25} H(\omega)\, e^{-(s\omega-\omega_0)^2/2}$

Paul ($m$ = order)
    $\psi_0(\eta) = \frac{2^m i^m m!}{\sqrt{\pi (2m)!}}\,(1 - i\eta)^{-(m+1)}$
    $\hat{\psi}_0(s\omega) = \frac{2^m}{\sqrt{m(2m-1)!}}\, H(\omega)(s\omega)^m e^{-s\omega}$

DOG (Derivative of Gaussian) ($m$ = derivative)
    $\psi_0(\eta) = \frac{(-1)^{m+1}}{\sqrt{\Gamma(m+.5)}}\,\frac{d^m}{d\eta^m}\left(e^{-\eta^2/2}\right)$
    $\hat{\psi}_0(s\omega) = \frac{-i^m}{\sqrt{\Gamma(m+.5)}}\,(s\omega)^m e^{-(s\omega)^2/2}$

Source: Torrence and Compo (1998, Table 1)

5 This section discusses the approach to wavelet estimation suggested by Torrence and Compo (1998), who provided the Fortran code that is used for all calculations. Consult this article for further information. Wherever possible, the notation in this section follows their article.

Table 15.9 Empirically derived factors for four wavelet bases

Name                        $C_\delta$    $\gamma$    $\delta j_0$    $\psi_0(0)$
Morlet ($\omega_0 = 6$)       .776        2.32         .60        $\pi^{-.25}$ = .7511255
Paul ($m = 4$)               1.132        1.17        1.5         1.079
Marr (DOG $m = 2$)           3.541        1.43        1.4          .867
DOG ($m = 6$)                1.966        1.37        0.97         .884

Source: Torrence and Compo (1998, Table 2)

$C_\delta$ = reconstruction factor
$\gamma$ = decorrelation factor for time averaging
$\delta j_0$ = factor for scale averaging

After selecting the wavelet basis function, wavelet analysis proceeds by selecting the appropriate scales $s$ to use. Define

$$s_j = s_0 2^{j\,\delta j}, \quad j = 0, 1, \ldots, J, \qquad J = \delta j^{-1}\log_2(N\delta t/s_0) \tag{15.4-3}$$

$s_0$ is the smallest scale and $J$ determines the largest. Smaller $\delta j$ values mean more resolution. For the Morlet wavelet $\delta j$ is usually set no larger than .5. By summing over all scales, the original time series can be reconstructed from the real part of the wavelet transformation, $\Re\{W_n(s_j)\}$, as

$$x_n = \frac{\delta j\,\delta t^{.5}}{C_\delta\,\psi_0(0)}\sum_{j=0}^{J}\frac{\Re\{W_n(s_j)\}}{s_j^{.5}} \tag{15.4-4}$$

Torrence and Compo (1998) outline how these formulas are used to obtain $C_\delta$ for a new wavelet function. Assume a series consisting of a $\delta$ function at time period $n = 0$, so that $x_n = \delta_{n0}$, which implies a Fourier transform $\hat{x}_k = \frac{1}{N}$ from (15.4-1) that is constant for all $k$. Substituting $\hat{x}_k$ into (15.4-2) at $n = 0$ gives

$$W_\delta(s) = \frac{1}{N}\sum_{k=0}^{N-1}\hat{\psi}^{*}(s\omega_k) \tag{15.4-5}$$

which using (15.4-4) can be solved for

$$C_\delta = \frac{\delta j\,\delta t^{.5}}{\psi_0(0)}\sum_{j=0}^{J}\frac{\Re\{W_\delta(s_j)\}}{s_j^{.5}} \tag{15.4-6}$$
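The calculation of the reconstruction factor can be sketched numerically for the Morlet wavelet. The Python code below is an illustration under stated assumptions, not the Torrence and Compo Fortran code: the sample size, sampling time, smallest scale and scale spacing are arbitrary choices (N = 1024, δt = 1, s0 = .25, δj = .125), and the function names are hypothetical. It evaluates (15.4-5) at each scale and then sums as in (15.4-6).

```python
import math


def morlet_ft(s, omega, dt, omega0=6.0):
    """Normalized Morlet transform psi_hat(s*omega); zero when omega <= 0."""
    if omega <= 0.0:
        return 0.0
    norm = math.sqrt(2.0 * math.pi * s / dt) * math.pi ** -0.25
    return norm * math.exp(-0.5 * (s * omega - omega0) ** 2)


def reconstruction_factor(n_obs=1024, dt=1.0, s0=0.25, dj=0.125):
    """C_delta from (15.4-5) and (15.4-6) for the Morlet wavelet."""
    n_scales = int(round(math.log2(n_obs * dt / s0) / dj))   # J from (15.4-3)
    psi0_at_zero = math.pi ** -0.25                          # Morlet psi_0(0)
    # Only positive frequencies contribute because of the Heaviside term.
    omega = [2.0 * math.pi * k / (n_obs * dt) for k in range(1, n_obs // 2 + 1)]
    total = 0.0
    for j in range(n_scales + 1):
        s = s0 * 2.0 ** (j * dj)
        w_delta = sum(morlet_ft(s, w, dt) for w in omega) / n_obs
        total += w_delta / math.sqrt(s)
    return dj * math.sqrt(dt) / psi0_at_zero * total


c_delta = reconstruction_factor()
print(c_delta)
```

Under these settings the result is close to the value of .776 reported for the Morlet wavelet in Table 15.9. A small discrepancy is to be expected, since the tabulated factor was derived empirically and the settings used here are assumptions.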

Total energy is conserved under the wavelet transform, and this should be checked to make sure that appropriate $s_0$ and $\delta j$ values have been selected.

$$\sigma^2 = \frac{\delta j\,\delta t}{C_\delta N}\sum_{n=0}^{N-1}\sum_{j=0}^{J}\frac{|W_n(s_j)|^2}{s_j} \tag{15.4-7}$$

This suggests that all scales should be inspected using (15.4-4) and (15.4-7) to see if the level and the variance can be accurately recovered. If this is not the case, $s_0$ and $\delta j$ would have to be adjusted.

The time-averaged wavelet spectrum over a certain period is

$$\bar{W}_n^2(s) = \frac{1}{n_a}\sum_{n=n_1}^{n_2} |W_n(s)|^2 \tag{15.4-8}$$

where $n_a = n_2 - n_1 + 1$. Averaging over all the local wavelet spectra gives the global wavelet spectrum

$$\bar{W}^2(s) = \frac{1}{N}\sum_{n=0}^{N-1} |W_n(s)|^2 \tag{15.4-9}$$

The scale-averaged wavelet power is defined as the weighted sum of the wavelet power spectrum over scales $s_1$ to $s_2$

$$\bar{W}_n^2 = \frac{\delta j\,\delta t}{C_\delta}\sum_{j=j_1}^{j_2}\frac{|W_n(s_j)|^2}{s_j} \tag{15.4-10}$$

It is possible to filter a series by using (15.4-4) and summing over a range of scales

$$x_n = \frac{\delta j\,\delta t^{.5}}{C_\delta\,\psi_0(0)}\sum_{j=j_1}^{j_2}\frac{\Re\{W_n(s_j)\}}{s_j^{.5}} \tag{15.4-11}$$

One way to think about (15.4-11) is that the larger $j_1$ is, the more of the information in the short periods (small scales) is removed. Thus noise, which by assumption is of short duration, can be removed from the data to capture what is hoped to be the more fundamental series. The higher the sampling rate, the more this noise-reduction strategy may be required. Experimentation may be needed to set the scales that are appropriate for the series at hand. Once the series has been processed, it can be further analyzed using nonlinear and linear models that hopefully will better capture the underlying structure.
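The reconstruction (15.4-4) and the filter (15.4-11) can be sketched on an artificial series. The Python code below is a slow but transparent illustration, not the Torrence and Compo Fortran code used by B34S. The test series (a period-16 cosine plus a smaller period-4 cosine, N = 64), the scale grid and the band limits 7.5 to 24 are all assumptions chosen for the example, and no padding is applied. The Morlet wavelet with frequency parameter 6 and the factor .776 from Table 15.9 are used.

```python
import cmath
import math

C_DELTA = 0.776                     # Morlet reconstruction factor (Table 15.9)
PSI0_AT_ZERO = math.pi ** -0.25     # Morlet psi_0(0)


def morlet_ft(s, omega, dt, omega0=6.0):
    """Normalized Morlet transform psi_hat(s*omega); zero when omega <= 0."""
    if omega <= 0.0:
        return 0.0
    norm = math.sqrt(2.0 * math.pi * s / dt) * math.pi ** -0.25
    return norm * math.exp(-0.5 * (s * omega - omega0) ** 2)


def wavelet_transform(x, scales, dt):
    """W_n(s_j) from (15.4-1) and (15.4-2), returned as w[j][n]."""
    n_obs = len(x)
    ks = range(n_obs // 2 + 1)      # the Morlet transform vanishes elsewhere
    omega = [2.0 * math.pi * k / (n_obs * dt) for k in ks]
    xhat = [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_obs)
                for n in range(n_obs)) / n_obs for k in ks]
    w = []
    for s in scales:
        coef = [xhat[k] * morlet_ft(s, omega[k], dt) for k in ks]
        w.append([sum(coef[k] * cmath.exp(1j * omega[k] * n * dt) for k in ks)
                  for n in range(n_obs)])
    return w


def reconstruct(w, scales, dt, dj, keep=lambda s: True):
    """Sum in (15.4-4); restricting the scales gives the filter (15.4-11)."""
    factor = dj * math.sqrt(dt) / (C_DELTA * PSI0_AT_ZERO)
    return [factor * sum(w[j][n].real / math.sqrt(s)
                         for j, s in enumerate(scales) if keep(s))
            for n in range(len(w[0]))]


n_obs, dt, dj, s0 = 64, 1.0, 0.125, 1.0
slow = [math.cos(2.0 * math.pi * n / 16.0) for n in range(n_obs)]
fast = [0.5 * math.cos(2.0 * math.pi * n / 4.0) for n in range(n_obs)]
x = [a + b for a, b in zip(slow, fast)]

n_scales = int(round(math.log2(n_obs * dt / s0) / dj))     # J from (15.4-3)
scales = [s0 * 2.0 ** (j * dj) for j in range(n_scales + 1)]
w = wavelet_transform(x, scales, dt)

full = reconstruct(w, scales, dt, dj)
filtered = reconstruct(w, scales, dt, dj, keep=lambda s: 7.5 <= s <= 24.0)

err_full = max(abs(a - b) for a, b in zip(full, x))
err_filtered = max(abs(a - b) for a, b in zip(filtered, slow))
print(err_full, err_filtered)
```

Summing over all scales recovers the series up to a small error, and keeping only the larger scales removes the short-period component while leaving the period-16 component nearly intact. With real data the choice of band limits is a matter of experimentation, as noted above.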

Wavelet calculations are illustrated using 504 observations on the Nino series suggested by Torrence and Compo (1998). Table 15.10 shows a setup to completely filter the Nino series.

Table 15.10 Wavelet Filter For The Nino Series

b34sexec options ginclude('wavedata.mac') member(nino3); b34srun;
/;
/; Basic Wavelet test
/;
b34sexec matrix;
call loaddata;
call wavelet(nino :type morlet :settings
             :s0 .25 :dt .25 :lower 2. :upper 7.9 :jtot 44);
call tabulate(nino,%recon_y);
call olsq(nino %recon_y :print);
call tabulate(%scale,%period,%w_power,%w_phase,
              %w_ampl,%signif,%global,%g_sig);
call print(%sa_df %sa_sig);
call print('mean original data ',mean(nino):);
call print('mean of reconstructed data ',mean(%recon_y):);
call print('Variance of original Data ',variance(nino):);
call print('Variance of reconstructed Data ',variance(%recon_y):);
b34srun;

Edited output follows:

B34S 8.11C (D:M:Y) 5/ 8/07 (H:M:S) 17: 2:31 DATA STEP Sea Surface Temp PAGE 1

Variable   Label                               # Cases   Mean            Std. Dev.   Variance   Maximum   Minimum

NINO       1   Nino2 sea surface temperature   504       -0.198413E-04   0.734328    0.539238   2.50000   -1.85000
CONSTANT   2                                   504       1.00000         0.00000     0.00000    1.00000   1.00000

Number of observations in data file 504
Current missing variable code 1.000000000000000E+31
Data begins on (D:M:Y) 1: 1:1871 ends 1:10:1996. Frequency is 4

B34S(r) Matrix Command. d/m/y 5/ 8/07. h:m:s 17: 2:31.

=> CALL LOADDATA$

=> CALL WAVELET(NINO :TYPE MORLET :SETTINGS :S0 .25 :DT .25
=>              :LOWER 2. :UPPER 7.9 :JTOT 44)$

Wavelet Option Settings
Input Variable                              NINO
Number of Original Observations             504
Sampling time (dt)                          0.2500000000000000
Wavelet used                                Morlet
Wave number (param)                         6.000000000000000
Smallest scale of wavelet s0=%scale(1)      0.2500000000000000
Largest scale of wavelet (%scale(jtot))     430.5389646099018
Spacing between discrete scales (dj)        0.2500000000000000
Number of scales (jtot)                     44
Number of observation after padding         1024
Lag1 (background autocorrelation)           0.7200000000000000
Significance level                          5.000000000000004E-02
Lower Scale for filter                      2.000000000000000
Upper Scale for filter                      7.900000000000000
Work array size (nk)                        1024
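The smallest and largest scales reported in these settings follow from (15.4-3) with s0 = .25, dj = .25 and 44 scales. The Python sketch below (illustrative only) reproduces them. It also converts each scale to a Fourier period using the Morlet relation 4πs/(ω0 + sqrt(2 + ω0²)) given in Torrence and Compo (1998, Table 1). That relation is not stated in this chapter and is brought in here as an assumption to illustrate how a period can be associated with each scale.

```python
import math

s0, dj, jtot, omega0 = 0.25, 0.25, 44, 6.0

# Scales from (15.4-3); jtot is set by the user in this run.
scales = [s0 * 2.0 ** (j * dj) for j in range(jtot)]

# Fourier period of a Morlet wavelet at scale s (assumed relation, see lead-in).
fourier_factor = 4.0 * math.pi / (omega0 + math.sqrt(2.0 + omega0 ** 2))
periods = [fourier_factor * s for s in scales]

print(scales[0], scales[1], scales[-1])
print(periods[0], periods[8])
```

The last scale agrees with the reported largest scale of 430.5389646, and the first period is about .2583 when the first scale is .25.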

=> CALL TABULATE(NINO,%RECON_Y)$

The filter is successful at capturing the NINO series, as shown by the application of (15.4-4).

Obs        NINO    %RECON_Y
  1     -0.1500     -0.1508
  2     -0.3000     -0.3005
  3     -0.1400     -0.1408
  4     -0.4100     -0.4109
  5     -0.4600     -0.4618
  6     -0.6600     -0.6617
  7     -0.5000     -0.5019
  8     -0.8000     -0.8021
  9     -0.9500     -0.9533
 10     -0.7200     -0.7219
 11     -0.3100     -0.3113
 12     -0.7100     -0.7119
 13      -1.040      -1.044
 14     -0.7700     -0.7721
 15     -0.8600     -0.8630
 16     -0.8400     -0.8423
 17     -0.4100     -0.4116
 18     -0.4900     -0.4912
 19     -0.4800     -0.4818
 20     -0.7200     -0.7219
 21      -1.210      -1.214
 22     -0.8000     -0.8021
 23      0.1600      0.1602
 24      0.4600      0.4618
 25      0.4000      0.4009
 26       1.000       1.004
. . .
494      0.3900      0.3913
495     -0.1700     -0.1706
496       1.040       1.043
497      0.7700      0.7723
498      0.1200      0.1206
499     -0.3500     -0.3513
500     -0.2200     -0.2204
501  0.8000E-01  0.7998E-01
502 -0.8000E-01 -0.7998E-01
503     -0.1800     -0.1808
504 -0.6000E-01 -0.5990E-01

An OLS regression documents the closeness of the fit.

=> CALL OLSQ(NINO %RECON_Y :PRINT)$

Ordinary Least Squares Estimation
Dependent variable                NINO
Centered R**2                     0.9999998133138651
Adjusted R**2                     0.9999998129419804
Residual Sum of Squares           5.063609377294851E-05
Residual Variance                 1.008687126951166E-07
Standard Error                    3.175983512159919E-04
Total Sum of Squares              271.2364998015873
Log Likelihood                    3345.437370731567
Mean of the Dependent Variable    -1.984126984126920E-05
Std. Error of Dependent Variable  0.7343279745169901
Sum Absolute Residuals            0.1518517527012808
F( 1, 502)                        2689004765.108992
F Significance                    1.000000000000000
1/Condition XPX                   0.6454896073187762
Maximum Absolute Residual         5.326197560795443E-04
Number of Observations            504

Variable   Lag   Coefficient        SE                t
%RECON_Y    0    0.99689178         0.19224375E-04    51855.615
CONSTANT    0    -0.19583793E-04    0.14146955E-04    -1.3843115

The following table lists the power, phase and other values by period/scale.

=> CALL TABULATE(%SCALE,%PERIOD,%W_POWER,%W_PHASE,
=> %W_AMPL,%SIGNIF,%GLOBAL,%G_SIG)$

 Obs  %SCALE  %PERIOD  %W_POWER     %W_PHASE  %W_AMPL      %SIGNIF  %GLOBAL      %G_SIG
   1  0.2500  0.2583   0.6064E-06    116.4    0.7787E-03   7.230    0.1340E-05   2.689
   2  0.2973  0.3071   0.1538E-04    116.6    0.3922E-02   0.8132   0.3449E-04   0.3053
   3  0.3536  0.3652   0.3005E-03    118.3    0.1734E-01   0.3707   0.6882E-03   0.1406
   4  0.4204  0.4343   0.3146E-02    122.8    0.5609E-01   0.2774   0.7453E-02   0.1064
   5  0.5000  0.5165   0.1232E-01    132.4    0.1110       0.2631   0.2912E-01   0.1021
   6  0.5946  0.6143   0.2164E-01    143.6    0.1471       0.2855   0.4269E-01   0.1123
   7  0.7071  0.7305   0.2336E-01    141.6    0.1529       0.3365   0.5829E-01   0.1342
   8  0.8409  0.8687   0.5037E-01    135.1    0.2244       0.4181   0.9573E-01   0.1693
   9  1.000   1.033    0.2590        148.1    0.5089       0.5369   0.1767       0.2211
  10  1.189   1.229    0.3354        173.0    0.5791       0.7035   0.2689       0.2948
  11  1.414   1.461    0.4558       -125.3    0.6751       0.9314   0.3958       0.3979
  12  1.682   1.737    0.9827       -117.8    0.9913       1.236    0.5479       0.5392
  13  2.000   2.066    0.4454       -159.3    0.6674       1.635    0.7847       0.7290
  14  2.378   2.457    0.2008        153.5    0.4481       2.140    1.276        0.9773
  15  2.828   2.922    0.2182E-01    13.59    0.1477       2.758    1.954        1.292
  16  3.364   3.475    0.1589E-01    58.03    0.1260       3.481    2.569        1.676
  17  4.000   4.132    0.9440       -165.5    0.9716       4.285    2.270        2.125
  18  4.757   4.914    2.964        -146.5    1.722        5.130    2.109        2.625
  19  5.657   5.844    0.7344       -140.4    0.8570       5.968    2.357        3.157
  20  6.727   6.949    0.2061        66.75    0.4540       6.750    1.703        3.700
  21  8.000   8.264    0.3094        120.3    0.5562       7.442    1.368        4.236
  22  9.514   9.828    1.300         131.0    1.140        8.025    1.318        4.753
  23  11.31   11.69    2.055         137.4    1.433        8.496    1.605        5.246
  24  13.45   13.90    2.273         129.9    1.508        8.865    1.823        5.716
  25  16.00   16.53    0.2368        119.8    0.4866       9.146    1.311        6.167
  26  19.03   19.66    0.3610       -116.2    0.6009       9.355    1.424        6.603
  27  22.63   23.38    0.1298       -50.34    0.3603       9.509    1.154        7.027
  28  26.91   27.80    1.415         12.27    1.189        9.621    1.295        7.440
  29  32.00   33.06    0.8495        14.68    0.9217       9.702    0.6433       7.838
  30  38.05   39.31    0.5591       -34.62    0.7477       9.760    0.5865       8.214
  31  45.25   46.75    1.005        -42.79    1.003        9.802    1.008        8.560
  32  53.82   55.60    0.2850        2.065    0.5339       9.831    0.5386       8.867
  33  64.00   66.11    0.7003        91.35    0.8368       9.852    0.7283       9.128
  34  76.11   78.62    1.477         117.7    1.215        9.867    1.363        9.341
  35  90.51   93.50    1.247         133.6    1.116        9.878    1.169        9.508
  36  107.6   111.2    0.6657        173.3    0.8159       9.885    0.6206       9.634
  37  128.0   132.2    0.9212       -173.0    0.9598       9.891    0.9145       9.726
  38  152.2   157.2    0.1714       -163.0    0.4140       9.894    0.1606       9.790
  39  181.0   187.0    0.1426       -114.7    0.3777       9.897    0.1404       9.834
  40  215.3   222.4    1.092        -112.3    1.045        9.899    1.092        9.863
  41  256.0   264.5    2.003        -112.3    1.415        9.900    2.003        9.881
  42  304.4   314.5    0.2956       -112.3    0.5437       9.901    0.2956       9.893
  43  362.0   374.0    0.7419E-03   -112.3    0.2724E-01   9.902    0.7419E-03   9.899
  44  430.5   444.8    0.3191E-08   -112.3    0.5649E-04   9.902    0.3191E-08   9.902

=> CALL PRINT(%SA_DF %SA_SIG)$

%SA_DF = 6.4406460

%SA_SIG = 0.43879433

=> CALL PRINT('mean original data ',MEAN(NINO):)$

mean original data -1.984126984126920E-05

=> CALL PRINT('mean of reconstructed data ',MEAN(%RECON_Y):)$

mean of reconstructed data -2.582799130488190E-07

=> CALL PRINT('Variance of original Data ',VARIANCE(NINO):)$

Variance of original Data 0.5392375741582253

=> CALL PRINT('Variance of reconstructed Data ',VARIANCE(%RECON_Y):)$

Variance of reconstructed Data 0.5426052999725153

B34S Matrix Command Ending. Last Command reached.

Space available in allocator 11856880, peak space used 61607
Number variables used 54, peak number used 54
Number temp variables used 28, # user temp clean 0

Table 15.11 illustrates the application of equation (15.4-11) to filter the noise from the nino3 series using successively higher values of s0. The results are shown in Figure 15.6, which is best viewed in color. The series filter3 is the most aggressively smoothed because it uses the largest value, s0 = 4.5.

Table 15.11 Smoothing Nino Series by Local Periods

b34sexec options ginclude('wavedata.mac') member(nino3); b34srun;
/;
/; Illustrates filtering. Increasing s0 => tighter filter
/;
b34sexec matrix;
call loaddata;
/; This setting will closely filter series
call wavelet(nino :type morlet :settings :s0 2.25 :dt .25 :jtot 44);
filter1=%recon_y;
call wavelet(nino :type morlet :settings :s0 3.5 :dt .25 :jtot 44);
filter2=%recon_y;
call wavelet(nino :type morlet :settings :s0 4.5 :dt .25 :jtot 44);
filter3=%recon_y;

call tabulate(nino,filter1,filter2,filter3);
call graph(nino,filter1,filter2 filter3 :nolabel
           :heading 'Raw Nino and smoothed series'
           :file 'nino.wmf');
b34srun;
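The scale grid implied by the :s0 and :jtot settings in Table 15.11 can be checked independently of B34S. The Python sketch below is illustrative only. It assumes that the scales follow s_j = s0*2**(j*dj) for j = 0,...,jtot-1 with dj = 0.25, and that the Morlet scale-to-period factor is 4*pi/(k0 + sqrt(2 + k0**2)) with wave number k0 = 6. Both assumptions appear to reproduce the largest scales and the %SCALE/%PERIOD pairs printed in the wavelet output, but the function name morlet_scales is hypothetical and is not part of B34S.

```python
import math


def morlet_scales(s0, dj, jtot, k0=6.0):
    """Return (scales, periods) for an assumed dyadic scale grid.

    Assumption: s_j = s0 * 2**(j*dj), j = 0..jtot-1, and the Morlet
    scale-to-period factor 4*pi / (k0 + sqrt(2 + k0**2)).
    """
    factor = 4.0 * math.pi / (k0 + math.sqrt(2.0 + k0 * k0))
    scales = [s0 * 2.0 ** (j * dj) for j in range(jtot)]
    periods = [factor * s for s in scales]
    return scales, periods


if __name__ == "__main__":
    # s0 values used in the runs of this section
    for s0 in (0.25, 2.25, 3.5, 4.5):
        scales, periods = morlet_scales(s0, dj=0.25, jtot=44)
        print(f"s0={s0:5.2f}  largest scale={scales[-1]:.6f}  "
              f"smallest period={periods[0]:.4f}")
```

Raising s0 shifts the whole grid upward, so the smallest scale available to the reconstruction grows, which is consistent with the heavier smoothing seen for filter3.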

Figure 15.6 shows increased smoothing of the Nino series.

[Figure: plot titled 'Raw Nino and smoothed series' showing NINO, FILTER1, FILTER2 and FILTER3 against Obs]

Figure 15.6 Raw Nino series and three wavelet smoothed series

A portion of the output that produced Figure 15.6 is shown.

Wavelet Option Settings Input Variable NINO Number of Original Observations 504 Sampling time (dt) 0.2500000000000000 Wavelet used Morlet Wave number (param) 6.000000000000000 Smallest scale of wavelet s0=%scale(1) 2.250000000000000 Largest scale of wavelet (%scale(jtot)) 3874.850681489117 Spacing between discrete scales (dj) 0.2500000000000000 Number of scales (jtot) 44 Number of observation after padding 1024 Lag1 (background autocorrelation) 0.7200000000000000 Significance level 5.000000000000004E-02 Lower Scale for filter 2.000000000000000 Upper Scale for filter 6.000000000000000 Work array size (nk) 1024

=> FILTER1=%RECON_Y$

=> CALL WAVELET(NINO :TYPE MORLET :SETTINGS :S0 3.5 :DT .25 => :JTOT 44)$

Wavelet Option Settings Input Variable NINO Number of Original Observations 504 Sampling time (dt) 0.2500000000000000 Wavelet used Morlet Wave number (param) 6.000000000000000 Smallest scale of wavelet s0=%scale(1) 3.500000000000000 Largest scale of wavelet (%scale(jtot)) 6027.545504538625 Spacing between discrete scales (dj) 0.2500000000000000 Number of scales (jtot) 44 Number of observation after padding 1024 Lag1 (background autocorrelation) 0.7200000000000000 Significance level 5.000000000000004E-02 Lower Scale for filter 2.000000000000000 Upper Scale for filter 6.000000000000000 Work array size (nk) 1024

=> FILTER2=%RECON_Y$

=> CALL WAVELET(NINO :TYPE MORLET :SETTINGS :S0 4.5 :DT .25 => :JTOT 44)$

Wavelet Option Settings Input Variable NINO Number of Original Observations 504 Sampling time (dt) 0.2500000000000000 Wavelet used Morlet Wave number (param) 6.000000000000000 Smallest scale of wavelet s0=%scale(1) 4.500000000000000 Largest scale of wavelet (%scale(jtot)) 7749.701362978233 Spacing between discrete scales (dj) 0.2500000000000000 Number of scales (jtot) 44 Number of observation after padding 1024 Lag1 (background autocorrelation) 0.7200000000000000 Significance level 5.000000000000004E-02 Lower Scale for filter 2.000000000000000 Upper Scale for filter 6.000000000000000 Work array size (nk) 1024

=> FILTER3=%RECON_Y$

=> CALL TABULATE(NINO,FILTER1,FILTER2,FILTER3)$

 Obs   NINO      FILTER1   FILTER2       FILTER3
   1  -0.1500   -0.1085   -0.1517       -0.1635
   2  -0.3000   -0.1893   -0.2318       -0.2125
   3  -0.1400   -0.2873   -0.3187       -0.2659
   4  -0.4100   -0.3926   -0.4060       -0.3239
   5  -0.4600   -0.4926   -0.4864       -0.3867
   6  -0.6600   -0.5770   -0.5541       -0.4545
   7  -0.5000   -0.6408   -0.6054       -0.5271
   8  -0.8000   -0.6861   -0.6400       -0.6038
   9  -0.9500   -0.7198   -0.6611       -0.6829
  10  -0.7200   -0.7479   -0.6751       -0.7614
  11  -0.3100   -0.7707   -0.6900       -0.8349
  12  -0.7100   -0.7817   -0.7135       -0.8978
  13  -1.040    -0.7720   -0.7500       -0.9434
  14  -0.7700   -0.7381   -0.7992       -0.9646
  15  -0.8600   -0.6890   -0.8540       -0.9546
  16  -0.8400   -0.6467   -0.9008       -0.9079
  17  -0.4100   -0.6371   -0.9211       -0.8209
  18  -0.4900   -0.6741   -0.8946       -0.6932
  19  -0.4800   -0.7441   -0.8033       -0.5274
  20  -0.7200   -0.8005   -0.6361       -0.3302
  21  -1.210    -0.7732   -0.3927       -0.1112
  22  -0.8000   -0.5943   -0.8565E-01    0.1170
  23   0.1600   -0.2287    0.2599        0.3402
  24   0.4600    0.3012    0.6087        0.5438
  25   0.4000    0.9105    0.9208        0.7143
  26   1.000     1.471     1.158         0.8407
  27   2.170     1.847     1.290         0.9155
  28   2.500     1.943     1.300         0.9355
  29   2.340     1.732     1.191         0.9021
  30   0.8000    1.265     0.9806        0.8211
 . . . . .

15.5 Use of Normalized Cumulative Periodogram to test for white noise

Box-Jenkins (1976, 294-298) and Box-Jenkins-Reinsel (2008, 347-350) suggest using the normalized cumulative periodogram $\hat{c}(\omega_j)$, defined as

$$\hat{c}(\omega_j) = \frac{\sum_{i=1}^{j} \hat{P}_x(\omega_i)}{T\,\hat{s}^2}, \qquad (15.5\text{-}1)$$

where $\hat{P}_x(\omega_i)$ is defined in (15.1-25) and (15.1-27) and $\hat{s}^2$ is the estimated variance of the series, to check for white noise. For a white noise series the plot of $\hat{c}(\omega_j)$ against frequency lies close to the diagonal. The advantage of using (15.5-1) in conjunction with inspection of the ACF is that the plot shows whether a violation of the white noise assumption arises from more low frequency than high frequency information (the plot is above the diagonal) or from more high frequency than low frequency information (the plot is below the diagonal). The coefficients for the 99%, 95%, 90% and 75% probability limits are, respectively, 1.63, 1.36, 1.22 and 1.02. In the subroutine of Table 15.12 the limit lines are drawn at the diagonal plus and minus the coefficient divided by $\sqrt{(n/2)-1}$, where n is the number of observations. The subroutine cperiod listed in Table 15.12 implements this test, and the program listed in Table 15.13 exercises it.

Table 15.12 Calculating the Normalized Cumulative Periodogram

subroutine cperiod(x,name,c_period,c_p_freq,idrop);
/;
/; Normalized Cumulative Periodogram
/;
/; Box-Jenkins-Reinsel (2008,347-350) suggest calculation of the
/; cumulative periodogram to detect periodic nonrandomness
/;
/; See Jenkins and Watts (1968, 235)
/;
/; For significance of .95 and .75 lambda = 1.36 and 1.02
/;                     .99 and .90 lambda = 1.63 and 1.22
/; band is +- lambda/sqrt((n/2)-1)
/;
/; Command built October 2009 by Houston H. Stokes
/;
/; x        = series to test
/; name     = name of series
/; c_period = normalized cumulative periodogram
/; c_p_freq = frequency of normalized cumulative periodogram
/; idrop    = Number of c_period values to drop
/;
/; name of file is 'n_c_period.wmf'
/;
n=dfloat(norows(x));
varx=variance(x);

if(varx.le.0.0d+00)then;
call print('ERROR: Series has no variance':);
go to done;
endif;

p =spectrum(x,freq2);
c_p_freq=freq2;
c_period=cusum(p)/(n*varx);

if(idrop.gt.0)then;
c_p_freq =dropfirst(freq2, idrop);
c_period=dropfirst(c_period,idrop);
endif;

diag=dfloat(integers(1,norows(c_period)));
diag=diag/dfloat(norows(c_period));
test=1./dsqrt(((n/2.) -1.));

upper99=diag+((1.63)*test);
lower99=diag-((1.63)*test);

upper95=diag+((1.36)*test);
lower95=diag-((1.36)*test);
/;
/; These bands can be added if desired
/;
/; upper90=diag+((1.22)*test);
/; lower90=diag-((1.22)*test);

upper75=diag+((1.02)*test);
lower75=diag-((1.02)*test);

call character(cc,'Cumulative Periodogram of ');
call character(cc2,name);
call ialen(cc2,ii);
call expand(cc,cc2,27,27+ii);

call graph(c_p_freq,c_period,diag, upper99 lower99 upper95,
           lower95 upper75 lower75
           :heading cc :pgborder :nocontact :nolabel
           :plottype xyplot :file 'n_c_period.wmf');

done continue;
return;
end;
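The calculation in Table 15.12 can be sketched outside B34S. The Python fragment below is a minimal illustration, not a translation of cperiod. It assumes periodogram ordinates of (2/n)|X_k|**2 at the Fourier frequencies strictly between 0 and pi, computed on the demeaned series, which may differ in scaling from the B34S spectrum function. The names cumulative_periodogram, band_halfwidth and max_deviation are hypothetical.

```python
import numpy as np


def cumulative_periodogram(x):
    """Normalized cumulative periodogram in the spirit of (15.5-1).

    Assumption: ordinates are (2/n)|X_k|^2 for k = 1..q, where q counts
    the Fourier frequencies strictly between 0 and pi.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()
    q = (n - 1) // 2
    fx = np.fft.rfft(xc)
    p = (2.0 / n) * np.abs(fx[1:q + 1]) ** 2
    c = np.cumsum(p) / (n * xc.var())      # divide by T * s^2
    freq = 2.0 * np.pi * np.arange(1, q + 1) / n
    return freq, c


def band_halfwidth(n, coef=1.36):
    """Half-width coef/sqrt(n/2 - 1), as in the comment of Table 15.12."""
    return coef / np.sqrt(n / 2.0 - 1.0)


def max_deviation(c):
    """Largest absolute distance between the plot and the diagonal."""
    diag = np.arange(1, c.size + 1) / c.size
    return float(np.max(np.abs(c - diag)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 512
    noise = rng.standard_normal(n)
    ar1 = np.zeros(n)
    for t in range(1, n):                  # AR(1) with coefficient 0.9
        ar1[t] = 0.9 * ar1[t - 1] + noise[t]
    for label, series in (("white noise", noise), ("AR(1) 0.9", ar1)):
        _, c = cumulative_periodogram(series)
        print(label, "max deviation", round(max_deviation(c), 3),
              "99% half-width", round(band_halfwidth(n, 1.63), 3))
```

A series with mostly low frequency information, such as the simulated AR(1) series, should show a deviation far larger than the band half-width, while the deviation for simulated white noise should usually stay inside it.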

Table 15.13 Testing series for white noise using the Normalized Cumulative Periodogram

b34sexec options ginclude('gas.b34'); b34srun;
/$
/$ Job tests c_period Command
/$
b34sexec matrix;
call loaddata;
call load(cperiod);

idrop=0;
call cperiod(gasout,'gasout',c_period,c_p_freq,idrop);
call dodos('rename n_c_period.wmf fig15.7.wmf' :);
x=rn(gasout);
call cperiod(x, 'ran()',c_period,c_p_freq, idrop);
call dodos('rename n_c_period.wmf fig15.8.wmf' :);
b34srun$

Figure 15.7 for the gasout series clearly shows a preponderance of low frequency information, making it possible to conclude at the 99% level that the series is not white noise. Figure 15.8 shows the plot for a white noise series.

[Figure: plot titled 'Cumulative Periodogram of gasout' showing C_PERIOD, DIAG and the UPPER and LOWER 99%, 95% and 75% bands against C_P_FREQ]

Figure 15.7 Normalized Cumulative Periodogram for gasout series

[Figure: plot titled 'Cumulative Periodogram of ran()' showing C_PERIOD, DIAG and the UPPER and LOWER 99%, 95% and 75% bands against C_P_FREQ]

Figure 15.8 Normalized Cumulative Periodogram for white noise series

15.6 Forecasting using spectral methods

Table 15.14 lists a matrix command subroutine that uses spectral methods to construct forecasts. Unlike a RATS command of the same name, no attempt is made to smooth the FFT.

Table 15.14 A Subroutine to Calculate Forecasts using the FFT of a series

subroutine specfore(data,startf,numf,detrend,forecast,obs,error,actual);
/;
/; Forecast with spectral methods.
/;
/; Based on code developed by Michael Hunstad using
/; regression methods to partially reverse-engineer the RATS
/; specfore command. An improved version of the
/; Hunstad Matlab code is in c:\b34slm\mfiles as
/; specfore.m
/;
/; This implementation by Houston H. Stokes uses
/; a FFT to save space and reduce CPU use. Added capability is
/; provided. Unlike the RATS implementation of this technique,
/; the current implementation does not smooth the FFT
/;
/; Rats smooths the data
/;
/; data     => series to forecast. # of obs = n
/; startf   => last period before start forecasting
/; numf     => number of forecasts
/; detrend  => Detrend the data if gt 0. =2 print trend OLS Model
/; forecast => Forecast
/; obs      => Observation number associated with forecast
/; error    => Defined if startf lt n
/; actual   => defined if startf lt n
/;
/; Routine developed 8 December 2009 by Houston H. Stokes
/;
nobs=norows(data);
error=missing();
actual=missing();
if(nobs .lt.startf)then;
call epprint('ERROR: In call specfore startf not le nobs of data':);
call epprint(' nobs of data was ',nobs:);
call epprint(' startf was ',startf:);
go to endit;
endif;
series=data(integers(1,startf));
seriesm=mean(series);
series=series-seriesm;
if(detrend.ne.0)then;
trend=dfloat(integers(startf));
if(detrend.eq.1)call olsq(series trend :qr);
if(detrend.eq.2)call olsq(series trend :qr :print);
if(klass(series).eq.5)series=series-afam(%yhat);
if(klass(series).eq.1)series=series-vfam(%yhat);
tcoef=%coef(1);
tmean=%coef(2);
endif;
obs=dfloat(integers(startf+1,startf+numf));
call spectral(series,sinx,cosx,px,sx,freq);
beta=vfam(catrow(cosx,sinx));

/; zero out cosx for highest freq
ijunk=norows(cosx);
beta(ijunk,1)=0.0;
forecast=vector(numf:);
tt=dfloat(integers(0,numf-1))+dfloat(startf);
do i=1,numf;
c1=cos(afam(freq)*afam(tt(i)));
s1=sin(afam(freq)*afam(tt(i)));
cc=vfam(catrow(c1,s1));
if(detrend.eq.0)forecast(i)=transpose(beta)*cc + sfam(seriesm);
if(detrend.ne.0)forecast(i)=transpose(beta)*cc + sfam(seriesm)
                +tmean + (dfloat(startf+i)*tcoef);
enddo;
if(startf.lt.nobs)then;
iend=dmin1(startf+numf,nobs);
nn2 =integers(iend-startf);
actual = data(integers(startf+1,iend));
if(abs(klass(actual)).eq.1)error = actual - forecast(nn2);
if(abs(klass(actual)).eq.5)error = actual - afam(forecast(nn2));
endif;
endit continue;
return;
end;

Table 15.15 shows the use of the specfore command in B34S and RATS. Note the effect of the trend correction. The MATLAB implementation uses OLS in place of the FFT and does not provide the option of trend correction. The advantage of using OLS to obtain the sine and cosine coefficients is transparency. The disadvantage is the added computing cost in both CPU time and space.
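The trade-off between the OLS and FFT routes to the sine and cosine coefficients can be illustrated with a short Python sketch. It is not the specfore code or the MATLAB code. It assumes regressors cos(2*pi*k*t/n) and sin(2*pi*k*t/n) for t = 0,...,n-1, and the function names fft_coefficients and ols_coefficients are hypothetical. Under these assumptions the regressors are orthogonal over the sample, so OLS on a constant and the first few harmonics gives the same coefficients as scaling the corresponding FFT terms.

```python
import numpy as np


def fft_coefficients(x, nharm):
    """Mean, cosine and sine coefficients of the first nharm harmonics via FFT."""
    n = x.size
    X = np.fft.fft(x)
    k = np.arange(1, nharm + 1)
    # X_k = sum x_t (cos(w_k t) - i sin(w_k t)), hence the sign on the sine term
    return X[0].real / n, 2.0 * X[k].real / n, -2.0 * X[k].imag / n


def ols_coefficients(x, nharm):
    """Same coefficients from a regression on a constant, cosines and sines."""
    n = x.size
    t = np.arange(n)
    cols = [np.ones(n)]
    for k in range(1, nharm + 1):
        w = 2.0 * np.pi * k / n
        cols += [np.cos(w * t), np.sin(w * t)]
    Z = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(Z, x, rcond=None)
    return beta[0], beta[1::2], beta[2::2]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 60
    t = np.arange(n)
    # Simulated series, for illustration only
    x = (3.0 + 2.0 * np.cos(2 * np.pi * 2 * t / n)
         - 1.5 * np.sin(2 * np.pi * 5 * t / n)
         + 0.1 * rng.standard_normal(n))
    m1, a1, b1 = fft_coefficients(x, 6)
    m2, a2, b2 = ols_coefficients(x, 6)
    print("max |cos diff|", np.max(np.abs(a1 - a2)))
    print("max |sin diff|", np.max(np.abs(b1 - b2)))
```

The regression requires building and solving with an n by (2*nharm+1) matrix, while the FFT obtains all harmonics in one pass, which is the computing cost difference noted above.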

Table 15.15 Using the FFT to Forecast

%b34slet domatlab = 0; %b34slet dorats = 0; %b34slet dob34s1 = 1;

%b34slet file1="'_b34sdat.dat'"$
%b34slet file2="'b34sdata.m'"$
b34sexec options ginclude('b34sdata.mac') member(lydiapnm); b34srun;
/$ user places RATS commands between
/$ PGMCARDS$
/$ note: user RATS commands here
/$ B34SRETURN$
/$
%b34sif(&dob34s1.ne.0)%then;
b34sexec matrix;
call echooff;
call loaddata;
call load(specfore);
call print(' ':);
call print('Forecast of sales and Advertising':);
nfor=30;
base=60;
call specfore(sales, base,nfor,0,fsales1,obs,error1,actual1);
call specfore(sales, base,nfor,2,fsales2,obs,error2,actual2);
call specfore(advertis,base,nfor,0,fadd1,obs,error3,actual3);
call specfore(advertis,base,nfor,2,fadd2,obs,error4,actual4);
call print(' ':);
call print('With out Trend Correction':);
call tabulate(obs,actual1,fsales1,error1,actual3,fadd1,error3);
call print('With Trend Correction':);
call tabulate(obs,actual2,fsales2,error2,actual4,fadd2,error4);
nn=integers(norows(actual1));
obs = obs(nn);
fsales1=fsales1(nn);
fsales2=fsales2(nn);
nn=integers(norows(actual3));
fadd1 =fadd1(nn);
fadd2 =fadd2(nn);
call tabulate(obs actual1,fsales1 fsales2 );
call graph(obs actual1,fsales1 fsales2 :plottype xyplot
           :heading 'Sales Forecast out of sample # 2 with trend'
           :nolabel :nocontact :pgborder);
call graph(obs actual3 fadd1 fadd2 :plottype xyplot
           :heading 'Advertis Forecast out of sample notrend # 2 with trend'
           :nolabel :nocontact :pgborder);
cc1=ccf(fsales1,actual1);
cc2=ccf(fsales2,actual2);
cc3=ccf(fadd1,actual3);
cc4=ccf(fadd2,actual4);
ss1=sumsq(error1);
ss2=sumsq(error2);
ss3=sumsq(error3);
ss4=sumsq(error4);
call print(' ':);
call print('Out of sample sales no trend sumsq ',ss1:);
call print('Out of sample sales with trend sumsq ',ss2:);
call print('Out of sample advertis no trend sumsq ',ss3:);
call print('Out of sample sales with trend sumsq ',ss4:);
call print('Out of sample sales forecast no trend correlation ',cc1:);
call print('Out of sample sales forecast with trend correlation ',cc2:);
call print('Out of sample adver forecast no trend correlation ',cc3:);
call print('Out of sample adver forecast with trend correlation ',cc4:);
b34srun;

%b34sendif;

%b34sif(&dorats.ne.0)%then; B34SEXEC OPTIONS OPEN('rats.dat') UNIT(28) DISP=UNKNOWN$ B34SRUN$ B34SEXEC OPTIONS OPEN('rats.in') UNIT(29) DISP=UNKNOWN$ B34SRUN$ B34SEXEC OPTIONS CLEAN(28)$ B34SRUN$ B34SEXEC OPTIONS CLEAN(29)$ B34SRUN$

B34SEXEC PGMCALL$ RATS PASSASTS PCOMMENTS('* ', '* Data passed from B34S(r) system to RATS', '* ') $ PGMCARDS$ * * see section 7.5 in RATS manual * Source(NOECHO) d:\R\specfore.src * SOURCE(ECHO) d:\R\specfore.src

SET SERIES = sales COMPUTE istart = 60 COMPUTE iend = 90

*
* @SPECFORE( options ) series start end forecasts
* Computes forecasts using spectral techniques
*
* Parameters:
* series : (input) Series to be forecast
* start end : Range of entries to forecast
* forecasts : (output) Series for computed forecasts
*

@SPECFORE(DIFFS=0,SDIFFS=0,TRANS=NONE,CONSTANT) SERIES ISTART IEND FORE

SET ERROR = SERIES - FORE PRINT istart iend SERIES FORE ERROR

B34SRETURN$ B34SRUN $

B34SEXEC OPTIONS CLOSE(28)$ B34SRUN$ B34SEXEC OPTIONS CLOSE(29)$ B34SRUN$ B34SEXEC OPTIONS /$ dodos('start /w /r rats386 rats.in rats.out ') dodos('start /w /r rats32s rats.in /run') dounix('rats rats.in rats.out')$ B34SRUN$ B34SEXEC OPTIONS NPAGEOUT WRITEOUT('Output from RATS',' ',' ') COPYFOUT('rats.out') dodos('ERASE rats.in','ERASE rats.out','ERASE rats.dat') dounix('rm rats.in','rm rats.out','rm rats.dat') $ B34SRUN$ %b34sendif;

%b34sif(&domatlab.ne.0)%then;
/$
/$ Builds a MATLAB input file for MATLAB version 6.
/$ Changes made 2 February 2002
/$
/$ Since MATLAB is case sensitive, use lower case for all variable
/$ references that are from b34s. MATLAB uses upper case for a matrix
/$ variable
/$
/$ This job assumes user has already loaded data in B34S
/$ The file name for file1 is hard coded in the matlab m file (file2)
/$
/$ User changes this to default matlab file directory
/$
/$
/$ Job runs on linux matlab and windows matlab
/$
/$ When job ends, output will be seen in b34s.out file
/$
/$ User loads data here if it has not occurred already
/$

b34sexec options open(%b34seval(&file1)) unit(28) disp=unknown$ b34seend$
b34sexec options clean(28)$ b34seend$
b34sexec options open(%b34seval(&file2)) unit(29) disp=unknown$ b34seend$
b34sexec options clean(29)$ b34seend$
b34sexec pgmcall$
matlab lowercase outfile(%b34seval(&file1))$
pgmcards$
% User MATLAB commands here such as plot(varname)

% x1=test(sales,60,2,1,1); x1=specfore(sales,60,10,1,1); % quit is needed since have to get out of matlab automatically % Comment to stay in matlab and see plot b34sreturn$ b34seend$ b34sexec options close(28)$ b34srun$ b34sexec options close(29)$ b34srun$ b34sexec options dodos('start /w /r matlab /r b34sdata /logfile matlab.out') dounix('matlab < b34sdata.m > matlab.out'); b34srun; b34sexec options writeout(' ', 'Output from Matlab ', ' '); b34srun; b34sexec options copyfout('matlab.out'); b34srun; b34sexec options dodos('erase matlab.out') dounix('rm matlab.out'); b34srun; %b34sendif;

When run, this job produces the following output.

Variable # Cases Mean Std Deviation Variance Maximum Minimum

SALES 1 78 1278.692308 196.6126330 38656.52747 1728.000000 772.0000000 ADVERTIS 2 78 619.3974359 433.9763696 188335.4893 1388.000000 39.00000000 CONSTANT 3 78 1.000000000 0.000000000 0.000000000 1.000000000 1.000000000

Number of observations in data file 78 Current missing variable code 1.000000000000000E+31 Data begins on (D:M:Y) 1: 1:1954 ends 1: 6:1960. Frequency is 12

B34S(r) Matrix Command. d/m/y 8/12/09. h:m:s 16:17:39.

=> CALL ECHOOFF$

Forecast of sales and Advertising

With out Trend Correction

 Obs  OBS     ACTUAL1  FSALES1  ERROR1    ACTUAL3  FADD1   ERROR3
   1  61.00   1052.    1297.    -244.8    838.0    1245.   -406.6
   2  62.00   1102.    1316.    -214.2    994.0    1385.   -391.4
   3  63.00   1355.    1730.    -374.8    1020.    946.6    73.43
   4  64.00   1323.    1537.    -214.2    865.0    954.4   -89.43
   5  65.00   1296.    1326.    -29.78    819.0    51.57    767.4
   6  66.00   1127.    1262.    -135.2    83.00    74.43    8.567
   7  67.00   1170.    1171.    -0.7833   56.00    36.57    19.43
   8  68.00   1059.    1477.    -418.2    224.0    502.4   -278.4
   9  69.00   1116.    1633.    -516.8    881.0    1135.   -253.6
  10  70.00   1214.    1544.    -330.2    436.0    952.4   -516.4
  11  71.00   966.0    1461.    -494.8    160.0    665.6   -505.6
  12  72.00   1089.    1085.     3.783    68.00    163.4   -95.43
  13  73.00   814.0    1173.    -358.8    749.0    978.6   -229.6
  14  74.00   1087.    1404.    -317.2    857.0    1309.   -452.4
  15  75.00   1180.    1621.    -440.8    898.0    1353.   -454.6
  16  76.00   1167.    1506.    -339.2    705.0    1106.   -401.4
  17  77.00   1210.    1523.    -312.8    489.0    501.6   -12.57
  18  78.00   1092.    1339.    -247.2    59.00    158.4   -99.43

With Trend Correction

 Obs  OBS     ACTUAL2  FSALES2  ERROR2    ACTUAL4  FADD2    ERROR4
   1  61.00   1052.    1028.     24.21    838.0    948.2   -110.2
   2  62.00   1102.    1043.     59.30    994.0    1084.   -90.04
   3  63.00   1355.    1461.    -105.8    1020.    650.2    369.8
   4  64.00   1323.    1264.     59.30    865.0    653.0    212.0
   5  65.00   1296.    1057.     239.2    819.0   -244.8    1064.
   6  66.00   1127.    988.7     138.3    83.00   -227.0    310.0
   7  67.00   1170.    901.8     268.2    56.00   -259.8    315.8
   8  68.00   1059.    1204.    -144.7    224.0    201.0    22.96
   9  69.00   1116.    1364.    -247.8    881.0    838.2    42.84
  10  70.00   1214.    1271.    -56.70    436.0    651.0   -215.0
  11  71.00   966.0    1192.    -225.8    160.0    369.2   -209.2
  12  72.00   1089.    811.7     277.3    68.00   -138.0    206.0
  13  73.00   814.0    903.8    -89.79    749.0    682.2    66.84
  14  74.00   1087.    1131.    -43.70    857.0    1008.   -151.0
  15  75.00   1180.    1352.    -171.8    898.0    1056.   -158.2
  16  76.00   1167.    1233.    -65.70    705.0    805.0   -100.0
  17  77.00   1210.    1254.    -43.79    489.0    205.2    283.8
  18  78.00   1092.    1066.     26.30    59.00   -143.0    202.0
  19  79.00   NA       979.8     NA       NA      -271.8    NA
  20  80.00   NA       986.7     NA       NA       85.04    NA
  21  81.00   NA       1152.     NA       NA       729.2    NA
  22  82.00   NA       1283.     NA       NA       525.0    NA
  23  83.00   NA       954.8     NA       NA      -193.8    NA
  24  84.00   NA       777.7     NA       NA      -189.0    NA
  25  85.00   NA       974.8     NA       NA       668.2    NA
  26  86.00   NA       1086.     NA       NA       916.0    NA
  27  87.00   NA       1393.     NA       NA       893.2    NA
  28  88.00   NA       1442.     NA       NA       670.0    NA
  29  89.00   NA       1104.     NA       NA       293.2    NA
  30  90.00   NA       1018.     NA       NA      -206.0    NA

 Obs  OBS     ACTUAL1  FSALES1  FSALES2
   1  61.00   1052.    1297.    1028.
   2  62.00   1102.    1316.    1043.
   3  63.00   1355.    1730.    1461.
   4  64.00   1323.    1537.    1264.
   5  65.00   1296.    1326.    1057.
   6  66.00   1127.    1262.    988.7
   7  67.00   1170.    1171.    901.8
   8  68.00   1059.    1477.    1204.
   9  69.00   1116.    1633.    1364.
  10  70.00   1214.    1544.    1271.
  11  71.00   966.0    1461.    1192.
  12  72.00   1089.    1085.    811.7
  13  73.00   814.0    1173.    903.8
  14  74.00   1087.    1404.    1131.
  15  75.00   1180.    1621.    1352.
  16  76.00   1167.    1506.    1233.
  17  77.00   1210.    1523.    1254.
  18  78.00   1092.    1339.    1066.

Out of sample sales no trend sumsq                    1804827.578333335
Out of sample sales with trend sumsq                  426933.1095972446
Out of sample advertis no trend sumsq                 2229763.246666659
Out of sample sales with trend sumsq                  1847970.135394790
Out of sample sales forecast no trend correlation     0.5039421385465304
Out of sample sales forecast with trend correlation   0.5023270654992271
Out of sample adver forecast no trend correlation     -9.283852553158210E-02
Out of sample adver forecast with trend correlation   -9.305692521339150E-02

B34S Matrix Command Ending. Last Command reached.

Space available in allocator 8856637, peak space used 9740 Number variables used 100, peak number used 108 Number temp variables used 2228, # user temp clean 0

B34S 8.11D (D:M:Y) 8/12/09 (H:M:S) 16:17:43 PGMCALL STEP Lydia Pinkham Monthly Data PAGE 2

Output from RATS

*
* Data passed from B34S(r) system to RATS
*
CALENDAR 1954 1 12
ALLOCATE 78
OPEN DATA rats.dat
DATA(FORMAT=FREE,ORG=OBS, $
MISSING= 0.1000000000000000E+32 ) / $
SALES $
ADVERTIS $
CONSTANT
SET TREND = T
TABLE
Series      Obs   Mean            Std Error      Minimum        Maximum
SALES        78   1278.69230769   196.61263304   772.00000000   1728.00000000
ADVERTIS     78   619.39743590    433.97636957   39.00000000    1388.00000000
TREND        78   39.50000000     22.66053839    1.00000000     78.00000000

* * see section 7.5 in RATS manual * Source(NOECHO) d:\R\specfore.src * SOURCE(ECHO) d:\R\specfore.src SET SERIES = sales COMPUTE istart = 60 COMPUTE iend = 90 * * @SPECFORE( options ) series start end forecasts * Computes forecasts using spectral techniques * * Parameters: * series : (input) Series to be forecast * start end : Range of entries to forecast * forecasts : (output) Series for computed forecasts * @SPECFORE(DIFFS=0,SDIFFS=0,TRANS=NONE,CONSTANT) SERIES ISTART IEND FORE SET ERROR = SERIES - FORE PRINT istart iend SERIES FORE ERROR

ENTRY     SERIES   FORE             ERROR
1958:12   1072     1240.460004817   -168.4600048169
1959:01   1052     1200.763596247   -148.7635962471
1959:02   1102     1235.408547587   -133.4085475874
1959:03   1355     1352.704220311      2.2957796894
1959:04   1323     1358.394344689    -35.3943446892
1959:05   1296     1320.524022842    -24.5240228422
1959:06   1127     1280.304763050   -153.3047630499
1959:07   1170     1287.241992611   -117.2419926114
1959:08   1059     1305.249870237   -246.2498702365
1959:09   1116     1326.211286902   -210.2112869018
1959:10   1214     1360.412853130   -146.4128531300
1959:11    966     1300.166743214   -334.1667432144
1959:12   1089     1302.188498103   -213.1884981030
1960:01    814     1323.426262875   -509.4262628748
1960:02   1087     1330.313780233   -243.3137802326
1960:03   1180     1354.704626186   -174.7046261861
1960:04   1167     1320.833749223   -153.8337492228
1960:05   1210     1303.224404989    -93.2244049893
1960:06   1092     1310.609988566   -218.6099885658
1960:07   NA       1330.479685398   NA
1960:08   NA       1337.848002486   NA
1960:09   NA       1318.335043563   NA
1960:10   NA       1317.479168750   NA
1960:11   NA       1317.880136638   NA
1960:12   NA       1316.633819117   NA
1961:01   NA       1335.905858936   NA
1961:02   NA       1328.465844195   NA
1961:03   NA       1323.632170191   NA
1961:04   NA       1333.070850317   NA
1961:05   NA       1322.175915526   NA
1961:06   NA       1344.757186664   NA

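Note the out-of-sample correlations of .5039 and .5023 for the sales forecasts made without and with a trend correction. Although the correlation is almost unchanged, the trend correction reduces the out-of-sample sum of squared errors for sales from about 1.80 million to about 0.43 million. The advertis forecasts, on the other hand, were not good (correlations of about -.093), presumably due to less structure in the underlying series. Experimentation with the spectral forecasting approach suggests that it is a valuable tool but cannot be used blindly. Unlike an ARIMA forecast, which eventually converges to the expected value of the series, a spectral forecast does not converge, because it extrapolates sine and cosine terms that continue to oscillate.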
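The contrast between the two forecast types can be sketched in Python. The fragment is an illustration under simplifying assumptions, not the specfore or RATS procedure: the spectral forecast is taken to be the unsmoothed, undetrended Fourier sum of the sample evaluated at integer times beyond the sample, and the comparison model is an AR(1) with a known coefficient. The names harmonic_forecast and ar1_forecast are hypothetical. Under these assumptions the Fourier sum repeats the sample with period n, while the AR(1) forecast decays to the mean.

```python
import numpy as np


def harmonic_forecast(x, steps):
    """Evaluate the sample's Fourier sum at times n, n+1, ..., n+steps-1.

    Assumption: no smoothing and no trend correction are applied.
    """
    n = x.size
    X = np.fft.fft(x)
    k = np.fft.fftfreq(n, d=1.0 / n)       # integer harmonics, signed
    t = n + np.arange(steps)
    phase = np.exp(2j * np.pi * np.outer(t, k) / n)
    return (phase @ X).real / n


def ar1_forecast(last, mean, phi, steps):
    """h-step AR(1) forecasts: mean + phi**h * (last - mean)."""
    h = np.arange(1, steps + 1)
    return mean + phi ** h * (last - mean)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(48).cumsum()   # simulated series, illustration only
    fc = harmonic_forecast(x, 96)
    print("replays sample:", np.allclose(fc, np.tile(x, 2)))
    ar = ar1_forecast(x[-1], x.mean(), 0.9, 200)
    print("AR(1) distance from mean at h=200:", abs(ar[-1] - x.mean()))
```

The exact replay of the sample in this sketch is a consequence of the stated simplifications. A routine that zeroes a coefficient or applies a trend correction, as specfore can, would not replay the sample exactly, but its sinusoidal component would still oscillate indefinitely.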

15.7 Conclusion

After a brief survey of spectral analysis theory, the periodograms of two simple AR(1) models were graphed. Using the gas furnace data, the OLS approach to spectral analysis was contrasted with the frequency domain approach. The matrix command implementation was shown to be an easily customizable way to use spectral tools to explore the dynamics of a series. In Stokes (200x) the use of the fast Fourier transform to filter series is illustrated. Some simple examples using the FILTER subroutine are presented in Chapter 16 of this book. Wavelet analysis, which provides a way to study a series by both time and frequency, was discussed, and a number of examples that analyze the Nino water temperature data were shown. At issue is the appropriate period to use for the analysis. If noise in the data is assumed to be of short duration, wavelet analysis makes it possible to remove this short duration information so that analysis can proceed with a smoothed series that more accurately reflects the underlying process. Finally, forecasting a series using FFT methods was illustrated.