
Section B4: Unit Roots and Cointegration Analysis

B4.1. Cointegrated Series

Some pairs of economic time series (e.g., wholesale and retail prices of a single commodity) may be expected to follow similar patterns of change over time. Even when short-term conditions cause such series to diverge, economic and/or policy dynamics will eventually force them back into equilibrium. In the economics literature, such pairs of series are called cointegrated series.

To analyze cointegrated time series, we conceptualize them as stochastic processes, i.e., processes subject to random variation, and define properties of these processes. The notation in this section differs from that of Sections 1 through 3.

B4.2. Definitions of Short and Long Memory Stochastic Processes

We begin by defining some properties of stochastic processes. The stochastic process {ε_t} is stationary (or strictly stationary) if, for every collection of time indices t_1, ..., t_n, the joint distribution of (ε_{t_1}, ..., ε_{t_n}) is the same as the joint distribution of (ε_{t_1+h}, ..., ε_{t_n+h}) for all integers h ≥ 1. It follows that the ε_t are identically distributed.

Because strict stationarity is defined in terms of the joint distributions of subsets of the stochastic process, the strict stationarity of an empirical data series is often difficult to establish. Many data series, however, are clearly nonstationary, e.g., series that display trends or cycles.

Covariance stationarity is defined in terms of the moments of the stochastic process and is thus easier to establish for an empirical data series. The stochastic process {ε_t} is covariance stationary if, for some constant c,

E[ε_t] = c, Var[ε_t] = σ² (constant variance), and Cov[ε_t, ε_{t+h}] = f(h) for h ≥ 1.   (B4.2.1)

If a strictly stationary process has a finite second moment, it is covariance stationary. A covariance stationary series, however, need not be strictly stationary.

A covariance stationary process {ε_t} is a white noise process if

E[ε_t] = 0, E[ε_t ε_{t−h}] = 0 for h ≠ 0 (uncorrelated), and Var[ε_t] = σ² (constant variance).   (B4.2.2)

As t → ∞, {ε_t} will frequently cross the horizontal axis and frequently return to values near its mean of 0. White noise is used to model unexplained short-term movements in a data series.

Let {ε_t} be a white noise process, and let

x_t = Σ_{j=0}^∞ a_j ε_{t−j}.   (B4.2.3)

If a_j → 0 as j → ∞, then {x_t} is a short memory process. When {x_t} is a short memory process with constant mean and finite variance, we say that {x_t} is integrated of order 0 and write {x_t} ~ I(0). White noise, for example, is a short memory process. In a typical empirical time series generated by a short memory process, autocorrelation is present but limited. When a_j ↛ 0 as j → ∞, {x_t} is a long memory process.

Analyzing Short and Long Memory Stochastic Processes

If {y_t} is a long memory process, an old shock to the system (ε_{t−h}, where h is large) still has an effect on y_t. If, however, {y_t} is a short memory process, the effect of ε_{t−h} on y_t diminishes quickly as h increases, i.e., the process “forgets” old shocks. Although short memory processes are usually serially correlated, the covariance of the values between two time periods diminishes as the distance between the time periods grows (Cov[y_{t−h}, y_t] → 0 as h → ∞).
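These properties are easy to see in simulation. The sketch below is an illustrative example of ours (not code from the original analysis): it generates a white noise series and a random walk driven by the same shocks, then compares lag-5 sample autocorrelations. The short memory series “forgets” old shocks almost immediately; the long memory series does not.

```python
import random

def lag_autocorr(series, h):
    """Sample correlation between series[t] and series[t - h]."""
    n = len(series)
    a, b = series[h:], series[:n - h]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

random.seed(12345)
T = 2000
eps = [random.gauss(0.0, 1.0) for _ in range(T)]   # white noise: I(0)

# Random walk y_t = y_{t-1} + eps_t, built from the same shocks:
# a long memory (unit root) process.
walk = []
level = 0.0
for e in eps:
    level += e
    walk.append(level)

ac_noise = lag_autocorr(eps, 5)   # near 0: old shocks are forgotten
ac_walk = lag_autocorr(walk, 5)   # near 1: old shocks persist
```

Only the standard library is used here; in practice a statistics package would compute these autocorrelations directly.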

Differencing a Long Memory Process

In many cases, differencing a long memory process produces a short memory process. Let

y_t = y_{t−1} + ε_t,   (B4.2.4)

where ε_t is a white noise process. Then, although {y_t} is a long memory process, the differenced process

Δy_t = y_t − y_{t−1} = ε_t   (B4.2.5)

is a short memory process. When {y_t} is a long memory process and {Δy_t} is a short memory process, we say that {y_t} is integrated of order 1 and write {y_t} ~ I(1).

Cointegration

Before presenting a rigorous definition of cointegration, we state some facts about linear combinations of I(0) and I(1) processes. Let a and b be constants.

a) If {x_t} ~ I(0), then a + b·x_t is I(0).

b) If {x_t} ~ I(1), then a + b·x_t is I(1).

c) A linear combination of short memory processes is a short memory process, i.e., if {x_t} ~ I(0) and {y_t} ~ I(0), then a·x_t + b·y_t is I(0).

𝑡𝑡 𝑡𝑡 𝑡𝑡 𝑡𝑡 d) A𝑥𝑥 linear𝐼𝐼 combination𝑦𝑦 of𝐼𝐼 a short memory𝑎𝑎𝑥𝑥 𝑏𝑏 𝑦𝑦process𝐼𝐼 and a long memory process is a long memory process, i.e., if { }~ (0), and { }~ (1), then + is (1).

Definition of Cointegration

A linear combination of two long memory processes is a long memory process unless the two processes “share” the same long-term memory. Suppose {x_t} ~ I(1) and {y_t} ~ I(1). If there exist constants m, a, and b such that

1. m + a·x_t + b·y_t is I(0), and
2. E[m + a·x_t + b·y_t] = 0,

then {x_t} and {y_t} are cointegrated.

Intuitively, because cointegrated long memory series share the same long-term memory, their long-term memories cancel out in the linear combination, leaving a short memory series. In the literature, the term “cointegrated” is commonly used even when the relationship between the two series is not strictly linear (e.g., it may be log-linear).
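This cancellation can be checked numerically. In the hypothetical simulation below (the series, parameter values, and names are our own), x_t is a random walk and y_t = 2x_t plus white noise, so the two series share one stochastic trend and the true cointegrating combination is y_t − 2x_t.

```python
import random
from statistics import pvariance

random.seed(7)
T = 1000
x, y = [], []
trend = 0.0                      # the shared long-term memory (a random walk)
for _ in range(T):
    trend += random.gauss(0.0, 1.0)
    x.append(trend)
    y.append(2.0 * trend + random.gauss(0.0, 1.0))  # cointegrated with x, beta = 2

# The long-term memories cancel in the linear combination y - 2x.
z = [yi - 2.0 * xi for xi, yi in zip(x, y)]

var_x = pvariance(x)   # large: the walk wanders far from its mean
var_z = pvariance(z)   # near 1: only the stationary noise remains
```

Both x_t and y_t wander without bound, but z_t = y_t − 2x_t behaves like white noise, exactly the short memory residual the definition describes.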

Problem of Spurious Regression

In time series regression, we assume that one time series can be expressed as a linear combination of other time series, modulo an error term. When the time series have long memories, we are assuming cointegration.

Traditional regression diagnostics can be deceptive in the presence of long memory series. In particular, we may see high values of R² and low standard errors, leading to inflated t-statistics. There is a high probability of regression diagnostics indicating a relationship between two independent, randomly generated I(1) series (Granger and Newbold 1974). Differencing the series may eliminate this problem but allows analysis only of short-run changes. Use of additional statistical tests and diagnostics enables us to analyze long-run relationships between cointegrated series.

B4.3. Testing for Unit Roots

Definition of a Unit Root Process in the Linear Case

If

y_t = α + ρy_{t−1} + ε_t,   (B4.3.1)

where E(ε_t | y_{t−1}, y_{t−2}, ..., y_0) = 0, then {y_t} has a unit root if and only if |ρ| = 1.

When ρ = 1 and α = 0, {y_t} is a random walk without drift. When ρ = 1 and α ≠ 0, {y_t} is a random walk with drift (Figure 5). Unit root processes are long memory processes.

Testing H_0: |ρ| = 1

Observe that

• When ρ < 0, {y_t} is negatively autocorrelated, which is rare in economic time series.

• When |ρ| > 1, {y_t} is explosive, diverging in the positive or negative direction.

So we are usually interested in testing

H_0: ρ = 1 vs. H_1: 0 < ρ < 1.   (B4.3.2)

Dickey-Fuller Test for a Unit Root

We set θ = ρ − 1. Then

Δy_t = α + ρy_{t−1} − y_{t−1} + ε_t   (B4.3.3)
     = α + θy_{t−1} + ε_t.

Now we can test

H_0: θ = 0 vs. H_1: θ < 0.   (B4.3.4)

We can compute the usual t-statistic for the hypothesis test. However, under H_0, the t-statistic is not asymptotically normal. Dickey and Fuller (1979) tabulated the critical values for the asymptotic distribution, which is known as the Dickey-Fuller (DF) distribution.

Asymptotic Critical Values for the Dickey-Fuller (DF) Unit Root Test

Significance Level 1% 2.5% 5% 10%

Critical Value -3.43 -3.12 -2.86 -2.57

We reject H_0: θ = 0 against the alternative H_1: θ < 0 if the t-statistic t_θ̂ < c, where c is one of the critical values above.

If θ̂ is strongly negative, we have evidence for rejecting the hypothesis that {y_t} ~ I(1) in favor of the hypothesis that {y_t} ~ I(0). Otherwise, we suspect that {y_t} has a unit root, and spurious regression is a concern.

Application of the asymptotic critical values for the DF test requires a large sample, i.e., a long time series. Critical values for small samples have also been tabulated and are available in statistical software packages such as SAS and R.
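The DF regression is simple enough to carry out by hand. The sketch below (an illustration of ours with simulated data, not EIA code) estimates Δy_t = α + θy_{t−1} + ε_t by OLS, forms the t-statistic for θ, and compares it with the tabulated critical values (e.g., −2.86 at the 5% level).

```python
import random

def df_t_stat(y):
    """Dickey-Fuller regression Delta y_t = alpha + theta * y_{t-1} + eps_t.
    Returns the t-statistic for theta (compare with DF critical values, not N(0,1))."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    ylag = y[:-1]
    n = len(dy)
    mx = sum(ylag) / n
    my = sum(dy) / n
    sxx = sum((x - mx) ** 2 for x in ylag)
    sxy = sum((x - mx) * (d - my) for x, d in zip(ylag, dy))
    theta = sxy / sxx
    alpha = my - theta * mx
    rss = sum((d - alpha - theta * x) ** 2 for x, d in zip(ylag, dy))
    se = (rss / (n - 2) / sxx) ** 0.5
    return theta / se

random.seed(1)
T = 500
# Unit root series: y_t = y_{t-1} + eps_t.
rw = [0.0]
for _ in range(T):
    rw.append(rw[-1] + random.gauss(0.0, 1.0))

# Stationary AR(1) series: y_t = 0.5 * y_{t-1} + eps_t.
ar = [0.0]
for _ in range(T):
    ar.append(0.5 * ar[-1] + random.gauss(0.0, 1.0))

t_rw = df_t_stat(rw)   # usually above -2.86: cannot reject the unit root
t_ar = df_t_stat(ar)   # strongly negative: reject the unit root
```

In production work one would use a packaged implementation (e.g., the ADF routines in SAS or R mentioned above), which also supplies small-sample critical values.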

The power of a hypothesis test is the probability of not committing a Type II error, where a Type II error is failing to reject a false null hypothesis. Unit root tests have been criticized for their low power: there is a relatively high probability that these tests will indicate a unit root in a series that has none. In addition to the DF test, other tests for autocorrelation and unit roots are discussed in the literature (e.g., Ljung-Box, Durbin-Watson). The DF test is popular because it is simple and robust.

The Cointegrating Parameter

Suppose we have determined that {y_t} and {x_t} both have unit roots.

• If {y_t} and {x_t} are not cointegrated, regression of one series on the other may be “spurious.”

• If {y_t} and {x_t} are cointegrated, we are interested in the cointegrating parameter β such that y_t − βx_t is an I(0) process.

When we know or hypothesize the cointegrating parameter β, we can set z_t = y_t − βx_t and perform the standard DF test on the series {z_t}. If we find evidence that {z_t} has a unit root, we may conclude that {y_t} and {x_t} are not cointegrated. (In practice, we may also consider that our hypothesized value of β may be incorrect.) The null hypothesis in the DF test is that {z_t} has a unit root, i.e., that {y_t} and {x_t} are not cointegrated.

When {y_t} and {x_t} are cointegrated, the OLS estimator β̂ from the regression

y_t = α̂ + β̂x_t + ε_t   (B4.3.5)

is a consistent estimator of β. However, under the null hypothesis that z_t = y_t − βx_t has a unit root, we must run a spurious regression to obtain β̂.

To perform the DF test with an estimated value of β, we set

û_t = y_t − β̂x_t,   (B4.3.6)

where β̂ is the OLS estimator of β. Davidson and MacKinnon (1993) tabulated critical values for the DF test for the presence of a unit root in the series {û_t}.

Asymptotic Critical Values for the Dickey-Fuller (DF) Unit Root Test with Estimated β

𝜷𝜷 Significance Level 1% 2.5% 5% 10%

Critical Value -3.90 -3.59 -3.34 -3.04

To perform a cointegration test with an estimated value of β, we run the regression

Δû_t = τ + θû_{t−1} + ε_t   (B4.3.7)

and test H_0: θ = 0 against the alternative H_1: θ < 0. If the t-statistic is below the critical value, we have evidence that {y_t} and {x_t} are cointegrated. When β is estimated, we must obtain a more strongly negative t-statistic to show cointegration, because OLS tends to produce residuals that look like an I(0) sequence even when {y_t} and {x_t} are not cointegrated.
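Putting the two steps together gives a minimal Engle-Granger-style check (a sketch of ours with simulated data and made-up names, not the original analysis): step 1 estimates β̂ by OLS; step 2 applies the DF regression to the residuals û_t and compares the t-statistic with the estimated-β critical values (e.g., −3.34 at the 5% level).

```python
import random

def ols(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def df_t_stat(z):
    """t-statistic for theta in Delta z_t = tau + theta * z_{t-1} + eps_t."""
    dz = [z[t] - z[t - 1] for t in range(1, len(z))]
    zlag = z[:-1]
    tau, theta = ols(zlag, dz)
    n = len(dz)
    rss = sum((d - tau - theta * zl) ** 2 for zl, d in zip(zlag, dz))
    mx = sum(zlag) / n
    sxx = sum((zl - mx) ** 2 for zl in zlag)
    se = (rss / (n - 2) / sxx) ** 0.5
    return theta / se

# Simulated cointegrated pair: x_t is a random walk, y_t = 1 + 2 x_t + noise.
random.seed(3)
T = 500
x, y, level = [], [], 0.0
for _ in range(T):
    level += random.gauss(0.0, 1.0)
    x.append(level)
    y.append(1.0 + 2.0 * level + random.gauss(0.0, 1.0))

# Step 1: estimate the cointegrating parameter by OLS.
alpha_hat, beta_hat = ols(x, y)
# Step 2: DF test on the residuals u_t = y_t - alpha_hat - beta_hat * x_t
# (the fitted intercept is removed along with beta_hat * x_t).
resid = [yi - alpha_hat - beta_hat * xi for xi, yi in zip(x, y)]
t_resid = df_t_stat(resid)   # compare with -3.34 (5% level, estimated beta)
```

For this cointegrated pair the residual t-statistic is far below −3.34, so the null of no cointegration is rejected; for two independent random walks it usually is not.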

Cointegration Test for Series with Linear Time Trends

Suppose {y_t} and {x_t} display linear time trends. The trends need not be the same; e.g., {y_t} may be increasing faster than {x_t}. Let y_t = γt + g_t, and let x_t = ζt + z_t. The stochastic portions of the series, {g_t} and {z_t}, may be cointegrated. We may remove the trends from y_t and x_t and then test for unit roots and the cointegration of {g_t} with {z_t}. Alternatively, we may run the regression

y_t = α̂ + η̂t + β̂x_t   (B4.3.8)

and apply the DF test to the residuals ê_t. In this case, we note the following:

• The asymptotic critical values differ from those of the standard DF test and are given in the table below.

• If we find that {g_t} and {z_t} are cointegrated with cointegration parameter β, this implies that y_t − βx_t is not an I(1) process.

• It’s possible that y_t − βx_t will still have a linear time trend.

Asymptotic Critical Values for the Cointegration Test with Linear Time Trends

Significance Level 1% 2.5% 5% 10%

Critical Value -4.32 -4.03 -3.78 -3.50

B4.4. Series with Complex Autocorrelation Structures

If we suspect that {y_t} ~ I(1) and that {y_t} has a complex autocorrelation structure, we may include lags of Δy_t in the regression on which we base the unit root test. In this case, we apply the augmented Dickey-Fuller (ADF) test.

We use the general model

Δy_t = α + θy_{t−1} + Σ_{i=1}^p γ_i Δy_{t−i} + ε_t.   (B4.4.1)

To find the appropriate value of p, we may set p = 0 and then increase p until γ_p is statistically insignificant. We can use an ordinary t test of H_0: γ_p = 0 (or an F test for the joint significance of the γ_i). As in the DF test, we test

H_0: θ = 0 vs. H_1: θ < 0.   (B4.4.2)

The asymptotic critical values for the ADF test are the same as those for the DF test. Under H_0, {Δy_t} follows an autoregressive model of order p; under H_1, {Δy_t} follows an autoregressive model of order p + 1.

Vector Autoregressive (VAR) Models

If {x_t} and {y_t} are autoregressive series of order p, we may fit the autoregressive models

y_t = δ_0 + Σ_{i=1}^p α_i y_{t−i} + Σ_{i=1}^p β_i x_{t−i} + ε_t, and
x_t = γ_0 + Σ_{i=1}^p τ_i y_{t−i} + Σ_{i=1}^p φ_i x_{t−i} + u_t.   (B4.4.3)

The inclusion of the lagged values helps ensure that the residuals are not strongly autocorrelated. VAR models are often used for short-term forecasting.
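As a concrete, hypothetical illustration with p = 1 (data, parameter values, and helper names are our own), the sketch below fits the two equations of a bivariate VAR(1) by solving the OLS normal equations directly; the coefficient names follow equation (B4.4.3).

```python
import random

def solve3(A, b):
    """Solve a 3x3 linear system A m = b by Gauss-Jordan elimination with pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(3):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]

def ols3(regs, target):
    """OLS of target on [1, regs[0], regs[1]] via the normal equations X'X b = X'y."""
    cols = [[1.0] * len(target), regs[0], regs[1]]
    A = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(c * t for c, t in zip(ci, target)) for ci in cols]
    return solve3(A, rhs)   # (intercept, coef on regs[0], coef on regs[1])

# Simulate a stationary bivariate VAR(1):
#   y_t = 0.5 y_{t-1} + 0.2 x_{t-1} + eps_t,  x_t = 0.3 x_{t-1} + u_t.
random.seed(9)
T = 3000
y, x = [0.0], [0.0]
for _ in range(T):
    y.append(0.5 * y[-1] + 0.2 * x[-1] + random.gauss(0.0, 1.0))
    x.append(0.3 * x[-1] + random.gauss(0.0, 1.0))

ylag, xlag = y[:-1], x[:-1]
d0, a1, b1 = ols3((ylag, xlag), y[1:])   # y-equation: delta_0, alpha_1, beta_1
g0, t1, p1 = ols3((ylag, xlag), x[1:])   # x-equation: gamma_0, tau_1, phi_1
```

With a long simulated sample, the fitted coefficients land close to the true values (0.5, 0.2, 0.3, and 0 for τ_1), which is what equation-by-equation OLS estimation of a VAR delivers.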

Error Correction Models

Suppose we have established that {y_t} and {x_t} are cointegrated with

y_t = βx_t + ε_t.   (B4.4.4)

We can use an error correction model to study the short-term dynamics of the relationship between {y_t} and {x_t}. For example, consider the model

Δy_t = α_0 + γ_0 Δx_t + δ(y_{t−1} − βx_{t−1}) + ε_t,   (B4.4.5)

where δ < 0. The term (y_{t−1} − βx_{t−1}) is called the error correction term. The parameter δ indicates the speed at which {y_t} and {x_t} return to their equilibrium relationship after a “shock” creates a short-term disturbance in this relationship. Note that, if y_{t−1} < βx_{t−1}, then x_{t−1} is higher than its equilibrium value. In this case, the inclusion of the term δ(y_{t−1} − βx_{t−1}) will increase the model-based estimate of Δy_t. Conversely, if y_{t−1} > βx_{t−1}, the error correction term will decrease the estimated value of Δy_t.

When the cointegrating parameter β is unknown, we may estimate it first and then fit the error correction model using the OLS estimator β̂ in place of β. This is called the Engle-Granger two-step procedure.

B4.5. Multivariate Cointegration Analysis

In the multivariate case, we consider the k x T matrix

Y = [ y_11 ⋯ y_1T ]
    [  ⋮   ⋱   ⋮  ]
    [ y_k1 ⋯ y_kT ].

Suppose each row y_i = [y_i1 . . . y_iT] of Y represents an I(1) process, but certain linear combinations of the rows are I(0). We are interested in identifying cointegrating relationships among the k time series.

The multivariate extension of the error correction model may be expressed as

Δy_t = π_0 + Πy_{t−1} + Σ_{i=1}^{p−1} Φ_i Δy_{t−i} + ε_t,

where Π = αβ′, α and β are k x r matrices, and each Φ_i is a k x k matrix. The Johansen multivariate formulation assumes that the error terms are independent and identically distributed with ε_t ~ N_k(0, Σ). The r columns of β are called the cointegration vectors, and the rank r of Π is called the cointegration rank. The parameters in α (which must be negative, like δ in the two-variable case) indicate corrections of short-run deviations from the long-run equilibrium relationships defined by the columns of β.
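The reduced-rank structure Π = αβ′ is easy to verify numerically in a small case. The sketch below (our own illustration, with made-up values for α and β) forms Π for k = 2 series and r = 1 cointegrating vector and confirms that Π has rank 1, so the level term Πy_{t−1} depends on the levels only through the equilibrium error β′y_{t−1}.

```python
# k = 2 series, r = 1 cointegrating relationship (illustrative values).
alpha = [[-0.4], [0.1]]          # k x r loading (adjustment) matrix
beta = [[1.0], [-2.0]]           # k x r cointegration vector: y1 - 2*y2

# Pi = alpha * beta' is k x k but has rank r = 1.
Pi = [[sum(alpha[i][s] * beta[j][s] for s in range(1)) for j in range(2)]
      for i in range(2)]

det_Pi = Pi[0][0] * Pi[1][1] - Pi[0][1] * Pi[1][0]   # 0 for a rank-1 2x2 matrix

# The level term Pi @ y depends on y only through the scalar beta' y.
y = [3.0, 1.2]
equilibrium_error = beta[0][0] * y[0] + beta[1][0] * y[1]   # beta' y
level_term = [alpha[0][0] * equilibrium_error, alpha[1][0] * equilibrium_error]
Pi_y = [Pi[0][0] * y[0] + Pi[0][1] * y[1], Pi[1][0] * y[0] + Pi[1][1] * y[1]]
```

Here Πy equals α times the single equilibrium error, so each series is adjusted in proportion to the same deviation from the long-run relationship.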

Johansen Lambda-Max and Trace Tests

Based on the normality assumption, the Johansen tests are applied sequentially to help identify the cointegration rank r. If we find that r < k, it may be appropriate to remove one or more time series from the system. (If one variable is not cointegrated with any of the others, all values of the matrix β corresponding to it will be near 0.)

• Lambda-Max Test: Using a maximum generalized eigenvalue as a test statistic, we test the null hypothesis that the cointegration rank r is equal to i ∈ {0, 1, ..., k − 1} against the alternative that r = i + 1. We apply the test sequentially, starting with i = 0.

• Trace Test: Using the trace (sum of diagonal elements) of a diagonal matrix of generalized eigenvalues as a test statistic, we test the null hypothesis that the cointegration rank r is equal to i ∈ {0, 1, ..., k − 1} against the alternative that r > i.

Applying Multivariate Cointegration Tests to Energy Data

Johansen’s proof of the validity of the lambda-max and trace tests relies on maximum likelihood theory, assuming that the differenced data follow a Gaussian (or normal) distribution. Because this assumption is often problematic for energy data, especially energy price data, Johansen’s technique is not used in many analyses performed within EIA. Seasonality also complicates the analysis of monthly data by the vector error correction model.