Cointegration - General Discussion
Definitions:

A time series that requires d differences to become stationary is said to be "integrated of order d". If the d-th difference has p autoregressive and q moving average terms, the differenced series is said to be ARMA(p,q) and the original integrated series to be ARIMA(p,d,q). Two series X_t and Y_t that are each integrated of order d may, through a linear combination, produce a series a X_t + b Y_t which is stationary (or integrated of order smaller than d), in which case we say that X_t and Y_t are cointegrated and we refer to (a, b) as the cointegrating vector. Granger and Weiss discuss this concept and terminology.

An example:

If X_t and Y_t are wages in two similar industries, we may find that both are unit root processes. We may, however, reason that by virtue of the similar skills and easy transfer between the two industries, the difference X_t - Y_t cannot wander too far from 0 and thus certainly should not be a unit root process. The cointegrating vector is specified by our theory to be (1, -1) or (-1, 1), or (c, -c), all of which are equivalent. The test for cointegration here consists of simply testing the original series for unit roots, not rejecting the unit root null, then testing the X_t - Y_t series and rejecting the unit root null. We just use the standard D-F tables for all of these tests. The reason we can use the D-F tables is that the cointegrating vector was specified by our theory, not estimated from the data.

Numerical examples:  Y_t = A Y_{t-1} + e_t

1. Bivariate, stationary:

$$ \begin{bmatrix} Y_{1,t}\\ Y_{2,t}\end{bmatrix} = \begin{bmatrix} 1.2 & -0.3\\ 0.4 & 0.5\end{bmatrix} \begin{bmatrix} Y_{1,t-1}\\ Y_{2,t-1}\end{bmatrix} + \begin{bmatrix} e_{1,t}\\ e_{2,t}\end{bmatrix}, \qquad e_t \sim N(0, \Sigma_e) $$

$$ |A - \lambda I| = \begin{vmatrix} 1.2-\lambda & -0.3\\ 0.4 & 0.5-\lambda \end{vmatrix} = \lambda^2 - 1.7\lambda + 0.6 + 0.12 = (\lambda - 0.9)(\lambda - 0.8) $$

Both roots are less than 1 in absolute value, so this is stationary (note: the distribution of e_t has no impact on this).

2. Bivariate, nonstationary:

$$ \begin{bmatrix} Y_{1,t}\\ Y_{2,t}\end{bmatrix} = \begin{bmatrix} 1.2 & -0.5\\ 0.2 & 0.5\end{bmatrix} \begin{bmatrix} Y_{1,t-1}\\ Y_{2,t-1}\end{bmatrix} + \begin{bmatrix} e_{1,t}\\ e_{2,t}\end{bmatrix} $$

$$ |A - \lambda I| = \begin{vmatrix} 1.2-\lambda & -0.5\\ 0.2 & 0.5-\lambda \end{vmatrix} = \lambda^2 - 1.7\lambda + 0.6 + 0.1 = (\lambda - 1)(\lambda - 0.7) $$

One root equals 1, so this is a unit root process.

Using the spectral decomposition (eigenvalues, eigenvectors) of A we have

$$ \begin{bmatrix} 1.2 & -0.5\\ 0.2 & 0.5\end{bmatrix} = \begin{bmatrix} 1/\sqrt{1.16} & 1/\sqrt{2}\\ 0.4/\sqrt{1.16} & 1/\sqrt{2}\end{bmatrix} \begin{bmatrix} 1 & 0\\ 0 & 0.7\end{bmatrix} \begin{bmatrix} 1/\sqrt{1.16} & 1/\sqrt{2}\\ 0.4/\sqrt{1.16} & 1/\sqrt{2}\end{bmatrix}^{-1} $$

that is, A X = X Λ, where X is the matrix of eigenvectors and Λ the diagonal matrix of eigenvalues. Multiplying the model by X^{-1},

X^{-1} Y_t = (X^{-1} A X) X^{-1} Y_{t-1} + X^{-1} e_t,   that is,   Z_t = Λ Z_{t-1} + η_t  with  Z_t = X^{-1} Y_t.

Components of the Z_t vector:

Z_{1,t} = Z_{1,t-1} + η_{1,t}   "common trend" (unit root)
Z_{2,t} = 0.7 Z_{2,t-1} + η_{2,t}   (stationary root)

Writing the elements of X as w_1, ..., w_4, so that Y_t = X Z_t,

Y_{1,t} = w_1 Z_{1,t} + w_2 Z_{2,t} = (1/√1.16) Z_{1,t} + (1/√2) Z_{2,t}
Y_{2,t} = w_3 Z_{1,t} + w_4 Z_{2,t} = (0.4/√1.16) Z_{1,t} + (1/√2) Z_{2,t}

so the two series share the "common trend" Z_{1,t}. Since Z_t = X^{-1} Y_t, the last row of X^{-1} is the cointegrating vector. Notice that A is not symmetric, so X^{-1} ≠ X'.
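To make the common-trend decomposition concrete, here is a small NumPy sketch (my own illustration, not part of the original notes): it simulates the bivariate unit-root VAR of example 2, forms Z_t = X^{-1} Y_t from the eigenvectors of A, and checks that the row of X^{-1} belonging to the eigenvalue 0.7 gives a stationary combination of Y_{1,t} and Y_{2,t}. The sample size and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.2, -0.5],
              [0.2,  0.5]])              # coefficient matrix of numerical example 2

eigvals, X = np.linalg.eig(A)            # A = X diag(eigvals) X^{-1}
Xinv = np.linalg.inv(X)

# simulate Y_t = A Y_{t-1} + e_t (arbitrary sample size, standard normal errors)
n = 2000
Y = np.zeros((2, n))
for t in range(1, n):
    Y[:, t] = A @ Y[:, t - 1] + rng.standard_normal(2)

Z = Xinv @ Y                             # Z_t = X^{-1} Y_t, so Z_t = Lambda Z_{t-1} + eta_t
unit = np.argmin(np.abs(eigvals - 1.0))  # component with the unit root ("common trend")
stat = np.argmin(np.abs(eigvals - 0.7))  # component with the stationary root

print("eigenvalues:", eigvals)                              # approximately 1.0 and 0.7
print("sample variance, common trend component:", Z[unit].var())  # large: wanders like a random walk
print("sample variance, stationary component  :", Z[stat].var())  # stays bounded
print("cointegrating vector (row of X^{-1})   :", Xinv[stat])
# up to sign and scale this row is proportional to (-0.4, 1), i.e. Y_2 - 0.4 Y_1 is stationary
```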
Engle-Granger method

This is one of the earliest and easiest to understand treatments of cointegration. From the decomposition above,

Y_{1,t} = w_1 Z_{1,t} + w_2 Z_{2,t}
Y_{2,t} = w_3 Z_{1,t} + w_4 Z_{2,t}

where Σ_{t=1}^n Z_{1,t}^2 is O_p(n^2) and Σ_{t=1}^n Z_{2,t}^2 is O_p(n). So if we regress Y_{1,t} on Y_{2,t}, the regression coefficient is

$$ \frac{\sum_{t=1}^{n} Y_{1,t} Y_{2,t}}{\sum_{t=1}^{n} Y_{2,t}^2} = \frac{w_1 w_3\, n^{-2}\sum Z_{1,t}^2 + O_p(1/\sqrt{n})}{w_3^2\, n^{-2}\sum Z_{1,t}^2 + O_p(1/\sqrt{n})} = \frac{w_1}{w_3} + O_p(1/\sqrt{n}) $$

and our residual series is thus approximately Y_{1,t} - (w_1/w_3) Y_{2,t} = (w_2 - w_1 w_4 / w_3) Z_{2,t}, a stationary series. Thus a simple regression of Y_{1,t} on Y_{2,t} gives an estimate of the cointegrating vector, and a test for cointegration is just a test that the residuals are stationary. Let the residuals be r_t. Regress r_t - r_{t-1} on r_{t-1} (and possibly some lagged differences). Can we compare to our D-F tables? Engle and Granger argue that one cannot do so. The null hypothesis is that there is no cointegration, so the bivariate series has 2 unit roots and no linear combination is stationary.

We have, in a sense, looked through all possible linear combinations of Y_{1,t} and Y_{2,t}, found the one that varies least (least squares), and hence the one that looks most stationary. It is as though we had computed unit root tests for all possible linear combinations and then selected the one most likely to reject. We are thus in the area of order statistics. (If you report the minimum heights from samples of 10 men each, the distribution of these minima will not be the same as the distribution of heights of individual men; nor will the distribution of unit root tests from these "best" linear combinations be the same as the distribution you would get for a prespecified linear combination.) Engle and Granger provide adjusted critical values. Here is a table comparing their E-G tables to our D-F tables for n = 100. E-G used an augmented regression with 4 lagged differences and an intercept to calculate a t statistic τ, so keep in mind that part of the discrepancy is due to finite sample effects of the (asymptotically negligible) lagged differences.

Prob of smaller τ     .01      .05      .10
E-G                  -3.77    -3.17    -2.84
D-F                  -3.51    -2.89    -2.58

Example: P_t = cash price on delivery date, Texas steers; F_t = futures price (source: Ken Mathews, NCSU Ag. Econ.). Data are bimonthly, Feb. '76 through Dec. '86 (60 obs.).

1. Test the individual series for integration:

∇P_t = 7.6 - 0.117 P_{t-1} + Σ_{i=1}^{5} b_i ∇P_{t-i},   τ(D-F) = -0.117/0.053 = -2.203
∇F_t = 7.7 - 0.120 F_{t-1} + Σ_{i=1}^{5} b_i ∇F_{t-i},   τ(D-F) = -0.120/0.054 = -2.228

(the denominators are the standard errors of the lag-1 coefficients). Each series is integrated: we cannot reject a unit root at the 10% level.

2. Regress F_t on P_t:  F_t = .5861 + .9899 P_t + R_t, with residuals R_t.

∇R_t = 0.110 - 0.9392 R_{t-1},   τ(E-G) = -0.9392/0.1264 = -7.428

Thus, with a bit of rounding, F_t - 1.00 P_t is stationary.

The Engle-Granger method requires the specification of one series as the dependent variable in the bivariate regression. Fountis and Dickey (Annals of Statistics) study distributions for the multivariate system. If Y_t = A Y_{t-1} + e_t then Y_t - Y_{t-1} = -(I - A) Y_{t-1} + e_t. We show that if the true series has one unit root, then the eigenvalue of I - Â (Â being the least squares estimate of A) that is closest to 0 has, after multiplication by n, the same limit distribution as in the standard D-F tables, and we suggest using the eigenvectors of I - Â to estimate the cointegrating vector. The only test we can do with this is the null of one unit root versus the alternative of stationarity. Johansen's test, discussed later, extends this in a very nice way. Our result also holds for higher dimensional models, but requires the extraction of the roots of the estimated characteristic polynomial.

For the Texas steer futures data, the regression (with 3 lagged differences) gives

$$ \nabla \begin{bmatrix} P_t\\ F_t\end{bmatrix} = \begin{bmatrix} 5.3\\ 6.9\end{bmatrix} + \begin{bmatrix} -1.77 & 1.69\\ -1.03 & 0.93\end{bmatrix} \begin{bmatrix} P_{t-1}\\ F_{t-1}\end{bmatrix} + \text{lagged difference terms} + e_t $$

Decomposing the estimated coefficient matrix on (P_{t-1}, F_{t-1})' as X Λ X^{-1} gives, approximately,

$$ X^{-1} = \begin{bmatrix} 0.54 & -0.82\\ -0.69 & 0.72\end{bmatrix}, \qquad X = \begin{bmatrix} 3.80 & -4.42\\ 3.63 & -2.84\end{bmatrix} $$

indicating that -0.69 P_t + 0.72 F_t is stationary. This is about 0.7 times the difference, so the two methods agree that P_t - F_t is stationary, as is any multiple of it.
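Below is a hedged sketch of the two-step Engle-Granger procedure in Python with statsmodels, applied to simulated series standing in for P_t and F_t (the actual Texas steers data are not reproduced here; the series construction, sample size, and seed are my own choices). adfuller runs the augmented D-F regression on each level series, and coint runs the residual-based test against Engle-Granger-type (MacKinnon) critical values rather than the ordinary D-F table, which is exactly the distinction emphasized above.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller, coint

# Simulated stand-ins for the cash price P_t and futures price F_t (hypothetical data).
rng = np.random.default_rng(1)
n = 120
trend = np.cumsum(rng.standard_normal(n))      # shared unit-root component
P = 60 + trend + rng.standard_normal(n)
F = 0.6 + P + rng.standard_normal(n)           # F_t - P_t is stationary by construction

# Step 1: augmented D-F test on each level series; failing to reject is consistent
# with each series being integrated.
for name, x in [("P", P), ("F", F)]:
    tau, pval = adfuller(x, regression="c")[:2]
    print(f"ADF on {name}: tau = {tau:.2f}, p-value = {pval:.2f}")

# Step 2: regress F on P to estimate the cointegrating relation ...
ols_fit = sm.OLS(F, sm.add_constant(P)).fit()
print("estimated relation: F =", ols_fit.params[0].round(3),
      "+", ols_fit.params[1].round(3), "* P")

# ... then test the residuals. coint() compares the residual unit-root statistic to
# adjusted Engle-Granger-type critical values, not the ordinary D-F table.
tau, pval, crit = coint(F, P, trend="c")
print(f"Engle-Granger tau = {tau:.2f}, p-value = {pval:.3f}, 5% critical value = {crit[1]:.2f}")
```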
>C where">
Johansen's Method

This method is similar to that just illustrated but has the advantage of being able to test for any number of unit roots. The method can be described as the application of standard multivariate calculations in the context of a vector autoregression, or VAR. The test statistics are those found in any multivariate text. Johansen's idea, as in univariate unit root tests, is to get the right distribution for these standard calculated statistics. The statistics are standard; their distributions are not.

We start with just a lag one model with mean 0 (no intercept):

∇Y_t = Π Y_{t-1} + e_t

where Y_t is a p-dimensional column vector, as is e_t, and we assume E{e_t e_t'} = Σ.

H_0: Π = α β'  with α of dimension p×r and β of dimension p×r.

r = 0  ⟹  all linear combinations are nonstationary
r = p  ⟹  all linear combinations are stationary
0 < r < p  ⟹  cointegration

Note: for any Π = α β' there are infinitely many pairs α, β such that Π = α β' [because α β' = (α T)(T^{-1} β') for any nonsingular r×r matrix T], so we do not test hypotheses about α and β, only about the rank r.

Now define sums of squares and cross products for ∇Y_t and Y_{t-1}:

$$ \begin{bmatrix} \nabla Y_t\\ Y_{t-1}\end{bmatrix} \;\longrightarrow\; \begin{bmatrix} S_{00} & S_{01}\\ S_{10} & S_{11}\end{bmatrix}, \qquad \text{for example } S_{11} = \sum_{t=1}^{n} Y_{t-1} Y_{t-1}'. $$

Now write down the likelihood (conditional on Y_0 = 0):

$$ L = (2\pi)^{-np/2}\, |\Sigma|^{-n/2} \exp\Big\{ -\tfrac{1}{2} \sum_{t=1}^{n} (\nabla Y_t - \Pi Y_{t-1})'\, \Sigma^{-1} (\nabla Y_t - \Pi Y_{t-1}) \Big\} $$

If Π is assumed to be of full rank (r = p), then the likelihood is maximized at the usual estimate, the least squares regression estimate

$$ \hat\Pi = \Big[ \sum_{t=1}^{n} \nabla Y_t\, Y_{t-1}' \Big] \Big[ \sum_{t=1}^{n} Y_{t-1} Y_{t-1}' \Big]^{-1} = S_{01} S_{11}^{-1} $$

and

$$ \hat\Sigma = \frac{1}{n} \sum_{t=1}^{n} (\nabla Y_t - \hat\Pi Y_{t-1})(\nabla Y_t - \hat\Pi Y_{t-1})' = \frac{1}{n}\big( S_{00} - \hat\Pi S_{11} \hat\Pi' \big). $$

The null hypothesis can be stated in three equivalent ways:

H_0: there are r stationary (linearly independent) linear combinations of Y_t, and thus (p - r) unit root linear combinations;
H_0: there are r "cointegrating vectors" and (p - r) "common trends";
H_0: Π = α β' with α p×r and β p×r.

So far we have the unrestricted estimates of Π and Σ and can evaluate the likelihood there. The likelihood ratio principle requires that we maximize the likelihood under Π = α β' and compare to the unrestricted maximum. That is, we now want to maximize

$$ L = (2\pi)^{-np/2}\, |\Sigma|^{-n/2} \exp\Big\{ -\tfrac{1}{2} \sum_{t=1}^{n} (\nabla Y_t - \alpha\beta' Y_{t-1})'\, \Sigma^{-1} (\nabla Y_t - \alpha\beta' Y_{t-1}) \Big\} $$

Step 1: For any given β we can compute β'Y_{t-1} and find the corresponding α by regression in the model

∇Y_t = α (β'Y_{t-1}) + e_t

and this is simply

$$ \hat\alpha(\beta) = \Big[ \sum_{t=1}^{n} \nabla Y_t\, Y_{t-1}' \beta \Big] \Big[ \beta' \sum_{t=1}^{n} Y_{t-1} Y_{t-1}' \beta \Big]^{-1} = S_{01}\beta\,(\beta' S_{11} \beta)^{-1}. $$

Step 2: Search over β for the maximum.
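As a rough illustration of how the rank test is used in practice (not a reproduction of Johansen's derivation above), the sketch below applies statsmodels' coint_johansen to data simulated from numerical example 2. Setting det_order=-1 matches the mean-zero, no-intercept model, and k_ar_diff=0 corresponds to the pure lag-one VAR; the sample size and seed are arbitrary.

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

# Simulate the bivariate unit-root VAR(1) of numerical example 2.
rng = np.random.default_rng(2)
A = np.array([[1.2, -0.5],
              [0.2,  0.5]])
n = 500
Y = np.zeros((n, 2))
for t in range(1, n):
    Y[t] = A @ Y[t - 1] + rng.standard_normal(2)

# Johansen trace test: det_order=-1 -> no deterministic terms, k_ar_diff=0 -> lag-one model.
res = coint_johansen(Y, det_order=-1, k_ar_diff=0)
print("trace statistics (H0: r = 0, then r <= 1):", res.lr1)
print("95% critical values:                      ", res.cvt[:, 1])
print("candidate cointegrating vectors (columns):")
print(res.evec)
```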