
VAR MODELS & GRANGER

VECTOR TIME SERIES

• A vector time series consists of multiple single series.
• Why do we need multiple series?
  – To understand the relationships between several components
  – To obtain better forecasts

VECTOR TIME SERIES

• Price movements in one market can spread easily and instantly to another market. For this reason, financial markets are more dependent on each other than ever before, so we have to consider them jointly to better understand the dynamic structure of the global market. Knowing how markets are interrelated is of great importance in finance.
• For an investor or a financial institution holding multiple assets, these interrelations play an important role in decision making.

VECTOR TIME SERIES

• Consider an m-dimensional time series $Y_t = (Y_{1t}, Y_{2t}, \ldots, Y_{mt})'$. The series $Y_t$ is weakly stationary if its first two moments are time invariant and the cross-covariances between $Y_{it}$ and $Y_{js}$, for all i and j, are functions of the time difference (t − s) only. Note, however, that in general they are not symmetric, that is, $\mathrm{cov}(Y_{it}, Y_{js}) \neq \mathrm{cov}(Y_{is}, Y_{jt})$.

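VECTOR TIME SERIES

• The mean vector:

$$E(Y_t) = \mu = (\mu_1, \mu_2, \ldots, \mu_m)'$$

• The cross-covariance matrix function:

$$\Gamma(k) = \mathrm{Cov}(Y_{t-k}, Y_t) = E[(Y_{t-k} - \mu)(Y_t - \mu)'] = \begin{bmatrix} \gamma_{11}(k) & \gamma_{12}(k) & \cdots & \gamma_{1m}(k) \\ \gamma_{21}(k) & \gamma_{22}(k) & \cdots & \gamma_{2m}(k) \\ \vdots & \vdots & & \vdots \\ \gamma_{m1}(k) & \gamma_{m2}(k) & \cdots & \gamma_{mm}(k) \end{bmatrix}$$

VECTOR TIME SERIES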

• The cross-correlation matrix function:

$$\rho(k) = D^{-1/2}\,\Gamma(k)\,D^{-1/2} = [\rho_{ij}(k)]$$

where D is a diagonal matrix in which the i-th diagonal element is the variance of the i-th process, i.e.

$$D = \mathrm{diag}[\gamma_{11}(0), \gamma_{22}(0), \ldots, \gamma_{mm}(0)].$$
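As a rough illustration of these definitions, the sketch below estimates $\Gamma(k) = \mathrm{Cov}(Y_{t-k}, Y_t)$ and $\rho(k) = D^{-1/2}\Gamma(k)D^{-1/2}$ from data in base R. The helper names `cross_cov` and `cross_cor` and the simulated series are assumptions made for this example only; they are not part of any package.

```r
# Sample cross-covariance matrix Gamma(k) = Cov(Y_{t-k}, Y_t), for k >= 0
cross_cov <- function(Y, k) {
  Y <- as.matrix(Y)
  n <- nrow(Y)
  Yc <- sweep(Y, 2, colMeans(Y))            # centre each component series
  lagged  <- Yc[1:(n - k), , drop = FALSE]  # rows hold Y_{t-k}
  current <- Yc[(k + 1):n, , drop = FALSE]  # rows hold Y_t
  crossprod(lagged, current) / n            # (i, j) entry estimates gamma_ij(k)
}

# rho(k) = D^{-1/2} Gamma(k) D^{-1/2}, with D = diag(gamma_11(0), ..., gamma_mm(0))
cross_cor <- function(Y, k) {
  d <- 1 / sqrt(diag(cross_cov(Y, 0)))
  Dinv <- diag(d, nrow = length(d))
  Dinv %*% cross_cov(Y, k) %*% Dinv
}

# Simulated (hypothetical) data: y_t follows x_{t-1}, so x leads y
set.seed(1)
n <- 200
x <- as.numeric(arima.sim(list(ar = 0.5), n = n))
y <- c(0, x[-n]) + rnorm(n, sd = 0.5)
Y <- cbind(x = x, y = y)

rho0 <- cross_cor(Y, 0)   # symmetric, unit diagonal
rho1 <- cross_cor(Y, 1)   # generally not symmetric
```

In this simulation the (1, 2) entry of `rho1` (x at time t−1 with y at time t) should be much larger than the (2, 1) entry, which illustrates the asymmetry noted above.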

VECTOR WHITE NOISE PROCESS

• $\{a_t\} \sim WN(0, \Sigma)$ iff $\{a_t\}$ is stationary with mean vector 0 and

$$\Gamma(k) = \begin{cases} \Sigma, & k = 0 \\ 0, & \text{otherwise.} \end{cases}$$

VECTOR TIME SERIES

• $\{Y_t\}$ is a linear process if it can be expressed as

$$Y_t = \sum_{j=0}^{\infty} \Psi_j a_{t-j}, \qquad a_t \sim WN(0, \Sigma),$$

where $\{\Psi_j\}$ is a sequence of m×m matrices whose entries are absolutely summable, i.e.

$$\sum_{j=0}^{\infty} |\Psi_j(i,l)| < \infty \quad \text{for } i, l = 1, 2, \ldots, m.$$

VECTOR TIME SERIES

• For a linear process, $E(Y_t) = 0$ and

$$\Gamma(k) = \sum_{j=0}^{\infty} \Psi_j \Sigma \Psi_{j+k}', \qquad k = 0, 1, 2, \ldots$$

MA (WOLD) REPRESENTATION

$$Y_t = \mu + \Psi(B) a_t, \qquad \text{where } \Psi(B) = \sum_{s=0}^{\infty} \Psi_s B^s$$

• For the process to be stationary, $\Psi_s$ should be square summable in the sense that each of the m×m sequences $\psi_{ij,s}$ is square summable.

AR REPRESENTATION

$$\Pi(B)(Y_t - \mu) = a_t, \qquad \text{where } \Pi(B) = I - \sum_{s=1}^{\infty} \Pi_s B^s$$

• For the process to be invertible, $\Pi_s$ should be absolutely summable.

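THE VECTOR AUTOREGRESSIVE MOVING AVERAGE (VARMA) PROCESSES

• VARMA(p,q) process:

$$\Phi_p(B)(Y_t - \mu) = \Theta_q(B) a_t$$

where

$$\Phi_p(B) = \Phi_0 - \Phi_1 B - \cdots - \Phi_p B^p, \qquad \Theta_q(B) = \Theta_0 - \Theta_1 B - \cdots - \Theta_q B^q.$$

• q = 0: $\Phi_p(B)(Y_t - \mu) = a_t$, a VAR(p) process.
• p = 0: $Y_t = \mu + \Theta_q(B) a_t$, a VMA(q) process.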

VARMA PROCESS

• A VARMA process is stationary if the zeros of $|\Phi_p(B)|$ are outside the unit circle. Then

$$Y_t = \mu + \Psi(B) a_t, \qquad \Psi(B) = \Phi_p^{-1}(B)\,\Theta_q(B).$$

• A VARMA process is invertible if the zeros of $|\Theta_q(B)|$ are outside the unit circle. Then

$$\Pi(B)(Y_t - \mu) = a_t, \qquad \Pi(B) = \Theta_q^{-1}(B)\,\Phi_p(B).$$

IDENTIFIABILITY PROBLEM

• Multiplying both $\Phi_p(B)$ and $\Theta_q(B)$ by an arbitrary matrix polynomial may give a model with an identical covariance matrix structure. So the VARMA(p,q) model is not identifiable: we cannot uniquely determine p and q.

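IDENTIFIABILITY PROBLEM

• Example: VARMA(1,1) process

$$\begin{bmatrix} Y_{1,t} \\ Y_{2,t} \end{bmatrix} = \begin{bmatrix} 0 & \alpha + m \\ 0 & 0 \end{bmatrix} \begin{bmatrix} Y_{1,t-1} \\ Y_{2,t-1} \end{bmatrix} + \begin{bmatrix} a_{1,t} \\ a_{2,t} \end{bmatrix} - \begin{bmatrix} 0 & m \\ 0 & 0 \end{bmatrix} \begin{bmatrix} a_{1,t-1} \\ a_{2,t-1} \end{bmatrix}$$

$$\begin{bmatrix} 1 & -(\alpha + m)B \\ 0 & 1 \end{bmatrix} \begin{bmatrix} Y_{1,t} \\ Y_{2,t} \end{bmatrix} = \begin{bmatrix} 1 & -mB \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a_{1,t} \\ a_{2,t} \end{bmatrix}$$

$$\begin{bmatrix} Y_{1,t} \\ Y_{2,t} \end{bmatrix} = \begin{bmatrix} 1 & -(\alpha + m)B \\ 0 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 1 & -mB \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a_{1,t} \\ a_{2,t} \end{bmatrix} = \begin{bmatrix} 1 & \alpha B \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a_{1,t} \\ a_{2,t} \end{bmatrix}$$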

MA(∞) = VMA(1): the MA(∞) representation reduces to a VMA(1) that does not depend on m, so every value of m gives the same process.

IDENTIFIABILITY

• To eliminate this problem, three methods were suggested by Hannan (1969, 1970, 1976, 1979):
  – From each of the equivalent models, choose the minimum MA order q and AR order p. The resulting representation will be unique if $\mathrm{Rank}(\Phi_p(B)) = m$.
  – Represent $\Phi_p(B)$ in lower triangular form. If the order of $\phi_{ij}(B)$ is less than or equal to the order of $\phi_{ii}(B)$ for i, j = 1, 2, …, m, then the model is identifiable.
  – Represent $\Phi_p(B)$ in the form $\Phi_p(B) = \phi_p(B) I$, where $\phi_p(B)$ is a univariate AR(p) polynomial. The model is identifiable if $\phi_p \neq 0$.

VAR(1) PROCESS

• $Y_{i,t}$ depends not only on the lagged values of $Y_{i,t}$ but also on the lagged values of the other variables.

$$(I - \Phi B)(Y_t - \mu) = a_t$$

• Always invertible.
• Stationary if the zeros of $|I - \Phi B| = 0$ lie outside the unit circle. Let $\lambda = B^{-1}$:

$$|I - \Phi B| = 0 \iff |\lambda I - \Phi| = 0$$

The zeros of $|I - \Phi B|$ are the reciprocals of the eigenvalues of $\Phi$.

VAR(1) PROCESS

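• Hence, a VAR(1) process is stationary if the eigenvalues $\lambda_i$, i = 1, 2, …, m, of $\Phi$ are all inside the unit circle.
• The autocovariance matrix (for a zero-mean process):

$$\Gamma(k) = E(Y_{t-k} Y_t') = E[Y_{t-k}(\Phi Y_{t-1} + a_t)'] = E(Y_{t-k} Y_{t-1}')\Phi' + E(Y_{t-k} a_t')$$

$$\Gamma(k) = \begin{cases} \Gamma'(1)\Phi' + \Sigma, & k = 0 \\ \Gamma(k-1)\Phi' = \Gamma(0)(\Phi')^k, & k \geq 1 \end{cases}$$

VAR(1) PROCESS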

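• For k = 1,

$$\Gamma(1) = \Gamma(0)\Phi' \implies \Phi' = \Gamma^{-1}(0)\Gamma(1), \qquad \Phi = \Gamma'(1)\Gamma^{-1}(0)$$

$$\Gamma(0) = \Gamma'(1)\Phi' + \Sigma \implies \Sigma = \Gamma(0) - \Gamma'(1)\Gamma^{-1}(0)\Gamma(1) = \Gamma(0) - \Phi\,\Gamma(0)\,\Phi'$$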

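VAR(1) PROCESS

• Then,

$$\Gamma(0) = \Phi\,\Gamma(0)\,\Phi' + \Sigma \implies \mathrm{vec}\,\Gamma(0) = (I - \Phi \otimes \Phi)^{-1}\,\mathrm{vec}\,\Sigma$$

where ⊗ is the Kronecker product and $\mathrm{vec}(ABC) = (C' \otimes A)\,\mathrm{vec}(B)$. For example,

$$A \otimes B = \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{bmatrix}, \qquad X = \begin{bmatrix} 3 & 2 \\ 4 & 6 \\ 1 & 7 \end{bmatrix} \implies \mathrm{vec}(X) = (3, 4, 1, 2, 6, 7)'$$

VAR(1) PROCESS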

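• Example:

$$Y_t = \begin{bmatrix} 1.1 & -0.3 \\ 0.6 & 0.2 \end{bmatrix} Y_{t-1} + a_t$$

$$|\lambda I - \Phi| = \det\begin{bmatrix} \lambda - 1.1 & 0.3 \\ -0.6 & \lambda - 0.2 \end{bmatrix} = (\lambda - 1.1)(\lambda - 0.2) + (0.6)(0.3) = \lambda^2 - 1.3\lambda + 0.4 = 0$$

$\lambda_1 = 0.8$, $\lambda_2 = 0.5$. The process is stationary.

VMA(1) PROCESS

$$Y_t = \mu + a_t - \Theta a_{t-1}, \qquad a_t \sim WN(0, \Sigma)$$

• Always stationary.
• The autocovariance function:

$$\Gamma(k) = \begin{cases} \Sigma + \Theta\Sigma\Theta', & k = 0 \\ -\Sigma\Theta', & k = 1 \\ -\Theta\Sigma, & k = -1 \\ 0, & \text{otherwise} \end{cases}$$

• The autocovariance matrix function cuts off after lag 1.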

VMA(1) PROCESS

• Hence, a VMA(1) process is invertible if the eigenvalues $\lambda_i$, i = 1, 2, …, m, of $\Theta$ are all inside the unit circle.
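Both the VAR(1) stationarity condition and the VMA(1) invertibility condition reduce to an eigenvalue check, which is easy to do in base R. The sketch below applies it to the VAR(1) example with Φ having rows (1.1, −0.3) and (0.6, 0.2), and also solves $\mathrm{vec}\,\Gamma(0) = (I - \Phi \otimes \Phi)^{-1}\mathrm{vec}\,\Sigma$. The matrix `Theta`, the choice Σ = I and the helper name `inside_unit_circle` are assumptions made for illustration.

```r
# TRUE if all eigenvalues of M lie strictly inside the unit circle
inside_unit_circle <- function(M) all(Mod(eigen(M)$values) < 1)

# VAR(1) example: rows (1.1, -0.3) and (0.6, 0.2); matrix() fills column-wise
Phi <- matrix(c(1.1, 0.6, -0.3, 0.2), 2, 2)
lambda <- eigen(Phi)$values                    # roots of lambda^2 - 1.3 lambda + 0.4 = 0
var1_stationary <- inside_unit_circle(Phi)

# Hypothetical VMA(1) coefficient matrix with one eigenvalue outside the unit circle
Theta <- matrix(c(0.4, 0.3, 1.2, 0.5), 2, 2)
vma1_invertible <- inside_unit_circle(Theta)

# Gamma(0) of the VAR(1) from the vec / Kronecker formula, assuming Sigma = I
Sigma <- diag(2)
vecG0 <- solve(diag(4) - Phi %x% Phi, as.vector(Sigma))
Gamma0 <- matrix(vecG0, 2, 2)

# Gamma0 should satisfy Gamma(0) = Phi Gamma(0) Phi' + Sigma
max(abs(Gamma0 - (Phi %*% Gamma0 %*% t(Phi) + Sigma)))
```

`%x%` is base R's Kronecker product operator, and `as.vector()` on a matrix stacks its columns, which is exactly the vec operation.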

IDENTIFICATION OF VARMA PROCESSES

• Same as the univariate case.
• SAMPLE CORRELATION MATRIX FUNCTION: Given a vector series of n observations, the sample correlation matrix function is

$$\hat{\rho}(k) = [\hat{\rho}_{ij}(k)]$$

where the $\hat{\rho}_{ij}(k)$ are the sample cross-correlations for the i-th and j-th component series.
• It is very useful for identifying a VMA(q) process.

SAMPLE CORRELATION MATRIX FUNCTION

• Tiao and Box (1981) proposed using the signs +, − and . to show the significance of the cross-correlations:
  + sign: the value is greater than 2 times the estimated standard error
  − sign: the value is less than −2 times the estimated standard error
  . sign: the value is within 2 times the estimated standard error
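The indicator-symbol display can be sketched in a few lines of base R. The function below is a hypothetical helper, not part of any package, and it assumes the common approximation 1/√n for the standard error of a sample cross-correlation; the matrix `R1` is made-up input.

```r
# Convert a sample cross-correlation matrix into Tiao-Box indicator symbols
tiao_box_signs <- function(R, n) {
  se <- 1 / sqrt(n)                       # assumed approximate standard error
  out <- matrix(".", nrow(R), ncol(R), dimnames = dimnames(R))
  out[R >  2 * se] <- "+"                 # significantly positive
  out[R < -2 * se] <- "-"                 # significantly negative
  out
}

# Hypothetical lag-1 cross-correlation matrix from n = 100 observations
R1 <- matrix(c(0.45, 0.05, -0.30, 0.10), 2, 2)
signs1 <- tiao_box_signs(R1, n = 100)
signs1
```

With n = 100 the threshold is 0.2, so only the entries 0.45 and −0.30 are flagged.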

PARTIAL AUTOREGRESSION OR PARTIAL LAG CORRELATION MATRIX FUNCTION

• They are useful for identifying the VAR order. The partial autoregression matrix function was proposed by Tiao and Box (1981), but it is not a proper correlation coefficient. Heyse and Wei (1985) then proposed the partial lag correlation matrix function, which is a proper correlation coefficient. Both can be used to identify a VARMA(p,q) model.

EXAMPLE OF VAR MODELING IN R

• The “vars” package deals with VAR models.
• Let’s consider the Canadian data set for an application of the model.
• Canadian time series for labour productivity (prod), employment (e), unemployment rate (U) and real wages (rw) (source: OECD database).
• The series are quarterly. The sample runs from the first quarter of 1980 until the fourth quarter of 2000.

Canadian example

> library(vars)
> data(Canada)
> # Canada is a multivariate ts object, so columns are selected by name with [, "name"]
> layout(matrix(1:4, nrow = 2, ncol = 2))
> plot.ts(Canada[, "e"], main = "Employment", ylab = "", xlab = "")
> plot.ts(Canada[, "prod"], main = "Productivity", ylab = "", xlab = "")
> plot.ts(Canada[, "rw"], main = "Real Wage", ylab = "", xlab = "")
> plot.ts(Canada[, "U"], main = "Unemployment Rate", ylab = "", xlab = "")

• An optimal lag order can be determined according to an information criterion or the final prediction error of a VAR(p) with the function VARselect().

> VARselect(Canada, lag.max = 5, type = "const")
$selection
AIC(n)  HQ(n)  SC(n) FPE(n)
     3      2      2      3

• According to the more conservative SC(n) and HQ(n) criteria, the empirical optimal lag order is 2.
• In the next step, the VAR(2) is estimated with the function VAR(), with a constant included as deterministic regressor.

> var.2c <- VAR(Canada, p = 2, type = "const")
> names(var.2c)
 [1] "varresult"    "datamat"      "y"            "type"         "p"
 [6] "K"            "obs"          "totobs"       "restrictions" "call"
> summary(var.2c)
> plot(var.2c)

• The OLS results of the example are shown in separate tables 1–4 below. It turns out that not all lagged endogenous variables enter significantly into the equations of the VAR(2).

The stability of the system of difference equations has to be checked. If the moduli of the eigenvalues of the companion matrix are less than one, the system is stable.

> roots(var.2c)
[1] 0.9950338 0.9081062 0.9081062 0.7380565 0.7380565 0.1856381 0.1428889 0.1428889

Although the first eigenvalue is pretty close to unity, for the sake of simplicity we assume a stable VAR(2) process with a constant as deterministic regressor.
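To make the companion-matrix check concrete, the base R sketch below builds the companion matrix of a VAR(p) from its coefficient matrices and returns the moduli of its eigenvalues. The helper `companion_moduli` and the matrices `A1` and `A2` are hypothetical; they are not the estimated Canadian coefficients.

```r
# Moduli of the eigenvalues of the companion matrix of a VAR(p)
# A_list: list of the K x K coefficient matrices A_1, ..., A_p
companion_moduli <- function(A_list) {
  K <- nrow(A_list[[1]])
  p <- length(A_list)
  top <- do.call(cbind, A_list)                       # [A_1 A_2 ... A_p]
  if (p > 1) {
    # identity blocks shift the lagged values down by one position
    bottom <- cbind(diag(K * (p - 1)), matrix(0, K * (p - 1), K))
    comp <- rbind(top, bottom)
  } else {
    comp <- top
  }
  Mod(eigen(comp)$values)
}

# Hypothetical bivariate VAR(2) coefficients (matrices are filled column-wise)
A1 <- matrix(c(0.5, 0.1, 0.0, 0.4), 2, 2)
A2 <- matrix(c(0.2, 0.0, 0.1, 0.1), 2, 2)
mods <- companion_moduli(list(A1, A2))
all(mods < 1)   # TRUE means the system is stable
```

A K-variable VAR(p) has K·p such moduli, which is why the Canadian VAR(2) with four variables reports eight values.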

Restricted VARs

• From tables 1–4 it is obvious that not all regressors enter significantly.
• With the function restrict() the user has the option to re-estimate the VAR either by significance (argument method = "ser") or by imposing zero restrictions manually (argument method = "manual").
• In the former case, each equation is re-estimated separately as long as there are t-values that are in absolute value below the threshold value set by the function’s argument thresh.
• In the latter case, a restriction matrix has to be provided that consists of 0/1 values, thereby selecting the coefficients to be retained in the model.

For example, restricting by significance:

> var2c.ser <- restrict(var.2c, method = "ser", thresh = 2)
> var2c.ser$restrictions
     e.l1 prod.l1 rw.l1 U.l1 e.l2 prod.l2 rw.l2 U.l2 const
e       1       1     1    1    1       0     0    0     1
prod    0       1     0    0    1       0     1    1     1
rw      0       1     1    0    1       0     0    1     0
U       1       0     0    1    1       0     1    0     1

> # coefficient matrix of the restricted VAR; Bcoef() was called B() in early versions of vars
> Bcoef(var2c.ser)

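Diagnostic testing

• In package ‘vars’ the functions for diagnostic testing are arch.test(), normality.test(), serial.test() and stability() (the first three were named arch(), normality() and serial() in early versions of the package).

> var2c.arch <- arch.test(var.2c)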

• The Jarque-Bera normality tests for univariate and multivariate series are implemented and applied to the residuals of a VAR(p), as well as separate tests for multivariate skewness and kurtosis (see Bera & Jarque [1980], [1981], Jarque & Bera [1987] and Lütkepohl [2006]).
• The univariate versions of the Jarque-Bera test are applied to the residuals of each equation.
• A multivariate version of this test can be computed by using the residuals that are standardized by a Choleski decomposition of the variance-covariance matrix for the centered residuals.

> var2c.norm <- normality.test(var.2c, multivariate.only = TRUE)
> var2c.norm
$JB
        JB-Test (multivariate)
Chi-squared = 5.094, df = 8, p-value = 0.7475

$Skewness
        Skewness only (multivariate)
Chi-squared = 1.7761, df = 4, p-value = 0.7769

$Kurtosis
        Kurtosis only (multivariate)
Chi-squared = 3.3179, df = 4, p-value = 0.5061

• For testing the lack of serial correlation in the residuals of a VAR(p), a Portmanteau test and the LM test proposed by Breusch & Godfrey are implemented in the function serial.test().

> var2c.pt.asy <- serial.test(var.2c, lags.pt = 16, type = "PT.asymptotic")
> var2c.pt.asy
        Portmanteau Test (asymptotic)
Chi-squared = 205.3538, df = 224, p-value = 0.8092
> var2c.pt.adj <- serial.test(var.2c, lags.pt = 16, type = "PT.adjusted")
> var2c.pt.adj
        Portmanteau Test (adjusted)
Chi-squared = 231.5907, df = 224, p-value = 0.3497

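• The Breusch-Godfrey LM statistic (see Breusch 1978, Godfrey 1978) is based upon the following auxiliary regression of the residuals on the lagged endogenous variables, the deterministic terms and the lagged residuals:

$$\hat{u}_t = A_1 y_{t-1} + \cdots + A_p y_{t-p} + C D_t + B_1 \hat{u}_{t-1} + \cdots + B_h \hat{u}_{t-h} + \varepsilon_t$$

with null hypothesis $H_0: B_1 = \cdots = B_h = 0$.

> var2c.BG <- serial.test(var.2c, lags.pt = 16, type = "BG")
> var2c.BG
        Breusch-Godfrey LM test
Chi-squared = 92.6282, df = 80, p-value = 0.1581
> var2c.ES <- serial.test(var.2c, lags.pt = 16, type = "ES")
> var2c.ES
        Edgerton-Shukur F test
F statistic = 1.1186, df1 = 80, df2 = 199, p-value = 0.2648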

• The stability of the regression relationships in a VAR(p) can be assessed with the function stability(). An empirical fluctuation process is estimated for each regression by passing the function’s arguments to the efp() function contained in the package strucchange.

> args(stability)
function (x, type = c("Rec-CUSUM", "OLS-CUSUM", "Rec-MOSUM", "OLS-MOSUM",
    "RE", "ME", "Score-CUSUM", "Score-MOSUM", "fluctuation"), h = 0.15,
    dynamic = FALSE, rescale = TRUE)
NULL
> var2c.stab <- stability(var.2c, type = "OLS-CUSUM")
> names(var2c.stab)
[1] "stability" "names"     "K"

A predict method for objects with class attribute varest is available. The n.ahead forecasts are computed recursively for the estimated VAR, for horizons h = 1, 2, …, n.ahead:

> var.f10 <- predict(var.2c, n.ahead = 10, ci = 0.95)
> names(var.f10)
[1] "fcst"     "endog"    "model"    "exo.fcst"
> class(var.f10)
[1] "varprd"
> plot(var.f10)
> fanchart(var.f10)
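The recursive idea behind such forecasts can be sketched for the simplest case, a VAR(1) with intercept, where each forecast is fed back in to produce the next one. This is an illustration in base R, not the internals of predict() in vars. The helper name `var1_forecast`, the intercept `c0` and the last observation `y_last` are assumptions.

```r
# Recursive point forecasts for a VAR(1): y_hat(h) = c0 + Phi %*% y_hat(h - 1)
var1_forecast <- function(c0, Phi, y_last, n_ahead) {
  fc <- matrix(NA_real_, nrow = n_ahead, ncol = length(y_last))
  prev <- y_last
  for (h in seq_len(n_ahead)) {
    prev <- as.vector(c0 + Phi %*% prev)   # one step ahead from the previous forecast
    fc[h, ] <- prev
  }
  fc
}

Phi <- matrix(c(1.1, 0.6, -0.3, 0.2), 2, 2)   # stationary VAR(1) coefficient matrix
c0 <- c(1, 2)                                  # hypothetical intercept
y_last <- c(5, -1)                             # hypothetical last observation
fc <- var1_forecast(c0, Phi, y_last, n_ahead = 200)

# For a stationary VAR(1) the forecasts converge to the mean (I - Phi)^{-1} c0
mu <- solve(diag(2) - Phi, c0)
```

Comparing the last row of `fc` with `mu` shows the usual behaviour of forecasts from a stable model: they revert to the unconditional mean as the horizon grows.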

A little bit of history

• The vector autoregression (VAR) model is one of the most successful and versatile models for the analysis of multivariate time series.
• Made famous by Chris Sims’s paper “Macroeconomics and Reality,” Econometrica, 1980.
• It has proven to be especially useful for describing the dynamic and causal behavior of economic and financial time series and for forecasting.
• It often provides superior forecasts to those from univariate time series models.

GRANGER CAUSALITY

• In time series analysis, we would sometimes like to know whether changes in one variable have an impact on changes in other variables.
• To investigate this phenomenon more rigorously, we use the Granger causality test (Granger, 1969).

GRANGER CAUSALITY

• In principle, the concept is as follows: if X causes Y, then changes in X happen first and are followed by changes in Y.

GRANGER CAUSALITY

• If X causes Y, there are two conditions to be satisfied:
  1. X can help in predicting Y: the regression of Y on X has a large R².
  2. Y cannot help in predicting X.

GRANGER CAUSALITY

• In most regressions, it is very hard to discuss causality. For instance, the significance of the coefficient β in the regression

$$y_i = \beta x_i + \varepsilon_i$$

only tells us about the ‘co-occurrence’ of x and y, not that x causes y.
• In other words, the regression usually only tells us that there is some ‘relationship’ between x and y; it does not tell us the nature of the relationship, such as whether x causes y or y causes x.

GRANGER CAUSALITY

• One good thing about time series vector autoregression is that we can test ‘causality’ in some sense. This test was first proposed by Granger (1969), and therefore we refer to it as Granger causality.
• We will restrict our discussion to a system of two variables, x and y. y is said to Granger-cause x if current or lagged values of y help to predict future values of x. On the other hand, y fails to Granger-cause x if, for all s > 0, the mean squared error of a forecast of $x_{t+s}$ based on $(x_t, x_{t-1}, \ldots)$ is the same as that of a forecast based on both $(y_t, y_{t-1}, \ldots)$ and $(x_t, x_{t-1}, \ldots)$.

GRANGER CAUSALITY

• If we restrict ourselves to linear functions, y fails to Granger-cause x if

$$MSE[\hat{E}(x_{t+s} \mid x_t, x_{t-1}, \ldots)] = MSE[\hat{E}(x_{t+s} \mid x_t, x_{t-1}, \ldots, y_t, y_{t-1}, \ldots)]$$

• Equivalently, we can say that x is exogenous in the time series sense with respect to y, or that y is not linearly informative about future x.

GRANGER CAUSALITY

• A variable X is said to Granger cause another variable Y, if Y can be better predicted from the past of X and Y together than the past of Y alone, other relevant information being used in the prediction (Pierce, 1977).

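GRANGER CAUSALITY

• In the VAR equation, the case in which y fails to Granger-cause x implies a lower triangular coefficient matrix:

$$\begin{bmatrix} x_t \\ y_t \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} + \begin{bmatrix} \phi_{11}^{1} & 0 \\ \phi_{21}^{1} & \phi_{22}^{1} \end{bmatrix} \begin{bmatrix} x_{t-1} \\ y_{t-1} \end{bmatrix} + \cdots + \begin{bmatrix} \phi_{11}^{p} & 0 \\ \phi_{21}^{p} & \phi_{22}^{p} \end{bmatrix} \begin{bmatrix} x_{t-p} \\ y_{t-p} \end{bmatrix} + \begin{bmatrix} a_{1t} \\ a_{2t} \end{bmatrix}$$

Or, if we use the MA representation,

$$\begin{bmatrix} x_t \\ y_t \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} + \begin{bmatrix} \psi_{11}(B) & 0 \\ \psi_{21}(B) & \psi_{22}(B) \end{bmatrix} \begin{bmatrix} a_{1t} \\ a_{2t} \end{bmatrix}$$

where $\psi_{ij}(B) = \psi_{ij}^{0} + \psi_{ij}^{1} B + \psi_{ij}^{2} B^2 + \cdots$, with $\psi_{11}^{0} = \psi_{22}^{0} = 1$ and $\psi_{21}^{0} = 0$.

GRANGER CAUSALITY

• Consider a linear projection of $y_t$ on past, present and future x’s,

$$y_t = c + \sum_{j=0}^{\infty} b_j x_{t-j} + \sum_{j=1}^{\infty} d_j x_{t+j} + e_t$$

where $E(e_t x_\tau) = 0$ for all t and τ. Then y fails to Granger-cause x iff $d_j = 0$ for j = 1, 2, ….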

TESTING GRANGER CAUSALITY

Procedure
1) Check that both series are stationary in mean, variance and covariance (if necessary, transform the data via logs or differences to ensure this).
2) Estimate AR(p) models for each series, where p is large enough to ensure white noise residuals. F tests and other criteria (e.g. BIC) can be used to establish the maximum lag p that is needed.
3) Re-estimate both models, now including all the lags of the other variable.
4) Use F tests to determine whether, after controlling for past Y, past values of X can improve forecasts of Y (and vice versa).

TEST OUTCOMES

1. X Granger causes Y but Y does not Granger cause X
2. Y Granger causes X but X does not Granger cause Y
3. X Granger causes Y and Y Granger causes X (i.e., there is a feedback system)
4. X does not Granger cause Y and Y does not Granger cause X

TESTING GRANGER CAUSALITY

• The simplest test is to estimate the regression

$$x_t = c_1 + \sum_{i=1}^{p} \alpha_i x_{t-i} + \sum_{j=1}^{p} \beta_j y_{t-j} + u_t$$

using OLS and then conduct an F-test of the null hypothesis

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_p = 0.$$

TESTING GRANGER CAUSALITY

2. Run the following regression and calculate its RSS (full model):

$$x_t = c_1 + \sum_{i=1}^{p} \alpha_i x_{t-i} + \sum_{j=1}^{p} \beta_j y_{t-j} + u_t$$

3. Run the following limited regression and calculate its RSS (restricted model):

$$x_t = c_1 + \sum_{i=1}^{p} \alpha_i x_{t-i} + u_t$$

TESTING GRANGER CAUSALITY

4. Do the following F-test using the RSS values obtained in stages 2 and 3:

$$F = \frac{n - k}{p} \cdot \frac{RSS_{restricted} - RSS_{full}}{RSS_{full}}$$

n: number of observations
k: number of parameters in the full model
p: number of restrictions, i.e. the difference in the number of parameters between the full and the restricted model

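TESTING GRANGER CAUSALITY

5. If H0 is rejected, then Y Granger-causes X, since the lags of y improve the prediction of x in the regression above.

• With the roles of X and Y exchanged, the same technique can be used to investigate whether or not X Granger-causes Y.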
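The F-test steps above can be sketched in base R with lm(). The data are simulated so that x drives y but not the reverse; the helper names `lag_mat` and `granger_F`, the lag order p = 2 and all simulation parameters are assumptions made for this illustration.

```r
# Simulated (hypothetical) data: x_{t-1} enters the equation for y, but not vice versa
set.seed(42)
n_obs <- 300
x <- numeric(n_obs); y <- numeric(n_obs)
for (t in 2:n_obs) {
  x[t] <- 0.5 * x[t - 1] + rnorm(1)
  y[t] <- 0.3 * y[t - 1] + 0.6 * x[t - 1] + rnorm(1)
}

# Columns are z_{t-1}, ..., z_{t-p}, aligned with t = p + 1, ..., n
lag_mat <- function(z, p) sapply(1:p, function(j) z[(p + 1 - j):(length(z) - j)])

# F-test of H0: lags of `other` do not help to predict `dep`
granger_F <- function(dep, other, p) {
  Y <- dep[(p + 1):length(dep)]
  own <- lag_mat(dep, p)
  oth <- lag_mat(other, p)
  rss_full  <- sum(resid(lm(Y ~ own + oth))^2)   # full model
  rss_restr <- sum(resid(lm(Y ~ own))^2)         # restricted model
  n <- length(Y)                                 # usable observations
  k <- 2 * p + 1                                 # parameters in the full model
  Fstat <- ((n - k) / p) * (rss_restr - rss_full) / rss_full
  c(F = Fstat, p.value = pf(Fstat, p, n - k, lower.tail = FALSE))
}

p <- 2
x_to_y <- granger_F(y, x, p)   # does x Granger-cause y?
y_to_x <- granger_F(x, y, p)   # does y Granger-cause x?
```

The statistic computed by hand agrees with the one reported by `anova()` for the nested pair of `lm` fits, which is a convenient cross-check in practice.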

Example of the Usage of Granger Test

World Oil Price and Growth of US Economy
• Does an increase in the world oil price influence the growth of the US economy, or does the growth of the US economy affect the world oil price?
• James Hamilton did this study using the following model:

$$Z_t = a_0 + a_1 Z_{t-1} + \cdots + a_m Z_{t-m} + b_1 X_{t-1} + \cdots + b_m X_{t-m} + \varepsilon_t$$

$Z_t = \Delta P_t$: change in the world price of oil
$X_t = \log(GNP_t / GNP_{t-1})$

World Oil Price and Growth of US Economy

• There are two hypotheses that need to be tested:

(i) H0: Growth of the US economy does not influence the world oil price

Full:

$$Z_t = a_0 + a_1 Z_{t-1} + \cdots + a_m Z_{t-m} + b_1 X_{t-1} + \cdots + b_m X_{t-m} + \varepsilon_t$$

Restricted:

$$Z_t = a_0 + a_1 Z_{t-1} + \cdots + a_m Z_{t-m} + \varepsilon_t$$

World Oil Price and Growth of US Economy

(ii) H0: The world oil price does not influence the growth of the US economy

Full:

$$X_t = a_0 + a_1 X_{t-1} + \cdots + a_m X_{t-m} + b_1 Z_{t-1} + \cdots + b_m Z_{t-m} + \varepsilon_t$$

Restricted:

$$X_t = a_0 + a_1 X_{t-1} + \cdots + a_m X_{t-m} + \varepsilon_t$$

World Oil Price and Growth of US Economy

• F-test results:
  1. The hypothesis that the world oil price does not influence the US economy is rejected. It means that the world oil price does influence the US economy.
  2. The hypothesis that the US economy does not affect the world oil price is not rejected. It means that there is no evidence that the US economy has an effect on the world oil price.

World Oil Price and Growth of US Economy

• Summary of James Hamilton’s results

Null hypothesis (H0)                        (I) F(4,86)   (II) F(8,74)
I.  Economic growth ≠→ World oil price          0.58           0.71
II. World oil price ≠→ Economic growth          5.55           3.28

World Oil Price and Growth of US Economy

• Remark: The first experiment used the data for 1949-1972 (95 observations) and m = 4, while the second experiment used the data for 1950-1972 (91 observations) and m = 8.
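The degrees of freedom in the table follow from the F-test formula: the numerator has m restrictions and the denominator has n − k degrees of freedom, with k = 2m + 1 parameters in the full model. The base R sketch below reproduces them and computes approximate p-values for the reported statistics. The helper name `f_df` is an assumption, and the p-values are computed here for illustration rather than taken from Hamilton's study.

```r
# Degrees of freedom of the Granger F-test with m lags of each variable
f_df <- function(n, m) c(df1 = m, df2 = n - (2 * m + 1))

df_I  <- f_df(95, 4)   # first experiment:  F(4, 86)
df_II <- f_df(91, 8)   # second experiment: F(8, 74)

# Upper-tail p-values for the reported F statistics
# rows: (I) and (II); columns: the two null hypotheses
p_vals <- rbind(
  I  = pf(c(growth_to_oil = 0.58, oil_to_growth = 5.55),
          df_I["df1"], df_I["df2"], lower.tail = FALSE),
  II = pf(c(growth_to_oil = 0.71, oil_to_growth = 3.28),
          df_II["df1"], df_II["df2"], lower.tail = FALSE)
)
round(p_vals, 4)
```

In both experiments the p-value for "oil price does not influence growth" falls below 0.01, while the p-value for the reverse direction is far above conventional significance levels, in line with the conclusions stated above.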

Canadian example

The function causality() is now applied to investigate whether the real wage and productivity are causal to employment and unemployment.

> causality(var.2c, cause = c("rw", "prod"))
$Granger
        Granger causality H0: prod rw do not Granger-cause e U
F-Test = 3.4529, df1 = 8, df2 = 292, p-value = 0.0008086

$Instant
        H0: No instantaneous causality between: prod rw and e U
data:  VAR object var.2c
Chi-squared = 2.5822, df = 4, p-value = 0.63

The null hypothesis of no Granger causality from the real wage and labour productivity to employment and unemployment must be rejected, whereas the null hypothesis of no instantaneous causality cannot be rejected. This test outcome is economically plausible, given the frictions observed in labour markets. Instantaneous causality arises when the current values of the variables are included in the information set.

Chicken vs. Egg

• This causality test can also be used to examine which comes first: the chicken or the egg. More specifically, the test can be used to check whether the existence of eggs causes the existence of chickens or vice versa.
• Thurman and Fisher did this study using yearly data on the chicken population and egg production in the US from 1930 to 1983.
• The results:
  1. Egg causes the chicken.
  2. There is no evidence that chicken causes egg.

Chicken vs. Egg

• Remark: The hypothesis that egg has no effect on the chicken population is rejected, while the hypothesis that chicken has no effect on egg is not rejected. Why?

GRANGER CAUSALITY

• We have to be aware that Granger causality is not the same as what we usually mean by causality. For instance, even if x1 does not cause x2, it may still help to predict x2, and thus Granger-cause x2, if changes in x1 precede those in x2 for some reason.
• A naive example: we observe that dragonflies fly much lower before a rain storm, due to the lower air pressure. We know that dragonflies do not cause a rain storm, but their behavior does help to predict one, and thus Granger-causes a rain storm.

Acknowledgement

• Our thanks go to fellow colleagues who have made their lecture notes available on the internet!