
Vector Autoregressions (VAR) and Impulse Response Functions (IRF)

Introduction

• We want to explore the causal relationship between pairs of time-series variables
• We will discuss the vector error correction (VEC) and vector autoregressive (VAR) models

Terminology

• Univariate analysis examines a single data series
• Bivariate analysis examines a pair of series
• The term vector indicates that we are considering a number of series: two, three, or more
• The term ''vector'' is a generalization of the univariate and bivariate cases

A VAR

Consider the system of equations:

$$y_t = \beta_{10} + \beta_{11} y_{t-1} + \alpha_{11} x_{t-1} + u_{1t}$$
$$x_t = \beta_{20} + \beta_{21} x_{t-1} + \alpha_{21} y_{t-1} + u_{2t}$$

• Together the equations constitute a system known as a vector autoregression (VAR)
• In this example, since the maximum lag is of order 1, we have a VAR(1)
• In our case, hotel demand (y) is influenced by hotel supply (x), and vice versa
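To make this concrete, a minimal Stata sketch fitting such a VAR(1) to the hotel series used later in these slides (variable names as they appear in the later examples):

    * Fit a VAR(1) in hotel demand and hotel supply
    var tot_hotel_nights hotel_beds, lags(1)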

Lag Length in VAR

• When estimating VARs or conducting 'Granger causality' tests, the results can be sensitive to the lag length of the VAR

• Sometimes the lag length is matched to the frequency of the data, so that quarterly data has 4 lags, monthly data has 12 lags, etc.

• A more rigorous way to determine the optimal lag length is to use the Akaike or Schwarz-Bayesian information criteria.

• However, the estimates tend to be sensitive to the presence of autocorrelation. In this case, following the use of information criteria, if there is any evidence of autocorrelation, further lags are added, above the number indicated by the information criteria, until the autocorrelation is removed.
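A minimal sketch of this lag-selection workflow in Stata, using the hotel series from the later slides (the maxlag and mlag values are illustrative):

    * Compare information criteria (AIC, HQIC, SBIC) across lag lengths
    varsoc tot_hotel_nights hotel_beds, maxlag(8)

    * Fit the VAR at the chosen lag length
    var tot_hotel_nights hotel_beds, lags(1/2)

    * LM test for residual autocorrelation; add lags if it rejects
    varlmar, mlag(4)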

Information Criteria

• The main information criteria are the Schwarz-Bayesian information criterion (SBIC) and the Akaike information criterion (AIC).

• They operate on the basis that there are two competing effects of adding more lags to a model: more lags will reduce the RSS, but also mean a loss of degrees of freedom (the penalty for adding more lags).

• The aim is to minimise the information criterion: adding an extra lag will only benefit the model if the resulting reduction in the RSS outweighs the loss of degrees of freedom.

• In general the Schwarz-Bayesian criterion (SBIC) has a harsher penalty term than the Akaike criterion (AIC), which leads it to indicate that a more parsimonious model is best.

The AIC and SBIC

• The two can be expressed as:

$$\text{AIC} = \ln(\hat{\sigma}^2) + \frac{2k}{T}$$

$$\text{SBIC} = \ln(\hat{\sigma}^2) + \frac{k}{T}\ln T$$

where $\hat{\sigma}^2$ is the residual variance, $T$ the sample size, and $k$ the number of parameters.
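A worked illustration of the harsher SBIC penalty, with made-up numbers ($T = 100$): suppose adding two lags lowers the residual variance from $\hat{\sigma}^2 = 2.0$ (with $k = 3$) to $\hat{\sigma}^2 = 1.9$ (with $k = 5$). Then

$$\text{AIC}: \quad \ln 2.0 + \tfrac{6}{100} \approx 0.753 \quad \text{vs.} \quad \ln 1.9 + \tfrac{10}{100} \approx 0.742$$
$$\text{SBIC}: \quad \ln 2.0 + \tfrac{3}{100}\ln 100 \approx 0.831 \quad \text{vs.} \quad \ln 1.9 + \tfrac{5}{100}\ln 100 \approx 0.872$$

so the AIC favours the larger model while the SBIC favours the parsimonious one.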

Multivariate Information Criteria

• The multivariate version of the Akaike information criterion is similar to the univariate one:

MAIC  log ˆ  2k  / T ( Akaike ) ˆ  Variance  Co var iance matrix of the residuals. (This gives the variances on the main diagonal and covariance s between the residuals off the main diagonal of the matrix) T  number of observatio ns k   total number of regressors in all equations Multivariate SBIC

• The multivariate version of the SBIC is:

k MSBIC  log ˆ  log(T ) T ˆ Variance  Co variance matrix of the residuals T  number of observations k  total number of regressors in all equations The best criterion

• In general there is no agreement on which criterion is best (Diebold, for instance, recommends the SBIC).
• The Schwarz-Bayesian criterion is strongly consistent but not efficient.
• The Akaike criterion is not consistent, generally producing too large a model, but is more efficient than the Schwarz-Bayesian criterion.

VAR Models

• If we assume a 2 variable model, with a single lag, we can write this VAR model as:

$$y_{1t} = \beta_{10} + \beta_{11} y_{1,t-1} + \alpha_{11} y_{2,t-1} + u_{1t}$$
$$y_{2t} = \beta_{20} + \beta_{21} y_{2,t-1} + \alpha_{21} y_{1,t-1} + u_{2t}$$

which can be rewritten as:

$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix} + \begin{pmatrix} \beta_{11} & \alpha_{11} \\ \alpha_{21} & \beta_{21} \end{pmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$

or, compactly,

$$\underset{g \times 1}{y_t} = \underset{g \times 1}{\beta_0} + \underset{g \times g}{\beta_1}\,\underset{g \times 1}{y_{t-1}} + \underset{g \times 1}{u_t} \qquad \text{(for a system of } g \text{ variables)}$$

Criticisms of Causality Tests

Granger causality tests, much used in VAR modelling, nevertheless do not explain some aspects of the VAR:
• They do not give the sign of the effect; we do not know if it is positive or negative
• They do not show how long the effect lasts for
• They do not provide evidence of whether the effect is direct or indirect
In Stata, such a test can be run after fitting the VAR, as sketched below.
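A minimal sketch, assuming the hotel series used in the later slides:

    * Fit the VAR, then run Wald tests of Granger causality for each equation
    var tot_hotel_nights hotel_beds, lags(1/2)
    vargranger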

Impulse Response Functions

• These trace out the effect on the dependent variables in the VAR of shocks to all the variables in the VAR.
• Therefore in a system of 2 variables there are 4 impulse response functions, and with 3 variables there are 9.
• The shock occurs through the error term and affects the dependent variable over time.
• In effect the VAR is expressed as a vector moving average (VMA) model; as in the univariate case previously, the shocks to the error terms can then be traced with regard to their impact on the dependent variable.
• If the time path of the impulse response function goes to 0 over time, the system of equations is stable; however, the responses can explode if the system is unstable. Stata provides a direct check of this condition, sketched below.
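A minimal stability check in Stata (varstable examines whether all eigenvalues of the companion matrix lie inside the unit circle):

    * Fit the VAR, then check the eigenvalue stability condition
    var tot_hotel_nights hotel_beds, lags(1/2)
    varstable, graph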

• Given:

y t  A1 y t 1  u t

 1 2  Where : A1      1  2 

Given a unit shock to y1t at time t  0

u 10  1  y 0       u 20   0  An Impulse Response Function
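A minimal sketch tracing this shock numerically in Mata (the coefficient values are made up; any stable $A_1$ behaves similarly):

    * Iterate y_t = A1 * y_{t-1} forward from a unit shock to the first variable
    mata:
    A1 = (0.6, 0.2 \ 0.1, 0.5)    // hypothetical stable VAR(1) coefficients
    y  = (1 \ 0)                   // unit shock to y1 at t = 0
    for (t = 1; t <= 10; t++) {
        y = A1 * y                 // response at horizon t
        printf("t=%g  y1=%8.4f  y2=%8.4f\n", t, y[1], y[2])
    }
    end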

An Impulse Response Function

[Figure: time path of an impulse response; the shock to y starts at 1 at time 0 and decays towards 0 over horizons 0 to 10; axes: y against Time.]

In our case: demand & supply; STATA
https://www.stata.com/manuals13/tsirf.pdf

• An IRF measures the effect of a shock to an endogenous variable on itself or on another endogenous variable (Lütkepohl (2005); Hamilton (1994)).
• A dynamic-multiplier function, or transfer function, measures the impact of a unit increase in an exogenous variable on the endogenous variables over time; irf create estimates simple and cumulative dynamic-multiplier functions after var.
• The forecast-error variance decomposition (FEVD) measures the fraction of the forecast-error variance of an endogenous variable that can be attributed to orthogonalized shocks to itself or to another endogenous variable.
• Of the many types of FEVDs, irf create estimates the two most important: Cholesky and structural. A FEVD sketch is shown below.
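A minimal FEVD sketch, assuming the order1 IRF results created in the slides that follow:

    * Share of demand's forecast-error variance due to supply shocks
    irf table fevd, irf(order1) impulse(hotel_beds) response(tot_hotel_nights)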

In our case: demand & supply; STATA
https://www.stata.com/manuals13/tsirf.pdf

var tot_hotel_nights hotel_beds, lags(1/2)
irf create order1, step(40) set(myirf1) replace
irf graph oirf, impulse(hotel_beds) response(tot_hotel_nights)

[Figure: orthogonalized IRF (order1, impulse hotel_beds, response tot_hotel_nights) with 95% CI over 40 steps; graphs by irfname, impulse variable, and response variable.]

The orthogonalized impulse on SUPPLY influences DEMAND, reaching its maximum level after a few years, and then it fades away rather slowly.

In our case: demand & supply; STATA
https://www.stata.com/manuals13/tsirf.pdf

Below we use the same estimated var but use a different Cholesky ordering to create a second set of IRF results, which we will save as order2 in the same file, and then we will graph both results
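The slide does not show the order2 commands themselves; by analogy with the order3/order4 slide later in the deck, they were presumably along these lines:

    irf create order2, step(40) order(hotel_beds tot_hotel_nights)
    irf graph oirf, irf(order1 order2) impulse(hotel_beds) response(tot_hotel_nights)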

[Figure: orthogonalized IRFs with 95% CIs over 40 steps, in two panels: (order1, hotel_beds, tot_hotel_nights) and (order2, hotel_beds, tot_hotel_nights); graphs by irfname, impulse variable, and response variable.]

Even with a different identification scheme, the DEMAND behaviour is rather similar.

In our case: demand & supply; STATA
https://www.stata.com/manuals13/tsirf.pdf

var tot_hotel_nights hotel_beds, lags(1/2)
irf create order1, step(40) set(myirf1) replace
irf table oirf, irf(order1 order2) impulse(hotel_beds) response(tot_hotel_nights)

The table provides more insight into the effect exerted by a SUPPLY shock on DEMAND; as we can see, it starts picking up at the 6th step.

In our case: demand & supply; STATA

https://www.stata.com/manuals13/tsirf.pdf

var tot_hotel_nights hotel_beds, lags(1/2)
irf create order1, step(40) set(myirf1) replace
irf graph oirf, impulse(tot_hotel_nights) response(hotel_beds)

The irf create command above created the file myirf1.irf and put one set of results in it, named order1.

[Figure: orthogonalized IRF (order1, impulse tot_hotel_nights, response hotel_beds) with 95% CI over 40 steps; graphs by irfname, impulse variable, and response variable.]

The orthogonalized impulse on DEMAND influences SUPPLY, reaching its maximum level after a few years, and then it fades away, although rather slowly.

In our case: demand & supply; STATA

https://www.stata.com/manuals13/tsirf.pdf

var tot_hotel_nights hotel_beds, lags(1/2)
irf create order1, step(40) set(myirf1) replace
irf create order4, step(40) order(hotel_beds tot_hotel_nights)
irf graph oirf, irf(order3 order4) impulse(tot_hotel_nights) response(hotel_beds)

Below we use the same estimated var but use a different Cholesky ordering to create a second set of IRF results, which we will save as order4 in the same file, and then we will graph both results.

[Figure: orthogonalized IRFs with 95% CIs over 40 steps, in two panels: (order3, tot_hotel_nights, hotel_beds) and (order4, tot_hotel_nights, hotel_beds); graphs by irfname, impulse variable, and response variable.]

Even with a different identification scheme, the SUPPLY response is rather similar.

In our case: demand & supply; STATA
https://www.stata.com/manuals13/tsirf.pdf

var tot_hotel_nights hotel_beds, lags(1/2)
irf create order1, step(40) set(myirf1) replace
irf table oirf, irf(order3 order4) impulse(tot_hotel_nights) response(hotel_beds)

The table provides more insight into the effect exerted by a DEMAND shock on SUPPLY; as we can see, it starts picking up at the 6th step.

VARs and SUR

• In general a VAR has the same lag length in all of the individual equations.
• It is possible, however, to have different lag lengths for different equations, although this involves another estimation method.
• When lag lengths differ, the seemingly unrelated regression (SUR) approach can be used to estimate the equations; this is often termed a 'near-VAR'. A sketch follows below.
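A minimal sketch of a near-VAR estimated by SUR in Stata, with a hypothetical specification in which the two equations carry different lag lengths (the L(1/2). and L(1/4). lag ranges are illustrative):

    * Seemingly unrelated regressions with unequal lag lengths per equation
    sureg (tot_hotel_nights L(1/2).tot_hotel_nights L.hotel_beds) ///
          (hotel_beds L.tot_hotel_nights L(1/4).hotel_beds)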

Alternative VARs

• It is possible to include contemporaneous terms in a VAR; however, in this case the VAR is not identified.
• It is also possible to include exogenous variables in the VAR, although they do not have separate equations in which they act as a dependent variable; they simply act as extra explanatory variables for all the equations in the VAR (see the sketch after this list).
• It is worth noting that impulse response functions can also come with confidence intervals to determine whether they are significant; this is routinely done by most computer programmes.
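In Stata, exogenous variables enter through the exog() option of var; a minimal sketch, where events_dummy is a hypothetical exogenous regressor:

    * VAR with an exogenous variable added to every equation
    var tot_hotel_nights hotel_beds, lags(1/2) exog(events_dummy)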

Criticisms of the VAR

• Many argue that the VAR approach is lacking in theory.
• There is much debate on how the lag lengths should be determined.
• It is possible to end up with a model including numerous explanatory variables, with different signs, which has implications for degrees of freedom.
• Many of the parameters will be insignificant, which affects the efficiency of a regression.
• There is always a potential for multicollinearity with many lags of the same variable.

Stationarity and VARs

• Should a VAR include only stationary variables in order to be valid?
• Sims argues that even if the variables are not stationary, they should not be first-differenced.
• However, others argue that a better approach is to carry out a multivariate test for cointegration and then use first-differenced variables together with the error correction term. Unit-root pre-testing of each series is sketched below.
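A minimal sketch of unit-root pre-testing in Stata (the augmented Dickey-Fuller lag length of 2 is illustrative):

    * Test each series for a unit root before deciding how to specify the VAR
    dfuller tot_hotel_nights, lags(2)
    dfuller hotel_beds, lags(2)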

VECMs

• Vector error correction models (VECMs) are the basic VAR with an error correction term incorporated into the model.
• The reason for the error correction term is the same as with the standard error correction model: it measures any movement away from the long-run equilibrium.
• These are often used as part of a multivariate test for cointegration, such as the Johansen ML test, as in the sketch below.
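A minimal sketch of this workflow in Stata, assuming the same hotel series (the rank(1) choice is illustrative):

    * Johansen test for the number of cointegrating relations
    vecrank tot_hotel_nights hotel_beds, lags(2)

    * Fit a VECM with one cointegrating relation
    vec tot_hotel_nights hotel_beds, lags(2) rank(1)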

Conclusion

• VARs have a number of important uses, particularly causality tests and forecasting (see the forecasting sketch after this list).
• To assess the effects of any shock to the system, we need to use impulse response functions and variance decomposition.
• VECMs are an alternative, as they allow first-differenced variables and an error correction term.
• The VAR has a number of weaknesses, most importantly its lack of theoretical foundations.
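A minimal forecasting sketch in Stata (the f_ prefix and the 8-step horizon are arbitrary choices):

    * Dynamic forecasts from the fitted VAR
    var tot_hotel_nights hotel_beds, lags(1/2)
    fcast compute f_, step(8)
    fcast graph f_tot_hotel_nights f_hotel_beds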