
COMPARING THE FORECASTING PERFORMANCE OF VAR, BVAR AND U-MIDAS

Submitted by Alessio Belloni

A thesis submitted to the Department of Statistics in partial fulfillment of the requirements for a one-year Master's degree in Statistics in the Faculty of Social Sciences

Supervisors: Sebastian Ankargren & Mattias Nordin

Spring, 2017

ABSTRACT

This paper compares the forecasting performance of the widely used VAR and Bayesian VAR models with the unrestricted MIDAS regression. The models are tested on a real-time macroeconomic data set ranging from 2000 to 2015. The variables have mixed frequencies: predictions are made for GDP, using the economic tendency indicator, unemployment and inflation as predictors. The baseline model of this analysis is a simple VAR; while it offers great flexibility, this model risks overfitting the data and, as a consequence, making unreliable predictions. The Villani Bayesian VAR is meant to solve this problem by introducing long-run beliefs about the data structure and the steady-state unconditional mean of each series. When facing mixed frequency data, both these approaches aggregate at the lower frequency and thereby discard useful information. In this scenario, the unrestricted MIDAS model addresses the problem without losing high frequency information. The results show that both BVAR and U-MIDAS outperform the VAR at every horizon, while there is no absolute winner between BVAR and U-MIDAS. Evidence suggests that U-MIDAS is superior at short horizons, specifically up to the 5th step ahead, which corresponds to one year and a quarter.

Contents

1 Introduction

2 Theory
  2.1 VAR
  2.2 Bayesian VAR
    2.2.1 Steady State Prior
  2.3 Mixed Data Sampling

3 Data
  3.1 Vintages
  3.2 Economic Tendency Indicator
  3.3 GDP Growth
  3.4 Underlying Inflation Rate
  3.5 Unemployment

4 Empirical Analysis
  4.1 VAR(4)
  4.2 Villani Steady State
  4.3 U-MIDAS
  4.4 Root Mean Square Error and Mean Absolute Error
  4.5 Diebold-Mariano Test

5 Results
  5.1 Accuracy Measure
  5.2 Diebold-Mariano Test

6 Conclusion

A Appendix

1 Introduction

Forecasting macroeconomic variables plays a vital role in policy makers' decisions concerning the future direction of the economy. When it comes to forecasting Swedish macroeconomic variables, one of the most successful and commonly used approaches is the vector autoregressive model (Karlsson, 2013).

The aim of this study is to compare the widely used vector autoregressive (VAR) and Bayesian vector autoregressive (BVAR) models with a relatively new approach referred to as Mixed Data Sampling (henceforth MIDAS), which is meant to address the possible loss of information when dealing with mixed data frequencies (Ghysels et al., 2007). This situation is commonly faced when dealing with macroeconomic and financial variables. The classic example is GDP, usually measured quarterly, while predictors such as unemployment, inflation and economic sentiment indicators are sampled monthly. For financial variables, the gap between frequencies is even wider, since they are quite often reported daily. The common approach when facing mixed data frequencies is to aggregate at the lower frequency, with a potential loss of information.

More specifically, the aim of this paper is to compare the relative efficiency, in forecasting Swedish GDP, of the MIDAS regression model against the VAR and BVAR models.

The VAR model, relatively simple and well established in the literature (Litterman, 1979), is considered the baseline approach of this analysis. As a next step, the Villani steady-state Bayesian VAR is implemented. This study shows how the latter deals with the over-parametrization issues of the VAR model (Doan et al., 1983; Litterman, 1986; Villani, 2009). Finally, the MIDAS regression, with its functional form, is presented as a way not to discard information when the data have a mixed frequency structure.

The variables included to forecast Swedish GDP are the Economic Tendency Indicator (ETI), unemployment and inflation. Forecasting accuracy is measured in terms of root mean square error (RMSE) and mean absolute error (MAE). The data set has a mixed frequency structure and spans from 2000 Q1 to 2015 Q3. The analysis is based on a real-time data set, meaning that we have vintages, or snapshots of the data at specific points in time.

The forecasting model is built on every vintage, whereas the last vintage is used to evaluate forecasting accuracy. This procedure is implemented because past vintages carry higher uncertainty than recent ones, which have been revised and adjusted. In this way, when modelling we are in a situation as close as possible to the one faced by forecasters, characterized by relatively high uncertainty, whereas when evaluating model accuracy we use the vintage with all the information available at that point in time, namely the last one.

The results are limited to the present data and model specification. For the empirical analysis, R is used as software, with BMR (Keith O'Hara, 2015) and midasr (Ghysels et al., 2016) as the packages for the VAR/BVAR and MIDAS models respectively. The structure of this paper is the following: Section 2 focuses on the theory behind the discussed models, Section 3 describes the data structure and variables, Section 4 presents the empirical analysis and Section 5 shows the results. Finally, the discussion and conclusion are in Section 6.

2 Theory

In the following section I briefly describe the theory behind the models I will be using. First in Section 2.1 I introduce the VAR model and its drawbacks, then in Section 2.2 I explain how the BVAR addresses some of these issues and finally in Section 2.3 I present the MIDAS model and the way it deals with mixed data frequencies.

2.1 VAR

The vector autoregressive model (VAR) is one of the most successful models for macroeconomic forecasting (Karlsson, 2013). Its widespread use was motivated by Sims (1980), and the success of VAR models stems from their simplicity, flexibility and ability to fit the data. These features come from the rich parametrization of VAR models, which brings the risk of overfitting the data and, as a consequence, of producing imprecise inference and large uncertainty when it comes to forecasting (Karlsson, 2013).

When we are not sure about the direction of the causality between variables, the VAR approach is to treat each variable symmetrically: in the bivariate case, we let x be affected by the past realizations of z and we let z be affected by the past realizations of x (Enders, 2015). Therefore, with one lag, we have two equations such that:

xt = b10 + b11xt−1 + γ11zt−1 + εxt (1)

zt = b20 + b21zt−1 + γ21xt−1 + εzt, (2)

where {x} and {z} are stationary processes and εx and εz are white noise disturbances. The model in Equations (1) and (2) is a first-order vector autoregressive process because the maximum lag length is 1. It is worth remarking that, for just two variables and one lag, the number of parameters to estimate is 6. As a consequence, when the number of variables and lags grows, the number of parameters to estimate increases drastically. This confirms what was stated at the beginning of this section regarding the large number of parameters of the VAR model. We will see how Bayesian statistics addresses this issue in the following section.

Because of the two-way causality, to estimate the parameters in Equations (1) and (2) with ordinary least squares (OLS) we rewrite the model in matrix notation, where for a VAR(1) the final form is the following:

Yt = A0 + A1Yt−1 + εt (3)

In the literature this is referred to as the VAR model in standard form (Enders, 2015). The general VAR(p) model with p lags can be rewritten as:

Y_t = A_0 + A_1 Y_{t-1} + ... + A_p Y_{t-p} + \varepsilon_t, \quad t = 1, ..., T \quad (4)

where Y_t denotes an (n × 1) vector of variables, A_0 is an (n × 1) vector of intercepts, the A_i are (n × n) matrices of coefficients and ε_t is an (n × 1) random error term centered around 0. The right hand side of Equation (4) contains only predetermined variables, therefore the model can be estimated equation by equation with OLS (Enders, 2015).

The way the VAR model is used to forecast is fairly straightforward. The one-step-ahead forecast is the expected value of Y_{t+1}, which in the notation of Equation (4) is E[Y_{t+1} | Y_t, Y_{t-1}, ...] = A_0 + A_1 Y_t + ... + A_p Y_{t-p+1}. The two-step-ahead forecast repeats the same procedure using t + 1 in the equation, and the same iterative procedure is applied for longer forecast horizons.

As mentioned before, VAR models might produce unreliable forecasts due to overparametrization, especially when high-order lags and several variables are included. One possible way to deal with this issue is presented in the next section.
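To make the iterative procedure concrete, the sketch below estimates a VAR(p) equation by equation with OLS and produces h-step-ahead forecasts by recursion. It is a minimal base-R illustration under the assumption that Y is a matrix of stationary series; the function names are hypothetical and this is not the code used in the thesis (which relies on the BMR package).

    # Minimal sketch (not the thesis code): VAR(p) estimated by OLS equation by
    # equation, and h-step-ahead forecasts obtained by iterating Equation (4).
    # 'Y' is assumed to be a (T x n) matrix of stationary series.
    var_ols <- function(Y, p) {
      Tn <- nrow(Y)
      X  <- cbind(1, do.call(cbind, lapply(1:p, function(l) Y[(p - l + 1):(Tn - l), ])))
      B  <- solve(crossprod(X), crossprod(X, Y[(p + 1):Tn, ]))  # OLS for all equations at once
      list(B = B, p = p, Y = Y)
    }

    var_forecast <- function(fit, h) {
      Y <- fit$Y
      for (i in 1:h) {
        recent <- Y[nrow(Y):(nrow(Y) - fit$p + 1), , drop = FALSE]  # newest observation first
        Y <- rbind(Y, as.numeric(c(1, as.vector(t(recent))) %*% fit$B))
      }
      tail(Y, h)  # the 1- to h-step-ahead point forecasts
    }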

2.2 Bayesian VAR

Essentially, the Bayesian approach deals with the overparametrization of the VAR model through the introduction of a prior, which contains information about the long-run properties of the data, independently of the short-run observed data. How Bayes' theorem combines these two pieces of information, i.e. prior beliefs and observed data, is presented below.

All the parameters, θ, are treated as random variables with their own probability distribution. The prior distribution of the parameters, π(θ), represents the researcher's beliefs and is independent of the observed data. As mentioned above, another key element of this approach is the distribution of the observed data conditional on the parameters, the likelihood:

L(Y_1, ..., Y_T | \theta) = \prod_{t=1}^{T} f(Y_t | Y_{t-1}, Y_{t-2}, ..., \theta) \quad (5)

This information is summarized in the posterior distribution, derived with the help of Bayes' theorem. The posterior combines the prior beliefs about the parameters with the observed data:

p(\theta | Y_1, ..., Y_T) = \frac{\pi(\theta) L(Y_1, ..., Y_T | \theta)}{p(Y_1, ..., Y_T)} \quad (6)

Central to Bayesian forecasting is the predictive distribution conditional on the observed data, p(Y_{T+1:T+H} | Y_T). This procedure is presented rigorously in Karlsson (2013). As a first step, we need the distribution of future observations conditional on observed data and parameters:

f(Y_{T+1:T+H} | Y_T, Y_{T-1}, ...; \theta) = \prod_{t=T+1}^{T+H} f(Y_t | Y_{t-1}, Y_{t-2}, ...; \theta) \quad (7)

Applying Bayes' rule to Equations (5) and (7) gives the central object of Bayesian forecasting, the predictive distribution conditional on the observed data:

p(Y_{T+1:T+H} | Y_1, ..., Y_T) = \frac{p(Y_{T+1:T+H}, Y_1, ..., Y_T)}{p(Y_1, ..., Y_T)} = \frac{\int f(Y_{T+1:T+H} | Y_1, ..., Y_T; \theta)\, \pi(\theta) L(Y_1, ..., Y_T | \theta)\, d\theta}{\int \pi(\theta) L(Y_1, ..., Y_T | \theta)\, d\theta} \quad (8)

By using Equation (6), Equation (8) can be rewritten as:

p(Y_{T+1:T+H} | Y_1, ..., Y_T) = \int f(Y_{T+1:T+H} | Y_1, ..., Y_T; \theta)\, p(\theta | Y_1, ..., Y_T)\, d\theta \quad (9)

As Karlsson (2013) remarks, the Bayesian forecast takes into account both the uncertainty about future observations and the uncertainty about the true value of the parameters θ. The forecasting procedure is illustrated in the next section.

2.2.1 Steady State Prior

One Bayesian approach to restraining the over-parametrized VAR model is the so-called Villani steady-state prior. This approach has proved particularly well suited to forecasting Swedish macro data (Villani, 2009). Besides Villani's paper, this model has been used for similar applications by Meredith and Österholm (2008), Gustafsson et al. (2015) and Ankargren et al. (2016). To some extent, this paper resembles what these authors did in their previous studies. The additional feature of the Villani approach is the possibility to specify prior beliefs about the unconditional mean, or steady-state value, of each series, a characteristic that economists typically have strong beliefs about (Villani, 2009).

In detail, the model is:

B(L)(Yt − µ) = ηt (10)

where B(L) is a lag polynomial of order p, η_t is an n × 1 vector of independent error terms with mean 0 and covariance matrix Σ, Y_t is an n × 1 vector of stationary variables and µ is the steady-state unconditional mean. The priors for the parameters used in the R package BMR (Keith O'Hara, 2015) are the following:

p(µ) = N(θµ, Ωµ) (11)

p(vec(B)|Σ) = N(θB, ΩB ⊗ Σ) (12)

p(\Sigma) = iW(\theta_\Sigma, v) \quad (13)

where iW denotes the inverse-Wishart distribution (see the BMR vignette, Keith O'Hara, 2015).

The means and variances for µ correspond to the long-run beliefs about the data, while θ_B is set to 0.9 on the first own lag if the variable is in levels, to 0 if the variable is in differences, and to 0 for all remaining coefficients. In addition, for the prior variance of vec(B), the Kronecker product complicates things. Given that the variance is V(B_j) = \frac{s_{jj}}{v - m - 1} \Omega_B, the diagonal elements of Ω_B are:

\Omega_{B,ii} = \left( \frac{\pi_1}{\ell^{\pi_3} s_r} \right)^2 \quad \text{for lag } \ell \text{ of variable } r, \; i = (\ell - 1)m + r \quad (14)

where v are the degrees of freedom from Equation (13), m is the number of variables, and s_{jj} and s_r are scaling factors that account for differences in the scale of the dependent and explanatory variables. In the literature, π_1 is referred to as the "overall tightness" and π_3 as the "lag decay". Finally, θ_Σ = (v − m − 1) diag(s_1^2, ..., s_m^2), where s_j^2 is the variance from a univariate autoregression for variable j, so that E(Σ) = diag(s_1^2, ..., s_m^2).

As mentioned at the beginning of this section, the predictive distribution contains all the information about future data points. Quite often it is impossible to derive this distribution analytically, hence a Markov chain Monte Carlo (MCMC) approach is used to simulate the distribution of interest. The Bayesian forecasting procedure begins by simulating Σ^{(1)} from the full conditional posterior Σ | B^{(0)}, µ^{(0)}, where the OLS estimates of the parameters are usually chosen as starting values. Once we have Σ^{(1)}, B^{(1)} is simulated from B | Σ^{(1)}, µ^{(0)}. The generated Σ^{(1)} and B^{(1)} are then used to simulate µ^{(1)} from µ | Σ^{(1)}, B^{(1)}. Finally, H error terms are generated from N(0, Σ^{(1)}) and the simulated predictive distribution is obtained recursively for t + 1 to t + H in the following way:

\tilde{y}'_{T+h} = \mu^{(1)\prime} + \sum_{j=1}^{h-1} (\tilde{y}_{T+h-j} - \mu^{(1)})' B_j^{(1)} + \sum_{j=h}^{p} (y_{T+h-j} - \mu^{(1)})' B_j^{(1)} + u'^{(1)}_{T+h} \quad (15)

By repeating this procedure R times, a sample of independent draws from the joint predictive distribution is obtained. Usually a number D of initial draws is discarded as burn-in, before the sampler has converged to the target distribution. From this distribution we can obtain all the information we need, such as point forecasts (usually the mean or median), credible intervals and so on.
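As an illustration of this recursion, the sketch below simulates one path from the predictive distribution of a mean-adjusted VAR, taking a single posterior draw (µ, B_1, ..., B_p, Σ) as given. The inputs and function name are hypothetical; in the thesis the sampling is handled by the BMR package, so this is only a schematic base-R version of Equation (15).

    # Schematic base-R version of the recursion in Equation (15) for one posterior
    # draw; 'mu' (n-vector), 'B_list' (list of p n x n matrices) and 'Sigma' are
    # assumed to come from the MCMC sampler (in the thesis, the BMR package).
    simulate_path <- function(Y, mu, B_list, Sigma, h) {
      p <- length(B_list)
      L <- t(chol(Sigma))                        # to draw correlated Gaussian errors
      for (s in 1:h) {
        u   <- as.numeric(L %*% rnorm(ncol(Y)))
        dev <- 0
        # rows appended in earlier iterations are the simulated values; older lags use actual data
        for (j in 1:p) {
          dev <- dev + as.numeric((Y[nrow(Y) - j + 1, ] - mu) %*% B_list[[j]])
        }
        Y <- rbind(Y, mu + dev + u)
      }
      tail(Y, h)                                 # one draw from the predictive distribution
    }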

2.3 Mixed Data Sampling

Mi(xed) Da(ta) S(ampling), or MIDAS, is a fairly new approach introduced by Ghysels et al. (2004). MIDAS regression deals with a specific issue that is often encountered in practice. For example, many papers in the literature use monthly or daily data to predict GDP, which is observed quarterly. Usually, macroeconomic and financial variables are observed monthly and daily respectively. The most common approach is to aggregate the data at the lower frequency, with a potential loss of valuable information (Andreou et al., 2013).

The original MIDAS regression introduced by Ghysels et al. (2004) is the following. Suppose that a variable y_t is observed once during the period that goes from t − 1 to t, while another variable x_t^{(m)} is available m times in the same period. Ghysels et al. (2004) proposed the model

y_t = \beta_0 + \beta_1 B(L^{1/m}; \theta) x_t^{(m)} + \varepsilon_t^{(m)}, \quad (16)

where B(L^{1/m}; \theta) = \sum_{k=0}^{K} B(k; \theta) L^{k/m} and L^{1/m} is a high-frequency lag operator such that L^{1/m} x_t^{(m)} = x_{t-1/m}^{(m)}, while the lag coefficients B(k; θ) are parametrized as a function of θ. This functional form is employed in order to reduce the number of parameters. For large m, for example if x is observed daily and y monthly or quarterly, Equation (16) could involve a large number of parameters; if, on top of that, a large number of lags is used, the parameter count grows very quickly. For this reason, when dealing with a large number of variables and lags, MIDAS models often use this restriction. The most commonly used polynomial functional forms are the Almon lag (Almon, 1965), the exponential Almon lag (Ghysels et al., 2007) and the Beta lag (Ghysels et al., 2007).
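For concreteness, the exponential Almon weights can be computed as in the small R sketch below, where the weights are normalized to sum to one (a common, though not universal, convention); the function name and parameter values are purely illustrative.

    # Exponential Almon lag weights w_k = exp(theta1*k + theta2*k^2), normalized to
    # sum to one over k = 1, ..., K (illustrative helper, not used elsewhere in the thesis).
    exp_almon <- function(theta, K) {
      k <- 1:K
      w <- exp(theta[1] * k + theta[2] * k^2)
      w / sum(w)
    }
    round(exp_almon(c(0.05, -0.01), 12), 3)   # 12 high-frequency lags, slowly decaying weights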

The approach described above is efficient when the difference in data frequency is large and distributed lag functions are used to avoid parameter proliferation. In this study we face a different situation: the difference in frequency is small, only monthly to quarterly. This situation is similar to the one faced by Foroni et al. (2015). In their paper, the authors show how unrestricted MIDAS (U-MIDAS), i.e. MIDAS without restrictions on the parameters, has several advantages when the differences in sampling frequencies are small, such as monthly to quarterly, and only a small number of lags is required. Below, U-MIDAS is presented in the generic form of Ghysels et al. (2016):

y_t - \alpha_1 y_{t-1} - ... - \alpha_p y_{t-p} = \sum_{i=0}^{k} \sum_{j=0}^{l_i} \beta_j^{(i)} x^{(i)}_{t m_i - j} + \varepsilon_t, \quad (17)

where y_t is a univariate process observed at low frequency, t ∈ Z. The x variables are observed at higher frequency, x_τ^{(i)}, τ ∈ Z; that is, for each low frequency period t = t_0 we observe m high frequency periods τ = (t_0 − 1)m + 1, ..., t_0 m. For each quarter t, we have a linear combination of x_{mt}, x_{mt−1}, x_{mt−2} observed in quarter t and x_{m(t−1)}, x_{m(t−1)−1}, x_{m(t−1)−2} observed in quarter t − 1. Through this procedure the frequencies are aligned by regressing the low frequency variable on a linear combination of the high frequency observations, without loss of information. Equation (17) can be estimated simply by OLS, without restrictions on the parameters.
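A toy base-R illustration of this frequency alignment is sketched below: the monthly indicator is reshaped so that each quarterly observation of y is regressed on the three months of the corresponding quarter, and the unrestricted coefficients are obtained with plain OLS. The data are simulated and the syntax differs from the midasr package actually used in the thesis.

    # Toy U-MIDAS in base R (m = 3): each quarterly y is regressed, without
    # restrictions, on the three monthly values of the same quarter. Simulated data;
    # the thesis itself uses the midasr package.
    set.seed(1)
    x  <- rnorm(60)                                  # 60 months = 20 quarters of a monthly indicator
    Xq <- matrix(x, ncol = 3, byrow = TRUE)          # one row per quarter, columns = months 1, 2, 3
    y  <- drop(0.3 + Xq %*% c(0.5, 0.3, 0.2)) + rnorm(20, sd = 0.1)  # artificial quarterly target
    umidas <- lm(y ~ Xq)                             # unrestricted OLS, as in Equation (17)
    coef(umidas)                                     # one free coefficient per month, no lag function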

This approach, as previously mentioned, works particularly well when the differences in frequencies are not too wide and high-order lags are not required. Foroni et al. (2015) show how U-MIDAS performs better than the restricted version in forecasting GDP with a data structure similar to the one faced in this paper.

The forecasting procedure for U-MIDAS is the following. Let us consider the forecasting horizon h, equal to one step ahead of the low frequency variable, where the h-step-ahead forecast is:

\tilde{y}_{T_k + k | T_k} = \tilde{C}_h(L) y_{T_k} + \tilde{B}_h(L) x_{T_k} \quad (18)

where the coefficients \tilde{C}_h and \tilde{B}_h are horizon-specific and estimated from past data. For longer horizons, the procedure is repeated in the same way, using previously predicted values.

3 Data

The chosen variables to forecast Gross Domestic Product (GDP) growth are the Economic Tendency Indicator (ETI), inflation and unemployment. The choice of variables reflects previous studies with similar purposes, such as the Konjunkturinstitutet (KI) publications (the most recent being Stockhammar and Österholm (2014), Raoufinia (2016) and Ankargren et al. (2016)).

For the empirical analysis a real-time data set is used. The data set contains monthly vintages, which are snapshots of the variables (GDP and ETI) as they were published at each point in time. The full data set ranges from 2000Q1 to 2015Q3 and has mixed frequency observations: GDP is observed quarterly, while the other variables are observed monthly. For this reason, the MIDAS model is implemented alongside the traditional approaches (VAR and BVAR).

3.1 Vintages

For ETI and GDP, real-time macroeconomic vintages, i.e. snapshots of the variables at a specific point in time, are used. Each vintage is a real-time picture of the data at that point in time. This is useful because some of the variables might later be revised and adjusted as additional information becomes available. For instance, due to revisions, March 2003 GDP as measured in 2007 might differ from March 2003 GDP as measured in 2008. When forecasting, the procedure is the following:

• build a model on each vintage (a past vintage, with the limited information available at that time, just as at the present time);

• produce 8-step-ahead forecasts (2 years);

• evaluate the forecasting accuracy with the last vintage available, revised and adjusted.

In this way, when building the model we are in a situation similar to the present time, characterized by more uncertainty, whereas when assessing forecasting performance against the last vintage we are as close to reality as possible. Vintages are issued monthly from January 2007 to December 2015, and each vintage contains data from 2000Q1 up to the publication month.
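The real-time loop can be illustrated schematically as below, using a simulated series whose truncations stand in for the vintages (real vintages also differ through revisions, which this toy example ignores) and a simple AR(1) as a stand-in for the forecasting models.

    # Schematic real-time loop with a simulated series: truncations of the final data
    # stand in for vintages (ignoring revisions) and an AR(1) stands in for the models.
    set.seed(1)
    final   <- arima.sim(list(ar = 0.5), n = 80)   # plays the role of the last, revised vintage
    h_max   <- 8
    origins <- 60:(80 - h_max)                     # last observation of each pseudo-vintage
    errors  <- matrix(NA, length(origins), h_max)
    for (i in seq_along(origins)) {
      vint <- final[1:origins[i]]                  # snapshot available at that point in time
      fit  <- arima(vint, order = c(1, 0, 0))      # model built on the vintage only
      fc   <- predict(fit, n.ahead = h_max)$pred   # 8-step-ahead forecasts
      errors[i, ] <- final[origins[i] + (1:h_max)] - fc   # scored against the final vintage
    }
    # errors[, h] collects the h-step-ahead errors later summarized by RMSE and MAE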

3.2 Economic Tendency Indicator

The Economic Tendency Indicator (ETI) is a weighted survey-based measure of household and firm sentiment about the Swedish economy, comparable to the European Commission's Economic Sentiment Indicator (ESI). This confidence indicator has been used before to forecast GDP by Stockhammar and Österholm (2014). The survey is released monthly by Konjunkturinstitutet. From now on this variable is called SETI because it has been standardized.

3.3 GDP Growth

Gross Domestic Product (GDP) is published quarterly by Statistics Sweden and is the monetary value of goods and services produced in the country. The variable captures GDP growth because a log difference transformation, multiplied by 100, was applied. From now on this variable is called GDP.

3.4 Underlying Inflation Rate

Underlying inflation (KPIF), in monthly changes, is also taken from Statistics Sweden. The variable is adjusted to take into account inadequacies in the price data or in the methods used for the calculations. Underlying inflation is used by Ankargren et al. (2016) to assess the importance of the financial system for the real economy (GDP).

3.5 Unemployment

Unemployment (U) is the number of unemployed as a share of the labor force. The data are collected from the OECD website. Besides being closely connected with GDP, this variable accounts for domestic labour market developments. The variable is used in levels; reference papers are Ankargren et al. (2016) and Raoufinia (2016).

4 Empirical Analysis

In this section I outline my empirical strategy for all three models. The starting point is to produce 8-step-ahead forecasts from a "simple" VAR model, considered as the baseline. As mentioned earlier, although this model has great flexibility and is able to fit the dynamics of the data extremely well, there is a major drawback: the model might overfit the data, yielding unreliable inference. By estimating a Bayesian VAR we assess whether putting restrictions on the parameters increases the forecasting power. The last step is to implement U-MIDAS, and the motivation lies in the nature of the data. In this analysis mixed frequency data are used, and when forecasting with the traditional techniques (VAR and BVAR), which aggregate data at the lower frequency, there is a risk of discarding a great deal of useful information. GDP is forecast from 1 to 8 steps ahead and the relative accuracy is measured in terms of root mean squared error (RMSE) and mean absolute error (MAE).

4.1 VAR(4)

The vector autoregressive model has the following form:

\begin{pmatrix} GDP_t \\ SETI_t \\ U_t \\ KPIF_t \end{pmatrix} = \begin{pmatrix} A_{11}(L) & A_{12}(L) & A_{13}(L) & A_{14}(L) \\ A_{21}(L) & A_{22}(L) & A_{23}(L) & A_{24}(L) \\ A_{31}(L) & A_{32}(L) & A_{33}(L) & A_{34}(L) \\ A_{41}(L) & A_{42}(L) & A_{43}(L) & A_{44}(L) \end{pmatrix} \begin{pmatrix} GDP_{t-1} \\ SETI_{t-1} \\ U_{t-1} \\ KPIF_{t-1} \end{pmatrix} + \begin{pmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \\ e_{4t} \end{pmatrix} \quad (19)

where the A_{ij}(L) are polynomials of order 4 in the lag operator L. The lag order was chosen to optimize the RMSE. According to the augmented Dickey-Fuller test, each series is stationary with the exception of unemployment (U), but this does not appear to be an issue since, in practice, the Swedish unemployment rate is bounded within a certain range and therefore the series does not drift away: if unemployment goes up, sooner or later it is expected to go down again.
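The stationarity check described here can be reproduced along the lines of the sketch below, which uses adf.test() from the tseries package on a simulated stand-in series; the thesis does not state which implementation of the test was used.

    # Unit-root check of the kind described above, via the 'tseries' package
    # (the thesis does not state which ADF implementation was used); the series is a stand-in.
    library(tseries)
    set.seed(1)
    gdp_growth <- rnorm(63, mean = 0.56, sd = 0.8)   # 63 quarters, like 2000Q1-2015Q3
    adf.test(gdp_growth)                             # small p-value: reject the unit-root null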

4.2 Villani Steady State

By including restrictions through the prior distribution of the parameters, we aim to solve the overparametrization problem of the VAR model. The Minnesota prior restrictions are well suited for this kind of issue (Doan et al., 1983; Litterman, 1986) and are used in this study. By setting the prior means of all coefficients except the first own lag to 0, the Minnesota prior accounts for the fact that distant observations are less influential. After Litterman's original papers in 1979 and 1980, several variants have been developed; in this paper the KI recommendations are followed by setting the prior mean of the first own-lag coefficient to 0.9 for variables in levels and to 0 for variables in differences.

The hyperparameters π_1 and π_3 follow the widely used Minnesota prior specification, namely 0.2 and 1 respectively (Ankargren et al., 2016).

Table 1 reports the 95% prior probability intervals for the parameters' unconditional means. Swedish GDP growth is centered at 0.5625, which represents the steady-state growth per quarter. SETI is a standardized variable and is therefore assumed to be centered around 0. Unemployment is centered at 6.2 and KPIF at 0.40625, which reflects the Riksbank's 2% yearly inflation target and the fact that inflation has been persistently lower in recent decades. For the steady-state priors I follow similar studies in the previous literature (Ankargren et al., 2016; Österholm, 2008; Adolfson et al., 2007) for all variables except SETI.

Table 1: Steady State Priors for the Bayesian VARs

Variable   95% Prior Probability Interval
GDP        (0.5, 0.625)
SETI       (−1.96, 1.96)
U          (5.7, 6.7)
KPIF       (0.25, 0.5625)
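Under the normal prior of Equation (11), each 95% interval in Table 1 maps directly into a prior mean and standard deviation; the small sketch below shows the conversion (the function name is illustrative, and the actual hyperparameter input format depends on the BMR package).

    # Converting a 95% prior probability interval into the mean and standard deviation
    # of the normal steady-state prior in Equation (11) (illustrative helper).
    interval_to_normal <- function(lower, upper) {
      c(mean = (lower + upper) / 2, sd = (upper - lower) / (2 * qnorm(0.975)))
    }
    interval_to_normal(0.5, 0.625)   # GDP growth: mean 0.5625 per quarter
    interval_to_normal(5.7, 6.7)     # unemployment: mean 6.2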

4.3 U-MIDAS

This study follows the approach of Foroni et al. (2015), which does not impose a functional form on the lag polynomial, so the parameters can be estimated by simple OLS. The model specification is the following. GDP is observed quarterly, at every t. The x variables are instead measured monthly, and since every quarter has three months the frequency ratio is m = 3. This means that for each quarter y_t we have a linear combination of x_{3t}, x_{3t−1} and x_{3t−2}. The model includes one lag of the y variable (GDP) and up to 2 lags of the x variables. For the x variables, the first lag corresponds to x_{3(t−1)}, x_{3(t−1)−1} and x_{3(t−1)−2}, and the second to x_{3(t−2)}, x_{3(t−2)−1} and x_{3(t−2)−2}.

4.4 Root Mean Square Error and Mean Absolute Error

The aim of this paper is to compare the models in terms of forecasting performance. The forecasting performance is evaluated by comparing two of the most common measures of forecasting accuracy, the root mean square error (RMSE) and the mean absolute error (MAE). The RMSE is:

RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}, \quad (20)

where y is the actual value and \hat{y} the forecast one. The other measure of accuracy, the MAE, is:

MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|. \quad (21)

For each vintage I forecast from 1 to 8 steps ahead. These point forecasts, \hat{y}_{t+h}, h = 1, ..., 8, are compared with the realized GDP of the last vintage, under the assumption that the last vintage has the most accurate values. Finally, I compare the RMSE and the MAE for each h-step-ahead horizon.
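Given a matrix of forecast errors with one row per vintage and one column per horizon, the two accuracy measures of Equations (20) and (21) reduce to the one-line helpers below (illustrative names, not the thesis code).

    # RMSE and MAE by horizon, given a matrix of forecast errors with one row per
    # vintage and one column per horizon (Equations (20) and (21); illustrative helpers).
    rmse_by_h <- function(errors) sqrt(colMeans(errors^2, na.rm = TRUE))
    mae_by_h  <- function(errors) colMeans(abs(errors), na.rm = TRUE)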

4.5 Diebold-Mariano Test

To assess relative forecasting accuracy, the Diebold-Mariano (DM) test is also implemented. This test compares the losses associated with two competing forecasts. The loss is a function of the forecast error, defined as e_i = y_i − \hat{y}_i, and is denoted g(e_i). This function needs to satisfy the following:

• it is zero when no error is made;

• it is non-negative;

• it increases as the size of the error increases.

The most commonly used g(e_i) functions are the square (as in the RMSE) and the absolute value (as in the MAE) of e_i. The DM test is built on the loss differential, defined as:

dt = g(e1t) − g(e2t) (22)

The forecasts have equal accuracy if and only if d_t has zero expectation for all t, so the hypotheses of the test are:

H0 : E(dt) = 0 ∀t (23)

Ha : E(dt) ≠ 0 ∀t (24)

Under the null hypothesis, the test statistic DM asymptotically follows a standard normal distribution (Diebold and Mariano, 1995). In this paper the test is used to assess whether one model is statistically more accurate than another. As the g(e_i) function, the absolute value of the forecast error, |y_{it} − \hat{y}_{it}|, is used.
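One way to run this test in R is dm.test() from the forecast package, as sketched below on simulated stand-in errors; the thesis does not state which implementation was used, and the mapping of the one-sided alternative onto "model 2 is more accurate" should be checked against the package documentation.

    # DM test via dm.test() from the 'forecast' package, on simulated stand-in errors.
    library(forecast)
    set.seed(1)
    e_var  <- rnorm(86, sd = 1.5)   # stand-in for the 86 one-step-ahead VAR errors
    e_bvar <- rnorm(86, sd = 0.4)   # stand-in for the corresponding BVAR errors
    # power = 1 gives the absolute-error loss g(e) = |e| used in the thesis; "greater" is
    # intended as "the second forecast is more accurate" - check ?dm.test for the convention.
    dm.test(e_var, e_bvar, alternative = "greater", h = 1, power = 1)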

5 Results

In this section the results are presented. Figure 1 reports the time series of each variable. Looking at those plots, the results of the ADF test are confirmed: none of the variables, except unemployment, is persistent. This could be a problem if the series were to drift away. However, as discussed before, Swedish unemployment is in reality bounded by natural values: it is difficult to imagine unemployment higher than 30%, and it certainly cannot be negative.

5.1 Accuracy Measure

This paper tries to identify which of the discussed models is superior in terms of forecasting performance, given this type of data. In particular, VAR and BVAR are among the most widely used approaches to forecast macroeconomic variables but, as explained before, when facing mixed frequency variables a lot of information may be thrown away by aggregating at the lower frequency.

The results are presented both graphically and numerically. Figures 2 and 3 report the results for each model in terms of RMSE and MAE (numerical results are presented in Table 3 and Table 4 in the Appendix).

It appears evident that BVAR and MIDAS outperform the VAR at every horizon, while there is no absolute winner between them. The MIDAS model appears to be somewhat superior at short horizons, from 1 up to 5 steps ahead. These results are evident from the RMSE in Figure 2, while the picture from the MAE is not identical: looking at Figure 3, we can see that at horizon 2 the BVAR is slightly superior. The fact that the BVAR model catches up with the MIDAS model at longer horizons is somewhat expected, and might come from the fact that the Bayesian model accounts for the steady-state beliefs about the economy. As a final remark on these two accuracy measures, the fact that the RMSE and MAE display similar findings when comparing the models lends strength and consistency to the results.

[Figure 1: Time series of each variable]

[Figure 2: RMSE by forecasting horizon (1 to 8) for VAR, BVAR and MIDAS]

[Figure 3: MAE by forecasting horizon (1 to 8) for VAR, BVAR and MIDAS]

Figure 4 is produced using only the last vintage, which is December 2015. The models are estimated using observations up to September 2013 (vertical dashed line in Figure 4) and 8-step-ahead forecasts are plotted for every model. The solid line shows how GDP actually evolved during that time, plus 10 periods of in-sample observations. The figure is a graphical example and does not say much about the actual results of this analysis, although it is useful as a visual representation of the forecasts.

[Figure 4: Forecasts from each model and actual GDP growth, data up to September 2013]

5.2 Diebold-Mariano Test

Finally, the DM test is implemented and the results are presented in Table 2. As mentioned in the theoretical section, the test is only asymptotically normally distributed, and in this case we have only 86 forecast errors for each horizon, so the normal approximation may be poor; this might be the reason why the test does not give the results we expect. The test is set up as follows: for the first two rows of Table 2 the alternative hypothesis is that model 2 is more accurate than model 1 (one-sided test), whereas in the last row the alternative hypothesis is that model 1 and model 2 have different levels of accuracy (two-sided test). In this setting, when we observe a small p-value, the null hypothesis that model 2 is not more accurate than model 1 is rejected. Table 2 reports the p-values for the different forecast horizons. The results are not as clear-cut as in the figures, although there is some evidence of BVAR and MIDAS outperforming the VAR at short horizons. BVAR is more accurate than VAR at the 10% significance level for the first three horizons, while MIDAS is statistically more accurate than VAR only at the first step ahead.

Model 1 - Model 2     1        2       3       4       5       6        7        8
VAR-BVAR           0.01**   0.05*   0.10*   0.09*   0.15    0.02**   0.04**   0.05*
VAR-MIDAS          0.01**   0.22    0.24    0.27    0.50    0.32     0.33     0.33
BVAR-MIDAS         0.57     0.61    0.83    0.98    0.65    0.79     0.86     0.63

***p < 0.01, **p < 0.05, *p < 0.1

Table 2: DM test p-values at different horizons; alternative hypothesis: model 2 more accurate

6 Conclusion

This paper compares the forecasting performance, for Swedish GDP, of three different models using mixed frequency data. The models employed are the vector autoregressive model (VAR), the Bayesian vector autoregressive model (BVAR) and the Mixed Data Sampling regression (MIDAS).

The analysis starts with the widely used and relatively simple VAR. This model comes with great flexibility, due to the large number of parameters, which gives it a great capacity to fit the data. However, the same characteristics that make the unrestricted vector autoregressive model so flexible are responsible for a major drawback: due to the large number of parameters, the model tends to overfit the data and produce unreliable inference.

The Bayesian vector autoregressive model addresses this problem by incorporating prior beliefs about the long-run dynamics of the data. By combining the observed data with these long-run beliefs, the model is able to produce sharper inference. Furthermore, in this study the Villani mean-adjusted specification was used, which takes into account the unconditional mean, or steady-state value, of each series, a value for which prior beliefs are typically available.

Despite the positive features of the BVAR and its capability to partially solve the VAR's problems, there is one remaining issue that both autoregressive models neglect. Often, macroeconomic variables (among others) are sampled at different frequencies. The vector autoregressive approach is to aggregate the data at the lower frequency and discard the high frequency information. The MIDAS regression is meant to address this issue by proposing a functional form, or structure, linking the variables at different frequencies, which is able to use all the available information.

The forecasting performance in this paper is evaluated at different horizons, from 1 to 8 steps ahead, and judgments are based on root mean square error (RMSE) and mean absolute error (MAE) values. The results confirm what is expected from previous studies and the literature: the VAR model is outperformed by the BVAR at every horizon, which might be due to overfitting by the former. The U-MIDAS regression model is itself superior to the unrestricted vector autoregressive model at every horizon, indicating that the frequency alignment, and the consequent use of all the available information, indeed has a positive effect when it comes to forecasting. Finally, there is no clear winner between BVAR and U-MIDAS across all horizons. However, the MIDAS model performs consistently better up to the 5th-step-ahead forecast (with the exception of the 2nd step if we look at the MAE in Figure 3).

The present study compares the forecasting performance of the discussed models in this specific setting, and we are aware of the obvious limitations. Further research is needed to make broader and more general claims. However, it is worth noting that the findings are in line with what has been observed in the previous literature for similar cases.

A Appendix

Step Ahead   VAR    BVAR   MIDAS
1            1.52   0.42   0.32
2            1.34   0.45   0.37
3            1.40   0.59   0.44
4            1.59   0.64   0.52
5            1.72   0.60   0.59
6            1.41   0.47   0.59
7            1.57   0.49   0.49
8            1.42   0.38   0.42

Table 3: RMSE for VAR, BVAR and MIDAS

Step Ahead   VAR    BVAR   MIDAS
1            1.13   0.88   0.82
2            1.00   0.82   0.85
3            1.00   0.96   0.84
4            1.15   0.98   0.94
5            1.25   0.95   0.88
6            0.92   0.77   0.88
7            1.09   0.85   0.82
8            0.97   0.79   0.83

Table 4: MAE for VAR, BVAR and MIDAS

References

Adolfson, M., Laseen, S., Linde, J., and Villani, M. (2007). Bayesian estimation of an open economy DSGE model with incomplete pass-through. Journal of International Economics, 72(2):481–511.

Almon, S. (1965). The distributed lag between capital appropriations and expenditures. Econometrica, 33(1):178–196.

Andreou, E., Ghysels, E., and Kourtellos, A. (2013). Should macroeconomic forecasters use daily financial data and how? Journal of Business & Economic Statistics, 31(2):240–251.

Ankargren, S., Bjellerup, M., and Shahnazarian, H. (2016). The importance of the financial system for the real economy. Empirical Economics, pages 1–34.

Diebold, F. and Mariano, R. (1995). Comparing predictive accuracy. Journal of Business & Economic Statistics, 13(3):253–63.

Doan, T., Litterman, R., and Sims, C. (1983). Forecasting and conditional projection using realistic prior distributions. (1202).

Foroni, C., Marcellino, M., and Schumacher, C. (2015). Unrestricted mixed data sampling (midas): Midas regressions with unrestricted lag polynomials. Journal of the Royal Statistical Society Series A, 178(1):57–82.

Ghysels, E., Kvedaras, V., and Zemlys, V. (2016). Mixed frequency data sampling regression models: The r package midasr. Journal of Statistical Software, 72.

Ghysels, E., Sinko, A., and Valkanov, R. (2007). Midas regressions: Further results and new directions. Econometric Reviews, 26(1):53–90.

Ghysels, E., Santa-Clara, P., and Valkanov, R. (2004). The midas touch: Mixed data sampling regression models.

Karlsson, S. (2013). Forecasting with Bayesian vector autoregression. In Handbook of Economic Forecasting, volume 2, pages 791–897. Elsevier.

Keith O’Hara (2015). Bayesian Macroeconometrics in R.

24 Litterman, R. (1979). Techniques of forecasting using vector autoregressions. Working Papers 115, Federal Reserve Bank of Minneapolis.

Litterman, R. (1986). Forecasting with bayesian vector autoregressions: Five years of experience. Journal of Business & Economic Statistics, 4(1):25–38.

Meredith, B. and Österholm, P. (2008). A bayesian vector autoregressive model with informative steady-state priors for the australian economy. The Economic Record, 84(267):449–465.

Gustafsson, P., Stockhammar, P., and Österholm, P. (2015). Macroeconomic effects of a decline in housing prices in sweden. (138).

Raoufinia, K. (2016). Forecasting employment growth in sweden using a bayesian var model. (144).

Sims, C. (1980). Macroeconomics and reality. Econometrica, 48(1):1–48.

Österholm, P. (2008). Can forecasting performance be improved by considering the steady state? an application to swedish inflation and interest rate. Journal of Forecasting, 27(1):41–51.

Stockhammar, P. and Österholm, P. (2014). Effects of us policy uncertainty on swedish gdp growth. (135).

Villani, M. (2009). Steady-state priors for vector autoregressions. Journal of Applied Econometrics, 24(4):630–650.

Enders, W. (2015). Applied econometric time series. Wiley.
