Bootstrap Prediction Intervals for Power-Transformed Time Series

Bootstrap prediction intervals for power-transformed time series Lorenzo Pascuala,1, Juan Romob,2, Esther Ruizb,* aHidroCanta´brico, Serrano Galvache 56, Centro Empresarial Parque Norte, 28033 Madrid, Spain bDepartamento de Estadı´stica, Universidad Carlos III de Madrid, C/ Madrid 126, 28903 Getafe, Spain Abstract In this paper, we propose a bootstrap procedure to construct prediction intervals for future values of a variable after a linear ARIMA model has been fitted to its power transformation. The procedure is easy to implement and provides a useful tool in empirical applications given that it is often the case that, for example, the log transformation is modeled when the variable of interest for prediction is the original one. The advantages over existing methods for computing prediction intervals of power transformed time series are that the proposed bootstrap intervals incorporate the variability due to parameter estimation and do not rely on distributional assumptions neither on the original variable nor on the transformed one. We derive the asymptotic distribution and show the good behavior of the bootstrap approach versus alternative procedures by means of Monte Carlo experiments. Finally, the procedure is illustrated by analyzing three real time series data sets. Keywords: Forecasting; Non-Gaussian distributions; Box Cox transformations; Resampling methods 1. Introduction interval forecasts as well; see, for example, Chatfield (1993). Forecasting future values of time series data is In empirical time series analysis, it is common to one of the main objectives of time series analysis. transform the data using power transformation prior to Generally, predictions are given as point forecasts, the estimation of the model used for forecasting. although it is even more important to provide There are several reasons for transforming the data before fitting a suitable model, for example, the necessity of stabilizing the increasing variance of * Corresponding author. Tel.: +34 916249851; fax: +34 trending time series, to reduce the impact of outliers, 916249849. to make the normal distribution a better approxima- 8 E-mail addresses: [email protected] (L. Pascual) tion to the data distribution, or because the trans- [email protected] (J. Romo)8 [email protected] (E. Ruiz). formed variable has a convenient economic interpre- 1 Tel.: +34 917819338; fax: +34 917819359. tation; for example, first differenced log-transformed 2 Tel.: +34 916249805; fax: +34 916249849. data correspond to growth rates. The well-known family of Box–Cox transforma- solution for k=0. Notice that, as pointed out above, the tions is given by logarithmic transformation is one of the most popular in practice. Another alternative proposed by Pankratz X k 1 t ; k p 0 and Dudley (1987) is complicated and, additionally, gXðÞ¼t k ð1Þ lnðÞXt ; k ¼ 0; only admits a closed form expression when k is a fractional positive integer. Finally, the method pro- where {Xt} denotes the observed time series with posed by Guerrero (1993) avoids all the drawbacks N d Xt 0, ln ( ) denotes the natural logarithm, and k is a found in previous approaches. His proposal is both real constant. The transformation for k=0 follows k simple and general. In a comparative study, Guerrero Xt 1 from the fact that limkY0 k ¼ lnðÞXt ; see Box and (1993) shows that his method has a performance Cox (1964). Subtracting 1 and dividing it by k does similar to or better than the other procedures. k not influence the stochastic structure of Xt , and hence, Although it is relatively well studied how to obtain without loss of generality, one often considers the a good estimate for the conditional mean in the following transformation suggested by Tukey (1957) original metric, there is no generally accepted method k of constructing prediction intervals for the untrans- Xt ; k p 0 gXðÞ¼t ð2Þ formed variable. One solution is based on a normal lnðÞXt ; k ¼ 0; assumption on XT+k, providing a symmetric interval. instead of the Box–Cox transformation in Eq. (1). This seems not to be a good choice unless the Once a model has been estimated, point and distribution of XT+k is close to be Gaussian; see interval forecasts can be obtained for the transformed Chatfield (1993). Another alternative is to construct seriesYT+k= g(XT+k)asdefinedinEq.(2) for k=1, 2,.... prediction intervals for XT+k by retransforming the We will focus on ARIMA models fitted to the series upper and lower values of the corresponding pre- yt, t=1,...,T. The specification of the model and the diction interval for YT+k. Finally, Guerrero (1993) parameter k will be assumed to be known. Notice that suggests to correct for bias the endpoints of the latter this is an interesting case given that, in many prediction intervals, using a procedure similar to the empirical applications, the logarithmic transformation one he proposes for the point forecast. is assumed. If the objective is to predict future values In this paper, we propose a bootstrap resampling of Xt, the retransformed point forecasts induce bias in scheme to obtain an estimate of the pdf of XT+k, the forecasts, as shown for linear models in Granger conditional on the available data when an ARIMA and Newbold (1976). When YT+k is normally dis- model has been fitted to yt, t=1, ..., T. Given this tributed and the point forecast of XT+k is just the density, the required prediction intervals for XT+k can inverse transformation of the forecast obtained for the be constructed. There are several advantages over the transformed variable, this naive point prediction is not methods previously described. First of all, the boot- the minimum mean squared error (MMSE) forecast strap procedure does not rely on distributional but the minimum mean absolute error (MMAE), assumptions neither on the transformed data nor on which is the median of the conditional probability the original scale. The second advantage is that the density function (pdf) of XT+k. Therefore, if the error bootstrap intervals incorporate the variability due to loss function is quadratic, this naive prediction of parameter estimation, which is not allowed by any of XT+k is not optimal; see Guerrero (1993). the alternative procedures. Finally, the method is very Assuming Gaussianity of Yt, Granger and Newbold easy to implement. (1976) propose a debiasing factor to reduce the The finite sample behavior of the bootstrap intervals transformation bias in the point forecast. Unfortunately, is compared with the alternative intervals by means of inasmuch as they solve the problem using Hermite an extensive simulation study. It is shown that the polynomial expansions, their procedure becomes very proposed procedure performs as well as the best complicated for many fractional power transforma- alternatives when Yt is Gaussian and tends to outper- tions, making this approach not useful in practice. form its competitors when leaving this assumption. Later, Taylor (1986) proposes a simpler expression for The paper is organized as follows. Section 2 the debiasing factor, but it does not provide an adequate presents a description of the existing methods for obtaining prediction intervals for a variable in its where the residuals corresponding to periods of time original scale. In Section 3, we introduce the t=p+d, p+dÀ1,...,1,0,À1, À2,. are set equal to 0; bootstrap approach. A Monte Carlo study comparing see, for example, Harvey (1993). the finite sample performance of all existing methods Once the ARIMA model has been estimated, the is presented in Section 4.InSection 5, we illustrate optimal linear predictor of YT+k, k=1, 2, ..., denoted the procedure analyzing empirically three real data by Yˆ T(k), can be obtained in the usual way. If, for sets. Finally, we conclude with some remarks and example, d=0, then the optimal predictor is given by suggestions for future research in Section 6. ˆ ˆ ˆ YYˆ T ðÞ¼k /0 þ /1YYˆ T ðÞþk À 1 /2YYˆ T ðÞþk À 2 N ˆ ˆ ˆ þ/pYYˆ T ðÞþk À p hh1aaˆ Tþk 1 þ N þ hhqaaˆ Tþk q 2. Prediction intervals for transformed time series ð5Þ There are two main alternatives proposed in the where Yˆ T( j)=YT+j for jV0andaˆ T+j=0 for jz0. ˆ literature to obtain prediction intervals for XT+k given Alternatively, if for example d=1, then YT(k) is given the observed series x , t=1, ..., T after an ARIMA by t model has been fitted to the power-transformed ˆ ˆ YYˆ T ðÞ¼k /0 þ 1 À /1 YYˆ T ðÞk À 1 variable yt, t=1, ..., T. In this section, these two procedures are described. ˆ ˆ ˆ N Consider that {x1, ..., xT} is an available sequence þ /2 À /1 YY T ðÞþk À 2 of T observations such that, for any of the reasons ˆ ˆ previously mentioned, it needs to be transformed þ /p À /p 1 YYˆ T ðÞk À p adequately by a function g(d ) defined in Eq. (2) to ˆ N ˆ obtain a new sequence y1, ..., yT. Lets also assume þ h1aaˆ Tþk 1 þ þ hqaaˆ Tþk q; ð6Þ that the transformed sequence is well fitted by an ARIMA( p, d, q) process given by Expressions of the optimal predictor for other values of d can be obtained similarly. The usual Box d and Jenkins (1976) prediction intervals for Y are /ðÞL j yt ¼ /0 þ hðÞL at; ð3Þ T+k given by 2 where a is a white noise process, /(L) and h(L) are ! t Xk 1 1=2 autoregressive and moving average polynomials 4 ˆ 2 ˆ 2 ˆ p YY T ðÞÀk za=2 rrˆ a wwj ; YY T ðÞk defined as /(L)=1À/1LÀ...À/pL and h(L)= j¼0 1Àh LÀ...Àu Lq, respectively, where L is the lag !3 ð7Þ 1 q 1=2 operator such that Lxh=x , j is the difference Xk 1 t h z rrˆ 2 wwˆ 2 5; operator such that j=(1ÀL) and d is the number þ a=2 a j j¼0 of differences needed for the series yt to be sta- tionary.

Load more