On the Application of Bootstrap Method to Stationary Time Series Process
American Journal of Computational Mathematics, 2013, 3, 61-65
http://dx.doi.org/10.4236/ajcm.2013.31010
Published Online March 2013 (http://www.scirp.org/journal/ajcm)

T. O. Olatayo
Mathematical Sciences Department, Olabisi Onabanjo University, Ago-Iwoye, Nigeria
Email: [email protected]

Received October 15, 2012; revised December 3, 2012; accepted December 14, 2012

Copyright © 2013 SciRes. AJCM

ABSTRACT

This article introduces a resampling procedure called the truncated geometric bootstrap method for stationary time series processes. The procedure is based on resampling blocks of random length, where the length of each block has a truncated geometric distribution, and it is capable of determining the probability p and the number of blocks b. Special attention is given to problems with dependent data, and an application with real data was carried out. An autoregressive model was fitted, with the order chosen by the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). A normality test was carried out on the residual variance of the fitted model using the Jarque-Bera statistic, and the best model was determined based on the root mean square error (RMSE) of the forecast values. The bootstrap method gives a better and more reliable model for predictive purposes. All the models for the different block sizes are good: they preserve the stationary data structure of the process and are reliable for predictive purposes, confirming the efficiency of the proposed method.

Keywords: Truncated Geometric Bootstrap Method; Autoregressive Model; Akaike Information Criterion (AIC); Bayesian Information Criterion (BIC); Root Mean Square Error (RMSE)

1. Introduction

The heart of the bootstrap is not simply computer simulation, and bootstrapping is not perfectly synonymous with Monte Carlo. The bootstrap method relies on using an original sample, or some part of it such as residuals, as an artificial population from which to resample randomly [1]. Bootstrap resampling methods have emerged as powerful tools for constructing inferential procedures in modern statistical data analysis. The bootstrap approach, as initiated by [2], avoids having to derive formulas via different analytical arguments by taking advantage of fast computers. Bootstrap methods have had, and will continue to have, a profound influence throughout science, as the availability of fast, inexpensive computing has enhanced our ability to make valid statistical inferences about the world without the need for unrealistic or unverifiable assumptions [3].

An excellent introduction to the bootstrap may be found in the work of [4-9]. Recently, [10,11] have independently introduced non-parametric versions of the bootstrap that are applicable to weakly dependent stationary observations. Their sampling procedures have been generalized by [12,13] and by [14] by resampling "blocks of blocks" of observations to the stationary time series process.

In this article, we introduce a new resampling method as an improvement on the stationary bootstrap of [15] and the moving block bootstrap of [16,17].

The stationary bootstrap is essentially a weighted average of the moving blocks bootstrap distributions or estimates of standard error, where the weights are determined by a geometric distribution. The difficult aspect of applying these methods is how to choose b in the moving blocks scheme and how to choose p in the stationary scheme.

On this note we propose a stationary bootstrap method generated by resampling blocks of random size, where the length of each block has a "truncated geometric distribution".

In Section 2, the actual construction of a truncated geometric bootstrap method is presented. Some theoretical properties of the method are investigated in Section 3 in the case of the mean. In Section 4, it is shown how the theory may be extended to stationary time series processes.

2. Material and Method

To overcome the difficulty, in the moving blocks and geometric stationary schemes, of determining both b (the number of blocks) and p (the probability), we introduce a truncated form of the geometric distribution and demonstrate that it is a suitable model for a probability problem. This truncation is of more than just theoretical interest, as a number of applications have been reported [18,19].

A random length L may be defined to have a truncated geometric distribution with parameter P and N terms when it has the probability distribution or probability density function

$$\Pr(L = r) = K(1-P)^{r-1}P, \qquad r = 1, 2, \ldots, N. \tag{2.1}$$

The constant K is found, using the condition $\sum_{r=1}^{N} \Pr(L = r) = 1$, to be

$$K = \frac{1}{1-(1-P)^{N}}.$$

Thus, we have

$$\Pr(L = r) = \frac{P(1-P)^{r-1}}{1-(1-P)^{N}}, \qquad r = 1, 2, 3, \ldots, N. \tag{2.2}$$

These are the probabilities of a truncated geometric distribution with parameter P and N terms. Suppose that lengths L, numbered 1 to N, are to be selected randomly with replacement. Then the process continues until it is truncated geometrically at r, with an appropriate probability P attached to its random selection, in the form $r, r+1, r+2, \ldots, r+N-1$. We take N to be 4, that is, the random selections may be truncated between 1 and 4 with appropriate probabilities.

Then, a description of the resampling algorithm when r > 1 is as follows:

1) Let $X_1, \ldots, X_N$ be random variables.
2) Let $X_1^*$ be determined by the r-th truncated observation $X_r$ in the original time series.
3) Let $X_{i+1}^*$ be equal to $X_{r+1}$ with probability $1 - P$, and picked at random from the original N observations with probability P.
4) Let $B_{i,b} = \{X_i, X_{i+1}, \ldots, X_{i+b-1}\}$ be the block consisting of b observations starting from $X_i$.
5) Let $B_{I_1,L_1}, B_{I_2,L_2}, \ldots$ be a sequence of blocks of random length determined by the truncated geometric distribution.
6) The first $L_1$ observations in the pseudo time series $X_1^*, X_2^*, \ldots, X_N^*$ are determined by the first block $B_{I_1,L_1}$ of observations $X_{I_1}, \ldots, X_{I_1+L_1-1}$; the next $L_2$ observations in the pseudo time series are the observations in the second sampled block $B_{I_2,L_2}$, that is, $X_{I_2}, \ldots, X_{I_2+L_2-1}$.
7) The process is resampled with replacement, and it is stopped once N observations in the pseudo time series have been generated.
8) Once $X_1^*, \ldots, X_N^*$ has been generated, one can compute the quantities of interest for the pseudo time series.

The algorithm has two major components, the construction of a bootstrap sample and the computation of statistics on this bootstrap sample, and these operations are repeated many times through some kind of loop.

Proposition 1. Conditional on $X_1, X_2, \ldots, X_N$, the series $X_1^*, X_2^*, \ldots, X_N^*$ is stationary.

If the original observations $X_1, \ldots, X_N$ are all distinct, then the new series $X_1^*, \ldots, X_N^*$ is, conditional on $X_1, \ldots, X_N$, a stationary Markov chain. If, on the other hand, two of the original observations are identical and the remaining are distinct, then the new series $X_1^*, \ldots, X_N^*$ is a stationary second order Markov chain. The stationary bootstrap resampling scheme proposed here is distinct from that proposed by [20] but possesses the same properties as that proposed by [21].

3. Result and Discussion

In this section, the emphasis is on the construction of valid inferential procedures for stationary time series data, and some illustrations with real data are given. The real data are geological data, demonstratigraphic data from the Batan well taken at regular 30 m intervals [22]. The data are the principal oxide of sand or sandstone, which is SiO2 or silicon oxide. The point is that the bulk of oil reservoir rocks in Nigerian sedimentary basins is sandstone and shale, a product of siltstone [23]. In order to improve the geological analysis and the prediction of the presence of these elements, a mathematical tool that can be used to examine a wide range of data sets is developed to detect and improve new and old oil basins.

The geological data of 130 observations were subjected to the new method described in Section 2 of this article at 500 and 1000 bootstrap replicates, for block sizes 1, 2, 3 and 4.

In each case, the replicate with minimum variance was selected.

3.1. Model Fitting, Normality Test and Forecasting

Linear models are fitted, and the order of the linear model is chosen on the basis of the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) and the residual variance $\sigma^2$ [24].

Fitting of AR Models to Bootstrapped Data

The linear models were fitted to the bootstrapped observations for B = 500 and B = 1000 bootstrap replicates and for block sizes 1, 2, 3 and 4. When B = 500 replicates, we have the following models.

Block 1:
It is found that AIC and BIC are minimum at order P = 4. The fitted model is

$$X_t = 0.373717X_{t-1} + 0.089415X_{t-2} + 0.284547X_{t-3} + 0.248579X_{t-4} + \varepsilon_t \tag{3.1}$$

Block 2:
The fitted model is:

$$X_t = 0.38859X_{t-1} - 0.030660X_{t-2} + 0.315254X_{t-3} + 0.023246X_{t-4} + 0.300233X_{t-5} + \varepsilon_t \tag{3.2}$$

Block 3:
The model fitted is:

$$X_t = 0.539969X_{t-1} + 0.070547X_{t-2} + 0.101612X_{t-3} + 0.182896X_{t-4} + 0.101372X_{t-5} + \varepsilon_t \tag{3.3}$$

Block 4:
The fitted model is:

$$X_t = 0.514047X_{t-1} + 0.099161X_{t-2} + 0.16566X_{t-3} + 0.218808X_{t-4} + \varepsilon_t \tag{3.4}$$

Table 1 shows the values of $\sigma^2$, AIC and BIC by block size when B = 500 replicates.

Table 1. Measures of goodness of fit by block size for 500 replicates.

Block size   σ²          AIC        BIC
1            28.988834   6.501874   6.592380
2            27.36994    6.274432   6.387564
3            27.277997   6.258376   6.372097
4            26.22478    6.278401   6.369378

Table 2. Measures of goodness of fit by block size for 1000 replicates.

Block size   σ²          AIC        BIC
1            28.836846   6.226858   6.39991
2            35.983489   6.618683   6.708724
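
The truncated geometric distribution in Equation (2.2) and the block resampling steps of Section 2 can be illustrated with a minimal Python sketch. The function names, the uniform choice of each block's starting index, the wrap-around past the end of the series, and the toy data are assumptions made for illustration; the article does not specify them, and this is not the author's implementation.

```python
import random
from typing import List, Sequence


def truncated_geometric_pmf(p: float, n_terms: int) -> List[float]:
    """Return [Pr(L = 1), ..., Pr(L = n_terms)] as in Equation (2.2)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if n_terms < 1:
        raise ValueError("n_terms must be at least 1")
    norm = 1.0 - (1.0 - p) ** n_terms  # this is 1 / K in Equation (2.1)
    return [p * (1.0 - p) ** (r - 1) / norm for r in range(1, n_terms + 1)]


def truncated_geometric_bootstrap(
    series: Sequence[float], p: float, max_block: int, rng: random.Random
) -> List[float]:
    """Build one pseudo series of len(series) values from random-length blocks."""
    n = len(series)
    if n == 0:
        raise ValueError("series must not be empty")
    lengths = list(range(1, max_block + 1))
    weights = truncated_geometric_pmf(p, max_block)
    pseudo: List[float] = []
    while len(pseudo) < n:
        # Assumption: the block start I_k is uniform over the observed indices.
        start = rng.randrange(n)
        # Block length L_k follows the truncated geometric distribution.
        length = rng.choices(lengths, weights=weights, k=1)[0]
        # Assumption: indices wrap around the end of the series.
        pseudo.extend(series[(start + j) % n] for j in range(length))
    return pseudo[:n]  # stop once N pseudo observations exist


def bootstrap_means(
    series: Sequence[float], p: float, max_block: int, replicates: int, seed: int = 0
) -> List[float]:
    """Repeat the resampling and return the mean of each pseudo series."""
    rng = random.Random(seed)
    means = []
    for _ in range(replicates):
        pseudo = truncated_geometric_bootstrap(series, p, max_block, rng)
        means.append(sum(pseudo) / len(pseudo))
    return means


if __name__ == "__main__":
    toy = [float(i) for i in range(1, 21)]  # hypothetical data, not the Batan well series
    print(truncated_geometric_pmf(0.5, 4))
    print(len(bootstrap_means(toy, p=0.5, max_block=4, replicates=500, seed=1)))
```

With P = 0.5 and N = 4, the block-length probabilities are 8/15, 4/15, 2/15 and 1/15, so short blocks dominate while no block exceeds four observations. The mean is used here only as an example statistic; any quantity of interest could be computed on each pseudo series in the same loop.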