On Prediction Interval for Independent Observations

Applied Mathematical Sciences, Vol. 9, 2015, no. 99, 4931 - 4940
HIKARI Ltd, www.m-hikari.com
http://dx.doi.org/10.12988/ams.2015.54366

Khreshna Syuhada
Statistics Research Division, Institut Teknologi Bandung
Jalan Ganesa 10 Bandung, Indonesia

Rizky Saputra
Statistics Research Division, Institut Teknologi Bandung
Jalan Ganesa 10 Bandung, Indonesia

Copyright © 2015 Khreshna Syuhada and Rizky Saputra. This article is distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

We consider the problem of constructing a prediction interval for a future observation in the case of normal random variables. Specifically, two prediction intervals are developed that differ in their parameter estimators. The resulting estimative prediction intervals have coverage probability and expected length that are accurate up to terms of order $O(n^{-1})$. Furthermore, whilst the coverage probability $1 - \alpha$ is usually given, we propose to treat it as a parameter. Consequently, the $O(n^{-1})$ terms in the asymptotic expansions of the coverage probability and of the expected length depend not only on the model parameters but also on $\alpha$. Our aim is to find an optimal coverage probability. A numerical analysis is carried out to illustrate the unexpected coverage probability and length of the estimative prediction interval.

Keywords: expected length, estimative prediction interval, normal distribution, optimal coverage

1 Introduction

The accuracy of a prediction interval may be assessed by calculating its coverage probability. Alternatively, we can assess a prediction interval via its expected length; see e.g. Kabaila and Syuhada (2007), who derived the relative efficiency of prediction intervals by measuring the ratio of their expected lengths.
Suppose that $Y_1, Y_2, \ldots, Y_n, Y_{n+1}$ are independent and identically distributed random variables with parameter vector $\theta$. The available data are $Y_1, Y_2, \ldots, Y_n$ and we wish to find a prediction interval, $I(\theta)$, for $Z = Y_{n+1}$ such that $P(Z \in I(\theta)) = 1 - \alpha$ for all $\theta$, where $\alpha$ is the level of significance. For unknown $\theta$, an estimative prediction interval $I(\hat{\Theta})$ is obtained by replacing $\theta$ with a specified estimator $\hat{\Theta}$. Consequently, the coverage probability of $I(\hat{\Theta})$ is $1 - \alpha + O(n^{-1})$ due to parameter variability. The expansion of the expected length of the estimative prediction interval, $E(\text{length of } I(\hat{\Theta}))$, also contains an $O(n^{-1})$ term.
Kabaila and Syuhada (2007, 2010) argued that there can be a trade-off between the $O(n^{-1})$ terms in the asymptotic expansions of the coverage probability and of the expected length. To avoid this trade-off and make use of the expected length, an improved prediction interval $I^{+}(\hat{\Theta})$ is constructed with coverage probability $1 - \alpha + O(n^{-3/2})$.
Our contributions in this paper are twofold. Firstly, we develop two estimative prediction intervals for normal random variables with parameter $(\mu, \sigma^2)$. We obtain these two intervals by choosing different estimators for $\mu$ and $\sigma^2$. Improved prediction intervals are then constructed, and these intervals are assessed via their expected length. Note that our setting of parametric estimative prediction intervals differs from that of Kabaila and Syuhada (2007), who considered parametric versus nonparametric prediction intervals. The second contribution lies in our attempt to assess an estimative prediction interval by calculating the $O(n^{-1})$ terms of the coverage probability and of the expected length together, and minimizing them to reach an optimal $1 - \alpha$. In fact, we propose that $\alpha$ may be considered as a parameter instead of a given value. The $O(n^{-1})$ terms then depend on $\alpha$ and on the parameter vector $(\mu, \sigma^2)$.
Whilst an improved prediction interval may correct the coverage property of the estimative one, we may not be able to find an optimal coverage probability by incorporating the $O(n^{-1})$ term in the asymptotic expected length. This is because the unexpected coverage is absorbed and replaced by a term of order $O(n^{-3/2})$.
The remainder of this paper is organized as follows. In Section 2, a description of estimative and improved prediction intervals is presented along with the expected length of these prediction intervals. We also introduce the total unexpected coverage and length. Section 3 provides an example of independent observations from a normal distribution. A numerical analysis is presented in Section 4 to illustrate different estimative prediction intervals. Section 5 explores the optimal coverage probability.

2 Prediction Intervals: Unexpected Coverage And Length

A general formulation of the estimative prediction interval, along with its expected length, is described briefly in this section. Suppose that the available i.i.d. data are $Y_1, \ldots, Y_n$ from the model with parameter vector $\theta$. For unknown $\theta$, the estimative prediction interval for the future observation $Z = Y_{n+1}$ is $I(\hat{\Theta}) = [L(\hat{\Theta}), U(\hat{\Theta})]$ for a specified estimator $\hat{\Theta}$. The coverage probability $P(Z \in I(\hat{\Theta}))$ differs from $1 - \alpha$ by $c(\theta)\, n^{-1}$. Meanwhile, its expected length is of the form $p(\theta) + q(\theta)\, n^{-1} + \cdots$, where $p(\theta) > 0$.
We propose that, for our objective, the $O(n^{-1})$ terms in the asymptotic expansions of the coverage probability and of the expected length depend on $\alpha$. Thus, the $O(n^{-1})$ terms for the coverage probability and the expected length are $c(\alpha; \theta)$ and $q(\alpha; \theta)$, respectively. It is clear that if $c(\alpha; \theta) = 0$ then we have an improved prediction interval. Otherwise, we seek an optimal $1 - \alpha$ that minimizes $c(\alpha; \theta)$ and likewise $q(\alpha; \theta)$. The total unexpected coverage and length is $c(\alpha; \theta) + q(\alpha; \theta)$.
To construct an improved prediction interval, we first note that
$$d(\hat{\Theta}) = \frac{-c(\hat{\Theta})\, n^{-1}}{2 f(z; \hat{\theta})},$$
where $z$ is the estimative prediction limit and $f(z; \hat{\theta})$ is its pdf. The relevant improved prediction interval is
$$I^{+}(\hat{\Theta}) = \left[ L(\hat{\Theta}) - d(\hat{\Theta}),\; U(\hat{\Theta}) + d(\hat{\Theta}) \right],$$
whose coverage probability is now $1 - \alpha + O(n^{-3/2})$. The corresponding expected length of the improved prediction interval is of the form $p(\alpha; \theta) + q^{+}(\alpha; \theta)\, n^{-1} + \cdots$, from which the total unexpected coverage and length is $k(\alpha; \theta) = q^{+}(\alpha; \theta)$.

3 The Case Of Normal Random Variables

In what follows, we demonstrate the derivation of the estimative prediction interval and its expected length for the case of normal random variables. An improved prediction interval is also constructed.
Suppose that $Y_1, \ldots, Y_n$ are i.i.d. normally distributed with unknown mean $\mu$ and unknown variance $\sigma^2$. We are concerned with a prediction interval for $Z = Y_{n+1}$, which is independent of the data and from the same distribution. Let $\theta = (\mu, \sigma^2)$. For a specified estimator $\hat{\Theta} = (\hat{\mu}, \hat{\sigma}^2)$ of $(\mu, \sigma^2)$, the $1 - \alpha$ estimative prediction interval is
$$I(\hat{\Theta}) = \left[ \hat{\mu} - z_{1-\alpha/2}\, \hat{\sigma},\; \hat{\mu} + z_{1-\alpha/2}\, \hat{\sigma} \right]$$
and its coverage probability is $P_\theta(Z \in I(\hat{\Theta}))$, which is equal to
$$E_\theta\left[ \Phi\left( \frac{(\hat{\mu} - \mu) + z_{1-\alpha/2}\, \hat{\sigma}}{\sigma} \right) - \Phi\left( \frac{(\hat{\mu} - \mu) - z_{1-\alpha/2}\, \hat{\sigma}}{\sigma} \right) \right] = 1 - \alpha + c(\theta)\, n^{-1} + \cdots$$
where
$$c(\theta)\, n^{-1} = -(\sigma^2)^{-1} z_{1-\alpha/2}\, \phi(z_{1-\alpha/2})\, E\big[(\hat{\mu} - \mu)^2\big] + (\sigma^2)^{-1} z_{1-\alpha/2}\, \phi(z_{1-\alpha/2})\, E\big[\hat{\sigma}^2 - \sigma^2\big] - \frac{1}{4} (\sigma^2)^{-2} \big( z_{1-\alpha/2} + z_{1-\alpha/2}^3 \big)\, \phi(z_{1-\alpha/2})\, E\big[(\hat{\sigma}^2 - \sigma^2)^2\big],$$
by using an argument similar to that used in Syuhada (2008, p.17). As explained before, the $O(n^{-1})$ term arises due to parameter variability.
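The expansion of $c(\theta)\, n^{-1}$ can be checked numerically. The following is a minimal sketch in Python, assuming (this choice is ours and is not taken from the text) that $\hat{\mu}$ is the sample mean and $\hat{\sigma}^2$ is the maximum likelihood estimator with divisor $n$, so that to leading order $E[(\hat{\mu}-\mu)^2] = \sigma^2/n$, $E[\hat{\sigma}^2 - \sigma^2] = -\sigma^2/n$ and $E[(\hat{\sigma}^2 - \sigma^2)^2] = 2\sigma^4/n$. The function names, sample size and number of replications are illustrative only.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf = Phi, pdf = phi


def coverage_correction(alpha, n):
    """First-order term c(theta) * n**-1 of the coverage expansion.

    ASSUMPTION (not from the paper): mu_hat is the sample mean and
    sigma2_hat is the MLE with divisor n. Moments are kept to O(1/n)
    and expressed relative to sigma, which then cancels out.
    """
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    phi = STD_NORMAL.pdf(z)
    mse_mu = 1.0 / n    # E(mu_hat - mu)^2 / sigma^2
    bias_s2 = -1.0 / n  # E(sigma2_hat - sigma^2) / sigma^2
    mse_s2 = 2.0 / n    # E(sigma2_hat - sigma^2)^2 / sigma^4
    return (-z * phi * mse_mu
            + z * phi * bias_s2
            - 0.25 * (z + z ** 3) * phi * mse_s2)


def simulated_coverage(alpha, n, reps=20000, mu=0.0, sigma=1.0, seed=1):
    """Monte Carlo estimate of E[Phi(upper) - Phi(lower)] for the
    estimative interval mu_hat -/+ z * sigma_hat."""
    rng = random.Random(seed)
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    total = 0.0
    for _ in range(reps):
        y = [rng.gauss(mu, sigma) for _ in range(n)]
        mu_hat = sum(y) / n
        s_hat = (sum((v - mu_hat) ** 2 for v in y) / n) ** 0.5
        upper = (mu_hat - mu + z * s_hat) / sigma
        lower = (mu_hat - mu - z * s_hat) / sigma
        total += STD_NORMAL.cdf(upper) - STD_NORMAL.cdf(lower)
    return total / reps
```

For example, `1 - 0.05 + coverage_correction(0.05, 20)` gives the first-order approximation to the coverage of a nominal 95% interval with $n = 20$, which can be compared with `simulated_coverage(0.05, 20)`. Under these assumptions the two values should agree up to higher-order terms and simulation noise, and both fall below the nominal level.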
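The expected length of $I(\hat{\Theta})$ is
$$E\big(\text{length of } I(\hat{\Theta})\big) = 2\, z_{1-\alpha/2}\, E\big[(\hat{\sigma}^2)^{1/2}\big]. \qquad (1)$$
Now, by Taylor expansion about $\hat{\sigma}^2 = \sigma^2$,
$$(\hat{\sigma}^2)^{1/2} = (\sigma^2)^{1/2} + \frac{1}{2} (\sigma^2)^{-1/2} \big(\hat{\sigma}^2 - \sigma^2\big) - \frac{1}{8} (\sigma^2)^{-3/2} \big(\hat{\sigma}^2 - \sigma^2\big)^2 + \cdots$$
We obtain
$$(1) = 2\, z_{1-\alpha/2}\, (\sigma^2)^{1/2} + z_{1-\alpha/2}\, (\sigma^2)^{-1/2}\, E\big[\hat{\sigma}^2 - \sigma^2\big] - \frac{1}{4}\, z_{1-\alpha/2}\, (\sigma^2)^{-3/2}\, E\big[(\hat{\sigma}^2 - \sigma^2)^2\big] + \cdots$$
which is equal to $2\, z_{1-\alpha/2}\, (\sigma^2)^{1/2} + q(\theta)\, n^{-1} + \cdots$ for the usual sorts of estimators $\hat{\sigma}^2$ of $\sigma^2$, where $q(\theta)$ is determined by the asymptotic moments of $\hat{\sigma}^2$.
An improved prediction interval is constructed by absorbing the $O(n^{-1})$ term. The resulting interval is
$$I^{+}(\hat{\Theta}) = \left[ \hat{\mu} - z_{1-\alpha/2}\, \hat{\sigma} - d(\hat{\Theta}),\; \hat{\mu} + z_{1-\alpha/2}\, \hat{\sigma} + d(\hat{\Theta}) \right]$$
where $d(\hat{\Theta}) = -c(\hat{\Theta})\, n^{-1} \big/ \big( 2\, (\hat{\sigma}^2)^{-1/2}\, \phi(z_{1-\alpha/2}) \big)$. The coverage probability of this improved prediction interval is $1 - \alpha + O(n^{-3/2})$, whilst the expected length of this interval is
$$E\big(\text{length of } I^{+}(\hat{\Theta})\big) = 2\, z_{1-\alpha/2}\, E\big[(\hat{\sigma}^2)^{1/2}\big] + 2\, E\big[d(\hat{\Theta})\big]. \qquad (2)$$
By, again, the Taylor expansion for $(\hat{\sigma}^2)^{1/2}$ above and the following Taylor expansion
$$d(\hat{\Theta}) = d(\theta) + \left. \frac{\partial d(\hat{\Theta})}{\partial \hat{\mu}} \right|_{\hat{\Theta} = \theta} (\hat{\mu} - \mu) + \left. \frac{\partial d(\hat{\Theta})}{\partial \hat{\sigma}^2} \right|_{\hat{\Theta} = \theta} \big(\hat{\sigma}^2 - \sigma^2\big) + \cdots,$$
we find that (2) is equal to
$$2\, z_{1-\alpha/2}\, (\sigma^2)^{1/2} + (\sigma^2)^{-1/2}\, z_{1-\alpha/2}\, E\big[(\hat{\mu} - \mu)^2\big] + \frac{1}{4} (\sigma^2)^{-3/2}\, z_{1-\alpha/2}^3\, E\big[(\hat{\sigma}^2 - \sigma^2)^2\big] + \cdots = 2\, z_{1-\alpha/2}\, (\sigma^2)^{1/2} + q^{+}(\theta)\, n^{-1} + \cdots,$$
where $q^{+}(\theta)$ is determined by the asymptotic moments of $\hat{\sigma}^2$ and $\hat{\mu}$.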
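The length terms and the improved interval can be illustrated with a short computation. This is a sketch only, again assuming (our choice, not stated in the text) that $\hat{\mu}$ is the sample mean and $\hat{\sigma}^2$ is the maximum likelihood estimator with divisor $n$, with moments kept to order $n^{-1}$. Under these assumptions the plug-in quantity $c(\hat{\Theta})\, n^{-1}$ does not depend on $\hat{\sigma}$. The function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: pdf = phi


def length_terms(alpha, n, sigma):
    """Return (q(theta) * n**-1, q_plus(theta) * n**-1).

    ASSUMPTION (not from the paper): sample mean and divisor-n MLE,
    with leading-order moments sigma^2/n, -sigma^2/n and 2*sigma^4/n.
    """
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    s2 = sigma ** 2
    mse_mu = s2 / n            # E(mu_hat - mu)^2
    bias_s2 = -s2 / n          # E(sigma2_hat - sigma^2)
    mse_s2 = 2 * s2 ** 2 / n   # E(sigma2_hat - sigma^2)^2
    # O(1/n) part of the expansion of (1)
    q_term = z * s2 ** -0.5 * bias_s2 - 0.25 * z * s2 ** -1.5 * mse_s2
    # O(1/n) part of the expansion of (2)
    q_plus_term = (s2 ** -0.5 * z * mse_mu
                   + 0.25 * s2 ** -1.5 * z ** 3 * mse_s2)
    return q_term, q_plus_term


def prediction_intervals(y, alpha):
    """Return (estimative interval, improved interval) for one sample."""
    n = len(y)
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    phi = STD_NORMAL.pdf(z)
    mu_hat = sum(y) / n
    s_hat = (sum((v - mu_hat) ** 2 for v in y) / n) ** 0.5
    # Plug-in c(Theta_hat) * n**-1; sigma_hat cancels under the
    # assumed moments, leaving a function of z and n only.
    c_term = (-z * phi / n - z * phi / n
              - 0.25 * (z + z ** 3) * phi * 2.0 / n)
    # d = -c * n**-1 / (2 * sigma_hat**-1 * phi(z))
    d = -c_term / (2 * phi / s_hat)
    half = z * s_hat
    estimative = (mu_hat - half, mu_hat + half)
    improved = (mu_hat - half - d, mu_hat + half + d)
    return estimative, improved
```

With these estimators $d(\hat{\Theta})$ is positive, so the improved interval is wider than the estimative one, and the $O(n^{-1})$ length term changes sign from negative ($q$) to positive ($q^{+}$). This is one instance of the trade-off between coverage and expected length described in the Introduction.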
