On Variance Estimation Under Shifts in the Mean
Total Page:16
File Type:pdf, Size:1020Kb
AStA Advances in Statistical Analysis https://doi.org/10.1007/s10182-020-00366-5 ORIGINAL PAPER On variance estimation under shifts in the mean Ieva Axt1 · Roland Fried1 Received: 18 April 2019 / Accepted: 18 March 2020 © The Author(s) 2020 Abstract In many situations, it is crucial to estimate the variance properly. Ordinary variance estimators perform poorly in the presence of shifts in the mean. We investigate an approach based on non-overlapping blocks, which yields good results in change- point scenarios. We show the strong consistency and the asymptotic normality of such blocks-estimators of the variance under independence. Weak consistency is shown for short-range dependent strictly stationary data. We provide recommenda- tions on the appropriate choice of the block size and compare this blocks-approach with diference-based estimators. If level shifts occur frequently and are rather large, the best results can be obtained by adaptive trimming of the blocks. Keywords Blockwise estimation · Change-point · Trimmed mean 1 Introduction We consider a sequence of random variables Y1, … , YN generated by the model K Y = X + h I . t t k t≥tk (1) k=1 Most of the time we assume that X1, … , XN are i.i.d. random variables with 2 E Xt = and Var Xt = , but this will be relaxed occasionally to allow for a short-range dependent strictly stationary sequence. The observed data y1, … , yN are afected by an unknown number K of level shifts of possibly diferent heights h1, … , hK at diferent time points t1, … , tK . Our goal is the estimation of the vari- 2 ance . Without loss of generality, we will set = 0 in the following. * Ieva Axt [email protected] Roland Fried [email protected] 1 TU Dortmund University, Dortmund, Germany Vol.:(0123456789)1 3 I. Axt, R. Fried 2 In Sect. 2 we analyse estimators of from the sequence of observations (Yt)t≥1 by combining estimates obtained from splitting the data into several blocks. Without the need of explicit distributional assumptions the mean of the blockwise estimates turns out to be consistent if the size and the number of blocks increases, and the number of jumps increases slower than the number of blocks. If many jumps in the mean are expected to occur, an adaptively trimmed mean of the blockwise estimates can be used, see Sect. 3. In Sect. 4 a simulation study is conducted to assess the performance of the proposed approaches. In Sect. 5 the estimation procedures are applied to real data sets, while Sect. 6 summarizes the results of this paper. 2 Estimation of the variance by averaging When dealing with independent identically distributed data the sample variance is the 2 common choice for estimation of . However, if we are aware of a possible presence of level shifts at unknown locations, it is reasonable to divide the sample Y1, … , YN into m non-overlapping blocks of size n = N∕m and to calculate the average of the m sample variances derived from the diferent blocks. A similar approach has been used in Dai et al. (2015) in the context of repeated⌊ measurements⌋ data and in Rooch et al. (2019) for estimation of the Hurst parameter. 2 The blocks-estimator Mean of the variance investigated here is defned as m 1 2 = S2, Mean j (2) m j=1 S2 = 1 n (Y − Y )2 Y = 1 n Y Y , … , Y where j n−1 t=1 j,t j , j n t=1 j,t and j,1 j,n are the observa- j n tions in the th block.∑ We are interested in∑ fnding the block size which yields a low mean squared error (MSE) under certain assumptions. In what follows, we will concentrate on the situation where all jump heights are pos- itive. This is a worse scenario than having both, positive and negative jumps, since the data are more spread in the former case resulting in a larger positive bias of most scale estimators. 2.1 Asymptotic properties We will use some algebraic rules for derivation of the expectation and the variance of 2 quadratic forms in order to calculate the MSE of Mean , see Seber and Lee (2012). Let B be the number of blocks with jumps in the mean and K ≥ B the total number of 2 jumps. The expected value and the variance of Mean are given as follows: 1 3 On variance estimation under shifts in the mean B 2 2 1 T E Mean = + j Aj, m(n − 1) j=1 B 1 4(n − 3) 42 Var 2 = 4 − + TA , Mean m n n(n − 1) m2(n − 1)2 j j j=1 = E X4 , A = − 1 1 1T 1 =(1, … ,1)T where 4 1 n n n n , n is the unit matrix, n and j con- tains the expected values of the random variables in the perturbed block j = 1, … , B , T T T i.e., j =(j,1, … , j,n) =(E(Yj,1), … , E(Yj,n)) . The term j Aj∕(n − 1) is the empirical variance of the expected values E(Yj,1), … , E(Yj,n) in block j. In a jump- T free block, we have j Aj = 0 , since all expected values and therefore the elements of j are equal. The blocks-estimator (2) estimates the variance consistently if the number of blocks grows sufciently fast as is shown in Theorem 1. Theorem 1 Y , … , Y Y = X + K h I Let 1 N with t t k=1 k t≥tk from Model (1) be segregated into m blocks of size n, where t1, … , tK are the time points of the jumps of size ∑ h1, … , hK , respectively. Let B out of m blocks be contaminated by K1, … , KB jumps, B 4 respectively with Kj = K = K(N) Moreover let E( X1 ) < ∞ , 2 j=1 . , , K K h = o(m) m → ∞ k=1 k and∑ , whereas the block size n can be fxed or increas- m ing� as N →�∞ Then 2 = 1 S2 → 2 almost surely ∑ . Mean m j=1 j . ∑ Proof Without loss of generality assume that the frst B out of m blocks are contami- 2 nated by K1, … , KB jumps, respectively. Let the term Sj,0 denote the empirical vari- 2 ance of the uncontaminated data in block j, while Sj,h is the empirical variance when Kj level shifts are present. Moreover, Yj,1, … , Yj,n are the observations in the jth = E Y = 1 n E Y block, j,t j,t and j n t=1 j,t . Then we have m ∑ m � � B 2 1 2 1 2 1 2 Mean = Sj = Sj,0 + Sj,h m j=1 m j=B+1 m j=1 m B n 2 1 2 1 1 = Sj,0 + Xj,t + j,t − Xj − j m j=B+1 m j=1 n − 1 t=1 m B n (3) 1 2 1 2 = Sj,0 + (Xj,t − Xj)(j,t − j) m j=1 m j=1 n − 1 t=1 B n 1 1 2 + j,t − j . m j=1 n − 1 t=1 For the second term in the last Eq. (3), we have almost surely 1 3 I. Axt, R. Fried B n 2 (X − X )( − ) m(n − 1) j,t j j,t j j=1 t=1 B n 2 ≤ (X − X ) ( − ) j,t j j,t j m(n − 1) j=1 t=1 B n K (4) 2 ≤ (X − X ) h m(n − 1) j,t j k j=1 t=1 k=1 K B n 2 n 1 1 = B h (X − X ) ⟶ 0. k m n − 1 B n j,t j k=1 j=1 t=1 1 B 1 n The term j=1 t=1 (Xj,t− Xj) in (4) is a random variable with fnite moments B n if n and B are fxed. This random variable converges to the term ∑ ∑ � � E 1 n (X − X ) � � B → ∞ B → ∞ n → ∞ n t=1 j,t j almost� surely� if . In the case of and � �E X this term∑ converges� to� 1 almost surely due to Theorem 2 of Hu et al. (1989) � E�( X 4) < ∞ S2 − E S2 and the condition� � 1 , since j j are uniformly bounded with 2 2 → 2 → P( Sj − E Sj > t) 0 ∀t due to Chebyshev’s inequality and Var(Sj ) 0 . More- K ≤ K over, we used the fact that B k=1 hk K k=1 hk = o(m). The following is valid for the third term in (3): �∑ � �∑ � � � � � B n � � � B � n K 2 1 1 2 1 1 − ≤ h m n 1 j,t j m n 1 k j=1 − t=1 j=1 − t=1 k=1 K 2 B n = h ⟶ 0. m n 1 k − k=1 2 The frst term of the last equation in (3) converges almost surely to due to the results on triangular arrays in Theorem 2 of Hu et al. (1989), assuming that the con- 4 2 2 dition E( X1 ) < ∞ holds, since Sj − E Sj are uniformly bounded with 2 2 → 2 → P( Sj − E Sj > t) 0 ∀t due to Chebyshev’s inequality and Var(Sj ) 0 . Appli- cation of Slutsky’s Theorem proves the result. ◻ Remark 2 1. If the jump heights are bounded by a constant h ≥ hk, k = 1, … , K, the strongest restriction arises2 if all heights equal this upper bound resulting in the constraint K 3 2 K k=1 hk = K h = o(m) .