U.U.D.M. Project Report 2008:22

A simulation method for skewness correction

Måns Eriksson

Degree project in mathematical statistics, 30 credits. Supervisor and examiner: Silvelyn Zwanzig. December 2008

Department of Mathematics Uppsala University

Abstract

Let X_1, ..., X_n be i.i.d. random variables with known variance and skewness. A one-sided confidence interval for the mean with approximate confidence level α can be constructed using normal approximation. For skew distributions the actual confidence level will then be α + o(1). We propose a method for obtaining confidence intervals with confidence level α + o(n^{-1/2}) using skewness correcting pseudo-random variables. The method is compared with a known method, Edgeworth correction.

Acknowledgements

I would like to thank my advisor Silvelyn Zwanzig for introducing me to the subject, for mathematical and stylistic guidance and for always encouraging me.

I would also like to thank my friends and teachers at and around the department of mathematics for having inspired me to study mathematics, and for continuing to inspire me.

Contents

1 Introduction
  1.1 Skewness
  1.2 Setting and notation

2 The Edgeworth expansion
  2.1 Definition and formal conditions
  2.2 Derivations
    2.2.1 Edgeworth expansion for S_n
    2.2.2 Edgeworth expansions for more general statistics
    2.2.3 Edgeworth expansion for T_n
    2.2.4 Some remarks; skewness correction
  2.3 Cornish-Fisher expansions for quantiles

3 Methods for skewness correction
  3.1 Coverages of confidence intervals
  3.2 Edgeworth correction
  3.3 The bootstrap
    3.3.1 The bootstrap and S_n
    3.3.2 Bootstrap confidence intervals
  3.4 A new simulation method
    3.4.1 Skewness correction through addition of a random variable
    3.4.2 Simulation procedure

4 Comparison
  4.1 Coverages of confidence intervals
  4.2 Criteria for the comparison
  4.3 Comparisons of the upper limits
  4.4 Simulation results
  4.5 Discussion

A Appendix: Skewness and kurtosis
  A.1 The skewness of a sum of random variables
  A.2 The kurtosis of a sum of random variables

B Appendix: P(θ̂_new ≤ θ̂_Ecorr)

C Appendix: Simulation results

1 Introduction

1.1 Skewness

The notion of skewness has long been a part of statistics. It dates back to the 19th century, most notably to an article by Pearson from 1895 ([26]). Skew distributions are found in all areas of application, ranging from finance to biology and physics. It has been seen both in theory and in practice that deviations from normality in the form of skewness can have effects too big to ignore on the validity and performance of many statistical methods and procedures. This thesis discusses skewness in the context of the central limit theorem and normal approximation, in particular as applied to confidence intervals. Some methods for skewness correction are discussed and a new simulation method is proposed. We assume that the concept of skewness is known and refer to Appendix A for some basic facts about skewness.

1.2 Setting and notation

Throughout the thesis we assume that we have an i.i.d. sample X_1, ..., X_n, with EX = μ, Var(X) = σ^2 and E|X|^3 < ∞, such that X satisfies Cramér's condition lim sup_{t→∞} |ϕ(t)| < 1, where ϕ is the characteristic function of X. At times we will also assume that EX^4 < ∞. We use X to denote a generic X_i, that is, X is a random variable with the same distribution as the X_i. Thus, for instance, EX is the mean of the distribution of the observations; EX = EX_i for all i.

The α-quantile v_α of the distribution of some random variable X is defined to be such that P(X ≤ v_α) = α. When X ~ N(0,1) we denote the quantile λ_α, that is, Φ(λ_α) = α.

We use A_n to denote a general statistic, S_n = n^{1/2}(X̄ − μ)/σ to denote the standardized sample mean and T_n = n^{1/2}(X̄ − μ)/σ̂ to denote the studentized sample mean, where σ̂^2 = (1/n) Σ_i (X_i − X̄)^2.

The skewness E(X − μ)^3/σ^3 of a random variable X is denoted Skew(X), γ or γ_X if we need to distinguish between different random variables. The kurtosis of X, E(X − μ)^4/σ^4 − 3, is denoted Kurt(X), κ or κ_X. Basic facts about skewness and kurtosis are stated in Appendix A.

As for asymptotic notation, for real-valued sequences a_n and b_n we say that a_n = o(b_n) if a_n/b_n → 0 as n → ∞ and a_n = O(b_n) if a_n/b_n is bounded as n → ∞. Finally, we say that a sequence X_n of random variables is bounded in probability if lim_{c→∞} lim sup_{n→∞} P(|X_n| > c) = 0. We write this as X_n = O_P(1), and if, for some sequence a_n, a_n X_n = O_P(1), we write X_n = O_P(1/a_n).

2 The Edgeworth expansion

In this section we introduce our main tool, the Edgeworth expansion. Later we will use it to determine the coverage of confidence intervals.

2.1 Definition and formal conditions

Theorem 1. Assume that X_1, ..., X_n is an i.i.d. sample from a univariate distribution with mean μ, variance σ^2 and E|X|^{j+2} < ∞, that satisfies lim sup_{t→∞} |ϕ(t)| < 1. Let S_n = n^{1/2}(X̄ − μ)/σ. Then

P(S_n ≤ x) = Φ(x) + n^{-1/2}p_1(x)φ(x) + ... + n^{-j/2}p_j(x)φ(x) + o(n^{-j/2})   (1)

uniformly in x, where Φ(x) and φ(x) are the standard normal distribution function and density function and p_k is a polynomial of degree 3k − 1. In particular

p_1(x) = −(1/6)γ(x^2 − 1)   and
p_2(x) = −x((1/24)κ(x^2 − 3) + (1/72)γ^2(x^4 − 10x^2 + 15)).

Proof. A proof is given in Section 2.2.1. See also [7] and [8].

(1) is called an Edgeworth expansion for S_n. The condition lim sup_{t→∞} |ϕ(t)| < 1 is known as Cramér's condition and was derived by Cramér in [8]. Note that the condition holds whenever X is absolutely continuous. This is an immediate consequence of the Riemann-Lebesgue lemma (Theorem 1.5 in Chapter 4 of [18]). Moreover, if we limit the expansion to P(S_n ≤ x) = Φ(x) + n^{-1/2}p_1(x)φ(x) + o(n^{-1/2}), so that the remainder term is o(n^{-1/2}), then it suffices that X has a non-lattice distribution, as was shown by Esseen in [16]. The Edgeworth expansion was first developed for the statistic S_n but has later been extended to other statistics.
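To give a concrete sense of the first-order term, the following small R sketch (ours, not from the thesis; the choice X ~ Exp(1), for which μ = 1, σ = 1 and γ = 2, is an assumption made for the illustration) compares the simulated distribution function of S_n with the normal approximation and the one-term Edgeworth approximation:

# Compare P(S_n <= x) with Phi(x) and Phi(x) + n^(-1/2) p_1(x) phi(x)
# for X ~ Exp(1): mu = 1, sigma = 1, gamma = 2.
set.seed(1)
n <- 10; B <- 1e5; gamma <- 2
Sn <- replicate(B, sqrt(n) * (mean(rexp(n)) - 1))
x <- 1.645
p1 <- -gamma * (x^2 - 1) / 6
c(empirical = mean(Sn <= x),
  normal    = pnorm(x),
  edgeworth = pnorm(x) + p1 * dnorm(x) / sqrt(n))

In runs of this kind the one-term approximation typically removes most of the error of the plain normal approximation.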

Definition 1. Let A_n denote a statistic. Then if

P(A_n ≤ x) = Φ(x) + n^{-1/2}a_1(x)φ(x) + ... + n^{-j/2}a_j(x)φ(x) + o(n^{-j/2}),   (2)

where Φ(x) and φ(x) are the standard normal distribution function and density function and a_k is a polynomial of degree 3k − 1, (2) is called the Edgeworth expansion for A_n. If a_k = 0 for all k < i, the normal approximation of the distribution is said to be i:th order correct.

In general, the polynomials a_k will depend on the moments of the statistic. We can therefore, in a sense, view the Edgeworth expansion as an extension of the central limit theorem, where information about the higher moments of the involved random variables is used to obtain a better approximation of the distribution function of A_n. The expansion gives an expression for the size of the remainder term depending on the sample size.

We might want to compare this to the Berry-Esseen theorem (see for instance [16] or Section 7.6 of [18]), which essentially, in our context, says that the error in the normal approximation is of order n^{-1/2}.

The Edgeworth expansion for simple statistics was first introduced in papers by Chebyshev in 1890 ([4]) and Edgeworth in 1894, 1905 and 1907 ([12, 13, 14]). The idea was made mathematically rigorous by Cramér in 1928 ([7]) and Esseen in 1945 ([16]). The expansions and their applications for more general statistics were then developed in several papers, including [2], by various authors in the mid-1900s. A thorough treatment of the Edgeworth expansion is found in Chapter 2 of [22]. Chapter 13 of [10] gives a brief introduction to the Edgeworth expansion with some of the most important results, and Section 17.7 of [8] is a standard reference for the case where the expansion for S_n is considered.

Conditions for (2) to hold are given next. Although we will focus on expansions of quite simple statistics, the Edgeworth expansion can be used in very general circumstances. We state a theorem by Bhattacharya and Ghosh ([2]) that provides conditions for the Edgeworth expansion (2) to hold in a general case and illustrate how the theorem relates to the expansion for S_n.

Let X, X_1, X_2, ..., X_n be i.i.d. random column vectors in R^d with mean μ and let X̄ = n^{-1} Σ_{i=1}^n X_i. Let A : R^d → R be a function of the form A_S(x) = (g(x) − g(μ))/h(μ) or A_T(x) = (g(x) − g(μ))/h(x), where g and h are known, θ̂ = g(X̄) is an estimator of the scalar parameter θ = g(μ), h(μ)^2 is the asymptotic variance of n^{1/2}θ̂ and h(X̄) is an estimator of h(μ).

For t ∈ R^d, t = (t^{(1)}, t^{(2)}, ..., t^{(d)}), define ||t|| = ((t^{(1)})^2 + ... + (t^{(d)})^2)^{1/2} and, for a random d-vector X, ϕ(t) = E(exp(i Σ_{j=1}^d t^{(j)}X^{(j)})).

Theorem 2. Assume that A has j + 2 continuous derivatives in a neighbourhood of μ = E(X) and that A(μ) = 0. Furthermore, assume that E(||X||^{j+2}) < ∞ and that the characteristic function ϕ of X is such that lim sup_{||t||→∞} |ϕ(t)| < 1. Let σ_A be the asymptotic standard deviation of n^{1/2}A(X̄) and assume that σ_A > 0. Then, with A_n = n^{1/2}A(X̄)/σ_A and for j ≥ 1,

P(A_n ≤ x) = Φ(x) + n^{-1/2}a_1(x)φ(x) + ... + n^{-j/2}a_j(x)φ(x) + o(n^{-j/2})   (3)

uniformly in x. a_k is a polynomial of degree 3k − 1 with coefficients depending on A and on moments of X of order less than or equal to k + 2. a_k is odd for even k and even for odd k.

Proof. See [2].

Note that the above theorem is a summary of the results in Bhattacharya and Ghosh's 1978 paper and is thus not stated as one single theorem in the original work. Our summary largely resembles that in Chapter 2 of [22].

The class of functions that satisfy the conditions in Theorem 2 contains many functions of great interest. In particular, we can write any moment estimator, i.e. an estimator based on sample moments, as a function A of a vector mean in the same way that we will do below for the mean.

The Edgeworth expansion is sometimes written as an infinite series, P(A_n ≤ x) = Φ(x) + n^{-1/2}a_1(x)φ(x) + ... + n^{-j/2}a_j(x)φ(x) + .... However, for the series to converge it is required, for an absolutely continuous random variable, that E(exp((X − μ)^2/(4σ^2))) < ∞; a condition that fails even for exponentially distributed X (see [7]). Thus we prefer the truncated series used in (3), which also turns out to be more useful in practice.

Before we show how Theorem 2 relates to the expansion for S_n we state a slightly more general corollary.

Corollary 1. Let A_n be either A_S = n^{1/2}(θ̂ − θ)/σ or A_T = n^{1/2}(θ̂ − θ)/σ̂, where θ is some unknown scalar parameter, θ̂ is an asymptotically unbiased estimator of θ, σ^2 is the asymptotic variance of n^{1/2}θ̂ and σ̂^2 is some consistent estimator of σ^2. Then the first two polynomials in the Edgeworth expansion are

a_1(x) = −(k_{1,2} + (1/6)k_{3,1}(x^2 − 1))

and

a_2(x) = −x((1/2)(k_{2,2} + k_{1,2}^2) + (1/24)(k_{4,1} + 4k_{1,2}k_{3,1})(x^2 − 3) + (1/72)k_{3,1}^2(x^4 − 10x^2 + 15)),

where the k_{j,i} come from an expansion of the j:th cumulant of A_n:

κ_{j,n} = n^{-(j-2)/2}(k_{j,1} + n^{-1}k_{j,2} + n^{-2}k_{j,3} + ...).

Proof. Details are given in Section 2.2.2.

Theorem 2 can be used to show that S_n as well as the studentized sample mean T_n = n^{1/2}(X̄ − μ)/σ̂ admit Edgeworth expansions. Assume that X_1, ..., X_n is a sample from a univariate distribution, that the unknown parameter θ is the mean μ of the distribution and that the distribution has variance σ^2. Take d = 2, X_i = (X_i, X_i^2)^T and μ = E(X) = (EX, EX^2)^T, and let g(x^{(1)}, x^{(2)}) = x^{(1)} and h(x^{(1)}, x^{(2)}) = (x^{(2)} − (x^{(1)})^2)^{1/2}. Then g(μ) = μ and g(X̄) = X̄. Furthermore h(μ)^2 = σ^2 and

h(X̄)^2 = n^{-1} Σ_{i=1}^n X_i^2 − (n^{-1} Σ_{i=1}^n X_i)^2 = n^{-1} Σ_{i=1}^n (X_i − X̄)^2 = σ̂^2.

A_S = (g(x) − g(μ))/h(μ) and A_T = (g(x) − g(μ))/h(x) both fulfill the conditions in Theorem 2, and the asymptotic standard deviation σ_A of n^{1/2}A(X̄) is 1. Furthermore, the moment condition E((X^2 + (X^2)^2)^{(j+2)/2}) < ∞ can be reduced to E(|X|^{j+2}) < ∞, and Cramér's condition reduces to lim sup_{t→∞} |ϕ(t)| < 1. Thus the conditions for the existence of the expansions for S_n and T_n follow and turn out to be those in Theorem 1. Both statistics are of the form that is considered in Corollary 1 and the polynomials in their expansions can thus be found.

2.2 Derivations

Knowing under which conditions the Edgeworth expansion exists, we are ready to derive the expressions for the polynomials p_1 and p_2.

2.2.1 Edgeworth expansion for S_n

As before, let S_n = n^{1/2}(X̄ − μ)/σ. Assuming that E|X|^{j+2} < ∞ and that X fulfills Cramér's condition, by Corollary 1

P(S_n ≤ x) = Φ(x) + n^{-1/2}p_1(x)φ(x) + ... + n^{-j/2}p_j(x)φ(x) + o(n^{-j/2}).   (4)

To actually be able to make reasonable use of the expansion we need to find expressions for the polynomials p_k. The case where j = 2 will prove to be of special interest to us, so although we consider general j we will in the end only derive explicit expressions for p_1 and p_2. Our exposition is mainly based on those in [8] and [22] but aims to be somewhat more thorough than those that we have seen in the existing literature.

Since we will assume the existence of moments of order higher than 2, S_n is asymptotically N(0,1)-distributed, i.e. S_n converges in distribution to S ~ N(0,1), and thus the characteristic function ϕ_n of S_n converges to e^{-t^2/2}, the characteristic function of the standard normal distribution, as n tends to infinity. That is, as n → ∞,

ϕ_n(t) = E(exp(itS_n)) → E(exp(itS)) = e^{-t^2/2}   for −∞ < t < ∞,

where S ~ N(0,1).

Recall that if Y_1, ..., Y_n are i.i.d. and S_n = Y_1 + ... + Y_n then ϕ_{S_n}(t) = (ϕ_{Y_1}(t))^n (see Theorem 1.8 in Chapter 4 of [18]). Now, let Y_i = (X_i − μ)/σ. Then S_n = n^{1/2}(X̄ − μ)/σ = n^{1/2}(1/n) Σ_{i=1}^n Y_i = n^{-1/2} Σ_{i=1}^n Y_i and thus

ϕ_n(t) = E(exp(itS_n)) = E(exp(itn^{-1/2} Σ_{i=1}^n Y_i)) = (ϕ_Y(tn^{-1/2}))^n.

Next we define the cumulant generating function for Y as ln ϕ_Y(t). A MacLaurin expansion of ln ϕ_Y(t) shows that if E(|Y|^k) < ∞ then, after a rearrangement of the terms, we can write ln ϕ_Y(t) on the form

ln ϕ_Y(t) = Σ_{j=1}^k κ_j (it)^j/j! + o(|t|^k)   as t → 0

for some {κ_j}. We call the coefficients κ_j the cumulants of Y; in particular κ_j is the j:th cumulant. See Section 15.10 of [8] for details.

By Theorem 4.2 in Chapter 4 of [18] we have that ϕ_Y(t) = 1 + Σ_{j=1}^k EY^j (it)^j/j! + o(|t|^k) as t → 0 when E(|Y|^k) < ∞. If we for a moment don't worry about the existence of moments and convergence of the series, we conclude that

Σ_{j=1}^∞ κ_j (it)^j/j! = ln ϕ_Y(t) = ln(1 + Σ_{j=1}^∞ EY^j (it)^j/j!),

and by looking at the MacLaurin expansion of the right hand side (i.e. the MacLaurin expansion of the function ln(1 + x) where we replace x with Σ EY^j (it)^j/j!) it follows that

Σ_{j=1}^∞ κ_j (it)^j/j! = Σ_{k=1}^∞ ((−1)^{k+1}/k) (Σ_{j=1}^∞ EY^j (it)^j/j!)^k.

Comparing the coefficients of (it)^j we find that

κ_1 = EY = 0,
κ_2 = EY^2 − (EY)^2 = 1,
κ_3 = EY^3 − 3EY^2 EY + 2(EY)^3 = EY^3,
κ_4 = EY^4 − 3(EY^2)^2 − 4EY^3 EY + 12(EY)^2 EY^2 − 6(EY)^4 = EY^4 − 3.

The expression for κ_j holds whenever E(|Y|^j) < ∞. Note that the assumption that E(|X|^j) < ∞ implies that E(|Y|^j) < ∞.

Returning our attention to S_n, the relation ϕ_n(t) = (ϕ_Y(tn^{-1/2}))^n and the fact that κ_1 = 0 and κ_2 = 1 now give us that

ϕ_n(t) = exp(n Σ_{j=1}^∞ κ_j (itn^{-1/2})^j/j!) = exp(Σ_{j=1}^∞ n^{-(j-2)/2} κ_j (it)^j/j!)
       = exp(−t^2/2 + n^{-1/2}(1/3!)κ_3(it)^3 + ... + n^{-(j-2)/2}(1/j!)κ_j(it)^j + ...)
       = e^{-t^2/2} exp(n^{-1/2}(1/3!)κ_3(it)^3 + ... + n^{-(j-2)/2}(1/j!)κ_j(it)^j + ...)
       = e^{-t^2/2}(1 + n^{-1/2}r_1(it) + n^{-1}r_2(it) + ... + n^{-j/2}r_j(it) + ...),

where the last equality is obtained through the MacLaurin expansion e^x = 1 + x + x^2/2! + .... Here r_j is a polynomial of degree 3j that depends on κ_3, ..., κ_{j+2}. By comparing the coefficients of n^{-j/2} on the last two lines we find that

r_1(x) = (1/6)κ_3 x^3   and   r_2(x) = (1/24)κ_4 x^4 + (1/72)κ_3^2 x^6.

We can rewrite the expression for ϕ_n(t) above as

ϕ_n(t) = e^{-t^2/2} + n^{-1/2}r_1(it)e^{-t^2/2} + n^{-1}r_2(it)e^{-t^2/2} + ... + n^{-j/2}r_j(it)e^{-t^2/2} + ...   (5)

Now, since ϕ_n(t) = ∫_{-∞}^{∞} e^{itx} dP(S_n ≤ x) and e^{-t^2/2} = ∫_{-∞}^{∞} e^{itx} dΦ(x), it seems plausible that there is an inversion of (5) of the form

P(S_n ≤ x) = Φ(x) + n^{-1/2}R_1(x) + ... + n^{-j/2}R_j(x) + ...,

where R_k is a function such that ∫_{-∞}^{∞} e^{itx} dR_k(x) = r_k(it)e^{-t^2/2}, so that

ϕ_n(t) = ∫_{-∞}^{∞} e^{itx} dP(S_n ≤ x) = ∫_{-∞}^{∞} e^{itx} dΦ(x) + n^{-1/2} ∫_{-∞}^{∞} e^{itx} dR_1(x) + ... = (5).

We would thus like to try to find such R_k. By repeating integration by parts j times we find that

e^{-t^2/2} = ∫_{-∞}^{∞} e^{itx} dΦ(x) = (−it)^{-1} ∫_{-∞}^{∞} e^{itx} dΦ^{(1)}(x) = ... = (−it)^{-j} ∫_{-∞}^{∞} e^{itx} dΦ^{(j)}(x),

where Φ^{(k)}(x) = d^k Φ(x)/dx^k = D^k Φ(x). Hence ∫_{-∞}^{∞} e^{itx} d((−D)^k Φ(x)) = (it)^k e^{-t^2/2}. Interpreting r_k(−D) as a polynomial in D, making r_k(−D) a differential operator, we thus have that ∫_{-∞}^{∞} e^{itx} d(r_k(−D)Φ(x)) = r_k(it)e^{-t^2/2}. Thus

R_k(x) = r_k(−D)Φ(x).

By differentiating Φ(x) we find that, for k ≥ 1, (−D)^k Φ(x) = −H_{k-1}(x)φ(x), where φ(x) is the density function of the standard normal distribution and the H_k are the Hermite polynomials:

H_0(x) = 1,
H_1(x) = x,
H_2(x) = x^2 − 1,
H_3(x) = x(x^2 − 3),
H_4(x) = x^4 − 6x^2 + 3,
H_5(x) = x(x^4 − 10x^2 + 15), ...
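As a quick numeric sanity check of the identity (−D)^k Φ(x) = −H_{k-1}(x)φ(x) (an illustration of ours, here for k = 3, using Φ''' = φ''), one can differentiate φ numerically in R:

x <- 0.7; h <- 1e-3
phi2 <- (dnorm(x + h) - 2 * dnorm(x) + dnorm(x - h)) / h^2  # numeric phi''(x)
c(minus_D3_Phi = -phi2,                 # (-D)^3 Phi(x) = -phi''(x)
  hermite_form = -(x^2 - 1) * dnorm(x)) # -H_2(x) phi(x); both ~ 0.159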

We concluded above that r_1(x) = (1/6)κ_3 x^3 and r_2(x) = (1/24)κ_4 x^4 + (1/72)κ_3^2 x^6, and since R_k(x) = r_k(−D)Φ(x) we thus have that

R_1(x) = (1/6)κ_3(−D)^3Φ(x) = −(1/6)κ_3 H_2(x)φ(x) = −(1/6)κ_3(x^2 − 1)φ(x)

and

R_2(x) = −(1/24)κ_4 H_3(x)φ(x) − (1/72)κ_3^2 H_5(x)φ(x)
       = −x((1/24)κ_4(x^2 − 3) + (1/72)κ_3^2(x^4 − 10x^2 + 15))φ(x).

Thus

P(S_n ≤ x) = Φ(x) + n^{-1/2}R_1(x) + ... + n^{-j/2}R_j(x) + ...
           = Φ(x) + n^{-1/2}p_1(x)φ(x) + n^{-1}p_2(x)φ(x) + ... + n^{-j/2}p_j(x)φ(x) + ...   (6)

with

p_1(x) = −(1/6)κ_3(x^2 − 1)   and   p_2(x) = −x((1/24)κ_4(x^2 − 3) + (1/72)κ_3^2(x^4 − 10x^2 + 15)).

κ_3 is the skewness of X and κ_4 the kurtosis. It can be shown (see Section 2.4 of [22]) that the inversion of (5) leading to (6) is valid when X is nonsingular and E(|X|^{j+2}) < ∞ if we limit the series to j terms:

P(S_n ≤ x) = Φ(x) + n^{-1/2}R_1(x) + ... + n^{-j/2}R_j(x) + o(n^{-j/2})
           = Φ(x) + n^{-1/2}p_1(x)φ(x) + n^{-1}p_2(x)φ(x) + ... + n^{-j/2}p_j(x)φ(x) + o(n^{-j/2}).   (7)

This completes the proof of half of Corollary 1.

If X ~ N(μ, σ^2) then S_n ~ N(0,1), so we'd expect p_j to be 0 for all j. This is indeed the case, since κ_j = 0 for j ≥ 3 for the standard normal distribution. Thus the "expansion" still holds when P(S_n ≤ x) = Φ(x).

It is of interest to note that both the skewness and the kurtosis are scale and translation invariant, so that the third and fourth cumulants of X and Y coincide (we use Y in the calculations above because standardized variables are easier to handle). It can be shown, by looking at characteristic functions or by straightforward calculation (as is done in Appendix A), that the skewness of X̄ is n^{-1/2} times the skewness of X. Similarly the kurtosis of X̄ is n^{-1} times the kurtosis of X, and in general, for j ≥ 2, the j:th cumulant of X̄ will be the j:th cumulant of X times n^{-(j-2)/2}. Thus we can view the factors n^{-(j-2)/2} in the Edgeworth expansion for S_n as coming from the cumulants of X̄.

2.2.2 Edgeworth expansions for more general statistics

The procedure for finding the Edgeworth expansion for a more general statistic A_n is essentially the same as that in the previous section. We briefly mention the result. Let A_n be either A_S = n^{1/2}(θ̂ − θ)/σ or A_T = n^{1/2}(θ̂ − θ)/σ̂, where θ is some unknown scalar parameter, θ̂ is an asymptotically unbiased estimator of θ, σ^2 is the asymptotic variance of n^{1/2}θ̂ and σ̂^2 is some consistent estimator of σ^2. Denote by κ_{j,n} the j:th cumulant of A_n. Under the regularity conditions stated in Theorem 2, for j ≥ 1, we can expand κ_{j,n} as

κ_{j,n} = n^{-(j-2)/2}(k_{j,1} + n^{-1}k_{j,2} + n^{-2}k_{j,3} + ...)

for some k_{j,i}, where k_{1,1} = 0 and k_{2,1} = 1. It can be shown, through calculations that are completely analogous to the S_n case, where the cumulants of A_n are replaced by their expansions, that the first two polynomials in the Edgeworth expansion for A_n are

a_1(x) = −(k_{1,2} + (1/6)k_{3,1}H_2(x)) = −(k_{1,2} + (1/6)k_{3,1}(x^2 − 1))   (8)

and

a_2(x) = −((1/2)(k_{2,2} + k_{1,2}^2)H_1(x) + (1/24)(k_{4,1} + 4k_{1,2}k_{3,1})H_3(x) + (1/72)k_{3,1}^2H_5(x))
       = −x((1/2)(k_{2,2} + k_{1,2}^2) + (1/24)(k_{4,1} + 4k_{1,2}k_{3,1})(x^2 − 3) + (1/72)k_{3,1}^2(x^4 − 10x^2 + 15)).   (9)

As before the inversion is valid, giving the truncated expansion

P(A_n ≤ x) = Φ(x) + n^{-1/2}a_1(x)φ(x) + ... + n^{-j/2}a_j(x)φ(x) + o(n^{-j/2})

when E|X|^{j+2} < ∞ and X satisfies Cramér's condition.

Thus the problem of finding the Edgeworth expansion for a statistic A_n of the form n^{1/2}(θ̂ − θ)/σ or n^{1/2}(θ̂ − θ)/σ̂ amounts to finding the terms k_{j,i} in the expansion

κ_{j,n} = n^{-(j-2)/2}(k_{j,1} + n^{-1}k_{j,2} + n^{-2}k_{j,3} + ...)

of the cumulants of A_n. As we discussed in the previous section, if A_n = S_n then

κ_{j,n} = n^{-(j-2)/2} κ_j

for j ≥ 2, where κ_j is the j:th cumulant of Y = (X − μ)/σ. Thus k_{j,1} = κ_j and k_{j,i} = 0 for i ≥ 2. This reduces the expressions for a_1 and a_2 above to those for p_1 and p_2 in (6).

2.2.3 Edgeworth expansion for T_n

Finally, consider the statistic T_n = n^{1/2}(X̄ − μ)/σ̂, where σ̂^2 = (1/n) Σ (X_i − X̄)^2. It can be shown that the k_{j,i} in the expansion of the cumulants of T_n are

k_{1,2} = −(1/2)γ,
k_{2,2} = (1/4)(7γ^2 + 12),
k_{3,1} = −2γ   and
k_{4,1} = 12γ^2 − 2κ + 6.

Inserting these into (8) and (9) we get

q_1(x) = a_1(x) = −(−(1/2)γ − (1/3)γ(x^2 − 1)) = (1/6)γ(2x^2 + 1)

and

q_2(x) = a_2(x) = −x((1/2)((1/4)(7γ^2 + 12) + (1/4)γ^2) + (1/24)(12γ^2 − 2κ + 6 + 4(−(1/2)γ)(−2γ))(x^2 − 3) + (1/72)(−2γ)^2(x^4 − 10x^2 + 15))
       = x((1/12)κ(x^2 − 3) − (1/18)γ^2(x^4 + 2x^2 − 3) − (1/4)(x^2 + 3)).

We require that EX^4 < ∞ and that X satisfies Cramér's condition for the expansion

P(T_n ≤ x) = Φ(x) + n^{-1/2}q_1(x)φ(x) + n^{-1}q_2(x)φ(x) + o(n^{-1})

to hold.

2.2.4 Some remarks; skewness correction

We've seen above that for both S_n and T_n the first polynomial a_1 depends on the skewness κ_3 = γ = E(X − μ)^3/σ^3 and that the second polynomial a_2 depends on γ^2 and the kurtosis κ_4 = κ = E(X − μ)^4/σ^4 − 3. This is also true for many more general statistics A_n. In such cases, a_1 is said to describe the primary effect of skewness while a_2 is said to describe the primary effect of kurtosis and the secondary effect of skewness. A skewness corrected statistic is thus a statistic that has been modified in some way so that a_1 = 0.

2.3 Cornish-Fisher expansions for quantiles

An interesting use of the Edgeworth expansion is asymptotic expansions of the quantiles of A_n, obtained by what is essentially an inversion of the Edgeworth expansion. Such expansions are called Cornish-Fisher expansions and first appeared in [6] and [17].

Let A_n be a statistic with the Edgeworth expansion

P(A_n ≤ x) = Φ(x) + n^{-1/2}a_1(x)φ(x) + ... + n^{-j/2}a_j(x)φ(x) + ...   (10)

and let v_α be the α-quantile of A_n, so that P(A_n ≤ v_α) = α. Furthermore, let λ_α be the α-quantile of the N(0,1)-distribution, i.e. let Φ(λ_α) = α. Then there exists an expansion of v_α in terms of λ_α:

v_α = λ_α + n^{-1/2}s_1(λ_α) + n^{-1}s_2(λ_α) + ... + n^{-j/2}s_j(λ_α) + ...   (11)

(11) is called the Cornish-Fisher expansion of v_α. The functions s_k are polynomials of degree at most k + 1, odd for even k and even for odd k, that depend on cumulants of order at most k + 2. They are determined by the polynomials a_k in (10). [22] contains a short introduction to the Cornish-Fisher expansion, where it is shown that s_1(x) = −a_1(x) and s_2(x) = a_1(x)a_1'(x) − (1/2)x a_1(x)^2 − a_2(x).
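The following R lines (an illustration of ours, again under the assumption X ~ Exp(1) with γ = 2) compare the one-term Cornish-Fisher approximation of the 0.95-quantile of S_n with a simulated quantile:

set.seed(1)
n <- 20; B <- 1e5; gamma <- 2; alpha <- 0.95
Sn <- replicate(B, sqrt(n) * (mean(rexp(n)) - 1))
la <- qnorm(alpha)
s1 <- gamma * (la^2 - 1) / 6             # s_1(x) = -p_1(x)
c(simulated      = unname(quantile(Sn, alpha)),
  normal         = la,
  cornish_fisher = la + s1 / sqrt(n))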

3 Methods for skewness correction

We discuss the need for skewness correction. Some methods for obtaining second order correct confidence intervals are described. We assume throughout the section that X is absolutely continuous.

3.1 Coverages of confidence intervals

Definition 2. Let I_θ(α) be a confidence interval for an unknown parameter θ with approximate confidence level α. We call α the nominal coverage of I_θ(α). Furthermore we call α′ = P(θ ∈ I_θ(α)) the actual coverage of the confidence interval and define the coverage error as the difference between the actual coverage and the nominal coverage, α′ − α.

If I_θ(α) has been reasonably constructed then this difference will converge to zero as the sample size n increases. We will illustrate how the Edgeworth expansion can be used to estimate the order of coverage errors.

Let X_1, ..., X_n be an i.i.d. sample from a distribution, with mean μ and variance σ^2, such that the Edgeworth expansions for S_n and T_n exist. Consider the two-sided confidence interval J_μ(α) = (X̄ − n^{-1/2}σz_α, X̄ + n^{-1/2}σz_α), where P(|Z| ≤ z_α) = α when Z ~ N(0,1). We find that

P(μ ∈ J_μ(α)) = P(S_n > −z_α) − P(S_n > z_α)
             = P(S_n ≤ z_α) − P(S_n ≤ −z_α)
             = Φ(z_α) − Φ(−z_α) + n^{-1/2}(p_1(z_α)φ(z_α) − p_1(−z_α)φ(−z_α))
               + n^{-1}(p_2(z_α)φ(z_α) − p_2(−z_α)φ(−z_α))
               + n^{-3/2}(p_3(z_α)φ(z_α) − p_3(−z_α)φ(−z_α)) + o(n^{-3/2})
             = α + 2n^{-1}p_2(z_α)φ(z_α) + o(n^{-3/2}).

The last equality follows since p_2 is odd and p_1, p_3 and φ are even. The same result holds if we have σ̂ instead of σ, with q_2 instead of p_2. Thus the coverage error for two-sided normal approximation confidence intervals is of order n^{-1}. In some sense we can think of the two-sided confidence intervals as containing an implicit skewness correction.

The situation is not as good for one-sided confidence intervals. Consider the one-sided normal approximation confidence interval I_μ(α) = (−∞, X̄ + n^{-1/2}σλ_α), where Φ(λ_α) = α. The coverage of I_μ(α) is

P(μ ∈ I_μ(α)) = P(μ ≤ X̄ + n^{-1/2}σλ_α) = P(S_n ≥ −λ_α)
             = 1 − (Φ(−λ_α) + n^{-1/2}p_1(−λ_α)φ(−λ_α) + o(n^{-1/2}))
             = α − n^{-1/2}p_1(λ_α)φ(λ_α) + o(n^{-1/2}).

The coverage of the interval I′_μ(α) = (−∞, X̄ + n^{-1/2}σ̂λ_α) is analogously found to be

α − n^{-1/2}q_1(λ_α)φ(λ_α) + o(n^{-1/2}).

Thus, for one-sided confidence intervals, normal approximation gives a coverage error of order n^{-1/2}.

The polynomials p_1 and q_1 both contain the skewness γ. Thus we see that the skewness of X affects the actual coverage of the normal approximation confidence intervals. In particular, when the skewness is zero the n^{-1/2} term of the coverage error disappears, and when the skewness is large the coverage error might be large. When X is skew it is possible to obtain confidence intervals with better coverage by correcting for skewness. Some methods for this are presented next.

We will assume that the variance σ^2 and the skewness γ are known. It might seem like a bit of a contradiction that the second and third central moments are known, but not the mean. An example where such a situation could occur is when a measuring instrument that has been used sufficiently much, so that the variance and skewness of its measurements are known, is used to measure something that has not been measured before. One could of course argue that in that case the distribution, or at least the quantiles, of the measurement errors might be known as well and that a parametric confidence interval would make more sense. Let us however assume that the quantiles are unknown and that the density function is unknown or too complicated to work with for such procedures to be fruitful.

3.2 Edgeworth correction

In many cases we wish to derive a confidence interval using some statistic A_n. In cases where the Edgeworth expansion for A_n is known, we can obtain confidence intervals with a coverage error of smaller order than that of the normal approximation interval. In particular, we can make an explicit correction for skewness using the following theorem, various versions of which were proved in [27], [19], [29] and [1].

Theorem 3. Let A_n be either S_n = n^{1/2}(X̄ − μ)/σ or T_n = n^{1/2}(X̄ − μ)/σ̂ and assume that A_n admits the Edgeworth expansion

P(A_n ≤ x) = Φ(x) + n^{-1/2}a_1(x)φ(x) + o(n^{-1/2}).

Then

P(A_n ≤ x − n^{-1/2}a_1(x)) = Φ(x) + o(n^{-1/2})   (12)

and

P(A_n ≤ x − n^{-1/2}â_1(x)) = Φ(x) + o(n^{-1/2}),

where â_1 is the polynomial a_1 with population moments replaced by sample moments.

Under the assumptions of Theorem 3, if A_n = S_n and if the skewness of X is known, then I_Ecorr = (−∞, X̄ − n^{-1/2}σλ_{1−α} + n^{-1}σp_1(λ_{1−α})) has nominal coverage α and, by (12),

P(μ ∈ I_Ecorr) = P(X̄ − n^{-1/2}σλ_{1−α} + n^{-1}σp_1(λ_{1−α}) > μ)
             = P(S_n > λ_{1−α} − n^{-1/2}p_1(λ_{1−α})) = 1 − Φ(λ_{1−α}) + o(n^{-1/2}) = α + o(n^{-1/2}).

Thus the coverage error of the confidence interval is of order n^{-1}. We note that, in the notation of Section 2.3, λ_{1−α} − n^{-1/2}a_1(λ_{1−α}) = λ_{1−α} + n^{-1/2}s_1(λ_{1−α}) = v_{1−α} + o(n^{-1/2}). Heuristically, we can therefore consider the idea behind the explicit correction to be to replace the quantile of the normal distribution with the truncated Cornish-Fisher expansion of the corresponding quantile of A_n.

The interval can be corrected further, by analogously correcting for kurtosis using the n^{-1} term in the Edgeworth expansion for the skewness corrected statistic A_n + n^{-1/2}a_1(A_n), to obtain an even smaller coverage error. This iteration will however sometimes result in an over-corrected interval, particularly when n is small; see for instance [19] or [1]. Moreover, the expression is a lot harder to derive analytically for such corrections.
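In R, the upper limit of I_Ecorr is a one-liner; the following sketch (the function name is ours) assumes σ and γ known, as in the discussion above:

ecorr_upper <- function(x, sigma, gamma, alpha = 0.95) {
  n <- length(x)
  la1 <- qnorm(1 - alpha)            # lambda_{1-alpha} = -lambda_alpha
  p1 <- -gamma * (la1^2 - 1) / 6     # p_1(lambda_{1-alpha})
  mean(x) - sigma * la1 / sqrt(n) + sigma * p1 / n
}
# I_Ecorr = (-Inf, ecorr_upper(x, sigma, gamma, alpha))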

3.3 The bootstrap

The bootstrap is a popular tool for estimating the distributions of statistics. It was first introduced by Efron in 1979 in [15]. Some theoretical results, based on Edgeworth expansions and similar techniques, regarding the asymptotic performance of the bootstrap for the statistic S_n = n^{1/2}(X̄ − μ)/σ were provided in 1981 by Singh in [28] and Bickel and Freedman in [3]. Since then asymptotics and performance in more general situations have been studied. One of the most important situations is the construction of confidence intervals using the bootstrap; we will investigate the properties of bootstrap confidence intervals using the Edgeworth expansion.

3.3.1 The bootstrap and S_n

We briefly mention the results of [28] and [3] to motivate why the bootstrap might be used for skewness correction. We omit the details and only try to give a general idea of the result. The statistic

S_n = n^{1/2}(X̄ − μ)/σ

has the bootstrap analogue

S_n^* = n^{1/2}(X̄^* − x̄)/σ̂,

where X^* ~ F_n, i.e. the distribution function of X^* is the empirical distribution function, EX^* = x̄, Var(X^*) = σ̂^2 = (1/n) Σ_{i=1}^n (x_i − x̄)^2 and Skew(X^*) = γ̂ = (1/n) Σ_{i=1}^n (x_i − x̄)^3 / σ̂^3. Part D of Theorem 1 in [28] says that

n^{1/2} ||P(S_n ≤ x) − P^*(S_n^* ≤ x)||_∞ → 0   a.s.   (13)

The main idea behind this is the following. It is shown that the conditional Edgeworth expansion for S_n^* can be written as

P^*(S_n^* ≤ x) = Φ(x) − n^{-1/2}(1/6)γ̂(x^2 − 1)φ(x) + R_n(x)

uniformly in x, where n^{1/2}R_n(x) → 0 a.s. By looking at the Edgeworth expansions for S_n and S_n^* and considering the difference between γ and γ̂, (13) follows. In particular, this means that there is no n^{-1/2} error term (or term of lower order) present when the distribution of S_n is approximated with the bootstrap distribution.

It thus seems plausible that a confidence interval based on the quantiles of S_n^* will be second order correct. It has been shown that this indeed is the case.

3.3.2 Bootstrap confidence intervals

There are numerous bootstrap methods for constructing confidence intervals. A few papers in the 1980s have been important for the understanding of the theoretical properties of these methods. Abramovitch and Singh ([1]) and Hall ([20]) discussed the coverage of bootstrap confidence intervals and in [21] Hall developed a unified framework for the theory for different types of bootstrap confidence intervals. This allowed a comparison of the different methods. The discussion in the literature has mainly focused on the case where σ is unknown, but as before we consider the case where σ is known. We use a method that we will call percentile-s, in which we try to estimate the quantiles of S_n by those of S_n^*.

The idea is the following. Let v_{1−α} be such that P(S_n ≤ v_{1−α}) = 1 − α. A one-sided confidence interval with confidence level α is I′_μ(α) = (−∞, X̄ − v_{1−α}σn^{-1/2}). Since the distribution of S_n is unknown we would like to estimate v_{1−α} somehow. Looking at (13) it seems reasonable to use S_n^* for the estimation. The distribution function P^*(S_n^* ≤ x) is however also unknown.

It can be estimated by using B bootstrap replications of S_n^*: S_{n,1}^*, ..., S_{n,B}^*, where S_{n,i}^* = n^{1/2}(X̄_i^* − x̄)/σ̂. Looking at the bootstrap sample order statistics S_{n,(1)}^* ≤ ... ≤ S_{n,(B)}^* we can estimate v_{1−α} by v̂_{1−α} = S_{n,((1+B)(1−α))}^*.

The coverage of the percentile-s confidence interval I_μ^{(2)}(α) = (−∞, X̄ − v̂_{1−α}σn^{-1/2}) is

P(μ ∈ I_μ^{(2)}(α)) = α + o(n^{-1/2}).

This is a consequence of (13) (see also [21]). We have thus obtained a skewness corrected confidence interval.
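A sketch of the percentile-s upper limit in R (ours; σ is assumed known, and B is chosen so that (1 + B)(1 − α) is an integer, here B = 1999 for α = 0.95):

percentile_s_upper <- function(x, sigma, alpha = 0.95, B = 1999) {
  n <- length(x)
  sig_hat <- sqrt(mean((x - mean(x))^2))   # sigma-hat of the empirical distribution
  s_star <- replicate(B,
    sqrt(n) * (mean(sample(x, n, replace = TRUE)) - mean(x)) / sig_hat)
  v_hat <- sort(s_star)[(1 + B) * (1 - alpha)]  # order statistic S*_((1+B)(1-alpha))
  mean(x) - v_hat * sigma / sqrt(n)
}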

3.4 A new simulation method

Both Edgeworth correction and the bootstrap are methods for getting skewness corrected confidence intervals by approximating the quantile v_{1−α} of S_n. As we've seen, this is done by approximating v_{1−α} via a Cornish-Fisher expansion in the Edgeworth correction case and by estimating the quantiles using bootstrap replicates in the different bootstrap methods. Another idea for obtaining skewness corrected confidence intervals is to somehow transform the problem into one relating to a statistic with zero skewness. A method for doing this using transformations is discussed in [23]. Next we describe a simulation method that corrects for skewness by adding a simulated random variable. Inference can then be made about the obtained skewness corrected distribution.

3.4.1 Skewness correction through addition of a random variable

Lemma 1. Let X be a random variable with EX = μ, Var(X) = σ^2 and E(X − EX)^3 = μ_3, so that Skew(X) = γ = μ_3/σ^3. Now let Y be a random variable, independent of X, such that EY = 0, Var(Y) = aσ^2 and E(Y − EY)^3 = −μ_3. Then E(X + Y) = μ, Var(X + Y) = (1 + a)σ^2 and Skew(X + Y) = 0.

Proof. The results for the mean and variance of X + Y are well known. The result for the skewness follows from Corollary 4 in Appendix A.

Let X_1, ..., X_n be n observations from the distribution of X and Y_1, ..., Y_n be n observations from the distribution of Y. Define Z_i = X_i + Y_i, so that Z_1, ..., Z_n become n observations from the distribution of Z = X + Y.

It was shown in Section 2.2 that p_1(x) = −(1/6)γ(x^2 − 1) in the Edgeworth expansion for S_n = n^{1/2}(X̄ − μ)/σ. We see that, since γ_Z = Skew(X + Y) = 0, the first term in the Edgeworth expansion for S_n′ = n^{1/2}(Z̄ − μ)/(√(1+a) σ) will be zero.

Theorem 4. Let Z_i = X_i + Y_i where {X_i} and {Y_i} are as in Lemma 1. Then the confidence interval I_new = (−∞, Z̄ + n^{-1/2}√(1+a) σλ_α) is skewness corrected and has coverage α − n^{-1}λ_α((1/24)κ_Z(λ_α^2 − 3))φ(λ_α) + o(n^{-1}).

Proof. The Edgeworth expansion for S_n′ = n^{1/2}(Z̄ − μ)/(√(1+a) σ) is P(S_n′ ≤ x) = Φ(x) + n^{-1}p_2(x)φ(x) + o(n^{-1}), since p_1 = 0. Hence, as p_2 is odd,

P(μ ∈ I_new) = P(S_n′ ≥ −λ_α) = α + n^{-1}p_2(λ_α)φ(λ_α) + o(n^{-1})
            = α − n^{-1}λ_α((1/24)κ_Z(λ_α^2 − 3) + (1/72)γ_Z^2(λ_α^4 − 10λ_α^2 + 15))φ(λ_α) + o(n^{-1})   (14)
            = α − n^{-1}λ_α((1/24)κ_Z(λ_α^2 − 3))φ(λ_α) + o(n^{-1}),

since γ_Z = 0. Thus the coverage error of I_new is of order n^{-1}.

A simulation procedure that gives us the observations z_i will be described next.

21 3.4.2 Simulation procedure

Let X be as above and suppose that n observations x_1, ..., x_n from the distribution of X are given and that σ^2 and μ_3 (and thus γ) are known. Then, if we can find a random variable Y with the same properties as Y above, such that we can simulate pseudo-random numbers from the distribution of Y, we can obtain the confidence interval I_new as follows.

Simulate n numbers Y_1, ..., Y_n from the distribution of Y and let Z_i = x_i + Y_i. Then I_new = (−∞, Z̄ + n^{-1/2}√(1+a) σλ_α) has a coverage error of order n^{-1}.

A natural question at this point is whether or not there exists a distribution or a family of distributions that can be used for this kind of simulation in general circumstances. The answer is that yes, such distributions exist. A possible choice for the distribution of the simulation variable Y is (a shifted version of) the inverse Gaussian distribution. Its usefulness follows from the fact that its variance and skewness can easily be controlled by its two parameters, as we will show.

Definition 3. Y is inverse Gaussian distributed with parameters λ > 0 and μ > 0 if

F_Y(y) = P(Y ≤ y) = Φ(√(λ/y)(y/μ − 1)) + e^{2λ/μ} Φ(−√(λ/y)(y/μ + 1)),   y > 0,

with density function

f_Y(y) = √(λ/(2πy^3)) exp(−λ(y − μ)^2/(2yμ^2)),   y > 0.

Note that this is not the inverse of the normal distribution; the name is not to be taken literally.

Lemma 2. Let Y be inverse Gaussian distributed with parameters λ > 0 and μ > 0. Then

EY = μ,
Var(Y) = μ^3/λ,
Skew(Y) = 3(μ/λ)^{1/2}   and
Kurt(Y) = 15μ/λ.

See Chapter 2 of [5] for a proof and further discussion of the distribution. Using Lemma 2 we can now decide which values of λ and μ to choose for our skewness correcting simulation variable.

Lemma 3. Let μ_3 > 0 and σ > 0. Then the inverse Gaussian distribution with parameters λ = 27σ^{10}μ_3^{-3} and μ = 3σ^4 μ_3^{-1} has variance σ^2, third central moment μ_3 and skewness μ_3/σ^3.

Proof. By Lemma 2 the inverse Gaussian distribution has variance μ^3/λ and skewness 3√(μ/λ). We want the variance to equal σ^2 and the skewness to equal μ_3/σ^3. The first equality gives us that λ = μ^3 σ^{-2}, and putting this into the second equality yields μ_3/σ^3 = 3σ/μ, from which it follows that μ = 3σ^4 μ_3^{-1} and therefore that λ = 27σ^{10} μ_3^{-3}.

Now assume that Y′ is inverse Gaussian distributed with parameters λ and μ. Then EY′ = μ. Since we wanted to use a random variable with mean 0 for our simulation we choose Y = Y′ − μ as our simulation variable. Then EY = 0, Var(Y) = σ^2 and Skew(Y) = μ_3/σ^3, where σ and μ_3 can be chosen arbitrarily. Note that the skewness of Y is always positive, so if the skewness of X is positive we choose −Y as our simulation variable instead. Next we state a lemma that will prove useful later in our discussion.

Lemma 4. Let Y_1, ..., Y_n be i.i.d. inverse Gaussian distributed random variables with parameters λ > 0 and μ > 0. Then Ȳ is inverse Gaussian with parameters nλ and μ.

See Section 2.4 of [5] for a proof. Finally, we mention that the inverse Gaussian distribution is implemented in R in the SuppDists library.
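Putting Section 3.4.2 and Lemma 3 together, a minimal R implementation of the method might look as follows. This is a sketch under the assumptions that σ^2 and μ_3 are known and that a = 1, i.e. Var(Y) = σ^2, the variance produced by the parameter choice in Lemma 3. The inverse Gaussian variables are generated with the standard transformation method of Michael, Schucany and Haas (1976), written out explicitly so that the sketch does not depend on the parameterization used by any particular library:

# Inverse Gaussian sampler with mean mu and shape lambda
# (Michael-Schucany-Haas transformation method).
rinvgauss_basic <- function(n, mu, lambda) {
  y <- rnorm(n)^2
  x <- mu + mu^2 * y / (2 * lambda) -
    mu / (2 * lambda) * sqrt(4 * mu * lambda * y + mu^2 * y^2)
  u <- runif(n)
  ifelse(u <= mu / (mu + x), x, mu^2 / x)
}

# Upper limit of I_new given observations x and known sigma^2 and mu_3 (a = 1).
new_method_upper <- function(x, sigma2, mu3, alpha = 0.95) {
  n <- length(x)
  sigma <- sqrt(sigma2)
  m3 <- abs(mu3)
  lambdaY <- 27 * sigma^10 / m3^3              # Lemma 3
  muY <- 3 * sigma^4 / m3                      # Lemma 3
  Y <- rinvgauss_basic(n, muY, lambdaY) - muY  # EY = 0, Var(Y) = sigma^2
  if (mu3 > 0) Y <- -Y                         # third central moments must cancel
  z <- x + Y                                   # Z_i = x_i + Y_i, Skew(Z) = 0
  mean(z) + qnorm(alpha) * sqrt(1 + 1) * sigma / sqrt(n)  # sqrt(1 + a) with a = 1
}

Since the Y_i are simulated, the upper limit is itself randomized; the coverage statement of Theorem 4 is with respect to the joint distribution of the sample and the simulated variables.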

4 Comparison

Having looked at different confidence intervals we now want to compare them. We only compare the one-sided intervals I_normal, I_new and I_Ecorr.

4.1 Coverages of confidence intervals

Assume that the variance σ^2 and the skewness γ ≠ 0 of the i.i.d. random variables X_1, ..., X_n are known and that X satisfies Cramér's condition and is such that E|X|^3 < ∞. We consider the following three confidence intervals for the mean μ:

the normal approximation confidence interval I_normal(α) = (−∞, X̄ + n^{-1/2}σλ_α), where Φ(λ_α) = α, with

P(μ ∈ I_normal(α)) = α − n^{-1/2}p_1(λ_α)φ(λ_α) + o(n^{-1/2}),

the Edgeworth corrected confidence interval I_Ecorr = (−∞, X̄ − n^{-1/2}σλ_{1−α} + n^{-1}σp_1(λ_{1−α})), with

P(μ ∈ I_Ecorr(α)) = α + o(n^{-1/2}),

and the skewness corrected simulation confidence interval I_new = (−∞, Z̄ + n^{-1/2}σ_Z λ_α), where Skew(Z) = 0 and σ_Z = √(1+a) σ, with

P(μ ∈ I_new(α)) = α + o(n^{-1/2}).

4.2 Criteria for the comparison

It seems that I_normal has the worst coverage. However, coverage is not the only measure of the performance of a confidence interval. Another important property is the length of the interval. In general, if two confidence intervals have the same coverage we prefer the one that is shorter. Alternatively, if two confidence intervals are of the same length we prefer the one with the coverage closest to α. Things get a bit trickier if we have one short interval with bad coverage and one wide interval with good coverage; we must somehow choose between good length and good coverage. And then there is of course the computational cost of the computer intensive methods versus the algebraic work that the analytical methods require. Regardless of how we weigh these properties, it is of interest to compare the lengths of the confidence intervals above. Since the intervals are one-sided their length, in the usual sense of the word, is in fact infinite. We therefore arrive at the following definition.

Definition 4. Given two intervals, I_1 = (−∞, θ̂_1) and I_2 = (−∞, θ̂_2), both with coverage α + o(n^{-j/2}) for some j, we say that I_2 is better than I_1 if P(θ̂_2 ≤ θ̂_1) ≥ 1/2.

4.3 Comparisons of the upper limits

We will compare the upper limit obtained by our skewness correcting simulation method with those obtained by the bootstrap and explicit skewness correction. Throughout we let γ_X denote the skewness of X.

Consider the Edgeworth corrected confidence interval I_Ecorr = (−∞, X̄ − n^{-1/2}σλ_{1−α} + n^{-1}σp_1(λ_{1−α})) = (−∞, θ̂_Ecorr) and our simulation skewness corrected interval I_new = (−∞, Z̄ + n^{-1/2}√(1+a) σλ_α) = (−∞, θ̂_new). Both intervals have coverage α + o(n^{-1/2}). Recall that λ_{1−α} = −λ_α and that p_1(λ_{1−α}) = −(1/6)γ_X(λ_α^2 − 1). Thus

P(θ̂_new ≤ θ̂_Ecorr) = P(θ̂_new − θ̂_Ecorr ≤ 0)
 = P(Z̄ + n^{-1/2}√(1+a) σλ_α − X̄ + n^{-1/2}σλ_{1−α} − n^{-1}σp_1(λ_{1−α}) ≤ 0)
 = P(Z̄ − X̄ + n^{-1/2}σλ_α(√(1+a) − 1) + n^{-1}(1/6)σ(λ_α^2 − 1)γ_X ≤ 0)
 = P(Ȳ ≤ −n^{-1/2}σλ_α(√(1+a) − 1) − n^{-1}(1/6)σ(λ_α^2 − 1)γ_X),

since Z̄ − X̄ = Ȳ. Since the distribution of Y is known this probability is fully known. It is clearly dependent on the distribution of Y, but some words can be said about its general behaviour.

Theorem 5. If −n^{-1/2}σλ_α(√(1+a) − 1) − n^{-1}(1/6)σ(λ_α^2 − 1)γ_X is larger than the median of Ȳ, then I_new is better than I_Ecorr.

Proof. By the definition of the median, if −n^{-1/2}σλ_α(√(1+a) − 1) − n^{-1}(1/6)σ(λ_α^2 − 1)γ_X is larger than the median, the probability above will be at least 1/2.

For a large class of unimodal continuous densities, including the Pearson system and the inverse Gaussian distribution, it holds that

mean > median when γ > 0,
mean < median when γ < 0

(see [25]). Assume that Ȳ has such a density and note that the sign of the skewness of X determines the sign of the skewness of Ȳ. Then I_new is never better than I_Ecorr when γ_X > 0. It is sometimes better when γ_X < 0; in particular the method gets better as |γ_X| increases.

To see this, let γ_X > 0. Then γ_Ȳ < 0 and thus the mean of Ȳ is smaller than the median. Since EȲ = 0 the median must thus be positive. But in that case −n^{-1/2}σλ_α(√(1+a) − 1) − n^{-1}(1/6)σ(λ_α^2 − 1)γ_X < 0 for reasonable values of α, and it therefore seems that the new method is not to be recommended in such cases.

On the other hand, if γ_X < 0 then γ_Ȳ > 0 and the mean of Ȳ is larger than the median, so that the median must be negative. Moreover, the second part of the right hand expression, −n^{-1}(1/6)σ(λ_α^2 − 1)γ_X, is positive, and if |γ_X| is big enough the right hand side is therefore larger than the median of Ȳ. We see that in this case it is indeed possible that the new method gives intervals that are shorter in general than those obtained using Edgeworth correction.

For the special case considered in Section 3.4.2 we have the following corollary.

Corollary 2. Assume that Y is generated using an inverse Gaussian distributed random variable Y′, as described in Section 3.4.2, and let λ_Y and μ_Y be the parameters in the distribution of Y′. If γ_X < 0 then the probability that I_new is better than I_Ecorr is

Φ(√(nλ_Y/x_1)(x_1/μ_Y − 1)) + e^{2nλ_Y/μ_Y} Φ(−√(nλ_Y/x_1)(x_1/μ_Y + 1)),   (15)

where

x_1 = μ_Y − n^{-1/2}σλ_α(√(1+a) − 1) − n^{-1}(1/6)σ(λ_α^2 − 1)γ_X,

and if γ_X > 0 the probability is

1 − (Φ(√(nλ_Y/x_2)(x_2/μ_Y − 1)) + e^{2nλ_Y/μ_Y} Φ(−√(nλ_Y/x_2)(x_2/μ_Y + 1))),   (16)

where

x_2 = μ_Y + n^{-1/2}σλ_α(√(1+a) − 1) + n^{-1}(1/6)σ(λ_α^2 − 1)γ_X.

Proof. First, assume that γ_X < 0. Then it follows from Lemma 4 that Ȳ + μ_Y is inverse Gaussian with parameters nλ_Y and μ_Y, and we can rewrite the probability as

P(Ȳ + μ_Y ≤ μ_Y − n^{-1/2}σλ_α(√(1+a) − 1) − n^{-1}(1/6)σ(λ_α^2 − 1)γ_X) = P(Ȳ + μ_Y ≤ x_1)
 = Φ(√(nλ_Y/x_1)(x_1/μ_Y − 1)) + e^{2nλ_Y/μ_Y} Φ(−√(nλ_Y/x_1)(x_1/μ_Y + 1)).

Second, assume that γ_X > 0 and that Y is generated using an inverse Gaussian distributed random variable Y′. Then Y is defined as Y = −(Y′ − μ_Y) and −Ȳ + μ_Y is inverse Gaussian with parameters nλ_Y and μ_Y. Thus

P(Ȳ ≤ −n^{-1/2}σλ_α(√(1+a) − 1) − n^{-1}(1/6)σ(λ_α^2 − 1)γ_X)
 = P(−Ȳ + μ_Y > μ_Y + n^{-1/2}σλ_α(√(1+a) − 1) + n^{-1}(1/6)σ(λ_α^2 − 1)γ_X)
 = 1 − P(−Ȳ + μ_Y ≤ x_2)
 = 1 − (Φ(√(nλ_Y/x_2)(x_2/μ_Y − 1)) + e^{2nλ_Y/μ_Y} Φ(−√(nλ_Y/x_2)(x_2/μ_Y + 1))).

Somewhat surprisingly, (15) and (16) do not depend on σ. Although not easily seen algebraically, this is an effect of the parametrization of the inverse Gaussian distribution and of how λ_Y and μ_Y were defined in terms of σ and γ_X. Appendix B contains tables with explicit values of (15) and (16) for some combinations of α, n and γ_X, as well as some figures describing how (15) and (16) depend on α. From the discussion above, as well as from Tables 1-2 and Figures 1-3 in the appendix, we deduce that the sign and magnitude of γ_X are of great importance for the theoretical performance of our simulation method. In particular, the new method seems to perform better when γ_X is large and negative and n is small. This effect is even clearer when α is close to 1.
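The probabilities (15) and (16) can also be checked without the closed form by drawing Ȳ directly via Lemma 4. The following R sketch (ours; it assumes a = 1 as in Lemma 3, which appears to be the choice behind Tables 1 and 2, and repeats the inverse Gaussian sampler from the sketch in Section 3.4 to be self-contained):

rinvgauss_basic <- function(n, mu, lambda) {  # as in the sketch in Section 3.4
  y <- rnorm(n)^2
  x <- mu + mu^2 * y / (2 * lambda) -
    mu / (2 * lambda) * sqrt(4 * mu * lambda * y + mu^2 * y^2)
  u <- runif(n)
  ifelse(u <= mu / (mu + x), x, mu^2 / x)
}
prob_new_better <- function(alpha, n, gammaX, a = 1, sigma = 1, B = 1e6) {
  la <- qnorm(alpha)
  m3 <- abs(gammaX) * sigma^3
  lambdaY <- 27 * sigma^10 / m3^3
  muY <- 3 * sigma^4 / m3
  Ybar <- rinvgauss_basic(B, muY, n * lambdaY) - muY  # Lemma 4: Ybar' ~ IG(n*lambda, mu)
  if (gammaX > 0) Ybar <- -Ybar                       # Y = -(Y' - mu_Y) in this case
  thr <- -sigma * la * (sqrt(1 + a) - 1) / sqrt(n) -
          sigma * (la^2 - 1) * gammaX / (6 * n)
  mean(Ybar <= thr)
}
# prob_new_better(0.95, 30, -10)  # approximately 0.54, cf. Table 1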

4.4 Simulation results

Some simulation results regarding the performance of the new simulation method compared with Edgeworth correction and normal approximation are given in Appendix C. The performance of the new method is quite good when the nominal coverage is 0.95 and γ_X is −10.4 or −20.1. When γ_X = −40.3 the usual normal approximation interval seems preferable. This might seem surprising if we bear in mind the part that skewness plays in the Edgeworth expansion, but for distributions with a large skew, the skewness (and indeed the variance) will largely be due to infrequent big deviations from the mean. In sample sizes as small as those considered in the simulations there may not be any such observations, in which case the sample skewness will be close to zero. When the nominal coverage is 0.99 the Edgeworth correction method seems to be superior. The performance of the new method is however always better than that of the normal approximation. The choice of method is not independent of the distribution of the X_i. It would thus be of interest to perform further simulations with different distributions.

4.5 Discussion

The theoretical investigation, as well as the simulation study, gives slightly encouraging results. There do seem to be cases, albeit admittedly somewhat unusual, where the new method performs better than Edgeworth correction and normal approximation. It would be of interest to compare the new method with the percentile-s bootstrap as well. It can be shown that θ̂_boots = θ̂_Ecorr + O_P(n^{-3/2}), and thus we expect the result of such a comparison to be quite similar to the comparison with Edgeworth correction.

The assumption that both the second and the third central moments are known is perhaps somewhat unrealistic. It would therefore be desirable to extend the method to the situation where these are estimated from the x_i-sample. A problem that must be handled is that this introduces a dependence between X and Y.

It is possible that the correction might be extended to higher moments as well, using addition formulae like those for skewness. One idea is to use a SIMEX-like method to correct for kurtosis, since the fact that the kurtosis is always greater than −2 makes it hard to get rid of it in the same way that the new method gets rid of skewness. This could then also be used for two-sided confidence intervals.

It might also be desirable to obtain skewness corrected intervals that are shorter than those given by the new method. Letting a → 0 does not seem to be a good way of doing this, so we might consider adding a second pseudo-random variable that is used in SIMEX fashion to reduce the variance. We might also consider using different conditions on Y; for instance it is perhaps possible not to require that EY = 0. Another idea is to try to construct Y using a U-statistic where the differences X_i − X_j are considered. Once more, this would introduce a dependence between X and Y.

Different kinds of comparisons between the methods can be made. We can compare the mean length of the intervals or let the length be fixed and study the coverages given by the different methods under that condition. Finally, it is possible that the method will have a better performance for more general statistics A_n. The notion of skewness might have a somewhat different meaning for these, and the Edgeworth expansions have a different behaviour.

A Appendix: Skewness and kurtosis

Several measures of symmetry (or asymmetry) have been suggested throughout history. The most popular one is the standardized third central moment, commonly known as the skewness.

Definition 5. The skewness of a random variable X, henceforth denoted Skew(X), is its standardized third moment:

Skew(X) = γ = μ_3/σ^3 = E(X − EX)^3 / (E(X − EX)^2)^{3/2}

(see for instance Section 15.8 of [8] or Section 4.6 of [18]).

Let us note that the third central moment of aX + b is E(aX + b − aEX − b)^3 = a^3 E(X − EX)^3. We see that the skewness is translation invariant and that Skew(aX) = sign(a)·Skew(X). Furthermore, if X is symmetric then all odd central moments of X are zero and thus Skew(X) = 0. The converse does not necessarily hold; that is, skewness equal to zero does not imply symmetry.

The "peakedness" of a distribution is measured using the kurtosis of the distribution. It describes to what extent the variance of the distribution depends on uncommonly large deviations from the mean; if the kurtosis is high then these are the reason for much of the variance, and if it is low most of the variance comes from frequent moderately sized deviations.

Definition 6. The kurtosis (or excess kurtosis) of a random variable X, henceforth denoted Kurt(X), is

Kurt(X) = κ = μ_4/σ^4 − 3 = E(X − EX)^4 / (E(X − EX)^2)^2 − 3

(see for instance Section 15.8 of [8] or Section 4.6 of [18]).

We see that Kurt(aX + b) = Kurt(X). Kurt(X) = 0 when X is normal. Moreover, since μ_4/σ^4 ≥ 1, we have −2 ≤ Kurt(X) ≤ ∞.

A.1 The skewness of a sum of random variables

Although formulae for the expectation and variance of a sum of random variables are well known, the corresponding formulae for the third central moment and the skewness of a sum of random variables are rarely seen in the literature. The proofs amount to standard calculations of expectations, but they are nevertheless given here for reference. In the following calculations we will use μ_3^Z to denote the third central moment of a random variable Z whenever we need to distinguish between moments of different random variables.

Lemma 5. Let μ_3^Z = E(Z − EZ)^3 be the third central moment of the random variable Z and let X and Y be random variables. Then

μ_3^{X+Y} = μ_3^X + μ_3^Y + 3Cov(X^2, Y) + 3Cov(X, Y^2) − 6(EX + EY)Cov(X, Y).   (17)

Proof. We expand the expression for the third central moment of X + Y as follows:

E(X + Y − E(X + Y))^3 = E((X − EX) + (Y − EY))^3
 = E(X − EX)^3 + E(Y − EY)^3 + 3E(X − EX)^2(Y − EY) + 3E(X − EX)(Y − EY)^2.   (18)

Next we consider the third part of the last expression above:

E(X − EX)^2(Y − EY) = E(X^2 − 2XEX + (EX)^2)(Y − EY)
 = E(X^2Y − 2XY EX + Y(EX)^2 − X^2EY + 2XEXEY − (EX)^2EY)
 = EX^2Y − 2EXY EX + EY(EX)^2 − EX^2EY + 2(EX)^2EY − (EX)^2EY
 = EX^2Y − EX^2EY + 2(EX)^2EY − 2EXY EX
 = EX^2Y − EX^2EY + 2EX(EXEY − EXY)
 = Cov(X^2, Y) − 2EX·Cov(X, Y).

In exactly the same way we conclude that the fourth part of (18) can be written as

E(X − EX)(Y − EY)^2 = Cov(X, Y^2) − 2EY·Cov(X, Y),

and thus, since E(X − EX)^3 = μ_3^X, we have that

(18) = μ_3^{X+Y} = μ_3^X + μ_3^Y + 3Cov(X^2, Y) + 3Cov(X, Y^2) − 6(EX + EY)Cov(X, Y).

Using that Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y), the addition formula for skewness follows.

Corollary 3. Let X and Y be random variables. Then

Skew(X + Y) = (μ_3^X + μ_3^Y + 3Cov(X^2, Y) + 3Cov(X, Y^2) − 6(EX + EY)Cov(X, Y)) / (Var(X) + Var(Y) + 2Cov(X, Y))^{3/2}.   (19)

Two important special cases of Corollary 3 immediately follow.

Corollary 4. Let X and Y be independent random variables. Then

Skew(X + Y) = (μ_3^X + μ_3^Y) / (σ_X^2 + σ_Y^2)^{3/2}.   (20)

The extension to the sum of n independent variables is straightforward. It is of interest to note that Corollary 4 holds regardless of the values of the expected values of X and Y; in particular, it holds when EX and/or EY are unknown.
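Formula (20) is easy to verify numerically. A small R check (ours): with X ~ Exp(1) (μ_3 = 2, σ^2 = 1) and Y = −Exp(1) (μ_3 = −2, σ^2 = 1), the sum should have skewness (2 − 2)/(1 + 1)^{3/2} = 0:

set.seed(1)
x <- rexp(1e6); y <- -rexp(1e6)
sk <- function(v) mean((v - mean(v))^3) / mean((v - mean(v))^2)^(3/2)
c(skew_x = sk(x), skew_y = sk(y), skew_sum = sk(x + y))  # ~ 2, -2, 0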

Corollary 5. Let X_1, X_2, ..., X_n be i.i.d. random variables. Then

Skew(X_1 + X_2 + ... + X_n) = (1/√n)·μ_3^X/σ_X^3 = (1/√n)·Skew(X).   (21)

Corollary 5 is proved by using Corollary 4 n − 1 times and noticing that Skew(X_1 + X_2 + ... + X_n) = (n·μ_3^X)/(n^{3/2}·σ_X^3) = (n/n^{3/2})·Skew(X) = (1/√n)·Skew(X). The i.i.d. condition can be somewhat weakened. It is clear from Corollary 3 that it is sufficient that the variables have equal variances and third central moments and that Cov(X_i^2, X_j) = Cov(X_i, X_j^2) = Cov(X_i, X_j) = 0 when i ≠ j.

A.2 The kurtosis of a sum of random variables

For higher moments the addition formulae become more complicated, which is one reason for introducing cumulants in the first place; they still have nice addition formulae. We are not as concerned with kurtosis as we are with skewness and therefore we mention only the following lemma, which follows from the remark at the end of Section 2.2.2.

Lemma 6. Let X and Y be independent random variables, with finite fourth central moments, such that Var(X) = Var(Y). Then

Kurt(X + Y) = (1/4)(Kurt(X) + Kurt(Y)).   (22)
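A corresponding numeric check of (22) in R (ours): for X, Y i.i.d. Exp(1), with Kurt(X) = Kurt(Y) = 6, the right hand side is 3, which agrees with X + Y ~ Gamma(2, 1) having excess kurtosis 6/2 = 3:

set.seed(1)
x <- rexp(1e6); y <- rexp(1e6)
ku <- function(v) mean((v - mean(v))^4) / mean((v - mean(v))^2)^2 - 3
c(lhs = ku(x + y), rhs = (ku(x) + ku(y)) / 4)  # both ~ 3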

B Appendix: P(θ̂_new ≤ θ̂_Ecorr)

This appendix contains some tables and figures relating to the comparison of our simulation method and the explicit correction method. Tables 1 and 2 give explicit values of (15) and (16) (see Section 4.3) for some combinations of α, n and γ_X. Figures 1-3 show these probabilities as functions of α for eight values of γ_X and three values of n. Finally, Tables 3 and 4 contain the coverage error terms of order n^{-1} for the confidence intervals obtained by the two methods.

Table 1: P(θ̂_new ≤ θ̂_Ecorr) when α = 0.95

γ_X      n = 15       n = 30       n = 75       n = 150      n = 500
0.1      0.2461585    0.2466465    0.2470814    0.2473014    0.2475421
0.5      0.2397286    0.2420325    0.2441252    0.2451973    0.2463814
1        0.2191781    0.2271931    0.2405524    0.2426289    0.2449492
2        0.1940548    0.2083596    0.2220317    0.2292854    0.2421463
4        0.1531201    0.1756957    0.1990799    0.2121904    0.2275920
6        0.1222838    0.1488952    0.1787612    0.1964792    0.2181277
10       0.08131108   0.10902326   0.14503125   0.16887748   0.20045674
15       0.05257063   0.07680105   0.11329497   0.14062359   0.18059000
20       0.03635416   0.05635855   0.09006719   0.11805326   0.16298357
50       0.00864711   0.01547122   0.03112579   0.04931212   0.09264032
−0.1     0.2495379    0.2490361    0.2485927    0.2483700    0.2481274
−0.5     0.2566349    0.2539837    0.2516825    0.2505409    0.2493081
−1       0.2802507    0.2703587    0.2556722    0.2533178    0.2508028
−2       0.3165029    0.2948042    0.2766470    0.2678903    0.2538560
−4       0.3996042    0.3493397    0.3085325    0.2894824    0.2698846
−6       0.4922321    0.4104844    0.3434120    0.3126079    0.2816008
−10      0.6701141    0.5427119    0.4208967    0.3632395    0.3064168
−15      0.8195909    0.6931626    0.5256882    0.4333398    0.3399659
−20      0.8963340    0.8001726    0.6269255    0.5075826    0.3761091
−50      0.9859162    0.9697800    0.9178636    0.8359562    0.6147019

Table 2: P(θ̂_new ≤ θ̂_Ecorr) when α = 0.99

γ_X      n = 15       n = 30       n = 75       n = 150      n = 500
0.1      0.1639229    0.1649940    0.1659530    0.1664393    0.1669728
0.5      0.1502835    0.1550681    0.1595138    0.1618278    0.1644116
1        0.1262885    0.1370188    0.1519811    0.1563255    0.1612902
2        0.0967257    0.1128146    0.1300353    0.1399258    0.1553075
4        0.06046637   0.07878596   0.10215891   0.11745456   0.13756958
6        0.04097709   0.05744783   0.08158069   0.09931798   0.12492985
10       0.02240486   0.03418956   0.05479085   0.07284335   0.10368843
15       0.01282831   0.02074561   0.03628517   0.05187652   0.08328521
20       0.008322046  0.013965375  0.025818012  0.038719667  0.067999284
50       0.001777641  0.003268604  0.006983243  0.011876996  0.026872164
−0.1     0.1714419    0.1703105    0.1693153    0.1688168    0.1682749
−0.5     0.1879753    0.1816847    0.1763341    0.1737183    0.1709231
−1       0.2240299    0.2057937    0.1856757    0.1801257    0.1743163
−2       0.2972269    0.2523213    0.2172884    0.2013779    0.1813850
−4       0.4814233    0.3693050    0.2804016    0.2417861    0.2049413
−6       0.6624912    0.5049659    0.3560668    0.2889614    0.2265836
−10      0.8704285    0.7391605    0.5270488    0.4005038    0.2759957
−15      0.9514307    0.8868428    0.7150916    0.5527714    0.3484026
−20      0.9763116    0.9437372    0.8343956    0.6875119    0.4293880
−50      0.9972413    0.9939490    0.9820008    0.9574301    0.8227825

[Figure 1: P(θ̂_new ≤ θ̂_Ecorr) as a function of α (0.5 to 1.0) when n = 30; upper panel: γ_X = 1, 10, 20, 50; lower panel: γ_X = −1, −10, −20, −50.]

[Figure 2: P(θ̂_new ≤ θ̂_Ecorr) as a function of α (0.5 to 1.0) when n = 75; upper panel: γ_X = 1, 10, 20, 50; lower panel: γ_X = −1, −10, −20, −50.]

[Figure 3: P(θ̂_new ≤ θ̂_Ecorr) as a function of α (0.5 to 1.0) when n = 500; upper panel: γ_X = 1, 10, 20, 50; lower panel: γ_X = −1, −10, −20, −50.]

C Appendix: Simulation results

This appendix contains some simulation results regarding the actual coverage of some confidence intervals for some combinations of α, n and γ_X. For each combination, R was used to simulate n random variables 1,000,000 times, and each time the confidence intervals were computed. The figures in the tables are the proportions of times that the mean was in the confidence interval. The distribution used for the random variables was the normal-inverse Gaussian distribution (the normal variance-mean mixture obtained when the mixing density is inverse Gaussian). It is found in the HyperbolicDist library in R. The inverse Gaussian variables were simulated using the SuppDists library.
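The thesis's exact simulation code is not reproduced here, but the following R sketch (ours) indicates the structure of one cell of the tables. It generates normal-inverse Gaussian variables directly as a normal variance-mean mixture with inverse Gaussian mixing, as described above, rather than through the HyperbolicDist library; the parameter values below are ours and are not calibrated to γ_X = −10.4 etc.:

# Inverse Gaussian sampler (Michael-Schucany-Haas), as in Section 3.4.
rinvgauss_basic <- function(n, mu, lambda) {
  y <- rnorm(n)^2
  x <- mu + mu^2 * y / (2 * lambda) -
    mu / (2 * lambda) * sqrt(4 * mu * lambda * y + mu^2 * y^2)
  u <- runif(n)
  ifelse(u <= mu / (mu + x), x, mu^2 / x)
}
rnig_mix <- function(n, beta, m, s) {  # NIG via a variance-mean mixture
  z <- rinvgauss_basic(n, m, s)        # mixing variable
  beta * z + sqrt(z) * rnorm(n)        # beta < 0 gives negative skewness
}
set.seed(1)
n <- 30; B <- 1e4; alpha <- 0.95; beta <- -2; m <- 1; s <- 0.5
mu_true <- beta * m                    # E(X) = beta * E(Z)
sigma2 <- m + beta^2 * m^3 / s         # Var(X) = E(Z) + beta^2 * Var(Z)
cover <- mean(replicate(B, {
  x <- rnig_mix(n, beta, m, s)
  mu_true <= mean(x) + qnorm(alpha) * sqrt(sigma2 / n)  # normal approx. interval
}))
cover  # estimated actual coverage of the nominal 95% interval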

α = 0.95                 Normal approx.   Edgeworth corr.   New method, a = 1   New method, a = 0.1
γ_X = −10.4, n = 30      0.933507         0.961781          0.95385             0.938311
γ_X = −10.4, n = 75      0.933039         0.955999          0.951491            0.93839
γ_X = −20.1, n = 30      0.941777         0.973632          0.959583            0.945165
γ_X = −20.1, n = 75      0.934541         0.964937          0.954753            0.939006
γ_X = −40.3, n = 30      0.959887         0.986872          0.971116            0.961884
γ_X = −40.3, n = 75      0.947646         0.978591          0.96308             0.950522

α = 0.99                 Normal approx.   Edgeworth corr.   New method, a = 1   New method, a = 0.1
γ_X = −10.4, n = 30      0.966634         0.991678          0.982543            0.96997
γ_X = −10.4, n = 75      0.971143         0.990596          0.985334            0.974616
γ_X = −20.1, n = 30      0.96562          0.994081          0.980019            0.968311
γ_X = −20.1, n = 75      0.965572         0.99233           0.981408            0.968813
γ_X = −40.3, n = 30      0.973474         0.997139          0.982817            0.975019
γ_X = −40.3, n = 75      0.967729         0.99532           0.980365            0.969895

36 References

[1] Abramovitch, L., Singh, K. (1985), Edgeworth Corrected Pivotal Statistics and the Bootstrap, The Annals of Statistics, Vol. 13, pp. 116-132

[2] Bhattacharya, R. N., Ghosh, J.K. (1978), On the Validity of the Formal Edgeworth Expansion, The Annals of Statistics, Vol. 6, pp. 434-451

[3] Bickel, P.J., Freedman, D.A. (1981), Some Asymptotic Theory for the Bootstrap, The Annals of Statistics, Vol. 9, pp. 1196-1217

[4] Chebyshev, P.L. (1890), Sur Deux Théorèmes Relatifs aux Probabilités, Acta Mathematica, Vol. 14, pp. 305-315

[5] Chhikara, R.S., Folks, J.L. (1989), The Inverse Gaussian Distribution, Marcel Dekker, ISBN 0-8247-7997-5

[6] Cornish, E.A., Fisher, R.A. (1938), Moments and Cumulants in the Specification of Distributions, Extrait de la Revue de l'Institut International de Statistique, Vol. 5, pp. 307-320

[7] Cramér, H. (1928), On the composition of elementary errors, Skandinavisk Aktuarietidskrift, Vol. 11, pp. 13-74 and 141-180

[8] Cramér, H. (1946), Mathematical Methods of Statistics, Princeton University Press, ISBN 0-691-00547-8

[9] Cramér, H. (1970), Random Variables and Probability Distributions, Cambridge University Press, ISBN 0-521-07685-4

[10] DasGupta, A. (2008), Asymptotic Theory of Statistics and Probability, Springer, ISBN 978-0-387-75970-8

[11] Dubkov, A. A., Malakhov, A. N. (1976), Properties and interdependence of the cumulants of a random variable, Radiophysics and Quantum Electronics, Vol. 19, pp. 833-839

[12] Edgeworth, F.Y. (1894), The Asymmetrical Probability Curve, Proceedings of the Royal Society of London, Vol. 56, pp. 271-272

[13] Edgeworth, F.Y. (1905), The Law of Error, Cambridge Philosophical Transactions, Vol. 20, pp. 36-65

[14] Edgeworth, F.Y. (1907), On the Representation of Statistical Frequency by a Series, Journal of the Royal Statistical Society, Vol. 70, pp. 102-106

[15] Efron, B. (1979), Bootstrap Methods: Another Look at the Jackknife, The Annals of Statistics, Vol. 7, pp. 1-26

[16] Esseen, C.-G. (1945), Fourier Analysis of Distribution Functions - A Mathematical Study of the Laplace-Gaussian Law, Acta Mathematica, Vol. 77, pp. 1-125

[17] Fisher, R.A., Cornish, E.A. (1960), The Percentile Points of Distributions Having Known Cumulants, Technometrics, Vol. 2, pp. 209-225

[18] Gut, A. (2005), Probability: A Graduate Course, Springer-Verlag, ISBN 978-0-387-22833-4

[19] Hall, P. (1983), Inverting an Edgeworth Expansion, The Annals of Statistics, Vol. 11, pp. 569-576

[20] Hall, P. (1986), On the Bootstrap and Confidence Intervals, The Annals of Statistics, Vol. 14, pp. 1431-1452

[21] Hall, P. (1988), Theoretical Comparison of Bootstrap Confidence Intervals, The Annals of Statistics, Vol. 16, pp. 927-953

[22] Hall, P. (1992), The Bootstrap and Edgeworth Expansion, Springer-Verlag, ISBN 0-387-97720-1

[23] Hall, P. (1992), On the Removal of Skewness by Transformation, Journal of the Royal Statistical Society B, Vol. 54, pp. 221-228

[24] Kreyszig, E. (1978), Introductory Functional Analysis with Applications, John Wiley & Sons, ISBN 0-471-50459-9

[25] MacGillivray, H.L. (1981), The Mean, Median, Mode Inequality and Skewness for a Class of Densities, Australian Journal of Statistics, Vol. 23, pp. 247-250

[26] Pearson, K. (1895), Skew Variation in Homogeneous Material, Philosophical Transactions of the Royal Society of London. A, Vol. 186, pp. 343-414

[27] Pfanzagl, J. (1979), Nonparametric Minimum Contrast Estimators, Selecta Statistica Canadiana, Vol. 5, pp. 105-140

[28] Singh, K. (1981), On the Asymptotic Accuracy of Efron’s Bootstrap, The Annals of Statistics, Vol. 9, pp. 1187-1195

[29] Withers, C.S. (1983), Expansions for the Distribution and Quantiles of a Regular Functional of the Empirical Distribution with Applications to Nonparametric Confidence Intervals, The Annals of Statistics, Vol. 11, pp. 577-587
