Truncated Skew-Normal Distributions: Moments, Estimation by Weighted Moments and Application to Climatic Data

Truncated skew-normal distributions: moments, estimation by weighted moments and application to climatic data by C. Flecher1;2, D. Allard1 and P. Naveau2 1 Biostatistics and Spatial Processes, INRA, Agroparc 84914 Avignon, France 2 Laboratoire des Sciences du Climat et de l'Environnement, CNRS Gif-sur-Yvette, France Running title: Truncated skew-normal distributions Abstract In this paper we derive an expression of the mth order moments and some weighted moments of truncated skew-normal distributions. We link these formulas to previous results for truncated distributions and non truncated skew-normal distributions. Methods to estimate skew-normal parameters using weighted moments are proposed and compared to other classical techniques. In a second step we propose to model the distribution of relative humidity with a truncated skew-normal distribution. Key words: Truncated skew-normal distributions, inference methods, classical moments, weighted moments, relative humidity. 1 1 Introduction In many applications, the probability distribution function (pdf) of some observed variables can be simultaneously skewed and restricted to a xed interval. For example, variables such as pH, grades, and humidity in environmental studies, have upper and lower physical bounds and their pdfs are not necessarily symmetric within these bounds. To illustrate such behaviors, Figure 1 shows the histograms per season and the estimated pdfs of daily relative humidity measurements made in Toulouse (France) from 1972 to 1999. Dec−Jan−Feb Mar−Apr−May Jun−Jul−Aug Sep−Oct−Nov 0.06 0.06 0.06 0.06 0.05 0.05 0.05 0.05 0.04 0.04 0.04 0.04 0.03 0.03 0.03 0.03 Density Density Density Density 0.02 0.02 0.02 0.02 0.01 0.01 0.01 0.01 0.00 0.00 0.00 0.00 0 20 40 60 80 100 0 20 40 60 80 100 0 20 40 60 80 100 0 20 40 60 80 100 H (%) H (%) H (%) H (%) Figure 1: Histograms and estimated densities of daily relative humidity measurements made in Toulouse (France) from 1972 to 1999. Each panel corresponds to a season. All observations belong to the interval [0; 100] and skewness is apparent, especially during the Spring and Fall seasons. There are many strategies available to model such skewed and bounded data. In this paper our approach is to conceptually view such observations as truncated measurements originating from a exible skewed distribution. More precisely we assume that the truncation bounds are known and we focus on the following skew-normal pdf dened by Azzalini (1985) 2 x − µ x − µ f (x) = φ Φ λ ; (1) µ,σ,λ σ σ σ where µ 2 R, σ > 0 and λ 2 R represent the location, scale and shape parameters, respectively. The notations φ and Φ correspond to the pdf and the cumulative distribution function (cdf) of the standard normal distribution, respectively. The skew-normal distribution (1) can be obtained from a bivariate Gaussian random vector, by conditioning one component on the second one being above (or below) a given threshold (Azzalini, 1985). The notation X ∼ SN(µ, σ; λ) represents a random variable, X, following (1) and Fµ,σ,λ denotes its cdf. The particular case λ = 0 corresponds to the classical normal distribution with mean µ and variance σ2. In the following, the pdf and the cdf of a SN(0; 1; λ) are simply denoted fλ and Fλ. In this context our main objective is to propose and study a novel method-of-moments approach for estimating the parameters of (1) in presence of truncation. 2 Among others, Martinez et al. (2008) recently studied the moments of the skew-normal dened by (1). Dhrymes (2005) provided a recursive relationship among the r-order moments of a standard normal distribution, say Z, truncated at a and b r−1 b [z φ(z)]a for (2) mr(a; b) = (r − 1)mr−2(a; b) − b ; r = 1; 2;:::; [Φ(z)]a where r with , b denotes , and mr(a; b) = E[Z j a < Z ≤ b] a < b [Φ(x)]a Φ(b) − Φ(a) m0(a; b) = 1 m−1(a; b) can be any nite value. In Section 2 we combine the results of Martinez et al. (2008) with the approach of Dhrymes (2005) in order to derive the moments of a truncated skew-normal distribution. Following the work of Flecher et al. (2009a), we also compute a special type of weighted moments of truncated skew-normal distributions. In Section 3, these theoretical expressions of moments allow us to propose a new method-of-moments approach to estimate the parameters of such truncated distributions. In Section 4, this estimation method is compared to the classical method-of-moments and maximum likelihood method (Mills, 1955) using simulated data. Daily air relative humidity measurements are also analyzed in this section. All proofs are relegated to the Appendix to improve readability of the paper. 2 Moments of truncated skew-normal variables Lemma 1 presents an explicit, though cumbersome, solution to (2). Lemma 1. Let Z be a standard Gaussian random variable and −∞ ≤ a < b ≤ +1. Let us denote r th mr(a; b) = E(Z j a < Z ≤ b) the r moment of the truncated Z. Then, k ! X 1 [z2i−1φ(z)]b m (a; b) = (2k − 1)!! 1 − a ; for k = 1; 2;:::; 2k (2i − 1)!! [Φ(z)]b i=1 a k X (2k)!! [z2iφ(z)]b m (a; b) = − a ; for k = 0; 1;:::; (3) 2k+1 (2i)!! [Φ(z)]b i=0 a where n!! denotes the double factorial dened by Arfken (1985) as 8 <1; if n = −1; n = 0 or n = 1; n!! = :n × (n − 2)!! if n ≥ 2: As (a; b) ! (−∞; 1), the moments m2k(a; b) and m2k+1(a; b) tend to (2k − 1)!! and zero, respectively. The latter two values correspond to the classical Gaussian moments. 3 We now introduce the truncated skew-normal distribution as a truncation of the skew-normal distribution in 1. Its pdf is 8 1 if < b fµ,σ,λ(x); a < x ≤ b; [Fµ,σ,λ(x)]a fµ,σ,λ(x j a < X ≤ b) = (4) :0 otherwise, where X ∼ SN(µ, σ; λ) and −∞ ≤ a < b ≤ +1 represents the range of the truncation. Let us consider the simple case (µ, σ) = (0; 1). r Proposition 1. Let X be a SN(0; 1; λ). Let us denote sλ,r(u; v) = E[X j u < X ≤ v] with u < v the rth moment of the truncated variable. The following recursive relationship holds, sλ,r(u; v) = (r − 1)sλ,r−2(u; v) + rλ,r(u; v); for r = 1; 2;:::; (5) where sλ,0(u; v) = 1 and sλ,−1 can be any nite value, [xr−1f (x)]v 2 λ [Φ(λ x)]v λ u p ∗ u rλ,r(u; v) = − v + r v mr−1(λ∗u; λ∗v); [Fλ(x)]u 2π λ∗ [Fλ(x)]u 2 1=2 where λ∗ = (1 + λ ) . From (5), we can derive p X (2p − 1)!! s (u; v) = (2p − 1)!! + r (u; v); with p = 1; 2;:::; (6) λ,2p (2k − 1)!! λ,2k k=1 p X (2p)!! s (u; v) = r (u; v); with p = 0; 1;:::: (7) λ,2p+1 (2k)!! λ,2k+1 k=0 Martinez's et al. (2008) results can be viewed as limiting cases of (6) and (7) lim sλ,2p(u; v) = (2p − 1)!!; with p = 0; 1;:::; (u;v)!(−∞;+1) p 2 X (2p)!! λ lim sλ,2p+1(u; v) = p (2k − 1)!! ; with p = 0; 1;:::: (u;v)!(−∞;+1) 2π (2k)!! (1 + λ2)k+1=2 k=0 Equalities (6) and (7) tell us that odd (respectively even) moments of truncated skew-normal distributions can be interpreted as linear combinations of even (respectively odd) moments of the normal distribution truncated at λ∗u and λ∗v. If λ = 0, Equation (5) and Proposition 1 are equivalent to the recursive equation provided in Dhrymes (2005) and Lemma 1, respectively. The restricting condition (µ, σ) = (0; 1) can be easily removed and leads to the following proposition. Proposition 2. Let X ∼ SN(µ, σ; λ). Then, we have m m X r m−r r (8) E[X j a < X ≤ b] = Cmµ σ sλ,r(u; v): r=0 ! m where , , r is a binomial coecient and is dened u = (a − µ)/σ v = (b − µ)/σ Cm = sλ,r(u; v) r in Proposition 1. 4 Proposition 2 provides moments of any order and consequently the rst three moments can be used to implement a classical method-of-moment. Besides the complexity of deriving the explicit expression of the third moment, its estimation is usually tainted with a large variance because X3 can take large values. An alternative route is to derive other types of moments. Following the work of Hosking et al. (1985) and Diebolt et al. (2008), Flecher et al. (2009a) introduced and studied probability weighted moments for skew-normal distributions. The basic idea is to compute moments s r of the type E[X Φ (X)], where s and r are small integers. Here we concentrate on the case s = 0 and r = 1. We obtain the following proposition. Proposition 3. Let X ∼ SN(µ, σ; λ) and a < b. Then, b 2 t [F+(x)]a (9) E[Φ(X) j a < X ≤ b] = 2Φ2(0; ν+; I2 + σ D+D+) b = '(µ, σ; λ); [Fµ,σ,λ(x)]a 2 where F+ is the cumulative distribution function of a closed skew-normal variable CSN1;2(µ, σ ; D+; ν+; I2), t t I2 is the identity matrix of size two, D+ = (1; λ/σ) and ν+ = (−µ, 0) : The denition and some properties of closed skew-normal distributions can be found in the book edited by Genton (2004), see in particular Chapter 2 written by González-Farías et al.

Truncated Skew-Normal Distributions: Moments, Estimation by Weighted Moments and Application to Climatic Data

A Skew Extension of the T-Distribution, with Applications

On Moments of Folded and Truncated Multivariate Normal Distributions

A Study of Non-Central Skew T Distributions and Their Applications in Data Analysis and Change Point Detection

1 One Parameter Exponential Families

Basic Econometrics / Statistics Statistical Distributions: Normal, T, Chi-Sq, & F

The Normal Moment Generating Function

A Family of Skew-Normal Distributions for Modeling Proportions and Rates with Zeros/Ones Excess

Volatility Modeling Using the Student's T Distribution

The Multivariate Normal Distribution=1See Last Slide For

Approximating the Distribution of the Product of Two Normally Distributed Random Variables

Random Variables and Applications

Strategic Information Transmission and Stochastic Orders