Tilted Normal Distribution and Its Survival Properties
Total Page:16
File Type:pdf, Size:1020Kb
Journal of Data Science 10(2012), 225-240 Tilted Normal Distribution and Its Survival Properties Sudhansu S. Maiti1∗ and Mithu Dey2 1Visva-Bharati University and 2Asansol Engineering College Abstract: To analyze skewed data, skew normal distribution is proposed by Azzalini (1985). For practical problems of estimating the skewness parame- ter of this distribution, Gupta and Gupta (2008) suggested power normal dis- tribution as an alternative. We search for another alternative, named tilted normal distribution following the approach of Marshall and Olkin (1997) to add a positive parameter to a general survival function and taking survival function is of normal form. We have found out different properties of this distribution. Maximum likelihood estimate of parameters of this distribu- tion have been found out. Comparison of tilted normal distribution with skew normal and power normal distribution have been made. Key words: Akaike information criterion, compound distribution, failure rate, Kolmogorov discrepancy measure, maximum likelihood. 1. Introduction The skew normal distribution, proposed by Azzalani (1985), can be a suitable model for the analysis of data exhibiting a unimodal density having some skewness present, a structure often occurring in data analysis. The proposed distribution is a generalization of the standard normal distribution and the probability density function (pdf) is given by φ(z; λ) = 2φ(z)Φ(λz); − 1 < z < 1; (1.1) where φ(x) and Φ(x) denote the N(0; 1) density and distribution function re- spectively. The parameter λ regulates the skewness and λ = 0 corresponds to the standard normal case. The density given by (1:1) enjoys a number of for- mal properties which resemble those of the normal distribution, for example, if X has the pdf, given by (1:1), then X2 has a chi-square distribution with one degree of freedom. That is, all even moments of X are exactly the same as the ∗Corresponding author. 226 Sudhansu S. Maiti and Mithu Dey corresponding even moments of the standard normal distribution. For more in- formation, see Azzalini and Capitanio (1999) and Genton (2004). A motivation of the skew normal distribution has been elegantly exhibited by Arnold et al. (1993). This model can naturally arise in applications as hidden function and/or selective reporting model, see Arnold and Beaver (2002). Gupta and Gupta (2008) observed that the estimation of the skewness pa- rameter of model (1:1) is problematic when the sample size is not large enough. Monti (2003) noted that the estimate is λb = ±∞, even when the data are gener- ated by a model with finite λ. In the case of a skew normal model the moment estimator of λ is the solution of r 2 λ p = X:¯ (1.2) π 1 + λ2 Therefore, the solution exists if and only if jX¯j < p2/π. The exact distribution of X¯ is not known and it is not easy to establish. Gupta and Gupta (2008) estimated the value of P(jX¯j > p2/π) based on simulation data for different values of n and λ. They have shown that as the sample size increases, the probability of the feasibility of the moment estimator increases. On the other hand, as the skewness parameter increases, the chances of obtaining the moment estimator decreases. To find out the maximum likelihood estimator(MLE) of λ based on a random sample X1;X2; ··· ;Xn, the likelihood equation is n X Xiφ(λXi) = 0: (1.3) Φ(λX ) i=0 i It is clear that if all Xi have the same sign then there is no solution of (1:3), see also Liseo (1990) and Azzalini and Capitanio (1999). The probability that X > 0, for λ > 0, is an increasing function of λ. When the absolute value of the skewness parameter is large, the probability that all the observations have the same sign is large. Therefore, for small to moderate sample sizes, the MLE may not be accurate enough for a practical use. The Centered Parametrization (CP) can solve these difficulties to some extent. Azzalini (1985) reparametrized the problem by 0 0 2 1 p writingp X = µ + σZ , where Z = (Z − µz)/σz; σz = (1 − µz) 2 ; µz = 2/π · 2 (λ/ 1 + λ ) and considering the centred parameters CP= (µ, σ; γ1) instead of the direct parameters DP= (µ, σ; λ). Here γ1 is the usual univariate index of skewness, taken with same sign as that of λ [see for details, Azzalini and Capitanio (1999)]. The set of S-PLUS routines developed for the computation are freely available at the following World Wide Web address: http://azzalini.stat.unipd.it/SN/. Tilted Normal Distribution and Its Survival Properties 227 For the above reason, Gupta and Gupta (2008) considered another skewed model for which normal distribution is a special case. They proposed the power normal model whose distribution function and density function are given by α F2(x; α) = [Φ(x)] ; − 1 < x < 1; α > 0; (1.4) and α−1 f2(x; α) = α[Φ(x)] φ(x); − 1 < x < 1; α > 0: (1.5) This is basically an exponentiated family of distributions with baseline distribu- tion as normal. This is nothing but a system of adding one or more parameters to the existing normal distribution. Such an addition of parameters makes the resulting distribution richer and more flexible for modeling data. The power nor- mal distribution has several nice properties. For α = 1 it reduces to the standard normal distribution. This distribution is a unimodal density which is skewed to the right if α > 1 and to the left if 0 < α < 1. Gupta and Gupta (2008) considered α as skewness parameter of the distribution. But for 0 < α < 1, skewness to the left is not so clear as is to be seen from the Figure 1 and modeling of left skewed data set will be misfit. This motivates us to think for another skewed distribution for which normal distribution is a specified case and at the same time it will be fit for modeling both kind of skewed data. We take Marshall and Olkin (1997) extended model to add a positive parameter to a general survival function. α=1 α=0.5 α=0.1 α=8 ) α (x; 2 f 0.0 0.2 0.4 0.6 0.8 1.0 −4 −2 0 2 4 x Figure 1: Plot of power normal distribution for different values of α 228 Sudhansu S. Maiti and Mithu Dey Let X be arandom variable with survival function and the probability density function as βF 0(x) F 3(x; β) = ; − 1 < x < 1; β > 0; (1.6) 1 − (1 − β)F 0(x) and βf (x) f (x; β) = 0 ; − 1 < x < 1; β > 0; (1.7) 3 2 [1 − (1 − β)F 0(x)] respectively. Here F0(x) is the baseline distribution function and f0(x) is the corresponding probability density function. Taking F0(x) = Φ(x) and f0(x) = φ(x), we get the tilted normal (standard) distribution as βφ(x) f (x; β) = ; − 1 < x < 1; β > 0: (1.8) 3 [1 − (1 − β)f1 − Φ(x)g]2 Here β = 1 indicates a standard normal density function. The tilted normal density is a unimodal density which is skewed to the left if β > 1 and to the right if 0 < β < 1. Figure 2 displays a few pdf graphs for various values of β. β=1 β=0.5 β=0.01 β=100 ) β (x; 3 f 0.0 0.2 0.4 0.6 0.8 1.0 −4 −2 0 2 4 x Figure 2: Plot of tilted normal distribution for different values of β Tilted Normal Distribution and Its Survival Properties 229 If the normal density has mean µ and variance σ2, then the form of the tilted normal distribution will be β φ x−µ f (x; β; µ, σ) = σ σ ; − 1 < x; µ < 1; β; σ > 0: (1.9) 3 x−µ 2 [1 − (1 − β)f1 − Φ σ g] We will denote this tilted normal distribution with parameters µ, σ and β as TN(µ, σ; β). The paper is organized as follows. In Section 2, we have shown that the TN distribution can be obtained as a compound distribution with mixing exponential distribution. The basic structural properties of the proposed model including the reliability properties are presented in Section 3. In Section 4, we discuss the estimation of parameters of the TN distribution. Section 5 is devoted for studying closeness of skew normal, power normal and tilted normal distributions. A data set has been analyzed using the TN model in Section 6. Section 7 concludes. 2. Compounding Let G¯(xjθ); −∞ < x; θ < 1, be a survival function (SF) of a random variable X given θ. Let θ be a random variable with pdf m(θ). Then a distribution with survival function Z 1 G¯(x) = G¯(xjθ)m(θ)dθ; − 1 < x < 1; −∞ is called a compound distribution with mixing density m(θ). Compound distri- butions provide a tool to obtain new parametric families of distributions from existing ones. They represent heterogeneous models where populations' individ- uals have different risks. The following theorem show that the TN distribution can be obtained as a compound distribution. Theorem 2.1 Suppose that the conditional SF of a continuous random variable X given Θ = θ is given by " # Φ x−µ ¯ σ G(xjθ) = exp −θ x−µ ; − 1 < x; µ < 1; θ; σ > 0: 1 − Φ σ Let θ have an exponential distribution with pdf m(θ) = βe−βθ; θ; β > 0: Then the compound distribution of X is the TN(µ, σ; β) distribution.