Discriminating Between the Normal and the Laplace Distributions
Debasis Kundu

Abstract

Both the normal and Laplace distributions can be used to analyze symmetric data. In this paper we consider the logarithm of the ratio of the maximized likelihoods to discriminate between the two distribution functions. We obtain the asymptotic distributions of the test statistics and observe that they are independent of the unknown parameters. If the underlying distribution is normal, the asymptotic distribution works quite well even when the sample size is small; but if the underlying distribution is Laplace, the asymptotic distribution does not work well for small sample sizes. For the latter case we propose a bias-corrected asymptotic distribution, which works quite well even for small sample sizes. Based on the asymptotic distributions, the minimum sample size needed to discriminate between the two distribution functions for a given probability of correct selection is obtained. Monte Carlo simulations are performed to examine how the asymptotic results work for small sample sizes, and two data sets are analyzed for illustrative purposes.

Key Words and Phrases: Asymptotic distributions; likelihood ratio tests; probability of correct selection; location-scale family.

Address of Correspondence: 1 Department of Mathematics, Indian Institute of Technology Kanpur, Pin 208016, INDIA. e-mail: [email protected], Phone 91-512-2597141, Fax 91-512-2597500.

1 Introduction

Suppose an experimenter has $n$ observations, and elementary data analysis (for example a histogram, a stem-and-leaf plot, or a box plot) suggests that they come from a symmetric distribution. The experimenter wants to fit either the normal distribution or the Laplace distribution; which one is preferable? It is well known that the normal distribution is used to analyze symmetric data with short tails, whereas the Laplace distribution is used for data with long tails. Although these two distributions may provide similar fits for moderate sample sizes, it is still desirable to choose the correct, or more nearly correct, model, since inferences often involve tail probabilities, where the distributional assumption is very important. Therefore, it is very important to make the best possible decision based on whatever data are available.

Deciding whether a given data set follows one of two candidate distribution functions is a very well known problem. Discriminating between two general probability distribution functions was studied by Cox [6], [7], Chambers and Cox [4], Atkinson [1], [2], Dyer [10] and Chen [5]. Dumonceaux and Antle [9] consider the problem of discriminating between the log-normal and Weibull distributions, and Dumonceaux, Antle and Haas [8] consider the problem of discriminating between any two distribution functions from location-scale families. In both [9] and [8] the authors propose test statistics and compute the critical values by Monte Carlo simulation. Fearn and Nebenzahl [11] and Bain and Englehardt [3] consider the problem of discriminating between the gamma and Weibull distributions. Wiens [20], Kim, Sun and Tsutakawa [13], Firth [12] and Kundu and Manglick [14] consider different aspects of discriminating between the log-normal and gamma distributions.

In this paper we discriminate between the normal and Laplace distributions using the ratio of the maximized likelihoods (RML).
It should be mentioned that Dumonceaux, Antle and Haas [8] also use a statistic equivalent to the RML to discriminate between two distribution functions, but they did not study any distributional property of the proposed statistic. In this paper, using the approach of White [18], [19], we obtain the asymptotic distribution of the logarithm of the RML. It is observed that the asymptotic distribution of the logarithm of the RML is normal and does not depend on the parameters of the underlying distribution function. A numerical study indicates that if the underlying distribution is normal, the asymptotic distribution works quite well even for small sample sizes, but the same is not true if the underlying distribution is Laplace. When the underlying distribution is Laplace, we propose a bias-corrected asymptotic distribution, which works quite well for small sample sizes as well. The asymptotic distribution can be used to compute the probability of correct selection (PCS). We also obtain the minimum sample size necessary to discriminate between the two distribution functions for a given PCS.

The rest of the paper is organized as follows. At the end of the present section we introduce the notation used throughout the paper. We briefly describe the likelihood ratio method in section 2. Asymptotic distributions of the logarithm of the RML statistic under the null hypotheses are obtained in section 3. Sample size determination is carried out in section 4. Some numerical experiments are performed in section 5, and two real data sets are analyzed in section 6. Finally, we conclude the paper in section 7.

We use the following notation for the rest of the paper. The density function of a normal random variable with location parameter $-\infty < \mu < \infty$ and scale parameter $\sigma > 0$ will be denoted by

$$f_N(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty. \qquad (1)$$

A normal distribution with mean $\mu$ and variance $\sigma^2$ will be denoted by $N(\mu, \sigma^2)$. The density function of a Laplace distribution with location parameter $-\infty < \eta < \infty$ and scale parameter $\theta > 0$ will be denoted by

$$f_L(x; \eta, \theta) = \frac{1}{2\theta}\, e^{-\frac{|x-\eta|}{\theta}}, \qquad -\infty < x < \infty. \qquad (2)$$

A Laplace distribution with location and scale parameters $\eta$ and $\theta$, respectively, will be denoted by $L(\eta, \theta)$.

Almost sure convergence will be denoted by a.s. For any Borel measurable function $h(\cdot)$, $E_N(h(U))$ and $V_N(h(U))$ will denote the mean and variance of $h(U)$ under the assumption that $U$ follows $N(\mu, \sigma^2)$. Similarly, we define $E_L(h(U))$ and $V_L(h(U))$ as the mean and variance of $h(U)$ when $U$ follows $L(\eta, \theta)$. Moreover, if $g(\cdot)$ and $h(\cdot)$ are two Borel measurable functions, we define $\mathrm{cov}_N(g(U), h(U)) = E_N(g(U)h(U)) - E_N(g(U))E_N(h(U))$ and $\mathrm{cov}_L(g(U), h(U)) = E_L(g(U)h(U)) - E_L(g(U))E_L(h(U))$. Finally, we define

$$\mathrm{median}\{a_1, \ldots, a_{2m+1}\} = a_{m+1}, \qquad \mathrm{median}\{a_1, \ldots, a_{2m}\} = \frac{a_m + a_{m+1}}{2},$$

where $a_1 < a_2 < \cdots$ are the ordered real numbers.
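To make the notation concrete, here is a minimal Python sketch (ours, not part of the paper; the function names are our own) of the two densities (1) and (2) and the median convention just defined.

```python
import math

def f_N(x, mu, sigma):
    """Normal density f_N(x; mu, sigma) of equation (1), with sigma > 0."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

def f_L(x, eta, theta):
    """Laplace density f_L(x; eta, theta) of equation (2), with theta > 0."""
    return math.exp(-abs(x - eta) / theta) / (2.0 * theta)

def sample_median(values):
    """Median convention above: the middle order statistic for an odd number
    of values, the average of the two middle order statistics for an even number."""
    a = sorted(values)
    m = len(a) // 2
    return a[m] if len(a) % 2 == 1 else 0.5 * (a[m - 1] + a[m])
```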
2 Ratio of Maximized Likelihood

We have a sample $X_1, \ldots, X_n$ from one of the two distribution functions. The likelihood functions, assuming that the data follow $N(\mu, \sigma^2)$ or $L(\eta, \theta)$, are

$$l_N(\mu, \sigma) = \prod_{i=1}^{n} f_N(X_i; \mu, \sigma) \qquad \text{and} \qquad l_L(\eta, \theta) = \prod_{i=1}^{n} f_L(X_i; \eta, \theta),$$

respectively. The logarithm of the RML is defined as

$$T = \ln\left[\frac{l_N(\hat{\mu}, \hat{\sigma})}{l_L(\hat{\eta}, \hat{\theta})}\right]. \qquad (3)$$

Here $(\hat{\mu}, \hat{\sigma})$ and $(\hat{\eta}, \hat{\theta})$ are the maximum likelihood estimators (MLEs) of $(\mu, \sigma)$ and $(\eta, \theta)$, respectively, based on the sample $X_1, \ldots, X_n$. Therefore $T$ can be written as

$$T = \frac{n}{2}\ln 2 - \frac{n}{2}\ln \pi + n \ln \hat{\theta} - n \ln \hat{\sigma} + \frac{n}{2}, \qquad (4)$$

where

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \hat{\mu})^2, \qquad \hat{\eta} = \mathrm{median}\{X_1, \ldots, X_n\}, \qquad \hat{\theta} = \frac{1}{n}\sum_{i=1}^{n}|X_i - \hat{\eta}|. \qquad (5)$$

The following discrimination procedure can be used: choose the normal distribution if the test statistic $T > 0$; otherwise choose the Laplace distribution as the preferred model. Note that if the null distribution is $N(\mu, \sigma^2)$, then the distribution of $T$ is independent of $\mu$ and $\sigma$. Similarly, if the null distribution is $L(\eta, \theta)$, then the distribution of $T$ is independent of $\eta$ and $\theta$.

3 Asymptotic Properties of the Logarithm of RML

In this section we derive the asymptotic distributions of $T$.

Case 1: The Data Follow the Normal Distribution

In this case we assume that $X$ follows $N(\mu, \sigma^2)$. We have the following result.

Theorem 1: Under the assumption that the data follow $N(\mu, \sigma^2)$, the distribution of $T$ is asymptotically normal with mean $E_N(T)$ and variance $V_N(T)$.

To prove Theorem 1, we need the following lemma.

Lemma 1: Suppose the data follow $N(\mu, \sigma^2)$; then as $n \to \infty$:

(1) $\hat{\eta} \to \tilde{\eta}$ a.s. and $\hat{\theta} \to \tilde{\theta}$ a.s., where

$$E_N\left(\ln f_L(X; \tilde{\eta}, \tilde{\theta})\right) = \max_{\eta, \theta}\, E_N\left(\ln f_L(X; \eta, \theta)\right).$$

Note that $\tilde{\eta}$ and $\tilde{\theta}$ may depend on $\mu$ and $\sigma$, but we do not make this explicit for brevity. Let us denote

$$T^* = \ln\left[\frac{l_N(\mu, \sigma)}{l_L(\tilde{\eta}, \tilde{\theta})}\right]. \qquad (6)$$

(2) $n^{-\frac{1}{2}}\left(T - E_N(T)\right)$ is asymptotically equivalent to $n^{-\frac{1}{2}}\left(T^* - E_N(T^*)\right)$.

Proof of Lemma 1: The proof follows from arguments similar to those of White ([19]; Theorem 1) and is therefore omitted.

Proof of Theorem 1: Using the Central Limit Theorem (CLT), it is easily seen that $n^{-\frac{1}{2}}\left(T^* - E_N(T^*)\right)$ is asymptotically normally distributed. The proof then follows immediately from part (2) of Lemma 1 and the CLT.

Comments: It should be mentioned here that $L(\tilde{\eta}, \tilde{\theta})$ is the Laplace distribution closest to $N(\mu, \sigma^2)$ in terms of the Kullback-Leibler information measure.

Now we compute $\tilde{\eta}$, $\tilde{\theta}$, $E_N(T)$ and $V_N(T)$. Note that

$$E_N\left(\ln f_L(X; \eta, \theta)\right) = -\ln 2 - \ln \theta - E_N\left|\frac{X - \eta}{\theta}\right|.$$

Since $X$ follows $N(\mu, \sigma^2)$, it is clear that

$$\tilde{\eta} = \mu \qquad \text{and} \qquad \tilde{\theta} = E_N|X - \mu| = \sigma\sqrt{\frac{2}{\pi}}. \qquad (7)$$

Now we provide expressions for $E_N(T)$ and $V_N(T)$. Note that $\lim_{n \to \infty} \frac{E_N(T)}{n}$ and $\lim_{n \to \infty} \frac{V_N(T)}{n}$ exist; let us denote $\lim_{n \to \infty} \frac{E_N(T)}{n} = AM_N$ and $\lim_{n \to \infty} \frac{V_N(T)}{n} = AV_N$. Since the distribution of $T$ is independent of $\mu$ and $\sigma$, to compute $E_N(T)$ and $V_N(T)$ we may, without loss of generality, take $\mu = 0$ and $\sigma = 1$, i.e., $X$ follows $N(0, 1)$.
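To illustrate the procedure, the following Python sketch (ours, not from the paper; the function names are our own) computes the MLEs of (5), evaluates $T$ via (4), and applies the rule "choose normal if $T > 0$". As a sanity check under $N(0, 1)$, $\hat{\theta}$ should be close to $\tilde{\theta} = \sqrt{2/\pi} \approx 0.7979$ from (7), and the rule should select the normal model with high probability. Python's statistics.median follows the same median convention as in section 1.

```python
import math
import random
from statistics import median  # middle value for odd n, mean of the two middle values for even n

def rml_statistic(x):
    """Logarithm of the RML, T of equation (4), computed from the MLEs of (5)."""
    n = len(x)
    mu_hat = sum(x) / n
    sigma_hat = math.sqrt(sum((xi - mu_hat) ** 2 for xi in x) / n)
    eta_hat = median(x)
    theta_hat = sum(abs(xi - eta_hat) for xi in x) / n
    return ((n / 2.0) * math.log(2.0) - (n / 2.0) * math.log(math.pi)
            + n * math.log(theta_hat) - n * math.log(sigma_hat) + n / 2.0)

def select_model(x):
    """Discrimination rule: choose normal if T > 0, otherwise Laplace."""
    return "normal" if rml_statistic(x) > 0 else "Laplace"

# Sanity check under N(0, 1): theta_hat should approach sqrt(2/pi) ~ 0.7979,
# in line with equation (7), and T is typically positive, selecting "normal".
random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
theta_hat = sum(abs(xi - median(x)) for xi in x) / len(x)
print(select_model(x), round(theta_hat, 4), round(math.sqrt(2.0 / math.pi), 4))
```

Since both distributions are continuous, the event $T = 0$ has probability zero, so no separate critical value is needed for this sign rule.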