Compound Truncated Poisson Normal Distribution: Mathematical Properties and Moment Estimation
Inverse Problems and Imaging, Volume 13, No. 4, 2019, 787–803. doi:10.3934/ipi.2019036

Abraão D. C. Nascimento*
Departamento de Estatística, Centro de Ciências Exatas e da Natureza, Universidade Federal de Pernambuco, Recife - PE, ZIP 50740-540, Brazil

Leandro C. Rêgo
Departamento de Estatística e Matemática Aplicada, Centro de Ciências, Universidade Federal do Ceará, Fortaleza - CE, ZIP 60440-900, Brazil
Programas de Pós-Graduação em Estatística e Engenharia de Produção, Universidade Federal de Pernambuco, Recife - PE, ZIP 50740-570, Brazil

Raphaela L. B. A. Nascimento
Departamento de Estatística, Centro de Ciências Exatas e da Natureza, Universidade Federal de Pernambuco, Recife - PE, ZIP 50740-540, Brazil

(Communicated by Naoki Saito)

Abstract. The proposal of efficient distributions is a crucial step for decision making in practice. Mixture models are adjustment tools which are often used to describe complex phenomena. However, as one disadvantage, such models impose hard inference procedures, submitted to a large number of parameters. To solve this issue, this paper proposes a new model which is able to describe multimodal, symmetric and asymmetric behaviors with only three parameters, called compound truncated Poisson normal (CTPN) distribution. Some properties of the CTPN law are derived and discussed: characteristic and cumulant functions and ordinary moments. A moment estimation procedure for CTPN parameters is also provided. This procedure consists of solving one nonlinear equation in terms of a single parameter. An application with images of synthetic aperture radar (SAR) is made. The results present evidence that the CTPN can outperform the $G^0$, K and BGN (laws commonly used in SAR literature), as well as GBGL models.

1. Introduction.
Empirical multimodal distributions are needed in many applications [5], for instance in voice recognition [23] and radar imagery processing [9]. The use of mixture models is an alternative in these cases [10]. Wirjanto and Xu [28] presented a detailed discussion about finite mixtures among normal distributions. However, these distributions involve a large number of parameters, yielding hard optimization issues when performing the associated inferential procedures [1]. In this paper, we propose a distribution with three parameters as an efficient descriptor of multimodal behaviour.

The proposal of new analytically tractable and flexible models by using generators has received great attention in survival analysis. For instance, Cordeiro and Lemonte [6] proposed the McDonald inverted beta model to describe daily ozone concentrations, Cordeiro et al. [7] derived the Kumaraswamy Weibull distribution for a sample of devices from a field-tracking study, and Nandi and Mämpel [19] provided a generalized Gaussian law, on which asymmetry properties were explored.

2010 Mathematics Subject Classification. Primary: 62F10; Secondary: 62P30.
Key words and phrases. Compound generator, radar data, multimodal behaviour, ordinary moments.
* Corresponding author: Abraão D. C. Nascimento.
© 2019 American Institute of Mathematical Sciences

An alternative way to generate probability distributions is through compounding. This method was initially proposed by Grushka [14] and Golubev [13]. Golubev [13] proposed the exponentially modified Gaussian model, which has physical appeal in Biology and is the result of the convolution between the exponential and Gaussian models. Teich and Diament [26] showed that the compound Poisson distribution whose parameter follows the gamma distribution yields the negative binomial model. These compositions are special cases of the compound N method proposed by Karlis and Xekalaki [15].
In this method, an event is described by a random sum:

$$ S_N = X_1 + X_2 + \cdots + X_N, \tag{1} $$

where both $N$ and $\{X_i;\ i = 1, \ldots, N\}$ are independent random variables. In what follows, we assume that $S = S_N$. The wide use of (1) can be justified by its analytic form being a good model for several natural phenomena.

Revfeim [25] proposed the compound Poisson exponential for describing the total precipitation per day, where the number of days with precipitation is Poisson distributed and the precipitation amount follows the exponential distribution. Panger [22] showed that the compound Poisson and negative binomial models are extensively used in risk economic theory. Finally, Thompson [27] applied the compound Poisson distribution to model the total amount of monthly rainfall.

In this paper, we have four goals. First, we give a theoretical account of the family called compound N. We derive some mathematical properties of this family: characteristic and cumulant functions and ordinary moments. Second, a special case of the compound N family is introduced, which consists of using a random sum of normal distributions, where the number of terms follows the truncated Poisson (TP) law. This model serves as a descriptor for data with negative, positive or real supports (in particular, for characteristics in radar systems). If $N$ follows a truncated Poisson distribution, $N \sim \mathrm{TP}(\lambda)$, its probability mass function is given by

$$ P(N = n) = \frac{1}{e^{\lambda} - 1}\,\frac{\lambda^{n}}{n!}, \quad \text{for } n = 1, 2, \ldots $$

The model resulting from considering $N \sim \mathrm{TP}(\lambda)$ and $X_i \sim N(\mu, \sigma^2)$ in (1) is called the compound truncated Poisson normal distribution, which is denoted by $\mathrm{CTPN}(\lambda, \mu, \sigma^2)$. Third, we furnish an estimation procedure for the CTPN parameters, using the moment method. We show that the associated estimation system can be reduced to one nonlinear equation in terms of one parameter. Fourth, an application to real data is made.
The CTPN model is applied to describe data extracted from a synthetic aperture radar (SAR) image involving agriculture covers. Results provide evidence that the CTPN may outperform the GBGL distribution (recently proposed to describe lifetime data) and three other models, the $G^0$, K and BGN distributions, which are commonly used in the SAR literature.

This paper is organized as follows. In Section 2, the CTPN and some of its properties are presented. A moment estimation method is derived in Section 3. In Section 4, an application to a real data set is presented. Main conclusions are given in Section 5.

2. A new family and one of its special cases.

2.1. New family. We start this section by deriving and discussing some properties of the compound N family, which is defined as follows. Let $X_1, \ldots, X_n$ be a random sample (i.e., independent and identically distributed) from $X$ and let $n$ denote a possible outcome of a random variable $N \in \mathbb{Z}_+$. The compound N family is defined as $S = \sum_{j=1}^{N} X_j$ and has cumulative distribution function (cdf) given by:

$$
\begin{aligned}
F_S(x) = P(S \le x) &= P\Big(\sum_{j=1}^{N} X_j \le x\Big) = \sum_{k=1}^{\infty} P(N = k)\, P\Big(\sum_{j=1}^{N} X_j \le x \,\Big|\, N = k\Big) \\
&= \sum_{k=1}^{\infty} P(N = k)\, P\Big(\sum_{j=1}^{k} X_j \le x\Big) = \sum_{k=1}^{\infty} P(N = k)\, F_{S_k}(x),
\end{aligned}
\tag{2}
$$

where $F_{S_k}$ is the cdf of the sum of a $k$-point random sample, $S_k = \sum_{i=1}^{k} X_i$. Thus, if $X$ is an absolutely continuous random variable, then the compound N probability density function (pdf) and hazard rate function (hrf) are, respectively, given by

$$ f_S(x) = \sum_{k=1}^{\infty} P(N = k)\, f_{S_k}(x), \tag{3} $$

and

$$ h_S(x) = \frac{f_S(x)}{1 - F_S(x)} = \frac{\sum_{k=1}^{\infty} P(N = k)\, f_{S_k}(x)}{1 - \sum_{k=1}^{\infty} P(N = k)\, F_{S_k}(x)} \tag{4} $$

$$ \phantom{h_S(x)} = \sum_{k=1}^{\infty} \frac{P(N = k)\,\big(1 - F_{S_k}(x)\big)}{\sum_{j=1}^{\infty} P(N = j)\,\big(1 - F_{S_j}(x)\big)}\; h_{S_k}(x). \tag{5} $$

From the last identity, it can be seen that the compound N hrf is a weighted mean of the hrfs of the partial sums in $S$.

Now, we discuss expressions for the ordinary moments of $S$.
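Before turning to moments, the series (3) can be made concrete for the CTPN case. The following Python sketch is illustrative and not taken from the paper. It assumes the standard fact that a sum of $k$ independent $N(\mu, \sigma^2)$ variables is $N(k\mu, k\sigma^2)$, truncates the series at a hypothetical cutoff `kmax`, and uses made-up function names (`tp_pmf`, `normal_pdf`, `ctpn_pdf`, `sample_ctpn`).

```python
import math
import random


def tp_pmf(k, lam):
    """Zero-truncated Poisson pmf: P(N = k) = lam^k / (k! (e^lam - 1)), k >= 1."""
    if k < 1:
        return 0.0
    # Work on the log scale so lam**k / k! cannot overflow for large k.
    return math.exp(k * math.log(lam) - math.lgamma(k + 1)) / math.expm1(lam)


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def ctpn_pdf(x, lam, mu, sigma2, kmax=100):
    """Series (3) truncated at kmax (an assumption), with S_k ~ N(k*mu, k*sigma2)."""
    return sum(
        tp_pmf(k, lam) * normal_pdf(x, k * mu, k * sigma2)
        for k in range(1, kmax + 1)
    )


def sample_ctpn(lam, mu, sigma2, rng):
    """Draw one value of S: first N by inverse-cdf search, then S given N = k."""
    u = rng.random()
    k = 1
    p = tp_pmf(1, lam)
    cdf = p
    # The p > 0 guard stops the search if rounding keeps cdf just below u.
    while u > cdf and p > 0.0:
        k += 1
        p *= lam / k  # recursion P(N = k) = P(N = k - 1) * lam / k
        cdf += p
    # Drawing N(k*mu, k*sigma2) directly is equivalent in law to summing k normals.
    return rng.gauss(k * mu, math.sqrt(k * sigma2))


if __name__ == "__main__":
    rng = random.Random(0)
    lam, mu, sigma2 = 2.0, 3.0, 0.25  # illustrative values only
    draws = [sample_ctpn(lam, mu, sigma2, rng) for _ in range(10000)]
    print("sample mean:", sum(draws) / len(draws))
    print("pdf near the first three modes:",
          [ctpn_pdf(j * mu, lam, mu, sigma2) for j in (1, 2, 3)])
```

With a small $\sigma^2$ relative to $\mu$, as in the illustrative values above, the components centred at $\mu, 2\mu, 3\mu, \ldots$ barely overlap, which is one way to see how three parameters can produce multimodal shapes.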
Based on Fine [11], the characteristic function (cf) of $S$ may be determined by the following theorem.

Theorem 2.1. Let $N$ be a positive integer random variable having cf $\varphi_N(\cdot)$ and $S = \sum_{j=1}^{N} X_j$, where $X_1, \ldots, X_N$ is a random sample drawn from $X$ with cf $\varphi_X(\cdot)$. Assume also that $N$ and $X_i$ (for $i = 1, \ldots, N$) are independent. Then, the cf of $S$ is given by

$$ \varphi_S(t) = E\big(e^{\mathrm{i} t S}\big) = \varphi_N\big(-\mathrm{i} \log \varphi_X(t)\big), \quad \text{where } \mathrm{i} = \sqrt{-1}. $$

The moment generating function (mgf) of $S$ is given by

$$
\begin{aligned}
M_S(t) = E\big(e^{tS}\big) &= \int_{-\infty}^{\infty} e^{tx} f_S(x)\, dx = \int_{-\infty}^{\infty} e^{tx} \sum_{k=1}^{\infty} P(N = k)\, f_{S_k}(x)\, dx \\
&= \sum_{k=1}^{\infty} P(N = k) \int_{-\infty}^{\infty} e^{tx} f_{S_k}(x)\, dx \\
&= \sum_{k=1}^{\infty} P(N = k)\, [M_X(t)]^{k} = M_N\big(\log M_X(t)\big)
\end{aligned}
\tag{6}
$$

and, as a consequence, the cumulant generating function (cgf) is given by

$$ \kappa_S(t) = \log M_S(t) = \log\{M_N(\log M_X(t))\} = \kappa_N(\kappa_X(t)). \tag{7} $$

From (3), one can obtain compound N ordinary moments in terms of series of moments of $X$:

$$
\begin{aligned}
E(S^{j}) &= \int_{-\infty}^{\infty} x^{j} f_S(x)\, dx = \int_{-\infty}^{\infty} x^{j} \sum_{k=1}^{\infty} P(N = k)\, f_{S_k}(x)\, dx \\
&= \sum_{k=1}^{\infty} P(N = k) \int_{-\infty}^{\infty} x^{j} f_{S_k}(x)\, dx.
\end{aligned}
$$

Note that

$$ \int_{-\infty}^{\infty} x^{j} f_{S_k}(x)\, dx = E\big[(X_1 + \cdots + X_k)^{j}\big], $$

where $X_1, \ldots, X_k$ is a random sample from $X$. Using the multinomial theorem combined with simple algebraic manipulations, we have:

$$ E\big[(X_1 + \cdots + X_k)^{j}\big] = \sum_{j_1=0}^{j} \sum_{j_2=0}^{j-j_1} \cdots \sum_{j_{k-1}=0}^{j-j_1-j_2-\cdots-j_{k-2}} \binom{j}{j_1, \ldots, j_k}\, E\big(X_1^{j_1}\big)\, E\big(X_2^{j_2}\big) \cdots E\big(X_k^{j_k}\big), $$

where $j_k = j - j_1 - j_2 - \cdots - j_{k-1}$.
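Identity (6) and the first two moments can be checked numerically in the CTPN case. The sketch below is illustrative and not taken from the paper. It assumes the standard normal mgf $M_X(t) = \exp(\mu t + \sigma^2 t^2/2)$ and the truncated Poisson mgf $M_N(s) = (\exp(\lambda e^{s}) - 1)/(e^{\lambda} - 1)$, which follows by summing the pmf given in the Introduction. The mean and variance use the usual random-sum identities $E(S) = E(N)\mu$ and $\mathrm{Var}(S) = E(N)\sigma^2 + \mathrm{Var}(N)\mu^2$. All function names and the cutoff `kmax` are hypothetical.

```python
import math


def tp_pmf(k, lam):
    """Zero-truncated Poisson pmf P(N = k), k = 1, 2, ..."""
    return math.exp(k * math.log(lam) - math.lgamma(k + 1)) / math.expm1(lam)


def normal_mgf(t, mu, sigma2):
    """mgf of N(mu, sigma2): exp(mu t + sigma2 t^2 / 2)."""
    return math.exp(mu * t + 0.5 * sigma2 * t * t)


def mgf_series(t, lam, mu, sigma2, kmax=100):
    """Series form in (6): sum_k P(N = k) [M_X(t)]^k, truncated at kmax."""
    mx = normal_mgf(t, mu, sigma2)
    return sum(tp_pmf(k, lam) * mx ** k for k in range(1, kmax + 1))


def mgf_closed(t, lam, mu, sigma2):
    """Closed form in (6): M_N(log M_X(t)) for a truncated Poisson N."""
    mx = normal_mgf(t, mu, sigma2)
    return math.expm1(lam * mx) / math.expm1(lam)


def ctpn_mean_var(lam, mu, sigma2):
    """Mean and variance of S from the random-sum identities (assumed, see lead-in)."""
    q = -math.expm1(-lam)            # 1 - e^{-lam}
    en = lam / q                     # E(N)
    en2 = (lam + lam * lam) / q      # E(N^2)
    var_n = en2 - en * en
    return en * mu, en * sigma2 + var_n * mu * mu


if __name__ == "__main__":
    lam, mu, sigma2 = 2.0, 1.0, 0.5  # illustrative values only
    t = 0.3
    print("series :", mgf_series(t, lam, mu, sigma2))
    print("closed :", mgf_closed(t, lam, mu, sigma2))
    print("mean, variance:", ctpn_mean_var(lam, mu, sigma2))
```

The two mgf evaluations should agree up to rounding for moderate $t$. For large $t$ the power `mx ** k` in the series can overflow, so the closed form is the safer choice in practice.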