Statistics and Its Interface Volume 9 (2016) 11–32
Properties of the zero-and-one inflated Poisson distribution and likelihood-based inference methods
Chi Zhang, Guo-Liang Tian∗, and Kai-Wang Ng
∗Corresponding author.

To model count data with excess zeros and excess ones, Melkersson and Olsson, in their unpublished 1999 manuscript, extended the zero-inflated Poisson distribution to a zero-and-one-inflated Poisson (ZOIP) distribution. However, the distributional theory and corresponding properties of the ZOIP have not yet been explored, and likelihood-based inference methods for parameters of interest were not well developed. In this paper, we extensively study the ZOIP distribution by first constructing five equivalent stochastic representations for the ZOIP random variable and then deriving other important distributional properties. Maximum likelihood estimates of parameters are obtained by both the Fisher scoring and expectation–maximization algorithms. Bootstrap confidence intervals for parameters of interest and testing hypotheses under large sample sizes are provided. Simulation studies are performed and five real data sets are used to illustrate the proposed methods.

Keywords and phrases: EM algorithm, Fisher scoring algorithm, One-inflated Poisson distribution, Zero-inflated Poisson model, Zero-and-one inflated Poisson model.

1. INTRODUCTION

The Poisson model provides a standard and popular framework for modeling count data. The over-dispersion problem emerges in modeling count data due to a greater incidence of zeros than that permitted by the traditional Poisson distribution. Lambert [17] proposed the zero-inflated Poisson (ZIP) regression model to handle count data with excess zeros. Some examples of the ZIP distribution can be found in Neyman [22], Cohen [4], Singh [29], Martin and Katti [19], Kemp [16], and Mullahy [21]. Gupta et al. [11] developed a zero-adjusted generalized Poisson distribution including the ZIP distribution as a special case. Subsequently, Ridout et al. [24] reviewed some useful models for fitting count data with excess zeros. Carrivick et al. [3] applied the ZIP model to evaluate the effectiveness of a consultative manual handling workplace risk assessment team in reducing the risk of occupational injury among cleaners within a hospital.

Over-dispersion may also result from other phenomena, such as a high proportion of both zeros and ones in the count data. For example, Eriksson and Åberg [8] reported two-year panel data from the Swedish Level of Living Surveys in 1974 and 1991, where the numbers of visits to a dentist have high proportions of zeros and ones, and one-visit observations are even more frequent than zero-visits. It is common to visit a dentist for a routine control; e.g., school-children go to a dentist once a year almost as a rule. As a second example, Carrivick et al. [3] reported that much of the data collected on occupational safety involves accident or injury counts with many zeros and ones. In general, safety regulations protect most workers from injury, so that many zeros are observed. However, no perfect protection exists, so some workers will still suffer an accident or injury. Such an experience serves as a warning signal to other workers to be more cautious and to avoid further accidents. Hence more ones than larger counts are observed, indicating that the majority of people have only one accident or none, rather than two or more injuries.

Thus, the ZIP is no longer an appropriate distribution to model such count data with extra zeros and extra ones. As an extension of the ZIP, the so-called zero-and-one-inflated Poisson (ZOIP) distribution proposed by Melkersson and Olsson [20] is a useful tool to capture the characteristics of such count data. The major goal of Melkersson and Olsson [20] was to fit the dentist-visiting data from Sweden with covariates. Later, Saito and Rodrigues [25] presented a Bayesian analysis of the same dentist-visiting data without covariates via the data augmentation algorithm (Tanner and Wong, [30]). However, the distributional theory and corresponding properties of the ZOIP have not yet been explored, and likelihood-based methods for parameters of interest were not well developed because, to our best knowledge, only two papers (i.e., Melkersson and Olsson, [20]; Saito and Rodrigues, [25]) involve the ZOIP to date. The main objective of this paper is to fill this gap.

For convenience, in this paper we denote a degenerate distribution at a single point c by ξ ∼ Degenerate(c), where ξ is a random variable with probability mass function (pmf) Pr(ξ = c) = 1 and c is a constant. Let ξ0 ∼ Degenerate(0), ξ1 ∼ Degenerate(1), X ∼ Poisson(λ), and let them be independent. A discrete random variable Y is said to follow a ZOIP distribution, denoted by Y ∼ ZOIP(φ0, φ1; λ), if its pmf is (Melkersson and Olsson, [20])

(1.1)  f(y | φ0, φ1; λ) = φ0 Pr(ξ0 = y) + φ1 Pr(ξ1 = y) + φ2 Pr(X = y)
                        = φ0 + φ2 e^{−λ},          if y = 0,
                          φ1 + φ2 λ e^{−λ},        if y = 1,
                          φ2 λ^y e^{−λ}/y!,        if y = 2, 3, ...,
                        = (φ0 + φ2 e^{−λ}) I(y = 0) + (φ1 + φ2 λ e^{−λ}) I(y = 1) + φ2 (λ^y e^{−λ}/y!) I(y ≥ 2),

where φ0 ∈ [0, 1) and φ1 ∈ [0, 1) respectively denote the unknown proportions for incorporating extra zeros and extra ones beyond those allowed by the traditional Poisson distribution, and φ2 =̂ 1 − φ0 − φ1 ∈ (0, 1]. The ZOIP distribution is a mixture of two degenerate distributions with all mass at zero and one, respectively, and a Poisson(λ) distribution. In particular, when φ0 = 0, the ZOIP distribution reduces to the one-inflated Poisson (OIP) distribution, denoted by OIP(φ1, λ); when φ1 = 0, it reduces to the ZIP distribution, denoted by ZIP(φ0, λ); when φ0 = φ1 = 0, it becomes the traditional Poisson distribution.

The remainder of the paper is organized as follows. Section 2 presents five equivalent stochastic representations for the ZOIP random variable. Section 3 develops other distributional properties. In Section 4, we derive the Fisher scoring algorithm and the expectation–maximization (EM) algorithm for finding the MLEs of parameters. Bootstrap confidence intervals are also provided. The likelihood ratio test and the score test for one inflation, zero inflation and simultaneous zero-and-one inflation are developed in Section 5. Simulation studies to compare the likelihood ratio test with the score test are conducted in Section 6. In Section 7, five real data sets are used to illustrate the proposed methods. A discussion is given in Section 8.

2. VARIOUS STOCHASTIC REPRESENTATIONS AND THEIR EQUIVALENCE

In this section, we present five different stochastic representations (SRs) for the random variable Y ∼ ZOIP(φ0, φ1; λ) and show their equivalence. These SRs reveal the relationship of the ZOIP(φ0, φ1; λ) distribution with Degenerate(0), Degenerate(1), Poisson(λ), Bernoulli(φ1/(φ0 + φ1)), ZIP(φ0/(1 − φ1), λ), ZTP(λ) and OTP(λ).

2.1 The first stochastic representation

Let z = (Z0, Z1, Z2)ᵀ ∼ Multinomial(1; φ0, φ1, φ2), X ∼ Poisson(λ), and let z and X be mutually independent (denoted by z ⊥⊥ X). It is easy to show that the first SR of the random variable Y ∼ ZOIP(φ0, φ1; λ) is given by

(2.1)  Y =d Z0·0 + Z1·1 + Z2·X = Z1 + Z2·X
          = 0,   with probability φ0,
            1,   with probability φ1,
            X,   with probability φ2,

where the symbol "=d" means that the random variables on both sides of the equality have the same distribution. The justification of this fact is as follows. By noting that Z0 + Z1 + Z2 = 1 and Pr(Zi = 1) = φi for i = 0, 1, 2, from (2.1) we have

(2.2)  Pr(Y = 0) = Pr(Z0 = 1) + Pr(Z2 = 1, X = 0) = φ0 + φ2 e^{−λ},
       Pr(Y = 1) = Pr(Z1 = 1) + Pr(Z2 = 1, X = 1) = φ1 + φ2 λ e^{−λ},
       Pr(Y = y) = Pr(Z2 = 1, X = y) = φ2 λ^y e^{−λ}/y!,   y ≥ 2,

which is identical to (1.1), implying that Y ∼ ZOIP(φ0, φ1; λ). The SR (2.1) means that Y ∼ ZOIP(φ0, φ1; λ) is a mixture of three distributions: Degenerate(0), Degenerate(1) and Poisson(λ).

2.2 The second stochastic representation

Alternatively, we can develop a second SR of the zero-and-one inflated Poisson random variable. Let Z ∼ Bernoulli(1 − φ), η ∼ Bernoulli(p), X ∼ Poisson(λ), and let (Z, η, X) be mutually independent. Then

(2.3)  Y =d (1 − Z)η + Z·X
          = η,   with probability φ,
            X,   with probability 1 − φ,

follows the distribution ZOIP(φ0, φ1; λ) with φ0 = φ(1 − p) and φ1 = φp. This fact can be verified as follows:

(2.4)  Pr(Y = 0) = Pr(Z = 0, Y = 0) + Pr(Z = 1, Y = 0)
                 = Pr(Z = 0, η = 0) + Pr(Z = 1, X = 0)
                 = φ(1 − p) + (1 − φ) e^{−λ},
       Pr(Y = 1) = Pr(Z = 0, Y = 1) + Pr(Z = 1, Y = 1)
                 = Pr(Z = 0, η = 1) + Pr(Z = 1, X = 1)
                 = φp + (1 − φ) λ e^{−λ},
       Pr(Y = y) = Pr(Z = 1, X = y) = (1 − φ) λ^y e^{−λ}/y!,   y ≥ 2.
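For readers who wish to experiment numerically, the pmf (1.1) and the SR (2.1) can be coded directly. The following Python sketch is illustrative only; the function names (zoip_pmf, zoip_rvs) and the particular parameter values are ours, not part of the paper.

```python
import numpy as np
from scipy.stats import poisson

def zoip_pmf(y, phi0, phi1, lam):
    """Pmf (1.1) of ZOIP(phi0, phi1; lam); y may be a scalar or an integer array."""
    y = np.asarray(y)
    phi2 = 1.0 - phi0 - phi1
    base = phi2 * poisson.pmf(y, lam)          # phi2 * lambda^y e^{-lambda} / y!
    return base + phi0 * (y == 0) + phi1 * (y == 1)

def zoip_rvs(n, phi0, phi1, lam, rng=None):
    """Draw n variates via the first SR (2.1): Y = Z1 + Z2*X with
    (Z0, Z1, Z2) ~ Multinomial(1; phi0, phi1, phi2) independent of X ~ Poisson(lam)."""
    rng = np.random.default_rng(rng)
    comp = rng.choice(3, size=n, p=[phi0, phi1, 1.0 - phi0 - phi1])   # mixture component
    x = rng.poisson(lam, size=n)
    return np.where(comp == 0, 0, np.where(comp == 1, 1, x))

# quick check: empirical frequencies against the pmf (1.1)
y = zoip_rvs(100_000, phi0=0.3, phi1=0.2, lam=2.0, rng=1)
for k in range(4):
    print(k, round((y == k).mean(), 4), round(float(zoip_pmf(k, 0.3, 0.2, 2.0)), 4))
```

The sampler mirrors the mixture reading of (2.1): first pick the component (extra zero, extra one, or Poisson), then draw the Poisson variate only when needed.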
By comparing (2.4) with (2.2), we obtain a one-to-one mapping between (φ, p) and (φ0, φ1):

(2.5)  { φ(1 − p) = φ0,              { φ = φ0 + φ1,
       { φp = φ1,            ⟺      { p = φ1/(φ0 + φ1).
       { 1 − φ = φ2,

The SR (2.3) indicates that Y ∼ ZOIP(φ0, φ1; λ) is also a mixture of a Bernoulli(φ1/(φ0 + φ1)) distribution and a Poisson(λ) distribution.

2.3 The third stochastic representation

Now we consider the third SR of Y ∼ ZOIP(φ0, φ1; λ). Let Z ∼ Bernoulli(1 − φ), ξ1 ∼ Degenerate(1), Y* ∼ ZIP(φ*, λ), and let (Z, ξ1, Y*) be mutually independent. Then

(2.6)  Y =d (1 − Z)ξ1 + Z·Y*
          = 1,    with probability φ,
            Y*,   with probability 1 − φ,

follows the distribution ZOIP(φ0, φ1; λ) with φ0 = (1 − φ)φ* and φ1 = φ. This fact can be shown as follows:

(2.7)  Pr(Y = 0) = Pr(Z = 1, Y* = 0) = (1 − φ)[φ* + (1 − φ*) e^{−λ}],
       Pr(Y = 1) = Pr(Z = 0) + Pr(Z = 1, Y* = 1) = φ + (1 − φ)(1 − φ*) λ e^{−λ},
       Pr(Y = y) = Pr(Z = 1, Y* = y) = (1 − φ)(1 − φ*) λ^y e^{−λ}/y!,   y ≥ 2.

By comparing (2.7) with (2.2), we obtain a one-to-one mapping between (φ, φ*) and (φ0, φ1):

(2.8)  { (1 − φ)φ* = φ0,               { φ = φ1,
       { φ = φ1,                ⟺     { φ* = φ0/(1 − φ1).
       { (1 − φ)(1 − φ*) = φ2,

The SR (2.6) shows that Y ∼ ZOIP(φ0, φ1; λ) is also a mixture of a degenerate distribution with all mass at 1 and a ZIP(φ*, λ) distribution.

2.4 The fourth stochastic representation

To derive the fourth SR of Y ∼ ZOIP(φ0, φ1; λ), we first introduce the zero-truncated Poisson (ZTP) distribution (David and Johnson, [6]), denoted by V0 ∼ ZTP(λ), whose pmf is defined as

(2.9)  Pr(V0 = v) = (λ^v e^{−λ}/v!)/(1 − e^{−λ}),   v = 1, 2, ..., ∞.

Now let z* = (Z0*, Z1*, Z2*)ᵀ ∼ Multinomial(1; φ0*, φ1*, φ2*), V0 ∼ ZTP(λ) and z* ⊥⊥ V0. Then

(2.10)  Y =d Z1* + Z2*·V0
           = 0,    with probability φ0*,
             1,    with probability φ1*,
             V0,   with probability φ2*,

follows the distribution ZOIP(φ0, φ1; λ) with φ0 = φ0* − φ2*/(e^{λ} − 1) and φ1 = φ1*. This fact can be verified as follows:

(2.11)  Pr(Y = 0) = Pr(Z0* = 1) = φ0*,
        Pr(Y = 1) = Pr(Z1* = 1) + Pr(Z2* = 1, V0 = 1) = φ1* + φ2* λ e^{−λ}/(1 − e^{−λ}),
        Pr(Y = y) = Pr(Z2* = 1, V0 = y) = φ2* λ^y e^{−λ}/[y!(1 − e^{−λ})],   y ≥ 2.

By comparing (2.11) with (2.2), we obtain a one-to-one mapping between (φ0*, φ1*, φ2*) and (φ0, φ1, φ2):

(2.12)  { φ0* = φ0 + φ2 e^{−λ},            { φ0 = φ0* − φ2*/(e^{λ} − 1),
        { φ1* = φ1,                 ⟺      { φ1 = φ1*,
        { φ2* = φ2(1 − e^{−λ}),            { φ2 = φ2* e^{λ}/(e^{λ} − 1).

The SR (2.10) shows that Y ∼ ZOIP(φ0, φ1; λ) is also a mixture of three distributions: Degenerate(0), Degenerate(1) and ZTP(λ).

2.5 The fifth stochastic representation

First of all, we introduce the one-truncated Poisson (OTP) distribution (Cohen, [5]; Hilbe, [13], p. 388) with pmf defined by

(2.13)  Pr(V1 = v) = (λ^v e^{−λ}/v!)/(1 − e^{−λ} − λ e^{−λ}),   v = 2, 3, ..., ∞,

and denote it by V1 ∼ OTP(λ). Now let z* = (Z0*, Z1*, Z2*)ᵀ ∼ Multinomial(1; φ0*, φ1*, φ2*), V1 ∼ OTP(λ) and z* ⊥⊥ V1. Then

(2.14)  Y =d Z1* + Z2*·V1
           = 0,    with probability φ0*,
             1,    with probability φ1*,
             V1,   with probability φ2*,

follows the distribution ZOIP(φ0, φ1; λ) with φ0 = φ0* − φ2*/(e^{λ} − 1 − λ) and φ1 = φ1* − φ2*λ/(e^{λ} − 1 − λ). This fact
can be verified as follows:

(2.15)  Pr(Y = 0) = Pr(Z0* = 1) = φ0*,
        Pr(Y = 1) = Pr(Z1* = 1) = φ1*,
        Pr(Y = y) = Pr(Z2* = 1, V1 = y) = φ2* λ^y e^{−λ}/[y!(1 − e^{−λ} − λ e^{−λ})],   y ≥ 2.

By comparing (2.15) with (2.2), we obtain a one-to-one mapping between (φ0*, φ1*, φ2*) and (φ0, φ1, φ2):

(2.16)  { φ0* = φ0 + φ2 e^{−λ},                       { φ0 = φ0* − φ2*/(e^{λ} − 1 − λ),
        { φ1* = φ1 + φ2 λ e^{−λ},              ⟺     { φ1 = φ1* − φ2* λ/(e^{λ} − 1 − λ),
        { φ2* = φ2(1 − e^{−λ} − λ e^{−λ}),            { φ2 = φ2* e^{λ}/(e^{λ} − 1 − λ).

The SR (2.14) shows that Y ∼ ZOIP(φ0, φ1; λ) is also a mixture of three distributions: Degenerate(0), Degenerate(1) and OTP(λ).

3. DISTRIBUTIONAL PROPERTIES

3.1 The cumulative distribution function

Let Y ∼ ZOIP(φ0, φ1; λ). For any non-negative real number y, the cumulative distribution function of Y is given by

(3.1)  Pr(Y ≤ y) = (φ0 + φ2 e^{−λ}) I(0 ≤ y < 1)
                 + [φ0 + φ1 + φ2 Σ_{i=0}^{⌊y⌋} λ^i e^{−λ}/i!] I(y ≥ 1)
                 = (φ0 + φ2 e^{−λ}) I(0 ≤ y < 1)
                 + [φ0 + φ1 + φ2 Γ(⌊y⌋ + 1, λ)/⌊y⌋!] I(y ≥ 1),

where ⌊k⌋ denotes the largest integer not greater than k, and

(3.2)  Γ(k, λ) =̂ ∫_{λ}^{∞} t^{k−1} e^{−t} dt

is the upper incomplete gamma function.

3.2 Moments

The SR (2.1) is a useful tool to derive the moments of the ZOIP random variable. Since (Z0, Z1, Z2)ᵀ ∼ Multinomial(1; φ0, φ1, φ2), we have E(Zi) = φi, Var(Zi) = φi(1 − φi), Cov(Zi, Zj) = −φiφj and E(ZiZj) = 0 for i ≠ j. From (2.1), we immediately obtain

(3.3)  E(Y) = φ1 + φ2 λ =̂ μ,
       E(Y²) = φ1 + φ2(λ + λ²),
       Var(Y) = μ − μ² + (μ − φ1)²/φ2.

In general, for any positive integers r and s, we have Zi^r Zj^s ∼ Degenerate(0) for i ≠ j. Letting n be an arbitrary positive integer, we have

(3.4)  E(Y^n) = Σ_{k=0}^{n} C(n, k) E(Z1^k Z2^{n−k}) E(X^{n−k}) = φ1 + φ2 E(X^n).

Alternatively, the SR (2.3) can be applied to calculate E(Y^n), and we obtain the same result as (3.4) by noting that (1 − Z)^r Z^s ∼ Degenerate(0) for any Bernoulli random variable Z, where r and s are two arbitrary positive integers.

3.3 Moment generating function

By using the formula E(W1) = E[E(W1|W2)], the moment generating function of Y ∼ ZOIP(φ0, φ1; λ) is given by

(3.5)  M_Y(t) = E[exp(tY)]
             = E{exp[t(1 − Z)η + tZX]}                                  [by (2.3)]
             = E( E{exp[t(1 − Z)η + tZX] | Z} )
             = E[ M_η(t(1 − Z)) · M_X(tZ) ]
             = E{ [p e^{t(1−Z)} + 1 − p] · exp[λ(e^{tZ} − 1)] }
             = φ(p e^{t} + 1 − p) + (1 − φ) exp[λ(e^{t} − 1)].
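As a quick numerical illustration of (3.3) and (3.4), the mean and variance can be computed in two equivalent ways; the short Python sketch below (function name zoip_moments and parameter values are ours) confirms that both forms agree.

```python
import numpy as np

def zoip_moments(phi0, phi1, lam):
    """Mean and variance of ZOIP(phi0, phi1; lam) via (3.3)/(3.4)."""
    phi2 = 1.0 - phi0 - phi1
    mu = phi1 + phi2 * lam                      # E(Y)
    ey2 = phi1 + phi2 * (lam + lam**2)          # E(Y^2) = phi1 + phi2*E(X^2), from (3.4)
    return mu, ey2 - mu**2                      # Var(Y) = E(Y^2) - mu^2

phi0, phi1, lam = 0.3, 0.2, 2.0
mu, var = zoip_moments(phi0, phi1, lam)
# equivalent closed form of Var(Y) given in (3.3)
alt = mu - mu**2 + (mu - phi1)**2 / (1.0 - phi0 - phi1)
print(mu, var, alt)     # var and alt coincide (1.2, 1.76, 1.76 here)
```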
3.4 Conditional distributions based on the first SR

In Section 2.1, we assumed that z = (Z0, Z1, Z2)ᵀ ∼ Multinomial(1; φ0, φ1, φ2) and Y ∼ ZOIP(φ0, φ1; λ). It is clear that z only takes one of the three base-vectors (1, 0, 0)ᵀ, (0, 1, 0)ᵀ and (0, 0, 1)ᵀ. We first consider the joint conditional distribution of z|Y, which is stated as follows.

Theorem 1. (Joint conditional distribution of z|Y). Let z ∼ Multinomial(1; φ0, φ1, φ2) and Y ∼ ZOIP(φ0, φ1; λ). Based on the SR (2.1), we have

(3.6)  z|(Y = y) ∼ Multinomial(1; ψ1, 0, 1 − ψ1),   if y = 0,
                   Multinomial(1; 0, ψ2, 1 − ψ2),   if y = 1,
                   Multinomial(1; 0, 0, 1),         if y ≥ 2,

where

(3.7)  ψ1 =̂ φ0/(φ0 + φ2 e^{−λ})   and   ψ2 =̂ φ1/(φ1 + φ2 λ e^{−λ}).

Proof. The joint conditional pmf of z|(Y = y) is given by

Pr(z = z|Y = y) = Pr(Z0 = z0, Z1 = z1, Z2 = z2, Y = y)/f(y|φ0, φ1; λ).

If y = 0, then

Pr{z = (1, 0, 0)ᵀ | Y = 0} = φ0/(φ0 + φ2 e^{−λ}) = ψ1,                     [by (1.1) and (3.7)]
Pr{z = (0, 1, 0)ᵀ | Y = 0} = 0,
Pr{z = (0, 0, 1)ᵀ | Y = 0} = φ2 e^{−λ}/(φ0 + φ2 e^{−λ}) = 1 − ψ1,

which imply the first assertion of (3.6). If y = 1, then

Pr{z = (1, 0, 0)ᵀ | Y = 1} = 0,
Pr{z = (0, 1, 0)ᵀ | Y = 1} = φ1/(φ1 + φ2 λ e^{−λ}) = ψ2,                   [by (1.1) and (3.7)]
Pr{z = (0, 0, 1)ᵀ | Y = 1} = φ2 λ e^{−λ}/(φ1 + φ2 λ e^{−λ}) = 1 − ψ2,

which imply the second assertion of (3.6). If y ≥ 2, then

Pr{z = (1, 0, 0)ᵀ | Y = y} = 0,
Pr{z = (0, 1, 0)ᵀ | Y = y} = 0,
Pr{z = (0, 0, 1)ᵀ | Y = y} = (φ2 λ^y e^{−λ}/y!)/(φ2 λ^y e^{−λ}/y!) = 1,

implying the last assertion of (3.6).

Next, we consider the marginal conditional distributions of Zi|Y for i = 0, 1, 2. As a corollary of Theorem 1, we summarize these results as follows.

Corollary 1. (Marginal conditional distributions of Zi|Y). Let z ∼ Multinomial(1; φ0, φ1, φ2) and Y ∼ ZOIP(φ0, φ1; λ). Based on the SR (2.1), we have

(3.8)   Z0|(Y = y) ∼ Bernoulli(ψ1),      if y = 0;   Degenerate(0),   if y ≠ 0,
(3.9)   Z1|(Y = y) ∼ Bernoulli(ψ2),      if y = 1;   Degenerate(0),   if y ≠ 1,
(3.10)  Z2|(Y = y) ∼ Bernoulli(1 − ψ1),  if y = 0;   Bernoulli(1 − ψ2),  if y = 1;   Degenerate(1),  if y ≥ 2,

where ψ1 and ψ2 are given by (3.7).

Finally, based on the SR (2.1), we discuss the conditional distribution of X|Y, which is stated in the following theorem.

Theorem 2. (Conditional distribution of X|Y). Let X ∼ Poisson(λ) and Y ∼ ZOIP(φ0, φ1; λ). Based on the SR (2.1), we have

(3.11)  X|(Y = y) ∼ ZIP(1 − ψ1, λ),    if y = 0,
                    OIP(1 − ψ2, λ),    if y = 1,
                    Degenerate(y),     if y ≥ 2,

where ψ1 and ψ2 are given by (3.7).

Proof. If y = 0, we have

Pr(X = x|Y = 0) = Pr(X = x, Y = 0)/Pr(Y = 0)
 = [Pr(X = 0, Z1 = 0)/f(0|φ0, φ1; λ)] I(x = 0) + [Pr(X = x, Z0 = 1)/f(0|φ0, φ1; λ)] I(x ≠ 0)
 = [(1 − φ1) e^{−λ}/(φ0 + φ2 e^{−λ})] I(x = 0) + [φ0 λ^x e^{−λ}/x!]/(φ0 + φ2 e^{−λ}) I(x ≠ 0)
 = (1 − ψ1 + ψ1 e^{−λ}) I(x = 0) + ψ1 (λ^x e^{−λ}/x!) I(x ≠ 0),

implying X|(Y = 0) ∼ ZIP(1 − ψ1, λ). If y = 1, we have

Pr(X = x|Y = 1) = Pr(X = x, Y = 1)/Pr(Y = 1)
 = [Pr(X = 1, Z0 = 0)/f(1|φ0, φ1; λ)] I(x = 1) + [Pr(X = x, Z1 = 1)/f(1|φ0, φ1; λ)] I(x ≠ 1)
 = [(1 − φ0) λ e^{−λ}/(φ1 + φ2 λ e^{−λ})] I(x = 1) + [φ1 λ^x e^{−λ}/x!]/(φ1 + φ2 λ e^{−λ}) I(x ≠ 1)
 = (1 − ψ2 + ψ2 λ e^{−λ}) I(x = 1) + ψ2 (λ^x e^{−λ}/x!) I(x ≠ 1),

implying X|(Y = 1) ∼ OIP(1 − ψ2, λ). If y ≥ 2, we have

Pr(X = x|Y = y) = Pr(X = x, Y = y)/f(y|φ0, φ1; λ) = Pr(X = y, Z2 = 1)/(φ2 λ^y e^{−λ}/y!) = 1   for x = y,

implying X|(Y = y ≥ 2) ∼ Degenerate(y).
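In practice, ψ1 and ψ2 from (3.7) can be read as the posterior probabilities that an observed zero (respectively, one) is an "extra" zero (one) rather than a Poisson count; these same quantities reappear in the E-step of the EM algorithm in Section 4.2. A minimal sketch (function name membership_probs is ours):

```python
import numpy as np

def membership_probs(phi0, phi1, lam):
    """psi1 = Pr(Z0 = 1 | Y = 0) and psi2 = Pr(Z1 = 1 | Y = 1), as in (3.7)."""
    phi2 = 1.0 - phi0 - phi1
    psi1 = phi0 / (phi0 + phi2 * np.exp(-lam))
    psi2 = phi1 / (phi1 + phi2 * lam * np.exp(-lam))
    return psi1, psi2

print(membership_probs(0.3, 0.2, 2.0))   # e.g. (0.816, 0.596) approximately
```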
3.5 Conditional distributions based on the second SR

In Section 2.2, we assumed that Z ∼ Bernoulli(1 − φ), η ∼ Bernoulli(p), X ∼ Poisson(λ) and (Z, η, X) are mutually independent. According to (2.4), the pmf of Y ∼ ZOIP(φ0, φ1; λ) can be rewritten as

(3.12)  f(y|φ, p; λ) = [φ(1 − p) + (1 − φ) e^{−λ}] I(y = 0)
                     + [φp + (1 − φ) λ e^{−λ}] I(y = 1)
                     + [(1 − φ) λ^y e^{−λ}/y!] I(y ≥ 2).

Based on the SR (2.3), we will derive the conditional distributions of Z|Y, η|Y and X|Y, which are given in Theorems 3, 4 and 5, respectively.

Theorem 3. (Conditional distribution of Z|Y). Let Z ∼ Bernoulli(1 − φ) and the pmf of Y be specified by (3.12). Based on the SR (2.3), we have

(3.13)  Z|(Y = y) ∼ Bernoulli(ϕ1),    if y = 0,
                    Bernoulli(ϕ2),    if y = 1,
                    Degenerate(1),    if y ≥ 2,

where

(3.14)  ϕ1 =̂ (1 − φ) e^{−λ}/[φ(1 − p) + (1 − φ) e^{−λ}] = 1 − ψ1,
        ϕ2 =̂ (1 − φ) λ e^{−λ}/[φp + (1 − φ) λ e^{−λ}] = 1 − ψ2,

and (ψ1, ψ2) are given by (3.7).

Proof. Since Z ∼ Bernoulli(1 − φ), Z only takes the value 0 or 1. By the SR (2.3), we have

Pr(Z = 1|Y = y) = Pr(Z = 1, X = y)/f(y|φ, p; λ) = [(1 − φ) λ^y e^{−λ}/y!]/f(y|φ, p; λ)
 = (1 − φ) e^{−λ}/[φ(1 − p) + (1 − φ) e^{−λ}] = ϕ1,    if y = 0,
   (1 − φ) λ e^{−λ}/[φp + (1 − φ) λ e^{−λ}] = ϕ2,      if y = 1,
   1,                                                   if y ≥ 2,

which implies (3.13). In addition, by applying (2.5), we can immediately verify that ϕ1 = 1 − ψ1 and ϕ2 = 1 − ψ2.

Theorem 4. (Conditional distribution of η|Y). Let η ∼ Bernoulli(p) and the pmf of Y be specified by (3.12). Based on the SR (2.3), we have

(3.15)  η|(Y = y) ∼ Bernoulli(pϕ1),             if y = 0,
                    Bernoulli(1 − ϕ2 + pϕ2),    if y = 1,
                    Bernoulli(p),               if y ≥ 2,

where ϕ1 and ϕ2 are given by (3.14).

Proof. Since η ∼ Bernoulli(p), η only takes the value 0 or 1. By the SR (2.3), we have

Pr(η = 1|Y = y) = Pr(η = 1, 1 − Z + ZX = y)/f(y|φ, p; λ)
 = p(1 − φ) e^{−λ}/[φ(1 − p) + (1 − φ) e^{−λ}] = pϕ1,                 if y = 0,
   p[φ + (1 − φ) λ e^{−λ}]/[φp + (1 − φ) λ e^{−λ}] = 1 − ϕ2 + pϕ2,    if y = 1,
   p(1 − φ) λ^y e^{−λ}/y! / [(1 − φ) λ^y e^{−λ}/y!] = p,              if y ≥ 2,

which implies (3.15).

By combining Theorem 2 with (3.14), we immediately obtain the following results.

Theorem 5. (Conditional distribution of X|Y). Let X ∼ Poisson(λ) and the pmf of Y be specified by (3.12). Based on the SR (2.3), we have

(3.16)  X|(Y = y) ∼ ZIP(ϕ1, λ),      if y = 0,
                    OIP(ϕ2, λ),      if y = 1,
                    Degenerate(y),   if y ≥ 2,

where ϕ1 and ϕ2 are given by (3.14).

4. LIKELIHOOD-BASED INFERENCES

Assume that Y1, ..., Yn iid∼ ZOIP(φ0, φ1; λ) and that y1, ..., yn denote their realizations. Let Yobs = {yi}_{i=1}^{n} denote the observed data. Moreover, let I0 =̂ {i | yi = 0, 1 ≤ i ≤ n} and m0 = Σ_{i=1}^{n} I(yi = 0) denote the number of elements in I0; let I1 =̂ {i | yi = 1, 1 ≤ i ≤ n} and m1 = Σ_{i=1}^{n} I(yi = 1) denote the number of elements in I1. The observed-data likelihood function for (φ0, φ1, λ) is then given by

(4.1)  L(φ0, φ1, λ|Yobs) = (φ0 + φ2 e^{−λ})^{m0} × (φ1 + φ2 λ e^{−λ})^{m1} × φ2^{n−m0−m1} ∏_{i∉I0∪I1} λ^{yi} e^{−λ}/yi!,

so that the log-likelihood function is

ℓ =̂ ℓ(φ0, φ1, λ|Yobs) = m0 log(φ0 + φ2 e^{−λ}) + m1 log(φ1 + φ2 λ e^{−λ}) + (n − m0 − m1)(log φ2 − λ) + N log λ,

where φ2 = 1 − φ0 − φ1 and N = Σ_{i∉I0∪I1} yi.
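The log-likelihood ℓ is easy to evaluate directly from the three sufficient summaries (m0, m1, and the counts of two or more). The following Python sketch does so; the function name zoip_loglik is ours, and, unlike the display above, it also includes the constant −Σ log(yi!) so that values are comparable across fitted models.

```python
import numpy as np
from scipy.special import gammaln

def zoip_loglik(params, y):
    """Observed-data log-likelihood l(phi0, phi1, lambda | Yobs), built from (4.1)."""
    phi0, phi1, lam = params
    phi2 = 1.0 - phi0 - phi1
    y = np.asarray(y)
    m0, m1 = np.sum(y == 0), np.sum(y == 1)
    tail = y[y >= 2]                               # observations with y_i >= 2
    return (m0 * np.log(phi0 + phi2 * np.exp(-lam))
            + m1 * np.log(phi1 + phi2 * lam * np.exp(-lam))
            + tail.size * (np.log(phi2) - lam)
            + tail.sum() * np.log(lam)
            - gammaln(tail + 1).sum())             # constant -sum(log y_i!)
```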
To calculate the Fisher information matrix, we need the following results.

Theorem 6. (Expectations). The expectations of m0, m1 and N are given by

(4.2)  E(m0) = n(φ0 + φ2 e^{−λ}),
       E(m1) = n(φ1 + φ2 λ e^{−λ}),
       E(N)  = n φ2 λ(1 − e^{−λ}).

Proof. It is easy to show the expressions of E(m0) and E(m1), and we have two methods to verify the last one. The first way is to directly calculate E(N) as follows. Since

E(Yi | Yi ≥ 2) = λ(1 − e^{−λ})/(1 − e^{−λ} − λ e^{−λ}),   i ∉ I0 ∪ I1,

we have

E(N) = E[E(N|m0, m1)] = E[(n − m0 − m1) λ(1 − e^{−λ})/(1 − e^{−λ} − λ e^{−λ})]
     = [n − E(m0) − E(m1)] λ(1 − e^{−λ})/(1 − e^{−λ} − λ e^{−λ}) = n φ2 λ(1 − e^{−λ}),   by (4.2).

Alternatively, we note that

(4.3)  N = Σ_{i∉I0∪I1} yi = Σ_{i=1}^{n} yi − m1.

Thus, E(N) = nE(Y1) − E(m1) = n(φ1 + φ2 λ) − n(φ1 + φ2 λ e^{−λ}) = n φ2 λ(1 − e^{−λ}), by (3.3).

4.1 MLEs via the Fisher scoring algorithm

In this section, we use the Fisher scoring algorithm to calculate the MLEs of φ0, φ1 and λ. The score vector ∇ℓ and the Hessian matrix ∇²ℓ are given by

∇ℓ(φ0, φ1, λ|Yobs) = (∂ℓ/∂φ0, ∂ℓ/∂φ1, ∂ℓ/∂λ)ᵀ   and
∇²ℓ(φ0, φ1, λ|Yobs) = the 3 × 3 symmetric matrix with entries ∂²ℓ/∂φ0², ∂²ℓ/∂φ0∂φ1, ∂²ℓ/∂φ0∂λ, ∂²ℓ/∂φ1², ∂²ℓ/∂φ1∂λ, ∂²ℓ/∂λ²,

respectively, where

∂ℓ/∂φ0 = m0(1 − e^{−λ})/(φ0 + φ2 e^{−λ}) − m1 λ e^{−λ}/(φ1 + φ2 λ e^{−λ}) − (n − m0 − m1)/φ2,
∂ℓ/∂φ1 = −m0 e^{−λ}/(φ0 + φ2 e^{−λ}) + m1(1 − λ e^{−λ})/(φ1 + φ2 λ e^{−λ}) − (n − m0 − m1)/φ2,
∂ℓ/∂λ  = −m0 φ2 e^{−λ}/(φ0 + φ2 e^{−λ}) + m1 φ2(1 − λ) e^{−λ}/(φ1 + φ2 λ e^{−λ}) − (n − m0 − m1) + N/λ,

and

∂²ℓ/∂φ0²    = −m0(1 − e^{−λ})²/(φ0 + φ2 e^{−λ})² − m1 λ² e^{−2λ}/(φ1 + φ2 λ e^{−λ})² − (n − m0 − m1)/φ2²,
∂²ℓ/∂φ1²    = −m0 e^{−2λ}/(φ0 + φ2 e^{−λ})² − m1(1 − λ e^{−λ})²/(φ1 + φ2 λ e^{−λ})² − (n − m0 − m1)/φ2²,
∂²ℓ/∂λ²     = m0 φ0 φ2 e^{−λ}/(φ0 + φ2 e^{−λ})² + m1 φ2 e^{−λ}[φ1(λ − 2) − φ2 e^{−λ}]/(φ1 + φ2 λ e^{−λ})² − N/λ²,
∂²ℓ/∂φ0∂φ1 = m0 e^{−λ}(1 − e^{−λ})/(φ0 + φ2 e^{−λ})² + m1 λ e^{−λ}(1 − λ e^{−λ})/(φ1 + φ2 λ e^{−λ})² − (n − m0 − m1)/φ2²,
∂²ℓ/∂φ0∂λ  = m0(1 − φ1) e^{−λ}/(φ0 + φ2 e^{−λ})² − m1 φ1(1 − λ) e^{−λ}/(φ1 + φ2 λ e^{−λ})²,
∂²ℓ/∂φ1∂λ  = m0 φ0 e^{−λ}/(φ0 + φ2 e^{−λ})² − m1(1 − φ0)(1 − λ) e^{−λ}/(φ1 + φ2 λ e^{−λ})².

Thus, by utilizing (4.2), we can calculate the Fisher information as follows:

(4.4)  J(φ0, φ1, λ) = E[−∇²ℓ(φ0, φ1, λ|Yobs)].

Let (φ0^(0), φ1^(0), λ^(0)) be initial values of the MLEs (φ̂0, φ̂1, λ̂). If (φ0^(t), φ1^(t), λ^(t)) denote the t-th approximations of (φ̂0, φ̂1, λ̂), then their (t + 1)-th approximations can be obtained by the following Fisher scoring algorithm:

(4.5)  (φ0^(t+1), φ1^(t+1), λ^(t+1))ᵀ = (φ0^(t), φ1^(t), λ^(t))ᵀ + J^{−1}(φ0^(t), φ1^(t), λ^(t)) ∇ℓ(φ0^(t), φ1^(t), λ^(t)|Yobs).

As a by-product, the standard errors of the MLEs (φ̂0, φ̂1, λ̂) are the square roots of the diagonal elements J^{kk} of the inverse Fisher information matrix J^{−1}(φ̂0, φ̂1, λ̂). Thus, the (1 − α)100% asymptotic Wald confidence intervals (CIs) of φ0, φ1 and λ are given by

(4.6)  [φ̂_{k−1} − z_{α/2} √(J^{kk}), φ̂_{k−1} + z_{α/2} √(J^{kk})],   k = 1, 2,
       [λ̂ − z_{α/2} √(J^{33}), λ̂ + z_{α/2} √(J^{33})],

respectively, where z_α denotes the α-th upper quantile of the standard normal distribution.
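The Fisher scoring recursion (4.5), with J assembled from the second derivatives and the expectations (4.2), is the method developed above. As a quick practical cross-check of the resulting MLEs, one can also maximize ℓ directly with a general-purpose optimizer; the sketch below does exactly that and is not the paper's algorithm (the function name zoip_mle and the starting values are ours).

```python
import numpy as np
from scipy.optimize import minimize

def zoip_mle(y, start=(0.1, 0.1, 1.0)):
    """MLEs of (phi0, phi1, lambda) by direct numerical maximization of the
    log-likelihood l of Section 4 -- a generic alternative to recursion (4.5)."""
    y = np.asarray(y)

    def neg_loglik(params):
        phi0, phi1, lam = params
        phi2 = 1.0 - phi0 - phi1
        if phi0 < 0 or phi1 < 0 or phi2 <= 0 or lam <= 0:
            return np.inf                          # outside the parameter space
        m0, m1 = np.sum(y == 0), np.sum(y == 1)
        tail = y[y >= 2]
        return -(m0 * np.log(phi0 + phi2 * np.exp(-lam))
                 + m1 * np.log(phi1 + phi2 * lam * np.exp(-lam))
                 + tail.size * (np.log(phi2) - lam)
                 + tail.sum() * np.log(lam))

    res = minimize(neg_loglik, x0=np.asarray(start), method="Nelder-Mead")
    return res.x                                   # (phi0_hat, phi1_hat, lambda_hat)
```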
4.2 MLEs via the EM algorithm

The zero observations from a ZOIP distribution can be categorized into the extra zeros, which result from the degenerate distribution at zero because of population variability at the point zero, and the structural zeros, which come from an ordinary Poisson distribution. The one observations can similarly be classified into the extra ones, which result from the degenerate distribution at one because of population variability at the point one, and the structural ones, which come from an ordinary Poisson distribution. Thus, we partition

I0 = I0^{extra} ∪ I0^{structural}   and   I1 = I1^{extra} ∪ I1^{structural}.

Note that a major obstacle to obtaining explicit solutions for the MLEs of the parameters from (4.1) is the first and second terms on the right-hand side of (4.1). To overcome this difficulty, we augment Yobs with two latent variables W0 and W1, where W0 denotes the number of elements in I0^{extra}, splitting m0 into W0 and m0 − W0, and W1 denotes the number of elements in I1^{extra}, splitting m1 into W1 and m1 − W1. The resulting conditional predictive distributions of W0 and W1 given (Yobs, φ0, φ1, λ) are

(4.7)  W0|(Yobs, φ0, φ1, λ) ∼ Binomial(m0, φ0/(φ0 + φ2 e^{−λ}))   and
       W1|(Yobs, φ0, φ1, λ) ∼ Binomial(m1, φ1/(φ1 + φ2 λ e^{−λ})),

respectively. On the other hand, the complete-data likelihood function is proportional to

(4.8)  L(φ0, φ1, λ|Ycom) ∝ φ0^{w0} [(1 − φ0 − φ1) e^{−λ}]^{m0−w0} × φ1^{w1} [(1 − φ0 − φ1) λ e^{−λ}]^{m1−w1}
                           × (1 − φ0 − φ1)^{n−m0−m1} e^{−(n−m0−m1)λ} λ^{N}
                         = φ0^{w0} φ1^{w1} (1 − φ0 − φ1)^{n−w0−w1} e^{−(n−w0−w1)λ} λ^{m1−w1+N}.

Thus the M-step is to find the complete-data MLEs

(4.9)  φ̂0 = w0/n,   φ̂1 = w1/n   and
       λ̂ = (N + m1 − w1)/(n − w0 − w1) = (N + m1 − nφ̂1)/[n(1 − φ̂0 − φ̂1)].

The E-step is to replace w0 and w1 in (4.9) by their conditional expectations:

(4.10)  E(W0|Yobs, φ0, φ1, λ) = m0 φ0/[φ0 + (1 − φ0 − φ1) e^{−λ}],
        E(W1|Yobs, φ0, φ1, λ) = m1 φ1/[φ1 + (1 − φ0 − φ1) λ e^{−λ}].

4.3 Bootstrap confidence intervals

The Wald CIs of φ0 and φ1 specified by (4.6) may fall outside the unit interval [0, 1). The Wald CI of λ given by (4.6) is reliable only for large sample sizes and its lower bound may be less than 0. For small sample sizes, the bootstrap method is a useful tool for finding a bootstrap CI for an arbitrary function of φ0, φ1 and λ, say ϑ = h(φ0, φ1, λ). Let ϑ̂ = h(φ̂0, φ̂1, λ̂) denote the MLE of ϑ, where (φ̂0, φ̂1, λ̂) represent the MLEs of (φ0, φ1, λ) calculated by either the Fisher scoring algorithm (4.5) or the EM algorithm (4.9)–(4.10). Based on the obtained MLEs (φ̂0, φ̂1, λ̂), by using the SR (2.1) we can generate Y1*, ..., Yn* iid∼ ZOIP(φ̂0, φ̂1; λ̂). Having obtained Yobs* = {y1*, ..., yn*}, we can calculate the bootstrap replications (φ̂0*, φ̂1*, λ̂*) and get ϑ̂* = h(φ̂0*, φ̂1*, λ̂*). Independently repeating this process G times, we obtain G bootstrap replications {ϑ̂g*}_{g=1}^{G}. Consequently, the standard error se(ϑ̂) of ϑ̂ can be estimated by the sample standard deviation of the G replications, i.e.,

(4.11)  se(ϑ̂) = { [1/(G − 1)] Σ_{g=1}^{G} [ϑ̂g* − (ϑ̂1* + ··· + ϑ̂G*)/G]² }^{1/2}.

If {ϑ̂g*}_{g=1}^{G} is approximately normally distributed, the first (1 − α)100% bootstrap CI for ϑ is

(4.12)  [ϑ̂ − z_{α/2}·se(ϑ̂), ϑ̂ + z_{α/2}·se(ϑ̂)].

Alternatively, if {ϑ̂g*}_{g=1}^{G} is non-normally distributed, the second (1 − α)100% bootstrap CI of ϑ is given by

(4.13)  [ϑ̂_L, ϑ̂_U],

where ϑ̂_L and ϑ̂_U are the 100(α/2) and 100(1 − α/2) percentiles of {ϑ̂g*}_{g=1}^{G}, respectively.
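Because the M-step (4.9) and E-step (4.10) are in closed form, the EM algorithm is a few lines of code, and the parametric bootstrap of Section 4.3 simply wraps it. The following Python sketch (function names zoip_em and bootstrap_ci, tolerances and iteration limits are ours) illustrates both; it returns percentile CIs as in (4.13).

```python
import numpy as np

def zoip_em(y, phi0=0.1, phi1=0.1, lam=1.0, tol=1e-8, max_iter=5000):
    """EM iteration (4.9)-(4.10) for the MLEs of (phi0, phi1, lambda)."""
    y = np.asarray(y)
    n = y.size
    m0, m1 = np.sum(y == 0), np.sum(y == 1)
    N = y[y >= 2].sum()
    for _ in range(max_iter):
        phi2 = 1.0 - phi0 - phi1
        # E-step (4.10): expected numbers of extra zeros/ones among the m0 zeros and m1 ones
        w0 = m0 * phi0 / (phi0 + phi2 * np.exp(-lam))
        w1 = m1 * phi1 / (phi1 + phi2 * lam * np.exp(-lam))
        # M-step (4.9)
        phi0_new, phi1_new = w0 / n, w1 / n
        lam_new = (N + m1 - w1) / (n - w0 - w1)
        converged = max(abs(phi0_new - phi0), abs(phi1_new - phi1), abs(lam_new - lam)) < tol
        phi0, phi1, lam = phi0_new, phi1_new, lam_new
        if converged:
            break
    return phi0, phi1, lam

def bootstrap_ci(y, G=1000, alpha=0.05, rng=None):
    """Percentile bootstrap CIs (4.13) for (phi0, phi1, lambda), resampling via the SR (2.1)."""
    rng = np.random.default_rng(rng)
    n = len(y)
    phi0, phi1, lam = zoip_em(y)
    reps = np.empty((G, 3))
    for g in range(G):
        comp = rng.choice(3, size=n, p=[phi0, phi1, 1.0 - phi0 - phi1])
        ystar = np.where(comp == 0, 0, np.where(comp == 1, 1, rng.poisson(lam, size=n)))
        reps[g] = zoip_em(ystar)
    lower, upper = np.percentile(reps, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)
    return np.column_stack([lower, upper])      # rows correspond to phi0, phi1, lambda
```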
5. TESTING HYPOTHESES WITH LARGE SAMPLE SIZES

In this section, we consider the testing hypotheses (i) H0: (φ0, φ1) = (0, 0); (ii) H0: φ1 = 0; (iii) H0: φ0 = 0; (iv) H0: λ = λ0. For (i), when H0 is true, the parameter values are located at the vertex boundary of the bounded parameter space, so the traditional asymptotic property of the LRT is not applicable; therefore, we only consider the score test. For (ii)–(iii), although the parameter value is still on the boundary of the parameter space, we can provide an appropriate reference distribution. Finally, (iv) is a standard two-sided test.

5.1 Score test for simultaneous zero-and-one inflation

First, we develop a score test to examine whether there exist excessive zeros and excessive ones in the observations simultaneously, i.e., zero-and-one inflation in the ZOIP model. The null and alternative hypotheses are as follows:

(5.1)  H0: (φ0, φ1) = (0, 0)   against   H1: (φ0, φ1) ≠ (0, 0).

By reparametrization, three new parameters are introduced as follows:

(5.2)  θ0 = φ0/(1 − φ0 − φ1),   θ1 = φ1/(1 − φ0 − φ1)   and   β = log λ.

Then, testing H0 specified in (5.1) is equivalent to testing H0*: (θ0, θ1) = (0, 0). The observed-data log-likelihood function now becomes

ℓ1 =̂ ℓ1(θ0, θ1, β) = Σ_{i=1}^{n} { −log(1 + θ0 + θ1) + [log(θ0 + e^{−λ})] I(yi = 0)
                                   + [log(θ1 + λ e^{−λ})] I(yi = 1) + [yi log λ − λ − log(yi!)] I(yi ≥ 2) }.

The score vector is now denoted by U(θ0, θ1, β) = (∂ℓ1/∂θ0, ∂ℓ1/∂θ1, ∂ℓ1/∂β)ᵀ, where

∂ℓ1/∂θ0 = Σ_{i=1}^{n} [ −1/(1 + θ0 + θ1) + I(yi = 0)/(θ0 + e^{−λ}) ],
∂ℓ1/∂θ1 = Σ_{i=1}^{n} [ −1/(1 + θ0 + θ1) + I(yi = 1)/(θ1 + λ e^{−λ}) ],
∂ℓ1/∂β  = Σ_{i=1}^{n} [ −λ e^{−λ}/(θ0 + e^{−λ}) I(yi = 0) − (λ² − λ) e^{−λ}/(θ1 + λ e^{−λ}) I(yi = 1) + (yi − λ) I(yi ≥ 2) ].

The second derivatives are given by

∂²ℓ1/∂θ0² = Σ_{i=1}^{n} [ 1/(1 + θ0 + θ1)² − I(yi = 0)/(θ0 + e^{−λ})² ],
∂²ℓ1/∂θ1² = Σ_{i=1}^{n} [ 1/(1 + θ0 + θ1)² − I(yi = 1)/(θ1 + λ e^{−λ})² ],
∂²ℓ1/∂β²  = −Σ_{i=1}^{n} { λ e^{−λ}[θ0(1 − λ) + e^{−λ}]/(θ0 + e^{−λ})² I(yi = 0)
                          + [θ1(3λ² − λ − λ³) e^{−λ} + λ³ e^{−2λ}]/(θ1 + λ e^{−λ})² I(yi = 1) + λ I(yi ≥ 2) },
∂²ℓ1/∂θ0∂θ1 = Σ_{i=1}^{n} 1/(1 + θ0 + θ1)²,
∂²ℓ1/∂θ0∂β  = Σ_{i=1}^{n} λ e^{−λ}/(θ0 + e^{−λ})² I(yi = 0),
∂²ℓ1/∂θ1∂β  = Σ_{i=1}^{n} (λ² − λ) e^{−λ}/(θ1 + λ e^{−λ})² I(yi = 1).

Since

E[I(yi = 0)] = (θ0 + e^{−λ})/(1 + θ0 + θ1),
E[I(yi = 1)] = (θ1 + λ e^{−λ})/(1 + θ0 + θ1)   and
E[I(yi ≥ 2)] = (1 − e^{−λ} − λ e^{−λ})/(1 + θ0 + θ1),

the Fisher information matrix can be calculated as J(θ0, θ1, β) = (J_{jj'}), where

J11 = −E(∂²ℓ1/∂θ0²) = n(1 + θ1 − e^{−λ})/[(1 + θ0 + θ1)²(θ0 + e^{−λ})],
J22 = −E(∂²ℓ1/∂θ1²) = n(1 + θ0 − λ e^{−λ})/[(1 + θ0 + θ1)²(θ1 + λ e^{−λ})],
J33 = −E(∂²ℓ1/∂β²)  = nλ e^{−λ}[θ0(1 − λ) + e^{−λ}]/[(θ0 + e^{−λ})(1 + θ0 + θ1)]
                     + n[θ1(3λ² − λ − λ³) e^{−λ} + λ³ e^{−2λ}]/[(θ1 + λ e^{−λ})(1 + θ0 + θ1)]
                     + nλ(1 − e^{−λ} − λ e^{−λ})/(1 + θ0 + θ1),
J12 = −E(∂²ℓ1/∂θ0∂θ1) = −n/(1 + θ0 + θ1)²,
J13 = −E(∂²ℓ1/∂θ0∂β)  = −nλ e^{−λ}/[(1 + θ0 + θ1)(θ0 + e^{−λ})],
J23 = −E(∂²ℓ1/∂θ1∂β)  = −n(λ² − λ) e^{−λ}/[(1 + θ0 + θ1)(θ1 + λ e^{−λ})].

Under H0*, the score test statistic

(5.3)  T1 = Uᵀ(0, 0, β̂) J^{−1}(0, 0, β̂) U(0, 0, β̂) ∼̇ χ²(2),

where β̂ = log(ȳ) and U(0, 0, β̂) = (m0 e^{ȳ} − n, m1 e^{ȳ}/ȳ − n, 0)ᵀ. The p-value is

(5.4)  p_{v1} = Pr(T1 > t1|H0*) = Pr(χ²(2) > t1),

where t1 is the observed value of T1.
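A direct implementation of the statistic (5.3) is short, since only β̂ = log ȳ is needed under the null. The sketch below (function name score_test_zoip is ours) transcribes the entries of J(θ0, θ1, β) from the formulas above and evaluates them at (0, 0, β̂); it is an illustrative sketch rather than the authors' code, so the transcription should be checked against the displays before serious use.

```python
import numpy as np
from scipy.stats import chi2

def score_test_zoip(y):
    """Score test (5.3) of H0: (phi0, phi1) = (0, 0), i.e. Poisson against ZOIP."""
    y = np.asarray(y)
    n, ybar = y.size, y.mean()
    m0, m1 = np.sum(y == 0), np.sum(y == 1)
    lam, e = ybar, np.exp(-ybar)
    # score vector U(0, 0, beta_hat); its third component is identically zero
    U = np.array([m0 / e - n, m1 / (lam * e) - n, 0.0])
    t0 = t1 = 0.0                                   # evaluate J at theta0 = theta1 = 0
    d = 1.0 + t0 + t1
    J = np.empty((3, 3))
    J[0, 0] = n * (1 + t1 - e) / (d**2 * (t0 + e))
    J[1, 1] = n * (1 + t0 - lam * e) / (d**2 * (t1 + lam * e))
    J[2, 2] = (n * lam * e * (t0 * (1 - lam) + e) / ((t0 + e) * d)
               + n * (t1 * (3 * lam**2 - lam - lam**3) * e + lam**3 * e**2) / ((t1 + lam * e) * d)
               + n * lam * (1 - e - lam * e) / d)
    J[0, 1] = J[1, 0] = -n / d**2
    J[0, 2] = J[2, 0] = -n * lam * e / (d * (t0 + e))
    J[1, 2] = J[2, 1] = -n * (lam**2 - lam) * e / (d * (t1 + lam * e))
    T1 = U @ np.linalg.solve(J, U)
    return T1, chi2.sf(T1, df=2)                    # statistic and p-value (5.4)
```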
5.2 Likelihood ratio test for one inflation

If the null hypothesis H0 specified by (5.1) is rejected at the α level of significance (i.e., at least one of φ0 and φ1 is positive), we could next test whether there exist extra ones in the observations, i.e., one inflation in the ZOIP model. We consider the following null and alternative hypotheses:

(5.5)  H0: φ1 = 0   against   H1: φ1 > 0.

Under H0, the likelihood ratio (LR) test statistic is given by

(5.6)  T2 = −2{ℓ(φ̂_{0,H0}, 0, λ̂_{H0}|Yobs) − ℓ(φ̂0, φ̂1, λ̂|Yobs)},

where (φ̂_{0,H0}, λ̂_{H0}) denote the constrained MLEs of (φ0, λ) under H0 and (φ̂0, φ̂1, λ̂) denote the unconstrained MLEs of (φ0, φ1, λ), which are obtained by either the Fisher scoring algorithm (4.5) or the EM algorithm (4.9)–(4.10). Note that under H0, the ZOIP(φ0, φ1; λ) distribution reduces to the zero-inflated Poisson distribution ZIP(φ0, λ). Thus, the MLEs of (φ0, λ) can be calculated by the following EM iteration:

(5.7)  φ_{0,H0}^(t+1) = m0 φ_{0,H0}^(t) / { n[φ_{0,H0}^(t) + (1 − φ_{0,H0}^(t)) e^{−λ_{H0}^(t+1)}] },
       λ_{H0}^(t+1) = ȳ/(1 − φ_{0,H0}^(t)),

where ȳ = (1/n) Σ_{i=1}^{n} yi.

Standard large-sample theory suggests that the asymptotic null distribution of T2 is χ²(1). However, the null hypothesis in (5.5) corresponds to φ1 being on the boundary of the parameter space, and the appropriate null distribution is a 50:50 mixture of χ²(0) (i.e., Degenerate(0)) and χ²(1); see Self and Liang [28] and Feng and McCulloch [9]. Hence, the corresponding p-value (Jansakul and Hinde, [14], p. 78; Joe and Zhu, [15], p. 225) is

(5.8)  p_{v2} = Pr(T2 > t2|H0) = (1/2) Pr(χ²(1) > t2).

5.3 Score test for one inflation

Alternatively, the score test can be used for testing H0 specified in (5.5), which is equivalent to testing H0*: θ1 = 0. Let (θ0, θ1, β) be defined by (5.2). Under H0*, the score test statistic

(5.9)  T3 = Uᵀ(θ̂0, 0, β̂) J^{−1}(θ̂0, 0, β̂) U(θ̂0, 0, β̂) ∼̇ χ²(1),

where θ̂0 = φ̂_{0,H0}/(1 − φ̂_{0,H0}) and β̂ = log(λ̂_{H0}) denote the MLEs of θ0 and β under H0*, and (φ̂_{0,H0}, λ̂_{H0}) are determined by (5.7). The score vector U(θ0, θ1, β) is then evaluated at (θ0, θ1, β) = (θ̂0, 0, β̂).

5.4 LR test for zero inflation

To test whether there exist extra zeros in the observations, i.e., zero inflation in the ZOIP model, we consider the following null and alternative hypotheses:

(5.11)  H0: φ0 = 0   against   H1: φ0 > 0.

Under H0, the LR test statistic (Jansakul and Hinde, [14], p. 78; Joe and Zhu, [15], p. 225) is

(5.12)  T4 = −2{ℓ(0, φ̂_{1,H0}, λ̂_{H0}|Yobs) − ℓ(φ̂0, φ̂1, λ̂|Yobs)} ∼̇ 0.5χ²(0) + 0.5χ²(1),

where (φ̂_{1,H0}, λ̂_{H0}) denote the constrained MLEs of (φ1, λ) under H0 and (φ̂0, φ̂1, λ̂) denote the unconstrained MLEs of (φ0, φ1, λ), which are obtained by either the Fisher scoring algorithm (4.5) or the EM algorithm (4.9)–(4.10). Note that under H0, the ZOIP(φ0, φ1; λ) distribution reduces to the one-inflated Poisson distribution OIP(φ1, λ). Thus, the MLEs of (φ1, λ) can be calculated by the following EM iteration:

(5.13)  φ_{1,H0}^(t+1) = m1 φ_{1,H0}^(t) / { n[φ_{1,H0}^(t) + (1 − φ_{1,H0}^(t)) λ_{H0}^(t+1) e^{−λ_{H0}^(t+1)}] },
        λ_{H0}^(t+1) = (ȳ − φ_{1,H0}^(t))/(1 − φ_{1,H0}^(t)),

where ȳ = (1/n) Σ_{i=1}^{n} yi. Hence the corresponding p-value is

(5.14)  p_{v4} = Pr(T4 > t4|H0) = (1/2) Pr(χ²(1) > t4).

5.5 Score test for zero inflation

Let (θ0, θ1, β) be defined in (5.2); testing H0 specified by (5.11) is equivalent to testing H0*: θ0 = 0. Under H0*, the score test statistic

(5.15)  T5 = Uᵀ(0, θ̂1, β̂) J^{−1}(0, θ̂1, β̂) U(0, θ̂1, β̂) ∼̇ χ²(1),

where θ̂1 = φ̂_{1,H0}/(1 − φ̂_{1,H0}) and β̂ = log(λ̂_{H0}) denote the MLEs of θ1 and β under H0*, and (φ̂_{1,H0}, λ̂_{H0}) are determined by (5.13). The score vector U(θ0, θ1, β) is then evaluated at (θ0, θ1, β) = (0, θ̂1, β̂).