CHAPTER 5
Negative Binomial Distribution
5.1 DEFINITION
Many different models give rise to the negative binomial distribution, and conse- quently there is a variety of definitions in the literature. The two main dichotomies are (a) between parameterizations and (b) between points of support. Formally, the negative binomial distribution can be defined in terms of the expan- sion of the negative binomial expression (Q − P)−k,whereQ = 1 + P , P>0, and k is positive real; the (x + 1)th term in the expansion yields Pr[X = x]. This is analogous to the definition of the binomial distribution in terms of the binomial expression (π + ω)n,whereω = 1 − π,0<π<1, and n is a positive integer. Thus the negative binomial distribution with parameters k, P is the distribution of the rv X for which k + x − 1 P x P k Pr[X = x] = 1 − ,x= 0, 1, 2,..., (5.1) k − 1 Q Q where Q = 1 + P , P>0, and k>0. Unlike the binomial distribution, here there is a nonzero probability for X to take any specified nonnegative integer value, as in the case of the Poisson distribution. The probability generating function (pgf) is
G(z) = (1 + P − Pz)−k (5.2)
= 1F0[k;;P(z− 1)] (5.3) F [k;;Pz/(1 + P)] = 1 0 (5.4) 1F0[k;;P/(1 + P)] and the characteristic function is (1 + P − Peit)−k. The mean and variance are
µ = kP and µ2 = kP(1 + P). (5.5)
Univariate Discrete Distributions, Third Edition. By Norman L. Johnson, Adrienne W. Kemp, and Samuel Kotz Copyright 2005 John Wiley & Sons, Inc. 208 DEFINITION 209
This parameterization (but with the symbol p instead of P ) is the one introduced by Fisher (1941). Other early writers adopted different parameterizations. Jeffreys (1941) had b = P/(1 + P),ρ = kP, giving the pgf [(1 − bz)/(1 − b)]ρ−ρ/b,andµ = ρ, µ2 = ρ/(1 − b). Anscombe (1950) used the form α = k,λ = kP, giving the pgf −α (1 + λ/α − λz/α) ,andµ = λ, µ2 = λ(1 + λ/α). Evans (1953) took a = P , m = kP, giving the pgf (1 + a − az)−m/a,and µ = m, µ2 = m(1 + a). This parameterization has been popular in the ecolog- ical literature. Some writers, for instance Patil et al. (1984), called this the Polya–Eggenberger´ distribution, as it arises as a limiting form of Eggenberger and Polya’s´ (1923) urn model distribution. Other authors, notably Johnson and Kotz (1977) and Berg (1988b), called the (nonlimiting) urn model distribution the Polya–Eggenberger´ distribution; see Section 6.2.4. For both distributions “Polya–Eggenberger”´ is quite often abbreviated to “Polya.”´ A further parameterization that has gained wide favor is p = 1/(1 + P),that is, q = P/(1 + P),andk = k, giving 1 − q k G(z) = , (5.6) 1 − qz k + x − 1 Pr[X = x] = qx(1 − q)k,x= 0, 1, 2,..., (5.7) k − 1
2 and µ = kq/(1 − q), µ2 = kq/(1 − q) . Sometimes λ = P/(1 + P) is used to avoid confusion with the binomial parameter q. Clearly k need not be an integer. When k is an integer, the distribution is sometimes called the Pascal distribution (Pascal, 1679). The name “Pascal dis- tribution” is, however, more often applied to the distribution shifted k units from the origin, that is, with support k,k + 1,...; this is also called the binomial waiting-time distribution. The geometric is the special case k = 1 of the negative binomial distribution. Kemp (1967a) summarized four commonly encountered formulations of pgf’s for the negative binomial and geometric distributions as follows:
Formulation Negative Binomial Geometric Conditions 1 pk(1 − qz)−k p(1 − qz)−1 p + q = 1 − − 2 pkzk(1 − qz) k pz(1 − qz) 1 0