Max-Planck-Institut für demografi sche Forschung Max Planck Institute for Demographic Research Konrad-Zuse-Strasse 1 · D-18057 Rostock · GERMANY Tel +49 (0) 3 81 20 81 - 0; Fax +49 (0) 3 81 20 81 - 202; http://www.demogr.mpg.de

MPIDR WORKING PAPER WP 2012-008 FEBRUARY 2012

The Gompertz distribution and Maximum Likelihood Estimation of its parameters - a revision

Adam Lenart ([email protected])

This working paper has been approved for release by: James W. Vaupel ([email protected]), Head of the Laboratory of Survival and Longevity and Head of the Laboratory of Evolutionary Biodemography.

© Copyright is held by the authors.

Working papers of the Max Planck Institute for Demographic Research receive only limited review. Views or opinions expressed in working papers are attributable to the authors and do not necessarily refl ect those of the Institute. The Gompertz distribution and Maximum Likelihood Estimation of its parameters - a revision

Adam Lenart November 28, 2011

Abstract

The Gompertz distribution is widely used to describe the distribution of adult deaths. Previous works concentrated on formulating approximate relationships to char- acterize it. However, using the generalized integro-exponential Milgram (1985) exact formulas can be derived for its -generating function and central moments. Based on the exact central moments, higher accuracy approximations can be defined for them. In demographic or actuarial applications, maximum-likelihood estimation is often used to determine the parameters of the Gompertz distribution. By solving the maximum-likelihood estimates analytically, the dimension of the optimization problem can be reduced to one both in the case of discrete and continuous .

Keywords: Gompertz distribution, moment-generating function, , , , , maximum-likelihood estimation

1 Introduction

The Gompertz distribution is important in describing the pattern of adult deaths (Wet- terstrand 1981; Gavrilov and Gavrilova 1991). For low levels of infant (and young adult) mortality, the Gompertz force of mortality extends to the whole life span (Vaupel 1986) of populations with no observed mortality deceleration.

The Gompertz distribution has received considerable attention from demographers and ac- tuaries. Pollard and Valkovics (1992) were the first to study the Gompertz distribution thoroughly. However, their results are true only in the case when the initial level of mortal- ity is very close to 0. Kunimura (1998) arrived at similar conclusions. They both defined the moment generating function of the Gompertz distribution in terms of the incomplete or complete gamma function and their results are either approximate or left in an integral form. Willemse and Koppelaar (2000) reformulated the Gompertz force of mortality and derived relationships for this new formulation. Later, Marshall and Olkin (2007) described the negative Gompertz distribution; a Gompertz distribution with a negative rate of aging parameter. Willekens (2002) provided connections between the Gompertz, the Weibull and other Type I extreme value distributions. In this paper, I will keep the most often used

1 formulation of the Gompertz’ law of mortality (Gompertz 1825) in demography (Preston et al. 2001:9.1). The Gompertz force of mortality at age x, x ≥ 0, is

µ(x) = aebx a, b > 0 , (1) where a denotes the level of the force of mortality at age 0 and b the rate of aging. The moments of the Gompertz distribution can be explicitly given by the generalized integro- exponential function (Milgram 1985), that offers exact power series representation of them.

2 The Gompertz distribution

The Gompertz distribution has a continuous density function with location pa- rameter a and b,

bx− a ebx−1 f(x) = ae b ( ) , (2) with on (−∞, ∞). In actuarial or demographic applications, x usually denotes age which cannot be negative, leading to bounded support on [0, ∞). The truncated distribution yields a proper density function by rescaling the a parameter to correspond to x = 0 (Garg et al. 1970). The distribution function is

− a ebx−1 F (x) = 1 − e b ( ) .

The moment generating function of the Gompertz distribution is given by its Laplace trans- form. Proposition 1. The Laplace transform of the Gompertz distribution is

a a a L(s) = e b E s a, b > 0 , (3) b b b ∞ R e−zt where En(z) = tn dt (Abramowitz and Stegun 1965:5.1.4). 1

Proof. The Laplace transform of the Gompetz pdf of (2) is

∞ Z bx− a ebx−1 −sx L(s) = ae b ( )e dx . (4)

0 By substituting q = ebx in (4)

∞ Z a a − a q − s L(s) = e b e b q b dq b 1

2 Note that the exponential integral, En(z), is defined as (Abramowitz and Stegun 1965:5.1.4)

∞ Z e−zt E (z) = dt n > 0, Re(z) > 0 , n tn 1 therefore

a a a L(s) = e b E s b b b

The moments of the Gompertz distribution are expressed as the of its Laplace transform that can be expressed in a general formula by using the generalized integro- exponential function (Milgram 1985).

Proposition 2. The nth moment of a Gompertz distributed X is

n n! a n−1 a E [X ] = e b E , (5) bn 1 b

∞ n 1 R n −s −zx where Es (z) = n! (ln x) x e dz is the generalized integro-exponential function (Milgram 1 1985).

Proof. The moments of random variable X of a Gompertz distribution can be calculated with the aid of the Laplace transform. From (3),

∞ Z d a a 1 − a x E [X] = − L(s) = e b e b ln(x) dx ds s=0 b b 1 ∞ 2 d a a Z 1 a  2 b − b x 2 E X = 2 L(s) = e 2 e [ln(x)] dx ds s=0 b b 1 . . ∞ n d a a Z 1 a n n+1 b − b x n E [X ] = (−1) n L(s) = e n e [ln(x)] dx ds s=0 b b 1

By integration by parts

∞ Z n a n −1 − a x n−1 E [X ] = e b x e b [ln(x)] dx bn 1

3 Note that the intergral representation of the generalized integro-exponential function (Mil- gram 1985) is

∞ 1 Z Ej(z) = [ln x]jx−se−zx dx s Γ(j + 1) 1 or by definition:

(−1)j ∂j Ej(z) = E (z) , (6) s j! ∂sj s and

0 Es (z) ≡ Es(z). Therefore, E [Xn] can be represented by (6):

n n! a n−1 a E [X ] = e b E bn 1 b or by the more widely used Meijer G-function (Milgram 1985):   n n! a n+1,0 a ; 1,..., 1 E [X ] = e b G , (7) bn n,n+1 b 0,..., 0; where the Meijer G-function is a generalized hypergeometric function. It is defined by the contour integral

m n Y Y Γ(bj − s) Γ (1 − aj + s)   a , . . . , a ; a , . . . , a 1 Z j=1 j=1 Gm,n z 1 n n+1 p = zs ds p,q b , . . . , b ; b , . . . , b q p 1 m m+1 q 2πi C Y Y Γ(1 − bj + s) Γ(aj − s) j=m+1 j=n+1 along contour C (Erd´elyi1953).

Series expansion of the integral form offers a simpler, but lengthier, representation of the mo- ments of the Gompertz distribution. The power series expansion of the generalized integro- a exponential function (Milgram 1985:2.10) in the Gompertz case, ie. for s = 1 and z = b is,

" ∞ # n a X 1 (− a )j (−1)n X n an−j En−1 = b + ln Ψ , (8) 1 b (−j)n j! n! j b j j=1 j=0 where dj Ψj = lim Γ(1 − t) . (9) t→0 dtj

4 For j ∈ (0, 1, 2, 3, 4)

Ψ0 = 1 ;

Ψ1 = γ ; π2 Ψ = γ2 + ; 2 6 π2 Ψ = γ3 + γ + 2ζ(3) ; 3 2 3 Ψ = γ4 + γ2π2 + 8γζ(3) + π4 , 4 20

∞ where γ ≈ 0.57722 is the Euler-Mascheroni constant and ζ(s) = P n−s is the Riemann n=1 π2 π4 zeta function (Abramowitz and Stegun 1965:23.2.18). Note that 6 = ζ(2) and 90 = ζ(4). Also note that ζ(3) ≈ 1.2021 is known as Ap´ery’sconstant (Ap´ery1979). Please refer to Appendix A.1 for derivation of the values of the Ψ function.

2.1 Expected value, variance, skewness and kurtosis

Corollary 1. The expected value of random variable X of a Gompertz distribution is

1 a a E [X] = e b E (10) b 1 b 1 a a a  ≈ e b − ln − γ . b b b The of the approximate result depends on the ratio of a to b. For low values of a 2 a 1 ( b ) b b e 4 , the approximation is precise.

Proof.

Exact From (5), the expected value of random variable X of a Gompertz distribution, or life expectancy, is given by

∞ Z 1 a −1 − a x E [X] = e b x e b dx b 1 1 a a = e b E b 1 b This result has already been shown by Missov and Lenart (2011).

5 Approximate The approximate result for the expected value in a Gompertz model is also well-known, when a is close to 0, by Abramowitz and Stegun (1965:5.1.11) the exponential 1 a a b  integral, E1(z), can be approximated by −γ − ln(z), and (10) appears as b e −γ − ln b , where γ = 0.57722 denotes the Euler-Mascheroni constant (see e.g. Pollard and Valkovics 1992; Kunimura 1998; Missov and Lenart 2011). However, the precision of the previous approximations can be improved by adding new terms to it.

From (8) the power series expansion of the expected value is

∞ 1 ! a j   1−j 1 a X (− b ) X 1 a E [X] = e b − − ln Ψ , b j! j b j j=1 j=0 and by (24)

∞ a j ! 1 a X (− b ) a E [X] = e b − − ln − γ . b j! b j=1

∞ (− a )j a P b As a is usually close to 0, b ≈ − j! and j=1

1 a a a  E [X] ≈ e b − ln − γ . (11) b b b ∞ (− a )j P b Note that because the sign of − j! changes, (11) will be slightly overestimated for j=1 relatively large a but by adding new terms of the sum, the expected value, or life expectancy at birth, can be approximated by arbitrary precision. For the error of the approximation, please see Fig. 5 in the Appendix. Corollary 2. The variance of random variable X of a Gompertz distribution is

    2 2  2 2 a a 1, 1, 1 a 1 π  a 1 a a V ar [X] = e b − F ; + + γ + ln − e b E b2 b 3 3 2, 2, 2; b 2 6 b b 1 b (12) 1 π2 a ≈ − 2 . b2 6 b3   a1, . . . , ap The approximate result holds for a ≈ 0. pFq ; z denotes the generalized hyper- b1, . . . , bq; geometric function (e.g. Askey and Daalhuis 2010).

Proof.

Exact The variance of a random variable X is

V ar [X] = E X2 − E [X]2 .

6 Concentrating on E [X2], if X is Gompertz distributed, from (5):

 2 2 a 1 a E X = e b E (13) b2 1 b An exact representation of (13) can be reached by the G function by (7):    2 2 a 3,0 a ; 1, 1 E X = e b G . b2 2,3 b 0, 0, 0;

j Power series expansion of Es yields (Milgram 1985:2.10),

∞ j+1 X 1 (−z)l (−1)j+1 X j + 1 Ej(z) = + ln(z)1+j−lΨ , (14) 1 (−l)j+1 l! (j + 1)! l l l=1 l=0 where dl Ψl = lim Γ(1 − t) . t→0 dtl Note that

∞ l   X 1 (−z) j 11,..., 1j+2 j+1 = (−1) z j+2Fj+2 ; z , (−l) l! 21,..., 2j+2; l=1

  ∞ k a1, . . . , ap P (a1)k···(ap)k z where pFq ; z = denotes the generalized hypergeometric func- (b1)k···(bp)k k! b1, . . . , bq; k=0 tion (Askey and Daalhuis 2010).

Therefore 2  a  1, 1, 1 a 1 π2  a2 E X2 = − F ; + + γ + ln (15) b2 b 3 3 2, 2, 2; b 2 6 b and

    2 2  2 2 a a 1, 1, 1 a 1 π  a 1 a a V ar[X] = e b − F ; + + γ + ln − e b E b2 b 3 3 2, 2, 2; b 2 6 b b 1 b

Approximate By juxtaposing (11) with (15), it is easy to see possible approximations. 1 π2 Pollard and Valkovics (1992) derived that the variance of the Gompertz distribution is b2 6 . This result only holds when a is very close to 0. However, from (12), it is clear that the variance depends on the parameter a, i.e. for fixed b, as a increases, the variance decreases.

When a ≈ 0, by (11)

2  2 1 a π − a 2 E X ≈ e b − e b E [X] b2 6

7 a and for e b = 1 1 π2 V ar (X) ≈ . (16) b2 6 Note that (16) crudely overestimates the variance for relatively larger a. The value of  1, 1, 1  F ; a for a ≈ 0 is 1 and for any higher likely values for human force of mortality 3 3 2, 2, 2; b a (ie. a ≈ 0.1 about age 90), it is equal to about 0.9, or approximately 1. However, −2 b3 can be significantly large and is the only term that reduces the overestimated variance of (16) in (12). Therefore,

1 π2 a V ar (X) ≈ − 2 b2 6 b3 always gives a better approximation to the variance than (16). For the error of the approxi- mation, please see Fig. 5 in the Appendix.

Corollary 3. The skewness, γ1 of the of random variable X of a Gompertz distribution is     6 a 4,0 a ; 1, 1, 1 2 a 3,0 a ; 1, 1 b b 3 3 e G − 3m1 2 e G + 2(m1) b 3,4 b 0, 0, 0, 0; b 2,3 b 0, 0, 0; γ1 = 3       2 2 a a 1, 1, 1 a 1 h π2 a 2i  1 a a 2 e b − F ; + + γ + ln − e b E b2 b 3 3 2, 2, 2; b 2 6 b b 1 b ( 4.15a0.3 − 5b0.49 − 1.48a + 4.31b − 4.96ab, if a > 0 √ ≈ −12 6 , π3 ζ(3), if a ≈ 0

where m1 denotes the expected value of the distribution.

Proof.

Exact Skewness is given as the third ,

E [X3] − 3E [X] E [X2] + 2E [X]3 γ1 = 3 . V ar [X] 2 From (7) and (12) the exact result is immediately given. An equivalent power series repre- sentation can be easily acquired from (8) and (9).

Approximate The approximate result is given by the limit of the power series expansion of the third central moment, m3, when a goes to 0 2 m3| = lim m3 = − ζ(3) a=0 a→0 b3

8 and from (16) √ m3 −12 6 γ1|a=0 = lim 3 = 3 ζ(3) (17) a→0 V ar [X] 2 π

and for the other cases in the 0 < a, b ≤ 0.2 surface, the approximation is the result of fitting the

c2 c4 γ1|a>0 ≈ c1a + c3b + c5a + c6b + c7ab expression by using (unweighted) non-linear . For the error of the approxima- tion, please see Fig. 5 in the Appendix.

Note that (17) also yields the lowest value, γ1 = −1.13955 for the skewness of the Gompertz distribution.

Corollary 4. The excess kurtosis, γ2 of the probability distribution of random variable X of a Gompertz distribution is      24 a 5,0 a ; 1, 1, 1, 1 6 a 4,0 a ; 1, 1, 1 γ2 = e b G − 4m1 e b G b4 4,5 b 0, 0, 0, 0, 0; b3 3,4 b 0, 0, 0, 0;        2 2 a 3,0 a ; 1, 1 4 2 a a 1, 1, 1 a +6(m1) e b G + 3(m1) e b − 3F3 ; b2 2,3 b 0, 0, 0; b2 b 2, 2, 2; b 2  2   2) 1 π  a2 1 a a + + γ + ln − e b E − 3 2 6 b b 1 b ( −0.75 + 34.13a0.253 + 20b0.311 − 53.51(ab)0.14, if a > 0 ≈ 12 , 5 , if a ≈ 0

Proof.

Exact Excess kurtosis is given as the difference between the fourth standardized moment and the kurtosis of the ; 3,

E [X4] − 4E [X] E [X3] + 6E [X]2 E [X2] − 3E [X]4 γ2 = − 3 . V ar [X]2

From (7) and (12) the exact result is immediately given. An equivalent power series repre- sentation can be easily acquired from (8) and (9). Note that the excess kurtosis reaches a minimal value about -0.75.

9 Approximate The approximate result is given by the limit of the power series expansion of the fourth central moment, m4, when a goes to 0 3π4 m4| = lim m4 = a=0 a→0 20b4 and from (16)

m4 12 γ2| = lim − 3 = . a=0 a→0 V ar [X]2 5 and for the other cases in the 0 < a, b ≤ 0.2 surface, the approximation is the result of fitting the

c2 c4 c6 γ2|a>0 ≈ −0.75 + c1a + c3b + c5(ab) expression by using (unweighted) non-linear least squares. For the error of the approxima- tion, please see Fig. 5 in the Appendix.

0.20 0.20

0.18 0.18

Variance 0.16 Expected value 0.16 400+ 75+ 200−400 0.14 50−75 0.14 100−200 30−50 50−100 15−30 b 0.12 b 0.12 25−50 10−15 15−25 6−10 0.10 0.10 10−15 4−6 7−10 2−4 0.08 0.08 4−7

0.06 0.06

0.00 0.05 0.10 0.15 0.20 0.00 0.05 0.10 0.15 0.20 a a

0.20 0.20

0.18 0.18

Skewness 0.16 0.16 (−1.14)−(−0.5) Kurtosis (−0.5)−0 0.14 0.14 (−0.75)−(−0.6) 0−0.3 (−0.6)−(−0.3) 0.3−0.5 (−0.3)−0 b 0.12 0.5−0.6 b 0.12 0−0.3 0.6−0.75 0.3−1.5 0.10 0.75−0.85 0.10 1.5+ 0.85−1 0.08 1+ 0.08

0.06 0.06

0.00 0.05 0.10 0.15 0.20 0.00 0.05 0.10 0.15 0.20 a a

Figure 1: Expected value, variance, skewness and kurtosis of the Gompertz distribution

The Gompertz probability density function can be reformulated into a distribution with location and scale parameters. By noting that the , M 1  b  M = ln , b a

10 (2) can be expressed as

f(x) = bee−bM −eb(x−M)+b(x−M)

3 Maximum Likelihood Estimation

3.1 Discrete age

Often, only discrete data is available and (1) cannot be directly estimated. However, if the number of deaths and the number of person-years exposed to the risk of dying can be observed, we can assume that the number of deaths in a given age follows a .

Let D denote the number of deaths and λ the rate parameter of the Poisson distribution 1 f(D; λ) = θDe−λ . (18) D! The rate parameter, λ, of the Poisson distribution is consistent with µ(x) as both λ, µ(x) > 0. The rate of occurrence of a Poisson process assumes equal observation window for each unit of observation. When applying the Poisson distribution to death-exposure data, the unit of observation is the number of deaths at each age group, Dx and the observation window is the number of people (or person-years) who are exposed to the risk of death. Clearly, the number of people at the risk of dying differs from age group to age group. Therefore, θ has to be weighted by the number of person-years exposed to death, Ex at age x. After substituting λ = λEx in (18) and taking the log of it and eliminating the additive constants, the log-likelihood function of deaths D will be X l(θ|D) ∝ Dx ln θ − Exθ . (19) x If we substitute µ(x) for θ, we assume that we have n independent draws (for each age) from a Poisson distribution with a rate parameter that sheds light on the previously hidden (in (19)) functional relationship between rate λ and age x.

Using data for US females in 2007, aged 35-100 from the Human Mortality Database, pa- rameters a and b can be readily estimated.

From Fig. 3, we can see that the maximal likelihood value lies somewhere in the middle of the of plotted a’s and b’s (ˆa = 0.0006, ˆb = 0.095, log-L = −4963073).

The maximum likelihood estimator for a is (Appendix B.1) P Dx aˆ = x . (20) P bx Exe x

11 oeuto solving. equation to usiue nteepeso fˆ of expression the in substituted As b ˆ If of data values females Log-likelihood 4: Figure optimal case. optimization the ˆ dimensional 4 two both the and Fig. in seen from as be 2007, same easily the from can is data value two value US from likelihood likelihood problem the log optimization to the maximal it of associated dimension Applying the reduce one. to to used immediately be can (20) a ostsy(e pedxB.1) Appendix (see satisfy to has a ˆ b iue2 bevdfreo mortality. of females. 35-100, force ages 2007, Observed USA 2: Figure stemxmmlklho siao for estimator likelihood maximum the is

steol nnw n(1,i a enmrclysle n afterward, and solved numerically be can it (21), in unknown only the is log(µ)

−7 −6 −5 −4 −3 −2 −1 35 40 45 50 55 60 65

log−likelihood Age 70 P

x −5300000 −5150000 −5000000 75 1 0.06 D 80 a x 85 hc eue h rbe ftodmninloptimization two-dimensional of problem the reduces which X x 90 b 0.08 D 95 ie optimal given x 100 x = 0.10 P 12 x a b hntemxmmlklho siao for estimator likelihood maximum the then , E 1 x 0.12 o S 07 gs3-0 data 35-100 ages 2007, USA of for values Log-likelihood 3: Figure e ˆ bx a X o each for x 0.14 E logL x e ˆ bx . x 0.16 b b o S20,ae 35-100, ages 2007, US for a , ˆ b n h maximal the and b , a ˆ b hthsthe has that ˆ b a a be can and (21) b 3.1.1 Example for solving b

The Newton-Raphson method is commonly used to solve (systems of) equations numerically. This method uses an iterative process to approximate the root of a function. If r denotes the root and n the number of iterations,

f(rn) rn+1 = rn − 0 , f (rn)

rn+1 will give a dr = rn+1 − rn better approximation to the root of function f(·) than rn.

In the case of discrete Gompertz mortality models, f(b) is (21), and

P bx 2 P bx 2 Exe x Exe x f 0(b) = x − x .  P bx  P bx Exe Exe x x

For the US 2007, females aged 35-100 data, by taking a reasonable initial guess of b = 0.12 for any human population,

0 n bn f(bn) f (bn) bn+1 db 0 0.12 -3.738 -131.264 0.0915218 -0.0284782 1 0.0915218 0.7091 -182.8819 0.0953989 3.87 × 10−3 2 0.0953989 0.01468335 -175.3180 0.09548266 8.38 × 10−5

Table 1: Newton-Raphson solution for b in case of US 2007, ages 35-100, females data

the solution of ˆb = 0.095 is reached after the second iteration. If higher than 8.38 × 10−5 precision is required, the iterations can be continued up to infinity. (Table 1)

3.1.2 Remarks

Please note that by letting wD and wE be

ˆbx Dx Exe wD = P and wE = P ˆbx Dx Exe x x

and by denoting the respective vectors as wD, wE and x, (21) can be rewritten as

T T wD x = wE x .

At the optimal b, ˆb, the two weighted averages of ages (x); one weighted by the number of deaths (wD) and the other, weighted by the exposures (wE) are equal to each other. The weights of the exposures are themselves also weighted. Their weights correspond to the rate of aging, each consecutive age has eˆb times higher importance than the previous one.

13 3.2 Continuous age

In the case when fully observed, or right-censored continuous data for the distribution of 1 deaths is available, the Gompertz distribution can provide the likelihood function. Let Xi denote the observed or right-censored lifetimes and let δi = 1 if time of death of individual i is observed and δi = 0 if the individual lifetime is right-censored. In this case, the joint density and the log likelihood function of the Gompertz distribution will be (Garg et al. 1970)

δ a bX bXi  i − (e i −1) p(Xi|a, b) = ae e b

and X a l(a, b|X) = δ (ln a + bX ) − ebXi − 1 i i b i respectively.

The functions in the continuous case will be

bX  P δi e i −1    a − b Sa =  i  S P a bXi  a bXi b b2 e − 1 + δiXi − b e Xi i and therefore Db aˆ = , 1 P bXi n e − 1 i

1 P where n is the number of observations and D = n δi is the proportion of deaths out of all i observations.

The MLE of b is the solution to   bXi bXi  X De Xi X D e − 1 1 =  b + δiXi . P ebXi − 1 P ebXi − b i n i n i i

Note that Garg et al. (1970) used the discretized version of this likelihood to provide max- imum likelihood estimates and when the step size of the discretization is sufficiently small, it would give the same results as the continuous form shown here.

1However, Pollard and Valkovics (1997) showed that an approximate likelihood for discrete data can be found using the truncated Gompertz distribution.

14 3.2.1 Remarks

If all Xi are observed times of death, by denoting the average force of mortality of the P a bXi population byµ ¯ and noting thatµ ¯ = n e : i

µ¯ =a ˆ + b .

If b = ˆb and the data come from a Gompertz distribution, the observed average force of mortality is equal toa ˆ + ˆb.

References

Abramowitz, M. and I. Stegun. 1965. Handbook of Mathematical Functions. Washington, DC: US Government Printing Office.

Ap´ery, R. 1979. “Irrationalit´ede ζ (2) et ζ (3).” Ast´erisque 61:11–13.

Askey, R. A. and A. B. O. Daalhuis. 2010. “Generalized Hypergeometric Functions and Mei- jer G-Function.” In NIST handbook of mathematical functions, edited by F. Olver, D. Lozier, R. Boisvert and C. Clark, Cambridge University Press.

Erd´elyi,A. 1953. Higher transcendental functions, vol. 1. New York-Toronto-London: McGraw-Hill.

Garg, M., B. Rao and C. Redmond. 1970. “Maximum-likelihood estimation of the parameters of the Gompertz .” Journal of the Royal Statistical Society. Series C (Applied Statistics) 19(2):152–159.

Gavrilov, L. and N. Gavrilova. 1991. The biology of Life Span: A Quantitative Approach. Chur: Harwood.

Gompertz, B. 1825. “On the Nature of the Function Expressive of the Law of Human Mortality, and on a New Mode of Determining the Value of Life Contingencies.” Philosophical Transactions of the Royal Society of London 115:513–583.

Gussmann, E. 1967. “Modifizierung der Gewichtsfunktionenmethode zur Berechnung der Fraun- hoferlinien in Sonnen-und Sternspektren.” Zeitschrift f¨urAstrophysik 65:456–497.

Kunimura, D. 1998. “The Gompertz Distribution-Estimation of Parameters.” Actuarial Research Clearing House 2:65–76.

Marshall, A. and I. Olkin. 2007. Life Distributions. Springer.

Milgram, M. 1985. “The generalized integro-exponential function.” Mathematics of computation 44(170):443–458.

Missov, T. and A. Lenart. 2011. “Linking period and cohort life-expectancy linear increases in Gompertz proportional hazards models.” Demographic Research 24(19):455–468.

15 Pollard, J. and E. Valkovics. 1992. “The Gompertz distribution and its applications.” Genus 48(3- 4):15–29.

———. 1997. “On the use of the truncated Gompertz distribution and other models to represent the parity progression functions of high fertility populations.” Mathematical Population Studies 6(4):291–305.

Preston, S., P. Heuveline and M. Guillot. 2001. Demography: measuring and modeling population processes. Wiley-Blackwell.

Vaupel, J. 1986. “How change in age-specific mortality affects life expectancy.” Population Studies 40(1):147–157.

Wetterstrand, W. 1981. “Parametric models for life insurance mortality data: Gompertz’s law over time.” Transactions of the Society of Actuaries 33:159–175.

Willekens, F. 2002. “Gompertz in context: the Gompertz and related distributions.” In Forecasting Mortality in Developed Countries - Insights from a Statistical, Demographic and Epidemiological Perspective, European Studies of Population, vol. 9, edited by E. Tabeau, A. van den Berg Jeths and C. Heathcote, pp. 105–126, Springer.

Willemse, W. and H. Koppelaar. 2000. “Knowledge Elicitation of Gompertz’ Law of Morality.” Scandinavian Actuarial Journal pp. 168–180.

A Appendix

A.1 Values of the Ψ function

Note that (Abramowitz and Stegun 1965:6.3.1,6.3.2)

∞ Z e−u ln u du = ψ(1) = −γ , (22)

0

d (n) dn where ψ(z) = dz ln Γ(z) denotes the digamma function and let ψ (z) = dzn ψ(z) denote the polygamma function (Abramowitz and Stegun 1965:6.4.1).

As a special case of ψ(n)(z), when z = 1 (Abramowitz and Stegun 1965:6.4.2),

ψ(n)(1) = (−1)n+1n!ζ(n + 1) . (23)

16 From (9), (22) and (23),

Ψ0 = lim −Γ(1 − t) = 1 ; t→0

Ψ1 = lim −Γ(1 − t)ψ(1 − t) = −ψ(1) = γ ; t→0 2 (1) 2 Ψ2 = lim −Γ(1 − t)[ψ(1 − t)] + Γ(1 − t)ψ (1 − t) = γ + ζ(2) ; t→0 (24) . . n−1 n−1   d X n − 1 (n−1−k) (n−1) Ψn = lim [−Γ(1 − t)ψ(1 − t)] = lim Γ(1 − t) ψ (1 − t) t→0 dtn−1 t→0 k k=0

d Any values of Ψn can be derived by using relationships (22), (23), dz Γ(z) = ψ(z)Γ(z) and Γ(1) = 1.2

A.2 Central moments as Meijer-G functions

Meijer-G functions offer the most succinct way to represent the central moments of the Gompertz distribution. The nth order central moment, mn is given as n   X n n−j m = (−1)n−jE Xj E [X] n j j=0 and from (7)   1 a 2,0 a ; 1 m1 = e b G b 1,2 b 0, 0;   2 a 3,0 a ; 1, 1 2 m2 = e b G − (m1) b2 2,3 b 0, 0, 0;     6 a 4,0 a ; 1, 1, 1 2 a 3,0 a ; 1, 1 3 m3 = e b G − 3m1 e b G + 2(m1) b3 3,4 b 0, 0, 0, 0; b2 2,3 b 0, 0, 0;     24 a 5,0 a ; 1, 1, 1, 1 6 a 4,0 a ; 1, 1, 1 m4 = e b G − 4m1 e b G b4 4,5 b 0, 0, 0, 0, 0; b3 3,4 b 0, 0, 0, 0;   2 2 a 3,0 a ; 1, 1 4 + 6(m1) e b G + 3(m1) . b2 2,3 b 0, 0, 0;

2 n For an alternative derivation (and also for alternative power series expansion of E1 (z)) please see the Appendix of Gussmann (1967), especially equations (A.39) − (A.41).

17 A.3 Error of the approximations

Expected value Variance

0.20 0.20

0.18 0.18

Difference 0.16 0.16 Difference 150+ 50+ 50−150 0.14 10−50 0.14 10−50 3−10 5−10

b 0.12 1−3 b 0.12 (−5)−5 0.5−1 (−10)−(−5) 0.10 0.2−0.5 0.10 (−50)−(−10) 0−0.2 (−150)−(−50) 0.08 0.08 −150−

0.06 0.06

0.00 0.05 0.10 0.15 0.20 0.00 0.05 0.10 0.15 0.20 a a

Skewness Kurtosis

0.20 0.20

0.18 0.18

0.16 0.16

0.14 Difference 0.14 Difference (−0.05)−(−0.01) 0.2+ (−0.01)−0.01 (0.1)−(0.2) b 0.12 b 0.12 0.01−0.1 (−0.1)−(0.1) 0.1−0.2 (−0.1)− 0.10 0.10

0.08 0.08

0.06 0.06

0.00 0.05 0.10 0.15 0.20 0.00 0.05 0.10 0.15 0.20 a a

Figure 5: Approximation error for the expected value, variance, skewness and kurtosis of the Gompertz distribution

B Appendix

B.1 Maximum Likelihood Estimation

To solve the maximum likelihood estimators analytically, let S denote the scores of the likelihood function and be defined by ∂l(θ|D) S = . ∂θ Maximizing the likelihood function implies that at the optimal parameters, θˆ the system of equations defined by the score function is homogeneous, S = 0 , and the Hessian matrix ∂2l(θˆ|D) H = ∂θˆ2

18 is negative definite.

The s score functions, i . . . s, in the Poisson likelihood case appear as

∂f(θi,x) X ∂θ ∂f(θi, x) S = D i − E , (25) i x f(θ , x) x ∂θ x i i where f(·) is some function of θ and x.

Substituting (1) in (25) yields3

 P Dx P bx  a − Exe x x S =  P P bx  . Dxx − Exxae x x By setting  0  S = , 0

Sa can be solved fora ˆ: P Dx aˆ = x P bx Exe x

Substitutinga ˆ in Sb gives P Dx X X x S = D x − E xebx , b x x P E ebx x x x x or

1 X 1 X ˆbx Dxx = Exe x . P D P ˆbx x x Exe x x x

3S will be an s × n (Jacobian) matrix

19