
arXiv:1302.1884v3 [stat.CO] 7 Apr 2015

Simulating from a gamma distribution with small shape parameter

Ryan Martin and Nick Syring
Department of Mathematics, Statistics, and Computer Science
University of Illinois at Chicago
(rgmartin, nsyring2)@uic.edu

Chuanhai Liu
Department of Statistics
Purdue University
[email protected]

April 8, 2015

Abstract

Simulating from a gamma distribution with small shape parameter is a challenging problem. Towards an efficient method, we obtain a limiting distribution for a suitably normalized gamma distribution when the shape parameter tends to zero. Then this limiting distribution provides insight to the construction of a new, simple, and highly efficient acceptance–rejection algorithm. Comparisons based on acceptance rates show that the proposed procedure is more efficient than existing acceptance–rejection methods.

Keywords and phrases: Acceptance rate; acceptance–rejection method; asymptotic distribution; exponential distribution; R software

1 Introduction

Let Y be a positive gamma distributed random variable with shape parameter α > 0, denoted by Y ∼ Gamma(α, 1). The probability density function for Y is given by

    p_α(y) = y^{α−1} e^{−y} / Γ(α),   y > 0,

where the normalizing constant, Γ(α) = ∫_0^∞ y^{α−1} e^{−y} dy, is the gamma function evaluated at α. This is an important distribution in statistics and probability modeling. In fact, since the gamma distribution is closely tied to so many important distributions, including normal, Poisson, exponential, chi-square, F, beta, and Dirichlet, one could argue that it is one of the most fundamental (e.g., Johnson et al. 1994). Here we are particularly interested in the problem of simulating gamma random variables when the shape parameter α is small. This problem is important because the small-shape gamma, with a large scale parameter, is a simple but useful model for positive size or lifetime random variables with small mean but large variance (e.g., Kleiber and Kotz 2003). Unfortunately, the small-shape gamma is not easy to work with, so seemingly routine calculations can become inefficient or even intractable. For example, in R, the functions related to the gamma distribution, in particular the rgamma function for sampling, become relatively inaccurate when the shape parameter is small (R Core Team 2013, "GammaDist" documentation). To circumvent these difficulties, and to move towards new and more efficient software, we show that Gamma(α, 1), suitably normalized, has a simple and non-degenerate limiting distribution as α → 0. This result is then used to develop a new and efficient algorithm for sampling from a small-shape gamma distribution.

When the shape parameter α is large, it follows from the infinite-divisibility of the gamma distribution and Lindeberg's central limit theorem (Billingsley 1995, Sec. 27) that the distribution of Y is approximately normal. Specifically, as α → ∞,

    α^{−1/2}(Y − α) → N(0, 1) in distribution.

For large finite α, better normal approximations can be obtained by working on different scales, such as log Y or Y^{1/3}.
Our interest is in the opposite extreme case, where the shape parameter α is small, approaching zero. Here, neither infinite-divisibility nor the central limit theorem provide any help. In Theorem 1 below, we prove that −α log Y converges in distribution to Exp(1), the unit-rate exponential distribution, as α → 0.

Motivated by the limit distribution result in Theorem 1, we turn to the problem of simulating from a small-shape gamma distribution. This is a challenging problem with many proposed solutions; see, for example, Best (1983), Kundu and Gupta (2007), Tanizaki (2008), and Xi et al. (2013). For small shape parameters, the default methods implemented in R and MATLAB, due to Ahrens and Dieter (1974) and Marsaglia and Tsang (2000), respectively, have some shortcomings in terms of accuracy and/or efficiency. The exponential limit in Theorem 1 for the normalized gamma distribution suggests a convenient and tight envelope function to be used in an acceptance–rejection sampler (e.g., Devroye 1986; Flury 1990). We flesh out the details of this new algorithm in Section 3 and provide R code at www.math.uic.edu/~rgmartin. This new method is simple and, as we demonstrate in Section 4, is more efficient than existing methods in terms of acceptance rates.

2 Limit distribution result

For the gamma function Γ(z) defined above, write f(z) = log Γ(z). Then the digamma and trigamma functions are defined as f_1(z) = f′(z) and f_2(z) = f″(z), the first and second derivatives of the log gamma function f(z). Recall that these are related to the mean and variance of log Y, with Y ∼ Gamma(α, 1):

    E_α(log Y) = f_1(α)   and   V_α(log Y) = f_2(α).

These formulae are most directly seen by applying those well-known formulae for means and variances in regular exponential families (Brown 1986, Corollary 2.3). Next, write

Z = −α log Y. To get some intuition for why multiplication by α is the right normalization, consider the following recurrence relations for the digamma and trigamma functions (Abramowitz and Stegun 1966, Chap. 6):

    f_1(α) = f_1(α + 1) − 1/α   and   f_2(α) = f_2(α + 1) + 1/α².

Then, as α → 0,

    E_α(Z) = −α f_1(α) = −α f_1(α + 1) + 1 = O(1),
    V_α(Z) = α² f_2(α) = α² f_2(α + 1) + 1 = O(1).
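As a quick numerical illustration (ours, not part of the paper), these two expressions can be evaluated directly with base R's digamma and trigamma functions; both approach 1 as α decreases.

    # E(Z) = -alpha * f1(alpha) and V(Z) = alpha^2 * f2(alpha), evaluated with
    # base R's digamma and trigamma; both tend to 1 as alpha -> 0.
    for (alpha in c(0.5, 0.1, 0.01, 0.001)) {
      cat(sprintf("alpha = %5.3f:  E(Z) = %.4f,  V(Z) = %.4f\n",
                  alpha, -alpha * digamma(alpha), alpha^2 * trigamma(alpha)))
    }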

That is, multiplication by α stabilizes the first and second moments of log Y . Towards a formal look at the limiting distribution of Z, define the characteristic function

    ϕ_α(t) = E_α(e^{itZ}) = E_α(Y^{−iαt}) = Γ(α − iαt)/Γ(α),    (1)

where i = √−1 is the complex unit.

Theorem 1. For Y ∼ Gamma(α, 1), −α log Y → Exp(1) in distribution as α → 0.

Proof. Set Z = −α log Y. The gamma function satisfies Γ(z) = Γ(z + 1)/z, so the characteristic function ϕ_α(t) for Z in (1) can be re-expressed as

    ϕ_α(t) = Γ(α − iαt)/Γ(α) = [Γ(1 + α − iαt)/(α − iαt)] / [Γ(1 + α)/α] = {1/(1 − it)} · Γ(1 + o_α)/Γ(1 + o_α),

where the o_α are terms that vanish as α → 0. Since the gamma function is continuous at 1, the limit of ϕ_α(t) as α → 0 exists and is given by 1/(1 − it). This limit is exactly the characteristic function of Exp(1), so the claim follows by Lévy's continuity theorem.
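As a numerical complement to the proof (our illustration, not part of the paper), the exact distribution function of Z = −α log Y is P(Y ≥ e^{−z/α}), which pgamma can evaluate; its maximum gap from the Exp(1) distribution function over a grid of z > 0 shrinks as α decreases.

    # Exact CDF of Z = -alpha*log(Y) versus the Exp(1) CDF on a grid of z > 0;
    # the discrepancy shrinks as alpha -> 0, consistent with Theorem 1.
    z <- seq(0.1, 5, by = 0.1)
    for (alpha in c(0.5, 0.1, 0.01)) {
      F_Z <- pgamma(exp(-z / alpha), shape = alpha, lower.tail = FALSE)
      cat(sprintf("alpha = %.2f:  max |F_Z(z) - pexp(z)| = %.4f\n",
                  alpha, max(abs(F_Z - pexp(z)))))
    }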

3 Small-shape gamma simulations

Simulating from a gamma distribution with small shape parameter is a challenging problem that has attracted considerable attention in the literature; see, e.g., Best (1983), Kundu and Gupta (2007), Tanizaki (2008), and Xi et al. (2013). Here we demonstrate that the limiting distribution result in Theorem 1 helps provide an improved algorithm for simulating gamma random variables with small shape parameter.

For Y ∼ Gamma(α, 1) with α near zero, let Z = −α log Y. To simulate from the distribution of Z, one might consider an acceptance–rejection scheme; see, for example, Lange (1999, Chap. 20.4) or Givens and Hoeting (2005, Chap. 6.2.3). For this, one needs an envelope function that bounds the target density and, when properly normalized, corresponds to the density function of a distribution that is easy to simulate from. By Theorem 1 we know that Z is approximately Exp(1) for α ≈ 0. More precisely, the density h_α(z) of Z has a shape like e^{−z} for z ≥ 0. Therefore, we expect that a function proportional to an Exp(1) density will provide a tight upper bound on h_α(z) for z ≥ 0. We shall similarly try to bound h_α(z) by an oppositely-oriented exponential-type density for z < 0, as is standard in such problems.

Figure 1: Plots of the (un-normalized) target h_α(z) (solid) and envelope η_α(z) (dashed) for two values of α: (a) α = 0.1, (b) α = 0.05.

The particular bounding envelope function η_α(z) is chosen to be as tight an upper bound as possible. This is done by picking optimal points of tangency with h_α(z). For this, we shall need a formula for h_α(z), up to a norming constant, which is easily found:

    h_α(z) = c e^{−z − e^{−z/α}},   z ∈ (−∞, ∞).

The norming constant c satisfies c^{−1} = Γ(α + 1). By following standard techniques, as described in Lange (1999, Chap. 20.4), we obtain the optimal envelope function

    η_α(z) = c e^{−z},       for z ≥ 0,
             c w λ e^{λz},   for z < 0,

where λ = λ(α) = α^{−1} − 1 and w = w(α) = α / {e(1 − α)}. Plots of the (un-normalized) target density h_α(z) along with the optimal envelope η_α(z), for two small values of α, are shown in Figure 1. The normalized envelope function η_α(z) corresponds to the density function of a mixture of two (oppositely-oriented) exponential distributions, i.e.,

    (1/(1 + w)) Exp(1) + (w/(1 + w)) {−Exp(λ)},

which is easy to sample from using standard tools, such as runif in R. Pseudo-code for the proposed new program, named rgamss, for simulating from a small-shape gamma distribution based on this acceptance–rejection scheme is presented in Algorithm 1. R code is also available at www.math.uic.edu/~rgmartin, which provides the user with further options. As a side note, since numerical precision can be lost in the final exponentiation step in Algorithm 1, we recommend returning the samples on the log-scale, which is the default in our R function.
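To make the envelope concrete, here is a short R sketch (ours, not the released rgamss code; the names h_alpha and eta_alpha are our own) that evaluates the un-normalized target and envelope, with the common constant c dropped, and checks numerically that the envelope dominates the target for a small α.

    # Un-normalized target and optimal envelope (common constant c dropped);
    # the ratio h/eta should never exceed 1.
    h_alpha <- function(z, a) exp(-z - exp(-z / a))
    eta_alpha <- function(z, a) {
      lambda <- 1 / a - 1
      w <- a / (exp(1) * (1 - a))
      ifelse(z >= 0, exp(-z), w * lambda * exp(lambda * z))
    }
    a <- 0.05
    z <- seq(-2, 10, by = 0.01)
    max(h_alpha(z, a) / eta_alpha(z, a))   # should not exceed 1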

Algorithm 1 – Pseudo-code for the program rgamss designed to simulate from a gamma distribution with small shape parameter α.

    set λ ← λ(α), w ← w(α), and r ← r(α) as in the text
    loop
        U ← Unif                        # Unif denotes the uniform random number generator
        if U ≤ r then
            z ← −log(U/r)
        else
            z ← log(Unif)/λ
        end if
        if h_α(z)/η_α(z) > Unif then
            Z ← z
            break
        end if
    end loop
    Y ← exp(−Z/α)
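The authors' full R implementation, with additional options, is available at the URL given above. The following is only a minimal sketch of Algorithm 1 in plain R (the function and variable names are ours), returning log Y as recommended above and performing the acceptance test on the log scale for numerical safety.

    # Minimal sketch of Algorithm 1 (our transcription, not the released rgamss code).
    # Returns log(Y) for Y ~ Gamma(shape = a, rate = 1); intended for small a.
    rgamss_sketch <- function(n, a) {
      lambda <- 1 / a - 1
      w <- a / (exp(1) * (1 - a))
      r <- 1 / (1 + w)                       # acceptance rate, see (2) below
      replicate(n, {
        repeat {
          u <- runif(1)
          z <- if (u <= r) -log(u / r) else log(runif(1)) / lambda
          # log of h_alpha(z) / eta_alpha(z), the acceptance probability
          log_ratio <- if (z >= 0) -exp(-z / a) else
            -z - exp(-z / a) - log(w * lambda) - lambda * z
          if (log(runif(1)) <= log_ratio) break
        }
        -z / a                               # log(Y) = -Z / a
      })
    }

    # Example: five draws of log(Y) with a very small shape parameter.
    rgamss_sketch(5, a = 0.001)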

4 Efficiency comparisons

All methods based on the acceptance–rejection principle will provide genuine samples from the target distribution. So, the most natural way to compare such methods is based on the acceptance rate. It is common practice nowadays to employ an acceptance–rejection sampler inside a Markov chain Monte Carlo algorithm, so having high acceptance rates results in shorter Monte Carlo run times. The acceptance rate r(α) for the proposed method is

    r(α) = {1 + w(α)}^{−1} = {1 + α/[e(1 − α)]}^{−1}.    (2)

It is clear from the approximation r(α) ≈ 1 − α/e for α ≈ 0 that the acceptance rate converges to 1 as α → 0. This indicates the high efficiency of the proposed method when α is small. This is to be expected based on Theorem 1: when α ≈ 0, Z is approximately Exp(1), so an algorithm that proposes Exp(1) samples with probability 1/(1 + w) ≈ 1 will likely accept the proposal.

To justify the claim in Section 1 that the proposed method is more efficient than existing methods, all based on the accept–reject principle, we provide a comparison in terms of acceptance rates. The methods being compared to rgamss are Ahrens–Dieter (Ahrens and Dieter 1974), Best (Best 1983), and Algorithm 3 of Kundu–Gupta (Kundu and Gupta 2007); the methods of Tanizaki (2008) and Xi et al. (2013) were also considered, but these are not efficient enough to be compared with the others. Plots of the acceptance rates for these methods, as a function of the shape parameter α, are displayed in Figure 2. The proposed rgamss has the highest acceptance rate over a range of small α values, namely (0, 0.3], and, therefore, is most efficient. For α > 0.3, the proposed method's efficiency drops below that of Kundu–Gupta but stays above Best and Ahrens–Dieter for α ∈ (0.3, 0.4] (not shown).
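As an illustration of (2) (ours, not from the paper; the helper name r_accept is our own), the exact rate and its first-order approximation 1 − α/e are easy to tabulate:

    # Acceptance rate from (2) and its small-alpha approximation 1 - alpha/e.
    r_accept <- function(a) 1 / (1 + a / (exp(1) * (1 - a)))
    alpha <- c(0.001, 0.01, 0.1, 0.3)
    cbind(alpha, exact = r_accept(alpha), approx = 1 - alpha / exp(1))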

Figure 2: Plot of the acceptance rates for the four methods (Ahrens–Dieter, Best, Kundu–Gupta, rgamss) as a function of the shape parameter α ∈ (0, 0.3].

There are additional numerical advantages to the proposed method beyond simulation efficiency. Indeed, when α is very small, the gamma distribution is tightly concentrated around zero, so getting practically non-zero values can be challenging. For example, when α = 0.001, nearly half of the samples returned by the Ahrens–Dieter method are exact zeros. The Kundu–Gupta method has similar difficulties, since U^{1/α}, for U ∼ Unif(0, 1), is often practically zero. The proposed method, on the other hand, actually works on the log-scale, a more appropriate scale for very small numbers, so one can readily obtain genuine non-degenerate samples of log Y for very small values of α.
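This underflow is easy to reproduce (our illustration; R's rgamma uses the Ahrens–Dieter method for shape parameters below one), while the log-scale output of the sketch from Section 3 stays finite for the same α:

    # Roughly half of the rgamma draws underflow to exact zero when alpha = 0.001,
    # while the log-scale sampler sketched in Section 3 returns finite values.
    set.seed(1)
    mean(rgamma(1e5, shape = 0.001) == 0)   # close to one half
    rgamss_sketch(5, a = 0.001)             # finite values of log(Y)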

Acknowledgments

This work is partially supported by the U.S. National Science Foundation, grants DMS–1007678, DMS–1208833, and DMS–1208841.

References

Abramowitz, M. and Stegun, I. A. (1966). Handbook of mathematical functions, with formulas, graphs, and mathematical tables. Dover Publications Inc., New York.

Ahrens, J. H. and Dieter, U. (1974). Computer methods for sampling from gamma, beta, Poisson and binomial distributions. Computing (Arch. Elektron. Rechnen), 12(3):223–246.

Best, D. J. (1983). A note on gamma variate generators with shape parameter less than unity. Computing, 30(2):185–188.

Billingsley, P. (1995). Probability and Measure. John Wiley & Sons Inc., New York, third edition.

Brown, L. D. (1986). Fundamentals of Statistical Exponential Families with Applications in Statistical Decision Theory. Institute of Mathematical Statistics Lecture Notes—Monograph Series, 9. Institute of Mathematical Statistics, Hayward, CA.

Devroye, L. (1986). Nonuniform Random Variate Generation. Springer-Verlag, New York.

Flury, B. D. (1990). Acceptance-rejection sampling made easy. SIAM Rev., 32(3):474–476.

Givens, G. H. and Hoeting, J. A. (2005). Computational Statistics. Wiley Series in Probability and Statistics. Wiley-Interscience [John Wiley & Sons], Hoboken, NJ.

Johnson, N. L., Kotz, S., and Balakrishnan, N. (1994). Continuous Univariate Distributions. Vol. 1. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons, Inc., New York, second edition.

Kleiber, C. and Kotz, S. (2003). Statistical Size Distributions in Economics and Actuarial Sciences. Wiley Series in Probability and Statistics. Wiley-Interscience [John Wiley & Sons], Hoboken, NJ.

Kundu, D. and Gupta, R. D. (2007). A convenient way of generating gamma random variables using generalized exponential distribution. Comput. Statist. Data Anal., 51(6):2796–2802.

Lange, K. (1999). Numerical Analysis for Statisticians. Statistics and Computing. Springer-Verlag, New York.

Marsaglia, G. and Tsang, W. W. (2000). A simple method for generating gamma variables. ACM Trans. Math. Software, 26(3):363–372.

R Core Team (2013). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

Tanizaki, H. (2008). A simple gamma random number generator for arbitrary shape parameters. Econ. Bull., 3:1–10.

Xi, B., Tan, K. M., and Liu, C. (2013). Logarithmic transformation based gamma random number generators. J. Statist. Softw., 55:Issue 4.
