The Amoroso Distribution

Gavin E. Crooks Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720 [email protected]

Abstract: Herein, we review the properties of the Amoroso distribution, the natural unification of the gamma and extreme value distribution families. Over 50 distinct, named distributions (and twice as many synonyms) occur as special cases or limiting forms. Consequently, this single simple functional form encapsulates and systematizes an extensive menagerie of interesting and common probability distributions.

Contents

1 The Amoroso distribution family
  1.1 Special cases: Miscellaneous
  1.2 Special cases: Positive integer β
  1.3 Special cases: Negative integer β
  1.4 Special cases: Extreme order statistics
  1.5 Properties
2 Log-gamma distributions
  2.1 Special cases
  2.2 Properties
3 Miscellaneous limits
A Index of distributions
References

1. The Amoroso distribution family

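The Amoroso (generalized gamma, Stacy-Mihram) distribution [2, 22] is a four parameter, continuous, univariate, unimodal probability density, with semi-infinite range. The functional form in the most straightforward parameterization is

$$\mathrm{Amoroso}(x \mid a, \theta, \alpha, \beta) = \frac{1}{\Gamma(\alpha)} \left| \frac{\beta}{\theta} \right| \left( \frac{x-a}{\theta} \right)^{\alpha\beta-1} \exp\left\{ -\left( \frac{x-a}{\theta} \right)^{\beta} \right\} \tag{1}$$

for $x, a, \theta, \alpha, \beta$ in $\mathbb{R}$, $\alpha > 0$, support $x \ge a$ if $\theta > 0$, $x \le a$ if $\theta < 0$.

arXiv:1005.3274v2 [math.ST] 13 Jul 2015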
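Equation (1) can be evaluated directly. The following is a minimal sketch, not code from the paper: the names `amoroso_pdf` and `integrate` are hypothetical, the normalization is taken to be |β/θ|/Γ(α), and the midpoint rule is only a crude sanity check that the density integrates to one.

```python
import math

def amoroso_pdf(x, a, theta, alpha, beta):
    """Density of equation (1); returns 0 outside the support."""
    if alpha <= 0 or theta == 0 or beta == 0:
        raise ValueError("require alpha > 0, theta != 0, beta != 0")
    z = (x - a) / theta  # z > 0 on the interior of the support
    if z <= 0:
        return 0.0
    try:
        # Work in log space so that large alpha does not overflow Gamma(alpha).
        log_pdf = (math.log(abs(beta / theta)) - math.lgamma(alpha)
                   + (alpha * beta - 1.0) * math.log(z) - z ** beta)
    except OverflowError:
        # z**beta overflowed, so the density underflows to zero.
        return 0.0
    return math.exp(log_pdf)

def integrate(f, lo, hi, n=20_000):
    """Plain midpoint rule (an illustrative helper, not a robust integrator)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# A shifted member of the family with a = 1, theta = 2, alpha = 3/2, beta = 2.
total = integrate(lambda x: amoroso_pdf(x, 1.0, 2.0, 1.5, 2.0), 1.0, 21.0)
print(total)  # should be close to 1
```

Working with the logarithm of the density is a design choice made here for numerical stability; it does not change the functional form.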


The Amoroso distribution was originally developed to model lifetimes [2]. It occurs as the Weibullization of the standard gamma distribution (6) and, with integer α, in extreme value statistics (28). The Amoroso distribution is itself a limiting form of various more general distributions, most notably the generalized beta and generalized beta prime distributions [32]. A useful and important property of the Amoroso distribution is that many common and interesting probability distributions occur as special cases or limits (see Table 1). (Informally, an “interesting distribution” is one that has acquired a name, which generally indicates that the distribution is the solution to one or more interesting problems.) This provides a convenient method for systematizing a significant fraction of the probability distributions that are encountered in practice, provides a consistent parameterization for those distributions, and obviates the need to enumerate the properties (mean, mode, variance, entropy and so on) of each and every specialization.

Notation: The four real parameters of the Amoroso distribution consist of a location parameter a, a scale parameter θ, and two shape parameters, α and β. Whenever these symbols appear in special cases or limiting forms, they refer directly to the parameters of the Amoroso distribution. The shape parameter α is positive, and in many specializations an integer, α = n, or half-integer, α = k/2. The negation of a standard parameter is indicated by a bar, e.g. β̄ = −β. The chi, chi-squared and related distributions are traditionally parameterized with the scale parameter σ, where θ = (2σ²)^{1/β}, and σ is the standard deviation of a related normal distribution. Additional alternative parameters are introduced as necessary. We write Amoroso(x|a, θ, α, β) for a density function, Amoroso(a, θ, α, β) for the corresponding random variable, and X ∼ Amoroso(a, θ, α, β) to indicate that two random variables have the same probability distribution [15].

1.1. Special cases: Miscellaneous

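Stacy (hyper gamma, generalized Weibull, Nukiyama-Tanasawa, generalized gamma, generalized semi-normal, hydrograph, Leonard hydrograph, transformed gamma) distribution [41, 6]:

$$\mathrm{Stacy}(x \mid \theta, \alpha, \beta) = \frac{1}{\Gamma(\alpha)} \left| \frac{\beta}{\theta} \right| \left( \frac{x}{\theta} \right)^{\alpha\beta-1} \exp\left\{ -\left( \frac{x}{\theta} \right)^{\beta} \right\} \tag{2}$$
$$= \mathrm{Amoroso}(x \mid 0, \theta, \alpha, \beta)$$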

If we drop the location parameter from Amoroso, then we obtain the Stacy, or generalized gamma distribution, the parent of the gamma family of distributions. If β is negative then the distribution is generalized inverse gamma, the parent of various inverse distributions, including the inverse gamma (20) and inverse chi (26).

Table 1. The Amoroso family of distributions.

$$\mathrm{Amoroso}(x \mid a, \theta, \alpha, \beta) = \frac{1}{\Gamma(\alpha)} \left| \frac{\beta}{\theta} \right| \left( \frac{x-a}{\theta} \right)^{\alpha\beta-1} \exp\left\{ -\left( \frac{x-a}{\theta} \right)^{\beta} \right\}$$

for x, a, θ, α, β in R, α > 0; k, n positive integers; support x ≥ a if θ > 0, x ≤ a if θ < 0. A dot marks a parameter that is left free.

Eq.    Distribution                 a     θ       α          β
(1)    Amoroso                      a     θ       α          β
(2)    Stacy                        0     .       .          .
(28)   gen. Fisher-Tippett          .     .       n          .
(29)   Fisher-Tippett               .     .       1          .
(33)   Fréchet                      .     .       1          <0
(32)   generalized Fréchet          .     .       n          <0
(25)   scaled inverse chi           0     .       k/2        -2
(26)   inverse chi                  0     1/√2    k/2        -2
(27)   inverse Rayleigh             0     .       1          -2
(19)   Pearson type V               .     .       .          -1
(20)   inverse gamma                0     .       .          -1
(23)   scaled inverse chi-square    0     .       k/2        -1
(24)   inverse chi-square           0     1/2     k/2        -1
(22)   Lévy                         .     .       1/2        -1
(21)   inverse exponential          0     .       1          -1
(7)    Pearson type III             .     .       .          1
(5)    gamma                        0     .       .          1
(5)    Erlang                       0     >0      n          1
(6)    standard gamma               0     1       .          1
(13)   scaled chi-square            0     .       k/2        1
(12)   chi-square                   0     2       k/2        1
(9)    shifted exponential          .     .       1          1
(8)    exponential                  0     .       1          1
(8)    standard exponential         0     1       1          1
(5)    Wien                         0     .       4          1
(10)   Nakagami                     .     .       .          2
(15)   scaled chi                   0     .       k/2        2
(14)   chi                          0     √2      k/2        2
(11)   half-normal                  0     .       1/2        2
(16)   Rayleigh                     0     .       1          2
(17)   Maxwell                      0     .       3/2        2
(18)   Wilson-Hilferty              0     .       .          3
(30)   generalized Weibull          .     .       n          >0
(31)   Weibull                      .     .       1          >0
(4)    pseudo-Weibull               .     .       1+1/β      >0
(3)    stretched exponential        0     .       1          >0

Limits
(34)   log-gamma                    .     .       .          .     lim β→∞
(44)   power law                    .     .       (1−p)/β    .     lim β→0
(42)   log-normal                   .     .       1/(βσ)²    .     lim β→0
(43)   normal                       .     .       .          1     lim α→∞

Fig 1. Amoroso(x|0, 1, 1, β), stretched exponential, plotted for β = 1, 2, 3, 4, 5.

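The Stacy distribution is obtained as the positive even powers, modulus, and powers of the modulus of a centered, normal random variable (43),

$$\mathrm{Stacy}\left( (2\sigma^2)^{1/\beta}, \tfrac{1}{2}, \beta \right) \sim \left| \mathrm{Normal}(0, \sigma) \right|^{2/\beta}$$

and as powers of the sum of squares of k centered, normal random variables.

$$\mathrm{Stacy}\left( (2\sigma^2)^{1/\beta}, \tfrac{1}{2}k, \beta \right) \sim \left( \sum_{i=1}^{k} \mathrm{Normal}(0, \sigma)^2 \right)^{1/\beta}$$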

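Stretched exponential distribution [27]:

$$\mathrm{StretchedExp}(x \mid \theta, \beta) = \frac{\beta}{|\theta|} \left( \frac{x}{\theta} \right)^{\beta-1} \exp\left\{ -\left( \frac{x}{\theta} \right)^{\beta} \right\} \tag{3}$$
for β > 0
$$= \mathrm{Weibull}(x \mid 0, \theta, \beta) = \mathrm{Amoroso}(x \mid 0, \theta, 1, \beta)$$

Stretched exponentials are an alternative to power laws for modeling fat tailed distributions. For β = 1 we recover the exponential distribution (8), and in the limit β → 0 a power law distribution (44).

Pseudo-Weibull distribution [44]:

$$\mathrm{PseudoWeibull}(x \mid \theta, \beta) = \frac{1}{\Gamma(1 + \frac{1}{\beta})} \frac{\beta}{|\theta|} \left( \frac{x}{\theta} \right)^{\beta} \exp\left\{ -\left( \frac{x}{\theta} \right)^{\beta} \right\} \tag{4}$$
for β > 0
$$= \mathrm{Amoroso}(x \mid 0, \theta, 1 + \tfrac{1}{\beta}, \beta)$$

Proposed as another model of failure times.

Fig 2. Amoroso(x|0, 1, 2, β), plotted for β = 1 (gamma), β = 2 (scaled chi), β = 3 (Wilson-Hilferty), β = 4.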

1.2. Special cases: Positive integer β

Gamma (Γ) distribution [35, 36, 22]:

$$\mathrm{Gamma}(x \mid \theta, \alpha) = \frac{1}{\Gamma(\alpha)|\theta|} \left( \frac{x}{\theta} \right)^{\alpha-1} \exp\left\{ -\frac{x}{\theta} \right\} \tag{5}$$
$$= \mathrm{PearsonIII}(x \mid 0, \theta, \alpha) = \mathrm{Stacy}(x \mid \theta, \alpha, 1) = \mathrm{Amoroso}(x \mid 0, \theta, \alpha, 1)$$

The name of this distribution derives from the normalization constant. Gamma distributions often appear as solutions to problems in statistical physics. For example, the energy density of a classical ideal gas; or the Wien (Vienna) distribution Wien(x|T) = Gamma(x|T, 4), an approximation to the relative intensity of black body radiation as a function of the frequency. The Erlang (m-Erlang) distribution [8] is a gamma distribution with integer α, which models the waiting time to observe α events from a Poisson process with rate 1/θ (θ > 0). Gamma distributions obey an addition property:

$$\mathrm{Gamma}(\theta, \alpha_1) + \mathrm{Gamma}(\theta, \alpha_2) \sim \mathrm{Gamma}(\theta, \alpha_1 + \alpha_2)$$

The sum of two independent, gamma distributed random variables (with common θ’s, but possibly different α’s) is again a gamma random variable [22].

Standard gamma (standard Amoroso) distribution [22]:

$$\mathrm{StdGamma}(x \mid \alpha) = \frac{1}{\Gamma(\alpha)} x^{\alpha-1} e^{-x} \tag{6}$$

Fig 3. Gamma(x|1/α, α) (unit variance), plotted for α = 1, 2, 4, 6, 8.

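The Amoroso distribution can be obtained from the standard gamma distribution by the Weibull change of variables, $x \mapsto \left( \frac{x-a}{\theta} \right)^{\beta}$.

$$\mathrm{Amoroso}(a, \theta, \alpha, \beta) \sim a + \theta \left[ \mathrm{StdGamma}(\alpha) \right]^{1/\beta}$$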
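The change of variables above gives a direct sampling recipe. The sketch below assumes NumPy is available; `sample_amoroso` and `amoroso_mean` are hypothetical helper names, and the mean formula a + θ Γ(α + 1/β)/Γ(α) is the one listed among the properties in Section 1.5. NumPy's own gamma generator is used here rather than any particular algorithm.

```python
import math
import numpy as np

def sample_amoroso(a, theta, alpha, beta, size, rng=None):
    """Draw Amoroso(a, theta, alpha, beta) variates as a + theta * G**(1/beta),
    where G is a standard gamma variate with shape alpha."""
    rng = np.random.default_rng() if rng is None else rng
    g = rng.standard_gamma(alpha, size=size)
    return a + theta * g ** (1.0 / beta)

def amoroso_mean(a, theta, alpha, beta):
    """Mean a + theta * Gamma(alpha + 1/beta) / Gamma(alpha), valid when alpha + 1/beta > 0."""
    return a + theta * math.exp(math.lgamma(alpha + 1.0 / beta) - math.lgamma(alpha))

rng = np.random.default_rng(0)  # fixed seed so the check is reproducible
draws = sample_amoroso(1.0, 2.0, 1.5, 2.0, size=200_000, rng=rng)
print(draws.mean(), amoroso_mean(1.0, 2.0, 1.5, 2.0))  # the two should be close
```

With 200,000 draws the sample mean should agree with the closed form to about two decimal places.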

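Pearson type III distribution [36, 22]:

$$\mathrm{PearsonIII}(x \mid a, \theta, \alpha) = \frac{1}{\Gamma(\alpha)|\theta|} \left( \frac{x-a}{\theta} \right)^{\alpha-1} \exp\left\{ -\frac{x-a}{\theta} \right\} \tag{7}$$
$$= \mathrm{Amoroso}(x \mid a, \theta, \alpha, 1)$$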

The gamma distribution with a location parameter.

An important property of the exponential distribution is that it is memoryless: the conditional probability given that x > c, where c is a positive constant, is again an exponential distribution with the same scale parameter. The only other distribution with this property is the geometric distribution [9], the discrete analog of the exponential distribution. With θ = 1 we obtain a standard exponential distribution. See also shifted exponential (9), stretched exponential (3) and inverse exponential (21).

Shifted exponential distribution [22]:

The exponential distribution with a location parameter.

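Nakagami (generalized normal, Nakagami-m) distribution [34]:

$$\mathrm{Nakagami}(x \mid a, \theta, m) = \frac{2}{\Gamma(\frac{m}{2})|\theta|} \left( \frac{x-a}{\theta} \right)^{m-1} \exp\left\{ -\left( \frac{x-a}{\theta} \right)^{2} \right\} \tag{10}$$
$$= \mathrm{Amoroso}(x \mid a, \theta, \tfrac{m}{2}, 2)$$

Used to model attenuation of radio signals that reach a receiver by multiple paths [34].

Half-normal (semi-normal, positive definite normal, one-sided normal) distribution [22]:

$$\mathrm{HalfNormal}(x \mid \sigma) = \frac{2}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{x^2}{2\sigma^2} \right\} \tag{11}$$
$$= \mathrm{ScaledChi}(x \mid \sigma, 1) = \mathrm{Stacy}(x \mid \sqrt{2\sigma^2}, \tfrac{1}{2}, 2) = \mathrm{Amoroso}(x \mid 0, \sqrt{2\sigma^2}, \tfrac{1}{2}, 2)$$

The modulus of a normal distribution with zero mean and variance σ².

Chi-square (χ²) distribution [11, 22]:

$$\mathrm{ChiSqr}(x \mid k) = \frac{1}{2\Gamma(\frac{k}{2})} \left( \frac{x}{2} \right)^{\frac{k}{2}-1} \exp\left\{ -\frac{x}{2} \right\} \tag{12}$$
for positive integer k
$$= \mathrm{Gamma}(x \mid 2, \tfrac{k}{2}) = \mathrm{Stacy}(x \mid 2, \tfrac{k}{2}, 1) = \mathrm{Amoroso}(x \mid 0, 2, \tfrac{k}{2}, 1)$$

The distribution of a sum of squares of k independent standard normal random variables. The chi-square distribution is important for statistical hypothesis testing in the frequentist approach to statistical inference.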

Fig 4. ChiSqr(x|k), plotted for k = 1, 2, 3, 4, 5.

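Scaled chi-square distribution [28]:

$$\mathrm{ScaledChiSqr}(x \mid \sigma, k) = \frac{1}{2\sigma^2\,\Gamma(\frac{k}{2})} \left( \frac{x}{2\sigma^2} \right)^{\frac{k}{2}-1} \exp\left\{ -\frac{x}{2\sigma^2} \right\} \tag{13}$$
for positive integer k
$$= \mathrm{Stacy}(x \mid 2\sigma^2, \tfrac{k}{2}, 1) = \mathrm{Gamma}(x \mid 2\sigma^2, \tfrac{k}{2}) = \mathrm{Amoroso}(x \mid 0, 2\sigma^2, \tfrac{k}{2}, 1)$$

The distribution of a sum of squares of k independent normal random variables with variance σ².

Chi (χ) distribution [22]:

$$\mathrm{Chi}(x \mid k) = \frac{\sqrt{2}}{\Gamma(\frac{k}{2})} \left( \frac{x}{\sqrt{2}} \right)^{k-1} \exp\left\{ -\frac{x^2}{2} \right\} \tag{14}$$
for positive integer k
$$= \mathrm{ScaledChi}(x \mid 1, k) = \mathrm{Stacy}(x \mid \sqrt{2}, \tfrac{k}{2}, 2) = \mathrm{Amoroso}(x \mid 0, \sqrt{2}, \tfrac{k}{2}, 2)$$

The root-sum-square of k independent standard normal variables, or the square root of a chi-square random variable.

$$\mathrm{Chi}(k) \sim \sqrt{\mathrm{ChiSqr}(k)}$$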

Fig 5. Amoroso(x|0, 1, α, 2), plotted for α = 1/2 (half-normal), α = 1 (Rayleigh), α = 3/2 (Maxwell).

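Scaled chi (generalized Rayleigh) distribution [33, 22]:

$$\mathrm{ScaledChi}(x \mid \sigma, k) = \frac{2}{\Gamma(\frac{k}{2})\sqrt{2\sigma^2}} \left( \frac{x}{\sqrt{2\sigma^2}} \right)^{k-1} \exp\left\{ -\frac{x^2}{2\sigma^2} \right\} \tag{15}$$
for positive integer k
$$= \mathrm{Stacy}(x \mid \sqrt{2\sigma^2}, \tfrac{k}{2}, 2) = \mathrm{Amoroso}(x \mid 0, \sqrt{2\sigma^2}, \tfrac{k}{2}, 2)$$

The root-sum-square of k independent and identically distributed normal variables with zero mean and variance σ².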

Rayleigh distribution [42, 22]:

$$\mathrm{Rayleigh}(x \mid \sigma) = \frac{1}{\sigma^2}\, x \exp\left\{ -\frac{x^2}{2\sigma^2} \right\} \tag{16}$$
$$= \mathrm{ScaledChi}(x \mid \sigma, 2) = \mathrm{Stacy}(x \mid \sqrt{2\sigma^2}, 1, 2) = \mathrm{Amoroso}(x \mid 0, \sqrt{2\sigma^2}, 1, 2)$$

The root-sum-square of two independent and identically distributed normal variables with zero mean and variance σ². For instance, wind speeds are approximately Rayleigh distributed, since the horizontal components of the velocity are approximately normal, and the vertical component is typically small [24].

Maxwell (Maxwell-Boltzmann, Maxwell speed) distribution [30, 1]:

$$\mathrm{Maxwell}(x \mid \sigma) = \frac{\sqrt{2}}{\sqrt{\pi}\,\sigma^3}\, x^2 \exp\left\{ -\frac{x^2}{2\sigma^2} \right\} \tag{17}$$
$$= \mathrm{ScaledChi}(x \mid \sigma, 3) = \mathrm{Stacy}(x \mid \sqrt{2\sigma^2}, \tfrac{3}{2}, 2) = \mathrm{Amoroso}(x \mid 0, \sqrt{2\sigma^2}, \tfrac{3}{2}, 2)$$

The speed distribution of molecules in thermal equilibrium. The root-sum-square of three independent and identically distributed normal variables with zero mean and variance σ².

Wilson-Hilferty distribution [47, 22]:

The cube root of a gamma variable follows the Wilson-Hilferty distribution [47], which has been used to approximate a normal distribution if α is not too small. A related approximation using quartic roots of gamma variables [19] leads to Amoroso(x|0, θ, α, 4).
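The special cases of this subsection can be cross-checked numerically against the general density (1). The sketch below is only an illustration: the function names are hypothetical, the closed forms are transcribed from equations (11), (12), (16) and (17), and σ = 1.3 and k = 3 are arbitrary test values.

```python
import math

def amoroso_pdf(x, a, theta, alpha, beta):
    """General density (1), evaluated in log space."""
    z = (x - a) / theta
    if z <= 0:
        return 0.0
    return math.exp(math.log(abs(beta / theta)) - math.lgamma(alpha)
                    + (alpha * beta - 1.0) * math.log(z) - z ** beta)

def half_normal(x, s):   # equation (11)
    return 2.0 / math.sqrt(2 * math.pi * s * s) * math.exp(-x * x / (2 * s * s))

def rayleigh(x, s):      # equation (16)
    return x / (s * s) * math.exp(-x * x / (2 * s * s))

def maxwell(x, s):       # equation (17)
    return math.sqrt(2 / math.pi) / s ** 3 * x * x * math.exp(-x * x / (2 * s * s))

def chi_square(x, k):    # equation (12)
    return (x / 2) ** (k / 2 - 1) * math.exp(-x / 2) / (2 * math.gamma(k / 2))

sigma, k = 1.3, 3                    # arbitrary test values
theta = math.sqrt(2 * sigma ** 2)    # the scale used by the chi family
max_err = 0.0
for x in (0.2, 0.9, 1.7, 3.1):
    pairs = [
        (half_normal(x, sigma), amoroso_pdf(x, 0.0, theta, 0.5, 2.0)),
        (rayleigh(x, sigma),    amoroso_pdf(x, 0.0, theta, 1.0, 2.0)),
        (maxwell(x, sigma),     amoroso_pdf(x, 0.0, theta, 1.5, 2.0)),
        (chi_square(x, k),      amoroso_pdf(x, 0.0, 2.0, k / 2, 1.0)),
    ]
    max_err = max(max_err, max(abs(p - q) for p, q in pairs))
print(max_err)  # expected to be at the level of floating-point rounding
```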

1.3. Special cases: Negative integer β

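Pearson type V distribution [37]:

$$\mathrm{PearsonV}(x \mid a, \theta, \alpha) = \frac{1}{\Gamma(\alpha)|\theta|} \left( \frac{\theta}{x-a} \right)^{\alpha+1} \exp\left\{ -\frac{\theta}{x-a} \right\} \tag{19}$$
$$= \mathrm{Amoroso}(x \mid a, \theta, \alpha, -1)$$

With negative β we obtain various “inverse” distributions related to distributions with positive β by the reciprocal transformation $\left( \frac{x-a}{\theta} \right) \mapsto \left( \frac{\theta}{x-a} \right)$. Pearson’s type V is the inverse of Pearson’s type III distribution.

Inverse gamma (Vinci) distribution [22]: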

Fig 6. Amoroso(x|0, 1, 2, β), negative β, plotted for β = −1 (inverse gamma), β = −2 (scaled inverse-chi), β = −3.

Occurs as the conjugate prior for an exponential distribution’s scale parameter [22], or the prior for variance of a normal distribution with known mean [15].

Inverse exponential distribution [25]:

Note that the name “inverse exponential” is occasionally used for the ordinary exponential distribution (8).

Lévy distribution (van der Waals profile) [10]:

$$\text{Lévy}(x \mid a, c) = \sqrt{\frac{c}{2\pi}}\, \frac{1}{(x-a)^{3/2}} \exp\left\{ -\frac{c}{2(x-a)} \right\} \tag{22}$$
$$= \mathrm{PearsonV}(x \mid a, \tfrac{c}{2}, \tfrac{1}{2}) = \mathrm{Amoroso}(x \mid a, \tfrac{c}{2}, \tfrac{1}{2}, -1)$$

The Lévy distribution is notable for being stable: a linear combination of identically distributed Lévy distributions is again a Lévy distribution. The other stable distributions with analytic forms are the normal distribution (43), which is also a limit of the Amoroso distribution, and the Cauchy distribution [22], which is not. Lévy distributions describe first passage times in one dimensional Brownian diffusion [10].

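Scaled inverse chi-square distribution [15]:

$$\mathrm{ScaledInvChiSqr}(x \mid \sigma, k) = \frac{2\sigma^2}{\Gamma(\frac{k}{2})} \left( \frac{1}{2\sigma^2 x} \right)^{\frac{k}{2}+1} \exp\left\{ -\frac{1}{2\sigma^2 x} \right\} \tag{23}$$
for positive integer k
$$= \mathrm{InvGamma}(x \mid \tfrac{1}{2\sigma^2}, \tfrac{k}{2}) = \mathrm{PearsonV}(x \mid 0, \tfrac{1}{2\sigma^2}, \tfrac{k}{2}) = \mathrm{Stacy}(x \mid \tfrac{1}{2\sigma^2}, \tfrac{k}{2}, -1) = \mathrm{Amoroso}(x \mid 0, \tfrac{1}{2\sigma^2}, \tfrac{k}{2}, -1)$$

A special case of the inverse gamma distribution with half-integer α. Used as a prior for variance parameters in normal models [15].

Inverse chi-square distribution [15]:

$$\mathrm{InvChiSqr}(x \mid k) = \frac{2}{\Gamma(\frac{k}{2})} \left( \frac{1}{2x} \right)^{\frac{k}{2}+1} \exp\left\{ -\frac{1}{2x} \right\} \tag{24}$$
for positive integer k
$$= \mathrm{ScaledInvChiSqr}(x \mid 1, k) = \mathrm{InvGamma}(x \mid \tfrac{1}{2}, \tfrac{k}{2}) = \mathrm{PearsonV}(x \mid 0, \tfrac{1}{2}, \tfrac{k}{2}) = \mathrm{Stacy}(x \mid \tfrac{1}{2}, \tfrac{k}{2}, -1) = \mathrm{Amoroso}(x \mid 0, \tfrac{1}{2}, \tfrac{k}{2}, -1)$$

A standard scaled inverse chi-square distribution.

Scaled inverse chi distribution [28]:

$$\mathrm{ScaledInvChi}(x \mid \sigma, k) = \frac{2\sqrt{2\sigma^2}}{\Gamma(\frac{k}{2})} \left( \frac{1}{\sqrt{2\sigma^2}\, x} \right)^{k+1} \exp\left\{ -\frac{1}{2\sigma^2 x^2} \right\} \tag{25}$$
$$= \mathrm{Stacy}(x \mid \tfrac{1}{\sqrt{2\sigma^2}}, \tfrac{k}{2}, -2) = \mathrm{Amoroso}(x \mid 0, \tfrac{1}{\sqrt{2\sigma^2}}, \tfrac{k}{2}, -2)$$

Used as a prior for the standard deviation of a normal distribution.

Inverse chi distribution [28]:

$$\mathrm{InvChi}(x \mid k) = \frac{2\sqrt{2}}{\Gamma(\frac{k}{2})} \left( \frac{1}{\sqrt{2}\, x} \right)^{k+1} \exp\left\{ -\frac{1}{2x^2} \right\} \tag{26}$$
$$= \mathrm{Stacy}(x \mid \tfrac{1}{\sqrt{2}}, \tfrac{k}{2}, -2) = \mathrm{Amoroso}(x \mid 0, \tfrac{1}{\sqrt{2}}, \tfrac{k}{2}, -2)$$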

Fig 7. Extreme value distributions: standard Gumbel; reversed Weibull, β = 2; Fréchet, β = −2.

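The standard inverse chi distribution.

Inverse Rayleigh distribution [9]:

$$\mathrm{InvRayleigh}(x \mid \sigma) = 2\sqrt{2\sigma^2} \left( \frac{1}{\sqrt{2\sigma^2}\, x} \right)^{3} \exp\left\{ -\frac{1}{2\sigma^2 x^2} \right\} \tag{27}$$
$$= \mathrm{Stacy}(x \mid \tfrac{1}{\sqrt{2\sigma^2}}, 1, -2) = \mathrm{Amoroso}(x \mid 0, \tfrac{1}{\sqrt{2\sigma^2}}, 1, -2)$$

The inverse Rayleigh distribution has been used to model failure time [43].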

1.4. Special cases: Extreme order statistics

Generalized Fisher-Tippett distribution [40, 3]:

$$\mathrm{GenFisherTippett}(x \mid a, \omega, n, \beta) = \frac{n^n}{\Gamma(n)} \left| \frac{\beta}{\omega} \right| \left( \frac{x-a}{\omega} \right)^{n\beta-1} \exp\left\{ -n \left( \frac{x-a}{\omega} \right)^{\beta} \right\} \tag{28}$$
for positive integer n
$$= \mathrm{Amoroso}(x \mid a, \omega / n^{1/\beta}, n, \beta)$$

If we take N samples from a probability distribution, then asymptotically for large N and n ≪ N, the distribution of the nth largest (or smallest) sample follows a generalized Fisher-Tippett distribution. The parameter β depends on the tail behavior of the sampled distribution. Roughly speaking, if the tail is unbounded and decays exponentially then β limits to ∞, if the tail scales as a power law then β < 0, and if the tail is finite β > 0 [17]. In these three limits we obtain the Gumbel (39, 38), Fréchet (33, 32) and Weibull (31, 30) families of extreme value distributions (extreme value distributions types I, II and III) respectively. If β/ω is negative we obtain distributions for the nth maxima, if positive then the nth minima.

Fisher-Tippett (Generalized extreme value, GEV, von Mises-Jenkinson, von Mises extreme value) distribution [12, 45, 17, 23]:

$$\mathrm{FisherTippett}(x \mid a, \omega, \beta) = \left| \frac{\beta}{\omega} \right| \left( \frac{x-a}{\omega} \right)^{\beta-1} \exp\left\{ -\left( \frac{x-a}{\omega} \right)^{\beta} \right\} \tag{29}$$
$$= \mathrm{GenFisherTippett}(x \mid a, \omega, 1, \beta) = \mathrm{Amoroso}(x \mid a, \omega, 1, \beta)$$

The asymptotic distribution of the extreme value from a large sample. The superclass of type I, II and III (Gumbel, Fréchet, Weibull) extreme value distributions [45]. This is the distribution for maximum values with β/ω < 0 and minimum values for β/ω > 0. The maximum of two Fisher-Tippett random variables (minimum if β/ω > 0) is again a Fisher-Tippett random variable.

$$\max\left[ \mathrm{FisherTippett}(a, \omega_1, \beta),\ \mathrm{FisherTippett}(a, \omega_2, \beta) \right] \sim \mathrm{FisherTippett}\left( a, \frac{\omega_1 \omega_2}{(\omega_1^{\beta} + \omega_2^{\beta})^{1/\beta}}, \beta \right)$$

This follows because taking the maximum of two independent random variables is equivalent to multiplying their cumulative distribution functions, and the Fisher-Tippett cumulative distribution function is $\exp\left\{ -\left( \frac{x-a}{\omega} \right)^{\beta} \right\}$.
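The closure of the Fisher-Tippett family under maxima can be spelled out from the cumulative distribution function quoted above. The following is a short worked derivation under stated assumptions: the two variables $X_1$ and $X_2$ are independent, share the location $a$ and shape $\beta$, and have positive scales $\omega_1, \omega_2$ with $\beta < 0$, so that $\exp\{-((x-a)/\omega)^{\beta}\}$ is the cdf. The symbol $\omega_{12}$ for the combined scale is introduced here only for illustration.

```latex
\begin{align*}
P\big(\max(X_1, X_2) \le x\big)
  &= F_1(x)\,F_2(x) \\
  &= \exp\left\{ -\left(\frac{x-a}{\omega_1}\right)^{\beta}
                 -\left(\frac{x-a}{\omega_2}\right)^{\beta} \right\} \\
  &= \exp\left\{ -(x-a)^{\beta}\left(\omega_1^{-\beta} + \omega_2^{-\beta}\right) \right\}
   = \exp\left\{ -\left(\frac{x-a}{\omega_{12}}\right)^{\beta} \right\}, \\
\omega_{12}
  &= \left(\omega_1^{-\beta} + \omega_2^{-\beta}\right)^{-1/\beta}
   = \frac{\omega_1\,\omega_2}{\left(\omega_1^{\beta} + \omega_2^{\beta}\right)^{1/\beta}} .
\end{align*}
```

The combined scale has the dimensions of a length, as a scale parameter should, and it reduces to $\omega\, 2^{-1/\beta}$ when $\omega_1 = \omega_2 = \omega$.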

The limiting distribution of the nth smallest value of a large number of identically distributed random variables that are at least a. If ω is negative we obtain the distribution of the nth largest value.

Weibull (Fisher-Tippett type III, Gumbel type III, Rosin-Rammler, Rosin-Rammler-Weibull, extreme value type III, Weibull-Gnedenko) distribution [46,

This is the limiting distribution of the minimum of a large number of identically distributed random variables that are at least a. If ω is negative we obtain a reversed Weibull (extreme value type III) distribution for maxima. Special cases of the Weibull distribution include the exponential (β = 1) and Rayleigh (β = 2) distributions.

Generalized Fréchet distribution [40, 3]:

The limiting distribution of the nth largest value of a large number of identically distributed random variables whose moments are not all finite and are bounded from below by a. (If the scale parameter ω is negative then minima rather than maxima.)

Fréchet (extreme value type II, Fisher-Tippett type II, Gumbel type II, inverse Weibull) distribution [13, 17]:

The limiting distribution of the largest of a large number of identically distributed random variables whose moments are not all finite and are bounded from below by a. (If the scale parameter ω is negative then minima rather than maxima.) Special cases of the Fréchet distribution include the inverse exponential (β̄ = 1) and inverse Rayleigh (β̄ = 2) distributions.

1.5. Properties

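support: $x \ge a$ for $\theta > 0$; $x \le a$ for $\theta < 0$

cdf: $1 - Q\left( \alpha, \left( \frac{x-a}{\theta} \right)^{\beta} \right)$ for $\frac{\beta}{\theta} > 0$; $Q\left( \alpha, \left( \frac{x-a}{\theta} \right)^{\beta} \right)$ for $\frac{\beta}{\theta} < 0$

mode: $a + \theta \left( \alpha - \frac{1}{\beta} \right)^{1/\beta}$ for $\alpha\beta \ge 1$; $a$ for $\alpha\beta \le 1$

std. moments ($a = 0$, $\theta = 1$): $\frac{\Gamma(\alpha + \frac{r}{\beta})}{\Gamma(\alpha)}$ for $\alpha + \frac{r}{\beta} \ge 0$

mean: $a + \theta\, \frac{\Gamma(\alpha + \frac{1}{\beta})}{\Gamma(\alpha)}$ for $\alpha + \frac{1}{\beta} \ge 0$

variance: $\theta^2 \left[ \frac{\Gamma(\alpha + \frac{2}{\beta})}{\Gamma(\alpha)} - \frac{\Gamma(\alpha + \frac{1}{\beta})^2}{\Gamma(\alpha)^2} \right]$ for $\alpha + \frac{2}{\beta} \ge 0$

entropy: $\ln \frac{\theta\,\Gamma(\alpha)}{|\beta|} + \alpha + \left( \frac{1}{\beta} - \alpha \right) \psi(\alpha)$ [6]

Here, cdf is the cumulative distribution function, $Q(\alpha, x) = \Gamma(\alpha, x)/\Gamma(\alpha)$ is the regularized gamma function [1], $\Gamma(\alpha, x) = \int_x^{\infty} t^{\alpha-1} e^{-t}\, dt$ is the incomplete gamma integral [1], and $\psi(x) = \frac{d}{dx} \ln \Gamma(x)$ is the digamma function [1], the logarithmic derivative of the gamma function. Important special cases and limits include $\Gamma(\frac{1}{2}) = \sqrt{\pi}$, $\Gamma(\frac{1}{2}, x) = \sqrt{\pi}\, \mathrm{erfc}(\sqrt{x})$ and $\Gamma(1, x) = \exp(-x)$. The derivative of the regularized gamma function is $\frac{d}{dx} Q(\alpha, x) = -\frac{1}{\Gamma(\alpha)} x^{\alpha-1} e^{-x}$.

The profile of the Amoroso distribution is bell shaped for αβ ≥ 1, and otherwise L- or J-shaped with the mode at the boundary. The moments are undefined if the side conditions are not satisfied. Expressions for skew and kurtosis are not simple, but can be deduced from the moments if necessary.

The Amoroso distribution can be obtained from the standard gamma distribution (6) with the change of variables, $x \mapsto \left( \frac{x-a}{\theta} \right)^{\beta}$. Therefore, Amoroso random numbers can be obtained by sampling from the standard gamma distribution, for instance using the Marsaglia-Tsang fast gamma method [29] and applying the appropriate transformation [26].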
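The moment formulas above translate into a few lines of code. This is a minimal sketch with hypothetical function names; it uses the strict guard α + r/β > 0 (an assumption made here, since the gamma function diverges at zero) and returns NaN when a moment is undefined.

```python
import math

def amoroso_raw_moment(r, alpha, beta):
    """E[X^r] for a = 0, theta = 1: Gamma(alpha + r/beta) / Gamma(alpha)."""
    s = alpha + r / beta
    if s <= 0:
        return math.nan  # side condition violated: moment undefined
    return math.exp(math.lgamma(s) - math.lgamma(alpha))

def amoroso_mean_var(a, theta, alpha, beta):
    """Mean and variance built from the first two standard moments."""
    m1 = amoroso_raw_moment(1, alpha, beta)
    m2 = amoroso_raw_moment(2, alpha, beta)
    return a + theta * m1, theta ** 2 * (m2 - m1 ** 2)

# Gamma(theta = 2, alpha = 3) is Amoroso(0, 2, 3, 1): mean alpha*theta, variance alpha*theta^2.
gamma_mean, gamma_var = amoroso_mean_var(0.0, 2.0, 3.0, 1.0)
# Rayleigh(sigma = 1) is Amoroso(0, sqrt(2), 1, 2): mean sqrt(pi/2).
rayleigh_mean, _ = amoroso_mean_var(0.0, math.sqrt(2.0), 1.0, 2.0)
print(gamma_mean, gamma_var, rayleigh_mean)
```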

2. Log-gamma distributions

The log-gamma (Coale-McNeil, gamma-exponential) distribution [4, 39, 23] is a three parameter, continuous, univariate, unimodal probability density with infinite range. The functional form in the most straightforward parameterization is

$$\mathrm{LogGamma}(x \mid \nu, \lambda, \alpha) = \frac{1}{\Gamma(\alpha)|\lambda|} \exp\left\{ \alpha\, \frac{x-\nu}{\lambda} - \exp\left( \frac{x-\nu}{\lambda} \right) \right\} \tag{34}$$

for $x, \nu, \lambda, \alpha$ in $\mathbb{R}$, $\alpha > 0$, support $-\infty \le x \le \infty$.

The three real parameters consist of a location parameter ν, a scale parameter λ, and a shape parameter α, which is inherited directly from the Amoroso distribution. The name “log-gamma” arises because the standard log-gamma distribution is the logarithmic transform of the standard gamma distribution

$$\mathrm{StdLogGamma}(\alpha) \sim \ln \mathrm{StdGamma}(\alpha)$$

$$\mathrm{LogGamma}(\nu, \lambda, \alpha) \sim \ln \mathrm{Amoroso}(0, e^{\nu}, \alpha, \tfrac{1}{\lambda})$$

Note that this naming convention is the opposite of that used for the log-normal distribution (42). The name “log-gamma” has also been used for the antilog transform of the generalized gamma distribution, which leads to the unit-gamma distribution [18]. The log-gamma distribution is a limit of the Amoroso distribution (1), and itself has a number of important limits and special cases (Table 2), which we will discuss below.

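$$\mathrm{LogGamma}(x \mid \nu, \lambda, \alpha) = \lim_{\beta\to\infty} \frac{1}{\Gamma(\alpha)|\lambda|} \left( 1 + \frac{1}{\beta} \frac{x-\nu}{\lambda} \right)^{\alpha\beta-1} \exp\left\{ -\left( 1 + \frac{1}{\beta} \frac{x-\nu}{\lambda} \right)^{\beta} \right\} \tag{35}$$
$$= \lim_{\beta\to\infty} \mathrm{Amoroso}(x \mid \nu - \beta\lambda, \beta\lambda, \alpha, \beta)$$

Recall that $\lim_{\beta\to\infty} \left( 1 + \frac{x}{\beta} \right)^{\beta} = \exp(x)$.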
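The limit (35) can be observed numerically by evaluating the Amoroso density at increasing β. The sketch below is illustrative only: the function names, the test points and the parameter values ν = 0.3, λ = 1.5, α = 2 are arbitrary choices made here.

```python
import math

def amoroso_pdf(x, a, theta, alpha, beta):
    """General density (1), evaluated in log space."""
    z = (x - a) / theta
    if z <= 0:
        return 0.0
    return math.exp(math.log(abs(beta / theta)) - math.lgamma(alpha)
                    + (alpha * beta - 1.0) * math.log(z) - z ** beta)

def log_gamma_pdf(x, nu, lam, alpha):
    """Log-gamma density (34)."""
    z = (x - nu) / lam
    return math.exp(alpha * z - math.exp(z) - math.lgamma(alpha)) / abs(lam)

nu, lam, alpha = 0.3, 1.5, 2.0   # arbitrary test parameters
xs = (-2.0, 0.0, 1.0, 2.5)
errors = {}
for beta in (1e2, 1e4, 1e6):
    # Parameters of the limiting sequence in (35): a = nu - beta*lam, theta = beta*lam.
    errors[beta] = max(
        abs(amoroso_pdf(x, nu - beta * lam, beta * lam, alpha, beta)
            - log_gamma_pdf(x, nu, lam, alpha))
        for x in xs
    )
print(errors)  # the discrepancy should shrink roughly in proportion to 1/beta
```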

2.1. Special cases

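Standard log-gamma distribution:

$$\mathrm{StdLogGamma}(x \mid \alpha) = \frac{1}{\Gamma(\alpha)} \exp\left\{ \alpha x - \exp(x) \right\} \tag{36}$$
$$= \mathrm{LogGamma}(x \mid 0, 1, \alpha)$$

The log-gamma distribution with zero location and unit scale.

Table 2. The log-gamma family of distributions.

$$\mathrm{LogGamma}(x \mid \nu, \lambda, \alpha) = \frac{1}{\Gamma(\alpha)|\lambda|} \exp\left\{ \alpha\, \frac{x-\nu}{\lambda} - \exp\left( \frac{x-\nu}{\lambda} \right) \right\}$$

for x, ν, λ, α in R, α > 0, support −∞ ≤ x ≤ ∞. A dot marks a parameter that is left free.

Eq.    Distribution            ν       λ     α
(34)   log-gamma               ν       λ     α
(36)   standard log-gamma      0       1     α
(37)   log-chi-square          ln 2    1     k/2
(38)   generalized Gumbel      .       .     n
(39)   Gumbel                  .       .     1
(41)   BHP                     .       .     π/2
(40)   standard Gumbel         0       -1    1

Limits
(43)   normal                  .       .     α     lim α→∞

Log-chi-square distribution [28]:

$$\mathrm{LogChiSqr}(x \mid k) = \frac{1}{2^{k/2}\,\Gamma(\frac{k}{2})} \exp\left\{ \frac{k}{2} x - \frac{1}{2} \exp(x) \right\} \tag{37}$$
for positive integer k
$$= \mathrm{LogGamma}(x \mid \ln 2, 1, \tfrac{k}{2})$$

The logarithmic transform of the chi-square distribution (12).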

Generalized Gumbel distribution [17, 23]:

$$\mathrm{GenGumbel}(x \mid u, \bar\lambda, n) = \frac{n^n}{\Gamma(n)|\bar\lambda|} \exp\left\{ -n\, \frac{x-u}{\bar\lambda} - n \exp\left( -\frac{x-u}{\bar\lambda} \right) \right\} \tag{38}$$
for positive integer n
$$= \mathrm{LogGamma}(x \mid u + \bar\lambda \ln n, -\bar\lambda, n)$$

The limiting distribution of the nth largest value of a large number of unbounded identically distributed random variables whose probability distribution has an exponentially decaying tail.

Gumbel (Fisher-Tippett type I, Fisher-Tippett-Gumbel, FTG, Gumbel-Fisher-Tippett, log-Weibull, extreme value (type I), doubly exponential, double exponential) distribution [12, 17, 23]:

$$\mathrm{Gumbel}(x \mid u, \bar\lambda) = \frac{1}{|\bar\lambda|} \exp\left\{ -\frac{x-u}{\bar\lambda} - \exp\left( -\frac{x-u}{\bar\lambda} \right) \right\} \tag{39}$$
$$= \mathrm{LogGamma}(x \mid u, -\bar\lambda, 1)$$

Fig 8. LogGamma(x|0, 1, α), plotted for α = 1, 2, 3, 4, 5.

This is the asymptotic extreme value distribution for variables of “exponential type”, unbounded with finite moments [17]. With positive scale λ̄ > 0, this is an extreme value distribution of the maximum, with negative scale λ̄ < 0 (λ > 0) an extreme value distribution of the minimum. Note that the Gumbel is sometimes defined with the negative of the scale used here. Note that the term “double exponential distribution” can refer to either the Gumbel or Laplace [23] distributions. The Gompertz distribution is a truncated Gumbel distribution [23].

Standard Gumbel (Gumbel) distribution [17]:

$$\mathrm{StdGumbel}(x) = \exp\left\{ -x - e^{-x} \right\} \tag{40}$$
$$= \mathrm{LogGamma}(x \mid 0, -1, 1)$$

The Gumbel distribution with zero location and a unit scale.

BHP (Bramwell-Holdsworth-Pinton) distribution [5]:

$$\mathrm{BHP}(x \mid \nu, \lambda) = \frac{1}{\Gamma(\frac{\pi}{2})|\lambda|} \exp\left\{ \frac{\pi}{2}\, \frac{x-\nu}{\lambda} - \exp\left( \frac{x-\nu}{\lambda} \right) \right\} \tag{41}$$
$$= \mathrm{LogGamma}(x \mid \nu, \lambda, \tfrac{\pi}{2})$$

Proposed as a model of rare fluctuations in turbulence and other correlated systems.

2.2. Properties

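support: $-\infty \le x \le +\infty$

cdf: $1 - Q\left( \alpha, e^{\frac{x-\nu}{\lambda}} \right)$ for $\lambda > 0$; $Q\left( \alpha, e^{\frac{x-\nu}{\lambda}} \right)$ for $\lambda < 0$

cgf: $\nu t + \ln \frac{\Gamma(\alpha + \lambda t)}{\Gamma(\alpha)}$ [23]

mode: $\nu + \lambda \ln \alpha$

mean: $\nu + \lambda \psi(\alpha)$

variance: $\lambda^2 \psi_1(\alpha)$

skew: $\mathrm{sgn}(\lambda)\, \psi_2(\alpha) / \psi_1(\alpha)^{3/2}$

kurtosis: $\psi_3(\alpha) / \psi_1(\alpha)^2$

entropy: $\ln \Gamma(\alpha)|\lambda| - \alpha \psi(\alpha) + \alpha$

Here, cdf is the cumulative distribution function, cgf the cumulant generating function, $\ln E[\exp(tX)]$, $\mathrm{sgn}(z)$ is the sign function, $\psi_n(z) = \frac{d^{n+1}}{dz^{n+1}} \ln \Gamma(z)$ is the polygamma function and $\psi(z) \equiv \psi_0(z)$ is the digamma function. The mode follows from setting the derivative of the exponent in (34) to zero, which gives $\exp\left( \frac{x-\nu}{\lambda} \right) = \alpha$.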

3. Miscellaneous limits

Log-normal (Galton, Galton-McAlister, antilog-normal, logarithmic-normal, logarithmico-normal, Cobb-Douglas, Λ) distribution [14, 31, 22]:

$$\mathrm{LogNormal}(x \mid a, \vartheta, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}\,|\vartheta|} \left( \frac{x-a}{\vartheta} \right)^{-1} \exp\left\{ -\frac{1}{2\sigma^2} \left( \ln \frac{x-a}{\vartheta} \right)^{2} \right\} \tag{42}$$
$$= \lim_{\beta\to 0} \mathrm{Amoroso}(x \mid a, \vartheta(\beta\sigma)^{2/\beta}, 1/(\beta\sigma)^2, \beta)$$

The log-normal distribution is a limiting form of the gamma family. To see this, make the requisite substitutions,

$$\mathrm{Amoroso}(x \mid a, \vartheta(\beta\sigma)^{2/\beta}, 1/(\beta\sigma)^2, \beta) \propto \left( \frac{x-a}{\vartheta} \right)^{-1} \exp\left\{ \frac{1}{\beta\sigma^2} \ln \frac{x-a}{\vartheta} - \frac{1}{\beta^2\sigma^2} \exp\left( \beta \ln \frac{x-a}{\vartheta} \right) \right\}$$

and in the limit β → 0 expand the second exponential to second order in β. With a = 0, ϑ = 1, σ = 1 we obtain the standard log-normal (Gibrat) distribution [16]. The two-parameter log-normal distribution (a = 0) arises from the multiplicative version of the central limit theorem: when the sum of independent random variables limits to normal, the product of those random variables limits to log-normal. The log-normal distribution maps to the normal distribution with the transformation $x \mapsto \exp(x)$.

$$\mathrm{LogNormal}(a, \vartheta, \sigma) \sim \exp\left( \mathrm{Normal}(\ln \vartheta, \sigma) \right) + a$$

Normal (Gauss, Gaussian, bell curve, Laplace-Gauss, de Moivre, error, Laplace’s second law of error, law of error) distribution [7, 22]:

$$\mathrm{Normal}(x \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\} \tag{43}$$
$$= \lim_{\alpha\to\infty} \mathrm{Amoroso}(x \mid \mu - \sigma\sqrt{\alpha}, \sigma/\sqrt{\alpha}, \alpha, 1)$$
$$= \lim_{\alpha\to\infty} \mathrm{LogGamma}(x \mid \mu - \sigma\sqrt{\alpha} \ln \alpha, \sigma\sqrt{\alpha}, \alpha)$$

With µ = 0 and σ = 1/(√2 h) we obtain the error function distribution, and with µ = 0 and σ = 1 we obtain the standard normal (Φ, z, unit normal) distribution. In the limit that σ → ∞ we obtain an unbounded uniform (flat) distribution, and in the limit σ → 0 we obtain a delta (degenerate) distribution. The normal distribution is a limit of the Amoroso [22] and log-gamma distributions [39]. For Amoroso, make the requisite substitutions,

$$\mathrm{Amoroso}(x \mid \mu - \sigma\sqrt{\alpha}, \sigma/\sqrt{\alpha}, \alpha, 1) \propto \exp\left\{ -\sqrt{\alpha}\, \frac{x-\mu}{\sigma} + (\alpha - 1) \ln\left( 1 + \frac{1}{\sqrt{\alpha}} \frac{x-\mu}{\sigma} \right) \right\}$$

and expand the logarithm as $\ln(1 + x) = x - \frac{x^2}{2} + O(x^3)$.

Power-law (Pearson type XI, fractal) distribution [38]:

$$\mathrm{PowerLaw}(x \mid p) \propto \frac{1}{(x-a)^p} \tag{44}$$
$$= \lim_{\alpha\to\infty} \mathrm{Amoroso}(x \mid a, \theta, \alpha, (1-p)/\alpha)$$

Improper (unnormalizable) power law distributions are obtained as a limit of the gamma distribution family. If p = 0 we obtain the half-uniform distribution over the positive numbers; if p = 1 we obtain Jeffreys distribution [21], used as an uninformative prior in Bayesian probability [20].
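The normal limit (43) of the Amoroso distribution can also be checked numerically. The following sketch compares the two densities at a few points for increasing α; the function names and the values µ = 1, σ = 2 are illustrative assumptions, and the log-space evaluation is used because Γ(α) overflows for large α.

```python
import math

def amoroso_pdf(x, a, theta, alpha, beta):
    """General density (1), evaluated in log space."""
    z = (x - a) / theta
    if z <= 0:
        return 0.0
    return math.exp(math.log(abs(beta / theta)) - math.lgamma(alpha)
                    + (alpha * beta - 1.0) * math.log(z) - z ** beta)

def normal_pdf(x, mu, sigma):
    """Normal density (43)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

mu, sigma = 1.0, 2.0   # arbitrary test parameters
xs = [mu + sigma * t for t in (-1.5, 0.0, 0.7, 2.0)]
errors = {}
for alpha in (1e2, 1e4, 1e6):
    a = mu - sigma * math.sqrt(alpha)     # location of the limiting sequence
    theta = sigma / math.sqrt(alpha)      # scale of the limiting sequence
    errors[alpha] = max(
        abs(amoroso_pdf(x, a, theta, alpha, 1.0) - normal_pdf(x, mu, sigma))
        for x in xs
    )
print(errors)  # the discrepancy should shrink as alpha grows
```

The convergence is slow, roughly like 1/√α, which matches the skewness 2/√α of a gamma variable.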

Acknowledgments: I am grateful to David Sivak, Edward E. Ayoub and Francis J. O’Brien for spotting various typos and errors, and, as always, to Avery Brooks for many insightful observations. In curating this collection of distributions, I have benefited mightily from Johnson, Kotz, and Balakrishnan’s monumental compendiums [22, 23], Eric Weisstein’s MathWorld, and the myriad pseudo-anonymous contributors to Wikipedia. An extended enumeration of simple probability distribution families can be found at threeplusone.com/gud.

Appendix A: Index of distributions

generalized-X The only consistent meaning is that distribution “X” is a special case of the distribution “generalized-X”. In practice, often means “add a shape parameter”.

standard-X The distribution “X” with the location parameter set to 0, scale to 1, and often the Weibull shape parameter β to 1. Not to be confused with standardized, which generally indicates zero mean and unit variance.

shifted-X (or translated) A distribution with an additional location parameter.

scaled-X (or scale-X) A distribution with an additional scale parameter.

inverse-X (Occasionally inverted-X, reciprocal-X, or negative-X) Generally labels the transformed distribution with x ↦ 1/x, or more generally the distribution with the Weibull shape parameter negated, β → −β. An exception is the inverse normal distribution [22].

log-X If x follows distribution X then either y = e^x (e.g. log-normal) or y = ln x (e.g. log-gamma) follows log-X. This ambiguity arises because although the second convention is more logical, the log-normal convention has historical precedence.

reversed-X (Occasionally negative-X) The scale is negated.

X of the Nth kind See “X type N”.

Distribution Synonym or Equation

χ ...... chi
χ² ...... chi-square
Γ ...... gamma
Λ ...... log-normal
Φ ...... standard normal
antilog-normal ...... log-normal
Amoroso ...... (1)
bell curve ...... normal
BHP ...... (41)
Bramwell-Holdsworth-Pinton ...... BHP
chi ...... (14)
chi-square ...... (12)
Coale-McNeil ...... generalized log-gamma
Cobb-Douglas ...... log-normal
de Moivre ...... normal
degenerate ...... delta
delta ...... See normal (43)
doubly exponential ...... Gumbel
double exponential ...... Gumbel or Laplace
Erlang ...... See gamma (5)
error ...... normal
error function ...... See normal (43)
exponential ...... (8)
extreme value ...... Gumbel
extreme value type N ...... Fisher-Tippett type N
Fisher-Tippett ...... (29)
Fisher-Tippett type I ...... Gumbel
Fisher-Tippett type II ...... Fréchet
Fisher-Tippett type III ...... Weibull
Fisher-Tippett-Gumbel ...... Gumbel
fractal ...... power law
flat ...... uniform
Fréchet ...... (33)
FTG ...... Fisher-Tippett-Gumbel
Galton ...... log-normal
Galton-McAlister ...... log-normal
gamma ...... (5)
gamma-exponential ...... log-gamma
Gaussian ...... normal
Gauss ...... normal
generalized gamma ...... Stacy or Amoroso
generalized inverse gamma ...... See Stacy (2)
generalized Gumbel ...... (38)
generalized extreme value ...... Fisher-Tippett
generalized Fisher-Tippett ...... (28)
generalized Fréchet ...... (32)
generalized inverse gamma ...... generalized gamma
generalized normal ...... Nakagami
generalized Rayleigh ...... scaled chi
generalized semi-normal ...... Stacy
generalized Weibull ...... (30)
GEV ...... generalized extreme value
Gibrat ...... standard log-normal
Gumbel ...... (39)
Gumbel-Fisher-Tippett ...... Gumbel
Gumbel type N ...... Fisher-Tippett type N
half-normal ...... (11)
half-uniform ...... See power law (44)
hydrograph ...... Stacy
hyper gamma ...... Stacy
inverse chi ...... (26)
inverse chi-square ...... (24)
inverse exponential ...... (21)
inverse gamma ...... (20)
inverse Rayleigh ...... (27)
inverse Weibull ...... Fréchet
Jeffreys ...... See power law (44)
Laplace’s second law of error ...... normal
Laplace-Gauss ...... normal
law of error ...... normal
Leonard hydrograph ...... Stacy
Lévy ...... (22)
log-chi-square ...... (37)
log-gamma ...... (34)
log-normal ...... (42)
log-normal, two parameter ...... See log-normal (42)
log-Weibull ...... Gumbel
logarithmic-normal ...... log-normal
logarithmico-normal ...... log-normal
Maxwell ...... (17)
Maxwell-Boltzmann ...... Maxwell
Maxwell speed ...... Maxwell
m-Erlang ...... Erlang
Nakagami ...... (10)
Nakagami-m ...... Nakagami
negative exponential ...... exponential
normal ...... (43)
Nukiyama-Tanasawa ...... generalized gamma
one-sided normal ...... half-normal
Pearson type III ...... (7)
Pearson type V ...... (19)
Pearson type X ...... exponential
Pearson type XI ...... power law
positive definite normal ...... half-normal
power law ...... (44)
pseudo-Weibull ...... (4)
Rayleigh ...... (16)
Rosin-Rammler ...... Weibull
Rosin-Rammler-Weibull ...... Weibull
scaled chi ...... (15)
scaled chi-square ...... (13)
scaled inverse chi ...... (25)
scaled inverse chi-square ...... (23)
semi-normal ...... half-normal
shifted exponential ...... (9)
Stacy ...... (2)
Stacy-Mihram ...... Amoroso
standard Amoroso ...... standard gamma
standard exponential ...... See exponential (8)
standard gamma ...... (6)
standard Gumbel ...... (40)
standard log-gamma ...... (36)
standard log-normal ...... See log-normal (42)
standard normal ...... See normal (43)
stretched exponential ...... (3)
transformed gamma ...... Stacy
uniform ...... See normal (43)
unit normal ...... standard normal
van der Waals profile ...... Lévy
Vienna ...... Wien
Vinci ...... inverse gamma
von Mises extreme value ...... Fisher-Tippett
von Mises-Jenkinson ...... Fisher-Tippett
waiting time ...... exponential
Weibull ...... (31)
Weibull-Gnedenko ...... Weibull
Wien ...... See gamma (5)
Wilson-Hilferty ...... (18)
z ...... standard normal

References

[1] Abramowitz, M. and Stegun, I. A. (1965). Handbook of mathematical functions with formulas, graphs, and mathematical tables. Dover, New York.
[2] Amoroso, L. (1925). Ricerche intorno alla curva dei redditi. Ann. Mat. Pura Appl. 21, 123–159.
[3] Barndorff-Nielsen, O. (1963). On the limit behaviour of extreme order statistics. Ann. Math. Statist. 34, 992–1002.
[4] Bartlett, M. S. and Kendall, M. G. (1946). The statistical analysis of variance-heterogeneity and the logarithmic transformation. J. Roy. Statist. Soc. Suppl. 8, 1, 128–138.
[5] Bramwell, S. T., Holdsworth, P. C. W., and Pinton, J.-F. (1998). Universality of rare fluctuations in turbulence and critical phenomena. Nature 396, 552–554.
[6] Dadpay, A., Soofi, E. S., and Soyer, R. (2007). Information measures for generalized gamma family. J. Econometrics 138, 568–585.
[7] de Moivre, A. (1738). The doctrine of chances, 2nd ed. Woodfall, London.
[8] Erlang, A. K. (1909). The theory of probabilities and telephone conversations. Nyt Tidsskrift for Matematik B 20, 33–39.
[9] Evans, M., Hastings, N., and Peacock, J. B. (2000). Statistical distributions, 3rd ed. Wiley, New York.
[10] Feller, W. (1971). An introduction to probability theory and its applications, 2nd ed. Vol. 2. Wiley, New York.
[11] Fisher, R. A. (1924). On a distribution yielding the error functions of several well known statistics. In Proceedings of the International Congress of Mathematics, Toronto. Vol. 2. 805–813.
[12] Fisher, R. A. and Tippett, L. H. C. (1928). Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proc. Cambridge Philos. Soc. 24, 180–190.
[13] Fréchet, M. (1927). Sur la loi de probabilité de l’écart maximum. Ann. Soc. Polon. Math. 6, 93–116.
[14] Galton, F. (1879). The geometric mean, in vital and social statistics. Proc. R. Soc. Lond. 29, 367–376.
[15] Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (2004). Bayesian data analysis, 2nd ed. Chapman & Hall/CRC, New York.
[16] Gibrat, R. (1931). Les inégalités économiques. Librairie du Recueil Sirey, Paris.
[17] Gumbel, E. J. (1958). Statistics of extremes. Columbia University Press, New York.
[18] Gupta, A. K. and Nadarajah, S., Eds. (2004). Handbook of beta distribution and its applications. Marcel Dekker, New York.
[19] Hawkins, D. M. and Wixley, R. A. J. (1986). A note on the transformation of chi-squared variables to normality. Amer. Statistician 40, 296–298.
[20] Jaynes, E. T. (2003). Probability theory: The logic of science. Cambridge University Press, Cambridge.
[21] Jeffreys, H. (1948). Theory of probability, 2nd ed. Clarendon Press, Oxford.
[22] Johnson, N. L., Kotz, S., and Balakrishnan, N. (1994). Continuous univariate distributions, 2nd ed. Vol. 1. Wiley, New York.
[23] Johnson, N. L., Kotz, S., and Balakrishnan, N. (1995). Continuous univariate distributions, 2nd ed. Vol. 2. Wiley, New York.
[24] Justus, C. G., Hargraves, W. R., Mikhail, A., and Graberet, D. (1978). Methods for estimating wind speed frequency distributions. J. Appl. Meteorology 17, 3, 350–353.
[25] Kleiber, C. and Kotz, S. (2003). Statistical size distributions in economics and actuarial sciences. Wiley, New York.
[26] Knuth, D. E. (1997). Art of computer programming, volume 2: Seminumerical algorithms, 3rd ed. Addison-Wesley, New York.
[27] Laherrère, J. and Sornette, D. (1998). Stretched exponential distributions in nature and economy: “fat tails” with characteristic scales. Eur. Phys. J. B 2, 525–539.
[28] Lee, P. M. (2012). Bayesian Statistics: An Introduction, 4th ed. Wiley, New York.
[29] Marsaglia, G. and Tsang, W. W. (2001). A simple method for generating gamma variables. ACM Trans. Math. Soft. 26, 3, 363–372.
[30] Maxwell, J. C. (1860). Illustrations of the dynamical theory of gases. Part 1. On the motion and collision of perfectly elastic spheres. Phil. Mag. 19,

19–32. [31] McAlister, D. (1879). The law of the geometric mean. Proc. R. Soc. Lond. 29, 367–376. [32] McDonald, J. B. (1984). Some generalized functions for the size distribution of income. Econometrica 52, 3, 647–663. [33] Miller, K. S. (1964). Multidimensional Gaussian distributions. Wiley, New York. [34] Nakagami, M. (1960). The m-distribution – a general formula of intensity distribution of rapid fading. In Statistical methods in radio wave propagation: Proceedings of a symposium held June 18-20, 1958, W. C. Hoﬀman, Ed. Perg- amon, New York, 3–36. [35] Pearson, K. (1893). Contributions to the mathematical theory of evolu- tion. Philos. Trans. R. Soc. A 54, 329–333. [36] Pearson, K. (1895). Contributions to the mathematical theory of evo- lution - II. Skew variation in homogeneous material. Philos. Trans. R. Soc. A 186, 343–414. [37] Pearson, K. (1901). Mathematical contributions to the theory of evolu- tion. X. Supplement to a memoir on skew variation. Philos. Trans. R. Soc. A 197, 443–459. [38] Pearson, K. (1916). Mathematical contributions to the theory of evolu- tion. XIX. Second supplement to a memoir on skew variation. Philos. Trans. R. Soc. A 216, 429–457. [39] Prentice, R. L. (1974). A log gamma model and its maximum likelihood estimation. Biometrika 61, 539–544. [40] Smirnov, N. V. (1949). Limit distributions for the terms of a variational series. Trudy Mat. Inst. Steklov. 25, 3–60. [41] Stacy, E. W. (1962). A generalization of the gamma distribution. Ann. Math. Statist. 33, 3, 1187–1192. [42] Strutt, J. W. (1880). On the resultant of a large number of vibrations of the same pitch and of arbitrary phase. Phil. Mag. 10, 73–78. [43] Voda,ˇ V. G. (1972). On the inverse Rayleigh random variable. Rep. Statist. Appl. Res., JUSE 19, 13–21. [44] Voda,ˇ V. G. (1989). New models in durability tool-testing: pseudo- Weibull distribution. Kybernetika 25, 3, 209–215. [45] von Mises, R. (1936). La distribution de la plus grande de n valeurs. Rev. Math. 
Union Interbalcanique 1, 141–160. [46] Weibull, W. (1951). A statistical distribution function of wide applica- bility. J. Appl. Mech. 18, 293–297. [47] Wilson, E. B. and Hilferty, M. M. (1931). The distribution of chi- square. Proc. Natl. Acad. Sci. U.S.A. 17, 684–688.