Triangular, Gamma, Erlang, Weibull Distributions

IV. Triangular Distribution

Known values are the minimum (a), the mode (b, the most likely value of the pdf), and the maximum (c).

[Figure: triangular probability density function f(x) on a ≤ x ≤ c, with peak height h = 2/(c - a) at x = b; the area under the curve is 1.]

The pdf is given by

$$
f(x) =
\begin{cases}
\dfrac{2(x-a)}{(c-a)(b-a)} & \text{for } a \le x \le b \quad \left(\text{slope} = \dfrac{h}{b-a}\right) \\[2ex]
\dfrac{-2(x-c)}{(c-a)(c-b)} & \text{for } b \le x \le c \quad \left(\text{slope} = \dfrac{-h}{c-b}\right) \\[2ex]
0 & \text{otherwise}
\end{cases}
$$

The expected value is given by

$$
E(X) = \int_{-\infty}^{\infty} x\, f(x)\,dx
= \int_a^b \frac{2(x-a)}{(c-a)(b-a)}\, x\,dx + \int_b^c \frac{2(c-x)}{(c-a)(c-b)}\, x\,dx
= \frac{a+b+c}{3}
$$

The derivation is fairly tedious; with a little work it can be shown that

$$
E(X) = h\left[\frac{2b^3 - 3ab^2 + a^3}{6(b-a)} + \frac{c^3 - 3cb^2 + 2b^3}{6(c-b)}\right]
= \frac{a^3(c-b) + b^3(a-c) + c^3(b-a)}{3(c-a)(b-a)(c-b)}
= \frac{(a+b+c)(c-a)(b-a)(c-b)}{3(c-a)(b-a)(c-b)}
= \frac{a+b+c}{3}
$$

Remark: For a discrete sample, the measures of centrality typically determined are the mean, the mode, and the median. The mean is the average value of the sample and corresponds to E(X). The mode corresponds to the maximum value of the pdf. When working with a sample, it is necessary to resort to a histogram (which can be tricky) to estimate the mode of the underlying pdf. The median simply corresponds to that point at which half of the area under the curve is to the left and half is to the right.

The triangular distribution is typically employed when not much is known about the distribution, but the minimum, mode, and maximum can be estimated.

Sampling from the triangular distribution requires solving

$$
x = \int_{-\infty}^{r_{\text{sample}}} f(z)\,dz
$$

for rsample given a random probability x.
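As a numerical illustration of the formulas above, the following Python sketch evaluates the pdf, approximates the area and the mean with a midpoint rule, and solves the sampling equation for rsample by bisection. The function names, the midpoint rule, the bisection tolerance, and the test values a = 1, b = 2, c = 4 are assumptions made for this sketch and are not part of the original notes.

```python
def triangular_pdf(x, a, b, c):
    """Triangular pdf with minimum a, mode b, maximum c (assumes a < b < c)."""
    if x < a or x > c:
        return 0.0
    if x <= b:
        return 2.0 * (x - a) / ((c - a) * (b - a))
    return 2.0 * (c - x) / ((c - a) * (c - b))


def midpoint_integral(g, lo, hi, n=2000):
    """Approximate the integral of g over [lo, hi] with n midpoint panels."""
    if hi <= lo:
        return 0.0
    width = (hi - lo) / n
    return width * sum(g(lo + (i + 0.5) * width) for i in range(n))


def triangular_mean(a, b, c):
    """Approximate E(X) as the integral of x * f(x) over [a, c]."""
    return midpoint_integral(lambda x: x * triangular_pdf(x, a, b, c), a, c)


def sample_by_bisection(x, a, b, c, tol=1e-7):
    """Solve integral_a^r f(z) dz = x for r, given a probability 0 <= x <= 1."""
    lo, hi = a, c
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        area = midpoint_integral(lambda z: triangular_pdf(z, a, b, c), a, mid)
        if area < x:
            lo = mid   # too little area to the left: move right
        else:
            hi = mid   # enough area to the left: move left
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    a, b, c = 1.0, 2.0, 4.0
    print("area  :", midpoint_integral(lambda x: triangular_pdf(x, a, b, c), a, c))
    print("mean  :", triangular_mean(a, b, c), "vs (a+b+c)/3 =", (a + b + c) / 3)
    print("x=0.5 :", sample_by_bisection(0.5, a, b, c))
```

Bisection is used here only because it needs nothing beyond the pdf itself; it is far slower than a closed-form inverse and is meant as a check, not as a practical sampler.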
Since f(z) is piecewise continuous, its distribution function F(t) is given by

$$
F(t) = \int_{-\infty}^{t} f(z)\,dz =
\begin{cases}
0 & \text{for } t \le a \\[1ex]
\displaystyle\int_a^t f(z)\,dz & \text{for } a < t \le b \\[2ex]
1 - \displaystyle\int_t^c f(z)\,dz & \text{for } b \le t < c \\[2ex]
1 & \text{for } t \ge c
\end{cases}
$$

Hence, for a ≤ rsample ≤ b we get

$$
\text{(A)}\qquad x = \int_a^{r_{\text{sample}}} f(z)\,dz
= \int_a^{r_{\text{sample}}} \frac{2(z-a)}{(b-a)(c-a)}\,dz
= \left.\frac{z^2 - 2az}{(b-a)(c-a)}\right|_a^{r_{\text{sample}}}
= \frac{(r_{\text{sample}} - a)^2}{(b-a)(c-a)}
$$

and for b ≤ rsample ≤ c, since

$$
\int_{r_{\text{sample}}}^{c} f(z)\,dz
= \int_{r_{\text{sample}}}^{c} \frac{2(c-z)}{(c-b)(c-a)}\,dz
= \left.\frac{2cz - z^2}{(c-b)(c-a)}\right|_{r_{\text{sample}}}^{c}
= \frac{(c - r_{\text{sample}})^2}{(c-b)(c-a)}
$$

we get

$$
\text{(B)}\qquad x = 1 - \frac{(c - r_{\text{sample}})^2}{(c-b)(c-a)}
$$

Since

$$
\int_a^b f(z)\,dz = \frac{b-a}{c-a}
$$

if the random probability x ≤ (b - a)/(c - a), then equation A is used to solve for rsample; otherwise equation B is used:

$$
\text{(A)}\qquad r_{\text{sample}} = a + \sqrt{(b-a)(c-a)\,x} \qquad \text{for } x \le \frac{b-a}{c-a}
$$

$$
\text{(B)}\qquad r_{\text{sample}} = c - \sqrt{(c-b)(c-a)(1-x)} \qquad \text{for } x > \frac{b-a}{c-a}
$$

[Figure: the sampling function rsample plotted against x for 0 ≤ x ≤ 1, rising from a at x = 0 through b at x = (b-a)/(c-a) to c at x = 1.]

Example (note: the median corresponds to x = 0.5): For a = 1, b = 2, c = 4

mean = (a + b + c)/3 = 2.333
mode = 2
median = $c - \sqrt{(c-b)(c-a)(0.5)} = 4 - \sqrt{3} = 2.268$

V. Gamma Distribution

A large number of useful functions are related to the exponential function. The gamma function is one of these. It traces back to 18th-century work by Euler, who was using interpolation methods to define n! for non-integral values (it was later dubbed the gamma function by Legendre in a series of books published between 1811 and 1826). The gamma function appears naturally in the study of anti-differentiation; it is also encountered in the context of differential equations when calculating Laplace transforms.

The gamma function is given by

$$
\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx \qquad (\alpha > 0)
$$

Integrating by parts, we get $\Gamma(\alpha) = (\alpha-1)\,\Gamma(\alpha-1)$ for $\alpha > 1$. Since

$$
\Gamma(1) = \int_0^{\infty} e^{-x}\,dx = 1
$$

it follows that when $\alpha$ is a positive integer, $\Gamma(\alpha) = (\alpha-1)!$

Hence, the gamma function is a generalization of the factorial, applying to all $\alpha > 0$, not just integers.
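The two properties just stated, the recurrence Γ(α) = (α - 1)Γ(α - 1) and Γ(α) = (α - 1)! for integer α, can be checked numerically. The Python sketch below approximates the defining integral with a midpoint rule truncated at a finite upper limit and compares the recurrence using `math.gamma` from the standard library. The function name `gamma_integral`, the truncation point, and the panel count are assumptions of this sketch, and the quadrature is only intended for α ≥ 1, where the integrand stays bounded at x = 0.

```python
import math


def gamma_integral(alpha, upper=60.0, n=100_000):
    """Midpoint-rule approximation of the integral of x^(alpha-1) * e^(-x)
    over [0, upper]; intended for alpha >= 1 (truncation at `upper` is assumed
    to discard only a negligible tail)."""
    width = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * width
        total += x ** (alpha - 1.0) * math.exp(-x)
    return width * total


if __name__ == "__main__":
    # Gamma(alpha) = (alpha - 1)! for integer alpha
    for n_int in range(1, 7):
        print(n_int, gamma_integral(n_int), math.factorial(n_int - 1))
    # Recurrence Gamma(alpha) = (alpha - 1) * Gamma(alpha - 1) for alpha > 1
    alpha = 3.5
    print(math.gamma(alpha), (alpha - 1.0) * math.gamma(alpha - 1.0))
```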
The gamma distribution is obtained from the gamma function by specifying the pdf

$$
f(x) =
\begin{cases}
k\,x^{\alpha-1} e^{-x/\beta} & \text{for } x > 0 \\
0 & \text{otherwise}
\end{cases}
\qquad \text{for fixed } \alpha > 0 \text{ and } \beta > 0
$$

where the proportionality constant k is chosen so that $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

k is easy to determine:

$$
1 = k\int_0^{\infty} x^{\alpha-1} e^{-x/\beta}\,dx
= k\,\beta^{\alpha}\int_0^{\infty} t^{\alpha-1} e^{-t}\,dt
= k\,\beta^{\alpha}\,\Gamma(\alpha)
\qquad \text{where } x = \beta t
$$

so

$$
k = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}
\qquad\text{and}\qquad
f(x) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}
$$

$\alpha$ is called the shape (or order) parameter; $\beta$ is called the scale parameter.

Note that when $\alpha = 1$, $f(x) = \frac{1}{\beta}\, e^{-x/\beta}$, which is the exponential distribution with mean $\beta$.

In general, $E(X) = \alpha\beta$ and $\sigma^2 = \alpha\beta^2$. Hence, if the mean and standard deviation can be estimated, then $\alpha$ and $\beta$ can also be determined.

Algorithm for calculating the natural logarithm of the gamma function

Attributed to Lanczos, C., Journal S.I.A.M. Numerical Analysis, ser. B, vol. 1, p. 86 (1964) and adapted from Numerical Recipes in C by Press, W.H., and B.P. Flannery, S.A. Teukolsky, W.T. Vetterling (Cambridge University Press, 1988).

    FUNCTION lngamma(z)
        /* Use the reflection formula for z < 1 */
        IF z < 1
            z ← 1 - z
            RETURN ln(πz) - (lngamma(1 + z) + ln(sin(πz)))
        ENDIF
        coeff ← 76.18009173, -86.50532033, 24.01409822,
                -1.231739516, 0.00120858003, -0.00000536382
        /* These values are the (approximate) coefficients for the first
           6 terms of an infinite series involved in an exact formulation
           for the gamma function credited to Lanczos. They yield an
           approximation for the variable "a" (determined below) which is
           within |ε| < 2 × 10^-10 of its true value */
        a ← 1
        FOR i ← 1 TO 6
            a ← a + coeff(i)/(i + z - 1)
        ENDFOR
        RETURN ln(a·sqrt(2π)) - (z + 4.5) + (z - 0.5)·ln(z + 4.5)
    END

[Figure: gamma pdf f(x) for fixed mean αβ = 5 and varying values of α and β, plotted for 0 ≤ x ≤ 10; curves shown for α = 0.5, β = 10; α = 1.5, β = 3.3333; α = 5, β = 1; α = 10, β = 0.5.]

[Figure: the corresponding distribution functions F(x) and sampling functions rsample.]

The gamma distribution is used to model waiting times or the time to complete a task.
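The relations E(X) = αβ and σ² = αβ² given above can be inverted to recover the parameters from an estimated mean and standard deviation: α = (mean/σ)² and β = σ²/mean. The Python sketch below applies this inversion and evaluates the pdf with k = 1/(β^α Γ(α)). The function names are hypothetical, and `math.gamma` from the standard library stands in for the Lanczos routine, which is an assumption of convenience rather than the method of these notes.

```python
import math


def gamma_params_from_moments(mean, std):
    """Invert E(X) = alpha*beta and sigma^2 = alpha*beta^2 for (alpha, beta)."""
    if mean <= 0 or std <= 0:
        raise ValueError("mean and std must be positive")
    alpha = (mean / std) ** 2
    beta = std ** 2 / mean
    return alpha, beta


def gamma_pdf(x, alpha, beta):
    """Gamma pdf k * x^(alpha-1) * e^(-x/beta), k = 1/(beta^alpha * Gamma(alpha))."""
    if x <= 0:
        return 0.0
    k = 1.0 / (beta ** alpha * math.gamma(alpha))
    return k * x ** (alpha - 1.0) * math.exp(-x / beta)


if __name__ == "__main__":
    # Mean 5 with standard deviation sqrt(5) corresponds to alpha = 5, beta = 1,
    # one of the parameter pairs in the fixed-mean figure.
    alpha, beta = gamma_params_from_moments(5.0, math.sqrt(5.0))
    print(alpha, beta)
    print(gamma_pdf(4.0, alpha, beta))
```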
More specifically, it can be shown that if we have exponentially distributed interarrival times with mean $1/\lambda$, the time needed to obtain k changes is distributed according to a gamma distribution with $\alpha = k$ and $\beta = 1/\lambda$.

Gamma Function:

$$
\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx \qquad (\alpha > 0)
$$

The general relationship $\Gamma(\alpha) = (\alpha-1)\,\Gamma(\alpha-1)$ for $\alpha > 1$ holds. It can also be shown that

$$
\Gamma(\alpha)\,\Gamma(1-\alpha) = \frac{\pi}{\sin(\pi\alpha)} \qquad \text{for } 0 < \alpha < 1
$$

(Note that in particular, this means that $\Gamma(0.5) = \sqrt{\pi}$.)

For $0 < \alpha < 1$, $1 + \alpha > 1$, so $\Gamma(1+\alpha) = \alpha\,\Gamma(\alpha)$. This in turn gives the reflection formula

$$
\Gamma(1-\alpha) = \frac{\pi\alpha}{\Gamma(1+\alpha)\,\sin(\pi\alpha)} \qquad \text{for } 0 < \alpha < 1
$$

Selected values computed according to the algorithm for ln(Γ(α)):

    Γ(0.25) ≈ 3.62560991
    Γ(0.5)  = √π ≈ 1.77245385
    Γ(0.75) ≈ 1.22541670
    Γ(1)    = 0! = 1
    Γ(1.25) ≈ 0.90640248
    Γ(1.5)  = 0.5·Γ(0.5) = √π/2 ≈ 0.88622693
    Γ(1.75) ≈ 0.91906253
    Γ(2)    = 1! = 1
    Γ(2.25) ≈ 1.13300310
    Γ(2.5)  = 1.5·Γ(1.5) = 3√π/4 ≈ 1.32934039
    Γ(2.75) ≈ 1.60835942
    Γ(3)    = 2! = 2
    Γ(3.25) ≈ 2.54925697
    Γ(3.5)  = 2.5·Γ(2.5) = 15√π/8 ≈ 3.32335097
    Γ(3.75) ≈ 4.42298841
    Γ(4)    = 3! = 6
    Γ(4.25) ≈ 8.28508514
    Γ(4.5)  = 3.5·Γ(3.5) = 105√π/16 ≈ 11.63172840
    Γ(4.75) ≈ 16.58620654
    Γ(5)    = 4! = 24
    Γ(5.25) ≈ 35.21161185
    Γ(5.5)  = 4.5·Γ(4.5) = 945√π/32 ≈ 52.34277778
    Γ(5.75) ≈ 78.78448105
    Γ(6)    = 5! = 120

[Figure: plot of Γ(α) for 0 < α ≤ 5, with the factorial values 0!, 1!, 2!, 3! marked at α = 1, 2, 3, 4.]

For the pdf of the gamma distribution

$$
f(x) =
\begin{cases}
k\,x^{\alpha-1} e^{-x/\beta} & \text{for } x > 0 \\
0 & \text{otherwise}
\end{cases}
\qquad \text{for fixed } \alpha > 0 \text{ and } \beta > 0
$$

note that if:

    α < 1  then x^(α-1) → ∞ as x → 0
    α = 1  then the distribution is the exponential distribution
    α > 1  then x^(α-1) → 0 as x → 0

The earlier example showed three basic shapes, each of which is described by the behavior of the derivative f'(x) (slope function) of f(x):
$$
f'(x) = k\left[(\alpha-1)\,x^{\alpha-2} e^{-x/\beta} + \left(-\frac{1}{\beta}\right) x^{\alpha-1} e^{-x/\beta}\right]
$$

There are actually 5 cases:

    α < 1      the slope → -∞ as x → 0, since each term is < 0 and each exponent of x is < 0
    α = 1      the slope → -1/β² as x → 0, in accord with the exponential distribution, since k = 1/β, term 1 is 0 and term 2 → -1/β
    1 < α < 2  the slope → +∞ as x → 0, since the lead term → +∞ and term 2 → 0
    α = 2      the slope → +1/β² as x → 0, since k = 1/(β^α Γ(α)) = 1/β², term 1 → 1 and term 2 → 0
    α > 2      the slope → 0 as x → 0

In each case the slope → 0 as x → +∞.

The gamma distribution is usually sampled by the accept-reject technique. That technique evaluates the pdf, so the constant k is needed, which in turn means the value of $\Gamma(\alpha)$ must be computed.
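Since the constant k depends on Γ(α), a direct Python transcription of the lngamma pseudocode given earlier is sketched below, together with a helper that computes k = 1/(β^α Γ(α)) in log space. The helper name `gamma_constant` and the guard against z ≤ 0 are assumptions added for this sketch; `math.lgamma` from the standard library is used only as a cross-check.

```python
import math

# Lanczos coefficients as listed in the notes (first six series terms).
COEFF = (76.18009173, -86.50532033, 24.01409822,
         -1.231739516, 0.00120858003, -0.00000536382)


def lngamma(z):
    """Natural log of Gamma(z) for z > 0, following the pseudocode."""
    if z <= 0:
        raise ValueError("this sketch assumes z > 0")
    if z < 1:
        # Reflection: Gamma(1 - a) = pi*a / (Gamma(1 + a) * sin(pi*a)), a = 1 - z
        z = 1.0 - z
        return math.log(math.pi * z) - (
            lngamma(1.0 + z) + math.log(math.sin(math.pi * z)))
    a = 1.0
    for i, coeff in enumerate(COEFF, start=1):
        a += coeff / (i + z - 1.0)
    return (math.log(a * math.sqrt(2.0 * math.pi))
            - (z + 4.5) + (z - 0.5) * math.log(z + 4.5))


def gamma_constant(alpha, beta):
    """k = 1 / (beta^alpha * Gamma(alpha)), evaluated via logarithms."""
    return math.exp(-(alpha * math.log(beta) + lngamma(alpha)))


if __name__ == "__main__":
    # Compare the transcription with the standard-library log-gamma.
    for z in (0.25, 0.5, 1.0, 2.5, 5.0):
        print(z, math.exp(lngamma(z)), math.exp(math.lgamma(z)))
    print("k for alpha = 2, beta = 1:", gamma_constant(2.0, 1.0))
```

Working with ln Γ(α) rather than Γ(α) itself keeps the computation of k from overflowing when α is large.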
