Lognormal and Beta Distributions

VIII. Lognormal Distribution

Data points t are said to be lognormally distributed if their natural logarithms, ln(t), are normally distributed with mean μ and standard deviation σ. In particular, if the normal distribution is sampled to get points $r_{\text{sample}}$, then the points $e^{r_{\text{sample}}}$ constitute sample values from the lognormal distribution. The pdf for the lognormal distribution is given by

$$f(x) = \frac{1}{x\sqrt{2\pi}\,\sigma}\, e^{-\frac{(\ln(x) - \mu)^2}{2\sigma^2}} \qquad (x > 0)$$

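since

$$\int_0^{\infty} \frac{1}{x\sqrt{2\pi}\,\sigma}\, e^{-\frac{(\ln(x) - \mu)^2}{2\sigma^2}}\,dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(t - \mu)^2}{2\sigma^2}}\,dt \qquad (\text{where } t = \ln(x)),$$

and the integrand on the right is the pdf for the normal distribution, so the integral equals 1.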
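As a rough numerical check of the normalization argument, the lognormal pdf can be coded directly and integrated with a simple midpoint rule. This is only a sketch: the function names, the parameter values (μ = 0, σ = 0.5), and the truncation of the upper limit at 50 are assumptions made for illustration.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """pdf of the lognormal distribution; zero for x <= 0."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * math.sqrt(2.0 * math.pi) * sigma)

def midpoint_integral(f, lo, hi, n):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# Illustrative parameters (assumed). The tail beyond x = 50 is negligible here.
total = midpoint_integral(lambda x: lognormal_pdf(x, 0.0, 0.5), 0.0, 50.0, 100_000)
print(total)  # should be close to 1
```

For larger σ the right tail is much heavier, so the truncation point would need to be moved out accordingly.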

Moreover, it can be shown that, in terms of μ and σ,

$$E(X) = e^{\mu + \frac{\sigma^2}{2}}$$

$$\mathrm{Var}(X) = e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right)$$
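The two moment formulas can be compared against simulated data, using the sampling recipe from the start of this section (exponentiate normal draws). The sketch below assumes illustrative values μ = 1, σ = 0.5 and an arbitrary fixed seed; the function names are hypothetical.

```python
import math
import random

def lognormal_mean(mu, sigma):
    """E(X) = exp(mu + sigma^2 / 2), with mu, sigma the parameters of ln(X)."""
    return math.exp(mu + sigma ** 2 / 2.0)

def lognormal_variance(mu, sigma):
    """Var(X) = exp(2*mu + sigma^2) * (exp(sigma^2) - 1)."""
    return math.exp(2.0 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1.0)

rng = random.Random(12345)       # fixed seed, chosen arbitrarily
mu, sigma, n = 1.0, 0.5, 20000   # illustrative values (assumed)

# Lognormal samples are exponentials of normal samples.
samples = [math.exp(rng.gauss(mu, sigma)) for _ in range(n)]
sample_mean = sum(samples) / n
sample_var = sum((t - sample_mean) ** 2 for t in samples) / (n - 1)

print(sample_mean, lognormal_mean(mu, sigma))
print(sample_var, lognormal_variance(mu, sigma))
```

The sample values should land near the formula values, with the variance estimate noticeably noisier than the mean.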

The lognormal distribution has been used in reliability models for time until failure and for stock price distributions. Its shape is similar to that of the Gamma distribution and the Weibull distribution for the case α > 2, but the peak is not as close to 0.

[Figure: lognormal pdf f(x) plotted against x (0 to 10) for μ=4 with σ=1, 2, 3, 4. Here μ and σ are the lognormal's mean and std dev, not those of the associated normal.]
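Because the figure is labeled with the lognormal's own mean and standard deviation rather than those of the associated normal, it can help to convert between the two. Inverting the formulas for E(X) and Var(X) gives σ² = ln(1 + Var(X)/E(X)²) and μ = ln(E(X)) − σ²/2. A minimal sketch, with a hypothetical function name:

```python
import math

def normal_params_from_lognormal_moments(mean, std):
    """Return (mu, sigma) of ln(X) given the mean and std dev of X itself."""
    if mean <= 0 or std <= 0:
        raise ValueError("mean and std must be positive")
    sigma_sq = math.log(1.0 + (std / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)

# Example: a lognormal with mean 4 and std dev 1, as in one of the plotted curves.
mu, sigma = normal_params_from_lognormal_moments(4.0, 1.0)
print(mu, sigma)
```

Substituting the returned μ and σ back into the formulas for E(X) and Var(X) recovers the mean 4 and variance 1.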

IX. Beta Distribution

The beta function is a function studied by Euler prior to his work on the gamma function. It is given by

$$B(\alpha, \beta) = \int_{0^+}^{1^-} x^{\alpha - 1}(1 - x)^{\beta - 1}\,dx = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha + \beta)} = B(\beta, \alpha)$$
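The identity between the integral and the gamma-function form can be checked numerically. The sketch below uses Python's `math.gamma` for the closed form and a midpoint rule for the integral; the function names are hypothetical, and the midpoint rule is only reliable here when α ≥ 1 and β ≥ 1, so that the integrand stays bounded.

```python
import math

def beta_function(alpha, beta):
    """B(alpha, beta) via the gamma-function identity."""
    return math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)

def beta_integral(alpha, beta, n=100_000):
    """Midpoint-rule approximation of the defining integral on (0, 1).

    Assumes alpha >= 1 and beta >= 1 so the integrand is bounded.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x ** (alpha - 1) * (1.0 - x) ** (beta - 1)
    return h * total

print(beta_function(2, 3), beta_integral(2, 3))  # both close to 1/12
print(beta_function(3, 2))                       # symmetric in its arguments
```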

For 0 < x < 1 and α > 0, β > 0, the beta distribution is given by the pdf

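$$f(x) = \begin{cases} \dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha - 1}(1 - x)^{\beta - 1} & \text{for } 0 < x < 1 \\[1ex] 0 & \text{otherwise} \end{cases}$$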

It can be shown that

$$E(X) = \frac{\alpha}{\alpha + \beta}$$

$$\mathrm{Var}(X) = \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}$$

and if α > 1 and β > 1 then Mode(X) = (α - 1)/(α + β - 2); otherwise the mode is an endpoint or is not unique.
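The pdf and the three summary formulas translate directly into code. This is a minimal sketch with hypothetical function names; `beta_mode` returns `None` in the cases where the text says the mode is an endpoint or not unique.

```python
import math

def beta_pdf(x, alpha, beta):
    """pdf of the beta distribution; zero outside (0, 1)."""
    if not 0.0 < x < 1.0:
        return 0.0
    coeff = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return coeff * x ** (alpha - 1) * (1.0 - x) ** (beta - 1)

def beta_mean(alpha, beta):
    return alpha / (alpha + beta)

def beta_variance(alpha, beta):
    s = alpha + beta
    return alpha * beta / (s * s * (s + 1.0))

def beta_mode(alpha, beta):
    """Interior mode when alpha > 1 and beta > 1, otherwise None."""
    if alpha > 1 and beta > 1:
        return (alpha - 1.0) / (alpha + beta - 2.0)
    return None

# Example with alpha = 2, beta = 3 (values chosen for illustration).
print(beta_pdf(0.5, 2, 3), beta_mean(2, 3), beta_variance(2, 3), beta_mode(2, 3))
```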

α and β serve as shape parameters for the distribution, which takes on very different shapes on the (0,1) interval to which it is restricted.

Note that if α = β = 1, then f(x) = 1 and the distribution is just the uniform distribution for (0,1).

This is a distribution that is sometimes used as a rough model in the absence of data, as an alternative to the triangular distribution, as follows: given estimates for the minimum a, the most likely value c, and the maximum b, if the mean μ is also estimated, then α and β can in turn be estimated by solving

$$\mu = a + \frac{\alpha\,(b - a)}{\alpha + \beta}, \qquad c = a + \frac{(\alpha - 1)(b - a)}{\alpha + \beta - 2}$$

for α and β (here the 2 equations for E(X) and Mode(X) have been adjusted to correspond to the interval (a,b) rather than (0,1)).
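The two equations can be solved in closed form. Writing m = (μ − a)/(b − a) and k = (c − a)/(b − a) for the rescaled mean and mode, and s = α + β, the mean equation gives α = m·s and the mode equation then gives s = (1 − 2k)/(m − k). The sketch below implements this under those assumptions; the function name and the validity checks are illustrative additions, not part of the original notes.

```python
def beta_params_from_estimates(a, c, b, mean):
    """Estimate (alpha, beta) from minimum a, most likely c, maximum b, and mean."""
    if not (a < c < b and a < mean < b):
        raise ValueError("need a < c < b and a < mean < b")
    m = (mean - a) / (b - a)   # mean rescaled to (0, 1)
    k = (c - a) / (b - a)      # mode rescaled to (0, 1)
    if m == k:
        # Mean equal to mode leaves alpha + beta undetermined by these equations.
        raise ValueError("mean equals mode; the equations do not determine alpha + beta")
    s = (1.0 - 2.0 * k) / (m - k)   # s = alpha + beta
    alpha = m * s
    beta = (1.0 - m) * s
    if alpha <= 1 or beta <= 1:
        # The mode formula used above is only valid for alpha > 1 and beta > 1.
        raise ValueError("estimates are inconsistent with an interior mode")
    return alpha, beta

# Example (assumed values): minimum 0, most likely 3, maximum 10, mean 4.
alpha, beta = beta_params_from_estimates(0.0, 3.0, 10.0, 4.0)
print(alpha, beta)
```

Not every combination of estimates yields a valid solution, which is why the sketch rejects results with α ≤ 1 or β ≤ 1.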

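The distribution has also been used to estimate the number of defective items in a collection or the time to complete a task. Note that

$$\lim_{x \to 0} f(x) = \begin{cases} \infty & \text{if } \alpha < 1 \\ \beta & \text{if } \alpha = 1 \\ 0 & \text{if } \alpha > 1 \end{cases} \qquad \text{and} \qquad \lim_{x \to 1} f(x) = \begin{cases} \infty & \text{if } \beta < 1 \\ \alpha & \text{if } \beta = 1 \\ 0 & \text{if } \beta > 1 \end{cases}$$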

This facilitates determining the manner in which α and β govern the appearance of the distribution as graphed.
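The endpoint limits stated above can be tabulated for any parameter pair, which gives a quick way to anticipate how a graph will look near 0 and 1. A small sketch with a hypothetical function name:

```python
import math

def beta_endpoint_limits(alpha, beta):
    """Return (limit of f(x) as x -> 0, limit of f(x) as x -> 1)."""
    def limit(shape, other):
        if shape < 1:
            return math.inf      # density blows up at this endpoint
        if shape == 1:
            return float(other)  # finite, nonzero value at this endpoint
        return 0.0               # density vanishes at this endpoint
    return limit(alpha, beta), limit(beta, alpha)

# A few parameter pairs chosen for illustration.
for params in [(5, 0.5), (5, 1), (0.5, 0.5), (1, 2)]:
    print(params, beta_endpoint_limits(*params))
```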

[Figure: Beta distribution: various configurations (reversing α and β flips the graph left-right). Each panel plots f(x) against x on (0, 1).
Panel 1: α=5 with β=0.5, 1, 1.5, 2, 2.5.
Panel 2: α=β with values 0.5, 1, 1.5, 2, 2.5.
Panel 3: (α, β) = (1, 2), (.9, 4), (.8, 8), (.7, 16).]

Other Distributions

There are many distributions that have been devised in addition to those covered. Here is a partial list:
• Cauchy Distribution (derived from the Normal Distribution)
• Chi-squared Distribution (the Gamma Distribution with α=r/2 for r an integer)
• Dirichlet Distribution
• F Distribution
• Hyperexponential Distribution
• Hypergeometric Distribution
• Laplace Distribution
• Logistic Distribution
• Multinomial Distribution (generalization of the Binomial Distribution)
• Negative Binomial Distribution
• Pareto Distribution
• T Distribution
• Wald (Inverse Gaussian) Distribution