Probability Distributions
Total Page:16
File Type:pdf, Size:1020Kb
Probability distributions [Nematrian website page: ProbabilityDistributionsIntro, © Nematrian 2020] The Nematrian website contains information and analytics on a wide range of probability distributions, including: Discrete (univariate) distributions - Bernoulli, see also binomial distribution - Binomial - Geometric, see also negative binomial distribution - Hypergeometric - Logarithmic - Negative binomial - Poisson - Uniform (discrete) Continuous (univariate) distributions - Beta - Beta prime - Burr - Cauchy - Chi-squared - Dagum - Degenerate - Error function - Exponential - F - Fatigue, also known as the Birnbaum-Saunders distribution - Fréchet, see also generalised extreme value (GEV) distribution - Gamma - Generalised extreme value (GEV) - Generalised gamma - Generalised inverse Gaussian - Generalised Pareto (GDP) - Gumbel, see also generalised extreme value (GEV) distribution - Hyperbolic secant - Inverse gamma - Inverse Gaussian - Johnson SU - Kumaraswamy - Laplace - Lévy - Logistic - Log-logistic - Lognormal - Nakagami - Non-central chi-squared - Non-central t - Normal - Pareto - Power function - Rayleigh - Reciprocal - Rice - Student’s t - Triangular - Uniform - Weibull, see also generalised extreme value (GEV) distribution Continuous multivariate distributions - Inverse Wishart Copulas (a copula is a special type of continuous multivariate distribution) - Clayton - Comonotonicity - Countermonotonicity (only valid for 푛 = 2, where 푛 is the dimension of the input) - Frank - Generalised Clayton - Gumbel - Gaussian - Independence - t The Nematrian website functions for fitting univariate distributions, creating random variates and calculating moments etc. now cover most of the above probability distributions, see ProbabilityDistributionsFunctions. In many cases these functions can cater for the traditional (textbook) forms of these distributions and variants that include additional shift and scale parameters. location (i.e. shift) and scale variants [ProbabilityDistributionsIntro2] The location and scale of any probability distribution can be adjusted by using the (linear) transform 푌 = 푔 + ℎ푋 where 푔 and ℎ are constants (푔 adjusts the location, i.e. shifts the distribution, whilst ℎ adjusts its scale). This leaves the skew and (excess kurtosis) unaltered but alters the mean and variance as 퐸(푌) = 푔 + ℎ퐸(푋) and 푣푎푟(푌) = ℎ2푣푎푟(푋). In some cases the typical distributional specification already includes such components. For example, the normal distribution 푁(휇, 2) is the location and scale adjusted version of the unit normal distribution 푁(0,1). In other cases the standard distributional specification does not include such adjustments. For example, the (standard) Student’s t distribution depends one just one parameter, its degrees of freedom. The probability distribution orientated Nematrian web functions recognise location and/or scale adjusted variants of wide range of standard probability distributions. Discrete (univariate) distributions The Bernoulli distribution [BernoulliDistribution] A Bernoulli trial is an experiment that has one of two possible outcomes, ‘success’ with probability 푝 and ‘failure’ with probability 1 − 푝. The Bernoulli distribution is a probability distribution that takes the value of 1 if such a trial is a ‘success’ and 0 if it is a ‘failure’. The Bernoulli distribution is a special case of the binomial distribution, 퐵(푛, 푝), with 푛 = 1. For characteristics of the Bernoulli distribution (e.g. mean, standard deviation etc.), please refer to the corresponding characteristics for the binomial distribution. For other probability distributions see here. The binomial distribution [BinomialDistribution] The binomial distribution 퐵(푛, 푝) is the discrete probability distribution applicable to the number of successes in a sequence of 푛 independent yes/no experiments each of which has a success probability of 푝. Each individual success/failure experiment is called a Bernoulli trial, so if 푛 = 1 then the binomial distribution is a Bernoulli distribution. It has the following characteristics: Distribution name Binomial distribution Common notation 푋~퐵(푛, 푝) Parameters 푛 = number of (independent) trials, positive integer 푝 = probability of success in each trial, 0 ≤ 푝 ≤ 1 Support 푥 ∈ {0,1, … , 푛} = 푛푢푚푏푒푟 표푓 푠푢푐푐푒푠푠푒푠 Probability mass function 푛 푛! 푓(푥) = ( ) 푝푥(1 − 푝)푛−푥 = 푝푥(1 − 푝)푛−푥 푥 (푛 − 푥)! 푥! Cumulative distribution 푥 푛 푗 푛−푗 function 퐹(푥) = ∑ (푗 ) 푝 (1 − 푝) = 퐼1−푝(푛 − 푥, 푥 + 1) 푗=0 Mean 푛푝 Variance 푛푝(1 − 푝) Skewness 1 − 2푝 √푛푝(1 − 푝) (Excess) kurtosis 1 − 6푝(1 − 푝) 푛푝(1 − 푝) 푛 Characteristic function (1 − 푝 + 푝푒푖푡) Other comments The Bernoulli distribution is 퐵(1, 푝) and corresponds to the likelihood of success of a single experiment. Its probability mass function and cumulative distribution function are: 1 − 푝, 푥 = 0 푓(푥) = 퐹(푥) = { 푝, 푥 = 1 The Bernoulli distribution with 푝 = 1⁄2, i.e. 퐵(1, 1⁄2), has the minimum possible excess kurtosis, i.e. −2. The mode of 퐵(푛, 푝) is 푖푛푡((푛 + 1)푝) if (푛 + 1)푝 is 0 or not an integer and is 푛 if (푛 + 1)푝 = 푛 + 1. If (푛 + 1)푝 ∈ {1,2, … , 푛} then the distribution is bi-modal, with modes (푛 + 1)푝 and (푛 + 1)푝 − 1. The binomial distribution is often used to model the number of successes in a sample size of 푛 from a population size of 푁. Since such samples are not independent, the resulting distribution is actually a hypergeometric distribution and not a binomial distribution. However if 푁 ≫ 푛 then the binomial distribution becomes a good approximation to the relevant hypergeometric distribution and is thus often used. 푛 In the above ( ) is the binomial coefficient. 푥 Nematrian web functions Functions relating to the above distribution may be accessed via the Nematrian web function library by using a DistributionName of “binomial”. For details of other supported probability distributions see here. The geometric distribution [GeometricDistribution] The geometric distribution describes the probability of 푥 successes in a sequence of independent experiments each with likelihood of success of 푝 that arise before there is 1 failure. It is a special case of the negative binomial distribution. Note: different texts adopt slightly different definitions, e.g. it may be the total number of trials (i.e. 1 more than the above) in which case it may be called the shifted geometric distribution and/or 푝 may denote the probability of failure rather than the probability of success. The hypergeometric distribution [HypergeometricDistribution] The hypergeometric distribution describes the probability of 푥 successes in 푛 draws from a finite population size 푁 containing 푚 successes without replacement. This contrasts with the binomial distribution which describes the probability of 푥 successes in 푛 draws with replacement Distribution name Hypergeometric distribution Common notation 푋~퐻푦푝푒푟푔푒표푚푒푡푟푖푐(푚, 푁, 푛) Parameters 푁 = population size, integral (푁 > 0) 푛 = sample size, integral (0 < 푛 ≤ 푁) 푚 = number of tagged items, integral (0 < 푚 ≤ 푁) Domain max(0, 푛 + 푚 − 푁) ≤ 푥 ≤ max(푛, 푚) , 푥 an integer 푚 푁 − 푚 Probability mass function ( ) ( ) 푥 푓(푥) = 푛 − 푥 푁 ( ) 푛 Cumulative distribution 푚 푁 − 푚 푛 푁 − 푛 푥 ( ) ( ) ( ) ( ) 푗 푛 − 푗 푘 + 1 function 퐹(푥) = (푥) = ∑ = 1 − 푚 − 푘 − 1 푌 푁 푁 ( ) ( ) 푗=0 푗 푚 where 푌 = 3퐹2(1, 푥 + 1 − 푚, 푥 + 1 − 푛; 푥 + 2, 푁 + 푥 + 2 − 푚 − 푛; 1) 푝퐹푞 is the generalised hypergeometric function, i.e. ∞ (푎 ) … (푎 ) 푛 1 푛 푝 푛 푧 푝퐹푞(푎1, … , 푎푝; 푏1, … , 푏푞; 푧) = ∑ (푏 ) … (푏 ) 푛! 푛=0 1 푛 푞 푛 and (푎)푛 involves the rising factorial or Pochhammer notation, i.e. (푎)푛 = 푎(푎 + 1)(푎 + 2) … (푎 + 푛 − 1) and (푎)0 = 1 푛푚 Mean 푁 Variance 푛푚(푁 − 푚)(푁 − 푛) 푁2(푁 − 1) Skewness (푁 − 2푚)(푁 − 1)1⁄2(푁 − 2푛) 1⁄2 (푛푚(푁 − 푚)(푁 − 푛)) (푁 − 2) (Excess) kurtosis 퐴 + 퐵 퐶 where 퐴 = (푁 − 1)푁2(푁(푁 + 1) − 6푚(푁 − 푚) − 6푛(푁 − 푚)) 퐵 = 6푛푚(푁 − 푚)(푁 − 푛)(5푁 − 6) 퐶 = 푛푚(푁 − 푚)(푁 − 푛)(푁 − 2)(푁 − 3) 푁 − 푚 Characteristic function ( ) 푛 퐹 (−푛, −푚; 푛 − 푚 − 푛 + 1; 푒푖푡) 푁 2 1 ( ) 푛 Nematrian web functions Functions relating to the above distribution may be accessed via the Nematrian web function library by using a DistributionName of “hypergeometric”. For details of other supported probability distributions see here. The logarithmic distribution [LogarithmicDistribution] The logarithmic distribution arises from following power series expansion: 푝2 푝3 − log(1 − 푝) = 푝 + + + ⋯ 2 3 푝푥 This means that the function 푓(푥) = − , 푥 = 1,2,3, … can naturally be interpreted as a 푥 log(1−푝) ∞ probability mass function since ∑푘=1 푓(푘) = 1. Distribution name Logarithmic distribution Common notation 푋~퐿표푔(푝) Parameters 푝 = shape parameter (0 < 푝 < 1) Domain 1 ≤ 푥 < +∞, 푥 an integer Probability mass function 푝푥 푓(푥) = − 푥 log(1 − 푝) Cumulative distribution 푥 푗 1 푝 퐵푝(푥 + 1,0) 퐹(푥) = − ∑ = 1 + function log(1 − 푝) 푗 log(1 − 푝) 푗=1 푝 Mean − (1 − 푝) log(1 − 푝) Variance 푝(푝 + log(1 − 푝)) − = 푉 (1 − 푝)2(log(1 − 푝))2 Skewness 푝 3푝 2푝2 − (1 + 푝 + + ) (1 − 푝)3푉3⁄2 log(1 − 푝) log(1 − 푝) log2(1 − 푝) 푝 (Excess) kurtosis − 퐴 − 3 (1 − 푝)4푉2 log(1 − 푝) where 4푝(1 + 푝) 6푝2 3푝3 퐴 = (1 + 4푝 + 푝2 + + + ) log(1 − 푝) log2(1 − 푝) log3(1 − 푝) Characteristic function log(1 − 푝푒푖푡) log(1 − 푝) Other comments The logarithmic distribution has a mode of 1. If 푁 is a random variable with Poission distribution and 푋푖, 푖 = 1, … is an infinite sequence of iid random variables each distributed 퐿표푔(푝) then 푌 = 푁 ∑푖=1 푋푖 has a negative binomial distribution showing that the negative binomial distribution is an example of a compound Poisson distribution Nematrian web functions Functions relating to the above distribution may be accessed via the Nematrian web function library by using a DistributionName of “logarithmic”. For details of other supported probability distributions see here. The negative binomial distribution [NegativeBinomialDistribution] The negative binomial distribution describes the probability of 푥 successes in a sequence of independent experiments each with likelihood of success of 푝 that arise before there are 푟 failures.