Probability Distributions CEE 201L. Uncertainty, Design, and Optimization Department of Civil and Environmental Engineering Duke University Philip Scott Harvey, Henri P. Gavin and Jeffrey T. Scruggs Spring 2022
1 Probability Distributions
Consider a continuous random variable (rv) $X$ with support over the domain $\mathcal{X}$. The probability density function (PDF) of $X$ is the function $f_X(x)$ such that for any two numbers $a$ and $b$ in the domain $\mathcal{X}$, with $a < b$,

$$ P[a < X \le b] = \int_a^b f_X(x)\,dx $$

For $f_X(x)$ to be a proper distribution, it must satisfy the following two conditions:
1. The PDF $f_X(x)$ is non-negative; $f_X(x) \ge 0$ for all values of $x \in \mathcal{X}$.
2. The rule of total probability holds; the total area under $f_X(x)$ is 1; $\int_{\mathcal{X}} f_X(x)\,dx = 1$.
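The two conditions can be checked numerically for a candidate density. The following is a minimal sketch in Python (standard library only), not part of the original notes. It assumes, purely for illustration, an exponential density $f_X(x) = \lambda e^{-\lambda x}$ on $x \ge 0$ with $\lambda = 2$. The names `exp_pdf` and `trapezoid_area` are hypothetical helpers, and the infinite domain is truncated at $x = 20$, where the remaining tail area is negligible.

```python
import math

def exp_pdf(x, lam=2.0):
    """Illustrative density (an assumption): lam * exp(-lam * x) for x >= 0."""
    return lam * math.exp(-lam * x) if x >= 0.0 else 0.0

def trapezoid_area(f, lo, hi, n=100_000):
    """Approximate the integral of f from lo to hi with the trapezoidal rule."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

# Condition 1: the density is non-negative on a grid of test points.
grid = [0.01 * i for i in range(2001)]           # x = 0, 0.01, ..., 20
is_nonnegative = all(exp_pdf(x) >= 0.0 for x in grid)

# Condition 2: the total area is 1 (domain truncated at x = 20).
area = trapezoid_area(exp_pdf, 0.0, 20.0)

# P[a < X <= b] is the area under the PDF between a and b.
a, b = 0.5, 1.0
prob_ab = trapezoid_area(exp_pdf, a, b)

print(is_nonnegative, round(area, 6), round(prob_ab, 6))
```

A grid check can only suggest non-negativity, not prove it. For a density given in closed form, the sign is usually clear by inspection.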
Alternatively, $X$ may be described by its cumulative distribution function (CDF). The CDF of $X$ is the function $F_X(x)$ that gives, for any specified number $x \in \mathcal{X}$, the probability that the random variable $X$ is less than or equal to $x$, written $P[X \le x]$. For real values of $b$, the CDF is defined by

$$ F_X(b) = P[X \le b] = \int_{-\infty}^{b} f_X(x)\,dx , $$

so

$$ P[a < X \le b] = F_X(b) - F_X(a) $$
By the first fundamental theorem of calculus, the functions $f_X(x)$ and $F_X(x)$ are related as

$$ f_X(x) = \frac{d}{dx} F_X(x) $$
A few important characteristics of the CDF of $X$ are:
1. The CDF, $F_X(x)$, is a monotonic non-decreasing function of $x$.
2. For any number $a$, $P[X > a] = 1 - P[X \le a] = 1 - F_X(a)$.
3. For any two numbers $a$ and $b$ with $a < b$, $P[a < X \le b] = F_X(b) - F_X(a) = \int_a^b f_X(x)\,dx$.
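These CDF properties, and the derivative relation between the PDF and the CDF, can be illustrated numerically. The Python sketch below is not from the original notes. It assumes an exponential rv with rate $\lambda = 2$, for which $F_X(x) = 1 - e^{-\lambda x}$ on $x \ge 0$. The function names and the test points are illustrative choices.

```python
import math

LAM = 2.0  # assumed rate parameter of the illustrative exponential rv

def pdf(x):
    """Exponential PDF, f_X(x) = LAM * exp(-LAM * x) for x >= 0."""
    return LAM * math.exp(-LAM * x) if x >= 0.0 else 0.0

def cdf(x):
    """Exponential CDF, F_X(x) = 1 - exp(-LAM * x) for x >= 0."""
    return 1.0 - math.exp(-LAM * x) if x >= 0.0 else 0.0

# Property 1: the CDF is non-decreasing along an increasing grid.
xs = [0.05 * i for i in range(101)]              # x = 0, 0.05, ..., 5
monotone = all(cdf(x1) <= cdf(x2) for x1, x2 in zip(xs, xs[1:]))

# Property 2: P[X > a] = 1 - F_X(a).
a, b = 0.5, 1.0
prob_exceed_a = 1.0 - cdf(a)

# Property 3: P[a < X <= b] = F_X(b) - F_X(a).
prob_ab = cdf(b) - cdf(a)

# The PDF as the derivative of the CDF, approximated by a central difference.
x0, h = 0.75, 1e-5
dFdx = (cdf(x0 + h) - cdf(x0 - h)) / (2.0 * h)

print(monotone, prob_exceed_a, prob_ab, dFdx, pdf(x0))
```

The last two printed values should agree to several digits, since the central difference approximates $dF_X/dx$ at `x0`.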
2 Descriptors of random variables
The expected or mean value of a continuous random variable $X$ with PDF $f_X(x)$ is the centroid of the probability density.

$$ \mu_X = E[X] = \int_{-\infty}^{\infty} x\, f_X(x)\,dx $$

The expected value of an arbitrary function of $X$, $g(X)$, with respect to the PDF $f_X(x)$ is

$$ \mu_{g(X)} = E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\,dx $$
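The variance of a continuous rv $X$ with PDF $f_X(x)$ and mean $\mu_X$ gives a quantitative measure of how much spread or dispersion there is in the distribution of $x$ values. The variance is calculated as

$$
\begin{aligned}
\sigma_X^2 = V[X] &= \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx \\
&= \int_{-\infty}^{\infty} \left( x^2 - 2\mu_X x + \mu_X^2 \right) f_X(x)\,dx \\
&= \int_{-\infty}^{\infty} x^2 f_X(x)\,dx - 2\mu_X \int_{-\infty}^{\infty} x\, f_X(x)\,dx + \mu_X^2 \int_{-\infty}^{\infty} f_X(x)\,dx \\
&= E[X^2] - 2\mu_X^2 + \mu_X^2 \\
&= E[X^2] - \mu_X^2
\end{aligned}
$$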
The standard deviation (s.d.) of $X$ is $\sigma_X = \sqrt{V[X]}$. The coefficient of variation (c.o.v.) of $X$ is defined as the ratio of the standard deviation $\sigma_X$ to the mean $\mu_X$:

$$ c_X = \frac{\sigma_X}{\mu_X} $$

for non-zero mean. The c.o.v. is a normalized measure of dispersion (dimensionless).
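The descriptors above can be evaluated by numerical integration. The following Python sketch is illustrative only and not part of the original notes. It again assumes an exponential density with rate $\lambda = 2$ (exact mean 0.5, variance 0.25, c.o.v. 1), truncates the support at $x = 40$, and uses a hypothetical helper `expectation` based on the midpoint rule.

```python
import math

def pdf(x, lam=2.0):
    """Illustrative density (an assumption): exponential with rate lam."""
    return lam * math.exp(-lam * x) if x >= 0.0 else 0.0

def expectation(g, f, lo, hi, n=200_000):
    """E[g(X)] as the integral of g(x) f(x) over [lo, hi], by the midpoint rule."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += g(x) * f(x)
    return total * h

LO, HI = 0.0, 40.0                      # truncated support (assumption)

mean = expectation(lambda x: x, pdf, LO, HI)                # mu_X = E[X]
var = expectation(lambda x: (x - mean) ** 2, pdf, LO, HI)   # sigma_X^2 = V[X]
mean_sq = expectation(lambda x: x * x, pdf, LO, HI)         # E[X^2]

std = math.sqrt(var)                    # standard deviation
cov = std / mean                        # coefficient of variation

# The variance also equals E[X^2] - mu_X^2.
var_alt = mean_sq - mean ** 2

print(mean, var, var_alt, std, cov)
```

Computing the variance both ways is a useful check on the integration limits: if the truncation point is too small, the two values drift apart.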
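A mode of a probability density function, $f_X(x)$, is a value of $x$ at which the PDF is maximized. Where the PDF is differentiable and the maximum lies in the interior of the domain, the mode satisfies

$$ \left. \frac{d}{dx} f_X(x) \right|_{x = x_{\text{mode}}} = 0 . $$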
The median value, $x_m$, is the value of $x$ such that

$$ P[X \le x_m] = P[X > x_m] = F_X(x_m) = 1 - F_X(x_m) = 0.5 . $$
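Both the mode and the median can be located numerically when no closed form is at hand. The Python sketch below is an illustration, not the notes' own method. It assumes a Rayleigh density $f_X(x) = (x/s^2)\exp(-x^2/(2s^2))$ with scale $s = 1.5$, chosen because its mode ($s$) and median ($s\sqrt{2\ln 2}$) are known exactly. The grid search and bisection settings are arbitrary choices.

```python
import math

S = 1.5  # assumed Rayleigh scale parameter (illustrative)

def pdf(x):
    """Rayleigh PDF with scale S."""
    return (x / S**2) * math.exp(-x**2 / (2.0 * S**2)) if x >= 0.0 else 0.0

def cdf(x):
    """Rayleigh CDF with scale S."""
    return 1.0 - math.exp(-x**2 / (2.0 * S**2)) if x >= 0.0 else 0.0

# Mode: grid search for the x that maximizes the PDF.
grid = [0.001 * i for i in range(10_001)]        # x = 0, 0.001, ..., 10
x_mode = max(grid, key=pdf)

# Check the stationarity condition d f_X / dx = 0 at the mode.
h = 1e-5
slope_at_mode = (pdf(x_mode + h) - pdf(x_mode - h)) / (2.0 * h)

# Median: bisection on F_X(x) - 0.5, using the fact that F_X is non-decreasing.
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if cdf(mid) < 0.5:
        lo = mid
    else:
        hi = mid
x_median = 0.5 * (lo + hi)

print(x_mode, slope_at_mode, x_median, cdf(x_median))
```

For this right-skewed density the median lies to the right of the mode, which the printed values should reflect.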
CC BY-NC-ND March 25, 2021 PSH, HPG, JTS
3 Some common distributions
The National Institute of Standards and Technology (NIST) lists properties of nineteen commonly used probability distributions in their Engineering Statistics Handbook. This section describes the properties of seven distributions. For each of these distributions, this document provides figures and equations for the PDF and CDF, equations for the mean and variance, the names of Matlab functions to generate samples, and empirical distributions of such samples.
3.1 The Normal distribution
The Normal (or Gaussian) distribution is perhaps the most commonly used distribution function. The notation $X \sim N(\mu_X, \sigma_X^2)$ denotes that $X$ is a normal random variable with mean $\mu_X$ and variance $\sigma_X^2$. The standard normal random variable, $Z$, or "z-statistic", is distributed as $N(0, 1)$. The probability density function of a standard normal random variable is so widely used it has its own special symbol, $\phi(z)$,

$$ \phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{z^2}{2} \right) $$

Any normally distributed random variable can be defined in terms of the standard normal random variable, through the change of variables

$$ X = \mu_X + \sigma_X Z . $$
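If $X$ is normally distributed, it has the PDF

$$ f_X(x) = \frac{1}{\sigma_X}\, \phi\!\left( \frac{x - \mu_X}{\sigma_X} \right) = \frac{1}{\sqrt{2\pi\sigma_X^2}} \exp\left( -\frac{(x - \mu_X)^2}{2\sigma_X^2} \right) $$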
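The change of variables and the resulting PDF can be checked with a short Python sketch (standard library only, not part of the original notes). It writes the density of $N(\mu_X, \sigma_X^2)$ as $\phi((x - \mu_X)/\sigma_X)/\sigma_X$, compares it with `statistics.NormalDist.pdf`, and generates samples of $X$ from standard normal samples $Z$. The values $\mu_X = 10$ and $\sigma_X = 2$, the sample size, and the seed are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev

def phi(z):
    """Standard normal PDF, phi(z) = exp(-z^2 / 2) / sqrt(2 pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_pdf(x, mu, sigma):
    """PDF of N(mu, sigma^2), written as phi((x - mu) / sigma) / sigma."""
    return phi((x - mu) / sigma) / sigma

mu_X, sigma_X = 10.0, 2.0     # assumed illustrative values

# Compare with the standard-library implementation at a few points.
reference = NormalDist(mu_X, sigma_X)
max_diff = max(abs(normal_pdf(x, mu_X, sigma_X) - reference.pdf(x))
               for x in (4.0, 8.0, 10.0, 11.5, 16.0))

# Change of variables: samples of X built from standard normal samples Z.
rng = random.Random(0)        # fixed seed so the run is repeatable
z_samples = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
x_samples = [mu_X + sigma_X * z for z in z_samples]

print(max_diff, fmean(x_samples), pstdev(x_samples))
```

The sample mean and standard deviation should land close to $\mu_X$ and $\sigma_X$, with the remaining gap due to sampling error.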
There is no closed-form equation for the CDF of a normal random variable. Solving the integral

$$ \Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-u^2/2}\,du $$

would make you famous. Try it. The CDF of a normal random variable is expressed in terms of the error function, $\mathrm{erf}(z)$. If $X$ is normally distributed, $P[X \le x]$ can be found from the standard normal CDF

$$ P[X \le x] = F_X(x) = \Phi\left( \frac{x - \mu_X}{\sigma_X} \right) . $$

Values for $\Phi(z)$ are tabulated and can be computed, e.g., with the Matlab command . . . Prob_X_le_x = normcdf(x,muX,sigX). The standard normal PDF is symmetric about $z = 0$, so $\phi(-z) = \phi(z)$, $\Phi(-z) = 1 - \Phi(z)$, and

$$ P[X > x] = 1 - F_X(x) = 1 - \Phi\left( \frac{x - \mu_X}{\sigma_X} \right) = \Phi\left( \frac{\mu_X - x}{\sigma_X} \right) . $$
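One common way to evaluate $\Phi(z)$ in terms of the error function is the identity $\Phi(z) = \tfrac{1}{2}\left[ 1 + \mathrm{erf}(z/\sqrt{2}) \right]$. The Python sketch below uses it. This is an illustration, not the notes' Matlab workflow: `normal_cdf` is a hypothetical stand-in with the same argument order as the `normcdf(x,muX,sigX)` call quoted above, and the values $\mu_X = 10$, $\sigma_X = 2$, $x = 13$ are assumptions.

```python
import math
from statistics import NormalDist

def std_normal_cdf(z):
    """Phi(z) via the error function: Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P[X <= x] for X ~ N(mu, sigma^2), using the standardized variable."""
    return std_normal_cdf((x - mu) / sigma)

mu_X, sigma_X = 10.0, 2.0     # assumed illustrative values
x = 13.0                      # so z = (x - mu_X) / sigma_X = 1.5

prob_le = normal_cdf(x, mu_X, sigma_X)                # P[X <= x]
prob_gt = 1.0 - prob_le                               # P[X > x]
prob_gt_sym = std_normal_cdf((mu_X - x) / sigma_X)    # same value, by symmetry

# Cross-check against the standard-library normal distribution.
reference = NormalDist(mu_X, sigma_X).cdf(x)

print(prob_le, prob_gt, prob_gt_sym, reference)
```

For $z = 1.5$ the tabulated value is $\Phi(1.5) \approx 0.9332$, which the first printed number should match.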
A linear combination of two independent normal rvs $X_1$ and $X_2$ (with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$) is also normally distributed,

$$ aX_1 + bX_2 \sim N\left( a\mu_1 + b\mu_2,\; a^2\sigma_1^2 + b^2\sigma_2^2 \right) , $$