Chapter 5, Probability Distributions
5.1 Introduction - In this chapter, we discuss various probability distributions, both discrete and continuous.
- A discrete probability distribution is used when the sample space is discrete (countable). The discrete probability distributions covered are: discrete uniform, binomial and multinomial, hypergeometric, negative binomial, geometric, and Poisson.
- A continuous probability distribution is used when the sample space is continuous. The continuous probability distributions covered are: uniform, normal (or Gaussian), Gamma, Beta, the t distribution, the F distribution, and the χ² distribution.
5.2 Discrete uniform distribution - the definition: if a r. v. X assumes the values x1, x2, ..., xk with equal probabilities, then X follows the discrete uniform distribution and its probability function is given below:
f(x; k) = 1/k, x = x1, x2, ..., xk
- the mean and variance:
μ = (1/k) Σ_{i=1}^{k} xi
σ² = (1/k) Σ_{i=1}^{k} (xi - μ)²
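As a quick numerical sketch (assuming Python with numpy is available; the die values below are only an illustration, not an example from the notes), the mean and variance of a discrete uniform r. v. can be computed directly from the definition:

```python
import numpy as np

# Discrete uniform r.v. over k equally likely values
# (faces of a die, an illustrative choice, not from the notes).
x = np.array([1, 2, 3, 4, 5, 6])
k = len(x)

mu = x.sum() / k                      # mu = (1/k) * sum of x_i
sigma2 = ((x - mu) ** 2).sum() / k    # sigma^2 = (1/k) * sum of (x_i - mu)^2

print(mu, sigma2)                     # 3.5 and 2.9166...
```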
5.3 Binomial and multinomial distributions - First, let us introduce the Bernoulli process. If: the outcome of the process is either success (X = 1) or failure (X = 0); the probability of success is P(X = 1) = p and the probability of failure is P(X = 0) = 1 - p = q; then the process is a Bernoulli process.
- The probability distribution of the Bernoulli process: p(x) = p^x (1 - p)^(1-x), x = 0, 1 and 0 < p < 1
- The mean and the variance: E(X) = p V(X) = p(1 - p)
- An example: what is the prob. of picking a male student? X = 1: male student with probability p = 8/12 = 2/3; X = 0: female student with probability 1 - p = 1/3. Thus, the probability distribution is: P(x) = (2/3)^x (1/3)^(1-x), x = 0 and 1
In addition, the mean: p = 2/3 and the variance V = (2/3)(1/3) = 2/9
- Binomial Distribution: the binomial distribution is defined based on the Bernoulli process. It is made up of n independent Bernoulli trials. Suppose that X1, X2, ...,
Xn are independent Bernoulli random variables; then Y = Σ Xi follows the binomial distribution. (Note that Y is the number of successes among the n trials.)
- The probability distribution of the binomial distribution is: P(Y = y) = C(n, y) p^y (1 - p)^(n-y), y = 0, 1, ..., n
- The student example: pick three students from the 12 students (note that we must sample with replacement in order to ensure the same probability and independence).
none of the 3 is a male student: the possibility: FFF; the probability: (1-p)^3 = 0.037
one of the 3 is a male student: the possibilities: MFF, FMF, FFM; the probability: 3p(1-p)^2 = 0.222
two of the 3 are male students: the possibilities: MMF, MFM, FMM; the probability: 3p^2(1-p) = 0.444
all 3 are male students: the possibility: MMM; the probability: p^3 = 0.296
In general, the formula is the binomial probability distribution P(Y = y) given above, and we can derive it in the same manner. (A numerical check is sketched below.)
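As a sketch of the same calculation (assuming Python with scipy is available; scipy is not part of the notes themselves), the four probabilities can be reproduced with the binomial pmf:

```python
from scipy.stats import binom

n, p = 3, 2/3          # three picks, probability 2/3 of a male student per pick
for y in range(n + 1):
    # P(Y = y) = C(n, y) * p^y * (1-p)^(n-y)
    print(y, round(binom.pmf(y, n, p), 3))
# prints: 0 0.037, 1 0.222, 2 0.444, 3 0.296
```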
- Mean and variance of the binomial distribution:
E(Y) = Σ E(Xi) = Σ p = np
V(Y) = Σ V(Xi) = Σ p(1 - p) = np(1 - p)
- The example: find the mean and variance of picking male students, and then use Chebyshev's theorem to interpret the interval μ ± 2σ. μ = (3)(2/3) = 2, σ² = (3)(2/3)(1/3) = 2/3, σ = 0.816
at k = 2: μ + 2σ = 2 + (2)(0.816) ≈ 3.6 and μ - 2σ = 2 - (2)(0.816) ≈ 0.4, so the interval covers the values 1, 2, and 3.
(1 - 1/k²) = 3/4. Therefore, there should be a probability of at least 3/4 that the number of male students picked is between 1 and 3. Indeed, the probability is actually p(1) + p(2) + p(3) = 0.963.
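A small check of this bound (a sketch assuming scipy, not part of the notes):

```python
from scipy.stats import binom

n, p = 3, 2/3
# exact probability that the count falls in {1, 2, 3}, i.e. within mu +/- 2*sigma
exact = sum(binom.pmf(y, n, p) for y in (1, 2, 3))
print(round(exact, 3))   # 0.963, comfortably above the Chebyshev bound of 0.75
```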
- Using the Binomial distribution table: a function of n and p.
- Multinomial distribution: this is an extension of the binomial distribution. Consider n independent trials, each of which results in one of k outcomes with probabilities p1, p2, ..., pk, where Σ pi = 1. Let X1, X2, ..., Xk be the numbers of trials resulting in each outcome (so Σ xi = n);
then they follow the multinomial distribution with the probability distribution: f(x1, ..., xk; p1, ..., pk, n) = [n! / (x1! x2! ... xk!)] p1^x1 p2^x2 ... pk^xk
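A brief illustration (a sketch assuming scipy; the three-way split and its probabilities below are hypothetical, not from the notes):

```python
from scipy.stats import multinomial

# Hypothetical categories with probabilities p1, p2, p3 summing to 1.
n = 3
p = [1/2, 1/3, 1/6]
# Probability of observing exactly one outcome of each kind in n = 3 trials:
# 3!/(1!1!1!) * (1/2)(1/3)(1/6) = 1/6
print(round(multinomial.pmf([1, 1, 1], n=n, p=p), 4))   # 0.1667
```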
5.4 Hypergeometric Distribution - The example: what is the probability of picking three male students in a row? Note that this time the samples are not independent (sampling without replacement). As a result we need to use the hypergeometric distribution. The following shows how the distribution is formed (12 students in total: 8 male, 4 female; 3 are picked):
no male student among the 3: C(12,3) ways in total, C(8,0) for the males, C(4,3) for the females; probability = C(8,0)C(4,3) / C(12,3)
one male student among the 3: probability = C(8,1)C(4,2) / C(12,3)
two male students among the 3: probability = C(8,2)C(4,1) / C(12,3)
three male students among the 3: probability = C(8,3)C(4,0) / C(12,3)
In general, the probability distribution is as follows: P(Y = y) = C(8, y) C(4, 3-y) / C(12, 3), y = 0, 1, 2, 3
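These four probabilities can be checked numerically (a sketch assuming scipy; scipy's hypergeom takes the population size, the number of "success" members, and the sample size):

```python
from scipy.stats import hypergeom

# population of 12 students, 8 of whom are male, 3 picked without replacement
rv = hypergeom(M=12, n=8, N=3)
for y in range(4):
    print(y, round(rv.pmf(y), 4))   # 0.0182, 0.2182, 0.5091, 0.2545
print(rv.mean(), rv.var())          # mean = 3*(8/12) = 2.0, variance = 0.5454...
```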
- The general formula of the hypergeometric distribution:
P(Y = y) = C(k, y) C(N-k, n-y) / C(N, n), y = 0, 1, 2, ..., n
- The mean and the variance of the hypergeometric distribution: μ = nk/N, σ² = [(N - n)/(N - 1)] n (k/N) (1 - k/N)
As a special case, let N go to infinity; then k/N → p and (N - n)/(N - 1) → 1. Hence: μ = np, σ² = np(1 - p).
That is, the hypergeometric distribution becomes the binomial distribution.
- We can also define the multivariate hypergeometric distribution
5.5 Negative Binomial and Geometric Distributions - An example: picking three students, what is the probability that the third student is the second male? One possibility is FMM and its probability is (1-p)p²; the other possibility is MFM and its probability is (1-p)p².
Note that there are C(3-1, 2-1) = 2 such combinations, and hence the probability is: f(X = 3; k = 2) = C(3-1, 2-1) p² (1 - p)
- The general formula for the negative binomial distribution is as follows:
f(X = x) = C(x-1, k-1) p^k (1 - p)^(x-k), x = k, k+1, k+2, ...
where x is the number of trials and k is the number of successes (the xth trial is the kth success).
- The mean and variance of the negative binomial distribution: E(X) = k/p, V(X) = k(1-p)/p²
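The student example above can be checked numerically (a sketch assuming scipy; note that scipy's nbinom counts the number of failures before the kth success, so the argument is x - k rather than x):

```python
from math import comb
from scipy.stats import nbinom

p, k, x = 2/3, 2, 3          # third pick is the second male student
direct = comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)
print(round(direct, 3))                       # 0.296
# scipy parameterizes by the number of failures (x - k) before the kth success
print(round(nbinom.pmf(x - k, k, p), 3))      # 0.296, same value
```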
- Another example: picking until we get a male student: the first pick: p; the second pick: (1-p)p; the third pick: (1-p)²p; ...
- The general formula is: f(X = x) = (1 - p)^(x-1) p, x = 1, 2, 3, ...
This is the geometric distribution.
- The mean and variance of the geometric distribution: E(X) = 1/p, V(X) = (1-p)/p²
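A short sketch (assuming scipy) of the geometric pmf, mean, and variance for the student example's p = 2/3:

```python
from scipy.stats import geom

p = 2/3
rv = geom(p)                        # support x = 1, 2, 3, ... (number of picks)
for x in (1, 2, 3):
    print(x, round(rv.pmf(x), 3))   # 0.667, 0.222, 0.074
print(rv.mean(), rv.var())          # 1/p = 1.5 and (1-p)/p^2 = 0.75
```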
5.6 Poisson Distribution - A Poisson process is a random process in which a discrete event takes place over a continuous interval of time or space. Examples of Poisson processes include: the arrival of telephone calls at a switchboard, and cars passing an electronic checking device.
Note that all these examples involve a discrete random event. In any given small period of time (or small region), the probability that the event occurs is small; however, over a long time (or large region), the number of occurrences is large.
- Poisson distribution plays an extremely important role in science and engineering, since it represents an appropriate probabilistic model for a large number of observational phenomena.
- The Poisson distribution can be described by the following formula:
p(x; λt) = e^(-λt) (λt)^x / x!, x = 0, 1, 2, ...
where λ is the average number of outcomes per unit time (or region); hence, λt represents the expected number of outcomes in an interval of length t.
Proof: refer to the textbook.
- The Poisson distribution can be considered an approximation to the binomial distribution when n is large and p is small (with λt = np).
- From a physical point of view, take a time interval of length T and divide it into n equal sub-intervals of length Δt (Δt → 0, so that T = nΔt), and assume: the probability of a success in any sub-interval is λΔt; the probability of more than one success in any sub-interval is negligible; the probability of a success in any sub-interval does not depend on what happened prior to that time.
Then, we have the Poisson distribution.
- Mean and variance of the Poisson distribution: E(X) = V(X) = λt
- An example: in a large company, industrial accidents occur at a mean rate of three per week (λt = 3) (note that accidents occur independently). The probability distribution: p(y) = 3^y exp(-3) / y!, y = 0, 1, 2, ... The probabilities can be determined by direct calculation or by checking the Poisson distribution table.
the probability of at most four accidents in a week: p(0) + p(1) + p(2) + p(3) + p(4) = 0.815
the probability of four or more: P(Y ≥ 4) = 1 - P(Y ≤ 3) = 0.353
the probability of exactly four: P(Y = 4) = P(Y ≤ 4) - P(Y ≤ 3) = 0.168; note that this is the same as p(4) = 0.168
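The three probabilities can be reproduced numerically (a sketch assuming scipy, in place of the Poisson table):

```python
from scipy.stats import poisson

rv = poisson(3)                          # lambda*t = 3 accidents per week
print(round(rv.cdf(4), 3))               # P(Y <= 4) = 0.815
print(round(1 - rv.cdf(3), 3))           # P(Y >= 4) = 0.353
print(round(rv.pmf(4), 3))               # P(Y = 4)  = 0.168
```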
5.7 Uniform Distribution - The uniform distribution is a continuous probability distribution. The assumption: the random event is equally likely anywhere in an interval. An example: receiving an express mail between 1 and 5 pm.
- The probability density function (pdf)
f(x) = 1/(b - a) for a ≤ x ≤ b; 0 elsewhere
- By integration, we obtain the probability function (pf)
F(x) = 0 for x < a; (x - a)/(b - a) for a ≤ x ≤ b; 1 for x > b
- A comparison between discrete and continuous distributions: for a discrete r. v. we have the probability function P(X = x) = p(x); for a continuous r. v., P(X = x) = 0, and instead F(x) = ∫_{-∞}^{x} f(t) dt and f(x) = dF(x)/dx.
- An example: receiving an express mail equally likely between 1 and 5 pm: f(x) = 1/4 for 1 ≤ x ≤ 5; 0 elsewhere
hence, the probability of receiving an express mail between 2 and 5 pm is P(2 ≤ X ≤ 5) = (5 - 1)/(5 - 1) - (2 - 1)/(5 - 1) = 3/4.
- The mean and the variance: E(X) = (a + b)/2, V(X) = (b - a)²/12
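A quick check of the express-mail example (a sketch assuming scipy; scipy's uniform takes the left endpoint and the interval width):

```python
from scipy.stats import uniform

rv = uniform(loc=1, scale=4)             # uniform on [1, 5]
print(rv.cdf(5) - rv.cdf(2))             # P(2 <= X <= 5) = 0.75
print(rv.mean(), rv.var())               # (a+b)/2 = 3.0, (b-a)^2/12 = 1.333...
```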
5.8 Normal Distribution - In the natural world there are many cases where the possible values are not equally likely. Instead there is a most likely value, and the likelihood decreases symmetrically away from it. This leads to the normal distribution.
- The normal distribution is by far the most widely used probability distribution. Why is the normal distribution so popular? The central limit theorem, and the fact that a linear combination of normal random variables is still normal.
- The probability density function:
f(x) = (1 / (σ√(2π))) e^(-(x - μ)² / (2σ²))
note that the probability function does not have an analytical form; hence, we rely on numerical calculation (Table A.3)
- The mean, variance, and standard deviation of a normal distribution: E(X) = μ, V(X) = σ²
These two parameters uniquely determine the normal distribution. Hence, a normal distribution is often denoted as N(μ, σ).
- Illustration of the normal distribution: the bell shape, the mean μ, and the standard deviation σ: μ ± σ (68% of the area), μ ± 2σ (95.4%), and μ ± 3σ (99.7%).
- In particular, with μ = 0 and σ = 1,
we have the standard normal distribution N(0, 1). - Calculate the probability through the standard normal distribution: transform a normal distribution to the standard normal distribution by Z = (X - μ)/σ, and then
use the normal distribution table (Table A.3)
- An example: given N(16, 1), P(X > 17) = ? Z = (X - 16)/1, so P(X > 17) = P[Z > (17 - 16)/1] = P(Z > 1) = 1 - P(Z < 1) = 1 - 0.8413 (from Table A.3) = 0.1587
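The same probability, computed directly (a sketch assuming scipy rather than Table A.3):

```python
from scipy.stats import norm

rv = norm(loc=16, scale=1)       # N(16, 1): mean 16, standard deviation 1
print(round(rv.sf(17), 4))       # P(X > 17) = 1 - P(X < 17) = 0.1587
```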
- Questions:
given μ and σ, how to calculate P(c1 ≤ X ≤ c2)? given p, μ, and σ, how to calculate x so that P(X > x) = p?
- Given a set of data, it is often necessary to check whether the data set conforms to a normal distribution.
- The student example - the number of hours of study of the 12 students: sorting the data: 10, 12, 12, 14, 14, 14, 15, 15, 15, 20, 20, 25. Note that there are just 6 distinct values, so each accounts for about 100/6 = 16.7 percentile points. Finding the percentiles of the data: 16, 32, 32, 48, 48, 48, 64, 64, 64, 80, 80, 96. Finding the z-values of the percentiles: -1.0, -0.47, -0.47, -0.05, -0.05, -0.05, 0.36, 0.36, 0.36, 0.85, 0.85, 1.75. Plotting the data against the z-values:
[Normal probability plot: hours of study (10 to 25) on the vertical axis versus z-value (-1.5 to 2) on the horizontal axis; the points lie roughly on a straight line.]
Because the horizontal axis comes from a normal distribution, the approximately linear relationship indicates that the distribution of the data can be approximated by a normal distribution. - If a data set conforms to a normal distribution, then the related probability calculations can be done easily. Following the 12-student example: μ = 15.5, σ² = 16 (σ = 4). Question: what is the prob. of picking a student who studies at least 15 hours per week? Answer: we first calculate the z-value: z = (15 - 15.5) / 4 = -0.125
hence, the probability is: P(Z > -0.125) = 1 - P(Z < -0.125) = 1 - 0.45 = 0.55
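Both steps can be sketched in Python (assuming scipy; probplot builds the same kind of normal probability plot coordinates as above):

```python
from scipy import stats

hours = [10, 12, 12, 14, 14, 14, 15, 15, 15, 20, 20, 25]

# Normal probability plot coordinates and the correlation of the fitted line.
(osm, osr), (slope, intercept, r) = stats.probplot(hours, dist="norm")
print(round(r, 3))                            # close to 1 -> roughly normal

# P(X >= 15) for X ~ N(mu = 15.5, sigma = 4)
print(round(stats.norm(15.5, 4).sf(15), 2))   # about 0.55
```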
- As another example, assume that an exam is coming and everybody puts in an extra 3 hours of study per week. What is the probability of picking a student who studies at least 20 hours per week? We first calculate the z-value: z = (20 - 18.5) / 4 = 0.375
hence, P(X > 20) = P(Z > 0.375) = 1 - P(Z < 0.375) = 1 - 0.64 = 0.36.
- As an exercise, you may want to try to find that, given a probability of 95%, what is the range of the hours of study per week for a picked student.
- Normal approximation to the binomial. Assuming n is large and p is not too close to 0 or 1, then
Z = (X - np) / √(np(1 - p))
is approximately normally distributed. This can be demonstrated by the example. In the students example, the probability of picking a student who studies more than 15 hours per week is p = 3/12 = 1/4. Consider sampling with replacement; the probability that exactly 3 of 12 sampled students study more than 15 hours per week is: b(X = 3; n = 12, p = 1/4) = 0.258
Use the normal distribution to approximate: μ = np = (12)(1/4) = 3, σ² = np(1 - p) = (12)(1/4)(3/4) = 9/4 = 2.25 (σ = 1.5)
hence, P(2.5 < X < 3.5) = P[(2.5 - 3)/1.5 < Z < (3.5 - 3)/1.5] = P(-0.33 < Z < 0.33) = 0.6293 - 0.3707 = 0.259
It is seen that the two results are very close (0.258 versus 0.259); the small remaining discrepancy comes from the modest n (n = 12). - The normal approximation to the binomial distribution is very useful when n is large, because the binomial distribution would then require tedious calculation.
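A side-by-side check of the exact and approximate values (a sketch assuming scipy; it uses the exact z = ±1/3 rather than the rounded table value, so the approximation prints as about 0.261):

```python
from scipy.stats import binom, norm

n, p = 12, 1/4
exact = binom.pmf(3, n, p)                      # 0.2581
mu, sigma = n * p, (n * p * (1 - p)) ** 0.5     # 3 and 1.5
# continuity correction: P(X = 3) ~ P(2.5 < Y < 3.5) for Y ~ N(mu, sigma)
approx = norm.cdf(3.5, mu, sigma) - norm.cdf(2.5, mu, sigma)
print(round(exact, 3), round(approx, 3))        # 0.258 0.261
```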
5.9 Exponential distribution, Gamma distribution and Chi-Square (χ²) distribution - There are cases, for example the time to failure of a component, in which the probability density decreases exponentially. This leads to the exponential distribution.
- The probability density function of the exponential distribution: f(x) = (1/β) exp(-x/β) for x > 0, β > 0; 0 elsewhere
- the probability function
F(x) = 1 - exp(-x/β), x > 0, β > 0
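A short sketch (assuming scipy; the value β = 5 below is only an illustration, not from the notes):

```python
from scipy.stats import expon

beta = 5                               # hypothetical mean of the exponential
rv = expon(scale=beta)                 # f(x) = (1/beta) * exp(-x/beta)
print(round(rv.cdf(3), 4))             # F(3) = 1 - exp(-3/5) = 0.4512
print(rv.mean(), rv.var())             # beta = 5 and beta^2 = 25
```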
- To calculate the mean and variance, we need the Gamma (Γ) function: Γ(α) = ∫_0^∞ x^(α-1) e^(-x) dx
using integration by parts: (uv)' = u'v + uv', so uv = ∫ u'v dx + ∫ uv' dx, or ∫ u v' dx = uv - ∫ u' v dx
let u = x^(α-1), dv = e^(-x) dx; it follows that: Γ(α) = [-e^(-x) x^(α-1)]_0^∞ + ∫_0^∞ e^(-x) (α-1) x^(α-2) dx = (α-1) Γ(α-1)
In particular: Γ(α+1) = αΓ(α), Γ(n) = (n-1)! for a positive integer n, and Γ(1/2) = √π
In general: ∫_0^∞ x^(α-1) e^(-x/β) dx = β^α Γ(α)
for the exponential distribution (the case α = 1): E(X) = β, V(X) = β²
- The exponential distribution is related to the Poisson distribution: given a Poisson process with mean λt, the time to the first occurrence follows the exponential distribution.
- Another common case is that the density is small near zero; this leads to the Gamma distribution. The probability density function of the Gamma distribution: f(x) = (1 / (β^α Γ(α))) x^(α-1) e^(-x/β), x > 0, α > 0, β > 0.
- The mean and variance: E(X) = αβ, V(X) = αβ²
- Note that the exponential distribution is a special case of the Gamma distribution with α = 1.
- Another special case of the Gamma distribution is the χ² distribution. Letting α = ν/2 and β = 2 results in the χ² distribution:
f(x) = (1 / (2^(ν/2) Γ(ν/2))) x^(ν/2 - 1) e^(-x/2), x > 0
its mean and variance are as follows: μ = ν, σ² = 2ν
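A small sketch (assuming scipy) showing that the χ² distribution with ν = 4 is the Gamma distribution with α = 2, β = 2:

```python
from scipy.stats import chi2, gamma

nu = 4
g = gamma(a=nu / 2, scale=2)           # alpha = nu/2, beta = 2
c = chi2(df=nu)
print(round(g.pdf(3.0), 4), round(c.pdf(3.0), 4))   # identical densities
print(c.mean(), c.var())               # nu = 4 and 2*nu = 8
```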
- Illustration: [sketch of the pdf curves for the Gamma (or χ²) distribution and for the exponential distribution]
5.10 Weibull distribution - The assumption: similar to Gamma - The probability density function:
f(x) = αβ x^(β-1) e^(-αx^β), x > 0; 0 otherwise
- The probability function: F(x) = 1 - exp(-αx^β), x > 0
- The mean and variance 1/ E(X) = (1 + 1) 2 / 2 V ( X ) = { ( 1 + 2 ) - [ ( 1 + 1 ) ] }
- Application in reliability, defining:
f(t) - the pdf of failure
F(t) - the pf of failure
R(t) = 1 - F(t) - the probability of no failure (the reliability function)
r(t) = f(t) / R(t) - the failure rate function
if: r(t) = f(t)/R(t) = f(t)/(1 - F(t)) = 1/β (a constant),
then f(t) will be exponential.
- Proof: since dF(t)/dt = f(t), the condition gives βF'(t) = 1 - F(t), i.e. βF'(t) + F(t) = 1
solving the above gives: F(t) = 1 - exp(-t/β), t ≥ 0
or f(t) = (1/β) exp(-t/β), t ≥ 0
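A quick numerical confirmation (a sketch assuming scipy; β = 2 is a hypothetical value) that the exponential distribution has a constant failure rate r(t) = 1/β:

```python
import numpy as np
from scipy.stats import expon

beta = 2.0                       # hypothetical mean time to failure
rv = expon(scale=beta)
t = np.array([0.5, 1.0, 3.0, 10.0])
hazard = rv.pdf(t) / rv.sf(t)    # r(t) = f(t) / R(t), with R(t) = 1 - F(t)
print(hazard)                    # [0.5 0.5 0.5 0.5] = 1/beta for every t
```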
5.11 Summary - Discrete distributions:
discrete uniform: equally likely values
binomial and multinomial: number of successes in n independent Bernoulli trials
hypergeometric: dependent sampling (sampling without replacement from a finite population)
negative binomial: number of trials until the kth success
geometric: number of trials until the first success
Poisson: discrete events over continuous intervals
- Continuous distributions:
uniform: equally likely over an interval
normal: has a most likely value with the likelihood decreasing symmetrically
exponential: gradually decreasing
Gamma: small near zero (generalized exponential)
Beta: contained in a finite interval
Weibull: generalized Gamma