Common Distributions in Stochastic Analysis and Inverse Modeling
Presented by Gordon A. Fenton

Common Discrete Distributions

Bernoulli trials form the basis for six very common distributions (binomial, geometric, negative binomial, Poisson, exponential, and gamma).

Fundamental assumptions:
1. Each trial has two possible outcomes (0/1, true/false, success/failure, on/off, yes/no, etc.)
2. Trials are independent (which allows us to easily compute probabilities)
3. The probability of "success" remains constant from trial to trial

Note: if we relax assumptions 1 and 2 we get Markov chains, another suite of powerful stochastic models.

Bernoulli Trials

Let X_i = 1 if the i'th trial is a "success" and X_i = 0 if not. Then the Bernoulli probability distribution is

P[X_i = 1] = p,  P[X_i = 0] = 1 - p = q

Its mean and variance are

E[X_i] = \sum_{j=0}^{1} j P[X_i = j] = 0(q) + 1(p) = p
E[X_i^2] = \sum_{j=0}^{1} j^2 P[X_i = j] = 0^2(q) + 1^2(p) = p
Var[X_i] = E[X_i^2] - E^2[X_i] = p - p^2 = pq

Bernoulli Family of Distributions

Discrete trials:
1) Binomial: N_n = number of "successes" in n trials
2) Geometric: T_1 = number of trials until the first "success"
3) Negative binomial: T_k = number of trials until the k'th "success"

Every "instant" becomes a trial:
4) Poisson: N_t = number of "successes" in time t
5) Exponential (continuous): T_1 = time to the first "success"
6) Gamma (continuous): T_k = time to the k'th "success"

Binomial Distribution

If N_n = number of "successes" in n trials, then

N_n = X_1 + X_2 + ... + X_n = \sum_{i=1}^{n} X_i

since X_i = 1 if the i'th trial results in a "success" and 0 if not.

Mean: E[N_n] = E[\sum_{i=1}^{n} X_i] = \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} p = np

Variance: Var[N_n] = Var[\sum_{i=1}^{n} X_i] = \sum_{i=1}^{n} Var[X_i] (since independent) = \sum_{i=1}^{n} pq = npq

Consider P[N_5 = 2]:

P[N_5 = 2] = P[{SSFFF} \cup {SFSFF} \cup ... \cup {FFFSS}]
           = P[SSFFF] + P[SFSFF] + ... + P[FFFSS]
           = P[S]P[S]P[F]P[F]P[F] + P[S]P[F]P[S]P[F]P[F] + ... + P[F]P[F]P[F]P[S]P[S]
           = p^2 q^3 + p^2 q^3 + ... + p^2 q^3
           = \binom{5}{2} p^2 q^3

In general the binomial distribution is

P[N_n = k] = \binom{n}{k} p^k q^{n-k}

(Plot of the binomial PMF for n = 10, p = 0.4.)

Binomial Distribution: Example

Suppose that the probability of failure, p_f, of a slope is investigated using Monte Carlo simulation. That is, a series of n = 5000 independent realizations of the slope strength are simulated from a given probability distribution, and the number of failures is counted to be n_f = 120. What is the estimated failure probability, and what is the 'standard error' (standard deviation) of this estimate?

Solution: each realization is an independent trial with two possible outcomes, success or failure, having constant probability of failure p_f. Thus each realization is a Bernoulli trial, and the number of failures is a realization of N_n, which follows a binomial distribution.

We have n = 5000 and n_f = 120. The failure probability is estimated by

\hat{p}_f = (number of failures)/(number of trials) = 120/5000 = 0.024

In general \hat{p}_f = N_n / n, so that

\sigma^2_{\hat{p}_f} = Var[\hat{p}_f] = Var[N_n / n] = (1/n^2) Var[N_n] = npq/n^2 = pq/n

which, for \hat{p}_f = 0.024, gives us

\sigma_{\hat{p}_f} = \sqrt{0.024(1 - 0.024)/5000} = 0.0022
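For illustration, here is a minimal Python sketch (assuming NumPy is available) of this calculation. It recomputes the estimate and its standard error from the example's values, then checks the result by repeating the whole Monte Carlo experiment many times with an assumed "true" failure probability of 0.024.

```python
import numpy as np

# Estimate of the slope failure probability and its standard error,
# using the example's values: n = 5000 realizations, n_f = 120 failures.
n, n_f = 5000, 120
p_hat = n_f / n                          # estimated failure probability
se = np.sqrt(p_hat * (1.0 - p_hat) / n)  # standard error, sqrt(pq/n)
print(f"p_hat = {p_hat:.3f}, standard error = {se:.4f}")

# Check by simulation: repeat the whole Monte Carlo experiment many times
# with an assumed "true" failure probability of 0.024; the spread of the
# resulting estimates should be close to the analytical standard error.
rng = np.random.default_rng(0)
estimates = rng.binomial(n, 0.024, size=20_000) / n
print(f"empirical std of p_hat = {estimates.std():.4f}")
```

The empirical spread of the repeated estimates should come out close to the analytical value of about 0.0022.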
Geometric Distribution

Let T_1 = number of trials until the first "success".

Consider P[T_1 = 4] = P[FFFS] = q^3 p. In general,

P[T_1 = k] = p q^{k-1}

E[T_1] = \sum_{k=1}^{\infty} k p q^{k-1} = 1/p

Var[T_1] = E[T_1^2] - E^2[T_1] = \sum_{k=1}^{\infty} k^2 p q^{k-1} - 1/p^2 = q/p^2

Because all trials are independent, the geometric distribution is "memoryless". That is, it doesn't matter when we start looking at a sequence of trials; the distribution of the number of trials to the next "success" remains the same. That is,

P[T_1 > j + k | T_1 > j] = P[T_1 > j + k] / P[T_1 > j]
                         = (\sum_{m=j+k+1}^{\infty} p q^{m-1}) / (\sum_{n=j+1}^{\infty} p q^{n-1}) = q^k

and

P[T_1 > k] = \sum_{m=k+1}^{\infty} p q^{m-1} = p q^k \sum_{m=0}^{\infty} q^m = p q^k / (1 - q) = q^k

(Plot of the geometric PMF P[T_1 = k] = p q^{k-1} for p = 0.4.)

Geometric Distribution: Example

A series of piles have been designed to be able to withstand a certain test load with probability p = 0.8. If the resulting piles are sequentially tested at that design load, what is the probability that the first pile to fail the test is the 7'th pile tested?

Solution: If piles can be assumed to fail independently with constant probability, then this is again a sequence of Bernoulli trials. The "success" we are waiting for is a pile failing the test, which occurs with probability 1 - p = 0.2 on each trial. We thus want to compute

P[T_1 = 7] = (0.2)(0.8)^{7-1} = 0.052

Negative Binomial Distribution

Let T_k = the number of trials until the k'th "success".

Consider

P[T_3 = 5] = P[{SSFFS} \cup {SFSFS} \cup ... \cup {FFSSS}]
           = P[SSFFS] + P[SFSFS] + ... + P[FFSSS]
           = \binom{4}{2} p^3 q^2

In general,

P[T_k = m] = \binom{m-1}{k-1} p^k q^{m-k}

The negative binomial distribution is a generalization of the geometric distribution (the k'th "success" rather than the first). Its mean and variance can be obtained by realizing that T_k is the sum of k independent T_1's. That is,

T_k = T_{1_1} + T_{1_2} + ... + T_{1_k}

so that

E[T_k] = E[\sum_{j=1}^{k} T_{1_j}] = \sum_{j=1}^{k} E[T_{1_j}] = k/p

and

Var[T_k] = Var[\sum_{j=1}^{k} T_{1_j}] = \sum_{j=1}^{k} Var[T_{1_j}] (since independent) = k Var[T_1] = kq/p^2

(Plot of the negative binomial PMF P[T_k = m] = \binom{m-1}{k-1} p^k q^{m-k} for p = 0.4, k = 3.)

Negative Binomial Distribution: Example

A series of piles have been designed to be able to withstand a certain test load with probability p = 0.8. If the resulting piles are sequentially tested at that design load, what is the probability that the third pile to fail the test is the 7'th pile tested?

Solution: If piles can be assumed to fail independently with constant probability, then this is again a sequence of Bernoulli trials. Again the "success" being counted is a pile failing the test, with probability 0.2 per trial. We thus want to compute

P[T_3 = 7] = \binom{7-1}{3-1} (0.2)^3 (0.8)^{7-3} = \binom{6}{2} (0.2)^3 (0.8)^4 = 15(0.008)(0.4096) = 0.049
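For illustration, here is a minimal Python sketch of both pile-testing calculations. It assumes, as in the solutions above, that the event being waited for is a pile failing the test (probability 0.2 per trial), and it adds a quick simulation check of the geometric result.

```python
import numpy as np
from math import comb

# Pile-testing example: each load test is a Bernoulli trial, and the event
# we are waiting for is assumed to be a pile *failing* the test, which
# (given a design survival probability of 0.8) occurs with probability 0.2.
p_fail = 0.2

# Geometric: probability that the first failure occurs on the 7th test.
p_first_fail_at_7 = p_fail * (1 - p_fail) ** 6
print(f"P[T1 = 7] = {p_first_fail_at_7:.4f}")     # ~0.052

# Negative binomial: probability that the 3rd failure occurs on the 7th test.
k, m = 3, 7
p_third_fail_at_7 = comb(m - 1, k - 1) * p_fail**k * (1 - p_fail) ** (m - k)
print(f"P[T3 = 7] = {p_third_fail_at_7:.4f}")     # ~0.049

# Quick simulation check of the geometric result.
rng = np.random.default_rng(1)
trials = rng.random((200_000, 7)) < p_fail        # True where a pile fails
first_fail = trials.argmax(axis=1) + 1            # 1-based index of first failure
hit = (first_fail == 7) & trials.any(axis=1)      # first failure exactly on test 7
print(f"simulated P[T1 = 7] = {hit.mean():.4f}")
```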
Poisson Distribution

We now let every instant in time (or space) become a Bernoulli trial (i.e., a sequence of independent trials). "Successes" can now occur at any instant in time.

PROBLEM: in any time interval there are now an infinite number of Bernoulli trials. If the probability of "success" of a trial is, say, 0.2, then in any time interval we get an infinite number of "successes". This is not a very useful model!

SOLUTION: we can no longer talk about the probability of "success" of individual trials, p (which becomes zero). We must now talk about the mean rate of "success", \lambda.

The Poisson distribution governs many rate-dependent processes (arrivals of vehicles at an intersection, 'packets' through an internet gateway, earthquakes exceeding magnitude M, floods, load extremes, etc.).

Let N_t = number of "successes" (arrivals) in time interval t. The distribution of N_t is arrived at in the limit as the number of trials goes to infinity, assuming the probability of "success" is proportional to the number of trials (see the course notes for details). We get

P[N_t = k] = (\lambda t)^k e^{-\lambda t} / k!

(Plot of the Poisson PMF for \lambda = 0.9, t = 4.5.)

Poisson Distribution: Example

Many research papers suggest that the arrivals of earthquakes follow a Poisson process over time. Suppose that the mean time between earthquakes is 50 years at a particular location.

1. How long must the time period be so that the probability that no earthquakes occur during that period is at most 0.1?

Solution: \lambda = 1/E[T_1] = 1/50 = 0.02. Let N_t be the number of earthquakes over the time interval t. We want to find t such that

P[N_t = 0] = (\lambda t)^0 e^{-\lambda t} / 0! = e^{-\lambda t} \le 0.1

This gives t \ge -ln(0.1)/\lambda = -ln(0.1)/0.02 = 115 years.

2. Suppose that 50 years pass without any earthquakes occurring. What is the probability that another 50 years will pass without any earthquakes occurring?

Solution:

P[N_100 = 0 | N_50 = 0] = P[N_100 = 0 \cap N_50 = 0] / P[N_50 = 0] = P[N_100 = 0] / P[N_50 = 0]
                        = e^{-100\lambda} / e^{-50\lambda} = e^{-50\lambda} = e^{-1} = 0.368

Note that, due to memorylessness,

P[N_50 = 0] = e^{-50\lambda} = e^{-1} = 0.368

Simulating a Poisson Distribution

Since the Poisson process derives from an infinite number of independent and equilikely Bernoulli trials over time (or space), its simulation is simple: the "success" (arrival) times are uniformly distributed (equilikely) over any interval, so, given the number of arrivals in an interval, their times can be simulated as independent uniformly distributed points on that interval.

Common Continuous Distributions

The Bernoulli family extends to two continuous distributions when every instant in time (space) becomes a Bernoulli trial:

1. Exponential: T_1 = time until the next "success" (arrival). This is the continuous-time analog of the geometric distribution.
2. Gamma: T_k = time until the k'th "success" (or arrival). This is the continuous-time analog of the negative binomial distribution.

Exponential Distribution

Consider the Poisson distribution, which governs the number of "successes" in time interval t when every instant is an independent Bernoulli trial. We know that the Poisson distribution specifies that

P[N_t = k] = (\lambda t)^k e^{-\lambda t} / k!

What is the probability that the time to the first "success" is greater than t? If the first "success" occurs later than t, then the number of successes in the time interval t must be zero, i.e.

P[T_1 > t] = P[N_t = 0] = e^{-\lambda t}
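For illustration, here is a minimal Python sketch (assuming NumPy) of the earthquake example. It computes the required period from e^{-\lambda t} \le 0.1 with the example's rate \lambda = 1/50 per year, evaluates the 50-year no-earthquake probability, and checks the latter by simulating exponential times to the first earthquake.

```python
import numpy as np

# Earthquake example: Poisson arrivals with mean inter-arrival time 50 years,
# so the mean rate is lam = 1/50 = 0.02 per year (value from the example).
lam = 1 / 50

# 1) Shortest period t for which P[no earthquakes in t] <= 0.1:
#    e^{-lam t} <= 0.1  =>  t >= -ln(0.1)/lam
t_min = -np.log(0.1) / lam
print(f"t >= {t_min:.0f} years")                     # ~115 years

# 2) Memorylessness: P[N_100 = 0 | N_50 = 0] = e^{-50 lam} = P[N_50 = 0]
print(f"P[N_50 = 0] = {np.exp(-50 * lam):.3f}")      # ~0.368

# Simulation check: exponential time to the first earthquake, mean 1/lam.
rng = np.random.default_rng(2)
first_quake = rng.exponential(scale=1 / lam, size=100_000)
print(f"simulated P[no quake in 50 yr] = {(first_quake > 50).mean():.3f}")
```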