
STAT 350 - An Introduction to Named Discrete Distributions Jeremy Troisi

1 Bernoulli Distribution: 'Single Coin Flip'

1 trial of an experiment that yields either a success or a failure.

X ∼ Bern(p), X ∼ the # of successes, where p is the probability of "success" (however success is defined).

1.1 Probability Mass Function (PMF) P (X = x)

x        | 0     | 1
P(X = x) | 1 − p | p

⇒ P(X = x) = p^x (1 − p)^{1−x} : x = 0, 1; 0 ≤ p ≤ 1

1.2 Rules

1. ∑_{x∈X} P(X = x) = ∑_{x=0}^{1} P(X = x) = (1 − p) + p = (p + (1 − p))^1 = 1^1 = 1

2. 0 ≤ p ≤ 1 ⇒ p ≥ 0 and 1 − p ≥ 0 ⇒ P(X = x) ≥ 0

∴ X is a PMF.

1.3 Average/Mean/Expected Value (µ = E[X] = ∑_{x∈X} x P(X = x))

µ = E[X] = ∑_{x∈X} x P(X = x) = ∑_{x=0}^{1} x P(X = x) = 0 · (1 − p) + 1 · p = p

The mean is the probability of success, p, as it ought to be.

1.4 Variance/Average Distance Squared away from the Mean (σ² = Var[X])

Var[X] = E[(X − µ)²] = ∑_{x∈X} (x − µ)² P(X = x) = ∑_{x=0}^{1} (x − p)² P(X = x) = (0 − p)²(1 − p) + (1 − p)² p

= p(1 − p)[p + (1 − p)] = p(1 − p)

The variance is the probability of success times the probability of failure: each trial either succeeds or fails, and the variance captures the variability between these two outcomes by multiplying their probabilities together.

1.5 Examples

1. Single flip of a coin

2. Single roll of a die

3. Will the world end tomorrow?
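The two moments above are easy to check by simulation. Below is a minimal sketch (assuming NumPy is available; p = 0.3, the seed, and the sample size are arbitrary illustrative choices) comparing sample moments of simulated Bernoulli trials to p and p(1 − p).

```python
# Minimal sketch: comparing sample moments of simulated Bernoulli trials
# to the derived values E[X] = p and Var[X] = p(1 - p).
# p, the seed, and the sample size are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(seed=350)
p = 0.3
x = rng.random(1_000_000) < p  # each entry is True (success) with probability p

print(f"sample mean:     {x.mean():.4f}  vs  p        = {p:.4f}")
print(f"sample variance: {x.var():.4f}  vs  p(1 - p) = {p * (1 - p):.4f}")
```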

2 Binomial Distribution: A Series or Sequence of 'Coin Flips'

n trials of an experiment that yields either a success or a failure on each trial, independent of prior successes or failures.

Thus, X_1, X_2, ..., X_n ∼_{iid} Bern(p) ⇒ X = ∑_{i=1}^{n} X_i ∼ Bin(n, p), X ∼ the # of successes. Here n is the number of trials and p is the probability of "success" (whatever is defined to be success).

2.1 Probability Mass Function (PMF) P (X = x)

P(X = x) = C(n, x) p^x (1 − p)^{n−x} : x = 0, 1, ..., n; 0 < p < 1, where C(n, x) = n!/(x!(n − x)!) is the binomial coefficient.

2.2 Rules

n n x n−x n n 1. x∈X P (X = x) = x=0 x p (1 − p) = (p + (1 − p)) = 1 = 1  Pn x P n−x n x n−x 2. x ≥ 1, p ≥ 0, (1 − p) ≥ 0 ⇒ P (X = x) = x p (1 − p) ≥ 0   ∴ X is a PMF.

2.3 Average/Mean/Expected Value (µ = E[X] = ∑_{x∈X} x P(X = x))

µ = E[X] = E[∑_{i=1}^{n} X_i] = ∑_{i=1}^{n} E[X_i] = ∑_{i=1}^{n} p = np

2.4 Variance/Average Distance Squared away from the Mean (σ2 = V ar[X])

Var[X] = Var[∑_{i=1}^{n} X_i] =_{1. INDEPENDENCE} ∑_{i=1}^{n} Var[X_i] = ∑_{i=1}^{n} p(1 − p) = np(1 − p)

1. Var[∑_{i=1}^{n} X_i] = ∑_{i=1}^{n} Var[X_i] is true ONLY WHEN the X_i are independent. Otherwise, we would have to determine and add in the covariance between each pair X_i, X_j.

2.5 Examples

1. n flips of a coin

2. n rolls of a die
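As with the Bernoulli case, these formulas can be checked numerically. A brief sketch follows (assuming SciPy is available; n = 20 and p = 0.3 are arbitrary illustrative values) that sums the Bin(n, p) PMF and its first two moments directly.

```python
# Sketch: checking that the Bin(n, p) PMF sums to 1 and that its mean and
# variance equal np and np(1 - p). n and p are arbitrary illustrative values.
from scipy.stats import binom

n, p = 20, 0.3
xs = range(n + 1)
pmf = [binom.pmf(x, n, p) for x in xs]

mean = sum(x * q for x, q in zip(xs, pmf))
var = sum((x - mean) ** 2 * q for x, q in zip(xs, pmf))

print(f"PMF total: {sum(pmf):.6f}")  # Rule 1: totals to 1
print(f"mean:      {mean:.4f}  vs  np       = {n * p:.4f}")
print(f"variance:  {var:.4f}  vs  np(1-p)  = {n * p * (1 - p):.4f}")
```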

3 Poisson Distribution: Counting over a Continuous Metric

Counting over a continuous metric (e.g. time, distance, mass, area, volume, etc.) can be associated with a number of independent trials n approaching infinity, each with an extraordinarily rare chance of occurring, p approaching 0, such that np → λ > 0.

This is merely an asymptotic case of the Binomial Distribution. Thus, if p < 0.01 and n > 100, a Binomial Distribution can be approximated by a Poisson Distribution with rate parameter λ = np.
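Before the formal limit argument, here is a quick numerical illustration of that rule of thumb (assuming SciPy is available; n = 1000 and p = 0.005, so λ = np = 5, are arbitrary values satisfying n > 100 and p < 0.01).

```python
# Sketch: for large n and small p, Bin(n, p) probabilities are close to
# Poisson(λ = np) probabilities. n and p are arbitrary illustrative values.
from scipy.stats import binom, poisson

n, p = 1000, 0.005
lam = n * p  # λ = 5

for x in range(10):
    print(f"x = {x}:  Bin = {binom.pmf(x, n, p):.5f}   Poisson = {poisson.pmf(x, lam):.5f}")
```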

X = ∑_{i=1}^{n} X_i ∼ Bin(n, p) → Poisson(λ) as n ↗ ∞, p ↘ 0 such that np → λ :

Proof:

Throughout, "lim" denotes the limit as n ↗ ∞, p ↘ 0 with np → λ > 0, and we write (1 − p)^{n−x} = (1 − λ/n)^n / (1 − p)^x.

lim P(X = x) = lim C(n, x) p^x (1 − p)^{n−x} = lim [n!/(x!(n − x)!)] p^x (1 − λ/n)^n / (1 − p)^x

= (1/x!) lim [∏_{i=0}^{x−1} (n − i) p] (1 − λ/n)^n / (1 − p)^x =_{x finite} (1/x!) lim [∏_{i=0}^{x−1} np] (1 − λ/n)^n / (1 − p)^x

= (1/x!) lim λ^x (1 − λ/n)^n / (1 − p)^x = (λ^x/x!) lim (1 − λ/n)^n / (1 − p)^x = (λ^x/x!) · (e^{−λ}/1)

= e^{−λ} λ^x / x! : x = 0, 1, ...; λ > 0 ⇒ X ∼ Poisson(λ)

3.1 Probability Mass Function (PMF) P (X = x)

P(X = x) = e^{−λ} λ^x / x! : x = 0, 1, ...; λ > 0

3.2 Rules

1. ∑_{x∈X} P(X = x) = lim_{n↗∞} ∑_{x=0}^{n} e^{−λ} λ^x/x! = e^{−λ} lim_{n↗∞} ∑_{x=0}^{n} λ^x/x! =_{Calc-II: Taylor Polynomials} e^{−λ}(e^{λ})

= e^{−λ+λ} = e^0 = 1

2. e^{−λ} ≥ 0, λ^x ≥ 0, x! ≥ 1 ⇒ P(X = x) = e^{−λ} λ^x/x! ≥ 0

∴ X is a PMF.

3.3 Average/Mean/Expected Value (µ = E[X] = ∑_{x∈X} x P(X = x))

µ = E[X] = ∑_{x∈X} x P(X = x) = lim_{n↗∞} ∑_{x=0}^{n} x (e^{−λ} λ^x/x!) = λ lim_{n↗∞} ∑_{x=1}^{n} e^{−λ} λ^{x−1}/(x − 1)! = λ lim_{n↗∞} ∑_{z=0}^{n−1} e^{−λ} λ^z/z!

=_{1.} λ lim_{n↗∞} ∑_{z=0}^{n−1} P(Poisson(λ) = z) = λ · 1 = λ

1. lim_{n↗∞} ∑_{z=0}^{n} e^{−λ} λ^z/z! = lim_{n↗∞} ∑_{z=0}^{n} P(Poisson(λ) = z) = 1 is known mathematically as a kernel argument: a complicated expression is rearranged into a form we already recognize, here a PMF, whose known properties then simplify it. This technique is used often in probability to solve complicated discrete sums and integrals by recasting them in the form of a PMF or PDF. We will now use this same technique to derive the variance.
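A small numerical sketch of the kernel idea (standard library only; λ = 4 and the truncation at 100 terms are arbitrary, the truncated sum standing in for the limit n ↗ ∞): summing the Poisson PMF gives 1, so the reindexed sum of x · P(X = x) collapses to λ.

```python
# Sketch of the kernel technique: the Poisson PMF sums to 1, so the sum
# of x * P(X = x) collapses to λ. λ and the 100-term cutoff are arbitrary;
# terms beyond the cutoff are negligible for a λ this small.
from math import exp, factorial

lam = 4.0

def pmf(x):
    return exp(-lam) * lam**x / factorial(x)

total = sum(pmf(x) for x in range(100))     # kernel: sums to 1
mean = sum(x * pmf(x) for x in range(100))  # E[X] = λ
print(f"PMF total = {total:.6f}, mean = {mean:.6f} (λ = {lam})")
```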

3.4 Variance/Average Distance Squared away from the Mean (σ² = Var[X])

Var[X] =_{Dr. Sellke's Continuous Distribution Notes} E[X²] − µ² = E[X([X − 1] + 1)] − λ²

=_{Known as Falling Powers} E[X(X − 1)] + E[X] − λ² = E[X(X − 1)] + λ − λ² = ∑_{x∈X} [x(x − 1)] P(X = x) + λ − λ²

= lim_{n↗∞} ∑_{x=0}^{n} [x(x − 1)](e^{−λ} λ^x/x!) + λ − λ² = λ² lim_{n↗∞} ∑_{x=2}^{n} e^{−λ} λ^{x−2}/(x − 2)! + λ − λ²

= λ² lim_{n↗∞} ∑_{z=0}^{n−2} e^{−λ} λ^z/z! + λ − λ² =_{1.} λ² (∑_{z=0}^{∞} P(Poisson(λ) = z)) + λ − λ² = λ²(1) + λ − λ² = λ

3.5 Sum of Independent Poisson Distributions is Poisson Distributed

Y_1 ∼ Poisson(λ_1) ⊥ Y_2 ∼ Poisson(λ_2) ⇒ Z_2 = Y_1 + Y_2 ∼ Poisson(λ_1 + λ_2) :

Proof:

P(Z_2 = Y_1 + Y_2 = z) = ∑_{y_1=0}^{z} P(Y_1 = y_1 ∩ Y_2 = z − y_1)

=_{Multiplication Rule, Y_1 ⊥ Y_2} ∑_{y_1=0}^{z} P(Y_1 = y_1) P(Y_2 = z − y_1) = ∑_{y_1=0}^{z} (e^{−λ_1} λ_1^{y_1}/y_1!)(e^{−λ_2} λ_2^{z−y_1}/(z − y_1)!)

= (e^{−(λ_1+λ_2)}/z!) ∑_{y_1=0}^{z} [z!/(y_1!(z − y_1)!)] λ_1^{y_1} λ_2^{z−y_1} = (e^{−(λ_1+λ_2)}/z!) ∑_{y_1=0}^{z} C(z, y_1) λ_1^{y_1} λ_2^{z−y_1} =_{Binomial Theorem} (e^{−(λ_1+λ_2)}/z!) (λ_1 + λ_2)^z

= e^{−(λ_1+λ_2)} (λ_1 + λ_2)^z / z! ⇒ Z_2 ∼ Poisson(λ_1 + λ_2)

Through the Principle of Mathematical Induction (PMI), this can be extended to show that any number of mutually independent Poisson Distributions can be lumped into one Poisson Distribution with the parameters added together.

∴ Y_i ∼_{independently} Poisson(λ_i) ⇒ Z_n = ∑_{i=1}^{n} Y_i ∼ Poisson(∑_{i=1}^{n} λ_i)
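A simulation sketch of this closure property (assuming NumPy and SciPy are available; λ_1 = 2, λ_2 = 3.5, the seed, and the sample size are arbitrary illustrative choices):

```python
# Sketch: the empirical distribution of Y1 + Y2, with Y1 and Y2 independent
# Poissons, matches the Poisson(λ1 + λ2) PMF. All parameters are arbitrary.
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(seed=350)
lam1, lam2, size = 2.0, 3.5, 500_000
z = rng.poisson(lam1, size) + rng.poisson(lam2, size)

for k in range(8):
    print(f"P(Z = {k}): empirical {np.mean(z == k):.4f}  "
          f"vs  Poisson(λ1 + λ2) {poisson.pmf(k, lam1 + lam2):.4f}")
```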

3.6 Examples

1. Number of times Jeremy trips while teaching a lecture.

2. Number of accidents at a particular street corner for a certain period of time.

3. Number of people served in a restaurant for a certain period of time.

4 Geometric Distribution: 'Coin Flips Until the First Success'

X ∼ geo(p), X ∼ the # of independent Bernoulli trials necessary to achieve the first success.

4.1 Probability Mass Function (PMF) P (X = x)

P(X = x) = (1 − p)^{x−1} p : x = 1, 2, ...; 0 < p < 1

4.2 Rules

1. ∑_{x∈X} P(X = x) = lim_{n↗∞} ∑_{x=1}^{n} (1 − p)^{x−1} p = p lim_{n↗∞} ∑_{y=0}^{n−1} (1 − p)^y

=_{Calc-II: Geometric Sum/Calc Pre-Test} p · 1/(1 − (1 − p)) = p · (1/p) = 1

2. (1 − p)^{x−1} ≥ 0, p > 0 ⇒ P(X = x) = (1 − p)^{x−1} p ≥ 0

∴ X is a PMF.

4.3 Average/Mean/Expected Value (µ = E[X] = ∑_{x∈X} x P(X = x))

µ = E[X] = ∑_{x∈X} x P(X = x) = lim_{n↗∞} ∑_{x=1}^{n} x[(1 − p)^{x−1} p] = p lim_{n↗∞} ∑_{x=1}^{n} (d/dp)[−(1 − p)^x]

= −p (d/dp) lim_{n↗∞} ∑_{x=1}^{n} (1 − p)^x =_{Calc Pre-Test: Geometric Sum} −p (d/dp)[(1 − p)/p] = −p (−1/p²) = 1/p

4.4 Variance/Average Distance Squared away from the Mean (σ2 = V ar[X])

σ² = Var[X] =_{Dr. Sellke's Continuous Distribution Notes} E[X²] − E[X]² = E[X([X − 1] + 1)] − (1/p)²

=_{Known as Falling Powers} E[X(X − 1)] + E[X] − 1/p² = ∑_{x∈X} x(x − 1) P(X = x) + 1/p − 1/p²

= lim_{n↗∞} ∑_{x=1}^{n} x(x − 1)[(1 − p)^{x−1} p] + 1/p − 1/p²

=_{NOW, similar techniques/'tricks' as E[X], but 2 derivatives instead} 2(1 − p)/p² + 1/p − 1/p² = [2(1 − p) + p − 1]/p² = (1 − p)/p²

4.5 Tail Sum Property

P(X ≤ x) = ∑_{i=1}^{x} P(X = i) = ∑_{i=1}^{x} (1 − p)^{i−1} p = p ∑_{i=0}^{x−1} (1 − p)^i =_{Geometric Sums} p · [1 − (1 − p)^{(x−1)+1}]/[1 − (1 − p)]

= p · [1 − (1 − p)^x]/p = 1 − (1 − p)^x

⇒ P(X > x) = 1 − P(X ≤ x) = 1 − [1 − (1 − p)^x] = (1 − p)^x ... Eventually we will get that heads!
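The tail formula is easy to confirm against simulated draws. A sketch assuming NumPy is available (p = 0.25, the seed, and the sample size are arbitrary); note that NumPy's geometric sampler uses the same trials-until-first-success convention as this section.

```python
# Sketch: empirical P(X > x) vs the tail-sum formula (1 - p)^x.
# numpy's geometric sampler has support x = 1, 2, ..., matching this section.
import numpy as np

rng = np.random.default_rng(seed=350)
p = 0.25
draws = rng.geometric(p, 1_000_000)

for x in range(1, 6):
    print(f"P(X > {x}): empirical {np.mean(draws > x):.4f}  vs  (1-p)^x = {(1 - p) ** x:.4f}")
```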

4.5.1 Law of the Unconscious Statistician

E[X] = ∑_{x∈X} x P(X = x) = lim_{n↗∞} ∑_{x=1}^{n} x P(X = x) = lim_{n↗∞} ∑_{x=1}^{n} [∑_{i=1}^{x} 1] P(X = x)

= lim_{n↗∞} ∑_{x=1}^{n} ∑_{i=1}^{x} P(X = x) = lim_{n↗∞} ∑_{i=1}^{n} lim_{m↗∞} ∑_{x=i}^{m} P(X = x)

= lim_{n↗∞} ∑_{i=1}^{n} P(X ≥ i) = lim_{n↗∞} ∑_{x=0}^{n} P(X > x)

=_{Geometric Distribution} lim_{n↗∞} ∑_{x=0}^{n} (1 − p)^x =_{Geometric Sum} 1/(1 − (1 − p)) = 1/p

This would have been a much easier way to obtain µ above. This method holds for all positive r.v.s (or the absolute value of ANY r.v.), discrete and continuous (including the Exponential) alike. This is a proof of the base case only, but it can be extended to solve for E[X²] (and hence Var[X]) as well as other higher moments.
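A one-line numerical check of this tail-sum identity (p = 0.25 and the 500-term truncation are arbitrary; the geometric decay of the tail makes the truncation error negligible):

```python
# Sketch: E[X] recovered as the sum of tail probabilities P(X > x).
# The 500-term cutoff is arbitrary; (1 - p)^x vanishes geometrically.
p = 0.25
tail_sum = sum((1 - p) ** x for x in range(500))  # sums P(X > x) over x = 0, 1, ...
print(f"sum of tails = {tail_sum:.6f}  vs  1/p = {1 / p:.6f}")
```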

4.6 Memoryless (Forget about it!) Property

P(X > s + t | X > t) = P(X > s + t ∩ X > t)/P(X > t) =_{Subset} P(X > s + t)/P(X > t) =_{Tail Sum Property} (1 − p)^{s+t}/(1 − p)^t

= (1 − p)^s = P(X > s) ... Forget about it!
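The memoryless property can also be seen in simulation; a sketch assuming NumPy is available (p = 0.2, s = 3, t = 5, the seed, and the sample size are arbitrary illustrative choices):

```python
# Sketch: given survival past t trials, the chance of surviving s more
# trials matches the unconditional P(X > s). All parameters are arbitrary.
import numpy as np

rng = np.random.default_rng(seed=350)
p, s, t = 0.2, 3, 5
draws = rng.geometric(p, 1_000_000)

conditional = np.mean(draws[draws > t] > s + t)  # P(X > s + t | X > t)
unconditional = np.mean(draws > s)               # P(X > s)
print(f"P(X > s + t | X > t) = {conditional:.4f}  vs  P(X > s) = {unconditional:.4f}")
```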

4.7 Examples

1. Flip a coin until a heads results

2. Roll a die until a six results

5 Summary

Exp(λ) ←_{p↘0} 4: geo(p) ←_{# of ⊥ trials until 1st success} 1: Bern(p) →_{X = ∑_{i=1}^{n} X_i, X_i ∼_{iid} Bern(p)} 2: Bin(n, p) →_{n↗∞, p↘0, s.t. np→λ} 3: Poi(λ)

Distribution | 1 Bernoulli (Bern) | 2 Binomial (Bin)          | 3 Poisson      | 4 Geometric (Geo)
Trials       | 1                  | 1 < n < ∞                 | n ↗ ∞          | UNKNOWN (successes fixed at 1, NOT trials)
PMF          | p^x (1 − p)^{1−x}  | C(n, x) p^x (1 − p)^{n−x} | e^{−λ} λ^x/x!  | (1 − p)^{x−1} p
Support      | x = 0, 1           | x = 0, 1, ..., n          | x = 0, 1, ...  | x = 1, 2, ...
E[X]         | p                  | np                        | λ              | 1/p
Var[X]       | p(1 − p)           | np(1 − p)                 | λ              | (1 − p)/p²
