
Lecture 5: Poisson, Hypergeometric, and Geometric Distributions
Sta 111
Colin Rundel
May 20, 2014

Binomial Approximations

Last time we looked at the normal approximation for the binomial distribution:
- Works well when n is large
- Continuity correction helps
- Binomial can be skewed but Normal is symmetric
- At a minimum we want np ≥ 10 and nq ≥ 10

What do we do when p is close to 0 or 1?
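As a quick illustration (a sketch added here, not part of the original slides; it assumes Python with scipy is available), the continuity-corrected Normal approximation tracks the exact Binomial probability well when np is large, but drifts when p is very small:

```python
from scipy.stats import binom, norm

def normal_cdf_cc(k, n, p):
    """P(X <= k) for X ~ Binom(n, p), using a Normal with matching
    mean and variance plus a continuity correction."""
    mu = n * p
    sigma = (n * p * (1 - p)) ** 0.5
    return norm.cdf(k + 0.5, loc=mu, scale=sigma)

n = 100
for p, k in [(0.5, 45), (0.005, 0)]:        # np = 50 vs np = 0.5
    exact = binom.cdf(k, n, p)
    approx = normal_cdf_cc(k, n, p)
    print(f"p = {p}: P(X <= {k}) exact = {exact:.3f}, normal = {approx:.3f}")
```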


Alternative Approximation

Let X ∼ Binom(n, p), which we will reparameterize so that p = λ/n for a fixed value of λ. As such, λ/n is small when n is large. We will evaluate the limit of each factor as n → ∞, starting with

A_n = \frac{n!}{n^k \,(n - k)!} \;\longrightarrow\; 1 \quad \text{as } n \to \infty

Alternative Approximation, cont.

The remaining two factors are

B_n = \left(1 - \frac{\lambda}{n}\right)^{n} \;\longrightarrow\; e^{-\lambda}
\qquad
C_n = \left(1 - \frac{\lambda}{n}\right)^{-k} \;\longrightarrow\; 1 \quad \text{as } n \to \infty
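A small numerical check (added here, not from the slides; plain Python) that the three factors behave as claimed when n grows with λ and k held fixed:

```python
from math import perm, exp

lam, k = 3.0, 2
for n in (10, 100, 10_000):
    A = perm(n, k) / n**k          # n! / (n^k (n-k)!)  -> 1
    B = (1 - lam / n) ** n         # -> e^(-lam)
    C = (1 - lam / n) ** (-k)      # -> 1
    print(f"n = {n}: A = {A:.4f}, B = {B:.4f}, C = {C:.4f}")
print(f"e^(-lam) = {exp(-lam):.4f}")
```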


Alternative Approximation, cont.

Let X ∼ Binom(n, p), which we reparameterize such that p = λ/n for a fixed value of λ. As such, λ/n is small when n is large.

P(X = k \mid n, p = \lambda/n)
  = \binom{n}{k}\left(\frac{\lambda}{n}\right)^{k}\left(1 - \frac{\lambda}{n}\right)^{n-k}
  = \underbrace{\frac{n!}{n^k (n-k)!}}_{A_n} \cdot \frac{\lambda^k}{k!} \cdot \underbrace{\left(1 - \frac{\lambda}{n}\right)^{n}}_{B_n} \cdot \underbrace{\left(1 - \frac{\lambda}{n}\right)^{-k}}_{C_n}

\lim_{n \to \infty} P(X = k \mid n, p = \lambda/n) = e^{-\lambda}\,\frac{\lambda^k}{k!}

Therefore for large n,

P(X = k \mid n, p = \lambda/n) \approx e^{-\lambda}\,\frac{\lambda^k}{k!}

Poisson Distribution

Let X be a random variable reflecting the number of events in a given period, where the expected number of events in that interval is λ. Then the probability of k occurrences (k ≥ 0) in the interval is given by the Poisson distribution, X ∼ Pois(λ):

P(X = k \mid \lambda) = f(k \mid \lambda) = e^{-\lambda}\,\frac{\lambda^k}{k!}

We use this approximation to the Binomial when p is very small and n is very large, since λ = np tends to be reasonable.
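A quick numerical comparison (added here; it assumes Python with scipy, and the values of λ and k are arbitrary): the Binom(n, λ/n) pmf approaches the Pois(λ) pmf as n grows.

```python
from scipy.stats import binom, poisson

lam, k = 4.0, 3
for n in (10, 100, 10_000):
    print(f"n = {n}: Binom(n, lam/n) pmf at k = {binom.pmf(k, n, lam / n):.5f}")
print(f"Pois(lam) pmf at k = {poisson.pmf(k, lam):.5f}")
```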

Poisson Distribution - Mode

To find the mode we can use the same approach that we used with the Binomial distribution. Therefore k_mode is the smallest integer greater than λ − 1:

k_{\text{mode}} = \begin{cases} \lambda - 1,\ \lambda & \text{if } \lambda = \lceil \lambda \rceil \\ \lceil \lambda \rceil - 1 & \text{otherwise} \end{cases}

Poisson Distribution - Example

Assume you have a sample of a stable isotope of an element; there are approximately 10^{20} atoms in this sample. If on average one of these atoms will radioactively decay every 10^{12} years (≈ 5 × 10^{19} secs), what is the probability that 4 or fewer atoms decay in the next second?
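One way to carry out this calculation (a sketch added here, not from the slides; it assumes Python with scipy and takes the figures above at face value, giving an expected rate of λ = 10^{20} / (5 × 10^{19}) = 2 decays per second):

```python
from scipy.stats import poisson

n_atoms = 1e20                   # atoms in the sample
secs_per_decay = 5e19            # one decay per atom roughly every 5e19 seconds
lam = n_atoms / secs_per_decay   # expected decays per second: 2.0

print("P(X <= 4) =", poisson.cdf(4, lam))   # roughly 0.947
```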


Approximation - Mean & Variance

We defined p = λ/n, and we know that for a Binomial random variable

\mu = np \qquad \sigma^2 = npq

Then for large n we get

\lim_{n \to \infty} \mu = \lim_{n \to \infty} n\,\frac{\lambda}{n} = \lambda

\lim_{n \to \infty} \sigma^2 = \lim_{n \to \infty} n\,\frac{\lambda}{n}\left(1 - \frac{\lambda}{n}\right) = \lambda \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right) = \lambda

Poisson and Normal Distributions

Based on the connection between the Binomial and Poisson distributions, it intuitively makes sense that we should also be able to approximate the Poisson with a Normal distribution.

For the Normal approximation to the Binomial we need np ≥ 10 and nq ≥ 10. What is a reasonable requirement for λ?
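As a quick numerical check on the limits above (an illustration added here, not from the slides; plain Python), the mean of Binom(n, λ/n) equals λ exactly and its variance approaches λ as n grows:

```python
lam = 3.0
for n in (10, 100, 10_000):
    p = lam / n
    mean = n * p                 # = lam for every n
    var = n * p * (1 - p)        # -> lam as n grows
    print(f"n = {n}: mean = {mean:.4f}, variance = {var:.4f}")
```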

Poisson and Normal Distributions, cont.

[Figure: pmf plots of Pois(1), Pois(5), Pois(10), and Pois(20)]

Another way to look at the Binomial

Imagine we have a population that is partitioned into 'good' and 'bad' subsets. Let G be the number of good elements in the population, B the number of bad elements, and N = B + G.

If we sample this population with replacement, what is the probability that we observe g good samples and b bad samples?

This is still the Binomial distribution, but rewritten such that

P(\text{$g$ good, $b$ bad in $n = b + g$ tries}) = \binom{n}{g} \frac{G^g B^b}{N^n} = \binom{n}{g} \left(\frac{G}{N}\right)^{g} \left(\frac{B}{N}\right)^{b} = \binom{n}{g} \left(\frac{G}{N}\right)^{g} \left(1 - \frac{G}{N}\right)^{n - g}
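A small numerical check of the rewritten form (added here as an illustration; it assumes Python with scipy, and the values of N, G, n, and g are arbitrary): sampling with replacement is just a Binomial with success probability G/N.

```python
from math import comb
from scipy.stats import binom

N, G = 50, 20            # population: 20 'good' and 30 'bad' elements
n, g = 10, 4             # sample size and number of good draws
b, B = n - g, N - G

direct = comb(n, g) * G**g * B**b / N**n      # the rewritten form above
via_binom = binom.pmf(g, n, G / N)            # Binom(n, G/N) pmf at g
print(direct, via_binom)                      # identical
```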


Hypergeometric Distribution

What would change if we were sampling without replacement? Let X be a random variable reflecting the number of successes in n draws, without replacement, from a finite population of size N containing m desired items. Then the probability of k successes is given by the Hypergeometric distribution, X ∼ Hypergeo(N, m, n):

P(X = k) = f(k \mid N, m, n) = \frac{\binom{m}{k}\binom{N - m}{n - k}}{\binom{N}{n}}
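A sketch of this pmf in code (added here; it assumes Python with scipy, and the values of N, m, n, k are arbitrary). Note that scipy.stats.hypergeom takes its arguments in the order (population size, successes in population, draws):

```python
from math import comb
from scipy.stats import hypergeom

N, m, n, k = 50, 20, 10, 4
direct = comb(m, k) * comb(N - m, n - k) / comb(N, n)
print(direct, hypergeom.pmf(k, N, m, n))      # same value
```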

Hypergeometric Distribution - Example

You are dealt five cards; what is the probability that four of them are aces?

If we use the Hypergeometric distribution, then N = 52, m = 4, n = 5, and

P(X = 4) = \frac{\binom{4}{4}\binom{48}{1}}{\binom{52}{5}}

Hypergeometric Distribution - Another Way

Let X ∼ Binom(m, p) and Y ∼ Binom(N − m, p) be independent Binomial random variables. Then we can define the Hypergeometric distribution as the conditional probability of X = k given X + Y = n. Note that X + Y ∼ Binom(N, p).
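A short check of both slides (added as an illustration; it assumes Python with scipy, and the value of p below is arbitrary): the four-aces probability, and the fact that the conditional Binomial probability does not depend on p and matches the Hypergeometric pmf.

```python
from math import comb
from scipy.stats import binom, hypergeom

# Four aces in a five-card hand: N = 52, m = 4, n = 5, k = 4
N, m, n, k = 52, 4, 5, 4
direct = comb(m, k) * comb(N - m, n - k) / comb(N, n)

# Conditional view: P(X = k | X + Y = n) for any p (here an arbitrary 0.3)
p = 0.3
cond = binom.pmf(k, m, p) * binom.pmf(n - k, N - m, p) / binom.pmf(n, N, p)

print(direct, cond, hypergeom.pmf(k, N, m, n))   # all three agree, ~1.85e-05
```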


Geometric Distribution - Version 1

Let Y be a random variable reflecting the number of failures of independent Bernoulli trials, with probability of success p, before observing the first success. Then the probability of k failures before the first success is given by the Geometric distribution, Y ∼ Geo(p):

P(Y = k) = f(k \mid p) = p(1 - p)^{k}

Geometric Distribution - Version 2

Let Y be a random variable reflecting the number of independent Bernoulli trials, with probability of success p, needed to observe the first success. Then the probability that k trials are needed to achieve the first success is given by the Geometric distribution, Y ∼ Geo(p):

P(Y = k) = f(k \mid p) = p(1 - p)^{k - 1}
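A sketch comparing the two versions numerically (added here; it assumes Python with scipy, and p and k are arbitrary). scipy.stats.geom uses Version 2 (number of trials); shifting its support with loc=-1 recovers Version 1 (number of failures):

```python
from scipy.stats import geom

p, k = 0.3, 4

v1 = p * (1 - p) ** k          # Version 1: k failures before the first success
v2 = p * (1 - p) ** (k - 1)    # Version 2: first success on the k-th trial

print(v2, geom.pmf(k, p))           # Version 2 matches scipy's default
print(v1, geom.pmf(k, p, loc=-1))   # Version 1 matches the shifted support
```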

Negative Binomial Distribution

Let X be a random variable reflecting the total number of successes before the r-th failure, where each trial is an independent Bernoulli trial with probability of success p. Then the probability of k successes is given by the Negative Binomial distribution, X ∼ NB(r, p):

P(X = k) = f(k \mid r, p) = \binom{k + r - 1}{k}\, p^{k} (1 - p)^{r}
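A numerical sketch of this pmf (added here; it assumes Python with scipy, and r, p, k are arbitrary). scipy.stats.nbinom counts failures before a fixed number of successes, so matching the formula above requires swapping the roles of success and failure:

```python
from math import comb
from scipy.stats import nbinom

r, p, k = 3, 0.4, 5    # k successes (probability p each) before the r-th failure

direct = comb(k + r - 1, k) * p**k * (1 - p)**r
# In scipy's convention: the r stopping events occur with probability 1 - p,
# and the k counted events are its "failures".
print(direct, nbinom.pmf(k, r, 1 - p))        # same value
```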

Distribution Relationships

http://www.johndcook.com/distribution_chart.html