Multinomial Geometric Hypergeometric Poisson

Math 141 Lecture 4: Distributions Related To The

Albyn Jones1

1Library 304 [email protected] www.people.reed.edu/∼jones/courses/141

Outline

Review

The Multinomial Distribution

The Geometric and Negative Binomial Distributions

The Hypergeometric Distribution

The Poisson Distribution

Examples of Different Experiments

Binomial: Count the number of Heads in a fixed number of tosses.

Geometric: Count the number of Tails before the first Head.

Negative Binomial: Count the number of Tails before the k-th Head.

Hypergeometric: Count the number of red cards dealt in a poker hand.

Poisson: A model for rare events.

Review: The Binomial

Dichotomous Trials: Each trial results in either a ‘Success’, S, or a ‘Failure’, F.

Independence: Successive trials are independent; knowing we got S on one trial does not help us predict the outcome of any other trial.

Constant probability: Each trial has the same probability P(S) = p, and P(F) = 1 − p.

X counts the number of S’s.

n, the number of trials, is fixed.

X ∼ Binomial(n, p) Probabilities

Let p = P(S) and q = 1 − p = P(F), then

$$P(X = k) = \binom{n}{k} p^k q^{n-k}$$

And in R, the density function is given by

P(X = k) = dbinom(k, n, p)
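As a quick check, the formula can be compared against dbinom for every k. This is a minimal sketch; the values n = 10 and p = 0.3 are arbitrary choices for illustration, not from the lecture.

```r
# Illustrative values (assumed): 10 trials, success probability 0.3
n <- 10
p <- 0.3
k <- 0:n

# Binomial formula: choose(n, k) * p^k * q^(n - k), with q = 1 - p
by_hand  <- choose(n, k) * p^k * (1 - p)^(n - k)
built_in <- dbinom(k, n, p)

all.equal(by_hand, built_in)  # the two should agree
sum(built_in)                 # probabilities over k = 0..n sum to 1
```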

The Multinomial

Polychotomous Trials: Each trial results in one of a fixed set of possible outcomes {E1, E2, ..., EN}. Example: die rolls, Ω = {1, 2, 3, 4, 5, 6}.

Independence: Successive trials are independent; knowing the outcome of one trial does not help us predict the outcome of any other trial.

Constant probability: Each trial has the same probability for each possibility.

X1, X2, ..., XN count the number of events of each type.

n, the number of trials, is fixed.

Multinomial Probabilities

Good news! The typical computation involves collapsing categories to create a binomial.

Example: Roll a fair die 20 times. For each roll, there are six possible outcomes, so we have a Multinomial Distribution. What is the probability of rolling three 6’s in 20 rolls? Let X be the number of 6’s in 20 rolls. What is the distribution of X?

X ∼ Binomial(20, 1/6)
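The die example can be evaluated directly in R. The sketch below also collapses the six faces into two categories ("six" and "not six") and checks that the two-category multinomial gives the same number; the variable names are illustrative only.

```r
# P(exactly three 6's in 20 rolls of a fair die)
p_binom <- dbinom(3, 20, 1/6)   # about 0.238

# Same event as a two-category multinomial: 3 sixes, 17 non-sixes
p_collapsed <- dmultinom(c(3, 17), size = 20, prob = c(1/6, 5/6))

all.equal(p_binom, p_collapsed)
```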

But Since You Asked...

Consider n independent Multinomial(p1, p2, ..., pN) trials, where the probability of observing category i is p_i. Let X_i be the count of events observed in category i, where

$$\sum_{i=1}^{N} X_i = n \quad \text{and} \quad \sum_{i=1}^{N} p_i = 1$$

Then

$$P(X_1 = k_1, \ldots, X_N = k_N) = \frac{n!}{k_1! \, k_2! \cdots k_N!} \, p_1^{k_1} p_2^{k_2} \cdots p_N^{k_N}$$

with R functions: rmultinom, dmultinom.
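A small sketch comparing the multinomial formula with dmultinom. The counts below (six rolls of a fair die, with face 1 seen twice and face 2 never) are a made-up example, not from the lecture.

```r
# Hypothetical outcome of 6 rolls of a fair die: counts for faces 1..6
k <- c(2, 0, 1, 1, 1, 1)
p <- rep(1/6, 6)
n <- sum(k)

# n! / (k1! k2! ... kN!) * p1^k1 * ... * pN^kN
by_hand  <- factorial(n) / prod(factorial(k)) * prod(p^k)
built_in <- dmultinom(k, size = n, prob = p)

all.equal(by_hand, built_in)
```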

Example: Election Polls

Suppose we ask registered Republicans for their preference: Bachmann, Gingrich, Perry, Romney, or ‘none of the above’ (Ron Paul is invisible :-).

The Poll: We ask 1000 randomly chosen Republicans for their preference, and let {9, 290, 108, 395, 198} be the counts for those 5 options, in order.

The Population Proportions: Suppose that the actual probabilities are (in order) {.01, .3, .1, .4, .19}.

The Probability:

dmultinom(c(9, 290, 108, 395, 198), 1000, c(.01, .3, .1, .4, .19))

[1] 2.572792e-06

That looks small, but even the most likely outcome has small probability (about 5 × 10^−6)!
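To see why a small probability is unremarkable here, one can compare the observed counts with the counts that match the population proportions exactly. This is a sketch; treating {10, 300, 100, 400, 190} as a stand-in for the most likely outcome is an assumption made for illustration.

```r
p        <- c(.01, .3, .1, .4, .19)
observed <- c(9, 290, 108, 395, 198)
expected <- c(10, 300, 100, 400, 190)   # 1000 * p, written out explicitly

p_observed <- dmultinom(observed, 1000, p)
p_expected <- dmultinom(expected, 1000, p)

# Ratio near 1 means the observed poll is not unusual under this model
p_observed / p_expected
```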

The Geometric Distribution

Dichotomous Trials: Each trial results in either a ‘Success’, S, or a ‘Failure’, F.

Independence: Successive trials are independent; knowing we got S on one trial does not help us predict the outcome of any other trial.

Constant probability: Each trial has the same probability P(S) = p, and P(F) = 1 − p = q.

X counts the number of F’s before the first S.

Question: What is the probability we see k failures before the first success?

$$\Pr(X = k) = P(F_1, F_2, F_3, \ldots, F_k, S) = p \cdot q^k$$

Example

Roll a fair die repeatedly until getting the first 6. What are p and q here? What is the probability it comes on the 6th roll, that is, we have 5 non-sixes before the first 6?

$$P(X = 5) = \left(\frac{5}{6}\right)^5 \left(\frac{1}{6}\right) \approx .067$$
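The same number in R, as a minimal sketch. Note that dgeom counts failures before the first success, which matches the definition of X used here.

```r
p <- 1/6
q <- 1 - p

by_hand  <- q^5 * p        # five non-sixes, then a six
built_in <- dgeom(5, p)    # about 0.067

all.equal(by_hand, built_in)
```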

The Negative Binomial Distribution

Like the Geometric: Dichotomous outcomes {F, S}, independent trials, constant probability.

X counts the number of F’s before the rth S.

Question: What is the probability we see k failures before the rth success?

Hint: The last trial must be an S, so we see k failures and (r − 1) successes (in any order), followed by a success.

$$\Pr(X = k) = \binom{r-1+k}{k} \cdot p^{r-1} \cdot q^k \cdot p = \binom{r-1+k}{k} \cdot q^k \cdot p^r$$

Example

Roll a fair die repeatedly until getting the third 6. What are p and q here? What is the probability it comes on the 10th roll, that is we have 7 non-sixes before the third 6?

$$P(X = 7) = \binom{9}{7} \left(\frac{5}{6}\right)^7 \left(\frac{1}{6}\right)^3 \approx .0465$$
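In R this is dnbinom, which is not listed in the lecture's function table but follows the same naming pattern. A minimal sketch:

```r
p <- 1/6
q <- 1 - p

# 7 failures and 2 successes in any order, then the third success
by_hand  <- choose(9, 7) * q^7 * p^3
built_in <- dnbinom(7, size = 3, prob = p)   # about 0.0465

all.equal(by_hand, built_in)
```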

Connections

The number of Failures observed before getting the rth Success is clearly the sum of

The number of Failures observed before the 1st Success

The number of Failures observed between the 1st and 2nd Successes

The number of Failures observed between the 2nd and 3rd Successes

and so on.

Theorem: The sum of r independent Geometric(p) RV’s has a NegativeBinomial(r, p) distribution.
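The theorem can be checked numerically for r = 2 by convolving two Geometric(p) densities. This is only a sketch for one case, with p = 1/6 and k up to 30 chosen arbitrarily.

```r
p <- 1/6
k <- 0:30

# P(X1 + X2 = m) = sum over j = 0..m of P(X1 = j) * P(X2 = m - j)
conv <- sapply(k, function(m) sum(dgeom(0:m, p) * dgeom(m - 0:m, p)))

# Negative binomial with r = 2 successes
nb <- dnbinom(k, size = 2, prob = p)

all.equal(conv, nb)
```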

The Hypergeometric Distribution

Sampling from a finite population of two categories A and B without replacement.

Non-Independence, Non-constant Probability: Successive trials are dependent; every trial changes the sample space and probabilities for the subsequent trials.

X counts the number of A’s.

n, the number of trials, is fixed.

Hypergeometric Probabilities

Suppose we have a bag with A alabaster and B black marbles, well mixed. We extract n marbles. Let X be the number of alabaster marbles drawn. For 0 ≤ k ≤ min(A, n), the probability of drawing k alabaster and n − k black marbles is given by

$$P(X = k) = \frac{\binom{A}{k} \binom{B}{n-k}}{\binom{A+B}{n}}$$

Hypergeometric Example

Question: What is the probability that a 5 card poker hand dealt from a well shuffled deck has 4 spades? There are A = 13 spades, B = 39 non-spades. Let X be the number of spades dealt.

$$P(X = 4) = \frac{\binom{13}{4} \binom{39}{1}}{\binom{52}{5}} \approx 0.01$$
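The poker example in R, as a minimal sketch. The arguments to dhyper are, in order, k, A, B, n.

```r
# 4 spades out of 13, 1 non-spade out of 39, hand of 5 from 52 cards
by_hand  <- choose(13, 4) * choose(39, 1) / choose(52, 5)
built_in <- dhyper(4, 13, 39, 5)   # about 0.0107

all.equal(by_hand, built_in)
```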

The Poisson Distribution

A probability model for Rare Events

Origin: An analytical approximation for binomial probabilities when n is large and p is small: let µ = np, then

$$P(X = k) \approx e^{-\mu} \frac{\mu^k}{k!}$$

Poisson Process: A random process describing the occurrence of events in time. Let µ be the rate per unit time. If disjoint time intervals are independent, and X_t counts the number of events occurring in an interval of length t, then

$$P(X_t = k) = e^{-t\mu} \frac{(t\mu)^k}{k!}$$
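A quick look at how close the approximation is. The values n = 1000 and p = 0.002 (so µ = 2) are assumed for illustration; they are not from the lecture.

```r
# Illustrative values (assumed): many trials, small success probability
n  <- 1000
p  <- 0.002
mu <- n * p
k  <- 0:10

exact  <- dbinom(k, n, p)
approx <- dpois(k, mu)

# Side-by-side comparison and the largest discrepancy over k = 0..10
round(cbind(k, exact, approx), 5)
max(abs(exact - approx))
```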

Poisson Example

Question: What is the probability that in a group of 40 people, no two share a birthday? Or, the complement: at least two share a birthday?

Poisson Approximation: what is the relevant Poisson distribution? To use the Poisson approximation to the Binomial, we need to know how many ‘trials’ there are, and the probability of success on each trial.

Trials: How many trials are there here? One for each pair of people:

$$\binom{40}{2} = \frac{40!}{2! \cdot 38!} = \frac{40 \cdot 39}{2} = 780$$

Probabilities: The probability two people share a birthday is approximately $1/365$.

Solution

The probability that no two of 40 people share a birthday is approximately the probability that a Poisson RV X with parameter µ = 780/365 ≈ 2.137 is equal to 0.

$$P(X = 0) = e^{-\mu} \frac{\mu^0}{0!} = e^{-780/365} \approx .12$$

Thus the probability that at least two people share a birthday is about .88.
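For comparison, the approximation can be set next to the exact calculation. This sketch assumes 365 equally likely birthdays, the same assumption behind the 1/365 used above.

```r
n_people <- 40
trials   <- choose(n_people, 2)   # 780 pairs
mu       <- trials / 365

# Poisson approximation: P(no pair shares a birthday)
approx_none <- dpois(0, mu)       # about 0.118

# Exact: person i must avoid the i - 1 birthdays already taken
exact_none <- prod((365 - 0:(n_people - 1)) / 365)   # about 0.109

c(approx = 1 - approx_none, exact = 1 - exact_none)
```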

Sums of Poisson(µ) RV’s

Suppose that X and Y are independent Poisson(µ) RV’s. What is the distribution of X + Y?

Hint: if X and Y are independent Binomial(n, p) RV’s, then X + Y is Binomial(n + n, p). Suppose that n is large, and p small, and µ = np.

Conclusion? A Binomial(2n, p) is close to a Poisson(2µ).

In General: the sum of independent Poisson(µ_i) RV’s is Poisson(µ), where $\mu = \sum_i \mu_i$.
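The general statement can be checked numerically for two terms by convolving the densities. The rates 1.5 and 2.5 are arbitrary values chosen for this sketch.

```r
# Illustrative rates (assumed)
mu1 <- 1.5
mu2 <- 2.5
k   <- 0:20

# P(X + Y = m) = sum over j = 0..m of P(X = j) * P(Y = m - j)
conv <- sapply(k, function(m) sum(dpois(0:m, mu1) * dpois(m - 0:m, mu2)))

all.equal(conv, dpois(k, mu1 + mu2))
```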

Connecting the Poisson and Negative Binomial

Short version: The Negative Binomial is a good model for a collection of Poisson RV’s with variable rates µ.

R Functions

Distribution    density            CDF                RNG
Binom(n,p)      dbinom(k,n,p)      pbinom(k,n,p)      rbinom(N,n,p)
Geom(p)         dgeom(k,p)         pgeom(k,p)         rgeom(N,p)
Hyper(A,B,n)    dhyper(k,A,B,n)    phyper(k,A,B,n)    rhyper(N,A,B,n)
Poisson(µ)      dpois(k,µ)         ppois(k,µ)         rpois(N,µ)
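A short sketch of how the three function families relate, using the binomial: the CDF at k is the sum of the density over 0..k, and the RNG returns N simulated counts. The values n = 10, p = 0.3, and the seed are arbitrary.

```r
n <- 10
p <- 0.3

# CDF: P(X <= 4) equals the sum of the density over k = 0..4
cdf_value   <- pbinom(4, n, p)
density_sum <- sum(dbinom(0:4, n, p))
all.equal(cdf_value, density_sum)

# RNG: N = 5 simulated Binomial(n, p) counts (seed fixed for repeatability)
set.seed(141)
draws <- rbinom(5, n, p)
draws
```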

Summary

The Multinomial Distribution

The Geometric and Negative Binomial Distributions

The Hypergeometric Distribution: sampling without replacement from a finite population.

The Poisson Distribution: rare events.
