Math 141 Lecture 4: Distributions Related to the Binomial Distribution

Albyn Jones
Library 304
[email protected]
www.people.reed.edu/~jones/courses/141

Outline

Review
The Multinomial Distribution
The Geometric and Negative Binomial Distributions
The Hypergeometric Distribution
The Poisson Distribution

Examples of Different Experiments

Binomial: Count the number of Heads in a fixed number of tosses.
Geometric: Count the number of Tails before the first Head.
Negative Binomial: Count the number of Tails before the k-th Head.
Hypergeometric: Count the number of red cards dealt in a poker hand.
Poisson: A model for rare events.

Review: The Binomial

Dichotomous Trials: Each trial results in either a 'Success', S, or a 'Failure', F.
Independence: Successive trials are independent; knowing we got S on one trial does not help us predict the outcome of any other trial.
Constant probability: Each trial has the same probability $P(S) = p$, and $P(F) = 1 - p$.
$X$ counts the number of S's.
$n$, the number of trials, is fixed.

X ~ Binomial(n, p) Probabilities

Let $p = P(S)$ and $q = 1 - p = P(F)$. Then
\[ P(X = k) = \binom{n}{k} p^k q^{n-k}. \]
In R, the density function is given by P(X = k) = dbinom(k, n, p).

The Multinomial

Polychotomous Trials: Each trial results in one of a fixed set of possible outcomes $\{E_1, E_2, \ldots, E_N\}$. Example: die rolls, $\Omega = \{1, 2, 3, 4, 5, 6\}$.
Independence: Successive trials are independent; knowing the outcome of one trial does not help us predict the outcome of any other trial.
Constant probability: Each trial has the same probability for each possibility.
$X_1, X_2, \ldots, X_N$ count the number of events of each type.
$n$, the number of trials, is fixed.

Multinomial Probabilities

Good news! The typical computation involves collapsing categories to create a binomial.
Example: Roll a fair die 20 times. For each roll, there are six possible outcomes, so we have a Multinomial Distribution.
What is the probability of rolling three 6's in 20 rolls?
Let $X$ be the number of 6's in 20 rolls. What is the distribution of $X$? Collapsing the six outcomes into '6' and 'not 6' gives
\[ X \sim \mathrm{Binomial}(20, 1/6). \]

But Since You Asked...

Consider $n$ independent Multinomial$(p_1, p_2, \ldots, p_N)$ trials, where the probability of observing category $i$ is $p_i$. Let $X_i$ be the count of events observed in category $i$, where
\[ \sum_{i=1}^{N} X_i = n \quad \text{and} \quad \sum_{i=1}^{N} p_i = 1. \]
Then
\[ P(X_1 = k_1, \ldots, X_N = k_N) = \frac{n!}{k_1!\, k_2! \cdots k_N!}\, p_1^{k_1} p_2^{k_2} \cdots p_N^{k_N}, \]
with R functions rmultinom and dmultinom.

Example: Election Polls

Suppose we ask registered Republicans for their preference: Bachmann, Gingrich, Perry, Romney, or 'none of the above' (Ron Paul is invisible :-).
The Poll: We ask 1000 randomly chosen Republicans for their preference, and let $\{9, 290, 108, 395, 198\}$ be the counts for those 5 options, in order.
The Population Proportions: Suppose that the actual probabilities are (in order) $\{.01, .3, .1, .4, .19\}$.
The Probability:

    dmultinom(c(9, 290, 108, 395, 198), 1000, c(.01, .3, .1, .4, .19))
    [1] 2.572792e-06

That looks small, but even the most likely outcome has small probability (about $5 \times 10^{-6}$)!

The Geometric Distribution

Dichotomous Trials: Each trial results in either a 'Success', S, or a 'Failure', F.
Independence: Successive trials are independent; knowing we got S on one trial does not help us predict the outcome of any other trial.
Constant probability: Each trial has the same probability $P(S) = p$, and $P(F) = 1 - p = q$.
$X$ counts the number of F's before the first S.
Question: What is the probability we see $k$ failures before the first success?
\[ P(X = k) = P(F_1, F_2, F_3, \ldots, F_k, S) = p \cdot q^k \]

Example

Roll a fair die repeatedly until getting the first 6. What are $p$ and $q$ here? What is the probability it comes on the 6th roll, that is, we have 5 non-sixes before the first 6?
\[ P(X = 5) = \left(\frac{5}{6}\right)^{5} \frac{1}{6} \approx .067 \]

The Negative Binomial Distribution

Like the Geometric: Dichotomous outcomes $\{F, S\}$, independent trials, constant probability.
$X$ counts the number of F's before the $r$th S.
Question: What is the probability we see $k$ failures before the $r$th success?
Hint: The last trial must be an S, so we see $k$ failures and $(r - 1)$ successes (in any order), followed by a success.
\[ P(X = k) = \binom{r - 1 + k}{k} \cdot p^{r-1} \cdot q^{k} \cdot p = \binom{r - 1 + k}{k} \cdot q^{k} \cdot p^{r} \]

Example

Roll a fair die repeatedly until getting the third 6. What are $p$ and $q$ here? What is the probability it comes on the 10th roll, that is, we have 7 non-sixes before the third 6?
\[ P(X = 7) = \binom{9}{7} \left(\frac{5}{6}\right)^{7} \left(\frac{1}{6}\right)^{3} \approx .0465 \]

Connections

The number of Failures observed before getting the $r$th Success is clearly the sum of
the number of Failures observed before the 1st Success,
the number of Failures observed between the 1st and 2nd Successes,
the number of Failures observed between the 2nd and 3rd Successes,
and so on.
Theorem: The sum of $r$ independent Geometric($p$) RV's has a NegativeBinomial($r$, $p$) distribution.

The Hypergeometric Distribution

Sampling from a finite population of two categories A and B without replacement.
Non-Independence, Non-constant Probability: Successive trials are dependent; every trial changes the sample space and probabilities for the subsequent trials.
$X$ counts the number of A's.
$n$, the number of trials, is fixed.

Hypergeometric Probabilities

Suppose we have a bag with $A$ alabaster and $B$ black marbles, well mixed. We extract $n$ marbles. Let $X$ be the number of alabaster marbles drawn. For $0 \le k \le \min(A, n)$, the probability of drawing $k$ alabaster and $n - k$ black marbles is given by
\[ P(X = k) = \frac{\binom{A}{k} \binom{B}{n-k}}{\binom{A+B}{n}} \]

Hypergeometric Example

Question: What is the probability that a 5 card poker hand dealt from a well shuffled deck has 4 spades?
There are $A = 13$ spades, $B = 39$ non-spades. Let $X$ be the number of spades dealt.
\[ P(X = 4) = \frac{\binom{13}{4} \binom{39}{1}}{\binom{52}{5}} \approx 0.01 \]

The Poisson Distribution

A probability model for Rare Events.
Origin: An analytical approximation for binomial probabilities when $n$ is large and $p$ is small: let $\mu = np$, then
\[ P(X = k) \approx e^{-\mu} \frac{\mu^k}{k!} \]
Poisson Process: A random process describing the occurrence of events in time: let $\mu$ be the rate per unit time. Then, if disjoint time intervals are independent, and $X_t$ counts the number of events occurring in an interval of length $t$,
\[ P(X_t = k) = e^{-t\mu} \frac{(t\mu)^k}{k!} \]

Poisson Example

Question: What is the probability that in a group of 40 people, no two share a birthday? Or, the complement: at least two share a birthday?
Poisson Approximation: What is the relevant Poisson distribution? To use the Poisson approximation to the Binomial, we need to know how many 'trials' there are, and the probability of success on each trial.
Trials: How many trials are there here? One for each pair of people:
\[ \binom{40}{2} = \frac{40!}{2! \cdot 38!} = \frac{40 \cdot 39}{2} = 780 \]
Probabilities: The probability two people share a birthday is approximately $1/365$.

Solution

The probability that no two of 40 people share a birthday is approximately the probability that a Poisson RV $X$ with parameter $\mu = 780/365 \approx 2.137$ is equal to 0.
\[ P(X = 0) = e^{-\mu} \frac{\mu^0}{0!} = e^{-780/365} \approx .12 \]
Thus the probability that at least two people share a birthday is about $.88$.

Sums of Poisson(µ) RV's

Suppose that $X$ and $Y$ are independent Poisson($\mu$) RV's. What is the distribution of $X + Y$?
Hint: If $X$ and $Y$ are independent Binomial($n$, $p$) RV's, then $X + Y$ is Binomial($n + n$, $p$). Suppose that $n$ is large, and $p$ small, and $\mu = np$.
Conclusion? A Binomial($2n$, $p$) is close to a Poisson($2\mu$).
In General: The sum of independent Poisson($\mu_i$) RV's is Poisson($\mu$), where $\mu = \sum_i \mu_i$.

Connecting the Poisson and Negative Binomial

Short version: The Negative Binomial is a good model for a collection of Poisson RV's with variable rates $\mu$.

R Functions

Distribution    density            CDF                RNG
Binom(n,p)      dbinom(k,n,p)      pbinom(k,n,p)      rbinom(N,n,p)
Geom(p)         dgeom(k,p)         pgeom(k,p)         rgeom(N,p)
Hyper(A,B,n)    dhyper(k,A,B,n)    phyper(k,A,B,n)    rhyper(N,A,B,n)
Poisson(µ)      dpois(k,µ)         ppois(k,µ)         rpois(N,µ)

Summary

The Multinomial Distribution
The Geometric and Negative Binomial Distributions
The Hypergeometric Distribution: sampling without replacement from a finite population.
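
Checking the Worked Examples in R

As a rough check, the worked examples in this lecture can be recomputed with base R's density functions. This is only a sketch: the variable names are illustrative, dnbinom is base R's negative binomial density and does not appear in the lecture's R table (its size argument is the target number of successes r), and the exact birthday product is an addition for comparison with the Poisson approximation.

```r
# Recompute the lecture's worked examples with base R (stats package).
# Variable names are illustrative and not part of the lecture.

# Binomial: three 6's in 20 rolls of a fair die.
p_three_sixes <- dbinom(3, size = 20, prob = 1/6)

# Geometric: 5 non-sixes before the first 6, i.e. (5/6)^5 * (1/6).
p_geom <- dgeom(5, prob = 1/6)

# Negative binomial: 7 non-sixes before the third 6,
# i.e. choose(9, 7) * (5/6)^7 * (1/6)^3.
# dnbinom is not listed in the lecture's table; size is r.
p_nbinom <- dnbinom(7, size = 3, prob = 1/6)

# Hypergeometric: 4 spades in a 5 card hand
# (m = 13 spades, n = 39 non-spades, k = 5 cards drawn).
p_hyper <- dhyper(4, m = 13, n = 39, k = 5)

# Poisson approximation to the birthday problem with 40 people:
# one 'trial' per pair, each with success probability about 1/365.
mu <- choose(40, 2) / 365
p_no_match_pois <- dpois(0, lambda = mu)

# Exact probability that 40 people all have distinct birthdays
# (an added comparison, assuming 365 equally likely birthdays).
p_no_match_exact <- prod((365 - 0:39) / 365)

print(round(c(binomial = p_three_sixes,
              geometric = p_geom,
              neg_binomial = p_nbinom,
              hypergeometric = p_hyper,
              birthday_poisson = p_no_match_pois,
              birthday_exact = p_no_match_exact), 4))
```

The geometric, negative binomial, hypergeometric, and Poisson values should agree with the figures quoted in the slides (.067, .0465, 0.01, and .12) up to rounding.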
