Stat 3000 – Statistics for Scientists and Engineers
Dr. Corcoran, Fall 2005

III. Famous Discrete Distributions: The Binomial and Poisson Distributions

Up to this point, we have concerned ourselves with the general properties of categorical and continuous distributions, illustrated with somewhat arbitrary examples. However, there are particular distributions that are well understood and have wide applicability.

We will first cover two specific types of categorical variables: those that follow the binomial and Poisson distributions. Both of these distributions model the probability of observing a certain number of events over a period of time or within a physical space. They are used widely in all branches of science and engineering.

Later, we will spend some time discussing the most famous continuous distribution: the normal distribution, or “bell curve”.

The Bernoulli Distribution

As a way of understanding the binomial distribution, it helps to consider the simplest categorical random variable: a binary variable, which can take only one of two values, such as “heads” or “tails”, “yes” or “no”, male or female, dead or alive, etc.

We typically code a binary random variable X as 1 or 0. We refer arbitrarily to X = 1 as a “success”, and X = 0 as a “failure”. How you choose to define “success” and “failure” is entirely up to you.

If X is binary with probability of success p, then we say X ~ Bernoulli(p). That is, the pmf for X is P(X = 1) = p and P(X = 0) = 1 – p. Note also that E(X) = (1)p + (0)(1 – p) = p, and

$$\mathrm{Var}(X) = (1)^2 p + (0)^2 (1 - p) - p^2 = p - p^2 = p(1 - p).$$

The Binomial Distribution

Suppose that we have n independent Bernoulli trials, each with probability of success p. The binomial distribution determines the probability of observing a given number of “successes” out of the n trials.

Example III.A

We flip a coin 10 times, and observe the number of tosses that result in “heads”. The number of heads out of 10 flips follows the binomial distribution.

Example III.B

We survey 500 randomly selected students at USU, and ask them whether they think that President Bush is doing a good job (“yes” or “no”). The number who respond “yes” is a binomially distributed random variable.

The Binomial PMF

If X follows the binomial distribution, with number of independent trials given by n and probability of “success” given by p, we say that X ~ Binomial(n, p). The pmf of X is given by

$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad \text{for } x = 0, \ldots, n.$$

Note also that E(X) = np, and Var(X) = np(1 – p).
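As a quick numerical check of the binomial pmf and its moments, here is a minimal Python sketch. The function name `binomial_pmf` and the choice of a fair coin (p = 0.5) are assumptions made for illustration; they are not part of the course materials.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p); zero outside 0..n."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Ten coin flips; a fair coin (p = 0.5) is an assumption for this illustration.
n, p = 10, 0.5
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                   # probabilities should sum to 1
mean = sum(x * q for x, q in enumerate(pmf))       # should agree with n * p
variance = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))  # n * p * (1 - p)

print(total, mean, variance)
```

Summing over the whole range x = 0, ..., n recovers E(X) = np and Var(X) = np(1 – p) directly from the pmf, which is a useful sanity check on any hand calculation.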

Example III.C

A study of lizards reveals that at midday the probability of finding a given specimen in the sun is 0.077. Suppose that we randomly sample 60 lizards in a particular area.

What is the probability that all 60 are in the shade?

What is the expected number of lizards out of this sample found in the sun?

What is the probability that we find 10 or more in the sun?
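The three lizard questions above can be set up as follows. This is a sketch that assumes the 60 lizards are independent and each is in the sun with probability 0.077, so that the number in the sun is Binomial(60, 0.077); the variable names are illustrative.

```python
from math import comb

n, p = 60, 0.077  # sample size and P(a given lizard is in the sun)

def binomial_pmf(x: int) -> float:
    """P(X = x) for X ~ Binomial(n, p), X = number of lizards in the sun."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# All 60 in the shade means zero in the sun: P(X = 0) = (1 - p)^n.
p_all_shade = (1 - p) ** n

# Expected number in the sun: E(X) = n * p.
expected_in_sun = n * p

# Ten or more in the sun: complement of P(X <= 9).
p_ten_or_more = 1 - sum(binomial_pmf(x) for x in range(10))

print(f"P(all in shade)  = {p_all_shade:.4f}")
print(f"E(number in sun) = {expected_in_sun:.2f}")
print(f"P(X >= 10)       = {p_ten_or_more:.4f}")
```

Using the complement for the last question avoids summing the 51 terms from x = 10 to x = 60.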

The Poisson Distribution

Like the binomial distribution, the Poisson distribution is used to model counts. The distinction between the two distributions, however, is that we use the binomial distribution to model the probability of some count out of a fixed, finite number of trials. The Poisson distribution does not depend on a fixed number of trials: the range of a Poisson random variable is 0, 1, 2,… The Poisson distribution is especially useful for modeling the occurrence of relatively rare events.

Example III.D

An engineer observes traffic flow through an intersection during the period of an hour. The number of vehicles that will pass through is a Poisson random variable.

Example III.E

A physicist is interested in the per minute intensity of particle emissions from a radioactive substance. The number of emitted particles can be considered a Poisson random variable.

The Poisson PMF

There is a single parameter that determines the distribution of a Poisson random variable X: the “rate” parameter µ. We often refer to µ as the rate because it turns out that E(X) = µ. In other words, µ represents the average count per unit of time or space. Given µ, the Poisson pmf is

$$P(X = x) = \frac{\mu^x e^{-\mu}}{x!}, \quad \text{for } x = 0, 1, 2, \ldots$$

Note also that Var(X) = µ. The mean and variance of a Poisson random variable are the same.
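The claim that the Poisson mean and variance both equal µ can be checked numerically. The sketch below uses a hypothetical rate µ = 3 and truncates the infinite sum at x = 59, where the remaining tail is negligible for this rate; the name `poisson_pmf` is an assumption for illustration.

```python
from math import exp, factorial

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for X ~ Poisson(mu); zero for negative x."""
    if x < 0:
        return 0.0
    return mu**x * exp(-mu) / factorial(x)

mu = 3.0          # hypothetical rate, chosen only for illustration
xs = range(60)    # truncation of 0, 1, 2, ...; the tail beyond 59 is negligible here

total = sum(poisson_pmf(x, mu) for x in xs)
mean = sum(x * poisson_pmf(x, mu) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, mu) for x in xs)

print(total, mean, variance)  # all three should be very close to 1, mu, mu
```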

Example III.F

According to data from Lloyd’s Register of Shipping, the damage rate for a given ship is 0.004 incidents for each month of service.

What is the probability that a given ship is damaged once during the next month?

What is the probability that a given ship is damaged no more than once during the next month?

What is the probability that a given ship is damaged at least once during its next 5 years of service?
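One way to set up the ship-damage questions is sketched below. It assumes the monthly damage count is Poisson with µ = 0.004, and that the rate is constant with independent months, so the count over 60 months is Poisson with µ = 60 × 0.004 = 0.24. That scaling step is an assumption of the sketch rather than something stated in the example.

```python
from math import exp

rate_per_month = 0.004  # damage incidents per month of service

# One month of service: mu = 0.004.
mu_month = rate_per_month * 1
p_exactly_one = mu_month * exp(-mu_month)        # P(X = 1) = mu * e^(-mu)
p_at_most_one = exp(-mu_month) * (1 + mu_month)  # P(X = 0) + P(X = 1)

# Five years = 60 months, assuming a constant rate and independent months.
mu_five_years = rate_per_month * 60
p_at_least_one_5yr = 1 - exp(-mu_five_years)     # 1 - P(X = 0)

print(f"P(exactly one, 1 month)    = {p_exactly_one:.6f}")
print(f"P(at most one, 1 month)    = {p_at_most_one:.6f}")
print(f"P(at least one, 60 months) = {p_at_least_one_5yr:.4f}")
```

Note that the five-year probability is computed from a single Poisson count with the rescaled rate, not by multiplying the one-month probability by 60.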

The Poisson Approximation to the Binomial Distribution

It turns out the Poisson distribution can provide a good approximation of binomial probabilities. This is especially true for a binomial random variable with relatively large n and small p. Can you think of a heuristic explanation for this?

Suppose that you have a random variable X ~ Binomial(n,p), and you wish to use a Poisson approximation for the pmf of X. What mean and variance would you use for this Poisson distribution?

Use the Poisson approximation to compute the probabilities in Example III.C. Does the approximation work well? Why or why not?
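A possible way to explore this comparison numerically is sketched below for the lizard example (n = 60, p = 0.077). It assumes the natural choice of matching the Poisson mean to the binomial mean, µ = np; the helper names are illustrative.

```python
from math import comb, exp, factorial

n, p = 60, 0.077
mu = n * p  # Poisson rate matched to the binomial mean (assumed choice)

def binomial_pmf(x: int) -> float:
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x: int) -> float:
    return mu**x * exp(-mu) / factorial(x)

# Largest pointwise gap between the two pmfs over x = 0, ..., n.
max_abs_diff = max(abs(binomial_pmf(x) - poisson_pmf(x)) for x in range(n + 1))

# Tail probability P(X >= 10) under each model.
binom_tail = 1 - sum(binomial_pmf(x) for x in range(10))
pois_tail = 1 - sum(poisson_pmf(x) for x in range(10))

# The variances cannot both be matched: np(1 - p) versus mu = np.
binom_var = n * p * (1 - p)
pois_var = mu

print(f"max |binomial - Poisson| = {max_abs_diff:.4f}")
print(f"P(X >= 10): binomial {binom_tail:.4f}, Poisson {pois_tail:.4f}")
print(f"variance:   binomial {binom_var:.3f}, Poisson {pois_var:.3f}")
```

Because the Poisson variance is np while the binomial variance is np(1 – p), the two agree closely only when p is small, which is one way to see why the approximation favors large n and small p.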