Section 2.1: The Binomial distribution

Consider a Bernoulli trial Ber(p), i.e., some experiment that succeeds with probability p. Examples would be “getting heads” on a (possibly biased) coin toss, getting a 6 when rolling a die, or having a randomly chosen voter vote for party P. The long-term frequency interpretation of probability then tells us that if we repeat the trial a large number of times n, we expect the number of successes k to be about np. This doesn’t mean, however, that we can expect the number of successes to be precisely np (which might not even be an integer), but rather merely “close to np” with high probability. In order to understand the precise meaning of “close” in the sentence above, we first need to study the exact distribution of the number of successes.
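To see the long-run frequency interpretation in action, here is a minimal Python simulation (the values of n and p are arbitrary illustrative choices):

```python
import random

# Simulate n independent Bernoulli(p) trials and count the successes.
n, p = 10_000, 1 / 6          # e.g. "getting a 6" on n rolls of a fair die
k = sum(random.random() < p for _ in range(n))

print("observed successes k:", k)
print("predicted value np:  ", n * p)
```

Each run produces a k close to np ≈ 1666.7, but essentially never exactly that value: this is the “close to np with high probability” phenomenon described above.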

\[
\text{Binomial}(n, p)\colon \qquad \Omega = \{0, 1, \dots, n\}, \qquad P(\{k\}) = \binom{n}{k}\, p^k (1 - p)^{n-k}.
\]

This is the distribution of the number of successes in n independent trials, each with success probability p.¹ Here’s how to understand the formula $\binom{n}{k} p^k (1-p)^{n-k}$. Suppose the n trials are performed in order. In the diagram above, moving from each line to the line below corresponds to performing a new trial, hence there is a $q = 1-p$ probability that the number of successes $k$ doesn’t change (downward movement) and a $p$ probability that $k$ increases by 1 (downward diagonal movement). The probability $P(\{k\})$ is then the sum of the probabilities of all the downward paths starting at the initial square (the one marked with 1). These are paths that move diagonally $k$ times and straight down $n-k$ times, hence there are $\binom{n}{k}$ of them. Further, each of those paths has the same exact probability $p^k (1-p)^{n-k}$, since each path corresponds to $k$ successes and $n-k$ failures (in some chosen fixed order). The formula $P(\{k\}) = \binom{n}{k} p^k (1-p)^{n-k}$ then follows.

¹ Recall that $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ counts the number of ways of choosing $k$ elements from a set with $n$ elements.
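As a quick sanity check on the formula, the following sketch computes $P(\{k\})$ directly with Python’s math.comb and verifies that the probabilities over $\Omega$ sum to 1 (the function name binomial_pmf and the values of n and p are illustrative choices):

```python
from math import comb

def binomial_pmf(n: int, p: float, k: int) -> float:
    """P({k}) for a Binomial(n, p) distribution."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3
pmf = [binomial_pmf(n, p, k) for k in range(n + 1)]
assert abs(sum(pmf) - 1.0) < 1e-12   # the probabilities of Ω sum to 1
```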

1.1 Consecutive ratio for the binomial distribution

Computing every single one of the probabilities of a binomial distribution by the formula above can lead to quite hefty computations. In some situations, the following result allows for quicker calculations:

Proposition 1. For a Binomial(n, p) distribution,
\[
\frac{P(\{k\})}{P(\{k-1\})} = \frac{n + 1 - k}{k} \cdot \frac{p}{1 - p}.
\]
This follows directly from the formula for $P(\{k\})$, since $\binom{n}{k} / \binom{n}{k-1} = \frac{n+1-k}{k}$.
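Here is how the proposition turns into a quicker computation: starting from $P(\{0\}) = (1-p)^n$, each successive probability is obtained by a single multiplication, with no factorials. A minimal sketch (the function name binomial_pmf_by_ratio is an illustrative choice), cross-checked against the direct formula:

```python
from math import comb

def binomial_pmf_by_ratio(n: int, p: float) -> list[float]:
    """All of P({0}), ..., P({n}) via the consecutive-ratio recursion."""
    probs = [(1 - p) ** n]                  # P({0}) = (1-p)^n
    ratio = p / (1 - p)
    for k in range(1, n + 1):
        probs.append(probs[-1] * (n + 1 - k) / k * ratio)
    return probs

# Cross-check against the direct formula:
n, p = 10, 0.3
direct = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
assert all(abs(a - b) < 1e-12
           for a, b in zip(binomial_pmf_by_ratio(n, p), direct))
```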

Exercise 1. Find the binomial distributions Bin(4, 1/2), Bin(4, 1/3), Bin(4, 2/3).

Solution. First we do the $p = \frac{1}{2}$ case. We use the result above (note that in this case $\frac{p}{1-p} = 1$):
\[
P(\{0\}) = \left(\tfrac{1}{2}\right)^4 = \tfrac{1}{16}, \quad
P(\{1\}) = P(\{0\}) \times \tfrac{4}{1} = \tfrac{4}{16}, \quad
P(\{2\}) = P(\{1\}) \times \tfrac{3}{2} = \tfrac{6}{16},
\]
\[
P(\{3\}) = P(\{2\}) \times \tfrac{2}{3} = \tfrac{4}{16}, \quad
P(\{4\}) = P(\{3\}) \times \tfrac{1}{4} = \tfrac{1}{16}.
\]
Now we do the $p = \frac{1}{3}$ case (note that now $\frac{p}{1-p} = \frac{1}{2}$):
\[
P(\{0\}) = \left(\tfrac{2}{3}\right)^4 = \tfrac{16}{81}, \quad
P(\{1\}) = P(\{0\}) \times \tfrac{4}{1} \cdot \tfrac{1}{2} = \tfrac{32}{81}, \quad
P(\{2\}) = P(\{1\}) \times \tfrac{3}{2} \cdot \tfrac{1}{2} = \tfrac{24}{81},
\]
\[
P(\{3\}) = P(\{2\}) \times \tfrac{2}{3} \cdot \tfrac{1}{2} = \tfrac{8}{81}, \quad
P(\{4\}) = P(\{3\}) \times \tfrac{1}{4} \cdot \tfrac{1}{2} = \tfrac{1}{81}.
\]
Finally we turn to the $p = \frac{2}{3}$ case. Note that since $\frac{2}{3} = 1 - \frac{1}{3}$, we can simply take the probabilities we calculated for $\frac{1}{3}$ and “flip them” by replacing $k$ with $4 - k$:
\[
P(\{0\}) = \tfrac{1}{81}, \quad
P(\{1\}) = \tfrac{8}{81}, \quad
P(\{2\}) = \tfrac{24}{81}, \quad
P(\{3\}) = \tfrac{32}{81}, \quad
P(\{4\}) = \tfrac{16}{81}.
\]
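The same recursion reproduces these tables exactly if we compute with exact fractions; a sketch using Python’s fractions module (the helper name bin_table is an illustrative choice):

```python
from fractions import Fraction

def bin_table(n: int, p: Fraction) -> list[Fraction]:
    """Exact Binomial(n, p) probabilities via the ratio in Proposition 1."""
    probs = [(1 - p) ** n]                  # P({0}) = (1-p)^n
    for k in range(1, n + 1):
        probs.append(probs[-1] * Fraction(n + 1 - k, k) * p / (1 - p))
    return probs

print(bin_table(4, Fraction(1, 2)))  # 1/16, 4/16, 6/16, 4/16, 1/16 (printed in lowest terms)
print(bin_table(4, Fraction(1, 3)))  # 16/81, 32/81, 24/81, 8/81, 1/81 (24/81 prints as 8/27)
print(bin_table(4, Fraction(2, 3)))  # the p = 1/3 table reversed
```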

[Figure: the distributions Bin(4, p) for p = 1/2, p = 1/3, and p = 2/3.]

Using the formula above we can also show that the mode (i.e., the value with the highest probability) of a Bin(n, p) distribution is $\lfloor np + p \rfloor$, i.e., the integer part of $np + p$. Further, the function $P(\{k\})$ is increasing in $k$ for $k$ smaller than the mode, and decreasing for $k$ larger than the mode. Also of importance is the value $np$, which is called the mean (or expected value) of the Bin(n, p) distribution. We’ll see, in a sense to be made precise in Chapter 3, that, probabilistically speaking, this is the value that sits “right in the middle” of the possible values that the binomial can take.
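A quick numerical check of the mode formula: the sketch below compares $\lfloor np + p \rfloor$ with the argmax of the pmf over a small grid of (n, p) values (the grid is an arbitrary choice; when $(n+1)p$ is an integer there are two adjacent modes, so those cases are skipped):

```python
from math import comb, floor

def pmf(n: int, p: float, k: int) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

for n in range(1, 30):
    for p in (0.13, 0.37, 0.5, 0.81):
        if ((n + 1) * p).is_integer():      # two modes in this case; skip
            continue
        mode = max(range(n + 1), key=lambda k: pmf(n, p, k))
        assert mode == floor(n * p + p), (n, p, mode)
```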
