Winter, 2008
Lab 3 – The Birthday Problem
Due beginning of lab, Friday, Jan. 25
In this lab, you are asked to use Excel to perform some of the calculations. You are encouraged to work with one other person and turn in one report. The starred questions (*) are the ones that will make up your lab report.
Situation: To investigate the probability that at least two people in your class share the same birthday. Questions to answer include: How does this probability vary as the number of people in the group increases? How many people must be present for this probability to exceed one-half? How about to exceed a 90% chance?
Prediction: (a*) Record the number of people in lab today and take a guess for the probability of finding at least 2 people with the same birthday. (b*) Now take a guess for the smallest number of people who must be present in order for this probability of matching birthdays to exceed one-half. Then do the same for exceeding .9.
Analysis (Very Small Samples): Let n denote the number of people present. In order to tackle this problem, we need to make some assumptions about people’s birthdays. First, let us ignore the complication of February 29 birthdays. Second, we will assume that people are equally likely to be born on any of the (remaining) 365 days of the year. First consider the simple case of n=3 people. Because of our “equally likely” assumption, the probability of matching birthdays will equal the number of ways in which their birthdays could match divided by the total number of ways three people could have birthdays. Since this numerator is very hard to calculate, we will first count the number of ways in which three people can have different birthdays from each other. We will use the “general product rule” (pages 659-62) of counting to determine these, starting with the denominator.
(c) How many choices are there for the first person’s birthday? How many for the second person (with no restrictions)? For the third? Thus, how many total ways are there for three people to have birthdays?
(d) Now consider the number of birthday outcomes for which there is no match. Again, how many choices are there for the first person’s birthday? Then, in order for their birthdays not to match, how many choices remain for the second person? For the third? Thus, how many ways are there for three birthdays not to match?
(e) Determine the probability of no matching birthdays in the n=3 case by dividing your answer to (d) by your answer to (c).
(f*) Use your answer to (e) and the relevant probability rule (indicating the rule by name) to find the probability that there is at least one birthday match among a group of n=3 people.
(g*) Follow this same procedure to find the probability of at least one birthday match in a group of n = 8 people.
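As a way to check hand calculations for (c) through (g), the counting argument can be sketched in Python. This is only an illustrative sketch under the lab's assumptions (365 equally likely days, no February 29); the function name `p_match_by_counting` is made up here and is not part of the Excel work the lab asks for.

```python
from math import perm

def p_match_by_counting(n, days=365):
    """P(at least one shared birthday), assuming equally likely days."""
    total = days ** n            # all possible birthday assignments for n people
    no_match = perm(days, n)     # ordered choices of n distinct days (0 if n > days)
    return 1 - no_match / total  # complement rule

# Compare against your answers to (f) and (g).
print(round(p_match_by_counting(3), 4))
print(round(p_match_by_counting(8), 4))
```

The sketch counts outcomes exactly with integers and divides only at the end, which mirrors the numerator and denominator reasoning in the questions above.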
Analysis (General Case): (h) Now consider the general case of a group of size n. Start with the denominator: how many possibilities are there for these n birthdays?
(i) Now consider the numerator for the complement: how many possibilities are there for which there are no matches among the n birthdays?
(j*) Use the appropriate probability rule with your answers to (h) and (i) to write an expression (in terms of n) for the probability of at least one matching birthday in a group of n people.
(k*) For what values of n will the probability of {at least one match} be equal to one (so that the group is certain to have matching birthdays)? Explain. (Remember to consider our simplifying assumptions. Advice: Think rather than manipulate a formula.)
Evaluating Probabilities: Having derived this expression for the probability of matching birthdays as a function of the number of people involved, we need to find an efficient way to evaluate it. We will turn to technology, specifically to the spreadsheet package Excel, to do this. First we introduce some notation: let $A_n$ denote the event that a matching birthday exists among a group of n people, so $A_n'$ denotes its complement and $P(A_n)$ denotes its probability. Note, we can simplify

$$P(A_n') = \frac{P_{n,365}}{365^n} = \frac{365}{365}\cdot\frac{364}{365}\cdots\frac{365-n+2}{365}\cdot\frac{365-n+1}{365}$$

and note that

$$P(A_{n-1}') = \frac{P_{n-1,365}}{365^{n-1}} = \frac{365}{365}\cdot\frac{364}{365}\cdots\frac{365-n+2}{365}$$

These expressions provide an iterative way to calculate the probability of no match with n people from the probability of no match with n-1 people:

$$P(A_n') = \frac{365-n+1}{365}\,P(A_{n-1}')$$
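The iterative relationship can also be sketched outside of Excel. The Python below is a minimal illustration, assuming 365 equally likely days; the name `p_no_match_table` is hypothetical, and the dictionary simply plays the role of a spreadsheet column indexed by n.

```python
def p_no_match_table(max_n, days=365):
    """Return {n: P(no match among n people)} built one step at a time."""
    probs = {1: 1.0}  # one person alone cannot have a match
    for n in range(2, max_n + 1):
        # P(A_n') = (days - n + 1) / days * P(A_{n-1}')
        probs[n] = probs[n - 1] * (days - n + 1) / days
    return probs

table = p_no_match_table(100)
print(round(table[3], 4))       # P(no match) for n = 3
print(round(1 - table[3], 4))   # P(match) for n = 3, by the complement rule
```

Each entry reuses the previous one, so no large factorials or powers of 365 are ever formed.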
(l) Now open Excel and put the name “n” (without quotes) into cell A1, enter the value 1 into A2 and the value 2 into A3. To fill column A with consecutive integers through the value 100, select cells A2 and A3, click on the lower right corner of the selection, and drag down to row 101.
(m) Put the name “P(no match)” into cell B1 and enter the probability of no matches when n =1.
(n*) Enter the appropriate iterative formula into B3 for calculating its probability. (Start with the equal sign (e.g., =A3). Your formula should make use of both A3 and B2.) Record the Excel formula used after you convince yourself that it does give the correct value for this probability.
(o) To fill down the formula that you entered into B3 through the rest of column B, you can click on B3 and then double click on the square in the lower right corner.
(p*) Put the name “P(match)” into cell C1 and enter into C2 the appropriate formula (= …) for calculating this probability. (Your formula should make use of the complement rule and the entry in B2.) Then fill this formula down through the rest of column C.
(q*) Record the probabilities of matching birthdays for the values of n presented in the table, to four decimal places:

n         |  3  |  5  | 10  | 20  | 30  | 50  | 75  | 100
P(match)  |     |     |     |     |     |     |     |
(r*) From the spreadsheet identify the smallest value of n for which the probability of matching birthdays exceeds one-half. What is this probability? How was your guess in (b)?
(s*) Identify the smallest value of n for which the probability of matching birthdays exceeds .9. What is this probability? How was your guess in (b)?
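The search in (r) and (s) amounts to scanning down the P(match) column until a threshold is first exceeded. A rough Python sketch of that scan, assuming 365 equally likely days and thresholds below 1, might look as follows; `smallest_n_exceeding` is a hypothetical helper name, not something defined in the lab.

```python
def smallest_n_exceeding(threshold, days=365):
    """Smallest n with P(at least one match) > threshold, and that probability."""
    p_no_match = 1.0  # value for n = 1
    n = 1
    while 1 - p_no_match <= threshold:
        n += 1
        p_no_match *= (days - n + 1) / days  # same update as the spreadsheet formula
    return n, 1 - p_no_match

for t in (0.5, 0.9):
    n, p = smallest_n_exceeding(t)
    print(t, n, round(p, 4))
```

The loop always stops by n = 366, since the factor (days - n + 1) becomes zero there.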
Examining the Function: (t*) Use Excel to create a graph of the probability of matching birthdays (column C) as a function of n (column A). (Highlight column A and hold down the Ctrl key and select column C as well.) Start the “Chart Wizard”: In Excel 2003, choose Insert > Chart > XY (Scatter), and then press Finish. If you double click on the grey background, you can change the Area color to None. (In Excel 2007, choose Insert > Scatter > Scatter with Smooth Lines.) Copy and paste the graph into Word. Describe the behavior of this function.
A Slightly Different Question: Now consider a different question: Legend has it that a guest on Johnny Carson’s Tonight Show, aware of this result, told Johnny that there was almost a certainty of shared birthdays in his audience of over 100 people. Johnny allegedly asked the audience, “Was anyone else born on my birthday, October 23?” and found that nobody was. (u) Explain how Johnny’s question differs from the one you have considered so far.
(v) For a given value of n, do you expect the probability of matching Johnny’s birthday to be lower, greater, or the same as the one you found earlier? Explain.
(w) Explain why the denominator of this probability is the same as before: $365^n$.
(x) Consider the complement of this event: that nobody in a group of n people shares Johnny’s birthday. Determine the numerator of this probability by considering how many birthdays the first person can have, the second person can have, and so on. (Hint: You should not get the same expression as before.)
(y*) Write an expression for the probability that at least one person in a group of n shares Johnny’s birthday by using the complement rule and your answers to (w) and (x).
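One way to check an expression for (y) numerically is sketched below in Python. It assumes each of the n people independently has any of 365 equally likely birthdays, so each person misses one specific date with probability 364/365; the name `p_specific_match` is made up for this illustration.

```python
def p_specific_match(n, days=365):
    """P(at least one of n people has one particular birthday)."""
    p_nobody = ((days - 1) / days) ** n  # every person avoids the specific day
    return 1 - p_nobody                  # complement rule

for n in (1, 23, 100):
    print(n, round(p_specific_match(n), 4))
```

Compare the value for n = 23 with the probability of any shared birthday in a group of the same size.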
(z) Expand column A to go from 1 to 400. (Expand columns B and C as well.) Use column D of your Excel file to calculate the probability in (y) for values of n ranging from 1 to 400. [Hint: First rewrite your expression as a ratio raised to one power, (x/y)^z. Use the caret (^) symbol in Excel to raise a number to a power. Otherwise Excel won’t be able to perform the calculation!]
(aa*) Record the probabilities of matching birthdays for the values of n presented in the table (to four decimal places):

n                  |  3  |  5  | 10  | 20  | 30  | 50  | 75  | 100 | 400
P(specific match)  |     |     |     |     |     |     |     |     |
(bb*) Plot this function and the one involving a match of any birthday on the same graph: Highlight column A and then hold the Ctrl key down and highlight columns C and D (this skips column B), then use the Chart Wizard as above. Describe how this function compares to the one involving a match of any birthday.
(cc*) Identify the smallest value of n for which the probability of matching Johnny’s birthday exceeds .2. What is this probability?
(dd*) Identify the smallest value of n for which the probability of matching Johnny’s birthday exceeds .5. What is this probability? (You may need to expand your range of values for n.)
(ee*) Identify the smallest value of n for which the probability of matching Johnny’s birthday exceeds .9. What is this probability? (You may need to expand your range of values for n.)
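For (cc) through (ee), the range of n may need to go well past 400, which is tedious to scroll through. A small Python sketch of the same search is given here as a cross-check, assuming the 364/365 per-person model and a threshold strictly below 1; `smallest_n_specific` is a hypothetical name.

```python
def smallest_n_specific(threshold, days=365):
    """Smallest n with P(someone has one specific birthday) > threshold."""
    n = 1
    while 1 - ((days - 1) / days) ** n <= threshold:
        n += 1
    return n, 1 - ((days - 1) / days) ** n

for t in (0.2, 0.5, 0.9):
    n, p = smallest_n_specific(t)
    print(t, n, round(p, 4))
```

Note that the loop would never finish for a threshold of exactly 1, which is relevant to the question that follows.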
(ff*) Theoretically, is there a value of n for which the probability of at least one person matching Johnny’s birthday is exactly equal to one? Explain.
(gg*) Explain, as if to someone not in this class, why these probabilities differ so dramatically between these two situations.
You have found that there is a huge difference between any two people matching birthdays and someone matching a particular birthday. This illustrates the distinction between general and specific events. This distinction is an important one for probabilistic reasoning. In a general sense, very rare events happen all the time: people win lotteries every day and triplets are born somewhere in the world every day. By the same token, though, it is very unlikely for any one specific person to win the lottery on a given day or for one particular pregnancy to result in triplets. Many apparent “coincidences” can be explained by this phenomenon.
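The general-versus-specific distinction can also be seen by simulation rather than by formula. The sketch below is an illustration only: it assumes 365 equally likely days, picks an arbitrary day index (`target_day`) to stand in for the specific birthday, and uses a fixed seed and trial count chosen here for repeatability.

```python
import random

def simulate(n, target_day=295, days=365, trials=20000, seed=0):
    """Estimate P(any shared birthday) and P(someone has target_day) for n people."""
    rng = random.Random(seed)
    any_match = 0
    specific_match = 0
    for _ in range(trials):
        birthdays = [rng.randrange(days) for _ in range(n)]
        if len(set(birthdays)) < n:   # at least two people share some day
            any_match += 1
        if target_day in birthdays:   # someone has the one specific day
            specific_match += 1
    return any_match / trials, specific_match / trials

print(simulate(23))
```

With the same group size, the first estimate should be far larger than the second, because the first event allows a match on any of the 365 days while the second requires a match on one particular day.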
Lab Report: Turn in your answers to the starred questions. If you collaborate with another student, you can turn in one report with both names. Incorporate the Excel graphs into the body of a typed report. You do not need to turn in the full Excel spreadsheet (just the graphs, the formulas you used to create the columns, and the tables in (q) and (aa)).