3. Probability Theory
Total Page:16
File Type:pdf, Size:1020Kb
Ismor Fischer, 5/29/2012 3.1-1 3. Probability Theory 3.1 Basic Ideas, Definitions, and Properties POPULATION = Unlimited supply of five types of fruit, in equal proportions. O1 = Macintosh apple O4 = Cavendish (supermarket) banana O2 = Golden Delicious apple O5 = Plantain banana O3 = Granny Smith apple … … … … … … Experiment 1: Randomly select one fruit from this population, and record its type. Sample Space: The set S of all possible elementary outcomes of an experiment. S = {O1, O2, O3, O4, O5} #(S) = 5 Event: Any subset of a sample space S. (“Elementary outcomes” = simple events.) A = “Select an apple.” = {O1, O2, O3} #(A) = 3 B = “Select a banana.” = {O , O } #(B) = 2 #(Event) 4 5 #(trials) 1/1 1 3/4 2/3 4/6 . Event P(Event) A 3/5 0.6 1/2 A 3/5 = 0.6 0.4 B 2/5 . 1/3 2/6 1/4 B 2/5 = 0.4 # trials of 0 experiment 1 2 3 4 5 6 . 5/5 = 1.0 e.g., A B A A B A . P(A) = 0.6 “The probability of randomly selecting an apple is 0.6.” As # trials → ∞ P(B) = 0.4 “The probability of randomly selecting a banana is 0.4.” Ismor Fischer, 5/29/2012 3.1-2 General formulation may be facilitated with the use of a Venn diagram: Experiment ⇒ Sample Space: S = {O1, O2, …, Ok} #(S) = k A Om+1 Om+2 O2 O1 O3 Om+3 O 4 . Om . Ok Event A = {O1, O2, …, Om} ⊆ S #(A) = m ≤ k Definition: The probability of event A, denoted P(A), is the long-run relative frequency with which A is expected to occur, as the experiment is repeated indefinitely. Fundamental Properties of Probability For any event A = {O1, O2, …, Om} in a sample space S, 1. 0 ≤ P(A) ≤ 1 m 2. P(A) = ∑ PO()()()()i = PO123 + PO + PO ++ PO ()m i=1 Special Cases: • P(∅) = 0 k • P(S) = ∑ PO()i = 1 “certainty” i=1 3. If all the elementary outcomes of S are equally likely, i.e., 1 P(O ) = P(O ) = … = P(O ) = , 1 2 k k then… #(Am ) PA()= = . #(S ) k Example: P(A) = 3/5 = 0.6, P(B) = 2/5 = 0.4 Ismor Fischer, 5/29/2012 3.1-3 Experiment 2: Select a card at random from a standard deck (and replace). Sample Space: S = {A♠, …, K♦} #(S) = 52 A A♠ 2♠ 3♠ 4♠ 5♠ 6♠ 7♠ 8♠ 9♠ 10♠ J♠ Q♠ K♠ B A♣ 2♣ 3♣ 4♣ 5♣ 6♣ 7♣ 8♣ 9♣ 10♣ J♣ Q♣ K♣ A♥ 2♥ 3♥ 4♥ 5♥ 6♥ 7♥ 8♥ 9♥ 10♥ J♥ Q♥ K♥ A♦ 2♦ 3♦ 4♦ 5♦ 6♦ 7♦ 8♦ 9♦ 10♦ J♦ Q♦ K♦ Events: A = “Select a 2.” = {2♠, 2♣, 2♥, 2♦} #(A) = 4 B = “Select a ♣.” = {A♣, 2♣, …, K♣} #(B) = 13 Probabilities: Since all elementary outcomes are equally likely, it follows that #(A ) 4 #(B ) 13 P(A) = = and P(B) = = . #(S ) 52 #(S ) 52 New Events from Old Events complement c (1) A = “not A” = {All outcomes that are in S, but not in A.} P(Ac) = 1 − P(A) c c 4 48 Example: A = “Select either A, 3, 4, …, or K.” P(A ) = 1 − = . 52 52 Example: Experiment = Toss a coin once. Events: A = {Heads} Ac = {Tails} Probabilities: Fair coin… P(A) = 0.5 ⇒ P(Ac) = 1 − 0.5 = 0.5 Biased coin… P(A) = 0.7 ⇒ P(Ac) = 1 − 0.7 = 0.3 Ismor Fischer, 5/29/2012 3.1-4 intersection (2) A ∩ B = “A and B” = {All outcomes in S that A and B share in common.} = {All outcomes that result when events A and B occur simultaneously.} 1 Example: A ∩ B = “Select a 2 and a ♣” = {2♣} ⇒ P(A ∩ B) = . 52 Definition: Two events A and B are said to be disjoint, or mutually exclusive, if they cannot occur simultaneously, i.e., A ∩ B = ∅, hence P(A ∩ B) = 0. S A B Example: A = “Select a 2” and C = “Select a 3” are disjoint events. Exercise: Are A ={24444 , 3 , 4 , 5 ,...} and B ={26666 , 3 , 4 , 5 ,...} disjoint? If not, find A ∩ B. union (3) A ∪ B = “A or B” = {All outcomes in S that are either in A or B, inclusive.} P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0, if A and B are disjoint. Example: A ∪ B = “Select either a 2 or a ♣” has probability 4 13 1 16 P(A ∪ B) = + − = . 52 52 52 52 Example: A ∪ C = “Select either a 2 or a 3” has probability 4 4 8 P(A ∪ C) = + − 0 = . 52 52 52 Ismor Fischer, 5/29/2012 3.1-5 Note: Formula (3) extends to n ≥ 3 disjoint events in a straightforward manner: (4) P(A1 ∪ A2 ∪ … ∪ An) = P(A1) + P(A2) + … + P(An). Question: How is this formula modified if the n events are not necessarily disjoint? Example: Take n = 3 events… ∪ ∪ S Then P(A B C) = A B P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C). Exercise: For S = {January,…, December}, verify this formula for the three events A = “Has 31 days,” B = “Name ends in r,” and C C = “Name begins with a vowel.” incisors canine canine Exercise: A single tooth is to be randomly selected for a premolars certain dental procedure. Draw a Venn diagram to illustrate the relationships between the three following events: A = “upper jaw,” B = “left side,” and C = “molar,” and indicate all corresponding probabilities. Calculate the molars probability that all of these three events, A and B and C, occur. Calculate the probability that none of these three events occur. Calculate the probability that exactly one of these three events occurs. Calculate the probability that premolars exactly two of these three events occur. (Think carefully.) Assume equal likelihood in all cases. canine canine incisors The three “set operations” – union, intersection, and complement – can be unified via... DeMorgan’s Laws Exercise: Using a Venn diagram, convince yourself that these statements are true in (A ∪ B) c = Ac ∩ Bc general. Then verify them for a specific c c c example, e.g., A = “Pick a picture card” and (A ∩ B) = A ∪ B B = “Pick a black card.” Ismor Fischer, 5/29/2012 3.1-6 Slight Detour… Suppose that out of the last n = 40 races, a certain racing horse won x = 25, and lost the remaining n – x = 15. Based on these statistics, we can calculate the following probability estimates for future races: x 25 5 P(Win) ≈ = = = 0.625 = p n 40 8 Out of every 8 races, the horse wins 5 and loses 3, on average. x 15 3 P(Lose) ≈ 1− = = = 0.375 = 1 – p = q n 40 8 P(Win) 5 / 8 5 Odds of winning = = = “5 to 3” P(Lose) 3/ 8 3 Definition: For any event A, let P(A) = p, thus P(Ac) = q = 1 – p. The odds of event A p p = = , i.e., “the probability that A does occur, divided by the probability that it q 1− p does not occur.” (In the preceding example, A = “Win” with probability p = 5/8.) Note that if odds = 1, then A and Ac are equally likely to occur. If odds > 1 (likewise, < 1), then the probability that A occurs is greater (likewise, less) than the probability that it does not occur. Example: Suppose the probability of contracting a certain disease in a particular group of “high risk” individuals is P(D+) = 0.75, so that the probability of being disease-free is P(D–) = 0.25. Then the odds of contracting the disease in this group is equal to 0.75/0.25 = 3 (or “3 to 1”).* Likewise, if in a reference group of “low risk” individuals, the prevalence of the same disease is only P(D+) = 0.02, so that P(D–) = 0.98, then their odds = 0.02/0.98 = 1/49 (≈ 0.0204). As its name suggests, the corresponding “odds ratio” between the two groups is defined as the ratio of their 3 respective odds, i.e., = 147. That is, the odds of the high-risk group contracting 1/ 49 the disease are 147 times larger than the odds of the low-risk reference group. (Odds ratios have nice properties, and are used extensively in epidemiological studies.) * That is, within this group, the probability of disease is three times larger than the probability of no disease. .