INTRODUCTION TO PROBABILITY

ARIEL YADIN

Course: 201.1.8001 Fall 2013-2014 Lecture notes updated: December 24, 2014

Contents

Lecture 1
Lecture 2
Lecture 3
Lecture 4
Lecture 5
Lecture 6
Lecture 7
Lecture 8
Lecture 9
Lecture 10
Lecture 11
Lecture 12
Lecture 13
Lecture 14
Lecture 15
Lecture 16
Lecture 17
Lecture 18
Lecture 19
Lecture 20
Lecture 21
Lecture 22
Lecture 23
Lecture 24

Introduction to Probability

201.1.8001

Ariel Yadin

Lecture 1

1.1. Example: Bertrand’s Paradox

We begin with an example; this is known as Bertrand's paradox, after Joseph Louis François Bertrand (1822–1900).

Question 1.1. Consider a circle of radius 1, and an equilateral triangle inscribed in the circle, say ABC. (The side length of such a triangle is √3.) Let M be a randomly chosen chord of the circle. What is the probability that the length of M is greater than the length of a side of the triangle (i.e. greater than √3)?

Solution 1. How do we choose a random chord M? One way is to choose a random angle, let r be the radius at that angle, and let M be the unique chord perpendicular to r. Let x be the intersection point of M with r. (See Figure 1, left.) Because of symmetry, we can rotate the triangle so that the side AB is perpendicular to r. Since the sides of the triangle intersect the radius at distance 1/2 from the center 0, M is longer than AB if and only if x is at distance at most 1/2 from 0. Since r has length 1, the probability that x is at distance at most 1/2 is 1/2. □

Solution 2. Another way: choose two random points x, y on the circle, and let M be the chord connecting them. (See Figure 1, right.) Because of symmetry, we can rotate the triangle so that x coincides with the vertex A of the triangle. Then y falls in the arc BC of the circle if and only if M is longer than a side of the triangle.


Figure 1. How to choose a random chord.

The probability of this is the length of the arc BC over 2π; that is, 1/3 (the arc BC is one-third of the circle). □

Solution 3. A different way to choose a random chord M: choose a random point x in the circle, and let r be the radius through x. Then choose M to be the chord through x perpendicular to r. (See Figure 1, left.) Again we can rotate the triangle so that r is perpendicular to the side AB. Then, M will be longer than AB if and only if x lands inside the triangle; that is, if and only if the distance of x to 0 is at most 1/2. Since the area of a disc of radius 1/2 is 1/4 of the area of the disc of radius 1, this happens with probability 1/4. □

How did we reach this paradox? The original question was ill-posed: we did not define precisely what a random chord is. The different solutions come from different ways of choosing a random chord, and these are not the same. We now turn to precisely defining what we mean by "random", "probability", etc.
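The three chord models above can be compared empirically. The following Monte Carlo sketch (not part of the original notes; all function names are ours) samples each model and estimates the probability that the chord is longer than √3:

```python
import math
import random

def chord_endpoints(rng):
    # Solution 2: chord between two uniform points on the circle
    a, b = rng.uniform(0, 2 * math.pi), rng.uniform(0, 2 * math.pi)
    return 2 * abs(math.sin((a - b) / 2))

def chord_radius(rng):
    # Solution 1: uniform point x on a fixed radius, chord through x perpendicular to it
    d = rng.random()
    return 2 * math.sqrt(1 - d * d)

def chord_point(rng):
    # Solution 3: uniform point in the disc, chord perpendicular to the radius through it
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            return 2 * math.sqrt(1 - (x * x + y * y))

rng = random.Random(0)
n = 200_000
s3 = math.sqrt(3)
probs = {name: sum(f(rng) > s3 for _ in range(n)) / n
         for name, f in [("endpoints", chord_endpoints),
                         ("radius", chord_radius),
                         ("point", chord_point)]}
# probs should be close to {"endpoints": 1/3, "radius": 1/2, "point": 1/4}
```

The three estimates land near 1/3, 1/2 and 1/4 respectively, matching the three solutions: the "paradox" is simply three different distributions on chords.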

1.2. Sample spaces

When we do some experiment, we first want to collect all possible outcomes of the experiment. In mathematics, a collection of objects is just a set. The set of all possible outcomes of an experiment is called a sample space. Usually we denote the sample space by Ω, and its elements by ω.

Example 1.2.

• A coin toss. Ω = {H, T}. Actually, it is the set {heads, tails}, but H, T are easier to write.
• Tossing a coin twice. Ω = {H, T}². What about tossing a coin k times?
• Throwing two dice. Ω is the set of pairs of die faces; it is probably easier to use {1, 2, 3, 4, 5, 6}².
• The lifetime of a person. Ω = R₊. What if we only count the years? What if we want years and months?
• Bows and arrows: shooting an arrow into a target of radius 1/2 meters. Ω = {(x, y) : x² + y² ≤ 1/4}. What about Ω = {(r, θ) : r ∈ [0, 1/2], θ ∈ [0, 2π)}? What if the arrow tip has a thickness of 1cm, so we do not care about the point up to radius 1cm? What about missing the target altogether?
• A random real-valued continuously differentiable function on [0, 1]. Ω = C¹([0, 1]).
• Noga tosses a coin. If it comes out heads she goes to the probability lecture, and either takes notes or sleeps throughout. If the coin comes out tails, she goes running, and she records the distance she ran in meters and the time it took. Ω = ({H} × {notes, sleep}) ∪ ({T} × N × R₊). △

1.3. Events

Suppose we toss a die. So the sample space is, say, Ω = {1, 2, 3, 4, 5, 6}. We want to encode the outcome "the number on the die is even".

This can be done by collecting together all the relevant possible outcomes of this event; that is, a sub-collection of outcomes, or a subset of Ω. The event "the number on the die is even" corresponds to the subset {2, 4, 6} ⊂ Ω.

What do we want to require from events? We want the following properties:

• "Everything" is an event; that is, Ω is an event.
• If we can ask whether an event occurred, then we can also ask whether it did not occur; that is, if A is an event, then so is Aᶜ.
• If we have many events of which we can ask whether they have occurred, then we can also ask whether one of them has occurred; that is, if (Aₙ)_{n∈N} is a sequence of events, then so is ⋃ₙ Aₙ.

A word on notation: Aᶜ = Ω \ A. One needs to be careful in which space we are taking the complement. Some books use Ā.

Thus, if we want to say what events are, we have: if ℱ is the collection of events on a sample space Ω, then ℱ has the following properties:

• The elements of ℱ are subsets of Ω (i.e. ℱ is contained in the power set 2^Ω).
• Ω ∈ ℱ.
• If A ∈ ℱ then Aᶜ ∈ ℱ.
• If (Aₙ)_{n∈N} is a sequence of elements of ℱ, then ⋃ₙ Aₙ ∈ ℱ.

Definition 1.3. ℱ with the properties above is called a σ-algebra (or σ-field).

(Exercise: explain the names "algebra" and "σ-algebra".)

Example 1.4. Let Ω be any set (sample space). Then,

• ℱ = {∅, Ω}
• G = 2^Ω, the set of all subsets of Ω
are both σ-algebras on Ω. △

When Ω is a countable sample space, we will always take the full σ-algebra 2^Ω. (We will worry about the uncountable case in the future.)

To sum up: for a countable sample space Ω, any subset A ⊂ Ω is an event.

Example 1.5.

• The experiment is tossing three coins. The event A is "the second toss was heads". Ω = {H, T}³. A = {(x, H, y) : x, y ∈ {H, T}}.
• The experiment is "how many minutes passed until a goal is scored in the Manchester derby". The event A is "the first goal was scored after minute 23 and before minute 45". Ω = {0, 1, ..., 90} (maybe extra-time?). A = {23, 24, ..., 44}.

△

Example 1.6. We are given a computer code of 4 letters. Ai is the event that the i-th letter is ‘L’. What are the following events?

• A₁ ∩ A₂ ∩ A₃ ∩ A₄
• A₁ ∪ A₂ ∪ A₃ ∪ A₄
• A₁ᶜ ∩ A₂ᶜ ∩ A₃ᶜ ∩ A₄ᶜ
• A₁ᶜ ∪ A₂ᶜ ∪ A₃ᶜ ∪ A₄ᶜ
• A₃ ∩ A₄
• A₁ ∩ A₂ᶜ

What are the following events in terms of the Aᵢ's?
• There are at least 3 L's.
• There is exactly one L.
• There are no two L's in a row.

△
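Events as subsets can be manipulated directly as sets. A small sketch of the example above (ours; the 3-letter alphabet "KLM" is a hypothetical stand-in, since the original does not specify an alphabet):

```python
from itertools import product

alphabet = "KLM"  # hypothetical stand-in alphabet; any alphabet containing 'L' works
omega = set(product(alphabet, repeat=4))                    # all 4-letter codes
A = [{w for w in omega if w[i] == "L"} for i in range(4)]   # A[i]: i-th letter is 'L'

all_L = A[0] & A[1] & A[2] & A[3]    # every letter is 'L'
some_L = A[0] | A[1] | A[2] | A[3]   # at least one letter is 'L'
no_L = omega - some_L                # complement of the union

# de Morgan: "no L anywhere" equals the intersection of the complements
assert no_L == (omega - A[0]) & (omega - A[1]) & (omega - A[2]) & (omega - A[3])

at_least_3 = {w for w in omega if sum(c == "L" for c in w) >= 3}
exactly_1 = {w for w in omega if sum(c == "L" for c in w) == 1}
```

With this alphabet, |Ω| = 3⁴ = 81, there are 9 codes with at least 3 L's and 32 with exactly one.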

Example 1.7. Every day, products on a production line are chosen and labeled '−' if damaged or '+' if good. What are the complements of the following events?

• A = at least two products are damaged
• B = at most 3 products are damaged
• C = there are at least 3 more good products than damaged products
• D = there are no damaged products
• E = most of the products are damaged

To which of the above events does the string + + + + − − − − − − belong?
What are the events A ∩ B, A ∪ B, C ∩ E, B ∩ D, B ∪ D? △


Lecture 2

2.1. Probability

Remark 2.1. Two sets A, B are called disjoint if A ∩ B = ∅. For a sequence (Aₙ)_{n∈N}, we say that (Aₙ) are mutually disjoint if any two are disjoint; i.e. for any n ≠ m, Aₙ ∩ Aₘ = ∅.

What is probability? It is the assignment of a value to each event, between 0 and 1, saying how likely this event is. That is, if ℱ is the collection of all events on some sample space, then probability is a function from the events to [0, 1]; i.e. P : ℱ → [0, 1]. Of course, not every such assignment is good, and we would like some consistency.

What would we like of P? For one thing, we would like the likelihood of everything to be 100%; that is, P(Ω) = 1. Also, we would like that if A, B ∈ ℱ are two events that are disjoint, i.e. A ∩ B = ∅, then the likelihood that one of A, B occurs is the sum of their likelihoods; i.e. P(A ∪ B) = P(A) + P(B).

This leads to the following definition:

Definition 2.2 (Probability measure; the axioms are due to Andrey Kolmogorov, 1903–1987). Let Ω be a sample space, and ℱ a σ-algebra on Ω. A probability measure on (Ω, ℱ) is a function P : ℱ → [0, 1] such that
• P(Ω) = 1.
• If (Aₙ) is a sequence of mutually disjoint events in ℱ, then
P(⋃ₙ Aₙ) = Σₙ P(Aₙ).

(Ω, ℱ, P) is called a probability space.

Example 2.3.

• A fair coin toss: Ω = {H, T}, ℱ = 2^Ω = {∅, {H}, {T}, Ω}. P(∅) = 0, P({H}) = P({T}) = 1/2, P(Ω) = 1.
• An unfair coin toss: P(∅) = 0, P({H}) = 3/4, P({T}) = 1/4, P(Ω) = 1.
• A fair die: Ω = {1, 2, ..., 6}, ℱ = 2^Ω. For A ⊂ Ω let P(A) = |A|/6.
• Let Ω be any countable set. Let ℱ = 2^Ω. Let p : Ω → [0, 1] be a function such that Σ_{ω∈Ω} p(ω) = 1. Then, for any A ⊂ Ω define
P(A) = Σ_{ω∈A} p(ω).
P is a probability measure on (Ω, ℱ).
• Let γ = Σ_{n≥1} n⁻². For A ⊂ {1, 2, ...} define
P(A) = (1/γ) · Σ_{a∈A} a⁻².
P is a probability measure.

△

Some properties of probability measures:

Proposition 2.4. Let (Ω, ℱ, P) be a probability space. Then,
(1) ∅ ∈ ℱ and P(∅) = 0.
(2) If A₁, ..., Aₙ are a finite number of mutually disjoint events, then A := ⋃_{j=1}^n Aⱼ ∈ ℱ and P(A) = Σ_{j=1}^n P(Aⱼ).
(3) For any event A ∈ ℱ, P(Aᶜ) = 1 − P(A).
(4) If A ⊂ B are events, then P(B \ A) = P(B) − P(A).
(5) If A ⊂ B are events, then P(A) ≤ P(B).
(6) (Inclusion-Exclusion Principle.) For events A, B ∈ ℱ,

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

Proof. Let A, B, (Aₙ)_{n∈N} be events in ℱ.

(1) ∅ = Ωᶜ ∈ ℱ. Set Aⱼ = ∅ for all j ∈ N. This is a sequence of mutually disjoint events. Since ∅ = ⋃ⱼ Aⱼ, we have that P(∅) = Σⱼ P(∅), implying that P(∅) = 0.
(2) For j > n let Aⱼ = ∅. Then (Aⱼ)_{j∈N} is a sequence of mutually disjoint events, so
P(⋃_{j=1}^n Aⱼ) = P(⋃ⱼ Aⱼ) = Σⱼ P(Aⱼ) = Σ_{j=1}^n P(Aⱼ).
(3) A ∪ Aᶜ = Ω and A ∩ Aᶜ = ∅. So the additivity of P on disjoint sets gives that P(A) + P(Aᶜ) = P(Ω) = 1.
(4) If A ⊂ B, then B = (B \ A) ∪ A, and (B \ A) ∩ A = ∅. Thus, P(B) = P(B \ A) + P(A).
(5) Since A ⊂ B, we have that P(B) − P(A) = P(B \ A) ≥ 0.
(6) Let C = A ∩ B. Note that A ∪ B = (A \ C) ∪ (B \ C) ∪ C, where all sets in the union are mutually disjoint. Since C ⊂ A and C ⊂ B, we get

P(A ∪ B) = P(A \ C) + P(B \ C) + P(C) = P(A) − P(C) + P(B) − P(C) + P(C) = P(A) + P(B) − P(C). □
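These properties are easy to check numerically on a small uniform space. A sketch (ours), using a fair die and exact rational arithmetic:

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}        # fair die, uniform measure
def P(E):
    return Fraction(len(E), len(omega))

A = {2, 4, 6}                     # "even"
B = {4, 5, 6}                     # "at least 4"

assert P(omega - A) == 1 - P(A)               # property (3), complement rule
assert P(B - A) == P(B) - P(A & B)            # property (4), applied with A∩B ⊂ B
assert P(A | B) == P(A) + P(B) - P(A & B)     # property (6), inclusion-exclusion
```

Of course this is only a sanity check on one finite space, not a proof; the proof above covers the general case.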

Proposition 2.5 (Boole's inequality; after George Boole, 1815–1864). Let (Ω, ℱ, P) be a probability space. If (Aₙ)_{n∈N} are events, then

P(⋃_{n∈N} Aₙ) ≤ Σ_{n∈N} P(Aₙ).

Proof. For every n let B₀ = ∅, and
Bₙ = ⋃_{j=1}^n Aⱼ and Cₙ = Aₙ \ Bₙ₋₁.

Claim 1. For any n > m ≥ 1, we have that Cₙ ∩ Cₘ = ∅.
Proof. Since m ≤ n − 1, we have that Aₘ ⊂ Bₙ₋₁. Since Cₘ ⊂ Aₘ,

Cₙ ∩ Cₘ ⊂ (Aₙ \ Bₙ₋₁) ∩ Bₙ₋₁ = ∅.

Claim 2. ⋃ₙ Aₙ ⊂ ⋃ₙ Cₙ.
Proof. Let ω ∈ ⋃ₙ Aₙ. So there exists n such that ω ∈ Aₙ. Let m be the smallest integer such that ω ∈ Aₘ (if ω ∈ A₁ let m = 1). Then, ω ∈ Aₘ, and ω ∉ Aₖ for all 1 ≤ k < m. In other words: ω ∈ Aₘ and ω ∉ ⋃_{k=1}^{m−1} Aₖ = Bₘ₋₁. Or: ω ∈ Aₘ \ Bₘ₋₁ = Cₘ. Thus, there exists some m ≥ 1 such that ω ∈ Cₘ; i.e. ω ∈ ⋃ₙ Cₙ.

We conclude that (Cₙ)_{n≥1} is a collection of events that are pairwise disjoint, and

⋃ₙ Aₙ ⊂ ⋃ₙ Cₙ. Using (5) from Proposition 2.4,
P(⋃ₙ Aₙ) ≤ P(⋃ₙ Cₙ) = Σₙ P(Cₙ) ≤ Σₙ P(Aₙ). □

Example 2.6 (Monty Hall paradox). In a famous game show, the contestant is given three boxes; two of them contain nothing, but one contains the keys to a new car. All key placements are equally likely. The contestant chooses a box. Then the host reveals one of the other two boxes that does not contain anything, and the contestant is given a choice whether to switch her choice or not. What should she do?
A = {an empty box is chosen in the first choice}. B = {the keys are in the other box (she should switch)}. One checks that B ∩ Aᶜ = ∅ and A ∩ Bᶜ = ∅, so A = B. Since all key placements are equally likely, the probability that the keys are in the chosen box is 1/3. That is, P(Aᶜ) = 1/3, and so P(B) = P(A) = 2/3. △

2.2. Discrete Probability Spaces

Discrete probability spaces are those for which the sample space is countable. We have already seen that in this case we can take all subsets to be events, so ℱ = 2^Ω. We have also implicitly seen that, due to additivity on disjoint sets, the probability measure P is completely determined by its value on singletons.

That is, let (Ω, 2^Ω, P) be a probability space where Ω is countable. If we denote p(ω) = P({ω}) for all ω ∈ Ω, then for any event A ⊂ Ω we have that A = ⋃_{ω∈A} {ω}, and this is a countable disjoint union. Thus,

P(A) = Σ_{ω∈A} P({ω}) = Σ_{ω∈A} p(ω).

Exercise 2.7. Let Ω = {1, 2, ...}, and define a probability measure on (Ω, 2^Ω) by
(1) P({ω}) = 1/((e − 1) · ω!).
(2) P′({ω}) = (1/3) · (3/4)^ω.
(Extend using additivity on disjoint unions.) Show that both uniquely define probability measures.

Solution. We need to show that P(Ω) = 1, and that P is additive on disjoint unions. Indeed,
P(Ω) = Σ_{ω=1}^∞ P({ω}) = Σ_{ω=1}^∞ 1/((e − 1) · ω!) = 1,
and
P′(Ω) = Σ_{ω=1}^∞ (1/3) · (3/4)^ω = 1.
Now let (Aₙ) be a sequence of mutually disjoint events and let A = ⋃ₙ Aₙ. Using the fact that any subset is the disjoint union of the singletons composing it, we have that ω ∈ A if and only if there exists a unique n(ω) such that ω ∈ Aₙ. Thus,
P(A) = Σ_{ω∈A} P({ω}) = Σₙ Σ_{ω∈Aₙ} P({ω}) = Σₙ P(Aₙ).
The same holds for P′. □
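The two normalizations can be checked numerically (a sketch; the truncation points below are arbitrary but make the tails negligible):

```python
import math

# P({ω}) = 1/((e − 1)·ω!): since Σ_{ω≥1} 1/ω! = e − 1, the series sums to 1
s1 = sum(1 / ((math.e - 1) * math.factorial(w)) for w in range(1, 40))

# P'({ω}) = (1/3)·(3/4)^ω: geometric series, Σ_{ω≥1} (3/4)^ω = 3, so the sum is 1
s2 = sum((1 / 3) * (3 / 4) ** w for w in range(1, 500))
```

Both partial sums agree with 1 to within floating-point error.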

Proposition 2.8. Let Ω be a countable set. Let p : Ω → [0, 1] be such that Σ_{ω∈Ω} p(ω) = 1. Then, there exists a unique probability measure P on (Ω, 2^Ω) such that P({ω}) = p(ω) for all ω ∈ Ω. (p as above is sometimes called the density of P.)

Proof. For A ⊂ Ω define
P(A) = Σ_{ω∈A} p(ω).
We have by the assumption on p that P(Ω) = Σ_{ω∈Ω} p(ω) = 1. Also, if (Aₙ) is a sequence of events that are mutually disjoint, and A = ⋃ₙ Aₙ is their union, then for any ω ∈ A there exists n such that ω ∈ Aₙ. Moreover, this n is unique, since Aₙ ∩ Aₘ = ∅ for all m ≠ n, so ω ∉ Aₘ for all m ≠ n. So
P(A) = Σ_{ω∈A} p(ω) = Σₙ Σ_{ω∈Aₙ} p(ω) = Σₙ P(Aₙ).
This shows that P is a probability measure on (Ω, 2^Ω).
For uniqueness, let P′ : 2^Ω → [0, 1] be a probability measure such that P′({ω}) = p(ω) for all ω ∈ Ω. Then, for any A ⊂ Ω,
P′(A) = Σ_{ω∈A} P′({ω}) = Σ_{ω∈A} p(ω) = P(A). □

(1) A simple but important example is for finite sample spaces. Let Ω be a finite set. Suppose we assume that all outcomes are equally likely; that is, P({ω}) = 1/|Ω| for all ω ∈ Ω, and so, for any event A ⊂ Ω we have that P(A) = |A|/|Ω|. It is simple to verify that this is a probability measure. This is known as the uniform measure on Ω.
(2) We throw two fair dice. What is the probability the sum is 7? What is the probability the sum is 6?

Solution. The sample space here is Ω = {1, 2, ..., 6}² = {(i, j) : 1 ≤ i, j ≤ 6}. Since we assume the dice are fair, all outcomes are equally likely, and so the probability measure is the uniform measure on Ω. Now, the event that the sum is 7 is

A = {(i, j) : 1 ≤ i, j ≤ 6, i + j = 7} = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}.
So P(A) = |A|/|Ω| = 6/36 = 1/6.

The event that the sum is 6 is

B = {(i, j) : 1 ≤ i, j ≤ 6, i + j = 6} = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}.
So P(B) = 5/36.

(3) There are 15 balls in a jar, 7 black balls and 8 white balls. When removing a ball from the jar, any ball is equally likely to be removed. Shir puts her hand in the jar and takes out two balls, one after the other. What is the probability that Shir removes one black ball and one white ball?

Solution. First, we can think of the different balls as being represented by the numbers 1, 2, ..., 15. So the sample space is Ω = {(i, j) : 1 ≤ i ≠ j ≤ 15}. Note that the size of Ω is |Ω| = 15 · 14. Since it is equally likely to remove any ball, the probability measure is the uniform measure on Ω.
Let A be the event that one ball is black and one is white. How can we compute P(A) = |A|/|Ω|?
We can split A into two disjoint events, and use additivity of probability measures: let B be the event that the first ball is white and the second ball is black, and let B′ be the event that the first ball is black and the second ball is white. It is immediate that B and B′ are disjoint. Also, A = B ∪ B′. Thus, P(A) = P(B) + P(B′), and we only need to compute the sizes of B, B′. Note that the number of pairs (i, j) such that ball i is black and ball j is white is 7 · 8. So |B′| = 7 · 8. Similarly, |B| = 8 · 7. All in all,
P(A) = P(B) + P(B′) = (8 · 7)/(15 · 14) + (7 · 8)/(15 · 14) = 8/15.

(4) A deck of 52 cards is shuffled, so that any order is equally likely. The top 10 cards are distributed among 2 players, 5 to each one. What is the probability that at least one player has a royal flush (10-J-Q-K-A of the same suit)?
Solution. Each player receives a subset of the cards of size 5, so the sample space is
Ω = {(S₁, S₂) : S₁, S₂ ⊂ C, |S₁| = |S₂| = 5, S₁ ∩ S₂ = ∅},
where C is the set of all cards (i.e. C = {A♠, 2♠, ..., A♦, 2♦, ...}). There are (52 choose 5) ways to choose S₁ and (47 choose 5) ways to choose S₂, so |Ω| = (52 choose 5) · (47 choose 5). Also, every choice is equally likely, so P is the uniform measure.

Let Aᵢ be the event that player i has a royal flush (i = 1, 2). For s ∈ {♠, ♦, ♥, ♣}, let B(i, s) be the event that player i has a royal flush of the suit s. So Aᵢ = ⋃ₛ B(i, s). B(i, s) is the event that Sᵢ is a specific set of 5 cards, so |B(i, s)| = (47 choose 5) for any choice of i, s. Thus,
P(Aᵢ) = Σₛ |B(i, s)|/|Ω| = 4/(52 choose 5).
Now we use the inclusion-exclusion principle:

P(A) = P(A₁ ∪ A₂) = P(A₁) + P(A₂) − P(A₁ ∩ A₂).

The event A₁ ∩ A₂ is the event that both players have a royal flush, so |A₁ ∩ A₂| = 4 · 3, since there are 4 options for the first player's suit, and then 3 for the second. Altogether,

P(A) = 4/(52 choose 5) + 4/(52 choose 5) − (4 · 3)/((52 choose 5) · (47 choose 5)) = (8/(52 choose 5)) · (1 − 3/(2 · (47 choose 5))).

△
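The royal-flush computation can be verified exactly with binomial coefficients (a sketch using Python's `math.comb` and exact rational arithmetic):

```python
from fractions import Fraction
from math import comb

n_omega = comb(52, 5) * comb(47, 5)    # |Ω|: ordered pair of disjoint 5-card hands
p_one = Fraction(4, comb(52, 5))       # P(A_i): 4 suits, the other hand is free
p_both = Fraction(4 * 3, n_omega)      # |A_1 ∩ A_2| = 4·3 choices of the two suits
p = p_one + p_one - p_both             # inclusion-exclusion

# the closed form from the text
closed = Fraction(8, comb(52, 5)) * (1 - Fraction(3, 2 * comb(47, 5)))
```

The two expressions agree exactly; numerically the probability is about 3.1 · 10⁻⁶.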

Exercise 2.10. Prove that there is no uniform probability measure on N; that is, there is no probability measure on N such that P({i}) = P({j}) for all i, j.

Solution. Assume such a probability measure exists, with common value p = P({j}). By the defining properties of probability measures,
1 = P(N) = Σ_{j=0}^∞ P({j}) = Σ_{j=0}^∞ p,
which is ∞ if p > 0 and 0 if p = 0. A contradiction. □

Exercise 2.11. A deck of 52 cards is ordered randomly, all orders equally likely. What is the probability that the 14th card is an ace? What is the probability that the first ace is at the 14th card?

Solution. The sample space is all the possible orderings of the set of cards C = {A♠, 2♠, ..., A♦, 2♦, ...}. So Ω is the set of all 1-1 functions f : {1, ..., 52} → C, where f(1) is the first card, f(2) the second, and so on. So |Ω| = 52!. The measure is the uniform one.

Let A be the event that the 14th card is an ace, and let B be the event that the first ace is at the 14th card. A is the event that f(14) is an ace, so A = {f ∈ Ω : f(14) is an ace}; there are 4 choices for this ace, so |A| = 4 · 51!. Thus, P(A) = 4/52 = 1/13.
B is the event that f(14) is an ace, but also f(j) is not an ace for all j < 14. Thus, B = {f ∈ Ω : f(14) is an ace, and f(j) is not an ace for all j < 14}. So |B| = 48 · 47 ⋯ 36 · 4 · 38!, and
P(B) = (4 · 38 · 37 · 36)/(52 · 51 · 50 · 49) ≈ 0.0312. □
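Both answers can be checked by exact counting (a sketch with `fractions.Fraction`):

```python
from fractions import Fraction
from math import factorial

# P(A): the 14th card is an ace — 4 choices of ace times 51! orders of the rest
p_A = Fraction(4 * factorial(51), factorial(52))

# P(B): positions 1..13 hold non-aces (48·47···36 choices), position 14 an ace (4),
# and the remaining 38 cards in any order (38!)
non_aces = 1
for k in range(48, 35, -1):
    non_aces *= k
p_B = Fraction(non_aces * 4 * factorial(38), factorial(52))
```

The huge factorials cancel to the small closed forms given in the solution.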


Lecture 3

3.1. Some set theory

Recall the inclusion-exclusion principle:

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

This can be demonstrated by the Venn diagram in Figure 2.

Figure 2. The inclusion-exclusion principle, illustrated by a Venn diagram (after John Venn, 1834–1923).

This diagram also illustrates the intuitive meaning of A ∪ B; namely, A ∪ B is the event that one of A or B occurs. Similarly, A ∩ B is the event that both A and B occur.

Let (Aₙ)_{n=1}^∞ be a sequence of events in some probability space. We have

⋃_{n≥k} Aₙ = {ω ∈ Ω : ω ∈ Aₙ for at least one n ≥ k} = {ω ∈ Ω : there exists n ≥ k such that ω ∈ Aₙ}.

So ⋂_{k=1}^∞ ⋃_{n≥k} Aₙ is the set of ω ∈ Ω such that for any k, there exists n ≥ k such that ω ∈ Aₙ; i.e.

⋂_{k=1}^∞ ⋃_{n≥k} Aₙ = {ω ∈ Ω : ω ∈ Aₙ for infinitely many n}.

Definition 3.1. Define the set lim sup An as

lim sup Aₙ := ⋂_{k=1}^∞ ⋃_{n≥k} Aₙ.

The intuitive meaning of lim supn An is that infinitely many of the An’s occur. Similarly, we have that

⋃_{k=1}^∞ ⋂_{n≥k} Aₙ = {ω ∈ Ω : there exists n₀ such that for all n > n₀, ω ∈ Aₙ} = {ω ∈ Ω : ω ∈ Aₙ for all large enough n}.

Definition 3.2. Define
lim inf Aₙ := ⋃_{k=1}^∞ ⋂_{n≥k} Aₙ.

lim inf An means that An will occur from some large enough n onward; that is, even- tually all An occur. It is easy to see that

lim inf Aₙ ⊆ lim sup Aₙ.

This also fits the intuition, as if all An occur eventually, then they occur infinitely many times.

∞ Definition 3.3. If (An)n=1 is a sequence of events such that lim inf An = lim sup An (as sets) then we define

lim An := lim inf An = lim sup An.

Example 3.4. Consider Ω = N, and the sequence

Aₙ = {n^j : j = 0, 1, 2, ...}.

If m < n and m ∈ Aₙ, then it must be that m = n⁰ = 1. Thus, ⋂_{n≥k} Aₙ = {1} and

lim inf Aₙ = ⋃_{k=1}^∞ ⋂_{n≥k} Aₙ = {1}.

Also, if m < k ≤ n and m ∈ Aₙ, then again m = 1, so ⋃_{n≥k} Aₙ does not contain any m with 1 < m < k; i.e. ⋃_{n≥k} Aₙ ⊂ {1, k, k + 1, ...}. Hence,

lim sup Aₙ = ⋂_{k=1}^∞ ⋃_{n≥k} Aₙ = {1}.

Thus, the limit exists and limₙ Aₙ = {1}. △
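A finite truncation can illustrate the computation in Example 3.4 (a sketch, ours; only powers below a cutoff are kept, so this suggests rather than proves the limit):

```python
from functools import reduce

CUTOFF = 10_000

def A(n):
    # A_n = {n^j : j = 0, 1, 2, ...}, truncated to values below CUTOFF (use n >= 2)
    out, v = set(), 1
    while v < CUTOFF:
        out.add(v)
        v *= n
    return out

# the intersection over n = k..10 is already {1}, matching ⋂_{n≥k} A_n = {1}
for k in range(2, 6):
    inter = reduce(set.intersection, (A(n) for n in range(k, 11)))
    assert inter == {1}
```

Since ⋂_{n≥k} Aₙ = {1} for every k, both lim inf and lim sup are {1}, as derived above.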

Definition 3.5. A sequence of events is called increasing if Aₙ ⊂ Aₙ₊₁ for all n, and decreasing if Aₙ₊₁ ⊂ Aₙ for all n.

Proposition 3.6. Let (An) be a sequence of events. Then,

(lim inf Aₙ)ᶜ = lim sup Aₙᶜ.

Moreover, if (Aₙ) is increasing (resp. decreasing), then lim Aₙ = ⋃ₙ Aₙ (resp. lim Aₙ = ⋂ₙ Aₙ).

Proof. The first assertion is de Morgan.

For the second assertion, note that if (An) is increasing, then

⋃_{n≥k} Aₙ = ⋃_{n≥1} Aₙ and ⋂_{n≥k} Aₙ = Aₖ.

So

lim sup Aₙ = ⋂ₖ ⋃_{n≥k} Aₙ = ⋂ₖ ⋃ₙ Aₙ = ⋃ₙ Aₙ,
lim inf Aₙ = ⋃ₖ ⋂_{n≥k} Aₙ = ⋃ₖ Aₖ = ⋃ₙ Aₙ.

If (Aₙ) is decreasing then (Aₙᶜ) is increasing. So

lim sup Aₙ = (lim inf Aₙᶜ)ᶜ = (⋃ₙ Aₙᶜ)ᶜ = (lim sup Aₙᶜ)ᶜ = lim inf Aₙ,

and lim Aₙ = (⋃ₙ Aₙᶜ)ᶜ = ⋂ₙ Aₙ. □

Exercise 3.7. Show that if ℱ is a σ-algebra, and (Aₙ) is a sequence of events in ℱ, then lim inf Aₙ ∈ ℱ and lim sup Aₙ ∈ ℱ.

3.2. Continuity of Probability Measures

The goal of this section is to prove

Theorem 3.8 (Continuity of Probability). Let (Ω, , P) be a probability space. Let (An) F be a sequence of events such that lim An exists. Then,

P(lim Aₙ) = lim_{n→∞} P(Aₙ).

We start with a restricted version:

Proposition 3.9. Let (Ω, ℱ, P) be a probability space. Let (Aₙ) be a sequence of increasing (resp. decreasing) events. Then,

P(lim Aₙ) = lim_{n→∞} P(Aₙ).

Proof. We start with the increasing case. Let A = ⋃ₙ Aₙ = lim Aₙ. Define A₀ = ∅, and Bₖ = Aₖ \ Aₖ₋₁ for all k ≥ 1. So (Bₖ) are mutually disjoint and

⊎_{k=1}^n Bₖ = ⋃_{k=1}^n Aₖ = Aₙ,

so ⊎ₙ Bₙ = A. Thus,

P(lim Aₙ) = P(⊎ₙ Bₙ) = Σₙ P(Bₙ) = limₙ Σ_{k=1}^n P(Bₖ) = limₙ P(⊎_{k=1}^n Bₖ) = limₙ P(Aₙ).

The decreasing case follows from noting that if (Aₙ) is decreasing, then (Aₙᶜ) is increasing, so

P(lim Aₙ) = P(⋂ₙ Aₙ) = 1 − P(⋃ₙ Aₙᶜ) = 1 − limₙ P(Aₙᶜ) = limₙ P(Aₙ). □

Lemma 3.10 (Fatou's Lemma; after Pierre Fatou, 1878–1929). Let (Ω, ℱ, P) be a probability space. Let (Aₙ) be a sequence of events (that may not have a limit). Then,

P(lim inf Aₙ) ≤ lim infₙ P(Aₙ) and lim supₙ P(Aₙ) ≤ P(lim sup Aₙ).

Proof. For all k let Bₖ = ⋂_{n≥k} Aₙ. So lim infₙ Aₙ = ⋃ₖ Bₖ. Note that (Bₖ) is an increasing sequence of events (Bₖ = Bₖ₊₁ ∩ Aₖ). Also, for any n ≥ k, Bₖ ⊂ Aₙ, so P(Bₖ) ≤ inf_{n≥k} P(Aₙ). Thus,

P(lim inf Aₙ) = P(⋃ₙ Bₙ) = limₙ P(Bₙ) ≤ limₙ inf_{k≥n} P(Aₖ) = lim infₙ P(Aₙ).

For the lim sup,

lim supₙ P(Aₙ) = lim supₙ (1 − P(Aₙᶜ)) = 1 − lim infₙ P(Aₙᶜ) ≤ 1 − P(lim inf Aₙᶜ) = 1 − P((lim sup Aₙ)ᶜ) = P(lim sup Aₙ). □

Fatou's Lemma immediately proves Theorem 3.8.

Proof of Theorem 3.8. Just note that

lim supₙ P(Aₙ) ≤ P(lim sup Aₙ) = P(lim Aₙ) = P(lim inf Aₙ) ≤ lim infₙ P(Aₙ),

so equality holds, and the limit exists. □

As a consequence we get:

Lemma 3.11 (First Borel–Cantelli Lemma; after Émile Borel, 1871–1956, and Francesco Cantelli, 1875–1966). Let (Ω, ℱ, P) be a probability space. Let (Aₙ) be a sequence of events. If Σₙ P(Aₙ) < ∞, then

P(Aₙ occurs for infinitely many n) = P(lim sup Aₙ) = 0.

Proof. Let Bₖ = ⋃_{n≥k} Aₙ. So (Bₖ) is decreasing, and the decreasing sequence P(Bₖ) converges to P(lim sup Aₙ). Thus, for all k,

P(lim sup Aₙ) ≤ P(Bₖ) ≤ Σ_{n≥k} P(Aₙ).

Since the right hand side converges to 0 as k → ∞ by the assumption that the series is convergent, we get P(lim sup Aₙ) ≤ 0. □

Example 3.12. We have a bunch of bacteria in a petri dish. Every second, the bacteria give off offspring randomly, and then the parents die out. Suppose that for any n, the probability that there are no bacteria left by time n is 1 − exp(−f(n)). What is the probability that the bacteria eventually die out if:
• f(n) = log n.
• f(n) = (n² − 7)/(2n² + 3n + 5).

Let Aₙ be the event that the bacteria die out by time n. So P(Aₙ) = 1 − exp(−f(n)) for n ≥ 1. Note that the event that the bacteria eventually die out is the event that there exists n such that Aₙ occurs; i.e. the event ⋃ₙ Aₙ. Since (Aₙ)ₙ is an increasing sequence, we have that P(⋃ₙ Aₙ) = limₙ P(Aₙ).
In the first case this is 1. In the second case this is
lim_{n→∞} 1 − exp(−(n² − 7)/(2n² + 3n + 5)) = 1 − e^{−1/2}. △

Example 3.13. What if we take the previous example, and the information is that the probability that the bacteria at generation n do not die out without offspring is at most exp(−2 log n)?
Then, if Aₙ is the event that the n-th generation has offspring, we have that P(Aₙ) ≤ n⁻². Since Σₙ P(Aₙ) < ∞, Borel–Cantelli tells us that
P(Aₙ occurs for infinitely many n) = P(lim sup Aₙ) = 0.

That is,
P(∃ k : ∀ n ≥ k : Aₙᶜ) = P(lim inf Aₙᶜ) = 1.
So a.s. there exists k such that all generations n ≥ k do not have offspring, implying that the bacteria die out with probability 1. △
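The limits in Example 3.12 can be checked numerically (a sketch; a large n stands in for the limit):

```python
import math

def p_die_out(n, f):
    # P(A_n) = 1 − exp(−f(n)): probability of no bacteria left by time n
    return 1 - math.exp(-f(n))

f1 = lambda n: math.log(n)
f2 = lambda n: (n * n - 7) / (2 * n * n + 3 * n + 5)

# (A_n) is increasing, so P(eventually die out) = lim_n P(A_n)
limit1 = p_die_out(10 ** 9, f1)                 # tends to 1
limit2 = p_die_out(10 ** 9, f2)                 # tends to 1 − e^{−1/2}
```

At n = 10⁹ the first probability is within 10⁻⁸ of 1, and the second within 10⁻⁶ of 1 − e^{−1/2} ≈ 0.393, matching the two limits computed above.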


Lecture 4

4.1. Conditional Probabilities

We start with a simple observation.

Proposition 4.1. Let (Ω, ℱ, P) be a probability space. Let F ∈ ℱ be an event such that P(F) > 0. Define

ℱ|_F = {A ∩ F : A ∈ ℱ},

and define P_F : ℱ|_F → [0, 1] by P_F(B) = P(B)/P(F) for all B ∈ ℱ|_F, and Q : ℱ → [0, 1] by Q(A) = P(A ∩ F)/P(F) for all A ∈ ℱ. Then, ℱ|_F is a σ-algebra on F, P_F is a probability measure on (F, ℱ|_F), and Q is a probability measure on (Ω, ℱ).

Notation. The probability measure Q above is usually denoted P(· | F).

Proof. The σ-algebra part just follows from F = Ω ∩ F and ⋃ₙ(F ∩ Aₙ) = F ∩ ⋃ₙ Aₙ.
P_F is a probability measure since P_F(F) = P(F)/P(F) = 1, and if (Bₙ) is a sequence of mutually disjoint events in ℱ|_F, then
P_F(⋃ₙ Bₙ) = P(⋃ₙ Bₙ)/P(F) = (1/P(F)) Σₙ P(Bₙ) = Σₙ P_F(Bₙ).
Q is a probability measure since Q(Ω) = P(Ω ∩ F)/P(F) = 1, and if (Aₙ) is a sequence of disjoint events in ℱ, then
Q(⋃ₙ Aₙ) = (1/P(F)) P(⋃ₙ(Aₙ ∩ F)) = Σₙ Q(Aₙ). □

What is the intuitive meaning of the above measure Q? What we did is restrict all events to their intersection with F, and look at them only in F, normalized to have

Definition 4.2 (Conditional Probability). Let (Ω, , P) be a probability space, and let F F be an event such that P(F ) > 0. ∈ F For any event A , the quantity P(A F )/ P(F ) is called the conditional prob- ∈ F ∩ ability of A given F . The probability measure P( F ) := P( F )/ P(F ) is called the conditional proba- ·| · ∩ bility given F .

Warning: one must be careful with conditional probabilities; intuition here is often misleading.

Example 4.3. An urn contains 10 white balls, 5 yellow balls and 10 black balls. Shir takes a ball out of the urn, all balls equally likely.

• What is the probability that the ball removed is yellow?
• What is the probability that the ball removed is yellow, given that it is not black?

Solution. Let A = the ball is yellow, and B = the ball is not black. Then P(A) = 5/25 = 1/5 and P(B) = 15/25 = 3/5. Since A ⊂ B, we have P(A ∩ B) = P(A) = 1/5. So P(A | B) = (1/5)/(3/5) = 1/3. □

Example 4.4. Noga has the heart of an artist and the mind of a mathematician. Given that she takes a course in probability, she will pass with probability 1/3. Given that she takes a course in contemporary art, she will pass with probability 1/2. She chooses which course to take using a fair coin toss: heads for probability and tails for art. What is the probability that Noga takes probability and passes? What is the probability she passes, no matter which course she takes? △

Solution. The sample space here is

Ω = {H, T} × {pass, fail}.

The events we are interested in are A = "Noga passes the course" = {H, T} × {pass} and B = "Noga takes probability" = {H} × {pass, fail}, so that Bᶜ = "Noga takes art" = {T} × {pass, fail}. The event that Noga takes and passes probability is A ∩ B.
The information in the question tells us that P(A | B) = 1/3 and P(A | Bᶜ) = 1/2. Thus,

P(A ∩ B) = P(A | B) P(B) = 1/6,

and

P(A) = P(A ∩ B) + P(A ∩ Bᶜ) = 1/6 + 1/4 = 5/12. □

Example 4.5. An urn contains 8 black balls and 4 white balls. Two balls are removed one after the other, any ball being equally likely. What is the probability that the second ball is black, given that the first is black? What is the probability that the first is black, given that the second is black? △

Solution. A = first ball is black, B = second ball is black. We know that P(A) = 8/12, and that P(A ∩ B) = (8 · 7)/(12 · 11). So (as is intuitive) P(B | A) = 7/11. However, maybe somewhat less intuitive is P(A | B). Note that P(B ∩ Aᶜ) = (4 · 8)/(12 · 11). So

P(B) = P(B ∩ A) + P(B ∩ Aᶜ) = (8 · (4 + 7))/(12 · 11) = 8/12.

Thus, P(A | B) = P(A ∩ B)/P(B) = 7/11. □
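The two conditional probabilities can be verified by enumerating all ordered pairs of distinct balls (a sketch; balls are numbered, with the first 8 black):

```python
from fractions import Fraction

colors = ["b"] * 8 + ["w"] * 4                          # 8 black, 4 white
pairs = [(i, j) for i in range(12) for j in range(12) if i != j]

def P(E):
    return Fraction(len(E), len(pairs))

A = {(i, j) for (i, j) in pairs if colors[i] == "b"}    # first ball black
B = {(i, j) for (i, j) in pairs if colors[j] == "b"}    # second ball black

p_B_given_A = P(A & B) / P(A)
p_A_given_B = P(A & B) / P(B)
```

Both conditional probabilities come out exactly 7/11, and P(B) = 8/12 = P(A): by symmetry, the second ball is black exactly as often as the first.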

Exercise 4.6. Prove that for all events A₁, ..., Aₙ such that P(A₁ ∩ ⋯ ∩ Aₙ) > 0,

P(A₁ ∩ ⋯ ∩ Aₙ) = P(Aₙ | A₁ ∩ ⋯ ∩ Aₙ₋₁) · P(Aₙ₋₁ | A₁ ∩ ⋯ ∩ Aₙ₋₂) ⋯ P(A₂ | A₁) · P(A₁)
= P(A₁) · Π_{j=1}^{n−1} P(Aⱼ₊₁ | A₁ ∩ ⋯ ∩ Aⱼ).

Solution. By induction. The base is simple. The induction step follows from

P(A₁ ∩ ⋯ ∩ Aₙ₊₁) = P(Aₙ₊₁ | A₁ ∩ ⋯ ∩ Aₙ) · P(A₁ ∩ ⋯ ∩ Aₙ). □

4.2. Bayes’ Rule

Let (Ω, ℱ, P) be a probability space. Let (Aₙ)ₙ be a partition of Ω; that is, (Aₙ) are mutually disjoint and ⋃ₙ Aₙ = Ω. Assume further that P(Aₙ) > 0 for all n. Let A, B be events of positive probability. Then, since (B ∩ Aₙ)ₙ are mutually disjoint, by additivity we have

P(B) = Σₙ P(B ∩ Aₙ) = Σₙ P(B | Aₙ) P(Aₙ).

This is called the law of total probability.

Figure 3. The law of total probability. In this case Ω = A₁ ⊎ ⋯ ⊎ A₁₂, and

P(B) = P(B | A₁) P(A₁) + P(B | A₂) P(A₂) + P(B | A₅) P(A₅) + P(B | A₆) P(A₆) + P(B | A₇) P(A₇) + P(B | A₉) P(A₉) + P(B | A₁₀) P(A₁₀).

Another observation is Bayes' rule (Thomas Bayes, 1701–1761):

P(B | A) = P(A | B) · P(B)/P(A).

Combining the two, we have that

P(Aₙ | B) = P(B | Aₙ) · P(Aₙ) / Σₘ P(B | Aₘ) P(Aₘ).

Exercise 4.7. A new machine is invented that determines whether a student will become a millionaire in the next ten years or not. Given that a student will become a millionaire in ten years, the machine succeeds in predicting this 95% of the time. Given that the student will not become a millionaire in the next ten years, the machine predicts (wrongly) that he will become a millionaire in 1% of the cases (this is known as a false positive). Only 0.5% of the students will become millionaires in the next ten years. Given that Zuckerberg has been predicted to become a millionaire by the machine, what is the probability that Zuckerberg will actually become one?

Solution. A = {Zuckerberg will become a millionaire in the next ten years}. B = {Zuckerberg is predicted to become a millionaire by the machine}. The information is:

P(A) = 0.005,   P(B | A) = 0.95,   P(B | Ac) = 0.01.

So,

P(A | B) = P(B | A) · P(A) / ( P(B | A) P(A) + P(B | Ac) P(Ac) ) = (0.95 · 0.005) / (0.95 · 0.005 + 0.01 · 0.995)

= 95 / (95 + 199) ≈ 0.323 < 1/3.

Not such a great machine after all... □
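The computation above is easy to automate. A small Python sketch (the function name `posterior` is ours) applying Bayes' rule with the law of total probability in the denominator:

```python
def posterior(prior, p_pos_given_true, p_pos_given_false):
    """Bayes' rule: P(A | B) = P(B | A) P(A) / P(B), with
    P(B) expanded by the law of total probability."""
    evidence = p_pos_given_true * prior + p_pos_given_false * (1 - prior)
    return p_pos_given_true * prior / evidence

p = posterior(prior=0.005, p_pos_given_true=0.95, p_pos_given_false=0.01)
```

Despite the machine's 95% success rate on true positives, the posterior stays below 1/3 because millionaires are so rare.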

Exercise 4.8 (Pólya's Urn) [George Pólya (1887–1985)]. An urn contains one black ball and one white ball. At every time step, a ball is chosen randomly from the urn, and returned together with another new ball of the same color. (So at time t there are a total of t + 2 balls.) Calculate

p(t, b) = P(there are b black balls at time t).

Solution. First let's see how p(t, b) develops when we change t and b by 1. If at time t there are b black balls out of a total of t + 2, then at time t + 1 there are b + 1 black balls with probability b/(t + 2), and b black balls with probability 1 − b/(t + 2). Thus,

p(t + 1, b) = ((b − 1)/(t + 2)) · p(t, b − 1) + (1 − b/(t + 2)) · p(t, b).

Also, the initial condition is p(0, 1) = 1. Let q(t, b) = (t + 1)! · p(t, b). Then q(0, 1) = 1 and for t > 0,

q(t, b) = (b − 1) q(t − 1, b − 1) + (t + 1 − b) q(t − 1, b).

Check that q(t, b) = t! solves this. So p(t, b) = 1/(t + 1). □

Example 4.9. The probability that a family has n children is pn, where Σ_{n=0}^∞ pn = 1. Given that a family has n children, all possibilities for the sexes of these children are equally likely. What is the probability that a family has one child, given that the family has no girls?

Solution. The natural sample space to take here is

Ω = {(s1, s2, . . . , sn) : sj ∈ {boy, girl}, n ≥ 1} ∪ {no children}.

An = {n children}, for n ≥ 0. B = {no girls}. The information in the question is:

for all n ≥ 1,   P({(s1, . . . , sn)} | An) = 2^{−n},

for any combination of sexes s1, . . . , sn ∈ {boy, girl}. Thus, P(B | An) = 2^{−n} for all n ≥ 1. Also, P(B | A0) = 1 = 2^{−0}. The law of total probability gives

P(B) = Σn P(B | An) P(An) = Σ_{n=0}^∞ 2^{−n} pn.

Using Bayes' rule,

P(A1 | B) = P(B | A1) · P(A1) / P(B) = (1/2) · p1 / Σ_{n=0}^∞ 2^{−n} pn.

□

Example 4.10. There are 6 machines M1, . . . , M6 in the basement of the math department. When pressing the red button on machine Mj, the machine outputs a random number in N, with the probability that the number is k being (j/(j + 1)) · (j + 1)^{−k}.

A bored mathematician has access to these machines. She tosses a die, and uses machine Mj if the outcome of the die is j, and writes down the resulting number. What is the probability that the resulting number is greater than 0?

What is the probability that the outcome of the die is 1, given that the resulting number is greater than 3?

Solution. Aj = {the die shows j}, j = 1, . . . , 6. Bk = {the resulting number is greater than k}. Ck = {the resulting number is k}. The (Ck) are mutually disjoint, and Bk = ⊎_{n>k} Cn. The information is:

P(Bk | Aj) = Σ_{n>k} P(Cn | Aj) = Σ_{n>k} (j/(j + 1)) · (j + 1)^{−n} = (j/(j + 1)) · (j + 1)^{−(k+1)} · Σ_{n=0}^∞ (j + 1)^{−n} = (j + 1)^{−(k+1)}.

Also, P(Aj) = 1/6 for all j. By the law of total probability,

P(Bk) = Σ_{j=1}^6 (1/6) · (j + 1)^{−(k+1)} = (1/6) · (2^{−(k+1)} + 3^{−(k+1)} + · · · + 7^{−(k+1)}).

So P(B0) = (1/6) · (1/2 + 1/3 + · · · + 1/7).

Now, by Bayes,

P(A1 | B3) = P(B3 | A1) · P(A1) / P(B3) = 2^{−4} / (2^{−4} + 3^{−4} + · · · + 7^{−4}).

□
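These sums can be verified with exact rational arithmetic. A Python sketch (function names are ours), using the closed form of the geometric tail Σ_{n>k} r^n = r^{k+1}/(1 − r):

```python
from fractions import Fraction

def p_exceeds_given_machine(j, k):
    # P(Bk | Aj) = sum over n > k of j/(j+1) · (j+1)^(-n),
    # computed via the closed-form geometric tail.
    r = Fraction(1, j + 1)
    tail = r ** (k + 1) / (1 - r)
    return Fraction(j, j + 1) * tail

def p_exceeds(k):
    # Law of total probability over the fair die.
    return sum(Fraction(1, 6) * p_exceeds_given_machine(j, k) for j in range(1, 7))

# Bayes: P(A1 | B3).
posterior_1 = p_exceeds_given_machine(1, 3) * Fraction(1, 6) / p_exceeds(3)
```

The sketch reproduces the simplification P(Bk | Aj) = (j + 1)^{−(k+1)} exactly.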

4.3. The Erdős–Ko–Rado Theorem

Consider n elements, say {0, 1, . . . , n − 1} without loss of generality. We want to "collect" subsets of size k ≤ n/2 so that any two of the collected subsets intersect one another. We want to collect as many such subsets as possible.

One strategy is as follows. Suppose w.l.o.g. A = {0, 1, . . . , k − 1} is our first set. Any other set that we collect must intersect it, and we want to give ourselves as much freedom as possible. So let's take all subsets of {0, 1, . . . , n − 1} of size k that contain 0. These all must intersect pairwise.

How many sets did we collect? \binom{n−1}{k−1}, because for every such subset we are actually just choosing k − 1 elements out of {1, . . . , n − 1}.

Is this the best possible? The Erdős–Ko–Rado Theorem tells us that it is.

Theorem 4.11 (Erdős–Ko–Rado). Let n ≥ 2k. If A ⊂ 2^{{0,...,n−1}} is a collection of subsets such that for all A, B ∈ A we have |A| = k and A ∩ B ≠ ∅, then there are at most \binom{n−1}{k−1} sets in A; that is, |A| ≤ \binom{n−1}{k−1}.

The following elementary proof is due to Katona.

Proof. Let A be as in the theorem.

Let us choose a uniform element σ from Sn, the set of all permutations on {0, 1, . . . , n − 1}, and also a uniform index I from {0, 1, . . . , n − 1}; that is, P[σ = s, I = i] = (1/n!) · (1/n). Set A = {σ(I), σ(I + 1), . . . , σ(I + k − 1)}, where addition is modulo n.

For any given set B = {b0, . . . , bk−1} ∈ 2^{{0,...,n−1}} with |B| = k, by the law of total probability,

P[A = B, I = j] = Σ_{π∈Sn} P[σ = π, I = j, {π(j), π(j + 1), . . . , π(j + k − 1)} = B]

= (1/(n! · n)) · #{π ∈ Sn : {π(j), π(j + 1), . . . , π(j + k − 1)} = B}.

This last set is of size k!(n − k)!, because there are k choices for π(j), then k − 1 choices for π(j + 1), etc., until only 1 choice for π(j + k − 1); and then n − k choices for π(j + k), n − k − 1 choices for π(j + k + 1), and so on for all of j + k, . . . , j + n − 1. Thus,

P[A = B] = Σ_{j=0}^{n−1} P[A = B, I = j] = Σ_{j=0}^{n−1} \binom{n}{k}^{−1} · (1/n) = \binom{n}{k}^{−1}.

We have found an alternative description of choosing a subset of {0, . . . , n − 1} of size exactly k uniformly.

Now the second part. For π ∈ Sn and an integer s, consider the subsets X_{π,s} := {π(s), π(s + 1), . . . , π(s + k − 1)}, where as usual addition is modulo n.

Note that if 0 ≤ m ≤ k − 1 then X_{π,m} ∩ X_{π,m+k} = ∅ (here we use the assumption that 2k ≤ n). Indeed, if x ∈ X_{π,m} ∩ X_{π,m+k} then there exist 0 ≤ i, j ≤ k − 1 such that π(m + j) = x = π(m + k + i). Since π is a permutation, this implies that j = k + i modulo n, and this is impossible because 0 ≤ i, j ≤ k − 1 and k ≤ n/2.

Now, for given π, s, the subsets X_{π,m} that intersect X_{π,s} are (X_{π,m} : m = s − k + 1, . . . , s + k − 1). Without the case m = s, these can be written in k − 1 pairs: (X_{π,s−m}, X_{π,s−m+k})_{m=1}^{k−1}. For each pair, since X_{π,s−m} ∩ X_{π,s−m+k} = ∅, only one of the pair can be in the intersecting family A.

We conclude the following (if no j has X_{π,j} ∈ A, the bound below is trivial; otherwise fix s with X_{π,s} ∈ A). For any 0 ≤ j ≤ n − 1, if X_{π,j} ∈ A then X_{π,s} ∩ X_{π,j} ≠ ∅, so s − k + 1 ≤ j ≤ s + k − 1; and X_{π,s−m} ∈ A only if X_{π,s−m+k} ∉ A. Thus,

Σ_{j=0}^{n−1} 1{X_{π,j} ∈ A} ≤ 1{X_{π,s} ∈ A} + Σ_{m=1}^{k−1} ( 1{X_{π,s−m} ∈ A} + 1{X_{π,s−m+k} ∈ A} ) ≤ 1 + (k − 1) = k.

Since all subsets of A must intersect one another, this gives that for any π ∈ Sn,

#{0 ≤ j ≤ n − 1 : X_{π,j} ∈ A} ≤ k.

Now, if σ = π and A ∈ A, then it must be that I = j for some j such that X_{π,j} ∈ A. So, by Boole's inequality,

P[A ∈ A, σ = π] ≤ P[σ = π, I ∈ {0 ≤ j ≤ n − 1 : X_{π,j} ∈ A}] ≤ Σ_{j=0}^{n−1} 1{X_{π,j} ∈ A} · P[σ = π, I = j] ≤ k / (n! · n).

Summing over all π ∈ Sn, by the law of total probability,

P[A ∈ A] ≤ k/n.

We now combine this bound with the fact that A is uniform over all k-subsets of {0, . . . , n − 1}. Thus,

|A| / \binom{n}{k} = P[A ∈ A] ≤ k/n.

This is exactly |A| ≤ \binom{n}{k} · (k/n) = \binom{n−1}{k−1}. □
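For very small parameters the theorem can be checked by exhaustive search. A Python sketch (the brute force is exponential in the number of k-subsets, so it is only feasible for tiny n):

```python
from itertools import combinations

def max_intersecting_family(n, k):
    """Largest family of k-subsets of {0,...,n-1} that pairwise intersect,
    found by exhaustive search over all families (tiny n only)."""
    subsets = [frozenset(c) for c in combinations(range(n), k)]
    best = 0
    for mask in range(1 << len(subsets)):
        fam = [s for i, s in enumerate(subsets) if (mask >> i) & 1]
        if all(a & b for a in fam for b in fam):
            best = max(best, len(fam))
    return best
```

For n = 5, k = 2 the search returns 4 = \binom{4}{1}, matching the Erdős–Ko–Rado bound.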

Introduction to Probability

201.1.8001

Ariel Yadin

Lecture 5

5.1. Independence

Until now, we have only used the measure-theoretic properties of probability measures. The main difference between probability and measure theory is the notion of independence. This is perhaps the most important definition in probability.

Definition 5.1. Let A, B be events in a probability space (Ω, F, P). We say that A and B are independent if P(A ∩ B) = P(A) · P(B).

If (An)n is a collection of events, we say that the (An) are mutually independent if for any finite number of events in the collection, An1, An2, . . . , Ank, we have that

P(An1 ∩ An2 ∩ · · · ∩ Ank) = P(An1) · P(An2) · · · P(Ank).

Example 5.2. A card is chosen randomly out of a deck of 52, all cards equally likely. A is the event that the card is an ace. B is the event that the card is a spade. C is the event that the card is a jack of diamonds or a 2 of hearts. D is the event that the card is an even number. Then A, B are independent, since

P(A ∩ B) = 1/52 = (4/52) · (13/52) = P(A) · P(B).

C, D are not independent, since

P(C ∩ D) = 1/52 ≠ P(C) · P(D) = (2/52) · (20/52).

Example 5.3. Two dice are tossed, all outcomes equally likely. A = {the first die is 4}. B = {the sum of the dice is 6}. C = {the sum of the dice is 7}.

A, B are not independent:

P(A ∩ B) = 1/36 ≠ P(A) · P(B) = (1/6) · (5/36).

However, A, C are independent:

P(A ∩ C) = 1/36 = (1/6) · (6/36) = P(A) · P(C).
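Such independence checks reduce to counting, so they are easy to confirm by enumerating all 36 outcomes (a Python sketch; the helper names are ours):

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))  # two fair dice, uniform measure

def prob(event):
    # P(event) under the uniform measure on omega.
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

A = lambda w: w[0] == 4            # first die is 4
B = lambda w: w[0] + w[1] == 6     # sum is 6
C = lambda w: w[0] + w[1] == 7     # sum is 7
```

The enumeration confirms that A, C are independent while A, B are not.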

Proposition 5.4. Let (Ω, F, P) be a probability space.

(1) Any event A is independent of Ω and of ∅.
(2) If A, B are independent, then Ac, B are independent.
(3) If A, B are independent and P(B) > 0, then P(A | B) = P(A).

Proof. If A, B are independent,

P(B ∩ Ac) = P(B) − P(B ∩ A) = P(B)(1 − P(A)) = P(B) P(Ac).

If also P(B) > 0, then P(A | B) = P(A) P(B) / P(B) = P(A). □

Exercise 5.5. Prove that if A, B, C are mutually independent, then

(1) A and B ∩ C are independent.
(2) A and B \ C are independent.

Solution.

P(A ∩ B ∩ C) = P(A) P(B) P(C) = P(A) P(B ∩ C).

This proves the first assertion. For the second assertion, since B \ C = B ∩ Cc, and since A ∩ B is independent of C, also A ∩ B and Cc are independent. So,

P(A ∩ (B \ C)) = P(A ∩ B ∩ Cc) = P(A ∩ B) P(Cc) = P(A) P(B)(1 − P(C)).

Finally,

P(B \ C) = P(B \ (B ∩ C)) = P(B) − P(B ∩ C) = P(B)(1 − P(C)),

since B, C are independent. □

Exercise 5.6. Show that if B is an event with P[B] = 0 then for any event A, A and B are independent.

Solution. 0 ≤ P[B ∩ A] ≤ P[B] = 0 and P[B] · P[A] = 0. □

Example 5.7. Consider the uniform probability measure on Ω = {0, 1}². Let

A = {(x, y) ∈ Ω : x = 1},   B = {(x, y) ∈ Ω : y = 1},

and

C = {(x, y) ∈ Ω : x ≠ y}.

Any two of these events are independent; indeed,

P(A ∩ B) = P(A ∩ C) = P(B ∩ C) = 1/4,

P(A) = P(B) = P(C) = 1/2

(because C = {(0, 1), (1, 0)}). However, it is not the case that (A, B, C) are mutually independent, since

P(A ∩ B ∩ C) = 0 ≠ 1/8 = P(A) P(B) P(C).
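The distinction between pairwise and mutual independence can be verified over the four outcomes (Python sketch; helper names are ours):

```python
from fractions import Fraction
from itertools import product

omega = list(product([0, 1], repeat=2))  # uniform measure on {0,1}^2

def prob(event):
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

A = lambda w: w[0] == 1
B = lambda w: w[1] == 1
C = lambda w: w[0] != w[1]

# Every pair is independent ...
pairwise_ok = all(
    prob(lambda w, e=e, f=f: e(w) and f(w)) == prob(e) * prob(f)
    for e, f in [(A, B), (A, C), (B, C)]
)
# ... but the triple intersection is empty, so mutual independence fails.
triple = prob(lambda w: A(w) and B(w) and C(w))
```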

5.2. An example using independence: Riemann zeta function and Euler’s formula

Some number theory via probability:

Let 1 < s ∈ R be a parameter. Let [Bernhard Riemann (1826–1866)]

ζ(s) = Σ_{n=1}^∞ n^{−s}.

Let X be a random variable with range {1, 2, . . .} and distribution given by

P(X = n) = (1/ζ(s)) · n^{−s}.

Let Dm = mZ ∩ N, the set of positive integers divisible by m. So

P_X(Dm) = Σn (mn)^{−s} / Σn n^{−s} = m^{−s}.

Now, let p1, p2, . . . , pk be k different prime numbers. Then,

Dp1 ∩ Dp2 ∩ · · · ∩ Dpk = D_{p1 p2 ··· pk}.

So

P_X(Dp1 ∩ Dp2 ∩ · · · ∩ Dpk) = (p1 p2 · · · pk)^{−s} = P_X(Dp1) · P_X(Dp2) · · · P_X(Dpk).

Since this is true for any finite number of primes, we have that the collection (Dp)_{p∈P} is mutually independent, where P is the set of primes.

Since the only positive integer that is not divisible by any prime is 1, we have that {1} = ∩_{p∈P} Dp^c, and so

P(X = 1) = P_X({1}) = ∏_{p∈P} (1 − P_X(Dp)) = ∏_{p∈P} (1 − p^{−s}).

Since also P(X = 1) = (1/ζ(s)) · 1^{−s} = ζ(s)^{−1}, we conclude Euler's formula: for any s > 1,

Σ_{n=1}^∞ n^{−s} = ∏_{p∈P} (1 − p^{−s})^{−1}.

[Leonhard Euler (1707–1783)] We can also let s → 1 from above, and we get

0 = lim_{s→1} ζ(s)^{−1} = ∏_{p∈P} (1 − p^{−1}).

Taking logs,

Σ_{p∈P} log(1 − p^{−1}) = −∞.

Since log(1 − x) ≥ −3x/2 for 0 < x ≤ 1/2, we have that

−∞ ≥ −(3/2) Σ_{p∈P} p^{−1},

and so

Σ_{p∈P} 1/p = ∞.

This is a non-trivial result.
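Euler's formula can also be checked numerically. A Python sketch comparing a truncated Dirichlet series with a truncated Euler product at s = 2 (the truncation points are arbitrary choices of ours):

```python
def zeta_partial(s, n_max):
    # Truncated Dirichlet series: sum of n^{-s} for n <= n_max.
    return sum(n ** -s for n in range(1, n_max + 1))

def primes_up_to(n):
    # Sieve of Eratosthenes.
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, ok in enumerate(sieve) if ok]

def euler_product(s, n_max):
    # Truncated product over primes p <= n_max of (1 - p^{-s})^{-1}.
    prod = 1.0
    for p in primes_up_to(n_max):
        prod *= 1.0 / (1.0 - p ** -s)
    return prod
```

Both truncations approach ζ(2) = π²/6 ≈ 1.6449 as the cutoff grows.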

Introduction to Probability

201.1.8001

Ariel Yadin

Lecture 6

6.1. Uncountable Sample Spaces

Up to now we have dealt with the case of countable sample spaces, but there is no reason to restrict to those. For example, what if we wish to experiment by throwing a dart at a target? The collection of possible outcomes is the collection of points in the target, which is uncountable. The basic theory used to define such spaces is measure theory.

Consider the sample space Ω = [0, 1]. Why not take a probability measure on all subsets of [0, 1]? Well, for example, we would like to consider the experiment of sampling a number from [0, 1] such that it is uniform over the interval; that is, such that for any 0 ≤ a < b ≤ 1 the probability that the number is in (a, b) is b − a.

How can we do this? The following proposition shows that this cannot be done using all subsets of [0, 1] (!).

Proposition 6.1. Let Ω = [0, 1]. Let P be a probability measure on (Ω, F), where F is some σ-algebra on Ω, such that for any x ∈ [0, 1] and A ∈ F, A + x ∈ F and P(A) = P(A + x), where A + x = {a + x (mod 1) : a ∈ A}. Then, there exists a subset S ⊂ Ω such that S ∉ F.

Proof. Define the equivalence relation x ∼ y iff x − y ∈ Q. Choose a representative from each equivalence class (using the axiom of choice!). Let R be the set of these representatives.

Define Rq = R + q = {r + q (mod 1) : r ∈ R}, for all q ∈ Q ∩ [0, 1). For q ≠ p ∈ Q ∩ [0, 1) we have that Rq ∩ Rp = ∅: if x ∈ Rp ∩ Rq then x = r + q = r′ + p (mod 1) for r, r′ ∈ R, and this implies that r ∼ r′ with r ≠ r′, a contradiction. Thus, (Rq)_{q∈Q∩[0,1)} is a countable collection of mutually disjoint sets whose union is [0, 1). If R ∈ F, then all the Rq are events, so

1 = P([0, 1)) = P(∪q Rq) = Σq P(Rq) = Σq P(R) ∈ {0, ∞},

which is impossible. (Here P([0, 1)) = 1 since, by translation invariance, all singletons have equal probability, hence probability 0.) So S = R ∉ F. □

Thus, we have to develop a theory of which sets we allow to be events, so that everything is well defined.

6.2. σ-algebras Revisited

Recall the definition of a σ-algebra:

Definition 6.2. Let Ω be a sample space. A collection of subsets F ⊂ 2^Ω is called a σ-algebra if
• Ω ∈ F.
• If A ∈ F then Ac ∈ F.
• If (An) is a collection of events in F then ∪n An ∈ F.

These were the most basic properties we wanted of events: everything can occur; if something can occur, it can also not occur; and given many possible events, we can ask whether at least one of them has occurred.

Exercise 6.3. Show that F ⊂ 2^Ω is a σ-algebra if and only if it has the following properties.
• ∅ ∈ F.
• If A ∈ F then Ac ∈ F.
• If (An) is a collection of events in F then ∩n An ∈ F.

Solution. Use de Morgan. □

Suppose we have a collection of subsets of Ω. We want the smallest σ-algebra containing these subsets.

Proposition 6.4. Let Ω be a sample space, and let K ⊂ 2^Ω be a collection of subsets. There exists a unique σ-algebra on Ω, denoted by σ(K), such that
• K ⊂ σ(K).
• If F is a σ-algebra such that K ⊂ F, then σ(K) ⊂ F.

So σ(K) is the smallest σ-algebra containing K. Moreover, σ(K) = K if and only if K is itself a σ-algebra.

Proof. Let

Γ(K) = {F : F is a σ-algebra such that K ⊂ F},

and define σ(K) := ∩ Γ(K) (the intersection is over a non-empty family, since 2^Ω ∈ Γ(K)). Of course the second property holds. σ(K) is a σ-algebra: Ω ∈ F for all F ∈ Γ(K); if A ∈ σ(K) then Ac ∈ F for all F ∈ Γ(K), so Ac ∈ σ(K); and if (An) is a collection of events in σ(K), then An ∈ F for all n and all F ∈ Γ(K), thus ∪n An ∈ F for every F ∈ Γ(K), and so ∪n An ∈ σ(K).

As for uniqueness, if G is another σ-algebra with the above properties, then G ⊂ σ(K) and σ(K) ⊂ G, so G = σ(K). □

Definition 6.5. σ(K) as above is called the σ-algebra generated by K.

σ(K) is intuitively all the information carried by the events in K.

Example 6.6. Let Ω be a sample space and A ⊂ Ω. Then σ({A}) = {∅, A, Ac, Ω}.

Definition 6.7. Let (Ω, F, P) be a probability space, and let G ⊂ F be a sub-σ-algebra of F. We say that an event A ∈ F is independent of G if for any B ∈ G, we have that A, B are independent.

Let (Bn)n be a collection of events. We say that A is independent of (Bn)n if A is independent of the σ-algebra σ((Bn)n).

Example 6.8. If A, B are independent, then we have already seen that A, Bc are also independent. Since A, ∅ are always independent, and also A, Ω are always independent, we have that A is independent of σ(B) = {∅, B, Bc, Ω}.

6.3. Borel σ-algebra

Let E be a topological space. We can consider the σ-algebra generated by all open sets. This is called the Borel σ-algebra on E, and an event in this σ-algebra is called a Borel set.

A special case is E = R^d, and more specifically E = R. In this case we denote the Borel σ-algebra by B = B(R), and it can be shown that B = σ((a, b) : a < b ∈ R); that is, B is generated by all open intervals.

Exercise 6.9. Show that

B(R) = σ((a, b] : a < b ∈ R) = σ([a, b) : a < b ∈ R) = σ([a, b] : a < b ∈ R).

Solution. Let a < b ∈ R. Since (a, b) = ∪n (a, b − 1/n], we get that (a, b) ∈ σ((a, b] : a < b ∈ R). Since this holds for all a < b ∈ R, we have that σ((a, b) : a < b ∈ R) ⊂ σ((a, b] : a < b ∈ R).

The other inclusions are similar, so all the mentioned σ-algebras are equal. □

6.4. Lebesgue Measure

The following basic theorem, due to Lebesgue, is shown in measure theory.

Theorem 6.10 (Lebesgue) [Henri Lebesgue (1875–1941)]. Let Ω = [0, 1], and let B be the Borel σ-algebra on Ω. Let F : [0, 1] → [0, 1] be a right-continuous, non-decreasing function such that F(1) = 1 and F(0) = 0. Then, there exists a unique probability measure, denoted P = dF, on (Ω, B) such that for any 0 ≤ a < b ≤ 1, P((a, b]) = F(b) − F(a).

We will not go into the proof of this theorem, but it gives us many new probability measures on non-discrete sample spaces.

Example 6.11. Take F(x) = x in the above theorem. The resulting probability measure, sometimes denoted P = dx, has the property that for any interval 0 ≤ a < b ≤ 1, P((a, b]) = b − a. One can also check that this measure is translation invariant; i.e. for any x ∈ [0, 1],

P(x + (a, b] (mod 1)) = P({x + y (mod 1) : a < y ≤ b}) = b − a.

This is also sometimes called the uniform measure on [0, 1].

Exercise 6.12. Show that for the uniform measure on [0, 1], for any a ∈ [0, 1], dx({a}) = 0.

Deduce that for all 0 ≤ a ≤ b ≤ 1,

dx((a, b)) = dx([a, b]) = dx([a, b)) = dx((a, b]) = b − a.

Solution. Let a ∈ [0, 1]. For all n, let An = (a − 1/n, a]. So P(An) ≤ 1/n. Since (An) is a decreasing sequence of events, and since {a} = ∩n An, we have by the continuity of probability measures

P({a}) = lim_n P(An) = 0.

We can now note that

P([a, b]) = P((a, b) ⊎ {a} ⊎ {b}) = P((a, b)) + P({a}) + P({b}) = P((a, b)).

Similarly for the other types of intervals. □

Example 6.13. Let α < β ∈ R. Let Ω = [α, β]. For a set A ∈ B([0, 1]) let

(β − α)A + α = {(β − α)a + α : a ∈ A}.

Define F = {(β − α)A + α : A ∈ B([0, 1])}. We have that (β − α)[0, 1] + α = Ω, ((β − α)A + α)c = (β − α)Ac + α, and

∪n ((β − α)An + α) = (β − α)(∪n An) + α.

So F is a σ-algebra on Ω. Also, if dx is the uniform measure on [0, 1], then

P((β − α)A + α) := dx(A)

defines a probability measure on (Ω, F), because A ∩ B = ∅ if and only if ((β − α)A + α) ∩ ((β − α)B + α) = ∅.

In this case, note that for any α ≤ a < b ≤ β,

P((a, b]) = P((β − α) · ((a − α)/(β − α), (b − α)/(β − α)] + α) = dx(((a − α)/(β − α), (b − α)/(β − α)]) = (b − a)/(β − α).

Example 6.14. Let Ω be a sample space, and let f : Ω → [0, 1] be a 1-1 function. Let F = {f^{−1}(B) : B ∈ B([0, 1])}. This is a σ-algebra on Ω, since f^{−1}([0, 1]) = Ω and

∪n f^{−1}(Bn) = f^{−1}(∪n Bn).

For A ∈ F, since B = f(f^{−1}(B)), we have that f(A) ∈ B, and we can define P(A) = dx(f(A)). This results in a probability measure on (Ω, F), since P(Ω) = 1 and if (An) are mutually disjoint events then so are (f(An)), and

P(∪n An) = dx(f(∪n An)) = dx(∪n f(An)) = Σn dx(f(An)) = Σn P(An).

Example 6.15. A dart is thrown at a target of radius 1 meter. Points are awarded according to the distance to the center of the target. The probability of being between distance a and distance b is b² − a².

What is the probability of hitting the inner circle of radius 1/2? What about radius 1/4?

Solution. Let F : [0, 1] → [0, 1] be F(x) = x². Then dF is a probability measure on ([0, 1], B) such that dF((a, b)) = F(b) − F(a) = b² − a². Thus, we can model the dart-throwing experiment by the probability space ([0, 1], B, dF), where the outcome ω ∈ [0, 1] is the distance to the center.

Now, the probability of hitting the inner circle of radius 1/2 is dF([0, 1/2]) = (1/2)² = 1/4. Similarly, the probability of hitting the circle of radius 1/4 is 1/16. □
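Measures of the form dF can be sampled by inverse-transform sampling: if U is uniform on [0, 1], then F^{−1}(U) has distribution dF. For the dart example, F(x) = x², so F^{−1}(u) = √u. A Monte Carlo sketch (the sample size and seed are arbitrary choices of ours):

```python
import random

def sample_dart_distance(rng):
    # Inverse-transform sampling: F(x) = x^2 on [0,1], so F^{-1}(u) = sqrt(u).
    return rng.random() ** 0.5

rng = random.Random(0)
n = 200_000
hits_half = sum(sample_dart_distance(rng) <= 0.5 for _ in range(n)) / n
```

The empirical frequency should be close to dF([0, 1/2]) = 1/4, not to 1/2: distances near the rim are more likely.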

Introduction to Probability

201.1.8001

Ariel Yadin

Lecture 7

7.1. Random Variables

Up until now we dealt with abstract spaces (such as dice, cards, etc.). In order to say things about objects, we prefer to map them to numbers (or vectors) so that we can perform calculations. For example, maybe some biological experiment is being carried out in the lab, but in the end the biologist may only be interested in the number of bacteria in the dish, or other such measurements. Measurements of experiments are carried out via random variables.

Suppose we have an experiment described by a probability space (Ω, F, P). A measurement is just a function taking outcomes to numbers; i.e. a function X : Ω → R. The natural way to define a probability space on R would now be to take the natural σ-algebra on R, i.e. the Borel σ-algebra B, and then define the measure of a Borel set B to be the probability that the outcome ω is mapped into B:

∀ B ∈ B,   P_X(B) := P({ω : X(ω) ∈ B}) = P(X^{−1}(B)).

But wait! What if the set X^{−1}(B) is not in F? Then P is not defined on that set. This is a technical detail, but one that needs to be addressed; otherwise we will run into paradoxes, as in the case of the non-measurable sets on [0, 1].

Definition 7.1. Let (Ω, F), (Ω′, F′) be measurable spaces. A function X : Ω → Ω′ is called a measurable map from (Ω, F) to (Ω′, F′) if for all A′ ∈ F′, X^{−1}(A′) ∈ F.

We usually denote this by X : (Ω, F) → (Ω′, F′) to stress the dependence on F and F′.

Exercise 7.2. Show that if X : (Ω, F) → (Ω′, F′) and Y : (Ω′, F′) → (Ω″, F″) are measurable functions, then Y ∘ X : Ω → Ω″ is measurable with respect to F, F″.

To determine whether a function X : (Ω, F) → (Ω′, F′) is measurable, it is enough to check on a family of sets generating F′:

Proposition 7.3. Let (Ω, F), (Ω′, F′) be two measurable spaces, such that F′ = σ(Π′) for some Π′ ⊂ 2^{Ω′}. Then X : Ω → Ω′ is measurable if and only if for any A′ ∈ Π′, X^{−1}(A′) ∈ F.

Proof. The "only if" part is clear. For the "if" part, consider

G′ = {A′ ⊂ Ω′ : X^{−1}(A′) ∈ F}.

We are given that Π′ ⊂ G′. If we show that G′ is a σ-algebra, then F′ = σ(Π′) ⊂ G′, so for any A′ ∈ F′ we have that X^{−1}(A′) ∈ F, and X is measurable.

So we only need to show that G′ is a σ-algebra. Indeed, since X^{−1}(∅) = ∅ ∈ F, we have that ∅ ∈ G′.

If A′ ∈ G′ then X^{−1}(A′) ∈ F, so also X^{−1}(Ω′ \ A′) = Ω \ X^{−1}(A′) ∈ F, and thus Ω′ \ A′ ∈ G′. So G′ is closed under complements.

If (A′n)n is a sequence in G′, then for every n we have that X^{−1}(A′n) ∈ F, and so X^{−1}(∪n A′n) = ∪n X^{−1}(A′n) ∈ F. Thus ∪n A′n ∈ G′, and G′ is closed under countable unions. □

Corollary 7.4. If Ω, Ω′ are topological spaces, and B(Ω), B(Ω′) are the Borel σ-algebras, then any continuous map f : Ω → Ω′ is measurable.

Proof. f is continuous implies that for any open set O′ ⊂ Ω′, f^{−1}(O′) is an open set in Ω. Thus, since B(Ω′) = σ(O′), where O′ is the collection of open sets in Ω′, we have that for any O′ ∈ O′, f^{−1}(O′) ∈ B(Ω). So f is measurable. □

Definition 7.5. A (real-valued) random variable on a probability space (Ω, F, P) is a measurable function X : (Ω, F) → (R, B).

The function P_X(B) = P(X^{−1}(B)) is a probability measure on (R, B), and is called the distribution of X.

Example 7.6. Let (Ω, F, P) be a probability space. Let A ∈ F be an event. A is not a random variable, but there is a natural random variable connected to A: define the function

I_A(ω) = 1 if ω ∈ A,  and I_A(ω) = 0 if ω ∉ A.

Note that for any B ∈ B: if 1 ∉ B and 0 ∉ B then I_A^{−1}(B) = ∅; if 1 ∈ B and 0 ∉ B then I_A^{−1}(B) = A; if 1 ∉ B and 0 ∈ B then I_A^{−1}(B) = Ac; if 1 ∈ B and 0 ∈ B then I_A^{−1}(B) = Ω. So I_A is measurable.

I_A is known as the indicator of the event A. Indicators are always measurable, and they are the simplest kind of random variables, taking only the two values 0, 1. We denote the indicator of an event A by 1_A.

Exercise 7.7. Let (Ω, F, P) be a probability space. Prove that if X : (Ω, F) → R is a measurable function, then indeed P_X(B) = P(X^{−1}(B)) is a probability measure on (R, B).

Proof. P_X(R) = P(X^{−1}(R)) = P(Ω) = 1. If (Bn) is a sequence of mutually disjoint events, then the (X^{−1}(Bn)) are also mutually disjoint, and

P_X(∪n Bn) = P(∪n X^{−1}(Bn)) = Σn P_X(Bn). □

Notation. To simplify notation, we will omit the ω's; e.g. instead of writing P({ω : X(ω) ∈ A}) we write P(X ∈ A).

Example 7.8. Three balls are removed from an urn containing 20 balls numbered 1 to 20. All possibilities are equally likely. What is the probability that at least one has a number 17 or higher?

The sample space here is Ω = {S ⊂ {1, 2, . . . , 20} : |S| = 3}, and the measure is the uniform measure on this set. We define the random variable X : Ω → R by X({i, j, k}) = max{i, j, k}.

Why is X a measurable function? For any Borel set B ∈ B, we have that X^{−1}(B) is a subset of Ω, and the σ-algebra here is 2^Ω.

What is the probability that there exists a ball numbered 17 or higher? This is

P({ω ∈ Ω : X(ω) ≥ 17}) = P(X ≥ 17) = 1 − P(X < 17).

Since the number of S ∈ Ω with max S < 17 is \binom{16}{3}, we have that P(X < 17) = (16 · 15 · 14)/(20 · 19 · 18) = (4 · 7)/(3 · 19).
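The count can be confirmed by enumerating all \binom{20}{3} = 1140 outcomes (Python sketch):

```python
from fractions import Fraction
from itertools import combinations

outcomes = list(combinations(range(1, 21), 3))  # all 3-subsets of {1,...,20}
p_max_ge_17 = Fraction(sum(1 for s in outcomes if max(s) >= 17), len(outcomes))
```

The enumeration gives 1 − (4 · 7)/(3 · 19) = 29/57 exactly.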

Example 7.9. Two dice are thrown. So Ω = {(i, j) : 1 ≤ i, j ≤ 6}, and P is the uniform measure on this set. Define the random variable X(i, j) = i + j. Note that 2 ≤ X(i, j) ≤ 12 for any i, j, so

P(X ∈ [2, 12] ∩ N) = 1,

and X is a discrete random variable. What is the probability that the sum is 7?

P(X = 7) = 6/36 = 1/6.

7.2. Distribution Function

Proposition 7.10. Let (Ω, F) be a measurable space, and let X : Ω → R. Then X is measurable if and only if

{X ≤ r} = X^{−1}((−∞, r]) ∈ F   ∀ r ∈ R.

Proof. The "only if" part is clear.

For the "if" part: since {X > r} = {X ≤ r}c ∈ F, we have that for any a < b ∈ R,

X^{−1}((a, b]) = X^{−1}((−∞, b] \ (−∞, a]) = {a < X ≤ b} = {X ≤ b} ∩ {X > a} ∈ F.

Let G = {B ∈ B : X^{−1}(B) ∈ F}.

Since X^{−1}(R) ∈ F, X^{−1}(Bc) = (X^{−1}(B))c, and X^{−1}(∪n Bn) = ∪n X^{−1}(Bn), we get that G is a σ-algebra.

From the above, (a, b] ∈ G for all a < b ∈ R. Thus, B = σ((a, b] : a < b ∈ R) ⊂ G. So for any B ∈ B, we have that B ∈ G, and so X^{−1}(B) ∈ F. □

Definition 7.11. Let (Ω, , P) be a probability space, and let X be a random variable. F The (cumulative) distribution function of X is the function

F_X : R → [0, 1],   F_X(t) = P(X ≤ t).

Example 7.12. Let X be the random variable that is the sum of two dice. What is F_X?

F_X(t) = 0 for t < 2, and F_X(t) = 1 for t ≥ 12. F_X(t) = 1/36 for t ∈ [2, 3), and F_X(t) = 3/36 for t ∈ [3, 4). In general, for 2 ≤ k ≤ 7 we have P(X = k) = (k − 1)/36, so F_X(t) = Σ_{m=2}^{k} (m − 1)/36 = k(k − 1)/72 for t ∈ [k, k + 1) and k ≤ 7.
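The distribution function of the sum of two dice can be computed by enumeration (Python sketch; the function name is ours):

```python
from fractions import Fraction
from itertools import product

def cdf_two_dice(t):
    # F_X(t) = P(X <= t) for X = sum of two fair dice, by counting outcomes.
    outcomes = list(product(range(1, 7), repeat=2))
    return Fraction(sum(1 for i, j in outcomes if i + j <= t), len(outcomes))
```

Note that F_X is a step function: constant on each interval [k, k + 1) and jumping by P(X = k) at each integer k.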

Proposition 7.13. Let X be a random variable, and FX its distribution function. Then,

• F_X is non-decreasing and right-continuous.
• F_X(t) tends to 0 as t → −∞.
• F_X(t) tends to 1 as t → ∞.

Proof. Since for a ≤ b we have that (−∞, a] ⊂ (−∞, b], we get

F_X(a) = P_X((−∞, a]) ≤ P_X((−∞, b]) = F_X(b).

For right-continuity, let hn → 0 monotonely from the right, so ((−∞, a + hn])n is a decreasing sequence and

F_X(a) = P_X(∩n (−∞, a + hn]) = lim_n P_X((−∞, a + hn]) = lim_n F_X(a + hn).

The limits at ∞ and −∞ are treated similarly. If an → ∞ then lim_n (−∞, an] = (−∞, ∞), so by the continuity of probability measures, P_X((−∞, an]) → P_X(R) = 1. Similarly for an → −∞. □

Exercise 7.14. Give an example of a random variable X whose distribution function

FX is not left-continuous.

Solution. Let (Ω, F, P) be a probability space, and let A ∈ F. Let X = 1_A. What is F_X?

F_X(t) = 0 for t < 0;  F_X(t) = P(Ac) = 1 − P(A) for 0 ≤ t < 1;  F_X(t) = 1 for t ≥ 1.

So if 0 < P(A) < 1 then

lim_{s→0−} F_X(s) = 0 and F_X(0) = 1 − P(A) > 0,

lim_{s→1−} F_X(s) = 1 − P(A) and F_X(1) = 1 > 1 − P(A).

So both 0 and 1 are points where F_X is not left-continuous. □

Of course, one sees that the right-continuity stems from the fact that we define F_X(t) = P(X ≤ t). We could have defined P(X ≥ t), which would have been left-continuous. This is not important; however, we stick to the classical distribution function.

Exercise 7.15. Let X be a random variable with distribution function F_X. Show that for any a < b ∈ R,
• P(a < X ≤ b) = F_X(b) − F_X(a).
• P(a < X < b) = F_X(b−) − F_X(a).
• P(X = x) = F_X(x) − F_X(x−). Deduce that F_X is continuous at x if and only if P(X = x) = 0.

Solution.
• Since (a, b] = (−∞, b] \ (−∞, a], and (−∞, a] ⊂ (−∞, b],

F_X(b) − F_X(a) = P_X((−∞, b]) − P_X((−∞, a]) = P_X((a, b]) = P(a < X ≤ b).

• The events (a, b − 1/n] satisfy lim_n (a, b − 1/n] = (a, b), and by continuity of probability,

P(a < X < b) = P_X(lim_n (a, b − 1/n]) = lim_n P_X((a, b − 1/n]) = F_X(b−) − F_X(a).

• Similarly, {x} = lim_n (x − 1/n, x], so

P(X = x) = P_X({x}) = F_X(x) − lim_n F_X(x − 1/n) = F_X(x) − F_X(x−).

So F_X is left-continuous at x if and only if F_X(x−) = F_X(x), which is if and only if P(X = x) = 0; since F_X is always right-continuous, this is exactly continuity at x. □

Since segments of the form (a, b], (a, b), (−∞, a] and (−∞, a) are enough to generate all sets in B(R), it can be shown that the distribution function uniquely determines the probability measure of all Borel sets; thus it determines the distribution of X.

In the same way as Lebesgue measure is shown to exist, it is shown in measure theory that there is a 1-1 correspondence between distribution functions and probability measures on (R, B(R)). That is:

Theorem 7.16 (Lebesgue–Stieltjes) [Thomas Stieltjes (1856–1894)]. Every function F : R → [0, 1] that is right-continuous everywhere, non-decreasing, and satisfies lim_{t→∞} F(t) = 1 and lim_{t→−∞} F(t) = 0 gives rise to a unique probability measure P_F on (R, B(R)) satisfying P_F((a, b]) = F(b) − F(a) for all a < b ∈ R. (That is, for such F there is always a random variable X such that F = F_X.)

Conversely, we have already seen that if X is a random variable, then P_X((a, b]) = F_X(b) − F_X(a) for all a < b ∈ R, and F_X has the properties mentioned above.

Introduction to Probability

201.1.8001

Ariel Yadin

Lecture 8

8.1. Discrete Distributions

Definition 8.1. A random variable X is called discrete if there exists a countable set R ⊂ R such that P(X ∈ R) = 1.

If X is such a random variable, and we specify P(X = r) for all r ∈ R, then the distribution of X can be calculated by

F_X(t) = P(X ≤ t) = Σ_{R ∋ r ≤ t} P(X = r).

The set {r ∈ R : P(X = r) > 0} is sometimes called the range of X, and the function f_X(x) = P(X = x) is called the density of X (sometimes: probability density function). Note that

F_X(t) = Σ_{R ∋ x ≤ t} f_X(x).

8.1.1. Bernoulli Distribution. Let 0 < p < 1. X has the Bernoulli-p distribution [Jacob Bernoulli (1654–1705)], denoted X ∼ Ber(p), if

F_X(t) = 0 for t < 0;  F_X(t) = 1 − p for 0 ≤ t < 1;  F_X(t) = 1 for 1 ≤ t.

That is, P(X ∈ {0, 1}) = 1, and P(X = 0) = 1 − p, P(X = 1) = p.

The Bernoulli distribution models a toss of a biased coin: probability p to fall heads (or 1) and 1 − p to fall tails (or 0).

8.1.2. Binomial Distribution. Suppose we have a biased coin with probability p to fall heads (or 1). What if we toss that coin n times in a row, all tosses mutually independent? The sample space would be Ω = {0, 1}^n. Because of independence, the measure would be

P({ω}) = p^{number of ones in ω} · (1 − p)^{number of zeros in ω}.

So if X : Ω → R is the number of ones, then since for 0 ≤ k ≤ n there are \binom{n}{k} different ω ∈ Ω with exactly k ones,

f_X(k) = P(X = k) = \binom{n}{k} p^k (1 − p)^{n−k},   0 ≤ k ≤ n.

Such an X is said to have the Binomial-(n, p) distribution, denoted X ∼ Bin(n, p).

What is F_X in this case?

F_X(t) = Σ_{k ≤ t, k = 0, 1, ..., n} \binom{n}{k} p^k (1 − p)^{n−k}.

So F_X(t) = 0 for t < 0 and F_X(t) = 1 for t ≥ n.

Example 8.2. A mathematician sits at the bar. For the next hour, every 5 minutes he orders a beer with probability 2/3 and drinks it. With the remaining probability of 1/3 he does nothing for those 5 minutes. All orders of beer are mutually independent. What is the probability that he drinks no more than one beer, so he can still drive home (legally)?

Solution. If X is the number of beers drunk, then X Bin(12, 2/3), since every beer is ∼ drunk with probability 2/3 independently, in 12 = 60/5 trials. So

12 0 12 12 1 11 P(X 1) = P(X = 0) + P(X = 1) = (2/3) (1/3) + (2/3) (1/3) ≤ 0 1     25 1 = (1/3)12 + 12 2 (1/3)12 = < . · · 312 20, 000

ut 53
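The arithmetic in this solution is easy to check mechanically. The following is a small sketch (not part of the original notes) using only Python's standard library; `binom_pmf` is a helper name chosen here, and exact rational arithmetic avoids floating-point error.

```python
from fractions import Fraction
from math import comb

def binom_pmf(n, k, p):
    """P(X = k) for X ~ Bin(n, p), computed exactly with fractions."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p = Fraction(2, 3)
prob = binom_pmf(12, 0, p) + binom_pmf(12, 1, p)  # P(X <= 1)
print(prob)                        # 25/531441, i.e. 25 / 3**12
print(prob < Fraction(1, 20000))   # the bound claimed in the example
```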

8.1.3. Geometric Distribution.

Example 8.3. A student is watching television. Every minute she tosses a biased coin: if it comes up heads she goes to study for the probability exam; if tails she continues watching TV. All coin tosses are mutually independent. What is the probability she will continue watching TV forever? What is the probability she watches for exactly $k$ minutes and then starts studying?

The sample space here is $\Omega = \mathbb{N} \cup \{\infty\}$. Let $T$ be the number of minutes it takes to go study. The event $T = k$ is the event that the first $k-1$ coins are tails and the $k$-th coin is heads, where all coins are independent. So
\[
\mathbb{P}(T = k) = (1-p)^{k-1} p.
\]
Note that
\[
\mathbb{P}(T = \infty) = 1 - \mathbb{P}(T < \infty) = 1 - \sum_{k=1}^{\infty} \mathbb{P}(T = k) = 1 - p \sum_{k=1}^{\infty} (1-p)^{k-1} = 0.
\]
So the range of $T$ is $\mathbb{N}$. ⋄

The distribution of $T$ above is called the Geometric-$p$ distribution; that is, $X$ has the Geometric-$p$ distribution, denoted $X \sim \mathrm{Geo}(p)$, if
\[
f_X(k) = \mathbb{P}(X = k) = \begin{cases} (1-p)^{k-1} p & k \in \{1, 2, \dots\} \\ 0 & \text{otherwise.} \end{cases}
\]
What is $F_X$?
\[
F_X(t) = \sum_{k \in \mathbb{N} \,:\, 1 \le k \le t} (1-p)^{k-1} p = 1 - (1-p)^{\lfloor t \rfloor}.
\]
So the Geometric distribution is the number of trials of independent identical experiments until the first success.

Example 8.4. An urn contains $b$ black balls and $w$ white balls. We randomly remove a ball, all balls equally likely, and then return that ball. Let $T$ be the number of trials until we see a black ball. What is the distribution of $T$?

Well, every time we try there is independently a probability of $p := b/(b+w)$ of removing a black ball. So the distribution of $T$ is Geometric-$p$. ⋄

Proposition 8.5 (Memoryless property for the geometric distribution). Let $X$ be a Geometric-$p$ random variable. For any $k > m \ge 1$,
\[
\mathbb{P}(X = k \mid X > m) = \mathbb{P}(X = k - m).
\]

Proof. By definition,
\[
\mathbb{P}(X = k \mid X > m) = \frac{\mathbb{P}(X = k, X > m)}{\mathbb{P}(X > m)} = \frac{(1-p)^{k-1} p}{(1-p)^m} = (1-p)^{k-m-1} p = \mathbb{P}(X = k - m).
\]
∎

That is, if we are given that we have not succeeded up to time $m$, the probability of succeeding exactly $k - m$ steps later is the same as the unconditional probability of succeeding at step $k - m$.
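Proposition 8.5 can also be checked mechanically for small cases. This is an illustrative sketch (the helper names are ours, not the notes'), using exact rational arithmetic:

```python
from fractions import Fraction

def geo_pmf(k, p):
    """P(X = k) for X ~ Geo(p): first success on trial k."""
    return (1 - p) ** (k - 1) * p

def geo_tail(m, p):
    """P(X > m) = (1 - p)^m."""
    return (1 - p) ** m

p = Fraction(1, 3)
for m in range(1, 6):
    for k in range(m + 1, 12):
        # P(X = k | X > m) should equal P(X = k - m)
        assert geo_pmf(k, p) / geo_tail(m, p) == geo_pmf(k - m, p)
print("memoryless identity holds for all tested k > m")
```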

8.1.4. Poisson Distribution. A random variable $X$ has the Poisson-$\lambda$ distribution ($0 < \lambda \in \mathbb{R}$) if its range is $\mathbb{N}$ and
\[
f_X(k) = \mathbb{P}(X = k) = e^{-\lambda} \cdot \frac{\lambda^k}{k!}.
\]
Thus,
\[
F_X(t) = \sum_{k=0}^{\lfloor t \rfloor} e^{-\lambda} \cdot \frac{\lambda^k}{k!}.
\]

[Siméon Poisson (1781–1840)] The importance of the Poisson distribution is that it seems to occur naturally in many practical situations (the number of phone calls arriving at a call center per minute, the number of goals per game in football, the number of mutations in a given stretch of DNA after a certain amount of radiation, the number of requests sent to a printer from a network...).

The reason this may happen is that the Poisson distribution is the limit of Binomial distributions when the number of trials goes to infinity: suppose $X_n \sim \mathrm{Bin}(n, \lambda/n)$. Then,
\[
\mathbb{P}(X_n = k) = \binom{n}{k} \left( \frac{\lambda}{n} \right)^k \left( 1 - \frac{\lambda}{n} \right)^{n-k}
= \frac{\lambda^k}{k!} \cdot \left( 1 - \tfrac{1}{n} \right) \left( 1 - \tfrac{2}{n} \right) \cdots \left( 1 - \tfrac{k-1}{n} \right) \cdot \left( 1 - \frac{\lambda}{n} \right)^{n-k}
\longrightarrow \frac{\lambda^k}{k!} e^{-\lambda}.
\]

So the Poisson distribution is like doing infinitely many trials of an experiment that has a small probability of success, and asking how many successes are observed.
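The convergence of $\mathrm{Bin}(n, \lambda/n)$ to Poisson-$\lambda$ can be seen numerically. A sketch (the parameter values $\lambda = 3$, $k = 4$ are arbitrary choices for illustration):

```python
from math import comb, exp, factorial

def binom_pmf(n, k, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam, k = 3.0, 4
for n in (10, 100, 10000):
    print(n, binom_pmf(n, k, lam / n))  # approaches the Poisson value
print("poisson limit:", poisson_pmf(k, lam))
```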

Example 8.6. Suppose the number of people at a checkout counter at the supermarket has Poisson distribution: checkout A has Poisson-5 and checkout B has Poisson-2, and the two counters are independent. Noga arrives at the counters and chooses the shorter line (choosing A if they are tied). What is the probability she chooses B? What is the probability both checkouts are empty?

Solution. Let $X$ be the number of people at A, and $Y$ the number at B. Noga chooses B exactly when $Y < X$, so we want the probability of $Y < X$. The information we have is that
\[
\mathbb{P}(Y = k, X = m) = \mathbb{P}(Y = k) \, \mathbb{P}(X = m) \quad \text{for all } k, m \in \mathbb{N}.
\]
So
\[
\mathbb{P}(Y < X) = \sum_{m=0}^{\infty} \mathbb{P}(X = m) \sum_{k < m} \mathbb{P}(Y = k)
= \sum_{m=0}^{\infty} e^{-5} \frac{5^m}{m!} \sum_{k=0}^{m-1} e^{-2} \frac{2^k}{k!}.
\]
Also,
\[
\mathbb{P}(X = 0, Y = 0) = \mathbb{P}(X = 0) \, \mathbb{P}(Y = 0) = e^{-5} e^{-2} = e^{-7}.
\]
∎
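The double sum above has no simple closed form, but it is easy to evaluate numerically. A sketch (the truncation point is an arbitrary choice; Poisson(5) mass beyond it is negligible):

```python
from math import exp, factorial

def pois_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

M = 80  # truncation point for the infinite sums
p_Y_less = sum(pois_pmf(m, 5) * sum(pois_pmf(k, 2) for k in range(m)) for m in range(M))
p_equal  = sum(pois_pmf(m, 5) * pois_pmf(m, 2) for m in range(M))
p_Y_more = sum(pois_pmf(m, 2) * sum(pois_pmf(k, 5) for k in range(m)) for m in range(M))

print("P(choose B) = P(Y < X) ≈", round(p_Y_less, 4))
print("P(both empty) = e^-7 ≈", exp(-7))
```

As a sanity check, the three cases $Y < X$, $Y = X$, $Y > X$ should account for all the probability.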

8.1.5. Hypergeometric. Suppose we have an urn with $N \ge 1$ balls, of which $0 \le m \le N$ are black and the rest are white. If we choose $1 \le n \le N$ balls from the urn randomly and uniformly, what is the number of black balls we get? The space here is the uniform measure on
\[
\Omega = \left\{ \omega \subset \{1, \dots, N\} : |\omega| = n \right\} = \binom{\{1, \dots, N\}}{n}.
\]
We ask for the random variable $X(\omega) := |\omega \cap \{1, \dots, m\}|$.

A simple calculation reveals that for any $0 \le k \le \min\{n, m\}$,
\[
\mathbb{P}[X = k] = \frac{\binom{m}{k} \binom{N-m}{n-k}}{\binom{N}{n}}.
\]
(We interpret $\binom{a}{b}$ for $b > a$ as 0, so actually $\max\{0, m+n-N\} \le k \le \min\{m, n\}$.) Such a random variable $X$ is said to have the hypergeometric distribution with parameters $N, m, n$, and we write $X \sim H(N, m, n)$.

Example 8.7. Chagai chooses 5 cards from a deck, all possibilities equally likely. How much more likely is it for him to choose 3 aces than to choose 4 aces? How much more likely is it to choose 4 diamonds than to choose 5 diamonds?

Let $X$ = the number of aces chosen. The size of the deck is $N = 52$, the number of cards chosen is $n = 5$, and the number of "special" cards is $m = 4$. So
\[
\frac{\mathbb{P}[X = 3]}{\mathbb{P}[X = 4]} = \frac{\binom{m}{3} \binom{N-m}{n-3}}{\binom{m}{4} \binom{N-m}{n-4}}
= \frac{4 \cdot (N - m - n + 4)}{(m-3) \cdot (n-3)} = \frac{4 \cdot 47}{1 \cdot 2} = 94.
\]
For $Y$ = the number of diamonds chosen, we get $N = 52$, $n = 5$, $m = 13$. So
\[
\frac{\mathbb{P}[Y = 4]}{\mathbb{P}[Y = 5]} = \frac{\binom{m}{4} \binom{N-m}{n-4}}{\binom{m}{5} \binom{N-m}{n-5}}
= \frac{5 \cdot (N - m - n + 5)}{(m-4) \cdot (n-4)} = \frac{5 \cdot 39}{9 \cdot 1} = \frac{5 \cdot 13}{3} = 21\tfrac{2}{3}.
\]
⋄

8.1.6. Negative Binomial. We say that $X \sim \mathrm{NB}(m, p)$, or $X$ has the negative binomial distribution (or Pascal distribution), if $X$ is the number of trials until the $m$-th success, each trial succeeding independently with probability $p$.

[Blaise Pascal (1623–1662)] The event $X = k$ is the event that there are $m-1$ successes in the first $k-1$ trials, and then a success in the $k$-th trial. So
\[
\mathbb{P}[X = k] = \binom{k-1}{m-1} p^m (1-p)^{(k-1)-(m-1)} = \binom{k-1}{m-1} p^m (1-p)^{k-m}, \qquad k \ge m.
\]

Example 8.8. Let $X \sim \mathrm{NB}(1, p)$. Then $X \sim \mathrm{Geo}(p)$. Indeed, for any $k \ge 1$,
\[
\mathbb{P}[X = k] = \binom{k-1}{0} p (1-p)^{k-1} = (1-p)^{k-1} p.
\]
⋄
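Both ratios in Example 8.7 can be verified directly from the hypergeometric pmf. A sketch with exact fractions (`hyper_pmf` is our helper name):

```python
from fractions import Fraction
from math import comb

def hyper_pmf(N, m, n, k):
    """P(X = k) for X ~ H(N, m, n)."""
    return Fraction(comb(m, k) * comb(N - m, n - k), comb(N, n))

# aces: N = 52, m = 4, n = 5
r_aces = hyper_pmf(52, 4, 5, 3) / hyper_pmf(52, 4, 5, 4)
# diamonds: N = 52, m = 13, n = 5
r_diam = hyper_pmf(52, 13, 5, 4) / hyper_pmf(52, 13, 5, 5)
print(r_aces)  # 94
print(r_diam)  # 65/3, i.e. 21 + 2/3
```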


Lecture 9

9.1. Continuous Random Variables

Definition 9.1. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A random variable $X : \Omega \to \mathbb{R}$ is called continuous if the distribution function $F_X$ is continuous at all points.

Remark 9.2. Recall that we showed that for all $x \in \mathbb{R}$,
\[
\mathbb{P}(X = x) = \lim_{n \to \infty} \mathbb{P}_X((x - 1/n, x]) = F_X(x) - \lim_{n \to \infty} F_X(x - 1/n) = F_X(x) - F_X(x^-).
\]
Thus, if $F_X$ is continuous at $x$, then $\mathbb{P}(X = x) = 0$. So, if $X$ is a continuous random variable then $\mathbb{P}(X = x) = 0$ for all $x \in \mathbb{R}$. This is very different from the discrete case!

Continuous random variables can be very pathological, and we need measure theory to deal with some of their aspects. We restrict ourselves in this course to a "nicer" family of random variables.

Definition 9.3. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A random variable $X : \Omega \to \mathbb{R}$ is called absolutely continuous if there exists an integrable non-negative function $f_X : \mathbb{R} \to \mathbb{R}^+$ such that for all $x \in \mathbb{R}$,
\[
\mathbb{P}(X \le x) = F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt.
\]
Such a function $f_X$ is called the probability density function (or just density function, or PDF) of $X$.

Remark 9.4. Recall that the fundamental theorem of calculus says that if $f_X$ is continuous, then $F_X$ is differentiable and $F_X' = f_X$. Also, a result of calculus tells us that the function $x \mapsto \int_{-\infty}^{x} f(t)\,dt$ is a continuous function of $x$. So any absolutely continuous random variable is also continuous.

Under the carpet: We don't know how to prove that an integrable function $f_X : \mathbb{R} \to \mathbb{R}^+$ uniquely defines a probability measure $\mathbb{P}_X$ on $(\mathbb{R}, \mathcal{B})$, since we would need some measure theory for this (actually we could use the Lebesgue–Stieltjes Theorem). However, given that such a measure is uniquely defined, we have the following properties:

Proposition 9.5. Let $X$ be an absolutely continuous random variable with PDF $f_X$.

(1) By definition,
\[
\mathbb{P}_X((a, b]) = F_X(b) - F_X(a) = \int_{-\infty}^{b} f_X(t)\,dt - \int_{-\infty}^{a} f_X(t)\,dt = \int_{a}^{b} f_X(t)\,dt.
\]

(2) We don't always know over which sets we can integrate, but if the integral exists we have
\[
\mathbb{P}_X(A) = \int_{A} f_X(t)\,dt.
\]
For example, this holds if $A = \biguplus_{j=1}^{n} (a_j, b_j]$.

(3) Continuity of probability gives us that $\mathbb{P}_X(\{a\}) = 0$ for all $a$ (because $X$ is a continuous random variable, so $F_X$ is continuous at all points). Thus,
\[
\mathbb{P}_X((a, b]) = \mathbb{P}_X([a, b)) = \mathbb{P}_X((a, b)) = \mathbb{P}_X([a, b]).
\]

(4) We always have
\[
\int_{\mathbb{R}} f_X(t)\,dt = \mathbb{P}_X(\mathbb{R}) = 1.
\]

Proof. Just calculus of the Riemann integral, and continuity of probability measures. ∎

Example 9.6. Let $X$ be a random variable with PDF
\[
f_X(t) = \begin{cases} C(2t - t^2) & 0 \le t \le 2 \\ 0 & \text{otherwise.} \end{cases}
\]
What is $C$? What is the probability that $X > 1$?

To compute $C$ we use the fact that $\int_{\mathbb{R}} f_X(t)\,dt = 1$. So
\[
1 = \int_{-\infty}^{\infty} f_X(t)\,dt = C \int_{0}^{2} (2t - t^2)\,dt = C \left( t^2 - \frac{t^3}{3} \right)\Big|_{0}^{2} = C(4 - 8/3) = C \cdot \frac{4}{3}.
\]
So $C = 3/4$. The probability that $X > 1$ is
\[
\mathbb{P}_X((1, \infty)) = 1 - \mathbb{P}_X((-\infty, 1]) = 1 - C \int_{0}^{1} (2t - t^2)\,dt = 1 - \frac{3}{4} \cdot \frac{2}{3} = \frac{1}{2}.
\]
⋄
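Example 9.6 is easy to verify with crude numerical integration. A sketch using a midpoint rule (the step count is an arbitrary choice):

```python
def f(t, C=0.75):
    """the PDF of Example 9.6, with C = 3/4"""
    return C * (2 * t - t * t) if 0 <= t <= 2 else 0.0

def integrate(g, a, b, n=100000):
    """midpoint-rule numerical integration of g over [a, b]"""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, 0, 2)   # should be 1
p_gt_1 = integrate(f, 1, 2)  # should be P(X > 1) = 1/2
print(round(total, 6), round(p_gt_1, 6))
```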

9.1.1. Uniform Distribution. An absolutely continuous random variable $X$ has the uniform distribution on $[a,b]$, denoted $X \sim U([a,b])$, if $X$ has PDF
\[
f_X(t) = \begin{cases} \frac{1}{b-a} & a \le t \le b \\ 0 & \text{otherwise.} \end{cases}
\]
Indeed $\int_{\mathbb{R}} f_X(t)\,dt = 1$.

Example 9.7. A bus arrives at the station every 15 minutes, starting at 5:00. A passenger arrives at a time uniformly distributed between 7:00 and 7:30. What is the probability that he waits no more than 5 minutes for a bus?

If we take $X$ = the time (in minutes after 7:00) at which the passenger arrives, then $X$ has the uniform distribution on $[0, 30]$. The event that the passenger waits at most 5 minutes is the disjoint union
\[
\{X = 0\} \uplus \{10 \le X \le 15\} \uplus \{25 \le X \le 30\}.
\]
Thus the probability is
\[
\mathbb{P}(X = 0) + \mathbb{P}(10 \le X \le 15) + \mathbb{P}(25 \le X \le 30) = 0 + \frac{5}{30} + \frac{5}{30} = \frac{1}{3}.
\]
⋄

The uniform distribution is very symmetric in the following sense:

Exercise 9.8. Let $X \sim U([a,b])$. Show that $Y = X + c$ has the uniform distribution on $[a+c, b+c]$.

Solution. We need to show that
\[
F_Y(x) = \mathbb{P}(\{ \omega : Y(\omega) \le x \}) = \int_{-\infty}^{x} \frac{1}{(b+c) - (a+c)} \cdot \mathbf{1}_{[a+c,\,b+c]}(t)\,dt.
\]
Indeed, since $\{ \omega : Y(\omega) \le x \} = \{ \omega : X(\omega) \le x - c \}$, we have that
\[
F_Y(x) = F_X(x - c) = \int_{-\infty}^{x-c} \frac{1}{b-a} \cdot \mathbf{1}_{[a,b]}(t)\,dt
= \int_{-\infty}^{x} \frac{1}{(b+c) - (a+c)} \cdot \mathbf{1}_{[a+c,\,b+c]}(u)\,du,
\]
using the substitution $u = t + c$. ∎

Exercise 9.9. What is the distribution function of a $U([a,b])$ random variable?

Solution. For $a \le x \le b$ we have
\[
F_X(x) = \int_{a}^{x} \frac{1}{b-a}\,dt = \frac{x-a}{b-a}.
\]
For $x < a$, $F_X(x) = 0$, and for $x > b$, $F_X(x) = 1$. ∎

9.1.2. Exponential Distribution. Let $0 < \lambda \in \mathbb{R}$. An absolutely continuous random variable $X$ is said to have the exponential distribution with parameter $\lambda$, denoted $X \sim \mathrm{Exp}(\lambda)$, if it has PDF
\[
f_X(t) = \mathbf{1}_{\{t \ge 0\}} \cdot \lambda e^{-\lambda t}.
\]
What is the distribution function in this case? For $x \ge 0$,
\[
F_X(x) = \lambda \int_{0}^{x} e^{-\lambda t}\,dt = \left( -e^{-\lambda t} \right)\Big|_{0}^{x} = 1 - e^{-\lambda x},
\]
and for $x < 0$, $F_X(x) = 0$.

Example 9.10. The duration of the wait in line at Kupat Cholim is exponentially distributed with parameter 1/5 (in minutes). What is the probability that the wait is longer than 10 minutes? Suppose we are given that the wait is longer than 20 minutes; what is the probability, given this information, that the wait is longer than 30 minutes?

Let $X$ = waiting time, so $X \sim \mathrm{Exp}(1/5)$. Then
\[
\mathbb{P}(X > 10) = \int_{10}^{\infty} \frac{1}{5} e^{-t/5}\,dt = \left( -e^{-t/5} \right)\Big|_{10}^{\infty} = e^{-2}.
\]
Now, using the definition of conditional probability,
\[
\mathbb{P}(X > 30 \mid X > 20) = \frac{\mathbb{P}(X > 30)}{\mathbb{P}(X > 20)}.
\]
Since
\[
\mathbb{P}(X > x) = \left( -e^{-t/5} \right)\Big|_{x}^{\infty} = e^{-x/5},
\]
we get that
\[
\mathbb{P}(X > 30 \mid X > 20) = e^{-6} / e^{-4} = e^{-2}.
\]
⋄

Is it a coincidence that we got the same answer? No. Recall that the geometric distribution has the memoryless property; that is, a geometric random variable $X$ satisfies
\[
\mathbb{P}(X > m + n \mid X > m) = \mathbb{P}(X > n).
\]
Similarly, the exponential distribution has the memoryless property:

Proposition 9.11. Let $X \sim \mathrm{Exp}(\lambda)$. Then for any $x, y \in \mathbb{R}^+$,
\[
\mathbb{P}(X > x + y \mid X > x) = \mathbb{P}(X > y).
\]
That is, knowing that we have already waited for some time gives no information about how much longer we have to wait.

Proof. First,
\[
\mathbb{P}(X > t) = 1 - F_X(t) = e^{-\lambda t}.
\]
So
\[
\mathbb{P}(X > x + y \mid X > x) = \frac{\mathbb{P}(X > x + y)}{\mathbb{P}(X > x)} = \frac{e^{-\lambda(x+y)}}{e^{-\lambda x}} = e^{-\lambda y} = \mathbb{P}(X > y).
\]
∎

However, it turns out that the exponential distribution is the only continuous distribution with the memoryless property:
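The memoryless property can also be observed empirically. A Monte Carlo sketch (the sample size and seed are arbitrary choices):

```python
import random

random.seed(0)
lam, n = 1.0, 200000
samples = [random.expovariate(lam) for _ in range(n)]

# P(X > 1) and P(X > 2 | X > 1) should both be close to e^{-1} ≈ 0.368
p_gt_1 = sum(s > 1 for s in samples) / n
over_1 = [s for s in samples if s > 1]
p_gt_2_given_1 = sum(s > 2 for s in over_1) / len(over_1)
print(round(p_gt_1, 3), round(p_gt_2_given_1, 3))
```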

Proposition 9.12. Let $X$ be a continuous random variable such that for all $x, y \in \mathbb{R}^+$,
\[
\mathbb{P}(X > x + y \mid X > x) = \mathbb{P}(X > y).
\]
Then $X \sim \mathrm{Exp}(\lambda)$ for some $\lambda > 0$.

Proof. $X$ is continuous, so $F_X$ is a continuous function. Let $g(t) = 1 - F_X(t) = \mathbb{P}(X > t)$; then $g$ is a continuous function as well. Now, for any $x, y \in \mathbb{R}^+$,
\[
g(x + y) = g(x) g(y).
\]
Let $\lambda \ge 0$ be such that $g(1) = e^{-\lambda}$ (recall that $g : \mathbb{R} \to [0,1]$). For any $0 < n \in \mathbb{N}$ we have that
\[
g(n) = g(n-1) g(1) = \cdots = g(1)^n.
\]
Since $g(n) \to 0$, we have that $g(1) < 1$ and $\lambda > 0$. For any $0 < m \in \mathbb{N}$ we have that
\[
g(1) = g(\tfrac{1}{m}) g(1 - \tfrac{1}{m}) = \cdots = g(\tfrac{1}{m})^m,
\]
so $g(1/m) = g(1)^{1/m}$. In a similar way, we get that for any rational number $n/m$,
\[
g(\tfrac{n}{m}) = g(1)^{n/m} = e^{-\lambda n/m}.
\]
Since $g$ is continuous and coincides with $e^{-\lambda q}$ for all $q \in \mathbb{Q}^+$, we get that $g(t) = e^{-\lambda t}$. ∎

9.1.3. Normal Distribution.

Some calculus. Let us look at the following integral:
\[
I = \int_{-\infty}^{\infty} e^{-t^2}\,dt.
\]
How can we compute this value? We use two-dimensional calculus. Note that
\[
I^2 = \int_{\mathbb{R}^2} e^{-(x^2 + y^2)}\,dx\,dy.
\]
Setting $x = r \cos\theta$ and $y = r \sin\theta$ (or $r = \sqrt{x^2 + y^2}$ and $\theta = \arctan(y/x)$), the Jacobian matrix is
\[
J = \begin{pmatrix} \frac{\partial x}{\partial r} & \frac{\partial x}{\partial \theta} \\[2pt] \frac{\partial y}{\partial r} & \frac{\partial y}{\partial \theta} \end{pmatrix}
= \begin{pmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{pmatrix},
\]
so $\det(J) = r$. Thus,
\[
I^2 = \int_{0}^{2\pi} \int_{0}^{\infty} e^{-r^2} r\,dr\,d\theta.
\]
Substituting $s = r^2$, so that $ds = 2r\,dr$, we have that
\[
I^2 = \int_{0}^{2\pi} \int_{0}^{\infty} \tfrac{1}{2} e^{-s}\,ds\,d\theta = \pi.
\]
So $I = \sqrt{\pi}$. Now, we can use another trick to compute
\[
I' = \int_{-\infty}^{\infty} \exp\left( -\frac{(t - \mu)^2}{2\sigma^2} \right) dt.
\]
Letting $x = \frac{t - \mu}{\sqrt{2}\,\sigma}$, we have that $dt = \sqrt{2}\,\sigma\,dx$. Thus,
\[
I' = \sqrt{2}\,\sigma \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{2\pi}\,\sigma.
\]
We are ready to define the normal distribution. Let $\mu \in \mathbb{R}$ and $\sigma \in \mathbb{R}^+$. An absolutely continuous random variable has the normal-$(\mu, \sigma)$ distribution, denoted $X \sim N(\mu, \sigma)$, if it has PDF
\[
f_X(t) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(t - \mu)^2}{2\sigma^2} \right).
\]
[Carl Friedrich Gauss (1777–1855)] A normal-$(0,1)$ random variable is called a standard normal random variable.

Example 9.13. The lifetime of people in Oceania is distributed as a normal-(75, 8) random variable. What is the probability a person lives more than 100 years? Show that it is less than $e^{-\xi^2/2} \cdot \frac{1}{\sqrt{2\pi}}$, where $\xi = 25/8$. [George Orwell (1903–1950)]

If $X$ = lifetime, then $X \sim N(75, 8)$. So
\[
\mathbb{P}(X > 100) = \int_{100}^{\infty} \frac{1}{\sqrt{2\pi} \cdot 8} e^{-(t-75)^2 / (2 \cdot 8^2)}\,dt.
\]
Taking $s = (t - 75)/8$, we have that $dt = 8\,ds$, so
\[
\mathbb{P}(X > 100) = \int_{\xi}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-s^2/2}\,ds.
\]
Since on $(\xi, \infty)$ we have that $1 < s$, we get that
\[
\mathbb{P}(X > 100) < \frac{1}{\sqrt{2\pi}} \int_{\xi}^{\infty} s e^{-s^2/2}\,ds.
\]
Substituting $u = s^2/2$, so $du = s\,ds$, we have that
\[
\mathbb{P}(X > 100) < \frac{1}{\sqrt{2\pi}} \int_{\xi^2/2}^{\infty} e^{-u}\,du = e^{-\xi^2/2} \cdot \frac{1}{\sqrt{2\pi}}.
\]
⋄
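The bound in Example 9.13 can be compared with the exact Gaussian tail, which the standard library exposes through `math.erfc`:

```python
from math import erfc, exp, pi, sqrt

xi = 25 / 8
# exact: P(X > 100) = P(Z > xi) for a standard normal Z
tail = erfc(xi / sqrt(2)) / 2
# the bound derived in the example
bound = exp(-xi * xi / 2) / sqrt(2 * pi)
print(tail, bound, tail < bound)
```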

A nice property of the normal distribution is the following:

Exercise 9.14. Let $X \sim N(\mu, \sigma)$ and let $a > 0$, $b \in \mathbb{R}$. Define $Y(\omega) = a X(\omega) + b$. Show that $Y \sim N(a\mu + b, a\sigma)$.

Solution. We need to prove that for any $y \in \mathbb{R}$,
\[
\mathbb{P}(Y \le y) = \int_{-\infty}^{y} \frac{1}{\sqrt{2\pi} \cdot a\sigma} \exp\left( -\frac{(t - a\mu - b)^2}{2 a^2 \sigma^2} \right) dt.
\]
Since
\[
\{ \omega : Y(\omega) \le y \} = \{ \omega : X(\omega) \le (y - b)/a \},
\]
we have that
\[
\mathbb{P}(Y \le y) = \mathbb{P}(X \le (y-b)/a) = \int_{-\infty}^{(y-b)/a} \frac{1}{\sqrt{2\pi} \cdot \sigma} \exp\left( -\frac{(t - \mu)^2}{2\sigma^2} \right) dt.
\]
Substituting $t - \mu = \frac{s - a\mu - b}{a}$ (or $t = \frac{s - b}{a}$), we have that $dt = \frac{1}{a}\,ds$ and
\[
\mathbb{P}(Y \le y) = \int_{-\infty}^{y} \frac{1}{\sqrt{2\pi} \cdot a\sigma} \exp\left( -\frac{(s - a\mu - b)^2}{2 a^2 \sigma^2} \right) ds.
\]
∎

Corollary 9.15. If $X$ is a normal-$(\mu, \sigma)$ random variable, then $\frac{X - \mu}{\sigma}$ is a standard normal random variable.

Example 9.16. Let $X \sim N(\mu, \sigma)$. Show that $F_X(t) > 0$ for all $t$.

Indeed,
\[
F_X(t) = \int_{-\infty}^{t} f_X(s)\,ds \ge \int_{t-1}^{t} f_X(s)\,ds \ge \inf_{s \in [t-1,\,t]} f_X(s).
\]
Since $f_X$ is increasing up to $\mu$ and decreasing after $\mu$, the infimum on $[t-1, t]$ is attained at one of the endpoints. So
\[
F_X(t) \ge \min\{ f_X(t-1), f_X(t) \} > 0.
\]
⋄

Example 9.17. Let $X \sim N(0,1)$. Show that $X^2$ is absolutely continuous. Is $X^2$ normally distributed?

First, to show that $X^2$ is absolutely continuous we need to show that
\[
F_{X^2}(t) = \int_{-\infty}^{t} f(s)\,ds
\]
for some function $f : \mathbb{R} \to \mathbb{R}^+$. Indeed, if $t < 0$ then $\{ X^2 \le t \} = \emptyset$, so $F_{X^2}(t) = 0$. If $t \ge 0$,
\[
F_{X^2}(t) = \mathbb{P}[X^2 \le t] = \mathbb{P}[-\sqrt{t} \le X \le \sqrt{t}] = \int_{-\sqrt{t}}^{\sqrt{t}} \frac{1}{\sqrt{2\pi}} e^{-s^2/2}\,ds.
\]
Since the integrand is even (symmetric around 0), we have that
\[
F_{X^2}(t) = 2 \int_{0}^{\sqrt{t}} \frac{1}{\sqrt{2\pi}} e^{-s^2/2}\,ds = \int_{0}^{t} \frac{1}{\sqrt{2\pi x}} e^{-x/2}\,dx,
\]
where we have used the change of variables $x = s^2$, so $ds = \frac{1}{2\sqrt{x}}\,dx$. Now, if we define
\[
f(s) = \begin{cases} \frac{1}{\sqrt{2\pi s}} e^{-s/2} & s > 0 \\ 0 & s \le 0, \end{cases}
\]
then we get from all the above that
\[
F_{X^2}(t) = \int_{-\infty}^{t} f(s)\,ds.
\]
So $X^2$ is absolutely continuous with density $f_{X^2} = f$.

Is $X^2$ normal? No. This can be seen since $F_{X^2}(0) = 0$, but $F_N(0) > 0$ for any $N \sim N(\mu, \sigma)$. ⋄
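The density derived in Example 9.17 can be sanity-checked numerically: integrating it from 0 to 1 should give $\mathbb{P}(-1 \le X \le 1) = \operatorname{erf}(1/\sqrt{2})$ for $X \sim N(0,1)$. A sketch (the midpoint rule converges slowly near the $1/\sqrt{s}$ singularity, so only a loose tolerance is claimed):

```python
from math import erf, exp, pi, sqrt

def chi2_density(s):
    """the density of X^2 from Example 9.17"""
    return exp(-s / 2) / sqrt(2 * pi * s) if s > 0 else 0.0

def integrate(g, a, b, n=100000):
    """midpoint-rule numerical integration of g over [a, b]"""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

lhs = integrate(chi2_density, 0.0, 1.0)  # F_{X^2}(1)
rhs = erf(1 / sqrt(2))                   # P(-1 <= X <= 1), about 0.6827
print(round(lhs, 3), round(rhs, 3))
```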


Lecture 10

10.1. Independence Revisited

10.1.1. Some reminders. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. Given a collection of subsets $\mathcal{K} \subset \mathcal{F}$, recall that the σ-algebra generated by $\mathcal{K}$ is
\[
\sigma(\mathcal{K}) = \bigcap \left\{ \mathcal{G} : \mathcal{G} \text{ is a } \sigma\text{-algebra},\ \mathcal{K} \subset \mathcal{G} \right\},
\]
and this σ-algebra is the smallest σ-algebra containing $\mathcal{K}$. $\sigma(\mathcal{K})$ can be thought of as all possible information that can be generated by the sets in $\mathcal{K}$.

Recall that a collection of events $(A_n)_n$ are mutually independent if for any finite number of these events $A_1, \dots, A_n$ we have that
\[
\mathbb{P}(A_1 \cap \cdots \cap A_n) = \mathbb{P}(A_1) \cdots \mathbb{P}(A_n).
\]

We can also define the independence of families of events:

Definition 10.1. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, and let $(\mathcal{K}_n)_n$ be a collection of families of events in $\mathcal{F}$. Then $(\mathcal{K}_n)_n$ are mutually independent if for any finite number of families $\mathcal{K}_1, \dots, \mathcal{K}_n$ from this collection and any events $A_1 \in \mathcal{K}_1, A_2 \in \mathcal{K}_2, \dots, A_n \in \mathcal{K}_n$, we have that
\[
\mathbb{P}(A_1 \cap \cdots \cap A_n) = \mathbb{P}(A_1) \cdots \mathbb{P}(A_n).
\]

For example, we would like to think that $(A_n)$ are mutually independent if all the information from each event in the sequence is independent of the information from the other events in the sequence. This is the content of the next proposition.

Proposition 10.2. Let $(A_n)$ be a sequence of events on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Then $(A_n)$ are mutually independent if and only if the σ-algebras $(\sigma(A_n))_n$ are mutually independent.

It is not difficult to prove this by induction:

Proof. By induction on $n$ we show that $(\sigma(A_1), \dots, \sigma(A_n), \{A_{n+1}\}, \{A_{n+2}\}, \dots)$ are mutually independent. The base $n = 0$ is just the assumption.

So assume that $(\sigma(A_1), \dots, \sigma(A_{n-1}), \{A_n\}, \{A_{n+1}\}, \dots)$ are mutually independent. Let $n_1 < n_2 < \cdots < n_k < n_{k+1} < \cdots < n_m$ be a finite number of indices such that $n_k < n < n_{k+1}$. Let $B_j \in \sigma(A_{n_j})$ for $j = 1, \dots, k$, and $B_j = A_{n_j}$ for $j = k+1, \dots, m$. Let $B \in \sigma(A_n)$.

If $B = \emptyset$ then $\mathbb{P}(B_1 \cap \cdots \cap B_m \cap B) = 0 = \mathbb{P}(B_1) \cdots \mathbb{P}(B_m) \mathbb{P}(B)$. If $B = \Omega$ then $\mathbb{P}(B_1 \cap \cdots \cap B_m \cap B) = \mathbb{P}(B_1 \cap \cdots \cap B_m) = \mathbb{P}(B_1) \cdots \mathbb{P}(B_m)$, by the induction hypothesis. If $B = A_n$ then this also holds by the induction hypothesis. So we only have to deal with the case $B = A_n^c$. In this case, for $X = B_1 \cap \cdots \cap B_m$,
\[
\mathbb{P}(X \cap B) = \mathbb{P}(X \setminus (X \cap A_n)) = \mathbb{P}(X) - \mathbb{P}(X \cap A_n) = \mathbb{P}(X)(1 - \mathbb{P}(A_n)) = \mathbb{P}(X)\,\mathbb{P}(B),
\]
where we have used the induction hypothesis to say that $X$ and $A_n$ are independent. Since this holds for any choice of a finite number of events, we have the induction step. ∎

However, we will take a more winding road to get stronger results... (Corollary 10.8)

10.2. π-systems and Independence

Definition 10.3. Let $\mathcal{K}$ be a family of subsets of $\Omega$.

• We say that $\mathcal{K}$ is a π-system if $\emptyset \in \mathcal{K}$ and $\mathcal{K}$ is closed under intersections; that is, for all $A, B \in \mathcal{K}$, $A \cap B \in \mathcal{K}$.

• We say that $\mathcal{K}$ is a Dynkin system (or λ-system) if $\mathcal{K}$ is closed under set complements and countable disjoint unions; that is, for any $A \in \mathcal{K}$ and any sequence $(A_n)$ in $\mathcal{K}$ of pairwise disjoint subsets, we have that $A^c \in \mathcal{K}$ and $\biguplus_n A_n \in \mathcal{K}$.

The main goal now is to show that probability measures are uniquely defined once they are defined on a π-system. This is the content of Theorem 10.7.

Proposition 10.4. If $\mathcal{F}$ is a Dynkin system on $\Omega$ and is also a π-system on $\Omega$, then $\mathcal{F}$ is a σ-algebra.

Proof. Since $\mathcal{F}$ is a Dynkin system, it is closed under complements, and $\Omega = \emptyset^c \in \mathcal{F}$. Let $(A_n)$ be a sequence of subsets in $\mathcal{F}$. Set $B_1 = A_1$, $C_1 = \emptyset$, and for $n > 1$,
\[
C_n = \bigcup_{j=1}^{n-1} A_j \quad \text{and} \quad B_n = A_n \setminus C_n.
\]
Since $\mathcal{F}$ is a π-system and closed under complements, $C_n^c = \bigcap_{j=1}^{n-1} A_j^c \in \mathcal{F}$, and so $B_n = A_n \cap C_n^c \in \mathcal{F}$. Since $(B_n)$ is a sequence of pairwise disjoint sets, and since $\mathcal{F}$ is a Dynkin system,
\[
\bigcup_n A_n = \biguplus_n B_n \in \mathcal{F}.
\]
∎

Proposition 10.5. If $(\mathcal{D}_\alpha)_\alpha$ is a collection of Dynkin systems (not necessarily countable), then $\mathcal{D} = \bigcap_\alpha \mathcal{D}_\alpha$ is a Dynkin system.

Proof. Since $\emptyset \in \mathcal{D}_\alpha$ for all $\alpha$, we have that $\emptyset \in \mathcal{D}$. If $A \in \mathcal{D}$, then $A \in \mathcal{D}_\alpha$ for all $\alpha$. So $A^c \in \mathcal{D}_\alpha$ for all $\alpha$, and thus $A^c \in \mathcal{D}$. If $(A_n)_n$ is a countable sequence of pairwise disjoint sets in $\mathcal{D}$, then $A_n \in \mathcal{D}_\alpha$ for all $\alpha$ and all $n$. Thus, for any $\alpha$, $\biguplus_n A_n \in \mathcal{D}_\alpha$. So $\biguplus_n A_n \in \mathcal{D}$. ∎

[Eugene Dynkin (1924–)] Lemma 10.6 (Dynkin's Lemma). If a Dynkin system $\mathcal{D}$ contains a π-system $\mathcal{K}$, then $\sigma(\mathcal{K}) \subset \mathcal{D}$.

Proof. Let
\[
\mathcal{F} = \bigcap \left\{ \mathcal{D}' : \mathcal{D}' \text{ is a Dynkin system containing } \mathcal{K} \right\}.
\]
So $\mathcal{F}$ is a Dynkin system and $\mathcal{K} \subset \mathcal{F} \subset \mathcal{D}$. We will show that $\mathcal{F}$ is a σ-algebra, so that $\sigma(\mathcal{K}) \subset \mathcal{F} \subset \mathcal{D}$.

Suppose we knew that $\mathcal{F}$ is closed under intersections (which is Claim 3 below). Since $\emptyset \in \mathcal{K} \subset \mathcal{F}$, we would then have that $\mathcal{F}$ is a π-system; being both a Dynkin system and a π-system, $\mathcal{F}$ would be a σ-algebra. Thus, to show that $\mathcal{F}$ is a σ-algebra, it suffices to show that $\mathcal{F}$ is closed under intersections. Note that $\mathcal{F}$ is closed under complements (because all Dynkin systems are).

Claim 1. If $A \subset B$ are subsets in $\mathcal{F}$, then $B \setminus A \in \mathcal{F}$.

Proof. If $A, B \in \mathcal{F}$, then since $\mathcal{F}$ is a Dynkin system, also $B^c \in \mathcal{F}$. Since $A \subset B$, the sets $A, B^c$ are disjoint, so $A \uplus B^c \in \mathcal{F}$, and so $B \setminus A = A^c \cap B = (A \uplus B^c)^c \in \mathcal{F}$. ∎

Claim 2. For any $K \in \mathcal{K}$, if $A \in \mathcal{F}$ then $A \cap K \in \mathcal{F}$.

Proof. Let $\mathcal{E} = \{ A : A \cap K \in \mathcal{F} \}$. Let $A \in \mathcal{E}$ and let $(A_n)$ be a sequence of pairwise disjoint subsets in $\mathcal{E}$. Since $K \in \mathcal{F}$ and $A \cap K \in \mathcal{F}$, by Claim 1 we have that $A^c \cap K = K \setminus (A \cap K) \in \mathcal{F}$, so $A^c \in \mathcal{E}$. Since $(A_n \cap K)_n$ is a sequence of pairwise disjoint subsets in $\mathcal{F}$, we get that
\[
\left( \biguplus_n A_n \right) \cap K = \biguplus_n (A_n \cap K) \in \mathcal{F},
\]
so $\biguplus_n A_n \in \mathcal{E}$. We conclude that $\mathcal{E}$ is a Dynkin system. Since $\mathcal{K}$ is closed under intersections, $\mathcal{E}$ contains $\mathcal{K}$. Thus, by definition, $\mathcal{F} \subset \mathcal{E}$. So for any $A \in \mathcal{F}$ we have that $A \in \mathcal{E}$, i.e. $A \cap K \in \mathcal{F}$. ∎

Claim 3. For any $B \in \mathcal{F}$, if $A \in \mathcal{F}$ then $A \cap B \in \mathcal{F}$.

Proof. Let $\mathcal{E} = \{ A : A \cap B \in \mathcal{F} \}$. Let $A \in \mathcal{E}$ and let $(A_n)$ be a sequence of pairwise disjoint subsets in $\mathcal{E}$. Since $B \in \mathcal{F}$ and $A \cap B \in \mathcal{F}$, by Claim 1 we have that $A^c \cap B = B \setminus (A \cap B) \in \mathcal{F}$, so $A^c \in \mathcal{E}$. Since $(A_n \cap B)_n$ is a sequence of pairwise disjoint subsets in $\mathcal{F}$, we get that
\[
\left( \biguplus_n A_n \right) \cap B = \biguplus_n (A_n \cap B) \in \mathcal{F},
\]
so $\biguplus_n A_n \in \mathcal{E}$. We conclude that $\mathcal{E}$ is a Dynkin system. By Claim 2, $\mathcal{K}$ is contained in $\mathcal{E}$. So, by definition, $\mathcal{F} \subset \mathcal{E}$. ∎

Since $\mathcal{F}$ is closed under intersections, this completes the proof. ∎

The next theorem tells us that a probability measure on $(\Omega, \mathcal{F})$ is determined by its values on a π-system generating $\mathcal{F}$.

Theorem 10.7 (Uniqueness of Extension). Let $\mathcal{K}$ be a π-system on $\Omega$, and let $\mathcal{F} = \sigma(\mathcal{K})$ be the σ-algebra generated by $\mathcal{K}$. Let $P, Q$ be two probability measures on $(\Omega, \mathcal{F})$ such that $P(A) = Q(A)$ for all $A \in \mathcal{K}$. Then $P(B) = Q(B)$ for any $B \in \mathcal{F}$.

Proof. Let $\mathcal{D} = \{ A \in \mathcal{F} : P(A) = Q(A) \}$. So $\mathcal{K} \subset \mathcal{D}$. We will show that $\mathcal{D}$ is a Dynkin system; since it contains $\mathcal{K}$, by Dynkin's Lemma it must then contain $\mathcal{F} = \sigma(\mathcal{K})$.

If $A \in \mathcal{D}$, then $P(A^c) = 1 - P(A) = 1 - Q(A) = Q(A^c)$, so $A^c \in \mathcal{D}$. Let $(A_n)$ be a sequence of pairwise disjoint sets in $\mathcal{D}$. Then,
\[
P\left( \biguplus_n A_n \right) = \sum_n P(A_n) = \sum_n Q(A_n) = Q\left( \biguplus_n A_n \right),
\]
so $\biguplus_n A_n \in \mathcal{D}$. ∎

Corollary 10.8. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. Let $(\Pi_n)_n$ be a sequence of π-systems, and let $\mathcal{F}_n = \sigma(\Pi_n)$. Then $(\Pi_n)_n$ are mutually independent if and only if $(\mathcal{F}_n)_n$ are mutually independent.

Proof. We will prove by induction on $n$ that for any $n \ge 0$, the collection $(\mathcal{F}_1, \mathcal{F}_2, \dots, \mathcal{F}_n, \Pi_{n+1}, \Pi_{n+2}, \dots)$ are mutually independent.

For $n = 0$ this is the assumption. For $n \ge 1$, let $n_1 < n_2 < \cdots < n_k < n_{k+1} < \cdots < n_m$ be a finite number of indices such that $n_k < n < n_{k+1}$. Let $A_j \in \mathcal{F}_{n_j}$ for $j = 1, \dots, k$, and $A_j \in \Pi_{n_j}$ for $j = k+1, \dots, m$.

For any $A \in \mathcal{F}_n$: if $\mathbb{P}(A_1 \cap \cdots \cap A_m) = 0$ then $A$ is independent of $A_1 \cap \cdots \cap A_m$. So assume that $\mathbb{P}(A_1 \cap \cdots \cap A_m) > 0$, and define the probability measure
\[
P(A) := \mathbb{P}(A \mid A_1 \cap \cdots \cap A_m).
\]
By induction, the collection $(\mathcal{F}_1, \dots, \mathcal{F}_{n-1}, \Pi_n, \Pi_{n+1}, \dots)$ are mutually independent, so $P(A) = \mathbb{P}(A)$ for any $A \in \Pi_n$. Since $\Pi_n$ is a π-system generating $\mathcal{F}_n$, we have by Theorem 10.7 that $P(A) = \mathbb{P}(A)$ for any $A \in \mathcal{F}_n$. Since this holds for any choice of a finite number of events $A_1, \dots, A_m$, we get that the collection $(\mathcal{F}_1, \dots, \mathcal{F}_{n-1}, \mathcal{F}_n, \Pi_{n+1}, \dots)$ are mutually independent. ∎

Corollary 10.9. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, and let $(A_n)_n$ be a sequence of mutually independent events. Then the σ-algebras $(\sigma(A_n))_n$ are mutually independent.

10.3. Second Borel–Cantelli Lemma

We conclude this lecture with

Lemma 10.10 (Second Borel–Cantelli Lemma). Let $(A_n)$ be a sequence of mutually independent events. If $\sum_n \mathbb{P}(A_n) = \infty$, then $\mathbb{P}(\limsup_n A_n) = \mathbb{P}(A_n \text{ i.o.}) = 1$.

Proof. It suffices to show that $\mathbb{P}(\liminf_n A_n^c) = 0$. For fixed $m > n$, note that by independence
\[
\mathbb{P}\left( \bigcap_{k=n}^{m} A_k^c \right) = \prod_{k=n}^{m} (1 - \mathbb{P}(A_k)).
\]
Since $\lim_{m} \bigcap_{k=n}^{m} A_k^c = \bigcap_{k \ge n} A_k^c$, we have that
\[
\mathbb{P}\left( \bigcap_{k \ge n} A_k^c \right) = \lim_{m \to \infty} \prod_{k=n}^{m} (1 - \mathbb{P}(A_k)) = \prod_{k \ge n} (1 - \mathbb{P}(A_k)) \le \exp\left( -\sum_{k \ge n} \mathbb{P}(A_k) \right),
\]
where we have used $1 - x \le e^{-x}$. Since $\sum_n \mathbb{P}(A_n) = \infty$, we have that $\sum_{k \ge n} \mathbb{P}(A_k) = \infty$ for all fixed $n$. Thus $\mathbb{P}(\bigcap_{k \ge n} A_k^c) = 0$ for all fixed $n$, and
\[
\mathbb{P}(\liminf_n A_n^c) = \mathbb{P}\left( \bigcup_n \bigcap_{k \ge n} A_k^c \right) = \lim_{n \to \infty} \mathbb{P}\left( \bigcap_{k \ge n} A_k^c \right) = 0,
\]
by continuity of probability (the sets $\bigcap_{k \ge n} A_k^c$ increase in $n$). ∎

Example 10.11. A biased coin is tossed infinitely many times, all tosses mutually independent. What is the probability that the sequence 01010 appears infinitely many times?

Let $p$ be the probability that the coin's outcome is 1, and let $a_n$ be the outcome of the $n$-th toss. Let $A_n$ be the event that $a_n = 0$, $a_{n+1} = 1$, $a_{n+2} = 0$, $a_{n+3} = 1$, $a_{n+4} = 0$. Since the tosses are mutually independent, the sequence of events $(A_{5n+1})_{n \ge 0}$ are mutually independent. Since $\mathbb{P}(A_{5n+1}) = p^2 (1-p)^3 > 0$, we have that $\sum_{n=0}^{\infty} \mathbb{P}(A_{5n+1}) = \infty$. So the Borel–Cantelli Lemma tells us that $\mathbb{P}(A_{5n+1} \text{ i.o.}) = 1$. Since $\{A_{5n+1} \text{ i.o.}\}$ implies $\{A_n \text{ i.o.}\}$, we are done. ⋄
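Example 10.11 can be illustrated by simulation: in a long run of coin tosses the pattern 01010 indeed keeps appearing. A sketch (the seed, run length, and $p = 1/2$ are arbitrary choices):

```python
import random

random.seed(1)
p, n = 0.5, 100000
tosses = "".join("1" if random.random() < p else "0" for _ in range(n))

pattern, count = "01010", 0
for i in range(n - 4):          # count occurrences, overlaps allowed
    if tosses[i:i + 5] == pattern:
        count += 1
print(count)  # expectation is about n * p**2 * (1-p)**3 ≈ 3125
```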


Lecture 11

11.1. Independent Random Variables

Let $X : (\Omega, \mathcal{F}) \to (\mathbb{R}, \mathcal{B})$ be a random variable. Recall that in the definition of a random variable we require that $X$ is a measurable function; i.e. for any Borel set $B \in \mathcal{B}$ we have $X^{-1}(B) \in \mathcal{F}$. We want to define a σ-algebra that captures all the possible information that can be inferred from $X$.

Definition 11.1. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and let $(X_\alpha)_{\alpha \in I} : \Omega \to \mathbb{R}$ be a collection of random variables. The σ-algebra generated by $(X_\alpha)_{\alpha \in I}$ is defined as
\[
\sigma(X_\alpha : \alpha \in I) := \sigma\left( \left\{ X_\alpha^{-1}(B) : \alpha \in I,\ B \in \mathcal{B} \right\} \right).
\]
Note that for a single random variable $X : \Omega \to \mathbb{R}$, the collection $\{ X^{-1}(B) : B \in \mathcal{B} \}$ is itself a σ-algebra, so $\sigma(X) = \{ X^{-1}(B) : B \in \mathcal{B} \}$.

We now define independent random variables.

Definition 11.2. Let $(X_n)_n$, $(Y_n)_n$ be collections of random variables on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$. We say that $(X_n)_n$ are mutually independent if the σ-algebras $(\sigma(X_n))_n$ are mutually independent. We say that $(X_n)_n$ are independent of $(Y_n)_n$ if the σ-algebras $\sigma((X_n)_n)$ and $\sigma((Y_n)_n)$ are independent.

An important property (a consequence of the π-system arguments above):

Proposition 11.3. Let $(X_n)_n$ be a collection of random variables on $(\Omega, \mathcal{F}, \mathbb{P})$. $(X_n)$ are mutually independent if for any finite number of random variables from the collection, $X_1, \dots, X_n$, and any real numbers $a_1, a_2, \dots, a_n$, we have that
\[
\mathbb{P}[X_1 \le a_1, X_2 \le a_2, \dots, X_n \le a_n] = \mathbb{P}[X_1 \le a_1] \cdot \mathbb{P}[X_2 \le a_2] \cdots \mathbb{P}[X_n \le a_n].
\]

Proof. Let $X_1, \dots, X_n$ be a finite number of random variables. Define two probability measures on the Borel sets of $\mathbb{R}^n$: for any $B \in \mathcal{B}^n$ let
\[
P_1(B) = \mathbb{P}((X_1, \dots, X_n) \in B) \quad \text{and} \quad P_2(B) = \mathbb{P}(X_1 \in \pi_1 B) \cdots \mathbb{P}(X_n \in \pi_n B),
\]
where $\pi_j$ is the projection onto the $j$-th coordinate. Since these two measures are identical on the π-system
\[
\left\{ (-\infty, a_1] \times \cdots \times (-\infty, a_n] : a_1, a_2, \dots, a_n \in \mathbb{R} \right\},
\]
and since this π-system generates all Borel sets on $\mathbb{R}^n$, we get that $P_1 = P_2$ on all Borel sets of $\mathbb{R}^n$.

Thus, if $A_j \in \sigma(X_j)$, $j = 1, \dots, n$, then for all $j$, $A_j = X_j^{-1}(B_j)$ for some Borel set $B_j$ on $\mathbb{R}$, so
\[
\mathbb{P}(A_1 \cap \cdots \cap A_n) = \mathbb{P}((X_1, \dots, X_n) \in B_1 \times \cdots \times B_n) = \mathbb{P}(X_1 \in B_1) \cdots \mathbb{P}(X_n \in B_n) = \mathbb{P}(A_1) \cdots \mathbb{P}(A_n).
\]
∎

Example 11.4. We toss two dice. $Y$ is the number on the first die and $X$ is the sum of both dice. Here the probability space is the uniform measure on all pairs of dice results, so $\Omega = \{1, 2, \dots, 6\}^2$. What is $\sigma(X)$?
\[
\sigma(X) = \left\{ \{ (x, y) \in \Omega : x + y \in A \} : A \subset \{2, \dots, 12\} \right\}.
\]
(We have to know that all subsets of $\{2, \dots, 12\}$ are Borel sets. This follows from the fact that $\{r\} = \bigcap_n (r - 1/n, r]$.) On the other hand,
\[
\sigma(Y) = \left\{ \{ (x, y) \in \Omega : x \in A \} : A \subset \{1, \dots, 6\} \right\}.
\]
Now note that
\[
\mathbb{P}[X = 7, Y = 3] = \frac{1}{36} = \mathbb{P}[X = 7] \cdot \mathbb{P}[Y = 3].
\]
This is mainly due to $\mathbb{P}[X = 7] = 1/6$. Similarly, for any $y \in \{1, \dots, 6\}$, the events $\{X = 7\}$ and $\{Y = y\}$ are independent.

Are $X$ and $Y$ independent random variables? NO! For example,
\[
\mathbb{P}[X = 6, Y = 3] = \frac{1}{36} \ne \frac{5}{36} \cdot \frac{1}{6} = \mathbb{P}[X = 6] \cdot \mathbb{P}[Y = 3].
\]
For independence we need all information from $X$ to be independent of the information from $Y$! ⋄
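Example 11.4 is small enough to check by exhaustive enumeration of the 36 outcomes. A sketch with exact fractions:

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))  # uniform measure on 36 outcomes

def P(event):
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

Y = lambda w: w[0]          # first die
X = lambda w: w[0] + w[1]   # sum of both dice

# {X = 7} is independent of each {Y = y} ...
for y in range(1, 7):
    assert P(lambda w: X(w) == 7 and Y(w) == y) == P(lambda w: X(w) == 7) * P(lambda w: Y(w) == y)

# ... but X and Y are not independent random variables:
lhs = P(lambda w: X(w) == 6 and Y(w) == 3)
rhs = P(lambda w: X(w) == 6) * P(lambda w: Y(w) == 3)
print(lhs, rhs)  # 1/36 versus 5/216
assert lhs != rhs
```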

Example 11.5. Let $(X_n)_n$ be independent random variables such that $X_n \sim \mathrm{Exp}(1)$ for all $n$. Then, for any $\alpha > 0$,
\[
\mathbb{P}[X_n > \alpha \log n] = e^{-\alpha \log n} = n^{-\alpha}.
\]
Note that in this case,
\[
\sum_{n \ge 1} \mathbb{P}[X_n > \alpha \log n] \begin{cases} = \infty & \text{if } \alpha \le 1 \\ < \infty & \text{if } \alpha > 1. \end{cases}
\]
For every $n$, the event $\{X_n > \alpha \log n\} \in \sigma(X_n)$. So the sequence of events $(\{X_n > \alpha \log n\})_n$ is independent. By the Borel–Cantelli Lemma (both parts),
\[
\mathbb{P}[X_n > \alpha \log n \text{ i.o.}] = \begin{cases} 1 & \text{if } \alpha \le 1 \\ 0 & \text{if } \alpha > 1. \end{cases}
\]
⋄


Lecture 12

12.1. Joint Distributions

In the same way that we constructed the Borel sets on $\mathbb{R}$, we can consider the σ-algebra of subsets of $\mathbb{R}^d$ generated by sets of the form $(a_1, b_1] \times (a_2, b_2] \times \cdots \times (a_d, b_d]$. This is the Borel σ-algebra on $\mathbb{R}^d$, usually denoted by $\mathcal{B}^d$ or $\mathcal{B}(\mathbb{R}^d)$.

Suppose now that we have $d$ random variables $X_1, \dots, X_d$ on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Then we can define a function $X : \Omega \to \mathbb{R}^d$ by $X(\omega) = (X_1(\omega), X_2(\omega), \dots, X_d(\omega))$. Suppose we are told that this function is measurable from $(\Omega, \mathcal{F})$ to $(\mathbb{R}^d, \mathcal{B}^d)$. In this case $X$ is called an $\mathbb{R}^d$-valued random variable, and we say that $X_1, \dots, X_d$ have a joint distribution on $(\Omega, \mathcal{F}, \mathbb{P})$.

Proposition 12.1. $X_1, \dots, X_d$ are random variables if and only if $X = (X_1, \dots, X_d)$ is an $\mathbb{R}^d$-valued random variable.

Proof. Note that given a measurable function $X : (\Omega, \mathcal{F}) \to (\mathbb{R}^d, \mathcal{B}^d)$, we can always write $X = (X_1, \dots, X_d)$, and each $X_j$ is a random variable, since for any $B \in \mathcal{B}$,
\[
X_j^{-1}(B) = X^{-1}(\mathbb{R} \times \cdots \times B \times \cdots \times \mathbb{R}) \in \mathcal{F},
\]
where $B$ appears in the $j$-th coordinate.

Now suppose $X_1, \dots, X_d$ are random variables on $(\Omega, \mathcal{F}, \mathbb{P})$. We need to show that $X^{-1}(B) \in \mathcal{F}$ for all $B \in \mathcal{B}^d$. Since
\[
\mathcal{B}^d = \sigma\left( \left\{ (a_1, b_1] \times (a_2, b_2] \times \cdots \times (a_d, b_d] : a_j < b_j,\ j = 1, \dots, d \right\} \right),
\]
by Proposition 7.3 it suffices to prove that
\[
X^{-1}((a_1, b_1] \times (a_2, b_2] \times \cdots \times (a_d, b_d]) \in \mathcal{F}
\]
for all $a_j < b_j$, $j = 1, 2, \dots, d$. This follows from the fact that
\[
X^{-1}((a_1, b_1] \times \cdots \times (a_d, b_d]) = \bigcap_{j=1}^{d} X_j^{-1}((a_j, b_j]) \in \mathcal{F}.
\]
∎

The probability measure $\mathbb{P}_X$ on $(\mathbb{R}^d, \mathcal{B}^d)$ defined by $\mathbb{P}_X(B) = \mathbb{P}(X \in B) = \mathbb{P}(X^{-1}(B))$ is called the distribution of $X$. We can also define the joint distribution function of $X_1, \dots, X_d$:
\[
F_{(X_1, \dots, X_d)}(t_1, \dots, t_d) = F_X(\vec{t}\,) = \mathbb{P}(X_1 \le t_1, X_2 \le t_2, \dots, X_d \le t_d).
\]

A restatement of Proposition 11.3 is:

Proposition 12.2. Let $X_1, \dots, X_d$ be $d$ random variables on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with a joint distribution. Then $X_1, \dots, X_d$ are mutually independent if and only if
\[
\forall\, (t_1, \dots, t_d) \in \mathbb{R}^d, \qquad F_X(t_1, \dots, t_d) = F_{X_1}(t_1) \cdots F_{X_d}(t_d).
\]

Example 12.3. An urn contains 3 red balls, 4 white balls and 5 blue balls. We remove three balls from the urn, all balls equally likely. Let $X$ be the number of red balls removed and $Y$ the number of white balls removed. What is the joint distribution of $(X,Y)$?

A natural probability space here is the uniform measure on $\Omega = \{ S : |S| = 3,\ S \subset \{1,\dots,12\} \}$. Suppose $f : \{1,\dots,12\} \to \{\text{red}, \text{white}, \text{blue}\}$ is the function assigning a color to each ball in the urn. So
$$(X,Y)(S) = \big( \#\{ s \in S : f(s) = \text{red} \},\ \#\{ s \in S : f(s) = \text{white} \} \big).$$
So
$$\mathbb{P}_{(X,Y)}(\{(0,0)\}) = \mathbb{P}(\{ S \in \Omega : f(s) = \text{blue}\ \forall\, s \in S \}) = \binom{5}{3} \Big/ \binom{12}{3}.$$

For any $S \in \Omega$ we can write $S = S_r \uplus S_w \uplus S_b$, where $S_r = \{ s \in S : f(s) = \text{red} \}$ and similarly for white and blue. So $|S| = |S_r| + |S_w| + |S_b|$.

If $(X,Y)(S) = (0,1)$ then $|S_r| = 0$, $|S_w| = 1$ and $|S_b| = 2$. So
$$\mathbb{P}_{(X,Y)}(\{(0,1)\}) = \binom{3}{0}\binom{4}{1}\binom{5}{2} \Big/ \binom{12}{3}.$$

Similarly, if $(X,Y)(S) = (1,1)$ then $|S_r| = 1$, $|S_w| = 1$ and $|S_b| = 1$. So
$$\mathbb{P}_{(X,Y)}(\{(1,1)\}) = \binom{3}{1}\binom{4}{1}\binom{5}{1} \Big/ \binom{12}{3}.$$
Generally, for integers $x, y \ge 0$ such that $x + y \le 3$,
$$\mathbb{P}_{(X,Y)}(\{(x,y)\}) = \binom{3}{x}\binom{4}{y}\binom{5}{3-x-y} \Big/ \binom{12}{3}.$$
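The joint density above is easy to tabulate numerically. A minimal sketch (plain Python; the function name `joint_pmf` is an assumption of this sketch) computes it, checks that it sums to 1, and checks that the marginal of $X$ is the hypergeometric density:

```python
from math import comb

def joint_pmf(x, y):
    # P((X, Y) = (x, y)) for the urn with 3 red, 4 white, 5 blue balls
    if x < 0 or y < 0 or x + y > 3:
        return 0.0
    return comb(3, x) * comb(4, y) * comb(5, 3 - x - y) / comb(12, 3)

total = sum(joint_pmf(x, y) for x in range(4) for y in range(4))
assert abs(total - 1) < 1e-12  # the joint density sums to 1

# the marginal of X agrees with the hypergeometric density (3 special of 12)
marginal_X = [sum(joint_pmf(x, y) for y in range(4)) for x in range(4)]
hypergeom = [comb(3, x) * comb(9, 3 - x) / comb(12, 3) for x in range(4)]
assert all(abs(a - b) < 1e-12 for a, b in zip(marginal_X, hypergeom))
```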

Of course, $F_{(X,Y)}$ can be calculated by summing these. ♦

Example 12.4. Two fair coins are tossed, their outcomes independent. $X \in \{0,1\}$ is the outcome of the first coin, $Y \in \{0,1\}$ is the outcome of the second coin, and $Z = X + Y \pmod 2$. Show that $X, Z$ are independent.

First, $\mathbb{P}(X=0) = \mathbb{P}(X=1) = 1/2$. Also,
$$\mathbb{P}(Z=0) = \mathbb{P}(X=Y=0) + \mathbb{P}(X=Y=1) = \tfrac14 + \tfrac14 = \tfrac12.$$
Similarly, $\mathbb{P}(Z=1) = 1/2$. Now,
$$\mathbb{P}(X=0, Z=0) = \mathbb{P}(X=0, Y=0) = \tfrac14 = \mathbb{P}(X=0)\,\mathbb{P}(Z=0).$$
Similarly for the other values. ♦

As in the case of 1-dimensional random variables, we would like to define the information a random variable carries as some $\sigma$-algebra. Thus, for a $d$-dimensional random variable $X$ we define
$$\sigma(X) := \big\{ X^{-1}(B) : B \in \mathcal{B}^d \big\}.$$
This is a $\sigma$-algebra. Recall that if $X = (X_1,\dots,X_d)$ then
$$\sigma(X_1,\dots,X_d) = \sigma\big( \{ X_j^{-1}(B) : B \in \mathcal{B},\ 1 \le j \le d \} \big) = \sigma\big( \{ X^{-1}(\mathbb{R} \times \cdots \times B \times \cdots \times \mathbb{R}) : B \in \mathcal{B} \} \big).$$

The following proposition shows that these notations agree.

Proposition 12.5. Let $X = (X_1,\dots,X_d)$ be a $d$-dimensional random variable on some probability space. Then $\sigma(X) = \sigma(X_1,\dots,X_d)$.

Proof. Defining $\mathcal{G} = \{ B \in \mathcal{B}^d : X^{-1}(B) \in \sigma(X_1,\dots,X_d) \}$, we get that $\mathcal{G}$ is a $\sigma$-algebra, and that the set of rectangles $\mathcal{K} = \{ (a_1,b_1] \times \cdots \times (a_d,b_d] : a_j < b_j \in \mathbb{R} \}$ is contained in $\mathcal{G}$. Since $\mathcal{B}^d$ is generated by $\mathcal{K}$, we get that $\mathcal{B}^d \subset \mathcal{G}$, and so $\sigma(X) \subset \sigma(X_1,\dots,X_d)$.

On the other hand, if $B \in \mathcal{B}$ then $X_j^{-1}(B) = X^{-1}(\mathbb{R} \times \cdots \times B \times \cdots \times \mathbb{R}) \in \sigma(X)$. So $\sigma(X_1,\dots,X_d) \subset \sigma(X)$ and they are equal. $\square$

12.2. Marginals

Lemma 12.6. Let $X = (X_1,\dots,X_d)$ be a $d$-dimensional random variable on some probability space $(\Omega,\mathcal{F},\mathbb{P})$. Let $Y = (X_1,\dots,X_{d-1})$ and let $Z = (X_2,\dots,X_d)$ (which are $(d-1)$-dimensional). Then, for any $\vec{t} \in \mathbb{R}^{d-1}$,
$$F_Y(\vec{t}\,) = \lim_{t\to\infty} F_X(\vec{t}, t) \qquad \text{and} \qquad F_Z(\vec{t}\,) = \lim_{t\to\infty} F_X(t, \vec{t}\,).$$

Proof. Fix $\vec{t} = (t_1,\dots,t_{d-1}) \in \mathbb{R}^{d-1}$. Let $A = \{ X_1 \le t_1, \dots, X_{d-1} \le t_{d-1} \}$. Let $s_n \nearrow \infty$, and set $A_n = \{ X_d \le s_n \}$. Since $(s_n)$ is an increasing sequence, we have that $A_n \subset A_{n+1}$, so $(A_n)_n$ is an increasing sequence, and thus $(A \cap A_n)_n$ is also an increasing sequence. Note that
$$\lim_n A_n = \bigcup_n A_n = \{ X_d < \infty \} = \Omega,$$
so $\lim_n (A \cap A_n) = A$. Thus, using continuity of probability,
$$F_Y(\vec{t}\,) = \mathbb{P}[A] = \lim_n \mathbb{P}[A \cap A_n] = \lim_{n\to\infty} F_X(\vec{t}, s_n).$$
Since this holds for any sequence $s_n \nearrow \infty$, we have the first assertion. The proof of the second assertion is almost identical, switching the roles of the first and last coordinates. $\square$

Corollary 12.7. Let $X = (X_1,\dots,X_d)$ be a $d$-dimensional random variable on some probability space $(\Omega,\mathcal{F},\mathbb{P})$. Then, for any $t \in \mathbb{R}$,
$$F_{X_j}(t) = \lim_{t_i \to \infty,\ i \ne j} F_X(t_1, \dots, t_{j-1}, t, t_{j+1}, \dots, t_d).$$

$F_{X_j}$ above is called the marginal distribution function of $X_j$.


Lecture 13

13.1. Discrete Joint Distributions

Recall that a random variable $X$ is discrete if there exists a countable set $R$ such that $\mathbb{P}(X \in R) = 1$. $R$ is the range of $X$, and the function $f_X(r) = \mathbb{P}(X = r)$ is the density of $X$.

Suppose that $X, Y$ are discrete random variables, with ranges $R_X, R_Y$ and densities $f_X, f_Y$. Note that $R := R_X \cup R_Y$ is also a range for both $X$ and $Y$. Now we can define the joint density of $X$ and $Y$ by
$$f_{X,Y}(x,y) = \mathbb{P}(X = x, Y = y) = \mathbb{P}((X,Y) = (x,y)).$$
Note that
$$\mathbb{P}((X,Y) \notin R^2) = \mathbb{P}(\{X \notin R\} \cup \{Y \notin R\}) \le \mathbb{P}(X \notin R) + \mathbb{P}(Y \notin R) = 0.$$
So $\mathbb{P}((X,Y) \in R^2) = 1$, and we can think of $(X,Y)$ as an $\mathbb{R}^2$-valued discrete random variable.

This of course can be generalized: if $X_1, \dots, X_n$ are discrete random variables, then $(X_1,\dots,X_n)$ is an $\mathbb{R}^n$-valued random variable, supported on some countable set in $\mathbb{R}^n$, and the density of $(X_1,\dots,X_n)$ is defined to be
$$f_{(X_1,\dots,X_n)}(x_1,\dots,x_n) = \mathbb{P}(X_1 = x_1, \dots, X_n = x_n).$$

◦ in exercise

Exercise 13.1. Let $X_1,\dots,X_d$ be random variables. Show that $X = (X_1,\dots,X_d)$ is a jointly discrete random variable if and only if $X_1,\dots,X_d$ are each discrete.

Assume that $X_1,\dots,X_d$ are discrete random variables. Show that $X_1,\dots,X_d$ are mutually independent if and only if
$$\forall\, (t_1,\dots,t_d) \in \mathbb{R}^d \qquad f_{(X_1,\dots,X_d)}(t_1,\dots,t_d) = f_{X_1}(t_1)\, f_{X_2}(t_2) \cdots f_{X_d}(t_d).$$

13.2. Discrete Marginal Densities

Proposition 13.2. If $X = (X_1,\dots,X_d)$ is a discrete joint random variable with range $R^d$ and density $f_X$, then the density of $X_j$ is given by
$$f_{X_j}(x) = \sum_{x_i \in R,\ i \ne j} f_X(x_1,\dots,x_{j-1},x,x_{j+1},\dots,x_d).$$

Proof. This follows from
$$F_X(t_1,\dots,t_d) = \sum_{R \ni r_j \le t_j,\ j=1,\dots,d} f_X(r_1,\dots,r_d).$$
So
$$\lim_{t_i\to\infty,\ i \ne j} F_X(t_1,\dots,t_{j-1},t,t_{j+1},\dots,t_d) = \sum_{R \ni r_j \le t}\ \sum_{r_i \in R,\ i \ne j} f_X(r_1,\dots,r_d).$$
Thus,
$$f_{X_j}(r) = F_{X_j}(r) - F_{X_j}(r^-) = \sum_{R \ni r_j \le r}\ \sum_{r_i \in R,\ i \ne j} f_X(r_1,\dots,r_d) - \sum_{R \ni r_j < r}\ \sum_{r_i \in R,\ i \ne j} f_X(r_1,\dots,r_d) = \sum_{r_i \in R,\ i \ne j} f_X(r_1,\dots,r_{j-1},r,r_{j+1},\dots,r_d). \qquad \square$$

The above density is called the marginal density of Xj.

Example 13.3. Noga and Shir come home from school and each decides whether to do homework or watch TV. Let $X = 1$ if Noga does homework and $X = 0$ if she watches TV. Let $Y = 1$ if Shir does homework and $Y = 0$ if she watches TV. We are given that
$$\mathbb{P}((X,Y) = (1,1)) = \tfrac14, \qquad \mathbb{P}((X,Y) = (0,0)) = \tfrac12, \qquad \mathbb{P}((X,Y) = (1,0)) = \tfrac{3}{16}.$$
What are the marginal densities of $X$ and $Y$? Who is more likely to watch TV?

Note that since the density must sum to 1, $\mathbb{P}((X,Y) = (0,1)) = \tfrac{1}{16}$. Well, let's calculate the marginal densities:

$$f_X(1) = \mathbb{P}((X,Y) = (1,0)) + \mathbb{P}((X,Y) = (1,1)) = \tfrac{7}{16}.$$
So it must be that $f_X(0) = \tfrac{9}{16}$ (why?). Similarly,
$$f_Y(1) = \mathbb{P}((X,Y) = (0,1)) + \mathbb{P}((X,Y) = (1,1)) = \tfrac{5}{16},$$
so $f_Y(0) = \tfrac{11}{16}$. We see that Shir is more likely to watch TV. ♦
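The two marginal computations can be read straight off the joint density; a tiny sketch (plain Python, representing the joint density as a dict, which is an assumption of this sketch):

```python
# joint density of (X, Y) from Example 13.3
joint = {(1, 1): 1/4, (0, 0): 1/2, (1, 0): 3/16, (0, 1): 1/16}

# marginal densities: sum the joint density over the other coordinate
f_X = {x: sum(p for (a, b), p in joint.items() if a == x) for x in (0, 1)}
f_Y = {y: sum(p for (a, b), p in joint.items() if b == y) for y in (0, 1)}

assert abs(f_X[1] - 7/16) < 1e-12 and abs(f_X[0] - 9/16) < 1e-12
assert abs(f_Y[1] - 5/16) < 1e-12 and abs(f_Y[0] - 11/16) < 1e-12
assert f_Y[0] > f_X[0]  # Shir is more likely to watch TV
```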

13.3. Conditional Distributions (Discrete)

Example 13.4. After mating season sea turtles land on the beach and lay their eggs. After hatching, the baby turtles must make it back to the sea on their own; however, there are many dangers and predators awaiting. Each female hatches a Poisson-$\lambda$ number of eggs. After hatching, each baby turtle has probability $p$ of making it back to the sea, all turtles mutually independent. What is the probability that exactly $k$ turtles make it back to sea? ♦

How do we solve this? We would like to use conditional probabilities: conditioned on there being $n$ eggs, the number of surviving turtles is Binomial-$(n,p)$. So we want to define distributions conditioned on the results of other distributions.

Definition 13.5. Let $X, Y$ be discrete random variables defined on some probability space $(\Omega,\mathcal{F},\mathbb{P})$. For $y$ such that $f_Y(y) > 0$, define the random variable $X \mid Y = y$ as the random variable whose density is
$$f_{X|Y}(x \mid y) = f_{X|Y=y}(x) = \mathbb{P}(X = x \mid Y = y).$$

Note that the correct way to do this would be to define a different density $f_{X|y}(\cdot)$ for every $y$. This is captured in the notation $f_{X|Y}(\cdot \mid y)$.

Example 13.6. Let the joint density of $X, Y$ be given by:

    X \ Y     0     1
      0      0.4   0.1
      1      0.2   0.3

So $\mathbb{P}(Y=1) = 0.4$ and $\mathbb{P}(Y=0) = 0.6$. Thus,
$$f_{X|Y}(1 \mid 1) = \mathbb{P}(X=1 \mid Y=1) = 0.3/0.4 = 3/4,$$
$$f_{X|Y}(0 \mid 1) = \mathbb{P}(X=0 \mid Y=1) = 0.1/0.4 = 1/4,$$
$$f_{X|Y}(1 \mid 0) = 0.2/0.6 = 1/3,$$
$$f_{X|Y}(0 \mid 0) = 0.4/0.6 = 2/3. \qquad ♦$$

Let's return to solve the example above:

Solution of Example 13.4. Let $X$ be the number of eggs, and let $Y$ be the number of turtles that survive. What we are given is that $X \sim \mathrm{Poi}(\lambda)$ and that $Y \mid X = n \sim \mathrm{Bin}(n,p)$. That is, for all $k \le n$,
$$\mathbb{P}(Y = k \mid X = n) = \binom{n}{k} p^k (1-p)^{n-k}.$$
Thus,
$$\mathbb{P}(Y = k) = \sum_{n=0}^{\infty} \mathbb{P}(Y = k \mid X = n)\, \mathbb{P}(X = n) = \sum_{n=k}^{\infty} \binom{n}{k} p^k (1-p)^{n-k} \cdot e^{-\lambda} \frac{\lambda^n}{n!} = e^{-\lambda} \frac{p^k \lambda^k}{k!} \sum_{n=k}^{\infty} \frac{\lambda^{n-k}(1-p)^{n-k}}{(n-k)!} = e^{-\lambda} e^{\lambda(1-p)} \frac{(p\lambda)^k}{k!} = e^{-\lambda p} \frac{(\lambda p)^k}{k!}.$$
That is, the number of surviving turtles has the distribution of a Poisson-$\lambda p$ random variable. $\square$
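This "Poisson thinning" conclusion can be sanity-checked by simulation. A sketch (plain Python, standard library only; the sampler `sample_poisson` and the parameter values are assumptions of this sketch, the sampler counts unit-rate exponential arrivals up to time $\lambda$):

```python
import random

random.seed(1)
lam, p, n_trials = 6.0, 0.3, 100_000

def sample_poisson(lam):
    # count unit-rate exponential arrivals with cumulative time <= lam
    t, k = 0.0, 0
    while True:
        t += random.expovariate(1.0)
        if t > lam:
            return k
        k += 1

total = 0
for _ in range(n_trials):
    eggs = sample_poisson(lam)                            # X ~ Poi(lambda)
    total += sum(random.random() < p for _ in range(eggs))  # thin each egg with prob p
mean = total / n_trials

# if Y ~ Poi(lambda * p), then E[Y] = lambda * p
assert abs(mean - lam * p) < 0.05
```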

Exercise 13.7. Let $X, Y$ be discrete random variables. Show that
$$f_{X,Y}(x,y) = f_{X|Y}(x \mid y)\, f_Y(y)$$
for all $y$ such that $f_Y(y) > 0$.

◦ in exercise

13.4. Sums of Independent Random Variables (Discrete Case)

Suppose that $X, Y$ are random variables on some probability space. One can define the random variable $Z(\omega) = X(\omega) + Y(\omega)$, or in short $Z = X + Y$. (This is always a random variable, because $Z = \phi(X,Y)$, where $\phi(x,y) = x + y$, and since $\phi$ is continuous it is measurable.)

Suppose $X, Y$ are independent discrete random variables, with densities $f_X, f_Y$. First, from independence, we have that $\mathbb{P}(X = x, Y = y) = \mathbb{P}(X = x)\,\mathbb{P}(Y = y)$ for all $x, y$. Thus, $f_{(X,Y)}(x,y) = f_X(x) f_Y(y)$. Now, what is the density of $X + Y$? Well,
$$f_{X+Y}(z) = \mathbb{P}(X+Y = z) = \mathbb{P}\Big( \biguplus_y \{ X+Y = z, Y = y \} \Big) = \sum_y \mathbb{P}(X+Y = z, Y = y) = \sum_y \mathbb{P}(X = z-y)\,\mathbb{P}(Y = y) = \sum_y f_X(z-y) f_Y(y).$$
This leads to the following definition:

Definition 13.8. Let $f, g$ be two real-valued functions supported on a countable set $R \subset \mathbb{R}$. The (discrete) convolution of $f, g$, denoted $f * g$, is the real-valued function $f * g : \mathbb{R} \to \mathbb{R}$,
$$(f * g)(z) = \sum_{y \in R \cup (z - R)} f(z-y)\, g(y).$$
Note that $f * g = g * f$ by a change of variables.

We conclude with the observation:

Proposition 13.9. Let $X, Y$ be two independent discrete random variables on some probability space. Let $Z = X + Y$. Then $f_Z = f_X * f_Y$.
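Proposition 13.9 is easy to check numerically. A minimal sketch (plain Python; the dict-based representation of a density is an assumption of this sketch):

```python
def convolve(f, g):
    # discrete convolution of two densities given as {value: probability} dicts
    h = {}
    for x, px in f.items():
        for y, py in g.items():
            h[x + y] = h.get(x + y, 0.0) + px * py
    return h

# density of one fair die; the sum of two dice is the convolution
die = {k: 1/6 for k in range(1, 7)}
two_dice = convolve(die, die)

assert abs(two_dice[7] - 6/36) < 1e-12       # 7 is the most likely sum
assert abs(sum(two_dice.values()) - 1) < 1e-12
```

Applying `convolve` repeatedly gives the density of a sum of any number of independent discrete variables.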

Example 13.10. Consider $X_1,\dots,X_n$ mutually independent random variables where $X_j \sim \mathrm{Poi}(\lambda_j)$ for some $\lambda_1,\dots,\lambda_n > 0$. Let $Z = X_1 + \cdots + X_n$. What is the distribution of $Z$?

Well, since $f_{X_1}(x) = 0$ for $x < 0$,
$$f_{X_1+X_2}(z) = f_{X_1} * f_{X_2}(z) = \sum_{x=0}^{\infty} f_{X_1}(z-x) f_{X_2}(x) = \sum_{x=0}^{z} e^{-\lambda_1} \frac{\lambda_1^{z-x}}{(z-x)!} \cdot e^{-\lambda_2} \frac{\lambda_2^{x}}{x!} = e^{-(\lambda_1+\lambda_2)} \frac{1}{z!} \sum_{x=0}^{z} \binom{z}{x} \lambda_1^{z-x} \lambda_2^{x} = e^{-(\lambda_1+\lambda_2)} \cdot \frac{(\lambda_1+\lambda_2)^z}{z!}.$$
This is the density of the Poisson distribution! So $X_1 + X_2 \sim \mathrm{Poi}(\lambda_1 + \lambda_2)$. Continuing inductively, we conclude that $Z \sim \mathrm{Poi}(\lambda_1 + \lambda_2 + \cdots + \lambda_n)$. That is: the sum of independent Poisson random variables is again a Poisson random variable. ♦

Example 13.11. Let $X \sim \mathrm{Poi}(\alpha)$ and $Y \sim \mathrm{Poi}(\beta)$. Assume $X, Y$ are independent. What is the distribution of $X \mid X + Y = n$?

Let's calculate explicitly. Recall that $X + Y \sim \mathrm{Poi}(\alpha+\beta)$. For $k \le n$,
$$\mathbb{P}(X = k \mid X+Y = n) = \frac{\mathbb{P}(X=k)\,\mathbb{P}(Y=n-k)}{\mathbb{P}(X+Y=n)} = e^{-\alpha}\frac{\alpha^k}{k!} \cdot e^{-\beta}\frac{\beta^{n-k}}{(n-k)!} \cdot \left( e^{-(\alpha+\beta)} \frac{(\alpha+\beta)^n}{n!} \right)^{-1} = \binom{n}{k} \frac{\alpha^k \beta^{n-k}}{(\alpha+\beta)^n} = \binom{n}{k} \left( \frac{\alpha}{\alpha+\beta} \right)^k \left( 1 - \frac{\alpha}{\alpha+\beta} \right)^{n-k}.$$
So $X$ conditioned on $X + Y = n$ has the Binomial-$(n, \alpha/(\alpha+\beta))$ distribution. ♦


Lecture 14

14.1. Continuous Joint Distributions

Definition 14.1. Let $X_1,\dots,X_d$ be random variables on some probability space $(\Omega,\mathcal{F},\mathbb{P})$. We say that $X_1,\dots,X_d$ have an absolutely continuous joint distribution (or that $X = (X_1,\dots,X_d)$ is an $\mathbb{R}^d$-valued absolutely continuous random variable) if there exists a non-negative integrable function $f_X = f_{X_1,\dots,X_d} : \mathbb{R}^d \to \mathbb{R}^+$ such that for all $\vec{t} = (t_1,\dots,t_d)$,
$$F_X(\vec{t}\,) = \int_{-\infty}^{t_1} \cdots \int_{-\infty}^{t_d} f_X(\vec{s}\,)\, d\vec{s}.$$

Marginal distributions are as in the discrete case: if $X = (X_1,\dots,X_d)$ is a joint random variable then the marginal distribution of $X_j$ is
$$F_{X_j}(t) = \lim_{t_k\to\infty,\ k\ne j} F_X(t_1,\dots,t_{j-1},t,t_{j+1},\dots,t_d).$$
Specifically, if $X$ is absolutely continuous, then
$$f_{X_j}(t) = \int \cdots \int f_X(t_1,\dots,t_{j-1},t,t_{j+1},\dots,t_d)\, dt_1 \cdots dt_{j-1}\, dt_{j+1} \cdots dt_d.$$

!!! It is not true that if $X_1,\dots,X_d$ are absolutely continuous, then necessarily $X = (X_1,\dots,X_d)$ is absolutely continuous as an $\mathbb{R}^d$-valued random variable.

Example 14.2. If $X = Y$ are the same absolutely continuous random variable, then $(X,Y)$ is not absolutely continuous, since if it were, then for the Borel set $A = \{ (x,y) \in \mathbb{R}^2 : x = y \}$,
$$1 = \iint f_{X,Y}(x,y)\, dx\, dy = \iint_A f_{X,Y}(x,y)\, dx\, dy = 0,$$
since $f_{X,Y}(x,y) = 0$ if $(x,y) \notin A$, and since $A$ has measure 0 in the plane.

A more elementary way to see this (in the scope of these notes) is the following. Suppose $f_{X,Y}$ is the density of a pair $(X,Y)$ such that $X = Y$ is an absolutely continuous random variable. Let $g_s(x) = \int_{-\infty}^{s} f_{X,Y}(x,y)\, dy$. Then, since $X = Y$,
$$F_{X,Y}(\infty, s) = \mathbb{P}[X \le \infty, Y \le s] = \mathbb{P}[X \le s, Y \le s] = F_{X,Y}(s,s).$$
So
$$\int_{-\infty}^{\infty} g_s(x)\, dx = F_{X,Y}(\infty, s) = F_{X,Y}(s,s) = \int_{-\infty}^{s} g_s(x)\, dx,$$
which implies that
$$\int_{s}^{\infty} g_s(x)\, dx = 0.$$
Thus $g_s(x) = 0$ for all (actually a.e.) $x \ge s$ (since $g_s$ is a non-negative function). Now, if $x \ge s$ then
$$0 = g_s(x) = \int_{-\infty}^{s} f_{X,Y}(x,y)\, dy,$$
so $f_{X,Y}(x,y) = 0$ for all (actually a.e.) $x \ge s$, $y \le s$. Since this holds for all $s$, we have that $f_{X,Y} = 0$ for all (or a.e.) $x \ge y$. Exchanging the roles of $X, Y$ we get that $f_{X,Y} = 0$ (a.e.), which is not a density function, a contradiction! ♦

However, we have the following criterion for independence:

Proposition 14.3. $X_1,\dots,X_n$ are mutually independent absolutely continuous random variables if and only if $X = (X_1,\dots,X_n)$ is an $\mathbb{R}^n$-valued absolutely continuous random variable with density $f_X(t_1,\dots,t_n) = f_{X_1}(t_1) \cdots f_{X_n}(t_n)$.

Proof. The "if" part follows from Proposition 11.3.

For the "only if" part: let $f(t_1,\dots,t_n) = f_{X_1}(t_1) \cdots f_{X_n}(t_n)$. We need to show that
$$\mathbb{P}[X_1 \le t_1, \dots, X_n \le t_n] = \int_{-\infty}^{t_1} \cdots \int_{-\infty}^{t_n} f(s_1,\dots,s_n)\, ds_1 \cdots ds_n.$$
Indeed, this follows from independence. $\square$

14.2. Conditional Distributions (Continuous)

Recall that if $X, Y$ are discrete random variables, then
$$f_{X|Y}(x \mid y) = \mathbb{P}[X = x \mid Y = y].$$
For continuous random variables this is not defined, since $\mathbb{P}[Y = y] = 0$. However, we can still define the following.

Let $X, Y$ be absolutely continuous random variables. Let $y \in \mathbb{R}$ be such that $f_Y(y) > 0$. Define a function $f_{X|Y}(\cdot \mid y) : \mathbb{R} \to \mathbb{R}^+$ by
$$f_{X|Y}(x \mid y) := \frac{f_{(X,Y)}(x,y)}{f_Y(y)}.$$
Then, for every $y$ such that $f_Y(y) > 0$, this function is a density; indeed, recall that $f_Y(y) = \int f_{(X,Y)}(x,y)\, dx$, so
$$\int_{-\infty}^{\infty} f_{X|Y}(x \mid y)\, dx = \int \frac{f_{(X,Y)}(x,y)}{f_Y(y)}\, dx = 1.$$
Thus, $f_{X|Y}(\cdot \mid y)$ defines an absolutely continuous random variable, denoted $X \mid Y = y$, and called $X$ conditioned on $Y = y$. $f_{X|Y}(\cdot \mid y)$ is the conditional density of $X$ conditioned on $Y = y$. We have the identity
$$f_{(X,Y)}(x,y) = f_{X|Y}(x \mid y) \cdot f_Y(y).$$
For any Borel set $B \in \mathcal{B}$, we define
$$\mathbb{P}[X \in B \mid Y = y] := \mathbb{P}[(X \mid Y = y) \in B] = \int_B f_{X|Y}(x \mid y)\, dx.$$
In many cases, the information is given as conditional densities, rather than joint densities.

Example 14.4. Let $X, Y$ have joint density
$$f_{X,Y}(x,y) = \begin{cases} \frac1y e^{-x/y} e^{-y} & x, y \ge 0, \\ 0 & \text{otherwise.} \end{cases}$$
Let's compute the density of $X \mid Y$. First, the marginal density of $Y$ is
$$f_Y(y) = \int_0^{\infty} \frac1y e^{-x/y} e^{-y}\, dx = e^{-y}$$
for $y \ge 0$, and 0 otherwise. For $x, y \ge 0$,
$$f_{X|Y}(x \mid y) = \frac{\frac1y e^{-x/y} e^{-y}}{e^{-y}} = \frac1y e^{-x/y},$$
and $f_{X|Y}(x \mid y) = 0$ for $x < 0$. So $X \mid Y = y \sim \mathrm{Exp}(1/y)$. Thus, for example,
$$\mathbb{P}[X > 1 \mid Y = y] = \int_1^{\infty} f_{X|Y}(x \mid y)\, dx = e^{-1/y}. \qquad ♦$$

14.3. Sums (Continuous)

Let $X, Y$ be independent absolutely continuous random variables, and let $Z = X + Y$. What is $F_Z$?
$$\mathbb{P}[Z \le z] = \mathbb{P}[X+Y \le z] = \iint_{x+y\le z} f_{X,Y}(x,y)\, dx\, dy = \int_{-\infty}^{\infty} \int_{-\infty}^{z-y} f_X(x) f_Y(y)\, dx\, dy = \int_{-\infty}^{\infty} F_X(z-y) f_Y(y)\, dy = F_X * f_Y(z).$$
Now, note that
$$F_Z(z) = \int_{-\infty}^{\infty} \int_{-\infty}^{z-y} f_X(x) f_Y(y)\, dx\, dy = \int_{-\infty}^{z} \int_{-\infty}^{\infty} f_X(s-y) f_Y(y)\, dy\, ds = \int_{-\infty}^{z} f_X * f_Y(s)\, ds.$$
Thus, $Z = X + Y$ is an absolutely continuous random variable with density $f_X * f_Y$.

Example 14.5. What is the distribution of $X + Y$, for independent $X, Y$, when:
• $X \sim N(0,1)$ and $Y \sim N(0,1)$;
• $X \sim \mathrm{Exp}(\lambda)$ and $Y \sim \mathrm{Exp}(\alpha)$.

In the normal case we have that $\frac{(x-y)^2 + y^2}{2} = \frac{x^2}{4} + \left(y - \frac{x}{2}\right)^2$. So
$$f_{X+Y}(x) = \frac{1}{2\pi} e^{-x^2/4} \int_{-\infty}^{\infty} e^{-(y-x/2)^2}\, dy = \frac{1}{2\pi} e^{-x^2/4} \cdot \sqrt{\pi} = \frac{1}{\sqrt{2\pi}\cdot\sqrt2}\, e^{-x^2/4}.$$
So $X + Y \sim N(0, \sqrt2)$ (normal with standard deviation $\sqrt2$, i.e. variance 2).

In the exponential case (assuming $\alpha \ne \lambda$), for $x > 0$,
$$f_{X+Y}(x) = f_X * f_Y(x) = \int_{-\infty}^{\infty} \alpha\lambda\, e^{-\lambda x}\, \mathbb{1}_{[0,\infty)}(x-y)\, \mathbb{1}_{[0,\infty)}(y)\, e^{-(\alpha-\lambda)y}\, dy = \frac{\alpha\lambda}{\alpha-\lambda} e^{-\lambda x}\big(1 - e^{-(\alpha-\lambda)x}\big) = \frac{\alpha\lambda}{\alpha-\lambda}\big(e^{-\lambda x} - e^{-\alpha x}\big).$$
Since $\mathbb{P}[X+Y < 0] = 0$, this is the density of $X + Y$. ♦
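The exponential convolution formula can be sanity-checked by simulation against a numerical integral of the derived density. A sketch (plain Python, standard library only; the parameter values, sample size, and tolerance are arbitrary choices of this sketch):

```python
import math
import random

random.seed(2)
lam, alpha = 1.0, 3.0
n = 200_000
# samples of X + Y with X ~ Exp(lam), Y ~ Exp(alpha), independent
samples = [random.expovariate(lam) + random.expovariate(alpha) for _ in range(n)]

def density(x):
    # f_{X+Y}(x) = alpha*lam/(alpha - lam) * (exp(-lam x) - exp(-alpha x))
    return alpha * lam / (alpha - lam) * (math.exp(-lam * x) - math.exp(-alpha * x))

# compare P[X + Y <= 1]: empirical frequency vs a Riemann sum of the density
emp = sum(s <= 1.0 for s in samples) / n
num = sum(density(i / 10_000) for i in range(10_000)) / 10_000
assert abs(emp - num) < 0.01
```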


Lecture 15

In this lecture we want to define the average value of a random variable $X$. If $X$ is discrete, it would be intuitive to define the average as $\sum_x \mathbb{P}[X = x]\, x$. If $X$ is absolutely continuous, perhaps one would also think of defining the average analogously as $\int x f_X(x)\, dx$. But what about the general case?

We will follow Lebesgue integration theory for this task.

15.1. Preliminaries ◦ For the TA

Proposition 15.1. Let $X = (X_1,\dots,X_n)$ and $Y = (Y_1,\dots,Y_m)$ be random variables on some probability space $(\Omega,\mathcal{F},\mathbb{P})$. Then $X$ is independent of $Y$ if and only if for any measurable functions $g : \mathbb{R}^n \to \mathbb{R}$ and $h : \mathbb{R}^m \to \mathbb{R}$ we have that $g(X), h(Y)$ are independent.

Proof. For the "only if" direction: let $g, h$ be measurable functions as in the proposition. Let $A_1 \in \sigma(g(X))$, $A_2 \in \sigma(h(Y))$. Note that there exist Borel sets $B_1, B_2$ such that $A_1 = X^{-1}(g^{-1}(B_1))$ and $A_2 = Y^{-1}(h^{-1}(B_2))$. Since $g$ is measurable, $g^{-1}(B_1) \in \mathcal{B}^n$, and similarly $h^{-1}(B_2) \in \mathcal{B}^m$. Thus $A_1 \in \sigma(X)$ and $A_2 \in \sigma(Y)$, so $A_1, A_2$ are independent. Since this holds for all $A_1, A_2$, we have that $g(X), h(Y)$ are independent.

For the "if" direction: let $A_1 \in \sigma(X)$, $A_2 \in \sigma(Y)$. So $A_1 = X^{-1}(B_1)$ and $A_2 = Y^{-1}(B_2)$ for some $B_1 \in \mathcal{B}^n$ and $B_2 \in \mathcal{B}^m$. Define $g(\vec x) = \mathbb{1}_{\{\vec x \in B_1\}}$ and $h(\vec y) = \mathbb{1}_{\{\vec y \in B_2\}}$. Since these are measurable functions, and since $A_1 \cap A_2 = \{ X \in B_1, Y \in B_2 \}$, we have that
$$\mathbb{P}[A_1 \cap A_2] = \mathbb{P}[g(X) = 1, h(Y) = 1] = \mathbb{P}[g(X) = 1]\, \mathbb{P}[h(Y) = 1] = \mathbb{P}[A_1]\, \mathbb{P}[A_2].$$
Since this holds for all $A_1, A_2$, we have that $X$ is independent of $Y$. $\square$

Example 15.2. The function $g(x) = \lfloor x \rfloor$ is measurable, since for any $b \in \mathbb{R}$,
$$g^{-1}(-\infty, b] = \{ x : \lfloor x \rfloor \le b \} = (-\infty, \lfloor b \rfloor + 1) \in \mathcal{B}.$$
The function $g(x) = c \cdot x$ is measurable, since it is continuous. The function $g(x) = \max\{x, 0\}$ is measurable, since $g^{-1}(-\infty, b] = (-\infty, b]$ if $b \ge 0$ and $g^{-1}(-\infty, b] = \emptyset$ if $b < 0$. In a similar way, $g(x) = \max\{-x, 0\}$ is measurable.

Thus, if $X, Y$ are independent random variables, then $X^+ := \max\{X, 0\}$, $Y^+ := \max\{Y, 0\}$ are independent, and $X_n^+ = 2^{-n}\lfloor 2^n X^+ \rfloor$ is independent of $Y_n^+ = 2^{-n}\lfloor 2^n Y^+ \rfloor$. ♦

15.2. Simple Random Variables

◦ This dfn is important: all ω ∈ Ω are sent to some non-negative value

Let us start with simple random variables:

Definition 15.3. Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space. $X$ is a simple random variable if there exists a finite set of positive numbers $A = \{a_1,\dots,a_m\} \subset \mathbb{R}$ such that
$$\Omega = \{X = a_1\} \uplus \cdots \uplus \{X = a_m\} \uplus \{X = 0\}.$$

Specifically, such an $X$ is a discrete random variable. In this case it is intuitive that the average of $X$ should be $\sum_{k=1}^{m} a_k\, \mathbb{P}[X = a_k]$. Note that
$$X = \sum_{k=1}^{m} a_k \mathbb{1}_{\{X = a_k\}},$$
where all $a_k > 0$ and the events $\{X = a_k\}$ are pairwise disjoint.

We want to say that simple random variables somehow approximate all random variables, so that we can define the average on these. This is the content of the following definitions and proposition.

Definition 15.4. Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space. A sequence of random variables $(X_n)_n$ is monotone non-decreasing if for all $\omega \in \Omega$ and all $n$, $X_n(\omega) \le X_{n+1}(\omega)$.

Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space, and let $(X_n)_n$ be a sequence of random variables. Suppose that for every $\omega \in \Omega$ the limit $X(\omega) := \lim_{n\to\infty} X_n(\omega)$ exists. Then the function $X$ is a random variable, since $X = \limsup X_n = \liminf X_n$, and since for any Borel set $B$,
$$X^{-1}(B) = \Big\{ \omega : \liminf_{n\to\infty} X_n(\omega) \in B \Big\} = \bigcup_n \bigcap_{k \ge n} \{ \omega : X_k(\omega) \in B \} = \liminf_n X_n^{-1}(B) \in \mathcal{F}.$$
(In fact, $\lim_n X_n^{-1}(B) = X^{-1}(B)$.)

Definition 15.5. Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space, and let $(X_n)_n$ be a sequence of random variables. Suppose that for every $\omega \in \Omega$ the limit $X(\omega) := \lim_{n\to\infty} X_n(\omega)$ exists. Then the function $X$ is a random variable, and in this case we say that $X_n$ converges everywhere to $X$.

If in addition $(X_n)_n$ is a monotone non-decreasing sequence, we say that $X_n$ converges everywhere monotonely (from below) to $X$, and denote this by $X_n \nearrow X$.

Finally, a non-negative random variable $X$ is a random variable satisfying $X(\omega) \ge 0$ for all $\omega \in \Omega$. We write this as $X \ge 0$. In general, we write $X \le Y$ if $X(\omega) \le Y(\omega)$ for all $\omega \in \Omega$. (So, for example, a sequence $(X_n)_n$ is monotone non-decreasing if $X_n \le X_{n+1}$ for all $n$.)

Proposition 15.6. Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space, and $X$ a random variable. If $X$ is non-negative, then there exists a sequence $(X_n)_n$ of simple random variables such that $X_n \nearrow X$.

Proof. For all $n, k$ define $A_{n,k} := X^{-1}[k 2^{-n}, (k+1) 2^{-n})$, which is of course an event. Note that for all $n$,
$$\Omega = \biguplus_{k=0}^{2^n n - 1} A_{n,k} \uplus X^{-1}[n, \infty).$$
For all $n \ge 1$ define
$$X_n = \sum_{k=0}^{2^n n - 1} 2^{-n} k\, \mathbb{1}_{A_{n,k}} + n\, \mathbb{1}_{X^{-1}[n,\infty)}.$$
[That is, we divide the interval $[0,n)$ into intervals of length $2^{-n}$, and replace $X$ with the leftmost value of its interval, cutting this off at $n$. See Figure 4.]

So $(X_n)_n$ is a sequence of simple random variables. Note that $X_n(\omega) = \min\{ n, 2^{-n}\lfloor 2^n X(\omega) \rfloor \}$. If $X(\omega) < n$ then
$$X_n(\omega) = 2^{-n}\lfloor 2^n X(\omega) \rfloor = 2^{-n}\big\lfloor \tfrac12 \cdot 2^{n+1} X(\omega) \big\rfloor \le 2^{-(n+1)}\lfloor 2^{n+1} X(\omega) \rfloor = X_{n+1}(\omega).$$
(For all $x$: $\lfloor x/2 \rfloor \le x/2$, so $2\lfloor x/2 \rfloor \le x$.) If $n \le X(\omega) < n+1$ then $2^{n+1} n \le 2^{n+1} X(\omega)$, so
$$X_n(\omega) = n \le 2^{-(n+1)}\lfloor 2^{n+1} X(\omega) \rfloor = X_{n+1}(\omega).$$
Finally, if $X(\omega) \ge n+1$, then $X_n(\omega) = n \le n+1 = X_{n+1}(\omega)$. So $X_n(\omega) \le X_{n+1}(\omega)$ for all $\omega$ and all $n$, and $(X_n)_n$ is a non-decreasing sequence.

Now to show that $\lim_n X_n(\omega) = X(\omega)$: note that $|X(\omega) - 2^{-n}\lfloor 2^n X(\omega) \rfloor| \le 2^{-n}$. Thus, for all $\omega \in \Omega$, $\lim_{n\to\infty} X_n(\omega) = X(\omega)$. $\square$

[Figure 4. $X_n$ and $X_{n+1}$ on $[0, n+1)$.]
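The dyadic approximation $X_n = \min\{n, 2^{-n}\lfloor 2^n X\rfloor\}$ of Proposition 15.6 is concrete enough to check numerically at a single sample point. A small sketch (plain Python; the choice of $\pi$ as the sample value $X(\omega)$ is arbitrary):

```python
import math

def approx(x, n):
    # X_n = min(n, 2^{-n} * floor(2^n * x)) for a non-negative value x
    return min(n, math.floor(2**n * x) / 2**n)

x = math.pi  # an arbitrary non-negative sample value X(omega)
values = [approx(x, n) for n in range(1, 30)]

# the sequence is monotone non-decreasing in n ...
assert all(a <= b for a, b in zip(values, values[1:]))
# ... and within 2^{-n} of x once n exceeds x
assert all(abs(x - approx(x, n)) <= 2**-n for n in range(5, 30))
```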

15.3. Expectation of Non-Negative Random Variables

We want to define the expectation of a random variable by approximating it with simple random variables as in Proposition 15.6, and taking the limit. For this we have to make sure the limit exists and makes sense. Let us start by defining the expectation of a simple random variable.

Definition 15.7. If $X = \sum_{k=1}^{m} a_k \mathbb{1}_{X^{-1}(a_k)}$ is a simple random variable, define
$$\mathbb{E}[X] := \sum_{k=1}^{m} a_k\, \mathbb{P}[X = a_k].$$

Proposition 15.8. The above definition does not depend on the choice of the partition of $\Omega$; that is, if $X = \sum_{j=1}^{n} b_j \mathbb{1}_{B_j}$ where $\Omega = \biguplus_{j=1}^{n} B_j$ is a partition of $\Omega$, then $\mathbb{E}[X] = \sum_{j=1}^{n} b_j\, \mathbb{P}[B_j]$.

Proof. Suppose that
$$X = \sum_{k=1}^{m} a_k \mathbb{1}_{X^{-1}(a_k)} = \sum_{j=1}^{n} b_j \mathbb{1}_{B_j},$$
where $\Omega = \biguplus_{j=1}^{n} B_j$ is a partition of $\Omega$ as in the statement of the proposition. We assume without loss of generality that $a_m = 0$, so $\Omega = \biguplus_{k=1}^{m} \{X = a_k\}$. Then, for any $k, j$: if $\omega \in \{X = a_k\} \cap B_j$ then $a_k = X(\omega) = b_j$; otherwise, if $\{X = a_k\} \cap B_j = \emptyset$, then $\mathbb{P}[X = a_k, B_j] = 0$. Thus, for all $k, j$,
$$a_k\, \mathbb{P}[X = a_k, B_j] = b_j\, \mathbb{P}[X = a_k, B_j].$$
Since $\Omega = \biguplus_{j=1}^{n} B_j = \biguplus_{k=1}^{m} \{X = a_k\}$, by the law of total probability we obtain
$$\mathbb{E}[X] = \sum_{k=1}^{m} a_k\, \mathbb{P}[X = a_k] = \sum_{k=1}^{m} \sum_{j=1}^{n} a_k\, \mathbb{P}[X = a_k, B_j] = \sum_{j=1}^{n} \sum_{k=1}^{m} b_j\, \mathbb{P}[X = a_k, B_j] = \sum_{j=1}^{n} b_j\, \mathbb{P}[B_j]. \qquad \square$$

Proposition 15.9. If $X, Y$ are simple random variables, then:

• If $X \le Y$ then $\mathbb{E}[X] \le \mathbb{E}[Y]$.
• For all $a \ge 0$, $\mathbb{E}[aX + Y] = a\,\mathbb{E}[X] + \mathbb{E}[Y]$.

Specifically, we have for a simple random variable $X$,
$$\mathbb{E}[X] = \sup\{ \mathbb{E}[S] : 0 \le S \le X,\ S \text{ is simple} \}.$$

Proof. Assume that $X = \sum_k a_k \mathbb{1}_{\{X=a_k\}}$ and $Y = \sum_j b_j \mathbb{1}_{\{Y=b_j\}}$. If $\omega \in \Omega$ is such that $X(\omega) = a_k$ and $Y(\omega) = b_j$, then $a_k \le b_j$. So $a_k\,\mathbb{P}[X=a_k, Y=b_j] \le b_j\,\mathbb{P}[X=a_k, Y=b_j]$. Thus,
$$\mathbb{E}[X] = \sum_{k,j} a_k\, \mathbb{P}[X=a_k, Y=b_j] \le \sum_{k,j} b_j\, \mathbb{P}[X=a_k, Y=b_j] = \mathbb{E}[Y].$$
Now, note that if $Z = aX + Y$ for some $a \ge 0$, then
$$Z = \sum_{k,j} (a\, a_k + b_j)\, \mathbb{1}_{\{X=a_k, Y=b_j\}},$$
since $\Omega = \biguplus_k \{X=a_k\} \uplus \{X=0\}$ and $\Omega = \biguplus_j \{Y=b_j\} \uplus \{Y=0\}$. So
$$\mathbb{E}[Z] = \sum_{k,j} (a\, a_k + b_j)\, \mathbb{P}[X=a_k, Y=b_j] = a\,\mathbb{E}[X] + \mathbb{E}[Y]. \qquad \square$$

Definition 15.10. If $X$ is a general non-negative random variable, define
$$\mathbb{E}[X] := \sup\{ \mathbb{E}[S] : 0 \le S \le X,\ S \text{ is simple} \}$$
(where we allow the value $\mathbb{E}[X] = \infty$, and in this case we say $X$ has infinite expectation).

This definition coincides with the previous definition for simple random variables by the proposition above. ◦ important for what follows

Note that the definition immediately implies that
$$0 \le X \le Y \quad \Rightarrow \quad \mathbb{E}[X] \le \mathbb{E}[Y],$$
because the supremum for $Y$ is over a larger set.

15.4. Monotone Convergence

Theorem 15.11 (Monotone Convergence). Let $(X_n)_n$ be a non-decreasing sequence of non-negative random variables. If $X_n \nearrow X$ then $\mathbb{E}[X_n] \nearrow \mathbb{E}[X]$.

Proof. The proof has four cases:

• Case 1: $X_n$ simple, $X = \mathbb{1}_A$. For this case, let $\varepsilon > 0$, and set $A_n := \{ X_n > 1 - \varepsilon \}$. If $\omega \in A$ then $X(\omega) = 1$, and since $X_n(\omega) \nearrow X(\omega)$, we have $X_n(\omega) > 1 - \varepsilon$ for some $n$. On the other hand, if $X_n(\omega) > 1 - \varepsilon$, then $X(\omega) > 1 - \varepsilon$, and so $X(\omega) = 1$ and $\omega \in A$. Thus $(A_n)_n$ is an increasing sequence of events such that $\lim_n A_n = \bigcup_n A_n = A$. Since the $X_n$ are non-negative, $X_n \ge (1-\varepsilon)\mathbb{1}_{A_n}$. Hence $(1-\varepsilon)\,\mathbb{P}[A_n] \le \mathbb{E}[X_n] \le \mathbb{E}[X] = \mathbb{P}[A]$. Continuity of probability finishes this case.

• Case 2: $X_n, X$ simple. Assume $X = \sum_{k=0}^{m} a_k \mathbb{1}_{\{X=a_k\}}$, where $0 = a_0 < a_1 < \cdots < a_m$ and $\Omega = \{X=a_0\} \uplus \{X=a_1\} \uplus \cdots \uplus \{X=a_m\}$. Fix $k > 0$ and consider the sequence $Y_n := a_k^{-1} \mathbb{1}_{\{X=a_k\}} X_n$. $X_n \nearrow X$ implies that $Y_n \nearrow \mathbb{1}_{\{X=a_k\}}$. By Case 1, since the $Y_n$ are simple, we have that $\mathbb{E}[Y_n] \nearrow \mathbb{P}[X=a_k]$. Since $X_n \le X$ we have that $0 \le X_n \mathbb{1}_{\{X=a_0\}} \le X \mathbb{1}_{\{X=0\}} = 0$. Thus
$$X_n = X_n \cdot \sum_{k=0}^{m} \mathbb{1}_{\{X=a_k\}} = \sum_{k=1}^{m} a_k \cdot a_k^{-1} \mathbb{1}_{\{X=a_k\}} X_n.$$
By linearity of expectation for simple random variables,
$$\mathbb{E}[X_n] = \sum_k a_k\, \mathbb{E}[a_k^{-1} \mathbb{1}_{\{X=a_k\}} X_n] \nearrow \sum_k a_k\, \mathbb{P}[X=a_k] = \mathbb{E}[X].$$

• Case 3: $X_n$ simple, $X$ general (non-negative). Let $0 \le S \le X$ be a simple random variable, and $Y_n := \min\{X_n, S\}$. $X_n \nearrow X$ implies that $Y_n \nearrow S$. So Case 2 and monotonicity of expectation for simple random variables give that $\mathbb{E}[X_n] \ge \mathbb{E}[Y_n] \nearrow \mathbb{E}[S]$. Thus $\lim_n \mathbb{E}[X_n] \ge \mathbb{E}[S]$ for all simple $S \le X$. Taking the supremum, we are done.

• Case 4: $X_n, X$ general. Define $Y_n := \min\{ 2^{-n}\lfloor 2^n X_n \rfloor, n \}$. So the $Y_n$ are all simple, and $Y_n \le X_n \le X$, and so $\mathbb{E}[Y_n] \le \mathbb{E}[X_n] \le \mathbb{E}[X]$. If we show that $Y_n \nearrow X$ then we are done by Case 3. As in Proposition 15.6, $Y_n \le \min\{ 2^{-(n+1)}\lfloor 2^{n+1} X_n \rfloor, n+1 \} \le Y_{n+1}$, so $(Y_n)_n$ is a non-decreasing sequence. For all $\omega$ and $n$ such that $X_n(\omega) < n$, we have that $|Y_n(\omega) - X_n(\omega)| \le 2^{-n}$. So, for any $\omega$ and any $\varepsilon > 0$, if $n > n_0(\omega, \varepsilon)$ is large enough, then $|Y_n(\omega) - X(\omega)| < \varepsilon$. That is, $Y_n \nearrow X$. $\square$

15.5. Linearity

Lemma 15.12. Let $X, Y$ be non-negative random variables. Then, for any $a > 0$:
• $X \le Y$ implies $\mathbb{E}[X] \le \mathbb{E}[Y]$.
• $\mathbb{E}[aX + Y] = a\,\mathbb{E}[X] + \mathbb{E}[Y]$.

Proof. The first assertion is immediate from the definition, since if $0 \le S \le X$ is simple then $0 \le S \le Y$, so we are taking a supremum over a larger set.

For the second assertion, let $X_n \nearrow X$ and $Y_n \nearrow Y$ be sequences of simple random variables converging everywhere monotonely to $X, Y$ respectively, as in Proposition 15.6. So $aX_n + Y_n \nearrow aX + Y$. Monotone convergence tells us that
$$\mathbb{E}[X_n] \nearrow \mathbb{E}[X], \qquad \mathbb{E}[Y_n] \nearrow \mathbb{E}[Y], \qquad \mathbb{E}[aX_n + Y_n] \nearrow \mathbb{E}[aX + Y].$$
Since for all $n$, $\mathbb{E}[aX_n + Y_n] = a\,\mathbb{E}[X_n] + \mathbb{E}[Y_n]$, we are done. $\square$

15.6. Expectation

Finally, let us define the expectation of a general random variable. First, some notation: let $X$ be a random variable. Define $X^+ := \max\{X, 0\}$ and $X^- := \max\{-X, 0\}$. Note that $X^+, X^-$ are non-negative random variables, and that $X = X^+ - X^-$ and $|X| = X^+ + X^-$.

Definition 15.13. Let $X$ be a random variable. If $\mathbb{E}[X^+] = \mathbb{E}[X^-] = \infty$, we say that the expectation of $X$ is not defined (or $X$ does not have expectation). If at least one of $\mathbb{E}[X^+], \mathbb{E}[X^-]$ is finite, then define the expectation of $X$ by
$$\mathbb{E}[X] := \mathbb{E}[X^+] - \mathbb{E}[X^-].$$

15.6.1. Properties of Expectation. The most important property of expectation is linearity:

Theorem 15.14. Let $X, Y$ be random variables on some probability space $(\Omega,\mathcal{F},\mathbb{P})$ such that $\mathbb{E}[X], \mathbb{E}[Y]$ exist. Then:
• For all $a \in \mathbb{R}$, $\mathbb{E}[aX + Y] = a\,\mathbb{E}[X] + \mathbb{E}[Y]$.
• $X \le Y$ implies that $\mathbb{E}[X] \le \mathbb{E}[Y]$.
• $|\mathbb{E}[X]| \le \mathbb{E}[|X|]$.

Proof. For linearity, note that $(X+Y)^+ - (X+Y)^- = X + Y = X^+ - X^- + Y^+ - Y^-$, so
$$(X+Y)^+ + X^- + Y^- = (X+Y)^- + X^+ + Y^+.$$
Since both sides are linear combinations of non-negative random variables,
$$\mathbb{E}(X+Y)^+ + \mathbb{E} X^- + \mathbb{E} Y^- = \mathbb{E}(X+Y)^- + \mathbb{E} X^+ + \mathbb{E} Y^+,$$
which implies that
$$\mathbb{E}[X+Y] = \mathbb{E}(X+Y)^+ - \mathbb{E}(X+Y)^- = \mathbb{E} X^+ - \mathbb{E} X^- + \mathbb{E} Y^+ - \mathbb{E} Y^- = \mathbb{E} X + \mathbb{E} Y.$$
Also, for any $a > 0$, $(aX)^+ = aX^+$ and $(aX)^- = aX^-$; if $a < 0$ then $(aX)^+ = -aX^-$ and $(aX)^- = -aX^+$. Thus, in both cases $\mathbb{E}[aX] = a\,\mathbb{E}[X]$. This proves linearity.

For the second assertion, let $Z = Y - X$. So $Z \ge 0$, and thus $\mathbb{E}[Z] \ge 0$. Since $\mathbb{E}[Z] = \mathbb{E}[Y] - \mathbb{E}[X]$ by linearity, we are done.

Finally, for the last assertion,
$$|\mathbb{E}[X]| = |\mathbb{E}[X^+] - \mathbb{E}[X^-]| \le \mathbb{E}[X^+] + \mathbb{E}[X^-] = \mathbb{E}[|X|],$$
where we have used the fact that $X^+ \ge 0$ and $X^- \ge 0$. $\square$


Lecture 16

16.1. Expectation - Discrete Case

◦ Expectation for discrete RVs

Proposition 16.1. Let $X$ be a discrete random variable, with range $R$ and density $f_X$. Then,
$$\mathbb{E}[X] = \sum_{r \in R} r f_X(r).$$

Proof. For all $N$, let
$$X_N^+ := \sum_{r \in R \cap [0,N]} \mathbb{1}_{\{X=r\}}\, r \qquad \text{and} \qquad X_N^- := -\sum_{r \in R \cap [-N,0]} \mathbb{1}_{\{X=r\}}\, r.$$
Note that $X_N^+ \nearrow X^+$ and $X_N^- \nearrow X^-$. Moreover, by linearity,
$$\mathbb{E}[X_N^+] = \sum_{r \in R \cap [0,N]} \mathbb{P}[X=r]\, r \qquad \text{and} \qquad \mathbb{E}[X_N^-] = -\sum_{r \in R \cap [-N,0]} \mathbb{P}[X=r]\, r.$$
Using monotone convergence we get that
$$\mathbb{E}[X] = \mathbb{E}[X^+] - \mathbb{E}[X^-] = \lim_{N\to\infty} \big( \mathbb{E}[X_N^+] - \mathbb{E}[X_N^-] \big) = \sum_{r \in R} f_X(r)\, r. \qquad \square$$

◦ Examples: Ber, Bin, Poi, Geo

Example 16.2. Let us calculate the expectations of different discrete random variables:

• If $X \sim \mathrm{Ber}(p)$ then
$$\mathbb{E}[X] = 1 \cdot \mathbb{P}[X=1] + 0 \cdot \mathbb{P}[X=0] = p.$$

• If $X \sim \mathrm{Bin}(n,p)$ then, since $\binom{n}{k} \cdot k = n \cdot \binom{n-1}{k-1}$ for $1 \le k \le n$,
$$\mathbb{E}[X] = \sum_{k=0}^{n} f_X(k) \cdot k = \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k}\, k = np \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1} (1-p)^{n-k} = np.$$
An easier way would be to note that $X = \sum_{k=1}^{n} X_k$ where $X_k \sim \mathrm{Ber}(p)$ (and in fact $X_1,\dots,X_n$ are independent). Thus, by linearity, $\mathbb{E}[X] = \sum_{k=1}^{n} \mathbb{E}[X_k] = np$.

• For $X \sim \mathrm{Poi}(\lambda)$:
$$\mathbb{E}[X] = \sum_{k=0}^{\infty} e^{-\lambda} \frac{\lambda^k}{k!}\, k = \lambda \sum_{k=1}^{\infty} e^{-\lambda} \frac{\lambda^{k-1}}{(k-1)!} = \lambda.$$

• For $X \sim \mathrm{Geo}(p)$: note that for $g(x) = (1-x)^k$ we have $\frac{\partial}{\partial x} g(x) = -k(1-x)^{k-1}$. So
$$\mathbb{E}[X] = \sum_{k=1}^{\infty} f_X(k)\, k = \sum_{k=1}^{\infty} (1-p)^{k-1} p\, k = -p \sum_{k=1}^{\infty} \frac{\partial}{\partial p} (1-p)^k = -p \cdot \frac{\partial}{\partial p} \left( \sum_{k=1}^{\infty} (1-p)^k \right) = -p \cdot \frac{\partial}{\partial p} \left( \frac1p - 1 \right) = -p \cdot \left( -\frac{1}{p^2} \right) = \frac1p.$$

◦ Do this one

Another way: let $E = \mathbb{E}[X]$. Then,
$$E = p + \sum_{k=2}^{\infty} (1-p)^{k-1} p\, k = p + \sum_{k=2}^{\infty} (1-p)^{k-1} p\, (k-1) + \sum_{k=2}^{\infty} (1-p)^{k-1} p = p + (1-p)E + 1 - p.$$
So $E = 1 + (1-p)E$, i.e. $E = 1/p$. ♦
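The identity $\mathbb{E}[\mathrm{Geo}(p)] = 1/p$ is easy to check by simulation. A sketch (plain Python; sampling by counting Bernoulli trials until the first success, with arbitrary parameter choices):

```python
import random

random.seed(3)
p, n = 0.25, 200_000

def geometric(p):
    # number of Bernoulli(p) trials up to and including the first success
    k = 1
    while random.random() >= p:
        k += 1
    return k

mean = sum(geometric(p) for _ in range(n)) / n
assert abs(mean - 1 / p) < 0.05  # 1/p = 4
```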

Example 16.3. A pair of independent fair dice is tossed. What is the expected number of tosses needed to see Shesh-Besh (a 6 and a 5)? Note that each toss of the dice is an independent trial, such that the probability of seeing Shesh-Besh is $2/36 = 1/18$. So if $X$ is the number of tosses until Shesh-Besh, then $X \sim \mathrm{Geo}(1/18)$. Thus, $\mathbb{E}[X] = 18$. ♦

16.1.1. Function of a random variable. Let $g : \mathbb{R}^d \to \mathbb{R}$ be a measurable function. Let $(X_1,\dots,X_d)$ be a joint distribution of $d$ discrete random variables, each with range $R$. Then $Y = g(X_1,\dots,X_d)$ is a random variable. What is its expectation? Well, first we need the density of $Y$: for any $y \in \mathbb{R}$ we have that
$$\mathbb{P}[Y = y] = \mathbb{P}[(X_1,\dots,X_d) \in g^{-1}(\{y\})] = \sum_{(r_1,\dots,r_d) \in R^d \cap g^{-1}(\{y\})} \mathbb{P}[(X_1,\dots,X_d) = (r_1,\dots,r_d)].$$
So, if $R_Y := g(R^d)$, then since
$$R^d = \biguplus_{y \in R_Y} R^d \cap g^{-1}(\{y\}),$$
and since $Y$ is discrete, we get that
$$\mathbb{E}[g(X_1,\dots,X_d)] = \sum_{y \in R_Y} y\, \mathbb{P}[Y = y] = \sum_{y \in R_Y} \sum_{\vec r \in R^d \cap g^{-1}(\{y\})} f_{(X_1,\dots,X_d)}(\vec r\,)\, g(\vec r\,) = \sum_{\vec r \in R^d} f_{(X_1,\dots,X_d)}(\vec r\,)\, g(\vec r\,).$$
Specifically:

Proposition 16.4. Let $g : \mathbb{R}^d \to \mathbb{R}$ be a measurable function, and $X = (X_1,\dots,X_d)$ a discrete joint random variable with range $R^d$. Then,
$$\mathbb{E}[g(X)] = \sum_{\vec r \in R^d} f_X(\vec r\,)\, g(\vec r\,).$$

Example 16.5. Manchester United plans to earn some money selling Wayne Rooney jerseys. Each jersey costs the club $x$ pounds, and is sold for $y > x$ pounds. Suppose that the number of people who want to buy jerseys is a discrete random variable $X$ with range $\mathbb{N}$. What is the expected profit if the club orders $N$ jerseys? How many jerseys should be ordered in order to maximize this profit?

Solution. Let $p_k = f_X(k) = \mathbb{P}[X = k]$. Let $g_N : \mathbb{N} \to \mathbb{R}$ be the function that takes the number of people who want to buy jerseys and gives the profit when the club has ordered $N$ jerseys. That is,
$$g_N(k) = \begin{cases} ky - Nx & k \le N, \\ N(y-x) & k > N. \end{cases}$$
The expected profit is then
$$\mathbb{E}[g_N(X)] = \sum_{k=0}^{\infty} p_k\, g_N(k) = \sum_{k=0}^{\infty} p_k N(y-x) - \sum_{k=0}^{N} p_k (N-k) y = N(y-x) - \sum_{k=0}^{N} p_k (N-k) y.$$
Call this $G(N) := N(y-x) - \sum_{k=0}^{N} p_k (N-k) y$. We want to maximize this as a function of $N$. Note that
$$G(N+1) - G(N) = y - x - \sum_{k=0}^{N} y\, p_k = y - x - y\, \mathbb{P}[X \le N].$$
So $G(N+1) > G(N)$ as long as $\mathbb{P}[X \le N] < \frac{y-x}{y}$, so the club should order $N+1$ jerseys for the largest $N$ such that $\mathbb{P}[X \le N] < \frac{y-x}{y}$. ♦

Example 16.6. Some more calculations:

• $X \sim H(N, m, n)$ (recall that $X$ is the number of "special" objects chosen, when choosing $n$ objects uniformly out of $N$ objects, with a total of $m$ special objects). We use the facts that
$$k \binom{m}{k} = m \binom{m-1}{k-1} \qquad \text{and} \qquad \binom{N}{n} = \frac{N}{n} \binom{N-1}{n-1}.$$
Then
$$\mathbb{E}[X] = \sum_{k=0}^{n \wedge m} \binom{N}{n}^{-1} \binom{m}{k} \binom{N-m}{n-k}\, k = \frac{n}{N}\, m \sum_{k=1}^{n \wedge m} \binom{N-1}{n-1}^{-1} \binom{m-1}{k-1} \binom{(N-1)-(m-1)}{(n-1)-(k-1)} = \frac{nm}{N}.$$

• $X \sim \mathrm{NB}(m, p)$ (the number of trials until the $m$-th success). Using
$$k \binom{k-1}{m-1} = m \binom{k}{m},$$
with $j = k+1$ and $n = m+1$,
$$\mathbb{E}[X] = \sum_{k=m}^{\infty} k \binom{k-1}{m-1} p^m (1-p)^{k-m} = \frac{m}{p} \sum_{j=n}^{\infty} \binom{j-1}{n-1} p^n (1-p)^{j-n} = \frac{m}{p}. \qquad ♦$$

Introduction to Probability

201.1.8001

Ariel Yadin

Lecture 17

17.1. Expectation – Continuous Case

Our goal is now to prove the following theorem:

Theorem 17.1. Let $X=(X_1,\dots,X_d)$ be an absolutely continuous random variable, and let $g:\mathbb{R}^d\to\mathbb{R}$ be a measurable function. Then,
$$\mathbb{E}[g(X)] = \int_{\mathbb{R}^d} g(\vec x)\,f_X(\vec x)\,d\vec x.$$

The main lemma here is

Lemma 17.2. Let $X=(X_1,\dots,X_d)$ be an absolutely continuous random variable. Then, for any Borel set $B\in\mathcal{B}_d$,
$$\mathbb{P}[X\in B] = \int_B f_X(\vec x)\,d\vec x.$$

Proof. Let $Q(B) = \int_B f_X(\vec x)\,d\vec x$. Then $Q$ is a probability measure on $(\mathbb{R}^d,\mathcal{B}_d)$ that coincides with $\mathbb{P}_X$ on the $\pi$-system of rectangles $(-\infty,b_1]\times\cdots\times(-\infty,b_d]$. Thus $\mathbb{P}[X\in B] = \mathbb{P}_X(B) = Q(B)$ for all $B\in\mathcal{B}_d$. □

Remark 17.3. We have not really defined the integral $\int_B f_X(\vec x)\,d\vec x$. However, for our purposes, we can define it as $\mathbb{P}[X\in B]$, and note that this coincides with the Riemann integral on intervals.

Proof of Theorem 17.1. First assume that $g\ge 0$, so $\mathbb{R}^d = g^{-1}[0,\infty)$. For all $n$ define $Y_n = 2^{-n}\lfloor 2^n g(X)\rfloor$, which are discrete non-negative random variables.

First, we show that
$$\mathbb{E}[Y_n] = \int_{\mathbb{R}^d} 2^{-n}\lfloor 2^n g(\vec x)\rfloor f_X(\vec x)\,d\vec x.$$
Indeed, for $n,k\ge 0$ let $B_{n,k} = g^{-1}[2^{-n}k,\,2^{-n}(k+1))\in\mathcal{B}_d$. Note that
$$Y_n = \sum_{k=0}^\infty 2^{-n}k\,\mathbf{1}_{\{X\in B_{n,k}\}} \qquad\text{and}\qquad \mathbb{E}[Y_n] = \sum_{k=0}^\infty 2^{-n}k\,\mathbb{P}[X\in B_{n,k}].$$
Now, since
$$\mathbb{R}^d = g^{-1}[0,\infty) = \biguplus_{k=0}^\infty g^{-1}[2^{-n}k,\,2^{-n}(k+1)) = \biguplus_{k=0}^\infty B_{n,k},$$
and since
$$\mathbf{1}_{B_{n,k}}\cdot 2^{-n}\lfloor 2^n g\rfloor f_X = \mathbf{1}_{B_{n,k}}\cdot 2^{-n}k\,f_X,$$
we have by the lemma above that
$$\int_{\mathbb{R}^d} 2^{-n}\lfloor 2^n g(\vec x)\rfloor f_X(\vec x)\,d\vec x = \sum_{k=0}^\infty\int_{B_{n,k}} 2^{-n}k\,f_X(\vec x)\,d\vec x = \sum_{k=0}^\infty 2^{-n}k\,\mathbb{P}[X\in B_{n,k}] = \mathbb{E}[Y_n].$$
Here we have used the fact that
$$\sum_{k=0}^N \mathbf{1}_{B_{n,k}}\,2^{-n}k \nearrow \sum_{k=0}^\infty \mathbf{1}_{B_{n,k}}\,2^{-n}k.$$
Since $|2^{-n}\lfloor 2^n g\rfloor - g|\le 2^{-n}$, we get that $|Y_n - g(X)|\le 2^{-n}$. Thus,
$$\Big|\mathbb{E}[g(X)] - \int_{\mathbb{R}^d} g(\vec x)f_X(\vec x)\,d\vec x\Big| \le \big|\mathbb{E}[g(X)] - \mathbb{E}[Y_n]\big| + \Big|\int_{\mathbb{R}^d} g(\vec x)f_X(\vec x)\,d\vec x - \int_{\mathbb{R}^d} 2^{-n}\lfloor 2^n g(\vec x)\rfloor f_X(\vec x)\,d\vec x\Big| \le 2\cdot 2^{-n}\to 0.$$
This proves the theorem for non-negative functions $g$.

Now, if $g$ is a general measurable function, consider $g = g^+ - g^-$. Since $g^+,g^-$ are non-negative, we have that
$$\mathbb{E}[g(X)] = \mathbb{E}[g^+(X) - g^-(X)] = \mathbb{E}[g^+(X)] - \mathbb{E}[g^-(X)] = \int_{\mathbb{R}^d}\big(g^+(\vec x) - g^-(\vec x)\big)f_X(\vec x)\,d\vec x. \qquad □$$

Corollary 17.4. Let $X$ be an absolutely continuous random variable with density $f_X$. Then,
$$\mathbb{E}[X] = \int_{-\infty}^\infty t\,f_X(t)\,dt.$$

Compare $\int_{\mathbb{R}} t f_X(t)\,dt$ to $\sum_r r\,\mathbb{P}[X=r]$ in the discrete case. This is another place where $f_X$ is "like" $\mathbb{P}[X=\cdot]$ (although the latter is identically $0$ in the continuous case, as we have seen).

Example 17.5. Expectations of some absolutely continuous random variables:

• $X\sim U[0,1]$: $\mathbb{E}[X] = \int_0^1 t\,dt = 1/2$.

• More generally, $X\sim U[a,b]$:
$$\mathbb{E}[X] = \int_a^b t\cdot\frac{1}{b-a}\,dt = \frac{b^2-a^2}{2(b-a)} = \frac{b+a}{2}.$$

• $X\sim\mathrm{Exp}(\lambda)$: We use integration by parts, since $\int e^{-\lambda t}\,dt = -\lambda^{-1}e^{-\lambda t}$:
$$\mathbb{E}[X] = \int_0^\infty t\cdot\lambda e^{-\lambda t}\,dt = -te^{-\lambda t}\Big|_0^\infty + \int_0^\infty e^{-\lambda t}\,dt = \frac1\lambda.$$

• $X\sim N(\mu,\sigma)$:
$$\mathbb{E}[X] = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^\infty t\exp\Big(-\frac{(t-\mu)^2}{2\sigma^2}\Big)\,dt.$$
Change $u = t-\mu$, so $du = dt$:
$$\mathbb{E}[X] = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^\infty u\exp\Big(-\frac{u^2}{2\sigma^2}\Big)\,du + \mu\int_{-\infty}^\infty f_X(t)\,dt.$$
Since the function $u\mapsto u\exp(-\frac{u^2}{2\sigma^2})$ is odd, its integral is $0$, so
$$\mathbb{E}[X] = \mu\int_{-\infty}^\infty f_X(t)\,dt = \mu.$$
A simpler way is to notice that $\frac{X-\mu}{\sigma}\sim N(0,1)$, so
$$\mathbb{E}\Big[\frac{X-\mu}{\sigma}\Big] = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty t e^{-t^2/2}\,dt = 0,$$
as an integral of an odd function. $\mathbb{E}[X]=\mu$ follows from linearity of expectation.

△
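The three expectations in Example 17.5 can be checked by numerically integrating $\int t f_X(t)\,dt$. The parameter values below ($U[2,5]$, $\mathrm{Exp}(2)$, $N(1,3)$) are arbitrary choices for illustration; the exponential and normal integrals are truncated where the tails are negligible.

```python
import math

def riemann(f, a, b, n=100000):
    # plain midpoint Riemann sum of f on [a, b]
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# E[X] for X ~ U[2, 5]: integral of t / (b - a) over [2, 5]; exact value 3.5
e_unif = riemann(lambda t: t / 3.0, 2.0, 5.0)

# E[X] for X ~ Exp(2): integral of t * 2 e^{-2t}, truncated at t = 30; exact value 0.5
e_exp = riemann(lambda t: t * 2.0 * math.exp(-2.0 * t), 0.0, 30.0)

# E[X] for X ~ N(1, 3): integral of t * density over +-30 sd; exact value mu = 1
mu, sigma = 1.0, 3.0
e_norm = riemann(lambda t: t * math.exp(-(t - mu) ** 2 / (2 * sigma ** 2))
                 / (sigma * math.sqrt(2 * math.pi)),
                 mu - 30 * sigma, mu + 30 * sigma)
print(e_unif, e_exp, e_norm)
```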

Proposition 17.6. Let $X$ be an absolutely continuous random variable such that $\mathbb{E}[X]$ exists. Then,
$$\mathbb{E}[X] = \int_0^\infty\big(\mathbb{P}[X>t] - \mathbb{P}[X\le -t]\big)\,dt = \int_0^\infty\big(1 - F_X(t) - F_X(-t)\big)\,dt.$$

Proof. Note that
$$\int_0^\infty\mathbb{P}[X>t]\,dt = \int_0^\infty\int_t^\infty f_X(s)\,ds\,dt = \int_0^\infty\int_0^s dt\,f_X(s)\,ds = \int_0^\infty s f_X(s)\,ds.$$
Similarly,
$$\int_0^\infty\mathbb{P}[X\le -t]\,dt = \int_0^\infty\int_{-\infty}^{-t} f_X(s)\,ds\,dt = \int_{-\infty}^0\int_0^{-s}dt\,f_X(s)\,ds = -\int_{-\infty}^0 s f_X(s)\,ds.$$
Subtracting the two, we have the result. □

In a similar way we can prove (in the exercises):

Exercise 17.7. Let $X$ be a discrete random variable with range $\mathbb{Z}$ such that $\mathbb{E}[X]$ exists. Then,
$$\mathbb{E}[X] = \sum_{k=0}^\infty\big(\mathbb{P}[X>k] - \mathbb{P}[X<-k]\big).$$

Example 17.8. Let $X\sim N(0,1)$. Compute $\mathbb{E}[X^2]$. By the above,
$$\mathbb{E}[X^2] = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} x^2 e^{-x^2/2}\,dx = -\frac{1}{\sqrt{2\pi}}\,xe^{-x^2/2}\Big|_{-\infty}^\infty + \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} e^{-x^2/2}\,dx = 1,$$
where we have used integration by parts, with $\frac{\partial}{\partial x}e^{-x^2/2} = -xe^{-x^2/2}$. △

17.2. Examples Using Linearity

Example 17.9 (Coupon Collector). Gilad collects "super-goal" cards. There are $N$ cards to collect altogether. Each time he buys a card, he gets one of the $N$ uniformly at random, independently. What is the expected number of cards Gilad needs to buy in order to collect all cards?

For $k = 0,1,\dots,N-1$, let $T_k$ be the number of cards bought after getting the $k$-th new card, until getting the $(k+1)$-th new card. That is, when Gilad has $k$ different cards, he buys $T_k$ more cards until he has $k+1$ different cards.

If Gilad has $k$ different cards, then with probability $\frac{N-k}{N}$ he buys a card he does not already have. So $T_k\sim\mathrm{Geo}\big(\frac{N-k}{N}\big)$.

Since the total number of cards Gilad buys until getting all $N$ cards is $T = T_0 + T_1 + T_2 + \cdots + T_{N-1}$, using linearity of expectation,
$$\mathbb{E}[T] = \mathbb{E}[T_0] + \mathbb{E}[T_1] + \cdots + \mathbb{E}[T_{N-1}] = 1 + \frac{N}{N-1} + \cdots + N = N\sum_{k=1}^N\frac1k. \qquad △$$
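The formula $\mathbb{E}[T] = N\sum_{k=1}^N 1/k$ can be checked by direct simulation; the choice $N=20$ below is arbitrary.

```python
import random

random.seed(0)

def collect(N):
    # number of uniform purchases until all N distinct cards are seen
    seen, count = set(), 0
    while len(seen) < N:
        seen.add(random.randrange(N))
        count += 1
    return count

N, trials = 20, 20000
avg = sum(collect(N) for _ in range(trials)) / trials
harmonic = N * sum(1.0 / k for k in range(1, N + 1))
print(avg, harmonic)  # both close to 20 * H_20, roughly 72
```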

Example 17.10 (Permutation fixed points). $n$ soldiers receive packages from their families. The packages get mixed up at the post office, so that each ordering is equally likely. Let $X$ be the number of soldiers that receive the correct package from their own family. What is the expectation of $X$?

Let us solve this by writing $X$ as a sum. Let $X_i$ be the indicator of the event that soldier $i$ receives the correct package. What is the probability that $X_i = 1$? There are $(n-1)!$ possible orderings for which this may happen, so $\mathbb{E}[X_i] = \frac{(n-1)!}{n!} = \frac1n$. Note that $X = \sum_{i=1}^n X_i$, so by linearity, $\mathbb{E}[X] = n\cdot\frac1n = 1$.

Note also that the $X_i$'s are not independent; for example, if $i\ne j$,
$$\mathbb{P}[X_i = 1, X_j = 1] = \frac{(n-2)!}{n!} = \frac{1}{n(n-1)} \ne \frac{1}{n^2}.$$
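The striking conclusion of Example 17.10 — that $\mathbb{E}[X] = 1$ no matter how many soldiers there are — is easy to confirm by simulation; $n = 10$ below is an arbitrary choice.

```python
import random

random.seed(1)

n, trials = 10, 50000
total = 0
for _ in range(trials):
    perm = list(range(n))
    random.shuffle(perm)            # a uniformly random assignment of packages
    total += sum(1 for i, p in enumerate(perm) if i == p)  # count fixed points
avg = total / trials
print(avg)  # close to 1, independently of n
```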

△

Example 17.11. We toss a die 100 times. What is the expected sum of all tosses?

Here it really begs to use linearity. If $X_k$ is the outcome of the $k$-th toss, and $X = \sum_{k=1}^{100}X_k$, then
$$\mathbb{E}[X] = \sum_{k=1}^{100}\mathbb{E}[X_k] = 100\cdot\frac72 = 350. \qquad △$$

Example 17.12. 200 random numbers are output by a computer, each distributed uniformly on $[0,1]$. What is their expected sum?
$$\mathbb{E}[X] = 200\cdot\mathbb{E}[U[0,1]] = 200\cdot\frac12 = 100.$$

△

Example 17.13. Let $X_n\sim U[0,2^{-n}]$ for $n\ge 0$, and let $S_N = \sum_{k=0}^N X_k$. What is the expectation of $S_N$?

Linearity of expectation gives
$$\mathbb{E}[S_N] = \sum_{k=0}^N\mathbb{E}[X_k] = \sum_{k=0}^N 2^{-(k+1)} = 1 - 2^{-(N+1)}.$$
Note that if $S_\infty = \sum_{k=0}^\infty X_k$, then $S_N\nearrow S_\infty$, so monotone convergence gives that $\mathbb{E}[S_\infty] = 1$. △

17.3. A formula for general random variables

Theorem 17.14. Let $X$ be a random variable. Then
$$\mathbb{E}[X^+] = \int_0^\infty\big(1 - F_X(t)\big)\,dt \qquad\text{and}\qquad \mathbb{E}[X^-] = \int_0^\infty F_X(-t)\,dt.$$
Thus, if $\mathbb{E}[X]$ exists then
$$\mathbb{E}[X] = \int_0^\infty\big(1 - F_X(t) - F_X(-t)\big)\,dt.$$

Proof. First suppose that $X\ge 0$ and discrete. Then, if the range of $X$ is $R_X = \{0 = r_0 < r_1 < r_2 < \cdots\}$, we have for any $t\in[r_j,r_{j+1})$,
$$\mathbb{P}[X>t] = \mathbb{P}[X\ge r_{j+1}] = \sum_{n=j+1}^\infty\mathbb{P}[X=r_n].$$
Thus,
$$\int_{r_j}^{r_{j+1}}\big(1-F_X(t)\big)\,dt = (r_{j+1}-r_j)\sum_{n=j+1}^\infty\mathbb{P}[X=r_n].$$
Summing over $j$ and interchanging the sums, we have
$$\int_0^\infty\big(1-F_X(t)\big)\,dt = \sum_{j=0}^\infty\int_{r_j}^{r_{j+1}}\big(1-F_X(t)\big)\,dt = \sum_{j=0}^\infty(r_{j+1}-r_j)\sum_{n=j+1}^\infty\mathbb{P}[X=r_n] = \sum_{n=1}^\infty\mathbb{P}[X=r_n]\sum_{j=0}^{n-1}(r_{j+1}-r_j) = \sum_{n=1}^\infty r_n\,\mathbb{P}[X=r_n] = \mathbb{E}[X].$$
Since for $t>0$ the positivity of $X$ tells us that $F_X(-t) = 0$, we have the theorem for discrete non-negative random variables.

We also have the formula: for $t\in(r_j,r_{j+1}]$,
$$\mathbb{P}[X\ge t] = \mathbb{P}[X\ge r_{j+1}] = \sum_{n=j+1}^\infty\mathbb{P}[X=r_n].$$
Thus,
$$\int_{r_j}^{r_{j+1}}\mathbb{P}[X\ge t]\,dt = (r_{j+1}-r_j)\sum_{n=j+1}^\infty\mathbb{P}[X=r_n] = \int_{r_j}^{r_{j+1}}\mathbb{P}[X>t]\,dt,$$
so
$$(17.1)\qquad \mathbb{E}[X] = \int_0^\infty\mathbb{P}[X>t]\,dt = \int_0^\infty\mathbb{P}[X\ge t]\,dt.$$
Now, if $X\ge 0$ is any general non-negative random variable, let $X_n := 2^{-n}\lfloor 2^n X\rfloor$. So $X_n$ is discrete, non-negative, and $|X_n - X|\le 2^{-n}$. Also, because $X_n\le X$, for any $t>0$,
$$|F_{X_n}(t) - F_X(t)| = \big|\mathbb{P}[X_n\le t] - \mathbb{P}[X\le t]\big| = \mathbb{P}[X_n\le t,\, X>t] \le \mathbb{P}[t<X\le t+2^{-n}] = F_X(t+2^{-n}) - F_X(t)\to 0$$
as $n\to\infty$, by right continuity. Since $X_n\le X_{n+1}$ we get that $1 - F_{X_n}(t)\nearrow 1 - F_X(t)$, and by monotone convergence,
$$\mathbb{E}[X_n] = \int_0^\infty\big(1-F_{X_n}(t)\big)\,dt \nearrow \int_0^\infty\big(1-F_X(t)\big)\,dt.$$
However, since $|\mathbb{E}[X_n] - \mathbb{E}[X]|\le 2^{-n}$, we obtain the theorem for general non-negative random variables.

Finally, if $X = X^+ - X^-$ is any random variable, we have for all $t\ge 0$ that $F_{X^+}(t) = \mathbb{P}[X^+\le t] = \mathbb{P}[0\le X\le t] + \mathbb{P}[X<0] = \mathbb{P}[X\le t] = F_X(t)$, and since $X^+\ge 0$,
$$\mathbb{E}[X^+] = \int_0^\infty\big(1-F_X(t)\big)\,dt.$$
Also, since $X^- = (-X)^+$, for all $t\ge 0$ we have $\mathbb{P}[X^-\ge t] = \mathbb{P}[-X\ge t] = F_X(-t)$, so by (17.1),
$$\mathbb{E}[X^-] = \int_0^\infty\mathbb{P}[X^-\ge t]\,dt = \int_0^\infty F_X(-t)\,dt. \qquad □$$

Example 17.15. Let $X$ be a random variable with distribution function
$$F_X(t) = \begin{cases} 0 & t<0 \\ \frac{t+1}{2} & 0\le t<\frac12 \\ \frac78 & \frac12\le t<10 \\ 1 & 10\le t. \end{cases}$$
Is $X$ discrete? Is $X$ absolutely continuous?

$X$ cannot be absolutely continuous, because $F_X$ is not even continuous: $0,\frac12,10$ are discontinuity points of $F_X$.

$X$ cannot be discrete, because if $r$ is such that $\mathbb{P}[X=r]>0$ then $r$ is a discontinuity point of $F_X$, so these can only be $0,\frac12,10$. However,
$$\mathbb{P}\big[X\notin\{0,\tfrac12,10\}\big] \ge \mathbb{P}[\tfrac18<X\le\tfrac14] = F_X(\tfrac14) - F_X(\tfrac18) = \frac58 - \frac{9}{16} = \frac{1}{16} > 0.$$
Finally, let us compute the expectation of $X$. Since $X\ge 0$, we have that $X = X^+$, so
$$\mathbb{E}[X] = \int_0^\infty\big(1-F_X(t)\big)\,dt = \int_0^{1/2}\frac{1-t}{2}\,dt + \int_{1/2}^{10}\frac18\,dt = \frac12\Big(t-\frac{t^2}{2}\Big)\Big|_0^{1/2} + \frac18\cdot\frac{19}{2} = \frac14 - \frac{1}{16} + \frac{19}{16} = \frac{11}{8}.$$

△
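The tail-integral formula $\mathbb{E}[X] = \int_0^\infty(1-F_X(t))\,dt$ used in Example 17.15 is easy to check numerically. The sketch below re-implements $F_X$ from that example; the integrand vanishes for $t\ge 10$, so a Riemann sum over $[0,10]$ suffices.

```python
def F(t):
    # the distribution function of Example 17.15 (note X >= 0)
    if t < 0:
        return 0.0
    if t < 0.5:
        return (t + 1) / 2
    if t < 10:
        return 7 / 8
    return 1.0

# E[X] = integral over [0, 10] of (1 - F(t)) dt, by a midpoint Riemann sum
n = 200000
h = 10.0 / n
e = sum(1.0 - F((i + 0.5) * h) for i in range(n)) * h
print(e)  # close to 11/8 = 1.375
```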

Introduction to Probability

201.1.8001

Ariel Yadin

Lecture 18

18.1. Jensen’s Inequality

Recall that a function $g:\mathbb{R}\to\mathbb{R}$ is convex on $[a,b]$ if for any $0<\lambda<1$ and any $x,y\in[a,b]$,
$$g(\lambda x + (1-\lambda)y) \le \lambda g(x) + (1-\lambda)g(y).$$
For example, any twice differentiable function such that $g''\ge 0$ is convex, e.g. $x^2$ and $-\log x$.

Convex functions are continuous, and so always measurable.

First, a small lemma regarding convex functions: a convex function lies above its tangents.

Lemma 18.1. Let $a<m<b$, and let $g:[a,b]\to\mathbb{R}$ be convex. Then there exist $A,B\in\mathbb{R}$ such that $g(m) = Am+B$ and for all $x\in[a,b]$, $g(x)\ge Ax+B$.

Proof. For all $a\le x<m<y\le b$ we have $m = \lambda x + (1-\lambda)y$ where $\lambda = \frac{y-m}{y-x}\in(0,1)$. Thus $g(m)\le\lambda g(x) + (1-\lambda)g(y)$, which implies that
$$\frac{g(m)-g(x)}{m-x} \le \frac{g(y)-g(m)}{y-m}.$$
Thus, choosing any $A$ with
$$\sup_{x<m}\frac{g(m)-g(x)}{m-x} \le A \le \inf_{y>m}\frac{g(y)-g(m)}{y-m},$$
and setting $B = g(m)-Am$, we have for all $x<m$ that $g(x)\ge g(m) - A(m-x) = Ax+B$, and for all $y>m$ that $g(y)\ge g(m) + A(y-m) = Ay+B$. □

Theorem 18.2 (Jensen's Inequality). Let $g:\mathbb{R}\to\mathbb{R}$ be convex on $[a,b]$. Let $X$ be a random variable such that $\mathbb{P}[a\le X\le b] = 1$. Then,
$$g(\mathbb{E}[X]) \le \mathbb{E}[g(X)].$$

Johan Jensen (1859–1925)

Proof. If $X$ is constant, then there is nothing to prove, so assume $X$ is not constant. Thus $a<\mathbb{E}[X]<b$. Let $m = \mathbb{E}[X]$. Then, by Lemma 18.1, $g(X)\ge AX+B$ and $g(\mathbb{E}[X]) = A\,\mathbb{E}[X]+B$

Figure 5. A convex function lies above some linear function at an inner point $m$.

for some $A,B\in\mathbb{R}$. Specifically, since $|g(X)|\le|A||X|+|B|\le|A|\max\{|a|,|b|\}+|B|$, we get that $\mathbb{E}[|g(X)|]<\infty$, so $\mathbb{E}[g(X)]$ is well defined.

Now, since $AX+B\le g(X)$,
$$g(\mathbb{E}[X]) = A\,\mathbb{E}[X]+B = \mathbb{E}[AX+B] \le \mathbb{E}[g(X)].$$

□

Example 18.3 (Arithmetic-geometric mean inequality). Let $a_1,\dots,a_n$ be $n$ positive numbers. Let $X$ be a discrete random variable with density $f_X(a_k) = \frac1n$. Note that the function $g(x) = -\log x$ is convex on $(0,\infty)$ (since $g''(x) = x^{-2}>0$). So Jensen's inequality gives that
$$-\log\Big(\frac1n\sum_{k=1}^n a_k\Big) = -\log(\mathbb{E}[X]) \le \mathbb{E}[-\log(X)] = -\frac1n\sum_{k=1}^n\log a_k.$$
Exponentiating, we get
$$\frac1n\sum_{k=1}^n a_k \ge \Big(\prod_{k=1}^n a_k\Big)^{1/n}.$$
This is the arithmetic-geometric mean inequality. △

Example 18.4 ($L^p\subseteq L^q$ for $p>q$). Let $p\ge 1$. The function $g(x) = |x|^p$ is convex, because away from $0$, $g''(x) = p(p-1)|x|^{p-2}\ge 0$, and at $0$ we have $g(0) = 0\le\lambda g(x)+(1-\lambda)g(y)$ for any $0<\lambda<1$ and $x,y$.

Thus, $\mathbb{E}[|X|^p]\ge|\mathbb{E}[X]|^p$ for any random variable $X$ such that $\mathbb{E}[X]$ is defined.

Furthermore, for $1\le q<p$, if $\mathbb{E}[|X|^p]<\infty$, then
$$\mathbb{E}[|X|^q] = \big(\mathbb{E}[|X|^q]^{p/q}\big)^{q/p} \le \big(\mathbb{E}[|X|^p]\big)^{q/p} < \infty,$$
since the function $x\mapsto|x|^{p/q}$ is convex. △

18.2. Moments

Definition 18.5. Let $X$ be a random variable on a probability space $(\Omega,\mathcal{F},\mathbb{P})$. The $n$-th moment of $X$ is defined to be $\mathbb{E}[X^n]$, if this exists and is finite. If this expectation does not exist (or is infinite), then we say that $X$ does not have an $n$-th moment.

Definition 18.6. Let $X$ be a random variable on a probability space $(\Omega,\mathcal{F},\mathbb{P})$ such that $\mathbb{E}[|X|]<\infty$. The variance of $X$ is defined to be
$$\mathrm{Var}[X] := \mathbb{E}\big[|X-\mathbb{E}[X]|^2\big].$$
The standard deviation of $X$ is defined to be $\sqrt{\mathrm{Var}[X]}$.

Proposition 18.7. $X$ has a second moment if and only if $\mathbb{E}[|X|]<\infty$ and $\mathrm{Var}[X]<\infty$. In fact, $\mathrm{Var}[X] = \mathbb{E}[X^2] - \mathbb{E}[X]^2$.

Proof. Using linearity of expectation,
$$\mathbb{E}[(X-a)^2] = \mathbb{E}[X^2] - 2a\,\mathbb{E}[X] + a^2.$$
So $\mathbb{E}[(X-a)^2]<\infty$ if and only if $-\infty<\mathbb{E}[X]<\infty$ and $\mathbb{E}[X^2]<\infty$. □

Example 18.8. Let us calculate some second moments and variances:

• $X\sim\mathrm{Ber}(p)$:
$$\mathbb{E}[X^2] = \mathbb{P}[X=1]\cdot 1^2 + \mathbb{P}[X=0]\cdot 0^2 = p.$$
So $\mathrm{Var}[X] = \mathbb{E}[X^2] - \mathbb{E}[X]^2 = p - p^2 = p(1-p)$.

• $X\sim\mathrm{Bin}(n,p)$: We use the identity $k(k-1)\binom nk = n(n-1)\binom{n-2}{k-2}$, valid for all $2\le k\le n$:
$$\mathbb{E}[X(X-1)] = \sum_{k=0}^n k(k-1)\,\mathbb{P}[X=k] = n(n-1)\sum_{k=2}^n\binom{n-2}{k-2}p^2 p^{k-2}(1-p)^{n-k} = p^2 n(n-1)\sum_{k=0}^{n-2}\binom{n-2}{k}p^k(1-p)^{n-2-k} = p^2n^2 - p^2n.$$
By linearity of expectation,
$$p^2 n(n-1) = \mathbb{E}[X^2] - \mathbb{E}[X] = \mathbb{E}[X^2] - np.$$
So $\mathbb{E}[X^2] = p^2n(n-1) + np$ and $\mathrm{Var}[X] = np - p^2n = np(1-p)$.

• $X\sim\mathrm{Geo}(p)$:
$$\mathbb{E}[X^2] = \sum_{k=1}^\infty k^2(1-p)^{k-1}p = p + \sum_{k=2}^\infty(k-1+1)^2(1-p)^{k-2}p(1-p) = p + (1-p)\sum_{k=1}^\infty(k+1)^2(1-p)^{k-1}p = p + (1-p)\,\mathbb{E}[(X+1)^2]$$
$$= p + (1-p)\,\mathbb{E}[X^2] + (1-p) + 2(1-p)\,\mathbb{E}[X] = 1 + \frac{2(1-p)}{p} + (1-p)\,\mathbb{E}[X^2].$$
So
$$\mathbb{E}[X^2] = \frac1p + \frac{2(1-p)}{p^2} = \frac{2-p}{p^2}.$$
Thus
$$\mathrm{Var}[X] = \mathbb{E}[X^2] - \mathbb{E}[X]^2 = \frac{1-p}{p^2}.$$

• $X\sim\mathrm{Poi}(\lambda)$:
$$\mathbb{E}[X^2] = \sum_{k=0}^\infty k^2 e^{-\lambda}\frac{\lambda^k}{k!} = \lambda\sum_{k=1}^\infty(k-1+1)e^{-\lambda}\frac{\lambda^{k-1}}{(k-1)!} = \lambda\sum_{k=0}^\infty(k+1)e^{-\lambda}\frac{\lambda^k}{k!} = \lambda\,\mathbb{E}[X+1] = \lambda(\lambda+1).$$
Thus, $\mathrm{Var}[X] = \mathbb{E}[X^2] - \mathbb{E}[X]^2 = \lambda$.

• $X\sim\mathrm{Exp}(\lambda)$: Use integration by parts:
$$\mathbb{E}[X^2] = \int_0^\infty t^2\lambda e^{-\lambda t}\,dt = -t^2 e^{-\lambda t}\Big|_0^\infty + \int_0^\infty 2te^{-\lambda t}\,dt = \frac2\lambda\,\mathbb{E}[X] = \frac{2}{\lambda^2}.$$
So $\mathrm{Var}[X] = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$. △

Exercise 18.9. Let $X$ be a random variable with $\mathrm{Var}[X]<\infty$. Show that for any $a,b\in\mathbb{R}$,
$$\mathrm{Var}[aX+b] = a^2\,\mathrm{Var}[X].$$

Solution. Let $\mu = \mathbb{E}[X]$. So $\mathbb{E}[aX+b] = a\mu+b$. Then,
$$\mathrm{Var}[aX+b] = \mathbb{E}\big[(aX+b-(a\mu+b))^2\big] = \mathbb{E}[(aX-a\mu)^2] = a^2\,\mathrm{Var}[X]. \qquad □$$

Example 18.10.

• $X\sim U[a,b]$: $Y = \frac{X-a}{b-a}\sim U[0,1]$. (Prove this!) So
$$\mathbb{E}[Y^2] = \int_0^1 s^2\,ds = \frac13.$$
Thus $\mathrm{Var}[Y] = \frac13 - \frac14 = \frac{1}{12}$. We have that
$$\mathrm{Var}[X] = \mathrm{Var}[(b-a)Y+a] = (b-a)^2\,\mathrm{Var}[Y] = \frac{(b-a)^2}{12}.$$

• $X\sim N(\mu,\sigma)$: Recall that $Y := (X-\mu)/\sigma$ has $N(0,1)$ distribution. Thus, since $X = \sigma Y + \mu$, we have that $\mathrm{Var}[X] = \sigma^2\,\mathrm{Var}[Y]$. Now, using integration by parts, since $\frac{\partial}{\partial t}e^{-t^2/2} = -te^{-t^2/2}$,
$$\mathrm{Var}[Y] = \mathbb{E}[Y^2] = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty t^2 e^{-t^2/2}\,dt = -\frac{1}{\sqrt{2\pi}}\,te^{-t^2/2}\Big|_{-\infty}^\infty + \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-t^2/2}\,dt = 1.$$
So $\mathrm{Var}[X] = \sigma^2$. (Note that this implies that $\mathbb{E}[X^2] = \sigma^2+\mu^2$.)

△
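The variance formulas of Examples 18.8 can be checked numerically by summing $k^2 f_X(k)$ against the closed forms; the parameter values below are arbitrary, and the geometric and Poisson sums are truncated far into their tails.

```python
import math

def var_from_pmf(pmf_pairs):
    # Var[X] = E[X^2] - E[X]^2 computed from (value, probability) pairs
    m1 = sum(k * p for k, p in pmf_pairs)
    m2 = sum(k * k * p for k, p in pmf_pairs)
    return m2 - m1 * m1

# Bin(12, 0.3): Var = n p (1 - p) = 2.52
n, p = 12, 0.3
binom = [(k, math.comb(n, k) * p ** k * (1 - p) ** (n - k)) for k in range(n + 1)]

# Geo(0.2) on {1, 2, ...}, truncated at 400: Var = (1 - p) / p^2 = 20
q = 0.2
geo = [(k, (1 - q) ** (k - 1) * q) for k in range(1, 400)]

# Poi(4), truncated at 100: Var = lambda = 4
lam = 4.0
poi = [(k, math.exp(-lam) * lam ** k / math.factorial(k)) for k in range(100)]

print(var_from_pmf(binom), var_from_pmf(geo), var_from_pmf(poi))
```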

Introduction to Probability

201.1.8001

Ariel Yadin

Lecture 19

19.1. Covariance

Definition 19.1. Let $X,Y$ be random variables on some probability space $(\Omega,\mathcal{F},\mathbb{P})$ such that $\mathbb{E}[X],\mathbb{E}[Y],\mathbb{E}[XY]$ are finite. The covariance of $X$ and $Y$ is defined to be
$$\mathrm{Cov}[X,Y] := \mathbb{E}\big[(X-\mathbb{E}[X])\cdot(Y-\mathbb{E}[Y])\big].$$

The following are immediate:

Proposition 19.2. Let $X,Y,Z$ be random variables on some probability space $(\Omega,\mathcal{F},\mathbb{P})$. Then,

• $\mathrm{Cov}[X,Y] = \mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y]$.
• $\mathrm{Cov}[X,Y] = \mathrm{Cov}[Y,X]$.
• $\mathrm{Cov}[aX+Y,Z] = a\,\mathrm{Cov}[X,Z] + \mathrm{Cov}[Y,Z]$.
• $\mathrm{Cov}[X,X] = \mathrm{Var}[X]\ge 0$.

Each equality holds whenever the relevant covariances are defined.

Note that $\mathrm{Cov}$ is almost an inner product on the vector space of random variables on $(\Omega,\mathcal{F},\mathbb{P})$ with finite expectation.

19.2. Inner Products and Cauchy-Schwarz

Reminder:

Let $V$ be a vector space over $\mathbb{R}$. An inner product on $V$ is a function $\langle\cdot,\cdot\rangle: V\times V\to\mathbb{R}$ such that

• Symmetry: For all $v,u\in V$: $\langle u,v\rangle = \langle v,u\rangle$.
• Linearity: For all $\alpha\in\mathbb{R}$ and all $v,u,w\in V$: $\langle\alpha u+v,w\rangle = \alpha\langle u,w\rangle + \langle v,w\rangle$.
• Positivity: For all $v\in V$: $\langle v,v\rangle\ge 0$.
• Definiteness: For all $v\in V$: $\langle v,v\rangle = 0$ if and only if $v=0$.

For our purposes, we would like to replace the last condition by

• Definiteness': For all $v\in V$: $\langle v,v\rangle = 0$ if and only if $\langle v,u\rangle = 0$ for all $u\in V$.

A basic, but super important and fundamental result is the Cauchy-Schwarz inequality. (Denote $\|v\| = \sqrt{\langle v,v\rangle}$.)

Augustin-Louis Cauchy (1789–1857)

Theorem 19.3 (Cauchy-Schwarz Inequality). Let $\langle\cdot,\cdot\rangle$ be an inner product on $V$ with the above modified definiteness condition. For all $v,u\in V$ we have
$$|\langle v,u\rangle| \le \|v\|\cdot\|u\|.$$

Proof. If $\|u\| = 0$ then both sides are $0$ (because of the modified definiteness condition), so we can assume without loss of generality that $\|u\|>0$. Set $\lambda = \|u\|^{-2}\langle v,u\rangle$. Then,
$$0 \le \langle v-\lambda u,\, v-\lambda u\rangle = \langle v,v\rangle + \lambda^2\langle u,u\rangle - 2\lambda\langle v,u\rangle = \|v\|^2 - \|u\|^{-2}\langle v,u\rangle^2.$$
Multiplying by $\|u\|^2$ completes the proof. □

Hermann Schwarz (1843–1921)

19.2.1. Two Inner Products. Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space. Consider the following vector space over $\mathbb{R}$: let $L^2(\Omega,\mathcal{F},\mathbb{P})$ be the space of all random variables $X$ with finite second moment.

In the exercises we show that if $\mathrm{Cov}[X,X] = 0$ then $\mathrm{Cov}[X,Y] = 0$ for all $Y$, and if $\mathbb{E}[X^2] = 0$ then $\mathbb{E}[XY] = 0$ for all $Y$. Thus, both $\mathrm{Cov}[X,Y]$ and $\mathbb{E}[XY]$ form (modified) inner products on $L^2(\Omega,\mathcal{F},\mathbb{P})$.

(In order to get honest-to-goodness inner products, one needs to take random variables modulo measure $0$ for $\mathbb{E}[XY]$, and modulo measure $0$ and additive constants for $\mathrm{Cov}$.)

We conclude:

Theorem 19.4 (Cauchy-Schwarz inequality). Let $X,Y$ be random variables with finite second moment. Then,
$$|\mathrm{Cov}[X,Y]|^2 \le \mathrm{Var}[X]\cdot\mathrm{Var}[Y] \qquad\text{and}\qquad |\mathbb{E}[XY]|^2 \le \mathbb{E}[X^2]\cdot\mathbb{E}[Y^2].$$

19.3. Correlation

Definition 19.5. If $X,Y$ are random variables such that $\mathrm{Cov}[X,Y] = 0$ we say that $X,Y$ are uncorrelated. $X_1,\dots,X_n$ are said to be pairwise uncorrelated if for any $k\ne j$, $X_k,X_j$ are uncorrelated.

We will soon see that if $X,Y$ are independent, they are uncorrelated. The opposite, however, is not true. Indeed,

Example 19.6. Let $X,Y$ have a joint distribution $f_{X,Y}$ given by

            Y = -1   Y = 0   Y = 1
    X = 0:   1/3      0       1/3
    X = 1:    0      1/3       0

So $f_X(1) = 1/3$ and $f_Y(1) = f_Y(-1) = 1/3$. Hence,
$$\mathbb{E}[XY] = \mathbb{P}[X=1,Y=1] - \mathbb{P}[X=1,Y=-1] = 0,$$
$\mathbb{E}[X] = 1/3$ and $\mathbb{E}[Y] = 1/3 - 1/3 = 0$. So $\mathrm{Cov}[X,Y] = \mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y] = 0$. But $X,Y$ are not independent, as
$$\mathbb{P}[X=1,Y=1] = 0 \ne \frac13\cdot\frac13 = \mathbb{P}[X=1]\cdot\mathbb{P}[Y=1].$$

△

We now show that independence is equivalent to the property that any functions of $X$ and $Y$ are uncorrelated.

Theorem 19.7. Let $X,Y$ be random variables on some probability space $(\Omega,\mathcal{F},\mathbb{P})$. Then $X,Y$ are independent if and only if for any measurable functions $g,h$ such that $\mathrm{Cov}[g(X),h(Y)]$ is defined, we have that $\mathrm{Cov}[g(X),h(Y)] = 0$.

Lemma 19.8. Let $X,Y$ be random variables with finite second moment. If $X,Y$ are independent then they are uncorrelated.

Proof. It suffices to show that if $X,Y$ are independent, then $\mathbb{E}[XY] = \mathbb{E}[X]\,\mathbb{E}[Y]$.

If $X,Y$ are discrete random variables with range $R$,
$$\mathbb{E}[XY] = \sum_{r,r'\in R}\mathbb{P}[X=r,\,Y=r']\,rr' = \sum_{r,r'\in R}\mathbb{P}[X=r]\,r\cdot\mathbb{P}[Y=r']\,r' = \mathbb{E}[X]\,\mathbb{E}[Y].$$
In general, define $X_n^+ := 2^{-n}\lfloor 2^n X^+\rfloor$ and $Y_n^+ := 2^{-n}\lfloor 2^n Y^+\rfloor$. Since $X_n^+,Y_n^+$ are independent and discrete, we have that
$$\mathbb{E}[X_n^+Y_n^+] = \mathbb{E}[X_n^+]\,\mathbb{E}[Y_n^+].$$
Since $X_n^+\nearrow X^+$ and $Y_n^+\nearrow Y^+$, and these are non-negative, we get by monotone convergence that
$$\mathbb{E}[X^+Y^+] = \lim_{n\to\infty}\mathbb{E}[X_n^+Y_n^+] = \lim_{n\to\infty}\mathbb{E}[X_n^+]\,\mathbb{E}[Y_n^+] = \mathbb{E}[X^+]\,\mathbb{E}[Y^+].$$
In a similar way, for $X_n^- := 2^{-n}\lfloor 2^n X^-\rfloor$ and $Y_n^- := 2^{-n}\lfloor 2^n Y^-\rfloor$, we have that $X_n^-,Y_n^-$ are independent, $X_n^+,Y_n^-$ are independent, and $X_n^-,Y_n^+$ are independent. Thus $\mathbb{E}[X_n^\xi Y_n^\zeta] = \mathbb{E}[X_n^\xi]\,\mathbb{E}[Y_n^\zeta]$ for any choice of $\xi,\zeta\in\{+,-\}$, and taking limits we get that
$$\mathbb{E}[X^\xi Y^\zeta] = \mathbb{E}[X^\xi]\,\mathbb{E}[Y^\zeta].$$
Altogether,
$$\mathbb{E}[XY] = \mathbb{E}[(X^+-X^-)(Y^+-Y^-)] = \mathbb{E}[X^+Y^+] - \mathbb{E}[X^+Y^-] - \mathbb{E}[X^-Y^+] + \mathbb{E}[X^-Y^-]$$
$$= \mathbb{E}[X^+]\,\mathbb{E}[Y^+] - \mathbb{E}[X^+]\,\mathbb{E}[Y^-] - \mathbb{E}[X^-]\,\mathbb{E}[Y^+] + \mathbb{E}[X^-]\,\mathbb{E}[Y^-] = \big(\mathbb{E}[X^+]-\mathbb{E}[X^-]\big)\cdot\big(\mathbb{E}[Y^+]-\mathbb{E}[Y^-]\big) = \mathbb{E}[X]\,\mathbb{E}[Y]. \qquad □$$

Proof of Theorem 19.7. We repeatedly use the fact that $\mathrm{Cov}[X,Y] = 0$ if and only if $\mathbb{E}[XY] = \mathbb{E}[X]\,\mathbb{E}[Y]$.

For the "if" direction: Let $A_1\in\sigma(X)$, $A_2\in\sigma(Y)$. So $A_1 = X^{-1}(B_1)$, $A_2 = Y^{-1}(B_2)$ for Borel sets $B_1,B_2$. For the functions $g(x) = \mathbf{1}_{\{x\in B_1\}}$ and $h(x) = \mathbf{1}_{\{x\in B_2\}}$ we have that
$$\mathbb{P}[A_1\cap A_2] = \mathbb{E}[\mathbf{1}_{\{X\in B_1,\,Y\in B_2\}}] = \mathbb{E}[g(X)h(Y)] = \mathbb{E}[g(X)]\,\mathbb{E}[h(Y)] = \mathbb{P}[A_1]\,\mathbb{P}[A_2].$$
This was the "easy" direction.

Now, for the "only if" direction: We want to show that if $X,Y$ are independent, then $\mathbb{E}[g(X)h(Y)] = \mathbb{E}[g(X)]\,\mathbb{E}[h(Y)]$ for any measurable functions such that the above expectations exist. Since $g(X),h(Y)$ are independent, the theorem follows from Lemma 19.8. □

Exercise 19.9 (Pythagorean Theorem). Let $X_1,\dots,X_n$ be random variables with finite second moment. Show that
$$\mathrm{Var}[X_1+\cdots+X_n] = \sum_{k=1}^n\mathrm{Var}[X_k] + 2\sum_{j<k}\mathrm{Cov}[X_j,X_k].$$
Conclude that if $X_1,\dots,X_n$ are pairwise uncorrelated, then
$$\mathrm{Var}[X_1+\cdots+X_n] = \sum_{k=1}^n\mathrm{Var}[X_k].$$

The Pythagorean Theorem lets us calculate very easily the variance of a binomial random variable:

Example 19.10. Let $X\sim\mathrm{Bin}(n,p)$. We know that $X = \sum_{k=1}^n Y_k$, where $Y_1,\dots,Y_n$ are independent $\mathrm{Ber}(p)$ random variables. So
$$\mathrm{Var}[X] = \sum_{k=1}^n\mathrm{Var}[Y_k] = np(1-p).$$
The straightforward calculation would be more difficult. △

19.4. Examples

Example 19.11. Let $X\sim U[-1,1]$ and $Y = X^2$.

What is the distribution of $Y$?
$$\mathbb{P}[Y\le t] = \mathbb{P}\big[X\in[-\sqrt t,\sqrt t]\big] = \begin{cases} 0 & t<0 \\ \sqrt t & 0\le t\le 1 \\ 1 & t\ge 1. \end{cases}$$
Note that for
$$f_Y(t) = \begin{cases} \frac12 t^{-1/2} & t\in[0,1] \\ 0 & t\notin[0,1] \end{cases}$$
we have, for any $t\in[0,1]$,
$$\int_{-\infty}^t f_Y(s)\,ds = \int_0^t\frac12 s^{-1/2}\,ds = \sqrt t = F_Y(t),$$
so $Y$ is absolutely continuous with density $f_Y$.

Are $X,Y$ independent? Of course not; for example, the event $\{X\in[0,1/2],\,Y\in[1/2,1]\}$ is empty (as $Y\ge 1/2$ forces $|X|\ge 2^{-1/2}>1/2$), so
$$\mathbb{P}[X\in[0,1/2],\,Y\in[1/2,1]] = 0 \ne \frac14\cdot(1-2^{-1/2}) = \mathbb{P}[X\in[0,1/2]]\cdot\mathbb{P}[Y\in[1/2,1]].$$
What is $\mathrm{Cov}(X,Y)$? Well,
$$\mathbb{E}[XY] = \mathbb{E}[X^3] = \int_{-1}^1 t^3\cdot\frac12\,dt = 0.$$
Also $\mathbb{E}[X] = 0$, so
$$\mathrm{Cov}(X,Y) = \mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y] = 0.$$
That is, $X,Y$ are uncorrelated.

What about $Z = X^{2n}$ for general $n$? In this case,
$$\mathbb{E}[XZ] = \mathbb{E}[X^{2n+1}] = \int_{-1}^1 t^{2n+1}\cdot\frac12\,dt = \frac{t^{2n+2}}{2(2n+2)}\Big|_{-1}^1 = 0.$$
So $X$ and $X^{2n}$ are uncorrelated for any $n$. As for odd powers, if $W = X^{2n+1}$, then
$$\mathbb{E}[XW] = \int_{-1}^1 t^{2n+2}\cdot\frac12\,dt = \frac{2}{2(2n+3)} = \frac{1}{2n+3},$$
so $X,W$ are correlated: since $\mathbb{E}[W] = 0$, we get $\mathrm{Cov}(X,W) = \frac{1}{2n+3}$. △

Example 19.12. Let $a\in(-1,1)$ and
$$A = \begin{pmatrix} 1 & a \\ a & 1 \end{pmatrix}.$$
So $|A| = 1-a^2$ and
$$A^{-1} = \frac{1}{1-a^2}\begin{pmatrix} 1 & -a \\ -a & 1 \end{pmatrix}.$$
Let $(X,Y)$ be a two-dimensional absolutely continuous random variable with density
$$f_{X,Y}(t,s) = \frac{1}{2\pi\sqrt{|A|}}\exp\Big(-\frac12\big\langle A^{-1}(t,s),(t,s)\big\rangle\Big) = \frac{1}{2\pi\sqrt{1-a^2}}\exp\Big(-\frac{1}{2(1-a^2)}\big(t^2+s^2-2ats\big)\Big).$$
First, let us show that $f_{X,Y}$ is indeed a density. We will use the fact that $t^2+s^2-2ats = (t-as)^2 + s^2 - a^2s^2$:
$$\int_{-\infty}^\infty f_{X,Y}(t,s)\,dt = \frac{1}{2\pi\sqrt{1-a^2}}\exp\Big(-\frac{s^2(1-a^2)}{2(1-a^2)}\Big)\int_{-\infty}^\infty\exp\Big(-\frac{(t-as)^2}{2(1-a^2)}\Big)\,dt = \frac{1}{2\pi\sqrt{1-a^2}}\,e^{-s^2/2}\sqrt{2\pi(1-a^2)} = \frac{1}{\sqrt{2\pi}}e^{-s^2/2}.$$
We have used that $\frac{1}{\sqrt{2\pi(1-a^2)}}\exp\big(-\frac{(t-as)^2}{2(1-a^2)}\big)$ is the density of an $N(as,\sqrt{1-a^2})$ random variable. So $Y\sim N(0,1)$; symmetrically, $X\sim N(0,1)$. Thus $\iint f_{X,Y}\,dt\,ds = 1$ and $f_{X,Y}$ is indeed a density. Moreover, we have calculated the marginal densities of $X$ and $Y$.

Let us calculate
$$\mathrm{Cov}(X,Y) = \mathbb{E}[XY] = \iint f_{X,Y}(t,s)\,ts\,dt\,ds = \int_{-\infty}^\infty s\cdot\frac{1}{\sqrt{2\pi}}e^{-s^2/2}\int_{-\infty}^\infty t\cdot\frac{1}{\sqrt{2\pi(1-a^2)}}\exp\Big(-\frac{(t-as)^2}{2(1-a^2)}\Big)\,dt\,ds = \int_{-\infty}^\infty as\cdot s f_Y(s)\,ds = \mathbb{E}[aY^2] = a,$$
where the inner integral is the mean $as$ of the $N(as,\sqrt{1-a^2})$ density. So now we also have that
$$\mathrm{Var}[X+Y] = \mathrm{Var}[X] + \mathrm{Var}[Y] + 2\,\mathrm{Cov}(X,Y) = 1+1+2a = 2(1+a).$$
Note that $X,Y$ are independent if and only if $\mathrm{Cov}(X,Y) = 0$. △

Example 19.13. A particle moves in the NE (north-east) quadrant of the plane, $[0,\infty)^2$. The direction $\theta\sim U[0,\pi/2]$ and the velocity $V\sim U[0,1]$, independently. What is the covariance of $X(t)$ and $V$, where the particle's position at time $t$ is given by $(X(t),Y(t))$?

Note that for any $t$, $X(t) = Vt\cos(\theta)$. So we want to calculate
$$\mathbb{E}[X(t)V] - \mathbb{E}[X(t)]\,\mathbb{E}[V] = t\,\mathbb{E}[V^2]\,\mathbb{E}[\cos(\theta)] - t\,\mathbb{E}[V]^2\,\mathbb{E}[\cos(\theta)] = t\,\mathrm{Var}[V]\,\mathbb{E}[\cos(\theta)].$$
We know that $\mathrm{Var}[V] = 1/12$. Also,
$$\mathbb{E}[\cos\theta] = \frac2\pi\int_0^{\pi/2}\cos t\,dt = \frac2\pi.$$
So $\mathrm{Cov}(X(t),V) = \frac{t}{6\pi}$. △

Example 19.14. Suppose $Y\sim\mathrm{Exp}(\lambda)$ and for any $s>0$, $X\mid Y=s\sim U[0,s]$. What is $\mathrm{Cov}(X,Y)$?

We can calculate
$$\mathbb{E}[XY] = \iint ts\,f_{X,Y}(t,s)\,dt\,ds = \int_0^\infty s f_Y(s)\int t f_{X|Y}(t\mid s)\,dt\,ds = \int_0^\infty s f_Y(s)\cdot\frac s2\,ds = \frac12\,\mathbb{E}[Y^2] = \lambda^{-2}.$$
Also, $\mathbb{E}[Y] = \lambda^{-1}$. If we consider the function $(t,s)\mapsto t$ we get
$$\mathbb{E}[X] = \iint t f_{X,Y}(t,s)\,dt\,ds = \int_0^\infty f_Y(s)\int_0^s t f_{X|Y}(t\mid s)\,dt\,ds = \int_0^\infty\mathbb{E}[X\mid Y=s]\,f_Y(s)\,ds = \int_0^\infty\frac s2 f_Y(s)\,ds = \frac12\,\mathbb{E}[Y] = \frac{1}{2\lambda}.$$
So altogether,
$$\mathrm{Cov}(X,Y) = \frac{1}{\lambda^2} - \frac{1}{2\lambda}\cdot\frac1\lambda = \frac{1}{2\lambda^2}.$$

△

19.5. Markov and Chebyshev inequalities

Theorem 19.15 (Markov's inequality). Let $X\ge 0$ be a non-negative random variable. Then, for any $a>0$,
$$\mathbb{P}[X\ge a] \le \frac{\mathbb{E}[X]}{a}.$$

Andrey Markov (1856–1922)

Proof. Consider the random variable $Y = a\mathbf{1}_{\{X\ge a\}}$. Note that $Y\le X$ (here we use the non-negativity of $X$). So
$$\mathbb{E}[X] \ge \mathbb{E}[Y] = a\,\mathbb{P}[X\ge a]. \qquad □$$

Theorem 19.16 (Chebyshev's inequality). Let $X$ be a random variable with finite second moment. Then, for any $a>0$,
$$\mathbb{P}\big[|X-\mathbb{E}[X]|\ge a\big] \le \frac{\mathrm{Var}[X]}{a^2}.$$

Pafnuty Chebyshev (1821–1894)

Proof. Apply Markov's inequality to the non-negative random variable $Y = (X-\mathbb{E}[X])^2$, and note that $\{|X-\mathbb{E}[X]|\ge a\} = \{Y\ge a^2\}$. □

If $X$ is a random variable with $\mathbb{E}[X] = \mu$ and $\mathrm{Var}[X] = \sigma^2$, then Chebyshev's inequality tells us that the probability that $X$ deviates from its mean by $k\sigma$ (i.e. $k$ standard deviations) is at most $1/k^2$.

Example 19.17. The average wage in Eurasia is 1000 a month.

• A person is chosen at random. What is the probability that this person earns at least 10,000 a month? By Markov's inequality, if $X$ is the random person's salary, then
$$\mathbb{P}[X\ge 10{,}000] \le \frac{\mathbb{E}[X]}{10{,}000} = \frac{1}{10}.$$

• The average wage for an Inner Party member is 10,000 a month, and the standard deviation of the wage is 100. What is the probability that a randomly chosen Inner Party member earns at least 11,000? Here we have that $\mathbb{E}[X] = 10{,}000$ and $\mathrm{Var}[X] = 10{,}000$. So
$$\mathbb{P}[X\ge 11{,}000] \le \mathbb{P}\big[|X-\mathbb{E}[X]|\ge 1{,}000\big] \le \frac{\mathrm{Var}[X]}{1{,}000^2} = \frac{1}{100}.$$

△
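Chebyshev's inequality can be seen in action with a quick simulation. The sketch below uses an $\mathrm{Exp}(1)$ random variable (mean $1$, variance $1$) as an arbitrary test case and checks that the empirical deviation probabilities sit below the $1/k^2$ bounds.

```python
import random

random.seed(2)

trials = 100000
samples = [random.expovariate(1.0) for _ in range(trials)]
mean, var = 1.0, 1.0  # exact moments of Exp(1)

results = {}
for k in (2.0, 3.0, 4.0):
    # empirical P[|X - E[X]| >= k] versus the Chebyshev bound Var[X]/k^2
    emp = sum(1 for s in samples if abs(s - mean) >= k) / trials
    results[k] = (emp, var / k ** 2)
    print(k, results[k])
```

For the exponential distribution the true tail decays exponentially, so the Chebyshev bound holds with plenty of room to spare — the same phenomenon Example 19.18 below exhibits for the normal distribution.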

Example 19.18. Let $X\sim N(0,\sigma)$. Chebyshev's inequality tells us that $\mathbb{P}[|X|>k\sigma]\le k^{-2}$.

However, because $\sigma^{-1}X\sim N(0,1)$, we can calculate, for $a = \frac t\sigma$,
$$\mathbb{P}[|X|>t] = \mathbb{P}[\sigma^{-1}X>a] + \mathbb{P}[\sigma^{-1}X<-a] = \int_{-\infty}^{-a}f_{\sigma^{-1}X}(s)\,ds + \int_a^\infty f_{\sigma^{-1}X}(s)\,ds$$
$$= 2\int_a^\infty\frac{1}{\sqrt{2\pi}}e^{-s^2/2}\,ds \le 2\int_a^\infty\frac sa\cdot\frac{1}{\sqrt{2\pi}}e^{-s^2/2}\,ds = \frac{2}{a\sqrt{2\pi}}\big(-e^{-s^2/2}\big)\Big|_a^\infty = \frac{2}{a\sqrt{2\pi}}\,e^{-a^2/2}.$$
If we plug in $t = k\sigma$ we get
$$\mathbb{P}[|X|>k\sigma] \le \frac{2}{\sqrt{2\pi}\,k}\,e^{-k^2/2} \ll k^{-2},$$
which is a much better bound. △

19.6. The Weierstrass Approximation Theorem

Here is a probabilistic proof, due to Bernstein, of the famous Weierstrass Approximation Theorem.

Karl Weierstrass (1815–1897)

Theorem 19.19 (Weierstrass Approximation Theorem). Let $f:[0,1]\to\mathbb{R}$ be continuous. For every $\varepsilon>0$ there exists a polynomial $p(x)$ such that $\sup_{x\in[0,1]}|p(x)-f(x)|<\varepsilon$. (That is, the polynomials are dense in $L^\infty([0,1])$.)

Proof. By classical analysis, $f$ is uniformly continuous and bounded on $[0,1]$; that is, there exists $M>0$ such that $\sup_{x\in[0,1]}|f(x)|\le M$, and for every $\varepsilon>0$ there exists $\delta = \delta(\varepsilon)>0$ such that if $|x-y|\le\delta$ then $|f(x)-f(y)|<\frac\varepsilon2$.

Fix $\varepsilon>0$. Let $\delta = \delta(\varepsilon)$. Let $x\in(0,1)$. Consider $B\sim\mathrm{Bin}(n,x)$. Chebyshev's inequality guarantees that
$$\mathbb{P}\big[|B-nx|\ge n^{2/3}\big] \le \mathrm{Var}[B]\cdot n^{-4/3} = x(1-x)\,n^{-1/3} \le \frac14 n^{-1/3}.$$
Thus, we may choose $n = n(\varepsilon,M)$ large enough so that this probability is at most $\frac{\varepsilon}{4M}$ for any $x$. Assume also that $n^{-1/3}<\delta$.

Define the polynomial
$$p(x) := \sum_{j=0}^n\binom nj x^j(1-x)^{n-j}f\big(\tfrac jn\big).$$
That is, $p(x) = \mathbb{E}[f(B/n)]$ where $B\sim\mathrm{Bin}(n,x)$.

Let $A = \{|\frac Bn - x|>\delta\}$. Since $n\delta>n\cdot n^{-1/3} = n^{2/3}$,
$$\mathbb{P}[A] \le \mathbb{P}\big[|B-nx|>n^{2/3}\big] \le \frac{\varepsilon}{4M}.$$
Thus,
$$|p(x)-f(x)| = \big|\mathbb{E}[f(B/n)-f(x)]\big| \le \mathbb{E}\big[|f(B/n)-f(x)|\mathbf{1}_A\big] + \mathbb{E}\big[|f(B/n)-f(x)|\mathbf{1}_{A^c}\big] \le 2M\,\mathbb{P}[A] + \frac\varepsilon2\,\mathbb{P}[A^c] \le \frac\varepsilon2 + \frac\varepsilon2 = \varepsilon.$$
This bound is independent of $x$. □

19.7. The Paley-Zygmund Inequality

Raymond Paley (1907–1933), Antoni Zygmund (1900–1992)

Theorem 19.20 (Paley-Zygmund Inequality). Let $X\ge 0$ be a non-negative random variable with finite variance. Then, for any $\alpha\in[0,1]$,
$$\mathbb{P}\big[X>\alpha\,\mathbb{E}[X]\big] \ge (1-\alpha)^2\cdot\frac{(\mathbb{E}[X])^2}{\mathbb{E}[X^2]}.$$
Specifically,
$$\mathbb{P}[X>0] \ge \frac{(\mathbb{E}[X])^2}{\mathbb{E}[X^2]}.$$

Proof. Note that $X = X\mathbf{1}_{\{X\le\alpha\mathbb{E}[X]\}} + X\mathbf{1}_{\{X>\alpha\mathbb{E}[X]\}}$.

Since $X\ge 0$ we have that $\mathbb{E}[X\mathbf{1}_{\{X\le\alpha\mathbb{E}[X]\}}]\le\alpha\,\mathbb{E}[X]$, and by Cauchy-Schwarz,
$$\big(\mathbb{E}[X\mathbf{1}_{\{X>\alpha\mathbb{E}[X]\}}]\big)^2 \le \mathbb{E}[X^2]\cdot\mathbb{P}\big[X>\alpha\,\mathbb{E}[X]\big].$$
Combining these we have
$$\mathbb{E}[X] \le \alpha\,\mathbb{E}[X] + \sqrt{\mathbb{E}[X^2]\cdot\mathbb{P}[X>\alpha\,\mathbb{E}[X]]},$$
which implies that
$$(1-\alpha)^2(\mathbb{E}[X])^2 \le \mathbb{E}[X^2]\cdot\mathbb{P}\big[X>\alpha\,\mathbb{E}[X]\big]. \qquad □$$

Example 19.21 (Erdős–Rényi random graph). Suppose that $n$ students are on Facebook. Every two different students, say $x,y$, become friends with probability $p$, all pairs of students being independent. We say that students $x$ and $y$ are connected if there is some sequence of students $x = x_1,\dots,x_k = y$ such that $x_j$ and $x_{j+1}$ are friends for every $j$. What is the probability that some student has no friends?

Paul Erdős (1913–1996), Alfréd Rényi (1921–1970)

Suppose $\{1,2,\dots,n\}$ is the set of students. Let $I_{x,y}$ be the indicator of the event that $x$ and $y$ are friends. So $(I_{x,y})_{x\ne y}$ are $\binom n2$ independent Bernoulli-$p$ random variables.

Let $X$ be the number of students with no friends; the isolated students. Let us begin by calculating the expected number of isolated students: a student $x$ is isolated if $I_{x,y} = 0$ for all $y\ne x$. So this happens with probability $(1-p)^{n-1}$. Thus, by linearity,
$$\mathbb{E}[X] = \sum_x\mathbb{P}[x\text{ is isolated}] = n(1-p)^{n-1} =: \mu.$$
Let us also calculate the second moment of $X$: For $x\ne y$, if both are isolated then $I_{x,z} = 0$ and $I_{y,z} = 0$ for all $z\notin\{x,y\}$, and also $I_{x,y} = 0$. So,
$$\mathbb{P}[x\text{ is isolated},\,y\text{ is isolated}] = (1-p)^{2(n-2)+1} = (1-p)^{2(n-1)}(1-p)^{-1}.$$
Thus,
$$\mathbb{E}[X^2] = \mathbb{E}\Big[\sum_{x,y}\mathbf{1}_{\{x\text{ isolated},\,y\text{ isolated}\}}\Big] = \sum_x\mathbb{P}[x\text{ isolated}] + \sum_{x\ne y}\mathbb{P}[x\text{ isolated},\,y\text{ isolated}] = \mu + n(n-1)(1-p)^{2(n-1)}(1-p)^{-1} = \mu + \mu^2(1-p)^{-1}\big(1-\tfrac1n\big).$$
Specifically,
$$\frac{(\mathbb{E}[X])^2}{\mathbb{E}[X^2]} = \Big(\mu^{-1} + (1-p)^{-1}\big(1-\tfrac1n\big)\Big)^{-1}.$$
We will make use of the inequalities $e^{-\frac{p}{1-p}}\le 1-p\le e^{-p}$, valid for all $p\le\frac12$.

Now, if $p\ge\frac{(1+\varepsilon)\log n}{n-1}$, then
$$\mathbb{E}[X] \le ne^{-p(n-1)} = n^{-\varepsilon}\to 0,$$
so by Markov's inequality,
$$\mathbb{P}[X\ge 1] \le \mathbb{E}[X]\to 0.$$
That is, with high probability there is no isolated student (in fact, with high probability all students are connected, although that requires more work).

On the other hand, if $p\le\frac{(1-\varepsilon)\log n}{n-1}$, then
$$\mu \ge n\exp\Big(-\frac{1}{1-p}(1-\varepsilon)\log n\Big)\to\infty$$
(because $\frac{1-\varepsilon}{1-p}<1$ for large enough $n$). Thus, by the Paley-Zygmund inequality,
$$\mathbb{P}[X>0] \ge \frac{(\mathbb{E}[X])^2}{\mathbb{E}[X^2]} = \Big(\mu^{-1} + (1-p)^{-1}\big(1-\tfrac1n\big)\Big)^{-1}\to 1.$$
That is, with high probability there exists a student that has no friends. △

Introduction to Probability

201.1.8001

Ariel Yadin

Lecture 20

20.1. Convergence of Random Variables

As in the case of sequences of numbers, we would like to talk about convergence of random variables. There are many ways to say that a sequence of random variables approximates some limiting random variable. We will talk about 4 such possibilities.

20.2. a.s. and Lp Convergence

We have already seen one type of convergence; namely, $X_n\to X$ if $X_n(\omega)\to X(\omega)$ for all $\omega\in\Omega$. This is a strong type of convergence; since we are usually not interested in changes that have probability $0$, we define the following type of convergence.

Let $(X_n)_n$ be a sequence of random variables on some probability space $(\Omega,\mathcal{F},\mathbb{P})$. We have already seen that $\liminf X_n$ and $\limsup X_n$ are both random variables. Thus, the set $\{\omega : \liminf X_n(\omega) = \limsup X_n(\omega)\}$ is an event in $\mathcal{F}$. Hence we can define:

Definition 20.1. Let $(X_n)_n$ be a sequence of random variables on $(\Omega,\mathcal{F},\mathbb{P})$. Let $X$ be a random variable. We say that $(X_n)$ converges to $X$ almost surely, or a.s., denoted $X_n\xrightarrow{\text{a.s.}}X$, if $\mathbb{P}[\lim X_n = X] = 1$; that is, if
$$\mathbb{P}\Big[\big\{\omega : \lim_{n\to\infty}X_n(\omega) = X(\omega)\big\}\Big] = 1.$$

Another type of convergence is a type of average convergence:

Definition 20.2. Let $p>0$. Let $(X_n)_n$ be a sequence of random variables on $(\Omega,\mathcal{F},\mathbb{P})$ such that $\mathbb{E}[|X_n|^p]<\infty$ for all $n$. Let $X$ be a random variable such that $\mathbb{E}[|X|^p]<\infty$. We say that $(X_n)$ converges to $X$ in $L^p$, denoted $X_n\xrightarrow{L^p}X$, if
$$\lim_{n\to\infty}\mathbb{E}\big[|X_n-X|^p\big] = 0.$$

In the exercises we will show that for $0<p<q$, $L^q$ convergence implies $L^p$ convergence. We will also show that the limit is unique; that is:

Exercise 20.3. Suppose that $X_n\xrightarrow{\text{a.s.}}X$ and $X_n\xrightarrow{L^p}Y$. Show that $\mathbb{P}[X=Y] = 1$. Suppose that $X_n\xrightarrow{L^p}X$ and $X_n\xrightarrow{L^q}Y$. Show that $\mathbb{P}[X=Y] = 1$.

Example 20.4. Fix $p > 0$. Let $(\Omega, \mathcal{F}, P)$ be the uniform measure on $[0,1]$. For all $n$ let $X_n$ be the discrete random variable
$$X_n(\omega) = \begin{cases} n^{1/p} & \omega \in [0, 1/n] \\ 0 & \text{otherwise.} \end{cases}$$
So $X_n$ has density
$$f_{X_n}(s) = \begin{cases} \frac1n & s = n^{1/p} \\ 1 - \frac1n & s = 0 \\ 0 & \text{otherwise.} \end{cases}$$
Note that
$$E[|X_n|^q] = \frac1n \cdot n^{q/p} = n^{q/p - 1}.$$
Thus, if $0 < q < p$, then $E[|X_n - 0|^q] \to 0$, so $X_n \xrightarrow{L^q} 0$. However, for $q \geq p$ we have that $E[|X_n - 0|^q] \geq 1$ for all $n$, so $X_n \not\xrightarrow{L^q} 0$ (and thus $X_n \not\xrightarrow{L^q} X$ for any $X$, since the limit must be unique).

We claim that $X_n \xrightarrow{a.s.} 0$. Indeed, let $\omega \in [0,1]$. If $\omega > 0$ then for all $n > \frac1\omega$ we get that $X_n(\omega) = 0$, and thus $X_n(\omega) \to 0$. Thus,
$$P[\{\omega : X_n(\omega) \not\to 0\}] \leq P[\{0\}] = 0,$$
so $X_n \xrightarrow{a.s.} 0$.

This example has shown that:
• We can have $L^q$ convergence without $L^p$ convergence, for $p > q$.
• We can have a.s. convergence without $L^p$ convergence. △
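The moment computation in Example 20.4 is easy to check numerically. Below is a small Monte Carlo sketch (with the hypothetical parameter choices $p = 2$, $n = 100$ and an arbitrary seed); the exact values are $E[|X_n|^q] = n^{q/p-1}$.

```python
import random

# Monte Carlo sketch of Example 20.4 with p = 2: X_n = n^{1/2} on [0, 1/n],
# and X_n = 0 otherwise.  The exact moment is E[|X_n|^q] = n^{q/p - 1}.
def moment_estimate(n, q, p=2.0, trials=200_000, rng=random.Random(0)):
    total = 0.0
    for _ in range(trials):
        omega = rng.random()                     # uniform point of [0, 1]
        x = n ** (1.0 / p) if omega <= 1.0 / n else 0.0
        total += abs(x) ** q
    return total / trials

# q = 1 < p = 2: E[|X_n|] = n^{-1/2} -> 0, giving L^1 convergence to 0.
# q = 2 = p:     E[|X_n|^2] = 1 for every n, so no L^2 convergence.
print(moment_estimate(100, 1))
print(moment_estimate(100, 2))
```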

Example 20.5. For all $n$ let $X_n$ be mutually independent $\mathrm{Ber}(1/n)$ random variables. Thus, for all $n$ and all $p > 0$,
$$E[|X_n|^p] = \frac1n \to 0, \qquad \text{so } X_n \xrightarrow{L^p} 0.$$
On the other hand, let $A_n = \{|X_n| > 1/2\}$. So $(A_n)_n$ is a sequence of mutually independent events, and $P[A_n] = 1/n$. Thus, since $\sum_n P[A_n] = \infty$, using the second Borel-Cantelli Lemma,
$$P[\limsup_n A_n] = 1.$$
That is, with probability 1, there exist infinitely many $n$ such that $X_n > 1/2$. Thus,
$$P[\limsup_n X_n \geq 1/2] = 1.$$
So we cannot have that $X_n \xrightarrow{a.s.} 0$. Since the limit must be unique, we cannot have $X_n \xrightarrow{a.s.} X$ for any $X$.

We conclude that it is possible for $(X_n)$ to converge in $L^p$ for all $p$, but not to converge a.s. △
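A quick simulation illustrates Example 20.5: the individual probabilities $1/n$ vanish, yet successes keep occurring, because the harmonic series diverges. (The seed and sample sizes below are arbitrary illustrative choices.)

```python
import random

# Sketch of Example 20.5: X_n ~ Ber(1/n), independent.  The expected number
# of successes among n <= N is sum_{n<=N} 1/n ~ log N, so by Borel-Cantelli
# the event {X_n = 1} keeps happening even though P[X_n = 1] -> 0.
def successes_up_to(N, rng):
    return sum(1 for n in range(1, N + 1) if rng.random() < 1.0 / n)

rng = random.Random(1)
print([successes_up_to(10 ** k, rng) for k in (2, 3, 4)])  # slowly growing
```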

20.3. Convergence in Probability

Definition 20.6 (convergence in probability). Let $(X_n)_n$ be a sequence of random variables on $(\Omega, \mathcal{F}, P)$. We say that $(X_n)$ converges in probability to $X$, denoted $X_n \xrightarrow{P} X$, if for all $\varepsilon > 0$,
$$\lim_{n\to\infty} P[|X_n - X| > \varepsilon] = 0.$$

Example 20.7. Suppose that $X_n \xrightarrow{a.s.} X$. That is,
$$P[|X_n - X| \to 0] = 1.$$
Fix $\varepsilon > 0$. For any $n$ let $A_n = \{|X_n - X| > \varepsilon\}$. Note that if $A_n$ occurs for infinitely many $n$, then $\limsup_n |X_n - X| \geq \varepsilon > 0$. Thus, by Fatou's Lemma,
$$0 \leq \limsup_n P[A_n] \leq P[\limsup_n A_n] = P[A_n \text{ i.o.}] \leq P[\limsup_n |X_n - X| > 0] = 0.$$
Thus,
$$\lim_n P[|X_n - X| > \varepsilon] = \limsup_n P[A_n] = 0,$$
and so $X_n \xrightarrow{P} X$. That is: a.s. convergence implies convergence in probability. △

Example 20.8. Assume that $X_n \xrightarrow{L^p} X$. Then, for any $\varepsilon > 0$, by Markov's inequality,
$$P[|X_n - X| > \varepsilon] \leq \frac{E[|X_n - X|^p]}{\varepsilon^p} \to 0.$$
Thus, $X_n \xrightarrow{P} X$. That is, convergence in $L^p$ implies convergence in probability. △

The following is in the exercises:

Exercise 20.9. Let $X_n \xrightarrow{P} X$ and $X_n \xrightarrow{P} Y$. Show that $P[X = Y] = 1$.


Lecture 21

21.1. The Law of Large Numbers

In this section we will prove

Theorem 21.1 (Strong Law of Large Numbers). Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of mutually independent random variables, such that $E[X_n] = 0$ and $\sup_n E[X_n^2] < \infty$. For each $N$ let
$$S_N = \sum_{n=1}^N X_n.$$
Then,
$$\frac{S_N}{N} \xrightarrow{a.s.} 0.$$

21.1.1. Kolmogorov’s inequality.

Lemma 21.2 (Kolmogorov's inequality). Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of mutually independent random variables, such that $E[X_n] = 0$ and $E[X_n^2] < \infty$. For each $N$ let
$$S_N = \sum_{n=1}^N X_n \qquad \text{and} \qquad M_N = \max_{1\leq n\leq N} |S_n|.$$
Then, for any $\lambda > 0$,
$$P[M_N \geq \lambda] \leq \frac{\sum_{n=1}^N E[X_n^2]}{\lambda^2}.$$

Proof. Fix $\lambda > 0$ and let
$$A_n = \{|S_1| < \lambda, |S_2| < \lambda, \ldots, |S_{n-1}| < \lambda, |S_n| \geq \lambda\}.$$
We will make a few observations.

First note that since $(X_n)$ are mutually independent, they are uncorrelated, so
$$E[S_N] = 0 \qquad \text{and} \qquad \mathrm{Var}[S_N] = E[S_N^2] = \sum_{n=1}^N E[X_n^2].$$
Next, note that $M_N \geq \lambda$ if and only if there exists $1 \leq n \leq N$ such that $|S_n| \geq \lambda$ and $|S_k| < \lambda$ for all $1 \leq k < n$. That is, $M_N \geq \lambda$ implies that there exists $1 \leq n \leq N$ such that $S_n^2 \cdot \mathbb{1}_{A_n} \geq \lambda^2$.

Since $(X_1, \ldots, X_n)$ are independent of $(X_{n+1}, \ldots, X_N)$, the random variable $S_n \cdot \mathbb{1}_{A_n}$ is independent of $X_{n+1} + X_{n+2} + \cdots + X_N = S_N - S_n$. Thus, for any $1 \leq n \leq N$ we have that
$$E[S_N^2 \mathbb{1}_{A_n}] = E[(S_N - S_n + S_n)^2 \mathbb{1}_{A_n}] = E[(S_N - S_n)^2 \mathbb{1}_{A_n}] + E[S_n^2 \mathbb{1}_{A_n}] + 2E[(S_N - S_n)\cdot S_n \mathbb{1}_{A_n}] \geq E[S_n^2 \mathbb{1}_{A_n}],$$
where we have used that
$$E[(S_N - S_n)\cdot S_n \mathbb{1}_{A_n}] = E[S_N - S_n]\cdot E[S_n \mathbb{1}_{A_n}] = 0$$
by independence. We now have that
$$E[S_N^2] \geq E[S_N^2 \mathbb{1}_{\{M_N \geq \lambda\}}] = \sum_{n=1}^N E[\mathbb{1}_{A_n} S_N^2] \geq \sum_{n=1}^N E[S_n^2 \mathbb{1}_{A_n}].$$
Note that by Boole's inequality, since $M_N \geq \lambda$ implies that there exists $1 \leq n \leq N$ such that $S_n^2 \cdot \mathbb{1}_{A_n} \geq \lambda^2$,
$$P[M_N \geq \lambda] \leq \sum_{n=1}^N P[S_n^2 \mathbb{1}_{A_n} \geq \lambda^2] \leq \frac{1}{\lambda^2}\sum_{n=1}^N E[S_n^2 \mathbb{1}_{A_n}] \leq \frac{1}{\lambda^2} E[S_N^2]. \qquad \square$$

21.1.2. Kronecker's Criterion.

Lemma 21.3 (Kronecker's Criterion). Suppose that $x_1, \ldots, x_n, \ldots$ is a sequence of numbers such that $\sum_{n=1}^\infty \frac{x_n}{n}$ converges. Then,
$$\lim_{N\to\infty} \frac1N \sum_{n=1}^N x_n = 0.$$

Proof. Let $s_N = \sum_{n=1}^N \frac{x_n}{n}$. So $s_N \to s := \sum_{n=1}^\infty \frac{x_n}{n}$. Thus, also $\frac1N\sum_{n=1}^N s_n \to s$. Note that for all $n$, $x_n = n(s_n - s_{n-1})$ (where we set $s_0 = 0$). Thus,
$$\frac1N\sum_{n=1}^N x_n = \frac1N\Big(Ns_N + ((N-1)-N)s_{N-1} + ((N-2)-(N-1))s_{N-2} + \cdots + (1-2)s_1\Big) = s_N - \frac{N-1}{N}\cdot\frac{1}{N-1}\sum_{n=1}^{N-1} s_n \to s - s = 0. \qquad \square$$
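Kronecker's Criterion is easy to test numerically on a concrete sequence. Here is a sketch with the (arbitrary) choice $x_n = (-1)^n\,n/(n+1)$, for which $\sum x_n/n = \sum (-1)^n/(n+1)$ converges by the alternating series test:

```python
# Sketch of Lemma 21.3 with x_n = (-1)^n * n / (n+1): the series
# sum x_n / n converges, so the Cesaro averages (1/N) sum_{n<=N} x_n -> 0.
def cesaro(N):
    return sum((-1) ** n * n / (n + 1) for n in range(1, N + 1)) / N

for N in (10, 1_000, 100_000):
    print(N, cesaro(N))   # tends to 0
```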

21.1.3. Proof of Law of Large Numbers.

Lemma 21.4. Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of mutually independent random variables such that $E[X_n] = 0$ for all $n$. If $\sum_{k=1}^\infty \mathrm{Var}[X_k] < \infty$ then
$$P\Big[\sum_{k=1}^\infty X_k \text{ converges}\Big] = 1.$$

Proof. Let $S_n = \sum_{k=1}^n X_k$. For every $\omega \in \Omega$, $\sum_{k=1}^\infty X_k(\omega)$ converges if and only if $(S_n(\omega))_n$ converges, which is if and only if $(S_n(\omega))_n$ forms a Cauchy sequence. Thus, it suffices to show that
$$P[(S_n)_n \text{ is a Cauchy sequence}] = 1.$$

Let $n > 0$, and consider
$$S_{n+m} - S_n = X_{n+1} + X_{n+2} + \cdots + X_{n+m}.$$

Kolmogorov's inequality gives that
$$P\big[\max_{1\leq m\leq N} |S_{n+m} - S_n| \geq \lambda\big] \leq \frac{1}{\lambda^2}\cdot\sum_{k=1}^N E[X_{n+k}^2] \leq \frac{1}{\lambda^2}\cdot\sum_{k\geq1} E[X_{n+k}^2].$$
Taking $N \to \infty$ on the left-hand side, using continuity of probability for the increasing sequence of events,
$$P\big[\sup_{m\geq1} |S_{n+m} - S_n| \geq \lambda\big] \leq \frac{1}{\lambda^2}\cdot\sum_{k\geq1} E[X_{n+k}^2].$$

This last term is the tail of the convergent series $\sum_k E[X_k^2]$. Thus, for any $r > 0$ there exists $n(r)$ so that $\sum_{k\geq n(r)} E[X_k^2] < r^{-3}$, which implies that
$$P\big[\inf_n \sup_{m\geq1} |S_{n+m} - S_n| > r^{-1}\big] \leq P\big[\sup_{m\geq1} |S_{n(r)+m} - S_{n(r)}| > r^{-1}\big] < r^{-1}.$$

Now, the events $\{\inf_n \sup_{m\geq1} |S_{n+m} - S_n| > r^{-1}\}$ are increasing in $r$, so taking $r \to \infty$ with continuity of probability,
$$P\big[\inf_n \sup_{m\geq1} |S_{n+m} - S_n| > 0\big] \leq 0.$$
That is,
$$P\big[\inf_n \sup_{m\geq1} |S_{n+m} - S_n| = 0\big] = 1.$$

Now let $\omega \in \Omega$ be such that $\inf_n \sup_{m\geq1} |S_{n+m}(\omega) - S_n(\omega)| = 0$. This implies that for all $\varepsilon > 0$ there exists $N = N(\varepsilon, \omega)$ such that
$$\sup_{m\geq1} |S_{N+m}(\omega) - S_N(\omega)| < \varepsilon/2.$$
Thus, for any $n > N(\varepsilon, \omega)$ and any $m \geq 1$,
$$|S_{n+m}(\omega) - S_n(\omega)| \leq |S_{n+m}(\omega) - S_N(\omega)| + |S_n(\omega) - S_N(\omega)| < \varepsilon.$$
That is, for any $\varepsilon > 0$ there exists $N = N(\varepsilon, \omega)$ such that for all $n > N(\varepsilon, \omega)$ and all $m \geq 1$, $|S_{n+m}(\omega) - S_n(\omega)| < \varepsilon$; that is, for this $\omega$, $(S_n(\omega))_n$ is a Cauchy sequence. We have shown that
$$\big\{\omega : \inf_n \sup_{m\geq1} |S_{n+m}(\omega) - S_n(\omega)| = 0\big\} \subset \big\{\omega : (S_n(\omega))_n \text{ is a Cauchy sequence}\big\}.$$
Since the first event has probability 1, so does the second event, and we are done. $\square$

The proof of the law of large numbers is now immediate:

Proof of Theorem 21.1. By Kronecker's Criterion it suffices to show that
$$P\Big[\sum_{n=1}^\infty \frac{X_n}{n} \text{ converges}\Big] = 1.$$
By the above Lemma it suffices to show that
$$\sum_{n=1}^\infty \mathrm{Var}[X_n/n] < \infty.$$
Since $\mathrm{Var}[X_n/n] = \frac{\mathrm{Var}[X_n]}{n^2} \leq \frac{\sup_k E[X_k^2]}{n^2}$, this is immediate. $\square$

Example 21.5. The Central Bureau of Statistics wants to assess how many citizens use the train system. They poll $N$ randomly and independently chosen citizens, and set $X_j = 1$ if the $j$-th citizen uses the train, and $X_j = 0$ otherwise. Any citizen is equally likely to be chosen, and all choices are independent. As $N \to \infty$ the sample mean
$$\frac1N\sum_{j=1}^N X_j$$
converges to $E[X_j] = E[X_1]$, which is the probability that a randomly chosen citizen uses the train system. Since we choose each citizen uniformly, this probability is exactly the number of citizens that use the train divided by the total number of citizens. Thus, for large $N$,
$$C\cdot\frac1N\sum_{j=1}^N X_j$$
is a good approximation for the number of citizens that use the train system, where $C$ is the total number of citizens. △
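Example 21.5 is essentially the law of large numbers for Bernoulli indicators, and it can be sketched in a few lines (the proportion 0.37 and the seed below are hypothetical choices):

```python
import random

# LLN sketch for Example 21.5: the sample mean of i.i.d. Ber(p) indicators
# approaches the unknown proportion p as the poll size N grows.
def sample_mean(p, N, rng):
    return sum(rng.random() < p for _ in range(N)) / N

rng = random.Random(3)
p = 0.37                                  # hypothetical true proportion
for N in (100, 10_000, 1_000_000):
    print(N, sample_mean(p, N, rng))      # approaches 0.37
```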


Lecture 22

22.1. Convergence in Distribution

Previously we talked about types of convergence that required the sequence and the limit to be defined on the same probability space. We now look at a type of convergence which does not have this requirement.

Definition 22.1. Let (Xn)n be a sequence of random variables. We say that (Xn) converges in distribution to X, denoted

$$X_n \xrightarrow{D} X,$$
if for all $t \in \mathbb{R}$ such that $F_X$ is continuous at $t$,
$$\lim_{n\to\infty} F_{X_n}(t) = F_X(t).$$

That is, the distribution functions of $X_n$ converge pointwise to the distribution function of $X$.

Remark 22.2. Note that the convergence is required only at continuity points of $F_X$.

Example 22.3. Let $X_n \sim U[0, 1/n]$. Then $X_n \xrightarrow{D} 0$. Indeed, the distribution function of $0$ is $F_0(t) = 1$ for $t \geq 0$ and $F_0(t) = 0$ for $t < 0$. Note that
$$F_{X_n}(t) = \begin{cases} 0 & t < 0 \\ nt & 0 \leq t \leq 1/n \\ 1 & t > 1/n. \end{cases}$$
Since $0$ is not a continuity point of $F_0$, we don't care about it. For $t < 0$ we have that $F_{X_n}(t) = 0 = F_0(t)$. For $t > 0$ we have, for large enough $n > 1/t$, that $F_{X_n}(t) = 1 = F_0(t)$. △

Note that for the other types of convergence we needed all random variables to live on the same space. For convergence in distribution, this is not required. However, this is the weakest kind of convergence, as can be seen by the following proposition.

Proposition 22.4. Convergence in probability implies convergence in distribution.

Proof. Let $X_n \xrightarrow{P} X$. Fix $t \in \mathbb{R}$. First, for all $\varepsilon > 0$,
$$P[X_n \leq t] = P[X_n \leq t, |X_n - X| \leq \varepsilon] + P[X_n \leq t, |X_n - X| > \varepsilon] \leq P[X \leq t+\varepsilon] + P[|X_n - X| > \varepsilon] \to P[X \leq t+\varepsilon].$$
So $\limsup_n F_{X_n}(t) \leq F_X(t+\varepsilon)$ for all $\varepsilon$. Since $F_X$ is right-continuous, we get that $\limsup_n F_{X_n}(t) \leq F_X(t)$.

On the other hand, for any $\varepsilon > 0$,
$$P[X \leq t-\varepsilon] = P[X \leq t-\varepsilon, |X_n - X| \leq \varepsilon] + P[X \leq t-\varepsilon, |X_n - X| > \varepsilon] \leq P[X_n \leq t] + P[|X_n - X| > \varepsilon].$$
Taking $\liminf$ we get that $F_X(t-\varepsilon) \leq \liminf_n F_{X_n}(t)$. Now, if $F_X$ is continuous at $t$, then taking $\varepsilon \to 0$ gives
$$F_X(t) \leq \liminf_n F_{X_n}(t) \leq \limsup_n F_{X_n}(t) \leq F_X(t).$$
So $X_n \xrightarrow{D} X$. $\square$

22.2. Approximating by Smooth Functions

Lemma 22.5. Let $(X_n)_n$ be a sequence of random variables. Let $C_b^3$ be the space of all three times continuously differentiable functions on $\mathbb{R}$ with bounded derivatives. If for every $\varphi \in C_b^3$,
$$E[\varphi(X_n)] \to E[\varphi(X)],$$
then $X_n \xrightarrow{D} X$.

Proof. We want to choose a $C^3$ function $\psi$ that will approximate the step function $\mathbb{1}_{(-\infty,0]}$. For this we choose a $C^3$ function that is 1 on $(-\infty,0]$, decreasing on $[0,1]$ and 0 on $[1,\infty)$. Call this function $\psi$. (An explicit construction can be found in Section 22.8 below.)

For every $t \in \mathbb{R}$ and $\varepsilon > 0$ define
$$\psi_{t,\varepsilon}(x) = \psi(\varepsilon^{-1}(x - t)).$$

Note that for every $t \in \mathbb{R}$ and $\varepsilon > 0$ we have
$$\psi_{t-\varepsilon,\varepsilon}(x) \leq \mathbb{1}_{(-\infty,t]}(x) \leq \psi_{t,\varepsilon}(x).$$

For convergence in distribution we need to show that for every $t \in \mathbb{R}$ such that $F_X$ is continuous at $t$,
$$\lim_{n\to\infty} E[\mathbb{1}_{(-\infty,t]}(X_n)] = E[\mathbb{1}_{(-\infty,t]}(X)].$$
Fix $t \in \mathbb{R}$. For any $\varepsilon > 0$ we have that
$$E[\mathbb{1}_{(-\infty,t]}(X_n)] - E[\mathbb{1}_{(-\infty,t]}(X)] \leq E[\psi_{t,\varepsilon}(X_n)] - E[\mathbb{1}_{(-\infty,t]}(X)] \to E[\psi_{t,\varepsilon}(X)] - E[\mathbb{1}_{(-\infty,t]}(X)] \leq E[\mathbb{1}_{(-\infty,t+\varepsilon]}(X)] - E[\mathbb{1}_{(-\infty,t]}(X)] = P[X \in (t, t+\varepsilon]] = F_X(t+\varepsilon) - F_X(t).$$
Similarly,
$$E[\mathbb{1}_{(-\infty,t]}(X)] - E[\mathbb{1}_{(-\infty,t]}(X_n)] \leq E[\mathbb{1}_{(-\infty,t]}(X)] - E[\psi_{t-\varepsilon,\varepsilon}(X_n)] \to E[\mathbb{1}_{(-\infty,t]}(X)] - E[\psi_{t-\varepsilon,\varepsilon}(X)] \leq E[\mathbb{1}_{(-\infty,t]}(X)] - E[\mathbb{1}_{(-\infty,t-\varepsilon]}(X)] = P[X \in (t-\varepsilon, t]] = F_X(t) - F_X(t-\varepsilon).$$
We conclude that
$$\limsup_{n\to\infty}\big|E[\mathbb{1}_{(-\infty,t]}(X_n)] - E[\mathbb{1}_{(-\infty,t]}(X)]\big| \leq |F_X(t+\varepsilon) - F_X(t)| + |F_X(t) - F_X(t-\varepsilon)|$$
for all $\varepsilon > 0$.

For any $t$ that is a continuity point of $F_X$ the above tends to $0$ as $\varepsilon \to 0$. $\square$

22.3. Central Limit Theorem

We will now build the tools for the following very important approximation theorem:

Theorem 22.6 (Central Limit Theorem). Let $X_1, X_2, \ldots, X_n, \ldots$ be mutually independent identically distributed random variables, such that $E[X_n] = 0$ and $E[X_n^2] = 1$. Let
$$S_N = \sum_{n=1}^N X_n.$$
Then,
$$\frac{S_N}{\sqrt N} \xrightarrow{D} N(0,1).$$
That is, for all $t \in \mathbb{R}$,
$$P[S_N \leq t\sqrt N] \to \frac{1}{\sqrt{2\pi}}\int_{-\infty}^t e^{-s^2/2}\,ds.$$
In the exercises we will show the following:

Exercise 22.7. Let $X_1, X_2, \ldots, X_n, \ldots$ be mutually independent identically distributed random variables, such that $E[X_n] = \mu$ and $\mathrm{Var}[X_n] = \sigma^2$. Let $S_N = \sum_{n=1}^N X_n$. Then,
$$\frac{S_N - N\mu}{\sqrt N\,\sigma} \xrightarrow{D} N(0,1).$$

That is, no matter what distribution we start with, we can approximate the sum of many independent trials by a normal random variable.

22.4. Uses for the CLT

Example 22.8. Every student's grade in a course has expectation 80 and standard deviation 20. Give an approximation of the average grade for 100 students, if all students are independent. What is a good estimate for the probability of the average grade to be below 70?

If $X_k$ is the grade of the $k$-th student, then the average grade is $M := \frac{1}{100}\sum_{k=1}^{100} X_k$.

The central limit theorem tells us that $\frac{1}{10\cdot20}(100M - 100\cdot80) = \frac12(M - 80)$ is very close to a $N(0,1)$ random variable. Thus,
$$P[M < 70] = P\big[\tfrac12(M - 80) < -5\big] \approx P[N(0,1) < -5] = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{-5} e^{-s^2/2}\,ds = \ldots$$
which can be calculated using a standard normal distribution table. △
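Instead of a table, the value can be computed with the error function, using the identity $\Phi(t) = \frac12\big(1 + \mathrm{erf}(t/\sqrt2)\big)$:

```python
import math

# Standard normal CDF via the error function:
# Phi(t) = (1 + erf(t / sqrt(2))) / 2.
def phi(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

print(phi(-5.0))   # P[N(0,1) < -5], roughly 2.9e-7
```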

Example 22.9. $X_1, X_2, \ldots$ are independent random variables with the same distribution, such that $E[X_n] = 60$ and $\mathrm{Var}[X_n] = 25$.

Show that we can take $N$ large enough so that the empirical average $\frac1N\sum_{k=1}^N X_k$ is between 55 and 65 with probability greater than 0.99. How large should we take $N$ if we want to ensure this?

Let $S_N = \sum_{k=1}^N X_k$. The central limit theorem tells us that the distribution of $\frac{1}{5\sqrt N}(S_N - 60N)$ converges to the distribution of a $N(0,1)$ random variable. Thus,
$$P\Big[55 < \frac{S_N}{N} < 65\Big] = P\Big[-\sqrt N < \frac{S_N - 60N}{5\sqrt N} < \sqrt N\Big] \approx P[-\sqrt N < N(0,1) < \sqrt N].$$
The right-hand side goes to 1 as $N \to \infty$, so there is some large enough $N$ for which it is greater than 0.99. △

Example 22.10. $X_1, X_2, \ldots$ are independent random variables with the same distribution, such that $E[X_n] = \mu$ and $\mathrm{Var}[X_n] = 1$.

Show that there exist $N_0$ and $c > 0$ such that for all $N > N_0$,
$$P\big[|S_N - N\mu| \leq \tfrac12\sqrt N\big] \geq c,$$
where $S_N = \sum_{k=1}^N X_k$.

Note that we cannot use Kolmogorov here, because that would only give
$$P\big[\max_{1\leq n\leq N} |S_n - n\mu| > \tfrac12\sqrt N\big] \leq 4.$$
However, the central limit theorem tells us that
$$P\big[|S_N - N\mu| \leq \tfrac12\sqrt N\big] \to P\big[|N(0,1)| \leq \tfrac12\big].$$
If we take $c = \frac12 P[|N(0,1)| \leq \frac12]$, we get that for all large enough $N$,
$$P\big[|S_N - N\mu| \leq \tfrac12\sqrt N\big] > c.$$

△

Example 22.11. The government decides to test whether the lab workers at the central lab of Kupat Cholim are adequate. They are given an exam, and graded between 0 and 100. The exam is such that the expectation of a random worker's grade is 80 and the standard deviation is 20. If the lab has 10000 workers, give a good approximation of the probability that the average grade in the lab is smaller than 70.

In this case, we can let $X_1, \ldots, X_{10000}$ be the grades of the workers, so $E[X_j] = 80$ and $\mathrm{Var}[X_j] = 400$. If $A = \frac{1}{10000}\sum_{j=1}^{10000} X_j$ is the average grade, the central limit theorem tells us that
$$P[A \leq 70] = P[100(A - 80) \leq -10\cdot100] = P\Big[\frac{1}{100\cdot20}\sum_{j=1}^{10000}(X_j - 80) \leq -50\Big]$$
is very close to
$$P[N(0,1) \leq -50] = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{-50} e^{-s^2/2}\,ds. \qquad △$$

Example 22.12. Suppose there are 1.2 million people that will actually vote in the elections. We want to forecast how many mandates party A will receive in the elections. Each mandate is worth $1.2\cdot10^6/120 = 10^4$ people. There are $a$ people that will actually vote for party A, so party A will get $10^{-4}a$ mandates. Suppose we choose a voter uniformly at random. The probability that she votes for party A is then $p = \frac{a}{1.2\cdot10^6}$, and the variance is $p(1-p)$. If we repeat this experiment 900 times independently, the average number of polled people who said they would vote for party A should be close to $p$.

So if $X_j$ is the indicator of the event that the $j$-th voter polled said she would vote for party A, we get that
$$120\cdot\frac{1}{900}\sum_{j=1}^{900} X_j$$
is close to $120p = 10^{-4}a$, which is the number of mandates party A gets. How good is this approximation? What is the number of mandates so that this is close to the real number with 95% confidence? For any $k$, we have that
$$P\Big[\Big|120\cdot\frac{1}{900}\sum_{j=1}^{900}X_j - 10^{-4}a\Big| > k\Big] = P\Big[\Big|\frac{1}{900}\sum_{j=1}^{900}(X_j - p)\Big| > \frac{k}{120}\Big] = P\Big[\Big|\frac{1}{30}\sum_{j=1}^{900}(X_j - p)\Big| > \frac{k}{4}\Big],$$
which is close to
$$P\Big[|N(0,1)| > \frac{k}{4\sqrt{p(1-p)}}\Big]$$
by the central limit theorem. If we take $k \geq 4\sqrt{p(1-p)}\,t$ then this is at most
$$2\int_t^\infty \frac{1}{\sqrt{2\pi}}e^{-s^2/2}\,ds \leq \frac2t\int_t^\infty s\frac{1}{\sqrt{2\pi}}e^{-s^2/2}\,ds = \frac{\sqrt2}{t\sqrt\pi}e^{-t^2/2}.$$
For $t = 2.05$ this is smaller than 0.05. Since $p(1-p) \leq 1/4$ we can take $k = 5 > 2\cdot2.05 \geq 4\sqrt{p(1-p)}\cdot2.05$, and then
$$P\Big[\Big|120\cdot\frac{1}{900}\sum_{j=1}^{900}X_j - 10^{-4}a\Big| > k\Big] < 0.05.$$

△
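The tail estimate used in Example 22.12 is simple to evaluate; a sketch:

```python
import math

# The Gaussian tail bound from Example 22.12:
# P[|N(0,1)| > t] <= sqrt(2/pi) * exp(-t^2/2) / t.
def tail_bound(t):
    return math.sqrt(2.0 / math.pi) * math.exp(-t * t / 2.0) / t

print(tail_bound(2.05))   # below 0.05, so k = 5 mandates suffices
```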

Example 22.13. Let $(X_n)_n$ be independent identically distributed random variables, and let $(Y_n)_n$ be independent identically distributed random variables, such that $E[X_n] = E[Y_n]$ for all $n$, and for all $n$:
$$\mathrm{Var}[X_n] = \mathrm{Var}[Y_n] = 1 \qquad \text{and} \qquad \mathrm{Cov}(X_n, Y_n) = \tfrac12.$$
Let $S = \sum_{j=1}^{10^4} X_j$ and $T = \sum_{j=1}^{10^4} Y_j$. Give a good approximation for $P[S > T]$.

We have that
$$S - T = \sum_{j=1}^{10^4}(X_j - Y_j),$$
where $(X_n - Y_n)_n$ are independent with mean 0 and
$$\mathrm{Var}[X_j - Y_j] = \mathrm{Var}[X_j] + \mathrm{Var}[Y_j] - 2\,\mathrm{Cov}(X_j, Y_j) = 1.$$
So
$$P[S > T] = P[S - T > 0] = P\big[\tfrac{1}{100}(S - T) > 0\big] \approx P[N(0,1) > 0] = \tfrac12.$$

△

Example 22.14. A gambler plays a game repeatedly, where at each game he earns $\mathrm{Geo}(0.1)$ Shekel and loses $\mathrm{Poi}(10)$ Shekel, all wins and losses independent.

• What is a good approximation for the probability that he has more than 20 Shekel after 100 games?
• What is a good approximation for the probability that he has exactly 0 Shekel?

The amount of money after 100 games is
$$M := \sum_{j=1}^{100}(X_j - Y_j)$$
where $X_j \sim \mathrm{Geo}(0.1)$ and $Y_j \sim \mathrm{Poi}(10)$, all independent. Since $E[X_j - Y_j] = 10 - 10 = 0$ and since
$$\mathrm{Var}[X_j - Y_j] = \mathrm{Var}[X_j] + \mathrm{Var}[Y_j] = 100\cdot0.9 + 10 = 100,$$
we have, using the central limit theorem,
$$P[M > 20] = P\big[\tfrac{1}{100}M > 0.2\big] \approx P[N(0,1) > 0.2].$$
For $P[M = 0]$ we cannot use the approximation $P[N = 0] = 0$! So we use the following trick: since $M$ takes on only integer values, $M = 0$ if and only if $M \in [-1/2, 1/2]$. So,
$$P[M = 0] = P\big[\tfrac{1}{100}M \in [-\tfrac{1}{200}, \tfrac{1}{200}]\big] \approx P\big[-\tfrac{1}{200} \leq N(0,1) \leq \tfrac{1}{200}\big]. \qquad △$$
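Both normal approximations in Example 22.14 evaluate easily with the error function (a sketch; `phi` is the standard normal CDF):

```python
import math

# Normal approximations from Example 22.14, with the continuity correction
# for the point probability P[M = 0].
def phi(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

print(1.0 - phi(0.2))             # P[M > 20]  ~ P[N(0,1) > 0.2]
print(phi(0.005) - phi(-0.005))   # P[M = 0], continuity-corrected
```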

22.5. Lindeberg’s Method

A "hands-on" proof of the CLT (that does not use the Lévy Continuity Theorem) was given by Jarl Lindeberg (1876–1932). The main idea is: if $Z_1, \ldots, Z_n$ are independent $N(0,1)$ random variables, then $n^{-1/2}(Z_1 + \cdots + Z_n) \sim N(0,1)$. So we can change $X_1, X_2, \ldots, X_n$ into $Z_1, \ldots, Z_n$ one by one, each time comparing the sums. Hopefully each of the $n$ differences will not be large. First, a technical lemma regarding $C^3$ functions.

Lemma 22.15. For any $C^3$ function $\varphi$ and any $x, y \in \mathbb{R}$,
$$\Big|\varphi(x+y) - \varphi(x) - \varphi'(x)y - \tfrac12\varphi''(x)y^2\Big| \leq \min\big\{\|\varphi''\|_\infty\,y^2,\ \|\varphi'''\|_\infty\,|y|^3\big\}.$$

Proof. For any $y > 0$, by Taylor's theorem (Brook Taylor, 1685–1731), expanding around $x$,
$$\Big|\varphi(x+y) - \varphi(x) - \varphi'(x)y - \tfrac12\varphi''(x)y^2\Big| \leq \tfrac16\|\varphi'''\|_\infty\cdot|y|^3.$$
The second order Taylor expansion
$$\big|\varphi(x+y) - \varphi(x) - \varphi'(x)y\big| \leq \tfrac12\|\varphi''\|_\infty\cdot y^2,$$
with the triangle inequality gives that
$$\Big|\varphi(x+y) - \varphi(x) - \varphi'(x)y - \tfrac12\varphi''(x)y^2\Big| \leq \big|\varphi(x+y) - \varphi(x) - \varphi'(x)y\big| + \tfrac12\|\varphi''\|_\infty\cdot y^2 \leq \|\varphi''\|_\infty\cdot y^2.$$
If $y < 0$, apply the previous argument to the function $\varphi(-x)$, and note that the absolute values of the derivatives of the two functions are always the same. $\square$

Lemma 22.16. Let $A, X, Z$ be independent random variables with $E[A] = E[X] = E[Z] = 0$ and $E[X^2] = E[Z^2]$. Then for any $C^3$ function $\varphi$ and any $\varepsilon, \delta > 0$,
$$\big|E[\varphi(A + \delta X)] - E[\varphi(A + \delta Z)]\big| \leq M_\varphi\delta^2\cdot\Big(\varepsilon\,E[X^2] + \varepsilon\,E[Z^2] + E[Z^2\mathbb{1}_{\{|Z|>\varepsilon\delta^{-1}\}}] + E[X^2\mathbb{1}_{\{|X|>\varepsilon\delta^{-1}\}}]\Big),$$
where $M_\varphi = \max\{\|\varphi''\|_\infty, \|\varphi'''\|_\infty\}$. (We gain a factor $\varepsilon$ in the first terms, and the indicator $\mathbb{1}_{\{|X|>\varepsilon\delta^{-1}\}}$ in the second ones.)

Proof. Set
$$Y = \Big(\varphi(A+\delta X) - \varphi(A) - \varphi'(A)\cdot\delta X - \tfrac12\varphi''(A)\cdot\delta^2X^2\Big) - \Big(\varphi(A+\delta Z) - \varphi(A) - \varphi'(A)\cdot\delta Z - \tfrac12\varphi''(A)\cdot\delta^2Z^2\Big).$$
Since $E[X] = E[Z]$ and $E[X^2] = E[Z^2]$, we get that
$$E[Y] = E[\varphi(A+\delta X)] - E[\varphi(A+\delta Z)].$$
Also, for $Y_X = \varphi(A+\delta X) - \varphi(A) - \varphi'(A)\cdot\delta X - \tfrac12\varphi''(A)\cdot\delta^2X^2$, by Lemma 22.15,
$$|E[Y_X]| \leq \big|E[Y_X\mathbb{1}_{\{|X|>\varepsilon\delta^{-1}\}}]\big| + \big|E[Y_X\mathbb{1}_{\{|X|\leq\varepsilon\delta^{-1}\}}]\big| \leq \|\varphi''\|_\infty\delta^2\,E[X^2\mathbb{1}_{\{|X|>\varepsilon\delta^{-1}\}}] + \|\varphi'''\|_\infty\delta^3\,E[|X|^3\mathbb{1}_{\{|X|\leq\varepsilon\delta^{-1}\}}] \leq \|\varphi''\|_\infty\delta^2\,E[X^2\mathbb{1}_{\{|X|>\varepsilon\delta^{-1}\}}] + \varepsilon\|\varphi'''\|_\infty\delta^2\,E[X^2].$$
The same bound holds for the analogous $Y_Z$, and since $E[Y] = E[Y_X] - E[Y_Z]$, summing the two bounds gives the lemma. $\square$

We are now ready to prove the Central Limit Theorem.

Theorem 22.17 (Central Limit Theorem). Let $(X_n)_n$ be independent identically distributed random variables with $E[X_k] = 0$ and $E[X_k^2] = 1$. Let $S_n = \sum_{k=1}^n X_k$ and $Z \sim N(0,1)$. Then, for any $C_b^3$ function $\varphi$,
$$E[\varphi(n^{-1/2}S_n)] \to E[\varphi(Z)],$$
so consequently, $n^{-1/2}S_n \xrightarrow{D} Z$.

Proof. Let $(Z_n)_n$ be independent $N(0,1)$ random variables, also independent of $(X_n)_n$. Note that $n^{-1/2}\sum_{k=1}^n Z_k \sim N(0,1)$.

Fix $n$ and let $0 \leq k \leq n$. Define
$$A_k = X_1 + \cdots + X_k + Z_{k+1} + \cdots + Z_n.$$
So $S_n = A_n$ and $n^{-1/2}A_0 \sim N(0,1)$. We can write the telescopic sum
$$E[\varphi(Z)] - E[\varphi(n^{-1/2}S_n)] = \sum_{k=0}^{n-1}\big(E[\varphi(n^{-1/2}A_k)] - E[\varphi(n^{-1/2}A_{k+1})]\big).$$
So it suffices to bound the increments $\big|E[\varphi(n^{-1/2}A_k)] - E[\varphi(n^{-1/2}A_{k+1})]\big|$.

For $0 \leq k \leq n-1$ let $A = n^{-1/2}(X_1 + \cdots + X_k + Z_{k+2} + \cdots + Z_n)$. So $n^{-1/2}A_k = A + n^{-1/2}Z_{k+1}$ and $n^{-1/2}A_{k+1} = A + n^{-1/2}X_{k+1}$. Thus, by Lemma 22.16 with $\delta = n^{-1/2}$, for any $\varepsilon > 0$,
$$\big|E[\varphi(n^{-1/2}A_k)] - E[\varphi(n^{-1/2}A_{k+1})]\big| \leq M_\varphi\cdot n^{-1}\cdot\Big(2\varepsilon + E[Z_{k+1}^2\mathbb{1}_{\{|Z_{k+1}|>\varepsilon\sqrt n\}}] + E[X_{k+1}^2\mathbb{1}_{\{|X_{k+1}|>\varepsilon\sqrt n\}}]\Big).$$

Summing over $k$ we get for any $\varepsilon > 0$,
$$\big|E[\varphi(Z)] - E[\varphi(n^{-1/2}S_n)]\big| \leq M_\varphi\cdot\Big(2\varepsilon + E[Z^2\mathbb{1}_{\{|Z|>\varepsilon\sqrt n\}}] + E[X^2\mathbb{1}_{\{|X|>\varepsilon\sqrt n\}}]\Big).$$

Taking $n \to \infty$, and recalling that $\varepsilon > 0$ was arbitrary, it suffices to show that for any random variable $X$ with finite variance, the quantity $E[X^2\mathbb{1}_{\{|X|>\varepsilon\sqrt n\}}] \to 0$. Indeed, the sequence $(X^2\mathbb{1}_{\{|X|\leq\varepsilon\sqrt n\}})_n$ is monotonically increasing to $X^2$. So monotone convergence implies that
$$E[X^2\mathbb{1}_{\{|X|>\varepsilon\sqrt n\}}] = E[X^2] - E[X^2\mathbb{1}_{\{|X|\leq\varepsilon\sqrt n\}}] \to 0. \qquad \square$$
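For one concrete test function, the convergence $E[\varphi(n^{-1/2}S_n)] \to E[\varphi(Z)]$ can even be computed exactly. With $\varphi = \cos$ (which is $C_b^3$) and $\pm1$ steps, independence gives $E[\cos(S_n/\sqrt n)] = \cos(1/\sqrt n)^n$, while $E[\cos(Z)] = \varphi_Z(1) = e^{-1/2}$; a sketch:

```python
import math

# Exact evaluation of E[phi(S_n / sqrt(n))] for phi = cos and +-1 steps:
# E[cos(t S_n)] = Re E[e^{i t S_n}] = cos(t)^n, with t = 1/sqrt(n).
def smoothed_sum(n):
    return math.cos(1.0 / math.sqrt(n)) ** n

for n in (10, 100, 10_000):
    print(n, smoothed_sum(n))
print(math.exp(-0.5))   # the limit E[cos(Z)], about 0.6065
```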

22.6. Characteristic Function

The characteristic function is essentially just the Fourier transform of the distribution of a random variable.

22.6.1. Preliminaries. Let $(X,Y)$ be a two-dimensional random variable. We can think of $(X,Y)$ as a complex valued random variable and write $(X,Y) = X + iY$. Define the expectation of a complex valued random variable to be $E[X + iY] = E[X] + iE[Y]$, as long as these expectations are all finite.

Note that if $g : \mathbb{R} \to \mathbb{C}$ is a measurable function, then we can write $g = \mathrm{Re}\,g + i\,\mathrm{Im}\,g$, and $\mathrm{Re}\,g, \mathrm{Im}\,g$ are also measurable. Moreover, if $X, Y$ are independent, and $g, h : \mathbb{R} \to \mathbb{C}$ are measurable, then $g(X), h(Y)$ are independent, since $\mathrm{Re}\,g(X), \mathrm{Im}\,g(X)$ are independent of $\mathrm{Re}\,h(Y), \mathrm{Im}\,h(Y)$.

Definition 22.18. Let $X$ be a random variable. Define a function $\varphi_X : \mathbb{R} \to \mathbb{C}$ by
$$\varphi_X(t) := E[e^{itX}] = E[\cos(tX)] + iE[\sin(tX)].$$
$\varphi_X$ is called the characteristic function of $X$.

Note that $\varphi_X$ is always defined, since $E[|\cos(tX)|] \leq 1$ and $E[|\sin(tX)|] \leq 1$. Moreover, we get that (as a complex number) $|\varphi_X(t)| \leq 1$.

Note that if $X$ is discrete with density $f_X$ then
$$\varphi_X(t) = \sum_r f_X(r)e^{itr}.$$
If $X$ is absolutely continuous with density $f_X$ then
$$\varphi_X(t) = \int_{-\infty}^\infty e^{its}f_X(s)\,ds.$$
(This latter object is the Fourier transform of $f_X$.)

Example 22.19. Let’s calculate some characteristic functions:

• $X \sim \mathrm{Ber}(p)$:
$$\varphi_X(t) = pe^{it} + (1-p) = 1 - p(1 - e^{it}).$$
• $Z = X + Y$ for $X, Y$ independent:
$$\varphi_Z(t) = E[e^{itX}e^{itY}] = E[e^{itX}]\,E[e^{itY}] = \varphi_X(t)\cdot\varphi_Y(t).$$
• $Y = cX$ for some real $c \in \mathbb{R}$:
$$\varphi_Y(t) = E[e^{itcX}] = \varphi_X(tc).$$
• $X \sim \mathrm{Bin}(n,p)$: We can write $X = \sum_{k=1}^n X_k$ where $(X_k)_{k=1}^n$ are independent $\mathrm{Ber}(p)$. Thus,
$$\varphi_X(t) = \prod_{k=1}^n \varphi_{X_k}(t) = \big(1 - p(1 - e^{it})\big)^n.$$
• $X \sim \mathrm{Geo}(p)$:
$$\varphi_X(t) = \sum_{k=1}^\infty (1-p)^{k-1}pe^{itk} = \frac{pe^{it}}{1 - (1-p)e^{it}}.$$
• $X \sim \mathrm{Poi}(\lambda)$:
$$\varphi_X(t) = \sum_{k=0}^\infty e^{-\lambda}\frac{\lambda^k}{k!}e^{itk} = e^{-\lambda(1 - e^{it})}.$$
• $X \sim U[a,b]$:
$$\varphi_X(t) = \int_{-\infty}^\infty e^{its}f_X(s)\,ds = \frac{1}{b-a}\int_a^b e^{its}\,ds = \frac{1}{it(b-a)}\big(e^{itb} - e^{ita}\big).$$
• $X \sim \mathrm{Exp}(\lambda)$:
$$\varphi_X(t) = \int_0^\infty \lambda e^{-\lambda s}e^{its}\,ds = \frac{\lambda}{\lambda - it}.$$
• $X \sim N(\mu,\sigma)$: Write $Y = (X - \mu)/\sigma$. So $Y \sim N(0,1)$. We use the fact that $s \mapsto \sin(ts)e^{-s^2/2}$ is an odd function, so its integral over $\mathbb{R}$ is 0. Since $e^{its} = \cos(ts) + i\sin(ts)$ we get that
$$\varphi_Y(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-s^2/2}e^{its}\,ds = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-s^2/2}\cos(ts)\,ds.$$
Let $\chi_s(t) = e^{-s^2/2}\cos(ts)$. Since $|\chi_s(t)| \leq 1$; since $\chi_s'(t) = -e^{-s^2/2}s\sin(ts)$, which is continuous in $t$ for every $s$; since for any $t$
$$\int_{\mathbb{R}}\int_{-\varepsilon}^\varepsilon |\chi_s'(t+\eta)|\,d\eta\,ds \leq \int_{\mathbb{R}} 2\varepsilon|s|e^{-s^2/2}\,ds < \infty;$$
and since
$$\int_{\mathbb{R}}\chi_s'(t)\,ds = -\int_{-\infty}^\infty \sin(ts)\cdot se^{-s^2/2}\,ds = \Big[\sin(ts)e^{-s^2/2}\Big]_{-\infty}^\infty - \int_{-\infty}^\infty t\cos(ts)e^{-s^2/2}\,ds = -t\int_{\mathbb{R}}\chi_s(t)\,ds,$$
which is continuous in $t$, we can differentiate $\varphi_Y$ under the integral to get
$$\varphi_Y'(t) = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\chi_s'(t)\,ds = -t\cdot\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\chi_s(t)\,ds = -t\varphi_Y(t).$$
Dividing by $\varphi_Y$ we get that
$$\frac{d}{dt}\big(\log\varphi_Y(t)\big) = \frac{\varphi_Y'(t)}{\varphi_Y(t)} = -t.$$
Integrating, $\log\varphi_Y(t) = -\frac{t^2}{2} + C$, and since $\varphi_Y(0) = 1$ we get that $C = 0$ and $\varphi_Y(t) = e^{-t^2/2}$. This completes the calculation for $Y \sim N(0,1)$. For $X = \sigma Y + \mu$ we get
$$\varphi_X(t) = E[e^{it\sigma Y}]e^{it\mu} = e^{it\mu}\varphi_{\sigma Y}(t) = e^{it\mu}\varphi_Y(\sigma t) = e^{it\mu}e^{-\sigma^2t^2/2}.$$

△
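The closed forms in Example 22.19 can be sanity-checked numerically, for instance for the Poisson case, by summing the defining series directly (the parameters below are arbitrary):

```python
import cmath, math

# Check of phi_X(t) = exp(-lam (1 - e^{it})) for X ~ Poi(lam), by summing
# the series sum_k e^{-lam} lam^k / k! e^{itk} term by term.
def phi_poisson_series(lam, t, terms=100):
    total, term = 0j, math.exp(-lam)          # term = e^{-lam} lam^k / k!
    for k in range(terms):
        total += term * cmath.exp(1j * t * k)
        term *= lam / (k + 1)
    return total

lam, t = 10.0, 0.7
closed_form = cmath.exp(-lam * (1.0 - cmath.exp(1j * t)))
print(abs(phi_poisson_series(lam, t) - closed_form))   # essentially 0
```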

Lemma 22.20. Let $X$ be a random variable with $E[X] = 0$ and $E[X^2] = 1$. Then, as $t \to 0$,
$$\varphi_X(t) = 1 - \frac{t^2}{2} + o(t^2).$$

Proof. As in Lemma 22.15 we get that for any $x \in \mathbb{R}$ and $t > 0$,
$$\Big|e^{itx} - 1 - itx + \frac{t^2}{2}x^2\Big| \leq \min\big\{t^2x^2,\ t^3|x|^3\big\}.$$

Thus, for any $\varepsilon > 0$,
$$\Big|E[e^{itX}] - 1 + \frac{t^2}{2}\Big| \leq E\big[\min\{t^2X^2, t^3|X|^3\}\big] \leq t^2\,E[X^2\mathbb{1}_{\{|X|>\varepsilon t^{-1}\}}] + \varepsilon t^2.$$
We have already seen that for $0 < \eta < 1$, as $t \to 0$, $E[X^2\mathbb{1}_{\{|X|>t^{\eta-1}\}}] \to 0$ by monotone convergence. So choosing $\varepsilon = t^\eta$, as $t \to 0$,
$$\varphi_X(t) = 1 - \frac{t^2}{2} + o(t^2). \qquad \square$$

22.7. Lévy's Continuity Theorem

Perhaps the main theorem using characteristic functions is a theorem of Paul Lévy (1886–1971), giving a necessary and sufficient condition for convergence in distribution.

Theorem 22.21 (Lévy's Continuity Theorem). Let $(X_n)_n$ be a sequence of random variables, and let $X$ be another random variable. Then, $X_n \xrightarrow{D} X$ if and only if
$$\varphi_{X_n}(t) \to \varphi_X(t) \qquad \text{for all } t \in \mathbb{R}.$$

22.7.1. Proof of CLT. We can use Lévy's Continuity Theorem to give a different proof of the Central Limit Theorem.

Proof of Theorem 22.6. Let $\varphi(t) = \varphi_{X_n}(t)$ (the $X_n$ all share the same distribution). Note that
$$\varphi_{S_N/\sqrt N}(t) = \varphi(t/\sqrt N)^N.$$
By Lemma 22.20, as $N \to \infty$,
$$\varphi(t/\sqrt N) = 1 - \frac{t^2}{2N} + o(t^2N^{-1}).$$
Thus,
$$\lim_{N\to\infty}\varphi(t/\sqrt N)^N = e^{-t^2/2}.$$
So $\varphi_{S_N/\sqrt N}(t) \to e^{-t^2/2}$, which is the characteristic function of a $N(0,1)$ random variable. By Lévy's Continuity Theorem we get that $S_N/\sqrt N \xrightarrow{D} N(0,1)$. $\square$

22.8. An Explicit Approximation of the Step Function

In this section we construct a function $\psi$ which is $C_b^3$, takes the value 1 on $(-\infty,0]$, is decreasing on $[0,1]$, and takes the value 0 on $[1,\infty)$.

Some consideration leads us to understand that we would like $\psi' < 0$ on $(0,1)$, and so we search for a function such that $\psi''(x) = -\psi''(1-x)$ on $[0,1]$, vanishes at the endpoints $0, 1$, and obtains a local minimum or maximum at the endpoints $0, 1$. Some thought leads to the function $g(x) = x^2(1-x)^2(1-2x)$. Integrating and taking a derivative we have that for
$$h(x) = \frac{x^4}{12} - \frac{x^5}{5} + \frac{x^6}{6} - \frac{x^7}{21},$$
the derivatives on $(0,1)$ satisfy $h'(x) = \tfrac13x^3(1-x)^3$, $h''(x) = g(x)$ and $h'''(x) = g'(x) = 2x - 12x^2 + 20x^3 - 10x^4$.

Set $A = 420 = 7\cdot5\cdot3\cdot4$. Because $h(1) = A^{-1}$ and $h(0) = 0$ we take
$$\psi(x) = \begin{cases} 1 - A\cdot h(x) & x \in [0,1] \\ 1 & x < 0 \\ 0 & x > 1. \end{cases}$$
We have that on $(0,1)$, $\psi' = -Ah' < 0$, also $\psi'' = -Ah''$ and $\psi''' = -Ah'''$. These three derivatives all vanish at $0$ and $1$, so $\psi$ is $C_b^3$.
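The claimed properties of $\psi$ can be verified directly; a numerical sketch:

```python
# Check of the construction in Section 22.8: psi(x) = 1 - 420 h(x) on [0, 1]
# interpolates smoothly between 1 and 0 and is monotone decreasing.
def h(x):
    return x**4 / 12 - x**5 / 5 + x**6 / 6 - x**7 / 21

def psi(x):
    if x < 0:
        return 1.0
    if x > 1:
        return 0.0
    return 1.0 - 420.0 * h(x)

vals = [psi(k / 100.0) for k in range(101)]
print(abs(psi(0.0) - 1.0) < 1e-9, abs(psi(1.0)) < 1e-9)   # endpoint values
print(all(vals[i] >= vals[i + 1] for i in range(100)))    # decreasing
```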

Figure 6. The second derivative $-20\cdot g(x)$ and the function $1 - A\cdot h(x)$.


Lecture 23

23.1. Concentration of Measure

Suppose $(X_n)_n$ is a sequence of i.i.d. random variables with $P[X_n = 1] = P[X_n = -1] = 1/2$. Let
$$S_n = \sum_{k=1}^n X_k.$$
So $E[S_n] = 0$ and $\mathrm{Var}[S_n] = E[S_n^2] = n$. Chebyshev's inequality tells us that
$$P[|S_n| \geq \lambda\sqrt n] \leq \lambda^{-2}.$$
However, the central limit theorem gives that for large $n$,
$$P[|S_n| \geq \lambda\sqrt n] \approx P[|N(0,1)| \geq \lambda] = \frac{1}{\sqrt{2\pi}}\int_\lambda^\infty e^{-s^2/2}\,ds + \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{-\lambda} e^{-s^2/2}\,ds \leq \sqrt{2/\pi}\,\lambda^{-1}\int_\lambda^\infty se^{-s^2/2}\,ds = \sqrt{2/\pi}\,\lambda^{-1}\Big(-e^{-s^2/2}\Big)\Big|_\lambda^\infty = \sqrt{2/\pi}\,\lambda^{-1}e^{-\lambda^2/2}.$$
This decays much faster than $\lambda^{-2}$. However, we don't know how good our approximation by a standard normal is.

Because $(X_n)_n$ are independent, we can obtain a very nice concentration result using a smart trick of Bernstein.

Theorem 23.1 (Bernstein's Inequality). Let $(X_n)_n$ be independent random variables such that for all $n$, $|X_n| \leq 1$ a.s. and $E[X_n] = 0$. Let $S_n = \sum_{k=1}^n X_k$. Then for any $\lambda > 0$,
$$P[S_n \geq \lambda] \leq \exp\Big(-\frac{\lambda^2}{2n}\Big),$$
and consequently,
$$P[|S_n| \geq \lambda] \leq 2\exp\Big(-\frac{\lambda^2}{2n}\Big).$$

Proof. There are two main ideas.

The first idea is that for a random variable $X$ with $E[X] = 0$ and $|X| \leq 1$ a.s. one has $E[e^{\alpha X}] \leq e^{\alpha^2/2}$. Indeed, $g(x) = e^{\alpha x}$ is a convex function, so for $|x| \leq 1$ we can write $x = \beta\cdot1 + (1-\beta)\cdot(-1)$, where $\beta = \frac{x+1}{2}$, and
$$e^{\alpha x} \leq \beta e^\alpha + (1-\beta)e^{-\alpha} = \cosh(\alpha) + x\sinh(\alpha).$$
(Here $2\cosh(\alpha) = e^\alpha + e^{-\alpha}$ and $2\sinh(\alpha) = e^\alpha - e^{-\alpha}$.) Thus, because $E[X] = 0$, and using $(2k)! \geq 2^kk!$,
$$E[e^{\alpha X}] \leq \cosh(\alpha) + E[X]\sinh(\alpha) = \cosh(\alpha) = \sum_{k=0}^\infty \frac{\alpha^{2k}}{(2k)!} \leq \sum_{k=0}^\infty \frac{\alpha^{2k}}{2^kk!} = e^{\alpha^2/2}.$$
The second idea, due to Sergei Bernstein (1880–1968), is to apply the Chebyshev / Markov inequality to the non-negative random variable $e^{\alpha X}$, and then to optimize over $\alpha$. In our case, using independence,
$$E[e^{\alpha S_n}] = \prod_{k=1}^n E[e^{\alpha X_k}] \leq e^{\alpha^2n/2}.$$
Now apply Markov's inequality to the non-negative random variable $e^{\alpha S_n}$ to get for any $\alpha > 0$,
$$P[S_n \geq \lambda] = P[e^{\alpha S_n} \geq e^{\alpha\lambda}] \leq \exp\Big(\frac12n\alpha^2 - \alpha\lambda\Big).$$
Optimizing over $\alpha$, we take $\alpha = \lambda/n$ and get
$$P[S_n \geq \lambda] \leq \exp\Big(-\frac{\lambda^2}{2n}\Big). \qquad \square$$
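Bernstein's bound can be compared against simulation for $\pm1$ steps (the trial count and seed below are arbitrary choices):

```python
import math, random

# Empirical tail P[S_n >= lam] for +-1 steps versus Bernstein's bound
# exp(-lam^2 / (2n)).
def empirical_tail(n, lam, trials=20_000, rng=random.Random(4)):
    hits = 0
    for _ in range(trials):
        s = sum(1 if rng.random() < 0.5 else -1 for _ in range(n))
        hits += s >= lam
    return hits / trials

n, lam = 100, 30
print(empirical_tail(n, lam), "<=", math.exp(-lam ** 2 / (2 * n)))
```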

Example 23.2. Suppose a gambler plays a game repeatedly and independently, such that with probability $p < 1/2$ he earns one dollar, and he loses one dollar with probability $1-p$.

For every $n$ let $X_n$ be the amount he won in the $n$-th game. So $S_n = \sum_{k=1}^n X_k$ is the total amount he wins after $n$ games.

Note that $E[X_k] = 2p - 1 < 0$, so $Y_k = \frac12\big(X_k - (2p-1)\big)$ has expectation 0 and $|Y_k| \leq 1$. By Bernstein's inequality,
$$P[S_n > 0] = P\Big[\sum_{k=1}^n Y_k > \frac n2(1-2p)\Big] \leq \exp\Big(-\frac{n(1-2p)^2}{8}\Big).$$
So the probability of winning is very small, even if the house has only a very small advantage.

△


Lecture 24

24.1. Review

Exercise 24.1. We have $N$ urns. Urn number $k$ contains $k$ white balls and $N-k$ black balls. An urn is chosen randomly, all urns equally likely. Then we start removing balls from that urn, all balls equally likely, all removals independent, returning each removed ball to the urn after noting its color. Let $A$ be the event that the first $n$ balls removed are white. Let $B$ be the event that the $(n+1)$-th ball removed is black. Calculate $P[A]$, $P[B \cap A]$, $P[B \mid A]$.

Now calculate the same if the process is that each time a random urn is chosen independently, and then a ball is removed from that urn, its color noted, and returned.

Solution to Exercise 24.1. For the first scenario, let $C_k$ be the event that the $k$-th urn was chosen. Independence gives us that
$$P[A \mid C_k] = \Big(\frac kN\Big)^n \qquad \text{and} \qquad P[B \cap A \mid C_k] = \Big(\frac kN\Big)^n\cdot\frac{N-k}{N}.$$
Thus by the law of total probability,
$$P[A] = \frac{1}{N^{n+1}}\sum_{k=1}^N k^n \qquad \text{and} \qquad P[B \cap A] = \frac{1}{N^{n+2}}\sum_{k=1}^N k^n(N-k).$$
Thus,
$$P[B \mid A] = 1 - \frac1N\cdot\frac{\sum_{k=1}^N k^{n+1}}{\sum_{k=1}^N k^n}.$$
In the second scenario, the probability that a white ball is chosen each time is
$$\sum_{k=1}^N \frac1N\cdot\frac kN = \frac{1}{N^2}\cdot\frac{N(N+1)}{2} = \frac{N+1}{2N} := p.$$
All removals are independent, so
$$P[A] = p^n, \qquad P[B \cap A] = p^n(1-p), \qquad P[B \mid A] = 1-p. \qquad \square$$

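The first-scenario answer can be double-checked with exact rational arithmetic: the closed form $1 - \frac1N\sum_k k^{n+1}\big/\sum_k k^n$ must agree with $P[B \cap A]/P[A]$ computed directly from the urn sums.

```python
from fractions import Fraction

# Exact check of Exercise 24.1 (first scenario): the closed form for
# P[B | A] agrees with P[B and A] / P[A] computed from the urn sums.
def p_b_given_a(N, n):
    num = sum(Fraction(k) ** (n + 1) for k in range(1, N + 1))
    den = sum(Fraction(k) ** n for k in range(1, N + 1))
    return 1 - Fraction(1, N) * num / den

def p_b_given_a_direct(N, n):
    pa  = sum(Fraction(k, N) ** n for k in range(1, N + 1)) / N
    pab = sum(Fraction(k, N) ** n * Fraction(N - k, N)
              for k in range(1, N + 1)) / N
    return pab / pa

print(p_b_given_a(10, 3) == p_b_given_a_direct(10, 3))   # True
```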

Exercise 24.2. Random variables $X, Y$ are equally distributed if $F_X = F_Y$. Show that if $X, Y$ are equally distributed then $E[X] = E[Y]$ if the expectations exist, and $E[X]$ does not exist if and only if $E[Y]$ does not exist. (Hint: Start with simple, go through non-negative, continue to general. First show that $P_X = P_Y$.)

Solution to Exercise 24.2. The first observation is that P[X B] = P[Y B] for any ∈ ∈ Borel set B . Indeed, the probability measures PX , PY agree on the π-system ∈ B ( , t]: t R by definition, and this π-system generates , so PX , PY agree on all { −∞ ∈ } B . B Now, note that if g : R R is a measurable function, then if X,Y are equally → distributed, then so are g(X), g(Y ). This is because

−1 −1 P[g(X) t] = P[X g ( , t]] = P[Y g ( , t]] = P[g(Y ) t]. ≤ ∈ −∞ ∈ −∞ ≤ The next step is to prove the claim for simple random variables: If X,Y are simple and equally distributed then

− − P[X = r] = FX (r) FX (r ) = FY (r) FY (r ) = P[Y = r]. − − Let R be a range for X and Y . Linearity of expectation now gives us that

E[X] = r P[X = r] = r P[Y = r] = E[Y ]. r∈R r∈R X X −n n Now assume that X,Y 0. Let gn(r) = min 2 2 r , n . So 0 gn(X) X and ≥ { b c } ≤ % 0 gn(Y ) Y . Monotone convergence gives that ≤ %

[X] = lim [gn(X)] = lim [gn(Y )] = [Y ], E n E n E E where we have used the fact that gn(X), gn(Y ) are equally distributed simple random variables for all n. Note that the above also holds if E[X] = E[Y ] = . ∞ 161

Finally, for general equally distributed X, Y: the positive parts X⁺, Y⁺ are equally distributed and non-negative, so E[X⁺] = E[Y⁺]; similarly, E[X⁻] = E[Y⁻]. Thus,

\[ E[X] = E[X^+] - E[X^-] = E[Y^+] - E[Y^-] = E[Y] , \]

provided at least one of E[X⁺] = E[Y⁺] or E[X⁻] = E[Y⁻] is finite. If both are infinite, then neither E[X] nor E[Y] exists. □

Exercise 24.3. X ∼ U[−1, 1]. Calculate:
• P[|X| > 1/2].
• P[sin(πX/2) > 1/2].

Solution to Exercise 24.3. Note that {|X| > 1/2} = {X > 1/2} ⊎ {X < −1/2}. So

\[ P[|X| > 1/2] = P[X > 1/2] + P[X < -1/2] = \frac{1}{4} + \frac{1}{4} = \frac{1}{2} . \]

Since sin is increasing on [−π/2, π/2] and sin(π/6) = 1/2, we get

\[ P[\sin(\pi X/2) > 1/2] = P[\pi X/2 > \pi/6] = P[X > 1/3] = 1/3 . \]
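Both probabilities in Exercise 24.3 are easy to confirm with a short simulation; a minimal sketch (sample size and seed are arbitrary):

```python
import math
import random

random.seed(7)
trials = 200_000
xs = [random.uniform(-1, 1) for _ in range(trials)]   # X ~ U[-1, 1]

p_abs = sum(abs(x) > 0.5 for x in xs) / trials                     # estimates P[|X| > 1/2] = 1/2
p_sin = sum(math.sin(math.pi * x / 2) > 0.5 for x in xs) / trials  # estimates P[sin(pi X/2) > 1/2] = 1/3
```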

□

Exercise 24.4. B, C are independent, with B ∼ Exp(λ) and C ∼ U[0, 1]. What is the probability that the equation x² + 2Bx + C = 0 has two distinct real solutions?

Solution to Exercise 24.4. Two distinct real solutions exist exactly when the discriminant is positive, so we are looking for

\[ P[4B^2 - 4C > 0] = P[B^2 > C] . \]

For any t > 0, substituting u = s²,

\[ P[B^2 \le t] = P[B \le \sqrt{t}] = \int_0^{\sqrt{t}} \lambda e^{-\lambda s} \, ds = \int_0^t \lambda e^{-\lambda \sqrt{u}} \frac{1}{2\sqrt{u}} \, du , \]

so B² is absolutely continuous with this density. Note that for all s > 0,

\[ \int_s^\infty f_{B^2}(u) \, du = \int_s^\infty \lambda e^{-\lambda \sqrt{u}} \frac{1}{2\sqrt{u}} \, du = \int_{\sqrt{s}}^\infty \lambda e^{-\lambda t} \, dt = e^{-\lambda \sqrt{s}} . \]

Since B², C are independent, we have that

\[ f_{B^2 \mid C}(t \mid s) = f_{B^2}(t) . \]

Thus, substituting s = u² in the last step,

\[ P[B^2 > C] = \int_0^1 \int_s^\infty f_{B^2,C}(t, s) \, dt \, ds = \int_0^1 \int_s^\infty f_{B^2 \mid C}(t \mid s) \, dt \, ds = \int_0^1 e^{-\lambda \sqrt{s}} \, ds = \int_0^1 2u e^{-\lambda u} \, du \]
\[ = \left[ -\frac{2u}{\lambda} e^{-\lambda u} \right]_0^1 + \frac{2}{\lambda} \int_0^1 e^{-\lambda u} \, du = -\frac{2}{\lambda} e^{-\lambda} - \frac{2}{\lambda^2} \left( e^{-\lambda} - 1 \right) = \frac{2}{\lambda^2} \left( 1 - (1 + \lambda) e^{-\lambda} \right) . \]

We have now proved a non-trivial inequality:

\[ \frac{2}{\lambda^2} \left( 1 - (1 + \lambda) e^{-\lambda} \right) \le 1 , \]

which is equivalent to

\[ e^\lambda - (1 + \lambda) \le \frac{\lambda^2}{2} \cdot e^\lambda . \]

This indeed holds, as 2 · k! ≤ (k + 2)! for all k and

\[ e^\lambda - (1 + \lambda) = \sum_{k=0}^\infty \frac{\lambda^{k+2}}{(k+2)!} \le \frac{\lambda^2}{2} \sum_{k=0}^\infty \frac{\lambda^k}{k!} . \]

□
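The closed form for P[B² > C] can be checked directly by simulation; a hedged sketch (the choice λ = 1 is arbitrary):

```python
import math
import random

def discriminant_prob(lam, trials=200_000, seed=3):
    """Estimate P[x^2 + 2Bx + C = 0 has two distinct real roots] = P[B^2 > C]
    with B ~ Exp(lam) and C ~ U[0, 1] independent."""
    random.seed(seed)
    hits = 0
    for _ in range(trials):
        b = random.expovariate(lam)   # B ~ Exp(lam), rate lam
        c = random.random()           # C ~ U[0, 1]
        if b * b > c:
            hits += 1
    return hits / trials

def exact_prob(lam):
    """The value (2 / lam^2) * (1 - (1 + lam) * e^(-lam)) derived above."""
    return 2 / lam ** 2 * (1 - (1 + lam) * math.exp(-lam))
```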

Exercise 24.5. Let (X, Y) be jointly absolutely continuous with

\[ f_Y(s) = \begin{cases} C (s^3 - s - 1) & s \in [2, 4] \\ 0 & \text{otherwise,} \end{cases} \]

and, for all s ∈ [2, 4],

\[ f_{X \mid Y}(t \mid s) = \begin{cases} C(s) \cos\left( \frac{\pi}{2s} t \right) & t \in [0, s] \\ 0 & \text{otherwise.} \end{cases} \]

Calculate:
• C and C(s) for all s ∈ [2, 4].
• E[Y], E[X | Y = s], E[X].
• Var[Y], Var[X].
• Cov(X, Y) and Var[X + Y].

Solution to Exercise 24.5. For C:

\[ 1 = C \int_2^4 (s^3 - s - 1) \, ds = C \left( \frac{4^4 - 2^4}{4} - \frac{4^2 - 2^2}{2} - (4 - 2) \right) = C (60 - 6 - 2) = C \cdot 52 . \]

For C(s):

\[ 1 = C(s) \int_0^s \cos\left( \frac{\pi}{2s} t \right) dt = C(s) \cdot \frac{2s}{\pi} \left[ \sin\left( \frac{\pi}{2s} t \right) \right]_0^s = C(s) \cdot \frac{2s}{\pi} , \]

so C(s) = π/(2s). We also have

\[ E[Y] = 52^{-1} \int_2^4 (s^4 - s^2 - s) \, ds = \frac{4^5 - 2^5}{5 \cdot 52} - \frac{4^3 - 2^3}{3 \cdot 52} - \frac{4^2 - 2^2}{2 \cdot 52} = \frac{5212}{30 \cdot 52} . \]

For s ∈ [2, 4], using the fact that (x sin x + cos x)′ = sin x + x cos x − sin x = x cos x, we have

\[ E[X \mid Y = s] = C(s) \int_0^s t \cos(C(s) t) \, dt = C(s)^{-1} \int_0^{\pi/2} t \cos t \, dt = C(s)^{-1} \cdot \left( \frac{\pi}{2} - 1 \right) = \left( 1 - \frac{2}{\pi} \right) \cdot s . \]

To calculate E[X], we use the fact that f_{X,Y}(t, s) = f_Y(s) f_{X|Y}(t | s) for all s ∈ [2, 4] and f_{X,Y}(t, s) = 0 otherwise. Thus, with the function (t, s) ↦ t,

\[ E[X] = \int_2^4 f_Y(s) \int_0^s t f_{X \mid Y}(t \mid s) \, dt \, ds = \left( 1 - \frac{2}{\pi} \right) \int_2^4 s f_Y(s) \, ds = \left( 1 - \frac{2}{\pi} \right) \cdot E[Y] . \]

Now, similarly,

\[ E[XY] = \int_2^4 s f_Y(s) \int_0^s t f_{X \mid Y}(t \mid s) \, dt \, ds = \left( 1 - \frac{2}{\pi} \right) \cdot E[Y^2] , \]

so we get that

\[ \mathrm{Cov}(X, Y) = \left( 1 - \frac{2}{\pi} \right) \cdot E[Y^2] - \left( 1 - \frac{2}{\pi} \right) \cdot E[Y] \cdot E[Y] = \left( 1 - \frac{2}{\pi} \right) \cdot \mathrm{Var}[Y] . \]

□
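The constants computed above can be verified with simple numerical integration (a sketch using a midpoint rule; the check point s = 3 is an arbitrary choice in [2, 4]):

```python
import math

def integrate(f, a, b, n=50_000):
    """Midpoint-rule numerical integration of f on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# C = 1/52 from the normalization of f_Y
C = 1 / integrate(lambda s: s ** 3 - s - 1, 2, 4)

# C(s) = pi / (2s): check the normalization of f_{X|Y}(. | s) at s = 3
s = 3.0
Cs = math.pi / (2 * s)
norm = integrate(lambda t: Cs * math.cos(math.pi / (2 * s) * t), 0, s)

# E[Y] = 5212 / (30 * 52) and E[X | Y = s] = (1 - 2/pi) * s
EY = integrate(lambda u: u * C * (u ** 3 - u - 1), 2, 4)
EX_given_s = integrate(lambda t: t * Cs * math.cos(math.pi / (2 * s) * t), 0, s)
```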

Exercise 24.6. A chicken lays an egg of random size. The size of the egg has the distribution of |N|, where N ∼ N(0, 5). For any s > 0, given that an egg is of size s, the time until the egg hatches is distributed Exp(s^{−2}). What is the expected time for an egg to hatch?
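A direct simulation of this two-stage experiment (sample the size, then the conditional hatching time) gives a numerical estimate to compare with the analytic answer worked out in the solution. This is a sketch assuming N(0, 5) denotes standard deviation 5, as the density e^{−t²/50} in the solution indicates:

```python
import random

random.seed(11)
trials = 200_000
total = 0.0
for _ in range(trials):
    s = abs(random.gauss(0, 5))            # egg size S = |N(0, 5)|, sigma = 5
    total += random.expovariate(s ** -2)   # T | S = s ~ Exp(s^-2): rate s^-2, mean s^2
estimate = total / trials                  # estimates E[T]
```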

Solution to Exercise 24.6. Let S be the size and T the time. We are given that S ∼ |N(0, 5)|; that is, for any s > 0,

\[ P[S \le s] = P[N(0, 5) \in [-s, s]] = \int_{-s}^{s} \frac{1}{5\sqrt{2\pi}} e^{-t^2/50} \, dt = 2 \int_0^s \frac{1}{5\sqrt{2\pi}} e^{-t^2/50} \, dt . \]

So S is absolutely continuous with density

\[ f_S(s) = \begin{cases} \frac{\sqrt{2}}{5\sqrt{\pi}} e^{-s^2/50} & s > 0 \\ 0 & s \le 0 . \end{cases} \]

So we can calculate the expectation of T:

\[ E[T] = \int \int t f_{T,S}(t, s) \, dt \, ds = \int_0^\infty f_S(s) \int_0^\infty t s^{-2} e^{-s^{-2} t} \, dt \, ds = \int_0^\infty s^2 f_S(s) \, ds = E[S^2] = E[N(0, 5)^2] = 25 . \]

□

Exercise 24.7. Let X be a discrete random variable with range ℚ. Suppose that E[X] = μ, E[X²] = σ and E[X³] = ρ. Let Y be a discrete random variable defined by Y | X = q ∼ Poi(q²). Calculate Cov(X, Y).

Solution to Exercise 24.7. Note that for any q ∈ ℚ,

\[ q^2 = E[Y \mid X = q] = \sum_{k=0}^\infty k \, e^{-q^2} \frac{q^{2k}}{k!} . \]

Note that for q ∈ ℚ and k ∈ ℕ,

\[ f_{Y,X}(k, q) = f_X(q) \, e^{-q^2} \frac{q^{2k}}{k!} . \]

So

\[ E[XY] = \sum_{q \in \mathbb{Q}} \sum_{k=0}^\infty q k \, f_{Y,X}(k, q) = \sum_{q \in \mathbb{Q}} q^3 f_X(q) = \rho , \]

and

\[ E[Y] = \sum_{q \in \mathbb{Q}} \sum_{k=0}^\infty k \, f_{Y,X}(k, q) = \sum_{q \in \mathbb{Q}} q^2 f_X(q) = \sigma . \]

Thus, Cov(X, Y) = E[XY] − E[X] · E[Y] = ρ − σμ. □
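The identity Cov(X, Y) = ρ − σμ can be spot-checked with any concrete rational-valued X. The sketch below takes X uniform on {1, 2, 3} (a hypothetical choice, not from the exercise) and samples Y | X = q ∼ Poi(q²) with Knuth's multiplication algorithm:

```python
import math
import random

def poisson(lam, rng):
    """Sample Poisson(lam) via Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

rng = random.Random(5)
support = [1, 2, 3]                        # hypothetical X: uniform on {1, 2, 3}
mu = sum(q for q in support) / 3           # E[X]   = 2
sigma = sum(q ** 2 for q in support) / 3   # E[X^2] = 14/3
rho = sum(q ** 3 for q in support) / 3     # E[X^3] = 12

trials = 200_000
sum_x = sum_y = sum_xy = 0.0
for _ in range(trials):
    x = rng.choice(support)
    y = poisson(x * x, rng)                # Y | X = q ~ Poi(q^2)
    sum_x += x
    sum_y += y
    sum_xy += x * y
cov_estimate = sum_xy / trials - (sum_x / trials) * (sum_y / trials)
```

For this choice of X, the formula gives Cov(X, Y) = ρ − σμ = 12 − (14/3) · 2 = 8/3, which the estimate matches.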