Lecture 2: Modeling Random Experiments
Total Page:16
File Type:pdf, Size:1020Kb
Department of Mathematics Ma 3/103 KC Border Introduction to Probability and Statistics Winter 2021 Lecture 2: Modeling Random Experiments Relevant textbook passages: Pitman [5]: Sections 1.3–1.4., pp. 26–46. Larsen–Marx [4]: Sections 2.2–2.5, pp. 18–66. 2.1 Axioms for probability measures Recall from last time that a random experiment is an experiment that may be conducted under seemingly identical conditions, yet give different results. Coin tossing is everyone’s go-to example of a random experiment. The way we model random experiments is through the use of probabilities. We start with the sample space Ω, the set of possible outcomes of the experiment, and consider events, which are subsets E of the sample space. (We let F denote the collection of events.) 2.1.1 Definition A probability measure P or probability distribution attaches to each event E a number between 0 and 1 (inclusive) so as to obey the following axioms of probability: Normalization: P (?) = 0; and P (Ω) = 1. Nonnegativity: For each event E, we have P (E) > 0. Additivity: If EF = ?, then P (∪ F ) = P (E) + P (F ). Note that while the domain of P is technically F, the set of events, that is P : F → [0, 1], we may also refer to P as a probability (measure) on Ω, the set of realizations. 2.1.2 Remark To reduce the visual clutter created by layers of delimiters in our notation, we may omit some of them simply write something like P (f(ω) = 1) orP {ω ∈ Ω: f(ω) = 1} instead of P {ω ∈ Ω: f(ω) = 1} and we may write P (ω) instead of P {ω} . You will come to appreciate this. Most probabilists also require the following stronger property, called countable additivity. 2.1.3 Definition A probability measure P is countably additive if for every pairwise disjoint infinite sequence of events E1,E2,... , S∞ X∞ P Ei = P (Ei). i=1 i=1 2.1.4 Remark If the sample space Ω is finite, it has only finitely many subsets, so the onlyway an infinite sequence E1,E2,... of events can be pairwise disjoint is that allP but finitely many ? ∞ of the events Ei are empty. In this case, since P ( ) = 0, the infinite series i=1 P (Ei) reduces KC Border v. 2020.10.21::10.28 Ma 3/103 Winter 2021 KC Border Random Experiments and Probability Spaces 2–2 to a finite sum. In other words, for a finite sample space, finite additivity guarantees countable additivity. (Cf. Section 2.2.1, item 4.) You need to take an advanced analysis course to understand that for infinite sample spaces, there can be probability measures that are additive, but not countably additive. So don’t worry too much about it. From here on out, we shall assume that all our probability measures are countably additive. 2.1.1 The special probability values zero and one It is important to distinguish between an event that cannot happen and one that has probability zero. Any decent model of a random experiment should assign probability zero to an event that cannot happen, but an event may still be possible and yet have probability zero. We shall see presently that the probability of an infinite sequence of nothing but Tails has probability zero (in the standard model of coin tossing), but that does not mean that it is impossible. As far as I know, such a sequence would not violate the laws of physics, but I should check with the Physics Police on that. (Actually if coin tossing takes time, and the life of the universe is finite we can never carry out the experiment anyhow.) Likewise an event can have probability one, and yet may not necessarily occur. At least that is the interpretation I shall take in this course. As I mentioned earlier, I do not want to become bogged down in metaphysics. But probabilists have coined a phrase to describe events of probability one. 2.1.5 Definition If P (E) = 1, then we say that that E occurs almost surely, abbreviated E a.s. An event of probaility zero is sometimes called a null event. You might ask why be wishy-washy and add the “almost?” We shall have more to say about this in Section 6.3. 2.2 Probability spaces Our complete formal model of a random experiment is what we call a probability space. 2.2.1 Definition A probability space is a triple (Ω, F,P ), where Ω is a nonempty set, the sample space or outcome space of the experiment, F is the set of events, which is a σ-algebra of subsets of Ω, and P is a countably additive probability measure on F. 2.2.1 Elementary Probability Identities There are some elementary properties of probability measures that follow in a straightforward Larsen– way from the axioms. Here are a few together with their proofs (derivations, if you prefer). Marx [4]: c § 2.3 1. P (E ) = 1 − P (E) Proof : Since E and Ec are disjoint, and E ∪ Ec = Ω, additivity gives us P (E)+P (Ec) = P (Ω). Normalization gives us P (Ω) = 1, and the desired expression is just a rearrangement. 2. If F ⊂ E, then P (E \ F ) = P (E) − P (F ) v. 2020.10.21::10.28 KC Border Ma 3/103 Winter 2021 KC Border Random Experiments and Probability Spaces 2–3 Proof : Recall the definition of E\F as the set of points in E that are not in F , that is EF c. Thus E = F ∪(E \ F ) where F and E \ F are disjoint. By additivity then P (E) = P (F ) + P (E \ F ), and the desired expression is just a rearrangement. 3. Monotonicity: If F ⊂ E, then P (F ) 6 P (E) Proof : We just saw that P (E) = P (F ) + P (E \ F ), and nonnegativity implies P (E \ F ) > 0. Therefore P (E) > P (F ). This has two useful special cases: a. If P (E) = 0 and F ⊂ E, then P (F ) = 0. b. If P (E) = 1 and E ⊂ F , then P (F ) = 1. 4. Finite additivity: If E1,...,En is a finite sequence of pairwise disjoint events, that is, i ≠ j =⇒ EiEj = ?, then n n X P ∪ Ei = P (Ei). i=1 i=1 Proof : We shall prove this by induction on n. Let P(n) stand for the above equality for n pairwise disjoint sets. Then P(2) is just Additivity. Assume P(n − 1). Write n n−1 ∪ Ei = ∪ Ei ∪ En i=1 |i=1{z } =B Then BEn = ?, so n P ∪ Ei = P (B ∪ En) i=1 = P (B) + P (En) by Additivity nX−1 = P (Ei) + P (En) by P(n − 1) i=1 Xn = P (Ei). i=1 Boole’s inequality The next result is obvious, but it can be derived from our axioms. 2.2.2 Boole’s Inequality Even if events E1,...,En are not pairwise disjoint, we may still conclude that n n X P ∪ Ei 6 P (Ei). i=1 i=1 KC Border v. 2020.10.21::10.28 Ma 3/103 Winter 2021 KC Border Random Experiments and Probability Spaces 2–4 Before I demonstrate how to prove Boole’s Inequality, let me describe a “trick” for “disjunc- tifying” a sequence of sets. 2.2.3 Lemma Let E1,...,En,... be a sequence (finite or infinite) of events. Then there is a sequence A1,...,An,... of events such that: • For each n, An ⊂ En. • The Ai’s are pairwise disjoint. That is, if i ≠ j, then AiAj = ?. • For each n, n n ∪ Ai = ∪ Ei. i=1 i=1 • The last fact implies that ∞ ∞ ∪ Ai = ∪ Ei. i=1 i=1 Proof : Define A1 = E1, and for n > 1 define An = En \ (E1 ∪ · · · ∪ En−1) = En \ (A1 ∪ · · · ∪ An−1). I leave it to you to prove that the sequence A1,...,An,... has the desired properties. See Figure 2.1 for an example with three events. E1 A1 A3 A2 E2 E3 Figure 2.1. The Ei events are the discs, and the three colored regions represent the events A1 = E1, A2 = E2 \ E1, and A3 = E3 \ (E1 ∪ E2). Proof of Boole’s Inequality: Let Ai be a sequence of pairwise disjoint events satisfying the con- n n clusion of Lemma 2.2.3. That is, each Ai ⊂ Ei and ∪ Ai = ∪ Ei. Then i=1 i=1 n n n n X X P ∪ Ei = P ∪ Ai = P (Ai) 6 P (Ei), i=1 i=1 i=1 i=1 v. 2020.10.21::10.28 KC Border Ma 3/103 Winter 2021 KC Border Random Experiments and Probability Spaces 2–5 where the second equality follows from the fact that the Ai’s are pairwise disjoint and finite additivity, and the inequality follows from P (Ai) 6 P (Ei) (monotonicity) for each i. Bonferroni’s inequality Boole’s inequality provides an upper bound on the probability of a union of not necessarily disjoint events. Bonferroni’s inequality flips this over and gives a lower bound on the probability of an intersection of events. Note that the right-hand side of Bonferroni’s inequality can be negative, which gives a valid, but uninformative, bound. 2.2.4 Bonferroni’s Inequality For a finite sequence of events E1,...,En, ! \n Xn P Ei > P (Ei) − (n − 1). i=1 i=1 I’ll leave the proof as an exercise. But here’s a big hint. Use de Morgan’s Laws (specifically ∩ c ∪ c c − ( Ei) = Ei ), the fact that P (E ) = 1 P (E), and Boole’s inequality.