CHAPTER 2
The probability set-up
2.1. Basic theory of probability
We will have a sample space, denoted by S (sometimes Ω), that consists of all possible outcomes. For example, if we roll two dice, the sample space would be all possible pairs made up of the numbers one through six. An event is a subset of S. Another example is to toss a coin 2 times, and let S = {HH, HT, TH, TT}; or to let S be the possible orders in which 5 horses finish in a horse race; or S the possible prices of some stock at closing time today; or S = [0, ∞), the age at which someone dies; or S the points in a circle, the possible places a dart can hit. We should also keep in mind that the same setting can be described using different sample spaces. For example, in the two solutions in Example 1.30 we used two different sample spaces.
2.1.1. Sets. We start by describing elementary operations on sets. By a set we mean a collection of distinct objects called elements of the set, and we consider a set as an object in its own right. Set operations
Suppose S is a set. We say that A ⊂ S, that is, A is a subset of S if every element in A is contained in S; A ∪ B is the union of sets A ⊂ S and B ⊂ S and denotes the points of S that are in A or B or both; A ∩ B is the intersection of sets A ⊂ S and B ⊂ S and is the set of points that are in both A and B; ∅ denotes the empty set; Ac is the complement of A, that is, the points in S that are not in A.
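These set operations map directly onto Python's built-in set type, which can be a useful way to experiment with them. A small illustration (the particular sets below are our own choices, not from the text):

```python
# Sample space S and two events A, B (arbitrary illustrative choices)
S = {1, 2, 3, 4, 5, 6}
A = {1, 2, 3}
B = {3, 4}

union = A | B              # A ∪ B: points of S that are in A or B or both
intersection = A & B       # A ∩ B: points that are in both A and B
complement_A = S - A       # A^c: points of S that are not in A

print(union)         # {1, 2, 3, 4}
print(intersection)  # {3}
print(complement_A)  # {4, 5, 6}
```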
We extend this definition so that ∪_{i=1}^n A_i is the union of the sets A_1, ···, A_n, and similarly ∩_{i=1}^n A_i is their intersection. An exercise is to show De Morgan's laws:

(∪_{i=1}^n A_i)^c = ∩_{i=1}^n A_i^c   and   (∩_{i=1}^n A_i)^c = ∪_{i=1}^n A_i^c.
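De Morgan's laws can be checked numerically for small finite sets, which is a reasonable sanity check before attempting the exercise. A sketch (the sets here are arbitrary choices of ours):

```python
S = set(range(10))
sets = [{0, 1, 2}, {2, 3, 4}, {4, 5}]

def complement(A):
    # A^c relative to the sample space S
    return S - A

# (∪ A_i)^c == ∩ A_i^c
lhs1 = complement(set.union(*sets))
rhs1 = set.intersection(*(complement(A) for A in sets))
assert lhs1 == rhs1

# (∩ A_i)^c == ∪ A_i^c
lhs2 = complement(set.intersection(*sets))
rhs2 = set.union(*(complement(A) for A in sets))
assert lhs2 == rhs2

print("De Morgan's laws hold for these sets")
```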
© Copyright 2013 Richard F. Bass, 2020 Masha Gordina Typesetting date: March 31, 2020
We will also need the notion of pairwise disjoint sets {E_i}_{i=1}^∞, which means that E_i ∩ E_j = ∅ unless i = j. There are no restrictions on the sample space S. The collection of events, F, is assumed to be a σ-field, which means that it satisfies the following. Definition (σ-field)
A collection F of sets in S is called a σ-field if (i) both ∅ and S are in F, (ii) if A is in F, then A^c is in F, (iii) if A_1, A_2, ... are in F, then ∪_{i=1}^∞ A_i and ∩_{i=1}^∞ A_i are in F.
Typically we will take F to be all subsets of S, and then (i)-(iii) are automatically satisfied. The only times we won't take F to be all subsets are for technical reasons or when we talk about conditional expectation.
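For a finite sample space one can verify directly that the collection of all subsets satisfies conditions (i)-(iii); for finite collections, closure under finite unions and intersections is what (iii) amounts to. A sketch for a small S of our choosing:

```python
from itertools import combinations

S = frozenset({1, 2, 3})

# F = all subsets of S (the power set), built by size
F = {frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)}

# (i) both ∅ and S are in F
assert frozenset() in F and S in F
# (ii) closed under complement
assert all((S - A) in F for A in F)
# (iii) closed under (finite) unions and intersections
assert all((A | B) in F and (A & B) in F for A in F for B in F)

print(f"the power set of S has {len(F)} members and is a sigma-field")
```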
2.1.2. Probability axioms. So now we have a sample space S and a σ-field F, and we need to say what a probability is. Probability axioms
(1) 0 ≤ P(E) ≤ 1 for all events E ∈ F. (2) P(S) = 1. (3) If {E_i}_{i=1}^∞, E_i ∈ F, are pairwise disjoint, then P(∪_{i=1}^∞ E_i) = Σ_{i=1}^∞ P(E_i).
Note that probabilities are probabilities of subsets of S, not of points of S. However, it is common to write P(x) for P({x}). Intuitively, the probability of E should be the fraction of times E occurs in n experiments, taking a limit as n tends to infinity. This is hard to use as a definition. It is better to start with these axioms, and then to prove that the probability of E is the limit we hoped for. Below are some easy consequences of the probability axioms. Proposition 2.1 (Properties of probability)
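The frequency intuition can be illustrated by simulation: the observed fraction of experiments in which E occurs settles near P(E) as n grows. A minimal sketch for a fair die with E = {the roll is even}, where P(E) = 1/2 (the event and sample size are our own choices):

```python
import random

random.seed(0)  # fix the seed so repeated runs agree

n = 100_000
count = sum(1 for _ in range(n) if random.randint(1, 6) % 2 == 0)
freq = count / n

# With n this large, freq should be very close to 1/2
print(f"relative frequency of an even roll: {freq:.3f}")
```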
(1) P(∅) = 0. (2) If E_1, ..., E_n ∈ F are pairwise disjoint, then P(∪_{i=1}^n E_i) = Σ_{i=1}^n P(E_i). (3) P(E^c) = 1 − P(E) for any E ∈ F. (4) If E ⊂ F, then P(E) ≤ P(F), for any E, F ∈ F. (5) For any E, F ∈ F,
(2.1.1) P(E ∪ F) = P(E) + P(F) − P(E ∩ F).
The last property is sometimes called the inclusion-exclusion identity.
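Identity (2.1.1) can be checked directly on an equally likely finite sample space, for instance one roll of a fair die with E = {1, 2, 3} and F = {3, 4} (our illustrative choice of events):

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}

def P(event):
    # equally likely outcomes: probability = |event| / |S|
    return Fraction(len(event), len(S))

E = {1, 2, 3}
F = {3, 4}

lhs = P(E | F)                       # P(E ∪ F)
rhs = P(E) + P(F) - P(E & F)         # P(E) + P(F) − P(E ∩ F)

print(lhs, rhs)  # 2/3 2/3
assert lhs == rhs
```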
Proof. To show (1), choose E_i = ∅ for each i. These are clearly disjoint, so P(∅) = P(∪_{i=1}^∞ E_i) = Σ_{i=1}^∞ P(E_i) = Σ_{i=1}^∞ P(∅). If P(∅) were strictly positive, then the last term would be infinity, contradicting the fact that probabilities are between 0 and 1. So the probability of ∅ must be zero.
Part (2) follows if we let E_{n+1} = E_{n+2} = ··· = ∅. Then {E_i}_{i=1}^∞, E_i ∈ F, are still pairwise disjoint, and ∪_{i=1}^∞ E_i = ∪_{i=1}^n E_i, and by (1) together with axiom (3) we have
P(∪_{i=1}^n E_i) = Σ_{i=1}^∞ P(E_i) = Σ_{i=1}^n P(E_i).
To prove (3), use S = E ∪ E^c. By (2), P(S) = P(E) + P(E^c). By axiom (2), P(S) = 1, so (3) follows.
To prove (4), write F = E ∪ (F ∩ E^c), so P(F) = P(E) + P(F ∩ E^c) ≥ P(E) by (2) and axiom (1).
Similarly, to prove (5), we have P(E ∪ F) = P(E) + P(E^c ∩ F) and P(F) = P(E ∩ F) + P(E^c ∩ F). Solving the second equation for P(E^c ∩ F) and substituting into the first gives the desired result.
It is common for a probability space to consist of finitely many points, all with equally likely probabilities. For example, in tossing a fair coin, we have S = {H, T}, with P(H) = P(T) = 1/2. Similarly, in rolling a fair die, the probability space consists of {1, 2, 3, 4, 5, 6}, each point having probability 1/6.
Example 2.1. What is the probability that if we roll 2 dice, the sum is 7?
Solution: There are 36 possibilities, of which 6 have a sum of 7: (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1). Since they are all equally likely, the probability is 6/36 = 1/6.
Example 2.2. What is the probability that in a poker hand (5 cards out of 52) we get exactly 4 of a kind?
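The count in Example 2.1 can be reproduced by enumerating all 36 equally likely outcomes:

```python
from fractions import Fraction

# All ordered pairs (first die, second die)
outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]
favorable = [(i, j) for (i, j) in outcomes if i + j == 7]

prob = Fraction(len(favorable), len(outcomes))
print(len(outcomes), len(favorable), prob)  # 36 6 1/6
```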
Solution: We have four suits: clubs, diamonds, hearts, and spades. Each suit includes an ace, a king, a queen, a jack, and ranks two through ten. For example, the probability of 4 aces and 1 king is