1 Probability Space, Events, and Random Variables
Examples

We begin with some common examples of probability models:

1. Coin tossing: We can encode a series of coin tosses as a random binary sequence 11010001... Denoting it by $\omega_1\omega_2\omega_3\ldots$, we expect to see that
\[ \frac{\omega_1 + \cdots + \omega_n}{n} \to \frac{1}{2}. \]

2. Random walk on $\mathbb{Z}$: How far is $S_n$ from the origin? $S_n \approx \sqrt{n}$. What is the probability that the random walk will reach 20 before $-10$? Equals 1/3. How long do we have to wait until we hit 20 or $-10$? 200 in expectation.

3. Brownian motion: "random walk in continuous time". Its paths are nowhere differentiable with probability 1. It is widely used in PDEs, financial maths, mathematical physics, biology, chemistry, etc.

4. Percolation: What is the chance that there is an infinite connected cluster? Either 0 or 1 (and never a number in between!); the critical probability for bond percolation on $\mathbb{Z}^2$ is 1/2, and it decreases in higher dimensions.

5. Rolling dice: Two people wait to see a particular sequence from successive rolls of a die -- one waits for 456, the other for 666. Who will wait the longest? What is the average waiting time?

Aims of this course

1. Introduce a rigorous setup to study random objects (so far you have only used combinatorics, with its limited capacity, plus formulas for expectation without proof).

2. Strong Law of Large Numbers: If $X_1, X_2, \ldots$ are independent random variables with the same distribution and mean $\mu$, then
\[ \frac{X_1 + \cdots + X_n}{n} \to \mu. \]

3. Central Limit Theorem: If $X_1, X_2, \ldots$ are independent random variables with the same distribution, mean $\mu$ and variance $\sigma^2$, then
\[ \frac{X_1 + \cdots + X_n}{n} \approx \mu + \frac{\sigma}{\sqrt{n}}\, Z, \]
where $Z$ is a standard normal random variable. So the fluctuations around $\mu$ are of order $1/\sqrt{n}$, and the randomness does not depend on the distribution of our original random variable.

4. Martingales: random processes which on average stay the same (fair games). Very useful: SLLN, examples 2 and 5 above, and many more.

Reminder from measure theory

A family $\Sigma$ of subsets of a set $\Omega$ is called a $\sigma$-algebra if
1. $\varnothing \in \Sigma$,
2. $A \in \Sigma \implies \Omega \setminus A \in \Sigma$,
3. $A_1, A_2, \ldots \in \Sigma \implies \bigcup_{n=1}^{\infty} A_n \in \Sigma$.

A function $\mu : \Sigma \to [0, \infty]$ is called a measure if
1. $\mu(\varnothing) = 0$,
2. $A_1, A_2, \ldots \in \Sigma$ disjoint $\implies \mu\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} \mu(A_n)$.

A function $f : \Omega \to \mathbb{R}$ is called measurable if $f^{-1}(B) \in \Sigma$ for every Borel set $B$.

Definition 1.1. A measure $\mu$ is called a probability measure if $\mu(\Omega) = 1$. A probability space is a triple $(\Omega, \Sigma, \mathbb{P})$, where $\mathbb{P}$ is a probability measure on $(\Omega, \Sigma)$. A random variable is a measurable function $X : \Omega \to \mathbb{R}$.

Idea: $\Omega$ = all possible outcomes of a random experiment, $\Sigma$ = all possible events, $\mathbb{P}(A)$ = probability of an event $A \in \Sigma$.

Example 1.2.

1. Coin toss, or $X$ equals 0 or 1 with probabilities 1/2:
$\Omega = \{H, T\}$,
$\Sigma = \{\varnothing, \{H\}, \{T\}, \Omega\}$,
$\mathbb{P}(\varnothing) = 0$, $\mathbb{P}(\{H\}) = 1/2$, $\mathbb{P}(\{T\}) = 1/2$, $\mathbb{P}(\Omega) = 1$,
$X : \{H, T\} \to \mathbb{R}$, $X(H) = 1$, $X(T) = 0$.

2. Roll a die and spell the number you get -- let $X$ be the number of letters:
$\Omega = \{1, 2, 3, 4, 5, 6\}$, $\Sigma = 2^{\Omega}$ (the set of all subsets of $\Omega$),
$\mathbb{P}(\{1\}) = \cdots = \mathbb{P}(\{6\}) = 1/6$, extended to $\Sigma$ by additivity,
$X : \{1, 2, 3, 4, 5, 6\} \to \mathbb{R}$,
$X(1) = 3$, $X(2) = 3$, $X(3) = 5$, $X(4) = 4$, $X(5) = 4$, $X(6) = 3$.
Then
\[ \mathbb{P}(X = 3) = \mathbb{P}(\{\omega \in \Omega : X(\omega) = 3\}) = \mathbb{P}(\{1, 2, 6\}) = 1/2. \]

3. We can represent the die experiment in more than one way (i.e. on different probability spaces), e.g.:
$\Omega = \{0, 1, 2, 3, 4, 5, 6\}$, $\Sigma = 2^{\Omega}$,
$\mathbb{P}(\{0\}) = 0$, $\mathbb{P}(\{1\}) = \cdots = \mathbb{P}(\{6\}) = 1/6$, extended to $\Sigma$ by additivity,
$X : \{0, 1, 2, 3, 4, 5, 6\} \to \mathbb{R}$,
$X(0) = 4$, $X(1) = 3$, $X(2) = 3$, $X(3) = 5$, $X(4) = 4$, $X(5) = 4$, $X(6) = 3$.

4. Toss a coin infinitely many times: $\Omega$ = the set of all sequences of 0s and 1s. We can code such a sequence as a number in $[0, 1]$ by viewing it as a binary expansion. So $\Omega = [0, 1]$, but what is $\Sigma$? What kind of events do we want to measure? We certainly want to be able to measure the event that a sequence begins with a certain pattern, e.g. 110:
\[ \{\omega = 0.\omega_1\omega_2\omega_3\cdots \in [0, 1] : \omega_1 = 1, \omega_2 = 1, \omega_3 = 0\} = [3/4, 7/8]. \]
So $\Sigma$ should contain all binary intervals. The smallest $\sigma$-algebra with this property is $\mathcal{B}$, the Borel $\sigma$-algebra, so $\Sigma = \mathcal{B}$. How should we define $\mathbb{P}$? We want $\mathbb{P}(110{*}{*}{*}\cdots) = (1/2)^3 = 1/8$, and observing that $\mathrm{Leb}([3/4, 7/8]) = 1/8$ (and similarly for all patterns), we set $\mathbb{P} = \mathrm{Leb}$. So our probability space is $([0, 1], \mathcal{B}, \mathrm{Leb})$. We can consider various random variables on this space, for example $X(\omega) = \omega_n$ (representing the $n$th toss).
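For readers who like to experiment, here is a small Python sketch (my own illustration, not part of the notes) of the coding in Example 1.2(4): a toss sequence $\omega_1\omega_2\omega_3\ldots$ is identified with the binary number $0.\omega_1\omega_2\omega_3\ldots$, the event "the sequence begins with 110" becomes the interval $[3/4, 7/8]$, and a Monte Carlo frequency matches its Lebesgue measure $1/8$. The helper name to_unit_interval and the sample size are illustrative choices only.

```python
import random

# Identify a coin-toss sequence omega_1 omega_2 omega_3 ... with the binary
# number 0.omega_1 omega_2 omega_3 ...  (Example 1.2(4)).
def to_unit_interval(bits):
    """Map a finite 0/1 prefix to the corresponding binary number in [0, 1]."""
    return sum(b / 2 ** (i + 1) for i, b in enumerate(bits))

# The event "the sequence begins with 110" is the interval [3/4, 7/8]:
print(to_unit_interval([1, 1, 0]))               # 0.75   (all later tosses = 0)
print(to_unit_interval([1, 1, 0] + [1] * 50))    # ~0.875 (all later tosses = 1)

# Monte Carlo check that P(begins with 110) = Leb([3/4, 7/8]) = 1/8:
# draw omega uniformly from [0, 1] and test membership in the interval.
random.seed(0)
n = 100_000
hits = sum(3 / 4 <= random.random() <= 7 / 8 for _ in range(n))
print(hits / n)                                  # close to 0.125
```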
We often care about the probabilities
\[ \mathbb{P}(X \in B) := \mathbb{P}(\{\omega \in \Omega : X(\omega) \in B\}), \]
for Borel sets $B$, rather than which $\omega$ occur. We can record these probabilities with a probability measure $\mu$ on $(\mathbb{R}, \mathcal{B})$ defined by
\[ \mu(B) = \mathbb{P}(X \in B), \qquad B \in \mathcal{B}. \]
This measure is entirely determined by its values on the sets $(-\infty, t]$, $t \in \mathbb{R}$. Denote
\[ F(t) = \mu((-\infty, t]) = \mathbb{P}(X \le t), \qquad t \in \mathbb{R}. \]

Definition 1.3. The measure $\mu$ is called the distribution or law of $X$. The function $F$ is called the distribution function of $X$.

Conversely, we can ask: what properties should a function $F$ have in order to be the distribution function of some random variable? We begin with some properties of a distribution function and then show that if a function $F$ has these properties, there exists a random variable with $F$ as its distribution function.

Theorem 1.4 (Properties of a distribution function). Let $F$ be a distribution function. Then
1. $F$ is increasing,
2. $\lim_{t \to \infty} F(t) = 1$ and $\lim_{t \to -\infty} F(t) = 0$,
3. $F$ is right continuous.

Proof. 1. Let $s \le t$. Then
\[ F(t) = \mathbb{P}(\{\omega : X(\omega) \le s\} \cup \{\omega : X(\omega) \in (s, t]\}) \ge \mathbb{P}(\{\omega : X(\omega) \le s\}) = F(s). \]

2. Let $t_n \to \infty$ (and set $t_0 = -\infty$). Then
\[ F(t_n) = \mathbb{P}\left(\bigcup_{i=1}^{n} \{\omega : X(\omega) \in (t_{i-1}, t_i]\}\right) = \sum_{i=1}^{n} \mathbb{P}(\{\omega : X(\omega) \in (t_{i-1}, t_i]\}) \to \sum_{i=1}^{\infty} \mathbb{P}(\{\omega : X(\omega) \in (t_{i-1}, t_i]\}) = \mathbb{P}(\Omega) = 1. \]
Similarly, letting $t_n \to -\infty$,
\[ F(t_n) = \mathbb{P}(X \le t_n) \to \mathbb{P}(\varnothing) = 0. \]

3. Let $t_n \downarrow t$. Then
\[ F(t_n) = \mathbb{P}(\{X \le t\} \cup \{X \in (t, t_n]\}) = \mathbb{P}(X \le t) + \mathbb{P}(X \in (t, t_n]) \to F(t). \]

Theorem 1.5 (Skorokhod Representation). Let $F : \mathbb{R} \to [0, 1]$ have the three properties of a distribution function above. Then there is a random variable $X$ on $([0, 1], \mathcal{B}, \mathrm{Leb})$ with distribution function $F$.

Proof. If $F$ is continuous and strictly increasing, then we can just take $X(\omega) = F^{-1}(\omega)$. Then
\[ \mathrm{Leb}(X \le u) = \mathrm{Leb}(\{\omega : F^{-1}(\omega) \le u\}) = \mathrm{Leb}(\{\omega : \omega \le F(u)\}) = F(u). \]
For more general $F$, we proceed as follows. Define the "inverse" $G : [0, 1] \to \mathbb{R}$ by
\[ G(\omega) = \inf\{t : F(t) > \omega\}. \]
We set $X(\omega) = G(\omega)$ and prove that $\mathrm{Leb}(X \le u) = F(u)$. It suffices to show that
\[ [0, F(u)) \subseteq \{\omega \in [0, 1] : \inf\{t : F(t) > \omega\} \le u\} \subseteq [0, F(u)]. \]
First suppose $\omega \in [0, F(u))$. Then $F(u) > \omega$, so $u \in \{t : F(t) > \omega\}$ and hence
\[ \inf\{t : F(t) > \omega\} \le u. \]
On the other hand, let $\omega \in [0, 1]$ be such that $\inf\{t : F(t) > \omega\} \le u$. Then by monotonicity of $F$,
\[ F(\inf\{t : F(t) > \omega\}) \le F(u), \]
and so by right-continuity,
\[ \inf\{F(t) : F(t) > \omega\} \le F(u). \]
But now since $\omega \le \inf\{F(t) : F(t) > \omega\}$, we have that $\omega \le F(u)$.
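The proof of Theorem 1.5 is constructive: feeding a uniform $\omega \in [0, 1]$ into $G(\omega) = \inf\{t : F(t) > \omega\}$ produces a sample with distribution function $F$ (the idea behind inverse transform sampling). The Python sketch below (my own illustration, not from the notes) approximates $G$ by bisection for an increasing, right-continuous $F$; the helper name skorokhod_inverse and the bracketing interval are assumptions made for the example.

```python
import math
import random

# Skorokhod / inverse-transform sampling: X(omega) = G(omega) with
# G(omega) = inf{t : F(t) > omega}, approximated here by bisection.
def skorokhod_inverse(F, omega, lo=-1e6, hi=1e6, iters=80):
    """Approximate inf{t : F(t) > omega} for an increasing, right-continuous F,
    assuming the infimum lies inside [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if F(mid) > omega:
            hi = mid   # mid belongs to {t : F(t) > omega}, so the infimum is <= mid
        else:
            lo = mid   # F(mid) <= omega, so by monotonicity the infimum is >= mid
    return hi

# Try it with F of an exponential random variable with mean 1.
def F(t):
    return 1 - math.exp(-t) if t > 0 else 0.0

random.seed(1)
samples = [skorokhod_inverse(F, random.random()) for _ in range(20_000)]
print(sum(samples) / len(samples))                            # close to the mean, 1
print(sum(s <= 0.5 for s in samples) / len(samples), F(0.5))  # empirical vs F(0.5) ≈ 0.39
```

For this particular $F$ the infimum has the closed form $G(\omega) = -\log(1 - \omega)$; the bisection is only needed when no explicit inverse is available.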
Definition 1.6. If $F$ (the distribution function of a random variable $X$) can be written as an integral
\[ F(t) = \int_{-\infty}^{t} f(u)\, du \]
for some measurable function $f$, then $X$ is called a continuous random variable with density $f$.

Remark 1.7.
1. If $F$ is differentiable, then $X$ has density $f = F'$.
2. If $f$ is a density, then $\int_{-\infty}^{\infty} f(t)\, dt = \mathbb{P}(\Omega) = 1$.
3. If $X$ has density $f$ and law $\mu$, then $\mu \ll \mathrm{Leb}$ (i.e. $\mathrm{Leb}(A) = 0 \implies \mu(A) = 0$) and $f$ is the Radon--Nikodym derivative $\frac{d\mu}{d\mathrm{Leb}}$.

Example 1.8 (Examples of distributions).

1. Uniform random variable on $[0, 1]$ ("a random number between 0 and 1"):
\[ F(t) = \begin{cases} 1, & \text{if } t > 1, \\ t, & \text{if } 0 \le t \le 1, \\ 0, & \text{if } t < 0, \end{cases} \qquad \text{and} \qquad f(t) = \begin{cases} 1, & \text{if } 0 \le t \le 1, \\ 0, & \text{otherwise.} \end{cases} \]

2. Exponential random variable with mean $\mu$:
\[ F(t) = \begin{cases} 1 - e^{-t/\mu}, & \text{if } t > 0, \\ 0, & \text{if } t \le 0, \end{cases} \qquad \text{and} \qquad f(t) = \begin{cases} \frac{1}{\mu} e^{-t/\mu}, & \text{if } t > 0, \\ 0, & \text{if } t \le 0. \end{cases} \]

3. Normal (or Gaussian) random variable with mean $\mu$ and variance $\sigma^2$ (denoted $N(\mu, \sigma^2)$).
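As a quick numerical sanity check of Definition 1.6 and Remark 1.7(2) (again a sketch, not part of the notes), the snippet below integrates the exponential density of Example 1.8(2): the total mass is 1, and integrating $f$ up to $t$ reproduces $F(t)$. The crude Riemann-sum helper integrate is a hypothetical convenience chosen to keep the example self-contained.

```python
import math

# Check Definition 1.6 / Remark 1.7(2) for the exponential distribution with mean mu.
mu = 2.0

def f(t):
    """Density of the exponential distribution with mean mu."""
    return math.exp(-t / mu) / mu if t > 0 else 0.0

def F(t):
    """Distribution function of the exponential distribution with mean mu."""
    return 1 - math.exp(-t / mu) if t > 0 else 0.0

def integrate(g, a, b, n=100_000):
    """Crude midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

print(integrate(f, 0, 200))           # ~ 1.0  (total mass, Remark 1.7(2))
for t in (0.5, 1.0, 5.0):
    print(integrate(f, 0, t), F(t))   # the two values agree (Definition 1.6)
```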