
Chapter 15

Probability Spaces and Random Variables

We have already introduced the notion of a probability space, namely a measure space $(X, \mathcal{F}, \mu)$ with the property that $\mu(X) = 1$. Here we look further at some basic notions and results of probability theory. First, we give some terminology. A set $S \in \mathcal{F}$ is called an event, and $\mu(S)$ is called the probability of the event, often denoted $P(S)$. The image to have in mind is that one picks a point $x \in X$ at random, and $\mu(S)$ is the probability that $x \in S$. A measurable function $f : X \to \mathbb{R}$ is called a (real) random variable. (One also speaks of a measurable map $F : (X, \mathcal{F}) \to (Y, \mathcal{G})$ as a $Y$-valued random variable, given a measurable space $(Y, \mathcal{G})$.) If $f$ is integrable, one sets

(15.1)   $E(f) = \int_X f \, d\mu$,

called the expectation of $f$, or the mean of $f$. One defines the variance of $f$ as

(15.2)   $\mathrm{Var}(f) = \int_X |f - a|^2 \, d\mu, \qquad a = E(f)$.

The random variable $f$ has finite variance if and only if $f \in L^2(X, \mu)$. A random variable $f : X \to \mathbb{R}$ induces a probability measure $\nu_f$ on $\mathbb{R}$, called the probability distribution of $f$:

(15.3)   $\nu_f(S) = \mu(f^{-1}(S)), \qquad S \in \mathcal{B}(\mathbb{R})$,

where $\mathcal{B}(\mathbb{R})$ denotes the $\sigma$-algebra of Borel sets in $\mathbb{R}$. It is clear that $f \in L^1(X, \mu)$ if and only if $\int_{\mathbb{R}} |x| \, d\nu_f(x) < \infty$ and that

(15.4)   $E(f) = \int_{\mathbb{R}} x \, d\nu_f(x)$.

We also have

(15.5)   $\mathrm{Var}(f) = \int_{\mathbb{R}} (x - a)^2 \, d\nu_f(x), \qquad a = E(f)$.
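The following minimal numerical sketch (not part of the text; it assumes Python with only the standard library) illustrates (15.1)–(15.5) on a finite probability space: the expectation and variance of a random variable agree whether one integrates over $X$ or against the induced distribution $\nu_f$ on $\mathbb{R}$.

```python
# Sketch: expectation and variance of a random variable on a finite probability
# space, computed two ways: directly via (15.1)-(15.2), and through the induced
# distribution nu_f as in (15.4)-(15.5).
from collections import defaultdict

# A finite probability space: X = {0,1,2,3}, mu given by weights summing to 1.
mu = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
f = {0: -1.0, 1: 2.0, 2: 2.0, 3: 5.0}   # a random variable f : X -> R

# (15.1), (15.2): integrate over X.
E_f = sum(f[x] * mu[x] for x in mu)
Var_f = sum((f[x] - E_f) ** 2 * mu[x] for x in mu)

# (15.3): the distribution nu_f(S) = mu(f^{-1}(S)), here a sum of point masses on R.
nu_f = defaultdict(float)
for x in mu:
    nu_f[f[x]] += mu[x]

# (15.4), (15.5): integrate over R against nu_f; the answers must agree.
E_via_nu = sum(t * p for t, p in nu_f.items())
Var_via_nu = sum((t - E_via_nu) ** 2 * p for t, p in nu_f.items())

assert abs(E_f - E_via_nu) < 1e-12 and abs(Var_f - Var_via_nu) < 1e-12
print(E_f, Var_f)   # prints the common values
```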

To illustrate some of these concepts, consider coin tosses. We start with

(15.6)   $X = \{h, t\}, \qquad \mu(\{h\}) = \mu(\{t\}) = \tfrac{1}{2}$.

The event $\{h\}$ is that the coin comes up heads, and $\{t\}$ gives tails. We also form

(15.7)   $X_k = \prod_{j=1}^{k} X, \qquad \mu_k = \mu \times \cdots \times \mu$,

representing the set of possible outcomes of $k$ successive coin tosses. If $H(k, \ell)$ is the event that there are exactly $\ell$ heads among the $k$ coin tosses, its probability is

(15.8)   $\mu_k(H(k, \ell)) = 2^{-k} \binom{k}{\ell}$.

If $N_k : X_k \to \mathbb{R}$ yields the number of heads that come up in $k$ tosses, i.e., $N_k(x)$ is the number of $h$'s that occur in $x = (x_1, \ldots, x_k) \in X_k$, then

(15.9)   $E(N_k) = \frac{k}{2}$.

The measure $\nu_{N_k}$ on $\mathbb{R}$ is supported on the set $\{\ell \in \mathbb{Z}^+ : 0 \le \ell \le k\}$, and

(15.10)   $\nu_{N_k}(\{\ell\}) = \mu_k(H(k, \ell)) = 2^{-k} \binom{k}{\ell}$.
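Here is a minimal sketch (not part of the text; it assumes Python's standard library) that checks (15.8)–(15.10): the exact binomial weights sum to 1, have mean $k/2$, and agree with Monte Carlo frequencies from repeated coin tossing.

```python
# Sketch: check (15.8)-(15.10) for k fair coin tosses, exactly and by Monte Carlo.
import math, random

k = 10

# Exact distribution of N_k from (15.10): nu_{N_k}({l}) = 2^{-k} * C(k, l).
nu_Nk = [math.comb(k, l) / 2**k for l in range(k + 1)]
assert abs(sum(nu_Nk) - 1.0) < 1e-12                       # a probability measure
E_Nk = sum(l * p for l, p in enumerate(nu_Nk))
assert abs(E_Nk - k / 2) < 1e-12                            # (15.9): E(N_k) = k/2

# Monte Carlo estimate of mu_k(H(k, l)): toss k coins many times.
trials = 200_000
counts = [0] * (k + 1)
for _ in range(trials):
    heads = sum(random.random() < 0.5 for _ in range(k))    # N_k for one sample
    counts[heads] += 1
empirical = [c / trials for c in counts]

for l in range(k + 1):
    print(l, nu_Nk[l], empirical[l])    # empirical frequencies approximate (15.8)
```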

A central area of probability theory is the study of the large-$k$ behavior of events in spaces such as $(X_k, \mu_k)$, particularly various limits as $k \to \infty$. This leads naturally to a consideration of the infinite product space

(15.11)   $Z = \prod_{j=1}^{\infty} X$,

with $\sigma$-algebra $\mathcal{Z}$ and product measure $\nu$, constructed at the end of Chapter 6. In case $(X, \mu)$ is given by (15.6), one can consider $\widetilde{N}_k : Z \to \mathbb{R}$, the number of heads that come up in the first $k$ throws; $\widetilde{N}_k = N_k \circ \pi_k$, where $\pi_k : Z \to X_k$ is the natural projection. It is a fundamental fact of probability theory that for almost all infinite sequences of random coin tosses (i.e., with probability 1) the fraction $k^{-1}\widetilde{N}_k$ of them that come up heads tends to $1/2$ as $k \to \infty$. This is a special case of the "law of large numbers," several versions of which will be established below.

Note that $k^{-1}\widetilde{N}_k$ has the form

(15.12)   $\frac{1}{k} \sum_{j=1}^{k} f_j$,

where $f_j : Z \to \mathbb{R}$ has the form

(15.13)   $f_j(x) = f(x_j), \qquad f : X \to \mathbb{R}, \quad x = (x_1, x_2, x_3, \ldots) \in Z$.

The random variables $f_j$ have the important properties of being independent and identically distributed. We define these last two terms in general, for random variables on a probability space $(X, \mathcal{F}, \mu)$ that need not have a product structure. Say we have random variables $f_1, \ldots, f_k : X \to \mathbb{R}$. Extending (15.3), we have a measure $\nu_{F_k}$ on $\mathbb{R}^k$ called the joint probability distribution:

(15.14)   $\nu_{F_k}(S) = \mu(F_k^{-1}(S)), \qquad S \in \mathcal{B}(\mathbb{R}^k), \quad F_k = (f_1, \ldots, f_k) : X \to \mathbb{R}^k$.

We say

(15.15)   $f_1, \ldots, f_k$ are independent $\iff \nu_{F_k} = \nu_{f_1} \times \cdots \times \nu_{f_k}$.

We also say

(15.16)   $f_i$ and $f_j$ are identically distributed $\iff \nu_{f_i} = \nu_{f_j}$.

If $\{f_j : j \in \mathbb{N}\}$ is an infinite sequence of random variables on $X$, we say they are independent if and only if each finite subset is. Equivalently, we can form

(15.17)   $F = (f_1, f_2, f_3, \ldots) : X \to \mathbb{R}^{\infty} = \prod_{j \ge 1} \mathbb{R}$,

set

(15.18)   $\nu_F(S) = \mu(F^{-1}(S)), \qquad S \in \mathcal{B}(\mathbb{R}^{\infty})$,

and then independence is equivalent to

(15.19)   $\nu_F = \prod_{j \ge 1} \nu_{f_j}$.

It is an easy exercise to show that the random variables in (15.13) are independent and identically distributed. Here is a simple consequence of independence.

Lemma 15.1. If $f_1, f_2 \in L^2(X, \mu)$ are independent, then

(15.20)   $E(f_1 f_2) = E(f_1)\,E(f_2)$.

Proof. We have $x^2, y^2 \in L^1(\mathbb{R}^2, \nu_{f_1, f_2})$, and hence $xy \in L^1(\mathbb{R}^2, \nu_{f_1, f_2})$. Then

(15.21)   $E(f_1 f_2) = \int_{\mathbb{R}^2} xy \, d\nu_{f_1, f_2}(x, y) = \int_{\mathbb{R}^2} xy \, d\nu_{f_1}(x) \, d\nu_{f_2}(y) = E(f_1)\,E(f_2)$.

The following is a version of the weak law of large numbers. (See the exercises for a more general version.)

Proposition 15.2. Let $\{f_j : j \in \mathbb{N}\}$ be independent, identically distributed random variables on $(X, \mathcal{F}, \mu)$. Assume the $f_j$ have finite variance, and set $a = E(f_j)$. Then

(15.22)   $\frac{1}{k} \sum_{j=1}^{k} f_j \longrightarrow a$, in $L^2$-norm,

and hence in measure, as $k \to \infty$.

Proof. Using Lemma 15.1, we have

(15.23)   $\Big\| \frac{1}{k} \sum_{j=1}^{k} f_j - a \Big\|_{L^2}^2 = \frac{1}{k^2} \sum_{j=1}^{k} \| f_j - a \|_{L^2}^2 = \frac{b^2}{k}$,

since $(f_j - a, f_\ell - a)_{L^2} = E(f_j - a)\,E(f_\ell - a) = 0$ for $j \ne \ell$, and $\| f_j - a \|_{L^2}^2 = b^2$ is independent of $j$. Clearly (15.23) implies (15.22). Convergence in measure then follows by Tchebychev's inequality.
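The $b^2/k$ decay in (15.23) is easy to see numerically. The following minimal sketch (not part of the text; it assumes NumPy) estimates $E\big|k^{-1}\sum_{j\le k} f_j - a\big|^2$ for i.i.d. uniform variables and compares it with $b^2/k$.

```python
# Sketch of the L^2 scaling (15.23): for i.i.d. f_j uniform on [0, 1]
# (so a = 1/2 and b^2 = Var(f_j) = 1/12), the mean squared error of the
# sample average should be close to b^2 / k.
import numpy as np

rng = np.random.default_rng(0)
a, b2 = 0.5, 1.0 / 12.0
trials = 5_000

for k in (10, 100, 1000):
    samples = rng.random((trials, k))            # independent runs of length k
    mse = np.mean((samples.mean(axis=1) - a) ** 2)
    print(k, mse, b2 / k)                        # the two columns should be close
```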

The strong law of large numbers produces pointwise a.e. convergence and relaxes the $L^2$-hypothesis made in Proposition 15.2. Before proceeding to the general case, we first treat the product case (15.11)–(15.13).

Proposition 15.3. Let $(Z, \mathcal{Z}, \nu)$ be a product of a countable number of factors of a probability space $(X, \mathcal{F}, \mu)$, as in (15.11). Assume $p \in [1, \infty)$, let $f \in L^p(X, \mu)$, and define $f_j \in L^p(Z, \nu)$ as in (15.13). Set $a = E(f)$. Then

(15.24)   $\frac{1}{k} \sum_{j=1}^{k} f_j \to a$, in $L^p$-norm and $\nu$-a.e.,

as $k \to \infty$.

Proof. This follows from ergodic theorems established in Chapter 14. In fact, note that $f_j = T^{j-1} f_1$, where $T g(x) = g(\varphi(x))$, for

(15.25)   $\varphi : Z \to Z, \qquad \varphi(x_1, x_2, x_3, \ldots) = (x_2, x_3, x_4, \ldots)$.

By Proposition 14.11, $\varphi$ is ergodic. We see that

(15.26)   $\frac{1}{k} \sum_{j=1}^{k} f_j = \frac{1}{k} \sum_{j=0}^{k-1} T^j f_1 = A_k f_1$,

as in (14.4). The convergence $A_k f_1 \to a$ asserted in (15.24) now follows from Proposition 14.8.
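A minimal sketch (not part of the text; it assumes NumPy) of the ergodic-average viewpoint (15.26) for the coin-toss space (15.6): along the shift orbit of a single sampled sequence, the Birkhoff averages $A_k f_1$ are just the running fractions of heads, and they approach $a = 1/2$.

```python
# Sketch of (15.26): Birkhoff averages of f_1(x) = f(x_1) along the orbit of
# the shift map phi on the coin-toss space, for one sampled sequence x in Z.
# With f(h) = 1, f(t) = 0, A_k f_1(x) is the fraction of heads among the
# first k tosses, which should approach a = 1/2.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x = rng.integers(0, 2, size=n)        # a long initial segment of x in Z (1 = heads)

# A_k f_1(x) = (1/k) * sum_{j=0}^{k-1} f_1(phi^j(x)) = (1/k) * (x_1 + ... + x_k)
for k in (10, 1000, 100_000):
    print(k, x[:k].mean())            # tends to 1/2 as k grows
```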

We now establish the following strong law of large numbers.

Theorem 15.4. Let $(X, \mathcal{F}, \mu)$ be a probability space, and let $\{f_j : j \in \mathbb{N}\}$ be independent, identically distributed random variables in $L^p(X, \mu)$, with $p \in [1, \infty)$. Set $a = E(f_j)$. Then

(15.27)   $\frac{1}{k} \sum_{j=1}^{k} f_j \to a$, in $L^p$-norm and $\mu$-a.e.,

as $k \to \infty$.

Proof. Our strategy is to reduce this to Proposition 15.3. We have a map $F : X \to \mathbb{R}^{\infty}$ as in (15.17), yielding a measure $\nu_F$ on $\mathbb{R}^{\infty}$, as in (15.18), which is actually a product measure, as in (15.19). We have coordinate functions

(15.28)   $\xi_j : \mathbb{R}^{\infty} \to \mathbb{R}, \qquad \xi_j(x_1, x_2, x_3, \ldots) = x_j$,

and

(15.29)   $f_j = \xi_j \circ F$.

Note that $\xi_j \in L^p(\mathbb{R}^{\infty}, \nu_F)$ and

(15.30)   $\int_{\mathbb{R}^{\infty}} \xi_j \, d\nu_F = \int_{\mathbb{R}} x_j \, d\nu_{f_j} = a$.

Now Proposition 15.3 implies

(15.31)   $\frac{1}{k} \sum_{j=1}^{k} \xi_j \to a$, in $L^p$-norm and $\nu_F$-a.e.,

on $(\mathbb{R}^{\infty}, \nu_F)$, as $k \to \infty$. Since (15.29) holds and $F$ is measure-preserving, (15.27) follows.

Note that if $f_j$ are independent random variables on $X$ and $F_k = (f_1, \ldots, f_k) : X \to \mathbb{R}^k$, then

(15.32)   $\int_X G(f_1, \ldots, f_k) \, d\mu = \int_{\mathbb{R}^k} G(x_1, \ldots, x_k) \, d\nu_{F_k} = \int_{\mathbb{R}^k} G(x_1, \ldots, x_k) \, d\nu_{f_1}(x_1) \cdots d\nu_{f_k}(x_k)$.

In particular, we have for the sum

(15.33)   $S_k = \sum_{j=1}^{k} f_j$

and a Borel set $B \subset \mathbb{R}$ that

(15.34)   $\nu_{S_k}(B) = \int_X \chi_B(f_1 + \cdots + f_k) \, d\mu = \int_{\mathbb{R}^k} \chi_B(x_1 + \cdots + x_k) \, d\nu_{f_1}(x_1) \cdots d\nu_{f_k}(x_k)$.
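Here is a minimal numerical sketch (not part of the text; it assumes NumPy): for two independent integer-valued random variables (two fair dice, say), the iterated integral in (15.34) reduces to a discrete convolution of the weight vectors, anticipating the identity (15.35) below, and the result matches a Monte Carlo estimate of the distribution of the sum.

```python
# Sketch of (15.34)-(15.35) for two independent fair dice f_1, f_2 with values
# 1..6: the distribution of S_2 = f_1 + f_2 is the convolution of the
# individual distributions.
import numpy as np

rng = np.random.default_rng(4)
nu = np.full(6, 1 / 6)                       # nu_{f_1} = nu_{f_2}, weights on 1..6

# Convolution as in (15.35); entry m corresponds to the value m + 2.
nu_S2 = np.convolve(nu, nu)

# Monte Carlo estimate of nu_{S_2}.
trials = 300_000
sums = rng.integers(1, 7, trials) + rng.integers(1, 7, trials)
empirical = np.array([(sums == v).mean() for v in range(2, 13)])

print(np.round(nu_S2, 4))
print(np.round(empirical, 4))                # the two rows should be close
```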

Recalling the definition (9.60) of convolution of measures, we see that

(15.35)   $\nu_{S_k} = \nu_{f_1} * \cdots * \nu_{f_k}$.

Given a random variable $f : X \to \mathbb{R}$, the function

(15.36)   $\chi_f(\xi) = E(e^{-i\xi f}) = \int_{\mathbb{R}} e^{-ix\xi} \, d\nu_f(x) = \sqrt{2\pi}\, \hat{\nu}_f(\xi)$

is called the characteristic function of $f$ (not to be confused with the characteristic function of a set). Combining (15.35) and (9.64), we see that if $\{f_j\}$ are independent and $S_k$ is given by (15.33), then

(15.37)   $\chi_{S_k}(\xi) = \chi_{f_1}(\xi) \cdots \chi_{f_k}(\xi)$.

There is a special class of probability distributions on $\mathbb{R}$ called Gaussian. They have the form

(15.38)   $d\gamma^{\sigma}_a(x) = \frac{1}{\sqrt{2\pi\sigma}}\, e^{-(x-a)^2/2\sigma} \, dx$.

That this is a probability distribution follows from Exercise 1 of Chapter 7. One computes

(15.39)   $\int x \, d\gamma^{\sigma}_a = a, \qquad \int (x - a)^2 \, d\gamma^{\sigma}_a = \sigma$.
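A minimal numerical check of (15.38)–(15.39) (not part of the text; it assumes NumPy): integrating the density on a fine grid recovers total mass $1$, mean $a$, and variance $\sigma$.

```python
# Sketch: numerical check of (15.39) for the Gaussian measure (15.38), with
# a = 1.5 and sigma = 0.7 (sigma denotes the variance here, as in the text).
import numpy as np

a, sigma = 1.5, 0.7
x = np.linspace(a - 12 * np.sqrt(sigma), a + 12 * np.sqrt(sigma), 200_001)
density = np.exp(-(x - a) ** 2 / (2 * sigma)) / np.sqrt(2 * np.pi * sigma)
dx = x[1] - x[0]

mass = (density * dx).sum()                      # should be close to 1
mean = (x * density * dx).sum()                  # should be close to a
var = ((x - a) ** 2 * density * dx).sum()        # should be close to sigma
print(mass, mean, var)
```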

The distribution (15.38) is also called normal, with mean $a$ and variance $\sigma$. (Frequently one sees $\sigma^2$ in place of $\sigma$ in these formulas.) A random variable $f$ on $(X, \mathcal{F}, \mu)$ is said to be Gaussian if $\nu_f$ is Gaussian. The computation (9.43)–(9.48) shows that

(15.40)   $\sqrt{2\pi}\, \hat{\gamma}^{\sigma}_a(\xi) = e^{-\sigma\xi^2/2 - ia\xi}$.

Hence $f : X \to \mathbb{R}$ is Gaussian with mean $a$ and variance $\sigma$ if and only if

(15.41)   $\chi_f(\xi) = e^{-\sigma\xi^2/2 - ia\xi}$.

We also see that

(15.42)   $\gamma^{\sigma}_a * \gamma^{\tau}_b = \gamma^{\sigma+\tau}_{a+b}$,

and that if $f_j$ are independent Gaussian random variables on $X$, the sum $S_k = f_1 + \cdots + f_k$ is also Gaussian.

Gaussian distributions are often approximated by distributions of the sum of a large number of independent random variables, suitably rescaled. Theorems to this effect are called Central Limit Theorems. We present one here. Let $\{f_j : j \in \mathbb{N}\}$ be independent, identically distributed random variables on a probability space $(X, \mathcal{F}, \mu)$, with

(15.43)   $E(f_j) = a, \qquad E((f_j - a)^2) = \sigma < \infty$.

The appropriate rescaling of $f_1 + \cdots + f_k$ is suggested by the computation (15.23). We have

(15.44)   $g_k = \frac{1}{\sqrt{k}} \sum_{j=1}^{k} (f_j - a) \ \Longrightarrow\ \| g_k \|_{L^2}^2 \equiv \sigma$.

Note that if $\nu_1$ is the probability distribution of $f_j - a$, then for any Borel set $B \subset \mathbb{R}$,

(15.45)   $\nu_{g_k}(B) = \nu_k(\sqrt{k}\, B), \qquad \nu_k = \nu_1 * \cdots * \nu_1 \ (k \text{ factors})$.

We have

(15.46)   $\int x^2 \, d\nu_1 = \sigma, \qquad \int x \, d\nu_1 = 0$.

We are prepared to prove the following version of the Central Limit Theorem.

Proposition 15.5. If $\{f_j : j \in \mathbb{N}\}$ are independent, identically distributed random variables on $(X, \mathcal{F}, \mu)$, satisfying (15.43), and $g_k$ is given by (15.44), then

(15.47)   $\nu_{g_k} \to \gamma^{\sigma}_0$, weak$^*$ in $\mathcal{M}(\mathbb{R}) = C_*(\mathbb{R})'$.

Proof. By (15.45) we have

(15.48)   $\chi_{g_k}(\xi) = \chi(k^{-1/2}\xi)^k$,

where $\chi(\xi) = \chi_{f_1 - a}(\xi) = \sqrt{2\pi}\, \hat{\nu}_1(\xi)$. By (15.46) we have $\chi \in C^2(\mathbb{R})$, $\chi'(0) = 0$, and $\chi''(0) = -\sigma$. Hence

(15.49)   $\chi(\xi) = 1 - \frac{\sigma}{2}\xi^2 + r(\xi), \qquad r(\xi) = o(\xi^2)$ as $\xi \to 0$.

Equivalently,

(15.50)   $\chi(\xi) = e^{-\sigma\xi^2/2 + \rho(\xi)}, \qquad \rho(\xi) = o(\xi^2)$.

Hence

(15.51)   $\chi_{g_k}(\xi) = e^{-\sigma\xi^2/2 + \rho_k(\xi)}$,

where

(15.52)   $\rho_k(\xi) = k\rho(k^{-1/2}\xi) \to 0$ as $k \to \infty$, $\forall\, \xi \in \mathbb{R}$.

In other words,

(15.53)   $\lim_{k \to \infty} \hat{\nu}_{g_k}(\xi) = \hat{\gamma}^{\sigma}_0(\xi), \qquad \forall\, \xi \in \mathbb{R}$.

Note that the functions in (15.53) are uniformly bounded by $\sqrt{2\pi}$. Making use of (15.53), the Fourier transform identity (9.58), and the Dominated Convergence Theorem, we obtain, for each $v \in \mathcal{S}(\mathbb{R})$,

(15.54)   $\int v \, d\nu_{g_k} = \int \tilde{v}(\xi)\, \hat{\nu}_{g_k}(\xi) \, d\xi \ \to\ \int \tilde{v}(\xi)\, \hat{\gamma}^{\sigma}_0(\xi) \, d\xi = \int v \, d\gamma^{\sigma}_0$.

Since $\mathcal{S}(\mathbb{R})$ is dense in $C_*(\mathbb{R})$ and all these measures are probability measures, this implies the asserted weak$^*$ convergence in (15.47).
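Proposition 15.5 is easy to visualize. The following minimal sketch (not part of the text; it assumes NumPy and Python's math module) compares cell probabilities of $\nu_{g_k}$, estimated by simulation for uniform $f_j$, with those of the limiting Gaussian $\gamma^{\sigma}_0$.

```python
# Sketch of Proposition 15.5: g_k = k^{-1/2} * sum_{j<=k} (f_j - a) for i.i.d.
# f_j uniform on [0, 1] (a = 1/2, sigma = 1/12), compared with gamma_0^sigma.
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(2)
a, sigma = 0.5, 1.0 / 12.0
k, trials = 500, 20_000

samples = rng.random((trials, k))
g_k = (samples - a).sum(axis=1) / np.sqrt(k)      # (15.44)

# Compare cell probabilities of nu_{g_k} with those of the Gaussian limit.
edges = np.linspace(-1.0, 1.0, 9)
hist, _ = np.histogram(g_k, bins=edges)
empirical = hist / trials

cdf = lambda t: 0.5 * (1 + erf(t / sqrt(2 * sigma)))    # CDF of gamma_0^sigma
gauss = np.array([cdf(edges[i + 1]) - cdf(edges[i]) for i in range(len(edges) - 1)])

print(np.round(empirical, 3))
print(np.round(gauss, 3))                          # the two rows should be close
```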

Chapter 16 is devoted to the construction and study of a very important probability measure, known as Wiener measure, on the space of continuous paths in $\mathbb{R}^n$. There are many naturally occurring Gaussian random variables on this space.

We return to the strong law of large numbers and generalize Theorem 15.4 to a setting in which the $f_j$ need not be independent. A sequence $\{f_j : j \in \mathbb{N}\}$ of real-valued random variables on $(X, \mathcal{F}, \mu)$, giving a map $F : X \to \mathbb{R}^{\infty}$ as in (15.17), is called a stationary process provided the probability measure $\nu_F$ on $\mathbb{R}^{\infty}$ given by (15.18) is invariant under the shift map

(15.55)   $\theta : \mathbb{R}^{\infty} \to \mathbb{R}^{\infty}, \qquad \theta(x_1, x_2, x_3, \ldots) = (x_2, x_3, x_4, \ldots)$.

An equivalent condition is that for each $k, n \in \mathbb{N}$, the $n$-tuples $\{f_1, \ldots, f_n\}$ and $\{f_k, \ldots, f_{k+n-1}\}$ are identically distributed $\mathbb{R}^n$-valued random variables. Clearly a sequence of independent, identically distributed random variables is stationary, but there are many other stationary processes. (See the exercises.)

To see what happens to the averages $k^{-1} \sum_{j=1}^{k} f_j$ when one has a stationary process, we can follow the proof of Theorem 15.4. This time, an application of Theorem 14.6 and Proposition 14.7 to the action of $\theta$ on $(\mathbb{R}^{\infty}, \nu_F)$ gives

(15.56)   $\frac{1}{k} \sum_{j=1}^{k} \xi_j \to P\xi_1$, $\nu_F$-a.e. and in $L^p$-norm,

provided

(15.57)   $\xi_1 \in L^p(\mathbb{R}^{\infty}, \nu_F)$, i.e., $f_1 \in L^p(X, \mu)$, $p \in [1, \infty)$.

Here the map $P : L^2(\mathbb{R}^{\infty}, \nu_F) \to L^2(\mathbb{R}^{\infty}, \nu_F)$ is the orthogonal projection of $L^2(\mathbb{R}^{\infty}, \nu_F)$ onto the subspace consisting of $\theta$-invariant functions, which, by Proposition 14.3, extends uniquely to a continuous projection on $L^p(\mathbb{R}^{\infty}, \nu_F)$ for each $p \in [1, \infty]$. Since $F : (X, \mathcal{F}, \mu) \to (\mathbb{R}^{\infty}, \mathcal{B}(\mathbb{R}^{\infty}), \nu_F)$ is measure preserving, the result (15.56) yields the following.

Proposition 15.6. Let $\{f_j : j \in \mathbb{N}\}$ be a stationary process, consisting of $f_j \in L^p(X, \mu)$, with $p \in [1, \infty)$. Then

(15.58)   $\frac{1}{k} \sum_{j=1}^{k} f_j \to (P\xi_1) \circ F$, $\mu$-a.e. and in $L^p$-norm.

The right side of (15.58) can be written as a conditional expectation. See Exercise 12 in Chapter 17.
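The following minimal sketch (not part of the text; it assumes NumPy) illustrates Proposition 15.6 with a stationary but non-independent process: a bias $p$ is first drawn at random from $\{0.2, 0.8\}$, and then coin tosses with that bias are performed. The resulting sequence is stationary (indeed exchangeable), and the averages converge not to a constant but to the random limit $p$, an instance of the $\theta$-invariant limit $(P\xi_1) \circ F$ in (15.58).

```python
# Sketch: a stationary, non-independent process.  Choose a bias p in {0.2, 0.8}
# with equal probability, then let f_j be i.i.d. Bernoulli(p) given p.  The
# running averages converge a.e., but to the random variable p rather than to
# its mean 0.5 -- the limit (P xi_1) o F in (15.58).
import numpy as np

rng = np.random.default_rng(3)
k = 200_000

for run in range(4):
    p = rng.choice([0.2, 0.8])                 # the theta-invariant "hidden" limit
    f = (rng.random(k) < p).astype(float)      # f_1, ..., f_k given p
    print(run, p, f.mean())                    # the average is near p, not 0.5
```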

Exercises

1. The second computation in (15.39) is equivalent to the identity

$\int_{-\infty}^{\infty} x^2 e^{-x^2} \, dx = \frac{\sqrt{\pi}}{2}$.

Verify this identity.
Hint. Differentiate the identity

$\int_{-\infty}^{\infty} e^{-sx^2} \, dx = \sqrt{\pi}\, s^{-1/2}$.

2. Given a probability space $(X, \mathcal{F}, \mu)$ and $A_j \in \mathcal{F}$, we say the sets $A_j$, $1 \le j \le K$, are independent if and only if their characteristic functions $\chi_{A_j}$ are independent, as defined in (15.15). Show that such a collection of sets is independent if and only if, for any distinct $i_1, \ldots, i_j$ in $\{1, \ldots, K\}$,

(15.59)   $\mu(A_{i_1} \cap \cdots \cap A_{i_j}) = \mu(A_{i_1}) \cdots \mu(A_{i_j})$.

3. Let $f_1, f_2$ be random variables on $(X, \mathcal{F}, \mu)$. Show that $f_1$ and $f_2$ are independent if and only if

(15.60)   $E(e^{-i(\xi_1 f_1 + \xi_2 f_2)}) = E(e^{-i\xi_1 f_1})\, E(e^{-i\xi_2 f_2}), \qquad \forall\, \xi_j \in \mathbb{R}$.

Extend this to a criterion for independence of $f_1, \ldots, f_k$.
Hint. Write the left side of (15.60) as

$\iint e^{-i(\xi_1 x_1 + \xi_2 x_2)} \, d\nu_{f_1, f_2}(x_1, x_2)$

and the right side as a similar Fourier transform, using $d\nu_{f_1} \times d\nu_{f_2}$.

4. Demonstrate the following partial converse to Lemma 15.1.

Lemma 15.7. Let $f_1$ and $f_2$ be random variables on $(X, \mu)$ such that $\xi_1 f_1 + \xi_2 f_2$ is Gaussian, of mean zero, for each $(\xi_1, \xi_2) \in \mathbb{R}^2$. Then

$E(f_1 f_2) = 0 \ \Longrightarrow\ f_1$ and $f_2$ are independent.

More generally, if $\sum_{1}^{k} \xi_j f_j$ are all Gaussian and if $f_1, \ldots, f_k$ are mutually orthogonal in $L^2(X, \mu)$, then $f_1, \ldots, f_k$ are independent.

Hint. Use

$E(e^{-i(\xi_1 f_1 + \xi_2 f_2)}) = e^{-\|\xi_1 f_1 + \xi_2 f_2\|^2/2}$,

which follows from (15.41).

Exercises 5–6 deal with results known as the Borel-Cantelli Lemmas. If $(X, \mathcal{F}, \mu)$ is a probability space and $A_k \in \mathcal{F}$, we set

$A = \limsup_{k \to \infty} A_k = \bigcap_{\ell=1}^{\infty} \bigcup_{k=\ell}^{\infty} A_k$,

the set of points $x \in X$ contained in infinitely many of the sets $A_k$. Equivalently,

$\chi_A = \limsup \chi_{A_k}$.

5. (First Borel-Cantelli Lemma) Show that

$\sum_{k \ge 1} \mu(A_k) < \infty \ \Longrightarrow\ \mu(A) = 0$.

Hint. $\mu(A) \le \mu\big(\bigcup_{k \ge \ell} A_k\big) \le \sum_{k \ge \ell} \mu(A_k)$.

6. (Second Borel-Cantelli Lemma) Assume $\{A_k : k \ge 1\}$ are independent events. Show that

$\sum_{k \ge 1} \mu(A_k) = \infty \ \Longrightarrow\ \mu(A) = 1$.

Hint. $X \setminus A = \bigcup_{\ell \ge 1} \bigcap_{k \ge \ell} A_k^c$, so to prove $\mu(A) = 1$, we need to show $\mu\big(\bigcap_{k \ge \ell} A_k^c\big) = 0$, for each $\ell$. Now independence implies

$\mu\Big(\bigcap_{k=\ell}^{L} A_k^c\Big) = \prod_{k=\ell}^{L} \big(1 - \mu(A_k)\big)$,

which tends to 0 as $L \to \infty$ provided $\sum \mu(A_k) = \infty$.

7. If $x$ and $y$ are chosen at random on $[0, 1]$ (with Lebesgue measure), compute the probability distribution of $x - y$ and of $(x - y)^2$. Equivalently, compute the probability distribution of $f$ and $f^2$, where $Q = [0, 1] \times [0, 1]$, with Lebesgue measure, and $f : Q \to \mathbb{R}$ is given by $f(x, y) = x - y$.

8. As in Exercise 7, set $Q = [0, 1] \times [0, 1]$. If $x = (x_1, x_2)$ and $y = (y_1, y_2)$ are chosen at random on $Q$, compute the probability distribution of $|x - y|^2$ and of $|x - y|$.
Hint. The random variables $(x_1 - y_1)^2$ and $(x_2 - y_2)^2$ are independent.

9. Suppose $\{f_j : j \in \mathbb{N}\}$ are independent random variables on $(X, \mathcal{F}, \mu)$, satisfying $E(f_j) = a$ and

(15.61)   $\| f_j - a \|_{L^2}^2 = \sigma_j, \qquad \lim_{k \to \infty} \frac{1}{k^2} \sum_{j=1}^{k} \sigma_j = 0$.

Show that the conclusion (15.22) of Proposition 15.2 holds.

10. Suppose $\{f_j : j \in \mathbb{N}\}$ is a stationary process on $(X, \mathcal{F}, \mu)$. Let $G : \mathbb{R}^{\infty} \to \mathbb{R}$ be $\mathcal{B}(\mathbb{R}^{\infty})$-measurable, and set

$g_j = G(f_j, f_{j+1}, f_{j+2}, \ldots)$.

Show that $\{g_j : j \in \mathbb{N}\}$ is a stationary process on $(X, \mathcal{F}, \mu)$.

In Exercises 11–12, $X$ is a compact metric space, $\mathcal{F}$ the $\sigma$-algebra of Borel sets, $\mu$ a probability measure on $\mathcal{F}$, and $\varphi : X \to X$ a continuous, measure-preserving map with the property that for each $p \in X$, $\varphi^{-1}(p)$ consists of exactly $d$ points, where $d \ge 2$ is some fixed integer. (An example is $X = S^1$, $\varphi(z) = z^d$.) Set

$\Omega = \prod_{k \ge 1} X, \qquad Z = \{(x_k) \in \Omega : \varphi(x_{k+1}) = x_k\}$.

Note that Z is a closed subset of Ω.

11. Show that there is a unique probability measure $\nu$ on $\Omega$ with the property that for $x = (x_1, x_2, x_3, \ldots) \in \Omega$, $A \in \mathcal{F}$, the probability that $x_1 \in A$ is $\mu(A)$, and given $x_1 = p_1, \ldots, x_k = p_k$, then $x_{k+1} \in \varphi^{-1}(p_k)$, and $x_{k+1}$ has probability $1/d$ of being any one of these pre-image points. More formally, construct a probability measure $\nu$ on $\Omega$ such that if $A_j \in \mathcal{F}$,

$\nu(A_1 \times \cdots \times A_k) = \int_{A_1} \cdots \int_{A_k} d\gamma_{x_{k-1}}(x_k) \cdots d\gamma_{x_1}(x_2) \, d\mu(x_1)$,

where, given $p \in X$, we set

$\gamma_p = \frac{1}{d} \sum_{q \in \varphi^{-1}(p)} \delta_q$,

$\delta_q$ denoting the point mass concentrated at $q$. Equivalently, if $f \in C(\Omega)$ has the form $f = f(x_1, \ldots, x_k)$, then

$\int_{\Omega} f \, d\nu = \int \cdots \int f(x_1, \ldots, x_k) \, d\gamma_{x_{k-1}}(x_k) \cdots d\gamma_{x_1}(x_2) \, d\mu(x_1)$.

Such $\nu$ is supported on $Z$.
Hint. The construction of $\nu$ can be done in a fashion parallel to, but simpler than, the construction of Wiener measure made at the beginning of Chapter 16. One might read down to (16.13) and return to this problem.

12. Define $f_j : Z \to \mathbb{R}$ by $f_j(x) = x_j$. Show that $\{f_j : j \in \mathbb{N}\}$ is a stationary process on $(Z, \nu)$, as constructed in Exercise 11.

13. In the course of proving Proposition 15.5, it was shown that if $\nu_k$ and $\gamma$ are probability measures on $\mathbb{R}$ and $\hat{\nu}_k(\xi) \to \hat{\gamma}(\xi)$ for each $\xi \in \mathbb{R}$, then $\nu_k \to \gamma$, weak$^*$ in $C_*(\mathbb{R})'$. Prove the converse:

Assertion. If $\nu_k$ and $\gamma$ are probability measures and $\nu_k \to \gamma$ weak$^*$ in $C_*(\mathbb{R})'$, then $\hat{\nu}_k(\xi) \to \hat{\gamma}(\xi)$ for each $\xi \in \mathbb{R}$.

Hint. Show that for each $\varepsilon > 0$ there exist $R, N \in (0, \infty)$ such that $\nu_k(\mathbb{R} \setminus [-R, R]) < \varepsilon$ for all $k \ge N$.

14. Produce a counterexample to the assertion in Exercise 13 when $\nu_k$ are probability measures but $\gamma$ is not.

15. Establish the following counterpart to Proposition 15.2 and Theorem 15.4. Let $\{f_j : j \in \mathbb{N}\}$ be independent, identically distributed random variables on a probability space $(X, \mathcal{F}, \mu)$. Assume $f_j \ge 0$ and $\int f_j \, d\mu = +\infty$. Show that, as $k \to \infty$,

$\frac{1}{k} \sum_{j=1}^{k} f_j \longrightarrow +\infty$, $\mu$-a.e.

16. Given $y \in \mathbb{R}$, $t > 0$, show that

$\chi_{t,y}(\xi) = e^{-t(1 - e^{-iy\xi})}$

is the characteristic function of the probability distribution

$\nu_{t,y} = e^{-t} \sum_{k=0}^{\infty} \frac{t^k}{k!}\, \delta_{ky}$.

That is, $\chi_{t,y}(\xi) = \int e^{-ix\xi} \, d\nu_{t,y}(x)$. These probability distributions are called Poisson distributions. Recalling how (9.64) leads to (15.37), show that

$\nu_{s,y} * \nu_{t,y} = \nu_{s+t,y}$.

17. Suppose $\psi(\xi)$ has the property that for each $t > 0$, $e^{-t\psi(\xi)}$ is the characteristic function of a probability distribution $\nu_t$. (One says $\nu_t$ is infinitely divisible and that $\psi$ generates a Lévy process.) Show that if $\varphi(\xi)$ also has this property, so does $a\psi(\xi) + b\varphi(\xi)$, given $a, b \in \mathbb{R}^+$.

18. Show that whenever $\mu$ is a positive Borel measure on $\mathbb{R} \setminus \{0\}$ such that $\int (|y|^2 \wedge 1) \, d\mu < \infty$, then, given $A \ge 0$, $b \in \mathbb{R}$,

$\psi(\xi) = A\xi^2 + ib\xi + \int \big(1 - e^{-iy\xi} - iy\xi\, \chi_I(y)\big) \, d\mu(y)$

has the property exposed in Exercise 17, i.e., $\psi$ generates a Lévy process. Here, $\chi_I = 1$ on $I = [-1, 1]$, $0$ elsewhere.
Hint. Apply Exercise 17 and a limiting argument. For material on Lévy processes, see [Sat].
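As a numerical companion to Exercise 16 (a sketch, not part of the text; it assumes NumPy), one can check that truncating the series defining $\nu_{t,y}$ and summing $e^{-ix\xi}$ against its weights reproduces $e^{-t(1 - e^{-iy\xi})}$.

```python
# Sketch for Exercise 16: compare the claimed characteristic function
# exp(-t(1 - exp(-i*y*xi))) with  e^{-t} * sum_k t^k/k! * exp(-i*k*y*xi),
# computed from a truncation of the Poisson-type measure nu_{t,y}.
import numpy as np
from math import factorial

t, y = 1.7, 0.4
xi = np.linspace(-10.0, 10.0, 7)

closed_form = np.exp(-t * (1.0 - np.exp(-1j * y * xi)))
series = sum(np.exp(-t) * t**k / factorial(k) * np.exp(-1j * k * y * xi)
             for k in range(60))                    # 60 terms is plenty for t = 1.7

print(np.max(np.abs(closed_form - series)))         # should be tiny
```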