Math 710 Homework

Austin Mohr June 16, 2012

1. For the following random “experiments”, describe the Ω. For each experiment, describe also two subsets (events) that might be of interest, and describe how you might assign to these events.

(a) The USC football team will play 12 games this season. The experi- ment is to observe the Win-Tie-Loss record. Solution: Define the sample space Ω to be the set

{(a1, . . . , a12) | ai ∈ {“Win”, “Tie”, “Loss”}},

th where each ai reflects the result of the i game. One interesting event might be the event in which ai = “Win” for all i, corresponding to an undefeated season. Another interesting event is the set

{(a1, . . . , a12) | ∃j ∈ [12] such that ai = “Loss” ∀i ≤ j and ai = “Win” ∀i > j}. This event corresponds to all possible seasons in which the Game- cocks lose their first j games (here, j is nonzero), but rally to win the remaining games. To assign probabilities to each element of the sample space, we define a function pi for each ai. This can be accomplished by considering the history of the Gamecocks versus the opposing team in game i and setting

Games won against team i p (“Win”) = i Total games against team i Games tied against team i p (“Tie”) = i Total games against team i Games lost against team i p (“Loss”) = . i Total games against team i

Now, for each elementary event ω = (a1, . . . , a12), set Y P (ω) = pi(ai). i∈[12]

1 Finally, for any subset A of Ω, define X P (A) = P (ω). ω∈A

(b) Observe the change (in percent) during a trading day of the Dow Jones Industrial Average. Letting X denote the corresponding to this change, we are observing Value at Closing − Value at Opening X = 100 . Value at Opening

Solution: Strictly speaking, X may take on any real value. In the interest of cutting down the sample space somewhat, we may round X to the nearest integer. Thus, Ω = Z. One interesting event is X = 0, corresponding to no net change for the day. Another interesting event is X = 100, corresponding to a doubling in value for the day. An elementary event corresponds to specifying a single value m for X. A very rough way to define this probability to examine the (rounded) percent change data for all days that the DJIA has been monitored and set Occurrences of m P (m) = . Number of days in data set For an arbitrary subset of Z, we extend linearly, as before. (c) The DJIA is actually monitored continuously over a trading day. The experiment is to observe the trajectory of values of the DJIA during a trading day. Solution: Suppose we sample the data every second and compile it into a piecewise linear function f. The trajectory at time t (in seconds after the opening bell) is given by g(t) = f(t) − f(t − 1), where we take g(0) = 0. As before, g may take on any real value. We may combat this by partitioning the real line into intervals of the form [x, x + ) for some fixed  > 0. Our elementary events, therefore, are ordered pairs (t, [x, x + )), corresponding to the trajectory at time t falling into the interval [x, x+). The sample space Ω is the collection of all such elementary events. One interesting (and highly suspicious) event might be {(t, [0, )) | any t}, corresponding to a day in which the DJIA saw nearly no change throughout the day. Anoter interesting event might be [ {(t, I) | I = [x, x + ), any t}, x>0 corresponding to the event where the DJIA saw positive trajectory throughout the day.

2 The probabilities might be assigned as in part b, where we now fur- ther divide the data to reflect the value of t. That is, we do not want the probability of seeing a given trajectory, but the probability of seeing a given trajectory at a given time. (d) Let G be the grid of points {(x, y) | x, y ∈ {−1, 0, 1}}. Consider the experiment where a particle starts at the point (0, 0) and at each timestep the particle either moves (with equal probability) to a point (in G) that is “available” to its right, left, up, or down. The experiment ceases the moment the particle reaches any of the four points (−1, −1), (−1, 1), (1, −1), (1, 1). Solution: One natural probability to assess is the probability that the experiment ceases after n steps. We note, however, that it is pos- sible (though infinitely-unlikely) that the experiement never ceases. Thus, we take the sample space to be Ω = Z+ ∪ ∞. One interesting event is that the experiment ceases after exactly 2 steps (the minimum steps required to reach a termination state). Another intesting event is that the experiment takes at least 100 (or any constant number) steps before ceasing. This problem suggests that an exact solution may be found using Markov chains. Barring that, we might run a computer simulation to gather data. From this data, we can set Number of occurrences of m P (m) = , Total number of trials and then extend linearly to more general events. (e) The experiment is to randomly generate a point on the surface of the unit sphere. Solution: Given the abstract nature of the problem, we decline to impose any artificial discretization as was done in previous problems. Now, any point in R3 may be specified by a spherical coordinate (r, θ, φ), where r denotes radial distance, θ inclination, and φ az- imuth. Since we are restricted to the unit sphere, we may discard r and consider ordered pairs (θ, φ). Thus,

Ω = {(0, 0)} ∪ {(π, 0)} ∪ {(θ, φ) | θ ∈ (0, π), φ ∈ [0, 2π)}

(the restrictions on θ and φ are to ensure a unique representation of each point). One interesting event might be {(0, 0)} ∪ {(π, 0)}, corresponding to the random point lying at either the north or south pole of the sphere. π Another interesting event might be {( 2 , φ) | φ ∈ [0, 2π)}, correspond- ing to the random point lying somewhere along the equator. As a point in the plane has zero, we cannot assign probabil- ities to elementary events and extend. Instead, given a subset A of Ω, we must set P (A) to be the measure of A as a subset of R2.

3 2. (Secretary Problem) You have in your possession N balls, each labelled with a distinct symbol. In front of you are N urns that are also labelled with the same symbols as the balls. Your experiment is to place the balls at random into these boxes with each box getting a single ball.

(a) Write down the sample space Ω of this experiment. How many ele- ments are there in Ω? Solution: For simplicity, let the symbols be the first N integers. Thus, Ω = {(a1, . . . , aN ) | ai ∈ [N] ∀i},

where ai = j ∈ [N] means that the bucket labelled i received the ball labelled j. Observe that Ω is simply the collection of all permutations of the N distinct objects, so |Ω| = N!. (b) What probabilities will you assign to the elementary events in |Ω|? 1 Solution: Each elementary event is equally-likely, so P (ω) = N! for all ω ∈ Ω. (c) Define a match to have occurred in a given box if the ball placed in this box has the same label as the box. Let AN be the event that there is at least one match. What is the probability of AN ? Solution: For each i ∈ [N], let Bi denote the set of arrangments S having a match in bucket i. Thus, i∈[N] Bi is the collection of all arrangements having at least one match. By the inclusion-exclusion principle, we have [ X X Bi = |Bi| − |Bi ∩ Bj| + ··· i∈[N] i∈[N] i,j∈[N] i6=j N N = |B | − |B ∩ B | + ··· (since |B | = |B | for all i, j) 1 1 2 1 2 i j N N = (N − 1)! − (N − 2)! + ··· 1 2 X N! = (−1)i−1 . i! i∈[N]

Therefore,

1 X N! P (A ) = (−1)i−1 N N! i! i∈[N] X 1 = (−1)i−1 . i! i∈[N]

(d) When you let N → ∞, does the sequence of probabilities P (AN ) converge?

4 Solution: It is well known that X 1 1 (−1)i−1 = . i! e i∈N

(e) Is the answer in (d) surprising to you in the sense that it did not coincide with your initial expectation of what the probability of at least one match is when N is large? Provide some discussion. Solution: I recall that, when first encountering this problem, I was unable to form a conjecture either way. On the one hand, as N grows, the chance of placing a given ball in the right bucket is approaching 0. On the other hand, the number of chances to get a match (i.e. the number of balls and buckets involved) is growing without bound. Whenever an infinite number of terms are involved, strange things may happen. Regardless, I suspected to find the probability to be 0 or 1. That is converges to something in between is quite astonishing. That it involves e is a nice feature, though not terribly surprising considering the importance of factorials in the problem.

3. A box contains N identically-sized balls with K of them colored red and N − K colored blue. Consider the following two random experiments. Experiment 1: Draw n balls in succession without replacement, taking into account the order in which the balls are drawn. Experiment 2: Draw n balls in succession without replacement, disre- garding the order in which the balls are drawn.

If you let Ak be the event that there are exactly k red balls in the sample, do you get the same probability with Experiment 1 and Experiment 2? Justify your answer. Solution: The probability is the same in both experiments. To see this, suppose there are ` distinct ways to draw a total of k red balls in which order matters (as in Experiment 1). Associated with each such event is th a probability pi of witnessing the i ordering. Since these events are P elementary (and so disjoint), we have P (Ak) = i∈[`] pi in Experiment 1. In Experiment 2, an elementary event is drawing exactly k red balls in any order. This event may be viewed, however, as the collection of the ` P equivalent orderings, and so we still compute P (Ak) = i∈[`] pi. 4. Prove the following basic results from . Here, A, B, C, . . . are subsets of some sample space Ω.

(a) A ∪ (B ∪ C) = (A ∪ B) ∪ C

5 Solution: x ∈ A ∪ (B ∪ C) ⇔ x ∈ A or x ∈ B ∪ C ⇔ x ∈ A or x ∈ B or x ∈ C ⇔ x ∈ A ∪ B or x ∈ C ⇔ x ∈ (A ∪ B) ∪ C

(b) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) Solution: Observe that A∪(B∩C) ⊂ A∪B and A∪(B∩C) ⊂ A∪C. Hence, A ∪ (B ∩ C) ⊂ (A ∪ B) ∩ (A ∪ C). Next, let x ∈ (A ∪ B) ∩ (A ∪ C). Thus, x ∈ A ∪ B and x ∈ A ∪ C. If x∈ / A, then x ∈ B and x ∈ C. That is, x ∈ B ∩ C. Hence, x ∈ A ∪ (B ∩ C).

(c) (DeMorgan’s Laws) Let {Aα | α ∈ A} for some index set A where each Aα is a subset of Ω. Prove that !c [ \ c Aα = Aα. α∈A α∈A Solution:

!c [ [ x ∈ Aα ⇔ x∈ / Aα α∈A α∈A

⇔ x∈ / Aα for all α ∈ A c ⇔ x ∈ Aα for all α ∈ A \ c ⇔ x ∈ Aα α∈A

(d) Let A1,A2,... be a sequence of subsets of Ω. Define the sequence B1,B2,... according to

B1 = A1 c B2 = A1 ∩ A2 . . c c c Bn = A1 ∩ A2 ∩ · · · ∩ An−1 ∩ An . .

Prove that B1,B2,... is a pairwise disjoint sequence and that, for each n, [ [ Aj = Bj j∈[n] j∈[n]

6 so that, in particular, [ [ Aj = Bj. j∈N j∈N Solution: To see that the sequence is pairwise disjoint, let i, j ∈ N with i 6= j. Without loss of generality, let i < j. It follows immediately that c Bi ∩ Bj ⊂ Ai ∩ Ai = ∅. For the second claim, observe first that Bj ⊂ Aj for all j ∈ [n], so S S S j∈[n] Aj ⊃ j∈[n] Bj. For the reverse inclusion, let x ∈ j∈[n] Aj. This implies that, for some subset S of [n], x ∈ Ai for all i ∈ S.

Let i0 be the least element of S. Thus, x ∈ Ai0 and x∈ / Aj for all 1 ≤ j < i0. In other words,   \ c x ∈  Aj ∩ Ai0 j

= Bi0 [ ⊂ Bj. j∈[n]

Since N is well-ordered under the usual order, we conclude that [ [ Aj = Bj. j∈N j∈N

5. Let Ω = [−1, 1] and define, for each n ∈ N, the subset of Ω given by 1 1 An = [−1 + 2n , 1 − n ]. Obtain \ [ lim sup An = Ak n∈N n≥k and [ \ lim inf An = Ak. n∈N n≥k Do the two sets coincide? 1 Solution: Observe first that (−1 + 2n ) is a bounded, monotone sequence 1 of real numbers, and so converges (in particular to -1). Similarly, (1 − n ) converges to 1. Hence, lim sup An = lim inf An = lim An = [−1, 1]. Let us now evaluate each righthand side explicitly. Let x ∈ [−1, 1] and consider, for any  > 0, define the basic open neighborhood U = (x−, x+ 1 ). Choose N so that <  for all n ≥ N. Thus, U ∩An 6= ∅ for all n ≥ N. S n T S As AN ⊂ An, for any k, it follows that B ∩ Ak 6= ∅. T nS≥k n∈NT n≥k Hence, Ak = [−1, 1]. Similarly, as AN ⊂ An, B ∩ S T n∈N n≥k S T n≥N Ak 6= ∅. Hence, Ak = [−1, 1]. n∈N n≥k n∈N n≥k

7 1. Let Ω = (0, 1] and consider the class of subsets of Ω given by  n F0 = A = ∪j=1Ij : Ij = (aj, bj] ⊂ Ω, n ∈ {0, 1, 2,...} . c That Ω ∈ F0 and if A ∈ F0 then A ∈ F0 are immediate. Show formally that F0 is closed under finite unions, that is, if A1,A2,...,An are in F0, n then ∪i=1Ai ∈ F0. [These set of properties establishes that F0 is a field.] Sni Solution: Let A1,...,An ∈ F0. Now, each Ai can be written as j=1(aij , bij ]. Thus, we have n n n [ [ [i Ai = (aij , bij ] i=1 i=1 j=1 [ = (ak, bk], k∈I

Sni where I is the collection of all indices ij appearing in j=1(aij , bij ] as i ranges from 1 to n. As n is finite and ni is finite for all i, it follows that Sn |I| is finite, and so i=1 Ai ∈ F0.

2. For the Ω and F0 in Problem 1, define the set function P : F0 → < via: n for A = ∪i=1(aj, bj] where (aj, bj] ∩ (ai, bi] = ∅ for i 6= j, we have n X P (A) = (bi − ai). i=1 Establish that P is indeed a function by showing that for two different representations of A, you obtain the same value for P (A) according to the preceding definition. Sn Solution: Let A ∈ F0 be given with representations i=1(ai, bi] and Sm i=1(ci, di]. Choose any maximally connected interval (x, y] of A. Thus, after appropriate reordering of the indices, (x, y] is of the form

(x, y] = (a1, a2] ∪ (a2, a3] ∪ · · · ∪ (an−1, an]

= (x, a2] ∪ (a2, a3] ∪ · · · ∪ (an−1, y]. Using the other representation of A, we have also that

(x, y] = (c1, c2] ∪ (c2, c3] ∪ · · · ∪ (cn−1, cm]

= (x, c2] ∪ (c2, c3] ∪ · · · ∪ (cn−1, y]. Now, using the first representation of A, n X P (A) = (bi − ai) i=1

= (a2 − a1) + (a3 − a2) + ··· + (an − an−1)

= an − a1 = y − x.

8 Using the second representation of A,

n X P (A) = (di − ci) i=1

= (c2 − c1) + (c3 − c2) + ··· + (cm − cm−1)

= cm − c1 = y − x.

Now, since the maximally connected intervals in A are disjoint, P (A) is just the sum of its value on these intervals. As we have demonstrated that P agrees on all maximally connected intervals of A under both represen- tations, it follows that P agrees on all of A. 3. Let Ω be an uncountable sample space. A subset A ⊂ Ω is said to be co-countable if Ac is countable. Show that the class of subsets

C = {A ⊂ Ω: A is countable or co-countable}

is a σ-field of subsets in Ω. Solution: As Ωc = ∅, we see that Ω ∈ C. If A ∈ C, then A is either countable or co-countable. Hence, Ac is either co-countable or countable, respectively. Thus, Ac ∈ C.

Let {Ai} be a countable collection of elements of C. If each Ai is countable, S then {Ai} is countable. If at least one of the elements, say A1, is co- countable, then we have

c [  \ c {Ai} = {Ai } c ⊆ A1, S which is countable. That is, {Ai} is co-countable. 4. Let Ω be an uncountable set, and let C = {{ω} : ω ∈ Ω} be the class of subsets of Ω. Show that the σ-field generated by C is the σ-field consisting of countable and co-countable sets. Solution: Let C denote the σ-field generated by C, and let D denote the σ-field consisting of all countable and co-countable subsets of Ω. Since any singleton set is countable, it is clear that C ⊆ D. As C is the intersection of all σ-fields containing C, it follows that C ⊆ D.

Let now C0 be any σ-field containing C. As C0 contains every singleton subset of Ω and is closed under countable union, it follows that C0 contains every countable subset of Ω. Now, since C0 contains every countable subset of Ω and is closed under complementation, it follows that C0 contains every co-countable subset of Ω. Hence, D ⊂ C0. As C0 was chosen arbitrarily, we see that D is contained in any σ-field containing C, and so D ⊆ C.

9 5. Suppose C is a non-empty class of subsets of Ω. Let A(C) be the minimal field over C (i.e., the field generated by C). Show that A(C) consists of sets of the form m ni ∪i=1 ∩j=1 Aij

c ni where Aij ∈ C or Aij ∈ C, and where the m sets ∩j=1Aij, i = 1, 2, . . . , m are disjoint. Solution: In what follows, let

 m   \ c  F = Aj | Aj ∈ C or Aj ∈ C j=1 

and ( n ) G D = Fi | Fi ∈ F, {Fi}i∈[n] pairwise disjoint . i=1 With this notation, we must show that A(C) = D. To see that D ⊆ A(C), notice that A(C) is a field containing C, and so is closed under complementation, finite unions, and finite intersections of elements of C. In particular, A(C) must contain any element of the form specified by D. We show next that A(C) ⊆ D. To accomplish this, note first that C ⊆ D. Thus, if we can show that D is itself a field, we will have the desired inclusion, as A(C) is the smallest field containing C. Before proceeding, we establish a useful lemma. Lemma 0.1. If F ∈ F, then F c ∈ D.

T c Proof. Let F ∈ F be of the form F = j∈[m] Aj, where Aj or Aj belongs to C for each j and the collection of all Aj is pairwise disjoint. It follows that

c [ c F = Aj j∈[m] c c c = A1 ∪ (A2 ∩ A1) ∪ · · · ∪ (Am ∩ A1 ∩ · · · ∩ Am−1).

c Consider a typical term Ak ∩ A1 ∩ · · · ∩ Ak−1 in the union. This set is c evidently an element of F, as Ai or Ai belongs to C for all i (similarly for c Ak). Moreover, the collection of all terms in the union is pairwise disjoint via disjointification. Thus, F c ∈ D.

We now return to the task of showing that D is a field. Let A ∈ C. It follows that F contains A ∩ Ac = ∅. Thus, by the lemma, ∅c = Ω belongs to D.

10 Next, we show that D is closed under finite intersections. To that end, let D ,D ∈ D with D = F F and D = F F 0. Now, 1 2 1 i∈[n1] i 2 j∈[n2] j     G G 0 D1 ∩ D2 =  Fi ∩  Fj

i∈[n1] j∈[n2] [ 0 [ 0 = (F1 ∩ Fj) ∪ · · · ∪ (Fn1 ∩ Fj).

j∈[n2] j∈[n2]

0 Observe that this last line is a union of elements of the form Fi ∩Fj. As F is closed under finite intersection, each term of the union is a member of F. Moreover, the collection of all these terms is pairwise disjoint. To see this, consider some distinct F ∩ F 0 and F ∩ F 0 in the union. Without i1 j1 i2 j2

loss of generality, let Fi1 6= Fi2 . By definition of D1, it must be that Fi1 and F are disjoint, and thus F ∩F 0 and F ∩F 0 are disjoint. All told, i2 i1 j1 i2 j2 we have that D1 ∩ D2 ∈ D. Proceeding by induction, we have that D is closed under finite intersections. Finally, we show that D is closed under complementation. To that end, F c T c pick any D ∈ D with D = i∈[n] Fi. It follows that D = i∈[n] Fi . c By the lemma, each Fi is an element of D. As D is closed under finite intersections, Dc ∈ D. Taken together, we have verified that D is indeed a field containing C, and so A(C) ⊆ D. Therefore, A(C) = D, as desired. 6. Let C be a class of subsets of Ω. It is said to be a monotone class if for ∞ A1 ⊂ A2 ⊂ ... in C, then ∪n=1An = lim An ∈ C and for A1 ⊃ A2 ⊃ ... in ∞ C, then ∩n=1An = lim An ∈ C. Prove that if C is a field and a monotone class, then it is a σ-field. Solution: Since C is a field, we have Ω ∈ C and closure under comple- mentation.

Let now {Ai} be a countable collection of elements of C. Define, for all Sk k, Bk = Ai. Thus, {Bi} is a monotone sequence of subsets of C, S i=1 S S and so {Bi} ∈ C. It is clear, however, that {Ai} = {Bi}, and so we S conclude that {Ai} ∈ C, thus verifying that C is a σ-field. 7. Let Ω = < and consider the two classes of subsets given by

C1 = {(a, b): a, b are rationals in R};

C2 = {[a, b]: a, b are in R}. Establish that these two classes of sets generate the same σ-field. (Their common generated σ-field is the Borel σ-field in <.)

Solution: Let C1 denote the σ-field generated by C1 and C2 the σ-field generated by C2. Thus, we have that \ C1 = {D | D is a σ-field and C1 ⊂ D}

11 and \ C2 = {D | D is a σ-field and C2 ⊂ D}.

We proceed by showing that any σ-field containing C1 contains C2 and vice versa, thus establishing that C1 = C2.

Let D1 be any σ-field containing C1. We need to show that, for any a, b ∈ R,[a, b] ∈ D1. To that end, choose two sequences {an} and {bn} of rational numbers with an % a and bn & b. Now, (an, bn) ∈ D1 for all n, T and so D1 contains (an, bn) = [a, b]. n∈N Let D2 be any σ-field containing C2. We need to show that, for any a, b ∈ Q,(a, b) ∈ D2. To that end, choose two sequences {an} and {bn} of real numbers with an & a and bn % b. Now, [an, bn] ∈ D2 for all n, and S so D2 contains [an, bn] = (a, b). n∈N 1. Let S be a semi-algebra of subsets of Ω. Denote by A(S) the algebra or field generated by S and by σ(S) the σ-algebra generated by S. Prove that σ(A(S)) = σ(S).

Solution: Since S ⊂ A(S), we have immediately that σ(S) ⊂ σ(A(S)). To demonstrate the reverse inclusion, it suffices to show that σ(S) is itself a σ-algebra containing A(S). To that end, recall that every element of A(S) is of the form m G Si i=1 where S ∈ S. Now, as σ(S) is countable union, such elements belong to σ(S). Thus, σ(S) is a σ-algebra containing A(S), and so σ(A(S)) ⊂ σ(S). 2. Let Ω = C[0, 1], the space of continuous functions on [0, 1]. For t ∈ [0, 1] and a, b ∈ <, define the subset of Ω given by

A(t; a, b) = {f ∈ Ω: f(t) ∈ (a, b]}.

Gather these subsets into a collection C0, that is,

C0 = {A(t; a, b): t ∈ [0, 1], a ∈ <, b ∈ <}.

(a) Demonstrate that C0 is not a semi-algebra. Solution: Let A(t; a, b) ∈ C0. Now, A(t; a, b)c = {f ∈ Ω | f(t) ∈/ (a, b]} = {f ∈ Ω | f(t) ∈ (−∞, a] ∪ (b, ∞)}.

Evidently, A(t; a, b)c cannot be represented by a finite union of ele- ments of C0, and so C0 is not a semi-algebra. Additionally, Ω ∈/ C0, as there is no finite interval (a, b] such that all continuous functions satisfy, for example, f(0) ∈ (a, b].

12 (b) Describe the structure of the typical element of S0 ≡ S(C0), the semi-algebra generated by C0. Solution: We claim that a typical element of S0 has the form Tn c i=1 Ai where, for each i, Ai = A(ti; ai, bi) or Ai = A(ti; ai, bi) = A(ti; −∞, ai) ∪ A(ti; bi, ∞) with ti ∈ [0, 1], ai, bi ∈ R, where we un- derstand that intervals of the form (a, ∞] should be interpreted as (a, ∞).

By the observations in part a, it suffices to show that S0 as it is defined above is a σ-algebra (the new elements we’ve included are required at a minimum to patch up the deficiencies of C0). Now, it is clear that ∅ ∈ S0 (take the intersection of any disjoint balls) c and Ω ∈ S0 (Ω can be represented, for example, by A(0; 0, 0) ). By construction, S0 is closed under complementation. It is also closed under finite intersection, since intervals of the form (a, b] with a, b ∈ R ∪ {−∞, ∞} are closed under finite intersection. Therefore, S0 is a semi-algebra, and so must be the smallest semi-algebra containing C0.

(c) Describe the structure of the typical element of A(C0), the algebra generated by C0. Solution: By a previous homework, we know that the elements of A(C0) are of the form m nj G \ Aij, i=1 j=1

c Tnj where, for all i and j, Aij ∈ C0 or Aij ∈ C0 and the m sets j=1 Aij are pairwise disjoint. In this particular example, we know that Aij = c c A(t; a, b) and that Aij = A(t; a, b) = {f ∈ Ω | f(t) ∈ (−∞, a] ∪ (b, ∞)} for some t ∈ [0, 1] and a, b ∈ R. Adopting the conventions as in part b, we can represent any element of A(C0) as

m nj G \ Ai. i=1 j=1

(d) Denoting by B0 = σ(C0), the σ-field generated by C0, determine if the subset of Ω given by

E = {f ∈ Ω : sup |f(t)| ≤ B} t∈[0,1]

is an element of B0. [By the way, B0 is called the σ-field generated by the cylinder sets.]

Solution: Let the sequence (rn) be an enumeration of Q ∩ [0, 1] and define ∞ \ F = A(rn, −B,B). n=1

13 Since B0 is closed under countable intersection, we have that F ∈ B0. We claim that E = F .

Evidently, E ⊆ F , as any function f satisfying supt∈[0,1] |f(t)| ≤ B satisfies |f(rn)| ≤ B for all n. Suppose now, for the purpose of contradiction, that F 6⊆ E. That is, there exists continuous f and t0 ∈ R \ Q such that |f(t0)| > B. Pick any 0 <  < B − |f(t0)|. By the continuity of f, there is δ > 0 such that, whenever |x − y| < δ, |f(x) − f(y)| < . Now, as Q is dense in R, we can find a rational number r such that |t0 − r| < δ, yet

|f(t0) − f(r)| > ||f(t0)| − |f(r)||

≥ ||f(t0)| − B| > ,

which is contrary to the continuity of f. Hence, F ⊆ E, as desired.

3. Same Ω as in Problem 2. Define the metric (distance) function d :Ω×Ω → < via d(f, g) = sup |f(t) − g(t)|. t∈[0,1] For  > 0 and f ∈ Ω, define the subset B(f; ) of Ω according to

B(f; ) = {g ∈ Ω: d(f, g) < }.

These are the open balls in Ω. Gather these open balls in the collection S1, that is, S1 = {B(f; ): f ∈ Ω,  > 0}.

(a) Determine if S1 is a semi-algebra in Ω; and if it is not, find the semi-algebra generated by S1.

Solution: S1 does not contain Ω, and so is not a semi-algebra. To see this, observe that, for any f ∈ Ω and finite  > 0, there is a continuous function on [0, 1] not contained in B(f; ) (for example, the constant function M + 2 with M = supt∈[0,1] |f(t)|). For the same reason,

B(f; )c = {g ∈ Ω | d(f, g) ≥ }

cannot be represented as a finite union of B(fi, i) (for example, the constant function M + 2 with M = maxi∈[n]{supt∈[0,1] |fi(t)|} will not be contained in this union). Tn As before, we represent elements of S(S1) by i=1 Si where, for all c i, Si ∈ S1 or Si ∈ S1.

(b) Determine the algebra generated by S1, that is, A(S1).

14 Solution: In a previous homework, we have established this result in more generality. In this particular case, we have

m nj G \ A(S1) = Sij i=1 j=1

Tnj with Sij ∈ S1 or Sij ∈ S2 for all i, j and the m sets j=1 Sij are pairwise disjoint.

(c) Denote by B1 = σ(S1), the σ-field generated by S1, so that by defini- tion this is the Borel σ-field associated with the metric d. Determine if the subset of Ω defined in item (d) in Problem 2 is an element of this Borel σ-field. Solution: The set ( ) E = f ∈ Ω : sup |f(t)| ≤ B t∈[0,1]

can be represented in B1 as

∞ \  1  B 0; B + . n n=1

1  T∞ 1  Evidently, E ⊂ B 0; B + n for each n, and so E ⊂ n=1 B 0; B + n . T∞ 1  Let now g ∈ n=1 B 0; B + n . This implies that supt∈[0,1] |g(t)| < 1 B + n for all n. Hence, supt∈[0,1] |g(t)| ≤ B, and so g ∈ E. Therefore, T∞ 1  E = n=1 B 0; B + n .

4. Investigate the relationship between B0 in Problem 2 and B1 in Problem 3. Are these σ-fields identical; or is one strictly containing the other, and if so, which one is the larger σ-field?

Solution: We claim that B0 = B1. To prove this, we demonstrate that any basic open ball of B0 can be represented as a countable union of basic open balls of B1, and vice versa.

Pick some A(t; a, b) ∈ B0. It is (relatively) well-known that the collection of continuous piecewise linear functions on [0, 1] is countable and dense in the collection of all continuous functions on [0, 1]. Denote by F the subset of continuous piecewise linear functions on [0, 1] contained in A(t; a, b). Now, for each f ∈ F, define f = min{f(t) − a, b − f(t)}. We claim that [ A(t; a, b) = {B(f, f ) | f ∈ F} .

By our choice of f , each B(f, f ) ⊆ A(t; a, b). Conversely, if g ∈ A(t; a, b), then the density of F in A(t; a, b) guarantees that there is f ∈ F such that d(f, g) < f , and so g ∈ B(f, f ). Hence, B0 ⊆ B1.

15 For the reverse inclusion, pick some B(f, ) ∈ B1 and let (rn) be an enu- meration of Q ∩ [0, 1]. We claim that

∞ \     B(f, ) = A r ; f(r ) − , f(r ) + . n n 2 n 2 n=1

   By construction, A rn; f(rn) − 2 , f(rn) + 2 ⊂ B(f, ) for each n. Con- versely, if f ∈ B(f, ), then the density of the rationals in [0, 1] guarantees    that f ∈ A rn; f(rn) − 2 , f(rn) + 2 for each n. Hence, B1 ⊆ B0.

5. As you might have already noticed, the typical element of S0 in Problem 2 is of form ∩n A∗ where the t s are distinct and A∗ is either of form i=1 ti i t c A(t; a, b) or A(t; a, b) . Define the (set)-function P : S0 → < according to

n " # Y Z ∩n A∗  = φ(v)dv P i=1 ti A∗ i=1 ti where φ(z) = (2π)−1/2 exp{−v2/2} is the standard normal density func- tion. Assuming that P is σ-additive on S0, provide a reason [do not prove anything, just a reason!] why you could conclude that there is a unique

W on B0 which extends P , that is, W|S0 = P. [This probability measure W is infact the Wiener measure on (C[0, 1], B0)]. Solution: The desired result is a direct application of the first and second extension theorems. Since P is σ-additive on the semi-algebra S0, we can 0 extend it uniquely to a probability measure P on A(S0). Applying the second extension theorem to P0, we can obtain the desired probability measure W on σ(S0) = B0.

Problem 1

Let Ω be some non-empty set, and let A and B be two subsets of Ω which are not disjoint and with Ω 6= A ∪ B. Define the semi-algebra

S = {∅, Ω, A, B, AB, AcB, ABc,AcBc}.

Define the function P : S → < according to the following specification:

P (∅) = 0; P (Ω) = 1; P (A) = .2; P (B) = .6; P (AB) = .12; P (AcB) = .48; P (ABc) = .08; P (AcBc) = .32

Find the σ-field, A, generated by S, i.e., enumerate all the elements of this σ-field. [Note: This should coincide with the algebra or field generated by S.] Proof. Since S is finite, the σ-field generated by S coincides with the field generated by S. Thus, it suffices to consider A(S). Now, we know that A(S) is the collection of all sums of finite families of mutually disjoint subsets of Ω in S (Resnick, Lemma 2.4.1). A first calculation gives the following as the elements of A.

16 ∅ A ∪ AcBA ∪ AcB ∪ AcBc AB ∪ AcB ∪ ABc ∪ AcBc Ω A ∪ AcBc B ∪ ABc ∪ AcBc AB ∪ AcB AB ∪ AcB ∪ ABc BB ∪ AcBc AB ∪ ABc ∪ AcBc AB AB ∪ AcBAcB ∪ ABc ∪ AcBc AcB AB ∪ ABc ABc AB ∪ AcBc AcBc AcB ∪ ABc AcB ∪ AcBc ABc ∪ AcBc Many of these elements, however, are redundant. For example, A ∪ AcB ∪ AcBc = A ∪ Ac(B ∪ Bc) = A ∪ AcΩ = A ∪ Ac = Ω. Removing redundant elements gives the following representation of A. ∅ A ∪ AcB Ω A ∪ AcBc AB ∪ AcBc B AB ∪ AcBc AB AcB ∪ ABc AcBAcB ∪ AcBc ABc ABc ∪ AcBc AcBc

Find the (unique) extension of P to A. You must enumerate the values of this extension for each possible element of A. Proof. Since S is a semi-algebra, the unique extension P 0 of P to A is defined by ! 0 X X P Si = P (Si), i∈I i∈I (Resnick, Theorem 2.4.1). By direct calculation, we define P 0 on each element of A. P (∅) = 0 P (A ∪ AcB) = .68 P (Ω) = 1 P (A ∪ AcBc) = .52 P (A) = .2 P (B ∪ AcBc) = .92 P (B) = .6 P (AB ∪ AcBc) = .44 P (AB) = .12 P (AcB ∪ ABc) = .56 P (AcB) = .48 P (AcB ∪ AcBc) = .8 P (ABc) = .08 P (ABc ∪ AcBc) = .4 P (AcBc) = .32

17 Problem 2

Show that a σ-field cannot be countably infinite, i.e., either it has a finite car- dinality or it is at least as large as R. Proof. Let σ be an infinite σ-field of subsets of Ω and suppose we can find a sequence {An | n ∈ N} of pairwise disjoint subsets of σ. Consider the function f from the collection of infinite binary strings into σ defined by [ f(s) = Ai,

i∈Is where Is is the set of indices on which s is 1. Now, given two binary strings s0 and s1, [ [ f(s0) = f(s1) ⇒ Ai = Ai

i∈Is0 i∈Is1

⇒ Is0 = Is1 (since the Ai are pairwise disjoint)

⇒ s0 = s1.

Hence, f is injective, and so the cardinality of σ is at least ℵ1. It remains to show that we can indeed produce a sequence of pairwise disjoint subsets of σ. To that end, let A ∈ σ. Since σ is closed under complementation, Ac ∈ σ. Now, A ∪ Ac = Ω, which is infinite (since σ is infinite). Hence, one of A c c or A is infinite. Without loss of generality, let A be infinite and set A1 = A . Consider next the collection of subsets

{A ∩ B | B ∈ σ}.

Observe that each A ∩ B is an element of σ, since σ is closed under countable intersection. Now, [ {A ∩ B | B ∈ σ} = A,

c which is infinite, so some element C of {A ∩ B | B ∈ σ} is infinite. Set A2 = C and consider next {C ∩ B | B ∈ σ}.

Proceeding in this way, we generate a sequence {An} of elements of σ. The Ai c are disjoint since, by construction, Ak ⊂ Aj for all k > j. Using this sequence in the above argument yields the desired result.

18 Problem 3

Let B be a σ-field of subsets of Ω and let A ⊂ Ω which is not in B. Show that the smallest σ-field generated by {B,A} consists of sets of form

c AB1 ∪ A B2,B1,B2 ∈ B.

Proof. Denote by σ the σ-field generated by {B,A} and by C the collection c {AB1 ∪ A B2 | B1,B2 ∈ B}. We have immediately that C ⊆ σ, since σ is closed under complementation, countable intersection, and countable union, by definition. It remains to show that C ⊆ σ. To accomplish this, we need only show that C is itself a σ-field and appeal to the minimality of σ. Evidently, Ω ∈ C, since Ω ∈ B. To establish closure under complementation, choose any C ∈ C, so that c C = AB1 ∪ A B2 for B1,B2 ∈ B. It follows that

c c c C = (AB1 ∪ A B2) c c c = (AB1) ∩ (A B2) c c c = (A ∪ B1) ∩ (A ∪ B2) c c c c c c = A A ∪ A B2 ∪ B1A ∪ B1B2 c c c c c = A B2 ∪ B1A ∪ B1B2 c c c c c c c c = A B2 ∪ B1A ∪ (AB1B2 ∪ A B1B2) c c c c c c c = A(B1 ∪ B1B2) ∪ A (B2 ∪ B1B2) c c c = AB1 ∪ A B2.

c c c Since B is closed under complementation, B1 and B2 belong to B, and so C belongs to C. To establish closure under countable unions, choose Ci ∈ C for every natural c 0 0 number, so that each Ci is of the form ABi ∪ A Bi for Bi,Bi ∈ B. Now,

[ [ c 0 Ci = (ABi ∪ A Bi) i∈N i∈N [ [ c 0 = ABi ∪ A Bi) i∈N i∈N ! ! [ c [ 0 = A Bi ∪ A Bi . i∈N i∈N

S Since B is closed under countable unions, we have that both i∈ Bi and S 0 S N B belong to B, and so Ci belongs to C. i∈N i i∈N Hence, C is a σ-field, and so contains σ. Therefore, σ and C coincide, as desired.

19 Problem 6

If S1 and S2 are two semialgebras of subsets of Ω, show that

S1S2 = {A1A2 : A1 ∈ S1,A2 ∈ S2} is again a semialgebra of subsets of Ω.

Proof. For ease of notation, let S denote S1S2. We show directly that S is a semialgebra. Since both S1 and S2 are semialgebras, Ω ∈ S1 and Ω ∈ S2. Thus, Ω = ΩΩ ∈ S. To see that S is closed under finite intersection, pick S, S0 ∈ S, so that S = 0 0 0 0 0 S1S2 and S = S1S2 with S1,S1 ∈ S1 and S2,S2 ∈ S2. It follows immediately that

0 0 0 SS = (S1S2)(S1S2) 0 0 = (S1S1)(S2S2).

0 0 Since S1 and S2 are both semialgebras, S1S1 ∈ S1 and S2S2 ∈ S2. Thus, SS0 ∈ S. By a simple induction argument, we have that S is closed under finite intersection. It remains to show that, given S ∈ S, Sc can be written as the union of a finite collection of pairwise disjoint elements of S. To that end, let S = S1S2 with S1 ∈ S1 and S2 ∈ S2. It follows that,

c c S = (S1S2) c c = S1 ∪ S2 m n X X = Ai ∪ Bi, i=1 i=1 where the Ai are pairwise disjoint elements of S1 and the Bi are pairwise disjoint Pm elements of S2. Denote i=1 Ai by A. Observe that,

m n m n X X X X c Ai ∪ Bi = Ai ∪ BiA . i=1 i=1 i=1 i=1

Now, the Ai are pairwise disjoint because S1 is a semialgebra. Similarly, the Bi c are pairwise disjoint because S2 is a semialgebra, and so the BiA are pairwise c disjoint. Finally, any Ai is disjoint from any BjA , since Ai is disjoint from c c A . Thus, {Ai | i ∈ [m]} ∪ {BiA | i ∈ [n]} is a pairwise disjoint collection of c elements from S1 and S2. We proceed by showing that each Ai and BiA can be represented by a union of disjoint elements of S.

20 Evidently, every Ai belongs to S1, since each can be represented as AiΩ. Now, for any k ∈ [n], we have

m !c c X BkA = Bk Ai i=1 m Y c = Bk Ai i=1 m m Y Xi = Bk Cij, i=1 j=1 where, for each i ∈ [m], {Cij | j ∈ [mi]} is a pairwise disjoint collection of Qm elements of S1. Let now M denote the collection of m-tuples i=1[mi]. We may rewrite the above as

m m m Y Xi X Y Bk Cij = Bk Cix(i). i=1 j=1 x∈M i=1

Qm Now, since S1 is a semialgebra, we have that i=1 Cix(i) ∈ S1 for each x ∈ M. Moreover, the collection of all such elements is pairwise disjoint. That is, given x, y ∈ M with x 6= y, m m Y Y Cix(i) ∩ Ciy(i) = ∅, i=1 i=1 0 since, for some j ∈ [m], x(j) 6= y(j) and Cj` ∩ Cj`0 = ∅ for any j and ` 6= ` . All told, we have shown that, for each k ∈ [n],

m c X Y BkA = Bk Cix(i) x∈M i=1 m X Y = Bk Cix(i), x∈M i=1 which is a disjoint union of elements of S. Therefore, Sc can be represented as a disjoint union of elements of S, thus completing the proof that S is a semialgebra.

Problem 7

Let B be a σ-field of subsets of Ω and let Q : B → < satisfying the following conditions: (i) Q is finitely additive on B. (ii) Q(Ω) = 1 and 0 ≤ Q(A) ≤ 1 for all A ∈ B.

21 P∞ P∞ (iii) If Ai ∈ B are pairwise disjoint and i=1 Ai = Ω, then i=1 Q(Ai) = 1. Show that Q is σ-additive, so that it is, in fact, a probability measure on B.

Proof. Let {An} be a countable sequence of pairwise disjoint elements of B and P c let A denote n∈ An. Since B is a σ-field, A ∈ B, and so A ∈ B. Now, Nc {An | n ∈ N} ∪ {A } is a pairwise disjoint collection of elements of B whose union is Ω, so it follows that

X c Q(An) + Q(A ) = 1 (by property iii) n∈N = Q(Ω) (by property ii) = Q (A ∪ Ac) = Q(A) + Q(Ac) (by property i) ! X c = Q An + Q(A ). n∈N Thus, ! X X Q(An) = Q An , n∈N n∈N as desired.

Problem 1

Proposition 0.2. Let X : (Ω, F) → (<, B) where B is the Borel σ-field in <. −1 Let S : (Ω, F) → (S, A) where A is a σ-field in S. Denote by FS = S (A) the sub-σ-field induced by S. The function X is FS/B-measurable if and only if there exists a measurable h :(S, A) → (<, B) such that X(ω) = h[S(ω)] for every ω ∈ Ω. Proof. (⇐) It follows immediately from the existence of h that

X−1(B) = S−1(h−1(B)) ⊆ S−1(A)

⊆ FS, and so X is FS/B-measurable.

Problem 2

Proposition 0.3. Let {Xn : n = 1, 2, 3,...} be a sequence of random variables defined on (Ω, F), and let N be a positive integer-valued random variable defined on (Ω, F). The function Y = XN is a random variable.

22 Proof. Let a ∈ R and consider Y −1(−∞, a]. It follows from the definition of Y that

Y −1(−∞, a] = {ω ∈ Ω | Y (ω) ∈ (−∞, a]}

= {ω ∈ Ω | XN (ω) ∈ (−∞, a]}

= {ω ∈ Ω | N(ω) ∈ (−∞, a] and XN(ω)(ω) ∈ (−∞, a]} + = {ω ∈ Ω | N(ω) ∈ (−∞, a]} ∩ {ω ∈ Ω | Xn(ω) ∈ (−∞, a], n ∈ (−∞, a] ∩ Z }   [ = {ω ∈ Ω | N(ω) ∈ (−∞, a]} ∩  {ω ∈ Ω | Xn(ω) ∈ (−∞, a]} . + n∈(−∞,a]∩Z

Now, N is a random variable, so {ω ∈ Ω | N(ω) ∈ (−∞, a]} ∈ F. Similarly, Xn is a random variable for each n ∈ N, so each {ω ∈ Ω | Xn(ω) ∈ (−∞, a]} ∈ F. Finally, as F is closed under countable union and countable intersection, we see that Y −1(−∞, a] ∈ F, and so Y is a random variable.

Problem 3

Proposition 0.4. If X is a random variable, then |X| is also a random variable.

Proof. Let a ∈ R and consider |X|−1(−∞, a]. We know that |X|−1(−∞, a] = {ω ∈ Ω: |X(ω)| ∈ (−∞, a]} = {ω ∈ Ω: |X(ω)| ∈ [0, a]} = {ω ∈ Ω: X(ω) ∈ [−a, a]}.

Now, since X is measurable and [−a, a] ∈ B, |X|−1(−∞, a] ∈ F, and so |X| is a random variable. Proposition 0.5. If |X| is a random variable, X need not be a random variable.

Proof. Let N denote some nonmeasurable subset of R and define X :(R, B) → (R, B) via ( −1 if y ∈ N X(y) = 1 if y∈ / N. We see that |X| ≡ 1, and so ( ∅ if a < 1 |X|−1(−∞, a] = R if a ≥ 1.

Hence, |X| is a random variable. At the same time, we have X−1(−∞, −1] = N, and so X is not a random variable.

23 Problem 4

Proposition 0.6. Let (Ω, B,P ) be ([0, 1], B(0, 1], λ) where λ is the Lebesgue measure on [0, 1]. Define the process {Xt : 0 ≤ t ≤ 1} according to

Xt(ω) = I{t = ω}.

Each Xt is a random variable.

Proof. Observe first that the range of each Xt is {0, 1}, so it suffices to consider preimages of the generators of the σ-algebra {∅, {0}, {1}, {0, 1}}, namely {0} and {1}. Let now s ∈ [0, 1] be arbitrary but fixed. We see that

−1 Xs {0} = (0, 1] \{s} −1 Xs {1} = {s}.

As both (0, 1] \{s} and {s} belong to B(0, 1], we conclude that Xs is a random variable. By the above, we see the σ-field generated by {Xt : 0 ≤ t ≤ 1} consists of all sets A such that A itself or Ac is a countable union of singletons.

Problem 5

Proposition 0.7. If X and Y are random variables on (Ω, F,P ), then

sup |P {X ∈ A} − P {Y ∈ A}| ≤ P {X 6= Y }. A∈B

Proof. For any A ∈ B,

{X 6= Y } ⊇ X−1(A) ∪ Y −1(A) \ (X−1(A) ∩ Y −1(A)), and so

|P (X 6= Y )| ≥ |P (X ∈ A) + P (Y ∈ A) − P ((X ∈ A) ∩ (Y ∈ A))| ≥ |P (X ∈ A) − P (Y ∈ A)|.

As A was arbitrary, we have that P (X 6= Y ) ≥ supA∈B |P (X ∈ A)−P (Y ∈ A)|, as desired.

Problem 6

Proposition 0.8. If {An : n = 1, 2, 3,...} is an independent sequence of events, then ( ∞ ) ∞ \ Y P An = P {An}. n=1 n=1

24 Tn Proof. Define for each n ∈ N the set Bn = Ak. We see that {Bn} is a k=1 T nonincreasing sequence of sets, and thus limn→∞ Bn = Bn. By definition T T n∈N of the Bn, however, we have also that Bn = An. It now follows that n∈N n∈N

( ∞ ) \ n o P An = P lim Bn n→∞ n=1

= lim P (Bn) n→∞ n ! \ = lim P Ak n→∞ k=1 n Y = lim P (Ak) n→∞ k=1 ∞ Y = P (An). n=1

Problem 7

Proposition 0.9. If X and Y are independent random variables and f, g are measurable and real-valued, then f(X) and g(Y ) are independent. Proof. Let A, B ∈ B. Since f and g are measurable and real-valued, there exist A0,B0 ∈ B such that f −1(A) = A0 and g−1(B) = B0. Since X and Y are independent random variables, we have that X−1(A0) and Y −1(B0) are independent. Thus, we have shown for any A, B ∈ B that X−1(f −1(A)) and Y −1(g−1(B)) are independent. In other words, f(X) and g(Y ) are independent.

Problem 8

Proposition 0.10. A random variable X is independent of itself if and only if there is some constant c such that P {X = c} = 1.

Proof. (⇒) Choose some event ω ∈ Ω with nonzero probability and set c = X(ω). Since X is independent of itself, we have

0 = P ({ω} ∩ (Ω \{ω})) = P ({ω})P (Ω \{ω}) .

Since P ({ω}) > 0, it must be that P (Ω \{ω}) = 0. Thus, we conclude that P ({ω}) = 1, and so P (X = c) = 1. (⇐) Let A, B ∈ B. We consider three cases.

25 If c ∈ A and c ∈ B, then P (X ∈ A) = P (X ∈ B) = 1. Moreover, c ∈ A ∩ B, so P ([X ∈ A] ∩ [X ∈ B]) = 1. If c ∈ A and c∈ / B, then P (X ∈ A) = 1 and P (X ∈ B) = 0. Moreover, c∈ / A ∩ B, so P ([X ∈ A] ∩ [X ∈ B]) = 0. If c∈ / A and c∈ / B, then P (X ∈ A) = 0 and P (X ∈ B) = 0. Moreover, c∈ / A ∩ B, so P ([X ∈ A] ∩ [X ∈ B]) = 0. In any case, we have P ([X ∈ A] ∩ [X ∈ B]) = P (X ∈ A)P (X ∈ B), and so A and B are independent. Therefore, X is independent of itself.

Problem 9

Consider the experiment of tossing a fair coin some number of times, so that each of the possible outcomes in the sample space are equally likely. Proposition 0.11. There are three events A, B, and C such that every pair are independent, but P (ABC) 6= P (A)P (B)P (C).

Proof. The desired events can be constructed using two flips. Set

A = {TT,TH} B = {TH,HT } C = {HT,HH}.

1 1 We have P (A) = P (B) = P (C) = 2 and P (AB) = P (AC) = P (BC) = 4 , so P (AB) = P (A)P (B), P (AC) = P (A)P (C), and P (BC) = P (B)P (C). That is, the events are pairwise independent. At the same time, we have P (ABC) = 0 1 (since ABC = ∅), which is not equal to P (A)P (B)P (C) = 8 . Proposition 0.12. There are three events A, B, and C such that P (ABC) = P (A)P (B)P (C), but with at least one possible pair not independent. Proof. The desired events can be constructed using three flips. Set

A = {TTT,TTH,THT,THH} B = {TTT,TTH,THT,HTT } C = {TTT,THH,HTT,HTH}.

1 1 We have P (A) = P (B) = P (C) = 2 and P (ABC) = 8 , so P (ABC) = 3 P (A)P (B)P (C). At the same time, we have P (AB) = 8 , which is not equal to 1 P (A)P (B) = 4 . Proposition 0.13. There are four events A, B, C, and D such that these events are independent.

26 Proof. The desired events can be constructed using four flips. Set

A = {TTTT,TTTH,TTHT,THTT,THTH,THHT,THHH,HTTT } B = {TTTT,TTTH,TTHT,TTHH,THTT,HTTH,HTHT,HHTH} C = {TTTT,TTTH,TTHH,THTH,THHT,HTTH,HTHH,HHTT } D = {TTTT,TTHT,TTHH,THHH,HTTT,HTHT,HTHH,HTTT }.

For clarity, we list the possible intersections explicitly.

AB = {TTTT,TTTH,TTHT,THTT } AC = {TTTT,TTTH,THTH,THHT } AD = {TTTT,TTHT,THHH,HTTT } BC = {TTTT,TTTH,TTHH,HTTH} BD = {TTTT,TTHT,TTHH,HTHT } ABC = {TTTT,TTTH} ABD = {TTTT,TTHT } BCD = {TTTT,TTHH} ABCD = {TTTT }

From the above, it is routine to check (but lengthy to write out) that the events A, B, C, D are indepedent.

Problem 1

Consider then the experiment where a computer generates successive letters independently from the Roman alphabet randomly. Proposition 0.14. The string “MOHR” will appear infinitely often with prob- ability one.

Proof. Break the infinite string generated by the computer into disjoint blocks of four characters each (i.e. the first four characters comprise the first block, the second four characters comprise the second block, and so on). Let Ai be the th event that the i block is the string “MOHR”. Note that the Ai are independent 1 since the blocks are disjoint. Now, for all i, P (Ai) = 264 . Hence,

∞ X P (An) = ∞. n=1

By Borel-Cantelli, An occurs infinitely often with probability one. That is, the string “MOHR” appears infinitely often with probability one.

27 Problem 3

Proposition 0.15. If {An} are independent events satisfying P (An) < 1 for all n, then ( ∞ ) [ P An = 1 if and only if P (An i.o.) = 1. n=1 Proof. (⇒) Suppose that ( ∞ ) [ P An = 1. n=1 It follows that ( ∞ !c) [ 0 = P An n=1 ( ∞ ) \ c = P An n=1 ∞ Y c = P (An) (by independence) n=1 ∞ Y = 1 − P (An). n=1 Q∞ Now, since P (An) < 1 for all n, the fact that n=1 1 − P (An) = 0 implies that P (An) > 0 infinitely often. Thus, ∞ Y 0 = e−P (An) n=1 − P∞ P (A ) = e n=1 n . Therefore, ∞ X P (An) = ∞, n=1 and so P (An i.o.) = 1 by Borel-Cantelli. S (⇐) Define Bn to be the event k≥n Ak. Observe that {Bn} is a non- increasing sequence of events. Now,

1 = P (An i.o.)

 ∞  \ [ = P  Ak n=1 k≥n ∞ ! \ = P Bn n=1

= lim P (Bn). n→∞

28 As the Bn are non-increasing, the sequence {P (Bn)} is non-increasing, and so P (Bn) = 1 for all n. In particular,

1 = P (B1) ∞ ! [ = P An . n=1

In the above proposition, the condition P (An) < 1 for all n cannot be dropped. To see this, consider the sequence {An} where A1 = Ω and An are sets of measure zero for all n ≥ 2. We have that

( ∞ ) [ P An ≥ P (Ω) n=1 = 1, yet

 ∞  \ [ P (An i.o.) = P  Ak n=1 k≥n

 ∞  \ [ ≤ P  Ak n=2 k≥n

= 0 (since all Ak are sets of measure zero).

Problem 5

In a sequence of independent Bernoulli random variables {Xn | n = 1, 2, ···} with P (Xn = 1) = p = 1 − P (Xn = 0), n let An be the event that a run of n consecutive 1’s occurs between trials 2 and 2n+1.

1 Proposition 0.16. If p ≥ 2 , then P (An i.o.) = 1.

2n Proof. Break the trials into n disjoint blocks of length n. The probability of not getting all 1’s in a given block is 1−pn, and so the probability of not getting all 1’s among any of the blocks is

n n 2 (1 − p ) n .

29 Now,

2n n n 2  −pn  n (1 − p ) n ≤ e

n − (2p) = e n .

Hence,

n n 2 P (An) = 1 − (1 − p ) n n − (2p) ≥ 1 − e n .

1 We consider two cases. If p > 2 , (2p)n (1 + )n = (some  > 0) n n → ∞

Thus,

∞ ∞ n X X − (2p) P (An) ≥ 1 − e n n=1 n=1

∞ ∞ n X X − (2p) = 1 − e n n=1 n=1 = ∞, and so P (An i.o.) = 1 by Borel-Cantelli. 1 If p = 2 , then we use ∞ 1 k − 1 X − n e n = . k! k=0 Now,

∞ 1 k X − P (A ) = 1 − n n k! k=0 ∞ 1 k 1 X − = 1 − 1 + − n n k! k=2 ∞ 1 k 1 X − = − n . n k! k=2

30 Thus,

∞ ∞ ∞ 1 k ! X X 1 X − P (A ) = − n n n k! n=1 n=1 k=2 ∞ ∞ ∞ 1 k X 1 X X − = − n n k! n=1 n=1 k=2 = ∞, and so P (An i.o.) = 1 by Borel-Cantelli.

Problem 7

Proposition 0.17. If the event A is independent of the π-system P and A ∈ σ(P), then P (A) is either 0 or 1. Proof. Since A is independent of P, σ(A) is independent of σ(P) by the Basic Criterion. As A ∈ σ(P), A is independent of itself. Thus, P (A) = P (A ∩ A) = P (A)P (A) = P (A)2, and so P (A) is either 0 or 1.

Problem 9

Proposition 0.18. If σ(X1,...,Xn−1) and σ(Xn) are independent for all n ≥ 2, then {Xn | n = 1, 2,... } is an independent collection of random variables. Proof. Let a finite collection {A | A ∈ X−1(B), j ∈ [k]} be given. Without ij ij ij loss of generality, suppose i` < im for ` < m. Consider the event Ai1 ∩ · · · ∩ Aik . Since

Ai1 ∩ · · · ∩ Aik−1 ∈ σ(Xi1 ,...,Xik−1 )

⊂ σ(X1,X2,...,Xik−1), which is independent from σ(Xik ), it follows that

P (Ai1 ∩ · · · ∩ Aik−1 ∩ Aik ) = P (Ai1 ∩ · · · ∩ Aik−1 )P (Aik ). Proceeding inductively, we conclude that

k Y P (Ai1 ∩ · · · ∩ Aik−1 ∩ Aik ) = P (Aij ), j=1 as desired.

31 Problem 10

Proposition 0.19. Given a sequence of events {An | n = 1, 2,... } with P (An) → 1, there exists a subsequence {nk} tending to infinity such that ! \ P Ank > 0. k

Proof. Since P (An) → 1, we have that, for all  > 0 there exists N such that P (An) > 1 −  for all n ≥ N. Choose a sequence of positive k for k ≥ 1 satisfying X k < 1. k 1 (For example, k = 4k suffices.) From the above, we can find corresponding nk for each k such that P (Ank ) > 1 − k. Now, ! !c! \ \ P Ank = 1 − P Ank k k ! [ = 1 − P Ac nk k X ≥ 1 − P (Ac ) nk k X > 1 − k k > 0.

Problem 1

Let (Ω, F,P ) be a , and let {An} and A belong to F. Let X be a random variable defined on this probability space with X ∈ L1. Proposition 0.20. Z lim XdP = 0 n→∞ |X|>n

Proof. Define the sequence of random variables Xn for n ∈ N via ( X(ω) if |X(ω)| > n Xn(ω) = 0 otherwise. Observe that, for all n, Z Z XdP = XndP. |X|>n Ω

32 Since Z Z

XndP ≤ |Xn|dP, Ω Ω R it suffices to show that Ω |Xn|dP → 0. Now, |Xn| → 0 almost everywhere, since |X| is finite almost everywhere. Moreover, |Xn| ≤ |X| for all n. Thus, by the Dominated Convergence Theorem, Z Z lim |Xn|dP = lim |Xn|dP n→∞ Ω Ω n→∞ Z = 0dP Ω = 0.

Proposition 0.21. If lim P (An) = 0, n→∞ then Z lim XdP = 0. n→∞ An

Proof. Define the sequence of random variables Xn for n ∈ N via Xn = X · 1An . Observe that, for all n, Z Z XdP = XndP. An Ω Since Z Z

XndP ≤ |Xn|dP, Ω Ω R it suffices to show that Ω |Xn|dP → 0. Now, |Xn| → 0, since P (An) → 0. Moreover, |Xn| ≤ |X| for all n. Thus, by the Dominated Convergence Theorem, Z Z lim |Xn|dP = lim |Xn|dP n→∞ Ω Ω n→∞ Z = 0dP Ω = 0.

Proposition 0.22. Z |X|dP = 0 A if and only if P {A ∩ [X > 0]} = 0.

33 Proof. Observe first that Z Z Z |X|dP = |X|dP + |X|dP A A∩[|X|>0] A∩[|X|=0] Z = |X|dP + 0. A∩[|X|>0] R R Thus, A |X|dP = 0 if and only if A∩[|X|>0] |X|dP = 0 if and only if P {A ∩ [|X| > 0]} = 0 (since we integrate over those ω ∈ Ω for which |X(ω)| > 0).

Proposition 0.23. Let X ∈ L2. If V (X) = 0, then P [X = E(X)] = 1. Proof. Let  > 0. By Chebychev’s Inequality,

V (X) P [|X − E(X)| ≥ ] ≤ 2 = 0.

Equivalently, P [|X − E(X)| < ] = 1 for all  > 0. Therefore, P [X = E(X)] = 1.

Problem 2

If X and Y are independent random variables and E(X) exists, then, for all B ∈ B(R), Z XdP = E(X)P {Y ∈ B}. [Y ∈B] Proof. For ease of notation, let A = [Y ∈ B]. Thus, Z Z XdP = XdP [Y ∈B] A Z = X · 1AdP Ω = E(X · 1A).

Now, write X = X+ −X−. Since each of X+ and X− is measurable, we can find + − + sequences of nonneagtive, simple random variables {Xn } and {Xn } with Xn ↑ + − − + + X and Xn ↑ X . By the Monotone Convergence Theorem, E(Xn ) ↑ E(X ) − − + − and E(Xn ) ↑ E(X ). Thus, by linearity of expectation, E(Xn − Xn ) → + − E(X). Moreover, E(Xn · 1A − Xn · 1A) → E(X · 1A). We show next that + − E(Xn ·1A−Xn ·1A) → E(X)P (A) and so conclude that E(X·1A) = E(X)P (A).

34 For each n, we have

n + X Xn = ai1Ai , i=1 where the ai are constants and the Ai partition Ω. It follows that

n + X Xn · 1A = ai1Ai 1A i=1 n X = ai1Ai∩A, i=1 and so

n + X E(Xn · 1A) = aiP (Ai ∩ A) i=1 n X = aiP (Ai)P (A) (by independence of X and Y ) i=1 n X = P (A) aiP (Ai) i=1 + = P (A)E(Xn ) → P (A)E(X+).

− − Similarly, E(Xn · 1A) → P (A)E(X ), and so

+ − + − E(Xn · 1A − Xn · 1A) = E(Xn · 1A) − E(Xn · 1A) → P (A)E(X+) − P (A)E(X−) = P (A)E(X+ − X−) = P (A)E(X), as desired.

Problem 3

Proposition 0.24. For all n ≥ 1, let Xn and X be uniformly bounded random variables. If lim Xn = X, n→∞ then lim E|Xn − X| = 0. n→∞

35 Proof. The random variable that is identically K belongs to L1, since Z KdP = K · P (Ω) Ω = K.

Thus, the identically K random variable is a dominating random variable for the Xn, and so by the Dominated Convergence Theorem, E|Xn − X| → 0.

Problem 4

On the Lebesgue interval (Ω = [0, 1], B([0, 1]),P = λ) define the random vari- ables n Xn = I0, 1 . log n n

Proposition 0.25. For Xn defined as above,

lim Xn = 0 n→∞ and lim E(Xn) = 0, n→∞ yet the Xn are unbounded.

1 Proof. For any x ∈ [0, 1], we can choose N such that N < x. Thus, XN (x) = 0, 1 since x∈ / [0, N ]. As x was arbitrary, we conclude that Xn → 0. Next, observe that, for all n,

n  1  E(X ) = λ 0, n log n n n 1 = · log n n 1 = , log n and so E(Xn) → 0. n Finally, log n → ∞, and so the Xn are unbounded. Hence, Xn → 0 and E(Xn) → 0, yet the condition in the Dominated Convergence Theorem fails.

Problem 5

Proposition 0.26. Let Xn ∈ L1 for all n ≥ 1 satisfying

sup E(Xn) < ∞. n

If Xn ↑ X, then X ∈ L1 and E(Xn) → E(X).

36 + + − − + Proof. Since Xn ↑ X, we see that Xn ↑ X and Xn ↓ X . Since each of Xn − and Xn belongs to L1 for all n, we have by the Monotone Convergence Theorem + + − − that E(Xn ) → E(X ) and E(Xn ) → E(X ). By linearity of expectation, this implies that E(Xn) → E(X). Since supn E(Xn) is finite, so is E(X) by uniqueness of limits. + To show that X ∈ L1, it remains to rule out the case that E(X ) = − E(X ) = ∞ (if only one of them is infinite, then E(X) = ±∞, but supn E(Xn) < − − − − ∞). Observe, however, Xn ↓ X . Thus, if E(X ) = ∞, E(Xn ) = ∞ for all − n, contradicting the fact that Xn ∈ L1 for all n.

Problem 6

Proposition 0.27. For any positive random variable X, Z E(X) = P (X > t)dt. [0,∞) Proof. We may view the area of integration as a subset A of the product space Ω × [0, ∞) where A = {(ω, t) | X(ω) > t}. with product measure P 0 = P × µ. Now, by Fubini’s Theorem, Z Z Z 0 1AdP = 1A(ω, t)dtdP Ω×[0,∞) Ω [0,∞) Z = X(ω)dP Ω = E(X).

On the other hand, Z Z Z 0 1AdP = 1A(ω, t)dP dt Ω×[0,∞) [0,∞) Ω Z = P {ω | X(ω) > t}dt [0,∞) Z = P (X > t)dt. [0,∞)

Proposition 0.28. For any positive random variable X and any constant α > 0, Z E(Xα) = α tα−1P (X > t)dt. [0,∞)

37 α R X(ω) α−1 Proof. By direct computation, we have X (ω) = 0 αt dt. It follows that Z E(Xα) = X(ω)P (dω) Ω Z Z X(ω) = αtα−1dtP (dω) Ω 0 Z ∞ Z = αtα−1P (dω)dt (by Fubini’s Theorem) 0 {P (X>t)} Z ∞ = αtα−1P (X > t)dt. 0

Problem 7

Proposition 0.29. Let X be a nonnegative random variable and let δ > 0, 0 < β < 1, and C be constants. If P {X > nδ} ≤ Cβn for all n ≥ 1, then E(Xα) < ∞ for all α > 0. R α−1 Proof. By the previous problem, it is equivalent to show that α [0,∞) t P (X > t)dt is finite. To begin, pick N such that tα−1P (X > t) is strictly decreasing in t for all t ≥ N. Such an N exists, as tα−1 is a polynomial in t and P (X > t) decays exponentially. It follows that ∞ Z X α tα−1P (X > t)dt ≤ α δ · (nδ)α−1P (X > nδ) (since tα−1P (X > t) is strictly decreasing) [N,∞) n=N ∞ X ≤ αδα nα−1Cβn n=N ∞ X = Cαδα nα−1βn n=N < ∞ (by the ratio test).

R α−1 R α−1 Since [0,N] t P (X > t)dt is also finite, we conclude that [0,∞] t P (X > t)dt is finite.

Problem 2

Let {Xn | n = 1, 2,... } be a sequence of random variables with 1 1 P {X = ±n3} = and P {X = 0} = 1 − . n 2n2 n n2

38 Proposition 0.30. For the sequence described above, n o P lim Xn = 0 = 1. n→∞

Proof. Let An be the event that Xn is nonzero. Formally,

An = {ω ∈ Ω | Xn(ω) 6= 0}.

We see that

P (An) = 1 − P {X = 0}  1  = 1 − 1 − n2 1 = , n2 and so

∞ ∞ X X 1 P (A ) = n n2 n=1 n=1 < ∞.

Thus, by Borel-Cantelli,   P ([Ani.o]) = P lim sup An = 0. n→∞ By taking complements, we have   c 1 = P lim sup An n→∞   = P lim inf[Xn = 0] , n→∞ and so   P lim Xn = 0 = 1. n→∞

Proposition 0.31. For the sequence described above, limn→∞ E(Xn) is either ±∞ or is undefined.

3 1 Proof. For each n, P {Xn = ±n } = 2n2 . Hence, 1 P {X+ = n3} ≥ n 4n2 or 1 P {X− = n3} ≥ n 4n2

39 (possibly both). Suppose the former is true. It follows that

+ 3 + 3 E(Xn ) ≥ n P {Xn = n } 1 ≥ n3 4n2 n = , 4 and so

+ n lim E(Xn ) ≥ lim n→∞ n→∞ 4 = ∞.

− 3 1 − Similarly, if P {Xn = n } ≥ 4n2 , then limn→∞ E(Xn ) = ∞. Therefore,  ∞ if lim (X+) = ∞ and lim (X−) < ∞  n→∞ E n n→∞ E n + − lim (Xn) = −∞ if limn→∞ (X ) < ∞ and limn→∞ (X ) = ∞ n→∞ E E n E n  + − undefined if limn→∞ E(Xn ) = ∞ and limn→∞ E(Xn ) < ∞.

Problem 3

Let (Ω, F,P ) be a probability space. Definition 0.32. Two random variables X and Y are said to be independent provided that, for any A, B ∈ B(R),

P [X−1(A) · Y −1(B)] = P (X−1(A)) · P (Y −1(B)).

Proposition 0.33. Two random variables X and Y are independent if and only if, for every pair f and g of non-negative continuous functions on (R, B(R)),

E[f(X)g(Y )] = E[f(X)]E[g(Y )]. Proof. (⇒) Let f and g be any non-negative continuous functions on (R, B(R)). Since continuous functions between metric spaces are measurable (Resnick, 3.2.3), f and g are measurable. Since the composition of measurable functions is measurable (Resnick, 3.2.2), f(X) and g(Y ) are measurable. Now, σ(f(X)) ⊆ σ(X) (since f ∈ B(R)/B(R)) and σ(g(Y )) ⊂ σ(Y ) (since g ∈ B(R)/B(R)). Hence, f(X) and g(Y ) are independent, since X and Y are independent. Define now Z1 = f(X) and Z2 = g(Y ). By the above, Z1 and Z2 are inde- pendent random variables. Thus, by Fubini’s Theorem (as in Resnick, 5.9.2),

E(Z1Z2) = E(Z1)E(Z2), but this is precisely

E(f(X)g(Y )) = E(f(X))E(g(Y )).

40 (⇐) (Idea) Let a and b be real numbers and take f = 1(0,a] and g = 1(0,b]. The support of f(X) is {ω | X(ω) ≤ a} and the support of g(Y ) is {ω | Y (ω)}. Thus, the measures of the supports are P (X ≤ a) and P (Y ≤ b), respectively. Using the fact that

E(f(X)g(Y )) = E(f(X))E(g(Y )), I would like to derive that

P (X ≤ a, Y ≤ b) = P (X ≤ a) · P (Y ≤ b).

Since a and b were arbitrary, we could conclude that X and Y are independent by the Factorization Criterion (Resnick, 4.2.1). Perhaps this may be accom- plished by looking at the appropriate approximations of f(X) and g(Y ) by simple functions (where the probabilty of the support becomes more evident in the computation).

For each $n$, let $X_n$ and $Y_n$ be a pair of independent random variables and define
\[
\lim_{n\to\infty} X_n = X \quad \text{and} \quad \lim_{n\to\infty} Y_n = Y.
\]

Proposition 0.34. The functions $X$ and $Y$ are independent random variables.

Proof. (Idea) We have, for each $n$ and for all continuous, non-negative $f$ and $g$,

\[
E(f(X_n)g(Y_n)) = E(f(X_n))E(g(Y_n)).
\]
If we could switch limits with integrals, we would have

\begin{align*}
\lim_{n\to\infty} E(f(X_n)g(Y_n)) &= \lim_{n\to\infty} E(f(X_n))E(g(Y_n)) \\
E\left(\lim_{n\to\infty} f(X_n)g(Y_n)\right) &= E\left(\lim_{n\to\infty} f(X_n)\right) E\left(\lim_{n\to\infty} g(Y_n)\right) \\
E(f(X)g(Y)) &= E(f(X))E(g(Y)),
\end{align*}
where the last step makes use of the continuity of $f$ and $g$. Thus, appealing again to part b, we could conclude that $X$ and $Y$ are independent. I fail to see how to accomplish the interchange, however, as the $X_n$ need not be monotone, nor does there appear to be any bounding function.
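One possible way around this obstacle (a suggestion beyond the original write-up): if the criterion of Proposition 0.33 is applied with $f$ and $g$ bounded as well as continuous and non-negative, which suffices for the Factorization Criterion since the approximating functions $f_k$ and $g_k$ above are bounded by $1$, then $|f(X_n)g(Y_n)| \le \|f\|_\infty \|g\|_\infty$, a constant, and the Dominated Convergence Theorem justifies each interchange of limit and expectation.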

Problem 5

Suppose $\{p_k \mid k \ge 0\}$ is a probability mass function on $(\Omega = \{0, 1, 2, \ldots\}, \mathcal{P} = \mathcal{P}(\Omega))$, where $\mathcal{P}(\cdot)$ denotes the power set, so that $p_k \ge 0$ and $\sum_k p_k = 1$. Define, for all $A \subset \Omega$,
\[
P(A) = \sum_{k \in A} p_k.
\]

Proposition 0.35. The function $P$ defined above is a probability measure on $(\Omega, \mathcal{P})$.

Proof. Since $p_k \ge 0$ for all $k$, $P(A) \ge 0$ for all $A \subset \Omega$. We have, by definition of the probability mass function,
\[
P(\Omega) = \sum_{k \in \Omega} p_k = 1.
\]
Let $\{A_n\}$ be a countable sequence of disjoint events and let $A = \bigcup_n A_n$. It follows that
\begin{align*}
P\left(\bigcup_{n=1}^{\infty} A_n\right) &= P(A) \\
&= \sum_{k \in A} p_k \\
&= \sum_{n=1}^{\infty} \sum_{k \in A_n} p_k && \text{(since the $A_n$ are disjoint)} \\
&= \sum_{n=1}^{\infty} P(A_n).
\end{align*}

Define the generating function $\Psi : ([0,1], \mathcal{B}[0,1]) \to (\mathbb{R}, \mathcal{B})$ via
\[
\Psi(s) = \sum_{k=0}^{\infty} p_k s^k.
\]

Proposition 0.36. The function $\Psi$ defined above satisfies
\[
\Psi'(s) \equiv \frac{d}{ds}\Psi(s) = \sum_{k=1}^{\infty} k p_k s^{k-1}
\]
for $0 \le s \le 1$.

Proof. Fix $s \in [0,1]$ and define the partial sums $X_n = \sum_{k=1}^{n} k p_k s^{k-1}$. Observe that $0 \le X_n \uparrow \sum_{k=1}^{\infty} k p_k s^{k-1}$, so, viewing each sum as an integral with respect to counting measure on $\{1, 2, \ldots\}$, the Monotone Convergence Theorem gives
\[
\lim_{n\to\infty} \frac{d}{ds} \sum_{k=0}^{n} p_k s^k = \lim_{n\to\infty} \sum_{k=1}^{n} k p_k s^{k-1} = \sum_{k=1}^{\infty} k p_k s^{k-1}.
\]
Since the partial sums of $\Psi$ converge and their derivatives converge monotonically, differentiation may be carried out term by term, and therefore
\[
\Psi'(s) = \sum_{k=1}^{\infty} k p_k s^{k-1}.
\]

Proposition 0.37. If $X$ has probability measure $P$, then $E(X) = \lim_{s \uparrow 1} \Psi'(s)$.

Proof. We have
\begin{align*}
E(X) &= \int_{\Omega} X(\omega)\,dP \\
&= \sum_{k=0}^{\infty} k P(X = k) \\
&= \sum_{k=1}^{\infty} k p_k \\
&= \lim_{s \uparrow 1} \Psi'(s).
\end{align*}
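As a sanity check (an added example, not part of the original solution), take $p_k = e^{-\lambda}\lambda^k/k!$, the Poisson mass function with mean $\lambda$. Then
\[
\Psi(s) = \sum_{k=0}^{\infty} e^{-\lambda} \frac{(\lambda s)^k}{k!} = e^{\lambda(s-1)}, \qquad \Psi'(s) = \lambda e^{\lambda(s-1)},
\]
and $\lim_{s \uparrow 1} \Psi'(s) = \lambda = E(X)$, as the proposition predicts.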

Problem 6

Let $X_1, X_2, \ldots, X_n \in L^2(P)$ be random variables defined on a probability space $(\Omega, \mathcal{F}, P)$. For each $i, j \in \{1, 2, \ldots, n\}$, define the covariances

\[
\sigma_{ij} = C(X_i, X_j) = E\{[X_i - \mu_i][X_j - \mu_j]\},
\]
where
\[
\mu_i = E(X_i) \quad \text{and} \quad \sigma_i^2 = \sigma_{ii} = V(X_i) = E[(X_i - \mu_i)^2].
\]

Lemma 0.38. For any random variable $X$ and real numbers $a$ and $b$,

\[
V(aX + b) = a^2 V(X).
\]

Proof. It follows from the linearity of expectation that

\begin{align*}
V(aX + b) &= E[(aX + b - E(aX + b))^2] \\
&= E[(aX + b - aE(X) - b)^2] \\
&= E[(a(X - E(X)))^2] \\
&= a^2 E[(X - E(X))^2] \\
&= a^2 V(X).
\end{align*}

Lemma 0.39. For any random variables X and Y ,

\[
V(X + Y) = V(X) + 2C(X, Y) + V(Y).
\]

Proof. It follows from the linearity of expectation that

\begin{align*}
V(X + Y) &= E[(X + Y - E(X) - E(Y))^2] \\
&= E[((X - E(X)) + (Y - E(Y)))^2] \\
&= E[(X - E(X))^2 + 2(X - E(X))(Y - E(Y)) + (Y - E(Y))^2] \\
&= E[(X - E(X))^2] + E[2(X - E(X))(Y - E(Y))] + E[(Y - E(Y))^2] \\
&= V(X) + 2C(X, Y) + V(Y).
\end{align*}

Proposition 0.40. For all i and j,

\[
\sigma_{ij} \le |\sigma_{ij}| \le \sigma_i \sigma_j.
\]

Moreover, $|\sigma_{ij}| = \sigma_i \sigma_j$ if and only if, for some $\alpha$ and $\beta$, we have $P\{X_j = \alpha + \beta X_i\} = 1$.

Proof. For all real numbers x, we have x ≤ |x|, so certainly σij ≤ |σij|. For the second inequality, let t be a real variable. It follows from the lemmas that

\begin{align*}
0 \le V[tX_i + X_j] &= V(tX_i) + 2C(tX_i, X_j) + V(X_j) \\
&= \sigma_i^2 t^2 + 2\sigma_{ij} t + \sigma_j^2.
\end{align*}

Viewing this as a non-negative quadratic in t, we have that

\[
0 \ge 4\sigma_{ij}^2 - 4\sigma_i^2 \sigma_j^2,
\]
and so $|\sigma_{ij}| \le \sigma_i \sigma_j$. For the remaining claim, observe that

\begin{align*}
|\sigma_{ij}| = \sigma_i \sigma_j &\Leftrightarrow \sigma_{ij}^2 = \sigma_i^2 \sigma_j^2 \\
&\Leftrightarrow 0 = 4\sigma_{ij}^2 - 4\sigma_i^2 \sigma_j^2.
\end{align*}

Hence, V[tXi + Xj] has a unique real root t0. Now, the variance of a random variable is equal to 0 if and only if it is constant with probability one. That is,

\[
P\{t_0 X_i + X_j = \alpha\} = 1,
\]
or equivalently
\[
P\{X_j = \alpha - t_0 X_i\} = 1.
\]
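For a concrete instance of the equality case (added for illustration): if $X_j = 2X_i + 3$, then $\sigma_{ij} = E\{[X_i - \mu_i][2X_i + 3 - (2\mu_i + 3)]\} = 2\sigma_i^2$ and, by Lemma 0.38, $\sigma_j = 2\sigma_i$; hence $|\sigma_{ij}| = 2\sigma_i^2 = \sigma_i \sigma_j$.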

Proposition 0.41. For real constants $\alpha_i$ and $\beta_i$, $i = 1, 2, \ldots, n$,

\[
C\left(\sum_{i=1}^{n} \alpha_i X_i, \sum_{j=1}^{n} \beta_j X_j\right) = \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \beta_j \sigma_{ij}.
\]

Proof. Applying linearity of expectation, we have

\begin{align*}
C\left(\sum_{i=1}^{n} \alpha_i X_i, \sum_{j=1}^{n} \beta_j X_j\right)
&= E\left[\left(\sum_{i=1}^{n} \alpha_i X_i\right)\left(\sum_{j=1}^{n} \beta_j X_j\right)\right] - E\left(\sum_{i=1}^{n} \alpha_i X_i\right) E\left(\sum_{j=1}^{n} \beta_j X_j\right) \\
&= E\left[\sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \beta_j X_i X_j\right] - E\left(\sum_{i=1}^{n} \alpha_i X_i\right) E\left(\sum_{j=1}^{n} \beta_j X_j\right) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \beta_j E(X_i X_j) - \left(\sum_{i=1}^{n} \alpha_i E(X_i)\right)\left(\sum_{j=1}^{n} \beta_j E(X_j)\right) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \beta_j E(X_i X_j) - \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \beta_j E(X_i) E(X_j) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \beta_j C(X_i, X_j) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \beta_j \sigma_{ij}.
\end{align*}
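As a quick added check with $n = 2$, $\alpha = (1, 1)$, and $\beta = (1, -1)$:
\[
C(X_1 + X_2, X_1 - X_2) = \sigma_{11} - \sigma_{12} + \sigma_{21} - \sigma_{22} = \sigma_1^2 - \sigma_2^2,
\]
since $\sigma_{12} = \sigma_{21}$.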

Proposition 0.42. For real constants αi, i = 1, 2, . . . , n,

\[
V\left\{\sum_{i=1}^{n} \alpha_i X_i\right\} = \sum_{i=1}^{n} \alpha_i^2 \sigma_i^2 + 2 \sum_{1 \le i < j \le n} \alpha_i \alpha_j \sigma_{ij}.
\]

Proof. By definition, V(X) = C(X,X) for any random variable X. Thus, by

the previous proposition,

\begin{align*}
V\left\{\sum_{i=1}^{n} \alpha_i X_i\right\} &= C\left(\sum_{i=1}^{n} \alpha_i X_i, \sum_{i=1}^{n} \alpha_i X_i\right) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j C(X_i, X_j) \\
&= \sum_{i=1}^{n} \alpha_i^2 C(X_i, X_i) + 2 \sum_{1 \le i < j \le n} \alpha_i \alpha_j C(X_i, X_j) \\
&= \sum_{i=1}^{n} \alpha_i^2 \sigma_i^2 + 2 \sum_{1 \le i < j \le n} \alpha_i \alpha_j \sigma_{ij}.
\end{align*}
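For example (added): if the $X_i$ are pairwise uncorrelated with common variance $\sigma^2$ and $\alpha_i = \frac{1}{n}$ for all $i$, then every cross term vanishes and
\[
V\left\{\frac{1}{n}\sum_{i=1}^{n} X_i\right\} = \sum_{i=1}^{n} \frac{\sigma^2}{n^2} = \frac{\sigma^2}{n},
\]
the familiar variance of a sample mean.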

Proposition 0.43. Let $X_1, \ldots, X_n$ be independent random variables. Furthermore, suppose that the constants $\alpha_i$ are restricted to belong to $[0,1]$ and must satisfy $\sum_{i=1}^{n} \alpha_i = 1$. Finally, letting $s = \sum_{i=1}^{n} \sigma_i^{-1}$,

\[
V\left\{\sum_{i=1}^{n} \alpha_i X_i\right\} \ge n s^{-2}
\]

with equality if and only if $\alpha_i = (\sigma_i s)^{-1}$ for all $i$.

Proof. Since the $X_i$ are independent, $\sigma_{ij} = 0$ for all $i \neq j$ (Resnick, 5.9.2). Thus, the expression for the variance reduces to

\[
V\left\{\sum_{i=1}^{n} \alpha_i X_i\right\} = \sum_{i=1}^{n} (\alpha_i \sigma_i)^2.
\]
Now, since we require $\sum_{i=1}^{n} \alpha_i = 1$, the variance will be minimized precisely when all the terms in its summation are equal. To that end, tentatively set $\alpha_i = \sigma_i^{-1}$ for all $i$. Thus, $\alpha_i \sigma_i = 1$ for all $i$, and so all the terms are equal, as desired. Now,

\[
\sum_{i=1}^{n} \alpha_i = \sum_{i=1}^{n} \sigma_i^{-1} = s.
\]

To ensure that the $\alpha_i$ indeed sum to $1$, we scale them all by a factor of $s^{-1}$.

Thus, we take instead $\alpha_i = (\sigma_i s)^{-1}$ for all $i$. Using these values, we have
\begin{align*}
V\left\{\sum_{i=1}^{n} \alpha_i X_i\right\} &= \sum_{i=1}^{n} (\alpha_i \sigma_i)^2 \\
&= \sum_{i=1}^{n} ((\sigma_i s)^{-1} \sigma_i)^2 \\
&= \sum_{i=1}^{n} s^{-2} \\
&= n s^{-2}.
\end{align*}
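As an added sanity check in the homogeneous case: if $\sigma_1 = \cdots = \sigma_n = \sigma$, then $s = n/\sigma$ and the prescribed weights become $\alpha_i = (\sigma_i s)^{-1} = \frac{1}{n}$, so the bound $n s^{-2} = \sigma^2/n$ agrees with the variance of the sample mean computed after Proposition 0.42.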

Problem 8

For $i = 1, 2$, let $(\Omega_i, \mathcal{B}_i, P_i)$ be probability spaces. Define $\Omega = \Omega_1 \times \Omega_2$ and $\mathcal{B} = \mathcal{B}_1 \otimes \mathcal{B}_2 = \sigma(\mathrm{RECTS})$, where

\[
\mathrm{RECTS} = \{B_1 \times B_2 \mid B_1 \in \mathcal{B}_1, B_2 \in \mathcal{B}_2\}.
\]

Let $P = P_1 \times P_2$ be the product probability measure so that, for $B_1 \times B_2 \in \mathrm{RECTS}$, we have $P(B_1 \times B_2) = P_1(B_1) P_2(B_2)$. Define the class of subsets
\[
\mathcal{C} = \left\{ B \subset \Omega \;\middle|\; \int_{\Omega} 1_B(\omega_1, \omega_2)\,dP(\omega_1, \omega_2) = \int_{\Omega_1} Y(\omega_1)\,dP_1(\omega_1) \right\},
\]
where $Y(\omega_1) = \int_{\Omega_2} 1_B(\omega_1, \omega_2)\,dP_2(\omega_2)$.

Proposition 0.44. The class $\mathrm{RECTS}$ is a subset of the class $\mathcal{C}$.

Proof. Let $B = B_1 \times B_2$ belong to $\mathrm{RECTS}$. We have
\[
\int_{\Omega} 1_B(\omega_1, \omega_2)\,dP(\omega_1, \omega_2) = \int_{B} dP(\omega_1, \omega_2) = P(B).
\]
At the same time, we have
\begin{align*}
\int_{\Omega_1} \int_{\Omega_2} 1_B(\omega_1, \omega_2)\,dP_2(\omega_2)\,dP_1(\omega_1)
&= \int_{\Omega_1} \int_{\Omega_2} 1_{B_1}(\omega_1) 1_{B_2}(\omega_2)\,dP_2(\omega_2)\,dP_1(\omega_1) \\
&= \int_{\Omega_1} \int_{B_2} 1_{B_1}(\omega_1)\,dP_2(\omega_2)\,dP_1(\omega_1) \\
&= \int_{\Omega_1} 1_{B_1}(\omega_1) P_2(B_2)\,dP_1(\omega_1) \\
&= \int_{B_1} P_2(B_2)\,dP_1(\omega_1) \\
&= P_1(B_1) P_2(B_2) \\
&= P(B).
\end{align*}

Hence, for all $B \in \mathrm{RECTS}$,
\[
\int_{\Omega} 1_B(\omega_1, \omega_2)\,dP(\omega_1, \omega_2) = \int_{\Omega_1} Y(\omega_1)\,dP_1(\omega_1),
\]
and so $\mathrm{RECTS} \subseteq \mathcal{C}$.

Proposition 0.45. The class $\mathcal{C}$ is a $\lambda$-system.

Proof. We have immediately that $\Omega = \Omega_1 \times \Omega_2 \in \mathrm{RECTS} \subseteq \mathcal{C}$, so $\Omega \in \mathcal{C}$. Next, let $B \in \mathcal{C}$. Writing $B_{\omega_1} = \{\omega_2 \mid (\omega_1, \omega_2) \in B\}$ for the $\omega_1$-section of $B$, we have
\begin{align*}
\int_{\Omega} 1_{B^c}(\omega_1, \omega_2)\,dP(\omega_1, \omega_2) &= \int_{B^c} dP(\omega_1, \omega_2) \\
&= P(B^c) \\
&= 1 - P(B) \\
&= 1 - \int_{\Omega_1} \int_{\Omega_2} 1_{B_{\omega_1}}(\omega_2)\,dP_2(\omega_2)\,dP_1(\omega_1) && \text{(since $B \in \mathcal{C}$)} \\
&= \int_{\Omega_1} \int_{\Omega_2} 1_{(B_{\omega_1})^c}(\omega_2)\,dP_2(\omega_2)\,dP_1(\omega_1) \\
&= \int_{\Omega_1} \int_{\Omega_2} 1_{(B^c)_{\omega_1}}(\omega_2)\,dP_2(\omega_2)\,dP_1(\omega_1) \\
&= \int_{\Omega_1} \int_{\Omega_2} 1_{B^c}(\omega_1, \omega_2)\,dP_2(\omega_2)\,dP_1(\omega_1).
\end{align*}

Finally, let $\{B_n \mid n = 1, 2, \ldots\}$ be a collection of disjoint elements of $\mathcal{C}$. We have
\begin{align*}
\int_{\Omega} 1_{\sum_{n=1}^{\infty} B_n}\,dP(\omega_1, \omega_2) &= \int_{\sum_{n=1}^{\infty} B_n} dP(\omega_1, \omega_2) \\
&= P\left(\sum_{n=1}^{\infty} B_n\right) \\
&= \sum_{n=1}^{\infty} P(B_n) \\
&= \sum_{n=1}^{\infty} \int_{\Omega_1} \int_{\Omega_2} 1_{(B_n)_{\omega_1}}(\omega_2)\,dP_2(\omega_2)\,dP_1(\omega_1) && \text{(since $B_n \in \mathcal{C}$ for all $n$)} \\
&= \int_{\Omega_1} \int_{\Omega_2} \sum_{n=1}^{\infty} 1_{(B_n)_{\omega_1}}(\omega_2)\,dP_2(\omega_2)\,dP_1(\omega_1) && \text{(by MCT)} \\
&= \int_{\Omega_1} \int_{\Omega_2} 1_{(\sum_{n=1}^{\infty} B_n)_{\omega_1}}(\omega_2)\,dP_2(\omega_2)\,dP_1(\omega_1) \\
&= \int_{\Omega_1} \int_{\Omega_2} 1_{\sum_{n=1}^{\infty} B_n}(\omega_1, \omega_2)\,dP_2(\omega_2)\,dP_1(\omega_1).
\end{align*}

Therefore, $\mathcal{C}$ is a $\lambda$-system.

Proposition 0.46. For every $B \in \mathcal{B}$,
\[
\int_{\Omega} 1_B(\omega_1, \omega_2)\,dP(\omega_1, \omega_2) = \int_{\Omega_1} \left( \int_{\Omega_2} 1_B(\omega_1, \omega_2)\,dP_2(\omega_2) \right) dP_1(\omega_1).
\]

Proof. We have shown $\mathrm{RECTS} \subseteq \mathcal{C}$ and that $\mathcal{C}$ is a $\lambda$-system. If we can show also that $\mathrm{RECTS}$ is a $\pi$-system, then Dynkin's Theorem gives $\mathcal{B} = \sigma(\mathrm{RECTS}) \subseteq \mathcal{C}$, from which the conclusion follows. To finish the proof, let $B_1 \times B_2$ and $B_1' \times B_2'$ belong to $\mathrm{RECTS}$. It follows immediately that

\[
(B_1 \times B_2) \cap (B_1' \times B_2') = (B_1 \cap B_1') \times (B_2 \cap B_2').
\]

Since $\mathcal{B}_1$ and $\mathcal{B}_2$ are closed under intersections, $B_1 \cap B_1' \in \mathcal{B}_1$ and $B_2 \cap B_2' \in \mathcal{B}_2$, and so $(B_1 \cap B_1') \times (B_2 \cap B_2') \in \mathrm{RECTS}$.
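To see Proposition 0.46 at work on a non-rectangle (an added example): take $\Omega_1 = \Omega_2 = \{0, 1\}$ with $P_1 = P_2$ the fair-coin measure, and let $B = \{(0,0), (1,1)\} = (\{0\} \times \{0\}) \cup (\{1\} \times \{1\})$. Here $Y(\omega_1) = P_2(\{\omega_1\}) = \frac{1}{2}$ for each $\omega_1$, so $\int_{\Omega_1} Y\,dP_1 = \frac{1}{2} = P(B)$, as the proposition requires.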

To establish the more general result where $1_B$ in part c is replaced with any $\mathcal{B}$-measurable positive random variable $X$, we first establish the result for simple functions of the form $X_n = \sum_{i=1}^{n} a_i 1_{B_i}$, where $B_i \in \mathcal{B}$ for all $i$. The result for simple functions follows readily from the linearity of the integral. Since $X$ is positive, we can take a sequence of positive simple functions $X_n \uparrow X$. Since each $X_n$ is simple,
\[
\int_{\Omega} X_n(\omega_1, \omega_2)\,dP(\omega_1, \omega_2) = \int_{\Omega_1} \left( \int_{\Omega_2} X_n(\omega_1, \omega_2)\,dP_2(\omega_2) \right) dP_1(\omega_1)
\]
for all $n$. Applying the Monotone Convergence Theorem, we can conclude that
\[
\int_{\Omega} X_n\,dP \uparrow \int_{\Omega} X\,dP
\quad \text{and} \quad
\int_{\Omega_1} \left( \int_{\Omega_2} X_n\,dP_2 \right) dP_1 \uparrow \int_{\Omega_1} \left( \int_{\Omega_2} X\,dP_2 \right) dP_1,
\]
from which it follows that
\[
\int_{\Omega} X(\omega_1, \omega_2)\,dP(\omega_1, \omega_2) = \int_{\Omega_1} \left( \int_{\Omega_2} X(\omega_1, \omega_2)\,dP_2(\omega_2) \right) dP_1(\omega_1).
\]

Problem 10

Suppose that $X$ and $Y$ are independent random variables and let $h : \mathbb{R}^2 \to [0, \infty)$ be a measurable function such that $E\{h^2(X,Y)\} < \infty$. Define

\[
g(x) = E\{h(x, Y)\} \quad \text{and} \quad k(x) = V\{h(x, Y)\}.
\]

Proposition 0.47. The functions $g$ and $k$ are both measurable from $\mathbb{R}$ to $\mathbb{R}$.

Proof. Define $\hat{h}(x, \omega) = h(x, Y(\omega))$. Since $h$ and $Y$ are measurable, $\hat{h}$ is measurable, as it is defined by the composition of two measurable functions. Hence, we can take a collection $\{\hat{h}_n\}$ of simple functions with $\hat{h}_n \uparrow \hat{h}$. Define now $g_n(x) = \int_{\Omega} \hat{h}_n(x, \omega)\,dP(\omega)$ for $n = 1, 2, \ldots$. By the Monotone Convergence Theorem, $g_n \uparrow g$. To conclude that $g$ is measurable, it remains to show that each $g_n$ is simple. To that end, observe that
\begin{align*}
g_n(x) &= \int_{\Omega} \hat{h}_n(x, \omega)\,dP(\omega) \\
&= \int_{\Omega} \sum_{j=1}^{m} a_j 1_{A_j}(x, \omega)\,dP(\omega) && \text{(constants $a_j$ and $\{A_j\}$ a partition)} \\
&= \sum_{j=1}^{m} \int_{\Omega} a_j 1_{A_j}(x, \omega)\,dP(\omega) && \text{(by linearity of the integral)} \\
&= \sum_{j=1}^{m} \int_{\Omega} a_j 1_{A_j}(x) 1_{A_j}(\omega)\,dP(\omega) \\
&= \sum_{j=1}^{m} a_j P(A_j) 1_{A_j}(x),
\end{align*}
and so $g_n$ is simple. For $k(x)$, we have

\begin{align*}
k(x) &= V(\hat{h}) \\
&= E(\hat{h}^2) - E(\hat{h})^2 \\
&= E(\hat{h}^2) - g^2.
\end{align*}

Now, since $\hat{h}$ is measurable, $\hat{h}^2$ is measurable. Following the same argument as above, we find that $E(\hat{h}^2)$ is measurable (as a function of $x$), and so $k$ is measurable.

Proposition 0.48. For $g$ and $h$ as defined above,

\[
E\{g(X)\} = E\{h(X, Y)\}.
\]

Proof. Suppose $X$ is a random variable on $\Omega_1$ with probability measure $P_1$ and $Y$ is a random variable on $\Omega_2$ with probability measure $P_2$. Finally, let $P$ be the probability measure on $\Omega = \Omega_1 \times \Omega_2$ induced by $P_1$ and $P_2$. In order to make use of Fubini's Theorem later in the proof, we must establish first that $P = P_1 \times P_2$. To that end, observe that for any measurable sets $A \subset \Omega_1$ and

$B \subset \Omega_2$,
\begin{align*}
P(A \times B) &= \int_{\Omega} 1_{A \times B}\,dP \\
&= \int_{\Omega} 1_A \cdot 1_B\,dP \\
&= \int_{\Omega} 1_A\,dP \cdot \int_{\Omega} 1_B\,dP && \text{(since $X$ and $Y$ are independent)} \\
&= \int_{\Omega_1} 1_A(\omega_1)\,dP_1(\omega_1) \cdot \int_{\Omega_2} 1_B(\omega_2)\,dP_2(\omega_2) \\
&= P_1(A) \cdot P_2(B).
\end{align*}
Now,
\begin{align*}
E(g(X)) &= \int_{\Omega_1} g(X(\omega_1))\,dP_1(\omega_1) \\
&= \int_{\Omega_1} \int_{\Omega_2} h(X(\omega_1), Y(\omega_2))\,dP_2(\omega_2)\,dP_1(\omega_1) \\
&= \int_{\Omega} h(X(\omega_1), Y(\omega_2))\,dP(\omega_1, \omega_2) && \text{(by Fubini's Theorem)} \\
&= E(h(X, Y)).
\end{align*}

Proposition 0.49. For $g$, $h$, and $k$ as defined above, $V\{g(X)\} + E\{k(X)\} = V\{h(X,Y)\}$.

Proof. We have
\begin{align*}
&V(g(X)) + E(k(X)) \\
&= \int_{\Omega_1} g(X(\omega_1))^2\,dP_1(\omega_1) - \left( \int_{\Omega_1} g(X(\omega_1))\,dP_1(\omega_1) \right)^2 + \int_{\Omega_1} k(X(\omega_1))\,dP_1(\omega_1) \\
&= \int_{\Omega_1} \left[ g(X(\omega_1))^2 + k(X(\omega_1)) \right] dP_1(\omega_1) - \left( \int_{\Omega_1} g(X(\omega_1))\,dP_1(\omega_1) \right)^2 \\
&= \int_{\Omega_1} \left[ \left( \int_{\Omega_2} h(X(\omega_1), Y(\omega_2))\,dP_2(\omega_2) \right)^2 + \int_{\Omega_2} h(X(\omega_1), Y(\omega_2))^2\,dP_2(\omega_2) \right. \\
&\qquad \left. - \left( \int_{\Omega_2} h(X(\omega_1), Y(\omega_2))\,dP_2(\omega_2) \right)^2 \right] dP_1(\omega_1) - \left( \int_{\Omega_1} \int_{\Omega_2} h(X(\omega_1), Y(\omega_2))\,dP_2(\omega_2)\,dP_1(\omega_1) \right)^2 \\
&= \int_{\Omega_1} \int_{\Omega_2} h(X(\omega_1), Y(\omega_2))^2\,dP_2(\omega_2)\,dP_1(\omega_1) - \left( \int_{\Omega_1} \int_{\Omega_2} h(X(\omega_1), Y(\omega_2))\,dP_2(\omega_2)\,dP_1(\omega_1) \right)^2 \\
&= E(h(X,Y)^2) - E(h(X,Y))^2 \\
&= V(h(X,Y)).
\end{align*}
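As an added concrete check (ignoring the nonnegativity restriction on $h$ for the sake of the illustration), take $h(x, y) = x + y$. Then $g(x) = x + E(Y)$ and $k(x) = V(Y)$, so
\[
V\{g(X)\} + E\{k(X)\} = V(X) + V(Y) = V(X + Y) = V\{h(X, Y)\},
\]
where the middle equality uses Lemma 0.39 and the independence of $X$ and $Y$ (so that $C(X, Y) = 0$).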
