Lecture-14: Almost Sure Convergence of Random Variables

Lecture-14: Almost sure convergence of random variables

1 Almost sure convergence

Consider a probability space (Ω,F, P). Recall that a random variable X is an F-measurable function on the −1 sample space Ω such that X (−∞, x] ∈ F for all x ∈ R. A sequence of random variables (Xn(ω) : n ∈ N) is hence a sequence of F-measurable functions. There are many possible definitions for convergence of a sequence of random variables. One idea is to require Xn(ω) to converge for each fixed ω. However, at least intuitively, what happens on an event of probability zero is not important. Definition 1.1. A statement holds almost surely (abbreviated a.s.) if there exists an event called the exception set N ∈ F with P(N) = 0 such that the statement holds if ω ∈/ N.

Example 1.2 (Almost sure equality). Two random variables X,Y deﬁned on the probability space (Ω,F, P) are said to be equal a.s. if there exists an exception set

N = {ω ∈ Ω : X(ω) 6= Y(ω)} ∈ F,

and P(N) = 0. Then Y is called a version of X, and we can deﬁne an equivalence class of a.s. equal random variables.

Example 1.3 (Almost sure monotonicity). Two random variables X,Y deﬁned on the probability space (Ω,F, P) are said to be X 6 Y a.s. if there exists an exception set N = {ω ∈ Ω : X(ω) > Y(ω)} ∈ F and P(N) = 0.

Deﬁnition 1.4. If (Xn : n ∈ N) is a sequence of random variables, then limn Xn exists a.s. means there exists an exception event N ∈ F, such that P(N) = 0 and if ω ∈/ N, then limn Xn(ω) exists. That is, c N = ω ∈ Ω : limsup Xn(ω) = liminf Xn(ω) . n n c Let X be the point-wise limit of the sequence of random variables (Xn(ω) : n ∈ N) on the set N , then we say that the sequence (Xn(ω) : n ∈ N) converges almost surely to X, and denote it as

lim Xn = X a.s. n

Example 1.5 (Convergence almost surely but not everywhere). Consider the probability space ([0,1],B([0,1]),λ) such that λ([a,b]) = b − a for all 0 6 a 6 b 6 1. We deﬁne the scaled indicator random variable Xn : Ω → {0,1} such that Xn(ω) = n1 1 (ω). [0, n ] 1 Let N = {0}, then for any ω ∈/ N, there exists m = d ω e ∈ N, such that for all n > m, we have Xn(ω) = 0. That is, limn Xn = 0 a.s. since λ(N) = 0. However, Xn(0) = n for all n ∈ N.

1 2 Convergence in probability

Deﬁnition 2.1. A sequence (Xn : n ∈ N) of random variables converges in probability to a random variable X, if for any e > 0 lim P {ω ∈ Ω : |Xn(ω) − X(ω)| > e} = 0. n

Remark 1. For a sequence (Xn : n ∈ N), almost sure convergence of means that for almost all outcomes ω, the difference Xn(ω) − X(ω) gets small and stays small. Convergence in probability is weaker and merely requires that the probability of the difference Xn(ω) − X(ω) being non-trivial becomes small.

Example 2.2 (Convergence in probability but not almost surely). Consider the probability space ([0,1],B([0,1]),λ) such that λ([a,b]) = b − a for all 0 6 a 6 b 6 1. For each k ∈ N, we consider the se- k quence Sk = ∑i=1 i, and deﬁne integer intervals Ik , {Sk−1 + 1,. . .,Sk}. Clearly, the intervals (Ik : k ∈ N) partition the natural numbers, and each n ∈ N lies in some Ik, such that n = Sk−1 + i for i ∈ [k]. There- fore, for each n ∈ N, we deﬁne indicator random variable Xn : Ω → {0,1} such that 1 Xn(ω) = i−1 i (ω). [ k , k ]

For any ω ∈ [0,1], we have Xn(ω) = 1 for inﬁnitely many values since there exist inﬁnitely many (i,k) (i−1) i pairs such that k 6 ω 6 k , and hence limsupn Xn(ω) = 1 and hence limn Xn(ω) 6= 0. However, limn Xn(ω) = 0 in probability, since

1 limλ {Xn(ω) 6= 0} = lim = 0. n n k

3 Borel-Cantelli Lemma

Lemma 3.1 (inﬁnitely often and almost all). Let (An ∈ F : n ∈ N) be a sequence of events.

(a) For some subsequence (kn : n ∈ N) depending on ω, we have ( ) = ∈ 1 ( ) = = ∈ ∈ ∈ N = { } limsup An ω Ω : ∑ An ω ∞ ω Ω : ω Akn ,n An inﬁnitely often . n n∈N

(b) For a ﬁnite n0(ω) ∈ N depending on ω, we have ( ) 1 liminf An = {ω ∈ Ω : ω ∈ An for all n n0(ω)} = ω ∈ Ω : Ac (ω) < ∞ = {An for all but ﬁnitely many n}. n > ∑ n n∈N

Proof. Let (An ∈ F : n ∈ N) be a sequence of events. (a) Let ω ∈ limsup A = ∩ sup A , then ω ∈ sup A for all n ∈ N. Therefore, for each n ∈ N, n n n∈N k>n k k>n k there exists kn ∈ N such that ω ∈ Akn , and hence

1 (ω) 1 (ω) = ∞. ∑ Aj > ∑ Akn j∈N n∈N

1 ( ) = ∈ N ∈ N ∈ Conversely, if ∑j∈N Aj ω ∞, then for each n there exists a kn such that ω Akn and hence ω ∈ ∪k>n Ak for all n ∈ N.

(b) Let ω ∈ liminfn An, then there exists n (ω) such that ω ∈ An for all n n (ω). Conversely, if 1 c (ω) < 0 > 0 ∑j∈N Aj ∞, then there exists n0(ω) such that ω ∈ An for all n > n0(ω).

2 Theorem 3.2 (Convergences a.s. implies in probability). If a sequence of random variables (Xn(ω) : n ∈ N) deﬁned on a probability space (Ω,F, P) converges a.s. to a random variable X, then it converges in probability to the same random variable.

Proof. Since limn Xn = X a.s., let N be the exception set. Let e > 0 and ω ∈/ N, then there exists an n0(ω) such that |Xn − X| 6 e for all n > n0. Deﬁning event An = {ω ∈ Ω : |Xn(ω) − X(ω)| > e}, we have 1 = P(Ac for all but ﬁnitely many n) = P(liminf Ac ). n n n c c Since liminfn An = (limsupn An) , we get

0 = P(limsup An) = lim P( Ak) lim P(An) 0. n ∑ > n > n k>n

Proposition 3.3 (Borel-Cantelli Lemma). Let (An ∈ F : n ∈ N) be a sequence of events such that ∑n∈N P(An) < ∞, then P(An i.o.) = 0.

Proof. We can write the probability of inﬁnitely often occurrence of An, by the continuity and sub-additivity of probability as P(limsup An) = lim P(∪k n Ak) lim P(Ak) = 0. n > 6 n ∑ n k>n

The last equality follows from the fact that ∑n∈N P(An) < ∞.

Proposition 3.4 (Borel zero-one law). If (An ∈ F : n ∈ N) is a sequence of independent events, then ( 0, iff ∑n P(An) < ∞, P(An i.o.) = 1, iff ∑n P(An) = ∞.

Proof. Let (An ∈ F : n ∈ N) be a sequence of independent events.

(a) From Borel-Cantelli Lemma, if ∑n P(An) < ∞ then P(An i.o.) = 0. ( ) = ( ) = ∈ N (b) Conversely, suppose ∑n P An ∞, then ∑k>n P Ak ∞ for all n . From the deﬁnition of limsup and liminf, continuity of probability, and independence of events (Ak ∈ F : k ∈ N) we get m c m c P(An i.o.) = 1 − P(liminf A ) = 1 − limlim P(∩ A ) = 1 − limlim (1 − P(Ak)). n n n m k=n k n m ∏ k=n − Since 1 − x 6 e x for all x ∈ R, from the above equation, the continuity of exponential function, and the hypothesis, we get m − ∑ P(Ak) 1 P(An i.o.) 1 − limlime k=n = 1 − limexp(− P(Ak)) = 1. > > n m n ∑ k>n

Example 3.5 (Convergence in probability can imply almost sure convergence). Consider a sequence of Bernoulli random variables (Xn ∈ {0,1} : n ∈ N) deﬁned on the probability space (Ω,F, P) such that P {Xn = 1} = pn for all n ∈ N. Note that the sequence of random variables is not assumed to be independent, and deﬁnitely not identical. If limn pn = 0, then we see that limn Xn = 0 in probability. ( In addition, if ∑n∈N pn < ∞, then limn Xn = 0 a.s. To see this, we observe from application of the Borel-Cantelli Lemma to events (An = {Xn = 1} : n ∈ N) that

c c 1 = P((limsup An) ) = P((liminf An). n n

c That is, limn Xn = 0 for ω ∈ liminfn An, implying almost sure convergence.

3 Theorem 3.6. A random sequence (Xn : n ∈ N) converges to a random variable X in probability, then there exists a subsequence (nk : k ∈ N) ⊂ N such that (Xnk : k ∈ N) converges almost surely to X.

Proof. Letting n1 = 1, we deﬁne the following subsequence and event recursively for each j ∈ N, n n o o n o n N > n P |X − X| > −j < −j r N A X − X > −j j , inf j−1 : r 2 2 , for all > , j , nj+1 2 .

−j From the construction, we have limk nk = ∞, and P(Aj) < 2 for each j ∈ N. Therefore, ∑k∈N P(Ak) < ∞, and hence by the Borel-Cantelli Lemma, we have P(limsupk Ak) = 0. Let N = limsupk Ak be the exception set such that for any outcome ω ∈/ N, for all but ﬁnitely many j ∈ N

( ) − ( ) −j Xnj ω X ω 6 2 .

That is, for all ω ∈/ N, we have limn Xn(ω) = X(ω).

Theorem 3.7. A random sequence (Xn : n ∈ N) converges to a random variable X in probability iff each subsequence (Xn : k ∈ N) contains a further subsequence (Xn : j ∈ N) converges almost surely to X. k kj