Lecture 18: April 5 18.1 Continuous Mapping Theorem


36-752 Advanced Probability Overview                                        Spring 2018
Lecture 18: April 5
Lecturer: Alessandro Rinaldo                                  Scribes: Wanshan Li

Note: LaTeX template courtesy of UC Berkeley EECS dept.

Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications. They may be distributed outside this class only with the permission of the Instructor.

18.1 Continuous Mapping Theorem

Let $\{X_n\}_{n=1}^{\infty}$ and $X$ be random variables taking values in a metric space $(\mathcal{X}, d)$. Recall that last time we defined $X_n \xrightarrow{D} X$ to mean that $\lim_{n\to\infty} \mathbb{E}[g(X_n)] = \mathbb{E}[g(X)]$ for all bounded continuous functions $g$. This characterization can be generalized as follows:

Theorem 18.1. $X_n \xrightarrow{D} X$ if and only if
$$\lim_{n\to\infty} \mathbb{E}[g(X_n)] = \mathbb{E}[g(X)]$$
for all bounded $g$ that are continuous a.e. $[\mu_X]$, that is, $\mu_X(\{x : g \text{ is not continuous at } x\}) = 0$.

Remark. For a direct proof of this theorem, see Theorem 3.9.1 in Durrett's book, or the section on weak convergence in Billingsley's book. It can also be proved using Skorokhod's Representation Theorem, given below.

Theorem 18.2 (Skorokhod's Representation Theorem). Suppose $X_n \xrightarrow{D} X$, all taking values in a metric space $(\mathcal{X}, d)$, and that the probability measure $\mu$ of $X$ is separable. Then there exist $\{Y_n\}$ and $Y$, taking values in $(\mathcal{X}, d)$ and defined on some probability space $(\Omega, \mathcal{F}, p)$, such that $X_n \stackrel{d}{=} Y_n$, $X \stackrel{d}{=} Y$, and
$$Y_n \xrightarrow{a.s.} Y,$$
where the almost sure convergence is with respect to $p$.

In a narrow sense, the so-called continuous mapping theorem (CMT) concerns convergence in distribution of random variables, as we discuss first. The theorem has three parts. Roughly speaking, the main part says that if $X_n \xrightarrow{D} X$ and $f$ is an a.e. $[\mu_X]$ continuous function, then $f(X_n) \xrightarrow{D} f(X)$.

Theorem 18.3 (Continuous Mapping Theorem, I). Let $\{X_n\}_{n=1}^{\infty}$ be a sequence of random variables and $X$ another random variable, all taking values in the same metric space $\mathcal{X}$. Let $\mathcal{Y}$ be a metric space and $f : \mathcal{X} \to \mathcal{Y}$ a measurable function.
Define $C_f = \{x : f \text{ is continuous at } x\}$. Suppose that $X_n \xrightarrow{D} X$ and $P(X \in C_f) = 1$. Then $f(X_n) \xrightarrow{D} f(X)$.

Proof. Denote the probability measures of $X_n$ and $X$ by $\mu_n$ and $\mu$. Last time we saw that for every closed $B \subset \mathcal{Y}$,
$$\overline{f^{-1}(B)} \subset f^{-1}(B) \cup C_f^c,$$
where $\overline{f^{-1}(B)}$ is the closure of $f^{-1}(B) = \{x \in \mathcal{X} : f(x) \in B\}$. Therefore,
$$\limsup_{n\to\infty} P(f(X_n) \in B) = \limsup_{n\to\infty} \mu_n(f^{-1}(B)) \le \limsup_{n\to\infty} \mu_n\big(\overline{f^{-1}(B)}\big) \overset{\text{Portmanteau}}{\le} \mu\big(\overline{f^{-1}(B)}\big) \le \mu(f^{-1}(B)) + \mu(C_f^c) = \mu(f^{-1}(B)) = P(f(X) \in B).$$
This implies $f(X_n) \xrightarrow{D} f(X)$ by the Portmanteau Theorem.

Example 18.4. Suppose $X_1, X_2, \ldots$ are i.i.d. samples from a (well-behaved) distribution with mean $\mu$ and variance $\sigma^2$. By the CLT we have $\frac{\sqrt{n}}{\sigma}(\bar{X}_n - \mu) \xrightarrow{D} N(0,1)$. Now by the CMT we have $\left[\frac{\sqrt{n}}{\sigma}(\bar{X}_n - \mu)\right]^2 \xrightarrow{D} \chi^2_1$.

Theorem 18.5 (Continuous Mapping Theorem, II). Let $\{X_n\}_{n=1}^{\infty}$, $X$, and $\{Y_n\}_{n=1}^{\infty}$ be random variables taking values in a metric space with metric $d$. Suppose that $X_n \xrightarrow{D} X$ and $d(X_n, Y_n) \xrightarrow{P} 0$. Then $Y_n \xrightarrow{D} X$.

Theorem 18.6 (Continuous Mapping Theorem, III). Let $\{X_n\}_{n=1}^{\infty}$ and $\{Y_n\}_{n=1}^{\infty}$ be random variables. Suppose that $X_n \xrightarrow{D} X$ and $Y_n \xrightarrow{P} c$. Then $(X_n, Y_n) \xrightarrow{D} (X, c)$. Furthermore, if $X_n \perp\!\!\!\perp Y_n$ and $Y_n \xrightarrow{D} Y$, then $(X_n, Y_n) \xrightarrow{D} (X', Y')$, where $X' \stackrel{d}{=} X$, $Y' \stackrel{d}{=} Y$, and $X' \perp\!\!\!\perp Y'$.

The CMT can also be generalized to cover convergence in probability, as the following theorem shows.

Theorem 18.7 (CMT for convergence in probability). If $X_n \xrightarrow{P} X$ and $f$ is continuous a.s. $[\mu_X]$, then $f(X_n) \xrightarrow{P} f(X)$.

Remark. Notice also the trivial fact that if $X_n \xrightarrow{a.s.} X$ then $f(X_n) \xrightarrow{a.s.} f(X)$. Therefore the CMT holds for all three modes of convergence.

Proof. Fix an arbitrary $\varepsilon > 0$; we want to show that
$$P(d_{\mathcal{Y}}(f(X_n), f(X)) > \varepsilon) \to 0 \quad \text{as } n \to \infty.$$
Fix $\delta > 0$ and let $B_\delta$ be the subset of $\mathcal{X}$ consisting of all $x$ such that there exists $y \in \mathcal{X}$ with $d_{\mathcal{X}}(x, y) < \delta$ and $d_{\mathcal{Y}}(f(x), f(y)) > \varepsilon$. Then
$$\{X \notin B_\delta\} \cap \{d_{\mathcal{Y}}(f(X_n), f(X)) > \varepsilon\} \subset \{d_{\mathcal{X}}(X_n, X) \ge \delta\}.$$
By the fact that $A^c \cap B \subset C \implies B \subset A \cup C$, we have
$$P(d_{\mathcal{Y}}(f(X_n), f(X)) > \varepsilon) \le P(X \in B_\delta) + P(d_{\mathcal{X}}(X_n, X) \ge \delta) =: T_1 + T_2.$$
Now $T_2 \to 0$ as $n \to \infty$ because $X_n \xrightarrow{P} X$.
As for $T_1$, notice that
$$P(X \in B_\delta) = P(X \in B_\delta \cap C_f) \downarrow 0 \quad \text{as } \delta \downarrow 0.$$
Thus, for $n$ large enough and $\delta$ small enough,
$$P(d_{\mathcal{Y}}(f(X_n), f(X)) > \varepsilon) \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

Remark. If $f$ (defined on $\mathcal{X}$) is a.s. $[\mu_X]$ continuous and $g$ (defined on $\mathcal{Y}$) is continuous, then $g \circ f$ is a.s. $[\mu_X]$ continuous. Therefore the CMT follows immediately from Theorem 18.1.

Example 18.8. Suppose $X_1, X_2, \ldots$ are i.i.d. samples with mean $\mu$ and variance $\sigma^2$. We want to estimate $\sigma^2$ by an estimator $\hat{\sigma}_n^2 \xrightarrow{P} \sigma^2$. Consider $\hat{\sigma}_n^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X}_n)^2$, where $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$. By the WLLN we have
$$\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2 \xrightarrow{P} \sigma^2, \qquad \bar{X}_n \xrightarrow{P} \mu.$$
Using this, together with the fact that
$$\hat{\sigma}_n^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2 - (\bar{X}_n - \mu)^2,$$
and $(\bar{X}_n - \mu)^2 \xrightarrow{P} 0$ by the CMT, we have
$$\left(\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2,\ (\bar{X}_n - \mu)^2\right)' \xrightarrow{P} (\sigma^2, 0)'.$$
Now let $f(x, y) = x - y$, which is continuous; by the CMT we have
$$\hat{\sigma}_n^2 = f\left(\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2,\ (\bar{X}_n - \mu)^2\right) \xrightarrow{P} \sigma^2.$$

A direct but useful corollary of the continuous mapping theorem is Slutsky's Theorem.

Theorem 18.9 (Slutsky's Theorem). If $X_n \xrightarrow{D} X$ and $Y_n \xrightarrow{D} c$, then
1. $X_n + Y_n \xrightarrow{D} X + c$;
2. $X_n Y_n \xrightarrow{D} cX$;
3. $X_n / Y_n \xrightarrow{D} X / c$, provided that $c \ne 0$.

Example 18.10. By the Central Limit Theorem we have $\frac{\sqrt{n}}{\sigma}(\bar{X}_n - \mu) \xrightarrow{D} N(0,1)$. By the Law of Large Numbers we have $\hat{\sigma}_n^2 \xrightarrow{P} \sigma^2$. Then by the continuous mapping theorem $\sigma / \hat{\sigma}_n \xrightarrow{P} 1$, and by Slutsky's Theorem,
$$\frac{\sqrt{n}}{\hat{\sigma}_n}(\bar{X}_n - \mu) = \frac{\sigma}{\hat{\sigma}_n} \cdot \frac{\sqrt{n}}{\sigma}(\bar{X}_n - \mu) \xrightarrow{D} N(0,1).$$

18.2 Tightness

Definition 18.11. A sequence of probability measures $\{\mu_n\}_{n=1}^{\infty}$ is said to be tight if for every $\varepsilon > 0$ there exists a compact set $C$ such that
$$\mu_n(C) > 1 - \varepsilon \quad \text{for all } n.$$
Equivalently, if $X_n \sim \mu_n$, then $\{X_n\}_{n=1}^{\infty}$ is tight, or bounded in probability, if for every $\varepsilon > 0$ there exists $M$ such that
$$\sup_n P(\|X_n\| > M) < \varepsilon.$$
In that case, we write $X_n = O_P(1)$.

Example 18.12. Cases where boundedness in probability fails:
1. $P(X_n = c_n) = 1$ with $c_n \to \infty$.
2. $X_n \sim \mathrm{Uniform}([-n, n])$. (Note that then $P(X_n \in [-M, M]^c) = 1 - \frac{M}{n}$.)

Theorem 18.13. Let $\{X_n\}_{n=1}^{\infty}$ be a sequence of random variables taking values in $\mathbb{R}^d$.
(i) If $X_n \xrightarrow{D} X$, then $\{X_n\}$ is tight.
(ii) (Helly-Bray Selection Theorem.) If $\{X_n\}$ is tight, then there exists a subsequence $\{n_k\}$ such that $X_{n_k} \xrightarrow{D} X$. Further, if every convergent (in distribution) subsequence converges to the same $X$, then $X_n \xrightarrow{D} X$.

Proof of (i). $X$ is a random variable, so for every $\varepsilon > 0$ we can pick $M = M(\varepsilon) > 0$ such that $P(\|X\| > M) < \varepsilon$. Then by the Portmanteau Theorem, for all $n \ge n_0(\varepsilon, M)$,
$$P(\|X_n\| \ge M) \le P(\|X\| \ge M) + \varepsilon \le 2\varepsilon.$$
On the other hand, we can find $M_1 > M$ such that for all $n < n_0$,
$$P(\|X_n\| \ge M_1) \le 2\varepsilon.$$
And for $n \ge n_0$ we have
$$P(\|X_n\| \ge M_1) \le P(\|X_n\| \ge M) \le P(\|X\| \ge M) + \varepsilon \le 2\varepsilon.$$
Thus $\sup_n P(\|X_n\| \ge M_1) \le 2\varepsilon$.

Theorem 18.14 (Pólya's Theorem). Suppose $\{X_n\}_{n=1}^{\infty}$ and $X$ take values in $\mathbb{R}^d$, and $X_n \xrightarrow{D} X$. If the c.d.f. $F$ of $X$ is continuous (in particular, if $\mu_X \ll \lambda_d$, where $\lambda_d$ is the Lebesgue measure on $\mathbb{R}^d$), then, writing $F_n$ for the c.d.f. of $X_n$,
$$\sup_{x \in \mathbb{R}^d} |F_n(x) - F(x)| \to 0.$$

Theorem 18.15 (Glivenko-Cantelli Theorem). Suppose $X_1, X_2, \ldots$, with $X_i \in \mathbb{R}$, are i.i.d. samples from a distribution with c.d.f. $F$. For each $n$, let $\hat{F}_n$ be the empirical c.d.f.
$$\hat{F}_n(x) = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\{X_i \le x\}.$$
We have
$$\|\hat{F}_n - F\|_{\infty} = \sup_{x \in \mathbb{R}} |\hat{F}_n(x) - F(x)| \xrightarrow{a.s.} 0.$$

Remark.
• Notice that $\mathbb{E}[\hat{F}_n(x)] = F(x)$ for all $x$. Therefore by the SLLN we have $|\hat{F}_n(x) - F(x)| \xrightarrow{a.s.} 0$ for each fixed $x$. However, the Glivenko-Cantelli Theorem is much stronger than this because it asserts uniform convergence.
• We often use another (even stronger) theorem instead, named after Aryeh Dvoretzky, Jack Kiefer, and Jacob Wolfowitz, who in 1956 proved this inequality:

Theorem 18.16 (Dvoretzky-Kiefer-Wolfowitz). For every $\varepsilon > 0$,
$$P\left(\sup_{x \in \mathbb{R}} |\hat{F}_n(x) - F(x)| > \varepsilon\right) \le 2 e^{-2n\varepsilon^2}.$$
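Example 18.10 is easy to check numerically. The following sketch (Python with NumPy; the Exponential(1) model, sample size, and seed are illustrative choices, not from the notes) forms the studentized mean $\frac{\sqrt{n}}{\hat{\sigma}_n}(\bar{X}_n - \mu)$ over many replications and compares its behavior with $N(0,1)$:

```python
import numpy as np

# Empirical check of Example 18.10 / Slutsky's Theorem: even with sigma
# replaced by the estimate sigma_hat, the studentized mean is close to N(0,1).
rng = np.random.default_rng(0)
n, reps = 2000, 5000
mu = 1.0                                    # Exponential(1) has mean 1, variance 1
samples = rng.exponential(scale=1.0, size=(reps, n))
xbar = samples.mean(axis=1)
sigma_hat = samples.std(axis=1)             # the 1/n estimator from Example 18.8
t = np.sqrt(n) * (xbar - mu) / sigma_hat    # studentized mean, one value per replication

print(t.mean())                   # close to 0
print(t.std())                    # close to 1
print(np.mean(np.abs(t) < 1.96))  # close to 0.95
```

The underlying data are far from normal, yet the empirical mean, standard deviation, and 95% coverage of $t$ match the standard normal, which is exactly what Slutsky's Theorem predicts.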
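The Glivenko-Cantelli Theorem and the rate suggested by the DKW inequality can also be seen in a small simulation. The sketch below (Python with NumPy; the Uniform(0,1) model, the helper name `sup_dist`, and the sample sizes are our illustrative choices) computes $D_n = \|\hat{F}_n - F\|_{\infty}$ exactly from the order statistics, since for a continuous $F$ the supremum is attained at the sample points:

```python
import numpy as np

# Exact D_n = sup_x |F_hat_n(x) - F(x)| for i.i.d. Uniform(0,1) data, where F(x) = x.
# At the i-th order statistic, F_hat_n jumps from (i-1)/n to i/n, so it suffices
# to check the gaps on both sides of each jump.
def sup_dist(x):
    x = np.sort(x)
    n = len(x)
    i = np.arange(1, n + 1)
    return max(np.max(i / n - x), np.max(x - (i - 1) / n))

rng = np.random.default_rng(1)
ds = {}
for n in [100, 1000, 10000]:
    ds[n] = sup_dist(rng.uniform(size=n))
    print(n, ds[n])  # D_n shrinks roughly like 1 / sqrt(n)
# DKW: P(D_n > eps) <= 2 * exp(-2 * n * eps**2), i.e. D_n = O_P(1 / sqrt(n))
```

The printed values of $D_n$ decay roughly like $n^{-1/2}$, consistent with the DKW tail bound $P(D_n > \varepsilon) \le 2e^{-2n\varepsilon^2}$.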