Probability and Measure

Robert L. Wolpert
Institute of Statistics and Decision Sciences
Duke University, Durham, NC, USA

Convergence of Random Variables

1 Convergence Concepts

1.1 Convergence of Real Numbers

A sequence of real numbers $a_n$ converges to a limit $a$ if and only if, for each $\epsilon > 0$, the sequence $a_n$ eventually lies within a ball of radius $\epsilon$ centered at $a$. It's okay if the first few (or few million) terms lie outside that ball, and the number of terms that do lie outside the ball may depend on how big $\epsilon$ is (if $\epsilon$ is small enough it will take millions of terms before the remaining sequence lies inside the ball). This can be made mathematically precise by introducing a letter (say, $N_\epsilon$) for how many initial terms we have to throw away, so that $a_n \to a$ if and only if there is an $N_\epsilon < \infty$ so that, for each $n \ge N_\epsilon$, $|a_n - a| < \epsilon$: only finitely many $a_n$ can be farther than $\epsilon$ from $a$.

The same notion of convergence really works in any (complete) metric space, where we require that some measure of the distance $d(a_n, a)$ from $a_n$ to $a$ tend to zero, in the sense that it exceeds each number $\epsilon > 0$ for at most some finite number $N_\epsilon$ of terms.

Points $a_n$ in $d$-dimensional Euclidean space will converge to a limit $a \in \mathbb{R}^d$ if and only if each of their coordinates converges; and, since there are only finitely many of them, if they all converge then they do so uniformly (i.e., for each $\epsilon$ we can take the same $N_\epsilon$ for all $d$ of the coordinate sequences).
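The $\epsilon$-$N_\epsilon$ bookkeeping can be checked numerically. The short Python sketch below is an added illustration, not part of the original notes; the sequence $a_n = 1 + (-1)^n/n$ and the helper `smallest_N` are my own choices. For several tolerances it reports the smallest $N_\epsilon$ beyond which every term stays within $\epsilon$ of the limit $a = 1$, showing that $N_\epsilon$ is always finite but grows as $\epsilon$ shrinks.

```python
import numpy as np

def smallest_N(a_seq, a_limit, eps):
    """Smallest N (1-indexed) with |a_n - a| < eps for every n >= N, assuming the
    computed tail of a_seq already lies within eps of the limit."""
    outside = np.nonzero(np.abs(a_seq - a_limit) >= eps)[0]   # indices of violations
    return 1 if outside.size == 0 else int(outside[-1]) + 2   # one past the last violating n

n = np.arange(1, 10**6 + 1)
a = 1.0 + (-1.0) ** n / n            # a_n -> 1, approaching the limit from both sides

for eps in [0.5, 0.1, 0.01, 0.001]:
    print(f"eps = {eps:>6}:  N_eps = {smallest_N(a, 1.0, eps)}")
# Only finitely many terms lie outside each ball of radius eps, and the cutoff
# N_eps grows as eps shrinks.
```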
2 Convergence of Random Variables

For random variables $X_n$ the idea of convergence to a limiting random variable $X$ is more delicate, since each $X_n$ is a function of $\omega \in \Omega$ and usually there are infinitely many points $\omega \in \Omega$. What should we mean in asking about the convergence of a sequence $X_n$ of random variables to a limit $X$? Should we mean that $X_n(\omega)$ converges to $X(\omega)$ for each fixed $\omega$? Or that these sequences converge uniformly in $\omega \in \Omega$? Or that some notion of the distance $d(X_n, X)$ between $X_n$ and the limit $X$ decreases to zero? Should the probability measure $P$ be involved in some way?

Here are a few different choices of what we might mean by the statement that "$X_n$ converges to $X$," for a sequence of random variables $X_n$ and a random variable $X$, all defined on the same probability space $(\Omega, \mathcal{F}, P)$:

pw: The sequence of real numbers $X_n(\omega) \to X(\omega)$ for every $\omega \in \Omega$ (pointwise convergence):
$$(\forall \epsilon > 0)\,(\forall \omega \in \Omega)\,(\exists N_{\epsilon,\omega} < \infty)\,(\forall n \ge N_{\epsilon,\omega})\qquad |X_n(\omega) - X(\omega)| < \epsilon.$$

uni: The sequences of real numbers $X_n(\omega) \to X(\omega)$ uniformly for $\omega \in \Omega$:
$$(\forall \epsilon > 0)\,(\exists N_{\epsilon} < \infty)\,(\forall \omega \in \Omega)\,(\forall n \ge N_{\epsilon})\qquad |X_n(\omega) - X(\omega)| < \epsilon.$$

a.s.: Outside some null event $N \in \mathcal{F}$, each sequence of real numbers $X_n(\omega) \to X(\omega)$ (almost-sure convergence, or "almost everywhere" (a.e.)): for some $N \in \mathcal{F}$ with $P[N] = 0$,
$$(\forall \epsilon > 0)\,(\forall \omega \notin N)\,(\exists N_{\epsilon,\omega} < \infty)\,(\forall n \ge N_{\epsilon,\omega})\qquad |X_n(\omega) - X(\omega)| < \epsilon,$$
i.e., $P\big[\bigcup_{\epsilon > 0} \bigcap_{N < \infty} \bigcup_{n \ge N} \{|X_n(\omega) - X(\omega)| \ge \epsilon\}\big] = 0$.

$L_\infty$: Outside some null event $N \in \mathcal{F}$, the sequences of real numbers $X_n(\omega) \to X(\omega)$ converge uniformly ("almost-uniform" or "$L_\infty$" convergence): for some $N \in \mathcal{F}$ with $P[N] = 0$,
$$(\forall \epsilon > 0)\,(\exists N_{\epsilon} < \infty)\,(\forall \omega \notin N)\,(\forall n \ge N_{\epsilon})\qquad |X_n(\omega) - X(\omega)| < \epsilon.$$

i.p.: For each $\epsilon > 0$, the probabilities $P[|X_n - X| > \epsilon] \to 0$ (convergence "in probability", or "in measure"):
$$(\forall \epsilon > 0)\,(\forall \eta > 0)\,(\exists N_{\epsilon,\eta} < \infty)\,(\forall n \ge N_{\epsilon,\eta})\qquad P[|X_n - X| > \epsilon] < \eta.$$

$L_1$: The expectation $E[|X_n - X|]$ converges to zero (convergence "in $L_1$"):
$$(\forall \epsilon > 0)\,(\exists N_{\epsilon} < \infty)\,(\forall n \ge N_{\epsilon})\qquad E[|X_n - X|] < \epsilon.$$

$L_p$: For some fixed number $p > 0$, the expectation of the $p$th power $E[|X_n - X|^p]$ converges to zero (convergence "in $L_p$," sometimes called "in the $p$th mean"):
$$(\forall \epsilon > 0)\,(\exists N_{\epsilon} < \infty)\,(\forall n \ge N_{\epsilon})\qquad E[|X_n - X|^p] < \epsilon.$$

i.d.: The distributions of $X_n$ converge to the distribution of $X$, i.e., the measures $P \circ X_n^{-1}$ converge in some way to $P \circ X^{-1}$ ("vague" or "weak" convergence, or "convergence in distribution", sometimes written $X_n \Rightarrow X$):
$$(\forall \epsilon > 0)\,(\forall \phi \in C_b(\mathbb{R}))\,(\exists N_{\epsilon,\phi} < \infty)\,(\forall n \ge N_{\epsilon,\phi})\qquad \big|E[\phi(X_n) - \phi(X)]\big| < \epsilon.$$

Which of these eight notions of convergence is right for random variables? The answer is that all of them are useful in probability theory for one purpose or another. You will want to know which ones imply which other ones, under what conditions. All but the first two (pointwise, uniform) notions depend upon the measure $P$; it is possible for a sequence $X_n$ to converge to $X$ in any of these senses for one probability measure $P$, but to fail to converge for another $P'$. Most of them can be phrased as metric convergence for some notion of distance between random variables:

i.p.: $X_n \to X$ in probability if and only if $d_0(X, X_n) \to 0$ as real numbers, where
$$d_0(X, Y) \equiv E\left[\frac{|X - Y|}{1 + |X - Y|}\right].$$

$L_1$: $X_n \to X$ in $L_1$ if and only if $d_1(X, X_n) = \|X - X_n\|_1 \to 0$ as real numbers, where
$$\|X - Y\|_1 \equiv E|X - Y|.$$

$L_p$: $X_n \to X$ in $L_p$ if and only if $d_p(X, X_n) = \|X - X_n\|_p \to 0$ as real numbers, where
$$\|X - Y\|_p \equiv \big(E|X - Y|^p\big)^{1/p}.$$

$L_\infty$: $X_n \to X$ almost uniformly if and only if $d_\infty(X, X_n) = \|X - X_n\|_\infty \to 0$ as real numbers, where
$$\|X - Y\|_\infty = \mathrm{l.u.b.}\,\{r < \infty : P[|X - Y| > r] > 0\}.$$

As the notation suggests, convergence in probability and in $L_\infty$ are in some sense limits of convergence in $L_p$ as $p \to 0$ and $p \to \infty$, respectively. Almost-sure convergence is an exception: there is no metric notion of distance $d(X, Y)$ for which $X_n \to X$ almost surely if and only if $d(X, X_n) \to 0$.
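To make these metrics concrete, here is a small Monte Carlo sketch; it is an added illustration, and the sequence $X_n = X + Z/n$ (with $Z \sim N(0,1)$), the sample size, and the helper names are my own choices rather than anything from the notes. The estimates of $d_0(X, X_n)$, $\|X - X_n\|_1$ and $\|X - X_n\|_2$ all shrink toward zero, while $\|X - X_n\|_\infty = \infty$ for every $n$ because $Z$ is unbounded: the sequence converges in probability and in every $L_p$ with $p < \infty$, but not in the $L_\infty$ (almost-uniform) sense defined above.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10**6                          # Monte Carlo sample size (an arbitrary choice)
Z = rng.standard_normal(N)         # X_n - X = Z / n, so only the difference is needed

def d0(diff):
    """Estimate d_0(X, X_n) = E[ |X - X_n| / (1 + |X - X_n|) ]."""
    return np.mean(np.abs(diff) / (1.0 + np.abs(diff)))

def lp_norm(diff, p):
    """Estimate ||X - X_n||_p = (E |X - X_n|^p)^(1/p)."""
    return np.mean(np.abs(diff) ** p) ** (1.0 / p)

for n in [1, 10, 100, 1000]:
    diff = Z / n
    print(f"n = {n:5d}:  d_0 ~ {d0(diff):.5f}   ||.||_1 ~ {lp_norm(diff, 1):.5f}"
          f"   ||.||_2 ~ {lp_norm(diff, 2):.5f}")
# All three estimates shrink roughly like 1/n, while ||X - X_n||_inf (the essential
# supremum of |Z|/n) remains infinite for every n, since a standard normal is unbounded.
```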
2.1 Almost-Sure Convergence

Let $X$ and $\{X_n\}$ be a collection of RV's on some $(\Omega, \mathcal{F}, P)$. The set of points $\omega$ for which $X_n(\omega)$ does converge to $X(\omega)$ is just
$$\bigcap_{\epsilon > 0} \bigcup_{m=1}^{\infty} \bigcap_{n=m}^{\infty} \,[\omega : |X_n(\omega) - X(\omega)| \le \epsilon],$$
the points which, for all $\epsilon > 0$, have $|X_n(\omega) - X(\omega)|$ less than $\epsilon$ for all but finitely-many $n$. The sequence $X_n$ is said to converge "almost everywhere" (a.e.) to $X$, or to converge to $X$ "almost surely" (a.s.), if this set of $\omega$ has probability one, or (equivalently) if its complement is a null set:
$$P\Big[\bigcup_{\epsilon > 0} \bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} \,[\omega : |X_n(\omega) - X(\omega)| > \epsilon]\Big] = 0.$$
The union over $\epsilon > 0$ is only a countable one, since we need include only rational $\epsilon$ (or, for that matter, any sequence $\epsilon_k$ tending to zero, such as $\epsilon_k = 1/k$). Thus $X_n \to X$ a.e. if and only if, for each $\epsilon > 0$,
$$P\Big[\bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} \,[\omega : |X_n(\omega) - X(\omega)| > \epsilon]\Big] = 0. \qquad \text{(a.e.)}$$

This combination of intersection and union occurs frequently in probability, and has a name; for any sequence $E_n$ of events, $\big[\bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} E_n\big]$ is called the lim sup of the $\{E_n\}$, and is sometimes described more colorfully as $[E_n\ \text{i.o.}]$, the set of points in $E_n$ "infinitely often." Its complement is the lim inf of the sets $F_n = E_n^c$, $\big[\bigcup_{m=1}^{\infty} \bigcap_{n=m}^{\infty} F_n\big]$: the set of points in all but finitely many of the $F_n$. Since $P$ is countably additive, and since the intersection in the definition of lim sup is decreasing and the union in the definition of lim inf is increasing, we always have $P\big[\bigcup_{n=m}^{\infty} E_n\big] \searrow P\big[\bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} E_n\big]$ and $P\big[\bigcap_{n=m}^{\infty} F_n\big] \nearrow P\big[\bigcup_{m=1}^{\infty} \bigcap_{n=m}^{\infty} F_n\big]$ as $m \to \infty$. Thus,

Theorem 1 $X_n \to X$ $P$-a.s. if and only if for every $\epsilon > 0$,
$$\lim_{m \to \infty} P\big[\,|X_n - X| > \epsilon \ \text{for some } n \ge m\,\big] = 0.$$

In particular, $X_n \to X$ $P$-a.s. if $\sum_n P[|X_n - X| > \epsilon] < \infty$ for each $\epsilon > 0$ (why?).

2.2 Convergence In Probability

The sequence $X_n$ is said to converge to $X$ "in probability" (i.p.) if, for each $\epsilon > 0$,
$$P[\omega : |X_n(\omega) - X(\omega)| > \epsilon] \to 0. \qquad \text{(i.p.)}$$
If we denote by $E_n$ the event $[\omega : |X_n(\omega) - X(\omega)| > \epsilon]$, we see that convergence almost surely requires that $P\big[\bigcup_{n \ge m} E_n\big] \to 0$ as $m \to \infty$, while convergence in probability requires only that $P[E_n] \to 0$. Thus:

Theorem 2 If $X_n \to X$ a.e. then $X_n \to X$ i.p.

Here is a partial converse:

Theorem 3 If $X_n \to X$ i.p., then there is a subsequence $n_k$ such that $X_{n_k} \to X$ a.e.

Proof. Set $n_0 = 0$ and, for each integer $k \ge 1$, set
$$n_k = \inf\big\{\,n > n_{k-1} : P[\omega : |X_n(\omega) - X(\omega)| > 2^{-k}] \le 2^{-k}\,\big\}.$$
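To see why the converse in Theorem 3 is only partial, here is a sketch of a standard counterexample, the "sliding blocks" (typewriter) sequence on $\Omega = [0, 1]$ with $P$ uniform; the code and every name in it are an added illustration, not part of the notes. The sequence converges to $0$ in probability and in $L_1$, yet fails to converge at every $\omega \in [0, 1)$; a subsequence with summable exceedance probabilities, chosen in the spirit of the proof of Theorem 3 (here simply $n_k = 2^k$), does converge to $0$ almost everywhere.

```python
import numpy as np

def block_index(n):
    """Block k containing index n, where block k holds indices 2^k, ..., 2^(k+1) - 1."""
    return n.bit_length() - 1

def typewriter(n, omega):
    """n-th term of the 'sliding blocks' sequence on Omega = [0, 1]:
    X_n is the indicator of the j-th dyadic interval of length 2^(-k), with j = n - 2^k."""
    k = block_index(n)
    j = n - 2**k
    return 1.0 if j * 2.0**-k <= omega < (j + 1) * 2.0**-k else 0.0

# Convergence in probability and in L_1: for 0 < eps < 1,
# P[X_n > eps] = E[X_n] = 2^(-k(n)) -> 0 as n -> infinity.
for n in [1, 3, 10, 100, 1000, 10**5]:
    print(f"n = {n:6d}:  P[X_n > eps] = E[X_n] = {2.0 ** -block_index(n):.6f}")

# ...yet for each fixed omega in [0, 1), X_n(omega) = 1 exactly once in every block,
# so the sequence X_n(omega) oscillates between 0 and 1 and never converges.
rng = np.random.default_rng(1)
for omega in rng.uniform(0.0, 1.0, size=3):
    hits = [n for n in range(1, 2**10) if typewriter(n, omega) == 1.0]
    print(f"omega = {omega:.3f}:  X_n(omega) = 1 at n = {hits[:6]} ... ({len(hits)} times up to n = 1023)")

# Along the subsequence n_k = 2^k the exceedance probabilities 2^(-k) are summable,
# and X_{2^k}(omega) = 1 only while omega < 2^(-k), so X_{n_k}(omega) -> 0 for every omega > 0.
```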