<<

A theory and integration can just use sequences). In fact, in places we’ll even just restrict to the case of a (or even a separable metric space). If you need a reminder about these adjectives, see the appendix In these notes we give an alternative approach to measure theory and integration, emphasising on point set topology below. the duality between a measure (a way of measuring the size of of a space) and an integral Our goal for the next while will be to prove a theorem that looks something like this: (a rule for integrating some class of functions on the space). Theorem A.2. Let X be a nice space. There is a bijective correspondence between A.1 Motivation from the Lebesgue integral 1. Radon integrals on X, and 2. Radon measures on the Borel σ-algebra of X. Let’s think about for a moment, and try to abstract out what “integration” means. This is the theorem that says that we can take either of two points of view, takingasthe Certainly, we can integrate any continuous compactly supported function f C (Rn), and fundamental mathematical objects: ∈ c this is a linear functional, which we write as 1. the integrals of compactly supported continuous functions, or 2. the measures of Borel sets. n : Cc(R ) R (Of course measures of Borel sets are the same thing as integrals of characteristic functions of → ∫ Borel sets.) f f (x)dx. 7→ n ∫R We can also integrate many other functions, but for now let’s just concentrate on these — it A.2 Basic measure theory will eventually turn out that everything else is determined by how the integral behaves here. First we need to develop some of the basics of measure theory. I think Terry Tao’s book “An in- Remark. Is this a bounded linear functional? troduction to measure theory” is an excellent source for this; this section is essentially a selection Unfortunately no, as Rn is too big. It’s easy to see that if we restrict to some finite volume from §1.4 of that book. A Rn, then : C (A) R is a bounded linear functional, as ⊂ c → A.2.1 Algebras of sets ∫ f (x)dx (vol A)(sup f (x)) A ≤ x A Let X be some set (typically a or metric space, but for now we don’t need any ∫ ∈ = (vol A) f . structure at all). ∞ Definition A.3. A Boolean algebra (of subsets of X) is a collection of subsets of X such that In fact, the operator norm is = vol A. M • , What other properties does∫ integration ∫ have? The obvious one is that it is positive: if f (x) 0 ∅ ∈ M ≥ • if E , then X E , and everywhere (or even almost everywhere) then f (x)dx 0. It turns out that these properties ∈ M \ ∈ M A ≥ • if E, F , then E F . are all that ‘essentially’ matters about Lebesgue∫ integration, so let’s pull them out as a definition. ∈ M ∪ ∈ M When it is completely clear that we are talking about an algebra of sets (and not an algebra in Definition A.1. A Radon integral on a ‘nice’ space X is a positive linear functional Cc(X) R. → the sense of an associative algebra or a Lie algebra, for example), we may drop the word ‘Boolean’. What should ‘nice’ mean? We have a few options. One can set up the theory at the most One easily sees that E F, E F, F E, and the symmetric difference E∆F are all in as long ∩ \ \ M general level, allowing any locally compact X. To make life easier in what follows, as E and F are, by combining the last two axioms. Moreover, by induction, any finite union (or we’ll also freely assume second countability (so that we don’t need to futz about with nets, and finite intersection) of sets in is in . M M 1 2

We can talk about a Boolean algebra being finer than a Boolean algebra ′ — this just 2. Explain why this argument fails when we try to show that the σ-algebra generated by is M M σ F means ′ . One should think about this as the finer algebra ‘making finer distinctions’. In exactly n 0 n . (Hint: what happens if we try to take a countable union of sets En, where M ⊂ M ≥ F the other direction, if is finer that we say is coarser than . E σ ?) ′ ′ n n∪ M M M M ∈ F σ In probability, one often thinks of the elements of the set X as ‘states’ of an underlying system 3. Can you produce a family where in fact σ , n 0 n ? F ⟨F⟩ ≥ F (either at a moment, or representing an entire history), and an algebra of subsets of X describes 4. Learn about transfinite induction, and see if you’re satisfied by the resulting ‘explicit’ de- ∪ the possible ways to partition up these states according to some observations we might make. scription of . ⟨F⟩σ When we consider further observations (perhaps just by ‘time passing’) we should expect to pass How then do we work with σ-algebras? The following lemma provides an ‘induction principle’ to a finer algebra. for σ-algebras. Certainly on any set there is a finest algebra (the discrete algebra = 2X , also known as the M power set of X) and a coarsest algebra (just = , X ). Lemma A.6. Suppose P(E) is a property of subsets E X such that M {∅ } ⊂ • P( ) is true, The intersection of two Boolean algebras is always a Boolean algebra (coarser than eitherof ∅ • if P(E) is true, then P(X E) is true, and them), and in fact the intersection of an arbitrary family of Boolean algebras is again a Boolean \ algebra. • if Ei i 0 are subsets of X, and P(Ei) is true for all i, then P ( i 0 Ei) is true also. { } ≥ ≥ Now suppose that P(E) is true for all E , where is some family of subsets of X. Then P(E) is Definition A.4. Given an arbitrary collection of subsets of X, we can form the Boolean algebra ∈ F F ∪ F true for all E σ . generated , which we denote , as the intersection of all Boolean algebras containing . ∈ ⟨F ⟩ F ⟨F⟩bool F Proof: The collection E P(E) is a σ-algebra containing , so σ E P(E) . □ (This is an intersection of a nonempty collection, since the power set contains any collection { | } F ⟨F⟩ ⊂ { | } of subsets.) One can see that the Boolean algebra generated by is the unique coarsest Boolean Before we continue on to measures, we have to discuss the most important examples of σ- F algebra containing , and often this characterisation is useful. algebras. F Definition A.5. A σ-algebra is a Boolean algebra which moreover is closed under taking count- Definition A.7. If X is a topological space, the Borel σ-algebra is the σ-algebra generated by the able unions of sets in the algebra. open sets of X.

While we really want to be able to talk about σ-algebras, unfortunately they are much harder We’ll write this as [X] — the square brackets rather than parentheses are just a local notation B to deal with that Boolean algebras. While all the discussion above (finer, coarser, generation) to make sure we don’t mistake the Borel σ-algebra for the set of bounded linear transformations passes over immediately to the corresponding notions for σ-algebras, we find that σ-algebras are on something. somewhat slippery: To make sure you understand what we’ve been doing, make sure you try these:

Exercise. Given a collection of subsets of X, define a family n n 0 inductively by: Exercise. F {F } ≥ 1. Equivalenty, the Borel σ-algebra is generated by the closed sets of X. • 0 = F F 2. If X is a separable metric space, the Borel σ-algebra is generated by the open balls of X. (Is • n+1 = finite unions of set in n, or the complement of such a finite union F { σ F } there any hope of dropping the hypothesis ‘separable’ here?) and similarly define n replacing ‘finite’ everywhere with ‘countable’. {F } 3. If you enjoy point set topology, repeat the previous exercise merely assuming X is a second 1. Show that the Boolean algebra generated by is exactly n 0 n. (Hint: it is clear that F ≥ F every set in lies in . After that, you just need to show that it really isa countable, locally compact, Hausdorff space. (Can you drop any of those assumptions?) n 0 n bool ∪ ≥ F ⟨F⟩ 4. When X = Rn, the Borel σ-algebra is generated by boxes. (Or by boxes with rational Boolean algebra. 2nd hint: given E, F n 0 n, we may say E n1 and F n2 for ∪ ∈ ≥ F ∈ F ∈ F some n ,n 0.) coordinates, or by boxes with dyadic coordinates.) 1 2 ≥ ∪ 3 4 5. What condition on a topological space X is needed so that the σ-algebra generated by Exercise.

compact sets agrees with the Borel σ-algebra? • µ ( i 0 Ei) i 0 µ(Ei) ≥ ≤ ≥ Note that to prove = , it suffices to show you can write every set in using a • If E1 E2 are all in , ⟨F⟩σ ⟨G⟩σ F ∪ ⊂ ⊂ ·∑ · · M finite sequence of the operations of ‘take a set from ’, complementation, and countable union G µ Ei = lim µ(Ei) = sup µ(Ei). of sets previously constructed, and vice versa switching and . i F G i 0 i Exercise. If E and F are Borel in Ra and Rb respectively, show that E F is Borel in Ra+b. (Hint: *∪≥ + × • If E1 E2 are all in ,, and eventually- µ(Ei) is finite this is not trivial; assume F is a box, first.) ⊃ ⊃ · · · M a+b x b Exercise. Suppose E is Borel in R . Show that the slice E 1 = x2 R (x1, x2) E is Borel a { ∈ | ∈ } µ Ei = lim µ(Ei) = inf µ(Ei). for each x1 R . i i ∈ i 0 Now show that this statement is not true if we replace both instances of ‘Borel’ with ‘Lebesgue’. *∩≥ + (Hint, take a set C that isn’t Lebesgue in R (ha!), and consider C 0 . Why is this Lebesgue?) • Explain why the finiteness condition, - is required in the previous exercise. × { } (You might interpret this exercise as telling you that Borel sets are better than Lebesgue sets.) Definition A.11. A is just a measure defined on the σ-algebra of Borel sets. (Or sometimes, on a bigger σ-algebra containing all the Borel sets.) A.2.2 The definition of a measure

Definition A.8. If is a Boolean algebra of subsets of X, a finitely additive measure for is A.2.3 Completeness M M a function µ : [0, ] such that M → ∞ Definition A.12. A measure is complete if every of a measure zero set is measurable (and • µ( ) = 0 ∅ hence also has measure zero). • if E, F are disjoint, then µ(E F) = µ(E) + µ(F). ∈ M ∪ Sometimes completeness is a desirable property. Lemma A.9. • E F implies µ(E) µ(F) Definition A.13. Given a measure µ on a σ-algebra , we can complete the measure by defining M ⊂ ≤ n n • if E ,..., E are in , and are disjoint, µ E = µ(E ). a new σ-algebra ′ whose sets are of the form E X, such that there exists E′ so that 1 n M i=1 i i=1 i M ⊂ ∈ M • without the disjointness assumption, µ n E n µ(E ). E∆E′ is contained in a with respect to µ. We define the measure µ′ : ′ [0, ] simply i(=1∪ i ≤) i=1∑ i M → ∞ • µ(E F) + µ(E F) = µ(E) + µ(F) by µ′(E) = µ(E′). ∪ ∩ (∪ ) ∑ Exercise. Explain why one shouldn’t write the last fact as µ(E F) = µ(E) + µ(F) µ(E F). Exercise. • Check that as defined really is a σ-algebra. ∪ − ∩ M′ • Check that µ is well-defined (that is, it does not depend on the choiceof E for each E — Definition A.10. If is a σ-algebra on X, a measure for is a finitely additive measure for ′ ′ M M clearly there are many different choices). which further satisfies the condition M • Check that µ′ is still countably additive. • If Ei i 0 is a countable family of disjoint sets, each in , then { } ≥ M Exercise. Observe that in the construction of Lebesgue measure in Analysis 2, we effectively built

µ Ei = µ(Ei). Lebesgue measure on the Borel σ-algebra, and then completed it. i 0 i 0 *∪≥ + ∑≥ The triple (X, , µ) is called a measure space. Sometimes the pair (X, ) is called a measur- M , - M able space.

5 6

A.2.4 Radon measures Some notation: if is some collection of functions X R, we write for the subset of F → F+ non-negative functions. Definition A.14. A Borel measure is inner regular if Lemma A.17. µ(A) = sup µ(K) K is a compact subset of A . m { | } • Both Cc(X) and Cc(X)m are closed under addition, and multiplication by positive scalars. • When f C (X)m, we have f C (X) and vice versa. Definition A.15. A is a Borel measure which ∈ c − ∈ c m • is inner regular, and Lemma A.18. • µ(K) < for every compact set K. m ∞ • The function 1 is in Cc(X) iff X is compact. − m Exercise. Lebesgue measure on Rn is inner regular, and so easily a Radon measure. • The function 1 is in Cc(X) iff X is a countable union of compact sets. Proof: The first fact is easy: we have some compactly supported function lessthan 1, so outside − A.3 Extending integrals to all integrable functions of a compact set 0 1, so the whole space must be compact. ≤ − Taking K = supp f , for some sequence f C (X) such that f f , we see every point Throughout this section X is a locally compact Hausdorff space. We will assume X is second n n { n} ⊂ c n ↗ if eventually in some K , i.e. X = K . In the other direction, if X = K , for each n choose countable, although with some extra work this condition can be omitted. n n n some f C (X) which is identically 1 on K , and then take maximums as д = max(f ,..., f ). This section closely follows the presentation in §6.1 of Pedersen’s Analysis Now; I’ve tried to n ∈ c ∪ n ∪ n 1 n We see that д C (X) still, and д 1. □ unpack it a bit and make it a little friendlier. n ∈ c n ↗ We now set out to extend a Radon integral, from being merely defined on compactly supported Lemma A.19. If X is a locally compact metric space, χ C (X) for every compact set K. K ∈ c m continuous functions to a much bigger class of functions. This has two purposes: Proof: Define • to include the characteristic functions of ‘many’ sets in the class of integrals we can inte- 1 if x K grate, to extract a measure from the integral ∈ 1 fn(x) = 0 if d(x, K) • to provide a setting where we can prove the monotone and dominated convergence theo-  ≥ n  rems. 1 nd(x, K) otherwise. −  Observe that d(x, K) = inf d(x,y) y K = max d(x,y) y K since K is compact, and use { | ∈ } { | ∈ } A.3.1 Monotone limits of Cc(X) this to show d(x, K) is continuous, and hence that fn is continuous. To begin, we define two new classes of functions, One has to be quite careful checking that eventually fn is compactly supported! The problem 1 is that even though K is compact, the set x d(x, K) n could be noncompact. We need to use m { | ≤ } Cc(X) = f : X R + fn Cc(X) so fn f X → ∪ { ∞} | ∃ ∈ ↗ the hypothesis that is locally compact, and Lemma A.51 from the appendix, which ensures this C (X) = f : X R f C (X) so f f set is eventually compact. □ c m { → ∪ {−∞} | ∃ n ∈ c n ↘ } where f f expresses that f{ is an increasing sequence, converging pointwise} to f , and simi- One can also prove the following facts, although we won’t need them here: n ↗ n larly for f f . n ↘ Lemma A.20. We’re going to spend some time understanding these classes. m m • If f ,д Cc(X)+, then f д Cc(X)+. m ∈ ∈ m Example A.16. When X = R, χ(a,b) is in Cc(X) , but not in Cc(X)m. Conversely, χ[a,b]is in Cc(X)m, • Every non-negative lower semicontinous function is in Cc(X) . m m but not in Cc(X) . • C (X) C (X) = C (X). c ∩ c m c 7 8 A.3.2 Upper and lower integrals

We next define linear functionals

∗ : C (X)m R + c → ∪ { ∞} ∫ : f sup д f д C (X) 7→ | ≥ ∈ c {∫ } : C (X) R c m → ∪ {−∞} ¥ini**¥EiIYI ∫∗ . : f inf д f д C (X) 7→ | ≤ ∈ c {∫ } m Remember thatCc(X) andCc(X)m are cones, not vector spaces, so when we say linear functional we mean functions that commute with multiplication by non-negative scalars and with addition. Figure 1: We calculate ∗∗ f as an infimum over slightly larger functions д in the class C (X)m of It’s not actually obvious that these are linear; we address this in Lemma A.24. c m the supremum over compactly supported continuous functions h д of h. Observe that of course if f Cc(X), then ∗ f = f = f . Moreover if f Cc(X) ∫ ≤ ∈ ∗ ∈ ∫ ∫ ∫ ∫ m ∗ Lemma A.21. A function f C (X) is integrable if and only if ∗ f < , in which case f = f = ( f ). (A.1) ∈ c ∞ − − ∗ f . Similarly a function f Cc(X)m is integrable if and only if ∫ f > . ∫ ∫ ∫∗ ∈ ∗ −∞ We then define the class of integrable functions L1(X) (of course relative to the particular ∫Proof: If ∗ f < , for any ϵ > 0 we can find some h C (X)∫ so h f and ∗ f h < ϵ. ∞ ∈ c ≤ − R Radon measure, which we consider fixed throughout this section) to be those functions f : (That is,∫ in the definition of being integrable we observe we cantake д = f , and∫h Cc(X∫ ).) □ m → ∈ R such that for every ϵ > 0, we can find a larger function д Cc(X) and a smaller function ∪±∞ ∈ In not too long we’ll discover that the characteristic functions of compact Borel sets are au- h Cc(X)m, such that ∗ д h < ϵ. ∈ − tomatically in L1(X), and that we can define a Borel measure from a Radon integral using the Another way to express this∗ same idea is to define upper and lower integrals for arbitrary ∫ ∫ integrals of these characteristic functions. Indeed, we’ll also see that L1(X) as we’ve just defined functions R R by → ∪ ±∞ it agrees with the L1(X) that one might more conventionally define starting from that measure. m ∗∗ ∗ m First, however, we need to prove a number of technical statements about Cc(X) , Cc(X)m, f = inf д f д Cc(X) | ≤ ∈ 1 ∫ {∫ } the upper and lower integrals, and the class L (X). In the next section, we’ll establish the basic f = sup д f д C (X) convergence theorems (the monotone convergence theorem, Fatou’s lemma, and the dominated | ≥ ∈ c m ∫∗∗ {∫∗ } convergence theorem). 1 ∗∗ m m and then to say that f L (X) if and only if f = f and this number is in R (that is, not Lemma A.22. Given an increasing sequence fn Cc(X) , sup fn is in Cc(X) too. 1 ∈ ∗∗ { } ⊂ { } . We define : L (X) R by ∫f = ∗∗∫f = f . It’s only a little work to show ±∞ 1 → ∪ {±∞} ∗∗ Proof: Choose дnm Cc(X) withдnm fn. By replacing eachдnm withдnm′ = max(д1m,д2m,... дnm) Cc(X) L (X),∫ and our newly defined agrees∫ with∫ the original∫ one where it should. Moreover { } ⊂ ↗ ⊂1 (notice we still have дnm′ Cc(C), and дnm′ fn) we may assume that дnm is increasing with re- on L (X) is a positive linear functional.∫ ∈ ↗ spect to n as well. We claim дnn sup fn . □ Exercise.∫ Verify that these definitions of L1(X) really are the same! ↗ { } Lemma A.23. If f C (X) and f f for some f C (X)m, then f ∗ f . { n} ⊂ c n ↗ ∈ c n ↗ ∫ ∫ 9 10

We defer the proof to Appendix A.8. Lemma A.30. The class L1(X) is closed under max and min.

1 m Lemma A.24. The linear functionals ∗ and are actually linear! Proof: If fi L (X) for i = 1, 2, we can find fi Cc(X) and fi Cc(X)m so ∗ ∈ ∈ ∈ ∫ ∫ Proof: Homogeneity, that is, the condition that ∗ c f = c ∗ f for all c 0, is straightforward. ∗ ≥ fi fi < ϵ/2. Unfortunately additivity is a bit tricky. It’s clear that ∗(f + д) ∗ f + ∗ д, as if we have − ∫ ∫ ≥ ∫ ∫∗ functions f ,д C (X) whose integrals are very close to ∗ f and ∗ д respectively, we can use ′ ′ ∈ c ∫ ∫ ∫ Now f + д to produce a lower bound for ∗(f + д). ′ ′ ∫ ∫ max(f , f ) max(f , f ) (f f ) + (f f ) 1 2 − 1 2 ≤ 1 − 1 2 − 2 To go the other way we need Lemma∫ A.23: pick sequences fn , дn Cc(X) so fn f and { } { } ⊂ ↗ (prove this by checking all four cases in the left hand side), and both sides of this inequality are дn д. One see that fn +дn f +д. Then fn ∗ f , дn ∗ д, (fn +дn) ∗(f +д), ↗ ↗ ↗ ↗ ↗ m and the linearity of on C (X) itself allows us to conclude that (f + д ) = f + д in Cc(X) . Thus we have c ∫ ∫ ∫ ∫ n ∫ n n ∫ n ↗ ∗ f + ∗ д. Uniqueness of limits in R gives the result. □ ∗ ∗ ∫ ∫ ∫ ∫ max(f , f ) max(f , f ) (f f ) + (f f ) 1 2 − 1 2 ≤ 1 − 1 2 − 2 ∫ That’s∫ a relief. ∫ ∫ and using linearity and Equation (A.1) Corollary A.25. On L1(X), is linear.

∫ m m ∗ ∗ ∗ Lemma A.26. If fn Cc(X) and fn f , by Lemma A.22 f Cc(X) and we have max(f , f ) max(f , f ) f + f f f { } ⊂ ↗ ∈ 1 2 − 1 2 ≤ 1 2 − 1 − 1 ∫ ∫∗ ∫ ∫ ∫∗ ∫∗ ∗ f ∗ f . < ϵ n ↗ ∫ ∫ Again we defer the proof to Appendix A.8. and so the pair of functions max(f1, f2) and max(f1, f2) provide the necessary witnesses to show 1 max(f1, f2) L (X). A similar argument handles the minimum. □ m ∈ Lemma A.27. If f Cc(X)m and д Cc(X) , and f д, then f ∗ д. ∈ ∈ ≤ ≤ 1 ∗ Theorem A.31 (The monotone convergence theorem). If fn L (X) and fn f , and moreover ∫ ∫ { } ⊂ ↗ Proof: Since д f 0, sup f < , then f L1(X) and f = lim f . − ≥ n ∞ ∈ n ∗ ∫ 1 ∫ ∫ 0 д f Proof: Since L (X) is a , by subtracting off f0 from f and fn we may assume that ≤ − 1 m ∫ f0 = 0. Define дn = fn+1 fn L (X). By definition, for any ϵ > 0 we may pick hn Cc(X) ∗ ∗ − ∈ ∈ = д + ( f ) n − with hn дn, but still ∗ hn < дn + 2− ϵ. With h = hn, we can apply Lemma A.26 to see ∫ ∫ ≥m n 1 ∗ h Cc(X) and h ∫i=0− дi = (fn∫, so)h f . Now we calculate = д f . □ ∈ ≥ ≥ ∑ − ∫ ∫∗ ∑ fm = fm Corollary A.28. For any f , f ∗∗ f , since for any д in the set is a supremum over, and ∫ ∫∗∗ ∗∗ ≤ ∗∗ any д′ in the set ∗∗ is an infimum∫ over,∫ we have д f д′, so the previous∫ lemma applies. f ≤ ≤ ≤ ∫ 1 ∫∗∗ Lemma A.29. The class L (X) forms a vector space. ∗∗ f (by Corollary A.28) ≤ Proof: This is pretty easy using the definition and the factthat ∗ and are linear (Lemma ∫ ∗∗ ∗ h A.24). ∫ ∫ □ ≤ ∫ 11 12 ∗ = h A.4 Radon integrals from measures

∫ N We let denote the Borel measurable functions on a space X, that is, those functions f : X R ∗ F → = lim hn (by Lemma A.26) such that f 1(t, ) is Borel for every t R. As usual denotes the cone of non-negative N − ∞ ∈ F+ ∫ n∑=0 elements in . ∗ ∗ F = hn (by linearity of , Lemma A.24) Theorem A.33. Suppose µ is a Borel measure on X. There is a unique positive linear functional ∑ ∫ ∫ n дn + 2− ϵ (by the discussion above) ≤ Φ: + [0, ] ∑ ∫ F → ∞ = lim f + ϵ (making use of Corollary A.25) n such that ∫ • Φ(χA) = µ(A) for every A, and This hold for every m, so taking m we obtain lim fm f ∗∗ f lim fn, showing → ∞ ≤ ∗∗ ≤ ≤ • Φ(fn) Φ(f ) whenever fn f in +. that f all these numbers are equal and f is integrable.∫ ∫ ∫ ∫ □ ↗ ↗ F

1 Proof: Let s denote the simple functions in +, Theorem A.32 (The dominated convergence theorem). If fn L (X), and fn f pointwise F F { }1 ⊂ 1 → (not necessarily monotonically!), and if fn д for some д L (X), then f L (X) and f = α χ α 0, A . | | ≤ ∈ ∈ n An n n { | ≥ ∈ B} lim fn. ∫ ∑ We first define Φ in by Φ(α χ ) = α µ(A ). It’s pretty straightforward (mucking about ∫ Fs n An n n Proof: Define hn = inf(fn, fn+1,...). We see hn f , and using the bound fn д, we have with linear combinations of characteristic functions) to show this is a positive linear functional ↗ | | ≤ ∑ hn fn fn д, so sup hn < and we can apply the monotone convergence ≤ ≤ | | ≤ ∞ on s. theorem to show f L1(X). F ∫ ∫ ∫ ∈ ∫ ∫ We now establish a few preparatory results, whose proofs we postpose for a moment. Next we look at the sequence an = 2д f fn . Again, let bn = inf(an, an+1,...), so bn 2д − | − | ↗ Lemma A.34. If fn f in s, then Φ(fn) Φ(f ). and since 0 an 2д and 0 bn 2д, we have sup bn < . We can thus apply the monotone ↗ F ↗ ≤ ≤ ≤ ≤ ∞ convergence theorem a second time, obtaining Lemma A.35. If f , д , and f f , д f for some f , then lim Φ(f ) = ∫ { n} { n} ⊂ Fs n ↗ n ↗ ∈ F+ n lim Φ(дn). 2 д = lim b lim a = 2 д lim f f n ≤ n − − n Lemma A.36. If f , then there exists a sequence f so f f . ∫ ∫ ∫ ∫ ∫ ∈ F+ { n} ⊂ Fs n ↗

This gives lim f f 0, and the positivity of the integral gives us lim f f = 0. As Using these facts, we define Φ on all of + by Φ(f ) = lim Φ(fn), using some approximating | − n | ≤ − n F f f f f , we obtain the result. □ sequence provided by Lemma A.36. Using the uniqueness result from Lemma A.35 is is straight- − n ≤ ∫ − n ∫ forward to show that Φ is a positive linear functional. Certainly Φ(χ ) = µ(A), because we can ∫ ∫ ∫ A I’ve intentionally left out a proof of Fatou’s lemma here (essentially rolling the idea intothe approximate χA with a constant sequence. proof of the dominated convergence theorem) for the sake of an assignment question. (When We still need to show that Φ(fn) Φ(f ) whenever fn f in +. Pick approximating you do that assignment question, I’m happy for you to prove Fatou’s lemma in the context of ↗ ↗ F sequences дnm s, so дnm fn. We’d like to construct from this two parameter family some measures or integrals. There’s a proof of a weak version of Fatou’s lemma in the measure theory { } ⊂ F ↗ functions hn s, with that property that hn f , and still hn fn. If we can do this, we see notes.) ∈ F ↗ ≤ lim Φ(f ) Φ(f ) = lim Φ(h ) lim Φ(f ), giving the result. You might hope to use a diagonal n ≤ n ≤ n

13 14

argument, and try h = д , but the problem here is that there’s no guarantee1 that д f . Proof of Lemma A.36: We need to constructing an approximating sequence f so f f n nn nn ↗ { n} ⊂ Fs n ↗ Unfortunately while д is increasing in m, it is not necessarily increasing with n. To get around for any f . nm ∈ F+ this, we take Define , ,..., . hn = max(д1n д2n дnn) n n A = (k 1)2− < f k2− nk { − ≤ } Now, certainly hn is increasing and let hn+1 = max(д1,n+1,д2,n+1,...,дn,n+1,дn+1,n+1) n max(д1,n,д2,n,...,дn,n,дn+1,n+1) f = (k 1)2− χ . □ ≥ n − Ank k 1 max(д , ,д , ,...,д , ) ∑≥ ≥ 1 n 2 n n n That concludes the proof of Theorem A.33. □ = hn. Definition A.37. Suppose µ is a Radon measure on X. To see that hn f , for any x X and ϵ > 0, we see that we can find some n′ so f (x) fn′(x) < ↗ ∈ | − | , ϵ/2. We can then find some m, which we may assume to be at least n , so f (x) д (x) < ϵ/2. Suppose f Cc(X). We can write f = f+ + f , where f+ = max(f 0) Cc(X)+ and ′ | n′ − n′m | ∈ − ∈ f = min(f , 0), with f Cc(X)+. We have some α 0 and a compact set K so that f+ α χK . Now taking n = m, we have hn(x) дnm, and so f (x) hn(x) < ϵ. − − − ∈ ≥ ≤ ≥ − | Since µ is a Radon measure, µ(K) < , so Φ(f ) < . (Here Φ is the positive linear functional We’re now all done, except for the proofs of the lemmas! ∞ + ∞ defined in Theorem A.33 from µ.) Similarly Φ( f ) < . Proof of Lemma A.34: We need to show that if f f in , then Φ(f ) Φ(f ). − − ∞ n ↗ Fs n ↗ We then check that if we had picked some other д+ in Cc(X) and д with д Cc(X) so We first consider the case f = χ for some Borel set A. The general case is an easy conse- − − − ∈ A f = д+ +д , then Φ(f+) Φ( f ) = Φ(д+) Φ( д ). This follows easily from the observations quence. Pick ϵ > 0, and let B = f 1 ϵ . Observe (1 ϵ)χ f . Since f χ , B = A, − − − − − − − n n Bn n n A n that in this case д+ = f+ + k and д = f k for some k Cc(X)+. { ≥ − } − ≤ ↗ − − − ∈ and so µ(Bn) µ(A) by countable additivity of the measure. We define the integral by f = Φ(f ) Φ( f ). We easily see is a Radon integral: ↗ ∪ µ µ + µ For each n, f is a linear combination, with coefficients in [0, 1], of characteristic functions of − − − n positivity and linearity follow∫ from∫ the same conditions for Φ, and the invariance∫ condition of disjoint sets all contained inside A. Thus Φ(f ) µ(A). n ≤ the last paragraph. Now we have

µ(A) lim Φ(f ) A.5 Measures from Radon integrals ≥ n (1 ϵ) lim Φ(χB ) We first define ≥ − n = (1 ϵ) lim µ(Bn) 1 − ′ = B X χB L (X) = (1 ϵ)µ(A). M ⊂ | ∈ − { } and then Taking ϵ 0 gives the result. □ → Proof of Lemma A.35: We need to show that lim Φ(f ) = lim Φ(д ). Since min(f ,д ) f , = A X A K ′ for all compact K n n n n ↗ n M ⊂ | ∩ ∈ M lim Φ(дm) limn Φ(min(fn,дn)) = Φ(fn) by Lemma A.34. Thus lim Φ(дm) lim Φ(fn), and by ≥ ≥ Lemma A.38. If K is a compact{ set, χK is integrable. } symmetry we’re done. □ 1 Proof: If K is compact, χ C (X) by Lemma A.19. Clearly χ 0, so by Lemma A.21 χ is (The problem is that at each point x, we can find some n so f (x) fn(x) < ϵ/2, but then the sequence дnm(x) K c m K K | − | { } ∈ ∗ ≥ might take too long — in particular until m > n to come within ϵ/2 of fn(x).) integrable. ∫ □

15 16 It is immediate from the definitions that we then have every compact setin ′, and hence M = χK An also in . ∩ M ∫ ∑ = χ (by the MCT) Lemma A.39. The collection of subsets is a σ-algebra. K An M ∩ ∑ ∫ Proof: Certainly the zero function is integrable, so . = χK A (as An , χK A is integrable) ∅ ∈ M ∩ n ∈ M ∩ n If A , for any compact set K we have ∑ ∫∗∗ ∈ M = µ(K A ). (A.2) ∩ n χ(X A) K = χK χA K \ ∩ − ∩ We now prove that µ (∑A ) µ(A ) and then that µ ( A ) µ(A ). 1 n n n n which is in L (X) since by Lemma A.38 χK is integrable, and A ensures χA K is integrable. ≤ ≥ ∈ M ∩ Easily from Equation (A.2) we have µ(K) µ(An). Since this holds for any compact K (Of course here we’re using that L1(X) is a vector space, per Lemma A.29.) This shows that X A ∪ ∑ ≤ ∪ ∑ \ inside An, it holds for the supremum too, and inner regularity gives is also in , so we see is closed under taking complements. ∑ M M 1 We showed in Lemma A.30 that L (X) is closed under max and min, so ′ is closed under ∪ µ An = sup µ(K) K a compact subset of An M { | } finite unions and intersections. By the monotone convergence theorem ′ is also closed under (∪ ) µ(An). ∪ n M ≤ countable unions: max χEi i=0 χ Ei . □ { } ↗ In the other direction, we write∑ Lemma A.40. The σ-algebra contains∪ all Borel sets. M µ An = sup µ(K) Proof: Given any C, we see C K is compact if K is compact, so Lemma A.38 ensures cpct K A ∩ ⊂ n C . □ (∪ ) ∈ M = sup∪ µ(K An) cpct K An ∩ Theorem A.41. If we define µ(A) = χA for every A , then µ is a Radon measure. ⊂ ∑ ∗∗ ∈ M ∪ Proof: We first prove that µ is inner∫ regular, that is, and interchanging supremums and sums,

χ = sup χ K a compact subset of A . = sup µ(K An) A K cpct K A ∩ | ⊂ n ∫∗∗ {∫ } ∑ 1 sup ∪ µ(K A ) Consider h Cc(X)m, with h χA. As h is upper semicontinuous, the sets Cn = h n− ≥ ∩ n ∈ ≤ { ≥ } cpct K An are closed, and in fact compact, as h is dominated by some function with compact . Since ∑ ⊂ h 1, in fact h χ , and so h sup χ . By definition χ is the supremum of such (as this is a supremum over a more restrictive family) ≤ ≤ Cn ≤ Cn A h, so ∗∗ ∪ ∫ ∫ ∫ = sup µ(K) cpct K A ∫ χ sup χ sup χ K a compact subset of A . ⊂ n A ≤ Cn ≤ K | ∑ ∫∗∗ ∫ {∫ } = µ(An) The other inequality is easy, so we have proved inner regularity. ∑ Next we check countable additivity of µ. Suppose we have a countable collection A of by inner regularity again. □ { n} disjoint sets in . Consider K a compact subset of A . Then M n A.6 These constructions are inverses! µ(K) = χK ∪ ∫∗∗ We start with a Radon integral , define a Radon measure µ following §A.5, and then define a = χK (since K is integrable) new Radon integral following §A.4. First, if f = N α χ , with α 0 and A Borel and µ ∫ n=0 n An n n ∫ ≥ ∫ ∑ 17 18

compact, we see Theorem A.42. On a nice2 space X, there is a bijection between Radon measures and Radon integrals, given by the constructions of §A.4 and §A.5.

f = αnµ(An) = αn χAn = f = f . µ ∫ ∑ ∑ ∫∗∗ ∫∗∗ ∫ A.7 Stieltjes integrals (We used the assumption that f had compact support in the last step.) We now construct all Radon integrals on R. (This is very specific to R, using the order structure Next we can approximate any f Cc(X)+ by such functions, with fn f . Then ∈ ↗ on R is an essential manner; it’s much harder to describe all Radon integrals on anything bigger!) f = Φ(f ) Let m be a monotone non-decreasing function on R. ∫µ We’ll now construct a Radon integral from m. For each f C (R), pick some [a,b] m ∈ c = lim Φ(fn) by the second condition in Theorem A.33 containing its support. Given a partition λ = a = λ < λ < < λ = b , define upper and ∫ { 0 1 ··· n } lower Riemann-Stieltjes sums by = lim fn µ n 1 ∫ ∗ − f = sup f (x) (m(λ ) m(λ )) = lim fn by the previous paragraph k+1 k x [λ ,λ ) − ∫ ∑λ k∑=0 ∈ k k+1 n 1 * + = f by the monotone convergence theorem − f = , inf f (x)- (m(λk+1) m(λk )) . ∫ x [λk ,λk+1) − λ k=0 ∈ ∑∗ ∑ ( ) Easily this gives the result that = . µ (These are all Riemann-Stieltjes sums because when m(x) = x these are just the Riemann sums In the other direction, we start with a Radon measure µ, construct a Radon integral ac- ∫ ∫ µ for a partition.) cording to §A.4, and then construct another Radon µ following §A.5. As both measures are inner ∫ Clearly λ∗ is decreasing as the partition λ becomes finer, and similarly λ is increasing. regular, it’s enough to show that µ(K) = µ(K) for every compact set K. ∗ D Clearly λ∗ λ f 0, and we next show that this difference converges to zero. With that, we −∑ ∗ ≥ ∑ Approximate χK using Lemma A.19, obtaining fn Cc(X) with fn χK . Then µ(K) = D ∈ ↘ define the Stieltjes integral m f = limλ λ∗ = limλ λ . χ = lim f by Lemma A.23, and each of these is at least µ(K), so we have µ(K) µ(K). In ∑ ∑ ∗ K n D As f is uniformly continuous on [a,b], for any ϵ > 0 we have a δ > 0 so sup f (x) ∗∗ ≥ ∫ ∑ ∑ x [λk ,λk+1] − fact,∫ by inner∫ regularity we now know that µ(B) µ(B) for any Borel set B. inf f (x) < ϵ as long as λ λ < δ. Thus for any partition in which∈ every interval is ≥ D x [λk ,λk+1] k+1 k ∈ − To obtain the other inequality, we consider some f Cc(X), with 0 f 1, and f χK . D ∈ ≤ ≤ ≥ of length at most δ, with have λ∗ λ f < ϵ(m(b) m(a)). n − ∗ − Observe that f χL, where L = f = 1 . Now, ↘ { } Exercise. Show the the Stieltjes∑ integral∑ is actually linear. n It’s immediate that the Stieltjes integral is positive, so we have constructed a Radon integral. µ(L) = χL = lim f µ µ ∫ ∗∗ ∫ Before we show that all Radon integrals are of this form, we make some observations about n D and since f f f χL, we can apply the second condition in Theorem A.33 to obtain monotone functions. One can see that such a function has at most countably many points of − ↗ − discontinuity. If m is not already lower semicontinuous (for a monotone increasing function n lim f = χL = µ(L). this is the same as continuous from the left), we can modify its values at those discontinuities µ µ ∫ ∫ so that it becomes lower semicontinuous. Explicitly, we can define a new function m′(x) = Finally, if we had µ(K) < µ(K) we would obtain a contradiction from µ(L) = µ(K) + µ(L K) > sup m(y), which is still monotone increasing, is lower semicontinuous, and agrees with m \ y

19 20 Exercise. Check that the Stieltjes integral is not affected by this modification: that is = as A.8 Appendix: some technical details m m′ positive linear functionals on C (X). c ∫ ∫ Definition A.43. A function f : X R + is lower semicontinuous if f 1(t, ] is open for → ∪ { ∞} − ∞ In fact, every Radon integral of R is a Stieltjes integral, for some lower semicontinuous mono- every real t. tone function m : R R. Fix some Radon integral , and construct the corresponding class of → 1 Exercise. A function is lower semicontinuous if and only if for every convergent sequence x x, integrable functions L (X). Define ∫ n → f (x) lim inf f (x ). ≤ n χ[0,x) if x 0 m(x) = ≥ Exercise. The supremum of an arbitrary family of lower semicontinuous functions is still lower ∫ χ[x,0) if x < 0. − semicontinuous. The monotone convergence theorem ensures∫ that m is lower semicontinuous (and it’s easy to see  Lemma A.44. Every function f C (X)m is lower semicontinuous. it’s monotone non-decreasing, from the positivity of ). ∈ c We need to show = . Easily, f n 1 sup f (x) χ for any partition Proof: Such a function is the supremum of a sequence of continuous functions. □ m k−=0 ∫x [λ ,λ ) [λk ,λk+1) ≤ ∈ k k+1 λ, so ∫ ∫ ( ) ∑ Lemma A.45. If fn 0, and each fn is upper semicontinuous, positive, and compactly supported, n 1 ↘ − then fn 0. f sup f (x) χ[λk ,λk+1] || ||∞ ↘ ≤ x [λ ,λ ] ∫ k∑=0 ∫ ∈ k k+1 n 1 * + Proof: For any ϵ, the set fn ϵ is closed (this is direct from fn being upper semicontinous). − { ≥ } = , sup f (x)- (m(λ ) m(λ ) Since fn ϵ is contained in the support of fn, which has been assumed to be compact, these k+1 − k { ≥ } k=0 ∫ x [λk ,λk+1] sets are compact as well. ∑ * ∈ + ∗ Now n 0 fn ϵ = , and by a standard compactness argument this means only finitely = f , - ≥ { ≥ } ∅ λ many of fn ϵ (as n varies) are non-empty. That is, eventually fn ϵ, and since this ∑ ∩{ ≥ } || ||∞ ≤ argument held for any ϵ, we have the result. □ and similarly f λ f . Since these quantities converge to f , we have the desired result. ≥ ∗ m Exercise. If m ∫: R ∑R is monotone non-decreasing, and absolutely∫ continuous, so in particular Lemma A.46. If fn Cc(X)m, and fn 0, then fn 0. → { } ⊂ ↘ ∗ ↘ m′ exists and is integrable, show that ∫ Proof: We first check that we can apply Lemma A.45. The functions fn are upper semicontinuous

χE = F ′(x)dx by Lemma A.44. Since each fn 0, it has compact support. Thus Lemma A.45 shows that m E ≥ ∫ ∫ fn 0. where the right hand side is , and more generally || ||∞ ↘ Note that lim fn exists, since this is a sequence of decreasing non-negative numbers. We ∗ k now pass to a subsequence fn so that fn < 2− , and for each k pick дk Cc(X) with f = f (x)m′(x)dx. ∫ { k } || k || ∈ m f д 2 k . Since the f are decreasing, we can further assume the supports of д are ∫ ∫ nk ≤ k ≤ − n k Exercise. If decreasing (by replacing дk with min(д1,... дk ), say). Then define д = дn, which is in Cc(X)+, 0 if x < 0 and we have д д < . m = n ≤ ∞ ∑ Finally f д < , and so we must have f 0. □ 1 if x 0 ∫ n ∫ n n  ≥ ∑∗ ≤ ∞ ∗ ↘ what is ?  Proof of Lemma∑ ∫ A.23∑: Recall∫ we want to show that if ∫f C (X) and f f for some f m  n c n m { } ⊂ ↗ ∈ Exercise.∫ Show that µ( x ) = 0 for all x R if and only if m is continuous. Cc(X) , then fn ∗ f . { } ∈ ↗ ∫ ∫ 21 22

For any compactly supported continuous д with д f , we have д min(f ,д) 0, so by When X is Hausdorff, an equivalent formulation of local compactness is that “every neigh- ≤ − n ↘ Lemma A.46 bourhood contains a compact neighbourhood”.

д min(fn,д) 0 − ↘ Definition A.49. A topological space X is separable if it contains a countable dense set. ∫∗ which we can re-express as Definition A.50. A topological space X is second countable if there is a countable collection (Ui) д = lim min(f ,д) n of open sets, such that every is the union of some of the U . ∫ ∫ i which in turn is less that f . Since this held for any such д f , we also have ∗ f lim f . n ≤ ≤ n Every second countable space is separable, but not vice versa! In one important way second Since f f , the other direction is trivial, and we have the result. □ n ≤ ∫ ∫ ∫ countable spaces behave like metric spaces — convergence of sequences (rather than of nets) Proof of Lemma A.26: Recall that we want to show that if f C (X)m and f f , then determines all the topological information. It is a consequence of this that in second countable { n} ⊂ c n ↗ ∗ f ∗ f . spaces (just as in metric spaces) sequential compactness and compactness are equivalent. n ↗ By the monotonicity of ∗, we have lim ∗ f ∗ f . We collect here a few useful facts. ∫ ∫ n ≤ Now suppose д C (X) and д f . We have д min(f ,д) 0, so ∈ c ∫ ≤ ∫ −∫ n ↘ Lemma A.51. If X is a locally compact metric space, and K is a compact set, there is an ϵ > 0 so the set (д min(f n,д)) 0 − − ↘ x d(x, K) ϵ ∫∗ | ≤ by Lemma A.46. However by linearity we can write is still compact. { }

∗ (д min(f n,д)) = д + min(f ,д) Proof: Using local compactness choose rk > 0 and Kk X for each k K so that B(k,rk ) Kk − − − − n ⊂ ∈ ⊂ ∫∗ ∫ and Kk is compact. The balls B(k,rk /2) k K form an open cover of K, so there is a finite sub- ∗ { } ∈ = д min(fn,д) cover indexed by I K. Let r = mini I ri. Now, if d(x, K) r/2, then d(x,y) r/2 for some − ⊂ ∈ ≤ ≤ ∫ ∫ y K, and that y is in turn inside B(i,ri/2) for some i I. Thus x B(i,ri) i I Ki. Since д ∗ f ∈ ∈ ∈ ⊂ ∈ n the set x d(x, K) r/2 is closed, and a subset of the compact set i I Ki, we see that it is ≥ − { | ≤ } ∈ ∪ ∫ ∫ compact. □ ∪ Putting these facts together, lim д ∗ f 0, so д ∗ f . As this holds for every continuous − n ≤ ≤ n compactly supported д f , ∗ f ∗ f , completely the claim. ≤ ∫ ≤ ∫ n ∫ ∫ ∫ ∫ □

A.9 Appendix: Some point set topology

Definition A.47. A topological space X is Hausdorff if for every two points x,y, there exist disjoint open sets U ,V with x U and y V . ∈ ∈ Definition A.48. A topological space X is locally compact if every point has a compact neigh- bourhood, i.e. for every x X, there exists an open set U and a compact set K so that x U K. ∈ ∈ ⊂

23 24