Real Analysis

Jesse Peterson

February 1, 2017

Contents

1 Preliminaries
  1.1 Sets
    1.1.1 Countability
    1.1.2 Transfinite induction
    1.1.3 The axiom of choice
    1.1.4 Ordinals and Cardinals
    1.1.5 Exercises
  1.2 Metric spaces
    1.2.1 Exercises
  1.3 Normed spaces
    1.3.1 Algebras
    1.3.2 Exercises

2 Measure and integration
  2.1 Measurable sets and functions
    2.1.1 Exercises
  2.2 Measures
    2.2.1 Outer measures
    2.2.2 Carathéodory's extension theorem
    2.2.3 Exercises
  2.3 Borel measures on R
    2.3.1 Lebesgue measure on R
    2.3.2 Regularity of Borel measures
    2.3.3 Exercises
  2.4 Integration
    2.4.1 Integrable functions
    2.4.2 Properties of integration
    2.4.3 Functions which agree almost everywhere
    2.4.4 Convergence properties
    2.4.5 Exercises
  2.5 Product spaces
    2.5.1 Exercises
  2.6 Signed and complex measures
    2.6.1 Signed measures
    2.6.2 Complex measures
    2.6.3 Exercises
  2.7 The Radon-Nikodym Theorem
    2.7.1 Exercises

3 Point set topology
  3.1 Topological spaces
    3.1.1 Exercises
  3.2 Continuous maps
    3.2.1 Exercises
  3.3 Compact spaces
    3.3.1 Exercises
  3.4 The Stone-Weierstrass Theorem
    3.4.1 Exercises
  3.5 The Stone-Čech compactification
    3.5.1 Exercises
  3.6 The property of Baire
    3.6.1 Exercises
  3.7 Cantor spaces
    3.7.1 Exercises
  3.8 Standard Borel spaces
    3.8.1 Exercises

4 Differentiation and integration
  4.1 The Lebesgue differentiation theorem
    4.1.1 Vitali's covering lemma
    4.1.2 The Lebesgue differentiation theorem
    4.1.3 Exercises
  4.2 Functions of bounded variation
    4.2.1 Exercises
  4.3 Absolutely continuous and singular functions
    4.3.1 Exercises

5 Lp spaces
  5.1 Hölder's and Minkowski's inequalities
    5.1.1 Minkowski's integral inequality
    5.1.2 Exercises
  5.2 The dual of Lp-spaces
    5.2.1 Exercises

6 Functional analysis
  6.1 Topological vector spaces
    6.1.1 Locally convex spaces
    6.1.2 The open mapping and closed graph theorems
    6.1.3 Exercises
  6.2 The Hahn-Banach theorem
    6.2.1 Separating convex sets
    6.2.2 The Krein-Milman theorem
    6.2.3 Exercises
  6.3 Hilbert space
    6.3.1 Inner product spaces
    6.3.2 Orthogonal subspaces and the Riesz representation theorem
    6.3.3 Orthonormal bases and dimension
    6.3.4 Exercises

Chapter 1

Preliminaries

1.1 Sets

We assume that the reader is familiar with the basic language and concepts of set theory. We use the notation N, Z, Q, R, C to denote respectively the nonnegative integers (including zero), the integers, the rational numbers, the real numbers, and the complex numbers. If 𝒜 is a collection of sets then we denote their union by ∪_{A∈𝒜} A = {a | a ∈ A for some A ∈ 𝒜}, and their intersection by ∩_{A∈𝒜} A = {a | a ∈ A for all A ∈ 𝒜}. If the family of sets is indexed, 𝒜 = {A_i}_{i∈I}, then we also denote the union and intersection respectively by ∪_{i∈I} A_i and ∩_{i∈I} A_i. The difference of two sets A and B is A \ B = {a | a ∈ A and a ∉ B}, and their symmetric difference is A∆B = (A \ B) ∪ (B \ A). If A is a subset of a set X, then we write A^c for the complement of A in X, i.e., A^c = X \ A. The power set of a set X is denoted by 2^X and is the collection of all subsets of X, i.e., 2^X = {A | A ⊂ X}.

The Cartesian product X × Y of two sets X and Y consists of all ordered pairs (x, y) such that x ∈ X and y ∈ Y. A function (or mapping) f : X → Y from X to Y is a subset of X × Y which has the property that for each x ∈ X there exists a unique y ∈ Y such that the pair (x, y) is contained in this subset. In this case we write y = f(x) (or sometimes y = fx) for each x ∈ X. If A ⊂ X and B ⊂ Y, then the image of A is denoted by f(A) = {f(a) | a ∈ A}, and the inverse image of B is denoted by f^{-1}(B) = {x ∈ X | f(x) ∈ B}. If f : X → Y and g : Y → Z, the composition of f and g is denoted by g ∘ f, and is defined by the formula (g ∘ f)(x) = g(f(x)). A function f is injective (or 1-1) if f(x) = f(y) only when x = y, and f is surjective (or onto) if f(X) = Y. A function f is bijective if it is both injective and surjective, and in this case f has a unique inverse map f^{-1} : Y → X such that f^{-1} ∘ f and f ∘ f^{-1} are the identity maps on X and Y respectively.

A sequence in a set X is a function from N to X. If f : N → X is a sequence and g : N → N is such that g(n) < g(m) whenever n < m, then we say that f ∘ g is a subsequence of f. Through abuse of notation, we will often identify a sequence with its range; for instance, we may say "let {a_n}_{n∈N} ⊂ X be a sequence".

If 𝒜 is a family of sets, their Cartesian product ∏_{A∈𝒜} A consists of all functions f : 𝒜 → ∪_{A∈𝒜} A such that f(A) ∈ A for each A ∈ 𝒜. Similar to unions and intersections, if 𝒜 is an indexed family 𝒜 = {A_i}_{i∈I} then the Cartesian product is written ∏_{i∈I} A_i. If X = ∏_{i∈I} A_i, and i ∈ I, then the coordinate map π_i : X → A_i is given by π_i(x) = x_i, and we call x_i the ith coordinate of x. If each A_i is a fixed set A, then we denote ∏_{i∈I} A_i by A^I. If I = {1, 2, ..., n}, then we denote A^I by A^n and identify this with the set of ordered n-tuples of elements of A.

1.1.1 Countability

If X and Y are sets we write |X| ≤ |Y| (resp. |X| = |Y|) if there exists an injective (resp. bijective) map f : X → Y. We also write |X| < |Y| if |X| ≤ |Y| and there is no bijection from X to Y.

Theorem 1.1.1 (The Cantor-Schröder-Bernstein Theorem). If |X| ≤ |Y| and |Y| ≤ |X| then |X| = |Y|.

Proof. Suppose f : X → Y and g : Y → X are both injective. Set B = ∪_{n∈N} (f ∘ g)^n (Y \ f(X)), and set A = X \ g(B). Then we have g(B) = X \ A, and

f(A) = f(X) \ (f ∘ g)(B) = Y \ ((Y \ f(X)) ∪ (f ∘ g)(B)) = Y \ B.

Hence if we define θ : X → Y by θ(x) = f(x) if x ∈ A, and θ(x) = g^{-1}(x) if x ∈ X \ A = g(B), then θ gives a bijection.

A set X is countable if |X| ≤ |N|. We say that X is uncountable if it is not countable.

Proposition 1.1.2.

1. If X and Y are countable, then so is X × Y.

2. If I is countable and X_i is countable for each i ∈ I then ∪_{i∈I} X_i is also countable.

Proof. We let p : N → N be the map which takes n to the nth prime number. Suppose f : X → N and g : Y → N are injective, and consider h : X × Y → N given by h(x, y) = p(f(x))^{g(y)+1}; the exponent is shifted by one so that it is never zero. If p(f(x_1))^{g(y_1)+1} = h(x_1, y_1) = h(x_2, y_2) = p(f(x_2))^{g(y_2)+1}, then by uniqueness of prime factorization we have p(f(x_1)) = p(f(x_2)) and g(y_1) = g(y_2). As p, f, and g are injective we then have x_1 = x_2 and y_1 = y_2. Thus, h is injective.

Similarly, if I is countable, and X_i is countable for each i ∈ I, then consider f : I → N injective, and for each i ∈ I consider f_i : X_i → N injective. We define g : ∪_{i∈I} X_i → N by setting g(x) = p(f(i))^{f_i(x)+1}, where i ∈ I is the index with x ∈ X_i for which f(i) is smallest. Then similar to above it is easy to check that g is injective and hence ∪_{i∈I} X_i is countable.
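The prime-power encoding from the proof can be tried out on a small example. The following is a minimal sketch, assuming X = Y = N with f and g the identity maps; the helper names `nth_prime` and `h` are illustrative only, and the exponent is taken to be b + 1 (an assumption of this sketch) so that no pair is encoded as p^0 = 1.

```python
from itertools import count

def nth_prime(n: int) -> int:
    """Return the n-th prime, counting from n = 0 (so nth_prime(0) == 2)."""
    found = []
    for candidate in count(2):
        # candidate is prime iff no smaller prime divides it
        if all(candidate % q for q in found):
            found.append(candidate)
            if len(found) == n + 1:
                return candidate

def h(a: int, b: int) -> int:
    """Encode the pair (a, b) as the prime power p(a)^(b + 1)."""
    return nth_prime(a) ** (b + 1)

# Encode a finite block of pairs and collect the codes.
pairs = [(a, b) for a in range(6) for b in range(6)]
codes = [h(a, b) for a, b in pairs]

# Distinct pairs receive distinct codes on this block.
assert len(set(codes)) == len(pairs)
```

A finite check of this kind does not prove injectivity; that is what unique prime factorization provides in the proof.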

Corollary 1.1.3. Z and Q are countable.

Proof. We have Z = N ∪ {0} ∪ −N, showing that Z is countable. Also, writing any rational number in reduced fraction form a/b with a ∈ Z and b ∈ N \ {0} defines an injective function f(a/b) = (a, b) ∈ Z × N. Since Z × N is countable, so is Q.

Proposition 1.1.4 (Cantor's diagonalization method). Let X be a set, then |X| < |2^X|.

Proof. The injective map f : X → 2^X given by f(x) = {x} shows that we have |X| ≤ |2^X|. Now, suppose we have an injective function g : X → 2^X. We let A = {x ∈ X | x ∉ g(x)}. Then, if x ∈ X and x ∈ g(x) we have x ∉ A and hence g(x) ≠ A. Similarly, if x ∈ X and x ∉ g(x) then x ∈ A and hence g(x) ≠ A. We therefore have produced a set which is not in the range of g, showing that g is not surjective. As g was arbitrary we then have |X| ≠ |2^X|.

Proposition 1.1.5. |R| = |2^N|, and hence R is uncountable.

Proof. Note that we have |2^N| = |2^Z|. Writing each real number in its binary expansion (if there is ambiguity we choose the representation which ends in zeros) gives an injective map from R to 2^Z. On the other hand, each sequence in 2^N we may view as a decimal expansion, and this gives an injective map from 2^N into R.
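For a finite set the diagonal argument of Proposition 1.1.4 can be checked exhaustively. The sketch below assumes X = {0, 1, 2} and enumerates every function g : X → 2^X (not only the injective ones); the names `diagonal_set`, `subsets`, and `missed_every_time` are illustrative.

```python
from itertools import product

def diagonal_set(X, g):
    """Return A = {x in X | x not in g(x)} for a map g from X to subsets of X."""
    return frozenset(x for x in X if x not in g[x])

X = (0, 1, 2)

# All 2^3 = 8 subsets of X, built from membership bit patterns.
subsets = [frozenset(x for x, keep in zip(X, bits) if keep)
           for bits in product([False, True], repeat=len(X))]

# For each of the 8^3 = 512 functions g : X -> 2^X, the diagonal set
# should not appear among the values g(0), g(1), g(2).
missed_every_time = all(
    diagonal_set(X, dict(zip(X, images))) not in images
    for images in product(subsets, repeat=len(X))
)
```

Here `missed_every_time` records that no g on this small X reaches its own diagonal set, in line with the proof.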

1.1.2 Transfinite induction

A relation on X is a subset R ⊂ X × X. We write xRy to mean (x, y) ∈ R. A relation R is an equivalence relation if the following properties hold:

• xRx for each x ∈ X.
• If xRy then yRx.
• If xRy and yRz then xRz.

A relation ≺ is a partial ordering if the following properties hold:

• x ≺ x for each x ∈ X.
• If x ≺ y and y ≺ z then x ≺ z.
• If x ≺ y and y ≺ x then x = y.

We write x ⪵ y if x ≺ y and x ≠ y. An order isomorphism between two partially ordered sets is a bijection which preserves the partial orderings. A partial ordering ≺ is linear (or total) if for each x, y ∈ X we have either x ≺ y or y ≺ x. If X is partially ordered by ≺, a maximal element of X is an element x ∈ X such that if x ≺ y then we have x = y. If E ⊂ X, then an upper bound for E is an element x ∈ X such that y ≺ x for each y ∈ E. We may similarly define minimal elements and lower bounds. A linearly ordered set X is said to be well ordered if every nonempty subset of X has a minimal element. For example, N is well ordered by its usual ordering.

If (X, ≤) is a well ordered set and x ∈ X we define the initial segment of x to be I_x = {y ∈ X | y < x}. The elements of I_x are called predecessors of x. Note that either I_x ∪ {x} = X, or else I_x ∪ {x} = I_y where y is the minimal element in X \ (I_x ∪ {x}).
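The order-theoretic definitions can be tested directly on a finite set. The following sketch assumes the divisibility relation on {1, ..., 12} as the example; the function names are illustrative and not part of the notes.

```python
def is_partial_order(X, leq):
    """Check reflexivity, transitivity and antisymmetry of leq on the finite set X."""
    reflexive = all(leq(x, x) for x in X)
    transitive = all(leq(x, z) for x in X for y in X for z in X
                     if leq(x, y) and leq(y, z))
    antisymmetric = all(x == y for x in X for y in X
                        if leq(x, y) and leq(y, x))
    return reflexive and transitive and antisymmetric

def is_linear(X, leq):
    """A partial ordering is linear if any two elements are comparable."""
    return all(leq(x, y) or leq(y, x) for x in X for y in X)

def maximal_elements(X, leq):
    """Elements x such that x leq y forces x == y."""
    return [x for x in X if all(x == y for y in X if leq(x, y))]

def upper_bounds(X, E, leq):
    """Elements x of X with y leq x for every y in E."""
    return [x for x in X if all(leq(y, x) for y in E)]

X = range(1, 13)

def divides(a, b):
    return b % a == 0
```

On this example divisibility is a partial ordering that is not linear (2 and 3 are incomparable), so a set can have several maximal elements, whereas the usual ordering on the same set is linear.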

Proposition 1.1.6 (The principle of transfinite induction). Let X be a well ordered set. If A ⊂ X is such that x ∈ A whenever I_x ⊂ A, then A = X.

Proof. By contraposition, if A ≠ X we let x ∈ X \ A be the minimal element. Then we have x ∉ A, and I_x ⊂ A from the definition of x.

Lemma 1.1.7. Let X be a well ordered set and A ⊂ X. Then J = ∪_{x∈A} I_x is either an initial segment or J = X.

Proof. If J^c is nonempty then let x be the minimal element in J^c. It is then easy to see that J = I_x.

Proposition 1.1.8 (The principle of transfinite recursion). Let X be a well ordered set, Y a set, and let F = ∪_{x∈X} Y^{I_x} denote the space of all functions from initial segments of X to Y. If G : F → Y, then there exists a unique function g : X → Y so that g(x) = G(g|_{I_x}) for each x ∈ X.

Proof. We let E denote the family of functions f : I → Y such that I = X or I is an initial segment in X, and f satisfies the formula f(x) = G(f|_{I_x}) for all x ∈ I. If f′ : I′ → Y is another such function in E and I ⊂ I′, then set A = {x ∈ I | f′(x) = f(x)}. If x ∈ I and I_x ⊂ A then we have

f′(x) = G(f′|_{I_x}) = G(f|_{I_x}) = f(x),

hence x ∈ A. It then follows by transfinite induction that A = I and hence f′|_I = f.

We may then consider J, the union of all I such that there is f : I → Y with f ∈ E. By Lemma 1.1.7 either J = X, or J is an initial segment. We define a function g : J → Y by letting g(x) = f(x), where f : I → Y is in E and x ∈ I. By our remarks above it follows easily that g is well defined and g ∈ E. If J ≠ X then J = I_x for some x ∈ X. We could then extend the function g to g̃ : J ∪ {x} → Y such that g̃|_J = g and g̃(x) = G(g). We would then have g̃ ∈ E, which would contradict the maximality of J. Thus, we conclude that J = X and g : X → Y is our desired function. Uniqueness follows easily from the remarks above.

Intuitively, the previous result states that if we have an initial value (G(∅)) and a procedure for choosing a new value based on the ones previously chosen (this is the function G), then we can recursively define a unique function on all of X.
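On a finite well ordered set the recursion g(x) = G(g|_{I_x}) reduces to an ordinary loop. The sketch below assumes X = {0, 1, 2, 3, 4} with its usual order and a hypothetical rule G chosen only for illustration; it does not capture the genuinely transfinite case.

```python
def recurse(X, G):
    """Build g on the well ordered list X so that g(x) = G(g restricted to I_x).

    X must be listed in increasing order; functions are stored as dicts.
    """
    g = {}
    for x in X:
        # At this point g is defined exactly on the initial segment I_x.
        g[x] = G(dict(g))
    return g

def G(prev):
    """Hypothetical rule: the initial value G(empty function) is 1;
    otherwise the new value is the sum of all previously chosen values."""
    return 1 if not prev else sum(prev.values())

g = recurse(range(5), G)
```

The rule G sees the entire restriction g|_{I_x}, not just the immediately preceding value, which is what distinguishes this from a one-step recurrence.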

Lemma 1.1.9. Let (X, ≤) and (Y, ≺) be well ordered sets, and suppose f : X → Y is an order isomorphism, then f(I_x) = I_{f(x)} for each x ∈ X. Conversely, if f : X → Y is such that f(I_x) = I_{f(x)} for each x ∈ X, then f(X) is either Y or an initial segment in Y, and f is an order isomorphism onto its image.

Proof. Suppose first that f : X → Y is an order isomorphism. If x ∈ X and a < x, then f(a) < f(x) and hence f(I_x) ⊂ I_{f(x)}. Considering the inverse f^{-1} gives the reverse inclusion I_{f(x)} = f(f^{-1}(I_{f(x)})) ⊂ f(I_x).

Now suppose f : X → Y is such that f(I_x) = I_{f(x)} for each x ∈ X. Then if y ∈ f(X) we have I_y ⊂ f(X) and hence f(X) is either Y, or an initial segment in Y. If x_1 < x_2 then we have f(x_1) ∈ f(I_{x_2}) = I_{f(x_2)} and so f(x_1) < f(x_2), and this also shows that f is injective. If f(x_1) ≤ f(x_2) then f(x_2) ∉ I_{f(x_1)} = f(I_{x_1}), hence x_1 ≤ x_2. Thus, f is an order isomorphism onto its image.

Proposition 1.1.10. Suppose (X, ≤) is a well ordered set and I ⊂ X is either an initial segment, or is equal to X. If f : X → I is an order isomorphism, then I = X and f is the identity map.

Proof. Let A = {x ∈ X | f(x) = x}. If x ∈ X is such that I_x ⊂ A, then Lemma 1.1.9 shows that I_x = f(I_x) = I_{f(x)}, hence x = f(x), showing that x ∈ A. The result then follows by transfinite induction.

Theorem 1.1.11. If X and Y are well ordered, then exactly one of the following holds:

1. X is order isomorphic to Y ;

2. X is order isomorphic to an initial segment in Y ;

3. Y is order isomorphic to an initial segment in X.

Moreover, in each of the cases the order isomorphism is unique.

Proof. We may assume Y is nonempty. Suppose that Y is not order isomorphic to an initial segment in X. If x ∈ X and g : I_x → Y with g(I_x) ≠ Y, then let G(g) denote the smallest element in Y \ g(I_x); otherwise let G(g) be the initial element in Y. By the principle of transfinite recursion there is a function g : X → Y so that g(x) = G(g|_{I_x}) for each x ∈ X. We set A = {x ∈ X | g(I_x) = I_{g(x)}}. If I_a ⊂ A then from Lemma 1.1.9 we have that g(I_a) is either Y, or an initial segment in Y, and we have that g defines an order isomorphism from I_a onto g(I_a). As Y is not order isomorphic to an initial segment in X it follows that g(I_a) is an initial segment in Y, say g(I_a) = I_y. Then from the definition of G we have g(a) = y and hence g(I_a) = I_{g(a)}, showing that a ∈ A. By the principle of transfinite induction we then have A = X, and from Lemma 1.1.9 it follows that X is either order isomorphic to Y or to an initial segment in Y. Thus, one of the three cases above must hold, and Proposition 1.1.10 shows that no more than one can hold, and that the order isomorphism is unique.

1.1.3 The axiom of choice

Consider the following four principles:

AC (The Axiom of Choice): If {A_i}_{i∈I} is a nonempty collection of nonempty sets then ∏_{i∈I} A_i is nonempty.

WO (The Well Ordering Principle) Every set X can be well ordered.

ZL (Zorn’s Lemma) If X is a nonempty partially ordered set and every linearly ordered subset of X has an upper bound, then X has a maximal element.

HM (The Hausdorﬀ Maximal Principle) Every partially ordered set has a maximal linearly ordered subset.

These principles are logically equivalent, and after this section we will use them in these notes without explicit reference. In this section we show that we have the implications AC ⟹ WO ⟹ ZL ⟹ HM. The reverse implications are easier and we leave them as exercises.

Proposition 1.1.12. The axiom of choice implies the well ordering principle.

Proof. Let X be a nonempty set. By the axiom of choice there exists f ∈ ∏_{Y ∈ 2^X \ {X}} (X \ Y). That is, f : 2^X \ {X} → X is a function such that f(Y) ∉ Y for each Y ⊊ X. We define an f-string to be a well ordered set (A, ≤) such that A ⊂ X and a = f(I_a) for all a ∈ A. We let F denote the set of f-strings.

If (A, ≤) and (B, ≤′) are f-strings, and h : A → B is an order isomorphism, then let E = {a ∈ A | h(a) = a}. If x ∈ A is such that I_x ⊂ E, then by Lemma 1.1.9 we have

x = f(I_x) = f(h(I_x)) = f(I_{h(x)}) = h(x).

It then follows from transfinite induction that E = A, so that B = A and h is the identity map. It then follows from Theorem 1.1.11 that given any two distinct f-strings we must have that one is an initial segment of the other, so that F is linearly ordered by inclusion.

We let A denote the union over all sets in F and we let ≤ denote the induced relation on A. Since F is linearly ordered by inclusion it follows easily that (A, ≤) is well ordered and each initial segment of A is an f-string. From this it then follows that (A, ≤) itself is an f-string, and is then the unique maximal f-string. If A ≠ X then we could create the larger f-string (A ∪ {f(A)}, ≤′) where ≤′ agrees with ≤ when restricted to A, and a ≤′ f(A) for all a ∈ A. This would then contradict the maximality of (A, ≤) and so we conclude that A = X, and hence X is well ordered by ≤.

Proposition 1.1.13. The well ordering principle implies Zorn's lemma.

Proof. Let (X, ≺) be a nonempty partially ordered set such that every linearly ordered subset has an upper bound. We let 𝒞 denote the set of all linearly ordered subsets of X. By the well ordering principle there exist well orders ≤_1 and ≤_2 on X and 2^X respectively. To simplify notation we set Y = 2^X. We define f : 𝒞 → X as follows: If C ∈ 𝒞 and C does not contain a maximal element in X, then we let f(C) be the ≤_1-least element which is not in C, but is a ≺-upper bound for C; otherwise, if C contains a maximal element x_0 ∈ X, then we set f(C) = x_0.

We now recursively define a function g : Y → X such that g(y) ≺ g(z) whenever y ≤_2 z. Indeed, suppose y ∈ Y and that g has been defined on I_y, then g(I_y) is a well ordered (and hence linearly ordered) subset of X and we may set g(y) = f(g(I_y)). By Cantor's diagonalization argument g cannot be injective. Thus, there exists some y ∈ Y such that g(y) = g(z) for some z ∈ I_y. It then follows that f(g(I_y)) = g(y) ∈ g(I_y). By construction of f we must have that f(g(I_y)) is a maximal element.

Proposition 1.1.14. Zorn's lemma implies the Hausdorff maximal principle.

Proof. Let X be a nonempty partially ordered set and let 𝒞 denote the space of all linearly ordered subsets. Then 𝒞 is ordered by inclusion. If 𝒞′ ⊂ 𝒞 is a subset which is linearly ordered by inclusion then we may consider C = ∪_{C′∈𝒞′} C′. Then C ⊂ X is also linearly ordered and is an upper bound for 𝒞′ with respect to the inclusion order. We may then apply Zorn's Lemma to 𝒞 to produce a maximal linearly ordered subset.

1.1.4 Ordinals and Cardinals

Proposition 1.1.15. Suppose X ≠ ∅, then |X| ≤ |Y| if and only if there exists a surjection from Y to X.

Proof. Suppose that f : X → Y is injective, and take x_0 ∈ X. We may then define a surjective function g : Y → X by letting g(y) = f^{-1}(y) if y ∈ f(X), and g(y) = x_0 otherwise.

Conversely, if g : Y → X is surjective, then for each x ∈ X we may choose f(x) ∈ Y so that g(f(x)) = x. If f(x) = f(y) then x = g(f(x)) = g(f(y)) = y. Thus, f : X → Y is injective.

Proposition 1.1.16. For any sets X, Y, either |X| ≤ |Y|, or |Y| ≤ |X|.

Proof. If we well order X and Y, then this follows immediately from Theorem 1.1.11.

We would like to define an ordinal as an order isomorphic equivalence class of well ordered sets; the previous proposition would then show that we have a linear ordering on the collection of ordinals. We should be somewhat careful here though, as there is no reason that the collection of all well ordered sets should be a set itself, and we have only discussed equivalence relations and orderings on sets. One option to make this precise is to work outside of the universe of sets, that is, we can call the collection of all well ordered sets a "class", and then we can introduce equivalence relations and well orderings for classes.

Another option to make this notion precise is the following definition proposed by von Neumann: A set α is an ordinal if every element of α is also a proper subset of α, and if α is well ordered with respect to set inclusion. The first few ordinals are then 0 = ∅, 1 = {0}, 2 = {0, 1}, 3 = {0, 1, 2}. The first infinite ordinal is ω = {0, 1, 2, 3, 4, ...}. For each ordinal α we may consider the set α ∪ {α}, which is again an ordinal; this is the successor ordinal, which we denote by α + 1. Any ordinal which is not a successor ordinal is called a limit ordinal.

If two ordinals are order isomorphic, then it follows easily by induction that they must be the same. More generally, if α and β are ordinals, then by Theorem 1.1.11 either α and β are order isomorphic, in which case α = β, or else one (say α) is isomorphic to (and hence equal to) an initial segment of the other, in which case we have α ⊂ β. The following proposition shows that von Neumann's definition captures all order isomorphism classes.

Proposition 1.1.17. Let X be a well ordered set, then there exists a unique ordinal α such that X and α are order isomorphic.

Proof. We let A denote the subset of X consisting of all points x such that there exists an ordinal α_x and an order isomorphism f_x : I_x → α_x. Note that if x, y ∈ A with x < y, then we obtain an order isomorphism from α_x to an initial segment in α_y, hence we have α_x ⊂ α_y. We then have α_y = ∪_{x≤y} α_x, and f_y|_{I_x} = f_x.

If I_z ⊂ A, and z is not a successor, then we set α = ∪_{x∈I_z} α_x. Note that α is again an ordinal. We define the function f : I_z → α by setting f(x) = f_y(x) for some x < y < z (note that such a y exists since z is not a successor). Then f is well defined and implements an order isomorphism between I_z and α, showing that z ∈ A.

Similarly, if I_z ⊂ A, and z is a successor to y ∈ A, then we consider the ordinal α = α_y ∪ {α_y}, and we define the function f : I_z → α by f(x) = f_y(x) for x < y, and f(y) = α_y. We then again see that z ∈ A. By transfinite induction we then have A = X, and the result then follows easily.

Just as there is a first infinite ordinal, there is also a first uncountable ordinal:

Proposition 1.1.18. There is a unique uncountable ordinal ω_1 such that I_x is countable for each x ∈ ω_1.

Proof. We let ω_1 be the set of all countable ordinals α. Then ω_1 is ordered by inclusion. If α ∈ ω_1 then α is a countable ordinal and hence so are all the ordinals contained in α, thus α ⊂ ω_1. We therefore have that ω_1 is an ordinal, and I_x is countable for each x ∈ ω_1. Note that ω_1 cannot be countable since ω_1 ∉ ω_1.

If Ω′ were another such ordinal, then we could not have Ω′ ⊊ ω_1, since Ω′ would then be an initial segment of ω_1 and hence countable. We similarly could not have ω_1 ⊊ Ω′, since ω_1 is not countable. Hence we must have ω_1 = Ω′.
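Returning to von Neumann's definition, the finite ordinals can be built concretely as nested sets. The sketch below uses Python `frozenset` values and illustrative function names; it only covers finite ordinals, where being linearly ordered by inclusion already implies being well ordered.

```python
def successor(alpha: frozenset) -> frozenset:
    """Return alpha + 1 = alpha ∪ {alpha}."""
    return alpha | frozenset([alpha])

def finite_ordinal(n: int) -> frozenset:
    """Build the von Neumann ordinal n = {0, 1, ..., n - 1} starting from 0 = ∅."""
    alpha = frozenset()
    for _ in range(n):
        alpha = successor(alpha)
    return alpha

def elements_are_proper_subsets(alpha) -> bool:
    """Every element of alpha is also a proper subset of alpha."""
    return all(beta < alpha for beta in alpha)   # '<' is proper inclusion for sets

def linearly_ordered_by_inclusion(alpha) -> bool:
    """Any two elements of alpha are comparable under inclusion."""
    return all(b <= c or c <= b for b in alpha for c in alpha)

three = finite_ordinal(3)
```

In this encoding the ordinal n has exactly n elements, and m < n holds precisely when the set for m is an element (equivalently, a proper subset) of the set for n.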

If we ﬁx a set X, then by the well ordering principle there exists a well ordering on X and hence a bijection between X and some ordinal α. The cardinality of X is the smallest ordinal such that there exists such a bijection. We denote the cardinality of X by |X|, and note that this is consistent with our notation above.

1.1.5 Exercises

Exercise 1.1.19. If X is countably infinite then there is a bijection from X onto N.

Exercise 1.1.20. We have |R^N| = |R|.

Exercise 1.1.21. Let 2

Exercise 1.1.22. Let X be a countably inﬁnite set. There exists an uncountable family F ⊂ 2X so that for any distinct pair A, B ∈ F we have that A ∩ B is ﬁnite. (Hint: It may help to consider the case X = 2

A complex number α is algebraic if it is a root of a nonzero polynomial having rational coefficients.

Exercise 1.1.23. The set of algebraic numbers is countable.

Exercise 1.1.24. The Hausdorﬀ maximal principle implies Zorn’s lemma.

Exercise 1.1.25. Zorn’s lemma implies the well ordering principle.

Exercise 1.1.26. The well ordering principle implies the axiom of choice.

Let X be a set, and ≤ be a linear ordering on X. We say that the linear order is dense if for all x < y there exists z ∈ X such that x < z < y.

Exercise 1.1.27 (Cantor's back-and-forth method). Let (X, ≤) and (Y, ≤) be countable dense linear orderings which do not have upper or lower bounds. Enumerate X = {x_1, x_2, ...}, and Y = {y_1, y_2, ...}.

1. There exist increasing sequences of finite sets A_n ⊂ X, B_n ⊂ Y, and order preserving bijections f_n : A_n → B_n such that x_n ∈ A_n, y_n ∈ B_n, and f_{n+1}|_{A_n} = f_n, for all n ≥ 1.

2. There exists an order preserving bijection f : X → Y.

An (undirected) graph consists of a pair (V, E) where V is a set (the vertex set) and E ⊂ V × V (the edge set) such that (v, w) ∈ E if and only if (w, v) ∈ E. A subgraph is a graph (V_0, E_0) with V_0 ⊂ V, and E_0 ⊂ E. If (V, E) is a graph, two vertices v, w ∈ V are adjacent if (v, w) ∈ E. We let N(v) denote the set of vertices which are adjacent to v. A graph (V, E) is locally finite if |N(v)| < ∞ for each v ∈ V. A finite simple path is an injective function p : {1, 2, ..., n} → V such that (p(k), p(k + 1)) ∈ E for all 1 ≤ k < n; we say that n is the length of the path. A ray is an injective function p : N → V such that (p(k), p(k + 1)) ∈ E for all k ∈ N. A graph is connected if for any distinct vertices v, w ∈ V, there exists a finite simple path p : {1, 2, ..., n} → V such that p(1) = v and p(n) = w.

Exercise 1.1.28 (König's lemma). If a locally finite connected graph (V, E) has infinitely many vertices, then (V, E) admits a ray.

1.2 Metric spaces

Let X be a set. A semimetric on X is a function d : X × X → [0, ∞) such that for all x, y, z ∈ X the following properties hold:

1. d(x, y) = d(y, x).

2. d(x, z) ≤ d(x, y) + d(y, z).

If, in addition, we have that x = y if and only if d(x, y) = 0, then d is a metric. A metric space is a pair (X, d) consisting of a set X and a metric d on X. When d is understood we will sometimes refer to the metric space X. Examples of metric spaces include:

1. Euclidean space R^n with metric d(x, y) = ‖x − y‖.

2. If X is any set and for all x, y ∈ X we have d(x, y) = 0 if x = y, and d(x, y) = 1 if x ≠ y, then (X, d) is a metric space.

3. If (X, d) is a metric space and A ⊂ X, then (A, d|_{A×A}) is a metric space.

4. If (X_1, d_1) and (X_2, d_2) are metric spaces, then X_1 × X_2 is a metric space with metric d((x_1, x_2), (y_1, y_2)) = max(d_1(x_1, y_1), d_2(x_2, y_2)).

5. If (X, d) is a metric space and f : [0, ∞) → [0, ∞) is a strictly increasing function satisfying f(0) = 0 and f(s + t) ≤ f(s) + f(t) for all s, t ≥ 0 (e.g., f(t) = √t, or f(t) = t/(t + 1)), then (X, f ∘ d) is again a metric space.

If (X, d) is a metric space, x ∈ X and r > 0, then the ball of radius r about x is B(r, x) = {y ∈ X | d(x, y) < r}; we call any such set B(r, x) an r-ball. A set A ⊂ X is open if for each x ∈ A there exists r > 0 such that B(r, x) ⊂ A. A set A ⊂ X is closed if A^c is open. Both ∅ and X are open. If a set is both closed and open then we say it is clopen. Note that the collection of open sets is closed under finite intersections and arbitrary unions. Taking complements shows that the collection of closed sets is closed under finite unions and arbitrary intersections. If E ⊂ X then the closure Ē of E is the intersection of all closed sets which contain E. We say that E is dense if Ē = X.

If (X_1, d_1) and (X_2, d_2) are metric spaces, and f : X_1 → X_2, then f is

1. isometric if d_2(f(x), f(y)) = d_1(x, y) for all x, y ∈ X_1;

2. Lipschitz continuous if there exists K ≥ 0 (a Lipschitz constant) such that for all x, y ∈ X_1 we have d_2(f(x), f(y)) ≤ K d_1(x, y);

3. contractive if f is Lipschitz continuous with Lipschitz constant 1;

4. uniformly continuous if for all ε > 0, there exists δ > 0 so that f(B(δ, x)) ⊂ B(ε, f(x)) for all x ∈ X_1;

5. continuous at x ∈ X_1 if for each ε > 0, there exists δ > 0 so that f(B(δ, x)) ⊂ B(ε, f(x));

6. continuous if f is continuous at each point x ∈ X_1;

7. a homeomorphism if f is bijective and both f and f^{-1} are continuous.
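The metric axioms, and the examples of metric spaces listed earlier in this section, can be spot-checked numerically on a finite sample of points. The sketch below is only a sanity check under stated assumptions: the sample points, the tolerance `tol`, and all function names are illustrative choices, and passing the check on a sample does not prove that a function is a metric.

```python
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Check symmetry, the triangle inequality and d(x, y) = 0 iff x = y on a sample."""
    for x, y, z in product(points, repeat=3):
        if d(x, y) < 0 or abs(d(x, y) - d(y, x)) > tol:
            return False
        if d(x, z) > d(x, y) + d(y, z) + tol:
            return False
        if (d(x, y) == 0) != (x == y):
            return False
    return True

def discrete(x, y):
    """The metric taking the value 0 if x = y and 1 otherwise."""
    return 0 if x == y else 1

def euclid(x, y):
    """The Euclidean metric on R."""
    return abs(x - y)

def max_metric(p, q):
    """Product metric on R x {labels}: max of the two coordinate distances."""
    return max(euclid(p[0], q[0]), discrete(p[1], q[1]))

def bounded(x, y):
    """f ∘ d with f(t) = t / (t + 1), which is increasing and subadditive."""
    return euclid(x, y) / (1 + euclid(x, y))

def squared(x, y):
    """f ∘ d with f(t) = t^2, which is not subadditive."""
    return (x - y) ** 2

reals = [-1.5, 0.0, 0.25, 2.0, 7.0]
pairs = list(product(reals, "ab"))
```

The function `squared` is included as a contrast: it is symmetric and vanishes only on the diagonal, but it fails the triangle inequality, which illustrates why the subadditivity condition on f is needed.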

If X is a metric space, a sequence {x_n}_{n∈N} ⊂ X has a limit point x ∈ X if for all ε > 0, there exists N ∈ N so that x_n ∈ B(ε, x) for n ≥ N. We say that {x_n}_n converges to x and write lim_{n→∞} x_n = x if this is the case. The sequence {x_n}_{n∈N} is convergent if it converges to some point x ∈ X.

Proposition 1.2.1. If (X_1, d_1) and (X_2, d_2) are metric spaces, the following are equivalent:

1. f : X_1 → X_2 is continuous;

2. for any convergent sequence {x_n}_{n∈N} we have that {f(x_n)}_{n∈N} is also convergent and lim_{n→∞} f(x_n) = f(lim_{n→∞} x_n);

3. f^{-1}(O) is open for each open set O ⊂ X_2.

Proof. First, suppose f : X_1 → X_2 is continuous, and {x_n}_{n∈N} is a sequence such that x = lim_{n→∞} x_n. Fix ε > 0. Since f is continuous there exists δ > 0 so that f(B(δ, x)) ⊂ B(ε, f(x)). Since x = lim_{n→∞} x_n, there exists N ∈ N so that x_n ∈ B(δ, x) for all n ≥ N. Hence, f(x_n) ∈ f(B(δ, x)) ⊂ B(ε, f(x)) for all n ≥ N. Since ε > 0 was arbitrary we have f(x) = lim_{n→∞} f(x_n).

Next, suppose that O ⊂ X_2 is open, but f^{-1}(O) is not open. Then there exists x ∈ f^{-1}(O) such that for all n ≥ 1 we have B(1/n, x) ⊄ f^{-1}(O). For each n ≥ 1 choose x_n ∈ B(1/n, x) \ f^{-1}(O). Since O is open there exists ε > 0 so that B(ε, f(x)) ⊂ O. We then have x = lim_{n→∞} x_n, and f(x_n) ∉ O ⊃ B(ε, f(x)) for all n, hence {f(x_n)}_n does not converge to f(x).

Finally, suppose that f^{-1}(O) is open whenever O ⊂ X_2 is open. Fix x ∈ X_1 and ε > 0. Then B(ε, f(x)) is open and hence so is f^{-1}(B(ε, f(x))). Therefore, there exists δ > 0 so that B(δ, x) ⊂ f^{-1}(B(ε, f(x))). Hence, we have f(B(δ, x)) ⊂ B(ε, f(x)), showing that f is continuous.

A sequence of functions f_n : X → Y is said to converge pointwise to a function f : X → Y if f(x) = lim_{n→∞} f_n(x) for each x ∈ X. The sequence {f_n}_{n∈N} converges uniformly to f if lim_{n→∞} sup_{x∈X} |f(x) − f_n(x)| = 0.

Proposition 1.2.2. Suppose f_n : X → Y are continuous, and {f_n}_{n∈N} converges to f : X → Y uniformly, then f is continuous.

Proof. Fix ε > 0, and x ∈ X. Since f_n → f uniformly, there exists n ∈ N so that sup_{y∈X} |f(y) − f_n(y)| < ε/3. Since f_n is continuous there exists an open set O containing x so that for y ∈ O we have |f_n(y) − f_n(x)| < ε/3. For y ∈ O we then have

|f(y) − f(x)| ≤ |f(y) − f_n(y)| + |f_n(y) − f_n(x)| + |f_n(x) − f(x)| < ε.

Thus, f is continuous.
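Uniform convergence cannot be weakened to pointwise convergence in Proposition 1.2.2. A standard example, not taken from these notes, is f_n(x) = x^n on [0, 1]: the pointwise limit is 0 for x < 1 and 1 at x = 1, which is discontinuous. The sketch below estimates sup |f − f_n| on a finite grid; the grid size and function names are illustrative assumptions, and a grid maximum only gives a lower bound for the true supremum.

```python
def f_n(n: int, x: float) -> float:
    """The continuous function x -> x^n on [0, 1]."""
    return x ** n

def f(x: float) -> float:
    """Pointwise limit of f_n on [0, 1]: 0 for x < 1 and 1 at x = 1."""
    return 1.0 if x == 1.0 else 0.0

# Evenly spaced sample of [0, 1], including both endpoints.
grid = [k / 10_000 for k in range(10_001)]

def sup_distance(n: int) -> float:
    """Lower bound for sup |f(x) - f_n(x)| obtained by sampling the grid."""
    return max(abs(f(x) - f_n(n, x)) for x in grid)
```

At each fixed x < 1 the values x^n tend to 0, yet the sampled distance stays close to 1 for every n because points near 1 converge arbitrarily slowly.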

A sequence {x_n}_{n∈N} is Cauchy if for all ε > 0, there exists N ∈ N so that d(x_n, x_m) < ε for all n, m ≥ N. A metric space is complete if every Cauchy sequence is convergent.

Proposition 1.2.3. R^n with its Euclidean metric is complete.

Proof. A sequence in R^n is Cauchy if and only if its coordinates are Cauchy, and, similarly, a sequence converges if and only if its coordinates converge. Thus, the general result follows from the case of R. Suppose {x_n}_{n∈N} ⊂ R is Cauchy. Then there exists N ∈ N so that |x_N − x_m| < 1 for all m ≥ N. Hence {x_n}_{n∈N} is bounded. We let x = lim sup_{n→∞} x_n. Fix ε > 0, then there exists N ∈ N so that |x_n − x_m| < ε/2 for all n, m ≥ N. Also, there exists n ≥ N so that |x − x_n| < ε/2. Then for all m ≥ N we have |x − x_m| ≤ |x − x_n| + |x_n − x_m| < ε. It then follows that x = lim_{n→∞} x_n.

Proposition 1.2.4. A closed subset of a complete metric space is complete, and a complete subspace of an arbitrary metric space is closed.

Proof. Suppose (X, d) is complete and F ⊂ X is closed. Let {x_n}_{n∈N} ⊂ F be Cauchy. Then it is also Cauchy in X and by completeness there exists x ∈ X so that x_n → x. If ε > 0 then there exists N ∈ N so that x_n ∈ B(ε, x) for all n ≥ N. In particular we have B(ε, x) ∩ F ≠ ∅, and hence x ∉ F^c as this set is open and ε > 0 was arbitrary. Thus x ∈ F, and F is complete.

Conversely, suppose F ⊂ X is a subspace which is not closed. Then F^c is not open and hence there exists x ∈ F^c so that B(1/n, x) ∩ F ≠ ∅ for all n ≥ 1. Take x_n ∈ B(1/n, x) ∩ F for each n ≥ 1. Then x = lim_{n→∞} x_n and hence {x_n}_n is Cauchy. However, x ∉ F and hence {x_n}_n does not converge to a point in F. Thus, F is not complete.

If (X, d) is a metric space, then we let Cauchy(X) denote the set of all Cauchy sequences in X. On Cauchy(X)^2 we define the function d̄ by d̄({x_n}_{n∈N}, {y_n}_{n∈N}) = lim_{n→∞} d(x_n, y_n). We leave it as an exercise to verify that this limit actually exists. Using the properties of the metric d it is then easy to see that we have d̄(s, t) = d̄(t, s) and d̄(s, r) ≤ d̄(s, t) + d̄(t, r) for all Cauchy sequences s, t, r. We define an equivalence relation on Cauchy(X) by s ∼ t if and only if d̄(s, t) = 0. It is easy to see that this is indeed an equivalence relation, and that d̄(s, t) only depends on the equivalence classes that s and t lie in. Thus, setting X̄ = Cauchy(X)/∼, we may view d̄ as a function on X̄ × X̄, where it then gives a metric. We also have a natural isometric embedding π : X → X̄ which takes a point x ∈ X to the constant sequence π(x) = {x}_{n∈N}. We call the metric space (X̄, d̄) the completion of (X, d), and we usually view X as a subspace by identifying X with π(X).

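Proposition 1.2.5. The completion $(\overline{X}, \overline{d})$ is complete, and $X$ is a dense subspace.

Proof. We first show that $\pi(X)$ is a dense subspace. Suppose that $\{x_n\}_{n \in \mathbb{N}}$ is Cauchy and $\varepsilon > 0$ is given. Then there exists $N \in \mathbb{N}$ so that $d(x_n, x_m) < \varepsilon$ for all $n, m \geq N$. In particular, we have that $d(x_N, x_m) < \varepsilon$ for all $m \geq N$. Hence, we have $\overline{d}(\pi(x_N), \{x_n\}_{n \in \mathbb{N}}) \leq \varepsilon$.

Next we show that $(\overline{X}, \overline{d})$ is complete. Note that if $\{s_n\}_{n \in \mathbb{N}}, \{t_n\}_{n \in \mathbb{N}} \subset \overline{X}$ are such that $0 = \lim_{n \to \infty} \overline{d}(s_n, t_n)$, then $\{s_n\}_{n \in \mathbb{N}}$ is Cauchy (resp. convergent) if and only if $\{t_n\}_{n \in \mathbb{N}}$ is Cauchy (resp. convergent). Thus, it is enough to consider Cauchy sequences which are valued in the dense subspace $\pi(X)$. Suppose therefore that $\{\pi(x_n)\}_{n \in \mathbb{N}}$ is Cauchy, with $x_n \in X$. Since $\pi$ is isometric, $s = \{x_n\}_{n \in \mathbb{N}}$ is a Cauchy sequence in $X$, and it follows easily that $0 = \lim_{n \to \infty} \overline{d}(s, \pi(x_n))$. Hence $(\overline{X}, \overline{d})$ is complete.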

If E ⊂ X, then E is bounded if there exists K > 0, such that d(x, y) ≤ K, for all x, y ∈ E. If {Vi}i∈I is a family of subsets of X such that E ⊂ ∪i∈I Vi, then {Vi}i∈I is a cover of E. E is totally bounded if for any ε > 0, there is a ﬁnite collection of ε-balls which cover E. Note that totally bounded sets are also bounded.

Lemma 1.2.6. A metric space (X, d) is totally bounded if and only if every sequence has a Cauchy subsequence.

Proof. Suppose $(X, d)$ is totally bounded and $\{x_n\}_{n \in \mathbb{N}}$ is a sequence. We will inductively define a decreasing sequence of infinite subsets $A_j \subset \mathbb{N}$ such that $d(x_n, x_m) < 2/j$ for all $n, m \in A_j$ and $j \geq 2$. We first set $A_1 = \mathbb{N}$. Suppose now that $A_{j-1}$ has been chosen for $j \geq 2$. Since $X$ is totally bounded, there exists a finite collection of $1/j$-balls $O_1, \ldots, O_k$ which cover $X$. Since $A_{j-1}$ is infinite, for some $O_i$ the set $A_j = \{n \in A_{j-1} \mid x_n \in O_i\}$ is infinite, and any two points of $O_i$ are at distance less than $2/j$. We now choose a subsequence by taking $n_j \in A_j$ so that $n_j$ is strictly increasing. If $\varepsilon > 0$, and $j \in \mathbb{N}$ is such that $2/j < \varepsilon$, then we have $d(x_{n_k}, x_{n_l}) < 2/j < \varepsilon$ for all $k, l \geq j$. Therefore the subsequence $\{x_{n_j}\}_{j \in \mathbb{N}}$ is Cauchy.

Conversely, if $X$ is not totally bounded then there exists $\varepsilon_0 > 0$ so that there is no cover of $X$ by finitely many $\varepsilon_0$-balls. We may therefore inductively construct a sequence $x_n \in X$ so that $d(x_n, x_m) \geq \varepsilon_0$ for all $n \neq m$. We then have that no subsequence of $\{x_n\}_{n \in \mathbb{N}}$ is Cauchy.

Lemma 1.2.7. Let $(X, d)$ be a totally bounded metric space, then every open cover has a countable subcover.

Proof. Suppose that $\{V_i\}_{i \in I}$ is an open cover. For each $n \in \mathbb{N}$ take $\{x_1^n, \ldots, x_{k_n}^n\} \subset X$ so that $\cup_{j=1}^{k_n} B(1/n, x_j^n) = X$. We then let $\mathcal{O}_{n,m}$ be the collection of all open balls $B(1/m, x_j^n)$ which are contained in some $V_i$, for $i \in I$. We set $\mathcal{O} = \cup_{n,m \in \mathbb{N}} \mathcal{O}_{n,m}$.

If $x \in X$ then we have $x \in V_i$ for some $i \in I$. We then have $B(1/n, x) \subset V_i$ for some $n \in \mathbb{N}$. For some $1 \leq j \leq k_{2n}$ we then have $x \in B(1/2n, x_j^{2n}) \subset V_i$. Thus, we see that $\mathcal{O}$ covers $X$ and is countable. Moreover, each set in $\mathcal{O}$ is contained in $V_i$ for some $i \in I$, thus a countable subcollection of $\{V_i\}_{i \in I}$ must cover $X$.

Theorem 1.2.8. If E ⊂ X, the following are equivalent:

1. E is complete and totally bounded.

2. (The Bolzano-Weierstrass Property) Every sequence in E has a subsequence which converges to a point in E.

3. (The Heine-Borel Property) If {Vi}i∈I is a cover of E by open sets, then there exists a ﬁnite set F ⊂ I such that {Vi}i∈F is also a cover of E. Proof. (1 =⇒ 2) Suppose that E ⊂ X is complete and totally bounded. Let

$\{x_n\}_{n \in \mathbb{N}} \subset E$ be a sequence. By Lemma 1.2.6 there exists a Cauchy subsequence $\{x_{n_j}\}_{j \in \mathbb{N}}$, and since $E$ is complete this subsequence converges to a point in $E$. Therefore $E$ satisfies the Bolzano-Weierstrass property.

(2 ⟹ 1) If $E$ is not totally bounded then Lemma 1.2.6 shows that $E$ has a sequence which has no Cauchy (and hence no convergent) subsequence.

Similarly, if $E$ is not complete then there exists a Cauchy sequence $\{x_n\}_{n \in \mathbb{N}}$ which does not converge, and it then follows easily that no subsequence can converge either. We have therefore shown the equivalence between the Bolzano-Weierstrass property and being complete and totally bounded.

(3 ⟹ 1) If $E$ is not totally bounded then there exists $\varepsilon_0 > 0$ so that there is no cover of $E$ by finitely many $\varepsilon_0$-balls. However, the collection of all $\varepsilon_0$-balls covers $E$, and hence $E$ does not have the Heine-Borel property. Also, if $E$ is not complete, then we may consider the completion $\overline{E}$ and take a point $x \in \overline{E} \setminus E$. Then consider $O_n = \{y \in E \mid \overline{d}(y, x) > 1/n\}$. We then have that $\{O_n\}_{n \in \mathbb{N}}$ is an increasing sequence of open sets such that $\cup_{n \in \mathbb{N}} O_n = E$. However, $O_n \neq E$ for any $n \in \mathbb{N}$ since $E$ is dense in $\overline{E}$. Hence, we again have shown that $E$ does not have the Heine-Borel property.

(1 ⟹ 3) Suppose that $E$ is totally bounded and $\{V_i\}_{i \in I}$ is an open cover which does not have a finite subcover. By Lemma 1.2.7 we may pass to a countable subcover, so that we may assume $\{V_i\}_{i \in I}$ is countable and then enumerate it as $\{V_n\}_{n \in \mathbb{N}}$. We inductively define a sequence $\{x_n\}_{n \in \mathbb{N}} \subset E$ by taking $x_n \notin \cup_{k=1}^{n} V_k$. Note that by construction, each open set $V_n$ can contain at most finitely many elements of the sequence $\{x_n\}_{n \in \mathbb{N}}$. By Lemma 1.2.6 there exists a Cauchy subsequence $\{x_{n_j}\}_{j \in \mathbb{N}}$. If this subsequence converged to some point $x \in E$, then $x$ would be contained in some open set $V_n$ and it would follow that infinitely many of the $x_{n_j}$ would belong to $V_n$, contradicting our remark above. Thus, $\{x_{n_j}\}_{j \in \mathbb{N}}$ is a Cauchy sequence which does not converge in $E$, and hence $E$ is not complete.

Any set $E$ which satisfies the conditions of the previous theorem is called a compact set. Note that homeomorphisms preserve open sets and hence from the Heine-Borel property we see that homeomorphisms preserve compact sets. This is not the case however for complete sets.

Proposition 1.2.9. Let (X, d) and (Y, ρ) be metric spaces with X compact. Suppose that f : X → Y is continuous. Then f(X) is compact and f is uniformly continuous.

Proof. If $\{O_i\}_{i \in I}$ is an open cover of $f(X)$, then since $f$ is continuous we have that $\{f^{-1}(O_i)\}_{i \in I}$ is an open cover of $X$. By the Heine-Borel property there exists a finite subcover $f^{-1}(O_1), \ldots, f^{-1}(O_n)$. We then have that $O_1, \ldots, O_n$ covers $f(X)$, and so by the Heine-Borel property $f(X)$ is compact.

To see that $f$ is uniformly continuous we fix $\varepsilon > 0$. Since $f$ is continuous, for each $x \in X$ there exists $\delta_x > 0$ so that $f(B(\delta_x, x)) \subset B(\varepsilon/2, f(x))$. Then $\{B(\delta_x/2, x)\}_{x \in X}$ covers $X$ and by the Heine-Borel property there is a finite subcover $B(\delta_{x_1}/2, x_1), \ldots, B(\delta_{x_n}/2, x_n)$.

Set $\delta = \min_{1 \leq i \leq n} \{\delta_{x_i}/2\}$. Then if $1 \leq i \leq n$ and $x \in B(\delta_{x_i}/2, x_i)$ we have $B(\delta, x) \subset B(\delta_{x_i}, x_i)$ and hence

$f(B(\delta, x)) \subset f(B(\delta_{x_i}, x_i)) \subset B(\varepsilon/2, f(x_i)).$

Since also $f(x) \in B(\varepsilon/2, f(x_i))$, it follows that $f(B(\delta, x)) \subset B(\varepsilon, f(x))$. Since $B(\delta_{x_1}/2, x_1), \ldots, B(\delta_{x_n}/2, x_n)$ covers $X$, it follows that $f$ is uniformly continuous.

1.2.1 Exercises Exercise 1.2.10. Suppose that X is a set and d : X × X → [0, ∞) is a semimetric on X. We deﬁne a relation ∼ on X by x ∼ y if d(x, y) = 0. Then ∼ is an equivalence relation on X and we have a well deﬁned metric on X/ ∼ given by d˜([x], [y]) = d(x, y).

Exercise 1.2.11. There are two homeomorphic metric spaces (X1, d1) and (X2, d2) such that (X1, d1) is complete, while (X2, d2) is not. A metric space (X, d) is separable if it contains a countable dense set.

Exercise 1.2.12. A compact metric space is separable.

We let $\ell^\infty(\mathbb{N})$ denote the set of bounded functions from $\mathbb{N}$ to $\mathbb{C}$. We consider this as a complete metric space whose metric is given by $d(f, g) = \|f - g\|_\infty = \sup_{n \in \mathbb{N}} |f(n) - g(n)|$.

Exercise 1.2.13 (Kuratowski). Every bounded separable metric space is isometric to a subspace of $\ell^\infty(\mathbb{N})$.

1.3 Normed spaces

We assume the reader is familiar with the basic properties of vector spaces. Let $K = \mathbb{R}$, or $K = \mathbb{C}$, and suppose that $V$ is a $K$-vector space. A seminorm on $V$ is a map $V \ni v \mapsto \|v\| \in [0, \infty)$ which satisfies

1. $\|v + w\| \leq \|v\| + \|w\|$;

2. $\|kv\| = |k| \|v\|$,

for $k \in K$, and $v, w \in V$. If, in addition, we have that $\|v\| = 0$ if and only if $v = 0$, then we say that $\| \cdot \|$ is a norm. Associated with a (semi)norm is a (semi)metric $d$ which is given by $d(v, w) = \|v - w\|$. A normed space is a pair $(V, \| \cdot \|)$ where $V$ is a vector space and $\| \cdot \|$ is a norm on $V$. If the associated metric is complete then the normed space is a Banach space. Examples of Banach spaces include:

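1. $\ell^1_n = K^n$, with norm $\|(\alpha_1, \ldots, \alpha_n)\|_1 = \sum_{k=1}^n |\alpha_k|$.

2. More generally, if $1 \leq p < \infty$, $\ell^p_n = K^n$, with norm $\|(\alpha_1, \ldots, \alpha_n)\|_p = \left( \sum_{k=1}^n |\alpha_k|^p \right)^{1/p}$.

3. $\ell^\infty_n = K^n$, with norm $\|(\alpha_1, \ldots, \alpha_n)\|_\infty = \max\{|\alpha_k| \mid 1 \leq k \leq n\}$.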
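The three norms above are easy to evaluate numerically. The following is a minimal sketch, not part of the original notes; the function names `p_norm` and `sup_norm` are hypothetical, and vectors are modelled as plain Python lists of real or complex numbers.

```python
def p_norm(vec, p):
    """Norm of vec in l^p_n for 1 <= p < infinity (hypothetical helper)."""
    if p < 1:
        raise ValueError("p must satisfy p >= 1")
    return sum(abs(a) ** p for a in vec) ** (1.0 / p)


def sup_norm(vec):
    """Norm of vec in l^infinity_n: the largest absolute value of a coordinate."""
    return max(abs(a) for a in vec)


# Two sample vectors in K^2 (here K = R).
v = [3, -4]
w = [1, 2]
v_plus_w = [a + b for a, b in zip(v, w)]

norm_1 = p_norm(v, 1)      # |3| + |-4|
norm_2 = p_norm(v, 2)      # Euclidean length of v
norm_inf = sup_norm(v)     # max(|3|, |-4|)

# The triangle inequality, checked on this one pair for p = 1, 2 and for the sup norm.
triangle_ok = all(
    p_norm(v_plus_w, p) <= p_norm(v, p) + p_norm(w, p) + 1e-12 for p in (1, 2)
) and sup_norm(v_plus_w) <= sup_norm(v) + sup_norm(w)
```

A single numerical check of course proves nothing; it only illustrates the definitions on one pair of vectors.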

If $(V, \| \cdot \|_V)$ and $(W, \| \cdot \|_W)$ are normed spaces, and $T : V \to W$ is a linear operator, then we say that $T$ is bounded if there exists $K > 0$ so that $\|Tv\|_W \leq K \|v\|_V$ for all $v \in V$. We let $B(V, W)$ (or $B(V)$ if $V = W$) denote the set of bounded linear operators. Then $B(V, W)$ is a $K$-vector space, where the vector space structure is taken pointwise, i.e., $(T + S)(v) = T(v) + S(v)$, and $(kT)(v) = k(T(v))$, for $k \in K$, $v \in V$, and $T, S \in B(V, W)$. If $T \in B(V, W)$ then the operator norm of $T$ is given by $\|T\|_{B(V,W)} = \sup_{v \in V, \|v\|_V \leq 1} \|Tv\|_W$. The space $B(V, W)$, together with its operator norm, is a normed space.

1.3.1 Algebras

We again let $K = \mathbb{R}$, or $K = \mathbb{C}$. A $K$-algebra is a $K$-vector space $A$, together with a binary operation $A \times A \ni (a, b) \mapsto ab \in A$ (called multiplication, or composition), such that

1. (ab)c = a(bc);

2. α(ab) = (αa)b = a(αb);

3. a(b + c) = (ab) + (ac);

4. (a + b)c = (ac) + (bc), for α ∈ K, a, b, c ∈ A. Examples of algebras include:

1. The vector space of n × n matrices Mn(K), together with matrix multiplication.

2. The space of K-polynomials with its usual vector space structure and multiplication.

3. $\ell^\infty_n$ where multiplication is taken coordinate-wise: $(\alpha_1, \ldots, \alpha_n) \cdot (\beta_1, \ldots, \beta_n) = (\alpha_1 \beta_1, \ldots, \alpha_n \beta_n)$.

A normed algebra is an algebra $A$, which also has a norm $\| \cdot \|$ which satisfies $\|ab\| \leq \|a\| \|b\|$, for $a, b \in A$. A Banach algebra is a normed algebra where the norm is complete.

The space $\ell^\infty_n(K)$ is a normed algebra. Also, if $V$ is a normed space then $B(V)$, with its operator norm and with composition as multiplication, is a normed algebra.

If $(X, d)$ is a metric space, we let $C_b(X)$ denote the space of all complex-valued continuous functions which are uniformly bounded (if $X$ is compact then the boundedness is automatic and we use the notation $C(X)$ instead). For $f \in C_b(X)$, the uniform norm of $f$ is given by $\|f\|_\infty = \sup_{x \in X} |f(x)|$.

Proposition 1.3.1. Let (X, d) be a metric space, then Cb(X), endowed with the uniform norm, is a Banach algebra.

Proof. First, note that $C_b(X)$ is clearly an algebra, and $\| \cdot \|_\infty$ is clearly a norm on $C_b(X)$. If $f, g \in C_b(X)$, then $\|fg\|_\infty = \sup_{x \in X} |f(x) g(x)| \leq \sup_{x, y \in X} |f(x) g(y)| = \|f\|_\infty \|g\|_\infty$, so that $C_b(X)$ is a normed algebra.

Suppose $\{f_n\}_{n \in \mathbb{N}} \subset C_b(X)$ is Cauchy. Then for each $x \in X$ the sequence $\{f_n(x)\}_{n \in \mathbb{N}}$ is Cauchy and hence converges to some $f(x) \in \mathbb{C}$. We then have that $0 = \lim_{n \to \infty} \|f_n - f\|_\infty$, and $f \in C_b(X)$ by Proposition 1.2.2. Therefore $C_b(X)$ is complete and hence is a Banach algebra.

Note that if $X$ is a set and $d$ is the metric given by $d(x, y) = 0$ if $x = y$, and $d(x, y) = 1$ if $x \neq y$, then every function is continuous and hence $C_b(X)$ is the space of all uniformly bounded functions.

1.3.2 Exercises

In the following we consider vector spaces over a field K, where K = R, or K = C. Recall that if V is a vector space and V0 ⊂ V is a subspace, then the quotient space V/V0 is defined to be the set of cosets {v + V0 | v ∈ V }. This is naturally a vector space whose vector space operations satisfy α(v1 + V0) + (v2 + V0) = (αv1 + v2) + V0, for all v1, v2 ∈ V , and scalar α.

Exercise 1.3.2. Suppose that $V$ is a $K$-vector space and $\| \cdot \|_0$ is a seminorm on $V$. Set $V_0 = \{v \in V \mid \|v\|_0 = 0\}$. Then $V_0$ is a linear subspace and we have a well defined norm on $V/V_0$ given by $\|v + V_0\| = \|v\|_0$, for each $v \in V$.

Exercise 1.3.3. Let $(V, \| \cdot \|_V)$ and $(W, \| \cdot \|_W)$ be normed spaces, and $T : V \to W$ a linear operator. Then $T$ is bounded if and only if $T$ is continuous.

Exercise 1.3.4. Let $(V, \| \cdot \|_V)$ and $(W, \| \cdot \|_W)$ be normed spaces. Then the operator norm on $B(V, W)$ is indeed a norm, and with this norm $B(V)$ is a normed algebra. Moreover, $B(V, W)$ is a Banach space if $W$ is a Banach space.

Exercise 1.3.5. Let $V$ be a finite dimensional $K$-vector space, and suppose $\| \cdot \|_1$ and $\| \cdot \|_2$ are norms on $V$. Then the identity map from $(V, \| \cdot \|_1)$ to $(V, \| \cdot \|_2)$ is a homeomorphism.

Exercise 1.3.6. Let $V$ be a finite dimensional normed space. Then the closed unit ball $B(1, 0)$ is compact, and $V$ is a Banach space.

Exercise 1.3.7 (Riesz' lemma). Let $(V, \| \cdot \|)$ be a normed space, $W \subset V$ a proper closed subspace, and fix $0 < \alpha < 1$. Then there exists $x \notin W$ with $\|x\| = 1$ so that $\inf_{y \in W} \|x - y\| \geq \alpha$. (Hint: Start with $x_0 \notin W$, set $d = \inf_{y \in W} \|x_0 - y\|$, take $x_1 \in W$ so that $\|x_0 - x_1\| \leq d + \varepsilon$ for some suitably chosen $\varepsilon > 0$, and show that $x = \|x_0 - x_1\|^{-1} (x_0 - x_1)$ works.)

Exercise 1.3.8. Let $V$ be a normed space such that the closed unit ball $B(1, 0)$ is compact. Then $V$ is finite dimensional.

If $(V, \| \cdot \|)$ is a normed space, a series $\sum_{n=1}^\infty x_n$ is said to converge if the partial sums $\sum_{n=1}^k x_n$ converge as $k$ tends to infinity. A series $\sum_{n=1}^\infty x_n$ is said to converge absolutely if $\sum_{n=1}^\infty \|x_n\| < \infty$.

Exercise 1.3.9. Let $(V, \| \cdot \|)$ be a normed space over $K$. Then $V$ is a Banach space if and only if every absolutely convergent series converges.

Exercise 1.3.10. Let $(V, \| \cdot \|)$ be a normed space over $K$. Then the metric space completion $\overline{V}$ of $V$ has a vector space structure which extends the vector space structure of $V$. Thus, every normed space is a dense linear subspace of a Banach space.

Chapter 2

Measure and integration

Suppose we wanted to assign a notion of size (or measure) to a collection $\mathcal{M}$ of certain subsets of $\mathbb{R}^n$. That is, for a subset $E \in \mathcal{M}$ we want to assign a number $0 \leq \mu(E) \leq \infty$ which tells us in some sense how large $E$ is. Then we might want the following properties to hold:

(a) $\mathcal{M} = 2^{\mathbb{R}^n}$.

(b) $\mu$ is countably additive: If $\{E_j\}_{j=1}^\infty$ is a sequence of disjoint sets in $\mathcal{M}$, then $\mu(\cup_{j=1}^\infty E_j) = \sum_{j=1}^\infty \mu(E_j)$.

(c) If $E$ can be transformed to $F$ using translations, rotations, and reflections, then $\mu(E) = \mu(F)$.

(d) $\mu$ assigns a finite, nonzero value to the unit cube.

Unfortunately, these conditions are mutually inconsistent. This was first noticed by Vitali in 1905. Suppose we had such a function $\mu : 2^{\mathbb{R}} \to [0, \infty]$. Consider the equivalence relation on $[0, 1)$ which is given by $s \sim t$ if $t - s \in \mathbb{Q}$. Take $E \subset [0, 1)$ so that $E$ contains exactly one element from each equivalence class. For each $t \in \mathbb{Q} \cap [0, 1)$ consider the set $E_t = E + t \mod 1$, i.e.,

$E_t = ((E + t) \cap [t, 1)) \cup ((E + t - 1) \cap [0, t)).$

Then $\{E_t\}_{t \in \mathbb{Q} \cap [0,1)}$ is a countable family of pairwise disjoint sets which cover $[0, 1)$. Note that for each $t \in \mathbb{Q} \cap [0, 1)$ we have

$\mu(E_t) = \mu((E + t) \cap [t, 1)) + \mu((E + t - 1) \cap [0, t)) = \mu(E \cap [0, 1 - t)) + \mu(E \cap [1 - t, 1)) = \mu(E).$

Hence $\mu([0, 1)) = \mu(\cup_{t \in \mathbb{Q} \cap [0,1)} E_t) = \sum_{t \in \mathbb{Q} \cap [0,1)} \mu(E_t) = \sum_{t \in \mathbb{Q} \cap [0,1)} \mu(E)$, so that $\mu([0, 1)) \in \{0, \infty\}$. A contradiction then follows easily.

We must therefore compromise on some of the conditions above. Conditions (c) and (d) seem essential to having a good notion of size, thus we look to weaken conditions (a) or (b). One thing we might try is to weaken countable additivity to finite additivity:

(b') If $\{E_j\}_{j=1}^k$ is a finite sequence of disjoint sets in $\mathcal{M}$, then $\mu(\cup_{j=1}^k E_j) = \sum_{j=1}^k \mu(E_j)$.

The question of whether there exists a function $\mu$ satisfying (a), (b'), (c), and (d) is quite interesting, and we'll come back to this later. (It turns out that such a $\mu$ exists when $n \leq 2$, and does not exist otherwise!)

Another possibility is to not try to measure every subset of $\mathbb{R}^n$, but rather only a certain nice class $\mathcal{M}$ which excludes Vitali's set above. We would want $\mathcal{M}$ to contain all intervals, and to be closed under taking countable unions and complements. In this case, one can indeed obtain such a $\mu$, as was first shown by Lebesgue in 1901 (his dissertation!). Before we present Lebesgue's proof we first take a detour to the abstract setting.

2.1 Measurable sets and functions

Let X be a nonempty set. An algebra of subsets of X is a nonempty collection A of subsets of X which is closed under ﬁnite unions and complements. A σ-algebra is a nonempty collection E of subsets of X which is closed under countable unions and complements. Observe that σ-algebras are also closed under countable intersection. Also, observe that we have ∅,X ∈ E. Note that the intersection of any family of σ-algebras is again a σ-algebra. It follows that if A is any collection of subsets of X, then there is a unique smallest σ-algebra M(A) which contains A. M(A) is the σ-algebra generated by A. If X is a metric space, then the Borel σ-algebra is the σ-algebra B(X) generated by the open subsets of X. A measurable space is a pair, consisting of a set X, together with a σ- algebra of subsets of X. Let (X, M) and (Y, N ) be two measurable spaces. A function f : X → Y is measurable if f −1(E) ∈ M for all E ∈ N . We denote by M(X; Y ) the set of all measurable functions from X to Y (with the underlying σ-algebras implicit). We denote by M(X) = M(X; C) where C is endowed with the Borel σ-algebra. Thus, f ∈ M(X) if and only if f −1(E) ∈ M for any Borel set E ⊂ C. Lemma 2.1.1. Suppose (X, M), (Y, N ), and (Z, P) are measurable spaces and f : X → Y , g : Y → Z are measurable, then g ◦ f : X → Z is measurable.

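Proof. If $E \in \mathcal{P}$ then $(g \circ f)^{-1}(E) = f^{-1}(g^{-1}(E))$ and the result is immediate.

Proposition 2.1.2. Suppose $\mathcal{N}$ is generated as a σ-algebra by $\mathcal{E} \subset \mathcal{N}$. A function $f : X \to Y$ is measurable if and only if $f^{-1}(E) \in \mathcal{M}$ for all $E \in \mathcal{E}$.

Proof. We let $\mathcal{A} = \{E \subset Y \mid f^{-1}(E) \in \mathcal{M}\}$. Then $\mathcal{E} \subset \mathcal{A}$ and hence it is enough to show that $\mathcal{A}$ is a σ-algebra. Note that $\emptyset \in \mathcal{A}$. If $E \in \mathcal{A}$, then $f^{-1}(E^c) = f^{-1}(E)^c \in \mathcal{M}$ and hence $E^c \in \mathcal{A}$. Also, if $\{E_n\}_{n \in \mathbb{N}} \subset \mathcal{A}$, then $f^{-1}(\cup_{n \in \mathbb{N}} E_n) = \cup_{n \in \mathbb{N}} f^{-1}(E_n) \in \mathcal{M}$ and hence $\cup_{n \in \mathbb{N}} E_n \in \mathcal{A}$. Therefore, $\mathcal{A}$ is a σ-algebra.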
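On a finite set the generated σ-algebra can be computed by brute force, which makes the criterion of Proposition 2.1.2 concrete. The sketch below is an illustration only and assumes a finite ground set, where closure under finite unions and complements already gives a σ-algebra; the names `generated_sigma_algebra` and `is_measurable` are hypothetical.

```python
from itertools import combinations


def generated_sigma_algebra(X, generators):
    """Smallest family containing the generators that is closed under
    complements and (finite) unions. X is assumed to be a finite set."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for E in current:
            complement = X - E
            if complement not in family:
                family.add(complement)
                changed = True
        for E, F in combinations(current, 2):
            union = E | F
            if union not in family:
                family.add(union)
                changed = True
    return family


def is_measurable(f, X, sigma_X, generators_Y):
    """Check f^{-1}(E) for each generator E only, as Proposition 2.1.2 permits."""
    return all(
        frozenset(x for x in X if f(x) in E) in sigma_X for E in generators_Y
    )


X = {1, 2, 3, 4}
M = generated_sigma_algebra(X, [{1}, {1, 2}])   # atoms are {1}, {2}, {3, 4}

parity = lambda x: x % 2                 # preimage of {0} is {2, 4}, which splits the atom {3, 4}
low_high = lambda x: 0 if x <= 2 else 1  # preimage of {0} is {1, 2}

parity_measurable = is_measurable(parity, X, M, [{0}])
low_high_measurable = is_measurable(low_high, X, M, [{0}])
```

Here the target space is {0, 1} with the σ-algebra generated by the single set {0}, so only one preimage has to be inspected for each function.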

Corollary 2.1.3. Suppose X and Y are metric spaces and f : X → Y is continuous, then f is measurable with respect to the Borel σ-algebras. Proof. Since a function is continuous if and only if the inverse images of open sets are open, and since the Borel σ-algebra is generated by open sets this follows from the previous proposition. Proposition 2.1.4. Let (X, d) be a separable metric space, then B(X) is generated by the open balls B(r, x), for x ∈ X and r > 0.

Proof. Let O ⊂ X be open and let {xn}n∈N ⊂ O be a countable dense subset. For each n ∈ N we let rn denote the supremum over all r > 0 so that B(r, xn) ⊂ O. Then B(rn, xn) ⊂ O, and by density it follows that ∪n∈NB(rn, xn) = O. Thus, any open set is contained in the σ-algebra generated by open balls and hence this is also true for any Borel set. Corollary 2.1.5. Suppose f : X → R. Then the following conditions are equivalent:

1. $f \in M(X; \mathbb{R})$.

2. $f^{-1}(O) \in \mathcal{M}$ for any open set $O \subset \mathbb{R}$.

3. $f^{-1}((a, b)) \in \mathcal{M}$ for any $a, b \in \mathbb{R}$.

4. $f^{-1}((-\infty, b)) \in \mathcal{M}$ for any $b \in \mathbb{R}$.

Proposition 2.1.6. Let $(X, \mathcal{M})$ be a measurable space.

1. If $f \in M(X)$, and $\phi : \mathbb{C} \to \mathbb{C}$ is continuous, then $\phi \circ f \in M(X)$.

2. If $f, g \in M(X)$, and $\alpha \in \mathbb{C}$, then $\alpha f, f + g, fg, |f|, \mathrm{Re}(f), \mathrm{Im}(f) \in M(X)$.

3. If $f, g \in M(X; \mathbb{R})$ then $\max\{f, g\}, \min\{f, g\} \in M(X; \mathbb{R})$.

Proof. The first assertion follows from Lemma 2.1.1 and Corollary 2.1.3. It then follows that if $f$ is measurable then so are $\alpha f$, $|f|$, $\mathrm{Re}(f)$, and $\mathrm{Im}(f)$, since multiplication by $\alpha$, absolute value, and taking real and imaginary parts are continuous functions.

More generally, consider $\mathbb{C}^2$ with the metric $d((a, b), (x, y)) = \max\{|a - x|, |b - y|\}$. Then we have $B(r, (a, b)) = B(r, a) \times B(r, b)$, and if $f, g \in M(X)$ then the function $(f, g) : X \to \mathbb{C}^2$ given by $(f, g)(x) = (f(x), g(x))$ satisfies

$(f, g)^{-1}(B(r, (a, b))) = f^{-1}(B(r, a)) \cap g^{-1}(B(r, b))$

and hence is measurable by Propositions 2.1.2 and 2.1.4. Then if $\phi : \mathbb{C}^2 \to \mathbb{C}$ is continuous we must have that $\phi \circ (f, g)$ is again measurable. Since addition and multiplication are continuous maps from $\mathbb{C}^2$ to $\mathbb{C}$, and since maximum and minimum are continuous maps from $\mathbb{R}^2$ to $\mathbb{R}$, it then follows that if $f, g \in M(X)$ then $f + g, fg \in M(X)$, and if $f, g \in M(X; \mathbb{R})$ then $\max\{f, g\}, \min\{f, g\} \in M(X; \mathbb{R})$.

Proposition 2.1.7. Suppose $\{f_n\}_{n \in \mathbb{N}} \subset M(X)$, and $f : X \to \mathbb{C}$ so that $f_n(x) \to f(x)$, for each $x \in X$. Then $f \in M(X)$.

Proof. Since a function $f$ is measurable if and only if its real and imaginary parts are measurable we may assume that $f_n \in M(X; \mathbb{R})$ for each $n \in \mathbb{N}$. If $r \in \mathbb{R}$, then $f(t) < r$ if and only if there exist $k, N \in \mathbb{N}$ so that $f_n(t) < r - 1/k$ for all $n \geq N$. Hence,

$f^{-1}((-\infty, r)) = \cup_{k \in \mathbb{N}} \cup_{N \in \mathbb{N}} \cap_{n \geq N} f_n^{-1}((-\infty, r - 1/k)),$

and thus $f^{-1}((-\infty, r))$ is measurable. It then follows from Corollary 2.1.5 that $f$ is measurable.

If $E \in \mathcal{M}$, then the characteristic (or indicator) function on $E$ is the function $1_E : X \to \mathbb{C}$ given by $1_E(x) = 1$ if $x \in E$, and $1_E(x) = 0$ if $x \notin E$. Clearly, characteristic functions are measurable. A simple function is a finite complex linear combination of characteristic functions. Simple functions are also measurable by Proposition 2.1.6, and from the previous proposition we have that any pointwise limit of simple functions is then measurable.

Proposition 2.1.8. If f ∈ M(X), then f is a pointwise limit of simple functions. If f is bounded then f is a uniform limit of simple functions.

Proof. By considering the real and imaginary parts it is enough to consider the case $f \in M(X; \mathbb{R})$. For $N \in \mathbb{N}$, and $-N^2 \leq k \leq N^2$, we let $E_{N,k} = f^{-1}\left(\left[\frac{k}{N}, \frac{k+1}{N}\right)\right)$, and set

$f_N = \sum_{-N^2 \leq k \leq N^2} \frac{k}{N} 1_{E_{N,k}}.$

Then if $f(x) \in [-N, N)$ we have $|f(x) - f_N(x)| \leq 1/N$, and the result follows.
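The approximating functions in this proof can be written down directly. The following sketch is only an illustration under stated assumptions: `simple_approximation` is a hypothetical name, the function is real valued, and floating point arithmetic stands in for exact real numbers, so the bound is checked with a small tolerance.

```python
import math


def simple_approximation(f, N):
    """Return f_N = sum over -N^2 <= k <= N^2 of (k/N) * 1_{E_{N,k}},
    where E_{N,k} = f^{-1}([k/N, (k+1)/N)), for a real-valued f."""
    def f_N(x):
        k = math.floor(f(x) * N)       # the unique k with k/N <= f(x) < (k+1)/N
        if -N * N <= k <= N * N:
            return k / N
        return 0.0                     # x lies in none of the sets E_{N,k}
    return f_N


f = lambda x: x * x
N = 10
f_N = simple_approximation(f, N)

# Sample points where f takes values in [0, 2.25], a subset of [-N, N).
sample = [i / 100.0 for i in range(-150, 151)]
max_error = max(abs(f(x) - f_N(x)) for x in sample)
values_taken = {f_N(x) for x in sample}
```

On the sample, `max_error` stays within 1/N (up to rounding), and `values_taken` is a finite set, as it must be for a simple function.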

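2.1.1 Exercises

Exercise 2.1.9. Suppose we have an algebra $\mathcal{C} \subset 2^X$ with the property that if $E_n \in \mathcal{C}$ and $E_n \subset E_{n+1}$, for $n \geq 1$, then $\cup_{n=1}^\infty E_n \in \mathcal{C}$. Then $\mathcal{C}$ is a σ-algebra.

Exercise 2.1.10. Suppose $\mathcal{M} \subset 2^X$ is an algebra, such that if $\{E_n\}_{n=1}^\infty \subset \mathcal{M}$ are pairwise disjoint then $\cup_{n=1}^\infty E_n \in \mathcal{M}$. Then $\mathcal{M}$ is a σ-algebra.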

Exercise 2.1.11. Suppose (X, M) is a measurable space and f, g ∈ M(X; R). The sets

{x ∈ X | f(x) < g(x)} and {x ∈ X | f(x) = g(x)} are measurable.

Exercise 2.1.12. Suppose (X, M) is a measurable space and {fn}n∈N ⊂ M(X; R). Set

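$C = \{x \in X \mid \{f_n(x)\}_{n \in \mathbb{N}} \text{ converges}\}.$

Then $C$ is measurable.

Exercise 2.1.13. Suppose $(X, \mathcal{M}, \mu)$ is a measure space, $(Y, \mathcal{N})$ is a measurable space and $\theta : X \to Y$ is measurable. For each set $E \in \mathcal{N}$ we set $\theta_*\mu(E) = \mu(\theta^{-1}(E))$. Then $\theta_*\mu$ is a measure on $(Y, \mathcal{N})$ called the push forward measure of $\mu$ with respect to $\theta$.

Exercise 2.1.14. The Borel σ-algebra $\mathcal{B} \subset 2^{\mathbb{R}}$ has cardinality $|\mathbb{R}|$. Hint: To show $|\mathcal{B}| \leq |\mathbb{R}|$ let $\mathcal{B}_0$ denote the set of open intervals, and inductively define for each ordinal $\alpha < \omega_1$ the set $\mathcal{B}_{\alpha+1}$ to consist of all sets of the form $(\cup_{i=1}^\infty E_i) \cup (\cup_{j=1}^\infty F_j^c)$, where $E_i, F_j \in \mathcal{B}_\alpha$, and for each limit ordinal $\alpha < \omega_1$ set $\mathcal{B}_\alpha = \cup_{\beta < \alpha} \mathcal{B}_\beta$. Then show that $|\mathcal{B}_\alpha| \leq |\mathbb{R}|$ for each $\alpha < \omega_1$ and $\mathcal{B} = \cup_{\alpha < \omega_1} \mathcal{B}_\alpha$.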

2.2 Measures

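If $(X, \mathcal{M})$ is a measurable space, then a measure on $(X, \mathcal{M})$ is a set function $\mu : \mathcal{M} \to [0, \infty]$ that satisfies

1. $\mu(\emptyset) = 0$.

2. $\mu$ is countably additive: if $\{E_n\}_{n=1}^\infty$ is a sequence of disjoint sets in $\mathcal{M}$, then $\mu(\cup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \mu(E_n)$.

A measure space is a triple $(X, \mathcal{M}, \mu)$ where $(X, \mathcal{M})$ is a measurable space and $\mu$ is a measure on $(X, \mathcal{M})$. Here are some basic properties of measures:

Proposition 2.2.1. Let $(X, \mathcal{M}, \mu)$ be a measure space.

1. (Monotonicity) If $E, F \in \mathcal{M}$ and $E \subset F$, then $\mu(E) \leq \mu(F)$.

2. (Subadditivity) If $\{E_n\}_{n=1}^\infty \subset \mathcal{M}$, then $\mu(\cup_{n=1}^\infty E_n) \leq \sum_{n=1}^\infty \mu(E_n)$.

3. (Continuity from below) If $\{E_n\}_{n=1}^\infty \subset \mathcal{M}$ and $E_1 \subset E_2 \subset \cdots$ then $\mu(\cup_{n=1}^\infty E_n) = \lim_{n \to \infty} \mu(E_n)$.

4. (Continuity from above) If $\{E_n\}_{n=1}^\infty \subset \mathcal{M}$ and $E_1 \supset E_2 \supset \cdots$, with $\mu(E_1) < \infty$, then $\mu(\cap_{n=1}^\infty E_n) = \lim_{n \to \infty} \mu(E_n)$.

Proof. If $E, F \in \mathcal{M}$ with $E \subset F$, then we have $\mu(F) = \mu(E) + \mu(F \setminus E) \geq \mu(E)$, showing monotonicity.

If $\{E_n\}_{n=1}^\infty \subset \mathcal{M}$, then setting $F_n = E_n \setminus (\cup_{k<n} E_k)$, the sets $\{F_n\}_{n=1}^\infty$ are pairwise disjoint with $F_n \subset E_n$ and $\cup_{n=1}^\infty F_n = \cup_{n=1}^\infty E_n$. Hence, by monotonicity,

$\mu(\cup_{n=1}^\infty E_n) = \mu(\cup_{n=1}^\infty F_n) = \sum_{n=1}^\infty \mu(F_n) \leq \sum_{n=1}^\infty \mu(E_n).$

If $\{E_n\}_{n=1}^\infty \subset \mathcal{M}$ and $E_1 \subset E_2 \subset \cdots$, then setting $F_1 = E_1$, and $F_n = E_n \setminus E_{n-1}$ for $n > 1$, we have that $\{F_n\}_{n=1}^\infty$ are pairwise disjoint and hence

$\mu(\cup_{n=1}^\infty E_n) = \mu(\cup_{n=1}^\infty F_n) = \sum_{n=1}^\infty \mu(F_n) = \lim_{N \to \infty} \sum_{n=1}^N \mu(F_n) = \lim_{N \to \infty} \mu(\cup_{n=1}^N F_n) = \lim_{N \to \infty} \mu(E_N).$

If $\{E_n\}_{n=1}^\infty \subset \mathcal{M}$ and $E_1 \supset E_2 \supset \cdots$, then taking $F_n = E_1 \setminus E_n$, we have $F_1 \subset F_2 \subset \cdots$. By continuity from below we have $\mu(\cup_{n=1}^\infty F_n) = \lim_{n \to \infty} \mu(F_n)$. Since $\mu(E_1) = \mu(E_n) + \mu(F_n) = \mu(\cup_{n=1}^\infty F_n) + \mu(\cap_{n=1}^\infty E_n)$, and since $\mu(E_1) < \infty$ we have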

$\mu(\cap_{n=1}^\infty E_n) = \mu(E_1) - \mu(\cup_{n=1}^\infty F_n) = \lim_{n \to \infty} \left( \mu(E_1) - \mu(F_n) \right) = \lim_{n \to \infty} \mu(E_n).$

A measure $\mu$ is finite if $\mu(X) < \infty$. $\mu$ or $(X, \mathcal{M}, \mu)$ is σ-finite if $X = \cup_{n=1}^\infty E_n$ where $E_n \in \mathcal{M}$ with $\mu(E_n) < \infty$. $\mu$ or $(X, \mathcal{M}, \mu)$ is semifinite if for all $E \in \mathcal{M}$ with $\mu(E) > 0$ there exists $A \in \mathcal{M}$ with $A \subset E$ such that $0 < \mu(A) < \infty$.

Note that if $(X, \mathcal{M}, \mu)$ is σ-finite then it must also be semifinite. Indeed, if $X = \cup_{n=1}^\infty E_n$ with $E_n \in \mathcal{M}$, $\mu(E_n) < \infty$, and if $E \in \mathcal{M}$ with $\mu(E) > 0$, then $\mu(E \cap E_n) \leq \mu(E_n) < \infty$ for all $n$, and we have $0 < \mu(E \cap E_n)$ for at least one $n$ since $0 < \mu(E) \leq \sum_{n=1}^\infty \mu(E \cap E_n)$.

A measure $\mu$ or measure space $(X, \mathcal{M}, \mu)$ has the essential suprema property if for any family $\mathcal{E} \subset \mathcal{M}$ there exists $E \in \mathcal{M}$ such that $\mu(A \setminus E) = 0$ for all $A \in \mathcal{E}$, and if $E_0 \in \mathcal{M}$ is any other measurable set which satisfies $\mu(A \setminus E_0) = 0$ for all $A \in \mathcal{E}$ then we also have $\mu(E \setminus E_0) = 0$. A measure $\mu$ or measure space $(X, \mathcal{M}, \mu)$ is localizable if it is semifinite and has the essential suprema property.

Here are some examples of measure spaces:

1. If $X$ is a set and $\mathcal{M} = 2^X$, then the counting measure on $X$ is given by $\mu(E) = |E|$ if $E$ is finite, and $\mu(E) = \infty$ if $E$ is infinite. It is not hard to check that this space is always localizable, and it is σ-finite if and only if $X$ is countable.

2. If $X$ is a nonempty set and $\mathcal{M} = 2^X$, then the Dirac measure (or point mass) at $x_0 \in X$ is given by $\mu(E) = 1$ if $x_0 \in E$, and $\mu(E) = 0$ if $x_0 \notin E$.

3. If $(X, \mathcal{M}, \mu)$ is a measure space and $E \in \mathcal{M}$ then we may consider a new measure $\mu_E$ on $(X, \mathcal{M})$ given by $\mu_E(F) = \mu(F \cap E)$. This is the restriction measure on $E$.

4. Suppose $X$ is a set and $\mathcal{M}$ consists of all sets $E \subset X$ such that either $E$ or $E^c$ is countable. Then counting measure restricted to $\mathcal{M}$ gives a measure. This measure is always semifinite, and it satisfies the essential suprema property if and only if $X$ is countable.

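5. Suppose $(X, \mathcal{M})$ is a measurable space and $\mathcal{N} \subset \mathcal{M}$ is a non-empty collection of subsets such that $\mathcal{N}$ is closed under countable unions, and whenever we have $E \in \mathcal{N}$ and $F \in \mathcal{M}$ with $F \subset E$ then $F \in \mathcal{N}$. We define $\mu_\infty$ on $\mathcal{M}$ by setting $\mu_\infty(E) = 0$ if $E \in \mathcal{N}$, and $\mu_\infty(E) = \infty$ if $E \notin \mathcal{N}$. Then $\mu_\infty$ gives a measure on $\mathcal{M}$. This will be semifinite if and only if $\mathcal{N} = \mathcal{M}$.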
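The first three examples can be tried out on finite sets. This is a minimal sketch rather than anything from the notes: sets are modelled as Python sets, infinite sets are not modelled at all, and the names `counting_measure`, `dirac_measure` and `restriction` are hypothetical.

```python
def counting_measure(E):
    """Counting measure of a finite set E (the infinite case is not modelled)."""
    return len(E)


def dirac_measure(x0):
    """Point mass at x0: returns the set function E -> 1 if x0 in E else 0."""
    return lambda E: 1 if x0 in E else 0


def restriction(mu, E):
    """Restriction measure mu_E(F) = mu(F intersect E)."""
    return lambda F: mu(F & E)


A = {1, 2}
B = {3}                      # disjoint from A
delta_2 = dirac_measure(2)
count_on_13 = restriction(counting_measure, {1, 3})

# Finite additivity on the disjoint pair A, B for each of the three measures.
additive_counting = counting_measure(A | B) == counting_measure(A) + counting_measure(B)
additive_dirac = delta_2(A | B) == delta_2(A) + delta_2(B)
additive_restricted = count_on_13(A | B) == count_on_13(A) + count_on_13(B)
```

Only finite additivity on one disjoint pair is checked here; countable additivity is automatic on a finite ground set.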

Given a measure space $(X, \mathcal{M}, \mu)$, we say a set $E \subset X$ is σ-finite if $E = \cup_{n=1}^\infty E_n$ where $E_n \in \mathcal{M}$ with $\mu(E_n) < \infty$. We say that $E$ is a null set if $\mu(E) = 0$. We say that $E$ is conull if $E^c$ is null. A property is said to hold almost everywhere (or µ-almost everywhere) if it holds on a conull set.

The collection of null sets is non-empty, and closed under countable unions and taking measurable subsets; therefore, given any measure space $(X, \mathcal{M}, \mu)$ we may consider the corresponding measure $\mu_\infty$ as described above. Then $\mu_\infty$ will satisfy the essential suprema property if and only if $\mu$ does.

In practice most interesting measure spaces one encounters are localizable, in part because these are the spaces in which a nice integration theory can be developed. The latter two examples above show that there do exist more general measure spaces; however, we shall view these spaces as pathological.

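Lemma 2.2.2. Suppose $(X, \mathcal{M}, \mu)$ is a measure space and $\{F_n\}_{n=1}^\infty \subset \mathcal{M}$ is a countable partition of $X$ so that $\mu_{F_n}$ has the essential suprema property for each $n \geq 1$, then $\mu$ has the essential suprema property.

Proof. Suppose $\mathcal{E} \subset \mathcal{M}$, and for each $n$ take $E_n \in \mathcal{M}$ so that $\mu(F_n \cap (A \setminus E_n)) = 0$ for all $A \in \mathcal{E}$, and if $E_0 \in \mathcal{M}$ is such that $\mu(F_n \cap (A \setminus E_0)) = 0$ for all $A \in \mathcal{E}$ then we have $\mu(F_n \cap (E_n \setminus E_0)) = 0$.

Set $E = \cup_{n=1}^\infty (E_n \cap F_n)$. If $A \in \mathcal{E}$ then we have

$\mu(A \setminus E) = \sum_{n=1}^\infty \mu(F_n \cap (A \setminus E)) = \sum_{n=1}^\infty \mu(F_n \cap (A \setminus E_n)) = 0.$

Also, if $E_0 \in \mathcal{M}$ is such that $\mu(A \setminus E_0) = 0$ for all $A \in \mathcal{E}$ then we have

$\mu(E \setminus E_0) = \sum_{n=1}^\infty \mu(F_n \cap (E \setminus E_0)) = \sum_{n=1}^\infty \mu(F_n \cap (E_n \setminus E_0)) = 0.$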

Proposition 2.2.3. Suppose (X, M, µ) is a σ-ﬁnite measure space, then (X, M, µ) is localizable.

Proof. We already noted above that $\mu$ is semifinite, thus we only need to show that it satisfies the essential suprema property. By the previous lemma it is enough to consider the case when $\mu$ is finite. Suppose $\mathcal{E} \subset \mathcal{M}$, and let

$\mathcal{E}^+ = \{E \in \mathcal{M} \mid \mu(A \setminus E) = 0 \text{ for all } A \in \mathcal{E}\}.$

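Note that $X \in \mathcal{E}^+$ and $\mathcal{E}^+$ is closed under countable intersection. Let $a = \inf\{\mu(E) \mid E \in \mathcal{E}^+\}$, then there exists a sequence $\{E_k\}_{k=1}^\infty \subset \mathcal{E}^+$ so that $\mu(E_k) \to a$. We set $E = \cap_{k=1}^\infty E_k$ so that $E \in \mathcal{E}^+$ and $\mu(E) \leq \inf_{k} \mu(E_k) = a \leq \mu(E)$.

If $E_0 \in \mathcal{E}^+$, then $E_0 \cap E \in \mathcal{E}^+$ and hence $\infty > \mu(E_0 \cap E) \geq a = \mu(E)$. We then have $0 \geq \mu(E) - \mu(E_0 \cap E) = \mu(E \setminus E_0)$, so that $\mu(E \setminus E_0) = 0$.

Proposition 2.2.4. If $(X, \mathcal{M}, \mu)$ is a semifinite measure space, then for all $E \in \mathcal{M}$ we have $\mu(E) = \sup\{\mu(A) \mid A \subset E, A \in \mathcal{M}, \text{ and } \mu(A) < \infty\}$.

Proof. If $\mu(E) < \infty$ then this is obvious, therefore we may assume that $\mu(E) = \infty$. We let $a = \sup\{\mu(A) \mid A \subset E, A \in \mathcal{M}, \text{ and } \mu(A) < \infty\}$, and take $A_n \subset E$, $A_n \in \mathcal{M}$ with $\mu(A_n) < \infty$ such that $\mu(A_n) \to a$. We set $B_k = \cup_{n=1}^k A_n$ and $B = \cup_{n=1}^\infty A_n$. Then $\mu(B_k) \to \mu(B)$, and $B_k \subset E$ with $\mu(B_k) < \infty$, hence $a \geq \mu(B_k) \geq \mu(A_k) \to a$, so that $\mu(B) = a$. If we had $a < \infty$ then $\mu(E \setminus B) = \infty$ and so by semifiniteness there exists $A_0 \in \mathcal{M}$, $A_0 \subset E \setminus B$ so that $0 < \mu(A_0) < \infty$. We then have $A_n \cup A_0 \subset E$ and $\mu(A_n \cup A_0) = \mu(A_n) + \mu(A_0)$, therefore

$a \geq \mu(A_n \cup A_0) = \mu(A_n) + \mu(A_0) \to a + \mu(A_0) > a,$

which cannot happen. Thus, we must have $a = \infty = \mu(E)$.

Proposition 2.2.5. Suppose $(X, \mathcal{M}, \mu)$ has the essential suprema property, and $\mathcal{F} \subset M(X, [0, \infty])$. Then there exists $h \in M(X, [0, \infty])$ so that $\mu(\{x \in X \mid h(x) < f(x)\}) = 0$ for each $f \in \mathcal{F}$, and if $\tilde{h} \in M(X, [0, \infty])$ is any other function with this property then we have $\mu(\{x \in X \mid \tilde{h}(x) < h(x)\}) = 0$.

Proof. For each $k \geq 0$, $n \geq 1$ consider the collection $\mathcal{E}_{k,n} = \{f^{-1}([k/n, \infty]) \mid f \in \mathcal{F}\}$, and let $E_{k,n} \in \mathcal{M}$ be such that $\mu(A \setminus E_{k,n}) = 0$ for all $A \in \mathcal{E}_{k,n}$, and if $E_0$ is another measurable set with this property then we have $\mu(E_{k,n} \setminus E_0) = 0$. It then follows that $\mu(E_{k,n} \setminus E_{k',n'}) = 0$ whenever $k/n \geq k'/n'$. Let $h(x) = \sup_{k \geq 0, n \geq 1} \{k/n \mid x \in E_{k,n}\}$. Then for each $f \in \mathcal{F}$ we have

$\mu(\{x \in X \mid h(x) < f(x)\}) = \mu(\cup_{k \geq 0, n \geq 1} \{x \in X \mid h(x) \leq k/n < f(x)\}) = 0.$

Moreover, if $\tilde{h}$ is another measurable function with this property and if for each $k \geq 0$, $n \geq 1$ we set $\tilde{E}_{k,n} = \tilde{h}^{-1}([k/n, \infty])$, then we have $\mu(f^{-1}([k/n, \infty]) \setminus \tilde{E}_{k,n}) = 0$ for every $f \in \mathcal{F}$ and hence $\mu(E_{k,n} \setminus \tilde{E}_{k,n}) = 0$. It then follows that $\mu(\{x \in X \mid \tilde{h}(x) < h(x)\}) = 0$.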

We call a function h as in the previous proposition an essential supremum of F.

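2.2.1 Outer measures

An outer measure is a set function $\mu^* : 2^X \to [0, \infty]$ that satisfies

1. $\mu^*(\emptyset) = 0$.

2. (Monotonicity) $\mu^*(A) \leq \mu^*(B)$ if $A \subset B$.

3. (Subadditivity) $\mu^*(\cup_{n \in \mathbb{N}} A_n) \leq \sum_{n=1}^\infty \mu^*(A_n)$.

Proposition 2.2.6. Suppose $\mathcal{S} \subset 2^X$ and $\mu_0 : \mathcal{S} \to [0, \infty]$ is such that $\emptyset \in \mathcal{S}$, and $\mu_0(\emptyset) = 0$. For $A \subset X$ define

$\mu^*(A) = \inf \left\{ \sum_{n=1}^\infty \mu_0(E_n) \mid E_n \in \mathcal{S} \text{ and } A \subset \cup_{n=1}^\infty E_n \right\}.$

Then $\mu^*$ is an outer measure.

Proof. Since $\emptyset$ covers itself we clearly have $\mu^*(\emptyset) = 0$. If $A \subset B \subset X$, then as any cover of $B$ also covers $A$ it follows that the set over which we are taking the infimum for $B$ is contained in the corresponding set for $A$. Therefore $\mu^*(A) \leq \mu^*(B)$.

Fix $\varepsilon > 0$. If $\{A_n\}_{n \in \mathbb{N}} \subset 2^X$, then for each $n \in \mathbb{N}$ there exists $\{E_j^n\}_{j \in \mathbb{N}} \subset \mathcal{S}$ so that $A_n \subset \cup_{j \in \mathbb{N}} E_j^n$ and $\sum_{j \in \mathbb{N}} \mu_0(E_j^n) < \mu^*(A_n) + \varepsilon 2^{-n}$ (if $\mu^*(A_n) = \infty$ for some $n$ there is nothing to prove). We then have $\cup_{n \in \mathbb{N}} A_n \subset \cup_{n, j \in \mathbb{N}} E_j^n$, and so

$\mu^*(\cup_{n \in \mathbb{N}} A_n) \leq \sum_{n, j \in \mathbb{N}} \mu_0(E_j^n) < \varepsilon + \sum_{n \in \mathbb{N}} \mu^*(A_n).$

As $\varepsilon > 0$ was arbitrary it then follows that

$\mu^*(\cup_{n \in \mathbb{N}} A_n) \leq \sum_{n \in \mathbb{N}} \mu^*(A_n).$

The outer measure $\mu^*$ in the previous proposition is called the outer measure associated to $\mu_0$.

If $\mu^*$ is an outer measure on $X$, then a set $A \subset X$ is $\mu^*$-measurable if $\mu^*(S) = \mu^*(S \cap A) + \mu^*(S \cap A^c)$ for all $S \subset X$.

Theorem 2.2.7 (Carathéodory). Suppose $\mu^*$ is an outer measure on $X$, then the collection $\mathcal{M}$ of all $\mu^*$-measurable sets is a σ-algebra, and the restriction of $\mu^*$ to $\mathcal{M}$ is a measure.

Proof. Since $\mu^*(\emptyset) = 0$ we have $\mu^*(S) = \mu^*(\emptyset) + \mu^*(S)$ for each $S \subset X$, hence $\emptyset \in \mathcal{M}$. Also, note that $\mathcal{M}$ is clearly closed under taking complements.

If $A, B \in \mathcal{M}$, then for each $S \subset X$ we have

$\mu^*(S) = \mu^*(S \cap A) + \mu^*(S \cap A^c)$
$= \mu^*(S \cap A \cap B) + \mu^*(S \cap A \cap B^c) + \mu^*(S \cap A^c \cap B) + \mu^*(S \cap A^c \cap B^c)$
$\geq \mu^*(S \cap (A \cup B)) + \mu^*(S \cap (A \cup B)^c) \geq \mu^*(S).$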

We therefore have that A ∪ B ∈ M and if A and B are disjoint then taking S = A ∪ B we have

µ∗(A ∪ B) = µ∗((A ∪ B) ∩ A) + µ∗((A ∪ B) ∩ Ac) = µ∗(A) + µ∗(B).

It then follows easily that M is closed under unions of finite families, and hence M is an algebra. To show that M is a σ-algebra it is then enough to show that M is closed under taking countable unions of pairwise disjoint families.

If $\{A_n\}_{n\in\mathbb N} \subset \mathcal M$ is a sequence of pairwise disjoint sets, then set $B_n = \cup_{k=1}^n A_k$ and $B = \cup_{k=1}^\infty A_k$. Since $A_n \in \mathcal M$, if S ⊂ X and n > 1 we have

\begin{align*}
\mu^*(S \cap B_n) &= \mu^*(S \cap B_n \cap A_n) + \mu^*(S \cap B_n \cap A_n^c) \\
&= \mu^*(S \cap A_n) + \mu^*(S \cap B_{n-1}).
\end{align*}

By induction it then follows easily that

\[ \mu^*(S \cap B_n) = \sum_{k=1}^n \mu^*(S \cap A_k). \]

Hence, since $B_n \in \mathcal M$ and $B^c \subset B_n^c$,

\begin{align*}
\mu^*(S) &= \mu^*(S \cap B_n) + \mu^*(S \cap B_n^c) \\
&\ge \sum_{k=1}^n \mu^*(S \cap A_k) + \mu^*(S \cap B^c).
\end{align*}

Taking n → ∞ then gives

\begin{align*}
\mu^*(S) &\ge \sum_{k=1}^\infty \mu^*(S \cap A_k) + \mu^*(S \cap B^c) \\
&\ge \mu^*(S \cap B) + \mu^*(S \cap B^c) \ge \mu^*(S). \tag{2.1}
\end{align*}

Thus, B ∈ M, showing that M is a σ-algebra. Taking S = B in (2.1) shows

\[ \mu^*(B) = \sum_{k=1}^\infty \mu^*(A_k). \]

Hence µ∗ defines a measure on M.
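Carathéodory's condition can be checked by brute force on a finite set. The following is a minimal sketch, not part of the notes: the three-point set `X`, the covering family and the weights `mu0` are made-up data, chosen only so that some sets fail to be measurable.

```python
from itertools import combinations

X = frozenset({0, 1, 2})

# Hypothetical family S with weights mu0 (illustrative data only).
mu0 = {
    frozenset(): 0,
    frozenset({0}): 1,
    frozenset({1, 2}): 1,
    X: 3,
}

def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def outer(A):
    """mu*(A): the cheapest cover of A by members of S.

    S is finite and contains the empty set, so finite subfamilies
    without repetition already realise the infimum.
    """
    family = list(mu0)
    best = float("inf")
    for r in range(len(family) + 1):
        for cover in combinations(family, r):
            if A <= frozenset().union(*cover):
                best = min(best, sum(mu0[E] for E in cover))
    return best

def is_measurable(A):
    """Caratheodory's condition, tested against every S in 2^X."""
    return all(outer(S) == outer(S & A) + outer(S - A) for S in subsets(X))

measurable = {A for A in subsets(X) if is_measurable(A)}
```

In this toy example the measurable sets are ∅, {0}, {1, 2} and X, which is a σ-algebra on which `outer` is additive. Note also that `outer(X)` is 2 rather than `mu0[X] = 3`: the outer measure need not agree with µ0 when µ0 is not a premeasure.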

2.2.2 Carathéodory’s extension theorem

If $\mathcal A \subset 2^X$ is an algebra, a function $\mu_0 : \mathcal A \to [0, \infty]$ is a premeasure if

1. $\mu_0(\emptyset) = 0$.

2. Whenever $\{E_n\}_{n\in\mathbb N} \subset \mathcal A$ are disjoint such that $\cup_{n=1}^\infty E_n \in \mathcal A$, then we have $\mu_0(\cup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \mu_0(E_n)$.

Theorem 2.2.8 (Carathéodory’s extension theorem). Suppose $\mathcal A \subset 2^X$ is an algebra, $\mu_0 : \mathcal A \to [0, \infty]$ is a premeasure, and µ∗ is the associated outer measure. Then every set $E \in \mathcal A$ is µ∗-measurable and we have $\mu^*(E) = \mu_0(E)$. Moreover, if M denotes the σ-algebra generated by $\mathcal A$, and if µ∗ defines a semifinite measure on M, then µ∗ is the unique measure on M which extends µ0.

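Proof. If $E, A_n \in \mathcal A$ for n ≥ 1 with $E \subset \cup_{n=1}^\infty A_n$, then setting $B_n = E \cap (A_n \setminus (\cup_{k=1}^{n-1} A_k))$ we have that $B_n \subset A_n$, and $\{B_n\}_{n=1}^\infty$ is a family of pairwise disjoint sets in $\mathcal A$ such that $E = \cup_{n=1}^\infty B_n$. We therefore have $\mu_0(E) = \sum_{n=1}^\infty \mu_0(B_n) \le \sum_{n=1}^\infty \mu_0(A_n)$. Thus, it follows that $\mu_0(E) \le \mu^*(E) \le \mu_0(E)$. Hence, µ∗ is an extension of µ0.

If $A \in \mathcal A$, S ⊂ X, and ε > 0, then we may take $\{A_n\}_{n=1}^\infty \subset \mathcal A$ so that $S \subset \cup_{n=1}^\infty A_n$ and $\sum_{n=1}^\infty \mu_0(A_n) \le \mu^*(S) + \varepsilon$. Since µ0 is finitely additive on $\mathcal A$ it then follows that

\begin{align*}
\mu^*(S) + \varepsilon &\ge \sum_{n=1}^\infty \big(\mu_0(A_n \cap A) + \mu_0(A_n \cap A^c)\big) \\
&\ge \mu^*(S \cap A) + \mu^*(S \cap A^c) \ge \mu^*(S).
\end{align*}

As ε > 0 was arbitrary we then have that A is µ∗-measurable.

Suppose now that M is the σ-algebra generated by $\mathcal A$, and let ν be another measure on (X, M) so that $\nu(A) = \mu_0(A)$ for all $A \in \mathcal A$. Then for E ∈ M, if $E \subset \cup_{n=1}^\infty A_n$ with $A_n \in \mathcal A$ we have

\[ \nu(E) \le \sum_{n=1}^\infty \nu(A_n) = \sum_{n=1}^\infty \mu_0(A_n), \]

and it follows that ν(E) ≤ µ∗(E).

If we have E ∈ M such that µ∗(E) < ∞, and if ε > 0, then there exist $\{A_n\}_{n=1}^\infty \subset \mathcal A$ so that $E \subset \cup_{n=1}^\infty A_n$ and $\mu^*(E) + \varepsilon > \mu^*(\cup_{n=1}^\infty A_n)$, and hence setting $A = \cup_{n=1}^\infty A_n$ we have µ∗(A \ E) < ε. Therefore,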

µ∗(E) ≤ µ∗(A) = ν(A) = ν(E) + ν(A \ E) ≤ ν(E) + µ∗(A \ E) < ν(E) + ε.

Since ε > 0 was arbitrary we then have µ∗(E) ≤ ν(E) whenever µ∗(E) < ∞. If µ∗ gives a semiﬁnite measure on M then by Proposition 2.2.4 it follows that for all E ∈ M we have

ν(E) ≥ sup{ν(A) | A ⊂ E,A ∈ M, and ν(A) < ∞} ≥ sup{µ∗(A) | A ⊂ E,A ∈ M, and µ∗(A) < ∞} = µ∗(E),

and hence in this case we have ν(E) = µ∗(E) for all E ∈ M.

2.2.3 Exercises

A measure space (X, M, µ) is complete if every subset of a null set is measurable (and hence also null).

Exercise 2.2.9. Suppose (X, M, µ) is a measure space and let $\mathcal N = \{E \in \mathcal M \mid \mu(E) = 0\}$ be the space of null sets. We let $\overline{\mathcal M} = \{E \cup F \mid E \in \mathcal M \text{ and } F \subset N \text{ for some } N \in \mathcal N\}$. Then $\overline{\mathcal M}$ is a σ-algebra and there is a unique extension $\overline{\mu}$ of µ to a complete measure on $\overline{\mathcal M}$.

The measure space $(X, \overline{\mathcal M}, \overline{\mu})$ from the previous exercise is called the completion of (X, M, µ).

Exercise 2.2.10. If µ∗ is an outer measure on X, M is the collection of all µ∗-measurable sets, and µ is the restriction of µ∗ to M, then (X, M, µ) is a complete measure space.

Let (X, M, µ) be a measure space. A function f ∈ M(X) is essentially bounded if there exists M ∈ [0, ∞) such that µ({x ∈ X | |f(x)| > M}) = 0. We let $\mathcal L^\infty(X, \mu)$ denote the space of all (complex valued) essentially bounded functions, and for $f \in \mathcal L^\infty(X, \mu)$ we set

\[ \|f\| = \inf\{M \in [0, \infty) \mid \mu(\{x \in X \mid |f(x)| > M\}) = 0\}. \]

For clarity, we sometimes may write $\|f\|_\infty$ instead of $\|f\|$.
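On a finite set with point masses the essential supremum norm is simply the largest value of |f| over the points of positive mass. The sketch below illustrates this under that assumption; the space `mass` and the functions `f` and `g` are hypothetical data.

```python
# Hypothetical finite measure space: point masses on three points,
# where the point "b" is a null set.
mass = {"a": 1.0, "b": 0.0, "c": 2.0}

def ess_sup_norm(f):
    """inf{M >= 0 : mu(|f| > M) = 0}.

    For point masses this is the maximum of |f| over the points of
    positive mass (and 0 if there are no such points).
    """
    return max((abs(f[x]) for x in mass if mass[x] > 0), default=0)

f = {"a": 1, "b": 100, "c": -3}
g = {"a": 1, "b": -7, "c": -3}          # differs from f only on the null set
diff = {x: f[x] - g[x] for x in mass}
```

Here `ess_sup_norm(f)` ignores the large value at the null point `"b"`, and `ess_sup_norm(diff)` vanishes although `f` and `g` are different functions, so the norm is only a seminorm before passing to almost-everywhere classes.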

Exercise 2.2.11. $\mathcal L^\infty(X, \mu)$ is an algebra and $\|\cdot\|_\infty$ gives a seminorm on $\mathcal L^\infty(X, \mu)$.

We let $L^\infty(X, \mu)$ be the normed algebra obtained from $\mathcal L^\infty(X, \mu)$ by identifying two functions f and g when $\|f - g\|_\infty = 0$, i.e., when f = g almost everywhere.

Exercise 2.2.12. If $\{f_n\}_{n\in\mathbb N} \subset L^\infty(X, \mu)$ is Cauchy with respect to $\|\cdot\|_\infty$, then there exists $f \in L^\infty(X, \mu)$ such that $\|f - f_n\|_\infty \to 0$, hence $L^\infty(X, \mu)$ is a Banach algebra.

Exercise 2.2.13. Let (X, M, µ) be a finite measure space and for E, F ∈ M set ρ(E, F) = µ(E∆F). Then ρ gives a semimetric on M.

Exercise 2.2.14. Let (X, M, µ) be a finite measure space and ρ defined as above. Then ρ is a complete semimetric. Hint: If $\{E_n\}_{n\in\mathbb N}$ is Cauchy, by passing to a subsequence we may suppose $\mu(E_n \Delta E_m) \le \max\{2^{-n}, 2^{-m}\}$, and in this case setting $F_m = \cup_{k\ge m} E_k$, we have that $\{F_m\}_{m\in\mathbb N}$ is again Cauchy, and $\mu(F_m \Delta E_n) < 2^{-n+4}$ for m > n.
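The statement of Exercise 2.2.13 can be sanity-checked numerically before proving it. The following sketch assumes a hypothetical four-point space with one null point and verifies the triangle inequality for ρ over all triples of subsets; it is a check on one example, not a proof.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical point masses; the point 1 is a null set.
mass = {0: Fraction(1, 2), 1: Fraction(0), 2: Fraction(3, 2), 3: Fraction(2)}
points = sorted(mass)
sets = [frozenset(c) for r in range(len(points) + 1)
        for c in combinations(points, r)]

def mu(E):
    """Measure of a subset: the sum of its point masses."""
    return sum((mass[x] for x in E), Fraction(0))

def rho(E, F):
    """rho(E, F) = mu(E symmetric-difference F)."""
    return mu(E ^ F)

# Triangle inequality over all 16^3 triples of subsets.
triangle_ok = all(rho(E, G) <= rho(E, F) + rho(F, G)
                  for E in sets for F in sets for G in sets)
```

Since `rho(frozenset({1}), frozenset())` is zero for two distinct sets, ρ is a semimetric and not a metric in this example.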

2.3 Borel measures on R

By a Borel measure on a metric space, we mean a measure on the Borel σ-algebra.

Lemma 2.3.1. Let F : R → R be increasing and right continuous. Let $\mathcal I$ be the collection of intervals of the form (a, b], for −∞ ≤ a < b < ∞, or of the form (a, ∞) for −∞ ≤ a < ∞, and set $\mu_0((a, b]) = F(b) - F(a)$ if b < ∞, and $\mu_0((a, \infty)) = F(\infty) - F(a)$, where $F(\pm\infty) = \lim_{t\to\pm\infty} F(t)$. We let $\mathcal A$ denote the algebra consisting of finite unions of intervals in $\mathcal I$. Then µ0 extends to a premeasure on $\mathcal A$.

Proof. Note that if $(a, b] = \cup_{k=1}^n (a_k, b_k]$ is a disjoint union, then after rearranging we may assume that $a = a_1 < b_1 = a_2 < b_2 = \cdots < b_{n-1} = a_n < b_n = b$, and we have that

\[ \mu_0((a, b]) = F(b) - F(a) = \sum_{k=1}^n \big(F(b_k) - F(a_k)\big) = \sum_{k=1}^n \mu_0((a_k, b_k]). \]

We similarly have that if $I_1, \ldots, I_n \in \mathcal I$ are disjoint and $\cup_{k=1}^n I_k = (a, \infty)$, then $\mu_0((a, \infty)) = \sum_{k=1}^n \mu_0(I_k)$. From this it then follows easily that we obtain a well defined finitely additive set function $\mu_0 : \mathcal A \to [0, \infty]$ by setting $\mu_0(\cup_{k=1}^n I_k) = \sum_{k=1}^n \mu_0(I_k)$ for pairwise disjoint sets $I_1, \ldots, I_n \in \mathcal I$.

We will now show that µ0 is a premeasure on $\mathcal A$. Suppose that $\{I_j\}_{j=1}^\infty$ is a pairwise disjoint sequence of intervals in $\mathcal I$, such that $\cup_{j=1}^\infty I_j = I \in \mathcal I$. Then we have

\[ \mu_0(I) = \mu_0(\cup_{j=1}^n I_j) + \mu_0(I \setminus \cup_{j=1}^n I_j) \ge \mu_0(\cup_{j=1}^n I_j) = \sum_{j=1}^n \mu_0(I_j). \]

Taking a limit as n → ∞ we see that $\mu_0(I) \ge \sum_{j=1}^\infty \mu_0(I_j)$.

For the reverse inequality we first assume that I = (a, b], where a and b are finite. Fix ε > 0. As F is right continuous there exists δ > 0 so that F(a + δ) − F(a) < ε. Similarly, if $I_j = (a_j, b_j]$ then there exist $\delta_j > 0$ so that $F(b_j + \delta_j) - F(b_j) < \varepsilon 2^{-j}$. Since the open intervals $(a_j, b_j + \delta_j)$ cover the compact set [a + δ, b] there exist n ∈ N and $j_1, \ldots, j_n$ so that $[a + \delta, b] \subset \cup_{i=1}^n (a_{j_i}, b_{j_i} + \delta_{j_i})$. We may further assume that no proper subcollection also covers [a + δ, b] and by reordering and reindexing $j_1, \ldots, j_n$ as 1, . . . , n we may then assume that

\[ a_1 < a + \delta \le a_2 < b_1 + \delta_1 \le a_3 < \cdots \le a_n < b_{n-1} + \delta_{n-1} \le b < b_n + \delta_n. \]

We then have

\begin{align*}
\mu_0(I) &< F(b) - F(a + \delta) + \varepsilon \\
&\le F(b_n + \delta_n) - F(a_1) + \varepsilon \\
&= F(b_n + \delta_n) - F(a_n) + \sum_{j=1}^{n-1} \big(F(a_{j+1}) - F(a_j)\big) + \varepsilon \\
&\le F(b_n + \delta_n) - F(a_n) + \sum_{j=1}^{n-1} \big(F(b_j + \delta_j) - F(a_j)\big) + \varepsilon \\
&\le \sum_{j=1}^{n} \big(F(b_j) - F(a_j)\big) + 2\varepsilon \le \sum_{j=1}^{\infty} \mu_0(I_j) + 2\varepsilon.
\end{align*}

As ε > 0 was arbitrary it then follows that $\mu_0(I) \le \sum_{j=1}^\infty \mu_0(I_j)$.

For the case of a general interval $I \in \mathcal I$, we can easily check that $\mu_0(I) = \lim_{a\to-\infty, b\to\infty} \mu_0(I \cap (a, b])$ and $\sum_{j=1}^\infty \mu_0(I_j) = \lim_{a\to-\infty, b\to\infty} \sum_{j=1}^\infty \mu_0(I_j \cap (a, b])$. Using the case −∞ < a < b < ∞ above and taking limits then shows that $\mu_0(I) = \sum_{j=1}^\infty \mu_0(I_j)$.

If we now consider general sets $E, E_j \in \mathcal A$, such that $\{E_j\}_{j=1}^\infty$ is pairwise disjoint and $E = \cup_{j=1}^\infty E_j$, then writing each set as a finite union of disjoint intervals, and using finite additivity of µ0, it then follows that $\mu_0(E) = \sum_{j=1}^\infty \mu_0(E_j)$. Hence, µ0 is a premeasure on $\mathcal A$.

Theorem 2.3.2. Let F : R → R be increasing and right continuous. Then there is a unique Borel measure $\mu_F$ on R such that $\mu_F((a, b]) = F(b) - F(a)$ for all a, b ∈ R with a < b. Conversely, if µ is a Borel measure on R which is finite on all bounded Borel sets and we define

\[ F(x) = \begin{cases} \mu((0, x]) & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ -\mu((x, 0]) & \text{if } x < 0, \end{cases} \]

then F is increasing, right continuous, and $\mu = \mu_F$.

Proof. We let $\mathcal A$ denote the algebra in Lemma 2.3.1, and note that the σ-algebra generated by $\mathcal A$ is the Borel σ-algebra. By Lemma 2.3.1 there exists a premeasure µ0 on $\mathcal A$ so that $\mu_0((a, b]) = F(b) - F(a)$ for a < b. By Carathéodory’s extension theorem there then exists a Borel measure $\mu_F$ on R such that $\mu_F((a, b]) = F(b) - F(a)$. Moreover, since $\mu_F$ is σ-finite it then also follows from Carathéodory’s extension theorem that $\mu_F$ is the unique Borel measure with this property.

If µ is a Borel measure on R which is finite on all bounded Borel sets and if we define F as above, then it follows from monotonicity that F is increasing and from continuity from above/below that F is right continuous. Moreover, we see easily that for a < b we have µ((a, b]) = F(b) − F(a). By uniqueness of the measure $\mu_F$ it then follows that $\mu = \mu_F$.
Given F : R → R increasing and right continuous, the completion of the corresponding measure µF is called the Lebesgue-Stieltjes measure associated to F , and F is called a distribution function associated to µF . It’s easy to check that two distribution functions associated to the same measure must differ by a constant.
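To make the correspondence between F and µF concrete, here is a small sketch for one hypothetical distribution function, F(x) = x plus a unit jump at 0. The names `F`, `F_shifted` and `mu_F` are illustrative only, and only half-open intervals are handled.

```python
from fractions import Fraction

def F(x):
    """Hypothetical distribution function: slope 1 with a unit jump at 0.

    The jump is included at x = 0, so F is right continuous.
    """
    x = Fraction(x)
    return x + (1 if x >= 0 else 0)

def F_shifted(x):
    """The same function shifted by a constant."""
    return F(x) + 5

def mu_F(a, b, dist=F):
    """Measure of the half-open interval (a, b], for a <= b."""
    return dist(b) - dist(a)
```

For instance `mu_F(-1, 0)` equals 2, the length of the interval plus the atom at 0, while `mu_F(Fraction(-1, n), 0)` equals 1 + 1/n and decreases to the mass of the single point 0. Passing `dist=F_shifted` gives the same values, in line with the remark that distribution functions of the same measure differ by a constant.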

2.3.1 Lebesgue measure on R

The Lebesgue-Stieltjes measure corresponding to the function F(x) = x is called Lebesgue measure on R and is usually denoted by λ. A set E ⊂ R is Lebesgue measurable if it is λ∗-measurable, where λ∗ is the outer measure corresponding to λ. Note that by Exercise 2.2.10 if E ⊂ R satisfies λ∗(E) = 0, then E is Lebesgue measurable.

Theorem 2.3.3. If E ⊂ R is Borel, then so is E + s and rE for all s, r ∈ R. Moreover, we have λ(E + s) = λ(E) and λ(rE) = |r|λ(E).

Proof. Since addition and multiplication are continuous it follows from Corollary 2.1.3 that E + s and rE are Borel. By uniqueness of the Lebesgue-Stieltjes measure, to show that λ(E + s) = λ(E) and λ(rE) = |r|λ(E) it suffices to show these equalities when E is a half open interval, in which case this is obvious.

Note that every point x ∈ R has Lebesgue measure zero. It follows that every countable set has Lebesgue measure zero. There are also uncountable sets with Lebesgue measure zero. The Cantor set C is the set of all x ∈ [0, 1] that have a base-3 expansion $x = \sum_{n=1}^\infty a_n 3^{-n}$ with $a_n \ne 1$ for all n (note that such an expansion, if it exists, must be unique). We may obtain C by starting with the unit interval [0, 1] and removing the open middle third $(\frac13, \frac23)$, then removing the open middle thirds $(\frac19, \frac29)$ and $(\frac79, \frac89)$ of the two remaining intervals, etc.

Proposition 2.3.4. The Cantor set C is compact, contains no non-trivial open interval, and has no isolated points. Moreover, C has cardinality |R| and satisfies λ(C) = 0.

Proof. C is obtained by removing open intervals, thus C is a decreasing intersection of closed subsets of [0, 1], and so C itself is a closed subset of [0, 1] which must then be compact. If $x = \sum_{n=1}^\infty a_n 3^{-n} \in C$ with $a_n \in \{0, 2\}$ for all n, then for each n ∈ N consider $x_n \in C$ which has the same expansion as x except for the nth coefficient, which is either 0 if $a_n = 2$, or 2 if $a_n = 0$. Then $\{x_n\}_{n\in\mathbb N} \subset C$ is a sequence of points distinct from x such that $x_n \to x$. Hence, C has no isolated points.

By considering the lengths of the intervals removed from [0, 1] we have

\[ \lambda(C) = 1 - \sum_{n=1}^\infty \frac{2^{n-1}}{3^n} = 0. \]

In particular, C contains no non-trivial open interval.

If $x = \sum_{n=1}^\infty a_n 3^{-n} \in C$ with $a_n \ne 1$ for all n, then set $f(x) = \sum_{n=1}^\infty b_n 2^{-n}$ where $b_n = a_n/2$. Then the series describing f(x) is a base-2 expansion and every number in [0, 1] can be expressed in this way, thus f : C → [0, 1] is a surjection which shows that |C| = |R|.

Corollary 2.3.5. Let $\mathcal L \subset 2^{\mathbb R}$ denote the σ-algebra of Lebesgue measurable subsets, then $|\mathcal L| = |2^{\mathbb R}|$. Hence, $\mathcal B(\mathbb R) \subsetneq \mathcal L \subsetneq 2^{\mathbb R}$.

Proof. We clearly have $|\mathcal L| \le |2^{\mathbb R}|$, and if C is the Cantor set then λ(C) = 0, hence any subset of C is Lebesgue measurable and so we have $|2^{\mathbb R}| = |2^C| \le |\mathcal L|$. By Exercise 2.1.14 we have $|\mathcal B(\mathbb R)| = |\mathbb R|$, hence $\mathcal B(\mathbb R) \subsetneq \mathcal L$. Also, Vitali’s set E constructed at the beginning of the chapter cannot be Lebesgue measurable, hence $\mathcal L \subsetneq 2^{\mathbb R}$.

Note that the function f : C → [0, 1] in the proof of Proposition 2.3.4 is monotone increasing. Moreover, if x, y ∈ C with x < y, then f(x) = f(y) only if x and y are the endpoints of one of the intervals removed from [0, 1]. In this case we have $f(x) = f(y) = m2^{-n}$ where m, n are integers. Thus we may extend f on the interval (x, y) by letting it be constant $m2^{-n}$. In this way we extend f to a monotone increasing function $\tilde f : [0, 1] \to [0, 1]$. The function $\tilde f$ is called the Cantor function.

Theorem 2.3.6. Let f : [0, 1] → [0, 1] be the Cantor function. Then the following hold:

1. f is continuous.

2. The derivative of f exists, and equals zero, almost everywhere.

3. There exists a Lebesgue measurable set E ⊂ [0, 1] such that f(E) is not Lebesgue measurable.

Proof. Note that f is surjective and hence cannot have any jump discontinuities. Since f is monotone increasing it must then be continuous.

f is constant on each middle third, and hence the derivative of f exists and equals zero on each middle third, and as we saw above, the union of these open intervals is conull.

If we consider Vitali’s example of a non-measurable set E ⊂ [0, 1], then setting $E_0 = f^{-1}(E) \cap C$ we have that $E_0$ is contained in a measure zero set and hence must be measurable. Since f(C) = [0, 1] we have $f(E_0) = E$.
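Both the measure computation for C and the Cantor function are easy to experiment with. The sketch below is illustrative: `removed_length` sums the lengths removed in the first N steps exactly, and `cantor_function` approximates the Cantor function from the ternary digits of x, stopping at the first digit 1. The function names and the truncation parameter `depth` are assumptions of this sketch, and x should be passed as a `Fraction` so that the digits are computed exactly.

```python
from fractions import Fraction

def removed_length(N):
    """Total length removed in the first N steps: sum of 2^(n-1) / 3^n."""
    return sum(Fraction(2 ** (n - 1), 3 ** n) for n in range(1, N + 1))

def cantor_function(x, depth=40):
    """Approximate the Cantor function at a rational x in [0, 1].

    Ternary digits 0 and 2 become binary digits 0 and 1. At the first
    ternary digit 1 the point lies in a removed middle third (or is its
    left endpoint), where the function is constant.
    """
    x = Fraction(x)
    if x >= 1:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit of x
        x -= digit
        if digit == 1:
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
    return value
```

The partial sums satisfy `removed_length(N) == 1 - Fraction(2, 3) ** N`, which tends to 1. For the Cantor function, the point 1/4 has ternary expansion 0.0202..., so its value is the binary number 0.0101... = 1/3, up to the truncation error.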

2.3.2 Regularity of Borel measures

Theorem 2.3.7. Suppose µ is a finite Borel measure on a metric space (X, d). Then µ is regular: For E ⊂ X Borel we have

µ(E) = inf{µ(G) | E ⊂ G and G is open} = sup{µ(F ) | F ⊂ E and F is closed}.

Proof. We let Σ denote the family of Borel sets E which satisfy the conclusion of the theorem. If E ⊂ X is closed then $G_n = \{x \in X \mid d(x, E) < 1/n\}$ is open for each n ∈ N and we have $\cap_{n=1}^\infty G_n = E$. By continuity from above of measures we have that $\mu(E) = \lim_{n\to\infty} \mu(G_n)$. Hence it follows that E ∈ Σ. It is also clear that Σ is closed under taking complements.

If $\{E_n\}_{n=1}^\infty \subset \Sigma$ and ε > 0, then there exist $F_n, G_n \subset X$ with $F_n$ closed and $G_n$ open such that $F_n \subset E_n \subset G_n$ and $\mu(G_n \setminus F_n) < \varepsilon 2^{-n}$. If we set $G = \cup_{n=1}^\infty G_n$ and $F = \cup_{n=1}^\infty F_n$ then we have $F \subset \cup_{n=1}^\infty E_n \subset G$, and $G \setminus F \subset \cup_{n=1}^\infty (G_n \setminus F_n)$, hence $\mu(G \setminus F) \le \sum_{n=1}^\infty \mu(G_n \setminus F_n) < \varepsilon$. By continuity from above we have $\lim_{n\to\infty} \mu(G \setminus (\cup_{k=1}^n F_k)) = \mu(G \setminus F) < \varepsilon$. Hence, for n large enough we have $\mu(G \setminus (\cup_{k=1}^n F_k)) < \varepsilon$. Since G is open and $\cup_{k=1}^n F_k \subset F \subset \cup_{n=1}^\infty E_n$ is closed it then follows that $\cup_{n=1}^\infty E_n \in \Sigma$. Thus, Σ is a σ-algebra which contains the closed sets and hence must contain all Borel sets.

A set E ⊂ X is a Gδ-set if it is a countable intersection of open sets. A set E ⊂ X is an Fσ-set if it is a countable union of closed sets.

Corollary 2.3.8. Suppose µ is a σ-ﬁnite Borel measure on a metric space (X, d). Then for every Borel set E ⊂ X there exists an Fσ-set F , and a Gδ-set G such that F ⊂ E ⊂ G and µ(G \ F ) = 0.

Proof. This follows easily from Theorem 2.3.7 when µ is finite. If µ is σ-finite we may write X as a disjoint union $X = \cup_{n=1}^\infty E_n$ where $E_n \subset X$ is Borel and $\mu(E_n) < \infty$. Suppose E ⊂ X is Borel. For each n ≥ 1 consider the Borel measure $\mu_n$ on X given by $\mu_n(A) = \mu(A \cap E_n)$, then $\mu_n$ is a finite measure and so there exist Fσ-sets $F_n^1 \subset E \cap E_n$, $F_n^2 \subset E^c \cap E_n$ so that $\mu((E \cap E_n) \setminus F_n^1) = \mu((E^c \cap E_n) \setminus F_n^2) = 0$.

If we set $F^1 = \cup_{n=1}^\infty F_n^1 \subset E$ and $F^2 = \cup_{n=1}^\infty F_n^2 \subset E^c$, then we have $\mu(E \setminus F^1) = \mu(E^c \setminus F^2) = 0$. Then $G = (F^2)^c$ is Gδ and satisfies E ⊂ G, and $\mu(G \setminus E) = \mu(E^c \setminus F^2) = 0$. Hence, $\mu(G \setminus F^1) = \mu(G \setminus E) + \mu(E \setminus F^1) = 0$.

Corollary 2.3.9. Suppose µF is a Lebesgue-Stieltjes measure on R, and E ⊂ R is a Borel set such that µF (E) < ∞. Then for every ε > 0, there exists G ⊂ R such that G is a ﬁnite union of intervals and µF (E∆G) < ε.

Proof. Suppose E ⊂ R is Borel with $\mu_F(E) < \infty$ and ε > 0. Take t > 0 so that $\mu_F(E \setminus (-t, t)) < \varepsilon/3$. By the previous corollary there exists a sequence of open sets $G_n$ such that $\mu_F(E \Delta \cap_{n=1}^\infty G_n) = 0$. Thus, setting $G_n' = G_n \cap (-t, t)$ we have $\lim_{k\to\infty} \mu_F((\cap_{n=1}^k G_n') \setminus E) = 0$, and hence for some k we have $\mu_F((\cap_{n=1}^k G_n') \setminus E) < \varepsilon/3$. Since $\cap_{n=1}^k G_n'$ is open it is a countable union of intervals, hence there exists a set G which is a finite union of intervals such that $\mu_F((\cap_{n=1}^k G_n') \setminus G) < \varepsilon/3$. We then have

\[ \mu_F(E \Delta G) \le \mu_F(E \setminus (-t, t)) + \mu_F((\cap_{n=1}^k G_n') \setminus E) + \mu_F((\cap_{n=1}^k G_n') \setminus G) < \varepsilon. \]

Theorem 2.3.10. Suppose µ is a ﬁnite Borel measure on a complete separable metric space (X, d). Then µ is tight: For E ⊂ X Borel we have

µ(E) = sup{µ(K) | K ⊂ E and K is compact}.

Proof. Since µ is regular it is enough to consider the case when E is closed, and then restricting to E we might as well assume that E = X. So it suffices to show µ(X) = sup{µ(K) | K is compact}.

Fix ε > 0. Let $\{x_i\}_{i=1}^\infty$ be a countable dense set in X. If n ≥ 1 then $\cup_{i=1}^\infty B(1/n, x_i) = X$ and hence $\lim_{k\to\infty} \mu(\cup_{i=1}^k B(1/n, x_i)) = \mu(X)$. We take $k_n$ so that $\mu(X \setminus \cup_{i=1}^{k_n} B(1/n, x_i)) < \varepsilon 2^{-n}$. Let $K = \cap_{n=1}^\infty \cup_{i=1}^{k_n} B(1/n, x_i)$. Then K is closed and totally bounded, and hence compact. We also have $\mu(X \setminus K) \le \sum_{n=1}^\infty \mu(X \setminus \cup_{i=1}^{k_n} B(1/n, x_i)) < \varepsilon$.

Theorem 2.3.11 (Lusin’s Theorem). Suppose µ is a finite Borel measure on a metric space (X, d), and f ∈ M(X). For each ε > 0 there exists a closed set F ⊂ X such that $\mu(F^c) < \varepsilon$ and $f|_F$ is continuous.

Proof. Fix ε > 0, and take an enumeration of the rationals $\mathbb Q = \{q_n\}_{n=1}^\infty$. Then $E_{n,k} = f^{-1}((q_n, q_k))$ is measurable and hence there exist $F_{n,k}$ closed and $V_{n,k}$ open so that $F_{n,k} \subset E_{n,k} \subset V_{n,k}$ with $\mu(V_{n,k} \setminus F_{n,k}) < \varepsilon 2^{-n-k}$. Set $U = \cup_{n,k} (V_{n,k} \setminus F_{n,k})$ and $F = U^c$. Then µ(U) < ε, and F is closed. Moreover, $f^{-1}((q_n, q_k)) \cap F = V_{n,k} \cap F$. Since every open set is a union of sets of the form $(q_n, q_k)$ it then follows easily that $f|_F$ is continuous.

Corollary 2.3.12. Suppose µF is a Lebesgue-Stieltjes measure on R, and f ∈ M(R) is such that f vanishes outside a ﬁnite measure set. Then for all ε > 0 there exists a continuous function g ∈ C0(R) so that

µF ({x ∈ R | f(x) ≠ g(x)}) < ε.

Proof. Fix ε > 0 and take $t_0 > 0$ so that $\mu_F(((-\infty, -t_0) \cup (t_0, \infty)) \cap \{x \in \mathbb R \mid f(x) \ne 0\}) < \varepsilon/4$. By considering the restriction of $\mu_F$ to $[-t_0, t_0]$ the previous theorem gives a closed set $E \subset [-t_0, t_0]$ so that $f|_E$ is continuous and $\mu_F([-t_0, t_0] \setminus E) < \varepsilon/4$.

We let a = inf E and b = sup E, and take a′ < a and b′ > b so that $\mu_F([a', a)) + \mu_F((b, b']) < \varepsilon/2$. If $t \in E^c$ with a < t < b we let $(t_1, t_2)$ denote the largest interval in $E^c$ which contains t. We define g so that

\[ g(t) = \begin{cases} f(t) & \text{if } t \in E, \\ 0 & \text{if } t < a' \text{ or } t > b', \\ \frac{t - a'}{a - a'} f(a) & \text{if } t \in [a', a), \\ \frac{b' - t}{b' - b} f(b) & \text{if } t \in (b, b'], \\ \frac{t - t_1}{t_2 - t_1} f(t_2) + \frac{t_2 - t}{t_2 - t_1} f(t_1) & \text{if } t \in (t_1, t_2). \end{cases} \]

We then have that $g \in C_0(\mathbb R)$ and g agrees with f on E, hence it follows easily that $\mu_F(\{x \in \mathbb R \mid f(x) \ne g(x)\}) < \varepsilon$.

2.3.3 Exercises

Exercise 2.3.13. For every ε > 0, there exists a compact set K ⊂ [0, 1] which contains no isolated points and no non-trivial open interval such that λ(K) > 1 − ε.

Exercise 2.3.14. Let E ⊂ R be a Borel set such that λ(E) < ∞. Then the maps R ∋ t ↦ λ(E∆tE), and t ↦ λ(E∆(E + t)) are continuous.

Exercise 2.3.15. If (X, M, µ) is a measure space and $\{A_j\}_{j=1}^\infty \subset \mathcal M$, we set $\liminf_{j\to\infty} A_j = \cup_{N=1}^\infty \cap_{k=N}^\infty A_k$. Then $\mu(\liminf A_j) \le \liminf \mu(A_j)$.

Exercise 2.3.16. There exists a measurable function f : [0, 1] → R so that for all (a, b) ⊂ [0, 1] and (c, d) ⊂ R we have λ({x ∈ (a, b) | f(x) ∈ (c, d)}) > 0.

Exercise 2.3.17. Show that there exists a Borel set A ⊂ [0, 1] such that 0 < λ(A ∩ I) < λ(I) for every subinterval I of [0, 1].

2.4 Integration

2.4.1 Integrable functions

Let (X, M, µ) be a measure space. If E ∈ M has finite measure, and $f = \sum_{n=1}^k \alpha_n 1_{E_n}$ is a simple function with respect to some measurable partition $\{E_n\}_{n=1}^k$ of E, then we define the integral of f to be

\[ \int f = \sum_{n=1}^k \alpha_n \mu(E_n). \]

Note that if we have another representation $f = \sum_{m=1}^l \beta_m 1_{F_m}$ then by writing $F_m = \cup_{n=1}^k F_m \cap E_n$ we see that $\sum_{n=1}^k \alpha_n \mu(F_m \cap E_n) = \beta_m \mu(F_m)$. Summing over m then shows that the integral is well defined. Depending on the situation we also use the following notation for the integral

\[ \int f \, d\mu; \quad \int_X f \, d\mu; \quad \int_X f(x) \, d\mu(x); \quad \int_X f(x) \, dx. \]

If A ⊂ X is measurable we write $\int_A f$ for the integral $\int_X 1_A f$.

We let $L^1_0(X)$ denote the set of all simple functions having a decomposition $f = \sum_{n=1}^k \alpha_n 1_{E_n}$ where $\{E_n\}_{n=1}^k$ gives a partition of a finite measure set E. If I ⊂ C we denote by $L^1_0(X; I)$ those functions in $L^1_0(X)$ which take values in I, and we also set $L^1_0(X)_+ = L^1_0(X; [0, \infty))$. We note that $L^1_0(X)$ is a vector space over C. We also note that the integral defines a linear functional on $L^1_0(X)$.

If $f \in L^1_0(X)_+$ then we clearly have $\int f \ge 0$. Linearity then shows that for $f, g \in L^1_0(X; \mathbb R)$ we have

\[ \int f \le \int g, \quad \text{if } f \le g. \tag{2.2} \]

Also, note that for $f \in L^1_0(X)$ the triangle inequality in C shows that

\[ \Big| \int f \Big| \le \int |f|. \tag{2.3} \]

Similarly, if we have $f, g \in L^1_0(X)$, then taking a partition of a set of finite measure such that both f and g are simple functions with respect to this partition, the triangle inequality in C shows that

\[ \int |f + g| \le \int |f| + \int |g|. \tag{2.4} \]

Therefore, we may define the $L^1$-seminorm on $L^1_0(X)$ as

\[ \|f\|_1 = \int |f|. \tag{2.5} \]

Finally, note that $\|f\|_1 = 0$ if and only if f = 0 almost everywhere.
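The definition of the integral of a simple function, and the fact that it does not depend on the chosen representation, can be illustrated on a finite set with point masses. The following is a sketch with made-up data: `mass`, `rep_a` and `rep_b` are hypothetical, and the two representations describe the same function.

```python
from fractions import Fraction

# Hypothetical finite measure space: point masses on X = {0, ..., 5}.
mass = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1),
        3: Fraction(0), 4: Fraction(2), 5: Fraction(1, 6)}

def mu(E):
    """Measure of a subset: the sum of its point masses."""
    return sum((mass[x] for x in E), Fraction(0))

def integral(rep):
    """Integral of sum_n alpha_n 1_{E_n}.

    rep is a list of pairs (alpha_n, E_n) with the E_n pairwise disjoint.
    """
    return sum((alpha * mu(E) for alpha, E in rep), Fraction(0))

def absolute(rep):
    """Representation of |f| on the same partition."""
    return [(abs(alpha), E) for alpha, E in rep]

# Two representations of the same simple function.
rep_a = [(3, {0, 1}), (3, {2}), (-1, {4, 5})]
rep_b = [(3, {0, 1, 2}), (-1, {4}), (-1, {5})]
```

Both representations give the integral 10/3, and `integral(absolute(rep_a))` is 23/3, consistent with inequality (2.3).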

We let $L^1(X)$ be the Banach space completion of $L^1_0(X)$ after identifying functions which agree almost everywhere. Depending on the situation we may use the notation

L1(X); L1(µ); L1(X, µ); L1(X, M, µ).

Note that since $|\int f| \le \int |f|$ it follows that the integral is continuous with respect to the seminorm $\|\cdot\|_1$. Hence, the integral extends continuously to $L^1(X)$, and we use the same terminology here.

Every vector in $L^1(X)$ is the limit of a Cauchy sequence of functions in $L^1_0(X)$. We now wish to find a more tractable realization of vectors in $L^1(X)$.

Lemma 2.4.1. Let $\{f_n\}_{n\in\mathbb N} \subset L^1_0(X)$ be a Cauchy sequence, then there exists a subsequence $\{f_{n_k}\}_{k\in\mathbb N}$ which converges almost everywhere to a measurable function f, and such that for all ε > 0 there exists a measurable set A ⊂ X with µ(A) < ε, such that $\{f_{n_k}\}_{k\in\mathbb N}$ converges to f uniformly on $A^c$.

Proof. Suppose $\{f_n\}_{n\in\mathbb N} \subset L^1_0(X)$ is Cauchy. Passing to a subsequence we may assume that $\|f_n - f_{n+1}\|_1 \le 4^{-n}$, for all n ∈ N. Set

\[ E_n = \{x \in X \mid |f_n(x) - f_{n+1}(x)| \ge 2^{-n}\}. \]

Then we have $|f_n - f_{n+1}| \ge 2^{-n} 1_{E_n}$, and hence it follows that

\[ 4^{-n} \ge \int |f_n - f_{n+1}| \ge \int 2^{-n} 1_{E_n} = 2^{-n} \mu(E_n). \]

For N ∈ N set $A_N = \cup_{n\ge N} E_n$, and set $A = \cap_{N\in\mathbb N} A_N$. Then

\[ \mu(A_N) \le \sum_{n\ge N} \mu(E_n) \le \sum_{n\ge N} 2^{-n} = 2^{-N+1}. \]

Hence µ(A) = 0.

If $x \in A_N^c$ and n ≥ N then $|f_n(x) - f_{n+1}(x)| < 2^{-n}$. It therefore follows that $\{f_n(x)\}_{n\in\mathbb N}$ is Cauchy and hence converges to some f(x) ∈ C. Therefore there exists a function $f : A^c \to \mathbb C$ so that $f_n(x) \to f(x)$ for all $x \in A^c$. Note that we also have that $f_n$ converges uniformly to f on each set $A_N^c$, and $\lim_{N\to\infty} \mu(A_N) = 0$. From Proposition 2.1.7 we see that $f : A^c \to \mathbb C$ is measurable, and we may extend f to a measurable function on X by setting f(x) = 0 for all x ∈ A.

Lemma 2.4.2. Suppose $\{f_n\}_{n\in\mathbb N}, \{g_n\}_{n\in\mathbb N} \subset L^1_0(X)$ are Cauchy sequences, and f ∈ M(X) such that both sequences converge almost everywhere to f. Then $\lim_{n\to\infty} \|f_n - g_n\|_1 = 0$.

Proof. If we consider $h_n = f_n - g_n$, then $\{h_n\}_{n=1}^\infty$ is Cauchy in $L^1_0(X)$ and satisfies $h_n \to 0$ almost everywhere. We must show that $\|h_n\|_1 \to 0$. Fix ε > 0, and take N ∈ N so that $\|h_n - h_m\|_1 < \varepsilon$ for all n, m ≥ N. Using Lemma 2.4.1 and passing to a subsequence we may assume that there exists a measurable set A ⊂ X with $\mu(A) < \frac{\varepsilon}{1 + \|h_N\|_\infty}$, such that $h_n \to 0$ uniformly on $A^c$. Let E ⊂ X be a set of finite measure such that $h_N$ vanishes outside of E. Then for n large we have

\begin{align*}
\int_{A \cup E^c} |h_n| &\le \int_{A \cup E^c} |h_N| + \int_{A \cup E^c} |h_n - h_N| \\
&\le \int_A |h_N| + \|h_n - h_N\|_1 \\
&\le \mu(A) \|h_N\|_\infty + \|h_n - h_N\|_1 < 2\varepsilon.
\end{align*}

Hence,

\begin{align*}
\limsup_{n\to\infty} \|h_n\|_1 &= \limsup_{n\to\infty} \Big( \int_{A \cup E^c} |h_n| + \int_{A^c \cap E} |h_n| \Big) \\
&\le 2\varepsilon + \mu(E) \limsup_{n\to\infty} \|h_n|_{A^c}\|_\infty = 2\varepsilon.
\end{align*}

Since ε > 0 was arbitrary we have that $\|h_n\|_1 \to 0$.

We let $\mathcal L^1(X)$ denote the set of all measurable functions f ∈ M(X) such that f is the almost everywhere limit of a Cauchy sequence of functions in $L^1_0(X)$. Functions in $\mathcal L^1(X)$ are said to be integrable. Clearly, $\mathcal L^1(X)$ is a vector space. Lemma 2.4.2 shows that we have a well defined map $\Xi : \mathcal L^1(X) \to L^1(X)$ which assigns to each function $f \in \mathcal L^1(X)$ the class of a Cauchy sequence which converges almost everywhere to f. The map Ξ is clearly linear and the kernel consists of all measurable functions which are zero almost everywhere. We extend the integral to a linear map on $\mathcal L^1(X)$ by setting $\int f = \int \Xi(f)$. In other words, given a function $f \in \mathcal L^1(X)$ we have $\int f = \lim_{n\to\infty} \int f_n$ where $\{f_n\}_{n\in\mathbb N}$ is a Cauchy sequence in $L^1_0(X)$ which converges almost everywhere to f. Lemma 2.4.1 shows that the map Ξ is surjective. Thus, we may think of the space $L^1(X)$ as the space of integrable functions where we identify functions which agree almost everywhere.

2.4.2 Properties of integration

Lemma 2.4.3. If $\{f_n\}_{n\in\mathbb N} \subset L^1_0(X)$ is Cauchy, and $f \in L^1(X)$ such that $f_n \to f$ almost everywhere, then $\{|f_n|\}_{n\in\mathbb N}$ is also Cauchy, and $|f_n| \to |f|$ almost everywhere.

Proof. This follows easily from the inequality ||a| − |b|| ≤ |a − b| for all a, b ∈ C.

Theorem 2.4.4. If f ∈ L1(X) then |f| ∈ L1(X). Moreover, the inequalities (2.2), (2.3), (2.4), and (2.5) hold for all functions in L1(X).

Proof. The fact that |f| ∈ L1(X) if f ∈ L1(X) follows from the previous lemma, which also shows that the function f ↦ |f| is uniformly continuous on bounded subsets of $L^1_0(X)$. Since $f \mapsto \int f$ and $f \mapsto \|f\|_1$ are also uniformly continuous on bounded subsets of $L^1_0(X)$, it follows that these maps are also continuous on the completion L1(X). The inequalities (2.2), (2.3), (2.4), and (2.5) then follow by continuity, since they hold on the dense subspace $L^1_0(X)$.

Recall that if we let L∞(X) denote the space of essentially bounded functions, then we have a complete semi-norm on L∞(X) given by $\|f\|_\infty = \inf\{M \in [0, \infty) \mid \mu(\{x \in X \mid |f(x)| > M\}) = 0\}$. An essentially bounded function f satisfies $\|f\|_\infty = 0$ if and only if f is zero almost everywhere. Thus, if we identify functions which agree almost everywhere then we obtain a Banach space L∞(X). This is also a Banach algebra since $\|fg\|_\infty \le \|f\|_\infty \|g\|_\infty$. We let $L^\infty_0(X)$ denote the space of simple functions. Since any function f ∈ L∞(X) agrees almost everywhere with a bounded function it follows from Proposition 2.1.8 that $L^\infty_0(X)$ is dense in L∞(X).

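Theorem 2.4.5. Suppose f ∈ L1(X), and g ∈ L∞(X), then gf ∈ L1(X) and $\|gf\|_1 \le \|g\|_\infty \|f\|_1$.

Proof. Suppose first that $f \in L^1_0(X)$ and $g \in L^\infty_0(X)$, say $f = \sum_{n=1}^N \alpha_n 1_{E_n}$ with $\mu(E_n) < \infty$, and $g = \sum_{m=1}^M \beta_m 1_{F_m}$. We may assume that $\{E_n\}_{n=1}^N$ and $\{F_m\}_{m=1}^M$ are each pairwise disjoint. We then have

\[ gf = \sum_{n=1}^N \sum_{m=1}^M \beta_m \alpha_n 1_{F_m \cap E_n} \in L^1_0(X), \]

and

\[ \|gf\|_1 = \sum_{n=1}^N \sum_{m=1}^M |\beta_m| |\alpha_n| \mu(F_m \cap E_n) \le \|g\|_\infty \sum_{n=1}^N |\alpha_n| \mu(E_n) = \|g\|_\infty \|f\|_1. \]

For the general case, suppose that f ∈ L1(X) and g ∈ L∞(X). Take $f_n \in L^1_0(X)$ so that $\{f_n\}_{n=1}^\infty$ is Cauchy in $L^1_0(X)$ and $f_n \to f$ almost everywhere. Also, take $g_n \in L^\infty_0(X)$ so that $\|g_n - g\|_\infty \to 0$. Then $g_n f_n \to gf$ almost everywhere, and from the triangle inequality and the argument above we have

\[ \|g_n f_n - g_m f_m\|_1 \le \|g_n\|_\infty \|f_n - f_m\|_1 + \|g_n - g_m\|_\infty \|f_m\|_1. \]

Therefore $\{g_n f_n\}_{n=1}^\infty$ is Cauchy in $L^1_0(X)$. We then have that gf ∈ L1(X) and

\[ \|gf\|_1 = \lim_{n\to\infty} \|g_n f_n\|_1 \le \lim_{n\to\infty} \|g_n\|_\infty \|f_n\|_1 = \|g\|_\infty \|f\|_1. \]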
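The inequality of Theorem 2.4.5 is easy to test on a finite set with point masses, where both norms are finite sums or maxima. The sketch below uses made-up data; note that the very large value of `g` at the null point 1 affects neither norm.

```python
from fractions import Fraction

# Hypothetical point masses; the point 1 is a null set.
mass = {0: Fraction(1, 2), 1: Fraction(0), 2: Fraction(2)}

def norm1(f):
    """||f||_1 = integral of |f| with respect to the point masses."""
    return sum(abs(f[x]) * mass[x] for x in mass)

def norm_inf(g):
    """||g||_inf: the largest |g(x)| over points of positive mass."""
    return max((abs(g[x]) for x in mass if mass[x] > 0), default=0)

f = {0: 3, 1: -4, 2: Fraction(1, 2)}
g = {0: -2, 1: 1000, 2: 1}
gf = {x: g[x] * f[x] for x in mass}
```

With these values `norm1(gf)` is 4, while `norm_inf(g) * norm1(f)` is 2 · 5/2 = 5.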


Corollary 2.4.6. If f ∈ L1(X), and h ∈ M(X) such that |h| ≤ |f|, then h ∈ L1(X). In particular, f ∈ L1(X) if and only if |f| ∈ L1(X).

Proof. We let g(x) = 0 if f(x) = 0, and g(x) = h(x)/f(x) otherwise. Then g is measurable and $\|g\|_\infty \le 1$. Therefore h = gf ∈ L1(X).

Corollary 2.4.7. If µ(X) < ∞ then L∞(X) ⊂ L1(X), and $\|f\|_1 \le \|f\|_\infty \mu(X)$.

Proof. If µ(X) < ∞ then $1_X \in L^1(X)$ with $\|1_X\|_1 = \mu(X)$. Therefore for f ∈ L∞(X) we have $\|f\|_1 \le \|f\|_\infty \|1_X\|_1 = \|f\|_\infty \mu(X)$.

We set $L^1(X)_+ = L^1(X) \cap M(X; [0, \infty))$, and we set $L^\infty(X)_+ = L^\infty(X) \cap M(X)_+$.

2.4.3 Functions which agree almost everywhere

Let (X, M, µ) be a measure space. So far we have introduced $\mathcal M(X)$, $\mathcal L^\infty(X)$, and $\mathcal L^1(X)$ as the spaces of all measurable, essentially bounded, and integrable functions respectively. It is often the case that we are interested in functions only up to measure zero, and so we consider the spaces $M(X)$, $L^\infty(X)$, and $L^1(X)$ which are respectively the quotients of the above spaces where we have identified functions which agree almost everywhere. Note that $\mathcal M(X)$ does not depend on the measure µ, however $M(X)$ does.

The elements in $M(X)$ are equivalence classes of functions, however it is cumbersome to state this explicitly each time. Thus, in the sequel when we write $f \in M(X)$ (or $f \in L^\infty(X)$, $f \in L^1(X)$) we mean that we can take f to be any function in $\mathcal M(X)$ which represents this equivalence class. Similarly, if we write an expression, e.g., f ≤ g with $f, g \in M(X)$, then this expression is meant to be understood as occurring almost everywhere.

As an example, we might say $\{f_n\}_{n=1}^\infty \subset M(X)$, and $f \in M(X)$ such that $f_n \to f$ almost everywhere. This is unambiguous as the countable union of measure zero sets has measure zero, and thus replacing $f_n$ and f by functions which agree almost everywhere does not change the fact that $f_n \to f$ almost everywhere. As long as we restrict to countably many functions/operations at a time this will not cause any difficulty.

2.4.4 Convergence properties

We begin this subsection by improving Lemma 2.4.1 to the case when fn ∈ L1(X).

Theorem 2.4.8. Let $\{f_n\}_{n\in\mathbb N} \subset L^1(X)$ be a Cauchy sequence, then there exists a subsequence $\{f_{n_k}\}_{k\in\mathbb N}$ which converges almost everywhere to a measurable function f, and such that for all ε > 0 there exists a measurable set A ⊂ X with µ(A) < ε, such that $\{f_{n_k}\}_{k\in\mathbb N}$ converges to f uniformly on $A^c$.

Proof. In the proof of Lemma 2.4.1 the only reason we needed $f_n \in L^1_0(X)$ was so that $\|f_n\|_1$ was defined and satisfied $\|f_n\|_1 = \int |f_n|$, and that if g ≤ h then $\int g \le \int h$. By Theorem 2.4.4 we also have these facts now for general functions in $L^1(X)$. Thus, the proof follows verbatim as in Lemma 2.4.1.

Theorem 2.4.9 (The Monotone Convergence Theorem). Let (X, M, µ) be a measure space. Suppose $\{f_n\}_{n=1}^\infty \subset L^1(X)_+$ is a sequence such that $f_n \le f_{n+1}$ for all n, and such that $\int f_n$ is bounded. Then $\{f_n\}_{n=1}^\infty$ converges in $L^1$, and almost everywhere, to a function $f \in L^1(X)_+$.

Proof. Suppose $a = \sup_n \int f_n < \infty$. Since $\{f_n\}_{n=1}^\infty$ is increasing so is $\{\int f_n\}_{n=1}^\infty$, and for n ≤ m we have $\|f_m - f_n\|_1 = \int (f_m - f_n)$. Since $\int f_n \to a$ it then follows that $\{f_n\}_{n=1}^\infty$ is Cauchy in $L^1$. By the previous theorem there then exists a subsequence which converges in $L^1$ and almost everywhere to a function $f \in L^1$. Since we have an increasing sequence it then follows that $\{f_n\}_{n=1}^\infty$ converges almost everywhere and in $L^1$ to f.

Corollary 2.4.10. Let (X, M, µ) be a measure space. Suppose $\{f_n\}_{n=1}^\infty \subset L^1(X)_+$ is a sequence such that $f_{n+1} \le f_n$ for all n. Then $\{f_n\}_{n=1}^\infty$ converges in $L^1$, and almost everywhere, to a function $f \in L^1(X)_+$.

Proof. We apply the monotone convergence theorem to the sequence $\{f_1 - f_n\}_{n=1}^\infty$.

Lemma 2.4.11 (Fatou’s Lemma). If $\{f_n\}_{n=1}^\infty \subset L^1(X)_+$ is such that $\liminf_{n\to\infty} \int f_n < \infty$, then $\liminf_{n\to\infty} f_n(x)$ exists for almost every x ∈ X. Moreover, $\liminf_{n\to\infty} f_n$ is a measurable function which is in $L^1(X)$, and we have

\[ \int \liminf_{n\to\infty} f_n \le \liminf_{n\to\infty} \int f_n. \]

Proof. Fix k and consider the decreasing sequence $\{g_m\}_{m=k}^\infty$ where

\[ g_m = \inf\{f_k, f_{k+1}, \ldots, f_m\}. \]

Then $\{g_m\}_{m=k}^\infty$ decreases to $\inf_{m\ge k} f_m$, and applying the previous corollary we have that $\inf_{m\ge k} f_m$ is in $L^1(X)$, and

\[ \int \inf_{m\ge k} f_m \le \inf_{m\ge k} \int f_m. \]

By hypothesis we have that $\{\int \inf_{m\ge k} f_m\}_{k=1}^\infty$ is bounded, and since $\{\inf_{m\ge k} f_m\}_{k=1}^\infty$ is increasing we then have from the monotone convergence theorem that $\liminf_{n\to\infty} f_n$ exists almost everywhere, is in $L^1(X)$, and satisfies

\[ \int \liminf_{n\to\infty} f_n = \lim_{n\to\infty} \int \inf_{m\ge n} f_m \le \liminf_{n\to\infty} \int f_n. \]
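The inequality in Fatou's lemma can be strict. A standard example, sketched here for Lebesgue measure on [0, 1] and not taken from the notes, is the sequence of spikes f_n = n · 1_{(0, 1/n]}: every f_n has integral 1, but f_n(x) = 0 as soon as n > 1/x, so the pointwise lim inf is 0. The integrals are computed by hand as height times length rather than numerically.

```python
from fractions import Fraction

def f(n, x):
    """f_n = n * 1_{(0, 1/n]} evaluated at the point x."""
    return n if 0 < x <= Fraction(1, n) else 0

def integral_f(n):
    """Exact Lebesgue integral of f_n over [0, 1]: height n times length 1/n."""
    return n * Fraction(1, n)

def tail_values(x, start, stop):
    """Values f_n(x) for start <= n < stop, to inspect the pointwise limit."""
    return [f(n, x) for n in range(start, stop)]
```

At x = 1/7, for example, `tail_values(Fraction(1, 7), 8, 200)` consists only of zeros, while `integral_f(n)` equals 1 for every n, so the integral of the lim inf is 0 and the lim inf of the integrals is 1. The same sequence shows that the domination hypothesis in the dominated convergence theorem cannot simply be dropped.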


∞ 1 Theorem 2.4.12 (The Fatou-Lebesgue Theorem). Let {fn}n=1 ⊂ L (X; R). 1 If there exists a function g ∈ L (X)+ such that |fn| ≤ g for all n ≥ 1 then lim infn→∞ fn and lim supn→∞ fn exist almost everywhere, are integrable, and we have Z Z Z Z lim inf fn ≤ lim inf fn ≤ lim sup fn ≤ lim sup fn. n→∞ n→∞ n→∞ n→∞ Proof. The ﬁrst inequality follows from linearity of the integral and by applying Fatou’s lemma to the non-negative functions fn + g. The second inequality is obvious. The third inequality follows by applying Fatou’s lemma to the non- negative functions g − fn.

Theorem 2.4.13 (Lebesgue's Dominated Convergence Theorem). Suppose $\{f_n\}$ is a sequence in $L^1(X)$ such that $f_n \to f$ almost everywhere. If there exists $g \in L^1(X)_+$ such that $|f_n| \le g$ for all $n \in \mathbb{N}$, then $f \in L^1(X)$ and $\int f_n \to \int f$.

Proof. Note that $|f| \le g$ almost everywhere and so by Corollary 2.4.6 we have that $f \in L^1(X)$. By considering separately the real and imaginary parts of $f_n$ we see that it is enough to consider the case when $f_n$ is real valued. In this case it follows from the Fatou-Lebesgue theorem that
$$\int f = \int \liminf_{n\to\infty} f_n \le \liminf_{n\to\infty} \int f_n \le \limsup_{n\to\infty} \int f_n \le \int \limsup_{n\to\infty} f_n = \int f,$$
and the result then follows.

Theorem 2.4.14 (Egorov's Theorem). Let $(X, \mathcal{M}, \mu)$ be a finite measure space, and suppose $\{f_n\}_{n=1}^\infty \subset M(X)$ and $f \in M(X)$ are such that $f_n \to f$ almost everywhere. Then for each $\varepsilon > 0$ there exists a measurable set $A \subset X$ such that $\mu(A) < \varepsilon$ and $f_n \to f$ uniformly on $A^c$.

Proof. We let $E_{n,k} = \{x \in X \mid |f_n(x) - f(x)| \ge 1/k\}$. Then $E_{n,k} \in \mathcal{M}$ and since $f_n(x) \to f(x)$ for almost every $x \in X$ we have that $\mu(\cap_{N=1}^\infty \cup_{n=N}^\infty E_{n,k}) = 0$ for every $k \in \mathbb{N}$. Since $\mu$ is finite, it follows that for each $k \in \mathbb{N}$ there exists $N_k$ so that $\mu(\cup_{n=N_k}^\infty E_{n,k}) < \varepsilon 2^{-k}$. We set $A = \cup_{k=1}^\infty \cup_{n=N_k}^\infty E_{n,k}$, so that $\mu(A) < \varepsilon$. If $k \in \mathbb{N}$ and $n \ge N_k$ we have $|f_n(x) - f(x)| < 1/k$ for all $x \in A^c$. Therefore $f_n \to f$ uniformly on $A^c$.

We extend the integral to certain real valued functions as follows: If $g \in L^1(X; \mathbb{R})$ and $f \in M(X; [0, \infty))$ is not integrable, then we write $\int (f + g) = \infty$, and $\int -(f + g) = -\infty$. Many of the results above extend to this setting, although the inequalities become trivial in the case when $f$ is not integrable.
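As a concrete illustration of Egorov's theorem (Theorem 2.4.14), consider $f_n(x) = x^n$ on $[0, 1]$ with Lebesgue measure, which converges to $0$ almost everywhere but not uniformly on $[0, 1)$. The sketch below is only an illustration under that assumed example; the exceptional set $A = (1 - \varepsilon/2, 1]$ and the helper names are choices made here, not part of the notes.

```python
def sup_on_complement(n: int, eps: float) -> float:
    """Supremum of |x**n - 0| over A^c = [0, 1 - eps/2].

    x**n is increasing on [0, 1], so the supremum is attained at the
    right endpoint 1 - eps/2. The removed set A = (1 - eps/2, 1] has
    Lebesgue measure eps/2 < eps.
    """
    return (1.0 - eps / 2.0) ** n


def value_near_one(n: int) -> float:
    """Value of x**n at a point of [0, 1) chosen close to 1 (depending on n).

    This stays close to 1 for every n, which shows that the convergence
    is not uniform on [0, 1) itself.
    """
    x = 1.0 - 1.0 / (1000.0 * n)
    return x ** n


if __name__ == "__main__":
    for n in (10, 100, 1000, 2000):
        print(n, sup_on_complement(n, 0.01), value_near_one(n))
```

Removing a set of small measure near $1$ is exactly what makes the supremum tend to $0$, while on $[0, 1)$ the supremum of $x^n$ remains $1$ for every $n$.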

2.4.5 Exercises

Exercise 2.4.15 (Chebyshev's inequality). Let $(X, \mathcal{M}, \mu)$ be a measure space and suppose $f \in L^1(X, \mu)$. Then for each $\alpha > 0$ we have
$$\mu(\{x \in X \mid |f(x)| > \alpha\}) \le \frac{1}{\alpha} \|f\|_1.$$

Exercise 2.4.16. There does not exist a metric $d$ on $L^\infty([0, 1], \lambda)$ so that a sequence of functions $\{f_n\}_{n=1}^\infty \subset L^\infty([0, 1], \lambda)$ converges almost everywhere to a function $f \in L^\infty([0, 1], \lambda)$ if and only if $d(f_n, f) \to 0$. Hint: Find a sequence $\{f_n\}_{n=1}^\infty \subset L^\infty([0, 1], \lambda)$ which does not converge almost everywhere to any function but such that every subsequence has a further subsequence which does converge almost everywhere to some function.

Exercise 2.4.17. Let $(X, \mathcal{M}, \mu)$ be a σ-finite measure space and suppose $f \in M(X; [0, \infty))$. Define
$$I_1(f) = \sup\left\{ \int g \;\middle|\; g \in L^1_0(X; [0, \infty)),\ g \le f \right\},$$
$$I_2(f) = \inf\left\{ \int g \;\middle|\; g \in L^1_0(X; [0, \infty)),\ f \le g \right\}.$$
Show that $f$ is integrable if and only if $I_1(f) < \infty$, and in this case we have $I_1(f) = I_2(f) = \int f$.

Exercise 2.4.18. Suppose $(X, \mathcal{M}, \mu)$ is a measure space and $f \in M(X, [0, \infty))$. Set $F(\lambda) = \mu(f^{-1}([\lambda, \infty)))$. Then $F$ is measurable and $F \in L^1([0, \infty))$ if and only if $f \in L^1(X)$. Moreover, in this case we have $\int f \, d\mu = \int F \, d\lambda$.

Exercise 2.4.19. Suppose $(X, \mathcal{M}, \mu)$ is a measure space, $(Y, \mathcal{N})$ is a measurable space, and $\theta : X \to Y$ is measurable. Then for all $f \in M(Y)$ we have $f \circ \theta \in L^1(X, \mu)$ if and only if $f \in L^1(Y, \theta_*\mu)$, and in this case we have
$$\int f \circ \theta \, d\mu = \int f \, d(\theta_*\mu).$$

2.5 Product spaces

If $(X, \mathcal{M})$ and $(Y, \mathcal{N})$ are measurable spaces, we let $\mathcal{M} \otimes \mathcal{N} \subset 2^{X \times Y}$ denote the σ-algebra generated by sets of the form $E \times F$ where $E \in \mathcal{M}$ and $F \in \mathcal{N}$. In other words, $\mathcal{M} \otimes \mathcal{N}$ is the smallest σ-algebra so that the projection maps $\pi_X : X \times Y \to X$ and $\pi_Y : X \times Y \to Y$ are measurable. More generally, if $\{(X_i, \mathcal{M}_i)\}_{i \in I}$ is a family of measurable spaces then we denote by $\otimes_{i \in I} \mathcal{M}_i$ the smallest σ-algebra so that the projection maps are measurable.

Proposition 2.5.1. Let $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ be measure spaces. Then there is a measure $\zeta$ on $\mathcal{M} \otimes \mathcal{N}$ so that $\zeta(E \times F) = \mu(E)\nu(F)$ for $E \in \mathcal{M}$ and $F \in \mathcal{N}$. Here we use the convention $0 \cdot \infty = 0$. If $\mu$ and $\nu$ are σ-finite then this measure is unique.

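Proof. We let $\mathcal{A}$ denote the algebra generated by sets of the form $E \times F$ where $E \in \mathcal{M}$ and $F \in \mathcal{N}$. Suppose $E \times F = \cup_{n=1}^\infty E_n \times F_n$ is a disjoint union, where $E, E_n \in \mathcal{M}$ and $F, F_n \in \mathcal{N}$, with $\mu(E), \nu(F) < \infty$. Then for $x \in X$ and $y \in Y$ we have
$$1_E(x) 1_F(y) = 1_{E \times F}(x, y) = \sum_{n=1}^\infty 1_{E_n \times F_n}(x, y) = \sum_{n=1}^\infty 1_{E_n}(x) 1_{F_n}(y).$$
Integrating with respect to $x$, and using the monotone convergence theorem gives
$$\mu(E) 1_F(y) = \sum_{n=1}^\infty \mu(E_n) 1_{F_n}(y).$$
If we then integrate with respect to $y$ we obtain
$$\mu(E)\nu(F) = \sum_{n=1}^\infty \mu(E_n)\nu(F_n).$$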

If $A \in \mathcal{A}$, then we may write $A$ as a finite disjoint union $A = \cup_{i=1}^n E_i \times F_i$ where $E_i \in \mathcal{M}$ and $F_i \in \mathcal{N}$. From above we then see that setting $\zeta_0(A) = \sum_{i=1}^n \mu(E_i)\nu(F_i)$ gives a well defined premeasure on $\mathcal{A}$. Carathéodory's extension theorem then shows that this extends to a measure $\zeta$, and if $\mu$ and $\nu$ are σ-finite then so is $\zeta$ and hence this measure is unique.

The measure constructed by Carathéodory's extension theorem in the previous proof is called the product measure and denoted by $\mu \times \nu$ (or $\mu^2$ if $\mu = \nu$). We can of course generalize the above proposition easily to any finite number of measure spaces.

If $E \subset X \times Y$, and $x \in X$, $y \in Y$ then we define the $x$-section $E_x$ and $y$-section $E^y$ by $E_x = \{y \in Y \mid (x, y) \in E\}$, and $E^y = \{x \in X \mid (x, y) \in E\}$. Also, if $f : X \times Y \to \mathbb{C}$ we define the $x$-section $f_x$ and $y$-section $f^y$ by $f_x(y) = f^y(x) = f(x, y)$.

Proposition 2.5.2. Let $(X, \mathcal{M})$ and $(Y, \mathcal{N})$ be measurable spaces. If $E \in \mathcal{M} \otimes \mathcal{N}$ then $E_x \in \mathcal{N}$ and $E^y \in \mathcal{M}$ for all $x \in X$ and $y \in Y$. Also, if $f : X \times Y \to \mathbb{C}$ is $\mathcal{M} \otimes \mathcal{N}$-measurable then $f_x$ is $\mathcal{N}$-measurable and $f^y$ is $\mathcal{M}$-measurable for all $x \in X$ and $y \in Y$.

Proof. We let $\Sigma$ denote the collection of subsets $E \subset X \times Y$ such that $E_x \in \mathcal{N}$ and $E^y \in \mathcal{M}$ for all $x \in X$ and $y \in Y$. Then $\Sigma$ contains all sets of the form $A \times B$ with $A \in \mathcal{M}$ and $B \in \mathcal{N}$. Since $(\cup_{n=1}^\infty E_n)_x = \cup_{n=1}^\infty (E_n)_x$ and $(E^c)_x = (E_x)^c$ (and similarly for $y$) it follows that $\Sigma$ is a σ-algebra and hence must contain $\mathcal{M} \otimes \mathcal{N}$.

If $f : X \times Y \to \mathbb{C}$ is $\mathcal{M} \otimes \mathcal{N}$-measurable, then as $(f_x)^{-1}(C) = (f^{-1}(C))_x$ (and similarly for $y$) it then follows that $f_x$ and $f^y$ are measurable.

If $X$ is a set, then a family $\mathcal{E} \subset 2^X$ is called a monotone class if $X \in \mathcal{E}$ and $\mathcal{E}$ is closed under countable monotone unions and intersections, i.e., whenever $\{E_n\}_{n=1}^\infty \subset \mathcal{E}$ with $E_1 \subset E_2 \subset \cdots$ then we have $\cup_{n=1}^\infty E_n \in \mathcal{E}$, and whenever $\{E_n\}_{n=1}^\infty \subset \mathcal{E}$ with $E_1 \supset E_2 \supset \cdots$ then we have $\cap_{n=1}^\infty E_n \in \mathcal{E}$. Given a family of monotone classes it is clear that the intersection is again a monotone class. Thus for any collection of sets $\mathcal{E}_0$ there exists a smallest monotone class which contains $\mathcal{E}_0$; we call this the monotone class generated by $\mathcal{E}_0$.

Lemma 2.5.3 (The monotone class lemma). Suppose A ⊂ 2X is an algebra, then the monotone class generated by A coincides with the σ-algebra generated by A.

Proof. We let $\mathcal{M}$ denote the monotone class generated by $\mathcal{A}$. Since a σ-algebra is a monotone class, it suffices to show that $\mathcal{M}$ is a σ-algebra. For this it suffices to show that $\mathcal{M}$ is closed under taking complements and finite unions, since if $\{E_n\}_{n=1}^\infty \subset \mathcal{M}$ then $\{\cup_{n=1}^N E_n\}_{N=1}^\infty$ is a monotone increasing sequence and hence if these finite unions are in $\mathcal{M}$ then so is $\cup_{n=1}^\infty E_n$. For $E \in \mathcal{M}$ we set $K(E) = \{F \in \mathcal{M} \mid E \setminus F, F \setminus E, E \cup F \in \mathcal{M}\}$. We will show that $K(E) = \mathcal{M}$ for each $E \in \mathcal{M}$. This is shown by the following six steps, each of which is easily verified:

1. If F ∈ K(E) then E ∈ K(F ). (This follows from symmetry in the definition of K(E) and K(F ).)

2. If E ∈ A then A ⊂ K(E). (This follows since A ⊂ M).

3. K(E) is a monotone class for all E ∈ M. (This follows since M is a monotone class).

4. If E ∈ A then K(E) = M. (This follows from (2) and (3) since M is the smallest monotone class which contains A).

5. A ⊂ K(E) for all E ∈ M. (This follows from (4) and (1)).

6. K(E) = M for all E ∈ M. (This follows from (5) and (3)).

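Lemma 2.5.4. Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are σ-finite measure spaces, and $E \in \mathcal{M} \otimes \mathcal{N}$. Then the functions $x \mapsto \nu(E_x)$ and $y \mapsto \mu(E^y)$ are measurable and
$$\mu \times \nu(E) = \int \nu(E_x) \, d\mu(x) = \int \mu(E^y) \, d\nu(y).$$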

Proof. We first consider the case when $\mu$ and $\nu$ are finite. We let $\Sigma$ denote the family of sets in $\mathcal{M} \otimes \mathcal{N}$ such that the conclusion of the lemma holds. Then from the argument in the proof of Proposition 2.5.1 we see that $\Sigma$ contains the algebra $\mathcal{A}$ generated by sets of the form $E \times F$ with $E \in \mathcal{M}$ and $F \in \mathcal{N}$. If $\{E_n\}_{n=1}^\infty \subset \Sigma$ is such that $E_n \subset E_{n+1}$ for all $n \in \mathbb{N}$, then we have that $\cup_{n=1}^\infty (E_n)_x = (\cup_{n=1}^\infty E_n)_x$ and so by the monotone convergence theorem we have
$$\int \nu((\cup_{n=1}^\infty E_n)_x) \, d\mu(x) = \lim_{N\to\infty} \int \nu((E_N)_x) \, d\mu(x) = \lim_{N\to\infty} \mu \times \nu(E_N) = \mu \times \nu(\cup_{n=1}^\infty E_n).$$
And we similarly have $\int \mu((\cup_{n=1}^\infty E_n)^y) \, d\nu(y) = \mu \times \nu(\cup_{n=1}^\infty E_n)$. Thus, $\cup_{n=1}^\infty E_n \in \Sigma$. Since $\mu$ and $\nu$ are finite a similar argument shows that if $E_n \supset E_{n+1}$ for all $n \in \mathbb{N}$, then $\cap_{n=1}^\infty E_n \in \Sigma$. Therefore, $\Sigma$ is a monotone class which contains $\mathcal{A}$ and by the monotone class lemma we have that $\mathcal{M} \otimes \mathcal{N} = \Sigma$.

If $\mu$ and $\nu$ are σ-finite, say $X = \cup_{n=1}^\infty X_n$ and $Y = \cup_{n=1}^\infty Y_n$ with $\mu(X_n), \nu(Y_n) < \infty$, then the result follows by first restricting to $X_n \times Y_n$ and then using the monotone convergence theorem as we did above.

Theorem 2.5.5 (The Fubini-Tonelli Theorem). Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are measure spaces, and $f : X \times Y \to \mathbb{C}$ is $\mathcal{M} \otimes \mathcal{N}$-measurable. Consider the following conditions:

1. $f \in L^1(X \times Y, \mu \times \nu)$.

2. For almost every $x \in X$, $f_x \in L^1(Y, \nu)$, and the function $f_{L^1(Y)}(x) = \|f_x\|_{L^1(Y)}$ is in $L^1(X)$.

3. For almost every $y \in Y$, $f^y \in L^1(X, \mu)$, and the function $f_{L^1(X)}(y) = \|f^y\|_{L^1(X)}$ is in $L^1(Y)$.

Then (1) implies both (2) and (3), and if $\mu$ and $\nu$ are σ-finite then all three conditions are equivalent. Moreover, if (1) (and hence also (2) and (3)) is satisfied then we have
$$\int f \, d(\mu \times \nu) = \int \int f(x, y) \, d\nu(y) \, d\mu(x) = \int \int f(x, y) \, d\mu(x) \, d\nu(y). \qquad (2.6)$$

Proof. We first consider the case when $\mu$ and $\nu$ are σ-finite. If $f$ is a characteristic function then the result follows from Lemma 2.5.4. By linearity we then have the result for simple functions. Suppose now that $f \in M(X \times Y)_+$. Then there exists an increasing sequence of simple functions $\varphi_n$ which are valued in the non-negative reals so that $\varphi_n(x, y) \to f(x, y)$ for all $(x, y) \in X \times Y$. By the monotone convergence theorem we then have
$$\int f \, d(\mu \times \nu) = \lim_{n\to\infty} \int \varphi_n \, d(\mu \times \nu) = \lim_{n\to\infty} \int \int \varphi_n(x, y) \, d\nu(y) \, d\mu(x) = \int \int f(x, y) \, d\nu(y) \, d\mu(x).$$
We similarly have $\int f \, d(\mu \times \nu) = \int \int f(x, y) \, d\mu(x) \, d\nu(y)$. Thus, for non-negative valued functions we see that the three conditions above are equivalent and that (2.6) holds. From linearity we then get the result for general measurable functions when $\mu$ and $\nu$ are σ-finite.

If $\mu$ or $\nu$ is not σ-finite but $f \in L^1(X \times Y, \mu \times \nu)$, then we see that $G_n = \{(x, y) \mid |f(x, y)| \ge 1/n\}$ must have finite measure for all $n \ge 1$. Therefore, there exist $E_k \in \mathcal{M}$ and $F_k \in \mathcal{N}$ so that $G_n \subset \cup_{k=1}^\infty E_k \times F_k \subset (\cup_{k=1}^\infty E_k) \times (\cup_{k=1}^\infty F_k)$, and $\sum_{k=1}^\infty \mu(E_k)\nu(F_k) < \infty$. In other words, we have $G_n \subset E \times F$ where $E$ and $F$ are σ-finite. It then follows that there exist σ-finite sets $\tilde{E}$ and $\tilde{F}$ so that $\{(x, y) \in X \times Y \mid f(x, y) \ne 0\} = \cup_{n=1}^\infty G_n \subset \tilde{E} \times \tilde{F}$. Restricting to the σ-finite measure spaces $(\tilde{E}, \mu_{\tilde{E}})$ and $(\tilde{F}, \nu_{\tilde{F}})$ we see that the result then follows from the σ-finite case.

We give the following two examples, which show that the hypotheses of the Fubini-Tonelli theorem are necessary:

Example 2.5.6. Consider $[0, 1]$ with Lebesgue measure, and let $f(x, y) = \frac{x^2 - y^2}{(x^2 + y^2)^2}$. Then fixing $x \ne 0$ we have

$$\int_0^1 f(x, y) \, dy = \int_0^1 \frac{x^2 + y^2 - 2y^2}{(x^2 + y^2)^2} \, dy = \int_0^1 \frac{1}{x^2 + y^2} \, dy + \int_0^1 y \, d\!\left(\frac{1}{x^2 + y^2}\right)$$
$$= \int_0^1 \frac{1}{x^2 + y^2} \, dy + \left.\frac{y}{x^2 + y^2}\right|_{y=0}^{1} - \int_0^1 \frac{1}{x^2 + y^2} \, dy = \frac{1}{x^2 + 1}.$$
We therefore have
$$\int_0^1 \int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2} \, dy \, dx = \tan^{-1}(x)\Big|_{x=0}^{1} = \frac{\pi}{4},$$
and as $f(y, x) = -f(x, y)$ we have $\int_0^1 \int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2} \, dx \, dy = -\frac{\pi}{4}$. We must therefore have that $\frac{x^2 - y^2}{(x^2 + y^2)^2} \notin L^1([0, 1]^2, \lambda^2)$.

Example 2.5.7. Consider $[0, 1]$ with its Borel σ-algebra $\mathcal{B}$, and consider Lebesgue measure $\lambda$ on $[0, 1]$, and also counting measure $\mu$ on $[0, 1]$. In the product space $([0, 1]^2, \mathcal{B} \otimes \mathcal{B}, \lambda \times \mu)$ we may consider the measurable subset $\Delta = \{(x, x) \mid x \in [0, 1]\}$, and set $f = 1_\Delta$. Then for every $x \in [0, 1]$ we have $f_x \in L^1([0, 1], \mu)$ with $\int f(x, y) \, d\mu(y) = 1$, so that $\int \int f(x, y) \, d\mu(y) \, d\lambda(x) = 1$. Similarly, for every $y \in [0, 1]$ we have $f^y \in L^1([0, 1], \lambda)$ with $\int f(x, y) \, d\lambda(x) = 0$, so that $\int \int f(x, y) \, d\lambda(x) \, d\mu(y) = 0$. So also in this case the iterated integrals do not agree. We must therefore have that $\lambda \times \mu(\Delta) = \infty$, and we see that for non-σ-finite spaces the iterated integrals need not agree even for functions valued in the non-negative reals.
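The computation in Example 2.5.6 can be sanity-checked numerically. The following is a minimal sketch, assuming a simple midpoint rule (a choice made here for illustration, not something used in the notes): it compares the inner integral at the sample point $x = 1/2$ with the closed form $1/(x^2 + 1)$, and then integrates the closed form over $[0, 1]$.

```python
import math


def f(x: float, y: float) -> float:
    """The integrand (x^2 - y^2) / (x^2 + y^2)^2 from Example 2.5.6."""
    return (x * x - y * y) / (x * x + y * y) ** 2


def midpoint(g, a: float, b: float, n: int = 10_000) -> float:
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def inner_integral(x: float) -> float:
    """Numerical value of the inner integral of f(x, y) dy over [0, 1], for x != 0."""
    return midpoint(lambda y: f(x, y), 0.0, 1.0)


# Inner integral at the (arbitrarily chosen) sample point x = 1/2.
x0 = 0.5
inner_numeric = inner_integral(x0)
inner_closed_form = 1.0 / (x0 * x0 + 1.0)

# Outer integral of the closed form 1 / (x^2 + 1) over [0, 1].
outer_numeric = midpoint(lambda x: 1.0 / (x * x + 1.0), 0.0, 1.0)

if __name__ == "__main__":
    print(inner_numeric, inner_closed_form)
    print(outer_numeric, math.pi / 4)
```

By the antisymmetry $f(y, x) = -f(x, y)$, the same computation with the roles of $x$ and $y$ exchanged yields the opposite sign, which is the disagreement between the two iterated integrals.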

If $\lambda$ is Lebesgue measure on $\mathbb{R}$ and $n \ge 1$, then Lebesgue measure on $\mathbb{R}^n$ is defined to be $\lambda^n$. When there is no danger of confusion we will just write $\lambda$ for $\lambda^n$.

Theorem 2.5.8. If $E \subset \mathbb{R}^n$ is Borel and $t \in \mathbb{R}^n$, then $\lambda(E + t) = \lambda(E)$.

Proof. If $E$ is a disjoint union of products of intervals then the formula $\lambda(E + t) = \lambda(E)$ clearly holds. As such sets form an algebra which generates the Borel σ-algebra, and since $E \mapsto \lambda(E + t)$ gives a Borel measure, the theorem then follows from uniqueness in Carathéodory's extension theorem.

Theorem 2.5.9. If $T \in GL_n(\mathbb{R})$ then $T_*\lambda = |\det T|^{-1} \lambda$, i.e., for all Borel sets $E \subset \mathbb{R}^n$ we have
$$\lambda(T^{-1}(E)) = |\det T|^{-1} \lambda(E). \qquad (2.7)$$

Proof. Since every invertible matrix can be row reduced to the identity matrix it follows that every linear transformation $T \in GL_n(\mathbb{R})$ is a composition of elementary matrices of the following types:

1. T1(x1, . . . , xj, . . . , xn) = (x1, . . . , cxj, . . . , xn) with c 6= 0.

2. T2(x1, . . . , xj, . . . , xn) = (x1, . . . , xj + cxi, . . . , xn) with i 6= j.

3. $T_3(x_1, \ldots, x_i, \ldots, x_j, \ldots, x_n) = (x_1, \ldots, x_j, \ldots, x_i, \ldots, x_n)$.

Also, if we can show that (2.7) holds for matrices of the above type then as the determinant is multiplicative it also holds for their composition. Thus, it is enough to verify (2.7) for matrices of the above type. These all follow easily from the Fubini-Tonelli theorem. Writing
$$\lambda(E) = \int \cdots \int 1_E(x_1, \ldots, x_n) \, d\lambda(x_1) \cdots d\lambda(x_n)$$
we see that (2.7) holds for matrices of the first and second type by their corresponding formulas in one dimension, while matrices of the third type just correspond to changing the order of integration.
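Formula (2.7) can be checked exactly in a small case. The sketch below is only an illustration: the matrix $T$ and the choice of $E$ as the unit square in $\mathbb{R}^2$ are assumptions made here, and the area of the parallelogram $T^{-1}(E)$ is computed with the shoelace formula in exact rational arithmetic.

```python
from fractions import Fraction


def inverse_2x2(T):
    """Return (T^{-1}, det T) for an invertible 2x2 integer matrix T."""
    (a, b), (c, d) = T
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("T is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]], det


def apply(M, v):
    """Apply the 2x2 matrix M to the vector v."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    total = Fraction(0)
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2


# Hypothetical example: T has determinant 6, E is the unit square.
T = [[2, 1], [0, 3]]
T_inv, det_T = inverse_2x2(T)
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]

# T^{-1}(E) is the parallelogram spanned by the images of the vertices.
preimage = [apply(T_inv, v) for v in unit_square]
area = polygon_area(preimage)

if __name__ == "__main__":
    print(area, 1 / abs(det_T))
```

Here $\lambda(E) = 1$, so (2.7) predicts that the area of $T^{-1}(E)$ equals $|\det T|^{-1}$.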

2.5.1 Exercises

Exercise 2.5.10. Consider $\mathbb{N}$ with the counting measure, and consider the function $f : \mathbb{N}^2 \to \mathbb{C}$ given by $f(n, m) = 1$ if $n = m$, $f(n, m) = -1$ if $n = m + 1$, and $f(n, m) = 0$ otherwise. Then $\sum_{n=1}^\infty \left( \sum_{m=1}^\infty f(n, m) \right)$ and $\sum_{m=1}^\infty \left( \sum_{n=1}^\infty f(n, m) \right)$ both exist but are not equal.
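The two iterated sums in Exercise 2.5.10 can be explored numerically. The following is a rough sketch, not a proof: each inner sum has only finitely many non-zero terms, so it is computed exactly over a finite window, while the outer sums are truncated at a cutoff `N` (a name chosen here for illustration).

```python
def f(n: int, m: int) -> int:
    """The function from Exercise 2.5.10 on pairs of positive integers."""
    if n == m:
        return 1
    if n == m + 1:
        return -1
    return 0


def inner_sum_over_m(n: int) -> int:
    """Exact value of the sum over m of f(n, m).

    f(n, .) vanishes unless m is n - 1 or n, so the window 1..n+1 is enough.
    """
    return sum(f(n, m) for m in range(1, n + 2))


def inner_sum_over_n(m: int) -> int:
    """Exact value of the sum over n of f(n, m).

    f(., m) vanishes unless n is m or m + 1, so the window 1..m+2 is enough.
    """
    return sum(f(n, m) for n in range(1, m + 3))


def iterated_sums(N: int):
    """Partial outer sums up to N of the two iterated sums."""
    sum_m_first = sum(inner_sum_over_m(n) for n in range(1, N + 1))
    sum_n_first = sum(inner_sum_over_n(m) for m in range(1, N + 1))
    return sum_m_first, sum_n_first


if __name__ == "__main__":
    for N in (1, 10, 100):
        print(N, iterated_sums(N))
```

Note that truncating both indices to the same finite square would always give equal values; the discrepancy only appears because each inner sum is taken over all of $\mathbb{N}$ before the outer sum is formed.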

2.6 Signed and complex measures

In this section we extend the notion of a measure to allow set functions which may give negative, or even complex, values. The Hahn and Jordan decomposition theorems below, together with the polar decomposition theorem for complex measures, give the main tools to relate this more general setting to the non-negative valued case we have already considered.

2.6.1 Signed measures

A signed measure on a measurable space $(X, \mathcal{M})$ is a function $\nu : \mathcal{M} \to [-\infty, \infty]$ such that

1. at most one of the values in $\{-\infty, \infty\}$ is attained;

2. $\nu(\emptyset) = 0$;

3. if $\{E_n\}_{n=1}^\infty$ are pairwise disjoint measurable sets then
$$\nu(\cup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \nu(E_n),$$
where this series converges absolutely if $\nu(\cup_{n=1}^\infty E_n)$ is finite.

If $\mu_1$ and $\mu_2$ are measures on $(X, \mathcal{M})$, at least one of which is finite, then we obtain a signed measure $\mu_1 - \mu_2$ on $(X, \mathcal{M})$ by setting $(\mu_1 - \mu_2)(E) = \mu_1(E) - \mu_2(E)$ for any $E \in \mathcal{M}$.

If $\nu$ is a signed measure on $(X, \mathcal{M})$, and $E \in \mathcal{M}$, then we say that $E$ is positive (resp. negative, null) with respect to $\nu$ if for all measurable $F \subset E$ we have $\nu(F) \ge 0$ (resp. $\le 0$, $= 0$). Note that positive (resp. negative, null) sets are preserved under taking measurable subsets, and also under taking countable unions.

If $\mu$ is a measure on $(X, \mathcal{M})$ and $E \in \mathcal{M}$ with $\mu(E) < \infty$, then we may obtain a signed measure on $(X, \mathcal{M})$ by setting $\nu(F) = \mu(F \cap E^c) - \mu(F \cap E)$. In this case $E$ is a negative set, $E^c$ is a positive set, and a set is null for $\nu$ if and only if it is null for $\mu$. The Hahn decomposition theorem shows that every signed measure (or its negative) arises in this way.

Lemma 2.6.1. Let $\nu$ be a signed measure on $(X, \mathcal{M})$, and suppose $E \in \mathcal{M}$ is such that $-\infty < \nu(E) < 0$. Then there exists a negative set $N \subset E$ with $\nu(N) < 0$.

Proof. Note first that there does not exist a measurable subset $F \subset E$ with $\nu(F) = \infty$, since otherwise we would have $\nu(E) = \nu(F) + \nu(E \setminus F) = \infty$. We inductively define a non-decreasing sequence of numbers $\{n_k\}_{k=0}^\infty \subset \mathbb{N} \cup \{\infty\}$, and pairwise disjoint subsets $\{E_k\}_{k=0}^\infty$ of $E$ as follows: Set $E_0 = \emptyset$ and $n_0 = 1$. Having defined $n_0, \ldots, n_k$ and $E_0, \ldots, E_k$ we let $n_{k+1}$ denote the smallest integer such that there exists a measurable subset $E_{k+1} \subset E \setminus (\cup_{j=0}^k E_j)$ with $\nu(E_{k+1}) \ge 1/n_{k+1}$. If no such number exists we set $n_{k+1} = \infty$ and $E_{k+1} = \emptyset$.

We then have that $\infty > \nu(\cup_{k=0}^\infty E_k) = \sum_{k=0}^\infty \nu(E_k) \ge \sum_{k=1}^\infty \frac{1}{n_k} \ge 0$ (here we use the convention $\frac{1}{\infty} = 0$). Since this series converges we must have $n_k \to \infty$. We set $N = E \setminus (\cup_{k=0}^\infty E_k)$; then we have $\nu(N) \le \nu(E) < 0$. If $F \subset N$ is measurable then by our choice of $n_k$ we must have $\nu(F) \le 1/(n_k - 1)$ for each $k$. Since $n_k \to \infty$ this then shows that $\nu(F) \le 0$, and hence $N$ is negative.

Theorem 2.6.2 (The Hahn decomposition theorem). Let $(X, \mathcal{M})$ be a measurable space and let $\nu$ be a signed measure on $(X, \mathcal{M})$. Then there exists a positive set $P \in \mathcal{M}$ so that $N = P^c$ is a negative set. Moreover, if $\tilde{P}$ is another positive set such that $\tilde{P}^c$ is a negative set, then we have that $P \Delta \tilde{P}$ is null.

Proof. We may assume that the value $-\infty$ is never obtained (otherwise consider $-\nu$). We let $a = \inf\{\nu(E) \mid E \in \mathcal{M}, E \text{ negative}\} \le 0$. We take negative sets $E_n \in \mathcal{M}$ so that $\nu(E_n) \to a$, and we set $N = \cup_{n=1}^\infty E_n$. Then $N$ is a negative set, and we have $\nu(N) = \nu(E_n) + \nu(N \setminus E_n) \le \nu(E_n)$ for each $n \ge 1$, hence $\nu(N) = a$.

We claim that $N^c$ is positive. Otherwise, there would exist a measurable set $E \subset N^c$ with $-\infty < \nu(E) < 0$, and by the previous lemma there would then exist a negative set $N_0 \subset E$ with $\nu(N_0) < 0$. However, we would then have that $N \cup N_0$ is negative and $\nu(N \cup N_0) = \nu(N) + \nu(N_0) < a$, contradicting our definition of $a$. This then finishes the existence part of the theorem.

Suppose now that $\tilde{N}$ is another negative set such that $\tilde{N}^c$ is positive, and let $F \subset \tilde{N} \setminus N$ be measurable. Then $F \subset \tilde{N}$ hence $\nu(F) \le 0$, and $F \subset N^c$ hence $\nu(F) \ge 0$. Therefore $\nu(F) = 0$ and $\tilde{N} \setminus N$ is a null set. We similarly have that $N \setminus \tilde{N}$ is null and hence so is $N \Delta \tilde{N}$.

If (X, M) is a measurable space, then two (signed) measures µ, η on (X, M) are singular, which we write as µ ⊥ η, if there exists E ∈ M so that E is a conull set for µ and Ec is a conull set for η. Note that this is a symmetric relation.

Theorem 2.6.3 (The Jordan decomposition theorem). Let (X, M) be a measurable space and let ν be a signed measure on (X, M). Then there exist unique singular measures ν−, ν+ on (X, M), at least one of which is ﬁnite, so that ν = ν+ − ν−.

Proof. From the Hahn decomposition theorem there exists a positive set $P \in \mathcal{M}$ so that $N = P^c$ is a negative set. We define $\nu_+$ by $\nu_+(E) = \nu(E \cap P)$ for all $E \in \mathcal{M}$ and we define $\nu_-$ by $\nu_-(E) = -\nu(E \cap N)$ for all $E \in \mathcal{M}$. That these define measures is easily seen from the definition of a signed measure. Moreover, we have $\nu(E) = \nu(E \cap P) + \nu(E \cap N) = \nu_+(E) - \nu_-(E)$ for all $E \in \mathcal{M}$, hence at least one of $\nu_-$ or $\nu_+$ is finite and we have $\nu = \nu_+ - \nu_-$. Since $\nu_+(N) = \nu(N \cap P) = -\nu_-(P) = 0$ we have $\nu_+ \perp \nu_-$.

Suppose now that $\eta_1, \eta_2$ are singular measures on $(X, \mathcal{M})$, at least one of which is finite, such that $\nu = \eta_1 - \eta_2$. Take $E \in \mathcal{M}$ so that $E$ is a conull set for $\eta_1$ and $E^c$ is a conull set for $\eta_2$. Then we clearly have that $E$ is a positive set for $\nu$, and $E^c$ is a negative set for $\nu$. Therefore, $P \Delta E$ is a null set for $\nu$ by the uniqueness part of the Hahn decomposition theorem. If we have $F \in \mathcal{M}$ then we have

η1(F ) = η1(F ∩ E) = ν(F ∩ E) = ν(F ∩ P ) = ν+(F ).

Hence, η1 = ν+. We similarly have that η2 = ν− which then shows uniqueness.

If $\nu$ is a signed measure on $(X, \mathcal{M})$ then the measures $\nu_+$ and $\nu_-$ are called respectively the positive and negative variations of $\nu$. The measure $|\nu| = \nu_+ + \nu_-$ is called the absolute variation of $\nu$. We also set $\|\nu\| = |\nu|(X)$ and call this the total variation of $\nu$. It is easy to see that the absolute variation satisfies
$$|\nu|(A) = \sup \sum_{k=1}^\infty |\nu(E_k)|, \qquad (2.8)$$
where the supremum is taken over all measurable partitions $\{E_k\}_{k=1}^\infty$ of $A$.
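On a finite set, where every subset is measurable, the Hahn and Jordan decompositions can be written down directly and verified by brute force. The sketch below is an illustration under that assumption; the point weights are hypothetical values chosen here, and $\nu(E)$ is the sum of the weights of the points of $E$.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical signed measure on X = {a, b, c, d}, given by point weights.
weights = {
    "a": Fraction(2),
    "b": Fraction(-3),
    "c": Fraction(1, 2),
    "d": Fraction(-1),
}


def nu(E):
    """The signed measure of E: the sum of the weights of its points."""
    return sum((weights[x] for x in E), Fraction(0))


def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    S = list(S)
    for r in range(len(S) + 1):
        for combo in combinations(S, r):
            yield frozenset(combo)


# Hahn decomposition: P collects the points of non-negative weight.
P = frozenset(x for x, w in weights.items() if w >= 0)
N = frozenset(weights) - P


def nu_plus(E):
    """Positive variation: nu restricted to P."""
    return nu(frozenset(E) & P)


def nu_minus(E):
    """Negative variation: minus nu restricted to N."""
    return -nu(frozenset(E) & N)


def abs_variation(E):
    """Absolute variation |nu| = nu_+ + nu_-."""
    return nu_plus(E) + nu_minus(E)


if __name__ == "__main__":
    print(sorted(P), sorted(N), abs_variation(weights))
```

For this example the supremum in (2.8) is attained by the partition of $A$ into singletons, so $|\nu|(A)$ is the sum of the absolute values of the weights of the points of $A$.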

2.6.2 Complex measures

A complex measure on a measurable space $(X, \mathcal{M})$ is a set function $\nu : \mathcal{M} \to \mathbb{C}$ such that

1. $\nu(\emptyset) = 0$;

2. if $\{E_n\}_{n=1}^\infty$ are pairwise disjoint measurable sets then
$$\nu(\cup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \nu(E_n),$$
where this series converges absolutely.

Given a complex measure $\nu$, we may consider the real and imaginary parts $\mathrm{Re}(\nu)$, $\mathrm{Im}(\nu)$, which give signed measures such that $\nu(E) = \mathrm{Re}(\nu)(E) + i\,\mathrm{Im}(\nu)(E)$ for each $E \in \mathcal{M}$. The absolute variation of $\nu$ is the set function $|\nu| : \mathcal{M} \to [0, \infty]$ given by equation (2.8). The total variation of $\nu$ is given by $\|\nu\| = |\nu|(X)$.

Proposition 2.6.4. Let $\nu$ be a complex measure on $(X, \mathcal{M})$. Then $|\nu|$ is a measure on $(X, \mathcal{M})$, $\|\nu\| < \infty$, and for all $A \in \mathcal{M}$ we have $|\nu(A)| \le |\nu|(A)$.

Proof. Suppose $\{A_n\}_{n=1}^\infty \subset \mathcal{M}$ is a sequence of pairwise disjoint sets. If $\{E_k\}_{k=1}^\infty$ gives a measurable partition of $\cup_{n=1}^\infty A_n$, then for each $n \ge 1$ we have a measurable partition of $A_n$ given by $\{E_k \cap A_n\}_{k=1}^\infty$. We therefore have
$$\sum_{n=1}^\infty |\nu|(A_n) \ge \sum_{n=1}^\infty \sum_{k=1}^\infty |\nu(E_k \cap A_n)| \ge \sum_{k=1}^\infty \left| \sum_{n=1}^\infty \nu(E_k \cap A_n) \right| = \sum_{k=1}^\infty |\nu(E_k)|.$$
Taking supremums over all such partitions then gives $\sum_{n=1}^\infty |\nu|(A_n) \ge |\nu|(\cup_{n=1}^\infty A_n)$. Also, if $\varepsilon > 0$, and if $\{E_k^n\}_{k=1}^\infty$ is a measurable partition of $A_n$ such that $\sum_{k=1}^\infty |\nu(E_k^n)| + \varepsilon 2^{-n} \ge |\nu|(A_n)$, then we have
$$|\nu|(\cup_{n=1}^\infty A_n) + \varepsilon \ge \sum_{n,k=1}^\infty |\nu(E_k^n)| + \varepsilon \ge \sum_{n=1}^\infty |\nu|(A_n).$$
Therefore, $|\nu|(\cup_{n=1}^\infty A_n) \ge \sum_{n=1}^\infty |\nu|(A_n)$, and hence $|\nu|$ is a measure.

We have $\|\nu\| \le \|\mathrm{Re}(\nu)\| + \|\mathrm{Im}(\nu)\|$, and from the Hahn decomposition theorem we see that $\|\mathrm{Re}(\nu)\| + \|\mathrm{Im}(\nu)\| < \infty$. Hence, $\|\nu\| < \infty$. Also, if $E \in \mathcal{M}$ then the inequality $|\nu(E)| \le |\nu|(E)$ follows easily from the definition of $|\nu|$.

2.6.3 Exercises

Exercise 2.6.5. Consider $[0, 1]$ with the Borel σ-algebra. Let $\nu$ be counting measure and $\mu$ be Lebesgue measure on $[0, 1]$. Then there do not exist Borel measures $\nu_0, \nu_1$ on $[0, 1]$ so that $\nu = \nu_0 + \nu_1$, $\nu_0 \perp \mu$, and $\nu_1 \ll \mu$.

If $\mu$ is a measure on $(X, \mathcal{M})$ and $f \in L^1(X, \mu)$ then we obtain a complex valued measure $f\mu$ by $(f\mu)(E) = \int_E f \, d\mu$.

Exercise 2.6.6. If $f \in L^1(X, \mu)$ then $|f\mu| = |f|\mu$, and $\|f\mu\| = \|f\|_1$.

We let Mb(X) denote the space of all complex valued measures on (X, M).

Exercise 2.6.7. The map $\nu \mapsto \|\nu\|$ gives a norm on $M_b(X)$, and with this norm $M_b(X)$ is a Banach space.

2.7 The Radon-Nikodym Theorem

If $\mu$ and $\nu$ are measures on $(X, \mathcal{M})$, then $\nu$ is absolutely continuous with respect to $\mu$ (and we write $\nu \ll \mu$) if every $\mu$-null set is also a $\nu$-null set. The terminology is justified by the following proposition:

Proposition 2.7.1. Suppose $\mu$ and $\nu$ are measures on $(X, \mathcal{M})$ with $\nu$ finite. Then $\nu \ll \mu$ if and only if for every $\varepsilon > 0$ there exists $\delta > 0$ so that if $E \in \mathcal{M}$ with $\mu(E) < \delta$ then we have $\nu(E) < \varepsilon$.

Proof. Clearly the above condition implies that $\nu$ is absolutely continuous with respect to $\mu$, thus we only need to show the converse. Suppose therefore that the condition above does not hold. Then there exist $\varepsilon > 0$ and $E_n \in \mathcal{M}$ so that $\mu(E_n) < 2^{-n}$ and $\nu(E_n) \ge \varepsilon$ for all $n \ge 1$. We let $F_k = \cup_{n=k}^\infty E_n$, and set $F = \cap_{k=1}^\infty F_k$. Then $F_k$ is decreasing and $\mu(F_k) \to 0$, so that $\mu(F) = 0$. However, $\nu(F_k) \ge \varepsilon$ for all $k \ge 1$, hence, as $\nu$ is finite, $\nu(F) \ge \varepsilon$, showing that $\nu$ is not absolutely continuous with respect to $\mu$.

Theorem 2.7.2 (The Lebesgue decomposition theorem). Suppose $\mu$ and $\nu$ are measures on $(X, \mathcal{M})$ such that $\nu$ is σ-finite. Then there exist unique measures $\nu_0$ and $\nu_1$ on $(X, \mathcal{M})$ so that $\nu = \nu_0 + \nu_1$, $\nu_0 \perp \mu$, and $\nu_1 \ll \mu$.

Proof. We first consider the case when $\nu(X) < \infty$. We let $\mathcal{N}$ denote the space of $\mu$-null sets, and we set $a = \sup\{\nu(E) \mid E \in \mathcal{N}\}$. We take $E_n \in \mathcal{N}$ so that $\nu(E_n) \to a$ and we set $E = \cup_{n=1}^\infty E_n$. Then $E \in \mathcal{N}$ and $\nu(E) \ge \nu(E_n)$ for all $n \ge 1$, hence $\nu(E) = a$.

We let $\nu_0$ be the measure given by $\nu_0(F) = \nu(F \cap E)$, and we let $\nu_1(F) = \nu(F \cap E^c)$. Then we clearly have $\nu = \nu_0 + \nu_1$, and we also have $\nu_0 \perp \mu$ since $E$ is a $\mu$-null set and $E^c$ is a $\nu_0$-null set. If $F \in \mathcal{N}$, then $F \cup E \in \mathcal{N}$ and hence $a \ge \nu(F \cup E) = \nu(F \cap E^c) + \nu(E) \ge a$, therefore we must have $\nu(F \cap E^c) = 0$. It therefore follows that $\nu_1(F) = \nu(F \cap E^c) = 0$ and hence $\nu_1 \ll \mu$.

If $\nu = \tilde{\nu}_0 + \tilde{\nu}_1$ is another decomposition with $\tilde{\nu}_0 \perp \mu$ and $\tilde{\nu}_1 \ll \mu$, then for all $F \in \mathcal{N}$ we have $\tilde{\nu}_0(F) = \nu(F) = \nu_0(F)$, and since $\tilde{\nu}_0 \perp \mu$ we then have that $\tilde{\nu}_0 = \nu_0$, and it follows that $\tilde{\nu}_1 = \nu_1$. This then finishes the theorem in the case when $\nu(X) < \infty$.

For the general case, we write $X = \cup_{n=1}^\infty E_n$ where $\{E_n\}_{n=1}^\infty \subset \mathcal{M}$ is a pairwise disjoint sequence with $\nu(E_n) < \infty$. We consider the restriction of $\nu$ to $E_n$ and from above there are unique measures $\nu_0^n, \nu_1^n$ which have the conull set $E_n$, whose sum is this restriction, and such that $\nu_0^n \perp \mu$ and $\nu_1^n \ll \mu$. If we set $\nu_0 = \sum_{n=1}^\infty \nu_0^n$ and $\nu_1 = \sum_{n=1}^\infty \nu_1^n$ then it is easy to see that $\nu_0$ and $\nu_1$ are the unique measures which satisfy the conclusion of the theorem.

Lemma 2.7.3. Suppose $\eta$ and $\mu$ are measures on $(X, \mathcal{M})$, with $\mu$ finite, $\eta \ne 0$, and such that $\eta \ll \mu$. Then there exist $\delta > 0$ and $E \in \mathcal{M}$ so that $\mu(E) > 0$ and $\eta \ge \delta\mu$ on $E$, i.e., $\eta(F) \ge \delta\mu(F)$ for all $F \in \mathcal{M}$, $F \subset E$.

Proof. For each $n \in \mathbb{N}$ we consider a Hahn decomposition $X = P_n \cup N_n$ of $\eta - \frac{1}{n}\mu$. We set $P = \cup_{n=1}^\infty P_n$ and $N = \cap_{n=1}^\infty N_n$, so that $P = N^c$. Since $N$ is a negative set for $\eta - \frac{1}{n}\mu$ for each $n$, we have $0 \le \eta(N) \le \frac{1}{n}\mu(N)$ for each $n$, and hence $\eta(N) = 0$, so that $\eta(P) > 0$. Since $\eta \ll \mu$ we then also have $\mu(P) > 0$. Therefore, for some $n$ we have $\mu(P_n) > 0$, and as $P_n$ is a positive set for $\eta - \frac{1}{n}\mu$ we have $\eta(F) \ge \frac{1}{n}\mu(F)$ for all $F \in \mathcal{M}$, $F \subset P_n$.

Theorem 2.7.4 (The Radon-Nikodym theorem). Let $\mu$ and $\nu$ be σ-finite measures on $(X, \mathcal{M})$ such that $\nu \ll \mu$. Then there exists a unique $f \in M(X, \mu)$ so that for all $E \in \mathcal{M}$ we have
$$\nu(E) = \int_E f \, d\mu. \qquad (2.9)$$

Proof. We first consider the case when $\mu$ and $\nu$ are finite. We set
$$\mathcal{F} = \left\{ f \in M(X, [0, \infty)) \;\middle|\; \int_E f \, d\mu \le \nu(E) \text{ for all } E \in \mathcal{M} \right\},$$
and
$$a = \sup_{f \in \mathcal{F}} \int f \, d\mu.$$
Note that if $f, g \in \mathcal{F}$ then $h = \max\{f, g\} \in \mathcal{F}$, since if we set $F_0 = \{x \in X \mid f(x) \ge g(x)\}$, then for $E \in \mathcal{M}$ we have $\int_E h \, d\mu \le \int_{F_0 \cap E} f \, d\mu + \int_{F_0^c \cap E} g \, d\mu \le \nu(F_0 \cap E) + \nu(F_0^c \cap E) = \nu(E)$.

We choose $f_n \in \mathcal{F}$ so that $\int f_n \, d\mu \to a$. Setting $h_n = \max\{f_1, \ldots, f_n\}$, we then have $h_n \in \mathcal{F}$, and $\{h_n\}_{n=1}^\infty$ is an increasing sequence. If we set $h = \lim_{n\to\infty} h_n$ then for each $E \in \mathcal{M}$ it follows from the monotone convergence theorem that $\int_E h \, d\mu = \lim_{n\to\infty} \int_E h_n \, d\mu \le \nu(E)$. Therefore $h \in \mathcal{F}$, and we have $a \ge \int h \, d\mu \ge \lim_{n\to\infty} \int f_n \, d\mu = a$, so that $\int h \, d\mu = a$.

We claim that $\nu = h\mu$. If not, then setting $\eta = \nu - h\mu$, we have $\eta \le \nu \ll \mu$, and $\eta \ne 0$, so that by Lemma 2.7.3 there exist $\delta > 0$ and $E \in \mathcal{M}$ with $\mu(E) > 0$ so that $\eta \ge \delta\mu$ on $E$, or equivalently, $h\mu + \delta\mu \le \nu$ on $E$. This would then show that $h + \delta 1_E \in \mathcal{F}$, and hence $a \ge \int (h + \delta 1_E) \, d\mu = a + \delta\mu(E) > a$, giving a contradiction. If $\tilde{h}$ were another such function then we would have $\int_E h \, d\mu = \int_E \tilde{h} \, d\mu$ for all $E \in \mathcal{M}$ and from this it follows that $h = \tilde{h}$, $\mu$-almost everywhere.

In general, since $\mu$ and $\nu$ are σ-finite, we may decompose $X$ as a countable disjoint union of measurable sets $X = \cup_{n=1}^\infty X_n$, such that $\mu(X_n), \nu(X_n) < \infty$. By the finite measure case above, there then exists a measurable function $f_n \in M(X, \mu)$ so that for all $E \in \mathcal{M}$ we have $\nu(E \cap X_n) = \int_{E \cap X_n} f_n \, d\mu$. If we set $f = \sum_{n=1}^\infty 1_{X_n} f_n$ then it is easy to see that for all $E \in \mathcal{M}$ we have $\nu(E) = \int_E f \, d\mu$. Uniqueness follows as in the finite case above.
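When $X$ is a finite set and every subset is measurable, the function $f$ in Theorem 2.7.4 can be written down explicitly as a ratio of point masses. The following is a small sketch under that assumption; the weights and helper names are hypothetical choices made here. On such a space $\nu \ll \mu$ means that $\nu(\{x\}) = 0$ whenever $\mu(\{x\}) = 0$.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite measures on X = {a, b, c}, given by point masses.
mu = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(0)}
nu = {"a": Fraction(3), "b": Fraction(1), "c": Fraction(0)}


def is_absolutely_continuous(nu, mu):
    """nu << mu on a finite set: nu vanishes on every mu-null point."""
    return all(nu[x] == 0 for x in mu if mu[x] == 0)


def radon_nikodym(nu, mu):
    """A function f with nu(E) = sum over x in E of f(x) * mu({x}).

    The value on mu-null points is arbitrary (here 0), reflecting that
    f is only determined mu-almost everywhere.
    """
    if not is_absolutely_continuous(nu, mu):
        raise ValueError("nu is not absolutely continuous with respect to mu")
    return {x: (nu[x] / mu[x] if mu[x] != 0 else Fraction(0)) for x in mu}


def measure(m, E):
    """The measure of E for a measure m given by point masses."""
    return sum((m[x] for x in E), Fraction(0))


def integrate(f, m, E):
    """The integral of f over E with respect to m."""
    return sum((f[x] * m[x] for x in E), Fraction(0))


def all_subsets(S):
    """All subsets of the finite set S, as tuples."""
    S = list(S)
    for r in range(len(S) + 1):
        yield from combinations(S, r)


f = radon_nikodym(nu, mu)

if __name__ == "__main__":
    for E in all_subsets(mu):
        print(E, measure(nu, E), integrate(f, mu, E))
```

The check over all subsets is a direct instance of (2.9); for general measure spaces no such pointwise formula is available, which is why the proof above proceeds by maximizing over the family $\mathcal{F}$.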

The function $f \in M(X, \mu)$ in the previous theorem is called the Radon-Nikodym derivative of $\nu$ with respect to $\mu$ and denoted by $\frac{d\nu}{d\mu}$. There is also a Radon-Nikodym theorem for complex measures:

Theorem 2.7.5 (The Radon-Nikodym theorem for complex measures). Let $\mu$ be a σ-finite measure on $(X, \mathcal{M})$ and let $\nu$ be a complex measure on $(X, \mathcal{M})$ which is absolutely continuous with respect to $\mu$. Then there exists a unique $f \in L^1(X, \mu)$ so that $\nu(E) = \int_E f \, d\mu$ for all $E \in \mathcal{M}$.

Proof. By considering the real and imaginary parts separately it is enough to consider the case when $\nu$ is a finite signed measure. We let $\nu = \nu_+ - \nu_-$ be the Jordan decomposition. Then $\nu_+$ and $\nu_-$ are both absolutely continuous with respect to $\mu$ and so by the Radon-Nikodym theorem there exist $f_+, f_- \in M(X, \mu)$ so that $\nu_\pm = f_\pm \mu$. Note that since $\nu_+$ and $\nu_-$ are finite measures we have that $f_+, f_- \in L^1(X, \mu)$. Setting $f = f_+ - f_-$, we then have $\nu = f\mu$ and $f \in L^1(X, \mu)$. Uniqueness follows just as in the previous theorem.

Corollary 2.7.6 (Polar decomposition for complex measures). Let µ be a complex measure on (X, M), then there exists f : X → T measurable such that µ = f|µ|. Moreover, if g : X → T is a measurable function such that µ = g|µ| then g = f |µ|-almost everywhere.

Proof. Since $\mu \ll |\mu|$ it follows from the Radon-Nikodym theorem that there exists a unique $f \in M(X, |\mu|)$ so that $\mu = f|\mu|$. We just need to show that $f(x) \in \mathbb{T}$ for $|\mu|$-almost every $x \in X$. Suppose this were not the case. Then there exists $\alpha \notin \mathbb{T}$ so that $\alpha$ is in the essential range of $f$. We take $\delta > 0$ so that $2\delta < d(\alpha, \mathbb{T})$. We set $E = f^{-1}(B(\delta, \alpha))$ so that $|\mu|(E) > 0$.

Assume first that $|\alpha| > 1$. Since $\mu = f|\mu|$ it then follows that for any measurable set $F \subset E$ we have $|\mu(F)| = |f|\mu|(F)| \ge (|\alpha| - \delta)|\mu|(F)$. If $\cup_{n=1}^\infty E_n$ gives a measurable partition of $E$ then we have $\sum_{n=1}^\infty |\mu(E_n)| \ge (|\alpha| - \delta) \sum_{n=1}^\infty |\mu|(E_n)$. Taking a supremum over all partitions gives $|\mu|(E) \ge (|\alpha| - \delta)|\mu|(E)$, a contradiction since $|\alpha| - \delta > 1$. If we had $|\alpha| < 1$, then a similar computation would show that $|\mu|(E) \le (|\alpha| + \delta)|\mu|(E)$ which is again a contradiction. We must therefore have that $f(x) \in \mathbb{T}$ for $|\mu|$-almost every $x \in X$.

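Lemma 2.7.7. A measure space $(X, \mathcal{M}, \mu)$ is semifinite if and only if for all $f \in M(X, \mu)$ we have
$$\|f\|_\infty = \sup\left\{ \left| \int fg \, d\mu \right| \;\middle|\; g \in L^1(X, \mu),\ \|g\|_1 \le 1 \right\}.$$
(Where we set $\|f\|_\infty = \infty$ if $f \notin L^\infty(X, \mu)$.)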

Proof. Suppose first that $\mu$ is semifinite. If $f \in M(X, \mu)$, take $w \in L^\infty(X, \mu; \mathbb{T})$ so that $wf = |f|$. Set $\alpha = \|f\|_\infty = \||f|\|_\infty$ and fix $\varepsilon > 0$. If $\alpha < \infty$ then $\alpha$ is in the essential range of $|f|$ and hence $F = |f|^{-1}((\alpha - \varepsilon, \alpha])$ has positive measure. We let $E \subset F$ be a measurable set with finite positive measure (which exists by semifiniteness) and set $g = \frac{1}{\mu(E)} 1_E$. Then $\|wg\|_1 = \|g\|_1 = 1$ and by Theorem 2.4.5 we have
$$\alpha \ge \left| \int fwg \, d\mu \right| = \frac{1}{\mu(E)} \int_E |f| \, d\mu \ge \alpha - \varepsilon.$$
So that $\|f\|_\infty = \sup\{|\int fg \, d\mu| \mid g \in L^1(X, \mu), \|g\|_1 \le 1\}$.

Similarly, if $\alpha = \infty$ then $F = |f|^{-1}((N, \infty))$ has positive measure for all $N > 0$, and the same argument above then shows that $\|f\|_\infty = \infty = \sup\{|\int fg \, d\mu| \mid g \in L^1(X, \mu), \|g\|_1 \le 1\}$.

Conversely, suppose $\mu$ is not semifinite. Then there exists $E \in \mathcal{M}$ so that $\mu(E) = \infty$, and for any measurable subset $F \subset E$ we have $\mu(F) \in \{0, \infty\}$, hence if $g \in L^1(X, \mu)$ we must have $g(x) = 0$ for almost every $x \in E$. Setting $f = 1_E$ we then have $\|f\|_\infty = 1$, while $\sup\{|\int fg \, d\mu| \mid g \in L^1(X, \mu), \|g\|_1 \le 1\} = 0$.

Theorem 2.7.8. Let $(X, \mathcal{M}, \mu)$ be a measure space and consider the map $\Psi : L^\infty(X, \mu) \to L^1(X, \mu)^*$ given by $\Psi(f)(g) = \int fg \, d\mu$. Then $\Psi$ is isometric if $\mu$ is semifinite, and $\Psi$ is surjective if $\mu$ has the essential suprema property.

Proof. That $\Psi$ maps into $L^1(X, \mu)^*$ follows from Theorem 2.4.5. From Lemma 2.7.7 we see that this map is injective if and only if $\mu$ is semifinite.

Suppose $\mu$ has the essential suprema property, and $\varphi \in L^1(X, \mu)^*$. Fix a measurable set $E \subset X$ so that $\mu(E) < \infty$. Then $F \mapsto \varphi(1_{F \cap E})$ defines a complex measure on $(X, \mathcal{M})$ which is absolutely continuous with respect to $\mu$ and so by the Radon-Nikodym theorem there exists a function $f_E \in M(X, \mu)$ so that $\varphi(1_{F \cap E}) = \int_{F \cap E} f_E \, d\mu$ for all $F \in \mathcal{M}$. Since $\mu(E) < \infty$ we have from Lemma 2.7.7 that $\|f_E\|_\infty \le \|\varphi\|$.

Note that by uniqueness in the Radon-Nikodym theorem we have that if $E_1, E_2 \in \mathcal{M}$ have finite measure then $f_{E_1}$ and $f_{E_2}$ agree almost everywhere on $E_1 \cap E_2$. If we let $f$ denote an essential supremum of $\{f_E \mid \mu(E) < \infty\}$ as in Proposition 2.2.5, then $\|f\|_\infty \le \|\varphi\|$ and for each $E \in \mathcal{M}$ with $\mu(E) < \infty$ we have $f(x) = f_E(x)$ for almost every $x \in E$.

It then follows that for every function $g \in L^1(X, \mu)$ such that $\mu(\{x \in X \mid g(x) \ne 0\}) < \infty$ we have $\varphi(g) = \int fg \, d\mu$. Since functions of this type are dense in $L^1(X, \mu)$ it follows that $\varphi(g) = \int fg \, d\mu$ for each $g \in L^1(X, \mu)$.

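2.7.1 Exercises

Exercise 2.7.9. Suppose $\lambda \ll \nu \ll \mu$. Then
$$\frac{d\lambda}{d\mu} = \frac{d\lambda}{d\nu} \frac{d\nu}{d\mu} \quad \mu\text{-almost everywhere}.$$

Exercise 2.7.10. If $\nu \ll \mu$ and $\mu \ll \nu$ then
$$\frac{d\mu}{d\nu} = \left( \frac{d\nu}{d\mu} \right)^{-1} \quad \mu\text{-almost everywhere}.$$

Exercise 2.7.11. If $\nu \ll \mu$ and $g \in M(X)$, then $g$ is $\nu$-integrable if and only if $g \frac{d\nu}{d\mu}$ is $\mu$-integrable, and in this case we have
$$\int g \frac{d\nu}{d\mu} \, d\mu = \int g \, d\nu.$$

Chapter 3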

Point set topology

3.1 Topological spaces

Let X be a set. A topology on X is a family T of subsets of X, which contains ∅ and X, and is closed under finite intersections and arbitrary unions. A topological space is a pair (X, T) consisting of a set X, together with a topology T on X. When T is understood we sometimes refer to the topological space X. The following are examples of topological spaces:

1. If X is any set then 2^X and {∅, X} are both topologies on X, called the discrete and trivial (or indiscrete) topologies respectively.

2. If (X, d) is a metric space and T consists of all open subsets of X then (X, T ) is a topological space. In this case we call (X, T ) metrizable.

3. If X is a set then T = {∅} ∪ {U ⊂ X | U^c is finite} gives a topology on X.
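On a finite set the axioms in the definition can be checked mechanically. The following is a minimal Python sketch and not part of the original notes; the function name is_topology and the sample families are hypothetical, chosen only for illustration. It relies on the fact that for a finite family, closure under pairwise unions and intersections already gives closure under arbitrary unions and finite intersections.

```python
from itertools import combinations

def is_topology(X, T):
    """Check the topology axioms for a family T of subsets of a finite set X."""
    X = frozenset(X)
    T = {frozenset(U) for U in T}
    # Every member of T must be a subset of X.
    if any(not U <= X for U in T):
        return False
    # T must contain the empty set and X itself.
    if frozenset() not in T or X not in T:
        return False
    # For a finite family, pairwise closure implies closure under
    # arbitrary unions and finite intersections.
    for U, V in combinations(T, 2):
        if U | V not in T or U & V not in T:
            return False
    return True

X = {1, 2, 3}
# Discrete topology: all subsets of X.
discrete = [set(s) for r in range(len(X) + 1) for s in combinations(X, r)]
# Trivial (indiscrete) topology.
trivial = [set(), X]
# Not a topology: the union {1, 2} is missing.
not_a_topology = [set(), {1}, {2}, X]
```

Note that on a finite set the cofinite topology of example 3 coincides with the discrete topology, so it only becomes interesting for infinite X, which this sketch cannot represent.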

Generalizing the case of metric spaces, we call the sets in T open sets, and we call a set closed if its complement is open. If a set is both open and closed then we say it is clopen. A set A ⊂ X is a Gδ-set if A is the intersection of countably many open sets, and a set B ⊂ X is an Fσ-set if B is the union of countably many closed sets.

If A ⊂ X, then the closure Ā of A is the intersection of all closed sets containing A, and hence is the smallest closed set containing A. The interior A° of A is the union of all open sets contained in A. The difference Ā \ A° is the boundary of A and is denoted by ∂A. If Ā = X then A is dense, and if (Ā)° = ∅ then A is nowhere dense. A topological space (X, T) is separable if it has a countable dense subset.

Given two topologies T1 and T2 on X such that T1 ⊂ T2, we say that T1 is weaker (or coarser) than T2, and T2 is stronger (or finer) than T1. Thus, the trivial topology is the coarsest topology, while the discrete topology is the finest topology.

If E ⊂ 2^X, then the intersection of all topologies on X which contain E is clearly a topology and is denoted by T(E). It is called the topology generated by E. For example:

1. if K = R^n, or C^n, then the Zariski topology on K is the weakest topology such that the zero set {k ∈ K | p(k) = 0} of any polynomial p is closed.

2. the Sorgenfrey line Rl is the space R, together with the topology generated by all half-open intervals [a, b).

3. the Moore plane is the (closed) upper half plane Γ = {(x, y) ∈ R^2 | y ≥ 0}, together with the topology generated by Euclidean open sets, and all sets of the form {(x0, 0)} ∪ (O \ {(x, 0) | x ∈ R}) where O is an open neighborhood of (x0, 0) in the Euclidean sense.

4. If (X, ≤) is a linearly ordered set, the order topology on X is the topology generated by the sets {x ∈ X | a < x < b} for all pairs (a, b) ∈ X^2 such that a < b.

If x ∈ X then a neighborhood of x is a set A ⊂ X so that x ∈ O ⊂ A for some open set O ∈ T. A point x ∈ X is an accumulation point (or condensation point, or limit point) of a set E if every neighborhood of x has nonempty intersection with E \ {x}. A neighborhood base for x ∈ X is a family {O_i}_{i∈I} of open neighborhoods of x such that for any open neighborhood O of x there is some O_i so that O_i ⊂ O. A base for the topology T is a family {O_i}_{i∈I} of open sets which contains a neighborhood base for any point x ∈ X.

Proposition 3.1.1. If E ⊂ 2^X then T(E) consists of all unions of finite intersections of sets in E.

Proof. If we let T denote the family of all unions of finite intersections of sets in E then we must show that T is a topology. Clearly T is closed under arbitrary unions. Suppose U_1, ..., U_n ∈ T. Then we may write U_i = ∪_{j∈J_i} O_{j,i} where each O_{j,i} is a finite intersection of sets in E. Therefore ∩_{i=1}^n U_i = ∪_{j_1∈J_1,...,j_n∈J_n} (∩_{i=1}^n O_{j_i,i}) ∈ T, so that T is also closed under finite intersections.

A topological space (X, T) is first countable if each point has a countable neighborhood base. (X, T) is second countable if it has a countable base. A topological space is:

1. T1 if {x} is closed for each point x ∈ X;

2. Hausdorff (or T2) if for each x ≠ y, there exist disjoint open sets U, V ∈ T, such that x ∈ U and y ∈ V;

3. regular (or T3) if it is T1 and for each closed set A ⊂ X and x ∈ A^c there exist disjoint open sets U, V with x ∈ U and A ⊂ V;

4. normal (or T4) if it is T1, and for any disjoint closed sets A, B ⊂ X there are disjoint open sets U, V ⊂ X so that A ⊂ U, and B ⊂ V .

We leave it to the reader to check the implications T4 ⟹ T3 ⟹ T2 ⟹ T1.
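As an illustration of Proposition 3.1.1, the topology generated by a family E of subsets of a finite set can be computed directly as the collection of all unions of finite intersections of members of E. The following Python sketch is not from the original notes; the name generated_topology and the sample data are hypothetical, and the conventions that the empty intersection is X and the empty union is ∅ are assumptions made so that the result always contains X and ∅. The running time is exponential in the size of E, so this is only suitable for small examples.

```python
from itertools import combinations

def generated_topology(X, E):
    """Return all unions of finite intersections of sets in E (finite X only)."""
    X = frozenset(X)
    E = [frozenset(S) for S in E]
    # Finite intersections of members of E; the empty intersection is X.
    base = {X}
    for r in range(1, len(E) + 1):
        for sets in combinations(E, r):
            base.add(frozenset.intersection(*sets))
    # All unions of members of the base; the empty union is the empty set.
    topology = {frozenset()}
    for B in base:
        topology |= {U | B for U in topology}
    return topology

X = {1, 2, 3, 4}
E = [{1, 2}, {2, 3}]
T = generated_topology(X, E)
```

For this choice of E the finite intersections are X, {1, 2}, {2, 3} and {2}, and taking unions adds ∅ and {1, 2, 3}.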

3.1.1 Exercises

Exercise 3.1.2. Show that a metric space X is separable if and only if X is second countable.

Exercise 3.1.3. Show that in a first countable space, singletons {x} are Gδ.

Exercise 3.1.4. Prove that every metric space is normal and first countable.

Exercise 3.1.5. Prove that a metric space is separable if and only if it is second countable.

Exercise 3.1.6. Let X = R and let T be the family of all sets of the form U ∪ (V ∩ Q) where U and V are open sets in the usual sense. Show that T gives a topology on R which is Hausdorff but not regular.

Exercise 3.1.7. Suppose (X, d) is a metric space. Show that closed subsets of X are Gδ.

Exercise 3.1.8. Let (X, d) be a metric space and consider the bounded metric d′(x, y) = d(x, y)/(1 + d(x, y)). Show that (X, d′) describes the same topology on X.

Exercise 3.1.9 (Fréchet). Let A, B ⊂ R be two countable dense sets, show that there is a homeomorphism θ : R → R so that θ(A) = B. Hint: Use Exercise 1.1.27.

3.2 Continuous maps

Let (X, T) and (Y, S) be two topological spaces. A map f : X → Y is continuous if f^{-1}(U) is open for any open set U ⊂ Y. Note that this agrees with our terminology for metric spaces. We say that f is open if f(U) is open for every open set U ⊂ X. We say that f is a homeomorphism if it is bijective, continuous, and open.

Proposition 3.2.1. Suppose E generates the topology on Y, then f : X → Y is continuous if and only if f^{-1}(U) is open for each U ∈ E.

Proof. If f is continuous then we trivially have that f^{-1}(U) is open for each U ∈ E. Conversely, if f^{-1}(U) is open for each U ∈ E, then as the inverse image of a function distributes over unions and intersections it follows that f^{-1}(O) is open whenever O is a union of finite intersections of sets in E. By Proposition 3.1.1 every open set is of this form and hence f is continuous.

A directed set is a set A, together with a binary relation ≤ such that

1. x ≤ x for all x ∈ A.

2. if x ≤ y and y ≤ z then x ≤ z.

3. for each x, y ∈ A there exists some z ∈ A so that x ≤ z and y ≤ z.

A net in X is a function f : A → X from a nonempty directed set A into X. We usually prefer to think of a net as being indexed by A and so we write this as {f_α}_{α∈A}, and we sometimes abuse notation by identifying a net with its image, so that we might say “let {x_α}_{α∈A} ⊂ X be a net”. Note that sequences are just nets when the index set is N. Nets were first introduced by Moore and Smith in 1922 as a generalization of sequences.

A net {x_α}_{α∈A} ⊂ X has a limit x ∈ X if for every open neighborhood U of x there exists α ∈ A so that x_β ∈ U for all β ≥ α. If x is a limit of a net {x_α}_{α∈A} then we say that this net is convergent and we write lim_{α→∞} x_α = x. In general, limits need not be unique; however, it is easy to see that if X is Hausdorff then limits must be unique.

Proposition 3.2.2. Suppose X and Y are topological spaces and f : X → Y, then f is continuous if and only if for any convergent net {x_α}_{α∈A} such that x = lim_{α→∞} x_α, we have that {f(x_α)}_{α∈A} is also convergent and lim_{α→∞} f(x_α) = f(x). Moreover, if X is first countable, then one may consider only sequences rather than nets.

Proof. Suppose first that f is continuous and {x_α}_{α∈A} is a net such that x = lim_{α→∞} x_α. If we fix an open neighborhood U of f(x) then f^{-1}(U) is an open neighborhood of x and since x = lim_{α→∞} x_α there then exists a ∈ A so that x_β ∈ f^{-1}(U) for all β ≥ a. Therefore, f(x_β) ∈ U for all β ≥ a and hence lim_{α→∞} f(x_α) = f(x).

Conversely, suppose that for any net {x_α}_{α∈A} such that x = lim_{α→∞} x_α, we have that {f(x_α)}_{α∈A} is also convergent and lim_{α→∞} f(x_α) = f(x). Suppose, for the sake of contradiction, that f is not continuous and let U be an open set in Y such that f^{-1}(U) is not open in X. Therefore there exists a point x ∈ f^{-1}(U) so that f^{-1}(U) contains no open neighborhood of x. We let A denote the set of open neighborhoods of x, and note that this is a directed set when ordered by reverse inclusion. For each O ∈ A we take x_O ∈ O \ f^{-1}(U). Then we have lim_{O→∞} x_O = x. However, f(x_O) ∉ U for each O ∈ A and hence {f(x_O)}_{O∈A} does not converge to f(x), a contradiction.

If X is a set and {f_i : X → Y_i}_{i∈I} is a family of maps from X into topological spaces Y_i then there is a unique weakest topology on X making each of the maps f_i continuous. We call this topology the weak topology on X generated by {f_i}_{i∈I}. For example, if {X_i}_{i∈I} is a family of topological spaces then we may endow ∏_{i∈I} X_i with the weak topology generated by the coordinate maps π_i : ∏_{j∈I} X_j → X_i. We always consider ∏_{i∈I} X_i with this topology unless otherwise stated.

Proposition 3.2.3. If X_i is Hausdorff for each i ∈ I then ∏_{i∈I} X_i is Hausdorff.

Proof. Suppose x, y ∈ ∏_{i∈I} X_i such that x ≠ y. Then for some coordinate i ∈ I we have π_i(x) ≠ π_i(y). Since X_i is Hausdorff there exist disjoint open neighborhoods O and U of π_i(x) and π_i(y) respectively; then π_i^{-1}(O) and π_i^{-1}(U) give disjoint open neighborhoods of x and y respectively.

Proposition 3.2.4. If Y is a topological space and f : Y → ∏_{i∈I} X_i, then f is continuous if and only if π_i ∘ f is continuous for each i ∈ I.

Proof. Since compositions of continuous maps are continuous we see that if f is continuous then π_i ∘ f is continuous for each i ∈ I. Conversely, suppose π_i ∘ f is continuous for each i ∈ I. Then for any i ∈ I and open set O ⊂ X_i we have that f^{-1}(π_i^{-1}(O)) is open. Since these sets generate the topology on ∏_{i∈I} X_i we then have that f is continuous by Proposition 3.2.1.

Proposition 3.2.5. A net {x_α}_{α∈A} converges to x ∈ ∏_{i∈I} X_i if and only if for each i ∈ I the net {π_i(x_α)}_{α∈A} converges to π_i(x).

Proof. If {x_α}_{α∈A} is a net which converges to x, then for each i ∈ I we have that {π_i(x_α)}_{α∈A} converges to π_i(x) by Proposition 3.2.2. Conversely, if for each i ∈ I we have that {π_i(x_α)}_{α∈A} converges to π_i(x), then for each i_1, ..., i_n ∈ I and O_1, ..., O_n open neighborhoods of π_{i_1}(x), ..., π_{i_n}(x) respectively, there exists a ∈ A so that π_{i_k}(x_α) ∈ O_k for each 1 ≤ k ≤ n and each α ≥ a. Since sets of the form ∩_{k=1}^n π_{i_k}^{-1}(O_k) form a base for the topology it then follows that {x_α}_{α∈A} converges to x.

In the case when each X_i is equal to some fixed space X, then we are considering the function space X^I, and the topology we are considering is the topology of pointwise convergence; a net {f_α}_{α∈A} converges to f : I → X if and only if for each i ∈ I we have lim_{α→∞} f_α(i) = f(i).

If X is a topological space then we denote by Cb(X) the space of all continuous functions with bounded image. We consider the uniform norm of f as

\[ \|f\|_\infty = \sup_{x \in X} |f(x)|. \]

The function d(f, g) = ‖f − g‖∞ gives a metric on Cb(X), which we call the uniform metric.

Proposition 3.2.6. The space Cb(X) is a Banach algebra when endowed with the uniform metric, and pointwise operations.

Proof. The arguments in Propositions 1.3.1 and 1.3.1 when X is a metric space work equally well here.

If K ⊂ C is closed, then we denote by Cb(X; K) the subspace of all functions which take values in K. It is easy to see that Cb(X; K) is a closed subspace of Cb(X) in the uniform norm.

Lemma 3.2.7. Let X be a normal space. Suppose that A and B are disjoint closed sets in X, and let D = {k2^{-n} | n ≥ 1 and 0 < k < 2^n} be the set of dyadic rationals in (0, 1). There is a family {U_r | r ∈ D} of open sets in X such that A ⊂ U_r ⊂ B^c for all r ∈ D and Ū_r ⊂ U_s for r < s.

Proof. Set U_0 = A and U_1 = B^c. As X is normal there exist disjoint open sets V and W such that A ⊂ V and B ⊂ W. We set U_{1/2} = V, so that A ⊂ U_{1/2} ⊂ Ū_{1/2} ⊂ W^c ⊂ B^c. We now select U_r for r = k2^{-n} by induction on n. Suppose N ≥ 2 and we have chosen U_r for r = k2^{-n}, for 0 < k < 2^n, and 1 ≤ n < N. Then for each r = (2j + 1)2^{-N}, 0 ≤ j < 2^{N-1}, we have that Ū_{j2^{1-N}} and (U_{(j+1)2^{1-N}})^c are disjoint closed sets and so as above we may choose U_r so that

\[ A \subset \overline{U}_{j2^{1-N}} \subset U_r \subset \overline{U}_r \subset U_{(j+1)2^{1-N}} \subset B^c. \]

Lemma 3.2.8 (Urysohn’s Lemma). Let X be a normal space. If A and B are disjoint closed sets in X, then there exists f ∈ C(X; [0, 1]) such that f|_A = 0, and f|_B = 1.

Proof. Let {U_r}_{r∈D} be as in the previous lemma. Set U_1 = X and for x ∈ X define f(x) = inf{r | x ∈ U_r}. Since A ⊂ U_r for 0 < r < 1 we have f|_A = 0. We also have f|_B = 1 and clearly f(X) ⊂ [0, 1], so all that remains is to show that f is continuous. Towards this end note that f(x) < a if and only if x ∈ U_r for some r < a, thus f^{-1}((−∞, a)) = ∪_{r<a} U_r is open. Also, f(x) > a if and only if x ∉ U_r for some r > a, and hence if and only if x ∉ Ū_s for some s > a (since Ū_s ⊂ U_r for s < r). Thus f^{-1}((a, ∞)) = ∪_{s>a} (Ū_s)^c is open. Since half-lines generate the topology on R it follows that f is continuous.
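In a metric space the function produced by Urysohn's lemma can be written down explicitly, which may help in picturing the general construction. The following worked formula is an added illustration rather than part of the original argument; it assumes (X, d) is a metric space, A and B are nonempty disjoint closed subsets, and it uses the notation d(x, A) for the distance from a point to a set, which is introduced here only for this example.

```latex
% Assumption: (X, d) is a metric space and A, B are nonempty, disjoint, closed.
\[
  d(x, A) = \inf_{a \in A} d(x, a), \qquad
  f(x) = \frac{d(x, A)}{d(x, A) + d(x, B)} .
\]
% Since A is closed, d(x, A) = 0 if and only if x \in A (and likewise for B),
% so the denominator never vanishes because A \cap B = \emptyset.
% The map x \mapsto d(x, A) is 1-Lipschitz, hence f is continuous, with
\[
  0 \le f \le 1, \qquad f|_A = 0, \qquad f|_B = 1 .
\]
```

The sets U_r = {x | f(x) < r} for r ∈ D then form a family of the kind produced in Lemma 3.2.7.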

A space X is Tychonoff (or T3½) if it is T1 and for every closed set A and point x ∈ A^c, there exists a continuous function f ∈ C(X; [0, 1]) so that f|_A = 0, and f(x) = 1. Note that by Urysohn’s lemma we have that T4 ⟹ T3½. It’s also easy to check that T3½ ⟹ T3.

Theorem 3.2.9 (The Tietze Extension Theorem). Let X be a normal space. If A is a closed subset of X and f : A → R is continuous, then there exists F : X → R continuous, such that F|_A = f. Moreover, if f is bounded then we may choose F so that ‖F‖∞ = ‖f‖∞.

Proof. We first consider the case when f is bounded. We may assume f is nonconstant. If we set a = inf f(A) and b = sup f(A), then by replacing f with (f − a)/(b − a) we may assume that f : A → [0, 1]. We will inductively construct a sequence of continuous functions g_n : X → [0, 1] so that

\[ f(x) - \sum_{i=0}^{n} \frac{2^i}{3^{i+1}} g_i(x) \in [0, (2/3)^{n+1}] \]

for all x ∈ A. Then Proposition 3.2.6 shows that F = ∑_{i=0}^∞ (2^i/3^{i+1}) g_i defines a continuous function with ‖F‖∞ ≤ 1, such that F agrees with f on A.

To construct g_0 we set E = f^{-1}([0, 1/3]) and F = f^{-1}([2/3, 1]). Then E and F are disjoint closed sets and so by Urysohn’s lemma there exists g_0 : X → [0, 1] continuous so that g_0(x) = 0 for x ∈ E and g_0(x) = 1 for x ∈ F. We then have f(x) − (1/3) g_0(x) ∈ [0, 2/3] for all x ∈ A.

Now suppose g_0, ..., g_{n−1} have been constructed so that

\[ \tilde f(x) = f(x) - \sum_{i=0}^{n-1} \frac{2^i}{3^{i+1}} g_i(x) \in [0, (2/3)^{n}] \]

for all x ∈ A. Then as above we set E = f̃^{-1}([0, 2^n/3^{n+1}]) and F = f̃^{-1}([2^{n+1}/3^{n+1}, (2/3)^n]), and we take g_n : X → [0, 1] continuous so that g_n(x) = 0 if x ∈ E and g_n(x) = 1 if x ∈ F. Then, we have

\[ f(x) - \sum_{i=0}^{n} \frac{2^i}{3^{i+1}} g_i(x) = \tilde f(x) - \frac{2^n}{3^{n+1}} g_n(x) \in [0, (2/3)^{n+1}], \]

finishing the induction step.

For the case when f is not bounded we take a homeomorphism θ : R → (−1/2, 1/2) and consider θ ∘ f : A → (−1/2, 1/2). Then from above there exists a continuous function F : X → [−1/2, 1/2] so that F agrees with θ ∘ f on A. We let E = F^{-1}({−1/2, 1/2}). Then E is a closed set which is disjoint from A and so by Urysohn’s lemma there exists a continuous function g : X → [0, 1] so that g|_E = 0 and g|_A = 1. Then F̃ = gF also agrees with θ ∘ f on A and satisfies F̃ : X → (−1/2, 1/2). Therefore θ^{-1} ∘ F̃ gives the desired continuous function.
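The weights 2^i/3^{i+1} attached to the functions g_i in the bounded case are chosen so that the resulting series is controlled by a geometric series. The short computation below is an added remark, not part of the original proof; it only uses 0 ≤ g_i ≤ 1 and the standard formula for a geometric series.

```latex
% Total mass of the weights, and the size of the tail after n terms:
\[
  \sum_{i=0}^{\infty} \frac{2^i}{3^{i+1}}
    = \frac{1}{3} \sum_{i=0}^{\infty} \left( \frac{2}{3} \right)^{i}
    = \frac{1}{3} \cdot \frac{1}{1 - 2/3} = 1,
  \qquad
  \sum_{i=n+1}^{\infty} \frac{2^i}{3^{i+1}}
    = \frac{2^{n+1}}{3^{n+2}} \cdot \frac{1}{1 - 2/3}
    = \left( \frac{2}{3} \right)^{n+1} .
\]
% Since 0 \le g_i \le 1, the weighted series \sum_i (2^i / 3^{i+1}) g_i
% converges uniformly, its sum takes values in [0, 1], and on A it differs
% from the n-th partial sum by at most (2/3)^{n+1}.
```

In particular the error bound (2/3)^{n+1} in the inductive hypothesis is exactly the mass of the weights that have not yet been used.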

Corollary 3.2.10. If X is normal, A ⊂ X is closed, and f ∈ Cb(A), then there exists F ∈ Cb(X) such that F|_A = f, and ‖F‖∞ = ‖f‖∞.

Proof. We assume f ≠ 0. By considering the real and imaginary parts separately, it then follows from the previous theorem that there exists a bounded continuous function F_0 such that F_0 agrees with f on A. We let E = {x ∈ X | |F_0(x)| ≥ ‖f‖∞}. Then

\[ h(x) = \begin{cases} \|f\|_\infty / |F_0(x)| & \text{if } x \in E; \\ 1 & \text{if } x \notin E; \end{cases} \]

gives a continuous function and hF_0 agrees with f on A and satisfies ‖hF_0‖∞ = ‖f‖∞.

3.2.1 Exercises

A topological space X is disconnected if there exist nonempty disjoint open sets U, V which cover X; otherwise X is connected. A subset E ⊂ X is connected or disconnected if this is the case in the relative topology.

Exercise 3.2.11. (a) Show that if {A_i}_{i∈I} is a family of connected subsets such that ∩_{i∈I} A_i ≠ ∅ then ∪_{i∈I} A_i is connected.

(b) Show that if A ⊂ X is connected then Ā is also connected.

(c) Show that every point x ∈ X is contained in a unique maximal connected subset of X, and this subset is closed. (This is the connected component of x).

A topological space is totally disconnected if {x} is the connected component of x, for each x ∈ X.

Exercise 3.2.12. Show that the continuous image of a connected set is connected.

A topological space (X, T ) is arc-connected if for each x, y ∈ X there exists a continuous function f : [0, 1] → X so that f(0) = x and f(1) = y.

Exercise 3.2.13. Show that arc-connected spaces are connected. Also, find an example of a connected space which is not arc-connected.

3.3 Compact spaces

Generalizing the case for metric spaces, a topological space (X, T) is compact if every open cover has a finite subcover. A subset E ⊂ X is compact if it is compact with respect to the relative topology. We say a subset E ⊂ X is precompact if Ē is compact. A family ℱ of subsets of X has the finite intersection property if for any F_1, ..., F_n ∈ ℱ, with n ≥ 1, we have ∩_{i=1}^n F_i ≠ ∅.

Proposition 3.3.1. A topological space X is compact if and only if, whenever ℱ is a non-empty family of closed subsets which has the finite intersection property, we have ∩_{F∈ℱ} F ≠ ∅.

Proof. By contraposition a space X is compact if and only if whenever G is a family of open sets which does not have a finite subfamily covering X, then G itself does not cover X. Taking complements of the sets in G then gives the criterion for compactness above.

Proposition 3.3.2. Suppose X and Y are topological spaces with X compact. If f : X → Y is continuous then f(X) is compact.

Proof. Suppose G is an open cover for f(X), then {f^{-1}(O) | O ∈ G} is an open cover for X and hence has a finite subcover f^{-1}(O_1), ..., f^{-1}(O_n). It then follows that O_1, ..., O_n is a finite subcover of f(X). Hence, f(X) is compact.

By the previous proposition, any continuous map from a compact space to C is bounded, thus for compact spaces we write C(X) for Cb(X).

Proposition 3.3.3. A closed subset of a compact space is compact. Also, a compact subset of a Hausdorff space is closed.

Proof. First, suppose X is compact, and F ⊂ X is closed. If G is an open cover of F, then G ∪ {F^c} is an open cover for X and hence by compactness there is a finite subcover G_0. Then G_0 \ {F^c} is a finite subfamily of G which covers F, showing that F is compact.

Next, suppose X is Hausdorff and F ⊂ X is compact. Fix x0 ∉ F. Since X is Hausdorff, for each x ∈ F there exist disjoint open neighborhoods O_x and G_x of x and x0 respectively. We have that {O_x}_{x∈F} covers F and so by compactness there is a finite subcover {O_{x_1}, ..., O_{x_n}}. If we set U = ∩_{i=1}^n G_{x_i} then we have that U is an open neighborhood of x0 which is disjoint from F ⊂ ∪_{i=1}^n O_{x_i}. Thus x0 ∉ F̄, and as x0 ∉ F was arbitrary it then follows that F = F̄ is closed.

Corollary 3.3.4. Suppose X and Y are topological spaces with X compact and Y Hausdorff, and suppose f : X → Y is a continuous bijection. Then f is a homeomorphism.

Proof. Suppose O ⊂ X is open. By Proposition 3.3.3 we have that O^c is compact. By Proposition 3.3.2 we then have that f(O^c) is compact and hence closed. Since f is a bijection we then have that f(O) = f(O^c)^c is open. Thus f^{-1} is continuous and hence f is a homeomorphism.

Proposition 3.3.5. Suppose X is a Hausdorff topological space, and E, F ⊂ X are disjoint compact subsets, then there exist disjoint open sets U, V ⊂ X so that E ⊂ U and F ⊂ V.

Proof. We first consider the case when E is a singleton E = {x}. Since X is Hausdorff, for each y ∈ F there exist disjoint open sets O_y and V_y so that y ∈ O_y and x ∈ V_y. We then have that {O_y}_{y∈F} forms an open cover of F and by compactness there exists a finite subcover {O_{y_i}}_{i=1}^n. If we set U = ∪_{i=1}^n O_{y_i} and V = ∩_{i=1}^n V_{y_i}, then U and V are disjoint open sets such that F ⊂ U and x ∈ V.

We now consider the general case. From above, for each y ∈ E there exist disjoint open sets O_y and V_y so that y ∈ O_y and F ⊂ V_y. Again by compactness there exists a finite collection {O_{y_i}}_{i=1}^n which covers E. Then U = ∪_{i=1}^n O_{y_i} and V = ∩_{i=1}^n V_{y_i} are disjoint open sets and we have E ⊂ U, while F ⊂ V.

Corollary 3.3.6. A compact Hausdorff space X is normal.

Proof. This follows directly from Propositions 3.3.3 and 3.3.5.

Theorem 3.3.7 (Tychonoff’s Theorem). If {X_i}_{i∈I} is a family of compact topological spaces, then ∏_{i∈I} X_i is also compact.

Proof. Suppose ℱ is a family of closed subsets of ∏_{i∈I} X_i with the finite intersection property. By Zorn’s lemma there exists a maximal family ℰ of (not necessarily closed) subsets with the finite intersection property such that ℱ ⊂ ℰ. Note that ℰ itself must then be closed under finite intersections, and if E ∈ ℰ and E ⊂ F, then F ∈ ℰ.

For each i ∈ I the family {π_i(E) | E ∈ ℰ} has the finite intersection property and hence by compactness we have ∩_{E∈ℰ} \overline{π_i(E)} ≠ ∅. Take x_i a point in this intersection. We let x be the point in ∏_{i∈I} X_i whose ith coordinate is x_i.

We claim that all neighborhoods of x are contained in ℰ. To prove this it is enough to show that the neighborhoods of the form ∩_{k=1}^n π_{i_k}^{-1}(E_{i_k}), where each E_{i_k} is an open neighborhood of x_{i_k}, are contained in ℰ, and since ℰ is closed under finite intersections it is then enough to show that neighborhoods of the form π_i^{-1}(E_i) are contained in ℰ. To see this note that since x_i ∈ \overline{π_i(E)} for any E ∈ ℰ it follows that π_i^{-1}(E_i) ∩ E ≠ ∅ for all E ∈ ℰ. This then shows that ℰ ∪ {π_i^{-1}(E_i)} has the finite intersection property and hence π_i^{-1}(E_i) ∈ ℰ by maximality of ℰ.

Thus, arbitrary neighborhoods of x are contained in ℰ and hence have non-trivial intersection with an arbitrary set E ∈ ℰ. Thus, x ∈ Ē for all E ∈ ℰ, and hence x ∈ ∩_{E∈ℰ} Ē ⊂ ∩_{F∈ℱ} F, showing that ∏_{i∈I} X_i is compact.

If X is a Banach space, then the weak∗-topology on X∗ is defined to be the coarsest topology so that the maps X∗ ∋ ϕ ↦ ϕ(x) are continuous for each x ∈ X.

Theorem 3.3.8 (The Banach-Alaoglu theorem). Let X be a Banach space. Then the closed unit ball in X∗ is compact in the weak∗-topology.

Proof. Let D = ∏_{x∈X} B(‖x‖, 0). Since closed balls in Euclidean space are compact, it then follows from Tychonoff’s theorem that D is compact. We let K denote the closed unit ball in X∗ and consider the map π : K → D where π(ϕ) has coordinates π(ϕ)_x = ϕ(x). Note that since ϕ is in the unit ball we have |π(ϕ)_x| = |ϕ(x)| ≤ ‖x‖, so that π is well defined. Also note that π is injective since if π(ϕ) = π(ψ) then for each point x ∈ X we have ϕ(x) = ψ(x). If {ϕ_α}_{α∈A} is a net in K then {ϕ_α}_{α∈A} converges to ϕ if and only if for each x ∈ X we have ϕ_α(x) → ϕ(x), and this is also if and only if π(ϕ_α) → π(ϕ) in D. Therefore π defines a homeomorphism from K onto its image. K is therefore compact if and only if the image of π is closed.

Note that D consists of functions from X to the scalar field 𝕂 (where 𝕂 = R or 𝕂 = C), and as these functions take x to an element in the ball B(‖x‖, 0) it follows that they are bounded functions. Thus, the image of π consists of those functions which are linear, i.e.

\[ \pi(K) = \bigcap_{x_1, x_2 \in X,\ \alpha \in \mathbb{K}} \{ f \in D \mid f_{x_1 + \alpha x_2} = f_{x_1} + \alpha f_{x_2} \}. \]

As an intersection of closed sets is closed it then follows that π(K) is closed.

A property P of topological spaces is said to hold locally for a space X if each x ∈ X has a neighborhood which satisfies P. For example a locally compact space X is one in which each point x ∈ X has a compact neighborhood. Euclidean spaces R^n are examples of locally compact spaces.

A subset F ⊂ Cb(X) is equicontinuous at x ∈ X if for each ε > 0 there is a neighborhood U of x such that |f(y) − f(x)| < ε for all y ∈ U and f ∈ F. F is equicontinuous if it is equicontinuous at each point. Also, F is pointwise bounded if {f(x) | f ∈ F} is bounded for each x ∈ X.

Theorem 3.3.9 (The Arzelà-Ascoli Theorem). Let X be a compact Hausdorff space. If F ⊂ C(X) is equicontinuous and pointwise bounded, then F is totally bounded in the uniform metric, and F is precompact.

Proof. Fix ε > 0. Since F is equicontinuous, for each x ∈ X there exists an open neighborhood O_x of x such that |f(y) − f(x)| < ε/4 for all f ∈ F, and y ∈ O_x. The family {O_x}_{x∈X} is an open cover, and since X is compact it is covered by a finite collection O_{x_1}, ..., O_{x_n}.

Set K = sup_{f∈F, 1≤k≤n} |f(x_k)|. Since F is pointwise bounded we have K < ∞. We cover the closed ball B(K, 0) ⊂ C with finitely many ε/4-balls B(ε/4, z_1), ..., B(ε/4, z_m). We consider the finite set

Φ = {φ : {1, ..., n} → {1, ..., m} | there exists f ∈ F such that f(x_i) ∈ B(ε/4, z_{φ(i)}) for all 1 ≤ i ≤ n},

and for each φ ∈ Φ, we choose f_φ ∈ F which realizes the fact that φ ∈ Φ. If f ∈ F, then as B(ε/4, z_1), ..., B(ε/4, z_m) cover B(K, 0), there exists a function φ ∈ Φ so that f(x_i) ∈ B(ε/4, z_{φ(i)}), for 1 ≤ i ≤ n. If y ∈ O_{x_i}, we then have

\[ |f(y) - f_\varphi(y)| \le |f(y) - f(x_i)| + |f(x_i) - f_\varphi(x_i)| + |f_\varphi(x_i) - f_\varphi(y)| < \varepsilon/4 + \varepsilon/2 + \varepsilon/4 = \varepsilon. \]

Thus ‖f − f_φ‖∞ < ε, showing that F is covered by finitely many ε-balls, and is therefore totally bounded. It then follows that F is precompact in the uniform norm by the Heine-Borel property.

A topological space is σ-compact if it is a countable union of compact subsets.

Lemma 3.3.10. Suppose X is a σ-compact, locally compact Hausdorff space. Then there is a sequence {U_n}_{n=1}^∞ of precompact open sets such that U_n ⊂ U_{n+1} for each 1 ≤ n < ∞, and ∪_{n=1}^∞ U_n = X.

Proof. We have X = ∪_{n=1}^∞ F_n where F_n are compact sets. Since X is locally compact each point x ∈ F_n has a precompact open neighborhood O_x. Then {O_x}_{x∈F_n} covers F_n and hence has a finite subcover O_{x_1}, ..., O_{x_k}. Setting V_n = ∪_{i=1}^k O_{x_i}, we then have that V_n is open, precompact, and F_n ⊂ V_n for each 1 ≤ n < ∞. Setting U_n = ∪_{m=1}^n V_m then produces the desired sequence.

Note that if {U_n}_{n=1}^∞ are as in the previous lemma and F ⊂ X is compact, then {U_n}_{n=1}^∞ covers F and hence by compactness there exists a finite subcover. However, since {U_n}_{n=1}^∞ is an increasing sequence it then follows that F ⊂ U_n for some 1 ≤ n < ∞.

Theorem 3.3.11. Let X be a σ-compact, locally compact Hausdorff space. If {f_n}_{n=1}^∞ is a sequence which is equicontinuous and pointwise bounded, then there exists a continuous function f : X → C and a subsequence of {f_n}_{n=1}^∞ which converges to f uniformly on compact sets.

Proof. We write X = ∪_{n=1}^∞ F_n where each F_n = Ū_n as in the previous lemma. Set F_0 = ∅, g_0 = ∅, and {f_n^0}_{n=1}^∞ = {f_n}_{n=1}^∞. For k ≥ 1 we inductively choose g_k : F_k → C, and subsequences {f_n^k}_{n=1}^∞ of {f_n^{k−1}}_{n=1}^∞ so that g_k|_{F_{k−1}} = g_{k−1} as follows: We suppose that g_k and {f_n^k}_{n=1}^∞ have already been chosen for some 0 ≤ k < ∞. Restricting {f_n^k}_{n=1}^∞ to F_{k+1} we have an equicontinuous, pointwise bounded family and hence this is precompact in the uniform norm by the Arzelà-Ascoli Theorem. Therefore there exists a subsequence {f_n^{k+1}}_{n=1}^∞ such that {f_n^{k+1}}_{n=1}^∞ converges uniformly on F_{k+1} to a continuous function g_{k+1} : F_{k+1} → C. As {f_n^{k+1}}_{n=1}^∞ is a subsequence of {f_n^k}_{n=1}^∞ we have that g_{k+1}|_{F_k} = g_k.

We may then define g : X → C by g(x) = g_k(x) for x ∈ F_k. Then g is a well defined continuous function. If we consider the diagonal subsequence {f_n^n}_{n=1}^∞ then we have that {f_n^n}_{n=1}^∞ converges to g uniformly on F_k for any 1 ≤ k < ∞. Since any compact set is covered by some F_k it follows that {f_n^n}_{n=1}^∞ converges to g uniformly on compact sets.

3.3.1 Exercises

Exercise 3.3.12. Show that a metric space X is compact if and only if every continuous real valued function on X is bounded.

Exercise 3.3.13. Let (X, T) be a compact Hausdorff space. Show that if T′ is any strictly weaker topology then (X, T′) is not Hausdorff. Show that if T′ is any strictly stronger topology then (X, T′) is not compact.

Let X be a locally compact topological space, and fix a point ω ∉ X. On the space X̃ = X ∪ {ω} we define a new topology whose open sets consist of the open sets in X, together with the complements (in X̃) of compact subsets of X. The space X̃ is called the one-point compactification of X.

Exercise 3.3.14. Show that X̃ is a compact Hausdorff space and that the relative topology from X ⊂ X̃ agrees with the topology on X.

Let X be a locally compact topological space. A function f : X → C is said to vanish at infinity if for every ε > 0, there exists a compact set K ⊂ X so that |f(x)| < ε for all x ∈ K^c. We denote by C0(X) the space of all continuous functions which vanish at infinity.

Exercise 3.3.15. Show that C0(X) is a closed subspace of Cb(X).

Exercise 3.3.16 (Compare this with Exercise 3.1.3). Suppose X is a compact Hausdorff space such that singletons {x} are Gδ.

1. For each x ∈ X find a countable open cover O of X \ {x} so that x ∉ O for all O ∈ O.

2. Show that X is first countable.

If (V, E) is a graph, and k ∈ N, a k-coloring of the graph (V, E) is an assignment f ∈ {1, 2, ..., k}^V such that for all (v, w) ∈ E we have f(v) ≠ f(w).

Exercise 3.3.17. Prove the De Bruijn-Erdős theorem: If (V, E) is a graph such that a k-coloring exists for every finite subgraph, then a k-coloring exists for (V, E). Hint: For each finite subgraph (V0, E0) consider

F_{(V0,E0)} = {f ∈ {1, ..., k}^V | f|_{V0} gives a k-coloring of (V0, E0)},

then show that the family of all such F_{(V0,E0)} has the finite intersection property. (This approach is due to Gottschalk.)

3.4 The Stone-Weierstrass Theorem

A subset A ⊂ Cb(X) (resp. Cb(X; R)) is an algebra if it is a complex (resp. real) vector subspace such that fg ∈ A for all f, g ∈ A. A is said to separate points if for all x 6= y there exists f ∈ A such that f(x) 6= f(y). A subset A ⊂ Cb(X; R) is a lattice if it is a real vector subspace such that f ∨ g = max{f, g} ∈ A, and f ∧ g = min{f, g} ∈ A for all f, g ∈ A.

Lemma 3.4.1. For any ε > 0 there is a polynomial p on R such that p(0) = 0 and ||x| − p(x)| < ε for x ∈ [−1, 1].

Proof. Consider the Maclaurin series 1 − ∑_{n=1}^∞ c_n t^n for (1 − t)^{1/2}. This series converges absolutely and uniformly on [−1, 1] and its sum is (1 − t)^{1/2}. Therefore, given any ε > 0 we may take a suitable partial sum to obtain a polynomial q so that |(1 − t)^{1/2} − q(t)| < ε/2 for t ∈ [−1, 1]. Setting r(x) = q(1 − x^2), we then obtain a polynomial r such that ||x| − r(x)| < ε/2 for x ∈ [−1, 1]. If we set p(x) = r(x) − r(0), then p is a polynomial such that p(0) = 0 and ||x| − p(x)| < ε for all x ∈ [−1, 1].
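The construction in this proof can be tried numerically. The following Python sketch is an added illustration and not part of the original notes; the function name abs_poly_approx, the number of terms, and the sample grid are all hypothetical choices. It uses the standard recursion c_1 = 1/2, c_{n+1} = c_n (2n − 1)/(2n + 2) for the coefficients of the series for (1 − t)^{1/2}, truncates the series to get q, and then forms r(x) = q(1 − x^2) and p(x) = r(x) − r(0) as in the proof.

```python
def abs_poly_approx(num_terms):
    """Return a polynomial function p with p(0) == 0 approximating |x| on [-1, 1]."""
    # Coefficients c_n of (1 - t)^(1/2) = 1 - sum_{n >= 1} c_n t^n.
    coeffs = []
    c = 0.5
    for n in range(1, num_terms + 1):
        coeffs.append(c)
        c *= (2 * n - 1) / (2 * n + 2)

    def q(t):
        # Horner evaluation of the partial sum 1 - sum_{n=1}^{N} c_n t^n.
        acc = 0.0
        for c_n in reversed(coeffs):
            acc = (acc + c_n) * t
        return 1.0 - acc

    def r(x):
        # Substituting t = 1 - x^2 turns (1 - t)^(1/2) into |x|.
        return q(1.0 - x * x)

    r0 = r(0.0)

    def p(x):
        # Subtracting r(0) forces p(0) = 0, as in the proof.
        return r(x) - r0

    return p

p = abs_poly_approx(400)
grid = [i / 100 - 1 for i in range(201)]  # sample points in [-1, 1]
max_err = max(abs(abs(x) - p(x)) for x in grid)
```

The convergence is slow near x = 0, since the coefficients c_n only decay like n^{-3/2}; this is consistent with the proof, which only needs some partial sum to work for each ε.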

Proposition 3.4.2. Let X be a topological space. If A ⊂ Cb(X; R) is a closed subalgebra, then A is a lattice.

Proof. As f ∨ g = ½(f + g + |f − g|) and f ∧ g = ½(f + g − |f − g|) it is enough to show that |f| ∈ A whenever f ∈ A. Suppose f ∈ A and ε > 0 is given. We may assume f ≠ 0. Since f/‖f‖∞ maps into [−1, 1] it follows from the previous lemma that there exists a polynomial p on R so that p(0) = 0 and |p(f(x)/‖f‖∞) − |f(x)|/‖f‖∞| < ε for all x ∈ X. Since p(0) = 0 it follows that p has 0 for its constant coefficient, thus since A is an algebra we have p ∘ (f/‖f‖∞) ∈ A, and since ε > 0 was arbitrary and A is closed it then follows that |f|/‖f‖∞ ∈ A and hence also |f| ∈ A.

Lemma 3.4.3. Let X be a compact space. Suppose A ⊂ Cb(X; R) is a closed lattice and f ∈ C(X; R). If for every x, y ∈ X there exists g ∈ A so that g(x) = f(x) and g(y) = f(y), then f ∈ A.

Proof. Fix ε > 0. For each x, y take g_{x,y} ∈ A so that g_{x,y}(x) = f(x) and g_{x,y}(y) = f(y). Let U_{x,y} = {z ∈ X | f(z) < g_{x,y}(z) + ε}. Fix y; then {U_{x,y}}_{x∈X} is an open cover of X and so there is a finite subcover {U_{x_i,y}}_{i=1}^n. Set g_y = max{g_{x_1,y}, ..., g_{x_n,y}} ∈ A. Then f < g_y + ε on X and f(y) = g_y(y) so that in some neighborhood V_y of y we have that f > g_y − ε. We then have that {V_y}_{y∈X} covers X and so there is a finite subcover {V_{y_j}}_{j=1}^m. Set g = min{g_{y_1}, ..., g_{y_m}} ∈ A. Then ‖f − g‖∞ < ε, g ∈ A, and since ε > 0 was arbitrary and A is closed we then have f ∈ A.

Theorem 3.4.4 (The Stone-Weierstrass Theorem). Let X be a compact Hausdorff space. If A ⊂ C(X; R) is a closed algebra which separates points then either A = C(X; R) or else there exists x0 ∈ X such that A = {f ∈ C(X; R) | f(x0) = 0}.

Proof. We first consider the case when $X$ is a two point set $\{x, y\}$, so that $C(\{x, y\})$ is two dimensional. If $A$ is two dimensional then we are done. Also, since $A$ separates points we have a function $f \in A$ with $f(x) \neq f(y)$, so that we may assume $A$ is one dimensional. Then $f^2 = cf$ for some $c \in \mathbb{R}$, so that $f(x)$ and $f(y)$ are distinct roots of the polynomial $t^2 - ct$. It then follows that either $f(x) = 0$, in which case $A = \{g \in C(\{x, y\}) \mid g(x) = 0\}$, or else $f(y) = 0$, in which case $A = \{g \in C(\{x, y\}) \mid g(y) = 0\}$.

We now consider the general case. Suppose $x, y \in X$, with $x \neq y$. Considering the restriction map from $A$ we obtain an algebra $A_{x,y} \subset C(\{x, y\}; \mathbb{R})$. Note that since $A$ separates points so does $A_{x,y}$. If there exists $x_0 \in X$ so that $f(x_0) = 0$ for every $f \in A$, then as $A$ separates points there can be at most one such $x_0$ and from above we then have $A_{x,y} = C(\{x, y\})$ whenever $x_0 \notin \{x, y\}$. It then follows from Proposition 3.4.2 and Lemma 3.4.3 that $A = \{f \in C(X) \mid f(x_0) = 0\}$. Otherwise $A_{x,y} = C(\{x, y\})$ for all $x, y \in X$, in which case it again follows from Proposition 3.4.2 and Lemma 3.4.3 that $A = C(X)$.

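Corollary 3.4.5. Let $K \subset \mathbb{R}^n$ be compact, and $f \in C(K; \mathbb{R})$, then for every $\varepsilon > 0$ there exists a polynomial $p : \mathbb{R}^n \to \mathbb{R}$ so that $|f(k) - p(k)| < \varepsilon$ for all $k \in K$.

Proof. The space of polynomials forms an algebra which contains the constant functions and separates points. Its closure in $C(K; \mathbb{R})$ is therefore a closed algebra which separates points and does not vanish at any point, so by the Stone-Weierstrass theorem the closure is all of $C(K; \mathbb{R})$, that is, the polynomials are dense in $C(K; \mathbb{R})$.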

Theorem 3.4.6 (The Complex Stone-Weierstrass Theorem). Let $X$ be a compact Hausdorff space. If $A \subset C(X)$ is a closed algebra which is closed under complex conjugation and separates points then either $A = C(X)$ or else there exists $x_0 \in X$ such that $A = \{f \in C(X) \mid f(x_0) = 0\}$.

Proof. Since $\mathrm{Re}\, f = (f + \bar{f})/2$ and $\mathrm{Im}\, f = (f - \bar{f})/2i$ it follows that the set of real and imaginary parts of functions in $A$ is an algebra $A_{\mathbb{R}}$ in $C(X; \mathbb{R})$. Moreover it is easy to see that this separates points and hence the Stone-Weierstrass theorem applies. Since $A = \{f + ig \mid f, g \in A_{\mathbb{R}}\}$ the complex version then follows.

3.4.1 Exercises

Exercise 3.4.7. Suppose $X$ and $Y$ are compact Hausdorff spaces and $f \in C(X \times Y)$. Show that for all $\varepsilon > 0$ there exist $g_1, \ldots, g_n \in C(X)$ and $h_1, \ldots, h_n \in C(Y)$ so that $|f(x, y) - \sum_{i=1}^n g_i(x) h_i(y)| < \varepsilon$ for all $(x, y) \in X \times Y$.

Exercise 3.4.8. Let $X$ be a compact Hausdorff space. An ideal in $C(X)$ is a subalgebra $I \subset C(X)$, such that $fg \in I$ whenever $f \in C(X)$ and $g \in I$.

1. If $I \subset C(X)$ is an ideal, let $h(I) = \{x \in X \mid f(x) = 0 \text{ for all } f \in I\}$, the hull of $I$. Show that $h(I)$ is closed.

2. If $A \subset X$, let $k(A) = \{f \in C(X) \mid f(x) = 0 \text{ for all } x \in A\}$, the kernel of $A$. Show that $k(A)$ is a closed ideal in $C(X)$ which is closed under conjugation.

3. Show that $k(h(I)) = \overline{I}$ for any ideal $I \subset C(X)$ which is closed under conjugation, and $h(k(A)) = \overline{A}$ for any subset $A \subset X$.

Given a topological space $X$ and an equivalence relation $R$ on $X$, we let $X/R$ denote the set of equivalence classes and we consider $q : X \to X/R$ the quotient map $q(x) = [x]$. We endow $X/R$ with the finest topology so that $q$ is continuous.

Exercise 3.4.9. Let $X$ be a compact Hausdorff space.

1. Show that $X/R$ is Hausdorff if and only if $R$ is a closed subset of $X \times X$.

2. If $R \subset X \times X$ is closed, consider $A_R = \{f \circ q \mid f \in C(X/R)\}$. Show that $A_R$ is a closed subalgebra of $C(X)$ which contains the constant functions and is closed under complex conjugation.

3. Show that $R \mapsto A_R$ gives a bijection between equivalence relations on $X$ which are closed in $X \times X$, and closed subalgebras of $C(X)$ which contain the constant functions and are closed under complex conjugation.

3.5 The Stone-Čech compactification

Let $X$ be a topological space. A map $\chi : C_b(X) \to \mathbb{C}$ is a homomorphism if it is linear, and satisfies $\chi(fg) = \chi(f)\chi(g)$ for $f, g \in C_b(X)$, and $\chi(1) = 1$. We denote by $\sigma(C_b(X))$ the space of all such homomorphisms and we endow this with the topology of pointwise convergence inherited from $\mathbb{C}^{C_b(X)}$. Note that if $\varphi, \chi \in \sigma(C_b(X))$ then $\varphi = \chi$ if and only if $\ker(\varphi) = \ker(\chi)$.

Lemma 3.5.1. σ(Cb(X)) is compact.

Proof. Suppose $\varphi \in \sigma(C_b(X))$, and $f \in C_b(X)$ with $\|f\|_\infty \le 1$. We claim that $|\varphi(f)| \le 1$. If this were not the case then the function $g(x) = \varphi(f) - f(x)$ would satisfy $|g(x)| \ge |\varphi(f)| - 1 > 0$ for each $x \in X$ and hence the function $h(x) = 1/g(x)$ would be in $C_b(X)$. However, since $\varphi(g) = \varphi(f)\varphi(1) - \varphi(f) = 0$, we would then have $1 = \varphi(1) = \varphi(g)\varphi(h) = 0$, a contradiction.

Also, restricting to $B = C_b(X)_1 = \{f \in C_b(X) \mid \|f\|_\infty \le 1\}$ gives the same topology of pointwise convergence and hence we may view $\sigma(C_b(X))$ as a subspace of $D^B$ where $D$ is the closed unit disc in $\mathbb{C}$. Since $D^B$ is compact it is then enough to show that $\sigma(C_b(X))$ is a closed subspace. Suppose therefore that $\{\varphi_\alpha\}_{\alpha \in A}$ is a net of homomorphisms which converges pointwise to a function $\varphi : C_b(X) \to \mathbb{C}$. As addition and scalar multiplication are continuous in $\mathbb{C}$ it then follows that $\varphi$ is linear, and as multiplication is jointly continuous in $\mathbb{C}$ it follows that $\varphi(fg) = \varphi(f)\varphi(g)$ for all $f, g \in C_b(X)$. Moreover, $\varphi(1) = \lim_\alpha \varphi_\alpha(1) = 1$. Therefore $\varphi$ is a homomorphism.

Theorem 3.5.2 (Stone). Let $X$ be a topological space. For each $x \in X$ denote by $\beta_x : C_b(X) \to \mathbb{C}$ the homomorphism given by $\beta_x(f) = f(x)$, then $X \ni x \mapsto \beta_x \in \sigma(C_b(X))$ is a continuous map with dense image which satisfies the universal property that if $\pi : X \to K$ is any continuous map into a compact Hausdorff space $K$, then there exists a unique continuous map $\beta\pi : \sigma(C_b(X)) \to K$, such that for $x \in X$ we have $\pi(x) = \beta\pi(\beta_x)$. In particular, if $X$ is a compact Hausdorff space then $\beta$ is a homeomorphism.

Proof. If $\{x_i\} \subset X$ is a net such that $x_i \to x$, then for any $f \in C_b(X)$ we have

$\beta_{x_i}(f) = f(x_i) \to f(x) = \beta_x(f),$

hence $\beta_{x_i} \to \beta_x$. Thus, $x \mapsto \beta_x$ is continuous.

To show that this map has dense image we suppose by way of contradiction that $\varphi \in \sigma(C_b(X))$ is not in the closure of $\beta(X)$, and set $I = \ker(\varphi)$. If $\psi \in \overline{\beta(X)}$, then there exists $f_\psi \in I$ such that $f_\psi \notin \ker(\psi)$. Hence, for some $c_\psi > 0$, and an open neighborhood $O_\psi$ of $\psi$, we have that $|\psi'(f_\psi)| > c_\psi$ for all $\psi' \in O_\psi$. As $\overline{\beta(X)}$ is compact we may take a finite subcover of the cover $\{O_\psi\}_{\psi \in \overline{\beta(X)}}$. Thus, we obtain $f_1, \ldots, f_n \in I$, and $c > 0$ such that $\sum_{i=1}^n \psi(|f_i|^2) > c$ for all $\psi \in \overline{\beta(X)}$. In particular we have $\sum_{i=1}^n |f_i|^2(x) = \beta_x(\sum_{i=1}^n |f_i|^2) > c$, for all $x \in X$. Thus, if we set $f = \sum_{i=1}^n |f_i|^2$ and $g = 1/f$, then $g \in C_b(X)$ and we have $fg = 1$. Since $I$ is an ideal we have $f = \sum_{i=1}^n f_i \overline{f_i} \in I$, and we would then have $1 = \varphi(1) = \varphi(fg) = \varphi(f)\varphi(g) = 0$, a contradiction. Thus, we must have that $\overline{\beta(X)} = \sigma(C_b(X))$.

If $X$ is a compact Hausdorff space then $\beta$ is surjective since the image is dense and compact. Moreover, $\beta$ is injective since $C_b(X)$ separates points. Hence, $\beta$ is a homeomorphism, being a continuous bijection between compact Hausdorff spaces.

In general, to see that $\beta : X \to \sigma(C_b(X))$ satisfies the above universal property, suppose that $K$ is a compact Hausdorff space and $\pi : X \to K$ is continuous. We then obtain a continuous map $\pi^* : C(K) \to C_b(X)$ given by $\pi^*(f)(x) = f(\pi(x))$. Thus, we obtain the continuous map $\tilde{\pi} : \sigma(C_b(X)) \to \sigma(C(K))$ by $\tilde{\pi}(\varphi)(g) = \varphi(\pi^*(g))$. Since $K$ is compact and Hausdorff we have established above that $\beta^K : K \to \sigma(C(K))$ is a homeomorphism. Thus, we obtain a continuous map $\beta\pi : \sigma(C_b(X)) \to K$ by setting $\beta\pi = (\beta^K)^{-1} \circ \tilde{\pi}$. If $x \in X$, and $g \in C(K)$ then we compute directly

$\tilde{\pi}(\beta_x)(g) = \beta_x(\pi^*(g)) = \pi^*(g)(x) = g(\pi(x)) = \beta^K_{\pi(x)}(g).$

Hence, $\beta\pi(\beta_x) = \pi(x)$. Uniqueness of $\beta\pi$ follows since $\beta(X)$ is dense in $\sigma(C_b(X))$ and $K$ is Hausdorff.

If $X$ is a topological space, then the Stone-Čech compactification of $X$ consists of a compact Hausdorff space $\beta X$, together with a continuous map $\beta : X \to \beta X$, which satisfies the universal property given in the previous theorem. It follows easily that, up to homeomorphism, this is uniquely defined by its universal property. The previous theorem shows that $\beta X$ exists and may be identified with $\sigma(C_b(X))$. The following easy consequence (implicit already in Tychonoff's work) was obtained independently by Čech using different methods:

Corollary 3.5.3 (Stone, Čech). Let $X$ be a topological space, then $\beta : X \to \beta X$ is a homeomorphism onto its image if and only if $X$ is a Tychonoff space.

Proof. Suppose first that $X$ is Tychonoff. From the previous theorem we have that $\beta : X \to \beta X$ is continuous. Since $X$ is Tychonoff we have, in particular, that $C_b(X)$ separates points, and it then follows that $\beta$ is injective. Thus, we just need to show that $\beta$ is an open map onto $\beta(X)$. Suppose that $F \subset X$ is closed and $x \in X \setminus F$. As $X$ is Tychonoff there exists $f : X \to [0, 1]$ continuous so that $f|_F = 0$, and $f(x) = 1$. Thus, we have $\beta_x(f) = 1$ while $\beta_y(f) = 0$ for all $y \in F$, and hence $\beta_x \notin \overline{\beta(F)}$. Since $x \notin F$ was arbitrary it follows that $\overline{\beta(F)} \cap \beta(X) = \beta(F)$, and hence $\beta(F)$ is closed in $\beta(X)$. As $\beta$ is injective, taking complements shows that $\beta$ is an open map onto $\beta(X)$.

Conversely, if $\beta$ is a homeomorphism onto its image, then $X$ is homeomorphic to a subspace of the compact Hausdorff space $\beta X$. Compact Hausdorff spaces are normal and hence Tychonoff by Corollary 3.3.6, and a subspace of a Tychonoff space is again Tychonoff.

The previous corollary gives the following characterization of Tychonoff spaces.

Corollary 3.5.4. A topological space $X$ is Tychonoff if and only if $X$ is homeomorphic to a subspace of a compact Hausdorff space.

Considering the one-point compactification gives the following:

Corollary 3.5.5. Locally compact spaces are Tychonoff.

Theorem 3.5.6 (The Tietze Extension Theorem for Tychonoff spaces). Let $X$ be a Tychonoff space, $K \subset U \subset X$, with $K$ compact, and $U$ open. If $f \in C(K)$ then there exists $F \in C_b(X)$, with $\|F\|_\infty = \|f\|_\infty$, such that $F|_K = f$, and $F|_{U^c} = 0$.

Proof. Since $X$ is Tychonoff, the map $\beta : X \to \beta X$ is a homeomorphism onto its image. Thus $\beta(K) \subset \beta X$ is compact, and there exists $V \subset \beta X$ open such that $\beta(K) \subset \beta(U) = V \cap \beta(X) \subset V$. We consider the function $g : \beta(K) \cup V^c \to \mathbb{C}$ by setting $g(\beta(k)) = f(k)$ for $k \in K$, and $g(x) = 0$ for $x \in V^c$. Since $\beta(K)$ and $V^c$ are disjoint closed sets, they are both clopen in the relative topology on $\beta(K) \cup V^c$. Thus, $g$ is continuous and by the Tietze Extension Theorem for compact Hausdorff spaces there is then a continuous function $G \in C(\beta X)$, with $\|G\|_\infty = \|g\|_\infty = \|f\|_\infty$ so that $G|_{\beta(K)} = g|_{\beta(K)}$, and $G|_{V^c} = 0$. Taking $F = G \circ \beta$ then gives the desired function.
Lemma 3.5.7. If $X$ is normal and second countable then there exists a countable family $\mathcal{F} \subset C_b(X; [0, 1])$ which separates points.

Proof. Let $\mathcal{E}$ be a countable base for $X$. For each $U, V \in \mathcal{E}$ such that $\overline{U} \subset V$ we may use Urysohn's lemma to construct a continuous function $f_{U,V} : X \to [0, 1]$ so that $f_{U,V}|_U = 0$ and $f_{U,V}|_{V^c} = 1$. We let $\mathcal{F}$ be the collection of all such $f_{U,V}$ and claim that $\mathcal{F}$ separates points. Indeed, if $x, y \in X$ with $x \neq y$, then as $X$ is normal there exist disjoint closed neighborhoods $E$ and $F$ of $x$ and $y$ respectively. Then there must exist $U, V \in \mathcal{E}$ such that $x \in U \subset \overline{U} \subset V \subset E$, so that $y \in F \subset V^c$. We then have that $f_{U,V}(x) = 0$ while $f_{U,V}(y) = 1$.

Proposition 3.5.8. Every second countable normal space is homeomorphic to a subspace of the Hilbert cube $[0, 1]^{\mathbb{N}}$.

Proof. Suppose $X$ is normal and second countable. Then by Lemma 3.5.7 there is a countable family $\mathcal{F} \subset C_b(X; [0, 1])$ which separates points in $X$. Consider the evaluation map $e : X \to [0, 1]^{\mathcal{F}}$ given by $e(x)(f) = f(x)$. Then this is continuous and since $\mathcal{F}$ separates points it is injective. Since $[0, 1]^{\mathcal{F}}$ is compact, $e$ extends to a continuous map $\beta e : \beta X \to [0, 1]^{\mathcal{F}}$. If $F \subset X$ is closed then its closure $\overline{F} \subset \beta X$ is compact and satisfies $\overline{F} \cap X = F$. As $\beta e$ is continuous we have that $\beta e(\overline{F})$ is compact, and hence $e(F) = \beta e(\overline{F}) \cap e(X)$ is closed in $e(X)$. Thus, the map $e$ onto its image preserves closed sets and hence $e$ is a homeomorphism of $X$ onto its image in $[0, 1]^{\mathcal{F}}$.

Theorem 3.5.9 (The Urysohn Metrization Theorem). Every second countable normal space is metrizable.

Proof. The Hilbert cube is metrizable. Indeed, the explicit metric $d(f, g) = \sum_{n=1}^\infty 2^{-n} |f(n) - g(n)|$ is easily seen to give the topology on $[0, 1]^{\mathbb{N}}$. Since subspaces of metrizable spaces are again metrizable, the result then follows from Proposition 3.5.8.
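The metric on the Hilbert cube can be approximated by truncating the series. The following Python sketch is an illustration only: points of $[0,1]^{\mathbb{N}}$ are modelled as functions on the positive integers, and the names `hilbert_cube_distance`, `unit_vector`, and the truncation length are assumptions introduced here.

```python
def hilbert_cube_distance(f, g, terms=60):
    """Approximate d(f, g) = sum_{n>=1} 2^{-n} |f(n) - g(n)| by its first `terms` terms.

    Since |f(n) - g(n)| <= 1, the neglected tail is at most 2^{-terms}.
    """
    return sum(abs(f(n) - g(n)) / 2.0 ** n for n in range(1, terms + 1))


def unit_vector(k):
    """The point of [0, 1]^N which is 1 in coordinate k and 0 elsewhere."""
    return lambda n: 1.0 if n == k else 0.0


zero = lambda n: 0.0
one = lambda n: 1.0

# The points unit_vector(k) converge coordinatewise to zero, and their
# distance to zero is 2^{-k}, which tends to 0 as k grows.
distances = [hilbert_cube_distance(unit_vector(k), zero) for k in range(1, 11)]
print(distances)
```

The weights $2^{-n}$ are what make coordinatewise convergence and convergence in $d$ agree: a fixed error in a late coordinate contributes very little to the distance.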

3.5.1 Exercises

Exercise 3.5.10. Let $X$ be a compact Hausdorff space. Show that $X$ is second countable if and only if $C(X)$ is separable.

Exercise 3.5.11. Suppose that a topological space $X$ has a countable basis of clopen sets, show that $X$ embeds into $\{0, 1\}^{\mathbb{N}}$.

Exercise 3.5.12. Let X and Y be compact Hausdorﬀ spaces and suppose φ : C(X) → C(Y ) is a (unital) homomorphism, i.e., φ is complex linear, φ(1) = 1, and φ(fg) = φ(f)φ(g) for all f, g ∈ C(X). Show that there exists a unique continuous map π : Y → X so that φ(f) = f ◦ π for all f ∈ C(X). Moreover, show that π is bijective if and only if φ is bijective.

3.6 The property of Baire

A topological space X is completely metrizable if there is a complete metric on X which gives the topology.

Proposition 3.6.1. A $G_\delta$ subset $A$ of a completely metrizable space $X$ is completely metrizable in the relative topology.

Proof. Suppose that $d$ is a complete metric giving the topology on $X$. We consider first the case when $A$ is open. In this case we may consider the metric $d_1$ on $A$ given by $d_1(x, y) = d(x, y) + \left| \frac{1}{d(x, A^c)} - \frac{1}{d(y, A^c)} \right|$. Then it is easy to check that $d_1$ is a complete metric on $A$ which gives the relative topology.
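To see why the extra term in $d_1$ restores completeness, consider the open set $A = (0, 1) \subset \mathbb{R}$ with the usual metric. The Python sketch below is an illustration under that assumption (the example set and the function names are not from the notes): the sequence $1/n$ is Cauchy for the usual metric but has no limit in $A$, and under $d_1$ its consecutive terms stay at distance at least $1$, so it is no longer Cauchy.

```python
def dist_to_complement(x):
    """d(x, A^c) for A = (0, 1) in R with the usual metric, for x in A."""
    return min(x, 1.0 - x)


def d1(x, y):
    """The metric d_1(x, y) = |x - y| + |1/d(x, A^c) - 1/d(y, A^c)| on A."""
    return abs(x - y) + abs(1.0 / dist_to_complement(x) - 1.0 / dist_to_complement(y))


xs = [1.0 / n for n in range(2, 12)]  # 1/2, 1/3, ..., 1/11, all in A
usual_gaps = [abs(a - b) for a, b in zip(xs, xs[1:])]
d1_gaps = [d1(a, b) for a, b in zip(xs, xs[1:])]
print(usual_gaps[-1], d1_gaps[-1])
```

Sequences that approach the boundary of $A$ are pushed apart by the term involving $1/d(\cdot, A^c)$, while the two metrics remain topologically equivalent on $A$.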

Next suppose that $A = \cap_{n \in \mathbb{N}} O_n$ where each $O_n \subset X$ is open. For each $n \in \mathbb{N}$ we let $d_n$ be a complete metric on $O_n$ which gives the relative topology on $O_n$. Replacing $d_n(x, y)$ with $\frac{d_n(x, y)}{1 + d_n(x, y)}$ we assume that $d_n(x, y) < 1$ for all $x, y \in O_n$. We define the metric $\tilde{d}$ on $A$ by $\tilde{d}(x, y) = \sum_{n=1}^\infty 2^{-n} d_n(x, y)$. It is then easy to see that $\tilde{d}$ gives a complete metric on $A$ which gives the relative topology.

Theorem 3.6.2 (Kuratowski's Extension Theorem). Let $X$ be a Hausdorff space and $A \subset X$ a dense subset of $X$. Suppose that $Y$ is a completely metrizable space and $f : A \to Y$ is continuous, then there exists a continuous extension $\tilde{f} : B \to Y$ where $B \subset X$ is $G_\delta$ with $A \subset B$.

Proof. We fix a complete metric $d$ on $Y$. For $x \in X$ we set

oscf (x) = inf{diamf(U ∩ A) | U an open neighborhood of x}.

Then $B_n = \{x \in X \mid \mathrm{osc}_f(x) < 1/n\}$ is open for each $n \in \mathbb{N}$, and hence $B = \cap_n B_n = \{x \in X \mid \mathrm{osc}_f(x) = 0\}$ is $G_\delta$. Since $f$ is continuous on $A$ we have $A \subset B$.

We define $\tilde{f} : B \to Y$ by $\tilde{f}(x) = \lim_{\alpha \to \infty} f(x_\alpha)$ where $\{x_\alpha\}_\alpha \subset A$ is any net such that $x_\alpha \to x$. Since $B = \{x \in X \mid \mathrm{osc}_f(x) = 0\}$ and $Y$ is a complete space it follows easily that $\tilde{f}$ is a well defined continuous extension of $f$.

Corollary 3.6.3. Let $X$ be a Hausdorff space and $A \subset X$ a dense subset such that $A$ is completely metrizable, then $A$ is a $G_\delta$-set in $X$.

Proof. If we consider the identity map on $A$ then from Kuratowski's theorem there exists a $G_\delta$-set $G \subset X$ with $A \subset G$ and a continuous extension $\tilde{f} : G \to A \subset G$. Since $A$ is dense in $G$ and $\tilde{f}$ agrees with the identity on $A$ it then follows that $\tilde{f}$ is the identity map, hence $A = G$.

Corollary 3.6.4. A subspace $F$ of a completely metrizable space $X$ is completely metrizable if and only if $F$ is $G_\delta$.

Proof. Replacing $X$ with $\overline{F}$, we may assume that $F$ is dense. The result then follows from Proposition 3.6.1 and Corollary 3.6.3.

A Polish space is a topological space $X$ which is separable and completely metrizable.

Corollary 3.6.5. A topological space $X$ is Polish if and only if $X$ is homeomorphic to a $G_\delta$ subset of a second countable compact Hausdorff space.

Proof. If $X$ is Polish, then Proposition 3.5.8 shows that $X$ is homeomorphic to a subset of a second countable compact Hausdorff space, and Corollary 3.6.3 shows that this subset must be $G_\delta$. Conversely, second countable compact Hausdorff spaces are completely metrizable by Urysohn's Metrization Theorem, hence if $X$ is a $G_\delta$ subset then $X$ is completely metrizable by Corollary 3.6.4.

A topological space $X$ is Čech-complete if it is homeomorphic to a $G_\delta$-subset of a compact Hausdorff space.

Corollary 3.6.6. Let $X$ be a completely metrizable space, then $X$ is Čech-complete.

Proof. Let $\beta : X \to \beta X$ be the Stone-Čech compactification of $X$. As $X$ is Tychonoff, $\beta$ is a homeomorphism onto its image. Corollary 3.6.3 shows that the image must be a $G_\delta$-set.

Note that, by considering the one point compactification, any locally compact Hausdorff space is also Čech-complete.

Let $X$ be a topological space. A subset $A \subset X$ is meager if it is a countable union of nowhere dense sets. A subset $B \subset X$ is comeager (or residual) if its complement is meager. We say that $X$ is a Baire space if every comeager set is dense. Equivalently, $X$ is a Baire space if whenever $\{O_n\}_{n \in \mathbb{N}}$ is a sequence of open dense sets, we have that $\cap_n O_n$ is dense. The following lemma is left to the reader.

Lemma 3.6.7. Let X be a Baire space, and Y ⊂ X a dense Gδ-subset, then Y is a Baire space.

Theorem 3.6.8 (Baire). Čech-complete spaces are Baire.

Proof. By the previous lemma it is enough to show that compact Hausdorff spaces are Baire. Thus, suppose $X$ is a compact Hausdorff space and $\{O_n\}_{n \in \mathbb{N}}$ is a sequence of open dense sets. Let $U$ be any non-empty open set in $X$. We now inductively define a decreasing sequence of closed sets $\{F_n\}_n$, and nonempty open sets $\{G_n\}_n$ such that $G_n \subset F_n \subset U \cap O_1 \cap \cdots \cap O_n$. Since $O_1$ is dense, we have $O_1 \cap U \neq \emptyset$. Let $F_1$ be a closed subset of $O_1 \cap U$ with non-empty interior $G_1$. Now suppose $F_1, \ldots, F_n$ and $G_1, \ldots, G_n$ have been constructed. Since $O_{n+1}$ is dense we have that $O_{n+1} \cap G_n$ is non-empty and hence we may take $F_{n+1}$ to be any closed subset of $O_{n+1} \cap G_n$ with non-empty interior $G_{n+1}$. By construction we then have $\cap_n F_n \subset U \cap (\cap_n O_n)$, and by compactness we have that $\cap_n F_n$ is not empty. Since $U$ was an arbitrary non-empty open subset it follows that $\cap_n O_n$ is dense in $X$.

3.6.1 Exercises

Note that from Corollary 3.6.5 the space of irrationals R \ Q with its subspace topology is Polish, even though the usual metric is far from complete. The next two exercises give an explicit way to see this.

Exercise 3.6.9. Suppose $\{X_n\}_{n=1}^\infty$ is a sequence of completely metrizable spaces. Show that $\prod_{n=1}^\infty X_n$ is completely metrizable. Moreover, show that $\prod_{n=1}^\infty X_n$ is separable if each $X_n$ is separable.

Exercise 3.6.10. Show that $\mathbb{R} \setminus \mathbb{Q}$ and the Baire space $\mathbb{N}^{\mathbb{N}}$ are homeomorphic. Hint: Consider continued fraction expansions.

Exercise 3.6.11. Let X be a compact Hausdorﬀ space and suppose |X| = ∞. Show that, as a complex vector space, C(X) has no countable basis.

Exercise 3.6.12. Suppose $f_n : \mathbb{R} \to \mathbb{R}$ are continuous, and $f : \mathbb{R} \to \mathbb{R}$ such that $f_n(x) \to f(x)$ for each $x \in \mathbb{R}$.

1. Show that if $I$ is an open interval of positive length then $f^{-1}(I) \cap \overline{f^{-1}(I)^c}$ is $F_\sigma$ and nowhere dense.

2. Show that if $f$ is not continuous at a point $x$ then there exists an open interval $I$ with rational endpoints such that $x \in f^{-1}(I) \cap \overline{f^{-1}(I)^c}$.

3. Show that $f$ is continuous on a dense set of points in $\mathbb{R}$.

Exercise 3.6.13. Show that there exists a function $f \in C([0, 1])$ so that $f$ is not monotone on any interval of positive length.

Exercise 3.6.14 ([hg]). Let $f : \mathbb{R} \to \mathbb{R}$ be an infinitely differentiable function such that at each point $x \in \mathbb{R}$ there is a derivative $f^{(n)}$ so that $f^{(n)}(x) = 0$. Let

$Y = \{x \in \mathbb{R} \mid f|_O = p|_O \text{ for some polynomial } p \text{ and some neighborhood } O \text{ of } x\},$

and let $X = Y^c$. Suppose that $X \neq \emptyset$. For each $n \ge 0$ let $S_n = \{x \in \mathbb{R} \mid f^{(n)}(x) = 0\}$.

(a) Show that X is a closed set without isolated points.

(b) Show that there exists an interval $(a, b)$ such that $\emptyset \neq (a, b) \cap X \subset S_n$ for some $n \ge 0$.

(d) Conclude that, in fact, X = ∅, and deduce from this that f agrees with a polynomial on R.

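Exercise 3.6.15. For each $n, m \in \mathbb{N}$ let

$A_{n,m} = \left\{ f \in C([0, 1]) \;\middle|\; \text{there exists } x \in [0, 1] \text{ such that } \left| \frac{f(t) - f(x)}{t - x} \right| \le n \text{ if } 0 < |x - t| < \frac{1}{m} \right\}.$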

1. Show that if f ∈ C([0, 1]) is diﬀerentiable at some point in [0, 1] then f ∈ An,m for some n, m ∈ N.

2. Show that An,m is closed in C([0, 1]).

3. Show that An,m is nowhere dense in C([0, 1]).

4. Show that the set of functions $f \in C([0, 1])$ which are nowhere differentiable is dense in $C([0, 1])$.

3.7 Cantor spaces

A Cantor space is a non-empty compact Hausdorff space without isolated points and having a countable basis consisting of clopen sets. For example, suppose $\{F_n\}_{n=1}^\infty$ is a sequence of finite sets with $|F_n| \ge 2$, which we consider as discrete topological spaces, then $X = \prod_{n=1}^\infty F_n$ is a Cantor space. Indeed, by Tychonoff's theorem $X$ is non-empty compact Hausdorff. A countable basis of clopen sets is given by finite intersections of sets of the form $\pi_n^{-1}(E_n)$ where $E_n \subset F_n$ is a non-empty set. It is also easy to see that since $|F_n| \ge 2$, $X$ has no isolated points. Given a set A, we let A

A Cantor scheme on a set X is a family {As}s∈{0,1}

1. $A_{s^\frown 0} \cap A_{s^\frown 1} = \emptyset$, for $s \in \{0, 1\}^{<\mathbb{N}}$,

2. $A_{s^\frown i} \subset A_s$, for $s \in \{0, 1\}^{<\mathbb{N}}$.

If (X, d) is a metric space and we additionally have limn→∞ diam(As|n) = 0 for N any s ∈ {0, 1} then we say that {As}s∈{0,1}

Proof. Since {As}s∈{0,1}

Proof. Let C be a Cantor space. By Lemma 3.7.1, to prove the theorem it is enough to produce a Cantor scheme {As}s∈{0,1}

1. A∅ = X;

2. As is open nonempty;

3. $A_s = A_{s^\frown 0} \cup A_{s^\frown 1}$, for $s \in \{0, 1\}^{<\mathbb{N}}$.

We construct {As}s∈{0,1}

Proof. Let d be a compatible metric on X. By Lemma 3.7.1 it is enough to produce a Cantor scheme {Us}s∈{0,1}

1. Us is open nonempty for each s;

2. $\mathrm{diam}(U_s) \le 2^{-\mathrm{length}(s)}$;

3. $U_{s^\frown i} \subset U_s$, for $s \in \{0, 1\}^{<\mathbb{N}}$.

We construct such a scheme by induction on the length. Let $U_\emptyset$ be any open nonempty set with diameter at most 1. Given $U_s$, as $X$ has no isolated points there exist distinct points $x, y \in U_s$. We then let $U_{s^\frown 0}$ and $U_{s^\frown 1}$ be disjoint open neighborhoods in $U_s$ of $x$ and $y$ respectively so that each has diameter at most $2^{-\mathrm{length}(s)-1}$.

Theorem 3.7.4 (The Cantor-Bendixson theorem). Let $X$ be a Polish space. Then $X$ can be written uniquely as $P \cup C$, where $P$ has no isolated points, and $C$ is countable open.

Proof. We let $P$ denote the set of points $x \in X$ such that any neighborhood of $x$ has uncountably many points, and we let $C = P^c$. If $\{O_n\}_{n \in \mathbb{N}}$ is a countable open basis, then $C$ is the union of all countable $O_n$, hence, $C$ is countable and open. Each neighborhood in $X$ of each point in $P$ is uncountable, and since $C$ is countable, this also holds for each neighborhood in $P$, thus $P$ has no isolated points.

For uniqueness, suppose that $X = Q \cup D$ where $D$ is countable open and $Q$ has no isolated points. Since $D$ is countable open we clearly have that $D \subset C$. If $x \in C \setminus D$ were isolated in $C \setminus D$, then as $C$ is open we would have that $x$ is also isolated in $Q$. However, $Q$ has no isolated points and hence we conclude that $C \setminus D$ also has no isolated points. By Proposition 3.6.1 we then have that $C \setminus D$ is a countable Polish space without isolated points. Proposition 3.7.3 then shows that $C \setminus D = \emptyset$ and hence $C = D$, and $P = Q$.

Lemma 3.7.5. Suppose $A \subset \{0, 1\}^{\mathbb{N}}$ is nonempty closed, then there exists a continuous map $f : \{0, 1\}^{\mathbb{N}} \to A$ so that $f$ is the identity on $A$.

Proof. For each $s \in \{0, 1\}^{\mathbb{N}}$ we define $f(s)$ to be the point in $A$ which has the longest common initial segment with $s$. It's easy to see that $f$ is well defined, and if $s, t \in \{0, 1\}^{\mathbb{N}}$ and $k \in \mathbb{N}$ are such that $s|k = t|k$ then $f(s)|k = f(t)|k$. Since a sequence $\{s_n\}_{n=1}^\infty \subset \{0, 1\}^{\mathbb{N}}$ converges to a point $s$ if and only if for each $k \in \mathbb{N}$ we have $s_n|k = s|k$ for large enough $n$, it then follows that $f$ is continuous.

In Proposition 3.5.8 we saw that every compact metric space has a continuous injective map into the Hilbert cube. The following result gives a nice complement to this result.

Theorem 3.7.6 (The Hausdorﬀ-Alexandroﬀ theorem). Every nonempty compact metric space X is a continuous image of the Cantor space.

Proof. Set $C = \{0, 1\}^{\mathbb{N}}$. We first prove the theorem in the case when $X$ is the Hilbert cube. Note that the map $f(x) = \sum_{n=0}^\infty x_n 2^{-n-1}$ maps $C$ continuously onto $[0, 1]$, hence $C^{\mathbb{N}}$ maps continuously onto $[0, 1]^{\mathbb{N}}$. Since $C^{\mathbb{N}}$ is homeomorphic to $C$ we are done.

We now consider the general case. Since $X$ is a compact metric space, Proposition 3.5.8 shows that we may assume $X \subset [0, 1]^{\mathbb{N}}$. From above we know that there is a continuous surjection $g : C \to [0, 1]^{\mathbb{N}}$. Then $g^{-1}(X) \subset C$ is closed and from Lemma 3.7.5 there is a continuous surjection $h : C \to g^{-1}(X)$. The map $g \circ h$ then gives a continuous surjection of $C$ onto $X$.
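The surjection $f(x) = \sum_{n \ge 0} x_n 2^{-n-1}$ from the Cantor space onto $[0, 1]$ is just binary expansion. As a hedged illustration, the Python sketch below works with finite prefixes of sequences; the names `cantor_to_unit` and `unit_to_cantor` and the prefix length 50 are assumptions made here, not notation from the notes.

```python
def cantor_to_unit(bits):
    """f(x) = sum_{n>=0} x_n 2^{-n-1}, evaluated on a finite prefix `bits` of x."""
    return sum(b / 2.0 ** (n + 1) for n, b in enumerate(bits))


def unit_to_cantor(y, length=50):
    """Prefix of one preimage of y in [0, 1]: the greedy binary digits of y."""
    bits = []
    for _ in range(length):
        y *= 2
        if y >= 1:
            bits.append(1)
            y -= 1
        else:
            bits.append(0)
    return bits


for y in [0.0, 0.3, 0.5, 1.0]:
    approx = cantor_to_unit(unit_to_cantor(y))
    print(y, approx)

# f is not injective: 1000... and 0111... both represent 1/2 in the limit.
a = cantor_to_unit([1] + [0] * 49)
b = cantor_to_unit([0] + [1] * 49)
```

The failure of injectivity at dyadic rationals is exactly why $[0, 1]$ is a continuous image of the Cantor space without being homeomorphic to it.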

3.7.1 Exercises

Exercise 3.7.7. Give an explicit homeomorphism between the Cantor space $\{0, 1\}^{\mathbb{N}}$ and the usual Cantor set $C \subset [0, 1]$.

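A Souslin scheme on a set $X$ is a family $\{A_s\}_{s \in \mathbb{N}^{<\mathbb{N}}}$ of subsets of $X$. A Lusin scheme on $X$ is a Souslin scheme such that

1. $A_{s^\frown i} \cap A_{s^\frown j} = \emptyset$, for $s \in \mathbb{N}^{<\mathbb{N}}$, $i \neq j$.

2. $A_{s^\frown i} \subset A_s$, for $s \in \mathbb{N}^{<\mathbb{N}}$.

If $(X, d)$ is a metric space and we additionally have $\lim_{n \to \infty} \mathrm{diam}(A_{s|n}) = 0$ for any $s \in \mathbb{N}^{\mathbb{N}}$ then we say that $\{A_s\}_{s \in \mathbb{N}^{<\mathbb{N}}}$ has vanishing diameter. In this case we let $D = \{s \in \mathbb{N}^{\mathbb{N}} \mid \cap_{n \in \mathbb{N}} A_{s|n} \neq \emptyset\}$, and for $s \in D$ we define $f(s) \in X$ so that $\{f(s)\} = \cap_{n \in \mathbb{N}} A_{s|n}$. The map $f : D \to X$ is the associated map.

Exercise 3.7.8. Suppose $(X, d)$ is a metric space and we have a Souslin scheme $\{A_s\}_{s \in \mathbb{N}^{<\mathbb{N}}}$ which has vanishing diameter, and associated map $f : D \to X$.

1. Show that $f$ is continuous.

3. Show that $f$ is injective if $\{A_s\}_{s \in \mathbb{N}^{<\mathbb{N}}}$ is a Lusin scheme.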

A ⊂ X, there exists $\varepsilon > 0$, so that if $A \subset \cup_{n \in \mathbb{N}} B_n$ and $\mathrm{diam}(B_n) < \varepsilon$, then $B_n \neq \emptyset$ for infinitely many $n \in \mathbb{N}$.

Exercise 3.7.11. Prove the Alexandrov-Urysohn Theorem: If $X$ is a Polish space which has a countable basis of clopen sets and such that any compact subset of $X$ has empty interior, then $X$ is homeomorphic to $\mathbb{N}^{\mathbb{N}}$.

Exercise 3.7.12. Show that if $(X, d)$ is a nonempty countable metric space without isolated points, then there is an embedding $F : X \to \mathbb{N}^{\mathbb{N}}$ which has dense image. Hint: Find a Lusin scheme on $X$ so that the associated map $f : D \to X$ is open, and bijective, with $D$ dense, and then let $F = f^{-1}$.

Exercise 3.7.13. Prove Sierpiński's Theorem: Let $(X, d)$ be a nonempty countable metric space without isolated points, then $X$ is homeomorphic to $\mathbb{Q}$ with its usual topology. Hint: Combine Exercises 3.1.9, 3.6.10, and 3.7.12.

Exercise 3.7.14. Consider $\mathbb{Q}^2$ with the lexicographical ordering $(q, r) \le (s, t)$ if and only if either $q < s$, or $q = s$ and $r \le t$. Show that $\mathbb{Q}^2$ with the corresponding order topology is homeomorphic to $\mathbb{Q}^2$ with its usual product topology.

Exercise 3.7.15 (Benyamini). Show that there exists $f \in C_b(\mathbb{R})$ such that given any doubly infinite sequence $\{y_n\}_{n \in \mathbb{Z}} \subset [0, 1]$ there is a point $t \in \mathbb{R}$ so that $y_n = f(t + n)$ for $n \in \mathbb{Z}$. Hint: If $C \subset [0, 1/2]$ is homeomorphic to the Cantor set, first construct a continuous surjection $f : C \to [0, 1]^{\mathbb{Z}}$. Then define $g : \cup_{n \in \mathbb{Z}} (C + n) \to \mathbb{R}$ by $g(t + n) = \pi_n(f(t))$ for $t \in C$. Then extend $g$ to a continuous function in $C_b(\mathbb{R})$.

3.8 Standard Borel spaces

A standard Borel space is a measurable space $(X, \mathcal{M})$ so that $\mathcal{M}$ is the Borel σ-algebra for some Polish topology on $X$. A standard measure space is a σ-finite measure space $(X, \mathcal{M}, \mu)$ whose underlying measurable structure $(X, \mathcal{M})$ is a standard Borel space. If, in addition, we have $\mu(X) = 1$, then we say that $(X, \mathcal{M}, \mu)$ is a standard probability space. A Lebesgue space is a standard probability space which has no atoms, e.g., $X = [0, 1]$ with its Borel structure and Lebesgue measure.

Lemma 3.8.1. Let $X$ be a Polish space with topology $\mathcal{T}$, and suppose $A \subset X$ is closed, then the topology generated by $\mathcal{T} \cup \{A\}$ is again a Polish topology.

Proof. Since $X$ is Polish there exists a complete metric $d$ on $X$ such that $(X, d)$ gives the topology $\mathcal{T}$ on $X$. Replacing $d$ with $\frac{d(x,y)}{1 + d(x,y)}$ we may assume that $\mathrm{diam}_d(X) \le 1$. Since $A \subset X$ is closed, $d$ restricts to a complete metric on $A$, and from Proposition 3.6.1 we have a complete metric $d_1$ on $A^c$, which satisfies $\mathrm{diam}_{d_1}(A^c) \le 1$, and gives the topological structure to $A^c$. We may then define a metric on $X$ by

$\tilde{d}(x, y) = \begin{cases} d(x, y) & \text{if } x, y \in A, \\ d_1(x, y) & \text{if } x, y \in A^c, \\ 1 & \text{otherwise.} \end{cases}$

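Then $\tilde{d}$ is a complete metric on $X$, and the corresponding topology is the one generated by $\mathcal{T} \cup \{A\}$.

Lemma 3.8.2. Let $X$ be a Polish space with topology $\mathcal{T}$, and suppose that $\{\mathcal{T}_n\}_{n=1}^\infty$ is a sequence of Polish topologies on $X$ such that $\mathcal{T} \subset \mathcal{T}_n$ for each $n \in \mathbb{N}$, then the topology $\mathcal{T}_\infty$ generated by $\cup_{n \in \mathbb{N}} \mathcal{T}_n$ is again a Polish topology on $X$. Moreover, if $B(\mathcal{T}_n) = B(\mathcal{T})$ for all $n \in \mathbb{N}$, then $B(\mathcal{T}_\infty) = B(\mathcal{T})$.

Proof. Let $X_n = (X, \mathcal{T}_n)$ for $n \in \mathbb{N}$. Consider the map $\varphi : X \to \prod_{n=1}^\infty X_n$ given by $\varphi(x) = (x, x, \ldots)$. Then $\varphi$ gives a homeomorphism between $(X, \mathcal{T}_\infty)$ and $\varphi(X) \subset \prod_{n=1}^\infty X_n$. Thus, to show that $(X, \mathcal{T}_\infty)$ is Polish, it is enough to show that $\varphi(X) \subset \prod_{n=1}^\infty X_n$ is closed. Suppose $(x_n) \notin \varphi(X)$, then for some $i < j$ we have $x_i \neq x_j$. We let $U$ and $V$ be disjoint open sets in $\mathcal{T}$ (hence also in $\mathcal{T}_i$ and $\mathcal{T}_j$) so that $x_i \in U$ and $x_j \in V$, then

$(x_n) \in \pi_i^{-1}(U) \cap \pi_j^{-1}(V) \subset \varphi(X)^c.$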

Since Polish spaces are separable, given any set G which generates the topology, we have that any open set is a countable union of ﬁnite intersections in G. Thus, if G is in a given σ-algebra M, then this σ-algebra contains all Borel sets.

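If $\mathcal{T}_n \subset B(\mathcal{T})$ for each $n \in \mathbb{N}$, then $\cup_{n \in \mathbb{N}} \mathcal{T}_n \subset B(\mathcal{T})$ and this generates the Polish topology $\mathcal{T}_\infty$. Thus, from the remark above we have that $B(\mathcal{T}_\infty) \subset B(\mathcal{T})$. The reverse inclusion holds since $\mathcal{T} \subset \mathcal{T}_\infty$.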
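The metric $\tilde{d}$ from the proof of Lemma 3.8.1 can be made concrete in a simple case. The Python sketch below assumes $X = \mathbb{R}$ and the closed set $A = (-\infty, 0]$; this example, the function names, and the particular bounded metrics are illustrative assumptions, with $d_1$ obtained from the formula in Proposition 3.6.1 and then rescaled to be bounded by $1$.

```python
def d(x, y):
    """A bounded complete metric on R which gives the usual topology."""
    r = abs(x - y)
    return r / (1.0 + r)


def dist_to_A(x):
    """d-distance from a point x > 0 to the closed set A = (-inf, 0]."""
    return d(x, 0.0)


def d1(x, y):
    """A bounded complete metric on the open set A^c = (0, inf)."""
    e = d(x, y) + abs(1.0 / dist_to_A(x) - 1.0 / dist_to_A(y))
    return e / (1.0 + e)


def d_tilde(x, y):
    """The metric from Lemma 3.8.1: d on A, d1 on A^c, and 1 across the two pieces."""
    x_in_A, y_in_A = x <= 0, y <= 0
    if x_in_A and y_in_A:
        return d(x, y)
    if not x_in_A and not y_in_A:
        return d1(x, y)
    return 1.0


# In the usual topology 1/n -> 0, but in the new metric the points 1/n stay at
# distance 1 from 0, reflecting that A has become an open (hence clopen) set.
old = [d(1.0 / n, 0.0) for n in range(1, 6)]
new = [d_tilde(1.0 / n, 0.0) for n in range(1, 6)]
print(old, new)
```

Declaring points of $A$ and $A^c$ to be at distance $1$ disconnects the two pieces without changing the topology inside either piece, which is why the Borel structure is unaffected.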

Theorem 3.8.3. Let $X$ be a Polish space, and $\{E_n\}_{n \in \mathbb{N}}$ a countable collection of Borel subsets, then there exists a finer Polish topology on $X$ with the same Borel structure, such that for each $n \in \mathbb{N}$, $E_n$ is clopen in this new topology.

Proof. We first consider the case of a single Borel subset $E \subset X$. We let $\mathcal{A}$ denote the set of subsets which satisfy the conclusion of the theorem and we let $\mathcal{B}$ be the σ-algebra of Borel subsets of $X$. Lemma 3.8.1 shows that $\mathcal{A}$ contains all closed subsets of $X$. It is also clear that $\mathcal{A}$ is closed under taking complements. Thus, to conclude that $\mathcal{B} \subset \mathcal{A}$ it is then enough to show that $\mathcal{A}$ is closed under countable intersections. If $A_n \in \mathcal{A}$, and $\mathcal{T}_n$ are finer Polish topologies on $X$, with Borel structure $\mathcal{B}$, such that $A_n$ is clopen in $\mathcal{T}_n$ for each $n \in \mathbb{N}$, then by Lemma 3.8.2 there is a finer Polish topology $\mathcal{T}_\infty$ which generates $\mathcal{B}$ and such that $A_n$ is clopen in $\mathcal{T}_\infty$, for each $n \in \mathbb{N}$. We then have that $\cap_{n \in \mathbb{N}} A_n$ is closed in $\mathcal{T}_\infty$, and hence by Lemma 3.8.1 we have that $\cap_{n \in \mathbb{N}} A_n \in \mathcal{A}$.

Having established the result for a single Borel set $E$, we may then apply Lemma 3.8.2 to obtain the result for a sequence of Borel sets $\{E_n\}_{n \in \mathbb{N}}$.

Corollary 3.8.4. Let $(X, \mathcal{B})$ be a standard Borel space, and $E \in \mathcal{B}$ a Borel subset, then $(E, \mathcal{B}|_E)$ is a standard Borel space.

Proof. By the previous theorem we may assume $X$ is Polish and $E \subset X$ is clopen, and hence Polish. We then have that $\mathcal{B}|_E$ is the associated Borel structure on $E$ and hence $(E, \mathcal{B}|_E)$ is standard.

Corollary 3.8.5. Let $X$ be a standard Borel space, $Y$ a Polish space, and $f : X \to Y$ a Borel map, then there exists a Polish topology on $X$ which generates the same Borel structure and such that $f$ is continuous with respect to this topology.

Proof. Let $\{E_n\}$ be a countable basis for the topology on $Y$. By Theorem 3.8.3 there exists a Polish topology on $X$ which generates the same Borel structure and such that $f^{-1}(E_n)$ is clopen for each $n \in \mathbb{N}$. Hence, in this topology $f$ is continuous.

Lemma 3.8.6. Let $X$ be a Polish space, then there exists a Souslin scheme $\{E_s\}_{s \in \mathbb{N}^{<\mathbb{N}}}$ consisting of Borel subsets such that the following conditions are satisfied:

(i) $E_\emptyset = X$.

(iii) For each $s \in \mathbb{N}^{\mathbb{N}}$ the set $\cap_{n \in \mathbb{N}} E_{s|n}$ consists of at most one element.

(iv) For each $s \in \mathbb{N}^{\mathbb{N}}$, $\cap_{n \in \mathbb{N}} E_{s|n} = \{x\} \neq \emptyset$ if and only if $E_{s|n} \neq \emptyset$ for all $n \in \mathbb{N}$, and in this case for any sequence $x_n \in E_{s|n}$ we have $x_n \to x$.

Proof. Let $d$ be a complete metric on $X$ which generates the Polish topology on $X$, and such that $X$ has diameter at most 1. We will inductively construct $\{E_s\}_{s \in \mathbb{N}^{<\mathbb{N}}}$ so that for $s \in \mathbb{N}^n$ the diameter of $E_s$ is at most $2^{-n}$. First, we set $E_\emptyset = X$. Now suppose $E_s$ has been constructed for each $s \in \{\emptyset\} \cup \bigcup_{n=1}^k \mathbb{N}^n$. If $s \in \mathbb{N}^k$, let $\{x_n\}_{n \in \mathbb{N}}$ be a countable dense subset of $E_s$ (note that any subspace of a separable metric space is again separable). We define Esˆi = Es ∩ (B2−k−1 (xi) \ ∪j

0, it follows from completeness that there exists $x \in \cap_{n \in \mathbb{N}} E_{s|n}$, and for each sequence $x_n \in E_{s|n}$ we have $x_n \to x$.

If X is a standard Borel space and A, B ⊂ X are disjoint, then we say that A and B are Borel separated if there exists a Borel subset E ⊂ X such that A ⊂ E, and B ⊂ X \ E.

Lemma 3.8.7. Let X be a standard Borel space and suppose that A = ∪n∈NAn, and B = ∪m∈NBm, are such that An and Bm are Borel separated for each n, m ∈ N, then A and B are Borel separated.

Proof. Suppose $E_{n,m}$ is a Borel subset which separates $A_n$ and $B_m$ for each $n, m \in \mathbb{N}$, so that $A_n \subset E_{n,m}$ and $B_m \subset X \setminus E_{n,m}$. Then $E = \bigcup_{n \in \mathbb{N}} \bigcap_{m \in \mathbb{N}} E_{n,m}$ separates $A$ and $B$.

If $X$ is a Polish space, a subset $E \subset X$ is analytic if there exists a Polish space $Y$ and a continuous function $f : Y \to X$ such that $E = f(Y)$. Note that it follows from Corollary 3.8.5 that if $f : Y \to X$ is Borel then $f(Y)$ is analytic. In particular, it follows that all Borel sets are analytic. If $X$ is a standard Borel space then a subset $E \subset X$ is analytic if it is analytic for some (and hence all) Polish topologies on $X$ which give the Borel structure.

Theorem 3.8.8 (The Lusin Separation Theorem). Let $X$ be a standard Borel space, and $A, B \subset X$ two disjoint analytic sets, then $A$ and $B$ are Borel separated.

Proof. We may assume that $X$ is a Polish space, and that there are Polish spaces $Y_1$, and $Y_2$, and continuous functions $f_i : Y_i \to X$ such that $A = f_1(Y_1)$ and $B = f_2(Y_2)$. Let $\{E_s\}_{s \in \mathbb{N}^{<\mathbb{N}}}$ (resp. $\{F_s\}_{s \in \mathbb{N}^{<\mathbb{N}}}$) be a Souslin scheme for $Y_1$ (resp. $Y_2$) which satisfies the conditions in Lemma 3.8.6. If $A$ and $B$ are not Borel separated then by Lemma 3.8.7 we may recursively define sequences $s, r \in \mathbb{N}^{\mathbb{N}}$ such that $f_1(E_{s|n})$ and $f_2(F_{r|n})$ are not Borel separated for each $n \in \mathbb{N}$. In particular, we have that $E_{s|n}$ and $F_{r|n}$ are non-empty for each $n \in \mathbb{N}$, hence there exist $a \in Y_1$, $b \in Y_2$ such that $\cap_{n \in \mathbb{N}} E_{s|n} = \{a\}$, $\cap_{n \in \mathbb{N}} F_{r|n} = \{b\}$. Since $A$ and $B$ are disjoint we have $f_1(a) \neq f_2(b)$. If $V, W \subset X$ are disjoint open subsets with $f_1(a) \in V$, and $f_2(b) \in W$, then by continuity of $f_i$, for large enough $n$ we have $f_1(E_{s|n}) \subset V$, and $f_2(F_{r|n}) \subset W$. Hence $V$ separates $f_1(E_{s|n})$ from $f_2(F_{r|n})$ for large enough $n$, a contradiction.

Corollary 3.8.9. If $X$ is a standard Borel space then a subset $E \subset X$ is Borel if and only if both $E$ and $X \setminus E$ are analytic.
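The set-theoretic step in Lemma 3.8.7, namely that $E = \bigcup_n \bigcap_m E_{n,m}$ separates $A = \bigcup_n A_n$ from $B = \bigcup_m B_m$, can be checked on a finite toy example. The Python sketch below uses small finite sets; the particular sets and the helper `separator` are hypothetical choices made only for illustration.

```python
X = set(range(10))
A_parts = [{0, 1}, {2}, {3, 4}]   # the pieces A_n
B_parts = [{5}, {6, 7}, {8}]      # the pieces B_m


def separator(n, m):
    """A set E_{n,m} containing A_n and disjoint from B_m.

    It deliberately contains points of a different piece B_k, to show that the
    intersection over m is what removes them.
    """
    other = B_parts[(m + 1) % len(B_parts)]
    return A_parts[n] | other


E = set()
for n in range(len(A_parts)):
    inner = set(X)
    for m in range(len(B_parts)):
        inner &= separator(n, m)   # intersection over m of E_{n,m}
    E |= inner                     # union over n

A = set().union(*A_parts)
B = set().union(*B_parts)
print(sorted(E), A <= E, E.isdisjoint(B))
```

For fixed $n$ the intersection over $m$ still contains $A_n$ but misses every $B_m$, and taking the union over $n$ then recovers all of $A$; only countable operations are used, so $E$ is Borel whenever each $E_{n,m}$ is.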

Corollary 3.8.10. Let X be a standard Borel space, and let {An}n∈N be a sequence of disjoint analytic subsets, then there exists a sequence {En}n∈N of disjoint Borel subsets such that An ⊂ En for each n ∈ N.

Proof. It is easy to see that a countable union of analytic sets is analytic. Hence, by Lusin's separation theorem we may inductively define a sequence of Borel subsets {En}n∈N such that An ⊂ En, while (∪k>nAk) ∪ (∪k

Proof. We first show that $f(X)$ is Borel. By Corollary 3.8.5 we may assume that $X$ and $Y$ are Polish spaces and $f$ is continuous. Let $\{E_s\}_{s \in \mathbb{N}^{<\mathbb{N}}}$ be a Souslin scheme for $X$ which satisfies the conditions of Lemma 3.8.6. Then $\{f(E_s)\}_{s \in \mathbb{N}^{<\mathbb{N}}}$ gives a Souslin scheme of analytic sets for $Y$, and since $f$ is injective it follows that for each $s \in \mathbb{N}^n$ the sets $\{f(E_{s^\frown k})\}_{k \in \mathbb{N}}$ are pairwise disjoint. Thus, by Corollary 3.8.10 there exist pairwise disjoint Borel subsets $\{Y_{s^\frown k}\}_{k \in \mathbb{N}}$ such that $f(E_{s^\frown k}) \subset Y_{s^\frown k}$ for each $k \in \mathbb{N}$. We inductively define a new Souslin scheme $\{C_s\}_{s \in \mathbb{N}^{<\mathbb{N}}}$ for $Y$ by setting $C_\emptyset = Y$, and $C_{s^\frown k} = C_s \cap \overline{f(E_{s^\frown k})} \cap Y_{s^\frown k}$ for all $s \in \mathbb{N}^{<\mathbb{N}}$ and $k \in \mathbb{N}$. Then for each $s \in \mathbb{N}^{<\mathbb{N}}$ we have that $C_s$ is Borel, and also

\[ f(E_s) \subset C_s \subset \overline{f(E_s)}. \]

We claim that $f(X) = \bigcap_{k \in \mathbb{N}} \bigcup_{s \in \mathbb{N}^k} C_s$, from which it then follows that $f(X)$ is Borel. If $y \in f(X)$, then let $x \in X$ be such that $f(x) = y$. There exists $s \in \mathbb{N}^{\mathbb{N}}$ such that $x \in \bigcap_{k \in \mathbb{N}} E_{s|k}$, and hence $y \in \bigcap_{k \in \mathbb{N}} f(E_{s|k})$. Thus, $y \in \bigcap_{k \in \mathbb{N}} C_{s|k} \subset \bigcap_{k \in \mathbb{N}} \bigcup_{s \in \mathbb{N}^k} C_s$. Conversely, if $y \in \bigcap_{k \in \mathbb{N}} \bigcup_{s \in \mathbb{N}^k} C_s$, then there exists $s \in \mathbb{N}^{\mathbb{N}}$ such that $y \in C_{s|k} \subset \overline{f(E_{s|k})}$ for each $k \in \mathbb{N}$. Hence $E_{s|k} \neq \emptyset$ for each $k \in \mathbb{N}$ and thus $\bigcap_{k \in \mathbb{N}} E_{s|k} = \{x\}$ for some $x \in X$. We must then have that $f(x) = y$, since if this were not the case there would exist an open neighborhood $U$ of $f(x)$ such that $y \notin \overline{U}$. By continuity of $f$ we would then have that $f(E_{s|k}) \subset U$ for large enough $k$, and hence $y \in \bigcap_{k \in \mathbb{N}} \overline{f(E_{s|k})} \subset \overline{U}$, a contradiction.

Having established that $f(X)$ is Borel, the rest of the theorem follows easily. We have that $f$ gives a bijection from $X$ to $f(X)$ which is Borel, and if $E \subset X$ is Borel, then from Corollary 3.8.4 and the argument above we have that $f(E)$ is again Borel. Thus, $f^{-1}$ is a Borel map.

Corollary 3.8.12. Suppose $X$ and $Y$ are standard Borel spaces such that there exist injective Borel maps $f : X \to Y$ and $g : Y \to X$, then $X$ and $Y$ are isomorphic as standard Borel spaces.

Proof. Suppose $f : X \to Y$ and $g : Y \to X$ are injective Borel maps. From Theorem 3.8.11 we have that $f$ and $g$ are Borel isomorphisms onto their images and hence we may apply an argument used for the Cantor-Schröder-Bernstein theorem. Specifically, if we set $B = \bigcup_{n \geq 0} (f \circ g)^n (Y \setminus f(X))$ (so that the union includes the term $Y \setminus f(X)$ itself), and we set $A = X \setminus g(B)$, then we have $g(B) = X \setminus A$, and

\[ f(A) = f(X) \setminus (f \circ g)(B) = Y \setminus ((Y \setminus f(X)) \cup (f \circ g)(B)) = Y \setminus B. \]

Hence if we define $\theta : X \to Y$ by

\[ \theta(x) = \begin{cases} f(x) & \text{if } x \in A, \\ g^{-1}(x) & \text{if } x \in X \setminus A = g(B), \end{cases} \]

then we have that $\theta$ is a bijective Borel map whose inverse is also Borel.

Theorem 3.8.13 (Kuratowski). Any two uncountable standard Borel spaces are isomorphic. In particular, two standard Borel spaces $X$ and $Y$ are isomorphic if and only if they have the same cardinality.

Proof. Let $X$ be an uncountable standard Borel space. We will show that $X$ is isomorphic as a Borel space to the Polish space $C = 2^{\mathbb{N}}$. Note that by Corollary 3.8.12 it is enough to show that there exist injective Borel maps $f : X \to C$ and $g : C \to X$. Such a map $g$ exists by Proposition 3.7.3 and the Cantor-Bendixson theorem, so we only need to construct $f$.

To construct $f$, fix a metric $d$ on $X$ such that $d$ gives the Borel structure to $X$ and such that the diameter of $X$ is at most 1. Let $\{x_n\}$ be a countable dense subset of $(X, d)$, and define $f_0 : X \to [0, 1]^{\mathbb{N}}$ by $(f_0(x))(n) = d(x, x_n)$. The function $f_0$ is clearly injective and continuous, thus to construct $f$ it is enough to construct an injective Borel map from $[0, 1]^{\mathbb{N}}$ to $C$, and since $C^{\mathbb{N}}$ is homeomorphic to $C$, it is then enough to construct an injective Borel map from $[0, 1]$ to $C$, and this is easily done. For example, if $y \in [0, 1)$ then we may consider its dyadic expansion $y = \sum_{k=1}^\infty b_k 2^{-k}$, where in the case when $y$ is a dyadic rational we take the expansion such that $b_k$ is eventually 0. Then it is easy to see that $[0, 1) \ni y \mapsto \{b_k\}_k \in C$ gives an injective function which is continuous except at the countable family of dyadic rationals, hence is Borel. We may then extend this map to $[0, 1]$ by sending 1 to $(1, 1, 1, \ldots) \in C$.
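The dyadic digit map used at the end of this proof can be sketched in a few lines of Python. This is only an illustration: the function names `dyadic_digits` and `from_digits` are hypothetical, and exact rational arithmetic is assumed so that dyadic rationals receive the expansion that is eventually 0.

```python
from fractions import Fraction


def dyadic_digits(y, n):
    """Return the first n binary digits b_1, ..., b_n of y in [0, 1).

    Repeated doubling yields the expansion that is eventually 0 when y
    is a dyadic rational, matching the convention in the proof.
    """
    y = Fraction(y)
    if not 0 <= y < 1:
        raise ValueError("y must lie in [0, 1)")
    digits = []
    for _ in range(n):
        y *= 2
        bit = int(y >= 1)
        digits.append(bit)
        y -= bit
    return digits


def from_digits(digits):
    """Sum b_k 2^{-k} over the given finite digit sequence."""
    return sum(Fraction(b, 2 ** k) for k, b in enumerate(digits, start=1))


print(dyadic_digits(Fraction(1, 2), 4))  # dyadic rational: digits end in zeros
print(dyadic_digits(Fraction(1, 3), 4))
```

Distinct points of $[0, 1)$ have distinct digit sequences, which is the injectivity used above; the code only ever inspects finitely many digits.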

Theorem 3.8.14 (von Neumann). Let (X, M, µ) be a standard probability space which has no atoms, then there exists an isomorphism of standard Borel spaces θ : X → [0, 1] so that θ∗µ is Lebesgue measure on [0, 1].

Proof. By Kuratowski's theorem we may assume that $(X, \mathcal{M})$ is $[0, 1]$ with its Borel σ-algebra. We then consider the cumulative distribution function $f : [0, 1] \to [0, 1]$ given by $f(x) = \mu([0, x])$. Then $f$ is a monotone nondecreasing function which is continuous on the right. Since $f(x) - \lim_{t \to x^-} f(t) = \mu(\{x\}) = 0$ we see that $f$ is continuous. Moreover, $f(0) = 0$ and $f(1) = 1$, so $f$ is surjective. For each $y \in [0, 1]$ we have that $f^{-1}(\{y\})$ is a closed set and since $f$ is monotone this must be a closed interval. We let $M$ denote the set of $y$'s so that $f^{-1}(\{y\})$ is an interval of positive length. Then $M$ is countable and the set $N = f^{-1}(M)$ has $\mu$-measure zero. The restriction $g = f|_{X \setminus N}$ then gives a Borel isomorphism from $X \setminus N$ to $[0, 1] \setminus M$ such that $g_*\mu$ is Lebesgue measure on $[0, 1] \setminus M$.

We fix $C \subset [0, 1]$ an uncountable Borel set with Lebesgue measure zero, e.g., $C$ the usual Cantor set. Then $B = g^{-1}(C)$ has $\mu$-measure zero and is again uncountable. Thus, $\tilde{N} = N \cup B$ and $\tilde{M} = M \cup C$ are both uncountable standard Borel spaces by Corollary 3.8.4 and so by Kuratowski's theorem there is a Borel isomorphism $h : \tilde{N} \to \tilde{M}$. We then define $\theta : X \to [0, 1]$ by setting

\[ \theta(x) = \begin{cases} g(x) & \text{if } x \in X \setminus \tilde{N}; \\ h(x) & \text{if } x \in \tilde{N}. \end{cases} \]

Then $\theta$ is a Borel isomorphism and $\theta_*\mu$ is Lebesgue measure.

If $(X, \mathcal{M}, \mu)$ is a measure space, then we may consider on $\mathcal{M}$ the equivalence relation $\sim$ given by $E \sim F$ if $\mu(E \Delta F) = 0$. We denote the set of equivalence classes by $\widehat{\mathcal{M}} = \mathcal{M}/\sim$. We transfer the operations of complements, countable unions and countable intersections to $\widehat{\mathcal{M}}$ respectively as $[E]' = [E^c]$, $\vee_{n=1}^\infty [E_n] = [\cup_{n=1}^\infty E_n]$, and $\wedge_{n=1}^\infty [E_n] = [\cap_{n=1}^\infty E_n]$. Note that these are well defined operations. We also denote by $\hat{\mu}$ the function on $\widehat{\mathcal{M}}$ given by $\hat{\mu}([E]) = \mu(E)$, where $[E]$ denotes the equivalence class of $E$.

If $(Y, \mathcal{N}, \nu)$ is another measure space, then a measure algebra homomorphism is a map $\alpha : \widehat{\mathcal{N}} \to \widehat{\mathcal{M}}$ such that for $E, E_1, E_2, \ldots \in \mathcal{N}$ we have

1. $\alpha([\emptyset]) = [\emptyset]$;

2. $\alpha([E]') = \alpha([E])'$;

3. $\alpha(\vee_{n=1}^\infty [E_n]) = \vee_{n=1}^\infty \alpha([E_n])$;

4. $\hat{\mu}(\alpha([E])) = \hat{\nu}([E])$.

As an example, if we have a measure preserving map $\theta : X \to Y$, then we obtain a measure algebra homomorphism $\hat{\theta}$ by setting $\hat{\theta}([E]) = [\theta^{-1}(E)]$.

Theorem 3.8.15 (von Neumann). Let $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \mu)$ be probability spaces without atoms such that $Y$ is standard. Suppose $\alpha : \widehat{\mathcal{N}} \to \widehat{\mathcal{M}}$ is a measure algebra homomorphism, then there exists a measurable map $\theta : X \to Y$ so that $\alpha = \hat{\theta}$.

Proof. By Theorem 3.8.14 we may assume that $Y = [0, 1]$, endowed with the Borel σ-algebra and Lebesgue measure. For each rational $r \in \mathbb{Q} \cap [0, 1]$ we choose a measurable set $X_r \subset X$ such that $[X_r] = \alpha([[0, r)]) = \alpha([[0, r]])$. We may assume that $X_0 = \emptyset$ and $X_1 = X$. By replacing Xr with ∪s∈Q∩[0,1],s

πˆ([[0, t)]) =π ˆ([∪r∈Q∩[0,1],r

3.8.1 Exercises

Exercise 3.8.16. Let $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \mu)$ be probability spaces such that $Y$ is standard, and suppose $\theta, \phi : X \to Y$ are measure preserving maps. Show that $\hat{\theta} = \hat{\phi}$ if and only if $\theta$ and $\phi$ agree almost everywhere.

Exercise 3.8.17. Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \mu)$ are standard probability spaces without atoms, and $\alpha : \widehat{\mathcal{N}} \to \widehat{\mathcal{M}}$ is a measure algebra homomorphism. Show that if $\alpha$ is surjective, then there exists an isomorphism of standard Borel spaces $\theta : X \to Y$ such that $\hat{\theta} = \alpha$.

Chapter 4

Differentiation and integration

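4.1 The Lebesgue differentiation theorem

4.1.1 Vitali's covering lemma

If $(X, d)$ is a metric space, $B \subset X$ is a (open or closed) ball in $X$, and $c > 0$, then we denote by $cB$ the (open or closed respectively) ball in $X$ having the same center as $B$, and having radius $\mathrm{rad}(cB) = c\,\mathrm{rad}(B)$.

Lemma 4.1.1 (Vitali's covering lemma). Let $(X, d)$ be a metric space and let $\mathcal{F}$ be a collection of balls in $X$, having positive radii, such that $\sup\{\mathrm{rad}(B) \mid B \in \mathcal{F}\} < \infty$, then there exists a pairwise disjoint subcollection $\mathcal{G} \subset \mathcal{F}$ such that for all $B \in \mathcal{F}$ there exists $C \in \mathcal{G}$ with $B \cap C \neq \emptyset$, and with $B \subset 5C$.

Proof. Set $R = \sup\{\mathrm{rad}(B) \mid B \in \mathcal{F}\}$ and partition $\mathcal{F}$ as $\mathcal{F} = \cup_{n=0}^\infty \mathcal{F}_n$ where $B \in \mathcal{F}_n$ if and only if $\mathrm{rad}(B) \in (2^{-n-1}R, 2^{-n}R]$. We inductively define subcollections $\mathcal{G}_n \subset \mathcal{F}_n$ as follows: Set $\mathcal{H}_0 = \mathcal{F}_0$. By Zorn's lemma there exists $\mathcal{G}_0 \subset \mathcal{F}_0$, a maximal (with respect to inclusion) pairwise disjoint family. Having defined $\mathcal{G}_0, \ldots, \mathcal{G}_{n-1}$ we let

\[ \mathcal{H}_n = \{C \in \mathcal{F}_n \mid C \cap B = \emptyset \text{ for all } B \in \cup_{k=0}^{n-1} \mathcal{G}_k\}, \]

and we again use Zorn's lemma to find $\mathcal{G}_n \subset \mathcal{H}_n$ which is a maximal pairwise disjoint family. We set $\mathcal{G} = \cup_{n=0}^\infty \mathcal{G}_n$, and note that $\mathcal{G}$ is pairwise disjoint.

If $B \in \mathcal{F}$, then $B \in \mathcal{F}_n$ for some $n \geq 0$. Either $B \notin \mathcal{H}_n$, in which case there exists $C \in \cup_{k=0}^{n-1} \mathcal{G}_k$ such that $B \cap C \neq \emptyset$, or else $B \in \mathcal{H}_n$, in which case by maximality of $\mathcal{G}_n$ there exists $C \in \mathcal{G}_n$ so that $B \cap C \neq \emptyset$. Since $B \in \mathcal{F}_n$ we have $\mathrm{rad}(B) \leq 2^{-n}R$, and so in either case there exists $C \in \mathcal{G}$ with $B \cap C \neq \emptyset$ and with $\mathrm{rad}(C) > 2^{-n-1}R \geq \frac{1}{2}\mathrm{rad}(B)$. Every point of $B$ then lies within distance $2\,\mathrm{rad}(B) + \mathrm{rad}(C) < 5\,\mathrm{rad}(C)$ of the center of $C$, hence $B \subset 5C$.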
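For a finite family of balls the Zorn's lemma argument reduces to a greedy selection, which the following Python sketch illustrates for closed intervals in $\mathbb{R}$. It is not the dyadic-shell construction of the proof: the function names are hypothetical, balls are assumed to be given as (center, radius) pairs, and the family is scanned by decreasing radius, which in the finite case even gives the constant 3 in place of 5.

```python
def greedy_disjoint(balls):
    """Select a pairwise disjoint subfamily of closed intervals.

    balls is a list of (center, radius) pairs. Scanning by decreasing
    radius, a ball is kept when it misses every ball kept so far.
    """
    chosen = []
    for c, r in sorted(balls, key=lambda ball: -ball[1]):
        if all(abs(c - c2) > r + r2 for c2, r2 in chosen):
            chosen.append((c, r))
    return chosen


def intersects(b1, b2):
    """Closed intervals meet when the centers are at most r1 + r2 apart."""
    return abs(b1[0] - b2[0]) <= b1[1] + b2[1]


def inside_dilate(b, c, factor):
    """Check that the interval b lies inside the dilate factor * c."""
    return abs(b[0] - c[0]) + b[1] <= factor * c[1]


balls = [(0.0, 1.0), (1.5, 1.0), (3.0, 0.5), (3.25, 0.25), (10.0, 2.0)]
chosen = greedy_disjoint(balls)
print(chosen)
```

Each rejected interval meets a previously chosen interval of at least its own radius, which is why it sits inside the 3-fold (hence the 5-fold) dilate of that interval.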


If $E \subset \mathbb{R}^d$, and $\mathcal{V}$ is a collection of closed balls in $\mathbb{R}^d$, then we say that $\mathcal{V}$ is a Vitali covering of $E$ if for each $x \in E$ and $\varepsilon > 0$ there exists $B \in \mathcal{V}$ with $x \in B$ such that $\mathrm{rad}(B) < \varepsilon$.

Theorem 4.1.2 (Vitali's covering theorem). Let $E \subset \mathbb{R}^d$ be a Borel set with finite Lebesgue measure, and suppose $\mathcal{V}$ is a Vitali covering of $E$, then there exists a pairwise disjoint (hence countable) subcollection $\mathcal{G} \subset \mathcal{V}$ so that

λ(E \ ∪{C | C ∈ G}) = 0.

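Proof. By considering the subcollection of $\mathcal{V}$ consisting of balls with radius at most 1 we may assume that $\sup\{\mathrm{rad}(B) \mid B \in \mathcal{V}\} \leq 1$. By Vitali's covering lemma there then exists a pairwise disjoint subcollection $\mathcal{G} \subset \mathcal{V}$ such that for every $B \in \mathcal{V}$ there exists $C \in \mathcal{G}$ with $B \cap C \neq \emptyset$ and with $B \subset 5C$.

Fix $r > 0$ and set $Z = (E \setminus \cup\{C \mid C \in \mathcal{G}\}) \cap B(r, 0)$. We let $\tilde{\mathcal{G}}$ denote the subcollection consisting of balls which intersect $B(r, 0)$, and hence are contained in $B(r + 2, 0)$. Partition $\tilde{\mathcal{G}}$ as $\tilde{\mathcal{G}} = \cup_{n=0}^\infty \mathcal{G}_n$ where $C \in \mathcal{G}_n$ if and only if $\mathrm{rad}(C) \in (2^{-n-1}, 2^{-n}]$. Since $\tilde{\mathcal{G}}$ is pairwise disjoint we then have

\[ \sum_{n=0}^\infty \sum_{C \in \mathcal{G}_n} \lambda(C) = \lambda(\cup_{C \in \tilde{\mathcal{G}}} C) \leq \lambda(B(r + 2, 0)) < \infty. \]

Fix $\varepsilon > 0$. Then there exists $N \geq 0$ so that $\sum_{n=N}^\infty \sum_{B \in \mathcal{G}_n} \lambda(B) < \varepsilon$. Set $K = \cup_{n=0}^{N-1} \cup_{C \in \mathcal{G}_n} C$, which is compact as it is a finite union of closed balls. If $z \in Z$, then $z \notin K$ and as $\mathcal{V}$ is a Vitali covering of $E$ there then exists $B \in \mathcal{V}$ such that $z \in B$, $B \subset B(r, 0)$, and $B \cap K = \emptyset$. As $B \in \mathcal{V}$ and $B \subset B(r, 0)$ there exists $C \in \tilde{\mathcal{G}}$ so that $B \cap C \neq \emptyset$ and $z \in B \subset 5C$. Since $B \cap C \neq \emptyset$ we must have $C \not\subset K$ and hence $C \in \cup_{n=N}^\infty \mathcal{G}_n$. Since $z \in Z$ was arbitrary we have then shown that

\[ Z \subset \cup_{n=N}^\infty \cup_{C \in \mathcal{G}_n} 5C, \]

hence

\[ \lambda(Z) \leq \sum_{n=N}^\infty \sum_{C \in \mathcal{G}_n} \lambda(5C) < 5^d \varepsilon. \]

Since $\varepsilon > 0$ was arbitrary we conclude that $\lambda(Z) = 0$, and since $r > 0$ was arbitrary we then conclude that $\lambda(E \setminus \cup\{C \mid C \in \mathcal{G}\}) = 0$.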

4.1.2 The Lebesgue differentiation theorem

A function $f : \mathbb{R}^d \to \mathbb{C}$ is locally integrable if $\int_K |f| \, d\lambda < \infty$ for any compact set $K \subset \mathbb{R}^d$. We let $L^1_{loc}(\mathbb{R}^d)$ denote the space of all locally integrable functions.

Theorem 4.1.3 (The Lebesgue differentiation theorem). Let $f \in L^1_{loc}(\mathbb{R}^d)$, then for almost every $x \in \mathbb{R}^d$ we have

\[ f(x) = \lim_{r \to 0} \frac{1}{\lambda(B(r, x))} \int_{B(r,x)} f \, d\lambda. \]

Note that Lebesgue's differentiation theorem is obvious when $f$ is continuous. Our strategy to prove the theorem in general will be to approximate $f$ in $L^1(\mathbb{R}^d)$ by a continuous function. It then becomes necessary to control the size of the set on which the averages $\frac{1}{\lambda(B(r,x))} \int_{B(r,x)} |f| \, d\lambda$ can be large in terms of the $L^1$-norm of the function $f$. This is achieved by the following lemma:

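Lemma 4.1.4. Suppose $f \in L^1(\mathbb{R}^d)$ with compact support, and for each $x \in \mathbb{R}^d$ set

\[ \tilde{f}(x) = \limsup_{r \to 0} \frac{1}{\lambda(B(r, x))} \int_{B(r,x)} |f| \, d\lambda, \tag{4.1} \]

then for $\alpha > 0$ we have

\[ \lambda(\{x \in \mathbb{R}^d \mid \tilde{f}(x) > \alpha\}) \leq \frac{1}{\alpha} \|f\|_1. \]

Proof. Since $f$ has compact support, so does $\tilde{f}$ and hence the set

\[ E = \{x \in \mathbb{R}^d \mid \tilde{f}(x) > \alpha\} \]

has finite measure. Let $\mathcal{V}$ be the collection of closed balls $B$ such that

\[ \frac{1}{\lambda(B)} \int_B |f| \, d\lambda > \alpha. \]

From the definition of $\tilde{f}$ we see that $\mathcal{V}$ is a Vitali covering of $E$. By Vitali's covering theorem there exists a pairwise disjoint family $\mathcal{G} \subset \mathcal{V}$ so that $\lambda(E \setminus \cup_{C \in \mathcal{G}} C) = 0$. Hence,

\[ \lambda(E) \leq \sum_{C \in \mathcal{G}} \lambda(C) \leq \sum_{C \in \mathcal{G}} \frac{1}{\alpha} \int_C |f| \, d\lambda \leq \frac{1}{\alpha} \|f\|_1. \]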

Proof of Theorem 4.1.3. First note that if the theorem holds for $1_{B(n,0)} f$ for each $n \in \mathbb{N}$ then the theorem also holds for $f$, therefore we may assume that $f \in L^1(\mathbb{R}^d)$ with compact support. The set of continuous functions with compact support is dense in $L^1(\mathbb{R}^d)$. Indeed, it is enough to approximate characteristic functions, which can be easily done by combining the regularity of Lebesgue measure (Corollary 2.3.8) with Urysohn's lemma. Let $g_n$ be a sequence of continuous functions with compact support such that $\|f - g_n\|_1 \to 0$. For $x \in \mathbb{R}^d$ and $n \in \mathbb{N}$ we then have

\[ \limsup_{r \to 0} \left| \frac{1}{\lambda(B(r, x))} \int_{B(r,x)} f \, d\lambda - f(x) \right| \leq \limsup_{r \to 0} \left| \frac{1}{\lambda(B(r, x))} \int_{B(r,x)} (f - g_n) \, d\lambda \right| + \limsup_{r \to 0} \left| \frac{1}{\lambda(B(r, x))} \int_{B(r,x)} g_n \, d\lambda - g_n(x) \right| + |g_n(x) - f(x)|. \]

Since $g_n$ is continuous the second term on the right vanishes. Thus,

\[ \limsup_{r \to 0} \left| \frac{1}{\lambda(B(r, x))} \int_{B(r,x)} f \, d\lambda - f(x) \right| \leq \widetilde{f - g_n}(x) + |g_n(x) - f(x)|, \]

where $\widetilde{f - g_n}$ is defined as in (4.1). If $\varepsilon > 0$ and we let $E$ be the set of points where the left hand side is greater than $\varepsilon$ then we have

\[ E \subset \{x \in \mathbb{R}^d \mid \widetilde{f - g_n}(x) > \varepsilon/2\} \cup \{x \in \mathbb{R}^d \mid |g_n(x) - f(x)| > \varepsilon/2\}. \]

Using Lemma 4.1.4 together with Chebyshev's inequality then gives

\[ \lambda(E) \leq \frac{2}{\varepsilon} \|g_n - f\|_1 + \frac{2}{\varepsilon} \|g_n - f\|_1. \]

Taking $n \to \infty$ shows $\lambda(E) = 0$, and the result follows.

Corollary 4.1.5 (The Lebesgue density theorem). Suppose $E \subset \mathbb{R}^d$ is a Borel set, then for almost every $x \in E$ we have

\[ \lim_{r \to 0} \frac{\lambda(B(r, x) \cap E)}{\lambda(B(r, x))} = 1. \tag{4.2} \]

Proof. This follows immediately from the Lebesgue differentiation theorem by considering the characteristic function $f = 1_E$.

Points where (4.2) holds are called points of density for the set $E$. In light of the triangle inequality for integration, the following gives a slight improvement on Lebesgue's differentiation theorem.

Theorem 4.1.6. Let $f \in L^1_{loc}(\mathbb{R}^d)$, then for almost every $x \in \mathbb{R}^d$ we have

\[ \lim_{r \to 0} \frac{1}{\lambda(B(r, x))} \int_{B(r,x)} |f(y) - f(x)| \, d\lambda(y) = 0. \tag{4.3} \]

Proof. For each rational number $q$ let $Z_q$ denote the set of points where the formula

\[ \lim_{r \to 0} \frac{1}{\lambda(B(r, x))} \int_{B(r,x)} |f(y) - q| \, d\lambda(y) = |f(x) - q| \]

does not hold. Since $x \mapsto |f(x) - q|$ is locally integrable it follows from Lebesgue's differentiation theorem that $\lambda(Z_q) = 0$ for every $q \in \mathbb{Q}$. If we set $Z = \cup_{q \in \mathbb{Q}} Z_q$ then we have $\lambda(Z) = 0$. For any $x \in \mathbb{R}^d$, $q \in \mathbb{Q}$, and $r > 0$ we have

\[ \frac{1}{\lambda(B(r, x))} \int_{B(r,x)} |f(y) - f(x)| \, d\lambda(y) \leq \frac{1}{\lambda(B(r, x))} \int_{B(r,x)} |f(y) - q| \, d\lambda(y) + |q - f(x)|. \]

Therefore if $x \notin Z$ we have

\[ \limsup_{r \to 0} \frac{1}{\lambda(B(r, x))} \int_{B(r,x)} |f(y) - f(x)| \, d\lambda(y) \leq 2|f(x) - q|, \]

for every $q \in \mathbb{Q}$. Since $\mathbb{Q}$ is dense in $\mathbb{R}$ the result then follows.

Any point x ∈ Rd which satisﬁes (4.3) is called a Lebesgue point of f.
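A one-dimensional example may help to see what can fail on the exceptional null set. The sketch below takes $f = 1_{[0, \infty)}$ on $\mathbb{R}$ (an assumed example, not taken from the text) and evaluates its averages over $B(r, x) = [x - r, x + r]$ in closed form; the function name `average_of_step` is hypothetical.

```python
def average_of_step(x, r):
    """Average of f = 1_[0, inf) over the interval [x - r, x + r].

    The integral is the length of the part of the interval lying in
    [0, inf), so no numerical quadrature is needed.
    """
    overlap = max(0.0, (x + r) - max(x - r, 0.0))
    return overlap / (2 * r)


for k in (1, 5, 10):
    r = 2.0 ** -k
    print(r, average_of_step(0.0, r), average_of_step(1.0, r))
```

At $x = 0$ every average equals $1/2$ while $f(0) = 1$, so $0$ is not a Lebesgue point of $f$; at every other point the averages agree with $f(x)$ as soon as $r < |x|$.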

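Theorem 4.1.7. Let $\mu$ be a complex Borel measure on $\mathbb{R}^d$, let $\mu = \mu_{ac} + \mu_s$ be its Lebesgue decomposition so that $\mu_{ac} \ll \lambda$ and $\mu_s \perp \lambda$, and let $f = \frac{d\mu_{ac}}{d\lambda}$ be the Radon-Nikodym derivative. Then for $\lambda$-almost every $x \in \mathbb{R}^d$ we have

\[ \lim_{r \to 0} \frac{\mu(B(r, x))}{\lambda(B(r, x))} = f(x). \]

Proof. In light of Lebesgue's differentiation theorem (applied to $f$) it is enough to show that when $\mu \perp \lambda$ we have $\lim_{t \to 0} \frac{\mu(B(t,x))}{\lambda(B(t,x))} = 0$ for $\lambda$-almost every $x \in \mathbb{R}^d$. Note also that since $\left| \frac{\mu(B(t,x))}{\lambda(B(t,x))} \right| \leq \frac{|\mu|(B(t,x))}{\lambda(B(t,x))}$ it is enough to consider the case when $\mu$ is a finite positive measure.

Let $A$ be a Borel set so that $\mu(A) = \lambda(A^c) = 0$. Fix $r > 0$ and set

\[ F = \left\{ x \in A \cap [-r, r]^d \;\middle|\; \limsup_{t \to 0} \frac{\mu(B(t, x))}{\lambda(B(t, x))} > 1/r \right\}. \]

Fix $\varepsilon > 0$, and take $O$ an open set so that $A \subset O$ and $\mu(O) < \varepsilon$. Let $\mathcal{V}$ consist of all closed balls $B \subset O$ so that $\frac{\mu(B)}{\lambda(B)} > 1/r$. Then $\mathcal{V}$ is a Vitali cover of $F$ and hence by Vitali's covering theorem there is a pairwise disjoint subcollection $\mathcal{G}$ so that $\lambda(F \setminus (\cup_{C \in \mathcal{G}} C)) = 0$. Therefore, $\lambda(F) \leq \sum_{C \in \mathcal{G}} \lambda(C) \leq r \sum_{C \in \mathcal{G}} \mu(C) \leq r\mu(O) < r\varepsilon$. Since $\varepsilon > 0$ was arbitrary we conclude that $\lambda(F) = 0$, and since $r > 0$ was arbitrary the result follows.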

4.1.3 Exercises

A net of measurable sets {Sα}α∈I is said to shrink regularly to x if

1. the diameter of Sα tends to 0, and

2. there exists K > 0 so that for all α ∈ I, if B is the smallest ball with center x containing Sα, then λ(B) ≤ Kλ(Sα).

Exercise 4.1.8. If $\{S_\alpha\}_{\alpha \in I}$ shrinks regularly to $x$, and if $x$ is a Lebesgue point of $f \in L^1_{loc}(\mathbb{R}^d)$, then

\[ \lim_{\alpha \to \infty} \frac{1}{\lambda(S_\alpha)} \int_{S_\alpha} |f(y) - f(x)| \, d\lambda(y) = 0. \]

Exercise 4.1.9 (The Lebesgue density topology on $\mathbb{R}$). For a Lebesgue measurable set $A \subset \mathbb{R}$, set $D(A) = \{x \in \mathbb{R} \mid \lim_{x \in I, \mathrm{diam}(I) \to 0} \frac{\lambda(A \cap I)}{\lambda(I)} = 1\}$. We define a topology on $\mathbb{R}$ by letting the open sets be all measurable sets $A$ such that $A \subset D(A)$. Give a description of meager sets and use this to show that $\mathbb{R}$ with the Lebesgue density topology is a Baire space.

4.2 Functions of bounded variation

Lemma 4.2.1. Suppose f : R → R is monotone, then f is continuous except at countably many points.

Proof. Replacing $f$ with $-f$ if necessary, we may assume that $f$ is increasing. Let $D \subset \mathbb{R}$ denote the set of points where $f$ is discontinuous. As $f$ is monotone it can only have jump discontinuities, so that for each $x \in D$ we have $\lim_{h \to 0^-} f(x + h) < \lim_{h \to 0^+} f(x + h)$. We let $D_n = \{x \in \mathbb{R} \mid \lim_{h \to 0^+} f(x + h) - \lim_{h \to 0^-} f(x + h) > 1/n\}$. If $F \subset D_n \cap (a, b)$ is finite then we have $f(b) - f(a) \geq \sum_{x \in F} \left( \lim_{h \to 0^+} f(x + h) - \lim_{h \to 0^-} f(x + h) \right) > |F|/n$. Therefore $|D_n \cap (a, b)| < \infty$ for each $n \geq 1$ and $a, b \in \mathbb{Q}$. We then have that $D$ is countable.

Suppose $f$ is a real-valued function which is defined in a neighborhood of a point $x_0 \in \mathbb{R}$. The Dini numbers associated to $f$ at $x_0$ are

• $D_1 f(x_0) = \limsup_{h \to 0^+} \frac{f(x_0 + h) - f(x_0)}{h}$.

• $D_2 f(x_0) = \liminf_{h \to 0^+} \frac{f(x_0 + h) - f(x_0)}{h}$.

• $D_3 f(x_0) = \limsup_{h \to 0^-} \frac{f(x_0 + h) - f(x_0)}{h}$.

• $D_4 f(x_0) = \liminf_{h \to 0^-} \frac{f(x_0 + h) - f(x_0)}{h}$.

We say that $f$ is differentiable at $x_0$ if $D_1 f(x_0) = D_2 f(x_0) = D_3 f(x_0) = D_4 f(x_0)$ and in this case this common value is the derivative $f'(x_0)$.

Theorem 4.2.2. Suppose $f : [a, b] \to \mathbb{R}$ is monotone increasing, then $f$ is almost everywhere (with respect to Lebesgue measure) differentiable, $f'$ is measurable, and we have

\[ \int_a^b f' \, d\lambda \leq f(b) - f(a). \]

Proof. We obviously have $0 \leq D_2 f \leq D_1 f$ and $0 \leq D_4 f \leq D_3 f$, thus it is enough to show that at almost every point we have $D_1 f \leq D_4 f$ and $D_3 f \leq D_2 f$. We will show that $D_1 f \leq D_4 f$ almost everywhere. A similar argument will apply for $D_3 f \leq D_2 f$. It is enough to show that for all $r, s \in \mathbb{Q}$ with $r > s > 0$ the set

\[ A = A_{r,s} = \{x \in (a, b) \mid D_1 f(x) > r > s > D_4 f(x)\} \]

has measure zero. Fix $\varepsilon > 0$ and take $O \subset (a, b)$ open so that $A \subset O$ and $\lambda(O \setminus A) \leq \varepsilon$. Let $\mathcal{V}$ denote the collection of intervals $[x - h, x]$ with $h > 0$ so that $[x - h, x] \subset O$ and

\[ \frac{f(x - h) - f(x)}{-h} < s. \tag{4.4} \]

As $D_4 f(x) < s$ for $x \in A$ it follows that $\mathcal{V}$ gives a Vitali covering of $A$. By Vitali's covering theorem there exists $\{[x_i - h_i, x_i]\}_i \subset \mathcal{V}$ pairwise disjoint so that $\lambda(A \setminus (\cup_i [x_i - h_i, x_i])) = 0$. Since (4.4) holds we have

\[ \sum_i f(x_i) - f(x_i - h_i) < s \sum_i h_i = s\lambda(\cup_i [x_i - h_i, x_i]) \leq s\lambda(O) \leq s\lambda(A) + s\varepsilon. \tag{4.5} \]

Set $B = A \cap (\cup_i (x_i - h_i, x_i))$ and note that $\lambda(B) = \lambda(A)$. Let $\mathcal{W}$ be the collection of intervals $[y, y + k]$ so that $[y, y + k] \subset (x_i - h_i, x_i)$ for some $i$, and such that

\[ \frac{f(y + k) - f(y)}{k} > r. \tag{4.6} \]

As $D_1 f(x) > r$ for all $x \in B \subset A$ it follows that $\mathcal{W}$ is a Vitali covering of $B$. Again by Vitali's covering theorem there exists $\{[y_j, y_j + k_j]\}_j \subset \mathcal{W}$ pairwise disjoint so that $\lambda(B \setminus (\cup_j [y_j, y_j + k_j])) = 0$. Since (4.6) holds we have

\[ \sum_j f(y_j + k_j) - f(y_j) > r \sum_j k_j = r\lambda(\cup_j [y_j, y_j + k_j]) \geq r\lambda(A). \tag{4.7} \]

If $J_i$ denotes the collection of $j$'s so that $[y_j, y_j + k_j] \subset [x_i - h_i, x_i]$ then as $f$ is monotone increasing we have

\[ \sum_{j \in J_i} f(y_j + k_j) - f(y_j) \leq f(x_i) - f(x_i - h_i). \]

Summing over all $i \in I$ and using (4.5) and (4.7) then gives

\[ r\lambda(A) \leq \sum_{j \in J} f(y_j + k_j) - f(y_j) \leq \sum_{i \in I} f(x_i) - f(x_i - h_i) \leq s\lambda(A) + s\varepsilon. \]

As $\varepsilon > 0$ was arbitrary this then shows $r\lambda(A) \leq s\lambda(A)$. Since $r > s$ we conclude that $\lambda(A) = 0$.

We have therefore established that the derivative $f'$ exists almost everywhere. We let $D \subset (a, b)$ denote the set of points where $f'$ exists. Then $D$ is clearly measurable and if we set $f_k : D \to [0, \infty)$ as $f_k(x) = \frac{f(x + k^{-1}) - f(x)}{k^{-1}}$, then the $f_k$ are measurable and $f_k \to f'$ pointwise, which shows that $f'$ is measurable. If $a < c < d < b$ and $f$ is continuous at $c$ and $d$ then by Fatou's lemma we have

\[ \int_c^d f' \, d\lambda \leq \liminf_{k \to \infty} \int_c^d f_k \, d\lambda = \liminf_{k \to \infty} \left( k \int_d^{d + k^{-1}} f \, d\lambda - k \int_c^{c + k^{-1}} f \, d\lambda \right) = f(d) - f(c) \leq f(b) - f(a). \]

By Lemma 4.2.1 we may take limits $c \to a$ and $d \to b$, which then gives

\[ \int_a^b f' \, d\lambda \leq f(b) - f(a). \]

If $f : \mathbb{R} \to \mathbb{C}$ and $\Gamma = \{x_0, x_1, \ldots, x_n\}$ with $x_0 < x_1 < \cdots < x_n$, then consider the sum

\[ S_\Gamma(f) = \sum_{i=1}^n |f(x_i) - f(x_{i-1})|. \]

The variation of $f$ is then defined as

\[ V(f) = \sup_\Gamma S_\Gamma(f). \]

If $V(f) < \infty$ then we say that $f$ is a function of bounded variation. If $I \subset \mathbb{R}$ is an interval then we set $V_I(f) = \sup_{\Gamma \subset I} S_\Gamma(f)$, and say that $f$ is a function of bounded variation on $I$ if $V_I(f) < \infty$.

For example, if $f : \mathbb{R} \to \mathbb{R}$ is monotone then $f$ is of bounded variation on any bounded interval $[a, b]$ and we have $V_{[a,b]}(f) = |f(b) - f(a)|$. If we moreover have $\lim_{x \to -\infty} |f(x)| < \infty$ and $\lim_{x \to \infty} |f(x)| < \infty$ then $f$ is of bounded variation on all of $\mathbb{R}$. Another example is given by $f = 1_{\{x\}}$, in which case we have $V(f) = 2$.

Recall that a function is Lipschitz continuous with Lipschitz constant $C$ if for all $x, y \in \mathbb{R}$ we have $|f(x) - f(y)| \leq C|x - y|$. It is easy to see that a Lipschitz function is of bounded variation on any bounded interval $[a, b]$, and we have

\[ V_{[a,b]}(f) \leq C(b - a). \]
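The partition sums $S_\Gamma(f)$ are easy to compute, and the three examples above can be checked on small partitions. The following Python sketch does this; the name `variation_sum` and the particular partitions are illustrative assumptions, and a single partition only gives a lower bound for the variation.

```python
def variation_sum(f, partition):
    """Compute S_Gamma(f), the sum of |f(x_i) - f(x_{i-1})| over the partition."""
    pts = sorted(partition)
    return sum(abs(f(b) - f(a)) for a, b in zip(pts, pts[1:]))


def indicator_of_zero(x):
    """The function 1_{0}, whose variation on R equals 2."""
    return 1 if x == 0 else 0


# Monotone example: every partition containing the endpoints gives |f(b) - f(a)|.
print(variation_sum(lambda x: x ** 3, [-1, 0, 1, 2]))
# Indicator of a point: a partition through the point already attains 2.
print(variation_sum(indicator_of_zero, [-1, 0, 1]))
# Lipschitz example with C = 1 on [-1, 2]: the sum is at most C(b - a) = 3.
print(variation_sum(abs, [-1, 0, 2]))
```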

Proposition 4.2.3. If $f, g$ are functions of bounded variation on an interval $I$, then $f$ and $g$ are bounded, and both $f + g$ and $fg$ are of bounded variation on $I$. Moreover, we have

\[ V_I(f + g) \leq V_I(f) + V_I(g); \qquad V_I(fg) \leq \|f\|_\infty V_I(g) + \|g\|_\infty V_I(f). \]

Also, if $f$ is of bounded variation and $\|1/f\|_\infty < \infty$, then $1/f$ is also of bounded variation and we have

\[ V_I(1/f) \leq \|1/f\|_\infty^2 V_I(f). \]

Proof. If $|f(x_n)| \to \infty$, then taking $\Gamma_n = \{x_1, \ldots, x_n\}$ (unordered) it is easy to see that we have $S_{\Gamma_n}(f) \to \infty$. Therefore functions of bounded variation are bounded. Using the triangle inequality it is easy to see that for any finite partition $\Gamma$ we have

\[ S_\Gamma(f + g) \leq S_\Gamma(f) + S_\Gamma(g); \]

\[ S_\Gamma(fg) \leq \|f\|_\infty S_\Gamma(g) + \|g\|_\infty S_\Gamma(f); \]

\[ S_\Gamma(1/f) \leq \|1/f\|_\infty^2 S_\Gamma(f). \]

Taking suprema then gives the result.

Note that of course any scalar multiple of a function of bounded variation is again of bounded variation, and we also have $V_I(\overline{f}) = V_I(f)$. Therefore it follows that $f$ is of bounded variation on an interval $I$ if and only if its real and imaginary parts are of bounded variation on $I$.

If $\Gamma = \{x_0, \ldots, x_n\}$ is a partition of $[a, b]$, with $x_0 = a$ and $x_n = b$, and $f$ is real valued we set

\[ P_\Gamma(f) = \sum_{i=1}^n (f(x_i) - f(x_{i-1}))_+; \qquad N_\Gamma(f) = \sum_{i=1}^n (f(x_i) - f(x_{i-1}))_-. \]

Note that

\[ P_\Gamma(f) + N_\Gamma(f) = S_\Gamma(f), \tag{4.8} \]

while

\[ P_\Gamma(f) - N_\Gamma(f) = f(b) - f(a). \tag{4.9} \]

The positive (resp. negative) variation of $f$ on an interval $I$ is given by

\[ P(f) = \sup_{\Gamma \subset I} P_\Gamma(f) \quad (\text{resp. } N(f) = \sup_{\Gamma \subset I} N_\Gamma(f)). \]

Proposition 4.2.4. If any of the three of $P(f), N(f), V(f)$ are finite then all three must be finite, and in this case we have $P(f) + N(f) = V(f)$ and $P(f) - N(f) = f(b) - f(a)$.

Proof. From (4.8) we see that $P(f), N(f) \leq V(f)$, so that if $V(f)$ is finite then so are $P(f)$ and $N(f)$. Also, if either of $P(f)$ or $N(f)$ is finite then from (4.9) we see that they both must be finite, and from (4.8) we see that $V(f) \leq P(f) + N(f)$ so that $V(f)$ is also finite.

If we take a sequence of partitions $\Gamma_n^1$ so that $P_{\Gamma_n^1}(f) \to P(f)$, and $\Gamma_n^2$ so that $N_{\Gamma_n^2}(f) \to N(f)$, then setting $\Gamma_n = \Gamma_n^1 \cup \Gamma_n^2$ we see that $P_{\Gamma_n}(f) \to P(f)$ and $N_{\Gamma_n}(f) \to N(f)$. From (4.9) we then deduce that $P(f) - N(f) = f(b) - f(a)$, and from (4.8) we deduce that $P(f) + N(f) \leq V(f)$.

Theorem 4.2.5 (Jordan decomposition). If $f : [a, b] \to \mathbb{R}$ is of bounded variation then there exist monotone increasing functions $f_1, f_2 : [a, b] \to [0, \infty)$ so that $f = f_1 - f_2$.

Proof. Since $f$ is of bounded variation the restriction of $f$ to $[a, x]$ is also of bounded variation for any $a < x \leq b$. We set $f_1(x) = P_{[a,x]}(f)$ and set $f_2(x) = N_{[a,x]}(f) - c$, where $c = f(a)$. The functions $f_1$ and $f_2$ are increasing and from the previous proposition we have $f_1(x) - f_2(x) = f(x)$.

The previous theorem can also be easily extended to unbounded intervals.
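The identities (4.8) and (4.9) hold partition by partition, which can be checked directly. The Python sketch below computes $P_\Gamma(f)$ and $N_\Gamma(f)$ for an assumed example, $f(x) = x^2$ on $[-2, 3]$; the function name `pos_neg_sums` is hypothetical.

```python
def pos_neg_sums(f, partition):
    """Return (P_Gamma(f), N_Gamma(f)) for a real valued f.

    P sums the positive parts of the increments f(x_i) - f(x_{i-1}),
    and N sums the negative parts (as non-negative numbers).
    """
    pts = sorted(partition)
    diffs = [f(b) - f(a) for a, b in zip(pts, pts[1:])]
    pos = sum(d for d in diffs if d > 0)
    neg = sum(-d for d in diffs if d < 0)
    return pos, neg


def square(x):
    return x * x


gamma = [-2, -1, 0, 1, 3]
P, N = pos_neg_sums(square, gamma)
print(P, N)  # increments are -3, -1, 1, 8
```

Here $P_\Gamma + N_\Gamma$ is the partition sum $S_\Gamma$ and $P_\Gamma - N_\Gamma = f(3) - f(-2)$; since $f$ is monotone on each side of $0$ and $0 \in \Gamma$, this partition already attains the suprema.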

Corollary 4.2.6. Let $f : \mathbb{R} \to \mathbb{C}$ be a function of bounded variation, then the derivative $f'$ exists almost everywhere, and is integrable.

Proof. This follows from the Jordan decomposition and Theorem 4.2.2.

4.2.1 Exercises

Exercise 4.2.7. There is a continuous function f : R → R which is not of bounded variation on any interval of positive length.

Exercise 4.2.8. There is a function of bounded variation f : R → R which is not monotone on any interval of positive length.

4.3 Absolutely continuous and singular functions

Let $I$ be an interval in the real line. A function $f : I \to \mathbb{C}$ is absolutely continuous if for all $\varepsilon > 0$ there exists $\delta > 0$ so that whenever $\{(a_i, b_i)\}_i$ is a pairwise disjoint collection of subintervals of $I$ which satisfy $\sum_i (b_i - a_i) < \delta$ we have $\sum_i |f(b_i) - f(a_i)| < \varepsilon$. A function $f : I \to \mathbb{C}$ is singular if its derivative exists and equals 0 almost everywhere.

Lemma 4.3.1. Suppose $g : [a, b] \to \mathbb{C}$ is integrable, then $f(x) = \int_a^x g \, d\lambda$ is absolutely continuous.

Proof. Since $|g|\lambda \ll \lambda$ this follows easily from Proposition 2.7.1.

Lemma 4.3.2. If $f$ is absolutely continuous then $f$ is of bounded variation on any compact interval.

Proof. Since $f$ is absolutely continuous there exists $\delta$ so that if $\{(a_i, b_i)\}_i$ is a pairwise disjoint collection of subintervals of $I$ which satisfy $\sum_i (b_i - a_i) < \delta$ we have $\sum_i |f(b_i) - f(a_i)| < 1$. Then we have $V_{[a,b]}(f) \leq 1$ for any interval $[a, b] \subset I$ which satisfies $b - a < \delta$. If $I$ is compact and we take $a_1 < b_1 = a_2 < \cdots < b_N$ so that $I = \cup_{i=1}^N [a_i, b_i]$ and such that $b_i - a_i < \delta$ then we must have $V_I(f) = \sum_{i=1}^N V_{[a_i,b_i]}(f) \leq N$.

Corollary 4.3.3. If $f$ is absolutely continuous then $f$ is differentiable almost everywhere.

Lemma 4.3.4. If $f : I \to \mathbb{C}$ is absolutely continuous and singular then $f$ is constant.

Proof. Take $a, b \in I$ with $a < b$. Let $E = \{x \in (a, b) \mid f'(x) = 0\}$, and fix $\varepsilon > 0$. Since $f$ is singular we have $\lambda(E) = b - a$. Since $f$ is absolutely continuous on $I$ there exists $\delta > 0$ so that whenever $\{(a_i, b_i)\}_i$ is a pairwise disjoint collection of subintervals of $I$ which satisfy $\sum_i (b_i - a_i) < \delta$ then we have $\sum_i |f(b_i) - f(a_i)| < \varepsilon$. Let $\mathcal{V}$ denote the collection of intervals $[a', b'] \subset (a, b)$ so that

\[ |f(b') - f(a')| < (b' - a')\varepsilon/(b - a). \tag{4.10} \]

Then $\mathcal{V}$ is a Vitali covering of $E$ and hence by Vitali's covering theorem there exists a finite pairwise disjoint collection of intervals $\{[a_i, b_i]\}_{i=1}^N \subset \mathcal{V}$ so that

\[ \lambda(E \setminus (\cup_i [a_i, b_i])) < \delta. \]

Rearranging we may assume that $a_1 < b_1 \leq a_2 < b_2 \leq \cdots < b_N$. From (4.10) we have

\[ \sum_{i=1}^N |f(b_i) - f(a_i)| \leq \varepsilon \sum_{i=1}^N (b_i - a_i)/(b - a) \leq \varepsilon. \tag{4.11} \]

If we set $b_0 = a$ and $a_{N+1} = b$ we then have

\[ \sum_{i=0}^N (a_{i+1} - b_i) = \lambda(\cup_{i=0}^N (b_i, a_{i+1})) = \lambda([a, b] \setminus (\cup_i [a_i, b_i])) < \delta. \]

Hence

\[ \sum_{i=0}^N |f(a_{i+1}) - f(b_i)| < \varepsilon. \tag{4.12} \]

Combining (4.11) and (4.12) with the triangle inequality then gives

\[ |f(b) - f(a)| \leq \sum_{i=1}^N |f(b_i) - f(a_i)| + \sum_{i=0}^N |f(a_{i+1}) - f(b_i)| < 2\varepsilon. \]

As $\varepsilon > 0$ was arbitrary we then have $f(b) = f(a)$, and as $a < b$ was arbitrary it then follows that $f$ is constant.
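The Cantor function is the standard example showing that absolute continuity cannot be dropped from this lemma: it is singular and continuous but not constant. The Python sketch below checks the failure of absolute continuity directly on the intervals of the n-th stage of the Cantor set construction. The helper names `cantor` and `cantor_stage` are hypothetical, and `cantor` is only claimed to be exact at points with a finite ternary expansion, which is all that is used here.

```python
from fractions import Fraction


def cantor(x, depth=64):
    """Evaluate the Cantor function at a rational x in [0, 1].

    Reads ternary digits of x: digits 0 and 2 become binary digits 0 and 1,
    and the first ternary digit 1 terminates the expansion.
    """
    x = Fraction(x)
    if x == 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)  # 0, 1 or 2 since 0 <= x < 3 here
        x -= digit
        if digit == 1:
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
    return value


def cantor_stage(n):
    """Return the 2^n closed intervals left after n middle-third removals."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined += [(a, a + third), (b - third, b)]
        intervals = refined
    return intervals


stage = cantor_stage(5)
total_length = sum(b - a for a, b in stage)
total_increase = sum(cantor(b) - cantor(a) for a, b in stage)
print(total_length, total_increase)
```

The total length $(2/3)^n$ tends to 0 while the total increase stays equal to 1, so no $\delta$ works for $\varepsilon = 1$ in the definition of absolute continuity.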

Theorem 4.3.5. Let $f : [a, b] \to \mathbb{C}$ be a function of bounded variation, then

\[ f_{ac}(x) = \int_a^x f' \, d\lambda \]

is absolutely continuous and $f_s = f - f_{ac}$ is singular. Moreover, if $f = g_{ac} + g_s$ where $g_{ac}$ is absolutely continuous and $g_s$ is singular then there exists a constant $c \in \mathbb{C}$ so that $g_{ac} = f_{ac} + c$ and $g_s = f_s - c$.

Proof. Since $f$ is of bounded variation $f'$ exists almost everywhere and is integrable, therefore $f_{ac}$ is well defined. Moreover, by Lebesgue's differentiation theorem we have that $f_{ac}$ is differentiable almost everywhere and we have $f_{ac}' = f'$ almost everywhere, hence $f - f_{ac}$ is singular.

If $f = g_{ac} + g_s$ where $g_{ac}$ is absolutely continuous and $g_s$ is singular, then $f_{ac}' = f' = g_{ac}'$ almost everywhere and hence $f_{ac} - g_{ac}$ is an absolutely continuous function which is also singular, hence constant by the previous lemma.

Corollary 4.3.6. A function $f : [a, b] \to \mathbb{C}$ is absolutely continuous if and only if $f'$ exists almost everywhere, is integrable, and we have

\[ f(x) - f(a) = \int_a^x f' \, d\lambda \]

for each $x \in [a, b]$.

4.3.1 Exercises

Exercise 4.3.7. Every absolutely continuous function is uniformly continuous.

Exercise 4.3.8. The Cantor function is monotone, uniformly continuous, but not absolutely continuous.

Exercise 4.3.9. Every Lipschitz continuous function is absolutely continuous.

Exercise 4.3.10. The sum and product of two absolutely continuous functions on a compact interval remain absolutely continuous.

Exercise 4.3.11. Let $\mu$ be a complex Borel measure on $\mathbb{R}$ and set $f(x) = \mu((-\infty, x])$. Then $f$ is of bounded variation. Moreover, $f$ is absolutely continuous if and only if $\mu \ll \lambda$, and $f$ is singular if and only if $\mu \perp \lambda$.

Chapter 5

$L^p$ spaces

Suppose $(X, \mu)$ is a measure space, and $0 < p < \infty$. We denote by $L^p(X, \mu)$ the collection of all measurable functions $f \in M(X, \mu)$ such that $|f|^p \in L^1(X, \mu)$. We identify two functions if they agree almost everywhere. Given $f \in L^p(X, \mu)$ we set

\[ \|f\|_p = \left( \int |f|^p \, d\mu \right)^{1/p}. \]

We will almost exclusively be interested in the case when $p \geq 1$. When $p \geq 1$ we will show that $L^p(X, \mu)$ is a vector space, and $\| \cdot \|_p$ gives a complete norm on $L^p(X, \mu)$.

Throughout this chapter we will use the convention $\frac{1}{\infty} = 0$.

5.1 Hölder's and Minkowski's inequalities

Lemma 5.1.1 (Young's inequality). Suppose $0 \leq a, b < \infty$, and $1 \leq p, q \leq \infty$ with $\frac{1}{p} + \frac{1}{q} = 1$, then

\[ ab \leq \frac{a^p}{p} + \frac{b^q}{q}. \]

Proof. Set $t = \frac{1}{p}$, so that $1 - t = \frac{1}{q}$. As the logarithm is concave we have

\[ \log(ta^p + (1 - t)b^q) \geq t \log(a^p) + (1 - t) \log(b^q) = \log(a) + \log(b) = \log(ab). \]

Exponentiating then gives the inequality.
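As a quick numerical sanity check of Young's inequality, the following Python sketch evaluates the gap $\frac{a^p}{p} + \frac{b^q}{q} - ab$ on a small grid. The function name `young_gap`, the grid, and the floating point tolerances are assumptions made for illustration; only exponents $1 < p < \infty$ are covered.

```python
def young_gap(a, b, p):
    """Return a^p / p + b^q / q - a * b for the conjugate exponent q of p.

    Requires 1 < p < inf, so that q = p / (p - 1) is finite.
    """
    q = p / (p - 1)
    return a ** p / p + b ** q / q - a * b


grid = [0.0, 0.5, 1.0, 2.0, 3.5]
exponents = [1.5, 2.0, 3.0, 7.0]
smallest = min(young_gap(a, b, p) for a in grid for b in grid for p in exponents)
print(smallest)

# Equality holds when a^p = b^q, for instance a = 2, b = 4, p = 3 (so q = 3/2).
print(young_gap(2.0, 4.0, 3.0))
```

Up to rounding error the gap is never negative, and it vanishes when $a^p = b^q$, which is the equality case of the concavity inequality used in the proof.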

Theorem 5.1.2 (Hölder's inequality). Let $(X, \mu)$ be a measure space and suppose $1 \leq p, q \leq \infty$ with $\frac{1}{p} + \frac{1}{q} = 1$, then for $f \in L^p(X, \mu)$ and $g \in L^q(X, \mu)$ we have $fg \in L^1(X, \mu)$ and

\[ \|fg\|_1 \leq \|f\|_p \|g\|_q. \]

Proof. We assume that neither $f$ nor $g$ is essentially 0 since otherwise the inequality is trivial. Applying Young's inequality to $a = \frac{|f(x)|}{\|f\|_p}$ and $b = \frac{|g(x)|}{\|g\|_q}$ gives

\[ \frac{|f(x)g(x)|}{\|f\|_p \|g\|_q} \leq \frac{|f(x)|^p}{p\|f\|_p^p} + \frac{|g(x)|^q}{q\|g\|_q^q}. \]

Since $\|f\|_p^p = \int |f|^p \, d\mu$ and $\|g\|_q^q = \int |g|^q \, d\mu$, integrating the right hand side gives $\frac{1}{p} + \frac{1}{q} = 1$. It then follows that $fg \in L^1(X, \mu)$ and integrating the left hand side gives

\[ \frac{\|fg\|_1}{\|f\|_p \|g\|_q} \leq 1, \]

so that Hölder's inequality follows.
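On a finite set with counting measure, Hölder's inequality is a statement about finite sums, which makes it easy to test. The Python sketch below checks the inequality for one pair of vectors and also checks that equality is attained when $g$ is proportional to $|f|^{p-1} \operatorname{sgn} f$. The function names, the sample vectors, and the restriction to real vectors with $1 < p < \infty$ are assumptions made for illustration.

```python
def p_norm(v, p):
    """The l^p norm of a finite real vector, for 1 <= p < inf."""
    return sum(abs(x) ** p for x in v) ** (1 / p)


def holder_sides(f, g, p):
    """Return (||fg||_1, ||f||_p ||g||_q) for the conjugate exponent q of p."""
    q = p / (p - 1)
    lhs = sum(abs(x * y) for x, y in zip(f, g))
    return lhs, p_norm(f, p) * p_norm(g, q)


def extremal_g(f, p):
    """Return g = |f|^(p-1) sgn(f) / ||f||_p^(p-1), which has ||g||_q = 1."""
    scale = p_norm(f, p) ** (p - 1)
    return [abs(x) ** (p - 1) * (1 if x >= 0 else -1) / scale for x in f]


f = [1.0, -2.0, 3.0]
g = [0.5, 4.0, -1.0]
p = 3.0
lhs, rhs = holder_sides(f, g, p)
print(lhs, rhs)

g0 = extremal_g(f, p)
print(holder_sides(f, g0, p), p_norm(f, p))
```

With the normalized choice `g0` both sides of the inequality agree with $\|f\|_p$ up to rounding, so the bound cannot be improved.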

Theorem 5.1.3. Let $(X, \mu)$ be a measure space and suppose $1 \leq p, q \leq \infty$ with $p < \infty$ such that $\frac{1}{p} + \frac{1}{q} = 1$, then for any $f \in L^p(X, \mu)$ we have

\[ \|f\|_p = \sup_{\|g\|_q = 1} \int |fg| \, d\mu. \]

Proof. Set $L = \sup_{\|g\|_q = 1} \int |fg| \, d\mu$. Then by Hölder's inequality we have $L \leq \|f\|_p$. If $f$ is essentially 0 then the result is trivial so we assume that $f$ is not essentially 0. Set

\[ g_0 = |f|^{p-1} \operatorname{sgn} f. \]

If $q = \infty$ then $\|g_0\|_\infty = 1$, otherwise $\|g_0\|_q^q = \int |f|^{(p-1)q} \, d\mu = \int |f|^p \, d\mu = \|f\|_p^p$, so that $\|g_0\|_q = \|f\|_p^{p-1}$. If we set $g = \frac{g_0}{\|f\|_p^{p-1}}$ then $\|g\|_q = 1$. Hence,

\[ L \geq \int |fg| \, d\mu = \frac{\int |f|^p \, d\mu}{\|f\|_p^{p-1}} = \|f\|_p. \]

Note that by Lemma 2.7.7, the previous theorem holds in the case $p = \infty$ if and only if $(X, \mu)$ is semifinite. In the σ-finite case an analogous statement holds even if $f \notin L^p(X, \mu)$; see Proposition 5.1.6.

Theorem 5.1.4 (Minkowski's inequality). Let $(X, \mu)$ be a measure space and suppose $1 \leq p \leq \infty$, then $L^p(X, \mu)$ is a vector space and $\| \cdot \|_p$ gives a norm on $L^p(X, \mu)$, i.e., for $f, g \in L^p(X, \mu)$ we have $f + g \in L^p(X, \mu)$ and

\[ \|f + g\|_p \leq \|f\|_p + \|g\|_p. \]

Proof. This is just the pointwise triangle inequality for $p = \infty$ so we consider only the case when $p < \infty$. Note first that from convexity of the function $t \mapsto t^p$ we have the pointwise inequality $|f + g|^p \leq 2^{p-1}(|f|^p + |g|^p)$, so that $f + g \in L^p(X, \mu)$.

By the previous theorem if we take $1 < q \leq \infty$ so that $\frac{1}{p} + \frac{1}{q} = 1$ then we have

\[ \|f + g\|_p = \sup_{\|h\|_q = 1} \int |(f + g)h| \, d\mu \leq \sup_{\|h\|_q = 1} \int |fh| \, d\mu + \sup_{\|k\|_q = 1} \int |gk| \, d\mu = \|f\|_p + \|g\|_p. \]

Theorem 5.1.5. Let $(X, \mu)$ be a measure space and $1 \leq p \leq \infty$. Then $L^p(X, \mu)$ is a vector space and $\| \cdot \|_p$ gives a complete norm on $L^p(X, \mu)$.

Proof. We have already shown this for $p = \infty$, and so we may assume $p < \infty$. It's enough to show that every absolutely convergent series in $L^p(X, \mu)$ actually converges in $L^p(X, \mu)$. Suppose $\{f_n\}_{n=1}^\infty \subset L^p(X, \mu)$ such that $A = \sum_{n=1}^\infty \|f_n\|_p < \infty$. Set $G_n = \sum_{k=1}^n |f_k|$, and $G = \sum_{k=1}^\infty |f_k|$. Then by Minkowski's inequality we have $\|G_n\|_p \leq \sum_{k=1}^n \|f_k\|_p \leq A$. By the monotone convergence theorem we then have $\int G^p \, d\mu = \lim_{n \to \infty} \int G_n^p \, d\mu \leq A^p$. In particular, $G$ is finite almost everywhere, so the series $\sum_{n=1}^\infty f_n$ converges almost everywhere to a function $F$ such that $|F| \leq G$.

Then $|F - \sum_{k=1}^n f_k|^p \leq (2G)^p \in L^1(X, \mu)$, and by the dominated convergence theorem we have

\[ \Bigl\| F - \sum_{k=1}^n f_k \Bigr\|_p^p = \int \Bigl| F - \sum_{k=1}^n f_k \Bigr|^p \, d\mu \to 0. \]

Therefore the series $\sum_{n=1}^\infty f_n$ converges to $F$ in $L^p(X, \mu)$.

The next proposition extends Theorem 5.1.3 to the case when $f \notin L^p(X, \mu)$.

Proposition 5.1.6. Let $(X, \mu)$ be a σ-finite measure space and suppose $1 \leq p, q \leq \infty$ with $q < \infty$ such that $\frac{1}{p} + \frac{1}{q} = 1$. If $f \notin L^p(X, \mu)$ then we have

\[ \sup_{\|g\|_q \leq 1} \int |fg| \, d\mu = \infty. \]

Proof. We leave the case $p = \infty$ to the reader. For $1 \leq p < \infty$ we first consider the case when $(X, \mu)$ is finite, so that if $a < \infty$ and $E = \{x \in X \mid |f(x)| \leq a\}$ then $1_E f \in L^p(X, \mu)$ and hence $1_{E^c} f \notin L^p(X, \mu)$. Therefore, we may construct a pairwise disjoint sequence of measurable sets $E_n$ of the form $E_n = \{x \in X \mid a < |f(x)| \leq b\}$ so that $1_{E_n} f \in L^p(X, \mu)$ for each $n$ and $\|1_{E_n} f\|_p > 4^n$.

By Theorem 5.1.3 there then exists $g_n \in L^q(X, \mu)$ with $\|g_n\|_q \leq 1$ so that $\int_{E_n} |f g_n| \, d\mu \geq 4^n$. We set $g = \sum_{n=1}^\infty 2^{-n} g_n 1_{E_n}$, so that by Minkowski's inequality we have $\|g\|_q \leq 1$. For each $n \geq 1$ we then have

\[ \int |fg| \, d\mu \geq 2^{-n} \int_{E_n} |f g_n| \, d\mu \geq 2^n. \]

Hence, $\int |fg| \, d\mu = \infty$.

Now suppose $(X, \mu)$ is σ-finite and suppose $X = \cup_n X_n$ where $\{X_n\}_n$ is an increasing sequence of finite measure subsets. If $f \notin L^p(X_n, \mu)$ for some $n$ then the result follows from the finite case above. Otherwise we have that $f \in L^p(X_n, \mu)$ for each $n$ and $\|1_{X_n} f\|_p \to \infty$. By Theorem 5.1.3 there then exists a sequence $g_n \in L^q(X_n, \mu)$ with $\|g_n\|_q \leq 1$ so that $\int |f g_n| \, d\mu \to \infty$, completing the proof.

5.1.1 Minkowski's integral inequality

Minkowski's inequality shows that the $L^p$-norm of a sum is dominated by the sum of the $L^p$-norms. Generalizing from sums to integrals gives the following:

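Theorem 5.1.7 (Minkowski's integral inequality). Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are σ-finite measure spaces, $1 \leq p < \infty$, and $F : X \times Y \to [0, \infty)$ is $\mathcal{M} \otimes \mathcal{N}$-measurable. Then

\[ \left( \int_Y \left( \int_X F(x, y) \, d\mu(x) \right)^p d\nu(y) \right)^{1/p} \leq \int_X \left( \int_Y F(x, y)^p \, d\nu(y) \right)^{1/p} d\mu(x). \]

Proof. If $g \in L^q(Y, \nu)$ then by Tonelli's theorem and Hölder's inequality we have

\[ \int_Y \left( \int_X F(x, y) \, d\mu(x) \right) |g(y)| \, d\nu(y) = \int_X \int_Y F(x, y) |g(y)| \, d\nu(y) \, d\mu(x) \leq \|g\|_q \int_X \left( \int_Y F(x, y)^p \, d\nu(y) \right)^{1/p} d\mu(x). \]

Proposition 5.1.6 and Theorem 5.1.3 then give the result.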
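To see in what sense this generalizes Minkowski's inequality, one can specialize to a two-point space. The sketch below assumes $X = \{1, 2\}$ with counting measure $\mu$ and two finite-valued functions $f_1, f_2 \in L^p(Y, \nu)$; these names are introduced here only for illustration.

```latex
% Specialization of Theorem 5.1.7: X = {1,2} with counting measure,
% F(x,y) = |f_x(y)| for two (hypothetical) functions f_1, f_2 in L^p(Y,\nu).
\begin{align*}
\left( \int_Y \Bigl( \int_X F(x,y) \, d\mu(x) \Bigr)^p d\nu(y) \right)^{1/p}
  &= \left( \int_Y \bigl( |f_1(y)| + |f_2(y)| \bigr)^p d\nu(y) \right)^{1/p}
   = \bigl\| \, |f_1| + |f_2| \, \bigr\|_p, \\
\int_X \Bigl( \int_Y F(x,y)^p \, d\nu(y) \Bigr)^{1/p} d\mu(x)
  &= \|f_1\|_p + \|f_2\|_p, \\
\|f_1 + f_2\|_p \leq \bigl\| \, |f_1| + |f_2| \, \bigr\|_p
  &\leq \|f_1\|_p + \|f_2\|_p .
\end{align*}
```

The last line is Theorem 5.1.4 for $p < \infty$, recovered from the integral version together with the pointwise bound $|f_1 + f_2| \leq |f_1| + |f_2|$.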

5.1.2 Exercises

Exercise 5.1.8. Suppose $(X, \mu)$ is a finite measure space, then for $1 \leq p \leq q \leq \infty$ we have $L^q(X, \mu) \subset L^p(X, \mu)$.

Exercise 5.1.9. Let $(X, \mu)$ be a measure space. If $0 < p < q < r \leq \infty$ then $L^q(X, \mu) \subset L^p(X, \mu) + L^r(X, \mu)$.

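Exercise 5.1.10. Let $(X, \mu)$ be a measure space. If $0 < p < q < r \leq \infty$ then $L^p(X, \mu) \cap L^r(X, \mu) \subset L^q(X, \mu)$ and if $\lambda \in [0, 1]$ satisfies $\frac{1}{q} = \lambda \frac{1}{p} + (1 - \lambda) \frac{1}{r}$ then

\[ \|f\|_q \leq \|f\|_p^{\lambda} \|f\|_r^{1 - \lambda}. \]

Hint: Apply Hölder's inequality with $|f|^{\lambda q} \in L^{p/\lambda q}(X, \mu)$ and $|f|^{(1-\lambda)q} \in L^{r/(1-\lambda)q}(X, \mu)$.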

Exercise 5.1.11. If $X$ is any set and $0 < p < q \leq \infty$ then $\ell^p(X) \subset \ell^q(X)$ and $\|f\|_q \leq \|f\|_p$.

Exercise 5.1.12. If $(X, \mu)$ is a measure space with $\mu(X) = 1$, and $0 < p < q \leq \infty$ then $L^q(X, \mu) \subset L^p(X, \mu)$ and $\|f\|_p \leq \|f\|_q$.

Exercise 5.1.13. Suppose $(X, \mu)$ is a measure space and $f \in L^p(X, \mu) \cap L^\infty(X, \mu)$ for some $p < \infty$ (hence $f \in L^q(X, \mu)$ for $q \geq p$), then $\|f\|_\infty = \lim_{q \to \infty} \|f\|_q$.

Exercise 5.1.14 (Chebyshev's inequality). Let $(X, \mu)$ be a measure space, $0 < p < \infty$, and $f \in L^p(X, \mu)$. Then for any $\alpha > 0$ we have

\[ \mu(\{x \in X \mid |f(x)| > \alpha\}) \leq \left( \frac{\|f\|_p}{\alpha} \right)^p. \]

Exercise 5.1.15. Let $(X, \mu)$ be a measure space and suppose $1 \leq p, q, r \leq \infty$ such that $\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$. If $f \in L^p(X, \mu)$ and $g \in L^q(X, \mu)$ then $fg \in L^r(X, \mu)$ and

\[ \|fg\|_r \leq \|f\|_p \|g\|_q. \]

Exercise 5.1.16. Let $(X, \mu)$ be a semifinite measure space, $1 \leq p \leq \infty$, and $g \in L^\infty(X, \mu)$. The operator $M_g : L^p(X, \mu) \to L^p(X, \mu)$ given by $M_g(f) = gf$ satisfies $\|M_g\|_{B(L^p(X, \mu))} = \|g\|_\infty$.

5.2 The dual of Lp-spaces

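Lemma 5.2.1. Let $(X, \mu)$ be a finite measure space and suppose $1 \leq p, q \leq \infty$ such that $\frac{1}{p} + \frac{1}{q} = 1$. If $g \in L^1(X, \mu)$ then

\[ \|g\|_p = \sup_{f \in L^\infty(X, \mu), \, \|f\|_q \leq 1} \left| \int fg \, d\mu \right|. \]

Proof. Replacing $f$ with $|f| \, \overline{\operatorname{sgn} g}$ we see that

\[ \sup_{f \in L^\infty(X, \mu), \, \|f\|_q \leq 1} \left| \int fg \, d\mu \right| = \sup_{f \in L^\infty(X, \mu), \, \|f\|_q \leq 1} \int |fg| \, d\mu. \]

If $f \in L^q(X, \mu)$, then setting $E_k = \{x \in X \mid |f|(x) \leq k\}$ we have $1_{E_k} f \in L^\infty(X, \mu)$ with $\|1_{E_k} f\|_q \leq \|f\|_q$ and by the monotone convergence theorem we have

\[ \lim_{k \to \infty} \int |1_{E_k} fg| \, d\mu = \int |fg| \, d\mu. \]

Therefore,

\[ \sup_{f \in L^\infty(X, \mu), \, \|f\|_q \leq 1} \int |fg| \, d\mu = \sup_{f \in L^q(X, \mu), \, \|f\|_q \leq 1} \int |fg| \, d\mu, \]

and the result then follows from Proposition 5.1.6.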

Theorem 5.2.2. Let $(X, \mu)$ be a measure space and suppose $1 < p, q < \infty$ such that $\frac{1}{p} + \frac{1}{q} = 1$. For each $g \in L^q(X, \mu)$ let $\Xi_g \in L^p(X, \mu)^*$ be given by $\Xi_g(f) = \int fg \, d\mu$. Then $\Xi$ defines an isometric isomorphism from $L^q(X, \mu)$ onto $L^p(X, \mu)^*$.

Proof. First, note that by Theorem 5.1.3 we have that $\Xi$ is a well defined and isometric map, thus we only need to show that it is surjective.

Suppose $\varphi \in L^p(X, \mu)^*$. We first consider the case when $\mu$ is finite so that all simple functions are in $L^p(X, \mu)$. For each measurable set $E \subset X$ we set $\nu(E) = \varphi(1_E)$. If $\{E_n\}_{n=1}^\infty$ is a pairwise disjoint sequence of measurable sets and $E = \cup_{n=1}^\infty E_n$ then we have $1_E = \sum_{n=1}^\infty 1_{E_n}$ where the series converges in $L^p$-norm since

\[ \Bigl\| \sum_{n=K}^\infty 1_{E_n} \Bigr\|_p^p = \mu\left( \cup_{n=K}^\infty E_n \right) \to 0. \]

Since $\varphi$ is continuous we then have

\[ \nu(E) = \sum_{n=1}^\infty \varphi(1_{E_n}) = \sum_{n=1}^\infty \nu(E_n). \]

Therefore $\nu$ describes a complex measure. Also, if $\mu(E) = 0$ then $1_E = 0$ in $L^p(X, \mu)$ and hence $\nu(E) = \varphi(0) = 0$, so that $\nu$ is absolutely continuous with respect to $\mu$. By the Radon-Nikodym theorem there then exists $g \in L^1(X, \mu)$ so that $\nu(E) = \int_E g \, d\mu$ for any measurable set $E$, and hence for any simple function $f \in L^p(X, \mu)$ we have

\[ \varphi(f) = \int fg \, d\mu. \tag{5.1} \]

If $f \in L^\infty(X, \mu)$ then taking simple functions $f_n \in L^\infty(X, \mu)$ with $\|f - f_n\|_\infty \to 0$ we have $\int f_n g \, d\mu \to \int fg \, d\mu$. Moreover, since $\|f - f_n\|_p \leq \mu(X)^{1/p} \|f - f_n\|_\infty$ we also have $\varphi(f_n) \to \varphi(f)$. Thus, for any $f \in L^\infty(X, \mu)$ we have $\varphi(f) = \int fg \, d\mu$ and hence

\[ \sup_{f \in L^\infty(X, \mu), \, \|f\|_p \leq 1} \left| \int fg \, d\mu \right| = \sup_{f \in L^\infty(X, \mu), \, \|f\|_p \leq 1} |\varphi(f)| \leq \|\varphi\|. \]

By Lemma 5.2.1 (with the roles of $p$ and $q$ exchanged) we then have $g \in L^q(X, \mu)$ with $\|g\|_q \leq \|\varphi\|$. Since $L^\infty(X, \mu)$ is dense in $L^p(X, \mu)$, applying Hölder's inequality to approximate $f$ in $L^p(X, \mu)$ we see that equation (5.1) holds for all $f \in L^p(X, \mu)$.

We now consider the case when $\mu$ is σ-finite, so that we may write $X = \cup_{n=1}^\infty F_n$ where $F_n$ are increasing measurable sets of finite measure. Considering $L^p(F_n, \mu)$ as a subset of $L^p(X, \mu)$ we may then restrict $\varphi$ and from the argument above it then follows that there exists $g_n \in L^q(F_n, \mu)$ so that $\varphi(f) = \int f g_n \, d\mu$ for all $f \in L^p(F_n, \mu)$. It's easy to check that for $m \geq n$ we have $g_n = g_m|_{F_n}$, $\mu$-almost everywhere. Thus, we may essentially define a function $g : X \to \mathbb{C}$ by letting $g|_{F_n} = g_n$. Since $\|g_n\|_q \leq \|\varphi\|$ it follows from the monotone convergence theorem that $g \in L^q(X, \mu)$. If $f \in L^p(F_n, \mu)$ then we have $\varphi(f) = \int fg \, d\mu$ and since $\cup_{n=1}^\infty L^p(F_n, \mu)$ is dense in $L^p(X, \mu)$ it then follows that $\varphi(f) = \int fg \, d\mu$ for all $f \in L^p(X, \mu)$.

Finally, we consider the general case. From above, for each σ-finite set $E \subset X$ there exists an essentially unique function $g_E \in L^q(E, \mu)$ so that $\|g_E\|_q \leq \|\varphi\|$ and $\varphi(f) = \int f g_E \, d\mu$ for any $f \in L^p(E, \mu)$. We let $M \leq \|\varphi\|$ be the supremum of $\|g_E\|_q$ as $E$ varies over all σ-finite subsets of $X$, and we take a sequence $E_n$ so that $\|g_{E_n}\|_q \to M$. Set $F = \cup_{n=1}^\infty E_n$. Then $F$ is σ-finite, and we have $\|g_F\|_q = M$. If $E \subset F^c$ is any σ-finite set then $M \geq \|g_F + g_E\|_q \geq \|g_F\|_q = M$, and hence it follows that $g_E$ is essentially 0. In particular, if $f \in L^p(X, \mu)$ vanishes on $F$ it then follows that $\varphi(f) = 0 = \int f g_F \, d\mu$. Thus, we then see that for general $f \in L^p(X, \mu)$ we have $\varphi(f) = \int f g_F \, d\mu$.

5.2.1 Exercises

Exercise 5.2.3. Define $\varphi_n \in \ell^\infty(\mathbb{N})^*$ by $\varphi_n(f) = \frac{1}{n} \sum_{k=1}^n f(k)$, show that if $\varphi$ is a weak$^*$ cluster point of $\{\varphi_n\}_n$ then $\varphi \notin \ell^1(\mathbb{N})$.

If $(X, \mathcal{M})$ is a measurable space then a (complex) finitely additive measure on $(X, \mathcal{M})$ is a function $m : \mathcal{M} \to \mathbb{C}$, such that there exists $K > 0$ so that whenever $E_1, \ldots, E_n \in \mathcal{M}$ are disjoint we have

1. $m(\cup_{k=1}^n E_k) = \sum_{k=1}^n m(E_k)$;

2. $\sum_{k=1}^n |m(E_k)| \leq K$.

If $\mu$ is a measure on $(X, \mathcal{M})$ then we say that $m$ is absolutely continuous with respect to $\mu$ if $m(E) = 0$ whenever $\mu(E) = 0$.

Exercise 5.2.4. Let $(X, \mathcal{M}, \mu)$ be a measure space and suppose that $m$ is a finitely additive measure on $(X, \mathcal{M})$ which is absolutely continuous with respect to $\mu$. There exists a unique continuous linear functional $\varphi \in L^\infty(X, \mu)^*$ so that $\varphi(1_E) = m(E)$ for all $E \in \mathcal{M}$. Moreover, every continuous linear functional on $L^\infty(X, \mu)$ arises in this way.

Chapter 6

Functional analysis

6.1 Topological vector spaces

Suppose $\mathbb{K} = \mathbb{C}$, or $\mathbb{K} = \mathbb{R}$. A topological $\mathbb{K}$-vector space consists of a $\mathbb{K}$-vector space $X$, which is also a Hausdorff topological space such that vector addition and scalar multiplication give continuous maps $X \times X \to X$ and $\mathbb{K} \times X \to X$ respectively. Examples of topological vector spaces that we have already encountered include normed spaces, as well as the duals of normed spaces endowed with the weak$^*$-topology.

As was the case for normed spaces, if $X$ is a topological vector space then we may consider the space $X^*$ consisting of all continuous linear functions into $\mathbb{K}$. The weak$^*$-topology on $X^*$ is the coarsest topology such that the evaluation maps $X^* \ni \varphi \mapsto \varphi(x) \in \mathbb{K}$ are continuous for each $x \in X$. The weak-topology on $X$ is the coarsest topology such that the evaluation maps $X \ni x \mapsto \varphi(x) \in \mathbb{K}$ are still continuous for each $\varphi \in X^*$.

If $X$ is a topological vector space then a net $\{x_i\}_{i \in I}$ is Cauchy if for any neighborhood $U$ of 0 there exists $\alpha \in I$ so that $x_i - x_j \in U$ whenever $i, j \geq \alpha$. We say that $X$ is complete if every Cauchy net converges to a point in $X$. A metric $d$ is translation invariant if $d(x + z, y + z) = d(x, y)$ for all $x, y, z \in X$. If $d$ is a translation invariant metric on $X$ which is compatible with the topology then the notions of Cauchy and completeness are the same as those notions for $d$. In particular, if $X$ has two translation invariant metrics, both of which give the topology, then $X$ is complete with respect to one if and only if $X$ is complete with respect to the other.

6.1.1 Locally convex spaces

Suppose $X$ is a vector space over $\mathbb{K}$ and $\mathcal{F}$ is a family of seminorms on $X$. We say that $\mathcal{F}$ separates points if $\rho(x) = 0$ for all $\rho \in \mathcal{F}$ only when $x = 0$. If $\mathcal{F}$ separates points then the coarsest topology for which every $\rho \in \mathcal{F}$ is continuous gives a Hausdorff topological vector space structure to $X$. In this case we see that a net $\{x_i\}_i$ is Cauchy in $X$ if and only if for each $\rho \in \mathcal{F}$ the net is Cauchy with respect to $(X, \rho)$. We also have that $x_i \to x$ in $X$ if and only if $\rho(x - x_i) \to 0$ for each $\rho \in \mathcal{F}$.

We now wish to find a topological characterization of those spaces whose topology arises from seminorms. If $X$ is a topological vector space and $C \subset X$, then $C$ is convex if for all $x, y \in C$, and $0 \leq t \leq 1$ we have $tx + (1 - t)y \in C$. The set $C$ is balanced if for all $x \in C$ and $\lambda \in \mathbb{K}$ with $|\lambda| \leq 1$ we have $\lambda x \in C$. The set $C$ is absorbing if $\cup_{t \geq 0} tC = X$.

If $\rho$ is a semi-norm on $X$ then the unit ball $\{x \in X \mid \rho(x) < 1\}$ is a convex, balanced, absorbing, open set. In general, if $C$ is an absorbing set we define the Minkowski functional $\rho_C : X \to [0, \infty)$ by

\[ \rho_C(x) = \inf\{t \geq 0 \mid x \in tC\}. \]
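A one-dimensional computation illustrates the definition and shows why the balanced hypothesis matters for obtaining a seminorm. The set $C = (-1, 2) \subset \mathbb{R}$ below is an example chosen here for illustration; it is convex, open and absorbing, but not balanced.

```latex
% Illustrative example (not from the notes): X = R and C = (-1, 2).
\begin{align*}
tC &= (-t, 2t) \quad (t > 0), \qquad 0 \cdot C = \{0\}, \\
\rho_C(x) &= \inf\{t \geq 0 \mid x \in tC\} =
  \begin{cases} x/2, & x \geq 0, \\ -x, & x < 0, \end{cases} \\
\{x \in \mathbb{R} \mid \rho_C(x) < 1\} &= (-1, 2) = C, \\
\rho_C(1) &= \tfrac{1}{2} \neq 1 = \rho_C(-1).
\end{align*}
```

So $\rho_C$ is positively homogeneous and subadditive and recovers $C$ as its open unit ball, yet $\rho_C(-x) \neq \rho_C(x)$ in general, so it fails to be a seminorm.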

Proposition 6.1.1. If $X$ is a topological vector space and $C$ is a convex, balanced, and open set, then the Minkowski functional is a continuous semi-norm on $X$, such that $C = \{x \in X \mid \rho_C(x) < 1\}$.

Proof. First note that since $C$ is open if $x \in X$ then $\lim_{n \to \infty} \frac{1}{n} x = 0$ and hence $\frac{1}{n} x \in C$ for some $n$. Therefore $C$ is also absorbing and so we have $\rho_C(x) < \infty$ for all $x \in X$. If $t > 0$ then it is clear that $\rho_C(tx) = t\rho_C(x)$. Also if $\alpha \in \mathbb{K}$ with $|\alpha| = 1$ then as $C$ is balanced we have that $\rho_C(t\alpha x) = \rho_C(tx) = t\rho_C(x)$.

If $0 < a, b < \infty$ and $x \in aC$, $y \in bC$, then as $C$ is convex we have $\frac{a}{a+b} \cdot \frac{x}{a} + \frac{b}{a+b} \cdot \frac{y}{b} \in C$, and hence $x + y \in (a + b)C$. If $x, y \in X$, $t = \rho_C(x)$, $s = \rho_C(y)$, then for every $\varepsilon > 0$ we have $x \in (t + \varepsilon)C$ and $y \in (s + \varepsilon)C$. Therefore we see that $x + y \in (s + t + 2\varepsilon)C$. Since $\varepsilon > 0$ is arbitrary we then have $\rho_C(x + y) \leq s + t = \rho_C(x) + \rho_C(y)$.

Thus, it remains to show $C = \{x \in X \mid \rho_C(x) < 1\}$. If $\rho_C(x) < 1$ then $x \in tC$ for some $t < 1$, and since $C$ is balanced it follows that $x \in tC \subset C$. Conversely, if $x \in C$, then as $C$ is open we have $tx \in C$ for $t$ sufficiently close to 1, therefore $\rho_C(x) < 1$.

Theorem 6.1.2. Let $X$ be a topological vector space. Then there is a family of seminorms on $X$ which generates the topology if and only if there exists a neighborhood base at 0 consisting of convex, balanced, absorbing sets.

Proof. If $\mathcal{F}$ is a family of seminorms which generate the topology on $X$ then we have a neighborhood base at 0 consisting of the convex, balanced and absorbing sets $\{x \in X \mid \rho(x) < a\}$ for $\rho \in \mathcal{F}$ and $a > 0$. Conversely, if $\{C_\alpha\}_\alpha$ is a family of open convex balanced and absorbing sets which give a neighborhood base at 0 then by the previous proposition the seminorms $\{\rho_{C_\alpha}\}$ are continuous and satisfy $\{x \in X \mid \rho_{C_\alpha}(x) < 1\} = C_\alpha$, hence the family generates the topology on $X$.

A topological vector space $X$ is locally convex if it satisfies one of the hypotheses of the previous theorem.

Proposition 6.1.3. Let $X$ be a locally convex topological vector space, then the following conditions are equivalent:

1. The topology on $X$ is given by a translation invariant metric.

2. $X$ is metrizable.

3. $X$ is first countable.

4. There exists a countable family of semi-norms on $X$ which generate the topology on $X$.

5. There exists a countable neighborhood base at 0 consisting of convex, balanced, and absorbing sets.

Proof. The equivalence between the last two conditions above follows as in Theorem 6.1.2. Also, every metric space is first countable. If $\{\rho_n\}_n$ is a countable family of seminorms which generate the topology then consider the translation invariant metric

\[ d(x, y) = \sum_n 2^{-n} \frac{\rho_n(x - y)}{1 + \rho_n(x - y)}. \]

A net converges with respect to this metric if and only if it converges with respect to each seminorm $\rho_n$, thus the metric $d$ describes the topology on $X$.

If $X$ is first countable then there exists a countable neighborhood base $\{U_n\}_n$ at 0. Since $X$ is locally convex, for each $n$ there exists a convex, balanced, and absorbing set $C_n$ such that $C_n \subset U_n$. We then have that $\{C_n\}_n$ gives a neighborhood base at 0 which consists of convex, balanced, and absorbing sets.

A Fréchet space is a locally convex topological vector space which has a complete translation invariant metric.
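The proof of Proposition 6.1.3 uses that $d(x, y) = \sum_n 2^{-n} \frac{\rho_n(x - y)}{1 + \rho_n(x - y)}$ is a metric without checking the triangle inequality. A sketch of that check follows; the auxiliary function $\phi$ is a name introduced here and is not part of the notes.

```latex
% Sketch of the triangle inequality for d; \phi is an auxiliary name introduced here.
\begin{align*}
\phi(t) &= \frac{t}{1 + t} = 1 - \frac{1}{1 + t}, \qquad t \geq 0
  \quad \text{(increasing, with } 0 \leq \phi < 1 \text{)}, \\
\phi(s + t) &= \frac{s}{1 + s + t} + \frac{t}{1 + s + t}
  \leq \frac{s}{1 + s} + \frac{t}{1 + t} = \phi(s) + \phi(t), \\
\phi(\rho_n(x - z)) &\leq \phi\bigl( \rho_n(x - y) + \rho_n(y - z) \bigr)
  \leq \phi(\rho_n(x - y)) + \phi(\rho_n(y - z)), \\
d(x, z) &= \sum_n 2^{-n} \phi(\rho_n(x - z)) \leq d(x, y) + d(y, z).
\end{align*}
```

Since each summand is bounded by $2^{-n}$ the series always converges, and $d(x, y) = 0$ forces $\rho_n(x - y) = 0$ for every $n$, hence $x = y$ because the family separates points.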

6.1.2 The open mapping and closed graph theorems

Lemma 6.1.4. Let $X$ be a locally convex topological vector space, $Y$ a Fréchet space, and $T : X \to Y$ a continuous surjective linear map. Then for any neighborhood $G$ of 0 in $X$ we have that $\overline{T(G)}$ is a neighborhood of 0 in $Y$.

Proof. Let $G$ be a convex, balanced, absorbing neighborhood of 0 in $X$. Then $X = \cup_{n \geq 1} nG$ and as $T$ is surjective we have $\cup_{n \geq 1} nT(G) = \cup_{n \geq 1} T(nG) = T(X) = Y$. Since $Y$ is a Fréchet space it satisfies the Baire property, and so cannot be a countable union of nowhere dense sets. Hence for some $n$ we must have that $n\overline{T(G)}$ contains a non-empty open set $O$. We then have that $U = O - O$ is an open neighborhood of 0 in $Y$ and $U \subset n\overline{T(G)} - n\overline{T(G)} \subset 2n\overline{T(G)}$. Thus $V = \frac{1}{2n} U$ is an open neighborhood of 0 and $V \subset \overline{T(G)}$.

Theorem 6.1.5 (The open mapping theorem). Let $X$ and $Y$ be Fréchet spaces and $T : X \to Y$ a continuous surjective linear map, then $T$ is an open map. In particular, if $T$ is a bijection then $T$ is a homeomorphism.

Proof. Fix compatible translation invariant metrics $d_X$ and $d_Y$ on $X$ and $Y$ respectively. The map $T$ is an open map if neighborhoods of a point $x$ are mapped to neighborhoods of $Tx$, and by translation it suffices to consider the case $x = 0$. Fix $0 < r \leq 1$, then by Lemma 6.1.4 we have that $\overline{T(B_X(r, 0))}$ is a neighborhood of 0, and thus it suffices to show that $\overline{T(B_X(r, 0))} \subset T(B_X(2r, 0))$.

Fix $y \in \overline{T(B_X(r, 0))}$. For $n \geq 0$ we inductively define a sequence $x_n$, so that $x_n \in B_X(r2^{-n}, 0)$, and

\[ y - T(x_0 + x_1 + \cdots + x_n) \in B_Y(2^{-n}, 0) \cap \overline{T(B_X(r2^{-n-1}, 0))}. \]

Indeed, $y \in \overline{T(B_X(r, 0))}$ and by Lemma 6.1.4 $\overline{T(B_X(r2^{-1}, 0))}$ is a neighborhood of 0, therefore there exists $x_0 \in B_X(r, 0)$ so that

\[ y - Tx_0 \in B_Y(1, 0) \cap \overline{T(B_X(r2^{-1}, 0))}. \]

Now suppose that $x_0, \ldots, x_{n-1}$ have been chosen. Since

\[ y - T(x_0 + x_1 + \cdots + x_{n-1}) \in \overline{T(B_X(r2^{-n}, 0))} \]

and since $\overline{T(B_X(r2^{-n-1}, 0))}$ is a neighborhood of 0 there exists $x_n \in B_X(r2^{-n}, 0)$ such that

\[ y - T(x_0 + x_1 + \cdots + x_n) \in B_Y(2^{-n}, 0) \cap \overline{T(B_X(r2^{-n-1}, 0))}. \]

Then $\{\sum_{k=0}^n x_k\}_{n \geq 1}$ gives a Cauchy sequence and we have $y = T(\sum_{n=0}^\infty x_n)$. Since $\sum_{n=0}^\infty x_n \in B_X(2r, 0)$ we then have that $y \in T(B_X(2r, 0))$.

Corollary 6.1.6. Let $X$ and $Y$ be Fréchet spaces and suppose $T : X \to Y$ is a continuous linear bijective map, then $T$ gives an isomorphism of topological vector spaces.

Theorem 6.1.7 (The closed graph theorem). Let $X$ and $Y$ be Fréchet spaces and $T : X \to Y$ a linear map such that the graph of $T$

\[ G(T) = \{(x, Tx) \mid x \in X\} \subset X \times Y \]

is closed in the product topology, then $T$ is continuous.

Proof. $X \times Y$ is also a Fréchet space and hence so is $G(T)$, being a closed subspace. The projection map $p_X : G(T) \to X$ is continuous bijective and hence by the open mapping theorem has continuous inverse. Since $p_Y$ is also continuous it then follows that $T = p_Y \circ p_X^{-1}$ is continuous.

Theorem 6.1.8 (The Banach-Steinhaus uniform boundedness principle). Let $X$ be a Banach space and $Y$ a normed vector space. Suppose $\mathcal{F} \subset B(X, Y)$ is such that for all $x \in X$ one has

\[ \sup_{T \in \mathcal{F}} \|T(x)\|_Y < \infty, \]

then

\[ \sup_{T \in \mathcal{F}} \|T\|_{B(X, Y)} < \infty. \]

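Proof. For each $n \in \mathbb{N}$ let

\[ X_n = \{x \in X \mid \sup_{T \in \mathcal{F}} \|T(x)\|_Y \leq n\}. \]

Then $X_n$ is closed and $\cup_{n=1}^\infty X_n = X$. By the Baire category theorem there exists $n$ so that $X_n$ has non-empty interior, i.e., there exists $x_0 \in X_n$ and $\varepsilon > 0$ so that $B(\varepsilon, x_0) \subset X_n$. Let $y \in X$ with $\|y\| \leq 1$, and suppose $T \in \mathcal{F}$. Then

\[ \|T(y)\|_Y = \varepsilon^{-1} \|T(x_0 + \varepsilon y) - T(x_0)\|_Y \leq \varepsilon^{-1} \left( \|T(x_0 + \varepsilon y)\|_Y + \|T(x_0)\|_Y \right) \leq 2\varepsilon^{-1} n. \]

Therefore $\sup_{T \in \mathcal{F}} \|T\|_{B(X, Y)} \leq 2\varepsilon^{-1} n < \infty$.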

6.1.3 Exercises

Exercise 6.1.9. Let $X$ be a σ-compact, locally compact Hausdorff space. Consider $C(X)$ endowed with the topology of uniform convergence on compact sets. Then $C(X)$ is a Fréchet space.

Exercise 6.1.10. Let $X$ be a vector space over $\mathbb{K}$ and suppose $\| \cdot \|_1$ and $\| \cdot \|_2$ are two complete norms on $X$ such that there exists $C > 0$ with $\|x\|_1 \leq C\|x\|_2$ for all $x \in X$. Show that there exists $C' > 0$ so that $\|x\|_2 \leq C'\|x\|_1$ for all $x \in X$.

Exercise 6.1.11. The vector space $C^m(\mathbb{R})$ of all $m$-times continuously differentiable functions is a Fréchet space with the semi-norms $\|f\|_{k,n} = \sup\{|f^{(k)}(x)| \mid x \in [-n, n]\}$, for $0 \leq k \leq m$.

Exercise 6.1.12. The vector space $C^\infty(\mathbb{R})$ of all infinitely differentiable functions $f : \mathbb{R} \to \mathbb{C}$ is a Fréchet space with the semi-norms $\|f\|_{k,n} = \sup\{|f^{(k)}(x)| \mid x \in [-n, n]\}$, for $0 \leq k < \infty$.

Exercise 6.1.13. If $\{X_i\}_{i \in I}$ is a family of locally convex topological vector spaces, then $\prod_{i \in I} X_i$ with the product topology and coordinatewise operations is again a locally convex topological vector space. Moreover, if $I$ is countable and each $X_i$ is a Fréchet space then so is $\prod_{i \in I} X_i$.

Exercise 6.1.14. If $X$ is a topological vector space, then $X^*$ with the weak$^*$-topology, and $X$ with the weak-topology are also topological vector spaces.

Exercise 6.1.15. Let $X$ be a topological vector space and suppose $\varphi \in X^*$ is not the zero functional, then $\varphi$ is an open map, i.e., if $G \subset X$ is open then $\varphi(G)$ is also open.

Exercise 6.1.16. If $X$ is a topological vector space, then $(X, \mathrm{wk})^* = X^*$.

Exercise 6.1.17. Let $X$ be a set. The pairing between $c_0 X$ and $\ell^1 X$ given by $\langle f, g \rangle = \sum_{x \in X} f(x)g(x)$ gives an isomorphism $\ell^1 X \cong c_0 X^*$.

Exercise 6.1.18. Let $(X, \mu)$ be a σ-finite measure space, and let $\{f_n\}_{n \in \mathbb{N}} \subset L^1(X, \mu)$ be a uniformly bounded sequence of non-negative functions. Then $f_n \to 0$ weakly if and only if $\int f_n \, d\mu \to 0$.

Exercise 6.1.19. Consider $L^\infty([0, 1]) = L^1([0, 1])^*$. Then $\operatorname{span}\{e^{i2\pi nt}\}_{n \in \mathbb{Z}}$ is weak$^*$-dense in $L^\infty([0, 1])$. Hint: Use Lusin's theorem and the Stone-Weierstrass theorem.

Exercise 6.1.20. The sequence $\{e^{i2\pi nt}\}_{n \in \mathbb{N}} \subset L^\infty([0, 1])$ converges to 0 in the weak$^*$-topology.

Exercise 6.1.21. A topological vector space X is locally convex if and only if it is isomorphic (as topological vector spaces) to a subspace of a product of Banach spaces.

Exercise 6.1.22. If $X$ is a Fréchet space then a subspace $Y \subset X$ is a Fréchet space if and only if $Y$ is closed.

Exercise 6.1.23. If $X$ is a Fréchet space then there exists a complete metric $d$ on $X$ which is compatible with the topology and is translation invariant, i.e., $d(x + z, y + z) = d(x, y)$ for all $x, y, z \in X$. Hint: First find a translation invariant metric which is compatible with the topology and then use the previous exercise to show that this metric is complete.

Exercise 6.1.24. Let $X$ be a locally convex topological vector space whose topology is defined by a family of seminorms $\mathcal{F}$. If $\varphi \in X^*$ then there exist $\rho_1, \ldots, \rho_n \in \mathcal{F}$ and $K > 0$ so that $|\varphi(x)| \leq K \sum_{i=1}^n \rho_i(x)$, for all $x \in X$.

6.2 The Hahn-Banach theorem

Let X be a real vector space. A function f : X → R is a sublinear functional if

1. f(tx) = tf(x), for all t > 0 and x ∈ X;

2. f(x + y) ≤ f(x) + f(y) for all x, y ∈ X.

Note that any seminorm is a sublinear functional. Further examples are given by the following lemma whose proof we leave to the reader.

Lemma 6.2.1. Let $X$ be a real vector space and let $C \subset X$ be a convex and absorbing set. Then the Minkowski functional

\[ \rho_C(x) = \inf\{t > 0 \mid x \in tC\} \]

is a sublinear functional, and $C = \{x \in X \mid \rho_C(x) < 1\}$.

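Theorem 6.2.2 (The Hahn-Banach theorem I). Let $X$ be a real vector space with a sublinear functional $f : X \to \mathbb{R}$. Suppose $Y \subset X$ is a subspace, and $\varphi : Y \to \mathbb{R}$ is a linear functional such that

\[ \varphi(y) \leq f(y), \quad y \in Y. \]

Then there exists a linear functional $\psi : X \to \mathbb{R}$ such that $\psi(y) = \varphi(y)$ for $y \in Y$, and such that

\[ \psi(x) \leq f(x), \quad x \in X. \]

Proof. We let $\mathcal{F}$ denote the set of all linear functionals $\psi : Z \to \mathbb{R}$ such that $Y \subset Z \subset X$, $\psi|_Y = \varphi$ and $\psi(z) \leq f(z)$ for all $z \in Z$. If $\psi_1, \psi_2 \in \mathcal{F}$ with $\psi_1 : Z_1 \to \mathbb{R}$ and $\psi_2 : Z_2 \to \mathbb{R}$ then we write $\psi_1 \prec \psi_2$ if $Z_1 \subset Z_2$ and $\psi_2|_{Z_1} = \psi_1$. This then gives a partial ordering on $\mathcal{F}$. If $\{\psi_\alpha\}_{\alpha \in I}$ is an increasing chain in $\mathcal{F}$ with $\psi_\alpha : Z_\alpha \to \mathbb{R}$ then by setting $Z = \cup_\alpha Z_\alpha$ and defining $\psi : Z \to \mathbb{R}$ by $\psi(x) = \psi_\alpha(x)$ for $x \in Z_\alpha$ we have that $\psi \in \mathcal{F}$ is well defined and $\psi_\alpha \prec \psi$ for each $\alpha$. Thus every chain has an upper bound and hence by Zorn's lemma there exists a maximal element $\psi \in \mathcal{F}$.

To finish the theorem it then suffices to show that any maximal element in $\mathcal{F}$ must have the entirety of $X$ in its domain. Suppose that $\psi \in \mathcal{F}$ with $\psi : Z \to \mathbb{R}$ such that $X \neq Z$. Take $x_0 \in X \setminus Z$ and set $\tilde{Z} = \{z + \alpha x_0 \mid z \in Z, \alpha \in \mathbb{R}\}$. If $t \in \mathbb{R}$ then we may define a linear function $\tilde{\psi} : \tilde{Z} \to \mathbb{R}$ by $\tilde{\psi}(z + \alpha x_0) = \psi(z) + \alpha t$. In order for $\tilde{\psi}$ to belong to $\mathcal{F}$ we need to be able to choose $t$ so that for all $z \in Z$ and $\alpha \in \mathbb{R}$ we have $\psi(z) + \alpha t \leq f(z + \alpha x_0)$. Equivalently, for $\alpha > 0$ we need

\[ -f\left( \frac{z}{\alpha} - x_0 \right) + \psi\left( \frac{z}{\alpha} \right) \leq t \leq f\left( \frac{z}{\alpha} + x_0 \right) - \psi\left( \frac{z}{\alpha} \right), \]

for all $z \in Z$. Note that for $z_1, z_2 \in Z$ we have

\[ \psi(z_2) + \psi(z_1) = \psi(z_2 + z_1) \leq f(z_2 + z_1) \leq f(z_2 + x_0) + f(z_1 - x_0). \]

Therefore,

\[ -f(z_1 - x_0) + \psi(z_1) \leq f(z_2 + x_0) - \psi(z_2). \]

If we set $c_1 = \sup_{z_1 \in Z}\{-f(z_1 - x_0) + \psi(z_1)\}$ and we set $c_2 = \inf_{z_2 \in Z}\{f(z_2 + x_0) - \psi(z_2)\}$ then we have shown that $c_1 \leq c_2$. Taking $t$ so that $c_1 \leq t \leq c_2$ then gives the extension $\tilde{\psi} \in \mathcal{F}$, showing that $\psi$ was not maximal.

Theorem 6.2.3 (The Hahn-Banach theorem II). Let $X$ be a vector space over $\mathbb{K}$ and $\rho$ a semi-norm on $X$. Suppose $Y \subset X$ is a subspace, and $\varphi : Y \to \mathbb{K}$ is a linear functional such that

\[ |\varphi(y)| \leq \rho(y), \quad y \in Y. \]

Then there exists a linear functional $\psi : X \to \mathbb{K}$ such that $\psi(y) = \varphi(y)$ for $y \in Y$, and such that

\[ |\psi(x)| \leq \rho(x), \quad x \in X. \]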

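Proof. We first consider the case $\mathbb{K} = \mathbb{R}$. Since $\rho$ is a sublinear functional and $\varphi(y) \leq |\varphi(y)| \leq \rho(y)$ for $y \in Y$ it follows from the previous theorem that there exists a linear functional $\psi : X \to \mathbb{R}$ so that $\psi|_Y = \varphi$ and $\psi(x) \leq \rho(x)$ for $x \in X$. Considering $-x$ then shows that $-\psi(x) = \psi(-x) \leq \rho(-x) = \rho(x)$ and hence $|\psi(x)| \leq \rho(x)$ for all $x \in X$.

We now consider the case $\mathbb{K} = \mathbb{C}$. We have $|\operatorname{Re}(\varphi(y))| \leq |\varphi(y)| \leq \rho(y)$ and hence viewing $X$ as a real vector space it follows from above that there exists an $\mathbb{R}$-linear functional $\psi_0 : X \to \mathbb{R}$ so that $\psi_0(y) = \operatorname{Re}(\varphi(y))$ for $y \in Y$, and $|\psi_0(x)| \leq \rho(x)$ for $x \in X$. Set $\psi : X \to \mathbb{C}$ by $\psi(x) = \psi_0(x) - i\psi_0(ix)$. Then $\psi$ is $\mathbb{R}$-linear and we also have $\psi(ix) = i\psi(x)$, hence $\psi$ is $\mathbb{C}$-linear. Moreover, for $y \in Y$ we have $\psi(y) = \operatorname{Re}(\varphi(y)) - i\operatorname{Re}(\varphi(iy)) = \varphi(y)$. Finally, if $x \in X$ choose $\theta$ so that $e^{i\theta}\psi(x) = |\psi(x)|$. Then

\[ |\psi(x)| = \psi(e^{i\theta}x) = \operatorname{Re}(\psi(e^{i\theta}x)) = \psi_0(e^{i\theta}x) \leq \rho(e^{i\theta}x) = \rho(x). \]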
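The two identities asserted in the complex case, $\psi(ix) = i\psi(x)$ and $\psi|_Y = \varphi$, are short computations. They are written out below as a sketch, using only the definition $\psi(x) = \psi_0(x) - i\psi_0(ix)$, the $\mathbb{R}$-linearity of $\psi_0$, and the $\mathbb{C}$-linearity of $\varphi$ on $Y$; the symbol $z$ denotes a generic complex number.

```latex
% Verification of the two identities used in the complex case of Theorem 6.2.3.
\begin{align*}
\psi(ix) &= \psi_0(ix) - i\psi_0(i \cdot ix) = \psi_0(ix) - i\psi_0(-x)
          = \psi_0(ix) + i\psi_0(x) \\
         &= i\bigl( \psi_0(x) - i\psi_0(ix) \bigr) = i\psi(x), \\
\operatorname{Re}(iz) &= -\operatorname{Im}(z) \quad (z \in \mathbb{C}), \\
\psi(y) &= \operatorname{Re}(\varphi(y)) - i\operatorname{Re}(\varphi(iy))
         = \operatorname{Re}(\varphi(y)) - i\operatorname{Re}(i\varphi(y)) \\
        &= \operatorname{Re}(\varphi(y)) + i\operatorname{Im}(\varphi(y)) = \varphi(y),
         \qquad y \in Y.
\end{align*}
```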

Corollary 6.2.4. Let $X$ be a normed space, then the map $\iota : X \to X^{**}$ given by $\iota(x)(\varphi) = \varphi(x)$ is isometric.

Proof. Fix $x \in X$ and define $\varphi : \mathbb{K}x \to \mathbb{K}$ by $\varphi(\alpha x) = \alpha\|x\|$. Then $\varphi$ is linear and we have $|\varphi(\alpha x)| = \|\alpha x\|$. By the Hahn-Banach theorem there then exists a linear functional $\psi : X \to \mathbb{K}$ so that $\psi(x) = \|x\|$, and $|\psi(z)| \leq \|z\|$ for all $z \in X$, i.e., $\|\psi\| \leq 1$. Therefore we see $\|\iota(x)\| \geq |\iota(x)(\psi)| = |\psi(x)| = \|x\|$. The reverse inequality is trivial.

Lemma 6.2.5. Let $X$ be a vector space over $\mathbb{K}$ and suppose $\varphi, \varphi_1, \ldots, \varphi_n$ are linear functionals on $X$ such that $\cap_{k=1}^n \ker(\varphi_k) \subset \ker(\varphi)$. Then we have $\varphi \in \operatorname{span}\{\varphi_1, \ldots, \varphi_n\}$.

Proof. We may assume that $\{\varphi_1, \ldots, \varphi_n\}$ is linearly independent. Set $L = \cap_{k=1}^n \ker(\varphi_k)$. Then $\varphi, \varphi_1, \ldots, \varphi_n$ all give well defined linear functionals on $X/L$ which has dimension at most $n$. Since $\{\varphi_1, \ldots, \varphi_n\} \subset (X/L)^*$ is a set of $n$ linearly independent vectors it follows that $\{\varphi_1, \ldots, \varphi_n\}$ also spans $(X/L)^*$. Therefore, $\varphi \in \operatorname{span}\{\varphi_1, \ldots, \varphi_n\}$.

Proposition 6.2.6. If $X$ is a locally convex topological vector space, then the map $\Xi : X \to (X^*, \mathrm{wk}^*)^*$ given by $\Xi(x)(\varphi) = \varphi(x)$ is bijective.

Proof. If $x \in X$, $x \neq 0$ then there exists a continuous semi-norm $\rho$ so that $\rho(x) > 0$. If we consider $\varphi : \mathbb{K}x \to \mathbb{K}$ given by $\varphi(\alpha x) = \alpha\rho(x)$, then as in Corollary 6.2.4 we may apply the Hahn-Banach theorem to produce a linear functional $\psi : X \to \mathbb{K}$ so that $\psi(x) = \rho(x)$ and $|\psi(z)| \leq \rho(z)$ for all $z \in X$. Hence $\psi$ is continuous and $\Xi(x)(\psi) = \psi(x) = \rho(x) \neq 0$. Therefore $\Xi$ is injective.

If $\zeta \in (X^*, \mathrm{wk}^*)^*$ then there exist $K > 0$ and $x_1, \ldots, x_n \in X$ so that $|\zeta(\varphi)| \leq K \sum_{i=1}^n |\varphi(x_i)|$, for all $\varphi \in X^*$. In particular we have $\cap_{i=1}^n \ker(\Xi(x_i)) \subset \ker(\zeta)$, and it follows from the previous lemma that $\zeta = \Xi(x)$ for some $x \in \operatorname{span}\{x_1, \ldots, x_n\}$.

6.2.1 Separating convex sets

If $X$ is a topological vector space over $\mathbb{K}$ and $A, B \subset X$, then $A$ and $B$ are separated if there exists a continuous linear functional $\varphi \in X^*$, and $\alpha \in \mathbb{R}$ so that

\[ \operatorname{Re}(\varphi(a)) \leq \alpha \leq \operatorname{Re}(\varphi(b)), \quad a \in A, \ b \in B. \tag{6.1} \]

If the inequalities in Equation (6.1) may be taken to be strict then we say that $A$ and $B$ are strictly separated. Note that if $A$ and $B$ are (strictly) separated then the convex sets they generate are also (strictly) separated.

Lemma 6.2.7. Let $X$ be a topological vector space over $\mathbb{K}$. If $G \subset X$ is convex open, and $x_0 \notin G$, then $G$ and $\{x_0\}$ are separated.

Proof. We first consider the case $\mathbb{K} = \mathbb{R}$ and we assume that $G$ is nonempty. Taking $g \in G$ and replacing $G$ with $G - g$ and $x_0$ with $x_0 - g$ it is enough to consider the case when $0 \in G$. Since $G$ is open and contains 0 we have that $G$ is absorbing and hence by Lemma 6.2.1 the Minkowski functional $\rho_G(y) = \inf\{t > 0 \mid y \in tG\}$ is sublinear and satisfies $G = \{y \in X \mid \rho_G(y) < 1\}$. It is also easy to see that $\rho_G$ is continuous.

Define $\varphi : \mathbb{R}x_0 \to \mathbb{R}$ by $\varphi(\alpha x_0) = \alpha\rho_G(x_0) \leq \rho_G(\alpha x_0)$. By the Hahn-Banach theorem there then exists a linear functional $\psi : X \to \mathbb{R}$ so that $\psi(\alpha x_0) = \alpha\rho_G(x_0)$ and $\psi(z) \leq \rho_G(z)$ for all $z \in X$. If $z_\alpha \to z$ then we have $\rho_G(z_\alpha - z) \to 0$ and hence $\limsup_{\alpha \to \infty} \psi(z_\alpha - z) \leq 0$. Since we also have $\rho_G(z - z_\alpha) \to 0$ we also obtain $\liminf_{\alpha \to \infty} \psi(z_\alpha - z) \geq 0$ and hence $\lim_{\alpha \to \infty} \psi(z_\alpha - z) = 0$. Therefore $\psi$ is continuous. Finally, for $x \in G$ we have

\[ \psi(x) \leq \rho_G(x) < 1 \leq \rho_G(x_0) = \psi(x_0). \]

We now consider the case $\mathbb{K} = \mathbb{C}$. Treating $X$ as an $\mathbb{R}$-vector space it follows from above that there exists a continuous $\mathbb{R}$-linear functional $\psi$ and $\alpha \in \mathbb{R}$ so that for all $x \in G$ we have $\psi(x) \leq \alpha \leq \psi(x_0)$. Setting $\tilde{\psi}(x) = \psi(x) - i\psi(ix)$ then gives the result.

Proposition 6.2.8. Let $X$ be a topological vector space over $\mathbb{K}$. If $G, H \subset X$ are disjoint convex sets such that $G$ is open, then $G$ and $H$ are separated. Moreover, if $H$ is also open then $G$ and $H$ are strictly separated.

Proof. Consider the set $G - H = \{x - y \mid x \in G, y \in H\}$. Since $G$ and $H$ are convex it follows that $G - H$ is also convex. Moreover, as $G$ and $H$ are disjoint we have that $0 \notin G - H$. Writing $G - H = \cup_{y \in H}(G - y)$ we see that $G - H$ is open. Thus, by Lemma 6.2.7 we can separate $G - H$ and 0 by some linear functional $\varphi$, i.e., replacing $\varphi$ with $-\varphi$ if needed, we have $0 \leq \operatorname{Re}(\varphi(x - y))$ for all $x \in G$ and $y \in H$. This then shows that $\operatorname{Re}(\varphi(y)) \leq \operatorname{Re}(\varphi(x))$ for all $x \in G$ and $y \in H$, and hence $G$ and $H$ are separated. If $H$ is also open then by Exercise 6.1.15 we have that $\operatorname{Re}(\varphi(G))$ and $\operatorname{Re}(\varphi(H))$ are disjoint open sets in $\mathbb{R}$ showing that $G$ and $H$ are strictly separated.

Theorem 6.2.9 (The Hahn-Banach separation theorem). Let $X$ be a locally convex topological vector space over $\mathbb{K}$. If $K, F \subset X$ are disjoint closed convex sets and $K$ is compact, then $K$ and $F$ are strictly separated.

Proof. Since $F$ is closed and $X$ is locally convex for each $k \in K$ there exists a balanced convex open neighborhood $U_k$ of 0 so that $(k + U_k) \cap F = \emptyset$. The family $\{k + \frac{1}{2}U_k\}_{k \in K}$ gives an open cover of $K$ and so by compactness there exists a finite subcover $\{k_i + \frac{1}{2}U_{k_i}\}_{i=1}^n$. Set $U = \cap_{i=1}^n \frac{1}{4}U_{k_i}$. Then $K + U$ and $F + U$ are convex open sets.

We claim that $K + U$ and $F + U$ are disjoint. Indeed if $k + u = f + v$ with $k \in K$ and $u, v \in U$ then we have $k \in (k_i + \frac{1}{2}U_{k_i})$ for some $i$ and so

\[ f = k + u - v \in k_i + \frac{1}{2}U_{k_i} + \frac{1}{4}U_{k_i} + \frac{1}{4}U_{k_i} \subset (k_i + U_{k_i}) \subset F^c. \]

Hence $(K + U) \cap (F + U) = \emptyset$ and thus these sets are strictly separated by Proposition 6.2.8. Since $K \subset K + U$ and $F \subset F + U$ it then follows that $K$ and $F$ are strictly separated.

Note that the hypothesis that $K$ is compact is necessary. Indeed, even in $\mathbb{R}^2$ the disjoint closed convex sets $\{(t, 0) \mid t \geq 0\}$ and $\{(t, s) \mid t > 0, s \geq t^{-1}\}$ cannot be strictly separated.
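To see concretely why compactness cannot be dropped, consider the following closed convex variant of the example in $\mathbb{R}^2$. The set names $A$, $B$ and the coefficients $c_1, c_2$ are introduced here for illustration only.

```latex
% Illustrative computation; A, B, c_1, c_2 are names introduced for this sketch.
\begin{align*}
A &= \{(t, 0) \mid t \geq 0\}, \qquad
B = \{(t, s) \mid t > 0,\ s \geq t^{-1}\}, \qquad
\varphi(x, y) = c_1 x + c_2 y. \\
\intertext{The sets are separated by $\varphi(x, y) = y$ and $\alpha = 0$:}
\varphi(a) &= 0 \leq 0 < \varphi(b), \qquad a \in A,\ b \in B. \\
\intertext{Suppose instead that $\varphi(a) < \alpha < \varphi(b)$ for all $a \in A$, $b \in B$.
Taking $a = (0, 0)$ gives $\alpha > 0$, and $c_1 t < \alpha$ for all $t \geq 0$ forces $c_1 \leq 0$.
Then for $b = (t, t^{-1}) \in B$ we get}
0 < \alpha &< c_1 t + c_2 t^{-1} \leq c_2 t^{-1} \xrightarrow[t \to \infty]{} 0 .
\end{align*}
```

This is a contradiction, so no functional strictly separates $A$ from $B$ in this order; the reversed order is excluded by applying the same argument to $-\varphi$ and $-\alpha$. Both sets are closed and convex but neither is compact.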

Corollary 6.2.10. Let $X$ be a $\mathbb{K}$-vector space and suppose that $\mathcal{T}_1$ and $\mathcal{T}_2$ are topologies on $X$ giving the structure of a locally convex topological vector space and such that $(X, \mathcal{T}_1)$ and $(X, \mathcal{T}_2)$ have the same continuous linear functionals. Then a convex set $C \subset X$ is closed in the $\mathcal{T}_1$-topology if and only if $C$ is closed in the $\mathcal{T}_2$-topology.

Proof. Suppose $C \subset X$ is convex and closed in the $\mathcal{T}_1$-topology. For each point $x \notin C$ it follows from the Hahn-Banach separation theorem that $x$ and $C$ are strictly separated. Thus, there exists a $\mathcal{T}_1$-continuous linear functional $\varphi_x$ and $\alpha_x \in \mathbb{R}$ so that $\operatorname{Re}(\varphi_x(y)) \leq \alpha_x$ for all $y \in C$ and $\operatorname{Re}(\varphi_x(x)) > \alpha_x$. We then have $C = \cap_{x \notin C}\{y \in X \mid \operatorname{Re}(\varphi_x(y)) \leq \alpha_x\}$. Since the two topologies have the same continuous linear functionals it then follows that $C$ is also closed in the $\mathcal{T}_2$-topology. The converse also holds by symmetry.

Corollary 6.2.11. Let $X$ be a locally convex topological vector space and suppose $C \subset X$ is a convex set, then $C$ is closed if and only if $C$ is weakly closed. In particular, a subspace $Y \subset X$ is closed if and only if it is weakly closed.

Proof. Since $X$ with its given topology and $X$ with the weak topology have the same continuous linear functionals this follows from the previous corollary.

Recall from Corollary 6.2.4 that for a normed space $X$ we have a natural isometric embedding $\iota : X \to X^{**}$ given by $\iota(x)(\varphi) = \varphi(x)$. We will therefore identify $X$ as a subspace of $X^{**}$.

Proposition 6.2.12. Let $X$ be a normed space, then the unit ball of $X$ is weak$^*$-dense in the unit ball of $X^{**}$. In particular, $X$ is weak$^*$-dense in $X^{**}$.

Proof. Let B be the unit ball of X and let C be the weak∗-closure of B in X∗∗. By Proposition 6.2.6 any weak∗-continuous linear functional on X∗∗ is of the form η ↦ η(ϕ) for some ϕ ∈ X∗. Therefore by the Hahn-Banach separation theorem if ζ ∈ X∗∗ is not in C then there exists ϕ ∈ X∗ and α ∈ ℝ so that for any x ∈ B we have Re(ϕ(x)) < α < Re(ζ(ϕ)). Since 0 ∈ B we have 0 < α. Since e^{iθ}B = B for every θ ∈ ℝ it then follows that |ϕ(x)| < α for each x ∈ B and hence ‖ϕ‖ ≤ α. Therefore α < Re(ζ(ϕ)) ≤ |ζ(ϕ)| ≤ ‖ζ‖‖ϕ‖ ≤ α‖ζ‖, and we conclude that ‖ζ‖ > 1. By contraposition it then follows that C agrees with the unit ball in X∗∗.

A Banach space X is reflexive if X∗∗ = X. For example, Theorem 5.2.2 shows that if (X, µ) is a measure space and 1 < p < ∞, then L^p(X, µ) is reflexive.

Theorem 6.2.13. Let X be a Banach space, then the following conditions are equivalent:

1. X is reflexive.

2. The weak and weak∗ topologies on X∗ agree.

3. X∗ is reflexive.

4. The closed unit ball in X is weakly compact.

Proof. By Proposition 6.2.12 the unit ball of X is dense in the unit ball of X∗∗ in the weak∗-topology, and the restriction of this topology to X agrees with the weak topology on X. Therefore if the unit ball of X is weakly compact, then it is weak∗-closed in X∗∗, so the unit ball of X is equal to the unit ball of X∗∗ and hence X = X∗∗. This shows that (4) ⟹ (1), while the Banach-Alaoglu theorem gives (1) ⟹ (4).

We also clearly have (3) ⟹ (2). Conversely, (2) together with the Banach-Alaoglu theorem shows that the closed unit ball of X∗ is weakly compact, which shows that X∗ is reflexive by the implication (4) ⟹ (1) applied to X∗. Thus (2) and (3) are equivalent.

To complete the proof it then suffices to show (3) ⟹ (1), as (1) ⟹ (3) would then follow by considering X∗∗: if X is reflexive then (X∗)∗ = X is reflexive, and (3) ⟹ (1) applied to X∗ shows that X∗ is reflexive. Suppose therefore that (3) (and hence also (2)) holds. If ζ ∈ X∗∗ then ζ is continuous with respect to the weak topology on X∗. Since the weak and weak∗ topologies agree we then have that ζ is continuous with respect to the weak∗ topology and hence ζ ∈ X by Proposition 6.2.6. This shows that X is reflexive.

6.2.2 The Krein-Milman theorem

If X is a 𝕂-vector space and C ⊂ X is a nonempty convex set, then a point k ∈ C is an extreme point if it cannot be written as a non-trivial convex combination of elements in C, i.e., if x, y ∈ C with x, y ≠ k and t ∈ [0, 1] then tx + (1 − t)y ≠ k. More generally, we say that a nonempty subset K ⊂ C is an extreme subset if whenever x, y ∈ C \ K and t ∈ [0, 1] then we have tx + (1 − t)y ∉ K. We denote by ext(C) the set of extreme points of C. If F ⊂ X we let co(F) denote the smallest closed convex set which contains F.

Lemma 6.2.14. Let X be a vector space over 𝕂 and suppose C ⊂ X is a nonempty convex set, ϕ is a linear functional and α ∈ ℝ so that K = C ∩ {x ∈ X | Re(ϕ(x)) ≤ α} is nonempty. Then K is an extreme subset of C.

Proof. Suppose x, y ∈ C \ K and t ∈ [0, 1]. Then Re(ϕ(x)) > α and Re(ϕ(y)) > α, so

\[ \mathrm{Re}\,\varphi(tx + (1-t)y) = t\,\mathrm{Re}\,\varphi(x) + (1-t)\,\mathrm{Re}\,\varphi(y) > t\alpha + (1-t)\alpha = \alpha, \]

and hence tx + (1 − t)y ∉ K.

Theorem 6.2.15 (The Krein-Milman theorem). Suppose K is a compact convex subset of a locally convex topological vector space over 𝕂, then K = co(ext(K)).

Proof. Throughout the proof we write C for the compact convex set in the statement, so that K is free to denote extreme subsets of C. We consider the family ℱ consisting of compact convex extreme subsets of C, which we consider ordered by decreasing inclusion. If {K_α}_α is a chain of elements in ℱ then set K = ∩_α K_α. Then K is convex and by compactness we have that K is nonempty. It is also easy to see that K is an extreme subset. Thus any chain has an upper bound and by Zorn's lemma there then exists a nonempty compact convex extreme set K which has no proper subset with this property. If x, y ∈ K with x ≠ y then by the Hahn-Banach theorem there exists a continuous linear functional ϕ so that Re(ϕ(x)) < Re(ϕ(y)). We would then have that K ∩ {z ∈ X | Re(ϕ(z)) ≤ Re(ϕ(x))} is a nonempty convex compact extreme subset which does not contain y, contradicting minimality of K. Thus we conclude that K consists of a single point, and so ext(C) ≠ ∅.

Let B = co(ext(C)) and suppose B ≠ C so that there exists x ∈ C \ B. As B is a closed convex set, from the Hahn-Banach separation theorem there exists a continuous linear functional ϕ and α ∈ ℝ so that Re(ϕ(x)) ≤ α and Re(ϕ(y)) > α for all y ∈ B. We then have that C ∩ {z ∈ X | Re(ϕ(z)) ≤ α} is a nonempty convex compact extreme subset which is disjoint from B. However the argument above shows that C ∩ {z ∈ X | Re(ϕ(z)) ≤ α} must contain an extreme point and hence cannot be disjoint from B. Therefore we conclude that B = C.
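To make the conclusion of the Krein-Milman theorem concrete, here is a minimal sketch for the unit square [0, 1]² in ℝ², whose extreme points are its four corners. The names `CORNERS`, `corner_weights` and `combine`, and the choice of bilinear weights, are illustrative assumptions rather than anything from the text; the code only checks that a point of the square is a convex combination of the corners.

```python
# Extreme points (corners) of the unit square [0, 1]^2; illustrative example.
CORNERS = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


def corner_weights(x, y):
    """Return weights t_1..t_4 >= 0 summing to 1 with sum t_i * corner_i = (x, y)."""
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("point must lie in the unit square")
    # Bilinear weights, listed in the same order as CORNERS.
    return [(1 - x) * (1 - y), x * (1 - y), (1 - x) * y, x * y]


def combine(weights):
    """Form the convex combination of the corners with the given weights."""
    px = sum(t * c[0] for t, c in zip(weights, CORNERS))
    py = sum(t * c[1] for t, c in zip(weights, CORNERS))
    return (px, py)


weights = corner_weights(0.25, 0.5)
point = combine(weights)
```

In finite dimensions no closure is needed, but in general co(ext(K)) is by definition the smallest closed convex set containing the extreme points.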

6.2.3 Exercises

Exercise 6.2.16. Let X be a Banach space and suppose {x_n}_{n=1}^∞ ⊂ X is a sequence which converges weakly, then {x_n}_{n=1}^∞ is bounded.

Exercise 6.2.17. Let (X, µ) be a σ-finite measure space. Then L¹(X, µ) is reflexive if and only if L¹(X, µ) is finite dimensional.

Exercise 6.2.18. Let X be a Fréchet space, then X∗ is a Fréchet space in the weak∗-topology if and only if X is isomorphic to a separable Banach space. Hint: If {ρ_n}_{n∈ℕ} is a family of seminorms which give the topology on X, first show that A_n = {ϕ ∈ X∗ | |ϕ(x)| ≤ n Σ_{k=1}^n ρ_k(x) for all x ∈ X} is weak∗-compact by the Banach-Alaoglu theorem, then show that X∗ = ∪_n A_n and use the Baire property.

Exercise 6.2.19 (The Markov-Kakutani fixed point theorem). Let K ⊂ X be a non-empty compact convex subset of a locally convex space X, and suppose S is a non-empty family of pairwise commuting continuous maps from K to itself which are affine, i.e., T(tk_1 + (1 − t)k_2) = tT(k_1) + (1 − t)T(k_2) whenever T ∈ S, k_1, k_2 ∈ K and t ∈ [0, 1]. Then there is a point in K which is a common fixed point for all maps in S. Hint: First consider the case when S consists of a single map T, take k_0 ∈ K and consider the sequence (1/n) Σ_{i=0}^{n−1} T^i k_0.

6.3 Hilbert space

6.3.1 Inner product spaces

Let H be a vector space over 𝕂, where 𝕂 = ℝ or 𝕂 = ℂ. An inner product on H is a map (ξ, η) ↦ ⟨ξ, η⟩ from H × H to 𝕂 so that

1. ⟨ξ, ξ⟩ ∈ (0, ∞) for all nonzero ξ ∈ H.

2. ⟨αξ + η, ζ⟩ = α⟨ξ, ζ⟩ + ⟨η, ζ⟩, for α ∈ 𝕂, ξ, η, ζ ∈ H.

3. ⟨ξ, η⟩ = $\overline{\langle \eta, \xi \rangle}$, for ξ, η ∈ H.

Observe that from the last two conditions above we also have ⟨ξ, αη + ζ⟩ = $\overline{\alpha}$⟨ξ, η⟩ + ⟨ξ, ζ⟩ for α ∈ 𝕂, and ξ, η, ζ ∈ H. Given ξ ∈ H we define ‖ξ‖ = √⟨ξ, ξ⟩.

An inner product space is a vector space together with an inner product on that space. As an example, suppose (X, µ) is a measure space, and consider L²(X, µ) = {f ∈ M(X, µ) | |f|² ∈ L¹(X, µ)} where we identify two functions which agree almost everywhere. From the inequalities

\[ |ab| \le |a|^2 + |b|^2, \qquad |a + b|^2 \le 2(|a|^2 + |b|^2), \qquad a, b \in \mathbb{C}, \]

we deduce that for f, g ∈ L²(X, µ) we have $\overline{g}f$ ∈ L¹(X, µ), and f + g ∈ L²(X, µ). Therefore, L²(X, µ) is a vector space and we obtain an inner product by setting

\[ \langle f, g \rangle = \int \overline{g} f \, d\mu. \]

Proposition 6.3.1 (The Cauchy-Schwarz inequality). Let H be an inner product space, then for all ξ, η ∈ H we have

\[ |\langle \xi, \eta \rangle| \le \|\xi\| \|\eta\|. \]

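Proof. For any λ ∈ 𝕂, ξ_0, η_0 ∈ H we have

\[ \|\xi_0\|^2 + 2\,\mathrm{Re}\,(\overline{\lambda} \langle \xi_0, \eta_0 \rangle) + |\lambda|^2 \|\eta_0\|^2 = \|\xi_0 + \lambda \eta_0\|^2 \ge 0. \]

Taking λ so that |λ| = 1 and $\overline{\lambda}\langle \xi_0, \eta_0 \rangle \le 0$ gives

\[ \|\xi_0\|^2 + \|\eta_0\|^2 \ge 2 |\langle \xi_0, \eta_0 \rangle|. \]

Setting ξ_0 = ‖η‖ξ and η_0 = ‖ξ‖η gives

\[ 2\|\xi\|^2\|\eta\|^2 \ge 2\|\xi\|\|\eta\| \, |\langle \xi, \eta \rangle|, \]

and the inequality follows (it is trivial when ‖ξ‖‖η‖ = 0).

Proposition 6.3.2. Let H be an inner product space, then the map H ∋ ξ ↦ ‖ξ‖ defines a norm on H.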

Proof. The only nontrivial thing to check is the triangle inequality. Suppose ξ, η ∈ H, then from the Cauchy-Schwarz inequality we have

\[ \|\xi + \eta\|^2 = \|\xi\|^2 + 2\,\mathrm{Re}\,\langle \xi, \eta \rangle + \|\eta\|^2 \le \|\xi\|^2 + 2\|\xi\|\|\eta\| + \|\eta\|^2 = (\|\xi\| + \|\eta\|)^2. \]
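The Cauchy-Schwarz and triangle inequalities can be checked numerically in the simplest inner product space, ℂⁿ with the standard inner product. The following is a minimal sketch under that assumption; the helper names `inner` and `norm` and the sample vectors are chosen for illustration only.

```python
import math


def inner(xi, eta):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(xi, eta))


def norm(xi):
    """Norm induced by the inner product: sqrt(<xi, xi>)."""
    return math.sqrt(inner(xi, xi).real)


xi = [1 + 2j, -1j, 3 + 0j]
eta = [2 - 1j, 4 + 0j, 1j]

# Cauchy-Schwarz: |<xi, eta>| <= ||xi|| ||eta||.
cauchy_schwarz_holds = abs(inner(xi, eta)) <= norm(xi) * norm(eta)

# Triangle inequality: ||xi + eta|| <= ||xi|| + ||eta||.
total = [a + b for a, b in zip(xi, eta)]
triangle_holds = norm(total) <= norm(xi) + norm(eta)
```

A numerical check of this kind is of course no substitute for the proofs above; it only illustrates the statements on one pair of vectors.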

Lemma 6.3.3. Let H be an inner product space, then the inner product is jointly continuous with respect to the topology induced by the norm.

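Proof. Suppose ξ_n → ξ and η_n → η, then {η_n}_{n∈ℕ} is bounded and the Cauchy-Schwarz inequality gives

\[ |\langle \xi_n, \eta_n \rangle - \langle \xi, \eta \rangle| \le |\langle \xi_n - \xi, \eta_n \rangle| + |\langle \xi, \eta_n - \eta \rangle| \le \|\xi_n - \xi\|\|\eta_n\| + \|\xi\|\|\eta_n - \eta\| \to 0. \]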

A Hilbert space is an inner product space which is complete with respect to the given norm. For example, if (X, µ) is any measure space then by Theorem 5.1.5 we have that L²(X, µ) is a Hilbert space with inner product $\langle f, g \rangle = \int \overline{g} f \, d\mu$.

Proposition 6.3.4 (The parallelogram identity). Let H be an inner product space, then for ξ, η ∈ H we have

\[ \|\xi + \eta\|^2 + \|\xi - \eta\|^2 = 2(\|\xi\|^2 + \|\eta\|^2). \]

Proof. Just add the formulas ‖ξ ± η‖² = ‖ξ‖² ± 2 Re⟨ξ, η⟩ + ‖η‖².

The next proposition shows that the norm completely determines the inner product.

Proposition 6.3.5 (The polarization identity). Let H be a complex inner product space and suppose µ is a measure on the circle 𝕋 so that µ(𝕋) = 1, and ∫ λ dµ(λ) = ∫ λ² dµ(λ) = 0. Then for ξ, η ∈ H we have

\[ \int \lambda \|\xi + \lambda \eta\|^2 \, d\mu(\lambda) = \langle \xi, \eta \rangle. \]

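Proof. We may compute directly

\[
\begin{aligned}
\int \lambda \|\xi + \lambda\eta\|^2 \, d\mu(\lambda)
&= \int \lambda \|\xi\|^2 \, d\mu(\lambda) + \int \langle \xi, \eta \rangle \, d\mu(\lambda) \\
&\quad + \int \lambda^2 \langle \eta, \xi \rangle \, d\mu(\lambda) + \int \lambda \|\eta\|^2 \, d\mu(\lambda) \\
&= \langle \xi, \eta \rangle.
\end{aligned}
\]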

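The most commonly used case of the polarization identity is when $\mu = \frac{1}{4}\left(\delta_{\{1\}} + \delta_{\{i\}} + \delta_{\{-1\}} + \delta_{\{-i\}}\right)$, in which case we obtain the formula

\[ \langle \xi, \eta \rangle = \frac{1}{4} \sum_{k=0}^{3} i^k \|\xi + i^k \eta\|^2. \]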

Another case is if we take Lebesgue measure on the interval [0, 1] and we consider the corresponding Lebesgue measure on the circle, which is the push-forward under the map t ↦ e^{2πit}.
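The four-point form of the polarization identity is easy to test numerically. The sketch below assumes H = ℂⁿ with the standard inner product (linear in the first argument, as in the text); the function names `inner`, `norm_sq` and `polarize` are hypothetical helpers introduced only for this illustration.

```python
def inner(xi, eta):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(xi, eta))


def norm_sq(xi):
    """Squared norm ||xi||^2 = <xi, xi>."""
    return inner(xi, xi).real


def polarize(xi, eta):
    """Recover <xi, eta> from norms via (1/4) * sum_k i^k ||xi + i^k eta||^2."""
    total = 0j
    for k in range(4):
        lam = 1j ** k  # the four points 1, i, -1, -i on the circle
        shifted = [a + lam * b for a, b in zip(xi, eta)]
        total += lam * norm_sq(shifted)
    return total / 4


xi = [1 + 2j, -1j, 3 + 0j]
eta = [2 - 1j, 4 + 0j, 1j]
recovered = polarize(xi, eta)  # computed from norms only
direct = inner(xi, eta)        # computed from the inner product
```

Here `recovered` and `direct` should agree up to floating point rounding, which is the content of the identity for this choice of µ.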

Corollary 6.3.6. Let H and K be two complex inner product spaces. A linear map U : H → K is isometric if and only if ⟨Uξ, Uη⟩ = ⟨ξ, η⟩ for all ξ, η ∈ H.

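Proof. If U : H → K is isometric, and ξ, η ∈ H, then by the polarization identity we have

\[ \langle U\xi, U\eta \rangle = \frac{1}{4} \sum_{k=0}^{3} i^k \|U(\xi + i^k \eta)\|^2 = \frac{1}{4} \sum_{k=0}^{3} i^k \|\xi + i^k \eta\|^2 = \langle \xi, \eta \rangle. \]

The converse follows by taking η = ξ.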

The previous result is also valid for real inner product spaces, and we leave it as an exercise.

6.3.2 Orthogonal subspaces and the Riesz representation theorem

Given an inner product space H, two vectors ξ, η ∈ H are orthogonal if ⟨ξ, η⟩ = 0. A set {ξ_i}_{i∈I} is orthogonal if ⟨ξ_i, ξ_j⟩ = 0 for i ≠ j, and {ξ_i}_{i∈I} is orthonormal if it is orthogonal and we also have ‖ξ_i‖ = 1 for all i ∈ I. If A ⊂ H we set

\[ A^\perp = \{ \xi \in H \mid \langle \xi, \eta \rangle = 0 \text{ for all } \eta \in A \}. \]

Theorem 6.3.7. Let H be a Hilbert space, K ⊂ H a nonempty closed convex subset, and η_0 ∈ H. Then there exists a unique element ξ_0 ∈ K with minimal distance to η_0.

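Proof. By considering K̃ = K − η_0 it suffices to consider the case when η_0 = 0. Set d = inf{‖ξ‖ | ξ ∈ K}, and choose a sequence ξ_n ∈ K such that ‖ξ_n‖ → d. Since K is convex we have ½ξ_n + ½ξ_m ∈ K, and so for n, m ∈ ℕ we have

\[ d^2 \le \left\| \tfrac{1}{2}\xi_n + \tfrac{1}{2}\xi_m \right\|^2 = \tfrac{1}{4}\|\xi_n\|^2 + \tfrac{1}{2}\,\mathrm{Re}\,\langle \xi_n, \xi_m \rangle + \tfrac{1}{4}\|\xi_m\|^2. \]

Hence, lim_{n,m→∞} Re⟨ξ_n, ξ_m⟩ = d². Therefore,

\[ \lim_{n,m\to\infty} \|\xi_n - \xi_m\|^2 = \lim_{n,m\to\infty} \left( \|\xi_n\|^2 - 2\,\mathrm{Re}\,\langle \xi_n, \xi_m \rangle + \|\xi_m\|^2 \right) = d^2 - 2d^2 + d^2 = 0. \]

Hence {ξ_n}_{n∈ℕ} is Cauchy and, since K is closed, converges to a vector ξ_0 ∈ K, which satisfies ‖ξ_0‖ = d. Since the sequence {ξ_n}_{n∈ℕ} was chosen arbitrarily it follows that ξ_0 must be unique.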
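For a concrete closed convex set the nearest point can often be written down explicitly. The sketch below takes K to be the closed unit ball of ℝ² (an illustrative choice, not from the text), where the nearest point to η_0 is η_0 itself if ‖η_0‖ ≤ 1 and η_0/‖η_0‖ otherwise, and then spot-checks minimality against randomly sampled points of the ball. The function names are hypothetical.

```python
import math
import random


def norm(v):
    """Euclidean norm on R^n."""
    return math.sqrt(sum(a * a for a in v))


def distance(u, v):
    return norm([a - b for a, b in zip(u, v)])


def nearest_in_unit_ball(eta0):
    """Closest point to eta0 in the closed unit ball of R^n."""
    r = norm(eta0)
    if r <= 1.0:
        return list(eta0)          # eta0 already lies in the ball
    return [a / r for a in eta0]   # otherwise rescale onto the unit sphere


eta0 = [3.0, 4.0]
xi0 = nearest_in_unit_ball(eta0)
best = distance(xi0, eta0)

# Spot check: no randomly sampled point of the ball is closer to eta0.
rng = random.Random(0)
no_closer_point = True
for _ in range(1000):
    candidate = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    if norm(candidate) <= 1.0 and distance(candidate, eta0) < best - 1e-12:
        no_closer_point = False
```

The random sampling is only a sanity check; the theorem itself guarantees both existence and uniqueness of the minimizer.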

Let H be a Hilbert space and suppose K ⊂ H is a closed subspace. If ξ ∈ H we let P_K(ξ) denote the unique vector in K with minimal distance to ξ and we call the map P_K the orthogonal projection from H onto K.

Proposition 6.3.8. Let H be a Hilbert space, K ⊂ H a closed subspace and fix ξ ∈ H. Then P_K(ξ) is the unique vector in K such that ξ − P_K(ξ) ∈ K⊥.

Proof. As P_K(ξ) minimizes the distance to ξ it follows that for all a > 0 and η ∈ K we have

\[ \|\xi - P_K(\xi)\|^2 \le \|\xi - P_K(\xi) - a\eta\|^2 = \|\xi - P_K(\xi)\|^2 - 2a\,\mathrm{Re}\,\langle \xi - P_K(\xi), \eta \rangle + a^2\|\eta\|^2. \]

Rearranging and dividing by a then gives

\[ 2\,\mathrm{Re}\,\langle \xi - P_K(\xi), \eta \rangle \le a\|\eta\|^2. \]

As a > 0 was arbitrary this then shows that Re⟨ξ − P_K(ξ), η⟩ ≤ 0, and replacing η with i^k η for k = 0, 1, 2, 3 then shows that ⟨ξ − P_K(ξ), η⟩ = 0. Therefore ξ − P_K(ξ) ∈ K⊥.

Conversely, suppose that η ∈ K is such that ξ − η ∈ K⊥. Then for ζ ∈ K we have ⟨ξ − η, ζ⟩ = 0, hence

\[ \|\xi - \eta - \zeta\|^2 = \|\xi - \eta\|^2 + \|\zeta\|^2 \ge \|\xi - \eta\|^2. \]

Since every element of K has the form η + ζ with ζ ∈ K, this shows that η has minimal distance to ξ, and it then follows that η = P_K(ξ).
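The characterization of P_K(ξ) by orthogonality of the residual can be illustrated in ℝ³. The sketch below assumes K is the span of finitely many vectors, builds an orthonormal basis of K by Gram-Schmidt, and uses the formula P_K(ξ) = Σ ⟨ξ, e_i⟩ e_i, which is derived later in the text in the proof of Bessel's inequality. The function names and sample vectors are illustrative assumptions.

```python
import math


def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))


def orthonormalize(vectors):
    """Gram-Schmidt: return an orthonormal basis of the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = inner(w, e)
            w = [a - c * b for a, b in zip(w, e)]
        length = math.sqrt(inner(w, w))
        if length > 1e-12:  # skip vectors already in the span
            basis.append([a / length for a in w])
    return basis


def project(xi, spanning_vectors):
    """Orthogonal projection of xi onto the span of `spanning_vectors`."""
    result = [0.0] * len(xi)
    for e in orthonormalize(spanning_vectors):
        c = inner(xi, e)
        result = [r + c * b for r, b in zip(result, e)]
    return result


K_spanning = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
xi = [1.0, 2.0, 3.0]
p = project(xi, K_spanning)
residual = [a - b for a, b in zip(xi, p)]  # should be orthogonal to K
```

For these vectors the residual is a multiple of (1, −1, 1), the normal to the plane K, so its inner product with each spanning vector vanishes up to rounding.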

Corollary 6.3.9. Let H be a Hilbert space, K ⊂ H a closed subspace, then P_K is a linear map and ‖P_K‖ ≤ 1.

Proof. If ξ, η ∈ H, α ∈ ℂ and ζ ∈ K then by the previous proposition we have

\[ \langle \alpha\xi + \eta - \alpha P_K(\xi) - P_K(\eta), \zeta \rangle = \alpha \langle \xi - P_K(\xi), \zeta \rangle + \langle \eta - P_K(\eta), \zeta \rangle = 0. \]

Hence, again by the previous proposition we have αP_K(ξ) + P_K(η) = P_K(αξ + η), which shows that P_K is linear.

Also, since P_K(ξ) and ξ − P_K(ξ) are orthogonal we have ‖P_K(ξ)‖² ≤ ‖P_K(ξ)‖² + ‖ξ − P_K(ξ)‖² = ‖ξ‖² so that P_K is a contraction.

Theorem 6.3.10 (Riesz representation theorem). Let H be a Hilbert space, and for each η ∈ H consider the map Ξ_η ∈ H∗ given by Ξ_η(ξ) = ⟨ξ, η⟩. Then Ξ : H → H∗ gives an isometric anti-linear surjection.

Proof. Clearly Ξ is anti-linear. By the Cauchy-Schwarz inequality we have |Ξ_η(ξ)| = |⟨ξ, η⟩| ≤ ‖ξ‖‖η‖ which shows that Ξ is a contraction. Moreover Ξ_η(η) = ‖η‖² which then shows that Ξ is isometric.

Suppose now that we have ϕ ∈ H∗, and assume that ϕ ≠ 0. Then ker(ϕ) is a proper closed subspace. Take η ∈ H so that ϕ(η) = 1 and by replacing η with η − P_{ker(ϕ)}(η) we assume that η ∈ ker(ϕ)⊥. If ξ ∈ H, then ξ − ϕ(ξ)η ∈ ker(ϕ) and hence is orthogonal to η. Thus,

\[ 0 = \langle \xi - \varphi(\xi)\eta, \eta \rangle = \langle \xi, \eta \rangle - \varphi(\xi)\|\eta\|^2. \]

Thus, ϕ(ξ) = ⟨ξ, η‖η‖⁻²⟩ for all ξ ∈ H which shows that Ξ is surjective.
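In ℂⁿ the Riesz representation theorem becomes completely explicit, which makes it a convenient test case. The sketch below assumes a functional of the form ϕ(ξ) = Σ c_i ξ_i, for which the representing vector is the entrywise conjugate of the coefficients; the names `functional` and `riesz_vector` are hypothetical helpers, and this shortcut is not the kernel-based construction used in the proof above.

```python
def inner(xi, eta):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(xi, eta))


def functional(coeffs):
    """The linear functional phi(xi) = sum_i coeffs[i] * xi[i] on C^n."""
    return lambda xi: sum(c * a for c, a in zip(coeffs, xi))


def riesz_vector(coeffs):
    """Vector eta with phi(xi) = <xi, eta>: the entrywise conjugate of coeffs."""
    return [c.conjugate() for c in coeffs]


coeffs = [2 + 1j, -3j, 1 + 0j]
phi = functional(coeffs)
eta = riesz_vector(coeffs)

xi = [1 - 1j, 2 + 0j, 4j]
value_from_phi = phi(xi)           # phi evaluated directly
value_from_eta = inner(xi, eta)    # phi evaluated as <xi, eta>
norm_attained = phi(eta)           # equals ||eta||^2, so ||phi|| = ||eta||
```

The anti-linearity of Ξ shows up here as the conjugation in `riesz_vector`: scaling ϕ by a scalar c scales η by the conjugate of c.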

6.3.3 Orthonormal bases and dimension

Note that if {ξ_1, . . . , ξ_n} is a finite orthonormal set then expanding the inner product gives

\[ \left\| \sum_{i=1}^n \alpha_i \xi_i \right\|^2 = \sum_{i=1}^n |\alpha_i|^2. \]

We will use this equality throughout this section.

Proposition 6.3.11 (Bessel's inequality). Let H be a Hilbert space and suppose {ξ_i}_{i∈I} is an orthonormal set, then for any η ∈ H we have

\[ \sum_{i \in I} |\langle \eta, \xi_i \rangle|^2 \le \|\eta\|^2. \]

In particular, {i ∈ I | ⟨η, ξ_i⟩ ≠ 0} is countable.

Proof. If {ξ_1, . . . , ξ_n} is an orthonormal set and K denotes its span, then for η ∈ H set η_0 = Σ_{i=1}^n ⟨η, ξ_i⟩ξ_i. For 1 ≤ j ≤ n we then have

\[ \langle \eta - \eta_0, \xi_j \rangle = \langle \eta, \xi_j \rangle - \sum_{i=1}^n \langle \eta, \xi_i \rangle \langle \xi_i, \xi_j \rangle = \langle \eta, \xi_j \rangle - \langle \eta, \xi_j \rangle = 0. \]

Thus, by Proposition 6.3.8 we have P_K(η) = η_0 = Σ_{i=1}^n ⟨η, ξ_i⟩ξ_i. In particular, we have Σ_{i=1}^n |⟨η, ξ_i⟩|² = ‖P_K(η)‖² ≤ ‖η‖². Thus, Bessel's inequality holds for finite sets and the general case then follows easily.
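Bessel's inequality can be strict when the orthonormal set is not a basis. The following sketch illustrates this in ℝ³ with an orthonormal pair spanning the plane of the first two coordinates; the vectors and variable names are illustrative assumptions.

```python
import math


def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))


s = 1 / math.sqrt(2)
# An orthonormal set in R^3 which is not an orthonormal basis.
orthonormal_set = [[s, s, 0.0], [s, -s, 0.0]]
eta = [1.0, 2.0, 3.0]

coefficients = [inner(eta, e) for e in orthonormal_set]
bessel_sum = sum(c * c for c in coefficients)  # sum of |<eta, xi_i>|^2
norm_sq = inner(eta, eta)                      # ||eta||^2
```

In this example the gap between `norm_sq` and `bessel_sum` is the squared third coordinate of `eta`, that is, the squared distance from η to the span of the orthonormal set.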

If H is a Hilbert space then an orthonormal set {ξ_i}_{i∈I} is an orthonormal basis if 0 is the only vector which is orthogonal to every ξ_i. For example, if X is a set then the family of Dirac functions {δ_x}_{x∈X} forms an orthonormal basis of ℓ²(X).

Proposition 6.3.12 (Parseval's identity). Suppose H is a Hilbert space with orthonormal basis {ξ_i}_{i∈I}, then for η ∈ H we have

\[ \|\eta\|^2 = \sum_{i \in I} |\langle \eta, \xi_i \rangle|^2, \]

and η = Σ_{i∈I} ⟨η, ξ_i⟩ξ_i, where the sum converges absolutely in H.

Proof. Given η ∈ H set η_0 = Σ_{i∈I} ⟨η, ξ_i⟩ξ_i and note that this sum converges absolutely in H by Bessel's inequality. For j ∈ I we have ⟨η − η_0, ξ_j⟩ = ⟨η, ξ_j⟩ − ⟨η, ξ_j⟩ = 0. Since {ξ_i}_{i∈I} is an orthonormal basis we then have η = η_0 = Σ_{i∈I} ⟨η, ξ_i⟩ξ_i. By approximating η by finite sums it then follows that

\[ \|\eta\|^2 = \sum_{i \in I} |\langle \eta, \xi_i \rangle|^2. \]
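Parseval's identity and the expansion η = Σ ⟨η, ξ_i⟩ξ_i can be checked on a finite-dimensional example. The sketch below uses the discrete Fourier basis of ℂ⁴, which is one standard orthonormal basis; the choice of basis, the function name `fourier_basis` and the sample vector are assumptions made for illustration.

```python
import cmath
import math


def inner(xi, eta):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(xi, eta))


def fourier_basis(n):
    """Orthonormal basis of C^n with entries exp(2*pi*i*j*k/n) / sqrt(n)."""
    return [
        [cmath.exp(2j * math.pi * j * k / n) / math.sqrt(n) for j in range(n)]
        for k in range(n)
    ]


n = 4
basis = fourier_basis(n)
eta = [1 + 0j, 2j, -1 + 1j, 3 + 0j]

coefficients = [inner(eta, e) for e in basis]
parseval_sum = sum(abs(c) ** 2 for c in coefficients)
norm_sq = inner(eta, eta).real

# Reconstruct eta = sum_k <eta, xi_k> xi_k from its coefficients.
reconstructed = [
    sum(c * e[j] for c, e in zip(coefficients, basis)) for j in range(n)
]
```

Up to rounding, `parseval_sum` equals `norm_sq` and `reconstructed` equals `eta`, matching the two conclusions of the proposition.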

Theorem 6.3.13. Every Hilbert space has an orthonormal basis. Moreover, any two orthonormal bases have the same cardinality.

Proof. Let H be a Hilbert space. By Zorn's lemma it follows easily that H has a maximal (with respect to inclusion) orthonormal set {ξ_i}_{i∈I}. Suppose η ∈ H is such that ⟨η, ξ_i⟩ = 0 for all i ∈ I. If η ≠ 0 then the set {η‖η‖⁻¹} ∪ {ξ_i}_{i∈I} would be an orthonormal set which is strictly larger and hence would contradict maximality. Thus, we must have η = 0 and hence {ξ_i}_{i∈I} is an orthonormal basis.

To see that any two bases have the same cardinality we consider separately the finite and infinite cases. In the finite case an orthonormal basis is also an algebraic basis and this is then a standard fact from abstract linear algebra which we will not present here. Suppose therefore that {ξ_i}_{i∈I} and {η_j}_{j∈J} are two infinite orthonormal bases. By Bessel's inequality, for each i ∈ I the set of j's such that ⟨ξ_i, η_j⟩ ≠ 0 is a non-empty countable set. Since every η_j is not orthogonal to some ξ_i it follows that there exists a surjective map from I × ℕ onto J. Since I is infinite we have |I| = |I × ℕ| ≥ |J|. By symmetry we also have that |J| ≥ |I| and hence |I| = |J|.

If H is a Hilbert space then the dimension of H is the cardinality of any orthonormal basis, and is denoted by dim H. If H and K are Hilbert spaces then a unitary operator is a surjective linear isometry U : H → K.

Theorem 6.3.14. If H is a Hilbert space, and X is a set then dim H = |X| if and only if there exists a unitary operator U : ℓ²(X) → H.

Proof. If U : ℓ²(X) → H is a unitary operator then {Uδ_x}_{x∈X} gives an orthonormal basis with cardinality |X|. Conversely, if H has an orthonormal basis {ξ_i}_{i∈I} with cardinality |I| = |X|, then there exists a bijection θ : X → I. We define an operator U : ℓ²(X) → H by setting U(f) = Σ_{x∈X} f(x)ξ_{θ(x)}. It is then an easy calculation to see that U is a unitary operator.

6.3.4 Exercises

Exercise 6.3.15. If we consider ℕ with the counting measure, then ℓ²(ℕ) is a Hilbert space.

Exercise 6.3.16. Let X be a real normed space with norm ‖·‖. Then ‖·‖ comes from an inner product if and only if the parallelogram identity ‖ξ + η‖² + ‖ξ − η‖² = 2(‖ξ‖² + ‖η‖²) holds.

Exercise 6.3.17. Let H and K be two real inner product spaces. A linear map U : H → K is isometric if and only if ⟨Uξ, Uη⟩ = ⟨ξ, η⟩ for all ξ, η ∈ H.

Exercise 6.3.18 (The Banach-Saks theorem). Let H be a Hilbert space and {ξ_n}_{n∈ℕ} ⊂ H a uniformly bounded sequence, then there exists a subsequence {ξ_{n_k}}_k so that the Cesàro means (1/K) Σ_{k=1}^K ξ_{n_k} converge in H. Hint: Using the Banach-Alaoglu theorem you may assume that ξ_n has a weak limit.

Exercise 6.3.19. Let H be an inner product space, and A ⊂ H, then A⊥ is a closed subspace.

Exercise 6.3.20. If H is a Hilbert space and A ⊂ H is a subspace then (A⊥)⊥ = $\overline{A}$, the closure of A. This does not hold for general inner product spaces.

Exercise 6.3.21. Let H and K be Hilbert spaces and suppose T : H → K is a bounded linear operator, then there exists a unique bounded linear operator T∗ : K → H so that for all ξ ∈ H and η ∈ K we have

\[ \langle T\xi, \eta \rangle = \langle \xi, T^*\eta \rangle. \]

The operator T ∗ : K → H is called the adjoint of the operator T .

Exercise 6.3.22. Let H and K be Hilbert spaces.

1. A bounded linear operator P ∈ B(H) is an orthogonal projection operator if and only if P = P∗ and P² = P.

2. A bounded operator U ∈ B(H, K) is isometric if and only if U∗U = id.

Exercise 6.3.23. Let H and K be Hilbert spaces, and suppose T ∈ B(H, K). Then ker(T) = Range(T∗)⊥.

A linear operator V ∈ B(H, K) is a partial isometry if V∗V is an orthogonal projection.

Exercise 6.3.24. Suppose V ∈ B(H, K) is a partial isometry.

1. V∗V is the orthogonal projection onto ker(V)⊥.

2. Range(V) is closed and VV∗ is the orthogonal projection onto Range(V). In particular, V∗ is also a partial isometry.

Exercise 6.3.25. Let M be a closed subspace of L²([0, 1], λ) such that M is contained in C([0, 1]).

1. There exists C > 0 such that ‖f‖_∞ ≤ C‖f‖_{L²} for all f ∈ M.

2. For each x ∈ [0, 1] there exists g_x ∈ M so that f(x) = ⟨f, g_x⟩ for all f ∈ M. Moreover, ‖g_x‖_{L²} ≤ C.

3. dim M ≤ C². Hint: If {f_i}_i is an orthonormal sequence in M then Σ_i |f_i(x)|² ≤ C² for all x ∈ [0, 1].

Exercise 6.3.26 (Von Neumann's mean ergodic theorem). Let U be a unitary operator on a Hilbert space H, set K = {ξ ∈ H | Uξ = ξ}, and let P denote the orthogonal projection onto K. If S_n = (1/n) Σ_{k=0}^{n−1} U^k then for all ξ ∈ H we have S_nξ → Pξ. Hint: Use Exercise 6.3.23 applied to the operator 1 − U.

Exercise 6.3.27. Let (X, µ) be a finite measure space and fix f ∈ L^∞(X, µ). Set a_n = ∫ |f|^n dµ.

1. The sequence a_{n+1}/a_n is non-decreasing as n increases. Hint: Apply the Cauchy-Schwarz inequality for ξ = |f|^{n/2} and η = |f|^{(n+2)/2}.

2. The sequence a_{n+1}/a_n converges to ‖f‖_∞. Hint: Show that the series Σ_{n=1}^∞ b^n |f|^n converges absolutely in L¹(X, µ) if b < ‖f‖_∞⁻¹, and diverges in L¹(X, µ) if b > ‖f‖_∞⁻¹.
