Universitext

Series Editors

Sheldon Axler, Department of Mathematics, San Francisco State University, San Francisco, California, USA
Vincenzo Capasso, Dipartimento di Matematica, Università degli Studi di Milano, Milano, Italy
Carles Casacuberta, Depto. Àlgebra i Geometria, Universitat de Barcelona, Barcelona, Spain
Angus Mcintyre, Queen Mary University of London, London
Kenneth Ribet, Department of Mathematics, University of California, Berkeley, California, USA
Claude Sabbah, CNRS, Ecole polytechnique, Centre de mathématiques, Palaiseau, France
Endre Süli, Worcester College, University of Oxford, Oxford, United Kingdom
Wojbor A. Woyczynski, Department of Mathematics, Case Western Reserve University, Cleveland, Ohio, USA

Universitext is a series of textbooks that presents material from a wide variety of mathematical disciplines at master’s level and beyond. The books, often well class-tested by their author, may have an informal, personal, even experimental approach to their subject matter. Some of the most successful and established books in the series have evolved through several editions, always following the evolution of teaching curricula, to very polished texts. Thus as research topics trickle down into graduate-level teaching, first textbooks written for new, cutting-edge courses may make their way into Universitext.

More information about this series at http://www.springer.com/series/223

Adam Bowers • Nigel J. Kalton

An Introductory Course in Functional Analysis

Adam Bowers
Department of Mathematics
University of California, San Diego
La Jolla, CA, USA

Nigel J. Kalton (deceased)
Department of Mathematics
University of Missouri, Columbia
Columbia, MO, USA

ISSN 0172-5939
ISSN 2191-6675 (electronic)
Universitext
ISBN 978-1-4939-1944-4
ISBN 978-1-4939-1945-1 (eBook)
DOI 10.1007/978-1-4939-1945-1

Library of Congress Control Number: 2014955345

Springer New York Heidelberg Dordrecht London © Springer Science+Business Media, LLC 2014 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

To the memory of Nigel J. Kalton (1946–2010)

Foreword

Mathematicians are peculiar people who spend their life struggling to understand the great book of Mathematics, and find it rewarding to master a few pages of its daunting and ciphered chapters. But this book was wide open in front of Nigel Kalton, who could browse through it with no apparent effort, and share with his colleagues and students his enlightening vision. That book is now closed and we are left with the grief, and the duty to follow Nigel’s example and to keep working, no matter what.

Fortunately, his mathematical legacy is accessible, which includes the last graduate course he taught in Columbia during the Academic Year 2009–2010. Adam Bowers, a post-doctoral student during that year, gathered very careful notes of all classes and agreed with Nigel that these notes would eventually result in a textbook. Fate had in store that he completed this work alone.

Adam Bowers decided that this tribute to Nigel should be co-authored by the master himself. All functional analysts should be grateful to Adam for his kind endeavour, and for the splendid textbook he provides. Indeed this book is a smooth and well-balanced introduction to functional analysis, constantly motivated by applications which make clear not only how but why the field developed. It will therefore be a perfect base for teaching a one-semester (or two) graduate course in functional analysis.

A cascade falling from so high is a powerful force, and a beautiful sight. Please open this book, and enjoy.

Paris, France Gilles Godefroy January 2014

Preface

During the Spring Semester of 2010, Nigel Kalton taught what would be his final course. At the time, I was a postdoctoral fellow at the University of Missouri-Columbia and Nigel was my mentor. I sat in on the course, which was an introduction to functional analysis, because I simply enjoyed watching him lecture. No matter how well one knew a subject, Nigel Kalton could always show something new, and watching him present a subject he loved was a joy in itself.

Over the course of the semester, I took notes diligently. It had occurred to me that someday, if I had the good fortune to teach a functional analysis course of my own, Nigel Kalton’s notes would make the best foundation. When I happened to mention to Nigel what I was doing, he suggested we turn the notes into a textbook.

Sadly, Nigel was unexpectedly taken from us before the text was complete. Without him, this work, and indeed mathematics itself, suffers from a terrible loss.

About this book

This book, as the title suggests, is meant as an introduction to the topic of functional analysis. It is not meant to function as a reference book, but rather as a first glimpse at a vast and ever-deepening subject. The material is meant to be covered from beginning to end, and should fit comfortably into a one-semester course. The text is essentially self-contained, and all of the relevant theory is provided, usually as needed. In the cases when a complete treatment would be more of a distraction than a help, the necessary information has been moved to an appendix.

The book is designed so that a graduate student with a minimal amount of advanced mathematics can follow the course. While some experience with measure theory and complex analysis is expected, one need not be an expert, and all of the advanced theory used throughout the text can be found in an appendix.

The current text seeks to give an introduction to functional analysis that will not overwhelm the beginner. As such, we begin with a discussion of normed spaces and define a Banach space. The additional structure in a Banach space simplifies many proofs, and allows us to work in a setting which is more intuitive than is necessary for the development of the theory. Consequently, we have sacrificed some generality for the sake of the reader’s comfort, and (hopefully) understanding. In Chap. 2, we meet the key examples of Banach spaces, examples which will appear again and again throughout the text. In Chap. 3, we introduce the celebrated Hahn–Banach Theorem and explore its many consequences.

Banach spaces enjoy many interesting properties as a result of having a complete norm. In Chap. 4, we investigate some of the consequences of completeness, including the Baire Category Theorem, the Open Mapping Theorem, and the Closed Graph Theorem.

In Chap. 5, we relax our requirements and consider a broader class of objects known as locally convex spaces. While these spaces will lack some of the advantages of Banach spaces, considerable and interesting things can and will be said about them. After a general discussion of topological preliminaries, we consider topics such as Haar measure and extreme points, and see how the Hahn–Banach Theorem appears in this context.

The origins of functional analysis lie in attempts to solve differential equations using the ideas of linear algebra. We will glimpse these ideas in Chap. 6, where we first meet compact operators. We will continue our discussion of compact operators in Chap. 7, where we see an example of how techniques from functional analysis can be used to solve a system of differential equations, and we will encounter results which allow us to do unexpected things, such as sum the series $\sum_{n=1}^{\infty} \frac{1}{n^4}$.

We conclude the course in Chap. 8, with a discussion of Banach algebras. We will meet the spectrum of an operator and see how it relates to the seemingly unrelated concept of maximal ideals of an algebra. As a final flourish, we will prove the Wiener Inversion Theorem, which provides a nontrivial result about Fourier series.

At the end of each chapter, the reader will find a collection of exercises. Many of the exercises are directly related to topics in the chapter and are meant to complement the discussion in the textbook, but some introduce new concepts and ideas and are meant to expose the reader to a broader selection of topics. The exercises come in varying degrees of difficulty. Some are very straightforward, but some are quite challenging.

It is hoped that the reader will find the material intriguing and seek to learn more. The inquisitive mind would do well with the classic text Functional Analysis by Walter Rudin [34], which covers the material of this text, and more. For further study, the reader might wish to peruse A Course in Operator Theory by John B. Conway [8] or (moving in another direction) Topics in Banach Space Theory by Albiac and Kalton [2].

About Nigel Kalton

Nigel Kalton was born on 20 June 1946 in Bromley, England. He studied mathematics at Trinity College Cambridge, where he took his Ph.D. in 1970. His thesis was awarded the Rayleigh Prize for research excellence. He held positions at Lehigh University, Warwick University, University College of Swansea, the University of Illinois, and Michigan State University before taking a permanent position in the mathematics department at the University of Missouri in 1979.

In 1984, Kalton was appointed the Luther Marion Defoe Distinguished Professor of Mathematics and then he was appointed to the Houchins Chair of Mathematics in 1985. In 1995, he was appointed a Curators’ Professor, the highest recognition bestowed by the University of Missouri.

Among the many honors Nigel Kalton received, he was awarded the Chancellor’s Award for outstanding research (at the University of Missouri) in 1984, the Weldon Springs Presidential Award for outstanding research (at the University of Missouri) in 1987, and the Banach Medal from the Polish Academy of Sciences in 2005 (the highest honor in his field). During his career, he wrote over 270 articles and books, mentored 14 Ph.D. students, served on many editorial boards, and inspired countless mathematicians.

For more information about Nigel Kalton, please visit the Nigel Kalton Memorial Website developed by Fritz Gesztesy and hosted by the University of Missouri:

http://kaltonmemorial.missouri.edu/

A very nice tribute to Nigel Kalton appeared in the Notices of the American Mathematical Society with contributions from Peter Casazza, Joe Diestel, Gilles Godefroy, Aleksander Pełczyński, and Roman Vershynin:

A Tribute to Nigel J. Kalton (1946–2010), Peter G. Casazza, Coordinating Editor, Notices of the AMS, Vol. 59, No. 7 (2012), pp. 942–951.

The article (which is [6] in the references) is a good starting point to learn about the life and work of Nigel Kalton.

Acknowledgements

I owe much to those who have helped me during the creation of this text, including friends, family, the staff at Springer, and the anonymous reviewers. In particular, I offer my sincere thanks to Greg Piepmeyer, who thoroughly read the first draft and provided invaluable suggestions and corrections. I also wish to thank Minerva Catral, Nadia Gal, Simon Cowell, Brian Tuomanen, and Daniel Fresen for many helpful comments. I am very grateful to Professor for sharing his knowledge of the Approximation Problem and providing insight into a source of great confusion. I am indebted to Jennifer Kalton and Gilles Godefroy for their kind support, without which this book would not exist, and I am deeply grateful to Professor Godefroy for all of his advice and generosity. Of course, I owe my deepest gratitude to Professor Nigel Kalton, who was both an inspiration and mentor to me. Since I cannot show him the gratitude and admiration I feel, I offer this text as my humble tribute to his memory. I assure the reader that if there are any errors in this book, they belong to me.

La Jolla, California Adam Bowers January 2014

Contents

1 Introduction
  Exercises

2 Classical Banach Spaces and Their Duals
  2.1 Sequence Spaces
  2.2 Function Spaces
  2.3 Completeness in Function Spaces
  Exercises

3 The Hahn–Banach Theorems
  3.1 The Axiom of Choice
  3.2 Sublinear Functionals and the Extension Theorem
  3.3 Banach Limits
  3.4 Haar Measure for Compact Abelian Groups
  3.5 Duals, Biduals, and More
  3.6 The Adjoint of an Operator
  3.7 New Banach Spaces From Old
  3.8 Duals of Quotients and Subspaces
  Exercises

4 Consequences of Completeness
  4.1 The Baire Category Theorem
  4.2 Applications of Category
  4.3 The Open Mapping and Closed Graph Theorems
  4.4 Applications of the Open Mapping Theorem
  Exercises

5 Consequences of Convexity
  5.1 General Topology
  5.2 Topological Vector Spaces
  5.3 Some Metrizable Examples
  5.4 The Geometric Hahn–Banach Theorem
  5.5 Goldstine’s Theorem
  5.6 Mazur’s Theorem
  5.7 Extreme Points
  5.8 Milman’s Theorem
  5.9 Haar Measure on Compact Groups
  5.10 The Banach–Stone Theorem
  Exercises

6 Compact Operators and Fredholm Theory
  6.1 Compact Operators
  6.2 A Rank-Nullity Theorem for Compact Operators
  Exercises

7 Hilbert Space Theory
  7.1 Basics of Hilbert Spaces
  7.2 Operators on Hilbert Space
  7.3 Hilbert–Schmidt Operators
  7.4 Sturm–Liouville Systems
  Exercises

8 Banach Algebras
  8.1 The Spectral Radius
  8.2 Commutative Algebras
  8.3 The Wiener Algebra
  Exercises

Appendix A Basics of Measure Theory

Appendix B Results From Other Areas of Mathematics

References

Index

Chapter 1 Introduction

Functional analysis is at its foundation the study of infinite-dimensional vector spaces. The goal of functional analysis is to generalize the well-known and very successful results of linear algebra on finite-dimensional vector spaces to the more complicated and subtle infinite-dimensional spaces. Of course, in infinite dimensions, certain issues (such as summability) become much more delicate, and accordingly additional structure is imposed. Perhaps the most natural structure imposed upon a vector space is that of a norm.

Definition 1.1 A normed space is a real or complex vector space $X$ together with a real-valued function $x \mapsto \|x\|$ defined for all $x \in X$, called a norm, such that
(N1) $\|x\| \geq 0$ for all $x \in X$, and $\|x\| = 0$ if and only if $x = 0$,
(N2) $\|x + y\| \leq \|x\| + \|y\|$ for all $x$ and $y$ in $X$, and
(N3) $\|\lambda x\| = |\lambda| \, \|x\|$ for all $x \in X$ and scalars $\lambda$.

Property (N1) is known as non-negativity or positive-definiteness, Property (N2) is called subadditivity or the triangle inequality, and Property (N3) is called homogeneity.

A norm on a vector space $X$ is essentially a way of measuring the size of an element in $X$. If $x$ is in $\mathbb{R}$ (or $\mathbb{C}$), the norm of $x$ is given by the absolute value (or modulus) of $x$, which is denoted (in either case) by $|x|$.

Definition 1.2 A metric space is a set $X$ together with a map $d : X \times X \to \mathbb{R}$ that satisfies the following properties:
(M1) $d(x, y) \geq 0$ for all $x$ and $y$ in $X$, and $d(x, y) = 0$ if and only if $x = y$,
(M2) $d(x, y) \leq d(x, z) + d(z, y)$ for all $x$, $y$, and $z$ in $X$, and
(M3) $d(x, y) = d(y, x)$ for all $x$ and $y$ in $X$.
The function $d$ is said to be a metric on $X$.

As was the case with a norm, (M1) is known as non-negativity and (M2) is called the triangle inequality. The final property, (M3), is called symmetry.

A metric is a measure of distance on the set $X$. Any normed space is metrizable; that is, we can always introduce a metric on a normed space by
\[ d(x, y) = \|x - y\|, \quad (x, y) \in X \times X. \]
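As a minimal numerical sketch of Definitions 1.1 and 1.2, assume $X = \mathbb{R}^3$ and take three standard norms on it. The sample vectors and all helper names below are illustrative assumptions. The code spot-checks (M1) through (M3) for the metric induced by each norm, together with (N3), on a few vectors; such a check illustrates the axioms but proves nothing.

```python
import math

def norm_1(v):
    """Sum of absolute values of the coordinates."""
    return sum(abs(t) for t in v)

def norm_2(v):
    """Euclidean norm."""
    return math.sqrt(sum(t * t for t in v))

def norm_sup(v):
    """Largest absolute value of a coordinate."""
    return max(abs(t) for t in v)

def induced_metric(norm):
    """Return the function d(x, y) = ||x - y|| for the given norm."""
    def d(x, y):
        return norm([a - b for a, b in zip(x, y)])
    return d

# Hypothetical sample points in R^3.
x, y, z = [1.0, -2.0, 3.0], [0.5, 4.0, -1.0], [-3.0, 0.0, 2.0]
tol = 1e-12  # slack for floating-point rounding

for norm in (norm_1, norm_2, norm_sup):
    d = induced_metric(norm)
    assert d(x, x) == 0 and d(x, y) > 0            # (M1) on these samples
    assert d(x, y) <= d(x, z) + d(z, y) + tol      # (M2), triangle inequality
    assert d(x, y) == d(y, x)                      # (M3), symmetry
    scaled = [-2.5 * t for t in x]
    assert abs(norm(scaled) - 2.5 * norm(x)) <= tol  # (N3), homogeneity
```

The same vector space carries several different norms, and each one induces its own metric through the formula $d(x, y) = \|x - y\|$.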

© Springer Science+Business Media, LLC 2014
A. Bowers, N. J. Kalton, An Introductory Course in Functional Analysis, Universitext, DOI 10.1007/978-1-4939-1945-1_1

This metric measures the size of the displacement between $x$ and $y$. While every norm determines a metric, not every metric space can have a norm. (We will see some examples of this in Sect. 5.3.)

Definition 1.3 Let $X$ be a metric space with metric $d$. A sequence $(x_n)_{n=1}^{\infty}$ in $X$ is said to converge to a point $x$ in $X$ if for any $\epsilon > 0$ there exists an $N \in \mathbb{N}$ such that $d(x, x_n) < \epsilon$ whenever $n \geq N$. In such a case, we say the sequence is convergent and write $\lim_{n \to \infty} x_n = x$. A sequence $(x_n)_{n=1}^{\infty}$ in $X$ is called a Cauchy sequence if for any $\epsilon > 0$ there exists an $N \in \mathbb{N}$ such that $d(x_n, x_m) < \epsilon$ whenever $n \geq N$ and $m \geq N$.

It is well-known that a scalar-valued sequence converges if and only if it is a Cauchy sequence. This is not always true in infinite-dimensional normed spaces. A convergent sequence will always be a Cauchy sequence, but there may be Cauchy sequences that do not converge to an element of the normed space. Such a space is called incomplete, because we imagine it lacks certain desirable points. For this reason, we are generally interested in normed spaces that are complete; that is, spaces in which every Cauchy sequence does converge. For these spaces we have a special name.

Definition 1.4 A normed space $X$ is called a Banach space if it is a complete metric space in the metric given by $d(x, y) = \|x - y\|$ for all $(x, y) \in X \times X$.

Suppose $X$ and $Y$ are Banach spaces (or simply normed spaces) over a scalar field $K$ (which is $\mathbb{R}$ or $\mathbb{C}$). A basic goal of functional analysis is to solve equations of the form $Tx = y$, where $x \in X$, $y \in Y$, and $T : X \to Y$ is a given linear map (e.g., a differential or integral operator). Consequently, much study is made of linear maps on Banach spaces. Of particular interest are bounded linear maps.

Definition 1.5 Let $X$ and $Y$ be normed spaces. A linear map $T : X \to Y$ is called bounded if there exists a constant $M \geq 0$ such that $\|Tx\| \leq M$ for all $\|x\| \leq 1$. We denote the smallest such $M$ by $\|T\|$; that is, if $T$ is a bounded linear map, then
\[ \|T\| = \sup\{\|Tx\| : \|x\| \leq 1\}. \]
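To make Definition 1.5 concrete, here is a minimal Python sketch under stated assumptions: $X = \mathbb{R}^3$ and $Y = \mathbb{R}^2$ both carry the supremum norm, and $T$ is the linear map given by a sample matrix `A`. The matrix and all function names are hypothetical choices for illustration. In this finite-dimensional setting the closed unit ball of $X$ is the cube $[-1, 1]^3$, and the convex function $x \mapsto \|Tx\|$ attains its maximum over the cube at a vertex, so the supremum can be found by checking finitely many points. The result is compared with the largest absolute row sum of the matrix, which is the standard closed form for this particular norm.

```python
from itertools import product

def sup_norm(v):
    """Supremum norm of a finite sequence of real numbers."""
    return max(abs(t) for t in v)

def apply(matrix, x):
    """Apply the linear map with the given matrix (a list of rows) to x."""
    return [sum(a * t for a, t in zip(row, x)) for row in matrix]

def operator_norm_sup(matrix):
    """Compute sup{ ||Tx|| : ||x|| <= 1 } by testing the 2^n cube vertices."""
    n = len(matrix[0])
    vertices = product((-1.0, 1.0), repeat=n)
    return max(sup_norm(apply(matrix, x)) for x in vertices)

def max_abs_row_sum(matrix):
    """Closed form for the same quantity: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in matrix)

# Hypothetical example matrix of a map T from R^3 to R^2.
A = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0]]

print(operator_norm_sup(A))  # 4.0
print(max_abs_row_sum(A))    # 4.0
```

Both computations give 4 for this matrix, so $\|Tx\| \leq 4$ whenever $\|x\| \leq 1$, and no smaller constant works. In infinite dimensions no such finite check is available, which is one reason bounded maps need a definition of their own.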
A linear map that is not bounded is called unbounded. An early and important result in functional analysis is the following proposition.

Proposition 1.6 Let $X$ and $Y$ be normed spaces and suppose $T : X \to Y$ is a linear map. The following are equivalent:
(i) $T$ is continuous,
(ii) $T$ is continuous at zero, and
(iii) $T$ is bounded.

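Proof The implication (i) ⇒ (ii) is clear. To show (ii) ⇒ (iii), assume $T$ is continuous at zero, but is unbounded. If $T$ is unbounded, then for every $n \in \mathbb{N}$, there exists an element $x_n \in X$ such that $\|x_n\| \leq 1$ and $\|Tx_n\| \geq n$. By the choice of $x_n$, we have that
\[ \left\| \frac{x_n}{n} \right\| \leq \frac{1}{n}, \quad n \in \mathbb{N}. \]
It follows that $\left\| \frac{x_n}{n} \right\| \to 0$ as $n \to \infty$, and consequently $\frac{x_n}{n} \to 0$ in $X$ as $n \to \infty$. By continuity at zero, it must be that
\[ \lim_{n \to \infty} \left\| T\left( \frac{x_n}{n} \right) \right\| = \|T(0)\| = 0. \]
However,
\[ \left\| T\left( \frac{x_n}{n} \right) \right\| = \frac{\|Tx_n\|}{n} \geq 1, \quad n \in \mathbb{N}. \]
This is a contradiction. Therefore, $T$ must be bounded.

Now we show (iii) ⇒ (i). Assume that $T$ is bounded and let $x \in X$ be nonzero. Then $\left\| T\left( \frac{x}{\|x\|} \right) \right\| \leq \|T\|$, because $\frac{x}{\|x\|}$ has norm 1. By the linearity of $T$,
\[ \|T(x)\| = \left\| T\left( \frac{x}{\|x\|} \right) \right\| \cdot \|x\| \leq \|T\| \, \|x\|, \]
and this inequality holds trivially when $x = 0$, so it is valid for all $x \in X$. To prove the continuity of $T$, we use the above inequality together with linearity:
\[ \|Tx - Ty\| = \|T(x - y)\| \leq \|T\| \, \|x - y\|, \quad (x, y) \in X \times X. \]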

This completes the proof. □

Remark 1.7. In the proof of Proposition 1.6, when showing (iii) ⇒ (i), we showed that $\|Tx - Ty\| \leq \|T\| \, \|x - y\|$ for all $x$ and $y$ in $X$. This allowed us to conclude that $T$ was continuous. In fact, this property is stronger than continuity. A function $f : X \to Y$ that satisfies the condition $\|f(x) - f(y)\| \leq K \|x - y\|$ for all $x$ and $y$ in $X$, for a constant $K > 0$, is called Lipschitz continuous with Lipschitz constant $K$. Any bounded linear map $T$ between Banach spaces is Lipschitz continuous (with Lipschitz constant $\|T\|$), but a Lipschitz continuous map between Banach spaces need not be linear.

The proof of Proposition 1.6 provides an alternative method for computing $\|T\|$.

Corollary 1.8 Let $X$ and $Y$ be normed spaces. If $T : X \to Y$ is a linear map, then
\[ \|T\| = \inf\{K : \|Tx\| \leq K \|x\|, \ x \in X\}. \]
Furthermore, $\|Tx\| \leq \|T\| \, \|x\|$ for all $x \in X$.

Definition 1.9 If $X$ and $Y$ are normed spaces, then $L(X, Y)$ denotes the set of all bounded linear maps (or operators) from $X$ to $Y$. The set $L(X, X)$ of bounded linear maps from $X$ to itself is often denoted by $L(X)$.

The word operator implies both boundedness and linearity. Frequently, however, we say $T \in L(X, Y)$ is a bounded linear operator, even though this is redundant. The set $L(X, Y)$ is a vector space over the scalar field $K$ with vector space operations given by:
\[ (\alpha S + \beta T)(x) = \alpha S(x) + \beta T(x), \]
where $\alpha$ and $\beta$ are in $K$, $S$ and $T$ are in $L(X, Y)$, and $x$ is in $X$.

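Proposition 1.10 If $X$ and $Y$ are normed spaces, then
\[ \|T\| = \sup\{\|Tx\| : \|x\| \leq 1\}, \quad T \in L(X, Y), \]
defines a norm on $L(X, Y)$.

Proof We verify the triangle inequality. Let $S$ and $T$ be in $L(X, Y)$ and let $x \in X$ be such that $\|x\| \leq 1$. Then
\[ \|(S + T)(x)\| = \|Sx + Tx\| \leq \|Sx\| + \|Tx\| \leq \|S\| + \|T\|. \]
Therefore,
\[ \|S + T\| = \sup\{\|(S + T)(x)\| : \|x\| \leq 1\} \leq \|S\| + \|T\|. \]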

The other conditions are also straightforward and are left to the reader. (See Exercise 1.5.) □

Proposition 1.11 If $X$ is a normed space and $Y$ is a Banach space, then $L(X, Y)$ is a Banach space.

Proof Suppose $(T_n)_{n=1}^{\infty}$ is a Cauchy sequence in $L(X, Y)$. For each $x \in X$, the sequence $(T_n x)_{n=1}^{\infty}$ is a Cauchy sequence in $Y$ because
\[ \|T_n x - T_m x\| \leq \|T_n - T_m\| \, \|x\|, \]
for all $m$ and $n$ in $\mathbb{N}$. Since $Y$ is complete, it follows that $\lim_{n \to \infty} T_n x$ exists for each $x \in X$. Denote this limit by $Tx$, so that
\[ Tx = \lim_{n \to \infty} T_n x, \quad x \in X. \]
We will show that $T$ is linear and bounded, and so $T \in L(X, Y)$. The linearity of $T$ follows from the continuity of the vector space operations. To see this, observe that for all $x$ and $y$ in $X$,
\[ T(x + y) = \lim_{n \to \infty} T_n(x + y) = \lim_{n \to \infty} T_n(x) + \lim_{n \to \infty} T_n(y) = Tx + Ty. \]
Homogeneity is verified in the same way.

We now show that $T$ is bounded. For each $m$ and $n$ in $\mathbb{N}$, we have (from Exercise 1.3)
\[ \big| \|T_m\| - \|T_n\| \big| \leq \|T_m - T_n\|. \]
Thus, since $(T_n)_{n=1}^{\infty}$ is a Cauchy sequence in $L(X, Y)$, it follows that $(\|T_n\|)_{n=1}^{\infty}$ is a Cauchy sequence of scalars. Consequently, $\sup_{n \in \mathbb{N}} \|T_n\| < \infty$, and so, for each $x \in X$,
\[ \|Tx\| \leq \sup_{n \in \mathbb{N}} \|T_n x\| \leq \Big( \sup_{n \in \mathbb{N}} \|T_n\| \Big) \|x\|. \]
Therefore, $T$ is bounded and $\|T\| \leq \sup_{n \in \mathbb{N}} \|T_n\|$.

It remains to show that $T = \lim_{n \to \infty} T_n$ in the norm on $L(X, Y)$. Let $x \in X$ be such that $\|x\| \leq 1$. For $n \in \mathbb{N}$,
\[ \|(T - T_n)(x)\| \leq \sup_{m \geq n} \|(T_m - T_n)(x)\| \leq \sup_{m \geq n} \|T_m - T_n\|, \]
by Corollary 1.8. Taking the supremum over all $x \in X$ with $\|x\| \leq 1$, we have
\[ \|T - T_n\| \leq \sup_{m \geq n} \|T_m - T_n\|. \]
Since $(T_n)_{n=1}^{\infty}$ is a Cauchy sequence, the right-hand side tends to zero as $n \to \infty$, and hence the result. □

When $X$ and $Y$ are normed spaces, a (not necessarily linear) function $f : X \to Y$ is called an isometry if $\|f(x) - f(y)\| = \|x - y\|$ for all $x$ and $y$ in $X$. (In this case, we say $f$ preserves distances.) When $f$ is linear, this condition is equivalent to stating that $\|f(x)\| = \|x\|$ for all $x \in X$. (In which case, we say $f$ preserves the norm.) If there exists an isometry between the spaces $X$ and $Y$, they are said to be isometric.

We remind the reader that a one-to-one map is called an injection and an onto map is called a surjection. An injective surjection is also called a bijection.

Definition 1.12 Let $X$ and $Y$ be normed spaces. A bijective linear map $T : X \to Y$ is called an isomorphism if both $T$ and $T^{-1}$ are continuous. In such a case, we say $X$ and $Y$ are isomorphic. If in addition $\|Tx\| = \|x\|$ for all $x \in X$, then $T$ is called an isometric isomorphism.
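The distinctions between isometries and isomorphisms can be checked numerically in a small case. The following Python sketch assumes $X = Y = \mathbb{R}^2$ with the Euclidean norm; the three maps, their parameters, and their names are hypothetical choices made for illustration. A rotation is a linear isometry, a translation is an isometry that is not linear (it preserves distances but not the norm), and doubling is an isomorphism that is not an isometry.

```python
import math

def norm(v):
    """Euclidean norm on R^2."""
    return math.sqrt(v[0] ** 2 + v[1] ** 2)

def dist(u, v):
    """Distance induced by the norm: ||u - v||."""
    return norm((u[0] - v[0], u[1] - v[1]))

def rotate(v, theta=0.7):
    """Linear map: rotation of the plane by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def translate(v, a=(1.0, -2.0)):
    """Nonlinear map: translation by the fixed vector a."""
    return (v[0] + a[0], v[1] + a[1])

def double(v):
    """Linear bijection x -> 2x, whose inverse x -> x/2 is also continuous."""
    return (2.0 * v[0], 2.0 * v[1])

x, y = (3.0, 4.0), (-1.0, 0.5)

# Rotation preserves both distances and the norm (up to rounding).
print(math.isclose(dist(rotate(x), rotate(y)), dist(x, y)))
print(math.isclose(norm(rotate(x)), norm(x)))

# Translation preserves distances but changes the norm.
print(math.isclose(dist(translate(x), translate(y)), dist(x, y)))
print(math.isclose(norm(translate(x)), norm(x)))

# Doubling scales every norm by the same factor, so it is not an isometry.
print(norm(double(x)) / norm(x))
```

The translation shows why the equivalence between preserving distances and preserving the norm is stated only for linear maps.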

Let $X$ be a vector space. If $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$ are two norms on $X$, then they are called equivalent norms provided there are constants $c_1 > 0$ and $c_2 > 0$ such that
\[ c_1 \|x\|_\alpha \leq \|x\|_\beta \leq c_2 \|x\|_\alpha, \quad x \in X. \]
In such an event, there is a linear isomorphism between the spaces $(X, \|\cdot\|_\alpha)$ and $(X, \|\cdot\|_\beta)$. (See Exercise 1.13.)

The next definitions are central to all that follows, and even indicate the origins of the subject we are studying.

Definition 1.13 Let $X$ be a normed space over the scalar field $K$ (which is either $\mathbb{R}$ or $\mathbb{C}$). The dual space of $X$ is the space $L(X, K)$, which is denoted $X^*$. Elements of $X^*$ are called (bounded) linear functionals.

We remark that for any normed space $X$, the dual space $X^*$ is a Banach space with the norm given by $\|x^*\| = \sup\{|x^*(x)| : \|x\| \leq 1\}$ for $x^* \in X^*$. (See Exercise 1.6.) We will see that much can be learned about a Banach space by studying the properties of its dual space.

We close this section by recalling some topological notions that will appear frequently throughout the remainder of this text. We will encounter these concepts in greater generality in Chap. 5, but for now we will restrict our attention to metric spaces.

Definition 1.14 Let $M$ be a metric space with metric $d$. For $x \in M$, the open ball of radius $\delta$ about $x$ is the set $B(x, \delta) = \{y \in M : d(x, y) < \delta\}$. The closed ball of radius $\delta$ about $x$ is the set $\overline{B}(x, \delta) = \{y \in M : d(x, y) \leq \delta\}$.

A subset $U$ of $M$ is called open if for every $x \in U$ there exists a $\delta > 0$ such that $B(x, \delta) \subseteq U$. A set is called closed if its complement is open. For any subset $E$ of $M$, the interior of $E$, denoted $\mathrm{int}(E)$, is the union of all open sets that are subsets of $E$. The closure of $E$ is denoted $\overline{E}$ and is the intersection of all closed sets which contain $E$ as a subset. If $\overline{E} = M$, then $E$ is called dense in $M$. If there exists a countable dense subset of $M$, then $M$ is called separable.

We remark that a set $E$ in a metric space is closed if and only if any convergent sequence $(x_n)_{n=1}^{\infty}$, where $x_n \in E$ for all $n \in \mathbb{N}$, converges to some $x \in E$. (See Exercise 1.10.)

For a normed space, we have special terminology and notation for the open ball $B(0, 1)$ and the closed ball $\overline{B}(0, 1)$.

Definition 1.15 If $X$ is a normed space, then the closed unit ball of $X$ is the set $B_X = \{x \in X : \|x\| \leq 1\}$ and the open unit ball of $X$ is the set $U_X = \{x \in X : \|x\| < 1\}$.

Observe that $U_X = \mathrm{int}(B_X)$. We will use both notations for the open unit ball in $X$. Also, if $X$ is a Banach space (or simply a normed vector space), and consequently has addition and scalar multiplication, we can write the closed and open balls in $X$ (respectively) as:
\[ \overline{B}(x, \delta) = x + \delta B_X \quad \text{and} \quad B(x, \delta) = x + \mathrm{int}(\delta B_X). \]

This is the notation we will generally use when working in a Banach space (or normed vector space), in order to emphasize the underlying linear structure.

A set $K$ in a metric space $M$ is called compact if any collection $\{U_\alpha\}_{\alpha \in A}$ of open sets for which $K \subseteq \bigcup_{\alpha \in A} U_\alpha$ contains a finite subcollection $\{U_{\alpha_1}, \ldots, U_{\alpha_n}\}$ such that $K \subseteq U_{\alpha_1} \cup \cdots \cup U_{\alpha_n}$. In particular, any compact set in a metric space can be covered by finitely many open balls of radius $\delta$ for any $\delta > 0$.

Compact sets have many desirable properties. For example, if $K$ is a compact set and $f : K \to \mathbb{R}$ is a continuous function, then $f$ is bounded and attains its maximum and minimum values. (This is the content of the Extreme Value Theorem.)

If $M_1$ and $M_2$ are metric spaces, then a map $f : M_1 \to M_2$ is called a homeomorphism if it is a continuous bijection with continuous inverse. If such a homeomorphism exists, we say that $M_1$ and $M_2$ are homeomorphic. Homeomorphic spaces are considered identical from a topological point of view. Notice that an isomorphism is a linear homeomorphism. Later, we will show that any bounded linear bijection between Banach spaces is necessarily an isomorphism. (See Corollary 4.30.)

Digression: Historical Comments

Functional analysis has its beginnings in the work of Fourier (1768–1830) and the study of differential equations. Fourier’s goal was to find solutions to differential equations such as
\[ y + \frac{d^2 y}{dx^2} = g(x), \]
where $g$ is some prescribed function. To this end, he applied to this setting the techniques of the already well-developed theory of linear algebra.

In 1902, Fredholm applied similar techniques to solve integral equations. Fredholm was trying to find continuous functions $f$ on the unit interval $[0, 1]$ that satisfied equations such as
\[ f(x) + \int_0^1 K(x, y) \, f(y) \, dy = g(x), \quad x \in [0, 1], \]
where $K$ and $g$ were given. The space of continuous functions on $[0, 1]$ is denoted $C[0, 1]$. There is a natural norm on this space, called the supremum norm:
\[ \|f\| = \max_{t \in [0,1]} |f(t)|, \quad f \in C[0, 1]. \]
Note that this maximum is attained, because $f$ is continuous and $[0, 1]$ is compact. (It is for this reason we compute the maximum over $t$ in $[0, 1]$ instead of the supremum.) The space $(C[0, 1], \|\cdot\|)$ is a Banach space, and was the first Banach space considered (although the term was not applied until much later).

The space $C[0, 1]$ can be generalized. If $K$ is any compact Hausdorff space, the space $C(K)$ denotes the collection of all scalar-valued continuous functions on $K$. When equipped with the norm
\[ \|f\| = \max_{x \in K} |f(x)|, \quad f \in C(K), \]
this space is a Banach space.

With the advent of Lebesgue’s thesis in 1903, and the subsequent development of measure theory, more examples of Banach spaces were discovered. In 1908, Hilbert and Schmidt studied $L_2(0, 1)$, and F. Riesz studied the more general spaces $L_p(0, 1)$ for $p \in [1, \infty)$. (See Appendix A for the relevant definitions.) The notion of a measurable function was critical in the understanding of these spaces; when restricted to continuous functions they are not complete. (However, the continuous functions are dense in $L_p(0, 1)$ for all $p \in [1, \infty)$, by Lusin’s Theorem (Theorem A.36).)

Norbert Wiener (in 1919) and Stefan Banach (in his thesis in 1920) independently introduced the axioms of what are now called Banach spaces. Banach and his school at Lwów made many advances in the area, but their work was cut short by the Second World War. Banach himself died shortly after the war in 1945. Much of the work of the Lwów school survived in what is now called the Scottish Book [24].

Exercises

Exercise 1.1 Show that in a metric space a convergent sequence is a Cauchy sequence.

Exercise 1.2 Let $X$ be a normed space. A sequence $(x_n)_{n=1}^\infty$ in $X$ is said to be bounded if there exists an $M > 0$ such that $\|x_n\| \le M$ for all $n \in \mathbb{N}$. Show that in a normed space a convergent sequence is bounded. Is the converse true? That is, must a bounded sequence necessarily converge?

Exercise 1.3 Let $X$ be a normed space. Show that $\big|\, \|x\| - \|y\| \,\big| \le \|x - y\|$ for all $x$ and $y$ in $X$. (This is sometimes called the reverse triangle inequality.)

Exercise 1.4 Let $X$ and $Y$ be normed vector spaces. A function $f : X \to Y$ is said to be bounded on $X$ if there exists an $M > 0$ such that $\|f(x)\| \le M$ for all $x \in X$. Show that a nonzero bounded linear operator $T : X \to Y$ is not bounded on $X$.

Exercise 1.5 Finish the proof of Proposition 1.10.

Exercise 1.6 Let $X$ be a normed space. Prove that $X^*$ is a Banach space.

Exercise 1.7 Let $C[0,1]$ be the space of continuous functions on $[0,1]$ and define a norm $\|\cdot\|$ on $C[0,1]$ by $\|f\| = \max_{t \in [0,1]} |f(t)|$. Show that the map

\[ A(f) = \int_0^1 f(x)\, dx \]

defines a bounded linear functional on the normed vector space $(C[0,1], \|\cdot\|)$.

Exercise 1.8 Let $C^{(1)}[0,1]$ be the space of all continuous functions on $[0,1]$ that have continuous derivative. (This space is called the space of continuously differentiable functions.) Define a norm $\|\cdot\|$ on $C^{(1)}[0,1]$ by $\|f\| = \max_{t \in [0,1]} |f(t)|$. Show that the map $B(f) = f'(0)$ defines a linear functional on the normed vector space $(C^{(1)}[0,1], \|\cdot\|)$ that is not bounded.

Exercise 1.9 Let $X$ be a normed vector space and suppose $T$ and $S$ are in $X^*$. If $T(x) = 0$ implies that $S(x) = 0$, show that there is a constant $c$ such that $S(x) = c\,T(x)$ for all $x \in X$.

Exercise 1.10 Show that a subset of a metric space $M$ is closed if and only if it is sequentially closed. That is, show that $E$ is a closed subset of $M$ if and only if every convergent sequence in $E$ has its limit in $E$.

Exercise 1.11 Let $X$ and $Y$ be normed vector spaces and suppose $T : X \to Y$ is a linear surjection. Show that $T$ is an isomorphism if and only if there are constants $c_1 > 0$ and $c_2 > 0$ such that

\[ c_1 \|x\| \le \|Tx\| \le c_2 \|x\|, \quad x \in X. \]

Exercise 1.12 Let $X$ and $Y$ be Banach spaces and let $T \in L(X, Y)$. Suppose $T$ is not bounded below; that is, there does not exist a $c > 0$ such that $\|T(x)\| \ge c\|x\|$ for all $x \in X$. Show there exists a sequence $(x_n)_{n=1}^\infty$ of norm one elements in $X$ such that $\lim_{n \to \infty} \|T(x_n)\| = 0$.

Exercise 1.13 Suppose $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$ are equivalent norms on a vector space $X$. Show that there is an isomorphism from $(X, \|\cdot\|_\alpha)$ onto $(X, \|\cdot\|_\beta)$.

Exercise 1.14 Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed vector spaces and assume the map $T \in L(X, Y)$ is an isomorphism. Define a scalar-valued function $\|\cdot\|_T$ on $X$ by

\[ \|x\|_T = \|T(x)\|_Y, \quad x \in X. \]

Prove that $\|\cdot\|_T$ is a norm on $X$ and show that it is equivalent to the original norm $\|\cdot\|_X$.

Exercise 1.15 Suppose $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$ are two norms on a vector space $X$. Show that the two norms are equivalent if and only if the normed spaces $(X, \|\cdot\|_\alpha)$ and $(X, \|\cdot\|_\beta)$ have the same open sets.

Exercise 1.16 Suppose $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$ are two equivalent norms on a vector space $X$. Show that $(X, \|\cdot\|_\alpha)$ is a complete normed space if $(X, \|\cdot\|_\beta)$ is a complete normed space.

Exercise 1.17 Let $X$ and $Y$ be isomorphic normed vector spaces. Use the preceding exercises to conclude that $X$ is a Banach space if and only if $Y$ is a Banach space.

Chapter 2 Classical Banach Spaces and Their Duals

In the next two sections, we will consider the classical sequence and function spaces. The main purpose of these sections is to make the necessary definitions and to identify the dual spaces for these classical spaces. We will therefore take for granted that the various Banach spaces are indeed Banach spaces—putting off until Sect. 2.3 the proofs that they are complete in the given norms.

2.1 Sequence Spaces

In the context of sequence spaces, we denote by $e_n$ the sequence with 1 in the $n$th coordinate, and 0 elsewhere, so that $e_n = (0, \ldots, 0, 1, 0, \ldots)$ for all $n \in \mathbb{N}$. Also, we let $e = (1, 1, 1, \ldots)$ be the constant sequence with 1 in every coordinate (not to be confused with the base of the natural logarithm $e \approx 2.718$).

Definition 2.1 The set $\ell_p$ of p-summable sequences for $p \in [1, \infty)$ is the collection of sequences

\[ \ell_p = \Big\{ (\xi_1, \xi_2, \ldots, \xi_n, \ldots) : \sum_{n=1}^\infty |\xi_n|^p < \infty \Big\}. \]

Define the p-norm on $\ell_p$ by

\[ \|\xi\|_p = \Big( \sum_{n=1}^\infty |\xi_n|^p \Big)^{1/p}, \quad \xi = (\xi_n)_{n=1}^\infty \in \ell_p. \]

The set $\ell_p$ is a vector space under component-wise addition and scalar multiplication. (This is a nontrivial fact which we will take as given. [See Theorem A.27.]) Furthermore, $\ell_p$ is a Banach space when given the p-norm (for $1 \le p < \infty$). We leave the proof of this fact to the exercises. (See Exercise 2.7.) The next lemma will identify the dual space of $\ell_p$ for $p \in (1, \infty)$.

Lemma 2.2 For $p \in (1, \infty)$, the space $\ell_p^*$ can be identified with $\ell_q$, where $\frac{1}{p} + \frac{1}{q} = 1$.


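Proof For simplicity, we start by assuming the scalars are real. (We will consider the complex case at the end of the proof.) We wish to identify the dual space of $\ell_p$ for $p \in (1, \infty)$ as the sequence space $\ell_q$, where $\frac{1}{p} + \frac{1}{q} = 1$. First, we will demonstrate how elements in $\ell_q$ determine linear functionals on $\ell_p$. Let $\eta = (\eta_n)_{n=1}^\infty$ be an element in $\ell_q$ and define a scalar-valued function $\varphi_\eta$ on $\ell_p$ by

\[ \varphi_\eta(\xi) = \sum_{n=1}^\infty \xi_n \eta_n, \tag{2.1} \]

where $\xi = (\xi_n)_{n=1}^\infty$ is any sequence in $\ell_p$. By Hölder's Inequality (Theorem A.29), this series is absolutely convergent (whence $\varphi_\eta$ is linear) and

\[ |\varphi_\eta(\xi)| \le \Big( \sum_{n=1}^\infty |\xi_n|^p \Big)^{1/p} \Big( \sum_{n=1}^\infty |\eta_n|^q \Big)^{1/q} = \|\xi\|_p \|\eta\|_q. \]

It follows that $\varphi_\eta$ is a bounded linear functional on $\ell_p$ and $\|\varphi_\eta\| \le \|\eta\|_q$. We claim that $\|\varphi_\eta\|$ is in fact equal to $\|\eta\|_q$. This is clear if $\eta = 0$, so assume $\eta \ne 0$. It will suffice to find a sequence $\xi$ in $\ell_p$ such that $\|\xi\|_p = 1$ and $\varphi_\eta(\xi) = \|\eta\|_q$. We begin by constructing a sequence $\zeta = (\zeta_n)_{n=1}^\infty$ so that $\zeta_n = |\eta_n|^{q-1} (\operatorname{sign} \eta_n)$ for each $n \in \mathbb{N}$. Then

\[ \sum_{n=1}^\infty |\zeta_n|^p = \sum_{n=1}^\infty |\eta_n|^{(q-1)p} = \sum_{n=1}^\infty |\eta_n|^q, \]

where $(q-1)p = q$ follows from the assumption that $\frac{1}{p} + \frac{1}{q} = 1$. Consequently, the sequence $\zeta$ is in $\ell_p$ and $\|\zeta\|_p = \|\eta\|_q^{q/p}$. Observe that

\[ \varphi_\eta(\zeta) = \sum_{n=1}^\infty \zeta_n \cdot \eta_n = \sum_{n=1}^\infty |\eta_n|^{q-1} (\operatorname{sign} \eta_n) \cdot \eta_n = \sum_{n=1}^\infty |\eta_n|^q = \|\eta\|_q^q. \]

Let $\xi = \zeta / \|\zeta\|_p$. Then $\xi$ is a sequence in $\ell_p$ such that $\|\xi\|_p = 1$ and such that

\[ \varphi_\eta(\xi) = \frac{\varphi_\eta(\zeta)}{\|\zeta\|_p} = \frac{\|\eta\|_q^q}{\|\eta\|_q^{q/p}} = \|\eta\|_q^{q - q/p} = \|\eta\|_q. \]

Therefore, for any $\eta = (\eta_n)_{n=1}^\infty$ in $\ell_q$, there is a linear functional $\varphi_\eta$ on $\ell_p$ such that $\|\eta\|_q = \|\varphi_\eta\|$ and such that $\varphi_\eta(\xi) = \sum_{n=1}^\infty \xi_n \eta_n$ for all $\xi = (\xi_n)_{n=1}^\infty$ in $\ell_p$.

We have demonstrated that any sequence in $\ell_q$ determines a bounded linear functional on $\ell_p$. Next, we will show that all bounded linear functionals on $\ell_p$ can be obtained in this way. Let $\psi \in \ell_p^*$ and define a sequence $\eta = (\eta_i)_{i=1}^\infty$ by letting $\eta_i = \psi(e_i)$ for each $i \in \mathbb{N}$. We will show that the sequence $\eta$ is an element of $\ell_q$ such that $\|\eta\|_q = \|\psi\|$ and such that $\psi(\xi) = \sum_{i=1}^\infty \xi_i \eta_i$ for all $\xi = (\xi_i)_{i=1}^\infty$ in $\ell_p$.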

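First, we will show that $\eta$ is in fact an element of $\ell_q$. For each $i \in \mathbb{N}$, define $\zeta_i = |\eta_i|^{q-1} (\operatorname{sign} \eta_i)$. Then, for any $n \in \mathbb{N}$, we have

\[ \sum_{i=1}^n |\zeta_i|^p = \sum_{i=1}^n |\eta_i|^{(q-1)p} = \sum_{i=1}^n |\eta_i|^q. \]

Computing the p-norm of the finite sequence $\sum_{i=1}^n \zeta_i e_i$, we conclude that

\[ \Big\| \sum_{i=1}^n \zeta_i e_i \Big\|_p = \Big( \sum_{i=1}^n |\zeta_i|^p \Big)^{1/p} = \Big( \sum_{i=1}^n |\eta_i|^q \Big)^{1/p}. \]

By assumption, the linear functional $\psi$ is bounded on $\ell_p$, and consequently

\[ \Big| \psi\Big( \sum_{i=1}^n \zeta_i e_i \Big) \Big| \le \|\psi\| \, \Big\| \sum_{i=1}^n \zeta_i e_i \Big\|_p = \|\psi\| \Big( \sum_{i=1}^n |\eta_i|^q \Big)^{1/p}. \tag{2.2} \]

However, computing directly, we obtain

\[ \psi\Big( \sum_{i=1}^n \zeta_i e_i \Big) = \sum_{i=1}^n \zeta_i \psi(e_i) = \sum_{i=1}^n \zeta_i \eta_i = \sum_{i=1}^n |\eta_i|^q. \tag{2.3} \]

From (2.2) and (2.3), it follows that

\[ \sum_{i=1}^n |\eta_i|^q \le \|\psi\| \Big( \sum_{i=1}^n |\eta_i|^q \Big)^{1/p}. \]

Dividing, we see that

\[ \|\psi\| \ge \Big( \sum_{i=1}^n |\eta_i|^q \Big)^{1 - \frac{1}{p}} = \Big( \sum_{i=1}^n |\eta_i|^q \Big)^{1/q}. \]

This inequality holds for all $n \in \mathbb{N}$, and so $\eta \in \ell_q$ and $\|\psi\| \ge \|\eta\|_q$. It remains to show that $\|\psi\| \le \|\eta\|_q$ and that $\psi = \varphi_\eta$, where $\varphi_\eta$ is defined by (2.1). Since we have already demonstrated that $\|\varphi_\eta\| \le \|\eta\|_q$, it suffices to show that $\psi = \varphi_\eta$.

Suppose $\xi = (\xi_i)_{i=1}^\infty \in \ell_p$ and let $\xi^{(n)} = (\xi_1, \ldots, \xi_n, 0, \ldots)$ for each $n \in \mathbb{N}$. We claim that $\xi^{(n)}$ converges to $\xi$ in the norm on $\ell_p$. To see this, observe that $\sum_{i=1}^\infty |\xi_i|^p < \infty$, by assumption, and consequently

\[ \lim_{n \to \infty} \big\| \xi - \xi^{(n)} \big\|_p = \lim_{n \to \infty} \Big( \sum_{i=n+1}^\infty |\xi_i|^p \Big)^{1/p} = 0. \tag{2.4} \]

Now, observe that

\[ \psi\big( \xi^{(n)} \big) = \psi\Big( \sum_{i=1}^n \xi_i e_i \Big) = \sum_{i=1}^n \xi_i \psi(e_i) = \sum_{i=1}^n \xi_i \eta_i. \]

Therefore, by the continuity of $\psi$,

\[ \psi(\xi) = \lim_{n \to \infty} \psi\big( \xi^{(n)} \big) = \sum_{i=1}^\infty \xi_i \eta_i = \varphi_\eta(\xi). \]

It follows that $\psi = \varphi_\eta$, which is the desired result.

Now assume the scalar field is $\mathbb{C}$. The proof in this case is essentially the same. However, when defining $\zeta_n$, for $n \in \mathbb{N}$, let $\zeta_n = |\eta_n|^{q-1} \rho_n$, where $\rho_n$ is a complex number such that $|\rho_n| = 1$ and $\eta_n \rho_n = |\eta_n|$. The argument proceeds as it did in the real case. □

The previous lemma identified the dual space of $\ell_p$ for $p \in (1, \infty)$ as the space $\ell_q$, where $\frac{1}{p} + \frac{1}{q} = 1$, via the dual space action

\[ \eta(\xi) = \sum_{n=1}^\infty \xi_n \eta_n, \]

where $\xi = (\xi_n)_{n=1}^\infty$ is in $\ell_p$ and $\eta = (\eta_n)_{n=1}^\infty$ is in $\ell_q$. In the above equation, we write $\eta(\xi)$ as shorthand for $\varphi_\eta(\xi)$, where $\varphi_\eta$ is the linear functional corresponding to $\eta$ that appears in (2.1). The object $\eta$ is a sequence in the space $\ell_q$, but we write $\eta(\xi)$ because we are viewing $\eta$ as a linear functional in $\ell_p^*$.

Next, we wish to identify the dual space of $\ell_1$. In order to do this, we must introduce a new space of sequences.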

Definition 2.3 The set $\ell_\infty$ of bounded sequences is the collection of sequences

\[ \ell_\infty = \Big\{ (\xi_n)_{n=1}^\infty : \sup_{n \in \mathbb{N}} |\xi_n| < \infty \Big\}. \]

Define the supremum norm on $\ell_\infty$ by

\[ \|\xi\|_\infty = \sup_{n \in \mathbb{N}} |\xi_n|, \quad \xi = (\xi_n)_{n=1}^\infty \in \ell_\infty. \]

The set $\ell_\infty$ is a vector space under component-wise addition and scalar multiplication and is a Banach space when given the supremum norm.

Lemma 2.4 The space $\ell_1^*$ can be identified with $\ell_\infty$.

Proof The proof is similar to the proof of Lemma 2.2 and is left to the reader. (See Exercise 2.2.) As in Lemma 2.2, the dual space action is

\[ \eta(\xi) = \sum_{n=1}^\infty \xi_n \eta_n, \]

where now $\xi = (\xi_n)_{n=1}^\infty$ is in $\ell_1$ and $\eta = (\eta_n)_{n=1}^\infty$ is in $\ell_\infty$. □
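The norming construction in the proof of Lemma 2.2 can be checked numerically on a finitely supported sequence. The following is a minimal sketch, not part of the text: the helper names `p_norm`, `norming_sequence`, and `pairing` are hypothetical, the scalars are assumed real, and $\eta$ is assumed nonzero. It builds $\zeta_n = |\eta_n|^{q-1} (\operatorname{sign} \eta_n)$, normalizes it in the p-norm, and evaluates the pairing with $\eta$.

```python
import math

def p_norm(xs, p):
    """p-norm of a finite real sequence, viewed as a finitely supported element of l_p."""
    return sum(abs(x) ** p for x in xs) ** (1.0 / p)

def norming_sequence(eta, p):
    """Return xi = zeta / ||zeta||_p with zeta_n = |eta_n|^(q-1) * sign(eta_n).

    Assumes 1 < p < infinity and eta != 0 (illustrative helper, not from the text).
    """
    q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1
    zeta = [math.copysign(abs(e) ** (q - 1.0), e) if e != 0 else 0.0 for e in eta]
    norm_zeta = p_norm(zeta, p)
    return [z / norm_zeta for z in zeta]

def pairing(xi, eta):
    """The dual space action: sum of xi_n * eta_n."""
    return sum(x * e for x, e in zip(xi, eta))

p = 3.0
q = p / (p - 1.0)
eta = [2.0, -1.0, 0.5, 0.0, -3.0]
xi = norming_sequence(eta, p)

print(p_norm(xi, p))      # should be 1 up to rounding
print(pairing(xi, eta))   # should agree with the next line up to rounding
print(p_norm(eta, q))
```

Up to floating-point rounding, the first printed value is 1 and the last two agree, which is the equality $\varphi_\eta(\xi) = \|\eta\|_q$ used to show $\|\varphi_\eta\| = \|\eta\|_q$.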

Since we identify the dual space of $\ell_p$ with $\ell_q$, it is standard to write $\ell_p^* = \ell_q$. When we write this, however, it is understood that we mean there is a way of identifying the linear functionals in $\ell_p^*$ with the sequences in $\ell_q$ and the identification is shown explicitly by the dual space action $\eta(\xi) = \sum_{n=1}^\infty \xi_n \eta_n$, where $\xi = (\xi_n)_{n=1}^\infty$ is in $\ell_p$ and $\eta = (\eta_n)_{n=1}^\infty$ is in $\ell_q$. We summarize our results in the following theorem.

Theorem 2.5 Let $p \in [1, \infty)$ and suppose $q$ is such that $\frac{1}{p} + \frac{1}{q} = 1$, with the convention that $q = \infty$ when $p = 1$. Then $\ell_p^* = \ell_q$.

Proof See Lemmas 2.2 and 2.4. □

The relationship between the exponents $p$ and $q$ in Theorem 2.5 motivates the next definition.

Definition 2.6 If $p \in [1, \infty)$ and if $q$ is such that $\frac{1}{p} + \frac{1}{q} = 1$, with the convention that $q = \infty$ when $p = 1$, then $p$ and $q$ are called conjugate exponents.

If $p$ and $q$ are conjugate exponents that are both finite, then $\ell_p^* = \ell_q$ and $\ell_q^* = \ell_p$. Since $\ell_1^* = \ell_\infty$, it is natural to ask if $\ell_1$ is the dual space of $\ell_\infty$. While $\ell_1 \subseteq \ell_\infty^*$, the spaces do not coincide. The argument used in the proof of Lemma 2.2 for $p \in (1, \infty)$ fails in the case $p = \infty$ because there exist bounded sequences $(\xi_n)_{n=1}^\infty$ that do not satisfy the equation that corresponds to (2.4) for the case $p = \infty$. That is, we can find $\xi \in \ell_\infty$ such that

\[ \lim_{n \to \infty} \big\| \xi - \xi^{(n)} \big\|_\infty = \lim_{n \to \infty} \sup_{k > n} |\xi_k| \ne 0, \tag{2.5} \]

where $\xi^{(n)} = (\xi_1, \ldots, \xi_n, 0, \ldots)$. As a simple example, let $\xi = e$, the constant sequence having every term equal to 1. We have that $\|e\|_\infty = 1$, and so $e$ is an element of $\ell_\infty$, but $\|e - e^{(n)}\|_\infty = 1$ for all $n \in \mathbb{N}$.

Let us now consider the space of sequences for which the limit in (2.5) is 0.

Definition 2.7 Let $c_0$ be the space of all sequences converging to 0:

\[ c_0 = \Big\{ (\xi_n)_{n=1}^\infty : \lim_{n \to \infty} \xi_n = 0 \Big\}. \]

The space $c_0$ is a Banach space with the supremum norm

\[ \|\xi\|_\infty = \sup_{n \in \mathbb{N}} |\xi_n|, \quad \xi = (\xi_n)_{n=1}^\infty \in c_0. \]

Theorem 2.8 $c_0^* = \ell_1$.

The proof is similar to that of Lemma 2.2 and is left as an exercise for the reader. (See Exercise 2.2.)

The last sequence space we discuss is the space $c$.

Definition 2.9 Let $c$ be the space of all convergent sequences:

\[ c = \Big\{ (\xi_n)_{n=1}^\infty : \lim_{n \to \infty} \xi_n \text{ exists} \Big\}. \]

The space $c$ is also a Banach space with the supremum norm. Perhaps surprisingly, the dual space of $c$ is also $\ell_1$, albeit with a slightly different dual space action. (See Example 2.22.) It is straightforward to show that $c_0$ is a closed subspace of $c$, and in turn $c$ is a closed subspace of $\ell_\infty$. (See Exercise 2.1.)
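The contrast between (2.4) and (2.5) can be illustrated numerically. The sketch below rests on stated assumptions: a long finite prefix stands in for an infinite sequence, so the output can only suggest the limiting behaviour, and the helper names `tail_p_norm` and `tail_sup_norm` are hypothetical. It compares the truncation error $\|\xi - \xi^{(n)}\|$ for $\xi_k = 1/k$ (which lies in $\ell_2$ and in $c_0$) with that of the constant sequence $e$ in the supremum norm.

```python
def tail_p_norm(xi, n, p):
    """||xi - xi^(n)||_p, where xi is a finite list standing in for a sequence."""
    return sum(abs(x) ** p for x in xi[n:]) ** (1.0 / p)

def tail_sup_norm(xi, n):
    """||xi - xi^(n)||_infinity = sup over k > n of |xi_k| (0 if the tail is empty)."""
    return max((abs(x) for x in xi[n:]), default=0.0)

N = 10_000                                      # length of the finite prefix (an assumption)
harmonic = [1.0 / k for k in range(1, N + 1)]   # xi_k = 1/k
ones = [1.0] * N                                # prefix of e = (1, 1, 1, ...)

cutoffs = (10, 100, 1000)
tails_l2 = [tail_p_norm(harmonic, n, 2) for n in cutoffs]
tails_c0 = [tail_sup_norm(harmonic, n) for n in cutoffs]
tails_e = [tail_sup_norm(ones, n) for n in cutoffs]

print(tails_l2)  # decreasing toward 0, as in (2.4)
print(tails_c0)  # sup of the tail is 1/(n+1), also tending to 0
print(tails_e)   # stays at 1 for every cutoff, as in (2.5)
```

The tails of $1/k$ shrink in both norms, while the tail of $e$ has supremum norm 1 at every cutoff. This is why the truncation argument from Lemma 2.2 is available in $c_0$ but not in all of $\ell_\infty$.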

2.2 Function Spaces

In this section, let $(\Omega, \Sigma, \mu)$ be a measure space, where $\mu$ is a positive measure. The theorems in this section are true for positive σ-finite measure spaces, but for simplicity we will assume $\mu(\Omega) < \infty$. As before, $\mathbb{K}$ denotes the underlying scalar field, which is either $\mathbb{R}$ or $\mathbb{C}$. We begin by recalling some definitions from measure theory. (See Appendix A for a more detailed discussion.)

Definition 2.10 If $A$ is a subset of $\Omega$, then the characteristic function of $A$ is the function

\[ \chi_A(\omega) = \begin{cases} 1 & \text{if } \omega \in A, \\ 0 & \text{if } \omega \notin A. \end{cases} \]

The characteristic function of $A$ is a measurable function if and only if $A$ is a measurable subset of $\Omega$.

Definition 2.11 For $p \in [1, \infty)$, the set of p-integrable functions (or $L_p$-functions) on $(\Omega, \Sigma, \mu)$ is the collection

\[ L_p(\Omega, \Sigma, \mu) = \Big\{ f : \Omega \to \mathbb{K} \text{ a measurable function} : \int_\Omega |f|^p \, d\mu < \infty \Big\}. \]

We often write this space as $L_p(\Omega, \mu)$ or $L_p(\mu)$ when there is no risk of confusion. The set $L_p(\mu)$ is actually a collection of equivalence classes of measurable functions. Two functions in $L_p(\mu)$ are considered equivalent if they differ only on a set of μ-measure zero. Despite this, we will usually speak of the elements in $L_p(\mu)$ as functions, rather than equivalence classes of functions. We remark that the set $L_p(\mu)$ is a vector space under pointwise addition and scalar multiplication. (As with $\ell_p$ in the previous section, this is a nontrivial result. [See Theorem A.27.]) Define the p-norm on $L_p(\mu)$ by

\[ \|f\|_p = \Big( \int_\Omega |f|^p \, d\mu \Big)^{1/p}, \quad f \in L_p(\mu). \]

We will show that $L_p(\mu)$ is a Banach space in the p-norm (for $1 \le p < \infty$) in the next section. (See Theorem 2.25.) As in the case of sequence spaces, we must consider the case $p = \infty$ separately.

Definition 2.12 Let $(\Omega, \Sigma, \mu)$ be a measure space. The essential supremum norm of a measurable function $f$ is defined to be

\[ \|f\|_\infty = \inf \{ K : \mu(|f| > K) = 0 \}. \]

The set of essentially bounded functions (or $L_\infty$-functions) on $(\Omega, \Sigma, \mu)$ is the collection

\[ L_\infty(\Omega, \Sigma, \mu) = \{ f : \Omega \to \mathbb{K} \text{ a measurable function} : \|f\|_\infty < \infty \}. \]

We often write $L_\infty(\Omega, \mu)$ or $L_\infty(\mu)$ when there is no risk of confusion.

As is the case for $L_p(\mu)$ when $p$ is finite, the set $L_\infty(\mu)$ is a collection of equivalence classes of measurable functions and (as before) we consider two functions to be equal in $L_\infty(\mu)$ if they differ only on a set of μ-measure zero. The set $L_\infty(\mu)$ is a vector space under pointwise addition and scalar multiplication. The essential supremum defines a norm on $L_\infty(\mu)$ and $L_\infty(\mu)$ is a Banach space when given this norm. (See Theorem 2.26.)

We use the terminology "essentially bounded" to describe the functions in $L_\infty(\mu)$ and call $\|f\|_\infty$ the "essential supremum" of $|f|$ in $L_\infty(\mu)$, because the quantity $\|f\|_\infty$ is the smallest number $K$ such that $|f| \le K$ a.e.(μ).

We wish to identify the dual space of $L_p(\mu)$, where $p \in [1, \infty)$. We will see that, analogous to the case of sequence spaces, the dual space of $L_p(\mu)$ is $L_q(\mu)$, where $p$ and $q$ are conjugate exponents. In this case, however, the dual action of $L_q(\mu)$ on $L_p(\mu)$ is given by integration. That is, if $f \in L_p(\mu)$ and $g \in L_q(\mu)$, then the dual action of $g$ on $f$ is given by

\[ g(f) = \int_\Omega f g \, d\mu. \]

Notice that $g$ is a function on $\Omega$. Here, however, we write $g(f)$ because we view $g$ as an element of the dual space $L_p(\mu)^*$. As before, we write $L_p(\mu)^* = L_q(\mu)$ to indicate the identification of linear functionals in $L_p(\mu)^*$ with elements of the function space $L_q(\mu)$.

Theorem 2.13 Let $(\Omega, \Sigma, \mu)$ be a positive finite measure space. If $p$ and $q$ are conjugate exponents, where $p \in [1, \infty)$, then $L_p(\mu)^* = L_q(\mu)$.

Proof Start by assuming the scalars are real. We begin with the case $p \in (1, \infty)$. Let $g \in L_q(\mu)$, where $\frac{1}{p} + \frac{1}{q} = 1$, and define a scalar-valued function $\varphi_g$ on $L_p(\mu)$ by

\[ \varphi_g(f) = \int_\Omega f g \, d\mu, \quad f \in L_p(\mu). \tag{2.6} \]

We will show first that $\varphi_g$ is a linear functional on $L_p(\mu)$ such that $\|\varphi_g\| = \|g\|_q$, and then we will show that all bounded linear functionals on $L_p(\mu)$ can be achieved in this way.

We note that $\varphi_g$ is linear (by the linearity of the integral) and $|\varphi_g(f)| \le \|f\|_p \|g\|_q$ (by Hölder's Inequality). Thus, $\varphi_g$ is a bounded linear functional and $\|\varphi_g\| \le \|g\|_q$. In order to show equality of the norms, it suffices to find a function $f$ in $L_p(\mu)$ such that $\|f\|_p = 1$ and such that $\varphi_g(f) = \|g\|_q$. First, define a scalar-valued function $h$ on $\Omega$ by letting $h(x) = |g(x)|^{q-1} (\operatorname{sign} g(x))$ for each $x \in \Omega$. Then

\[ \|h\|_p = \Big( \int_\Omega |g|^{(q-1)p} \, d\mu \Big)^{1/p} = \Big( \int_\Omega |g|^q \, d\mu \Big)^{1/p} = \|g\|_q^{q/p}, \]

where $(q-1)p = q$ follows from the assumption that $\frac{1}{p} + \frac{1}{q} = 1$. It follows that $h$ is in $L_p(\mu)$. Next, observe that $\varphi_g(h) = \|g\|_q^q$. Now, let $f = h / \|h\|_p$. Then $\|f\|_p = 1$ and

\[ \varphi_g(f) = \frac{\varphi_g(h)}{\|h\|_p} = \frac{\|g\|_q^q}{\|g\|_q^{q/p}} = \|g\|_q. \]

Therefore, $\varphi_g$ is a bounded linear functional on $L_p(\mu)$ and $\|\varphi_g\| = \|g\|_q$.

We now wish to show that any bounded linear functional in $L_p(\mu)^*$ can be written as in (2.6) for some $g$ in $L_q(\mu)$. To that end, let $\psi$ be a bounded linear functional in the dual space $L_p(\mu)^*$. Define a measure $\nu$ on $(\Omega, \Sigma)$ by $\nu(A) = \psi(\chi_A)$ for all $A \in \Sigma$. It is routine to show that $\nu$ is finitely additive (by the linearity of $\psi$), and it is countably additive by the continuity of $\psi$. We also claim that $\nu \ll \mu$. That is, $\nu$ is absolutely continuous with respect to $\mu$. To see this, suppose $A \in \Sigma$ is such that $\mu(A) = 0$. Because $\psi$ is bounded,

\[ |\nu(A)| = |\psi(\chi_A)| \le \|\psi\| \, \|\chi_A\|_{L_p(\mu)} = \|\psi\| \, \mu(A)^{1/p} = 0. \]

By the Radon–Nikodým Theorem (Theorem A.24), there exists a measurable function $g \in L_1(\mu)$ such that $\nu(A) = \int_A g \, d\mu$ for all $A \in \Sigma$. Therefore, for every $A \in \Sigma$,

\[ \psi(\chi_A) = \nu(A) = \int_A g \, d\mu = \int_\Omega \chi_A \, g \, d\mu. \]

By linearity, it follows that $\psi(f) = \int_\Omega f g \, d\mu$ whenever $f$ is a simple measurable function. Let $f \in L_\infty(\mu)$ be a real nonnegative essentially bounded measurable function. Since $(\Omega, \Sigma, \mu)$ is a finite measure space, it follows that $f \in L_p(\mu)$. Thus, there exists a sequence of simple measurable functions $(f_n)_{n=1}^\infty$ such that $f_n \ge f_{n-1}$ for all $n \in \mathbb{N}$, and such that $\|f - f_n\|_p \to 0$ as $n \to \infty$. By the continuity of $\psi$,

\[ \psi(f) = \lim_{n \to \infty} \psi(f_n) = \lim_{n \to \infty} \int_\Omega f_n \, g \, d\mu. \]

Therefore, by Lebesgue's Dominated Convergence Theorem (Theorem A.17),

\[ \psi(f) = \int_\Omega f g \, d\mu, \quad f \in L_\infty(\mu) \cap L_p(\mu), \; f \ge 0. \]

To extend this to an arbitrary real function in $L_\infty(\mu) \cap L_p(\mu)$, let

\[ f^+ = f \chi_{\{x : f(x) \ge 0\}} \quad \text{and} \quad f^- = -f \chi_{\{x : f(x) < 0\}}, \]

and observe that $f = f^+ - f^-$.

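We claim that $g \in L_q(\mu)$. For each $n \in \mathbb{N}$, define a function $h_n$ on $\Omega$ by letting $h_n = \chi_{\{|g| \le n\}} |g|^{q-1} (\operatorname{sign} g)$. Then, for each $n \in \mathbb{N}$, we have $h_n \in L_\infty(\mu) \cap L_p(\mu)$ and

\[ \psi(h_n) = \int_{\{|g| \le n\}} |g|^q \, d\mu. \]

By assumption, the linear functional $\psi$ is bounded on $L_p(\mu)$, and so it follows that $|\psi(h_n)| \le \|\psi\| \, \|h_n\|_p$. Computing the $L_p$-norm of $h_n$, and once again observing that $(q-1)p = q$, we see that

\[ \|h_n\|_p = \Big( \int_{\{|g| \le n\}} |g|^{(q-1)p} \, d\mu \Big)^{1/p} = \Big( \int_{\{|g| \le n\}} |g|^q \, d\mu \Big)^{1/p}. \]

Therefore,

\[ \int_{\{|g| \le n\}} |g|^q \, d\mu = |\psi(h_n)| \le \|\psi\| \, \|h_n\|_p = \|\psi\| \Big( \int_{\{|g| \le n\}} |g|^q \, d\mu \Big)^{1/p}. \]

Dividing, we obtain

\[ \|\psi\| \ge \Big( \int_{\{|g| \le n\}} |g|^q \, d\mu \Big)^{1 - \frac{1}{p}} = \Big( \int_{\{|g| \le n\}} |g|^q \, d\mu \Big)^{1/q}. \]

Thus, by Fatou's Lemma (Theorem A.16),

\[ \|g\|_q = \Big( \int_\Omega \liminf_{n \to \infty} \chi_{\{|g| \le n\}} |g|^q \, d\mu \Big)^{1/q} \le \liminf_{n \to \infty} \Big( \int_\Omega \chi_{\{|g| \le n\}} |g|^q \, d\mu \Big)^{1/q} \le \|\psi\|. \]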

Therefore, $g$ is in $L_q(\mu)$ and $\|g\|_q \le \|\psi\|$.

It remains to show that $\psi = \varphi_g$. If $f$ is a real nonnegative function in $L_p(\mu)$, then we may choose a sequence $(f_n)_{n=1}^\infty$ of simple measurable functions such that $f_n$ increases to $f$ almost everywhere and such that $f_n \to f$ in the $L_p$-norm as $n \to \infty$. We have already established that $\psi(f_n) = \int_\Omega f_n \, g \, d\mu$ for all $n \in \mathbb{N}$. We also know that $\psi(f_n) \to \psi(f)$ as $n \to \infty$, because $\psi$ is a continuous linear functional on $L_p(\mu)$. Since $f \in L_p(\mu)$ and $g \in L_q(\mu)$, it follows that $fg \in L_1(\mu)$, by Hölder's Inequality. Thus,

\[ \lim_{n \to \infty} \int_\Omega f_n \, g \, d\mu = \int_\Omega f g \, d\mu, \]

by Lebesgue's Dominated Convergence Theorem. Therefore, $\psi(f) = \varphi_g(f)$ for all nonnegative functions $f$ in $L_p(\mu)$. As before, we may extend this to all real functions in $L_p(\mu)$ by writing $f = f^+ - f^-$.

We have now proven the theorem for $p \in (1, \infty)$ when the scalar field is $\mathbb{R}$. In order to extend this result to $\mathbb{C}$, we argue as above, but define the function $h$ by the rule $h = |g|^{q-1} \rho$, where $\rho : \Omega \to \mathbb{C}$ is a function such that $|\rho| = 1$ and $g\rho = |g|$. Similarly, we let $h_n = \chi_{\{|g| \le n\}} |g|^{q-1} \rho$ for each $n \in \mathbb{N}$. This argument proves that $\varphi_g$ is a bounded linear functional on $L_p(\mu)$ for all $g \in L_q(\mu)$. It also proves that for any bounded linear functional $\psi$ on $L_p(\mu)$, there exists a function $g \in L_q(\mu)$ such that $\psi(f) = \varphi_g(f)$ for all real functions $f$ in $L_p(\mu)$. To extend this result to complex functions $f$ in $L_p(\mu)$, write $f = \Re(f) + i\, \Im(f)$, where $\Re(f)$ and $\Im(f)$ are the real and imaginary parts of $f$, respectively, and use linearity.

For $p = 1$, the proof is similar to the case when $p \in (1, \infty)$ and is left to the reader. (See Exercise 2.3.) □

As is the case with sequence spaces, the dual of $L_\infty(\mu)$ need not be $L_1(\mu)$.

Theorem 2.13 remains true when $\mu$ is a positive σ-finite measure. Such a case can be seen in the following example.

Example 2.14 Consider the measure space $(\mathbb{N}, 2^{\mathbb{N}}, m)$, where $2^{\mathbb{N}}$ denotes the power set of $\mathbb{N}$ (the collection of all subsets of $\mathbb{N}$), and $m$ is counting measure on $\mathbb{N}$ (i.e., the set function for which $m(A)$ is the cardinality of the set $A \subseteq \mathbb{N}$). Suppose that $f \in L_p(\mathbb{N}, 2^{\mathbb{N}}, m)$, where $p \in [1, \infty)$. Then,

\[ \|f\|_p = \Big( \int_{\mathbb{N}} |f|^p \, dm \Big)^{1/p} = \Big( \sum_{n=1}^\infty |f(n)|^p \Big)^{1/p}. \]

We see that $f$ in $L_p(\mathbb{N}, 2^{\mathbb{N}}, m)$ corresponds to the sequence $(f(n))_{n=1}^\infty$ in $\ell_p$. The same conclusion holds for $p = \infty$. Therefore,

\[ L_p(\mathbb{N}, 2^{\mathbb{N}}, m) = \ell_p, \quad 1 \le p \le \infty. \]
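The identification in Example 2.14 can be made concrete for a finitely supported function. The following is a minimal sketch under stated assumptions: the function is given as a dictionary of its nonzero values, and the helper names `counting_measure`, `integral_simple`, `lp_norm_via_measure`, and `lp_norm_via_series` are hypothetical. The integral of a nonnegative simple function is computed from level sets as $\sum_a a \cdot m(\{s = a\})$, with $m$ the cardinality, and the result is compared with the series formula for the p-norm.

```python
def counting_measure(A):
    """m(A) = cardinality of the finite set A."""
    return len(A)

def integral_simple(s):
    """Integral of a nonnegative simple function with respect to counting measure.

    `s` maps n -> s(n) on a finite support; s is taken to be 0 elsewhere.
    Computed as the sum over distinct values a of a * m({n : s(n) = a}).
    """
    level_sets = {}
    for n, a in s.items():
        level_sets.setdefault(a, set()).add(n)
    return sum(a * counting_measure(A) for a, A in level_sets.items())

def lp_norm_via_measure(f, p):
    """(integral of |f|^p dm)^(1/p), using the level-set integral above."""
    return integral_simple({n: abs(v) ** p for n, v in f.items()}) ** (1.0 / p)

def lp_norm_via_series(f, p):
    """(sum over n of |f(n)|^p)^(1/p), the p-norm of the sequence (f(n))."""
    return sum(abs(v) ** p for v in f.values()) ** (1.0 / p)

f = {1: 3.0, 2: -4.0, 3: 3.0}   # f(1) = 3, f(2) = -4, f(3) = 3, zero elsewhere
norm_measure = lp_norm_via_measure(f, 2.0)
norm_series = lp_norm_via_series(f, 2.0)
print(norm_measure, norm_series)
```

For this $f$ the level sets of $|f|^2$ are $\{1, 3\}$ (value 9) and $\{2\}$ (value 16), so both computations give $(9 \cdot 2 + 16 \cdot 1)^{1/2} = \sqrt{34}$.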

Let us now consider spaces of continuous functions.

Definition 2.15 Let $K$ be a compact metric space. We denote the collection of scalar-valued continuous functions on $K$ by $C(K)$. Define the supremum norm on $C(K)$ by

\[ \|f\|_\infty = \sup_{t \in K} |f(t)|, \quad f \in C(K). \]

The set $C(K)$ is a vector space under pointwise addition and scalar multiplication and is a Banach space when given the supremum norm. (See Theorem 2.27.) If we wish to emphasize the underlying scalar field, we will write $C_{\mathbb{R}}(K)$ or $C_{\mathbb{C}}(K)$. Observe that, for $f \in C(K)$, the quantity $\|f\|_\infty$ is actually the maximum of $|f|$, since a continuous function attains its supremum on compact sets.

Remark 2.16 We use the notation $\|\cdot\|_\infty$ to represent both the supremum norm on $C(K)$ (for a compact metric space $K$) and the essential supremum norm on $L_\infty(\mu)$ (for a measure space $(\Omega, \mathcal{A}, \mu)$). If there is any risk of confusion, we will write $\|\cdot\|_{C(K)}$ and $\|\cdot\|_{L_\infty(\mu)}$ to denote the norm on $C(K)$ and $L_\infty(\mu)$, respectively.

We wish to identify the dual space of $C(K)$. To that end, we consider the following example, where $K = [0, 1]$. In this case, we write $C(K) = C[0, 1]$.

Example 2.17 Consider the following linear functionals on $C[0, 1]$:

(a) (Integration) $f \mapsto \int_0^1 f(t)\, dt$.
(b) (Point evaluation) $f \mapsto f(s)$ for $s \in K$.
(c) (Integration against $L_1$ functions) $f \mapsto \int_0^1 f(t)\, g(t)\, dt$ for $g \in L_1(0, 1)$.

We use $L_1(0, 1)$ to denote the Banach space of $L_1$-functions on $[0, 1]$ with Lebesgue measure. Note that point evaluation (Example 2.17(b)) can be thought of as integration, by means of a Dirac measure:

\[ \delta_s(A) = \begin{cases} 1 & \text{if } s \in A, \\ 0 & \text{if } s \notin A. \end{cases} \tag{2.7} \]

For any $s \in [0, 1]$, the set function $\delta_s$ is a measure on $[0, 1]$, and

\[ f(s) = \int_{[0,1]} f(t)\, \delta_s(dt). \]

The measure $\delta_s$ is also known as the Dirac mass at $s$. A Dirac measure is an example of a singular measure, or a measure that is concentrated on a set of Lebesgue measure zero.

Example 2.17(c) provides a bounded linear functional on $C[0, 1]$ because

\[ \Big| \int_0^1 f(t)\, g(t)\, dt \Big| \le \sup_{t \in [0,1]} |f(t)| \cdot \int_0^1 |g(t)|\, dt = \|f\|_\infty \|g\|_1, \]

for $f \in C(K)$ and $g \in L_1(0, 1)$. This, too, can be realized as integration against a measure. Let

\[ \nu(A) = \int_A g(t)\, dt, \quad A \in \mathcal{B}, \]

where $\mathcal{B}$ denotes the collection of Borel measurable subsets of $[0, 1]$. Then

\[ \int_0^1 f(t)\, g(t)\, dt = \int_{[0,1]} f \, d\nu. \]

The linear functionals given in Example 2.17 are not all of the linear functionals on $C[0, 1]$ (as will be seen in Theorem 2.20, below), but they do give a hint to the true nature of $(C[0, 1])^*$. Before we identify this space, let us recall several definitions from measure theory.

Definition 2.18 Let $(K, \mathcal{B})$ be a measurable space and let $\mu$ be a measure on $K$. The total variation of $\mu$ is defined for all $A \in \mathcal{B}$ by

\[ |\mu|(A) = \sup \Big\{ \sum_{j \in F} |\mu(A_j)| : (A_j)_{j \in F} \text{ is a finite measurable partition of } A \Big\}. \]

We remark here that the measure $\nu$ defined in Example 2.17(c) has total variation

\[ |\nu|(A) = \int_A |g(t)|\, dt, \quad A \subseteq [0, 1] \text{ a Borel set}. \]

We leave the verification of this as an exercise. (See Exercise 2.4.)

Definition 2.19 Suppose $K$ is a compact metric space. We denote by $M(K)$ the space of all Borel measures on $K$ having finite total variation. This is a Banach space with the total variation norm, which is given by $\|\nu\|_M = |\nu|(K)$ for all $\nu \in M(K)$.

The next theorem, called the Riesz Representation Theorem, was proved by F. Riesz in 1909 [31], although not in this generality. It represented a milestone in analysis and identified the dual space of $C(K)$ as a space of measures on $K$.

Theorem 2.20 (Riesz Representation Theorem) Suppose $K$ is a compact metric space. The dual of $C(K)$ can be identified with the space $M(K)$. In particular, if $\varphi \in C(K)^*$, then there exists a Borel measure $\nu$ on $K$ such that

\[ \varphi(f) = \int_K f \, d\nu, \quad f \in C(K), \tag{2.8} \]

and $\|\varphi\| = |\nu|(K)$. Furthermore, all Borel measures on $K$ determine bounded linear functionals on $C(K)$ according to (2.8).

The Riesz Representation Theorem is a classical result in measure theory and, consequently, we will not prove it here. If the underlying scalar field is $\mathbb{R}$, then $M(K)$ consists of all bounded signed Borel measures on $K$. When the scalar field is $\mathbb{C}$, the measures are bounded complex measures, and in this case

\[ M(K) = \{ \mu + i\nu : \mu, \nu \text{ are signed (real-valued) measures} \}. \]

The Riesz Representation Theorem identifies $(C[0, 1])^*$ as the space $M[0, 1]$ of Borel measures on the unit interval. It is no surprise, then, that each of the bounded linear functionals in Example 2.17 turned out to be given by a measure on $[0, 1]$. We now give another example of a bounded linear functional on $C[0, 1]$.

Example 2.21 (the Cantor function) A ternary expansion for $x \in [0, 1]$ is an infinite series $x = \sum_{j=1}^\infty \delta_j(x)/3^j$, where $\delta_j(x) \in \{0, 1, 2\}$ for all $j \in \mathbb{N}$. The ternary expansion is said to terminate if the sequence $(\delta_j(x))_{j=1}^\infty$ has only finitely many nonzero terms; i.e., there exists an $N \in \mathbb{N}$ such that $\delta_j(x) = 0$ for all $j \ge N$. A given number may have two ternary expansions. For example, $1/3 = 1/3 + 0/3^2 + 0/3^3 + \cdots$, but also

\[ \frac{1}{3} = \sum_{j=2}^\infty \frac{2}{3^j} = \frac{0}{3} + \frac{2}{3^2} + \frac{2}{3^3} + \cdots. \]

(This equality is easily verified with a geometric series argument.) Despite the fact that $x \in (0, 1]$ may have multiple ternary expansions, it can be shown that $x$ has a unique ternary expansion $x = \sum_{j=1}^\infty \delta_j(x)/3^j$ that does not terminate.

We will define a map $G : [0, 1] \to [0, 1]$. If $x = 0$, then let $G(x) = 0$. If $x \in (0, 1]$, and $x$ has nonterminating ternary expansion $x = \sum_{j=1}^\infty \delta_j(x)/3^j$, then let

\[ G(x) = \begin{cases} \displaystyle \sum_{j=1}^\infty \frac{\delta_j(x)/2}{2^j} & \text{if } \delta_j(x) \ne 1 \text{ for any } j \in \mathbb{N}, \\[2ex] \displaystyle \sum_{j=1}^{N-1} \frac{\delta_j(x)/2}{2^j} + \frac{1}{2^N} & \text{if } \delta_N(x) = 1 \text{ and } \delta_j(x) \ne 1 \text{ for } j < N. \end{cases} \tag{2.9} \]

[0, 1], but $G'(x) = 0$ for almost every $x$. Let $C = \{x : \delta_j(x) \ne 1 \text{ for any } j \in \mathbb{N}\}$. This set, which is precisely the set on which $G$ is not locally constant, is known as the Cantor set. It can be shown that $C$ has Lebesgue measure zero. (See Exercise 2.16.) Since $G : [0, 1] \to [0, 1]$ is continuous, positive, and nondecreasing, there exists a measure $\mu_G$ on $[0, 1]$ such that

\[ \mu_G([a, b]) = G(b) - G(a), \quad 0 \le a < b \le 1. \]

The measure $\mu_G$, which we call Cantor measure, is a Borel probability measure on $[0, 1]$, and so the map

\[ f \mapsto \int_0^1 f \, d\mu_G, \quad f \in C[0, 1], \]

determines a bounded linear functional on $C[0, 1]$. The Cantor measure is an example of a singular measure, or a measure that is concentrated on a set of Lebesgue measure zero. (See Parts (d) and (e) of Exercise 2.16.) In fact, the Cantor measure is an example of a nonatomic (or diffuse) singular measure, which is a singular measure that has no atoms. A measurable set $E$ is called an atom for a measure $\mu$ if (i) $\mu(E) > 0$ and (ii) $\mu(F) = 0$ for any measurable subset $F$ of $E$ for which $\mu(E) > \mu(F)$.

Example 2.22 The sequence space $c$ can be viewed as a space of continuous functions on a compact metric space. In particular, $c = C(K)$, where $K = \mathbb{N} \cup \{\infty\}$ is the one-point compactification of the natural numbers. (See Sect. B.1 for the relevant definitions.) The space $K$ is topologically equivalent to the compact set $\{1/n : n \in \mathbb{N}\} \cup \{0\}$, viewed as a subspace of $\mathbb{R}$. (See Exercise 2.15.) By the Riesz Representation Theorem (Theorem 2.20), the dual space of $c$ is

\[ c^* = \ell_1(\mathbb{N} \cup \{\infty\}). \]

In other words, every bounded linear functional on $c$ corresponds to an element of $\ell_1(K)$. We can exhibit this correspondence explicitly. An arbitrary element in $\ell_1(K)$ is of the form $\xi = \big( (\xi_n)_{n=1}^\infty, \xi_\infty \big)$, where $(\xi_n)_{n=1}^\infty \in \ell_1(\mathbb{N})$ and $\xi_\infty \in \mathbb{K}$. The linear functional that corresponds to $\xi$ is a measure $\mu_\xi$ on $K$ defined by $\mu_\xi(\{n\}) = \xi_n$ for $n \in \mathbb{N} \cup \{\infty\}$. The total variation of $\mu_\xi$ is $\|\xi\|_{\ell_1(K)}$; that is,

\[ |\mu_\xi|(K) = \sum_{n=1}^\infty |\xi_n| + |\xi_\infty|. \]

This quantity is finite, by assumption. The dual action of $\mu_\xi$ on $c$ is given by integration:

\[ \int_K f \, d\mu_\xi = \sum_{n=1}^\infty f(n)\, \xi_n + \Big( \lim_{j \to \infty} f(j) \Big) \xi_\infty, \quad f \in c = C(K). \]

It is natural to wonder if there are any Banach spaces with trivial dual space. The Hahn–Banach Theorem, which we prove in Sect. 3, guarantees there are no such Banach spaces. In particular, we will prove the following. (See Theorem 3.9.)

Theorem 2.23 (Hahn–Banach Theorem) Suppose $X$ is a real Banach space and let $E$ be a linear subspace of $X$. If $\varphi \in E^*$, then there exists a bounded linear functional $\psi \in X^*$ such that $\|\psi\| = \|\varphi\|$ and $\psi(x) = \varphi(x)$ for all $x \in E$.

The Hahn–Banach Theorem implies there are no Banach spaces $X$ with $X^* = \{0\}$. Any one-dimensional subspace of $X$ will have non-trivial linear functionals, and the Hahn–Banach Theorem states that these can be extended to the entire space $X$.

2.3 Completeness in Function Spaces

In this section, we will prove that the function spaces of the previous section are complete in their given norms. The sequence spaces are left for the exercises. We begin by proving a very useful lemma.

Lemma 2.24 (Cauchy Summability Criterion) A normed space $X$ is complete if and only if every series $\sum_{n=1}^\infty x_n$ converges in $X$ whenever $\sum_{n=1}^\infty \|x_n\| < \infty$.

Proof First, suppose $X$ is complete in the norm $\|\cdot\|$ and suppose $\sum_{n=1}^\infty \|x_n\| < \infty$. For $n \in \mathbb{N}$, let $S_n = x_1 + \cdots + x_n$. Then $\|S_m - S_n\| \le \sum_{j=n+1}^m \|x_j\|$ for all $m > n$ in $\mathbb{N}$, and so $(S_n)_{n=1}^\infty$ is a Cauchy sequence. Therefore, since $X$ is complete, the series $\sum_{j=1}^\infty x_j$ converges.

Now suppose $\sum_{n=1}^\infty x_n$ converges whenever $\sum_{n=1}^\infty \|x_n\| < \infty$. Let $(y_n)_{n=1}^\infty$ be a Cauchy sequence in $X$. Pick an increasing sequence $(n_k)_{k \in \mathbb{N}}$ of natural numbers such that

\[ \|y_p - y_q\| < \frac{1}{2^k}, \quad \text{whenever } p > q \ge n_k. \]

Let $y_{n_0} = 0$. Then

\[ y_{n_k} = \sum_{j=1}^k \big( y_{n_j} - y_{n_{j-1}} \big), \quad k \in \mathbb{N}. \]

By construction, we have $\|y_{n_j} - y_{n_{j-1}}\| < \frac{1}{2^{j-1}}$ for all $j \ge 2$, and so we see that $\sum_{j=1}^\infty \|y_{n_j} - y_{n_{j-1}}\| < \infty$. It follows that the subsequence $(y_{n_k})_{k=1}^\infty$ converges to some element in $X$. Let $y$ be the limit of this subsequence.

Let $\varepsilon > 0$ be given. We may choose $K \in \mathbb{N}$ large enough so that $\|y_{n_K} - y\| < \varepsilon/2$ and $1/2^K < \varepsilon/2$. By definition, if $m > n_K$, then $\|y_m - y_{n_K}\| \le 1/2^K$. Thus,

\[ \|y_m - y\| \le \|y_m - y_{n_K}\| + \|y_{n_K} - y\| < \frac{1}{2^K} + \frac{\varepsilon}{2} < \varepsilon, \]

whenever $m > n_K$. Therefore, $(y_m)_{m=1}^\infty$ is a convergent sequence, as required. □

Theorem 2.25 Let $(\Omega, \Sigma, \mu)$ be a positive measure space. If $1 \le p < \infty$, then $L_p(\Omega, \mu)$ is a Banach space when given the norm

\[ \|f\|_p = \Big( \int_\Omega |f|^p \, d\mu \Big)^{1/p}, \quad f \in L_p(\Omega, \mu). \]

Proof We will make liberal use of the theorems in Appendix A. First, observe that $\|\cdot\|_p$ is a norm on $L_p(\Omega, \mu)$, by Minkowski's Inequality. To show that $\|\cdot\|_p$ is complete, we will use the Cauchy Summability Criterion (Lemma 2.24).

Assume $(f_k)_{k=1}^\infty$ is a sequence of functions in $L_p(\Omega, \mu)$ such that $\sum_{k=1}^\infty \|f_k\|_p < \infty$. Let $M = \sum_{k=1}^\infty \|f_k\|_p$. For each $n \in \mathbb{N}$, define $g_n$ on $\Omega$ by $g_n(\omega) = \sum_{k=1}^n |f_k(\omega)|$ for all $\omega \in \Omega$. Observe that $(g_n)_{n=1}^\infty$ is a sequence of nonnegative measurable functions and that $g_n \le g_{n+1}$ for each $n \in \mathbb{N}$. By Fatou's Lemma,

\[ \int_\Omega \liminf_{n \to \infty} |g_n|^p \, d\mu \le \liminf_{n \to \infty} \int_\Omega |g_n|^p \, d\mu = \liminf_{n \to \infty} \|g_n\|_p^p \le M^p < \infty. \]

It follows that $\liminf_{n \to \infty} |g_n|^p < \infty$ a.e.(μ), and consequently,

\[ \Big( \sum_{k=1}^\infty |f_k| \Big)^p = \lim_{n \to \infty} \Big( \sum_{k=1}^n |f_k| \Big)^p = \liminf_{n \to \infty} |g_n|^p < \infty \quad \text{a.e.}(\mu). \]

Consequently, the function $g = \sum_{k=1}^\infty |f_k|$ exists a.e.(μ) and $\|g\|_p \le M$.

Now let $f = \sum_{k=1}^\infty f_k$. It follows that $f$ exists a.e.(μ), because $|f(\omega)| \le g(\omega)$ for all $\omega \in \Omega$. Observe also that

\[ \Big| \sum_{k=1}^n f_k(\omega) \Big| \le g(\omega), \quad \omega \in \Omega. \]

Therefore, by the Lebesgue Dominated Convergence Theorem, $\sum_{k=1}^n f_k$ converges to $f$ in $L_p(\Omega, \mu)$, as required. □

Theorem 2.26 Let $(\Omega, \Sigma, \mu)$ be a positive measure space. The space $L_\infty(\Omega, \mu)$ is a Banach space when given the essential supremum norm $\|\cdot\|_\infty$.

Proof Again we use Lemma 2.24. Let $(f_k)_{k=1}^\infty$ be a sequence of functions in $L_\infty(\Omega, \mu)$ and let $M = \sum_{k=1}^\infty \|f_k\|_\infty$. Suppose $M < \infty$. By assumption, for each $k \in \mathbb{N}$, we have $|f_k| \le \|f_k\|_\infty$ a.e.(μ). Therefore, for each $k \in \mathbb{N}$, there exists a measurable set $N_k$ such that $\mu(N_k) = 0$ and $|f_k(\omega)| \le \|f_k\|_\infty$ for all $\omega \in \Omega \setminus N_k$. Let $N = \bigcup_{k=1}^\infty N_k$. Then $\mu(N) = 0$ (as a countable union of measure-zero sets) and $|f_k(\omega)| < \infty$ for all $\omega \in \Omega \setminus N$ and all $k \in \mathbb{N}$. Consequently, $f = \sum_{k=1}^\infty f_k$ exists a.e.(μ) and

\[ |f(\omega)| \le \sum_{k=1}^\infty |f_k(\omega)| \le \sum_{k=1}^\infty \|f_k\|_\infty = M, \]

for all $\omega \in \Omega \setminus N$.

We have determined that $f \in L_\infty(\Omega, \mu)$, but now we must show that $f$ is the limit of $\sum_{k=1}^n f_k$ in the essential supremum norm. Let $\varepsilon > 0$ be given. By assumption, $\sum_{k=1}^\infty \|f_k\|_\infty < \infty$. Therefore, there exists some $n_0 \in \mathbb{N}$ such that $\sum_{k=m}^\infty \|f_k\|_\infty < \varepsilon$ whenever $m \ge n_0$. Then

\[ \Big\| f - \sum_{k=1}^n f_k \Big\|_\infty \le \sum_{k=n+1}^\infty \|f_k\|_\infty < \varepsilon, \]

provided $n \ge n_0 - 1$. This completes the proof. □

Theorem 2.27 Let $K$ be a compact metric space. The space $C(K)$ of scalar-valued continuous functions on $K$ is a Banach space when given the supremum norm $\|\cdot\|_\infty$.

Proof The proof is similar to that of Theorem 2.26, with modifications related to continuity, and is left to the reader. (See Exercise 2.5.) □

Theorem 2.27 remains true if we replace $K$ with a locally compact Hausdorff space and consider the space $C_0(K)$ of continuous functions on $K$ vanishing at infinity. (See Sects. 5.1 and A.6 for the relevant definitions.)

When $K$ is a locally compact Hausdorff space, the dual space $C_0(K)^*$ is still $M(K)$, but in this case it denotes the Banach space of regular Borel measures on $K$ with the total variation norm. This is not inconsistent notation because, when $K$ is a compact metric space, $C_0(K) = C(K)$ and all finite Borel measures on $K$ are regular. Note that $M(K)$ is necessarily a Banach space as the dual space of a Banach space. (See Proposition 1.11.)

Exercises

Exercise 2.1 Show that $c_0$ is a closed subspace of $c$ and that $c$ is a closed subspace of $\ell_\infty$.

Exercise 2.2 Show that the dual of $c_0$ can be identified with $\ell_1$, and that the dual of $\ell_1$ can be identified with $\ell_\infty$.

Exercise 2.3 Let $(\Omega, \Sigma, \mu)$ be a positive finite measure space. Show that $L_1(\Omega, \mu)^*$ can be identified with $L_\infty(\Omega, \mu)$. (This completes the proof of Theorem 2.13.)

Exercise 2.4 Let $g \in L_1(0, 1)$, the Banach space of $L_1$-functions on $[0, 1]$ with Lebesgue measure, and define a measure on $[0, 1]$ by $\nu(A) = \int_A g(t)\,dt$, where $A$ is a Borel subset of $[0, 1]$. Show that $|\nu|(A) = \int_A |g(t)|\,dt$ for all Borel subsets $A$ of $[0, 1]$. (See Example 2.17(c) and the comments following it.)

Exercise 2.5 Let $K$ be a compact metric space. Prove that $C(K)$ is a Banach space when given the supremum norm. (That is, prove Theorem 2.27.)

Exercise 2.6 Let $\delta_0$ denote the linear functional on $C[0, 1]$ given by evaluation at 0. That is, $\delta_0(f) = f(0)$ for all $f \in C[0, 1]$. Show that $\delta_0$ is bounded on $C[0, 1]$ when equipped with the $\|\cdot\|_\infty$-norm, but not when equipped with the $\|\cdot\|_1$-norm.

Exercise 2.7 Use the theorems of Sect. 2.3 to prove that $\ell_p$ is complete in the $p$-norm for $1 \le p \le \infty$. (You may assume the theorems of Sect. 2.3 remain true for $\sigma$-finite measure spaces.)

Exercise 2.8 Verify that any Cauchy sequence in c0 (equipped with the supremum norm) converges to a limit in c0. Conclude that c0 is a Banach space.

Exercise 2.9 Prove that $c_0$ is not a Banach space in the $\|\cdot\|_2$-norm.

Exercise 2.10 Let $1 \le p < q$.

(a) Show that the norms $\|\cdot\|_p$ and $\|\cdot\|_q$ are equivalent on $\mathbb{R}^n$.
(b) Show that $\ell_p \subseteq \ell_q$, but $\ell_q$ is not a subset of $\ell_p$.

Exercise 2.11 Let $x \in \ell_r$ for some $r < \infty$. Show that $x \in \ell_p$ for all $p \ge r$ and prove that $\|x\|_p \to \|x\|_\infty$ as $p \to \infty$.

Exercise 2.12 Suppose $(\Omega, \mu)$ is a positive measure space and let $1 \le p < q$.

(a) Prove that if $\mu(\Omega) < \infty$, then $\|f\|_p \le C_{p,q} \|f\|_q$ for all measurable functions $f$, where $C_{p,q}$ is a constant that depends on $p$ and $q$.
(b) Show that the assumption $\mu(\Omega) < \infty$ cannot be omitted in (a).
(c) Find a real-valued function $f$ on $[0, 1]$ such that $\|f\|_p < \infty$ but $\|f\|_q = \infty$.

Exercise 2.13 Suppose $(\Omega, \mu)$ is a positive measure space such that $\mu(\Omega) = 1$.

(a) If 1 ≤ p

(a) What do the closed unit balls $B_{\ell_1^2}$, $B_{\ell_2^2}$, and $B_{\ell_\infty^2}$ represent geometrically?
(b) Let $a$ and $b$ be nonzero real numbers and define a function on $\mathbb{R}^2$ by

\[ \|(x, y)\|_E = \Big( \frac{x^2}{a^2} + \frac{y^2}{b^2} \Big)^{1/2}, \quad (x, y) \in \mathbb{R}^2. \]

Prove that $\|\cdot\|_E$ is a norm on $\mathbb{R}^2$ and identify geometrically the closed unit ball in $(\mathbb{R}^2, \|\cdot\|_E)$.

Exercise 2.15 Let $M$ be a metric space with subset $E$. A set $V$ is said to be open in the subspace topology on $E$ if there exists a set $U$ that is open in $M$ and $V = U \cap E$.

(a) Show that a closed subset of a complete metric space is a complete metric space.
(b) Show that $\mathbb{N}$, the set of natural numbers, is a locally compact metric space with the metric $d(x, y) = |x - y|$ for all $x$ and $y$ in $\mathbb{N}$. Conclude that the one-point compactification $\mathbb{N} \cup \{\infty\}$ of the natural numbers is a compact metric space. (See Appendix B.1 for the definition of the one-point compactification.)
(c) Show that the one-point compactification of the natural numbers $\mathbb{N} \cup \{\infty\}$ (from part (b)) is homeomorphic to $\{1/n : n \in \mathbb{N}\} \cup \{0\}$, where the latter set is given the subspace topology inherited from $\mathbb{R}$.
(d) Conclude that $c$ is a Banach space. (See Example 2.22.)

Exercise 2.16 Consider the interval $[0, 1]$. From this set, remove the open subinterval $(\frac{1}{3}, \frac{2}{3})$, the so-called middle third. This leaves the union of two closed intervals: $[0, \frac{1}{3}] \cup [\frac{2}{3}, 1]$. From each of these, again remove the middle third. What remains is the union of four closed intervals: $[0, \frac{1}{9}] \cup [\frac{2}{9}, \frac{1}{3}] \cup [\frac{2}{3}, \frac{7}{9}] \cup [\frac{8}{9}, 1]$. Once again, from each remaining set remove the middle third. Continue this process indefinitely to create Cantor's Middle Thirds Set. This set, which we denote $K$, can be written explicitly as follows:

\[ K = [0, 1] \setminus \bigcup_{n=1}^{\infty} \bigcup_{k=0}^{3^{n-1}-1} \Big( \frac{3k+1}{3^n}, \frac{3k+2}{3^n} \Big). \]

(a) Show that $K$ coincides with the Cantor set $C$ of Example 2.21. (Hint: You may wish to use the fact that every nonzero number has a unique nonterminating ternary expansion.)
(b) Evidently $K$ is not empty because it contains the endpoints of the middle third sets. Show that $K$ contains other numbers by showing that $\frac{1}{4} \in K$. (Hint: Consider the geometric series $\sum_{j=1}^{\infty} 3^{-kj}$, where $k \in \mathbb{N}$.)
(c) Show that $K$ is uncountable. (Hint: Use diagonalization and $K = C$.)
(d) Let $m$ be Lebesgue measure on $[0, 1]$. Show that $m(K) = 0$.
(e) Let $\mu_G$ be the Cantor measure from Example 2.21. Show that $\mu_G(K) = 1$.

Exercise 2.17 Let $K$ be Cantor's Middle Thirds Set from Exercise 2.16. After the first middle third is removed, two closed intervals remain. Call these two sets $E_{1,1}$ and $E_{1,2}$. After the middle thirds are removed from $E_{1,1}$ and $E_{1,2}$, there will remain four closed intervals. Label these sets $E_{2,1}$, $E_{2,2}$, $E_{2,3}$, $E_{2,4}$, ordering them from left to right, as they appear on the unit interval. After the process has been repeated $n$ times, there will remain $2^n$ closed intervals, each of length $\frac{1}{3^n}$. Label these sets $E_{n,1}, \ldots, E_{n,2^n}$, again from left to right, as they appear on the unit interval, so that $x_1 \in E_{n,k_1}$ and $x_2 \in E_{n,k_2}$ implies that $x_1 < x_2$ whenever $k_1 < k_2$.

\[ K = \bigcap_{n=1}^{\infty} \bigcup_{k=1}^{2^n} E_{n,k}. \]

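For each $n \in \mathbb{N}$ and each $k \in \{1, \ldots, 2^n\}$, let $E_{n,k} = [a_{n,k}, b_{n,k}]$ and define a function $G_n : [0, 1] \to [0, 1]$ as follows:

\[ G_n(x) = \begin{cases} \dfrac{b_{n,k} - x}{3^{-n}} \cdot \dfrac{k-1}{2^n} + \dfrac{x - a_{n,k}}{3^{-n}} \cdot \dfrac{k}{2^n} & \text{if } a_{n,k} \le x \le b_{n,k} \text{ for } k \in \{1, \ldots, 2^n\}, \\[2ex] \dfrac{k}{2^n} & \text{if } b_{n,k} < x < a_{n,k+1} \text{ for } k \in \{1, \ldots, 2^n - 1\}. \end{cases} \]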

Several theorems in functional analysis have been labeled as “the Hahn–Banach Theorem.” At the heart of all of them is what we call here the Hahn–Banach Extension Theorem, given in Theorem 3.4, below. This theorem is at the foundation of modern functional analysis, and its use is so pervasive that its importance cannot be overstated.

3.1 The Axiom of Choice

The Zermelo–Fraenkel Axioms (ZF) are a list of accepted statements upon which mathematics can be built. They form what is possibly the most common foundation of mathematics. When the system includes the Axiom of Choice, it is often denoted ZFC.

Axiom of Choice For any collection X of nonempty sets, there exists a choice function defined on X.

A choice function on a collection X of nonempty sets is a function $c$, defined on X, such that $c(A) \in A$ for every set $A$ in X.

The Axiom of Choice was formulated by Ernst Zermelo in 1904, and is usually accepted by mathematicians. Indeed, it is often used without realizing it. For example, consider the sequential characterization of continuity: Suppose $f$ is a continuous function. From a countable collection of open sets $(U_n)_{n=1}^{\infty}$ that decrease to a point $x_0$, we choose a sequence $(x_n)_{n=1}^{\infty}$ such that $x_n \in U_n$, and so $x_n \to x_0$ as $n \to \infty$. By the choice of $x_n$, and the continuity of $f$, it follows that $f(x_n) \to f(x_0)$ as $n \to \infty$, and so on. We are using the Axiom of Choice when we choose $x_n \in U_n$ for each $n \in \mathbb{N}$; but the Axiom of Choice is even stronger, allowing us to choose uncountably many points simultaneously, each one from a different set.

While the Axiom of Choice is easy to believe, accepting it results in some unexpected consequences.

© Springer Science+Business Media, LLC 2014 A. Bowers, N. J. Kalton, An Introductory Course in Functional Analysis, Universitext, DOI 10.1007/978-1-4939-1945-1_3

Banach–Tarski Paradox A three-dimensional solid sphere of radius 1 can be split into a finite number of disjoint pieces, and those pieces can be reassembled to form two solid spheres of radius 1.

The disjoint pieces will by necessity be non-measurable, and so difficult to imagine. The Banach–Tarski Paradox violates standard geometric intuition, and as a result has led some to question the Axiom of Choice. Despite these reservations, we will take the Axiom of Choice for granted.

Of interest to us is an alternate (but equivalent) formulation of the Axiom of Choice known as Zorn's Lemma.

Zorn's Lemma Suppose $(P, \le)$ is a partially ordered set. If every chain in $P$ has an upper bound, then $P$ contains a maximal element.

Let us recall the relevant definitions.

Definition 3.1 Let $P$ be a set. A relation $\le$ on $P$ is said to be a partial order if for every $a$, $b$, and $c$ in $P$:

(i) ($a \le b$ and $b \le c$) ⇒ $a \le c$,
(ii) ($a \le b$ and $b \le a$) ⇒ $a = b$,
(iii) $a \le a$.

A subset $C$ of $P$ is a chain in $P$ if $(C, \le)$ is totally ordered:

{a, b}⊆C ⇒ (a ≤ b or b ≤ a).

If $A \subseteq P$, then $b$ is called an upper bound for $A$ if $a \le b$ for all $a \in A$. An element $a \in P$ is called maximal if $a \le b$ implies $b = a$.

Naturally, we can define a reverse partial order $\ge$ in the obvious way, and reinterpret Zorn's Lemma in terms of lower bounds and minimal elements. Another (equivalent) formulation of the Axiom of Choice is the Hausdorff Maximality Principle.

Hausdorff Maximality Principle In a partially ordered set, every chain is contained in a maximal chain.

The Hausdorff Maximality Principle was formulated by Felix Hausdorff in 1914. Zorn's Lemma was proposed later, in 1935, by Max Zorn. Both are equivalent to the Axiom of Choice. A detailed discussion of the history of the Axiom of Choice and its equivalent formulations can be found in [26]. The formulation we will usually use is Zorn's Lemma.
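The definitions of chain and maximal element can be explored on a finite example. The following is a minimal sketch, assuming a hypothetical poset (the integers 1 through 12 ordered by divisibility) that does not appear in the text; the function names are ours. For a finite poset no form of the Axiom of Choice is needed, so this only illustrates the vocabulary, not the strength of Zorn's Lemma.

```python
from itertools import combinations

def divides(a, b):
    """Partial order on positive integers: a <= b means that a divides b."""
    return b % a == 0

def is_chain(subset, leq):
    """A chain is a subset in which any two elements are comparable."""
    return all(leq(a, b) or leq(b, a) for a, b in combinations(subset, 2))

def maximal_elements(poset, leq):
    """Elements a of the poset such that a <= b implies b == a."""
    return [a for a in poset if all(b == a for b in poset if leq(a, b))]

P = list(range(1, 13))                   # hypothetical poset: 1..12 under divisibility
print(is_chain([1, 2, 4, 8], divides))   # True
print(is_chain([2, 3], divides))         # False: 2 and 3 are not comparable
print(maximal_elements(P, divides))      # [7, 8, 9, 10, 11, 12]
```

Note that a maximal element need not be a largest element: here there are six maximal elements, none of which is an upper bound for all of P.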

3.2 Sublinear Functionals and the Extension Theorem

We briefly turn our attention from linear functionals to sublinear functionals.

Definition 3.2 Suppose $V$ is a vector space. A map $p : V \to \mathbb{R}$ is called a sublinear functional if

(i) $p(\alpha x) = \alpha p(x)$ for all $\alpha \ge 0$ and $x \in V$ (positive homogeneity), and
(ii) $p(x + y) \le p(x) + p(y)$ for all $x$ and $y$ in $V$ (subadditivity).

Note that condition (i) implies that $p(0) = 0$. Condition (ii) is also called the triangle inequality.

Example 3.3 The following are examples of sublinear functionals:

(a) Any linear functional is a sublinear functional.
(b) If $E$ is a normed space, then the norm function defined by $p(x) = \|x\|$ for $x \in E$ is a sublinear functional.
(c) If $C(K)$ is the Banach space of real-valued continuous functions on a compact metric space $K$, then
\[ p(f) = \max_{s \in K} f(s), \quad f \in C(K), \]
defines a sublinear functional.
(d) If $\ell_\infty$ is the Banach space of bounded sequences of real numbers, then
\[ p(\xi) = \sup_{n \in \mathbb{N}} \xi_n, \quad \xi = (\xi_n)_{n=1}^{\infty} \in \ell_\infty, \]
defines a sublinear functional.
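The two conditions of Definition 3.2 are easy to spot-check numerically. The sketch below is an illustration under our own assumptions: it works on $\mathbb{R}^5$ (a finite-dimensional stand-in for the spaces in Example 3.3(c) and (d)), tests only random samples, and so can refute sublinearity but never prove it. The helper names are hypothetical.

```python
import random

def p_max(x):
    """p(x) = max_k x_k, the finite-dimensional analogue of Example 3.3(c), (d)."""
    return max(x)

def is_sublinear_on_samples(p, dim=5, trials=1000, tol=1e-9, seed=0):
    """Check positive homogeneity and subadditivity of p on random samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-10, 10) for _ in range(dim)]
        y = [rng.uniform(-10, 10) for _ in range(dim)]
        a = rng.uniform(0, 5)                      # a nonnegative scalar
        if abs(p([a * t for t in x]) - a * p(x)) > tol:
            return False                           # positive homogeneity fails
        if p([s + t for s, t in zip(x, y)]) > p(x) + p(y) + tol:
            return False                           # subadditivity fails
    return True

print(is_sublinear_on_samples(p_max))   # the maximum passes every sampled check
print(is_sublinear_on_samples(min))     # the minimum is superadditive, so a check fails
print(p_max([-1, 1]), -p_max([1, -1]))  # 1 and -1: p_max(-x) differs from -p_max(x)
```

The last line shows that the maximum functional is sublinear without being linear, since a linear functional would satisfy $p(-x) = -p(x)$.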

Theorem 3.4 (Hahn–Banach Extension Theorem) Let $E$ be a real vector space and let $p$ be a sublinear functional on $E$. If $V$ is a subspace of $E$ and if $f : V \to \mathbb{R}$ is a linear functional such that

\[ f(x) \le p(x), \quad x \in V, \]

then there is an extension $\hat{f} : E \to \mathbb{R}$ of $f$ that is linear and satisfies

\[ \hat{f}(x) \le p(x), \quad x \in E. \]

Before proving the Hahn–Banach Extension Theorem, we will prove several preliminary lemmas. Lemma 3.5 A sublinear functional p on a real vector space E is linear if and only if p(−x) =−p(x) for all x ∈ E. Proof Certainly, if p is linear, then the conclusion follows. Suppose now that p(−x) =−p(x) for all x ∈ E. Then, by the triangle inequality,

p(x + y) =−p(−x − y) ≥−(p(−x) + p(−y)) = p(x) + p(y).

The reverse inequality is simply subadditivity of $p$. □

Let $P_E$ be the collection of all sublinear functionals on a real vector space $E$. We define an order on the set $P_E$ by saying $p \le q$ whenever $p(x) \le q(x)$ for all $x \in E$.

Lemma 3.6 Suppose $p \in P_E$ and $V$ is a linear subspace of a real vector space $E$. If $q$ is a sublinear functional on $V$ such that $q \le p|_V$, then there exists a sublinear functional $r \in P_E$ such that $r|_V = q$ and $r \le p$.

Proof Define

\[ r(x) = \inf \{ q(v) + p(x - v) : v \in V \}, \quad x \in E. \]

To show that $r$ is well defined, we show the set $\{ q(v) + p(x - v) : v \in V \}$ has a lower bound for all $x \in E$. Fix $x \in E$ and let $v \in V$. By assumption, $q(-v) \le p(-v)$, and so, by the subadditivity of $q$,

\[ 0 \le q(v) + q(-v) \le q(v) + p(-v). \]

It follows that $-p(-v) \le q(v)$. Next, using the subadditivity of $p$, we have $p(-v) \le p(-x) + p(x - v)$. Rearranging this inequality, and also using the fact that $-p(-v) \le q(v)$, we have

\[ -p(-x) \le -p(-v) + p(x - v) \le q(v) + p(x - v). \]

This is true for all $v \in V$, and so the set $\{ q(v) + p(x - v) : v \in V \}$ has a lower bound. Thus, the quantity $r(x)$ is well-defined for each $x \in E$.

It is clear that $r$ is positively homogeneous and that $r(x) \le p(x)$ for all $x \in E$ (by taking $v = 0$). We claim that also $r(x) = q(x)$ for each $x \in V$. To see this, suppose that $x \in V$. Then for all $v \in V$,

\[ q(v) + p(x - v) \ge q(v) + q(x - v) \ge q(v + (x - v)) = q(x). \]

Thus, $r(x) = \inf \{ q(v) + p(x - v) : v \in V \} \ge q(x)$. This infimum is achieved (when $v = x$), and so $r(x) = q(x)$.

It remains only to show that $r$ is subadditive on $E$. Let $x$ and $y$ be in $E$ and suppose $\epsilon > 0$. Pick $v$ and $w$ in $V$ so that

\[ q(v) + p(x - v) < r(x) + \frac{\epsilon}{2} \quad \text{and} \quad q(w) + p(y - w) < r(y) + \frac{\epsilon}{2}. \]

Then, by the subadditivity of $q$ and $p$,

\[ r(x + y) \le q(v + w) + p(x + y - v - w) \le q(v) + p(x - v) + q(w) + p(y - w) < r(x) + r(y) + \epsilon. \]

Since $\epsilon > 0$ was arbitrary, $r$ is subadditive and the proof is complete. □
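The infimum formula in the proof of Lemma 3.6 can be computed by hand in small cases. The following sketch uses a hypothetical example of our own choosing: $E = \mathbb{R}^2$, $V = \{(t, 0) : t \in \mathbb{R}\}$, $p(x) = |x_1| + |x_2|$, and $q((t, 0)) = t$. A direct calculation gives $r(x) = x_1 + |x_2|$ in this case, and the code approximates the infimum over a finite grid of values of $t$ (an assumption of the sketch; the true infimum is over all of $V$).

```python
def p(x):
    """Sublinear functional on R^2: the l_1 norm (hypothetical choice)."""
    return abs(x[0]) + abs(x[1])

def q(t):
    """q on V = {(t, 0)}: q((t, 0)) = t, so that q <= p on V."""
    return t

def r(x, grid):
    """Approximate r(x) = inf { q(v) + p(x - v) : v in V } over a grid of v = (t, 0)."""
    return min(q(t) + p((x[0] - t, x[1])) for t in grid)

grid = [k / 100 for k in range(-2000, 2001)]   # t ranges over [-20, 20]

for x in [(3.0, -2.0), (-3.0, 2.0), (1.5, 0.0)]:
    closed_form = x[0] + abs(x[1])             # value of r(x) computed by hand
    print(x, r(x, grid), closed_form, p(x))
```

For each sample point the grid value agrees with $x_1 + |x_2|$ up to rounding, it never exceeds $p(x)$, and on the point $(1.5, 0)$ of $V$ it equals $q$, in line with the conclusions $r \le p$ and $r|_V = q$.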

Lemma 3.7 If p ∈ PE, then there exists a minimal q ∈ PE such that q ≤ p.

Proof Let $P = \{ r \in P_E : r \le p \}$ and let $C = (r_i)_{i \in I}$ be a chain in $P$. For $x \in E$, let

\[ r(x) = \inf_{i \in I} r_i(x). \]

We claim that $r$ is a sublinear functional on $E$. First we must show that $r$ is well-defined. Let $i \in I$ and $x \in E$. By subadditivity, $0 \le r_i(x) + r_i(-x)$. Then, since $r_i \in P$,

\[ r_i(x) \ge -r_i(-x) \ge -p(-x). \]

Thus, the set $\{ r_i(x) : i \in I \}$ has a lower bound, and so $r$ is well-defined. Positive homogeneity is clear. Now we show subadditivity. Let $x$ and $y$ be elements of $E$. For any $\epsilon > 0$, there exist indices $i$ and $j$ in $I$ such that

\[ r_i(x) < r(x) + \frac{\epsilon}{2} \quad \text{and} \quad r_j(y) < r(y) + \frac{\epsilon}{2}. \]

Since C is a chain, ri and rj are comparable. Without loss of generality, assume rj ≤ ri . Then

r(x + y) ≤ rj (x + y) ≤ rj (x) + rj (y) ≤ ri (x) + rj (y).

Therefore $r(x + y) < r(x) + r(y) + \epsilon$ for every $\epsilon > 0$, and so $r$ is subadditive. By construction, $r \le r_i \le p$ for each $i \in I$, so $r \in P$ and $r$ is a lower bound of the chain $C$. Thus, by Zorn's Lemma, $P$ contains a minimal element $q$.

We claim that $q$ is actually minimal in $P_E$. Suppose $q_0$ is an element of $P_E$ such that $q_0 \le q$. Then $q_0 \le p$, and so $q_0 \in P$. It follows that $q_0$ is an element of $P$ such that $q_0 \le q$, and so $q_0 = q$ by the minimality of $q$ in $P$. This completes the proof. □

Lemma 3.8 If q ∈ PE is minimal, then q is linear. Proof By Lemma 3.5, it suffices to show q(−x) =−q(x) for all x ∈ E. Fix an x ∈ E and let V ={αx : α ∈ R}. Define a linear functional on V by

f (αx) =−αq(−x), α ∈ R.

If α<0, then f (αx) = q(αx), by the positive homogeneity of q. Suppose α ≥ 0. By the subadditivity of q, we have that 0 ≤ q(αx) + q(−αx), and so

f (αx) =−αq( − x) =−q(−αx) ≤ q(αx).

It follows that f ≤ q on the subspace V . Thus, by Lemma 3.6, there exists a sublinear functional r on E such that r ≤ q and r|V = f . Then r = q by the minimality of q, and so f = q|V . Therefore (taking α = 1), we have

q(x) = f (x) =−q(−x).

The choice of $x$ was arbitrary, and so we have the desired result. □

We are now ready to prove the Hahn–Banach Extension Theorem (Theorem 3.4).

Proof of the Hahn–Banach Extension Theorem By Lemma 3.6, there exists a sublinear functional $q$ on $E$ such that $q|_V = f$ and $q \le p$. By Lemma 3.7, there exists a minimal sublinear functional $q_0$ on $E$ such that $q_0 \le q$. By Lemma 3.8, the map $q_0$ is linear on $E$. Let $\hat{f} = q_0$. Then $\hat{f}$ is linear and $\hat{f} \le p$. We must show $\hat{f}|_V = f$. Since $\hat{f} \le q$ and $q|_V = f$, we have $\hat{f} \le f$ on $V$. Let $x \in V$. Then $\hat{f}(x) \le f(x)$. Furthermore, $\hat{f}(-x) \le f(-x)$. By linearity, we then have $\hat{f}(x) \ge f(x)$. It follows that $\hat{f}(x) = f(x)$ for all $x \in V$, as required. □

Theorem 3.9 (Hahn–Banach Theorem for real normed spaces) Suppose $X$ is a real normed vector space and let $V$ be a linear subspace of $X$. If $f \in V^*$, then there exists an extension $\hat{f} \in X^*$ such that $\hat{f}|_V = f$ and $\|\hat{f}\| = \|f\|$.

Proof Define $p(x) = \|f\| \|x\|$ for $x \in X$. Then $f \le p$ on $V$. (Note that $f$ is defined only on $V$.) Therefore, by the Hahn–Banach Extension Theorem (Theorem 3.4), there is a linear functional $\hat{f} \in X^*$ such that $\hat{f} \le p$ and $\hat{f}|_V = f$. It follows directly that $\|\hat{f}\| = \|f\|$. □

Example 3.10 Consider the space $c$ of real convergent sequences. The norm on this Banach space is $\|\xi\| = \sup_{n \in \mathbb{N}} |\xi_n|$, where $\xi = (\xi_n)_{n=1}^{\infty}$. We know that $c$ is a subspace of $\ell_\infty$, the space of all real bounded sequences. (See Exercise 2.1.) Define a linear functional $f : c \to \mathbb{R}$ by

\[ f(\xi) = \lim_{n\to\infty} \xi_n, \quad \xi = (\xi_n)_{n=1}^{\infty} \in c. \tag{3.1} \]

The map $f$ is bounded and $\|f\| = 1$. By Theorem 3.9 (the Hahn–Banach Theorem for real normed spaces), there exists a linear extension $\hat{f} : \ell_\infty \to \mathbb{R}$ such that $\hat{f}|_c = f$ and $\|\hat{f}\| = 1$.

By construction, $\hat{f} \in (\ell_\infty)^*$. We will now show that $\hat{f} \notin \ell_1$. (Note that, among other things, this shows that $\ell_1$ is not reflexive.) Suppose to the contrary that $\hat{f} \in \ell_1$. Then there exists a sequence of scalars $(\alpha_n)_{n=1}^{\infty}$ such that $\sum_{n=1}^{\infty} |\alpha_n| = 1$ and such that

\[ \hat{f}(\xi) = \sum_{n=1}^{\infty} \alpha_n \xi_n, \quad \xi = (\xi_n)_{n=1}^{\infty} \in \ell_\infty. \tag{3.2} \]

If $x$ is a scalar-valued sequence, then denote the $n$th coordinate of $x$ by $x(n)$. Let $e_m$ be the sequence with a 1 in the $m$th coordinate and zeros elsewhere, so that $e_m(m) = 1$ and $e_m(n) = 0$ if $m \ne n$. Certainly, we have $e_m \in c$ for every $m \in \mathbb{N}$. Thus, by (3.1),

\[ \hat{f}(e_m) = f(e_m) = \lim_{n\to\infty} e_m(n) = 0. \]

On the other hand, by (3.2),

\[ \hat{f}(e_m) = \sum_{n=1}^{\infty} \alpha_n e_m(n) = \alpha_m. \]

Consequently, $\alpha_m = 0$ for all $m \in \mathbb{N}$. This implies that $\hat{f} = 0$, which is a contradiction (because $\|\hat{f}\| = 1$). We conclude that $\hat{f} \notin \ell_1$.

Example 3.11 Consider again the space $c$ of real convergent sequences, this time with the sublinear functional $p(\xi) = \sup_{n \in \mathbb{N}} \xi_n$, where $\xi = (\xi_n)_{n=1}^{\infty}$. As in Example 3.10, let

\[ f(\xi) = \lim_{n\to\infty} \xi_n, \quad \xi = (\xi_n)_{n=1}^{\infty} \in c. \]

Then $f$ is a norm one linear functional on $c$. It is clear that $f \le p$ on $c$. Thus, by Theorem 3.4 (the Hahn–Banach Extension Theorem), there exists a linear functional $\tilde{f} : \ell_\infty \to \mathbb{R}$ such that $\tilde{f}|_c = f$ and $\tilde{f} \le p$. As was the case in Example 3.10, it can be shown that $\tilde{f} \notin \ell_1$.

It is worth noting that $\tilde{f}$ has an additional property: If $\xi$ is a bounded sequence of nonnegative real numbers, then $\tilde{f}(\xi) \ge 0$. To see this, suppose that $\xi = (\xi_n)_{n=1}^{\infty}$ is a bounded sequence such that $\xi_n \ge 0$ for all $n \in \mathbb{N}$. Then

\[ p(-\xi) = \sup_{n \in \mathbb{N}} (-\xi_n) \le 0. \]

By construction, $\tilde{f} \le p$ on $\ell_\infty$, and so $\tilde{f}(-\xi) \le 0$. Therefore, $\tilde{f}(\xi) \ge 0$, by the linearity of $\tilde{f}$.

As a simple extension, observe that, for sequences $\xi = (\xi_n)_{n=1}^{\infty}$ and $\eta = (\eta_n)_{n=1}^{\infty}$ in $\ell_\infty$, we have $\tilde{f}(\xi) \ge \tilde{f}(\eta)$ whenever $\xi_n \ge \eta_n$ for all $n \in \mathbb{N}$.

Remark 3.12 The existence of $\hat{f}$ and $\tilde{f}$ in the previous examples is guaranteed by the Hahn–Banach Theorem, but this relies on the Axiom of Choice. No formula exists for constructing $\hat{f}$ or $\tilde{f}$ and, in fact, no formula can exist. If the Axiom of Choice is replaced by a weaker assumption, then $\ell_\infty^* = \ell_1$. (See [36].) This means, for one thing, that any linear functional on $\ell_\infty$ which can be written explicitly must belong to $\ell_1$.

When an extension of a bounded linear functional is found using the Hahn–Banach Theorem, it is sometimes called a Hahn–Banach extension of the functional. The extensions $\hat{f}$ and $\tilde{f}$ in Examples 3.10 and 3.11 (respectively) are both Hahn–Banach extensions of the same bounded linear functional $f$. Hahn–Banach extensions are generally not unique, as the following example illustrates.

Example 3.13 Let $c$ be the space of convergent sequences and let $f(\xi) = \lim_{n\to\infty} \xi_n$ for all $\xi = (\xi_n)_{n=1}^{\infty}$ in $c$. Then $f$ is a linear functional on $c$ with $\|f\| = 1$. Now let $1_E = (0, 1, 0, 1, \ldots)$ be the sequence having 0 in each odd coordinate and 1 in each even coordinate. Let $X$ denote the subspace of $\ell_\infty$ generated by $c$ and $1_E$. (That is, let $X$ be the smallest subspace of $\ell_\infty$ that contains all sequences in $c$ and the sequence $1_E$.) We will extend the linear functional $f$ to $X$ in two ways:

\[ f_E(x) = \lim_{n\to\infty} x_{2n} \quad \text{and} \quad f_O(x) = \lim_{n\to\infty} x_{2n+1}, \]

where $x = (x_n)_{n=1}^{\infty}$ is a sequence in $X$. Observe that $f_E(\xi) = f(\xi) = f_O(\xi)$ for all sequences $\xi \in c$.

Each of the linear functionals $f_E$ and $f_O$ is bounded on $X$ and has norm one. By the Hahn–Banach Theorem, these bounded linear functionals can be extended to norm one linear functionals $f_E$ and $f_O$ (respectively) on $\ell_\infty$. Because

\[ f_E|_c = f_O|_c = f, \]

and both $f_E$ and $f_O$ have norm 1, they are both Hahn–Banach extensions of $f$ to $\ell_\infty$. It is easily seen, however, that $f_E$ and $f_O$ are distinct linear functionals on $\ell_\infty$, because $f_E(1_E) = 1$ and $f_O(1_E) = 0$.

In the preceding example, we found two distinct Hahn–Banach extensions for the bounded linear functional $f$ by partitioning $\mathbb{N}$ into two sets, namely the set of even numbers and the set of odd numbers. We can find further distinct Hahn–Banach extensions for $f$ by repeating the same argument using different partitions of $\mathbb{N}$.

When $p \in [1, \infty)$, we can actually write down an explicit formula for the linear functionals on $\ell_p$. Ultimately, this is because $\ell_p$ is a separable space when $p \in [1, \infty)$. The space $\ell_\infty$, however, is vast, and consequently $\ell_\infty^*$ is a “monster.” To get some perspective on the size of $\ell_\infty$, consider the set $S$ of all sequences of zeros and ones, which is certainly contained in $\ell_\infty$. Any two distinct elements $\xi$ and $\eta$ in $S$ must differ in at least one coordinate, and so $\|\xi - \eta\|_\infty = 1$. The size of $S$ is $|S| = 2^{\aleph_0}$, the size of the continuum (also denoted $\mathfrak{c}$), and so $\ell_\infty$ is far from separable.

Remark 3.14 (Separability and classical spaces) The comments above show that the space $\ell_\infty$ of bounded sequences is not a separable space. On the other hand, the space $\ell_p$ of $p$-summable sequences is separable when $p \in [1, \infty)$, because the countable set $\{e_k : k \in \mathbb{N}\}$ has dense linear span in $\ell_p$ (so the finite linear combinations with rational coefficients form a countable dense subset), where $e_k$ is the sequence with 1 in the $k$th coordinate and zero in every other coordinate.

The space $C[0, 1]$ of continuous functions on $[0, 1]$ is separable, because the linear span of the set $\{t^k : k \in \mathbb{N} \cup \{0\}\}$ is dense in $C[0, 1]$, by the Weierstrass Approximation Theorem. Consequently, the space $L_p(0, 1)$ of (equivalence classes of) $p$-integrable measurable functions on $[0, 1]$, where $p \in [1, \infty)$, is also separable, because $C[0, 1]$ is dense in this space, by Lusin's Theorem (Theorem A.36).

The space $L_\infty(0, 1)$ of essentially bounded measurable functions on $[0, 1]$ is not separable, however. Observe that {χ[0,x] :0

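Proof Let $X_{\mathbb{R}}$ be the underlying real Banach space (i.e., forget you can use complex scalars). Define $f_0 : V \to \mathbb{R}$ by $f_0(v) = \operatorname{Re}(f(v))$ for all $v \in V$. Then $f_0$ is a real linear functional on $V$. Furthermore, for all $v \in V$,

\[ |f_0(v)| \le |f(v)| \le \|f\| \|v\|, \]

and so $\|f_0\| \le \|f\|$. By Theorem 3.9 (the Hahn–Banach Theorem for real normed spaces), the linear functional $f_0$ has an extension $\hat{f}_0 : X_{\mathbb{R}} \to \mathbb{R}$ that is linear and such that $\|\hat{f}_0\| = \|f_0\| \le \|f\|$. Define

\[ \hat{f}(x) = \hat{f}_0(x) - i \hat{f}_0(ix), \quad x \in X. \tag{3.3} \]

Observe that if $v \in V$, then (since $f$ is complex-linear)

\[ f_0(iv) = \operatorname{Re}(f(iv)) = \operatorname{Re}(i f(v)) = -\operatorname{Im}(f(v)). \]

It follows that, for all $v \in V$,

\[ \hat{f}(v) = f_0(v) - i f_0(iv) = \operatorname{Re}(f(v)) + i \operatorname{Im}(f(v)) = f(v). \]

Therefore, $\hat{f}$ is an extension of $f$. By construction, $\hat{f}$ is $\mathbb{R}$-linear. To see that it is also $\mathbb{C}$-linear, put $ix$ into (3.3): For $x \in X$,

\[ \hat{f}(ix) = \hat{f}_0(ix) + i \hat{f}_0(x) = i \hat{f}(x). \]

It remains to show that $\|\hat{f}\| = \|f\|$. We know $\|\hat{f}\| \ge \|f\|$, since $\hat{f}$ is an extension of $f$. Suppose $x \in X$ with $\|x\| \le 1$. We know that $\hat{f}(x) \in \mathbb{C}$. Fix $\theta \in \mathbb{R}$ so that $e^{i\theta} \hat{f}(x) \in \mathbb{R}$. Then, by linearity, $\hat{f}(e^{i\theta} x) \in \mathbb{R}$. Therefore,

\[ \hat{f}(e^{i\theta} x) = \hat{f}_0(e^{i\theta} x), \]

and so

\[ |\hat{f}(x)| = |e^{i\theta} \hat{f}(x)| = |\hat{f}_0(e^{i\theta} x)| \le \|\hat{f}_0\| \|e^{i\theta} x\| \le \|f\| \|x\|. \]

Consequently, $\|\hat{f}\| \le \|f\|$, and the proof is complete. □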
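The key formula of the preceding proof, which rebuilds a complex-linear functional from its real part, can be checked on a concrete example. The sketch below assumes a hypothetical functional on $\mathbb{C}^3$ given by fixed coefficients of our own choosing; it does not perform any extension, and only verifies that $g(z) = g_0(z) - i g_0(iz)$ recovers $g$ when $g_0 = \operatorname{Re} g$.

```python
COEFFS = (2 + 1j, -1j, 0.5)   # hypothetical coefficients defining a functional on C^3

def g(z):
    """A complex-linear functional on C^3: g(z) = sum_k c_k z_k."""
    return sum(ck * zk for ck, zk in zip(COEFFS, z))

def g0(z):
    """The real part of g; this map is real-linear but not complex-linear."""
    return g(z).real

def g_rebuilt(z):
    """The formula from the proof: g0(z) - i * g0(i z)."""
    iz = [1j * zk for zk in z]
    return complex(g0(z), -g0(iz))

z = [1 + 2j, -3j, 4 - 1j]
print(g(z), g_rebuilt(z))                      # the two values agree up to rounding
print(g_rebuilt([1j * zk for zk in z]), 1j * g_rebuilt(z))   # complex linearity in i
```

The second line of output illustrates the step of the proof in which $ix$ is substituted into the defining formula.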

3.3 Banach Limits

In Example 3.11, we showed the existence of a bounded linear functional $L : \ell_\infty \to \mathbb{R}$ on the real Banach space $\ell_\infty$ such that $\|L\| = 1$ and $L\big((\xi_n)_{n=1}^{\infty}\big) \le \sup_{n \in \mathbb{N}} \xi_n$, and such that

\[ L\big((\xi_n)_{n=1}^{\infty}\big) = \lim_{n\to\infty} \xi_n, \tag{3.4} \]

whenever this limit exists. (We called the linear functional $\tilde{f}$ in Example 3.11.) In this section, we will show the existence of bounded linear functionals $L$ that satisfy an additional property, called shift-invariance:

\[ L\big((\xi_n)_{n=1}^{\infty}\big) = L\big((\xi_{n+1})_{n=1}^{\infty}\big). \tag{3.5} \]

Shift-invariance, together with the other properties mentioned above, leads to the following inequality:

\[ \liminf_{n\to\infty} \xi_n \le L\big((\xi_n)_{n=1}^{\infty}\big) \le \limsup_{n\to\infty} \xi_n, \tag{3.6} \]

for all $(\xi_n)_{n=1}^{\infty}$ in $\ell_\infty$. Notice that (3.6) implies (3.4). Linear functionals that satisfy (3.6) are of interest because they generalize the notion of limits.

We define the shift operator $T : \ell_\infty \to \ell_\infty$ by

\[ T\big((\xi_n)_{n=1}^{\infty}\big) = (\xi_{n+1})_{n=1}^{\infty}, \quad (\xi_n)_{n=1}^{\infty} \in \ell_\infty. \]

Given this definition, we can restate the shift-invariance property in terms of the shift operator $T$:

\[ L(T\xi) = L(\xi), \quad \xi \in \ell_\infty. \tag{3.7} \]

When $L$ satisfies this equation, we say that $L$ is invariant under the shift operator, or is shift-invariant.

We now prove a form of the Hahn–Banach Theorem for sublinear functionals which are invariant under some collection of linear maps.

Theorem 3.16 (Invariant Hahn–Banach Theorem) Let $V$ be a real vector space and suppose that $\mathcal{T}$ is a commutative collection of linear maps on $V$ (i.e., $ST = TS$ for all $S$ and $T$ in $\mathcal{T}$). If $p$ is a sublinear functional on $V$ such that

\[ p(Tx) \le p(x), \quad x \in V, \ T \in \mathcal{T}, \]

then there exists a linear functional $f$ on $V$ such that $f \le p$ and

\[ f(Tx) = f(x), \quad x \in V, \ T \in \mathcal{T}. \]

Before proving Theorem 3.16, let us consider some situations where it can be used.

Example 3.17 In many cases, we will have a set $\mathcal{T}$ containing only one map, and so the commutativity assumption will be satisfied trivially. For example, consider the real Banach space $\ell_\infty$ and let $T : \ell_\infty \to \ell_\infty$ be the shift operator defined earlier in this section. The following functions are sublinear functionals on $\ell_\infty$ that satisfy the hypotheses of Theorem 3.16 with $\mathcal{T} = \{T\}$:

(i) $p(\xi) = \sup_{n \in \mathbb{N}} \xi_n$,
(ii) $p(\xi) = \limsup_{n\to\infty} \xi_n$, and
(iii) $p(\xi) = \sup_{n \in \mathbb{N}} |\xi_n|$,

where $\xi = (\xi_n)_{n=1}^{\infty} \in \ell_\infty$. By Theorem 3.16, each one of these sublinear functionals will lead to a shift-invariant bounded linear functional on $\ell_\infty$.

Proof of Theorem 3.16 Consider the collection C of sublinear functionals $q$ such that $q \le p$ and such that $q(Tx) \le q(x)$ for all $x \in V$ and $T \in \mathcal{T}$. We will use Zorn's Lemma to show there exists a minimal element $q \in$ C.

Suppose $(q_i)_{i \in I}$ is a chain in C and let $q = \inf_{i \in I} q_i$. Then $q$ is a sublinear functional on $V$. (See the proof of Lemma 3.7.) Let $T \in \mathcal{T}$ and $x \in V$. Then, by the definition of $q$,

q(Tx) ≤ qi (Tx), i ∈ I.

We assumed qi ∈ C, and hence qi (Tx) ≤ qi (x), for each i ∈ I. It therefore follows that q(Tx) ≤ qi (x) for all i ∈ I. Consequently,

\[ q(Tx) \le \inf_{i \in I} q_i(x) = q(x), \quad x \in V, \ T \in \mathcal{T}. \]

We conclude that $q \in$ C and $q$ is a lower bound for the chain $(q_i)_{i \in I}$. Thus, any chain in C has a lower bound. Therefore, C contains a minimal element, by Zorn's Lemma.

Now let $q$ be a minimal element of C and let $T \in \mathcal{T}$. Let $n \in \mathbb{N}$ and define the $n$th Cesàro mean by

\[ q_n(x) = q\Big( \frac{x + Tx + \cdots + T^{n-1}x}{n} \Big), \quad x \in V. \]

Note that $q_n$ is a sublinear functional, because $q$ is sublinear and $T$ is linear. We wish to show that $q_n \in$ C. Suppose $S \in \mathcal{T}$. By assumption, $S$ and $T$ commute, and so

\[ q_n(Sx) = q\Big( \frac{Sx + TSx + \cdots + T^{n-1}Sx}{n} \Big) = q\Big( S\Big( \frac{x + Tx + \cdots + T^{n-1}x}{n} \Big) \Big) \le q\Big( \frac{x + Tx + \cdots + T^{n-1}x}{n} \Big) = q_n(x). \]

Thus, $q_n \in$ C for each $n \in \mathbb{N}$. Observe that, since $q \in$ C, we have

\[ q(T^{n-1}x) \le q(T^{n-2}x) \le \cdots \le q(T^2 x) \le q(Tx) \le q(x), \]

for all $x \in V$. Consequently,

\[ q_n(x) = q\Big( \frac{x + Tx + \cdots + T^{n-1}x}{n} \Big) \le q\Big( \frac{x}{n} \Big) + \cdots + q\Big( \frac{x}{n} \Big) = q(x), \]

for all $x \in V$. By the minimality of $q$ in C, it follows that $q_n = q$ for all $n \in \mathbb{N}$, and hence

\[ q(x) = q\Big( \frac{x + Tx + \cdots + T^{n-1}x}{n} \Big), \]

for all $x \in V$, $T \in \mathcal{T}$, and $n \in \mathbb{N}$.

Now, let $x \in V$ and $T \in \mathcal{T}$. For all $n \in \mathbb{N}$,

\[ q(x - Tx) = q\Big( \frac{(x - Tx) + T(x - Tx) + \cdots + T^{n-1}(x - Tx)}{n} \Big) = q\Big( \frac{x - T^n x}{n} \Big) \le \frac{1}{n} q(x) + \frac{1}{n} q(-T^n x) \le \frac{1}{n} q(x) + \frac{1}{n} q(-x). \]

Since this is true for all $n \in \mathbb{N}$, we conclude that $q(x - Tx) \le 0$ for all $x \in V$ and $T \in \mathcal{T}$. By a similar argument, we also deduce that $q(-x - T(-x)) \le 0$ for all $x \in V$ and $T \in \mathcal{T}$.

By Lemmas 3.7 and 3.8, there exists a linear functional $f$ such that $f \le q$. It follows that both $f(x - Tx) \le 0$ and $f(-x - T(-x)) \le 0$ for all $x \in V$ and $T \in \mathcal{T}$. Therefore, by the linearity of $f$, we have $f(x) = f(Tx)$ for all $x \in V$ and $T \in \mathcal{T}$. This is the desired result, and so the proof is complete. □
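The telescoping step in the proof above, where the Cesàro mean of $x - Tx$ collapses to $(x - T^n x)/n$, can be checked coordinate by coordinate when $T$ is the shift operator. The sketch below is an illustration under our own conventions: sequences are represented as Python functions of the index $n \ge 1$, and the names `shift` and `cesaro` are hypothetical helpers.

```python
import math

def shift(xi):
    """Shift operator T: (T xi)_n = xi_{n+1}."""
    return lambda n: xi(n + 1)

def cesaro(xi, n):
    """The sequence (xi + T xi + ... + T^{n-1} xi) / n, evaluated coordinatewise."""
    return lambda m: sum(xi(m + j) for j in range(n)) / n

xi = lambda n: math.sin(n)                 # a bounded sequence (hypothetical choice)
diff = lambda n: xi(n) - shift(xi)(n)      # the sequence xi - T xi

# Telescoping: the Cesaro mean of xi - T xi equals (xi - T^n xi) / n.
n, m = 10, 3
print(cesaro(diff, n)(m), (xi(m) - xi(m + n)) / n)

# Cesaro means of the alternating sequence (0, 1, 0, 1, ...) approach 1/2.
one_E = lambda k: 1.0 if k % 2 == 0 else 0.0
print(cesaro(one_E, 1000)(1))              # 0.5
```

Since the coordinates of $(x - T^n x)/n$ are bounded by $2\|x\|_\infty / n$, the averaged sequence tends to zero uniformly, which is the mechanism behind the estimate $q(x - Tx) \le 0$.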

Example 3.17 (revisited). Let $T : \ell_\infty \to \ell_\infty$ be the shift operator on the real Banach space $\ell_\infty$ and consider the sublinear functional $p$ on $\ell_\infty$ defined by $p(\xi) = \sup_{n \in \mathbb{N}} \xi_n$ for all $\xi = (\xi_n)_{n=1}^{\infty}$ in $\ell_\infty$. Observe that $p(T\xi) \le p(\xi)$ for all $\xi \in \ell_\infty$. We now invoke Theorem 3.16 with $\mathcal{T} = \{T\}$ to conclude that there exists a shift-invariant linear functional $L$ on $\ell_\infty$ such that $L \le p$; that is, $L(\xi) = L(T\xi)$ for all $\xi \in \ell_\infty$. We claim that the map $L$ has the following properties, where $\xi = (\xi_n)_{n=1}^{\infty} \in \ell_\infty$:

(i) $\liminf_{n\to\infty} \xi_n \le L(\xi) \le \limsup_{n\to\infty} \xi_n$,
(ii) $L(\xi) = \lim_{n\to\infty} \xi_n$ whenever the limit exists, and
(iii) $L(\xi) \ge 0$ whenever $\xi_n \ge 0$ for all $n \in \mathbb{N}$.

Observe that (i) implies both (ii) and (iii). To show that (i) is true, we use the invariance of $L$ under the shift operator. If $k \in \mathbb{N}$, then (by shifting $k$ times)

\[ L(\xi) = L\big((\xi_{n+k})_{n=1}^{\infty}\big) \le \sup_{n \in \mathbb{N}} \xi_{n+k} = \sup_{n > k} \xi_n, \quad \xi \in \ell_\infty. \]

This is true for all $k \in \mathbb{N}$, and so we conclude that $L(\xi) \le \limsup_{n\to\infty} \xi_n$. A similar argument shows $\liminf_{n\to\infty} \xi_n \le L(\xi)$, which proves (i).

Motivated by the preceding example, we make a definition.

Definition 3.18 A linear functional $L$ on $\ell_\infty$ is called a Banach limit if, for any sequence $\xi = (\xi_n)_{n=1}^{\infty}$ in $\ell_\infty$,

(i) $L(T\xi) = L(\xi)$, where $T$ is the shift operator,
(ii) $L(\xi) = \lim_{n\to\infty} \xi_n$ whenever the limit exists, and
(iii) $L(\xi) \ge 0$ whenever $\xi_n \ge 0$ for all $n \in \mathbb{N}$.

Shortly, we will present an application of a Banach limit. First, we recall some definitions.

Definition 3.19 A real vector space $H$ is called a real inner product space if there is a map $(\cdot, \cdot) : H \times H \to \mathbb{R}$, called an inner product, that satisfies the following properties:

(i) $(x, x) \ge 0$ for all $x \in H$, and $(x, x) = 0$ if and only if $x = 0$,
(ii) $(x, y) = (y, x)$, and
(iii) $(\alpha x + \beta x', y) = \alpha (x, y) + \beta (x', y)$ and $(x, \alpha y + \beta y') = \alpha (x, y) + \beta (x, y')$,

where $\{x, x', y, y'\} \subseteq H$ and $\{\alpha, \beta\} \subseteq \mathbb{R}$.

When a map possesses the three properties listed in Definition 3.19, it is called (i) positive definite, (ii) symmetric, and (iii) bilinear, respectively. The concept of an inner product space exists also when the underlying scalar field is $\mathbb{C}$, but the properties defining an inner product must be modified in this setting. (See Definition 7.1.)

An inner product $(\cdot, \cdot)$ on a vector space $H$ can always be used to define a norm by the formula

\[ \|x\| = \sqrt{(x, x)}, \quad x \in H. \]

This norm on $H$ is said to be induced by the inner product. (We will verify that this formula defines a norm in Section 7.1.)

Definition 3.20 A real inner product space $H$ is called a real Hilbert space if it is a complete normed space when given the norm induced by the inner product.

If $H$ is an inner product space with induced norm $\|\cdot\|$, then $H$ is a Hilbert space precisely when $(H, \|\cdot\|)$ is a Banach space. We will study the topic of Hilbert spaces in greater depth in Chapter 7.

Suppose $H$ is a real Hilbert space with inner product $(\cdot, \cdot)$. A bounded linear map $S : H \to H$ is said to be an orthogonal operator if $(Sx, Sy) = (x, y)$ for all $x$ and $y$ in $H$. Note that $S$ is an orthogonal operator if and only if $\|S^n x\| = \|x\|$ for all $n \in \mathbb{N}$ and $x \in H$. (See Exercise 3.3.) In particular, if $n = 1$, then $\|Sx\| = \|x\|$, and so $S$ is necessarily bounded.

A bounded linear map $S : H \to H$ is said to be similar to an orthogonal operator if it is orthogonal with respect to an equivalent inner product on $H$. (Two inner products on $H$ are equivalent if they induce equivalent norms.)

We will make use of the following significant fact: If $H$ is an inner product space, then $|(x, y)| \le \|x\| \|y\|$ for all $x$ and $y$ in $H$. This inequality, known as the Cauchy–Schwarz Inequality, is a fundamental tool. We will make use of it now, but will not prove it until later.
(See Theorem 7.2.)

Proposition 3.21 Let $H$ be a real Hilbert space and suppose $\|\cdot\|$ is the complete norm induced by the inner product on $H$. If $S : H \to H$ is an invertible bounded linear map and there exist positive constants $c$ and $C$ such that

\[ c \|x\| \le \|S^n x\| \le C \|x\|, \tag{3.8} \]

for all $n \in \mathbb{N}$ and $x \in H$, then $S$ is similar to an orthogonal operator.

Proof Denote the inner product on $H$ by $(\cdot, \cdot)$. By the Cauchy–Schwarz Inequality,

\[ |(S^n x, S^n y)| \le \|S^n x\| \|S^n y\| \le C^2 \|x\| \|y\|, \]

for $n \in \mathbb{N}$ and $\{x, y\} \subseteq H$. It follows that $\big((S^n x, S^n y)\big)_{n \in \mathbb{N}}$ is a sequence in $\ell_\infty$ for each $x$ and $y$ in $H$. Let $L$ be a Banach limit on $\ell_\infty$. Define a new inner product on $H$ by

\[ \langle x, y \rangle = L\Big( \big((S^n x, S^n y)\big)_{n \in \mathbb{N}} \Big), \quad \{x, y\} \subseteq H. \]

(See Exercise 3.15 to show that this defines a real inner product on $H$.) If $x \in H$ and $|||x||| = \sqrt{\langle x, x \rangle}$, then $||| \cdot |||$ is a norm on $H$ and

\[ |||x|||^2 = \langle x, x \rangle = L\Big( \big(\|S^n x\|^2\big)_{n \in \mathbb{N}} \Big). \]

By (3.8) and property (iii) of a Banach limit (together with property (ii) applied to constant sequences),

\[ c \|x\| \le |||x||| \le C \|x\|, \quad x \in H. \]

Consequently, $||| \cdot |||$ and $\|\cdot\|$ are equivalent norms on $H$, and so $\langle \cdot, \cdot \rangle$ and $(\cdot, \cdot)$ are equivalent inner products on $H$. By the shift-invariance of a Banach limit (property (i)),

\[ \langle Sx, Sy \rangle = L\Big( \big((S^{n+1} x, S^{n+1} y)\big)_{n \in \mathbb{N}} \Big) = \langle x, y \rangle, \quad \{x, y\} \subseteq H. \]

Therefore, $S$ is orthogonal with respect to $\langle \cdot, \cdot \rangle$, where $\langle \cdot, \cdot \rangle$ is an inner product equivalent to the original. Hence, $S$ is similar to an orthogonal operator. □
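The construction in the proof of Proposition 3.21 can be imitated numerically in $\mathbb{R}^2$. The sketch below rests on several assumptions of our own: the operator $S = A R A^{-1}$ (with $A = \operatorname{diag}(2, 1)$ and $R$ a rotation) is a hypothetical example satisfying (3.8) with $c = 1/2$ and $C = 2$, and a finite Cesàro average is used as a computable stand-in for the Banach limit $L$. A finite average is not a Banach limit, so this only illustrates why shift-invariance forces $\langle Sx, Sy \rangle = \langle x, y \rangle$.

```python
import math

theta = 1.0
c, s = math.cos(theta), math.sin(theta)
# S = A R A^{-1} with A = diag(2, 1) and R the rotation by theta (hypothetical example).
S = ((c, -2.0 * s), (0.5 * s, c))

def apply(M, v):
    """Apply the 2x2 matrix M to the vector v."""
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

def dot(u, v):
    """The standard inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

def averaged_inner(x, y, N=20000):
    """(1/N) * sum_{n=1}^{N} (S^n x, S^n y): a finite Cesaro average,
    used here in place of the Banach limit of the sequence ((S^n x, S^n y))_n."""
    total = 0.0
    for _ in range(N):
        x, y = apply(S, x), apply(S, y)
        total += dot(x, y)
    return total / N

x, y = (1.0, 0.0), (1.0, 1.0)
print(dot(apply(S, x), apply(S, x)), dot(x, x))   # S is not orthogonal for the standard inner product
print(averaged_inner(apply(S, x), apply(S, y)), averaged_inner(x, y))   # nearly equal
```

The two averaged values differ by exactly $\big((S^{N+1}x, S^{N+1}y) - (Sx, Sy)\big)/N$, which is at most $2C^2 \|x\| \|y\| / N$ in absolute value; this is the finite counterpart of the shift-invariance used in the proof.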

3.4 Haar Measure for Compact Abelian Groups

In this section, we will apply the Hahn–Banach Theorem to the setting of compact groups. A group is a pair $(G, \cdot)$, where $G$ is a set and $\cdot$ is a binary operation on $G$, called multiplication, that satisfies the following properties:

(i) (closure) $x \cdot y \in G$ for all $\{x, y\} \subseteq G$.
(ii) (associativity) $(x \cdot y) \cdot z = x \cdot (y \cdot z)$ for all $\{x, y, z\} \subseteq G$.
(iii) (identity) There exists an element $e \in G$ such that $x \cdot e = x = e \cdot x$ for all $x \in G$.
(iv) (inverses) For $x \in G$, there exists $x^{-1} \in G$ such that $x \cdot x^{-1} = e = x^{-1} \cdot x$.

Properties (i)–(iv) are known as the group axioms. When the multiplication is understood, the group $(G, \cdot)$ is often abbreviated to $G$. Frequently, group multiplication is denoted by juxtaposition, so that $x \cdot y$ is written $xy$. We will adopt this convention when there is no risk of confusion.

A simple calculation shows that, for a given $x \in G$, the inverse $x^{-1}$ is necessarily unique. Therefore, the map $x \mapsto x^{-1}$ is a well-defined operation on $G$ (called inversion).

We call $G$ a metric group if it is both a group and a metric space, and if the group operations of multiplication and inversion are continuous on $G$; that is, if both the maps $(x, y) \mapsto xy$ and $x \mapsto x^{-1}$ are continuous for $x$ and $y$ in $G$. If a metric group $G$ is also a compact topological space, then $G$ is called a compact metric group.

Example 3.22 The following are examples of compact metric groups:

(i) The unit circle in $\mathbb{C}$, written $\mathbb{T} = \{e^{i\theta} : 0 \leq \theta < 2\pi\}$, is a compact group, often called the circle group or the torus. Multiplication in $\mathbb{T}$ is taken from $\mathbb{C}$, and so is the metric. Consequently, the identity in $\mathbb{T}$ is $1$ and the inverse of $e^{i\theta}$ is $e^{-i\theta}$. The punctured complex plane $\mathbb{C} \setminus \{0\}$ with the standard multiplication and metric is itself a metric group, but it is not compact. (Nor is it complete, because $\mathbb{C} \setminus \{0\}$ is an open subset of $\mathbb{C}$.)

(ii) Let $O_n$ denote the collection of $n \times n$ orthogonal matrices ($n < \infty$). Then $O_n$ is a compact metric group, called the orthogonal group. The group operations are given by matrix multiplication and matrix inversion. The metric on $O_n$ is induced by the operator norm $||| \cdot |||$ on $O_n$. That is, $d(X, Y) = |||Y - X|||$ for all $X$ and $Y$ in $O_n$.

(iii) The Cantor group is the countable product $\prod_{n=1}^{\infty} \mathbb{Z}_2 = \{0, 1\}^{\mathbb{N}}$. Elements of the Cantor group are sequences of zeros and ones, and the group operation is given by component-wise addition (mod 2). Note that the Cantor group is a compact space by Tychonoff's Theorem. (See Theorem B.4 in the appendix.) The Cantor group can be given a metric using the formula

$$d(x, y) = \sum_{k=1}^{\infty} \frac{1}{2^k} \, \frac{|x_k - y_k|}{1 + |x_k - y_k|},$$

where $x = (x_k)_{k=1}^{\infty}$ and $y = (y_k)_{k=1}^{\infty}$ are elements of the set $\{0, 1\}^{\mathbb{N}}$.

Of particular interest in this section are abelian groups. A group $G$ is called abelian if $xy = yx$ for all $x$ and $y$ in $G$. (In other words, if the group multiplication is commutative.) Both the torus and the Cantor group are abelian, but the orthogonal group is not if $n \geq 2$. When a group is abelian, the group multiplication is often denoted by addition ($+$) and the inverse $x^{-1}$ is then written as $-x$ (provided this causes no confusion).

Definition 3.23 Let $(G, +)$ be an abelian metric group. A Borel measure $\lambda$ on $G$ is translation-invariant if $\lambda(B) = \lambda(x + B)$ for all $x \in G$ and Borel subsets $B \subseteq G$.

Theorem 3.24 If $(G, +)$ is a compact abelian metric group, then there is a unique translation-invariant Borel probability measure on $G$.

Proof Let $C(G)$ be the space of real-valued continuous functions on $G$ equipped with the norm

$$\|f\|_\infty = \max_{x \in G} |f(x)|, \qquad f \in C(G).$$

Note that this maximum is attained because $f$ is continuous on the compact set $G$.

For each $x \in G$, define an operator $T_x : C(G) \to C(G)$ by $T_x f(y) = f(x + y)$ for all $f \in C(G)$ and $y \in G$. Let $\mathcal{T} = \{T_x : x \in G\}$. The set $\mathcal{T}$ is a commuting family of operators on $C(G)$ because $G$ is abelian. We call the elements of $\mathcal{T}$ rotations.

Define a sublinear functional $p$ on $C(G)$ by

$$p(f) = \max_{x \in G} f(x), \qquad f \in C(G).$$

Certainly, $p(T_x f) = p(f)$ for all $x \in G$ and $f \in C(G)$, and so $p$ is invariant under rotations. Thus, by Theorem 3.16 (the Invariant Hahn–Banach Theorem), there exists a linear functional $\phi$ on $C(G)$ such that $\phi \leq p$ and $\phi(T_x f) = \phi(f)$ for all $x \in G$ and $f \in C(G)$.

Let $f \in C(G)$. By construction,

$$\phi(f) \leq \max_{x \in G} f(x) \quad \text{and} \quad \phi(-f) \leq \max_{x \in G} \big({-f(x)}\big) = -\min_{x \in G} f(x).$$

Consequently,

$$\min_{x \in G} f(x) \leq \phi(f) \leq \max_{x \in G} f(x). \qquad (3.9)$$

In particular, we see $|\phi(f)| \leq \|f\|_\infty$ for all $f \in C(G)$. It follows that $\phi$ is a bounded linear functional on $C(G)$. Thus, by Theorem 2.20 (the Riesz Representation Theorem), there exists a Borel measure $\lambda$ on $G$ such that

$$\phi(f) = \int_G f \, d\lambda, \qquad f \in C(G).$$

Since $\phi(f) \geq 0$ whenever $f \geq 0$, we have that $\lambda$ is a positive measure. Furthermore, because of (3.9), we have that $\phi(1) = 1$. It follows that $\lambda(G) = 1$, and consequently $\lambda$ is a probability measure on $G$.

We now show $\lambda$ is translation-invariant. Let $x \in G$ and define a measure $\lambda_x$ on $G$ by $\lambda_x(B) = \lambda(x + B)$ for all Borel sets $B$ in $G$. Our goal is to show that $\lambda = \lambda_x$. To that end, we make the following claim:

$$\int_G f \, d\lambda_x = \int_G T_{-x} f \, d\lambda, \qquad f \in C(G). \qquad (3.10)$$

To prove this, let $B$ be a Borel set in $G$ and let $y \in G$. By the definition of the map $T_{-x}$, we have that $T_{-x} \chi_B(y) = \chi_B(y - x)$. Thus,

$$\int_G T_{-x} \chi_B(y) \, \lambda(dy) = \int_G \chi_B(y - x) \, \lambda(dy) = \lambda(B + x) = \lambda_x(B) = \int_G \chi_B(y) \, \lambda_x(dy).$$

Therefore, (3.10) holds for $f = \chi_B$, where $B$ is a Borel subset of $G$. By linearity, (3.10) holds for simple functions, and by the density of simple functions in $C(G)$, (3.10) holds for continuous functions, as well.

The linear functional $\phi$ was chosen (via the Invariant Hahn–Banach Theorem) so that $\phi(T_{-x} f) = \phi(f)$ for all $f \in C(G)$. Therefore, using (3.10),

$$\int_G f \, d\lambda_x = \int_G T_{-x} f \, d\lambda = \phi(T_{-x} f) = \phi(f) = \int_G f \, d\lambda,$$

for all $f \in C(G)$. It follows that $\lambda_x = \lambda$. This is true for all $x \in G$, and so $\lambda$ is translation-invariant.

It remains to show that $\lambda$ is unique. Assume $\mu$ is a translation-invariant probability measure on $G$. Let $f \in C(G)$. By Fubini's Theorem,

$$\int_G \left( \int_G f(x + y) \, \lambda(dx) \right) \mu(dy) = \int_G \left( \int_G f(x + y) \, \mu(dy) \right) \lambda(dx).$$

By translation-invariance,

$$\int_G f(x + y) \, \lambda(dx) = \int_G f(x) \, \lambda(dx), \qquad y \in G,$$

and

$$\int_G f(x + y) \, \mu(dy) = \int_G f(y) \, \mu(dy), \qquad x \in G.$$

Therefore, since $\lambda$ and $\mu$ are probability measures (and so $\lambda(G) = \mu(G) = 1$),

$$\int_G f(x) \, \lambda(dx) = \int_G f(y) \, \mu(dy), \qquad f \in C(G).$$

It follows that $\lambda = \mu$, and so the proof is complete. □

Definition 3.25 The unique translation-invariant Borel probability measure on a compact abelian metric group is called Haar measure.

Remark 3.26 In the proof of Theorem 3.24, we do not actually use the metric on $G$. Indeed, our proof requires only that $G$ is a compact abelian topological group. At this time, however, we have restricted our attention to topologies arising from a metric. We will consider more general topological spaces in Chapter 5 and we will revisit the topic of Haar measure at that time. (See Section 5.9.)

Example 3.27 Let $\mathbb{T} = \{e^{i\theta} : 0 \leq \theta < 2\pi\}$ be the torus from Example 3.22 (i). Haar measure on $\mathbb{T}$ is given by $m/(2\pi)$, where $m$ is Lebesgue measure on $[0, 2\pi)$. To be more precise, if $f \in C(\mathbb{T})$ and $\lambda$ is Haar measure on $\mathbb{T}$, then

$$\int_{\mathbb{T}} f \, d\lambda = \frac{1}{2\pi} \int_0^{2\pi} f\big(e^{i\theta}\big) \, m(d\theta).$$

For this reason, some authors write $\mathbb{T} = [0, 2\pi)$ and $\lambda = m/(2\pi)$. (The factor $2\pi$ is needed in the denominator so that $\lambda$ is a probability measure.)

Corollary 3.28 Let $G$ be a compact abelian metric group with Haar measure $\lambda$. If $B$ is any Borel subset of $G$, and if $-B$ is the set $\{-x : x \in B\}$ of inverses of elements in $B$, then $\lambda(-B) = \lambda(B)$.

Proof Define a probability measure $\mu$ on $G$ by $\mu(B) = \lambda(-B)$ for all Borel subsets $B$ in $G$. By the translation invariance of $\lambda$,

$$\mu(x + B) = \lambda\big({-x} + (-B)\big) = \lambda(-B) = \mu(B).$$

Thus, $\mu$ is a translation-invariant Borel probability measure on $G$. Since Haar measure is the unique measure with these properties, we conclude that $\mu = \lambda$. □
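The invariance properties of Haar measure on the torus (Example 3.27 and Corollary 3.28) can be checked numerically on a sample function. The sketch below is an illustration only: the function `haar_mean` approximates $\frac{1}{2\pi}\int_0^{2\pi} f(e^{i\theta})\,m(d\theta)$ by an equally spaced average, functions on $\mathbb{T}$ are represented through the angle $\theta$, and the test function `f` and the shift `0.7` are hypothetical choices. For a trigonometric polynomial of degree below the number of sample points, the equally spaced average is exact up to rounding.

```python
import math

def haar_mean(f, n_points=64):
    """Equally spaced average approximating (1/(2*pi)) * integral of f over [0, 2*pi).

    f is a function of the angle theta, standing for a function on the torus.
    """
    return sum(f(2.0 * math.pi * k / n_points) for k in range(n_points)) / n_points

def f(theta):
    """A sample trigonometric polynomial with mean value 1 (hypothetical choice)."""
    return 1.0 + 2.0 * math.sin(theta) + math.cos(3.0 * theta)

shift = 0.7  # translation by the group element e^{i * 0.7}

mean_f = haar_mean(f)
mean_translated = haar_mean(lambda t: f(t + shift))  # integral of T_x f
mean_reflected = haar_mean(lambda t: f(-t))          # integral over -B, as in Corollary 3.28
```

All three averages agree with the mean value 1 of `f`, which is what translation-invariance and Corollary 3.28 predict for the measure $m/(2\pi)$.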

3.5 Duals, Biduals, and More

Let $X$ be a Banach space. Recall that the dual space $X^*$ of $X$ is the space of bounded linear functionals on $X$. When $X^*$ is equipped with the operator norm $\|x^*\| = \sup\{|x^*(x)| : x \in B_X\}$, where $B_X$ is the closed unit ball of $X$, the space $X^*$ is a Banach space. (See Proposition 1.11.)

Proposition 3.29 Let $X$ be a Banach space with dual space $X^*$. If $x \in X$, then

$$\|x\| = \sup\big\{ |x^*(x)| : x^* \in B_{X^*} \big\},$$

and there exists an $x^* \in X^*$ such that $\|x^*\| = 1$ and $x^*(x) = \|x\|$.

Proof Let $\mathbb{K}$ denote the scalar field. For $x \in X$, define a closed linear subspace $E_x$ of $X$ by $E_x = \{\alpha x : \alpha \in \mathbb{K}\}$. Define a map $f : E_x \to \mathbb{K}$ by $f(\alpha x) = \alpha \|x\|$, for all $\alpha \in \mathbb{K}$. Then $f$ is linear and $\|f\| = 1$.

By the Hahn–Banach Theorem for normed spaces (Theorem 3.9 for real spaces, Theorem 3.15 for complex spaces), there exists a bounded linear functional $x_f^* \in X^*$ that extends $f$. That is, the map $x_f^*$ is a bounded linear functional on $X$ such that $x_f^*(x) = \|x\|$ and $\|x_f^*\| = 1$.

Observe that $|x^*(x)| \leq \|x\|$ for all $x^* \in B_{X^*}$. Therefore,

$$\|x\| \geq \sup\big\{ |x^*(x)| : x^* \in B_{X^*} \big\}.$$

On the other hand, the linear functional $x_f^*$ is an element in $B_{X^*}$ with the property that $x_f^*(x) = \|x\|$. The result follows. □

For any $x$ in a Banach space $X$, Proposition 3.29 guarantees the existence of a so-called norming element in the dual space $X^*$; that is, an element $x^*$ of norm 1 such that $x^*(x) = \|x\|$. This element may or may not be unique.

Example 3.30 We consider the existence of norming elements in a few real sequence spaces.

(i) Let $X = \ell_1$, and so $X^* = \ell_\infty$. Consider the summable sequence $\xi = \left(1, \frac{1}{4}, \frac{1}{9}, \frac{1}{16}, \ldots, \frac{1}{n^2}, \ldots\right)$. Then $\xi \in \ell_1$ and $\|\xi\| = \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$. In this case, there is a unique norming element in $\ell_\infty$, and that element is the constant sequence $(1, 1, 1, \ldots)$.

(ii) Once again, let $X = \ell_1$. This time, let $\xi = (1, 0, 0, 0, \ldots)$. In this case, $\xi \in \ell_1$ and $\|\xi\| = 1$. There are many norming functionals in this case. Indeed, any element in $\ell_\infty$ of the form $(1, a_2, a_3, a_4, \ldots)$ with $|a_j| \leq 1$ for all $j \geq 2$ will determine a norming functional for $\xi$.

(iii) In $\ell_2$, the norming functional is always unique. Let $\xi = (\xi_1, \xi_2, \ldots)$ be an element of $\ell_2$. Then $\|\xi\| = \left( \sum_{n=1}^{\infty} \xi_n^2 \right)^{1/2}$ and the norming element is $\xi / \|\xi\|$. (Recall that $\ell_2^* = \ell_2$ (Theorem 2.5).)

(iv) Consider the space $\ell_\infty$ and let $\xi = (1, 1, 1, \ldots)$. Any linear functional $\phi$ on $\ell_\infty$ will be a norming element for $\xi$ provided both $\phi(\xi) = 1$ and $\|\phi\| = 1$. Any Banach limit will satisfy these criteria, as well as other linear functionals.
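Parts (ii) and (iii) of Example 3.30 can be illustrated with finitely supported sequences. The sketch below is a hedged illustration, not part of the text's argument: sequences are truncated to short Python lists (an assumption made for computability), and the helper names `norm_l1`, `norm_sup`, `norm_l2`, and `pair` are hypothetical. It shows two different elements of the unit sphere of $\ell_\infty$ that both norm $\xi = (1, 0, 0, 0)$, and the single candidate $\eta/\|\eta\|$ for a vector $\eta$ in $\ell_2$.

```python
import math

def norm_l1(xi):
    """The l_1 norm of a finitely supported sequence."""
    return sum(abs(t) for t in xi)

def norm_sup(a):
    """The l_infinity norm of a finitely supported sequence."""
    return max(abs(t) for t in a)

def norm_l2(xi):
    """The l_2 norm of a finitely supported sequence."""
    return math.sqrt(sum(t * t for t in xi))

def pair(a, xi):
    """Action of the functional represented by a on the sequence xi."""
    return sum(s * t for s, t in zip(a, xi))

# Example 3.30 (ii): many norming functionals for xi = (1, 0, 0, 0) in l_1.
xi = [1.0, 0.0, 0.0, 0.0]
candidates = [[1.0, 0.0, 0.0, 0.0], [1.0, 0.5, -1.0, 0.25]]
norms_of_candidates = [norm_sup(a) for a in candidates]  # each has norm 1
values_at_xi = [pair(a, xi) for a in candidates]         # each equals ||xi||_1

# Example 3.30 (iii): in l_2 the norming element is eta / ||eta||.
eta = [3.0, 4.0]
eta_norming = [t / norm_l2(eta) for t in eta]
value_at_eta = pair(eta_norming, eta)                    # approximately ||eta||_2 = 5
```

Both candidates have supremum norm 1 and take the value $\|\xi\|_1 = 1$ at $\xi$, so the norming functional is not unique in this case.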

Definition 3.31 Let $X$ be a Banach space. The bidual of $X$ is the space $X^{**} = (X^*)^*$.

Let $X$ be a Banach space. Define a map $j : X \to X^{**}$ by letting $j(x) \in X^{**}$ be the linear functional on $X^*$ defined by

$$j(x)(x^*) = x^*(x), \qquad x^* \in X^*, \qquad (3.11)$$

for all $x \in X$. The equation in (3.11) is sometimes written $\langle x^*, j(x) \rangle = \langle x, x^* \rangle$. We call $j$ the natural embedding of $X$ into its bidual and $\langle \cdot, \cdot \rangle$ the dual space action between a Banach space (written on the left) and its dual (written on the right).

Theorem 3.32 The natural embedding of a Banach space into its bidual is an isometric isomorphism onto a closed subspace of the bidual.

Proof Let $X$ be a Banach space and suppose $j$ is the natural embedding of $X$ into its bidual $X^{**}$. By direct computation, one can show that $j$ is a linear injection onto its image. (See Exercise 3.5.)

We now show that $j$ is an isometry. If $x \in X$, then $j(x) \in X^{**}$. Thus, for all $x \in X$,

$$\|j(x)\| = \sup_{\|x^*\| \leq 1} |j(x)(x^*)| = \sup_{\|x^*\| \leq 1} |x^*(x)| = \|x\|.$$

(The last equality follows from Proposition 3.29.) It follows that $j$ is an isometry. From this, we can conclude that $j(X)$ is closed in $X^{**}$ and that $j$ is an isomorphism onto $j(X)$. (See Exercise 3.5.) □

Theorem 3.32 suggests that an exact copy of $X$ sits inside of $X^{**}$. In light of this, it is common to suppress the map $j$ and simply view $X$ as a closed subspace of $X^{**}$.

Definition 3.33 A Banach space $X$ is called reflexive if the natural embedding $j$ of $X$ into its bidual is a surjection; that is, if $j(X) = X^{**}$.

Example 3.34

(i) The sequence spaces $\ell_p$ are reflexive whenever $1 < p < \infty$.

Proposition 3.35 A Banach space $X$ is reflexive if and only if its dual $X^*$ is reflexive.

Proof First, assume $X$ is reflexive. Then $(X^*)^* = X$. By definition, the bidual of $X^*$ is

$$(X^*)^{**} = \big((X^*)^*\big)^* = X^*.$$

The last equality follows from the assumption that $X$ is reflexive. Therefore, $X^*$ is reflexive.

Now assume $X^*$ is reflexive. We wish to show $X$ is reflexive. Let $j : X \to X^{**}$ be the natural embedding of $X$ into its bidual. Assume that $j$ is not a surjection. Then there exists an $x^{**} \in X^{**}$ such that $x^{**} \notin j(X)$. Let

$$\delta = d(x^{**}, j(X)) = \inf\{\|x^{**} - j(x)\| : x \in X\}.$$

Then $\delta > 0$, by assumption, because $j(X)$ is closed in $X^{**}$. (See Theorem 3.32.)

Let $E = \operatorname{span}\{x^{**}, j(x) : x \in X\}$. Define a linear functional $\phi : E \to \mathbb{K}$ (where $\mathbb{K}$ denotes the scalar field) by

$$\phi\big(\lambda x^{**} + j(x)\big) = \lambda, \qquad \lambda \in \mathbb{K}, \; x \in X.$$

This map is well-defined because $j$ is an injection onto $j(X)$. For nonzero $\lambda \in \mathbb{K}$ and $x \in X$,

$$\|\lambda x^{**} + j(x)\| = |\lambda| \, \big\|x^{**} - j(-\lambda^{-1} x)\big\| \geq |\lambda| \, d(x^{**}, j(X)) = \delta \, |\lambda|.$$

It follows that

$$|\phi(\lambda x^{**} + j(x))| = |\lambda| \leq \frac{1}{\delta} \, \|\lambda x^{**} + j(x)\|.$$

We conclude that $\phi$ is bounded on $E \subseteq X^{**}$ and $\|\phi\| \leq 1/\delta$. Therefore, by the Hahn–Banach Extension Theorem (Theorem 3.4), there exists an element of $X^{***}$ that extends $\phi$. In particular, there exists $x^{***} \in X^{***}$ such that $x^{***}(x^{**}) = 1$ and $x^{***}(j(x)) = 0$ for all $x \in X$.

By assumption, $X^*$ is reflexive, and so $x^{***}$ corresponds to some $x^* \in X^*$. Consequently, there exists an element $x^* \in X^*$ such that $x^{**}(x^*) = 1$ and $x^*|_X = 0$. This implies both $\|x^*\| > 0$ and $x^* = 0$, a contradiction. Therefore, $X$ is reflexive. □

3.6 The Adjoint of an Operator

Suppose $n$ is a natural number. If $A = (a_{jk})_{j,k=1}^{n}$ is an $n \times n$ complex matrix, then the matrix adjoint (or conjugate transpose) of $A$ is the $n \times n$ matrix $A^* = (b_{jk})_{j,k=1}^{n}$ with entries $b_{jk} = \overline{a_{kj}}$ for each $j$ and $k$ in the set $\{1, \ldots, n\}$. One of the important properties of the matrix adjoint is

$$(Ax, y) = (x, A^* y), \qquad \{x, y\} \subseteq \mathbb{C}^n,$$

where $(\cdot, \cdot)$ denotes the inner product on $\mathbb{C}^n$. In this section we will generalize the notion of a matrix adjoint to infinite-dimensional Banach spaces.

Definition 3.36 Let $X$ and $Y$ be Banach spaces and let $T : X \to Y$ be a bounded linear operator. The map $T^* : Y^* \to X^*$ defined by

$$(T^* y^*)(x) = (y^* \circ T)(x), \qquad x \in X, \; y^* \in Y^*,$$

is called the adjoint of $T$.

Owing to the abundance of Banach spaces, we will sometimes find it convenient to denote the norm on a Banach space $X$ by $\|\cdot\|_X$. The proof of the following proposition will afford one such occasion.

Proposition 3.37 Let $X$ and $Y$ be two Banach spaces. If $T : X \to Y$ is a bounded linear operator, then the adjoint $T^*$ is a bounded linear operator and $\|T^*\| = \|T\|$.

Proof It is not hard to show $T^*$ is linear. To show $T^*$ is bounded, let $y^* \in Y^*$. Then

$$\|T^* y^*\|_{X^*} = \sup_{\|x\| \leq 1} |(T^* y^*)(x)| = \sup_{\|x\| \leq 1} |y^*(Tx)|.$$

Therefore,

$$\|T^* y^*\|_{X^*} \leq \sup_{\|x\| \leq 1} \|y^*\|_{Y^*} \, \|Tx\|_Y = \|T\| \, \|y^*\|_{Y^*}.$$

Hence, $T^*$ is bounded and $\|T^*\| \leq \|T\|$.

To prove the reverse inequality, we begin by letting $\varepsilon > 0$. There exists $x \in X$ such that $\|x\|_X \leq 1$ and $\|Tx\|_Y > \|T\| - \varepsilon$. By Proposition 3.29, there exists $y^* \in Y^*$ such that $\|y^*\|_{Y^*} = 1$ and $y^*(Tx) = \|Tx\|_Y$; whence,

$$\|Tx\|_Y = (T^* y^*)(x) \leq \|T^* y^*\|_{X^*} \, \|x\|_X \leq \|T^* y^*\|_{X^*} \leq \|T^*\|.$$

Consequently, $\|T\| < \|T^*\| + \varepsilon$. Since the choice of $\varepsilon$ was arbitrary, we conclude that $\|T\| \leq \|T^*\|$. □

Corollary 3.38 Let $X$ and $Y$ be Banach spaces and let $L(X, Y)$ denote the Banach space of bounded linear operators from $X$ to $Y$. The map taking $T$ to $T^*$ is a linear isometry from $L(X, Y)$ to $L(Y^*, X^*)$.

Proof This follows from Proposition 3.37. □

For an operator $T$ between Banach spaces $X$ and $Y$, the adjoint $T^* : Y^* \to X^*$ also has an adjoint $T^{**} : X^{**} \to Y^{**}$. In the following proposition, we think of $X$ as a subspace of its bidual $X^{**}$.

Proposition 3.39 Let $X$ and $Y$ be Banach spaces. If $T : X \to Y$ is a bounded linear operator, then $T^{**}|_X = T$. That is, $T^{**}(x) = T(x)$ for all $x \in X$.

Proof Let $j$ be the natural embedding of $X$ into its bidual $X^{**}$ (which we think of as the inclusion map). Suppose $x \in X$. By definition, $T^{**} x \in Y^{**}$. Let the action of $Y^{**}$ on $Y^*$ be represented by $\langle \cdot, \cdot \rangle$. Then for any $y^* \in Y^*$,

$$(T^{**} x)(y^*) = \langle y^*, T^{**} x \rangle = \langle T^* y^*, j(x) \rangle.$$

We have $T^* y^* \in X^*$, and so, by (3.11) and Definition 3.36,

$$\langle T^* y^*, j(x) \rangle = \langle x, T^* y^* \rangle = \langle Tx, y^* \rangle.$$

It follows that $\langle y^*, T^{**} x \rangle = \langle Tx, y^* \rangle$ for all $y^* \in Y^*$, and hence the result. □

Proposition 3.40 Let $X$, $Y$, and $Z$ be Banach spaces and suppose both $S : X \to Y$ and $T : Y \to Z$ are bounded linear operators. If $TS = T \circ S : X \to Z$, then the adjoint map $(TS)^* : Z^* \to X^*$ is given by $(TS)^* = S^* T^*$.

Proof The proof is left to the reader. (See Exercise 3.8.) □

Example 3.41 (The Volterra operator). Let $p$ and $q$ be conjugate exponents (so that $1/p + 1/q = 1$), where $1 < p < \infty$. Define a map $V : L_p(0, 1) \to L_p(0, 1)$ by

$$Vf(t) = \int_0^t f(s) \, ds, \qquad f \in L_p(0, 1), \; t \in [0, 1].$$

We call $V$ the Volterra operator on $L_p(0, 1)$. We must show that $V$ is well-defined. By Hölder's Inequality, for all $t \in [0, 1]$,

$$|Vf(t)| \leq \left( \int_0^t 1^q \, ds \right)^{1/q} \left( \int_0^t |f(s)|^p \, ds \right)^{1/p} \leq t^{1/q} \, \|f\|_p.$$

Therefore,

$$\|Vf\|_p \leq \left( \int_0^1 t^{p/q} \, dt \right)^{1/p} \|f\|_p = \left( \frac{1}{p} \right)^{1/p} \|f\|_p.$$

It follows that $V$ is bounded and $\|V\| \leq (1/p)^{1/p}$.

We now compute the adjoint of the operator $V$. Observe that the adjoint operator $V^* : L_q(0, 1) \to L_q(0, 1)$ satisfies the equation

$$\int_0^1 f(s) \, V^* g(s) \, ds = \int_0^1 Vf(s) \, g(s) \, ds = \int_0^1 \left( \int_0^s f(t) \, dt \right) g(s) \, ds,$$

for all $f \in L_p(0, 1)$ and $g \in L_q(0, 1)$. By Fubini's Theorem,

$$\int_0^1 \left( \int_0^s f(t) \, dt \right) g(s) \, ds = \int_0^1 \left( \int_t^1 g(s) \, ds \right) f(t) \, dt,$$

for all $f \in L_p(0, 1)$ and $g \in L_q(0, 1)$. We therefore conclude that

$$V^* g(t) = \int_t^1 g(s) \, ds, \qquad g \in L_q(0, 1).$$

We can define the Volterra operator $V$ for $p = 1$, as well. A similar argument will yield the same adjoint $V^*$. If $p = \infty$, however, then $V^*$ is a map from $(L_\infty(0, 1))^*$ to $(L_\infty(0, 1))^*$, and this map is not so easy to compute.

The Volterra operator defined in Example 3.41 is a special case of the next example.

Example 3.42 Suppose $K \in L_\infty([0, 1] \times [0, 1])$. For $p \in [1, \infty)$, define a linear map $T_K : L_p(0, 1) \to L_p(0, 1)$ by

$$T_K f(s) = \int_0^1 K(s, t) \, f(t) \, dt, \qquad f \in L_p(0, 1).$$

Then $|T_K f(s)| \leq \|K\|_\infty \|f\|_p$ for all $s \in [0, 1]$, and so $\|T_K f\|_p \leq \|K\|_\infty \|f\|_p$. Thus, $\|T_K\| \leq \|K\|_\infty$. A calculation similar to that in Example 3.41 reveals

$$T_K^* g(s) = \int_0^1 K(t, s) \, g(t) \, dt, \qquad g \in L_q(0, 1).$$

This example can be thought of as a "continuous" analog of a matrix adjoint. This demonstrates the original goal of functional analysis: To generalize linear algebra to an infinite-dimensional setting.
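The remark that the adjoint of an integral operator is a "continuous" analog of the matrix adjoint can be made concrete by discretizing the Volterra operator of Example 3.41. The following is a rough sketch under stated assumptions: the integrals are replaced by Riemann sums on a uniform grid with `n` points (so $V$ becomes a lower-triangular matrix with entries `h` and $V^*$ becomes its transpose), and the sample functions $f(t) = \cos 3t$ and $g(t) = t^2$ are hypothetical choices. It is a numerical illustration, not a proof.

```python
import math

def volterra(f_values, h):
    """Riemann-sum stand-in for (Vf)(t) = integral of f from 0 to t."""
    out, running = [], 0.0
    for value in f_values:
        running += h * value  # cumulative sum from the left
        out.append(running)
    return out

def volterra_adjoint(g_values, h):
    """Riemann-sum stand-in for (V*g)(t) = integral of g from t to 1."""
    out, running = [0.0] * len(g_values), 0.0
    for i in range(len(g_values) - 1, -1, -1):
        running += h * g_values[i]  # cumulative sum from the right
        out[i] = running
    return out

def pairing(u, v, h):
    """Riemann-sum stand-in for the integral of u(s) * v(s) over (0, 1)."""
    return h * sum(a * b for a, b in zip(u, v))

n = 200
h = 1.0 / n
grid = [i * h for i in range(n)]
f_values = [math.cos(3.0 * t) for t in grid]  # sample f (assumed)
g_values = [t * t for t in grid]              # sample g (assumed)

lhs = pairing(volterra(f_values, h), g_values, h)          # <Vf, g>
rhs = pairing(f_values, volterra_adjoint(g_values, h), h)  # <f, V*g>
adjoint_of_one = volterra_adjoint([1.0] * n, h)            # approximates 1 - t
```

In this discretization the two pairings `lhs` and `rhs` are the same double sum taken in a different order, which mirrors the use of Fubini's Theorem in Example 3.41.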

3.7 New Banach Spaces From Old

In this section, we will show two common ways to construct new Banach spaces from given ones. The first method is essentially a means of summing two spaces, while the second is comparable to subtraction.

Definition 3.43 Let $X$ and $Y$ be Banach spaces. The direct sum of $X$ and $Y$ is the set $X \times Y$ equipped with component-wise addition and scalar multiplication:

• $(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)$, and
• $\lambda \cdot (x, y) = (\lambda x, \lambda y)$,

where $(x, y)$, $(x_1, y_1)$, and $(x_2, y_2)$ are in $X \times Y$ and $\lambda$ is a scalar. When given this vector space structure, the direct sum is denoted by $X \oplus Y$.

Proposition 3.44 Let $X$ and $Y$ be Banach spaces. The direct sum $X \oplus Y$ is a Banach space under the norm

$$\|(x, y)\| = \|x\|_X + \|y\|_Y, \qquad (x, y) \in X \times Y.$$

Proof The proof that this norm is complete follows directly from the fact that $\|\cdot\|_X$ and $\|\cdot\|_Y$ are complete norms. □

In some cases, a Banach space can be decomposed into a direct sum of closed subspaces.

Proposition 3.45 Let $X$ be a Banach space. Suppose $V$ and $W$ are closed subspaces of $X$. If $X = V + W$ and $V \cap W = \{0\}$, then $X$ is isomorphic (as a vector space) to $V \oplus W$. (In this case, we write $X = V \oplus W$.)

Proof We wish to establish a vector space isomorphism between the spaces $X$ and $V \oplus W$. That is, we wish to find a linear bijection (which need not be a homeomorphism). By assumption, $X = \{v + w : v \in V, w \in W\}$. Define a map $\phi : X \to V \oplus W$ by

$$\phi(v + w) = (v, w), \qquad (v, w) \in V \times W.$$

A priori, it may not be clear that this map is well-defined. Suppose that $x \in X$ can be written in two ways as the sum of elements from $V$ and $W$; i.e., suppose that $x = v + w$ and $x = v' + w'$, where $(v, w)$ and $(v', w')$ are in $V \times W$. Then $v + w = v' + w'$, and consequently

$$v - v' = w' - w \in V \cap W = \{0\}.$$

Therefore, $v = v'$ and $w = w'$. It follows that each $x \in X$ has a unique representation of the form $x = v + w$, where $v \in V$ and $w \in W$, and so $\phi$ is well-defined.

By construction, $\phi$ is onto. Furthermore, $\phi$ is one-to-one, because $x = v + w$ if $\phi(x) = (v, w)$. Next, let $(v, w)$ and $(v', w')$ be elements of $V \times W$. If $x = v + w$ and $x' = v' + w'$, then

$$\phi(x + x') = \phi\big((v + v') + (w + w')\big) = (v + v', w + w') = (v, w) + (v', w').$$

Consequently, $\phi(x + x') = \phi(x) + \phi(x')$. Furthermore, if $\lambda$ is a scalar, then

$$\phi(\lambda x) = \phi(\lambda v + \lambda w) = (\lambda v, \lambda w) = \lambda \phi(x).$$

Therefore, $\phi$ is linear, and hence a vector space isomorphism. □

It is perhaps worth mentioning that the map $\phi$ in the proof of Proposition 3.45 need not be an isometry between the given norm on $X$ and the norm on $V \oplus W$ (as given in Proposition 3.44). These two norms will always be equivalent, however, because $\phi$ is, in fact, a homeomorphism (a continuous bijection with continuous inverse). This will follow from the Bounded Inverse Theorem (Corollary 4.30), which we shall meet in Section 4.3 as a consequence of the Open Mapping Theorem.

We now consider a second operation used to create new Banach spaces.

Definition 3.46 Let $X$ be a Banach space and let $Y$ be a closed subspace of $X$. The quotient space $X/Y$ is the set of all cosets of $Y$ in $X$. That is, $X/Y = \{x + Y : x \in X\}$. The map $Q : X \to X/Y$ defined by $Qx = x + Y$ for $x \in X$ is called the quotient map.

The quotient space $X/Y$ is a vector space with addition and scalar multiplication given by $(x + Y) + (x' + Y) = (x + x') + Y$ and $\alpha(x + Y) = (\alpha x) + Y$, respectively, where $x$ and $x'$ are in $X$ and $\alpha$ is a scalar. We leave it to the reader to verify this fact. Note that the zero vector in $X/Y$ is $Y = 0 + Y$.

Proposition 3.47 Let $X$ be a Banach space with closed subspace $Y$. For each $x \in X$, let $\|x + Y\| = \inf_{y \in Y} \|x + y\|$. Then $\|\cdot\|$ defines a complete norm on $X/Y$ (called the quotient norm).

Proof We first show that $\|\cdot\|$ is a norm on the quotient $X/Y$. It is clear that $\|\cdot\|$ is nonnegative and $\|0 + Y\| = 0$. Suppose $\|x + Y\| = 0$. We will show that $x + Y = Y$. By the definition of the norm, for all $n \in \mathbb{N}$, there exists $y_n \in Y$ such that $\|x + y_n\| < 2^{-n}$. Therefore, $\|x - (-y_n)\| < 2^{-n}$, and so the sequence $(-y_n)_{n=1}^{\infty}$ converges to $x$. Since $Y$ is a closed subspace of $X$, it follows that $x \in Y$, and consequently $x + Y = Y$.

Now let $\alpha$ be a nonzero scalar. If $x \in X$, then

$$\|\alpha x + Y\| = \inf_{y \in Y} \|\alpha x + y\| = \inf_{z \in Y} \|\alpha x + \alpha z\| = |\alpha| \inf_{z \in Y} \|x + z\|.$$

(Observe that $z = y/\alpha$.) Therefore, we conclude $\|\alpha(x + Y)\| = |\alpha| \, \|x + Y\|$, and so $\|\cdot\|$ is homogeneous.

Next, we show the triangle inequality. Let $x_1$ and $x_2$ be in $X$ and suppose $\varepsilon > 0$. There exist elements $y_1$ and $y_2$ in $Y$ such that

$$\|x_1 + y_1\| < \|x_1 + Y\| + \varepsilon/2 \quad \text{and} \quad \|x_2 + y_2\| < \|x_2 + Y\| + \varepsilon/2.$$

Then

$$\|x_1 + x_2 + y_1 + y_2\| \leq \|x_1 + y_1\| + \|x_2 + y_2\| < \|x_1 + Y\| + \|x_2 + Y\| + \varepsilon.$$

Since the choice of $\varepsilon$ was arbitrary, we conclude that $\|\cdot\|$ satisfies the triangle inequality, and hence is a norm on $X/Y$.

It remains to show the norm $\|\cdot\|$ is complete on $X/Y$. For this we use the Cauchy Summability Criterion (Lemma 2.24). Suppose $(x_n)_{n=1}^{\infty}$ is a sequence in $X$ such that $\sum_{n=1}^{\infty} \|x_n + Y\| < \infty$. For each $n \in \mathbb{N}$, pick $x_n' \in x_n + Y$ such that $\|x_n'\| \leq 2 \|x_n + Y\|$. Then $\sum_{n=1}^{\infty} \|x_n'\| < \infty$, and hence $\sum_{n=1}^{\infty} x_n'$ converges in $X$ (by completeness of the norm on $X$). Suppose $\sum_{n=1}^{\infty} x_n'$ converges to $x \in X$. Then, for any $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that $\left\| x - \sum_{k=1}^{n} x_k' \right\| < \varepsilon$ for all $n \geq N$. Therefore,

$$\left\| (x + Y) - \sum_{k=1}^{n} (x_k + Y) \right\| \leq \left\| x - \sum_{k=1}^{n} x_k' \right\| < \varepsilon, \qquad n \geq N. \qquad (3.12)$$

Consequently, $\sum_{n=1}^{\infty} (x_n + Y)$ converges to $x + Y$ in $X/Y$. It follows that the quotient norm is complete, as required. □

In (3.12), we used the fact that $\|x + Y\| \leq \|x\|$ for each $x \in X$. This follows from the definition of the norm on $X/Y$, because $0 \in Y$. Equivalently, the quotient map $Q : X \to X/Y$ of Definition 3.46 has norm at most 1.

The quotient map $Q : X \to X/Y$ has the additional property that it is also an open map. A map $T : X \to Z$ is called an open map if $T(U)$ is an open set in $Z$ whenever $U$ is an open set in $X$. If $X$ and $Z$ are Banach spaces and $T$ is linear, then in order to show that $T$ is an open map, it suffices to show that the open unit ball of $X$ is mapped to an open set in $Z$. (See Exercise 3.7.)

Let $U_X$ and $U_{X/Y}$ be the open unit balls in $X$ and $X/Y$, respectively. (Recall Definition 1.5.) We will prove the quotient map $Q : X \to X/Y$ is an open map by showing that $Q(U_X) = U_{X/Y}$.

We already know that $\|Q\| \leq 1$, and so $Q(U_X) \subseteq U_{X/Y}$. Now let $z + Y$ be any element of $U_{X/Y}$. By assumption,

$$\|z + Y\| = \inf\{\|z + y\| : y \in Y\} < 1.$$

Thus, there is some $y' \in Y$ such that $\|z + y'\| < 1$. If $z' = z + y'$, then $\|z'\| < 1$ and $Qz' = z + Y$. This proves that $U_{X/Y} \subseteq Q(U_X)$. We have established that the image of $U_X$ is $U_{X/Y}$, and hence $Q$ is an open map, as claimed.

Definition 3.48 Let $T : X \to Y$ be a bounded linear operator between Banach spaces. The set $\{x : Tx = 0\}$ is called the kernel of $T$ and is denoted $\ker(T)$.

We now derive an important proposition which is analogous to a well-known fact of linear algebra. This result will prove valuable to us on several occasions.

Proposition 3.49 Let $X$ and $Y$ be Banach spaces. If $T : X \to Y$ is a bounded linear operator, then $\ker(T)$ is a closed subspace of $X$ and there exists an injective linear operator $T_0 : X/\ker(T) \to Y$ such that $\|T_0\| = \|T\|$. Furthermore, $T_0$ makes the following diagram commute, where $Q : X \to X/\ker(T)$ is the quotient map.

$$X \xrightarrow{\;Q\;} X/\ker(T) \xrightarrow{\;T_0\;} Y, \qquad T = T_0 \circ Q.$$

In particular, if $T$ is a surjection, then $T_0$ is a continuous linear bijection.

Proof Let $E = \ker(T)$. Then $E = T^{-1}(\{0\})$ is a closed set in $X$ because $T$ is continuous and $\{0\}$ is a closed set in $Y$. A simple calculation shows that $E$ is a linear subspace of $X$.

Define the map $T_0 : X/E \to Y$ by $T_0(x + E) = Tx$ for all $x \in X$. We must verify that the map $T_0$ is well-defined. To that end, let $x + E = x' + E$ for $x$ and $x'$ in $X$. It follows that $x - x' \in E$, and so $T(x - x') = 0$, because $E$ is the kernel of $T$. Therefore, we have that $Tx - Tx' = T(x - x') = 0$, and hence $Tx = Tx'$. Consequently, the map $T_0$ is well-defined, as required.

For any $x \in X$, we have $\|Tx\| = \|T_0(x + E)\| \leq \|T_0\| \, \|x + E\| \leq \|T_0\| \, \|x\|$, and so $\|T\| \leq \|T_0\|$. To show the reverse inequality, let $\varepsilon > 0$ and choose $x' \in x + E$ such that $\|x'\| < \|x + E\| + \varepsilon$. Then $\|Tx\| = \|Tx'\|$ and

$$\|T_0(x + E)\| = \|Tx\| \leq \|T\| \, \|x'\| \leq \|T\| \, \|x + E\| + \varepsilon \|T\|.$$

Since the choice of $\varepsilon$ was arbitrary, it follows that $\|T_0\| \leq \|T\|$. We therefore conclude that $\|T_0\| = \|T\|$. The rest of the proposition follows directly. □

There is a close connection between direct sums and quotient spaces. In Section 4.4, we will show that $X = V \oplus W$ if and only if there exists a continuous projection $P : X \to V$ such that $W = \ker(P)$ and $V = P(X)$. (Thus, by Proposition 3.49, we may identify $X/W$ with $V$.)
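The quotient norm of Proposition 3.47 can be computed by hand in a small example, and the computation is easy to check numerically. The sketch below rests on assumptions chosen purely for illustration: $X = \mathbb{R}^2$ with the norm $\|(a, b)\| = |a| + |b|$ and $Y = \operatorname{span}\{(1, 1)\}$. In that setting $\|(a, b) + Y\| = \inf_t \big(|a + t| + |b + t|\big) = |a - b|$. The function names are hypothetical.

```python
def l1_norm(a, b):
    """The norm |a| + |b| on R^2 (the assumed norm on X)."""
    return abs(a) + abs(b)

def quotient_norm(a, b):
    """Quotient norm of (a, b) + Y, where Y = span{(1, 1)}.

    The map t -> |a + t| + |b + t| is convex and piecewise linear, so its
    infimum is attained at one of the breakpoints t = -a or t = -b.
    """
    return min(l1_norm(a + t, b + t) for t in (-a, -b))

def grid_infimum(a, b, half_width=5.0, steps=10000):
    """Brute-force check: minimize over a grid of t in [-half_width, half_width]."""
    best = float("inf")
    for k in range(steps + 1):
        t = -half_width + 2.0 * half_width * k / steps
        best = min(best, l1_norm(a + t, b + t))
    return best

exact = quotient_norm(3.0, 1.0)        # |3 - 1| = 2
approximate = grid_infimum(3.0, 1.0)   # agrees with the exact value
in_subspace = quotient_norm(2.5, 2.5)  # (2.5, 2.5) lies in Y, so the norm is 0
```

The value `exact` is smaller than `l1_norm(3.0, 1.0)`, in line with the inequality $\|x + Y\| \leq \|x\|$. In this example the functional $T(a, b) = a - b$ has kernel $Y$ and satisfies $|T(a, b)| = \|(a, b) + Y\|$, so the induced map $T_0$ of Proposition 3.49 is an isometry here.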

3.8 Duals of Quotients and Subspaces

Definition 3.50 Let $X$ be a Banach space. If $E$ is a closed subspace of $X$, then the annihilator of $E$ is the set

$$E^\perp = \{x^* \in X^* : x^*(x) = 0 \text{ for all } x \in E\} \subseteq X^*.$$

If $F$ is a closed subspace of $X^*$, then the pre-annihilator of $F$ is the set

$$F_\perp = \{x \in X : x^*(x) = 0 \text{ for all } x^* \in F\} \subseteq X.$$

It is easy to check that $E^\perp$ and $F_\perp$ are always closed in $X^*$ and $X$, respectively.

Proposition 3.51 If $X$ is a Banach space and $E$ is a closed subspace of $X$, then:

(i) $E^*$ can be naturally identified with $X^*/E^\perp$, and
(ii) $(X/E)^*$ can be naturally identified with $E^\perp$.

Proof (i) Define a map $\rho : X^*/E^\perp \to E^*$ by

$$\rho(x^* + E^\perp) = x^*|_E, \qquad x^* \in X^*.$$

We claim this map is well-defined. To see this, suppose $x_1^*$ and $x_2^*$ are two elements of $X^*$ such that $x_1^* + E^\perp = x_2^* + E^\perp$ (i.e., they are in the same coset). It follows that $x_1^* - x_2^* \in E^\perp$, and consequently $(x_1^* - x_2^*)(e) = 0$ for all $e \in E$. Thus $x_1^*|_E = x_2^*|_E$, and so $\rho(x_1^* + E^\perp) = \rho(x_2^* + E^\perp)$. We have established that the definition of $\rho(x^* + E^\perp)$ does not depend on the choice of representative in the coset $x^* + E^\perp$, and hence the map $\rho$ is well-defined. We can show by direct computation that $\rho$ is linear. We also have that $\rho$ is injective, because $x^*|_E = 0$ if and only if $x^* \in E^\perp$.

We now define a map $\psi : E^* \to X^*/E^\perp$. For any $\phi \in E^*$, let

$$\psi(\phi) = x_\phi^* + E^\perp,$$

where $x_\phi^*$ is any Hahn–Banach extension of $\phi$ to an element of $X^*$. In order to show this map is well-defined, we must demonstrate that $\psi(\phi)$ is independent of choice of Hahn–Banach extension of $\phi$. To that end, let $\phi \in E^*$ and suppose $\phi$ has Hahn–Banach extensions $x_1^*$ and $x_2^*$ in $X^*$. Since these are both extensions of $\phi$, it follows that

$$x_1^*|_E = x_2^*|_E = \phi.$$

Thus, $(x_1^* - x_2^*)|_E = 0$, and so $x_1^* - x_2^* \in E^\perp$. We conclude that $x_1^*$ and $x_2^*$ are in the same coset, and therefore $x_1^* + E^\perp = x_2^* + E^\perp$.

Once again, it is easy to see that $\psi$ is an injective linear map. It is also easy to see that $\psi \circ \rho = \mathrm{Id}_{X^*/E^\perp}$ and $\rho \circ \psi = \mathrm{Id}_{E^*}$. (Here, we use $\mathrm{Id}_{X^*/E^\perp}$ and $\mathrm{Id}_{E^*}$ to denote the identity maps on $X^*/E^\perp$ and $E^*$, respectively.) Consequently, $\rho$ is a linear bijection with inverse $\psi$.

We now show $\rho$ is an isometry by showing that $\|\phi\| = \|x_\phi^* + E^\perp\|$, where $\phi \in E^*$ and $x_\phi^*$ is any Hahn–Banach extension of $\phi$. Certainly, $\|x_\phi^* + E^\perp\| \leq \|x_\phi^*\| = \|\phi\|$. Suppose $\|x_\phi^* + E^\perp\| < \|x_\phi^*\|$. Then there exists some $z^* \in x_\phi^* + E^\perp$ with $\|z^*\| < \|x_\phi^*\|$. Since $z^*$ and $x_\phi^*$ are in the same coset, it must be the case that $z^*|_E = x_\phi^*|_E = \phi$, but $\|z^*|_E\| \leq \|z^*\| < \|\phi\|$, a contradiction. Therefore, $\|x_\phi^* + E^\perp\| = \|\phi\|$.

(ii) Denote the quotient map by $Q : X \to X/E$. Let $\phi \in (X/E)^*$. For any $e \in E$, $Qe = E$. Consequently, $\phi \circ Q(e) = \phi(E) = 0$, and so $\phi$ determines an element of $E^\perp$ through the identification $\phi \mapsto \phi \circ Q$. This identification preserves norms because $Q(U_X) = U_{X/E}$ (see the comments following the proof of Proposition 3.47), and so $\phi \circ Q(U_X) = \phi(U_{X/E})$.

To see that an element of $E^\perp$ determines an element of $(X/E)^*$, let $x^* \in E^\perp$. Define $\phi \in (X/E)^*$ by $\phi(x + E) = x^*(x)$ for all $x \in X$. We must show $\phi$ is well-defined. Let $x$ and $x'$ be in the same coset, so that $x - x' \in E$. Then, $x^*(x - x') = 0$, and hence $x^*(x) = x^*(x')$. Thus, $\phi(x + E) = \phi(x' + E)$, and so $\phi$ is well-defined. To complete the proof, observe that $\phi \circ Q = x^*$, and refer to the previous paragraph. □

The identifications in the preceding proposition lead to a remarkable corollary.

Corollary 3.52 Let $X$ be a Banach space. For any closed subspace $E$ in $X$, we have the identification $E^{**} = E^{\perp\perp}$.

Proof First apply (i), and then (ii), of Proposition 3.51. □

Exercises

Exercise 3.1 Let $X$ be a real inner product space with inner product $(\cdot, \cdot)$ and associated norm $\|\cdot\|$. Prove the Parallelogram Law: If $x$ and $y$ are elements of $X$, then $\|x + y\|^2 + \|x - y\|^2 = 2(\|x\|^2 + \|y\|^2)$.

Exercise 3.2 Let $X$ be a real inner product space with inner product $(\cdot, \cdot)$ and associated norm $\|\cdot\|$. Verify the polarization formula: If $x$ and $y$ are in $X$, then

$$(x, y) = \frac{1}{4} \big( \|x + y\|^2 - \|x - y\|^2 \big).$$

Exercise 3.3 Let $H$ be a Hilbert space with norm $\|\cdot\|$. Show that a bounded linear map $S : H \to H$ is an orthogonal operator if and only if $\|S^n x\| = \|x\|$ for all $n \in \mathbb{N}$ and $x \in H$. (Hint: Use Exercise 3.2.)

Exercise 3.4 Let $X$ and $Y$ be normed vector spaces. If $x$ is a nonzero vector in $X$, and $y \in Y$, show there exists a bounded linear map $T : X \to Y$ such that $T(x) = y$.

Exercise 3.5 In this exercise, we complete the proof of Theorem 3.32. Let $X$ be a Banach space with bidual $X^{**}$ and let $j : X \to X^{**}$ be the natural embedding.

(a) Show that $j$ is an injective bounded linear map.
(b) Show that $j(X)$ is a closed subset of $X^{**}$. (You may wish to use Exercise 1.10.)

Exercise 3.6 Let $X$ and $Y$ be normed spaces. Show that if $L(X, Y)$ is a Banach space, then $Y$ must be a Banach space. (This is the converse to Proposition 1.11.)

Exercise 3.7 Let $X$ and $Z$ be Banach spaces and let $T \in L(X, Z)$. Show that $T$ is an open map if and only if $T$ maps the open unit ball of $X$ to an open set in $Z$.

Exercise 3.8 Prove Proposition 3.40.

Exercise 3.9 Let $p$ be a sublinear functional on a real vector space $V$. Show that

$$p(x) = \max\{f(x) : f \leq p, \; f \text{ linear}\}, \qquad x \in V.$$

Conversely, show that a functional $q$ of the form

$$q(x) = \sup_{f \in A} f(x), \qquad x \in V,$$

where $A$ is some collection of linear functionals, is necessarily sublinear.

Exercise 3.10 (Fekete's Lemma [10]) Let $(a_n)_{n=1}^{\infty}$ be a sequence of real numbers such that

$$a_{m+n} \leq a_m + a_n, \qquad \{m, n\} \subseteq \mathbb{N}.$$

Show that if the sequence $(a_n/n)_{n=1}^{\infty}$ is bounded below, then

$$\lim_{n \to \infty} \frac{a_n}{n} = \inf_{n \in \mathbb{N}} \frac{a_n}{n}.$$

(Hint: For any $m \in \mathbb{N}$, show that $\limsup_{n \to \infty} \frac{a_n}{n} \leq \frac{a_m}{m}$ by writing $n = km + r$, where $\{k, r\} \subseteq \mathbb{N}$ and $0 \leq r \leq m - 1$.)

Exercise 3.11 Let $V$ be a real vector space and suppose $p$ is a sublinear functional on $V$. Suppose $T : V \to V$ is a linear map such that $p(Tx) = p(x)$. Show, using Exercise 3.10, that

$$q(x) = \lim_{n \to \infty} \frac{1}{n} \, p\big(x + Tx + \cdots + T^{n-1} x\big), \qquad x \in V,$$

defines a sublinear functional with $q \leq p$. Show further that if $f$ is a linear functional with $f \leq p$, then $f$ is $T$-invariant (i.e., $f(Tx) = f(x)$ for all $x \in V$) if and only if $f \leq q$.

Exercise 3.12 Show that a linear functional $L$ on $\ell_\infty$ is a Banach limit if and only if
\[ L(\xi) \le \lim_{n\to\infty} \sup_{k\in\mathbb{N}} \frac{\xi_{k+1} + \cdots + \xi_{k+n}}{n}, \quad \xi = (\xi_j)_{j=1}^{\infty} \in \ell_\infty. \]

Exercise 3.13 A sequence $\xi \in \ell_\infty$ is called almost convergent to $\alpha$ if $L(\xi) = \alpha$ for all Banach limits $L$.

(a) Show that ξ is almost convergent to α if and only if

\[ \lim_{n\to\infty} \sup_{k\in\mathbb{N}} \left| \frac{\xi_{k+1} + \cdots + \xi_{k+n}}{n} - \alpha \right| = 0. \]

(b) Show that for any $\theta$, the sequence $(\sin(n\theta))_{n=1}^{\infty}$ is almost convergent to $0$.

Exercise 3.14 If $x = (x_j)_{j=1}^{\infty}$ and $y = (y_j)_{j=1}^{\infty}$ are sequences in $\ell_\infty$, let $xy$ be the sequence $(x_j y_j)_{j=1}^{\infty}$. Show that for any Banach limit $L$, there are sequences $x$ and $y$ in $\ell_\infty$ such that $L(xy) \ne L(x)\,L(y)$. (Notice that $L(xy) = L(x)\,L(y)$ if $x$ and $y$ are in $c$.)

Exercise 3.15 Prove that $\langle\cdot,\cdot\rangle$ is an inner product on $H$ in the proof of Proposition 3.21.

Exercise 3.16 Show that a bounded linear functional $f : c_0 \to \mathbb{R}$ has a unique Hahn–Banach extension $\tilde{f} : \ell_\infty \to \mathbb{R}$.

Exercise 3.17 Let $E = \{(x_n)_{n=1}^{\infty} \in \ell_1 : x_{2k-1} = 0 \text{ for all } k \in \mathbb{N}\}$. Show that $E$ is a closed subspace of $\ell_1$. Prove that any nonzero bounded linear functional on $E$ has more than one Hahn–Banach extension to $\ell_1$.

Exercise 3.18 Let $E = \left\{ f \in L_2(0,1) : \int_0^1 x f(x)\,dx = 0 \right\}$. Define a bounded linear functional $\Lambda$ on $E$ by
\[ \Lambda(f) = \int_0^1 x^2 f(x)\,dx, \quad f \in E. \]
Find the (unique) Hahn–Banach extension of $\Lambda$ to $L_2(0,1)$ and determine $\|\Lambda\|$. (Hint: Use the fact that $\int_0^1 x^2 f(x)\,dx = \int_0^1 (x^2 + ax) f(x)\,dx$ on $E$, for all $a \in \mathbb{R}$.)

Chapter 4 Consequences of Completeness

The space $C[0,1]$ of continuous functions on the interval $[0,1]$ can be equipped with many metrics. Two important examples are the metrics arising from the norms
\[ \|f\|_\infty = \max_{s\in[0,1]} |f(s)| \quad\text{and}\quad \|f\|_2 = \left( \int_0^1 |f(s)|^2\,ds \right)^{1/2}, \]
where $f \in C[0,1]$. The metric arising from the first norm is complete, whereas the metric induced by the second norm is not (i.e., there exist Cauchy sequences that fail to converge). Completeness of a metric is a very profitable property, as we shall see in this chapter. The first theorem we shall meet is a classical result about metric spaces called the Baire Category Theorem. It originated in Baire's 1899 doctoral thesis, although metric spaces were not formally defined until later.

4.1 The Baire Category Theorem

In this section, we will state and prove the Baire Category Theorem and see some of its applications. A notion that will prove fruitful is that of a Gδ-set. We remind the reader that a Gδ-set is a countable intersection of open sets. Certainly, all open sets are Gδ-sets, but not all Gδ-sets are open. Correspondingly, an Fσ -set is the countable union of closed sets. Naturally, all closed sets are Fσ -sets, but not all Fσ -sets are closed. (See Exercise 4.1.)

Theorem 4.1 (Baire Category Theorem) Suppose $(M,d)$ is a complete metric space. If $(U_n)_{n=1}^{\infty}$ is a sequence of dense open subsets of $M$, then $\bigcap_{n=1}^{\infty} U_n$ is dense in $M$.

Recall that a set $D$ is dense in a topological space $M$ if and only if $D \cap U \ne \emptyset$ for all nonempty open sets $U$ in $M$.

Observe that, while the conclusion of the Baire Category Theorem is topological in nature, the hypothesis is not. The two spaces $\mathbb{R}$ and $(-\frac{\pi}{2}, \frac{\pi}{2})$ are homeomorphic (via the mapping $x \mapsto \arctan x$); however, $\mathbb{R}$ is complete, while $(-\frac{\pi}{2}, \frac{\pi}{2})$ is not. Therefore, Theorem 4.1 applies directly to $\mathbb{R}$, but not to $(-\frac{\pi}{2}, \frac{\pi}{2})$. However, the conclusion of Theorem 4.1 holds for every metric space which is homeomorphic to a complete metric space.

© Springer Science+Business Media, LLC 2014. A. Bowers, N. J. Kalton, An Introductory Course in Functional Analysis, Universitext, DOI 10.1007/978-1-4939-1945-1_4

Before proving Theorem 4.1, let us consider some consequences of it.

Corollary 4.2 If $(G_n)_{n=1}^{\infty}$ is a sequence of dense $G_\delta$-sets, then $\bigcap_{n=1}^{\infty} G_n$ is also a dense $G_\delta$-set.

Proof By assumption, for each $n \in \mathbb{N}$, the set $G_n$ can be written as an intersection of countably many open sets, say $G_n = \bigcap_{m=1}^{\infty} U_{mn}$. Since $G_n$ is assumed to be dense, it must be the case that $U_{mn}$ is dense for each $m$ and $n$ in $\mathbb{N}$. By Theorem 4.1 (the Baire Category Theorem), the set
\[ \bigcap_{n=1}^{\infty} G_n = \bigcap_{n=1}^{\infty} \bigcap_{m=1}^{\infty} U_{mn} \]
is a dense set. Since the collection of open sets $(U_{mn})_{m,n=1}^{\infty}$ is countable, the set $\bigcap_{n=1}^{\infty} G_n$ is a $G_\delta$-set. □

Proposition 4.3 If $f : \mathbb{R} \to \mathbb{R}$ is a function, then the set of points of continuity of $f$ is a $G_\delta$-set.

Proof For any open interval $I$ in $\mathbb{R}$, let the oscillation of $f$ over $I$ be given by

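\[ \operatorname{osc}_I f = \sup\{ |f(x) - f(y)| : \{x,y\} \subseteq I \}. \]
Let $(\varepsilon_n)_{n=1}^{\infty}$ be a sequence of positive real numbers such that $\varepsilon_n \to 0$ as $n \to \infty$. For each $n \in \mathbb{N}$, let

\[ U_n = \{ x \in \mathbb{R} : \exists \text{ an open interval } I \text{ in } \mathbb{R} \text{ such that } x \in I \text{ and } \operatorname{osc}_I f < \varepsilon_n \}. \]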

Suppose $x \in U_n$. By definition, there exists an open interval $I$ such that $x \in I$ and $\operatorname{osc}_I f < \varepsilon_n$. The same interval $I$ witnesses membership in $U_n$ for every point of $I$, and so $I \subseteq U_n$. Consequently, $U_n$ is an open set. Therefore, the intersection $\bigcap_{n=1}^{\infty} U_n$ is a $G_\delta$-set. The set $\bigcap_{n=1}^{\infty} U_n$ is precisely the set of all points of continuity of $f$, and hence the result. □

Example 4.4 Define a function $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = 0$ for all $x \in \mathbb{R}\setminus\mathbb{Q}$, $f(0) = 1$, and $f(x) = 1/q$ for all $x \in \mathbb{Q}\setminus\{0\}$, where $x = p/q$ is written in lowest terms. The function $f$ is continuous precisely on the set $\mathbb{R}\setminus\mathbb{Q}$, and consequently (by Proposition 4.3), the set of irrational numbers is a $G_\delta$-set.

The preceding example shows that the set of irrational numbers $\mathbb{R}\setminus\mathbb{Q}$ is a $G_\delta$-set, but we could have shown that directly, without the aid of Proposition 4.3. Note that
\[ \mathbb{R}\setminus\mathbb{Q} = \bigcap_{q\in\mathbb{Q}} \mathbb{R}\setminus\{q\}. \]

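Thus, $\mathbb{R}\setminus\mathbb{Q}$ is the intersection of countably many sets, each of which is open in $\mathbb{R}$.

Example 4.4 does lead one to ask a natural question: Does there exist a function $g : \mathbb{R} \to \mathbb{R}$ which is continuous precisely on $\mathbb{Q}$? By Proposition 4.3, this can happen only if $\mathbb{Q}$ is a $G_\delta$-set. Both $\mathbb{Q}$ and $\mathbb{R}\setminus\mathbb{Q}$ are dense in $\mathbb{R}$, and $\mathbb{R}\setminus\mathbb{Q}$ is a $G_\delta$-set. If $\mathbb{Q}$ is also a $G_\delta$-set, then the intersection $\mathbb{Q} \cap (\mathbb{R}\setminus\mathbb{Q}) = \emptyset$ would have to be dense, by Corollary 4.2 (a consequence of the Baire Category Theorem). This is a clear contradiction. We conclude there exists no function that has $\mathbb{Q}$ as its set of points of continuity.

Proof of Theorem 4.1 (the Baire Category Theorem) Let $(M,d)$ be a complete metric space and suppose $(U_n)_{n=1}^{\infty}$ is a sequence of dense open subsets of $M$. Our goal is to show that $\bigcap_{n=1}^{\infty} U_n$ is also dense in $M$. For any $z \in M$ and $\varepsilon > 0$, let $B(z,\varepsilon) = \{x \in M : d(x,z) < \varepsilon\}$ be the open ball about $z$ of radius $\varepsilon$.

Let $V$ be a nonempty open set in $M$. We will show that $\bigcap_{n=1}^{\infty} U_n$ and $V$ have nonempty intersection. Pick any $x_0 \in V$ and $\varepsilon_0 \in (0,1)$ such that $B(x_0,\varepsilon_0) \subseteq V$. By assumption, the set $U_1$ is dense in $M$, and so $B(x_0, \frac{\varepsilon_0}{2}) \cap U_1 \ne \emptyset$. Thus, there exists a point $x_1 \in U_1$ and an $\varepsilon_1 \le \frac{\varepsilon_0}{2}$ such that
\[ B(x_1,\varepsilon_1) \subseteq B\left(x_0, \tfrac{\varepsilon_0}{2}\right) \cap U_1 \subseteq V \cap U_1. \]
By assumption, as before, the set $U_2$ is dense in $M$, and so $B(x_1, \frac{\varepsilon_1}{2}) \cap U_2 \ne \emptyset$. Again arguing as before, there exists a point $x_2 \in U_2$ and an $\varepsilon_2 \le \frac{\varepsilon_1}{2}$ such that
\[ B(x_2,\varepsilon_2) \subseteq B\left(x_1, \tfrac{\varepsilon_1}{2}\right) \cap U_2 \subseteq V \cap U_1 \cap U_2. \]
Continuing inductively, we construct sequences $(x_n)_{n=0}^{\infty}$ and $(\varepsilon_n)_{n=0}^{\infty}$ such that $\varepsilon_n \le \frac{\varepsilon_{n-1}}{2}$ and $x_n \in V \cap (U_1 \cap \cdots \cap U_n)$, and with the further property that
\[ B(x_n,\varepsilon_n) \subseteq B\left(x_{n-1}, \tfrac{\varepsilon_{n-1}}{2}\right) \cap U_n \subseteq V \cap (U_1 \cap \cdots \cap U_n), \]
for all $n \in \mathbb{N}$.

Observe that $\varepsilon_n < \frac{1}{2^n}$ for all $n \in \mathbb{N}$. Suppose $m > n$. By the triangle inequality,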

\[ d(x_m, x_n) \le d(x_m, x_{m-1}) + \cdots + d(x_{n+1}, x_n). \]

The sequence $(x_n)_{n=0}^{\infty}$ was chosen so that $d(x_{n+1}, x_n) \le \frac{\varepsilon_n}{2}$. Consequently,

\[ d(x_m, x_n) \le \frac{\varepsilon_{m-1}}{2} + \cdots + \frac{\varepsilon_n}{2} < \frac{1}{2^m} + \cdots + \frac{1}{2^{n+1}} = \frac{1}{2^n} - \frac{1}{2^m} < \frac{1}{2^n}. \]
It follows that $(x_n)_{n=0}^{\infty}$ is a Cauchy sequence. Hence, by completeness, there exists a point $x \in M$ such that $x = \lim_{n\to\infty} x_n$. If $m > n$, then $x_m \in B(x_n,\varepsilon_n)$, by construction. We conclude that $d(x_m, x_n) < \varepsilon_n$ for all $m > n$, and hence $d(x, x_n) \le \varepsilon_n$ for all $n \in \mathbb{N}$. Thus,
\[ x \in \overline{B(x_n,\varepsilon_n)} \subseteq \overline{B\left(x_{n-1}, \tfrac{\varepsilon_{n-1}}{2}\right)} \subseteq B(x_{n-1}, \varepsilon_{n-1}), \]
and so $x \in V \cap (U_1 \cap \cdots \cap U_{n-1})$ for all $n \in \mathbb{N}$. It follows that $x \in V \cap \bigcap_{n=1}^{\infty} U_n$.

We have shown that the intersection of $\bigcap_{n=1}^{\infty} U_n$ and any nonempty open set $V$ in $M$ is nonempty. Therefore, the set $\bigcap_{n=1}^{\infty} U_n$ is dense in $M$. □

The name Baire Category Theorem is derived from a complementary formulation of Theorem 4.1. If $U$ is a dense open set in $M$, then its complement $M\setminus U$ is closed with empty interior. This fact motivates the following definitions.

Definition 4.5 Let $M$ be a topological space. A set $E \subseteq M$ is said to be nowhere dense if the closure of $E$ in $M$ has empty interior; that is to say, if $\operatorname{int}(\overline{E}) = \emptyset$. A set $G \subseteq M$ is called first category (also known as meager) if there exists a sequence $(E_n)_{n=1}^{\infty}$ of nowhere dense sets such that $G \subseteq \bigcup_{n=1}^{\infty} E_n$. A set is called second category if it is not first category. A second category set is also known as non-meager. The complement of a meager set is called a residual set.

Proposition 4.6 A countable union of first category sets is first category.

Proof A countable union of countably many sets is a countable union of sets. □

Theorem 4.7 (Baire Category Theorem, complementary version) In a complete metric space, any dense $G_\delta$-set is second category.

Proof Let $G$ be a dense $G_\delta$-set. Suppose that $G$ is first category. Then there exists a sequence $(E_n)_{n=1}^{\infty}$ of nowhere dense sets such that $G \subseteq \bigcup_{n=1}^{\infty} E_n$. Without loss of generality, we may assume $E_n$ is closed for each $n \in \mathbb{N}$. Now, for each $n \in \mathbb{N}$, let $V_n$ be the complement of $E_n$. Then $V_n$ is both open and dense for each $n \in \mathbb{N}$. Therefore, by Theorem 4.1, the intersection $\bigcap_{n=1}^{\infty} V_n$ is a dense $G_\delta$-set. Since $G$ is a subset of $\bigcup_{n=1}^{\infty} E_n$, it is disjoint from $\bigcap_{n=1}^{\infty} V_n$. Thus, we have disjoint dense $G_\delta$-sets. This contradicts Corollary 4.2, and so $G$ is not first category. □

The intuition behind the proof of Theorem 4.7 is that a countable union of small sets is still small. In this case, by small we mean meager. This notion of size applies to any metric space.

Example 4.8 Consider the real line $\mathbb{R}$. We now have two natural notions of smallness for sets of real numbers: first category and zero measure. It is natural to wonder if there is any relationship between the two. Consider the set of rational numbers $\mathbb{Q} \subseteq \mathbb{R}$. If $\lambda$ represents Lebesgue measure on $\mathbb{R}$, then $\lambda(\mathbb{Q}) = 0$ (because the set of rational numbers is countable). Therefore, there exists a sequence $(U_n)_{n=1}^{\infty}$ of open sets such that $\mathbb{Q} \subseteq U_n$ and $\lambda(U_n) < \frac{1}{n}$ for each $n \in \mathbb{N}$. Let $G = \bigcap_{n=1}^{\infty} U_n$. Then $\lambda(G) = 0$, but $G$ cannot be first category, since $\mathbb{Q} \subseteq G$ and $\mathbb{Q}$ is dense in $\mathbb{R}$. On the other hand, the complement of $G$ is first category, but must be of infinite measure.

For further reading on the analogies between topological spaces and measure spaces, the curious reader might consider Measure and Category by John Oxtoby [29].

4.2 Applications of Category

In this section, we will investigate some implications of category in Banach spaces. In particular, we will learn the Uniform Boundedness Principle and see the important role it plays in the study of Fourier series.

Theorem 4.9 (Uniform Boundedness Principle) Let $X$ and $Y$ be Banach spaces and let $T_i : X \to Y$ be a bounded linear operator for each $i \in I$, where $I$ is an index set. In order that there exists a uniform bound $M$ such that $\|T_i\| \le M$ for all $i \in I$, it is necessary and sufficient that for each $x \in X$, there exists a constant $C_x > 0$ such that $\|T_i(x)\|_Y \le C_x$ for all $i \in I$.

Proof Necessity is immediate. To show sufficiency, assume that for each $x \in X$ there exists a constant $C_x > 0$ such that $\|T_i(x)\|_Y \le C_x$ for all $i \in I$. For each $n \in \mathbb{N}$, let
\[ A_n = \{x : \|T_i(x)\|_Y \le n \text{ for all } i \in I\} = \bigcap_{i\in I} \{x : \|T_i(x)\|_Y \le n\} = \bigcap_{i\in I} T_i^{-1}(nB_Y). \]

The set $nB_Y$ is the closed ball centered at $0 \in Y$ with radius $n$. Since $T_i$ is continuous for each $i \in I$, it follows that $A_n$ is a closed set.

By assumption, every $x \in X$ is contained in $A_n$ for some $n \in \mathbb{N}$, and therefore we must have $X = \bigcup_{n=1}^{\infty} A_n$. By the Baire Category Theorem (Theorem 4.7), the complete space $X$ is not first category, so not every $A_n$ can be nowhere dense, and so there exists some $n_0 \in \mathbb{N}$ such that the closed set $A_{n_0}$ has nonempty interior. Thus, there exists some $x_0 \in A_{n_0}$ and $\delta > 0$ such that
\[ x_0 + \delta B_X = \{x : \|x - x_0\|_X \le \delta\} \subseteq A_{n_0}. \]
Now let $u \in B_X$. Then $x_0 + \delta u \in A_{n_0}$, and so it follows that $\|T_i(x_0 + \delta u)\|_Y \le n_0$ for all $i \in I$. We chose $x_0$ to be in $A_{n_0}$, and thus $\|T_i(x_0)\|_Y \le n_0$ for all $i \in I$. Consequently, it must be that $\|T_i(\delta u)\|_Y \le 2n_0$ for all $i \in I$. By linearity, since $\delta > 0$ is a constant, we conclude that $\|T_i(u)\|_Y \le 2n_0/\delta$ for all $u \in B_X$. Therefore, $\|T_i\| \le 2n_0/\delta$ for each $i \in I$, and the proof is complete. □

The following theorem is a more concise statement of Theorem 4.9.

Theorem 4.10 (Uniform Boundedness Principle) Let $X$ and $Y$ be Banach spaces. If $\{T_i : i \in I\}$ is a collection of bounded linear operators from $X$ to $Y$, then $\sup_{i\in I} \|T_i(x)\|_Y < \infty$ for each $x \in X$ if and only if $\sup_{i\in I} \|T_i\| < \infty$.

We now provide some definitions that lead to a straightforward application of the Uniform Boundedness Principle.

Definition 4.11 Let $X$ be a Banach space and let $A$ be a subset of $X$. The set $A$ is said to be bounded if $\sup_{a\in A} \|a\| < \infty$. The set $A$ is called weakly bounded if $\sup_{a\in A} |x^*(a)| < \infty$ for all $x^* \in X^*$.

Theorem 4.12 Let $X$ be a Banach space. A subset of $X$ is bounded if and only if it is weakly bounded.

Proof Let $A$ be a subset of $X$. Certainly, if $A$ is bounded, then it is weakly bounded. Assume now that $A$ is weakly bounded. For each $a \in A$, define a scalar-valued function $\varphi_a$ on $X^*$ by
\[ \varphi_a(x^*) = x^*(a), \quad x^* \in X^*. \]

For each $a \in A$, the function $\varphi_a$ is bounded and linear. Indeed, for a given $a \in A$,
\[ \|\varphi_a\| = \sup_{x^*\in B_{X^*}} |\varphi_a(x^*)| = \sup_{x^*\in B_{X^*}} |x^*(a)| = \|a\|. \]

By the weakly bounded assumption on $A$, for each $x^* \in X^*$,

\[ \sup_{a\in A} |\varphi_a(x^*)| = \sup_{a\in A} |x^*(a)| < \infty. \]
Thus, by Theorem 4.10 (the Uniform Boundedness Principle), we conclude that $\sup_{a\in A} \|\varphi_a\| < \infty$. Since $\|\varphi_a\| = \|a\|$, we have shown that $A$ is bounded. □

Definition 4.13 Let $X$ be a Banach space and let $A$ be a subset of $X^*$. The set $A$ is called weak$^*$-bounded if $\sup_{a^*\in A} |a^*(x)| < \infty$ for all $x \in X$.

Theorem 4.14 Let $X$ be a Banach space. A subset of $X^*$ is bounded if and only if it is weak$^*$-bounded.

Proof The argument parallels the proof of Theorem 4.12 and is left to the reader. □

The following significant theorem is a consequence of the Uniform Boundedness Principle.

Theorem 4.15 (Banach-Steinhaus Theorem) Let $X$ and $Y$ be Banach spaces. Suppose $(S_n)_{n=1}^{\infty}$ is a sequence of bounded linear operators from $X$ to $Y$. If $\lim_{n\to\infty} S_n x$ exists for each $x \in X$, then

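(i) $\sup_{n\in\mathbb{N}} \|S_n\| < \infty$, and
(ii) if $Tx = \lim_{n\to\infty} S_n x$ for all $x \in X$, then $T$ is a bounded linear operator.

Proof (i) If $(S_n x)_{n=1}^{\infty}$ converges, then $\sup_{n\in\mathbb{N}} \|S_n x\| < \infty$. The Uniform Boundedness Principle then implies that $\sup_{n\in\mathbb{N}} \|S_n\| < \infty$.

(ii) A simple check reveals that $T$ is linear. To show that $T$ is bounded, observe that

\[ \|Tx\| \le \sup_{n\in\mathbb{N}} \|S_n x\| \le \left( \sup_{n\in\mathbb{N}} \|S_n\| \right) \|x\|. \]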

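Taking the supremum over $x \in B_X$ provides the desired result. □

The next result is in some sense a converse to Theorem 4.15.

Theorem 4.16 Let $X$ and $Y$ be Banach spaces. If $(S_n)_{n=1}^{\infty}$ is a sequence of uniformly bounded linear operators from $X$ to $Y$, then the set $E = \{x : \lim_{n\to\infty} S_n x \text{ exists}\}$ is a closed linear subspace of $X$.

Proof We need only show that $E$ is closed. Suppose that $x \in \overline{E}$, the closure of $E$. By assumption, there exists some $M > 0$ such that $\|S_n\| \le M$ for all $n \in \mathbb{N}$. Let $\varepsilon > 0$. Since $x$ is in the closure of $E$, there exists some $y \in E$ such that $\|x - y\| < \varepsilon/(3M)$. By assumption, the sequence $(S_n y)_{n=1}^{\infty}$ is a Cauchy sequence in $Y$, and so there exists some natural number $N$ such that $\|S_m y - S_n y\| < \varepsilon/3$ whenever $m \ge N$ and $n \ge N$. Let $m$ and $n$ be natural numbers such that $m \ge N$ and $n \ge N$. Then,

\[ \|S_m x - S_n x\| \le \|S_m x - S_m y\| + \|S_m y - S_n y\| + \|S_n y - S_n x\| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon, \]
where the first and third terms are each bounded by $M\|x - y\| < \varepsilon/3$. Hence $(S_n x)_{n=1}^{\infty}$ is a Cauchy sequence in the Banach space $Y$, and so it converges. Therefore $x \in E$, and $E$ is closed. □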

Example 4.17 For $f \in C(\mathbb{T})$ and $n \in \mathbb{Z}$, the Fourier coefficients of $f$ are given by
\[ \hat{f}(n) = \int_0^{2\pi} f(\theta)\, e^{-in\theta}\, \frac{d\theta}{2\pi}. \tag{4.1} \]
We wish to determine if it is possible to reconstruct $f$ from its Fourier coefficients. To that end, we define the Fourier series of $f$ by

\[ \sum_{n\in\mathbb{Z}} \hat{f}(n)\, e^{in\theta}. \tag{4.2} \]

If $f$ is a trigonometric polynomial, then there exists a sequence of scalars $(a_k)_{k\in\mathbb{Z}}$ with only finitely many nonzero terms such that $f(\theta) = \sum_{k\in\mathbb{Z}} a_k e^{ik\theta}$ for all $\theta \in \mathbb{T}$. In this case, it is easy to see that $\hat{f}(n) = a_n$ for each $n \in \mathbb{Z}$. (See Exercise 4.9.)

Since a trigonometric polynomial is equal to its Fourier series, it is natural to ask the following: Does the Fourier series of a general continuous function $f$ converge to $f$? To make this question more precise, define for each $N \in \mathbb{N}$ the $N$th partial sum operator $S_N : C(\mathbb{T}) \to C(\mathbb{T})$ as follows:

\[ S_N f(\theta) = \sum_{n=-N}^{N} \hat{f}(n)\, e^{in\theta}, \quad \theta \in [0, 2\pi). \tag{4.3} \]

The question now becomes: Is it the case that $\|S_N f - f\|_\infty \to 0$ as $N \to \infty$ for all $f$ in $C(\mathbb{T})$? (In other words, does $S_N f$ always converge uniformly to $f$?) If the answer to this question is "yes," then the partial sum operators $S_N$ must be uniformly bounded, by the Uniform Boundedness Principle. We will therefore show the answer is "no" by computing the operator norm $\|S_N\|$ for each $N \in \mathbb{N}$, and then showing that these norms are not uniformly bounded.

Fix $N \in \mathbb{N}$ and $f \in C(\mathbb{T})$. Computing directly, by substituting (4.1) into (4.3), we have for each $\theta \in [0, 2\pi)$,
\[ S_N f(\theta) = \sum_{n=-N}^{N} \left( \int_0^{2\pi} f(\varphi)\, e^{-in\varphi}\, \frac{d\varphi}{2\pi} \right) e^{in\theta} = \int_0^{2\pi} f(\varphi) \left( \sum_{n=-N}^{N} e^{in(\theta-\varphi)} \right) \frac{d\varphi}{2\pi}. \]

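The sum appearing in the rightmost integral above is a geometric series with constant ratio $e^{i(\theta-\varphi)}$. We calculate the sum of this geometric series as follows:

\[ \sum_{n=-N}^{N} e^{in(\theta-\varphi)} = e^{-iN(\theta-\varphi)} \sum_{n=0}^{2N} e^{in(\theta-\varphi)} = e^{-iN(\theta-\varphi)}\, \frac{1 - e^{i(2N+1)(\theta-\varphi)}}{1 - e^{i(\theta-\varphi)}} \]
\[ = \frac{e^{-iN(\theta-\varphi)} - e^{i(N+1)(\theta-\varphi)}}{1 - e^{i(\theta-\varphi)}} = \frac{e^{i(N+1)(\theta-\varphi)} - e^{-iN(\theta-\varphi)}}{e^{i(\theta-\varphi)} - 1}. \]

We reduce the final fraction by dividing the numerator and denominator by $e^{\frac{i}{2}(\theta-\varphi)}$, and so the above equation becomes
\[ \sum_{n=-N}^{N} e^{in(\theta-\varphi)} = \frac{e^{i(N+\frac12)(\theta-\varphi)} - e^{-i(N+\frac12)(\theta-\varphi)}}{e^{\frac{i}{2}(\theta-\varphi)} - e^{-\frac{i}{2}(\theta-\varphi)}} = \frac{\sin\left((N+\frac12)(\theta-\varphi)\right)}{\sin\left(\frac{\theta-\varphi}{2}\right)}. \]

Substituting this into the formula for $S_N f(\theta)$, we conclude
\[ S_N f(\theta) = \int_0^{2\pi} f(\varphi)\, \frac{\sin\left((N+\frac12)(\theta-\varphi)\right)}{\sin\left(\frac{\theta-\varphi}{2}\right)}\, \frac{d\varphi}{2\pi}. \tag{4.4} \]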
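The closed form used in (4.4) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `exponential_sum` and `closed_form` are hypothetical, and `alpha` stands for the difference θ − φ, assumed not to be a multiple of 2π so that the denominator is nonzero.

```python
import cmath
import math


def exponential_sum(n: int, alpha: float) -> complex:
    """Sum of exp(i*k*alpha) for k = -n, ..., n, computed term by term."""
    return sum(cmath.exp(1j * k * alpha) for k in range(-n, n + 1))


def closed_form(n: int, alpha: float) -> float:
    """sin((n + 1/2) alpha) / sin(alpha / 2); alpha must not be a multiple of 2*pi."""
    return math.sin((n + 0.5) * alpha) / math.sin(alpha / 2.0)


if __name__ == "__main__":
    for n in (0, 1, 5, 20):
        for alpha in (0.3, 1.7, -2.4, 5.0):
            gap = abs(exponential_sum(n, alpha) - closed_form(n, alpha))
            print(n, alpha, gap)
```

The printed gaps should be at the level of floating-point rounding, which is consistent with the geometric series computation above.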

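Assume there exists a constant $M > 0$ such that $\|S_N\| \le M$ for all $N \in \mathbb{N}$. By Proposition 3.37, each adjoint operator $S_N^* : M(\mathbb{T}) \to M(\mathbb{T})$ must also be bounded by $M$; that is, $\|S_N^*\| \le M$ for all $N \in \mathbb{N}$. Let $\delta_0$ denote the Dirac measure at $0$ (which was defined in (2.8)). It is easy to see that $\|\delta_0\|_M = 1$, and as a consequence,
\[ \|S_N^* \delta_0\|_M \le \|S_N^*\| \le M. \tag{4.5} \]
Let $f \in C(\mathbb{T})$. Applying the definitions, and using (4.4), we see
\[ \langle f, S_N^* \delta_0 \rangle = S_N f(0) = \int_0^{2\pi} f(\varphi)\, \frac{\sin\left((N+\frac12)\varphi\right)}{\sin(\varphi/2)}\, \frac{d\varphi}{2\pi}. \]
This equality is valid for all $f \in C(\mathbb{T})$. Therefore, because $S_N^* \delta_0$ is in $M(\mathbb{T})$,
\[ \|S_N^* \delta_0\|_M = \int_0^{2\pi} \left| \frac{\sin\left((N+\frac12)\varphi\right)}{\sin(\varphi/2)} \right| \frac{d\varphi}{2\pi}. \]

We make a change of variables: Let $\psi = (N + \frac12)\varphi$, and so $d\psi = (N + \frac12)\,d\varphi$. Then
\[ \|S_N^* \delta_0\|_M = \int_0^{(2N+1)\pi} \left| \frac{\sin\psi}{\sin\left(\frac{\psi}{2N+1}\right)} \right| \frac{d\psi}{(2N+1)\pi}. \]
Combining this with (4.5), we conclude that
\[ \frac{1}{\pi} \int_0^{\infty} \chi_{(0,(2N+1)\pi)}(\psi)\, \frac{1}{2N+1} \left| \frac{\sin\psi}{\sin\left(\frac{\psi}{2N+1}\right)} \right| d\psi \le M, \]
for all $N \in \mathbb{N}$. A quick application of l'Hôpital's Rule reveals
\[ \lim_{N\to\infty} \frac{1}{2N+1}\, \frac{\sin\psi}{\sin\left(\frac{\psi}{2N+1}\right)} = \frac{\sin\psi}{\psi}. \]
Consequently, by Fatou's Lemma,
\[ \frac{1}{\pi} \int_0^{\infty} \left| \frac{\sin\psi}{\psi} \right| d\psi \le M. \]
This is, however, a contradiction, since $\int_0^{\infty} \left| \frac{\sin\psi}{\psi} \right| d\psi = \infty$. To see this, observe that
\[ \int_0^{\infty} \left| \frac{\sin\psi}{\psi} \right| d\psi = \sum_{k=0}^{\infty} \int_{k\pi}^{(k+1)\pi} \left| \frac{\sin\psi}{\psi} \right| d\psi \ge \sum_{k=0}^{\infty} \frac{1}{(k+1)\pi} \int_0^{\pi} \sin\psi\, d\psi = \frac{2}{\pi} \sum_{k=0}^{\infty} \frac{1}{k+1} = \infty. \]
Therefore, it cannot be true that $\sup_{N\in\mathbb{N}} \|S_N\| \le M$ for some $M \ge 0$, and so it is not possible that the Fourier series of $f$ converges uniformly to $f$ for all $f \in C(\mathbb{T})$.

Remark 4.18 Before the advent of functional analysis, and in particular the Uniform Boundedness Principle, the only way to demonstrate that $S_N f$ did not converge uniformly to $f$ for all $f \in C(\mathbb{T})$ was to construct an explicit function for which the desired convergence failed.

In fact, in Example 4.17, we actually proved that $(S_N f(0))_{N=1}^{\infty}$ fails to converge to $f(0)$ for all $f$ in a dense $G_\delta$-subset of $C(\mathbb{T})$. Thus, there are many functions for which the Fourier series of $f$ does not even converge pointwise to $f$.

Definition 4.19 For any natural number $N$, the Dirichlet kernel of degree $N$ is the function
\[ D_N(\alpha) = \sum_{n=-N}^{N} e^{in\alpha} = \frac{\sin\left((N+\frac12)\alpha\right)}{\sin(\alpha/2)}, \quad \alpha \in \mathbb{R}. \]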
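The unboundedness of the norms in Example 4.17 can be observed numerically. The sketch below is an illustration under stated assumptions rather than part of the original argument: it approximates $\frac{1}{2\pi}\int_0^{2\pi} |D_N(\varphi)|\,d\varphi$ with a midpoint rule, and the names `dirichlet` and `lebesgue_constant` as well as the sample count are hypothetical choices. For $N = 1$ the integral can be evaluated by hand as $\frac13 + \frac{2\sqrt{3}}{\pi}$, which gives a sanity check on the quadrature.

```python
import math


def dirichlet(n: int, a: float) -> float:
    """Dirichlet kernel D_n(a), using the limiting value 2n + 1 where sin(a/2) vanishes."""
    s = math.sin(a / 2.0)
    if abs(s) < 1e-12:
        return 2.0 * n + 1.0
    return math.sin((n + 0.5) * a) / s


def lebesgue_constant(n: int, samples: int = 100_000) -> float:
    """Midpoint-rule approximation of (1 / 2pi) * integral over [0, 2pi] of |D_n|."""
    h = 2.0 * math.pi / samples
    total = sum(abs(dirichlet(n, (j + 0.5) * h)) for j in range(samples))
    return total * h / (2.0 * math.pi)


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        print(n, lebesgue_constant(n))
```

The printed values grow slowly with $N$, in line with the divergence of $\int_0^\infty |\sin\psi/\psi|\,d\psi$ established above.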

Observe that

\[ S_N f(\theta) = \int_0^{2\pi} f(\varphi)\, D_N(\theta - \varphi)\, \frac{d\varphi}{2\pi} = (D_N * f)(\theta), \tag{4.6} \]

where $S_N$ is the $N$th partial sum operator. We know from Example 4.17 that $D_N * f$ does not converge uniformly to $f$ for every $f \in C(\mathbb{T})$; however, we can find a related kernel for which uniform convergence does hold for all continuous functions on $\mathbb{T}$.

Definition 4.20 For any natural number $N$, the Fejér kernel of degree $N$ is the function
\[ K_N(t) = \frac{1}{N} \sum_{k=0}^{N-1} D_k(t), \quad t \in \mathbb{R}. \]
The $N$th Cesàro mean of $f \in C(\mathbb{T})$ is the function given by the formula
\[ T_N f = \frac{1}{N} \left( S_0 f + \cdots + S_{N-1} f \right). \]

Using (4.6), we can derive a relationship between $K_N$ and $T_N$. For any $N \in \mathbb{N}$ and $\theta \in [0, 2\pi)$,

\[ T_N f(\theta) = \int_0^{2\pi} f(t)\, K_N(\theta - t)\, \frac{dt}{2\pi} = (K_N * f)(\theta). \]
We now find a closed-form formula for the Fejér kernel of degree $N$.

Lemma 4.21 If KN is the Fejér kernel of degree N, then

\[ K_N(t) = \frac{1}{2N}\, \frac{1 - \cos(Nt)}{\sin^2(t/2)} = \frac{1}{N}\, \frac{\sin^2(Nt/2)}{\sin^2(t/2)}. \]

Proof Recall the identity $\sin A \sin B = \frac12 \left( \cos(A - B) - \cos(A + B) \right)$, where $A$ and $B$ are real numbers. For $t \in \mathbb{R}$,
\[ K_N(t) = \frac{1}{N} \sum_{k=0}^{N-1} \frac{\sin\left((k+\frac12)t\right)}{\sin(t/2)} = \frac{1}{N} \sum_{k=0}^{N-1} \frac{\sin\left((k+\frac12)t\right) \sin(t/2)}{\sin^2(t/2)} = \frac{1}{2N} \sum_{k=0}^{N-1} \frac{\cos(kt) - \cos((k+1)t)}{\sin^2(t/2)}. \]
The first equality in the conclusion of the lemma follows directly from the above series, because it is a telescoping series. The second equality in the conclusion of the lemma follows from the application of a half-angle identity. □

Lemma 4.22 If $T_N$ is the $N$th Cesàro mean, then $\|T_N\| = 1$.

Proof Let $f \in C(\mathbb{T})$. By a straightforward change of variables, and using the translation-invariance of Lebesgue measure, coupled with the periodicity of functions on $\mathbb{T}$, we have

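\[ T_N f(\theta) = \int_0^{2\pi} f(\theta - t)\, K_N(t)\, \frac{dt}{2\pi}, \quad \theta \in [0, 2\pi). \]
It follows that $\|T_N f\|_{C(\mathbb{T})} \le \|f\|_{C(\mathbb{T})}\, \|K_N\|_{L_1(\mathbb{T})}$. By Lemma 4.21, we know that $K_N \ge 0$, and so $\|K_N\|_{L_1(\mathbb{T})} = \frac{1}{2\pi} \int_0^{2\pi} K_N(t)\,dt$. To find the value of $\|K_N\|_{L_1(\mathbb{T})}$, we compute it directly:
\[ \frac{1}{2\pi} \int_0^{2\pi} K_N(t)\,dt = \frac{1}{2\pi} \int_0^{2\pi} \frac{1}{N} \sum_{k=0}^{N-1} D_k(t)\,dt = \frac{1}{2\pi N} \sum_{k=0}^{N-1} \sum_{n=-k}^{k} \int_0^{2\pi} e^{int}\,dt. \]

Observe that
\[ \int_0^{2\pi} e^{int}\,dt = \begin{cases} 0 & \text{if } n \ne 0, \\ 2\pi & \text{if } n = 0. \end{cases} \]
Consequently,

\[ \frac{1}{2\pi} \int_0^{2\pi} K_N(t)\,dt = \frac{1}{2\pi N} \sum_{k=0}^{N-1} \left( \sum_{n=-k}^{k} \int_0^{2\pi} e^{int}\,dt \right) = \frac{1}{2\pi N} \sum_{k=0}^{N-1} 2\pi = 1. \]

It follows that $\|T_N\| \le \|K_N\|_{L_1(\mathbb{T})} = 1$. Since $T_N$ applied to the constant function $1$ gives $T_N 1 = \frac{1}{2\pi} \int_0^{2\pi} K_N(t)\,dt = 1$, we conclude that $\|T_N\| = 1$. □

The next proposition provides a good example of how the Uniform Boundedness Principle is actually used in practice (in the guise of the Banach-Steinhaus Theorem).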
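Lemmas 4.21 and 4.22 lend themselves to a quick numerical check. The following sketch (the function names and the sample count are assumptions made for illustration) compares the average of Dirichlet kernels with the closed form of Lemma 4.21, and approximates the normalized integral of $K_N$ by a midpoint rule.

```python
import math


def dirichlet(k: int, t: float) -> float:
    """Dirichlet kernel D_k(t), with the limiting value 2k + 1 where sin(t/2) vanishes."""
    s = math.sin(t / 2.0)
    if abs(s) < 1e-12:
        return 2.0 * k + 1.0
    return math.sin((k + 0.5) * t) / s


def fejer_average(n: int, t: float) -> float:
    """K_n(t) from Definition 4.20: the average of D_0, ..., D_{n-1}."""
    return sum(dirichlet(k, t) for k in range(n)) / n


def fejer_closed(n: int, t: float) -> float:
    """K_n(t) from the closed form in Lemma 4.21, with the limiting value n at t = 0."""
    s = math.sin(t / 2.0)
    if abs(s) < 1e-12:
        return float(n)
    return math.sin(n * t / 2.0) ** 2 / (n * s * s)


def normalized_integral(n: int, samples: int = 20_000) -> float:
    """Midpoint-rule approximation of (1 / 2pi) * integral over [0, 2pi] of K_n."""
    h = 2.0 * math.pi / samples
    total = sum(fejer_closed(n, (j + 0.5) * h) for j in range(samples))
    return total * h / (2.0 * math.pi)


if __name__ == "__main__":
    print(fejer_average(8, 1.0), fejer_closed(8, 1.0))
    print(normalized_integral(8))
```

Since the closed form is a ratio of squares, nonnegativity of $K_N$ is visible directly in `fejer_closed`, which is the property that makes $\|K_N\|_{L_1(\mathbb{T})}$ equal to the plain integral.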

Proposition 4.23 If f ∈ C(T), then KN ∗ f → f uniformly.

Proof If $f$ is a trigonometric polynomial, then it is easy to see that $T_N f \to f$ uniformly. By the Weierstrass Approximation Theorem, the trigonometric polynomials are dense in $C(\mathbb{T})$. By Theorem 4.16, the set of functions for which this limit exists is closed. It follows that $T_N f \to f$ in $C(\mathbb{T})$ as $N \to \infty$ for all $f \in C(\mathbb{T})$. Since $T_N f = K_N * f$, the result follows. □
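To see Proposition 4.23 at work on a concrete function, the sketch below approximates $T_N f = K_N * f$ by a midpoint rule and measures the largest error on a grid. Everything named here is a hypothetical choice made for illustration: the test function $f(\theta) = |\theta - \pi|$ (continuous on the circle), the grid sizes, and the function names. It is a numerical illustration only, not a proof.

```python
import math

TWO_PI = 2.0 * math.pi


def fejer_kernel(n: int, t: float) -> float:
    """Closed form of the Fejér kernel (Lemma 4.21), with the limiting value n at t = 0."""
    s = math.sin(t / 2.0)
    if abs(s) < 1e-12:
        return float(n)
    return math.sin(n * t / 2.0) ** 2 / (n * s * s)


def f(theta: float) -> float:
    """Hypothetical test function: 2pi-periodic and continuous, with kinks at 0 and pi."""
    return abs((theta % TWO_PI) - math.pi)


def cesaro_mean(n: int, theta: float, samples: int = 2000) -> float:
    """Midpoint-rule approximation of (K_n * f)(theta)."""
    h = TWO_PI / samples
    total = 0.0
    for j in range(samples):
        t = (j + 0.5) * h
        total += f(t) * fejer_kernel(n, theta - t)
    return total * h / TWO_PI


def sup_error(n: int, grid: int = 100) -> float:
    """Largest value of |T_n f - f| over an equally spaced grid on [0, 2pi)."""
    points = [TWO_PI * i / grid for i in range(grid)]
    return max(abs(cesaro_mean(n, p) - f(p)) for p in points)


if __name__ == "__main__":
    for n in (4, 32):
        print(n, sup_error(n))
```

The grid error shrinks as $N$ grows, as the proposition predicts; the slow rate near the kinks reflects that Proposition 4.23 guarantees convergence but says nothing about speed.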

4.3 The Open Mapping and Closed Graph Theorems

We begin this section with a topological definition.

Definition 4.24 Let $X$ and $Y$ be topological spaces. A map $T : X \to Y$ is called open (or an open map) if $T(U)$ is open in $Y$ whenever $U$ is open in $X$.

Before stating the next proposition, we recall that $B_X = \{x : \|x\| \le 1\}$ and $\operatorname{int}(B_X) = \{x : \|x\| < 1\}$, where $X$ is a normed space. Naturally, $B_X$ is a closed set and $\operatorname{int}(B_X)$ is an open set (in $X$).

Proposition 4.25 Let $X$ and $Y$ be Banach spaces and suppose $T : X \to Y$ is a bounded linear operator. The map $T$ is an open map if and only if there exists a $\delta > 0$ such that $\delta B_Y \subseteq T(B_X)$.

Proof Assume $T$ is an open map. The set $\operatorname{int} B_X$ is open in $X$, and so $T(\operatorname{int} B_X)$ is open in $Y$. By the linearity of $T$, we have $0 \in T(\operatorname{int} B_X)$. It follows that $T(\operatorname{int} B_X)$ contains a basic neighborhood of $0$. Therefore, there exists a $\delta > 0$ such that

δBY ⊆ T (intBX) ⊆ T (BX).

Conversely, assume there exists a δ>0 such that δBY ⊆ T (BX). We wish to show that T (U) is open in Y whenever U is an open set in X. To that end, let U be open in X and let y ∈ T (U). There exists some x ∈ U such that Tx = y. Since x ∈ U, and U is open in X, there is some ν>0 such that x + νBX ⊆ U. It follows that y + νT (BX) ⊆ T (U). By our assumption, this implies that y + νδBY ⊆ T (U). Consequently, y ∈ intT (U), and so the set T (U) is open in Y . Therefore, T is an open map, as required. 2 Corollary 4.26 Let X and Y be Banach spaces and suppose T : X → Y is a bounded linear operator. (i) If T is open, then T maps X onto Y . (ii) If T is one-to-one and open, then T is invertible and T −1 is continuous.

Proof (i) Suppose $T$ is an open map. By Proposition 4.25, there exists a $\delta > 0$ such that $\delta B_Y \subseteq T(B_X)$. Let $y \in Y$. It must be that $y \in \|y\| B_Y$, and so
\[ y \in \frac{\|y\|}{\delta} \cdot \delta B_Y \subseteq \frac{\|y\|}{\delta}\, T(B_X). \]
It follows that $y = T\left( \frac{\|y\|}{\delta}\, x \right)$ for some $x \in B_X$, and so $T$ is onto.

(ii) If $T$ is open, then $T$ is onto, by (i). Since $T$ is also assumed to be one-to-one, the inverse function $T^{-1}$ is well-defined. It remains to show that $T^{-1} : Y \to X$ is continuous. If $U$ is an open set in $X$, then $(T^{-1})^{-1}(U) = T(U)$ is open in $Y$ (since $T$ is an open map). The preimage of an open set is open, and so we conclude that $T^{-1}$ is continuous. □

Definition 4.27 Let $X$ and $Y$ be Banach spaces and suppose $T : X \to Y$ is a bounded linear operator. The map $T$ is said to be almost open if there exists a $\delta > 0$ such that $\delta B_Y \subseteq \overline{T(B_X)}$.

Proposition 4.28 Let $X$ and $Y$ be Banach spaces and suppose $T : X \to Y$ is a bounded linear operator. If $T$ is almost open, then $T$ is open.

Proof By assumption, $T$ is almost open, and therefore there exists a $\delta \in (0,1)$ such that $\delta B_Y \subseteq \overline{T(B_X)}$. If $y \in B_Y$, then $\delta y \in \delta B_Y \subseteq \overline{T(B_X)}$, and so for any $\nu > 0$, there exists some $x \in B_X$ such that $\|\delta y - Tx\| < \nu$. Consequently, for any $y \in B_Y$ and $\nu > 0$, there exists an $x \in B_X$ such that $\|y - T(x/\delta)\| < \nu/\delta$.

If $\hat{y}$ is an element of $Y$ such that $0 < \|\hat{y}\| < \delta$, then $\hat{y}/\|\hat{y}\| \in B_Y$. Therefore, for any $\nu > 0$, there exists an element $x \in B_X$ such that
\[ \left\| \frac{\hat{y}}{\|\hat{y}\|} - T\left( \frac{x}{\delta} \right) \right\| < \frac{\nu}{\delta}. \tag{4.7} \]

Consequently, for any $\hat{y} \in \operatorname{int}(\delta B_Y)$ and $\nu > 0$, we can find an $\hat{x} \in X$ such that
\[ \|\hat{y} - T(\hat{x})\| \le \frac{\nu}{\delta}\, \|\hat{y}\|, \qquad \|\hat{x}\| \le \frac{\|\hat{y}\|}{\delta}. \tag{4.8} \]
If $\hat{y} = 0$, we choose $\hat{x} = 0$. If $\hat{y} \ne 0$, then we choose $\hat{x} = \frac{x \|\hat{y}\|}{\delta}$, where $x$ is the element of $B_X$ chosen to satisfy (4.7).

Our goal is to show that $T$ is an open map. By Proposition 4.25, it will suffice to show that $\operatorname{int}(\delta B_Y) \subseteq T(B_X)$. To that end, let $y \in \operatorname{int}(\delta B_Y)$. Then $\|y\| < \delta$. Let $\beta$ be any real number such that $\|y\| < \beta < \delta$ and choose a real number $\nu$ such that $0 < \nu < \delta - \beta$. We assumed $\|y\| < \delta$, and so, by (4.8), there exists an $x_1 \in X$ such that
\[ \|y - T(x_1)\| \le \frac{\nu}{\delta}\, \beta, \qquad \|x_1\| \le \frac{\beta}{\delta}. \]

Now observe that $\nu < \delta$. This implies that $y - T(x_1)$ is an element of $Y$ with the property that $\|y - T(x_1)\| < \beta < \delta$. Again we use (4.8), only this time with the element $y - T(x_1)$, and we find an $x_2 \in X$ such that
\[ \|y - T(x_1) - T(x_2)\| \le \left( \frac{\nu}{\delta} \right)^2 \beta, \qquad \|x_2\| \le \frac{\nu}{\delta^2}\, \beta. \]
Continuing inductively, we construct a sequence $(x_n)_{n=1}^{\infty}$ such that

\[ \left\| y - \sum_{k=1}^{n} T(x_k) \right\| \le \left( \frac{\nu}{\delta} \right)^n \beta, \qquad \|x_n\| \le \frac{\nu^{n-1}}{\delta^n}\, \beta, \tag{4.9} \]
for each $n \in \mathbb{N}$. Since $\nu < \delta - \beta$, we have $\beta < \delta - \nu$, and so
\[ \sum_{k=1}^{\infty} \|x_k\| \le \frac{\beta}{\delta} \sum_{k=1}^{\infty} \left( \frac{\nu}{\delta} \right)^{k-1} = \frac{\beta}{\delta} \cdot \frac{1}{1 - \frac{\nu}{\delta}} = \frac{\beta}{\delta - \nu} < 1. \]
Since $X$ is a Banach space, $\sum_{k=1}^{\infty} x_k$ converges to an element in $X$, by the Cauchy Summability Criterion. Denote this limit in $X$ by $x$. By the triangle inequality, $\|x\| \le \sum_{k=1}^{\infty} \|x_k\| < 1$, and so $x \in B_X$. By (4.9) and the continuity of $T$,
\[ y = \lim_{n\to\infty} \sum_{k=1}^{n} T(x_k) = T(x). \]

Consequently, we have $y \in T(B_X)$. Therefore, $\operatorname{int}(\delta B_Y) \subseteq T(B_X)$, as required, and so $T$ is an open map. □

The next theorem is one of the cornerstones of functional analysis.

Theorem 4.29 (Open Mapping Theorem) Suppose $X$ and $Y$ are Banach spaces. If $T : X \to Y$ is a bounded surjective operator, then $T$ is an open map.

Proof Observe that $Y = T(X) = \bigcup_{n=1}^{\infty} n\, T(B_X)$. Therefore, by Theorem 4.7, the set $\overline{T(B_X)}$ has non-empty interior. Consequently, there exists an element $y \in Y$ and a number $\delta > 0$ such that $y + \delta B_Y \subseteq \overline{T(B_X)}$.

A simple calculation reveals that $-y + \delta B_Y \subseteq \overline{T(B_X)}$, as well, and so it must be the case that $\delta B_Y \subseteq \overline{T(B_X)}$. Therefore, $T$ is almost open. It follows that $T$ is an open map, by Proposition 4.28. □

Corollary 4.30 (Bounded Inverse Theorem) Let $X$ and $Y$ be Banach spaces. If $T : X \to Y$ is a bounded linear bijection, then $T^{-1}$ is a bounded linear operator. Consequently, any continuous linear bijection between Banach spaces is an isomorphism.

Proof By assumption, the map $T$ is a continuous bijection. Since $T$ is surjective, it is an open map, by Theorem 4.29. Because $T$ is an injective open map, it follows that the inverse $T^{-1}$ is bounded, by Corollary 4.26. Therefore, $T$ is a continuous bijection with continuous inverse, and so is an isomorphism between the Banach spaces $X$ and $Y$. □

We have stated the Bounded Inverse Theorem as a corollary to the Open Mapping Theorem, but they are in fact equivalent. (See Exercise 4.21.)

Example 4.31 By Theorem 4.29, any quotient map is an open map. Up to an equivalence, the converse is also true. Suppose $X$ and $Y$ are Banach spaces and let $T : X \to Y$ be a bounded linear operator that is an open mapping. Consider the following commuting diagram:

[Commutative diagram: $T : X \to Y$ along the top, $Q : X \to X/\ker(T)$ down the left, and $T_0 : X/\ker(T) \to Y$ along the diagonal.]

In this diagram, $T = T_0 \circ Q$, where $Q$ is the quotient map onto $X/\ker(T)$. By assumption, the bounded linear operator $T$ is an open map. Thus, by Corollary 4.26, we know that $T$ is a surjection. Consequently, by Proposition 3.49, the map $T_0$ is a continuous linear bijection. Therefore, $T_0$ is an isomorphism, by the Bounded Inverse Theorem (Corollary 4.30). This demonstrates that the open map $T$ can be written as $T_0 \circ Q$, where $Q$ is a quotient map and $T_0$ is an isomorphism. This means that any open map is a quotient map, up to an isomorphism. (Note that the norms of $T$ and $Q$ might not be equal, because $T_0$ need not be an isometry.)

Example 4.32 (Banach-Mazur Characterization of Separable Spaces) A rather remarkable result (dating back to 1933) is that every separable Banach space can be realized as a quotient of $\ell_1$. To see this, let $X$ be a separable Banach space. The closed unit ball $B_X$ has a countable dense subset, say $(x_n)_{n=1}^{\infty}$. Define a bounded linear operator $T : \ell_1 \to X$ by
\[ T(\xi) = \sum_{n=1}^{\infty} \xi_n x_n, \quad \xi = (\xi_n)_{n=1}^{\infty} \in \ell_1. \]

The linearity of $T$ follows from the summability of the terms in the sequence $\xi \in \ell_1$. To show that $T$ is bounded, observe:
\[ \|T(\xi)\| = \left\| \sum_{n=1}^{\infty} \xi_n x_n \right\| \le \sum_{n=1}^{\infty} |\xi_n|\, \|x_n\| \le \sum_{n=1}^{\infty} |\xi_n| = \|\xi\|. \]
(Recall $x_n \in B_X$ for all $n \in \mathbb{N}$.) Consequently, $\|T\| \le 1$, and so $T(B_{\ell_1}) \subseteq B_X$.

Recall that $e_n$ denotes the sequence with $1$ in the $n$th coordinate and $0$ elsewhere, so that $e_n = (0, \ldots, 0, 1, 0, \ldots)$. Certainly $e_n \in B_{\ell_1}$ for each $n \in \mathbb{N}$. Since $T(e_n) = x_n$ for each $n \in \mathbb{N}$, we conclude that $(x_n)_{n=1}^{\infty} \subseteq T(B_{\ell_1})$. By assumption, $(x_n)_{n=1}^{\infty}$ is dense in $B_X$, and so $B_X = \overline{T(B_{\ell_1})}$.

The above argument shows that $T$ is an almost open map (with $\delta = 1$). By Proposition 4.28, an almost open map is an open map. Therefore, by Corollary 4.26, the map $T$ is a surjection. Consider the diagram from Example 4.31, but in our current context:

[Commutative diagram: $T : \ell_1 \to X$ along the top, $Q : \ell_1 \to \ell_1/\ker(T)$ down the left, and $T_0 : \ell_1/\ker(T) \to X$ along the diagonal.]

As we saw in Example 4.31, the map $T_0$ is an isomorphism. Furthermore, $T_0$ is an isometry because $B_X = \overline{T(B_{\ell_1})}$. Thus, the separable Banach space $X$ is isometrically isomorphic to $\ell_1/\ker(T)$.

The following factorization result is a consequence of the Bounded Inverse Theorem, and one we will use on several occasions.

Lemma 4.33 Let $X$ and $Y$ be Banach spaces and suppose $T : X \to Y$ is a surjective bounded linear operator. If $x^* \in (\ker T)^{\perp}$, then there exists some $f \in Y^*$ such that $x^* = f \circ T$.

Proof Let $Q : X \to X/\ker(T)$ be the quotient map. By Proposition 3.49 (noting that $T$ is a surjection), there is a continuous linear bijection $T_0 : X/\ker(T) \to Y$ such that $T = T_0 \circ Q$. By the Bounded Inverse Theorem (Corollary 4.30), we conclude that $T_0$ is an isomorphism of Banach spaces.

It was assumed that $x^* \in (\ker T)^{\perp}$. Hence, by Proposition 3.51, the linear functional $x^*$ determines an element $\hat{f}$ of $(X/\ker T)^*$ via the identification $\hat{f}(x + \ker T) = x^*(x)$, for all $x \in X$. Since $X/\ker(T)$ is isomorphic to $Y$, there exists a bounded linear functional $f : Y \to \mathbb{K}$, where $\mathbb{K}$ is the field of scalars, such that $\hat{f} = f \circ T_0$. (See the diagram below.)

[Commutative diagram: $T : X \to Y$ along the top, $Q : X \to X/\ker(T)$ down the left, $T_0 : X/\ker(T) \to Y$ along the diagonal, $f : Y \to \mathbb{K}$ down the right, and $\hat{f} : X/\ker(T) \to \mathbb{K}$ along the bottom.]

Therefore,
\[ x^*(x) = \hat{f}(x + \ker T) = \hat{f}(Qx) = (f \circ T_0 \circ Q)(x) = (f \circ T)(x), \]
for all $x \in X$, as required. □

We now turn our attention to an alternate formulation of the Open Mapping Theorem. First, we need a definition.

Definition 4.34 Let $X$ and $Y$ be Banach spaces and suppose $T : X \to Y$ is a linear map. The graph of $T$ is the subset of $X \times Y$ given by
\[ G(T) = \{(x, Tx) : x \in X\}. \]
We call $G(T)$ a closed graph if it is closed as a subset of $X \times Y$.

Theorem 4.35 (Closed Graph Theorem) Let $X$ and $Y$ be Banach spaces and suppose $T : X \to Y$ is a linear map. If $T$ has a closed graph, then $T$ is continuous.

Proof For $x \in X$ and $y \in Y$, let $\|(x,y)\| = \|x\| + \|y\|$. Under this norm, $X \times Y$ is a Banach space. (See Proposition 3.44.) By assumption, $G(T)$ is closed, and hence also a Banach space. Define a map $S : G(T) \to X$ by $S(x, Tx) = x$, $x \in X$. Clearly, $S$ is a linear bijection and $\|S\| \le 1$. Thus, by the Bounded Inverse Theorem (Corollary 4.30), we conclude that $S^{-1}$ is bounded. By definition, $S^{-1}x = (x, Tx)$, and so
\[ \|x\| + \|Tx\| = \|(x, Tx)\| = \|S^{-1}x\| \le \|S^{-1}\|\, \|x\|. \]
Therefore, $\|Tx\| \le \|S^{-1}\|\, \|x\|$ for all $x \in X$, and consequently $\|T\| \le \|S^{-1}\|$. It follows that $T$ is bounded, and hence continuous. □

We have now shown that the Open Mapping Theorem implies the Bounded Inverse Theorem (Corollary 4.30), and also that the Bounded Inverse Theorem implies the Closed Graph Theorem (Theorem 4.35). It is also true that the Closed Graph Theorem implies the Open Mapping Theorem, and consequently all three are equivalent. (See Exercise 4.20.)

Example 4.36 In general, the linearity assumption in the Closed Graph Theorem is necessary to prove continuity. Consider the map $f : \mathbb{R} \to \mathbb{R}$ given by

$$f(x) = \begin{cases} 1/x & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$

The graph of $f$ is closed, but $f$ is certainly not continuous.

If we restrict our attention to functions $f : [0, 1] \to [0, 1]$, then a closed graph does indeed suffice for continuity. Assume to the contrary that there is a function $f : [0, 1] \to [0, 1]$ with closed graph that is not continuous. Then there exist some $x \in [0, 1]$, some $\epsilon > 0$, and a sequence $(x_n)_{n=1}^\infty$ in $[0, 1]$ converging to $x$ such that

$$|f(x_n) - f(x)| > \epsilon, \quad n \in \mathbb{N}. \tag{4.10}$$

The interval $[0, 1]$ is compact, and so $(f(x_n))_{n=1}^\infty$ has a convergent subsequence, say $(f(x_{n_k}))_{k=1}^\infty$. Suppose $y$ is the limit of this subsequence. By definition, the point $(x_{n_k}, f(x_{n_k}))$ is in $G(f)$ for all $k \in \mathbb{N}$. Since the graph of $f$ is closed, it follows that $(x, y) \in G(f)$, but this implies $y = f(x)$, which contradicts (4.10), because (4.10) forces $|y - f(x)| \ge \epsilon$.

In the preceding paragraph, we considered a function $f : [0, 1] \to [0, 1]$, but this argument works equally well for a function $f : K \to K$, where $K$ is an arbitrary compact set.

4.4 Applications of the Open Mapping Theorem

Consider the torus $\mathbb{T}$. For any $f \in L_1\left(\mathbb{T}, \frac{d\theta}{2\pi}\right)$, we define the Fourier coefficients of $f$ by

$$\hat{f}(n) = \int_0^{2\pi} f(\theta)\, e^{-in\theta}\, \frac{d\theta}{2\pi}, \quad n \in \mathbb{Z}.$$

(See Example 4.17.) For ease of notation, we will abbreviate $L_1\left(\mathbb{T}, \frac{d\theta}{2\pi}\right)$ as $L_1(\mathbb{T})$.

Theorem 4.37 (Riemann-Lebesgue Lemma) If $f \in L_1(\mathbb{T})$, then $\lim_{|n| \to \infty} \hat{f}(n) = 0$.

Proof For each $n \in \mathbb{Z}$, define a linear functional on $L_1(\mathbb{T})$ by $\phi_n(f) = \hat{f}(n)$ for all $f \in L_1(\mathbb{T})$. A simple computation shows that $\|\phi_n\| \le 1$ for all $n \in \mathbb{Z}$, and so the sequence of linear functionals is uniformly bounded. If $f$ is a trigonometric polynomial, then there exists some $N \in \mathbb{N}$ such that $\phi_n(f) = 0$ for all $|n| \ge N$. In particular, $\lim_{|n| \to \infty} \phi_n$ exists on a dense subset of $L_1(\mathbb{T})$. By Theorem 4.16, the set $\{f : \lim_{|n| \to \infty} \phi_n(f) \text{ exists}\}$ is a closed linear subspace of $L_1(\mathbb{T})$, and hence the limit exists for all elements of $L_1(\mathbb{T})$.

By Theorem 4.15, the map defined by $\phi(f) = \lim_{|n| \to \infty} \phi_n(f)$ for $f \in L_1(\mathbb{T})$ is a bounded linear functional on $L_1(\mathbb{T})$. We have already established that $\phi(f) = 0$ whenever $f$ is a trigonometric polynomial. Therefore, since $\phi$ is continuous, and since the trigonometric polynomials are dense in the space of integrable functions, we have that $\phi(f) = 0$ for all $f \in L_1(\mathbb{T})$. This proves the theorem. □
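As a numerical illustration of Theorem 4.37, the following sketch approximates the Fourier coefficients of the sawtooth function $f(\theta) = \theta$ on $[0, 2\pi)$, for which an integration by parts gives $\hat{f}(n) = i/n$ when $n \neq 0$. The function names, the sample count, and the choice of a midpoint rule are assumptions made for this illustration only; they are not taken from the text.

```python
import cmath
import math


def fourier_coefficient(f, n, samples=20000):
    """Approximate (1/2pi) * integral_0^{2pi} f(theta) e^{-i n theta} dtheta
    with the midpoint rule on `samples` equal subintervals."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        theta = (k + 0.5) * h
        total += f(theta) * cmath.exp(-1j * n * theta)
    # Each term carries the weight h / (2 pi) = 1 / samples.
    return total / samples


def sawtooth(theta):
    """Illustrative integrable function on [0, 2pi)."""
    return theta


# |f^(n)| should be close to 1/|n|, hence tend to 0 as |n| grows.
magnitudes = {n: abs(fourier_coefficient(sawtooth, n)) for n in (1, 2, 5, 10, 50)}

if __name__ == "__main__":
    for n, size in magnitudes.items():
        print(n, size)
```

Under these assumptions the computed magnitudes track $1/|n|$, which is consistent with the decay asserted by the lemma; the sketch says nothing about the rate of decay for a general integrable function.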

The significance of Theorem 4.37 is that, for any $f \in L_1(\mathbb{T})$, the doubly infinite sequence $(\hat{f}(n))_{n \in \mathbb{Z}}$ is always an element of $c_0(\mathbb{Z})$. This may lead one to ask if the converse is true. The next proposition, which makes use of the Open Mapping Theorem (in the form of the Bounded Inverse Theorem), shows that the converse is not true.

Proposition 4.38 There exists a sequence ξ ∈ c0(Z) which is not the Fourier transform of a function in L1(T).

Proof Define a map $F : L_1(\mathbb{T}) \to c_0(\mathbb{Z})$ by

$$F(f) = (\hat{f}(n))_{n \in \mathbb{Z}}, \quad f \in L_1(\mathbb{T}).$$

It suffices to show that $F$ is not a surjection. The map $F$ is a bounded linear operator with $\|F\| = 1$, and $F$ is injective because Fourier coefficients are unique. Suppose $F$ does map $L_1(\mathbb{T})$ onto $c_0(\mathbb{Z})$. Then $F$ is a bounded linear bijection, and hence an isomorphism, by Corollary 4.30 (the Bounded Inverse Theorem). In particular, $F^{-1}$ is also a bounded linear bijection.

Our assumption that $F$ is a surjection has led us to conclude that the inverse $F^{-1} : c_0(\mathbb{Z}) \to L_1(\mathbb{T})$ is an isomorphism, which implies that $(F^{-1})^* : L_\infty(\mathbb{T}) \to \ell_1(\mathbb{Z})$ is also an isomorphism. (See Exercise 4.12.) This, however, is impossible, because $\ell_1(\mathbb{Z})$ is separable and $L_\infty(\mathbb{T})$ is not. (The proofs of separability and nonseparability are similar to those given in Remark 3.14.) We have arrived at a contradiction, and therefore must conclude that $F$ does not map $L_1(\mathbb{T})$ onto $c_0(\mathbb{Z})$. □

In light of Theorem 4.37 and Proposition 4.38, it is tempting to wonder if, for functions $f \in L_1(\mathbb{T})$, any bounds can be established for the rate at which the sequence $(\hat{f}(n))_{n \in \mathbb{Z}}$ decays. It turns out, however, that Fourier coefficients can decay arbitrarily slowly. (See Section I.4 of [20].)

Now consider the sequence space $\ell_p = \ell_p(\mathbb{N})$, where $1 \le p \le \infty$. We suppose $(a_{jk})_{j,k=1}^\infty$ is an infinite matrix, where $a_{jk} \in \mathbb{R}$ for each $j$ and $k$ in $\mathbb{N}$.

Proposition 4.39 Suppose $(a_{jk})_{j,k=1}^\infty$ is a scalar array such that the series

$$\eta_j = \sum_{k=1}^\infty a_{jk}\xi_k$$

converges for all $\xi = (\xi_k)_{k=1}^\infty \in \ell_p$ and $j \in \mathbb{N}$. Furthermore, suppose the sequence $\eta_\xi = (\eta_j)_{j=1}^\infty$ is an element of $\ell_p$. If the map $A : \ell_p \to \ell_p$ is defined by $A(\xi) = \eta_\xi$ for each $\xi \in \ell_p$, then $A$ is a bounded linear operator on $\ell_p$.

Proof For each $j \in \mathbb{N}$, let

$$\psi_j(\xi) = \lim_{n \to \infty} \sum_{k=1}^n a_{jk}\xi_k = \eta_j.$$

By Theorem 4.15, $\psi_j$ is a bounded linear functional on $\ell_p$ for each $j \in \mathbb{N}$. We use the bounded linear functionals $(\psi_j)_{j=1}^\infty$ to define the map $A : \ell_p \to \ell_p$ by

$$A\xi = (\psi_j(\xi))_{j=1}^\infty, \quad \xi \in \ell_p.$$

By assumption, this map is well defined. Furthermore, $A$ is linear because $\psi_j$ is linear for each $j \in \mathbb{N}$. It remains only to show that $A$ is bounded. In order to do this, we will show that the graph of $A$ is closed and apply the Closed Graph Theorem.

Suppose $(\xi^{(n)})_{n=1}^\infty$ is a sequence in $\ell_p$ such that $\xi^{(n)} \to \xi$ in $\ell_p$, and suppose $\zeta = (\zeta_j)_{j=1}^\infty \in \ell_p$ is such that $A\xi^{(n)} \to \zeta$ in $\ell_p$. For each $j \in \mathbb{N}$, the map $\psi_j$ is a continuous linear functional on $\ell_p$, and so $\psi_j(\xi^{(n)}) \to \psi_j(\xi)$. On the other hand, convergence in $\ell_p$ implies coordinatewise convergence, so $\psi_j(\xi^{(n)}) \to \zeta_j$. Consequently, it must be that $\zeta_j = \psi_j(\xi)$ for each $j \in \mathbb{N}$, and hence $\zeta = A\xi$. Therefore, the graph of $A$ is closed, and so $A$ is continuous, by the Closed Graph Theorem (Theorem 4.35). □

Proposition 4.39 demonstrates the general principle that if a linear map is properly defined, it will tend to be bounded. The next example provides a further illustration.

Example 4.40 The Hilbert transform is the map $H : L_p(\mathbb{R}) \to L_p(\mathbb{R})$ defined by the formula

$$Hf(x) = \int_{-\infty}^{\infty} \frac{1}{x - y}\, f(y)\, dy, \quad f \in L_p(\mathbb{R}),$$

where 1

$$c\,\|x\|_\alpha \le \|x\|_\beta \le C\,\|x\|_\alpha, \quad x \in X.$$

Proof of Theorem 4.42 Suppose that $P : X \to V$ is a continuous projection and $W = \ker(P)$. By Proposition 3.49, there exists a linear map $P_0 : X/W \to V$ such that the diagram below commutes, where $Q : X \to X/W$ is the quotient map.

[Commutative diagram: $P : X \to V$ factors as $P = P_0 \circ Q$ through the quotient map $Q : X \to X/W$, with $P_0 : X/W \to V$.]

(Recall that $W$ is closed by Proposition 3.49.) The continuity of $P_0$ follows from the continuity of $P$ and the fact that $Q$ is an open map. Any continuous linear bijection between Banach spaces has a continuous inverse by the Bounded Inverse Theorem (Corollary 4.30), and so $P_0$ is an isomorphism.

The next step is to show that $X$ is the vector space direct sum of its subspaces $V$ and $W$. Suppose that $x \in V \cap W$. Since $x \in W$, we have that $Px = 0$. However, $P$ is a projection onto $V$, and so $x \in V$ implies that $Px = x$. It follows that $x = 0$. Thus, $V \cap W = \{0\}$. Furthermore, every $x \in X$ can be written as the sum of elements from $V$ and $W$ as follows: $x = Px + (x - Px)$. We conclude that $X = V \oplus W$ as vector spaces, by Proposition 3.45.

We now show that $X$ is isomorphic to $V \oplus W$. Specifically, we wish to show $(X, \|\cdot\|)$ is isomorphic to $V \oplus W$ equipped with the norm from Proposition 3.44:

$$\|(v, w)\| = \|v\| + \|w\|, \quad (v, w) \in V \times W.$$

Define φ : X → V ⊕ W by

φ(x) = (Px, x − Px), x ∈ X.

We know φ is well-defined by our earlier remarks. Furthermore, φ is linear because P is linear and because of the way the vector space operations are defined in V ⊕ W. Next, we show that φ is a bijection. Suppose φ(x) = (0, 0). Then (Px, x−Px) = (0, 0), and so Px = 0 and x−Px = 0. From this we conclude that x = Px = 0, and hence φ is injective. To show that φ is surjective, let (v, w) ∈ V × W. Then P (v) = v and P (w) = 0. Consequently,

φ(v + w) = (P (v + w), (v + w) − P (v + w)) = (v, v + w − v) = (v, w).

Thus, $\phi$ is a surjection. We next show that $\phi$ is continuous by showing that it is bounded:

$$\|\phi(x)\|_{V \oplus W} = \|(Px, x - Px)\|_{V \oplus W} = \|Px\| + \|x - Px\| \le (2\|P\| + 1)\|x\|.$$

Therefore, $\phi$ is a continuous linear bijection, and consequently an isomorphism, by the Bounded Inverse Theorem (Corollary 4.30).

Now suppose $X = V \oplus W$. Then each $x \in X$ has a unique representation of the form $v + w$, where $v \in V$ and $w \in W$. Define $P(x) = v$. Then $P$ is a projection and $W = \ker(P)$. It remains only to show that $P$ is continuous. We will show continuity by means of the Closed Graph Theorem. Suppose $(x_n)_{n=1}^\infty$ is a convergent sequence in $X$ such that $(Px_n)_{n=1}^\infty$ converges in $V$. Then there exist some $x \in X$ and $v \in V$ such that $x_n \to x$ and $Px_n \to v$ as $n \to \infty$. We need to show that $Px = v$. For each $n \in \mathbb{N}$, we have $x_n - Px_n \in \ker(P) = W$. Since $W$ is closed,

$$\lim_{n \to \infty} (x_n - Px_n) = x - v \in W.$$

Therefore, $P(x - v) = 0$, and so $Px = Pv$. Because $v \in V$, we have $Pv = v$, and hence $Px = v$, as required. □

We have seen that whenever a closed subspace $V$ is the image of a continuous projection in $X$, there is a closed subspace $W$ such that $X = V \oplus W$. For this reason, when $V$ is the image of such a projection, we call $V$ a complemented subspace of $X$.
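The map $\phi(x) = (Px, x - Px)$ and the bound $\|Px\| + \|x - Px\| \le (2\|P\| + 1)\|x\|$ from the proof can be checked on a small finite-dimensional case. The sketch below assumes $X = \mathbb{R}^2$ with the Euclidean norm and the oblique projection $P(x_1, x_2) = (x_1 - x_2, 0)$ onto $V = \operatorname{span}\{(1, 0)\}$ along $W = \operatorname{span}\{(1, 1)\}$; this choice of space and projection, and all names in the code, are illustrative assumptions and do not come from the text.

```python
import math


def project(x):
    """Hypothetical oblique projection P(x1, x2) = (x1 - x2, 0) on R^2."""
    x1, x2 = x
    return (x1 - x2, 0.0)


def norm(x):
    """Euclidean norm on R^2."""
    return math.hypot(x[0], x[1])


def phi(x):
    """The map x -> (Px, x - Px) used in the proof."""
    px = project(x)
    return px, (x[0] - px[0], x[1] - px[1])


# For the matrix [[1, -1], [0, 0]] the operator norm is sqrt(2).
OPERATOR_NORM_P = math.sqrt(2.0)

# A grid of test vectors with coordinates that are exact in floating point.
samples = [(a / 4.0, b / 4.0) for a in range(-8, 9) for b in range(-8, 9)]


def bound_holds(x):
    v, w = phi(x)
    return norm(v) + norm(w) <= (2 * OPERATOR_NORM_P + 1) * norm(x) + 1e-12


all_bounded = all(bound_holds(x) for x in samples)
idempotent = all(project(project(x)) == project(x) for x in samples)
# The second component x - Px always lies in W = span{(1, 1)}.
second_component_in_W = all(phi(x)[1][0] == phi(x)[1][1] for x in samples)
```

In this example the two closed subspaces are not orthogonal, yet the decomposition $x = Px + (x - Px)$ is still controlled by $\|P\|$, which is the content of the boundedness estimate for $\phi$.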

Exercises

Exercise 4.1 Let $\mathbb{R}$ be given the standard topology. Show that the closed set $[0, 1]$ is a $G_\delta$-set. Show that the open set $(0, 1)$ is an $F_\sigma$-set.

Exercise 4.2 Show that the space $C[0, 1]$ of continuous functions on the closed interval $[0, 1]$ is not complete in the norm

$$\|f\|_2 = \left(\int_0^1 |f(s)|^2\, ds\right)^{1/2}, \quad f \in C[0, 1].$$

(The completion of $C[0, 1]$ in the norm $\|\cdot\|_2$ is $L_2(0, 1)$, by Lusin's Theorem.)

Exercise 4.3 Let $M$ and $E$ be complete metric spaces. Suppose $h : M \to E$ is a homeomorphism onto its image (i.e., $h$ is a continuous one-to-one map, and $h^{-1}|_{h(M)}$ is continuous). Show that $h(M)$ is a $G_\delta$-set.

Exercise 4.4 Let $X$ be a Banach space and suppose $E$ is a dense linear subspace which is a $G_\delta$-set. Show that $E = X$.

Exercise 4.5 Show that if $Y$ is a normed space which is homeomorphic to a complete metric space, then $Y$ is a Banach space. (Hint: Consider $Y$ as a dense subspace in its completion.)

Exercise 4.6 Let $X$ and $Y$ be Banach spaces and let $T : X \to Y$ be a bounded linear operator. If $M$ is a closed subspace of $X$, show that either $T(M)$ is first category in $Y$ or $T(M) = Y$.

Exercise 4.7 Let $X = C^{(1)}[0, 1]$ be the space of continuously differentiable functions on $[0, 1]$ and let $Y = C[0, 1]$. Equip both spaces with the supremum norm $\|\cdot\|_\infty$. Define a linear map $T : X \to Y$ by $T(f) = f'$ for all functions $f \in C^{(1)}[0, 1]$. Show that $T$ has closed graph, but $T$ is not continuous. Conclude that $\left(C^{(1)}[0, 1], \|\cdot\|_\infty\right)$ is not a Banach space.

Exercise 4.8 Let φ ∈ C[0, 1] be a function which is not identically 0. Show the set M ={φf : f ∈ C[0, 1]} is of the first category in C[0, 1] if and only if φ(x) = 0 for some x ∈ [0, 1].

Exercise 4.9 Let $(a_k)_{k \in \mathbb{Z}}$ be a sequence of complex scalars with only finitely many nonzero terms. Define a trigonometric polynomial $f : \mathbb{T} \to \mathbb{C}$ by $f(\theta) = \sum_{k \in \mathbb{Z}} a_k e^{ik\theta}$. Show that $\hat{f}(n) = a_n$ for all $n \in \mathbb{Z}$.

Exercise 4.10 Show that there exists a function $f \in L_1(\mathbb{T})$ whose Fourier series fails to converge to $f$ in the $L_1$-norm. Precisely, show that if

$$S_N f = \sum_{k=-N}^{N} \hat{f}(k)\, e^{ik\theta},$$

then there exists an $f \in L_1(\mathbb{T})$ such that $\|f - S_N f\|_{L_1(\mathbb{T})}$ does not tend to 0.

Exercise 4.11 Show that the Cesàro means $\frac{1}{N}(S_1 f + \cdots + S_N f)$ converge to $f$ in the $L_1$-norm for every $f \in L_1(\mathbb{T})$.

Exercise 4.12 Let $X$ and $Y$ be Banach spaces. If $T : X \to Y$ is a bijection, show that the adjoint map $T^* : Y^* \to X^*$ is also a bijection. Conclude that if $T$ is an isomorphism of Banach spaces, then so is $T^*$.

Exercise 4.13 Show that the Hilbert transform of Example 4.40 is well-defined.

Exercise 4.14 Let $f : [1, \infty) \to \mathbb{R}$ be a continuous function. Suppose $(\xi_n)_{n=1}^\infty$ is a strictly increasing sequence of real numbers with $\xi_1 \ge 1$, $\lim_{n \to \infty} \xi_n = \infty$, and $\lim_{n \to \infty} \frac{\xi_{n+1}}{\xi_n} = 1$. If $\lim_{n \to \infty} f(\xi_n x) = 0$ for all $x \ge 1$, then prove that $\lim_{x \to \infty} f(x) = 0$.

Exercise 4.15 Show that $L_2(0, 1)$ is of the first category in $L_1(0, 1)$.

Exercise 4.16 Let $(\Omega, \mu)$ be a probability space and suppose there exists a sequence of disjoint sets $(E_n)_{n=1}^\infty$ such that $\mu(E_n) > 0$ for all $n \in \mathbb{N}$. Show that $L_p(\Omega, \mu) \neq L_q(\Omega, \mu)$ if $1 \le p$

Exercise 4.18 Identify the quotient space $c/c_0$.

Exercise 4.19 Let $1 \le p \le \infty$ and let $V = \{(x_j)_{j=1}^\infty \in \ell_p : x_{2k} = 0 \text{ for all } k \in \mathbb{N}\}$. Show that $\ell_p/V$ is isometrically isomorphic to $\ell_p$.

Exercise 4.20 Find an example of a map $T : X \to Y$, where $X$ and $Y$ are normed spaces, such that $T$ is a bounded linear bijection, but $T^{-1}$ is not bounded. (Hint: $X$ and $Y$ cannot both be Banach spaces, or $T^{-1}$ will be bounded by Corollary 4.30.)

Exercise 4.21 Show that the Closed Graph Theorem (Theorem 4.35) implies the Open Mapping Theorem (Theorem 4.29).

Chapter 5 Consequences of Convexity

In this chapter, we wish to explore the geometric aspects of the Hahn–Banach Theorem. The crucial property, it turns out, is local convexity. We will first recall some notions from general topology and then introduce the concept of a topological vector space. These spaces, which include Banach spaces, are sufficiently complex that we can say something interesting about their structure. Banach spaces are topological vector spaces where the topology is determined by a complete norm, and in this chapter we will get some idea of how they fit into a more general topological framework.

5.1 General Topology

Let E be a set. A topology τ on E is a collection of subsets called open sets satisfying the following three criteria:

1. The collection $\tau$ contains both $E$ and the empty set $\emptyset$.
2. If $\{U_i\}_{i \in I}$ is a (possibly uncountable) family of sets in $\tau$, then $\bigcup_{i \in I} U_i$ is in $\tau$.
3. If $U$ and $V$ are in $\tau$, then $U \cap V$ is in $\tau$.

When $E$ is equipped with a topology $\tau$, we call the pair $(E, \tau)$ a topological space. When there is no ambiguity, we will suppress the $\tau$ and simply write $E$ for the topological space $(E, \tau)$, and say $U$ is open in $E$ when $U \in \tau$.

If $(E, \tau)$ and $(F, \tau')$ are topological spaces, then a function $f : E \to F$ is called continuous if $f^{-1}(U)$ is open in $E$ whenever $U$ is open in $F$. For $x$ a point in $E$, a neighborhood of $x$ is any subset $N$ of $E$ for which there exists an open set $U$ such that $x \in U$ and $U \subset N$.

Example 5.1 Let $(E, \tau)$ be a topological space. If every subset of $E$ is open, then $\tau$ is called the discrete topology. If $\tau = \{E, \emptyset\}$, then $\tau$ is called the indiscrete topology.

Example 5.2 Let $(E, \tau)$ and $(F, \tau')$ be two topological spaces. The product $E \times F$ can be given the product topology, denoted $\tau \times \tau'$, as follows: $W \subseteq E \times F$ is open if

$$W = \bigcup_{i \in I} (U_i \times V_i),$$

where $U_i \in \tau$ and $V_i \in \tau'$ for each $i \in I$, where $I$ is a (possibly uncountable) index set.

Let $(E, \tau)$ be a topological space. The topology $\tau$ is said to have a base of open sets $\{U_i\}_{i \in I}$ if for each open set $V \in \tau$, there exists an index set $J \subseteq I$ such that $V = \bigcup_{i \in J} U_i$. When $\tau$ has a base, we say the base generates the topology $\tau$. In Example 5.2, the product topology on $E \times F$ is generated by the base

$$\{U \times V : U \in \tau,\ V \in \tau'\}.$$

Example 5.3 (Product topology). Let $I$ be a (possibly uncountable) index set. For each $i \in I$, let $(E_i, \tau_i)$ be a topological space. The product topology on the product $\prod_{i \in I} (E_i, \tau_i)$ is a topology with a base consisting of sets of the form

$$U_{i_1} \times \cdots \times U_{i_n} \times \prod_{i \in I \setminus \{i_1, \dots, i_n\}} E_i,$$

where $U_{i_j}$ is open in $E_{i_j}$ for $i_j \in I$, $n \in \mathbb{N}$, and $j \in \{1, \dots, n\}$. Observe that all but finitely many factors of such a set are the entire space. This example contains Example 5.2 as a special case, because the product in that example is finite.

Example 5.4 Suppose $M$ is a set with a metric $d$. We will show that the metric $d$ determines a topology on $M$. For each $x \in M$ and $r > 0$, let

$$B(x, r) = \{z \in M : d(x, z) < r\}.$$

This set is the open ball about $x$ of radius $r$. We declare a subset $V$ of $M$ to be open if for each $x \in V$, there is an $r > 0$ such that $B(x, r) \subseteq V$. The collection of all such open sets forms a topology on $M$ called the metric topology on $M$ generated by the metric $d$, or just the metric topology on $M$, if the metric $d$ is understood.

Suppose $V$ is open in the metric topology. For each $x \in V$, there exists a number $r_x > 0$ such that $B(x, r_x) \subseteq V$. Thus, $V = \bigcup_{x \in V} B(x, r_x)$, and so the collection of open balls forms a base for the metric topology.

Let $E$ be a topological space and let $x \in E$. A local base at $x$ is a collection $\eta$ of open sets, all of which contain $x$, such that any neighborhood $U$ of $x$ contains an element of $\eta$. In Example 5.4, any point in the metric space $M$ has a local base. For $x \in M$, the collection of open balls $B(x, r)$ for all $r > 0$ forms a local base at $x$. In fact, if we consider the collection of sets $\eta = \{B(x, 1/n) : n \in \mathbb{N}\}$, then $\eta$ is a countable local base at $x$.

A topological space $(E, \tau)$ with a countable local base at every point $x \in E$ is called first countable. Further, $(E, \tau)$ is called second countable if $\tau$ has a countable base. From Example 5.4 (and the comments following it), we see that any metric space $M$ is first countable; however, $M$ will not be second countable unless $M$ is separable. (See Exercise 5.9.)

Let $(E, \tau)$ be a topological space. We say $(E, \tau)$ is metrizable if there exists a metric $d$ such that $d$ generates the topology $\tau$; that is, if the open balls in $(E, d)$ form a base for the topology $\tau$. We call $(E, \tau)$ a Hausdorff space if for any distinct points $x$ and $y$ in $E$ there exist open sets $U$ and $V$ in $\tau$ such that $x \in U$, $y \in V$, and $U \cap V = \emptyset$.

Example 5.5 Any metrizable space is a Hausdorff space. To see this, suppose $(M, d)$ is a metric space and let $x$ and $y$ be two distinct points in $M$. Since $x \neq y$, it follows that $d(x, y) > 0$. Let $\delta = d(x, y)$. Furthermore, let $U = B(x, \delta/2)$ and $V = B(y, \delta/2)$. Then $U$ and $V$ are open in the metric topology on $M$, $x \in U$, $y \in V$, and $U \cap V = \emptyset$.

Example 5.6 Any nonempty set $E$ with the discrete topology (see Example 5.1) is metrizable. Define a metric on $E$ by

$$d(x, y) = \begin{cases} 0 & \text{if } x = y, \\ 1 & \text{if } x \neq y, \end{cases} \quad \text{for } (x, y) \in E \times E.$$

It is easy to see that $d$ is, in fact, a metric. This metric is called the discrete metric on $E$ and it is not hard to show that $d$ generates the discrete topology on $E$.

Of particular interest to us is the notion of compactness. A topological space is said to be compact if any open cover contains a finite subcover. To be more precise, let $X$ be a topological space. Then $X$ is compact if for any collection $\mathcal{U}$ of open sets such that $X \subseteq \bigcup_{U \in \mathcal{U}} U$ there exists a finite collection $\{U_1, \dots, U_n\}$ of elements in $\mathcal{U}$ such that $X \subseteq U_1 \cup \cdots \cup U_n$. For a (not necessarily compact) topological space, we define a compact subset in a similar way: A subset $E$ of a topological space $X$ is compact if any cover of $E$ by sets open in $X$ admits a finite subcover of $E$. Some well-known properties of compact sets are treated in the exercises at the end of this chapter. (See Exercise 5.2.)

A topological space is said to be locally compact if every point has a compact neighborhood. Naturally, all compact spaces are locally compact, but the converse need not be true. For example, the real line $\mathbb{R}$ with its standard topology is locally compact, but not compact.

A notion of fundamental importance in topology is that of a convergent sequence. If $X$ is a topological space, and $(x_n)_{n=1}^\infty$ is a sequence of elements from $X$, then $(x_n)_{n=1}^\infty$ is said to converge to a point $x \in X$ if for every open neighborhood $U$ of $x$ there exists an $N \in \mathbb{N}$ such that $x_n \in U$ for all $n \ge N$. In such a case, we say $x$ is the limit of the sequence $(x_n)_{n=1}^\infty$ and we write $x = \lim_{n \to \infty} x_n$. (Note that this notion of a limit agrees with the standard definition of a limit in a metric space.) In general, the limit of a sequence need not be unique. The spaces we consider, however, are Hausdorff spaces, and limits are necessarily unique in a Hausdorff space. (See Exercise 5.8.)

A subset $U$ of a topological space $X$ is called sequentially open if every sequence $(x_n)_{n=1}^\infty$ that converges to a point in $U$ is eventually in $U$; that is, if for each such sequence there exists some $N \in \mathbb{N}$ such that $x_n \in U$ for all $n \ge N$. We call $X$ a sequential space if every sequentially open set is open. Any first countable topological space is a sequential space. In particular, any metric space is a sequential space.

5.2 Topological Vector Spaces

We now consider topological spaces with additional structure, namely an underlying linear structure. Let X be a vector space over the field K (which is either R or C). A topology τ on X is called a vector topology if the maps

$$(\lambda, x) \mapsto \lambda x, \quad \lambda \in K,\ x \in X, \qquad \text{and} \qquad (x_1, x_2) \mapsto x_1 + x_2, \quad (x_1, x_2) \in X \times X,$$

are both continuous. That is, if both scalar multiplication and addition are continuous in the topology on $X$. In this case, $(X, \tau)$ is called a topological vector space.

Example 5.7 Any normed vector space $X$ is a topological vector space, where the topology is given by the base of open balls:

x + λ(intBX), λ>0, x ∈ X.

Equivalently, the topology on $X$ is generated by the metric $d$ given by the formula $d(x, y) = \|x - y\|$ for $(x, y) \in X \times X$.

A vector topology is determined by a base of neighborhoods at the origin, since sets can be translated and scaled continuously. We will denote the origin by 0. Let $\eta$ be a base of neighborhoods of the origin in a topological vector space $(X, \tau)$. A set $V \in \eta$ is called absorbent if $X = \bigcup_{n=1}^\infty nV$. A set $V \in \eta$ is called balanced if $\lambda V \subseteq V$ for all scalars $\lambda$ such that $|\lambda| \le 1$.

Lemma 5.8 In a topological vector space, any open neighborhood of the origin is absorbent.

Proof Let $X$ be a topological vector space. Suppose $V$ is an open neighborhood of 0 and let $x \in X$. Scalar multiplication is continuous, and so the map $\lambda \mapsto \lambda x$ is continuous. Consequently, the set $\{\lambda : \lambda x \in V\}$ is open in $K$. By assumption, $V$ is a neighborhood of 0, and so $0 \in \{\lambda : \lambda x \in V\}$. We have established that the set $\{\lambda : \lambda x \in V\}$ is open in $K$ and contains 0. Thus, it must contain $\frac{1}{n}$ for a sufficiently large $n \in \mathbb{N}$. We conclude that $\frac{x}{n} \in V$, and consequently $x \in nV$. Therefore, $V$ is absorbent. □

Proposition 5.9 Any topological vector space has a base of neighborhoods $\eta$ of the origin such that for all $V \in \eta$: (i) $V$ is balanced, (ii) $V$ is absorbent, and (iii) there exists $W \in \eta$ such that $W + W \subseteq V$.

Proof Let $(X, \tau)$ be a topological vector space and let $U$ be an open neighborhood of the origin. Let $s : K \times X \to X$ be scalar multiplication, so that $s(\lambda, x) = \lambda x$ for all $\lambda \in K$ and $x \in X$. By assumption, $s$ is continuous. Thus, since $U$ is open in $X$, the preimage $s^{-1}(U)$ is open in $K \times X$. Certainly, $(0, 0) \in s^{-1}(U)$, and so there exists some $\delta > 0$ and an open neighborhood $W$ of 0 in $X$ such that $\delta B_K \times W \subseteq s^{-1}(U)$. Therefore, $s(\delta B_K \times W) \subseteq U$, and hence $\alpha W \subseteq U$ for all $|\alpha| \le \delta$. Let

$$V = \bigcup_{\alpha \in \delta B_K} \alpha W.$$

Then $V$ is open, balanced, and contained in $U$. For each open neighborhood of 0, such a $V$ can be constructed. Let $\eta$ be the collection of all such balanced sets. Then (i) follows from the construction and (ii) follows from Lemma 5.8.

It remains to verify (iii). Let $V \in \eta$. By the continuity of addition, there exist two open neighborhoods $U_1$ and $U_2$ of $0 \in X$ such that $U_1 + U_2 \subseteq V$. Let $U = U_1 \cap U_2$. Then $U$ is an open neighborhood of 0 such that $U + U \subseteq V$. As demonstrated earlier in this proof, $U$ contains a subset $W \in \eta$, and this $W$ is the required set. □

Proposition 5.10 Let $X$ be a topological vector space with $\eta$ a base of open sets about the origin. Then $X$ is a Hausdorff space if and only if $\bigcap_{V \in \eta} V = \{0\}$.

Proof Without loss of generality, we may assume that $\eta$ satisfies the conclusions of Proposition 5.9.

Assume $X$ is a Hausdorff space. Certainly $0 \in \bigcap_{V \in \eta} V$. Suppose $x \neq 0$. We will show that $x \notin \bigcap_{V \in \eta} V$. Since $X$ is a Hausdorff space, there are open sets $U$ and $W$ such that $0 \in U$ and $x \in W$ and $U \cap W = \emptyset$. By assumption, $\eta$ is a base of open sets about the origin, and consequently there exists a set $V_0 \in \eta$ such that $V_0 \subseteq U$. It follows that $x \notin V_0$, and so $x \notin \bigcap_{V \in \eta} V$. Therefore, $\bigcap_{V \in \eta} V = \{0\}$.

Now assume $\bigcap_{V \in \eta} V = \{0\}$. We will show that $X$ is a Hausdorff space. Let $x$ and $y$ be elements of $X$ that cannot be separated by disjoint open sets. Let $V \in \eta$. By Proposition 5.9, there exists a set $W \in \eta$ such that $W + W \subseteq V$. By assumption, $x + W$ and $y + W$ are not disjoint. Then there exist elements $w_1$ and $w_2$ in $W$ such that $x + w_1 = y + w_2$. Therefore, $x - y = w_2 - w_1 \in W - W$. The set $W$ is balanced, and so we conclude $x - y \in W + W \subseteq V$. This is true for every $V \in \eta$, and so $x - y \in \bigcap_{V \in \eta} V = \{0\}$. It follows that $x = y$, and consequently $X$ is a Hausdorff space. □

In our discussions of normed spaces, a key notion was that of the dual space. In the more general context of topological vector spaces, this will remain true.

Definition 5.11 Let $X$ be a topological vector space. The dual space $X^*$ consists of all continuous linear scalar-valued functionals on $X$.

5.3 Some Metrizable Examples

In this section, we consider some examples of real topological vector spaces which are metrizable, but do not have a norm structure.

Example A: Lp(0, 1), 0

If 0

$$\|f\|_p = \left(\int_0^1 |f(t)|^p\, dt\right)^{1/p} < \infty.$$

If $p \ge 1$, then $L_p(0, 1)$ is a Banach space. If 0

$$f''(t) = p(p - 1)t^{p-2} + p(p - 1)(1 - t)^{p-2}.$$

Therefore, $f''(1/2) = 2^{3-p}p(p - 1) < 0$, and so $f$ has a local maximum at $t = 1/2$. The result follows. □

From the above proposition, we conclude that $\|f + g\|_p^p \le \|f\|_p^p + \|g\|_p^p$ whenever 0

Let $B = \{f : \|f\|_p < 1\}$. Then the collection of open sets $(2^{-n}B)_{n=1}^\infty$ determines a countable base at 0 which satisfies the conclusions of Proposition 5.9. The first two properties are clear. To see property (iii), simply observe that $2^{-N}B + 2^{-N}B \subseteq B$ whenever $N > 1/p$.

We will now compute $L_p(0, 1)^*$ for 0

$$\|\phi\| = \sup_{\|f\|_p = 1} |\phi(f)| < \infty.$$

This should be taken as the definition of $\|\phi\|$ in this context. The function $\phi$ is a linear functional, but not on a normed space, and consequently the notation $\|\phi\|$ has not yet been given a meaning.

Let $f \in L_p(0, 1)$ be such that $\|f\|_p = 1$. The map

$$t \mapsto \int_0^t |f(s)|^p\, ds, \quad t \in [0, 1],$$

is continuous with range $[0, 1]$. Therefore, by the Intermediate Value Theorem, there exists some $a \in [0, 1]$ such that $\int_0^a |f(s)|^p\, ds = 1/2$. Define two functions $g$ and $h$ in $L_p(0, 1)$ by

$$g = f\chi_{(0,a)} \quad \text{and} \quad h = f\chi_{(a,1)}.$$

By the choice of $a$,

$$\|g\|_p = \left(\int_0^a |f(s)|^p\, ds\right)^{1/p} = \left(\frac{1}{2}\right)^{1/p},$$

and similarly, $\|h\|_p = (1/2)^{1/p}$. By the linearity of $\phi$, together with the definition of $\|\phi\|$, we have the two bounds $|\phi(g)| \le \|\phi\|(1/2)^{1/p}$ and $|\phi(h)| \le \|\phi\|(1/2)^{1/p}$. Thus, again using the linearity of $\phi$,

$$|\phi(f)| \le 2 \cdot \|\phi\|(1/2)^{1/p} = \|\phi\|\, 2^{1 - \frac{1}{p}}.$$

Taking the supremum over all functions $f \in L_p(0, 1)$ with $\|f\|_p = 1$, we have

$$\|\phi\| \le \|\phi\|\, 2^{1 - \frac{1}{p}}.$$

However, because $2^{1 - \frac{1}{p}} < 1$ when $p < 1$, this can happen only if $\phi = 0$. This implies that $L_p(0, 1)^* = \{0\}$.

The preceding remark guarantees that $L_p(0, 1)$ does not satisfy a Hahn–Banach Theorem if 0
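The splitting $f = g + h$ used above can be made concrete with step functions. The sketch below takes $p = 1/2$, $g = \chi_{(0,1/2)}$ and $h = \chi_{(1/2,1)}$, so that $f = g + h$ is the constant function 1; the representation of a step function as a list of values on equal subintervals, and all names in the code, are assumptions made for this illustration.

```python
def lp_quasinorm(values, p):
    """||f||_p for a step function on [0, 1] that takes values[i] on the
    i-th of len(values) equal subintervals."""
    mean_power = sum(abs(v) ** p for v in values) / len(values)
    return mean_power ** (1.0 / p)


p = 0.5
g = [1.0, 0.0]                      # indicator of (0, 1/2)
h = [0.0, 1.0]                      # indicator of (1/2, 1)
f = [a + b for a, b in zip(g, h)]   # the constant function 1

norm_f = lp_quasinorm(f, p)
norm_g = lp_quasinorm(g, p)         # (1/2)^(1/p) = 1/4 when p = 1/2
norm_h = lp_quasinorm(h, p)

# The usual triangle inequality fails for p < 1 ...
triangle_fails = norm_f > norm_g + norm_h
# ... but the p-th powers are still subadditive.
p_power_subadditive = norm_f ** p <= norm_g ** p + norm_h ** p + 1e-12
```

Here $\|g\|_p + \|h\|_p = 2^{1 - 1/p} = 1/2$ while $\|f\|_p = 1$, which is exactly the gap exploited in the estimate $\|\phi\| \le \|\phi\|\, 2^{1 - 1/p}$.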

Example B: L0(0, 1)

We denote by $L_0(0, 1)$ the set of all (equivalence classes of) scalar-valued Lebesgue measurable functions on $[0, 1]$. (As usual, we identify functions if they agree almost everywhere.) The topology on $L_0(0, 1)$ is determined by convergence in Lebesgue measure. More precisely, we define a set to be open when it is sequentially open, and a sequence converges when it converges in Lebesgue measure. Recall that a sequence $(f_n)_{n=1}^\infty$ of measurable functions converges in Lebesgue measure to a measurable function $f$ if for every $\epsilon > 0$,

$$\lim_{n \to \infty} m\{t : |f(t) - f_n(t)| \ge \epsilon\} = 0,$$

where $m$ is Lebesgue measure on $[0, 1]$.

We claim the topology on $L_0(0, 1)$ is metrizable and is induced by the metric

$$d(f, g) = \int_0^1 \frac{|f(t) - g(t)|}{1 + |f(t) - g(t)|}\, dt,$$

where $f$ and $g$ are measurable functions. The only property of a metric that is not immediate is the triangle inequality. In order to verify this, it suffices to show that the function $\phi(x) = x/(1 + x)$ is a nondecreasing subadditive function on $[0, \infty)$. A simple application of the quotient rule reveals that $\phi'(x) = 1/(1 + x)^2$, and so $\phi$ is strictly increasing for all $x \ge 0$. To see that $\phi$ is subadditive on $[0, \infty)$, observe that

$$\phi(x + y) = \frac{x + y}{1 + x + y} = \frac{x}{1 + x + y} + \frac{y}{1 + x + y} \le \frac{x}{1 + x} + \frac{y}{1 + y} = \phi(x) + \phi(y),$$

because $x \ge 0$ and $y \ge 0$. Given these properties, the triangle inequality follows readily from the fact that $d(f, g) = \int_0^1 \phi(|f(t) - g(t)|)\, dt$.

To see that the topology on $L_0(0, 1)$ coincides with that induced by the metric $d$, it suffices to show that the same sequences converge in each topology (since both spaces are sequential spaces). Suppose the sequence $(f_n)_{n=1}^\infty$ of measurable functions converges in the metric $d$ to a measurable function $f$. Then $d(f, f_n) \to 0$ as $n \to \infty$. Therefore, $\frac{|f - f_n|}{1 + |f - f_n|} \to 0$ in the $L_1$-norm, and hence in measure. It follows that $f_n \to f$ in measure, as required.

The reverse implication, that convergence in measure implies convergence in $d$, is true by the Lebesgue Dominated Convergence Theorem. (We state the Lebesgue Dominated Convergence Theorem in Theorem A.17 for almost everywhere convergence, but it remains valid for sequences that converge in measure on a $\sigma$-finite measure space.)

It remains to show that the metric $d$ is complete. Let

$$\|f\|_0 = \int_0^1 \frac{|f(t)|}{1 + |f(t)|}\, dt, \quad f \in L_0(0, 1).$$

Observe that $d(f, g) = \|f - g\|_0$ for all measurable functions $f$ and $g$ on $[0, 1]$. Certainly, $\|\cdot\|_0$ is not a norm (it is not homogeneous), but it does satisfy the triangle inequality, because $\|f\|_0 = \int_0^1 \phi(|f(t)|)\, dt$ and $\phi$ is subadditive on $[0, \infty)$. Consequently, we may use Lemma 2.24 (the Cauchy Summability Criterion) to prove that $d$ is a complete metric (because the proof of Lemma 2.24 does not require homogeneity of the norm).

Suppose $(f_n)_{n=1}^\infty$ is a sequence of measurable functions such that $\sum_{n=1}^\infty \|f_n\|_0 < \infty$. Then, by Fubini's Theorem,

$$\sum_{n=1}^\infty \|f_n\|_0 = \sum_{n=1}^\infty \int_0^1 \frac{|f_n(t)|}{1 + |f_n(t)|}\, dt = \int_0^1 \sum_{n=1}^\infty \frac{|f_n(t)|}{1 + |f_n(t)|}\, dt < \infty.$$

It follows that, for almost every $t \in [0, 1]$, there exists some $M_t > 0$ such that $\sum_{n=1}^\infty \frac{|f_n(t)|}{1 + |f_n(t)|} \le M_t$. Consequently, by the subadditivity of $\phi$, for every $N \in \mathbb{N}$,

$$\phi\left(\sum_{n=1}^N |f_n(t)|\right) \le \sum_{n=1}^N \phi(|f_n(t)|) \le M_t < \infty \quad \text{a.e.}(t).$$

For almost every $t$, the sequence $\left(\sum_{n=1}^N |f_n(t)|\right)_{N=1}^\infty$ therefore converges: the terms $\phi(|f_n(t)|)$ of a convergent series tend to 0, so $|f_n(t)| \le 1$ for all sufficiently large $n$, and $\phi(x) \ge x/2$ for $x \in [0, 1]$. Therefore, $\left(\sum_{n=1}^N f_n\right)_{N=1}^\infty$ converges almost everywhere, and hence in measure. Therefore, $L_0(0, 1)$ is a complete metric space.

As was the case in Example A (where 0
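The two properties of $\phi(x) = x/(1 + x)$ that drive the triangle inequality for $d$, together with the failure of homogeneity of $\|\cdot\|_0$, can be spot-checked numerically. The sketch below tests them on a finite grid and on the constant functions 1 and 2 (for which $\|c\|_0 = \phi(|c|)$); the grid and the variable names are assumptions chosen for this illustration, and a finite check is of course not a proof.

```python
def phi(x):
    """phi(x) = x / (1 + x) on [0, infinity)."""
    return x / (1.0 + x)


# Grid of sample points 0.0, 0.1, ..., 10.0.
grid = [k / 10.0 for k in range(0, 101)]

# Subadditivity: phi(x + y) <= phi(x) + phi(y).
subadditive = all(phi(x + y) <= phi(x) + phi(y) + 1e-15 for x in grid for y in grid)

# Monotonicity along the grid.
nondecreasing = all(phi(a) <= phi(b) for a, b in zip(grid, grid[1:]))

# For a constant function c on [0, 1], ||c||_0 = phi(|c|).
norm0_of_one = phi(1.0)   # 1/2
norm0_of_two = phi(2.0)   # 2/3, which differs from 2 * (1/2)
homogeneous = abs(norm0_of_two - 2 * norm0_of_one) < 1e-12
```

Since $\|2 \cdot 1\|_0 = 2/3$ while $2\|1\|_0 = 1$, the functional $\|\cdot\|_0$ is visibly not homogeneous, in agreement with the remark in the text.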

Example C: $\omega = \mathbb{R}^{\mathbb{N}}$

Let $J$ be a (possibly uncountable) index set. Let $\mathbb{R}^J$ denote the product space $\prod_{j \in J} \mathbb{R}_j$, where $\mathbb{R}_j = \mathbb{R}$ for each $j \in J$. An element $x$ in $\mathbb{R}^J$ is a function $x : J \to \mathbb{R}$, where $x(j) \in \mathbb{R}\ (= \mathbb{R}_j)$ for each $j \in J$.

When the space $\mathbb{R}^J$ is equipped with the product topology, it becomes a topological vector space. The vector space operations are done pointwise; that is, if $x$ and $y$ are elements in $\mathbb{R}^J$, then $(x + y)(j) = x(j) + y(j)$ for each $j \in J$. Convergence, too, is pointwise: $x_n \to x$ in $\mathbb{R}^J$ as $n \to \infty$ if $x_n(j) \to x(j)$ in $\mathbb{R}_j$ as $n \to \infty$ for each $j \in J$.

If $J$ is an uncountable index set, then $\mathbb{R}^J$ is not metrizable, since $\mathbb{R}^J$ with the product topology is not first countable; i.e., it does not have a countable local base at 0. (See Example 5.4.)

In this example, we are interested in countable index sets, and so we let $J = \mathbb{N}$. We denote $\mathbb{R}^{\mathbb{N}}$ by the Greek letter $\omega$. Generally, we think of $\omega$ as the collection of all sequences in $\mathbb{R}$. If $\xi \in \omega$, we let $\xi_k = \xi(k)$ for each $k \in \mathbb{N}$, and we write $\xi = (\xi_k)_{k=1}^\infty$. In this context, the vector space operations are done coordinate-wise. Convergence is also now viewed coordinate-wise, so that $\xi^{(n)} \to \xi$ in $\omega$ as $n \to \infty$ if $\xi_k^{(n)} \to \xi_k$ in $\mathbb{R}$ as $n \to \infty$ for each $k \in \mathbb{N}$.

Unlike RJ when J is uncountable, the space ω is first countable. A base of neighborhoods at the origin is formed by sets of the type

(−1, 1) ×···×(−n, n) × R × R ×··· , (5.1) where n ∈ N and i > 0 for each i ∈{1, ... , n}. If we denote elements of ω by = ∞ ξ (ξk)k=1, then the set in (5.3.1) can be written

{ξ : |ξ1| <1, ··· , |ξn| <n}. = 1 ∈ N To identify a countable base, consider the sets with i k , where k , for all i ∈{1, ... , n} and n ∈ N. Not$ only is ω first countable, but it is also metrizable. Recall that ω was defined ∞ R R to be k=1 k. Denote the metric on k by dk. We define a metric d on ω by ∞ 1 d (ξ , η ) d(ξ, η) = k k k , 2k 1 + d (ξ , η ) k=1 k k k = ∞ = ∞ where ξ (ξk)k=1 and η (ηk)k=1. We now wish to identify the space dual to ω. To that end, we prove the following proposition. Proposition 5.13 Let X be a topological vector space and let K be the field of scalars. A linear functional f : X → K is continuous if and only if there exists a neighborhood V of 0 such that the set f (V ) is bounded in K. −1 Proof Let UK be the open unit ball in K.Iff is continuous, then f (UK)isan −1 open neighborhood of 0, and f (f (UK)) ⊆ BK is bounded in K. Now suppose V is a neighborhood of 0 such that f (V ) is bounded in K.By definition, there is some M>0 such that f (V ) ⊆ MUK. Let >0 be given. Then  ⊆ | | ∈  f ( M V ) UK. Therefore, f (x) <whenever x M V . In other words, f is continuous at zero. Continuity then follows from the linearity of f . 2 We can now use the preceding proposition to identify the continuous linear func- tionals on ω. Let f ∈ ω∗. By Proposition 5.13, there must be some neighborhood V of 0 such that f (V ) is bounded in R. Without loss of generality, we may assume V is a basic set, say V ={ξ : |ξ1| <1, ··· , |ξn| <n} for some n ∈ N. The set f (V ) is bounded, and so there exists some M>0 such that |f (ξ)|≤M for any ξ ∈ V . Let ξ = (0, ... ,0,ξn+1, ... ). Then ξ ∈ V , and so too is any constant multiple of ξ. Therefore, for any K>0, we have that |f (Kξ)|≤M, and hence |f (ξ)|≤M/K, by the linearity of f . Since this inequality holds for all K>0, it must be that f (ξ) = 0. This is true for any ξ ∈ V having ξi = 0 for all i ∈{1, ... , n}. 
Thus, because $f$ is linear, it follows that $f(\xi) = f(\xi')$ for any $\xi$ and $\xi'$ that agree on the first $n$ coordinates.

Define a function $g : \mathbb{R}^n \to \mathbb{R}$ by $g(\xi_1, \ldots, \xi_n) = f(\xi_1, \ldots, \xi_n, 0, \ldots)$. Since $f$ is linear and continuous, it follows that $g \in (\mathbb{R}^n)^*$. Consequently, there exists $\alpha_k \in \mathbb{R}$ for each $k \in \{1, \ldots, n\}$ such that

\[ g(\xi_1, \ldots, \xi_n) = \sum_{k=1}^{n} \alpha_k \xi_k, \quad (\xi_k)_{k=1}^{n} \in \mathbb{R}^n. \]

Since the value of $f(\xi)$ depends only on the first $n$ coordinates of $\xi$, we conclude that

\[ f(\xi) = \sum_{k=1}^{n} \alpha_k \xi_k, \quad \xi = (\xi_k)_{k=1}^{\infty} \in \omega. \]

5.4 The Geometric Hahn–Banach Theorem

In this section, we will meet the Hahn–Banach Theorem without the advantages of a norm structure. The key property a space must have, we shall see, is local convexity.

Definition 5.14 Let $X$ be a real or complex vector space. A subset $V$ of $X$ is called convex if given any $x$ and $y$ in $V$, we have $(1 - t)x + ty \in V$ for all $t \in [0, 1]$. That is, if two points are in $V$, then the line segment joining them is also in $V$. A balanced convex set is called absolutely convex.

Lemma 5.15 Let $X$ be a real or complex vector space. A subset $V$ of $X$ is absolutely convex if and only if $\alpha x + \beta y \in V$ whenever $x$ and $y$ are in $V$ and $\alpha$ and $\beta$ are scalars such that $|\alpha| + |\beta| \leq 1$.

Proof We first observe that $V$ is absolutely convex if the latter condition holds: to show balance, take $\beta = 0$; to show convexity, let $\alpha = 1 - t$ and $\beta = t$.

Now suppose $V$ is absolutely convex. Let $x$ and $y$ be in $V$ and suppose $\alpha$ and $\beta$ are scalars such that $|\alpha| + |\beta| \leq 1$. We wish to show $\alpha x + \beta y \in V$. If $\alpha = \beta = 0$, then $\alpha x + \beta y = 0 \in V$ because $V$ is balanced, so assume $|\alpha| + |\beta| > 0$. Write $\alpha = |\alpha| \sigma$ and $\beta = |\beta| \tau$, where $\sigma$ and $\tau$ are scalars with $|\sigma| = |\tau| = 1$. Observe that

\[ \alpha x + \beta y = \frac{|\alpha|}{|\alpha| + |\beta|} \, (|\alpha| + |\beta|) \sigma x + \frac{|\beta|}{|\alpha| + |\beta|} \, (|\alpha| + |\beta|) \tau y. \]

Since $V$ is balanced, $x' = (|\alpha| + |\beta|) \sigma x$ and $y' = (|\alpha| + |\beta|) \tau y$ are both elements of $V$. Thus, by convexity,

\[ \alpha x + \beta y = \frac{|\alpha|}{|\alpha| + |\beta|} \, x' + \frac{|\beta|}{|\alpha| + |\beta|} \, y' \in V. \]

This completes the proof. □

Definition 5.16 A topological vector space is locally convex if there is a base of neighborhoods of 0 consisting of convex sets.

By Proposition 5.9, we can always take the elements of a base in a locally convex topological vector space to be balanced, and hence absolutely convex.

Example 5.17 Any normed space is locally convex. It is easy to see that balls with center at the origin are convex.
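The convexity of balls can be checked numerically in the plane. The following is a minimal sketch, not part of the original text: the helper names `p_norm` and `midpoint` are hypothetical, and only the single midpoint of the two standard unit vectors is tested. For $p \geq 1$ the midpoint stays in the closed unit ball; for $p = 1/2$, where the quantity $(\sum |v_i|^p)^{1/p}$ is not a norm, it does not.

```python
def p_norm(v, p):
    """Return (sum |v_i|^p)^(1/p); this is a genuine norm only when p >= 1."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)


def midpoint(u, v):
    """Return the midpoint (u + v) / 2 of two vectors given as tuples."""
    return tuple((a + b) / 2 for a, b in zip(u, v))


# Two points on the unit sphere for every p > 0.
e1, e2 = (1.0, 0.0), (0.0, 1.0)
m = midpoint(e1, e2)

for p in (0.5, 1.0, 2.0):
    # A convex unit ball must contain the midpoint, i.e. p_norm(m, p) <= 1.
    print(p, p_norm(m, p), p_norm(m, p) <= 1.0)
```

The midpoint has $p$-value $2^{1/p - 1}$, which exceeds 1 exactly when $p < 1$; a single failing midpoint is enough to rule out convexity of that ball.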

Fig. 5.1 Closed unit balls in $\ell_p^2$ for various values of $p$ (panel labels include $p = \infty$ and $p = 2$)

Example 5.18 Consider the space $\ell_p^2$ of ordered pairs in the $\|\cdot\|_p$ norm for $p > 0$. (We use the term “norm” here even though it is not a norm when 0

Example 5.19 Let $X = L_p(0, 1)$ for 0 0 such that $\delta B \subseteq V$, where $B = \{f : \|f\|_p < 1\}$. (We remind the reader that $\|\cdot\|_p$ is not a norm in this case.)

Choose any $f \in X$. Because $p < 1$, there is some $n \in \mathbb{N}$ such that $n^{p-1} \|f\|_p^p < \delta$. Pick real numbers $\{t_0, t_1, \ldots, t_n\}$ so that $0 = t_0 < t_1 < \cdots < t_n = 1$ and

\[ \int_{t_{k-1}}^{t_k} |f(s)|^p \, ds = \frac{1}{n} \|f\|_p^p, \quad k \in \{1, \ldots, n\}. \]

For each $k \in \{1, \ldots, n\}$, let $g_k = n f \chi_{(t_{k-1}, t_k]}$. Then $\|g_k\|_p^p = n^{p-1} \|f\|_p^p < \delta$, and therefore $g_k \in \delta B \subseteq V$. This is true for each $k \in \{1, \ldots, n\}$, and so $\{g_1, \ldots, g_n\} \subseteq V$. Observe that $f = \frac{1}{n}(g_1 + \cdots + g_n)$. Since $V$ is convex, it follows that $f \in V$. The choice of $f \in X$ was arbitrary, and so $V = X$, as required.

Note that this argument implies that $L_p(0, 1)^* = \{0\}$, a fact we first observed in Example A in Sect. 5.3.

The next theorem is a geometric version of the Hahn–Banach Theorem. This version of the theorem is not set in the context of a complete normed space, but in that of a locally convex topological vector space.

Theorem 5.20 (Hahn–Banach Separation Theorem) Let $E$ be a real locally convex topological vector space. Let $K$ be a closed nonempty convex subset of $E$. If $x_0 \notin K$, then there exists a continuous linear functional $f$ on $E$ such that

\[ f(x_0) > \sup_{y \in K} f(y). \]

Proof Without loss of generality, we may assume $0 \in K$. (If not, use a translation.) Since $K$ is closed and $x_0 \notin K$, there exists some open neighborhood $N$ of $x_0$ such that $N \cap K = \emptyset$. It follows that there exists an absolutely convex open neighborhood $W$ of 0 such that $(x_0 + W) \cap K = \emptyset$. This implies that $x_0 \notin K + W$, for otherwise there would exist some $k \in K$ and $w \in W$ such that $x_0 - w = k$, contradicting the fact that the intersection of $x_0 + W$ with $K$ is empty. (Here we use the fact that $W = -W$, because $W$ is balanced.)

Let $V = K + \frac{1}{2} W$. Then $V$ is a convex neighborhood of 0. Define a function $p : E \to \mathbb{R}$ by

\[ p(x) = \inf\{\lambda > 0 : x \in \lambda V\}, \quad x \in E. \]

Recall that every neighborhood of 0 is absorbent. In particular $V$ is absorbent, and so $p(x) < \infty$ for all $x \in E$.

We claim $p$ is sublinear. For any $x \in E$ and $\alpha > 0$,

\[ p(\alpha x) = \inf\{\lambda > 0 : \alpha x \in \lambda V\} = \alpha \inf\left\{\frac{\lambda}{\alpha} > 0 : x \in \frac{\lambda}{\alpha} V\right\} = \alpha p(x). \]

(The case $\alpha = 0$ is immediate, because $p(0) = 0$.) This proves positive homogeneity. It remains to show that $p$ is subadditive. Let $x$ and $y$ be in $E$ and let $\epsilon > 0$. Because $p(x)$ and $p(y)$ are infima, there exist real numbers $\lambda > 0$ and $\mu > 0$ such that $p(x) < \lambda < p(x) + \epsilon/2$ and $p(y) < \mu < p(y) + \epsilon/2$. Since $V$ is convex and contains 0, it follows that $x \in \lambda V$ and $y \in \mu V$, and hence, again by convexity, $x + y \in \lambda V + \mu V = (\lambda + \mu) V$. Therefore, $p(x + y) \leq \lambda + \mu < p(x) + p(y) + \epsilon$. The choice of $\epsilon$ was arbitrary, and so $p$ is subadditive.

Next, we claim that $p(x_0) > 1$. Suppose to the contrary that $p(x_0) \leq 1$. It follows that $x_0/\lambda \in V$ for all $\lambda > 1$. Since $x_0/\lambda \to x_0$ as $\lambda \to 1$, we conclude that $x_0 \in \overline{V}$, and consequently $(x_0 + \frac{1}{2} W) \cap V \neq \emptyset$. Recall that $V = K + \frac{1}{2} W$. Thus, $(x_0 + \frac{1}{2} W) \cap (K + \frac{1}{2} W) \neq \emptyset$, and so there exists an element $k \in K$ and elements $w_1$ and $w_2$ in $W$ such that $x_0 + \frac{1}{2} w_1 = k + \frac{1}{2} w_2$. Hence,

\[ x_0 = k + \frac{1}{2} w_2 - \frac{1}{2} w_1 \in K + \frac{1}{2} W - \frac{1}{2} W. \]

Because $W$ is absolutely convex, we have that $\frac{1}{2} W - \frac{1}{2} W \subseteq W$. From this we conclude that $x_0 \in K + W$. This is a contradiction, and so it must be that $p(x_0) > 1$.

We now make use of Exercise 3.9. There exists a linear functional $f$ on $E$ such that $f \leq p$ and $f(x_0) > 1$. Because $K \subseteq V$, and because $p(x) \leq 1$ for all $x \in V$, we have

\[ \sup_{y \in K} f(y) \leq 1 < f(x_0). \]

It remains to show that $f$ is continuous. Since $0 \in K$, we have that $\frac{1}{2} W \subseteq K + \frac{1}{2} W = V$. By construction, $f(x) \leq p(x) \leq 1$ for all $x \in V$, and hence $f(x) \leq 1$ for all $x \in \frac{1}{2} W$. The set $W$ is balanced, and thus $|f(x)| \leq 1$ for all $x \in \frac{1}{2} W$. Therefore, we have demonstrated that $f(\frac{1}{2} W) \subseteq [-1, 1]$. Consequently, the linear functional $f$ is continuous, by Proposition 5.13. □

Example 5.21 Suppose $E$ is a real locally convex topological vector space and $K$ is a closed linear subspace of $E$. If $x_0 \notin K$, then, by Theorem 5.20, there exists a continuous linear functional $f$ on $E$ such that $f(K) = 0$ and $f(x_0) > 0$. (See Exercise 5.20.)

There is also a version of Theorem 5.20 for complex topological vector spaces.

Theorem 5.22 Let $E$ be a complex locally convex topological vector space. Let $K$ be a closed nonempty convex subset of $E$. If $x_0 \notin K$, then there exists a continuous linear functional $f$ on $E$ such that

\[ \Re(f(x_0)) > \sup_{x \in K} \Re(f(x)). \]

Proof Ignoring multiplication by complex scalars, we may treat $E$ as a vector space over $\mathbb{R}$. Therefore, by Theorem 5.20, there exists a continuous real-linear functional $g$ on $E$ such that $g(x_0) > \sup_{x \in K} g(x)$. Now, define a complex-linear functional on $E$ by $f(x) = g(x) - i g(ix)$ for all $x \in E$. Since $\Re(f) = g$, the functional $f$ is the desired continuous linear functional on $E$. □

Definition 5.23 Let $X$ be a vector space and let $\mathbb{K}$ denote the scalar field. A function $p : X \to \mathbb{R}$ is called a semi-norm if the following three conditions are satisfied:
(i) $p(x) \geq 0$ for all $x \in X$,
(ii) $p(x + y) \leq p(x) + p(y)$ for all $\{x, y\} \subseteq X$, and
(iii) $p(\alpha x) = |\alpha| \, p(x)$ for all $\alpha \in \mathbb{K}$ and $x \in X$.

What distinguishes a semi-norm from a norm is that a semi-norm $p$ may satisfy $p(x) = 0$ even when $x \neq 0$. As in the case of a norm, we call the property in (ii) subadditivity (or the triangle inequality) and we call the property in (iii) homogeneity.

Theorem 5.24 Suppose $\{p_\alpha\}_{\alpha \in A}$ is a family of semi-norms on a vector space $X$. Let

\[ V(\alpha, n) = \{x : p_\alpha(x) < 1/n\}, \quad \alpha \in A, \ n \in \mathbb{N}. \]

If $\eta$ is the collection of all finite intersections of the sets $V(\alpha, n)$, where $\alpha \in A$ and $n \in \mathbb{N}$, then $\eta$ determines a locally convex vector topology on $X$ in which the elements of $\eta$ form an absolutely convex base of neighborhoods at 0.

Proof We define a topology on $X$ by declaring a set $E \subseteq X$ to be open if and only if $E$ is a (possibly empty) union of translates of elements in $\eta$. This defines a topology for which all members of $\eta$ are absolutely convex (that is, convex and balanced). It remains to show that addition and scalar multiplication are continuous.

Let $U$ be an open neighborhood of 0 in $X$. Without loss of generality, we may assume $U$ is an element of $\eta$. Thus,

\[ U = V(\alpha_1, n_1) \cap \cdots \cap V(\alpha_k, n_k) \tag{5.2} \]

for $\{\alpha_1, \ldots, \alpha_k\} \subseteq A$ and $\{n_1, \ldots, n_k\} \subseteq \mathbb{N}$. If $V = V(\alpha_1, 2n_1) \cap \cdots \cap V(\alpha_k, 2n_k)$, then $V + V \subseteq U$ (because $p_\alpha$ is subadditive for every $\alpha \in A$). Therefore, addition is continuous.

Now, let $x \in X$ and $\kappa \in \mathbb{K}$, where $\mathbb{K}$ is the scalar field. A basic open neighborhood of $\kappa x$ can be written as $\kappa x + U$, where $U$ is written as in (5.2). We will show there exists an open neighborhood $W$ of $x$ and a $\delta > 0$ such that $\lambda W \subseteq \kappa x + U$ for all $|\kappa - \lambda| < \delta$.

Let $V = V(\alpha_1, 2n_1) \cap \cdots \cap V(\alpha_k, 2n_k)$, as above. Since $V$ is an open neighborhood of 0, it is absorbent. Thus, there exists some $n \in \mathbb{N}$ such that $x \in nV$. Let

\[ \delta = \frac{1}{n} \quad \text{and} \quad W = x + \frac{n}{1 + |\kappa| n} V. \]

Suppose $w \in W$ and $\lambda \in B(\kappa, \delta)$. Then

\[ \kappa x - \lambda w = (\kappa - \lambda) x + \lambda (x - w). \]

Observe that $x = n v_1$ and $w - x = \frac{n}{1 + |\kappa| n} v_2$ for some choice of $v_1$ and $v_2$ in $V$. Hence,

\[ \kappa x - \lambda w = (\kappa - \lambda) n v_1 - \frac{\lambda n}{1 + |\kappa| n} v_2. \]

Therefore, because $V$ is balanced,

\[ \kappa x - \lambda w \in |\kappa - \lambda| \, n V + \frac{|\lambda| n}{1 + |\kappa| n} V \subseteq V + V \subseteq U. \]

It follows that scalar multiplication is continuous, and so the proof is complete. □

Definition 5.25 Suppose $X$ is a topological vector space and let $U$ be an absorbent subset of $X$. The Minkowski functional of $U$ on $X$ is the function $p_U : X \to \mathbb{R}$ defined by

\[ p_U(x) = \inf\{\lambda > 0 : x \in \lambda U\}, \quad x \in X. \]
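The definition of the Minkowski functional can be explored numerically. The sketch below is an illustration under stated assumptions rather than anything from the text: the function name `minkowski`, the membership test `in_ellipse`, and the choice of an open ellipse in $\mathbb{R}^2$ as the set $U$ are all hypothetical. It approximates the infimum by bisection, which is valid when $U$ is absorbent and star-shaped about 0 (for instance, convex with $0 \in U$), so that $x \in \lambda U$ implies $x \in \mu U$ for all $\mu \geq \lambda$.

```python
import math


def minkowski(x, in_U, hi=1.0, tol=1e-10):
    """Approximate inf{lam > 0 : x in lam * U} by bisection.

    Assumes U is absorbent and star-shaped about 0, so that membership of
    x / lam in U is monotone in lam. `in_U` is a membership test for U.
    """
    def scaled(lam):
        return tuple(c / lam for c in x)

    while not in_U(scaled(hi)):      # grow hi until x lies in hi * U
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol:             # shrink [lo, hi] around the infimum
        mid = (lo + hi) / 2.0
        if in_U(scaled(mid)):
            hi = mid
        else:
            lo = mid
    return hi


def in_ellipse(v):
    """Membership test for the open ellipse {(a, b) : a^2 / 4 + b^2 < 1}."""
    return v[0] ** 2 / 4.0 + v[1] ** 2 < 1.0


x, y = (3.0, 4.0), (-2.0, 1.0)
print(minkowski(x, in_ellipse), math.sqrt(x[0] ** 2 / 4.0 + x[1] ** 2))
```

For this particular $U$, which is absolutely convex, the computed value agrees with the norm $\sqrt{a^2/4 + b^2}$ whose open unit ball is $U$, and the semi-norm properties (homogeneity and subadditivity) can be spot-checked on sample vectors.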

Note that pU (x) < ∞ for all x ∈ X, because U is absorbent. Suppose that X is a locally convex topological vector space. Then X has a base of neighborhoods of 0 that are absolutely convex. Such sets are absorbent, and so each such set will give rise to a well-defined Minkowski functional. Proposition 5.26 Let X be a topological vector space and let U be an absorbent absolutely convex subset of X. The Minkowski functional pU is a semi-norm on X.

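Proof Certainly $p_U(x) \geq 0$ for each $x \in X$, by the definition of $p_U$.

To show the subadditivity of $p_U$, we will use the convexity of $U$. Let $x$ and $y$ be elements in $X$ and let $\epsilon > 0$. By the definition of $p_U$, there exist numbers $\lambda_1 > 0$ and $\lambda_2 > 0$ such that $p_U(x) < \lambda_1 < p_U(x) + \epsilon/2$ and $p_U(y) < \lambda_2 < p_U(y) + \epsilon/2$. Since $U$ is convex and contains 0, it follows that $x \in \lambda_1 U$ and $y \in \lambda_2 U$, and hence

\[ x + y \in \lambda_1 U + \lambda_2 U = (\lambda_1 + \lambda_2) U. \]

Therefore, $p_U(x + y) \leq \lambda_1 + \lambda_2 < p_U(x) + p_U(y) + \epsilon$. The choice of $\epsilon$ was arbitrary, and so $p_U(x + y) \leq p_U(x) + p_U(y)$. (Compare to the proof of Theorem 5.20.)

Finally, we show homogeneity. Let $\alpha \in \mathbb{K}$, where $\mathbb{K}$ is the field of scalars. The case $\alpha = 0$ is immediate, so assume $\alpha \neq 0$. Computing directly, we have

\[ p_U(\alpha x) = \inf\{\lambda > 0 : \alpha x \in \lambda U\} = |\alpha| \inf\left\{\frac{\lambda}{|\alpha|} > 0 : x \in \frac{\lambda}{|\alpha|} \cdot \operatorname{sign}(\alpha) U\right\}. \]

Since $U$ is balanced, $\operatorname{sign}(\alpha) U = U$. Letting $\lambda' = \lambda/|\alpha|$,

\[ p_U(\alpha x) = |\alpha| \inf\{\lambda' > 0 : x \in \lambda' U\} = |\alpha| \, p_U(x). \]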

Therefore, $p_U$ is a semi-norm on $X$, as claimed. □

If $X$ is a locally convex topological vector space, then there exists a base of absolutely convex neighborhoods of 0, say $\eta$. By Proposition 5.26, the Minkowski functional $p_U$ is a semi-norm on $X$ for each $U \in \eta$. By Theorem 5.24, the family of semi-norms $\{p_U\}_{U \in \eta}$ generates a locally convex vector topology on $X$. We leave it as an exercise to show that the topology generated by $\{p_U\}_{U \in \eta}$ is, in fact, the original topology. (See Exercise 5.17.)

So far, we have considered general topological vector spaces. We now focus our attention on topological vector spaces that have a complete norm structure, that is, Banach spaces. We have already said much about the norm topology of a Banach space $X$. We now consider a new topology on $X$, the so-called weak topology.

Definition 5.27 Let $X$ be a topological vector space. The weak topology on $X$ (or the $w$-topology) is defined by a base of neighborhoods at 0 of the form

\[ W(x_1^*, \ldots, x_n^*; \epsilon) = \{x : |x_i^*(x)| < \epsilon, \ 1 \leq i \leq n\}, \]

where $\epsilon > 0$ and $\{x_1^*, \ldots, x_n^*\} \subseteq X^*$ for $n \in \mathbb{N}$.

The weak topology on $X$ is the topology it inherits as a subspace of the space $\mathbb{K}^{X^*}$ with the product topology. The space $\mathbb{K}^{X^*}$ is the collection of all functions from $X^*$ into the scalar field $\mathbb{K}$, and we identify $X$ with a subspace of $\mathbb{K}^{X^*}$ by identifying $x \in X$ with $\hat{x} \in \mathbb{K}^{X^*}$ via the relationship $\hat{x}(x^*) = x^*(x)$ for all $x^* \in X^*$.

To distinguish between the norm and weak topologies on $X$, we will frequently denote $X$ with the norm topology by $(X, \|\cdot\|)$ and $X$ with the weak topology by $(X, w)$.

The weak and norm topologies are generally quite different. Any weakly open set is necessarily open in the norm topology (the basic sets are intersections of preimages of open sets under continuous maps), but not every set open in the norm topology will be weakly open. (We will demonstrate this shortly.)
The weak topology on $X$ generally has fewer open sets, and so it is “harder” for a function on $(X, w)$ to be continuous than a function on $(X, \|\cdot\|)$. For example, consider the identity map $\mathrm{Id}_X$ on $X$. The map $\mathrm{Id}_X : (X, \|\cdot\|) \to (X, w)$ is always continuous, but $\mathrm{Id}_X : (X, w) \to (X, \|\cdot\|)$ need not be. Indeed, if both maps are continuous, then the topologies must coincide, and then $X$ must be finite-dimensional. (See Proposition 5.30.)

Let us consider which sequences converge in $X$ with the weak topology. Without loss of generality, we may consider only those sequences converging to 0. If a sequence $(x_n)_{n=1}^{\infty}$ converges to 0 in the weak topology on $X$, we say that $(x_n)_{n=1}^{\infty}$ converges weakly to 0 (or $x_n \to 0$ weakly). The sequence $(x_n)_{n=1}^{\infty}$ converges weakly to 0 precisely when it converges coordinate-wise to 0 in $\mathbb{K}^{X^*}$. That is to say, $x_n \to 0$ weakly if and only if

\[ \lim_{n \to \infty} x^*(x_n) = 0 \quad \text{for all } x^* \in X^*. \]

In other words, $x_n$ converges to 0 in the weak topology if and only if every weak neighborhood of the origin eventually contains the sequence $(x_n)_{n=1}^{\infty}$.

A sequence converges to 0 in the norm topology if and only if every “strong” neighborhood of the origin eventually contains the sequence. However, the norm topology has more open neighborhoods about 0 than the weak topology. Consequently, it is more “difficult” for a sequence to converge in the norm topology than to converge in the weak topology.

Example 5.28 Consider $\ell_p$ for $1 \leq p < \infty$. For each $n \in \mathbb{N}$, let $e_n$ be the sequence with 1 in the $n$th coordinate, and 0 elsewhere. If $m$ and $n$ are elements of $\mathbb{N}$ such that $m \neq n$, then $\|e_m - e_n\|_p = 2^{1/p}$. Consequently, the sequence $(e_n)_{n=1}^{\infty}$ does not converge in the norm topology. On the other hand, if $x^* = (x_n^*)_{n=1}^{\infty}$ is a sequence in $(\ell_p)^* = \ell_q$, where $p > 1$ and $q$ is the exponent conjugate to $p$, then

\[ \lim_{n \to \infty} x^*(e_n) = \lim_{n \to \infty} x_n^* = 0. \]

Since this is true for all $x^* \in \ell_q$, we conclude that $e_n \to 0$ weakly.

The above conclusion does not remain true when $p = 1$. In this case, $q = \infty$. Let $e = (1, 1, 1, \ldots)$ be the constant sequence with all terms equal to 1. This sequence is bounded, and so $e \in \ell_\infty = (\ell_1)^*$. For each $n \in \mathbb{N}$, we have that $e(e_n) = 1$, and so $e_n$ does not converge to 0 in the weak topology in this case.
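The computations in Example 5.28 can be mimicked on finite truncations. The sketch below is only illustrative: the helper names `unit_vector`, `p_norm`, and `pair`, the truncation length `N = 50`, and the sample functional $x^* = (1/k)_{k \geq 1}$ (which lies in $\ell_q$ for every $q > 1$) are assumptions made for this illustration, and a finite truncation cannot prove a statement about limits.

```python
def unit_vector(n, length):
    """First `length` coordinates of e_n (1 in position n, counted from 1)."""
    return [1.0 if k == n else 0.0 for k in range(1, length + 1)]


def p_norm(v, p):
    """The l_p norm of a finitely supported sequence, for p >= 1."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)


def pair(x_star, x):
    """Apply the functional represented by x_star: the sum of x_star[k] * x[k]."""
    return sum(a * b for a, b in zip(x_star, x))


N = 50
p = 2.0

# Distinct unit vectors stay at distance 2**(1/p), so (e_n) is not norm-Cauchy.
em, en = unit_vector(3, N), unit_vector(7, N)
dist = p_norm([a - b for a, b in zip(em, en)], p)

# Pairing with x* = (1/k) picks out the n-th coordinate, which tends to 0.
x_star = [1.0 / k for k in range(1, N + 1)]
values = [pair(x_star, unit_vector(n, N)) for n in (1, 10, 50)]

# Pairing with e = (1, 1, 1, ...) in l_infinity always gives 1 (the case p = 1).
ones = [1.0] * N
ones_values = [pair(ones, unit_vector(n, N)) for n in (1, 10, 50)]

print(dist, values, ones_values)
```

The first two outputs reflect the situation for $p > 1$ (no norm convergence, but the pairings shrink), while the last shows the obstruction to weak convergence in $\ell_1$.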

Example 5.29 Consider the Banach space $L_p(\mathbb{T})$ of $p$-integrable complex-valued functions on the torus $\mathbb{T} = [0, 2\pi)$, where $1 \leq p < \infty$. For each $n \in \mathbb{N}$, define a function $f_n : \mathbb{T} \to \mathbb{C}$ by $f_n(\theta) = e^{in\theta}$, where $\theta \in \mathbb{T}$. Let $\Lambda \in L_p(\mathbb{T})^*$. By duality, there exists some $g \in L_q(\mathbb{T})$, where $1/p + 1/q = 1$, such that

\[ \Lambda(f) = \int_{\mathbb{T}} f(\theta) \, g(\theta) \, \frac{d\theta}{2\pi}, \quad f \in L_p(\mathbb{T}). \]

Therefore,

\[ \lim_{n \to \infty} \Lambda(f_n) = \lim_{n \to \infty} \int_{\mathbb{T}} e^{in\theta} g(\theta) \, \frac{d\theta}{2\pi} = \lim_{n \to \infty} \hat{g}(-n) = 0. \]

This last equality follows from the Riemann–Lebesgue Lemma (Theorem 4.37). Therefore, $\lim_{n \to \infty} \Lambda(f_n) = 0$ for all $\Lambda \in L_p(\mathbb{T})^*$, and so $f_n \to 0$ weakly. However, $\|f_n\|_{L_p(\mathbb{T})} = 1$ for all $n \in \mathbb{N}$, and so $f_n$ does not converge to 0 in the norm topology.

If $X$ is a finite-dimensional Banach space, then all linear functionals are continuous.

Proposition 5.30 Let $X$ be a Banach space. The following are equivalent:
(i) $\dim(X) < \infty$,
(ii) the weak topology on $X$ coincides with the norm topology on $X$, and
(iii) the weak topology on $X$ is metrizable.

Proof The implications (i) ⇒ (ii) ⇒ (iii) are clear. It remains to show (iii) ⇒ (i).

Assume the weak topology on $X$ is metrizable. Then $(X, w)$ is first countable, and so there exists a weak base of neighborhoods $(W_n)_{n=1}^{\infty}$ at the origin of the form

\[ W_n = \{x : |x_{n,j}^*(x)| \leq \epsilon_n, \ 1 \leq j \leq N_n\}, \]

where $x_{n,j}^* \in X^*$, $\epsilon_n > 0$, and $N_n \in \mathbb{N}$, for all $n \in \mathbb{N}$ and all $j \in \{1, \ldots, N_n\}$. For each $n \in \mathbb{N}$, define

\[ E_n = \operatorname{span}\{x_{n,j}^* : 1 \leq j \leq N_n\}. \]

Fix some $x^* \in X^*$. The set $\{x : |x^*(x)| \leq 1\}$ is a weak neighborhood of 0 in $X$, and consequently must contain $W_n$ for some $n \in \mathbb{N}$. For this fixed $n$, define a linear map $T : X \to \mathbb{K}^{N_n}$, where $\mathbb{K}$ is the scalar field, by

\[ T(x) = (x_{n,1}^*(x), \ldots, x_{n,N_n}^*(x)), \quad x \in X. \]

We claim $x^* \in (\ker T)^\perp$. (Recall Definition 3.50.) To verify this, suppose $y \in \ker(T)$. By the definition of $T$, we have that $x_{n,j}^*(y) = 0$ for all $j \in \{1, \ldots, N_n\}$. Naturally, if $\lambda \in \mathbb{K}$, then it follows that $x_{n,j}^*(\lambda y) = 0$ for all $j \in \{1, \ldots, N_n\}$. Consequently, $\lambda y \in W_n$ for all $\lambda \in \mathbb{K}$. By design, $W_n \subseteq \{x : |x^*(x)| \leq 1\}$, and so $|x^*(\lambda y)| \leq 1$ for all $\lambda \in \mathbb{K}$. This can occur only if $|x^*(y)| \leq 1/|\lambda|$ for all nonzero $\lambda \in \mathbb{K}$, and thus $x^*(y) = 0$. This remains true for any $y \in \ker(T)$, and so we have that $x^* \in (\ker T)^\perp$.

By Lemma 4.33, there then exists some $f \in (\mathbb{K}^{N_n})^*$ such that $x^*(x) = (f \circ T)(x)$ for all $x \in X$. Since $\mathbb{K}^{N_n}$ is finite-dimensional, there exists a finite sequence $(a_j)_{j=1}^{N_n}$ such that

\[ f(\xi_1, \ldots, \xi_{N_n}) = \sum_{j=1}^{N_n} a_j \xi_j, \quad (\xi_j)_{j=1}^{N_n} \in \mathbb{K}^{N_n}. \]

Therefore,

\[ x^*(x) = f(x_{n,1}^*(x), \ldots, x_{n,N_n}^*(x)) = \sum_{j=1}^{N_n} a_j x_{n,j}^*(x), \quad x \in X, \]

and so $x^* \in E_n$.

We have shown that each $x^* \in X^*$ is in $E_n$ for some $n \in \mathbb{N}$. We therefore conclude that $X^* = \bigcup_{n=1}^{\infty} E_n$. For each $n \in \mathbb{N}$, the space $E_n$ is finite-dimensional, and so is closed. Therefore, by Theorem 4.7 (the complementary version of the Baire Category Theorem), there exists some $n \in \mathbb{N}$ such that $\operatorname{int}(E_n) \neq \emptyset$. We conclude that $E_n$ contains an open neighborhood of the origin in $X^*$, and consequently is absorbent. Therefore, $X^* = \bigcup_{k=1}^{\infty} k E_n = E_n$. Thus, the space $X^*$ is finite-dimensional, and so $X$ is finite-dimensional, as well. □

Proposition 5.31 Let $X$ be a Banach space. Then:
(i) The weak topology on $X$ is a Hausdorff topology.
(ii) A linear functional is continuous in the weak topology if and only if it is continuous in the norm topology.

Proof (i) Assume $x_1$ and $x_2$ are elements in $X$ such that $x_1 \neq x_2$. By the Hahn–Banach Separation Theorem (Theorem 5.20), there exists an $x^* \in X^*$ such that $\epsilon = x^*(x_2 - x_1) > 0$. Therefore, the set $\{x : |x^*(x) - x^*(x_1)| < \epsilon/2\}$ is a weak neighborhood of $x_1$, the set $\{x : |x^*(x) - x^*(x_2)| < \epsilon/2\}$ is a weak neighborhood of $x_2$, and these two neighborhoods are disjoint. Hence, $(X, w)$ is a Hausdorff topological space.

(ii) If a linear functional $f$ is continuous in the weak topology on $X$, then $f^{-1}(V)$ is a weakly open set whenever $V$ is an open set in the scalar field. But the norm topology contains all of the weakly open sets, so $f^{-1}(V)$ is open in the norm topology. Therefore, $f$ is continuous in the norm topology on $X$. (The idea is that it is “easier” to be continuous in the norm topology, because there are more open sets.)

Now, suppose $f$ is a norm continuous linear functional. Then $f \in X^*$, and so the set $\{x : |f(x)| < \epsilon\}$ is a weak neighborhood of 0 for every $\epsilon > 0$ (by the definition of the weak topology). Thus, $f$ is continuous in the weak topology on $X$. □

The weak topology on $X$ is the weakest topology on $X$ such that all norm continuous linear functionals remain continuous. When we say a topology is weaker, we mean that it contains fewer open sets. The norm topology on $X$ is stronger than the weak topology on $X$, because it contains more open sets. Weakly open sets are open in the norm topology, but the converse need not be true.

A function is continuous if the preimage of any open set is open. The stronger the topology on the domain, the easier it is for a function to be continuous, because with more open sets, it is more likely that a given preimage is open.

Definition 5.32 Let $X$ be a topological vector space. The weak∗ topology on $X^*$ (or the $w^*$-topology) is defined by a base of neighborhoods at 0 of the form

\[ W^*(x_1, \ldots, x_n; \epsilon) = \{x^* : |x^*(x_i)| < \epsilon, \ 1 \leq i \leq n\}, \]

where $\epsilon > 0$ and $\{x_1, \ldots, x_n\} \subseteq X$ for $n \in \mathbb{N}$.

The weak∗ topology on $X^*$ is the topology inherited from viewing $X^*$ as a subspace of $\mathbb{K}^X$, the space of all scalar-valued functions on $X$. As before, we endow $\mathbb{K}^X$ with the product topology. We use $(X^*, w^*)$ to denote $X^*$ with the weak∗ topology.

Observe that any $x \in X$ can be thought of as a linear functional on $X^*$ via the mapping $x \mapsto \phi_x$, where $\phi_x(x^*) = x^*(x)$ for all $x^* \in X^*$. The weak∗ topology on $X^*$ is the weakest topology on $X^*$ for which the linear functionals $\phi_x$ are continuous for all $x \in X$.

The Banach space $X^*$ also has a weak topology that is induced by its dual space $(X^*)^* = X^{**}$ (the bidual of $X$). The weak∗ topology on $X^*$ is weaker than the weak topology on $X^*$, because it requires fewer members of $X^{**}$ to be continuous (only those coming from $X$).

Example 5.33 Consider the sequence space $\ell_1$. In Example 5.28, we saw that $\ell_1$ had a weak topology induced upon it by $\ell_1^* = \ell_\infty$. In this weak topology, we saw that the sequence $(e_n)_{n=1}^{\infty}$ did not converge to 0 (because $e(e_n) = 1$ for all $n \in \mathbb{N}$, where $e = (1, 1, \ldots)$ is the constant sequence with all terms equal to 1). The space $\ell_1$ can also be given a weak∗ topology as the dual space of $c_0$.

Suppose $\xi = (\xi_k)_{k=1}^{\infty}$ is an element of $c_0$. Since $c_0$ consists of sequences that converge to 0, it follows that $e_n(\xi) = \xi_n \to 0$ as $n \to \infty$. This is true for every $\xi \in c_0$, and so the sequence $(e_n)_{n=1}^{\infty}$ converges to 0 in the weak∗ topology on $\ell_1$.

In this example we have found a sequence which converges in the weak∗ topology on $\ell_1$, but not in the weak topology on $\ell_1$. This happens because the weak∗ topology has fewer open sets than the weak topology. (That is to say, the weak∗ topology is weaker than the weak topology.)

Proposition 5.34 Let $X$ be a Banach space. Then:
(i) The weak∗ topology on $X^*$ is a Hausdorff topology.
(ii) A linear functional $f$ on $X^*$ is weak∗-continuous if and only if there exists some $x \in X$ such that $f(x^*) = x^*(x)$ for all $x^* \in X^*$. (In other words, $(X^*, w^*)^* = X$.)

Proof (i) Let $x_1^*$ and $x_2^*$ be elements in $X^*$ such that $x_1^* \neq x_2^*$. Then there exists some $x \in X$ such that $x_1^*(x) \neq x_2^*(x)$. (Otherwise they would be the same as linear functionals on $X$.) If $\epsilon = |(x_1^* - x_2^*)(x)|$, then the sets $\{x^* : |x^*(x) - x_1^*(x)| < \epsilon/2\}$ and $\{x^* : |x^*(x) - x_2^*(x)| < \epsilon/2\}$ are disjoint weak∗-open sets containing $x_1^*$ and $x_2^*$, respectively.

(ii) Certainly, if $f(x^*) = x^*(x)$ for all $x^* \in X^*$, then $f$ is continuous in the weak∗ topology. It remains only to show that any weak∗-continuous linear functional on $X^*$ can be achieved in this way.

Assume $f$ is a continuous linear functional on $(X^*, w^*)$. By Proposition 5.13, there exists a basic neighborhood of 0 in $(X^*, w^*)$ on which $f$ is bounded. Thus, after shrinking $\epsilon$ if necessary, there exists a real number $\epsilon > 0$ and a finite set $\{x_1, \ldots, x_n\} \subseteq X$ such that $|f(x^*)| \leq 1$ for all $x^* \in W^*(x_1, \ldots, x_n; \epsilon)$.

Define a map $T : X^* \to \mathbb{K}^n$, where $\mathbb{K}$ is the scalar field, by

\[ T(x^*) = (x^*(x_1), \ldots, x^*(x_n)), \quad x^* \in X^*. \]

Suppose $x^* \in \ker(T)$. Then $x^*(x_j) = 0$ for each $j \in \{1, \ldots, n\}$. Thus, for any $\lambda \in \mathbb{K}$, we have that $x^*(\lambda x_j) = 0$ for $j \in \{1, \ldots, n\}$, and so $\lambda x^* \in W^*(x_1, \ldots, x_n; \epsilon)$. It follows that $|f(\lambda x^*)| \leq 1$, and consequently $|f(x^*)| \leq 1/|\lambda|$ for all $\lambda \neq 0$. From this we conclude that $f(x^*) = 0$, and hence $f \in (\ker T)^\perp$.

By Lemma 4.33, then, there exists a bounded linear functional $\phi : \mathbb{K}^n \to \mathbb{K}$ such that $f = \phi \circ T$. Therefore, there exists a finite collection of scalars $\{a_1, \ldots, a_n\} \subseteq \mathbb{K}$ such that

\[ f(x^*) = \phi(T x^*) = \phi(x^*(x_1), \ldots, x^*(x_n)) = \sum_{j=1}^{n} a_j x^*(x_j), \quad x^* \in X^*. \]

The desired element of $X$ is $x = \sum_{j=1}^{n} a_j x_j$. □

Remark 5.35 In Example 5.33 we saw that the weak∗ topology may be strictly weaker than the weak topology. If $X$ is a reflexive space (recall Definition 3.33), however, then the weak and weak∗ topologies on $X^*$ coincide.

Shortly, we will prove Proposition 5.37 which (in some sense) demonstrates that it is “hard” to be compact in a normed space. Before we state and prove this proposition, however, we need a lemma, which is of independent interest.

Lemma 5.36 All norms on a finite-dimensional vector space are equivalent.

Proof Let $X$ be a finite-dimensional vector space over the scalar field $\mathbb{K}$. Choose a basis $x_1, \ldots, x_n$ of $X$, so that $X = \operatorname{span}\{x_1, \ldots, x_n\}$. We recall that each element of $X$ has a unique representation of the form $\sum_{i=1}^{n} \alpha_i x_i$, where $\alpha_i \in \mathbb{K}$ for each $i \in \{1, \ldots, n\}$. Define a norm $|||\cdot|||$ on $X$ as follows:

\[ \Big|\Big|\Big| \sum_{i=1}^{n} \alpha_i x_i \Big|\Big|\Big| = \sum_{i=1}^{n} |\alpha_i|. \]

It is straightforward to show that this does indeed define a norm on $X$.

Now, let $\|\cdot\|$ be another norm on $X$. We will find positive constants $c$ and $C$ such that $c \, |||x||| \leq \|x\| \leq C \, |||x|||$ for all $x \in X$. By the triangle inequality,

\[ \Big\| \sum_{i=1}^{n} \alpha_i x_i \Big\| \leq \sum_{i=1}^{n} |\alpha_i| \cdot \|x_i\| \leq \Big(\max_i \|x_i\|\Big) \sum_{i=1}^{n} |\alpha_i| = \Big(\max_i \|x_i\|\Big) \Big|\Big|\Big| \sum_{i=1}^{n} \alpha_i x_i \Big|\Big|\Big|. \tag{5.3} \]

Thus, we may choose $C = \max_i \|x_i\|$. Next, define a set

\[ S = \Big\{ (\alpha_1, \ldots, \alpha_n) : \sum_{i=1}^{n} |\alpha_i| = 1 \Big\}. \]

Observe that $S$ is a closed and bounded subset of $\mathbb{K}^n$. Therefore, $S$ is compact by the Heine–Borel Theorem. Define a function $f : S \to \mathbb{R}_+$ by

\[ f(\alpha_1, \ldots, \alpha_n) = \Big\| \sum_{i=1}^{n} \alpha_i x_i \Big\|. \]

We claim that the function $f$ is continuous. To see this, observe that

\[ |f(\alpha_1, \ldots, \alpha_n) - f(\beta_1, \ldots, \beta_n)| = \bigg| \Big\| \sum_{i=1}^{n} \alpha_i x_i \Big\| - \Big\| \sum_{i=1}^{n} \beta_i x_i \Big\| \bigg| \leq \Big\| \sum_{i=1}^{n} \alpha_i x_i - \sum_{i=1}^{n} \beta_i x_i \Big\| \]

\[ = \Big\| \sum_{i=1}^{n} (\alpha_i - \beta_i) x_i \Big\| \leq \sum_{i=1}^{n} |\alpha_i - \beta_i| \, \|x_i\| \leq \Big( \sum_{i=1}^{n} |\alpha_i - \beta_i|^2 \Big)^{1/2} \Big( \sum_{i=1}^{n} \|x_i\|^2 \Big)^{1/2}. \]

The last inequality follows from the Cauchy–Schwarz Inequality. From this, it follows that $f$ is continuous.

By the Extreme Value Theorem, since $f$ is continuous on a compact set, the function $f$ attains a minimum value on the set $S$. Let $c$ be that minimum value. Note that $c > 0$, because $x_1, \ldots, x_n$ are linearly independent and no element of $S$ is zero. Then $f(\alpha_1, \ldots, \alpha_n) \geq c$ for all $(\alpha_1, \ldots, \alpha_n)$ in $S$. This means that $\big\| \sum_{i=1}^{n} \alpha_i x_i \big\| \geq c$ for all $(\alpha_1, \ldots, \alpha_n)$ in $\mathbb{K}^n$ such that $\sum_{i=1}^{n} |\alpha_i| = 1$. Alternately, for any $(\alpha_1, \ldots, \alpha_n)$ in $\mathbb{K}^n$,

\[ \Big\| \sum_{i=1}^{n} \alpha_i x_i \Big\| \geq c \sum_{i=1}^{n} |\alpha_i| = c \, \Big|\Big|\Big| \sum_{i=1}^{n} \alpha_i x_i \Big|\Big|\Big|. \tag{5.4} \]

Combining (5.3) and (5.4), we conclude that the two norms are equivalent. Since the norm $\|\cdot\|$ was arbitrary, it follows that all norms on $X$ are equivalent. □

Proposition 5.37 Suppose $X$ is a Banach space (or just a normed linear space). Then $B_X$ is compact in the norm topology on $X$ if and only if $\dim(X) < \infty$.

Proof Suppose $X$ is a finite-dimensional normed vector space. By Lemma 5.36, $X$ is homeomorphic to $\mathbb{K}^n$ with the Euclidean norm. Therefore, $B_X$ is compact by the Heine–Borel Theorem.

Next, suppose $B_X$ is compact in the norm topology on $X$. Denote by $B(x, r)$ the open ball of radius $r$ centered at $x \in X$. Since $B_X$ is compact, there exists a finite sequence $\{x_1, \ldots, x_k\}$ of elements in $X$, such that

\[ B_X \subseteq \bigcup_{j=1}^{k} B\Big(x_j, \frac{1}{2}\Big) \subseteq \bigcup_{j=1}^{k} \Big( x_j + \frac{1}{2} B_X \Big). \tag{5.5} \]

Let $F = \operatorname{span}\{x_1, \ldots, x_k\}$. Then (5.5) implies that $B_X \subseteq F + \frac{1}{2} B_X$. This is a recursive statement, and so we apply it to itself to get

\[ B_X \subseteq F + \frac{1}{2}\Big( F + \frac{1}{2} B_X \Big) = F + \frac{1}{2} F + \frac{1}{4} B_X = F + \frac{1}{4} B_X. \]

Continuing recursively, we have $B_X \subseteq F + \frac{1}{2^n} B_X$ for all $n \in \mathbb{N}$. Therefore,

\[ B_X \subseteq \bigcap_{n=1}^{\infty} \Big( F + \frac{1}{2^n} B_X \Big). \]

However, $F$ is closed, as a consequence of Lemma 5.36 (because $F$ is finite-dimensional). Thus,

\[ \bigcap_{n=1}^{\infty} \Big( F + \frac{1}{2^n} B_X \Big) = \overline{F} = F, \]

and so $B_X \subseteq F$. Since $B_X$ is absorbent, we have $X = \bigcup_{n=1}^{\infty} n B_X \subseteq \bigcup_{n=1}^{\infty} n F$. But $F$ is a vector space, and so $X \subseteq F$. Therefore, $X = F$, as required. □

While the unit ball in a Banach space can be compact in the norm topology only if the space is finite-dimensional, the closed unit ball of a dual space is always compact in the weak∗ topology. Before proving this statement, known as the Banach–Alaoglu Theorem, let us recall a theorem from general topology.

Theorem 5.38 (Tychonoff’s Theorem) Let $I$ be an arbitrary index set. If $\{K_i\}_{i \in I}$ is a collection of compact topological spaces, then $\prod_{i \in I} K_i$ is compact in the product topology.

We will not prove this theorem, but we do wish to point out it relies on the Axiom of Choice. We are now ready to state and prove the Banach–Alaoglu Theorem.

Theorem 5.39 (Banach–Alaoglu Theorem) If $X$ is a Banach space, then $B_{X^*}$ is compact in the weak∗ topology on $X^*$.

Proof Let $X$ be a Banach space over the scalar field $\mathbb{K}$. Recall that $X^*$ in the weak∗ topology is achieved by viewing $X^*$ as a subspace of $\mathbb{K}^X = \prod_{x \in X} \mathbb{K}$ in the product topology. We make this explicit by defining the map $\phi : X^* \to \mathbb{K}^X$ by

\[ \phi(x^*) = (x^*(x))_{x \in X}, \quad x^* \in X^*. \]

If $x^* \in B_{X^*}$, then for each $x \in X$, we have $|x^*(x)| \leq \|x\|$. Consequently,

\[ \phi(B_{X^*}) \subseteq \prod_{x \in X} \|x\| B_{\mathbb{K}}, \]

where $B_{\mathbb{K}}$ is the closed unit ball in the scalar field $\mathbb{K}$ and $\|x\| B_{\mathbb{K}}$ is the closed ball of radius $\|x\|$ centered at the origin. The product $A = \prod_{x \in X} \|x\| B_{\mathbb{K}}$ is compact, by Tychonoff’s Theorem.

There is no reason the image of $B_{X^*}$ would be all of $A$, but it is a closed subset. Indeed, the image is precisely the collection of elements in the following set:

\[ \bigcap_{\substack{\{\alpha_1, \alpha_2\} \subseteq \mathbb{K} \\ \{x_1, x_2\} \subseteq X}} \big\{ f : f(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 f(x_1) + \alpha_2 f(x_2) \big\} \ \cap \ \prod_{x \in X} \|x\| B_{\mathbb{K}}. \]

(The first set of relations ensures $f \in \mathbb{K}^X$ is linear, while the second ensures it is bounded.) Each set in the intersection is closed in the product topology, because it is defined by a relation among finitely many coordinates, and the coordinate maps are continuous. Therefore, $\phi(B_{X^*})$ is a closed subset of the compact set $A$, and hence $\phi(B_{X^*})$ is compact in the product topology on $\mathbb{K}^X$. It follows that $B_{X^*}$ is compact in the weak∗ topology on $X^*$, as required. □

The Banach–Alaoglu Theorem as given here is due to Leonidas Alaoglu [1], although the result was known to Banach. Banach did not have the notions of general topology available to him, and so he could not formulate it in this way.

5.5 Goldstine’s Theorem

Let $X$ be a Banach space. Recall that $X$ can be thought of as a subspace of its bidual $X^{**}$. The space $X^{**}$ is the dual space for $X^*$, and as such can be given a weak∗ topology. The weak∗ topology on $X^{**}$ is the weakest topology under which elements of $X^*$ define continuous functions on $X^{**}$. If we restrict to the subspace $X$, then the weakest topology under which elements of $X^*$ are continuous is the weak topology on $X$. Therefore

\[ (X^{**}, w^*)|_X = (X, w). \]

In other words, the restriction of the weak∗ topology on $X^{**}$ to $X$ is the weak topology on $X$.

Theorem 5.40 (Goldstine’s Theorem) If $X$ is a Banach space, then $B_X$ is weak∗-dense in $B_{X^{**}}$.

Proof Let $X$ be a Banach space. For simplicity, we will assume $X$ is real. (If $X$ is complex, the argument is similar.) Denote the closure of $B_X$ in the weak∗ topology on $X^{**}$ by $\overline{B_X}^{(w^*)}$. Our goal is to show the equality $\overline{B_X}^{(w^*)} = B_{X^{**}}$.

By the Banach–Alaoglu Theorem (Theorem 5.39), the set $B_{X^{**}}$ is a compact (and hence closed) set in the weak∗ topology on $X^{**}$. Therefore, since $B_X \subseteq B_{X^{**}}$, we see that $\overline{B_X}^{(w^*)} \subseteq B_{X^{**}}$.

Suppose $x_0^{**} \in B_{X^{**}} \setminus \overline{B_X}^{(w^*)}$. By the Hahn–Banach Separation Theorem (Theorem 5.20), there exists a weak∗-continuous linear functional $f$ on $X^{**}$ such that

\[ f(x_0^{**}) > \sup\{ f(u^{**}) : u^{**} \in \overline{B_X}^{(w^*)} \}. \tag{5.6} \]

By Proposition 5.34, since $f$ is continuous in the weak∗ topology on $X^{**}$, there exists an $x^* \in X^*$ such that $f(x^{**}) = x^{**}(x^*)$ for all $x^{**} \in X^{**}$. Therefore, (5.6) becomes

\[ x_0^{**}(x^*) > \sup\{ u^{**}(x^*) : u^{**} \in \overline{B_X}^{(w^*)} \} \geq \sup\{ x^*(x) : x \in B_X \} = \|x^*\|. \]

This implies $\|x_0^{**}\| > 1$, contradicting the assumption that $x_0^{**} \in B_{X^{**}}$. The result follows. □

In the proof of Goldstine’s Theorem, we assumed that $X$ was a real Banach space for the sake of simplicity. The argument is similar when the Banach space is complex, but instead of Theorem 5.20, which is the Hahn–Banach Separation Theorem for real spaces, we use Theorem 5.22, which is the Hahn–Banach Separation Theorem for complex spaces, and we replace $f$ with $\Re(f)$.

Theorem 5.41 A Banach space X is reflexive if and only if the closed unit ball BX is weakly compact.

Proof Assume first that $X$ is reflexive. Then $B_X = B_{X^{**}}$. By the Banach–Alaoglu Theorem (Theorem 5.39), the set $B_{X^{**}}$ is compact in the weak∗ topology on $X^{**}$. Since $X$ is reflexive, the weak∗ topology on $X^{**}$ coincides with the weak topology on $X$. Therefore, $B_X$ is compact in the weak topology on $X$.

Now, assume instead that $B_X$ is weakly compact. The weak topology on $X$ is the restriction to $X$ of the weak∗ topology on $X^{**}$, and so $B_X$ is compact (and hence closed) in the weak∗ topology on $X^{**}$. By Goldstine’s Theorem (Theorem 5.40), we conclude that $B_X = B_{X^{**}}$, since $B_X$ is closed and dense in $B_{X^{**}}$. Therefore, $X$ is reflexive. □

Proposition 5.42 Suppose $X$ and $Y$ are Banach spaces (or simply normed linear spaces). If $T : X \to Y$ is a linear map, then the following are equivalent:
(i) $T$ is bounded (i.e., norm-to-norm continuous).
(ii) $T$ is $(X, \|\cdot\|)$ to $(Y, w)$ continuous.
(iii) $T$ is $(X, w)$ to $(Y, w)$ continuous.

Proof Certainly (iii) implies (ii). We will show that (ii) implies (i), and then (i) implies (iii).

Assume (ii). We wish to show that $T(B_X)$ is bounded in the norm topology on Y. Let $y^* \in Y^*$. Then $y^*$ is continuous in the weak topology on Y. Consequently, since T is norm-to-weak continuous, the functional $y^* \circ T$ is continuous in the norm topology on X. Thus, $y^* \circ T \in X^*$, and so

\[ \sup_{\|x\| \le 1} |y^*(Tx)| = \sup_{\|x\| \le 1} |(y^* \circ T)(x)| < \infty. \tag{5.7} \]

Since (5.7) holds for each $y^* \in Y^*$, we conclude that the set $T(B_X)$ is weakly bounded in Y. Therefore, $T(B_X)$ is bounded in the norm topology, by Theorem 4.12.

Now assume (i). Consider a weak neighborhood in Y, say
\[ W_Y = W_Y(y_1^*, \ldots, y_n^*; \epsilon) = \{ y : |y_j^*(y)| < \epsilon,\ 1 \le j \le n \}, \]
for $\{y_1^*, \ldots, y_n^*\} \subseteq Y^*$ and $\epsilon > 0$. Suppose $x \in X$ is such that $Tx \in W_Y$. Then for each $j \in \{1, \ldots, n\}$, we have $|y_j^*(Tx)| < \epsilon$. Recall that the adjoint operator $T^*$ was defined so that $T^* y^* = y^* \circ T$. Therefore, $|T^* y_j^*(x)| < \epsilon$ for all $j \in \{1, \ldots, n\}$, and so it follows that $x \in W_X = W_X(T^* y_1^*, \ldots, T^* y_n^*; \epsilon)$, a weak neighborhood in X. We conclude that $T^{-1}(W_Y) \subseteq W_X$. Equality is obtained by running through the same argument in reverse, and so T is weak-to-weak continuous, as required. □

Suppose that $T : X \to Y$ is a bounded linear mapping between real Banach spaces. If X is reflexive, then $T(B_X)$ is weakly compact, and hence norm-closed in Y. This is not true in general (i.e., for non-reflexive spaces X). Consider any $x^* \in X^*$ with $\|x^*\| = 1$. Then $x^*(B_X)$ could be either $(-1, 1)$ or $[-1, 1]$. If X is reflexive, then the second interval (the closed one) is the only option.

Example 5.43 Consider the real Banach space $X = \ell_1$. Recall that $\ell_1^* = \ell_\infty$. Let $\xi$ in $\ell_\infty$ be the bounded sequence $\xi = (1 - 1/n)_{n=1}^\infty$. Now suppose $x = (x_n)_{n=1}^\infty$ is any element in $B_{\ell_1}$, so that $\sum_{n=1}^\infty |x_n| \le 1$. Then
\[ |\xi(x)| = \Big| \sum_{n=1}^\infty \xi_n x_n \Big| \le \sum_{n=1}^\infty \xi_n |x_n| < 1. \]
Since $\|\xi\|_\infty = 1$, we have a norm-one element $\xi \in \ell_1^*$ such that $\xi(B_{\ell_1}) = (-1, 1)$.

We see that a linear functional on a reflexive Banach space attains its maximum value on the closed unit ball. It was a long-standing question whether or not this property characterized reflexive spaces. In 1964, R.C. James showed that it did when he proved the statement: If every bounded linear functional on X attains its maximum value on the closed unit ball, then X is reflexive [19].

We conclude this section with a result about the adjoint operator.

Proposition 5.44 If $T : X \to Y$ is a bounded linear map between Banach spaces, then $T^* : Y^* \to X^*$ is weak$^*$-to-weak$^*$ continuous.

Proof The proof is very similar to the proof that (i) implies (iii) in Proposition 5.42. Consider a weak$^*$ neighborhood in $X^*$, say
\[ W_{X^*} = W_{X^*}(x_1, \ldots, x_n; \epsilon) = \{ x^* : |x^*(x_j)| < \epsilon,\ 1 \le j \le n \}, \]
for $\{x_1, \ldots, x_n\} \subseteq X$ and $\epsilon > 0$. Suppose $y^* \in Y^*$ is such that $T^* y^* \in W_{X^*}$. Then $|T^* y^*(x_j)| < \epsilon$ for all $j \in \{1, \ldots, n\}$. But $T^* y^*(x) = y^*(Tx)$ for all $x \in X$, and so $|y^*(Tx_j)| < \epsilon$ for all $j \in \{1, \ldots, n\}$. Then $y^* \in W_{Y^*} = W_{Y^*}(Tx_1, \ldots, Tx_n; \epsilon)$, which is a weak$^*$ neighborhood in $Y^*$. This implies that $(T^*)^{-1}(W_{X^*}) \subseteq W_{Y^*}$. Similarly, $W_{Y^*} \subseteq (T^*)^{-1}(W_{X^*})$, and so $T^*$ is weak$^*$-to-weak$^*$ continuous, as required. □
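As a quick numerical check of Example 5.43, the following sketch evaluates the functional ξ on the unit vectors of ℓ_1 using exact rational arithmetic. The helper names `xi` and `unit_vector` are illustrative choices and not notation from the text, and a sequence is modeled as a finite Python list, so only finitely supported elements of ℓ_1 are covered.

```python
from fractions import Fraction

def xi(x):
    """Apply xi = (1 - 1/n)_{n>=1} to a finitely supported sequence x (a list)."""
    return sum((1 - Fraction(1, n)) * xn for n, xn in enumerate(x, start=1))

def unit_vector(n):
    """Return the n-th unit vector e_n as a list of length n."""
    return [0] * (n - 1) + [1]

# xi(e_n) = 1 - 1/n: the values climb toward 1 but stay strictly below it.
values = [xi(unit_vector(n)) for n in (1, 10, 100, 1000)]
```

The values 1 − 1/n approach 1 without reaching it, which is consistent with the supremum of ξ over the unit ball being 1 but not attained.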

5.6 Mazur’s Theorem

In this section we explore the consequences of convexity in weak topologies.

Theorem 5.45 (Mazur's Theorem) Let X be a locally convex topological vector space. A convex subset of X is closed if and only if it is weakly closed.

Proof Without loss of generality, assume X is a real topological vector space. A weakly closed set is always closed in the original topology, regardless of convexity. Conversely, suppose K is convex and closed in the original topology, and let $\overline{K}^{(w)}$ denote the closure of K in the weak topology. Assume $x_0 \in \overline{K}^{(w)} \setminus K$. Then, by the Hahn–Banach Separation Theorem (Theorem 5.20), there is an $x^* \in X^*$ such that
\[ x^*(x_0) > \sup\{ x^*(x) : x \in K \}. \]
This contradicts the assumption that $x_0$ is in the weak closure of K, and so $\overline{K}^{(w)} = K$. □

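Example 5.46 Consider the real sequence space $\ell_p$, where $1 < p < \infty$, and let $e_n$ denote the $n$th standard unit vector. The averages of the first $n$ unit vectors are convex combinations of elements of $\{e_n : n \in \mathbb{N}\}$, and they converge to 0 in norm:

\[ \Big\| \frac{1}{n}(e_1 + \cdots + e_n) \Big\|_p = \Big( \sum_{j=1}^n \frac{1}{n^p} \Big)^{1/p} = n^{\frac{1}{p} - 1} \xrightarrow[n \to \infty]{} 0. \]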

The same cannot be said of $\ell_1$. In this case, there exists no convex combination of elements in $\{e_n : n \in \mathbb{N}\}$ that will approximate 0. To see this, let $\lambda_1 e_1 + \cdots + \lambda_n e_n$ be any convex combination of elements from $\{e_n : n \in \mathbb{N}\}$. Then
\[ \|\lambda_1 e_1 + \cdots + \lambda_n e_n\|_1 = \|(\lambda_1, \lambda_2, \ldots, \lambda_n, 0, \ldots)\|_1 = \sum_{j=1}^n \lambda_j = 1. \]

In the above example, we introduced the notation co(E) to denote the set of convex linear combinations of elements in the set E. This idea will prove important later, and so we give the following definition.

Definition 5.47 Let X be a topological vector space and let A be any subset of X. The convex hull of A is the smallest convex subset of X that contains A. The convex hull of A is denoted by co(A) and consists of all convex linear combinations of elements in A; that is,
\[ \mathrm{co}(A) = \Big\{ \sum_{j=1}^m \lambda_j a_j : a_j \in A,\ \lambda_j > 0,\ \sum_{j=1}^m \lambda_j = 1,\ m \in \mathbb{N} \Big\}. \]
The closed convex hull of A is the closure of the convex hull and is denoted $\overline{\mathrm{co}}(A)$.
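The two norm computations in Example 5.46 can be checked numerically. The sketch below is only an illustration under the assumption that a finitely supported sequence is stored as a Python list; the names `lp_norm` and `average_of_unit_vectors` are hypothetical helpers, not notation from the text.

```python
def lp_norm(x, p):
    """Return the l_p norm of a finitely supported sequence given as a list."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def average_of_unit_vectors(n):
    """Return (e_1 + ... + e_n) / n, a convex combination of unit vectors."""
    return [1.0 / n] * n

# For p = 2 the norm of the average is n**(1/p - 1), which shrinks as n grows;
# for p = 1 the norm of the same convex combination stays equal to 1.
norm_p2 = lp_norm(average_of_unit_vectors(100), 2)
norm_p1 = lp_norm(average_of_unit_vectors(100), 1)
```

Floating-point arithmetic is used here, so the comparison with $n^{1/p - 1}$ holds only up to rounding error.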

Example 5.48 Let X = Lp(0, 1), where 0

$f(tx + (1-t)y) \le t f(x) + (1-t) f(y)$, for all $\{x, y\} \subseteq X$ and $t \in [0, 1]$.

Our problem is as follows: Suppose K is a closed bounded convex set in a Banach space X and let $f : X \to \mathbb{R}$ be a continuous convex function. Does there exist some $x_0 \in K$ such that $f(x_0) = \min\{f(x) : x \in K\}$? There is no reason we should assume this minimum exists, as there is no compactness assumption made on K.

Suppose for a moment that X is reflexive. Then $B_X$ is weakly compact, by Theorem 5.41. Since K is closed in norm, K is weakly closed, by Mazur's Theorem. It follows that K is weakly compact. (Here, we use the fact that $B_X$ is weakly compact and absorbent.) We now have continuity and compactness, but not in the same topology: f is continuous in the norm topology, and K is compact in the weak topology. Despite this, we claim that if X is a reflexive Banach space and if f is a convex function, then f does attain its minimum value on K.

Let $\alpha = \inf\{f(x) : x \in K\}$. Our goal is to show that $\alpha > -\infty$ and that $f(x_0) = \alpha$ for some $x_0 \in K$.

Suppose $\alpha = -\infty$ and, for each $n \in \mathbb{N}$, define $K_n = \{x \in K : f(x) \le -n\}$. For each $n \in \mathbb{N}$, the set $K_n$ is closed (and hence weakly closed by Mazur's Theorem), convex, and nonempty (since $\alpha = -\infty$). Therefore, $(K_n)_{n=1}^\infty$ forms a nested sequence of weakly compact nonempty sets. By the Nested Interval Property (Corollary B.7), it must be that $\bigcap_{n=1}^\infty K_n \ne \emptyset$. But this implies that there is some $x_0 \in K$ such that $f(x_0) \le -n$ for all $n \in \mathbb{N}$, an impossibility. Consequently, $\alpha > -\infty$.

Define for $n \in \mathbb{N}$ a sequence of sets $K_n = \{x \in K : f(x) \le \alpha + 1/n\}$. As before, $(K_n)_{n=1}^\infty$ is a nested sequence of weakly compact nonempty sets, and so $\bigcap_{n=1}^\infty K_n \ne \emptyset$. If $x_0 \in \bigcap_{n=1}^\infty K_n$, then $x_0 \in K$ and $f(x_0) = \alpha$, as required. We summarize in the following proposition.

Fig. 5.2 Some elementary convex objects

Proposition 5.50 Suppose K is a nonempty closed bounded convex set in a Banach space X and let $f : X \to \mathbb{R}$ be a continuous convex function. If X is reflexive, then there exists an $x_0 \in K$ such that $f(x_0) = \min\{f(x) : x \in K\}$.

Proof See the discussion preceding the statement of the proposition. □

A special case of the above is $f(x) = \|u - x\|$, the function representing the distance between $x \in X$ and a fixed point $u \in X$. If X is a reflexive Banach space, and if K is a closed and bounded convex set in X (not containing u), then there exists an $x_0 \in K$ such that
\[ \|u - x_0\| = \min_{x \in K} \|u - x\|. \]

That is, there exists some point x0 ∈ K which is closest to u. (Actually, the boundedness assumption on K is not needed for this statement to be true.)
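For a concrete instance of the nearest-point statement, take X to be the plane with the Euclidean norm (a reflexive space) and K the closed ball about the origin. The sketch below is an illustration under those assumptions only; the function names `nearest_point_in_ball` and `distance` are hypothetical, and the closed-form answer relies on the Euclidean structure rather than on the general argument above.

```python
import math

def distance(a, b):
    """Euclidean distance between two points given as lists."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(a, b)))

def nearest_point_in_ball(u, radius=1.0):
    """Return the point of the closed ball {x : ||x|| <= radius} closest to u."""
    norm_u = math.sqrt(sum(t * t for t in u))
    if norm_u <= radius:
        return list(u)  # u already lies in the ball
    # Otherwise rescale u onto the boundary sphere.
    return [radius * t / norm_u for t in u]

u = [3.0, 4.0]
x0 = nearest_point_in_ball(u)  # the point of the unit ball closest to u
```

Here the minimizer is unique because the Euclidean norm is strictly convex; Proposition 5.50 itself only asserts existence.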

5.7 Extreme Points

In this section, we consider sets K that are convex in some vector space X. Definition 5.51 Let X be a vector space and suppose K is a convex subset of X. A point x ∈ K is an extreme point of K if it does not lie on a line segment in K. That is, x is an extreme point of K provided that the following is true: If u and v are elements of K such that x = (1 − t)u + tv for some t ∈ (0, 1), then x = u = v. The set of extreme points of K is denoted ex(K). For example, a triangle has an extreme point at each vertex, while any boundary point of a circle is an extreme point. (See Fig. 5.2.)

Example 5.52 We now determine the extreme points for the unit ball $B_X$ in several cases where X is a real Banach space. Note that no point of the interior of $B_X$ can be extreme, and so we must consider only points on the boundary $\partial B_X$.

(i) $X = \ell_2$. Denote the inner product on $\ell_2$ by $\langle \cdot, \cdot \rangle$. Suppose $x \in \ell_2$ is such that $\|x\| = 1$. Now let $\{u, v\} \subseteq B_{\ell_2}$ and suppose $x = (1 - t)u + tv$ for some $t \in (0, 1)$. By the triangle inequality, $\|u\| = \|v\| = 1$ (otherwise $\|x\| < 1$). Since $\|x\| = 1$, we have

\[ 1 = \langle x, x \rangle = (1 - t)\langle u, x \rangle + t \langle v, x \rangle. \tag{5.8} \]

By assumption, 0

is on the boundary ∂B2 . (ii) X = Lp(0, 1), 1

\[ \phi(k) = \int_0^1 |f(x)|^{p-1} (\operatorname{sign} f(x)) \, k(x) \, dx, \quad k \in L_p(0, 1). \]
Since $\phi(f) = \|f\|_p^p$, we have $1 = \phi(f) = (1 - t)\phi(g) + t\phi(h)$. Again using the triangle inequality, we have that $\phi(g) = \phi(h) = 1$. By Hölder's Inequality,

\[ |\phi(g)| \le \int_0^1 |f(x)|^{p-1} |g(x)| \, dx \le \Big( \int_0^1 |f(x)|^{(p-1)q} \, dx \Big)^{1/q} \|g\|_p = \|f\|_p^{p/q} \, \|g\|_p, \]

where $\frac{1}{p} + \frac{1}{q} = 1$ (and so $q = \frac{p}{p-1}$). Since the left and right sides of the above inequality are both equal to 1, we have equality in Hölder's Inequality. This happens only if there are positive constants a and b such that $a(|f|^{p-1})^q = b|g|^p$ (as members of $L_p(0, 1)$). Because $q = \frac{p}{p-1}$, this equality (which is valid almost everywhere) becomes $a|f|^p = b|g|^p$, which is equivalent to $a^{1/p}|f| = b^{1/p}|g|$. Since $\|f\|_p = \|g\|_p$, this can only happen if $|f| = |g|$ in $L_p(0, 1)$. A similar argument shows $|f| = |h|$ in $L_p(0, 1)$. From these equalities, together with the assumption that f is a convex combination of g and h, we conclude that $f = g$ and $f = h$. It follows that f is an extreme point of $B_{L_p(0,1)}$. The choice of f was arbitrary in $\partial B_{L_p(0,1)}$, and therefore any f on the

boundary of the unit ball is an extreme point of the unit ball BLp(0,1). A similar argument shows that the extreme points of the unit ball in p, ∞ where 1

\[ F(t) = \int_0^t |f(s)| \, ds, \quad t \in [0, 1]. \]
The function F is continuous with $F(0) = 0$ and $F(1) = 1$. By the Intermediate Value Theorem, there exists a $\tau \in (0, 1)$ such that $F(\tau) = 1/2$.

Let $g = 2f\chi_{(0,\tau)}$ and $h = 2f\chi_{(\tau,1)}$. By the choice of $\tau$, we have $\|g\|_1 = 1$ and $\|h\|_1 = 1$. We have found distinct functions g and h in $B_{L_1(0,1)}$ such that $f = \frac{1}{2}g + \frac{1}{2}h$. Therefore, f is not an extreme point of the unit ball of $L_1(0, 1)$. Since f was an arbitrary element of $\partial B_{L_1(0,1)}$, we conclude that the unit ball in $L_1(0, 1)$ has no extreme points.

(iv) $X = c_0$. We will show there are no extreme points in $B_{c_0}$. Suppose $x = (x_k)_{k=1}^\infty$ is a sequence in $B_{c_0}$. Then $\lim_{k \to \infty} x_k = 0$, and so there exists some $n \in \mathbb{N}$ such that $|x_n| < 1/2$. We will define sequences $y = (y_k)_{k=1}^\infty$ and $z = (z_k)_{k=1}^\infty$ in $c_0$ so that $x = \frac{1}{2}y + \frac{1}{2}z$. If $x_n \ne 0$, then define y and z as follows:

\[ y_k = \begin{cases} x_k & \text{if } k \ne n, \\ 0 & \text{if } k = n, \end{cases} \quad \text{and} \quad z_k = \begin{cases} x_k & \text{if } k \ne n, \\ 2x_n & \text{if } k = n. \end{cases} \]

We have $x = \frac{1}{2}y + \frac{1}{2}z$ and, because $|x_n| < 1/2$, the sequences y and z are in $B_{c_0}$. The previous sequences work only if $x_n \ne 0$. If instead $x_n = 0$, then define y and z so that:

\[ y_k = \begin{cases} x_k & \text{if } k \ne n, \\ \frac{1}{2} & \text{if } k = n, \end{cases} \quad \text{and} \quad z_k = \begin{cases} x_k & \text{if } k \ne n, \\ -\frac{1}{2} & \text{if } k = n. \end{cases} \]

Once again, we have $x = \frac{1}{2}y + \frac{1}{2}z$, where the sequences y and z are in $B_{c_0}$. Therefore, x is not an extreme point of the set $B_{c_0}$.

(v) $X = C[0, 1]$. In this case the extreme points of $B_{C[0,1]}$ are the two functions $\chi_{[0,1]}$ and $-\chi_{[0,1]}$. (That is, the constant functions 1 and −1.) If $f \in B_{C[0,1]}$ is a continuous function such that $|f(t)| < 1$ for some $t \in (0, 1)$, then we may use an argument similar to the perturbation argument used in (iv). (That is, we can put a small "wiggle" in the function.) Similarly, if K is a compact Hausdorff space, then the extreme points of $B_{C(K)}$ are the two functions $\chi_K$ and $-\chi_K$, which are the constant functions 1 and −1 on K. In the case that C(K) is a complex Banach space, the extreme points of $B_{C(K)}$ are all functions $f \in C(K)$ for which $|f(s)| = 1$ for all $s \in K$.

(vi) $X = \ell_1$. The set of extreme points in $B_{\ell_1}$ is $\{\pm e_n : n \in \mathbb{N}\}$. First, let us show that each of these points is indeed extreme in $B_{\ell_1}$. Fix some $n \in \mathbb{N}$. Let $y = (y_k)_{k=1}^\infty$ and $z = (z_k)_{k=1}^\infty$ be elements of $B_{\ell_1}$ such that $e_n = ay + bz$, where a and b are positive numbers such that $a + b = 1$. (Note that these conditions imply $y_n > 0$ and $z_n > 0$.) We have that $ay_n + bz_n = 1$ and $ay_k + bz_k = 0$ for all $k \ne n$. By assumption, we know that $a \ne 0$, and so $y_n = \frac{1}{a} - \frac{b}{a} z_n$ and $y_k = -\frac{b}{a} z_k$ for $k \ne n$. Once again making use of the triangle inequality, we see that $\|y\|_1 = 1$ and $\|z\|_1 = 1$. Computing $\|y\|_1$ directly, we have
\[ \sum_{k \ne n} |y_k| + y_n = \frac{b}{a} \sum_{k \ne n} |z_k| + \Big( \frac{1}{a} - \frac{b}{a} z_n \Big) = \frac{b}{a} \sum_{k=1}^\infty |z_k| + \Big( \frac{1}{a} - \frac{2b}{a} z_n \Big). \]

Thus,
\[ \|y\|_1 = \frac{b}{a} \|z\|_1 + \Big( \frac{1}{a} - \frac{2b}{a} z_n \Big). \]

But $\|y\|_1 = 1$ and $\|z\|_1 = 1$, and so
\[ 1 = \frac{b}{a}(1) + \Big( \frac{1}{a} - \frac{2b}{a} z_n \Big) = \frac{b + 1 - 2b z_n}{a}. \]

A little arithmetic (and the fact that a + b = 1) reveals that zn = 1, and so z must in fact be en. Therefore, en is an extreme point. A similar argument shows that −en is an extreme point for each n ∈ N.

Now we show that no other element of $\partial B_{\ell_1}$ is an extreme point. Suppose $x \in B_{\ell_1}$ with $\|x\|_1 = 1$, but $x \ne \pm e_n$ for any $n \in \mathbb{N}$. Then there must be

at least two non-zero entries, say xm1 and xm2 . Without loss of generality, we may assume both terms are positive. Choose some constant >0 such that { − − } = ∞ = ∞ 

Let M denote the maximum value of φ on F0, so that φ(x) = M for all x ∈ G0. The set G0 is nonempty (by the continuity of φ), compact, and convex. It is also a proper subset of F0, because φ is not constant.

We claim that G0 is extremal in K. Suppose {u, v} ⊆ K and (1 − t)u + tv ∈ G0 for all t ∈ (0, 1). We know that G0 ⊆ F0, and we know that F0 is extremal; hence {u, v} ⊆ F0. For each t ∈ (0, 1), we have that (1 − t)u + tv ∈ G0, and so M = φ((1 − t)u + tv) = (1 − t)φ(u) + tφ(v), for all t ∈ (0, 1). From this we conclude that φ(u) = φ(v) = M, and consequently {u, v} ⊆ G0. This implies G0 is extremal, but this violates the minimality of F0. We have derived a contradiction, and so it must be the case that F0 contains only one element. The set F0 is a single-point set and an extremal set. Therefore, the one element of F0 is an extreme point.

We are now prepared to prove the Krein–Milman Theorem.

Proof of Theorem 5.53 Without loss of generality, we may assume E is a real topological vector space. The set K is extremal in itself, and so must contain an extreme point, by Lemma 5.56. Let $K_0 = \overline{\mathrm{co}}(\mathrm{ex}\,K)$, the closed convex hull of the set of extreme points in K. Suppose $x \in K \setminus K_0$. By the Hahn–Banach Separation Theorem (Theorem 5.20), there exists a continuous linear functional f on E such that

\[ f(x) > \max_{y \in K_0} f(y). \tag{5.9} \]

Let
\[ G_0 = \Big\{ z \in K : f(z) = \max_{y \in K} f(y) \Big\}. \]

The set G0 is nonempty because f is continuous and K is compact; it is also disjoint from K0 because of (5.9). The set G0 is extremal (see the argument in the proof of Lemma 5.56), and consequently contains an extreme point of K, by Lemma 5.56. This, however, is a contradiction, because K0 contains all of the extreme points of K, and K0 and G0 are disjoint. Therefore, K = K0, as required.

The Krein–Milman Theorem originally appeared in a work by Krein and Milman in 1940 [21]. The local convexity assumption on E was needed to invoke the Hahn–Banach Separation Theorem (Theorem 5.20). Local convexity is a necessary condition, a fact which was not shown until the 1970s [32]. The Krein–Milman Theorem has a deep relationship with the Axiom of Choice. (See [4].)

5.8 Milman’s Theorem

Suppose K is a compact Hausdorff space. The Riesz Representation Theorem identifies the dual space of the space of continuous functions on K as the space of regular Borel measures on K; that is, $C(K)^* = M(K)$. (See Theorem A.35.) We recall that the norm on M(K) is the total variation norm: $\|\mu\|_M = |\mu|(K)$ for all $\mu \in M(K)$. We define the probability measures on K to be elements in the set

\[ P(K) = \{ \mu \in M(K) : \mu \ge 0,\ \|\mu\|_M = 1 \}. \]

This set is convex. It is also closed in the $w^*$-topology, which can be seen from the equality $P(K) = \{ \mu \in B_{M(K)} : \int_K 1 \, d\mu = 1 \}$. (See Exercise 5.27.) By the Banach–Alaoglu Theorem (Theorem 5.39), the unit ball $B_{M(K)}$ is compact in the $w^*$-topology, and hence P(K) is $w^*$-compact as a $w^*$-closed subset. A simple computation shows that P(K) is an extremal set in the unit ball of M(K). (Again, see Exercise 5.27.) Since P(K) is a $w^*$-compact convex extremal set, Lemma 5.56 assures us that P(K) must have at least one extreme point.

Proposition 5.57 Let K be a compact Hausdorff space. A probability measure in M(K) is an extreme point of P(K) if and only if it is a Dirac measure.

Proof We first show that $\delta_s$ is an extreme point of P(K) for $s \in K$. Suppose there exist probability measures μ and ν such that $\delta_s = (1 - t)\mu + t\nu$ for some $t \in (0, 1)$. It follows that $\mu(\{s\}) = \nu(\{s\}) = 1$. Thus, $\mu = \nu = \delta_s$, as required.

Now suppose μ is an extreme point of P(K). We will show that $\mu = \delta_s$ for some $s \in K$. Let $\mathcal{U} = \{U \text{ open} : \mu(U) = 0\}$ and let $V = \bigcup_{U \in \mathcal{U}} U$. We claim $\mu(V) = 0$. Suppose E is a compact subset of V. The collection of sets $\mathcal{U}$ forms an open cover of E, and so by compactness there exists a finite subcover, say $E \subseteq U_1 \cup \cdots \cup U_n$. The measure μ is nonnegative, and so (by subadditivity)

μ(E) ≤ μ(U1) +···+μ(Un) = 0.

Therefore, μ(E) = 0 for all compact subsets of V . By the regularity of μ,

μ(V ) = sup{μ(E):E is a compact subset of V }=0.

Let F = K\V . Then μ(F ) = 1. We wish to show that F contains only one point. Assume to the contrary that F contains more than one point. Let {s, t}⊆F . By the Hausdorff property, there are open sets W1 and W2 in K such that s ∈ W1, t ∈ W2, and W1 ∩ W2 =∅. Since W1 and W2 are not subsets of V , they have non-zero μ-measure. Define measures μ1 and μ2 on K as follows:

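\[ \mu_1(B) = \frac{\mu(W_1 \cap B)}{\mu(W_1)} \quad \text{and} \quad \mu_2(B) = \frac{\mu((K \setminus W_1) \cap B)}{\mu(K \setminus W_1)}, \]
where B is any measurable subset of K. We note that $\mu(K \setminus W_1) \ne 0$ since $\mu \ge 0$ and $W_2 \subseteq K \setminus W_1$. Both $\mu_1$ and $\mu_2$ are probability measures, and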

μ(W1) μ1 + μ(K\W1) μ2 = μ.

By assumption, μ is an extreme point of P(K). Thus, since μ(W1) + μ(K\W1) = 1, it follows that μ = μ1 = μ2. This is not possible, however, since μ1(W1) = 1 and μ2(W1) = 0. We have derived a contradiction. Therefore, there can be no more than one point in the set F, say s. Since μ(F) = 1, it follows that μ = δs, as required.

In the above proof, the set V is the maximal open set of μ-measure zero. The entire μ-mass of K is contained in K\V. This motivates the next definition.

Definition 5.58 Suppose μ is a positive nonzero measure on K. If V is the maximal open set of μ-measure zero, then K\V is called the support of μ. If μ is a signed (or complex) measure, the support of μ is defined to be the support of |μ|.
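The splitting used in the proof of Proposition 5.57 is easy to see on a finite discrete space, where every subset is open. The sketch below is only an illustration: the three-point space, the weights, and the helper name `restrict_and_normalize` are hypothetical choices, and measures are stored as dictionaries of exact fractions.

```python
from fractions import Fraction

def restrict_and_normalize(mu, subset):
    """Return (mu(subset), B -> mu(subset ∩ B) / mu(subset)) as point weights."""
    mass = sum(mu[p] for p in subset)
    normalized = {p: (mu[p] / mass if p in subset else Fraction(0)) for p in mu}
    return mass, normalized

# A probability measure whose support contains more than one point.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
W1 = {"a"}
complement = set(mu) - W1

m1, mu1 = restrict_and_normalize(mu, W1)          # plays the role of mu_1
m2, mu2 = restrict_and_normalize(mu, complement)  # plays the role of mu_2

# mu is recovered as the convex combination m1 * mu1 + m2 * mu2.
recombined = {p: m1 * mu1[p] + m2 * mu2[p] for p in mu}
```

Since `mu1` and `mu2` differ, this `mu` is a nontrivial convex combination of two probability measures and hence is not an extreme point.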

In the proof of Proposition 5.57, observe that the measures $\mu_1$ and $\mu_2$ were defined in such a way that they had disjoint supports. As a result, it was certainly the case that $\mu_1 \ne \mu_2$. (They "live" on different sets, so to speak.)

Theorem 5.59 (Milman's Theorem) Suppose E is a locally convex Hausdorff topological vector space and let K be a compact subset of E. If D is a closed subset of K such that $K = \overline{\mathrm{co}}(D)$, then $\mathrm{ex}(K) \subseteq D$. Furthermore, for every $x \in K$, there exists a $\mu_x \in P(D)$ such that $f(x) = \int_D f(y) \, \mu_x(dy)$ for all linear functionals $f \in E^*$.

Proof Observe that $E \subseteq \mathbb{K}^{E^*}$, the collection of all scalar-valued functions on $E^*$, where the superspace is equipped with the product topology. The inclusion is made explicit by the embedding $\rho : E \to \mathbb{K}^{E^*}$ defined by
\[ \rho(e) = (f(e))_{f \in E^*}, \quad e \in E. \]

(We often make this identification implicitly, suppressing the letter ρ.)

Introduce a map $T : M(D) \to \mathbb{K}^{E^*}$, defined by
\[ T(\mu) = \Big( \int_D f(y) \, \mu(dy) \Big)_{f \in E^*}, \quad \mu \in M(D). \]
Observe that T is continuous in the $w^*$-topology on M(D). For each $s \in D$, we have
\[ T(\delta_s) = (f(s))_{f \in E^*} = s. \tag{5.10} \]

By the $w^*$-continuity of T, then, it follows that T maps $\overline{\mathrm{co}}^{(w^*)}(\{\delta_s\}_{s \in D})$ onto $\overline{\mathrm{co}}^{(w)}(D)$. We note that we have the weak closure of co(D) in E because the topology E inherits from $\mathbb{K}^{E^*}$ is the weak topology. (See the comments after Definition 5.27.)

By Theorem 5.53 (the Krein–Milman Theorem) and Proposition 5.57, we make the identification $\overline{\mathrm{co}}^{(w^*)}(\{\delta_s\}_{s \in D}) = P(D)$. By assumption, $K = \overline{\mathrm{co}}(D)$, and so by Mazur's Theorem (Theorem 5.45), we have that $K = \overline{\mathrm{co}}^{(w)}(D)$. Therefore, the restriction $T|_{P(D)} : P(D) \to K$ is a surjection. This proves the second part of the theorem.

It remains to prove the first part of the theorem; that is, that the extreme points of K are in D. Suppose $x \in \mathrm{ex}(K)$. We claim that the set $T^{-1}(x) \subseteq P(D)$ is an extremal set. Suppose $\{\mu, \nu\} \subseteq P(D)$ and $(1 - t)\mu + t\nu \in T^{-1}(x)$ for all $t \in (0, 1)$. It follows that $(1 - t)T(\mu) + tT(\nu) = x$ for all $t \in (0, 1)$. By assumption, x is an extreme point in K, and consequently $T\mu = T\nu = x$. Therefore, $\{\mu, \nu\} \subseteq T^{-1}(x)$, and so $T^{-1}(x)$ is extremal.

By Lemma 5.56, $T^{-1}(x)$ contains an extreme point of P(D). Therefore, there exists some $s \in D$ such that $\delta_s \in T^{-1}(x)$, and consequently $T(\delta_s) = x$. By (5.10), however, $T(\delta_s) = s$, and so $x = s \in D$. The result follows. □

5.9 Haar Measure on Compact Groups

We now turn our attention to topological groups. We saw in Sect. 3.4 that if G is a compact abelian metrizable group, then there exists a unique translation-invariant probability measure on G. We noted at the time that the metrizability assumption was not needed. Now, using the tools of the previous sections, we will extend this result to include compact topological groups that are not abelian. Let us review some definitions.

Definition 5.60 A group G is called a topological group if the set G is endowed with a topology for which the group operations (multiplication and inversion)
\[ (s, t) \mapsto s \cdot t \quad \text{and} \quad s \mapsto s^{-1}, \quad (s, t) \in G \times G, \]
are continuous. If G is compact in the given topology, then G is called a compact group.

Classical examples of topological groups include $\mathbb{R}^n$ (where the group multiplication is given by addition) and the set of orthogonal n × n matrices $O_n$ (where the group multiplication is matrix multiplication). Multiplication in a group is usually denoted either with a dot (·) or by juxtaposition. When the group is abelian, however, it is traditional to use a plus symbol (+), provided it will not result in any confusion.

When G is a compact group, we denote the space of continuous functions on G by C(G). The σ-algebra on G is implicitly taken to be the Borel σ-algebra generated by the open sets in G. We denote the Borel σ-algebra on G by $\mathcal{B}$.

Definition 5.61 Let G be a compact group with Borel σ-algebra $\mathcal{B}$. A measure μ on G is called left-invariant if $\mu(gB) = \mu(B)$ for all $B \in \mathcal{B}$ and $g \in G$. Correspondingly, the measure μ is called right-invariant if $\mu(Bg) = \mu(B)$ for all $B \in \mathcal{B}$ and $g \in G$.

Theorem 5.62 (Existence of Haar Measure) Suppose G is a compact group. There exists a unique left-invariant probability measure on the Borel sets of G. Furthermore, this measure is also the unique right-invariant probability measure on the Borel sets of G.
Proof First, we assume we can find a left-invariant probability measure λ and a right-invariant probability measure μ. We will show that λ = μ. (Notice that this will imply uniqueness.) Let $f \in C(G)$. By Fubini's Theorem,
\[ \int_G \int_G f(s \cdot t) \, \lambda(dt) \, \mu(ds) = \int_G \int_G f(s \cdot t) \, \mu(ds) \, \lambda(dt). \tag{5.11} \]
By the left-invariance of λ,
\[ \int_G \int_G f(s \cdot t) \, \lambda(dt) \, \mu(ds) = \int_G \int_G f(t) \, \lambda(dt) \, \mu(ds) = \int_G f(t) \, \lambda(dt), \]
since $\mu(G) = 1$. Similarly, by the right-invariance of μ,
\[ \int_G \int_G f(s \cdot t) \, \mu(ds) \, \lambda(dt) = \int_G \int_G f(s) \, \mu(ds) \, \lambda(dt) = \int_G f(s) \, \mu(ds), \]
since $\lambda(G) = 1$. Substituting into (5.11), we obtain

\[ \int_G f(t) \, \lambda(dt) = \int_G f(s) \, \mu(ds). \]
The choice of $f \in C(G)$ was arbitrary, and so by duality we conclude λ = μ.

It remains to prove the existence of a left-invariant measure on G. (A similar argument will produce a right-invariant measure.) We begin by defining for each $s \in G$ a left-multiplication operator $L_s : C(G) \to C(G)$ by

\[ (L_s f)(t) = f(s \cdot t), \quad t \in G. \]

Observe that $\|L_s\| = 1$, $L_s^{-1} = L_{s^{-1}}$, and $L_u \circ L_s = L_{s \cdot u}$, whenever u and s are in G. (The last identity holds because $(L_u L_s f)(t) = (L_s f)(u \cdot t) = f(s \cdot u \cdot t)$.)

Claim 1 Let $f \in C(G)$. The map $s \mapsto L_s f$ is continuous from G into C(G).

We wish to estimate, for $s$ and $s'$ in G, the quantity

\[ \|L_s f - L_{s'} f\|_\infty = \sup_{t \in G} |f(s \cdot t) - f(s' \cdot t)|. \]
Multiplication in the group is continuous, and so the map $(s, t) \mapsto f(s \cdot t)$ is continuous on $G \times G$. Since the group G is compact, we conclude the map $(s, t) \mapsto f(s \cdot t)$ is in fact uniformly continuous. Therefore, for any given $\epsilon > 0$, there exists an open neighborhood V of the identity such that $|f(s \cdot t) - f(s' \cdot t')| < \epsilon$ whenever $s^{-1} \cdot s' \in V$ and $t^{-1} \cdot t' \in V$. In this case, we have $t' = t$, and so if $s^{-1} \cdot s' \in V$, then

\[ \|L_s f - L_{s'} f\|_\infty = \sup_{t \in G} |f(s \cdot t) - f(s' \cdot t)| < \epsilon. \]
This proves Claim 1.

Claim 2 Let $\mu \in C(G)^*$. The map $s \mapsto L_s^* \mu$ is $w^*$-continuous from G into M(G).

Observe that $L_s^* : M(G) \to M(G)$. Then, for $f \in C(G)$, by the definition of the adjoint, $\int f \, dL_s^* \mu = \int L_s f \, d\mu$. If $s$ and $s'$ are in G, then
\[ \Big| \int_G f \, dL_s^* \mu - \int_G f \, dL_{s'}^* \mu \Big| = \Big| \int_G L_s f \, d\mu - \int_G L_{s'} f \, d\mu \Big| \le \|L_s f - L_{s'} f\|_\infty \, \|\mu\|_M. \]
The rest follows from Claim 1.

Claim 3 If $s \in G$, then $L_s^*(P(G)) \subseteq P(G)$.

If $f \ge 0$, then $\int f \, dL_s^* \mu = \int L_s f \, d\mu \ge 0$, whenever $\mu \ge 0$. Thus, $L_s^* \mu \ge 0$ for any $\mu \in P(G)$. Furthermore,

\[ L_s^* \mu(G) = \int_G 1 \, L_s^* \mu(dt) = \int_G L_s(1) \, \mu(dt) = \mu(G) = 1. \]
Thus, $L_s^* \mu$ is in P(G) whenever μ is a probability measure. This proves Claim 3.

The gist of Claim 3 is that the set P(G) is invariant under multiplication on the left; i.e., P(G) is left-invariant. We wish to find a set with this property that contains only one element. To that end, let $\mathcal{K}$ be the collection of all weak$^*$-compact convex subsets of P(G) that are left-invariant; that is, all weak$^*$-compact convex subsets K such that $L_s^* K \subseteq K$ for all $s \in G$. Define a partial order $\le$ on $\mathcal{K}$ so that $A \le B$ when $A \subseteq B$. We know that $\mathcal{K}$ is nonempty, because $P(G) \in \mathcal{K}$. If $(C_i)_{i \in I}$ is a chain in $\mathcal{K}$, then $C = \bigcap_{i \in I} C_i$ is nonempty, by the Finite Intersection Property. Furthermore, C is a lower bound for the chain $(C_i)_{i \in I}$. Therefore, by Zorn's Lemma, there exists a minimal element of $\mathcal{K}$, say K.

We wish to show that K is a single-point set. Assume to the contrary that $\mu_1$ and $\mu_2$ are distinct elements in K. Let $\nu = \frac{1}{2}(\mu_1 + \mu_2)$. Then $\nu \in K$, by convexity. Define a new set $E = \{L_s^* \nu : s \in G\}$. By Claim 2, the set E is weak$^*$-compact (as the image of the compact set G under a weak$^*$-continuous mapping). Furthermore, $E \subseteq K$, by the left-invariance of K. For all u and s in G,
\[ L_u^*(L_s^* \nu) = (L_s \circ L_u)^* \nu = L_{u \cdot s}^* \nu \in E. \]
Thus $L_u^*(E) \subseteq E$, and so E is left-invariant.

Let $K_0 = \overline{\mathrm{co}}^{(w^*)}(E)$. The set $K_0$ is convex by construction. We also have that $K_0$ is weak$^*$-compact, because it is weak$^*$-closed in the weak$^*$-compact set K. By construction, $K_0$ is left-invariant, and so $K_0 \in \mathcal{K}$. But $K_0 \subseteq K$ and K is minimal in $\mathcal{K}$. Therefore, $K = K_0$.

By the Krein–Milman Theorem (Theorem 5.53), there is some extreme point in K; and by Milman's Theorem (Theorem 5.59), every extreme point of K is in E. Therefore, there exists some $s \in G$ such that $L_s^* \nu$ is extreme in K. Recalling the definition of ν, we see that
\[ L_s^* \nu = \frac{1}{2}(L_s^* \mu_1 + L_s^* \mu_2). \]
But $L_s^* \mu_1$ and $L_s^* \mu_2$ are in K, by left-invariance, and $L_s^* \nu$ is extreme in K. Therefore, $L_s^* \nu = L_s^* \mu_1 = L_s^* \mu_2$. If we multiply all sides of this equation by $s^{-1}$ on the left (that is, apply $L_{s^{-1}}^*$ to all sides), we discover that $\nu = \mu_1 = \mu_2$. This violates the assumption that $\mu_1$ and $\mu_2$ are distinct. Thus, K contains only one element, say λ. The measure λ is the desired left-invariant probability measure on G.

Definition 5.63 Let G be a compact group. The unique left-invariant probability measure on the Borel subsets of G is called Haar measure on G.
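For a finite group with the discrete topology, Haar measure is simply the uniform probability measure, and the invariance can be checked by hand. The sketch below does this for the symmetric group on three letters. It is an illustration only: it does not follow the Zorn's Lemma argument of the proof, and the names `compose`, `push_left`, `push_right`, and `average_of_left_translates` are hypothetical helpers introduced here.

```python
from fractions import Fraction
from itertools import permutations

GROUP = list(permutations(range(3)))  # the six elements of S_3 as tuples

def compose(s, t):
    """Return the product s·t, meaning: apply t first, then s."""
    return tuple(s[t[i]] for i in range(3))

def push_left(mu, s):
    """Image of mu under t -> s·t; this plays the role of L_s^* mu."""
    return {compose(s, t): mass for t, mass in mu.items()}

def push_right(mu, s):
    """Image of mu under t -> t·s."""
    return {compose(t, s): mass for t, mass in mu.items()}

uniform = {g: Fraction(1, 6) for g in GROUP}
identity = (0, 1, 2)
point_mass = {g: Fraction(int(g == identity)) for g in GROUP}

def average_of_left_translates(mu):
    """Average the left translates of mu over the whole group."""
    total = {g: Fraction(0) for g in GROUP}
    for s in GROUP:
        for g, mass in push_left(mu, s).items():
            total[g] += mass / len(GROUP)
    return total
```

In this finite setting, averaging the left translates of any probability measure already produces the uniform measure, which lies in the convex hull of those translates; the compact case requires the minimality argument given in the proof.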

5.10 The Banach–Stone Theorem

In this section, we prove a classical theorem about the structure of spaces of continuous functions. We recall that two Banach spaces X and Y are called isometrically isomorphic if there exists a continuous linear bijection that preserves norms. That is, if there exists some linear bijection $T : X \to Y$ such that $\|T\| = \|T^{-1}\| = 1$.

Theorem 5.64 (Banach–Stone Theorem) Suppose $K_1$ and $K_2$ are compact Hausdorff spaces. If $C(K_1)$ and $C(K_2)$ are isometrically isomorphic, then $K_1$ and $K_2$ are homeomorphic. Furthermore, if $T : C(K_1) \to C(K_2)$ is an isometric isomorphism, then there exists some $u \in C(K_2)$ such that $|u(s)| = 1$ for all $s \in K_2$, and such that

\[ Tf(s) = u(s) \, f(\phi(s)), \quad s \in K_2, \]
where $\phi : K_2 \to K_1$ is a homeomorphism.

Before proving the Banach–Stone Theorem, we will provide a simple lemma that will not only help us now, but will come in handy later, too. In order to prove this lemma, however, we need to make use of another result from general topology.

Theorem 5.65 (Urysohn's Lemma) A topological space X is normal if and only if any two disjoint closed subsets A and B can be separated by a continuous function; that is, there exists a continuous function $f : X \to [0, 1]$ such that $f|_A = 0$ and $f|_B = 1$.

We recall that a topological space X is normal if for disjoint closed sets E and F, there exist disjoint open sets U and V such that $E \subseteq U$ and $F \subseteq V$. We will not prove Urysohn's Lemma; however, we will observe that, as a consequence, if K is a compact Hausdorff space, then C(K) separates the points of K. That is, if a and b are distinct points in K, then there is a function $f \in C(K)$ such that $f(a) = 0$ and $f(b) = 1$. (See Exercise 5.7.)

Lemma 5.66 Let K be a compact Hausdorff space. If $\Delta = \{\delta_s : s \in K\}$, then K is homeomorphic to Δ with the subspace topology inherited from $(M(K), w^*)$.

Proof We remind the reader that for each $s \in K$, the Dirac measure at s is a measure $\delta_s$ defined so that $\int_K f \, d\delta_s = f(s)$ for all $f \in C(K)$. The set Δ is closed in the $w^*$ topology and $\Delta \subseteq B_{M(K)}$. Therefore, Δ is $w^*$-compact.

Define a map $\psi : K \to \Delta$ by $\psi(s) = \delta_s$ for every $s \in K$. Clearly, ψ is a surjection. Suppose that s and t are two distinct points in K such that $\psi(s) = \psi(t)$. Then $\delta_s = \delta_t$. This means that $\delta_s(f) = \delta_t(f)$ for all $f \in C(K)$. Thus, $f(s) = f(t)$ for all $f \in C(K)$. This contradicts the fact that C(K) separates the points of K. (See the comments before the statement of Lemma 5.66.) Therefore, $\psi(s) = \psi(t)$ only if $s = t$, and so ψ is an injection as well as a surjection.

We next show that ψ is a homeomorphism by showing that it is a continuous closed map (so that it maps closed sets to closed sets). Certainly, ψ is continuous in the $w^*$ topology on Δ, since for every $f \in C(K)$,

\[ \psi(s)(f) = \int_K f \, d\delta_s = f(s), \quad s \in K, \]
and because f is continuous (by assumption). Now let F be a closed set in K. Then F is compact, because it is a closed subset of the compact set K. Since ψ is continuous for Δ with the $w^*$ topology, it follows that $\psi(F)$ is $w^*$-compact in Δ. Since $\psi(F)$ is a compact set in a Hausdorff topology, it must be closed (in that topology). Therefore, ψ is a closed map, and it follows that ψ is a homeomorphism. (See Exercise 5.4.)

We are now prepared to prove the Banach–Stone Theorem.

Proof of Theorem 5.64 Let $K_1$ and $K_2$ be compact Hausdorff spaces and suppose $T : C(K_1) \to C(K_2)$ is an isometric isomorphism. We wish to show that $K_1$ and $K_2$ are homeomorphic. Observe that T maps extreme points of $B_{C(K_1)}$ to extreme points of $B_{C(K_2)}$. To see this, let f be an extreme point in $B_{C(K_1)}$ and suppose that g and h are functions in $B_{C(K_2)}$ such that $Tf = \frac{1}{2}(g + h)$. Then $f = \frac{1}{2}(T^{-1}g + T^{-1}h)$, and from this we deduce that $T^{-1}g = T^{-1}h = f$ (because f is an extreme point). It follows that $g = h = Tf$, and so Tf is an extreme point in $B_{C(K_2)}$ whenever f is an extreme point in $B_{C(K_1)}$. In particular, since $\chi_{K_1}$ (which is identically equal to 1 on $K_1$) is extreme in $B_{C(K_1)}$, its image $T(\chi_{K_1})$ is extreme in $B_{C(K_2)}$. Consequently, we have that $|T(\chi_{K_1})(s)| = 1$ for all $s \in K_2$.

If we define an operator $S : C(K_1) \to C(K_2)$ by $Sf = (Tf)/T(\chi_{K_1})$ for all functions $f \in C(K_1)$, then S is an isometry such that $S(\chi_{K_1}) = \chi_{K_2}$. Therefore, we may assume without loss of generality that $T(\chi_{K_1}) = \chi_{K_2}$.

The adjoint $T^* : M(K_2) \to M(K_1)$ is also an isometry because $(T^*)^{-1} = (T^{-1})^*$, and so $T^*(B_{M(K_2)}) = B_{M(K_1)}$. Suppose $\mu \in P(K_2)$. Then

\[ (T^*\mu)(K_1) = \int_{K_1} 1 \, d(T^*\mu) = \int_{K_2} T(1) \, d\mu = \int_{K_2} 1 \, d\mu = \mu(K_2) = 1. \]
(Here we use the fact that $\chi_{K_1} = 1$ on $K_1$ and $\chi_{K_2} = 1$ on $K_2$.) The measure $T^*\mu$ is then an element of $B_{M(K_1)}$ such that $T^*\mu(K_1) = 1$. It can be shown that these two facts imply that $T^*\mu \in P(K_1)$. (See Exercise 5.28.)

As an isometry, $T^*$ will map extreme points to extreme points. By Proposition 5.57, the extreme points of $P(K_1)$ and $P(K_2)$ are the Dirac masses, and so for each $s \in K_2$, there must be some $t_s \in K_1$ (depending on s) such that $T^*\delta_s = \delta_{t_s}$. Define a map $\phi : K_2 \to K_1$ by $\phi(s) = t_s$ for each $s \in K_2$. Then,

∗ T δs = δφ(s), s ∈ K2.

∗ The map φ is a bijection from K2 onto K1, which follows from the fact that T is a bijection from M(K2) onto M(K1). We wish to show that φ is a homeomorphism, and as such we must show that both φ and φ−1 are continuous. Let ψ : K1 →{δt : t ∈ K1} and ϕ : K2 →{δs : s ∈ K2} be defined by ψ(t) = δt and ϕ(s) = δs for all t ∈ K1 and s ∈ K2. By Lemma 5.66, the maps ψ and ϕ are homeomorphisms. Observe that, for all s ∈ K2,

−1 −1 ∗ −1 ∗ φ(s) = ψ (δφ(s)) = ψ (T δs ) = (ψ ◦ T ◦ ϕ)(s).

∗ ∗ We assumed T was continuous, and hence T : M(K2) → M(K1) is weak -to- weak∗ continuous, by Proposition 5.44. Therefore, φ is continuous. Similarly, we can show that φ−1 = ϕ−1 ◦ (T −1)∗ ◦ ψ, and so φ−1 is continuous. Thus, φ is a homeomorphism. Finally, we have

∗ Tf(s) = (Tf) dδs = fd(T δs ) = fdδφ(s) = f (φ(s)). K2 K1 K1 The factor u appearing in the statement of the theorem does not appear now because of the normalization we made at the beginning of the proof. Had we not assumed 124 5 Consequences of Convexity

= = T (χK1 ) χK2 , then we would have u T (χK1 ), which is a function with the | |= ∈ property that T (χK1 )(s) 1 for all s K2.
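To make the conclusion $Tf = u \cdot (f \circ \phi)$ concrete, here is a minimal sketch in the simplest possible setting. It assumes (purely for illustration) that $K_1$ and $K_2$ are finite discrete spaces with $n$ points, so that $C(K)$ is $\mathbb{R}^n$ with the maximum norm and functions are lists of values. The names `make_T`, `recover`, and `sup_norm` are hypothetical helpers, not part of the text. The sketch recovers $u$ as $T(\chi_{K_1})$ and $\phi$ by testing $T$ on indicator functions, which is the finite analogue of reading off $T^*\delta_s$.

```python
def make_T(u, phi):
    """Return T with (Tf)(s) = u[s] * f[phi[s]]; f is a list of values on K1."""
    def T(f):
        return [u[s] * f[phi[s]] for s in range(len(phi))]
    return T

def recover(T, n):
    """Recover (u, phi) from T alone: u = T(1), and phi(s) is the unique
    point t for which T(indicator of {t}) is nonzero at s."""
    u = T([1.0] * n)
    phi = []
    for s in range(n):
        hits = [t for t in range(n)
                if T([1.0 if k == t else 0.0 for k in range(n)])[s] != 0.0]
        phi.append(hits[0])
    return u, phi

def sup_norm(f):
    """Maximum norm on C(K) for a finite space K."""
    return max(abs(x) for x in f)

# Hypothetical data: a unimodular function u and a bijection phi of {0, 1, 2}.
u = [1.0, -1.0, 1.0]
phi = [2, 0, 1]
T = make_T(u, phi)
f = [0.5, -2.0, 1.5]
Tf = T(f)
u_rec, phi_rec = recover(T, 3)
```

In this toy case `sup_norm(Tf)` equals `sup_norm(f)`, and `recover` returns the original `u` and `phi`. The general theorem of course needs the extreme-point argument above; the sketch only shows what the objects look like when every function is a finite list.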

Remark There are other ways to prove that $K_1$ and $K_2$ are homeomorphic using properties of $C(K_1)$ and $C(K_2)$. For example, it is possible to prove that $K_1$ and $K_2$ are homeomorphic using only the fact that $C(K_1)$ and $C(K_2)$ are isomorphic as rings, so that no norm structure is required in the proof. (See [15] for more.)

Exercises

Exercise 5.1 Let a and b be real numbers such that a

Exercise 5.2 Prove the following theorems:
(a) Every closed subset of a compact space is compact.
(b) The image of a compact space under a continuous map is compact.
(c) Every compact subset of a Hausdorff space is closed.
(d) Let $X$ be a compact space and $Y$ be a Hausdorff space. If $f : X \to Y$ is continuous, then $f$ is a closed map. (That is, $f(C)$ is closed in $Y$ whenever $C$ is closed in $X$.)

Exercise 5.3 Let $K$ be a compact topological space and suppose $\phi : K \to E$ is a continuous one-to-one map, where $E$ is a Hausdorff topological space. Show that $\phi$ is a homeomorphism onto its image $\phi(K)$. (Hint: See Exercise 5.2.)

Exercise 5.4 Let $X$ and $Y$ be topological spaces. If $\phi : X \to Y$ is a continuous closed bijection, show that $\phi$ is a homeomorphism.

Exercise 5.5 Let $X$ be a Hausdorff space. If $A$ and $B$ are disjoint compact subsets of $X$, then show there exist disjoint open sets $U$ and $V$ such that $A \subseteq U$ and $B \subseteq V$.

Exercise 5.6 Suppose $X$ is a compact Hausdorff space. Show that $X$ is a normal space. That is, if $E$ and $F$ are disjoint closed subsets of $X$, show there exist disjoint open sets $U$ and $V$ such that $E \subseteq U$ and $F \subseteq V$. (Hint: Use Exercise 5.5.)

Exercise 5.7 Suppose $X$ is a compact Hausdorff space. Use Urysohn's Lemma (Theorem 5.65) and Exercise 5.6 to show that $C(X)$ separates the points of $X$. That is, if $a$ and $b$ are distinct points in $X$, show there is a function $f \in C(X)$ such that $f(a) = 0$ and $f(b) = 1$.

Exercise 5.8 Show that limits in a Hausdorff space are unique. That is, if $X$ is a Hausdorff space, show that a sequence $(x_n)_{n=1}^\infty$ in $X$ cannot converge to two distinct limits $x$ and $\tilde{x}$.

Exercise 5.9 Prove a metric space is second countable if and only if it is separable.

Exercise 5.10 (a) Suppose $(M, d)$ is a metric space and $A$ is a set. If $f : A \to M$ is an injective function, show that $d_A(x, y) = d(f(x), f(y))$ for $(x, y) \in A \times A$ defines a metric on $A$. (b) Show that $\rho(x, y) = |\log(y/x)|$ defines a metric on the set $\mathbb{R}_+ = (0, \infty)$.

Exercise 5.11 Let $d(x, y) = |\phi(x) - \phi(y)|$, where $\phi(x) = x/(1 + |x|)$. Show that $d$ is a metric on $\mathbb{R}$ that is not complete.

Exercise 5.12 Show that the space Lp(0, 1), where 0

Exercise 5.13 Let $L_0(0, 1)$ denote the space of all (equivalence classes of) Lebesgue measurable functions on $[0, 1]$. Define

\[ d(f, g) = \int_0^1 \min\{1, |f(s) - g(s)|\}\,ds, \qquad \{f, g\} \subseteq L_0(0, 1). \]

Prove that $d$ is a metric on $L_0(0, 1)$. Furthermore, show that $d(f_n, f) \to 0$ if and only if $f_n \to f$ in measure. Conclude that $L_0(0, 1)$ is a topological vector space (i.e., show that addition and scalar multiplication are continuous).

Exercise 5.14 Show that any continuous linear functional on $L_0(0, 1)$ is identically zero.

Exercise 5.15 Suppose $(\Omega, \mu)$ is a positive measure space such that $\mu(\Omega) = 1$.

(a) If 0

\[ \|f\|_0 = \exp\left( \int_\Omega \log |f(\omega)|\,\mu(d\omega) \right). \]

(See Exercise 5.15.) If $d(f, g) = \|f - g\|_0$ for all measurable functions $f$ and $g$ in $L_0(\mu)$, does $d$ define a metric on $L_0(\mu)$?

Exercise 5.17 Let $X$ be a locally convex topological vector space with $\eta$ a base of absolutely convex neighborhoods of 0. Verify that the topology on $X$ is generated by the family of Minkowski functionals $\{p_U\}_{U \in \eta}$. Deduce that $x_n \to 0$ in $X$ if and only if $p_U(x_n) \to 0$ for all $U \in \eta$.

Exercise 5.18 Consider the set $\partial B_{\ell_2} = \{x \in \ell_2 : \|x\|_2 = 1\}$. Show that $\partial B_{\ell_2}$ is closed in the norm topology, but not the weak topology on $\ell_2$. (This example shows that the convexity assumption cannot be omitted from Mazur's Theorem.)

Exercise 5.19 Let $K$ be a compact subset of a Hausdorff topological vector space $E$, and suppose $C$ is a closed subset of $E$. Show that $C - K = \{x - y : x \in C,\ y \in K\}$ is a closed subset of $E$.

Exercise 5.20 Let $E$ be a locally convex topological vector space and suppose $K$ is a closed linear subspace of $E$. If $x_0 \notin K$, show that there exists a continuous linear functional $f \in E^*$ such that $f(x_0) = 1$, but $f(x) = 0$ for all $x \in K$.

Exercise 5.21 Let $E$ be a real locally convex topological vector space. Suppose $K$ is a nonempty compact convex subset of $E$, and $C$ is a nonempty closed convex subset of $E$, and that $K \cap C = \emptyset$. Show there is a continuous linear functional $\phi$ on $E$ such that

\[ \inf_{x \in C} \phi(x) > \sup_{y \in K} \phi(y). \]

(We say $\phi$ separates $K$ and $C$.)

Exercise 5.22 Let $X$ be a real Banach space and let $E$ be a weak$^*$-closed subspace of $X^*$. If $\phi$ is a weak$^*$ continuous linear functional on $E$ with $\|\phi\| = 1$, show for any $\varepsilon > 0$ there exists an $x \in X$ with $\|x\| < 1 + \varepsilon$ such that $\phi(e^*) = e^*(x)$ for all $e^* \in E$. (Hint: Consider the sets $C = \{e^* \in E : \phi(e^*) = 1\}$ and $K = (1 + \varepsilon)^{-1} B_{X^*}$.)

Exercise 5.23 Let $(X, \|\cdot\|)$ be a real reflexive Banach space and let $\phi \in X^*$. Define a map $f : X \to \mathbb{R}$ by $f(x) = \frac{1}{2}\|x\|^2 - \phi(x)$ for all $x \in X$. Show that $f$ attains a minimum value.

Exercise 5.24 Let $(f_n)_{n=1}^\infty$ be a bounded sequence in $C[0, 1]$. Show that $f_n(s) \to 0$ for every $s \in [0, 1]$ if and only if $f_n \to 0$ weakly.

Exercise 5.25 Let $p \in [1, \infty)$ and for each $n \in \mathbb{N}$ let $e_n$ be the sequence with 1 in the $n$th coordinate, and 0 elsewhere. Show that the sequence $(n e_n)_{n=1}^\infty$ does not converge weakly to 0 in $\ell_p$. (Compare to Example 5.28.)

Exercise 5.26 Let $p \in (1, \infty)$ and for each $n \in \mathbb{N}$ define a function $f_n : [0, 1] \to \mathbb{R}$ by $f_n(x) = n^{1/p} \chi_{[0,1/n]}(x)$ for all $x \in [0, 1]$. Show that $f_n \to 0$ weakly in $L_p(0, 1)$, but not in norm. (Recall that $\chi_A$ is the characteristic function of the measurable set $A$.)

Exercise 5.27 Suppose $K$ is a real compact Hausdorff space. Show that the set $\mathcal{P}(K)$ of regular Borel probability measures on $K$ is a convex and $w^*$-closed subset of $M(K)$, the set of regular Borel measures on $K$. Show that $\mathcal{P}(K)$ is an extremal set in the unit ball of $M(K)$. (See Sect. 5.8.)

Exercise 5.28 Let $K$ be a compact Hausdorff space and let $\nu$ be a Borel measure on $K$ so that $\|\nu\|_{M(K)} \le 1$ and $\nu(K) = 1$. Show that $\nu$ is a probability measure.

Exercise 5.29 Let $G$ be a group that is also a topological space. Show that $G$ is a topological group if and only if the map $g : G \times G \to G$ defined by $g(x, y) = x^{-1}y$ is continuous.

Exercise 5.30 Let $X$ be a real separable Banach space. Show that $B_{X^*}$ is metrizable in the weak$^*$ topology. (Hint: Let $(x_n)_{n=1}^\infty$ be a countable dense subset in $X$ and define $\phi(x^*) = (x^*(x_n))_{n=1}^\infty \in \mathbb{R}^{\mathbb{N}}$.)

Exercise 5.31 Let $X$ be a Banach space. If $x \in X$, use the Banach–Alaoglu Theorem to prove that there exists an element $x^* \in X^*$ such that $\|x^*\| = 1$ and $x^*(x) = \|x\|$. (Note: We proved this in Proposition 3.29 using the Hahn–Banach Theorem.)

Exercise 5.32 A subset $E$ of a topological vector space $X$ is called bounded if for every open neighborhood $V$ of 0, there exists an $n \in \mathbb{N}$ such that $E \subseteq nV$. Show that any compact subset of a topological vector space is bounded.

Exercise 5.33 A topological vector space $X$ has the Heine–Borel property if every closed and bounded subset of $X$ is compact. (See Exercise 5.32 for the definition of a bounded set in a topological vector space.)
(a) Show that a Banach space has the Heine–Borel property if and only if it is finite-dimensional. (Hint: Use Lemma 5.36.)
(b) Show that $(X^*, w^*)$ has the Heine–Borel property if $X$ is a Banach space.

Exercise 5.34 Show that $C[0, 1]$ is not reflexive by showing that $B_{C[0,1]}$ is not compact in the weak topology. (Hint: Find a $\Lambda \in C[0, 1]^*$ such that $\Lambda(B_{C[0,1]})$ is open.)

Exercise 5.35 Let $X$ be an infinite-dimensional Banach space. Show that $(X^*, w^*)$ is of the first category in itself.

Chapter 6 Compact Operators and Fredholm Theory

6.1 Compact Operators

Suppose $X$ is a vector space (over $\mathbb{R}$ or $\mathbb{C}$) and let $T : X \to X$ be a linear operator. Let us recall some basic definitions from linear algebra. The kernel (or nullspace) of $T$ is the subspace of $X$ given by $\ker(T) = \{x : Tx = 0\}$. The range (or image) of $T$ is given by $\operatorname{ran}(T) = \{Tx : x \in X\}$. We say that $x \in X$ is an eigenvector of $T$ if $x \neq 0$ and there exists some scalar $\lambda$ (called an eigenvalue) such that $Tx = \lambda x$.

The behavior of linear operators on finite-dimensional vector spaces has been studied for a long time and is well understood. Some of the most basic and important theorems of linear algebra rely heavily on the dimension of the underlying vector space $X$. Consider the following well-known theorems.

Theorem 6.1 Let $X$ be a finite-dimensional vector space and suppose $T : X \to X$ is a linear operator. Then:
(i) The map $T$ is one-to-one if and only if $T$ maps $X$ onto $X$.
(ii) $\dim(X) = \dim(\ker T) + \dim(\operatorname{ran} T)$ (Rank-Nullity Theorem).
(iii) If $X$ is a nontrivial complex vector space, then $T$ has at least one eigenvector.

Notice that (ii) implies (i), and certainly (ii) is dependent upon the underlying dimension of $X$. To see how (iii) depends on the finite-dimensionality of $X$, recall that eigenvalues can be calculated by solving the equation $\det(\lambda I - A) = 0$ for $\lambda$, where $I$ is the identity matrix and $A$ is any matrix representation of $T$. Because $X$ is finite-dimensional, the expression $\det(\lambda I - A)$ is a polynomial in $\lambda$, and as such must have a root because $\mathbb{C}$ is algebraically closed (by the Fundamental Theorem of Algebra).

These theorems rely on the finite-dimensionality of $X$, and so any attempt to generalize them to infinite-dimensional vector spaces requires careful consideration. In fact, as stated, Theorem 6.1 is false in a general Banach space. To see this explicitly, we now consider some examples, where the Banach spaces can be either real or complex.
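Before leaving finite dimensions, it may help to see the Rank-Nullity Theorem checked on a small matrix. The following is only a sketch: the matrix `A` and the helper names `rref`, `kernel_basis`, and `apply_matrix` are illustrative assumptions, and exact rational arithmetic from the standard library is used so that no rounding is involved. Row reduction finds the pivot columns (whose number is the rank), and each free column yields one kernel vector.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): reduced row echelon form and its pivot columns."""
    R = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(R[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(R)) if R[i][c] != 0), None)
        if p is None:
            continue
        R[r], R[p] = R[p], R[r]
        pivot = R[r][c]
        R[r] = [x / pivot for x in R[r]]
        for i in range(len(R)):
            if i != r and R[i][c] != 0:
                factor = R[i][c]
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
        if r == len(R):
            break
    return R, pivots

def kernel_basis(rows):
    """One kernel vector per free (non-pivot) column."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for fcol in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[fcol] = Fraction(1)
        for i, pc in enumerate(pivots):
            v[pc] = -R[i][fcol]
        basis.append(v)
    return basis

def apply_matrix(rows, v):
    return [sum(Fraction(a) * b for a, b in zip(row, v)) for row in rows]

# Hypothetical 3x3 example: the second row is twice the first.
A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
rank = len(rref(A)[1])
ker = kernel_basis(A)
```

For this matrix the rank is 2 and the kernel is spanned by a single vector, so the two dimensions add up to 3. The same operator is neither one-to-one nor onto, in line with part (i).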

© Springer Science+Business Media, LLC 2014. A. Bowers, N. J. Kalton, An Introductory Course in Functional Analysis, Universitext, DOI 10.1007/978-1-4939-1945-1_6

Example 6.2 Let $\{p, q\} \subseteq [1, \infty]$ and consider the shift operators $T : \ell_p \to \ell_p$ given by

\[ T(\xi_1, \xi_2, \xi_3, \ldots) = (0, \xi_1, \xi_2, \xi_3, \ldots), \qquad (\xi_k)_{k=1}^\infty \in \ell_p, \]

and $S : \ell_q \to \ell_q$ given by

\[ S(\eta_1, \eta_2, \eta_3, \ldots) = (\eta_2, \eta_3, \eta_4, \ldots), \qquad (\eta_k)_{k=1}^\infty \in \ell_q. \]

The map $T$ is clearly one-to-one, but certainly is not onto. On the other hand, the map $S$ is onto, but clearly not one-to-one. We therefore cannot extend Theorem 6.1(i) to infinite dimensions. Since the spaces $\ell_p$ and $\ell_q$ have infinite dimensions, it is not obvious if the statement of Theorem 6.1(ii) has any significant meaning in its current form.

To see that Theorem 6.1(iii) does not extend to infinite-dimensional vector spaces, suppose $(\xi_k)_{k=1}^\infty$ is an eigenvector for $T$. Then there exists some scalar $\lambda$ such that

\[ T(\xi_1, \xi_2, \xi_3, \ldots) = \lambda(\xi_1, \xi_2, \xi_3, \ldots), \]

or

\[ (0, \xi_1, \xi_2, \xi_3, \ldots) = (\lambda\xi_1, \lambda\xi_2, \lambda\xi_3, \ldots). \]

The only way this can happen is if $\xi_k = 0$ for all $k \in \mathbb{N}$, contradicting the assumption that $(\xi_k)_{k=1}^\infty$ is an eigenvector. It follows that $T$ does not have any eigenvectors. The map $S$, however, has many eigenvectors. (See Exercise 6.1.)

In the case $p = q$, notice that $S \circ T$ is the identity map on $\ell_p$, but $T \circ S$ is not. It is also worth mentioning that if $p$ and $q$ are conjugate exponents ($1/p + 1/q = 1$), then $S = T^*$.

Example 6.3 Let $T : C[0, 1] \to C[0, 1]$ be defined by $Tf(x) = x f(x)$. It is clear that $T$ is a bounded linear operator and that $\|T\| \le 1$. If $\lambda$ is an eigenvalue for $T$, then there exists some function $f \in C[0, 1]$ such that $x f(x) = \lambda f(x)$ for all $x \in [0, 1]$. It follows that $(x - \lambda) f(x) = 0$ for all $x \in [0, 1]$. This can happen only if $f = 0$: we must have $f(x) = 0$ for every $x \neq \lambda$, and then $f(\lambda) = 0$ as well, by continuity. Therefore, $T$ has no eigenvectors.

We see that our intuition from linear algebra can fail us in infinite dimensions. Indeed, for general linear operators on Banach spaces, much of what we know for finite-dimensional vector spaces fails to remain true. For this reason, we impose additional conditions on our operators. We now define one class of operators for which our intuition can serve as a guide.

Definition 6.4 Let $X$ and $Y$ be Banach spaces. A bounded linear operator $T : X \to Y$ is said to be a finite rank operator if $\dim(\operatorname{ran} T) < \infty$. In such a case, the number $\dim(\operatorname{ran} T)$ is called the rank of $T$ and is denoted $\operatorname{rank}(T)$.

For finite rank operators, we can still use some tools from linear algebra. This class of operators is too small, however, and so we wish to introduce a less restrictive condition on our operators. To that end, we introduce a definition.

Definition 6.5 Let $X$ be a topological space. A subset $E$ of $X$ is called relatively compact if it has compact closure; that is, if $\overline{E}$ is compact in $X$.
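The behavior of the shift operators in Example 6.2 is easy to experiment with. The sketch below makes one simplifying assumption: a finite Python list stands for a finitely supported sequence, with all later coordinates equal to zero. The names `right_shift`, `left_shift`, and `same_sequence` are hypothetical helpers introduced only for this illustration.

```python
def right_shift(xi):
    """T: (xi_1, xi_2, ...) -> (0, xi_1, xi_2, ...)."""
    return [0] + list(xi)

def left_shift(eta):
    """S: (eta_1, eta_2, eta_3, ...) -> (eta_2, eta_3, ...)."""
    return list(eta[1:])

def same_sequence(a, b):
    """Compare two finitely supported sequences, ignoring trailing zeros."""
    n = max(len(a), len(b))
    pad = lambda x: list(x) + [0] * (n - len(x))
    return pad(a) == pad(b)

xi = [3, 1, 4]
st = left_shift(right_shift(xi))   # S(T(xi)) gives back xi
ts = right_shift(left_shift(xi))   # T(S(xi)) has lost the first coordinate
e1 = [1, 0, 0]
s_e1 = left_shift(e1)              # e_1 lies in the kernel of S
```

Here `st` agrees with `xi`, while `ts` is `[0, 1, 4]`, so $S \circ T$ acts as the identity but $T \circ S$ does not. The vector `e1` is sent to the zero sequence by $S$, which is why $S$ fails to be one-to-one.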

Example 6.6 Let $X$ be a Banach space. By Goldstine's Theorem (Theorem 5.40), $B_X$ is dense in $B_{X^{**}}$ in the weak$^*$ topology on $X^{**}$. By the Banach–Alaoglu Theorem (Theorem 5.39), $B_{X^{**}}$ is compact in the weak$^*$ topology on $X^{**}$. Therefore, $B_X$ has weak$^*$-compact closure, and so is relatively compact, in $(X^{**}, w^*)$.

Definition 6.7 Let $X$ and $Y$ be Banach spaces. A bounded linear map $T : X \to Y$ is called a compact operator if $T(B_X)$ is a relatively compact set in $Y$.

It is important to note that the image of the unit ball under a compact operator need not be a compact set. (See Example 6.10.)

Example 6.8 Finite rank operators are compact, by the Heine–Borel Theorem.

Example 6.9 Suppose that $X$ and $Y$ are Banach spaces such that $X$ is reflexive, and let $T : X \to Y$ be a bounded linear operator. By Proposition 5.42, the operator $T$ is weakly continuous. (That is, $T$ is continuous in the weak topologies on $X$ and $Y$.) Since $X$ is reflexive, the weak and weak$^*$ topologies on $X$ coincide. Therefore, by the Banach–Alaoglu Theorem (Theorem 5.39), the unit ball $B_X$ is weakly compact. We know that $T$ is weakly continuous, and so $T(B_X)$ is weakly compact in $Y$. It follows that $T(B_X)$ is weakly closed in $Y$, and so $T(B_X)$ is closed in the norm topology on $Y$, by Mazur's Theorem (Theorem 5.45). Therefore, if $T$ is a compact operator, then $T(B_X)$ is compact in the norm topology on $Y$.

In Example 6.9, the set $T(B_X)$ is compact if and only if $T$ is a compact operator. The proof of this fact relies on the assumption that $X$ is reflexive. If $X$ is not reflexive, then $T(B_X)$ may not be closed (and so not compact) even if $T$ is a compact operator. We illustrate this in the next example.

Example 6.10 Let $\lambda = (\lambda_n)_{n=1}^\infty$ be a sequence in $\ell_1$ and define a function $T : c_0 \to \mathbb{R}$ by

\[ T(\xi) = \sum_{n=1}^\infty \lambda_n \xi_n, \qquad \xi = (\xi_n)_{n=1}^\infty \in c_0. \]

For simplicity, assume $\lambda_n > 0$ for each $n \in \mathbb{N}$. Observe that $T$ is linear because the series defining $T(\xi)$ is absolutely convergent for all $\xi \in c_0$. For each $\xi \in c_0$,

\[ |T(\xi)| = \Big| \sum_{n=1}^\infty \lambda_n \xi_n \Big| \le \sup_{n \in \mathbb{N}} |\xi_n| \sum_{n=1}^\infty |\lambda_n| = \|\xi\|_{c_0} \|\lambda\|_1. \]

Thus, $T$ is bounded and $\|T\| \le \|\lambda\|_1$. In fact, $\|T\| = \|\lambda\|_1$ because

\[ \|T\| = \sup_{\xi \in B_{c_0}} |T(\xi)| \ge |T(e_1 + \cdots + e_N)| = \sum_{n=1}^N \lambda_n, \tag{6.1} \]

for all $N \in \mathbb{N}$. (Recall that $e_k$ is the sequence with a 1 in the $k$th coordinate, and zero elsewhere.)

We have shown that $T$ is bounded and linear (and hence continuous); it is also compact because it is a rank 1 operator. The set $T(B_{c_0})$ is not closed, however, because $\sup_{\xi \in B_{c_0}} |T(\xi)| = \|\lambda\|_1$, by (6.1), but there is no $\xi \in B_{c_0}$ for which this supremum is attained. In fact, $T(B_{c_0}) = (-\|\lambda\|_1, \|\lambda\|_1)$, which is not closed, but has a compact closure.

For another example of a compact operator $T : X \to Y$ such that $T(B_X)$ is not closed in $Y$, see Example 5.43. (See also Exercise 5.34.)

In the case of a metric space, a notion closely related to compactness is that of total boundedness.

Definition 6.11 A subset of a metric space is called totally bounded if it can be covered by finitely many closed balls of radius $\varepsilon$ for any $\varepsilon > 0$.

Certainly, if a subset $E$ of a metric space is totally bounded, then any subset of $E$ is also totally bounded.

Theorem 6.12 A closed subset of a complete metric space is compact if and only if it is totally bounded.

Proof Let $M$ be a complete metric space with metric $d$. For ease of notation, we will denote the open ball of radius $\varepsilon$ about $x$ in $M$ by $B_\varepsilon(x)$ and the closed ball of radius $\varepsilon$ about $x$ in $M$ by $\overline{B}_\varepsilon(x)$. That is, $B_\varepsilon(x) = \{y \in M : d(x, y) < \varepsilon\}$ and $\overline{B}_\varepsilon(x) = \{y \in M : d(x, y) \le \varepsilon\}$.

Suppose $K$ is a compact (and hence closed) subset of $M$. Let $\varepsilon > 0$ be given. The collection of sets $\{B_\varepsilon(x) : x \in K\}$ forms an open cover of $K$, and so (by compactness) admits a finite subcover. Thus, there is a finite set $\{x_1, \ldots, x_N\}$ in $K$ such that

\[ K \subseteq B_\varepsilon(x_1) \cup \cdots \cup B_\varepsilon(x_N) \subseteq \overline{B}_\varepsilon(x_1) \cup \cdots \cup \overline{B}_\varepsilon(x_N). \]

Thus, $K$ is totally bounded.

Now suppose that $K$ is a totally bounded closed subset of $M$. We wish to show that $K$ is compact. We will assume $K$ is not compact and derive a contradiction. Let $\mathcal{V}$ be a collection of open sets that cover $K$ and suppose $\mathcal{V}$ does not contain a finite subcover of $K$. Since $K$ is totally bounded, it can be covered by a finite collection of closed balls of radius 1, say

\[ K \subseteq \overline{B}_1(x_{1,1}) \cup \cdots \cup \overline{B}_1(x_{1,n_1}). \]

Since $K$ cannot be covered by finitely many sets in $\mathcal{V}$, there is some member of the collection $\{x_{1,1}, \ldots, x_{1,n_1}\}$, call it $x_1$, such that $K \cap \overline{B}_1(x_1)$ cannot be covered by finitely many sets in $\mathcal{V}$. Let $K_1 = K \cap \overline{B}_1(x_1)$. The set $K_1$ is totally bounded (because it is a subset of $K$). Thus, $K_1$ can be covered by a finite collection of closed balls of radius $1/2$, say

\[ K_1 \subseteq \overline{B}_{1/2}(x_{2,1}) \cup \cdots \cup \overline{B}_{1/2}(x_{2,n_2}). \]

Since $K_1$ cannot be covered by finitely many sets in $\mathcal{V}$, there is some member of the collection $\{x_{2,1}, \ldots, x_{2,n_2}\}$, call it $x_2$, such that $K_1 \cap \overline{B}_{1/2}(x_2)$ cannot be covered by finitely many sets in $\mathcal{V}$. Let $K_2 = K_1 \cap \overline{B}_{1/2}(x_2)$.

Continuing inductively, we find a sequence of points $(x_n)_{n=1}^\infty$ in $M$ and construct a sequence of subsets $(K_n)_{n=1}^\infty$ of $K$ such that

(i) $K_n \subseteq \overline{B}_{1/n}(x_n)$ for all $n \in \mathbb{N}$,
(ii) $K \supseteq K_1 \supseteq K_2 \supseteq K_3 \supseteq \cdots$, and
(iii) $K_n$ cannot be covered by finitely many sets in $\mathcal{V}$ for any $n \in \mathbb{N}$.

For each $n \in \mathbb{N}$, choose $y_n \in K_n$. Then the sequence $(y_n)_{n=1}^\infty$ is a Cauchy sequence, by the properties in (i) and (ii), above. Since $M$ is assumed to be a complete metric space, there is a point $y \in K$ (because $K$ is closed) such that $y_n \to y$ as $n \to \infty$. We chose $y_n \in K_n$ for each $n \in \mathbb{N}$, and so it follows from (i) that $x_n \to y$ as $n \to \infty$.

By assumption, $\mathcal{V}$ is an open cover of $K$ and $y \in K$. Hence, there is an open set $V \in \mathcal{V}$ such that $y \in V$. Since $x_n \to y$ as $n \to \infty$, there exists some $N \in \mathbb{N}$ such that $\overline{B}_{1/n}(x_n) \subseteq V$ for all $n \ge N$. This violates property (iii), because $K_n \subseteq \overline{B}_{1/n}(x_n)$ for all $n \in \mathbb{N}$, and so we have obtained a contradiction. Therefore, the cover $\mathcal{V}$ contains a finite subcover of $K$, and so $K$ is compact.

Definition 6.13 We denote the collection of compact operators from $X$ to $Y$ by the symbol $K(X, Y)$. We use $K(X)$ to denote $K(X, X)$.

The next theorem shows that the collection of compact operators is well-behaved.

Theorem 6.14 Let $X$ and $Y$ be Banach spaces. The set $K(X, Y)$ of compact operators from $X$ to $Y$ forms a closed linear subspace of $L(X, Y)$.

Proof First, we will show that $K(X, Y)$ is a linear subspace. Suppose that $S : X \to Y$ and $T : X \to Y$ are compact operators and let $\alpha$ and $\beta$ be scalars. We wish to show that the operator $\alpha S + \beta T$ is compact. Define a map $\phi : Y \times Y \to Y$ by

\[ \phi(u, v) = \alpha u + \beta v, \qquad (u, v) \in Y \times Y. \]

The map $\phi$ is continuous because $Y$ is a topological vector space. By assumption, the set $\overline{S(B_X)} \times \overline{T(B_X)}$ is compact in $Y \times Y$. Therefore, the set $H = \phi\big(\overline{S(B_X)} \times \overline{T(B_X)}\big)$ is compact in $Y$, as the continuous image of a compact set. Since $(\alpha S + \beta T)(B_X) \subseteq H$, and $H$ is compact, it follows that $\overline{(\alpha S + \beta T)(B_X)}$ is compact, and so $\alpha S + \beta T$ is a compact operator.

It remains to show that $K(X, Y)$ is a closed subspace. Suppose $T \in L(X, Y)$ is a bounded linear operator and let $(T_n)_{n=1}^\infty$ be a sequence of compact operators in $K(X, Y)$ such that $\|T_n - T\| \to 0$ as $n \to \infty$. We wish to show that $T(B_X)$ is relatively compact. By Theorem 6.12, it suffices to show that its closure is totally bounded. This will follow if we show, for any $\varepsilon > 0$, the set $T(B_X)$ is contained in a finite union of balls of radius $\varepsilon$ in $Y$.

Fix $\varepsilon > 0$. There exists an $n \in \mathbb{N}$ such that $\|T_n - T\| < \varepsilon/2$. Because $T_n(B_X)$ is relatively compact, there exists a finite set of points $\{y_1, \ldots, y_N\}$ in $Y$ such that

\[ T_n(B_X) \subseteq \bigcup_{j=1}^N \Big( y_j + \frac{\varepsilon}{2} B_Y \Big). \]

If $x \in B_X$, then $\|T_n x - Tx\| < \varepsilon/2$, and so $Tx \in \bigcup_{j=1}^N (y_j + \varepsilon B_Y)$. Thus, $T(B_X)$ is contained in a finite union of $\varepsilon$-balls. We conclude that $\overline{T(B_X)}$ is totally bounded, and hence compact. Therefore, $T$ is a compact operator, as required.

Remark 6.15 There is a more straightforward proof that $K(X, Y)$ is a subspace of $L(X, Y)$ that uses the fact that, in a metric space, compactness is equivalent to sequential compactness (i.e., every sequence has a convergent subsequence). We will prove this shortly (for complete metric spaces), but for now let us take this fact for granted.

Let $(x_n)_{n=1}^\infty$ be a sequence in $B_X$. There exists a subsequence $(x_{n_k})_{k=1}^\infty$ such that $(Sx_{n_k})_{k=1}^\infty$ converges (not necessarily in $S(B_X)$). There exists a further subsequence $(x_{n_{k_j}})_{j=1}^\infty$ such that $(Tx_{n_{k_j}})_{j=1}^\infty$ converges (again, not necessarily in $T(B_X)$). Then $((\alpha S + \beta T)(x_{n_{k_j}}))_{j=1}^\infty$ converges, and we obtain the desired result.

We will now show that compactness is equivalent to sequential compactness in a complete metric space. The theorem remains valid even without the completeness assumption, but we need it only for Banach spaces, and so we opt for a theorem with a simpler proof.

Theorem 6.16 Let $M$ be a complete metric space. A closed subset $K$ of $M$ is compact if and only if it is sequentially compact.

Proof Let $d$ be a complete metric on $M$. For notational simplicity, we will denote by $B_\delta(s)$ the open ball of radius $\delta$ about $s \in M$; that is, $B_\delta(s) = \{x \in M : d(s, x) < \delta\}$.

Suppose $K$ is a compact set and let $(a_n)_{n=1}^\infty$ be a sequence in $K$. We will show that $(a_n)_{n=1}^\infty$ has a convergent subsequence. Since $K$ is compact, we can cover $K$ with a finite number of open balls of radius 1. At least one of these must contain $a_n$ for infinitely many values of $n \in \mathbb{N}$. That is, there is an $x_1 \in K$ such that $a_n \in B_1(x_1)$ for infinitely many $n \in \mathbb{N}$. Let $N_1 = \{n : n \in \mathbb{N} \text{ and } a_n \in B_1(x_1)\}$. Denote the integers in $N_1$ by $(n_{1j})_{j=1}^\infty$, where n1j 0.

Choose a positive integer $N \in \mathbb{N}$ such that $\frac{1}{N} < \frac{\varepsilon}{2}$. If $i > N$ and $j > N$, then $a_{n_i}$ and $a_{n_j}$ are members of the $N$th subsequence, and consequently are in the open ball $B_{1/N}(x_N)$. By the triangle inequality,

\[ d(a_{n_i}, a_{n_j}) \le d(a_{n_i}, x_N) + d(x_N, a_{n_j}) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]

We have established that the subsequence $(a_{n_i})_{i=1}^\infty$ is a Cauchy sequence. By assumption, $M$ is a complete metric space, and so the Cauchy sequence $(a_{n_i})_{i=1}^\infty$ converges to some point in the closed set $K$. Since any sequence in $K$ has a convergent subsequence, $K$ is sequentially compact.

Now suppose $K$ is sequentially compact. We will show that $K$ is totally bounded. (Then $K$ will be compact by Theorem 6.12.) Suppose to the contrary that $K$ is not totally bounded. Then there exists some $\varepsilon > 0$ such that $K$ cannot be covered by finitely many closed balls with radius $\varepsilon$. Pick any $a_1 \in K$. Since $K$ cannot be covered by finitely many $\varepsilon$-balls, the set $K \setminus B_\varepsilon(a_1)$ is nonempty. Choose some $a_2 \in K \setminus B_\varepsilon(a_1)$. Similarly, $K \setminus (B_\varepsilon(a_1) \cup B_\varepsilon(a_2))$ cannot be empty, and must contain some element, say $a_3$.

Proceeding inductively, we construct a sequence of points $(a_n)_{n=1}^\infty$ in $K$ such that $a_{n+1} \notin B_\varepsilon(a_1) \cup \cdots \cup B_\varepsilon(a_n)$ for all $n \in \mathbb{N}$. It follows that $d(a_n, a_m) \ge \varepsilon$ for all $m \neq n$, and so $(a_n)_{n=1}^\infty$ cannot have a convergent subsequence. This contradicts the assumption that $K$ is sequentially compact. Therefore, $K$ is totally bounded, and so is compact (by Theorem 6.12).

We return now to the space of compact operators between Banach spaces $X$ and $Y$. In addition to being a closed subspace of $L(X, Y)$, the collection of compact operators $K(X, Y)$ also possesses an ideal structure.

Theorem 6.17 Let $W$, $X$, $Y$, and $Z$ be Banach spaces. If $T : X \to Y$ is a compact operator and the maps $A : W \to X$ and $B : Y \to Z$ are bounded linear operators, then the composition $B \circ T \circ A : W \to Z$ is compact. In particular, the space $K(X)$ is a two-sided ideal in $L(X)$.

Proof The map $A$ is bounded, and consequently $A(B_W) \subseteq \|A\| B_X$. We thus conclude that $TA(B_W) \subseteq \|A\|\,\overline{T(B_X)}$, the latter set being compact, since $T$ is a compact operator. Finally, $BTA(B_W) \subseteq \|A\|\,B\big(\overline{T(B_X)}\big)$. The set on the right of the previous inclusion is compact as the continuous image (under $B$) of a compact set. Therefore, $\overline{BTA(B_W)}$ is a closed subset of a compact set, and so is compact.

Example 6.18 Suppose $X$ is a Banach space. The identity map $\mathrm{Id} : X \to X$ is a compact operator if and only if $\dim(X) < \infty$. Indeed, if $A : X \to X$ is any invertible bounded linear operator, then $A$ is compact if and only if $\dim(X) < \infty$. This follows from Proposition 5.37.

Example 6.19 Let $p \in [1, \infty]$ be given and let $a = (a_j)_{j=1}^\infty$ be a bounded sequence (i.e., an element of $\ell_\infty$). Define a map $T_a : \ell_p \to \ell_p$ by

\[ T_a(\xi_1, \xi_2, \xi_3, \ldots) = (a_1\xi_1, a_2\xi_2, a_3\xi_3, \ldots), \qquad (\xi_j)_{j=1}^\infty \in \ell_p. \]

We claim this map is a bounded linear operator. Linearity is clear. To see it is bounded, let $\xi = (\xi_j)_{j=1}^\infty \in \ell_p$. Then

\[ \|T_a \xi\|_p = \Big( \sum_{j=1}^\infty |a_j \xi_j|^p \Big)^{1/p} \le \|a\|_\infty \|\xi\|_p. \]

This implies that $\|T_a\| \le \|a\|_\infty$. In fact, $T_a(e_i) = a_i e_i$ for all $i \in \mathbb{N}$, and so it follows that $\|T_a\| = \|a\|_\infty$. (Note that $e_i$ is an eigenvector with corresponding eigenvalue $a_i$ for each $i \in \mathbb{N}$.)
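The estimate in Example 6.19 can be checked numerically on a finite truncation. The sketch below is only an illustration under stated assumptions: the sequence `a`, the vector `xi`, and the exponent `p` are made-up sample data, only the first four coordinates are kept, and `diag_apply` and `p_norm` are hypothetical helper names.

```python
def diag_apply(a, xi):
    """Coordinatewise action of T_a: (a_1 xi_1, a_2 xi_2, ...)."""
    return [ak * xk for ak, xk in zip(a, xi)]

def p_norm(xi, p):
    """The l_p norm of a finitely supported sequence, for finite p >= 1."""
    return sum(abs(x) ** p for x in xi) ** (1.0 / p)

# Sample data (assumptions for illustration only).
a = [0.5, -2.0, 1.0, 0.25]
xi = [1.0, 2.0, -1.0, 0.5]
p = 3

sup_a = max(abs(x) for x in a)           # plays the role of ||a||_infinity
lhs = p_norm(diag_apply(a, xi), p)       # ||T_a xi||_p
rhs = sup_a * p_norm(xi, p)              # ||a||_infinity * ||xi||_p

# The basis vector e_2 is an eigenvector with eigenvalue a_2 = -2.
e2 = [0.0, 1.0, 0.0, 0.0]
Ta_e2 = diag_apply(a, e2)
```

In this truncation `lhs` does not exceed `rhs`, and the norm of `Ta_e2` equals `sup_a`, so the bound is attained at $e_2$. For a genuinely infinite sequence $a$ the supremum of $|a_i|$ need not be attained at any single index, but the values $\|T_a e_i\|_p = |a_i|$ still approach it.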

Proposition 6.20 Ta is compact if and only if a ∈ c0.

Proof If $T_a$ is a compact operator, then $\overline{T_a(B_{\ell_p})}$ is a compact set, and consequently any sequence in $T_a(B_{\ell_p})$ must have a convergent subsequence. Suppose that $a \notin c_0$. Then there exists some $\delta > 0$ and a subsequence $(a_{n_j})_{j=1}^\infty$ such that $|a_{n_j}| > \delta$ for all $j \in \mathbb{N}$. Then, for all distinct natural numbers $j$ and $k$,

\[ \|T_a e_{n_j} - T_a e_{n_k}\|_p = \|a_{n_j} e_{n_j} - a_{n_k} e_{n_k}\|_p = \big( |a_{n_j}|^p + |a_{n_k}|^p \big)^{1/p} > \delta. \]

It follows that $(T_a e_{n_j})_{j=1}^\infty$ has no convergent subsequence, and so $T_a$ is not a compact operator.

Now suppose $a = (a_j)_{j=1}^\infty \in c_0$. For each $n \in \mathbb{N}$, define a sequence in $c_0$ by the rule $a^{(n)} = (a_1, \ldots, a_n, 0, \ldots)$. The sequence $a^{(n)}$ has only finitely many nonzero terms, and so $T_{a^{(n)}}$ is a finite rank operator, and hence is compact. For each $n \in \mathbb{N}$,

\[ \|T_a - T_{a^{(n)}}\| = \|T_{a - a^{(n)}}\| = \sup_{k > n} |a_k|. \]

Since $a \in c_0$, this quantity tends to 0 as $n \to \infty$. Thus, $T_a$ is the uniform limit of a sequence of finite rank operators. Therefore, $T_a$ is compact, because $K(\ell_p)$ is a closed subspace of $L(\ell_p)$, by Theorem 6.14.

Example 6.21 Let $K$ be a continuous scalar-valued function on the compact space $[0, 1] \times [0, 1]$. Define $T_K : C[0, 1] \to C[0, 1]$ by

\[ T_K f(s) = \int_0^1 K(s, t)\,f(t)\,dt, \qquad f \in C[0, 1],\ s \in [0, 1]. \]

We must show that $T_K$ is well-defined; that is, we must show that $T_K f$ is a continuous function on $[0, 1]$. If $s_n \to s$ as $n \to \infty$, then

\[ \int_0^1 K(s_n, t)\,f(t)\,dt \longrightarrow \int_0^1 K(s, t)\,f(t)\,dt, \]

as $n \to \infty$, by the Dominated Convergence Theorem. Consequently, $T_K f$ is continuous on $[0, 1]$, by the sequential characterization of continuity.

It is clear that $T_K$ is linear. To compute the bound, let $f \in C[0, 1]$. Then

\[ |T_K f(s)| \le \int_0^1 |K(s, t)\,f(t)|\,dt \le \|K\|_\infty \|f\|_\infty. \]

This bound is uniform in $s \in [0, 1]$, and so $\|T_K f\|_\infty \le \|K\|_\infty \|f\|_\infty$. Therefore, $T_K$ is bounded and $\|T_K\| \le \|K\|_\infty$.

Proposition 6.22 If $K$ is a continuous scalar-valued function on $[0, 1] \times [0, 1]$, then $T_K$ is a compact operator.

Proof Suppose $K$ is a polynomial on $[0, 1] \times [0, 1]$, say $K(s, t) = \sum_{j=1}^n a_j s^{m_j} t^{n_j}$, where $(m_j)_{j=1}^n$ and $(n_j)_{j=1}^n$ are finite sequences in $\mathbb{N}$, and where $(a_j)_{j=1}^n$ is a finite sequence of scalars. Then

\[ T_K f(s) = \sum_{j=1}^n a_j s^{m_j} \int_0^1 t^{n_j} f(t)\,dt. \]

Thus, $T_K f$ is a polynomial in the linear span of $\{s^{m_1}, \ldots, s^{m_n}\}$. This holds for all $f \in C[0, 1]$, and so $T_K$ is a finite rank operator, and hence compact.

Now suppose $K \in C([0, 1] \times [0, 1])$ is a continuous function. By the Weierstrass Approximation Theorem, there exists a sequence of polynomials $(K_n)_{n=1}^\infty$ such that $\|K - K_n\|_\infty \to 0$ as $n \to \infty$. Then

\[ \|T_K - T_{K_n}\| = \|T_{K - K_n}\| \le \|K - K_n\|_\infty \to 0, \]

as $n \to \infty$. Therefore, $T_K$ is the limit of a sequence of finite rank operators, and so is compact (by Theorem 6.14).

Digression: Historical Comments

The linear operator $T_K$ from Example 6.21 was considered by Fredholm in 1903, in what is considered by some to be the first paper on functional analysis [12]. The function $K$ is called the Fredholm kernel (or simply the kernel) of the operator $T_K$ (not to be confused with the nullspace $\ker T_K$). Fredholm was interested in solving integral equations of the form

\[ g(s) = \int_0^1 K(s, t)\,f(t)\,dt, \]

where $f$, $g$, and $K$ had specified properties. This integral equation is an extension of the matrix equation

\[ y_j = \sum_{k=1}^n a_{jk} x_k, \]

where $A = (a_{jk})_{j,k=1}^n$ is an $n \times n$ matrix (for some $n \in \mathbb{N}$), $x$ and $y$ are $n$-dimensional vectors, and $y = Ax$.

Fredholm's work was before the advent of measure theory, and so the focus was on the space of continuous functions on $[0, 1]$. The space $C[0, 1]$ is an infinite-dimensional vector space, and integral operators determine linear operators on this vector space. Fredholm set about trying to apply techniques of linear algebra to these kinds of operators.

With the advent of measure theory and Riesz's work on $L_p$-spaces, the focus on integral operators broadened to include these larger spaces of functions. For any $p$ in the interval $[1, \infty]$, we define a Fredholm operator on $L_p(0, 1)$ to be any operator $T_K : L_p(0, 1) \to L_p(0, 1)$ of the form

\[ T_K f(s) = \int_0^1 K(s, t)\,f(t)\,dt, \qquad f \in L_p(0, 1),\ s \in [0, 1], \tag{6.2} \]

where $K$ is a prescribed measurable function on $[0, 1] \times [0, 1]$. The function $K$ is known as the kernel of $T_K$. (It is also variously known as the nucleus or the Green's function for $T_K$.) In order to ensure that $T_K$ is well-defined, the kernel $K$ generally has some additional assumptions imposed upon it. For example, in the following proposition, which can be seen as an extension of Proposition 6.22, the kernel $K$ is assumed to be an essentially bounded measurable function.
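The analogy between the integral equation and the matrix equation $y = Ax$ can be made concrete by discretizing the integral. The following sketch assumes a midpoint quadrature rule with $n$ equal subintervals (a choice made here for illustration, not something taken from the text), and uses the hypothetical sample kernel $K(s, t) = st$ and function $f(t) = t$, for which $T_K f(s) = s/3$ exactly. The helper names are likewise illustrative.

```python
def midpoints(n):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(j + 0.5) / n for j in range(n)]

def kernel_matrix(K, n):
    """Matrix with entries K(s_i, t_j) / n, a midpoint-rule stand-in for T_K."""
    pts = midpoints(n)
    return [[K(s, t) / n for t in pts] for s in pts]

def apply_matrix(A, x):
    """Ordinary matrix-vector product y = A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

n = 200
pts = midpoints(n)
A = kernel_matrix(lambda s, t: s * t, n)   # sample kernel K(s, t) = s t
f_vals = list(pts)                         # sample function f(t) = t
g_vals = apply_matrix(A, f_vals)           # approximates (T_K f)(s_i)

# Compare with the exact value T_K f(s) = s / 3 at the sample points.
max_err = max(abs(g - s / 3.0) for g, s in zip(g_vals, pts))
```

With these choices the discretized operator is an $n \times n$ matrix of rank one, mirroring the fact that a kernel of the form $s^{m} t^{k}$ gives a rank-one operator, and `max_err` is small because the midpoint rule is accurate for smooth integrands.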

Proposition 6.23 If $K \in L_\infty([0, 1] \times [0, 1])$, then $T_K$ is a compact operator on $L_p(0, 1)$ for $1 < p < \infty$.

Proof Let $p \in (1, \infty)$ and suppose $f \in L_p(0, 1)$. Then

\[ \|T_K f\|_p = \Big( \int_0^1 |T_K f(s)|^p\,ds \Big)^{1/p} \le \|T_K f\|_\infty = \sup_{s \in [0,1]} \Big| \int_0^1 K(s, t)\,f(t)\,dt \Big|. \]

Thus,

\[ \|T_K f\|_p \le \|K\|_\infty \|f\|_1 \le \|K\|_\infty \|f\|_p. \]

(Note that $\|\cdot\|_1 \le \|\cdot\|_p$ on a probability space, by Hölder's Inequality.) Therefore, $T_K f \in L_p(0, 1)$ whenever $f \in L_p(0, 1)$ and $\|T_K f\|_p \le \|K\|_\infty \|f\|_p$. We have established that $T_K$ is well-defined and $\|T_K\| \le \|K\|_\infty$.

To prove that $T_K$ is compact, we use an argument similar to the one we used in Proposition 6.22. If $K = \chi_{A \times B}$, where $A$ and $B$ are measurable subsets of $[0, 1]$, then for all $f \in L_p(0, 1)$ and $s \in [0, 1]$,

\[ T_K f(s) = \int_0^1 \chi_A(s)\,\chi_B(t)\,f(t)\,dt = \Big( \int_B f(t)\,dt \Big) \chi_A(s). \]

Thus, $T_K f \in \operatorname{span}\{\chi_A\}$, and so $T_K$ is a rank-one operator. Therefore, $T_K$ is compact. Now suppose

n = K ci χAi ×Bi , (6.3) i=1 where n ∈ N and, for each i ∈{1, ... , n}, the sets Ai and Bi are measurable and ci is a scalar. Then for all f ∈ Lp(0, 1) and s ∈ [0, 1],   1 n n = = TK f (s) ci χAi (s) χBi (t) f (t) dt ci f (t) dt χAi (s). 0 i=1 i=1 Bi ∈ { } Consequently, TK f span χA1 , ... , χAn , and so TK is an operator of rank at most n. Thus, TK is compact. If K is a bounded measurable function on [0, 1] × [0, 1], then the compactness of TK follows from a density argument. Since K is a bounded function, it is also true 6.1 Compact Operators 139 that K is in Lr ([0, 1] × [0, 1]) for all r ∈ (1, ∞). Observe that

 1  1  1 p 1/p p 1/p TK f p = |TK f (s)| ds ≤ |K(s, t) f (t)| dt ds . 0 0 0 Thus, applying Hölder’s Inequality to the inner integral, we have

 1  1 q p/q 1/p TK f p ≤ |K(s, t)| dt ds f p, (6.4) 0 0 where 1/p + 1/q = 1. We will show that the right side of (6.4) is bounded by     = { } ≤ ∈ K r f p, where r max p, q . Suppose that q p, then for any s [0, 1], we 1 | |q 1/q ≤ 1 | |p 1/p have that ( 0 K(s, t) dt) ( 0 K(s, t) dt) . Consequently,

 1 1 p 1/p TK f p ≤ |K(s, t)| dt ds f p =Kpf p. 0 0  ≤ = 1 | |q 1/q Now suppose p q and let A(s) ( 0 K(s, t) dt) . Then

 1  1 p 1/p q 1/q TK f p ≤ A(s) ds f p ≤ A(s) ds f p 0 0  1 1 q 1/q = |K(s, t)| dt ds f p =Kq f p. 0 0

Therefore, $\|T_K f\|_p \le \|K\|_r \|f\|_p$, where $r = \max\{p, q\}$.

Functions of the type in (6.3) are dense in $L_r([0,1] \times [0,1])$. Thus, there exists a sequence of kernels $(K_n)_{n=1}^\infty$, all of the type in (6.3), such that $K = \lim_{n \to \infty} K_n$, where the limit is in the norm on $L_r([0,1] \times [0,1])$. Because $T_K - T_{K_n} = T_{K - K_n}$, the bound above gives $\|T_K - T_{K_n}\| \le \|K - K_n\|_r \to 0$; in particular, $T_K f = \lim_{n \to \infty} T_{K_n} f$ for all $f \in L_p(0,1)$. For each $n \in \mathbb{N}$, the operator $T_{K_n}$ is a finite rank operator on $L_p(0,1)$. Therefore, $T_K$ is the limit (in the operator norm) of a sequence of finite rank operators, and so $T_K$ is compact, by Theorem 6.14.

Example 6.24 Let us explore which kernels we can use to define a Fredholm operator as in (6.2). In order for $T_K : L_p(0,1) \to L_p(0,1)$ to be well-defined, we must ensure that $T_K f \in L_p(0,1)$ for all $f \in L_p(0,1)$. In the proof of Proposition 6.23, we saw that

\[ \|T_K f\|_p \le \left( \int_0^1 \left( \int_0^1 |K(s,t)|^q\, dt \right)^{p/q} ds \right)^{1/p} \|f\|_p. \]

(See (6.4).) If we choose $K$ so that the above quantity is finite, then $T_K$ will be a well-defined bounded linear operator. An important case is when $p = 2$ (and thus $q = 2$). In this case, the inequality becomes

\[ \|T_K f\|_2 \le \left( \int_0^1 \int_0^1 |K(s,t)|^2\, dt\, ds \right)^{1/2} \|f\|_2 = \|K\|_2 \|f\|_2. \tag{6.5} \]

Therefore, if $K \in L_2([0,1] \times [0,1])$, then $T_K$ is a bounded linear operator from $L_2(0,1)$ to $L_2(0,1)$ and $\|T_K\| \le \|K\|_2$. An operator of this type is known as a Hilbert–Schmidt operator. We will say more about these operators in Chapter 7.

In the examples given above, each operator was shown to be compact by demonstrating that it was the limit (in the operator norm) of a sequence of finite rank operators. This leads to a natural question: Is it true that every compact operator is a limit (in the operator norm) of a sequence of finite rank operators? This question, known as the "Approximation Problem," motivated much of the early development of functional analysis. It was not until 1973 that Per Enflo settled the issue and proved that not every compact operator is the limit of a sequence of finite rank operators [9].

Famously, Per Enflo was awarded a live goose by Stanislaw Mazur for solving the Approximation Problem. (Many problems included in the Scottish Book came with the promise of a reward for a solution, especially those that were considered difficult or important. In 1936, when Mazur included a problem in the Scottish Book [Problem 153] that was equivalent to the Approximation Problem, a live goose was considered very valuable. For a complete list of problems and prizes, see [24].)

Much earlier, in 1955, Alexander Grothendieck studied compact operators in depth [17]. He developed several conditions which were equivalent to the statement that every compact operator is the limit of a sequence of finite rank operators. (In fact, it was one of Grothendieck's alternate formulations that Enflo disproved.) Perhaps one of the most unexpected equivalences is the following.

Proposition 6.25 Each compact operator between arbitrary Banach spaces is the limit of a sequence of finite rank operators if and only if the following is true:

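\[ \text{If } K \in C([0,1] \times [0,1]) \text{ and } \int_0^1 K(s,t)\, K(t,u)\, dt = 0 \text{ for all } \{s,u\} \subseteq [0,1], \text{ then } \int_0^1 K(s,s)\, ds = 0. \tag{6.6} \]

The motivation behind (6.6) comes from the finite-dimensional theory of linear algebra. Let $n \in \mathbb{N}$ and suppose $A = (a_{jk})_{j,k=1}^n$ is an $n \times n$ matrix. Recall that the trace of the matrix $A$ is

\[ \operatorname{trace}(A) = \sum_{i=1}^n a_{ii}. \]

It is known that if $A^2 = 0$, then $\operatorname{trace}(A) = 0$. That is,

\[ \sum_{j=1}^n a_{ij}\, a_{jk} = 0 \text{ for all } \{i,k\} \subseteq \{1, \ldots, n\} \quad \Longrightarrow \quad \sum_{i=1}^n a_{ii} = 0. \]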

Proposition 6.25 suggests that a similar result might be true for the Fredholm operator $T_K$, when the kernel $K$ is continuous. We define the trace of the Fredholm operator $T_K$ to be

\[ \operatorname{trace}(T_K) = \int_0^1 K(s,s)\, ds. \tag{6.7} \]

Proposition 6.25 states that if each compact operator between arbitrary Banach spaces is the limit of a sequence of finite rank operators, then for every continuous kernel $K$,

\[ \int_0^1 K(s,t)\, K(t,u)\, dt = 0 \text{ for all } \{s,u\} \subseteq [0,1] \quad \Longrightarrow \quad \int_0^1 K(s,s)\, ds = 0. \]

Since it is not true that each compact operator between arbitrary Banach spaces is the limit of a sequence of finite rank operators, this implication does not hold for all $K \in C([0,1] \times [0,1])$. It can be shown, however, that the implication does hold if $K$ is Hölder continuous with exponent $\alpha$ for all $\alpha > 1/2$.
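The finite-dimensional fact behind this discussion (if $A^2 = 0$, then $\operatorname{trace}(A) = 0$) can be checked on a small case. The following is only an illustrative sketch: the vectors `v` and `w` and the helper names `matmul` and `trace` are hypothetical choices made for this example and do not come from the text.

```python
# Illustrative sketch only: v, w, matmul and trace are hypothetical names.

def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

# Build A = v w^T with w . v = 0, so that A^2 = (w . v) A = 0.
v = [1, 2, -1]
w = [3, -1, 1]          # w . v = 3 - 2 - 1 = 0
A = [[vi * wj for wj in w] for vi in v]

A_squared = matmul(A, A)
print(A_squared)        # all entries vanish
print(trace(A))         # trace(A) equals w . v
```

For a rank-one matrix $A = v w^{T}$, both $A^2$ and $\operatorname{trace}(A)$ are governed by the single number $w \cdot v$, which is why the implication is transparent in this case.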

6.2 A Rank-Nullity Theorem for Compact Operators

We resume the study of compact operators with a theorem of Schauder that guarantees the compactness of the adjoint of a compact operator.

Theorem 6.26 (Schauder's Theorem) If $X$ and $Y$ are Banach spaces and the operator $T : X \to Y$ is compact, then $T^* : Y^* \to X^*$ is also compact.

Proof By the Banach–Alaoglu Theorem (Theorem 5.39), we know that $B_{Y^*}$ is $w^*$-compact. Consequently, it suffices to show that $T^* : (B_{Y^*}, w^*) \to (X^*, \|\cdot\|)$ is a continuous map, because then $T^*(B_{Y^*})$ will be compact as the continuous image of a compact set.

Suppose that $y_0^* \in B_{Y^*}$ and consider the closed neighborhood of $T^* y_0^*$ given by $T^* y_0^* + \delta B_{X^*}$, where $\delta > 0$. Let $V$ denote the preimage of this set in $B_{Y^*}$; that is,

\[ V = (T^*)^{-1}\left( T^* y_0^* + \delta B_{X^*} \right) \cap B_{Y^*}. \]

We need to show that $V$ is a $w^*$-neighborhood of $y_0^*$ relative to $B_{Y^*}$; that is, we must find a $w^*$-open set $W$ (open in the $w^*$-topology of $Y^*$) containing $y_0^*$ such that $W \cap B_{Y^*} \subseteq V$. By assumption, $T$ is a compact operator, and so $T(B_X)$ is a relatively compact set in $Y$. By Theorem 6.12, a relatively compact set is totally bounded, and hence there exists a finite set $\{x_1, \ldots, x_n\} \subseteq B_X$ such that

\[ T(B_X) \subseteq \bigcup_{j=1}^n \left( T x_j + \frac{\delta}{3} B_Y \right). \tag{6.8} \]

Define a $w^*$-open neighborhood $W$ of $y_0^*$ by

\[ W = \left\{ y^* \in Y^* : |y^*(T x_j) - y_0^*(T x_j)| < \frac{\delta}{3},\ j \in \{1, \ldots, n\} \right\}. \]

We claim that $W \cap B_{Y^*} \subseteq V$. To that end, we will show

\[ T^*(W \cap B_{Y^*}) \subseteq T^* y_0^* + \delta B_{X^*}. \]

Pick $y^* \in W \cap B_{Y^*}$. We need to show that $\|T^* y^* - T^* y_0^*\| \le \delta$. Let $x \in B_X$. By (6.8), there exists some $j$ such that $\|Tx - Tx_j\| \le \delta/3$. By the triangle inequality,

\[ |(T^* y^* - T^* y_0^*)(x)| = |(y^* - y_0^*)(Tx)| \le |(y^* - y_0^*)(Tx_j)| + |(y^* - y_0^*)(Tx - Tx_j)|. \]

Because $y^* \in W$, it follows that $|(y^* - y_0^*)(Tx_j)| < \delta/3$. And since $y^*$ and $y_0^*$ are in $B_{Y^*}$, we conclude that

\[ |(y^* - y_0^*)(Tx - Tx_j)| \le \|y^* - y_0^*\|\, \|Tx - Tx_j\| \le \frac{2\delta}{3}. \]

Consequently, $|(T^* y^* - T^* y_0^*)(x)| \le \delta$ for all $x \in B_X$, and so $\|T^* y^* - T^* y_0^*\| \le \delta$, as required.

We conclude that $T^* : (B_{Y^*}, w^*) \to (X^*, \|\cdot\|)$ is continuous, and hence $T^*(B_{Y^*})$ is compact as the continuous image of the weak$^*$-compact set $B_{Y^*}$. □

Notice that in the proof of Theorem 6.26, we did not have to take the closure of $T^*(B_{Y^*})$ in $X^*$ to get a compact set. This is a direct consequence of the Banach–Alaoglu Theorem, which assures us that $B_{Y^*}$ is always compact in the weak$^*$ topology.

In the proof of Theorem 6.26, we demonstrated the continuity of $T^*$ when viewed as a map from $(B_{Y^*}, w^*)$ to $(X^*, \|\cdot\|)$. In general, however, we cannot extend this to the entire space $Y^*$. That is, we cannot say that $T^* : (Y^*, w^*) \to (X^*, \|\cdot\|)$ is continuous. (See Exercise 6.3.) On the other hand, by Proposition 5.44, we do know that $T^* : (Y^*, w^*) \to (X^*, w^*)$ is always continuous.

Example 6.27 (Volterra operator) Our purpose for adding structure to our linear operators was to recover some of the properties of linear operators on finite-dimensional vector spaces. We will now provide a compact operator on a complex Banach space with no eigenvalues. Let $L_2(0,1)$ be the complex Banach space of square-integrable functions on $[0,1]$. Define a map $V : L_2(0,1) \to L_2(0,1)$ by

\[ Vf(x) = \int_0^x f(t)\, dt, \qquad f \in L_2(0,1),\ x \in [0,1]. \]

This operator, known as the Volterra operator (see Example 3.41), is a Hilbert–Schmidt operator with kernel

\[ K(x,t) = \begin{cases} 1 & \text{if } t \le x, \\ 0 & \text{if } t > x. \end{cases} \]

Since a Hilbert–Schmidt operator is always compact (see Example 6.24), we conclude that $V$ is a compact operator.

Suppose that $\lambda \in \mathbb{C}$ is an eigenvalue, with eigenfunction $f$. Then $Vf(x) = \lambda f(x)$ for almost every $x \in [0,1]$. If $\lambda = 0$, then $Vf = 0$ a.e., and so $\int_0^x f(t)\, dt = 0$ for almost every $x$. This implies that $f = 0$ a.e., and so $\lambda = 0$ is not an eigenvalue.

Suppose $\lambda \ne 0$. Then $f(x) = \frac{1}{\lambda} \int_0^x f(t)\, dt$ for almost every $x$. Then $f$ is differentiable and $f'(x) = \frac{1}{\lambda} f(x)$ for all $x \in [0,1]$. The general solution to this differential equation is $f(x) = C e^{x/\lambda}$, for a constant $C \in \mathbb{C}$. However, $f(0) = \frac{1}{\lambda} Vf(0) = 0$, and so $C = 0$. Therefore, $f = 0$, and so no $\lambda \ne 0$ is an eigenvalue either.

Despite the previous example, we can recover some theory from the finite-dimensional case. In particular, we will be able to prove a version of the Rank-Nullity Theorem. In order to articulate the theorem, we will need some preparation.

Definition 6.28 Suppose $L$ is a bounded linear operator. If $K$ is a compact operator, then the operator $L - K$ is called a compact perturbation of $L$ or a perturbation of $L$ by a compact operator.

For the moment, we will be interested in $L = \lambda I$, where $\lambda$ is a nonzero scalar, and $I : X \to X$ is the identity operator on the Banach space $X$. We refer to the map $L = \lambda I$ as a scaling of the identity.

In what follows, we let $K : X \to X$ be a compact operator and define $T = \lambda I - K$, so that $T$ is a compact perturbation of a scaling of the identity. Observe that $\ker T = \{x : Kx = \lambda x\}$, and so $x \in \ker T$ if and only if $x$ is an eigenvector of $K$ with eigenvalue $\lambda$ (when $x \ne 0$).

Lemma 6.29 Let $T = \lambda I - K$, where $\lambda$ is a nonzero scalar and $K$ is a compact operator on a Banach space $X$. The set $\ker T$ is finite-dimensional.

Proof On $\ker T$, we have the equality $K = \lambda I$. This implies that $I|_{\ker T}$ is a compact operator, and so $\dim(\ker T) < \infty$. (See Example 6.18.) □

Lemma 6.30 Let $T = \lambda I - K$, where $\lambda$ is a nonzero scalar and $K$ is a compact operator on a Banach space $X$. The range of $T$ is a closed subspace of $X$.

Proof Recall that the range of $T$ is denoted by $\operatorname{ran} T$. Let $y \in \overline{\operatorname{ran} T}$. Pick $(x_n)_{n=1}^\infty$ in $X$ such that $T x_n \to y$ as $n \to \infty$. For each $n \in \mathbb{N}$, let

\[ \alpha_n = d(x_n, \ker T) = \inf\{ \|x_n - z\| : z \in \ker T \}. \]

Then there exists for each $n \in \mathbb{N}$ some $z_n \in \ker T$ such that $\|x_n - z_n\| \le 2\alpha_n$. Let $x_n' = x_n - z_n$. Then $\|x_n'\| \le 2\alpha_n$. By the definition of $x_n'$, we have

\[ d(x_n', \ker T) = d(x_n, \ker T) = \alpha_n. \]

Furthermore, $T x_n' = T x_n \to y$ as $n \to \infty$.

We claim the sequence $(\alpha_n)_{n=1}^\infty$ is bounded. To prove this, assume that $\alpha_n \to \infty$ as $n \to \infty$. Without loss of generality, we may assume that $\alpha_n > 0$ for all $n \in \mathbb{N}$. Then we may divide by $\alpha_n$, and hence

\[ \lim_{n \to \infty} \left\| \frac{T x_n'}{\alpha_n} \right\| = 0 \quad \text{and} \quad d\!\left( \frac{x_n'}{\alpha_n}, \ker T \right) = 1. \tag{6.9} \]

Furthermore, $\|x_n'/\alpha_n\| \le 2$. Consequently, since $K$ is a compact operator, there is a subsequence $(x_{n_k}'/\alpha_{n_k})_{k=1}^\infty$ such that $\lim_{k \to \infty} K(x_{n_k}'/\alpha_{n_k})$ exists. Call this limit $\nu$. Recalling that $T = \lambda I - K$, we see that

\[ \lambda \cdot \frac{x_{n_k}'}{\alpha_{n_k}} = T\!\left( \frac{x_{n_k}'}{\alpha_{n_k}} \right) + K\!\left( \frac{x_{n_k}'}{\alpha_{n_k}} \right). \]

Letting $k \to \infty$, and using the limit in (6.9), we see that $\lambda x_{n_k}'/\alpha_{n_k} \to \nu$ as $k \to \infty$. But this implies that $\nu \in \ker T$:

\[ T(\nu) = \lim_{k \to \infty} \lambda\, T\!\left( \frac{x_{n_k}'}{\alpha_{n_k}} \right) = 0. \]

If $\nu \in \ker T$, then $\lim_{k \to \infty} d(x_{n_k}'/\alpha_{n_k}, \ker T) = 0$, which contradicts the second equality in (6.9). It follows that $\alpha_n \not\to \infty$ as $n \to \infty$. The same argument shows that no subsequence of $(\alpha_n)_{n=1}^\infty$ can tend to infinity, and so there must be some $C > 0$ such that $|\alpha_n| \le C$ for all $n \in \mathbb{N}$.

Once again, we invoke the compactness of the operator $K$. We know $\|x_n'\| \le 2C$ for all $n \in \mathbb{N}$. Thus, there is a convergent subsequence $(K(x_{n_k}'))_{k=1}^\infty$ of $(K(x_n'))_{n=1}^\infty$. Suppose $\lim_{k \to \infty} K(x_{n_k}') = \nu$. Then

\[ \lambda x_{n_k}' = T(x_{n_k}') + K(x_{n_k}') \xrightarrow[k \to \infty]{} y + \nu. \]

Miraculously,

\[ T\!\left( \frac{y + \nu}{\lambda} \right) = \lim_{k \to \infty} T(x_{n_k}') = y, \]

and hence $y \in \operatorname{ran} T$. Our assumption was that $y \in \overline{\operatorname{ran} T}$, and thus we conclude that the range of $T$ is closed. □

Lemma 6.31 Let $T = \lambda I - K$, where $\lambda$ is a nonzero scalar and $K$ is a compact operator on a Banach space $X$. The quotient $X/\operatorname{ran} T$ is finite-dimensional.

Proof By Proposition 3.51, we have that

(X/ranT )∗ = (ranT )⊥ ={x∗ ∈ X∗ : x∗(Tx) = 0 for all x ∈ X}.

The statement that x∗(Tx) = 0 for all x ∈ X is equivalent to the statement that (T ∗x∗)(x) = 0 for all x ∈ X. But this means T ∗x∗ = 0, and hence x∗ ∈ ker(T ∗). Thus, (X/ranT )∗ = ker(T ∗). Since T = λI − K, it follows that T ∗ = λI ∗ − K∗. Therefore, by Theorem 6.26 (Schauder’s Theorem) and Lemma 6.29, the set ker(T ∗) is finite-dimensional. Hence,

dim(X/ranT )∗ = dim(ker(T ∗)) < ∞.

The result follows, since $\dim(X/\operatorname{ran} T) = \dim(X/\operatorname{ran} T)^*$. (See Exercise 6.11.) □

From the preceding proof, we extract a corollary that is of independent interest.

Corollary 6.32 If $X$ is a Banach space and $T$ is a compact perturbation of a scaling of the identity, then $\dim(X/\operatorname{ran} T) = \dim(\ker(T^*))$.

We are now ready to state (although not prove) our version of the Rank-Nullity Theorem for compact operators. We will see that it is really a statement about compact perturbations of scalings of the identity.

Theorem 6.33 (Rank-Nullity Theorem) Let $X$ be a Banach space with identity operator $I$. If $T = \lambda I - K$, where $\lambda$ is a nonzero scalar and $K$ is a compact operator on $X$, then $\dim(\ker T) = \dim(X/\operatorname{ran} T)$.

Before we can prove this theorem, we require a few more technical results. The first of these, known as Riesz's Lemma, asserts that any closed proper subspace of a Banach space is always a prescribed distance away from some portion of the unit sphere $\partial B_X$.

Lemma 6.34 (Riesz's Lemma) Suppose $X$ is a Banach space and $E$ is a proper closed subspace of $X$. Given any $\varepsilon > 0$, there exists some $x \in \partial B_X$ such that $d(x, E) > 1 - \varepsilon$. Furthermore, if $E$ is finite-dimensional, then $x \in \partial B_X$ can be chosen so that $d(x, E) = 1$.

Proof Denote the metric on $X$ by $d$. By assumption, $E$ is a proper closed subspace of $X$, and so there exists some $u \in X \setminus E$ with $d(u, E) > 0$. It follows that, for any $\delta > 0$, there exists some $e \in E$ such that $d(u, E) \le \|u - e\| < (1 + \delta)\, d(u, E)$. Choose $\delta > 0$ small enough that $1/(1+\delta) > 1 - \varepsilon$. If we let $x = \frac{u - e}{\|u - e\|}$, then

\[ d(x, E) = \frac{1}{\|u - e\|}\, d(u - e, E) = \frac{1}{\|u - e\|}\, d(u, E) > \frac{1}{1 + \delta} > 1 - \varepsilon, \]

as required.

Now assume $E$ is finite-dimensional. For each $n \in \mathbb{N}$, pick $e_n \in E$ such that

\[ \|u - e_n\| < \left( 1 + \frac{1}{n} \right) d(u, E). \]

Note that $0 \in E$ (because $E$ is a subspace of $X$), and so $d(u, E) \le \|u\|$. Thus, for each $n \in \mathbb{N}$, we have

\[ \|e_n\| \le \|u\| + 2\, d(u, E) \le 3\, \|u\|. \]

Therefore, the set $\{e_n : n \in \mathbb{N}\}$ is bounded in norm. Since $E$ is finite-dimensional, the Heine–Borel property asserts the existence of a convergent subsequence, say $(e_{n_k})_{k=1}^\infty$. Then there is some $e \in E$ such that $\lim_{k \to \infty} e_{n_k} = e$. By construction, it follows that $\|u - e\| = d(u, E)$. If $x = \frac{u - e}{\|u - e\|}$, then $x \in \partial B_X$ and $d(x, E) = 1$. □

Remark 6.35 If $E$ is a closed proper subspace of a reflexive Banach space $X$, then we can always find an $x \in X$ with $\|x\| = 1$ such that $d(x, E) = 1$, even if $E$ is infinite-dimensional. (See Exercise 6.6.)

Lemma 6.36 Let $X$ be a Banach space and suppose $K : X \to X$ is a compact operator. Let $\delta > 0$ be given and suppose $(\lambda_n)_{n=1}^\infty$ is a sequence of scalars (either real or complex) such that $|\lambda_n| \ge \delta > 0$ for all $n \in \mathbb{N}$. If

\[ E_n = \ker\big( (\lambda_1 I - K)(\lambda_2 I - K) \cdots (\lambda_n I - K) \big), \qquad n \in \mathbb{N}, \]

then each $E_n$ is finite-dimensional and there exists some $N \in \mathbb{N}$ such that $E_m = E_N$ for all $m \ge N$.

Proof Denote the metric on $X$ by $d$. Fix an $n \in \mathbb{N}$. Expanding, we have

\[ (\lambda_1 I - K)(\lambda_2 I - K) \cdots (\lambda_n I - K) = \lambda_1 \cdots \lambda_n I - \hat{K}, \]

where $\hat{K}$ is some compact operator. Consequently, the product is a compact perturbation of a scaling of the identity. Therefore $\dim E_n < \infty$, by Lemma 6.29.

By definition, the sets $(E_n)_{n=1}^\infty$ form a nested sequence $E_1 \subseteq E_2 \subseteq \cdots$. We wish to show that the inclusions eventually become equality. Define a subset $\mathcal{A}$ of the natural numbers to be $\mathcal{A} = \{n : E_n \ne E_{n-1}\}$. Let $n \in \mathcal{A}$. By Riesz's Lemma (Lemma 6.34), there exists some $x_n \in E_n$ such that $\|x_n\| = d(x_n, E_{n-1}) = 1$. Since $x_n \in E_n$, we have that

\[ (\lambda_1 I - K)(\lambda_2 I - K) \cdots (\lambda_n I - K)(x_n) = 0, \]

and so $(\lambda_n I - K)(x_n) \in E_{n-1}$. Then $K x_n \in \lambda_n x_n + E_{n-1}$. Consequently,

\[ d(K x_n, E_{n-1}) = |\lambda_n|\, d(x_n, E_{n-1}) \ge \delta. \]

If $m \in \mathcal{A}$ is chosen such that $m < n$, then $x_m \in E_m \subseteq E_{n-1}$, and so $K x_m \in E_{n-1}$ (because $K$ commutes with each factor $\lambda_j I - K$). Therefore,

\[ \|K x_n - K x_m\| = d(K x_n, K x_m) \ge \delta. \]

If $\mathcal{A}$ is infinite, then we can find a sequence $(x_n)_{n=1}^\infty$ in $B_X$ such that $\|K x_m - K x_n\| \ge \delta$ for all $m \ne n$. This contradicts the assumption that $K$ is a compact operator, because we should be able to find a convergent subsequence, but all the terms are at least $\delta$ apart. Therefore, the set $\mathcal{A}$ must be finite. The result follows. □

Theorem 6.37 Let $K$ be a compact operator and let $\Lambda$ be the set of nonzero eigenvalues of $K$. Then one of the following three things must be true:
(i) $\Lambda = \emptyset$,
(ii) $\Lambda$ is finite, or
(iii) $\Lambda = (\lambda_n)_{n=1}^\infty$, where $\lim_{n \to \infty} |\lambda_n| = 0$.

Proof We will show that the set $\Lambda \cap \{z \in \mathbb{C} : |z| \ge r\}$ is finite for any $r > 0$. Assume the set is infinite. Then we can find a sequence $(\mu_k)_{k=1}^\infty$ in $\Lambda \cap \{z \in \mathbb{C} : |z| \ge r\}$ such that each element in the sequence is distinct. For each $n \in \mathbb{N}$, define the nested sets

\[ E_n = \ker\big( (\mu_1 I - K) \cdots (\mu_n I - K) \big). \]

Since the $\mu_k$ are distinct eigenvalues, we conclude that $E_1 \subset E_2 \subset E_3 \subset \cdots$, where the inclusions are proper. But $|\mu_k| \ge r$ for all $k \in \mathbb{N}$, and so, by Lemma 6.36, there exists some $N \in \mathbb{N}$ such that $E_n = E_N$ for all $n \ge N$. We have arrived at a contradiction. Therefore, for each $r > 0$, there are only finitely many eigenvalues inside the set $\{z \in \mathbb{C} : |z| \ge r\}$. It follows that either the set of eigenvalues is finite or, if there are infinitely many eigenvalues, they must accumulate to 0. □

Finally, we are ready to prove the Rank-Nullity Theorem for compact operators.

Proof of Theorem 6.33 Let us begin by recalling the setup. We start with a Banach space $X$ and a compact operator $K : X \to X$. Let $T = \lambda I - K$, where $\lambda \ne 0$.

By Lemma 6.36, there exists an $N_1 \in \mathbb{N}$ such that $\ker(T^m) = \ker(T^{N_1})$ for all $m \ge N_1$. (In the lemma, let $\lambda_n = \lambda$ for all $n \in \mathbb{N}$.) Similarly, there exists an $N_2 \in \mathbb{N}$ such that $\ker((T^*)^m) = \ker((T^*)^{N_2})$ for all $m \ge N_2$. From the latter equality, we deduce that $\operatorname{ran}(T^m) = \operatorname{ran}(T^{N_2})$ for all $m \ge N_2$. (See Exercise 6.12.)

Let $N = \max\{N_1, N_2\}$. For ease of notation, let $W = \ker(T^N)$ and $V = \operatorname{ran}(T^N)$. Observe that $T^N = \lambda^N I - \hat{K}$, for some compact operator $\hat{K}$. Thus, $W$ is finite-dimensional (by Lemma 6.29) and $V$ is closed (by Lemma 6.30). Since $V$ is a closed subspace of a Banach space, it too is a Banach space. Notice that $T(W) \subseteq W$ and $T(V) \subseteq V$. (That is, the spaces $W$ and $V$ are invariant under $T$.) We recall that $T|_V$ is the restriction of the map $T$ to the subspace $V$.

Claim 1. The map T |V : V → V is one-to-one. Suppose v ∈ V is such that T v = 0. Since V = ran(T N ), there exists some u ∈ X such that v = T N u. Then

0 = T v = T (T N u) = T N+1u.

It follows that u ∈ ker(T N+1) = ker(T N ), and hence T N u = 0. Our assumption was N that v = T u, and consequently v = 0. Thus, the map T |V : V → V is one-to-one.

Claim 2. The map T |V : V → V is onto. Let v ∈ V . Since V = ran(T N ) = ran(T N+1), there is some u ∈ X such that

v = T N+1u = T (T N u) ∈ T (V ).

Thus, there is some $v' \in V$ such that $v = T(v')$. Consequently, the map $T|_V$ is onto.

We have demonstrated that $T|_V : V \to V$ is a bounded linear bijection on the Banach space $V$. Thus, by the Bounded Inverse Theorem (Corollary 4.30), there is a bounded linear operator $S : V \to V$ such that $S\, T|_V = T|_V\, S = I|_V$. (That is, $T$ has a bounded linear inverse on $V$.)

Now, define a map P : X → V by

Px = SN T N x, x ∈ X.

If $x \in V$, then $Px = x$, and so $P$ is a projection onto $V$. Observe also that $Px = 0$ if and only if $T^N x = 0$ (because $S$ is an isomorphism on $V$). Consequently, we have that $Px = 0$ if and only if $x \in \ker(T^N) = W$. We conclude that $\ker P = W$. Therefore, by Theorem 4.42, $X = V \oplus W$.

Since $X = V \oplus W$ and $T|_V$ is invertible, we deduce that $\ker T = \ker T|_W$. Additionally, since $X = V \oplus W$, we have $X/\operatorname{ran} T = W/\operatorname{ran} T|_W$. We know that $W$ is finite-dimensional, and so we may apply the finite-dimensional Rank-Nullity Theorem to $T|_W$. Hence,

\[ \dim(\ker T|_W) + \dim(\operatorname{ran} T|_W) = \dim W. \]

Consequently, $\dim(\ker T|_W) = \dim(W/\operatorname{ran} T|_W)$. Making the necessary substitutions, the result follows. □
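The stabilization of the kernels $\ker(T^m)$ used in the proof can be observed in a finite-dimensional toy case, where every operator is compact. The sketch below is illustrative only: the matrix `K`, the scalar `lam`, and the helpers `rank`, `matmul`, and `transpose` are assumptions chosen for this example, and the transpose stands in for the adjoint $T^*$ (as in Corollary 6.32).

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        # Look for a nonzero pivot in column c at or below row r.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Hypothetical data: T = lam*I - K has a 2x2 nilpotent block and an invertible part.
lam = 2
K = [[2, -1, 0], [0, 2, 0], [0, 0, 1]]
n = len(K)
T = [[(lam if i == j else 0) - K[i][j] for j in range(n)] for i in range(n)]

# dim ker(T^m) for m = 1, 2, 3, 4: the chain of kernels stops growing.
nullities = []
power = [[int(i == j) for j in range(n)] for i in range(n)]
for m in range(1, 5):
    power = matmul(power, T)
    nullities.append(n - rank(power))

dim_ker = n - rank(T)                      # dim ker T
dim_ker_adjoint = n - rank(transpose(T))   # dim ker T*, which equals dim(X / ran T)
print(nullities, dim_ker, dim_ker_adjoint)
```

In this example the kernels stabilize from $N = 2$ onward, so $W = \ker(T^2)$ is two-dimensional and $V = \operatorname{ran}(T^2)$ is the complementary one-dimensional subspace on which $T$ is invertible.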

Exercises

Exercise 6.1 Let 1 ≤ q<∞ and let S : q → q be the left shift operator given by = ∞ ∈ S(η1, η2, η3, ... ) (η2, η3, η4, ... ), (ηk)k=1 q . Show that λ is an eigenvalue for S if and only if |λ| < 1. What are the eigenvalues if q =∞? Exercise 6.2 Suppose X and Y are Banach spaces and T : X → Y is a bounded linear operator. If X is reflexive, show that T (BX) is closed in Y . Exercise 6.3 Suppose X and Y are Banach spaces and T : X → Y is a bounded linear operator. Show that T ∗ :(Y ∗, w∗) → (X∗, ·) is continuous only if T ∗ has finite rank. Exercise 6.4 Suppose X and Y are Banach spaces and T : X → Y is a compact ∞ → operator. If (xn)n=1 is a sequence in X such that xn 0 weakly, then show that lim T (xn)=0. n→∞ Exercise 6.5 Show that the converse to Exercise 6.4 is true if X is a reflexive Ba- nach space. That is, if X and Y are Banach spaces such that X is reflexive, and lim T (xn)=0 whenever xn → 0 weakly, show that T is a compact operator. n→∞ Exercise 6.6 Let X be a reflexive Banach space and let E be an infinite-dimensional closed proper subspace of X. Show there exists an x ∈ X with x=1 such that Exercises 149 d(x, E) = inf{x − e : e ∈ E}=1. (Hint: Find a weak limit point of the sequence ∞ (en)n=1 in the proof of Lemma 6.34.) Exercise 6.7 Let X be a reflexive Banach space. Show that if T : X → X is a compact operator, then there exists an element x ∈ X with x=1 such that Tx=T . (Hint: Use Theorem 5.39.) Exercise 6.8 Suppose X and Y are Banach spaces and T : X → Y is a compact operator. If T (X) is dense in Y , then show that Y is separable. Exercise 6.9 Suppose X and Y are Banach spaces and T : X → Y is a compact operator. If T (X) = Y , then show that Y has finite dimension. Exercise 6.10 Suppose X and Y are Banach spaces and T : X → Y is a bounded linear operator. If T (X) is a closed infinite-dimensional subset of Y ,isT a compact operator? Explain your answer. Exercise 6.11 Let E be a Banach space. 
Show that E is infinite-dimensional if and only if E∗ is infinite-dimensional. Also, show that dim(E) = dim(E∗)ifE is finite-dimensional. (Hint: Use the Hahn–Banach Theorem.) Exercise 6.12 Let X be a Banach space and suppose K : X → X is a compact operator. Let λ = 0 and let T = λI − K be a compact perturbation of a scaling of the identity. (a) Show there exists an N ∈ N such that ker((T ∗)m) = ker((T ∗)N ) for all m ≥ N. (b) Use (a) to show that ran(T m) = ran(T N ) for all m ≥ N, where N is the natural number from (a). (Hint: Use Exercise 5.20.) Exercise 6.13 Suppose that T : C[0, 1] → C[0, 1] is a compact operator. Show that ∞ there exists a sequence (Tn) = of finite rank operators such that lim T − Tn=0. n 1 n→∞

Exercise 6.14 Define a map TK : C[0, 1] → C[0, 1] by

\[ T_K f(x) = \int_0^1 \sin\big( (x - y)\pi \big)\, f(y)\, dy, \qquad f \in C[0,1],\ x \in [0,1]. \]

Show that TK is a compact operator. Compute ker(TK ) and ran(TK ). Exercise 6.15 Let X and Y be Banach spaces and suppose T : X → Y is a bounded linear operator. The operator T is called weakly compact if the set T (BX) is relatively compact in the weak topology on Y . Show that T is weakly compact if and only if T ∗∗(X∗∗) ⊆ Y . Exercise 6.16 (Gantmacher’s Theorem) Prove the following: If T is weakly compact, then T ∗ is weakly compact. Exercise 6.17 Let X and Y be Banach spaces. Show that if S and T in L(X, Y ) are weakly compact operators, then S + T is a weakly compact operator. 150 6 Compact Operators and Fredholm Theory

Exercise 6.18 Assume that W, X, Y , and Z are Banach spaces and suppose that the map T : X → Y is a weakly compact operator. Show that if A : W → X and B : Y → Z are bounded linear operators, then BTA : W → Z is weakly compact. Exercise 6.19 Prove the following theorems of Pettis:

(a) If X is reflexive and T : X → 1 is a bounded linear operator, then T is compact. (b) If X is reflexive and S : c0 → X is a bounded linear operator, then S is compact. Chapter 7 Hilbert Space Theory

In this chapter, we will consider the spectral theory for compact hermitian operators on a Hilbert space.

7.1 Basics of Hilbert Spaces

Before we begin our discussion of linear operators on a Hilbert space, we recall the basic definitions and the important theorems we will use.

Definition 7.1 Let $H$ be a complex vector space. A complex inner product on $H$ is a map $(\cdot, \cdot) : H \times H \to \mathbb{C}$ that satisfies the following properties:
(i) $(x, x) \ge 0$ for all $x \in H$, and $(x, x) = 0$ if and only if $x = 0$,
(ii) $(x, y) = \overline{(y, x)}$,
(iii) $(\alpha x_1 + \beta x_2, y) = \alpha (x_1, y) + \beta (x_2, y)$, and
(iv) $(x, \alpha y_1 + \beta y_2) = \overline{\alpha}\, (x, y_1) + \overline{\beta}\, (x, y_2)$,
where $\{x, x_1, x_2, y, y_1, y_2\} \subseteq H$ and $\{\alpha, \beta\} \subseteq \mathbb{C}$. When $H$ is equipped with a complex inner product, it is called a complex inner product space.

A map that satisfies (i) is said to be positive definite, while a map satisfying (ii) is said to be conjugate symmetric. When a map satisfies (iii) and (iv), it is called sesquilinear. (The term sesqui comes from the Latin for one and a half.) A real inner product space is similarly defined, except the scalars are real. (See Definition 3.19.) When the underlying vector space is real, a map satisfying (iii) and (iv) is called bilinear. We will assume $H$ is a complex inner product space, unless otherwise stated.

Let $H$ be a complex inner product space. Define a norm on $H$ by $\|x\| = \sqrt{(x, x)}$ for all $x \in H$. The positive-definiteness and homogeneity of $\|\cdot\|$ are clear (from the definition of the inner product). We will verify subadditivity in a moment, but first we need an important inequality.

Theorem 7.2 (Cauchy–Schwarz Inequality) If $x$ and $y$ are elements of an inner product space, then $|(x, y)| \le \|x\|\, \|y\|$.

© Springer Science+Business Media, LLC 2014. A. Bowers, N. J. Kalton, An Introductory Course in Functional Analysis, Universitext, DOI 10.1007/978-1-4939-1945-1_7

Proof Observe that $\|x + ty\|^2 \ge 0$ for all $t \in \mathbb{R}$. If $x + ty = 0$ for some $t \in \mathbb{R}$, then $\|x\| = |t|\, \|y\|$, and so

\[ |(x, y)| = |(-ty, y)| = |t|\, \|y\|^2 = \|x\|\, \|y\|. \]

Suppose now that $\|x + ty\|^2 > 0$ for all $t \in \mathbb{R}$. Expanding the norm as an inner product, we obtain

\[ (x, x) + t (x, y) + t (y, x) + t^2 (y, y) > 0. \]

Since $(y, x) = \overline{(x, y)}$, it follows that $(x, y) + (y, x) = 2\, \Re\big( (x, y) \big)$, and so

\[ \|x\|^2 + 2t\, \Re\big( (x, y) \big) + t^2 \|y\|^2 > 0, \tag{7.1} \]

for all $t \in \mathbb{R}$. The left side of (7.1) is a quadratic in $t$ with real coefficients that has no real zeros, and consequently the discriminant must be negative. Therefore,

\[ 4 \big[ \Re\big( (x, y) \big) \big]^2 - 4 \|x\|^2 \|y\|^2 < 0. \]

It follows that $|\Re((x, y))| \le \|x\|\, \|y\|$. This is true for all $x$ and $y$ in $H$. By assumption, $(x, y) \in \mathbb{C}$. Choose $\theta \in [0, 2\pi)$ so that $e^{i\theta}(x, y) \in \mathbb{R}$. Then

\[ |(x, y)| = |e^{i\theta}(x, y)| = \big| \Re\big( (e^{i\theta} x, y) \big) \big| \le \|e^{i\theta} x\|\, \|y\| = \|x\|\, \|y\|, \]

as required. □

In the case of an inner product space, subadditivity of $\|\cdot\|$ follows from the next inequality, from which we conclude that $\|\cdot\|$ is a norm.

Theorem 7.3 (Minkowski's Inequality) If $x$ and $y$ are elements of an inner product space, then $\|x + y\| \le \|x\| + \|y\|$.

Proof Observe that

\[ \|x + y\|^2 = (x + y, x + y) = \|x\|^2 + (x, y) + (y, x) + \|y\|^2. \]

By the Cauchy–Schwarz Inequality, $|(x, y)| \le \|x\|\, \|y\|$, and so

\[ \|x + y\|^2 \le \|x\|^2 + 2\, \|x\|\, \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2. \]

The result follows by taking square roots. □
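Both inequalities can be spot-checked numerically in $\mathbb{C}^3$ with the standard inner product. This is only a sanity check on one pair of vectors, not a proof; the vectors `x` and `y` and the helper names `inner` and `norm` are hypothetical choices for this sketch.

```python
import math

def inner(x, y):
    """Inner product on C^n: linear in the first slot, conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    """Norm induced by the inner product; (x, x) is real and nonnegative."""
    return math.sqrt(inner(x, x).real)

# Hypothetical sample vectors.
x = [1 + 2j, -1j, 3.0 + 0j]
y = [2 - 1j, 1 + 1j, -0.5j]

cauchy_schwarz_lhs = abs(inner(x, y))
cauchy_schwarz_rhs = norm(x) * norm(y)

x_plus_y = [a + b for a, b in zip(x, y)]
minkowski_lhs = norm(x_plus_y)
minkowski_rhs = norm(x) + norm(y)

print(cauchy_schwarz_lhs <= cauchy_schwarz_rhs)
print(minkowski_lhs <= minkowski_rhs)
```

The conjugate in `inner` matters: dropping it would break conjugate symmetry, and `inner(x, x)` would no longer be guaranteed real.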

Definition 7.4 A Hilbert space is an inner product space $H$ such that $(H, \|\cdot\|)$ is a Banach space, where $\|x\| = \sqrt{(x, x)}$ for all $x \in H$.

We recall that the norm on $H$ is said to be induced by the inner product on $H$. (See Definitions 3.19 and 3.20 and the comments in between.)

Example 7.5 The classical examples of Hilbert space are the sequence space $\ell_2$ and the function space $L_2(0,1)$. The inner product on $\ell_2$ is given by

\[ (x, y) = \sum_{i=1}^\infty x_i\, \overline{y_i}, \qquad \{x, y\} \subseteq \ell_2, \]

where $x = (x_i)_{i=1}^\infty$ and $y = (y_i)_{i=1}^\infty$. The inner product on $L_2(0,1)$ is given by

\[ (f, g) = \int_0^1 f(s)\, \overline{g(s)}\, ds, \qquad \{f, g\} \subseteq L_2(0,1). \]

It is not hard to check that $(\cdot, \cdot)$ is a complex inner product in each of these cases. We will see that these two examples are (in some sense) typical.

Definition 7.6 Let $H$ be an inner product space. Two elements $x$ and $y$ in $H$ are said to be orthogonal if $(x, y) = 0$. If $E$ is a closed subspace of $H$, then $x$ is said to be orthogonal to $E$ if $(x, e) = 0$ for all $e \in E$. We will use the notation $x \perp y$ to indicate $x$ is orthogonal to $y$, and $x \perp E$ to indicate $x$ is orthogonal to $E$.

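Example 7.7 Suppose $H = \ell_2$, the space of all square-summable sequences. For each $n \in \mathbb{N}$, let $e_n$ denote the sequence with 1 in the $n$th coordinate, and zero everywhere else. Then $e_n \perp e_m$ for all $n \ne m$. Furthermore, if $E_n = \overline{\operatorname{span}}\{e_n, e_{n+1}, \ldots\}$, then $e_m \perp E_n$ whenever $m < n$.

Recall the Pythagorean Theorem: if $x \perp y$, then $\|x + y\|^2 = \|x\|^2 + \|y\|^2$. Indeed, since $(x, y) = 0$,

\[ \|x + y\|^2 = \|x\|^2 + 2\, \Re\big( (x, y) \big) + \|y\|^2 = \|x\|^2 + \|y\|^2, \]

as required. □

Theorem 7.10 (Parallelogram Law) If $x$ and $y$ are elements of an inner product space, then

\[ \|x + y\|^2 + \|x - y\|^2 = 2 \left( \|x\|^2 + \|y\|^2 \right). \]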

Proof We merely need to expand the expression on the left, and the result follows directly. (See Exercise 3.1.) □

In fact, the Parallelogram Law characterizes inner product spaces. That is, if $H$ is a normed vector space which satisfies the Parallelogram Law, then there exists a unique inner product that gives rise to the norm. If $H$ is a real normed space, the inner product is given by

\[ (x, y) = \frac{1}{4} \left( \|x + y\|^2 - \|x - y\|^2 \right). \]

In the case of a complex normed space,

\[ (x, y) = \frac{1}{4} \left( \|x + y\|^2 - \|x - y\|^2 + i\, \|x + iy\|^2 - i\, \|x - iy\|^2 \right). \]

These are known as the polarization formulas. We leave it to the interested reader to verify that these formulas, along with the Parallelogram Law, determine an inner product that agrees with the given norm. (See Exercise 7.25.) One consequence of the Parallelogram Law is that any Hilbert space is uniformly convex. (See Exercise 7.25.)

Lemma 7.11 (Closest Point Lemma) Let $H$ be a Hilbert space and suppose $C$ is a nonempty closed convex subset of $H$. Given $x \in H$, there is a unique point $y \in C$ such that

\[ \|x - y\| = d(x, C) = \inf_{z \in C} \|x - z\|. \]

Proof Let $\delta = d(x, C)$. For each $n \in \mathbb{N}$, pick $y_n \in C$ such that

\[ \delta \le \|x - y_n\| \le \delta + \frac{1}{n}. \tag{7.2} \]

This we can do, because $\delta = d(x, C)$ is an infimum. By the Parallelogram Law,

\[ \|x - y_n\|^2 + \|x - y_m\|^2 = 2 \left( \left\| x - \frac{y_m + y_n}{2} \right\|^2 + \left\| \frac{y_m - y_n}{2} \right\|^2 \right). \]

Because $C$ is convex, $\frac{y_m + y_n}{2}$ is in $C$. Therefore, $\left\| x - \frac{y_m + y_n}{2} \right\| \ge \delta$, and so

\[ \|x - y_n\|^2 + \|x - y_m\|^2 \ge 2 \left( \delta^2 + \frac{1}{4} \|y_m - y_n\|^2 \right). \]

On the other hand, by (7.2),

\[ \|x - y_n\|^2 + \|x - y_m\|^2 \le \left( \delta^2 + \frac{2\delta}{n} + \frac{1}{n^2} \right) + \left( \delta^2 + \frac{2\delta}{m} + \frac{1}{m^2} \right). \]

Combining these inequalities, we conclude

\[ \|y_m - y_n\|^2 \le 4\delta \left( \frac{1}{n} + \frac{1}{m} \right) + 2 \left( \frac{1}{n^2} + \frac{1}{m^2} \right). \]

The right side of this inequality tends to zero as $m$ and $n$ approach $\infty$. Consequently, $(y_n)_{n=1}^\infty$ is a Cauchy sequence. Thus, there exists some $y \in C$ (because $C$ is closed) such that $y = \lim_{n \to \infty} y_n$. It then follows from (7.2) that $\|x - y\| = \delta$.

To show uniqueness, suppose $y$ and $y'$ are in $C$ such that $\|x - y\| = \|x - y'\| = \delta$. Again using the Parallelogram Law,

\[ \|x - y\|^2 + \|x - y'\|^2 = 2 \left( \left\| x - \frac{y + y'}{2} \right\|^2 + \left\| \frac{y - y'}{2} \right\|^2 \right). \]

And so, by convexity (and the definition of $\delta$),

\[ \|x - y\|^2 + \|x - y'\|^2 \ge 2 \left( \delta^2 + \frac{1}{4} \|y - y'\|^2 \right). \]

Therefore,

\[ 2\delta^2 \ge 2\delta^2 + \frac{1}{2} \|y - y'\|^2, \]

and so $\|y - y'\| = 0$. Consequently, $y = y'$, as required. □

Remark 7.12 In a reflexive space, a nonempty closed convex subset has a closest point, but it might not be unique.

Proposition 7.13 Let $H$ be a Hilbert space with closed subspace $E$. If $x \in H$, then there exists a unique $y \in E$ such that $x - y \perp E$.

Proof Let $x \in H$ be given. Pick $y \in E$ to be the closest point to $x$, which exists (and is unique) by the Closest Point Lemma (Lemma 7.11). We will verify that $x - y \perp E$.

Suppose $e \in E$. Because $y$ was chosen to be the point in $E$ closest to $x$, it follows that $\|x - y + te\| \ge \|x - y\|$ for all $t \in \mathbb{R}$. Furthermore, since $y$ is the unique element of $E$ that is closest to $x$, we can conclude (when $e \ne 0$) that equality holds only when $t = 0$. Squaring both sides of the inequality, we obtain:

\[ \|x - y\|^2 + 2t\, \Re\big( (x - y, e) \big) + t^2 \|e\|^2 \ge \|x - y\|^2, \qquad t \in \mathbb{R}. \]

Consequently, 2t ((x − y, e)) + t2 e2 ≥ 0, t ∈ R. If t>0, then ((x − y, e)) > −t e2/2. Taking the limit as t → 0+, we conclude that ((x−y, e)) ≥ 0. Similarly, if t<0, we have ((x−y, e)) < |t|e2/2. Taking the limit as t → 0−, we see that ((x − y, e)) ≤ 0. Therefore, ((x − y, e)) = 0. We have established that ((x − y, e)) = 0 for all e ∈ E. Because E is a linear subspace, it follows that ie ∈ E. Hence, ((x − y, ie)) = 0 for all e ∈ E. However,

\[ \Re(x - y, ie) = \Re\bigl( -i\,(x - y, e) \bigr) = \Im(x - y, e), \]
because the inner product is conjugate-linear in its second argument, and so $\Im(x - y, e) = 0$ for all $e \in E$. We have thus demonstrated that $(x - y, e)$ has zero real and imaginary parts for all $e \in E$. Consequently, $(x - y, e) = 0$ for all $e \in E$. Therefore, $x - y \perp E$, as required.

It remains to show that $y$ is the unique element of $E$ such that $x - y \perp E$. Suppose $y' \in E$ also has the property that $x - y' \perp E$. Since $y - y' = (x - y') - (x - y)$, it follows that $y - y' \perp E$. However, $y - y' \in E$, and so $\|y - y'\|^2 = (y - y', y - y') = 0$. Thus, $y = y'$, as required. □

We have established that for any $x \in H$, and any closed subspace $E$ of $H$, there exists a unique $y \in E$ such that $x - y \perp E$. This motivates the next definition.

Definition 7.14 Let $H$ be a Hilbert space and let $E$ be a closed subspace of $H$. If $x \in H$, then the symbol $P_E x$ denotes the unique element of $E$ with the property $x - P_E x \perp E$.
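The definition can be checked numerically in a finite-dimensional setting. The following is a minimal sketch, assuming $H = \mathbb{R}^3$ with the usual dot product and a subspace $E$ spanned by two hypothetical vectors. It uses the standard formula $P_E x = \sum_j (x, e_j)\, e_j$ for an orthonormal basis $(e_j)$ of $E$, which is not derived at this point in the text; the helper names `inner`, `gram_schmidt`, and `project` are illustrative only.

```python
# Sketch: orthogonal projection onto a subspace E of R^3 (assumed setting).

def inner(x, y):
    """Real inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list spanning the same subspace as `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = inner(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = inner(w, w) ** 0.5
        if norm > tol:  # skip vectors that are (numerically) dependent
            basis.append([wi / norm for wi in w])
    return basis

def project(x, spanning_vectors):
    """Compute P_E x as the sum of (x, e_j) e_j over an orthonormal basis of E."""
    p = [0.0] * len(x)
    for e in gram_schmidt(spanning_vectors):
        c = inner(x, e)
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p

E = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]  # hypothetical spanning set for E
x = [1.0, 2.0, 4.0]                     # hypothetical point of H
Px = project(x, E)
residual = [xi - pi for xi, pi in zip(x, Px)]  # x - P_E x, orthogonal to E
```

Here `residual` is orthogonal to both spanning vectors of $E$, which is the defining property of $P_E x$, and $\|P_E x\| \leq \|x\|$.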

Lemma 7.15 Let $H$ be a Hilbert space with closed subspace $E$. If $P_E : H \to H$ is defined by $P_E(x) = P_E x$ for all $x \in H$, then $P_E$ is a bounded linear projection with $\|P_E\| = 1$.

Proof First we show linearity. Let $x$ and $y$ be elements of $H$ and let $\alpha$ and $\beta$ be complex numbers. If $e \in E$, then
\[ \bigl( \alpha x + \beta y - (\alpha P_E x + \beta P_E y),\, e \bigr) = \alpha\,(x - P_E x, e) + \beta\,(y - P_E y, e) = 0. \]

This remains true for all $e \in E$, and thus $\alpha P_E x + \beta P_E y$ is an element of $E$ such that $\alpha x + \beta y - (\alpha P_E x + \beta P_E y) \perp E$. However, $P_E(\alpha x + \beta y)$ is the unique element of $E$ with this property. Therefore, $P_E(\alpha x + \beta y) = \alpha P_E x + \beta P_E y$, and so $P_E$ is linear.

Now suppose $x \in H$. By definition, $P_E x \in E$ and so $x - P_E x \perp P_E x$. Thence, by the Pythagorean Theorem,
\[ \|x\|^2 = \|P_E x\|^2 + \|x - P_E x\|^2. \]

Consequently, we have that $\|P_E x\| \leq \|x\|$. Certainly, if $x \in E$, then $P_E x = x$, and so $\|P_E\| = 1$. The result follows. □

The next theorem, known as the Riesz–Fréchet Theorem (and sometimes the Riesz Representation Theorem for Hilbert Spaces), is one of the key results in Hilbert space theory.

Theorem 7.16 (Riesz–Fréchet Theorem) Let $H$ be a Hilbert space.
(i) If $v \in H$, and $\varphi_v(x) = (x, v)$ for all $x \in H$, then $\varphi_v \in H^*$ and $\|\varphi_v\| = \|v\|$.
(ii) If $\varphi \in H^*$, then there exists a unique $v \in H$ such that $\varphi(x) = (x, v)$ for all $x \in H$, and $\|\varphi\| = \|v\|$.

Proof We prove (i) first. Certainly $\varphi_v$ is linear. By the Cauchy-Schwarz Inequality, $|\varphi_v(x)| \leq \|x\|\,\|v\|$, and so $\|\varphi_v\| \leq \|v\|$. Since $\varphi_v(v) = (v, v) = \|v\|^2$, we conclude that $\|\varphi_v\| = \|v\|$.

Now we turn our attention to (ii). By assumption, $\varphi$ is a continuous linear functional, and consequently $E = \ker\varphi$ is a closed subspace of $H$. If $\varphi = 0$, then $v = 0$ satisfies the conclusions of (ii), and so we may assume $\varphi \neq 0$. Pick $u \notin E$. Then $\varphi(u) \neq 0$. Without loss of generality, assume $\varphi(u) = 1$.

Let $w = u - P_E u$ and suppose $x \in H$. The element $u$ was chosen so that $\varphi(u) = 1$. It follows that $\varphi(x - \varphi(x)u) = 0$, and so $x - \varphi(x)u \in E$. By the definition of $w$, we know that $w \perp E$, and hence $w$ is orthogonal to $x - \varphi(x)u$. Therefore,
\[ 0 = (x - \varphi(x)u, w) = (x, w) - \varphi(x)\,(u, w), \]

and so $(x, w) = \varphi(x)\,(u, w)$. Since $u = w + P_E u$ and $w \perp P_E u$, we have
\[ (u, w) = (w, w) + (P_E u, w) = \|w\|^2. \]

Thus, $(x, w) = \varphi(x)\,(u, w) = \varphi(x)\,\|w\|^2$. We assumed that $u \notin E$, and so $w \neq 0$. Let $v = w/\|w\|^2$. Then $\varphi(x) = (x, v)$ for all $x \in H$. In the notation of (i), we have found a $v \in H$ such that $\varphi = \varphi_v$. Therefore, using the conclusion of (i), we have $\|\varphi\| = \|v\|$. □

The significance of Theorem 7.16 is the identification of $H^*$ with $H$. When $H$ is a real Hilbert space, the identification with $H^*$ is an isometric isomorphism. When $H$ is complex, it remains an isometry, but the correspondence is given by what is sometimes called a conjugate isomorphism. Specifically, if $\varphi$ and $\psi$ in $H^*$ correspond to $u$ and $v$ in $H$ (respectively), then $\alpha\varphi + \beta\psi$ corresponds to $\bar{\alpha} u + \bar{\beta} v$. Sometimes this is called conjugate linearity or anti-isomorphism. Regardless of the underlying scalar field (real or complex), a consequence of the identification of $H$ with $H^*$ is that any Hilbert space is reflexive.
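As a small worked illustration of Theorem 7.16 and of the conjugate-linear correspondence, consider the finite-dimensional space $H = \mathbb{C}^2$ with its standard inner product. The functional $\varphi$ below is a hypothetical choice made only for this sketch.

```latex
% Assumed setting: H = C^2 with (x, v) = x_1 \overline{v_1} + x_2 \overline{v_2}.
\begin{align*}
\varphi(x) &= x_1 + i\,x_2 = x_1\,\overline{1} + x_2\,\overline{(-i)} = (x, v),
  \qquad v = (1, -i), \\
\|\varphi\| &= \|v\| = \sqrt{|1|^2 + |-i|^2} = \sqrt{2}, \\
(i\varphi)(x) &= i\,x_1 - x_2 = (x, v'),
  \qquad v' = (-i, -1) = \overline{i}\,v .
\end{align*}
```

The last line shows the conjugation at work: multiplying the functional by $i$ multiplies the representing vector by $\bar{i} = -i$.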

7.2 Operators on Hilbert Space

In this section, we discuss operators on Hilbert space. We focus our attention not on general operators, but on a special type of operator, called hermitian. Before giving the definition of a hermitian operator, we consider the following existence lemma.

Lemma 7.17 Let $H$ be a Hilbert space. If $T : H \to H$ is a bounded linear operator, then there exists a unique linear operator $T^* : H \to H$ such that
\[ (Tx, y) = (x, T^* y), \quad \{x, y\} \subseteq H, \]
and such that $\|T^*\| = \|T\|$.

Proof Provided we can show such a map exists, the uniqueness of $T^*$ is clear. If there is another such map $S : H \to H$ that satisfies the conclusion of the lemma, then $(x, Sy) = (x, T^* y)$ for all $\{x, y\} \subseteq H$. It follows that $Sy = T^* y$ for all $y \in H$, and so $S = T^*$. (See Example 7.8.)

We now show the existence of $T^*$. Let $y \in H$ and define $\varphi_y : H \to \mathbb{C}$ by
\[ \varphi_y(x) = (Tx, y), \quad x \in H. \]

Certainly, $\varphi_y$ is a linear functional and $\|\varphi_y\| \leq \|T\|\,\|y\|$, by the Cauchy-Schwarz Inequality. By the Riesz–Fréchet Theorem, there exists a unique element in $H$ (depending on $y$), which we call $T^* y$, such that $\varphi_y(x) = (x, T^* y)$ for all $x \in H$, and such that $\|\varphi_y\| = \|T^* y\|$. Define a map $T^* : H \to H$ by $T^*(y) = T^* y$ for all $y \in H$. By the uniqueness of $T^* y$ in $H$, the map $T^*$ is well defined, and (by construction) we have $(Tx, y) = (x, T^* y)$ for all $x$ and $y$ in $H$.

We now show that $T^*$ is linear. Let $y$ and $y'$ be elements in $H$ and let $\alpha$ and $\beta$ be complex numbers. Then for all $x \in H$,
\[ \bigl( x, T^*(\alpha y + \beta y') \bigr) = (Tx, \alpha y + \beta y') = \bar{\alpha}\,(Tx, y) + \bar{\beta}\,(Tx, y'). \]

Then, since $(Tx, y) = (x, T^* y)$ for all $x$ and $y$ in $H$,
\[ \bigl( x, T^*(\alpha y + \beta y') \bigr) = \bar{\alpha}\,(x, T^* y) + \bar{\beta}\,(x, T^* y') = (x, \alpha T^* y + \beta T^* y'). \]

This is true for all $x \in H$, and so $T^*(\alpha y + \beta y') = \alpha T^* y + \beta T^* y'$. Therefore, $T^*$ is linear.

To see that $T^*$ is bounded, let $y \in H$. Then, invoking the Riesz–Fréchet Theorem,
\[ \|T^* y\| = \sup_{x \in B_H} |(x, T^* y)| = \sup_{x \in B_H} |(Tx, y)| \leq \|T\|\,\|y\|, \]
where the last inequality comes from the Cauchy-Schwarz Inequality. Therefore, $\|T^*\| \leq \|T\|$, and so $T^*$ is bounded.

It remains to show that $\|T^*\| = \|T\|$. We have already established $\|T^*\| \leq \|T\|$, and so we need only show the reverse inequality. To that end, we repeat the above construction on $T^*$ to define an operator $T^{**} : H \to H$ with the property that
\[ (T^* x, y) = (x, T^{**} y), \quad \{x, y\} \subseteq H, \]
and such that $\|T^{**}\| \leq \|T^*\|$. It suffices then to show that $T^{**} = T$. Let $x$ and $y$ be in $H$. Then
\[ (x, T^{**} y) = (T^* x, y) = \overline{(y, T^* x)} = \overline{(Ty, x)} = (x, Ty). \]

Therefore, $(x, T^{**} y) = (x, Ty)$ for all $x$ and $y$ in $H$, and so $T = T^{**}$. This completes the proof. □

Definition 7.18 Let $H$ be a Hilbert space and suppose $T : H \to H$ is a bounded linear operator. The unique linear operator $T^* : H \to H$ such that $(Tx, y) = (x, T^* y)$ for all $x$ and $y$ in $H$ is called the Hilbert space adjoint of $T$.

The Hilbert space adjoint is analogous to the matrix adjoint from linear algebra, and retains many features of its finite-dimensional cousin. (See Exercise 7.2.)

We have now defined two adjoints for the operator $T : H \to H$, both denoted $T^*$. (See Definitions 3.36 and 7.18.) Recall that the (operator) adjoint of $T$ is the bounded linear map $T^* : H^* \to H^*$ defined by $(T^* x^*)(x) = x^*(Tx)$ for all $x \in H$ and $x^* \in H^*$. It may seem as though there is a risk of confusion, but there is a natural correspondence between the two, via the Riesz–Fréchet Theorem. (See Exercise 7.11.)

Definition 7.19 Let $H$ be a Hilbert space and suppose $T : H \to H$ is a bounded linear operator. If $T = T^*$, then $T$ is called a hermitian operator. A real hermitian operator is also known as a symmetric operator.
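In finite dimensions the Hilbert space adjoint is the conjugate transpose, and the defining identity $(Tx, y) = (x, T^* y)$ can be checked directly. The sketch below assumes $H = \mathbb{C}^2$ with the inner product linear in the first argument; the matrices `T` and `S`, the vectors `x` and `y`, and the helper names are hypothetical choices for illustration.

```python
# Sketch: the adjoint on C^n as the conjugate transpose (assumed setting).

def inner(x, y):
    """Inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(T, x):
    """Matrix-vector product T x."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

def adjoint(T):
    """Conjugate transpose of the matrix T."""
    return [[T[j][i].conjugate() for j in range(len(T))]
            for i in range(len(T[0]))]

T = [[1 + 2j, 3j], [4 + 0j, 5 - 1j]]   # hypothetical operator on C^2
x = [1 - 1j, 2 + 0j]
y = [0.5j, -1 + 3j]

lhs = inner(apply(T, x), y)            # (Tx, y)
rhs = inner(x, apply(adjoint(T), y))   # (x, T* y)

S = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]  # hypothetical hermitian matrix
is_hermitian = adjoint(S) == S            # S equals its own adjoint
```

The two values `lhs` and `rhs` agree up to rounding, and `S` is hermitian in the sense of Definition 7.19.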

Example 7.20 Suppose $T : L_2(0, 1) \to L_2(0, 1)$ is a bounded linear operator given by the formula
\[ Tf(x) = \int_0^1 K(x, y)\, f(y)\, dy, \quad f \in L_2(0, 1),\ x \in [0, 1], \]
where $K \in L_2([0, 1] \times [0, 1])$. We established that $T$ is well-defined in Section 6.1. Let $f$ and $g$ be functions in $L_2(0, 1)$. Then
\[ (Tf, g) = \int_0^1 \left( \int_0^1 K(x, y)\, f(y)\, dy \right) \overline{g(x)}\, dx. \]
By Fubini's Theorem, this equals
\[ \int_0^1 \left( \int_0^1 K(x, y)\, \overline{g(x)}\, dx \right) f(y)\, dy. \]
Since $T^*$ is the unique operator with the property that $(Tf, g) = (f, T^* g)$, we conclude that
\[ \overline{T^* g(y)} = \int_0^1 K(x, y)\, \overline{g(x)}\, dx. \]
Taking complex conjugates, and swapping the roles of $x$ and $y$, we obtain
\[ T^* g(x) = \int_0^1 \overline{K(y, x)}\, g(y)\, dy. \]
Now that we have obtained a formula for the Hilbert space adjoint $T^*$, we see that $T$ is hermitian if $K(x, y) = \overline{K(y, x)}$ for almost every $x$ and $y$. This is reminiscent of the $n$-dimensional case ($n < \infty$), where a matrix $(a_{jk})_{j,k=1}^n$ is called hermitian if $a_{jk} = \overline{a_{kj}}$ for each $j$ and $k$ in the set $\{1, \ldots, n\}$.

Definition 7.21 Let $I$ be a (possibly uncountable) index set. A subset $(e_j)_{j \in I}$ of $H$ is said to be orthonormal if $\|e_j\| = 1$ for all $j \in I$, and if $(e_j, e_k) = 0$ for all $j \neq k$.

We recall that the Kronecker delta is defined to be
\[ \delta_{jk} = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \neq k. \end{cases} \]

Using this notation, the subset $(e_j)_{j \in I}$ is called orthonormal if $(e_j, e_k) = \delta_{jk}$ for all indices $j$ and $k$ in $I$.

Observe that if $(e_j)_{j \in I}$ is an orthonormal set in $H$, then $\|e_j - e_k\| = \sqrt{2}$ whenever $j \neq k$ (by the Pythagorean Theorem). From this we conclude that any orthonormal subset of a separable Hilbert space is necessarily countable.

Lemma 7.22 Every orthonormal set is contained in a maximal orthonormal set.

Proof This follows from Zorn's Lemma. □

In what follows, we will restrict our attention to countable orthonormal sets, but much of what we say will remain true for uncountable sets, as well.

Theorem 7.23 (Bessel's Inequality) Suppose $(e_j)_{j=1}^N$ is a countable orthonormal set, where $N \in \mathbb{N} \cup \{\infty\}$.
(i) If $x \in H$, then $\sum_{j=1}^N |(x, e_j)|^2 \leq \|x\|^2$.
(ii) If $x \in H$ and $\sum_{j=1}^N |(x, e_j)|^2 = \|x\|^2$, then $\sum_{j=1}^N (x, e_j)\, e_j = x$.

Proof (i) Let $x \in H$. Choose $m \in \mathbb{N}$ such that $m \leq N$. Let $y = \sum_{j=1}^m (x, e_j)\, e_j$. For each $k \in \{1, \ldots, m\}$,
\[ (y, e_k) = \sum_{j=1}^m (x, e_j)\,(e_j, e_k) = \sum_{j=1}^m (x, e_j)\, \delta_{jk} = (x, e_k). \]

Therefore, $(y - x, e_k) = 0$ for all $k \in \{1, \ldots, m\}$. It follows that $(y - x, y) = 0$, and so $x - y \perp y$. Consequently,
\[ \|x\|^2 = \|x - y\|^2 + \|y\|^2, \tag{7.3} \]
by the Pythagorean Theorem. From this we conclude that $\|y\| \leq \|x\|$. Computing the norm of $y$:
\[ \|y\|^2 = (y, y) = \sum_{j=1}^m \sum_{k=1}^m (x, e_j)\, \overline{(x, e_k)}\, (e_j, e_k) = \sum_{j=1}^m |(x, e_j)|^2, \tag{7.4} \]
recalling that $(e_j, e_k) = \delta_{jk}$ for all indices $j$ and $k$. Therefore, $\sum_{j=1}^m |(x, e_j)|^2 \leq \|x\|^2$ for all $m \leq N$, when $m$ is finite.

In order to complete the proof of the first part of the theorem, we consider two cases. If $N \in \mathbb{N}$, then let $m = N$ and the proof is complete. If $N = \infty$, then
\[ \sum_{j=1}^\infty |(x, e_j)|^2 = \lim_{m \to \infty} \sum_{j=1}^m |(x, e_j)|^2 \leq \lim_{m \to \infty} \|x\|^2 = \|x\|^2. \]

This proves part (i) of the theorem.

(ii) If $N < \infty$, then the assumption, together with (7.4), implies that $\|y\| = \|x\|$. Thus, because of (7.3), we have $\|x - y\| = 0$, and hence $x = y$.

Now suppose that $N = \infty$. For any positive integers $m$ and $n$ such that $m < n$,
\[ \Bigl\| \sum_{j=m+1}^n (x, e_j)\, e_j \Bigr\|^2 = \sum_{j=m+1}^n |(x, e_j)|^2. \]
By assumption, the series $\sum_{j=1}^\infty |(x, e_j)|^2$ converges. Consequently, the sequence
\[ \Bigl( \sum_{j=1}^n (x, e_j)\, e_j \Bigr)_{n=1}^\infty \]
is a Cauchy sequence in the norm on $H$. By completeness, there exists a $u \in H$ such that $u = \sum_{j=1}^\infty (x, e_j)\, e_j$. By direct computation, we see that $(u, e_j) = (x, e_j)$ for all $j \in \mathbb{N}$. Thus, we have that $x - u \perp u$. Therefore, by the Pythagorean Theorem, we have that $\|x\|^2 = \|x - u\|^2 + \|u\|^2$. By assumption, $\|x\| = \|u\|$. It follows that $\|x - u\| = 0$, and so we conclude that $x = u$, as required. □

Definition 7.24 Let $H$ be a Hilbert space. An orthonormal sequence $(e_n)_{n=1}^\infty$ is called an orthonormal basis for $H$ if $x = \sum_{j=1}^\infty (x, e_j)\, e_j$ for every $x \in H$.

Theorem 7.25 Let $H$ be a Hilbert space and suppose $(e_j)_{j=1}^\infty$ is an orthonormal sequence in $H$. The following are equivalent:
(i) The sequence $(e_j)_{j=1}^\infty$ is an orthonormal basis for $H$.
(ii) If $x \in H$, then $\|x\|^2 = \sum_{j=1}^\infty |(x, e_j)|^2$. (Parseval's Identity.)
(iii) $(e_j)_{j=1}^\infty$ is a maximal orthonormal sequence.

Proof We begin by assuming (i). Let $(e_j)_{j=1}^\infty$ be an orthonormal basis for $H$ and let $x \in H$. In the proof of Bessel's Inequality (Theorem 7.23), we showed that the series $\sum_{j=1}^\infty (x, e_j)\, e_j$ converges to $x$ and that $\|x\|^2 = \sum_{j=1}^\infty |(x, e_j)|^2$. (See (7.4) and the text following it.) Therefore, (i) implies (ii).

Assume now that (ii) is true, but that $(e_j)_{j=1}^\infty$ is not maximal. Then there exists some $x \in H$ with $\|x\| = 1$ such that $(x, e_n) = 0$ for all $n \in \mathbb{N}$. This is a violation of (ii), and so $(e_j)_{j=1}^\infty$ must be maximal. This proves that (ii) implies (iii).

Now assume (iii), but suppose that $(e_j)_{j=1}^\infty$ is not a basis for $H$. Let $x \in H$ be such that $\sum_{j=1}^\infty (x, e_j)\, e_j$ is not $x$. The series $\sum_{j=1}^\infty (x, e_j)\, e_j$ converges to some $u \in H$. (See the proof of Theorem 7.23.) By construction, $x - u \perp e_j$ for all $j \in \mathbb{N}$. Since $x \neq u$ (by assumption), the element $\frac{x - u}{\|x - u\|}$ has norm one and is orthogonal to $e_j$ for each $j \in \mathbb{N}$. This contradicts the maximality of $(e_j)_{j=1}^\infty$. Thus, $(e_j)_{j=1}^\infty$ is an orthonormal basis for $H$, and so (iii) implies (i), as required. □

Corollary 7.26 Every nonzero separable Hilbert space has an orthonormal basis.

Proof Let $H$ be a nonzero separable Hilbert space. Then $H$ has a maximal orthonormal set, by Lemma 7.22. This set is countable, because $H$ is separable. (See the comments preceding Lemma 7.22.) By Theorem 7.25, a countable maximal orthonormal set is an orthonormal basis. □

Example 7.27 (Identification of separable Hilbert spaces). Let $H$ be an infinite-dimensional separable Hilbert space. Let $(e_i)_{i \in I}$ be an orthonormal basis for $H$. (Such a basis is known to exist, by Corollary 7.26.) By the Pythagorean Theorem, we have $\|e_i - e_j\| = \sqrt{2}$ whenever $i \neq j$. Since $H$ is separable, it must be the case that $I$ is a countable set. Therefore, every infinite-dimensional separable Hilbert space has a countable orthonormal basis. Without loss of generality, we may assume $I = \mathbb{N}$.

Since $(e_i)_{i \in \mathbb{N}}$ is an orthonormal basis for $H$, we can uniquely express every $x \in H$ as $x = \sum_{j=1}^\infty (x, e_j)\, e_j$. We define a map $T : H \to \ell_2$ by
\[ Tx = \bigl( (x, e_j) \bigr)_{j=1}^\infty, \quad x \in H. \]
By Parseval's Identity, this map is well-defined, and indeed $\|Tx\| = \|x\|$. Consequently, $T$ is an isometry. It follows that all infinite-dimensional separable Hilbert spaces are isometrically isomorphic to $\ell_2$. Therefore, as far as Banach spaces are concerned, there is only one infinite-dimensional separable Hilbert space. Choosing an orthonormal basis for $H$ is essentially representing $H$ as $\ell_2$.

To emphasize this result, we place it in a theorem.

Theorem 7.28 Every infinite-dimensional separable Hilbert space is isometrically isomorphic to $\ell_2$.

Proof See the discussion preceding the statement of the theorem. □

The next example can be seen as a special case of the previous.

Example 7.29 (The Fourier transform and separability of $L_2$) Let $L_2(\mathbb{T})$ denote the space of (equivalence classes of) complex-valued square-integrable functions on the probability space $\left( \mathbb{T}, \frac{d\theta}{2\pi} \right)$, where $\mathbb{T} = [0, 2\pi)$. The space $L_2(\mathbb{T})$ is a Hilbert space with inner product given by
\[ (f, g) = \frac{1}{2\pi} \int_0^{2\pi} f(\theta)\, \overline{g(\theta)}\, d\theta, \quad \{f, g\} \subseteq L_2(\mathbb{T}). \]
(Compare to Example 7.5.) For each $n \in \mathbb{Z}$, define $e_n : \mathbb{T} \to \mathbb{C}$ by
\[ e_n(\theta) = e^{in\theta}, \quad \theta \in \mathbb{T}. \]

We claim that $(e_n)_{n \in \mathbb{Z}}$ is an orthonormal basis for $L_2(\mathbb{T})$. The sequence $(e_n)_{n \in \mathbb{Z}}$ is an orthonormal set, because
\[ (e_n, e_m) = \frac{1}{2\pi} \int_0^{2\pi} e^{in\theta}\, e^{-im\theta}\, d\theta = \begin{cases} 1 & \text{if } n = m, \\ 0 & \text{if } n \neq m. \end{cases} \]

The maximality of $(e_n)_{n \in \mathbb{Z}}$ follows from the density of trigonometric polynomials: The Weierstrass Approximation Theorem states that the set of trigonometric polynomials is dense in $C(\mathbb{T})$, and Lusin's Theorem (Theorem A.36) states that the set of continuous functions is dense in $L_2(\mathbb{T})$.
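The orthonormality relations, and Parseval's Identity (Theorem 7.25) for this system, can be checked numerically on a trigonometric polynomial. The following is a minimal sketch: the sample count `N`, the test function `f`, and the helper names are assumptions made for illustration, and the integral is replaced by an equally spaced Riemann sum, which is exact (up to rounding) for exponentials $e^{ik\theta}$ with $0 < |k| < N$.

```python
import cmath
import math

N = 64  # number of sample points; assumed larger than every frequency used below

def inner(f, g):
    """Approximate (f, g) = (1/2pi) * integral of f * conj(g) over [0, 2pi)."""
    total = 0j
    for k in range(N):
        theta = 2 * math.pi * k / N
        total += f(theta) * g(theta).conjugate()
    return total / N

def e(n):
    """The function e_n(theta) = exp(i n theta)."""
    return lambda theta: cmath.exp(1j * n * theta)

def f(theta):
    """Hypothetical trigonometric polynomial: 1 + 2 e_1 - 3i e_{-2}."""
    return 1 + 2 * cmath.exp(1j * theta) - 3j * cmath.exp(-2j * theta)

# Gram matrix entries (e_n, e_m), expected to equal the Kronecker delta.
gram = {(n, m): inner(e(n), e(m)) for n in range(-3, 4) for m in range(-3, 4)}

# Coefficients (f, e_n) and both sides of Parseval's Identity.
coeffs = {n: inner(f, e(n)) for n in range(-3, 4)}
norm_sq = inner(f, f).real
parseval_sum = sum(abs(c) ** 2 for c in coeffs.values())
```

For this choice of `f`, the nonzero coefficients are $(f, e_0) = 1$, $(f, e_1) = 2$, and $(f, e_{-2}) = -3i$, so both `norm_sq` and `parseval_sum` equal $1 + 4 + 9 = 14$ up to rounding.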

If $f \in L_2(\mathbb{T})$, then the Fourier series of $f$ is given by the series
\[ \sum_{n \in \mathbb{Z}} (f, e_n)\, e_n. \]

Since the sequence $(e_n)_{n \in \mathbb{Z}}$ is an orthonormal basis for $L_2(\mathbb{T})$, we see that the Fourier series of $f \in L_2(\mathbb{T})$ will always converge to $f$ in the $L_2(\mathbb{T})$-norm. (Compare this to Example 4.17.)

We recall that the Fourier transform of $f$ is given by the formula $\hat{f}(n) = (f, e_n)$ for $n \in \mathbb{Z}$. The Fourier transform represents $L_2(\mathbb{T})$ as $\ell_2(\mathbb{Z})$. To see this precisely, define a map $\mathcal{F} : L_2(\mathbb{T}) \to \ell_2(\mathbb{Z})$, also called the Fourier transform, by
\[ \mathcal{F}(f) = \bigl( \hat{f}(n) \bigr)_{n \in \mathbb{Z}}, \quad f \in L_2(\mathbb{T}). \]
It is not hard to show that $\mathcal{F}$ is an isomorphism. By Parseval's Identity (Theorem 7.25), we see that
\[ \|f\|_{L_2(\mathbb{T})} = \|\mathcal{F}(f)\|_{\ell_2(\mathbb{Z})}, \quad f \in L_2(\mathbb{T}). \]

Therefore, $\mathcal{F}$ determines an isometric isomorphism, and so $L_2(\mathbb{T})$ and $\ell_2(\mathbb{Z})$ are identical as Banach spaces. The identification of $L_2(\mathbb{T})$ and $\ell_2(\mathbb{Z})$ is sometimes known as the Riesz–Fischer Theorem, after F. Riesz and E.S. Fischer, each of whom proved it (independently) in 1907 [11,30].

We are now ready to state the main result of this section, which nicely mirrors the matrix theory of finite-dimensional vector spaces. While we state (and prove) the result for separable Hilbert spaces, it remains true for general Hilbert spaces.

Theorem 7.30 Let $H$ be an infinite-dimensional separable Hilbert space. Suppose $T : H \to H$ is a compact hermitian operator. There exists an orthonormal basis $(e_n)_{n=1}^\infty$ of $H$ such that $e_n$ is an eigenvector of $T$ for each $n \in \mathbb{N}$; that is, there exists a sequence $(\lambda_n)_{n=1}^\infty$ such that $T e_n = \lambda_n e_n$ for all $n \in \mathbb{N}$.

Remark 7.31 If $T$ is a compact hermitian operator, then any eigenvalue $\lambda$ of $T$ must be real. To see this, suppose $x \in H$ is an eigenvector with $\|x\| = 1$ such that $Tx = \lambda x$. Using the properties of inner products, and keeping in mind that $x$ has norm one, we see that
\[ \lambda = (\lambda x, x) = (Tx, x) = (x, Tx) = (x, \lambda x) = \bar{\lambda}. \]
Also, observe that the sequence of eigenvalues $(\lambda_n)_{n=1}^\infty$ promised in Theorem 7.30 must converge to 0. (We proved this for a general compact operator in Theorem 6.37.)

Before proceeding with the proof of Theorem 7.30, we require some preliminary results.

Lemma 7.32 Let $H$ be a nonzero Hilbert space. If $T : H \to H$ is a compact hermitian operator, there exists an $x \in H$ with $\|x\| = 1$ such that $\|Tx\| = \|T\|$.

Proof First, we claim that $\|T^2\| = \|T\|^2$. Certainly, for any $x \in H$, we have
\[ \|T^2(x)\| = \|T(Tx)\| \leq \|T\|\,\|Tx\| \leq \|T\|^2\,\|x\|. \]
It follows that $\|T^2\| \leq \|T\|^2$.

By the hermitian assumption on $T$, for any $x \in H$,
\[ \|Tx\|^2 = (Tx, Tx) = (T^2 x, x). \]

Therefore, by the Cauchy-Schwarz Inequality, it follows that $\|Tx\|^2 \leq \|T^2 x\|\,\|x\|$. Taking the supremum over all $x \in B_H$, we conclude that $\|T\|^2 \leq \|T^2\|$, and hence $\|T^2\| = \|T\|^2$, as claimed.

Now, pick a sequence $(x_n)_{n=1}^\infty$ in $H$ such that $\|x_n\| \leq 1$ for all $n \in \mathbb{N}$, and such that
\[ \lim_{n \to \infty} \|T^2 x_n\| = \|T^2\| = \|T\|^2. \tag{7.5} \]

We assumed $T$ was compact, and so $T(B_H)$ is relatively compact in $H$. Thus, passing to a subsequence if necessary, we may assume (without loss of generality) that $(T x_n)_{n=1}^\infty$ converges to some $y \in H$. Since $\|T x_n\| \leq \|T\|$ for all $n \in \mathbb{N}$, it must be the case that $\|y\| \leq \|T\|$. Let $x = y/\|T\|$. Then $\|x\| \leq 1$. Furthermore,
\[ Tx = \frac{1}{\|T\|}\, Ty = \lim_{n \to \infty} \frac{1}{\|T\|}\, T^2 x_n. \]
Therefore, by (7.5), $\|Tx\| = \|T\|$, as required. (Note this implies $\|x\| = 1$.) □

Remark 7.33 The hermitian assumption is not required in Lemma 7.32. In fact, the conclusion of Lemma 7.32 holds for any compact operator $T : X \to X$, whenever $X$ is a reflexive Banach space. (See Exercise 6.7.)

We require one more lemma.

Lemma 7.34 Let $H$ be a nonzero Hilbert space. If $T : H \to H$ is a compact hermitian operator, then either $\|T\|$ or $-\|T\|$ is an eigenvalue.

Proof We may assume $T \neq 0$. By Lemma 7.32, there exists some $x \in H$ with $\|x\| = 1$ such that $\|Tx\| = \|T\|$. We make the observation that, since $T$ is a hermitian operator,

\[ (T^2 x, x) = (Tx, Tx) = \|Tx\|^2 = \|T\|^2. \tag{7.6} \]

Define $u = T^2 x - (T^2 x, x)\, x$. Notice that $u = T^2 x - \|T\|^2 x$, by (7.6). By the definition of $u$, we have
\[ (u, x) = (T^2 x, x) - (T^2 x, x)\,(x, x) = (T^2 x, x) - (T^2 x, x)\,\|x\|^2 = 0. \]

Consequently, we have $u \perp x$, and in particular $u \perp (T^2 x, x)\, x$. By the Pythagorean Theorem, it follows that
\[ \|u\|^2 + \|(T^2 x, x)\, x\|^2 = \|u + (T^2 x, x)\, x\|^2. \tag{7.7} \]

On the left side of (7.7), we observe
\[ \|(T^2 x, x)\, x\|^2 = |(T^2 x, x)|^2\, \|x\|^2 = \|T\|^4, \]
by (7.6). On the right side of (7.7), recalling the definition of $u$, we have
\[ \|u + (T^2 x, x)\, x\|^2 = \|T^2 x\|^2 \leq \|T\|^4. \]

It follows that $\|u\|^2 + \|T\|^4 \leq \|T\|^4$. Therefore, $\|u\| = 0$, and so $T^2 x = \|T\|^2 x$.

If $Tx = \|T\|\, x$, then $x$ is an eigenvector for $T$ with eigenvalue $\|T\|$. Suppose instead that $Tx \neq \|T\|\, x$. Then $y = Tx - \|T\|\, x$ is a nonzero element of $H$. Computing:
\[ Ty + \|T\|\, y = (T^2 x - \|T\|\, Tx) + (\|T\|\, Tx - \|T\|^2 x) = 0. \]

Consequently, $Ty = -\|T\|\, y$, and so $y \neq 0$ is an eigenvector for $T$ with eigenvalue $-\|T\|$. We have established that either $\|T\|$ or $-\|T\|$ is an eigenvalue for $T$, and we are done. □

We are now prepared to prove Theorem 7.30.

Proof of Theorem 7.30 By Lemma 7.34, there exists a nonempty set of orthonormal eigenvectors, and so there exists a maximal set, by Zorn's Lemma. Let $(e_n)_{n=1}^N$ be a maximal set of orthonormal eigenvectors, where $N \in \mathbb{N} \cup \{\infty\}$. If $N < \infty$, let $A = \{1, \ldots, N\}$; otherwise let $A = \mathbb{N}$. For each $n \in A$, let $\lambda_n$ denote the eigenvalue corresponding to the eigenvector $e_n$.

Let $H_0 = \{x : (x, e_n) = 0 \text{ for all } n \in A\}$. We will show that $H_0 = \{0\}$. We begin by claiming that $T(H_0) \subseteq H_0$. Suppose that $x \in H_0$. Then for all $n \in A$,
\[ (Tx, e_n) = (x, T e_n) = (x, \lambda_n e_n) = 0. \]
Therefore, $Tx \in H_0$, and consequently $T|_{H_0} : H_0 \to H_0$ is a well-defined compact hermitian operator.

Observe that $H_0$ is a Hilbert space. Thus, if $H_0 \neq \{0\}$, then $T|_{H_0} : H_0 \to H_0$ has an eigenvector $x_0$, by Lemma 7.34. But then $(e_n)_{n \in A} \cup \{x_0/\|x_0\|\}$ forms an orthonormal set of eigenvectors for $T : H \to H$. This contradicts the maximality of $(e_n)_{n \in A}$, and so $H_0 = \{0\}$.

We have established that no nonzero element of $H$ is orthogonal to the orthonormal set $(e_n)_{n \in A}$. It follows that $(e_n)_{n \in A}$ is maximal amongst all orthonormal sets, and thus it is an orthonormal basis for $H$, by Theorem 7.25. This completes the proof. □
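The dichotomy in the proof of Lemma 7.34 can be traced on a small real symmetric matrix. The sketch below is an illustration under stated assumptions, not the method of the text: the matrix `T` is a hypothetical example with eigenvalues $2$ and $-3$, and power iteration on $T^2$ is swapped in to approximate a unit vector $x$ with $\|Tx\| = \|T\|$ (in the text, such an $x$ comes from the compactness argument of Lemma 7.32). The starting vector is assumed not to be orthogonal to the dominant eigenspace of $T^2$.

```python
# Sketch: either ||T|| or -||T|| is an eigenvalue (hypothetical 2x2 example).

def apply(T, x):
    """Matrix-vector product T x."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

def norm(x):
    """Euclidean norm on R^n."""
    return sum(c * c for c in x) ** 0.5

T = [[1.0, 2.0], [2.0, -2.0]]  # symmetric; eigenvalues are 2 and -3

# Power iteration on T^2: approximates a unit x with T^2 x = ||T||^2 x.
x = [1.0, 0.0]
for _ in range(200):
    x = apply(T, apply(T, x))
    n = norm(x)
    x = [c / n for c in x]

op_norm = norm(apply(T, x))  # approximates ||T|| = ||Tx||

# Follow the proof: test whether Tx = ||T|| x; otherwise use y = Tx - ||T|| x.
Tx = apply(T, x)
y = [Tx[i] - op_norm * x[i] for i in range(len(x))]
if norm(y) < 1e-9:
    eigenvalue, eigenvector = op_norm, x
else:
    eigenvalue, eigenvector = -op_norm, y
```

For this matrix $\|T\| = 3$ and $Tx \neq \|T\| x$, so the second branch is taken and $y$ is an eigenvector with eigenvalue $-\|T\| = -3$.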

7.3 Hilbert–Schmidt Operators

Let a and b be real numbers such that a

\[ \|K\|_{L_2}^2 = \int_a^b \left( \int_a^b |K(x, y)|^2\, dy \right) dx < \infty. \tag{7.8} \]
A consequence of (7.8) is that $\int_a^b |K(x, y)|^2\, dy < \infty$ for almost every $x \in [a, b]$. We use $K$ to define a map $T : L_2(a, b) \to L_2(a, b)$ by
\[ Tf(x) = \int_a^b K(x, y)\, f(y)\, dy, \quad f \in L_2(a, b),\ x \in [a, b]. \tag{7.9} \]

We will show that $T$ is a well-defined bounded linear operator on $L_2(a, b)$. By Hölder's Inequality,
\[ \int_a^b |K(x, y)\, f(y)|\, dy \leq \left( \int_a^b |K(x, y)|^2\, dy \right)^{1/2} \left( \int_a^b |f(y)|^2\, dy \right)^{1/2}. \tag{7.10} \]

This quantity is finite for almost every $x$ in $[a, b]$ whenever $f \in L_2(a, b)$. Therefore, the integral in (7.9) exists for almost every $x$ in $[a, b]$, provided that the function $f$ is an element of $L_2(a, b)$. Linearity of $T$ is now evident, and so $T$ is a well-defined linear operator.

It remains to show that $T$ is bounded. From (7.10), we see that
\[ |Tf(x)| \leq \left( \int_a^b |K(x, y)|^2\, dy \right)^{1/2} \|f\|_{L_2(a,b)}. \]
By squaring, and then integrating with respect to $x$, we discover
\[ \int_a^b |Tf(x)|^2\, dx \leq \left( \int_a^b \int_a^b |K(x, y)|^2\, dy\, dx \right) \|f\|_{L_2(a,b)}^2. \]
Therefore, $\|Tf\|_{L_2(a,b)} \leq \|K\|_{L_2}\, \|f\|_{L_2(a,b)}$. Consequently, the map $T$ is a well-defined bounded linear operator and $\|T\| \leq \|K\|_{L_2}$.

Definition 7.35 A map $T : L_2(a, b) \to L_2(a, b)$ is a Hilbert-Schmidt operator if
\[ Tf(x) = \int_a^b K(x, y)\, f(y)\, dy, \quad f \in L_2(a, b),\ x \in [a, b], \]
for some $K \in L_2([a, b] \times [a, b])$. The function $K$ is the kernel associated with $T$, or simply the kernel of $T$.

We wish to identify $T^*$, the Hilbert space adjoint of $T$. To that end, suppose $f$ and $g$ are in $L_2(a, b)$. By Fubini's Theorem,
\[ (Tf, g) = \int_a^b \left( \int_a^b K(x, y)\, f(y)\, dy \right) \overline{g(x)}\, dx = \int_a^b \left( \int_a^b K(x, y)\, \overline{g(x)}\, dx \right) f(y)\, dy. \]
The Hilbert space adjoint $T^*$ is the unique operator that satisfies the equation $(Tf, g) = (f, T^* g)$ for all $f$ and $g$ in $L_2(a, b)$, and therefore
\[ T^* g(x) = \int_a^b \overline{K(y, x)}\, g(y)\, dy. \tag{7.11} \]
(See Example 7.20.) Comparing (7.11) with the definition of $T$ in (7.9), we see that $T^* = T$ precisely when $K(x, y) = \overline{K(y, x)}$ for almost every $x$ and $y$ in $[a, b]$. This motivates the next definition.

Definition 7.36 Suppose $T$ is a Hilbert-Schmidt operator on $L_2(a, b)$ with kernel $K$. We say that $K$ is a hermitian kernel, or simply that $K$ is hermitian, if $K(x, y) = \overline{K(y, x)}$ for almost every $x$ and $y$ in $[a, b]$. A real-valued hermitian kernel is also known as a symmetric kernel.

Throughout the remainder of this section, $(e_n)_{n=1}^\infty$ will denote an orthonormal basis for $L_2(a, b)$. (We know an orthonormal basis exists by Corollary 7.26. The basis is countable by an argument similar to that used in Example 7.29.)

Proposition 7.37 Let $(e_n)_{n=1}^\infty$ be an orthonormal basis for $L_2(a, b)$. For each $m$ and $n$ in $\mathbb{N}$, define a function $f_{mn} : [a, b] \times [a, b] \to \mathbb{C}$ by
\[ f_{mn}(x, y) = e_m(x)\, \overline{e_n(y)}, \quad \{x, y\} \subseteq [a, b]. \]
The set $(f_{mn})_{m,n=1}^\infty$ is an orthonormal basis for $L_2([a, b] \times [a, b])$.

Proof For ease of notation, let $L_2 = L_2([a, b] \times [a, b])$. The inner product on $L_2$ is given by
\[ (g, f) = \int_a^b \int_a^b g(x, y)\, \overline{f(x, y)}\, dx\, dy, \]
where $g$ and $f$ are functions in $L_2$. We wish to show that $(f_{mn})_{m,n=1}^\infty$ is an orthonormal basis for $L_2$. It suffices to show that if $g \in L_2$ and $(g, f_{mn}) = 0$ for all natural numbers $m$ and $n$, then $g = 0$ in $L_2$.

Let $g \in L_2$ and suppose for all $m$ and $n$ in $\mathbb{N}$,
\[ (g, f_{mn}) = \int_a^b \int_a^b g(x, y)\, \overline{e_m(x)}\, e_n(y)\, dx\, dy = 0. \tag{7.12} \]
By Hölder's Inequality,
\[ \int_a^b |g(x, y)\, \overline{e_m(x)}|\, dx \leq \left( \int_a^b |g(x, y)|^2\, dx \right)^{1/2} < \infty, \tag{7.13} \]
for almost every $y$. For each $m \in \mathbb{N}$, define a function $h_m$ on $[a, b]$ by
\[ h_m(y) = \int_a^b g(x, y)\, \overline{e_m(x)}\, dx, \quad y \in [a, b]. \]

The function $h_m$ is well-defined by (7.13), and furthermore,
\[ \|h_m\|_{L_2(a,b)}^2 = \int_a^b |h_m(y)|^2\, dy \leq \int_a^b \left( \int_a^b |g(x, y)|^2\, dx \right) dy = \|g\|_{L_2}^2. \]

It follows that $h_m \in L_2(a, b)$ for each $m \in \mathbb{N}$. For every $n \in \mathbb{N}$, by (7.12), we have
\[ (h_m, \overline{e_n}) = \int_a^b h_m(y)\, e_n(y)\, dy = 0. \tag{7.14} \]
Since $(e_n)_{n=1}^\infty$ is an orthonormal basis for $L_2(a, b)$, so too is $(\overline{e_n})_{n=1}^\infty$, and thus the equality in (7.14) implies for each $m \in \mathbb{N}$ that $h_m = 0$ a.e.($y$). Thus, for each $m \in \mathbb{N}$,
\[ \int_a^b g(x, y)\, \overline{e_m(x)}\, dx = 0 \quad \text{a.e.}(y). \]
A countable union of measure zero sets is still measure zero, and so for almost every $y$,
\[ \int_a^b g(x, y)\, \overline{e_m(x)}\, dx = 0 \]
for every $m \in \mathbb{N}$. Again invoking the fact that $(e_n)_{n=1}^\infty$ is an orthonormal basis, we conclude that $g(x, y) = 0$ a.e.($x$) for almost every $y$. Therefore, $g = 0$ in $L_2$, as required. □

Proposition 7.38 A Hilbert-Schmidt operator on $L_2(a, b)$ is a compact operator.

Proof Let $(e_n)_{n=1}^\infty$ be an orthonormal basis for the Hilbert space $L_2(a, b)$ and let $(f_{mn})_{m,n=1}^\infty$ be the orthonormal basis for $L_2([a, b] \times [a, b])$ given in Proposition 7.37. Suppose that $T_K$ is a Hilbert-Schmidt operator with kernel $K$. Because $(f_{mn})_{m,n=1}^\infty$ is an orthonormal basis for $L_2([a, b] \times [a, b])$, we have that
\[ K = \sum_{n=1}^\infty \sum_{m=1}^\infty (K, f_{mn})\, f_{mn}, \tag{7.15} \]
where the series converges in the norm on $L_2([a, b] \times [a, b])$. To be precise, if for each $N \in \mathbb{N}$,
\[ K_N = \sum_{n=1}^N \sum_{m=1}^N (K, f_{mn})\, f_{mn}, \]
then $\|K - K_N\|_{L_2} \to 0$ as $N \to \infty$.

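The Hilbert-Schmidt operator with kernel $K_N$ is given by
\[ T_{K_N} g(x) = \int_a^b \sum_{n=1}^N \sum_{m=1}^N (K, f_{mn})\, f_{mn}(x, y)\, g(y)\, dy. \]

Recalling the definition of $f_{mn}(x, y)$, this becomes
\[ \sum_{n=1}^N \sum_{m=1}^N (K, f_{mn})\, e_m(x) \int_a^b \overline{e_n(y)}\, g(y)\, dy = \sum_{n=1}^N \sum_{m=1}^N (K, f_{mn})\,(g, e_n)\, e_m(x). \]
Consequently,
\[ T_{K_N} g(x) = \sum_{m=1}^N \left( \sum_{n=1}^N (K, f_{mn})\,(g, e_n) \right) e_m(x). \]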

Therefore, the range of $T_{K_N}$ is at most $N$-dimensional, and so the rank of $T_{K_N}$ is at most $N$. (In particular, the rank of $T_{K_N}$ is finite.)

Observe that $T_K - T_{K_N} = T_{K - K_N}$. We know that $\|T_{K - K_N}\| \leq \|K - K_N\|_{L_2}$ (see the comments at the start of this section) and $\|K - K_N\|_{L_2} \to 0$ as $N \to \infty$ (by construction). It follows that $\|T_K - T_{K_N}\| \to 0$ as $N \to \infty$. Therefore, $T_K$ is a compact operator, as the limit of a sequence of finite rank operators. □

Now assume, in addition to being square-integrable, that $K$ is a hermitian kernel, so that $K(x, y) = \overline{K(y, x)}$ for almost every $x$ and $y$. By Proposition 7.38, the map $T_K$ is a compact hermitian operator. From Theorem 7.30, we know that there exists an orthonormal basis for $L_2(a, b)$ composed of eigenvectors for $T_K$. Denote this set of orthonormal eigenvectors by $(e_n)_{n=1}^\infty$ and let the corresponding sequence of eigenvalues be $(\lambda_n)_{n=1}^\infty$. Let $(f_{mn})_{m,n=1}^\infty$ be the orthonormal basis for $L_2([a, b] \times [a, b])$ given in Proposition 7.37. Using the fact that $f_{mn}(x, y) = e_m(x)\, \overline{e_n(y)}$ for all $x$ and $y$ in $[a, b]$, we compute:
\[ (K, f_{mn}) = \int_a^b \left( \int_a^b K(x, y)\, e_n(y)\, dy \right) \overline{e_m(x)}\, dx = \int_a^b T_K e_n(x) \cdot \overline{e_m(x)}\, dx. \tag{7.16} \]

By assumption, $T_K e_n = \lambda_n e_n$ for all $n \in \mathbb{N}$, and hence
\[ (K, f_{mn}) = \int_a^b \lambda_n e_n(x) \cdot \overline{e_m(x)}\, dx = \lambda_n\,(e_n, e_m) = \lambda_n\, \delta_{mn}. \]
Thus, by Parseval's Identity (Theorem 7.25),
\[ \|K\|_{L_2}^2 = \sum_{n=1}^\infty \sum_{m=1}^\infty |(K, f_{mn})|^2 = \sum_{n=1}^\infty \sum_{m=1}^\infty |\lambda_n\, \delta_{mn}|^2 = \sum_{n=1}^\infty |\lambda_n|^2. \]
We summarize in the following theorem.

Theorem 7.39 Let $T_K$ be a Hilbert-Schmidt operator with hermitian kernel $K$. If $(\lambda_n)_{n=1}^\infty$ is the sequence of eigenvalues of $T_K$, then
\[ \int_a^b \int_a^b |K(x, y)|^2\, dx\, dy = \sum_{n=1}^\infty |\lambda_n|^2 < \infty, \]
and $\|T_K\| = \sup_{n \in \mathbb{N}} |\lambda_n|$.

Proof The proof of the integral equation can be found in the discussion preceding the statement of the theorem. The equality $\|T_K\| = \sup_{n \in \mathbb{N}} |\lambda_n|$ follows from the fact that $T e_n = \lambda_n e_n$ for all $e_n$ in the orthonormal basis $(e_n)_{n=1}^\infty$. □
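A rank-one kernel gives a quick sanity check of Theorem 7.39. The following worked computation is a sketch under the assumption $[a, b] = [0, 1]$ and $K(x, y) = xy$, a real symmetric kernel chosen only for illustration; the normalized function $e_1$ is likewise introduced here and does not appear in the text.

```latex
% Assumed example: [a, b] = [0, 1] and K(x, y) = xy.
\begin{align*}
T_K f(x) &= x \int_0^1 y\, f(y)\, dy, \\
e_1(x) &= \sqrt{3}\, x, \qquad \|e_1\|_{L_2(0,1)}^2 = \int_0^1 3x^2\, dx = 1, \\
T_K e_1(x) &= x \int_0^1 \sqrt{3}\, y^2\, dy = \tfrac{1}{3}\, e_1(x),
  \qquad \lambda_1 = \tfrac{1}{3}, \\
T_K e &= 0 \quad \text{whenever } (e, e_1) = 0,
  \qquad \text{so } \lambda_n = 0 \text{ for } n \geq 2, \\
\int_0^1\!\!\int_0^1 |xy|^2\, dx\, dy &= \tfrac{1}{3} \cdot \tfrac{1}{3} = \tfrac{1}{9}
  = \sum_{n=1}^\infty |\lambda_n|^2,
  \qquad \|T_K\| = \tfrac{1}{3} = \sup_{n} |\lambda_n| .
\end{align*}
```

In this example the bound $\|T_K\| \leq \|K\|_{L_2}$ from the start of the section is attained, since both sides equal $1/3$.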

7.4 Sturm–Liouville Systems

In this section, we will see an example of how to apply the theory of compact hermitian operators to solve differential equations. For simplicity, we will suppose the scalar field is $\mathbb{R}$. We will consider a special case of a system of differential equations known as a Sturm-Liouville system. The system we will consider is
\[ \begin{cases} y'' + q(x)\, y = f(x), \quad \{f, q\} \subseteq C[a, b], & \text{(DE)} \\ y(a) = y(b) = 0. & \text{(BC)} \end{cases} \tag{7.17} \]

We wish to find $y \in C^{(2)}[a, b]$ (a twice continuously differentiable function) that satisfies the differential equation (DE) subject to the boundary conditions (BC). While we start with the assumption $f \in C[a, b]$, we will later extend to $f \in L_2(a, b)$. Throughout what follows, we will make the following assumption on the homogeneous system (i.e., the system with $f = 0$):

Assumption 1 The only solution to the homogeneous system is $y = 0$.

Example 7.40 Our basic model of a Sturm-Liouville system comes from a vibrating string. Suppose that $a = 0$ and $b = \pi$. Consider the following differential system:
\[ \begin{cases} y'' = f(x), \quad f \in C[0, \pi], & \text{(DE}'\text{)} \\ y(0) = y(\pi) = 0. & \text{(BC}'\text{)} \end{cases} \]

For the differential equation (DE$'$), the homogeneous equation $y'' = 0$ has general solution $y(t) = \alpha + \beta t$. The only way that this can satisfy (BC$'$) is if $\alpha = \beta = 0$, and so the vibrating string model satisfies our basic assumption.

Let us return our attention to the Sturm-Liouville system in (7.17). Suppose $u$ is a solution to the initial value problem
\[ \begin{cases} y'' + q(x)\, y = 0, \quad q \in C[a, b], & \text{(DE0)} \\ y(a) = 0, \quad y'(a) = 1. & \text{(IC1)} \end{cases} \]

We know that a solution $u$ to this system exists by the general theory of ordinary differential equations. If $u(b) = 0$, then $u$ is a solution to the homogeneous system. It would follow that $u = 0$, by Assumption 1. This violates the initial conditions in (IC1), and so $u(b) \neq 0$. Next, let $v$ be a solution to the initial value problem
\[ \begin{cases} y'' + q(x)\, y = 0, \quad q \in C[a, b], & \text{(DE0)} \\ y(b) = 0, \quad y'(b) = 1. & \text{(IC2)} \end{cases} \]

As before, such a solution is known to exist by general theory. Furthermore, $v(a) \neq 0$, or else it would be trivial, violating the initial conditions in (IC2). Consider the function

\[ y(x) = \varphi(x)\, u(x) + \psi(x)\, v(x), \quad x \in [a, b], \]
where $\varphi$ and $\psi$ are differentiable functions to be determined at a later time. We will insist only that $\varphi$ and $\psi$ satisfy the following assumption:

Assumption 2 $\varphi$ and $\psi$ are differentiable functions such that $\varphi' u + \psi' v = 0$.

Keeping Assumption 2 in mind, let us differentiate $y$:
\[ y' = \varphi' u + \varphi u' + \psi' v + \psi v' = \varphi u' + \psi v'. \]

Continue by computing the second derivative of $y$:
\[ y'' = \varphi' u' + \varphi u'' + \psi' v' + \psi v''. \]

It follows that
\[ y'' + qy = \varphi' u' + \psi' v' + (u'' + qu)\, \varphi + (v'' + qv)\, \psi. \]

By assumption, $u$ and $v$ satisfy (DE0), and so we conclude that
\[ y'' + qy = \varphi' u' + \psi' v'. \]

Our goal is to find a solution to the differential system in (7.17). From the preceding calculations, we see that this will be accomplished if we can solve the following differential system:
\[ \begin{cases} \varphi' u + \psi' v = 0, \\ \varphi' u' + \psi' v' = f, \end{cases} \qquad \psi(a) = 0, \quad \varphi(b) = 0. \]

A little algebraic manipulation reveals:
\[ (u' v - u v')\, \varphi' = f v \quad \text{and} \quad (u v' - u' v)\, \psi' = f u. \tag{7.18} \]

Let the Wronskian of $u$ and $v$ be given by the formula
\[ W = u v' - u' v = \det \begin{pmatrix} u & v \\ u' & v' \end{pmatrix}. \tag{7.19} \]

Then, (7.18) becomes
\[ \varphi' = -\frac{f v}{W} \quad \text{and} \quad \psi' = \frac{f u}{W}, \tag{7.20} \]
provided that $W(x) \neq 0$ for every $x \in [a, b]$. We will show that $W$ is a nonzero constant function. To that end, we compute the derivative:
\[ W' = u' v' + u v'' - u'' v - u' v' = u v'' - u'' v. \]
By assumption, $u'' + qu = 0$ and $v'' + qv = 0$, and so
\[ W' = u(-qv) - (-qu)v = 0. \]

Therefore, $W$ is a constant function. To verify that $W$ is nonzero, we recall that $u$ and $v$ satisfy the initial conditions (IC1) and (IC2), respectively. Observe that
\[ W(a) = u(a)\, v'(a) - u'(a)\, v(a) = -v(a) \quad \text{and} \quad W(b) = u(b)\, v'(b) - u'(b)\, v(b) = u(b). \]
We know that $v(a) \neq 0$ and $u(b) \neq 0$, and so $W$ is a nonzero constant function. Let $\alpha = W(a)$ be the value of this constant.

To solve for $\varphi$ and $\psi$, we now integrate the equations in (7.20). The results are
\[ \psi(x) = \int_a^x \frac{f(t)\, u(t)}{W(t)}\, dt = \frac{1}{\alpha} \int_a^x f(t)\, u(t)\, dt, \]
and
\[ \varphi(x) = -\int_b^x \frac{f(t)\, v(t)}{W(t)}\, dt = \frac{1}{\alpha} \int_x^b f(t)\, v(t)\, dt. \]
We now have an integral formula for $y$:
\[ y(x) = \frac{1}{\alpha} \left( v(x) \int_a^x f(t)\, u(t)\, dt + u(x) \int_x^b f(t)\, v(t)\, dt \right). \]
Define a map $K : [a, b] \times [a, b] \to \mathbb{R}$ by
\[ K(x, t) = \begin{cases} \frac{1}{\alpha}\, v(x)\, u(t) & \text{if } t \leq x, \\ \frac{1}{\alpha}\, u(x)\, v(t) & \text{if } t \geq x. \end{cases} \tag{7.21} \]

Then

\[ y(x) = \int_a^b K(x, t)\,f(t)\,dt. \]

The map $K$ is continuous on $[a, b] \times [a, b]$ and is symmetric; i.e., $K(x, y) = K(y, x)$ for all $x$ and $y$ in $[a, b]$. It follows that $K$ defines a compact symmetric (hermitian) operator

\[ T_K f(x) = \int_a^b K(x, t)\,f(t)\,dt, \quad f \in L_2(a, b), \ x \in [a, b]. \tag{7.22} \]
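The construction of the kernel can be checked numerically on a concrete case. The following is a minimal sketch under assumptions that are not taken from the text: the choice $q = 1$ on $[a, b] = [0, 1]$ is purely illustrative, the names `wronskian`, `K`, `apply_TK` and `exact` are hypothetical helpers, and the integral is approximated with a midpoint rule. For this choice, $u(x) = \sin(x - a)$ and $v(x) = \sin(x - b)$, and the Wronskian should equal the constant $\sin(b - a)$.

```python
import math

# Illustrative assumption (not from the text): q(x) = 1 on [a, b] = [0, 1].
a, b = 0.0, 1.0

def u(x):
    # Solves u'' + u = 0 with u(a) = 0, u'(a) = 1.
    return math.sin(x - a)

def du(x):
    return math.cos(x - a)

def v(x):
    # Solves v'' + v = 0 with v(b) = 0, v'(b) = 1.
    return math.sin(x - b)

def dv(x):
    return math.cos(x - b)

def wronskian(x):
    # W = u v' - u' v, expected to be constant in x.
    return u(x) * dv(x) - du(x) * v(x)

alpha = wronskian(a)

def K(x, t):
    # Kernel built as in (7.21).
    return v(x) * u(t) / alpha if t <= x else u(x) * v(t) / alpha

def apply_TK(f, x, n=4000):
    # Midpoint-rule approximation of the integral of K(x, t) f(t) over [a, b].
    h = (b - a) / n
    return h * sum(K(x, a + (i + 0.5) * h) * f(a + (i + 0.5) * h) for i in range(n))

def exact(x):
    # Closed-form solution of y'' + y = 1 with y(0) = y(1) = 0.
    return (math.sin(x - 1.0) - math.sin(x) + math.sin(1.0)) / math.sin(1.0)

if __name__ == "__main__":
    print([round(wronskian(x), 12) for x in (0.0, 0.3, 0.9)])
    print(apply_TK(lambda t: 1.0, 0.4), exact(0.4))
```

The last line compares the integral formula applied to the constant function $f = 1$ with the closed-form solution of $y'' + y = 1$, $y(0) = y(1) = 0$; the two values should agree up to quadrature error.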

Furthermore, for any $f \in L_2(a, b)$ we have that $y = T_K f$ is a solution to the system in (7.17). Since $K$ is a symmetric (hermitian) kernel, we know that $T_K$ defines a bounded linear map into $L_2(a, b)$. In this case, since $K$ is continuous, we can actually improve this to the statement $T_K : L_2(a, b) \to C[a, b]$. We first show that $T_K f$ is continuous for all $f \in L_2(a, b)$. To see this, let $\epsilon > 0$ be given and observe that for all $x$ and $y$ in $[a, b]$,

\[ |T_K f(x) - T_K f(y)| = \left| \int_a^b \big( K(x, t) - K(y, t) \big)\, f(t)\,dt \right|. \]

By Hölder's Inequality, this is bounded by

\[ \left( \int_a^b |K(x, t) - K(y, t)|^2\,dt \right)^{1/2} \|f\|_{L_2(a,b)}. \]

By assumption, $K$ is continuous on a compact set, and so there exists a $\delta > 0$ such that

\[ |K(x, t) - K(y, t)| < \frac{\epsilon}{\|f\|_{L_2(a,b)}\,\sqrt{b - a}} \]

for all $t \in [a, b]$, whenever $|x - y| < \delta$. Therefore, $|T_K f(x) - T_K f(y)| \le \epsilon$ whenever $|x - y| < \delta$, and so $T_K f$ is continuous on $[a, b]$.

The argument to show $T_K$ is bounded on $L_2(a, b)$ is similar. For each $x \in [a, b]$, by Hölder's Inequality, we have

\[ |T_K f(x)| = \left| \int_a^b K(x, t)\,f(t)\,dt \right| \le \left( \int_a^b |K(x, t)|^2\,dt \right)^{1/2} \|f\|_{L_2(a,b)}. \]

The function $K$ is continuous on the compact set $[a, b] \times [a, b]$, and hence it is bounded on $[a, b] \times [a, b]$. Consequently,

\[ \|T_K f\|_{C[a,b]} \le \sqrt{b - a}\; \|K\|_{C([a,b]\times[a,b])}\, \|f\|_{L_2(a,b)}. \]

Therefore, $T_K : L_2(a, b) \to C[a, b]$ is bounded and $\|T_K\| \le \sqrt{b - a}\, \|K\|_{C([a,b]\times[a,b])}$.

Suppose now that $e$ is an eigenvector for $T_K$ (an eigenfunction in this case). Then there exists a nonzero scalar $\lambda$ such that $T_K e = \lambda e$. (Note that $\lambda \neq 0$ because $y = T_K e$ is a solution to $y'' + qy = e$. If $y = T_K e = 0$, the differential equation implies that $e = 0$, which contradicts the assumption that $e$ is an eigenvector.)

We know that $y = T_K e$ is a solution to $y'' + qy = e$ (by construction) and that $T_K e = \lambda e$, where $\lambda \neq 0$. Substituting $y = \lambda e$ into the differential equation, we have $(\lambda e)'' + q(\lambda e) = e$, or

\[ e'' + qe = \frac{1}{\lambda}\, e. \]

This leads us to the following theorem. Theorem 7.41 Let a and b be real numbers such that a

Proof Let $T_K$ be the operator defined by (7.21) and (7.22). By Theorem 7.30, the operator $T_K$ has a sequence of eigenvectors $(e_n)_{n=1}^\infty$ that form an orthonormal basis for $L_2(a, b)$. For each $n \in \mathbb{N}$, let $\lambda_n$ be the eigenvalue associated with $e_n$, and define $\alpha_n = 1/\lambda_n$. Since $\lambda_n \to 0$ as $n \to \infty$ (by Theorem 6.37, or even Theorem 7.39), it follows that $|\alpha_n| \to \infty$ as $n \to \infty$. We know that the eigenvectors satisfy the differential equation because of the discussion prior to the statement of the theorem.

Finally, suppose $f \in L_2(a, b)$. We have demonstrated that $y = T_K f$ is a solution to the given differential system. Since $(e_n)_{n=1}^\infty$ is an orthonormal basis for $L_2(a, b)$, we conclude that

\[ y = T_K f = \sum_{n=1}^\infty (T_K f, e_n)\, e_n. \]

Since $T_K$ is symmetric (hermitian), we have

\[ (T_K f, e_n) = (f, T_K e_n) = \lambda_n\, (f, e_n) \]

for all $n \in \mathbb{N}$. The result follows because $\lambda_n = 1/\alpha_n$. □

Example 7.42 We return now to the example of the vibrating string, which is described by the following system of differential equations:

\[ \begin{cases} y'' = f(x), \quad f \in L_2(0, \pi), & \text{(DE)} \\ y(0) = y(\pi) = 0. & \text{(BC)} \end{cases} \]

We will parallel the argument used in the general case to see how it works in this example. First consider the initial value problem at the left endpoint:

\[ y'' = 0, \quad y(0) = 0, \quad y'(0) = 1. \]

The solution to this initial value problem is $u(x) = x$ for all $x \in [0, \pi]$. Next, we consider the initial value problem at the right endpoint:

\[ y'' = 0, \quad y(\pi) = 0, \quad y'(\pi) = 1. \]

The solution to this initial value problem is v(x) = x − π for all x ∈ [0, π]. From the general case, we know that the value of W in (7.19) is a constant. In this example, we can calculate it explicitly:

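\[ W = uv' - u'v = \pi. \]

We now define the kernel $K$ as in (7.21):

\[ K(x, t) = \begin{cases} \frac{1}{\pi}\,(x - \pi)\,t & \text{if } t \le x, \\ \frac{1}{\pi}\,x\,(t - \pi) & \text{if } t \ge x. \end{cases} \tag{7.23} \]

As usual, we let $T_K$ be the Hilbert–Schmidt operator with kernel $K$:

\[ T_K f(x) = \int_0^\pi K(x, t)\,f(t)\,dt, \quad f \in L_2(0, \pi), \ x \in [0, \pi]. \]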
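As a sanity check on this kernel, one can apply the operator numerically to functions whose solutions are known in closed form. The sketch below is only an illustration: the helper names `K` and `apply_TK` and the midpoint quadrature are assumptions of this example, not part of the text. For $f(t) = \sin t$, the solution of $y'' = f$, $y(0) = y(\pi) = 0$ is $y = -\sin x$, and for $f = 1$ it is $y = x(x - \pi)/2$.

```python
import math

def K(x, t):
    # Kernel for y'' = f on [0, pi] with y(0) = y(pi) = 0, where u(x) = x and v(x) = x - pi.
    if t <= x:
        return (x - math.pi) * t / math.pi
    return x * (t - math.pi) / math.pi

def apply_TK(f, x, n=4000):
    # Midpoint-rule approximation of the integral of K(x, t) f(t) over [0, pi].
    h = math.pi / n
    return h * sum(K(x, (i + 0.5) * h) * f((i + 0.5) * h) for i in range(n))

if __name__ == "__main__":
    x = 1.0
    print(apply_TK(math.sin, x), -math.sin(x))
    print(apply_TK(lambda t: 1.0, x), x * (x - math.pi) / 2.0)
```

Each printed pair should agree up to quadrature error, which is consistent with $T_K f$ solving the boundary value problem.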

The next step is to calculate the eigenvectors and eigenvalues for TK . According to Theorem 7.41, we can consider solutions to the equation

\[ y'' - \alpha y = 0, \quad y(0) = 0, \quad y(\pi) = 0. \tag{7.24} \]

The differential equation in (7.24) is a homogeneous linear second order ordinary differential equation, and the solution depends on the sign of $\alpha$. We know that $\alpha \neq 0$, because the only solution to the homogeneous equation is $y = 0$. Suppose $\alpha > 0$. Then $\alpha = \beta^2$ for some $\beta > 0$. The differential equation becomes $y'' - \beta^2 y = 0$, and the general solution to this differential equation is

\[ y(x) = A e^{\beta x} + B e^{-\beta x}, \]

where $A$ and $B$ are real numbers. The boundary conditions imply that $A = 0$ and $B = 0$, and so there are no positive eigenvalues.

Now let $\alpha = -\beta^2$, where $\beta > 0$. The differential equation becomes $y'' + \beta^2 y = 0$, and this has general solution

y(x) = A cos (βx) + B sin (βx), where A and B are real numbers. The boundary condition y(0) = 0 implies that A = 0. At the other endpoint (and setting A = 0), the condition y(π) = 0 implies that sin (βπ) = 0. This happens whenever β ∈ N (since β>0). We have established that the solutions to (7.24) are:

\[ y = \sin(nx), \quad \alpha = -n^2, \quad n \in \mathbb{N}. \]

We wish to normalize $y$. Computing the $L_2(0, \pi)$-norm of $y$:

\[ \|y\|_{L_2(0,\pi)} = \left( \int_0^\pi \sin^2(nx)\,dx \right)^{1/2} = \sqrt{\frac{\pi}{2}}. \]

Therefore, for each $n \in \mathbb{N}$, we let

\[ \alpha_n = -n^2 \quad\text{and}\quad e_n = \sqrt{\frac{2}{\pi}}\, \sin(nx). \]

We now have the following interesting result.

Theorem 7.43 $\big( \sqrt{2/\pi}\, \sin(nx) \big)_{n=1}^\infty$ is an orthonormal basis for $L_2(0, \pi)$.

Proof See Theorem 7.41 and Example 7.42. □
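The orthonormality part of this statement is easy to test numerically. The following sketch (with hypothetical helper names `e` and `inner`, and a midpoint rule standing in for the $L_2(0, \pi)$ inner product) checks only orthonormality of a few of the functions; it says nothing about completeness, which is the substantive part of the theorem.

```python
import math

def e(n, x):
    # Normalized eigenfunction e_n(x) = sqrt(2/pi) * sin(n x).
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)

def inner(m, n, steps=2000):
    # Midpoint-rule approximation of the L2(0, pi) inner product (e_m, e_n).
    h = math.pi / steps
    return h * sum(e(m, (i + 0.5) * h) * e(n, (i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    print(inner(3, 3))  # expected to be close to 1
    print(inner(2, 5))  # expected to be close to 0
```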

The eigenvalues of $T_K$ are the reciprocals of the $\alpha_n$ values. Consequently, the eigenvalues of $T_K$ are

\[ \lambda_n = -\frac{1}{n^2}, \quad n \in \mathbb{N}, \tag{7.25} \]

where $\lambda_n$ is the eigenvalue corresponding to $e_n$. Since we have an explicit formula for the kernel $K$ of $T_K$ (see (7.23)), we also get an interesting summation formula via Theorem 7.39:

\[ \sum_{n=1}^\infty \frac{1}{n^4} = \int_0^\pi \!\! \int_0^\pi |K(x, t)|^2\,dt\,dx = \frac{1}{\pi^2} \int_0^\pi \!\! \int_0^x (x - \pi)^2\, t^2\,dt\,dx + \frac{1}{\pi^2} \int_0^\pi \!\! \int_x^\pi x^2\, (t - \pi)^2\,dt\,dx. \tag{7.26} \]

We leave it to the reader to verify this equality. (See Exercise 7.21.)
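Before carrying out the integration by hand, the summation formula can be checked numerically. This is a rough sketch, not a proof: the helper names are hypothetical, the double integral is approximated by a two-dimensional midpoint rule, and the series is truncated after 1000 terms. Both quantities should be close to $\pi^4/90$.

```python
import math

def K(x, t):
    # Kernel of the vibrating-string example: u(x) = x, v(x) = x - pi, W = pi.
    if t <= x:
        return (x - math.pi) * t / math.pi
    return x * (t - math.pi) / math.pi

def hs_norm_squared(n=400):
    # Two-dimensional midpoint rule for the double integral of |K(x, t)|^2 over [0, pi]^2.
    h = math.pi / n
    pts = [(i + 0.5) * h for i in range(n)]
    return h * h * sum(K(x, t) ** 2 for x in pts for t in pts)

# Truncated series: sum of 1/k^4 for k = 1, ..., 1000.
partial_sum = sum(1.0 / k**4 for k in range(1, 1001))

if __name__ == "__main__":
    print(hs_norm_squared(), partial_sum, math.pi**4 / 90.0)
```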

Exercises

Exercise 7.1 Let $H$ be a Hilbert space and suppose $x$ and $y$ are nonzero elements of $H$. Show that $\|x + y\| = \|x\| + \|y\|$ if and only if $y = cx$, where $c > 0$.

Exercise 7.2 Let $H$ be a Hilbert space. Suppose $T$ and $S$ are operators on $H$ and let $\alpha \in \mathbb{C}$. Show that $(T + S)^* = T^* + S^*$, $(\alpha T)^* = \bar{\alpha}\, T^*$, and $(ST)^* = T^* S^*$.

Exercise 7.3 Suppose $T$ and $S$ are hermitian operators on a Hilbert space. Show that $TS$ is hermitian if and only if $TS = ST$.

Exercise 7.4 Let $H$ be a normed space that satisfies the Parallelogram Law (Theorem 7.10). Show that $H$ is an inner product space. (Hint: Use the polarization formulas given after the statement of Theorem 7.10.)

Exercise 7.5 Let $H$ be a complex inner product space and assume $A : H \to H$ is a bounded linear operator. Show that $(Ax, y)$ can be written as

\[ \frac{1}{4} \Big[ \big( A(x+y), x+y \big) - \big( A(x-y), x-y \big) - i \big( A(x+iy), x+iy \big) + i \big( A(x-iy), x-iy \big) \Big]. \]

Exercise 7.6 Let $H$ be an inner product space and assume $A$ and $B$ are bounded linear operators on $H$.
(a) Show that if $(Ax, y) = (Bx, y)$ for all $x$ and $y$ in $H$, then $A = B$.
(b) Show that if $(Ax, x) = (Bx, x)$ for all $x \in H$ and $H$ is a complex inner product space, then $A = B$.
(c) What assumptions need to be added to $A$ and $B$ in order for (b) to hold when $H$ is a real inner product space?

Exercise 7.7 Suppose $A$ is a bounded linear operator on the complex inner product space $H$. Show that

\[ \|A\| = \sup_{x \in B_H} |(Ax, x)|. \]

Show that the same formula holds in a real inner product space if $A$ is hermitian.

Exercise 7.8 Let $H$ be a complex Hilbert space and let $T : H \to H$ be a bounded linear operator. Show that $T = T^*$ if and only if $(Tx, x) \in \mathbb{R}$ for all $x \in H$. (This equivalence cannot hold in a real Hilbert space because it is necessarily true that $(Tx, y) \in \mathbb{R}$ for all $x$ and $y$ in $H$ when $H$ is a real Hilbert space.)

Exercise 7.9 Let $H$ be an inner product space and assume $T : H \to H$ is a bounded linear operator. Show that $\|T^* T\| = \|T\|^2$.

Exercise 7.10 Let $T : H \to H$ be such that $T(0) = 0$ and $\|T(x) - T(y)\| = \|x - y\|$ for all $x$ and $y$ in $H$. Show that $T$ is a linear isometry from $H$ to itself. (Hint: Show first that $(T(x), T(y)) = (x, y)$ for all $x$ and $y$ in $H$.)

Exercise 7.11 Let $H$ be a Hilbert space. For each $\varphi \in H^*$, let $v_\varphi \in H$ be the unique element (from the Riesz–Fréchet Theorem) satisfying $\varphi(x) = (x, v_\varphi)$ for all $x \in H$. If $T_O^* : H^* \to H^*$ is the operator adjoint of $T$ (see Definition 3.36) and $T_A^* : H \to H$ is the Hilbert space adjoint of $T$ (see Definition 7.18), show that $v_{T_O^*(\varphi)} = T_A^* v_\varphi$ for all $\varphi \in H^*$.

Exercise 7.12 If $V$ is a closed subspace of a Hilbert space $H$, show $H = V \oplus V^\perp$.

Exercise 7.13 If $V$ is a closed subspace of a Hilbert space $H$, show $(V^\perp)^\perp = V$. What is $(V^\perp)^\perp$ if $V$ is not closed?

Exercise 7.14 Let $H$ be a Hilbert space and suppose $(x_n)_{n=1}^\infty$ and $(y_n)_{n=1}^\infty$ are sequences in $H$. If $(x_n)_{n=1}^\infty$ and $(y_n)_{n=1}^\infty$ converge (in norm) to $x$ and $y$, respectively, show that $\lim_{n \to \infty} (x_n, y_n) = (x, y)$.

Exercise 7.15 (Hellinger–Toeplitz Theorem) Let $H$ be a Hilbert space. Prove the following: If $T : H \to H$ is a linear map that satisfies the equation $(Tx, y) = (x, Ty)$ for all $x$ and $y$ in $H$, then $T$ is continuous.

Exercise 7.16 Let $H$ be a Hilbert space with inner product $(\cdot, \cdot)$ and norm $\|\cdot\|$. If $(x_n)_{n=1}^\infty$ and $(y_n)_{n=1}^\infty$ are sequences in $B_H$, and $\lim_{n \to \infty} (x_n, y_n) = 1$, show that $\lim_{n \to \infty} \|x_n - y_n\| = 0$.

Exercise 7.17 Let $H$ be an infinite-dimensional Hilbert space with inner product $(\cdot, \cdot)$. Show that the function $(\cdot, \cdot) : (H, w) \times (H, w) \to \mathbb{C}$ is continuous in each argument separately, but that it is not continuous on the product $(H, w) \times (H, w)$. (In this problem, $(H, w)$ denotes the Hilbert space $H$ endowed with the weak topology.)

Exercise 7.18 Solve the system of differential equations

\[ y'' + \lambda^2 y = 0, \quad \lambda \in \mathbb{R}, \quad y(0) = 1, \quad y(2\pi) = 1. \]

Use your answer to show that $(e_n)_{n \in \mathbb{Z}}$ is an orthonormal basis for $L_2\big(\mathbb{T}, \frac{d\theta}{2\pi}\big)$, where $e_n(\theta) = e^{in\theta}$ and $\theta \in \mathbb{T} = [0, 2\pi)$.

Exercise 7.19 Use Parseval's Identity (Theorem 7.25) to show that $\sum_{n=1}^\infty \frac{1}{n^2} = \frac{\pi^2}{6}$:
(a) Use the function $f(x) = x$ for all $x \in [0, \pi)$ and Theorem 7.43.
(b) Use the function $f(\theta) = \theta$ for all $\theta \in [0, 2\pi)$ and Exercise 7.18.

Exercise 7.20 A theorem from linear algebra states that the trace of a square matrix equals the sum of its eigenvalues. If $T_K : L_2(a, b) \to L_2(a, b)$ is a Hilbert–Schmidt operator defined by the formula $T_K f(x) = \int_a^b K(x, y)\,f(y)\,dy$, then the trace of the Hilbert–Schmidt operator $T_K$ is defined to be

\[ \operatorname{trace}(T_K) = \int_a^b K(x, x)\,dx, \]

whenever it exists. Let $K$ be the kernel in (7.23) and show $\operatorname{trace}(T_K) = \sum_{n=1}^\infty \lambda_n$, where $(\lambda_n)_{n=1}^\infty$ is the sequence of eigenvalues for $T_K$ given in (7.25). (Compare to (6.1.7).)

Exercise 7.21 Compute the integral in (7.26) to show that $\sum_{n=1}^\infty \frac{1}{n^4} = \frac{\pi^4}{90}$.

Exercise 7.22 Suppose $\mu$ and $\nu$ are probability measures on a measure space $(\Omega, \Sigma)$. Assume that $\mu \ll \nu$ and $\varphi$ is the Radon–Nikodým derivative of $\mu$ with respect to $\nu$. Define a map $V : L_2(\mu) \to L_2(\nu)$ by $V(f) = \sqrt{\varphi}\, f$ for all $f \in L_2(\mu)$. Show that this map is a well-defined isometry. Show that $V$ is an isomorphism if and only if $\nu \ll \mu$.

Exercise 7.23 Recall that a function $f : [0, 1] \to \mathbb{R}$ is called absolutely continuous on $[0, 1]$ if $f$ is differentiable almost everywhere (with respect to Lebesgue measure), and if $f' \in L_1(0, 1)$ satisfies the equation $f(x) - f(0) = \int_0^x f'(t)\,dt$ for all $x \in [0, 1]$.
(a) Let $H$ denote the collection of all (real-valued) absolutely continuous functions on $[0, 1]$ such that $f(0) = 0$ and $f' \in L_2(0, 1)$. Show that

\[ (f, g) = \int_0^1 f'(t)\, g'(t)\,dt, \quad \{f, g\} \subseteq H, \]

defines a complete inner product on $H$.
(b) Show that the map $T : H \to L_2(0, 1)$, defined by $Tf = f'$ for all $f \in H$, is an isomorphism. Find $T^{-1}$.
(c) Fix $a \in (0, 1)$ and define a map $\Lambda_a : H \to \mathbb{R}$ by $\Lambda_a(f) = f(a)$ for all $f \in H$. Show that $\Lambda_a$ is a bounded linear functional. Find the element $\varphi_a \in H$ such that $\Lambda_a(f) = (f, \varphi_a)$ for all $f \in H$.

Exercise 7.24 Let $H$ be a Hilbert space and suppose $T : H \to H$ is a compact operator. Show that $T$ is the limit (in operator norm) of a sequence of finite-rank operators.

Exercise 7.25 A Banach space $X$ is called uniformly convex if given $\epsilon > 0$ there exists a $\delta > 0$ such that $\|x - y\| < \epsilon$ whenever $\|x\| \le 1$, $\|y\| \le 1$, and $\|x + y\| > 2 - \delta$. Show that a Hilbert space is uniformly convex.

Exercise 7.26 Suppose $X$ is a non-reflexive Banach space and let $\epsilon > 0$ be given.
(a) Show there exists an $x^{**} \in X^{**}$ such that $\|x^{**}\| = 1$ and

\[ d(x^{**}, X) = \inf\{ d(x^{**}, x) : x \in X \} > 1 - \epsilon. \]

(Here, $d$ is the metric induced by the norm on $X^{**}$.)
(b) Let $x^{**}$ be as found in (a). Show that there exists $x^* \in X^*$ with $\|x^*\| = 1$ and $x^{**}(x^*) > 1 - \epsilon/2$. Pick $x \in X$ with $\|x\| \le 1$ such that $x^*(x) > 1 - \epsilon/2$. Show that there exists $y^* \in X^*$ with $\|y^*\| = 1$ and $y^*(x^{**} - x) > 1 - \epsilon$.
(c) Let $x$ and $x^*$ be as found in (b). Use Goldstine's Theorem to show that there exists a $y \in X$ with $\|y\| \le 1$ such that $x^*(y) > 1 - \epsilon/2$ and $y^*(y - x) > 1 - \epsilon$. Deduce that $\|x + y\| > 2 - \epsilon$ and $\|x - y\| > 1 - \epsilon$.
(d) Deduce that every uniformly convex space is reflexive.

Exercise 7.27 For the following questions, assume 2 0 such that

\[ \frac{1}{2} \big( |1 + t|^p + |1 - t|^p \big) \ge 1 + c_p\, |t|^p, \quad t \in \mathbb{R}. \]

(Hint: Show that the function $\dfrac{|1+t|^p + |1-t|^p - 2}{|t|^p}$ is bounded below.)
(b) Deduce from (a) that if $f$ and $g$ are functions in $L_p(0, 1)$, then

\[ \frac{1}{2} \big( \|f + g\|^p + \|f - g\|^p \big) \ge \|f\|^p + c_p\, \|g\|^p. \]

(c) Conclude that Lp(0, 1) is uniformly convex. (This is also true if 1

In this chapter, we will study Banach spaces with a multiplication. These spaces are called Banach algebras. Throughout this section, we will confine ourselves to considering complex Banach spaces, which will allow us to make use of powerful theorems from complex analysis. (For a brief review of results from complex analysis, see Sect. B.2 in the appendix.)

8.1 The Spectral Radius

We start with some basic definitions and properties.

Definition 8.1 The set $A$ is called an associative algebra (over the scalar field $K$) if it is a vector space over $K$ together with an operation called multiplication (often denoted either by $\cdot$ or juxtaposition) that satisfies the following properties:
(i) $a \cdot (b \cdot c) = (a \cdot b) \cdot c$ (associativity),
(ii) $(a + b) \cdot c = a \cdot c + b \cdot c$ (right-distribution),
(iii) $a \cdot (b + c) = a \cdot b + a \cdot c$ (left-distribution),
(iv) $\lambda (a \cdot b) = (\lambda a) \cdot b = a \cdot (\lambda b)$ (bilinearity of scalar multiplication),
where $a$, $b$, and $c$ are elements of the set $A$ and $\lambda$ is a scalar in $K$.

We are interested in a type of associative algebra that is also a Banach space, and one in which the norm is what is called submultiplicative. A norm on an associative algebra is submultiplicative if

\[ \|a \cdot b\| \le \|a\|\, \|b\| \]

for all elements $a$ and $b$ in the algebra.

Definition 8.2 A Banach algebra is a Banach space that is also an associative algebra with a submultiplicative norm.

Example 8.3 Suppose $X$ is a Banach space. Then the space $L(X)$ of bounded linear operators on $X$ forms a Banach algebra with multiplication given by composition.

© Springer Science+Business Media, LLC 2014. A. Bowers, N. J. Kalton, An Introductory Course in Functional Analysis, Universitext, DOI 10.1007/978-1-4939-1945-1_8

In the space L(X) we distinguish a particular operator I that has the property I(x) = x for all x ∈ X. This operator, known as the identity operator, has norm one and is such that I ◦ T = T = T ◦ I for all T ∈ L(X). Elements of a Banach algebra with this property play a special role. Definition 8.4 An element 1 in a Banach algebra A is called an identity element if

1 · a = a = a · 1, a ∈ A.

We call an algebra unital if there is an identity $1$ such that $\|1\| = 1$. If there is a risk of confusing the identity element $1$ in $A$ with the number 1 in the scalar field, we will denote the identity element in $A$ by $1_A$. We will assume that all Banach algebras with an identity are unital; that is, we will assume $\|1\| = 1$. We can make this assumption without any loss of generality. To see this, suppose $A$ is a Banach algebra with identity $1_A$, but $\|1_A\| \neq 1$. For each $a \in A$, let $L_a$ denote left-multiplication:

\[ L_a(b) = ab, \quad b \in A. \]

Observe that $\|L_a\| \le \|a\|$, because the norm is submultiplicative. Additionally,

\[ \|a\| = \|L_a(1_A)\| \le \|L_a\|\, \|1_A\|. \]

Thus, if we let $k = 1/\|1_A\|$, then

\[ k\, \|a\| \le \|L_a\| \le \|a\|. \]

Notice also that if $L_a = L_b$ for $a$ and $b$ in $A$, then $L_a(1_A) = L_b(1_A)$, and so $a = b$. Therefore, the map $a \mapsto L_a$ determines an embedding $A \to L(A)$. Since $\|L_{1_A}\| = 1$, we can then think of $A$ as a closed subalgebra of a unital Banach algebra.

Definition 8.5 A Banach algebra is commutative if $ab = ba$ for all elements $a$ and $b$ in the algebra.

Example 8.6 We now consider several examples of commutative Banach algebras.
(a) If $K$ is a compact Hausdorff space, then $C(K)$ forms a commutative Banach algebra under pointwise multiplication:

(f · g)(s) = f (s) g(s), {f , g}⊆C(K), s ∈ K.

The identity in $C(K)$ is $\chi_K$, the function that is constantly 1 on $K$.
(b) Let $D = \{z \in \mathbb{C} : |z| < 1\}$. We denote by $A(D)$ the collection of all analytic functions $f : D \to \mathbb{C}$ that are extendable to elements of $C(\overline{D})$. When equipped with the supremum norm and pointwise multiplication, the space $A(D)$ is a Banach algebra, known as the disk algebra. (Completeness follows from Morera's Theorem (Theorem B.13).) We remark that $A(D)$ can be viewed as a subalgebra of both $C(\overline{D})$ and $C(\mathbb{T})$. In the context of the complex plane, we use $\mathbb{T}$ to denote the unit circle $\mathbb{T} = \partial D = \{z \in \mathbb{C} : |z| = 1\}$. The inclusion of $A(D)$ in $C(\mathbb{T})$ follows from the Maximum Modulus Theorem (Theorem B.11).
(c) Consider the sequence space $\ell_1 = \ell_1(\mathbb{Z}_+)$, where $\mathbb{Z}_+ = \{0, 1, 2, \ldots\}$. We will define a product $*$ on $\ell_1$. Suppose $a = (a_k)_{k=0}^\infty$ and $b = (b_k)_{k=0}^\infty$ are sequences in $\ell_1$. Define the product $a * b$ to be the sequence of coefficients of the formal power series

\[ \left( \sum_{k=0}^\infty a_k t^k \right) \left( \sum_{k=0}^\infty b_k t^k \right). \tag{8.1} \]

Then $(a * b)_k = \sum_{j=0}^k a_{k-j}\, b_j$ for $k \in \mathbb{Z}_+$, and so

\[ a * b = (a_0 b_0,\ a_1 b_0 + a_0 b_1,\ a_2 b_0 + a_1 b_1 + a_0 b_2,\ \ldots). \]

From (8.1), we deduce that $\|a * b\|_1 \le \|a\|_1\, \|b\|_1$, and so $\ell_1$ becomes a Banach algebra with this multiplication. This example is known as a convolution algebra because the product is known as a convolution.
(d) Now consider the function space $L_1(\mathbb{R})$. We will turn this space into a convolution algebra. If $f$ and $g$ are functions in $L_1(\mathbb{R})$, we define the convolution of $f$ with $g$ to be the function

\[ (f * g)(s) = \int_{-\infty}^\infty f(s - t)\, g(t)\,dt, \quad s \in \mathbb{R}. \]

To compute a bound for the $L_1$-norm of $f * g$, we observe that

\[ \int_{-\infty}^\infty |(f * g)(s)|\,ds \le \int_{-\infty}^\infty \int_{-\infty}^\infty |f(s - t)\, g(t)|\,dt\,ds. \]

By Fubini's Theorem,

\[ \int_{-\infty}^\infty \int_{-\infty}^\infty |f(s - t)\, g(t)|\,dt\,ds = \int_{-\infty}^\infty \left( \int_{-\infty}^\infty |f(s - t)|\,ds \right) |g(t)|\,dt. \]

By the translation-invariance of Lebesgue measure,

\[ \int_{-\infty}^\infty |f(s - t)|\,ds = \int_{-\infty}^\infty |f(s)|\,ds = \|f\|_1, \]

and so we have $\|f * g\|_1 \le \|f\|_1\, \|g\|_1$. The space $L_1(\mathbb{R})$ becomes a Banach algebra with multiplication given by convolution. It is worth noting that this Banach algebra lacks an identity element.
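The discrete convolution in part (c) is simple enough to compute directly for finitely supported sequences. The sketch below is an illustration only; the function names `convolve` and `norm1` and the sample sequences are assumptions of this example. It computes the coefficients of the product of two polynomials and checks submultiplicativity of the $\ell_1$-norm on that one pair.

```python
def convolve(a, b):
    # (a * b)_k = sum over j of a_{k-j} b_j, for finitely supported sequences
    # stored as lists (entries beyond the end of a list are taken to be zero).
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def norm1(a):
    # The l1-norm of a finitely supported sequence.
    return sum(abs(x) for x in a)

a = [1.0, -2.0, 0.5]
b = [3.0, 1.0]
c = convolve(a, b)

if __name__ == "__main__":
    print(c)
    print(norm1(c), "<=", norm1(a) * norm1(b))
```

In this representation the sequence `[1.0]`, that is $(1, 0, 0, \ldots)$, acts as the identity for the product, in contrast with the convolution algebra $L_1(\mathbb{R})$ of part (d).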

All of the Banach algebras listed in Example 8.6 are commutative, but not all Banach algebras are commutative. An example of a noncommutative Banach algebra is $L(X)$, where $X$ is a Banach space of dimension greater than one.

In some cases (such as Example 8.6(d)) the Banach algebra $A$ will lack an identity element. In such a circumstance, the algebra can always be embedded in an algebra with an identity. Let $\tilde{A} = A \oplus \mathbb{C}$, and define a multiplication on $\tilde{A}$ by

\[ (a, \lambda_1) \cdot (b, \lambda_2) = (ab + \lambda_1 b + \lambda_2 a,\ \lambda_1 \lambda_2), \]

where $\{a, b\} \subseteq A$ and $\{\lambda_1, \lambda_2\} \subseteq \mathbb{C}$. Then $\tilde{A}$ is a unital Banach algebra with identity element $(0, 1)$. Equivalently, we may simply add an identity element to $A$. To do this, let $1$ be an identity element and let $\tilde{A} = A \oplus 1\mathbb{C}$. This time, define a multiplication by

\[ (a + 1\lambda_1) \cdot (b + 1\lambda_2) = ab + \lambda_1 b + \lambda_2 a + 1\lambda_1 \lambda_2, \]

where (again) $\{a, b\} \subseteq A$ and $\{\lambda_1, \lambda_2\} \subseteq \mathbb{C}$.

The above discussion allows us to consider, without loss of generality, only unital Banach algebras. This is a convention we will adopt throughout the remainder of this chapter.

Definition 8.7 Let $A$ be a unital Banach algebra. An element $a \in A$ is said to be invertible if there exists some $a^{-1} \in A$ such that $a a^{-1} = 1 = a^{-1} a$. If it exists, $a^{-1}$ is called the inverse of $a$.

Example 8.8 Consider the algebra $C(K)$ of continuous functions on the compact Hausdorff space $K$. A function $f \in C(K)$ is invertible (in the algebraic sense) if $f(s) \neq 0$ for all $s \in K$. In this case, the inverse of $f$ is given by the function $1/f$.

If an inverse exists, it is necessarily unique, but not every element of a Banach algebra is invertible. To illustrate this point, consider the following example.

Example 8.9 Consider the Banach algebra $L(\ell_2)$ of operators on $\ell_2$ with identity operator $I$. Recall the left shift operator $L$ and the right shift operator $R$ on $\ell_2$:

\[ L(\xi_1, \xi_2, \xi_3, \ldots) = (\xi_2, \xi_3, \xi_4, \ldots) \quad\text{and}\quad R(\xi_1, \xi_2, \xi_3, \ldots) = (0, \xi_1, \xi_2, \ldots), \]

where $(\xi_n)_{n=1}^\infty$ is an element of $\ell_2$. Neither $R$ nor $L$ has an inverse: $R$ is not onto, and $L$ has a zero eigenvalue. Observe that $L$ and $R$ do satisfy the relationship $LR = I$; however, they are not inverses of one another because $RL \neq I$.

In a Banach algebra $A$, we call $b$ the left-inverse of $a$ whenever $b \cdot a = 1$. Correspondingly, when this equation is satisfied, we call $a$ the right-inverse of $b$. In the preceding example, $L$ is the left-inverse of $R$ and $R$ is the right-inverse of $L$.
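The one-sided relationship between the shifts can be seen on finitely supported sequences. In this small sketch (the tuple representation and the names `left_shift` and `right_shift` are assumptions made for illustration), a tuple stands for a sequence whose remaining entries are all zero.

```python
def left_shift(xs):
    # L(x1, x2, x3, ...) = (x2, x3, ...): drop the first coordinate.
    return xs[1:]

def right_shift(xs):
    # R(x1, x2, x3, ...) = (0, x1, x2, ...): insert a zero in front.
    return (0,) + xs

x = (1, 2, 3)

if __name__ == "__main__":
    print(left_shift(right_shift(x)))   # L R x recovers x
    print(right_shift(left_shift(x)))   # R L x loses the first coordinate
```

The second line of output differs from `x` in its first coordinate, which reflects the fact that $RL$ is not the identity even though $LR$ is.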

Simple algebra shows that if $a$ has both a left-inverse and a right-inverse, then they must coincide (and hence $a$ is invertible). (See Exercise 8.1.)

The next proposition, first due to the mathematician Carl Neumann (in the context of operators), will prove central to all that follows.

Proposition 8.10 (Neumann Series) Let $A$ be a unital Banach algebra and let $a \in A$. If $\|a\| < 1$, then $1 - a$ is invertible and the inverse is given by the Neumann series:

\[ (1 - a)^{-1} = \sum_{n=0}^\infty a^n = 1 + a + a^2 + a^3 + \cdots. \]

Proof For each $n \in \mathbb{N}$, let $S_n = 1 + a + \cdots + a^n$. Observe that the series $\sum_{n=0}^\infty \|a^n\|$ converges, since $\|a^n\| \le \|a\|^n$ and $\|a\| < 1$. Consequently, $(S_n)_{n=1}^\infty$ is a Cauchy sequence, and hence converges to some $S \in A$. Observe that $1 - a$ and $S_n$ commute for each $n \in \mathbb{N}$, and

\[ (1 - a)\, S_n = S_n\, (1 - a) = 1 - a^{n+1}. \]

Taking limits, we have

\[ (1 - a)\, S = S\, (1 - a) = 1, \]

and so $S = \sum_{n=0}^\infty a^n$ is the inverse of $1 - a$, as required. □

The collection of invertible elements in a Banach algebra turns out to be very important.

Definition 8.11 Let $A$ be a unital Banach algebra. The set $G = \{a \in A : a^{-1} \text{ exists}\}$ is called the group of invertible elements in $A$.

We leave it to the reader to verify that the group of invertible elements is indeed a group. (See Exercise 8.5.) In addition to being a group, it also has interesting topological properties.

Proposition 8.12 The group of invertible elements in a unital Banach algebra is an open set.

Proof Let $A$ be a unital Banach algebra and let $G$ be the group of invertible elements in $A$. Let $a \in G$. We will find an open ball centered at $a$ that is contained in $G$. Let

\[ B = \left\{ b \in A : \|a - b\| < \frac{1}{\|a^{-1}\|} \right\}. \]

Let b ∈ B. Observe that

\[ b = a - (a - b) = a \big( 1 - a^{-1}(a - b) \big). \]

By assumption, $a$ is invertible. By the choice of $b$, we also have

\[ \|a^{-1}(a - b)\| \le \|a^{-1}\|\, \|a - b\| < 1. \]

Thus, by Proposition 8.10, we conclude that $1 - a^{-1}(a - b)$ is invertible. Consequently, both $a$ and $1 - a^{-1}(a - b)$ are elements of the group $G$. Therefore,

\[ b = a \big( 1 - a^{-1}(a - b) \big) \in G, \]

as the product of two elements in the group. Since the choice of $b \in B$ was arbitrary, we conclude that $B \subseteq G$. Thus, $B$ is an open set such that $a \in B$ and $B \subseteq G$. Therefore, $G$ is an open set in $A$. □

In the proof of Proposition 8.12, we saw that if $a \in A$ is invertible, and if $b \in A$ is sufficiently close to $a$, then $b$ is invertible, too. In fact, thanks to Proposition 8.10, we can even provide a formula for the inverse of $b$:

\[ \begin{aligned} b^{-1} &= \big( 1 - a^{-1}(a - b) \big)^{-1} a^{-1} \\ &= \big( 1 + a^{-1}(a - b) + a^{-1}(a - b)\,a^{-1}(a - b) + \cdots \big)\, a^{-1} \\ &= a^{-1} + a^{-1}(a - b)\,a^{-1} + a^{-1}(a - b)\,a^{-1}(a - b)\,a^{-1} + \cdots. \end{aligned} \tag{8.2} \]

If $A$ is a commutative Banach algebra, then

\[ b^{-1} = a^{-1} + (a^{-1})^2 (a - b) + (a^{-1})^3 (a - b)^2 + \cdots. \]
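Both the Neumann series and the perturbation formula for $b^{-1}$ can be tried out in the Banach algebra of $2 \times 2$ real matrices. The following is a minimal sketch with hand-rolled matrix helpers; the function names and the particular matrices are illustrative choices, not taken from the text, and a truncated series stands in for the infinite sum.

```python
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matadd(p, q):
    return [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]

def matsub(p, q):
    return [[p[i][j] - q[i][j] for j in range(2)] for i in range(2)]

def inverse(p):
    # Closed-form inverse of a 2x2 matrix with nonzero determinant.
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    return [[p[1][1] / det, -p[0][1] / det], [-p[1][0] / det, p[0][0] / det]]

def neumann_sum(c, terms=60):
    # Partial sum 1 + c + c^2 + ... + c^(terms - 1) of the Neumann series.
    total, power = IDENTITY, IDENTITY
    for _ in range(terms - 1):
        power = matmul(power, c)
        total = matadd(total, power)
    return total

def max_abs_diff(p, q):
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))

# Neumann series for (1 - small)^{-1}, where small has small entries.
small = [[0.2, 0.1], [0.0, 0.3]]
series_inverse = neumann_sum(small)

# Perturbation formula: b^{-1} = (sum over n of (a^{-1}(a - b))^n) a^{-1} for b near a.
a = [[2.0, 1.0], [0.0, 3.0]]
b = [[2.1, 0.9], [0.1, 3.0]]
a_inv = inverse(a)
c = matmul(a_inv, matsub(a, b))
b_inv_series = matmul(neumann_sum(c), a_inv)

if __name__ == "__main__":
    print(max_abs_diff(series_inverse, inverse(matsub(IDENTITY, small))))
    print(max_abs_diff(b_inv_series, inverse(b)))
```

Both printed differences should be at the level of rounding error, since the entries of `small` and of $a^{-1}(a - b)$ are small enough for the series to converge quickly.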

Definition 8.13 Suppose $A$ is a complex unital Banach algebra with group of invertible elements $G$. Let $a \in A$. The spectrum of $a$ is the subset of $\mathbb{C}$ defined by $\operatorname{Sp}(a) = \{\lambda : \lambda 1 - a \notin G\}$. The resolvent of $a$ is the subset of $\mathbb{C}$ defined by $\operatorname{Res}(a) = \{\lambda : \lambda 1 - a \in G\}$.

Example 8.14 Consider $M_n(\mathbb{C})$, the Banach algebra of $n \times n$ complex matrices, where $n \in \mathbb{N}$. Let $T \in M_n(\mathbb{C})$ be a square matrix. A complex scalar $\lambda$ is in $\operatorname{Sp}(T)$ provided $\lambda I - T$ is a non-invertible matrix; i.e., when $\det(\lambda I - T) = 0$. Consequently, we have $\lambda \in \operatorname{Sp}(T)$ if and only if $\lambda$ is an eigenvalue for $T$. Since $\mathbb{C}$ is algebraically closed, $T$ will always have at least one eigenvalue, and so $\operatorname{Sp}(T)$ is nonempty for all $T \in M_n(\mathbb{C})$. Therefore, in this case, we can identify the spectrum of $T$ as the collection of eigenvalues of $T$; that is, $\operatorname{Sp}(T) = \{\lambda_1, \ldots, \lambda_r\}$, for a finite sequence of scalars, where $r \le n$.

When $A$ is a finite-dimensional space of operators, the spectrum of an element is well-understood in terms of eigenvalues. In the infinite-dimensional case, however, things are more subtle.

Example 8.15 Define an operator $T$ on $C[0, 1]$ by

\[ Tf(x) = x f(x), \quad f \in C[0, 1], \ x \in [0, 1]. \]

We saw in Example 6.3 that $T$ has no eigenvalues. Even so, the spectrum of $T$ is not empty. In fact, for any $\lambda \in [0, 1]$, the operator $\lambda 1 - T$ is not invertible. If $f \in C[0, 1]$, then

\[ (\lambda 1 - T) f(x) = \lambda f(x) - x f(x) = (\lambda - x) f(x), \quad x \in [0, 1]. \]

The inverse operator would have to be

\[ Sg(x) = \frac{g(x)}{\lambda - x}, \quad g \in C[0, 1], \ x \in [0, 1], \]

but $S$ is not a bounded operator on $C[0, 1]$ if $\lambda \in [0, 1]$. Therefore, the spectrum of $T$ is $\operatorname{Sp}(T) = [0, 1]$. (Note that $S$ is bounded if $\lambda \notin [0, 1]$.)

Example 8.16 Let $X$ be a Banach space and suppose $T \in L(X)$. By definition, the complex scalar $\lambda$ is in $\operatorname{Sp}(T)$ whenever $\lambda 1 - T$ is not invertible. Recall that an operator is not invertible either because it is not injective or not surjective (or both). There are three (not necessarily disjoint) possibilities:
(1) Suppose $\lambda 1 - T$ is not one-to-one. In this case, there exists a nonzero $x \in X$ such that $(\lambda 1 - T)(x) = 0$, and so $Tx = \lambda x$. It follows that $\lambda$ is an eigenvalue of $T$.
(2) Suppose $\lambda 1 - T$ is one-to-one, but $(\lambda 1 - T)(X)$ is a proper closed linear subspace of $X$.
By the Hahn–Banach Theorem, there is some $x^* \in X^*$ such that $x^* \neq 0$, but $x^*\big( (\lambda 1 - T)(X) \big) = 0$. (See Exercise 5.20.) It follows that $(T^* x^*)(x) = (\lambda x^*)(x)$ for every $x \in X$, and so $\lambda$ is an eigenvalue of $T^*$.
(3) Suppose $\lambda 1 - T$ is one-to-one, but $\lambda 1 - T$ is not bounded below; that is, there does not exist a constant $c > 0$ such that

\[ \|\lambda x - Tx\| \ge c\, \|x\| \]

for all $x \in X$. Therefore, we can find a sequence $(x_n)_{n=1}^\infty$ in $X$ such that $\|x_n\| = 1$ for all $n \in \mathbb{N}$ and such that

\[ \lim_{n \to \infty} \|\lambda x_n - T x_n\| = 0. \tag{8.3} \]

(See Exercise 1.12.) When $\lambda$ satisfies (8.3) for some sequence $(x_n)_{n=1}^\infty$, we call it an approximate eigenvalue of $T$.

We conclude that if $\lambda \in \operatorname{Sp}(T)$, then $\lambda$ is either an eigenvalue of $T$ or $T^*$, or an approximate eigenvalue of $T$. We will reserve the term approximate eigenvalue to describe those $\lambda \in \operatorname{Sp}(T)$ which are not eigenvalues of $T$. It may happen, however, that $\lambda$ is both an approximate eigenvalue of $T$ and an eigenvalue of $T^*$. (See Example 8.18.)

Definition 8.17 Let $T \in L(X)$ for a Banach space $X$. The point spectrum of $T$ is the set of all eigenvalues of $T$. The approximate point spectrum of $T$ is the set of all approximate eigenvalues of $T$.

Example 8.18 Let us revisit Example 8.15. Suppose $T \in L(C[0, 1])$ is defined by the formula:

\[ Tf(x) = x f(x), \quad f \in C[0, 1], \ x \in [0, 1]. \]

We already mentioned that $T$ has no eigenvalues. We now identify the eigenvalues of $T^*$. We start by computing $T^* : M[0, 1] \to M[0, 1]$. If $\mu \in M[0, 1]$, then

\[ T^*\mu(f) = \mu(Tf) = \int_0^1 x f(x)\, \mu(dx). \]

Therefore, $T^*\mu(dx) = x\, \mu(dx)$. Suppose that $\lambda$ is an eigenvalue of $T^*$. Then there exists some $\mu \in M[0, 1]$ such that $T^*\mu(dx) = \lambda\, \mu(dx)$. Then for every Borel set $A$,

\[ \int_A x\, \mu(dx) = \int_A \lambda\, \mu(dx). \]

This implies $x = \lambda$ a.e.($\mu$), which means $\mu = \delta_\lambda$, the Dirac measure at $\lambda$. Therefore, $T^*\delta_\lambda = \lambda \delta_\lambda$ for each $\lambda \in [0, 1]$, and so each $\lambda \in [0, 1]$ is an eigenvalue of $T^*$.

We established in Example 8.15 that $\operatorname{Sp}(T) = [0, 1]$ and we have shown that each point in $[0, 1]$ is an eigenvalue of $T^*$. We now show that each point in $[0, 1]$ is also an approximate eigenvalue of $T$. For each $\lambda \in [0, 1]$, we need a sequence $(f_n)_{n=1}^\infty$ of continuous functions such that $\|f_n\|_\infty = 1$ and $\|\lambda f_n - T f_n\|_\infty \to 0$ as $n \to \infty$. There are many such sequences. It suffices to find a sequence $(f_n)_{n=1}^\infty$ for which $f_n$ is a function that peaks with value 1 at $\lambda$ and then decreases to zero, where $f_n$ decreases more rapidly to zero as $n$ increases. As an example, let $\lambda \in [0, 1]$ be given and for each $n \in \mathbb{N}$, define $f_n \in C[0, 1]$ as follows:

\[ f_n(x) = \begin{cases} 0 & \text{if } 0 \le x < \lambda - \frac{1}{n}, \\ n(x - \lambda) + 1 & \text{if } \lambda - \frac{1}{n} \le x < \lambda, \\ n(\lambda - x) + 1 & \text{if } \lambda \le x < \lambda + \frac{1}{n}, \\ 0 & \text{if } \lambda + \frac{1}{n} \le x \le 1. \end{cases} \]

For each $n \in \mathbb{N}$, the function $f_n$ is continuous and $\|f_n\|_\infty = 1$. If $|x - \lambda| > 1/n$, then $f_n(x) = 0$. On the other hand, if $|x - \lambda| < 1/n$, then

\[ |(\lambda - x)\, f_n(x)| \le |\lambda - x|\, \big( n|\lambda - x| + 1 \big) < \frac{1}{n} \left( n \cdot \frac{1}{n} + 1 \right) = \frac{2}{n}. \]

Therefore,

\[ \|\lambda f_n - T f_n\|_\infty = \sup_{x \in [0,1]} |\lambda f_n(x) - x f_n(x)| \le \frac{2}{n}. \]

Naturally, this tends to zero, and it follows that $\lambda$ is an approximate eigenvalue for the operator $T$.

Theorem 8.19 Let $A$ be a complex unital Banach algebra. If $a \in A$, then $\operatorname{Sp}(a)$ is a nonempty compact set.

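Proof We will first show that $\operatorname{Sp}(a)$ is compact. Let $G$ denote the set of invertible elements in $A$. By Proposition 8.12, $G$ is an open set in $A$. Define a map $\rho : \mathbb{C} \to A$ by $\rho(\lambda) = \lambda 1 - a$, $\lambda \in \mathbb{C}$. Then $\rho$ is a continuous map, and so $\rho^{-1}(G)$ is an open set in $\mathbb{C}$. Consequently,

\[ \operatorname{Sp}(a) = \{\lambda : \lambda 1 - a \notin G\} = \mathbb{C} \setminus \rho^{-1}(G) \]

is closed in $\mathbb{C}$. If $|\lambda| > \|a\|$, then $1 - (a/\lambda)$ is invertible, by Proposition 8.10. Thus, $\lambda 1 - a$ is invertible, as well. Consequently, if $\lambda 1 - a$ is not invertible, then $|\lambda| \le \|a\|$. Therefore, $\operatorname{Sp}(a) \subseteq \|a\|\, B_{\mathbb{C}}$, and so is compact by the Heine–Borel Theorem.

Now we show that $\operatorname{Sp}(a)$ is nonempty. Assume to the contrary that $\operatorname{Sp}(a) = \emptyset$. Then $\lambda 1 - a$ is invertible for all $\lambda \in \mathbb{C}$. Let $\varphi \in A^*$ and define $f : \mathbb{C} \to \mathbb{C}$ by

\[ f(\lambda) = \varphi\big( (\lambda 1 - a)^{-1} \big), \quad \lambda \in \mathbb{C}. \tag{8.4} \]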

(We call f the resolvent function.) We claim that f is analytic on C (i.e., f is an entire function). Let λ ∈ C.We will show that there is a neighborhood of λ in which f has a power series expansion. Suppose μ ∈ C is such that

\[ |\mu - \lambda| < \|(\lambda 1 - a)^{-1}\|^{-1}. \tag{8.5} \]

Using a bit of algebra:

\[ \mu 1 - a = (\lambda 1 - a) + (\mu - \lambda) 1 = (\lambda 1 - a) \big( 1 + (\mu - \lambda)(\lambda 1 - a)^{-1} \big). \]

By assumption, $\mu 1 - a$ is invertible, and so

\[ (\mu 1 - a)^{-1} = \big( 1 + (\mu - \lambda)(\lambda 1 - a)^{-1} \big)^{-1} (\lambda 1 - a)^{-1}. \]

Because of (8.5), we know $\|(\mu - \lambda)(\lambda 1 - a)^{-1}\| < 1$, and thus (by Proposition 8.10) there is a Neumann series for the inverse of $1 + (\mu - \lambda)(\lambda 1 - a)^{-1}$:

\[ \big( 1 + (\mu - \lambda)(\lambda 1 - a)^{-1} \big)^{-1} = \sum_{n=0}^\infty (-1)^n (\mu - \lambda)^n (\lambda 1 - a)^{-n}. \]

Therefore,

\[ (\mu 1 - a)^{-1} = \sum_{n=0}^\infty (-1)^n (\mu - \lambda)^n (\lambda 1 - a)^{-n-1}. \]

Since $\varphi \in A^*$,

\[ f(\mu) = f(\lambda) + \sum_{n=1}^\infty (-1)^n (\mu - \lambda)^n\, \varphi\big( (\lambda 1 - a)^{-n-1} \big). \]

It follows that $f$ is analytic on $\mathbb{C}$.

We claim that $f$ is a bounded function. To that end, suppose that $|\lambda| > \|a\|$. Then $1 - a/\lambda$ is invertible, by Proposition 8.10. Hence, $\lambda 1 - a$ is invertible and has Neumann series

\[ (\lambda 1 - a)^{-1} = \lambda^{-1} \sum_{n=0}^\infty \lambda^{-n} a^n. \]

Thus, for $|\lambda| > \|a\|$,

\[ f(\lambda) = \varphi\big( (\lambda 1 - a)^{-1} \big) = \lambda^{-1} \sum_{n=0}^\infty \lambda^{-n} \varphi(a^n). \]

Consequently, for $|\lambda| > \|a\|$,

\[ |f(\lambda)| \le \frac{1}{|\lambda|} \sum_{n=0}^\infty |\lambda|^{-n}\, \|\varphi\|\, \|a\|^n = \frac{\|\varphi\|}{|\lambda|} \sum_{n=0}^\infty \left( \frac{\|a\|}{|\lambda|} \right)^n. \]

Since $\|a\|/|\lambda| < 1$, by assumption, the geometric series converges, and so

\[ |f(\lambda)| \le \frac{\|\varphi\|}{|\lambda|} \cdot \frac{1}{1 - \frac{\|a\|}{|\lambda|}} = \frac{\|\varphi\|}{|\lambda| - \|a\|}. \tag{8.6} \]

Because $f$ is analytic on $\mathbb{C}$, there exists some $M \ge 0$ such that $|f(\lambda)| \le M$ for all $|\lambda| \le 2\|a\|$. On the other hand, if $|\lambda| \ge 2\|a\|$, then $|f(\lambda)| \le \|\varphi\|/\|a\|$, because of (8.6). It follows that $f$ is bounded by $\max\{M, \|\varphi\|/\|a\|\}$.

We have established that $f$ is a bounded entire function. Therefore, by Liouville's Theorem (Theorem B.14), $f$ is constant. In fact, since

\[ \lim_{|\lambda| \to \infty} \frac{\|\varphi\|}{|\lambda| - \|a\|} = 0, \]

we conclude that $f = 0$.

We have established that $f(\lambda) = 0$ for all $\lambda \in \mathbb{C}$. The choice of $\varphi \in A^*$ in (8.4) was arbitrary, and so we conclude that $\varphi\big( (\lambda 1 - a)^{-1} \big) = 0$ for all $\varphi \in A^*$. The only way this can happen is if $(\lambda 1 - a)^{-1} = 0$ (because of the Hahn–Banach Theorem). This is a contradiction, and so we conclude that $\operatorname{Sp}(a)$ is nonempty. □

For the next theorem, we recall that an algebra is a field if it is commutative and if every nonzero element is invertible.

Theorem 8.20 (Gelfand–Mazur Theorem) Suppose $A$ is a complex Banach algebra. If $A$ is a field, then $A$ is isometrically isomorphic to $\mathbb{C}$.

Proof Let $a \in A$. By Theorem 8.19, $\operatorname{Sp}(a)$ is nonempty. Thus there exists some $\lambda \in \mathbb{C}$ such that $\lambda 1 - a$ is not invertible. By assumption, the only noninvertible element is 0, and so $\lambda 1 - a = 0$. Therefore, $a = \lambda 1$, and the result follows. □

Theorem 8.20 remains true if we replace field with skew-field (also known as a noncommutative field or division algebra). If $A$ is a real Banach algebra which is a skew-field, then $A$ is $\mathbb{R}$, $\mathbb{C}$, or $\mathbb{H}$. Here, we view $\mathbb{C}$ as a real Banach algebra. The symbol $\mathbb{H}$ denotes the quaternions. (We use $\mathbb{H}$ in honor of the Irish mathematician Sir William Hamilton, who first described the quaternions in 1843.) We remark that $\mathbb{H}$ is not a complex Banach algebra because it does not satisfy property (iv) of Definition 8.1.
The Gelfand–Mazur Theorem dates back to 1938, when Mazur stated in an article that every normed division algebra over $\mathbb{R}$ is either $\mathbb{R}$, $\mathbb{C}$, or $\mathbb{H}$ [25]. His original proof did not appear until much later, in 1973 [37]. The theorem now called the Gelfand–Mazur Theorem was proved by Gelfand in 1941 [14].

Definition 8.21 Let $A$ be a unital Banach algebra. If $a \in A$, then the number
\[ r(a) = \max\{|\lambda| : \lambda \in \mathrm{Sp}(a)\} \]
is called the spectral radius of $a$.

By Theorem 8.19, the spectrum of $a$ is nonempty and compact, and so we know there is some $\lambda \in \mathrm{Sp}(a)$ which achieves the maximum in Definition 8.21. We also know that $r(a) \le \|a\|$, because if $\lambda \in \mathrm{Sp}(a)$ and $|\lambda| > \|a\|$, then $\|a/\lambda\| < 1$, and so Proposition 8.10 would imply that $\lambda 1 - a$ is invertible, which is a contradiction.

Lemma 8.22 Let $A$ be a unital Banach algebra. If $a \in A$, then $r(a) \le \|a^n\|^{1/n}$ for all $n \in \mathbb{N}$.

Proof It will suffice to show that if $\lambda \in \mathrm{Sp}(a)$, then $\lambda^n \in \mathrm{Sp}(a^n)$ for all $n \in \mathbb{N}$. (Because then $|\lambda|^n \le r(a^n) \le \|a^n\|$.) Let $n \in \mathbb{N}$ and $\lambda \in \mathrm{Sp}(a)$. Observe that
\[ \lambda^n 1 - a^n = (\lambda 1 - a)(\lambda^{n-1} 1 + \lambda^{n-2} a + \cdots + \lambda a^{n-2} + a^{n-1}) = (\lambda^{n-1} 1 + \lambda^{n-2} a + \cdots + \lambda a^{n-2} + a^{n-1})(\lambda 1 - a). \]
Suppose $\lambda^n 1 - a^n$ is invertible. Then
\[ 1 = (\lambda 1 - a)(\lambda^{n-1} 1 + \lambda^{n-2} a + \cdots + \lambda a^{n-2} + a^{n-1})(\lambda^n 1 - a^n)^{-1} = (\lambda^n 1 - a^n)^{-1}(\lambda^{n-1} 1 + \lambda^{n-2} a + \cdots + \lambda a^{n-2} + a^{n-1})(\lambda 1 - a). \]
From this, however, it follows that $\lambda 1 - a$ is invertible. More precisely, we have
\[ (\lambda 1 - a)^{-1} = (\lambda^n 1 - a^n)^{-1}(\lambda^{n-1} 1 + \lambda^{n-2} a + \cdots + \lambda a^{n-2} + a^{n-1}). \]
(See Exercise 8.1.) This contradicts the assumption that $\lambda \in \mathrm{Sp}(a)$. Thus, $\lambda^n 1 - a^n$ is not invertible, and so $\lambda^n \in \mathrm{Sp}(a^n)$, as required. □

We now come to a key result, one which provides a formula for computing the spectral radius.

Theorem 8.23 (Spectral Radius Formula) Let $A$ be a unital Banach algebra. If $a \in A$, then
\[ r(a) = \lim_{n \to \infty} \|a^n\|^{1/n} = \inf_{n \in \mathbb{N}} \|a^n\|^{1/n}. \]

Proof By Lemma 8.22, we know that
\[ r(a) \le \inf_{n \in \mathbb{N}} \|a^n\|^{1/n} \le \liminf_{n \to \infty} \|a^n\|^{1/n}. \]

It will suffice, therefore, to show that
\[ \limsup_{n \to \infty} \|a^n\|^{1/n} \le r(a). \]
Suppose $\phi \in A^*$ and let

\[ f(\lambda) = \phi((\lambda 1 - a)^{-1}), \quad \lambda \in \mathbb{C}. \tag{8.7} \]

(This is the resolvent function defined in (8.4).) We know from the proof of Theorem 8.19 that $f$ is an analytic function in the region $|\lambda| > r(a)$. (There will be a singularity at any $\lambda \in \mathrm{Sp}(a)$.) Now define a new function:

\[ F(\xi) = \phi((1 - \xi a)^{-1}), \quad \xi \in \mathbb{C}. \]

Since $F(\xi) = \xi^{-1} f(\xi^{-1})$, we conclude that $F$ is an analytic function on the open disk $B = \{\xi : |\xi| < 1/r(a)\}$, except possibly at the origin $\xi = 0$. We will demonstrate, however, that $F$ has a power series on $B$ centered at $\xi = 0$. Suppose that $\xi \in \mathbb{C}$ is such that $|\xi| < 1/\|a\|$. Then $\|\xi a\| < 1$, and so (by Proposition 8.10) there is a Neumann series for $(1 - \xi a)^{-1}$:
\[ (1 - \xi a)^{-1} = 1 + \xi a + \xi^2 a^2 + \cdots = \sum_{n=0}^{\infty} \xi^n a^n. \]
Then, for any $|\xi| < 1/\|a\|$,
\[ F(\xi) = \phi((1 - \xi a)^{-1}) = \phi\left( \sum_{n=0}^{\infty} \xi^n a^n \right) = \sum_{n=0}^{\infty} \xi^n \phi(a^n). \]
The power series expansion is unique, and so we conclude that
\[ F(\xi) = \sum_{n=0}^{\infty} \xi^n \phi(a^n), \quad \xi \in B. \tag{8.8} \]
Since the series in (8.8) converges for all $\xi \in B$, we observe that

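\[ \lim_{n \to \infty} |\xi^n \phi(a^n)| = 0 \]
for each $\xi \in B$. Let $\rho \in \mathbb{R}$ be such that $\rho > r(a)$, but otherwise arbitrary. Then $1/\rho \in B$, and so
\[ \sup_{n \in \mathbb{N}} |\phi(\rho^{-n} a^n)| = \sup_{n \in \mathbb{N}} |\rho^{-n} \phi(a^n)| < \infty. \]
This is true for all $\phi \in A^*$, and so by the Uniform Boundedness Principle, there exists a constant $C_\rho$ (that depends on the choice of $\rho$) such that $C_\rho > 0$ and

\[ \|\rho^{-n} a^n\| \le C_\rho, \quad n \in \mathbb{N}. \]

Then, for all $n \in \mathbb{N}$, we have that $\|a^n\| \le C_\rho \rho^n$. In particular, we have
\[ \|a^n\|^{1/n} \le C_\rho^{1/n} \rho. \]

Therefore,
\[ \limsup_{n \to \infty} \|a^n\|^{1/n} \le \rho. \]
The choice of $\rho > r(a)$ was arbitrary, and so

\[ \limsup_{n \to \infty} \|a^n\|^{1/n} \le r(a). \]
It follows that

\[ \limsup_{n \to \infty} \|a^n\|^{1/n} \le r(a) \le \inf_{n \in \mathbb{N}} \|a^n\|^{1/n} \le \liminf_{n \to \infty} \|a^n\|^{1/n}. \]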

Since it is always the case that $\liminf_{n \to \infty} \|a^n\|^{1/n} \le \limsup_{n \to \infty} \|a^n\|^{1/n}$, the proof is complete. □

Observe that if $a$ is such that $r(a) = 0$, then (by Theorem 8.23) it must be the case that $\lim_{n \to \infty} \|a^n\|^{1/n} = 0$. This motivates the next definition.

Definition 8.24 Let $A$ be a unital Banach algebra. If $a \in A$ is such that $r(a) = 0$, then $a$ is called quasinilpotent.

Let us now return our attention to compact operators on a Banach space. Let $X$ be an infinite-dimensional Banach space and suppose $K \in L(X)$ is a compact operator. We know that all eigenvalues of $K$ are in $\mathrm{Sp}(K)$. From Theorem 6.37, we also know that the only possible limit point of the eigenvalues of $K$ is $0$. Certainly, $K$ cannot be invertible, and so $0 \in \mathrm{Sp}(K)$. We are left with the question: What other elements of the spectrum are not eigenvalues? This leads us to a classical theorem, due to Fredholm.

Theorem 8.25 (Fredholm Alternative) Let $K$ be a compact operator on the Banach space $X$. If $\lambda$ is a nonzero scalar, then either $\lambda 1 - K$ is invertible or $\lambda$ is an eigenvalue of $K$.

Proof Let $\lambda$ be a nonzero scalar and assume that $\lambda$ is not an eigenvalue of $K$. Then, by assumption, $\ker(\lambda 1 - K) = \{0\}$. Thus, by the Rank-Nullity Theorem (Theorem 6.33),

\[ \dim(X/\mathrm{ran}(\lambda 1 - K)) = \dim(\ker(\lambda 1 - K)) = 0. \]

It follows that $\mathrm{ran}(\lambda 1 - K) = X$, and so $\lambda 1 - K$ is a surjection. By the initial assumption, $\lambda 1 - K$ is an injection, and therefore $\lambda 1 - K$ is invertible. □

Example 8.26 Recall the Volterra operator from Examples 3.41 and 6.27. The Volterra operator is a map $V : L_2(0, 1) \to L_2(0, 1)$ defined by

\[ Vf(x) = \int_0^x f(t)\, dt, \quad f \in L_2(0, 1),\ x \in [0, 1]. \]
Since $V$ is a Hilbert–Schmidt operator, it is compact, and so $0 \in \mathrm{Sp}(V)$. In Example 6.27, it was established that $V$ has no eigenvalues. By Theorem 8.25 (the Fredholm Alternative), it follows that $\mathrm{Sp}(V) = \{0\}$. Therefore, $r(V) = 0$ and $V$ is quasinilpotent.

Example 8.27 Suppose $R$ and $L$ are the right and left shift operators, respectively. Then
\[ R\xi = (0, \xi_1, \xi_2, \ldots) \quad \text{and} \quad L\xi = (\xi_2, \xi_3, \ldots), \]
where $\xi = (\xi_k)_{k=1}^{\infty}$ is any sequence indexed by the natural numbers. If 1

Such a sequence is in $\ell_q$ if and only if $|\lambda| < 1$. We have determined that $\lambda$ is an eigenvalue for $R^* = L$ if $|\lambda| < 1$. Consequently, we conclude that $\mathbb{D} \subseteq \mathrm{Sp}(R)$. By Theorem 8.19, the set $\mathrm{Sp}(R)$ is closed, and so $\overline{\mathbb{D}} \subseteq \mathrm{Sp}(R)$. Thus, we have established mutual inclusion, and hence $\mathrm{Sp}(R) = \overline{\mathbb{D}}$.

We have observed that $R$ has no eigenvalues, and we have shown that every point in $\mathbb{D}$ is an eigenvalue of $R^* = L$. Since the spectrum of $R$ is the closed unit disk, it must be the case that every point in $\mathbb{T} = \partial\mathbb{D}$ is an approximate eigenvalue of $R$. Since no $\lambda \in \mathbb{T}$ is an eigenvalue of $L$, it follows that if $|\lambda| = 1$, then the operator $\lambda 1 - R$ is one-to-one and does not have closed range. (See Example 8.16.)
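
The claim that each point of the unit circle is an approximate eigenvalue of the right shift can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it works with finitely supported sequences, takes $p = 2$ by default, and uses the trial vectors $\xi^{(N)}_k = N^{-1/p} \lambda^{-k}$ for $1 \le k \le N$, which are our own choice and do not appear in the text. The helper names `p_norm`, `right_shift`, and `residual` are hypothetical. A direct calculation suggests that the residual norm equals $(2/N)^{1/p}$, which tends to $0$ as $N \to \infty$.

```python
import cmath

def p_norm(seq, p=2.0):
    """l_p norm of a finitely supported sequence given as a list."""
    return sum(abs(x) ** p for x in seq) ** (1.0 / p)

def right_shift(seq):
    """R(xi_1, xi_2, ...) = (0, xi_1, xi_2, ...) on a finitely supported sequence."""
    return [0j] + list(seq)

def residual(lam, N, p=2.0):
    """Return ||(lam*1 - R) xi||_p for the trial vector xi_k = lam**(-k) / N**(1/p), k = 1..N.

    The trial vector is an assumption made for this illustration; it has norm 1 when |lam| = 1.
    """
    scale = N ** (-1.0 / p)
    xi = [scale * lam ** (-k) for k in range(1, N + 1)]
    r_xi = right_shift(xi)                 # entries indexed 1..N+1
    lam_xi = [lam * x for x in xi] + [0j]  # padded to the same length
    return p_norm([u - v for u, v in zip(lam_xi, r_xi)], p)

lam = cmath.exp(0.7j)  # an arbitrary point on the unit circle
for N in (10, 100, 1000):
    print(N, residual(lam, N))
```

Only the first and last coordinates of $(\lambda 1 - R)\xi^{(N)}$ survive, which is why the residual shrinks while the trial vectors keep norm one.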

Example 8.28 Let $p \in (1, \infty)$ and consider the Banach space $\ell_p(\mathbb{Z})$ of doubly infinite $p$-summable sequences. The right shift operator $R : \ell_p(\mathbb{Z}) \to \ell_p(\mathbb{Z})$ is now given by the formula

\[ R(\xi) = (\xi_{n-1})_{n \in \mathbb{Z}}, \quad \xi = (\xi_n)_{n \in \mathbb{Z}} \in \ell_p(\mathbb{Z}). \]
In this case, $R$ is actually invertible, and the inverse is the left shift operator, so $R^{-1} = L$ (where $L$ is defined in the obvious way).

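This time, neither $R$ nor $L$ has any eigenvalues. To see that $R$ has no eigenvalues, suppose $\lambda \in \mathbb{C}$ is an eigenvalue of $R$. (Note that $\lambda \neq 0$, because $R$ is invertible.) Then, for some nonzero $\xi = (\xi_n)_{n \in \mathbb{Z}}$, we have $R\xi = \lambda\xi$, and so

\[ (\xi_{n-1})_{n \in \mathbb{Z}} = (\lambda \xi_n)_{n \in \mathbb{Z}}. \]

Consequently, $\xi_{n-1} = \lambda \xi_n$ for all $n \in \mathbb{Z}$. From this we conclude that

\[ \xi_{-n} = \lambda^n \xi_0 \quad \text{and} \quad \xi_n = \frac{1}{\lambda^n} \xi_0, \quad n \in \mathbb{N}. \]

Therefore, $\xi = \xi_0 (\lambda^{-n})_{n \in \mathbb{Z}}$. Since $\xi \in \ell_p(\mathbb{Z})$, it must be that both $|\lambda| < 1$ and $\frac{1}{|\lambda|} < 1$. Naturally, this cannot happen. A similar argument shows that $L$ has no eigenvalues.

Despite the previous remarks, it is still the case that $\|R^n\| = 1$ for all $n \in \mathbb{N}$, and consequently $r(R) = 1$. Similarly, $r(L) = 1$, and so we have both $\mathrm{Sp}(R) \subseteq \overline{\mathbb{D}}$ and $\mathrm{Sp}(R^{-1}) \subseteq \overline{\mathbb{D}}$. It is routine to show that $\lambda \in \mathrm{Sp}(R)$ if and only if $1/\lambda \in \mathrm{Sp}(R^{-1})$. (See Exercise 8.4.) Therefore, if $\lambda \in \mathrm{Sp}(R)$, then $\lambda \in \overline{\mathbb{D}}$ and $1/\lambda \in \overline{\mathbb{D}}$. The conclusion is that $\mathrm{Sp}(R) \subseteq \mathbb{T}$. In fact, $\mathrm{Sp}(R) = \mathbb{T}$, because of the previous example. (Consider sequences $\xi$ with $\xi_{-n} = 0$ for all $n \ge 0$.)

Example 8.29 In Example 7.29, we saw that, by Parseval’s Identity, there is an isometric isomorphism between the Hilbert spaces $L_2 = L_2([0, 2\pi), \frac{d\theta}{2\pi})$ and $\ell_2(\mathbb{Z})$, given by the Fourier transform $f \mapsto (\hat{f}(n))_{n \in \mathbb{Z}}$. Define the multiplier operator $M : L_2 \to L_2$ by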

\[ Mf(\theta) = e^{i\theta} f(\theta), \quad f \in L_2,\ \theta \in [0, 2\pi). \]

With regard to the isometric isomorphism, the multiplier operator $M$ on $L_2$ corresponds to the shift operator $R$ on $\ell_2(\mathbb{Z})$. (See Exercise 8.10.)

Remark 8.30 In Example 7.29, we used the symbol $\mathbb{T}$ to denote the interval $[0, 2\pi)$. In the context of complex analysis, however, we adopt the convention that $\mathbb{T}$ denotes the unit circle $\partial\mathbb{D} = \{z \in \mathbb{C} : |z| = 1\}$. (This usage indicates the origin of the name torus for the symbol $\mathbb{T}$.) Although this may seem to be an overuse of notation, the two sets are readily identifiable, since the unit circle can be written as

\[ \{z \in \mathbb{C} : |z| = 1\} = \{e^{i\theta} : \theta \in [0, 2\pi)\}. \]

In fact, some authors prefer to use the symbol $\mathbb{T}$ to denote the unit interval $[0, 1)$, making use of the identification $\partial\mathbb{D} = \{e^{2\pi i\theta} : \theta \in [0, 1)\}$. For the remainder of this text, however, we will use $\mathbb{T}$ to mean the unit circle.
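
As a numerical illustration of the Spectral Radius Formula (Theorem 8.23), the following Python sketch computes $\|a^n\|^{1/n}$ in the algebra of $2 \times 2$ matrices. The setting is an assumption made for illustration: the norm is the maximum absolute row sum (the operator norm induced by the sup norm on $\mathbb{C}^2$), and the two matrices are our own examples, not taken from the text. The helper names are hypothetical. For the upper triangular matrix below, the spectrum is $\{0.5\}$, so the roots should approach $0.5$ even though $\|a\| = 10.5$; the second matrix is nilpotent, hence quasinilpotent in the sense of Definition 8.24.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def op_norm_inf(a):
    """Maximum absolute row sum: a submultiplicative norm with ||1|| = 1."""
    return max(sum(abs(x) for x in row) for row in a)

def root_norms(a, n_max):
    """Return the list [||a^n||^(1/n) for n = 1, ..., n_max]."""
    out, power = [], a
    for n in range(1, n_max + 1):
        out.append(op_norm_inf(power) ** (1.0 / n))
        power = mat_mul(power, a)
    return out

a = [[0.5, 10.0], [0.0, 0.5]]   # Sp(a) = {0.5}, so r(a) = 0.5
vals = root_norms(a, 200)
print(vals[0], vals[9], vals[199])

nil = [[0.0, 1.0], [0.0, 0.0]]  # nil^2 = 0, so r(nil) = 0 while ||nil|| = 1
print(root_norms(nil, 3))
```

Every printed value for `a` stays above $r(a) = 0.5$, in line with Lemma 8.22, and the convergence is slow because $\|a^n\|$ carries a factor that grows linearly in $n$.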

8.2 Commutative Algebras

In this section (as before), we consider complex Banach algebras that are unital. Shortly, we will impose on our Banach algebras the additional restriction of commutativity. This added structure leads to some remarkable consequences. We begin, however, by making some definitions in the general context of an algebra.

Definition 8.31 A nonempty linear subspace $I$ of an algebra $A$ is called a left ideal of $A$ if $ax \in I$ whenever $a \in A$ and $x \in I$. Similarly, $I$ is called a right ideal if $xa \in I$ whenever $a \in A$ and $x \in I$. If $I$ is both a left ideal and a right ideal, it is called a two-sided ideal. In any case, $I$ is called proper if $I \neq A$.

If an ideal contains the identity element of $A$, then it must contain every element of $A$. Thus, a proper ideal does not contain the identity element. Similarly, a proper ideal cannot contain an element that is invertible. Notice that an ideal necessarily contains $0$, and $\{0\}$ is always a proper two-sided ideal in any unital algebra.

Proposition 8.32 If $I$ is a proper closed two-sided ideal in a unital Banach algebra $A$, then $A/I$ is a unital Banach algebra, called the quotient algebra.

Proof We know that $A/I$ is a Banach space, by Proposition 3.47. We now define a multiplication on $A/I$:

\[ (a + I) \cdot (b + I) = ab + I, \quad \{a, b\} \subseteq A. \]

With this multiplication, it is clear that $1 + I$ is an identity in $A/I$, where $1$ is the identity in $A$. If $x \in I$ and $\|1 - x\| < 1$, then $x$ is invertible (by Proposition 8.10), which contradicts the fact that $I$ is a proper ideal. Therefore, $\|1 + I\| = 1$, and so $A/I$ is unital.

It remains to show that $\|ab + I\| \le \|a + I\| \|b + I\|$ for all $a$ and $b$ in $A$. Let $a$ and $b$ be fixed elements in $A$. The norm on $A/I$ is an infimum, and so for each $n \in \mathbb{N}$, we may select $x_n \in a + I$ and $y_n \in b + I$ so that
\[ \|x_n\| < \|a + I\| + \frac{1}{n} \quad \text{and} \quad \|y_n\| < \|b + I\| + \frac{1}{n}. \]

For each $n \in \mathbb{N}$, we have $x_n y_n \in ab + I$, and hence
\[ \|ab + I\| \le \|x_n y_n\| \le \|x_n\| \|y_n\| < \left( \|a + I\| + \frac{1}{n} \right) \left( \|b + I\| + \frac{1}{n} \right). \]
Since this bound holds for all $n \in \mathbb{N}$, we may take the limit as $n \to \infty$. It follows that $\|ab + I\| \le \|a + I\| \|b + I\|$, as required. □

Example 8.33 Let $H$ be an infinite-dimensional separable Hilbert space. The space $L(H)$ of operators on $H$ is a Banach algebra, and the subspace $K(H)$ of compact operators on $H$ is a closed two-sided ideal. (See Theorem 6.17.) The quotient algebra $L(H)/K(H)$ is called the Calkin algebra, after the mathematician J. W. Calkin.

The Calkin algebra is significant because it is not isomorphic to an algebra of operators on a separable Hilbert space (even though $H$ is itself separable) [5]. This is remarkable because the Calkin algebra is also a $C^*$-algebra, and from the Gelfand–Naimark Theorem it is known that every $C^*$-algebra is isomorphic to an algebra of operators on some Hilbert space. The Gelfand–Naimark Theorem dates back to the work of Gelfand and Naimark in 1943 [16].

For the remainder of this section, we will suppose that all Banach algebras are commutative. We remind the reader that an algebra $A$ is commutative if $ab = ba$ for all $a$ and $b$ in $A$. (See Definition 8.5.) In particular, this implies that any ideal is necessarily two-sided. Consequently, when $A$ is a commutative algebra, a two-sided ideal is called an ideal.

Definition 8.34 Let $A$ be a commutative complex unital Banach algebra. A linear functional $\phi : A \to \mathbb{C}$ is called a multiplicative linear functional if it is a ring homomorphism; that is, if $\phi(1) = 1$ and $\phi(xy) = \phi(x)\,\phi(y)$ for all $x$ and $y$ in $A$.

The multiplicative property of a multiplicative linear functional $\phi$ guarantees that $\phi(1)^2 = \phi(1)$, and so it must be that either $\phi(1) = 1$ or $\phi(1) = 0$. We insist on the condition that $\phi(1) = 1$ in order to disqualify the trivial linear functional $\phi = 0$.

Theorem 8.35 If $\phi$ is a multiplicative linear functional on a commutative complex unital Banach algebra, then $\phi$ is continuous and $\|\phi\| = 1$.
Proof By assumption, $\phi$ is a multiplicative linear functional, and so $\phi(1) = 1$. It follows that $\|\phi\| \ge 1$. Suppose there exists some $x \in A$ such that $\|x\| \le 1$, but $|\phi(x)| > 1$. Let $\alpha = \phi(x)$. By definition, $\phi(\alpha^{-1}x) = 1$, and so $\phi(1 - \alpha^{-1}x) = 0$. On the other hand, $|\alpha| > 1$, and so $\|\alpha^{-1}x\| < 1$. By Proposition 8.10, we have that $1 - \alpha^{-1}x$ is invertible, and so
\[ \phi(1 - \alpha^{-1}x) \cdot \phi\bigl((1 - \alpha^{-1}x)^{-1}\bigr) = 1. \]
This is a contradiction, because $\phi(1 - \alpha^{-1}x) = 0$. Consequently, it must be the case that $\|\phi\| = 1$. □

Definition 8.36 Let $A$ be a commutative algebra. A proper ideal $I$ is said to be maximal in $A$ if $I = J$ whenever $J$ is a proper ideal such that $I \subseteq J$.

Example 8.37 Suppose $\phi$ is a multiplicative linear functional on a commutative complex unital Banach algebra $A$. We claim that the kernel of $\phi$ is a closed maximal ideal in $A$. Recall that the kernel of $\phi$ is the set $\ker\phi = \{x : \phi(x) = 0\}$. It is a consequence of the multiplicative property of $\phi$ that $\ker\phi$ is an ideal. It is a proper ideal because $1 \notin \ker\phi$. (This is why we assumed $\phi(1) = 1$ in the definition of a multiplicative linear functional.)

By Proposition 3.49, the quotient $A/\ker\phi$ is isomorphic to $\mathbb{C}$. This means that the codimension of $\ker\phi$ is one and, in particular, this tells us that $\ker\phi$ is a maximal ideal. (See Exercise 8.11.)

Proposition 8.38 (Krull’s Theorem) Any proper ideal is contained in a maximal ideal. In particular, any unital algebra has a maximal ideal.

Proof If $(J_i)_{i \in I}$ is a chain of proper ideals, then $J = \bigcup_{i \in I} J_i$ is also a proper ideal. To see this, simply note that $1 \notin J_i$ for any $i \in I$, and so $1 \notin J$. Therefore, every chain of proper ideals has an upper bound, and so the result follows from Zorn’s Lemma.

To show every unital algebra has a maximal ideal, observe that $\{0\}$ is a proper ideal, and so must be contained in a maximal ideal. □

We are working within the context of Banach algebras, but Krull’s Theorem remains true in the more general setting of ring theory. Specifically, Krull’s Theorem asserts the existence of maximal ideals in any unital ring. The proof is the same, and relies on Zorn’s Lemma. In fact, like Zorn’s Lemma, Krull’s Theorem is equivalent to the Axiom of Choice [18]. Krull’s Theorem is named after Wolfgang Krull, who proved the general version of the result in 1929 [22].

Proposition 8.39 Every maximal ideal in a commutative unital Banach algebra is closed.

Proof Suppose $J$ is a maximal ideal in a commutative unital Banach algebra $A$. By the continuity of multiplication, the closure $\overline{J}$ is also an ideal. We need only show that $\overline{J}$ is a proper ideal. It will suffice to show that $1 \notin \overline{J}$. We will show this by showing that $d(1, J) = 1$.

First, observe that $d(1, J) \le 1$, because $0 \in J$. Suppose that $d(1, J) < 1$. Then there exists an element $x \in J$ such that $d(1, x) = \|1 - x\| < 1$. This implies (by Proposition 8.10) that $x$ is invertible, which contradicts the assumption that $J$ is a proper ideal. Therefore, $d(1, J) = 1$, and so $1 \notin \overline{J}$. Consequently, $\overline{J}$ is a proper ideal containing the maximal ideal $J$, and therefore $J = \overline{J}$. □

Theorem 8.40 Each maximal ideal in a commutative complex unital Banach algebra is the kernel of some multiplicative linear functional.

Proof Let $J$ be a maximal ideal in a commutative complex unital Banach algebra $A$. By Proposition 8.39, the set $J$ is a closed subset of $A$. It follows that $A/J$ is a unital Banach algebra, by Proposition 8.32. By assumption, $J$ is a proper ideal in $A$, and so there exists some $x \in A$ such that $x \notin J$. Define a subset of $A$ by $J[x] = \{ax + y : a \in A,\ y \in J\}$. Then $J[x]$ is an ideal which is strictly larger than $J$, and so $A = J[x]$. Let $\pi : A \to A/J$ be the quotient map.
Since $J[x] = A$, it must be that $1 = ax + y$ for some $a \in A$ and $y \in J$. Therefore, $\pi(1) = \pi(ax + y)$, and consequently $1 + J = ax + J = (a + J)(x + J)$. It follows that $(x + J)^{-1}$ exists whenever $x \notin J$, and so every nonzero element of $A/J$ is invertible. By the Gelfand–Mazur Theorem (Theorem 8.20), we conclude that $A/J$ is isometrically isomorphic to $\mathbb{C}$. Denote this isometric isomorphism by $i : A/J \to \mathbb{C}$. Then $\phi = i \circ \pi$ is a multiplicative linear functional on $A$ and $\ker\phi = J$. □

Since the proof of the above theorem uses the Gelfand–Mazur Theorem, it only works for complex Banach algebras. Indeed, this theorem is not true (in general) for real Banach algebras.

Through Example 8.37 and Theorem 8.40, we have established a correspondence between the multiplicative linear functionals on a commutative complex unital Banach algebra $A$ and the maximal ideals of $A$.

Definition 8.41 Let $A$ be a commutative complex unital Banach algebra. The spectrum of $A$, which we denote by $\Sigma(A)$, is the collection of all multiplicative linear functionals on $A$.

One must be careful to not confuse the spectrum of an algebra $\Sigma(A)$ with the spectrum of an element $\mathrm{Sp}(x)$, where $x$ is an element in the algebra $A$. We will discover shortly the reason behind this naming. Due to the correspondence between multiplicative linear functionals and maximal ideals, the set $\Sigma(A)$ is sometimes called the maximal ideal space of $A$.

Theorem 8.42 The spectrum of a commutative complex unital Banach algebra is a nonempty $w^*$-compact set.

Proof Let $A$ be a commutative complex unital Banach algebra. By Proposition 8.38, there exists a maximal ideal in $A$. By Theorem 8.40, there is some multiplicative linear functional for which this maximal ideal is the kernel. Therefore, $\Sigma(A)$ is nonempty.

It remains to show that $\Sigma(A)$ is $w^*$-compact. From Theorem 8.35, we deduce that $\Sigma(A) \subseteq B_{A^*}$. By Theorem 5.39 (the Banach–Alaoglu Theorem), we know that $B_{A^*}$ is a $w^*$-compact set. Thus, it suffices to show that $\Sigma(A)$ is $w^*$-closed. Observe that

\[ \Sigma(A) = \{\phi \in B_{A^*} : \phi(1) = 1,\ \phi(xy) = \phi(x)\phi(y) \text{ for all } \{x, y\} \subseteq A\} \tag{8.9} \]
\[ = B_{A^*} \cap \{\phi : \phi(1) = 1\} \cap \Bigl( \bigcap_{x \in A} \bigcap_{y \in A} \{\phi : \phi(xy) = \phi(x)\phi(y)\} \Bigr). \tag{8.10} \]
All of these sets are closed in the weak$^*$-topology, and so the result follows. □

Definition 8.43 Let $A$ be a commutative complex unital Banach algebra. If $x \in A$, then we define the Gelfand transform of $x$ to be the (complex-valued) continuous function $\hat{x} \in C(\Sigma(A))$ given by $\hat{x}(\phi) = \phi(x)$, $\phi \in \Sigma(A)$.

The Gelfand transform of $x$ is well-defined, because $\Sigma(A)$ is a $w^*$-compact subset of $B_{A^*}$. Observe also that the map $x \mapsto \hat{x}$, which is called the Gelfand transform, is a norm-decreasing algebra homomorphism.

The next proposition explains why we call $\Sigma(A)$ the spectrum of $A$.

Proposition 8.44 Let $A$ be a commutative complex unital Banach algebra and let $\Sigma = \Sigma(A)$. If $x \in A$, then $\mathrm{Sp}(x) = \hat{x}(\Sigma)$ and
\[ \|\hat{x}\|_{C(\Sigma)} = r(x) = \lim_{n \to \infty} \|x^n\|^{1/n}. \]

Proof Suppose $\lambda \in \hat{x}(\Sigma)$. Then there exists some $\phi \in \Sigma$ such that $\phi(x) = \hat{x}(\phi) = \lambda = \phi(\lambda 1)$. Consequently, $\phi(\lambda 1 - x) = 0$, and so $\lambda 1 - x$ is not invertible. (If $a$ is invertible, then $\phi(a)\phi(a^{-1}) = 1$, and so $\phi(a) \neq 0$.) It follows that $\lambda \in \mathrm{Sp}(x)$.

Now suppose $\lambda \in \mathrm{Sp}(x)$. Then $\lambda 1 - x$ is not invertible, and hence $(\lambda 1 - x)A$ is a proper ideal. By Proposition 8.38, there exists a maximal ideal $J$ containing the ideal $(\lambda 1 - x)A$. By Theorem 8.40, there exists some multiplicative linear functional $\phi$ such that $J = \ker\phi$. We conclude that $\phi(\lambda 1 - x) = 0$, and so $\lambda = \phi(x) = \hat{x}(\phi)$, as required.

The fact that $\|\hat{x}\|_{C(\Sigma)} = r(x)$ follows from the identification $\mathrm{Sp}(x) = \hat{x}(\Sigma)$. The rest of the proposition is the Spectral Radius Formula. (See Theorem 8.23.) □

Remark 8.45 To emphasize the relationship between $\Sigma(A)$ and $\mathrm{Sp}(x)$, where $x$ is an element in the Banach algebra $A$, the spectrum of $x$ is sometimes denoted $\sigma(x)$.

Example 8.46 Once again, consider the Volterra operator:

\[ Vf(x) = \int_0^x f(t)\, dt, \quad f \in C[0, 1],\ x \in [0, 1]. \]
In Example 8.26, we saw that $\mathrm{Sp}(V) = \{0\}$. (This was a result of the Fredholm Alternative, because $V$ is a compact operator with no eigenvalues.) Let $A$ denote the algebra given by the closure of all polynomials in $V$; that is,
\[ A = \overline{\left\{ \sum_{k=0}^{n} a_k V^k : (a_0, \ldots, a_n) \in \mathbb{C}^{n+1},\ n \in \mathbb{N} \right\}}. \]
We know that $A$ is a commutative Banach algebra. By Proposition 8.44, we have $\hat{V}(\Sigma(A)) = \mathrm{Sp}(V) = \{0\}$. It follows that $\phi(V) = 0$ for all multiplicative linear functionals $\phi$ on $A$. Therefore, $\Sigma(A)$ contains only one element $\phi$ and
\[ \phi\left( \sum_{k=0}^{n} a_k V^k \right) = a_0, \]

for all $(a_0, \ldots, a_n) \in \mathbb{C}^{n+1}$, where $n \in \mathbb{N}$.

Theorem 8.47 Let $A$ be a commutative complex unital Banach algebra and let $\Sigma = \Sigma(A)$. The Gelfand transform is an algebra homomorphism between $A$ and a subalgebra of $C(\Sigma)$. Furthermore, the Gelfand transform is an isometry if and only if $\|x^2\| = \|x\|^2$ for all $x \in A$.

Proof It is clear the Gelfand transform is an algebra homomorphism onto its image. We show that it is an isometry if and only if $\|x^2\| = \|x\|^2$ for all $x \in A$.

First, assume $\|x^2\| = \|x\|^2$ for all $x \in A$. By induction, $\|x^{2^k}\| = \|x\|^{2^k}$ for all $x \in A$ and $k \in \mathbb{N}$. Therefore, by Proposition 8.44,

\[ \|\hat{x}\|_{C(\Sigma)} = \lim_{k \to \infty} \|x^{2^k}\|^{1/2^k} = \|x\|. \]
Thus, the Gelfand transform is an isometry.

Conversely, assume $\|\hat{x}\|_{C(\Sigma)} = \|x\|$ for all $x \in A$. By Theorem 8.23 and Proposition 8.44,

\[ \|x\| = \|\hat{x}\|_{C(\Sigma)} = r(x) = \inf_{n \in \mathbb{N}} \|x^n\|^{1/n} \le \|x^2\|^{1/2}, \]
and hence $\|x\|^2 \le \|x^2\|$. By submultiplicativity of the norm, $\|x^2\| \le \|x\|^2$. Therefore, $\|x^2\| = \|x\|^2$, as required. □

We now consider some examples of Banach algebras, and we identify the spectrum in each case. In each of the following examples, we will be looking at a Banach algebra with the supremum (or essential supremum) norm, and so it is a trivial calculation to show it satisfies the relationship $\|x^2\| = \|x\|^2$ for every $x$ in the algebra. Consequently, in each case, the Gelfand transform is an isometry (by Theorem 8.47).

Example 8.48 Suppose $K$ is a compact Hausdorff space. We would like to identify $\Sigma = \Sigma(C(K))$, the spectrum of $C(K)$. Here, $C(K)$ denotes the space of complex-valued continuous functions on $K$. Observe that, for any $s \in K$, the map

\[ \delta_s(f) = f(s), \quad f \in C(K), \]
is a multiplicative linear functional. We will show that, in fact, all multiplicative linear functionals on $C(K)$ can be achieved as point evaluation.

Suppose that $\phi \in \Sigma \setminus \{\delta_s : s \in K\}$, which is a $w^*$-open set in $\Sigma$. There exists, then, a $w^*$-neighborhood $W$ of $\phi$ so that $W \cap \{\delta_s : s \in K\} = \emptyset$. The set $W$ is a neighborhood of $\phi$ in the $w^*$-topology, and so there exists an $n \in \mathbb{N}$, an $\varepsilon > 0$, and functions $\{f_1, \ldots, f_n\} \subseteq C(K)$ so that
\[ \{\mu \in C(K)^* : |\mu(f_1) - \phi(f_1)| < \varepsilon, \ldots, |\mu(f_n) - \phi(f_n)| < \varepsilon\} \subseteq W. \]

If $s \in K$, then $\delta_s \notin W$, and so there exists some $j \in \{1, \ldots, n\}$ such that

\[ |f_j(s) - \phi(f_j)| = |\delta_s(f_j) - \phi(f_j)| \ge \varepsilon. \]

For each $j \in \{1, \ldots, n\}$, define a function $g_j \in C(K)$ by $g_j = f_j - \phi(f_j) 1$, where $1$ is the unit element in the algebra $C(K)$. (Notice that $1 = \chi_K$, the function that is constantly $1$ for all $s \in K$.) By definition, $\phi(g_j) = 0$ for all $j \in \{1, \ldots, n\}$. Now define $g \in C(K)$ by
\[ g = \sum_{j=1}^{n} g_j \overline{g_j} = \sum_{j=1}^{n} |g_j|^2. \]

For each $s \in K$, there exists some $j \in \{1, \ldots, n\}$ such that $|g_j(s)| \ge \varepsilon$. Therefore, $|g(s)| \ge \varepsilon^2$ for all $s \in K$, and so $g$ is invertible (in the algebra $C(K)$). On the other hand,
\[ \phi(g) = \sum_{j=1}^{n} \phi(g_j)\, \phi(\overline{g_j}) = 0, \]
which implies that $g$ is not invertible. We have derived a contradiction, and so it must be that $\Sigma = \{\delta_s : s \in K\}$. By Lemma 5.66, we conclude that $\Sigma$ is homeomorphic to $K$. Therefore, the spectrum of $C(K)$ is in fact $K$. Certainly, then, in this example, we have that $C(K)$ is isometrically isomorphic to $C(\Sigma)$.

Example 8.49 Consider the sequence space $\ell_\infty$ of bounded complex-valued sequences. We define a multiplication in $\ell_\infty$ component-wise, so that $\xi \cdot \eta$ is the sequence with components

\[ (\xi \cdot \eta)_n = \xi_n \eta_n, \quad n \in \mathbb{N}, \]
where $\xi = (\xi_n)_{n=1}^{\infty}$ and $\eta = (\eta_n)_{n=1}^{\infty}$ are sequences in $\ell_\infty$.

We again wish to identify $\Sigma = \Sigma(\ell_\infty)$. For each $n \in \mathbb{N}$, the projection onto the $n$th coordinate $\xi \mapsto \xi_n$ determines a multiplicative linear functional. Consequently, we obtain $\mathbb{N} \subseteq \Sigma$. This inclusion cannot be equality, however, because $\mathbb{N}$ is not a $w^*$-compact set. In fact, the spectrum of $\ell_\infty$ is the set $\Sigma = \beta\mathbb{N}$, the Stone–Čech compactification of $\mathbb{N}$. The proof of this fact is beyond the scope of our current discussion, and so we will omit it. We remark, however, that it relies on the Axiom of Choice. (See Sect. B.1 for the definition of the Stone–Čech compactification.) In this example, too, the algebra $\ell_\infty$ is isometrically isomorphic to $C(\Sigma)$.

Example 8.50 Consider the space $L_\infty(0, 1)$ of (equivalence classes of) essentially bounded measurable functions on the unit interval $[0, 1]$. We denote the spectrum by $\Omega = \Sigma(L_\infty(0, 1))$. Certainly, $\|f^2\|_\infty = \|f\|_\infty^2$ for all $f \in L_\infty(0, 1)$, and so the Gelfand transform is an isometry (by Theorem 8.47). It turns out that the Gelfand transform is actually an isomorphism, and so $L_\infty(0, 1)$ is identical (as a Banach algebra) to $C(\Omega)$.

The space $\Omega$ is called the Stone space of the measure algebra, but it is very difficult to describe. Unlike Example 8.48, we cannot realize the multiplicative linear functionals in this case as point evaluation, because there is no (constructive) way to define the value of $f \in L_\infty(0, 1)$ at a given point in $[0, 1]$ (because $f$ represents a class of functions that are equal almost everywhere).

Theorem 8.47 is a statement about complex Banach algebras. Ultimately, it relies on the Gelfand–Mazur Theorem and requires the underlying scalar field to be $\mathbb{C}$. The arguments given here cannot be applied to real Banach algebras; however, a similar theorem does hold for Banach algebras over $\mathbb{R}$. (See Theorem 4.2.5 in [2].)
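
The condition $\|x^2\| = \|x\|^2$ from Theorem 8.47 can be tested by hand in small algebras. The Python sketch below is an illustration only, under assumptions of our own: it compares $\mathbb{C}^3$ with the sup norm and pointwise multiplication (a finite version of $C(K)$) against the two-dimensional algebra $\{\alpha 1 + \beta N : N^2 = 0\}$ with the norm $|\alpha| + |\beta|$, which is a finite-dimensional analogue of the algebra in Example 8.46. In the second algebra the only multiplicative linear functional is $\alpha 1 + \beta N \mapsto \alpha$, so the Gelfand transform has sup norm $|\alpha|$. All function names are hypothetical.

```python
def sup_norm(x):
    """Sup norm on C^n, elements stored as lists."""
    return max(abs(t) for t in x)

def pointwise_mul(x, y):
    """Multiplication in C^n, viewed as C(K) for a finite set K."""
    return [s * t for s, t in zip(x, y)]

def dual_mul(x, y):
    """Multiplication of alpha*1 + beta*N stored as pairs (alpha, beta), using N^2 = 0."""
    (a, b), (c, d) = x, y
    return (a * c, a * d + b * c)

def dual_norm(x):
    """The norm |alpha| + |beta| (an assumed choice; it is submultiplicative)."""
    return abs(x[0]) + abs(x[1])

def dual_gelfand_norm(x):
    """Sup norm of the Gelfand transform: the single character sends (alpha, beta) to alpha."""
    return abs(x[0])

x = [1 + 2j, -0.5, 3j]
print(sup_norm(pointwise_mul(x, x)), sup_norm(x) ** 2)   # the two numbers agree

n = (0.0, 1.0)                                            # the nilpotent element N
print(dual_norm(dual_mul(n, n)), dual_norm(n) ** 2)       # the two numbers differ
print(dual_gelfand_norm(n), dual_norm(n))                 # the transform loses norm
```

In the first algebra the Gelfand transform is an isometry, while in the second it sends the nonzero element $N$ to the zero function, consistent with the "if and only if" in Theorem 8.47.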

8.3 The Wiener Algebra

Consider the space $\ell_1(\mathbb{Z})$ of doubly infinite absolutely summable sequences. We will equip $\ell_1(\mathbb{Z})$ with a multiplication akin to the one in Example 8.6(c), making $\ell_1(\mathbb{Z})$ into a convolution algebra.

Suppose $\xi = (\xi_j)_{j \in \mathbb{Z}}$ and $\eta = (\eta_j)_{j \in \mathbb{Z}}$ are sequences in $\ell_1(\mathbb{Z})$. Define the product of $\xi$ and $\eta$ to be the sequence $\xi * \eta = ((\xi * \eta)_k)_{k \in \mathbb{Z}}$ of coefficients of the formal Laurent series
\[ \sum_{k=-\infty}^{\infty} (\xi * \eta)_k\, t^k = \left( \sum_{j=-\infty}^{\infty} \xi_j t^j \right) \left( \sum_{j=-\infty}^{\infty} \eta_j t^j \right). \tag{8.11} \]

Then
\[ (\xi * \eta)_k = \sum_{j=-\infty}^{\infty} \xi_{k-j} \eta_j, \quad k \in \mathbb{Z}. \tag{8.12} \]

We must verify that this multiplication is well-defined; i.e., that $\xi * \eta$ is in fact an absolutely summable sequence. From (8.12), we have
\[ \|\xi * \eta\|_1 = \sum_{k=-\infty}^{\infty} \left| \sum_{j=-\infty}^{\infty} \xi_{k-j} \eta_j \right| \le \sum_{k=-\infty}^{\infty} \left( \sum_{j=-\infty}^{\infty} |\xi_{k-j}| |\eta_j| \right). \]
For a fixed $j \in \mathbb{Z}$, we have $\sum_{k=-\infty}^{\infty} |\xi_{k-j}| = \|\xi\|_1$. Consequently, by Fubini’s Theorem,
\[ \|\xi * \eta\|_1 \le \sum_{j=-\infty}^{\infty} |\eta_j| \left( \sum_{k=-\infty}^{\infty} |\xi_{k-j}| \right) = \sum_{j=-\infty}^{\infty} |\eta_j| \|\xi\|_1 = \|\eta\|_1 \|\xi\|_1. \]

Therefore, $\xi * \eta \in \ell_1(\mathbb{Z})$, and so the multiplication is well-defined. We have also verified that the norm is submultiplicative, and so equipped with this multiplication, the space $\ell_1(\mathbb{Z})$ becomes a commutative Banach algebra. This Banach algebra is called the Wiener algebra (after Norbert Wiener), and is denoted by $W$. We call $W$ a convolution algebra because the multiplication $*$ is called a convolution. Furthermore, we say that $\xi * \eta$ is the convolution of $\xi$ with $\eta$.

We now identify $\Sigma(W)$, the spectrum of $W$. It will be convenient to identify $W$ with the space of formal Laurent series, so that $t^k$ corresponds to the sequence with $1$ in the $k$th position ($k \in \mathbb{Z}$), and zero elsewhere. This makes sense because of how we defined the multiplication in $W$.

We wish to identify the multiplicative linear functionals $\phi$ on $W$. Suppose $\phi$ is a multiplicative linear functional. Observe that $\phi$ is completely determined by what it does on $t$. If $\phi(t) = \lambda$, then $\phi(t^k) = \lambda^k$ for all $k \in \mathbb{Z}$. Therefore,
\[ \phi(\xi) = \sum_{k=-\infty}^{\infty} \lambda^k \xi_k, \quad \xi \in W. \tag{8.13} \]

The series in (8.13) is doubly infinite. As a result, it will converge for all $\xi \in W$ only if both $|\lambda| \le 1$ (for the series with $k \ge 0$) and $|\lambda| \ge 1$ (for the series with $k \le 0$). Consequently, $|\lambda| = 1$, and so $\lambda \in \mathbb{T}$, the unit circle in $\mathbb{C}$. Certainly, any $\lambda \in \mathbb{T}$ determines a multiplicative linear functional according to (8.13). Thus, the linear functionals that are multiplicative linear functionals are those given by $\phi(t) = \lambda$ for $\lambda \in \mathbb{T}$.

For each $\lambda \in \mathbb{T}$, let $\phi_\lambda$ denote the multiplicative linear functional determined by $\phi_\lambda(t) = \lambda$. Then $\Sigma = \{\phi_\lambda : \lambda \in \mathbb{T}\}$. We may thus identify $\Sigma$ with $\mathbb{T}$. (The argument is similar to the proof of Lemma 5.66.) Notice that $\phi_\lambda(\xi) = \phi_1(\eta)$, where the sequence $\eta = (\eta_k)_{k \in \mathbb{Z}}$ is defined so that $\eta_k = \lambda^k \xi_k$ for each $k \in \mathbb{Z}$.
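
The convolution (8.12), the bound $\|\xi * \eta\|_1 \le \|\xi\|_1 \|\eta\|_1$, and the multiplicativity of the functional in (8.13) can be checked on finitely supported sequences. The Python sketch below is an illustration under our own assumptions: sequences are stored as dictionaries mapping an index $k \in \mathbb{Z}$ to $\xi_k$, the sample sequences and the point $\lambda = e^{1.3i}$ are arbitrary, and the function names `convolve`, `norm1`, and `phi` are hypothetical.

```python
import cmath

def convolve(xi, eta):
    """Convolution (8.12) of finitely supported sequences stored as {index: value}."""
    out = {}
    for i, x in xi.items():
        for j, y in eta.items():
            # the term xi_i * eta_j contributes to the coefficient with index k = i + j
            out[i + j] = out.get(i + j, 0) + x * y
    return out

def norm1(xi):
    """The l_1 norm of a finitely supported sequence."""
    return sum(abs(v) for v in xi.values())

def phi(lam, xi):
    """The functional of (8.13): phi_lam(xi) = sum over k of lam**k * xi_k."""
    return sum(v * lam ** k for k, v in xi.items())

xi = {-1: 1.0, 0: -3.0, 1: 0.5j}
eta = {0: 2.0, 1: 1.0}
prod = convolve(xi, eta)
lam = cmath.exp(1.3j)   # a point on the unit circle

print(norm1(prod), norm1(xi) * norm1(eta))          # submultiplicativity of the norm
print(phi(lam, prod), phi(lam, xi) * phi(lam, eta)) # multiplicativity of phi_lam
```

For $|\lambda| \neq 1$ the same formula still defines a multiplicative map on finitely supported sequences, but it is unbounded in the $\ell_1$ norm, which matches the convergence argument in the text.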

Since $\Sigma$ is homeomorphic to $\mathbb{T}$, we can embed $W$ into $C(\mathbb{T})$ via the Gelfand transform. Suppose that $\xi \in W$. Then $\hat{\xi} \in C(\mathbb{T})$ via the map
\[ \hat{\xi}(\lambda) = \hat{\xi}(\phi_\lambda) = \sum_{k=-\infty}^{\infty} \lambda^k \xi_k, \quad \lambda \in \mathbb{T}. \]

For every $\lambda \in \mathbb{T}$, there is a $\theta \in [0, 2\pi)$ so that $\lambda = e^{i\theta}$. Therefore, we define a function $f : [0, 2\pi) \to \mathbb{C}$ by $f(\theta) = \hat{\xi}(e^{i\theta})$, and hence
\[ f(\theta) = \sum_{k=-\infty}^{\infty} \xi_k e^{ik\theta}, \quad \theta \in [0, 2\pi). \tag{8.14} \]

Consequently, each $\xi \in W$ corresponds to an $f \in C(\mathbb{T})$ with absolutely convergent Fourier series.

Motivated by the above discussion, we let $A(\mathbb{T})$ denote the collection of all functions on $\mathbb{T}$ with absolutely convergent Fourier series, and we provide it with the norm
\[ \|f\|_{A(\mathbb{T})} = \sum_{k=-\infty}^{\infty} |\hat{f}(k)|, \quad f \in A(\mathbb{T}), \tag{8.15} \]
where $\hat{f}(k)$ is the $k$th Fourier coefficient of $f$. (See (4.2.1).) Then $A(\mathbb{T})$ is a Banach algebra, and $W$ is homeomorphic to $A(\mathbb{T})$ via the Gelfand transform. Since it is isometrically isomorphic to $W$ (as an algebra) we also call $A(\mathbb{T})$ the Wiener Algebra.

Remark 8.51 In the above discussion, we once again made use of the correspondence between $\mathbb{T}$ and $[0, 2\pi)$. In particular, we identified the continuous functions on $\mathbb{T}$ with the continuous functions on $[0, 2\pi)$. It is for this reason that we can say $f \in C(\mathbb{T})$ when it is actually an element of $C[0, 2\pi)$.

The next theorem gives an indication of the power behind the tools we have developed. The conclusion of the theorem is not at all obvious, but the proof itself is quite simple (and short). One should not mistake the brevity of the proof as a mark of triviality; it is brief because of the power behind our machinery.

Theorem 8.52 (Wiener Inversion Theorem) Let $f \in C(\mathbb{T})$ have an absolutely convergent Fourier series. If $f(\theta) \neq 0$ for all $\theta \in [0, 2\pi)$, then $1/f$ has an absolutely convergent Fourier series.

Proof By assumption, $f \in A(\mathbb{T})$. There exists some $a \in W$ such that $\hat{a} = f$. By Proposition 8.44, $f(\mathbb{T}) = \hat{a}(\mathbb{T}) = \mathrm{Sp}(a)$. Since $0 \notin f(\mathbb{T})$, it follows that $0 \notin \mathrm{Sp}(a)$, and so $a$ is invertible in $W$. Therefore, $f$ is invertible in $A(\mathbb{T})$, and the result follows. □
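
A concrete instance of the Wiener Inversion Theorem can be worked out explicitly. The Python sketch below is an illustration under assumptions of our own: it takes $f(\theta) = 2 + e^{i\theta}$, which never vanishes and corresponds to the sequence $a$ with $a_0 = 2$ and $a_1 = 1$, and it uses the geometric expansion $1/(2 + t) = \sum_{k \ge 0} (-1)^k 2^{-(k+1)} t^k$ as a candidate inverse. The helpers `convolve`, `norm1`, `truncated_inverse`, and `inversion_error` are hypothetical names. The code measures how far $a * b$ is from the identity when $b$ is the inverse truncated after index $K$.

```python
def convolve(xi, eta):
    """Convolution (8.12) of finitely supported sequences stored as {index: value}."""
    out = {}
    for i, x in xi.items():
        for j, y in eta.items():
            out[i + j] = out.get(i + j, 0.0) + x * y
    return out

def norm1(xi):
    """The l_1 norm of a finitely supported sequence."""
    return sum(abs(v) for v in xi.values())

def truncated_inverse(K):
    """Coefficients b_k = (-1)**k / 2**(k+1) for 0 <= k <= K (assumed candidate inverse)."""
    return {k: (-1) ** k / 2 ** (k + 1) for k in range(K + 1)}

def inversion_error(a, K):
    """Return the l_1 distance between a * b and the identity, b truncated at index K."""
    err = convolve(a, truncated_inverse(K))
    err[0] = err.get(0, 0.0) - 1.0   # subtract the identity, which has 1 at index 0
    return norm1(err)

a = {0: 2.0, 1: 1.0}   # corresponds to f(theta) = 2 + exp(i*theta)
for K in (5, 10, 20):
    print(K, norm1(truncated_inverse(K)), inversion_error(a, K))
```

The $\ell_1$ norms of the truncations stay below $1$ and the error halves with each extra term, so in this example the inverse sequence is absolutely summable, as the theorem predicts for $1/f$.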

Exercises

Exercise 8.1 Let A be a unital Banach algebra. Show that if a ∈ A has an inverse, then it must be unique. Show that if a has both a left-inverse and a right-inverse, then a is invertible. Exercise 8.2 Let A be a noncommutative unital Banach algebra. Suppose x and y are elements in A such that both xy and yx are invertible. Show that both x and y are invertible and show that (xy)−1 = y−1x−1.

Exercise 8.3 Let A be a unital Banach algebra with identity element 1. Suppose a ∈ A is such that $\sum_{n=0}^{\infty} \|a^n\| < \infty$. Show that 1 − a is an invertible element and that $(1 - a)^{-1} = \sum_{n=0}^{\infty} a^n$.

Exercise 8.4 Let X be a Banach space and suppose T ∈ L(X) is invertible. Show that λ ∈ Sp(T) if and only if 1/λ ∈ Sp($T^{-1}$).

Exercise 8.5 Let A be a unital Banach algebra. Show that the set of invertible elements of A is a group.

Exercise 8.6 Let A be a commutative unital Banach algebra and let G(A) be the group of invertible elements of A. Show that x ∈ G(A) if and only if φ(x) ≠ 0 for all multiplicative linear functionals φ.

Exercise 8.7 Let A be a Banach algebra. An element x ∈ A is called a topological divisor of zero if there exists a sequence $(y_n)_{n=1}^{\infty}$ in A with $\|y_n\| = 1$ for all n ∈ N such that $\lim_{n\to\infty} x y_n = 0 = \lim_{n\to\infty} y_n x$. Find an example of a Banach algebra with a topological divisor of zero that is not a divisor of zero.

Exercise 8.8 Let A be a unital Banach algebra and let G(A) be the group of invertible elements. Show that every x ∈ ∂G(A) is a topological divisor of zero, where ∂G(A) is the boundary of G(A).

Exercise 8.9 Let X be a commutative Banach algebra. A proper ideal I of X is called prime if for any elements a and b in X the product ab ∈ I implies that either a ∈ I or b ∈ I. Show that in a commutative unital Banach algebra any maximal ideal is prime.

Exercise 8.10 Let M be the multiplier operator on $L_2 = L_2([0, 2\pi), \frac{d\theta}{2\pi})$ given by the formula $Mf(\theta) = e^{i\theta} f(\theta)$ for all f ∈ L2 and θ ∈ [0, 2π). Show that the Fourier transform of the multiplier operator satisfies the equation $\widehat{Mf}(n) = \hat{f}(n-1)$ for all f ∈ L2 and each n ∈ Z. Explicitly write out the relationship between M and the shift operator R on $\ell_2(\mathbb{Z})$. (Recall that $\hat{f}(n) = \int_0^{2\pi} f(\theta)\, e^{-in\theta}\, \frac{d\theta}{2\pi}$ for all n ∈ Z.)

Exercise 8.11 If N is a subspace of a vector space X, then the codimension of N in X is the dimension of X/N. Show that an ideal of a complex commutative unital Banach algebra X is maximal if and only if it has codimension 1. (Hint: See the proof of Theorem 8.40.)

Exercise 8.12 Let A be a complex unital Banach algebra. Suppose there exists some M < ∞ such that $\|x\|\,\|y\| \le M \|xy\|$ for all x and y in A. Show that A = C.

Exercise 8.13 Let A be a Banach algebra and suppose a ∈ A. Use Fekete’s Lemma (Exercise 3.10) to show that

\[ \lim_{n\to\infty} \|a^n\|^{1/n} = \inf_{n\in\mathbb{N}} \|a^n\|^{1/n}. \]

(Hint: Let $\theta_n = \log \|a^n\|$.)

Exercise 8.14 Let $A = \ell_1(\mathbb{Z}_+)$ be the convolution algebra from Example 8.6(c). Recall that the multiplication in A is given by the formula $(a * b)_k = \sum_{j=0}^{k} a_{k-j} b_j$ for all $k \in \mathbb{Z}_+$, where $a = (a_j)_{j=0}^{\infty}$ and $b = (b_j)_{j=0}^{\infty}$ are sequences in A. Identify Σ = Σ(A), the spectrum of A, and show that C(Σ) is the collection of all functions having an absolutely convergent Taylor series in D. (Hint: A is a subalgebra of W.)

Exercise 8.15 If f has an absolutely convergent Taylor series in D, and if f(z) ≠ 0 for every z ∈ D, then show that 1/f has an absolutely convergent Taylor series in D. (Hint: Use the previous problem.)

Appendix A Basics of Measure Theory

In this appendix, we give a brief overview of the basics of measure theory. For a detailed discussion of measure theory, a good source is Real and Complex Analysis by W. Rudin [33].

A.1 Measurability

Definition A.1 Let X be a set. A collection A of subsets of X is called a σ-algebra of sets in X if the following three statements are true:
(a) The set X is in A.
(b) If A ∈ A, then $A^c$ ∈ A, where $A^c$ is the complement of A in X.
(c) If $A_n$ ∈ A for all n ∈ N, and if $A = \bigcup_{n=1}^{\infty} A_n$, then A ∈ A.
A set X together with a σ-algebra A is called a measurable space and is denoted by (X, A), or simply by X when there is no risk of confusion. The elements A ∈ A are known as measurable sets.

From (a) and (b), we can easily see that the empty set is always measurable; that is, ∅ ∈ A. If $A_n$ ∈ A for all n ∈ N, then $\bigcap_{n=1}^{\infty} A_n$ is always measurable, by (b) and (c), because

\[ \bigcap_{n=1}^{\infty} A_n = \Big( \bigcup_{n=1}^{\infty} A_n^c \Big)^c. \]

Among other things, we can now conclude that, for measurable sets A and B, the set $A \setminus B = A \cap B^c$ is measurable.

The σ in the name σ-algebra refers to the countability assumption in (c) of Definition A.1. If we require only finite unions of measurable sets to be in A, then A is called an algebra of sets in X.

Definition A.2 Let (X, A) be a measurable space. A scalar-valued function f with domain X is called measurable if $f^{-1}(V)$ ∈ A whenever V is an open set in the scalar field.
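As a small illustration of Definitions A.1 and A.2, the following sketch checks the axioms by hand on a three-point set. The set and the collections below are chosen here purely for illustration and do not come from the text; $\mathcal{A}$ stands for the σ-algebra written A above.

```latex
% Illustrative example (an assumption of this sketch, not taken from the text).
Let $X = \{1,2,3\}$ and $\mathcal{A} = \{\emptyset, \{1\}, \{2,3\}, X\}$.
\begin{itemize}
  \item[(a)] $X \in \mathcal{A}$ by inspection.
  \item[(b)] $\emptyset^c = X$, $\{1\}^c = \{2,3\}$, $\{2,3\}^c = \{1\}$, and $X^c = \emptyset$
        all belong to $\mathcal{A}$.
  \item[(c)] Every union of members of $\mathcal{A}$ is again a member; for instance
        $\{1\} \cup \{2,3\} = X$.
\end{itemize}
% A non-example: {emptyset, {1}, {2}, X} fails (b), since {1}^c = {2,3} is missing.
A function $f : X \to \mathbb{R}$ is measurable with respect to $\mathcal{A}$
if and only if $f(2) = f(3)$: if $f(2) \neq f(3)$, choose an open set $V$ containing
$f(2)$ but not $f(3)$; then $f^{-1}(V)$ contains $2$ but not $3$, so
$f^{-1}(V) \notin \mathcal{A}$.
```

The point of the example is that measurability of a function depends on the σ-algebra: the same f is measurable for the σ-algebra of all subsets of X whether or not f(2) = f(3).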

© Springer Science+Business Media, LLC 2014 A. Bowers, N. J. Kalton, An Introductory Course in Functional Analysis, Universitext, DOI 10.1007/978-1-4939-1945-1

Example A.3 Let (X, A) be a measurable space. If E ⊆ X, then the characteristic function (or indicator function) of E is the function defined by

\[ \chi_E(x) = \begin{cases} 1 & \text{if } x \in E, \\ 0 & \text{if } x \notin E. \end{cases} \]

The characteristic function of E is measurable if and only if E ∈ A. The characteristic function $\chi_E$ is also denoted by $1_E$.

Suppose u and v are real-valued functions. Then f = u + iv is a complex measurable function if and only if u and v are real measurable functions. Furthermore, if f and g are measurable functions, then so is f + g.

Measurable functions are very well-behaved. For example, if $(f_n)_{n=1}^{\infty}$ is a sequence of real-valued measurable functions, then the functions

\[ \sup_{n\in\mathbb{N}} f_n \quad \text{and} \quad \limsup_{n\to\infty} f_n \]

are measurable whenever the function values are finite. In particular, if $(f_n)_{n=1}^{\infty}$ is a convergent sequence of scalar-valued measurable functions, then the limit of the sequence is also a scalar-valued measurable function.

Definition A.4 A simple function is any linear combination of characteristic functions; that is,

\[ \sum_{j=1}^{n} a_j \chi_{E_j}, \]

where n ∈ N, a1, ... , an are scalars, and E1, ... , En are measurable sets.

Certainly, any simple function is measurable, since it is a finite sum of measurable functions. The next theorem is key to the further development of the subject.

Theorem A.5 Let (X, A) be a measurable space. If f is a positive measurable function, then there exists a sequence $(s_n)_{n=1}^{\infty}$ of simple functions such that

\[ 0 \le s_1 \le s_2 \le \cdots \le f \quad \text{and} \quad f(x) = \lim_{n\to\infty} s_n(x), \quad x \in X. \]
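The text does not spell out how the sequence in Theorem A.5 is obtained. One standard choice, sketched here as an assumption rather than as the authors' construction, truncates f at height n and rounds it down to the nearest multiple of $2^{-n}$; f is taken to be finite-valued for simplicity.

```latex
% A sketch of the usual dyadic approximation (assumed here; not stated in the text).
% Requires amsmath for \text.
For $n \in \mathbb{N}$ define
\[
  s_n(x) = \min\bigl\{\, n,\ 2^{-n} \lfloor 2^n f(x) \rfloor \,\bigr\}, \qquad x \in X .
\]
\begin{itemize}
  \item Each $s_n$ takes only the finitely many values $k/2^n$ with $0 \le k \le n 2^n$,
        and each level set is of the form $f^{-1}\bigl([k/2^n,(k+1)/2^n)\bigr)$ or
        $f^{-1}\bigl([n,\infty)\bigr)$, hence measurable. So $s_n$ is a simple function.
  \item $s_n \le s_{n+1}$, because $\lfloor 2^{n+1} t \rfloor \ge 2 \lfloor 2^n t \rfloor$
        for every $t \ge 0$ and $n \le n+1$.
  \item If $f(x) < n$, then $0 \le f(x) - s_n(x) < 2^{-n}$, so $s_n(x) \to f(x)$.
\end{itemize}
% Example: f(x) = 0.8 gives s_1 = 0.5, s_2 = 0.75, s_3 = 0.75, s_4 = 0.75, s_5 = 0.78125.
```

The dyadic grid is what makes the sequence increasing: refining from step $2^{-n}$ to $2^{-(n+1)}$ can only raise the rounded-down value.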

A.2 Positive Measures and Integration

In this section, we will consider functions defined on a σ-algebra A. Such functions are called set functions. For now, we will restrict our attention to set functions taking values in the interval [0, ∞].

Definition A.6 Let (X, A) be a measurable space. A set function μ : A → [0, ∞] is said to be countably additive if

\[ \mu\Big( \bigcup_{j=1}^{\infty} A_j \Big) = \sum_{j=1}^{\infty} \mu(A_j), \tag{A.1} \]

whenever $(A_j)_{j=1}^{\infty}$ is a sequence of pairwise disjoint measurable sets. By pairwise disjoint we mean $A_j \cap A_k = \emptyset$ whenever j ≠ k.

A countably additive set function μ : A → [0, ∞] such that μ(∅) = 0 is called a positive measure on A. In such a case, we call the triple (X, A, μ) a positive measure space. If the σ-algebra is understood, we often write (X, μ) for the positive measure space and say μ is a positive measure on X.

Notice that we allow μ to attain infinite values. The assumption that μ(∅) = 0 is to avoid the trivial case where the set function is always ∞. Equivalently, we could assume that there exists some A ∈ A such that μ(A) < ∞.

Example A.7 The Borel σ-algebra on R is the smallest σ-algebra that contains all of the open intervals in R. A measure defined on the Borel σ-algebra on R is called a Borel measure on R.

Let μ be a positive measure on a σ-algebra. A measurable set that has zero measure with respect to μ is called a μ-null set or a μ-measure-zero set. A subset of a μ-null set is called μ-negligible. A μ-negligible set may or may not be measurable. If every μ-negligible set is measurable (and hence a μ-null set), then the measure μ is called complete. (That is, μ is complete if every subset of a μ-null set is μ-measurable and has μ-measure zero.)

Example A.8 (Lebesgue measure) Let B be the Borel σ-algebra on R and let m be a positive Borel measure on B defined so that m((a, b)) = b − a whenever a and b are real numbers such that a

Definition A.10 A positive measure that attains only finite values is called a finite measure. If μ is a positive measure on X such that μ(X) = 1, then we call μ a probability measure.

Example A.11 The restriction of the measure λ from Example A.8 to the unit interval [0, 1] is a probability measure.

Let us make some obvious comments about positive measures. Suppose (X, A, μ) is a positive measure space. If A and B are measurable sets such that A ⊆ B, then μ(A) ≤ μ(B). Also, observe that if μ is a finite positive measure with μ(X) > 0, then μ( · )/μ(X) is a probability measure.

Definition A.12 Let (X, A, μ) be a positive measure space and let $s = \sum_{j=1}^{n} a_j \chi_{E_j}$ be a simple function. For any A ∈ A, we define the integral of s over A with respect to μ to be

\[ \int_A s \, d\mu = \sum_{j=1}^{n} a_j \, \mu(A \cap E_j). \]

(The value of this integral does not depend on the representation of s that is used.) If f : X → R is a nonnegative measurable function, then we define the integral of f over A with respect to μ to be

\[ \int_A f \, d\mu = \sup \Big\{ \int_A s \, d\mu \Big\}, \]

where the supremum is taken over all simple functions s such that 0 ≤ s(x) ≤ f(x) for all x ∈ X.

In order to meaningfully extend our notion of an integral to a wider class of functions, we first define the class of functions for which our integral will exist.

Definition A.13 Let (X, A, μ) be a positive measure space. A scalar-valued measurable function f is called Lebesgue integrable (or just integrable) with respect to μ if

\[ \int_X |f| \, d\mu < \infty. \]

If f is integrable with respect to μ, then we write f ∈ L1(μ).

Now, let (X, A, μ) be a positive measure space and let f : X → R be a real-valued integrable function. Define the integral of f over A by

\[ \int_A f \, d\mu = \int_A f^+ \, d\mu - \int_A f^- \, d\mu, \]

where

\[ f^+(x) = \begin{cases} f(x) & \text{if } f(x) \ge 0, \\ 0 & \text{otherwise,} \end{cases} \]

is the positive part of f, and

\[ f^-(x) = \begin{cases} -f(x) & \text{if } f(x) \le 0, \\ 0 & \text{otherwise,} \end{cases} \]

is the negative part of f. If f : X → C is a complex-valued integrable function, define the integral of f over A by

\[ \int_A f \, d\mu = \int_A \operatorname{Re}(f) \, d\mu + i \int_A \operatorname{Im}(f) \, d\mu, \]

where Re(f) and Im(f) denote the real and imaginary parts of f, respectively.

To avoid certain arithmetic difficulties which arise from the definitions above, we adopt the convention that 0 · ∞ = 0. Consequently, $\int_A f \, d\mu = 0$ whenever f(x) = 0 for all x ∈ A, even if μ(A) = ∞. Furthermore, if μ(A) = 0 for a measurable set A ∈ A, then $\int_A f \, d\mu = 0$ for all measurable functions f.

In some cases, it is useful to explicitly show the variable dependence in an integration, and in such a situation we write

\[ \int_A f \, d\mu = \int_A f(x) \, \mu(dx). \]
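To see the definitions of this section at work, here is a short worked computation. It assumes that λ denotes Lebesgue measure on R, so that every bounded interval has measure equal to its length; the particular functions are chosen here for illustration only.

```latex
% Worked example (illustrative; \lambda is assumed to be Lebesgue measure on R).
Let $s = 2\chi_{[0,1]} + 3\chi_{(1,3]}$ and $A = [0,2]$. By Definition A.12,
\[
  \int_A s \, d\lambda
  = 2\,\lambda\bigl([0,2]\cap[0,1]\bigr) + 3\,\lambda\bigl([0,2]\cap(1,3]\bigr)
  = 2\cdot 1 + 3\cdot 1 = 5 .
\]
Next let $f = \chi_{[0,1]} - \chi_{(1,3]}$. Then $f^+ = \chi_{[0,1]}$ and
$f^- = \chi_{(1,3]}$, so that
\[
  \int_{\mathbb{R}} |f| \, d\lambda = 1 + 2 = 3 < \infty
  \qquad\text{and}\qquad
  \int_{\mathbb{R}} f \, d\lambda
  = \int_{\mathbb{R}} f^+ \, d\lambda - \int_{\mathbb{R}} f^- \, d\lambda
  = 1 - 2 = -1 .
\]
% The convention 0 \cdot \infty = 0 is what gives \int_{\mathbb{R}} 0 \, d\lambda = 0
% even though \lambda(\mathbb{R}) = \infty.
```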

A.3 Convergence Theorems and Fatou’s Lemma

Definition A.14 Let (X, A, μ) be a positive measure space. Two measurable functions f and g are said to be equal almost everywhere if μ{x ∈ X : f(x) ≠ g(x)} = 0. If f and g are equal almost everywhere, we write f = g a.e.(μ). Sometimes we say f(x) = g(x) for almost every x.

The previous definition is significant because, if f = g a.e.(μ), then

\[ \int_A f \, d\mu = \int_A g \, d\mu \]

for every A ∈ A. This is true because the integral over a set of measure zero is always zero. Functions that are equal almost everywhere are indistinguishable from the point of view of integration. With this in mind, we extend our notion of a measurable function.

Let (X, A, μ) be a measure space. A function f is defined a.e.(μ) on X if the domain D of f is a subset of X and X \ D is a μ-negligible set. If f is defined a.e.(μ) on X and there is a μ-null set E ∈ A such that $f^{-1}(V) \setminus E$ is measurable for every open set V, then f is called μ-measurable. Every measurable function (in the sense of Definition A.2) is μ-measurable and every μ-measurable function is equal almost everywhere (with respect to μ) to a measurable function.

The notion of μ-measurability allows us to extend our constructions to a wider collection of functions. We say that a μ-measurable function f is integrable if it is equal a.e.(μ) to a measurable function g that is integrable with respect to μ and we define $\int_A f \, d\mu$ to be $\int_A g \, d\mu$ for every measurable set A.

A sequence of measurable functions $(f_n)_{n=1}^{\infty}$ is said to converge almost everywhere to a function f if the set $\{x \in X : f_n(x) \not\to f(x)\}$ has μ-measure zero. In this case, we write fn → f a.e.(μ). A function f which is the a.e.(μ)-limit of measurable functions is μ-measurable.

Theorem A.15 (Monotone Convergence Theorem) Let (X, A, μ) be a positive measure space. Suppose $(f_n)_{n=1}^{\infty}$ is a sequence of measurable functions such that

\[ 0 \le f_1(x) \le f_2(x) \le f_3(x) \le \cdots \]

for almost every x. If fn → f a.e.(μ), then f is a μ-measurable function and (with both sides possibly equal to ∞)

\[ \int_X f \, d\mu = \lim_{n\to\infty} \int_X f_n \, d\mu. \]

The Monotone Convergence Theorem is a very useful tool, and one of the better known consequences of it is Fatou’s Lemma, named after the mathematician Pierre Fatou.

Theorem A.16 (Fatou’s Lemma) Let (X, A, μ) be a positive measure space. If $(f_n)_{n=1}^{\infty}$ is a sequence of scalar-valued measurable functions, then

\[ \int_X \liminf_{n\to\infty} |f_n| \, d\mu \le \liminf_{n\to\infty} \int_X |f_n| \, d\mu. \]

Perhaps the most important use of Fatou’s Lemma is in proving the next theorem, which is one of the cornerstones of measure theory.

Theorem A.17 (Lebesgue’s Dominated Convergence Theorem) Let (X, A, μ) be a positive measure space and suppose $(f_n)_{n=1}^{\infty}$ is a sequence of scalar-valued measurable functions that converge almost everywhere to f. If there exists a function g ∈ L1(μ) such that |fn| ≤ g a.e.(μ) for all n ∈ N, then f ∈ L1(μ) and

\[ \lim_{n\to\infty} \int_X |f_n - f| \, d\mu = 0. \]

In particular,

\[ \int_X f \, d\mu = \lim_{n\to\infty} \int_X f_n \, d\mu. \]

The last equality follows from the fact that $\bigl| \int_X f \, d\mu \bigr| \le \int_X |f| \, d\mu$ whenever f is an integrable function. The next result is a direct consequence of Lebesgue’s Dominated Convergence Theorem.

Corollary A.18 (Bounded Convergence Theorem) Let (X, A, μ) be a positive finite measure space and suppose $(f_n)_{n=1}^{\infty}$ is a sequence of uniformly bounded scalar-valued measurable functions. If fn → f a.e.(μ), then f ∈ L1(μ) and

\[ \int_X f \, d\mu = \lim_{n\to\infty} \int_X f_n \, d\mu. \]
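The hypotheses in the convergence theorems of this section cannot simply be dropped. The following standard example, which is not taken from the text and assumes λ is Lebesgue measure restricted to [0, 1], shows a pointwise convergent sequence whose integrals do not converge to the integral of the limit.

```latex
% Standard counterexample (illustrative; \lambda = Lebesgue measure on [0,1]).
For $n \in \mathbb{N}$ let $f_n = n \, \chi_{(0,1/n)}$ on $X = [0,1]$. Then
\[
  \lim_{n\to\infty} f_n(x) = 0 \quad \text{for every } x \in [0,1],
  \qquad\text{but}\qquad
  \int_X f_n \, d\lambda = n \cdot \tfrac{1}{n} = 1 \quad \text{for all } n .
\]
\begin{itemize}
  \item Fatou's Lemma holds with strict inequality:
        $\int_X \liminf_n |f_n| \, d\lambda = 0 < 1 = \liminf_n \int_X |f_n| \, d\lambda$.
  \item The sequence is not monotone, so Theorem A.15 does not apply.
  \item The sequence is not uniformly bounded, so Corollary A.18 does not apply.
  \item No $g \in L_1(\lambda)$ dominates every $f_n$: for $0 < x < 1$ one has
        $\sup_n f_n(x) \ge \tfrac{1}{x} - 1$, and $\int_0^1 (\tfrac{1}{x} - 1)\,dx = \infty$.
        So Theorem A.17 does not apply either.
\end{itemize}
```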

A.4 Complex Measures and Absolute Continuity

Definition A.19 Let (X, A) be a measurable space. A countably additive set function μ : A → C is called a complex measure. When we say μ is countably additive, we mean

\[ \mu\Big( \bigcup_{j=1}^{\infty} A_j \Big) = \sum_{j=1}^{\infty} \mu(A_j), \tag{A.2} \]

whenever $(A_j)_{j=1}^{\infty}$ is a sequence of pairwise disjoint measurable sets in A, where the series in (A.2) is absolutely convergent. (Compare to Definition A.6.)

A complex measure which takes values in R is called a real measure or a signed measure. If μ is a complex measure on A, then the triple (X, A, μ) is called a (complex) measure space. If the σ-algebra is understood, then we may write (X, μ).

More generally, we can define a measure to be a countably additive set function that takes values in any topological vector space, so long as the convergence of the series in (A.2) makes sense. For our purposes, such generality is not necessary, and so we will confine ourselves to complex measures and positive measures.

There is a significant difference between positive measures and complex measures. In Definition A.19, we require that a complex measure μ be finite; that is, |μ(A)| < ∞ for all A ∈ A. This was not a requirement for a positive measure.

Definition A.20 Let μ be a complex measure on the σ-algebra A. We define the total variation measure of μ to be the set function |μ| : A → R given by

\[ |\mu|(E) = \sup \Big\{ \sum_{j=1}^{n} |\mu(E_j)| \Big\}, \]

where the supremum is taken over all finite sequences of pairwise-disjoint measurable sets $(E_j)_{j=1}^{n}$, for all n ∈ N, such that $E = \bigcup_{j=1}^{n} E_j$.

As the name suggests, the total variation measure is a measure. Additionally, it is a finite measure, a fact that we state in the next theorem.

Theorem A.21 If (X, A, μ) is a complex measure space, then |μ| is a positive measure on A and |μ|(X) < ∞.

There is, naturally, a relationship between a measure and its total variation measure. For example, if |μ|(E) = 0 for some set E ∈ A, it must be the case that μ(E) = 0. This is an instance of a property called absolute continuity.

Definition A.22 Let μ be a complex measure on A and suppose λ is a positive measure on A. We say that μ is absolutely continuous with respect to λ if, for all A ∈ A, we have that μ(A) = 0 whenever λ(A) = 0. When μ is absolutely continuous with respect to λ, we write μ ≪ λ.

Shortly, we will state the Radon–Nikodým Theorem, one of the key results in measure theory. Before that, however, we introduce a definition.

Definition A.23 Let (X, A) be a measurable space. A positive measure μ on A is said to be σ-finite if there exists a countable sequence $(E_j)_{j=1}^{\infty}$ of measurable sets such that $X = \bigcup_{j=1}^{\infty} E_j$ and $\mu(E_j) < \infty$ for each j ∈ N.

Important examples of σ-finite measures include Lebesgue measure on R and counting measure on N. (See Examples A.8 and A.9, respectively.) One can also have a counting measure on R, but it is not σ-finite.

Theorem A.24 (Radon–Nikodým Theorem) Suppose (X, A) is a measurable space and let λ be a positive σ-finite measure on A. If μ is a complex measure on A that is absolutely continuous with respect to λ, then there exists a unique g ∈ L1(λ) such that

\[ \mu(E) = \int_E g(x) \, \lambda(dx) \tag{A.3} \]

for every E ∈ A.

The equation in (A.3) is sometimes written μ(dx) = g(x) λ(dx) or dμ = g dλ. The function g ∈ L1(λ) is called the Radon–Nikodým derivative of μ with respect to λ and is sometimes denoted dμ/dλ. When we say the Radon–Nikodým derivative is unique, we mean up to a set of measure zero. That is, if g and h are both Radon–Nikodým derivatives of μ with respect to λ, then g = h a.e.(λ) (and consequently a.e.(μ)). The σ-finite assumption on λ in Theorem A.24 cannot be relaxed.

We seek to define an integral with respect to a complex measure. To that end, let (X, A, μ) be a complex measure space. Since μ ≪ |μ|, there exists (by the Radon–Nikodým Theorem) a unique (up to sets of measure zero) function g ∈ L1(|μ|) such that dμ = g d|μ|. We can, therefore, define the integral of a measurable function f : X → C by

\[ \int_E f \, d\mu = \int_E f g \, d|\mu| \tag{A.4} \]

for every E ∈ A.

More can be said about the Radon–Nikodým derivative of μ with respect to |μ|. The following proposition is a corollary of Theorem A.24 and is often called the polar representation or polar decomposition of μ.

Proposition A.25 If (X, A, μ) is a complex measure space, then there exists a measurable function g such that |g(x)| = 1 for all x ∈ X and such that dμ = g d|μ|.

As a consequence of Proposition A.25 and the definition in (A.4), versions of the Monotone Convergence Theorem, Fatou’s Lemma, and Lebesgue’s Dominated Convergence Theorem hold for integrals with respect to complex measures. We apply the existing theorems to the positive finite measure |μ| and observe that

\[ \Big| \int_X f \, d\mu \Big| \le \int_X |f| \, d|\mu| \]

for all measurable functions f on X.

The Radon–Nikodým Theorem is named after Johann Radon, who proved the theorem for $\mathbb{R}^n$ in 1913 (n ∈ N), and Otton Nikodým, who proved the theorem for the general case in 1930 [28].
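A concrete instance may help fix the ideas of Theorem A.24 and Proposition A.25. The example below is an illustration added here, with λ assumed to be Lebesgue measure on [−1, 1]; it relies on the standard fact, not stated in the text, that a measure of the form dμ = g dλ has total variation d|μ| = |g| dλ.

```latex
% Illustrative example (assumes \lambda = Lebesgue measure on X = [-1,1]).
Define a real measure on the Borel sets of $X = [-1,1]$ by
\[
  \mu(E) = \int_E x \, \lambda(dx).
\]
Then $\mu \ll \lambda$ with Radon--Nikod\'ym derivative $d\mu/d\lambda = g$, where $g(x) = x$.
Using $d|\mu| = |g| \, d\lambda$ (a standard fact, assumed here), we get
\[
  \mu(X) = \int_{-1}^{1} x \, dx = 0,
  \qquad
  |\mu|(X) = \int_{-1}^{1} |x| \, dx = 1 .
\]
A polar representation $d\mu = h \, d|\mu|$ is given by
\[
  h(x) = \begin{cases} 1 & \text{if } x \ge 0, \\ -1 & \text{if } x < 0, \end{cases}
\]
since $h(x)\,|x| = x$ for every $x$, and $|h(x)| = 1$ for all $x \in X$.
% The value h(0) = 1 is an arbitrary choice: {0} is a |\mu|-null set.
```

The example also shows that μ(X) = 0 does not force |μ|(X) = 0; cancellation between the positive and negative parts is invisible to μ(X) but not to the total variation.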

A.5 Lp-spaces

In this section, we consider a measure space (X, A, μ), where μ is a positive measure. We will identify certain spaces of measurable functions on X. For p ∈ [1, ∞), let

\[ L_p(\mu) = \Big\{ f \text{ a measurable function} : \int_X |f|^p \, d\mu < \infty \Big\}. \]

This is the space of p-integrable functions, also known as Lp-functions, on X. For each p ∈ [1, ∞), we let

\[ \|f\|_p = \Big( \int_X |f|^p \, d\mu \Big)^{1/p}, \]

where f is a μ-measurable function on X. Observe that $\|f\|_p < \infty$ if and only if f ∈ Lp(μ). For the case p = ∞, let

\[ \|f\|_\infty = \inf \bigl\{ K : \mu(|f| > K) = 0 \bigr\}, \]

for all measurable functions f. The quantity $\|f\|_\infty$ is called the essential supremum norm of f, and is the smallest number having the property that $|f| \le \|f\|_\infty$ a.e.(μ). The set $L_\infty(\mu) = \{ f \text{ a measurable function} : \|f\|_\infty < \infty \}$ is the space of essentially bounded measurable functions on X.
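The norms just defined are easy to compute in simple cases. The two computations below are illustrations added here; the second assumes λ is Lebesgue measure on (0, 1].

```latex
% Illustrative computations (the second assumes \lambda = Lebesgue measure on (0,1]).
\textbf{Characteristic functions.} If $E$ is measurable with $0 < \mu(E) < \infty$, then
\[
  \|\chi_E\|_p = \Big( \int_X \chi_E \, d\mu \Big)^{1/p} = \mu(E)^{1/p}
  \quad (1 \le p < \infty),
  \qquad
  \|\chi_E\|_\infty = 1 ,
\]
because $\mu(\chi_E > K) = \mu(E) > 0$ for $0 \le K < 1$, and $\mu(\chi_E > K) = 0$ for $K \ge 1$.

\textbf{An unbounded integrable function.} Let $f(x) = x^{-1/2}$ on $(0,1]$. Then
\[
  \|f\|_1 = \int_0^1 x^{-1/2} \, dx = 2,
  \qquad
  \int_0^1 |f|^2 \, d\lambda = \int_0^1 \frac{dx}{x} = \infty ,
\]
so $f \in L_1(\lambda)$ but $f \notin L_2(\lambda)$. Also $f \notin L_\infty(\lambda)$, since
$\lambda(f > K) = \lambda\bigl((0, K^{-2})\bigr) = K^{-2} > 0$ for every $K \ge 1$.
```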

Theorem A.26 Let (X, μ) be a positive measure space. If 1 ≤ p ≤ ∞, then $\|\cdot\|_p$ is a complete norm on Lp(μ). In particular, Lp(μ) is a Banach space.

In fact, the Lp-spaces are collections of equivalence classes of measurable functions. Two functions f and g in Lp(μ) are considered equivalent if f = g a.e.(μ). In spite of this, we will generally speak of the elements in Lp-spaces as functions, rather than equivalence classes of functions.

The proof of Theorem A.26 relies heavily on the following fundamental inequality, which provides the triangle inequality for Lp(μ).

Theorem A.27 (Minkowski’s Inequality) Let (X, μ) be a positive measure space. If f and g are in Lp(μ), where 1 ≤ p ≤ ∞, then f + g ∈ Lp(μ) and

\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \]

Before we give another theorem, we must introduce a definition. ∞ 1 + 1 = Definition A.28 If 1

Hölder’s Inequality ensures that, given any g ∈ Lq (μ), the map

\[ f \mapsto \int_X f(x) \, g(x) \, \mu(dx), \quad f \in L_p(\mu), \]

defines a bounded linear functional on Lp(μ) whenever 1 ≤ p < ∞. It turns out that any bounded linear functional on Lp(μ) can be achieved in this way, which is the content of the next theorem.

Theorem A.30 Let (X, μ) be a positive σ-finite measure space. If 1 ≤ p < ∞, and if q is conjugate to p, then $L_p(\mu)^* = L_q(\mu)$.

Frequently, in order to prove something about Lp-spaces, it is sufficient to prove it for simple functions. This is a consequence of the next theorem, which follows from Theorem A.5 and Theorem A.15 (the Monotone Convergence Theorem). Theorem A.31 (Density of Simple Functions) Let (X, μ) be a positive measure space. The set of simple functions is dense in Lp(μ) whenever 1 ≤ p ≤∞.

A.6 Borel Measurability and Measures

Definition A.32 Suppose X is a topological space. The smallest σ-algebra on X containing the open sets in X is called the Borel σ-algebra, or the Borel field, on X. A function which is measurable with respect to the Borel σ-algebra is called a Borel measurable function, or a Borel function. A measure on the Borel σ-algebra is called a Borel measure.

We recall that a topological space is said to be locally compact if every point has a compact neighborhood. Naturally, all compact spaces are locally compact, but the converse need not be true. For example, the real line R with its standard topology is locally compact, but not compact.

If X is a locally compact Hausdorff space, then we denote by C0(X) the collection of all continuous functions that vanish at infinity. We say f vanishes at infinity if for every ε > 0, there exists a compact set K such that |f(x)| < ε for all x ∉ K. The set C0(X) is a Banach space under the supremum norm.

Example A.33 The Banach space C0(N) is the classical sequence space c0 of sequences tending to zero.

Definition A.34 Let X be a locally compact Hausdorff space and suppose μ is a positive Borel measure on X. If, for every measurable set E,

\[ \mu(E) = \inf \{ \mu(V) : E \subseteq V, \ V \text{ an open set} \}, \]

then μ is called outer regular. If, for every measurable set E,

\[ \mu(E) = \sup \{ \mu(K) : K \subseteq E, \ K \text{ a compact set} \}, \]

then μ is called inner regular. If μ is both inner regular and outer regular, then we call μ a regular measure. A complex measure μ is called regular if |μ| is regular.

Definition A.34 allows us to describe the dual space of C0(X). Theorem A.35 (Riesz Representation Theorem) Let X be a locally compact Hausdorff space. If Λ is a bounded linear functional on C0(X), then there exists a unique regular complex Borel measure μ such that

\[ \Lambda(f) = \int_X f \, d\mu, \quad f \in C_0(X), \]

and $\|\Lambda\| = |\mu|(X)$.

Theorem A.35 is named after F. Riesz, who originally proved the theorem for the special case X = [0, 1] [31]. The next theorem is named after Nikolai Lusin (or Luzin), who also worked in the context of the real line [23].

Theorem A.36 (Lusin’s Theorem) Let X be a locally compact Hausdorff space and suppose μ is a finite positive regular Borel measure on X. The space C0(X) is dense in Lp(μ) for all p ∈ [1, ∞).

The hypotheses of Lusin’s Theorem can be relaxed somewhat. Indeed, the theorem holds for any positive Borel measure that is finite on compact sets. Note that Lusin’s Theorem is not true when p = ∞. Convergence in the L∞-norm is the same as uniform convergence, and the uniform limit of a sequence of continuous functions is continuous; however, functions in L∞(μ) need not be continuous.

Proposition A.37 If X is a locally compact metrizable space, then any finite Borel measure on X is necessarily regular. Again, the Borel measure in question need only be finite on compact sets for the conclusion to hold. In light of Proposition A.37, Theorem A.35 is often stated for a metrizable space X, in which case the term “regular” is omitted from the conclusion. (This was done, for example, in Theorem 2.20 of the current text.)

A.7 Product Measures

Let (X, A) and (Y, B) be two measurable spaces. A measurable rectangle in X × Y is any set of the form A × B, where A ∈ A and B ∈ B. We denote by σ(A × B) the smallest σ-algebra containing all measurable rectangles in X × Y.

Proposition A.38 Let (X, A) and (Y, B) be two measurable spaces. If a scalar-valued function f on X × Y is σ(A × B)-measurable, then the map x → f(x, y) is A-measurable for all y ∈ Y, and y → f(x, y) is B-measurable for all x ∈ X.

Now let (X, A, μ) and (Y, B, ν) be two σ-finite measure spaces. We define the product measure on σ(A × B) to be the set function μ × ν given by the formula

\[ (\mu \times \nu)(Q) = \int_X \Big( \int_Y \chi_Q(x, y) \, \nu(dy) \Big) \mu(dx) = \int_Y \Big( \int_X \chi_Q(x, y) \, \mu(dx) \Big) \nu(dy), \]

for all Q ∈ σ(A × B). As the name implies, μ × ν is a measure on σ(A × B). Furthermore, the measure μ × ν is such that (μ × ν)(E × F) = μ(E)ν(F) for all E ∈ A and F ∈ B.

The fundamental result in this section is known as Fubini’s Theorem. It is named after Guido Fubini, who proved a version of the theorem in 1907 [13].

Theorem A.39 (Fubini’s Theorem) Let (X, A, μ) and (Y, B, ν) be two σ-finite measure spaces. If f ∈ L1(μ × ν), then

\[ \int_{X \times Y} f \, d(\mu \times \nu) = \int_X \Big( \int_Y f(x, y) \, \nu(dy) \Big) \mu(dx) = \int_Y \Big( \int_X f(x, y) \, \mu(dx) \Big) \nu(dy). \]

The same conclusion holds if f is a σ(A × B)-measurable function such that

\[ \int_X \Big( \int_Y |f(x, y)| \, \nu(dy) \Big) \mu(dx) < \infty. \]

Fubini’s Theorem is sometimes called the Fubini-Tonelli theorem, after Leonida Tonelli, who proved a version of Theorem A.39 in 1909 [35].

Appendix B Results From Other Areas of Mathematics

Throughout the course of this text, we have invoked important results (usually by name), sometimes without explicitly writing out the statement of the result being used. Many of these come from measure theory, and so appear in Appendix A. We include the rest here, for easy reference. For proofs of these theorems, as well as more discussion on these topics, see (for example) Topology: a first course by James Munkres [27] and Functions of One Complex Variable by John Conway [7].

B.1 The One-Point and Stone-Čech Compactifications

We will give only a brief discussion of the necessary topological concepts. For more, we refer the interested reader to Topology: a first course, by James Munkres [27].

We start by defining the one-point compactification of a locally compact Hausdorff space that is not compact. (Recall that a space is locally compact if every point has a compact neighborhood containing it.)

Definition B.1 Let X be a locally compact Hausdorff space that is not compact. Adjoin to X an element ∞, called the point at infinity, to form a set Y = X ∪ {∞}. Define a topology on Y by declaring a set U to be open in Y if either U is open in X or U = Y \ K, where K is compact in X. The space Y is called the one-point compactification of X.

The classic example of a one-point compactification is N ∪ {∞}, the one-point compactification of the natural numbers, where N is given the discrete topology. Notice that N is locally compact but not compact. The space N ∪ {∞} is homeomorphic to the subspace {1/n : n ∈ N} ∪ {0} of R. (See Exercise 2.15.)
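The homeomorphism mentioned in the last sentence can be written down explicitly. The following is a sketch of one way to verify it, offered as an illustration rather than as the intended solution of Exercise 2.15; the map h is a name introduced here.

```latex
% Sketch (the name h and the argument are assumptions of this illustration).
Define $h : \mathbb{N} \cup \{\infty\} \to \{1/n : n \in \mathbb{N}\} \cup \{0\}$ by
\[
  h(n) = \frac{1}{n} \quad (n \in \mathbb{N}), \qquad h(\infty) = 0 .
\]
\begin{itemize}
  \item $h$ is a bijection.
  \item $h$ is continuous at each $n \in \mathbb{N}$, because $\{n\}$ is open in
        $\mathbb{N} \cup \{\infty\}$.
  \item $h$ is continuous at $\infty$: the compact subsets of the discrete space
        $\mathbb{N}$ are exactly the finite sets, so the open neighborhoods of $\infty$
        are the sets $Y \setminus K$ with $K \subseteq \mathbb{N}$ finite. Given
        $\varepsilon > 0$, the set $K = \{ n : 1/n \ge \varepsilon \}$ is finite and
        $h(Y \setminus K) \subseteq [0, \varepsilon)$.
  \item A continuous bijection from a compact space onto a Hausdorff space is a
        homeomorphism, and $\mathbb{N} \cup \{\infty\}$ is compact by Theorem B.2.
\end{itemize}
```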


Theorem B.2 Let X be a locally compact Hausdorff space that is not compact and let Y be its one-point compactification. Then Y is a compact Hausdorff space, Y \ X contains exactly one point, X is a subspace of Y, and Y is the closure of X (in Y).

In general, the one-point compactification of a locally compact metric space is not itself metrizable; however, there are some circumstances under which a metric does exist.

Theorem B.3 Let X be a locally compact Hausdorff space that is not compact. The one-point compactification of X is metrizable if and only if X is second countable.

In particular, any locally compact separable metric space that is not compact (such as N) will have a metrizable one-point compactification. We will not provide a proof of Theorem B.3. The inquisitive reader is directed to Theorem 3.44 in [3].

The next type of compactification we wish to define is the Stone-Čech compactification. Before introducing this concept, however, we require some background. We begin by restating Tychonoff’s Theorem (Theorem 5.38), which we will use presently.

Theorem B.4 (Tychonoff’s Theorem) Let J be an index set. If $\{K_\alpha\}_{\alpha \in J}$ is a collection of compact topological spaces, then $\prod_{\alpha \in J} K_\alpha$ is compact in the product topology.

Tychonoff’s Theorem is a statement about compactness, and while we do not provide a proof of this theorem, we can profit from some knowledge of the methods used within. The standard proof of Tychonoff’s Theorem uses an alternate characterization of compactness, one for which we need a definition.

Definition B.5 (Finite Intersection Property) Let X be a topological space. A collection $\mathcal{C}$ of subsets of X is said to satisfy the Finite Intersection Property if for every finite subcollection $\{C_1, \ldots, C_n\}$ of $\mathcal{C}$, the intersection $C_1 \cap \cdots \cap C_n$ is nonempty.

We now state an alternative formulation of compactness.

Theorem B.6 A topological space X is compact if and only if every collection $\mathcal{C}$ of closed sets in X satisfying the Finite Intersection Property is such that the intersection $\bigcap_{C \in \mathcal{C}} C$ of all elements in $\mathcal{C}$ is nonempty.

The above theorem leads directly to the following useful corollary.

Corollary B.7 (Nested Interval Property) Let X be a topological space. If $(C_n)_{n=1}^{\infty}$ is a sequence of nonempty compact sets such that $C_{n+1} \subseteq C_n$ for all n ∈ N, then the intersection $\bigcap_{n=1}^{\infty} C_n$ of all sets in the sequence is nonempty.

A space X is called completely regular if all singletons (single-point sets) in X are closed and if for each x ∈ X and each closed subset A not containing x there exists a bounded continuous function f : X → [0, 1] such that f(x) = 1 and f(A) = {0}.

In particular, if X is completely regular, then the collection of bounded real-valued continuous functions on X separates the points of X. That is, whenever x ≠ y, there is some continuous function f : X → [0, 1] such that f(x) = 1 but f(y) = 0.

We remark here that a space X is called regular if any closed subset A and point x ∉ A can be separated by open neighborhoods. That is to say, there exist disjoint open neighborhoods containing A and x, respectively. A completely regular space is regular, but there exist regular spaces that are not completely regular.

Proposition B.8 A topological space X is completely regular if and only if it is homeomorphic to a subspace of a compact Hausdorff space.

A compactification of a space X is a compact Hausdorff space Y that contains X as a dense subset. (An example is the one-point compactification given earlier.) In order for a noncompact space X to have a compactification, it is both necessary and sufficient that X be completely regular.

There are generally many compactifications for a completely regular space X. The one we consider now is the Stone-Čech compactification.

Let X be a completely regular space. Let $\{f_\alpha\}_{\alpha \in J}$ be the collection of all bounded continuous real-valued functions on X, indexed by some (possibly uncountable) set J. For each α ∈ J, let $I_\alpha$ be any closed interval containing the range of $f_\alpha$, say

\[ I_\alpha = \Big[ \inf_{x \in X} f_\alpha(x), \ \sup_{x \in X} f_\alpha(x) \Big], \]

or

\[ I_\alpha = \bigl[ -\|f_\alpha\|_\infty, \ \|f_\alpha\|_\infty \bigr]. \]

Each of the intervals $I_\alpha$ is compact in R, and so, by Tychonoff’s Theorem, the product $\prod_{\alpha \in J} I_\alpha$ is a compact space.

Define a map $\phi : X \to \prod_{\alpha \in J} I_\alpha$ by

φ(x) = (fα(x))α∈J , x ∈ X.

It can be shown that φ is an embedding. (This follows from the complete regularity of X, because the collection $\{f_\alpha\}_{\alpha \in J}$ separates the points of X. See Theorem 4.4.2 in [27] for the details.) We conclude that the closure $\overline{\phi(X)}$ is a compact Hausdorff space.

Let $A = \overline{\phi(X)} \setminus \phi(X)$ and define a space Y by Y = X ∪ A. We have a bijective mapping $\Phi : Y \to \overline{\phi(X)}$ given by the rule

\[ \Phi(y) = \begin{cases} \phi(y) & \text{if } y \in X, \\ y & \text{if } y \in A. \end{cases} \]

Define a topology on Y by letting U be open in Y whenever Φ(U) is open in $\overline{\phi(X)}$. It follows that Φ is a homeomorphism, and X is a subspace of the compact Hausdorff space Y.

The topological space Y constructed above is known as the Stone-Čech compactification of X, and is generally denoted β(X). It may seem that β(X) is not well-defined, since one could construct it with a different choice of sets; however, the Stone-Čech compactification is unique up to homeomorphism (and a homeomorphism can be chosen so that it is the identity on X).

While a full discussion of the Stone-Čech compactification is beyond the scope of this appendix, we will state a very important property that it possesses.

Theorem B.9 Let X be a completely regular space with Stone-Čech compactification β(X). Any bounded continuous real-valued function on X can be uniquely extended to a continuous real-valued function on β(X).

The Stone-Čech compactification is named after Marshall Harvey Stone and Eduard Čech. Tychonoff’s Theorem is named after Andrey Nikolayevich Tychonoff. It is not clear that these theorems are named quite as they should be, a fact which was remarked upon by Walter Rudin in his Functional Analysis [34, Note to Appendix A]:

. . . Thus it appears that Čech proved the Tychonoff theorem, whereas Tychonoff found the Čech compactification – a good illustration of the historical reliability of mathematical nomenclature.

B.2 Complex Analysis

The subject of complex analysis is overflowing with fantastic and improbable theorems. Unfortunately, we only encounter a small portion of the subject in our current undertaking. Let us begin with some definitions. A complex-valued function $f$ is called differentiable at $z_0$ in the complex plane $\mathbb{C}$ if

$$f'(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0}$$

exists (and thus is a complex number). In this context, the notation $z \to z_0$ means that $|z - z_0| \to 0$. If $f$ is differentiable at every point in a set, then we say $f$ is differentiable in the set (or on the set). A complex-valued function is called holomorphic (or analytic) if it is differentiable in a neighborhood of every point in its domain. The next theorem is one of the key results of complex analysis.

Theorem B.10 A holomorphic function is infinitely differentiable.

The next theorem is a frequently cited result and is used to show, for example, that the disk algebra $A(\mathbb{D})$ is a subalgebra of $C(\mathbb{T})$, the space of continuous functions on the unit circle. (See Example 8.6(b).)

Theorem B.11 (Maximum Modulus Theorem) Let $D$ be a bounded open set in the complex plane. If $f$ is a holomorphic function in $D$ that is extendable to a continuous function on $\overline{D}$, then

$$\max_{z\in\overline{D}} |f(z)| = \max_{z\in\partial D} |f(z)|.$$

Holomorphic functions display a variety of interesting and useful properties. One such property is stated in the next theorem.
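Before turning to that theorem, the definitions above can be made concrete with a short computation. The functions $f(z) = z^2$ and $g(z) = \overline{z}$ are assumptions chosen here for illustration; they do not appear in the text. The sketch shows one function for which the difference quotient converges and one for which it does not, followed by a quick check of the Maximum Modulus Theorem.

```latex
% Example 1 (illustrative): f(z) = z^2 is differentiable at every z_0.
\[
  \frac{f(z) - f(z_0)}{z - z_0} = \frac{z^2 - z_0^2}{z - z_0} = z + z_0
  \;\longrightarrow\; 2 z_0 \qquad (z \to z_0).
\]
% Example 2 (illustrative): g(z) = \overline{z} is differentiable nowhere,
% because the difference quotient depends on the direction of approach.
\[
  \frac{g(z_0 + h) - g(z_0)}{h} = \frac{\overline{h}}{h}
  = \begin{cases}
      \phantom{-}1 & \text{if } h \in \mathbb{R} \setminus \{0\}, \\
      -1 & \text{if } h \in i\mathbb{R} \setminus \{0\}.
    \end{cases}
\]
% Maximum modulus check for f(z) = z^2 on the open unit disk:
% |f(z)| = |z|^2 \le 1 on the closed disk, with equality exactly when |z| = 1.
\[
  \max_{|z| \le 1} |z^2| = 1 = \max_{|z| = 1} |z^2| .
\]
```

The second example is a reminder that complex differentiability is much more restrictive than real differentiability: $g$ is a smooth map of the plane, yet it is not holomorphic anywhere.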

Theorem B.12 (Cauchy’s Integral Formula) Let $f : D \to \mathbb{C}$ be a holomorphic function on the simply connected domain $D$. If $\gamma$ is a closed curve in $D$, then

$$\int_\gamma f(z)\,dz = 0.$$

The integral appearing in Theorem B.12 is a standard line integral over a path $\gamma$. When we say $D$ is simply connected, we mean that any two points in $D$ can be connected by a path, and that any path connecting those two points can be continuously transformed into any other. A less precise way of saying that is to say $D$ has no “holes” in it. Cauchy’s Integral Formula has a converse, which is named after the mathematician and engineer Giacinto Morera.

Theorem B.13 (Morera’s Theorem) Let $D$ be a connected open set in the complex plane. If $f : D \to \mathbb{C}$ is a continuous function such that $\int_\gamma f(z)\,dz = 0$ for every closed piecewise continuously differentiable curve $\gamma$ in $D$, then $f$ is holomorphic on $D$.

Morera’s Theorem is used to show (among other things) that the uniform limit of holomorphic functions is holomorphic: Suppose $(f_n)_{n=1}^\infty$ is a uniformly convergent sequence of holomorphic functions, and suppose $f = \lim_{n\to\infty} f_n$. By Cauchy’s Integral Formula, if $\gamma$ is any closed piecewise continuously differentiable curve, then we have $\int_\gamma f_n(z)\,dz = 0$ for all $n \in \mathbb{N}$. Since

$$\int_\gamma f(z)\,dz = \lim_{n\to\infty} \int_\gamma f_n(z)\,dz = 0$$

(by uniform convergence), it follows from Morera’s Theorem that $f$ is holomorphic.

A function that is holomorphic on all of $\mathbb{C}$ is called an entire function. There are some truly remarkable theorems related to entire functions.

Theorem B.14 (Liouville’s Theorem) A bounded entire function is constant.

According to Liouville’s Theorem, if an entire function is not constant, then it must be unbounded. In fact, something even stronger can be said.

Theorem B.15 (Picard’s Lesser Theorem) If an entire function is not constant, then its image is the entire complex plane, with the possible exception of one point.

The classic example of such a non-constant function is $f(z) = e^z$. The range of this function is $\mathbb{C} \setminus \{0\}$.
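Returning briefly to Theorem B.12, a direct computation shows both the conclusion and the role of simple connectivity. The sketch below is illustrative only: the curve $\gamma(t) = e^{it}$, $0 \le t \le 2\pi$ (the unit circle) and the integrands $z^n$ are assumptions made for the example.

```latex
% Parametrize the unit circle by \gamma(t) = e^{it}, 0 \le t \le 2\pi,
% so that dz = i e^{it} dt. For any integer n:
\[
  \int_\gamma z^n \, dz
  = \int_0^{2\pi} e^{int} \, i e^{it} \, dt
  = i \int_0^{2\pi} e^{i(n+1)t} \, dt
  = \begin{cases}
      0 & \text{if } n \neq -1, \\
      2\pi i & \text{if } n = -1.
    \end{cases}
\]
% For n \ge 0 the function z^n is holomorphic on the simply connected
% domain \mathbb{C}, and the integral vanishes, as Theorem B.12 predicts.
% For n = -1 the function 1/z is holomorphic only on \mathbb{C} \setminus \{0\},
% which is not simply connected, and the integral is 2\pi i \neq 0.
```

The case $n = -1$ suggests that the hypothesis of simple connectivity cannot simply be dropped: the “hole” at the origin is exactly what allows a nonzero integral.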
Theorem B.15 was proved by Charles Émile Picard in 1879. Another significant theorem due to Picard concerns essential singularities.

Theorem B.16 (Picard’s Greater Theorem) Let $f$ be an analytic function with an essential singularity at $z_0$. On any neighborhood of $z_0$, the function $f$ will attain every value of $\mathbb{C}$, with the possible exception of one point, infinitely often.
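To see what Theorem B.16 asserts in a concrete case, one can solve for the preimages explicitly. The following sketch assumes the example $f(z) = e^{1/z}$, which has an essential singularity at $z_0 = 0$; the notation $w = re^{i\theta}$ is introduced here only for the computation.

```latex
% Fix a nonzero target value w = r e^{i\theta} with r > 0.
% Then e^{1/z} = w holds exactly when 1/z = \ln r + i(\theta + 2\pi k)
% for some integer k. Discarding the at most one k for which the
% right-hand side is zero, the solutions are
\[
  z_k = \frac{1}{\ln r + i(\theta + 2\pi k)}, \qquad k \in \mathbb{Z},
\]
% and they accumulate at the singularity:
\[
  |z_k| = \frac{1}{\sqrt{(\ln r)^2 + (\theta + 2\pi k)^2}}
  \;\longrightarrow\; 0 \qquad (|k| \to \infty).
\]
% Hence every neighborhood of 0 contains infinitely many z_k with
% f(z_k) = w. The single omitted value is w = 0, since e^{1/z} \neq 0.
```

This matches the statement of the theorem: every complex value except one (here, $0$) is attained infinitely often near the singularity.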

A function $f$ is said to have a singularity at the point $z_0$ if $f$ is analytic in a neighborhood of $z_0$, except that $f(z_0)$ does not exist. If $\lim_{z\to z_0} f(z)$ exists, then we call it a removable singularity. A classic example is the function $f(z) = \frac{\sin z}{z}$, which has a removable singularity at $z = 0$. A singularity $z_0$ that is not removable is a pole of $f$ if there is some $n \in \mathbb{N}$ such that

$$\lim_{z \to z_0} (z - z_0)^n f(z) \tag{B.5}$$

exists. We call $z_0$ a pole of order $n$ if $n$ is the smallest integer such that (B.5) exists. A simple example of a function with a polar singularity of order $n \in \mathbb{N}$ at $z = 0$ is $f(z) = \frac{1}{z^n}$. If the limit in (B.5) does not exist for any $n \in \mathbb{N}$, we call $z_0$ an essential singularity of $f$. Examples of functions with an essential singularity at $z = 0$ are $f(z) = e^{1/z}$ and $f(z) = \sin(1/z)$.

An alternate characterization of singularities can be obtained from the Laurent series of the function $f$. A function $f$ has a Laurent series about $z_0$ if there exists a doubly infinite sequence of scalars $(a_n)_{n\in\mathbb{Z}}$ such that

$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n.$$
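For the three example functions mentioned above, the Laurent series about $z_0 = 0$ can be written down directly from the familiar power series of $\sin$ and $\exp$. The expansions below are a sketch added for illustration; the pattern of negative-index coefficients is what distinguishes the three kinds of singularity.

```latex
\begin{align*}
  \frac{\sin z}{z}
    &= \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!}\, z^{2k}
     = 1 - \frac{z^2}{6} + \frac{z^4}{120} - \cdots
    && \text{(no negative powers; removable singularity)} \\[4pt]
  \frac{1}{z^n}
    &= z^{-n}
    && \text{(only } a_{-n} = 1 \text{ is nonzero; pole of order } n\text{)} \\[4pt]
  e^{1/z}
    &= \sum_{k=0}^{\infty} \frac{1}{k!}\, z^{-k}
     = 1 + \frac{1}{z} + \frac{1}{2z^2} + \cdots
    && \text{(} a_{-k} = 1/k! \neq 0 \text{ for every } k \geq 0\text{; essential singularity)}
\end{align*}
% Consistency check with (B.5) for e^{1/z}: along the positive real axis,
% t^n e^{1/t} \to \infty as t \to 0^+ for every n, so the limit never exists.
```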

There is a formula to compute $a_n$ for $n \in \mathbb{Z}$:

$$a_n = \frac{1}{2\pi i} \int_\gamma \frac{f(z)}{(z - z_0)^{n+1}}\,dz,$$

where $\gamma$ is a circle centered at $z_0$ and contained in an annular region in which $f$ is holomorphic. Any function with a singularity at $z_0$ will have a Laurent series there. Such a function has a pole of order $n \in \mathbb{N}$ at $z_0$ if $a_{-n} \neq 0$, but $a_{-k} = 0$ for all $k > n$. If $a_{-k} \neq 0$ for infinitely many $k \in \mathbb{N}$, then $f$ has an essential singularity at $z_0$.

Augustin-Louis Cauchy did fundamental research in complex analysis in the first half of the nineteenth century. He and Joseph Liouville, after whom Theorem B.14 is named, were contemporaries. Charles Émile Picard came later, not being born until 1856, only one year before Cauchy passed away (on 23 May 1857).

References

1. L. Alaoglu, Weak topologies of normed linear spaces. Ann. Math. (2) 41, 252–267 (1940)
2. F. Albiac, N.J. Kalton, Topics in Banach Space Theory, vol. 233 of Graduate Texts in Mathematics (Springer, New York, 2006)
3. C.D. Aliprantis, K.C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide, 3rd edn. (Springer, Berlin, 2006)
4. J.L. Bell, D.H. Fremlin, A geometric form of the axiom of choice. Fundam. Math. 77, 167–170 (1972)
5. J.W. Calkin, Two-sided ideals and congruences in the ring of bounded operators in Hilbert space. Ann. Math. (2) 42, 839–873 (1941)
6. P.G. Casazza, A tribute to Nigel J. Kalton (1946–2010). Notices Am. Math. Soc. 59, 942–951 (2012). (Peter G. Casazza, coordinating editor)
7. J.B. Conway, Functions of One Complex Variable, vol. 11 of Graduate Texts in Mathematics, 2nd edn. (Springer-Verlag, New York, 1978)
8. J.B. Conway, A Course in Operator Theory, vol. 21 of Graduate Studies in Mathematics (American Mathematical Society, Providence, 2000)
9. P. Enflo, A counterexample to the approximation problem in Banach spaces. Acta Math. 130, 309–317 (1973)
10. M. Fekete, Über die Verteilung der Wurzeln bei gewissen algebraischen Gleichungen mit ganzzahligen Koeffizienten. Math. Zeitschrift 17, 228–249 (1923)
11. E. Fischer, Sur la convergence en moyenne. C. R. Acad. Sci. (Paris) 144, 1022–1024 (1907)
12. I. Fredholm, Sur une classe d’équations fonctionnelles. Acta Math. 27, 365–390 (1903)
13. G. Fubini, Sugli integrali multipli. Rom. Accad. Lincei. Rend. (5) 16(1), 608–614 (1907)
14. I. Gelfand, Normierte Ringe. Rec. Math. N.S. (Mat. Sbornik) 9(51), 3–24 (1941)
15. I. Gelfand, A. Kolmogoroff, On rings of continuous functions on topological spaces. Dokl. Akad. Nauk. SSSR 22, 11–15 (1939)
16. I. Gelfand, M. Neumark, On the imbedding of normed rings into the ring of operators in Hilbert space. Rec. Math. N.S. (Mat. Sbornik) 12(54), 197–213 (1943)
17. A. Grothendieck, Produits tensoriels topologiques et espaces nucléaires. Mem. Am. Math. Soc. 1955, 140 (1955)
18. W. Hodges, Krull Implies Zorn. J. London Math. Soc. (2) 19, 285–287 (1979)
19. R.C. James, Characterizations of Reflexivity. Studia Math. 23, 205–216 (1963/1964)
20. Y. Katznelson, An Introduction to Harmonic Analysis, Cambridge Mathematical Library, 3rd edn. (Cambridge University Press, Cambridge, 2004)
21. M. Krein, D. Milman, On extreme points of regular convex sets. Studia Math. 9, 133–138 (1940)
22. W. Krull, Idealtheorie in Ringen ohne Endlichkeitsbedingung. Math. Ann. 101, 729–744 (1929)
23. N. Lusin, Sur les propriétés des fonctions mesurables. C. R. Acad. Sci. (Paris) 154, 1688–1690 (1912)
24. R.D. Mauldin (ed.), The Scottish Book, Birkhäuser Boston, Mass., 1981. Mathematics from the Scottish Café, Including selected papers presented at the Scottish Book Conference held at North Texas State University, Denton, Tex. (May 1979)
25. S. Mazur, Sur les anneaux linéaires. C. R. Acad. Sci. (Paris) 207, 1025–1027 (1938)
26. G.H. Moore, Zermelo’s Axiom of Choice. Its Origins, Development, and Influence, vol. 8 of Studies in the History of Mathematics and Physical Sciences (Springer-Verlag, New York, 1982)
27. J.R. Munkres, Topology: A First Course (Prentice-Hall Inc., Englewood Cliffs, 1975)
28. O. Nikodym, Sur une généralisation des intégrales de M. J. Radon. Fundam. Math. 15, 131–179 (1930)
29. J.C. Oxtoby, Measure and Category: A Survey of the Analogies Between Topological and Measure Spaces, vol. 2 of Graduate Texts in Mathematics, 2nd edn. (Springer-Verlag, New York, 1980)
30. F. Riesz, Sur les systèmes orthogonaux de fonctions. C. R. Acad. Sci. (Paris) 144, 615–619 (1907)
31. F. Riesz, Sur les opérations fonctionnelles linéaires. C. R. Acad. Sci. (Paris) 149, 974–977 (1909)
32. J.W. Roberts, A compact convex set with no extreme points. Studia Math. 60, 255–266 (1977)
33. W. Rudin, Real and Complex Analysis, 3rd edn. (McGraw-Hill Book Co., New York, 1987)
34. W. Rudin, Functional Analysis. International Series in Pure and Applied Mathematics, 2nd edn. (McGraw-Hill Inc., New York, 1991)
35. L. Tonelli, Sull’ integrazione per parti. Rom. Accad. Lincei. Rend. (5) 18(2), 246–253 (1909)
36. M. Väth, The dual space of L∞ is L1. Indag. Math. N.S. 9, 619–625 (1998)
37. W. Żelazko, Banach Algebras (Elsevier Publishing Co., Amsterdam, 1973). (Translated from the Polish by Marcin E. Kuczma.)

© Springer Science+Business Media, LLC 2014. A. Bowers, N. J. Kalton, An Introductory Course in Functional Analysis, Universitext, DOI 10.1007/978-1-4939-1945-1
