Linear Functional Analysis

Joan Cerdà

Graduate Studies in Mathematics, Volume 116

American Mathematical Society, Providence, Rhode Island

Real Sociedad Matemática Española, Madrid, Spain

Editorial Board of Graduate Studies in Mathematics David Cox (Chair) Rafe Mazzeo Martin Scharlemann Gigliola Staffilani

Editorial Committee of the Real Sociedad Matemática Española: Guillermo P. Curbera (Director), Luis Alías, Alberto Elduque, Emilio Carrizosa, Rosa María Miró, Bernardo Cascales, Pablo Pedregal, Javier Duoandikoetxea, Juan Soler

2010 Mathematics Subject Classification. Primary 46–01; Secondary 46Axx, 46Bxx, 46Exx, 46Fxx, 46Jxx, 47B15.

For additional information and updates on this book, visit www.ams.org/bookpages/gsm-116

Library of Congress Cataloging-in-Publication Data. Cerdà, Joan, 1942– Linear functional analysis / Joan Cerdà. p. cm. -- (Graduate studies in mathematics ; v. 116) Includes bibliographical references and index. ISBN 978-0-8218-5115-9 (alk. paper) 1. Functional analysis. I. Title. QA321.C47 2010 515.7--dc22 2010006449

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy a chapter for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given.

Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for such permission should be addressed to the Acquisitions Department, American Mathematical Society, 201 Charles Street, Providence, Rhode Island 02904-2294 USA. Requests can also be made by e-mail to [email protected].

© 2010 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America.

The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability. Visit the AMS home page at http://www.ams.org/

To Carla and Marc

Contents

Preface

Chapter 1. Introduction
§1.1. Topological spaces
§1.2. Measure and integration
§1.3. Exercises

Chapter 2. Normed spaces and operators
§2.1. Banach spaces
§2.2. Linear operators
§2.3. Hilbert spaces
§2.4. Convolutions and summability kernels
§2.5. The Riesz-Thorin interpolation theorem
§2.6. Applications to linear differential equations
§2.7. Exercises

Chapter 3. Fréchet spaces and Banach theorems
§3.1. Fréchet spaces
§3.2. Banach theorems
§3.3. Exercises

Chapter 4. Duality
§4.1. The dual of a Hilbert space
§4.2. Applications of the Riesz representation theorem
§4.3. The Hahn-Banach theorem
§4.4. Spectral theory of compact operators
§4.5. Exercises

Chapter 5. Weak topologies
§5.1. Weak convergence
§5.2. Weak and weak* topologies
§5.3. An application to the Dirichlet problem in the disc
§5.4. Exercises

Chapter 6. Distributions
§6.1. Test functions
§6.2. The distributions
§6.3. Differentiation of distributions
§6.4. Convolution of distributions
§6.5. Distributional differential equations
§6.6. Exercises

Chapter 7. Fourier transform and Sobolev spaces
§7.1. The Fourier integral
§7.2. The Schwartz class S
§7.3. Tempered distributions
§7.4. Fourier transform and signal theory
§7.5. The Dirichlet problem in the half-space
§7.6. Sobolev spaces
§7.7. Applications
§7.8. Exercises

Chapter 8. Banach algebras
§8.1. Definition and examples
§8.2. Spectrum
§8.3. Commutative Banach algebras
§8.4. C*-algebras
§8.5. Spectral theory of bounded normal operators
§8.6. Exercises

Chapter 9. Unbounded operators in a Hilbert space
§9.1. Definitions and basic properties
§9.2. Unbounded self-adjoint operators
§9.3. Spectral representation of unbounded self-adjoint operators
§9.4. Unbounded operators in quantum mechanics
§9.5. Appendix: Proof of the spectral theorem
§9.6. Exercises

Hints to exercises
Bibliography
Index

Preface

The aim of this book is to present the basic facts of linear functional analysis related to applications to some fundamental aspects of mathematical analysis.

If mathematics is supposed to show common general facts and structures of particular results, functional analysis does this while dealing with classical problems, many of them related to ordinary and partial differential equations, integral equations, harmonic analysis, function theory, and the calculus of variations.

In functional analysis, individual functions satisfying specific equations are replaced by classes of functions and transforms which are determined by each particular problem. The objects of functional analysis are spaces and operators acting between them which, after systematic studies intertwining linear and topological or metric structures, appear to be behind classical problems in a kind of cleaning process.

In order to make the scope of functional analysis clearer, I have chosen to sacrifice generality for the sake of an easier understanding of its methods, and to show how they clarify what is essential in analytical problems. I have tried to avoid the introduction of cold abstractions and unnecessary terminology in further developments and, when choosing the different topics, I have included some applications that connect functional analysis with other areas.

The text is based on a graduate course taught at the Universitat de Barcelona, with some additions, mainly to make it more self-contained. The material in the first chapters could be adapted as an introductory course on functional analysis, aiming to present the role of duality in analysis, and also the spectral theory of compact linear operators in the context of Hilbert and Banach spaces.

In this first part of the book, the mutual influence between functional analysis and other areas of analysis is shown when studying duality, with von Neumann’s proof of the Radon-Nikodym theorem based on the Riesz representation theorem for the dual of a Hilbert space, followed by the representations of the duals of the Lp spaces and of C(K), in this case by means of complex Borel measures.

The reader will also see how to deal with initial and boundary value problems in ordinary linear differential equations via the use of integral operators. Moreover, examples are included that illustrate how functional analytic methods are useful in the study of Fourier series.

In the second part, distributions provide a natural framework extending some fundamental operations in analysis. Convolution and the Fourier transform are included as useful tools for dealing with partial differential operators, with basic notions such as fundamental solutions and Green’s functions.

Distributions are also appropriate for the introduction of Sobolev spaces, which are very useful for the study of the solutions of partial differential equations. A clear example is provided by the resolution of the Dirichlet problem and the description of the eigenvalues of the Laplacian, in combination with Hilbert space techniques.

The last two chapters are essentially devoted to the spectral theory of bounded and unbounded self-adjoint operators, which is presented by using the Gelfand transform for Banach algebras. This spectral theory is illustrated with an introduction to the basic axioms of quantum mechanics, which motivated many studies in the Hilbert space theory.

Some very short historical comments have been included, mainly by means of footnotes. For a good overview of the evolution of functional analysis, J. Dieudonné’s and A. F. Monna’s books, [10] and [31], are two good references.

The limitation of space has forced us to leave out many other important topics that could, and probably should, have been included. Among them are the geometry of Banach spaces, a general theory of locally convex spaces and structure theory of Fréchet spaces, functional calculus of nonnormal operators, groups and semigroups of operators, invariant subspaces, index theory, von Neumann algebras, and scattering theory. Fortunately, many excellent texts dealing with these subjects are available and a few references have been selected for further study.

A small number of references have been gathered at the end of each chapter to focus the reader’s attention on some appropriate items from a general bibliographical list of 44 items.

Almost 240 exercises are gathered at the end of the chapters and form an important part of the book. They are intended to help the reader to develop techniques and working knowledge of functional analysis. These exercises are highly nonuniform in difficulty. Some are very simple, to aid in better understanding of the concepts employed, whereas others are fairly challenging for beginners. Hints and solutions are provided at the end of the book.

The prerequisites are very standard. Although it is assumed that the reader has some a priori knowledge of general topology, integral calculus with Lebesgue measure, and elementary aspects of normed or Hilbert spaces, a review of the basic aspects of these topics has been included in the first chapters.

I turn finally to the pleasant task of thanking those who helped me during the writing. Particular thanks are due to Javier Soria, who revised most of the manuscript and proposed important corrections and suggestions. I have also received valuable advice and criticism from María J. Carro and Joaquim Ortega-Cerdà. I have been very fortunate to have received their assistance.

Joan Cerdà
Universitat de Barcelona

Chapter 1

Introduction

The purpose of this introductory chapter is to fix some terminology that will be used throughout the book and to review the results from general topology and measure theory that will be needed later. It is intended as a reference chapter that initially may be skipped.

1.1. Topological spaces

Recall that a metric or distance on a nonempty set X is a function

$d : X \times X \to [0, \infty)$

with the following properties:

1. $d(x, y) = 0$ if and only if $x = y$,
2. $d(x, y) = d(y, x)$ for all $x, y \in X$, and
3. $d(x, y) \le d(x, z) + d(z, y)$ for all $x, y, z \in X$ (triangle inequality).

The set $X$ equipped with the distance $d$ is called a metric space. If $x \in X$ and $r > 0$, the open ball of $X$ with center $x$ and radius $r$ is the set $B_X(x, r) = \{y \in X;\ d(y, x) < r\}$ [...] where, for $x = (x_1, \dots, x_n)$,

$$|x| := \Big(\sum_{j=1}^{n} x_j^2\Big)^{1/2},$$

the Euclidean norm of $x$.

1.1.1. Topologies. Most continuity properties will be considered in the context of a metric, but we also need to consider the more general setting of topological spaces.

Recall that a nonempty set $X$ is called a topological space if it is endowed with a collection $\mathcal{T}$ of sets having the following properties:

1. $\emptyset, X \in \mathcal{T}$,
2. $A, B \in \mathcal{T} \Rightarrow A \cap B \in \mathcal{T}$, and
3. $\bigcup_{A \in \mathcal{A}} A \in \mathcal{T}$ if $\mathcal{A} \subset \mathcal{T}$.

The elements of $\mathcal{T}$ are called the open sets of the space, and $\mathcal{T}$ is called the topology. The closed sets are the complements $G^c = X \setminus G$ of the open sets $G \in \mathcal{T}$.

A subset of $X$ which contains an open set containing a point $x \in X$ is called a neighborhood of $x$ in $X$. A collection $\mathcal{U}(x)$ of neighborhoods of a point $x$ is a neighborhood basis of $x$ if every neighborhood $V$ of this point contains some $U(x) \in \mathcal{U}(x)$. Of course, the collection of all open sets that contain $x$ is a neighborhood basis of this point.

The interior of $A \subset X$ is defined as the set $\operatorname{Int} A$ of all points $x$ such that $A$ contains some neighborhood $U(x)$ of $x$. It is the union of all open sets contained in $A$, and $A$ is an open set if and only if $\operatorname{Int} A = A$.

The closure $\bar{A}$ of $A$ is the set of all points $x \in X$ such that $U(x) \cap A \neq \emptyset$ for every neighborhood $U(x)$ of $x$. Obviously $\bar{A}$ is closed since, if $x \notin \bar{A}$, we can find an open neighborhood $U(x)$ contained in $\bar{A}^c$, so that this set is open. Moreover, $\bar{A}$ is contained in every closed set $F$ that contains $A$, since $F^c$ is an open set whose points do not belong to the closure of $A$.

A sequence $\{x_n\} \subset X$ converges to a point $x \in X$ if every neighborhood of $x$ contains all but finitely many of the terms $x_n$. Then we write $\lim_{n\to\infty} x_n = x$ or $x_n \to x$ as $n \to \infty$.

All the topological spaces we are interested in will be Hausdorff,¹ which means that for two arbitrary distinct points $x, y \in X$ one can find disjoint neighborhoods of $x$ and $y$. This implies that every point $\{x\}$ is a closed set, since any other point has a neighborhood that does not meet $\{x\}$, and $\{x\}^c$ is open.

¹ Named after the German mathematician Felix Hausdorff, one of the creators, in 1914, of a modern point set topology, and of measure theory. He worked at the Universities of Leipzig, Greifswald, and Bonn.

Moreover, in a Hausdorff space, limits are unique. Indeed, if $x_n \to x$, then for any neighborhood $U(x)$ there exists $N_x$ so that $x_n \in U(x)$ if $n \ge N_x$; for any other point $y$ we can find disjoint neighborhoods $U(x)$ and $U(y)$ of $x$ and $y$, and it is impossible that also $x_n \in U(y)$ if $n \ge N_y$.

Let us gather together some elementary facts concerning topological spaces:

(a) Suppose $(X_1, \mathcal{T}_1)$ and $(X_2, \mathcal{T}_2)$ are two topological spaces. A function $f : X_1 \to X_2$ is said to be continuous at $x \in X_1$ if, for every neighborhood $V(y)$ of $y = f(x) \in X_2$, there exists a neighborhood $U(x)$ of $x$ such that $f(U(x)) \subset V(y)$. Obviously, we may always assume that $V(y) \in \mathcal{V}(y)$ and $U(x) \in \mathcal{U}(x)$ if $\mathcal{U}(x)$ and $\mathcal{V}(y)$ are neighborhood bases of $x$ and $y$.

(b) If $f : X_1 \to X_2$ is continuous at every point $x$ of $X_1$, $f$ is said to be continuous on $X_1$. This happens if and only if, for every open set $G \subset X_2$, the inverse image $f^{-1}(G)$ is an open set of $X_1$, since $f(U(x)) \subset V(y) \subset G$ means that $U(x) \subset f^{-1}(G)$. By taking complements, $f$ is continuous if and only if the inverse images of closed sets are also closed. Moreover, in this case, $f(\bar{A}) \subset \overline{f(A)}$ for any subset $A$ of $X_1$ since, if $U(x) \cap A \neq \emptyset$ when $U(x) \in \mathcal{U}(x)$, for every $U(f(x)) \in \mathcal{U}(f(x))$ we may choose $U(x)$ so that $f(U(x)) \subset U(f(x))$ and then

$$\emptyset \neq f(U(x) \cap A) \subset f(U(x)) \cap f(A) \subset U(f(x)) \cap f(A).$$

(c) Suppose two topologies $\mathcal{T}_1$ and $\mathcal{T}_2$ are defined on $X$. Then $\mathcal{T}_1$ is said to be finer than $\mathcal{T}_2$, or $\mathcal{T}_2$ is coarser than $\mathcal{T}_1$, if $\mathcal{T}_2 \subset \mathcal{T}_1$, which means that every $\mathcal{T}_2$-neighborhood is also a $\mathcal{T}_1$-neighborhood, or that the identity map $I : (X, \mathcal{T}_1) \to (X, \mathcal{T}_2)$ is continuous.

(d) If $Y$ is a nonempty subset of the topological space $X$, then the topology $\mathcal{T}$ of $X$ induces a topology on $Y$ by taking the sets $G \cap Y$ ($G \in \mathcal{T}$) as the open sets in $Y$. With this new topology, we say that $Y$ is a topological subspace of $X$. The closed sets of $Y$ are the sets $F \cap Y$, with $F$ closed in $X$.

(e) Many topologies encountered in this book can be defined by means of a distance. The topology of a metric space² $X$ is the family of all subsets $G$ with the property that every point $x \in G$ is the center of some open (or closed) ball contained in $G$. It is not hard to verify that the collection of these sets satisfies all the properties of a topology on $X$.

² The name is due to F. Hausdorff (see footnote 1 in this chapter), but the concept of metric spaces was introduced in his dissertation by the French mathematician Maurice Fréchet (1906). See also footnote 2 in Chapter 3.

It is an easy exercise to check that open balls are open sets, closed balls are closed, $x_n \to x$ if and only if $d(x_n, x) \to 0$, and that, for a given point $x \in X$, the balls $B_X(x, 1/n)$ ($n \in \mathbf{N}$) form a countable neighborhood basis of $x$.

It follows from the triangle inequality that $B_X(x, r) \cap B_X(y, r) = \emptyset$ if $d(x, y) = 2r > 0$, so the topology of the metric space is Hausdorff.

Suppose $A$ is a subset of the metric space $X$. Since $x \in \bar{A}$ if and only if $\bar{B}_X(x, 1/n) \cap A \neq \emptyset$ for every $n \in \mathbf{N}$, by taking $a_n \in \bar{B}_X(x, 1/n) \cap A$, we obtain that $x \in \bar{A}$ if and only if $x = \lim_n a_n$, i.e., $d(x, a_n) \to 0$, for some sequence $\{a_n\} \subset A$. That is, the closure is the “sequential closure”.

Similarly, a function $f : X_1 \to X_2$ between two metric spaces is continuous at $x \in X_1$ if and only if $f(x) = \lim_n f(x_n)$ whenever $x = \lim_n x_n$ in $X_1$. Thus, $f$ is continuous if it is “sequentially continuous” (see Exercise 1.9(a)).

But one should remember that knowledge of the converging sequences does not characterize what a topology is or when a function is continuous (cf. Exercise 1.9).

A topological space is said to be metrizable when its topology can be defined by means of a distance.

1.1.2. Compact spaces. A Hausdorff topological space $(K, \mathcal{T})$ is said to be compact if, for every family $\{G_j\}_{j \in J}$ of open sets such that

$$K = \bigcup_{j \in J} G_j,$$

a finite subfamily $\{G_{j_1}, \dots, G_{j_n}\}$ can be chosen so that

$$K = G_{j_1} \cup \dots \cup G_{j_n}.$$

By considering complements, compactness is equivalent to the property that for a family of closed sets $F_j = G_j^c$ ($j \in J$) such that every finite subfamily has a nonempty intersection (it is said that the family has the finite intersection property), $\bigcap_{j \in J} F_j$ is also nonempty.

A subset $K$ of a Hausdorff topological space $X$ is said to be compact if it is a compact subspace of $X$ or, equivalently, if every cover of $K$ by open subsets of $X$ contains a finite subcover.

It is also a well-known fact that a metric space $K$ is compact if and only if it is sequentially compact, meaning that every sequence $\{x_n\} \subset K$ has a convergent subsequence.

In a metric space $X$ it makes sense to consider a Cauchy sequence $\{x_k\}$, defined by the condition $d(x_p, x_q) \to 0$ as $p, q \to \infty$; that is, to every $\varepsilon > 0$ there corresponds an integer $N_\varepsilon$ such that $d(x_p, x_q) \le \varepsilon$ as soon as $p \ge N_\varepsilon$ and $q \ge N_\varepsilon$.

If $\{x_k\}$ is convergent in the metric space, so that $d(x_k, x) \to 0$ as $k \to \infty$, then $\{x_k\}$ is a Cauchy sequence, for $d(x_p, x_q) \le d(x_p, x) + d(x_q, x) \to 0$ as $p, q \to \infty$.

The metric space is called complete if every Cauchy sequence in $X$ converges to an element of $X$.

Every compact metric space $K$ is complete, since the conditions $x_{n_k} \to x$ and $d(x_p, x_q) \to 0$, combined with the triangle inequality, imply that $x_n \to x$.

In a metric space, a set which is covered by a finite number of balls with an arbitrarily small radius is compact when the space is complete:

Theorem 1.1. Suppose $A$ is a subset of a complete metric space $M$. If for every $\varepsilon > 0$ a finite number of balls with radius $\varepsilon$ cover $A$, then $\bar{A}$ is compact.

Proof. Let us show that every sequence $\{a_n\} \subset A$ has a Cauchy subsequence.

Denote $\{a_{n,0}\} = \{a_n\}$. Since a finite number of balls with radius $1/2$ covers $A$, there is a ball $B(c_1, 1/2)$ which contains a subsequence $\{a_{n,1}\}$ of $\{a_n\}$. By induction, for every positive integer $m$, we obtain $\{a_{n,m+1}\} \subset B(c_{m+1}, 1/2^{m+1})$ which is a subsequence of $\{a_{n,m}\}$, since a finite number of balls with radius $1/2^{m+1}$ cover $A$.

The “diagonal subsequence” $\{a_{m,m}\}$ is then a Cauchy subsequence of $\{a_n\}$, since $a_{p,p} \in B(c_m, 1/2^m)$ if $p \ge m$, so that $d(a_{p,p}, a_{q,q}) \le 2/2^m$ if $p, q \ge m$.

Finally, if $\{x_n\}$ is any sequence in $\bar{A}$, by choosing $a_n \in A$ so that $d(a_n, x_n) < 1/n$ and a Cauchy subsequence $\{a_{m_n}\}$ of $\{a_n\}$, which converges to a point $x \in \bar{A}$ because $M$ is complete, it is clear that also $x_{m_n} \to x$, since

$$d(x_{m_n}, x) \le d(x_{m_n}, a_{m_n}) + d(a_{m_n}, x) \to 0.$$

This shows that $\bar{A}$ is sequentially compact. □
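The covering hypothesis of Theorem 1.1 can be explored numerically on a finite sample of points. The following Python sketch is only an illustration under stated assumptions: the function name `greedy_eps_net` and the sample set (a grid in the unit square of the Euclidean plane) are hypothetical choices, not taken from the text. It greedily selects centers so that every sample point lies in an open ball of radius ε around some center.

```python
import math

def greedy_eps_net(points, eps):
    """Greedily pick centers so that every point is at distance < eps from a center."""
    centers = []
    for p in points:
        # p becomes a new center only if no existing center is within eps of it
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers

# Hypothetical sample: an 11 x 11 grid in the unit square
sample = [(i / 10, j / 10) for i in range(11) for j in range(11)]
for eps in (0.5, 0.25, 0.125):
    centers = greedy_eps_net(sample, eps)
    covered = all(any(math.dist(p, c) < eps for c in centers) for p in sample)
    print(eps, len(centers), covered)
```

The number of centers is finite for each ε, which is the kind of finite cover by small balls that the theorem assumes; the sketch says nothing about completeness, which is the other hypothesis.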

In Rn, a set is said to be bounded if it is contained in a ball and, by the Heine-Borel theorem, every closed and bounded set is compact. This is a typical fact of Euclidean spaces which is far from being true for a general metric space, even if it is complete.

The following properties are easily proved:

(a) In a Hausdorff topological space $X$, if a subset $K$ is compact, then it is a closed subset of $X$.
(b) In a compact space, all closed subsets are compact.
(c) If $f : X_1 \to X_2$ is a continuous function between two Hausdorff topological spaces and $K$ a compact subset of $X_1$, then $f(K)$ is a compact subset of $X_2$.

Property (a) is proved by assuming that there is some point $x \in \bar{K} \setminus K$. Then disjoint couples of open neighborhoods $U_y(x)$ of $x$ and $V(y)$ of $y$ for every $y \in K$ may be taken; by compactness, $K \subset \bigcup_{k=1}^{N} V(y_k)$ and $U := \bigcap_{k=1}^{N} U_{y_k}(x)$ would be a neighborhood of $x$ disjoint with $K$, contrary to $x \in \bar{K}$.

To prove (b), complete any open covering of the closed subset with the complement of the subset, yielding an open covering of the whole space; then use the compactness of the space to select a finite covering.

In (c), the preimage by $f$ of an open covering of $f(K)$ is an open covering of $K$, and a finite subcovering of this covering of $K$ yields a corresponding finite subcovering for $f(K)$.

Suppose $(X_j, \mathcal{T}_j)$ ($j \in J$) is a family of topological spaces. The product topology $\mathcal{T}$ on the product set $X = \prod_{j \in J} X_j$ is defined as follows: Let $\pi_j : X \to X_j$ be the projection on the $j$th component and, for every $x = \{x_j\}_{j \in J}$ in $X$, let $\mathcal{U}(x)$ denote the collection of the sets of the type

$$U(x) = \prod_{j \in J} U_j(x_j) = \bigcap_{j \in J} \pi_j^{-1}(U_j(x_j))$$

where $U_j(x_j)$ is an open neighborhood of $x_j$ and $U_j(x_j) = X_j$ except for a finite number of indices $j \in J$. Then, by definition, $G \in \mathcal{T}$ if, for every $x \in G$, $x \in U(x) \subset G$ for some $U(x) \in \mathcal{U}(x)$. It is readily checked that $\mathcal{T}$ is a topology, which is Hausdorff if every $(X_j, \mathcal{T}_j)$ is a Hausdorff topological space.

Obviously the projections $\pi_j$ are all continuous, and $\mathcal{T}$ is the “smallest” topology on $X$ with this property, since for such a topology every set $\pi_j^{-1}(U_j(x_j))$ has to be a neighborhood of $x$.

Theorem 1.2 (Tychonoff³). If $K_j$ ($j \in J$) is a family of compact spaces, then the product space $K = \prod_{j \in J} K_j$ is also compact.

Proof. Let $\mathcal{F}$ be a family of closed sets of $K$ with the finite intersection property and consider the collection $\Phi$ of all families of this type that contain $\mathcal{F}$, ordered by inclusion. By Zorn’s lemma,⁴ at least one of these families, $\mathcal{F}'$, is maximal, since, if $\{\mathcal{F}_\alpha\} \subset \Phi$ is totally ordered, then $\bigcup_\alpha \mathcal{F}_\alpha$ also has the finite intersection property and is an upper bound for $\{\mathcal{F}_\alpha\}$.

The closed sets $\overline{\pi_j(F)} \subset K_j$, where $F \in \mathcal{F}'$, also have the finite intersection property and every space $K_j$ is compact. Hence we can find $x_j \in \bigcap_{F \in \mathcal{F}'} \overline{\pi_j(F)}$ and, if $U_j(x_j)$ is a closed neighborhood of $x_j \in K_j$, it follows that $\pi_j^{-1}(U_j(x_j)) \in \mathcal{F}'$ since $\mathcal{F}'$ is maximal. It follows that $\bigcap_{F \in \mathcal{F}'} \overline{\pi_j(F)} = \{x_j\}$.

Choosing $U_j(x_j) = K_j$ except for finitely many of the indices $j \in J$, we obtain

$$\prod_{j \in J} U_j(x_j) = \bigcap_{j \in J} \pi_j^{-1}(U_j(x_j)) \in \mathcal{F}'$$

and $\prod_{j \in J} U_j(x_j) \cap F \neq \emptyset$ for every $F \in \mathcal{F}'$. Then, if $F \in \mathcal{F}'$, every neighborhood of $x = \{x_j\}_{j \in J}$ intersects $F$, and $x \in F$ since $F$ is closed. This proves that $\bigcap_{F \in \mathcal{F}'} F \neq \emptyset$. □

³ Named after the Russian mathematician Andrey N. Tychonoff, who proved this theorem first in 1930 for powers of [0, 1]. He originally published in German, but the English transliteration Tichonov for his name is also commonly used. E. Čech proved the general case of the theorem in 1937.

⁴ Zorn’s lemma on partially ordered sets will be invoked several times in this book. It is equivalent to Zermelo’s axiom of choice in set theory. A binary relation $\le$ on a set $X$ is said to be a partial order if the following properties are satisfied: (a) $x \le x$, (b) if $x \le y$ and $y \le z$, then $x \le z$, and (c) if $x \le y$ and $y \le x$, then $x = y$. A subset $Y$ of the partially ordered set $X$ is said to be totally ordered if every pair $x, y \in Y$ satisfies either $x \le y$ or $y \le x$. According to Zorn’s lemma, if every totally ordered subset $Y$ of a nonempty partially ordered set $X$ has an upper bound $z_Y \in X$ (this meaning that $y \le z_Y$ for every $y \in Y$), then $X$ contains at least one maximal element (an element $z$ such that $z \le x$ implies $z = x$).

1.1.3. Partitions of unity. A locally compact space is a Hausdorff topological space with the property that every point has a neighborhood basis of compact sets.

We suppose that $X$ is a metric locally compact space, for instance any nonempty closed or open subset of $\mathbf{R}^n$, with the induced topology (see also Exercise 1.5). In this section, we use the letter $G$ to denote an open subset of $X$, and we use $K$ for a compact subset of $X$.

We will represent by $\mathcal{C}_c(X)$ the set of all real or complex continuous functions $g$ on $X$ whose support, $\operatorname{supp} g = \overline{\{g \neq 0\}}$, is a compact subset of $X$. We consider $\mathcal{C}_c(G) \subset \mathcal{C}_c(X)$ by defining $g(x) = 0$ when $x \notin G$, for every $g \in \mathcal{C}_c(G)$.

Note that if $K$ is a compact subset of $G$, we can consider $K \subset \operatorname{Int} L \subset L$, where $L$ is a second compact subset of $G$, since, if for every $x \in K$ we select a neighborhood $V(x)$ with compact closure $\overline{V(x)} \subset G$, we only need to take $L = \overline{V(x_1)} \cup \dots \cup \overline{V(x_n)}$ if $K \subset V(x_1) \cup \dots \cup V(x_n)$. Then

$$\varrho(x) := \frac{d(x, L^c)}{d(x, K) + d(x, L^c)}$$

defines a function $\varrho \in \mathcal{C}_c(G)$ such that $0 \le \varrho \le 1$ and $\varrho(x) = 1$ for all $x \in K$. We say that $\varrho$ is a continuous Urysohn function for the couple $K \subset G$.

With the notation $g \prec G$ we will mean that $g \in \mathcal{C}_c(G)$ and $0 \le g \le 1$, and $K \prec g$ will mean that $g \in \mathcal{C}_c(G)$ and $g = 1$ on a neighborhood of $K$ and $0 \le g \le 1$. Thus, there is a Urysohn function $g$ such that $K \prec g \prec G$.

Theorem 1.3. Let $\{\Omega_1, \dots, \Omega_m\}$ be a finite family of open subsets of $X$ that cover the compact set $K \subset X$. Then there exists a system of functions $\varphi_j \in \mathcal{C}_c(\Omega_j)$ which satisfies

1. $0 \le \varphi_j \le 1$ for every $j = 1, \dots, m$ and
2. $\sum_{j=1}^{m} \varphi_j(x) = 1$ for every $x \in K$.

This system $\{\varphi_j\}_{j=1}^{m}$ is called a partition of unity subordinate to the covering $\{\Omega_1, \dots, \Omega_m\}$ of $K$.

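Proof. Let us choose a system $K_j \subset \Omega_j$ ($1 \le j \le m$) of compact sets that covers $K$, which can be constructed as follows: if

$$K \subset B_X(x_1, r_1) \cup \dots \cup B_X(x_N, r_N)$$

with every closed ball $\bar{B}_X(x_i, r_i)$ contained in some $\Omega_j$, just define

$$K_j := \bigcup \{\bar{B}_X(x_i, r_i) \cap K;\ \bar{B}_X(x_i, r_i) \subset \Omega_j\}.$$

For every $j$ let $\varrho_j$ be a Urysohn function for the couple $K_j \subset \Omega_j$. If we define

$$\varphi_1 := \varrho_1, \qquad \varphi_k := (1 - \varrho_1) \cdots (1 - \varrho_{k-1}) \varrho_k \quad (1 < k \le m),$$

then $\varphi_k \in \mathcal{C}_c(\Omega_k)$, $0 \le \varphi_k \le 1$, and

$$\varphi_1 + \dots + \varphi_k = 1 - (1 - \varrho_1)(1 - \varrho_2) \cdots (1 - \varrho_k)$$

if $1 \le k \le m$. If $x \in K$, $\varrho_j(x) = 1$ for some $j$, so the product on the right vanishes for $k = m$ and $\sum_{j=1}^{m} \varphi_j(x) = 1$. □

Remark 1.4. If $G$ is open in $\mathbf{R}^n$, it is shown in Chapter 6 that the Urysohn functions $\varrho_j$ can be chosen to be $C^\infty$ (cf. Section 6.1). In this case, the functions $\varphi_j$ are also smooth.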
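The construction in the proof of Theorem 1.3 can be tried out on the real line. The Python sketch below is a minimal illustration under assumptions of our own: the helper names `urysohn` and `partition_of_unity`, the compact set K = [0, 2], and the intervals chosen for the covering are all hypothetical. Each Urysohn function is computed from the quotient of distances used for the couple K ⊂ G, with closed intervals playing the roles of K and L.

```python
def urysohn(x, K, L):
    """Urysohn function d(x, L^c) / (d(x, K) + d(x, L^c)) on the real line.

    K = (a, b) and L = (c, d) stand for closed intervals with c < a <= b < d.
    """
    (a, b), (c, d) = K, L
    dist_K = max(a - x, 0.0, x - b)           # distance from x to [a, b]
    dist_Lc = max(min(x - c, d - x), 0.0)     # distance from x to the complement of [c, d]
    return dist_Lc / (dist_K + dist_Lc)

def partition_of_unity(x, pairs):
    """Values phi_1(x), ..., phi_m(x) built from the pairs (K_j, L_j)."""
    phis, remaining = [], 1.0                 # remaining = (1 - rho_1) ... (1 - rho_{k-1})
    for K, L in pairs:
        rho = urysohn(x, K, L)
        phis.append(remaining * rho)
        remaining *= 1.0 - rho
    return phis

# Hypothetical data: K = [0, 2] covered by Omega_1 = (-1, 1.5) and Omega_2 = (0.5, 3),
# with K_1 = [0, 1] inside L_1 = [-0.5, 1.25] and K_2 = [1, 2] inside L_2 = [0.75, 2.5]
pairs = [((0.0, 1.0), (-0.5, 1.25)), ((1.0, 2.0), (0.75, 2.5))]
for x in (0.0, 0.9, 1.1, 2.0, 2.25, 3.0):
    phis = partition_of_unity(x, pairs)
    print(x, phis, sum(phis))
```

The running product `remaining` mirrors the factor (1 − ϱ_1)···(1 − ϱ_{k−1}) in the definition of ϕ_k, so the sum of the returned values equals 1 minus the full product, up to rounding.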

1.2. Measure and integration

The Riemann integral on $\mathbf{R}^n$, which may be historically grounded and useful for numerical computation and sufficient in many areas of mathematics, is far from being adequate for the requirements of functional analysis. Much more appropriate is the Lebesgue integral, based on computing the measure of level sets of functions. We will summarize the Lebesgue construction for a general measure.

A measurable space is a nonempty set $\Omega$ where a distinguished collection $\Sigma$ of subsets has been selected having the properties of a σ-algebra, meaning that the following axioms are satisfied:

1. $\Omega \in \Sigma$.
2. If $\{A_k\}_{k=1}^{\infty}$ is a sequence of sets in $\Sigma$, then $\bigcup_{k=1}^{\infty} A_k \in \Sigma$.
3. If $A \in \Sigma$, then its complement $A^c = \Omega \setminus A$ is also in $\Sigma$.

These properties also imply that $\emptyset = \Omega^c \in \Sigma$, that $\bigcap_{k=1}^{\infty} A_k \in \Sigma$ if $\{A_k\}_{k=1}^{\infty} \subset \Sigma$, and $A \setminus B = A \cap B^c \in \Sigma$ if $A, B \in \Sigma$.

The sets in $\Sigma$ are the measurable sets of the measurable space. If $\Omega$ is any nonempty set, a trivial example of σ-algebra is the collection $\mathcal{P}(\Omega)$ of all subsets of $\Omega$.

In our applications $\Omega$ can be assumed to be a locally compact metric space, or an open or closed subset of $\mathbf{R}^n$, and it can be assumed that the measurable sets are the Borel sets of $\Omega$, the elements of the Borel σ-algebra $\mathcal{B}_\Omega$, which is the intersection of all the σ-algebras that contain all the open sets of $\Omega$; that is, $\mathcal{B}_\Omega$ is the σ-algebra generated by the open sets of $\Omega$.

A measure (sometimes also called a positive measure) $\mu$ on the measurable space $(\Omega, \Sigma)$ is a mapping $\mu : \Sigma \to [0, \infty]$ such that $\mu(\emptyset) = 0$ and with the following σ-additivity property: If $\{A_k\}_{k=1}^{\infty}$ is a sequence of disjoint sets in $\Sigma$, then

$$\mu\Big(\biguplus_{k=1}^{\infty} A_k\Big) = \sum_{k=1}^{\infty} \mu(A_k).$$

We use the symbol $\uplus$ to indicate a union of mutually disjoint sets.

A measure space is a measurable space (Ω, Σ) with a distinguished measure, μ :Σ→ [0, ∞]. The following properties are easy consequences of the definitions:

• Finite additivity, $\mu(A_1 \uplus \dots \uplus A_m) = \mu(A_1) + \dots + \mu(A_m)$, since the finite family can be completed with the addition of a sequence of empty sets to obtain a countable disjoint family.

• $\mu(A) \le \mu(B)$ if $A \subset B$, since $B = A \uplus (B \setminus A)$ and $\mu(B) = \mu(A) + \mu(B \setminus A)$.

• If $A_m \uparrow A$ (i.e., $A$ is the union of an increasing sequence of measurable sets, $A_m$), then $\mu(A) = \lim_m \mu(A_m)$, since $A = A_1 \uplus (A_2 \setminus A_1) \uplus (A_3 \setminus (A_2 \cup A_1)) \uplus \cdots$.

• If $A_m \downarrow A$ (i.e., $A$ is the intersection of a decreasing sequence of measurable sets) and $\mu(A_1) < \infty$, then $\mu(A) = \lim_m \mu(A_m)$, since $A_1 \setminus A_m \uparrow A_1 \setminus A$ and hence $\mu(A_1) - \mu(A) = \lim_m (\mu(A_1) - \mu(A_m))$.
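On a finite set, with all subsets taken as measurable, a measure is determined by nonnegative weights on the points, and the first two properties above can be checked directly. The Python sketch below is a small illustration only; the function name `make_measure`, the set Ω = {a, b, c, d}, and the weights are hypothetical choices, and exact rational arithmetic is used so that the equalities are tested without rounding.

```python
from fractions import Fraction

def make_measure(weights):
    """Measure on all subsets of a finite set: mu(A) is the sum of the weights of the points of A."""
    def mu(A):
        return sum((weights[w] for w in A), Fraction(0))
    return mu

# Hypothetical weights on Omega = {a, b, c, d}
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6), "d": Fraction(2)}
mu = make_measure(weights)

A, B = {"a", "b"}, {"a", "b", "c"}
print(mu(A | {"d"}) == mu(A) + mu({"d"}))   # finite additivity on disjoint sets
print(mu(A) <= mu(B))                        # monotonicity, since A is a subset of B
print(mu(B) == mu(A) + mu(B - A))            # B is the disjoint union of A and B \ A
```

Taking a single weight equal to 1 and all the others equal to 0 gives a point mass, the simplest instance of this kind of measure.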

All our measures will be σ-finite: $\Omega = \bigcup_{k=1}^{\infty} \Omega_k$ with $\mu(\Omega_k) < \infty$.

A simple example in any set $\Omega$ is the Dirac delta-measure, $\delta_p$, located at an arbitrary point $p \in \Omega$, defined on any set $A \subset \Omega$ by

$$\delta_p(A) = \chi_A(p)$$

where $\chi_A$ stands for the characteristic function of the set $A$, such that $\chi_A(x) = 1$ if $x \in A$ and $\chi_A(x) = 0$ when $x \notin A$. Here the σ-algebra $\Sigma$ is taken to be all subsets of $\Omega$.

But the main example for us is the Lebesgue measure⁵ $|A|$ of Borel sets of $\mathbf{R}^n$. The change of variables formula is supposed to be known and it shows that this measure is invariant by rigid displacements, that is, by translation, by rotation, and by symmetry. If $A$ is an interval, $|A|$ is the volume of $A$ (the length if $n = 1$, and the area if $n = 2$).

The Lebesgue measure is an example of Borel measure⁶ on $\mathbf{R}^n$, that is, a measure $\mu$ on the Borel σ-algebra which is finite on compact sets. It is the only Borel measure on $\mathbf{R}^n$ which is invariant by translation, and such that the measure of the unit cube $[0, 1]^n$ is 1.

We will see in Theorem 1.6 that every Borel measure on $\mathbf{R}^n$ is a regular measure, meaning that the following two properties hold:

(a) Outer regularity: For every Borel set $B$, $\mu(B) = \inf\{\mu(G);\ G \supset B,\ G \text{ an open subset of } \mathbf{R}^n\}$.
(b) Inner regularity: For every Borel set $B$, $\mu(B) = \sup\{\mu(K);\ K \subset B,\ K \text{ a compact subset of } \mathbf{R}^n\}$.

Given two σ-finite measure spaces, $(\Omega_1, \Sigma_1, \mu_1)$ and $(\Omega_2, \Sigma_2, \mu_2)$, the product σ-algebra $\Sigma = \Sigma_1 \otimes \Sigma_2$ is defined to be the smallest σ-algebra containing all the sets $A_1 \times A_2$ ($A_1 \in \Sigma_1$, $A_2 \in \Sigma_2$). It can be shown that there is a unique measure $\mu$ on $\Sigma$, the product measure, such that $\mu(A_1 \times A_2) = \mu_1(A_1)\mu_2(A_2)$.

In the special case of the Borel measures on $\mathbf{R}^n$, it is easy to show that $\mathcal{B}_{\mathbf{R}^n} \otimes \mathcal{B}_{\mathbf{R}^m} = \mathcal{B}_{\mathbf{R}^{n+m}}$, and the product of the corresponding Lebesgue measures is the Lebesgue measure on the Borel sets in $\mathbf{R}^{n+m}$.

In measure theory only functions $f : \Omega \to \mathbf{R}$ such that every level set $\{f > r\} := \{\omega \in \Omega;\ f(\omega) > r\}$ ($r \in \mathbf{R}$) is measurable are admissible. They are called measurable functions and, with the usual operations, they form a real vector space which is closed under pointwise limits, supremums and infimums for sequences of measurable functions.

⁵ Described by the French mathematician Henri Lebesgue in his dissertation “Intégrale, longueur, aire”, in 1902 at the University of Nancy. He then applied his integral to real analysis, with the study of Fourier series. In fact, this integral had been previously obtained by W. H. Young.

⁶ Named after the French mathematician Émile Borel, one of the pioneers of measure theory and of modern probability theory.

Simple functions,

$$s = \sum_{m=1}^{N} \lambda_m \chi_{A_m} \quad (A_m \in \Sigma,\ \lambda_m \in \mathbf{R}),$$

where we may suppose that the measurable sets $A_m$ are disjoint, are examples of measurable functions.

In fact, every measurable function, $f$, is a pointwise limit of a sequence of simple functions $s_n$ such that

(1.1) $|s_n(x)| \uparrow |f(x)|$ for every $x \in \Omega$.

To obtain this sequence, consider $f = f^+ - f^-$ ($f^+(x) = \max(f(x), 0)$) and, if $f \ge 0$, define

(1.2) $s_n = n\,\chi_{\{f \ge n\}} + \sum_{k=1}^{n 2^n} \frac{k-1}{2^n}\, \chi_{\{\frac{k-1}{2^n} \le f < \frac{k}{2^n}\}}$.
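Formula (1.2) can be evaluated pointwise: if $f(x) \ge n$ the value is $n$, and otherwise $f(x)$ lies in exactly one dyadic interval $[(k-1)/2^n, k/2^n)$ and the value is its left endpoint. The following Python sketch, in which the function name `simple_approx` and the sample values are our own illustrative assumptions, computes $s_n(x)$ from the number $f(x)$.

```python
import math

def simple_approx(value, n):
    """Value s_n(x) given by (1.2), where value = f(x) >= 0."""
    if value >= n:
        return float(n)                      # the term n * chi_{f >= n}
    # value lies in [(k-1)/2^n, k/2^n) for exactly one k; then s_n(x) = (k-1)/2^n
    return math.floor(value * 2**n) / 2**n

for n in (1, 2, 4, 8):
    print(n, [simple_approx(v, n) for v in (0.3, 1.0, math.pi)])
```

For a fixed value of $f(x)$ the outputs increase with $n$ and, as soon as $n > f(x)$, they differ from $f(x)$ by less than $2^{-n}$; this is the estimate behind the uniform convergence for bounded $f$.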

If f is bounded, sn → f uniformly on Ω as n →∞. The Lebesgue integral fdμ(or Ω f(x) dx) is then defined as follows: If f ≥ 0, N (1.3) fdμ:= sup αjμ(Aj) ∈ [0, ∞], n=1 where the “sup” is extended over all simple functions N ∈ −1 ∈ s = αjχAj (N N,Aj = s (αj) Σ) j=1 such that 0 ≤ s ≤ f. This integral of nonnegative measurable functions is additive and pos- itively homogeneous. Moreover fdμ= 0 if and only if μ({f =0 })=0, that is, f = 0 almost everywhere (a.e.), and it satisfies the following funda- mental property:

Monotone convergence theorem.If0≤ fn(x) ↑ f(x) ∀x ∈ Ω (or a.e.), then fn dμ ↑ fdμ.

If the convergence is not monotone, the following inequality still holds: 7 Fatou lemma. If fn(x) ≥ 0 ∀x ∈ Ω, then lim inf fn dμ ≤ lim inf fn dμ.

7Obtained by the French mathematician Pierre Fatou (1906) in his dissertation, when working on the boundary problem of a harmonic function. Fatou also studied iterative processes, and in 1917 he presented a theory of iteration similar to the results of G. Julia which initiated the theory of complex dynamics.


If $f = f^+ - f^-$, the integral $\int f\,d\mu := \int f^+\,d\mu - \int f^-\,d\mu$ is defined if at least one of the integrals $\int f^{\pm}\,d\mu$ is finite, and we write $f \in \mathcal{L}^1(\mu)$ if both integrals are finite, that is, if $\int |f|\,d\mu = \int f^+\,d\mu + \int f^-\,d\mu < \infty$. In this case, $f$ is said to be integrable or absolutely integrable.

Then, with the usual operations, $\mathcal{L}^1(\mu)$ is a real vector space. On this linear space, the integral is a positive linear form: $\int f\,d\mu \ge 0$ if $f \ge 0$. Hence, $\int f\,d\mu \le \int g\,d\mu$ if $f \le g$, and $\left|\int f\,d\mu\right| \le \int |f|\,d\mu$.

Moreover, the following fundamental convergence result also holds:

Dominated convergence theorem. If $f_n(x) \to f(x)$ $\forall x \in \Omega$ (or a.e.) and $|f_n(x)| \le g(x)$ $\forall x \in \Omega$ (or a.e.), where $g \in \mathcal{L}^1(\mu)$, then $\int f_n\,d\mu \to \int f\,d\mu$.

These convergence theorems, as well as the change of variable formula and the Fubini-Tonelli theorem on iterated integration on $\mathbf{R}^n$ or on a product measure space, will be freely used in this text.

For any measurable set $A$, we denote $\int_A f\,d\mu := \int \chi_A f\,d\mu$.

Lebesgue differentiation theorem. For the usual Lebesgue measure on $\mathbf{R}^n$, if $f \in \mathcal{L}^1(\mathbf{R}^n)$, then
\[ \lim_{r \downarrow 0} \frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y) - f(x)|\,dy = 0 \quad \text{a.e. on } \mathbf{R}^n. \]

When $n = 1$, $\lim_{r \downarrow 0} \frac{1}{2r} \int_{x-r}^{x+r} |f(y) - f(x)|\,dy = 0$ a.e. on $\mathbf{R}$.

A function $F$ on an interval $[a, b] \subset \mathbf{R}$ such that
\[ F(x) := \int_a^x f(t)\,dt + c \tag{1.4} \]
for some $f \in \mathcal{L}^1(\mathbf{R})$ and some constant $c$ is called absolutely continuous. Obviously, it is continuous, and it follows from the Lebesgue differentiation theorem that $F'(x) = f(x)$ a.e. on $[a, b]$, since, assuming that $f = 0$ on $[a, b]^c$,
\[ \left| \frac{F(x \pm r) - F(x)}{\pm r} - f(x) \right| \le 2\,\frac{1}{2r} \int_{x-r}^{x+r} |f(t) - f(x)|\,dt \to 0 \quad \text{a.e. as } r \downarrow 0. \]

The integration by parts formula
\[ \int_a^b F(x)G'(x)\,dx = \bigl(F(b)G(b) - F(a)G(a)\bigr) - \int_a^b F'(x)G(x)\,dx \]
holds if $F$ and $G$ are absolutely continuous.

To deal with a complex-valued function, just consider the decomposition into real and imaginary parts, $f = u + iv : \Omega \to \mathbf{C}$ ($u$ and $v$ real measurable functions) and define
\[ \int f\,d\mu := \int u\,d\mu + i \int v\,d\mu \]
if $u, v \in \mathcal{L}^1(\mu)$ (that is, if $\int |f|\,d\mu < \infty$). This integral is a linear form on the class $\mathcal{L}^1(\mu)$ of all these complex integrable functions, which is a complex vector space, and also $\left|\int f\,d\mu\right| \le \int |f|\,d\mu$.

In measure theory, two functions are equivalent when they coincide a.e. If $\mathcal{N}(\mu) = \{f;\ f = 0 \text{ a.e.}\}$, we denote $L^1(\mu) := \mathcal{L}^1(\mu)/\mathcal{N}(\mu)$ and $\|f\|_1 := \int |f|\,d\mu$ does not depend on the representative of $f \in L^1(\mu)$.

If $1 \le p < \infty$, we also define
\[ \|f\|_p := \left( \int |f|^p\,d\mu \right)^{1/p} \]
and $\|f\|_\infty := \min\{C \ge 0;\ |f(x)| \le C \text{ a.e.}\}$.

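Obviously, $\|f\|_p = 0$ if and only if $f = 0$ a.e., and $\|\lambda f\|_p = |\lambda|\,\|f\|_p$. We set $0 \cdot \infty := 0$. The Minkowski inequality$^{8}$
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p \]
is clear if $p = 1$ or $p = \infty$. In the remaining cases $1 < p < \infty$ …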

Then just take $a = |f(x)|/\|f\|_p$, $b = |g(x)|/\|g\|_{p'}$ and integrate. For $1 < p < \infty$ …

$^{8}$ Named after the Lithuanian-German mathematician Hermann Minkowski, who in 1896 used geometrical methods in number theory, in his “geometry of numbers”. He used these geometrical methods to deal also with problems in mathematical physics. Minkowski taught at the Universities of Bonn, Göttingen, Königsberg, and Zürich.

$^{9}$ First found by the British mathematician Leonard James Rogers (1888) and independently rediscovered by Otto Hölder (1889).

that is,
\[ \|f + g\|_p^p \le \|f\|_p \|f + g\|_p^{p/p'} + \|g\|_p \|f + g\|_p^{p/p'} = (\|f\|_p + \|g\|_p)\, \|f + g\|_p^{p/p'}, \]
where $p - p/p' = 1$.

We will write
\[ \langle f, g \rangle := \int f g\,d\mu \]
if the integral exists, and Hölder's inequality reads
\[ |\langle f, g \rangle| \le \|f\|_p \|g\|_{p'}. \]
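Both inequalities are easy to test on a finite set with the counting measure, where integrals become finite sums. The sketch below is only an illustration under that assumption; the helper names `p_norm` and `conjugate` and the sample vectors are hypothetical and not from the text. The conjugate exponent is $p' = p/(p-1)$ for $1 < p < \infty$.

```python
def p_norm(seq, p):
    """l^p norm of a finite sequence (counting measure on a finite set), 1 <= p < infinity."""
    return sum(abs(t) ** p for t in seq) ** (1.0 / p)

def conjugate(p):
    """Conjugate exponent p' with 1/p + 1/p' = 1, for 1 < p < infinity."""
    return p / (p - 1.0)

# sample data, chosen arbitrarily for illustration
f = [1.0, -2.0, 0.5, 3.0]
g = [0.3, 1.5, -4.0, 2.0]
p = 3.0

# Hoelder: |<f, g>| <= ||f||_p ||g||_p'
pairing = abs(sum(a * b for a, b in zip(f, g)))
holder_bound = p_norm(f, p) * p_norm(g, conjugate(p))

# Minkowski: ||f + g||_p <= ||f||_p + ||g||_p
sum_norm = p_norm([a + b for a, b in zip(f, g)], p)
minkowski_bound = p_norm(f, p) + p_norm(g, p)

if __name__ == "__main__":
    print(pairing, holder_bound)
    print(sum_norm, minkowski_bound)
```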

The collection of all real or complex measurable functions $f$ such that $\|f\|_p < \infty$, with the usual operations, is a real or complex vector space, and the quotient space by $\mathcal{N}(\mu)$ is denoted $L^p(\mu)$. The value $\|f\|_p$ is the same for all the representatives $f$ we may pick in an equivalence class, and $\|\cdot\|_p$ has on $L^p(\mu)$ all the typical properties of a norm.$^{10}$

If $p = \infty$, note that a representative of $f$ is bounded and then in (1.2) we obtain $s_n(x) \to f(x)$ uniformly.

When $p = 2$, note that $\|f\|_2 = \sqrt{(f, f)_2}$, where
\[ (f, g)_2 := \int f \bar{g}\,d\mu \qquad \Bigl(\int f g\,d\mu \text{ in the real case}\Bigr) \]
is well-defined, and it is a scalar product that allows us to work with $L^2(\mu)$ as with a Euclidean space. This will be the basic example of Hilbert spaces.

1.2.1. Borel measures on a locally compact space X and positive linear forms on Cc(X). Let X be a locally compact metric space.

If $\mu$ is a Borel measure on $X$, then every $g \in \mathcal{C}_c(X)$ is $\mu$-integrable, since $|g| \le C\chi_K$ for $K = \operatorname{supp} g$ and compact sets have finite measure. Note that, on $\mathcal{C}_c(X)$, the integral $\int g\,d\mu$ is linear and positive, that is, $\int g\,d\mu \ge 0$ if $g \ge 0$. These linear forms are called Radon measures, first obtained in 1913 by J. Radon$^{11}$ by considering Borel measures on a compact subset of $\mathbf{R}^n$.

$^{10}$ The name $L^p$ for these spaces was coined by F. Riesz in honor of Lebesgue. See footnote 5 in this chapter.

$^{11}$ In France, N. Bourbaki chose as starting point of the integral the Radon measures on a locally compact space $X$, a point of view previously considered by W. H. Young in 1911 and by P. J. Daniell in 1918. It is worth noting that L. Schwartz constructed his distributions in the same spirit, by changing the test space $\mathcal{C}_c(X)$.

If every open subset of $X$ is a countable union of compact sets, then the Radon measures are in a bijective correspondence with the Borel measures through the Riesz-Markov representation Theorem 1.5.


Theorem 1.5 (Riesz-Markov representation theorem). Let $J$ be a positive linear form on $\mathcal{C}_c(X)$. Then there exists a uniquely determined Borel measure $\mu$ on $X$ so that
\[ J(g) = \int g\,d\mu \qquad (g \in \mathcal{C}_c(X)) \tag{1.5} \]
and which satisfies the inner regularity property for open sets
\[ \mu(G) = \sup\{\mu(K);\ K \subset G,\ K \text{ compact}\}. \]
This Borel measure is also outer regular:
\[ \mu(B) = \inf\{\mu(G);\ G \supset B,\ G \text{ open}\}. \]

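Proof. We start by defining $\mu^*$ on open sets by
\[ \mu^*(G) := \sup\{J(g);\ g \prec G\}, \]
where we assume $\mu^*(\emptyset) = 0$. This set function has the following properties:

(a) $\mu^*(G_1) \le \mu^*(G_2)$ if $G_1 \subset G_2$, since then $\mathcal{C}_c(G_1) \subset \mathcal{C}_c(G_2)$.

(b) $\mu^*(G_1 \cup G_2) \le \mu^*(G_1) + \mu^*(G_2)$, since, if $K$ is the support of $g \prec (G_1 \cup G_2)$, for $j = 1, 2$ we can find $\varphi_j \in \mathcal{C}_c(G_j)$ such that $0 \le \varphi_j \le 1$ and $\varphi_1(x) + \varphi_2(x) = 1$ for every $x \in K$, a partition of unity constructed as in Theorem 1.3; thus $g = g\varphi_1 + g\varphi_2$, $J(g) = J(g\varphi_1) + J(g\varphi_2) \le \mu^*(G_1) + \mu^*(G_2)$, and (b) follows.

(c) $\mu^*\bigl(\bigcup_{k=1}^\infty G_k\bigr) \le \sum_{k=1}^\infty \mu^*(G_k)$, since the support $K$ of every $g \prec \bigcup_{k=1}^\infty G_k$ is contained in some finite union $\bigcup_{k=1}^N G_k$ so that, by (b),
\[ J(g) \le \mu^*\Bigl(\bigcup_{k=1}^N G_k\Bigr) \le \sum_{k=1}^N \mu^*(G_k) \le \sum_{k=1}^\infty \mu^*(G_k), \]
and (c) follows.

Now we extend $\mu^*$ and for every set $A$ we define
\[ \mu^*(A) := \inf\{\mu^*(G);\ G \supset A,\ G \text{ open}\}. \]
This set function has the properties of an outer measure:

(a) $\mu^*(\emptyset) = 0$,

(b) $\mu^*(A) \le \mu^*(B)$ if $A \subset B$, so that it is an extension of $\mu^*$ previously defined on open sets, and

(c) $\mu^*\bigl(\bigcup_{k=1}^\infty A_k\bigr) \le \sum_{k=1}^\infty \mu^*(A_k)$ (σ-subadditivity).

To prove (c), take any $\varepsilon > 0$ and pick open sets $G_k \supset A_k$ with $\mu^*(G_k) \le \mu^*(A_k) + \varepsilon/2^k$. Now, by (b) and from the σ-subadditivity on open sets,
\[ \mu^*\Bigl(\bigcup_{k=1}^\infty A_k\Bigr) \le \mu^*\Bigl(\bigcup_{k=1}^\infty G_k\Bigr) \le \sum_{k=1}^\infty \mu^*(G_k) \le \sum_{k=1}^\infty \mu^*(A_k) + \varepsilon, \]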

which yields (c) since $\varepsilon > 0$ was arbitrary.

Let us say that a set $E$ is measurable if it satisfies the Carathéodory condition
\[ \mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \setminus E) \]
for every set $A$, and let $\mu$ be the restriction of the outer measure $\mu^*$ to measurable sets.$^{12}$

It is a general fact in measure theory that the collection $\Sigma$ of all measurable sets defined in this way is a σ-algebra and the restriction $\mu$ of $\mu^*$ to $\Sigma$ is a measure.

We need to show that any open set $G$ is measurable, since then $\Sigma$ will contain the Borel σ-algebra $\mathcal{B}_X$. Thus, let us prove that $G$ satisfies the Carathéodory condition when $A$ is also an open set, $U$. Let $g \prec U \cap G$ so that $J(g) \ge \mu^*(U \cap G) - \varepsilon$. We have $G^c \subset (\operatorname{supp} g)^c$ and choose $h \prec U \setminus \operatorname{supp} g$ such that $J(h) \ge \mu^*(U \setminus \operatorname{supp} g) - \varepsilon$. Then
\[ \mu^*(U) \ge J(g + h) \ge \mu^*(U \cap G) + \mu^*(U \setminus \operatorname{supp} g) - 2\varepsilon \ge \mu^*(U \cap G) + \mu^*(U \setminus G) - 2\varepsilon, \]
which yields $\mu^*(U) \ge \mu^*(U \cap G) + \mu^*(U \setminus G)$ since $\varepsilon$ is arbitrary.

For any set $A$, if $U$ is an open set with $A \subset U$, we have that
\[ \mu^*(U) \ge \mu^*(U \cap G) + \mu^*(U \setminus G) \ge \mu^*(A \cap G) + \mu^*(A \setminus G) \]
and the Carathéodory condition for $G$ follows by taking the infimum over these $U \supset A$.

With this construction we have built a Borel measure since, for any compact set $K$, we are going to prove that
\[ \mu(K) = \inf\{J(g);\ K \prec g\}, \tag{1.6} \]
where the set on the right side is not empty.

Indeed, $\mu(K) = \inf_{G \supset K} \mu(G) \le J(g)$ whenever $K \prec g$, since $K \subset \{g > 1 - \varepsilon\}$ if $0 < \varepsilon < 1$; for any $h \prec \{g > 1 - \varepsilon\}$ we have $h \le g/(1 - \varepsilon)$ and $J(h) \le J(g)/(1 - \varepsilon)$, by the positivity of $J$, which yields
\[ \mu(K) \le \mu(\{g > 1 - \varepsilon\}) = \sup_{h \prec \{g > 1 - \varepsilon\}} J(h) \le J(g)/(1 - \varepsilon), \]
and then we let $\varepsilon \to 0$.

To prove (1.6), let $\varepsilon > 0$. Choose $G \supset K$ such that $\mu(K) \ge \mu(G) - \varepsilon$ and $K \prec g \prec G$. Then $\mu(K) \le J(g) \le \mu(G) \le \mu(K) + \varepsilon$ and (1.6) follows.

By construction, $\mu$ satisfies the announced regularity properties. We still need to prove the representation identity (1.5), where we can assume that $0 \le g \le 1$, since every $g \in \mathcal{C}_c(X)$ is a linear combination of such functions.

$^{12}$ In his book “Vorlesungen über reelle Funktionen” (1918), the Greek mathematician Constantin Carathéodory chose outer measures as the starting point for the construction of measure theory.


Let K0 = supp g. We will decompose g as a sum of N functions obtained by truncating g as follows. If 0

Jλ(g)= gdμ (g ∈Cc(X)).


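First we show that $\lambda(G) = \mu(G)$ for every open set $G$ by considering $K_m \uparrow G$. Then we can choose $L_m \prec g_m \prec G$ with $L_1 = K_1$ and
\[ L_m = \bigcup_{j=1}^{m} K_j \cup \bigcup_{j=1}^{m-1} \operatorname{supp} g_j, \]
so that $g_m \uparrow \chi_G$ and $\lambda(G) = \mu(G)$ by monotone convergence.

We now study the outer regularity for any Borel set $B$. Let $B = \bigcup_{j=1}^\infty B_j$ so that $\mu(B_j) < \infty$ and, by the regularity properties of $\mu$, we can choose open sets $G_j \supset B_j$ so that $\mu(G_j \setminus B_j) \le \varepsilon/2^j$. Then $G = \bigcup_{j=1}^\infty G_j \supset B$ and $\mu(G \setminus B) \le \varepsilon$. Similarly, there exists an open set $U \supset B^c$ such that $\mu(U \setminus B^c) \le \varepsilon$, and then the closed set $F = U^c$ satisfies $F \subset B$ and $\mu(B \setminus F) = \mu(U \setminus B^c) \le \varepsilon$.

Therefore $\lambda(G \setminus F) = \mu(G \setminus F) \le 2\varepsilon$, and it follows from $\lambda(G) \le \lambda(B) + 2\varepsilon$ that $\lambda$ is outer regular.

To show that $\lambda$ is also inner regular, consider $K_m \uparrow F$, so that $\lambda(K_m) \to \lambda(F) \ge \lambda(B) - \varepsilon$. □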

1.2.2. Complex measures. A complex measure on the measurable space $(X, \Sigma)$ is a complex-valued set function $\mu : \Sigma \to \mathbf{C}$ which satisfies the σ-additivity condition
\[ \mu\Bigl(\biguplus_{k=1}^\infty B_k\Bigr) = \sum_{k=1}^\infty \mu(B_k). \]
We will say that $\mu$ is a real measure if $\mu(B) \in \mathbf{R}$ for all $B \in \Sigma$.

Note that actually the convergence in $\mathbf{C}$ of the series $\sum_{k=1}^\infty \mu(B_k)$ is absolute, since the union of the sets $B_k$ does not change with a permutation of the subscripts $k$.

The total variation measure of the complex measure $\mu$ is the set function defined on $\Sigma$ by
\[ |\mu|(B) := \sup\Bigl\{ \sum_{k=1}^\infty |\mu(B_k)|;\ B = \biguplus_{k=1}^\infty B_k \Bigr\}. \]
In general, a complex measure is not a positive measure and, if it is a positive measure, it is finite.

Theorem 1.7. The total variation $|\mu|$ of the complex measure $\mu$ is a finite measure that satisfies
\[ |\mu(B)| \le |\mu|(B) \qquad (B \in \Sigma). \tag{1.7} \]
It is the smallest measure satisfying this property; that is, if $\lambda$ is another measure such that $|\mu(B)| \le \lambda(B)$ for every $B$, then $|\mu|(B) \le \lambda(B)$ $\forall B \in \Sigma$.


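Proof. The estimate (1.7) is obvious, since $B = B \uplus \emptyset \uplus \cdots \uplus \emptyset \uplus \cdots$. Moreover, if $|\mu(B)| \le \lambda(B)$ for every $B$, then $\lambda(B) \ge |\mu|(B)$, since, if $B = \biguplus_{k=1}^\infty B_k$,
\[ \lambda(B) = \sum_{k=1}^\infty \lambda(B_k) \ge \sum_{k=1}^\infty |\mu(B_k)|. \]

To show that $|\mu|$ is σ-additive, let $B = \biguplus_{k=1}^\infty B_k$ and consider any other partition $B = \biguplus_{j=1}^\infty A_j$, so that $A_j = \biguplus_{k=1}^\infty (A_j \cap B_k)$. Then
\[ \sum_{j=1}^\infty |\mu(A_j)| \le \sum_{j,k} |\mu(A_j \cap B_k)| \le \sum_{k=1}^\infty |\mu|(B_k), \]
which implies $|\mu|(B) \le \sum_{k=1}^\infty |\mu|(B_k)$.

To prove the opposite inequality, for $B = \biguplus_{k=1}^\infty B_k$, let $\delta_k < |\mu|(B_k)$ and $B_k = \biguplus_{j=1}^\infty B_{k,j}$ so that
\[ \sum_{j=1}^\infty |\mu(B_{k,j})| > \delta_k. \]
Then
\[ |\mu|(B) \ge \sum_{j,k} |\mu(B_{k,j})| \ge \sum_{k=1}^\infty \delta_k \]
and we obtain that $|\mu|(B) \ge \sum_{k=1}^\infty |\mu|(B_k)$ by taking the supremum over all the possible $\delta_k$. Also $|\mu|(\emptyset) = 0$ and $|\mu|$ is a measure.

To show that $|\mu|(X) < \infty$, suppose that $|\mu|(B) = \infty$ for some $B \in \Sigma$. For every $t > 0$, we would find a partition $B = \biguplus_{k=1}^\infty B_k$ such that $\sum_{k=1}^\infty |\mu(B_k)| > t$ and then
\[ \sum_{k=1}^N |\mu(B_k)| > t \tag{1.8} \]
for some $N$. We claim that there is an absolute constant $c > 1$ such that
\[ \sum_{k=1}^N |\mu(B_k)| \le c\, \Bigl| \sum_{j \in J} \mu(B_j) \Bigr| \tag{1.9} \]
for some $J \subset \{1, \ldots, N\}$, so that, for $A = \biguplus_{j \in J} B_j \subset B$ we obtain
\[ t < \sum_{k=1}^N |\mu(B_k)| \le c\,|\mu(A)|, \]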

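and then $|\mu(A)| > t/c$. In (1.8) we choose $t \ge c$, so that $|\mu(A)| > 1$ and
\[ |\mu(B \setminus A)| \ge |\mu(A)| - |\mu(B)| > \frac{t}{c} - |\mu(B)| = 1 \]
if $t = c(1 + |\mu(B)|)$. Now we have $B = A \uplus (B \setminus A)$ with $|\mu(A)|, |\mu(B \setminus A)| > 1$ and at least $|\mu|(A)$ or $|\mu|(B \setminus A)$ equal to $\infty$.

Suppose now that $|\mu|(X) = \infty$. Then we can successively split
\[ X = A_1 \uplus B_1 = A_1 \uplus A_2 \uplus B_2 = \cdots = A_1 \uplus A_2 \uplus \cdots \uplus A_N \uplus B_N = \cdots \]
with $|\mu(A_j)| > 1$ and $|\mu|(B_N) = \infty$ for every $j$ and $N$. We should have in $\mathbf{C}$
\[ \mu\Bigl(\biguplus_{j=1}^\infty A_j\Bigr) = \sum_{j=1}^\infty \mu(A_j), \]
but the series cannot converge, since $1 < |\mu(A_j)| \not\to 0$, which yields a contradiction. Therefore $|\mu|(X) < \infty$.

To prove the claim (1.9), let $z_k = \mu(B_k)$ ($1 \le k \le N$) and $r = \sum_{k=1}^N |z_k|$. For at least one of the four quadrants of $\mathbf{C}$, $Q$, limited by $|y| = |x|$, we have $\sum_{z_k \in Q} |z_k| \ge r/4$. Denote $J = \{j;\ z_j \in Q\}$ and choose a rotation with angle $\vartheta$ so that $z'_j = e^{i\vartheta} z_j$ is in the quadrant $|y| \le x$. Then
\[ \Bigl| \sum_{j \in J} z_j \Bigr| = \Bigl| \sum_{j \in J} z'_j \Bigr| \ge \sum_{j \in J} \operatorname{Re} z'_j \ge \frac{1}{\sqrt{2}} \sum_{j \in J} |z'_j| \ge \frac{r}{4\sqrt{2}}, \]
which proves (1.9) with $c = 4\sqrt{2}$. □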
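The selection of the index set $J$ in the proof of (1.9) is effectively an algorithm: split the nonzero $z_k$ among the four quadrants bounded by $|y| = |x|$ and keep the quadrant with the largest total modulus. The following sketch is illustrative only; the function name `best_sector_subset` and the sample data are assumptions, not part of the text. It lets one check numerically that $r \le 4\sqrt{2}\,\bigl|\sum_{j \in J} z_j\bigr|$.

```python
import cmath
import math

def best_sector_subset(zs):
    """Indices J of the z_k lying in the quadrant (bounded by |y| = |x|)
    that carries the largest sum of moduli."""
    sectors = {0: [], 1: [], 2: [], 3: []}
    for k, z in enumerate(zs):
        if z == 0:
            continue  # zero terms contribute nothing to either side of (1.9)
        # shift the argument by pi/4 so that sector 0 is |y| <= x, sector 1 is |x| <= y, etc.
        angle = (cmath.phase(z) + math.pi / 4) % (2 * math.pi)
        index = min(int(angle // (math.pi / 2)), 3)  # guard against rounding at 2*pi
        sectors[index].append(k)
    return max(sectors.values(), key=lambda idx: sum(abs(zs[k]) for k in idx))

# sample complex numbers standing in for the values mu(B_k) (arbitrary choice)
zs = [3 + 1j, -2 + 2j, -1 - 4j, 0.5 - 0.5j, -3 + 0j]
J = best_sector_subset(zs)
r = sum(abs(z) for z in zs)
subset_modulus = abs(sum(zs[j] for j in J))

if __name__ == "__main__":
    print(J, r, 4 * math.sqrt(2) * subset_modulus)
```

The constant $4\sqrt{2}$ is not claimed to be optimal here; the sketch only mirrors the argument given in the proof.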

If $\mu$ is a real measure, then
\[ \mu^+ := \frac{|\mu| + \mu}{2}, \qquad \mu^- := \frac{|\mu| - \mu}{2} \]
are two (positive) finite measures such that $\mu = \mu^+ - \mu^-$, $|\mu| = \mu^+ + \mu^-$. They are called the positive and negative variations of $\mu$, respectively.

Every complex measure $\mu$ is a linear combination of four measures, since $\mu = \operatorname{Re}\mu + i \operatorname{Im}\mu$, where $\operatorname{Re}\mu$ and $\operatorname{Im}\mu$ are two real measures.

In Lemma 4.12 we will show that
\[ \mu(B) = \int_B h\,d|\mu| \qquad (B \in \Sigma) \tag{1.10} \]
for a uniquely $|\mu|$-a.e. defined $|\mu|$-integrable function $h$ such that $|h| = 1$, so we will be allowed to define $L^p(\mu) = L^p(|\mu|)$ and $\int f\,d\mu = \int f h\,d|\mu|$ for every $f \in L^1(\mu)$.


Every complex Borel measure on $\mathbf{R}$ is a linear combination of finite (positive) Borel measures, and such a measure, $\mu$, is the Lebesgue-Stieltjes measure associated to the distribution function $F(t) := \mu\bigl((-\infty, t]\bigr)$, which is increasing and right-continuous, so that $\mu\bigl((a, b]\bigr) = F(b) - F(a)$ and the Riemann-Stieltjes integrals of all Riemann-Stieltjes integrable functions on $[a, b]$ coincide with the Lebesgue integrals:
\[ \int_a^b f(t)\,dF(t) = \int_{(a,b]} f\,d\mu. \]

Linear combinations of these integrals and measures are the corresponding integrals with complex distribution functions and complex measures that allow us to give the representation
\[ \int_{\mathbf{R}} f(t)\,dF(t) = \int_{\mathbf{R}} f\,d\mu \]
for complex measures $\mu$ and, say, $f \in \mathcal{C}_c(\mathbf{R})$.

1.3. Exercises

Exercise 1.1. Prove that
\[ d(r, s) = |\arctan s - \arctan r| \]
defines a distance on $\mathbf{R}$ whose topology is the usual one, but $\mathbf{R}$ is not complete with this distance. This shows that two distances which are topologically equivalent may not have the same Cauchy sequences.

Exercise 1.2. Prove that, in a metric space, every compact set is contained in a ball.

Exercise 1.3. Prove that, in a metric space M, the closure of a subset A is compact if and only if every sequence {ak}⊂A has a convergent subsequence in M.

Exercise 1.4. Prove that every point of a compact space K has a neighbor- hood basis of compact sets. That is, every compact space is locally compact.

Exercise 1.5. Prove that a nonempty subset X of Rn is a locally compact subspace if and only if it is the intersection of a closed and an open set, and that every open set of X is the union of an increasing sequence of compact sets.


Exercise 1.6. Let $I = [a, b]$ and $K = I^I = \prod_{t \in I} I$ endowed with the compact product topology. Note that $\{f(t)\}_{t \in I} \in K$ means that $f = \{f(t)\}_{t \in I}$ represents a function $f : I \to I$ and prove that
\[ M := \bigl\{ f = \{f(t)\}_{t \in I} \in K;\ \{x;\ f(x) \ne 0\} \text{ is at most countable} \bigr\} \]
with the topology induced by $K$ is sequentially compact but not compact.

Exercise 1.7. Suppose $(X_j, \mathcal{T}_j)$ is a family of topological spaces and $Y$ is another topological space. Show that for the product topology $\mathcal{T}$ on $X = \prod_{j \in J} X_j$, a mapping $f : Y \to X$ is continuous if and only if every $\pi_j \circ f : Y \to X_j$ is continuous.

Exercise 1.8. Assume that $\mathcal{T}$ and $\mathcal{T}'$ are two Hausdorff topologies on the same set $K$ and $\mathcal{T}'$ is finer than $\mathcal{T}$. Prove that if $(K, \mathcal{T}')$ is compact, then $\mathcal{T} = \mathcal{T}'$.

Exercise 1.9. Suppose that $f : X \to Y$, where $X$ and $Y$ are two topological spaces, and $x_0 \in X$.

(a) If $X$ is metrizable (or if $x_0$ has a countable neighborhood basis), prove that $f$ is continuous at $x_0$ if and only if $f$ is sequentially continuous at $x_0$.

(b) Let $X$ be $\mathbf{R}$ endowed with the topology of all sets $G \subset \mathbf{R}$ such that $G^c$ is countable, and let $Y$ also be $\mathbf{R}$ but with the discrete topology (all subsets of $\mathbf{R}$ are open sets). Show that $\mathrm{Id} : X \to Y$ is a sequentially continuous noncontinuous function.

Exercise 1.10. If $f$ is a measurable function on a measurable space, show that
\[ \operatorname{sgn} f(x) := \frac{f(x)}{|f(x)|} \]
(with $0/0 := 0$) defines another measurable function.

Exercise 1.11. If $\nu$ is the counting measure on a set $X$, so that $\nu(A) = n \in \mathbf{N}$ or $\nu(A) = \infty$ for any set $A \subset X$, prove that $f : X \to \mathbf{R}$ (or $\mathbf{C}$) is in $L^1(\nu)$ if and only if $N := \{f \ne 0\}$ is at most countable and $\sum_{k \in N} |f(k)| < \infty$. In this case, show that $\int f\,d\nu = \sum_{k \in N} f(k)$. In this context one usually writes $\ell^1(X)$ or $\ell^1$ for $L^1(\nu)$.

Exercise 1.12. Compute the limits of
\[ \int_0^n \Bigl(1 + \frac{x}{n}\Bigr)^n e^{x}\,dx, \qquad \int_0^n \Bigl(1 - \frac{x}{n}\Bigr)^n e^{-x}\,dx, \qquad \text{and} \qquad \int_0^n \Bigl(1 - \frac{x}{2n}\Bigr)^n e^{x}\,dx \]
as $n \to \infty$.

Exercise 1.13. Use the Fubini-Tonelli theorem to prove that the integral
\[ I := \int_{(0,1)^2} \frac{1}{|x - y|^\alpha}\,dx\,dy \]
is finite if and only if $\alpha < 1$, and then show that $I = 2/\bigl((1 - \alpha)(2 - \alpha)\bigr)$.

Exercise 1.14. If $F : [a, b] \to \mathbf{R}$ is absolutely continuous, prove that $F$ satisfies the following property: If $\varepsilon > 0$ is given, there is a $\delta > 0$ such that
\[ \sum_k |F(b_k) - F(a_k)| \le \varepsilon \]
for every finite sequence $\{(a_k, b_k)\}$ of nonoverlapping intervals contained in $[a, b]$ such that $\sum_k (b_k - a_k) \le \delta$.

The converse is also true: If the above property holds, then $F$ has a representation as in (1.4). See a proof in [37], [39], or [6].

Exercise 1.15. Let $\mu$ be a Borel measure on $\mathbf{R}$ and define
\[ F(0) = 0, \qquad F(t) = \mu((0, t]) \text{ if } t > 0, \qquad F(t) = -\mu((t, 0]) \text{ if } t < 0. \]
Then $F$ is an increasing right-continuous function and $\mu$ is the Lebesgue-Stieltjes measure associated to $F$ as a distribution function, that is,
\[ \mu((a, b]) = F(b) - F(a). \]
Moreover, if $f \in \mathcal{C}[a, b]$, $\int_{(a,b]} f\,d\mu = \int_a^b f(t)\,dF(t)$, a Riemann-Stieltjes integral.

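Exercise 1.16. If $\mu$ is a complex measure, prove that $\lim_k \mu(B_k) = \mu(B)$ if either $B_k \uparrow B$ or $B_k \downarrow B$.

Exercise 1.17. Show that, for any complex Borel measure $\mu$,
\[ |\mu|(B) = \sup\Bigl\{ \Bigl| \int_B f\,d\mu \Bigr|;\ |f| \le 1 \Bigr\}. \]

Exercise 1.18. Suppose that $\{\lambda_k\} \in \ell^1(\mathbf{N})$ and that $\{a_k\}$ is a sequence in $\mathbf{R}^n$. Prove that there is a uniquely determined Borel measure $\mu$ on $\mathbf{R}^n$ such that
\[ \int g\,d\mu = \sum_{k=1}^\infty \lambda_k g(a_k) \qquad (g \in \mathcal{C}_c(\mathbf{R}^n)). \]
Show that $\sum_{k=1}^\infty |\lambda_k| = \sup\bigl\{ \bigl|\int g\,d\mu\bigr|;\ g \in \mathcal{C}_c(\mathbf{R}^n),\ |g| \le 1 \bigr\}$ and $|\mu|(F^c) = 0$ if $F = \{a_1, a_2, \ldots\}$.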

References for further reading:

J. Cerdà, Análisis Real.
P. R. Halmos, Measure Theory.
L. Kantorovitch and G. Akilov, Analyse fonctionnelle.
J. L. Kelley, General Topology.
A. N. Kolmogorov and S. V. Fomin, Elements of the Theory of Functions and Functional Analysis.
H. L. Royden, Real Analysis.
W. Rudin, Real and Complex Analysis.

Chapter 2

Normed spaces and operators

The objects in functional analysis are function spaces endowed with topologies that make the operations continuous as well as the operators between them. This chapter is devoted to the most basic facts concerning Banach spaces and bounded linear operators.

It can be useful for the reader to retain as a first model of function spaces the linear space $\mathcal{C}(L)$ of all real continuous functions on a compact set $L$ in $\mathbf{R}^n$ with the uniform convergence, defined by the condition
\[ \|f - f_n\|_L := \max_{t \in L} |f(t) - f_n(t)| \to 0. \]
Examples of operators on this space are the integral operators
\[ Tf(x) = \int_L K(x, y) f(y)\,dy \]
where $K(x, y)$ is continuous on $L \times L$. Then $T : \mathcal{C}(L) \to \mathcal{C}(L)$ is linear and
\[ \|Tf - Tf_n\|_L \le \max_{x \in L} \int_L |K(x, y)|\,|f(y) - f_n(y)|\,dy \le M \|f - f_n\|_L, \]
with $M := \max_{x \in L} \int_L |K(x, y)|\,dy$, so that $T$ satisfies the continuity condition $\|Tf - Tf_n\|_L \to 0$ if $\|f - f_n\|_L \to 0$.

Note that if $L = [a, b]$, this space is infinite-dimensional, since it contains the linearly independent functions $1, x, x^2$, etc. Two major differences with respect to the usual finite-dimensional Euclidean spaces are that a linear map between general Banach spaces is not necessarily continuous and that the closed balls are not compact.
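The continuity estimate for the integral operator has an exact discrete counterpart when the integral is replaced by a quadrature rule. The sketch below assumes $L = [0, 1]$, a midpoint rule with `m` nodes, and a sample kernel $K(x, y) = e^{-|x - y|}$; all of these, together with the helper names `apply_kernel` and `sup_norm`, are illustrative choices rather than anything fixed by the text.

```python
import math

def apply_kernel(K, f_values, nodes, w):
    """Midpoint-rule approximation of (Tf)(x) = integral over [0, 1] of K(x, y) f(y) dy,
    evaluated at every node x."""
    return [sum(K(x, y) * fy for y, fy in zip(nodes, f_values)) * w for x in nodes]

def sup_norm(values):
    """Discrete analogue of the uniform norm on L."""
    return max(abs(v) for v in values)

m = 200                       # number of quadrature nodes (assumption)
w = 1.0 / m                   # midpoint-rule weight on [0, 1]
nodes = [(j + 0.5) * w for j in range(m)]

def K(x, y):
    # sample continuous kernel on [0, 1] x [0, 1]
    return math.exp(-abs(x - y))

f = [math.sin(3 * y) for y in nodes]
fn = [math.sin(3 * y) + 0.01 * math.cos(40 * y) for y in nodes]   # a perturbation of f

# discrete version of M = max_x integral |K(x, y)| dy
M = max(sum(abs(K(x, y)) for y in nodes) * w for x in nodes)

Tf = apply_kernel(K, f, nodes, w)
Tfn = apply_kernel(K, fn, nodes, w)
lhs = sup_norm([a - b for a, b in zip(Tf, Tfn)])
rhs = M * sup_norm([a - b for a, b in zip(f, fn)])

if __name__ == "__main__":
    print(lhs, rhs)
```

The discretized inequality `lhs <= rhs` holds for the same reason as in the continuous case: the absolute value is moved inside the sum and the difference is bounded by its maximum.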



We start this chapter with some basic definitions and, after the basic examples of $L^p$ spaces and $\mathcal{C}(K)$, with the inclusion of the proof of the Weierstrass and the Stone-Weierstrass theorems, we consider the space of all bounded linear operators.

The use of Neumann series, which will appear again when studying the invertible elements in a Banach algebra and the spectrum of unbounded operators, in combination with Volterra integral operators gives an application for the solution of initial value problems for linear ordinary differential equations. The introduction of Green's function also allows us to solve boundary value problems by Fredholm integral operators. These applications are described in the last section.

The chapter includes a review of the most basic facts concerning orthogonality in a Hilbert space. Duality will be discussed in Chapter 4.

There is also a section on summability kernels that will be useful in later developments. They are applied here to show the density of the trigonometric polynomials in $L^p(0, 1)$ and to prove the Riemann-Lebesgue lemma.

The section devoted to the Riesz-Thorin interpolation theorem of linear operators on $L^p$ spaces is optional. It will be used only in Chapter 7 to include the nice proof of the $L^p$-continuity of the Hilbert transform due to M. Riesz.

2.1. Banach spaces

2.1.1. Topological vector spaces. In this book, a vector space will always be a linear space over the real field $\mathbf{R}$ or the complex field $\mathbf{C}$. The letter $\mathbf{K}$ will denote either of them, and $\mathbf{K}$ will be endowed with the usual topology defined by the distance $d(\lambda, \mu) = |\lambda - \mu|$, where $|\cdot|$ represents the absolute value. The collection of all the discs (intervals if $\mathbf{K} = \mathbf{R}$)
\[ D(\lambda_0, \varepsilon) = \{\lambda \in \mathbf{K};\ |\lambda - \lambda_0| < \varepsilon\} = \lambda_0 + D(0, \varepsilon) \qquad (\varepsilon > 0) \]
is a neighborhood basis of the point $\lambda_0 \in \mathbf{K}$.

A vector topology $\mathcal{T}$ on a vector space $E$ will be a Hausdorff topology such that the vector operations
\[ (x, y) \in E \times E \mapsto x + y \in E, \qquad (\lambda, x) \in \mathbf{K} \times E \mapsto \lambda x \in E \]
are continuous when we endow $E \times E$ and $\mathbf{K} \times E$ with the corresponding product topologies. Then we say that $E$, or the couple $(E, \mathcal{T})$, is a topological vector space.

On a topological vector space $E$, every translation $\tau_{x_0} : x \in E \mapsto x + x_0 \in E$ is continuous, since it is obtained by fixing a variable in the sum: If $V(y)$ is a neighborhood of $y = x + x_0$, there is a neighborhood $U(x) \times U(x_0)$ of $(x, x_0) \in E \times E$ such that $U(x) + U(x_0) \subset V(y)$, and then $\tau_{x_0}(U(x)) \subset V(y)$.


Similarly, every multiplication $x \in E \mapsto \lambda_0 x \in E$ by a given scalar $\lambda_0$ is also continuous.

Since the inverse $\tau_{-x_0}$ of $\tau_{x_0}$ is continuous, $U$ is a neighborhood of $0 \in E$ if and only if $U + x_0$ is a neighborhood of $x_0$. Thus, the topology of $E$ is translation-invariant: $\mathcal{U}$ is a neighborhood basis of $0 \in E$ if and only if $\mathcal{U}(x_0) = \{U + x_0;\ U \in \mathcal{U}\}$ is a neighborhood basis of $x_0 \in E$.

We will say that $\mathcal{U}$ is a local basis of $E$. An obvious example is the collection of all open sets that contain $0$.

A subspace of a topological vector space $(E, \mathcal{T})$ is a vector subspace $F$ with the topology that consists of all the sets $G \cap F$, $G \in \mathcal{T}$. Then $F$ becomes a topological vector space.

Recall that $C \subset E$ is convex if and only if $0$ …

Proof. To prove (a), suppose $\lambda, \mu \in \mathbf{K}$. If $\lambda \ne 0$, multiplication by $\lambda$ and by its inverse is continuous and we always have $\lambda \bar{F} = \overline{\lambda F}$. Then
\[ \lambda \bar{F} + \mu \bar{F} = \overline{\lambda F} + \overline{\mu F} \subset \bar{F} + \bar{F} \subset \bar{F}, \]
since, if $x, y \in \bar{F}$, for every neighborhood $x + y + U$ of $x + y$ (when $U$ is in a local basis $\mathcal{U}$), the continuity of the sum allows us to take $V \in \mathcal{U}$ so that $V + V \subset U$. Hence, if $a \in (x + V) \cap F$ and $b \in (y + V) \cap F$, then $a + b \in x + y + V + V \subset x + y + U$ and also $x + y \in \bar{F}$.

To prove (b) consider $x, y \in \bar{C}$ and let $\alpha + \beta = 1$ ($\alpha, \beta \ge 0$). Using the same argument as in (a), $\alpha x + \beta y \in \bar{C}$. □

Theorem 2.2. Suppose $\mathcal{U}$ and $\mathcal{V}$ are local bases of the topological vector spaces $E$ and $F$. A linear mapping $T : E \to F$ is continuous if and only if, for every $V \in \mathcal{V}$, $T(U) \subset V$ for some $U \in \mathcal{U}$.

Proof. T (U) ⊂ V if and only if T (x0 + U) ⊂ T (x0)+V . 

2.1.2. Normed and Banach spaces. Normed spaces are the simplest and most useful topological vector spaces.

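Recall that $\|\cdot\|_E : E \to [0, \infty)$ is a norm on the vector space $E$ if it satisfies the following properties:

1. $\|x\|_E > 0$ if $x \ne 0$,

2. triangle inequality: $\|x + y\|_E \le \|x\|_E + \|y\|_E$ if $x, y$ in $E$, and

3. $\|\lambda x\|_E = |\lambda|\,\|x\|_E$ if $x \in E$ and $\lambda \in \mathbf{K}$.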


A normed space is a vector space $E$ endowed with a norm $\|\cdot\|_E$ defined on it, with the topology associated to the distance $d_E(x, y) := \|y - x\|_E$.

If $E$ and $F$ are two normed spaces, we endow $E \times F$ with the product norm
\[ \|(x, y)\|_{E \times F} := \max(\|x\|_E, \|y\|_F) \]
and its distance
\[ d((x_1, y_1), (x_2, y_2)) = \max(\|x_1 - x_2\|_E, \|y_1 - y_2\|_F), \]
which defines the product topology. A sequence $(x_n, y_n) \in E \times F$ ($n \in \mathbf{N}$) is convergent in the product space if and only if the component sequences $\{x_n\}$ and $\{y_n\}$ are convergent in the corresponding factor space.

Of course, these facts extend in the obvious way to finite products.

Theorem 2.3. The topology of a normed space $E$ is a vector topology, and the norm $\|\cdot\|_E$ is a continuous real function on $E$.

Proof. The continuity of the vector operations follows from the inequalities
\[ \|(x_1 + x_2) - (y_1 + y_2)\|_E \le \|x_1 - y_1\|_E + \|x_2 - y_2\|_E \le 2\max(d_E(x_1, y_1), d_E(x_2, y_2)) \]
and
\[ \|\lambda x_n - \mu x\|_E \le |\lambda - \mu|\,\|x_n\|_E + |\mu|\,\|x_n - x\|_E. \]
To show that the norm is also continuous, note that
\[ \bigl|\, \|x\|_E - \|y\|_E \,\bigr| \le \|x - y\|_E \tag{2.1} \]
because $\|x\|_E \le \|x - y\|_E + \|y\|_E$ and $\|y\|_E \le \|x - y\|_E + \|x\|_E$. □

A topological vector space $(E, \mathcal{T}_E)$ is said to be normable if $\mathcal{T}_E$ is the topology defined by some norm on $E$.

A normed space $E$ is called a Banach space$^{1}$ if it is complete; that is, whenever $\|x_p - x_q\|_E \to 0$, we can find $x \in E$ so that $\|x - x_k\|_E \to 0$.

Completeness is a fundamental property of many normed spaces. Some basic theorems will apply only to complete spaces.

The calculus with numerical series is meaningful for vector-valued series in a Banach space $E$. If $\sum_{n=1}^\infty \|x_n\|_E < \infty$, then $\sum_{n=1}^\infty x_n$ is called absolutely convergent and, as in the case of numerical series, every absolutely

$^{1}$ The term was introduced by M. Fréchet to honor Stefan Banach's work around 1920, culminating in his 1932 book “Théorie des opérations linéaires” [3], the first monograph on the general theory of linear metric spaces.


convergent series is convergent in $E$ since, if $s_N = \sum_{n=1}^N x_n$ and $p$ …

\[ \|f\|_X = \sup_{x \in X} |f(x)|. \]

The convergence $\|f - f_n\|_X \to 0$ means that $f_n(x) \to f(x)$ uniformly on $X$.

A special case is the sequence space $\ell^\infty = B(\mathbf{N})$ (or $B(\mathbf{Z})$). In this context one usually writes
\[ \|\{x_n\}\|_\infty = \sup_n |x_n| \]
for the sup norm. These spaces are also complete.

Indeed, if $\{f_n\}_n$ is a Cauchy sequence in $B(X)$, it is uniformly Cauchy and every $\{f_n(x)\}_n$ is a Cauchy sequence in $\mathbf{K}$, so that there exists $f(x) = \lim f_n(x)$. Then it follows from the uniform estimate
\[ |f_p(x) - f_q(x)| \le \varepsilon \qquad (p, q \ge n_0) \]
that $|f_p(x) - f(x)| \le \varepsilon$ ($p \ge n_0$) when $q \to \infty$. Thus, $\|f_p - f\|_X \le \varepsilon$, $\|f\|_X \le \varepsilon + \|f_p\|_X < \infty$, and $f_n \to f$ uniformly on $X$.

Example 2.6. A subspace $F$ of a normed space $E$ will be a vector subspace equipped with the restriction of the norm $\|\cdot\|_E$. If $E$ is a Banach space, $F$ is complete if and only if it is closed in $E$, since every Cauchy sequence of $F$ is convergent in $E$ and the limit is in $F$ when $F$ is closed.


Let $X$ be a topological space. We denote by $\mathcal{C}_b(X)$ the subspace of $B(X)$ of all $\mathbf{K}$-valued bounded and continuous functions $f$ on $X$. As a closed subspace of the Banach space $B(X)$, $\mathcal{C}_b(X)$ is also complete.

Note that every Cauchy sequence $\{g_n\}$ in $\mathcal{C}_b(X)$ is convergent in $B(X)$ to a function $f$, which belongs to $\mathcal{C}_b(X)$ as a uniform limit of continuous functions.

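Example 2.7. The product $E = E_1 \times \cdots \times E_n$ of a finite number of Banach spaces with the product norm
\[ \|(x^1, \ldots, x^n)\|_E := \max_{1 \le j \le n} \|x^j\|_{E_j} \]
is complete, since in $E$ we have coordinatewise convergence.

Let $m \in \mathbf{N}$ and suppose that $\Omega$ is a nonempty open set in $\mathbf{R}^n$. We call $\mathcal{C}^m(\bar{\Omega})$ the normed space of all $\mathcal{C}^m$ functions $f$ on $\Omega$ such that every derivative $D^\alpha f$ of order $|\alpha| \le m$ admits a continuous and bounded extension to $\bar{\Omega}$. With the norm
\[ \|f\|_{\mathcal{C}^m} := \max_{|\alpha| \le m} \|D^\alpha f\|_{\bar{\Omega}}, \]
$\mathcal{C}^m(\bar{\Omega})$ is a Banach space.

The space $\mathcal{C}^m(\bar{\Omega})$ can be seen as a closed subspace of the finite product $E = \prod_{|\alpha| \le m} \mathcal{C}_b(\bar{\Omega})$ of Banach spaces by means of the injection
\[ f \in \mathcal{C}^m(\bar{\Omega}) \mapsto \{D^\alpha f\}_{|\alpha| \le m} \in E. \]

Here, according to the simplifying notation introduced by H. Whitney, for every $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbf{N}^n$, we denote
\[ x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n} \]
if $x = (x_1, \ldots, x_n)$ and
\[ D^\alpha = \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n}, \]
where $\partial_j = \partial/\partial x_j$ represents the partial derivative with respect to the $j$th coordinate, and $|\alpha| = \alpha_1 + \cdots + \alpha_n$ is the order of the differential operator $D^\alpha$.

With this notation, the $n$-dimensional Leibniz formula states that
\[ D^\alpha(fg) = \sum_{\beta \le \alpha} \binom{\alpha}{\beta} D^\beta f\, D^{\alpha - \beta} g \]
and it is easily proved by induction (cf. Exercise 2.1). Here $\beta \le \alpha$ means $\beta_j \le \alpha_j$ for $1 \le j \le n$ and $\alpha - \beta = (\alpha_1 - \beta_1, \ldots, \alpha_n - \beta_n)$.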


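Example 2.8. Let $(\Omega, \Sigma, \mu)$ be a measure space and let $1 \le p \le \infty$. Consider $L^p(\mu)$ with the $L^p$-norm
\[ \|f\|_p = \left( \int |f|^p\,d\mu \right)^{1/p}, \]
when $1 \le p < \infty$ and $\|f\|_\infty := \min\{C \ge 0;\ |f(x)| \le C \text{ a.e.}\}$, as in Section 1.2. It will follow from the next theorem that this space is complete.

A similar example is $\ell^p = \ell^p(\mathbf{N})$ or $\ell^p(\mathbf{Z})$, the space of all numerical sequences $x = \{x_k\}$ ($k \in \mathbf{N}$ or $\mathbf{Z}$) such that
\[ \|x\|_p = \Bigl( \sum_k |x_k|^p \Bigr)^{1/p} \qquad (\sup_k |x_k| \text{ if } p = \infty) \]
is finite, with the usual operations.$^{2}$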

Theorem 2.9. Let $1 \le p < \infty$ and assume that all the functions $f_k$ are measurable.

(a) Let $f_k(x) \to f(x)$ everywhere and $|f_k| \le g$ a.e. for some $g \in L^p(\mu)$. Then $f_k \to f$ in $L^p(\mu)$.

(b) If $\sum_{k=1}^\infty \|f_k\|_p < \infty$, then $\sum_{k=1}^\infty |f_k(x)| < \infty$ a.e., there exists a function $f \in L^p(\mu)$ such that $f(x) = \sum_{k=1}^\infty f_k(x)$ a.e., and $f = \sum_{k=1}^\infty f_k$ in $L^p(\mu)$.

(c) Every convergent sequence $f_k \to f$ in $L^p(\mu)$ has a subsequence which converges pointwise a.e. to $f$.

Proof. (a) Since also $|f(x)| \le g(x)$, $|f_k - f|^p \le (2g)^p \in L^1(\mu)$ and $|f_k(x) - f(x)|^p \to 0$, and by dominated convergence, $\|f_k - f\|_p \to 0$.

(b) Let $M = \sum_{k=1}^\infty \|f_k\|_p$ and put $g_N(x) := \sum_{k=1}^N |f_k(x)| \uparrow g(x)$, so that $g_N^p \uparrow g^p$ and $\int g_N^p\,d\mu \le M^p$, by the triangle inequality. By monotone convergence, also $\int g^p\,d\mu \le M^p$ and $g(x) = \sum_{k=1}^\infty |f_k(x)| < \infty$ a.e. The sum $f(x) := \sum_{k=1}^\infty f_k(x)$ is defined a.e., or everywhere by picking equivalent representatives for the functions $f_k$. Now we apply (a) to the partial sums $\sum_{j=1}^k f_j$ to obtain $\sum_{j=1}^k f_j \to f$ in $L^p(\mu)$.

(c) Since $\|f_m - f_n\|_p \to 0$, we can select a subsequence $\{f_{k_n}\}$ such that
\[ \|f_{k_{m+1}} - f_{k_m}\|_p \le \frac{1}{2^m}. \]
This subsequence is the sequence of partial sums of the series
\[ f_{k_1} + (f_{k_2} - f_{k_1}) + (f_{k_3} - f_{k_2}) + \cdots + (f_{k_{m+1}} - f_{k_m}) + \cdots \]

$^{2}$ On an arbitrary set $J$ endowed with the counting measure, one usually writes $\ell^p(J)$ instead of $L^p(J)$. A function $f$ on $J$ is in $\ell^p(J)$ ($1 \le p < \infty$) if and only if $\sum_{j \in J} |f(j)|^p < \infty$. If $\sum_{j \in J} |f(j)|^p < \infty$, then $N := \{j \in J;\ f(j) \ne 0\}$ is at most countable (cf. Exercise 1.11). If $J = \{1, 2, \ldots, n\}$, $\ell^2(J)$ is the Euclidean space $\mathbf{K}^n$.


which is absolutely convergent on $L^p(\mu)$, since $\sum_{m=1}^\infty 1/2^m < \infty$. Now an application of (b) gives $f_{k_m} \to h$ in $L^p(\mu)$ and a.e. Obviously, $h = f$ a.e. □

Corollary 2.10. $L^p(\mu)$ is a Banach space.

Proof. Assume first $1 \le p < \infty$ and let $\{f_k\}$ be a Cauchy sequence in $L^p(\mu)$. As in the preceding proof of (c), there exists a subsequence $\{f_{k_n}\}$ which is convergent to a function $h \in L^p(\mu)$. By the triangle inequality, we also obtain $f_k \to h$ in $L^p(\mu)$.

In $L^\infty(\mu)$, if $\{f_k\}$ is a Cauchy sequence, the sets $B_k := \{x;\ |f_k(x)| > \|f_k\|_\infty\}$ and $B_{p,q} := \{x;\ |f_p(x) - f_q(x)| > \|f_p - f_q\|_\infty\}$ have measure 0, and also $\mu(B) = 0$ if $B$ is the union of all of them. Then we have $\lim_k f_k(x) = f(x)$ uniformly on $B^c$ and $\lim_k f_k = f$ in $L^\infty(\mu)$. □

Remark 2.11. In a Banach space, every absolutely convergent series is convergent, and the converse is also true: If every absolutely convergent series of a normed space $E$ is convergent, then the space is complete.

To show this fact, just follow the proof of (c) in Theorem 2.9 for a Cauchy sequence $\{f_k\} \subset E$; the terms $f_{k_n}$ are the partial sums of the absolutely convergent series
\[ f_{k_1} + (f_{k_2} - f_{k_1}) + (f_{k_3} - f_{k_2}) + \cdots + (f_{k_{m+1}} - f_{k_m}) + \cdots, \]
which is convergent.

Corollary 2.12. Let $(\Omega, \Sigma, \mu)$ be a measure space and $1 \le p < \infty$. Then every $f \in L^p(\mu)$ is the limit in $L^p(\mu)$ of a sequence $\{s_n\} \subset L^p(\mu)$ of simple functions.

Proof. Just consider $s_n$ such that $|s_n(x)| \uparrow |f(x)|$ for every $x \in \Omega$ as in (1.1) and apply Theorem 2.9(a). □

Corollary 2.13. Suppose $\mu$ is a Borel measure on a locally compact metric space $X$ and let $1 \le p < \infty$. Then every $f \in L^p(\mu)$ is the limit in $L^p(\mu)$ of a sequence $\{g_k\} \subset \mathcal{C}_c(X)$, meaning that every $g_k$ is continuous with compact support and that $\lim_{k \to \infty} \|f - g_k\|_p = 0$.

Proof. Since simple functions are dense in $L^p(\mu)$, we only need to approximate every Borel set $B$ with finite measure by functions in $\mathcal{C}_c(X)$. By regularity, we can find $K \subset B \subset G$ such that $\mu(G \setminus K) \le \varepsilon^p$ and choose $K \prec g \prec G$. Then
$$\int |\chi_B - g|^p\,d\mu \le \int_{G\setminus K} 1\,d\mu \le \varepsilon^p.$$ □

2.1. Banach spaces 33

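A very important property of the $L^p(\mathbf{R}^n)$ spaces is the continuity of translations:

Theorem 2.14. Assume that $f \in L^p(\mathbf{R}^n)$ ($1 \le p < \infty$) and denote $(\tau_h f)(x) = f(x - h)$. Then the $L^p$-valued function $h \in \mathbf{R}^n \mapsto \tau_h f \in L^p(\mathbf{R}^n)$ is continuous; that is,
$$\lim_{h \to h_0} \|\tau_h f - \tau_{h_0} f\|_p = 0.$$

Proof. Since $\|\tau_h f - \tau_{h_0} f\|_p = \|\tau_{h-h_0} f - f\|_p$, we can assume $h_0 = 0$.

Start first with $g \in \mathcal{C}_c(\mathbf{R}^n)$ supported by $K$ and let $K(1) = K + \bar B(0,1)$. If $\varepsilon > 0$, by the uniform continuity of $g$, we can find $\delta > 0$ so that
$$\|\tau_h g - g\|_\infty \le \varepsilon/|K(1)|^{1/p} \quad \text{if } |h| \le \delta.$$

Then it follows that $\|\tau_h g - g\|_p \le \varepsilon$ if $|h| \le \delta$.

With Corollary 2.13 in hand, we choose $g \in \mathcal{C}_c(\mathbf{R}^n)$ so that $\|f - g\|_p \le \varepsilon$, and then
$$\|\tau_h f - f\|_p \le \|\tau_h(f - g)\|_p + \|\tau_h g - g\|_p + \|f - g\|_p \le 3\varepsilon.$$ □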

Every normed space $E$ has a completion, defined as a Banach space $\tilde E$ with an isometric linear embedding $J_{\tilde E} : E \hookrightarrow \tilde E$ with dense image $J_{\tilde E}(E)$ in $\tilde E$. A completion can be obtained either in the same way as $\mathbf{R}$ is defined as the completion of $\mathbf{Q}$ or using duality, as in Theorem 4.25.

The completion is unique in the sense that, if $J_F : E \hookrightarrow F$ is a second one, there exists a unique isomorphism $\Phi : F \to \tilde E$ such that $\Phi \circ J_F : E \to \tilde E$ is the embedding $J_{\tilde E} : E \hookrightarrow \tilde E$ and $\Phi$ is an isometry: For every Cauchy sequence $\{x_n\} \subset E$, the images $\{J_{\tilde E} x_n\} \subset \tilde E$ and $\{J_F x_n\} \subset F$ are also Cauchy and convergent in $\tilde E$ and $F$, respectively, and $\Phi(\lim_n J_F x_n) = \lim_n J_{\tilde E} x_n$ is the only possible definition of $\Phi$. It is clear that $\Phi$ is an isometric isomorphism.

It is customary to identify $E$ with the subspace $J_{\tilde E}(E)$ of the completion $\tilde E$.

2.1.3. The space C(K) and the Stone-Weierstrass theorem. By C(K) we represent either the real or the complex Banach space of all real-valued or complex-valued continuous functions on a compact topological space K, endowed with the uniform norm. When confusion is possible, we write C(K; R) or C(K; C), respectively. Let A be a subalgebra of C(K); that is, A is a vector subspace of C(K) and the product of two functions in A belongs to A.


The closure of $A$ is also a subalgebra of $\mathcal{C}(K)$, since it is a vector subspace (cf. Theorem 2.1) and $f_n \to f$ and $g_n \to g$ uniformly on $K$ imply $f_n g_n \to fg$. To prove this fact, just consider
$$\|fg - f_n g_n\|_K \le \|fg - f g_n\|_K + \|f g_n - f_n g_n\|_K \le \|f\|_K \|g - g_n\|_K + \|g_n\|_K \|f - f_n\|_K \to 0.$$

We say that $A$ separates points of $K$ if, given $x \neq y$ in $K$, there is a function $g \in A$ such that $g(x) \neq g(y)$, and we say that $A$ does not vanish at any point of $K$ if, given $x \in K$, there is a function $g \in A$ such that $g(x) \neq 0$.

Lemma 2.15. If a subalgebra A of C(K) separates points and does not vanish at any point of K, then the following interpolation property holds: For any two different points x, y ∈ K and two given numbers α and β, there is a function f ∈ A such that f(x)=α and f(y)=β.

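Proof. Let $g, h_x, h_y \in A$ such that $g(x) \neq g(y)$, $h_x(x) \neq 0$, and $h_y(y) \neq 0$. Then
$$g_1 := g h_y - g(x) h_y, \qquad g_2 := g h_x - g(y) h_x$$
belong to $A$ and we define
$$f = \frac{\alpha}{g_2(x)}\, g_2 + \frac{\beta}{g_1(y)}\, g_1.$$
This function also belongs to $A$ and satisfies the required interpolation property, since $g_1(x) = g_2(y) = 0$, while $g_1(y) \neq 0$ and $g_2(x) \neq 0$. □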
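The construction in the proof of Lemma 2.15 is explicit enough to be tried numerically. The following is a minimal sketch, assuming the algebra of real polynomials on $[0,1]$; the helper name `interpolate` and the sample functions are hypothetical choices made only for illustration.

```python
# Illustrative sketch of the construction in the proof of Lemma 2.15.
# All names are hypothetical; functions are plain Python callables.

def interpolate(g, hx, hy, x, y, alpha, beta):
    """Return f with f(x) = alpha and f(y) = beta, built as in the proof.

    Assumes g(x) != g(y), hx(x) != 0 and hy(y) != 0.
    """
    g1 = lambda t: g(t) * hy(t) - g(x) * hy(t)   # vanishes at x, not at y
    g2 = lambda t: g(t) * hx(t) - g(y) * hx(t)   # vanishes at y, not at x
    c2 = alpha / g2(x)
    c1 = beta / g1(y)
    return lambda t: c2 * g2(t) + c1 * g1(t)

# Polynomials on [0, 1]: g(t) = t separates points, h(t) = 1 + t^2 never vanishes.
g = lambda t: t
h = lambda t: 1.0 + t * t
f = interpolate(g, h, h, 0.25, 0.75, 3.0, -2.0)
```

Here `f` is again a polynomial, so it stays inside the algebra, and it takes the prescribed values at the two chosen points up to rounding.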

If $K$ is a compact subset of $\mathbf{R}^n$, then the set $\mathcal{P}(K)$ of all real polynomial functions on $K$ is a subalgebra of $\mathcal{C}(K; \mathbf{R})$ that separates points since, if $a, b \in K$ are two different points, at least $a_j \neq b_j$ for one coordinate $j$ and then the corresponding monomial $x_j$ has different values on $a$ and $b$. Moreover $1 \in \mathcal{P}(K)$ and this subalgebra does not vanish at any point of $K$. Theorem 2.19 will show that these facts imply that $\mathcal{P}(K)$ is dense in $\mathcal{C}(K; \mathbf{R})$, but let us first consider the special case $K = [a, b]$ of one variable and prove the classical Weierstrass theorem.³

3 First proved in 1885 using the summability kernel Wt of Exercise 2.27 by one of the fathers of modern analysis, the German mathematician Karl Weierstrass, who taught at Gewerbeinstitut in Berlin.


We are going to present the constructive proof based on the Bernstein polynomials⁴ $B_n f$ associated to a continuous function $f$ on $[0, 1]$:
$$B_n f(x) := \sum_{k=0}^n f\Big(\frac{k}{n}\Big) \binom{n}{k} x^k (1 - x)^{n-k}.$$

For every $n$, the linear operator $B_n$ on $\mathcal{C}[0, 1]$ is positive; that is, $B_n f \ge 0$ if $f \ge 0$, so that $B_n f \ge B_n g$ if $f \ge g$ and $|B_n f| \le B_n |f|$.

Theorem 2.16. Every continuous function $f$ on $[a, b]$ is the uniform limit on $[a, b]$ of a sequence of polynomials.

Proof. The linear change of variables $x = (b - a)t + a$ allows us to assume $[a, b] = [0, 1]$, and we will prove that $f \in \mathcal{C}[0, 1]$ can be uniformly approximated on $[0, 1]$ by the Bernstein polynomials $B_n f$.

It is clear that $B_n 1 = 1$ and we are going to show that, for $I(x) = x$ and $I^2(x) = x^2$,
$$B_n I = I, \qquad B_n I^2 = \frac{n-1}{n} I^2 + \frac{1}{n} I,$$
and then $B_n I^2 \to I^2$ uniformly on $[0, 1]$, since $\sup_{0 \le x \le 1} |B_n I^2(x) - I^2(x)| \le 1/n$.

Indeed, differentiating
$$(x + y)^n = \sum_{k=0}^n \binom{n}{k} x^k y^{n-k} \tag{2.3}$$
with respect to $x$ and multiplying by $x$, we obtain
$$n x (x + y)^{n-1} = \sum_{k=1}^n k \binom{n}{k} x^k y^{n-k},$$
which for $y = 1 - x$ reads $nx = \sum_{k=1}^n k \binom{n}{k} x^k (1 - x)^{n-k} = n B_n x$.

Also, differentiating (2.3) once again with respect to $x$ and now multiplying by $x^2$,
$$n(n - 1) x^2 (x + y)^{n-2} = \sum_{k=2}^n k(k - 1) \binom{n}{k} x^k y^{n-k}.$$

4 The Russian mathematician Sergei N. Bernstein (1880–1968), who solved Hilbert's nineteenth problem on the analytic solution of elliptic differential equations, found his constructive proof of the Weierstrass theorem using probabilistic methods in 1912: He considered a random variable $X$ with a binomial distribution with parameters $n$ and $x$, so that the expected value $E(X/n)$ is $x$, and he combined the use of the weak law of large numbers, which made it possible to show that $\lim_n P(|f(X/n) - f(x)| > \varepsilon) = 0$ for every continuous function $f$ on $[0, 1]$, with the remark that $E(f(X/n))$ is precisely the polynomial $B_n f(x)$.


Hence, for $y = 1 - x$,
$$n(n - 1)x^2 = \sum_{k=2}^n k(k - 1) \binom{n}{k} x^k (1 - x)^{n-k} = n^2 \sum_{k=2}^n \frac{k^2}{n^2} \binom{n}{k} x^k (1 - x)^{n-k} - n \sum_{k=2}^n \frac{k}{n} \binom{n}{k} x^k (1 - x)^{n-k},$$
or $(n - 1)x^2 = n B_n x^2 - B_n x$, which is equivalent to the identity announced for $B_n x^2$.

To prove the theorem, we may assume that $|f| \le 1$ on $[0, 1]$ and, since $f$ is uniformly continuous, for every $\varepsilon > 0$ we can choose $\delta > 0$ so that $|f(x) - f(y)| \le \varepsilon$ if $|x - y| \le \delta$, and then
$$|f(x) - f(y)| \le \varepsilon + \frac{2}{\delta^2}(x - y)^2$$
also when $|x - y| > \delta$. We look at $y$ as a parameter and $x$ as the variable. Then, from the properties of $B_n$,
$$|B_n f - f(y)| = |B_n(f - f(y)1)| \le B_n(|f - f(y)|) \le B_n\Big(\varepsilon + \frac{2}{\delta^2}(x - y)^2\Big) \le \varepsilon + \frac{2}{\delta^2}(B_n I^2 - 2yI + y^2).$$
Finally, if we evaluate these functions at $y$,
$$|(B_n f)(y) - f(y)| \le \varepsilon + \frac{2}{\delta^2}\big((B_n I^2)(y) - y^2\big) \to \varepsilon$$
uniformly on $y \in [0, 1]$ as $n \to \infty$. Hence $\|B_n f - f\|_{[0,1]} \le 2\varepsilon$ as $n \ge n_0$, for some $n_0$. □

Corollary 2.17. $\mathcal{C}[a, b]$ is separable; that is, it contains a countable dense set.

Proof. Every $g \in \mathcal{C}[a, b]$ is the uniform limit on $[a, b]$ of a sequence of polynomials, and every polynomial $P(x) = \sum_{k=0}^N a_k x^k$ is the uniform limit of a sequence $P_m(x) = \sum_{k=0}^N q_{k,m} x^k$ of polynomials with rational coefficients. Just take $\mathbf{Q} \ni q_{k,m} \to a_k$ in $\mathbf{R}$ ($q_{k,m} \in \mathbf{Q} + i\mathbf{Q}$ in the complex case). Then the collection of these polynomials with rational coefficients is dense and it is countable. □
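The Bernstein construction used in the proof of Theorem 2.16 can be explored numerically. The sketch below is only an illustration under stated assumptions: the function names `bernstein` and `sup_error` are hypothetical, the uniform norm is approximated on a finite grid, and the test function $f(x) = |x - 1/2|$ is an arbitrary choice.

```python
from math import comb

def bernstein(f, n, x):
    """Evaluate the Bernstein polynomial B_n f at a point x in [0, 1]."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

def sup_error(f, n, grid=201):
    """Approximate the uniform distance between B_n f and f on a grid of [0, 1]."""
    pts = [i / (grid - 1) for i in range(grid)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in pts)

# A continuous but non-differentiable sample function.
f = lambda x: abs(x - 0.5)
errors = [sup_error(f, n) for n in (10, 40, 160)]

# The identity B_n I^2 = ((n - 1)/n) I^2 + (1/n) I at one sample point.
n, x = 7, 0.3
lhs = bernstein(lambda t: t * t, n, x)
rhs = (n - 1) / n * x * x + x / n
```

The list `errors` decreases as $n$ grows, in line with the uniform convergence $B_n f \to f$, although the convergence is slow for a function with a corner.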

Exercise 2.9 extends Corollary 2.17 to C(K) if K is a compact metric space.

A subset of a normed space is called total if its linear span is dense. With the argument of Corollary 2.17, if a normed space contains a countable total set $Z$, it is separable, since the set of all linear combinations of elements in $Z$ with rational coefficients is dense.

Obviously the product of two separable normed spaces is also separable, and it is readily seen that a subspace $F$ of a separable space $E$ is also separable. Indeed, if $Z$ is a countable dense subset of $E$, the collection of all the balls $B_E(z, 1/m)$ with $z \in Z$ and $m \in \mathbf{N}$ is countable and covers $E$. By choosing a point from every nonempty set $B_E(z, 1/m) \cap F$, we obtain a countable subset $A$ of $F$ which is dense in $F$ since, for every $y \in F$ and every $m \in \mathbf{N}$, there exists some $z_m \in Z$ such that $\|z_m - y\|_E \le 1/(2m)$ and some $a_m \in A$ such that $\|z_m - a_m\|_E \le 1/(2m)$; then $\|y - a_m\|_E \le 1/m$.

Remark 2.18. The fact of being separable indicates that a normed space $E$ is somehow "not too large". It cannot contain an uncountable set $\{x_\alpha\}_{\alpha \in A}$ such that $\|x_\alpha - x_\beta\|_E \ge \delta$ if $\alpha \neq \beta$, for some $\delta > 0$, since if $\{y_n\}_{n \in \mathbf{N}}$ is dense in $E$, for every $\alpha \in A$, $\|x_\alpha - y_{n(\alpha)}\|_E < \delta/2$ for some $n(\alpha) \in \mathbf{N}$, and the mapping $\alpha \in A \mapsto n(\alpha) \in \mathbf{N}$ is injective.

Next we prove the extension of the Weierstrass Theorem 2.16, due in 1937 to M. H. Stone, which includes the proof of the density of the set of all polynomials in $\mathcal{C}(K)$ when $K$ is a compact subset of $\mathbf{R}^n$.

Theorem 2.19 (Stone-Weierstrass). Let $K$ be a compact space. If a subalgebra $A$ of $\mathcal{C}(K; \mathbf{R})$ separates points and does not vanish at any point of $K$, then $\bar A = \mathcal{C}(K; \mathbf{R})$.

Proof. Let $f \in \mathcal{C}(K; \mathbf{R})$ and consider any positive number $\varepsilon > 0$. We will prove that $\|f - g\|_K < \varepsilon$ for some $g \in \bar A$ in four steps.

(1) If $f \in A$, then $|f| \in \bar A$. Let $a < 0$ and $b > 0$ be such that $f(K) \subset [a, b]$ and $v(x) = |x|$ on $[a, b]$. According to the Weierstrass theorem, we can find $Q_n \in \mathcal{P}[a, b]$ so that
$$\lim_n \|Q_n - v\|_{[a,b]} = 0.$$
Since $Q_n(0) \to v(0) = 0$, the polynomials $P_n = Q_n - Q_n(0)$ are such that $P_n(0) = 0$ and $\|P_n - v\|_{[a,b]} \to 0$ as $n \to \infty$. Hence $P_n(x) = \sum_{k=1}^{N(n)} a_k x^k$, so that $P_n(f) = a_1 f + a_2 f^2 + \cdots + a_{N(n)} f^{N(n)} \in A$ and
$$\|P_n(f) - |f|\|_K \le \|P_n - v\|_{[a,b]} \to 0.$$

(2) If $f, g \in A$, then $\sup\{f, g\}, \inf\{f, g\} \in \bar A$, since according to (1)
$$\sup\{f, g\}(x) := \max\{f(x), g(x)\} = \frac{f + g + |f - g|}{2}(x), \qquad \sup\{f, g\} \in \bar A,$$
and
$$\inf\{f, g\}(x) := \min\{f(x), g(x)\} = \frac{f + g - |f - g|}{2}(x), \qquad \inf\{f, g\} \in \bar A.$$

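(3) If $x \in K$, we can find $g_x \in \bar A$ so that $g_x(x) = f(x)$ and $g_x < f + \varepsilon$ on $K$.

According to Lemma 2.15, for every $y \in K$ we can choose $f_y \in A$ so that
$$f_y(x) = f(x) \quad \text{and} \quad f_y(y) = f(y).$$
By continuity, $f_y < f + \varepsilon$ on a neighborhood $U(y)$ of $y$ and, by compactness, $K = U(y_1) \cup \cdots \cup U(y_m)$ for finitely many points. By (2), applied to the subalgebra $\bar A$, the function $g_x := \inf\{f_{y_1}, \ldots, f_{y_m}\}$ belongs to $\bar A$, and it satisfies both conditions.

(4) Finally, if we choose $g_x$ as in (3) for every $x \in K$, then $g_x > f - \varepsilon$ on a neighborhood $V(x)$ of $x$ and
$$K = V(x_1) \cup \cdots \cup V(x_k).$$
It follows as in (3) that $g := \sup\{g_{x_1}, \ldots, g_{x_k}\} \in \bar A$ satisfies $\|g - f\|_K < \varepsilon$, since $g_{x_1}, \ldots, g_{x_k} < f + \varepsilon$ and every $z \in K$ lies in some $V(x_j)$, where $g(z) \ge g_{x_j}(z) > f(z) - \varepsilon$. □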

There is also a complex form of the Stone-Weierstrass theorem:

Corollary 2.20. Let $A$ be a subalgebra of the Banach space $\mathcal{C}(K; \mathbf{C})$ of all complex-valued continuous functions on a compact space $K$ that separates points and does not vanish at any point of $K$. If $A$ is self-conjugate, that is, $\bar f \in A$ if $f \in A$, then $\bar A = \mathcal{C}(K; \mathbf{C})$.

Proof. Since $A$ is self-conjugate, if $f \in A$ and $u = \Re f$, then also $u = \Im(if) = (f + \bar f)/2 \in A$ and
$$A_0 := \{\Re f;\ f \in A\} = \{\Im f;\ f \in A\}$$
is a subalgebra of $\mathcal{C}(K; \mathbf{R})$, since $u = \Re f$ and $v = \Re g \in A_0$ imply $uv = \Re(fg + f\bar g)/2 \in A_0$.

Moreover $A_0$ separates points and does not vanish at any point, since $x, y \in K$ and $f(x) \neq f(y)$ for some $f \in A$ implies $\Re f(x) \neq \Re f(y)$ or $\Im f(x) \neq \Im f(y)$, and $\Re f(x) \neq 0$ or $\Im f(x) \neq 0$ if $f(x) \neq 0$. Hence $A_0$ is dense in $\mathcal{C}(K; \mathbf{R})$ and, if $f \in \mathcal{C}(K; \mathbf{C})$, we can find two sequences $\{u_n\}$ and $\{v_n\}$ in $A_0$ that approximate $\Re f$ and $\Im f$, so $f_n = u_n + iv_n \in A_0 + iA_0 \subset A$ and $f_n \to f$ uniformly on $K$. □

2.2. Linear operators

2.2.1. Bounded linear operators.

Theorem 2.21. A linear mapping $T : E \to F$ between two normed spaces is continuous if and only if
$$\|Tx\|_F \le C\|x\|_E \tag{2.4}$$
for some finite constant $C \ge 0$.

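Proof. We know from Theorem 2.2 that $T$ is continuous if and only if, for every $\varepsilon > 0$, we can choose $\delta > 0$ so that
$$T(\bar B_E(0, \delta)) \subset \bar B_F(0, \varepsilon).$$
Hence, $\|Tx\|_F \le \varepsilon$ if $\|x\|_E \le \delta$, or $\|Tx\|_F \le C\|x\|_E$ for any $x \in E$, with $C = \varepsilon/\delta$. □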

If a set $A \subset E$ is contained in a ball $\bar B_E(0, R)$, we say that $A$ is bounded. A continuous linear mapping $T$ between two normed spaces $E$ and $F$ is also called a bounded linear operator, since condition (2.4) means that $\|Tx\|_F \le C$ when $x \in \bar B_E(0, 1)$ and $T(\bar B_E(0, 1))$ is bounded in $F$. That is, $T$ is bounded on the unit ball of $E$.

By denoting $\|T\| = \sup_{\|x\|_E \le 1} \|Tx\|_F$, the smallest constant $C$ in (2.4), $T$ is continuous if and only if $\|T\|$ is finite.

Example 2.22 (Fredholm operator⁵). If $K : [a, b] \times [c, d] \to \mathbf{C}$ is a continuous function and $T_K f(x) = \int_c^d K(x, y) f(y)\,dy$, then
$$T_K : \mathcal{C}[c, d] \to \mathcal{C}[a, b]$$
is a bounded linear operator, since $|T_K f(x)| \le (d - c)\|K\|_{[a,b] \times [c,d]} \|f\|_{[c,d]}$, so that $\|T_K f\|_{[a,b]} \le C\|f\|_{[c,d]}$ with $C = (d - c)\|K\|_{[a,b] \times [c,d]}$.
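The estimate of Example 2.22 can be checked on a concrete kernel. The sketch below is a rough numerical illustration, assuming a midpoint-rule quadrature and a sample kernel $K(x, y) = xy$ on $[0, 1] \times [0, 2]$ with $f = \cos$; the function name `fredholm_apply` and all sample data are hypothetical.

```python
from math import cos, sin

def fredholm_apply(K, f, c, d, x, m=2000):
    """Approximate T_K f(x) = integral of K(x, y) f(y) dy over [c, d]
    by the midpoint rule with m subintervals."""
    step = (d - c) / m
    return step * sum(K(x, c + (j + 0.5) * step) * f(c + (j + 0.5) * step)
                      for j in range(m))

# Hypothetical sample data: K(x, y) = x * y on [0, 1] x [0, 2], f = cos.
a, b, c, d = 0.0, 1.0, 0.0, 2.0
K = lambda x, y: x * y
f = cos
sup_K = b * d            # sup of |x y| on the rectangle
sup_f = 1.0              # sup of |cos| on [0, 2]

xs = [a + i * (b - a) / 50 for i in range(51)]
sup_Tf = max(abs(fredholm_apply(K, f, c, d, x)) for x in xs)
bound = (d - c) * sup_K * sup_f

# For this kernel T_K f(x) = x * (2 sin 2 + cos 2 - 1), so the sup is attained at x = 1.
exact = 2 * sin(2) + cos(2) - 1
```

In this example `sup_Tf` is far below `bound`, which is expected: the constant $(d - c)\|K\|$ is only an upper estimate for the operator norm.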

5 The Swedish mathematician Erik Ivar Fredholm, in 1900, created the first theory on linear equations in infinite-dimensional spaces, establishing the modern theory of integral equations
$$f(x) + \int_0^1 K(x, y) f(y)\,dy = g(x) \quad (0 \le x \le 1)$$
as the limiting case of linear systems
$$f(x_i) + \sum_{j=1}^n K(x_i, y_j) f(y_j) = g(x_i) \quad (1 \le i \le n),$$
and found his "alternative", which we will meet in Theorem 4.33. He based his fundamental paper published in 1903 on the determinant named after him which is associated to the Fredholm operators. His method was immediately followed by the work of Hilbert, Schmidt, Poincaré, F. Riesz, and many others.


Example 2.23 (Volterra operator⁶). Similarly, if $K$ is a continuous function on the triangle $\Delta := \{(x, y) \in [a, b] \times [a, b];\ a \le y \le x \le b\}$ and
$$T_K f(x) := \int_a^x K(x, y) f(y)\,dy,$$
then
$$T_K : \mathcal{C}[a, b] \to \mathcal{C}[a, b]$$
is a bounded linear operator, and $\|T_K\| \le (b - a)\|K\|_\Delta$.

Let $T : E \to F$ be a bijective linear mapping between two normed spaces. We say that $T$ is an isomorphism of normed spaces if and only if $T$ and $T^{-1}$ are continuous. By (2.4), the continuity of $T$ and $T^{-1}$ means that we can find two constants $\alpha, \beta > 0$ that satisfy
$$\alpha\|x\|_E \le \|Tx\|_F \le \beta\|x\|_E.$$

Two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ on a vector space $E$ are said to be equivalent if they define the same topology, that is, if the identity $I : (E, \|\cdot\|_1) \to (E, \|\cdot\|_2)$ is an isomorphism, so that
$$\alpha\|x\|_1 \le \|x\|_2 \le \beta\|x\|_1. \tag{2.5}$$

Example 2.24. On $\mathcal{C}[a, b]$, the usual norm $\|f\|_{[a,b]}$ and $\|f\|_1 = \int_a^b |f(t)|\,dt$ satisfy
$$\|f\|_1 \le (b - a)\|f\|_{[a,b]},$$
and we say that $\|\cdot\|_{[a,b]}$ is finer than $\|\cdot\|_1$, since $I : (E, \|\cdot\|_{[a,b]}) \to (E, \|\cdot\|_1)$ is continuous and the topology of $\|\cdot\|_{[a,b]}$ is finer than the one of $\|\cdot\|_1$.

It is readily checked that, if $|\cdot|$ is the Euclidean norm (2.2), $\|x\|_\infty := \max_{j=1}^n |x_j|$, and $\|x\|_1 := \sum_{j=1}^n |x_j|$, then
$$\|x\|_\infty \le |x| \le \sqrt{n}\,\|x\|_\infty, \qquad \|x\|_\infty \le \|x\|_1 \le n\|x\|_\infty.$$
Next we will prove that, in fact, the norms on $\mathbf{K}^n$ are all equivalent.

Theorem 2.25. If $E$ is a normed space of finite dimension $n$, then every linear bijection $T : \mathbf{K}^n \to E$ is an isomorphism.⁷ Thus, $E$ is complete and, on $E$, two norms are always equivalent.

6 In 1896 the Italian mathematician and physicist Vito Volterra published papers on what is now called "an integral equation of Volterra type". His main contributions in the area of integral and integro-differential equations are contained in his 1930 book "Theory of Functionals and of Integral and Integro-Differential Equations".

7 This result holds for every topological vector space. The case n = 1 is included in Exercise 4.5. The proof is by induction on n (cf. Berberian [4, (23.1)]).


Proof. On $\mathbf{K}^n$ we consider the Euclidean norm $|\cdot|$ and the norm $\|x\| := \|Tx\|_E$. If $x = \sum_{j=1}^n x^j u_j$, where $\{u_1, \ldots, u_n\}$ is the canonical basis in $\mathbf{K}^n$, then
$$\|x\| \le \sum_{j=1}^n |x^j|\,\|u_j\| \le C|x| \qquad \Big(C = \sum_{j=1}^n \|u_j\|\Big),$$
since $|x^j| \le |x|$, so that $\|Tx\|_E \le C|x|$.

To prove the reverse estimate, we will use the fact that the unit Euclidean sphere $S = \{x;\ |x| = 1\}$ is compact and that the function $f(x) := \|x\|$ is continuous on $S$, since $|f(x) - f(y)| \le \|x - y\| \le C|x - y|$. This function has a minimum value $c = \min f = \|x_0\| > 0$ for some $x_0 \in S$, and then $\|x/|x|\| \ge c$, so that $|x| \le c^{-1}\|x\|$. □
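The elementary inequalities between $\|\cdot\|_\infty$, $|\cdot|$ and $\|\cdot\|_1$ stated before Theorem 2.25 can be spot-checked on random vectors. This is a minimal sketch, not a proof; the helper names and the random sample are assumptions made for illustration, and a small tolerance absorbs rounding.

```python
import math
import random

def norm_inf(x):
    return max(abs(t) for t in x)

def norm_1(x):
    return sum(abs(t) for t in x)

def norm_2(x):
    return math.sqrt(sum(abs(t) ** 2 for t in x))

def check_equivalence(x, tol=1e-12):
    """Check ||x||_inf <= |x| <= sqrt(n) ||x||_inf and ||x||_inf <= ||x||_1 <= n ||x||_inf."""
    n = len(x)
    return (norm_inf(x) <= norm_2(x) + tol
            and norm_2(x) <= math.sqrt(n) * norm_inf(x) + tol
            and norm_inf(x) <= norm_1(x) + tol
            and norm_1(x) <= n * norm_inf(x) + tol)

random.seed(0)
samples = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(1000)]
all_ok = all(check_equivalence(x) for x in samples)
```

The constants $\sqrt{n}$ and $n$ are attained, for instance, by the vector with all coordinates equal to 1, so they cannot be improved.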

Every compact subset $K$ of a normed space $E$ is closed, and it is bounded, since $K \subset \bigcup_{m=1}^\infty B_E(0, m)$, so that $K \subset \bigcup_{m=1}^N B_E(0, m) = B_E(0, N)$ for some $N > 0$. The converse is also true when $E$ is a finite-dimensional space, since $K$ is then homeomorphic to a closed and bounded set of a Euclidean space $\mathbf{K}^n$, which is compact by the Heine-Borel theorem.

An important fact is that this property characterizes finite-dimensional normed spaces; that is, if the unit ball $B_E$ is compact, then $\dim E < \infty$. The proof will be based on the existence of "nearly orthogonal elements" to any closed subspace of a normed space.

Lemma 2.26 (F. Riesz). Suppose $E$ is a normed space and $M$ a closed subspace, $M \neq E$, and let $0 < \varepsilon < 1$. Then there exists $u \in E$ such that $\|u\|_E = 1$ and $d(u, M) \ge 1 - \varepsilon$.

Proof. Let $v \in E \setminus M$, $d = d(v, M) > 0$ ($M$ is closed), and choose $m_0 \in M$ so that $\|v - m_0\|_E \le d/(1 - \varepsilon)$. The element $u = (v - m_0)/\|v - m_0\|_E$ satisfies the required conditions, since, if $m \in M$,
$$\|u - m\|_E = \|v - (m_0 + \|v - m_0\|_E\, m)\|_E / \|v - m_0\|_E \ge 1 - \varepsilon.$$ □

Remark 2.27. If we have an increasing sequence of closed subspaces $M_n$ of a normed space, then there exists a sequence $\{u_n\}$ such that $u_n \in M_n$, $\|u_n\|_E = 1$, and $d(u_{n+1}, M_n) \ge 1/2$, so that $\{u_n\}$ has no Cauchy subsequence, since $\|u_p - u_q\|_E \ge 1/2$ if $p \neq q$. A similar remark holds for a decreasing sequence of closed subspaces.

Theorem 2.28. If the unit sphere $S_E = \{x \in E;\ \|x\|_E = 1\}$ of a normed space $E$ is compact, then $E$ is of finite dimension.


Proof. Assume that $E$ has a sequence $\{x_n\}$ of linearly independent elements. We can apply Remark 2.27 to the subspaces $M_n = [x_1, \ldots, x_n]$, which are closed since they are complete by Theorem 2.25. Then $\{u_n\} \subset S_E$ has no convergent subsequence. □

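2.2.2. The space of bounded linear operators. With the usual vector operations, the set $\mathcal{L}(E; F)$ of all bounded linear operators between two normed spaces $E$ and $F$ is a vector space, and it becomes a normed space with the operator norm
$$\|T\| := \sup_{\|x\|_E \le 1} \|Tx\|_F,$$
since all the properties of a norm are satisfied:

1. If $\|T\| = 0$, then $\|Tx\|_F \le \|T\|\,\|x\|_E = 0$ for all $x \in E$ and $T = 0$,
2. $\|\lambda T\| = \sup_{\|x\|_E \le 1} |\lambda|\,\|Tx\|_F = |\lambda| \sup_{\|x\|_E \le 1} \|Tx\|_F = |\lambda|\,\|T\|$, and
3. $\|(T_1 + T_2)x\|_F \le (\|T_1\| + \|T_2\|)\|x\|_E$ and then $\|T_1 + T_2\| \le \|T_1\| + \|T_2\|$.

Moreover, the norm of the product $ST$ of two bounded linear operators $T : E \to F$ and $S : F \to G$ is submultiplicative,
$$\|ST\| \le \|S\|\,\|T\|, \tag{2.6}$$
since $\|STx\|_G \le \|S\|\,\|Tx\|_F \le \|S\|\,\|T\|\,\|x\|_E$.

Note that
$$\|T\| = \sup_{\|x\|_E < 1} \|Tx\|_F = \sup_{\|x\|_E = 1} \|Tx\|_F,$$
since, if $\|x\|_E \le 1$,
$$\|Tx\|_F = \lim_{\varepsilon \downarrow 0} (1 - \varepsilon)\|Tx\|_F = \lim_{\varepsilon \downarrow 0} \|T((1 - \varepsilon)x)\|_F \le \sup_{\|x\|_E < 1} \|Tx\|_F$$
and then $\|T\| = \sup_{\|x\|_E < 1} \|Tx\|_F$. Also, if $0 < \|x\|_E < 1$,
$$\|Tx\|_F = \|x\|_E\, \|T(x/\|x\|_E)\|_F \le \sup_{\|x\|_E = 1} \|Tx\|_F,$$
and $\|T\| = \sup_{\|x\|_E = 1} \|Tx\|_F$.

Theorem 2.29. If $F$ is a Banach space, then $\mathcal{L}(E; F)$ is also complete.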

Proof. Let $\{T_n\}$ be a Cauchy sequence in $\mathcal{L}(E; F)$. For every $x \in E$, $\{T_n x\}$ is a Cauchy sequence in the complete space $F$, since $\|T_p x - T_q x\|_F \le \|T_p - T_q\|\,\|x\|_E$. We define $Tx := \lim T_n x$ and, by the continuity of the vector operations, $T : E \to F$ is linear. Moreover, for every $\varepsilon > 0$ we can find $N > 0$ so that
$$\|T_p x - T_q x\|_F \le \|T_p - T_q\|\,\|x\|_E \le \varepsilon \qquad \forall p, q \ge N,\ \forall \|x\|_E \le 1$$
and, by letting $q \to \infty$, $\|T_p x - Tx\|_F \le \varepsilon$. Therefore $T \in \mathcal{L}(E; F)$ and $\|T_p - T\| \to 0$ if $p \to \infty$. □

In the special case $F = \mathbf{K}$, the Banach space of all bounded linear forms $E' := \mathcal{L}(E; \mathbf{K})$ with the norm
$$\|u\|_{E'} = \sup_{\|x\|_E \le 1} |u(x)|$$
is called the dual space of $E$.

We will use the notation $\mathcal{L}(E)$ for $\mathcal{L}(E; E)$. Every element of $\mathcal{L}(E)$ is called a bounded linear operator on $E$.

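2.2.3. Neumann series. Let us consider the problem of solving the equation
$$Tu - \lambda u = v \qquad (T \in \mathcal{L}(E),\ v \in E),$$
where $E$ is any Banach space and $v \in E$ and $0 \neq \lambda \in \mathbf{K}$ are given. We may obtain the inverse of $T - \lambda I = -\lambda(I - T/\lambda)$ using the Neumann series⁸
$$-\frac{1}{\lambda} \sum_{n=0}^\infty \frac{1}{\lambda^n} T^n,$$
similar to a numerical geometric series. If $\|T\|/|\lambda| < 1$,
$$\sum_{n=0}^\infty \|(1/\lambda^n) T^n\| \le \sum_{n=0}^\infty \|(1/\lambda) T\|^n = \sum_{n=0}^\infty (\|T\|/|\lambda|)^n < \infty,$$
and the Neumann series is convergent. It is shown that
$$(T - \lambda I)^{-1} = -\frac{1}{\lambda} \sum_{n=0}^\infty \frac{1}{\lambda^n} T^n \qquad (\|T\| < |\lambda|), \tag{2.7}$$
by checking that, if $S$ is the sum of the series, $S(T - \lambda I) = (T - \lambda I)S = I$. For instance, if $S_N := -\sum_{n=0}^N \lambda^{-n-1} T^n$, a partial sum of the series, then $S(T - \lambda I) = \lim_N S_N(T - \lambda I)$ by (2.6), and
$$S_N(T - \lambda I) = -\sum_{n=0}^N \frac{T^{n+1}}{\lambda^{n+1}} + \sum_{n=0}^N \frac{T^n}{\lambda^n} = I - \frac{T^{N+1}}{\lambda^{N+1}} \to I$$
when $N \uparrow \infty$, since we have $\|T^n/\lambda^n\| \le (\|T\|/|\lambda|)^n \to 0$.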
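In finite dimensions the Neumann series (2.7) can be summed directly. The following is a minimal sketch, assuming $E = \mathbf{R}^2$ with the sup norm, for which the operator norm of a matrix is its maximal absolute row sum; the helper names and the sample matrix are hypothetical and the matrix arithmetic is written out in plain Python.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    return [[c * a for a in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def norm_inf(A):
    """Operator norm on (K^n, sup norm): the maximal absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def neumann_inverse(T, lam, terms=200):
    """Partial sum S_N = -sum_{n=0}^{N} lam^(-n-1) T^n of the series (2.7)."""
    n = len(T)
    S = [[0.0] * n for _ in range(n)]
    power = identity(n)              # T^0
    coeff = -1.0 / lam               # -lam^(-1)
    for _ in range(terms + 1):
        S = mat_add(S, mat_scale(coeff, power))
        power = mat_mul(power, T)    # next power of T
        coeff /= lam                 # next coefficient -lam^(-n-1)
    return S

T = [[0.2, 0.5], [0.1, 0.3]]         # sample operator with norm_inf(T) < 1
lam = 1.0
S = neumann_inverse(T, lam)
shifted = mat_add(T, mat_scale(-lam, identity(2)))            # T - lam I
residual = mat_add(mat_mul(S, shifted), mat_scale(-1.0, identity(2)))
```

Since $\|T\| < |\lambda|$ here, `residual`, which represents $S_N(T - \lambda I) - I = -T^{N+1}/\lambda^{N+1}$, is negligible for the chosen number of terms.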

An interesting application refers to the example of the Volterra operators defined in Example 2.23. If TK and TH are defined by the integral kernels

8The German mathematician Carl G. Neumann, who worked on the Dirichlet principle, used this series in 1877 in the context of potential theory.


$K, H \in \mathcal{C}(\Delta)$, then $T_K T_H$ is also a Volterra operator $T_L$, defined by the composition $L$ of $K$ with $H$,
$$L(x, z) = \int_a^x K(x, y) H(y, z)\,dy,$$
since, assuming that $K(x, y) = H(x, y) = 0$ if $y > x$,
$$T_K T_H f(x) = \int_a^b K(x, y)\, T_H f(y)\,dy = \int_a^b K(x, y) \int_a^b H(y, z) f(z)\,dz\,dy = \int_a^b \Big( \int_a^b K(x, y) H(y, z)\,dy \Big) f(z)\,dz$$
and, if $a \le x$

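Theorem 2.30. Let $T_K : \mathcal{C}[a, b] \to \mathcal{C}[a, b]$ be the Volterra operator defined by the kernel $K \in \mathcal{C}(\Delta)$. Then, for any $\lambda \neq 0$,
$$(T_K - \lambda I)^{-1} = -\frac{1}{\lambda} \sum_{n=0}^\infty \frac{T_K^n}{\lambda^n}, \tag{2.9}$$
and the series is absolutely convergent in $\mathcal{L}(\mathcal{C}[a, b])$.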


Proof. From (2.8),
$$\|T_K^n\| \le (b - a)\|K_n\| \le \frac{M^n (b - a)^n}{(n - 1)!}$$
and
$$\|(1/\lambda)^n T_K^n\| \le \frac{M^n (b - a)^n}{(n - 1)!} \frac{1}{|\lambda|^n},$$
which is the general term of a convergent numerical series. The identity (2.9) is obtained as (2.7). □
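A simple special case may help to see why the series (2.9) converges for every $\lambda \neq 0$. The sketch below assumes the kernel $K \equiv 1$ on $[0, 1]$, so that $T f(x) = \int_0^x f(y)\,dy$, and works on polynomials stored as coefficient lists; the function names are hypothetical. For $v = 1$ one gets $T^n v = x^n/n!$, and the series sums to $u(x) = -\lambda^{-1} e^{x/\lambda}$.

```python
from math import exp

def volterra(p):
    """Apply T f(x) = integral of f over [0, x] to a polynomial given by its
    coefficient list p (p[k] is the coefficient of x^k)."""
    return [0.0] + [c / (k + 1) for k, c in enumerate(p)]

def poly_eval(p, x):
    return sum(c * x**k for k, c in enumerate(p))

def neumann_solve(v, lam, terms=60):
    """Partial sum of u = -(1/lam) * sum_n T^n v / lam^n, as in (2.9)."""
    u = [0.0] * (len(v) + terms)
    term = list(v)                     # T^0 v
    coeff = -1.0 / lam
    for _ in range(terms):
        for k, c in enumerate(term):
            u[k] += coeff * c
        term = volterra(term)          # T^(n+1) v
        coeff /= lam
    return u

lam = 0.5                              # here |lam| < ||T|| = 1, yet the series converges
u = neumann_solve([1.0], lam)
closed_form = -exp(1.0 / lam) / lam    # value of -(1/lam) e^{x/lam} at x = 1
```

The factorial in the denominator of the bound for $\|T_K^n\|$ is what makes the series converge even when $|\lambda| < \|T_K\|$, a situation not covered by (2.7).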

In Theorem 2.48 and Exercise 2.20, the reader will find interesting applications of Theorem 2.30 to find the unique solution for the Cauchy problem of a linear differential equation.

2.3. Hilbert spaces

Let us review some basic facts concerning Hilbert spaces.

2.3.1. Scalar products. A scalar product or inner product in a real or complex vector space $H$ is a $\mathbf{K}$-valued function on $H \times H$,
$$(x, y) \in H \times H \mapsto (x, y)_H \in \mathbf{K},$$
having the following properties:

(1) It is a sesquilinear form, meaning that, for every $x \in H$, $(\cdot, x)_H$ is a linear form on $H$ and $(x, \cdot)_H$ is skewlinear, that is,
$$(x, y_1 + y_2)_H = (x, y_1)_H + (x, y_2)_H, \qquad (x, \lambda y)_H = \bar\lambda (x, y)_H.$$
In the real case, $\mathbf{K} = \mathbf{R}$, $(\cdot, \cdot)_H$ is a bilinear form, since $\bar\lambda = \lambda$.

(2) $(y, x)_H = \overline{(x, y)_H}$, so that, if $\mathbf{K} = \mathbf{R}$, $(\cdot, \cdot)_H$ is symmetric.

(3) $(x, x)_H > 0$ if $x \neq 0$.

By (1), $(x, 0)_H = (0, y)_H = 0$.

Given this scalar product, the associated norm on $H$ is
$$\|x\|_H := (x, x)_H^{1/2}. \tag{2.10}$$

Obviously $\|\lambda x\|_H = |\lambda|\,\|x\|_H$ and $\|x\|_H > 0$ if $x \neq 0$. The subadditivity follows from the fundamental Schwarz inequality⁹
$$|(x, y)_H| \le \|x\|_H \|y\|_H \tag{2.11}$$

9 Also called the Cauchy-Bunyakovsky-Schwarz inequality; first published by the French mathematician Augustin Louis Cauchy for sums (1821), and for integrals stated by the Ukrainian mathematician Viktor Bunyakovsky (1859) and rediscovered by Hermann A. Schwarz (1888), who worked on function theory, differential geometry, and the calculus of variations in Halle, Göttingen, and Berlin.

by considering $\|x + y\|_H^2 = (x + y, x + y)_H$ and the properties of the inner product to obtain
$$\|x + y\|_H^2 = \|x\|_H^2 + \|y\|_H^2 + 2\Re(x, y)_H \le \|x\|_H^2 + \|y\|_H^2 + 2|(x, y)_H| \le \|x\|_H^2 + \|y\|_H^2 + 2\|x\|_H \|y\|_H,$$
that is, $\|x + y\|_H^2 \le (\|x\|_H + \|y\|_H)^2$.

To prove the Schwarz inequality, which is obvious if $y = 0$, in
$$0 \le \|x + \lambda y\|_H^2 = \|x\|_H^2 + |\lambda|^2 \|y\|_H^2 + \bar\lambda (x, y)_H + \lambda (y, x)_H$$
we only need to choose $\lambda = -(x, y)_H / \|y\|_H^2$ and then multiply by $\|y\|_H^2$.

It follows from the Schwarz inequality that the inner product is continuous at any point $(a, b)$ of $H \times H$ since
$$|(x_n, y_n)_H - (a, b)_H| = |(x_n - a, y_n)_H + (a, y_n - b)_H| \le \|x_n - a\|_H \|y_n\|_H + \|a\|_H \|y_n - b\|_H \to 0$$
if $(x_n, y_n) \to (a, b)$ in $H \times H$.

A Hilbert space¹⁰ is a Banach space, $H$, whose norm is induced by an inner product as in (2.10).

Remark 2.31. The completion $\tilde H$ of the normed space $H$ with a norm defined by a scalar product as in (2.10) is a Hilbert space.

If $\tilde x, \tilde y \in \tilde H$ with $x_n \to \tilde x$ and $y_n \to \tilde y$ ($x_n, y_n \in H$), we obtain a scalar product on $\tilde H$ by defining
$$(\tilde x, \tilde y)_{\tilde H} := \lim_{n \to \infty} (x_n, y_n)_H,$$
since $\|x_n\|_H, \|y_m\|_H \le C$ and, by the Schwarz inequality,
$$|(x_n, y_n)_H - (x_m, y_m)_H| \le \|x_n - x_m\|_H \|y_m\|_H + \|x_m\|_H \|y_n - y_m\|_H \to 0$$
as $m, n \to \infty$.

This definition does not depend on the sequences $x_n \to \tilde x$ and $y_n \to \tilde y$ since, if also $x_n' \to \tilde x$ and $y_n' \to \tilde y$, then $\{(x_n, y_n)_H\}$ and $\{(x_n', y_n')_H\}$ are subsequences of a similar one obtained by mixing both of them. Note that
$$\|\tilde x\|_{\tilde H}^2 = \lim_n \|x_n\|_H^2 = \lim_n (x_n, x_n)_H = (\tilde x, \tilde x)_{\tilde H}.$$

10 The name was coined in 1926 by Hilbert's student J. von Neumann, who included the condition of separability in the definition, when working on the mathematical foundation of quantum mechanics. Hilbert used the name of infinite-dimensional Euclidean space when dealing with integral equations, around 1909. It was another student of Hilbert, Erhard Schmidt, beginning his 1905 dissertation in Göttingen, who completed the theory of Hilbert spaces for $\ell^2$ by introducing the language of Euclidean geometry.


Example 2.32. The Euclidean space $\mathbf{K}^n$, with the norm
$$|(x_1, \ldots, x_n)| = \sqrt{|x_1|^2 + \cdots + |x_n|^2}$$
induced by the Euclidean inner product $x \cdot y = \sum_{j=1}^n x_j \bar y_j$, is the simplest Hilbert space.

Example 2.33. Another fundamental example is $L^2(\mu)$, with
$$\|f\|_2 = \Big( \int |f|^2\,d\mu \Big)^{1/2}$$
induced by the scalar product $(f, g)_2 = \int f \bar g\,d\mu$.

Example 2.34. The space $\ell^2$ of all sequences $x = \{x_n\}_{n \in \mathbf{N}}$ (or $\{x_n\}_{n \in \mathbf{Z}}$) that satisfy
$$\|x\|_2^2 = \sum_n |x_n|^2 < \infty$$
is the Hilbert space whose norm is induced by $(x, y)_2 = \sum_n x_n \bar y_n$. It is the $L^2$ space on $\mathbf{N}$ (or $\mathbf{Z}$) with the counting measure.

2.3.2. Orthogonal projections. The elementary properties and terminology of Euclidean spaces extend to any Hilbert space $H$:

It is said that $a, b \in H$ are orthogonal if $(a, b)_H = 0$, and the orthogonal space of a subset $A$ of $H$ is defined as
$$A^\perp = \{z \in H;\ (z, a)_H = 0\ \forall a \in A\}.$$
It is a closed subspace of $H$, since, by the Schwarz inequality, every $(\cdot, a)_H$ is a continuous linear form on $H$ and $A^\perp = \bigcap_{a \in A} \operatorname{Ker}(\cdot, a)_H$, an intersection of closed subspaces.

For a finite number of points $x_1, \ldots, x_n$ that are pairwise orthogonal, the relation
$$\|x_1 + \cdots + x_n\|_H^2 = \|x_1\|_H^2 + \cdots + \|x_n\|_H^2$$
is the Pythagorean theorem, and a useful formula is the parallelogram identity
$$2\|a\|_H^2 + 2\|b\|_H^2 = \|a + b\|_H^2 + \|a - b\|_H^2.$$
They follow immediately from the definition (2.10) of the norm. The parallelogram identity will be useful to prove the existence and uniqueness of an optimal projection on a closed convex set for every point in a Hilbert space:

Theorem 2.35 (Projection theorem). (a) Suppose $C$ is a nonempty closed and convex subset of the Hilbert space $H$ and $x$ any point in $H$. Then there is a unique point $P_C(x)$ in $C$ that satisfies $\|x - P_C(x)\|_H = d(x, C)$, where $d(x, C) = \inf_{y \in C} \|y - x\|_H$.


A point $y \in C$ is this optimal projection $P_C(x)$ of $x$ if and only if
$$\Re(c - y, x - y)_H \le 0 \qquad \forall c \in C. \tag{2.12}$$

(b) If $F$ is a closed subspace of the Hilbert space $H$, then $H = F \oplus F^\perp$ (direct sum), and $x = y + z$ with $y \in F$ and $z \in F^\perp$ if and only if $y = P_F(x)$ and $z = P_{F^\perp}(x)$.

Moreover $P_F$ is a bounded linear operator $P_F : H \to H$ with norm 1 (if $F \neq \{0\}$), $\operatorname{Ker} P_F = F^\perp$, $\operatorname{Im} P_F = F$, $(P_F(x_1), x_2)_H = (x_1, P_F(x_2))_H$, and $P_F^2 = P_F$.

Proof. (a) Let $d = d(x, C)$ and choose $y_n \in C$ so that $d_n = \|x - y_n\|_H \to d$. Since $C$ is convex, $(y_p + y_q)/2 \in C$ and, by an application of the parallelogram identity to $a = (x - y_p)/2$ and $b = (x - y_q)/2$,
$$\frac{1}{2}(d_p^2 + d_q^2) = \Big\|x - \frac{1}{2}(y_p + y_q)\Big\|_H^2 + \frac{1}{4}\|y_p - y_q\|_H^2 \ge d^2 + \frac{1}{4}\|y_p - y_q\|_H^2.$$
By letting $p, q \to \infty$, $d^2 \ge d^2 + \lim_{p,q \to \infty} \frac{1}{4}\|y_p - y_q\|_H^2$ and $\|y_p - y_q\|_H^2 \to 0$. Since $C$ is complete, $y_n \to y \in C$ and $\|x - y\|_H = \lim_n \|x - y_n\|_H = d$.

The uniqueness of the minimizer $y$ follows from the fact that if $z$ is also a minimizer, the foregoing argument shows that $\{y, z, y, z, y, z, \ldots\}$ converges and $y = z$.

To prove (2.12), suppose that $c \in C$ and $0 \le t \le 1$. Then $(1 - t)y + tc \in C$ by the convexity of $C$, and
$$\|x - y\|_H^2 \le \|x - [(1 - t)y + tc]\|_H^2 = \|x - y - t(c - y)\|_H^2,$$
where
$$\|x - y - t(c - y)\|_H^2 = \|x - y\|_H^2 - 2t\,\Re(c - y, x - y)_H + t^2 \|c - y\|_H^2. \tag{2.13}$$
Hence $2\Re(c - y, x - y)_H \le t\|c - y\|_H^2$ and it follows that $\Re(c - y, x - y)_H \le 0$ by letting $t \to 0$.

Conversely, if $\Re(c - y, x - y)_H \le 0$ and in (2.13) we put $t = 1$,
$$\|x - y\|_H^2 - \|x - c\|_H^2 = 2\Re(c - y, x - y)_H - \|y - c\|_H^2 \le 0.$$

(b) Since $F \cap F^\perp = \{0\}$, it is sufficient to decompose every $x \in H$ into $x = y + z$ with $y \in F$ and $z \in F^\perp$. We choose $y = P_F(x)$ and $z = x - y$, so that we need to prove that $z \in F^\perp$. Indeed, by (a) we have $\Re(c - y, z)_H \le 0$ for any $u = c - y \in F$ and also $\Re(\lambda u, z)_H \le 0$, so that $(u, z)_H = 0$ for every $u = c - y \in F$ and $z \in F^\perp$.

Let $y_j = P_F(x_j)$ and $z_j = x_j - y_j$ ($j = 1, 2$). Since $z_j \in F^\perp$,
$$(P_F(x_1), x_2)_H = (y_1, y_2 + z_2)_H = (y_1 + z_1, y_2)_H = (x_1, P_F(x_2))_H.$$
The linearity of $P_F$ follows very easily from this identity, and it is also clear that $P_F^2(x) = P_F(y) = y = P_F(x)$, $\operatorname{Ker} P_F = F^\perp$, and $\operatorname{Im} P_F = F$.

From $\|y + z\|_H^2 = \|y\|_H^2 + \|z\|_H^2$ we obtain $\|P_F(x)\|_H^2 \le \|x\|_H^2$, so that $\|P_F\| \le 1$, with equality if $F \neq \{0\}$, since $\|P_F(y)\|_H = \|y\|_H$ if $y \in F$. □

If $F$ is a closed vector subspace of $H$, we call $F^\perp$ the orthogonal complement of $F$, and the optimal projection $P_F$ is called the orthogonal projection on $F$.

The projection theorem contains the fact that $F^\perp \neq \{0\}$ if $F \neq H$. This will be used to prove Theorem 4.1, the Riesz representation theorem for the dual space of $H$.
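The characterization (2.12) of the optimal projection in Theorem 2.35 can be illustrated in the real Euclidean space $\mathbf{R}^3$. The sketch below assumes the closed convex set $C = [0, 1]^3$, for which the projection is obtained by clipping each coordinate; the names `project_box` and `variational_gap` and the sample point are hypothetical.

```python
import random

def project_box(x, lo=0.0, hi=1.0):
    """Optimal projection of x onto the closed convex box C = [lo, hi]^n
    for the Euclidean norm (coordinatewise clipping)."""
    return [min(max(t, lo), hi) for t in x]

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def variational_gap(x, y, c):
    """The quantity (c - y, x - y) appearing in (2.12), for real vectors."""
    return inner([ci - yi for ci, yi in zip(c, y)],
                 [xi - yi for xi, yi in zip(x, y)])

random.seed(1)
x = [1.7, -0.4, 0.3]
y = project_box(x)                      # the candidate P_C(x)
# Sample points c of C and evaluate the left-hand side of (2.12).
gaps = [variational_gap(x, y, [random.random() for _ in x]) for _ in range(1000)]
```

Every sampled value in `gaps` is nonpositive, as (2.12) requires; geometrically, the angle at $y$ between $x - y$ and $c - y$ is never acute.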

Theorem 2.36. Let $A$ be a subset of $H$. Then the closed linear span $\overline{[A]}$ of $A$ coincides with $A^{\perp\perp}$, so that $A$ is total in $H$ if and only if $A^\perp = \{0\}$. Thus, a vector subspace $F$ of $H$ is closed if and only if $F^{\perp\perp} = F$.

Proof. It is clear that $A^\perp = [A]^\perp$ and, by continuity, $[A]^\perp = \overline{[A]}^\perp$; thus, if $F = \overline{[A]}$, we need to prove that $F^{\perp\perp} = F$.

We have $F \subset F^{\perp\perp}$ and, if $x \notin F$, it follows from Theorem 2.35 that we can choose $z \in F^\perp$ so that $(x, z)_H \neq 0$; just take $z = P_{F^\perp}(x)$. This shows that also $x \notin F^{\perp\perp}$.

Note that $A$ is total when $F = H$ and, by Theorem 2.35, this happens if and only if $A^\perp = F^\perp = \{0\}$. □

2.3.3. Orthonormal bases. A subset S of the Hilbert space H is called an orthonormal system if the elements of S are mutually orthogonal and they are all of norm 1.

Suppose $F = [e_1, \ldots, e_n]$, where $\{e_1, \ldots, e_n\}$ is a finite orthonormal set in $H$. Then
$$P_F(x) = \sum_{k=1}^n (x, e_k)_H e_k$$
since $z = x - \sum_{k=1}^n (x, e_k)_H e_k \in F^\perp$. It follows from $\|P_F(x)\|_H^2 \le \|x\|_H^2$ that
$$\sum_{k=1}^n |(x, e_k)_H|^2 \le \|x\|_H^2.$$

This estimate, which is known as Bessel's inequality, is valid for any orthonormal system $S = \{e_j\}_{j \in J}$; just take the supremum over all the finite sums¹¹:
$$\sum_{j \in J} |(x, e_j)_H|^2 \le \|x\|_H^2. \tag{2.14}$$

The numbers $\hat x(j) := (x, e_j)_H$ are called the Fourier coefficients of $x$ with respect to $S$.

An orthonormal basis of $H$ is a maximal orthonormal system, which is also said to be complete. That is, the orthonormal system $S = \{e_j\}_{j \in J}$ is an orthonormal basis if and only if
$$\hat x(j) := (x, e_j)_H = 0\ \ \forall j \in J \ \Rightarrow\ x = 0,$$
which means that $S^\perp = \{0\}$ and $S$ is total.

By an application of Zorn's lemma, it can be proved that every orthonormal system can be extended to a maximal one.

In our examples, Hilbert spaces will be separable, so that an orthonormal system $S = \{e_j\}_{j \in J}$ is finite or countable, since $\|e_i - e_j\|_H = \sqrt{2}$ if $i \neq j$ (cf. Remark 2.18). We will only consider the separable case and write $\ell^2 = \ell^2(J)$ if $J = \{1, 2, \ldots, N\}$, $\mathbf{N}$, or $\mathbf{Z}$, but the results extend easily to any Hilbert space.

By Bessel's inequality, $x \in H \mapsto \hat x = \{\hat x(j)\} \in \ell^2$ is a linear transform such that $\|\hat x\|_2 \le \|x\|_H$. When $S$ is complete, this mapping is clearly injective, since in this case $\hat x(j) = 0$ for all $j \in J$ implies $x = 0$ by definition. Moreover, every $x \in H$ is recovered from its Fourier coefficients by adding the Fourier series $\sum_{j \in J} \hat x(j) e_j$:

Theorem 2.37 (Fischer-Riesz¹²). Suppose $S = \{e_j\}_{j \in J}$ is an orthonormal system of $H$. Then the following statements are equivalent:

(a) $S$ is an orthonormal basis of $H$.

(b) $x = \sum_{j \in J} \hat x(j) e_j$ in $H$ for every $x \in H$.

(c) $\|x\|_H = \|\hat x\|_2$ for every $x \in H$ or, equivalently, $(x, y)_H = (\hat x, \hat y)_2$ for all $x, y \in H$ (Parseval's relation).

Proof. Suppose $J = \mathbf{N}$ and let $c = \{c_j\} \in \ell^2$.

11 According to footnote 2 in this chapter, if $f(j) = (x, e_j)_H$, then
$$\|f\|_2^2 := \sum_{j \in J} |f(j)|^2 = \sup\Big\{ \sum_{k \in F} |f(k)|^2;\ F \subset J,\ F \text{ finite} \Big\}$$
is the integral of $|f|^2$ relative to the counting measure on $J$, and $f \in \ell^2(J)$ if $\|f\|_2 < \infty$.

12 Found independently in 1907 by the Hungarian mathematician Frigyes Riesz and the Austrian mathematician Ernst Fischer.


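We claim that $z := \sum_{j=1}^\infty c_j e_j$ exists and $\hat z = c$. Indeed, if $S_N = \sum_{j=1}^N c_j e_j$ and $p < q$, then, by the Pythagorean theorem, $\|S_q - S_p\|_H^2 = \sum_{j=p+1}^q |c_j|^2 \to 0$ as $p \to \infty$, so that $\{S_N\}$ is a Cauchy sequence and $z = \lim_N S_N$ exists in $H$. Moreover,
$$\hat z(k) = (\lim_N S_N, e_k)_H = \lim_N (S_N, e_k)_H = c_k,$$
since $(S_N, e_k)_H = c_k$ if $N \ge k$.

Now (b) follows from (a): By Bessel's inequality $\hat x = \{\hat x(j)\} \in \ell^2$ and, if $z = \sum_{j=1}^\infty \hat x(j) e_j$, then $\hat z = \hat x$ and $(z - x)\hat{\ }(j) = 0$ for all $j$, so that $x = z$ by (a).

From (b), $\|x\|_H^2 = \lim_N \|S_N\|_H^2 = \lim_N \sum_{j=1}^N |\hat x(j)|^2 = \|\hat x\|_2^2$. Then also $(x, y)_H = (\hat x, \hat y)_2$ by the polarization identity
$$(x, y)_H = \frac{1}{4}\big( \|x + y\|_H^2 - \|x - y\|_H^2 \big) \tag{2.15}$$
if $\mathbf{K} = \mathbf{R}$, and
$$(x, y)_H = \frac{1}{4}\big( \|x + y\|_H^2 - \|x - y\|_H^2 + i\|x + iy\|_H^2 - i\|x - iy\|_H^2 \big) \tag{2.16}$$
in the complex case. They are both checked by expanding the squared norms as scalar products.

Finally, if $\hat x = 0$, from (c) we obtain $x = 0$ and (a) follows. □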

Remark 2.38. We have proved that, when $\{e_j\}_{j \in J}$ is an orthonormal basis, the linear map $x \in H \mapsto \hat x \in \ell^2$ is a bijective isometry, and that (b) defines its inverse.
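Bessel's inequality (2.14) and Parseval's relation of Theorem 2.37 can be observed in the finite-dimensional Hilbert space $\mathbf{C}^3$. The following is a small sketch under stated assumptions: the orthonormal vectors `e1`, `e2`, `e3` and the vector `x` are arbitrary sample data, and the scalar product is linear in the first variable and skewlinear in the second, as in the text.

```python
import math

def inner(x, y):
    """Scalar product on C^n, linear in x and skewlinear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def coefficients(x, system):
    """Fourier coefficients (x, e_j) of x with respect to an orthonormal system."""
    return [inner(x, e) for e in system]

s = 1 / math.sqrt(2)
# An orthonormal basis of C^3 (hypothetical sample data).
e1 = [s, s * 1j, 0]
e2 = [s, -s * 1j, 0]
e3 = [0, 0, 1]
x = [1 + 2j, -1j, 3]

norm_sq = inner(x, x).real
# With the incomplete system {e1, e2} only Bessel's inequality holds.
bessel_sum = sum(abs(c) ** 2 for c in coefficients(x, [e1, e2]))
# With the full basis the inequality becomes Parseval's equality.
parseval_sum = sum(abs(c) ** 2 for c in coefficients(x, [e1, e2, e3]))
```

The gap between `norm_sq` and `bessel_sum` is exactly $|(x, e_3)|^2$, the squared norm of the component of `x` orthogonal to the span of `e1` and `e2`.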

The identity (b) is the expansion of $x$ in a Fourier series. The classical best-known example is the following:

Example 2.39. In the Hilbert space $L^2(\mathbf{T}) = L^2(0, 2\pi)$, where for convenience we define the scalar product as
$$(f, g)_2 := \frac{1}{2\pi} \int_0^{2\pi} f(t) \overline{g(t)}\,dt = \int_{\mathbf{T}} f \bar g,$$
the trigonometric system $e_k(t) := e^{ikt}$ ($k \in \mathbf{Z}$) is orthonormal. It is well known that it is complete and a proof of this fact, known as the uniqueness theorem for Fourier coefficients, follows from (2.27), where a constructive proof of the density of the trigonometric polynomials is given. Another proof based on the Stone-Weierstrass theorem is contained in Exercise 2.10.


The corresponding Parseval relation$^{13}$ is
$$\frac{1}{2\pi}\int_0^{2\pi} |f(t)|^2\,dt = \sum_{k=-\infty}^{\infty} \Big|\frac{1}{2\pi}\int_0^{2\pi} f(t)e^{-ikt}\,dt\Big|^2.$$
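The Parseval relation can be probed numerically. The following is only a sketch under stated assumptions: the test function $f(t) = |t - \pi|$, the grid size, and the truncation $|k| \le 200$ are illustrative choices that do not come from the text, and the integrals are replaced by means over a uniform grid.

```python
import numpy as np

# Uniform grid on [0, 2*pi); a mean over the grid approximates (1/2pi) * integral.
n = 4096
t = 2 * np.pi * np.arange(n) / n

# Hypothetical test function: the continuous 2*pi-periodic triangle wave |t - pi|.
f = np.abs(t - np.pi)

def fourier_coefficient(samples, k):
    """Approximate c_k = (1/2pi) * int_0^{2pi} f(t) e^{-ikt} dt by a grid mean."""
    return np.mean(samples * np.exp(-1j * k * t))

# Left side of the Parseval relation: (1/2pi) * int |f|^2.
lhs = np.mean(np.abs(f) ** 2)

# Right side, truncated to |k| <= 200 (an assumption; the tail is small for this f).
rhs = sum(abs(fourier_coefficient(f, k)) ** 2 for k in range(-200, 201))

print(lhs, rhs, np.pi ** 2 / 3)
```

For this particular $f$ the left side equals $\pi^2/3$, so the three printed numbers should nearly agree; the small residual comes from the quadrature and from the truncated tail of the series.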

2.4. Convolutions and summability kernels

We are going to consider examples of linear operators $T$ between complex $L^p$ spaces on $\sigma$-finite measure spaces $X$ and $Y$. The reader may assume that $X$ and $Y$ are two Borel subsets of $\mathbf{R}^n$ and $\mathbf{R}^m$, respectively, with the corresponding Lebesgue measures.

Assume that the domain $D(T)$ of $T$ contains all complex integrable simple functions on $X$ and that $T$ takes values that are measurable functions on $Y$. If there exists a constant $M > 0$ such that
$$\|Tf\|_q \le M\|f\|_p \qquad (f \in D(T) \cap L^p(X)),$$
we say that $T$ is of type $(p,q)$ with constant $M$. As usual, we assume $1 \le p, q \le \infty$.

If $T$ is of type $(p,q)$ and $D(T) \cap L^p(X)$ is a dense vector subspace of $L^p(X)$, we keep the same notation $T$ to represent the uniquely determined continuous extension $T : L^p(X) \to L^q(Y)$ of this operator.

2.4.1. Integral operators. Let $K(x,y)$ be an integral kernel, a complex-valued measurable function on $X \times Y$. The associated integral operator $T_K$ is defined by
$$T_Kf(x) := \int_Y K(x,y)f(y)\,dy.$$

Theorem 2.40. (a) Under the condition
$$\int_Y |K(x,y)|\,dy \le C < \infty \quad \text{a.e. on } X, \tag{2.17}$$
$T_K$ is well-defined on $L^\infty$ and it is of type $(\infty,\infty)$ with constant $C$.

(b) If
$$\int_X |K(x,y)|\,dx \le C \quad \text{a.e. on } Y, \tag{2.18}$$
then $T_K$ is well-defined on $L^1(X)$ and it is of type $(1,1)$ with constant $C$.

(c) If both requirements (2.17) and (2.18) are satisfied, then $T_K$ is a bounded linear operator $T_K : L^p(X) \to L^p(Y)$ for every $p \in [1,\infty]$, and $\|T_K\| \le C$.

13 In 1806 Marc-Antoine Parseval published an identity for series as a self-evident fact, which he later applied to the Fourier series.


Proof. The first result follows from
$$|Tf(x)| \le \int_Y |K(x,y)||f(y)|\,dy \le \|f\|_\infty \int_Y |K(x,y)|\,dy$$
and from assumption (2.17).

Similarly, $\|Tf\|_1 \le \int_X\int_Y |K(x,y)||f(y)|\,dy\,dx \le C\|f\|_1$ in the second case.

A direct proof of the remaining case (c) is left as an exercise (Exercise 2.29). It will also be a trivial corollary of the Riesz-Thorin Theorem 2.45; see Exercise 2.30. $\square$
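Part (c) of Theorem 2.40 can be tested in a finite model. The sketch below assumes that $X$ and $Y$ are finite sets with counting measure, so that $K$ is a matrix, (2.17) bounds the absolute row sums and (2.18) the absolute column sums; the random kernel, the exponents and the number of trials are illustrative choices, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical kernel on X = {0..39}, Y = {0..59} with counting measure.
K = rng.standard_normal((40, 60))

# Smallest C satisfying both (2.17) (row sums) and (2.18) (column sums).
C = max(np.abs(K).sum(axis=1).max(), np.abs(K).sum(axis=0).max())

def worst_ratio(p, trials=200):
    """Largest observed ||K f||_p / ||f||_p over random test vectors f on Y."""
    best = 0.0
    for _ in range(trials):
        f = rng.standard_normal(60)
        best = max(best, np.linalg.norm(K @ f, p) / np.linalg.norm(f, p))
    return best

ratios = {p: worst_ratio(p) for p in (1, 1.5, 2, 4, np.inf)}
for p, r in ratios.items():
    print(f"p = {p}: observed ratio {r:.3f}, bound C = {C:.3f}")
```

Random sampling only gives a lower estimate of the operator norm, so the experiment can refute the bound $\|T_K\| \le C$ but cannot prove it; that is the role of Exercise 2.29 and of the Riesz-Thorin theorem.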

Convolution operators are special instances of integral operators. Recall that the convolution $f * g$ of two functions $f$ and $g$ on $\mathbf{R}^n$ is defined by
$$(f * g)(x) := \int_{\mathbf{R}^n} f(x-y)g(y)\,dy.$$
One has to be careful to make sure that this integral is meaningful a.e. and that it defines a measurable function. In this case we say that $f$ and $g$ are convolvable.

The convolution operator $f*$ is the integral operator associated to the integral kernel $K(x,y) = f(x-y)$, which is measurable on $\mathbf{R}^{2n}$ if $f$ is measurable on $\mathbf{R}^n$. Indeed, if $F(x,y) = f(y)$, then $K = F \circ T$ is measurable on $\mathbf{R}^{2n}$ and $T(x,y) = (x+y, x-y)$ is a homeomorphism of $\mathbf{R}^{2n}$.

The following properties for convolvable functions are readily obtained from the definition:

(a) $f * g = g * f$ a.e.

(b) $\{f * g \ne 0\} \subset \{f \ne 0\} + \{g \ne 0\}$, so that, if $\operatorname{supp} f$ is compact, then
$$\operatorname{supp} f * g \subset \operatorname{supp} f + \operatorname{supp} g. \tag{2.19}$$

(c) $\partial_k(f * g) = f * \partial_k g$ if $f \in L^1(\mathbf{R}^n)$ and $g \in \mathcal{C}^1(\mathbf{R}^n)$ is bounded with a bounded partial derivative $\partial_k g$.

To prove (b), note that if $x \notin \{f \ne 0\} + \{g \ne 0\}$, then
$$(f * g)(x) = \int_{\{g \ne 0\}} f(x-y)g(y)\,dy$$
and for every $y \in \{g \ne 0\}$ we obtain $f(x-y) = 0$, so that $(f * g)(x) = 0$. If $\operatorname{supp} f$ is compact, then $\operatorname{supp} f + \operatorname{supp} g$ is closed, since from $x = \lim_n (a_n + b_n)$ with $a_n \in \operatorname{supp} f$ and $b_n \in \operatorname{supp} g$, we obtain $a_{n_k} \to a \in \operatorname{supp} f$ and $x - a = \lim_k (a_{n_k} + b_{n_k} - a_{n_k}) = b \in \operatorname{supp} g$.


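As a corollary of Theorem 2.40 with $K(x,y) = g(x-y)$ and $C = \|g\|_1$, we obtain Young's inequality
$$\|f * g\|_p \le \|f\|_p\|g\|_1 \qquad (1 \le p \le \infty,\ f \in L^p(\mathbf{R}^n),\ g \in L^1(\mathbf{R}^n)). \tag{2.20}$$
By Hölder's inequality, we also have
$$\|f * g\|_\infty \le \|f\|_p\|g\|_{p'} \qquad (1 \le p \le \infty,\ f \in L^p(\mathbf{R}^n),\ g \in L^{p'}(\mathbf{R}^n)). \tag{2.21}$$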

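2.4.2. Summability kernels on $\mathbf{R}^n$. A summability kernel on $\mathbf{R}^n$ is a family $\{K_\lambda\}_{\lambda\in\Lambda}$ of integrable functions which satisfy

(1) $\Lambda \subset (0,\infty)$ and $0 \in \bar{\Lambda}$,

(2) $\int_{\mathbf{R}^n} K_\lambda(x)\,dx = 1$,

(3) $\sup_\lambda \|K_\lambda\|_1 < \infty$,

(4) $\lim_{\lambda\to 0} \int_{|x|\ge R} |K_\lambda(x)|\,dx = 0$ for all $R > 0$.

Of course, for positive summability kernels assumption (3) is redundant. The following result justifies our also saying that a summability kernel is an approximation of the identity:

Theorem 2.41. Let $\{K_\lambda\}_{\lambda\in\Lambda}$ be a summability kernel on $\mathbf{R}^n$.

(a) If $f$ is a continuous function on $\mathbf{R}^n$ and $\lim_{|x|\to\infty} f(x) = 0$, then $\lim_{\lambda\to 0} K_\lambda * f = f$ uniformly on $\mathbf{R}^n$.

(b) If $f \in L^p(\mathbf{R}^n)$ for some $1 \le p < \infty$, then $\lim_{\lambda\to 0} \|K_\lambda * f - f\|_p = 0$.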
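Before the proof, part (a) can be illustrated numerically on $\mathbf{R}$. The sketch below makes several assumptions that are not in the statement: the kernel family is the dilated Gaussian $K_\lambda(y) = \lambda^{-1}e^{-\pi(y/\lambda)^2}$, the test function is $f(x) = 1/(1+x^2)$, the supremum is only sampled on $[-5,5]$, and the convolution integral is replaced by a Riemann sum on $[-10,10]$.

```python
import numpy as np

def f(x):
    # Continuous on R and vanishing at infinity.
    return 1.0 / (1.0 + x ** 2)

def gauss_kernel(y, lam):
    # Assumed family: K_lam(y) = (1/lam) * exp(-pi * (y/lam)^2), with integral 1.
    return np.exp(-np.pi * (y / lam) ** 2) / lam

x = np.linspace(-5.0, 5.0, 201)       # sample points for the sup norm
y = np.linspace(-10.0, 10.0, 10001)   # quadrature nodes for the convolution
h = y[1] - y[0]

def sup_error(lam):
    """Approximate sup_x |(K_lam * f)(x) - f(x)| by a Riemann sum in y."""
    integrand = f(x[:, None] - y[None, :]) * gauss_kernel(y, lam)[None, :]
    conv = integrand.sum(axis=1) * h
    return np.max(np.abs(conv - f(x)))

errors = [sup_error(lam) for lam in (1.0, 0.5, 0.25, 0.1)]
print(errors)
```

The printed errors should shrink as $\lambda \to 0$, in line with the uniform convergence asserted in (a).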

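Proof. (a) For $M > 0$, let
$$I(M) := \int_{|y|\le M} |f(x-y) - f(x)||K_\lambda(y)|\,dy$$
and
$$J(M) := \int_{|y|\ge M} |f(x-y) - f(x)||K_\lambda(y)|\,dy.$$
Then, using property (2) of a summability kernel,
$$|(f * K_\lambda)(x) - f(x)| = \Big|\int_{\mathbf{R}^n} [f(x-y) - f(x)]K_\lambda(y)\,dy\Big| \le I(M) + J(M).$$
For any $\varepsilon > 0$, since $f$ is uniformly continuous, we can find $M$ such that, by (3),
$$I(M) \le \sup_{|y|\le M,\ x\in\mathbf{R}^n} |f(x-y) - f(x)|\,\|K_\lambda\|_1 \le \frac{\varepsilon}{2}.$$
Then, by property (4), we can select $\delta > 0$ so that, if $|\lambda| \le \delta$,
$$J(M) \le 2\|f\|_\infty \int_{|y|\ge M} |K_\lambda(y)|\,dy \le \frac{\varepsilon}{2}$$
and it follows that $|f * K_\lambda(x) - f(x)| \le \varepsilon$ for all $x \in \mathbf{R}^n$.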


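(b) Using (2) as before, we obtain
$$\|K_\lambda * f - f\|_p^p \le \int_{\mathbf{R}^n}\Big(\int_{\mathbf{R}^n} |(\tau_yf - f)(x)||K_\lambda(y)|\,dy\Big)^p dx$$
and, by writing $|K_\lambda| = |K_\lambda|^{1/p}|K_\lambda|^{1/p'}$ if 1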

It is readily checked that a summability kernel on $\mathbf{R}^n$ is obtained from a single positive integrable function $K$ such that $\int_{\mathbf{R}^n} K(x)\,dx = 1$ by defining
$$K_t(x) := \frac{1}{t^n}K\Big(\frac{x}{t}\Big) \qquad (t > 0). \tag{2.22}$$

Example 2.42. The Poisson kernel on $\mathbf{R}$ is the summability kernel
$$P_t(x) = \frac{1}{\pi}\frac{t}{t^2 + x^2} \qquad (t > 0),$$
obtained from the function
$$P(x) = \frac{1}{\pi}\frac{1}{1 + x^2}.$$

2.4.3. Periodic summability kernels. Summability kernels can also be considered on a finite interval $I \subset \mathbf{R}$, say $I = [a, a+T)$. To define a convolution, we extend every function $f : I \to \mathbf{C}$ to the whole line $\mathbf{R}$ by periodicity, and we associate to every $T$-periodic function $f : \mathbf{R} \to \mathbf{C}$ a function $F : \mathbf{T} \to \mathbf{C}$ on $\mathbf{T} = \{z \in \mathbf{C};\ |z| = 1\}$ by the relation
$$f(t) = F(e^{2\pi it/T}).$$
This is a bijective correspondence, and $F \in \mathcal{C}(\mathbf{T})$ if and only if $f \in \mathcal{C}_T(\mathbf{R})$, this notation meaning that $f$ is continuous on $\mathbf{R}$ and $T$-periodic.


We will also write $L^p(\mathbf{T})$ to represent $L^p(a, a+T)$ or $L^p_T(\mathbf{R})$, the linear space of all $T$-periodic functions which are in $L^p$ when restricted to an interval $(a, a+T)$. In the case $1 \le p < \infty$, since $\mathcal{C}_c(a, a+T)$ is dense in $L^p(a, a+T)$, $\mathcal{C}(\mathbf{T})$ is also dense in $L^p(\mathbf{T})$.

As an example, note that $F(z) = z^k$ means that $f(t) = e^{2\pi ikt/T}$ ($k \in \mathbf{Z}$) and $P(z) = \sum_{k=-N}^{N} c_kz^k$ represents a trigonometric polynomial. We will denote $e_k(t) = e^{2\pi ikt/T}$, so that $e_k = z^k$ when we identify $f$ and $F$, and $e_{-k} = \bar{z}^k$.

It will be convenient to use the notation
$$\int_{\mathbf{T}} f(t)\,dt := \frac{1}{T}\int_a^{a+T} f(t)\,dt$$
and to define the norm of $L^p(\mathbf{T})$ as $\|f\|_p = \big(\int_{\mathbf{T}} |f(t)|^p\,dt\big)^{1/p}$ if $1 \le p < \infty$. Then $\|e_k\|_p = 1$ and $\|f\|_1 \le \|f\|_p$.

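The convolution on $\mathbf{T}$ is defined by
$$(f * g)(t) = \int_{\mathbf{T}} f(s)g(t-s)\,ds$$
when $f, g \in L^1(\mathbf{T})$.

A summability kernel or approximation of the identity on $\mathbf{T}$ is a family of functions $\{K_\lambda\}_{\lambda\in\Lambda}$ in $L^1(\mathbf{T})$ satisfying the following:

(1) $\Lambda$ is an unbounded subset of $(0,\infty)$ (for convenience we will let $\lambda \to \infty$).

(2) $\int_{\mathbf{T}} K_\lambda(x)\,dx = 1$.

(3) $\sup_\lambda \|K_\lambda\|_1 < \infty$.

(4) $\lim_{\lambda\to\infty} \int_\delta^{T-\delta} |K_\lambda(x)|\,dx = 0$ for all $0 < \delta < \pi$.

The proof of the following result is exactly the same as that of Theorem 2.41: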

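Theorem 2.43. Let $\{K_\lambda\}_{\lambda\in\Lambda}$ be a summability kernel on $\mathbf{T}$.

(a) If $f \in \mathcal{C}(\mathbf{T})$, then $\lim_{\lambda\to\infty} K_\lambda * f = f$ uniformly on $\mathbf{T}$.

(b) If $f \in L^p(\mathbf{T})$ for some $1 \le p < \infty$, then $\lim_{\lambda\to\infty} \|K_\lambda * f - f\|_p = 0$.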

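If $f \in L^1(\mathbf{T})$,
$$f \sim \sum_{k=-\infty}^{\infty} c_k(f)e^{2\pi ikt/T}$$
means that $c_k(f) = \int_{\mathbf{T}} f(t)e_{-k}(t)\,dt$ are the Fourier coefficients of $f$ and that $\sum_{k=-\infty}^{\infty} c_k(f)e^{2k\pi it/T}$ is the classical Fourier series of $f$.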


Our next aim is to show how summability kernels appear when studying the convergence of these Fourier series.

To study the possible convergence of the Fourier sums
$$S_N(f,x) = \sum_{k=-N}^{N}\Big(\int_{\mathbf{T}} f(t)e^{-2\pi ikt/T}\,dt\Big)e^{2\pi ikx/T} = \int_{\mathbf{T}} f(t)\sum_{k=-N}^{N} e^{2\pi ik(x-t)/T}\,dt,$$
by the change of variable $y = 2\pi x/T$ we can and will assume that $T = 2\pi$. We will denote by
$$D_N(t) := \sum_{k=-N}^{N} e^{ikt} = 1 + 2\sum_{n=1}^{N}\cos(nt) \tag{2.23}$$
a sequence of trigonometric polynomials which is called the Dirichlet kernel, to write
$$S_N(f) = f * D_N.$$
By adding the geometric sequence in (2.23), we also obtain, if $0 < |t| \le \pi$,
$$D_N(t) = \frac{e^{i(N+1)t} - e^{-iNt}}{e^{it} - 1} = \frac{\sin[(N + 1/2)t]}{\sin(t/2)}.$$
Note that $\int_{\mathbf{T}} D_N(t)\,dt = 1$, since $\int_{\mathbf{T}} e_k(t)\,dt = 0$ if $k \ne 0$, and $D_N(-t) = D_N(t)$. For every $\delta > 0$, $\{D_N\}$ is uniformly bounded on $\delta \le |t| \le \pi$:
$$|D_N(t)| \le \frac{1}{\sin(\delta/2)} \qquad (\delta \le |t| \le \pi). \tag{2.24}$$

Property (3) of summability kernels fails for the Dirichlet kernel, which is not an approximation of the identity, but we obtain a summability kernel, the Fejér kernel, by making the averages
$$F_N := \frac{1}{N}\sum_{n=0}^{N-1} D_n. \tag{2.25}$$
Indeed,
$$F_N(t) = \frac{1}{N}\frac{\sin^2(Nt/2)}{\sin^2(t/2)}$$
will follow from the identity
$$2\sin\alpha\sin\beta = \cos(\alpha - \beta) - \cos(\alpha + \beta) \tag{2.26}$$
and the properties

(a) $F_N \ge 0$,

(b) $F_N(-t) = F_N(t)$,

(c) $\frac{1}{T}\int_{-T/2}^{T/2} F_N(t)\,dt = 1$, and

(d) $\lim_{N\to\infty} \max_{\delta\le|t|\le T/2} F_N(t) = 0$ for every $0 < \delta$

$$\lim_{N\to\infty} \|f - \sigma_N(f)\|_p = 0. \tag{2.27}$$


This shows that the trigonometric system $\{e_k\}_{k\in\mathbf{Z}}$ is total in $L^p(\mathbf{T})$ ($1 \le p < \infty$) and, as an application, we give a proof of the Riemann-Lebesgue lemma: For every $f \in L^1(\mathbf{T})$, $\lim_{|k|\to\infty} c_k(f) = 0$, since this is obviously true for the trigonometric polynomials $\sigma_N(f)$ and we have $\|c(f - \sigma_N(f))\|_\infty \le \varepsilon$ if $\|f - \sigma_N(f)\|_1 \le \varepsilon$, so that $|c_k(f)| = |c_k(f - \sigma_N(f))| \le \varepsilon$ for every $|k| > N$.

If $f, g \in L^1(\mathbf{T})$ have the same Fourier coefficients, then $f = g$. This fact, known as the uniqueness theorem for Fourier coefficients, also follows from (2.27), since $f = \lim_N \sigma_N(f) = \lim_N \sigma_N(g) = g$.

The closed subspace of $\ell^\infty$ which contains all the sequences $\{c_k\}$ with limit zero will be denoted $c_0$. Then $f \in L^1(\mathbf{T}) \mapsto c(f) \in c_0$ is injective and continuous.
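The contrast between the Dirichlet and Fejér kernels can be observed numerically for $T = 2\pi$. This is a rough sketch: the grid size and the values of $N$ are arbitrary choices, and the normalized integrals over $\mathbf{T}$ are approximated by means over a midpoint grid.

```python
import numpy as np

M = 20000
# Midpoint grid on (-pi, pi): it avoids t = 0, where the closed forms read 0/0.
t = -np.pi + (np.arange(M) + 0.5) * 2 * np.pi / M

def dirichlet(N):
    """D_N(t) = 1 + 2 * sum_{n=1}^N cos(n t), as in (2.23)."""
    return np.ones_like(t) + 2.0 * sum(np.cos(n * t) for n in range(1, N + 1))

def fejer(N):
    """F_N = (1/N) * sum_{n=0}^{N-1} D_n, as in (2.25)."""
    return sum(dirichlet(n) for n in range(N)) / N

for N in (4, 16, 64):
    D, F = dirichlet(N), fejer(N)
    # Grid means approximate the normalized L^1(T) norms of the two kernels.
    print(N, np.mean(np.abs(D)), np.mean(np.abs(F)))
```

The second column should keep growing with $N$, which is the failure of property (3) for $\{D_N\}$, while the third column stays at $1$ because $F_N \ge 0$ has mean $1$.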

2.5. The Riesz-Thorin interpolation theorem

Sometimes it is easy to show the continuity of a certain operator $T$ when acting between two couples of Banach spaces, say $T : L^{p_0} \to L^{q_0}$ and $T : L^{p_1} \to L^{q_1}$. The interpolation theorems show that then $T$ is also bounded between certain intermediate couples $L^p$ and $L^q$.

We are going to prove Thorin's extension of the classical M. Riesz interpolation result, known as the convexity theorem, by combining techniques of real analysis with the maximum modulus property of analytic functions.$^{14}$

The following result will be used in the proof of the interpolation theorem:

Theorem 2.44 (Three lines theorem). Let $f$ be a bounded analytic function in the unit strip $S = \{z \in \mathbf{C};\ 0 < \Re z < 1\}$ that extends continuously to $\bar{S} = \{z \in \mathbf{C};\ 0 \le \Re z \le 1\}$, and denote
$$M(\vartheta) := \sup_{\Re z = \vartheta} |f(z)|.$$
Then $M(\vartheta) \le M(0)^{1-\vartheta}M(1)^{\vartheta}$ ($0 < \vartheta < 1$).

Proof. To prove that $|f(\vartheta + iy)| \le M(0)^{1-\vartheta}M(1)^{\vartheta}$, we can assume that neither $M(0)$ nor $M(1)$ is zero, since we can consider $M(0) + \varepsilon$ and $M(1) + \varepsilon$ and then let $\varepsilon \downarrow 0$.

$^{14}$ The first interpolation theorem was proved in 1911 by the Belarusian mathematician , who worked in Germany for most of his life, for operators between $\ell^p$ spaces of type $(1,1)$ and $(\infty,\infty)$ in terms of bilinear forms. The convexity theorem was proved in 1927 by Marcel Riesz, the younger brother of Frigyes Riesz, who worked in Sweden for most of his life. It was extended in 1938 by Riesz's student Olov V. Thorin with a very ingenious proof, considered by Littlewood “the most impudent idea in Analysis”. This theorem refers to couples of $L^p$ spaces but, with the ideas contained in Thorin's proof, in the 1960s the Argentinian-American Alberto Calderón developed an abstract complex interpolation method for general couples of Banach spaces.


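Let $F(z) = f(z)/(M(0)^{1-z}M(1)^{z})$, which is similar to $f$ but with $M(0) = M(1) = 1$ and $|F| \le K$ on $0 \le \Re z \le 1$. We only need to show that $|F| \le 1$.

For every $\varepsilon > 0$, let us consider $F_\varepsilon(z) = F(z)/(1 + \varepsilon z)$ on the rectangle $R_\varepsilon$ which is the fragment of $S$ lying between the lines $y = \pm K/\varepsilon$. On the boundary of $S$,
$$|F_\varepsilon(z)| \le \frac{|F(z)|}{1 + \varepsilon x} \le \frac{1}{1 + \varepsilon x} \le 1 \qquad (x = \Re z)$$
and, when $|y| = K/\varepsilon$, also
$$|F_\varepsilon(z)| \le \frac{|F(z)|}{\varepsilon|y|} \le \frac{K}{\varepsilon|y|} = 1.$$
By the maximum modulus theorem, $|F_\varepsilon| \le 1$ on $R_\varepsilon$, and the estimate $|F_\varepsilon(z)| \le 1$ is also valid everywhere on $S$. Then we conclude that $|F(z)| \le 1 + \varepsilon|z|$ on $S$, and $|F(z)| \le 1$ by letting $\varepsilon \to 0$. $\square$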
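The three lines estimate is easy to test on a concrete function. In the sketch below, the function $f(z) = e^{z^2}/(1+z)$ is a hypothetical example, chosen only because it is bounded and analytic on the closed strip, and each supremum over a vertical line is replaced by a maximum over a finite grid in $y$.

```python
import numpy as np

def f(z):
    # Hypothetical test function: bounded and analytic on 0 <= Re z <= 1.
    return np.exp(z * z) / (1.0 + z)

y = np.linspace(-20.0, 20.0, 40001)

def M(theta):
    """Approximate M(theta) = sup_y |f(theta + i y)| on a finite grid in y."""
    return np.max(np.abs(f(theta + 1j * y)))

M0, M1 = M(0.0), M(1.0)
thetas = (0.1, 0.25, 0.5, 0.75, 0.9)
for theta in thetas:
    bound = M0 ** (1 - theta) * M1 ** theta
    print(theta, M(theta), bound)
```

In each printed row the middle value should not exceed the last one, which is the log-convexity of $\vartheta \mapsto M(\vartheta)$ stated in Theorem 2.44.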

Let us apply this result to prove the interpolation theorem for operators between $L^p$ spaces on $X$ and $Y$, two measurable spaces endowed with the $\sigma$-finite measures $\mu$ and $\nu$.

We denote by $S(X)$ the class of all simple integrable complex functions on $X$ and by $M(Y)$ the class of all complex measurable functions on $Y$, and we recall that a linear operator $T : D(X) \to M(Y)$ ($D(X) \subset M(X)$) is said to be of type $(p,q)$ with constant $M$ if $\|Tf\|_q \le M\|f\|_p$ for every $f \in D(X)$.

We will use the fact that

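$$\|f\|_q = \sup_{\|g\|_{q'}\le 1} |\langle f,g\rangle| = \sup_{\|g\|_{q'}=1} |\langle f,g\rangle|,^{15} \tag{2.28}$$
where $\langle f,g\rangle = \int_Y fg\,d\nu$. The proof is obtained from Hölder's inequality as follows:

Obviously $\sup_{\|g\|_{q'}\le 1} |\langle f,g\rangle| \le \|f\|_q$. To prove the converse estimate, normalization allows us to suppose that $\|f\|_q = 1$, and we write $f = |f|s$ with $|s| = 1$. When $1 \le q < \infty$, define $g_0 = |f|^{q-1}\bar{s}$; then $\|g_0\|_{q'} = 1$ and $\langle f,g_0\rangle = 1$.

If $q = \infty$, suppose that $M := \sup_{\|g\|_1=1} |\langle f,g\rangle| < \|f\|_\infty$, so that, for some $m > 0$, we can choose $A \subset \{|f| > M + 1/m\}$ such that $0 < \nu(A) < \infty$. Then $g_0 := \nu(A)^{-1}\bar{s}\chi_A \in L^1(\nu)$ would satisfy $\|g_0\|_1 = 1$ and $|\langle f,g_0\rangle| > M$, which contradicts the definition of $M$.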

$^{15}$ This is the description of the $L^p$ norm by duality. See also (4.6).


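Theorem 2.45 (Riesz-Thorin). Let $T : S(X) \to M(Y)$ be a linear operator of types $(p_0,q_0)$ and $(p_1,q_1)$ with constants $M_0$ and $M_1$, respectively. Then, if $0 < \vartheta < 1$, $T$ is also of type $(p_\vartheta,q_\vartheta)$ with constant $M(\vartheta)$, where
$$\frac{1}{p_\vartheta} = \frac{1-\vartheta}{p_0} + \frac{\vartheta}{p_1}, \qquad \frac{1}{q_\vartheta} = \frac{1-\vartheta}{q_0} + \frac{\vartheta}{q_1},$$
and $M(\vartheta) \le M_0^{1-\vartheta}M_1^{\vartheta}$.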

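Proof. (a) Let $p = p_\vartheta$ and $q = q_\vartheta$, and consider first the case $p_0 = p_1 = p$. Note that we only need to show that, if $q_0 \le q \le q_1$, then
$$\|g\|_q \le \|g\|_{q_0}^{1-\vartheta}\|g\|_{q_1}^{\vartheta}, \tag{2.29}$$
since then, for $g = Tf$, we obtain
$$\|Tf\|_q \le \|g\|_{q_0}^{1-\vartheta}\|g\|_{q_1}^{\vartheta} \le M_0^{1-\vartheta}M_1^{\vartheta}\|f\|_{p_0}^{1-\vartheta}\|f\|_{p_1}^{\vartheta} = M_0^{1-\vartheta}M_1^{\vartheta}\|f\|_p.$$

To prove (2.29) when $1 \le q_1 < \infty$, we use Hölder's inequality with the exponents $r = q_0/((1-\vartheta)q)$ and $r' = q_1/(\vartheta q)$ to obtain
$$\|g\|_q^q = \int |g|^{(1-\vartheta)q}|g|^{\vartheta q} \le \Big(\int |g|^{q_0}\Big)^{(1-\vartheta)q/q_0}\Big(\int |g|^{q_1}\Big)^{\vartheta q/q_1},$$
that is, $\|g\|_q^q \le \|g\|_{q_0}^{(1-\vartheta)q}\|g\|_{q_1}^{\vartheta q}$.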

If p0 = p1 = p and q0

(b) Now assume that $p_0 \ne p_1$, and then 1

$$s = \sum_{n=1}^{N} a_n\chi_{A_n}, \qquad g = \sum_{k=1}^{K} b_k\chi_{B_k}$$


where the $A_n$ (and the $B_k$) are disjoint sets of finite measure and $a_n \ne 0 \ne b_k$. Moreover $\|s\|_p = 1$ and $\|g\|_{q'} = 1$ give us that
$$\sum_{n=1}^{N} |a_n|^p\mu(A_n) = 1, \qquad \sum_{k=1}^{K} |b_k|^{q'}\nu(B_k) = 1.$$

Write $s(x) = |s(x)|\sigma(x)$, so that $s(x) = |a_n|\exp(i\arg a_n)$ if $x \in A_n$, and also $g(y) = |g(y)|\gamma(y)$. Then for every $z \in \mathbf{C}$ we define the simple functions
$$s_z = |s|^{\alpha(z)/\alpha(\vartheta)}\sigma, \qquad g_z = |g|^{(1-\beta(z))/(1-\beta(\vartheta))}\gamma$$
and
$$F(z) = \int_Y (Ts_z)g_z\,d\nu = \sum_{n,k} |a_n|^{\alpha(z)/\alpha(\vartheta)}|b_k|^{(1-\beta(z))/(1-\beta(\vartheta))}C_{n,k}$$
with $C_{n,k} = \int_Y (T\chi_{A_n})\chi_{B_k}\exp(i\arg a_n + i\arg b_k)$, which is an entire function as a linear combination of exponentials. The real parts of $\alpha(z)$ and $\beta(z)$ are bounded on $\bar{S}$ and then $F$ is also bounded. The announced estimate will be obtained from Theorem 2.44 if we show that
$$|F(iy)| \le M_0, \qquad |F(1+iy)| \le M_1, \tag{2.31}$$
since $F(\vartheta) = \int_Y (Ts)g\,d\nu$ and then (2.30) will follow.

From the definition of $F(z)$, $T$ being of type $(p_0,q_0)$ with constant $M_0$, Hölder's inequality gives
$$|F(iy)| \le \|Ts_{iy}\|_{q_0}\|g_{iy}\|_{q_0'} \le M_0\|s_{iy}\|_{p_0}\|g_{iy}\|_{q_0'},$$
where
$$\|s_{iy}\|_{p_0}^{p_0} = \sum_{n=1}^{N} \big||a_n|^{\alpha(iy)/\alpha(\vartheta)}\big|^{p_0}\mu(A_n) = \sum_{n=1}^{N} |a_n|^p\mu(A_n) = 1,$$
since $\Re\alpha(iy) = 1/p_0$ and $\alpha(\vartheta) = 1/p$. Similarly, $\|g_{iy}\|_{q_0'}^{q_0'} = 1$ and we arrive at the first estimate $|F(iy)| \le M_0$ in (2.31).

The same argument using the fact that $T$ is of type $(p_1,q_1)$ with constant $M_1$ yields the second estimate $|F(1+iy)| \le M_1$. This completes the proof for $q$ finite.

In the case $q = \infty$, take $g_z = g$ for all $z$. $\square$

Corollary 2.46. Let $T$ be a linear operator on $D(T) = L^{p_0}(X) + L^{p_1}(X)$ of types $(p_0,q_0)$ and $(p_1,q_1)$ with constants $M_0$ and $M_1$. Then, with the same notation as in Theorem 2.45, $T : L^{p_\vartheta}(X) \to L^{q_\vartheta}(Y)$ is of type $(p_\vartheta,q_\vartheta)$ with constant $M(\vartheta) \le M_0^{1-\vartheta}M_1^{\vartheta}$, for every $0 < \vartheta < 1$.


Proof. We can assume that p0

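$$\|Tf\|_q \le \liminf_k \|Ts_k\|_q \le M(\vartheta)\|s_k\|_p = M(\vartheta)\|f\|_p.$$

If $q = \infty$, then $q_0 = q_1 = \infty$, and also $\|Tf\|_\infty \le \liminf_k \|Ts_k\|_\infty$. $\square$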

As an application we will prove an extension of the Young inequalities (2.20) and (2.21):

Theorem 2.47. Let $1 \le p, q, r \le \infty$ and let
$$\frac{1}{p} + \frac{1}{q} = \frac{1}{r} + 1.$$
If $f \in L^p(\mathbf{R}^n)$ and $g \in L^q(\mathbf{R}^n)$, then $f$ and $g$ are convolvable, $f * g \in L^r(\mathbf{R}^n)$, and
$$\|f * g\|_r \le \|f\|_p\|g\|_q.$$

Proof. Assume that $p \le q$, so that also $p \le r$. By (2.20), $f*$ is of type $(1,p)$ with constant $\|f\|_p$, and by (2.21) it is of type $(p',\infty)$ with the same constant. It follows from the Riesz-Thorin Theorem 2.45 that it is also of type $(q,r)$ with constant $\|f\|_p$ by choosing $\vartheta = p/r$, since then $1/q = (1-\vartheta)/p' + \vartheta/1$ and $1/r = \vartheta/p$. $\square$
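The inequality of Theorem 2.47 has an exact discrete analogue for finitely supported sequences on $\mathbf{Z}$ with counting measure, which is convenient for a numerical check. The sketch below works in that discrete setting (an assumption, since the theorem is stated on $\mathbf{R}^n$); the random sequences and the exponent pairs are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def young_ratio(f, g, p, q):
    """Return ||f*g||_r / (||f||_p ||g||_q), where 1/p + 1/q = 1/r + 1."""
    inv_r = 1.0 / p + 1.0 / q - 1.0
    r = np.inf if inv_r <= 1e-12 else 1.0 / inv_r
    conv = np.convolve(f, g)  # full convolution of two finite sequences
    return np.linalg.norm(conv, r) / (np.linalg.norm(f, p) * np.linalg.norm(g, q))

# Exponent pairs with 1/p + 1/q >= 1, so that 1 <= r <= infinity.
pairs = [(1, 1), (1, 2), (2, 1), (2, 2), (1.5, 1.5), (1.2, 3.0)]
ratios = [
    young_ratio(rng.standard_normal(50), rng.standard_normal(80), p, q)
    for p, q in pairs
]
for (p, q), ratio in zip(pairs, ratios):
    print(p, q, ratio)
```

Every printed ratio should be at most $1$; the pair $(2,2)$ corresponds to $r = \infty$, the Hölder-type estimate (2.21).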

2.6. Applications to linear differential equations

The aim of this section is to present some applications of the preceding methods to solve initial value problems and boundary value problems for a second order linear equation with continuous coefficients,
$$u'' + a_1(x)u' + a_0(x)u = c(x),$$
on a bounded interval $[a,b]$. Functions are assumed to be real-valued.

This equation can be written in what is called self-adjoint form,
$$(pu')' - qu = f \tag{2.32}$$
(0

since, for $p(x) = \exp(\int a_1)$ such that $p' = pa_1$, our equation is equivalent to $pu'' + pa_1u' + pa_0u = pc$, that is, $(pu')' - p'u' + pa_1u' + pa_0u = pc$, so that we only need to consider $q = -pa_0$ and $f = pc$.

We write $Lu = (pu')' - qu$ for short, and by a solution of $Lu = f$ we mean a function $u \in \mathcal{C}^2[a,b]$ such that $Lu(x) = f(x)$ for every $x \in [a,b]$.

Note that the operator $L : \mathcal{C}^2[a,b] \to \mathcal{C}[a,b]$ satisfies the Lagrange identity
$$uLv - vLu = [p(uv' - u'v)]' = (pW)', \tag{2.33}$$
where $W = uv' - u'v$ is the Wronskian determinant.$^{16}$

2.6.1. An initial value problem. Next we consider the Cauchy problem
$$Lu = f, \qquad u(a) = \alpha, \qquad u'(a) = \beta \tag{2.34}$$
where $\alpha, \beta \in \mathbf{R}$ are given and $L$ is as above.

Integration and the assumption $u'(a) = \beta$ show that to solve (2.34) involves finding the solutions of
$$p(y)u'(y) - p(a)\beta - \int_a^y q(t)u(t)\,dt = \int_a^y f(t)\,dt \qquad (u \in \mathcal{C}^1[a,b],\ u(a) = \alpha).$$
After dividing by $p$, it turns out that integration and the condition $u(a) = \alpha$ show that this is equivalent to
$$u(x) = \int_a^x \frac{1}{p(s)}\int_a^s q(y)u(y)\,dy\,ds - g(x) \qquad (u \in \mathcal{C}[a,b]),$$
where
$$g(x) = -\alpha - p(a)\beta\int_a^x \frac{ds}{p(s)} - \int_a^x \frac{1}{p(s)}\int_a^s f(y)\,dy\,ds. \tag{2.35}$$
Note that, if a


Let us consider
$$K(x,y) := q(y)\int_y^x \frac{1}{p(t)}\,dt \qquad (a \le y \le x \le b),$$
which is a continuous function on the triangle defined by $a \le y \le x \le b$, and suppose that $g \in \mathcal{C}[a,b]$ is as in (2.35). With Theorem 2.30 we have proved the following result:

Theorem 2.48. The function $u \in \mathcal{C}^2[a,b]$ is a solution of the Cauchy problem (2.34) if and only if $u \in \mathcal{C}[a,b]$ and it satisfies the Volterra integral equation
$$T_Ku - u = g,$$
whose unique solution is
$$u = (T_K - I)^{-1}g = -\sum_{n=0}^{\infty} T_K^n g.$$

As shown in Exercise 2.20, a similar result holds for linear differential equations of higher order with continuous coefficients.
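The Neumann series of Theorem 2.48 can be run numerically on a model problem. The sketch below assumes $p = 1$, $q = -1$, $f = 0$ on $[0,2]$ with $\alpha = 0$, $\beta = 1$, that is, $u'' + u = 0$, $u(0) = 0$, $u'(0) = 1$, whose solution is $\sin x$. The grid, the trapezoidal discretization of $T_K$ and the truncation after 40 terms are illustrative choices, not part of the text.

```python
import numpy as np

# Assumed model problem: p = 1, q = -1, f = 0 on [0, 2], i.e. u'' + u = 0,
# with u(0) = alpha = 0 and u'(0) = beta = 1; the exact solution is sin(x).
a, b, n = 0.0, 2.0, 401
x = np.linspace(a, b, n)
h = x[1] - x[0]
alpha, beta = 0.0, 1.0

# Kernel K(x, y) = q(y) * int_y^x dt / p(t) = -(x - y) on a <= y <= x <= b.
K = np.tril(-(x[:, None] - x[None, :]))

# Trapezoidal weights for int_a^{x_i}: h inside, h/2 at the endpoints y = a, y = x_i.
W = np.tril(np.full((n, n), h))
W[:, 0] *= 0.5
W[np.arange(n), np.arange(n)] *= 0.5
T = K * W  # matrix of the discretized Volterra operator T_K

# g from (2.35) with p = 1 and f = 0: g(x) = -alpha - beta * (x - a).
g = -alpha - beta * (x - a)

# Neumann series u = -sum_{m >= 0} T_K^m g, truncated after 40 terms.
term = -g
u = term.copy()
for _ in range(40):
    term = T @ term
    u += term

print(np.max(np.abs(u - np.sin(x))))
```

In this example the successive terms are approximations of $(-1)^m x^{2m+1}/(2m+1)!$, so the series reproduces the Taylor expansion of $\sin x$ up to the quadrature error.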

Let us recall how, from the existence and uniqueness of solutions for the Cauchy problem (2.34), it can be proved that Ker L, the set of all solutions for the homogeneous equation, is a two-dimensional vector space:

Lemma 2.49. Two solutions $y_1$ and $y_2$ of the equation $Lu = 0$ are linearly dependent if and only if their Wronskian, $W = y_1y_2' - y_1'y_2$, vanishes at one point. In this case, $W(x) = 0$ for all $x \in [a,b]$.

Proof. Indeed, if $W(\xi) = 0$ at a point $\xi \in [a,b]$, then the system
$$y_1(\xi)c_1 + y_2(\xi)c_2 = 0, \qquad y_1'(\xi)c_1 + y_2'(\xi)c_2 = 0$$
has a solution $(c_1,c_2) \ne (0,0)$ and the function $v := c_1y_1 + c_2y_2$ is a solution of the boundary problems $Lv = 0$, $v(\xi) = v'(\xi) = 0$ on $[a,\xi]$ and on $[\xi,b]$. By the uniqueness of solutions for these Cauchy problems, it follows that $v = 0$ on $[a,b]$ and $y_1$, $y_2$ are linearly dependent.

Conversely, if $c_1y_1 + c_2y_2 = 0$ with $(c_1,c_2) \ne (0,0)$, then also $c_1y_1' + c_2y_2' = 0$, and $W(x) = 0$ for every $x \in [a,b]$ as the determinant of the homogeneous linear system
$$y_1(x)c_1 + y_2(x)c_2 = 0, \qquad y_1'(x)c_1 + y_2'(x)c_2 = 0$$
with a nonzero solution. $\square$

Theorem 2.50. Let $y_1$ and $y_2$ be the solutions of
$$Ly_1 = 0, \qquad y_1(a) = 1, \qquad y_1'(a) = 0$$
and
$$Ly_2 = 0, \qquad y_2(a) = 0, \qquad y_2'(a) = 1.$$
Then $\{y_1,y_2\}$ is a basis of $\operatorname{Ker} L$.

Proof. Since $W(a) = 1$, according to Lemma 2.49, the functions $y_1$ and $y_2$ are linearly independent.

To show that any other solution $u$ of $Lu = 0$ is a linear combination of $y_1$ and $y_2$, note that $v := u(a)y_1 + u'(a)y_2$ is the solution of
$$Lv = 0, \qquad v(a) = u(a), \qquad v'(a) = u'(a),$$
and then $u = v = u(a)y_1 + u'(a)y_2$. $\square$

2.6.2. A boundary value problem. We shall now restrict our attention to the homogeneous boundary problem

$$Lu = g, \qquad B_1(u) = 0, \qquad B_2(u) = 0, \tag{2.36}$$
where $Lu \equiv (pu')' - qu$ (0

Proof. Let y1 and y2 be nonzero solutions of

Ly1 =0,B1(y1)=0 and Ly2 =0,B2(y2)=0.


By our assumptions, there are no nonzero solutions of $L_Du = 0$, so that $B_1(y_2) \ne 0$ and $B_2(y_1) \ne 0$, and $y_1$, $y_2$ are linearly independent. By the existence theorem for the Cauchy problem (Theorem 2.48) we can always find these functions $y_1$ and $y_2$. $\square$

It is worth observing that, if $y_1, y_2 \in \mathcal{C}^2[a,b]$ are two linearly independent solutions of the homogeneous equation $Lu = 0$, then $pW$ is a nonzero constant, since $(pW)' = 0$ by (2.33) and $W \ne 0$ by the linear independence of $y_1$ and $y_2$.

Hence, if $y_1$ and $y_2$ are as in Lemma 2.51, then $pW = C$ is a constant. Our aim is to show that the boundary value problem (2.36) is solved by the Fredholm operator defined by the kernel
$$G(x,\xi) := \begin{cases} \frac{1}{C}\,y_1(x)y_2(\xi) & \text{if } a \le x \le \xi \le b, \\ \frac{1}{C}\,y_1(\xi)y_2(x) & \text{if } a \le \xi \le x \le b, \end{cases} \tag{2.37}$$
which is called the Green's function$^{17}$ of the differential operator $L$ for the boundary conditions $B_1(u) = 0$, $B_2(u) = 0$. Note that $G$ is a real-valued continuous function on $[a,b] \times [a,b]$ and $G(x,\xi) = G(\xi,x)$.

Theorem 2.52. Under the assumption of $L$ being one-to-one on $\mathcal{D}$, for every $\xi \in (a,b)$ the function $G(\cdot,\xi)$ is uniquely determined by the following conditions:

(a) $G(\cdot,\xi) \in \mathcal{C}^2([a,\xi) \cup (\xi,b])$, $LG(\cdot,\xi) = 0$ on $[a,\xi) \cup (\xi,b]$, and $G(\cdot,\xi)$ satisfies the boundary conditions $B_1(G(\cdot,\xi)) = 0$, $B_2(G(\cdot,\xi)) = 0$.

(b) $G(\cdot,\xi) \in \mathcal{C}[a,b]$.

(c) The right side and left side derivatives of $G(\cdot,\xi)$ exist at $x = \xi$ and
$$\partial_xG(\xi+,\xi) - \partial_xG(\xi-,\xi) = \frac{1}{p(\xi)}.$$

Proof. That $G$ satisfies (a)–(c) follows easily from the definition. Note that
$$C\,\partial_xG(\xi+,\xi) = y_1(\xi)y_2'(\xi), \qquad C\,\partial_xG(\xi-,\xi) = y_1'(\xi)y_2(\xi)$$
and then $C(\partial_xG(\xi+,\xi) - \partial_xG(\xi-,\xi)) = W(\xi)$, which is equivalent to (c).

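To show that $G$ is uniquely determined by (a)-(c), we choose $\{y_1,y_2\}$ as in Lemma 2.51. By condition (a), $G(\cdot,\xi) = a_1(\xi)y_1 + a_2(\xi)y_2$ on $[a,\xi)$ and $G(\cdot,\xi) = b_1(\xi)y_1 + b_2(\xi)y_2$ on $(\xi,b]$ and, since also $B_1(y_1) = B_2(y_2) = 0$, it follows that $a_2B_1(y_2) = 0$ and $b_1B_2(y_1) = 0$ with $B_1(y_2) \ne 0$ and $B_2(y_1) \ne 0$. Hence $a_2 = b_1 = 0$ and
$$G(\cdot,\xi) = \begin{cases} a_1(\xi)y_1 & \text{on } [a,\xi], \\ b_2(\xi)y_2 & \text{on } [\xi,b]. \end{cases}$$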

$^{17}$ Named after the self-taught mathematician and physicist George Green who, in 1828, in “An Essay on the Application of Mathematical Analysis to the Theories of Electricity and Magnetism” introduced several important concepts, such as a theorem similar to Green's theorem, the idea of potential functions as used in physics, and the concept of what we call Green's functions.


Now, from our assumptions (b) and (c),
$$a_1(\xi)y_1(\xi) = b_2(\xi)y_2(\xi), \qquad b_2(\xi)y_2'(\xi) - a_1(\xi)y_1'(\xi) = \frac{1}{p(\xi)},$$
a linear system with $W(\xi) \ne 0$ that determines $a_1(\xi)$ and $b_2(\xi)$. $\square$

Theorem 2.53. Under the assumption of $L_D$ being injective, the Green's function $G$ is defined by (2.37) and the Fredholm operator
$$Tf(x) := \int_a^b G(x,\xi)f(\xi)\,d\xi = \frac{1}{C}\Big(y_2(x)\int_a^x y_1(\xi)f(\xi)\,d\xi + y_1(x)\int_x^b y_2(\xi)f(\xi)\,d\xi\Big)$$
is the inverse operator $T : \mathcal{C}[a,b] \to \mathcal{D} \subset \mathcal{C}^2[a,b]$ of $L_D$. That is, $u \in \mathcal{C}^2[a,b]$ is the unique solution of the boundary value problem
$$Lu = g, \qquad B_1(u) = 0, \qquad B_2(u) = 0$$
if and only if
$$u(x) = \int_a^b G(x,\xi)g(\xi)\,d\xi.$$

Proof. Suppose $u$ is a solution of the boundary value problem and apply Green's formula
$$\int_x^y (uLv - vLu) = \big[p(uv' - vu')\big]_x^y,$$
obtained from the Lagrange identity (2.33) by integration, to the solution $u$ and to $v = G(\cdot,\xi)$ on the intervals $[a,\xi-\varepsilon]$ and $[\xi+\varepsilon,b]$. Allowing $\varepsilon \downarrow 0$, we obtain
$$\int_a^\xi \big(uLG(\cdot,\xi) - G(\cdot,\xi)Lu\big) = \big[p(uG'(\cdot,\xi) - G(\cdot,\xi)u')\big]_a^{\xi-}$$
and
$$\int_\xi^b \big(uLG(\cdot,\xi) - G(\cdot,\xi)Lu\big) = \big[p(uG'(\cdot,\xi) - G(\cdot,\xi)u')\big]_{\xi+}^b.$$
Since $Lu = g$ and $LG(\cdot,\xi) = 0$, the sum of the left sides is $-\int_a^b G(\cdot,\xi)g$. According to the properties of $G$, for the sum of the right sides we obtain
$$-p(\xi)u(\xi)\frac{1}{p(\xi)} + \big[p(uG'(\cdot,\xi) - G(\cdot,\xi)u')\big]_a^b = -u(\xi)$$
and $u(x) = \int_a^b G(x,\xi)g(\xi)\,d\xi$ holds.


Conversely, according to the definition of $G$, it follows by differentiation from
$$u(x) = \frac{1}{C}\,y_2(x)\int_a^x y_1(\xi)g(\xi)\,d\xi + \frac{1}{C}\,y_1(x)\int_x^b y_2(\xi)g(\xi)\,d\xi$$
that
$$u'(x) = \int_a^{x-} \partial_xG(x,\xi)g(\xi)\,d\xi + \int_{x+}^b \partial_xG(x,\xi)g(\xi)\,d\xi = \int_a^b \partial_xG(x,\xi)g(\xi)\,d\xi.$$
With a similar computation, $(pu')'$ can be evaluated and, by using the properties of $G$, one gets $Lu(x) = \int_a^b L(G(\cdot,\xi))(x)\,g(\xi)\,d\xi + g(x) = g(x)$. $\square$
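Theorem 2.53 can be checked numerically on the simplest model problem. The sketch assumes $p = 1$, $q = 0$ on $[0,1]$ with the boundary conditions $u(0) = u(1) = 0$, so that $Lu = u''$; the functions $y_1(x) = x$ and $y_2(x) = x - 1$, the right-hand side and the trapezoidal quadrature are illustrative choices. The Green's function is taken as $G(x,\xi) = y_1(x)y_2(\xi)/C$ for $x \le \xi$ and symmetrically for $\xi \le x$, the reading of (2.37) that is consistent with the boundary conditions.

```python
import numpy as np

# Assumed model problem: p = 1, q = 0 on [0, 1], so Lu = u'', with u(0) = u(1) = 0.
n = 501
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

def y1(s):
    return s          # L y1 = 0 and y1(0) = 0

def y2(s):
    return s - 1.0    # L y2 = 0 and y2(1) = 0

C = 1.0               # p * W = y1 * y2' - y1' * y2 = s - (s - 1) = 1

# G(x, xi) = y1(min(x, xi)) * y2(max(x, xi)) / C.
X, XI = np.meshgrid(x, x, indexing="ij")
G = y1(np.minimum(X, XI)) * y2(np.maximum(X, XI)) / C

# Right-hand side chosen so that the exact solution is u(x) = sin(pi x).
g = -np.pi ** 2 * np.sin(np.pi * x)

# u(x_i) = int_0^1 G(x_i, xi) g(xi) d xi by the composite trapezoidal rule.
w = np.full(n, h)
w[0] = w[-1] = h / 2
u = G @ (w * g)

print(np.max(np.abs(u - np.sin(np.pi * x))))
```

The printed number is the maximal deviation from $\sin(\pi x)$ on the grid and should be of the order of $h^2$, since the kink of $G(x,\cdot)$ at $\xi = x$ falls on a grid node.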

2.7. Exercises

Exercise 2.1 (Leibniz formula). For one variable,
$$(fg)^{(m)} = \sum_{k=0}^{m} \binom{m}{k} f^{(k)}g^{(m-k)}.$$
In the case of $n$ variables,
$$D^\alpha(fg) = \sum_{\beta\le\alpha} \binom{\alpha}{\beta} (D^\beta f)D^{\alpha-\beta}g.$$

Exercise 2.2. Prove that if a vector subspace $F$ of a topological vector space $E$ has an interior point, then $F = E$.

Exercise 2.3. Equip a real or complex vector space $E$ with the discrete topology, that is, the topology whose open sets are all the subsets of $E$. Is $E$ a topological vector space?

Exercise 2.4. Prove that the interior of a convex subset $K$ in a topological vector space $E$ is also convex.

Exercise 2.5. Let $c_{00}$ be the normed space of all finitely nonzero real-valued sequences with the “sup” norm $\|(x_1,\dots,x_N,\dots)\|_\infty = \sup_n |x_n|$. Find an unbounded linear form $u : c_{00} \to \mathbf{R}$ and an unbounded linear operator $T : c_{00} \to c_{00}$.

Exercise 2.6. Show that the Banach space of all bounded continuous functions on $\mathbf{R}^n$ with the “sup” norm can be considered a closed subspace of $L^\infty(\mathbf{R}^n)$, and prove that this space is not separable.

Exercise 2.7. Let $1 \le p < \infty$. Prove that $L^p(\mathbf{R}^n)$ is separable.


Exercise 2.8. Prove that the normed space $\mathcal{C}_0(\mathbf{R}^n)$ of all continuous functions $f$ on $\mathbf{R}^n$ such that $\lim_{|x|\to\infty} f(x) = 0$, with the usual operations and the “sup” norm, is a completion of the vector subspace $\mathcal{C}_c(\mathbf{R}^n)$ of all continuous functions $f$ on $\mathbf{R}^n$ with a compact support endowed with the “sup” norm and that it is separable.

Exercise 2.9. Let $K$ be a compact metric space.

(a) Prove that $K$ is separable.

(b) Prove that $\mathcal{C}(K)$ is also separable by showing that, if $\{x_n\}$ is a dense sequence in $K$ and $\varphi_{m,n}(x) = \max(1/m - d(x,x_n), 0)$, the countable set $\{\varphi_{m,n};\ m, n \ge 1\}$ is total in $\mathcal{C}(K)$.

Exercise 2.10. Prove that the trigonometric system $\{e_k\}_{k\in\mathbf{Z}}$ is total in $L^2(\mathbf{T})$ by showing first that $\mathcal{C}(\mathbf{T})$ is dense in $L^2(\mathbf{T})$ and that the polynomials $\sum_{k=-N}^{N} c_kz^k$ ($N \in \mathbf{N}$, $c_k \in \mathbf{C}$) are dense in $\mathcal{C}(\mathbf{T})$.

Exercise 2.11. Prove that the vector space of all $\mathcal{C}^1$ functions on $[a,b]$ with the usual operations and the norm $\|f\| := \max(\|f\|_{[a,b]}, \|f'\|_{[a,b]})$ is a separable Banach space.

Exercise 2.12. For the norms $\|x\|_p := \big(\sum_{k=1}^{n} |x_k|^p\big)^{1/p}$ ($1 \le p < \infty$) and $\|x\|_\infty := \max_{k=1}^{n} |x_k|$ on $\mathbf{R}^n$, find the best constants $\alpha$ and $\beta$ such that $\alpha\|x\|_\infty \le \|x\|_p \le \beta\|x\|_\infty$.

Exercise 2.13. If $\mathcal{C} = \mathcal{C}([0,1];\mathbf{R})$, compute the norm of $u \in \mathcal{C}'$ defined as
$$u(f) := \int_0^1 f(t)g(t)\,dt$$
with $g := \sum_{n=1}^{\infty} (-1)^n\chi_{(1/(n+1),1/n)}$ and show that $u(B_{\mathcal{C}}) = (-1,1)$, so the norm of $u$ is not attained by $|u(f)|$ on the closed unit ball $B_{\mathcal{C}}$ of $\mathcal{C}[0,1]$.

Exercise 2.14. If $T_K : \mathcal{C}[c,d] \to \mathcal{C}[a,b]$ is a Fredholm operator (see 2.22) and $K \ge 0$, then $\|T_K\| = \sup_{a\le x\le b} \int_c^d K(x,y)\,dy$. Similarly, if $T_K$ is a Volterra operator and $K \ge 0$, then $\|T_K\| = \sup_{a\le x\le b} \int_a^x K(x,y)\,dy$.

Exercise 2.15. Consider $K \in \mathcal{C}([0,1]^2)$ and $T : L^1(0,1) \to L^1(0,1)$ such that
$$Tf(x) := \int_0^1 K(x,y)f(y)\,dy.$$
Prove that $\|T\| = \max_{0\le y\le 1} \int_0^1 |K(x,y)|\,dx$.

Exercise 2.16. In Theorem 2.40, assume that $K$ is nonnegative. Then prove that the norm of $T_K : L^\infty \to L^\infty$ is precisely $C$ if $\int_Y K(x,y)\,dy = C$ a.e. Similarly, if $C = \int_X K(x,y)\,dx$ a.e., show that the norm of $T_K : L^1 \to L^1$ is $C$.


Exercise 2.17. On $[-\pi,\pi] \times [-\pi,\pi]$ we define the integral kernel
$$K(x,y) := \sum_{n=1}^{N} a_n\cos(nx)\sin(ny)$$
and, for a given $v_0 \in \mathcal{C}[-\pi,\pi]$, consider the integral equation on $\mathcal{C}[-\pi,\pi]$
$$u(x) - \int_{-\pi}^{\pi} K(x,y)u(y)\,dy = v_0(x). \tag{2.38}$$
Prove that the Neumann series gives $(I - T)^{-1} = I + T$ for the Fredholm operator $T = T_K$. Then show that
$$u(x) := v_0(x) + \int_{-\pi}^{\pi} K(x,y)v_0(y)\,dy$$
is the unique solution of (2.38). If $v_0$ is an even function, then $u = v_0$.

Exercise 2.18. Find the Volterra integral equations that solve the following Cauchy problems on $[0,1]$:

(a) $u'' + u = 0$, $u(0) = 0$, $u'(0) = 1$.

(b) $u'' + u = \cos x$, $u(0) = 0$, $u'(0) = 1$.

(c) $u'' + a_1u' + a_0u = 0$, $u(0) = \alpha$, $u'(0) = \beta$ ($\alpha, \beta \in \mathbf{R}$).

(d) $u'' + xu' + u = 0$, $u(0) = 1$, $u'(0) = 0$.

Exercise 2.19. Prove that
$$Tf(x) = \int_a^x dx_{n-1}\int_a^{x_{n-1}} dx_{n-2} \cdots \int_a^{x_1} f(t)\,dt$$
defines a Volterra operator on $\mathcal{C}[a,b]$ whose integral kernel is
$$K(x,t) = \frac{1}{(n-1)!}(x-t)^{n-1} \qquad (a \le t \le x \le b).$$

Exercise 2.20. On $[a,b]$, consider the Cauchy problem
$$u^{(n)} + a_1u^{(n-1)} + \cdots + a_nu = f, \qquad u^{(j)}(a) = c_j \quad (0 \le j \le n-1),$$
with $a_1, \dots, a_n, f \in \mathcal{C}[a,b]$ and $c_0, \dots, c_{n-1} \in \mathbf{R}$.

(a) By denoting $v = u^{(n)}$, show that the problem is equivalent to solving the Volterra integral equation
$$\int_a^x K(x,y)v(y)\,dy - v(x) = g(x),$$
where
$$K(x,y) = -\sum_{m=1}^{n} a_m(x)\frac{(x-y)^{m-1}}{(m-1)!}$$
and
$$g(x) = c_{n-1}a_1(x) + \big(c_{n-1}(x-a) + c_{n-2}\big)a_2(x) + \cdots + \Big(c_{n-1}\frac{(x-a)^{n-1}}{(n-1)!} + \cdots + c_1(x-a) + c_0\Big)a_n(x) - f(x).$$
Here Exercise 2.19 may be useful.

(b) Show that it follows from (a) that the Cauchy problem has a uniquely determined solution $u \in \mathcal{C}^n[a,b]$.

Exercise 2.21. Suppose $f$ and $g$ are nonnegative integrable functions on $\mathbf{R}^2$ such that $\operatorname{supp} f = \{(x,y);\ x > 0,\ 1/x \le y \le 1 + 1/x\}$ and $\operatorname{supp} g = \{(x,y);\ y \ge 0\}$. Prove that $\operatorname{supp} f * g \not\subset \operatorname{supp} f + \operatorname{supp} g$. Why is this not in contradiction with (2.19)?

Exercise 2.22. If $f \in L^p(\mathbf{R}^n)$ and $g \in L^{p'}(\mathbf{R}^n)$, prove that $f * g$ is then uniformly continuous.

Exercise 2.23. On $L^1(\mathbf{R})$, prove that the convolution is associative but check that $(f_1 * f_2) * f_3$, $f_1 * (f_2 * f_3)$ are well-defined and they are different if $f_1 = 1$, $f_2(x) = \sin(\pi x)\chi_{(-1,1)}(x)$, and $f_3 = \chi_{[0,\infty)}$.

Exercise 2.24. Let $\{K_\lambda\}_{\lambda\in\Lambda}$ be a summability kernel on $\mathbf{R}$ and let $f \in L^\infty(\mathbf{R})$. Prove that, if $f$ is continuous on $[a,b]$, then $\lim_{\lambda\to 0} K_\lambda * f = f$ uniformly on $[a,b]$.

Exercise 2.25. Prove that the sequence of de la Vallée-Poussin sums, defined as the averages
$$V_N := \frac{1}{N}\sum_{n=N}^{2N-1} D_n = 2F_{2N} - F_N,$$
is a summability kernel.

Exercise 2.26. Let $c_0$ be the closed subspace of $\ell^\infty(\mathbf{Z})$ of all sequences $x = \{x_k\}_{k=-\infty}^{\infty}$ such that $\lim_{|k|\to\infty} x_k = 0$. Calculate the norm of the Fourier transform $c : f \in L^1(\mathbf{T}) \mapsto c(f) \in c_0$.

Exercise 2.27. On $\mathbf{R}$, let $W(x) := e^{-\pi x^2}$ and, for $n$ variables, let $W(x) := e^{-\pi|x|^2} = W(x_1)\cdots W(x_n)$. Prove that
$$W_t(x) := \frac{1}{t^n}e^{-\pi|x|^2/t^2} \qquad (t > 0)$$
is a $\mathcal{C}^\infty$ summability kernel on $\mathbf{R}^n$. It is called the Gauss-Weierstrass kernel.

Exercise 2.28. If $\{e_n\}_{n\ge 1}$ is an orthonormal basis of $F$, a closed subspace of a Hilbert space $H$, prove that $P_F(x) = \sum_{n\ge 1} (x,e_n)_He_n$.

Exercise 2.29. Prove Theorem 2.40 in the case 1

References for further reading:

N. I. Akhiezer and I. M. Glazman, Theory of linear operators in Hilbert space.
S. Banach, Théorie des opérations linéaires.
S. K. Berberian, Lectures in Functional Analysis and Operator Theory.
R. Courant and D. Hilbert, Methods of Mathematical Physics.
J. Dieudonné, Foundations of Modern Analysis.
L. Kantorovitch and G. Akilov, Analyse fonctionnelle.
A. N. Kolmogorov and S. V. Fomin, Elements of the Theory of Functions and Functional Analysis.
F. Riesz and B. Sz. Nagy, Leçons d'analyse fonctionnelle.
W. Rudin, Real and Complex Analysis.

Chapter 3

Fréchet spaces and Banach theorems

It will also be useful to consider topological vector spaces which are not normable, but the class of general topological vector spaces proves to be too wide for our needs. Usually it is sufficient to consider spaces with a vector topology which is still metrizable and complete and which can be defined by a sequence of norms or semi-norms instead of a single norm. They are called Fréchet spaces and some fundamental aspects of the theory of Banach spaces still hold on them.

An example of a nonnormable Fréchet space is the vector space $\mathcal{C}(\mathbf{R})$ of all continuous functions on $\mathbf{R}$ with the uniform convergence on compact subsets of $\mathbf{R}$, defined by
$$\|f - f_n\|_{[-N,N]} := \max_{-N\le t\le N} |f(t) - f_n(t)| \to 0$$
for all $N > 0$.

The sequence of semi-norms $\|f\|_{[-N,N]}$ defines a vector topology, which can also be described by the distance associated to the Fréchet norm,
$$\|f\| = \sum_{N=1}^{\infty} \frac{1}{2^N}\frac{\|f\|_{[-N,N]}}{1 + \|f\|_{[-N,N]}},$$
which retains many of the properties of a norm. This Fréchet norm is used to prove the basic Banach theorems concerning the continuity of linear operators.
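A small computation shows how the Fréchet norm of $\mathcal{C}(\mathbf{R})$ detects uniform convergence on compact sets. The sketch assumes the hypothetical sequence $f_n(t) = t/n$, for which $\|f_n\|_{[-N,N]} = N/n$, and truncates the series after 60 terms; neither choice comes from the text.

```python
import numpy as np

def frechet_norm(seminorm, terms=60):
    """Truncated sum_{N=1}^{terms} 2^{-N} * s_N / (1 + s_N), with s_N = seminorm(N)."""
    N = np.arange(1, terms + 1)
    s = np.array([seminorm(k) for k in N], dtype=float)
    return np.sum(2.0 ** (-N) * s / (1.0 + s))

# Hypothetical sequence f_n(t) = t / n, for which max_{|t| <= N} |f_n(t)| = N / n.
values = [frechet_norm(lambda N, n=n: N / n) for n in (1, 10, 100, 1000)]
print(values)
```

Each $f_n$ is unbounded on $\mathbf{R}$, so no convergence in the “sup” norm over the whole line takes place, yet the printed Fréchet norms should decrease towards $0$, reflecting that $f_n \to 0$ uniformly on every interval $[-N,N]$.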



3.1. Fréchet spaces

A semi-norm on the real or complex vector space $E$ is a nonnegative function $p : E \to [0,\infty)$ with the following properties:

1. $p(\lambda x) = |\lambda|p(x)$ and

2. $p(x+y) \le p(x) + p(y)$.

Then $p(0) = 0$, but it may happen that $p(x) = 0$ for some $x \ne 0$.

It is shown as in the case (2.1) of a norm that
$$|p(x) - p(y)| \le p(x-y) \tag{3.1}$$
and the $p$-balls
$$U_p(\varepsilon) := \{x;\ p(x) < \varepsilon\} = \varepsilon U_p(1) = p^{-1}(-\varepsilon,\varepsilon) \qquad (p \in \mathcal{P},\ \varepsilon > 0)$$
are convex sets such that $\lambda U_p(\varepsilon) \subset U_p(\varepsilon)$ if $|\lambda| \le 1$ (it is said that they are balanced) and $\bigcup_{t>0} tU_p(\varepsilon) = E$ (they are absorbing). Note that, if $p(x) = 0$ and $x \ne 0$, the ball $U_p(\varepsilon)$ contains the whole line $[x]$.

3.1.1. Locally convex spaces. A family $\mathcal{P}$ of semi-norms on $E$ is called sufficient if $p(x) = 0$ for every $p \in \mathcal{P}$ implies $x = 0$.

Theorem 3.1. If $\mathcal{P}$ is a sufficient family of semi-norms on $E$, then the collection of all the finite intersections of the balls $U_p(\varepsilon)$ ($p \in \mathcal{P}$, $\varepsilon > 0$) is a local basis $\mathcal{U}$ of a vector topology $\mathcal{T}_{\mathcal{P}}$ on $E$. On the topological vector space $(E, \mathcal{T}_{\mathcal{P}})$, a semi-norm $p$ is continuous if and only if the ball $U_p(1)$ is an open set.

Proof. We say that $z \in E$ is an interior point of $A \subset E$ if $z + U \subset A$ for some $U \in \mathcal{U}$, and that $A$ is open if every $a \in A$ is an interior point of $A$. It is trivial to check that the collection $\mathcal{T}_{\mathcal{P}}$ of these sets satisfies all the properties of a topology, which is Hausdorff since, if $x \ne y$, there exists $p \in \mathcal{P}$ such that $p(x - y) = \varepsilon > 0$, and $x + U_p(\varepsilon/2)$ and $y + U_p(\varepsilon/2)$ are disjoint, because $x + u = y + v$ with $p(u) < \varepsilon/2$ and $p(v) < \varepsilon/2$ would imply $p(x - y) = p(v - u) < \varepsilon$.

It is also very easy to show that the vector operations are continuous. In the case of the sum, $U_p(\varepsilon/2) + U_p(\varepsilon/2) \subset U_p(\varepsilon)$ and also, if $U$ is a finite intersection of these balls, $(1/2)U + (1/2)U \subset U$ and the sum is continuous at $(0, 0) \in E \times E$. Continuity at any $(x, y) \in E \times E$ follows by translation; if $x + y + U$ is a neighborhood of $x + y$ and $V + V \subset U$, then $(x + V) + (y + V) \subset x + y + U$.

For every continuous semi-norm $p$, the set $U_p(\varepsilon) = p^{-1}(-\varepsilon, \varepsilon)$ is open. Conversely, if $p$ is a semi-norm and $U_p(1)$ is an open set, then $U_p(\varepsilon) = \varepsilon U_p(1) \in \mathcal{T}_{\mathcal{P}}$ and it follows from (3.1) that, for every $x \in y + U_p(\varepsilon)$, $|p(x) - p(y)| \le p(x - y) < \varepsilon$, so $p(x) \in (p(y) - \varepsilon, p(y) + \varepsilon)$ and $p$ is continuous at $y$. □

A topological vector space $(E, \mathcal{T})$ is said to be a locally convex space¹ if there exists a sufficient family $\mathcal{P}$ of semi-norms defining the topology as in Theorem 3.1. In this case, the family of all sets

\[ U_{p_1}(\varepsilon) \cap \dots \cap U_{p_n}(\varepsilon) = \{x;\ \max\{p_1(x), \dots, p_n(x)\} < \varepsilon\} \]

($\varepsilon > 0$, $p_i \in \mathcal{P}$, $i = 1, \dots, n$, $n \in \mathbf{N}$) is a local basis for this topology. We will also say that $\{U_p(\varepsilon);\ p \in \mathcal{P},\ \varepsilon > 0\}$ is a local subbasis for this topology.

A normable topological vector space is a locally convex space with a sufficient family of semi-norms consisting of a single norm.

Example 3.2. Let $X$ be a nonempty set. Over the vector space $\mathbf{C}^X = \prod_{x \in X} \mathbf{C}$ of all complex functions $f : X \to \mathbf{C}$ ($f = \{f(x)\}_{x \in X} \in \mathbf{C}^X$), the collection of all semi-norms $p_x(f) := |f(x)|$ ($x \in X$) is sufficient and defines the product topology, which is the topology of pointwise convergence, since $p_x(f_n) \to 0$ for every $x \in X$ if and only if $f_n(x) \to 0$ for every $x \in X$.

Recall that the collection of all finite intersections of sets

\[ \pi_z^{-1}\big(D(f_0(z); \varepsilon)\big) = \{f = \{f(x)\}_{x \in X};\ |f(z) - f_0(z)| < \varepsilon\} = f_0 + U_{p_z}(\varepsilon) \]

is a neighborhood basis of $f_0 = \{f_0(x)\}_{x \in X}$ for the product topology and, according to Theorem 3.1, it is also a neighborhood basis for the topology defined by the semi-norms $p_x$ ($x \in X$).

Example 3.3. Over the vector space $\mathcal{C}(\mathbf{R})$ of all continuous functions on $\mathbf{R}$, the sequence

\[ p_n(f) := \|f\|_{[-n,n]} = \sup_{-n \le t \le n} |f(t)| \qquad (n \in \mathbf{N}) \]

of semi-norms is sufficient. They define the topology of the local uniform convergence, or uniform convergence on compact sets.

Every compact set $K \subset \mathbf{R}$ is contained in an interval $[-n, n]$, and then $\|f\|_K \le p_n(f)$. Hence, $p_n(f - f_k) \to 0$ as $k \to \infty$ implies $\|f - f_k\|_K \to 0$, and this means that $f_k \to f$ uniformly on $K$.
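To see that this topology is strictly weaker than uniform convergence on all of $\mathbf{R}$, the following worked computation may help. The sequence $f_k(t) = t/k$ is a choice made here for illustration and does not come from the text.

```latex
% Illustrative sequence (an assumption of this sketch): f_k(t) = t/k in C(R).
\[
  p_n(f_k) \;=\; \sup_{-n \le t \le n} \frac{|t|}{k} \;=\; \frac{n}{k}
  \;\longrightarrow\; 0 \quad (k \to \infty) \qquad \text{for every fixed } n \in \mathbf{N},
\]
\[
  \text{whereas} \qquad \sup_{t \in \mathbf{R}} |f_k(t)| \;=\; \infty
  \qquad \text{for every } k \in \mathbf{N}.
\]
```

So $f_k \to 0$ in $\mathcal{C}(\mathbf{R})$, uniformly on each compact set, although no $f_k$ is even bounded on $\mathbf{R}$.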

¹Considered by J. Dieudonné and L. Schwartz in the 1940s to extend the duality theory of normed spaces, locally convex spaces were the basis for the study of the distributions by L. Schwartz.


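The continuity condition $\|Tx\|_F \le C\|x\|_E$ for a linear operator between normed spaces has a natural extension for locally convex spaces:

Theorem 3.4. Let $\mathcal{P}$ and $\mathcal{Q}$ be two sufficient families of semi-norms for the locally convex spaces $E$ and $F$. A linear map $T : E \to F$ is continuous if and only if for every $q \in \mathcal{Q}$ there exist $p_j \in \mathcal{P}$ ($j \in J$ finite) and a constant $C \ge 0$ so that

\[ q(Tx) \le C \max_{j \in J} p_j(x). \]

A sequence $\{x_n\} \subset E$ is convergent to 0 if and only if $p(x_n) \to 0$ for every $p \in \mathcal{P}$.

Proof. By Theorem 2.2, the continuity of $T$ means that, if $U_q(\varepsilon)$ is a $q$-ball, $T(U) \subset U_q(\varepsilon)$ for some $U = U_{p_1}(\delta) \cap \dots \cap U_{p_n}(\delta)$ ($p_j \in \mathcal{P}$). That is,

\[ q(Tx) \le C \max_{j \in J} p_j(x) \]

with $C = \varepsilon/\delta$, since $x/\alpha \in U_{p_1}(1) \cap \dots \cap U_{p_n}(1)$ if $\alpha > \max_{j \in J} p_j(x)$. Hence $(\delta/\alpha)Tx \in U_q(\varepsilon)$, that is, $q(Tx/\alpha) < \varepsilon/\delta$ and $q(Tx) < \alpha\varepsilon/\delta$, and we obtain $q(Tx) \le C \max_{j \in J} p_j(x)$ by allowing $\alpha \downarrow \max_{j \in J} p_j(x)$.

Finally, $x_n \to 0$ if and only if $x_n \in U_p(\varepsilon)$ when $n \ge \nu(p, \varepsilon)$, i.e., eventually $p(x_n) < \varepsilon$, for every $U_p(\varepsilon)$. □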

Suppose that $E$ is a locally convex space and that $F$ is a closed subspace of $E$. The linear quotient map $\pi : E \to E/F$ is defined as $\pi(x) = \tilde{x} = x + F$ and, if $\mathcal{P}$ is a sufficient family of semi-norms on $E$, we consider on $E/F$ the collection $\tilde{\mathcal{P}}$ of semi-norms $\tilde{p}$ defined by

\[ \tilde{p}(\tilde{x}) = \inf_{y \in \tilde{x}} p(y) = \inf_{z \in F} p(x - z), \]

for every $p \in \mathcal{P}$. It is clear that every functional $\tilde{p}$ is a semi-norm on $E/F$. The family $\tilde{\mathcal{P}}$ is sufficient, since $\tilde{p}(\tilde{x}) = 0$ for all $\tilde{p}$ means that $\inf_{z \in F} p(x - z) = 0$ for all $p$, every ball $x + U_p(\varepsilon)$ meets $F$, and then $x \in \bar{F} = F$ and $\tilde{x} = 0$ in $E/F$.

This new topological vector space $E/F$ is the quotient locally convex space of $E$ modulo $F$, and the quotient map is continuous, since $\tilde{p}(\pi(x)) = \tilde{p}(\tilde{x}) \le p(x)$ for every $\tilde{p} \in \tilde{\mathcal{P}}$. If $E$ is a normed space, $E/F$ is also a normed space.

In a locally convex space $E$ with the sufficient family of semi-norms $\mathcal{P}$, a subset $A$ is said to be bounded if $p(A)$ is a bounded set in $\mathbf{R}$ for every $p \in \mathcal{P}$ or, equivalently, if for every neighborhood $U$ of 0 we have $A \subset rU$ for some $r > 0$.

Indeed, if every $p$ is bounded on $A$ and if $U$ is a neighborhood of 0, we can choose $U_{p_1}(\varepsilon) \cap \dots \cap U_{p_n}(\varepsilon) \subset U$ and, if $p_j \le M_j$ on $A$ ($1 \le j \le n$), then $A \subset rU$ when $r > \max_j M_j/\varepsilon$.

Conversely, suppose that $A$ satisfies the condition $A \subset rU_p(1) = U_p(r)$. Then $r$ is an upper bound for $p(A)$.

A compact subset $K$ of the locally convex space $E$ is bounded, since every continuous semi-norm $p$ is bounded on $K$. If every bounded closed subset of $E$ is compact, it is said that $E$ has the Heine-Borel property. As we have seen in Theorem 2.28, a normed space has this property only if its dimension is finite.

3.1.2. Fréchet spaces. In many important examples, the sufficient family of semi-norms will be finite (this is the case of normable spaces, with a single norm) or countable (as in Example 3.3). In this case the locally convex space is said to be countably semi-normable and then, if $\mathcal{P} = \{p_1, p_2, \dots, p_n, \dots\}$ is a sufficient sequence of semi-norms, the sequence of all $U_n = \bigcap_{k=1}^{n} U_{p_k}(1/n)$ forms a decreasing countable local basis of open, convex, and balanced sets, and $\bigcap_{n=1}^{\infty} U_n = \{0\}$, since the topology is Hausdorff.

We can assume $p_1 \le p_2 \le \cdots$ since for the increasing sequence of semi-norms $q_1 = p_1$, $q_2 = \sup(p_1, p_2)$, $q_3 = \sup(p_1, p_2, p_3)$, ... we obtain $U_{q_n}(\varepsilon) = U_{p_1}(\varepsilon) \cap \dots \cap U_{p_n}(\varepsilon)$ and both families of semi-norms define the same topology.

The Fréchet norm associated to the sequence of semi-norms will be the function

\[ \|x\| := \sum_{n=1}^{\infty} 2^{-n}\, \frac{p_n(x)}{1 + p_n(x)}. \]

It is not a true norm, but it has the following properties:

1. $\|x\| \ne 0$ if $x \ne 0$, since $p_n(x) = 0$ for all $n$ implies $x = 0$.

2. $\|-x\| = \|x\|$, since $p_n(-x) = p_n(x)$.

3. $\|x + y\| \le \|x\| + \|y\|$.

To show this last property, note that $\frac{a}{1+a} \le \frac{b}{1+b}$ if $0 \le a \le b$, so that, for every $n$,

\[ \frac{p_n(x+y)}{1 + p_n(x+y)} \le \frac{p_n(x) + p_n(y)}{1 + p_n(x) + p_n(y)} \le \frac{p_n(x)}{1 + p_n(x)} + \frac{p_n(y)}{1 + p_n(y)}. \]


Theorem 3.5. Let $\|\cdot\|$ be the Fréchet norm of a countably semi-normable locally convex space $E$. Then the distance $d(x, y) = \|x - y\|$ defines the topology of $E$.

Proof. Let us show that every ball $B_d(0, \delta)$ for the distance $d$ contains some $U \in \mathcal{U}$ and that, conversely, every $U_{p_m}(\varepsilon)$ contains a ball $B_d(0, \delta)$.

(a) $B_d(0, 1/2^k)$ contains $U_{p_{k+1}}(1/2^{k+1})$: If $p_{k+1}(x) < 1/2^{k+1}$, then $p_1(x) \le \cdots \le p_{k+1}(x) < 1/2^{k+1}$ and

\[ \|x\| < \sum_{n=1}^{k+1} \frac{1}{2^n}\, \frac{1/2^{k+1}}{1 + 1/2^{k+1}} + \sum_{n=k+2}^{\infty} \frac{1}{2^n} < \frac{1}{2^k}. \]

(b) The $p_m$-ball $U_{p_m}(1/2^k)$ contains $B_d(0, 1/2^{k+m+1})$ since

\[ 2^{-m}\, \frac{p_m(x)}{1 + p_m(x)} < \frac{1}{2^{k+m+1}} \]

if $\|x\| < 1/2^{k+m+1}$; therefore $p_m(x)/(1 + p_m(x)) < 1/2^{k+1}$ and it follows that $p_m(x) < 1/2^k$. □
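As a small illustration of why the Fréchet norm is not a true norm, the sketch below evaluates it exactly on $\mathcal{C}(\mathbf{R})$ for the semi-norms $p_n(f) = \|f\|_{[-n,n]}$, in the special case where $p_n(f)$ is constant from some index on. The helper name `frechet_norm` and the three sample functions, $f(x) = (1 - |x|)^+$, $g(x) = 100 f(x - 2)$, and $h = (f + g)/2$, are choices made here for illustration. Their semi-norm values were worked out by hand and are entered as data, so the code assumes them rather than checking them.

```python
from fractions import Fraction

def frechet_norm(head, tail_value):
    """Exact value of sum_{n>=1} 2^{-n} p_n / (1 + p_n), assuming that
    p_1..p_m are listed in `head` and p_n = tail_value for every n > m."""
    total = Fraction(0)
    for n, p in enumerate(head, start=1):
        p = Fraction(p)
        total += Fraction(1, 2**n) * p / (1 + p)
    t = Fraction(tail_value)
    # The remaining weights sum to sum_{n>m} 2^{-n} = 2^{-m}.
    total += Fraction(1, 2**len(head)) * t / (1 + t)
    return total

# Hand-computed semi-norms (assumptions of this sketch):
#   f = (1 - |x|)^+         : p_n(f) = 1 for all n
#   g = 100 f(x - 2)        : p_1(g) = 0, p_n(g) = 100 for n >= 2
#   h = (f + g) / 2         : p_1(h) = 1/2, p_n(h) = 50 for n >= 2
norm_f = frechet_norm([], 1)
norm_g = frechet_norm([0], 100)
norm_h = frechet_norm([Fraction(1, 2)], 50)

print(norm_f, norm_g, norm_h)
```

Both $f$ and $g$ lie in the closed ball of radius $1/2$ around 0, while their midpoint $h$ does not, so this ball is not convex. That could not happen for the distance of a norm.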

Let $E$ be a countably semi-normable locally convex space with the sufficient family of semi-norms $\mathcal{P}$. We say that $\{x_n\} \subset E$ is a Cauchy sequence if eventually $x_k - x_m \in U$ for every 0-neighborhood $U$ or, equivalently, $p(x_k - x_m) \to 0$ for every $p \in \mathcal{P}$. By Theorem 3.5, every Cauchy sequence of $E$ is convergent if and only if the metric space $E$ with the distance $d$ defined by the Fréchet norm $\|\cdot\|$ is complete. Then we say that $E$ is a Fréchet space.² Every Banach space is a normable Fréchet space, but there are many other important Fréchet spaces that are not normable.

Theorem 3.6. The countably semi-normable space $\mathcal{C}(\mathbf{R})$ of Example 3.3 with the sufficient sequence of semi-norms $p_n(f) = \|f\|_{[-n,n]}$ is a Fréchet space. It is not normable, since there is no norm $\|\cdot\|$ with the property

\[ p_n(f) \le C_n \|f\| \qquad (f \in \mathcal{C}(\mathbf{R}),\ C_n > 0 \text{ constant}), \]

for all $n \in \mathbf{N}$.

Proof. If $\mathcal{C}(\mathbf{R})$ were normable by the norm $\|\cdot\|$, then for $f$ such that $f(n) = nC_n$ we would have $nC_n \le |f(n)| \le p_n(f) \le C_n\|f\|$ and $n \le \|f\|$ for all $n \in \mathbf{N}$, which is impossible.

²Named after the French mathematician Maurice Fréchet, who with his 1906 dissertation titled “Sur quelques points du calcul fonctionnel” is considered one of the founders of modern functional analysis. See footnote 2 in Chapter 1.


If $\{f_k\}$ is a Cauchy sequence, it is uniformly convergent on every interval $[-n, n]$ to a certain continuous function $g_n$ and there is a common extension of all of them to a function $g$ on $\mathbf{R}$, since $g_n$ is the restriction of $g_{n+1}$. Obviously, $f_k \to g$ uniformly on every $[-n, n]$. □

The construction of Example 3.3 can be extended to the setting of the class $\mathcal{C}(\Omega)$ of all complex continuous functions $f : \Omega \to \mathbf{C}$ on an open subset $\Omega$ of $\mathbf{R}^n$, which is the union of an increasing sequence of the compact sets of $\mathbf{R}^n$

\[ K_m = \bar{B}(x_0, m) \cap \Big\{x \in \Omega;\ d(x, \Omega^c) \ge \frac{1}{m}\Big\}. \tag{3.2} \]

Every $K_m$ is a subset of the interior $G_{m+1}$ of the next one, $K_{m+1}$. If $\Omega = \mathbf{R}^n$, $K_m = \bar{B}(x_0, m)$. Every compact set $K \subset \Omega$ is covered by $K_N$ for some $N$, since $\Omega = \bigcup_{m=1}^{\infty} G_m$ and $K \subset \bigcup_{m=1}^{N} G_m \subset K_N$.

It is also easily shown, as for $\mathcal{C}(\mathbf{R})$, that $\mathcal{C}(\Omega)$, as a complex vector space with the usual operations and with the topology associated to the sufficient increasing sequence of semi-norms $q_m(f) = \|f\|_{K_m}$, is a Fréchet space. Since $\|f\|_K \le q_N(f)$ if $K \subset K_N$, the family of all semi-norms $\|\cdot\|_K$ ($K$ any compact subset of $\Omega$) defines the same topology on $\mathcal{C}(\Omega)$, which is again the topology of the local uniform convergence.

To define the important example of $C^\infty$ functions, we introduce some terminology that will be useful in the future.

Let us denote $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbf{N}^n$ and $D^\alpha = \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n}$ as in Example 2.7. For every compact subset $K$ of $\mathbf{R}^n$, we define

\[ p_{K,\alpha}(f) = \|D^\alpha f\|_K = \sup_{x \in K} |D^\alpha f(x)| \]

if $D^\alpha f$ exists.

Then $\mathcal{E}(\Omega)$ will represent the locally convex space of all $C^\infty$ functions on $\Omega$ with the topology defined by the sufficient increasing sequence of semi-norms

\[ q_j(f) = \sum_{|\alpha| \le j} \|D^\alpha f\|_{K_j} = \sum_{|\alpha| \le j} p_{K_j,\alpha}(f) \qquad (j = 1, 2, \dots). \]

If $K$ is any compact subset of $\Omega$, then $K \subset K_j$ for all $j \ge N$, for some $N \in \mathbf{N}$, so that $p_{K,\alpha}(f) \le q_j(f)$ for some $j$, and the topology of $\mathcal{E}(\Omega)$ is also the topology of the semi-norms $p_{K,\alpha}$. A sequence $\{f_k\} \subset \mathcal{E}(\Omega)$ is convergent to 0 if $D^\alpha f_k \to 0$ uniformly on every compact subset of $\Omega$, for every $\alpha \in \mathbf{N}^n$.

For every compact subset $K$ of $\Omega$, the collection of functions

\[ \mathcal{D}_K(\Omega) = \{f \in \mathcal{E}(\Omega);\ \operatorname{supp} f \subset K\} \tag{3.3} \]

is clearly a closed subspace of $\mathcal{E}(\Omega)$. By extending functions by zeros, we can suppose that $\mathcal{D}_K(\Omega) \subset \mathcal{E}(\mathbf{R}^n)$.

Theorem 3.7. $\mathcal{E}(\Omega)$ is a Fréchet space.

Proof. A sequence $\{f_j\}_{j=1}^{\infty} \subset \mathcal{E}(\Omega)$ is a Cauchy sequence if and only if every $\{D^\alpha f_j\}_{j=1}^{\infty}$ is uniformly Cauchy over every compact set $K$ and then it is uniformly convergent on $K$. Hence, $D^\alpha f_j \to f^\alpha$ uniformly on compact sets and then $D^\alpha f = f^\alpha$ if $f = f^0$. This means that $f_j \to f$ in $\mathcal{E}(\Omega)$. □

Similarly, for any $m \in \mathbf{N}$ the vector space $\mathcal{E}^m(\Omega)$ of all $C^m$ functions $f : \Omega \to \mathbf{C}$ with the topology defined by the semi-norms

\[ p_K(f) := \sum_{|\alpha| \le m} \|D^\alpha f\|_K \]

is also a Fréchet space. A sequence $\{f_k\} \subset \mathcal{E}^m(\Omega)$ is convergent to 0 if $D^\alpha f_k \to 0$ uniformly on every compact subset of $\Omega$, for every $|\alpha| \le m$.

Also, $\mathcal{D}^m_K(\Omega) = \{f \in \mathcal{E}^m(\Omega);\ \operatorname{supp} f \subset K\}$ is a closed subspace of $\mathcal{E}^m(\Omega)$, for every compact subset $K$ of $\Omega$.

3.2. Banach theorems

We are going to present some profound results that prove to be very useful to show the continuity of linear operators. The ideas are mainly due to Stefan Banach and depend on the following Baire category principle concerning general complete metric spaces.

Theorem 3.8 (Baire). If $\{G_n\}_{n=1}^{\infty}$ is a sequence of dense and open subsets in a complete metric space $M$, then $A := \bigcap_{n=1}^{\infty} G_n$ is also dense in $M$. It is said that $A$ is a dense $G_\delta$-set.

Proof. We need to show that $A \cap G \ne \emptyset$ for every nonempty open set $G$ in $M$.

Since $G_1$ is dense, we can find a ball $\bar{B}(x_1, r_1) \subset G \cap G_1$ with $r_1 < 1$. Similarly, we can choose $\bar{B}(x_2, r_2) \subset B(x_1, r_1) \cap G_2$ and $r_2 < 1/2$, since $G_2$ is dense. In this way, by induction, we can produce

\[ \bar{B}(x_n, r_n) \subset B(x_{n-1}, r_{n-1}) \cap G_n, \qquad r_n < 1/n. \]

Then $d(x_p, x_q) \le 2/n$ if $p, q \ge n$, since $x_p, x_q \in B(x_n, r_n)$, and $\{x_n\}$ is a Cauchy sequence in $M$. But $M$ is complete, so we have $\lim_n x_n = x$.

Since $x_k$ lies in the closed set $\bar{B}(x_n, r_n)$ if $k \ge n$, it follows that $x$ lies in each $\bar{B}(x_n, r_n) \subset G_n$ and $x \in A$. Also $x \in \bar{B}(x_1, r_1) \subset G$ and $A \cap G \ne \emptyset$. □


Corollary 3.9. Let $F_n$ ($n \in \mathbf{N}$) be a countable family of closed subsets of a complete metric space $M$ containing no interior points for every $n \in \mathbf{N}$. Then the union $B = \bigcup_{n=1}^{\infty} F_n$ has no interior point either.

Proof. Every open set $G_n := F_n^c$ is dense since, for any nonempty open set $G$, $G \not\subset F_n$, so that $G \cap G_n \ne \emptyset$. According to Theorem 3.8, $A := \bigcap_{n=1}^{\infty} G_n$ is dense, so that $A^c = B$ does not contain any nonempty open set. □

Theorem 3.10 (Banach-Schauder³). Let $E$ and $F$ be two Fréchet spaces. If $T : E \to F$ is a continuous linear operator and $T(E) = F$, then $T$ is an open mapping, that is, $T(G)$ is open in $F$ if $G$ is an open subset of $E$. In particular, if $T$ is a bijective and continuous linear mapping, then $T^{-1}$ is also continuous.

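Proof. Let $\|\cdot\|$ be a Fréchet norm on $E$ and let $\mathcal{U}$ be the local basis of $E$ that contains all the open balls $U_r = \{x \in E;\ \|x\| < r\}$. Similarly, let $\mathcal{V}$ be the local basis of $F$ formed by the open balls $V_\sigma = \{y \in F;\ \|y\| < \sigma\}$ of a Fréchet norm on $F$.

We want to prove that every $T(U_r)$ is a zero neighborhood in $F$, since then, if $G$ is an open subset of $E$ and $Ta \in T(G)$ ($a \in G$), there exists $B(a, r) = a + U_r \subset G$ that satisfies $T(a + U_r) \subset T(G)$ and $Ta$ is an interior point in $T(G)$, since $T(a + U_r) = Ta + T(U_r)$ is a neighborhood of $Ta$.

Let us start by showing that every closure $\overline{T(U_r)}$ is a zero neighborhood in $F$. For every $x \in E$, $(1/n)x \to 0$ and $x \in nU_{r/2}$ for some $n \in \mathbf{N}$, so that $E = \bigcup_{n=1}^{\infty} nU_{r/2}$ and $F = T(E) = \bigcup_{n=1}^{\infty} nT(U_{r/2}) = \bigcup_{n=1}^{\infty} n\overline{T(U_{r/2})}$. By the corollary of Baire's theorem, $\overline{T(U_{r/2})}$ has at least one interior point $y$ and we can find $y + V \subset \overline{T(U_{r/2})}$ ($V \in \mathcal{V}$). We have $y \in \overline{T(U_{r/2})} = -\overline{T(U_{r/2})}$, so that $V \subset -y + \overline{T(U_{r/2})} \subset \overline{T(U_{r/2})} + \overline{T(U_{r/2})} \subset \overline{T(U_r)}$ and $\overline{T(U_r)}$ is a neighborhood of $0 \in F$.

If $s > 0$ and $r = s/2$, let us prove now that $V_\sigma \subset \overline{T(U_r)}$ implies $V_\sigma \subset T(U_s)$, and then $T(U_s)$ will be a neighborhood of 0.

Let $y \in V_\sigma$, so that $\|y\| < \sigma$. To prove that $y \in T(U_s)$, we will find $x \in U_s$ so that $y = Tx$. Write $s = \sum_{n=1}^{\infty} r_n$ with $r_n > 0$ and $r_1 = r$. We know that $\overline{T(U_{r_n})}$ is a neighborhood of 0 and, for every $n \ge 2$, there is a ball $V_{\sigma_n} \subset \overline{T(U_{r_n})}$, and we can suppose that $\sigma_n \downarrow 0$. We have $y \in V_\sigma \subset \overline{T(U_{r_1})}$, so that $\|y - Tz_1\| < \sigma_2$ with $z_1 \in U_{r_1}$, i.e., $\|z_1\| < r_1$.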

³Also known as the open mapping theorem in functional analysis, it was published in 1929 by S. Banach by means of duality in the Banach spaces setting in the first issue of the journal Studia Mathematica and also in 1930 in the same journal by J. Schauder, essentially with the usual direct proof that we have included here.


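By induction, $y - Tz_1 \in V_{\sigma_2} \subset \overline{T(U_{r_2})}$, $\|y - Tz_1 - Tz_2\| < \sigma_3$ with $\|z_2\| < r_2$, and, in general, $\|y - Tz_1 - \cdots - Tz_n\| < \sigma_{n+1}$ with $\|z_n\| < r_n$. Since $\sum_{n=1}^{\infty} \|z_n\| < \sum_{n=1}^{\infty} r_n = s$, the partial sums of $\sum_n z_n$ form a Cauchy sequence in the complete space $E$, so that $x := \sum_{n=1}^{\infty} z_n$ exists and $\|x\| \le \sum_{n=1}^{\infty} \|z_n\| < s$; that is, $x \in U_s$. By the continuity of $T$ and since $\sigma_n \downarrow 0$, $Tx = \lim_n (Tz_1 + \cdots + Tz_n) = y$. □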

If $T : E \to F$ is a continuous function between two metric spaces, it is obvious that its graph

\[ \mathcal{G}(T) := \{(x, y) \in E \times F;\ y = Tx\} \]

is a closed subset of the product space $E \times F$, since if $(x_n, Tx_n) \to (x, y)$ in $E \times F$, then $Tx_n \to Tx$ and $y = Tx$.

That the converse is true if $T$ is a linear operator between two Fréchet spaces is a corollary of the Banach-Schauder theorem:

Theorem 3.11 (Closed graph theorem). Let $E$ and $F$ be two Fréchet spaces. A linear map $T : E \to F$ is continuous if and only if its graph $\mathcal{G}(T)$ is closed in $E \times F$.

Proof. First let us show that $E \times F$, with the product topology, is a Fréchet space.

Let $\|\cdot\|_E$ and $\|\cdot\|_F$ be the Fréchet norms defined by the increasing sufficient sequences of semi-norms $\{p_n\}$ and $\{q_n\}$ for $E$ and $F$, respectively. The corresponding product topology on $E \times F$ is the topology associated to the sufficient sequence of semi-norms $r_n(x, y) := \max(p_n(x), q_n(y))$, since the $r_n$-balls $U_{r_n}(\varepsilon) = U_{p_n}(\varepsilon) \times U_{q_n}(\varepsilon)$ form a local basis for both topologies. Then $\{(x_n, y_n)\}$ is convergent in $E \times F$ if and only if both $\{x_n\} \subset E$ and $\{y_n\} \subset F$ are convergent, and $E \times F$ is complete if and only if $E$ and $F$ are complete.

If the vector subspace $\mathcal{G}(T)$ is closed in $E \times F$, it is a Fréchet space with the restriction of the product topology in $E \times F$, and we denote by $\pi_1(x, Tx) = x$ and $\pi_2(x, Tx) = Tx$ the restrictions of the projections of $E \times F$ on $E$ and $F$, respectively. They are two continuous linear maps $\pi_1 : \mathcal{G}(T) \to E$ and $\pi_2 : \mathcal{G}(T) \to F$.

Now $\pi_1$ is bijective and, by the Banach-Schauder theorem, the inverse map $\pi_1^{-1} : x \mapsto (x, Tx)$ is continuous from $E$ to $\mathcal{G}(T)$. But $Tx = \pi_2(\pi_1^{-1}x)$ and $T$ is continuous as a composition of two continuous maps. □

Remark 3.12. For a map $T : E \to F$ between two metric spaces $E$ and $F$, the graph $\mathcal{G}(T)$ is closed in $E \times F$ if and only if

\[ x_n \to x \text{ and } Tx_n \to y \ \Rightarrow\ y = Tx. \]

This condition is clearly weaker than a continuity assumption, which means that $x_n \to x \Rightarrow \exists\, y = \lim Tx_n$ and $y = Tx$. Note that in the case of a linear map $T$ between normed or Fréchet spaces, if $x_n \to 0$ and $Tx_n \to y \Rightarrow y = 0$, then the graph is closed.

In a Hilbert space $H$, an orthogonal projection is a bounded linear operator $P$ such that $P^2 = P$ and $(Px, y)_H = (x, Py)_H$. These last two properties characterize the orthogonal projections:

Theorem 3.13. Let $H$ be a Hilbert space, and let $P : H \to H$ be a mapping such that $(Px, y)_H = (x, Py)_H$. Then $P$ is linear and bounded, and it is an orthogonal projection if $P^2 = P$.

Proof. The linearity of $P$ is clear; see (b) in Theorem 2.35. To prove that $P$ is bounded, suppose that $x_n \to 0$ and $Px_n \to y$; then, for any $x \in H$,

\[ (x, y)_H = \lim_n (x, Px_n)_H = \lim_n (Px, x_n)_H = 0. \]

Choose $x = y$ and then $y = 0$, so that the graph of $P$ is closed and $P$ is continuous by Theorem 3.11.

If $P^2 = P$, let $F = P(H)$ and $Q = I - P$. Then also $(Qx, y)_H = (x, Qy)_H$, $Q^2 = Q$, and $F = \operatorname{Ker} Q$ is a closed subspace of $H$. Furthermore $Q(H) = F^\perp$, since $(Px, Qy)_H = (x, P(y - Py))_H = 0$ and $x = Px + Qx = y + z \in F \oplus F^\perp$ as in Theorem 2.35(b). □

It is a well-known elementary fact that a pointwise limit of continuous functions need not be continuous, but with the Banach-Steinhaus theorem⁴ we will prove that, for Banach spaces, a pointwise limit of continuous linear operators is always continuous. This theorem is an application of the following uniform boundedness principle:

⁴First published in 1927 by S. Banach and H. Steinhaus but also found independently by the Austrian mathematician Hans Hahn, who worked at the Universities of Vienna and Innsbruck.

Theorem 3.14. Let $T_j : E \to F$ ($j \in J$) be a family of bounded linear operators between two Banach spaces, $E$ and $F$. Then
(a) either $M := \sup_{j \in J} \|T_j\| < \infty$, and the operators are uniformly bounded on the unit ball of $E$, or
(b) $\psi(x) := \sup_{j \in J} \|T_j(x)\|_F = \infty$ for every $x$ belonging to a dense $G_\delta$-set $A \subset E$.

Proof. The level sets $G_n := \{\psi > n\}$ of the function $\psi : E \to [0, \infty]$ are open subsets of $E$, since $G_n = \bigcup_{j \in J} \{x;\ \|T_jx\|_F > n\}$ and every $T_j$ is continuous. Now we consider two possibilities:

(a) If one of the sets $G_n$, say $G_m$, is not dense in $E$, there is a ball $\bar{B}(a, r)$ contained in $G_m^c$, so that

\[ \psi(a + x) \le m \quad \text{whenever } \|x\|_E \le r, \]

which means that $\|T_j(a + x)\|_F \le m$ for all $j \in J$ if $\|x\|_E \le r$. Then

\[ \|T_j(x)\|_F \le \|T_j(a + x)\|_F + \|T_j(a)\|_F \le 2m \qquad (\|x\|_E \le r) \]

or, equivalently, $\|T_j(x)\|_F \le 2m/r$ if $\|x\|_E \le 1$, for every $j \in J$, and it follows that $M \le 2m/r < \infty$.

(b) If every $G_n$ is dense, then $\psi(x) = \infty$ for every $x \in A := \bigcap_{n=1}^{\infty} G_n$, since then $\psi(x) > n$ for every $n \in \mathbf{N}$. By Theorem 3.8, $A$ is a dense subset of $E$. □

Theorem 3.15 (Banach-Steinhaus). Let $T_n : E \to F$ ($n \in \mathbf{N}$) be a sequence of bounded linear operators between two Banach spaces such that the sequence $\{T_nx\}$ is bounded for every $x \in E$. Suppose further that the limit $\lim_n T_n(x)$ exists in $F$ for every point $x$ belonging to a dense subset $D$ of $E$.

Then $T : D \to F$ such that $Tx = \lim_n T_n(x)$ extends to a bounded linear operator $T : E \to F$ such that

\[ \|T\| \le \liminf_n \|T_n\|. \]

Thus, every sequence $\{T_n\} \subset \mathcal{L}(E; F)$ such that $Tx = \lim_n T_n(x)$ exists for every $x \in E$ defines a bounded operator $T \in \mathcal{L}(E; F)$.

Proof. By Theorem 3.14, $M := \sup_{n \in \mathbf{N}} \|T_n\| < \infty$. For every $x \in E$, $\{T_n(x)\}$ is a Cauchy sequence in the Banach space $F$. Indeed, if $\varepsilon > 0$, there exist $z \in D$ so that $\|x - z\|_E \le \varepsilon$ and $n \in \mathbf{N}$ so that $\|T_p(z) - T_q(z)\|_F \le \varepsilon$ whenever $p, q \ge n$. Then

\[ \|T_p(x) - T_q(x)\|_F \le \|T_p(x) - T_p(z)\|_F + \|T_p(z) - T_q(z)\|_F + \|T_q(z) - T_q(x)\|_F \le 2M\varepsilon + \varepsilon. \]

We define $T(x) := \lim_n T_n(x)$ and $T : E \to F$ is obviously linear. Moreover,

\[ \|T(x)\|_F = \lim_n \|T_n(x)\|_F \le \liminf_n \|T_n\|\, \|x\|_E \]

and it follows that $\|T\| \le \liminf_n \|T_n\|$. □
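The completeness of $E$ cannot be dropped from the uniform boundedness principle. The following standard counterexample is not part of the text; the space $c_{00}$ of finitely supported sequences with the sup norm and the forms $T_n$ are introduced here only for illustration.

```latex
% Assumed setting for this sketch: E = c_{00} (finitely supported sequences),
% with the sup norm, which is NOT complete; F is the scalar field.
\[
  T_n(x) := n\,x_n \qquad (x = (x_1, x_2, \dots) \in c_{00}), \qquad
  \|T_n\| = \sup_{\|x\|_\infty \le 1} |n\,x_n| = n .
\]
% Pointwise, the sequence is eventually zero:
\[
  T_n(x) = 0 \ \text{ if } n > \max\{k;\ x_k \ne 0\},
  \qquad\text{so}\qquad \lim_n T_n(x) = 0 \ \text{ for every } x \in c_{00},
\]
\[
  \text{but} \qquad \sup_n \|T_n\| = \infty .
\]
```

The sequence $\{T_n x\}$ is bounded for every $x$ and the pointwise limit is the zero form, yet the norms $\|T_n\|$ are unbounded. This is compatible with Theorems 3.14 and 3.15 only because $c_{00}$ is not a Banach space.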

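3.2.1. An application to the convergence problem of Fourier series. If $f \in L^1(\mathbf{T})$ and $c_k(f) = \int_{\mathbf{T}} f(t) e_{-k}(t)\, dt$, recall that $c(f) = \{c_k(f)\}_{k=-\infty}^{+\infty} \in c_0$ and obviously $\|c(f)\|_\infty \le \|f\|_1$.

The Fourier mapping $f \in L^1(\mathbf{T}) \mapsto c(f) \in c_0$ is continuous and injective, but it cannot be surjective, since in this case the inverse map would also be continuous and $\|c(f)\|_\infty \ge \delta\|f\|_1$ for some $\delta > 0$, which leads to a contradiction. Indeed, if $D_N$ is the Dirichlet kernel (see (2.23)), then $\|c(D_N)\|_\infty = 1$ and $\|D_N\|_1 \to \infty$, since

\[ \|D_N\|_1 \ge \frac{2}{\pi} \int_0^{\pi} \frac{|\sin[(N + 1/2)t]|}{t}\, dt = \frac{2}{\pi} \int_0^{(N+1/2)\pi} \frac{|\sin t|}{t}\, dt \ge \frac{2}{\pi} \sum_{k=1}^{N} \frac{1}{k\pi} \int_{(k-1)\pi}^{k\pi} |\sin t|\, dt = \frac{4}{\pi^2} \sum_{k=1}^{N} \frac{1}{k}. \]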
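The growth of $\|D_N\|_1$ can be checked numerically. The sketch below assumes the normalization $\|D_N\|_1 = \frac{1}{\pi}\int_0^{\pi} \frac{|\sin((N+1/2)t)|}{\sin(t/2)}\,dt$, that is, the normalized measure $dt/2\pi$ on $\mathbf{T}$ and the kernel $D_N(t) = \sin((N+1/2)t)/\sin(t/2)$, which is consistent with $\|c(D_N)\|_\infty = 1$ but should be compared with (2.23). The function names and the midpoint rule are choices made here.

```python
import math

def dirichlet_l1_norm(N, steps=20000):
    """Midpoint-rule approximation of
    (1/pi) * integral_0^pi |sin((N + 1/2) t)| / sin(t/2) dt."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoints avoid the removable singularity at t = 0
        total += abs(math.sin((N + 0.5) * t)) / math.sin(t / 2)
    return total * h / math.pi

def harmonic_lower_bound(N):
    """The lower bound (4 / pi^2) * sum_{k=1}^N 1/k from the estimate in the text."""
    return 4 / math.pi**2 * sum(1 / k for k in range(1, N + 1))

for N in (1, 5, 20, 80):
    print(N, round(dirichlet_l1_norm(N), 4), round(harmonic_lower_bound(N), 4))
```

In each row the first value should dominate the second, and both grow like a multiple of $\log N$.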

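The Fourier sums are the operators $S_N = D_N * : \mathcal{C}(\mathbf{T}) \to \mathcal{C}(\mathbf{T})$. Their norms $L_N := \|S_N\|$ are called the Lebesgue numbers; obviously $L_N \le \|D_N\|_1$ and it is shown that

\[ L_N = \|D_N\|_1 \]

holds by considering a real function $g$ which is a continuous modification of $\operatorname{sgn} D_N$ such that $|g| \le 1$ and $|S_N(g, 0)| \ge \|D_N\|_1 - \varepsilon$.

Theorem 3.16. (a) There are functions $f \in L^1(\mathbf{T})$ such that $S_Nf \not\to f$ in $L^1(\mathbf{T})$ as $N \to \infty$.

(b) There are functions $f \in \mathcal{C}(\mathbf{T})$ such that $S_Nf \not\to f$ in $\mathcal{C}(\mathbf{T})$ as $N \to \infty$. In fact, for every $x \in \mathbf{R}$ there is a dense subset $A_x$ of $\mathcal{C}(\mathbf{T})$ such that, for every $f \in A_x$, $\sup_N |S_N(f, x)| = \infty$.

Proof. (a) Note that $L_N \to \infty$ and that the norm of $S_N$ as an operator on $L^1(\mathbf{T})$ is also $\|D_N\|_1$. According to Theorem 3.14,

\[ \sup_N \|S_N(f)\|_1 = \infty \]

for every $f$ belonging to a dense set $A \subset L^1(\mathbf{T})$.

(b) Again, the linear forms $u_N(f) := S_N(f, x)$ are continuous on $\mathcal{C}(\mathbf{T})$, with $\|u_N\| = L_N$. □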


Remark 3.17. By the Fischer-Riesz Theorem 2.37, $S_Nf \to f$ in $L^p(\mathbf{T})$ for every $f \in L^p(\mathbf{T})$ if $p = 2$, and it can be shown, though it is much harder, that this is still true when $1 < p < \infty$.

3.3. Exercises

Exercise 3.1. Suppose $A$ is a convex absorbing subset of a vector space $E$, and let

\[ q_A(x) := \inf\{t > 0;\ x \in tA\}, \]

which is called the Minkowski functional of $A$. Denote $B = \{q_A < 1\}$ and $\bar{B} = \{q_A \le 1\}$.

(a) Prove that $q_A(x + y) \le q_A(x) + q_A(y)$, $q_A(tx) = tq_A(x)$ if $t \ge 0$, and that $B \subset A \subset \bar{B}$ and $q_B = q_A = q_{\bar{B}}$.
(b) Suppose $A$ is also balanced and $\bigcup_{t>0} tA = E$. Prove that $q_A$ is a semi-norm.

Exercise 3.2. Suppose $A$ is a subset of a vector space $E$. Prove that the convex hull of $A$, defined as

\[ \operatorname{co}(A) = \Big\{t_1a_1 + \cdots + t_na_n;\ n \in \mathbf{N},\ t_j > 0,\ \sum_{j=1}^{n} t_j = 1,\ a_j \in A\ (1 \le j \le n)\Big\}, \]

is the intersection of all the convex subsets of $E$ that contain $A$. If $E$ is a locally convex space and $A$ is bounded, prove that $\operatorname{co}(A)$ is also bounded and that, if $A$ is open, then $\operatorname{co}(A)$ is also open.

Exercise 3.3. Prove that in every topological vector space $E$ the family of all balanced open neighborhoods of zero is a local basis of $E$.

Exercise 3.4. Prove that the class $\mathcal{H}(\Omega)$ of all holomorphic functions on an open subset $\Omega$ of $\mathbf{C}$ is a closed vector subspace of $\mathcal{C}(\Omega)$ and of $\mathcal{E}(\Omega)$. Hence, $\mathcal{H}(\Omega)$ is a Fréchet space with the topology of the semi-norms $\|\cdot\|_K$ ($K \subset \Omega$ compact).

Exercise 3.5. Show that, for every $p \in [1, \infty]$, the sequence of semi-norms

\[ q_n(f) := \|f^{(n)}\|_{L^p(\mathbf{R})} \qquad (n = 0, 1, \dots) \]

defines the topology of $\mathcal{D}_{[a,b]}(\mathbf{R})$.


Exercise 3.6. Every Fréchet norm, $\|\cdot\|$, satisfies the following properties:
(a) $\|x\| = 0 \Rightarrow x = 0$.
(b) $|\lambda| \le 1 \Rightarrow \|\lambda x\| \le \|x\|$.
(c) $\|x + y\| \le \|x\| + \|y\|$.
(d) $\lim_{\lambda \to 0} \|\lambda x\| = 0$.
(e) $\lim_{x \to 0} \|\lambda x\| = 0$.
(f) $\|\cdot\|$ is not a norm.

Exercise 3.7. If $\|\cdot\|$ is the Fréchet norm on $\mathcal{C}(\mathbf{R})$ associated to the semi-norms $p_n(f) = \|f\|_{[-n,n]}$, show that $\|f\| = 1/2$, $\|g\| = 50/101$, and $\|h\| > 1/2$ if $f(x) = (1 - |x|)^+$, $g(x) = 100f(x - 2)$, and $h = (f + g)/2$, and prove that the closed ball $\bar{B}_d(0, 1/2)$ for $d(f, g) = \|g - f\|$ is not convex.

Exercise 3.8. If $E = E_1 \times \cdots \times E_m$ is a finite product of Fréchet spaces endowed with the product topology, then prove that $E$ is also a Fréchet space and extend this result to countable products. Show that $\mathcal{E}(\mathbf{R})$ can be described as a closed subspace of $\prod_{0 \le k < \infty} \mathcal{C}(\mathbf{R})$.

Exercise 3.9. (a) Prove that $\mathcal{C}(\mathbf{R})$ does not have the Heine-Borel property by showing that the countable set of all “triangle functions”

\[ f_n(t) := \frac{d(t, [-1/n, 1/n]^c)}{|t| + d(t, [-1/n, 1/n]^c)} \]

is bounded and its closure is not compact.
(b) Extend this result to $\mathcal{C}(\Omega)$, where $\Omega$ is an open subset of $\mathbf{R}^n$.

Exercise 3.10. Suppose that $E_1$ and $E_2$ are two Fréchet spaces and that $M$ is a dense subspace of $E_1$. Prove that every continuous linear mapping $T : M \to E_2$ has a unique continuous linear extension $\tilde{T} : E_1 \to E_2$.

Exercise 3.11. Prove the following statements:
(a) If $E/F$ is a quotient locally convex space and $\pi : E \to E/F$ is the quotient map, then the image by $\pi$ of an open subset of $E$ is open in $E/F$.
(b) If $E$ is a Fréchet or Banach space, then $E/F$ is also a Fréchet space or Banach space, respectively.
(c) If $F$ and $M$ are two closed subspaces of a Banach (or Fréchet) space $E$ and $M$ is of finite dimension, then $F + M$ is also closed in $E$.

Exercise 3.12. Let $\{e_k\}_{k \in \mathbf{Z}}$ be an orthonormal basis of a Hilbert space $H$. Let $u_n = e_{-n} + ne_n$ and

\[ F := [e_n;\ n \ge 0], \qquad M := [u_n;\ n \ge 1]. \]

Prove that $F$ and $M$ are two closed subspaces of $H$ such that $F + M$ is dense and not closed in $H$. Note that $\sum_{n=1}^{\infty} \frac{1}{n} e_{-n} \in \ell^2(\mathbf{Z}) \setminus (F + M)$.


Exercise 3.13. As an application of Corollary 3.9, prove that $[a, b]$, and every compact metric space without isolated points, is uncountable.

Exercise 3.14. Prove that in an infinite-dimensional Banach space there is no countable algebraic basis.

Exercise 3.15. Find a noncontinuous function $f : \mathbf{R} \to \mathbf{R}$ with a closed graph in $\mathbf{R}^2$.

Exercise 3.16. If $T : \mathcal{C}[0, 1] \to \mathcal{C}[0, 1]$ is a linear map such that $Tf_n(t) \to Tf(t)$ at every $t \in [0, 1]$ whenever $f_n \to f$ in $\mathcal{C}[0, 1]$ (that is, uniformly), show that $T$ is continuous.

Exercise 3.17. Assume that $\mathcal{C}[0, 1]$ and $\mathcal{C}^1[0, 1]$ are endowed with the sup norm $\|\cdot\|_{[0,1]}$. Show that the graph of the derivative operator $D : \mathcal{C}^1[0, 1] \to \mathcal{C}[0, 1]$ ($Df = f'$) is closed but the operator is unbounded. Why does this not contradict Theorem 3.11?

Exercise 3.18. Let $T : E \to F$ be a linear map between two Fréchet spaces. Prove that, if $y = 0$ whenever $x_n \to 0$ in $E$ and $Tx_n \to y$ in $F$, then $T$ is continuous.

Exercise 3.19 (Uniform Boundedness Principle for metric spaces). Let $M$ be a complete metric space and $f_j : M \to \mathbf{R}$ ($j \in J$) a family of continuous functions which is bounded at each point $x \in M$, $|f_j(x)| \le C(x) < \infty$ for all $j \in J$. Prove that the functions $f_j$ are uniformly bounded on a nonempty open subset $G$ of $M$.

Exercise 3.20 (Approximate quadrature). Let $\{J_n\}$ be a sequence of linear forms on $\mathcal{C}[0, 1]$ of the type

\[ J_n(f) = \sum_{k=0}^{N(n)} A_n^k f(t_k^n), \]

where, for each $n$, $\{t_k^n\}_{k=0}^{N(n)}$ is a given finite sequence of points in $[0, 1]$ that are called the nodes of $J_n$. The sequence is called a quadrature method if

\[ \int_0^1 f(t)\, dt = \lim_{n \to \infty} J_n(f) \]

holds for every $f \in \mathcal{C}[0, 1]$.

Prove that $\|J_n\|_{\mathcal{C}[0,1]'} = \sum_{k=0}^{N(n)} |A_n^k|$ and that $\{J_n\}$ is a quadrature method if and only if the following conditions are satisfied:
(i) $\lim_{n \to \infty} J_n(x^{k-1}) = 1/k$ for every $k \in \mathbf{N}$ and
(ii) $\sup_n \sum_{k=0}^{N(n)} |A_n^k| < \infty$.
If $A_n^k \ge 0$ for all $n$ and $k$, then (i) implies (ii).


Exercise 3.21. (a) Suppose $E_1$, $E_2$, and $F$ are three Banach spaces. Prove that a bilinear map $B : E_1 \times E_2 \to F$ is continuous if and only if it is separately continuous; that is, the linear maps $B(\cdot, y)$ and $B(x, \cdot)$ are all continuous.
(b) If $E$ is the normed space of all real polynomial functions on $[0, 1]$ with the norm $\|f\|_1 = \int_0^1 |f(t)|\, dt$, prove that $B(f, g) := \int_0^1 f(t)g(t)\, dt$ defines a separately continuous bilinear map $B : E \times E \to \mathbf{R}$ which is not continuous.

Exercise 3.22. Let $E_1$, $E_2$, and $F$ be three Banach spaces. Prove that a bilinear map $B : E_1 \times E_2 \to F$ is continuous if and only if its graph

\[ \mathcal{G}(B) := \{(x_1, x_2, y) \in E_1 \times E_2 \times F;\ B(x_1, x_2) = y\} \]

is a closed subset of the product space $E_1 \times E_2 \times F$.

Exercise 3.23. Suppose $T : L^1(0, 1) \to L^1(0, 1)$ is a bounded linear operator and $p, q \ge 1$. If $T(L^p(0, 1)) \subset L^q(0, 1)$, is it necessarily true that the restriction $T : L^p(0, 1) \to L^q(0, 1)$ will also be continuous?

References for further reading: S. Banach, Théorie des opérations linéaires. S. K. Berberian, Lectures in Functional Analysis and Operator Theory. J. B. Conway, A Course in Functional Analysis. R. E. Edwards, Functional Analysis, Theory and Applications. G. Köthe, Topological Vector Spaces I. R. Meise and D. Vogt, Introduction to Functional Analysis. W. Rudin, Functional Analysis.

Chapter 4

Duality

An essential aspect of functional analysis is the study and applications of duality, which deals with continuous linear forms on functional spaces. In the case of a Hilbert space, the projection theorem will allow us to prove the description of the dual given by the Riesz representation theorem and by its extension known as the Lax-Milgram theorem, which is useful in the resolution of some boundary value problems, as we will see in some examples in Chapter 7. But if we are dealing with a more general normed space, or with any locally convex space, to ensure the existence of continuous linear extensions of continuous linear functionals defined on subspaces, we need the Hahn-Banach theorem. This theorem adopts several essentially equivalent versions, and we will start from its analytical or dominated extension form and then the geometric or separation form will follow. We include in this chapter a number of applications of both the Riesz and the Hahn-Banach theorems, such as the description of the duality of $L^p$ spaces, interpolation of linear operators, von Neumann’s proof of the Radon-Nikodym theorem, and an introduction to the spectral theory of compact operators.

4.1. The dual of a Hilbert space

In this section, $H$ will denote a Hilbert space. Its norm $\|\cdot\|_H$ is associated to a scalar product,

\[ \|x\|_H^2 = (x, x)_H. \]

Note that, for every $x \in H$, $(\cdot, x)_H$ is a linear form on $H$. We will see that its norm is equal to $\|x\|_H$ and that every continuous linear form is of this type, as for Euclidean spaces.

4.1.1. Riesz representation and Lax-Milgram theorem. The decomposition $H = F \oplus F^\perp$ given by the Projection Theorem 2.35 allows an easy proof of the following fundamental representation result concerning the dual of a Hilbert space.

Theorem 4.1 (Riesz representation). The map $J : H \to H'$ such that $J(x) = (\cdot, x)_H$ is a bijective skew linear isometry, that is,

(1) $\|(\cdot, x)_H\|_{H'} = \|x\|_H$,

(2) $J(x_1 + x_2) = J(x_1) + J(x_2)$,

(3) $J(\lambda x) = \bar{\lambda}J(x)$, and

(4) if $u \in H'$, then $u = (\cdot, x)_H$ for some $x \in H$.

Proof. The Schwarz inequality (2.11) means that $\|(\cdot, x)_H\|_{H'} \le \|x\|_H$, and the linear form $(\cdot, x)_H$ reaches this value $\|x\|_H$ at $x_0 = x/\|x\|_H$, which proves (1). The identities (2) and (3) are obvious.

Finally, if $0 \ne u \in H'$ and $F = \operatorname{Ker} u$, then $F \ne H = F \oplus F^\perp$ and there exists $z \in F^\perp$, $\|z\|_H = 1$. Since $u(z)x - u(x)z \in F$ for every $x \in H$, we have $0 = (u(z)x - u(x)z, z)_H$, which is equivalent to $u(x) = (x, \overline{u(z)}z)_H$. Thus, $u = (\cdot, \overline{u(z)}z)_H$. □

Example 4.2. By an application of the Riesz representation theorem to the Hilbert space $L^2 = L^2(\mu)$,

\[ (L^2)' = \{(\cdot, g)_2;\ g \in L^2\}, \]

and $g \mapsto (\cdot, g)_2$ is a skew linear isometric bijection from $L^2$ onto $(L^2)'$. Since $g \in L^2 \mapsto \bar{g} \in L^2$ is also a skew linear isometric bijection, $g \mapsto (\cdot, \bar{g})_2$ is a bijective linear isometry from $L^2$ onto $(L^2)'$ that allows us to consider $L^2$ as its own dual. The notation

\[ \langle f, g \rangle := \int fg\, d\mu \qquad (f, g \in L^2) \]

is a usual one, $u = \langle \cdot, g \rangle$ ($g \in L^2$) are all the continuous linear forms on $L^2$, and $\|u\| = \|g\|_2$.

It is customary to identify $\langle \cdot, g \rangle$ with $g$ and to say that the dual $L^2(\mu)'$ of $L^2(\mu)$ is $L^2(\mu)$.


The following extension of the Riesz theorem was given by P. D. Lax¹ and A. Milgram when studying parabolic partial differential equations:

Theorem 4.3 (Lax-Milgram). Let $H$ be a Hilbert space and suppose that $B : H \times H \to \mathbf{K}$ is a bounded sesquilinear form; that is, $B$ satisfies the conditions

(1) $B(\cdot,a)$ is linear and $B(a,\cdot)$ skew linear for every $a \in H$ and

(2) $|B(x,y)| \le C\|x\|_H\|y\|_H$ for some constant $C$, for all $x,y \in H$.

If $B$ is coercive, meaning that $|B(x,x)| \ge c\|x\|_H^2$ ($x \in H$) for some constant $c > 0$, then for every $u \in H'$ there exists a uniquely determined element $y \in H$ such that $u = B(\cdot,y)$. Hence $y \in H \mapsto B(\cdot,y) \in H'$ is a bijective skew linear map.

Proof. By virtue of the Riesz Theorem 4.1, for every $y \in H$, there is a unique $T(y) \in H$ such that $B(x,y) = (x,T(y))_H$ for all $x \in H$, and it is readily seen that the mapping $T : H \to H$ is linear. For instance,

\[ (x,T(\lambda y))_H = B(x,\lambda y) = B(\bar{\lambda}x,y) = (\bar{\lambda}x,T(y))_H = (x,\lambda T(y))_H, \]
and then $T(\lambda y) = \lambda T(y)$.

It follows from the assumption $|(x,T(y))_H| = |B(x,y)| \le C\|x\|_H\|y\|_H$ that $\|T(y)\|_H \le C\|y\|_H$ and $T$ is bounded. Similarly, by the coercivity assumption,
\[ c\|y\|_H^2 \le |(y,T(y))_H| \le \|y\|_H\|T(y)\|_H, \]
so that $c\|y\|_H \le \|T(y)\|_H \le C\|y\|_H$. Obviously $T$ is one-to-one, and these estimates imply that $T(H)$ is closed since, if $T(y_n) \to z_0$, then

\[ c\|y_n - y_m\|_H \le \|T(y_n) - T(y_m)\|_H, \]
and we can find $y_0 = \lim y_n$, so that $z_0 = T(y_0) \in T(H)$.

To prove that $T$ is onto, suppose that $x \in T(H)^{\perp}$, so that $B(x,y) = 0$ for all $y \in H$ and then $0 = |B(x,x)| \ge c\|x\|_H^2$; thus $x = 0$ and $T(H)^{\perp} = \{0\}$, which proves that $T(H) = H$, since $T(H)$ is closed.

This shows that for every $u \in H'$ there is an element $T(y) \in H$ such that $u(x) = (x,T(y))_H$, and then $u(x) = B(x,y)$ for all $x \in H$.

¹Peter David Lax, while holding a position at the Courant Institute, was awarded the Abel Prize (2005) “for his groundbreaking contributions to the theory and application of partial differential equations and to the computation of their solutions”.


Note that $y$ is unique, since it follows from $B(x,y_1) = B(x,y_2)$, or $B(x,y) = 0$ for $y = y_1 - y_2$, that $0 = |(y,T(y))_H| = |B(y,y)| \ge c\|y\|_H^2$ and then $y = 0$, and hence $y_1 = y_2$. □
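The following worked example, with a matrix chosen here only for illustration, may help to follow the proof of Theorem 4.3 in the simplest real case $H = \mathbb{R}^2$. The form $B$ is bounded and coercive but not symmetric, so it is not an inner product and Theorem 4.1 does not apply to it directly; the operator $T$ of the proof is the matrix $A$ itself.

```latex
\[
B(x,y) := x^{T}A\,y = (x,Ay)_H, \qquad
A = \begin{pmatrix} 2 & 1 \\ -1 & 2 \end{pmatrix}, \qquad T = A,
\]
\[
B(x,x) = 2x_1^2 + x_1x_2 - x_2x_1 + 2x_2^2 = 2\|x\|_H^2 \quad (c = 2),
\qquad
A^{T}A = 5I \ \Longrightarrow\ \|T\| = \sqrt{5} \quad (C = \sqrt{5}).
\]
% For u(x) = x_1 = (x,e_1)_H we need T y = e_1:
\[
y = A^{-1}e_1
  = \frac{1}{5}\begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix}
    \begin{pmatrix} 1 \\ 0 \end{pmatrix}
  = \Bigl(\tfrac{2}{5},\tfrac{1}{5}\Bigr),
\qquad B(x,y) = (x,Ay)_H = x_1 = u(x).
\]
```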

4.1.2. The adjoint. Suppose $T \in \mathcal{L}(H_1;H_2)$, where $H_1$ and $H_2$ are Hilbert spaces. The transpose $T'$ acts between the duals $H_2'$ and $H_1'$ by the rule $T'v = v \circ T$; that is,

\[ (T'v)(x) = v(Tx) \qquad (x \in H_1,\ v \in H_2'). \tag{4.1} \]

By the Riesz representation Theorem 4.1, $v = (\cdot,y)_{H_2}$ and $T'v = (\cdot,T^*y)_{H_1}$ for some $T^*y \in H_1$, and (4.1) becomes

\[ (x,T^*y)_{H_1} = (Tx,y)_{H_2} \qquad (x \in H_1,\ y \in H_2). \tag{4.2} \]

The adjoint of $T$ is the linear operator $T^* : H_2 \to H_1$ characterized by the identity (4.2), and its linearity follows from this relation. For instance,
\[ (x,T^*(y_1+y_2))_{H_1} = (Tx,y_1)_{H_2} + (Tx,y_2)_{H_2} = (x,T^*y_1 + T^*y_2)_{H_1} \]
for every $x \in H_1$, and then $T^*(y_1+y_2) = T^*y_1 + T^*y_2$.

Clearly $T^{**} = T$ and $(ST)^* = T^*S^*$ if the composition is defined.

Theorem 4.4. The map $T \in \mathcal{L}(H_1;H_2) \mapsto T^* \in \mathcal{L}(H_2;H_1)$ is a skew linear isometry such that $\|T^*T\| = \|T\|^2 = \|TT^*\|$, and for every $T \in \mathcal{L}(H_1;H_2)$ the following properties hold:

(a) $(\operatorname{Im} T)^{\perp} = \operatorname{Ker} T^*$,

(b) $(\operatorname{Ker} T^*)^{\perp} = \overline{\operatorname{Im} T}$,

(c) $(\operatorname{Im} T^*)^{\perp} = \operatorname{Ker} T$, and

(d) $(\operatorname{Ker} T)^{\perp} = \overline{\operatorname{Im} T^*}$.

Proof. Since $(x,(\lambda T)^*y)_{H_1} = (\lambda Tx,y)_{H_2} = \lambda(x,T^*y)_{H_1} = (x,\bar{\lambda}T^*y)_{H_1}$, we obtain $(\lambda T)^* = \bar{\lambda}T^*$. It is also clear that $(S+T)^* = S^* + T^*$.

By the Riesz representation Theorem 4.1,
\[ \|T\| = \sup_{\|x\|_{H_1} \le 1} \|Tx\|_{H_2} = \sup_{\|x\|_{H_1},\,\|y\|_{H_2} \le 1} |(y,Tx)_{H_2}|. \tag{4.3} \]
Hence,
\[ \|T^*\| = \sup_{\|x\|_{H_1},\,\|y\|_{H_2} \le 1} |(x,T^*y)_{H_1}| = \sup_{\|x\|_{H_1},\,\|y\|_{H_2} \le 1} |(y,Tx)_{H_2}| = \|T\| \]
and also
\[ \|T\|^2 = \sup_{\|x\|_{H_1} \le 1} \|Tx\|_{H_2}^2 = \sup_{\|x\|_{H_1} \le 1} (x,T^*Tx)_{H_1} \le \|T^*T\| \le \|T\|^2. \]

To prove (a), note that $(y,Tx)_{H_2} = 0$ if and only if $(T^*y,x)_{H_1} = 0$, which holds for every $x \in H_1$ when $T^*y = 0$. The remaining properties follow very easily from the identity $T^{**} = T$ and from the relation $F^{\perp\perp} = \bar{F}$, if $F$ is a vector subspace of $H_1$ (or $H_2$), contained in Theorem 2.36. □

Remark 4.5. It is worth observing that, in the complex case, $(\lambda T)^* = \bar{\lambda}T^*$, but $(\lambda T)' = \lambda T'$, since $(\lambda T)'(v) = v \circ (\lambda T) = \lambda\, v \circ T$.

An operator $T \in \mathcal{L}(H)$ is said to be self-adjoint if $T^* = T$. By Theorem 2.35(b), every orthogonal projection is self-adjoint.

The Hilbert-Schmidt operators $T_K$, defined by integral kernels $K \in L^2(X \times Y)$ as

\[ T_Kf(x) := \int_Y K(x,y)f(y)\,dy, \]
form another important family of bounded linear operators between Hilbert spaces. Here $X$ and $Y$ are assumed to be two σ-finite measure spaces.

Theorem 4.6. If $K \in L^2(X \times Y)$, then $T_K : L^2(Y) \to L^2(X)$ and $\|T_K\| \le \|K\|_2$. The adjoint $T_K^* : L^2(X) \to L^2(Y)$ is the Hilbert-Schmidt operator $T_{K^*}$ defined by the kernel $K^*(y,x) = \overline{K(x,y)}$. Hence, if $X = Y$, $T_K$ is self-adjoint when $K(y,x) = \overline{K(x,y)}$ a.e. on $X \times X$.

Proof. By Schwarz inequality,
\[ \int_Y |K(x,y)||f(y)|\,dy \le \Bigl(\int_Y |K(x,y)|^2\,dy\Bigr)^{1/2}\Bigl(\int_Y |f(y)|^2\,dy\Bigr)^{1/2} \]
with $\int_Y |K(x,y)|^2\,dy < \infty$ a.e., since $\int_X\int_Y |K(x,y)|^2\,dy\,dx < \infty$. Then $T_Kf(x)$ is defined a.e. and
\[ |T_Kf(x)|^2 \le \int_Y |K(x,y)|^2\,dy \int_Y |f(y)|^2\,dy. \]

By integrating both sides, $\|T_Kf\|_2 \le \|K\|_2\|f\|_2$.

Fubini's theorem shows that $(T_Kf,g)_2 = (f,T_{K^*}g)_2$. □
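A simple instance of Theorem 4.6, offered as an illustrative sketch with a kernel chosen here (it does not appear in the text), is the rank-one kernel $K(x,y) = xy$ on $X = Y = [0,1]$ with Lebesgue measure. In this case the bound $\|T_K\| \le \|K\|_2$ is attained, and $T_K$ is self-adjoint since $K$ is real and symmetric.

```latex
\[
T_Kf(x) = x\int_0^1 y\,f(y)\,dy = (f,e)_2\,e(x), \qquad e(x) := x,
\]
\[
\|K\|_2^2 = \int_0^1\!\!\int_0^1 x^2y^2\,dy\,dx = \frac{1}{9},
\qquad
\|T_K\| = \|e\|_2^2 = \int_0^1 x^2\,dx = \frac{1}{3} = \|K\|_2 .
\]
% The norm of f -> (f,e)e is ||e||^2, attained at f = e/||e||.
```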

The description (4.3) of the norm of an operator by duality has a useful variant for self-adjoint operators:

Theorem 4.7. If $T \in \mathcal{L}(H)$ is self-adjoint, then $(Tx,x)_H \in \mathbf{R}$ for every $x \in H$ and

\[ \|T\| = \sup_{\|x\|_H = 1} |(Tx,x)_H|. \]

Proof. We call $S := \sup_{\|x\|_H = 1} |(Tx,x)_H|$, so that $|(Tx,x)_H| \le \|x\|_H^2\,S$. Since $S \le \|T\|$, we only need to show that $\|Tx\|_H \le S$ if $\|x\|_H = 1$, and we can assume that $Tx \ne 0$.


We will use the polarization identities, which extend (2.15) and (2.16),
\[ (Tx,y)_H = \frac{1}{4}\bigl((T(x+y),x+y)_H - (T(x-y),x-y)_H\bigr) \]
if $\mathbf{K} = \mathbf{R}$ and
\[ (Tx,y)_H = \frac{1}{4}\bigl((T(x+y),x+y)_H - (T(x-y),x-y)_H + i(T(x+iy),x+iy)_H - i(T(x-iy),x-iy)_H\bigr) \]
in the complex case. They are checked by expanding the inner products.

Since $(Tx,x)_H = (x,Tx)_H \in \mathbf{R}$,
\[ \Re(Tx,y)_H = \frac{1}{4}\bigl((T(x+y),x+y)_H - (T(x-y),x-y)_H\bigr). \]

Let $\|x\|_H = 1$ and $y = (1/\|Tx\|_H)Tx$ so that also $\|y\|_H = 1$ and, by the parallelogram identity,
\[ \|Tx\|_H = \Re(Tx,y)_H \le \frac{S}{4}\bigl(\|x+y\|_H^2 + \|x-y\|_H^2\bigr) = \frac{S}{2}\bigl(\|x\|_H^2 + \|y\|_H^2\bigr) = S; \]
hence $\|Tx\|_H \le S$. □
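Two $2 \times 2$ matrices on $\mathbf{C}^2$, chosen here only for illustration, show what Theorem 4.7 says and why self-adjointness cannot be dropped. For the self-adjoint $T$ below the supremum of $|(Tx,x)|$ over the unit sphere equals $\|T\|$; for the non-self-adjoint $N$ it is strictly smaller.

```latex
\[
T = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}: \quad
(Tx,x) = x_2\bar{x}_1 + x_1\bar{x}_2 = 2\,\Re(x_1\bar{x}_2), \quad
\sup_{\|x\|=1}|(Tx,x)| = 1 = \|T\|
\quad \bigl(x = \tfrac{1}{\sqrt{2}}(1,1)\bigr),
\]
\[
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}: \quad
(Nx,x) = x_2\bar{x}_1, \quad
\sup_{\|x\|=1}|(Nx,x)| = \tfrac{1}{2} < 1 = \|N\| .
\]
```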

4.2. Applications of the Riesz representation theorem

Throughout this section, $(\Omega,\mathcal{B},\mu)$ will be a σ-finite measure space. Our aim is to obtain some consequences of the duality result of Example 4.2 for $L^2(\mu)$.

4.2.1. Radon-Nikodym theorem. Suppose $0 \le h \in L^1(\mu)$. The finite measure
\[ \nu(A) := \int_A h\,d\mu \]
on $(\Omega,\mathcal{B})$ is such that $\nu(A) = 0$ if $\mu(A) = 0$. When this happens, we say that $\nu$ is absolutely continuous (relatively to $\mu$) and write $\nu \ll \mu$.

Theorem 4.8 (Radon-Nikodym). Let $\mu$ and $\nu$ be two σ-finite measures on $(\Omega,\mathcal{B})$. If $\nu \ll \mu$, there exists a unique a.e. determined nonnegative measurable function $h$ such that
\[ \nu(A) = \int_A h\,d\mu \qquad (A \in \mathcal{B}). \tag{4.4} \]
If $\nu$ is finite, $h \in L^1(\mu)$.

Proof. First assume that $\mu$ and $\nu$ are finite and define $\lambda = \mu + \nu$. Then $\lambda(A) = 0$ if and only if $\mu(A) = 0$.


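The linear form $u(f) := \int f\,d\mu$ is bounded on the real Hilbert space $L^2(\mu)$ and also on $L^2(\lambda)$, since $\mu \le \lambda$. Then there exists a unique function $g \in L^2(\lambda)$ such that
\[ \int f\,d\mu = \int fg\,d\lambda \qquad (f \in L^2(\lambda)). \]

We rewrite this identity as
\[ \int f(1-g)\,d\mu = \int fg\,d\nu. \tag{4.5} \]

Formally $(1-g)\,d\mu = g\,d\nu$ and we will try
\[ h := \frac{1-g}{g}. \]
We need to prove that $0 < g \le 1$ a.e.

First, $F := \{g \le 0\}$ is μ-null, since for $f = \chi_F$ we obtain from (4.5) that $\mu(F) \le \int \chi_F(1-g)\,d\mu = \int \chi_Fg\,d\nu \le 0$.

To prove that $G := \{g > 1\}$ is also a μ-null set, assume that $\mu(G) > 0$. Then, by taking $f = \chi_G$, again from (4.5) we obtain that $0 > \int_G (1-g)\,d\mu = \int_G g\,d\nu \ge 0$, which is impossible.

By changing $g$ on a μ-null set if necessary, we may assume that $0 < g \le 1$ everywhere. For $A \in \mathcal{B}$ and $A_n := A \cap \{g > 1/n\}$, we obtain $\chi_{A_n}/g \in L^2(\lambda)$ and, taking $f = \chi_{A_n}/g$ in (4.5), $\int_{A_n} h\,d\mu = \nu(A_n)$. Then, by monotone convergence, $\int_A h\,d\mu = \nu(A)$.

Observe that $0 \le h \in L^1(\mu)$ and it is unique since, if $\int_A f\,d\mu = 0$ for every $A \in \mathcal{B}$, then $f = 0$ μ-a.e.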

Now let $\mu$ and $\nu$ be σ-finite. We can find $\Omega_n \in \mathcal{B}$ ($n \in \mathbf{N}$) with $\Omega_n \uparrow \Omega$ and $\mu(\Omega_n), \nu(\Omega_n) < \infty$. If $A \subset \Omega_n$, then $\nu(A) = \int_A h_n\,d\mu$, with $h_n(x) = h_{n+1}(x)$ a.e. on $\Omega_n$, and we can take $h_n = (h_{n+1})|_{\Omega_n}$. We define $h_n = 0$ on $\Omega_n^c$ and the nonnegative function $h(x) := \lim_n h_n(x)$ is measurable; then, by monotone convergence,

\[ \nu(A) = \lim_n \nu(A \cap \Omega_n) = \lim_n \int_{A \cap \Omega_n} h\,d\mu = \int_A h\,d\mu. \]

The uniqueness of $h$ follows from the uniqueness on every $\Omega_n$. □

Formula (4.4) is often represented by the notation $d\nu = h\,d\mu$, and $h$ is called the Radon-Nikodym derivative of $\nu$.
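The construction in the proof can be followed on a concrete pair of finite measures. The example below is a sketch with measures chosen here for illustration: the function $g$ given by the Riesz theorem on $L^2(\lambda)$ is computed explicitly, and the quotient $(1-g)/g$ recovers the density of $\nu$.

```latex
\[
\mu = \text{Lebesgue measure on } [0,1], \qquad
\nu(A) = \int_A 2x\,dx, \qquad
\lambda = \mu + \nu, \quad d\lambda = (1+2x)\,dx,
\]
\[
\int f\,d\mu = \int fg\,d\lambda \quad (f \in L^2(\lambda))
\ \Longrightarrow\
g(x) = \frac{1}{1+2x} \in (0,1],
\qquad
h = \frac{1-g}{g} = \frac{2x/(1+2x)}{1/(1+2x)} = 2x .
\]
```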


4.2.2. The dual of $L^p$. Recall that, if $1 \le p < \infty$, $L^p = L^p(\mu)$ is the Banach space of all μ-measurable (real or complex) functions $f$ defined by the condition
\[ \|f\|_p := \Bigl(\int |f|^p\,d\mu\Bigr)^{1/p} < \infty, \]
modulo the subspace of functions vanishing a.e., with the usual modification for $p = \infty$. By denoting
\[ \langle f,g \rangle := \int fg\,d\mu \]
with $p' = p/(p-1)$ ($p' = \infty$ if $p = 1$, and $p' = 1$ if $p = \infty$) if the integral exists, then, by Hölder's inequality,
\[ |\langle f,g \rangle| \le \|f\|_p\|g\|_{p'} \qquad (f \in L^p(\mu),\ g \in L^{p'}(\mu)). \]

Note that, for every $g \in L^{p'}$, $\langle \cdot,g \rangle \in (L^p)'$ and $\|\langle \cdot,g \rangle\|_{(L^p)'} \le \|g\|_{p'}$ for any $p \in [1,\infty]$, and we say that $g \in L^{p'} \mapsto \langle \cdot,g \rangle \in (L^p)'$ is the natural map from $L^{p'}$ into the dual of $L^p$.

We know from Example 4.2 that $L^2(\mu)'$ can be identified with $L^2(\mu)$. As an application of the Radon-Nikodym theorem, we will show that $L^p(\mu)'$ can also be identified with $L^{p'}(\mu)$, if $1 \le p < \infty$.

To prepare the proof, we will consider positive linear forms on the real $L^p$ spaces. We say that $v \in (L^p)'$ is positive if $v(f) \ge 0$ when $0 \le f \in L^p$, and then we write $v \in (L^p)'_+$. Also, $f \in (L^p)^+$ if $f \in L^p$ and $f \ge 0$ a.e.

Lemma 4.9. $(L^p)' = (L^p)'_+ - (L^p)'_+$.

Proof. Let $v \in (L^p)'$ and $f \in (L^p)^+$ and define

\[ v_+(f) := \sup_{0 \le g \le f} v(g). \]
Obviously, $v_+(\alpha f) = \alpha v_+(f)$ if $\alpha \in \mathbf{R}^+$ and $f \in (L^p)^+$. To show that $v_+$ is also additive, we observe that, if $f_1,f_2 \in (L^p)^+$, then

\[ v_+(f_1) + v_+(f_2) = \sup_{0 \le g_1 \le f_1,\ 0 \le g_2 \le f_2} v(g_1+g_2) \le \sup_{0 \le g \le f_1+f_2} v(g) = v_+(f_1+f_2). \]
For the reversed inequality, we claim that
\[ \{g \in (L^p)^+;\ g \le f_1+f_2\} \subset \{g \in (L^p)^+;\ g \le f_1\} + \{g \in (L^p)^+;\ g \le f_2\}. \]

To prove this, if $0 \le g \le f_1+f_2$ and $g_1 := \inf(g,f_1)$, then the function $g_2 := g - g_1 \ge 0$ satisfies $g_2 \le f_2$ since, when $g_1(x) = f_1(x)$,

\[ g_2(x) = g(x) - f_1(x) \le f_2(x) \]


and $g_2(x) = 0 \le f_2(x)$ when $g_1(x) = g(x)$. Hence, also

\[ v_+(f_1+f_2) \le v_+(f_1) + v_+(f_2). \]

Every real function $f \in L^p$ admits a decomposition $f = f_1 - f_2$ with $f_1,f_2 \in (L^p)^+$, for instance by taking $f_1 = f^+ = \sup(f,0)$ and $f_2 = f^- = \sup(-f,0)$, and
\[ v_+(f) := v_+(f_1) - v_+(f_2) \]
does not depend on the decomposition, since $f_1 - f_2 = f^+ - f^-$ implies $v_+(f_1) + v_+(f^-) = v_+(f^+) + v_+(f_2)$.

It is clear that $v_+(\lambda f) = \lambda v_+(f)$ in both cases $\lambda \ge 0$ and $\lambda < 0$ ($v_+(-f) = v_+(f^-) - v_+(f^+) = -v_+(f)$). Additivity also holds for $v_+$, since
\[ v_+(f_1+f_2) = v_+\bigl((f_1^+ + f_2^+) - (f_1^- + f_2^-)\bigr) = v_+(f_1^+) + v_+(f_2^+) - \bigl(v_+(f_1^-) + v_+(f_2^-)\bigr) = v_+(f_1) + v_+(f_2), \]
and it is continuous, since
\[ |v_+(f)| \le v_+(f^+) + v_+(f^-) \le \|v\|_{(L^p)'}\bigl(\|f^+\|_p + \|f^-\|_p\bigr) \le 2\|v\|_{(L^p)'}\|f\|_p. \]

Finally, $v_- := v_+ - v$ is also linear and continuous, and

\[ v_-(f) = \sup_{0 \le g \le f} v(g) - v(f) \ge 0 \]
if $f \ge 0$. □

Theorem 4.10 (Riesz representation theorem for $(L^p)'$). Suppose $1 \le p < \infty$. For every $v \in L^p(\mu)'$ there is a uniquely determined function $g \in L^{p'}(\mu)$ such that
\[ v(f) = \int_{\Omega} fg\,d\mu \qquad (f \in L^p), \]
and $\|g\|_{p'} = \|v\|_{(L^p)'}$.

Proof. (a) First let $\mu(\Omega) < \infty$ and $v \in (L^p)'_+$.

By $\nu(A) := v(\chi_A)$ ($\|\chi_A\|_p = \mu(A)^{1/p} < \infty$) we obtain a finite measure $\nu$ on $(\Omega,\mathcal{B})$, since $\nu(\emptyset) = 0$ and
\[ \nu\Bigl(\biguplus_{n=1}^{\infty} A_n\Bigr) = v\Bigl(\lim_N \sum_{n=1}^{N} \chi_{A_n}\Bigr) = \lim_N \sum_{n=1}^{N} v(\chi_{A_n}) = \sum_n \nu(A_n). \]

It is clear that $\nu \ll \mu$, since $v(\chi_A) = v(0) = 0$ if $\mu(A) = 0$.

By the Radon-Nikodym theorem for finite measures, $v(\chi_A) = \nu(A) = \int_A h\,d\mu$ for a unique integrable function $h \ge 0$, and also $v(s) = \int sh\,d\mu$ if $s$ is a simple function.


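If $0 \le f \in L^p$, we choose simple functions $s_n$ so that $0 \le s_n \uparrow f$. Then $s_n \to f$ in $L^p$ by dominated convergence; hence

\[ v(f) = \lim_n v(s_n) = \lim_n \int s_nh\,d\mu = \int fh\,d\mu \]
by monotone convergence, and $\int fh\,d\mu \le \|v\|_{L^p(\mu)'}\|f\|_p$.

(b) In the general σ-finite case and for $v \in L^p(\mu)'_+$, let $\Omega = \biguplus_{n=1}^{\infty} \Omega_n$ with $\mu(\Omega_n) < \infty$.

We obtain $h_n$ associated to the restriction $v_n$ of $v$ to $L^p(\mu_n)$ as in (a), and we extend $h_n$ with zeros. Then, if $h := \sum_n h_n \ge 0$ and $f \in L^p(\mu)$, we can write
\[ \langle f,h \rangle = \sum_n \int fh_n\,d\mu = \sum_n \int_{\Omega_n} fh\,d\mu = \sum_n v(f\chi_{\Omega_n}) = v(f). \]
Note that $\sum_n f\chi_{\Omega_n} = f$ in $L^p(\mu)$ by dominated convergence, so the last equality follows, and $fh$ is integrable, since
\[ \int |fh|\,d\mu = \sum_n \int |fh_n|\,d\mu = v(|f|) < \infty. \]

(c) If $v = v_+ - v_- \in (L^p)'$, by (b) there exists $h$ so that $fh \in L^1$ for every real function $f \in L^p$ and $v(f) = \int fh\,d\mu$.

(d) In the complex case, if $v \in (L^p)'$, we obtain $h = h_1 + ih_2$ so that $fh_1,fh_2 \in L^1$ for every $f \in L^p$, and $\Re v(f) = \int fh_1\,d\mu$, $\Im v(f) = \int fh_2\,d\mu$ for every real function $f \in L^p$. Then $v(f) = \int fh\,d\mu$ if $f \in L^p$ is a real function. By linearity, also $v(f) = \int fh\,d\mu$ for any complex function $f \in L^p$.

(e) The function $h$ obtained in (c) and (d) belongs to $L^{p'}$ and $\|h\|_{p'} = \|v\|_{(L^p)'}$:

Suppose first that $p > 1$ and let $B_n = \Omega_n \cap \{|h| \le n\}$, with $\Omega_n \uparrow \Omega$. Then the functions $f_n = |h|^{p'-1}\operatorname{sgn}(h)\chi_{B_n}$ ($\operatorname{sgn} z = |z|/z$, with $0/0 := 0$) belong to $L^p$, $v(f_n) = \int |h|^{p'}\chi_{B_n}\,d\mu$, and
\[ \int_{B_n} |h|^{p'}\,d\mu \le \|v\|_{(L^p)'}\|f_n\|_p = \|v\|_{(L^p)'}\Bigl(\int_{B_n} |h|^{p'}\,d\mu\Bigr)^{1/p}. \]

By monotone convergence, $\|h\|_{p'}^{p'} \le \|v\|_{(L^p)'}\|h\|_{p'}^{p'/p}$, and $\|h\|_{p'} \le \|v\|_{(L^p)'}$. Since $v = \langle \cdot,h \rangle$, by Hölder's inequality $\|v\|_{(L^p)'} \le \|h\|_{p'}$.

When $p = 1$, we also have $\|v\|_{(L^1)'} \le \|h\|_{\infty}$. Suppose $\|v\|_{(L^1)'} + \varepsilon < \|h\|_{\infty}$; then, since the measure is assumed to be σ-finite, there exists some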


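$A \in \mathcal{B}$ such that $0 < \mu(A) < \infty$ and $|h(x)| \ge \|v\|_{(L^1)'} + \varepsilon$ for every $x \in A$. If $f_n := \operatorname{sgn}(h)\chi_{B_n \cap A}$, then

\[ v(f_n) = \int_{B_n \cap A} |h|\,d\mu \ge (\|v\|_{(L^1)'} + \varepsilon)\,\mu(B_n \cap A) \]
and, since also $v(f_n) \le \|v\|_{(L^1)'}\|f_n\|_1 = \|v\|_{(L^1)'}\,\mu(B_n \cap A)$, we arrive at a contradiction by allowing $n \uparrow \infty$.

(f) Finally, $h$ is uniquely determined since, if $\int_A h\,d\mu = 0$ for every $A \in \mathcal{B}$ with finite measure, then $h = 0$ a.e. □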
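The exponent bookkeeping in step (e) for $p > 1$ rests on two elementary identities between conjugate exponents. They are spelled out below as a sketch; the finiteness of the integrals over $B_n$, which is needed for the division, holds because $|h| \le n$ on $B_n$ and $\mu(B_n) < \infty$ (assuming, as in step (b), that the sets $\Omega_n$ have finite measure).

```latex
\[
p' = \frac{p}{p-1} \ \Longrightarrow\ (p'-1)\,p = \frac{p}{p-1} = p',
\qquad p' - \frac{p'}{p} = p'\Bigl(1 - \frac{1}{p}\Bigr) = 1,
\]
\[
\|f_n\|_p^p = \int_{B_n} |h|^{(p'-1)p}\,d\mu = \int_{B_n} |h|^{p'}\,d\mu,
\qquad
\Bigl(\int_{B_n} |h|^{p'}\,d\mu\Bigr)^{1 - 1/p}
  = \Bigl(\int_{B_n} |h|^{p'}\,d\mu\Bigr)^{1/p'} \le \|v\|_{(L^p)'} .
\]
```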

Note that if $1 \le p < \infty$, by Theorem 4.10, the natural linear mapping $g \mapsto \langle \cdot,g \rangle$ is a bijective isometry from $L^{p'}$ onto $(L^p)'$ which allows us to represent by $L^{p'}(\mu)$ the dual of $L^p(\mu)$ and to describe the $L^{p'}$-norm by duality:

\[ \|g\|_{p'} = \sup_{\|f\|_p \le 1} |\langle f,g \rangle| = \sup_{\|s\|_p = 1;\ s \in S} |\langle s,g \rangle|, \tag{4.6} \]
where $S$ denotes the vector space of all integrable simple functions. Indeed, $S$ is a dense subspace of $L^p$, so that the norm of $u = \langle \cdot,g \rangle$ does not change when we restrict $u$ to this subspace.

Remark 4.11. Similarly, if $1 \le p < \infty$, $\ell^{p'}$ can be regarded as the dual of $\ell^p$ through the mapping $y \in \ell^{p'} \mapsto \langle \cdot,y \rangle \in (\ell^p)'$, now with $\langle x,y \rangle = \sum_{n=1}^{\infty} x_ny_n$, $x = (x_n) \in \ell^p$, and $y = (y_n) \in \ell^{p'}$. Recall that $\ell^p$ is the space $L^p$ associated to the counting measure on $\mathbf{N}$ (see Exercise 1.11).

4.2.3. The dual of $C(K)$. Another fundamental theorem of F. Riesz shows that every continuous linear form on $C(K)$ is represented by a complex measure.²

Lemma 4.12. For every complex measure $\mu$ there is a $|\mu|$-a.e. uniquely defined $|\mu|$-integrable function $h$ such that $|h| = 1$ which satisfies
\[ \mu(B) = \int_B h\,d|\mu| \qquad (B \in \mathcal{B}). \tag{4.7} \]

Proof. The existence of $h \in L^1(|\mu|)$ follows from an application of the Radon-Nikodym theorem to $(\Re\mu)^+$, $(\Re\mu)^-$, $(\Im\mu)^+$, and $(\Im\mu)^-$. By (1.7), the four of them are finite and absolutely continuous with respect to $|\mu|$, so that they have an integrable Radon-Nikodym derivative.

The uniqueness of $h$ follows by noting that $\int_B h\,d|\mu| = 0$ for all $B \in \mathcal{B}$ implies $h = 0$ a.e.

²In 1909, F. Riesz obtained the representation of every $u \in C[0,1]'$ as a Riemann-Stieltjes integral $u(g) = \int_0^1 g(x)\,dF(x)$ when solving a problem posed by Jacques Hadamard in 1903. J. Radon found the extension to compact subsets of $\mathbf{R}^n$ in 1913, S. Banach to metric compact spaces in 1937, and S. Kakutani to general compact spaces in 1941.


If $|\mu|(B) > 0$, then $\bigl|\int_B h\,d|\mu|\bigr| = |\mu(B)| \le |\mu|(B)$ and
\[ \Bigl|\frac{1}{|\mu|(B)}\int_B h\,d|\mu|\Bigr| \le 1, \tag{4.8} \]
which implies $|h(x)| \le 1$ a.e. Indeed, $\bar{D}(0,1)^c \subset \mathbf{C}$ is a countable union of closed discs $D = \bar{D}(a,r)$ and it is enough to prove that $|\mu|(h^{-1}(D)) = 0$. If we assume that $|\mu|(B) > 0$ for $B = h^{-1}(D)$, then
\[ \Bigl|\frac{1}{|\mu|(B)}\int_B h\,d|\mu| - a\Bigr| \le \frac{1}{|\mu|(B)}\int_B |h - a|\,d|\mu| \le r \]
and
\[ \frac{1}{|\mu|(B)}\int_B h\,d|\mu| \notin \bar{D}(0,1), \]
which is in contradiction to (4.8).

To show that also $|h(x)| \ge 1$ a.e., let us check that $B(r) := \{|h| < r\}$ [...]

The identity (4.7) is represented by $d\mu = h\,d|\mu|$, which is called the polar representation of $\mu$.

It is natural to define $L^p(\mu) := L^p(|\mu|)$ and $\int f\,d\mu := \int fh\,d|\mu|$ for every $f \in L^1(\mu)$. Obviously $\bigl|\int f\,d\mu\bigr| \le \int |f|\,d|\mu|$.

Now we are ready to prove the Riesz representation theorem.

Theorem 4.13. Let $K$ be a compact subset of $\mathbf{R}^n$ and let $C(K)$ be the real or complex Banach space of all continuous functions on $K$, with the usual sup norm. If $\mu$ is, respectively, a real or complex Borel measure on $K$, then

\[ u_{\mu}(g) := \int g\,d\mu \]
defines a continuous linear form on $C(K)$, and $\mu \mapsto u_{\mu}$ is a bijective linear map between the vector space $M(K)$ of all Borel real or complex measures on $K$ and the dual space $C(K)'$.

Proof. Since $|u_{\mu}(g)| \le \int |g|\,d|\mu| \le \|g\|_K\,|\mu|(K)$, it follows that

\[ \|u_{\mu}\|_{C(K)'} \le |\mu|(K) \]


and $u_{\mu}$ is a continuous linear form.³

Conversely, if $v \in C(K)'$, we are going to prove that $v = u_{\mu}$ for some $\mu \in M(K)$.

In the case $C(K) = C(K;\mathbf{R})$ of real functions, and with the same proof of Lemma 4.9, every $v \in C(K;\mathbf{R})'$ is the difference $v = v_+ - v_-$ of two positive linear forms $v_+,v_- \in C(K)'_+$. They are defined on any positive function $f \in C(K)^+$ by
\[ v_+(f) := \sup_{0 \le g \le f} v(g), \]
and $v_- = v_+ - v$.

Every $f \in C(K)$ admits a decomposition $f = f_1 - f_2$ with $f_1,f_2 \in C(K)^+$, for instance by taking $f_1 = f^+ = \sup(f,0)$ and $f_2 = f^- = \sup(-f,0)$ and, as in Lemma 4.9, $v_+(f) := v_+(f_1) - v_+(f_2)$ does not depend on the decomposition, since $f_1 - f_2 = f^+ - f^-$ implies $v_+(f_1) + v_+(f^-) = v_+(f^+) + v_+(f_2)$. With this procedure we obtain
\[ C(K)' = C(K)'_+ - C(K)'_+. \]

We write

\[ v(g) = \int g\,d\mu_+ - \int g\,d\mu_- = \int g\,d\mu \qquad (g \in C(K)), \tag{4.9} \]
if the Borel measures $\mu_{\pm}$ represent the positive linear forms $v_{\pm}$ and $\mu = \mu_+ - \mu_-$, as in the Riesz-Markov representation theorem.

In the complex case, every $v \in C(K)'$ determines two continuous linear forms $\Re v, \Im v \in C(K;\mathbf{R})'$. Since $v(f) = v(g+ih) = v(g) + iv(h)$ ($g,h \in C(K;\mathbf{R})$), if $(\Re v)(g) = \int g\,d\mu$ and $(\Im v)(g) = \int g\,d\lambda$ on $C(K;\mathbf{R})$, we can write
\[ v(f) = \int g\,d\mu + i\int h\,d\mu + i\int g\,d\lambda - \int h\,d\lambda = \int f\,d\mu + i\int f\,d\lambda. \]

If $\mu$ and $\lambda$ are two Borel real measures on $K$, then $\nu = \mu + i\lambda : \mathcal{B} \to \mathbf{C}$ is a complex measure and $v(f) = \int f\,d\nu$.

To prove the uniqueness, let $\int gh\,d|\mu| = 0$ for all $g \in C(K)$. Since $C(K)$ is dense in $L^1(|\mu|)$, by Corollary 2.13, we can consider $\|\bar{h} - g_k\|_1 \to 0$, $g_k \in C(K)$. Then
\[ |\mu|(K) = \Bigl|\int (1 - g_kh)\,d|\mu|\Bigr| = \Bigl|\int (\bar{h} - g_k)h\,d|\mu|\Bigr| \le \|\bar{h} - g_k\|_1 \to 0 \]
and $\mu = 0$. Thus, the linear map $\mu \mapsto u_{\mu}$ is injective. □

³It can be proved that $\|u_{\mu}\|_{C(K)'} = |\mu|(K)$.


4.3. The Hahn-Banach theorem

For a Hilbert space, the projection theorem has been useful to give a complete description of the dual. The duality theory of more general spaces is based on the Hahn-Banach theorem.⁴

4.3.1. Analytic form of Hahn-Banach theorem. The basic form of the Hahn-Banach theorem refers to a convex functional on a real vector space $E$, which is a function $q : E \to \mathbf{R}$ such that $q(x+y) \le q(x) + q(y)$ and $q(\alpha x) = \alpha q(x)$ ($x,y \in E$ and $\alpha \ge 0$). We note that the sets $\{x;\ q(x) \le 1\}$ and $\{x;\ q(x) < 1\}$ are convex. Obviously, a semi-norm is a convex functional.

Theorem 4.14 (Hahn-Banach). Let $E$ be a real vector space, $F$ a vector subspace of $E$, $q$ a convex functional on $E$, and $u$ a linear form on $F$ which is dominated by $q$: $u(y) \le q(y)$ ($y \in F$). Then $u$ can be extended to all $E$ as a linear form $v$ dominated by $q$: $v(x) \le q(x)$ ($x \in E$).

Proof. If $F \ne E$, we start with a simple extension of $u$ to the subspace
\[ F \oplus [y] = \{z + ty;\ z \in F,\ t \in \mathbf{R}\}, \]
the linear span of $F$ and $y \notin F$. Of course, if $s \in \mathbf{R}$, then $v(z+ty) = u(z) + ts$ is a linear extension of $u$. Our aim is to obtain the estimate
\[ v(z+ty) = u(z) + ts \le q(z+ty) \]
by choosing a convenient value for $s = v(y)$.

Since $q$ is positive homogeneous, if this estimate holds for $t = \pm 1$, then for every $t \ne 0$
\[ v(z+ty) = |t|\,v(z/|t| \pm y) \le |t|\,q(z/|t| \pm y) = q(z+ty). \]
Hence all we need are the inequalities
\[ u(z) + s \le q(z+y), \qquad u(z') - s \le q(z'-y) \qquad (z,z' \in F); \]
that is, $u(z') - q(z'-y) \le s \le q(z+y) - u(z)$. Since
\[ u(z) + u(z') = u(z+z') \le q(z'-y+z+y) \le q(z'-y) + q(z+y), \]
it is possible to choose $s$ so that
\[ \sup_{z' \in F}\bigl(u(z') - q(z'-y)\bigr) \le s \le \inf_{z \in F}\bigl(q(z+y) - u(z)\bigr) \]
and then $v$ is dominated by $q$.
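The admissible interval for $s$ in the one-dimensional extension step can be computed by hand in a small case. In the sketch below the space, the functional $q$ and the form $u$ are chosen here only for illustration; the outcome is a whole interval of admissible values, which shows that the extension is in general not unique.

```latex
\[
E = \mathbf{R}^2, \quad F = \mathbf{R} \times \{0\}, \quad
q(x) = |x_1| + |x_2|, \quad u(a,0) = a, \quad y = (0,1),
\]
\[
\sup_{a \in \mathbf{R}}\bigl(u(a,0) - q((a,0) - y)\bigr)
  = \sup_{a}\bigl(a - |a| - 1\bigr) = -1,
\qquad
\inf_{a \in \mathbf{R}}\bigl(q((a,0) + y) - u(a,0)\bigr)
  = \inf_{a}\bigl(|a| + 1 - a\bigr) = 1,
\]
\[
-1 \le s \le 1, \qquad
v(x) = x_1 + s\,x_2, \qquad
|v(x)| \le |x_1| + |s|\,|x_2| \le q(x).
\]
```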

⁴The first version of the Hahn-Banach theorem dates back to the work of the Austrian mathematician Eduard Helly in 1912, essentially with the same proof given later independently by H. Hahn (1926) and S. Banach (1929). We give the original proof published by S. Banach in 1929; only his transfinite induction has been replaced by an application of Zorn's lemma.


Once we know that a one-dimensional extension is always possible, we can continue with a standard application of Zorn's lemma as follows.

Consider the family $\Phi$ of all extensions $\ell$ of $u$ to vector subspaces $L$ of $E$ that are dominated by $q$, and we order $\Phi$ by $(L_1,\ell_1) \le (L_2,\ell_2)$ meaning that $L_1 \subset L_2$ and $\ell_2$ agrees with $\ell_1$ on $L_1$.

Every totally ordered subset $\{(L_{\alpha},\ell_{\alpha})\}$ of $\Phi$ has the upper bound $(L,\ell)$ obtained by defining $L := \bigcup_{\alpha} L_{\alpha}$ and $\ell(y) := \ell_{\alpha}(y)$ if $y \in L_{\alpha}$. If also $y \in L_{\alpha'}$, then $\ell_{\alpha}(y) = \ell_{\alpha'}(y)$ by the total ordering of the set $\{(L_{\alpha},\ell_{\alpha})\}$, and the previous definition is unambiguous. For the same reason, $L$ is a vector subspace of $E$ and $\ell$ is a linear extension of all the linear forms $\ell_{\alpha}$.

By Zorn's lemma, there is a maximal element $(\tilde{F},\tilde{\ell})$ in $\Phi$. But according to the first part of this proof, $\tilde{F}$ must be the whole space $E$, since if $\tilde{F} \ne E$, we would obtain an extension $v$ to a strictly larger subspace $\tilde{F} \oplus [y]$ of $E$, in contradiction to the maximality of $(\tilde{F},\tilde{\ell})$. □

The following version of the Hahn-Banach theorem for semi-norms holds for real and complex vector spaces:

Theorem 4.15 (Hahn-Banach). Let $F$ be a vector subspace of the vector space $E$, $q$ a semi-norm on $E$, and $u$ a linear form on $F$ that satisfies the estimate $|u(y)| \le q(y)$ ($y \in F$). Then there exists an extension of $u$ to a linear form $v$ on $E$ which satisfies $|v(x)| \le q(x)$ ($x \in E$).

Proof. The real case is simple. We know from Theorem 4.14 that there exists an extension $v$ which satisfies $v(x) \le q(x)$ and $-v(x) = v(-x) \le q(-x) = q(x)$, i.e. $|v(x)| \le q(x)$.

Now suppose $E$ is a complex vector space⁵ and $F$ is a complex vector subspace of $E$. In this case we split the complex linear form $u$ into the real and imaginary parts, $u(y) = u_1(y) + iu_2(y)$. Then $u_1$ and $u_2$ are real linear forms on $F$, which is also a real vector subspace of $E$ regarded as a real vector space. Since $u_1(iy) + iu_2(iy) = u(iy) = iu(y)$, $u_1$ and $u_2$ are related by $u_1(iy) = -u_2(y)$.

Conversely, if $u_1$ is a real linear form on $F$, the additive functional

\[ u(y) = u_1(y) - iu_1(iy) \tag{4.10} \]
is a complex linear form, since $u(iy) = iu(y)$ and $u(ry) = ru(y)$ when $r \in \mathbf{R}$.
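The identity $u(iy) = iu(y)$ claimed for the functional (4.10) is a one-line computation, written out here for convenience; it only uses the real linearity of $u_1$ and $i \cdot i = -1$.

```latex
\[
u(iy) = u_1(iy) - i\,u_1(i \cdot iy)
      = u_1(iy) - i\,u_1(-y)
      = u_1(iy) + i\,u_1(y)
      = i\bigl(u_1(y) - i\,u_1(iy)\bigr)
      = i\,u(y).
\]
```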

⁵The complex version of the Hahn-Banach theorem was published simultaneously in 1938 by H. F. Bohnenblust and A. Sobczyk and by G. A. Soukhomlinov; curiously, in his work Banach only considered the real case.


To extend our $u(y) = u_1(y) - iu_1(iy)$, it follows from the real case and $|u_1(y)| \le |u(y)| \le q(y)$ that there exists a real linear extension $v_1$ of $u_1$ so that $|v_1(x)| \le q(x)$. Then the complex linear form $v(x) = v_1(x) - iv_1(ix)$ is an extension of $u$, since $v_1(y) = u_1(y)$ for $y \in F$.

Finally, we have $|v_1(x)| \le q(x)$ and, if for a given $x \in E$ we write $|v(x)| = \lambda v(x)$ with $|\lambda| = 1$, then it follows that also

\[ |v(x)| = v(\lambda x) = v_1(\lambda x) \le q(\lambda x) = |\lambda|q(x) = q(x), \]
since $q$ is a semi-norm. □

4.3.2. The geometric Hahn-Banach theorem. Suppose that $K$ is a convex subset of a real vector space $E$ and that $K$ is absorbing in the sense that $\bigcup_{t>0}(tK) = E$. Then the gauge or Minkowski functional of $K$ is defined as the functional

$p_K : E \to [0,\infty)$ such that
\[ p_K(x) = \inf\Bigl\{t > 0;\ \frac{x}{t} \in K\Bigr\}. \]
It is easily checked that this functional is convex and that

\[ K \subset \{x \in E;\ p_K(x) \le 1\}. \]

To show the subadditivity of $p_K$, note that, if $x/t, y/s \in K$, also
\[ \frac{x+y}{t+s} = \frac{t}{t+s}\,\frac{x}{t} + \frac{s}{t+s}\,\frac{y}{s} \in K \tag{4.11} \]
and $p_K(x+y) \le t+s$. By taking the infimum with respect to $t$ and then with respect to $s$, we obtain $p_K(x+y) \le p_K(x) + s$ and $p_K(x+y) \le p_K(x) + p_K(y)$.
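A one-dimensional example, with a set chosen here only for illustration, shows that the gauge of a convex absorbing set is a convex functional but need not be a semi-norm when the set is not balanced.

```latex
\[
E = \mathbf{R}, \qquad K = [-1,2], \qquad
p_K(x) = \inf\{t > 0;\ x/t \in [-1,2]\} =
\begin{cases}
  x/2, & x \ge 0, \\
  -x,  & x < 0,
\end{cases}
\]
\[
p_K(1) = \tfrac{1}{2} \ne 1 = p_K(-1), \qquad
p_K\bigl(1 + (-1)\bigr) = 0 \le \tfrac{3}{2} = p_K(1) + p_K(-1), \qquad
\{x;\ p_K(x) \le 1\} = [-1,2] = K .
\]
```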

It is also worth noticing that $p_K(x) < 1$ if $x$ is an internal point of $K$, in the sense that for any $z \in E$ there is an $\varepsilon = \varepsilon(z) > 0$ such that $\{x + tz;\ |t| \le \varepsilon\} \subset K$, that is, for every line $L$ through $x$, $L \cap K$ is a neighborhood of $x$ in $L$. Indeed, if $x$ is internal, then we have $(1+\varepsilon)x \in K$ for some $\varepsilon > 0$, so that $p_K(x) \le 1/(1+\varepsilon) < 1$.

Sometimes we will use self-explanatory notation such as $f(A)$ [...]


Proof. A translation allows us to assume that $x_0 = 0$ is an internal point of $K$, and then $K$ is absorbing.

Note that $p_K(y) \ge 1$, since $y \notin K$ (in particular $y \ne 0$), and we can define $f$ on $[y]$ so that $f(y) = 1$; that is, $f(ty) = t$. Then $t \le tp_K(y) = p_K(ty)$ if $t \ge 0$, and $t \le p_K(ty)$ is obvious if $t < 0$.

Having shown that $f \le p_K$ on $[y]$, we conclude from Theorem 4.14 that $f$ can be extended to all of $E$ so that $f(x) \le p_K(x) \le 1 = f(y)$ for every $x \in K$.

Recall that if $x$ is an internal point of $K$, then $p_K(x) < 1$. □

Next we consider $E$ equipped with a vector topology over the reals and two disjoint subsets $A$ and $B$ in $E$. We say that these sets are separated if there exists a nonzero $f \in E'$ such that $f(A)$ [...]

Proof. We choose $a_0 \in A$, $b_0 \in B$. Then the convex set
\[ K = A - B + b_0 - a_0 = \bigcup_{y \in B}(A - y + b_0 - a_0) \]
is an open neighborhood of zero and $y = b_0 - a_0 \notin K$. An application of Theorem 4.16 provides a linear form $f$ such that $f(y) = 1$ and $f(K) < 1$. Also $f(-K) > -1$. Then $U = K \cap (-K)$ is an open neighborhood of zero such that $|f(U)| \le 1$, and $f \in E'$. If $a \in A$ and $b \in B$, then $f(a)$ [...]

Then $f(A)$ and $f(B)$ are two disjoint convex subsets in $\mathbf{R}$, and $f(A)$ is open. This is readily shown, since $f$ is a nonzero linear form and, if $U$ is a balanced and convex neighborhood of 0, then $f(U) = I \subset \mathbf{R}$ is a neighborhood of 0 in $\mathbf{R}$ and it follows as in the proof of Theorem 3.10 that $f(G)$ is open if $G$ is open. Thus, if $r = \inf f(B)$, $f(a)$ [...]


To prove (b), since $A$ is compact and $B$ is closed, for every $a \in A$ we can consider $(a + U_a + U_a) \cap B = \emptyset$, with $U_a$ a convex neighborhood of zero; then, by choosing $A \subset \bigcup_{n=1}^{N}(a_n + U_{a_n})$, $U = \bigcap_{n=1}^{N} U_{a_n}$ is an open convex set. The set $A + U$ is also open and convex, and it is disjoint with $B$, since
\[ A + U \subset \bigcup_{n=1}^{N}(a_n + U_{a_n} + U) \subset \bigcup_{n=1}^{N}(a_n + U_{a_n} + U_{a_n}). \]
Now we apply (a) to the couple $A + U$, $B$ to obtain $f \in E'$ so that $f(A+U)$ and $f(B)$ are two disjoint intervals in $\mathbf{R}$, $f(A)$ a compact interval contained in the open interval $f(A+U)$, and (b) follows. □

Remark 4.18. For a complex locally convex space $E$, separation refers to $E$ as a real locally convex space. Note that, if $f$ is a continuous real linear form such that $f(A)$ [...]

(a) If $K$ is a convex and balanced closed subset of $E$ and $x_0 \in E \setminus K$, then $|u(K)| \le 1$ and $u(x_0) > 1$, real, for some $u \in E'$.

(b) If $M$ is a closed vector subspace of $E$ and $x_0 \in E \setminus M$, then $u(M) = \{0\}$ and $u(x_0) = 1$ for some $u \in E'$.

Proof. (a) By Theorem 4.17, $K$ and $\{x_0\}$ are two strictly separated sets and $0 \in K$, so that we can choose $f = \Re u$ satisfying $f(x) < 1$ [...]

Proof. (a) If $q(x) := \|u\|_{F'}\|x\|_E$, by the Hahn-Banach Theorem 4.15, $u$ admits a linear extension $v$ that also satisfies $|v(x)| \le q(x) = \|u\|_{F'}\|x\|_E$, and obviously $\|v\|_{E'} \ge \|u\|_{F'}$.

If $y = x_1 - x_2 \ne 0$, it is easy to obtain $v \in E'$ such that $v(x_1 - x_2) = \|x_1 - x_2\|_E \ne 0$. Just extend $u(ty) := t\|y\|_E$ as above.

(b) On the subspace $Z = [x_0] \oplus F$ of $E$ we can define the linear form $u(\lambda x_0 + y) = \lambda d(x_0,F)$, so that $\|u\|_{Z'} \le 1$, since
\[ |u(\lambda x_0 + y)| \le |\lambda|\,\Bigl\|x_0 + \frac{1}{\lambda}y\Bigr\|_E = \|\lambda x_0 + y\|_E \qquad (\lambda \ne 0). \]
Then, by choosing $y_{\varepsilon} \in F$ so that $\|x_0 - y_{\varepsilon}\|_E < d(x_0,F) + \varepsilon$, we obtain
\[ \|u\|_{Z'} \ge \frac{|u(x_0 - y_{\varepsilon})|}{\|x_0 - y_{\varepsilon}\|_E} = \frac{d(x_0,F)}{\|x_0 - y_{\varepsilon}\|_E} > \frac{d(x_0,F)}{d(x_0,F) + \varepsilon} \]
for every $\varepsilon > 0$. Thus, $\|u\|_{Z'} = 1$ and $u$ has an extension to $v \in E'$ with $\|v\|_{E'} = 1$.

In the special case $F = \{0\}$, $v(x_0) = d(x_0,0) = \|x_0\|_E$ and $\|v\|_{E'} = 1$. □

Theorem 4.21. If $F$ is a subspace of a locally convex space $E$ and $u \in F'$, then there exists an extension of $u$ to a continuous linear form $v \in E'$.

If $x_1$ and $x_2$ are two distinct points of $E$, there exists $v \in E'$ so that $v(x_1) \ne v(x_2)$.

Proof. The topology of $F$ is defined by the restriction of any sufficient family of semi-norms for the topology of $E$. By Theorem 3.4, there exists a continuous semi-norm $q$ on $E$ so that $|u(y)| \le q(y)$ for all $y \in F$. By the Hahn-Banach Theorem 4.15, $u$ admits a linear extension $v$ that also satisfies $|v(x)| \le q(x)$ and $v \in E'$.

If $y = x_1 - x_2 \ne 0$, $u(\lambda y) := \lambda$ defines a linear form on $F = [y]$ and we can choose a continuous semi-norm $p$ on $E$ such that $p(y) \ne 0$. Then, with $c := 1/p(y)$, we have $|u(y)| = cp(y)$, $|u(\lambda y)| = cp(\lambda y)$, and $u$ has an extension to some linear form $v$ on $E$ such that $|v(x)| \le cp(x)$ and $v \in E'$. Since $u(y) \ne 0$, $v(x_1) \ne v(x_2)$. □

Assume that $E$ is a locally convex space and that $M$ is a closed subspace of $E$. If there exists a second closed subspace $N$ of $E$ such that $E = M \oplus N$, that is, $E = M + N$ and $M \cap N = \{0\}$, then $M$ and $N$ are said to be topologically complementary subspaces of $E$ and $M$ (and $N$) is a complemented subspace.

As an application of Theorem 4.20 and Theorem 4.21, let us show that finite-dimensional subspaces are complemented:

Theorem 4.22. Suppose that $E$ is a normed space (or any locally convex space) and that $N$ is a finite-dimensional vector subspace of $E$. Then $E = N \oplus M$ for some closed subspace $M$ of $E$.


Proof. Let $\{e_1,\ldots,e_n\}$ be a basis of $N$ and let $\pi_j$ ($1 \le j \le n$) be the corresponding projections, so that $y = \sum_{j=1}^{n} \pi_j(y)e_j$. Since $N$ has finite dimension, every projection is continuous and, by Theorem 4.21, it has a continuous linear extension $\tilde{\pi}_j : E \to \mathbf{K}$. We are going to show that we can take $M = \bigcap_{j=1}^{n} \operatorname{Ker}\tilde{\pi}_j$, which is a closed vector subspace of $E$.

Indeed, if $x \in E$, let $y := \sum_{j=1}^{n} \tilde{\pi}_j(x)e_j \in N$ and $z := x - y$. Then $\tilde{\pi}_j(z) = \tilde{\pi}_j(x) - \pi_j(y) = 0$ for every $j$ and $z \in M$, so that $E = N + M$. This sum is direct, since if $y = \sum_{j=1}^{n} \pi_j(y)e_j \in N$ is also in $M$, then $\pi_j(y) = \tilde{\pi}_j(y) = 0$ for every $j$ and $y = 0$. □

Remark 4.23. For any locally convex space $E$, separation for a point and a closed subspace $F$ can be obtained from Theorem 4.21 by means of the quotient map $\pi$ from $E$ onto the quotient space $E/F$. If $\mathcal{P}$ is a sufficient family of semi-norms on $E$, recall that $E/F$ is endowed with the topology defined by the family of semi-norms $\tilde{\mathcal{P}}$ defined by
\[ \tilde{p}(\tilde{x}) = \inf_{y \in \tilde{x}} p(y) = \inf_{z \in F} p(x - z) \]
and the quotient map is continuous. If $x \in E \setminus F$, then $0 \ne \tilde{x} \in E/F$ and we can find $\tilde{v} \in (E/F)'$ which satisfies $\tilde{v}(\tilde{x}) \ne 0$, so that we only need to define $v = \tilde{v} \circ \pi$.

4.3.4. Proofs by duality: annihilators, total sets, completion, and the transpose. Here we present some duality results that depend on the Hahn-Banach theorem. Most of these results turn out to be very useful in applications, in spite of the nonconstructive nature of that theorem, in whose proof we have used Zorn’s lemma.

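Let $E'$ be the dual of a locally convex space $E$ and write
\[ \langle x,u \rangle := u(x). \tag{4.12} \]
Then $\langle \cdot,\cdot \rangle : E \times E' \to \mathbf{K}$ is a bilinear form such that, if $u(x) = \langle x,u \rangle = 0$ for all $x \in E$, then $u = 0$, and also $x = 0$ if $\langle x,u \rangle = 0$ for all $u \in E'$, by Theorem 4.21.

The annihilator of $A \subset E$ is the closed subspace of $E'$
\[ A^{\circ} := \{v \in E';\ \langle a,v \rangle = 0\ \forall a \in A\} = \bigcap_{a \in A} \operatorname{Ker}\langle a,\cdot \rangle \]
and the annihilator of $U \subset E'$ is the closed subspace of $E$
\[ U^{\circ} := \{x \in E;\ \langle x,u \rangle = 0\ \forall u \in U\} = \bigcap_{u \in U} \operatorname{Ker} u. \]
Obviously, $A \subset B \Rightarrow B^{\circ} \subset A^{\circ}$, and $A \subset A^{\circ\circ}$.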


Annihilators play the role that orthogonality plays in Hilbert spaces. They can be used to characterize by duality the closure of a vector subspace:

Theorem 4.24. Suppose $E$ is a locally convex space. The closed linear span $\overline{[A]}$ of a subset $A$ of $E$ coincides with $A^{\circ\circ}$, the annihilator in $E$ of $A^{\circ} \subset E'$, so that $A$ is total in $E$ if and only if $A^{\circ} = \{0\}$. Thus, a vector subspace $F$ of $E$ is closed if and only if $F^{\circ\circ} = F$.

Proof. It is clear that the annihilator of $A$ coincides with the annihilator of the linear span $[A]$ of $A$ and, by continuity, with the annihilator of the closure of $[A]$; thus, if $F = \overline{[A]}$, we need to prove that $F^{\circ\circ} = F$.

Indeed, we have $F \subset F^{\circ\circ}$ and, if $x \notin F$, by Theorem 4.20(b), we can choose $v \in E'$ so that $v \in F^{\circ}$ and $v(x) \ne 0$. This shows that also $x \notin F^{\circ\circ}$.

Note that, if $F \ne E$, there exists $v \in F^{\circ}$, $v \ne 0$, so that $F^{\circ} \ne \{0\}$. □

Theorems 4.24 and 2.36 are the basis of certain approximation results. To prove that a point $x$ of a locally convex space $E$ lies in the closure of a subspace $F$, all we need is to show that $u(x) = 0$ for every $u \in E'$ that vanishes on $F$.

Theorem 4.25. For any normed space $E$ the mapping $J : E \to E''$ such that $J(x) = \hat{x}$, where $\hat{x}(u) = \langle x,u \rangle = u(x)$, is a linear isometry from $E$ into the Banach space $E''$, endowed with the norm $\|w\|_{E''} = \sup_{\|u\|_{E'} \le 1} |w(u)|$. Hence, the closure of $J(E)$ in $E''$ is a completion of $E$.

Proof. The function $\hat{x} = \langle x,\cdot \rangle$ is clearly linear on $E'$, and $\|\hat{x}\|_{E''} \le \|x\|_E$, since $|\hat{x}(u)| = |u(x)| \le \|u\|_{E'}\|x\|_E$. According to Theorem 4.20(b), we can find some $v \in E'$ such that $\|v\|_{E'} \le 1$ and $\hat{x}(v) = \|x\|_E$; thus $\|\hat{x}\|_{E''} = \|x\|_E$. Finally, $J$ is linear:

\[ J(x_1+x_2)(u) = u(x_1+x_2) = J(x_1)(u) + J(x_2)(u) = (J(x_1) + J(x_2))(u) \]
and also $J(\lambda x)(u) = \lambda J(x)(u)$ for every $u \in E'$. □

Assume now that $T : E \to F$ is a continuous linear operator between two normed spaces. The transpose $T'$ of $T$ is defined on every $v \in F'$ by $T'v = v \circ T$, which is obviously a linear and continuous functional on $E$, and clearly it depends linearly on $v$. Then $T' : F' \to E'$ is continuous, since
\[ \|T'v\|_{E'} = \|v \circ T\|_{E'} \le \|T\|\,\|v\|_{F'}. \tag{4.13} \]


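It is useful to rewrite the definition of $T'$ as $\langle Tx,v \rangle = \langle x,T'v \rangle$.

Theorem 4.26. The transposition map $T \in \mathcal{L}(E;F) \mapsto T' \in \mathcal{L}(F';E')$ is a linear isometry and for every $T \in \mathcal{L}(E;F)$ the following properties hold:

(a) $(\operatorname{Im} T)^{\circ} = \operatorname{Ker} T'$,

(b) $(\operatorname{Ker} T')^{\circ} = \overline{\operatorname{Im} T}$, and

(c) $(\operatorname{Im} T')^{\circ} = \operatorname{Ker} T$.

Proof. It is clear that $(T+S)' = T' + S'$ and $(\lambda T)' = \lambda T'$. Moreover, as in the Hilbert space case for the adjoint,
\[ \|T\| = \sup_{\|x\|_E \le 1,\ \|v\|_{F'} \le 1} |\langle Tx,v \rangle| = \sup_{\|x\|_E \le 1,\ \|v\|_{F'} \le 1} |\langle x,T'v \rangle| = \|T'\|. \]

(a) Note that $\langle Tx,v \rangle = \langle x,T'v \rangle$ and $v \in (\operatorname{Im} T)^{\circ}$ if and only if $\langle x,T'v \rangle = 0$ for all $x \in E$, that is, if and only if $v \in \operatorname{Ker} T'$.

(b) By (a) and Theorem 4.24, $(\operatorname{Ker} T')^{\circ} = (\operatorname{Im} T)^{\circ\circ} = \overline{\operatorname{Im} T}$.

(c) As in (a), since $\langle x,T'v \rangle = \langle Tx,v \rangle$, $x \in \operatorname{Ker} T$ if and only if $\langle x,T'v \rangle = 0$ for all $T'v \in \operatorname{Im} T'$. □

Remark 4.27. It is easily checked that also $\overline{\operatorname{Im} T'} \subset (\operatorname{Ker} T)^{\circ}$, but the reverse inclusion is not always true, as shown in Exercise 4.20.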

4.4. Spectral theory of compact operators

The principal axes theorem of analytical geometry asserts that any symmetric quadratic form on $\mathbf{R}^n$,

$$(Ax, x) = \sum_{i,j=1}^{n} \alpha_{ij} x_i x_j,$$

can be rewritten in the normal form $\sum_{i=1}^{n} \lambda_i x_i^2$ by means of an orthogonal transform. The general form of this theorem, in the language of matrices or operators, says that each real symmetric matrix $A$ is orthogonally equivalent to a diagonal matrix whose diagonal entries are the roots of the equation $\det(A - \lambda I) = 0$, that is, the eigenvalues of $A$.⁶ The earliest extensions to an infinite-dimensional theory were achieved after a construction of determinants of infinite systems. They were applied to integral equations around 1900, defined by operators with properties that are close to those of the finite-dimensional case.

6 This was obtained in 1852 by the English mathematician J. Sylvester in terms of the quadratic form $(Ax, x)$, and A. Cayley inaugurated the calculus of matrices in which the reduction to normal form corresponds to a diagonalization process.
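For $n = 2$ the principal axes theorem can be made explicit with a single rotation. The sketch below is illustrative only: the function names and the sample matrix are assumptions, and the rotation angle is the classical choice $\tan 2\theta = 2b/(a - d)$ for the matrix with rows $(a, b)$ and $(b, d)$.

```python
import math

def principal_axes_2x2(a, b, d):
    """Diagonalize [[a, b], [b, d]] by the rotation Q with columns (c, s), (-s, c).

    Returns (lam1, lam2, theta, off), where lam1 and lam2 are the diagonal
    entries of Q^T A Q and off is its off-diagonal entry (zero up to rounding).
    """
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    lam1 = a * c * c + 2.0 * b * s * c + d * s * s
    lam2 = a * s * s - 2.0 * b * s * c + d * c * c
    off = (d - a) * s * c + b * (c * c - s * s)
    return lam1, lam2, theta, off

def quadratic_form(a, b, d, x1, x2):
    """The form (Ax, x) in the original coordinates."""
    return a * x1 * x1 + 2.0 * b * x1 * x2 + d * x2 * x2

def normal_form(a, b, d, x1, x2):
    """The same value computed as lam1*y1^2 + lam2*y2^2 with y = Q^T x."""
    lam1, lam2, theta, _ = principal_axes_2x2(a, b, d)
    c, s = math.cos(theta), math.sin(theta)
    y1, y2 = c * x1 + s * x2, -s * x1 + c * x2
    return lam1 * y1 * y1 + lam2 * y2 * y2

# Hypothetical sample matrix [[2, 1], [1, 2]].
print(principal_axes_2x2(2.0, 1.0, 2.0))
```

The two diagonal entries returned are the roots of $\det(A - \lambda I) = 0$, and `normal_form` agrees with `quadratic_form` up to rounding.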


Here, for a first version of this spectral theory in the infinite-dimensional case, we are going to consider the class of compact operators in Banach spaces.

Let $T : E \to F$ be a linear map between Banach spaces and let $B_E$ be the closed unit ball of $E$. Then $T$ is said to be compact if $\overline{T(B_E)}$ is compact in $F$, that is, if every sequence in $T(B_E)$ has a Cauchy subsequence (see Exercise 1.3).

Such an operator is bounded, since $\overline{T(B_E)} \subset \bigcup_{n=0}^{\infty} B_F(0, n)$ and, by compactness, $T(B_E) \subset B_F(0, N)$ for some $N$.

Every bounded linear operator $T$ with finite-dimensional range $T(E)$ is compact, since in $T(E)$ the closure of the bounded set $T(B_E)$ is compact, by Theorem 2.25.

We will use the notation $\mathcal{L}_c(E; F)$ to represent the collection of all compact linear operators between $E$ and $F$, and $\mathcal{L}_c(E) = \mathcal{L}_c(E; E)$.

4.4.1. Elementary properties. The following theorem is useful to prove the compactness of certain operators:

Theorem 4.28 (Ascoli-Arzelà⁷). Let $K$ be a compact metric space and assume that $\Phi \subset \mathcal{C}(K)$ satisfies the following two conditions:
1. $\sup_{f \in \Phi} |f(x)| < \infty$ for every $x \in K$ ($\Phi(x)$ is bounded for every $x \in K$).
2. For every $\varepsilon > 0$ there is some $\delta > 0$ such that $\sup_{f \in \Phi} |f(x) - f(y)| \le \varepsilon$ if $d(x, y) \le \delta$ (we say that $\Phi$ is "equicontinuous").

Then $\bar{\Phi}$ is compact in $\mathcal{C}(K)$.

Proof. If $\delta = 1/m$, the compact set $K$ has a finite covering by balls $B_K(c_{m,j}, \delta)$, and the collection of all the centers for $m = 1, 2, \ldots$ is a countable dense set $C = \{c_k\}$ in $K$.

Let $\{f_n\} \subset \Phi$. By the first condition, we can select a convergent subsequence $\{f_{n,1}(c_1)\}$, then we obtain a subsequence $\{f_{n,2}\} \subset \{f_{n,1}\}$ so that $\{f_{n,2}(c_2)\}$ is also convergent, and so on. Let $\{f_{n'}\}$ be the diagonal sequence $\{f_{m,m}\}$, which is convergent at every $c \in C$.

7 In 1884 Giulio Ascoli (at the Politecnico di Milano) needed the assumption of equicontinuity to prove that a sequence of uniformly bounded functions possesses a convergent subsequence. In 1889 Cesare Arzelà (in Bologna) considered the case of continuous functions and proved what is nowadays usually called the theorem of Ascoli-Arzelà and in 1896 he published a paper in which he applied his results to prove, under certain extra assumptions, the Dirichlet principle.


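For $\varepsilon > 0$, let $\delta = 1/m$ be as in condition 2. For every $x \in K$, we choose $c_j = c_j(x) \in C$ so that $x \in \bar{B}_K(c_j, \delta)$ with $j \le n(\varepsilon)$ and then

$$|f_{p'}(x) - f_{q'}(x)| \le |f_{p'}(x) - f_{p'}(c_j)| + |f_{p'}(c_j) - f_{q'}(c_j)| + |f_{q'}(c_j) - f_{q'}(x)| \le 2\varepsilon + |f_{p'}(c_j) - f_{q'}(c_j)|,$$

where

$$|f_{p'}(c_j) - f_{q'}(c_j)| \le \max_{k \le n(\varepsilon)} |f_{p'}(c_k) - f_{q'}(c_k)| \to 0,$$

so that $\|f_{p'} - f_{q'}\|_K \to 0$. □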

The following result gathers together some of the basic properties of compact linear operators:

Theorem 4.29. The collection $\mathcal{L}_c(E; F)$ of all compact linear operators between two Banach spaces is a closed subspace of $\mathcal{L}(E; F)$ and the right or left composition of a compact linear operator with a bounded linear operator is compact.

If $T \in \mathcal{L}_c(E; F)$, then also $T' \in \mathcal{L}_c(F'; E')$; if $E$ and $F$ are Hilbert spaces, $T^* \in \mathcal{L}_c(F; E)$ (Schauder theorem⁸).

Proof. It is clear that Lc(E; F ) is a linear subspace of L(E; F ).

If $T = \lim_n T_n$ with $T_n \in \mathcal{L}_c(E; F)$, to prove that $T$ is compact, let $\varepsilon > 0$ and let $N$ be such that $\|T - T_N\| \le \varepsilon/2$. Since $\overline{T_N(B_E)}$ is compact, it is covered by a finite collection of balls $B_F(y_i, \varepsilon/2)$ ($i \in I$), and then $T(B_E) \subset \bigcup_{i \in I} B_F(y_i, \varepsilon)$, since for every $x \in B_E$ we have $\|Tx - T_N x\|_F \le \varepsilon/2$ and $T_N x \in B_F(y_i, \varepsilon/2)$ for some $i \in I$.

Now, by considering $\varepsilon_m = 1/m$, it is easy to check that every sequence $\{Tx_n\}$ with $x_n \in B_E$ has a Cauchy subsequence, since we obtain successive subsequences $\{Tx_{n,m}\} \subset \{Tx_{n,m-1}\}$ contained in some $B_F(y, \varepsilon_m)$ and the diagonal sequence $\{Tx_{m,m}\}$ is a Cauchy subsequence of $\{Tx_n\}$.

Let $S = TR$ and assume first that $T : F \to G$ is compact and $R : E \to F$ is bounded. Then $R(B_E) \subset \|R\| B_F$ and $S(B_E) \subset \|R\| T(B_F)$, whose closure is compact; hence, $S$ is compact. If $T$ is bounded and $R$ compact, then $\overline{R(B_E)}$ is compact and its image by the continuous map $T$ is also compact, so that $S(B_E)$ is contained in a compact set.

Suppose $T$ is compact and consider $\{v_n\}_{n \in \mathbf{N}} \subset B_{F'}$. To obtain a Cauchy subsequence of $\{T'v_n\}$, let $K := \overline{T(B_E)}$ and $\Phi := \{f_n = v_n|_K ;\ n \in \mathbf{N}\} \subset$

8 In 1930, the Polish mathematician Julius Pawel Schauder proved this result that allowed the use of duality in the Riesz-Fredholm theory for general Banach spaces. Schauder is well known for his fixed point theorem, for the Schauder bases in Banach spaces, and for the Leray-Schauder principle on partial differential equations. See also footnote 3 in Chapter 3.


$\mathcal{C}(K)$. Then $\Phi$ is equicontinuous because

$$|f_n(Tx) - f_n(Ty)| = |v_n(Tx) - v_n(Ty)| \le \|Tx - Ty\|,$$

and it is also uniformly bounded, since $\|f_n\|_K \le \|T\| \, \|v_n\|_{F'} \le \|T\|$. According to the Ascoli-Arzelà Theorem 4.28, $\bar{\Phi}$ is compact in $\mathcal{C}(K)$ and $\{f_n\}$ has a Cauchy subsequence $\{f_{n_k}\}$, so that

$$\|T'v_{n_p} - T'v_{n_q}\|_{E'} = \sup_{x \in B_E} |v_{n_p}(Tx) - v_{n_q}(Tx)| \le \sup_{a \in K} |f_{n_p}(a) - f_{n_q}(a)| \to 0$$

as $p, q \to \infty$. □

Example 4.30. Every Volterra operator

$$Tf(x) = \int_a^x K(x, y) f(y)\, dy,$$

where $K(x, y)$ is continuous on $\Delta = \{(x, y) \in [a, b] \times [a, b];\ a \le y \le x \le b\}$, is a compact operator $T : \mathcal{C}[a, b] \to \mathcal{C}[a, b]$, since $\Phi = T(B_{\mathcal{C}[a,b]})$ satisfies the conditions of the Ascoli-Arzelà Theorem 4.28.

Indeed, if $a \le t \le s \le b$ and if for a given $\varepsilon > 0$ we choose $\delta > 0$ so that $|K(s, y) - K(t, y)| \le \varepsilon$ if $|s - t| \le \delta$ and $(s, y), (t, y) \in \Delta$ ($K$ is uniformly continuous on $\Delta$), then

$$|Tf(s) - Tf(t)| \le \int_a^t |K(s, y) - K(t, y)| \, |f(y)|\, dy + \int_t^s |K(s, y)| \, |f(y)|\, dy \le (b - a)\varepsilon + \|K\|_\Delta |s - t|$$

for every $f \in B_{\mathcal{C}[a,b]}$ if $|s - t| \le \delta$, and $\Phi$ is equicontinuous. Obviously it is uniformly bounded, since $\|Tf\|_{[a,b]} \le \|K\|_\Delta (b - a) \|f\|_{[a,b]}$.

Example 4.31. It is shown in a similar way that every Fredholm operator $T_K$, defined as in (2.22) by a continuous integral kernel $K$, is compact. Note that

$$|T_K f(s) - T_K f(t)| \le \int_c^d |K(s, y) - K(t, y)| \, |f(y)|\, dy \le (d - c)\varepsilon$$

if $\|f\|_{[c,d]} \le 1$ and $|s - t| \le \delta$ small, by the uniform continuity of $K$.

Example 4.32. The Hilbert-Schmidt operator $T_K : L^2(Y) \to L^2(X)$, defined as in Theorem 4.6 by a kernel $K \in L^2(X \times Y)$, is also compact.

This is proved by choosing a couple of orthogonal bases $\{u_n\} \subset L^2(X)$ and $\{v_m\} \subset L^2(Y)$, so that an application of Fubini's theorem shows that the products $w_{n,m}(x, y) = u_n(x) v_m(y)$ form an orthogonal basis in $L^2(X \times Y)$. By the Fischer-Riesz Theorem 2.37,

$$K = \sum_{n,m} c_{n,m} w_{n,m},$$

with convergence in $L^2(X \times Y)$. The Hilbert-Schmidt operator defined by the kernel

$$K_N = \sum_{n,m \le N} c_{n,m} w_{n,m}$$

is compact, since it is continuous by Theorem 4.6 and its range is finite-dimensional, contained in $[u_n;\ n \le N]$. Now, again by Theorem 4.6,

$$\|T_K - T_{K_N}\| \le \|K - K_N\|_2 \to 0$$

and, according to Theorem 4.29, $T_K$ is compact.

4.4.2. The Riesz-Fredholm theory. For compact operators there is a complete spectral theory.9

Recall that $\lambda \in \mathbb{K}$ is an eigenvalue of $T \in \mathcal{L}(E)$ if $N_T(\lambda) = \operatorname{Ker}(T - \lambda I)$ is nonzero, that is, if $T - \lambda I$ is not one-to-one. Every nonzero $x \in N_T(\lambda)$ is called an eigenvector, and $N_T(\lambda)$ is an eigenspace. Obviously $T = \lambda I$ on $N_T(\lambda)$, which is a closed subspace of $E$ and it is invariant for $T$. The multiplicity of an eigenvalue $\lambda$ is the dimension of $N_T(\lambda)$.

The spectrum¹⁰ of $T$ is the set $\sigma(T)$ of all scalars $\lambda$ such that $T - \lambda I$ is not invertible. That is, $\lambda \in \sigma(T)$ if either $\lambda$ is an eigenvalue of $T$ or the range $R_T(\lambda) := \operatorname{Im}(T - \lambda I)$ is not all of $E$. This subspace of $E$ need not be closed, but it is also invariant for $T$, since $T(Tx - \lambda x) = T(Tx) - \lambda Tx$ belongs to $R_T(\lambda)$.

Note that if $T$ is compact and $0 \notin \sigma(T)$, then $\dim(E) < \infty$, since in this case $T^{-1}$ is continuous by the open mapping theorem and $I = T^{-1}T$ is compact, so that the unit ball $B_E$ and the unit sphere $S_E = \{x;\ \|x\|_E = 1\}$ are compact, and Theorem 2.28 applies.

Theorem 4.33 (The Fredholm alternative). Suppose $T \in \mathcal{L}_c(E)$ and $\lambda \ne 0$. Then:

(a) $\dim N_T(\lambda) < \infty$.
(b) $R_T(\lambda) = N_{T'}(\lambda)^\circ$, and it is closed in $E$. If $E$ is a Hilbert space, then $R_T(\lambda) = N_{T^*}(\bar{\lambda})^\perp$.
(c) $N_T(\lambda) = \{0\}$ if and only if $R_T(\lambda) = E$; thus, if $0 \ne \lambda \in \sigma(T)$, then $\lambda$ is an eigenvalue of $T$.

9 David Hilbert constructed his spectral theory for $\ell^2$ essentially with Fredholm's method (see footnote 5 in Chapter 2), and F. Riesz followed him to develop the theory on $L^2$. In 1918, Riesz extended Hilbert's notion of a compact operator to complex functional spaces, a few years before the introduction of general Banach spaces.

10 In this context, the term spectrum was coined by D. Hilbert when dealing with integral equations with quadratic forms, or, equivalently, with linear operators on $\ell^2$.


Proof. Since $T - \lambda I = \lambda(\lambda^{-1}T - I)$ and $\lambda^{-1}T \in \mathcal{L}_c(E)$, we can assume without loss of generality that $\lambda = 1$. Denote $N = N_T(1)$ and $R = R_T(1)$.

(a) Since $T|_N = I : N \to N$ is compact, $\dim N < \infty$.

(b) According to Theorem 4.22, $E = N \oplus M$, $M$ a closed subspace of $E$. Then $S := (T - I)|_M : M \to R$ is a continuous linear bijection and we only need to show that $S^{-1}$ is also continuous, which means that $C\|x\|_E \le \|Sx\|_E$ for some constant $C > 0$ and for all $x \in M$.

If this were not the case, then we would find $x_n \in M$ so that $\|x_n\|_E = 1$ and $\|Sx_n\|_E \le 1/n$, and, since $T$ is compact, passing to a subsequence if necessary, $Tx_n \to z$ and $Sx_n \to 0$, with $z \in M$, since $x_n = Tx_n - Sx_n \to z$ and $M$ is closed. Moreover, $Sz = \lim_n Sx_n = 0$ and $z = 0$ since $S$ is one-to-one, which is in contradiction to $\|z\|_E = \lim_n \|x_n\|_E = 1$.

By the properties of the transpose, $R = \bar{R} = (\operatorname{Ker}(T' - I))^\circ$.

(c) Suppose $N = \{0\}$ and $R(1) := R \ne E$. Then $T : R(1) \to R(1)$ is compact and, according to (b), $R(2) = (T - I)(R(1))$ is a closed subspace of $R(1)$ and $R(2) \ne R(1)$, since $T - I$ is one-to-one. In this way, by denoting $R(n) = (T - I)^n(E)$, we obtain a strictly decreasing sequence of closed subspaces.

As in Remark 2.27, we choose $u_n \in R(n)$ so that $d(u_n, R(n + 1)) \ge 1/2$ and $\|u_n\|_E = 1$. Then, if $p > q$,

$$Tu_p - Tu_q = (T - I)u_p - (T - I)u_q + u_p - u_q = z - u_q$$

with $z \in R(p + 1) + R(q + 1) + R(p) \subset R(q + 1)$, so that $\|Tu_p - Tu_q\|_E \ge 1/2$, which is impossible, since $T$ is compact. This shows that $R = E$ if $N = \{0\}$.

For the converse suppose that $R = E$, so that $\operatorname{Ker}(T' - I) = \operatorname{Im}(T - I)^\circ = R^\circ = \{0\}$ and we can apply the previous result to $T'$, which is compact by Schauder's theorem. Hence $N = \operatorname{Im}(T' - I)^\circ = (E')^\circ = \{0\}$. □

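Theorem 4.34. Let $T \in \mathcal{L}_c(E)$. Then $\sigma(T) \subset \{\lambda;\ |\lambda| \le \|T\|\}$ and, for every $\delta > 0$ there are only a finite number of eigenvalues $\lambda$ of $T$ such that $|\lambda| \ge \delta$.

Proof. If $Tx = \lambda x$ and $\|x\|_E = 1$, then $|\lambda| = \|Tx\|_E \le \|T\|$.

Suppose that, for some $\delta > 0$, there are infinitely many different eigenvalues $\lambda_n$ such that $|1/\lambda_n| \le 1/\delta$, and let $Tx_n = \lambda_n x_n$ with $\|x_n\|_E = 1$. These eigenvectors are linearly independent, since $x_n = \beta_1 x_1 + \cdots + \beta_{n-1} x_{n-1}$ with $x_1, \ldots, x_{n-1}$ linearly independent would imply, after an application of $T$, that $\beta_j = \beta_j \lambda_j / \lambda_n$ ($1 \le j < n$), so that every $\beta_j = 0$ because $\lambda_j \ne \lambda_n$, which is impossible since $x_n \ne 0$.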

We can apply Remark 2.27 to the spaces $M_n = [x_1, \ldots, x_n]$ to obtain $u_n = \beta_1 x_1 + \cdots + \beta_n x_n \in M_n$ such that $\|u_n\|_E = 1$ and $d(u_n, M_{n-1}) \ge 1/2$.


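Then the sequence $\{u_n/\lambda_n\}$ is bounded and we arrive at a contradiction with the compactness of $T$ by showing that its image $\{Tu_n/\lambda_n\}$ has no Cauchy subsequence:

It is easily checked that $u_n - Tu_n/\lambda_n \in [x_1, \ldots, x_{n-1}] = M_{n-1}$ and, if $p > q$,

$$\Big\| \frac{1}{\lambda_p} Tu_p - \frac{1}{\lambda_q} Tu_q \Big\|_E = \Big\| u_p - \Big( u_p - \frac{1}{\lambda_p} Tu_p + \frac{1}{\lambda_q} Tu_q \Big) \Big\|_E \ge 1/2,$$

since the term in parentheses belongs to $M_{p-1}$. □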

Remark 4.35. If $E$ is any complex Banach space and $T \in \mathcal{L}(E)$, it will be proved in Theorem 8.10 that $\sigma(T)$ is always a nonempty subset of $\mathbf{C}$ which is contained in the disc $\{\lambda;\ |\lambda| \le r(T)\}$, where

$$r(T) = \lim_{n \to \infty} \|T^n\|^{1/n} = \inf_n \|T^n\|^{1/n} \le \|T\|.$$

If $T \in \mathcal{L}_c(E)$, Theorems 4.33 and 4.34 show that $\sigma(T) \setminus \{0\}$ is a finite or countable set of eigenvalues with finite multiplicity.

These nonzero eigenvalues will be repeated according to their multiplicity in a sequence $\{\lambda_n\}$ so that $\{|\lambda_n|\}$ is decreasing. If this sequence is infinite, then $\lambda_n \to 0$, since for every $\varepsilon > 0$, according to Theorem 4.34, only a finite number of them satisfy $|\lambda_n| > \varepsilon$.

It may happen that $\sigma(T) \setminus \{0\} = \emptyset$, as shown by Theorem 2.30 for Volterra operators $T \in \mathcal{L}_c(\mathcal{C}[a, b])$. For these operators $\sigma(T) = \{0\}$, since $\mathcal{C}[a, b]$ is infinite dimensional.
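The vanishing spectral radius of a Volterra operator can be observed numerically in the simplest case. The sketch below assumes the kernel $K \equiv 1$ on $[0, 1]$, so that $Tf(x) = \int_0^x f(y)\,dy$; because this kernel is nonnegative, $\|T^n\|$ equals the sup norm of $T^n 1$, which is $x^n/n!$. The grid size and function names are illustrative choices, not part of the text.

```python
import math

def integrate_from_zero(values, h):
    """Cumulative trapezoid rule: grid values of x -> int_0^x f(y) dy."""
    out = [0.0]
    for k in range(1, len(values)):
        out.append(out[-1] + 0.5 * h * (values[k - 1] + values[k]))
    return out

def volterra_power_norms(n_max, grid=2001):
    """Approximate ||T^n|| = sup |T^n 1| on [0, 1] for n = 1, ..., n_max."""
    h = 1.0 / (grid - 1)
    f = [1.0] * grid  # the constant function 1
    norms = []
    for _ in range(n_max):
        f = integrate_from_zero(f, h)
        norms.append(max(abs(value) for value in f))
    return norms

norms = volterra_power_norms(8)
roots = [norm ** (1.0 / n) for n, norm in enumerate(norms, start=1)]
for n, (norm, root) in enumerate(zip(norms, roots), start=1):
    # Compare with the exact value 1/n! and show the n-th root.
    print(n, norm, 1.0 / math.factorial(n), root)
```

The printed $n$-th roots decrease, in agreement with $r(T) = \lim_n (1/n!)^{1/n} = 0$ for this particular operator.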

In the special case of a self-adjoint compact operator of a Hilbert space, $H$, the spectral theorem will show the existence of eigenvalues and will give a diagonal representation for the operator.

Assume that $0 \ne A = A^* \in \mathcal{L}_c(H)$. Note that eigenvectors of different eigenvalues are orthogonal, since it follows from $Ax = \alpha x$ and $Ay = \beta y$ that $(\alpha - \beta)(x, y)_H = (Ax, y)_H - (x, Ay)_H = 0$.

If $M(A) := \sup_{\|x\|_H = 1} (Ax, x)_H$ and $m(A) := \inf_{\|x\|_H = 1} (Ax, x)_H$, then

$$\|A\| = \sup_{\|x\|_H = 1} |(Ax, x)| = \max(M(A), -m(A)),$$

by Theorem 4.7, and every eigenvalue $\lambda$ is in the interval $[m(A), M(A)]$, since $Ax = \lambda x$ for some $x \in H$ with $\|x\|_H = 1$, and then $\lambda = (Ax, x)_H$.


Theorem 4.36 (Hilbert-Schmidt spectral theorem¹¹). The self-adjoint compact operator $A \ne 0$ has an eigenvalue $\alpha$ such that $|\alpha| = \|A\|$, and either $\alpha = M(A)$ if $\|A\| = M(A)$ or $\alpha = m(A)$ if $\|A\| = -m(A)$.

Moreover, if $\{u_n\}$ is an orthonormal sequence of eigenvectors associated to the sequence $\{\lambda_n\}$ of nonzero eigenvalues, then

$$Ax = \sum_{n \ge 1} \lambda_n (x, u_n)_H u_n \qquad (x \in H)$$

in $H$ and, if there are infinitely many eigenvalues, then $A_N \to A$ in $\mathcal{L}(H)$ as $N \to \infty$, where

$$A_N x := \sum_{n=1}^{N} \lambda_n (x, u_n)_H u_n.$$

Proof. If $\|A\| = M(A)$, then $M(A) = \lim_n (Ax_n, x_n)_H$ with $\|x_n\|_H = 1$ and $\|Ax_n\|_H \le M(A)$. Then it follows from

$$\|Ax_n - M(A)x_n\|_H^2 = \|Ax_n\|_H^2 - 2M(A)(Ax_n, x_n)_H + M(A)^2 \le 2M(A)^2 - 2M(A)(Ax_n, x_n)_H$$

that $\lim_n (Ax_n - M(A)x_n) = 0$ and $M(A) \in \sigma(A)$, since if $\lim_n (Ax_n - \lambda x_n) = 0$ and $\lambda \notin \sigma(A)$, then $x_n = (A - \lambda I)^{-1}(Ax_n - \lambda x_n) \to 0$ by continuity, which contradicts $\|x_n\|_H = 1$. But $M(A) = \|A\| \ne 0$ and, by Theorem 4.33(c), $M(A)$ is an eigenvalue.

Similarly, if $\|A\| = -m(A)$, then $m(A) = \lim_n (Ax_n, x_n)$ with $\|x_n\|_H = 1$ and $\|Ax_n\|_H \le -m(A)$, so that, with the same proof as before, $\|Ax_n - m(A)x_n\|_H^2 \to 0$ and $m(A) \in \sigma(A) \setminus \{0\}$.

To prove the second part of the theorem, note first that, if $N = \operatorname{Ker} A$ and $F = \overline{[u_1, u_2, \ldots]}$, then $F = N^\perp$, so that $H = F \oplus N$.

Indeed, if $Ax = 0$, then $(x, u_n)_H = 0$ for all $n \ge 1$, and $N \subset F^\perp$. It follows from $A(F) \subset F$ that also $A(F^\perp) \subset F^\perp$: if $z \in F^\perp$, then $(Az, x)_H = (z, Ax)_H = 0$ for all $x \in F$, since $Ax \in F$. But necessarily $A(F^\perp) = \{0\}$, and also $F^\perp \subset N$, since the restriction $A : F^\perp \to F^\perp$ is a self-adjoint compact operator, and if we suppose that it is nonzero, then, according to the first part of this theorem, $A$ would have a nonzero eigenvalue $\alpha$ which should be one of the eigenvalues $\lambda_n$ of $A : H \to H$, so that $u_n \in F^\perp \cap F$, a contradiction.

11 D. Hilbert first developed his spectral theory for a large class of operators with a spectrum containing only eigenvalues, and E. Schmidt identified them as the compact operators through a "complete continuity condition". See Exercise 5.13.


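Now let $x = y + z \in F \oplus N$ with $y \in F$ and $z \in N$. By the Fischer-Riesz theorem,

$$y = \sum_{n \ge 1} (y, u_n)_H u_n = \sum_{n \ge 1} (x, u_n)_H u_n$$

and

$$Ax = Ay = \sum_{n \ge 1} (x, u_n)_H Au_n = \sum_{n \ge 1} \lambda_n (x, u_n)_H u_n.$$

To show that $A_N \to A$, let $\|x\|_H \le 1$. Then, using Bessel estimates,

$$\|(A - A_N)x\|_H^2 = \sum_{n > N} |\lambda_n (x, u_n)_H|^2 \le |\lambda_N|^2 \sum_{n > N} |(x, u_n)_H|^2 \le |\lambda_N|^2$$

and $\lambda_N \to 0$. □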
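The Bessel estimate at the end of the proof can be checked on a finite-dimensional model. The sketch below is an assumption-laden illustration: the operator is taken to be diagonal in an orthonormal basis of a 500-dimensional space, with hypothetical eigenvalues $\lambda_n = (-1)^{n+1}/n$, and vectors are represented by their coefficients $(x, u_n)_H$.

```python
import math

def apply_truncated(eigenvalues, coefficients, N):
    """Coefficients of A_N x: keep only the first N terms lambda_n (x, u_n) u_n."""
    return [lam * c if n < N else 0.0
            for n, (lam, c) in enumerate(zip(eigenvalues, coefficients))]

def norm(vector):
    """Hilbert norm of a vector given by its coefficients in an orthonormal basis."""
    return math.sqrt(sum(t * t for t in vector))

def truncation_error(eigenvalues, coefficients, N):
    """The norm of (A - A_N) x."""
    full = apply_truncated(eigenvalues, coefficients, len(eigenvalues))
    part = apply_truncated(eigenvalues, coefficients, N)
    return norm([a - b for a, b in zip(full, part)])

# Hypothetical data: |lambda_n| decreasing to 0, and a unit vector x.
dim = 500
eigenvalues = [(-1) ** n / (n + 1) for n in range(dim)]
raw = [1.0 / (n + 1) for n in range(dim)]
x = [c / norm(raw) for c in raw]

for N in (1, 10, 100):
    # Error of the N-term truncation next to the first discarded |eigenvalue|.
    print(N, truncation_error(eigenvalues, x, N), abs(eigenvalues[N]))
```

In each printed row the error stays below the first discarded eigenvalue in absolute value, which is slightly sharper than the bound $|\lambda_N|$ used in the proof.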

An approximate eigenvalue of a linear operator $A \in \mathcal{L}(H)$ is a number $\lambda$ such that $\lim_n \|Ax_n - \lambda x_n\|_H = 0$ for some sequence of vectors $x_n \in H$ such that $\|x_n\|_H = 1$. In this case $(Ax_n - \lambda x_n, x_n)_H \to 0$ and $\lambda = \lim_n (Ax_n, x_n)$, so that $\lambda \in [m(A), M(A)]$ if $A$ is self-adjoint.

Approximate eigenvalues belong to $\sigma(A)$ since, if $\lim_n \|Ax_n - \lambda x_n\|_H = 0$ and $\lambda \in \sigma(A)^c$, then $x_n = (A - \lambda I)^{-1}(Ax_n - \lambda x_n) \to 0$ as $n \to \infty$ would be in contradiction to the condition $\|x_n\|_H = 1$ for all $n$.

Remark 4.37. If $A$ is a bounded self-adjoint operator, it follows from the proof of Theorem 4.36 that both $m(A)$ and $M(A)$ are approximate eigenvalues.¹²

Indeed, we may assume without loss of generality that $0 \le m(A) \le M(A) = \|A\|$, since $M(A)$ is an approximate eigenvalue of $A$ if and only if $M(A) + t$ is an approximate eigenvalue of $A + tI$. The case $\lambda = m(A)$ is similar.

4.5. Exercises

Exercise 4.1 (Banach limits). Consider the delay operator $\tau x(n) = x(n + 1)$ acting on real sequences $x = \{x(n)\}_{n=1}^{\infty} \in \ell^\infty$ and the averages

$$\Lambda_n x = \frac{x(1) + \cdots + x(n)}{n}.$$

Prove that $p(x) := \limsup_{n \to \infty} \Lambda_n x$

12 It will be proved in Theorem 9.9 that every spectral value of $A$ is an approximate eigenvalue, so that $[m(A), M(A)]$ is the least interval which contains $\sigma(A)$.

defines a convex functional $p$ on $E = \ell^\infty$ and show that there exists a linear functional $\Lambda : \ell^\infty \to \mathbf{R}$ such that, for every $x \in \ell^\infty$, $\Lambda(\tau x) = \Lambda(x)$ and

$$\liminf_{n \to \infty} x(n) \le \Lambda(x) \le \limsup_{n \to \infty} x(n).$$

Exercise 4.2. Let $x_0$ be a point in a real normed space $E$. If $\|x_0\|_E = 1$, show that there exists $u \in E'$ such that $u(x_0) = 1$ and so that the ball $\bar{B}_E(0, 1)$ lies in the half-space $\{u \le 1\}$.

Exercise 4.3. Let $M$ be a closed subspace of a locally convex space $E$. Prove that if $M$ is of finite codimension (that is, $\dim(E/M) < \infty$), then $M$ is complemented in $E$.

Exercise 4.4. Suppose $T : F \to \ell^\infty$ is a bounded linear operator on a subspace $F$ of a normed space $E$. Prove that $T$ can be extended to a bounded linear map $\tilde{T} : E \to \ell^\infty$ with the same norm, $\|\tilde{T}\| = \|T\|$.

Exercise 4.5. If $E$ is a topological vector space, prove that a linear form $u$ on $E$ is continuous if and only if $\operatorname{Ker} u$ is closed.

Exercise 4.6. Suppose that $E$ is a locally convex space and that $A = \{e_n;\ n \in \mathbf{N}\}$ satisfies the following properties:

$$e_n \to 0, \quad E = [A], \quad \text{and} \quad e_n \notin [e_j;\ j \ne n] \ \ \forall n \in \mathbf{N}.$$

If $x = \sum_{n=1}^{N(x)} \pi_n(x) e_n \in E$, then prove that the projections $\pi_n$ are continuous linear forms on $E$ and that the convex hull $\operatorname{co}(K)$ of $K = A \cup \{0\}$ is a closed subset of $E$ which is not compact, but $K$ is compact. Find a concrete example for $E$ and $A$.

Exercise 4.7. Prove that the completion $\tilde{H}$ of a normed space $H$ with a norm defined by a scalar product, $\|x\|_H^2 = (x, x)_H$, is a Hilbert space.

Exercise 4.8. Let $H$ be a Hilbert space. When identifying every $x \in H$ with $(\cdot, x) \in H'$, show that $A^\perp = A^\circ$ for any subset $A$ of $H$.

Exercise 4.9. We must be careful when identifying $(L^2)' = L^2$ or $(\ell^2)' = \ell^2$, if we are dealing simultaneously with several spaces. Consider the example $H = \ell^2$ and the weighted $\ell^2$ space

$$V = \ell^2(\{n^2\}_{n=1}^{\infty}) := \Big\{ x = \{x_n\}_{n=1}^{\infty};\ \sum_{n=1}^{\infty} n^2 |x_n|^2 < \infty \Big\}$$

with the scalar product $(x, y)_V := \sum_{n=1}^{\infty} n^2 x_n \bar{y}_n$.

Prove that $V$ is a Hilbert space with a continuous inclusion $V \hookrightarrow \ell^2$ and that every $u \in (\ell^2)'$ is uniquely determined by its restriction $u|_V$ to $V$, which is a bounded linear form on $V$, so that a continuous inclusion $(\ell^2)' \hookrightarrow V'$ is defined and, by considering $(\ell^2)' = \ell^2$, we obtain $V \subset \ell^2 \subset V'$.


It would be nonsense to also consider $V' = V$, and one must choose $(\ell^2)' = \ell^2$ or $V' = V$ when dealing with both $V$ and $\ell^2$.

Exercise 4.10. Let $E$, $F$, and $G$ be three normed spaces. Show that a bilinear or sesquilinear map $B : E \times F \to G$ is continuous if and only if there exists a constant $C \ge 0$ such that

$$\|B(x, y)\|_G \le C \|x\|_E \|y\|_F \qquad (x \in E,\ y \in F).$$

Exercise 4.11. Prove that the inclusions $\mathcal{C}^{m+1}[a, b] \hookrightarrow \mathcal{C}^m[a, b]$ are compact.

Exercise 4.12. If $\{f_k\}$ is a bounded sequence of $\mathcal{E}(\mathbf{R})$, then show that for every $m \in \mathbf{N}$ there is a subsequence of $\{f_k^{(m)}\}_{k=1}^{\infty}$ which is uniformly convergent on compact subsets of $\mathbf{R}$ and $\mathcal{E}(\mathbf{R})$ has the Heine-Borel property. Extend this to every $\mathcal{E}(\Omega)$, $\Omega$ an open subset of $\mathbf{R}^n$, and prove that these spaces (and also $\mathcal{D}_K(\Omega)$ if $K$ has nonempty interior) are not normable.

Exercise 4.13. Suppose 0


Exercise 4.19. Prove that $\ell^1$ is isometrically isomorphic to the dual of $c_0$, the Banach subspace of $\ell^\infty$ of all the sequences with limit 0.

Exercise 4.20. Prove that if $T : \ell^1 \to \ell^1$ is defined by $T(x_n) = (x_n/n)$, then $\overline{\operatorname{Im} T'} \ne (\operatorname{Ker} T)^\circ$.

Exercise 4.21. Prove that the set of all the characteristic functions $\chi_I$ of intervals $I \subset (a, b)$ is total in $L^p(a, b)$, for every $1 \le p < \infty$.

Exercise 4.22 (Minkowski integral inequality). Let $K(x, y)$ be a measurable function on $\mathbf{R}^2$ and let $1 \le p \le \infty$. Using the duality properties of $L^p$, prove that

$$\Big\| \int_{-\infty}^{+\infty} K(\cdot, y)\, dy \Big\|_p \le \int_{-\infty}^{+\infty} \|K(\cdot, y)\|_p\, dy$$

first if $K \ge 0$ and then when $K(\cdot, y) \in L^p(\mathbf{R})$ for every $y \in \mathbf{R}$.

Exercise 4.23. Let $\mu$ be the Borel measure on $(0, 1)$ defined through the Riesz-Markov theorem by the linear form

$$u(g) := \int_0^1 \frac{g(x)}{x}\, dx$$

on $\mathcal{C}_c(0, 1)$. Is $\mu$ the restriction of a real Borel measure $\tilde{\mu}$ on $\mathbf{R}$?

Exercise 4.24. Let $u$ be a linear form on the real vector space $\mathcal{C}(K)$ of all real-valued functions on a compact set $K$ of $\mathbf{R}^n$. Prove that $u$ is positive if and only if $u(1) = \sup_{\{|g| \le 1\}} |u(g)|$ and that in this case $\|u\|_{\mathcal{C}(K)'} = u(1)$.

Exercise 4.25 (The dual of $\mathcal{H}(D)$). In the disk $D = \{|z| < 1\} \subset \mathbf{C}$ consider the circles $\gamma_r(t) = re^{it}$ ($0 \le t \le 2\pi$), 0 } of D in C and such that $g(\infty) = \lim_{z \to \infty} g(z) = 0$. Prove the following statements:
(a) If $g \in \mathcal{H}_0(D^c)$ and $\gamma_r \subset U_g$, then $u_g(f) := \frac{1}{2\pi i} \int_{\gamma_r} f(z) g(z)\, dz$ defines $u_g \in \mathcal{H}(D)'$ which does not depend on $r$.
(b) If $\mu$ is a complex Borel measure on $\bar{D}_\varrho$ with $0 < \varrho < 1$, then $u_\mu(f) := \int_{\bar{D}_\varrho} f\, d\mu$ also defines $u_\mu \in \mathcal{H}(D)'$.
(c) If $\mu$ is as in (b) and $g_\mu(z) := \int_{\bar{D}_\varrho} \frac{1}{z - \omega}\, d\mu(\omega)$, then $g_\mu \in \mathcal{H}_0(D^c)$ and $u_{g_\mu} = u_\mu$ (use the Cauchy integral formula $f(\omega) = \frac{1}{2\pi i} \int_{\gamma_r} \frac{f(z)}{z - \omega}\, dz$).
(d) The map $g \in \mathcal{H}_0(D^c) \mapsto u_g \in \mathcal{H}(D)'$ is bijective.

Exercise 4.26. Prove the easy converse of the Schauder theorem: If $E$ and $F$ are two Banach spaces and $T' \in \mathcal{L}(F'; E')$ is compact, then $T \in \mathcal{L}(E; F)$ is also compact.


Exercise 4.27. Find a concrete Volterra operator $T \in \mathcal{L}_c(\mathcal{C}[a, b])$ with no eigenvalues.

Exercise 4.28. Prove that every nonempty compact subset $K$ of $\mathbf{C}$ is the spectrum of a bounded operator of $\ell^2$.

Exercise 4.29. Show that if $K = \{\lambda_n;\ n \in \mathbf{N}\} \cup \{0\}$ with $\lambda_n \to 0$, then $K = \sigma(T)$ for some $T \in \mathcal{L}_c(\ell^2)$.

Exercise 4.30. By Theorem 4.29, if $T$ is the limit in $\mathcal{L}(E; F)$ of a sequence $\{T_n\}$ of continuous linear operators of finite rank, $T$ is compact. Prove that the converse is true if $F$ is a Hilbert space by associating to every $T \in \mathcal{L}_c(E; F)$ and to every $\varepsilon > 0$ an orthogonal projection $P_\varepsilon$ of $F$ on a finite-dimensional subspace such that

$$\|T - P_\varepsilon T\| \le \varepsilon.$$

P. Enflo (1973) proved that this converse is not true for general Banach spaces by giving a counterexample in the setting of separable reflexive spaces.

References for further reading:
N. I. Akhiezer and I. M. Glazman, Theory of Linear Operators in Hilbert Space.
S. Banach, Théorie des opérations linéaires.
S. K. Berberian, Lectures in Functional Analysis and Operator Theory.
J. B. Conway, A Course in Functional Analysis.
R. Courant and D. Hilbert, Methods of Mathematical Physics.
L. Kantorovitch and G. Akilov, Analyse fonctionnelle.
G. Köthe, Topological Vector Spaces I.
P. D. Lax, Functional Analysis.
F. Riesz and B. Sz. Nagy, Leçons d'analyse fonctionnelle.
W. Rudin, Real and Complex Analysis.
W. Rudin, Functional Analysis.
K. Yosida, Functional Analysis.

Chapter 5

Weak topologies

This short chapter is devoted to the introduction of the weak topologies, the only locally convex space topologies that we are considering in this book which can be nonmetrizable.

We are mainly interested in the weak* topology on a dual $E'$, such that the weak* convergence $u_n \to u$ of a sequence in $E'$ means that $u_n(x) \to u(x)$ for all $x \in E$. For any normed space $E$, the Alaoglu theorem shows that the closed unit ball in the dual space $E'$ is weak* compact. Moreover, if $E$ is separable, then this closed unit ball equipped with the weak* topology is metrizable. These facts make it easier to use the weak topology on bounded sets of $E'$.

As an application to the Dirichlet problem for the disc, we include a proof of the Fatou and Herglotz theorems concerning harmonic functions which are the Poisson integrals of functions or measures on the unit circle $\mathbf{T} \subset \mathbf{C}$.

The weak convergence and the weak* topology will appear again when studying distributions and with the Gelfand transform of commutative Banach algebras.

5.1. Weak convergence

A sequence $\{x_n\}$ in a normed space $E$ is said to converge weakly to $x \in E$ if $u(x_n) \to u(x)$ for every $u \in E'$.

The usual convergence $x_n \to x$, meaning that $\|x_n - x\|_E \to 0$, is also called the strong convergence. It is stronger than weak convergence, since $|u(x_n) - u(x)| \le \|u\|_{E'} \|x_n - x\|_E$. The converse will not be true in general (cf. Exercises 5.3 and 5.7).

Similarly, a sequence $\{u_n\}$ in the dual $E'$ of the normed space is said to converge weakly* to $u \in E'$ if $u_n(x) \to u(x)$ for every $x \in E$. Again, $u_n \to u$ in $E'$ implies weak* convergence, since $|u_n(x) - u(x)| \le \|u_n - u\|_{E'} \|x\|_E$.
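That weak convergence does not force strong convergence can be seen on the unit vectors of $\ell^2$. The sketch below is a standard illustration and is not one of the cited exercises: it works in a finite truncation of $\ell^2$ (an assumption made to keep the code runnable), fixes the hypothetical functional $u = (1/k)_{k \ge 1}$, and compares the pairings $u(e_n)$ with the norms $\|e_n\|$.

```python
import math

def unit_vector(n, dim):
    """The n-th unit vector e_n (0-based index) in a dim-dimensional truncation."""
    return [1.0 if k == n else 0.0 for k in range(dim)]

def pairing(u, x):
    """The value u(x) of the functional represented by the sequence u."""
    return sum(a * b for a, b in zip(u, x))

def norm(x):
    """Euclidean (l^2) norm."""
    return math.sqrt(pairing(x, x))

dim = 1000
u = [1.0 / (k + 1) for k in range(dim)]  # a fixed square-summable sequence

for n in (0, 9, 99, 999):
    e_n = unit_vector(n, dim)
    # u(e_n) tends to 0 while the norm of e_n stays equal to 1.
    print(n + 1, pairing(u, e_n), norm(e_n))
```

The same behaviour holds for every square-summable $u$, since its coordinates tend to 0, so $e_n \to 0$ weakly although $\|e_n - 0\| = 1$ for all $n$.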


This weak* convergence is weaker than the weak convergence of $\{u_n\}$ since, if $\omega(u_n) \to \omega(u)$ for every $\omega \in E''$, then also $\hat{x}(u_n) = u_n(x) \to \hat{x}(u)$ for every $x \in E$.

We have a similar situation with the pointwise convergence of a sequence of functions $f_n = \{f_n(x)\}_{x \in X} \in \mathbf{C}^X$, since

$$f_n(x) \to f(x) \qquad (x \in X)$$

means that $\delta_x(f_n) \to \delta_x(f)$ for every evaluation functional $\delta_x$, which is a linear form on the vector space $\mathbf{C}^X$ of all functions $f : X \to \mathbf{C}$.

We know from Example 3.2 that this pointwise convergence is the convergence associated to the product topology on $\mathbf{C}^X$. These weak limits will be limits with respect to certain locally convex space topologies.

5.2. Weak and weak* topologies

Let $E$ be any real or complex vector space and let $\mathcal{E}$ be a vector subspace of the algebraic dual of $E$, which is the vector space of all linear forms on $E$. We say that $(E, \mathcal{E})$ is a dual couple if $\mathcal{E}$ separates points of $E$, that is, if $u(x) = u(y)$ for all $u \in \mathcal{E}$ implies $x = y$.

A typical dual couple is a locally convex space $E$ with the dual $E'$, and we will see that every dual couple is of this form, for a convenient topology on $E$.

Theorem 5.1. Suppose $(E, \mathcal{E})$ is a dual couple, $u_1, \ldots, u_n \in \mathcal{E}$, and $u \in \mathcal{E}$. Then $u = \sum_{k=1}^{n} \lambda_k u_k$ if and only if

$$u_1(x) = \cdots = u_n(x) = 0 \ \Rightarrow\ u(x) = 0.$$

Proof. Suppose $u(x) = 0$ whenever $u_1(x) = \cdots = u_n(x) = 0$. Define the linear map $\Phi : E \to \mathbb{K}^n$ such that $\Phi(x) = (u_1(x), \ldots, u_n(x))$. There exists $\tilde{u} \in (\mathbb{K}^n)'$ such that $u = \tilde{u} \circ \Phi$ that can be defined on $\Phi(E)$ as $\tilde{u}(u_1(x), \ldots, u_n(x)) := u(x)$ since, if $(u_1(x), \ldots, u_n(x)) = (u_1(y), \ldots, u_n(y))$, it follows from our assumption that $u(x) = u(y)$. We can write

$$\tilde{u}(\alpha_1, \ldots, \alpha_n) = \lambda_1 \alpha_1 + \cdots + \lambda_n \alpha_n,$$

so that

$$u(x) = \tilde{u}(\Phi(x)) = \lambda_1 u_1(x) + \cdots + \lambda_n u_n(x) \qquad (x \in E)$$

and $u = \lambda_1 u_1 + \cdots + \lambda_n u_n$.

The converse is obvious: if $u = \sum_{k=1}^{n} \lambda_k u_k$, then $u_1(x) = \cdots = u_n(x) = 0 \Rightarrow u(x) = 0$. □


Note that, if $E$ is a locally convex space, $u \in E'$, and $x, x_1, \ldots, x_n \in E$, then, according to Theorem 5.1, $x = \sum_{k=1}^{n} \lambda_k x_k$ if and only if $u(x) = 0$ ($u \in E'$) whenever $u(x_1) = \cdots = u(x_n) = 0$.

We assume that $(E, \mathcal{E})$ is a dual couple. The weak topology $\sigma(E, \mathcal{E})$ is the locally convex topology on $E$ defined by the sufficient family of semi-norms $p_u(x) = |u(x)|$ ($u \in \mathcal{E}$).

Similarly, $\sigma(\mathcal{E}, E)$ is defined by the sufficient family of semi-norms $p_x(u) = |u(x)|$ ($x \in E$). It is the restriction to $\mathcal{E} \subset \mathbb{K}^E$ of the product topology or the topology of the pointwise convergence on $\mathbb{K}^E$ of Example 3.2.

We use the prefix $\sigma(E, \mathcal{E})$- to indicate that we are considering the topology $\sigma(E, \mathcal{E})$ on $E$.

Theorem 5.2. With the notation $\hat{x}(u) = u(x)$, $\sigma(\mathcal{E}, E)$ is the weakest topology on $\mathcal{E}$ that makes every function $\hat{x}$ continuous. The dual of $(\mathcal{E}, \sigma(\mathcal{E}, E))$ is $E$, as a vector subspace of the algebraic dual of $\mathcal{E}$.

Proof. Every functional $\hat{x}$ is $\sigma(\mathcal{E}, E)$-continuous, since $|\hat{x}(u)| = p_x(u)$ and $|\hat{x}(u)| < \varepsilon$ if $u \in U_{p_x}(\varepsilon)$.

If every function $\hat{x}$ is $\mathcal{T}$-continuous for a topology $\mathcal{T}$ on $\mathcal{E}$, then every set

$$V(u_0) := \{u \in \mathcal{E};\ |\hat{x}(u) - \hat{x}(u_0)| < \varepsilon\} = \{u;\ p_x(u - u_0) < \varepsilon\}$$

is a $\mathcal{T}$-neighborhood of $u_0$, and it is also a $\sigma(\mathcal{E}, E)$-neighborhood of $u_0$. Thus, every point $u_0$ of a weakly open set $G$ is a $\mathcal{T}$-interior point of $G$ and $\mathcal{T}$ is finer than $\sigma(\mathcal{E}, E)$.

Let $\mathcal{E}'$ be the $\sigma(\mathcal{E}, E)$-dual of $\mathcal{E}$. By construction, $E \subset \mathcal{E}'$. Reciprocally, if $\omega \in \mathcal{E}'$, then there exist $x_1, \ldots, x_n \in E$ and a constant $C \ge 0$ such that

$$|\omega(u)| \le C \max(p_{x_1}(u), \ldots, p_{x_n}(u)).$$

Hence, $\omega(u) = 0$ if $u(x_1) = \cdots = u(x_n) = 0$, and then $\omega = \sum_{k=1}^{n} \lambda_k \hat{x}_k$ by Theorem 5.1. □

In a locally convex space, the weakly closed convex sets and the weakly closed subspaces are exactly those that are closed for the original topology of the space:

Theorem 5.3. Suppose $C$ is a convex subset of the locally convex space $E$. Then the weak closure of $C$ is equal to the closure $\bar{C}$ of $C$ for the topology of $E$.


Proof. The original topology is finer than $w := \sigma(E, E')$ and the weak closure $\bar{C}^w$ of $C$ is closed in $E$, so that $\bar{C} \subset \bar{C}^w$. Conversely, suppose that $x_0 \notin \bar{C}$; according to Theorem 4.17(b), we can choose $u \in E'$ so that

sup u(x0)

In the special case of a normed space $E$, we call $w^* = \sigma(E', E)$ the weak* topology of the dual space $E'$, and $w = \sigma(E, E')$ is the weak topology of $E$. These topologies are weaker than the corresponding norm topologies on $E'$ and $E$.

Note that a sequence $\{u_n\} \subset E'$ is convergent to $u$ for the topology $w^*$ if and only if it is weakly* convergent to $u$, since we have $|u_n(x) - u(x)| = p_x(u_n - u) \to 0$ for every $x \in E$.

Similarly $x_n \to x$ for the weak topology $w$ if and only if $u(x_n) \to u(x)$ for every $u \in E'$, that is, $\{x_n\}$ is weakly convergent to $x$ in $E$.

The most important facts concerning the topology $w^*$ are contained in the following compactness and metrizability result.

Theorem 5.5 (Alaoglu¹). The closed unit ball $B_{E'} = \{u \in E';\ \|u\|_{E'} \le 1\}$ of the dual $E'$ of a normed space $E$ is $w^*$-compact. If $E$ is separable, the $w^*$-topology restricted to $B_{E'}$ is metrizable.

Proof. Recall that $w^*$ is the restriction to $E'$ of the product topology on $\mathbb{K}^E$ and observe that $B_{E'}$ is contained in $K := \prod_{x \in E} \bar{D}(0, \|x\|)$ since, if $\|u\|_{E'} \le 1$, then $u = \{u(x)\}_{x \in E}$ satisfies $|u(x)| \le \|x\|$ for every $x \in E$. By the Tychonoff theorem, $K$ is a compact subset of $\mathbb{K}^E$, and

$$B_{E'} = \bigcap_{x, y} \{f \in K;\ f(x + y) = f(x) + f(y)\} \cap \bigcap_{\lambda, x} \{f \in K;\ f(\lambda x) = \lambda f(x)\}$$

is the intersection of a family of subsets, all of them being closed as defined by the equalities with continuous functions

$$\pi_{x+y}(f) = \pi_x(f) + \pi_y(f) \quad \text{and} \quad \pi_{\lambda x}(f) = \lambda \pi_x(f).$$

Thus, $B_{E'}$ is a closed subset of $K$ and it is compact.

1 This is the best known result of the Canadian-American mathematician Leonidas Alaoglu (1938), contained in his thesis (Chicago, 1937). For separable spaces, it was first published by S. Banach (1932).


Suppose now that the sequence $\{x_n\}_{n=1}^{\infty}$ is dense in $E$. Then the family of semi-norms $p_{x_n}(u) = |u(x_n)|$ is sufficient on $E'$ since, by continuity, $u(x_n) = 0$ for all $x_n$ implies $u(x) = 0$ for all $x \in E$. This sequence of semi-norms defines on $E'$ a locally convex metrizable topology $\mathcal{T}$ which is clearly weaker than the $w^*$-topology. On $B_{E'}$ these topologies coincide, since $\operatorname{Id} : (B_{E'}, w^*) \to (B_{E'}, \mathcal{T})$ is continuous and the image of a compact (or closed) set is $\mathcal{T}$-closed, since it is $\mathcal{T}$-compact. □

Note that Theorem 5.5 states that, if $E$ is separable, the $w^*$-topology is metrizable when restricted to a bounded set, but this is far from being true in general for $w^*$ on the whole $E'$. This only happens if $E$ is finite dimensional (Exercise 5.12).

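Example 5.6. Suppose $\{K_\lambda\}_{\lambda \in \Lambda}$ is a summability kernel on $\mathbf{R}^n$ such that $\lim_{\lambda \to 0} \sup_{|y| \ge M} |K_\lambda(y)| = 0$ for all $M > 0$. If $f \in L^\infty(\mathbf{R}^n)$, then $\lim_{\lambda \to 0} K_\lambda * f = f$ in the $w^*$-topology on $L^\infty(\mathbf{R}^n) = L^1(\mathbf{R}^n)'$. A similar result holds for the periodic summability kernels.

This is shown as in the proof of Theorem 2.41 by considering, for every $g \in L^1$,

$$\Big| \int (K_\lambda * f - f) g \Big| \le \int \Big| \int (f(x - y) - f(x)) g(x)\, dx \Big| \, |K_\lambda(y)|\, dy \le \sup_{|y| \le M} \Big| \int (\tau_y f(x) - f(x)) g(x)\, dx \Big| + 2\|g\|_1 \|f\|_\infty \sup_{|y| \ge M} |K_\lambda(y)|,$$

where

$$\Big| \int (\tau_y f(x) - f(x)) g(x)\, dx \Big| = \Big| \int (\tau_y g(x) - g(x)) f(x)\, dx \Big| \le \|\tau_y g - g\|_1 \|f\|_\infty,$$

and we know that $\lim_{M \to 0} \sup_{|y| \le M} \|\tau_y g - g\|_1 = 0$ by Theorem 2.14.

It is worth noticing that the analogue of this last example holds for measures as well: If $\mu$ is a complex Borel measure on $\mathbf{R}^n$ (or on $\mathbf{T}$), which has a polar representation $d\mu = h\, d|\mu|$, and $g$ is a bounded Borel measurable function, then the convolution

$$(\mu * g)(x) := \int g(x - y)\, d\mu(y) = \int g(x - y) h(y)\, d|\mu|(y)$$

is well-defined.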


If every function Kλ of the summability kernel is bounded and continu- ous, then

(5.1) μ ∗ Kλ − μ → 0 n for the weak topology with respect to Cc(R )(orC(T) in the periodic case). ∗ ∞ ∞ Similarly, f ∗ Kλ − f → 0inthew -topology on L if f ∈ L . Note also that we can define the Fourier series for any complex measure μ on T (or on (−π,π]) ∞ ikt μ ∼ ck(μ)e k=−∞ by 1 −ikt −k ck(μ)= e dμ(t)= z dμ(z). 2π (−π,π] T

Then the corresponding Cesàro sums are $\sigma_N(\mu) = \mu * F_N$, where $F_N$ is the Cesàro summability kernel (2.25), and it is a special case of these remarks that $\sigma_N(\mu) \to \mu$ in the above $w^*$-topology.
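As a small numerical illustration of this convergence, the following Python sketch takes $\mu = \delta$ (the unit mass at $t = 0$), for which every coefficient is $c_k(\delta) = 1/(2\pi)$ under the normalization above, and pairs the Cesàro sum with a continuous function $g$. The Fejér weights $1 - |k|/(N+1)$, the pairing $\int g\,\sigma_N\,dt$, and all function names are assumptions made for this sketch; they are not copied from (2.25).

```python
import math

def fejer_mean_of_delta(t, N):
    """sigma_N(delta)(t) = sum_{|k|<=N} (1 - |k|/(N+1)) c_k e^{ikt}, with c_k = 1/(2 pi)."""
    total = 1.0  # k = 0 term
    for k in range(1, N + 1):
        # the terms k and -k combine into 2 cos(kt)
        total += 2.0 * (1.0 - k / (N + 1)) * math.cos(k * t)
    return total / (2.0 * math.pi)

def pair_with_fejer_mean(g, N, n=1024):
    """Approximate the integral of g(t) * sigma_N(delta)(t) over (-pi, pi] by a periodic rectangle rule."""
    h = 2.0 * math.pi / n
    return h * sum(g(-math.pi + j * h) * fejer_mean_of_delta(-math.pi + j * h, N)
                   for j in range(n))

for N in (9, 99):
    print(N, pair_with_fejer_mean(math.cos, N))
```

For $g = \cos$ the pairing equals $1 - 1/(N+1)$, which tends to $g(0) = 1$ as $N \to \infty$.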

5.3. An application to the Dirichlet problem in the disc

Our next aim is to show how the previous results are related to the classical Dirichlet problem. Let $U$ denote the open unit disc $|z| < 1$ in the plane domain, and let
$$\partial = \frac{1}{2}(\partial_x - i\partial_y), \qquad \bar\partial = \frac{1}{2}(\partial_x + i\partial_y),$$
so that
$$4\partial\bar\partial = \partial_x^2 + \partial_y^2 = \Delta.$$

With this notation, the Cauchy-Riemann equations $\partial_x u = \partial_y v$, $\partial_x v = -\partial_y u$ for $f = u + iv$ read $\bar\partial f = 0$, and every holomorphic function $f = u + iv$ on $U$ is $C^\infty$ as a function of two variables and is harmonic; that is, $\Delta f = 0$. Obviously, $u = \Re f$ and $v = \Im f$ are also harmonic. For a real-valued harmonic function $u$, any real-valued $v$ such that $f = u + iv$ is holomorphic on $U$ is called a harmonic conjugate of $u$ and, according to the Cauchy-Riemann equations, the harmonic conjugate of $u$ is unique up to an additive constant, since $\partial_x v = \partial_y v = 0$ leads to $v = C$. When $v$ is chosen with the condition $v(0) = 0$, $v$ is called “the” harmonic conjugate of $u$.


Our aim is to study the extension of a function $f$ defined on $\mathbf{T} = \partial U$ to a harmonic function on $U$, that is, to study the Dirichlet problem for the disc

(5.2) $\Delta F = 0$, $F = f$ on $\mathbf{T}$.

The condition $F = f$ on the boundary has to be understood in an appropriate sense. If $f \in \mathcal{C}(\mathbf{T})$, we look for a classical solution, which is a continuous function $F$ on $\bar U$ which is harmonic in $U$. This problem is completely solved by the Poisson integral that can be obtained by an application of the Hahn-Banach theorem, as follows.

Let $E$ be the subspace of the complex Banach space $\mathcal{C}(\mathbf{T})$ which contains all the complex polynomial functions $g(z) = \sum_{n=0}^{N} c_n z^n$ restricted to $\mathbf{T}$. By the maximum modulus property, $\|g\|_{\mathbf{T}} = \|g\|_{\bar U}$, and the evaluation map $g \mapsto g(z_0)$ at a fixed point $z_0 \in \bar U$ is a continuous linear form with norm 1 which is extended by the Hahn-Banach theorem to $u_{z_0} \in \mathcal{C}(\mathbf{T})'$ so that $\|u_{z_0}\| = 1$.

Note that if $E$ is any linear subspace of $\mathcal{C}(\mathbf{T})$ with the maximum modulus property, it can, and will, also be assumed to be contained in $\mathcal{C}(\bar U)$, and the evaluation map is defined on this subspace of $\mathcal{C}(\mathbf{T})$.

By the Riesz representation Theorem 4.13,
$$u_{z_0}(f) = \int_{\mathbf{T}} f\,d\mu_{z_0} \qquad (f \in \mathcal{C}(\mathbf{T}))$$
for a complex Borel measure $\mu_{z_0}$ on $\mathbf{T}$, and then we say that $\mu_{z_0}$ represents $u_{z_0}$ or $z_0$. We will see that this measure is uniquely determined by $z_0$.

Lemma 5.7. If $u \in \mathcal{C}(\mathbf{T})'$, $\|u\| = 1$, and $u(1) = 1$, then the representing measure of $u$ is positive.

Proof. We need to prove that $u(f) \ge 0$ if $f \ge 0$ and we can also assume that $f \le 1$. Let $g = 2f - 1$, so that $-1 \le g \le 1$ and, if $u(g) = a + ib$ ($a, b \in \mathbf{R}$), it follows from the hypothesis that for every $x \in \mathbf{R}$
$$1 + x^2 \ge |u(g + ix)|^2 = |a + ib + ix|^2 = a^2 + (b + x)^2 \ge (b + x)^2,$$
so that $b^2 + 2xb \le 1$ for every $x$, and then $b = 0$, $u(g) = a$. Finally we have $|a| \le \|g\|_{\mathbf{T}} \le 1$ and then $a = u(g) = 2u(f) - 1$ forces $u(f) \ge 0$.

Note that for the functions $e_k(z) = z^k$ ($k \in \mathbf{Z}$), if $z_0 = re^{i\vartheta}$,
$$r^n e^{in\vartheta} = \int_{\mathbf{T}} e_n\,d\mu_{z_0} \qquad (n \ge 0)$$
and, since the measure is positive and $e_{-n} = \bar e_n$ on $\mathbf{T}$,
$$r^n e^{-in\vartheta} = \int_{\mathbf{T}} e_{-n}\,d\mu_{z_0}.$$
Hence, $\int_{\mathbf{T}} e_k\,d\mu_{z_0} = r^{|k|}e^{ik\vartheta}$. By addition, for every $r \in [0,1)$ we obtain the $2\pi$-periodic function

(5.3) $P_r(s) = \sum_{k=-\infty}^{\infty} r^{|k|}e^{iks}$

such that

(5.4) $\int_{\mathbf{T}} f\,d\mu_{z_0} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{it})P_r(\vartheta - t)\,dt$

for every trigonometric polynomial, since it holds when $f = e_n$. But the Fejér kernel $F_N$ is a summability kernel and, for every $f \in \mathcal{C}(\mathbf{T})$, the Cesàro sums $\sigma_N(f) = F_N * f$ are trigonometric polynomials such that $\sigma_N(f) \to f$ uniformly. By continuity, (5.4) holds for every $f \in \mathcal{C}(\mathbf{T})$ and the Borel measure $\mu_{z_0}$ is the uniquely determined absolutely continuous measure
$$d\mu_{z_0} = \frac{1}{2\pi}P_r(\vartheta - t)\,dt \qquad (z_0 = re^{i\vartheta}).$$

Note that $P_r(\vartheta - t)$ is the real part of
$$1 + 2\sum_{n=1}^{\infty}(z_0e^{-it})^n = \frac{e^{it} + z_0}{e^{it} - z_0} = \frac{1 - r^2 + 2ir\sin(\vartheta - t)}{|1 - z_0e^{-it}|^2};$$
that is,
$$P_r(\vartheta - t) = \frac{1 - r^2}{1 - 2r\cos(\vartheta - t) + r^2}.$$
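The agreement between the series (5.3) and this closed form can be checked numerically. The following Python sketch compares a truncated sum of $r^{|k|}e^{iks}$ with $(1 - r^2)/(1 - 2r\cos s + r^2)$; the function names and the truncation level `terms` are choices made for this illustration only.

```python
import cmath
import math

def poisson_series(r, s, terms=200):
    """Partial sum of sum_{|k| <= terms} r^{|k|} e^{iks}, as in (5.3)."""
    total = 1.0 + 0.0j  # k = 0 term
    for k in range(1, terms + 1):
        total += (r ** k) * (cmath.exp(1j * k * s) + cmath.exp(-1j * k * s))
    return total

def poisson_closed(r, s):
    """Closed form (1 - r^2) / (1 - 2 r cos s + r^2)."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(s) + r * r)

for r, s in [(0.3, 0.0), (0.5, 1.0), (0.8, -2.5)]:
    print(r, s, poisson_series(r, s).real, poisson_closed(r, s))
```

The imaginary parts of the partial sums cancel because the terms $k$ and $-k$ are complex conjugates, so only the real part is compared.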

The family {Pr}0

sup Pr(t) ≤ Pr(δ) → 0ifδ ↓ 0, 0<δ≤|t|≤π so that {Pr}0

Theorem 5.8. The Poisson kernel {Pr}0

and
$$d\mu_{z_0} = \frac{1}{2\pi}P_r(\vartheta - t)\,dt \qquad (z_0 = re^{i\vartheta})$$
represents every point $z_0 \in U$ in the sense that it is the unique measure on $\mathbf{T}$ such that
$$f(z_0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{it})P_r(\vartheta - t)\,dt = (P_r * f)(\vartheta)$$
for every $f$ in a vector subspace $E$ of $\mathcal{C}(\bar U)$ which contains the polynomials and satisfies the maximum modulus property $\|f\|_{\bar U} = \|f\|_{\mathbf{T}}$.

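Now we are ready to solve the Dirichlet problem (5.2) using the Poisson integral of a function,
$$(P_r * f)(\vartheta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{it})P_r(\vartheta - t)\,dt,$$
or of a measure $\mu$,
$$(P_r * \mu)(\vartheta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(\vartheta - t)\,d\mu(t).$$

Theorem 5.9. Let $f \in L^1(\mathbf{T})$ and let $\mu$ be a complex Borel measure on $\mathbf{T}$. Denote
$$F(re^{i\vartheta}) := (P_r * f)(\vartheta) \quad\text{or}\quad F(re^{i\vartheta}) := (P_r * \mu)(\vartheta),$$
with $0 \le r < 1$ and $\vartheta \in \mathbf{R}$. Then $F$ is harmonic on the open unit disc $U$ and, as $r \to 1$, the functions $F_r(\vartheta) := F(re^{i\vartheta})$ satisfy the following convergence results:

(a) If $f \in \mathcal{C}(\mathbf{T})$, then $F_r \to f$ uniformly, so that, by defining $F(e^{i\vartheta}) := f(e^{i\vartheta})$, $F$ is a classical solution of the Dirichlet problem (5.2).
(b) If $f \in L^p(\mathbf{T})$ with $1 \le p < \infty$, then $F_r \to f$ in $L^p(\mathbf{T})$.
(c) If $f \in L^\infty(\mathbf{T})$, then $F_r \to f$ in the $w^*$-convergence on $L^\infty(\mathbf{T}) \equiv L^1(\mathbf{T})'$.
(d) If $F_r = P_r * \mu$, then $F_r \to \mu$ in the $w^*$-convergence for $\mu \in M(\mathbf{T}) \equiv \mathcal{C}(\mathbf{T})'$.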

Proof. If $f$ is real, then $F$ is the real part of the holomorphic function

(5.5) $V(z) := \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{e^{it} + z}{e^{it} - z}\,f(e^{it})\,dt,$

and it is harmonic. A similar reasoning shows that $F(re^{i\vartheta}) = (P_r * \mu)(\vartheta)$ defines a harmonic function on $U$. Since we are dealing with a summability kernel, the statements (a) and (b) hold, and (c)–(d) follow from (5.1).
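Part (a) of Theorem 5.9 can be illustrated numerically. The sketch below, in Python, assumes the boundary datum $f(e^{it}) = \cos t$, whose harmonic extension is $\Re z = r\cos\vartheta$, and evaluates the Poisson integral by a periodic rectangle rule; the quadrature rule, the grid size `n`, and the function names are assumptions of this illustration.

```python
import math

def poisson_kernel(r, s):
    """P_r(s) = (1 - r^2) / (1 - 2 r cos s + r^2)."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(s) + r * r)

def poisson_integral(f, r, theta, n=2000):
    """(P_r * f)(theta) = (1/2pi) * integral over (-pi, pi] of f(t) P_r(theta - t) dt."""
    h = 2.0 * math.pi / n
    total = 0.0
    for j in range(n):
        t = -math.pi + j * h
        total += f(t) * poisson_kernel(r, theta - t)
    return total * h / (2.0 * math.pi)

theta = 0.7
for r in (0.0, 0.5, 0.9):
    # boundary data cos t should extend to r * cos(theta)
    print(r, poisson_integral(math.cos, r, theta), r * math.cos(theta))
```

For this datum $F_r(\vartheta) - f(e^{i\vartheta}) = (r - 1)\cos\vartheta$, so the uniform distance between $F_r$ and $f$ is exactly $1 - r$, in agreement with $F_r \to f$ uniformly as $r \to 1$.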


Remark 5.10. A simple change of variables gives
$$f(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{R^2 - r^2}{R^2 - 2Rr\cos(\vartheta - t) + r^2}\,f(a + Re^{it})\,dt$$
for every $z = a + re^{i\vartheta} \in D(a,R)$ if $f$ is continuous on the closed disc $\bar D(a,R) = \{z \in \mathbf{C};\ |z - a| \le R\}$ and harmonic in $D(a,R)$. For $r = 0$ this is the mean value property of the harmonic function $f$:
$$f(a) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(a + Re^{it})\,dt.$$
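The mean value property is easy to test numerically. The following Python sketch averages a function over the circle $a + Re^{it}$ and compares the result with the value at the center, using the harmonic function $\Re e^z$ and, as a contrast, the non-harmonic function $|z|^2$. Both sample functions and the helper name `circle_mean` are choices made for this illustration.

```python
import cmath
import math

def circle_mean(f, a, R, n=2000):
    """Average of f over the circle a + R e^{it}, i.e. (1/2pi) * integral of f(a + R e^{it}) dt."""
    h = 2.0 * math.pi / n
    return sum(f(a + R * cmath.exp(1j * (-math.pi + j * h))) for j in range(n)) / n

def harmonic(z):
    """Re(e^z) = e^x cos y, harmonic on the whole plane."""
    return cmath.exp(z).real

def not_harmonic(z):
    """|z|^2 has Laplacian 4, so the mean value property fails."""
    return abs(z) ** 2

a, R = 0.3 + 0.2j, 1.5
print(circle_mean(harmonic, a, R), harmonic(a))
print(circle_mean(not_harmonic, a, R), not_harmonic(a))
```

For $|z|^2$ the circle average is $|a|^2 + R^2$ rather than $|a|^2$, which shows that harmonicity is really needed.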

Let us now consider the inverse problem and, given a harmonic function $F$ on $U$, try to find out whether it is the Poisson integral of some function or measure on $\mathbf{T}$. We still denote $F_r(\vartheta) = F(re^{i\vartheta})$ if $z = re^{i\vartheta} \in U$.

Theorem 5.11. Suppose $F$ is a complex-valued harmonic function in the open unit disc $U$.

(a) $F$ is the Poisson integral of some $f \in \mathcal{C}(\mathbf{T})$ if and only if $F_r$ is uniformly convergent when $r \uparrow 1$. In this case, $F$ is the unique classical solution of the Dirichlet problem (5.2).
(b) $F$ is the Poisson integral of some $f \in L^1(\mathbf{T})$ if and only if $F_r$ is convergent in $L^1(\mathbf{T})$ when $r \uparrow 1$.
(c) Fatou’s theorem: If 1

$\sup_{0\le r<1}\|F_r\|_p < \infty$.
(d) $F$ is the Poisson integral of some complex Borel measure if and only if $\sup_{0\le r<1}\|F_r\|_1 < \infty$.
(e) Herglotz’s theorem: $F$ is the Poisson integral of some Borel measure if and only if $F \ge 0$.

Proof. We need to prove only the direct parts, the converses being contained in Theorem 5.9.

Let us start with (a) by showing that, if Fr → f uniformly, then F is the unique solution of the Dirichlet problem for the disc. We can suppose that F is real-valued and we will prove that F has to be the real part of the holomorphic function V defined in (5.5).

The function $V_1 = \Re V$ is a classical solution of the Dirichlet problem with the boundary value $f$. Then $H = F - V_1 \in \mathcal{C}(\bar U)$ is harmonic in $U$ and zero on $\mathbf{T}$, and we only need to show that $H = 0$ at every point of $U$.


Assume that $H(z_0) > 0$ for some $z_0 \in U$, denote $\varepsilon = H(z_0)/2$, and let $h(z) := H(z) + \varepsilon|z|^2$, a continuous function on $\bar U$ such that $h = \varepsilon$ on $\mathbf{T}$ and $h(z_0) > \varepsilon$. Then
$$\max h = h(z_1)$$
for some $z_1 \in U$, so that $\partial_x^2 h(z_1) \le 0$ and $\partial_y^2 h(z_1) \le 0$. This is in contradiction to $\Delta h(z_1) = 4\varepsilon$, which follows from the definition of $h$. The assumption $H(z_0) < 0$ would also lead to a similar contradiction and $H = F - V_1 = 0$.

Proceeding now to the proofs of (d) and (e), let $R := \sup_{0\le r<1}\|F_r\|_1$ and $u_r(g) := \langle g, F_r\rangle$. Then $u_r \in \mathcal{C}(\mathbf{T})'$ and $\|u_r\| \le R$. According to Alaoglu’s theorem, since $\mathcal{C}(\mathbf{T})$ is a separable Banach space, the ball with radius $R$ is a metrizable $w^*$-compact set and from a sequence $r_n \to 1$ we can choose a subsequence $u_{r_n}$ which is $w^*$-convergent to some $u \in \mathcal{C}(\mathbf{T})'$. By the Riesz representation theorem there is a complex Borel measure $\mu$ on $\mathbf{T}$ such that
$$u(g) = \int_{\mathbf{T}} g\,d\mu = \lim_{n\to\infty}\int_{\mathbf{T}} g(t)F_{r_n}(t)\,dt \qquad (g \in \mathcal{C}(\mathbf{T})).$$

Note that every function $h_n(z) := F(r_nz)$ is harmonic in a neighborhood of $\bar U$ and on $U$ it is the unique solution of the Dirichlet problem (5.2) for $f(e^{it}) = F(r_ne^{it})$. Thus, if $z = re^{i\vartheta}$,
$$h_n(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(\vartheta - t)h_n(e^{it})\,dt.$$

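If for a fixed $z = re^{i\vartheta} \in U$ we consider the continuous function $g = P_z$, where
$$P_z(t) := \Re\frac{e^{it} + z}{e^{it} - z} = P_r(\vartheta - t) \qquad (z = re^{i\vartheta}),$$
we obtain
$$F(z) = \lim_n h_n(z) = \int_{\mathbf{T}} P_z\,d\mu = \int_{-\pi}^{\pi} P_r(\vartheta - t)\,d\mu(t).$$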

If $F \ge 0$, then $\|F_r\|_1 = F(0)$ for every $r \in [0,1)$ by the mean value property of harmonic functions, and the complex measure $\mu$ obtained in (d) is positive.

The proof of (c) is similar. In this case, $u_r(g)$ is defined for every $g \in L^{p'}(\mathbf{T})$ and $u_r \in L^{p'}(\mathbf{T})'$ with $\|u_r\| \le R = \sup_{0\le r<1}\|F_r\|_p$. Since $1 \le p' < \infty$, $L^{p'}(\mathbf{T})$ is separable and, according to Alaoglu’s theorem, we have some $u_{r_n} \to u$ in the $w^*$-topology on $L^{p'}(\mathbf{T})'$. Now by the Riesz representation theorem for the dual of an $L^q$-space, $u(g) = \int_{\mathbf{T}} gh$ for some $h \in L^p(\mathbf{T})$ such that $\|h\|_p \le R$. Now the proof continues as in (d).


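The proof of (b) is simple. If $F_r \to f$ in $L^1(\mathbf{T})$ and $d\mu = f(t)\,dt$ on $\mathbf{T}$, then as in (d) it follows that
$$F(z) = \lim_n h_n(z) = \int_{\mathbf{T}} P_z\,d\mu = \int_{-\pi}^{\pi} P_r(\vartheta - t)f(t)\,dt,$$
since now $u_r = \langle\cdot, F_r\rangle \to \langle\cdot, f\rangle$.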

5.4. Exercises

Exercise 5.1. Prove that the unit ball of $\ell^1$ is not weakly compact.

Exercise 5.2. Prove that every sequence in the closed unit ball of $L^2(\mathbf{R})$ has a weakly convergent subsequence, and find a weakly convergent sequence with no convergent subsequence.

Exercise 5.3. In the Banach space $\mathcal{C}[0,1]$, prove that the sequence
$$f_n(x) := \begin{cases} nx & \text{if } 0 \le x < 1/n,\\ 2 - nx & \text{if } 1/n \le x < 2/n,\\ 0 & \text{if } 2/n \le x \le 1\end{cases}$$
is weakly convergent to 0 but it is not strongly convergent.

Exercise 5.4. If $\delta \in \mathcal{C}[-1,1]'$ is the linear form $\delta(g) := g(0)$ and $\{h_n\}_{n=1}^{\infty} \subset \mathcal{C}[-1,1]$, prove that $h_n \to \delta$ weakly (in the sense that $\langle g, h_n\rangle = \int_{-1}^{1} gh_n \to g(0)$ for every $g \in \mathcal{C}[-1,1]$) if and only if the following three conditions are satisfied:
(1) $\lim_n \int_{-1}^{1} h_n(t)\,dt = 1$,
(2) $\lim_n \int_{-1}^{1} h_n(t)\varphi(t)\,dt = 0$ if $\varphi \in \mathcal{C}^\infty[-1,1]$ vanishes in a neighborhood of 0, and
(3) $\sup_n \int_{-1}^{1} |h_n(t)|\,dt < \infty$.
Prove also that if $\sup_n \int_{-1}^{1} |h_n(t)|\,dt = \infty$, then there exists a function $g \in \mathcal{C}[-1,1]$ for which $\int_{-1}^{1} gh_n \to \infty$.

Exercise 5.5. If $x_n \to x$ weakly in a Banach space $E$, prove that $\|x\|_E \le \liminf \|x_n\|_E$.

Exercise 5.6. For every Banach space $E$ there is a linear isometry from $E$ onto a closed subspace of $\mathcal{C}(K)$, where $K$ is the closed unit ball $B_{E'}$ endowed with the restriction of the weak* topology of $E'$.


Exercise 5.7. Let $E$ be a locally convex space. If $x_n \to 0$ in $E$, then also $x_n \to 0$ weakly. By considering an orthonormal system in a Hilbert space, show that the converse is not true in general.

Exercise 5.8. In a Fréchet space $E$, suppose that $x_n \to x$ weakly. Prove that $x$ is the limit in $E$ of a sequence of convex combinations $\sum_{j=1}^{N} \alpha_jx_{n_j}$ of elements from the sequence $\{x_n\}$.

Exercise 5.9. As an application of the uniform boundedness principle, prove that a subset of a normed space $E$ is bounded if and only if it is weakly bounded. If $E$ is complete, also show that a subset of $E'$ is bounded if and only if it is $w^*$-bounded.

Exercise 5.10. Let $T : E \to F$ be a linear mapping between two Fréchet spaces and let $T'(v) := v \circ T$ ($v \in F'$), the transpose of $T$. Prove the equivalence of the following properties:
(a) $T$ is continuous.
(b) $T$ is weakly continuous ($T : E(\sigma(E,E')) \to F(\sigma(F,F'))$ continuous).
(c) $T'(F') \subset E'$ (and then $T' : F'(\sigma(F',F)) \to E'(\sigma(E',E))$ continuous).

Exercise 5.11. Let $E$ be an infinite-dimensional locally convex space. Prove that the weak topology $\sigma(E,E')$ on $E$ is metrizable if and only if $E'$ has a countable algebraic basis.

Exercise 5.12. Let $E$ be a normed space. Prove that if the weak topology $\sigma(E,E')$ on $E$, or the weak* topology on $E'$, is metrizable, then $E$ is finite dimensional.

Exercise 5.13. Let $H$ be a separable Hilbert space and let $T \in \mathcal{L}(H)$. Prove that $T$ is compact if and only if, for any sequence $\{x_n\} \subset H$ such that $(x, x_n)_H \to 0$ as $n \to \infty$ for all $x \in H$ (that is, $x_n \to 0$ weakly), it follows that $\|Tx_n\|_H \to 0$. Operators with this property were called completely continuous by Hilbert and Schmidt.

Exercise 5.14. Suppose E is a real normed space and K1, K2 are two disjoint weakly compact convex subsets of E. Then prove that f(K1)


(a) Prove that $E$ is reflexive if and only if the closed unit ball $B_E$ of $E$ is weakly compact.
(b) Prove that every closed subspace $F$ of a reflexive normed space $E$ is also reflexive.
(c) Prove that a Banach space $E$ is reflexive if and only if $E'$ is reflexive.

Exercise 5.17. Prove that every Hilbert space H is reflexive and that Lp(0, 1) (1 ≤ p ≤∞) is reflexive if and only if 1

Exercise 5.19. Find a sequence of functions fn such that, for any 1

(5.7) $\sigma_N := \frac{1}{N+1}\sum_{n=0}^{N} S_n$,

the Cesàro means. Prove that (5.6) is the Fourier series of a complex Borel measure on $\mathbf{T}$ if and only if $\sup_N \|\sigma_N\|_1 < \infty$.

Exercise 5.22. Prove that (5.6) is the Fourier series of a function $f \in L^p(\mathbf{T})$ (1


Exercise 5.24. Prove that (5.6) is the Fourier series of a $2\pi$-periodic continuous function if and only if $\{\sigma_N\}$ is a uniformly convergent sequence.

References for further reading:
S. Banach, Théorie des opérations linéaires.
N. Dunford and J. T. Schwartz, Linear Operators: Part 1.
G. Köthe, Topological Vector Spaces I.
P. D. Lax, Functional Analysis.
F. Riesz and B. Sz. Nagy, Leçons d’analyse fonctionelle.
W. Rudin, Real and Complex Analysis.
W. Rudin, Functional Analysis.
K. Yosida, Functional Analysis.

Chapter 6

Distributions

For a long time, physicists have been operating with certain “singular functions” that are not true functions in the usual sense.

A typical simple example was Dirac’s function $\delta$, assumed to be “supported by $\{0\}$ but so large on this point that $\int_{\mathbf{R}} \delta(t)\,dt = 1$”, which is the unit impulse of signal theory.¹

Previously, in 1883, to solve some physical problems, Heaviside² introduced a symbolic calculus that included the derivatives of singular functions. With this calculus it was accepted that $\delta$ is the derivative $Y'$ of Heaviside’s function $Y = \chi_{[0,\infty)}$, and then a formal partial integration with $\varphi$ regular enough and with compact support gives
$$\int_{\mathbf{R}} \varphi(t)\delta(t)\,dt = -\int_{[0,\infty)} \varphi'(t)\,dt = \varphi(0).$$

In his 1932 text [43] on quantum mechanics, von Neumann warned against the use of these unclear objects. In the preface of the book he says that “The method of Dirac ... in no way satisfies the requirements of

¹ The δ function was introduced by the British physicist, and one of the founders of quantum mechanics, Paul Adrien Maurice Dirac as the continuous form of the discrete Kronecker delta in the formulation of quantum mechanics, which is contained in his 1930 book “The Principles of Quantum Mechanics”, a landmark in the history of science.
² The English telegraph operator and self-taught engineer, physicist, and mathematician Oliver Heaviside patented the co-axial cable in 1880, reformulated Maxwell’s initially cumbersome equations by reducing the original system of 20 differential equations to 4 differential equations in 1884, and between 1880 and 1887 developed a controversial operational calculus that motivated his saying “Mathematics is an experimental science, and definitions do not come first, but later on”.


mathematical rigor [with] the introduction of ‘improper’ functions with self-contradictory properties” and he calls the delta functions and their derivatives “mathematical fictions”.

Very soon afterwards Bochner introduced in a rigorous way singular functions in Fourier analysis, and Sobolev (1938) used weak derivatives in the study of partial differential equations. But it was L. Schwartz³ who defined them as linear forms acting on a family of test functions $\varphi$, with the appropriate continuity properties. The idea was that, as a matter of fact, a singular function such as $\delta$ always appears inside an integral formula. In the case of the Dirac function, the useful property is that $\int_{\mathbf{R}} \varphi(t)\delta(t)\,dt = \varphi(0)$, and $\delta$ can be directly defined as the linear form $\varphi \mapsto \varphi(0)$ acting on a convenient family of test functions $\varphi$.

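If we consider any function $0 \le \varrho \in \mathcal{C}_c(\mathbf{R})$ supported by $[-r,r]$ and such that $\int_{\mathbf{R}} \varrho(t)\,dt = 1$, the summability kernel $\varrho_n(t) := n\varrho(nt)$ is such that
$$\lim_{n\to\infty}\int_{\mathbf{R}} \varphi(t)\varrho_n(t)\,dt = \lim_{n\to\infty}\langle\varphi, \varrho_n\rangle = \varphi(0) \qquad (\varphi \in \mathcal{C}(\mathbf{R})),$$
since
$$\Big|\int_{\mathbf{R}} \varphi(t)\varrho_n(t)\,dt - \varphi(0)\Big| \le \int_{-r/n}^{r/n} |\varphi(t) - \varphi(0)|\,\varrho_n(t)\,dt \to 0.$$
This means that $\delta$ is also a weak limit of the sequence $\{\varrho_n\}$ by associating to $\varrho_n$ the linear form $\langle\cdot, \varrho_n\rangle$.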
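This weak limit can be observed numerically. The Python sketch below assumes the triangle function $\max(0, 1 - |t|)$ as a concrete choice of continuous function supported by $[-1,1]$ with integral 1, and approximates the pairing of a continuous function with the rescaled kernel by the midpoint rule; the triangle function, the quadrature, and the names are choices made for this illustration.

```python
import math

def rho(t):
    """Continuous triangle bump supported by [-1, 1], with integral 1."""
    return max(0.0, 1.0 - abs(t))

def rho_n(t, n):
    """Rescaled kernel n * rho(n t), supported by [-1/n, 1/n]."""
    return n * rho(n * t)

def pair(phi, n, m=2000):
    """Midpoint-rule approximation of the integral of phi * rho_n over [-1/n, 1/n]."""
    a, b = -1.0 / n, 1.0 / n
    h = (b - a) / m
    return h * sum(phi(a + (j + 0.5) * h) * rho_n(a + (j + 0.5) * h, n)
                   for j in range(m))

for n in (1, 10, 100):
    print(n, pair(math.cos, n))  # approaches cos(0) = 1
```

The deviation from $\varphi(0)$ is bounded by the oscillation of $\varphi$ on $[-1/n, 1/n]$, which is the estimate used in the text.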

As a positive linear form on the space of test functions $\mathcal{C}_c(\mathbf{R})$, by the Riesz-Markov theorem $\delta$ can also be considered a Borel measure. The class of test functions for distributions will be much smaller than $\mathcal{C}_c(\mathbf{R})$, and operations with distributions will include well-defined generalized derivatives. The derivatives $\delta'$, $\delta''$, ... of $\delta$ will be distributions but will no longer be measures.

With the description of the general theory of distributions, we will include some basic facts and examples concerning differential equations, a field where the influence of functional analysis has grown continuously precisely due to the use of distributions. In the next chapter we will find more examples. The results are stated for complex-valued distributions, but the reader can check that they are also valid for real distributions.

6.1. Test functions

Let Ω be a nonempty open subset of Rn.

³ Laurent Schwartz formalized the mathematical theory of the generalized functions or distributions in 1945, and his 1950–51 book “Théorie des Distributions” [40] remains a basic reference for this topic. See footnote 1 in Chapter 3.


In Section 3.1 we introduced the Fréchet space $\mathcal{E}(\Omega)$ of all $C^\infty$ complex functions on $\Omega$, endowed with the locally convex topology of the local uniform convergence of functions and their derivatives. If $\mathcal{K}(\Omega)$ represents the family of all compact subsets of $\Omega$, the space of test functions is the vector space of all $C^\infty$ complex functions on $\Omega$ with compact support,
$$\mathcal{D}(\Omega) = \bigcup_{K\in\mathcal{K}(\Omega)} \mathcal{D}_K(\Omega).$$

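Recall that, if $K \in \mathcal{K}(\Omega)$, in (3.3) we defined $\mathcal{D}_K(\Omega)$ as the closed subspace of $\mathcal{E}(\Omega)$ that contains all $f \in \mathcal{E}(\Omega)$ whose support lies in $K$, so that its topology is defined by the increasing sequence of norms

(6.1) $q_N(f) = \sum_{|\alpha|\le N} \|D^\alpha f\|_K \qquad (N \in \mathbf{N}).$

Thus, $\varphi_k \to \varphi$ in $\mathcal{D}_K(\Omega)$ means that $D^\alpha\varphi_k \to D^\alpha\varphi$ uniformly, for every $\alpha \in \mathbf{N}^n$.

Example 6.1. Let $g(t) = e^{-1/t}\chi_{(0,+\infty)}(t)$ and denote $\varrho(x) = Cg(1 - |x|^2)$ with $C > 0$ such that $\int \varrho(x)\,dx = 1$. Then $\varrho$ is a test function whose graph has the familiar bell shape that satisfies $0 \le \varrho \in \mathcal{D}_{\bar B(0,1)}(\mathbf{R}^n)$.

The derivatives are $g^{(n)}(t) = P_n(1/t)e^{-1/t}$ if $t > 0$, where every $P_n$ is a polynomial, and $g^{(n)}(t) = 0$ if $t < 0$; thus $g^{(n)}(0) = \lim_{t\to 0} g^{(n)}(t) = 0$ for every $n \in \mathbf{N}$, so that $g$ is $C^\infty$ supported by $[0,\infty)$, and $\varrho \in \mathcal{D}_{\bar B(0,1)}(\mathbf{R}^n)$.

From $\varrho$ we can define a $C^\infty$ summability kernel $\{\varrho_\varepsilon\}_{0<\varepsilon<\varepsilon_0}$ on $\mathbf{R}^n$ by $\varrho_\varepsilon(x) = \varepsilon^{-n}\varrho(x/\varepsilon)$ (see (2.22)). Note that $0 \le \varrho_\varepsilon \in \mathcal{D}_{\bar B(0,\varepsilon)}(\mathbf{R}^n)$ and that $\int \varrho_\varepsilon(x)\,dx = \int \varrho(x)\,dx = 1$. Such a function is often called a mollifier.

Let us define the local analogue of $L^1(\Omega)$ by denoting $L^1_{\mathrm{loc}}(\Omega)$ the vector space of complex measurable functions locally integrable on the open set $\Omega$, i.e. functions integrable on every compact subset $K$ of $\Omega$. As usual, two locally integrable functions are supposed to be equivalent if they coincide a.e. If $f \in L^1_{\mathrm{loc}}(\mathbf{R}^n)$, $\varrho_\varepsilon * f$ given by
$$(\varrho_\varepsilon * f)(x) = \int_{\mathbf{R}^n} \varrho_\varepsilon(x - y)f(y)\,dy = \int_{\bar B(0,\varepsilon)} f(x - y)\varrho_\varepsilon(y)\,dy$$
is a well-defined $C^\infty$ function, since we can differentiate under the integral sign.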
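A one-dimensional version of the construction in Example 6.1 can be sketched in Python as follows. The normalizing constant $C$ is computed numerically by the midpoint rule rather than in closed form, and the names `g`, `rho`, `rho_eps`, and `integrate` are choices made for this illustration, restricted to the case $n = 1$.

```python
import math

def g(t):
    """g(t) = exp(-1/t) for t > 0 and 0 otherwise; smooth on the whole line."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def integrate(f, a, b, m=20000):
    """Midpoint rule on [a, b] with m cells."""
    h = (b - a) / m
    return h * sum(f(a + (j + 0.5) * h) for j in range(m))

# Normalizing constant C, chosen so that rho has integral 1 (numerical approximation).
C = 1.0 / integrate(lambda x: g(1.0 - x * x), -1.0, 1.0)

def rho(x):
    """Bell-shaped test function supported by [-1, 1] (case n = 1)."""
    return C * g(1.0 - x * x)

def rho_eps(x, eps):
    """Mollifier eps^{-1} rho(x / eps), supported by [-eps, eps]."""
    return rho(x / eps) / eps

print(integrate(lambda x: rho_eps(x, 0.1), -0.1, 0.1))  # close to 1
print(rho_eps(0.2, 0.1))                                # 0.0 outside the support
```

The rescaling preserves the integral and shrinks the support to $[-\varepsilon, \varepsilon]$, as stated for the kernel in the example.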

New test functions can be constructed from Example 6.1. For every couple $K \subset \Omega$, there is a test function $\varrho \in \mathcal{D}(\Omega)$ which is a smooth Urysohn function for this couple, so that $K \prec \varrho \prec \Omega$. It can be defined as follows:


Let 0 <δ

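Proof. Let $\operatorname{supp} f \subset K$ and $\operatorname{supp} \varrho_\varepsilon \subset \bar B(0,\varepsilon)$. Then $f * \varrho_\varepsilon \in \mathcal{E}(\mathbf{R}^n)$ and $\operatorname{supp}(f * \varrho_\varepsilon) \subset K(\varepsilon) \subset K(\eta)$. Now (a) and (b) follow from Theorem 2.41.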

The small class D(Ω) is large enough to be dense in many spaces, such as Lp(Ω) if 1 ≤ p<∞ (see Exercise 6.2) and E(Ω): Theorem 6.3. The set D(Ω) of all test functions is dense in E(Ω).

Proof. For every $f \in \mathcal{E}(\Omega)$ we consider $\{\varrho_mf\} \subset \mathcal{D}(\Omega)$, where $\varrho_m$ are the Urysohn functions associated to an increasing sequence of compact subsets $K_m$ of $\Omega$ such that every other compact subset $K$ of $\Omega$ is contained in one of them, $K_N$, and $\varrho_mf = f$, so that $D^\alpha(\varrho_mf) = D^\alpha f$ on $K$ if $m > N$. Then $D^\alpha(\varrho_mf) \to D^\alpha f$ uniformly on every compact set $K \subset \Omega$.

6.2. The distributions

The distributions in a nonempty open subset $\Omega$ of $\mathbf{R}^n$ are defined as the linear forms on $\mathcal{D}(\Omega)$ which satisfy a convenient continuity property. On $\mathcal{D}(\Omega) = \bigcup_{K\in\mathcal{K}(\Omega)} \mathcal{D}_K(\Omega)$, instead of defining a topology, it will be sufficient to consider a notion of convergence.⁴

We say that ϕk → ϕ in D(Ω) if ϕk → ϕ in DK (Ω) for some K ∈K(Ω); that is, ϕ and ϕk are test functions on Ω that satisfy

• supp ϕk ⊂ K for some fixed compact set K ⊂ Ω and

4This notion of convergence follows from the topology defined in Exercise 6.5. With this topology, D(Ω) is known as the inductive limit of the spaces DK (Ω).


• $D^\alpha\varphi_k \to D^\alpha\varphi$ uniformly as $k \to \infty$, for every $\alpha \in \mathbf{N}^n$.

A distribution on the open set $\Omega$ is defined as a linear form $u : \mathcal{D}(\Omega) \to \mathbf{C}$ which is continuous with respect to the above convergence or, equivalently, such that $u|_{\mathcal{D}_K(\Omega)} \in \mathcal{D}_K(\Omega)'$ for every compact set $K \subset \Omega$. We denote by $\mathcal{D}'(\Omega)$ the complex vector space of all distributions on $\Omega$.

Hence $u \in \mathcal{D}'(\Omega)$ means that $u : \mathcal{D}(\Omega) \to \mathbf{C}$ is a linear form such that $u(\varphi_k) \to 0$ whenever $\varphi_k \to 0$ in some $\mathcal{D}_K(\Omega)$. Thus, if $u \in \mathcal{D}'(\Omega)$, for every compact set $K \subset \Omega$ there are an integer $N = N_K > 0$ and a constant $C_K > 0$, both depending on $K$, such that
$$|u(\varphi)| \le C_Kq_N(\varphi) = C_K\sum_{|\alpha|\le N} \|D^\alpha\varphi\|_K \qquad (\varphi \in \mathcal{D}_K(\Omega)).$$
If there is an $N$ independent of $K$, then the smallest such $N$ is called the order of the distribution $u$ (the constant $C_K$ still can depend on $K$). If this $N$ does not exist, $u$ is said to be of infinite order.

Example 6.4. Suppose $f$ is a locally integrable function on $\Omega$. Then we can define

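$$u_f(\varphi) := \langle\varphi, f\rangle = \int_\Omega f(x)\varphi(x)\,dx \qquad (\varphi \in \mathcal{D}(\Omega)).$$

It is clear that $u_f$ is a linear form on $\mathcal{D}(\Omega)$ such that $|u_f(\varphi)| \le \|f\|_{L^1(K)}\|\varphi\|_K$, that is, $u_f \in \mathcal{D}'(\Omega)$.

Next we prove that the linear mapping $f \in L^1_{\mathrm{loc}}(\Omega) \mapsto \langle\cdot, f\rangle \in \mathcal{D}'(\Omega)$ is one-to-one, so that we can consider $L^1_{\mathrm{loc}}(\Omega) \subset \mathcal{D}'(\Omega)$ and it is said that the distribution $u_f$ is a function. If no confusion is possible, we write $f$ for $u_f$.

Theorem 6.5. If $f \in L^1_{\mathrm{loc}}(\Omega)$ and $\int_\Omega f(x)\varphi(x)\,dx = 0$ for every test function $\varphi \in \mathcal{D}(\Omega)$, then $f = 0$ (a.e.).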

Proof. We can assume that f is a real function and we only need to prove that f = 0 a.e. on every ball B(a, r) ⊂ Ω.

If $A = \{x \in B(a,r);\ f(x) \ge 0\}$, choose $K_m \subset A \subset G_m \subset B(a,r)$ so that $|G_m \setminus K_m| \downarrow 0$ ($K_m$ compact and $G_m$ open sets). If $\varphi_m$ is a Urysohn function for $K_m \subset G_m$, then $\varphi_m \to \chi_A$ a.e. and
$$\int_A f^+(x)\,dx = \lim_m \int f(x)\varphi_m(x)\,dx = 0.$$
Hence, $f^+(x) = 0$ a.e. on $B(a,r)$. Analogously, $f^-(x) = 0$ a.e. on $B(a,r)$.


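Example 6.6. For every $a \in \Omega$, the Dirac distribution, $\delta_a$, is defined as $\delta_a(\varphi) = \varphi(a)$. On $\mathbf{R}^n$, we denote $\delta = \delta_0$.

Example 6.7. If $\mu$ is a Borel measure, or a complex Borel measure on $\Omega$, then
$$\langle g, \mu\rangle := \int_\Omega g\,d\mu = \int_\Omega gh\,d|\mu| \qquad (g \in \mathcal{C}_c(\Omega))$$
defines a linear form $\langle\cdot, \mu\rangle$ on $\mathcal{C}_c(\Omega)$, and the restriction to $\mathcal{D}(\Omega)$ is clearly a distribution which is identified with $\mu$. Here $d\mu = h\,d|\mu|$ is the polar representation of $\mu$.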
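The examples above can be mirrored in code by representing a distribution as a function that takes a test function and returns a number. The following Python sketch, for $\Omega = (-1,1)$, builds the regular distribution $u_f$ by numerical quadrature and the Dirac distribution $\delta_a$ by evaluation; the representation as Python callables, the midpoint rule, and the names `regular`, `dirac`, and `bump` are assumptions of this illustration.

```python
import math

def bump(x):
    """Test function exp(-1/(1 - x^2)) on (-1, 1), extended by 0."""
    t = 1.0 - x * x
    return math.exp(-1.0 / t) if t > 0 else 0.0

def regular(f, a=-1.0, b=1.0, m=20000):
    """Regular distribution u_f: phi -> integral of f * phi over [a, b] (midpoint rule)."""
    h = (b - a) / m
    def u(phi):
        return h * sum(f(a + (j + 0.5) * h) * phi(a + (j + 0.5) * h) for j in range(m))
    return u

def dirac(a):
    """Dirac distribution delta_a: phi -> phi(a)."""
    return lambda phi: phi(a)

u_f = regular(lambda x: x * x)
delta = dirac(0.0)
print(u_f(bump), delta(bump))
# Order-zero estimate |u_f(phi)| <= ||f||_{L^1(K)} * sup|phi| with K = [-1, 1]:
print(abs(u_f(bump)) <= (2.0 / 3.0) * math.exp(-1.0))
```

Both functionals are linear in the test function, and $u_f$ satisfies the estimate $|u_f(\varphi)| \le \|f\|_{L^1(K)}\|\varphi\|_K$ of Example 6.4.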

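The pair $(\mathcal{D}(\Omega), \mathcal{D}'(\Omega))$ is a dual couple since, if $u(\varphi) = 0$ for every $u \in \mathcal{D}'(\Omega)$, then $\varphi = 0$ ($\int|\varphi|^2 = \langle\bar\varphi, \varphi\rangle = 0$). Thus, if $u$ is a distribution and $\varphi$ a test function, we also write $\langle\varphi, u\rangle$ instead of $u(\varphi)$. Eventually we will use the notation $\langle\varphi(x), u(x)\rangle$ if we need to refer to the variable that is used at each moment.

The space $\mathcal{D}'(\Omega)$ is endowed with the weak topology $\sigma(\mathcal{D}'(\Omega), \mathcal{D}(\Omega))$, and the distributional convergence $u_k \to u$ means that $u_k(\varphi) \to u(\varphi)$ for every test function $\varphi$.

Multiplication by a $C^\infty$ function, $g \in \mathcal{E}(\Omega)$, is naturally extended to distributions: If $f \in L^1_{\mathrm{loc}}(\Omega)$, also $gf \in L^1_{\mathrm{loc}}(\Omega)$ and, as a distribution,
$$\langle\varphi, gf\rangle = \int_\Omega \varphi(x)g(x)f(x)\,dx = \langle g\varphi, f\rangle,$$
which suggests that we can define $gu$ for every $u \in \mathcal{D}'(\Omega)$ by the rule $(gu)(\varphi) = u(g\varphi)$; that is,
$$\langle\varphi, gu\rangle := \langle g\varphi, u\rangle.$$
Since $g\varphi \in \mathcal{D}(\Omega)$ for every $\varphi \in \mathcal{D}(\Omega)$, to prove that $gu \in \mathcal{D}'(\Omega)$, we only need to check the continuity property for this new linear functional $gu$ on $\mathcal{D}(\Omega)$. But multiplication by $g$ is a linear operator $g\cdot : \mathcal{D}(\Omega) \to \mathcal{D}(\Omega)$ such that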

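(6.2) $\varphi_k \to \varphi$ in $\mathcal{D}(\Omega)$ $\Rightarrow$ $g\varphi_k \to g\varphi$ in $\mathcal{D}(\Omega)$

since, if $\varphi_k \to \varphi$ in $\mathcal{D}_K(\Omega)$, it is easily checked that also $g\varphi_k \to g\varphi$ in $\mathcal{D}_K(\Omega)$ from
$$\|\partial_j(g\varphi)\|_K \le \|\partial_jg\|_K\|\varphi\|_K + \|g\|_K\|\partial_j\varphi\|_K$$
and from the corresponding estimates for the successive derivatives $D^\alpha(g\varphi)$. Hence, also $(gu)(\varphi_k) = u(g\varphi_k) \to u(g\varphi) = (gu)(\varphi)$, and $gu \in \mathcal{D}'(\Omega)$.

There is a general procedure for extending some operations on $\mathcal{D}(\Omega)$ to operations on $\mathcal{D}'(\Omega)$ which includes the above multiplication by a $C^\infty$ function as a special case.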
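The rule $(gu)(\varphi) = u(g\varphi)$ translates directly into code when distributions are modeled as Python callables acting on test functions. The sketch below applies it to $u = \delta$, which gives $g\delta = g(0)\delta$; the callable representation and the names `multiply`, `delta`, and `phi` are assumptions of this illustration.

```python
import math

def delta(phi):
    """Dirac distribution at 0: phi -> phi(0)."""
    return phi(0.0)

def multiply(g, u):
    """Product g u defined by (g u)(phi) = u(g phi)."""
    return lambda phi: u(lambda x: g(x) * phi(x))

def phi(x):
    """A test function supported by [-0.7, 1.3] (a shifted bump)."""
    t = 1.0 - (x - 0.3) ** 2
    return math.exp(-1.0 / t) if t > 0 else 0.0

g = lambda x: 2.0 + x          # smooth multiplier with g(0) = 2
g_delta = multiply(g, delta)
print(g_delta(phi), 2.0 * phi(0.0))  # both equal g(0) * phi(0)
```

Only the value of $g$ at the support of $\delta$ matters, which is the content of the identity $g\delta = g(0)\delta$.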


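If $T : \mathcal{D}(\Omega_1) \to \mathcal{D}(\Omega_2)$ is a linear map such that $T'u = u \circ T \in \mathcal{D}'(\Omega_1)$ for every $u \in \mathcal{D}'(\Omega_2)$, then $T' : \mathcal{D}'(\Omega_2) \to \mathcal{D}'(\Omega_1)$ is the transpose of $T$:
$$\langle T\varphi, u\rangle = \langle\varphi, T'u\rangle \qquad (\varphi \in \mathcal{D}(\Omega_1),\ u \in \mathcal{D}'(\Omega_2)).$$
Both $T$ and $T'$ are weakly continuous, since
$$p_\varphi(T'u) = |\langle\varphi, T'u\rangle| = |\langle T\varphi, u\rangle| = p_{T\varphi}(u)$$
and $p_u(T\varphi) = p_{T'u}(\varphi)$, where the $p_\varphi$ and the $p_u$ are, respectively, the semi-norms that define the weak topology on $\mathcal{D}'$ and on $\mathcal{D}$.

Sometimes there are two linear operators
$$T : \mathcal{D}(\Omega_1) \to \mathcal{D}(\Omega_2), \qquad R : \mathcal{D}(\Omega_2) \to \mathcal{D}(\Omega_1)$$
which are transposes of each other in the sense that
$$\langle\varphi, R\psi\rangle = \langle T\varphi, \psi\rangle \qquad (\varphi \in \mathcal{D}(\Omega_1),\ \psi \in \mathcal{D}(\Omega_2)).$$
Then, for every $u \in \mathcal{D}'(\Omega_2)$ we can define $Ru = u \circ T$, which is a linear form on $\mathcal{D}(\Omega_1)$ characterized by the condition
$$\langle\varphi, Ru\rangle = \langle T\varphi, u\rangle.$$

If $T$ satisfies the continuity condition $T\varphi_k \to 0$ in $\mathcal{D}(\Omega_2)$ whenever $\varphi_k \to 0$ in $\mathcal{D}(\Omega_1)$, then every $Ru$ is a distribution on $\Omega_1$, and $R : \mathcal{D}'(\Omega_2) \to \mathcal{D}'(\Omega_1)$ is the transpose $T'$ of $T : \mathcal{D}(\Omega_1) \to \mathcal{D}(\Omega_2)$. This transpose is considered the extension of $R : \mathcal{D}(\Omega_2) \to \mathcal{D}(\Omega_1)$.

Very often, the restriction of $R = T' : \mathcal{D}'(\Omega_2) \to \mathcal{D}'(\Omega_1)$ to $L^1_{\mathrm{loc}}(\Omega_2)$ is also a natural extension of $R : \mathcal{D}(\Omega_2) \to \mathcal{D}(\Omega_1)$. This is the case of multiplication by a $C^\infty$ function $g \in \mathcal{E}(\Omega)$, $R = g\cdot$ on $\mathcal{D}(\Omega)$. With our definitions, also $T = g\cdot$ on $\mathcal{D}(\Omega)$, since
$$\langle\varphi, g\psi\rangle = \int_\Omega \varphi(x)g(x)\psi(x)\,dx = \langle g\varphi, \psi\rangle,$$
and $R = g\cdot$, extended to $\mathcal{D}'(\Omega)$, coincides with $g\cdot$ when restricted to $L^1_{\mathrm{loc}}(\Omega)$.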

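Let us now consider the case of a $C^\infty$ change of variables $\psi : \Omega_2 \to \Omega_1$, and let $Rf$ represent the function $f(\psi^{-1}(x))$ if $f \in L^1_{\mathrm{loc}}(\Omega_2)$, so that
$$\langle\varphi, Rf\rangle = \int_{\Omega_2} f(y)\varphi(\psi(y))|J_\psi(y)|\,dy = \langle|J_\psi|\varphi(\psi), f\rangle.$$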

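The following theorem shows that $\varphi \mapsto \varphi(\psi)$ maps $\mathcal{D}(\Omega_1) \to \mathcal{D}(\Omega_2)$ continuously.

Theorem 6.8. Let $\psi : \Omega_2 \to \Omega_1$ be a $C^\infty$-change of variables, $K$ a compact set in $\Omega_1$, and let $L = \psi^{-1}(K)$. Then the linear map $\varphi \mapsto \varphi(\psi)$ is continuous from $\mathcal{D}_K(\Omega_1)$ to $\mathcal{D}_L(\Omega_2)$.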


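Proof. If $\varphi(x) = 0$ when $x \notin K$ and $y \notin L$, then $x = \psi(y) \notin K$ and $\varphi(\psi)(y) = 0$. So $\Psi : \varphi \in \mathcal{D}_K(\Omega_1) \mapsto \varphi(\psi) \in \mathcal{D}_L(\Omega_2)$ is a well-defined linear mapping. From $\partial_j(\varphi(\psi)) = \partial_j\psi\,(\partial_j\varphi)(\psi)$, it follows as for (6.2) that
$$\|\partial_j(\varphi(\psi))\|_L \le \|\partial_j\psi\|_L\|\partial_j\varphi\|_K.$$
Similarly,
$$\|\partial_k\partial_j(\varphi(\psi))\|_L \le \|\partial_k\partial_j\psi\|_L\|\partial_j\varphi\|_K + \|\partial_j\psi\|_L\|\partial_k\partial_j\varphi\|_K$$
and, by induction, $\|D^\alpha\varphi(\psi)\|_L \le C\sum_{\beta\le\alpha} \|D^\beta\varphi\|_K$. Thus, $\varphi_k \to 0$ in $\mathcal{D}_K(\Omega_1)$ implies $\Psi(\varphi_k) \to 0$ in $\mathcal{D}_L(\Omega_2)$.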

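Let $T\varphi := |J_\psi|\varphi(\psi)$ if $\varphi \in \mathcal{D}(\Omega_1)$, and let $u \in \mathcal{D}'(\Omega_2)$. Then the linear form $T'u$ on $\mathcal{D}(\Omega_1)$ defined as the transpose of $T$ by
$$\langle\varphi, T'u\rangle = \langle|J_\psi|\varphi(\psi), u\rangle$$
is a distribution, since it is the composition
$$\varphi \mapsto \varphi(\psi) \mapsto |J_\psi|\varphi(\psi) \mapsto u(|J_\psi|\varphi(\psi))$$
of three linear mappings which have the appropriate continuity properties. If $u = f \in L^1_{\mathrm{loc}}(\Omega_2)$, $T'u = Rf$. Hence, we write $Ru = T'u$ and
$$\langle\varphi, Ru\rangle = \langle|J_\psi|\varphi(\psi), u\rangle$$
defines the change of variables for distributions. Translations, scaling, and symmetry of distributions on $\Omega = \mathbf{R}^n$ are special cases: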

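Example 6.9. On $\mathbf{R}^n$ the translated $\tau_a(u)$, the symmetric $\tilde u$, and the dilation $u(\alpha x)$ of a distribution $u$ are the distributions defined as follows:

(a) $\langle\varphi, \tau_a(u)\rangle = \langle\tau_{-a}\varphi, u\rangle$, or $\langle\varphi(x), u(x - a)\rangle = \langle\varphi(x + a), u(x)\rangle$.
(b) $\langle\varphi, \tilde u\rangle = \langle\tilde\varphi, u\rangle$, or $\langle\varphi(x), u(-x)\rangle = \langle\varphi(-x), u(x)\rangle$.
(c) $\langle\varphi(x), u(\alpha^{-1}x)\rangle = |\alpha|^n\langle\varphi(\alpha x), u(x)\rangle$ if $\alpha \ne 0$.

An example is $\tau_a\delta = \delta_a$, since
$$\langle\varphi, \tau_a\delta\rangle = \langle\varphi(x + a), \delta(x)\rangle = \varphi(a) = \langle\varphi, \delta_a\rangle.$$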
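The three operations of Example 6.9 can be sketched in Python by letting each one act on the test function before the distribution is applied. The code below works on $\mathbf{R}$ (so $n = 1$ by default) and models distributions as callables; this representation and the names `translate`, `reflect`, `dilate`, and `phi` are assumptions of this illustration.

```python
import math

def delta(phi):
    """Dirac distribution at 0."""
    return phi(0.0)

def translate(u, a):
    """tau_a u, defined by <phi, tau_a u> = <phi(x + a), u(x)>."""
    return lambda phi: u(lambda x: phi(x + a))

def reflect(u):
    """Symmetric distribution, defined by <phi(x), u(-x)> = <phi(-x), u(x)>."""
    return lambda phi: u(lambda x: phi(-x))

def dilate(u, alpha, n=1):
    """u(x / alpha), defined by <phi(x), u(x / alpha)> = |alpha|^n <phi(alpha x), u(x)>."""
    return lambda phi: abs(alpha) ** n * u(lambda x: phi(alpha * x))

def phi(x):
    """Test function supported by [-1, 1]."""
    t = 1.0 - x * x
    return math.exp(-1.0 / t) if t > 0 else 0.0

print(translate(delta, 0.5)(phi), phi(0.5))    # tau_a delta = delta_a
print(dilate(delta, 3.0)(phi), 3.0 * phi(0.0))  # delta(x / 3) = 3 delta(x) on R
```

Applying `reflect` to `translate(delta, a)` gives evaluation at $-a$, as one expects from the symmetry $x \mapsto -x$.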

6.3. Differentiation of distributions

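If $f \in \mathcal{E}^1(\mathbf{R})$, or if $f$ is an everywhere differentiable function on $\mathbf{R}$ and $f' \in L^1_{\mathrm{loc}}(\mathbf{R})$ (see e.g. Rudin’s “Real and Complex Analysis” [39]), integration by parts yields
$$\langle\varphi, f'\rangle = \int_{-r}^{r} \varphi(t)f'(t)\,dt = -\int_{-r}^{r} \varphi'(t)f(t)\,dt = -\langle\varphi', f\rangle$$
if $\varphi \in \mathcal{D}_{[-r,r]}(\mathbf{R})$.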


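This result suggests that we associate to each distribution $u$ on the open set $\Omega \subset \mathbf{R}^n$ the partial “derivatives” $\partial_ju$ (or $Du = u'$ when $n = 1$) by the formula
$$\langle\varphi, \partial_ju\rangle := -\langle\partial_j\varphi, u\rangle \qquad (1 \le j \le n).$$
That is, $(\partial_ju)(\varphi) = -u(\partial_j\varphi)$ ($u'(\varphi) = -u(\varphi')$ if $n = 1$).

If $\varphi_k \to 0$ in $\mathcal{D}_K(\Omega)$, then
$$-\langle\partial_j\varphi_k, u\rangle \to 0,$$
since $\partial_j\varphi_k \to 0$ in $\mathcal{D}_K(\Omega)$. Thus, $\partial_ju \in \mathcal{D}'(\Omega)$, the operator
$$\partial_j : \mathcal{D}'(\Omega) \to \mathcal{D}'(\Omega)$$
is the adjoint of
$$-\partial_j : \mathcal{D}(\Omega) \to \mathcal{D}(\Omega),$$
and $\partial_j$ on $\mathcal{D}'(\Omega)$ extends the partial derivatives of $C^1$ functions on $\Omega$.

Note that $\partial_j : L^1_{\mathrm{loc}}(\Omega) \to \mathcal{D}'(\Omega)$ can be defined as the restriction of $\partial_j : \mathcal{D}'(\Omega) \to \mathcal{D}'(\Omega)$. It is a custom, which we shall usually follow, to write $\partial_jf$ instead of $\partial_ju_f$, which is a distribution called the distributional derivative or weak derivative of $f \in L^1_{\mathrm{loc}}(\Omega)$. It is characterized by the identity
$$\langle\varphi, \partial_jf\rangle = -\int_\Omega f(x)\partial_j\varphi(x)\,dx \qquad (\varphi \in \mathcal{D}(\Omega)).$$

The $\alpha$th distributional derivatives are defined by induction. If $D^\alpha = \partial_1^{\alpha_1} \circ\cdots\circ \partial_n^{\alpha_n}$, then
$$D^\alpha : \mathcal{D}'(\Omega) \to \mathcal{D}'(\Omega)$$
is a continuous linear operator such that
$$\langle\varphi, D^\alpha u\rangle = (-1)^{|\alpha|}\langle D^\alpha\varphi, u\rangle.$$
We also write $D^\alpha f$ for $D^\alpha u_f$ if $f \in L^1_{\mathrm{loc}}(\Omega)$.

Theorem 6.10. If $f \in \mathcal{E}^m(\Omega)$ and $|\alpha| \le m$, then the usual pointwise derivative $D^\alpha f$ coincides with the distributional derivative: $D^\alpha u_f = u_{D^\alpha f}$. The Leibniz formula for the derivatives of a product of functions⁵ is extended to
$$\partial_j(fu) = (\partial_jf)u + f\partial_ju \qquad (f \in \mathcal{E}^1(\Omega),\ u \in \mathcal{D}'(\Omega)).$$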

Proof. By induction, assume $m = 1$ and consider $D^\alpha = \partial_1$:
$$\langle\varphi, \partial_1u_f\rangle = -\int_K \partial_1\varphi(x)f(x)\,dx.$$

5See in Exercise 2.1 the extension of the Leibniz formula to higher-order derivatives.


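We can suppose that $f(x) = 0$ if $x \in \mathbf{R}^n \setminus G$, where $G \subset \Omega$ is an open set containing $K = \operatorname{supp}\varphi$, so that
$$\langle\varphi, \partial_1u_f\rangle = -\int_{\mathbf{R}^n} \partial_1\varphi(x)f(x)\,dx.$$
If $x = (t, \bar x) \in \mathbf{R} \times \mathbf{R}^{n-1}$ and $f_{\bar x}(t) = f(x)$, integration by parts gives
$$\langle\varphi, \partial_1u_f\rangle = \int_{\mathbf{R}^{n-1}}\Big(\int_{-r}^{r} \varphi_{\bar x}(t)f_{\bar x}'(t)\,dt\Big)d\bar x = \langle\varphi, \partial_1f\rangle.$$
For the product,
$$\langle\varphi, \partial_j(fu)\rangle = -\langle f\partial_j\varphi, u\rangle = \langle\varphi\partial_jf - \partial_j(f\varphi), u\rangle = \langle\varphi, (\partial_jf)u + f\partial_ju\rangle.$$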

If $f$ is absolutely continuous, so that $f'$ exists a.e., and if $f' \in L^1_{\mathrm{loc}}(\mathbf{R})$, it is shown in Exercise 6.20 that this a.e. derivative is also the distributional derivative of $f$. But one has to be careful since the following examples show that there are many functions $f \in L^1_{\mathrm{loc}}(\mathbf{R})$ with an a.e. defined derivative $f' \in L^1_{\mathrm{loc}}(\mathbf{R})$ which is not the distributional derivative of $f$.

Example 6.11. Let f be a C1 function on (a, b) \{t1,t2,...,tn} (a

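Define $\varphi = 0$ outside of $(a,b)$. If $t_0 = a$ and $t_{n+1} = b$,
$$\langle\varphi, u_f'\rangle = -\sum_{j=0}^{n}\int_{t_j}^{t_{j+1}} \varphi'(t)f(t)\,dt$$
and integration by parts gives
$$\langle\varphi, u_f'\rangle = \sum_{j=0}^{n}\Big(\int_{t_j}^{t_{j+1}} \varphi(t)f'(t)\,dt + f(t_j+)\varphi(t_j) - f(t_{j+1}-)\varphi(t_{j+1})\Big) = \int_a^b \varphi(t)f'(t)\,dt + \sum_{j=1}^{n}\big[f(t_j+) - f(t_j-)\big]\varphi(t_j).$$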
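The jump formula can be verified numerically in a simple case. The Python sketch below takes $(a,b) = (-1,1)$, a single discontinuity at $t_1 = 0$, the function $f(t) = t$ for $t < 0$ and $f(t) = t + 2$ for $t \ge 0$ (so the jump is 2 and $f' = 1$ away from 0), and a bump test function; these concrete choices, the midpoint rule, and the names are assumptions of this illustration.

```python
import math

def phi(x):
    """Test function exp(-1/(1 - x^2)) supported by [-1, 1]."""
    t = 1.0 - x * x
    return math.exp(-1.0 / t) if t > 0 else 0.0

def dphi(x):
    """Derivative of phi, from the chain rule."""
    t = 1.0 - x * x
    return phi(x) * (-2.0 * x) / (t * t) if t > 0 else 0.0

def f(t):
    """C^1 away from 0, with jump f(0+) - f(0-) = 2 and f' = 1 elsewhere."""
    return t if t < 0 else t + 2.0

def midpoint(func, a, b, m=20000):
    """Midpoint rule on [a, b] with m cells."""
    h = (b - a) / m
    return h * sum(func(a + (j + 0.5) * h) for j in range(m))

# Left side: <phi, u_f'> = -integral of phi' f, computed separately on each side of the jump.
lhs = -(midpoint(lambda t: dphi(t) * f(t), -1.0, 0.0)
        + midpoint(lambda t: dphi(t) * f(t), 0.0, 1.0))
# Right side: integral of phi f' (with f' = 1) plus the jump times phi(0).
rhs = midpoint(phi, -1.0, 1.0) + 2.0 * phi(0.0)
print(lhs, rhs)
```

The extra term $2\varphi(0)$ is the contribution of $2\delta$, which the a.e. derivative $f' = 1$ does not see.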

Example 6.12. If $f \in \mathcal{E}(\mathbf{R})$ and $Y = \chi_{[0,\infty)}$ (Heaviside function), then
$$(fY)' = f'Y + f(0)\delta \quad\text{and}\quad (fY)^{(n)} = f^{(n)}Y + \sum_{k=0}^{n-1} f^{(k)}(0)\delta^{(n-k-1)}.$$


The first identity is contained in Example 6.11. Then, by induction,
$$(fY)^{(n+1)} = \big((fY)^{(n)}\big)' = f^{(n+1)}Y + f^{(n)}\delta + \sum_{k=0}^{n-1} f^{(k)}(0)\delta^{(n-k)}$$
and $f^{(n)}\delta = f^{(n)}(0)\delta$.

The following results show a couple of instances where the derivatives of distributions behave as the derivatives of functions.

First, on an interval of $\mathbf{R}$, the constant functions are the unique distributions with zero derivative:

Theorem 6.13. Let u ∈D(a, b)(−∞ ≤ a

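The derivative of a distribution on $\mathbf{R}^n$ can be defined as a quotient limit of distributions:

Theorem 6.14. Suppose $\varphi \in \mathcal{D}(\mathbf{R}^n)$ and $u \in \mathcal{D}'(\mathbf{R}^n)$, and let $e_j$ be the $j$th standard basis vector of $\mathbf{R}^n$. Then
$$\lim_{h\to 0}\frac{\tau_{-he_j}\varphi - \varphi}{h} = \partial_j\varphi \quad\text{in } \mathcal{D}(\mathbf{R}^n)$$
and
$$\lim_{h\to 0}\frac{\tau_{-he_j}u - u}{h} = \partial_ju \quad\text{in } \mathcal{D}'(\mathbf{R}^n).$$

Proof. We can suppose $j = 1$ and we choose a compact set $K \subset \mathbf{R}^n$ which contains $\operatorname{supp}\varphi$ and so that $d = d(\operatorname{supp}\varphi, K^c) > 0$. By the Lagrange mean value theorem,
$$\Big|\frac{\varphi(x + he_1) - \varphi(x)}{h} - \partial_1\varphi(x)\Big| = |\partial_1\varphi(x + h_xe_1) - \partial_1\varphi(x)|$$
with $|h_x| \le |h|$. Note that supp ϕ + he1 ⊂ K if |h|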


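The same holds for every $D^\alpha\varphi$ and, if $h \to 0$,
$$\frac{\varphi(x + he_1) - \varphi(x)}{h} \to \partial_1\varphi(x)$$
in $\mathcal{D}_K(\mathbf{R}^n)$ and in $\mathcal{D}(\mathbf{R}^n)$.

From the continuity of the distribution $u$ on $\mathcal{D}_K(\mathbf{R}^n)$,
$$\frac{\langle\varphi(x - he_1), u(x)\rangle - \langle\varphi, u\rangle}{h} \to -\langle\partial_1\varphi, u\rangle$$
if $h \to 0$, so that $\lim_{h\to 0}\langle\varphi, h^{-1}(\tau_{-he_j}u - u)\rangle = \langle\varphi, \partial_ju\rangle$.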
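For $u = \delta$ on $\mathbf{R}$ the difference quotient of Theorem 6.14 can be evaluated directly: pairing $h^{-1}(\tau_{-h}\delta - \delta)$ with $\varphi$ gives $(\varphi(-h) - \varphi(0))/h$, which tends to $-\varphi'(0) = \langle\varphi, \delta'\rangle$. The Python sketch below checks this with a shifted bump as test function; the callable representation of distributions and the names used are assumptions of this illustration.

```python
import math

def delta(phi):
    """Dirac distribution at 0."""
    return phi(0.0)

def translate(u, a):
    """tau_a u, defined by <phi, tau_a u> = <phi(x + a), u(x)>."""
    return lambda phi: u(lambda x: phi(x + a))

def difference_quotient(u, h):
    """The distribution (tau_{-h} u - u) / h."""
    shifted = translate(u, -h)
    return lambda phi: (shifted(phi) - u(phi)) / h

def phi(x):
    """Test function: bump centered at 0.5, supported by [-0.5, 1.5]."""
    t = 1.0 - (x - 0.5) ** 2
    return math.exp(-1.0 / t) if t > 0 else 0.0

def dphi(x):
    """Derivative of phi, from the chain rule."""
    t = 1.0 - (x - 0.5) ** 2
    return phi(x) * (-2.0 * (x - 0.5)) / (t * t) if t > 0 else 0.0

exact = -dphi(0.0)  # <phi, delta'> = -phi'(0)
for h in (1e-1, 1e-3, 1e-5):
    print(h, difference_quotient(delta, h)(phi), exact)
```

The printed quotients approach the exact value at a rate proportional to $h$, as expected from a first-order difference.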

6.4. Convolution of distributions

The convolution $f * g$ of two integrable functions $f$ and $g$ is a well-defined integrable function and, if one of them has compact support, then
$$\langle \varphi, f * g \rangle = \int_{\mathbf{R}^n} g(y) \int_{\mathbf{R}^n} \varphi(x) f(x - y)\,dx\,dy = \langle \varphi(y + z), f(z)g(y) \rangle = \langle \langle \varphi(y + z), f(z) \rangle, g(y) \rangle.$$
If $u$ and $v$ are two distributions, this suggests that we should try
$$\langle \varphi, u * v \rangle = \langle \langle \varphi(y + z), u(z) \rangle, v(y) \rangle$$
as a possible definition, but then we need to apply $v$ to the function $f(y) = \langle \varphi(y + z), u(z) \rangle$. Note that if $\operatorname{supp}\varphi \subset \bar B(0,r)$, we can only state that $\varphi(y + z) = 0$ whenever $(y,z)$ lies in $|y + z| \ge r$, and $f$ can no longer have a compact support. This shows that it is convenient to consider cases where $u$ can be applied to functions with an unbounded support. This will be possible if $u$ belongs to the class of distributions with compact support.

6.4.1. Support of a distribution and distributions with compact support. We recall that the support of a continuous function $f$ on $\Omega$ is the closure in $\Omega$ of the set of points where $f(x) \neq 0$ or, equivalently, the complement of the largest open subset of $\Omega$ on which $f$ is zero. To define an analogous concept for distributions, it is convenient to show that they are locally determined, and partitions of unity are useful to this end.

Recall that, according to Theorem 1.3 and Remark 1.4, if $\{\Omega_1, \ldots, \Omega_m\}$ is a finite family of open subsets of $\mathbf{R}^n$ which covers a compact set $K \subset \mathbf{R}^n$, then there exists a partition of unity $\varphi_j \in \mathcal{D}(\Omega_j)$ $(1 \le j \le m)$ subordinate to this covering.


Note also that if $G$ is an open subset of $\Omega$, we can consider $\mathcal{D}(G) \subset \mathcal{D}(\Omega) \subset \mathcal{D}(\mathbf{R}^n)$, and the restriction of a distribution $u \in \mathcal{D}'(\Omega)$ to $\mathcal{D}(G)$ is also a distribution, $u_G \in \mathcal{D}'(G)$. If the restrictions of $u, v \in \mathcal{D}'(\Omega)$ are the same, we say that these distributions are equal on $G$, and we write $u = v$ on $G$.

Theorem 6.15. For a given distribution $u \in \mathcal{D}'(\Omega)$, there is a largest open subset $G$ of $\Omega$ where $u$ is zero.

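Proof. Let $G$ be the union of all open subsets of $\Omega$ where $u$ is zero, and let $\varphi \in \mathcal{D}_K(G)$. Then $K \subset \Omega_1 \cup \cdots \cup \Omega_m$ with $u = 0$ on every $\Omega_j$, and we can choose a partition of unity $\{\varphi_j\}_{j=1}^m$ subordinate to this covering of $K$. Then $u(\varphi) = u\big(\sum_{j=1}^m \varphi_j\varphi\big) = \sum_{j=1}^m u(\varphi_j\varphi) = 0$, since $\operatorname{supp}\varphi_j\varphi \subset \Omega_j$.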

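The support of a distribution $u \in \mathcal{D}'(\Omega)$, $\operatorname{supp} u$, is defined as the complement $\Omega \setminus G$ of the largest open subset $G$ of $\Omega$ where $u$ vanishes.

Theorem 6.16. The support of $u \in \mathcal{D}'(\Omega)$ is compact if and only if $u$ is the restriction to $\mathcal{D}(\Omega)$ of some $v \in \mathcal{E}'(\Omega)$, which is uniquely determined by $u$. We call $\mathcal{E}'(\Omega)$ the dual of $\mathcal{E}(\Omega)$.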

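Proof. Since the inclusions $\mathcal{D}_K(\Omega) \hookrightarrow \mathcal{E}(\Omega)$ are all continuous, the restriction $u$ of every $v \in \mathcal{E}'(\Omega)$ to $\mathcal{D}(\Omega)$ is a distribution. The continuity of $v$ means that

(6.3) $|v(f)| \le C\,q_{K,m}(f)$ $(f \in \mathcal{E}(\Omega))$

for some semi-norm $q_{K,m}(f) = \max_{|\alpha| \le m} \|D^\alpha f\|_K$ and some constant $C > 0$. Thus $v(f) = 0$ if $\operatorname{supp} f \subset \Omega \setminus K$, since in this case $q_{K,m}(f) = 0$, and then $\operatorname{supp} v \subset K$.

Reciprocally, given $u \in \mathcal{D}'(\Omega)$ with $\operatorname{supp} u = K$, compact, we choose $K(\delta) = K + \bar B(0,\delta) \subset \Omega$ and $K(\delta) \prec \varrho \prec \Omega$. We claim that $v(f) := u(\varrho f)$ defines a linear form on $\mathcal{E}(\Omega)$ which is an extension of $u$ and which does not depend on $\varrho$ or $\delta$. Indeed, if also $K(\delta') \prec \varrho_1 \prec \Omega$, then $u(\varrho f) - u(\varrho_1 f) = u(f(\varrho - \varrho_1)) = 0$ since $\operatorname{supp}(\varrho - \varrho_1) \cap K = \emptyset$.

Moreover, if $\varphi \in \mathcal{D}_K(\Omega)$, it is shown that $v(\varphi) = u(\varrho\varphi) = u(\varphi)$ by choosing a compact set $L$ such that $K \cup \operatorname{supp}\varphi \subset \operatorname{Int}(L)$ and $L \prec \varrho \prec \Omega$. If $f_k \to 0$ in $\mathcal{E}(\Omega)$, it is easily checked that $\varrho f_k \to 0$ in $\mathcal{D}(\Omega)$ and then $v(f_k) \to 0$. That is, $v \in \mathcal{E}'(\Omega)$. According to Theorem 6.3, the set $\mathcal{D}(\Omega)$ of all test functions is dense in $\mathcal{E}(\Omega)$, and $v$ is uniquely determined by $u$.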


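Since $v \in \mathcal{E}'(\Omega) \mapsto u = v|_{\mathcal{D}(\Omega)} \in \mathcal{D}'(\Omega)$ is linear and one-to-one onto the class of all distributions on $\Omega$ with compact support, $\mathcal{E}'(\Omega)$ is said to be the space of distributions with compact support on $\Omega$.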

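6.4.2. Convolution of distributions with functions. Next we define the convolution of a distribution with a function on the whole $\mathbf{R}^n$ when at least one of them has compact support. Suppose first that $\varphi \in \mathcal{D}(\mathbf{R}^n)$ and $u \in \mathcal{D}'(\mathbf{R}^n)$. Since $(f * \varphi)(x) = \langle \varphi(x - y), f(y) \rangle$ if $f \in L^1_{\mathrm{loc}}(\mathbf{R}^n)$, we define

$$(u * \varphi)(x) := \langle \varphi(x - y), u(y) \rangle = \langle (\tau_x\tilde\varphi)(y), u(y) \rangle = \langle \tilde\varphi, \tau_{-x}u \rangle = \langle \tau_{-x}\varphi, \tilde u \rangle.$$

Note that $(u * \varphi)(x)$ is continuous with respect to $u$ in $\mathcal{D}'(\mathbf{R}^n)$ (when $\varphi \in \mathcal{D}(\mathbf{R}^n)$ and $x \in \mathbf{R}^n$ are fixed), with respect to $\varphi$, and with respect to $x$. That is,

(6.4) $(u_k * \varphi)(x) \to (u * \varphi)(x)$, $(u * \varphi_k)(x) \to (u * \varphi)(x)$, $(u * \varphi)(x_k) \to (u * \varphi)(x)$

when $u_k \to u$ in $\mathcal{D}'(\mathbf{R}^n)$, $\varphi_k \to \varphi$ in $\mathcal{D}_K(\mathbf{R}^n)$, and $x_k \to x$ in $\mathbf{R}^n$, respectively.

Indeed, $(u_k * \varphi)(x) = \langle \tau_x\tilde\varphi, u_k \rangle \to \langle \tau_x\tilde\varphi, u \rangle$ if $u_k \to u$, since on $\mathcal{D}'(\mathbf{R}^n)$ we are considering the weak convergence.

Also, if $\varphi_k \to \varphi$, then $(u * \varphi_k)(x) = \langle \tilde\varphi_k, \tau_{-x}u \rangle \to \langle \tilde\varphi, \tau_{-x}u \rangle$. Finally, if $x_k \to x$, then $(u * \varphi)(x_k) = \langle \tau_{-x_k}\varphi, \tilde u \rangle \to \langle \tau_{-x}\varphi, \tilde u \rangle$ from the uniform continuity of $\varphi$ and of all $D^\alpha\varphi$, since $\|\tau_{-x}D^\alpha\varphi - \tau_{-x_k}D^\alpha\varphi\|_{\mathbf{R}^n} \le \varepsilon$ if $k$ is large, and there is a single compact set $L \subset \mathbf{R}^n$ so that $\operatorname{supp}\tau_{-x_k}\varphi \subset L$.

Let us gather together the basic properties of this convolution:

Theorem 6.17. If $\varphi \in \mathcal{D}(\mathbf{R}^n)$ and $u \in \mathcal{D}'(\mathbf{R}^n)$, then $u * \varphi \in \mathcal{E}(\mathbf{R}^n)$ and the following properties are satisfied: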

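(a) $\tau_a(u * \varphi) = (\tau_a u) * \varphi = u * (\tau_a\varphi)$.
(b) $D^\alpha(u * \varphi) = (D^\alpha u) * \varphi = u * (D^\alpha\varphi)$.
(c) $\operatorname{supp}(u * \varphi) \subset \operatorname{supp} u + \operatorname{supp}\varphi$.
(d) $(u * \varphi) * \psi = u * (\varphi * \psi)$ if $\psi \in \mathcal{C}_c(\mathbf{R}^n)$.
(e) $u * \varphi_k \to u * \varphi$ in $\mathcal{E}(\mathbf{R}^n)$ if $\varphi_k \to \varphi$ in $\mathcal{D}(\mathbf{R}^n)$; hence, $u* : \mathcal{D}(\mathbf{R}^n) \to \mathcal{E}(\mathbf{R}^n)$ is continuous (when restricted to every $\mathcal{D}_K(\mathbf{R}^n)$).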

Proof. (a) From the definition,

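$$(u * \varphi)(x - a) = \langle \varphi(x - (y + a)), u(y) \rangle = \langle \varphi(x - y), \tau_a u(y) \rangle = ((\tau_a u) * \varphi)(x)$$
and also $\langle \varphi(x - a - y), u(y) \rangle = (u * (\tau_a\varphi))(x)$.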


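(b) We can assume $D^\alpha = \partial_1$. Then, according to Theorem 6.14, an application of (a) gives
$$\partial_1(u * \varphi)(x) = \lim_{t\to 0} \frac{\tau_{-te_1}(u * \varphi)(x) - (u * \varphi)(x)}{t} = \lim_{t\to 0} \Big\langle \tilde\varphi, \frac{\tau_{-x-te_1}u - \tau_{-x}u}{t} \Big\rangle = \langle \tilde\varphi, \tau_{-x}\partial_1 u \rangle = ((\partial_1 u) * \varphi)(x).$$
Similarly,
$$\partial_1(u * \varphi)(x) = \lim_{t\to 0} \Big\langle \frac{(\tau_{-te_1}\varphi)^{\sim} - \tilde\varphi}{t}, \tau_{-x}u \Big\rangle = (u * \partial_1\varphi)(x).$$

(c) If $\operatorname{supp}\varphi(x - \cdot) \cap \operatorname{supp} u = \emptyset$, then $\langle \varphi(x - \cdot), u \rangle = 0$. Hence, if $(u * \varphi)(x) \neq 0$, necessarily $\operatorname{supp}\varphi(x - \cdot) \cap \operatorname{supp} u \neq \emptyset$, so that $x - y \in \operatorname{supp}\varphi$ at least for one point $y \in \operatorname{supp} u$ and then $x \in \operatorname{supp} u + \operatorname{supp}\varphi$. Thus, $\{u * \varphi \neq 0\} \subset \operatorname{supp} u + \operatorname{supp}\varphi$, which is closed as a sum of a compact set and a closed set, so that $\operatorname{supp}(u * \varphi) = \overline{\{u * \varphi \neq 0\}} \subset \operatorname{supp} u + \operatorname{supp}\varphi$.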

(d) We claim that, by writing the integral that defines (ϕ ∗ ψ)(x)fora given point x ∈ Rn as the limit of Riemann sums with uniform increments 0


from (6.4) that $(u * \varphi_k)(x) \to (u * \varphi)(x)$ for every $x \in \mathbf{R}^n$. Hence, $f(x) = (u * \varphi)(x)$.

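The convolution that we have defined can be considerably extended. If $u \in \mathcal{D}'(\mathbf{R}^n)$ is a distribution with compact support and $f \in \mathcal{E}(\mathbf{R}^n)$, then we also may define

$$(u * f)(x) := u(\tau_x\tilde f)$$

since $u$ is extended to a unique $u \in \mathcal{E}'(\mathbf{R}^n)$, by Theorem 6.16. We have an analogous result to Theorem 6.17:

Theorem 6.18. If $f \in \mathcal{E}(\mathbf{R}^n)$ and $u \in \mathcal{E}'(\mathbf{R}^n)$, then $u * f \in \mathcal{E}(\mathbf{R}^n)$ and the following properties are satisfied:

(a) $\tau_a(u * f) = (\tau_a u) * f = u * (\tau_a f)$.
(b) $D^\alpha(u * f) = (D^\alpha u) * f = u * (D^\alpha f)$.
(c) $(u * f) * \psi = u * (f * \psi)$ if $\psi \in \mathcal{D}(\mathbf{R}^n)$.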

Proof. The statement (a) is proved exactly as in Theorem 6.17 and (Dαu)∗ f = u ∗ (Dαf), which is one half of (b), follows directly from the definition. To also show that Dα(u ∗ f)(x)=u ∗ (Dαf)(x), we choose  =˜ ∈D(Rn) such that  = 1 on a neighborhood of supp u ∪ τxf˜ and then (u ∗ Dαf)(x)=u ∗ (Dα(f))(x)=Dα(u ∗ (f))(x) with (u ∗ (f))(x)=u(τxf˜)=(u ∗ f)(x). Note also that ((Dαu) ∗ f)(x)=((Dαu) ∗ (f))(x) is continuous. To prove (c), let G be a bounded open set that contains supp u and choose any  ∈D(Rn) such that ˜ = f˜ on supp ψ + G.Thenf∗ ψ = ∗ ψ on G and it follows from the definition that (u ∗ (f ∗ ψ))(0) = (u ∗ ( ∗ ψ))(0) = (u ∗ (ψ ∗ ))(0). ˜ Since τhf = τh˜ on W , u∗f = u∗ at every point in − supp ψ and it follows that ((u ∗ f) ∗ ψ)(0) = ((u ∗ ) ∗ ψ)(0). But supp (u ∗ ψ) ⊂ supp u + supp ψ, so that ((u ∗ ψ) ∗ f)(0) = ((u ∗ ψ) ∗ )(0) and ((u ∗ ψ) ∗ )(0) = (u ∗ (ψ ∗ ))(0), which combined with the first identity gives (u ∗ (f ∗ ψ))(0) = ((u ∗ ψ) ∗ f)(0) and we obtain (c) from (a) by applying a convenient translation. 


Example 6.19. $\delta * f = f$ for every $f \in \mathcal{E}(\mathbf{R}^n)$, since $(\delta * f)(x) = \langle f(x - y), \delta(y) \rangle = f(x)$.

6.4.3. Convolution of distributions. A typical fact of convolution operators is that they commute with translations. By Theorem 6.17(a), for every distribution $u \in \mathcal{D}'(\mathbf{R}^n)$, the operator $u* : \mathcal{D}(\mathbf{R}^n) \to \mathcal{E}(\mathbf{R}^n)$ is linear, continuous on every $\mathcal{D}_K(\mathbf{R}^n)$, and $u * (\tau_a\varphi) = \tau_a(u * \varphi)$. The converse is also true:

Theorem 6.20. If $T : \mathcal{D}(\mathbf{R}^n) \to \mathcal{C}(\mathbf{R}^n)$ is linear, continuous on every $\mathcal{D}_K(\mathbf{R}^n)$ (or, simply, $(T\varphi_k)(0) \to 0$ if $\varphi_k \to 0$ in $\mathcal{D}(\mathbf{R}^n)$) and if it commutes with translations, then $T = u*$ for a uniquely determined distribution $u \in \mathcal{D}'(\mathbf{R}^n)$.

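Proof. Since necessarily $(T\varphi)(0) = (u * \varphi)(0) = (\tau_0 u)(\tilde\varphi) = u(\tilde\varphi)$, we must define $u(\varphi) := (T\tilde\varphi)(0)$. Note that $u$ is linear and, if $\varphi_k \to 0$ in $\mathcal{D}(\mathbf{R}^n)$, then $u(\varphi_k) = (T\tilde\varphi_k)(0) \to 0$; hence $u \in \mathcal{D}'(\mathbf{R}^n)$.

But $T(\tau_{-a}\varphi)(0) = (u * \tau_{-a}\varphi)(0)$ and, since $T$ commutes with translations,

$$(T\varphi)(a) = (\tau_{-a}T\varphi)(0) = T(\tau_{-a}\varphi)(0) = (u * \tau_{-a}\varphi)(0) = (u * \varphi)(a).$$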

The preceding theorem will allow us to extend the previous definitions to a convolution of two distributions u and v on Rn if at least the support of one of them is compact. In this case we can define

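$$T_{u,v}(\varphi) := u * (v * \varphi).$$
Note that, if $v \in \mathcal{E}'(\mathbf{R}^n)$, then $v * \varphi \in \mathcal{D}(\mathbf{R}^n)$ and $u * (v * \varphi)$ is well-defined; similarly, if $u$ has compact support, then $v * \varphi \in \mathcal{E}(\mathbf{R}^n)$ and we also are allowed to consider

$$u * (v * \varphi)(x) = u\big(\tau_x (v * \varphi)^{\sim}\big).$$

In both cases, the linear map $T_{u,v} : \mathcal{D}(\mathbf{R}^n) \to \mathcal{E}(\mathbf{R}^n)$ is continuous on every $\mathcal{D}_K(\mathbf{R}^n)$ and commutes with translations. By Theorem 6.20, there exists a unique $w \in \mathcal{D}'(\mathbf{R}^n)$ such that $T_{u,v} = w*$. Then we define $u * v := w$, so that $u * v \in \mathcal{D}'(\mathbf{R}^n)$ is characterized by the identity
$$(u * v) * \varphi = u * (v * \varphi) \quad (\varphi \in \mathcal{D}(\mathbf{R}^n)).$$

Theorem 6.21. Let $u_1, u_2, u_3, u, v \in \mathcal{D}'(\mathbf{R}^n)$.

(a) If $u * \varphi = 0$ for every $\varphi \in \mathcal{D}(\mathbf{R}^n)$, then $u = 0$.
(b) If at least two of $u_1, u_2, u_3 \in \mathcal{D}'(\mathbf{R}^n)$ have compact support, then

$$u_1 * (u_2 * u_3) = (u_1 * u_2) * u_3.$$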


(c) If at least the support of one of $u$ and $v$ is compact, then $u * v = v * u$, $D^\alpha(u * v) = (D^\alpha u) * v = u * (D^\alpha v)$, and $\operatorname{supp}(u * v) \subset \operatorname{supp} u + \operatorname{supp} v$.

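Proof. (a) Since $\tilde{\tilde\varphi} = \varphi$, we have $u(\varphi) = u(\tau_0\tilde{\tilde\varphi}) = (u * \tilde\varphi)(0) = 0$ if $u * \psi = 0$ for every test function $\psi$.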

(b) From the definitions, (u1 ∗ (u2 ∗ u3)) ∗ ϕ =((u1 ∗ u2) ∗ u3) ∗ ϕ since, in any case,

(u1 ∗ (u2 ∗ u3)) ∗ ϕ = u1 ∗ ((u2 ∗ u3) ∗ ϕ)=u1 ∗ (u2 ∗ (u3 ∗ ϕ)) and

((u1 ∗ u2) ∗ u3) ∗ ϕ =(u1 ∗ u2) ∗ (u3 ∗ ϕ)=u1 ∗ (u2 ∗ (u3 ∗ ϕ)).

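(c) Let $u_1 = u * v$ and $u_2 = v * u$. To show that $u_1 = u_2$, we only need to prove that $u_1 * (\varphi_1 * \varphi_2) = u_2 * (\varphi_1 * \varphi_2)$ for any $\varphi_1, \varphi_2 \in \mathcal{D}(\mathbf{R}^n)$, since then, by an application of (b), $(u_1 * \varphi_1) * \varphi_2 = (u_2 * \varphi_1) * \varphi_2$. Now we apply (a), so that $u_1 * \varphi_1 = u_2 * \varphi_1$ and then $u_1 = u_2$. But, by using the fact that the convolution of functions is commutative and (b),

$$(u * v) * (\varphi_1 * \varphi_2) = u * ((v * \varphi_1) * \varphi_2) = u * (\varphi_2 * (v * \varphi_1)) = (u * \varphi_2) * (v * \varphi_1)$$
and also

$$(v * u) * (\varphi_1 * \varphi_2) = (v * \varphi_1) * (u * \varphi_2) = (u * \varphi_2) * (v * \varphi_1).$$
Hence, $v * u = u * v$.

To study the supports, let $\varrho$ be a test function supported by $\bar B(0,1)$ and with $\int \varrho = 1$, and let $\varrho_k(x) = k^n\varrho(kx)$. Assume that $v$ has compact support. Then $u * v * \varrho_k \to u * v$ in $\mathcal{D}'(\mathbf{R}^n)$, so that $\operatorname{supp}(u * v)$ is contained in the closure of $\bigcup_{k \ge m} \operatorname{supp}(u * v * \varrho_k)$ for every $m$, and

$$\operatorname{supp}(u * v * \varrho_k) \subset \operatorname{supp} u + \operatorname{supp}(v * \varrho_k) \subset (\operatorname{supp} u + \operatorname{supp} v) + \operatorname{supp}\varrho_k$$
where $\operatorname{supp}\varrho_k \subset \bar B(0, 1/k)$; hence $\operatorname{supp}(u * v) \subset \bigcap_k \big( (\operatorname{supp} u + \operatorname{supp} v) + \bar B(0, 1/k) \big) = \operatorname{supp} u + \operatorname{supp} v$, a closed set, since $\operatorname{supp} v$ is compact.

Finally, $(D^\alpha(u * v)) * \varphi = (u * v) * D^\alpha\varphi = u * (v * D^\alpha\varphi) = u * (D^\alpha v) * \varphi$ for every $\varphi$, and $D^\alpha(u * v) = u * (D^\alpha v)$.

Example 6.22. If $u \in \mathcal{D}'(\mathbf{R}^n)$, $u * \delta = u$, since $(u * \delta) * \varphi = u * (\delta * \varphi) = u * \varphi$.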

This example shows that the derivatives are convolution operators: $D^\alpha u = D^\alpha(\delta * u) = (D^\alpha\delta) * u$.


6.5. Distributional differential equations

6.5.1. Linear differential equations. A very simple differential equation on $(\alpha, \beta) \subset \mathbf{R}$ is $u' = 0$, and in Theorem 6.13 we proved that its distributional solutions are the classical constant solutions. This property is easily extended to all ordinary linear equations with smooth coefficients:

Theorem 6.23. Let $a \in \mathcal{E}(\alpha, \beta)$ and $f \in \mathcal{C}(\alpha, \beta)$. The solutions $u \in \mathcal{D}'(\alpha, \beta)$ of the first order linear differential equation
$$u' + au = f$$
are the classical solutions $u \in \mathcal{E}^1(\alpha, \beta)$. This fact still holds for linear systems:
$$u' + Au = f$$
($A = \{a_i^j\}_{i,j=1}^n$ an $n \times n$ matrix, $f = (f_1, \ldots, f_n) \in \mathcal{C}(\alpha, \beta)^n$, $u \in \mathcal{D}'(\alpha, \beta)^n$).

Proof. If $a = 0$ and $F \in \mathcal{E}^1(\alpha, \beta)$ is a primitive of $f$, then $(u - F)' = u' - F' = 0$ and $u - F = C$. Hence, $u = F + C$ is a $C^1$ function.

In the general numerical case, if $\int a$ is a primitive of $a$, then the function $e = \exp\int a$ is a $C^\infty$ solution of $e' = ea$ and, for any solution $u$ of $u' + au = f$,
$$(eu)' = e'u + eu' = e(u' + au) = ef,$$
which is continuous. Then $eu$ is a $C^1$ function and so is $u = e^{-1}(eu)$.

In the vector case we follow the same model with $A = \{a_i^j\}_{i,j=1}^n$. Then $e = \exp\int A = \sum_{k=0}^\infty (1/k!)\big(\int A\big)^k$ is a square matrix with an inverse $e^{-1}$ that allows us to write again $u = e^{-1}(eu)$.
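The integrating-factor step of this proof, $(eu)' = ef$ with $e = \exp\int a$, can be illustrated on a concrete scalar equation. The sketch below assumes the hypothetical data $a(t) = t$, $f(t) = t$ and $u(0) = 3$, for which the closed form $u(t) = 1 + 2e^{-t^2/2}$ is available for comparison; the helper names are illustrative.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

a_coef = lambda t: t                   # coefficient a(t) (assumption)
f_rhs = lambda t: t                    # right-hand side f(t) (assumption)
prim_a = lambda t: 0.5 * t * t         # primitive of a vanishing at 0
e = lambda t: math.exp(prim_a(t))      # integrating factor, e' = e a

def u(t, u0=3.0):
    """u = e^{-1} (u0 + int_0^t e f), the solution with u(0) = u0."""
    return (u0 + simpson(lambda s: e(s) * f_rhs(s), 0.0, t)) / e(t)

def residual(t, h=1e-4):
    """u'(t) + a(t) u(t) - f(t), with u' from a central difference."""
    du = (u(t + h) - u(t - h)) / (2.0 * h)
    return du + a_coef(t) * u(t) - f_rhs(t)

exact = lambda t: 1.0 + 2.0 * math.exp(-0.5 * t * t)
print(u(1.0), exact(1.0), residual(0.7))
```

The computed `u` agrees with the closed form and the residual of $u' + au = f$ is at the level of the finite-difference error, which is all the argument needs: $u$ is recovered from the $C^1$ function $eu$ by multiplying by the smooth factor $e^{-1}$.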

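Corollary 6.24. If $f \in \mathcal{C}(\alpha, \beta)$ and $a_0, a_1, \ldots, a_{n-1} \in \mathcal{E}(\alpha, \beta)$, then the distributional solutions of the linear equation
$$u^{(n)} + a_{n-1}u^{(n-1)} + \cdots + a_1 u' + a_0 u = f$$
are the classical solutions $u \in \mathcal{E}^n(\alpha, \beta)$.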

Proof. With the usual procedure, we denote $u_k = u^{(k-1)}$ $(k = 1, \ldots, n)$ to obtain the first order linear system
$$u_n' + a_{n-1}u_n + \cdots + a_1 u_2 + a_0 u_1 = f, \quad u_{n-1}' - u_n = 0, \ \ldots, \ u_1' - u_2 = 0$$
and then we apply Theorem 6.23.


Such a regularity property does not hold in the several variables case, and we have an example in Exercise 6.35. A differential operator with constant coefficients on $\mathbf{R}^n$,
$$P(D) = \sum_{|\alpha| \le N} a_\alpha D^\alpha,$$
is said to be hypoelliptic6 whenever, for every open set $\Omega \subset \mathbf{R}^n$, if $f \in \mathcal{E}(\Omega)$ and $u$ is a distributional solution $u \in \mathcal{D}'(\Omega)$ of $P(D)u = f$, $u$ must also belong to $\mathcal{E}(\Omega)$. There is a characterization of all the polynomials $P$ such that $P(D)$ is hypoelliptic due to Hörmander (see, for instance, Yosida's "Functional Analysis" [44]). Here we are going to prove a sufficient condition in terms of what is known as a fundamental solution.

6.5.2. Fundamental solutions. On $\mathbf{R}^n$, a fundamental solution of a differential operator with constant coefficients
$$P(D) = \sum_{|\alpha| \le N} a_\alpha D^\alpha$$
is a distributional solution of the equation $P(D)u = \delta$. The interest in having a fundamental solution, $E$, is due to the fact that, if $v$ is any distribution and $E$ or $v$ has a compact support, so that $v * E$ is well-defined, then

(6.6) $P(D)(E * v) = (P(D)E) * v = \delta * v = v$

and $E * v$ is a solution of $P(D)u = v$. By the Malgrange-Ehrenpreis theorem, every differential operator with constant coefficients has a fundamental solution.7 In the following pages we are going to show a few important concrete examples.

Theorem 6.25. On $\mathbf{R}$, the ordinary linear operator with constant coefficients
$$P(D)u = u^{(n)} + a_1 u^{(n-1)} + \cdots + a_{n-1}u' + a_n u$$
has the fundamental solution $E = fY$, if $f \in \mathcal{E}(\mathbf{R})$ is the solution of

$$P(D)f = 0, \quad f^{(n-1)}(0) = 1, \quad f^{(n-2)}(0) = \cdots = f'(0) = f(0) = 0.$$

6 If the characteristic polynomial $P_N(x) = \sum_{|\alpha| = N} a_\alpha x^\alpha$ does not vanish at any point $x \neq 0$, $P(D)$ is said to be elliptic; an example is the Laplacian. Every elliptic operator is hypoelliptic (see [14]).
7 A proof due to Hörmander can be found in Yosida's "Functional Analysis" [44] or in Rudin's "Functional Analysis" [38].


Proof. The derivative of $E = fY$ is $E' = f\delta + f'Y = f(0)\delta + f'Y = f'Y$. Then $E'' = f'(0)\delta + f''Y = f''Y$ and, by induction,
$$E^{(n)} = f^{(n-1)}(0)\delta + f^{(n)}Y = \delta + f^{(n)}Y.$$
Hence,
$$P(D)E = \delta + f^{(n)}Y + a_1 f^{(n-1)}Y + \cdots + a_{n-1}f'Y + a_n fY = \delta + Y\,P(D)f = \delta$$
as announced.
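As a concrete instance of Theorem 6.25, take the hypothetical operator $P(D)u = u'' + u$, for which $f = \sin$ satisfies $P(D)f = 0$, $f(0) = 0$, $f'(0) = 1$, so that $E = Y\sin$. Since this operator coincides with its transpose, $P(D)E = \delta$ means $\int_0^\infty (\varphi'' + \varphi)(t)\sin t\,dt = \varphi(0)$ for every test function $\varphi$. The sketch below checks this for one bump function; all helper names are illustrative assumptions.

```python
import math

def bump(t):
    """Test function exp(-1/(1 - t^2)) on (-1, 1), extended by zero."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def bump_second(t):
    """Classical second derivative of bump: phi'' = phi * (g'^2 + g'')."""
    if abs(t) >= 1.0:
        return 0.0
    s = 1.0 - t * t
    g1 = -2.0 * t / s**2                     # derivative of g = -1/(1 - t^2)
    g2 = -2.0 / s**2 - 8.0 * t * t / s**3    # second derivative of g
    return bump(t) * (g1 * g1 + g2)

def simpson(g, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

# <phi, P(D)E> = <phi'' + phi, Y sin>; the integrand vanishes for t >= 1.
pairing = simpson(lambda t: (bump_second(t) + bump(t)) * math.sin(t), 0.0, 1.0)

print(pairing, bump(0.0))
```

Both printed values should coincide with $\varphi(0) = e^{-1}$ up to quadrature error.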

As a first example in several variables, let us look for a fundamental solution of the Laplacian
$$\Delta \equiv \partial_1^2 + \cdots + \partial_n^2$$
on $\mathbf{R}^n$, which is the most important of all differential operators. To this end, Green's identities8

(6.7) $\int_\Omega v(x)\Delta u(x)\,dx + \int_\Omega \nabla u(x) \cdot \nabla v(x)\,dx = \int_S v\,\partial_\nu u\,d\sigma$

and

(6.8) $\int_\Omega v(x)\Delta u(x)\,dx - \int_\Omega u(x)\Delta v(x)\,dx = \int_S (v\,\partial_\nu u - u\,\partial_\nu v)\,d\sigma$

are useful. Here $\Omega \subset \mathbf{R}^n$ is a regular domain for the divergence theorem, $u, v$ are two $C^2$ functions on $\bar\Omega$, and $\nu$ denotes the outer normal vector field on the positively oriented boundary $S$ of $\Omega$. Note that (6.8) follows from (6.7), and (6.7) is obtained by taking $w = v\nabla u$ in the divergence theorem
$$\int_\Omega (\operatorname{Div} w)(x)\,dx = \int_S w \cdot \nu\,d\sigma.$$
By the spherical symmetry of $\Delta$, we are led to try a radial function

$$E_n(x) = \psi(r), \quad r = |x|,$$
such that $\Delta E_n = 0$ on $r > 0$ as a possible fundamental function. In this case,
$$\Delta E_n(x) = \psi''(r) + \frac{n-1}{r}\psi'(r),$$
obtained from the chain rule of differentiation. Then, the radial solutions of $\Delta E_n = 0$ on $r > 0$ are the $C^\infty$ functions given by

(6.9) $\psi(r) = C\log r$ if $n = 2$, and $\psi(r) = \dfrac{C}{2-n}\,r^{2-n}$ if $n > 2$,

8 In 1828, G. Green included these identities in "An Essay on the Application of Mathematical Analysis to the Theories of Electricity and Magnetism". See footnote 17 in Chapter 2.

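plus an additive constant.

Denote by $\omega_{n-1}$ the area of the unit sphere $S^{n-1} \subset \mathbf{R}^n$:
$$\omega_0 = 2, \quad \omega_1 = 2\pi, \quad \omega_2 = 4\pi, \quad \omega_{n-1} = \frac{2\pi^{n/2}}{\Gamma(n/2)}.$$

Theorem 6.26. The function $E_n$ defined on $\mathbf{R}^n \setminus \{0\}$ by

(6.10) $E_n(x) = \dfrac{1}{\omega_1}\log|x|$ if $n = 2$, and $E_n(x) = -\dfrac{1}{(n-2)\omega_{n-1}}|x|^{2-n}$ if $n \neq 2$,

is a locally integrable fundamental solution for the Laplacian, and it is of class $C^\infty$ on $\mathbf{R}^n \setminus \{0\}$.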

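Proof. Polar integration shows that $E_n$ is integrable on $B(0,1)$, and it is clearly a $C^\infty$ function on $\mathbf{R}^n \setminus \{0\}$. For $\varphi \in \mathcal{D}(\mathbf{R}^n)$, we require that

(6.11) $\int_{\mathbf{R}^n} (\Delta\varphi)(x)E_n(x)\,dx = \varphi(0)$.

Since $E_n$ is singular at $x = 0$, we cut out from $\mathbf{R}^n$ a small ball $\bar B(0,r)$ and, since $\varphi$ is supported by some ball |x|

(6.12) $\lim_{r \downarrow 0} \int_{\Omega_r} E_n(x)\Delta\varphi(x)\,dx = \varphi(0)$.

We apply Green's identity (6.8) with $v(x) = E_n(x) = \psi(r)$ given by (6.9) and $u = \varphi$. Since $\Delta v = 0$ on $\Omega_r$ and $\varphi = 0$ on a neighborhood of the outer boundary $|x| = R$ of $\Omega_r$, we have

$$\int_{\Omega_r} E_n\Delta\varphi\,dx = \int_{\{|x| = r\}} (E_n\partial_\nu\varphi - \varphi\,\partial_\nu E_n)\,d\sigma.$$
On $|x| = r$ the exterior normal points towards 0 and $E_n(x) = \psi(r)$. Consequently, again using Green's identities with $v = 1$ and $u = \varphi$,
$$\int_{\Omega_r} E_n\Delta\varphi\,dx = \psi(r)\int_{\{|x| = r\}} \partial_\nu\varphi\,d\sigma + \psi'(r)\int_{\{|x| = r\}} \varphi\,d\sigma = -\psi(r)\int_{B(0,r)} \Delta\varphi + Cr^{1-n}\int_{\{|x| = r\}} \varphi\,d\sigma.$$
Here $|B(0,r)| = Ar^n$ and $\lim_{r\to 0} r^n\psi(r) = 0$, so that, from the continuity of $\varphi$,

$$\lim_{r \downarrow 0} \int_{\Omega_r} E_n(x)\Delta\varphi(x)\,dx = C\omega_{n-1}\varphi(0)$$
and we obtain (6.12) by taking $C = 1/\omega_{n-1}$.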
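Identity (6.11) can be tested numerically for a radial test function, for which polar integration reduces the pairing to $\omega_{n-1}\int_0^\infty (\varphi'' + \frac{n-1}{r}\varphi')(r)\,\psi(r)\,r^{n-1}\,dr$. The following sketch, with the illustrative helpers `sphere_area`, `radial_E` and `pairing` (names not taken from the text), does this for $n = 2, 3, 4$ with the bump profile $\varphi(r) = \exp(-1/(1-r^2))$.

```python
import math

def sphere_area(n):
    """omega_{n-1}: area of the unit sphere in R^n, 2 pi^{n/2} / Gamma(n/2)."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)

def radial_E(n, r):
    """Radial profile of E_n from (6.10), for r = |x| > 0 and n >= 2."""
    if n == 2:
        return math.log(r) / sphere_area(2)
    return -r ** (2 - n) / ((n - 2) * sphere_area(n))

def bump(r):
    return math.exp(-1.0 / (1.0 - r * r)) if abs(r) < 1.0 else 0.0

def bump_prime(r):
    if abs(r) >= 1.0:
        return 0.0
    s = 1.0 - r * r
    return bump(r) * (-2.0 * r / s**2)

def bump_second(r):
    if abs(r) >= 1.0:
        return 0.0
    s = 1.0 - r * r
    g1 = -2.0 * r / s**2
    g2 = -2.0 / s**2 - 8.0 * r * r / s**3
    return bump(r) * (g1 * g1 + g2)

def simpson(g, a, b, n=4000):
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

def pairing(n):
    """Integral of (Laplacian phi) * E_n over R^n for the radial bump phi."""
    def integrand(r):
        if r == 0.0:
            return 0.0   # the integrand tends to 0 as r -> 0 for n >= 2
        lap = bump_second(r) + (n - 1) * bump_prime(r) / r
        return sphere_area(n) * lap * radial_E(n, r) * r ** (n - 1)
    return simpson(integrand, 0.0, 1.0)

print([pairing(n) for n in (2, 3, 4)], bump(0.0))
```

In each dimension the pairing reproduces $\varphi(0) = e^{-1}$, which is consistent with the normalization $C = 1/\omega_{n-1}$.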


Remark 6.27. It will follow from Theorem 6.31 that $\Delta$ is hypoelliptic, a result known as Weyl's lemma.

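The behavior of this fundamental solution will allow us to show that (6.6) gives a solution of the Poisson equation,9 $\Delta u = f$, for any $f \in L^1$, if $n \ge 3$. In Exercise 6.37 we consider the case $n = 2$.

Theorem 6.28. If $f \in L^1(\mathbf{R}^n)$ and $n > 2$, then $u := E_n * f$ is a well-defined locally integrable function which is a distributional solution of $\Delta u = f$.

Proof. Define $E_0 = \chi_{B(0,1)}E_n \in L^1(\mathbf{R}^n)$ and $E_\infty = E_n - E_0 \in L^\infty(\mathbf{R}^n)$. Then $E_0 * f \in L^1(\mathbf{R}^n)$ and $E_\infty * f \in L^\infty(\mathbf{R}^n)$, so that $u = E_n * f$ is well-defined and locally integrable. Now consider
$$f^N = \chi_{B(0,N)}f \quad\text{and}\quad u^N = E_n * f^N,$$
so that, by a standard application of the dominated convergence theorem, $u^N \to u$ in $\mathcal{D}'(\mathbf{R}^n)$ and also $\Delta u^N \to \Delta u$ in $\mathcal{D}'(\mathbf{R}^n)$. But $\Delta u^N = f^N$, since

$$\langle \varphi, \Delta u^N \rangle = \langle \Delta\varphi, E_n * f^N \rangle = \langle \tilde f^N * \Delta\varphi, E_n \rangle = \langle \Delta(\tilde f^N * \varphi), E_n \rangle = \langle \tilde f^N * \varphi, \Delta E_n \rangle = (\tilde f^N * \varphi)(0) = \langle \varphi, f^N \rangle,$$
and we are led to $\Delta u = \lim_N \Delta u^N = \lim_N f^N = f$.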

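There are similar results for the heat operator10:

Theorem 6.29. The function $\Gamma$ defined on $\mathbf{R}^{1+n}$ by

(6.13) $\Gamma(t,x) = (4c\pi t)^{-n/2}e^{-|x|^2/(4ct)}$ if $t > 0$, and $\Gamma(t,x) = 0$ if $t \le 0$,

is a locally integrable fundamental solution, of class $C^\infty$ on $\mathbf{R}^{1+n} \setminus \{0\}$, for the operator
$$L \equiv \partial_t - c\Delta = \partial_t - c\sum_{j=1}^n \partial_{x_j}^2.$$
Here $c > 0$.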

9 Named after Siméon-Denis Poisson, who in 1812 discovered that Laplace's equation $\Delta u = 0$ of potential theory is valid only outside a solid.
10 In Chapter 7, the Fourier transform will show that $\Gamma(t,x)$ is a natural fundamental function for this operator.


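Proof. Note that, if $t > 0$, after a change of variable, we obtain
$$\int_{\mathbf{R}^n} \Gamma(t,x)\,dx = \frac{1}{(2\pi)^{n/2}}\int_{\mathbf{R}^n} e^{-|u|^2/2}\,du = 1$$
and $\Gamma$ is locally integrable on $\mathbf{R}^{1+n}$. For every test function $\varphi$,
$$\langle \varphi, \partial_t\Gamma \rangle = -\lim_{\varepsilon\to 0} \int_\varepsilon^\infty\!\!\int_{\mathbf{R}^n} \partial_t\varphi(t,x)\Gamma(t,x)\,dx\,dt$$
and, by partial integration,

(6.14) $\langle \varphi, \partial_t\Gamma \rangle = \lim_{\varepsilon\to 0} \Big( \int_\varepsilon^\infty\!\!\int_{\mathbf{R}^n} \varphi(t,x)\partial_t\Gamma(t,x)\,dx\,dt + \int_{\mathbf{R}^n} \varphi(\varepsilon,x)\Gamma(\varepsilon,x)\,dx \Big)$.

Here $\Gamma(t,x) = t^{-n/2}\Gamma(1, t^{-1/2}x)$ and the substitution $x = u\sqrt{\varepsilon}$ gives
$$\int_{\mathbf{R}^n} \varphi(\varepsilon,x)\Gamma(\varepsilon,x)\,dx = \int_{\mathbf{R}^n} \varphi(\varepsilon, u\sqrt{\varepsilon})\Gamma(1,u)\,du \to \varphi(0)\int_{\mathbf{R}^n} \Gamma(1,u)\,du = \varphi(0)$$
when $\varepsilon \downarrow 0$, by dominated convergence. A direct computation shows that, on $t > 0$,

$$\partial_t\Gamma(t,x) = c\Delta\Gamma(t,x)$$
and integration by parts proves that

$$\int_{\{t > \varepsilon\}}\int_{\mathbf{R}^n} \varphi(t,x)\partial_t\Gamma(t,x)\,dx\,dt = \int_{\{t > \varepsilon\}}\int_{\mathbf{R}^n} c\Delta\varphi(t,x)\Gamma(t,x)\,dx\,dt$$
and (6.14) becomes

$$\langle \varphi, \partial_t\Gamma \rangle = \langle \varphi, c\Delta\Gamma \rangle + \varphi(0),$$
which means that $L\Gamma = \delta$. It is also clear that $\Gamma$ is $C^\infty$ away from the origin, since the partial derivatives vanish as $t \downarrow 0$ when $x \neq 0$.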
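Two facts used in this proof, the unit mass of $\Gamma(t,\cdot)$ and the identity $\partial_t\Gamma = c\Delta\Gamma$ on $t > 0$, are easy to probe numerically in one space dimension. The sketch below is only illustrative: the values of `c`, `t0`, `x0` and the step `h` are arbitrary assumptions, and the derivatives are approximated by central differences.

```python
import math

def heat_kernel(t, x, c):
    """Gamma(t, x) from (6.13) in one space dimension (n = 1)."""
    if t <= 0.0:
        return 0.0
    return math.exp(-x * x / (4.0 * c * t)) / math.sqrt(4.0 * c * math.pi * t)

def simpson(g, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

c, t0, x0, h = 2.0, 0.5, 0.3, 1e-3   # hypothetical sample values

# Mass of Gamma(t0, .); the tails beyond |x| = 20 are negligible here.
mass = simpson(lambda x: heat_kernel(t0, x, c), -20.0, 20.0)

# Central differences for d/dt and d^2/dx^2 at the sample point.
dt = (heat_kernel(t0 + h, x0, c) - heat_kernel(t0 - h, x0, c)) / (2.0 * h)
dxx = (heat_kernel(t0, x0 + h, c) - 2.0 * heat_kernel(t0, x0, c)
       + heat_kernel(t0, x0 - h, c)) / (h * h)
residual = dt - c * dxx

print(mass, residual)
```

The mass is 1 up to quadrature error and the residual is of the order of the finite-difference truncation error, in line with $L\Gamma = 0$ away from the origin.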

This fundamental solution is zero for $t \le 0$ and, as for the Laplacian, we have

Theorem 6.30. If $f \in L^1(\mathbf{R}^{1+n})$, then $u := \Gamma * f$ is a well-defined locally integrable function which is a distributional solution of the heat equation,

$$Lu \equiv \partial_t u - c\Delta u = f.$$

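Proof. Denote $\Gamma_t(x) = \Gamma(t,x)$ and $f_t(x) = f(t,x)$. To prove that
$$u(t,x) = \int_{-\infty}^t\int_{\mathbf{R}^n} \Gamma_{t-s}(x - y)f_s(y)\,dy\,ds$$
is well-defined as a distribution, we apply Young's inequality to the convolution $\Gamma_{t-s} * f_s$ to obtain
$$\sup_{t \in \mathbf{R}} \int_{\mathbf{R}^n} |u(t,x)|\,dx \le \int_{-\infty}^\infty\int_{\mathbf{R}^n} |f(s,x)|\,dx\,ds = \|f\|_1$$
and we conclude that $u$ is defined a.e. and locally integrable. Now, as in Theorem 6.28, we define
$$f^N = \chi_{B(0,N)}f \quad\text{and}\quad u^N = \Gamma * f^N,$$
so that $u^N \to u$ in $\mathcal{D}'(\mathbf{R}^{1+n})$ and also $Lu^N \to Lu$ in $\mathcal{D}'(\mathbf{R}^{1+n})$. But again $Lu^N = f^N$, since, with $L' = -\partial_t - c\Delta$ the transpose of $L$,
$$\langle \varphi, Lu^N \rangle = \langle L'\varphi, \Gamma * f^N \rangle = \langle \tilde f^N * L'\varphi, \Gamma \rangle = \langle L'(\tilde f^N * \varphi), \Gamma \rangle = \langle \tilde f^N * \varphi, L\Gamma \rangle = (\tilde f^N * \varphi)(0) = \langle \varphi, f^N \rangle.$$

Thus $Lu = \lim_N Lu^N = \lim_N f^N = f$.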

The regularity of the solutions of both Poisson and heat equations appears as special cases of the following theorem about operators that have a fundamental function which is $C^\infty$ away from zero. We will use the fact that, if $f \in \mathcal{E}(\Omega)$ and $u \in \mathcal{D}'(\Omega)$, then
$$D^\alpha(fu) = fD^\alpha u + \sum_{0 < |\beta| \le |\alpha|} C_\beta D^\beta f\,D^{\alpha-\beta}u = fD^\alpha u + v_\alpha,$$
which follows from the Leibniz formula. Note that, if $f$ is constant on an open set $G \subset \Omega$, then $v_\alpha = 0$ on $G$, since $D^\beta f = 0$ on this open set for every $\beta \neq 0$.

Of course, by linearity, it follows that

(6.15) $P(D)(fu) = fP(D)u + v$

and $v = 0$ on $G$ if $f$ is constant on $G$, for every differential operator with constant coefficients $P(D)$.

Theorem 6.31. If a differential operator with constant coefficients P (D) has a fundamental solution which is of class C∞ on Rn \{0},thenP (D) is hypoelliptic.

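Proof. Suppose that $P(D)E = \delta$ with $E$ of class $C^\infty$ on $\{0\}^c$ and that $P(D)u = f$ on an open set $\Omega \subset \mathbf{R}^n$, with $f \in \mathcal{E}(\Omega)$ and $u \in \mathcal{D}'(\Omega)$. We start by showing that every compact set $K \subset \Omega$ has an open neighborhood $G \subset \Omega$ where $u$ is $C^\infty$.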


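We select a second compact set $K(2\delta) = K + \bar B(0, 2\delta) \subset \Omega$ and we choose $G = K + B(0,\delta)$. Then let $K(2\delta) \prec \varrho \prec \Omega$ and $\bar B(0,\delta/2) \prec \gamma \prec B(0,\delta)$, and consider the compactly supported distributions $\varrho u$ and $\gamma E$. As in (6.15),
$$P(D)(\varrho u) = \varrho P(D)u + v = \varrho f + v,$$
where $v \in \mathcal{D}'(\Omega)$ and $\operatorname{supp} v \cap K(2\delta) = \emptyset$, since $v$ is zero on a neighborhood of $K(2\delta)$. Also
$$P(D)(\gamma E) = \gamma P(D)E + h = \delta + h$$
and $h \in \mathcal{D}(\mathbf{R}^n)$ since $E$ is $C^\infty$ on $\{0\}^c$ and $h = 0$ on $B(0,\delta/2)$. Then
$$P(D)(\gamma E * \varrho u) = \gamma E * \varrho f + \gamma E * v \quad\text{and}\quad P(D)(\gamma E * \varrho u) = \varrho u + h * \varrho u,$$
so that
$$\varrho u = -h * \varrho u + \gamma E * \varrho f + \gamma E * v = \varphi + \gamma E * v, \quad \varphi \in \mathcal{D}(\mathbf{R}^n),$$
with $\operatorname{supp}(\gamma E * v) \subset \bar B(0,\delta) + \operatorname{supp} v \subset G^c$; thus $u = \varphi$ on $G$.

By considering an exhaustive increasing sequence $K_m$ of compact sets in $\Omega$ as in (3.2) and $K_m \subset G_m \subset \operatorname{Int} K_{m+1}$ so that $u$ is a $C^\infty$ function $g_m$ on $G_m$, we have that $g_{m+1} = u = g_m$ on $G_m$, so that $u = g$ on $\Omega$ if $g$ is the common extension to $\Omega$ of the functions $g_m$.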

Now the regularity of the solutions of the Poisson equation $\Delta u = f$ is an obvious corollary of the preceding results:

Theorem 6.32. Let $\Omega$ be an open subset of $\mathbf{R}^n$. If $f \in \mathcal{E}(\Omega)$ and if $u \in \mathcal{D}'(\Omega)$ is a distributional solution of $\Delta u = f$, then $u \in \mathcal{E}(\Omega)$.

Of course, there is the analogous result for the heat operator, but the situation is not the same for the wave operator, i.e.,
$$\Box \equiv \partial_t^2 - \Delta = \partial_t^2 - \sum_{j=1}^n \partial_{x_j}^2$$
on $\mathbf{R}^{1+n}$. This operator is no longer hypoelliptic and it is more involved. Here we describe only the elementary case $n = 1$, i.e.,
$$\Box \equiv \partial_t^2 - \partial_x^2.$$


After the change of variables $s = t - x$, $y = t + x$ we are led to the operator $\partial_s\partial_y$, since then $f(s,y) = f(t - x, t + x) = g(t,x)$ shows that
$$4\partial_s\partial_y f(s,y) = (\partial_t^2 - \partial_x^2)g(t,x).$$
We must find a locally integrable solution $F(s,y)$ of

$$4\partial_s\partial_y F(s,y) = \delta(s,y),$$
and the simplest one, obtained by separating variables, is
$$F(s,y) = \frac14 Y(s)Y(y) = \frac14\chi_{\{s,y \ge 0\}}(s,y).$$
Our candidate is then $E(t,x) = cY(t - x)Y(t + x)$ with $c$ such that
$$\varphi(0,0) = \int_{\mathbf{R}^2} E(t,x)\,\Box\varphi(t,x)\,dt\,dx = \frac{c}{2}\int_{\mathbf{R}^2} Y(y)Y(s)\,4\partial_s\partial_y\varphi\Big(\frac{s+y}{2}, \frac{y-s}{2}\Big)\,ds\,dy = 2c\,\varphi(0,0).$$
Thus
$$E(t,x) = \frac12 Y(t - x)Y(t + x);$$
that is, $E = 1/2$, constant in the sector $t + x \ge 0$, $t - x \ge 0$, and 0 elsewhere. This shows that the operator is not hypoelliptic.
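The normalization $c = 1/2$ can be checked by pairing $E$ with $\Box\varphi$ for a product test function. The sketch below assumes $\varphi(t,x) = b(t)b(x)$ with $b$ the usual bump function (an illustrative choice) and integrates $\tfrac12\Box\varphi$ over the sector $t \ge |x|$ with nested Simpson rules; by the symmetry in $x$ only $x \ge 0$ is integrated and the result doubled.

```python
import math

def bump(t):
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def bump_second(t):
    """Classical second derivative of bump."""
    if abs(t) >= 1.0:
        return 0.0
    s = 1.0 - t * t
    g1 = -2.0 * t / s**2
    g2 = -2.0 / s**2 - 8.0 * t * t / s**3
    return bump(t) * (g1 * g1 + g2)

def simpson(g, a, b, n=300):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

def box_phi(t, x):
    """(d_t^2 - d_x^2) applied to phi(t, x) = bump(t) * bump(x)."""
    return bump_second(t) * bump(x) - bump(t) * bump_second(x)

def inner(x):
    # Integrate in t over x <= t <= 1; phi vanishes for t >= 1.
    return simpson(lambda t: box_phi(t, x), x, 1.0)

# <box phi, E> with E = 1/2 on the sector t >= |x|, using evenness in x.
pairing = 0.5 * 2.0 * simpson(inner, 0.0, 1.0)

print(pairing, bump(0.0) ** 2)
```

The pairing matches $\varphi(0,0) = b(0)^2 = e^{-2}$ up to quadrature error, as expected from $\Box E = \delta$.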

6.5.3. Green’s functions. For the boundary value problem Lu ≡ (pu) − qu = g (0


Note that, as a distribution on (a, b), LG(·,ξ)=δξ, which will also be represented by LxG(x, ξ)=δξ(x).

Indeed, for every ϕ ∈D(a, b), and if we denote Gξ = G(·,ξ), we can write   ϕ, LGξ = (pϕ ) ,Gξ + ϕ, qGξ, where, since supp ϕ ⊂ (a, b) and Gξ is continuous, integration by parts produces ξ− b     − −   (pϕ ) ,Gξ = ϕ pGξ. a ξ+ Finally we integrate by parts once again to obtain ξ− b     −  −  −  − ϕ, LGξ = ϕ, qGξ + ϕ(pGξ) (ϕpGξ)(ξ )+(ϕpGξ)(ξ ) a ξ+ ξ− b − −  − −  = (ϕLGξ) (ϕp)(ξ)(Gξ(ξ ) Gξ(ξ+)) a ξ+ 1 =(ϕp)(ξ) = ϕ(ξ)=ϕ, δ . p(ξ) ξ On the space 2 {u ∈C [a, b]; B1(u)=0,B2(u)=0}, b   the operator Tf(x)= a G(x, ξ)f(ξ) dξ = f(ξ),G(x, ξ) solves Lu = f. In a more general setting, we say that G(x, ξ)isaGreen’s function for a differential operator L on an open set Ω ⊂ Rn if G(·,ξ) ∈D(Ω) and LG(·,ξ)=δξ ∈ for every ξ Ω. Formally, if u(x)=f(ξ),G(x, ξ) = G(x, ξ)f(ξ) dξ, we obtain Ω

$$Lu(x) = \int_\Omega LG(x,\xi)f(\xi)\,d\xi = \langle f(\xi), \delta_x(\xi) \rangle = f(x)$$
and $Tf(x) := \langle f(\xi), G(x,\xi) \rangle$ solves $Lu = f$.

Note that if $E$ is a fundamental solution for a linear operator with constant coefficients $P(D)$, then $G(x,\xi) = E(x - \xi)$ is a Green's function for $P(D)$ on $\Omega = \mathbf{R}^n$, since
$$P(D)G(\cdot,\xi) = P(D)\tau_\xi E = \tau_\xi P(D)E = \tau_\xi\delta = \delta_\xi$$
and the formal solution is $u(x) = \int_\Omega G(x,\xi)f(\xi)\,d\xi = E * f$.

By requiring $G(\cdot,\xi)$ to satisfy some linear conditions that determine a subspace $H$ of $\mathcal{D}'(\Omega)$, we can try to solve $Lu = f$ with $u \in H$.


This was the case in Theorem 2.53 for a boundary value problem. As a very important example, let us describe a method to find a Green's function for the Dirichlet problem

(6.16) $\Delta u = f$, $u|_{\partial\Omega} = 0$

on a bounded open set $\Omega \subset \mathbf{R}^n$. For such a Green's function, $G$, the requirements are

(6.17) $\Delta G_\xi = \delta_\xi$, $G_\xi \in \mathcal{C}(\bar\Omega)$, $G_\xi(y) = 0$ $\forall y \in \partial\Omega$

for every $\xi \in \Omega$, where we denote $G_\xi(x) = G(x,\xi)$.

The uniqueness of such a function will follow from Theorem 7.37, and the existence will be proved for instance if for every $\xi \in \Omega$ we can solve the Dirichlet problem

$$\Delta g_\xi = 0 \text{ on } \Omega, \quad g_\xi \in \mathcal{C}(\bar\Omega), \quad g_\xi(y) = E_n(y - \xi) \ \forall y \in \partial\Omega,$$
with $E_n$ the fundamental solution defined in (6.10), since by defining

$$G(x,\xi) = E_n(x - \xi) - g_\xi(x),$$

$G_\xi$ satisfies (6.17). From the condition $G_\xi = 0$ on $\partial\Omega$, the possible solution
$$u(x) = \int_\Omega G(x,\xi)f(\xi)\,d\xi$$
of $\Delta u = f$ will also satisfy $u(y) = \int_\Omega G(y,\xi)f(\xi)\,d\xi = 0$ when $y \in \partial\Omega$.

Similar heuristic arguments can be used to guess a method to solve the homogeneous Dirichlet problem on $\Omega$ with inhomogeneous boundary conditions

(6.18) $\Delta u = 0$, $u|_S = f$.

It can be shown that, if the bounded open set $\Omega$ has a $C^\infty$ boundary, the Green's function $G$ exists, $G(x,y) = G(y,x)$, $G(x,\cdot)$ extends to $G(x,\cdot) \in C^\infty(\bar\Omega \setminus \{x\})$ for every $x \in \Omega$, and
$$u(x) = \int_\Omega G(x,y)f(y)\,dy$$
is the solution of (6.16). If $u$ solves (6.18), since $\Delta u = 0$, a formal application of Green's identity (6.8) leads to
$$u(x) = \int_\Omega u(y)\delta(x - y)\,dy = \int_\Omega \big( u(y)\Delta G_x(y) - G_x(y)\Delta u(y) \big)\,dy = \int_S u(y)\,\partial_{\nu(y)}G_x(y)\,d\sigma(y) = \int_S f(y)\,\partial_{\nu(y)}G_x(y)\,d\sigma(y).$$


The function

$$P(x,y) = \partial_{\nu(y)}G(x,y) \quad ((x,y) \in \Omega \times S)$$
is called the Poisson kernel for $\Omega$, and
$$u(x) = \int_S P(x,y)f(y)\,d\sigma(y)$$
is the Poisson integral, a candidate for the solution of the Dirichlet problem (6.18).

Usually it is hard to find Green's functions and Poisson kernels. We restrict ourselves here to the very special but important case of the ball $B = \{x \in \mathbf{R}^n;\ |x| < 1\}$ and we refer to Folland's "Introduction to Partial Differential Equations" [14] for further information on this subject.

6.5.4. Green's function of the Dirichlet problem in the ball. We write
$$E(x,y) = E(y,x) = E_n(x - y) = \begin{cases} \dfrac{1}{2\pi}\log|x - y| & \text{if } n = 2, \\[2mm] -\dfrac{1}{(n-2)\omega_{n-1}}|x - y|^{2-n} & \text{if } n \neq 2 \end{cases}$$
and note that, if $x \neq 0$ is given, the function
$$g_x := |x|^{2-n}E\Big(\frac{x}{|x|^2}, \cdot\Big)$$
is harmonic on $\mathbf{R}^n \setminus \{0, x/|x|^2\}$. We claim that $g_x(y) = E(x,y)$ if $y \in S = \partial B$. Indeed, if $n > 2$ and $|y| = 1$,
$$E(x,y) - g_x(y) = -\frac{1}{(n-2)\omega_{n-1}}\Big( |x - y|^{2-n} - \big| |x|^{-1}x - |x|y \big|^{2-n} \Big)$$
and

(6.19) $\big| |x|^{-1}x - |x|y \big| = |x - y|$

since
$$|x - y|^2 = |x|^2 - 2x \cdot y + 1 = \big| |x|y \big|^2 - 2|x|^{-1}x \cdot (|x|y) + \big| |x|^{-1}x \big|^2 = \big| |x|y - |x|^{-1}x \big|^2.$$
We define
$$G(x,y) := E(x,y) - g_x(y) = -\frac{1}{(n-2)\omega_{n-1}}\Big( |x - y|^{2-n} - \big| |x|^{-1}x - |x|y \big|^{2-n} \Big)$$
if $x \neq 0$ and
$$G(0,y) := -\frac{1}{(n-2)\omega_{n-1}}\big( |y|^{2-n} - 1 \big).$$

Then $G$ satisfies all the required properties. If $n = 2$,

(6.20) $G(x,y) := \dfrac{1}{2\pi}\Big( \log|x - y| - \log\big| |x|^{-1}x - |x|y \big| \Big)$

when $x \neq 0$, and
$$G(0,y) := \frac{1}{2\pi}\log|y|.$$
We can compute the Poisson kernel for the ball,

$$P(x,y) = \partial_{\nu(y)}G(x,y) \quad ((x,y) \in B \times S)$$
since $\nu(y) = y$ is the normal vector to the sphere $S$ and the normal derivative on $S$ is $\partial_{\nu(y)} = \sum_{j=1}^n y_j\partial_{y_j}$. Thus, when $n > 2$,
$$P(x,y) = -\frac{1}{\omega_{n-1}}\sum_{j=1}^n \Big( \frac{y_j(x_j - y_j)}{|x - y|^n} - \frac{|x|y_j\big(|x|^{-1}x_j - y_j|x|\big)}{\big| |x|y - |x|^{-1}x \big|^n} \Big) = \frac{1}{\omega_{n-1}}\Big( \frac{|y|^2 - x \cdot y}{|x - y|^n} - \frac{|x|^2|y|^2 - x \cdot y}{\big| |x|y - |x|^{-1}x \big|^n} \Big)$$
and from (6.19) we obtain

(6.21) $P(x,y) = \dfrac{1}{\omega_{n-1}}\,\dfrac{1 - |x|^2}{|x - y|^n}$ $((x,y) \in B \times S)$

when $n > 2$. If $n = 2$, we recover the Poisson kernel of Theorem 5.8, since a direct computation of $\partial_{\nu(\xi)}G(z,\xi)$, with $G$ defined as in (6.20), leads to
$$P(z,\xi) = \frac{1}{2\pi}\,\frac{1 - |z|^2}{|\xi - z|^2} = \frac{1}{2\pi}\,\Re\,\frac{\xi + z}{\xi - z} \quad ((z,\xi) \in U \times \mathbf{T}).$$
Note that here we have included the normalizing factor $1/2\pi$.

Now the expected result for $n > 2$ can be proved:

Theorem 6.33. If $f \in \mathcal{C}(S)$ and $P(x,y)$ is the Poisson kernel for the ball given by (6.21), then the function
$$u(x) := \int_S P(x,y)f(y)\,d\sigma(y) \quad (x \in B)$$
is harmonic on $B$ and extends continuously to $\bar B$ and $u = f$ on $S$.

Proof. We will follow the argument used to prove Theorem 2.41, now based on the following facts:

(1) $\int_S P(x,y)\,d\sigma(y) = 1$.
(2) If $y_0 \in S$ and $V = B(y_0,\delta) \cap S$, then $\lim_{r \uparrow 1} \int_{S \setminus V} P(ry_0, y)\,d\sigma(y) = 0$.


To prove (1), we will apply the mean value theorem for harmonic functions11 to Py = P (·,y), 1 1 Py(x)= Py(z) dσ(z)= Py(x + rz) dσ(z), | | − Sr Sr(x) ωn 1 S at the point x =0,if0

$$P(ry_0, y) = \frac{1}{\omega_{n-1}}\,\frac{1 - |ry_0|^2}{|ry_0 - y|^n}$$

2 n limr↑1(1 −|ry0| ) = 0 and 1/|ry0 − y| is uniformly bounded for 1/2 0 is given, choose δ>0 so that |f(y) − f(z)|≤ε if |y − z|≤δ and V (y)=B(y, δ) ∩ S. Then by (1) f(y) − u(ry)= + (f(y) − f(z))P (ry, z) dσ(z) V (y) S\V (y) and we obtain

$$|f(y) - u(ry)| \le \varepsilon + 2\|f\|_S \int_{S \setminus V(y)} P(ry,z)\,d\sigma(z) \le 2\varepsilon$$
if $r$ is close to 1, by (2).

This shows that limr↑1 u(ry)=f(y) uniformly for y ∈ S, and u has a continuous extension to B¯ given by u(y)=f(y)ify ∈ S. 
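Fact (1) of the proof and identity (6.19) can be checked numerically for $n = 3$. In the sketch below the point $x = (0,0,a)$ lies on the polar axis, so that $P(x,y)$ depends only on the polar angle $\theta$ of $y$ and $d\sigma = 2\pi\sin\theta\,d\theta$ after integrating out the azimuth; the sample points and helper names are illustrative assumptions.

```python
import math

def simpson(g, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

def poisson_kernel_3d(a, theta):
    """P(x, y) from (6.21) for n = 3, x = (0, 0, a), y on S at polar angle theta."""
    dist_sq = 1.0 + a * a - 2.0 * a * math.cos(theta)   # |x - y|^2 for |y| = 1
    return (1.0 - a * a) / (4.0 * math.pi * dist_sq ** 1.5)

def total_mass(a):
    """Integral of P(x, .) over the unit sphere S in R^3."""
    return simpson(
        lambda th: poisson_kernel_3d(a, th) * 2.0 * math.pi * math.sin(th),
        0.0, math.pi)

def norm(v):
    return math.sqrt(sum(c * c for c in v))

# Identity (6.19): | x/|x| - |x| y | = |x - y| whenever |y| = 1.
x = (0.3, -0.2, 0.5)               # hypothetical point of the ball
y = (1.0 / 3.0, 2.0 / 3.0, -2.0 / 3.0)   # unit vector
nx = norm(x)
lhs_619 = norm([xi / nx - nx * yi for xi, yi in zip(x, y)])
rhs_619 = norm([xi - yi for xi, yi in zip(x, y)])

masses = [total_mass(a) for a in (0.0, 0.5, 0.8)]
print(masses, lhs_619, rhs_619)
```

Each mass equals 1 up to quadrature error, while the kernel concentrates near $\theta = 0$ as $a \uparrow 1$, which is the mechanism behind fact (2).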

In Exercise 6.38 we leave it to the reader to prove a similar result for the Lp convergence u(ry) → f(y) for every f ∈ Lp(S).

11See Folland, “Introduction to Partial Differential Equations” [14, page 90].


6.6. Exercises

Exercise 6.1. Let 0


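with $q_n$ as in (6.1) on the compact set $[-R, R]$.

(b) Show that it is possible to choose $\{r_n\}_{n=1}^\infty \subset [1,\infty)$ so that the functions $f_n$ satisfy $q_{n-1}(f_n) \le 1/2^n$.
(c) Prove that $\sum_{n=1}^\infty f_n(t) = f(t)$ for some $f \in \mathcal{E}(\mathbf{R})$ and $f^{(m)}(0) = c_m$.

Exercise 6.7. Prove that, for every function $f \in L^1_{\mathrm{loc}}(\Omega)$ and every complex Borel measure $\mu$ on $\Omega$, the distributions defined by
$$\langle \varphi, f \rangle = \int_\Omega \varphi(x)f(x)\,dx \quad\text{and}\quad \langle \varphi, \mu \rangle = \int_\Omega \varphi\,d\mu$$
are of order 0.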

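Exercise 6.8. Prove that the Dirac distribution $\delta_a$ is not a function.

Exercise 6.9 ($\delta'$ is not a measure). Show that the distribution $\delta'$ is not a complex measure.

Exercise 6.10. Find $\lim_{\lambda\to 0} K_\lambda$ in $\mathcal{D}'(\mathbf{R}^n)$ if $\{K_\lambda\}_{\lambda\in\Lambda}$ is a summability kernel on $\mathbf{R}^n$.

Exercise 6.11 (Dirac comb). Prove that $\sum_{k=-\infty}^{+\infty} \delta_k$ on $\mathbf{R}$ is well-defined as the sum of a convergent series in $\mathcal{D}'(\mathbf{R})$.

Exercise 6.12. Find the order of $\delta^{(m)} \in \mathcal{D}'(\mathbf{R})$.

Exercise 6.13. Prove that
$$u(\varphi) = \sum_{n=1}^\infty \varphi^{(n)}(1/n)$$
defines on $(0,\infty)$ a distribution of infinite order that is not the restriction of any distribution $v$ on $\mathbf{R}$, meaning that $u = v|_{\mathcal{D}(0,\infty)}$.

Exercise 6.14. Every positive linear form $u : \mathcal{D}(\Omega) \to \mathbf{C}$ ($u(\varphi) \ge 0$ if $\varphi \ge 0$) is a distribution. The restrictions $u : \mathcal{D}_K(\Omega) \to \mathbf{C}$ satisfy $|u(\varphi)| \le u(\varrho)\|\varphi\|_K$ if $\varrho$ is a Urysohn function on $K \subset \Omega$, and they are continuous.

Exercise 6.15. Let $f \in L^1_{\mathrm{loc}}(\mathbf{R}^n \setminus \{x_0\})$ and assume that $x_0$ is a nonintegrable singularity of $f$. We only know that $\int f\varphi$ exists for a test function $\varphi$ if $x_0 \notin \operatorname{supp}\varphi$. A regularization of $f$ is a distribution $u_f$ on $\mathbf{R}^n$ which satisfies $u_f(\varphi) = \int f\varphi$ if $x_0 \notin \operatorname{supp}\varphi$.

For the function $f(t) = 1/t$, on $\mathbf{R}$ we obtain a regularization of $f$ by taking $a, b > 0$ and defining
$$u_f(\varphi) := \int_{-\infty}^{-a} \frac{\varphi(t)}{t}\,dt + \int_{-a}^{b} \frac{\varphi(t) - \varphi(0)}{t}\,dt + \int_b^{+\infty} \frac{\varphi(t)}{t}\,dt.$$
There is a continuous function $\psi_\varphi$ so that $u_2(\varphi) := \int_{-a}^b \frac{\varphi(t) - \varphi(0)}{t}\,dt = \int_{-a}^b \psi_\varphi(t)\,dt$ defines a distribution on $\mathbf{R}$, and $u_f$ appears as the sum of three distributions. If $0 \notin \operatorname{supp}\varphi$, then $u_f(\varphi) = \int_{\mathbf{R}} \varphi(t)/t\,dt$.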


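Exercise 6.16. Prove the following statements concerning limits in $\mathcal{D}'(\mathbf{R})$.

(a) If $u_r(\varphi) := \int_{-r}^{r} \frac{\varphi(t) - \varphi(0)}{t}\,dt$ ($r > 0$), then $u_r \in \mathcal{D}'(\mathbf{R})$ and $u_r \to 0$ as $r \downarrow 0$.

(b) $\operatorname{pv}\frac{1}{t} := \lim_{\varepsilon \downarrow 0} \chi_{[-\varepsilon,\varepsilon]^c}(t)\,\frac{1}{t}$ is a well-defined distribution. Moreover, if $\operatorname{supp}\varphi \subset [-r, r]$, then
\[ \Big\langle \varphi, \operatorname{pv}\frac{1}{t} \Big\rangle = u_r(\varphi) \]
and we write
\[ \Big\langle \varphi, \operatorname{pv}\frac{1}{t} \Big\rangle = \operatorname{pv}\int_{-\infty}^{+\infty} \frac{\varphi(t)}{t}\,dt. \]
It is a regularization of $f(t) = 1/t$.

Exercise 6.17. Let $f \in L^1_{\mathrm{loc}}(\mathbf{R}^n \setminus \{0\})$ be such that $f(x)|x|^m$ is locally integrable for some positive integer $m$ and, for a test function $\varphi$, denote by
\[ T_m\varphi(x) := \sum_{k=0}^{m} \frac{1}{k!}\, d^k\varphi(0)(x, \ldots, x) \]
the Taylor polynomial of degree $m$, so that $|\varphi(x) - T_m\varphi(x)| \le C_r |x|^m$ if $|x| \le r$. Show that
\[ u_f(\varphi) := \int_{\mathbf{R}^n} f(x)\big(\varphi(x) - T_m\varphi(x)\,\chi_{(-\infty,1)}(|x|)\big)\,dx \]
defines a regularization of $f$.

Exercise 6.18. Find the distributional derivatives $Y^{(n)}$ of the Heaviside function $Y = \chi_{[0,\infty)}$.

Exercise 6.19. Show that $f(t) := \log|t|$ is locally integrable on $\mathbf{R}$ and prove that $f' = \operatorname{pv}\frac{1}{t}$. Note that $\lim_{\varepsilon \to 0} \log\varepsilon\,(\varphi(-\varepsilon) - \varphi(\varepsilon)) = 0$.

Exercise 6.20. Let $f(t) = \int_0^t g(x)\,dx$ ($t \in \mathbf{R}$) for some $g \in L^1(\mathbf{R})$; that is, $f$ is an absolutely continuous function on $\mathbf{R}$, and $f' = g$ a.e. Prove that $g = f'$ is the distributional derivative of $f$: $\langle \varphi, g \rangle = -\langle \varphi', f \rangle$ for every test function $\varphi$ on $\mathbf{R}$.

Exercise 6.21. Prove that $D^\alpha : \mathcal{D}'(\Omega) \to \mathcal{D}'(\Omega)$ is a linear continuous operator that extends $D^\alpha : \mathcal{E}^m(\Omega) \to \mathcal{E}^{m-|\alpha|}(\Omega)$, if $|\alpha| \le m$.

Exercise 6.22. Suppose $P(D) = \sum_{|\alpha| \le m} c_\alpha D^\alpha$ is a differential operator with $C^\infty$ coefficients $c_\alpha$ on $\Omega$ and denote $P(D)'u := \sum_{|\alpha| \le m} (-1)^{|\alpha|} D^\alpha(c_\alpha u)$. Prove that $P(D)'' = P(D)$ and $(P(D)u)(\varphi) = u(P(D)'\varphi)$.

Exercise 6.23. Prove that, if at least one of $u, v \in \mathcal{D}'(\mathbf{R}^n)$ has compact support, then $\tau_a(u * v) = (\tau_a u) * v = u * (\tau_a v)$.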


Exercise 6.24. Prove that $\operatorname{supp} D^\alpha u \subset \operatorname{supp} u$ for every distribution $u \in \mathcal{D}'(\Omega)$.

Exercise 6.25. Find the supports of $Y$ and $\delta$ on $\mathbf{R}$.

Exercise 6.26. Prove that the order of all distributions with compact support is finite.

Exercise 6.27. Prove that $1 * (\delta' * Y) \ne (1 * \delta') * Y$ and that this is not in contradiction to Theorem 6.21(b).

Exercise 6.28. If $u \in \mathcal{E}'(\mathbf{R}^n)$, prove that $u* : \mathcal{E}(\mathbf{R}^n) \to \mathcal{E}(\mathbf{R}^n)$ is a continuous linear operator which commutes with translations.

Exercise 6.29. We say that $\{d_m\}_{m=1}^{\infty} \subset L^1_{\mathrm{loc}}(\mathbf{R})$ is an approximation of $\delta$ if
\[ \lim_{m \to \infty} \int_a^b d_m(t)\,dt = 0 \text{ if } 0 \notin [a, b] \quad\text{and}\quad \lim_{m \to \infty} \int_a^b d_m(t)\,dt = 1 \text{ if } 0 \in (a, b). \]

(a) If $\{d_m\}_{m=1}^{\infty}$ is an approximation of $\delta$ and $Y_m(x) := \int_{-1}^{x} d_m(t)\,dt$, prove that $Y_m \to Y$ and $d_m \to \delta$ in $\mathcal{D}'(\mathbf{R})$.

(b) Find a concrete approximation of $\delta$.

Exercise 6.30. Not all the solutions in $\mathcal{D}'(\mathbf{R})$ of the differential equation $t\,u'(t) = 0$ are classical solutions.

Exercise 6.31. Find the fundamental solutions of the differential operator $P(D)u = u''' - u'' + u' - u$.

Exercise 6.32. Find the fundamental solutions of $P(D)u = u''' + 3u'' + 3u' + u$.

Exercise 6.33. Here we assume that $\varphi \in \mathcal{C}_c(\mathbf{R}^2) \cap \mathcal{E}^2(\mathbf{R}^2)$ and that $E_2$ is the fundamental solution for $\Delta$ on $\mathbf{R}^2$. Prove that $u := \varphi * E_2$ is a classical solution of the Poisson equation $\Delta u = \varphi$; that is, $\Delta u(x)$ exists in the pointwise sense and coincides with $\varphi(x)$.

Exercise 6.34. On $\mathbf{C} = \mathbf{R}^2$, prove that
\[ E(x, y) := \frac{1}{\pi z} \quad (z = x + iy) \]
is a fundamental solution of the Cauchy-Riemann operator
\[ \partial_{\bar z} := \frac{1}{2}(\partial_x + i\partial_y). \]

Exercise 6.35. (a) If $c \ne 0$ and $u \in \mathcal{D}'(\mathbf{R})$, show that we can define $u(x - cy) \in \mathcal{D}'(\mathbf{R}^2)$ by
\[ \langle \varphi(x, y), u(x - cy) \rangle := \int_{\mathbf{R}} \langle \varphi(\cdot, y), \tau_{cy}u \rangle\,dy \quad (\varphi \in \mathcal{D}(\mathbf{R}^2)) \]
and that, if $u \in L^1_{\mathrm{loc}}(\mathbf{R})$, then this distribution is a locally integrable function on $\mathbf{R}^2$.

(b) Show that for any distribution $u \in \mathcal{D}'(\mathbf{R})$, $v = u(x - ct)$ is a distributional solution on $\mathbf{R}^2$ of the wave equation $\partial_t^2 v - c^2\partial_x^2 v = 0$.

Exercise 6.36. Prove that the heat operator and the Cauchy-Riemann operator are hypoelliptic.

Exercise 6.37. Prove a version of Theorem 6.28 to find a solution of $\Delta u = f$ on $\mathbf{R}^2$ under the assumptions $f \in L^1(\mathbf{R}^2)$ and $f(x)\log^+(|x|) \in L^1(\mathbf{R}^2)$.

Exercise 6.38. Suppose $f \in L^p(S)$ ($1 \le p < \infty$) and $P(x, y)$ is the Poisson kernel for the ball given by (6.21). Prove that
\[ u(x) := \int_S P(x, y) f(y)\,d\sigma(y) \quad (x \in B) \]
is a well-defined harmonic function on $B$ which satisfies the boundary value condition
\[ \lim_{r \uparrow 1} \int_S |f(y) - u(ry)|^p\,d\sigma(y) = 0. \]

References for further reading:

P. A. M. Dirac, The Principles of Quantum Mechanics.
R. E. Edwards, Functional Analysis, Theory and Applications.
G. B. Folland, Introduction to Partial Differential Equations.
I. M. Gelfand and G. E. Chilov, Generalized Functions.
L. Hörmander, Linear Partial Differential Operators.
J. Horváth, Topological Vector Spaces and Distributions.
E. H. Lieb and M. Loss, Analysis.
W. Rudin, Functional Analysis.
L. Schwartz, Théorie des Distributions.
K. Yosida, Functional Analysis.

Chapter 7

Fourier transform and Sobolev spaces

The Fourier transform is one of the most powerful operators in analysis. Its scope and applications have been extended to areas as different as harmonic analysis, partial differential equations, signal theory, probability, and algebraic number theory.

Its virtues depend on the use of the functions $e^{i\alpha t} = \cos(\alpha t) + i\sin(\alpha t)$, which are the homomorphisms of the additive group $\mathbf{R}$ to the multiplicative group $\mathbf{T}$, and on the translation-invariance of the Lebesgue measure. These facts are intimately linked to the fundamental properties of converting convolution and linear differential operators into multiplication operators, changing convolution and partial differential equations into algebraic equations, and yielding explicit solutions in basic equations such as Laplace, heat, and wave equations.

With the extension to distributions, the scope of the Fourier transform increased substantially. By considering the Sobolev spaces of functions with distributional derivatives in $L^2$ up to a certain order, a control on the smoothness properties of these functions is obtained. The reason is that the Fourier transform, which changes differentiation into multiplication, is an $L^2$ isometry and $L^2$ is a Hilbert space.

With the fundamental properties of the Fourier transform of distributions, we present an introduction to the theory of Sobolev spaces.$^{1}$ For completeness, we give the basic definitions in the $L^p$ setting but, in fact, we only use the Hilbert space case, $p = 2$.

The estimates given by the Sobolev norms are a standard tool to prove the existence and regularity of solutions for partial differential equations. We illustrate this by means of an application to the Dirichlet problem and by studying the eigenfunctions of the Laplacian, and we include Rellich's compactness theorem, a result which is of great importance in the applications.

1 Around 1930 the Russian mathematician Sergei L'vovich Sobolev introduced his space $W^{1,2}(\Omega)$, or $H^1(\Omega)$, with the use of weak derivatives as the natural Hilbert space for solving the Laplace or Poisson equation $-\Delta u = f$ with boundary conditions. A little later, in France Jean Leray considered a similar method to find weak solutions for the Navier-Stokes equation.

7.1. The Fourier integral

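For each $\xi = (\xi_1, \ldots, \xi_n) \in \mathbf{R}^n$, $e_\xi$ is the complex sinusoidal defined on $\mathbf{R}^n$ by
\[ e_\xi(x) := \exp(2\pi i\, x \cdot \xi) = \exp\Big(2\pi i \sum_{k=1}^{n} x_k \xi_k\Big). \]
The Fourier integral $\widehat{f}$ of a function $f \in L^1(\mathbf{R}^n)$ is defined by
\[ \widehat{f}(\xi) := \int_{\mathbf{R}^n} f(x) e^{-2\pi i\, x \cdot \xi}\,dx = \langle f, e_{-\xi} \rangle. \]
The Fourier transform $\mathcal{F} : L^1(\mathbf{R}^n) \to L^\infty(\mathbf{R}^n) \cap \mathcal{C}(\mathbf{R}^n)$, such that $\mathcal{F}f = \widehat{f}$, is a linear mapping and $\|\widehat{f}\|_\infty \le \|f\|_1$ (i.e., $\|\mathcal{F}\| \le 1$).

An application of Fubini's theorem, derivation under the integral, partial integration and elementary changes of variables show that the following useful properties of the Fourier transform hold on $L^1(\mathbf{R}^n)$:

(a) $\widehat{\tau_a f} = e_{-a}\widehat{f}$ and $\widehat{e_a f} = \tau_a\widehat{f}$.

(b) $[h^{-n} f(h^{-1}x)]^{\wedge}(\xi) = \widehat{f}(h\xi)$ and $[f(hx)]^{\wedge}(\xi) = h^{-n}\widehat{f}(h^{-1}\xi)$ ($h > 0$).

(c) $\widehat{f * g} = \widehat{f} \cdot \widehat{g}$ and $\int_{\mathbf{R}^n} \widehat{f}(y) g(y)\,dy = \int_{\mathbf{R}^n} f(y)\widehat{g}(y)\,dy$.

(d) $\partial_1\widehat{f}(\xi) = (-2\pi i)[x_1 f(x)]^{\wedge}(\xi)$, if $x_1 f(x)$ is also integrable.

(e) $\widehat{\partial_1 f}(\xi) = 2\pi i\,\xi_1\widehat{f}(\xi)$, if $f$ is of class $C^1$ and $\partial_1 f$ is also integrable.

Note that to check (e) we can suppose $n = 1$. Then $f(t) = \int_0^t f'(x)\,dx + f(0)$ and $\lim_{t \to \pm\infty} f(t) = \int_0^{\pm\infty} f'(x)\,dx + f(0)$ exists and is finite, since $f'$ is assumed to be integrable, and this limit has to be 0, since $f$ is also integrable. Integration by parts gives
\[ \int_{-\infty}^{+\infty} f'(t) e^{-2\pi i \xi t}\,dt = \Big[f(t) e^{-2\pi i \xi t}\Big]_{t=-\infty}^{t=+\infty} + 2\pi i \xi \int_{-\infty}^{+\infty} f(t) e^{-2\pi i \xi t}\,dt \]
with $\big[f(t) e^{-2\pi i \xi t}\big]_{t=-\infty}^{t=+\infty} = 0$.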


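The inverse Fourier transform is the mapping $\widetilde{\mathcal{F}}$ such that
\[ (\widetilde{\mathcal{F}}f)(\xi) := \int_{\mathbf{R}^n} f(x) e^{2\pi i\, x \cdot \xi}\,dx = \langle f, e_\xi \rangle = \widehat{f}(-\xi). \]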

Example 7.1. The Fourier integral of the square wave, $\chi_{(-1/2,1/2)}$, is the function sinc defined as
\[ \operatorname{sinc}(\xi) := \frac{\sin(\pi\xi)}{\pi\xi}, \]
since
\[ \int_{-1/2}^{1/2} e^{-2\pi i \xi x}\,dx = \int_{-1/2}^{1/2} \cos(2\pi\xi x)\,dx - i\int_{-1/2}^{1/2} \sin(2\pi\xi x)\,dx = \operatorname{sinc}(\xi). \]
This function plays an important role in signal analysis and will appear again in Theorem 7.18.
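As a quick numerical sanity check of Example 7.1, the following sketch approximates the Fourier integral of the square wave with a midpoint rule and compares it with sinc. The function names and the quadrature are illustrative choices, not part of the text.

```python
import cmath
import math


def sinc(xi):
    """Normalized sinc: sin(pi*xi)/(pi*xi), with the value 1 at xi = 0."""
    if xi == 0.0:
        return 1.0
    return math.sin(math.pi * xi) / (math.pi * xi)


def square_wave_fourier_integral(xi, steps=20000):
    """Midpoint-rule approximation of the integral over (-1/2, 1/2)
    of exp(-2*pi*i*xi*x) dx (an illustrative quadrature)."""
    h = 1.0 / steps
    total = 0j
    for j in range(steps):
        x = -0.5 + (j + 0.5) * h  # midpoint of the j-th subinterval
        total += cmath.exp(-2j * math.pi * xi * x)
    return total * h


if __name__ == "__main__":
    for xi in (0.0, 0.3, 1.0, 2.5):
        approx = square_wave_fourier_integral(xi)
        print(f"xi={xi}: quadrature={approx.real:.8f}, sinc={sinc(xi):.8f}")
```

The imaginary part of the quadrature cancels by symmetry, in agreement with the vanishing of the sine integral in the example.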

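Example 7.2. If $W$ is the function defined on $\mathbf{R}^n$ by $W(x) = e^{-\pi|x|^2}$, then $\widehat{W} = W$ and
\[ (7.1)\qquad \int_{\mathbf{R}^n} e^{-\pi a|x|^2} e^{-2\pi i\, x \cdot \xi}\,dx = \frac{1}{a^{n/2}}\, e^{-\pi|\xi|^2/a} \]
for every $a > 0$.

If $n = 1$, from $W'(t) = -2\pi t\,W(t)$, the Fourier transform gives
\[ 2\pi\xi\,\widehat{W}(\xi) + (\widehat{W})'(\xi) = 0 \]
and then
\[ \big(e^{\pi\xi^2}\widehat{W}(\xi)\big)' = 2\pi\xi\,\widehat{W}(\xi)e^{\pi\xi^2} + (\widehat{W})'(\xi)e^{\pi\xi^2} = 0. \]
Thus $e^{\pi\xi^2}\widehat{W}(\xi) = K$, a constant. The value of this constant is obtained from the Euler-Gauss integral $\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$ with the substitution $x = \sqrt{\pi}\,t$, which gives
\[ K = \widehat{W}(0) = \int_{-\infty}^{\infty} e^{-\pi t^2}\,dt = 1, \]
so that $\widehat{W}(\xi) = W(\xi)$.

For $n$ variables, $W(x) := e^{-\pi|x|^2} = W(x_1) \cdots W(x_n)$, and also $\widehat{W} = W$. Another simple change of variables or an application of property (b) of the Fourier transform yields (7.1).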
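Identity (7.1) can be checked numerically in dimension $n = 1$. The sketch below is only an illustration under stated assumptions: the truncation interval, the step count and the function names are arbitrary choices, and the midpoint rule stands in for the exact integral.

```python
import math


def gaussian_fourier_integral(a, xi, half_width=12.0, steps=48000):
    """Midpoint rule for the integral of exp(-pi*a*x^2) * exp(-2*pi*i*x*xi)
    over [-half_width, half_width] (n = 1). Only the cosine part is summed:
    the sine part is odd in x and integrates to zero."""
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps):
        x = -half_width + (j + 0.5) * h
        total += math.exp(-math.pi * a * x * x) * math.cos(2.0 * math.pi * x * xi)
    return total * h


def right_side_of_7_1(a, xi):
    """Closed form a^(-1/2) * exp(-pi*xi^2/a), the right side of (7.1) for n = 1."""
    return a ** -0.5 * math.exp(-math.pi * xi * xi / a)


if __name__ == "__main__":
    for a, xi in ((1.0, 0.0), (1.0, 0.7), (2.0, 1.0), (0.5, 0.3)):
        print(a, xi, gaussian_fourier_integral(a, xi), right_side_of_7_1(a, xi))
```

The case $a = 1$ reproduces $\widehat{W} = W$; the other values of $a$ show the dilation behaviour described by property (b).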

The operators $\mathcal{F}$ and $\widetilde{\mathcal{F}}$ will be extended to certain distributions, known as tempered distributions, and their extensions will essentially keep properties (a)–(e).


To show why $\mathcal{F}$ is a valuable tool for solving some partial differential equations, consider the example of an initial value problem for the heat equation on $[0, \infty) \times \mathbf{R}^n$,
\[ (7.2)\qquad \partial_t u(t, x) - \Delta u(t, x) = 0, \qquad u(0, x) = f(x), \]
with $\Delta = \sum_{j=1}^{n} \partial_{x_j}^2$, and formally apply $\mathcal{F}$ with respect to the variable $x \in \mathbf{R}^n$. By property (e),
\[ \partial_t\widehat{u}(t, \xi) + 4\pi^2|\xi|^2\,\widehat{u}(t, \xi) = 0, \qquad \widehat{u}(0, \xi) = \widehat{f}(\xi), \]
and, taking $\xi$ as a parameter, we note that this is a very simple initial value problem for an ordinary linear differential equation whose solution is
\[ \widehat{u}(t, \xi) = \widehat{f}(\xi)\, e^{-4\pi^2|\xi|^2 t}. \]
According to Example 7.2, if $a = 1/(4\pi t)$,
\[ e^{-4\pi^2|\xi|^2 t} = (4\pi t)^{-n/2}\int_{\mathbf{R}^n} e^{-|x|^2/4t}\, e^{-2\pi i\, x \cdot \xi}\,dx \]
so that, by property (c),
\[ \widehat{u}(t, \xi) = \widehat{f}(\xi)\,\widehat{W_t}(\xi) = \widehat{f * W_t}(\xi) \]
if $W_t(x) = (4\pi t)^{-n/2} e^{-|x|^2/4t}$. Hence, the function
\[ (7.3)\qquad u(t, x) = (f * W_t)(x) = \frac{1}{(4\pi t)^{n/2}}\int_{\mathbf{R}^n} e^{-|y|^2/4t} f(x - y)\,dy \]
is a candidate for a solution of (7.2).

Note that
\[ W_t(x) = \frac{1}{(\sqrt{4\pi t})^{n}}\, W\Big(\frac{x}{\sqrt{4\pi t}}\Big) \]
is a summability kernel, known as the Gauss-Weierstrass kernel, associated to the positive integrable function $W$, so that the initial value condition $\lim_{t \downarrow 0} f * W_t = f$ will hold.$^{2}$

Moreover, equation (7.3) suggests $\Gamma(t, x) = W_t(x)Y(t)$ as a possible fundamental solution of the heat operator, which is the case as we have seen in Theorem 6.29.$^{3}$
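The candidate solution (7.3) can be tested numerically in dimension $n = 1$. The sketch below assumes the initial datum $f(x) = e^{-x^2}$, for which convolving two Gaussians gives the closed form $u(t, x) = (1 + 4t)^{-1/2} e^{-x^2/(1+4t)}$; this datum, the quadrature and all names are illustrative assumptions, not part of the text.

```python
import math


def heat_solution(f, t, x, steps=4000):
    """Midpoint-rule approximation of (7.3) for n = 1:
    u(t, x) = (4*pi*t)^(-1/2) * integral of exp(-y^2/(4t)) * f(x - y) dy.
    The integral is truncated at 12 standard deviations of the kernel."""
    half_width = 12.0 * math.sqrt(2.0 * t)  # the kernel has variance 2t
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps):
        y = -half_width + (j + 0.5) * h
        total += math.exp(-y * y / (4.0 * t)) * f(x - y)
    return total * h / math.sqrt(4.0 * math.pi * t)


def initial_datum(x):
    """Illustrative initial condition f(x) = exp(-x^2)."""
    return math.exp(-x * x)


def exact_solution(t, x):
    """Closed form of f * W_t for the Gaussian datum above."""
    return math.exp(-x * x / (1.0 + 4.0 * t)) / math.sqrt(1.0 + 4.0 * t)


if __name__ == "__main__":
    for t in (0.01, 0.1, 1.0):
        for x in (0.0, 0.5, -1.2):
            print(t, x, heat_solution(initial_datum, t, x), exact_solution(t, x))
```

For small $t$ the computed values approach $f(x)$, which is the initial value condition $\lim_{t \downarrow 0} f * W_t = f$ in this particular case.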

The following Poisson theorem relates Fourier integrals with Fourier series:

2 See the details in Exercise 7.2.
3 This was how Poisson constructed solutions for the heat equation in the work contained in his “Théorie mathématique de la chaleur” (1835).


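Theorem 7.3. Suppose that $f \in L^1(\mathbf{R})$ satisfies the condition
\[ (7.4)\qquad \sum_{k=-\infty}^{\infty} \Big|\widehat{f}\Big(\frac{k}{T}\Big)\Big| < \infty. \]
Then there exists a continuous $T$-periodic function $f_T$ on $\mathbf{R}$ such that
\[ f_T(t) = \sum_{k=-\infty}^{\infty} f(t - kT) \quad\text{a.e.} \]
and
\[ f_T(t) = \frac{1}{T}\sum_{k=-\infty}^{\infty} \widehat{f}\Big(\frac{k}{T}\Big) e^{2\pi i k t/T}, \]
a series which is uniformly convergent on $\mathbf{R}$, that is, in $\mathcal{C}_T(\mathbf{R})$.

Proof. Denote $L = T/2$. On $[-L, L)$ (and on every $[kT - L, kT + L)$), we can define a.e. the periodic function
\[ f_T(t) := \sum_{k=-\infty}^{\infty} f(t - kT), \]
with convergence in $L^1(-L, L)$, since
\[ \int_{-L}^{L} \sum_{k=-\infty}^{\infty} |f(t - kT)|\,dt = \|f\|_1 < \infty \]
and then $\sum_{k=-\infty}^{\infty} |f(t - kT)| < \infty$ a.e.

The Fourier coefficients of $f_T \in L^1_T(\mathbf{R})$ are
\[ c_k(f_T) = \frac{1}{T}\,\widehat{f}\Big(\frac{k}{T}\Big), \]
since, using the $T$-periodicity of $e^{-2\pi i k t/T}$ and writing $j$ for the summation index,
\[ c_k(f_T) = \frac{1}{T}\sum_{j=-\infty}^{\infty}\int_{-L}^{L} f(t - jT) e^{-2\pi i k t/T}\,dt = \frac{1}{T}\sum_{j=-\infty}^{\infty}\int_{-L-jT}^{L-jT} f(t) e^{-2\pi i k t/T}\,dt = \frac{1}{T}\int_{\mathbf{R}} f(t) e^{-2\pi i k t/T}\,dt. \]
From condition (7.4) and by the M-test of Weierstrass, it follows that the Fourier series is absolutely and uniformly convergent to a continuous function which coincides with $f_T$ a.e. □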
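The two expressions for $f_T$ in Theorem 7.3 can be compared numerically. The sketch below assumes $f(t) = e^{-\pi t^2}$, which satisfies $\widehat{f} = f$ by Example 7.2, so that both sides can be evaluated directly; the truncation of the two series at a fixed number of terms and the function names are illustrative choices.

```python
import math


def periodized_sum(t, period, terms=60):
    """Truncation of the sum over k of f(t - k*period) for f(t) = exp(-pi*t^2)."""
    return sum(math.exp(-math.pi * (t - k * period) ** 2)
               for k in range(-terms, terms + 1))


def fourier_series_sum(t, period, terms=60):
    """Truncation of (1/T) * sum over k of fhat(k/T) * exp(2*pi*i*k*t/T).
    Since fhat = f is even and real here, the series reduces to a cosine series."""
    total = sum(math.exp(-math.pi * (k / period) ** 2)
                * math.cos(2.0 * math.pi * k * t / period)
                for k in range(-terms, terms + 1))
    return total / period


if __name__ == "__main__":
    for period in (0.5, 1.0, 2.0):
        for t in (0.0, 0.3, 0.9):
            print(period, t, periodized_sum(t, period), fourier_series_sum(t, period))
```

Both series converge extremely fast for this Gaussian, so a few dozen terms already agree to machine precision.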


If the support of $\widehat{f}$ is compact, or if there are two constants $A, \delta > 0$ such that $|\widehat{f}(\xi)| \le A(1 + |\xi|)^{-1-\delta}$, then the integrable function $f$ satisfies condition (7.4).

The function $f_T$ is called the periodized extension of $f$. If $\operatorname{supp} f \subset [-T/2, T/2]$, $f_T$ is the $T$-periodic extension of the restriction of $f$ to the interval $[-T/2, T/2]$.

7.2. The Schwartz class S

To define the Fourier transform of distributions, instead of $\mathcal{D}(\mathbf{R}^n)$ we need to consider a class of $C^\infty$ functions that is invariant under the Fourier transform. Properties (d) and (e) of the Fourier transform suggest that we consider the complex vector space
\[ \mathcal{S}(\mathbf{R}^n) := \big\{\varphi \in \mathcal{E}(\mathbf{R}^n);\ q_N(\varphi) < \infty \text{ for } N = 0, 1, 2, \ldots\big\}, \]
where
\[ q_N(\varphi) := \sup_{x \in \mathbf{R}^n;\ |\alpha| \le N} (1 + |x|^2)^N |D^\alpha\varphi(x)|. \]
Note that $|x^\alpha| \le (1 + |x|^2)^N$ if $|\alpha| \le N$, so that $\varphi \in \mathcal{E}(\mathbf{R}^n)$ is in $\mathcal{S}(\mathbf{R}^n)$ if and only if, for every couple $P$, $Q$ of polynomials, the function $P(x)Q(D)\varphi(x)$ is bounded.

The topology of $\mathcal{S}(\mathbf{R}^n)$ is defined by the sequence $q_0 \le q_1 \le q_2 \le \cdots$ of norms, so that the convergence $\varphi_k \to \varphi$ in $\mathcal{S}(\mathbf{R}^n)$ is the uniform convergence on $\mathbf{R}^n$
\[ x^\beta D^\alpha\varphi_k(x) \to x^\beta D^\alpha\varphi(x) \]
for all $\alpha, \beta \in \mathbf{N}^n$, which is equivalent to the uniform convergence
\[ P \cdot Q(D)\varphi_k \to P \cdot Q(D)\varphi \]
for every couple $P$, $Q$ of polynomials.

The following theorem collects some basic properties of this new space.

Theorem 7.4. $\mathcal{S}(\mathbf{R}^n)$, with the topology defined by the increasing sequence $\{q_N\}_{N=0}^{\infty}$ of norms, is a Fréchet space.

The inclusions $\mathcal{D}(\mathbf{R}^n) \subset \mathcal{S}(\mathbf{R}^n) \subset L^1(\mathbf{R}^n)$ are continuous (that is, for every compact set $K \subset \mathbf{R}^n$, the mappings $\mathcal{D}_K(\mathbf{R}^n) \to \mathcal{S}(\mathbf{R}^n) \to L^1(\mathbf{R}^n)$ are continuous), and $\mathcal{D}(\mathbf{R}^n)$ is dense in $\mathcal{S}(\mathbf{R}^n)$.

The differential operators $P(D)$, the multiplication by polynomials, translations, dilations, the symmetry, and every modulation or multiplication by a complex sinusoidal $e_a$ are also continuous linear mappings of $\mathcal{S}(\mathbf{R}^n)$ into $\mathcal{S}(\mathbf{R}^n)$.


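Proof. If $\{\varphi_k\} \subset \mathcal{S}(\mathbf{R}^n)$ is such that every $\{(1 + |x|^2)^N D^\alpha\varphi_k(x)\}_{k=1}^{\infty}$ is a uniformly Cauchy sequence, then $(1 + |x|^2)^N D^\alpha\varphi_k(x) \to \varphi_{N,\alpha}(x)$ uniformly as $k \to \infty$. Since $D^\alpha\varphi_k(x) \to \varphi_{0,\alpha}(x)$ uniformly, the function $\varphi = \varphi_{0,0}$ is in $\mathcal{E}(\mathbf{R}^n)$ and $\varphi_{0,\alpha}(x) = D^\alpha\varphi(x)$. Hence $(1 + |x|^2)^N D^\alpha\varphi_k(x) \to (1 + |x|^2)^N D^\alpha\varphi(x)$ uniformly, which means that $\varphi_k \to \varphi$ in $\mathcal{S}(\mathbf{R}^n)$.

Note that
\[ \int_{\mathbf{R}^n} \frac{1}{(1 + |x|^2)^N}\,dx = \omega_n\int_0^{\infty} \frac{r^{n-1}}{(1 + r^2)^N}\,dr < \infty \]
if $n - 1 - 2N < -1$. Then, if $N > n/2$ and $\varphi \in \mathcal{S}(\mathbf{R}^n)$,
\[ \|\varphi\|_1 \le q_N(\varphi)\int_{\mathbf{R}^n} (1 + |x|^2)^{-N}\,dx = C q_N(\varphi) \]
and $\mathcal{S}(\mathbf{R}^n) \hookrightarrow L^1(\mathbf{R}^n)$ is continuous.

Recall that the topology of $\mathcal{D}_K(\mathbf{R}^n)$ is defined by the increasing sequence of norms
\[ p_N(\varphi) := \sup_{|\alpha| \le N} \|D^\alpha\varphi\|_K = \sup_{|\alpha| \le N} \|D^\alpha\varphi\|_{\mathbf{R}^n}, \]
so that, if $\varphi \in \mathcal{D}_K(\mathbf{R}^n)$,
\[ q_N(\varphi) \le \sup_{x \in K} (1 + |x|^2)^N\, p_N(\varphi), \]
and $\mathcal{D}_K(\mathbf{R}^n) \hookrightarrow \mathcal{S}(\mathbf{R}^n)$ is continuous.

To prove that $\mathcal{D}(\mathbf{R}^n)$ is dense in $\mathcal{S}(\mathbf{R}^n)$, let $\varphi \in \mathcal{S}(\mathbf{R}^n)$. To find functions $\varphi_N \in \mathcal{D}(\mathbf{R}^n)$ such that $\varphi_N \to \varphi$ in $\mathcal{S}(\mathbf{R}^n)$, choose $\bar B(0, 1) \prec \varrho \prec \mathbf{R}^n$ and define $\varrho_N(x) = \varrho(N^{-1}x)$ ($N \in \mathbf{N}$), so that $\bar B(0, N) \prec \varrho_N \prec \mathbf{R}^n$.

Then, if $\varphi_N = \varrho_N\varphi$,
\[ |D^\alpha(\varphi - \varphi_N)(x)| \le \sum_{\beta \le \alpha}\binom{\alpha}{\beta} |D^{\alpha-\beta}\varphi(x)|\,|D^\beta(1 - \varrho_N)(x)| = \sum_{\beta \le \alpha}\binom{\alpha}{\beta} |D^{\alpha-\beta}\varphi(x)|\,N^{-|\beta|}\,\big|\big(D^\beta(1 - \varrho)\big)(x/N)\big|, \]
where we can select $N$ so that $\sup_{|x| \ge N} |D^{\alpha-\beta}\varphi(x)| \le \varepsilon$ for all $\beta \le \alpha$, since $D^{\alpha-\beta}\varphi \in \mathcal{S}(\mathbf{R}^n)$. Note that $D^\beta(1 - \varrho_N)(x) = 0$ if $|x| \le N$, and we obtain
\[ \sup_{x \in \mathbf{R}^n} |D^\alpha(\varphi - \varphi_N)(x)| \le \sum_{\beta \le \alpha}\binom{\alpha}{\beta}\,\varepsilon\,N^{-|\beta|}\sup_{|x| \ge N}\big|\big(D^\beta(1 - \varrho)\big)(x/N)\big| \le C_\alpha\varepsilon, \]
with a constant $C_\alpha$ that does not depend on $N$.

A similar estimate holds for every $\sup_{x \in \mathbf{R}^n} (1 + |x|^2)^m |D^\alpha(\varphi - \varphi_N)(x)|$, since also
\[ \sup_{|x| \ge N} (1 + |x|^2)^m |D^{\alpha-\beta}\varphi(x)| \le \varepsilon \]
for $N$ large enough. Thus, $q_m(\varphi - \varphi_N) \to 0$ if $N \to \infty$.

We leave the remaining part of the proof as an easy exercise. □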

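Theorem 7.5 (Inversion Theorem). The Fourier transform is a continuous bijective linear operator $\mathcal{F} : \mathcal{S}(\mathbf{R}^n) \to \mathcal{S}(\mathbf{R}^n)$ and $\mathcal{F}^{-1} = \widetilde{\mathcal{F}}$.

Proof. For every $\varphi \in \mathcal{S}(\mathbf{R}^n)$, the functions $D^\beta x^\alpha\varphi(x)$ are also in $\mathcal{S}(\mathbf{R}^n)$ and it follows from the properties of $\mathcal{F}$ that $\xi^\beta D^\alpha\widehat{\varphi}(\xi)$ are bounded and $\widehat{\varphi} \in \mathcal{S}(\mathbf{R}^n)$.

To prove that $\mathcal{F} : \mathcal{S}(\mathbf{R}^n) \to \mathcal{S}(\mathbf{R}^n)$ is continuous, note that
\[ |\widehat{\varphi}(\xi)| \le \int_{\mathbf{R}^n} |\varphi(x)|(1 + |x|^2)^N (1 + |x|^2)^{-N}\,dx \le C q_N(\varphi) \quad (N > n/2), \]
so that, if $\varphi_k \to 0$ and $\widehat{\varphi_k} \to \psi$ in $\mathcal{S}(\mathbf{R}^n)$, then $\widehat{\varphi_k}(\xi) \to 0$ and $\psi(\xi) = 0$, and the continuity now follows from the closed graph theorem.

To show that $\widetilde{\mathcal{F}}\mathcal{F} = I$, if we try a direct calculation of $\widetilde{\mathcal{F}}(\mathcal{F}\varphi)$, the integral
\[ \int_{\mathbf{R}^n}\int_{\mathbf{R}^n} f(x) e^{2\pi i (z - x) \cdot \xi}\,d\xi\,dx \]
is not absolutely convergent on $\mathbf{R}^{2n}$.

We avoid this problem by using properties (b) and (c) of the Fourier transform to obtain
\[ \int_{\mathbf{R}^n} f(y)\widehat{g}(hy)\,dy = \int_{\mathbf{R}^n} f(y)\,[h^{-n}g(h^{-1}x)]^{\wedge}(y)\,dy = \int_{\mathbf{R}^n} \widehat{f}(y)\,h^{-n}g(h^{-1}y)\,dy \]
which, with the substitution $hy = x$, becomes
\[ \frac{1}{h^n}\int_{\mathbf{R}^n} f(h^{-1}x)\widehat{g}(x)\,dx = \int_{\mathbf{R}^n} \widehat{f}(y)\frac{1}{h^n}g(h^{-1}y)\,dy; \]
that is, $\int_{\mathbf{R}^n} f(h^{-1}x)\widehat{g}(x)\,dx = \int_{\mathbf{R}^n} \widehat{f}(y)g(h^{-1}y)\,dy$. By letting $h \to \infty$, the dominated convergence theorem gives
\[ \int_{\mathbf{R}^n} f(0)\widehat{g}(x)\,dx = \int_{\mathbf{R}^n} \widehat{f}(y)g(0)\,dy. \]
If we choose $g = W$, since $\int_{\mathbf{R}^n} W(x)\,dx = \int_{\mathbf{R}^n} \widehat{W}(x)\,dx = 1$ and $W(0) = 1$,
\[ f(0) = \int_{\mathbf{R}^n} \widehat{f}(y)\,dy. \]
An application of this identity to $f(x) = (\tau_{-x}f)(0)$ combined with property (a) gives
\[ f(x) = \int_{\mathbf{R}^n} \widehat{\tau_{-x}f}(y)\,dy = \int_{\mathbf{R}^n} e_x(y)\widehat{f}(y)\,dy, \]
which is $f = \widetilde{\mathcal{F}}\mathcal{F}f$. The identity $f = \mathcal{F}\widetilde{\mathcal{F}}f$ is similar. □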

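As an application, let us present a new proof of the Riemann-Lebesgue lemma.

Corollary 7.6 (Riemann-Lebesgue). If $f \in L^1(\mathbf{R}^n)$, then $\widehat{f} \in \mathcal{C}_0(\mathbf{R}^n)$; that is,
\[ \lim_{|\xi| \to \infty} \widehat{f}(\xi) = 0. \]

Proof. We know that $\mathcal{D}(\mathbf{R}^n)$ is dense in $L^1(\mathbf{R}^n)$ (see Exercise 6.2) and, if $\varphi_k \to f$ in $L^1(\mathbf{R}^n)$ with $\varphi_k \in \mathcal{D}(\mathbf{R}^n)$, then $\widehat{\varphi_k} \to \widehat{f}$ uniformly with $\widehat{\varphi_k} \in \mathcal{S}(\mathbf{R}^n)$, so that the functions $\widehat{\varphi_k}$ are null at infinity, and so is their uniform limit $\widehat{f}$. □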

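Not only multiplication by a polynomial and by $e_a$ is continuous on $\mathcal{S}(\mathbf{R}^n)$:

Theorem 7.7. If $\psi \in \mathcal{S}(\mathbf{R}^n)$ and $\omega_s(x) := (1 + |x|^2)^{s/2}$ ($s \in \mathbf{R}$), the pointwise multiplications $\psi\cdot$ and $\omega_s\cdot$ and the convolution $\psi*$ are continuous linear operators of $\mathcal{S}(\mathbf{R}^n)$.

Proof. Let $\varphi \in \mathcal{S}(\mathbf{R}^n)$ and note that, if $|\alpha| \le N$,
\[ (1 + |x|^2)^N |D^\alpha(\psi\varphi)| \le \sum_{\beta \le \alpha}\binom{\alpha}{\beta}(1 + |x|^2)^N |D^{\alpha-\beta}\psi(x)|\,|D^\beta\varphi(x)| \le C q_N(\varphi), \]
since every $|D^{\alpha-\beta}\psi(x)|$ is bounded. The case of $\omega_s\cdot$ is similar, since $|\omega_s\varphi| \le (1 + |x|^2)^N|\varphi(x)|$ if $N \ge s/2$, and $|\partial_j\omega_s(x)| \le |s\,x_j\,\omega_{s-2}(x)|$.

Finally, $\mathcal{F}(\psi * \varphi) = \mathcal{F}(\psi)\mathcal{F}(\varphi)$, so that $\psi * \varphi = \widetilde{\mathcal{F}}(\mathcal{F}(\psi)\mathcal{F}(\varphi))$ and $\psi*$ is the product of continuous operators. □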

7.3. Tempered distributions

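If $\mathcal{S}'(\mathbf{R}^n)$ is the topological dual space of $\mathcal{S}(\mathbf{R}^n)$, it follows from Theorem 7.4 that the mapping $u \in \mathcal{S}'(\mathbf{R}^n) \mapsto v = u|_{\mathcal{D}(\mathbf{R}^n)} \in \mathcal{D}'(\mathbf{R}^n)$ is linear and one-to-one. A distribution $v \in \mathcal{D}'(\mathbf{R}^n)$ is the restriction of an element $u \in \mathcal{S}'(\mathbf{R}^n)$ if and only if there exist $N \in \mathbf{N}$ and a constant $C_N > 0$ such that
\[ |v(\varphi)| \le C_N q_N(\varphi) = C_N\sup_{x \in \mathbf{R}^n;\ |\alpha| \le N} (1 + |x|^2)^N |D^\alpha\varphi(x)| \quad (\varphi \in \mathcal{D}(\mathbf{R}^n)). \]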

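The elements of $\mathcal{S}'(\mathbf{R}^n)$, or their restrictions to $\mathcal{D}(\mathbf{R}^n)$, are called tempered distributions. On $\mathcal{S}'(\mathbf{R}^n)$ we consider the topology $w^*$, so that $u_k \to u$ in $\mathcal{S}'(\mathbf{R}^n)$ means that $\langle \varphi, u_k \rangle \to \langle \varphi, u \rangle$ for every $\varphi \in \mathcal{S}(\mathbf{R}^n)$.

It is customary to identify every $u \in \mathcal{S}'(\mathbf{R}^n)$ with its restriction $v \in \mathcal{D}'(\mathbf{R}^n)$, so that $\mathcal{S}'(\mathbf{R}^n) \subset \mathcal{D}'(\mathbf{R}^n)$.

Example 7.8. Suppose $1 \le p < \infty$ and $N \in \mathbf{N}$. If $(1 + |x|^2)^{-N}f(x)$ is in $L^p(\mathbf{R}^n)$, then, as a distribution, $f \in \mathcal{S}'(\mathbf{R}^n)$. In particular, $L^p(\mathbf{R}^n) \hookrightarrow \mathcal{S}'(\mathbf{R}^n)$ ($1 \le p \le \infty$), and this inclusion is continuous.

If $p = 1$,
\[ |\langle \varphi, f \rangle| = \Big|\int_{\mathbf{R}^n} (1 + |x|^2)^{-N}f(x)(1 + |x|^2)^N\varphi(x)\,dx\Big| \le \int_{\mathbf{R}^n} (1 + |x|^2)^{-N}|f(x)|\,q_N(\varphi)\,dx = C_N q_N(\varphi). \]
If $p > 1$, using Hölder's inequality with $C_N := \int_{\mathbf{R}^n} |(1 + |x|^2)^{-N}f(x)|^p\,dx$, we obtain
\[ |\langle \varphi, f \rangle| \le C_N^{1/p}\Big(\int_{\mathbf{R}^n} \big|(1 + |x|^2)^N\varphi(x)\big|^{p'}\,dx\Big)^{1/p'} \]
and, if $M$ is such that $\int_{\mathbf{R}^n} (1 + |x|^2)^{(N-M)p'}\,dx = C < \infty$, a division and multiplication by $(1 + |x|^2)^M$ give
\[ |\langle \varphi, f \rangle| \le C_N^{1/p}C^{1/p'}\sup_{x \in \mathbf{R}^n} (1 + |x|^2)^M|\varphi(x)| \le K q_M(\varphi). \]

Example 7.9. Every distribution with compact support, $u \in \mathcal{E}'(\mathbf{R}^n)$, is a tempered distribution.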

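The inclusion $\mathcal{S}(\mathbf{R}^n) \hookrightarrow \mathcal{E}(\mathbf{R}^n)$ is continuous, since the topology of $\mathcal{E}(\mathbf{R}^n)$ is defined by the seminorms
\[ p_{K,N}(\varphi) = \sup_{|\alpha| \le N} \|D^\alpha\varphi\|_K \]
and $p_{K,N}(\varphi) \le q_N(\varphi)$ if $\varphi \in \mathcal{S}(\mathbf{R}^n)$.

Moreover $\mathcal{S}(\mathbf{R}^n)$ is dense in $\mathcal{E}(\mathbf{R}^n)$, since $\mathcal{D}(\mathbf{R}^n)$ is dense, and the restriction of $u \in \mathcal{E}'(\mathbf{R}^n)$ to $\mathcal{S}(\mathbf{R}^n)$ is a tempered distribution.

Let $P$ be a polynomial with constant coefficients, $\psi \in \mathcal{S}(\mathbf{R}^n)$, and $u \in \mathcal{S}'(\mathbf{R}^n)$. As in the case of general distributions, we define $Pu$, $P(D)u$, and $\psi u$ by
\[ \langle \varphi, P(D)u \rangle = \langle P(-D)\varphi, u \rangle, \quad \langle \varphi, Pu \rangle = \langle P\varphi, u \rangle, \quad \langle \varphi, \psi u \rangle = \langle \psi\varphi, u \rangle, \]
where $P(-D) = \sum_{|\alpha| \le N} c_\alpha(-1)^{|\alpha|}D^\alpha$ if $P(x) = \sum_{|\alpha| \le N} c_\alpha x^\alpha$ (the substitution of every $x_j^{\alpha_j}$ by $(-\partial_j)^{\alpha_j}$, and $D^\alpha = \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n}$). They belong to $\mathcal{S}'(\mathbf{R}^n)$, since they are the composition of continuous linear mappings. For instance,
\[ P(D)u : \varphi \mapsto P(-D)\varphi \mapsto u(P(-D)\varphi), \]
where $P(-D)$, $P\cdot$, and $\psi\cdot$ are continuous on $\mathcal{S}(\mathbf{R}^n)$, their transposes are $P(D)$, $P\cdot$, and $\psi\cdot$, and they are continuous linear operators of $\mathcal{S}'(\mathbf{R}^n)$.

Also translations, modulations, dilations, and the symmetry defined, respectively, by
\[ \langle \varphi, \tau_a u \rangle = \langle \tau_{-a}\varphi, u \rangle, \quad \langle \varphi, e_a u \rangle = \langle e_a\varphi, u \rangle, \quad \langle \varphi(x), u(hx) \rangle = \langle h^{-n}\varphi(h^{-1}x), u(x) \rangle, \quad \langle \varphi, \tilde u \rangle = \langle \tilde\varphi, u \rangle, \]
are continuous linear mappings of $\mathcal{S}'(\mathbf{R}^n)$ into $\mathcal{S}'(\mathbf{R}^n)$.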

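7.3.1. Fourier transform of tempered distributions. Property (c) of the Fourier integral on $L^1(\mathbf{R}^n)$ suggests that we may also define the Fourier transform $\widehat{u}$ of any tempered distribution $u$ by
\[ \langle \varphi, \widehat{u} \rangle := \langle \widehat{\varphi}, u \rangle \quad (\varphi \in \mathcal{S}(\mathbf{R}^n)). \]
Since $\mathcal{F}$ is continuous on $\mathcal{S}(\mathbf{R}^n)$, $\widehat{u} = u \circ \mathcal{F} \in \mathcal{S}'(\mathbf{R}^n)$ for any $u \in \mathcal{S}'(\mathbf{R}^n)$. We still write $\mathcal{F}u = \widehat{u}$. Similarly, $\widetilde{\mathcal{F}}$ is defined on $\mathcal{S}'(\mathbf{R}^n)$ by $\langle \varphi, \widetilde{\mathcal{F}}u \rangle := \langle \widetilde{\mathcal{F}}\varphi, u \rangle$.

Theorem 7.10. The Fourier transform $\mathcal{F} : \mathcal{S}'(\mathbf{R}^n) \to \mathcal{S}'(\mathbf{R}^n)$ is a bijective continuous linear extension of $\mathcal{F} : L^1(\mathbf{R}^n) \to L^\infty(\mathbf{R}^n)$. The inverse of $\mathcal{F}$ on $\mathcal{S}'(\mathbf{R}^n)$ is $\widetilde{\mathcal{F}}$.

The behavior of $\mathcal{F}$ on $L^1(\mathbf{R}^n)$ and on $\mathcal{S}(\mathbf{R}^n)$ with respect to derivatives, translations, modulations, dilations, and symmetry extends to the Fourier transform on $\mathcal{S}'(\mathbf{R}^n)$, and $\widetilde{\mathcal{F}}u = \widetilde{\mathcal{F}u}$ (the symmetry of $\mathcal{F}u$).

Proof. If $f \in L^1(\mathbf{R}^n)$ and $u_f = \langle \cdot, f \rangle$, then $\mathcal{F}u_f = \langle \widehat{\,\cdot\,}, f \rangle$.

As the transpose of $\mathcal{F} : \mathcal{S}(\mathbf{R}^n) \to \mathcal{S}(\mathbf{R}^n)$, $\mathcal{F} : \mathcal{S}'(\mathbf{R}^n) \to \mathcal{S}'(\mathbf{R}^n)$ is weakly continuous. By property (c) of the Fourier integral, if $f \in L^1(\mathbf{R}^n)$, then $\mathcal{F}u_f = u_{\widehat{f}}$ and $\mathcal{F} : L^1(\mathbf{R}^n) \to L^\infty(\mathbf{R}^n)$ is the restriction of $\mathcal{F} : \mathcal{S}'(\mathbf{R}^n) \to \mathcal{S}'(\mathbf{R}^n)$.

Also, $\widetilde{\mathcal{F}}\mathcal{F} = \mathrm{Id}$ and $\mathcal{F}\widetilde{\mathcal{F}} = \mathrm{Id}$, since
\[ \langle \varphi, \widetilde{\mathcal{F}}\mathcal{F}u \rangle = \langle \mathcal{F}\widetilde{\mathcal{F}}\varphi, u \rangle = \langle \varphi, u \rangle. \]
Let us consider the behavior of $\mathcal{F}$ with respect to dilations:
\[ \langle \varphi(y), \mathcal{F}[u(hx)](y) \rangle = \langle (\mathcal{F}\varphi)(x), u(hx) \rangle = \langle h^{-n}(\mathcal{F}\varphi)(h^{-1}x), u(x) \rangle = \langle \mathcal{F}[\varphi(hy)](x), u(x) \rangle = \langle \varphi(hy), (\mathcal{F}u)(y) \rangle = \langle \varphi(y), h^{-n}(\mathcal{F}u)(h^{-1}y) \rangle, \]
and $\widehat{u(hx)}$ is the distribution $h^{-n}\widehat{u}(h^{-1}y)$. We leave it to the reader to check the remaining statements.

For differential operators, $P(D) = \sum_{|\alpha| \le m} c_\alpha D^\alpha$ ($c_\alpha \in \mathbf{C}$),
\[ \langle \varphi, \mathcal{F}P(D)u \rangle = \langle P(-D)\mathcal{F}\varphi, u \rangle = \langle P(2\pi i\xi)\varphi(\xi), \widehat{u}(\xi) \rangle \]
and $\mathcal{F}P(D)u$ is the distribution $P(2\pi i\xi)\mathcal{F}u(\xi)$. □

Example 7.11. $\widehat{1} = \delta$, since $\langle \varphi, \widehat{1} \rangle = \int_{\mathbf{R}^n} \widehat{\varphi}(\xi)\,d\xi = \varphi(0) = \langle \varphi, \delta \rangle$. As a modulation of 1, the Fourier transform in $\mathcal{S}'(\mathbf{R}^n)$ of the function $e^{2\pi i\, a \cdot x}$ is $\tau_a\delta = \delta_a$.

Similarly, $\widehat{\delta} = 1$, and $\widehat{\delta_a}(\xi) = e^{-2\pi i\, a \cdot \xi}$.

Example 7.12. Every $f \in L^2_T(\mathbf{R})$ is a tempered distribution such that, in $\mathcal{S}'(\mathbf{R})$,
\[ f = \sum_{k=-\infty}^{\infty} c_k(f)e^{2k\pi i t/T} \quad\text{and}\quad \widehat{f} = \sum_{k=-\infty}^{+\infty} c_k(f)\delta_{k/T}. \]
Here
\[ c_k(f) = \frac{1}{T}\int_a^{a+T} f(t)e^{-2\pi i k t/T}\,dt \quad (k \in \mathbf{Z}) \]
are the Fourier coefficients of $f$.

Since, for a constant $C_T$ depending only on $T$,
\[ \int_{\mathbf{R}} \frac{|f(t)|^2}{1 + t^2}\,dt \le C_T\sum_{k=-\infty}^{+\infty}\int_{kT}^{(k+1)T} \frac{|f(t)|^2}{1 + k^2}\,dt = C\|f\|_2^2, \]
it follows from Example 7.8 that $L^2_T(\mathbf{R}) \hookrightarrow \mathcal{S}'(\mathbf{R})$, continuously. Then
\[ f = \sum_{k=-\infty}^{\infty} c_k(f)e^{2k\pi i t/T} \]
in $\mathcal{S}'(\mathbf{R})$, since this is true in $L^2_T(\mathbf{R})$.

But the Fourier transform is linear and continuous in $\mathcal{S}'(\mathbf{R})$, and maps $e^{2k\pi i t/T}$ into $\delta_{k/T}$, so that $\widehat{f} = \sum_{k=-\infty}^{\infty} c_k(f)\delta_{k/T}$ in $\mathcal{S}'(\mathbf{R})$.$^{4}$

7.3.2. Plancherel Theorem. We also have $L^p(\mathbf{R}^n) \subset \mathcal{S}'(\mathbf{R}^n)$, and the action of $\mathcal{F}$ on $L^2(\mathbf{R}^n)$ is especially important.

Theorem 7.13 (Plancherel$^{5}$). The restrictions of $\mathcal{F}$ and $\widetilde{\mathcal{F}}$ to $L^2(\mathbf{R}^n)$ are linear bijective isometries such that $\widetilde{\mathcal{F}} = \mathcal{F}^{-1}$.

4 These results are also true for $f \in L^p_T(\mathbf{R})$ ($p > 1$), since $f = \sum_{k=-\infty}^{\infty} c_k(f)e^{2k\pi i t/T}$ in $L^p_T(\mathbf{R})$.
5 Named after the Swiss mathematician Michel Plancherel, who in 1910 established conditions under which the theorem holds. It was first used in 1889 by Lord Rayleigh (John William Strutt) in the investigation of blackbody radiation.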


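Proof. Since $\mathcal{D}(\mathbf{R}^n)$ is dense in $L^2(\mathbf{R}^n)$ ($\langle \varphi, f \rangle = 0$ for every $\varphi$ implies $f = 0$, so that $\mathcal{D}(\mathbf{R}^n)^\perp = \{0\}$ in $L^2(\mathbf{R}^n)$), the larger subspace $\mathcal{S}(\mathbf{R}^n)$ is also dense.

The identity $\int_{\mathbf{R}^n} \widehat{\varphi}(y)g(y)\,dy = \int_{\mathbf{R}^n} \varphi(y)\widehat{g}(y)\,dy$ holds for all $\varphi, g \in \mathcal{S}(\mathbf{R}^n)$. If $g = \overline{\widehat{\psi}}$, then $\widehat{g} = \bar\psi$ and $\int_{\mathbf{R}^n} \widehat{\varphi}(y)\overline{\widehat{\psi}(y)}\,dy = \int_{\mathbf{R}^n} \varphi(y)\overline{\psi(y)}\,dy$, i.e., $(\widehat{\varphi}, \widehat{\psi})_2 = (\varphi, \psi)_2$. This shows that $\mathcal{F}$ is an $L^2$-isometry on $\mathcal{S}(\mathbf{R}^n)$ that by continuity extends to a unique isometry $\mathcal{F}_2$ of $L^2(\mathbf{R}^n)$.

The restriction of $\mathcal{F}$ on $\mathcal{S}'(\mathbf{R}^n)$ to $\mathcal{F} : L^2(\mathbf{R}^n) \to \mathcal{S}'(\mathbf{R}^n)$ is continuous, since the inclusion $L^2(\mathbf{R}^n) \hookrightarrow \mathcal{S}'(\mathbf{R}^n)$ is continuous, and it coincides with $\mathcal{F}_2$ on the dense subspace $\mathcal{S}(\mathbf{R}^n)$; hence $\mathcal{F}_2 = \mathcal{F}$ on $L^2(\mathbf{R}^n)$.

The operator $\widetilde{\mathcal{F}}$, such that $\widetilde{\mathcal{F}}u = \widetilde{\mathcal{F}u}$, has a similar behavior. To show that $\widetilde{\mathcal{F}} = \mathcal{F}^{-1}$ on $L^2(\mathbf{R}^n)$, note that $\widetilde{\mathcal{F}}\mathcal{F} = \mathrm{Id} = \mathcal{F}\widetilde{\mathcal{F}}$ on $\mathcal{S}(\mathbf{R}^n)$, so that also $\widetilde{\mathcal{F}}\mathcal{F} = \mathrm{Id} = \mathcal{F}\widetilde{\mathcal{F}}$ on $L^2(\mathbf{R}^n)$. □

Remark 7.14. If $f, g \in L^2(\mathbf{R}^n)$, then $\int_{\mathbf{R}^n} \widehat{f}(y)g(y)\,dy = \int_{\mathbf{R}^n} f(y)\widehat{g}(y)\,dy$, and on $L^2(\mathbf{R}^n)$ the Fourier transform can be seen as an improper integral for the convergence in $L^2(\mathbf{R}^n)$,
\[ \widehat{f}(\xi) = \lim_{R \uparrow \infty}\int_{B(0,R)} f(x)e^{-2\pi i\, x \cdot \xi}\,dx. \]
Note that $\chi_{B(0,R)}f \in L^1(\mathbf{R}^n) \cap L^2(\mathbf{R}^n)$ and
\[ f = \lim_{R \uparrow \infty} \chi_{B(0,R)}f \]
in $L^2(\mathbf{R}^n)$, so that
\[ \widehat{f} = \lim_{R \uparrow \infty} \mathcal{F}(\chi_{B(0,R)}f), \quad\text{with}\quad \mathcal{F}(\chi_{B(0,R)}f)(\xi) = \int_{B(0,R)} f(x)e^{-2\pi i\, x \cdot \xi}\,dx. \]
Obviously, instead of $\chi_{B(0,R)}$ we can use more regular functions, such as $\varrho(R^{-1}x)$ with $\bar B(0, 1) \prec \varrho \prec \mathbf{R}^n$.$^{6}$

Example 7.15. If $\Omega > 0$ and $h \in \mathbf{R}$, then $\operatorname{sinc}(2\Omega t + h)$ is in $L^2(\mathbf{R})$, and
\[ 2\Omega\,[\operatorname{sinc}(2\Omega t + h)]^{\wedge}(\xi) = e^{\pi i h\xi/\Omega}\chi_{[-\Omega,\Omega]}(\xi). \]
Indeed, since sinc is the Fourier transform of the square wave of Example 7.1, it belongs to $L^2(\mathbf{R})$, and the same happens with $\operatorname{sinc}(2\Omega t + h) = \operatorname{sinc}(2\Omega(t + h/2\Omega))$. By the Plancherel theorem $\widehat{\operatorname{sinc}} = \chi_{[-1/2,1/2]}$ and, using the properties of the Fourier transform,
\[ \Big[\operatorname{sinc}\,2\Omega\Big(t + \frac{h}{2\Omega}\Big)\Big]^{\wedge}(\xi) = e^{2\pi i\frac{h}{2\Omega}\xi}\,[\operatorname{sinc}(2\Omega t)]^{\wedge}(\xi) = \frac{1}{2\Omega}\,e^{\pi i\frac{h}{\Omega}\xi}\,\widehat{\operatorname{sinc}}(\xi/2\Omega) \]
with $\widehat{\operatorname{sinc}}(\xi/2\Omega) = \chi_{[-1/2,1/2]}(\xi/2\Omega) = \chi_{[-\Omega,\Omega]}(\xi)$.

6 If $n = 1$, $\widehat{f}(\xi) = \lim_{M \to \infty}\int_{-M}^{M} f(x)e^{-2\pi i x\xi}\,dx$ for almost all $\xi \in \mathbf{R}$ holds if $f \in L^2$. This result is equivalent to the Carleson theorem on the almost everywhere convergence of Fourier series, one of the most celebrated theorems in Fourier analysis.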

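We also know from Example 7.9 that the distributions with compact support are tempered, so that we can consider the restriction of $\mathcal{F}$ to the space $\mathcal{E}'(\mathbf{R}^n)$. Let us denote
\[ e_z(\xi) := e^{2\pi i\, z \cdot \xi} \quad (\xi \in \mathbf{R}^n,\ z \in \mathbf{C}^n). \]

Theorem 7.16. If $u \in \mathcal{E}'(\mathbf{R}^n)$, then $\widehat{u}$ is the restriction to $\mathbf{R}^n$ of the entire function
\[ F(z) := \langle e_{-z}, u \rangle. \]
For every $\alpha \in \mathbf{N}^n$ there is an integer $N$ such that the function
\[ (1 + |x|^2)^{-N/2}D^\alpha\widehat{u}(x) \]
is bounded.

Proof. The function $F$ is continuous on $\mathbf{C}^n$, since $e_z \to e_{z_0}$ in $\mathcal{E}(\mathbf{R}^n)$ as $z \to z_0$. Indeed, $D^\alpha e_z \to D^\alpha e_{z_0}$ uniformly on every compact subset of $\mathbf{R}^n$, since for one variable
\[ e^{2\pi i t z} - e^{2\pi i t z_0} = \int_{[z_0,z]} 2\pi i t\,e^{2\pi i t\zeta}\,d\zeta. \]
Let us show that $F$ is holomorphic in every variable $z_j$ by an application of Morera's theorem; that is, $\int_\gamma F(z)\,dz_j = 0$ if $\gamma$ is the oriented boundary of a rectangle in $\mathbf{C}$. By writing the integral as a limit of Riemann sums,
\[ \int_\gamma F(z)\,dz_j = \int_\gamma u(e_{-z})\,dz_j = \Big\langle \int_\gamma e_{-z}(t)\,dz_j, u(t) \Big\rangle = 0, \]
since $\int_\gamma e^{-2\pi i z_j t_j}\,dz_j = 0$ for every $t_j \in \mathbf{R}$.

For every $\varphi \in \mathcal{D}_{[a,b]^n}(\mathbf{R}^n)$,
\[ \langle \varphi(t), u(e_{-t}) \rangle = \int_{[a,b]^n} \varphi(t)\langle e_{-t}, u \rangle\,dt = \Big\langle \int_{[a,b]^n} \varphi(t)e_{-t}(x)\,dt, u(x) \Big\rangle = \langle \widehat{\varphi}, u \rangle \]
and $\widehat{u} = F$ on $\mathbf{R}^n$.

Note that by the continuity of $u$ on $\mathcal{E}(\mathbf{R}^n)$,
\[ |D^\alpha\widehat{u}(\xi)| = |\langle (-2\pi i)^{|\alpha|}x^\alpha e_{-\xi}(x), u(x) \rangle| \le C\sup_{|\beta| \le N,\ |x| \le N} |D^\beta(x^\alpha e_{-\xi}(x))| \]
and it follows that $|D^\alpha\widehat{u}(\xi)| \le C(1 + |\xi|^2)^{N/2}$. □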


7.4. Fourier transform and signal theory

A first main topic in the digital processing of signals is the analog-to-digital conversion by means of sampling, which changes a continuous time signal $f(t)$ into a discrete time signal $x = \{x[k]\}_{k=-\infty}^{\infty} \subset \mathbf{C}$, $x[k] := f(kT)$.

The band of an analog signal $f$ is the smallest interval $[-\Omega, \Omega]$ which supports its Fourier transform $\widehat{f}$ and we shall see that for a band-limited signal, that is, with $\Omega < \infty$, sampling can be done in an efficient manner.

It is worth observing that in this case $f$ is analytic: We know from Theorem 7.16 that, if $u \in \mathcal{S}'(\mathbf{R})$ has a Fourier transform with compact support $\widehat{u} \in \mathcal{E}'(\mathbf{R})$, then $u$ is the restriction to $\mathbf{R}$ of the entire function
\[ F(z) := \langle e^{2\pi i\xi z}, \widehat{u}(\xi) \rangle. \]
This shows that a signal cannot be simultaneously band-limited and time-limited.$^{7}$

Usually, analog signals are of finite time, so they are not of limited band, but they are almost band-limited in the sense that $\widehat{u} \simeq 0$ outside of some finite interval $[-\Omega, \Omega]$. Sometimes, filtering of the analog signal is convenient in order to reduce it to a band-limited signal.

We will suppose that $f \in L^2(\mathbf{R})$ and $\operatorname{supp}\widehat{f} \subset [-\Omega, \Omega]$, so $f$ is analytic. The minimal value $\Omega_N$ of $\Omega$ is called the Nyquist frequency of $f$.$^{8}$

Let us consider the $T$-periodized extension of $\widehat{f}$ with $T = 2\Omega$,
\[ \widehat{f}_T(\xi) = \sum_{k=-\infty}^{\infty} \widehat{f}(\xi - kT) \quad (T = 2\Omega), \]
which is in $L^2_T(\mathbf{R})$. Then
\[ (7.5)\qquad c_k(\widehat{f}_T) = \frac{1}{2\Omega}\,\widehat{\widehat{f}}\Big(\frac{k}{2\Omega}\Big) = \frac{1}{2\Omega}\,f\Big(\frac{-k}{2\Omega}\Big) \]
and
\[ \widehat{f}_T(\xi) = \frac{1}{2\Omega}\sum_{k=-\infty}^{\infty} f\Big(\frac{-k}{2\Omega}\Big)e^{\pi i k\xi/\Omega}, \]
with $L^2(-\Omega, \Omega)$-convergence of this Fourier series.

According to the inversion theorem, $f(t) = \int_{-\Omega}^{\Omega} \widehat{f}(\xi)e^{2\pi i t\xi}\,d\xi$ for every $t \in \mathbf{R}$, and the scalar product by $e^{-2\pi i t\xi}$ on $[-\Omega, \Omega]$ gives the pointwise identity
\[ f(t) = \frac{1}{2\Omega}\sum_{k=-\infty}^{\infty} f\Big(\frac{-k}{2\Omega}\Big)\int_{-\Omega}^{\Omega} e^{2\pi i(t + k/2\Omega)\xi}\,d\xi \]
and, by Example 7.15,
\[ (7.6)\qquad \frac{1}{2\Omega}\int_{-\Omega}^{\Omega} e^{2\pi i(t + k/2\Omega)\xi}\,d\xi = \operatorname{sinc}(2\Omega t + k). \]
This shows that we obtain a complete reconstruction of $f(t)$,
\[ (7.7)\qquad f(t) = \sum_{k=-\infty}^{\infty} f\Big(\frac{k}{2\Omega}\Big)\operatorname{sinc}(2\Omega t - k), \]
with pointwise convergence, by sampling with the sampling period $T_m = 1/2\Omega$. Our aim is to prove that in fact we have uniform and $L^2$ convergence.

Lemma 7.17. The functions
\[ (7.8)\qquad \sqrt{2\Omega}\,\operatorname{sinc}(2\Omega t - k) \quad (k \in \mathbf{Z}) \]
form an orthonormal system in $L^2(\mathbf{R})$.

7 This is a version of the uncertainty principle.
8 Named after the Swedish engineer Harry Nyquist, who in 1927 determined that the number of independent pulses that could be put through a telegraph channel per unit of time is limited to twice the bandwidth of the channel.

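Proof. If
\[ \varphi_k := [\operatorname{sinc}(2\Omega t - k)]^{\wedge} = [\operatorname{sinc}(2\Omega(t - k/2\Omega))]^{\wedge}, \]
according to Example 7.15 (with $h = -k$)
\[ \varphi_k(\xi) = e^{-\pi i k\xi/\Omega}\,[\operatorname{sinc}(2\Omega t)]^{\wedge}(\xi) = e^{-\pi i k\xi/\Omega}\,\frac{1}{2\Omega}\,\chi_{[-\Omega,\Omega]}(\xi). \]
From the Plancherel theorem,
\[ \big([\operatorname{sinc}(2\Omega t - m)], [\operatorname{sinc}(2\Omega t - n)]\big)_2 = (\varphi_m, \varphi_n)_2, \]
and then
\[ (2\Omega)^2(\varphi_m, \varphi_n)_2 = \int_{-\Omega}^{\Omega} e^{-\pi i(m-n)\xi/\Omega}\,d\xi = 0 \]
if $m \ne n$, and $(2\Omega)^2\|\varphi_m\|_2^2 = 2\Omega$, so that
\[ \|\operatorname{sinc}(2\Omega t - m)\|_2 = 1/\sqrt{2\Omega}. \]
□

The family of functions (7.8) is called the Shannon system.

Theorem 7.18 (Shannon$^{9}$). Suppose $f \in L^2(\mathbf{R})$ and $\operatorname{supp}\widehat{f} \subset [-\Omega, \Omega]$, so that we can assume that $f$ is continuous. Then
\[ f(t) = \sum_{k=-\infty}^{\infty} f\Big(\frac{k}{2\Omega}\Big)\operatorname{sinc}(2\Omega t - k) \]
in $L^2(\mathbf{R})$ and uniformly.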

9 Named after the electrical engineer and mathematician Claude Elwood Shannon, the founder of information theory in 1948.


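Proof. We have seen in (7.7) that the sequence
\[ s_N(f, t) = \sum_{k=-N}^{N} f\Big(\frac{k}{2\Omega}\Big)\operatorname{sinc}(2\Omega t - k) = \sum_{k=-N}^{N} f\Big(\frac{k}{2\Omega}\Big)\frac{1}{2\Omega}\int_{-\Omega}^{\Omega} e^{2\pi i t\xi}e^{-2\pi i k\xi/2\Omega}\,d\xi \]
is pointwise convergent to $f(t)$ and that
\[ f(t) = \int_{-\Omega}^{\Omega} \widehat{f}(\xi)e^{2\pi i\xi t}\,d\xi. \]
Hence,
\[ |f(t) - s_N(f, t)| = \Big|\int_{-\Omega}^{\Omega}\Big(\widehat{f}(\xi) - \frac{1}{2\Omega}\sum_{k=-N}^{N} f\Big(\frac{k}{2\Omega}\Big)e^{-2\pi i k\xi/2\Omega}\Big)e^{2\pi i\xi t}\,d\xi\Big| \]
and, by Schwarz inequality,
\[ |f(t) - s_N(f, t)| \le \Big(\int_{-\Omega}^{\Omega}\Big|\widehat{f}(\xi) - \frac{1}{2\Omega}\sum_{k=-N}^{N} f\Big(\frac{k}{2\Omega}\Big)e^{-2\pi i k\xi/2\Omega}\Big|^2\,d\xi\Big)^{1/2}(2\Omega)^{1/2}. \]
We know from (7.5) that the Fourier coefficients of $\widehat{f}_T$ with respect to the trigonometric system are
\[ c_k(\widehat{f}_T) = \frac{1}{2\Omega}\,f\Big(\frac{-k}{2\Omega}\Big). \]
Hence,
\[ \frac{1}{2\Omega}\sum_{k=-N}^{N} f\Big(\frac{-k}{2\Omega}\Big)e^{2\pi i k\xi/2\Omega} = S_N(\widehat{f}_T, \xi), \]
and $S_N(\widehat{f}_T) \to \widehat{f}_T$ in $L^2_T(\mathbf{R})$. Then,
\[ \sup_{t \in \mathbf{R}} |f(t) - s_N(f, t)| \le (2\Omega)^{1/2}\,\|\widehat{f} - S_N(\widehat{f}_T)\|_{L^2(-\Omega,\Omega)}, \]
which yields the uniform convergence.

Since $c(\widehat{f}_T) = \{c_k(\widehat{f}_T)\} \in \ell^2$ and the Shannon system is orthonormal,
\[ (7.9)\qquad g(t) = \sum_{k=-\infty}^{\infty} \frac{1}{\sqrt{2\Omega}}\,f\Big(\frac{k}{2\Omega}\Big)\sqrt{2\Omega}\,\operatorname{sinc}(2\Omega t - k) \]
in $L^2(\mathbf{R})$. But some subsequence of the partial sums is a.e. convergent to $g$ and uniformly convergent to $f$, so that $g = f$ as elements of $L^2(\mathbf{R})$. □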
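The reconstruction formula of the Shannon theorem can be tried on a concrete band-limited signal. The sketch below assumes $f(t) = \operatorname{sinc}^2(t)$, whose Fourier transform is the triangle function $\max(0, 1 - |\xi|)$ (the convolution of the square wave of Example 7.1 with itself), so that $\Omega = 1$. The choice of signal, the truncation of the series and the function names are illustrative assumptions.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi*x)/(pi*x), with the value 1 at x = 0."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def band_limited_signal(t):
    """Illustrative signal f(t) = sinc(t)^2, band-limited to [-1, 1]."""
    return sinc(t) ** 2


def shannon_partial_sum(f, omega, t, n_terms):
    """Partial sum s_N(f, t): sum over |k| <= N of f(k/(2*omega)) * sinc(2*omega*t - k)."""
    total = 0.0
    for k in range(-n_terms, n_terms + 1):
        total += f(k / (2.0 * omega)) * sinc(2.0 * omega * t - k)
    return total


if __name__ == "__main__":
    for t in (0.1, 0.37, 1.25, -2.6):
        approx = shannon_partial_sum(band_limited_signal, 1.0, t, 2000)
        print(t, approx, band_limited_signal(t))
```

For this signal the terms of the series decay like $|k|^{-3}$, so a few thousand samples already reproduce $f(t)$ to several digits at points that are not sampling nodes.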


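The Shannon theorem shows that the sampling rate of $\Omega_s = 2\Omega_N$ samples/sec is optimal, and it is called the Nyquist rate. If $\Omega_s > 2\Omega_N$, we are considering $\operatorname{supp}\widehat{f} \subset [-\Omega_s/2, \Omega_s/2]$, wider than the symmetric interval which supports $\widehat{f}$, and an unnecessary oversampling if $\Omega_s$ is much greater than $2\Omega_N$. If $\Omega_s < 2\Omega_N$, the function $g$ obtained in the sum (7.9) differs from the original signal $f$ and is called an alias of $f$.

In digital processing, the discrete time signals $x = \{x[k]\}_{k=-\infty}^{\infty}$ obtained by sampling from analog signals are usually of finite time, so that $x[k] = 0$ if $|k| > N$ for some $N$, but for technical reasons it is convenient to consider more general signals. We say that $x$ is slowly increasing if there exist two constants $N$ and $C$ such that $|x[k]| \le C|k|^N$ ($k \ne 0$).

The class of slowly increasing sequences is a vector space with the usual operations, and slowly increasing sequences can be considered tempered distributions:

Theorem 7.19. If $x$ is slowly increasing, then
\[ u_x := \sum_{k=-\infty}^{+\infty} x[k]\delta_k \]
defines a tempered distribution, and the correspondence $x \mapsto u_x \in \mathcal{S}'(\mathbf{R})$ is an injective linear mapping which shows that we can consider the slowly increasing sequences as elements of $\mathcal{S}'(\mathbf{R})$. The Fourier transform of $x$ as a tempered distribution is
\[ \widehat{x} = \sum_{k=-\infty}^{+\infty} x[k]e^{-2\pi i k\xi}. \]

Proof. If $u = \sum_{k=-\infty}^{+\infty} x[k]\delta_k = 0$, for every $n \in \mathbf{Z}$ we can choose $\varphi \in \mathcal{S}$ with support in $(n - 1, n + 1)$ such that $u(\varphi) = x[n]$, and then $x[n] = 0$. Since, assuming as we may that $N \ge 1$,
\[ (7.10)\qquad \sum_{k=-\infty}^{+\infty} |x[k]\varphi(k)| \le |x[0]\varphi(0)| + C\sum_{k \ne 0} |k|^{-2N}|k|^{4N}|\varphi(k)| \le K q_{2N}(\varphi), \]
$u : \varphi \mapsto \sum_{k=-\infty}^{+\infty} x[k]\varphi(k)$ is linear and continuous on $\mathcal{S}(\mathbf{R})$. It is the limit in $\mathcal{S}'$ of the partial sums $u_M = \sum_{k=-M}^{M} x[k]\delta_k$, since $u_M(\varphi) \to u(\varphi)$ if $\varphi \in \mathcal{S}$.

Moreover, $\widehat{u} = \sum_{k=-\infty}^{+\infty} x[k]\widehat{\delta_k} = \sum_{k=-\infty}^{+\infty} x[k]e^{-2\pi i k\xi}$. □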

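This Fourier transform (7.10) is called the spectrum of $x$. If $\{x[k]\} \in \ell^2$, we have convergence in $L^2_1(\mathbf{R})$ and $x \in \ell^2 \mapsto \hat x \in L^2_1(\mathbf{R})$ is a bijective isometry, such that $x[-k] = c_k(\hat x)$ $(k \in \mathbf{Z})$. If $\{x[k]\} \in \ell^1$, then $\hat x \in \mathcal{C}_1(\mathbf{R})$. Of course, $\ell^1 \subset \ell^2$, and every sequence in $\ell^2$ is slowly increasing (see Exercise 7.13).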


Example 7.20. The Fourier transform of x[j]=c sinc (cj)(0

In this example, $x \in \ell^2$, since $\hat x \in L^2_1(\mathbf{R})$, but $\hat x$ is not continuous, so that $x \notin \ell^1$.

The Fourier transform of a signal and that of its samples are related as follows:

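Theorem 7.21. Let $f \in L^2(\mathbf{R})$ be a band-limited signal and let $x[j] := f(j/2\Omega)$, with $\Omega \ge \Omega_N$. Then
$$\hat f(\xi) = \frac{1}{2\Omega}\,\hat x\Big(\frac{\xi}{2\Omega}\Big) \quad (|\xi| \le \Omega).$$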

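Proof. According to the inversion theorem,
$$x[j] = \int_{-1/2}^{1/2} \hat x(\xi)e^{2\pi i\xi j}\,d\xi = \frac{1}{2\Omega}\int_{-\Omega}^{\Omega} \hat x\Big(\frac{\xi}{2\Omega}\Big)e^{2\pi i\xi j/2\Omega}\,d\xi.$$
Also, by the $2\Omega$-periodicity of $e^{2\pi i\xi j/2\Omega}$,
$$x[j] = f\Big(\frac{j}{2\Omega}\Big) = \int_{-\infty}^{\infty} \hat f(\xi)e^{2\pi i\xi j/2\Omega}\,d\xi = \sum_{k=-\infty}^{\infty}\int_{-\Omega}^{\Omega} \hat f(\xi - 2\Omega k)e^{2\pi i\xi j/2\Omega}\,d\xi = \int_{-\Omega}^{\Omega}\Big(\sum_{k=-\infty}^{\infty} \hat f(\xi - 2\Omega k)\Big)e^{2\pi i\xi j/2\Omega}\,d\xi$$
and, from the uniqueness property of the Fourier coefficients,
$$\frac{1}{2\Omega}\,\hat x\Big(\frac{\xi}{2\Omega}\Big) = \sum_{k=-\infty}^{\infty} \hat f(\xi - 2\Omega k).$$
This relation means that
$$\hat x(\xi) = 2\Omega\sum_{k=-\infty}^{\infty} \hat f(2\Omega(\xi - k)),$$
where the right side is the $2\Omega$-periodized $\hat f$ scaled by the factor $2\Omega$. Since $\Omega \ge \Omega_N$, only the term $k = 0$ contributes if $|\xi| \le 1/2$, and then $\hat x(\xi) = 2\Omega\hat f(2\Omega\xi)$. □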
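As an illustration of Theorem 7.21, the relation $\hat x(\xi) = 2\Omega\hat f(2\Omega\xi)$ for $|\xi| \le 1/2$ can be checked on an explicit signal. This is a sketch under assumptions that are not in the text: the test signal is $f(t) = \operatorname{sinc}^2(t)$ with $\hat f(\xi) = \max(0, 1-|\xi|)$ and $\Omega_N = 1$, the spectrum series is truncated, and all function names are illustrative.

```python
import math


def sinc(t):
    """Normalized sinc, sin(pi t) / (pi t), with sinc(0) = 1."""
    return 1.0 if t == 0.0 else math.sin(math.pi * t) / (math.pi * t)


def f_hat(xi):
    """Fourier transform of the assumed signal f(t) = sinc(t)^2."""
    return max(0.0, 1.0 - abs(xi))


def spectrum_of_samples(xi, omega=1.0, n_terms=20000):
    """Truncated x_hat(xi) = sum_j x[j] exp(-2 pi i j xi), x[j] = f(j / (2 omega)).

    The samples are real and even in j, so the series reduces to cosines.
    """
    total = 1.0  # j = 0 term, x[0] = f(0) = 1
    for j in range(1, n_terms + 1):
        x_j = sinc(j / (2.0 * omega)) ** 2
        total += 2.0 * x_j * math.cos(2.0 * math.pi * j * xi)
    return total


if __name__ == "__main__":
    for omega in (1.0, 1.5):
        xi = 0.1
        print(omega, spectrum_of_samples(xi, omega), 2 * omega * f_hat(2 * omega * xi))
```

Both the critical rate ($\Omega = 1$) and an oversampled rate ($\Omega = 1.5$) give agreement up to the truncation error of the series.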


7.5. The Dirichlet problem in the half-space

In Theorem 6.33 we have obtained the solution $u(x) = \int_S P(x,y)f(y)\,dy$ of the homogeneous Dirichlet problem¹⁰ for the ball with the inhomogeneous boundary condition $u_{|S} = f$ by means of its Poisson kernel $P$. Here, as in (7.2) for the heat equation, we will use the Fourier transform as a tool to solve the homogeneous Dirichlet problem in the half-space $\mathbf{R}^{1+n}_+ := \{(t,x) \in \mathbf{R}\times\mathbf{R}^n;\ t > 0\}$,
(7.11) $$\partial_t^2 u + \Delta u = 0, \quad u(0,x) = f(x),$$
where $\Delta = \sum_{j=1}^n \partial_{x_j}^2$. We will be looking for bounded solutions,¹¹ that is, for bounded harmonic functions $u$ on the half-space $t > 0$ such that, in some sense, $u(t,x) \to f(x)$ as $t \downarrow 0$.

7.5.1. The Poisson integral in the half-space. The Fourier transform changes a linear differential equation with constant coefficients $P(D)u = f$, with $P(D) = \sum_{|\alpha|\le m} c_\alpha D^\alpha$, into the algebraic equation
$$P(2\pi i\xi)\hat u(\xi) = \hat f(\xi),$$
where $P(x) = \sum_{|\alpha|\le m} c_\alpha x^\alpha = \sum_{|\alpha|\le m} c_\alpha x_1^{\alpha_1}\cdots x_n^{\alpha_n}$. For this reason, to find an integral kernel for the Dirichlet problem (7.11) similar to the Poisson kernel for the ball, we apply the Fourier transform in $x$ to convert the partial differential equation in (7.11) into an ordinary differential equation. Assuming for the moment that $f \in \mathcal{S}(\mathbf{R}^n)$, for every $t > 0$ we obtain
$$\partial_t^2\hat u(t,\xi) - 4\pi^2|\xi|^2\hat u(t,\xi) = 0, \quad (\hat u(0,\xi) = \hat f(\xi)),$$
and we are led to solve an ordinary differential equation in $t$ for every $\xi \in \mathbf{R}^n$. The general solution of this equation is
$$\hat u(t,\xi) = A(\xi)e^{2\pi|\xi|t} + B(\xi)e^{-2\pi|\xi|t}, \quad A(\xi) + B(\xi) = \hat f(\xi).$$
If we want to apply the inverse Fourier transform, and also because of the boundedness condition, we must have $A(\xi) = 0$. Then $B(\xi) = \hat f(\xi)$ and

10 The work of Johann Peter Gustav Lejeune Dirichlet included potential theory, integration of hydrodynamic equations, convergence of trigonometric series and Fourier series, and the foundation of analytic number theory and algebraic number theory. In 1837 he proposed what is today the modern definition of a function. After Gauss's death, Dirichlet took over his post in Göttingen.
11 This boundedness requirement is imposed to obtain uniqueness; see Exercise 7.12.

with this choice
$$u(t,x) = (P_t * f)(x), \quad \hat P_t(\xi) = e^{-2\pi|\xi|t},$$
where $P(t,x) := P_t(x)$ will be the Poisson kernel for the half-space. If $n = 1$, an easy computation will show that the Fourier transform of $g(x) = e^{-2\pi|x|}$ on $\mathbf{R}$ is
(7.12) $$\hat g(\xi) = \frac{1}{\pi}\,\frac{1}{1+\xi^2}.$$
Note that the family of functions
$$P_t(x) = \frac{1}{\pi}\,\frac{t}{t^2+x^2} \quad (t > 0)$$
is the Poisson summability kernel of Example 2.42, obtained from $P_1$ by letting $P_t(x) = (1/t)P_1(x/t)$. To check (7.12), a double partial integration in
$$\int_{\mathbf{R}} e^{-2\pi|x|}e^{-2\pi i\xi x}\,dx = 2\int_0^\infty e^{-2\pi x}\cos(2\pi\xi x)\,dx$$
shows that
$$\hat g(\xi) = -\frac{1}{\pi}\Big[e^{-2\pi x}\cos(2\pi\xi x)\Big]_0^\infty - 2\xi\int_0^\infty e^{-2\pi x}\sin(2\pi\xi x)\,dx = \frac{1}{\pi} - 2\xi\int_0^\infty e^{-2\pi x}\sin(2\pi\xi x)\,dx = \frac{1}{\pi} - \xi^2\hat g(\xi).$$
Thus
$$\hat g(\xi) = \frac{1}{\pi}\,\frac{1}{1+\xi^2}.$$
Obviously $P_1 = \hat g > 0$ and $\int_{\mathbf{R}} P_1(x)\,dx = g(0) = 1$. This result is extended for $n > 1$, but the calculation is somewhat more involved and it will be obtained from (7.1) and from
(7.13) $$\int_0^\infty \frac{e^{-t}}{\sqrt t}\,e^{-a^2/4t}\,dt = \sqrt\pi\,e^{-a}$$
for $a > 0$. To prove (7.13), we will use the obvious identity
$$\int_0^\infty e^{-(1+s^2)t}\,dt = 1/(1+s^2)$$
and also
$$\int_{\mathbf{R}} \frac{e^{iat}}{1+t^2}\,dt = \pi e^{-a},$$
which follows from (7.12) by a change of variables.


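Indeed,
$$e^{-a} = \frac{1}{\pi}\int_{\mathbf{R}} e^{iat}\int_0^\infty e^{-(1+t^2)s}\,ds\,dt = \frac{1}{\pi}\int_0^\infty e^{-s}\int_{\mathbf{R}} e^{iat}e^{-st^2}\,dt\,ds = 2\int_0^\infty e^{-s}\int_{\mathbf{R}} e^{2\pi iax}e^{-4\pi^2sx^2}\,dx\,ds = \int_0^\infty e^{-s}e^{-a^2/4s}\frac{1}{\sqrt{s\pi}}\,ds$$
and (7.13) follows. Now we are ready to prove that
(7.14) $$P(t,x) = c_n\frac{t}{(t^2+|x|^2)^{(n+1)/2}} \quad \Big(c_n = \frac{\Gamma\big(\frac{n+1}{2}\big)}{\pi^{(n+1)/2}}\Big);$$
that is,
$$\int_{\mathbf{R}^n} e^{-2\pi|\xi|t}e^{-2\pi ix\cdot\xi}\,d\xi = c_n\frac{t}{(t^2+|x|^2)^{(n+1)/2}}.$$
A change of variables allows us to suppose $t = 1$ and, using (7.1) and (7.13),
$$P(1,x) = \int_{\mathbf{R}^n} e^{-2\pi|\xi|}e^{-2\pi ix\cdot\xi}\,d\xi = \frac{1}{\sqrt\pi}\int_{\mathbf{R}^n}\Big(\int_0^\infty \frac{e^{-t}}{\sqrt t}e^{-4\pi^2|\xi|^2/4t}\,dt\Big)e^{-2\pi ix\cdot\xi}\,d\xi$$
$$= \int_0^\infty \frac{e^{-t}}{\sqrt{t\pi}}\Big(\int_{\mathbf{R}^n} e^{-\pi^2|\xi|^2/t}e^{-2\pi ix\cdot\xi}\,d\xi\Big)dt = \pi^{-(n+1)/2}(1+|x|^2)^{-(n+1)/2}\int_0^\infty e^{-s}s^{(n-1)/2}\,ds = \frac{c_n}{(1+|x|^2)^{(n+1)/2}}.$$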
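The closed forms (7.12) and (7.13) and the constant $c_n$ of (7.14) can be sanity-checked by quadrature. The sketch below is only illustrative: the midpoint rule, the truncation of the integrals to $[0,10]$, the substitution $t = s^2$ in (7.13), and all function names are assumptions made here, not part of the text.

```python
import math


def midpoint(func, a, b, n):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def g_hat(xi):
    """2 * int_0^inf exp(-2 pi x) cos(2 pi xi x) dx, truncated to [0, 10]."""
    return 2.0 * midpoint(
        lambda x: math.exp(-2.0 * math.pi * x) * math.cos(2.0 * math.pi * xi * x),
        0.0, 10.0, 100_000,
    )


def subordination_integral(a):
    """Left side of (7.13) after t = s^2: 2 * int_0^inf exp(-s^2 - a^2/(4 s^2)) ds."""
    return 2.0 * midpoint(
        lambda s: math.exp(-s * s - a * a / (4.0 * s * s)),
        0.0, 10.0, 100_000,
    )


def c(n):
    """Constant c_n = Gamma((n + 1) / 2) / pi^((n + 1) / 2) from (7.14)."""
    return math.gamma((n + 1) / 2.0) / math.pi ** ((n + 1) / 2.0)


if __name__ == "__main__":
    print(g_hat(1.0), 1.0 / (2.0 * math.pi))
    print(subordination_integral(1.0), math.sqrt(math.pi) * math.exp(-1.0))
    print(c(1), 1.0 / math.pi)
```

For $n = 1$ the constant reduces to $c_1 = 1/\pi$, in agreement with the one-dimensional kernel $P_t(x) = \frac{1}{\pi}\frac{t}{t^2+x^2}$.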

According to Theorem 2.41, since $P_t = P(t,\cdot)$ is a summability kernel, if $f \in \mathcal{C}(\mathbf{R}^n)$ tends to 0 at infinity, then $u(t,x) = (P_t * f)(x) \to f(x)$ uniformly as $t \downarrow 0$. If $f \in L^p(\mathbf{R}^n)$ $(1 \le p < \infty)$, then $u(t,\cdot) \to f$ in $L^p$ when $t \downarrow 0$. Thus, in both cases, $f$ can be considered as the boundary value of $u$, defined on $t > 0$. Moreover, a direct calculation shows that $(\partial_t^2 + \Delta)P(t,x) = 0$, and then $u$ is harmonic on the half-space $t > 0$, since
$$(\partial_t^2 + \Delta)\int_{\mathbf{R}^n} P(t,x-y)f(y)\,dy = 0.$$
Note also that if $|f| \le C$, then $|u(t,x)| \le C\int_{\mathbf{R}^n} P(t,x-y)\,dy = C$.
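The two facts used here, harmonicity of the kernel and total mass one, can be checked numerically in the case $n = 1$. This is a minimal sketch with assumptions of its own: centered finite differences with a hypothetical step $h$, a midpoint rule on a truncated interval $[-L, L]$, and illustrative function names.

```python
import math


def poisson(t, x):
    """One-dimensional Poisson kernel P(t, x) = t / (pi (t^2 + x^2))."""
    return t / (math.pi * (t * t + x * x))


def laplacian_fd(t, x, h=1e-3):
    """Centered second differences approximating (d_t^2 + d_x^2) P(t, x)."""
    d_tt = (poisson(t + h, x) - 2.0 * poisson(t, x) + poisson(t - h, x)) / h**2
    d_xx = (poisson(t, x + h) - 2.0 * poisson(t, x) + poisson(t, x - h)) / h**2
    return d_tt + d_xx


def truncated_mass(t, half_width=1000.0, n=200_000):
    """Midpoint rule for the integral of P(t, .) over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    return h * sum(poisson(t, -half_width + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    print(laplacian_fd(1.0, 0.5))   # close to 0
    print(truncated_mass(1.0))      # close to (2 / pi) atan(1000), just below 1
```

The truncated mass equals $(2/\pi)\arctan(L/t)$, which tends to 1 as $L \to \infty$; this is the normalization behind the bound $|u(t,x)| \le C$.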


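7.5.2. The Hilbert transform. In the two-variables case we write
$$P(x,y) = \frac{1}{\pi}\,\frac{y}{x^2+y^2}.$$
We have seen that it is a harmonic function on the half-plane $y > 0$. This also follows from the fact that it is the real part of the holomorphic function
$$\frac{i}{\pi z} = \frac{1}{\pi}\,\frac{y+ix}{x^2+y^2} = P_y(x) + iQ_y(x).$$
The imaginary part,
$$Q_y(x) = Q(x,y) = \frac{1}{\pi}\,\frac{x}{x^2+y^2},$$
which is the conjugate function of $P(x,y)$, is the conjugate Poisson kernel, which for every $x \ne 0$ satisfies
$$\lim_{y\downarrow0} Q_y(x) = \frac{1}{\pi}\,\frac{1}{x}.$$
This limit is not locally integrable and does not appear directly as a distribution, but we can consider its principal value as a regularization of $1/x$ as in Exercise 6.16. Let us describe it as a limit in $\mathcal{S}'(\mathbf{R})$ of $Q_y$: By definition, for every $\varphi \in \mathcal{S}(\mathbf{R})$,
$$\Big\langle \varphi(x), \operatorname{pv}\frac{1}{x}\Big\rangle := \operatorname{pv}\int_{-\infty}^{+\infty}\frac{\varphi(x)}{x}\,dx = \lim_{\varepsilon\downarrow0}\int_{\mathbf{R}} h_\varepsilon(x)\varphi(x)\,dx$$
if $h_\varepsilon(x) = x^{-1}\chi_{\{|x|>\varepsilon\}}(x)$. This limit exists since $\int_{\varepsilon<|x|<1} x^{-1}\varphi(0)\,dx = 0$ and then
$$\Big\langle \varphi(x), \operatorname{pv}\frac{1}{x}\Big\rangle = \int_{|x|<1}\frac{\varphi(x)-\varphi(0)}{x}\,dx + \int_{|x|>1}\frac{\varphi(x)}{x}\,dx.$$

Theorem 7.22. In $\mathcal{S}'(\mathbf{R})$,
$$\frac{1}{\pi}\operatorname{pv}\frac{1}{x} = \lim_{y\downarrow0} Q_y$$
and
$$\mathcal{F}\Big(\frac{1}{\pi}\operatorname{pv}\frac{1}{x}\Big) = -i\,\operatorname{sgn}.$$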

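Proof. For the first equality, we only need to show that $F_\varepsilon := \pi Q_\varepsilon - h_\varepsilon \to 0$ in $\mathcal{S}'(\mathbf{R})$ as $\varepsilon \to 0$. But we note that, for every $\varphi \in \mathcal{S}(\mathbf{R})$,
$$\langle\varphi, F_\varepsilon\rangle = \int_{\{|x|<\varepsilon\}}\frac{x\varphi(x)}{\varepsilon^2+x^2}\,dx + \int_{\{|x|>\varepsilon\}}\Big(\frac{x\varphi(x)}{\varepsilon^2+x^2} - \frac{\varphi(x)}{x}\Big)dx = \int_{\{|x|<1\}}\frac{x\varphi(\varepsilon x)}{1+x^2}\,dx - \int_{\{|x|>1\}}\frac{\varphi(\varepsilon x)}{x(1+x^2)}\,dx$$
and both integrals tend to 0 as $\varepsilon \to 0$, by dominated convergence.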


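For the second formula, a direct computation of the Fourier transform of the function $-i(\operatorname{sgn}\xi)e^{-2\pi y|\xi|}$ shows that
$$\hat Q_y(\xi) = -i(\operatorname{sgn}\xi)e^{-2\pi y|\xi|}$$
and then
$$\mathcal{F}\Big(\frac{1}{\pi}\operatorname{pv}\frac{1}{x}\Big)(\xi) = \lim_{y\downarrow0} -i(\operatorname{sgn}\xi)e^{-2\pi y|\xi|} = -i(\operatorname{sgn}\xi)$$
in $\mathcal{S}'(\mathbf{R})$, since
$$\lim_{y\downarrow0}\int_{\mathbf{R}} -i(\operatorname{sgn}\xi)e^{-2\pi y|\xi|}\varphi(\xi)\,d\xi = \int_{\mathbf{R}} -i(\operatorname{sgn}\xi)\varphi(\xi)\,d\xi$$
for every $\varphi \in \mathcal{S}(\mathbf{R})$. □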

The Hilbert transform¹² is the fundamental map of harmonic analysis and signal theory $H : L^2(\mathbf{R}) \to L^2(\mathbf{R})$ defined by
$$\widehat{Hf}(\xi) = -i\,\operatorname{sgn}(\xi)\hat f(\xi) \quad (f \in L^2(\mathbf{R})).$$

Theorem 7.23. The Hilbert transform is a bijective linear isometry such that H2 = −I and H∗ = −H. For every ϕ ∈S(R),

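$$H\varphi = \frac{1}{\pi}\Big(\operatorname{pv}\frac{1}{x}\Big) * \varphi = \lim_{y\downarrow0}(Q_y * \varphi) \quad \text{in } \mathcal{S}'(\mathbf{R}).$$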

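Proof. If $m(\xi) := -i\,\operatorname{sgn}\xi$, it is clear that $M : f \mapsto mf$ is a bijective linear isometry of $L^2(\mathbf{R})$ and, by the Plancherel theorem, $H = \mathcal{F}^{-1}M\mathcal{F}$ is also a bijective linear isometry. Moreover $H^2 = -I$, since $m^2 = -1$, and
$$(Hf,g)_2 = (\mathcal{F}^{-1}M\mathcal{F}f, g)_2 = (M\mathcal{F}f, \mathcal{F}g)_2 = (\mathcal{F}f, -M\mathcal{F}g)_2 = (f, -Hg)_2.$$

With Theorem 7.22 in hand and from the properties of the convolution,
$$\mathcal{F}(H\varphi) = \lim_{y\downarrow0}\hat Q_y\hat\varphi = \lim_{y\downarrow0}\mathcal{F}(Q_y * \varphi),$$
so that $H\varphi = \lim_{y\downarrow0}(Q_y * \varphi)$ in $\mathcal{S}'(\mathbf{R})$. Also $H\varphi = \mathcal{F}^{-1}(M\mathcal{F}\varphi) = (\mathcal{F}^{-1}m) * \varphi = \frac{1}{\pi}\big(\operatorname{pv}\frac{1}{x}\big) * \varphi$. □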
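The multiplier description $\widehat{Hf} = -i\,\operatorname{sgn}(\xi)\hat f$ has a simple periodic, discrete analogue that can be run directly. The sketch below is not the operator on $L^2(\mathbf{R})$ of the text: it assumes a finite sample vector, a naive discrete Fourier transform, and zeroes the mean and Nyquist frequencies; the names `dft` and `discrete_hilbert` are illustrative.

```python
import cmath
import math


def dft(values, sign):
    """Naive DFT: sum_m values[m] * exp(sign * 2 pi i k m / n) for each k."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * k * m / n) for m, v in enumerate(values))
        for k in range(n)
    ]


def discrete_hilbert(samples):
    """Multiply positive frequencies by -i and negative frequencies by +i."""
    n = len(samples)
    spectrum = dft(samples, -1)
    for k in range(n):
        if k == 0 or 2 * k == n:
            spectrum[k] = 0.0          # sgn(0) = 0; Nyquist bin has no sign
        elif 2 * k < n:
            spectrum[k] *= -1j         # positive frequency
        else:
            spectrum[k] *= 1j          # negative frequency
    return [(z / n).real for z in dft(spectrum, +1)]


if __name__ == "__main__":
    n, m = 64, 3
    cos_samples = [math.cos(2.0 * math.pi * m * j / n) for j in range(n)]
    h = discrete_hilbert(cos_samples)
    print(max(abs(h[j] - math.sin(2.0 * math.pi * m * j / n)) for j in range(n)))
```

On a sampled cosine the output is the corresponding sine, and applying the map twice returns the negative of the input, mirroring $H^2 = -I$ for mean-zero signals.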

12The name was coined by the English mathematician G. H. Hardy after David Hilbert, who was the first to observe the conjugate functions in 1912. He also showed that the function sin(ωt) is the Hilbert transform of cos(ωt) and this gives the ±π/2 phase-shift operator, which is a basic property of the Hilbert transform in signal theory.


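If $\varphi \in \mathcal{S}(\mathbf{R})$, its bounded harmonic extension to the half-plane $y > 0$,
$$u(x,y) = (P_y * \varphi)(x),$$
has $v(x,y) = (Q_y * \varphi)(x)$ as the conjugate function, so that
$$F(z) = u(x,y) + iv(x,y) = \frac{1}{\pi}\int_{\mathbf{R}}\frac{y}{(x-t)^2+y^2}\varphi(t)\,dt + \frac{i}{\pi}\int_{\mathbf{R}}\frac{x-t}{(x-t)^2+y^2}\varphi(t)\,dt = \frac{1}{\pi i}\int_{\mathbf{R}}\frac{t-\bar z}{|t-z|^2}\varphi(t)\,dt$$
is holomorphic on $\Im z = y > 0$, and it is continuous on $y \ge 0$ with
$$\lim_{y\downarrow0} F(z) = \varphi(x) + i(H\varphi)(x).$$
Since $F^2(z) = u^2(x,y) - v^2(x,y) + i2u(x,y)v(x,y)$ is also holomorphic, $2uv$ is the conjugate function of $u^2 - v^2$, and $H(\varphi^2 - (H\varphi)^2) = 2\varphi H\varphi$, where $H^{-1} = -H$. Thus
(7.15) $$(H\varphi)^2 = \varphi^2 + 2H(\varphi H\varphi).$$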

We can write
$$H\varphi(x) = \frac{1}{\pi}\int_{\mathbf{R}}\frac{\varphi(x-y)}{y}\,dy = \frac{1}{\pi}\int_{\mathbf{R}}\frac{\varphi(y)}{x-y}\,dy$$
in the sense of the principal value, and the integrals are called singular integrals. The kernel
$$K(x,y) = \frac{1}{x-y}$$
is far from satisfying the conditions of the Young inequalities (2.20), but $H$ will still be an operator of type $(p,p)$ if 1

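Proof. We claim that if $\|H\varphi\|_p \le C_p\|\varphi\|_p$, then $\|H\varphi\|_{2p} \le (2C_p+1)\|\varphi\|_{2p}$.

Indeed, either $\|H\varphi\|_{2p} \le \|\varphi\|_{2p}$ and there is nothing to prove, or $\|\varphi\|_{2p} \le \|H\varphi\|_{2p}$. In this last case, by (7.15),
$$\|H\varphi\|_{2p}^2 = \|(H\varphi)^2\|_p \le \|\varphi^2\|_p + 2\|H(\varphi H\varphi)\|_p \le \|\varphi\|_{2p}^2 + 2C_p\|\varphi H\varphi\|_p \le \|\varphi\|_{2p}^2 + 2C_p\|\varphi\|_{2p}\|H\varphi\|_{2p} \le \|\varphi\|_{2p}(1+2C_p)\|H\varphi\|_{2p}$$
as claimed.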


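From this claim, starting from $\|H\varphi\|_2 = \|\varphi\|_2$, we obtain by induction
$$\|H\varphi\|_{2^n} \le (2^n-1)\|\varphi\|_{2^n} \quad (n = 1, 2, \ldots)$$
and an application of the Riesz-Thorin interpolation theorem¹³ gives
$$\|H\varphi\|_p \le C_p\|\varphi\|_p \quad (\varphi \in \mathcal{S}(\mathbf{R}))$$
for every $2 \le p < \infty$, so that $H(L^p(\mathbf{R})) \subset L^p(\mathbf{R})$ and $H$ is of type $(p,p)$ for these values of $p$. Suppose now that $1 < p < 2$. Then $p' > 2$ and $H$ is a bounded linear operator $L^{p'}(\mathbf{R}) \to L^{p'}(\mathbf{R})$. Then
$$\|H\varphi\|_p = \sup\Big\{\Big|\int_{\mathbf{R}} g(t)H\varphi(t)\,dt\Big|;\ \|g\|_{p'} \le 1\Big\} = \sup\Big\{\Big|\int_{\mathbf{R}} \varphi(t)Hg(t)\,dt\Big|;\ \|g\|_{p'} \le 1\Big\} \le \|\varphi\|_pC_{p'}.$$ □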

7.6. Sobolev spaces

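7.6.1. The spaces $W^{m,p}$. Let $\Omega$ be a nonempty open subset of $\mathbf{R}^n$, $1 \le p \le \infty$ and $m \in \mathbf{N}$. The Sobolev space of order $m \in \mathbf{N}$ on $\Omega$ is defined by
$$W^{m,p}(\Omega) := \{u \in L^p(\Omega);\ D^\alpha u \in L^p(\Omega),\ |\alpha| \le m\},$$
where the $D^\alpha u$ represent the distributional derivatives of $u$. We endow $W^{m,p}(\Omega)$ with the topology of the norm
$$\|u\|_{(m,p)} := \sum_{|\alpha|\le m}\|D^\alpha u\|_p.$$
Note that the linear maps $W^{m,p}(\Omega) \hookrightarrow L^p(\Omega)$, $W^{m+1,p}(\Omega) \hookrightarrow W^{m,p}(\Omega)$, and $D^\alpha : W^{m,p}(\Omega) \to W^{m-|\alpha|,p}(\Omega)$ are continuous, if $|\alpha| \le m$. In $\mathbf{R}^N$ all the norms are equivalent, so that $\|\cdot\|_{(m,p)}$ is equivalent to the norm
$$u \mapsto \max_{|\alpha|\le m}\|D^\alpha u\|_p.$$
It is easy to show that $W^{m,p}(\Omega)$ is a Banach space by describing it as the subspace of $\prod_{|\alpha|\le m} L^p(\Omega)$ of all elements of the type $\{D^\alpha u\}_{|\alpha|\le m}$. It is closed, since, if $\{D^\alpha u_k\}_{|\alpha|\le m} \to \{u^{(\alpha)}\}_{|\alpha|\le m}$ in the product space, then $D^\alpha u_k \to u^{(\alpha)}$ $(|\alpha| \le m)$ in $L^p(\Omega)$.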

13Marcel Riesz proved his convexity theorem, the Riesz-Thorin Theorem 2.45 for p(ϑ) ≤ q(ϑ), to use it in the proof of this fact and related results of harmonic analysis. See footnote 14 in Chapter 2.


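Note that if $f_k \to f$ in $L^p(\Omega)$, then also $f_k \to f$ in $\mathcal{D}'(\Omega)$ since
$$|\langle\varphi, f - f_k\rangle| \le \|\varphi\|_{p'}\|f - f_k\|_p.$$
Hence $D^\alpha u_k \to D^\alpha u = u^{(\alpha)}$ in $\mathcal{D}'(\Omega)$, so that $\{D^\alpha u_k\}_{|\alpha|\le m} \to \{D^\alpha u\}_{|\alpha|\le m}$. In the case $p = 2$, we can renorm the space with the equivalent norm
$$\|u\|_{m,2} = \Big(\sum_{|\alpha|\le m}\|D^\alpha u\|_2^2\Big)^{1/2},$$
so that $W^{m,2}(\Omega)$ becomes a Hilbert space with the scalar product
$$(u,v)_{m,2} = \sum_{|\alpha|\le m}(D^\alpha u, D^\alpha v)_2.$$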

Every $u \in W^{m,p}(\Omega) \subset L^p(\Omega)$ is a class of functions, and it is said that it is a $\mathcal{C}^m$ function if it has a $\mathcal{C}^m$ representative. In the case of one variable, we can consider $W^{m,p}(a,b) \subset \mathcal{C}[a,b]$ for every $m \ge 1$, since, if $u = v$ a.e. and both of them are continuous, then $u = v$. Moreover, the following regularity result holds:

Theorem 7.25. If $u \in W^{1,p}(a,b)$ and $v(t) := \int_c^t u'(s)\,ds$, then $u(t) = v(t) + C$ a.e., so that $u$ coincides a.e. with a continuous function, which we still denote by $u$, such that
$$u(x) - u(y) = \int_y^x u'(s)\,ds \quad (x, y \in (a,b)).$$
The distributional derivative $u'$ is the a.e. derivative of $u$, and the inclusion $W^{1,p}(a,b) \hookrightarrow \mathcal{C}[a,b]$ is continuous.

Proof. Function $v$, as a primitive of $u' \in L^1_{\mathrm{loc}}(a,b)$, is absolutely continuous on $[a,b]$, and, by the Lebesgue differentiation theorem, $u'$ is its a.e. derivative. The distributional derivatives $v'$ and $u'$ are the same, since, by partial integration and from $\varphi(a) = \varphi(b) = 0$ when $\varphi \in \mathcal{D}(a,b)$, we obtain
$$-\langle\varphi', v\rangle = -\int_a^b \varphi'(t)\int_c^t u'(s)\,ds\,dt = \int_a^b u'(t)\varphi(t)\,dt = \langle\varphi, u'\rangle.$$
But $(u - v)' = 0$ implies $u - v = C$, and $u$ is continuous on $[a,b]$. It follows from $u = v + C$ that $u(x) - u(y) = v(x) - v(y) = \int_y^x u'(s)\,ds$. If $u_k \to u$ in $W^{1,p}(a,b)$ and $u_k \to v$ in $\mathcal{C}[a,b]$, then $v = u$, since there exists a subsequence of $\{u_k\}$ which is a.e. convergent to $u$. Hence, $W^{1,p}(a,b) \hookrightarrow \mathcal{C}[a,b]$ has a closed graph. □

Remark 7.26. It can be shown that, if $m > n/p$ and $1 \le p < \infty$, $W^{m,p}(\Omega) \subset \mathcal{E}^k(\Omega)$ whenever k


We will prove this result in the Hilbert space case p = 2 (see Theo- rem 7.29).

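7.6.2. The spaces $H^s(\mathbf{R}^n)$. There is a Fourier characterization of the space $W^{m,2}(\mathbf{R}^n)$ which will allow us to define the Sobolev spaces of fractional order $s \in \mathbf{R}$ on $\mathbf{R}^n$. In Theorem 7.7 we saw that the pointwise multiplication by the function
$$\omega_s(\xi) := (1+|\xi|^2)^{s/2}$$
is a continuous linear operator of $\mathcal{S}(\mathbf{R}^n)$, and it can be extended to $\mathcal{S}'(\mathbf{R}^n)$, with $\omega_su \in \mathcal{S}'(\mathbf{R}^n)$ for every $u \in \mathcal{S}'(\mathbf{R}^n)$, defined as usual by
$$\langle\varphi, \omega_su\rangle = \langle\omega_s\varphi, u\rangle.$$
We define the operator $\Lambda^s$ on $\mathcal{S}'(\mathbf{R}^n)$ in terms of the Fourier transform
$$\Lambda^su = \mathcal{F}^{-1}(\omega_s\hat u).$$
It is a bijective continuous linear operator of $\mathcal{S}(\mathbf{R}^n)$ and of $\mathcal{S}'(\mathbf{R}^n)$, with $(\Lambda^s)^{-1} = \Lambda^{-s}$ and such that
$$\widehat{\Lambda^su} = \omega_s\hat u.$$
It is called the Fourier multiplier with symbol $\omega_s$, since it is the result of the multiplication by $\omega_s$ "at the other side of the Fourier transform". Since $\Lambda^{2m} = (\mathrm{Id} - (4\pi^2)^{-1}\Delta)^m$, we can also write $\Lambda^s = (\mathrm{Id} - (4\pi^2)^{-1}\Delta)^{s/2}$.

We define the Sobolev space of order $s \in \mathbf{R}$,
$$H^s(\mathbf{R}^n) = \{u \in \mathcal{S}'(\mathbf{R}^n);\ \Lambda^su \in L^2(\mathbf{R}^n)\},$$
i.e., $H^s(\mathbf{R}^n) = \Lambda^{-s}(L^2(\mathbf{R}^n))$, and we provide it with the norm
$$\|u\|_{(s)} = \|\Lambda^su\|_2 = \|\widehat{\Lambda^su}\|_2 = \|\omega_s\hat u\|_2,$$
associated with the scalar product $(u,v)_{(s)} = (\Lambda^su, \Lambda^sv)_2$. It is a Hilbert space, since $\Lambda^s$ is a linear bijective isometry between $H^s(\mathbf{R}^n)$ and $L^2(\mathbf{R}^n) = H^0(\mathbf{R}^n)$ and also from $H^r(\mathbf{R}^n)$ onto $H^{r-s}(\mathbf{R}^n)$, which corresponds to $\hat u \mapsto \omega_s\hat u$:
$$\|\Lambda^su\|_{(r-s)} = \|\omega_{r-s}\widehat{\Lambda^su}\|_2 = \|\omega_r\hat u\|_2 = \|u\|_{(r)}.$$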

If t


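Theorem 7.27. If $k \in \mathbf{N}$ and $s \in \mathbf{R}$,
$$H^s(\mathbf{R}^n) = \{u \in \mathcal{S}'(\mathbf{R}^n);\ D^\alpha u \in H^{s-k}(\mathbf{R}^n)\ \forall|\alpha| \le k\},$$
and $\|\cdot\|_{(s)}$ and $u \mapsto \sum_{|\alpha|\le k}\|D^\alpha u\|_{(s-k)}$ are two equivalent norms. In particular, if $s = m$, then $H^s(\mathbf{R}^n) = W^{m,2}(\mathbf{R}^n)$ with equivalent norms.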

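Proof. If $u \in H^s(\mathbf{R}^n)$, note that every
$$\widehat{D^\alpha u}(\xi) = (2\pi i\xi)^\alpha\hat u(\xi) = (2\pi i\xi)^\alpha\omega_{-s}(\xi)\widehat{\Lambda^su}(\xi)$$
is a locally integrable function, since $\widehat{\Lambda^su} \in L^2(\mathbf{R}^n)$ and $(2\pi i\xi)^\alpha\omega_{-s}(\xi)$ is continuous. If $|\alpha| \le k$, then
$$|\xi^\alpha| = |\xi_1^{\alpha_1}\cdots\xi_n^{\alpha_n}| \le |\xi|^{|\alpha|} = \Big(\sum_{j=1}^n\xi_j^2\Big)^{|\alpha|/2}$$
since, for every $j$, $|\xi_j|^{\alpha_j} \le (\sum_{i=1}^n\xi_i^2)^{\alpha_j/2} = |\xi|^{\alpha_j}$. So $|\xi^\alpha| \le (1+|\xi|^2)^{k/2} = \omega_k(\xi)$. Moreover $(1+|\xi|^2)^{k/2} \le C\sum_{|\alpha|\le k}|\xi^\alpha|$, since $\sum_{|\alpha|\le k}|\xi^\alpha| > 0$ everywhere and
$$F(\xi) := \frac{(1+|\xi|^2)^{k/2}}{\sum_{|\alpha|\le k}|\xi^\alpha|}$$
is a continuous function on $\mathbf{R}^n$ such that, if $|\xi| \ge 1$,
$$F(\xi) \le \frac{(1+|\xi|^2)^{k/2}}{\sum_{j=1}^n|\xi_j|^k} \le 2^{k/2}\frac{|\xi|^k}{\sum_{j=1}^n|\xi_j|^k} \le C.$$
Thus,
$$(1+|\xi|^2)^{k/2} \simeq \sum_{|\alpha|\le k}(2\pi)^{|\alpha|}|\xi^\alpha| = \sum_{|\alpha|\le k}|(2\pi i\xi)^\alpha|.$$
From these estimates we obtain
$$\|u\|_{(s)} = \|\omega_{s-k}\omega_k\hat u\|_2 \le C\sum_{|\alpha|\le k}\|\omega_{s-k}|(2\pi i\xi)^\alpha\hat u|\|_2 = C\sum_{|\alpha|\le k}\|D^\alpha u\|_{(s-k)}$$
and also, if $|\alpha| \le k$,
$$\|D^\alpha u\|_{(s-k)} = \|\omega_{s-k}|(2\pi i\xi)^\alpha\hat u(\xi)|\|_2 \le C\|\omega_{s-k}\omega_k\hat u\|_2 = C\|u\|_{(s)}.$$ □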

Theorem 7.28 (Sobolev). If s − k>n/2, Hs(Rn) ⊂Ek(Rn).


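Proof. By polar integration, $\int_{\mathbf{R}^n}(1+|\xi|^2)^{k-s}\,d\xi < \infty$ if and only if
$$\int_1^\infty(1+r^2)^{k-s}r^{n-1}\,dr \simeq \int_1^\infty r^{2k-2s}r^{n-1}\,dr < \infty,$$
i.e., when $2k - 2s + n - 1 < -1$, which is equivalent to condition $s - k > n/2$. Then, if $\varphi \in \mathcal{S}(\mathbf{R}^n)$, multiplication and division in
$$D^\alpha\varphi(x) = \int_{\mathbf{R}^n}\widehat{D^\alpha\varphi}(\xi)e^{2\pi ix\cdot\xi}\,d\xi = \int_{\mathbf{R}^n}(2\pi i\xi)^\alpha\hat\varphi(\xi)e^{2\pi ix\cdot\xi}\,d\xi$$
by $\omega_{s-k}(\xi) = (1+|\xi|^2)^{(s-k)/2}$ followed by an application of the Schwarz inequality yields
$$|D^\alpha\varphi(x)| \le \int_{\mathbf{R}^n}|(2\pi\xi)^\alpha|\omega_{s-k}(\xi)|\hat\varphi(\xi)|\omega_{k-s}(\xi)\,d\xi \le \|D^\alpha\varphi\|_{(s-k)}\Big(\int_{\mathbf{R}^n}(1+|\xi|^2)^{k-s}\,d\xi\Big)^{1/2}$$
and
(7.16) $$\|D^\alpha\varphi\|_\infty \le C\|D^\alpha\varphi\|_{(s-k)}.$$
When $u \in H^s(\mathbf{R}^n)$, we can consider $\varphi_m \to u$ in $H^s(\mathbf{R}^n)$ with $\varphi_m \in \mathcal{S}(\mathbf{R}^n)$; the estimate (7.16) ensures that $\{D^\alpha\varphi_m\}_{m=1}^\infty \subset \mathcal{S}(\mathbf{R}^n)$ is uniformly Cauchy, and we obtain that $D^\alpha\varphi_m \to D^\alpha u$ uniformly, if $|\alpha| \le k$. This proves that $u \in \mathcal{E}^k(\mathbf{R}^n)$. □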

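7.6.3. The spaces $H^m(\Omega)$. If $m \in \mathbf{N}$ and $\Omega \subset \mathbf{R}^n$ is any open set, we will use the notation $H^m(\Omega) = W^{m,2}(\Omega)$ suggested by Theorem 7.27. In this case, as a consequence of the Sobolev Theorem 7.28 we obtain

Theorem 7.29. If $m - k > n/2$, $H^m(\Omega) \subset \mathcal{E}^k(\Omega)$.

Proof. It is sufficient to prove that $u \in H^m(\Omega) \subset L^2(\Omega)$ is a $\mathcal{C}^k$ function on a neighborhood of every point. By multiplying $u$ by a test function if necessary, we can suppose that as a distribution its support is a compact subset $K$ of $\Omega$. If $K \prec \eta \prec \Omega$ and if $\bar u$ is the extension of $u \in L^2(\Omega)$ by zero to $\bar u \in L^2(\mathbf{R}^n)$, then $\bar u \in H^m(\mathbf{R}^n)$, since for every $|\alpha| \le m$ we can apply the Leibniz rule to $\langle\varphi, D^\alpha\bar u\rangle = \langle\varphi, D^\alpha(\eta\bar u)\rangle$ to show that $\langle\cdot, D^\alpha\bar u\rangle$ is $L^2$-continuous on test functions, so $D^\alpha\bar u \in L^2(\mathbf{R}^n)$ by the Riesz representation theorem, and $\bar u \in H^m(\mathbf{R}^n)$. By Theorem 7.28 we know that $\bar u \in \mathcal{E}^k(\mathbf{R}^n)$ and then $u \in \mathcal{E}^k(\Omega)$. □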


Remark 7.30. It is useful to approximate functions in Hm(Ω) by C∞ functions. The space Hm(Ω) can be defined as the completion of E(Ω) ∩ m m,p H (Ω) under the norm ·m,2. In fact it can be proved that E(Ω)∩W (Ω) is dense in W m,p(Ω) for any 1 ≤ p<∞ and m ≥ 1 an integer.14

We content ourselves with the following easier approximation result known as the Friedrichs theorem.

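Theorem 7.31. If $u \in H^1(\Omega)$, then there exists a sequence $\{\varphi_m\} \subset \mathcal{D}(\mathbf{R}^n)$ such that $\lim_m\|u - \varphi_m\|_{L^2(\Omega)} = 0$ and $\lim_m\|\partial_ju - \partial_j\varphi_m\|_{L^2(\omega)} = 0$ for every $1 \le j \le n$ and for every open set $\omega$ such that $\bar\omega$ is a compact subset of $\Omega$.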

Proof. Denote by uo the extension of u by zero on Rn and choose a mollifier o o 2 n o ε.Thenε ∗u → u in L (R ) and so limε→0 u−ε ∗u Lp(Ω) = 0. Allow c o o 0 <ε

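As an application, we can prove the following chain rule for functions $v \in H^1(\Omega)$ and any $\varrho \in \mathcal{E}(\mathbf{R})$ with bounded derivative and such that $\varrho(0) = 0$:
(7.17) $$\partial_j(\varrho\circ v) = (\varrho'\circ v)\partial_jv \quad (1 \le j \le n).$$
Indeed, since $|\varrho'| \le M$ and $\varrho(0) = 0$, by the mean value theorem $|\varrho(t)| \le M|t|$. Thus $|\varrho\circ v| \le M|v|$ and $\varrho\circ v \in L^2(\Omega)$. It is also clear that $(\varrho'\circ v)\partial_jv \in L^2(\Omega)$. Pick $\varphi_m \in \mathcal{D}(\mathbf{R}^n)$ as in Theorem 7.31. Then, for every $\varphi \in \mathcal{D}(\Omega)$ with $\operatorname{supp}\varphi \subset \omega$, from the usual chain rule
$$\int_\Omega(\varrho\circ\varphi_m)(x)\partial_j\varphi(x)\,dx = -\int_\Omega(\varrho'\circ\varphi_m)(x)\partial_j\varphi_m(x)\varphi(x)\,dx.$$
Here $\varrho\circ\varphi_m \to \varrho\circ v$ in $L^2(\Omega)$ and $(\varrho'\circ\varphi_m)\partial_j\varphi_m \to (\varrho'\circ v)\partial_jv$ in $L^2(\omega)$ by dominated convergence, and (7.17) follows.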

14 This is the Meyers-Serrin theorem and a proof can be found in [1] or [17].


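7.6.4. The spaces $H_0^m(\Omega)$. When looking for distributional solutions $u \in H^m(\Omega)$ in boundary value problems such as the Dirichlet problem with a homogeneous boundary condition, it does not make sense to consider the pointwise values $u(x)$ of $u$. Vanishing on the boundary in the distributional sense is defined by considering $u$ as an element of a convenient subspace of $H^m(\Omega)$. The Sobolev space $H_0^m(\Omega)$ is defined as the closure of
$$\mathcal{D}^m(\Omega) = \bigcup_{K\in\mathcal{K}(\Omega)}\mathcal{D}_K^m(\Omega)$$
in $H^m(\Omega)$, and it is endowed with the restriction of the norm of $H^m(\Omega)$. In this definition, $\mathcal{D}^m(\Omega)$ can be replaced by $\mathcal{D}(\Omega)$:

Theorem 7.32. For every $m \in \mathbf{N}$, $\mathcal{D}(\Omega)$ is dense in $H_0^m(\Omega)$.

Proof. Let $\psi \in \mathcal{D}^m(\Omega) \subset \mathcal{D}^m(\mathbf{R}^n)$. If $\varrho \ge 0$ is a test function supported by $\bar B(0,1)$ such that $\|\varrho\|_1 = 1$, then $\varrho_k(x) := k^n\varrho(kx)$ is another test function such that $\|\varrho_k\|_1 = 1$, now supported by $\bar B(0,1/k)$. Moreover $\varrho_k * \psi \in \mathcal{D}(\mathbf{R}^n)$ and
$$D^\alpha(\varrho_k * \psi) = \varrho_k * D^\alpha\psi \to D^\alpha\psi$$
in $L^2(\mathbf{R}^n)$ for every $|\alpha| \le m$ with
$$\operatorname{supp}\varrho_k * \psi \subset \operatorname{supp}\varrho_k + \operatorname{supp}\psi \subset \Omega$$
if $k$ is large enough. Hence, $\mathcal{D}(\mathbf{R}^n) \ni \varrho_k * \psi \to \psi$ in $H^m(\Omega)$ and $\psi \in \overline{\mathcal{D}(\Omega)}$, closure in $H_0^m(\Omega)$. □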

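The class $\mathcal{S}(\mathbf{R}^n)$ is dense in $H^m(\mathbf{R}^n)$, $\mathcal{D}(\mathbf{R}^n)$ is also dense in $\mathcal{S}(\mathbf{R}^n)$, and the inclusion $\mathcal{S}(\mathbf{R}^n) \hookrightarrow H^m(\mathbf{R}^n)$ is continuous, so that $\mathcal{D}(\mathbf{R}^n)$ is dense in $H^m(\mathbf{R}^n)$ and
$$H_0^m(\mathbf{R}^n) = H^m(\mathbf{R}^n).$$
The fact that the elements in $H_0^m(\Omega)$ can be considered as distributions that vanish on the boundary $\partial\Omega$ of $\Omega$ is explained by the following results, where for simplicity we restrict ourselves to the special and important case $m = 1$.

Theorem 7.33. If $u \in H^1(\Omega)$ is compactly supported, then $u \in H_0^1(\Omega)$, and its extension $\bar u$ by zero on $\mathbf{R}^n$ belongs to $H^1(\mathbf{R}^n)$.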

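Proof. If $u \in H^1(\Omega)$ has a compact support $K \subset \Omega$, it is shown as in Theorem 7.29 that $\bar u \in L^2(\mathbf{R}^n)$ belongs to $H^1(\mathbf{R}^n)$ by using $\eta$ such that $K \prec \eta \prec \Omega$. But $\mathcal{D}(\mathbf{R}^n)$ is dense in $H^1(\mathbf{R}^n)$ and it follows from $\varphi_m \to \bar u$ in $H^1(\mathbf{R}^n)$ that $\varphi_m\eta \to u$ in $H^1(\Omega)$ with $\varphi_m\eta \in \mathcal{D}(\Omega)$, so that $u \in H_0^1(\Omega)$. □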


Theorem 7.34. Let $\Omega \subset \mathbf{R}^n$ be a bounded open set. If $u \in H^1(\Omega) \cap \mathcal{C}(\bar\Omega)$ and $u_{|\partial\Omega} = 0$, then $u \in H_0^1(\Omega)$.

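Proof. Assume that $u$ is real and let $\varrho \in \mathcal{E}^1(\mathbf{R})$ be such that $|\varrho(t)| \le |t|$ on $\mathbf{R}$, $\varrho = 0$ on $[-1,1]$, and $\varrho(t) = t$ on $(-2,2)^c$. If $[-1,1] \prec \varphi \prec (-2,2)$, just take $\varrho(t) = t(1 - \varphi(t))$. If $v \in H^1(\Omega)$, then $\varrho\circ v \in L^2(\Omega)$ and by the chain rule (7.17) also $\partial_j(\varrho\circ v) = (\varrho'\circ v)\cdot\partial_jv \in L^2(\Omega)$, so $\varrho\circ v \in H^1(\Omega)$. Hence $u_m := m^{-1}\varrho(mu) \in H_0^1(\Omega)$ since $\operatorname{supp}u_m \subset \{|u| \ge 1/m\}$, which is a compact subset of the bounded open set $\Omega$. By dominated convergence, $u_m \to u$ in $H^1(\Omega)$ and it follows that $u \in H_0^1(\Omega)$. □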

It can be shown that this result is also true for unbounded open sets and that the converse holds when $\Omega$ is of class $\mathcal{C}^1$:
(7.18) $$u \in \mathcal{C}(\bar\Omega) \cap H_0^1(\Omega) \Rightarrow u_{|\partial\Omega} = 0.$$
We only include here the proof in the easy case $n = 1$:

Theorem 7.35. If $n = 1$ and $\Omega = (a,b)$, then $H_0^1(a,b)$ is the class of all functions $u \in H^1(a,b) \subset \mathcal{C}[a,b]$ such that $u(a) = u(b) = 0$.

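Proof. By Theorem 7.34, we only need to show that $u(a) = u(b) = 0$ for every $u \in H_0^1(a,b) \subset \mathcal{C}[a,b]$. But, if $\mathcal{D}(a,b) \ni \varphi_k \to u$ in $H_0^1(a,b)$, also $\varphi_k \to u$ uniformly, since $H_0^1(a,b) \hookrightarrow \mathcal{C}[a,b]$ is continuous by Theorem 7.25, and then $u(a) = u(b) = 0$, since $\varphi_k(a) = \varphi_k(b) = 0$. □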

7.7. Applications

To show how Sobolev spaces provide a good framework for the study of differential equations, let us start with a one-dimensional problem.

7.7.1. The Sturm-Liouville problem. We consider here the problem of solving
(7.19) $$-(pu')' + qu = f, \quad u(a) = u(b) = 0$$
when $q \in \mathcal{C}[a,b]$, $p \in \mathcal{C}^1[a,b]$, and $p(t) \ge \delta > 0$. If $f \in \mathcal{C}[a,b]$, a classical solution is a function $u \in \mathcal{C}^2[a,b]$ that satisfies (7.19) at every point. If $f \in L^2(a,b)$, a weak solution is a function $u \in H_0^1(a,b)$ whose distributional derivatives satisfy $-(pu')' + qu = f$, i.e.,
$$\int_a^b p(t)u'(t)\varphi'(t)\,dt + \int_a^b q(t)u(t)\varphi(t)\,dt = \int_a^b f(t)\varphi(t)\,dt \quad (\varphi \in \mathcal{D}(a,b)).$$


If $v \in H_0^1(a,b)$, by taking $\varphi_k \to v$ $(\varphi_k \in \mathcal{D}(a,b))$, the identity
$$\int_a^b p(t)u'(t)v'(t)\,dt + \int_a^b q(t)u(t)v(t)\,dt = \int_a^b f(t)v(t)\,dt$$
also holds. To prove the existence and uniqueness of a weak solution for this Sturm-Liouville problem, with $f \in L^2(a,b)$, we define
$$B(u,v) := \int_a^b p(t)u'(t)\overline{v'(t)}\,dt + \int_a^b q(t)u(t)\overline{v(t)}\,dt.$$
Then we obtain a sesquilinear continuous form on $H_0^1(a,b) \times H_0^1(a,b)$ and $(\cdot,f)_2 \in H_0^1(a,b)'$, since
$$|B(u,v)| \le \|p\|_\infty\|u'\|_2\|v'\|_2 + \|q\|_\infty\|u\|_2\|v\|_2 \le c\|u\|_{(1,2)}\|v\|_{(1,2)}$$
and $|(u,f)_2| \le \|f\|_2\|u\|_{(1,2)}$. If $B$ is coercive, we can apply the Lax-Milgram theorem and, for a given $f \in L^2(a,b)$, there exists a unique $u \in H_0^1(a,b)$ such that $B(v,u) = (v,f)_2$, which means that $u$ is the uniquely determined weak solution of problem (7.19). For instance, if also $q(t) \ge \delta > 0$, then
$$B(u,u) = \int_a^b\big(p(t)|u'(t)|^2 + q(t)|u(t)|^2\big)\,dt \ge \delta\|u\|_{H_0^1(a,b)}^2$$
and $B$ is coercive. Finally, if $f \in \mathcal{C}[a,b]$, the weak solution $u$ is a $\mathcal{C}^2$ function, and then it is a classical solution. Indeed, $pu' \in L^2(a,b)$ satisfies $(pu')' = qu - f$, which is continuous; then $g := pu'$ and $u' = g/p$ are $\mathcal{C}^1$ functions on $[a,b]$, so that $u \in \mathcal{C}^2[a,b]$, and $u(a) = u(b) = 0$ by Theorem 7.35.
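The weak formulation $B(v,u) = (v,f)_2$ lends itself to a Galerkin approximation: restrict $u$ and $v$ to a finite-dimensional subspace of $H_0^1(a,b)$ and solve the resulting linear system. The following is a minimal sketch for real data, not a method from the text: it assumes piecewise linear hat functions on a uniform mesh, coefficients frozen at element midpoints, and a tridiagonal (Thomas) solve; `solve_sturm_liouville` and its parameters are hypothetical names.

```python
import math


def solve_sturm_liouville(p, q, f, a, b, n):
    """Galerkin sketch for -(p u')' + q u = f on (a, b), u(a) = u(b) = 0.

    Uses n >= 3 uniform elements with hat functions; p, q, f are sampled at
    element midpoints (an assumed quadrature). Returns nodal values u_0..u_n.
    """
    h = (b - a) / n
    diag = [0.0] * (n - 1)   # interior nodes 1..n-1
    off = [0.0] * (n - 2)    # coupling between interior nodes i and i+1
    rhs = [0.0] * (n - 1)
    for e in range(n):       # element e joins nodes e and e+1
        mid = a + (e + 0.5) * h
        pe, qe, fe = p(mid), q(mid), f(mid)
        for node in (e, e + 1):
            if 1 <= node <= n - 1:
                diag[node - 1] += pe / h + qe * h / 3.0
                rhs[node - 1] += fe * h / 2.0
        if 1 <= e and e + 1 <= n - 1:
            off[e - 1] += -pe / h + qe * h / 6.0
    # Thomas algorithm for the symmetric tridiagonal system.
    for i in range(1, n - 1):
        w = off[i - 1] / diag[i - 1]
        diag[i] -= w * off[i - 1]
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * (n - 1)
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 3, -1, -1):
        u[i] = (rhs[i] - off[i] * u[i + 1]) / diag[i]
    return [0.0] + u + [0.0]


if __name__ == "__main__":
    nodes = solve_sturm_liouville(lambda t: 1.0, lambda t: 1.0, lambda t: 1.0, 0.0, 1.0, 200)
    print(nodes[100], 1.0 - 1.0 / math.cosh(0.5))
```

With $p = q = f = 1$ on $(0,1)$ the exact solution is $u(x) = 1 - \cosh(x - 1/2)/\cosh(1/2)$, and the coercivity condition $q \ge \delta > 0$ of the text is satisfied, so the discrete system is positive definite.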

7.7.2. The Dirichlet problem. Now let $\Omega$ be a nonempty bounded open domain in $\mathbf{R}^n$ with $n > 1$, and consider the Dirichlet problem
(7.20) $$-\Delta u = f, \quad u = 0 \text{ on } \partial\Omega \quad (f \in L^2(\Omega)).$$
If $f$ is continuous on $\bar\Omega$ and $u$ is a classical solution, then $u \in \mathcal{C}^2(\Omega) \cap \mathcal{C}(\bar\Omega)$ and (7.20) holds in the pointwise sense. Let us write $(\nabla u, \nabla v)_2 := \sum_{j=1}^n(\partial_ju, \partial_jv)_2$. When trying to obtain existence and uniqueness of such a solution, we again start by looking for solutions in a weak sense. After multiplying by test functions $\varphi \in \mathcal{D}(\Omega)$, by integration we are led to consider functions $u \in \mathcal{C}^2(\Omega) \cap \mathcal{C}(\bar\Omega)$ such that $u = 0$ on $\partial\Omega$ and
$$(\varphi, -\Delta u)_2 = (\varphi, f)_2 \quad (\varphi \in \mathcal{D}(\Omega)),$$
or, equivalently, such that
$$(\nabla\varphi, \nabla u)_2 = (\varphi, f)_2 \quad (\varphi \in \mathcal{D}(\Omega)).$$
Then it follows from Theorem 7.34 that $u \in H_0^1(\Omega)$, and
(7.21) $$(\nabla v, \nabla u)_2 = (v, f)_2 \quad (v \in H_0^1(\Omega))$$
since for every $v \in H_0^1(\Omega)$ we can take $\varphi_k \to v$ in $H^1(\Omega)$, so that $\varphi_k \to v$ and $\partial_j\varphi_k \to \partial_jv$ in $L^2(\Omega)$. A weak solution of the Dirichlet problem is a function $u \in H_0^1(\Omega)$ such that $-\Delta u = f$ in the distributional sense or, equivalently, such that property (7.21) is satisfied. Every classical solution is a weak solution, and we can look for weak solutions even for $f \in L^2(\Omega)$. To prove the existence and uniqueness of such a weak solution, we will use the Dirichlet norm $\|\cdot\|_D$ on $H_0^1(\Omega)$, defined by
$$\|u\|_D^2 = \int_\Omega|\nabla u|^2 = \int_\Omega\sum_{j=1}^n|\partial_ju(x)|^2\,dx.$$
It is a true norm, associated to the scalar product
$$(u,v)_D := (\nabla u, \nabla v)_2,$$
and it is equivalent to the original one:

Lemma 7.36 (Poincaré). There is a constant $C$ depending on the bounded domain $\Omega$ such that
(7.22) $$\|u\|_2 \le C\|u\|_D \quad (u \in H_0^1(\Omega)),$$
and on $H_0^1(\Omega)$ the Dirichlet norm $\|\cdot\|_D$ and the Sobolev norm $\|\cdot\|_{(1,2)}$ are equivalent.

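Proof. Since $\mathcal{D}(\Omega)$ is dense in $H_0^1(\Omega)$, we only need to prove (7.22) for test functions $\varphi \in \mathcal{D}(\Omega) \subset \mathcal{D}(\mathbf{R}^n)$. If $\Omega \subset [a,b]^n$, let us consider any $x = (x_1, x') \in \Omega$ and write
$$\varphi(x) = \int_a^{x_1}\partial_1\varphi(t, x')\,dt.$$
By the Schwarz inequality,
$$|\varphi(x)| \le (b-a)^{1/2}\Big(\int_a^b|\partial_1\varphi(t, x')|^2\,dt\Big)^{1/2}$$
and then, by Fubini's theorem,
$$\|\varphi\|_2^2 \le (b-a)\int_{[a,b]^n}dx\int_a^b|\partial_1\varphi(t, x')|^2\,dt = (b-a)^2\|\partial_1\varphi\|_2^2$$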


with $|\partial_1\varphi| \le |\nabla\varphi|$, and (7.22) follows. From this estimate,
$$\|u\|_{(1,2)}^2 = \|u\|_2^2 + \||\nabla u|\|_2^2 \le (C^2+1)\|u\|_D^2$$
and obviously also $\|u\|_D \le \|u\|_{(1,2)}$. □

Theorem 7.37. The Dirichlet problem (7.20) on the bounded domain $\Omega$ has a uniquely determined weak solution $u \in H_0^1(\Omega)$ for every $f \in L^2(\Omega)$, and the operator
$$\Delta^{-1} : L^2(\Omega) \to H_0^1(\Omega)$$
is continuous.

Proof. If $C$ is the constant that appears in the Poincaré lemma, then, by the Schwarz inequality,
$$|(v,f)_2| \le \|f\|_2\|v\|_2 \le C\|f\|_2\|v\|_D$$
and $(\cdot,f)_2 \in H_0^1(\Omega)'$ with $\|(\cdot,f)_2\| \le C\|f\|_2$. By the Riesz representation theorem, there is a uniquely determined function $u \in H_0^1(\Omega)$ such that $(v,f)_2 = (v,u)_D$ for all $v \in H_0^1(\Omega)$, which is property (7.21). The estimate $\|u\|_D = \|(\cdot,f)_2\|_{H_0^1(\Omega)'} \le C\|f\|_2$ shows that $\|\Delta^{-1}\| \le C$. □

An application of (7.18) shows that a weak solution of class $\mathcal{C}^2$ is also a classical solution if $\Omega$ is $\mathcal{C}^1$:

Theorem 7.38. Let $u \in \mathcal{C}^2(\Omega) \cap \mathcal{C}(\bar\Omega)$ and $f \in \mathcal{C}(\Omega)$. If $u$ is a weak solution of the Dirichlet problem (7.20) and $\Omega$ is a $\mathcal{C}^1$ domain, then $u$ is a classical solution; that is, $-\Delta u(x) = f(x)$ for every $x \in \Omega$ and $u(x) = 0$ for every $x \in \partial\Omega$.

Proof. By (7.18), $u_{|\partial\Omega} = 0$. Since $u \in \mathcal{C}^2(\Omega)$, the distribution $\Delta u$ is the function $\Delta u(x)$ on $\Omega$, and the distributional relation $-\Delta u = f$ is an identity of functions. □

We have not proved (7.18) if $n > 1$, and the proof of the regularity of the weak solutions is more delicate. For instance, if $f \in \mathcal{C}^\infty(\bar\Omega)$ and the boundary $\partial\Omega$ is $\mathcal{C}^\infty$, it can be shown that every weak solution $u$ is also in $\mathcal{E}(\bar\Omega)$, so that it is also a classical solution. More precisely, the following result holds:

Theorem 7.39. Let $\Omega$ be a bounded open set of $\mathbf{R}^n$ of class $\mathcal{C}^{m+2}$ with $m > n/2$ (or $\mathbf{R}^n$ or $\mathbf{R}^n_+ = \{x : x_n > 0\}$) and let $f \in H^m(\Omega)$. Then every weak solution $u$ of the Dirichlet problem (7.20) belongs to $\mathcal{C}^2(\bar\Omega)$, and it is a classical solution.


7.7.3. Eigenvalues and eigenfunctions of the Laplacian. We are going to apply the spectral theory for compact operators. The following result will be helpful:

Theorem 7.40. Suppose that $\Phi \subset L^2(\mathbf{R}^n)$ satisfies the following three conditions:
(a) $\Phi$ is bounded in $L^2(\mathbf{R}^n)$,
(b) $\lim_{R\to\infty}\int_{|x|>R}|f(x)|^2\,dx = 0$ uniformly on $f \in \Phi$, and
(c) $\lim_{h\to0}\|f - \tau_hf\|_2 = 0$ uniformly on $f \in \Phi$.
Then the closure $\bar\Phi$ of $\Phi$ is compact in $L^2(\mathbf{R}^n)$.

Proof. Let $\varepsilon > 0$. By (b), we can choose $R > 0$ so that
$$\int_{|x|>R}|f(x)|^2\,dx \le \varepsilon^2 \quad (f \in \Phi).$$
Choose $0 \le \varphi \in \mathcal{D}(B(0,1))$ with $\int\varphi = 1$, so that $\varphi_k(x) = k^n\varphi(kx)$ is a summability kernel on $\mathbf{R}^n$ such that $\operatorname{supp}\varphi_k \subset \bar B(0,1/k)$, and we know that $\lim_{k\to\infty}\|f * \varphi_k - f\|_2 = 0$ if $f \in L^2(\mathbf{R}^n)$. In fact, since $\varphi_k = 0$ on $|y| \ge 1/k$, it follows from the proof of Theorem 2.41 that
$$|(f * \varphi_k)(x) - f(x)| = \Big|\int_{|y|<1/k}[f(x-y) - f(x)]\varphi_k(y)\,dy\Big|$$
and then $\|f * \varphi_k - f\|_2 \le \sup_{|h|\le1/k}\|\tau_hf - f\|_2$. Thus, by (c), we can choose $N$ so that
$$\|f - f * \varphi_N\|_2 \le \varepsilon \quad (f \in \Phi).$$
Moreover it follows very easily from the Schwarz inequality that
$$|(f * \varphi_N)(x) - (f * \varphi_N)(y)| \le \|\tau_{x-y}f - f\|_2\|\varphi_N\|_2$$
and also
$$|(f * \varphi_N)(x)| \le \|f\|_2\|\varphi_N\|_2.$$
These estimates, with conditions (a) and (c), allow us to apply the Ascoli-Arzelà theorem on $\bar B(0,R) \subset \mathbf{R}^n$ to the restrictions of the functions $f * \varphi_N$ with $f \in \Phi$, which can be covered by a finite family of balls in $\mathcal{C}(\bar B(0,R))$ with the centers in $\Phi$,
$$B_{\mathcal{C}(\bar B(0,R))}(f_1,\delta), \ldots, B_{\mathcal{C}(\bar B(0,R))}(f_m,\delta),$$
for every $\delta > 0$. Note that at every point $x \in \mathbf{R}^n$
$$|f(x) - f_j(x)| \le \chi_{\{|x|>R\}}(x)|f(x)| + \chi_{\{|x|>R\}}(x)|f_j(x)| + |f(x) - (f * \varphi_N)(x)| + |f_j(x) - (f_j * \varphi_N)(x)| + \chi_{\{|x|\le R\}}(x)|(f * \varphi_N)(x) - (f_j * \varphi_N)(x)|$$
and the previous estimates yield
$$\|f - f_j\|_2 \le 4\varepsilon + |B(0,R)|^{1/2}\sup_{|x|\le R}|(f * \varphi_N)(x) - (f_j * \varphi_N)(x)|,$$
so that, by choosing $\delta = \varepsilon/|B(0,R)|^{1/2}$, it follows that $\|f - f_j\|_2 \le 5\varepsilon$ and, by Theorem 1.1, the closure of $\Phi$ in $L^2(\mathbf{R}^n)$ is compact. □

Remark 7.41. Obvious changes in the proof, such as applying Hölder's inequality instead of the Schwarz inequality, show that the above theorem has an evident extension to $L^p(\mathbf{R}^n)$ if $1 \le p < \infty$.

Theorem 7.42 (Rellich¹⁵). If $\Omega$ is a bounded open set of $\mathbf{R}^n$, the natural inclusion $H_0^1(\Omega) \hookrightarrow L^2(\Omega)$ is compact.

Proof. The extension by zero mapping $L^2(\Omega) \to L^2(\mathbf{R}^n)$ is isometric, so that it is sufficient to prove the compactness of the extension by zero map of Theorem 7.33, i.e.,
$$f \in H_0^1(\Omega) \mapsto \tilde f \in H^1(\mathbf{R}^n) = H_0^1(\mathbf{R}^n) \subset L^2(\mathbf{R}^n) \quad (\tilde f(x) = 0 \text{ if } x \in \Omega^c).$$
This follows as an application of Theorem 7.40 when $\Phi$ is $\tilde B = \{\tilde f;\ f \in B\}$, if $B$ is the closed unit ball in $H_0^1(\Omega)$. Indeed, $\tilde B$ is contained in the unit ball of $L^2(\mathbf{R}^n)$ and, if $\Omega \subset B(0,R)$, then $\int_{|x|>R}|\tilde f|^2 = 0$, so that conditions (a) and (b) of Theorem 7.40 are satisfied. To prove (c), note that
(7.23) $$\|\tau_hu - u\|_2 \le |h|\,\||\nabla u|\|_2 \quad (u \in H^1(\mathbf{R}^n))$$
since we can consider $\varphi_k \in \mathcal{D}(\mathbf{R}^n)$ so that $\varphi_k \to u$ in $H^1(\mathbf{R}^n)$ as $k \to \infty$ and for every test function $\varphi$ we have
$$|\tau_h\varphi(x) - \varphi(x)|^2 = \Big|\int_0^1 h\cdot\nabla\varphi(x - th)\,dt\Big|^2 \le |h|^2\int_0^1|\nabla\varphi(x - th)|^2\,dt,$$
by the Schwarz inequality, and (7.23) follows for $\varphi$ by integration. Then $\|\tau_h\tilde f - \tilde f\|_2 \le |h|$ for every $\tilde f \in \tilde B$, which is property (c) for $\tilde B$. □

Theorem 7.43. If $\Omega$ is a bounded open set of $\mathbf{R}^n$, then $(-\Delta)^{-1}$ is a compact and injective self-adjoint operator on $L^2(\Omega)$ and on $H_0^1(\Omega)$.

Proof. The compactness of $(-\Delta)^{-1} : L^2(\Omega) \to L^2(\Omega)$ follows by considering the decomposition
$$(-\Delta)^{-1} : L^2(\Omega) \to H_0^1(\Omega) \hookrightarrow L^2(\Omega),$$

15The theorem, in this case p = 2, is attributed to the South-Tyrolian mathematician Franz Rellich (1930 in G¨ottingen) and to Vladimir Kondrachov (1945) for the more general case stating 1,p q − C ¯ that W0 (Ω) is compactly embedded in L (Ω) for any qn. For a proof we refer the reader to Gilbarg and Trudinger [17] and Brezis [5].


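where $(-\Delta)^{-1} : L^2(\Omega) \to H_0^1(\Omega)$ is continuous by Theorem 7.37 and, by the Rellich Theorem 7.42, $H_0^1(\Omega) \hookrightarrow L^2(\Omega)$ is compact. Similarly,
$$(-\Delta)^{-1} : H_0^1(\Omega) \hookrightarrow L^2(\Omega) \xrightarrow{(-\Delta)^{-1}} H_0^1(\Omega)$$
is also compact. Note that $(-\Delta)^{-1} : L^2(\Omega) \to H_0^1(\Omega)$ is injective, by Theorem 7.37. If $u = (-\Delta)^{-1}\varphi$ and $v = (-\Delta)^{-1}\psi$, with $\varphi, \psi \in \mathcal{D}(\Omega)$, then
$$(u,v)_D = (\nabla u, \nabla v)_2 = (-\Delta u, v)_2 = (\varphi, v)_2,$$
so that
$$((-\Delta)^{-1}\varphi, \psi)_D = (\varphi, (-\Delta)^{-1}\psi)_D$$
and
$$((-\Delta)^{-1}u, v)_D = (u, (-\Delta)^{-1}v)_D$$
for all $u, v \in H_0^1(\Omega)$ by continuity. Also
$$((-\Delta)^{-1}\varphi, \psi)_2 = (\nabla u, \nabla v)_2 = (u,v)_D = (\varphi, (-\Delta)^{-1}\psi)_2.$$
This shows that $(-\Delta)^{-1}$ is self-adjoint on $H_0^1(\Omega)$ and on $L^2(\Omega)$. □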

Note that
(7.24) $$((-\Delta)^{-1}u, v)_D = (u,v)_2 \quad (u, v \in H_0^1(\Omega)),$$
from the density of $\mathcal{D}(\Omega)$ in $H_0^1(\Omega)$. Thus $(-\Delta)^{-1}$ is a positive operator on $H_0^1(\Omega)$ in the sense that
(7.25) $$((-\Delta)^{-1}u, u)_D > 0$$
if $0 \ne u \in H_0^1(\Omega)$. An eigenfunction for the Laplacian on $H_0^1(\Omega)$ is an element $u \in H_0^1(\Omega)$ such that $\Delta u = \lambda u$ for some $\lambda$, which is said to be an eigenvalue of $\Delta$ if there exists some nonzero eigenfunction $u$ such that $\Delta u = \lambda u$. Hence 0 is not an eigenvalue, since $\Delta : H_0^1(\Omega) \to L^2(\Omega)$ is injective, and $u \in H_0^1(\Omega)$ is an eigenfunction for the eigenvalue $\lambda$ of $\Delta$ if and only if
$$(-\Delta)^{-1}u = -\frac{1}{\lambda}u.$$
The solutions of this equation form the eigenspace for this eigenvalue $\lambda$. Note that $\lambda < 0$, as a consequence of the positivity property (7.25) of $(-\Delta)^{-1}$. From the spectral theory of compact self-adjoint operators,
$$(-\Delta)^{-1} : H_0^1(\Omega) \to H_0^1(\Omega)$$

has a spectral representation
\[ (-\Delta)^{-1}v = \sum_{k=0}^\infty \mu_k (v, u_k)_D u_k \qquad (v \in H_0^1(\Omega)), \tag{7.26} \]
a convergent series in $H_0^1(\Omega)$, where $\mu_k = -1/\lambda_k \downarrow 0$ is the sequence of the eigenvalues of $(-\Delta)^{-1}$ and $\{u_k\}_{k=0}^\infty$ is an orthonormal system in $H_0^1(\Omega)$ with respect to $(\cdot,\cdot)_D$ such that $(-\Delta)^{-1}u_k = \mu_k u_k$. Since $(-\Delta)^{-1}$ is injective, $\{u_k\}_{k=0}^\infty$ is a basis in $H_0^1(\Omega)$. Moreover $(u_k, u_j)_2 = 0$ if $k \ne j$, by (7.24).

Theorem 7.44. Suppose $f \in L^2(\Omega)$. The weak solution of the Dirichlet problem
\[ -\Delta u = f \qquad (u \in H_0^1(\Omega)) \]
is given by the sum
\[ u = \sum_{k=0}^\infty (f, u_k)_2 u_k \]
in $H_0^1(\Omega)$. The sequence of eigenfunctions $\sqrt{-\lambda_k}\,u_k$ is an orthonormal basis of $L^2(\Omega)$.

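Proof. It follows from (7.24) applied to the elements $u_k \in H_0^1(\Omega)$ that
\[ \|u_k\|_2^2 = \mu_k = -1/\lambda_k, \qquad (u_k, u_m)_2 = 0 \ \text{ if } m \ne k. \]
Moreover, since $H_0^1(\Omega)$ contains $\mathcal{D}(\Omega)$, it is densely and continuously included in $L^2(\Omega)$, $\{u_k\}_{k=0}^\infty$ is total in $L^2(\Omega)$, and the orthonormal system $\{\sqrt{-\lambda_k}\,u_k\}_{k=0}^\infty$ is complete in $L^2(\Omega)$.

For every $f \in L^2(\Omega)$,
\[ f = \sum_{k=0}^\infty (f, \sqrt{-\lambda_k}\,u_k)_2 \sqrt{-\lambda_k}\,u_k = -\sum_{k=0}^\infty \lambda_k (f, u_k)_2 u_k \tag{7.27} \]
in $L^2(\Omega)$ and $\{\sqrt{-\lambda_k}\,(f, u_k)_2\}_{k=0}^\infty \in \ell^2$. Also $\{(f, u_k)_2\}_{k=0}^\infty \in \ell^2$, since $\sqrt{-\lambda_k} \to \infty$. We can define
\[ u = \sum_{k=0}^\infty (f, u_k)_2 u_k \]
since the series converges in $H_0^1(\Omega)$. Then
\[ \Delta u = \sum_{k=0}^\infty (f, u_k)_2 \Delta u_k = \sum_{k=0}^\infty \lambda_k (f, u_k)_2 u_k \]
in $\mathcal{D}'(\Omega)$. In (7.27) we have a sum in $\mathcal{D}'(\Omega)$, so that $-\Delta u = f$. □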
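As a concrete illustration of Theorem 7.44, the following sketch evaluates the eigenfunction expansion in a one-dimensional model case that is not taken from the text: $\Omega = (0, \pi)$ and $f = 1$. It assumes the eigenfunctions $u_k(x) = \sqrt{2/\pi}\,\sin(kx)/k$ (indexed from $k = 1$, normalized in the Dirichlet norm) with $\lambda_k = -k^2$, and compares a partial sum of $\sum_k (f, u_k)_2 u_k$ with the closed-form solution $x(\pi - x)/2$ of $-u'' = 1$, $u(0) = u(\pi) = 0$. The function names are hypothetical.

```python
import math


def dirichlet_series_solution(x, terms=2000):
    """Partial sum of u = sum_k (f, u_k)_2 u_k for f = 1 on (0, pi).

    Assumed model case: u_k(x) = sqrt(2/pi) * sin(k x) / k and lambda_k = -k^2,
    so that (f, u_k)_2 * u_k(x) = 4 sin(k x) / (pi k^3) for odd k and 0 for even k.
    """
    total = 0.0
    for k in range(1, terms + 1, 2):  # only odd k contribute
        total += 4.0 * math.sin(k * x) / (math.pi * k ** 3)
    return total


def exact_solution(x):
    """Closed-form solution of -u'' = 1 on (0, pi) with u(0) = u(pi) = 0."""
    return x * (math.pi - x) / 2.0
```

Comparing `dirichlet_series_solution(x)` with `exact_solution(x)` at a few points of $(0, \pi)$ shows the fast convergence expected from the $k^{-3}$ decay of the terms.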


To complete our discussion, we want to show that the eigenfunctions $u_k$ are in $\mathcal{E}(\Omega)$, so that they are classical solutions of $\Delta u_k = \lambda_k u_k$. The method we are going to use is easily extended to any elliptic linear differential operator $L$ with constant coefficients.

Theorem 7.45. Suppose $L = \Delta + \lambda$ and $u \in \mathcal{D}'(\Omega)$, where $\Omega$ is a nonempty open set in $\mathbf{R}^n$. If $Lu \in \mathcal{E}(\Omega)$, then $u \in \mathcal{E}(\Omega)$.

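Proof. If $Lu \in \mathcal{E}(\Omega)$, then
\[ \varphi Lu \in H^s(\mathbf{R}^n) \qquad \forall \varphi \in \mathcal{D}(\Omega) \tag{7.28} \]
for every $s \in \mathbf{R}$. We claim that it follows from (7.28) that
\[ \varphi u \in H^{s+2}(\mathbf{R}^n) \qquad \forall \varphi \in \mathcal{D}(\Omega). \]
Then an application of Theorem 7.29 shows that $\varphi u \in \mathcal{E}(\Omega)$ for every $\varphi \in \mathcal{D}(\Omega)$ and then $u \in \mathcal{E}(\Omega)$.

To prove this claim, let $\operatorname{supp} \varphi \subset U$, $U$ an open set with compact closure $\bar U$ in $\Omega$, and choose $\bar U \prec \psi \prec \Omega$. Note that $\psi u \in \mathcal{E}'(\mathbf{R}^n)$ and it follows from Theorem 7.16 that $\psi u \in H^t(\mathbf{R}^n)$ for some $t \in \mathbf{R}$. By decreasing $t$ if necessary, we can suppose that $s + 2 - t = k \in \mathbf{N}$.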

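Let $\psi_0 = \psi$, $\psi_k = \varphi$ and define $\psi_1, \dots, \psi_{k-1}$ by recurrence so that
\[ \operatorname{supp} \psi_{j+1} \prec \psi_j \prec U_j \subset \{\psi_{j-1} = 1\}. \]
It is sufficient to show that $\psi_j u \in H^{t+j}(\mathbf{R}^n)$, since then $\varphi u = \psi_k u \in H^{t+k}(\mathbf{R}^n) = H^{s+2}(\mathbf{R}^n)$ will complete the proof.

We only need to prove that if $\varphi, \psi \in \mathcal{D}(\Omega)$ are such that $\operatorname{supp} \varphi \prec \psi$ and $\psi u \in H^t(\mathbf{R}^n)$, then $\varphi u \in H^{t+1}(\mathbf{R}^n)$.

From the definition of $L$ and from the condition $\operatorname{supp} \varphi \prec \psi$,
\[ [L, \varphi]u = L(\varphi u) - \varphi Lu = \sum_{j=1}^n \big( (\partial_j^2 \varphi) u + 2 (\partial_j \varphi) \partial_j u \big) \]
is a differential operator of order 1 with smooth coefficients and satisfies $[L, \varphi]u = [L, \varphi](\psi u)$. Hence
\[ L(\varphi u) = [L, \varphi](\psi u) + \varphi Lu \]
with $[L, \varphi](\psi u) \in H^{t-1}(\mathbf{R}^n)$, since $\psi u \in H^t(\mathbf{R}^n)$ and $[L, \varphi]$ has order 1, and $\varphi Lu \in \mathcal{D}(\mathbf{R}^n)$, and we conclude that $L(\varphi u) \in H^{t-1}(\mathbf{R}^n)$.

Therefore also
\[ (\Delta - 1)(\varphi u) = L(\varphi u) - (\lambda + 1)\varphi u \in H^{t-1}(\mathbf{R}^n), \]
because $\varphi u = \varphi \psi u \in H^t(\mathbf{R}^n)$, and, since $\Delta - 1$ is a bijective operator from $H^{t+1}(\mathbf{R}^n)$ to $H^{t-1}(\mathbf{R}^n)$, we conclude that $\varphi u \in H^{t+1}(\mathbf{R}^n)$. □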


For a more complete analysis concerning the Dirichlet problem we refer the reader to Brezis, “Analyse fonctionnelle” [5], and Folland, “Introduction to Partial Differential Equations” [14].

7.8. Exercises

Exercise 7.1. Calculate the Fourier integral of the following functions on $\mathbf{R}$:
(a) $f_1(t) = t e^{-t^2}$.
(b) $f_2(t) = \chi_{(a,b)}(t)$.
(c) $f_3(t) = e^{-|t|}$.
(d) $f_4(t) = (1 + t^2)^{-1}$.

Exercise 7.2. Assuming that $1 \le p \le \infty$ and $f \in L^p(\mathbf{R}^n)$, check that the Gauss-Weierstrass kernel
\[ W_t(x) := \frac{1}{(4\pi t)^{n/2}} e^{-|x|^2/4t} \qquad (t > 0) \]
is a summability kernel in $\mathcal{S}(\mathbf{R}^n)$, and prove that
\[ u(t, x) := \frac{1}{(4\pi t)^{n/2}} \int_{\mathbf{R}^n} e^{-|y|^2/4t} f(x - y)\, dy \qquad (t > 0) \]
defines a solution of the heat equation
\[ \partial_t u - \Delta u = 0 \]
on $(0, \infty) \times \mathbf{R}^n$.
If $p < \infty$, show that $\lim_{t \downarrow 0} u(t, \cdot) = f$ in $L^p(\mathbf{R}^n)$.
If $f$ is bounded and continuous, prove that $u$ has an extension to a continuous function on $[0, \infty) \times \mathbf{R}^n$ such that $u(0, x) = f(x)$ for all $x \in \mathbf{R}^n$.

Exercise 7.3. The heat flow in an infinitely long rod, given an initial temperature $f$, is described as the solution of the problem
\[ \partial_t u(x, t) = \partial_x^2 u(x, t), \qquad u(x, 0) = f(x). \]
Prove that if $f \in \mathcal{C}_0(\mathbf{R})$ is integrable, then the unique bounded classical solution is
\[ u(x, t) = \int_{\mathbf{R}} \hat f(\xi) e^{-4\pi^2 \xi^2 t} e^{2\pi i \xi x}\, d\xi = (f * K_t)(x) \]
where
\[ K_t(x) = \frac{1}{(4\pi t)^{1/2}} e^{-x^2/4t}. \]

Exercise 7.4. Find the norm of the Fourier transform $\mathcal{F} : L^1(\mathbf{R}^n) \to L^\infty(\mathbf{R}^n)$.


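Exercise 7.5. Is it true that $f, g \in L^1(\mathbf{R})$ and $f * g = 0$ imply either $f = 0$ or $g = 0$?

Exercise 7.6. Let $1 \le p \le \infty$. Prove that $\varphi \in \mathcal{E}(\mathbf{R}^n)$ is in $\mathcal{S}(\mathbf{R}^n)$ if and only if the functions $x^\beta D^\alpha \varphi(x)$ are all in $L^p(\mathbf{R}^n)$. Show that the inclusion $\mathcal{S}(\mathbf{R}^n) \hookrightarrow L^p(\mathbf{R}^n)$ is continuous.

Exercise 7.7. Show that if a rational function $f$ belongs to $\mathcal{S}(\mathbf{R}^n)$, then $f = 0$.

Exercise 7.8. If $f$ is a function on $\mathbf{R}^n$ such that $f\varphi \in \mathcal{S}(\mathbf{R}^n)$ for all $\varphi \in \mathcal{S}(\mathbf{R}^n)$, prove that the pointwise multiplication $f\cdot$ is a continuous linear operator on $\mathcal{S}(\mathbf{R}^n)$.

Exercise 7.9. Suppose $f \in \mathcal{S}(\mathbf{R})$ and $\hat f \in \mathcal{D}_{[-R,R]}$, and let $1 \le p < \infty$. Prove that there is a sequence of constants $C_n \ge 0$ ($n = 0, 1, 2, \dots$), which depend only on $p$, such that
\[ \|f^{(n)}\|_p \le C_n R^n \|f\|_p \qquad (n \in \mathbf{N}). \]

Exercise 7.10 (Hausdorff-Young). Show that, if $1 \le p \le 2$ and $f \in L^p(\mathbf{R}^n)$, then $\hat f \in L^\infty(\mathbf{R}^n) + L^2(\mathbf{R}^n)$ and $\mathcal{F} : L^p(\mathbf{R}^n) \to L^{p'}(\mathbf{R}^n)$.

Exercise 7.11. Show that $\operatorname{sinc} \in L^2(\mathbf{R}) \setminus L^1(\mathbf{R})$, and find $\|\operatorname{sinc}\|_2$, $\mathcal{F}(\operatorname{sinc})$, and $\widetilde{\mathcal{F}}(\operatorname{sinc})$.

Exercise 7.12. If $u$ is a solution of the Dirichlet problem (7.11) on a half-plane, find another solution by adding to $u$ an appropriate harmonic function.

Exercise 7.13. Show that $\ell^p \subset \mathcal{S}'(\mathbf{R})$ ($1 \le p \le \infty$) and that the injective linear map $\ell^p \to \mathcal{S}'$ such that $x[k] \mapsto \sum_{k=-\infty}^{\infty} x[k]\,\delta_k(t)$ is continuous.

Exercise 7.14 (Hausdorff-Young). Show that $\big(\sum_{k=-\infty}^{\infty} |c_k(f)|^{p'}\big)^{1/p'} \le \|f\|_p$, with the usual change if $p' = \infty$, if $1 \le p \le 2$ and $f \in L^p(\mathbf{T})$.

Exercise 7.15 (Poisson summation formula). Prove the following facts:
(a) If $\varphi \in \mathcal{S}(\mathbf{R})$, $\varphi_1(t) = \sum_{k=-\infty}^{+\infty} \varphi(t - k)$ is uniformly convergent.
(b) The Fourier series of $\varphi_1$ is also uniformly convergent.
(c) $\sum_{k=-\infty}^{+\infty} \varphi(k) = \sum_{k=-\infty}^{+\infty} \hat\varphi(k)$, with absolute convergence.
(d) For the Dirac comb Ш, $\widehat{Ш} = Ш$.

Exercise 7.16. If $1 \le q \le \infty$, prove that
\[ \|f\| := \big\| \{\|D^\alpha f\|_p\}_{|\alpha| \le m} \big\|_q \]
defines on $W^{m,p}(\Omega)$ a norm which is equivalent to $\|\cdot\|_{(m,p)}$.

Exercise 7.17. Every $u \in W^{1,p}(a, \infty)$ is uniformly continuous.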


Exercise 7.18. For a half-line $(a, \infty)$ prove a similar result to Theorem 7.25, now about the continuity of $W^{1,p}(a, \infty) \hookrightarrow \mathcal{C}[a, \infty) \cap L^\infty(a, \infty)$.

Exercise 7.19. Every $u \in W^{1,p}(0, \infty)$ can be extended to $Ru \in W^{1,p}(\mathbf{R})$ so that $Ru(t) = u(-t)$ if $t < 0$ and $Ru(x) - u(0) = \int_0^x v(t)\, dt$ for all $x \in \mathbf{R}$. Moreover, $R : W^{1,p}(0, \infty) \to W^{1,p}(\mathbf{R})$ is linear and continuous.

Exercise 7.20. The extension by zero, $P : H_0^1(\Omega) \to H^1(\mathbf{R}^n)$, is a continuous operator.

Exercise 7.21. If $\partial\Omega$ has zero measure, the extension by zero operator, $P$, satisfies $\partial_j P u = P \partial_j u$ for every $u \in H_0^1(\Omega)$.

Exercise 7.22. If $u \in H^1(-1, 1)$, its extension by zero, $u^o$, is not always in $H^1(\mathbf{R})$.

Exercise 7.23. If $s - k > n/2$ ($s \in \mathbf{R}$) and $m - k > n/2$ ($m \in \mathbf{N}$), prove that the inclusions $H^s(\mathbf{R}^n) \hookrightarrow \mathcal{E}^k(\mathbf{R}^n)$ and $H^m(\Omega) \hookrightarrow \mathcal{E}^k(\Omega)$ of the Sobolev Theorem 7.28 are continuous.

Exercise 7.24. Let $u(x) = e^{-2\pi|x|}$ as in Example 7.12. Prove the following facts:
(a) $u, u' \in L^2(\mathbf{R})$ (distributional derivative), and $u \in H^s(\mathbf{R})$ if $s < 3/2$.
(b) $u \notin H^{3/2}(\mathbf{R})$.

Exercise 7.25. Prove that the Dirichlet problem
\[ -\Delta u + u = f, \qquad u = 0 \text{ on } \partial\Omega \qquad (f \in L^2(\Omega)) \]
has a unique weak solution by applying the Lax-Milgram theorem to the sesquilinear form
\[ B(u, v) := \int_\Omega \big( \nabla u(x) \cdot \nabla \bar v(x) + u(x) \bar v(x) \big)\, dx \]
on $H_0^1(\Omega) \times H_0^1(\Omega)$.

References for further reading:
R. A. Adams, Sobolev Spaces.
H. Brezis, Analyse fonctionnelle: Théorie et applications.
G. B. Folland, Introduction to Partial Differential Equations.
I. M. Gelfand and G. E. Chilov, Generalized Functions.
D. Gilbarg and N. S. Trudinger, Elliptic Partial Differential Equations of Second Order.
L. Hörmander, Linear Partial Differential Operators.
E. H. Lieb and M. Loss, Analysis.
V. Mazya, Sobolev Spaces.
W. Rudin, Functional Analysis.
E. M. Stein and G. Weiss, Introduction to Fourier Analysis on Euclidean Spaces.

Chapter 8

Banach algebras

Some important Banach spaces are equipped in a natural way with a continuous product that determines a Banach algebra structure.¹ Two basic examples are $\mathcal{C}(K)$ with the pointwise multiplication and $\mathcal{L}(E)$ with the product of operators if $E$ is a Banach space. It can be useful for the reader to retain $\mathcal{C}(K)$ as a simple reference model.

The first work devoted to concrete Banach algebras is contained in some papers by J. von Neumann beginning in 1930. The advantage of considering algebras of operators was clear in his contributions, but it was the abstract setting of Banach algebras which proved to be convenient and which allowed the application of similar ideas in many directions.

The main operator on these algebras is the Gelfand transform² $\mathcal{G} : a \mapsto \hat a$, which maps a unitary commutative Banach algebra $A$ over $\mathbf{C}$ to the space $\mathcal{C}(\Delta)$ of all complex continuous functions on the spectrum $\Delta$ of $A$, which is the set of all nonzero elements $\chi \in A'$ that are multiplicative. Here $\Delta$ is endowed with the restriction of the $w^*$-topology and it is compact. As seen in Example 8.14, $\Delta$ is the set of all the evaluations $\delta_t$ ($t \in K$) if $A = \mathcal{C}(K)$, and $\hat f(\delta_t) = \delta_t(f) = f(t)$, so that in this case one can consider $\hat f = f$.

But we will be concerned with the spectral theory of operators in a complex Hilbert space $H$. If $T$ is a bounded normal operator in $H$, so that

¹ Banach algebras were first introduced in 1936 with the name of “linear metric rings” by the Japanese mathematician Mitio Nagumo. He extended Cauchy’s function theory to the functions with values in such an algebra to study the resolvent of an operator. They were renamed “Banach algebras” by Charles E. Rickart in 1946.

² Named after the Ukrainian mathematician Israel Moiseevich Gelfand, who is considered the creator, in 1941, of the theory of commutative Banach algebras. Gelfand and his colleagues created this theory which included the spectral theory of operators and proved to be an appropriate setting for harmonic analysis.


$T$ and the adjoint $T^*$ commute, then the closed Banach subalgebra $A = \langle T \rangle$ of $\mathcal{L}(H)$ generated by $I$, $T$, and $T^*$ is commutative. It turns out that the Gelfand theory of commutative Banach algebras is especially well suited in this setting. Through the change of variables $z = \hat T(\chi)$ one can consider $\sigma(T) \equiv \Delta$, and the Gelfand transform is a bijective mapping that allows us to define a functional calculus $g(T)$ by $\widehat{g(T)} = g(\hat T)$ if $g$ is a continuous function on the spectrum of $T$. For this continuous functional calculus there is a unique operator-valued measure $E$ on $\sigma(T)$ such that
\[ g(T) = \int_{\sigma(T)} g(\lambda)\, dE(\lambda), \]
and the functional calculus is extended by
\[ f(T) = \int_{\sigma(T)} f(\lambda)\, dE(\lambda) \]
to bounded measurable functions $f$.

The Gelfand transform, as a kind of abstract Fourier operator, is also a useful tool in harmonic analysis and in function theory. The proof of Wiener’s 1932 lemma contained in Exercise 8.15 is a nice unexpected application discovered by Gelfand in 1941, and generalizations of many theorems of Tauberian type and applications to the theory of locally compact groups have also been obtained with Gelfand’s methods. We refer the reader to the book by I. M. Gelfand, D. A. Raikov and G. E. Chilov [16] for more information.

8.1. Definition and examples

We say that $A$ is a complex Banach algebra or, simply, a Banach algebra if it is a complex Banach space with a bilinear multiplication and the norm satisfies
\[ \|xy\| \le \|x\| \|y\|, \]
so that the multiplication is continuous since, if $(x_n, y_n) \to (x, y)$, then
\[ \|xy - x_n y_n\| \le \|x\| \|y - y_n\| + \|x - x_n\| \|y_n\| \to 0. \]
Real Banach algebras are defined similarly.

The Banach algebra $A$ is said to be unitary if it has a unit, which is an element $e$ such that $xe = ex = x$ for all $x \in A$ and $\|e\| = 1$. This unit is unique since, if also $e'x = xe' = x$, then $e = ee' = e'$.

We will only consider unitary Banach algebras. As a matter of fact, every Banach algebra can be embedded in a unitary Banach algebra, as shown in Exercise 8.1.


Example 8.1. (a) If $X$ is a nonempty set, $B(X)$ will denote the unitary Banach algebra of all complex bounded functions on $X$, with the pointwise multiplication and the uniform norm $\|f\|_X := \sup_{x \in X} |f(x)|$. The unit is the constant function 1.
(b) If $K$ is a compact topological space, then $\mathcal{C}(K)$ is the closed subalgebra of $B(K)$ that contains all the continuous complex functions on $K$. It is a unitary Banach subalgebra of $B(K)$, since $1 \in \mathcal{C}(K)$.
(c) The disc algebra is the unitary Banach subalgebra $A(D)$ of $\mathcal{C}(\bar D)$ formed by the functions that are analytic in the open unit disc $D$. Since the uniform limits of analytic functions are also analytic, $A(D)$ is closed in $\mathcal{C}(\bar D)$.

Example 8.2. If $\Omega$ is a $\sigma$-finite measure space, $L^\infty(\Omega)$ denotes the unitary Banach algebra of all measurable complex functions on $\Omega$ with the usual norm $\|\cdot\|_\infty$ of the essential supremum. As usual, two functions are considered equivalent when they are equal a.e.

Example 8.3. Let $E$ be any nonzero complex Banach space. The Banach space $\mathcal{L}(E) = \mathcal{L}(E; E)$ of all bounded linear operators on $E$, endowed with the usual product of operators, is a unitary Banach algebra. The unit is the identity map $I$.

8.2. Spectrum

Throughout this section, $A$ denotes a unitary Banach algebra, possibly not commutative. An example is $\mathcal{L}(E)$, if $E$ is a complex Banach space.

A homomorphism between $A$ and a second unitary Banach algebra $B$ is a homomorphism of algebras $\Psi : A \to B$ such that $\Psi(e) = e$ if $e$ denotes the unit both in $A$ and in $B$.

The notion of the spectrum of an operator is extended to any element of $A$: The spectrum of $a \in A$ is the subset of $\mathbf{C}$
\[ \sigma_A(a) = \sigma(a) := \{\lambda \in \mathbf{C};\ \lambda e - a \notin G(A)\}, \]
where $G(A)$ denotes the multiplicative group of all invertible elements of $A$.

Note that, if $B$ is a unitary Banach subalgebra of $A$ and $b \in B$, an inverse of $\lambda e - b$ in $B$ is also an inverse in $A$, so that $\sigma_A(b) \subset \sigma_B(b)$.

Example 8.4. If $E$ is a complex Banach space and $T \in \mathcal{L}(E)$, we denote $\sigma(T) = \sigma_{\mathcal{L}(E)}(T)$. Thus, $\lambda \in \sigma(T)$ if and only if $T - \lambda I$ is not bijective, by the Banach-Schauder theorem. Recall that the eigenvalues of $T$, and also the approximate eigenvalues, are in $\sigma(T)$. Cf. Subsection 4.4.2.


Example 8.5. If $E$ is an infinite-dimensional Banach space and $T \in \mathcal{L}(E)$ is compact, the Riesz-Fredholm theory shows that $\sigma(T) \setminus \{0\}$ can be arranged in a sequence of nonzero eigenvalues (possibly finite), all of them with finite multiplicity, and $0 \in \sigma(T)$, by the Banach-Schauder theorem.

Example 8.6. The spectrum of an element $f$ of the Banach algebra $\mathcal{C}(K)$ is its image $f(K)$.

Indeed, the continuous function $f - \lambda$ has an inverse if it has no zeros, that is, if $f(t) \ne \lambda$ for all $t \in K$. Hence, $\lambda \in \sigma(f)$ if and only if $\lambda \in f(K)$.

Let us consider again a general unitary Banach algebra $A$.

Theorem 8.7. If $p(\lambda) = \sum_{n=0}^N c_n \lambda^n$ is a polynomial and $a \in A$, then $\sigma(p(a)) = p(\sigma(a))$.

Proof. We assume that $p(a) = c_0 e + c_1 a + \cdots + c_N a^N$, and we exclude the trivial case of a constant polynomial $p(\lambda) \equiv c_0$.

For a given $\mu \in \mathbf{C}$, by division we obtain $p(\mu) - p(\lambda) = (\mu - \lambda) q(\lambda)$ and $p(\mu)e - p(a) = (\mu e - a) q(a)$. If $\mu e - a \notin G(A)$, then also $p(\mu)e - p(a) \notin G(A)$. Hence, $p(\sigma(a)) \subset \sigma(p(a))$.

Conversely, if $\mu \in \sigma(p(a))$, by factorization we can write
\[ \mu - p(\lambda) = \alpha (\lambda_1 - \lambda) \cdots (\lambda_N - \lambda) \]
with $\alpha \ne 0$. Then $\mu e - p(a) = \alpha (\lambda_1 e - a) \cdots (\lambda_N e - a)$, where $\mu e - p(a) \notin G(A)$, so that $\lambda_i e - a \notin G(A)$ for some $1 \le i \le N$. Thus, $\lambda_i \in \sigma(a)$ and we have $p(\lambda_i) = \mu$, which means that $\mu \in p(\sigma(a))$. □
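Theorem 8.7 can be checked numerically in the simplest noncommutative example, the algebra of $2 \times 2$ complex matrices, where the spectrum is the set of eigenvalues. The following is only a minimal sketch under that assumption; the helper names (`matmul`, `poly_of_matrix`, `spectrum_2x2`, `poly_of_scalar`) are hypothetical and matrices are plain nested lists.

```python
import cmath


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def poly_of_matrix(coeffs, a):
    """Horner evaluation of p(a) = c_0 e + c_1 a + ... + c_N a^N, coeffs = [c_0, ..., c_N]."""
    identity = [[1, 0], [0, 1]]
    result = [[0, 0], [0, 0]]
    for c in reversed(coeffs):
        result = matmul(result, a)
        result = [[result[i][j] + c * identity[i][j] for j in range(2)]
                  for i in range(2)]
    return result


def poly_of_scalar(coeffs, z):
    """Horner evaluation of p(z) for a complex number z."""
    value = 0
    for c in reversed(coeffs):
        value = value * z + c
    return value


def spectrum_2x2(a):
    """Eigenvalues of a 2x2 matrix: the roots of z^2 - tr(a) z + det(a)."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + root) / 2, (tr - root) / 2]
```

For a matrix `a` and a coefficient list `coeffs`, the eigenvalues returned by `spectrum_2x2(poly_of_matrix(coeffs, a))` should agree, up to rounding and ordering, with the values `poly_of_scalar(coeffs, z)` for `z` in `spectrum_2x2(a)`.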

The resolvent of an element $a \in A$ is the function $R_a : \sigma(a)^c \to A$ such that $R_a(\lambda) = (\lambda e - a)^{-1}$. It plays an important role in spectral theory.

Note that, if $\lambda \ne 0$,
\[ R_a(\lambda) = -(a - \lambda e)^{-1} = \lambda^{-1} (e - \lambda^{-1} a)^{-1}. \]

To study the basic properties of $R_a$, we will use some facts from function theory. As in the numerical case and with the same proofs, a vector-valued function $F : \Omega \to A$ on an open subset $\Omega$ of $\mathbf{C}$ is said to be analytic or holomorphic if every point $z_0 \in \Omega$ has a neighborhood where $F$ is the sum of a convergent power series:
\[ F(z) = \sum_{n=0}^\infty (z - z_0)^n a_n \qquad (a_n \in A). \]


The series is absolutely convergent at every point of the convergence disc, which is the open disc in $\mathbf{C}$ with center $z_0$ and radius
\[ R = \frac{1}{\limsup_n \|a_n\|^{1/n}} > 0. \]
The Cauchy theory remains true without any change in this setting, and $F$ is analytic if and only if, for every $z \in \Omega$, the complex derivative
\[ F'(z) = \lim_{h \to 0} \frac{F(z + h) - F(z)}{h} \]
exists.

We will show that $\sigma(a)$ is closed and bounded and, to prove that $R_a$ is analytic on $\sigma(a)^c$, we will see that $R_a'(\lambda)$ exists whenever $\lambda \notin \sigma(a)$. Let us first show that $R_a$ is analytic on $|\lambda| > \|a\|$.

Theorem 8.8. (a) If $\|a\| < 1$, $e - a \in G(A)$ and
\[ (e - a)^{-1} = \sum_{n=0}^\infty a^n \qquad (a^0 := e). \]
(b) If $|\lambda| > \|a\|$, then $\lambda \notin \sigma(a)$ and
\[ R_a(\lambda) = \sum_{n=0}^\infty \lambda^{-n-1} a^n. \]
(c) Moreover,
\[ \|R_a(\lambda)\| \le \frac{1}{|\lambda| - \|a\|}, \]
and $\lim_{|\lambda| \to \infty} R_a(\lambda) = 0$.

Proof. (a) As in (2.7), the Neumann series $\sum_{n=0}^\infty a^n$ is absolutely convergent ($\|a^m\| \le \|a\|^m$ and $\|a\| < 1$), so that $z = \sum_{n=0}^\infty a^n \in A$ exists, and it is easy to check that $z$ is the right and left inverse of $e - a$. For instance,
\[ \lim_{N \to \infty} (e - a) \sum_{n=0}^N a^n = (e - a) z, \]
since the multiplication by $e - a$ is linear and continuous, and
\[ (e - a) \sum_{n=0}^N a^n = \sum_{n=0}^N a^n - \sum_{n=1}^{N+1} a^n = e - a^{N+1} \to e \quad \text{if } N \to \infty. \]
(b) Note that
\[ R_a(\lambda) = \lambda^{-1} (e - \lambda^{-1} a)^{-1} \]
and, if $\|\lambda^{-1} a\| < 1$, we obtain the announced expansion from (a).


(c) Finally,
\[ \|R_a(\lambda)\| = |\lambda|^{-1} \Big\| \sum_{n=0}^\infty \lambda^{-n} a^n \Big\| \le \frac{1}{|\lambda| - \|a\|}. \]
□
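The Neumann series of Theorem 8.8(a) is easy to test in the algebra of $2 \times 2$ real matrices with the maximum row sum norm, which is submultiplicative and gives norm 1 to the identity. The sketch below rests on those assumptions; the function names are hypothetical, and the truncation level `terms` is an arbitrary choice.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def norm_inf(a):
    """Maximum absolute row sum, an operator norm on 2x2 matrices."""
    return max(sum(abs(v) for v in row) for row in a)


def neumann_inverse(a, terms=200):
    """Partial sum e + a + a^2 + ... + a^terms, approximating (e - a)^{-1} when ||a|| < 1."""
    total = [[1.0, 0.0], [0.0, 1.0]]   # running sum, starts at a^0 = e
    power = [[1.0, 0.0], [0.0, 1.0]]   # current power a^n
    for _ in range(terms):
        power = matmul(power, a)
        total = [[total[i][j] + power[i][j] for j in range(2)] for i in range(2)]
    return total
```

With `a = [[0.2, 0.3], [0.1, 0.4]]`, whose norm is 0.5, the product of $e - a$ with `neumann_inverse(a)` is the identity up to rounding, and the norm of the result respects the bound $1/(1 - \|a\|)$ that follows from summing the geometric series.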

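The spectral radius of $a \in A$ is the number
\[ r(a) := \sup\{|\lambda|;\ \lambda \in \sigma(a)\}. \]
From Theorem 8.8 we have that $r(a) \le \|a\|$, an inequality that can be strict.

The following estimates are useful.

Lemma 8.9. (a) If $\|a\| < 1$,
\[ \|(e - a)^{-1} - e - a\| \le \frac{\|a\|^2}{1 - \|a\|}. \]
(b) If $x \in G(A)$ and $\|h\| < 1/(2\|x^{-1}\|)$, then $x + h \in G(A)$ and
\[ \|(x + h)^{-1} - x^{-1} + x^{-1} h x^{-1}\| \le 2 \|x^{-1}\|^3 \|h\|^2. \]

Proof. To check (a), we only need to sum the right-hand side series in
\[ \|(e - a)^{-1} - e - a\| = \Big\| \sum_{n=2}^\infty a^n \Big\| \le \sum_{n=2}^\infty \|a\|^n. \]
To prove (b) note that $x + h = x(e + x^{-1}h)$, and we have $\|x^{-1}h\| \le \|x^{-1}\| \|h\| < 1/2$. If we apply (a) to $a = -x^{-1}h$, since $\|a\| < 1/2$, we obtain that $x + h \in G(A)$, and
\[ \|(x + h)^{-1} - x^{-1} + x^{-1} h x^{-1}\| \le \|(e - a)^{-1} - e - a\| \|x^{-1}\| \]
with $\|(e - a)^{-1} - e - a\| \le \|x^{-1}h\|^2 / (1 - \|a\|) \le 2 \|x^{-1}\|^2 \|h\|^2$. □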

Theorem 8.10. (a) $G(A)$ is an open subset of $A$ and $x \in G(A) \mapsto x^{-1} \in G(A)$ is continuous.
(b) $R_a$ is analytic on $\sigma(a)^c$ and zero at infinity.
(c) $\sigma(a)$ is a nonempty subset of $\mathbf{C}$ and³
\[ r(a) = \lim_{n \to \infty} \|a^n\|^{1/n} = \inf_n \|a^n\|^{1/n}. \]

³ This spectral radius formula and the analysis of the resolvent have a precedent in the study by Angus E. Taylor (1938) of operators which depend analytically on a parameter. This formula was included in the 1941 paper by I. Gelfand on general Banach algebras.


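Proof. (a) According to Lemma 8.9(b), for every $x \in G(A)$,
\[ B\Big(x, \frac{1}{2\|x^{-1}\|}\Big) \subset G(A) \]
and $G(A)$ is an open subset of $A$. Moreover
\[ \|(x + h)^{-1} - x^{-1}\| \le \|(x + h)^{-1} - x^{-1} + x^{-1} h x^{-1}\| + \|x^{-1} h x^{-1}\| \to 0 \]
if $\|h\| \to 0$, and $x \in G(A) \mapsto x^{-1} \in G(A)$ is continuous.

(b) On $\sigma(a)^c$,
\[ R_a'(\lambda) = \lim_{\mu \to 0} \mu^{-1} \big[ ((\lambda + \mu)e - a)^{-1} - (\lambda e - a)^{-1} \big] = -R_a(\lambda)^2 \]
follows from an application of Lemma 8.9(b) to $x = \lambda e - a$ and $h = \mu e$. In this case $x^{-1} h x^{-1} = \mu x^{-1} x^{-1}$ and, writing $x^{-2} = x^{-1} x^{-1}$, we obtain
\[ \mu^{-1} [(x + \mu e)^{-1} - x^{-1}] = \mu^{-1} [(x + \mu e)^{-1} - x^{-1} + x^{-1} h x^{-1}] - x^{-2} \to -x^{-2} \]
as $\mu \to 0$, since
\[ \|\mu^{-1} [(x + \mu e)^{-1} - x^{-1} + x^{-1} h x^{-1}]\| \le |\mu|^{-1}\, 2 \|x^{-1}\|^3 |\mu|^2 \to 0. \]

By Theorem 8.8(c), $\|R_a(\lambda)\| \le 1/(|\lambda| - \|a\|) \to 0$ if $|\lambda| \to \infty$.

(c) Recall that $\sigma(a) \subset \{\lambda;\ |\lambda| \le r(a)\}$ and $r(a) \le \|a\|$. The set $\sigma(a)$ is closed in $\mathbf{C}$, since $\sigma(a)^c = F^{-1}(G(A))$ with $F(\lambda) := \lambda e - a$, which is a continuous function from $\mathbf{C}$ to $A$, and $G(A)$ is an open subset of $A$. Hence $\sigma(a)$ is a compact subset of $\mathbf{C}$.

If we suppose that $\sigma(a) = \emptyset$, we will arrive at a contradiction. The function $R_a$ would be entire and bounded, with $\lim_{|\lambda| \to \infty} R_a(\lambda) = 0$, and the Liouville theorem is also true in the vector-valued case: for every $u \in A'$, $u \circ R_a$ would be an entire complex function and $\lim_{|\lambda| \to \infty} u(R_a(\lambda)) = 0$, so that $u(R_a(\lambda)) = 0$ and by the Hahn-Banach theorem $R_a(\lambda) = 0$, a contradiction to $R_a(\lambda) \in G(A)$.

Let us calculate the spectral radius. Since
\[ R_a(\lambda) = \lambda^{-1} \sum_{n=0}^\infty \lambda^{-n} a^n \]
if $|\lambda| > r(a)$, the power series $\sum_{n=0}^\infty z^n a^n$ is absolutely convergent when $|z| = |\lambda|^{-1} < 1/r(a)$, and the convergence radius of $\sum_{n=0}^\infty \|a^n\| |z|^n$ is
\[ R = \big( \limsup_{n \to \infty} \|a^n\|^{1/n} \big)^{-1} \ge 1/r(a). \]
Then, $r(a) \ge \limsup_{n \to \infty} \|a^n\|^{1/n}$.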


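Conversely, if $\lambda \in \sigma(a)$, then $\lambda^n \in \sigma(a^n)$ by Theorem 8.7, so that $|\lambda^n| \le \|a^n\|$ and
\[ |\lambda| \le \inf_n \|a^n\|^{1/n} \le \liminf_n \|a^n\|^{1/n}. \]
Then it follows that $r(a) = \lim_{n \to \infty} \|a^n\|^{1/n} = \inf_n \|a^n\|^{1/n}$. □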
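The spectral radius formula, and the fact that $r(a) \le \|a\|$ can be strict, can be observed on a non-normal $2 \times 2$ matrix. The following sketch assumes the matrix algebra with the maximum row sum norm; the names `norm_inf` and `gelfand_estimates` are hypothetical. For the Jordan-type block with diagonal entries 0.5 and upper entry 1, the only eigenvalue is 0.5, so $r(a) = 0.5$ while $\|a\| = 1.5$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def norm_inf(a):
    """Maximum absolute row sum, an operator norm on 2x2 matrices."""
    return max(sum(abs(v) for v in row) for row in a)


def gelfand_estimates(a, n_max):
    """Return the list of ||a^n||^(1/n) for n = 1, ..., n_max."""
    estimates = []
    power = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, n_max + 1):
        power = matmul(power, a)          # power is now a^n
        estimates.append(norm_inf(power) ** (1.0 / n))
    return estimates
```

For `a = [[0.5, 1.0], [0.0, 0.5]]` the first estimate is the norm 1.5, and the later ones decrease slowly towards the spectral radius 0.5 while never falling below it, in agreement with $r(a) = \inf_n \|a^n\|^{1/n}$.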

As an important application of these results, let us show that $\mathbf{C}$ is the unique Banach algebra which is a field, in the sense that if $A$ is a field, then $\lambda \mapsto \lambda e$ is an isometric isomorphism from $\mathbf{C}$ onto $A$. The inverse isometry is the canonical isomorphism:

Theorem 8.11 (Gelfand-Mazur⁴). If every nonzero element of the unitary Banach algebra $A$ is invertible (i.e., $G(A) = A \setminus \{0\}$), then $A = \mathbf{C}e$, and $\lambda \mapsto \lambda e$ is the unique homomorphism of unitary algebras between $\mathbf{C}$ and $A$.

Proof. Let $a \in A$ and $\lambda \in \sigma(a)$ ($\sigma(a) \ne \emptyset$). Then $\lambda e - a \notin G(A)$ and it follows from the hypothesis that $a = \lambda e$. A homomorphism $\mathbf{C} \to A = \mathbf{C}e$ maps $1 \mapsto e$ and necessarily $\lambda \mapsto \lambda e$. □

8.3. Commutative Banach algebras

In this section $A$ represents a commutative unitary Banach algebra. Some examples are $\mathbf{C}$, $B(X)$, $\mathcal{C}(K)$, and $L^\infty(\Omega)$. Recall that $\mathcal{L}(E)$ (if $\dim E > 1$) is not commutative, and the convolution algebra $L^1(\mathbf{R})$ is not unitary.

8.3.1. Maximal ideals, characters, and the Gelfand transform. A character of $A$ is a homomorphism $\chi : A \to \mathbf{C}$ of unitary Banach algebras (hence $\chi(e) = 1$). We use $\Delta(A)$, or simply $\Delta$, to denote the set of all characters of $A$. It is called the spectrum of $A$.

An ideal, $J$, of $A$ is a linear subspace such that $AJ \subset J$ and $J \ne A$. It cannot contain invertible elements, since $x \in J$ invertible would imply $e = xx^{-1} \in J$ and then $A = Ae \subset J$, a contradiction to $J \ne A$.

Note that, if $J$ is an ideal, then $\bar J$ is also an ideal, since it follows from $J \cap G(A) = \emptyset$ that $e \notin \bar J$ and $\bar J \ne A$. The continuity of the operations implies that $\bar J + \bar J \subset \bar J$ and $A \bar J \subset \bar J$. This shows that every maximal ideal is closed.

⁴ According to a result announced in 1938 by Stanislaw Mazur, a close collaborator of Banach who made important contributions to geometrical methods in linear and nonlinear functional analysis, and proved by Gelfand in 1941.


Theorem 8.12. (a) The kernel of every character is a maximal ideal and the map $\chi \mapsto \operatorname{Ker} \chi$ between characters and maximal ideals of $A$ is bijective.
(b) Every character $\chi \in \Delta(A)$ is continuous and
\[ \|\chi\| = \sup_{\|a\|_A \le 1} |\chi(a)| = 1. \]
(c) An element $a \in A$ is invertible if and only if $\chi(a) \ne 0$ for every $\chi \in \Delta$.
(d) $\sigma(a) = \{\chi(a);\ \chi \in \Delta(A)\}$, and $r(a) = \sup_{\chi \in \Delta} |\chi(a)|$.

Proof. (a) The kernel $M$ of any $\chi \in \Delta(A)$ is an ideal and, as the kernel of a nonzero linear functional, it is a hyperplane; that is, the complementary subspaces of $M$ in $A$ are one-dimensional, since $\chi$ is bijective on them, and $M$ is maximal.

If $M$ is a maximal ideal, the quotient space $A/M$ has a natural structure of unitary Banach algebra, and it is a field. Indeed, if $\pi : A \to A/M$ is the canonical mapping and $\pi(x) = \tilde x$ is not invertible in $A/M$, then $J = \pi(xA) \ne A/M$ is an ideal of $A/M$, and $\pi^{-1}(J) \ne A$ is an ideal of $A$ which is contained in a maximal ideal that contains $M$. Thus, $\pi^{-1}(J) = M$, so that $\pi(xA) \subset \pi(M) = \{0\}$ and $\tilde x = 0$.

Let $\tilde\chi : A/M = \mathbf{C}\tilde e \to \mathbf{C}$ be the canonical isometry, so that $M$ is the kernel of the character $\chi_M := \tilde\chi \circ \pi_M$. Any other character $\chi_1$ with the same kernel $M$ factorizes as a composition of $\pi_M$ with a bijective homomorphism between $A/M$ and $\mathbf{C}$ which has to be the canonical mapping $\mathbf{C}\tilde e \to \mathbf{C}$, and then $\chi_1 = \chi_M$.

(b) If $\chi = \chi_M \in \Delta(A)$, then $\|\chi\| \le \|\pi_M\| \|\tilde\chi\| = \|\pi_M\| \le 1$ and $\|\chi\| \ge \chi(e) = 1$.

(c) If $x \in G(A)$, we have seen that it does not belong to any ideal. If $x \notin G(A)$, then $xA$ does not contain $e$ and is an ideal, and by Zorn’s lemma every ideal is contained in a maximal ideal. So $x \in G(A)^c$ if and only if $x$ belongs to a maximal ideal or, equivalently, $\chi(x) = 0$ for some character $\chi$.

(d) Finally, $\lambda e - a \notin G(A)$ if and only if $\chi(\lambda e - a) = 0$, that is, $\lambda = \chi(a)$, for some $\chi \in \Delta(A)$. □

We associate to every element $a$ of the unitary commutative algebra $A$ the function $\hat a$ which is the restriction of $\langle a, \cdot \rangle$ to the characters,⁵ so that
\[ \hat a : \Delta(A) \to \mathbf{C} \]
is such that $\hat a(\chi) = \chi(a)$. On $\Delta(A) \subset \bar B_{A'}$ we consider the Gelfand topology, which is the restriction of the weak-star topology $w^* = \sigma(A', A)$ of $A'$.

⁵ Recall that $\langle a, u \rangle = u(a)$ was defined for every $u$ in the dual $A'$ of $A$ as a Banach space.


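In this way, $\hat a \in \mathcal{C}(\Delta(A))$, and $\mathcal{G} : a \in A \mapsto \hat a \in \mathcal{C}(\Delta(A))$ is called the Gelfand transform.

Theorem 8.13. Endowed with the Gelfand topology, $\Delta(A)$ is compact and the Gelfand transform $\mathcal{G} : A \to \mathcal{C}(\Delta(A))$ is a continuous homomorphism of commutative unitary Banach algebras. Moreover $\|\hat a\| = r(a) \le \|a\|$ and $\mathcal{G}e = 1$, so that $\|\mathcal{G}\| = 1$.

Proof. For the first part we only need to show that $\Delta \subset \bar B_{A'}$ is $w^*$-closed, since $\bar B_{A'}$ is $w^*$-compact, by the Alaoglu theorem. But
\[ \Delta = \{\xi \in \bar B_{A'};\ \xi(e) = 1,\ \xi(xy) = \xi(x)\xi(y)\ \ \forall x, y \in A\} \]
is the intersection of the $w^*$-closed sets of $\bar B_{A'}$ defined by the conditions
\[ \langle xy, \cdot \rangle - \langle x, \cdot \rangle \langle y, \cdot \rangle = 0 \quad (x, y \in A) \qquad \text{and} \qquad \langle e, \cdot \rangle = 1. \]
It is clear that $\mathcal{G}$ is a homomorphism of commutative unitary Banach algebras. For instance, $\hat e(\chi) = \chi(e) = 1$ and $\widehat{xy}(\chi) = \chi(x)\chi(y) = \hat x(\chi) \hat y(\chi)$.

Also, $\|\hat a\| = \sup_{\chi \in \Delta} |\chi(a)| = r(a) \le \|a\|$, according to Theorem 8.12(d). □

Example 8.14. If $K$ is a compact topological space, then $\mathcal{C}(K)$ is a unitary commutative Banach algebra whose characters are the evaluation maps $\delta_t$ at the different points $t \in K$, and $t \in K \mapsto \delta_t \in \Delta$ is a homeomorphism.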

Obviously $\delta_t \in \Delta$. Conversely, if $\chi = \chi_M \in \Delta$, we will show that there is a common zero for all $f \in M$. If not, for every $t \in K$ there would exist some $f_t \in M$ such that $f_t(t) \ne 0$, and $|f_t| \ge \varepsilon_t > 0$ on a neighborhood $U(t)$ of this point $t$. Then, $K = U(t_1) \cup \cdots \cup U(t_N)$, and the function
\[ f = |f_{t_1}|^2 + \cdots + |f_{t_N}|^2 = f_{t_1} \bar f_{t_1} + \cdots + f_{t_N} \bar f_{t_N}, \]
which belongs to $M$, would be invertible, since it has no zeros.

Hence, there exists some $t \in K$ such that $f(t) = 0$ for every $f \in M$. But $M$ is maximal, so it contains all the functions $f \in \mathcal{C}(K)$ such that $f(t) = 0$; that is, $M = \operatorname{Ker} \delta_t$ and $\chi = \delta_t$.

Both $K$ and $\Delta$ are compact spaces and $t \in K \mapsto \delta_t \in \Delta$, being continuous and bijective, is a homeomorphism.

8.3.2. Algebras of bounded analytic functions. Suppose that $\Omega$ is a bounded domain of $\mathbf{C}$ and denote by $H^\infty(\Omega)$ the algebra of bounded analytic functions in $\Omega$, which is a commutative Banach algebra under the uniform norm
\[ \|f\|_\infty = \sup_{z \in \Omega} |f(z)|. \]
It is a unitary Banach subalgebra of $B(\Omega)$.


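The Gelfand transform $\mathcal{G} : H^\infty(\Omega) \to \mathcal{C}(\Delta)$ is an isometric isomorphism onto its image, since $\|f^2\| = \|f\|^2$ and $r(f) = \|f\|$ for every $f \in A = H^\infty(\Omega)$, and we can see $H^\infty(\Omega)$ as a unitary Banach subalgebra of $\mathcal{C}(\Delta)$ (cf. Exercise 8.18).

For every $\zeta \in \Omega$, the evaluation map $\delta_\zeta$ is the character of $H^\infty(\Omega)$ uniquely determined by $\delta_\zeta(z) = \zeta$, where $z$ denotes the coordinate function. Indeed, if $\chi \in \Delta$ satisfies the condition $\chi(z) = \zeta$ and if $f \in H^\infty(\Omega)$, then
\[ f(z) = f(\zeta) + \frac{f(z) - f(\zeta)}{z - \zeta}(z - \zeta) \]
and
\[ \chi(f) = f(\zeta) + \chi\Big(\frac{f - f(\zeta)}{z - \zeta}\Big) \chi(z - \zeta) = f(\zeta). \]

It can be shown that the embedding $\Omega \hookrightarrow \Delta$ such that $\zeta \mapsto \delta_\zeta$ is a homeomorphism from $\Omega$ onto an open subset of $\Delta$ (see Exercise 8.14 where we consider the case $\Omega = U$, the unit disc) and, for every $f \in H^\infty(\Omega)$, it is convenient to write $\hat f(\delta_\zeta) = f(\zeta)$ if $\zeta \in \Omega$.

Suppose now that $\xi \in \partial\Omega$, a boundary point. Note that $z - \xi$ is not invertible in $H^\infty(\Omega)$, so that
\[ \Delta_\xi := \{\chi \in \Delta;\ \chi(z - \xi) = 0\} = \{\chi \in \Delta;\ \chi(z) = \xi\} = (\hat z)^{-1}(\xi) \]
is not empty. For every $\chi \in \Delta$, $\chi(z - \chi(z)) = 0$ and $z - \chi(z)$ is not invertible, so that $\chi(z) \in \bar\Omega$ and $\chi \in \Omega$ or $\chi \in \Delta_\xi$ for some $\xi \in \partial\Omega$. That is,
\[ \Delta = \Omega \cup \Big( \bigcup_{\xi \in \partial\Omega} \Delta_\xi \Big) \tag{8.1} \]
and we can imagine $\Delta$ as the domain $\Omega$ with a compact fiber $\Delta_\xi = (\hat z)^{-1}(\xi)$ lying above every $\xi \in \partial\Omega$.

The corona problem asks whether $\Omega$ is dense in $\Delta$ for the Gelfand topology, and it admits a more elementary equivalent formulation in terms of function theory: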

Theorem 8.15. For the Banach algebra $H^\infty(\Omega)$, the domain $\Omega$ is dense in $\Delta$ if and only if the following condition holds: If $f_1, \dots, f_n \in H^\infty(\Omega)$ and if
\[ |f_1(\zeta)| + \cdots + |f_n(\zeta)| \ge \delta > 0 \tag{8.2} \]
for every $\zeta \in \Omega$, then there exist $g_1, \dots, g_n \in H^\infty(\Omega)$ such that
\[ f_1 g_1 + \cdots + f_n g_n = 1. \tag{8.3} \]


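Proof. Suppose that $\Omega$ is dense in $\Delta$. By continuity, if $|f_1| + \cdots + |f_n| \ge \delta$ on $\Omega$, then also $|\hat f_1| + \cdots + |\hat f_n| \ge \delta$ on $\Delta$, so that $\{f_1, \dots, f_n\}$ is contained in no maximal ideal and
\[ 1 \in H^\infty(\Omega) = f_1 H^\infty(\Omega) + \cdots + f_n H^\infty(\Omega). \]

Conversely, suppose $\Omega$ is not dense in $\Delta$ and choose $\chi_0 \in \Delta$ with a neighborhood $V$ disjoint from $\Omega$. The Gelfand topology is the $w^*$-topology and this neighborhood has the form
\[ V = \Big\{\chi;\ \max_{j=1,\dots,n} |\chi(h_j) - \chi_0(h_j)| < \delta\Big\} \qquad (h_1, \dots, h_n \in H^\infty(\Omega)). \]

The functions $f_j = h_j - \chi_0(h_j)$ satisfy (8.2) because $\delta_\zeta \notin V$ for every $\zeta \in \Omega$, and then $|f_j(\zeta)| = |\delta_\zeta(h_j) - \chi_0(h_j)| \ge \delta$ for some $j$. But (8.3) is not possible because $f_1, \dots, f_n \in \operatorname{Ker} \chi_0$ and $\chi_0(1) = 1$. □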

Starting from the above equivalence, in 1962 Carleson⁶ solved the corona problem for the unit disc, that is, $D$ is dense in $\Delta(H^\infty(D))$. The version of the corona theorem for the disc algebra is much easier. See Exercise 8.3.

8.4. C∗-algebras

We are going to consider a class of algebras whose Gelfand transform is a bijective and isometric isomorphism. Gelfand introduced his theory to study these algebras.

8.4.1. Involutions. A $C^*$-algebra is a unitary Banach algebra with an involution, which is a mapping $x \in A \mapsto x^* \in A$ that satisfies the following properties:
(a) $(x + y)^* = x^* + y^*$,
(b) $(\lambda x)^* = \bar\lambda x^*$,
(c) $(xy)^* = y^* x^*$,
(d) $x^{**} = x$, and
(e) $e^* = e$
for any $x, y \in A$ and $\lambda \in \mathbf{C}$, and such that $\|x^* x\| = \|x\|^2$ for every $x \in A$.

⁶ The Swedish mathematician Lennart Carleson, awarded the Abel Prize in 2006, has solved some outstanding problems such as the corona problem (1962) and the almost everywhere convergence of Fourier series of any function in $L^2(\mathbf{T})$ (1966) and in complex dynamics. To quote Carleson, “The corona construction is widely regarded as one of the most difficult arguments in modern function theory. Those who take the time to learn it are rewarded with one of the most malleable tools available. Many of the deepest arguments concerning hyperbolic manifolds are easily accessible to those who understand well the corona construction.”


An involution is always bijective and it is its own inverse. It is isometric, since $\|x\|^2 = \|x^* x\| \le \|x^*\| \|x\|$, so that $\|x\| \le \|x^*\|$ and $\|x^*\| \le \|x^{**}\| = \|x\|$.

Throughout this section, $A$ will be a $C^*$-algebra.

If $H$ is a complex Hilbert space, $\mathcal{L}(H)$ is a $C^*$-algebra with the involution $T \mapsto T^*$, where $T^*$ denotes the adjoint of $T$. It has been proved in Theorem 4.4 that $\|T^* T\| = \|T T^*\| = \|T\|^2$.

Let $A$ and $B$ be two $C^*$-algebras. A homomorphism of $C^*$-algebras is a homomorphism $\Psi : A \to B$ of unitary Banach algebras such that $\Psi(x^*) = \Psi(x)^*$ (and, of course, $\Psi(e) = e$).

We say that $a \in A$ is hermitian or self-adjoint if $a = a^*$. The orthogonal projections of $H$ are hermitian elements of $\mathcal{L}(H)$. We say that $a \in A$ is normal if $aa^* = a^*a$.

Example 8.16. If $a \in A$ is normal and $\langle a \rangle$ denotes the closed subalgebra of $A$ generated by $a$, $a^*$, and $e$, then $\langle a \rangle$ contains all elements of $A$ that can be obtained as the limits of sequences of polynomials in $a$, $a^*$ and $e$. With the restriction of the involution of $A$, $\langle a \rangle$ is a commutative $C^*$-algebra.

Lemma 8.17. Assume that $A$ is commutative.
(a) If $a = a^* \in A$, then $\sigma_A(a) \subset \mathbf{R}$.
(b) For every $a \in A$ and $\chi \in \Delta(A)$, $\chi(a^*) = \overline{\chi(a)}$.

Proof. (a) Let $a = a^*$ and $\chi \in \Delta(A)$. If $t \in \mathbf{R}$, since $\|\chi\| = 1$,
\[ |\chi(a + ite)|^2 \le \|a + ite\|^2 = \|(a + ite)^*(a + ite)\| = \|(a - ite)(a + ite)\| = \|a^2 + t^2 e\| \le \|a\|^2 + t^2. \]
Let $\chi(a) = \alpha + i\beta$ ($\alpha, \beta \in \mathbf{R}$). Then
\[ \|a\|^2 + t^2 \ge |\alpha + i\beta + it|^2 = \alpha^2 + \beta^2 + 2\beta t + t^2, \]
i.e., $\|a\|^2 \ge \alpha^2 + \beta^2 + 2\beta t$ for every $t \in \mathbf{R}$, and it follows that $\beta = 0$ and $\chi(a) = \alpha \in \mathbf{R}$.

(b) For any $a \in A$, if $x = (a + a^*)/2$ and $y = (a - a^*)/2i$, we obtain $a = x + iy$ with $x, y$ hermitian, $\chi(x), \chi(y) \in \mathbf{R}$, and $a^* = x - iy$. Hence, $\chi(a) = \chi(x) + i\chi(y)$ and $\chi(a^*) = \chi(x) - i\chi(y) = \overline{\chi(a)}$. □

Theorem 8.18. If $B$ is a closed unitary subalgebra of $A$ such that $b^* \in B$ for every $b \in B$, then $\sigma_B(b) = \sigma_A(b)$ for every $b \in B$.

Proof. First let $b^* = b$. From Lemma 8.17 we know that $\sigma_{\langle b \rangle}(b) \subset \mathbf{R}$ and, obviously,
\[ \sigma_A(b) \subset \sigma_B(b) \subset \sigma_{\langle b \rangle}(b) = \partial\sigma_{\langle b \rangle}(b). \]

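To prove the inverse inclusions, it is sufficient to show that $\partial\sigma_{\langle b \rangle}(b) \subset \sigma_A(b)$. Let $\lambda \in \partial\sigma_{\langle b \rangle}(b)$ and suppose that $\lambda \notin \sigma_A(b)$. There exists $x \in A$ so that $x(b - \lambda e) = (b - \lambda e)x = e$ and the existence of $\lambda_n \notin \sigma_{\langle b \rangle}(b)$ such that $\lambda_n \to \lambda$ follows from $\lambda \in \partial\sigma_{\langle b \rangle}(b)$. Thus we have
\[ (b - \lambda_n e)^{-1} \in \langle b \rangle \subset A, \qquad b - \lambda_n e \to b - \lambda e, \qquad \text{and} \qquad (b - \lambda_n e)^{-1} \to (b - \lambda e)^{-1} = x. \]
Hence $x \in \langle b \rangle$, in contradiction to $\lambda \in \sigma_{\langle b \rangle}(b)$.

In the general case we only need to prove that if $x \in B$ has an inverse $y$ in $A$, then $y \in B$ also. But it follows from $xy = e = yx$ that $(x^* x)(y y^*) = e = (y y^*)(x^* x)$, and $x^* x$ is hermitian. In this case we have seen above that $x^* x$ has its unique inverse in $B$, so that $y y^* = (x^* x)^{-1} \in B$ and $y = y(y^* x^*) = (y y^*) x^* \in B$. □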

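8.4.2. The Gelfand-Naimark theorem and functional calculus. We have proved in Theorem 8.13 that the Gelfand transform satisfies $\|\hat a\|_\Delta = r(a) \le \|a\|$, but in the general case it may not be injective. This is not the case for $C^*$-algebras.

Theorem 8.19 (Gelfand-Naimark). If $A$ is a commutative $C^*$-algebra, then the Gelfand transform $\mathcal{G} : A \to \mathcal{C}(\Delta(A))$ is a bijective isometric isomorphism of $C^*$-algebras.

Proof. We have $\widehat{a^*}(\chi) = \chi(a^*) = \overline{\chi(a)} = \overline{\hat a(\chi)}$ and $\mathcal{G}(a^*) = \overline{\mathcal{G}(a)}$.

If $x^* = x$, then $r(x) = \lim_n \|x^{2^n}\|^{1/2^n} = \|x\|$, since $\|x^2\| = \|x x^*\| = \|x\|^2$ and, by induction, $\|x^{2^{n+1}}\| = \|(x^{2^n})^2\| = \|x^{2^n}\|^2 = \|x\|^{2^{n+1}}$.

If we take $x = a^* a$, then $\|\widehat{a^* a}\|_\Delta = \|a^* a\|$, so
\[ \|a\|^2 = \|a^* a\| = \|\widehat{a^* a}\|_\Delta = \|\bar{\hat a}\, \hat a\|_\Delta = \|\hat a\|_\Delta^2 \]
and $\|a\| = \|\hat a\|_\Delta$.

Since $\mathcal{G}$ is an isometric homomorphism, $\mathcal{G}(A)$ is a closed subalgebra of $\mathcal{C}(\Delta(A))$. This subalgebra contains the constant functions ($\hat e = 1$) and it is self-conjugate and separates points (if $\chi_1 \ne \chi_2$, there exists $a \in A$ such that $\chi_1(a) \ne \chi_2(a)$, i.e., $\hat a(\chi_1) \ne \hat a(\chi_2)$). By the complex form of the Stone-Weierstrass theorem, the image is also dense, so $\mathcal{G}(A) = \mathcal{C}(\Delta(A))$ and $\mathcal{G}$ is bijective. □

Theorem 8.20. Let $a$ be a normal element of the $C^*$-algebra $A$, let $\Delta = \Delta\langle a \rangle$ be the spectrum of the subalgebra $\langle a \rangle$, and let $\mathcal{G} : \langle a \rangle \to \mathcal{C}(\Delta)$ be the Gelfand transform. The function $\hat a : \Delta \to \sigma_A(a) = \sigma_{\langle a \rangle}(a)$ is a homeomorphism.

Proof. We know $\sigma(a) = \hat a(\Delta)$. If $\chi_1, \chi_2 \in \Delta$, from $\hat a(\chi_1) = \hat a(\chi_2)$ we obtain $\chi_1(a) = \chi_2(a)$, $\chi_1(a^*) = \overline{\chi_1(a)} = \overline{\chi_2(a)} = \chi_2(a^*)$, and $\chi_1(e) = 1 = \chi_2(e)$, so that $\chi_1(x) = \chi_2(x)$ for all $x \in \langle a \rangle$; hence, $\chi_1 = \chi_2$ and $\hat a : \Delta \to \sigma(a)$ is bijective and continuous between two compact spaces, and then the inverse is also continuous. □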


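The homeomorphism $\hat a : \Delta \to \sigma(a)$ ($\lambda = \hat a(\chi)$) allows us to define the isometric isomorphism of $C^*$-algebras $\tau : \mathcal C(\sigma(a)) \to \mathcal C(\Delta)$, $\tau(g) = g \circ \hat a$, such that $[g(\lambda)] \mapsto [G(\chi)] = [g(\hat a(\chi))]$. By Theorem 8.19, the composition
\[ \Phi_a = \mathcal G^{-1} \circ \tau : \mathcal C(\sigma(a)) \to \mathcal C(\Delta) \to \langle a \rangle \subset A, \]
such that $g \in \mathcal C(\sigma(a)) \mapsto \mathcal G^{-1}(g(\hat a)) \in \langle a \rangle$, is also an isometric isomorphism of $C^*$-algebras. If $g \in \mathcal C(\sigma(a))$, then the identity $\widehat{\Phi_a(g)} = g \circ \hat a = g(\hat a)$ suggests that we may write $g(a) := \Phi_a(g)$. So, we have the isometric isomorphism of $C^*$-algebras
\[ \Phi_a : g \in \mathcal C(\sigma(a)) \mapsto g(a) \in \langle a \rangle \subset A \]
such that, if $g_0(\lambda) = \lambda$ is the identity on $\sigma(a)$, then $\widehat{\Phi_a(g_0)} = \hat a$ and $g_0(a) = a$, since $\tau(g_0) = \hat a = \mathcal G(a)$. Also $\bar g_0(a) = a^*$ and
\[ p(a) = \sum_{0 \le j,k \le N} c_{j,k}\, a^j (a^*)^k \quad \text{if } p(z) = \sum_{0 \le j,k \le N} c_{j,k}\, z^j \bar z^k. \tag{8.4} \]

We call $\Phi_a$ the functional calculus with continuous functions. It is the unique homomorphism $\Phi : \mathcal C(\sigma(a)) \to A$ of $C^*$-algebras such that
\[ \Phi(p) = \sum_{0 \le j,k \le N} c_{j,k}\, a^j (a^*)^k \quad \text{if } p(z) = \sum_{0 \le j,k \le N} c_{j,k}\, z^j \bar z^k. \]
Indeed, it follows from the Stone-Weierstrass theorem that the subalgebra $\mathcal P$ of all polynomials $p(z)$ considered in (8.4) is dense in $\mathcal C(\sigma(a))$ and, if $g = \lim_n p_n$ in $\mathcal C(\sigma(a))$ with $p_n \in \mathcal P$, then
\[ \Phi(g) = \lim_n p_n(a) = \Phi_a(g). \]

These facts are easily checked and justify the notation $g(a)$ for $\Phi_a(g)$.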
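In finite dimensions the continuous functional calculus reduces to applying $g$ to the eigenvalues of a normal matrix. The following Python sketch illustrates formula (8.4) under that assumption; the unitary matrix `U`, the eigenvalues `EIGS`, and the helper names are illustrative choices, not part of the text.

```python
# Sketch: functional calculus for a normal 2x2 matrix a = U diag(EIGS) U^*.
# U and EIGS are arbitrary illustrative data (assumptions, not from the text).
s = 2 ** -0.5
U = [[s, s], [s * 1j, -s * 1j]]      # columns form an orthonormal basis of C^2
EIGS = [1 + 2j, -1j]                 # the spectrum sigma(a)


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def adjoint(x):
    return [[complex(x[j][i]).conjugate() for j in range(len(x))]
            for i in range(len(x[0]))]


def diag(values):
    return [[v if i == j else 0.0 for j in range(len(values))]
            for i, v in enumerate(values)]


def func_calc(g):
    """g(a) := U diag(g(lambda_k)) U^*, a finite-dimensional model of Phi_a(g)."""
    return matmul(matmul(U, diag([g(lam) for lam in EIGS])), adjoint(U))


def close(x, y, tol=1e-12):
    return all(abs(p - q) < tol for rx, ry in zip(x, y) for p, q in zip(rx, ry))


a = func_calc(lambda z: z)                         # g0(a) = a
a_star = adjoint(a)
p_algebraic = matmul(matmul(a, a), a_star)         # a^2 a^* computed in the algebra
p_calculus = func_calc(lambda z: z * z * z.conjugate())   # p(z) = z^2 conj(z)

print(close(matmul(a, a_star), matmul(a_star, a)))        # a is normal
print(close(p_algebraic, p_calculus))                     # formula (8.4)
print(close(func_calc(lambda z: z.conjugate()), a_star))  # conj(g0)(a) = a^*
```

The three comparisons check normality of $a$, the polynomial formula (8.4) for $p(z) = z^2\bar z$, and the compatibility of the calculus with the involution.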

8.5. Spectral theory of bounded normal operators

In this section we are going to consider normal operators $T \in \mathcal L(H)$. By Theorem 8.18, $\sigma(T) = \sigma_{\mathcal L(H)}(T) = \sigma_{\langle T \rangle}(T)$ and it is a nonempty compact subset of $\mathbb C$.

From now on, by $B(\sigma(T))$ we will denote the $C^*$-algebra of all bounded Borel measurable functions $f : \sigma(T) \to \mathbb C$, endowed with the involution $f \mapsto \bar f$ and with the uniform norm. Obviously $\mathcal C(\sigma(T))$ is a closed unitary subalgebra of $B(\sigma(T))$.

An application of the Gelfand-Naimark theorem to the commutative $C^*$-algebra $\langle T \rangle$ gives an isometric homomorphism from $\langle T \rangle$ onto $\mathcal C(\Delta\langle T \rangle)$. The composition of this homomorphism with the change of variables $\lambda = \hat T(\chi)$ ($\lambda \in \sigma(T)$ and $\chi \in \Delta\langle T \rangle$) defines the functional calculus with continuous functions on $\sigma(T)$,
\[ g \in \mathcal C(\sigma(T)) \mapsto g(T) \in \langle T \rangle \subset \mathcal L(H), \]
which is an isometric homomorphism of $C^*$-algebras. If $x, y \in H$ are given, then
\[ u_{x,y}(g) := (g(T)x, y)_H \]
defines a continuous linear form on $\mathcal C(\sigma(T))$ and, by the Riesz-Markov representation theorem,
\[ (g(T)x, y)_H = u_{x,y}(g) = \int_{\sigma(T)} g \, d\mu_{x,y} \]
for a unique complex Borel measure $\mu_{x,y}$ on $\sigma(T)$.

We will say that $\{\mu_{x,y}\}$ is the family of complex spectral measures of $T$. For any bounded Borel measurable function $f$ on $\sigma(T)$, we can define
\[ u_{x,y}(f) := \int_{\sigma(T)} f \, d\mu_{x,y} \]
and in this way we extend $u_{x,y}$ to a linear form on these functions. Note that $|u_{x,y}(g)| = |(g(T)x, y)_H| \le \|x\|_H \|y\|_H \|g\|_{\sigma(T)}$.

8.5.1. Functional calculus of normal operators.

Now our goal is to show that it is possible to define $f(T) \in \mathcal L(H)$ for every $f$ in the $C^*$-algebra $B(\sigma(T))$ of all bounded Borel measurable functions on $\sigma(T) \subset \mathbb C$, equipped with the uniform norm and with the involution $f \mapsto \bar f$, so that
\[ (f(T)x, y)_H = u_{x,y}(f) = \int_{\sigma(T)} f \, d\mu_{x,y}, \]
in the hope of obtaining a functional calculus $f \mapsto f(T)$ for bounded but not necessarily continuous functions.

Theorem 8.21. Let $T \in \mathcal L(H)$ be a normal operator ($TT^* = T^*T$) and let $\{\mu_{x,y}\}$ be its family of complex spectral measures. Then there exists a unique homomorphism of $C^*$-algebras
\[ \Phi_T : B(\sigma(T)) \to \mathcal L(H) \]
such that
\[ (\Phi_T(f)x, y)_H = \int_{\sigma(T)} f \, d\mu_{x,y} \quad (x, y \in H). \]
It is an extension of the continuous functional calculus $g \mapsto g(T)$, and $\|\Phi_T(f)\| \le \|f\|_{\sigma(T)}$.


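Proof. Note that, if $\mu_1$ and $\mu_2$ are two complex Borel measures on $\sigma(T)$ and if $\int g \, d\mu_1 = \int g \, d\mu_2$ for all real $g \in \mathcal C(\sigma(T))$, then $\mu_1 = \mu_2$, by the uniqueness in the Riesz-Markov representation theorem.

If $g \in \mathcal C(\sigma(T))$ is a real function, then $g(T)$ is self-adjoint, since $g(T)^* = \bar g(T)$. Hence, $(g(T)x, y)_H = \overline{(g(T)y, x)_H}$ and then
\[ \int_{\sigma(T)} g \, d\mu_{x,y} = \overline{\int_{\sigma(T)} g \, d\mu_{y,x}} = \int_{\sigma(T)} g \, d\bar\mu_{y,x}, \]
so that $\mu_{x,y} = \bar\mu_{y,x}$.

Obviously, $(x, y) \mapsto \int_{\sigma(T)} g \, d\mu_{x,y} = (g(T)x, y)_H$ is a continuous sesquilinear form and, from the uniqueness in the Riesz-Markov representation theorem, the map $(x, y) \mapsto \mu_{x,y}(B)$ is also sesquilinear, for any Borel set $B \subset \sigma(T)$. For instance, $\mu_{x,\lambda y} = \bar\lambda \mu_{x,y}$, since for continuous functions we have
\[ \int_{\sigma(T)} g \, d\mu_{x,\lambda y} = \bar\lambda\, (g(T)x, y)_H = \int_{\sigma(T)} g \, d(\bar\lambda \mu_{x,y}). \]
With the extension $u_{x,y}(f) := \int_{\sigma(T)} f \, d\mu_{x,y}$ of $u_{x,y}$ to functions $f$ in $B(\sigma(T))$, it is still true that
\[ |u_{x,y}(f)| \le \|x\|_H \|y\|_H \|f\|_{\sigma(T)}. \]
For every $f \in B(\sigma(T))$,
\[ (x, y) \mapsto B_f(x, y) := \int_{\sigma(T)} f \, d\mu_{x,y} \]
is a continuous sesquilinear form on $H \times H$ and $B_{\bar f}(y, x) = \overline{B_f(x, y)}$, since
\[ \int_{\sigma(T)} \bar f \, d\mu_{y,x} = \overline{\int_{\sigma(T)} f \, d\mu_{x,y}} \]
(the identity $\mu_{x,y}(B) = \overline{\mu_{y,x}(B)}$ extends to simple functions).

Let us check that an application of the Riesz representation theorem produces a unique operator $\Phi_T(f) \in \mathcal L(H)$ such that $B_f(x, y) = (\Phi_T(f)x, y)_H$. Note that $B_{\bar f}(\cdot, x) \in H'$ and there is a unique $\Phi_T(f)x \in H$ so that $B_{\bar f}(y, x) = (y, \Phi_T(f)x)_H$ for all $y \in H$. Then
\[ (\Phi_T(f)x, y)_H = \overline{B_{\bar f}(y, x)} = B_f(x, y) = \int_{\sigma(T)} f \, d\mu_{x,y} \quad (x, y \in H). \]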

It is clear that $B_f(x, y)$ is linear in $f$ and that we have defined a bounded linear mapping $\Phi_T : B(\sigma(T)) \to \mathcal L(H)$ such that
\[ |(\Phi_T(f)x, y)_H| \le \|f\|_{\sigma(T)} \|x\|_H \|y\|_H \]
and $\|\Phi_T(f)\| \le \|f\|_{\sigma(T)}$. Moreover, with this definition, $\Phi_T$ extends the functional calculus with continuous functions, $g \mapsto g(T)$.

To prove that $\Phi_T$ is a continuous homomorphism of $C^*$-algebras, all that remains is to check its behavior with the involution and with the product.

If $f$ is real, then from $\mu_{x,y} = \bar\mu_{y,x}$ we obtain $(\Phi_T(f)x, y)_H = \overline{(\Phi_T(f)y, x)_H}$ and $\Phi_T(f)^* = \Phi_T(f)$. In the case of a complex function $f$, $\Phi_T(f)^* = \Phi_T(\bar f)$ follows by linearity.

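Finally, to prove that $\Phi_T(f_1 f_2) = \Phi_T(f_1)\Phi_T(f_2)$, we note that, on continuous functions,
\[ \int_{\sigma(T)} h g \, d\mu_{x,y} = (h(T)g(T)x, y)_H = \int_{\sigma(T)} h \, d\mu_{g(T)x,y} \]
and $g \, d\mu_{x,y} = d\mu_{g(T)x,y}$ ($x, y \in H$). Hence, also
\[ \int_{\sigma(T)} f_1 g \, d\mu_{x,y} = \int_{\sigma(T)} f_1 \, d\mu_{g(T)x,y} \]
if $f_1$ is bounded, and then
\[ \int_{\sigma(T)} f_1 g \, d\mu_{x,y} = (\Phi_T(f_1) g(T)x, y)_H = (g(T)x, \Phi_T(f_1)^* y)_H = \int_{\sigma(T)} g \, d\mu_{x,\Phi_T(f_1)^* y}. \]
Again $f_1 \, d\mu_{x,y} = d\mu_{x,\Phi_T(f_1)^* y}$, and also $\int_{\sigma(T)} f_1 f_2 \, d\mu_{x,y} = \int_{\sigma(T)} f_2 \, d\mu_{x,\Phi_T(f_1)^* y}$ if $f_1$ and $f_2$ are bounded. Thus,
\[ (\Phi_T(f_1 f_2)x, y)_H = \int_{\sigma(T)} f_1 f_2 \, d\mu_{x,y} = \int_{\sigma(T)} f_2 \, d\mu_{x,\Phi_T(f_1)^* y} = (\Phi_T(f_2)x, \Phi_T(f_1)^* y)_H = (\Phi_T(f_1)\Phi_T(f_2)x, y)_H \]
and $\Phi_T$ is multiplicative. □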

As in the case of the functional calculus for continuous functions, if $f \in B(\sigma(T))$, we will denote the operator $\Phi_T(f)$ by $f(T)$; that is,
\[ (f(T)x, y)_H = \int_{\sigma(T)} f \, d\mu_{x,y} \quad (x, y \in H). \]

8.5.2. Spectral measures.

For a given Hilbert space $H$, a spectral measure, or a resolution of the identity, on a locally compact subset $K$ of $\mathbb C$ (or of $\mathbb R^n$), is an operator-valued mapping defined on the Borel $\sigma$-algebra $\mathcal B_K$ of $K$,
\[ E : \mathcal B_K \to \mathcal L(H), \]
that satisfies the following conditions:


(1) Each E(B) is an orthogonal projection.

(2) E(∅) = 0 and E(K)=I, the identity operator.

(3) If $B_n \in \mathcal B_K$ ($n \in \mathbb N$) are disjoint, then
\[ E\Big(\biguplus_{n=1}^\infty B_n\Big)x = \sum_{n=1}^\infty E(B_n)x \]
for every $x \in H$, and it is said that
\[ E\Big(\biguplus_{n=1}^\infty B_n\Big) = \sum_{n=1}^\infty E(B_n) \]
for the strong convergence, or that $E$ is strongly $\sigma$-additive.

Note that $E$ also has the following properties:

(4) If $B_1 \cap B_2 = \emptyset$, then $E(B_1)E(B_2) = 0$ (orthogonality).

(5) $E(B_1 \cap B_2) = E(B_1)E(B_2) = E(B_2)E(B_1)$ (multiplicativity).

(6) If $B_1 \subset B_2$, then $\operatorname{Im} E(B_1) \subset \operatorname{Im} E(B_2)$ (usually represented by $E(B_1) \le E(B_2)$).

(7) If $B_n \uparrow B$ or $B_n \downarrow B$, then $\lim_n E(B_n)x = E(B)x$ for every $x \in H$ (it is said that $E(B_n) \to E(B)$ strongly).

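Indeed, to prove (4), if $y = E(B_2)x$, the equality
\[ (E(B_1) + E(B_2))^2 = E(B_1 \uplus B_2)^2 = E(B_1) + E(B_2) \]
and the condition $B_1 \cap B_2 = \emptyset$ yield
\[ E(B_1)E(B_2) + E(B_2)E(B_1) = 0. \]
Applied to $y$, for which $E(B_2)y = y$, this gives $E(B_1)y + E(B_2)E(B_1)y = 0$ and, applying $E(B_2)$ to both sides, $E(B_2)E(B_1)y = 0$, so that $E(B_1)y = 0$.

Now (5) follows from multiplying the equations
\[ E(B_1) = E(B_1 \cap B_2) + E(B_1 \setminus (B_1 \cap B_2)), \qquad E(B_2) = E(B_1 \cap B_2) + E(B_2 \setminus (B_1 \cap B_2)) \]
and taking into account (4).

If $B_n \uparrow B$, then $\lim_n E(B_n)x = E(B)x$ follows from (3), since
\[ B = B_1 \uplus (B_2 \setminus B_1) \uplus (B_3 \setminus B_2) \uplus \cdots. \]

The decreasing case $B_n \downarrow B$ reduces to $K \setminus B_n \uparrow K \setminus B$.

It is also worth noticing that the spectral measure $E$ generates the family of complex measures $E_{x,y}$ ($x, y \in H$) defined as
\[ E_{x,y}(B) := (E(B)x, y)_H. \]
If $x \in H$, then $E_x(B) := E(B)x$ defines a vector measure $E_x : \mathcal B_K \to H$, i.e., $E_x(\emptyset) = 0$ and $E_x\big(\biguplus_{n=1}^\infty B_n\big) = \sum_{n=1}^\infty E_x(B_n)$ in $H$.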


Note that for every $x \in H$, $E_{x,x}$ is a (positive) measure such that
\[ E_{x,x}(B) = (E(B)x, x)_H = \|E(B)x\|_H^2, \qquad E_{x,x}(K) = \|x\|_H^2, \]
a probability measure if $\|x\|_H = 1$, and that operations with the complex measures $E_{x,y}$, by polarization, reduce to operations with positive measures:
\[ E_{x,y}(B) = \frac14 \Big( E_{x+y,x+y}(B) - E_{x-y,x-y}(B) + i E_{x+iy,x+iy}(B) - i E_{x-iy,x-iy}(B) \Big). \]

The notions $E$-almost everywhere ($E$-a.e.) and $E$-essential supremum have the usual meaning. In particular, if $f$ is a real measurable function,
\[ E\text{-}\sup f = \inf\{M \in \mathbb R;\ f \le M \ E\text{-a.e.}\}. \]

Note that $E(B) = 0$ if and only if $E_{x,x}(B) = 0$ for every $x \in H$. Thus, if $B_1 \subset B_2$ and $E(B_2) = 0$, then $E(B_1) = 0$, and the class of $E$-null sets is closed under countable unions.
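The reduction of the complex measures $E_{x,y}$ to the positive measures $E_{z,z}$ by polarization can be checked numerically for a single projection. The sketch below is only an illustration: the rank-one projection `P` (standing for one value $E(B)$) and the vectors `x`, `y` are arbitrary choices, and the inner product is taken linear in its first argument, as in the text.

```python
# Sketch: polarization identity for E_{x,y}(B) = (E(B)x, y), with E(B) = P.
# P, x and y are illustrative data, not taken from the text.

def inner(u, v):
    """Inner product of C^n, linear in u and conjugate-linear in v."""
    return sum(p * complex(q).conjugate() for p, q in zip(u, v))


def apply(m, x):
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]


def combo(x, y, c):
    """Return the vector x + c*y."""
    return [p + c * q for p, q in zip(x, y)]


v = [0.6, 0.8j, 0.0]                                       # a unit vector
P = [[p * complex(q).conjugate() for q in v] for p in v]   # projection onto span{v}


def E_xy(x, y):
    return inner(apply(P, x), y)


def E_xx(z):
    return E_xy(z, z).real          # (Pz, z) = ||Pz||^2 is real and nonnegative


x = [1 + 1j, 2.0, -1j]
y = [0.5, 1j, 3.0]

polarized = 0.25 * (E_xx(combo(x, y, 1)) - E_xx(combo(x, y, -1))
                    + 1j * E_xx(combo(x, y, 1j)) - 1j * E_xx(combo(x, y, -1j)))

print(abs(E_xy(x, y) - polarized) < 1e-12)
```

The same computation works for any orthogonal projection in place of `P`, since only sesquilinearity of $(x, y) \mapsto (E(B)x, y)_H$ is used.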

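The support of a spectral measure $E$ on $K$ is defined as the least closed set $\operatorname{supp} E$ such that $E(K \setminus \operatorname{supp} E) = 0$. The support consists precisely of those points in $K$ for which every neighborhood has nonzero $E$-measure, and $E(B) = E(B \cap \operatorname{supp} E)$ for every Borel set $B \subset K$.

The existence of the support is proved by considering the union $V$ of the open sets $V_\alpha$ of $K$ such that $E(V_\alpha) = 0$. Since there is a sequence $V_n$ of open sets in $K$ such that $V_\alpha = \bigcup_{\{n;\ V_n \subset V_\alpha\}} V_n$ for every $\alpha$, then $V$ is also the union of the countably many $V_n$ that are contained in some $V_\alpha$, each of which is $E$-null, and $E(V) = 0$. Then $\operatorname{supp} E = K \setminus V$.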

We write
\[ R = \int_K f \, dE \]
to mean that
\[ (Rx, y)_H = \int_K f \, dE_{x,y} \quad (x, y \in H). \]

It is natural to ask whether the family $\{\mu_{x,y}\}$ of complex measures associated to a normal operator $T$ is generated by a single spectral measure $E$ associated to $T$. The next theorem shows that the answer is affirmative, allowing us to rewrite the functional calculus of Theorem 8.21 as $f(T) = \int_{\sigma(T)} f \, dE$.


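Theorem 8.22 (Spectral resolution⁷). If $T \in \mathcal L(H)$ is a normal operator, then there exists a unique spectral measure $E : \mathcal B_{\sigma(T)} \to \mathcal L(H)$ which satisfies
\[ T = \int_{\sigma(T)} \lambda \, dE(\lambda). \tag{8.5} \]
Furthermore,
\[ f(T) = \int_{\sigma(T)} f(\lambda) \, dE(\lambda) \quad (f \in B(\sigma(T))), \tag{8.6} \]
and
\[ E(B) = \chi_B(T) \quad (B \in \mathcal B_{\sigma(T)}). \tag{8.7} \]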

Proof. If $\Phi_T : B(\sigma(T)) \to \mathcal L(H)$ is the homomorphism that defines the functional calculus, then to obtain (8.6) we must define $E$ by condition (8.7),
\[ E(B) := \Phi_T(\chi_B) \quad (B \in \mathcal B_{\sigma(T)}), \]
and then check that $E$ is a spectral measure with the required properties.

Obviously, $E(B)^2 = E(B)$ and $E(B)^* = E(B)$ ($\chi_B$ is real), so that $E(B)$ is an orthogonal projection (Theorem 3.13). Moreover, it follows from the properties of the functional calculus for continuous functions that $E(\sigma(T)) = \Phi_T(1) = 1(T) = I$ and $E(\emptyset) = \Phi_T(0) = 0$ and, since $\Phi_T$ is linear, $E$ is additive. Also, from
\[ (E(B)x, y)_H = (\Phi_T(\chi_B)x, y)_H = \mu_{x,y}(B), \]
we obtain that
\[ \int_{\sigma(T)} g \, dE = g(T), \qquad \int_{\sigma(T)} f \, dE = \Phi_T(f) \quad \big(g \in \mathcal C(\sigma(T)),\ f \in B(\sigma(T))\big). \]

Finally, $E$ is strongly $\sigma$-additive since, if $B_n$ ($n \in \mathbb N$) are disjoint Borel sets, $E(B_n)E(B_m) = 0$ if $n \ne m$, so that the images of the projections $E(B_n)$ are mutually orthogonal (if $y = E(B_m)x$, we have $y \in \operatorname{Ker} E(B_n)$ and $y \in E(B_n)(H)^\perp$) and then, for every $x \in H$, $\sum_n E(B_n)x$ is convergent to some $Px \in H$ since
\[ \sum_n \|E(B_n)x\|_H^2 \le \|x\|_H^2, \]
this being true for partial sums, $\sum_{n=1}^N \|E(B_n)x\|_H^2 = \big\|E\big(\biguplus_{n=1}^N B_n\big)x\big\|_H^2 \le \|x\|_H^2$.

⁷ In their work on integral equations, D. Hilbert for a self-adjoint operator on $\ell^2$ and F. Riesz on $L^2$ used the Stieltjes integral
\[ \int_{\sigma(T)} t \, dE(t) = \lim \sum_k t_k \big(E(t_k) - E(t_{k-1})\big) \]
(here $E(t_k) - E(t_{k-1}) = E(t_{k-1}, t_k]$) to obtain this spectral theorem.


But then
\[ (Px, y)_H = \sum_{n=1}^\infty (E(B_n)x, y)_H = \mu_{x,y}\Big(\biguplus_{n=1}^\infty B_n\Big) = \Big(E\Big(\biguplus_{n=1}^\infty B_n\Big)x, y\Big)_H \]
and $\sum_n E(B_n)x = Px = E\big(\biguplus_n B_n\big)x$.⁸

The uniqueness of $E$ follows from the uniqueness for the functional calculus for continuous functions $\Phi_T$ and from the uniqueness of the measures $E_{x,y}$ in the Riesz-Markov representation theorem. □

Remark 8.23. A more general spectral theorem due to John von Neumann in 1930 can also be obtained from the Gelfand-Naimark theorem: any commutative family of normal operators admits a single spectral measure which simultaneously represents all operators of the family as integrals $\int_K g \, dE$ for various functions $g$.
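In finite dimensions the spectral measure of Theorem 8.22 is simply $E(B) = \sum_{\lambda_k \in B} P_k$, where $P_k$ is the orthogonal projection onto the $k$-th eigenvector. The Python sketch below illustrates (8.5), (8.7) and the multiplicativity property (5) in that setting; the orthonormal basis `BASIS`, the eigenvalues `EIGS` (real here, so the operator is self-adjoint), and the sets `B1`, `B2` are illustrative assumptions.

```python
# Sketch: the spectral measure of a 3x3 normal matrix T = sum_k lambda_k P_k.
# BASIS, EIGS, B1, B2 are illustrative assumptions, not data from the text.
s = 2 ** -0.5
BASIS = [[s, s * 1j, 0.0], [s, -s * 1j, 0.0], [0.0, 0.0, 1.0]]  # orthonormal eigenvectors
EIGS = [2.0, -1.0, 0.5]                                         # sigma(T)

ZERO = [[0.0] * 3 for _ in range(3)]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]


def outer(u):
    """Rank-one orthogonal projection onto span{u} for a unit vector u."""
    return [[p * complex(q).conjugate() for q in u] for p in u]


def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(3)] for i in range(3)]


def scale(c, x):
    return [[c * x[i][j] for j in range(3)] for i in range(3)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def close(x, y, tol=1e-12):
    return all(abs(p - q) < tol for rx, ry in zip(x, y) for p, q in zip(rx, ry))


def borel_calc(f):
    """f(T) = sum_k f(lambda_k) P_k for any bounded function f on sigma(T)."""
    total = ZERO
    for lam, u in zip(EIGS, BASIS):
        total = add(total, scale(f(lam), outer(u)))
    return total


def E(indicator):
    """E(B) = chi_B(T), with the Borel set B given by its indicator predicate."""
    return borel_calc(lambda lam: 1.0 if indicator(lam) else 0.0)


T = borel_calc(lambda lam: lam)            # (8.5): T is the integral of lambda dE


def B1(lam):
    return lam > 0


def B2(lam):
    return lam < 1


print(close(E(lambda lam: True), IDENTITY))                                   # E(sigma(T)) = I
print(close(E(lambda lam: B1(lam) and B2(lam)), matmul(E(B1), E(B2))))        # property (5)
print(close(matmul(add(T, scale(-2.0, IDENTITY)), E(lambda lam: lam == 2.0)), ZERO))
```

The last line checks that the range of $E(\{2\})$ is annihilated by $T - 2I$, in the spirit of the description of eigenvalues through $E(\{\lambda_0\})$.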

8.5.3. Applications.

There are two special instances of normal operators that we are interested in: self-adjoint operators and unitary operators.

Recall that an operator $U \in \mathcal L(H)$ is said to be unitary if it is a bijective isometry of $H$. This means that
\[ UU^* = U^*U = I, \]
since $U^*U = I$ if and only if $(Ux, Uy)_H = (x, y)_H$, that is, if and only if $U$ is an isometry. If it is bijective, then $(U^{-1}x, U^{-1}y)_H = (x, y)_H$ and $((U^{-1})^*U^{-1}x, y)_H = (x, y)_H$, where $(U^{-1})^*U^{-1}x = (U^*)^{-1}U^{-1}x = (UU^*)^{-1}x$, and then $((UU^*)^{-1}x, y)_H = (x, y)_H$, so that $UU^* = I$. Conversely, if $UU^* = I$, then $U$ is exhaustive.

The Fourier transform is an important example of a unitary operator of $L^2(\mathbb R^n)$.

Knowing the spectrum allows us to determine when a normal operator is self-adjoint or unitary:

Theorem 8.24. Let $T \in \mathcal L(H)$ be a normal operator.
(a) $T$ is self-adjoint if and only if $\sigma(T) \subset \mathbb R$.
(b) $T$ is unitary if and only if $\sigma(T) \subset S = \{\lambda;\ |\lambda| = 1\}$.

Proof. We will apply the continuous functional calculus $\Phi_T$ for $T$ to the identity function $g(\lambda) = \lambda$ on $\sigma(T)$, so that $g(T) = T$ and $\bar g(T) = T^*$.

From the injectivity of $\Phi_T$, $T = T^*$ if and only if $g = \bar g$, meaning that $\lambda = \bar\lambda \in \mathbb R$ for every $\lambda \in \sigma(T)$.

Similarly, $T$ is unitary if and only if $TT^* = T^*T = I$, i.e., when $g\bar g = 1$, which means that $|\lambda| = 1$ for all $\lambda \in \sigma(T)$. □
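Theorem 8.24 is easy to observe on normal matrices, where the spectrum is the set of eigenvalues. The following sketch builds normal $2 \times 2$ matrices $U \operatorname{diag}(\lambda_1, \lambda_2) U^*$ from a fixed unitary `U` (an illustrative choice) and compares the location of the spectrum with self-adjointness and unitarity.

```python
import cmath

# Sketch: Theorem 8.24 for normal 2x2 matrices T = U diag(eigs) U^*.
# The unitary U and the eigenvalue lists are illustrative assumptions.
s = 2 ** -0.5
U = [[s, s], [s * 1j, -s * 1j]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def adjoint(x):
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]


def close(x, y, tol=1e-12):
    return all(abs(p - q) < tol for rx, ry in zip(x, y) for p, q in zip(rx, ry))


def normal_matrix(eigs):
    d = [[eigs[0], 0.0], [0.0, eigs[1]]]
    return matmul(matmul(U, d), adjoint(U))


def is_self_adjoint(t):
    return close(t, adjoint(t))


def is_unitary(t):
    return close(matmul(t, adjoint(t)), IDENTITY) and close(matmul(adjoint(t), t), IDENTITY)


cases = {
    "real spectrum": [1.0, -3.0],
    "spectrum on the unit circle": [1j, cmath.exp(0.7j)],
    "spectrum in {1, -1}": [1.0, -1.0],
    "neither": [2j, 1.0],
}
results = {name: (is_self_adjoint(normal_matrix(e)), is_unitary(normal_matrix(e)))
           for name, e in cases.items()}

for name, (self_adjoint, unitary) in results.items():
    print(name, "| self-adjoint:", self_adjoint, "| unitary:", unitary)
```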

⁸ See also Exercise 8.21.


Positivity can also be described through the spectrum:

Theorem 8.25. Suppose $T \in \mathcal L(H)$. Then
\[ (Tx, x)_H \ge 0 \quad (x \in H) \tag{8.8} \]
if and only if
\[ T = T^* \quad \text{and} \quad \sigma(T) \subset [0, \infty). \tag{8.9} \]
Such an operator is said to be positive.

Proof. It follows from (8.8) that $(Tx, x)_H \in \mathbb R$ and then
\[ (Tx, x)_H = (x, T^*x)_H = (T^*x, x)_H. \]
Let us show that then $S := T - T^* = 0$.

Indeed, $(Sx, y)_H + (Sy, x)_H = 0$ and, replacing $y$ by $iy$, $-i(Sx, y)_H + i(Sy, x)_H = 0$. Now we multiply by $i$ and add to obtain $(Sx, y)_H = 0$ for all $x, y \in H$, so that $S = 0$. Thus $\sigma(T) \subset \mathbb R$.

To prove that $\lambda < 0$ cannot belong to $\sigma(T)$, we note that the condition (8.8) allows us to set
\[ \|(T - \lambda I)x\|_H^2 = \|Tx\|_H^2 - 2\lambda (Tx, x)_H + \lambda^2 \|x\|_H^2 \ge \lambda^2 \|x\|_H^2. \]

This shows that $T_\lambda := T - \lambda I : H \to F = \operatorname{Im}(T - \lambda I)$ has a continuous inverse with domain $F$, which is closed. This operator is easily extended to a left inverse $R$ of $T_\lambda$ by defining $R = 0$ on $F^\perp$. But $T_\lambda$ is self-adjoint and $RT_\lambda = I$ also gives $T_\lambda R^* = I$, so $T_\lambda$ is also right invertible, and $\lambda \notin \sigma(T)$.

Suppose now that $T = T^*$ and $\sigma(T) \subset [0, \infty)$. In the spectral resolution,
\[ (Tx, x)_H = \int_{\sigma(T)} \lambda \, dE_{x,x}(\lambda) \ge 0, \]
since $E_{x,x}$ is a positive measure and $\lambda \ge 0$ on $\sigma(T) \subset [0, \infty)$. □
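The lower bound used in the proof for $\lambda < 0$ can be observed numerically. In the sketch below, the positive operator is taken of the form $T = A^*A$ for an arbitrary $2 \times 2$ matrix `A`; this construction, the value of `lam`, and the test vectors are assumptions made only for illustration.

```python
# Sketch: for a positive matrix T and lam < 0,
#   ||(T - lam I)x||^2 = ||Tx||^2 - 2 lam (Tx, x) + lam^2 ||x||^2 >= lam^2 ||x||^2.
# A, lam and the vectors are illustrative assumptions.

def inner(u, v):
    """Inner product of C^n, linear in u and conjugate-linear in v."""
    return sum(p * complex(q).conjugate() for p, q in zip(u, v))


def apply(m, x):
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def adjoint(x):
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]


A = [[1 + 1j, 2.0], [0.0, -1j]]
T = matmul(adjoint(A), A)          # (Tx, x) = ||Ax||^2 >= 0, so T is positive
lam = -0.75

checks = []
for x in ([1.0, 0.0], [1j, 2.0], [3.0 - 1j, -0.5j]):
    Tx = apply(T, x)
    quad = inner(Tx, x)                                    # (Tx, x), real and >= 0
    shifted = [t - lam * c for t, c in zip(Tx, x)]         # (T - lam I)x
    lhs = inner(shifted, shifted).real
    rhs = inner(Tx, Tx).real - 2 * lam * quad.real + lam ** 2 * inner(x, x).real
    lower = lam ** 2 * inner(x, x).real
    checks.append((abs(quad.imag) < 1e-12 and quad.real >= 0,
                   abs(lhs - rhs) < 1e-9,
                   lhs >= lower - 1e-12))

print(checks)
```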

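Let us now give an application of the functional calculus with bounded functions:

Theorem 8.26. If $T = \int_{\sigma(T)} \lambda \, dE(\lambda)$ is the spectral resolution of a normal operator $T \in \mathcal L(H)$ and if $\lambda_0 \in \sigma(T)$, then
\[ \operatorname{Ker}(T - \lambda_0 I) = \operatorname{Im} E(\{\lambda_0\}), \]
so that $\lambda_0$ is an eigenvalue of $T$ if and only if $E(\{\lambda_0\}) \ne 0$.

Proof. The functions $g(\lambda) = \lambda - \lambda_0$ and $f = \chi_{\{\lambda_0\}}$ satisfy $fg = 0$ and $g(T)f(T) = 0$. Since $f(T) = E(\{\lambda_0\})$,
\[ \operatorname{Im} E(\{\lambda_0\}) \subset \operatorname{Ker}(g(T)) = \operatorname{Ker}(T - \lambda_0 I). \]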


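Conversely, let us take
\[ G = \sigma(T) \setminus \{\lambda_0\} = \biguplus_n B_n \]
with $d(\lambda_0, B_n) > 0$ and define the bounded functions
\[ f_n(\lambda) = \frac{\chi_{B_n}(\lambda)}{\lambda - \lambda_0}. \]
Then $f_n(T)(T - \lambda_0 I) = E(B_n)$, and $(T - \lambda_0 I)x = 0$ implies $E(B_n)x = 0$ and $E(G)x = \sum_n E(B_n)x = 0$. Hence, $x = E(G)x + E(\{\lambda_0\})x = E(\{\lambda_0\})x$, i.e., $x \in \operatorname{Im} E(\{\lambda_0\})$. □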

As shown in Section 4.4, if $T$ is compact, then every nonzero eigenvalue has finite multiplicity and $\sigma(T) \setminus \{0\}$ is a finite or countable set of eigenvalues with finite multiplicity with 0 as the only possible accumulation point. If $T$ is normal, the converse is also true:

Theorem 8.27. If $T \in \mathcal L(H)$ is a normal operator such that $\sigma(T)$ has no accumulation point except possibly 0 and $\dim \operatorname{Ker}(T - \lambda I) < \infty$ for every $\lambda \ne 0$, then $T$ is compact.

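Proof. Let $\sigma(T) \setminus \{0\} = \{\lambda_1, \lambda_2, \ldots\}$ and $|\lambda_1| \ge |\lambda_2| \ge \cdots$. We apply the functional calculus to the functions $g_n$ defined as
\[ g_n(\lambda) = \lambda \quad \text{if } \lambda = \lambda_k \text{ and } k \le n, \]
and $g_n(\lambda) = 0$ at the other points of $\sigma(T)$, to obtain the compact operator with finite-dimensional range
\[ g_n(T) = \sum_{k=1}^n \lambda_k E(\{\lambda_k\}). \]
Then
\[ \|T - g_n(T)\| \le \sup_{\lambda \in \sigma(T)} |\lambda - g_n(\lambda)| \le |\lambda_n| \]
and $|\lambda_n| \to 0$ as $n \to \infty$ if $\sigma(T) \setminus \{0\}$ is an infinite set. This shows that $T$ is compact as a limit of compact operators. □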
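The truncation argument in this proof can be made concrete for a diagonal operator, whose norm is the supremum of the moduli of its diagonal entries. The sketch below uses the eigenvalues $\lambda_k = 1/k$ and a finite section of size `N`; both are assumptions chosen only to illustrate the bound $\|T - g_n(T)\| \le |\lambda_n|$.

```python
from fractions import Fraction

# Sketch: truncations g_n(T) of the diagonal operator T = diag(1, 1/2, 1/3, ...).
# Only the first N diagonal entries are modelled (an illustrative finite section).
N = 50
LAMBDAS = [Fraction(1, k) for k in range(1, N + 1)]


def g_n(n):
    """Diagonal of g_n(T): keep lambda_1, ..., lambda_n and replace the rest by 0."""
    return [lam if k < n else Fraction(0) for k, lam in enumerate(LAMBDAS)]


def diag_norm(d):
    """Operator norm of a diagonal operator: the largest modulus on the diagonal."""
    return max(abs(v) for v in d)


errors = [diag_norm([lam - g for lam, g in zip(LAMBDAS, g_n(n))])
          for n in range(1, N)]

for n in (1, 2, 10):
    print(n, errors[n - 1], LAMBDAS[n - 1])   # ||T - g_n(T)|| and the bound |lambda_n|
```

Here the error is exactly the first discarded eigenvalue, which is bounded by $|\lambda_n|$ and tends to zero.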

8.6. Exercises

Exercise 8.1. Show that every Banach algebra $A$ without a unit element can be considered as a Banach subalgebra of a unitary Banach algebra $A_1$ constructed in the following fashion. On $A_1 = A \times \mathbb C$, which is a vector space, define the multiplication $(a, \lambda)(b, \mu) := (ab + \lambda b + \mu a, \lambda\mu)$ and the norm $\|(a, \lambda)\| := \|a\| + |\lambda|$. The unit is $\delta = (0, 1)$.

The map $a \mapsto (a, 0)$ is an isometric homomorphism (i.e., linear and multiplicative), so that we can consider $A$ as a closed subalgebra of $A_1$. By denoting $a = (a, 0)$ if $a \in A$, we can write $(a, \lambda) = a + \lambda\delta$, and the projection $\chi_0(a + \lambda\delta) := \lambda$ is a character $\chi_0 \in \Delta(A_1)$.

If A is unitary, then the unit e cannot be the unit δ of A1.

Exercise 8.2. Suppose that $A = \mathbb C_1$ is the commutative unitary Banach algebra obtained by adjoining a unit to $\mathbb C$ as in Exercise 8.1. Describe $\Delta(A)$ and the corresponding Gelfand transform.

Exercise 8.3. (a) Prove that the polynomials $P(z) = \sum_{n=0}^N c_n z^n$ are dense in the disc algebra $A(D)$ by showing that, if $f \in A(D)$ and
\[ f_n(z) := f\Big(\frac{nz}{n+1}\Big), \]
then $f_n \to f$ uniformly on $\bar D$ and, if $\|f - f_n\| \le \varepsilon/2$, there is a Taylor polynomial $P$ of $f_n$ such that $\|f_n - P\| \le \varepsilon/2$.

Hence, polynomials $P$ are not dense in $\mathcal C(\bar D)$. Why is this not in contradiction to the Stone-Weierstrass theorem?

(b) Prove that the characters of $A(D)$ are the evaluations $\delta_z$ ($|z| \le 1$) and that $z \in \bar D \mapsto \delta_z \in \Delta(A(D))$ is a homeomorphism.

(c) If $f_1, \ldots, f_n \in A(D)$ have no common zeros, prove that there exist $g_1, \ldots, g_n \in A(D)$ such that $f_1 g_1 + \cdots + f_n g_n = 1$.

Exercise 8.4. Show that, with the convolution product,
\[ f * g(x) := \int_{\mathbb R} f(x - y) g(y) \, dy, \]
the Banach space $L^1(\mathbb R)$ becomes a nonunitary Banach algebra.

Exercise 8.5. Show also that $L^1(\mathbb T)$, the Banach space of all complex 1-periodic functions that are integrable on $(0, 1)$, with the convolution product
\[ f * g(x) := \int_0^1 f(x - y) g(y) \, dy \]
and the usual $L^1$ norm, is a nonunitary Banach algebra.

Exercise 8.6. Show that $\ell^1(\mathbb Z)$, with the discrete convolution
\[ (u * v)[k] := \sum_{m=-\infty}^{+\infty} u[k - m]\, v[m], \]
is a unitary Banach algebra.

Exercise 8.7. Every unitary Banach algebra $A$ can be considered a closed subalgebra of $\mathcal L(A)$ by means of the isometric homomorphism $a \mapsto L_a$, where $L_a(x) := ax$.


Exercise 8.8. In this exercise we want to present the Fourier transform on $L^1(\mathbb R)$ as a special case of the Gelfand transform. To this end, consider the unitary commutative Banach algebra $L^1(\mathbb R)_1$ obtained as in Exercise 8.1 by adjoining the unit to $L^1(\mathbb R)$, which is a nonunitary convolution Banach algebra (see Exercise 8.4).

(a) Prove that, if $\chi \in \Delta(L^1(\mathbb R)_1) \setminus \{\chi_0\}$ and $\chi(u) = 1$ with $u \in L^1(\mathbb R)$, then $\gamma_\chi(\alpha) := \chi(\tau_{-\alpha}u) = \chi([u(t + \alpha)])$ defines a function $\gamma_\chi : \mathbb R \to \mathbb T \subset \mathbb C$ which is continuous and such that $\gamma_\chi(\alpha + \beta) = \gamma_\chi(\alpha)\gamma_\chi(\beta)$.

(b) Prove that there exists a uniquely determined number $\xi_\chi \in \mathbb R$ such that $\gamma_\chi(\alpha) = e^{2\pi i \xi_\chi \alpha}$.

(c) Check that, if $\mathcal G f$ denotes the Gelfand transform of $f \in L^1(\mathbb R)$, then
\[ \mathcal G f(\chi) = \int_{\mathbb R} f(\alpha)\, e^{-2\pi i \alpha \xi_\chi} \, d\alpha = \mathcal F f(\xi_\chi). \]

Exercise 8.9. Let us consider the unitary Banach algebra $L^\infty(\Omega)$ of Example 8.2. The essential range, $f[\Omega]$, of $f \in L^\infty(\Omega)$ is the complement of the open set $\bigcup\{G;\ G \text{ open},\ \mu(f^{-1}(G)) = 0\}$. Show that $f[\Omega]$ is the smallest closed subset $F$ of $\mathbb C$ such that $\mu(f^{-1}(F^c)) = 0$, $\|f\|_\infty = \max\{|\lambda|;\ \lambda \in f[\Omega]\}$, and $f[\Omega] = \sigma(f)$.

Exercise 8.10. The algebra of quaternions, $\mathbb H$, is the real Banach space $\mathbb R^4$ endowed with the distributive product such that
\[ 1x = x, \quad ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j, \quad i^2 = j^2 = k^2 = -1 \]
if $x \in \mathbb H$, $1 = (1, 0, 0, 0)$, $i = (0, 1, 0, 0)$, $j = (0, 0, 1, 0)$, and $k = (0, 0, 0, 1)$, so that one can write $(a, b, c, d) = a + bi + cj + dk$. Show that $\mathbb H$ is an algebra such that $\|xy\| = \|x\|\,\|y\|$ and that every nonzero element of $\mathbb H$ has an inverse.

Remark. It can be shown that every real Banach algebra which is a field is isomorphic to the reals, the complex numbers or the quaternions (cf. Rickart, General Theory of Banach Algebras, [35, 1.7]). Hence, $\mathbb C$ is the only (complex) Banach algebra which is a field and $\mathbb H$ is the only real Banach algebra which is a noncommutative field.

Exercise 8.11. If $\chi : A \to \mathbb C$ is linear such that $\chi(ab) = \chi(a)\chi(b)$ and $\chi \ne 0$, then prove that $\chi(e) = 1$, so that $\chi$ is a character.

Exercise 8.12. Prove that, if $\mathcal T$ is a compact topology on $\Delta(A)$ and every function $\hat a$ ($a \in A$) is $\mathcal T$-continuous, then $\mathcal T$ is the Gelfand topology.

Exercise 8.13. Prove that the Gelfand transform is an isometric isomorphism from $\mathcal C(K)$ onto $\mathcal C(\Delta)$.

Exercise 8.14. Let $U$ be the open unit disc of $\mathbb C$ and suppose $\Delta$ is the spectrum of $H^\infty(U)$. Prove that, through the embedding $U \hookrightarrow \Delta$, $U$ is an open subset of $\Delta$. Write
\[ \Delta = D \cup \Big( \bigcup_{\xi \in \partial D} \Delta_\xi \Big), \]
as in (8.1), and prove that the fibers $\Delta_\xi$ ($|\xi| = 1$) are homeomorphic to one another.

Exercise 8.15 (Wiener algebra). Show that the set of all $2\pi$-periodic complex functions on $\mathbb R$
\[ f(t) = \sum_{k=-\infty}^{+\infty} c_k e^{ikt} \quad \Big( \sum_{k=-\infty}^{+\infty} |c_k| < \infty \Big), \]
with the usual operations and the norm $\|f\|_W := \sum_{k=-\infty}^{+\infty} |c_k|$, is a commutative unitary Banach algebra, $W$. Moreover prove that the characters of $W$ are the evaluations $\delta_t$ on the different points $t \in \mathbb R$ and that, if $f \in W$ has no zeros, then $1/f \in W$.

Exercise 8.16. Every $f \in W$ is $2\pi$-periodic and it can be identified as the function $F$ on $\mathbb T$ such that $f(t) = F(e^{it})$. If $f(t) = \sum_k c_k e^{ikt}$, $F(z) = \sum_k c_k z^k$. In Exercise 8.15 we have seen that the $\delta_t$ ($t \in \mathbb R$) are the characters of $W$, but show that $\mathcal G : W \to \mathcal C(\mathbb T)$, one-to-one and with $\|\hat f\| \le \|f\|_W$, is not an isometry and it is not exhaustive.

Exercise 8.17. Suppose $A$ is a unitary Banach algebra and $a \in A$, and denote $M(U) = \sup_{\lambda \in U^c} \|R_a(\lambda)\|$. Prove that, if $U \subset \mathbb C$ is an open set and $\sigma_A(a) \subset U$, then $\sigma_A(b) \subset U$ whenever $\|b - a\| < \delta$ if $\delta \le 1/M(U)$ (upper semi-continuity of $\sigma_A$).

Exercise 8.18. Let $A$ be a commutative unitary Banach algebra. Prove that the Gelfand transform $\mathcal G : A \to \mathcal C(\Delta)$ is an isometry if and only if $\|a^2\| = \|a\|^2$ for every $a \in A$. Show that in order for $\|a\|$ to coincide with the spectral radius $r(a)$, the condition $\|a^2\| = \|a\|^2$ is necessary and sufficient.

Remark. This condition characterizes when a Banach algebra $A$ is a uniform algebra, meaning that $A$ is a closed unitary subalgebra of $\mathcal C(K)$ for some compact topological space $K$.

Exercise 8.19. In the definition of an involution, show that property (e), $e^* = e$, is a consequence of (a)–(d). If $x \in A$ is invertible, prove that $(x^*)^{-1} = (x^{-1})^*$.

Exercise 8.20. With the involution $f \mapsto \bar f$, where $\bar f$ is the complex conjugate of $f$, show that $\mathcal C(K)$ is a commutative $C^*$-algebra. Similarly, show that $L^\infty(\Omega)$, with the involution $f \mapsto \bar f$, is also a commutative $C^*$-algebra.


Exercise 8.21. If $\{P_n\}_{n=1}^\infty$ is a sequence of orthogonal projections and their images are mutually orthogonal, then the series $\sum_{n=1}^\infty P_n$ is strongly convergent to the orthogonal projection on the closed linear hull $\bigoplus_n P_n(H)$ of the images of the projections $P_n$.

Exercise 8.22. Let $Ax = \sum_{k=1}^\infty \lambda_k (x, e_k)_H e_k$ be the spectral representation of a self-adjoint compact operator of $H$, and let $\{P_n\}_{n=1}^\infty$ be the sequence of the orthogonal projections on the different eigensubspaces
\[ H_1 = [e_1, \ldots, e_{k(1)}], \ \ldots, \ H_n = [e_{k(n-1)+1}, \ldots, e_{k(n)}], \ \ldots \]
for the eigenvalues $\alpha_n = \lambda_{k(n-1)+1} = \ldots = \lambda_{k(n)}$ of $A$. Show that we can write
\[ A = \sum_{n=1}^\infty \alpha_n P_n \]
and prove that
\[ E(B) = \sum_{\alpha_n \in B} P_n \]
is the resolution of the identity of the spectral resolution of $A$.

Exercise 8.23. Let $\mu$ be a Borel measure on a compact set $K \subset \mathbb C$ and let $H = L^2(\mu)$. Show that multiplication by characteristic functions of Borel sets in $K$, $E(B) := \chi_B \cdot$, is a spectral measure $E : \mathcal B_K \to \mathcal L(L^2(\mu))$.

Exercise 8.24. If $E : \mathcal B_K \to \mathcal L(H)$ is a spectral measure, show that the null sets for the spectral measure have the following desirable properties:
(a) If $E(B_n) = 0$ ($n \in \mathbb N$), then $E\big(\bigcup_{n=1}^\infty B_n\big) = 0$.
(b) If $E(B_1) = 0$ and $B_2 \subset B_1$, then $E(B_2) = 0$.

Exercise 8.25. Show that the equivalences of Theorem 8.25 are untrue on the real Hilbert space $\mathbb R^2$.

Exercise 8.26. Show that every positive $T \in \mathcal L(H)$ in the sense of Theorem 8.25 has a unique positive square root.

Exercise 8.27. With the functional calculus, prove also that, if $T \in \mathcal L(H)$ is normal, then it can be written as $T = UP$ with $U$ unitary and $P$ positive. This is the polar decomposition of a bounded normal operator in a complex Hilbert space.

References for further reading:

I. M. Gelfand, D. A. Raikov and G. E. Chilov, Commutative Normed Rings.
E. Hille and R. S. Phillips, Functional Analysis and Semigroups.
T. Kato, Perturbation Theory for Linear Operators.
P. D. Lax, Functional Analysis.
M. A. Naimark, Normed Rings.
M. Reed and B. Simon, Methods of Modern Mathematical Physics.
C. E. Rickart, General Theory of Banach Algebras.
F. Riesz and B. Sz. Nagy, Leçons d'analyse fonctionnelle.
W. Rudin, Functional Analysis.
A. E. Taylor and D. C. Lay, Introduction to Functional Analysis.
K. Yosida, Functional Analysis.

Chapter 9

Unbounded operators in a Hilbert space

Up to this moment all of our linear operators have been bounded, but densely defined unbounded operators also occur naturally in connection with the foundations of quantum mechanics.

When in 1927 J. von Neumann¹ introduced axiomatically Hilbert spaces, he recognized the need to extend the spectral theory of self-adjoint operators from the bounded to the unbounded case and immediately started to obtain this extension, which was necessary for his presentation of the transformation theory of quantum mechanics created in 1925–1926 by Heisenberg and Schrödinger.²

The definition of unbounded self-adjoint operators on a Hilbert space requires a precise selection of the domain, the symmetry condition $(x, Ax)_H = (Ax, x)_H$ for a densely defined operator not being sufficient for $A$ to be self-adjoint, since its spectrum has to be a subset of $\mathbb R$. The creators of quantum mechanics did not care about this and it was von Neumann himself who clarified the difference between a self-adjoint operator and a symmetric one.

¹ The Hungarian mathematician János (John) von Neumann is considered one of the foremost mathematicians of the 20th century: he was a pioneer of the application of operator theory to quantum mechanics, a member of the Manhattan Project, and a key figure in the development of game theory and of the concepts of cellular automata. Between 1926 and 1930 he taught in the University of Berlin. In 1930 he emigrated to the USA where he was invited to Princeton University and was one of the first four people selected for the faculty of the Institute for Advanced Study (1933–1957).

² The German physicist Werner Karl Heisenberg, in Göttingen, was one of the founders of quantum mechanics and the head of the German nuclear energy project; with Max Born and Pascual Jordan, Heisenberg formalized quantum mechanics in 1925 using matrix transformations. The Austrian physicist Erwin Rudolf Josef Alexander Schrödinger in 1926 derived what is now known as the Schrödinger wave equation, which is the basis of his development of quantum mechanics. Based on the Born statistical interpretation of quantum theory, P. Dirac and Jordan unified “matrix mechanics” and “wave mechanics” with their “transformation theory”.

In this chapter, with the Laplacian as a reference example, we include the Rellich theorem, showing that certain perturbations of self-adjoint operators are still self-adjoint, and the Friedrichs method of constructing a self-adjoint extension of many symmetric operators.

Then the spectral theory of bounded self-adjoint operators on a Hilbert space is extended to the unbounded case by means of the Cayley transform, which changes a self-adjoint operator $T$ into a unitary operator $U$. The functional calculus of this operator allows us to define the spectral resolution of $T$.

We include a very short introduction on the principles of quantum mechanics, where an observable, such as position, momentum, and energy, is an unbounded self-adjoint operator, their eigenvalues are the observable values, and the spectral representing measure allows us to evaluate the observable in a given state in terms of the probability of belonging to a given set.³

Von Neumann’s text “Mathematical Foundations of Quantum Mechanics” [43] is strongly recommended here for further reading: special attention is placed on motivation, detailed calculations and examples are given, and the thought processes of a great mathematician appear in a very transparent manner. More modern texts are available, but von Neumann’s presentation contains in a lucid and very readable way the germ of his ideas on the subject. In that book, for the first time most of the modern theory of Hilbert spaces is defined and elaborated, as well as “quantum mechanics in a unified representation which ... is mathematically correct”. The author explains that, just as Newtonian mechanics was associated with infinitesimal calculus, quantum mechanics relies on the Hilbert theory of operators.

With von Neumann’s work, quantum mechanics is Hilbert space analysis and, conversely, much of Hilbert space analysis is quantum mechanics.

9.1. Definitions and basic properties

Let $H$ denote a complex linear space. We say that $T$ is an operator on $H$ if it is a linear mapping $T : \mathcal D(T) \to H$, defined on a linear subspace $\mathcal D(T)$ of $H$, which is called the domain of the operator.

³ Surprisingly, in this way the atomic spectrum appears as Hilbert’s spectrum of an operator. Hilbert himself was extremely surprised to learn that his spectrum could be interpreted as an atomic spectrum in quantum mechanics.


Example 9.1. The derivative operator $D : f \mapsto f'$ (distributional derivative) on $L^2(\mathbb R)$ has
\[ \mathcal D(D) = \{ f \in L^2(\mathbb R);\ f' \in L^2(\mathbb R) \} \]
as its domain, which is the Sobolev space $H^1(\mathbb R)$. This domain is dense in $L^2(\mathbb R)$, since it contains $\mathcal D(\mathbb R)$.

Example 9.2. As an operator on $L^2(\mathbb R)$, the domain of the position operator, $Q : f(x) \mapsto x f(x)$, is
\[ \mathcal D(Q) = \{ f \in L^2(\mathbb R);\ [x f(x)] \in L^2(\mathbb R) \}. \]

It is unbounded, since $\|\chi_{(n,n+1)}\|_2 = 1$ and $\|Q\chi_{(n,n+1)}\|_2 \ge n$.
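The two norms in this estimate can be computed exactly, since $\|Q\chi_{(n,n+1)}\|_2^2 = \int_n^{n+1} x^2 \, dx = n^2 + n + \tfrac13$. A minimal Python sketch of this computation, with hypothetical helper names, is:

```python
from fractions import Fraction

# Sketch: exact L^2 norms showing that the position operator Q is unbounded.
# The function names are hypothetical helpers introduced for this illustration.


def norm_sq_indicator(n):
    """||chi_(n,n+1)||_2^2 is the length of the interval (n, n+1)."""
    return Fraction(1)


def norm_sq_Q_indicator(n):
    """||Q chi_(n,n+1)||_2^2 = integral of x^2 over (n, n+1) = ((n+1)^3 - n^3) / 3."""
    return Fraction((n + 1) ** 3 - n ** 3, 3)


ratios = [norm_sq_Q_indicator(n) / norm_sq_indicator(n) for n in range(6)]
print(ratios)   # squared quotients ||Q f||^2 / ||f||^2, growing like n^2
```

Since the quotient is at least $n^2$ for every $n \ge 0$, no constant $C$ can satisfy $\|Qf\|_2 \le C\|f\|_2$ on $\mathcal D(Q)$.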

Recall that $f \in L^2(\mathbb R)$ if and only if $\hat f \in L^2(\mathbb R)$, and both $f$ and $x f(x)$ are in $L^2(\mathbb R)$ if and only if $\hat f, \hat f' \in L^2(\mathbb R)$. Thus, the Fourier transform is a unitary operator which maps $\mathcal D(Q)$ onto $H^1(\mathbb R) = \mathcal D(D)$ and changes $2\pi i Q$ into $D$. Conversely, $2\pi i Q = \mathcal F^{-1} D \mathcal F$ on $\mathcal D(Q)$.

Under these conditions it is said that $D$ and $2\pi i Q$ are unitarily equivalent. Unitarily equivalent operators have the same spectral properties. Of course, it follows that $D$ is also unbounded (see Exercise 9.3).

We are interested in the spectrum of $T$. If for a complex number $\lambda$ the operator $T - \lambda I : \mathcal D(T) \to H$ is bijective and $(T - \lambda I)^{-1} : H \to \mathcal D(T) \subset H$ is continuous, then we say that $\lambda$ is a regular point for $T$.

The spectrum $\sigma(T)$ is the subset of $\mathbb C$ which consists of all nonregular points, that is, all complex numbers $\lambda$ for which $T - \lambda I : \mathcal D(T) \to H$ does not have a continuous inverse. Thus $\lambda \in \sigma(T)$ when it is in one of the following disjoint sets:

(a) The point spectrum $\sigma_p(T)$, which is the set of the eigenvalues of $T$. That is, $\lambda \in \sigma_p(T)$ when $T - \lambda I : \mathcal D(T) \to H$ is not injective. In this case $(T - \lambda I)^{-1}$ does not exist.

(b) The continuous spectrum $\sigma_c(T)$, the set of all $\lambda \in \mathbb C \setminus \sigma_p(T)$ such that $T - \lambda I : \mathcal D(T) \to H$ is not exhaustive but $\overline{\operatorname{Im}(T - \lambda I)} = H$ and $(T - \lambda I)^{-1}$ is unbounded.

(c) The residual spectrum $\sigma_r(T)$, which consists of all $\lambda \in \mathbb C \setminus \sigma_p(T)$ such that $\overline{\operatorname{Im}(T - \lambda I)} \ne H$. Then $(T - \lambda I)^{-1}$ exists but is not densely defined.

The set $\sigma(T)^c$ of all regular points is called the resolvent set. Thus, $\lambda \in \sigma(T)^c$ when we have $(T - \lambda I)^{-1} \in \mathcal L(H)$.


The resolvent of $T$ is again the function
\[ R_T : \sigma(T)^c \to \mathcal{L}(H), \qquad R_T(\lambda) := (T - \lambda I)^{-1}. \]
The spectrum of $T$ is not necessarily a bounded subset of $\mathbf{C}$, but it is still closed and the resolvent function is analytic:

Theorem 9.3. The set $\sigma(T)^c$ is an open subset of $\mathbf{C}$, and every point $\lambda_0 \in \sigma(T)^c$ has a neighborhood where
\[ R_T(\lambda) = \sum_{k=0}^{\infty} (\lambda - \lambda_0)^k R_T(\lambda_0)^{k+1}, \]
the sum of a convergent Neumann series.

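Proof. Let us consider $\lambda = \lambda_0 + \mu$ such that $|\mu| < 1/\|R_T(\lambda_0)\|$. The sum of the Neumann series
\[ S(\mu) := \sum_{k=0}^{\infty} \mu^k R_T(\lambda_0)^{k+1} \qquad (|\mu| < 1/\|R_T(\lambda_0)\|) \]
will be the bounded inverse of $T - \lambda I$.

The condition $\|\mu R_T(\lambda_0)\| < 1$ ensures that the series is convergent, and it is easily checked that $(T - \lambda I)S(\mu) = I$:
\[ (T - \lambda_0 I - \mu I) \sum_{k=0}^{N} \mu^k \big((T - \lambda_0 I)^{-1}\big)^{k+1} = I - (\mu R_T(\lambda_0))^{N+1} \to I, \]
and $S(\mu)$ commutes with $T$. $\square$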
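In finite dimensions every operator is bounded, but the Neumann series for the resolvent can still be tested numerically. The sketch below compares a partial sum of $\sum_k \mu^k R_T(\lambda_0)^{k+1}$ with the exact inverse $(T - \lambda I)^{-1}$ for a $2 \times 2$ matrix. The matrix `T`, the points `lam0` and `lam`, and all function names are illustrative choices of ours.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def resolvent(t, lam):
    """R_T(lam) = (T - lam I)^{-1} for a 2x2 matrix T, by the adjugate formula."""
    a, b = t[0][0] - lam, t[0][1]
    c, d = t[1][0], t[1][1] - lam
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def neumann_resolvent(t, lam0, lam, terms=60):
    """Partial sum of sum_k (lam - lam0)^k R_T(lam0)^(k+1)."""
    r0 = resolvent(t, lam0)
    mu = lam - lam0
    power = r0                      # holds R_T(lam0)^(k+1)
    coeff = 1.0                     # holds mu^k
    total = [[0j, 0j], [0j, 0j]]
    for _ in range(terms):
        total = [[total[i][j] + coeff * power[i][j] for j in range(2)]
                 for i in range(2)]
        power = mat_mul(power, r0)
        coeff *= mu
    return total

T = [[2.0, 1.0], [0.0, 3.0]]        # illustrative matrix, not from the text
lam0, lam = 1j, 0.2 + 1j            # |lam - lam0| = 0.2 < 1 / ||R_T(lam0)||
exact = resolvent(T, lam)
series = neumann_resolvent(T, lam0, lam)
error = max(abs(exact[i][j] - series[i][j]) for i in range(2) for j in range(2))
```

Here `error` measures the largest entrywise difference between the partial sum and the exact resolvent; it is expected to be at the level of rounding errors.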

A graph is a linear subspace $F \subset H \times H$ such that, for every $x \in H$, the section $F_x := \{y;\ (x, y) \in F\}$ has at most one point, $y$, so that the first projection $\pi_1(x, y) = x$ is one-to-one on $F$. This means that $x \mapsto y$ ($y \in F_x$) is an operator $T_F$ on $H$ with $\mathcal{D}(T_F) = \{x \in H;\ F_x \neq \emptyset\}$ and $\mathcal{G}(T_F) = F$.

We write $S \subset T$ if the operator $T$ is an extension of another operator $S$, that is, if $\mathcal{D}(S) \subset \mathcal{D}(T)$ and $T|_{\mathcal{D}(S)} = S$ or, equivalently, if $\mathcal{G}(S) \subset \mathcal{G}(T)$.

If $\mathcal{G}(T)$ is closed in $H \times H$, then we say that $T$ is a closed operator. Also, $T$ is said to be closable if it has a closed extension $\widetilde{T}$. This means that $\overline{\mathcal{G}(T)}$ is a graph, since, if $\widetilde{T}$ is a closed extension of $T$, $\overline{\mathcal{G}(T)} \subset \mathcal{G}(\widetilde{T})$ and $\pi_1$ is one-to-one on $\mathcal{G}(\widetilde{T})$, so that it is also one-to-one on $\overline{\mathcal{G}(T)}$. Conversely, if $\overline{\mathcal{G}(T)}$ is a graph, it is the graph of a closed extension of $T$, since $\mathcal{G}(T) \subset \overline{\mathcal{G}(T)}$.

If $T$ is closable, then $\bar{T}$ will denote the closure of $T$; that is, $\bar{T} = T_{\overline{\mathcal{G}(T)}}$.

When defining operations with unbounded operators, the domains of the new operators are the intersections of the domains of the terms. Hence $\mathcal{D}(S \pm T) = \mathcal{D}(S) \cap \mathcal{D}(T)$ and $\mathcal{D}(ST) = \{x \in \mathcal{D}(T);\ Tx \in \mathcal{D}(S)\}$.


Example 9.4. The domain of the commutator $[D, Q] = DQ - QD$ of the derivation operator with the position operator on $L^2(\mathbf{R})$ is $\mathcal{D}(DQ) \cap \mathcal{D}(QD)$, which contains $\mathcal{D}(\mathbf{R})$, a dense subspace of $L^2(\mathbf{R})$. Since $D(xf(x)) - xDf(x) = f(x)$, the commutator $[D, Q]$ coincides with the identity operator on its domain, so that we simply write $[D, Q] = I$ and consider it as an operator on $L^2(\mathbf{R})$.
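The identity $D(xf) - xDf = f$ is purely algebraic, so it can be illustrated on polynomials stored as coefficient lists. This is only a sketch of the algebra: polynomials are not in $L^2(\mathbf{R})$, and the function names below are ours.

```python
def derivative(p):
    """Coefficients of p', for p = [a0, a1, a2, ...] meaning a0 + a1 x + a2 x^2 + ..."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def times_x(p):
    """Coefficients of x * p(x)."""
    return [0] + list(p)

def commutator(p):
    """Coefficients of D(Qp) - Q(Dp), with trailing zeros removed."""
    left = derivative(times_x(p))
    right = times_x(derivative(p))
    size = max(len(left), len(right))
    left += [0] * (size - len(left))
    right += [0] * (size - len(right))
    result = [a - b for a, b in zip(left, right)]
    while len(result) > 1 and result[-1] == 0:
        result.pop()
    return result

p = [5, -3, 0, 2]                   # 5 - 3x + 2x^3, an arbitrary sample
assert commutator(p) == p           # [D, Q] acts as the identity
```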

9.1.1. The adjoint. We will only be interested in densely defined operators, which are the operators $T$ such that $\overline{\mathcal{D}(T)} = H$.

If $T$ is densely defined, then every bounded linear form on $\mathcal{D}(T)$ has a unique extension to $H$, and from the Riesz representation Theorem 4.1 we know that it is of the type $(\cdot, z)_H$. This fact allows us to define the adjoint $T^*$ of $T$. Its domain is defined as
\[ \mathcal{D}(T^*) = \{y \in H;\ x \mapsto (Tx, y)_H \text{ is bounded on } \mathcal{D}(T)\} \]
and, if $y \in \mathcal{D}(T^*)$, $T^*y \in H$ is the unique element such that
\[ (Tx, y)_H = (x, T^*y)_H \qquad (x \in \mathcal{D}(T)). \]
Hence, $y \in \mathcal{D}(T^*)$ if and only if $(Tx, y)_H = (x, y^*)_H$ for some $y^* \in H$, for all $x \in \mathcal{D}(T)$, and then $y^* = T^*y$.

Theorem 9.5. Let $T$ be densely defined. Then the following properties hold:
(a) $(\lambda T)^* = \bar{\lambda} T^*$.
(b) $(I + T)^* = I + T^*$.
(c) $T^*$ is closed.
(d) If $T : \mathcal{D}(T) \to H$ is one-to-one with dense image, then $T^*$ is also one-to-one and densely defined, and $(T^{-1})^* = (T^*)^{-1}$.

Proof. Both (a) and (b) are easy exercises.

To show that the graph of $T^*$ is closed, suppose that $(y_n, T^*y_n) \to (y, z)$ ($y_n \in \mathcal{D}(T^*)$). Then $(x, T^*y_n)_H \to (x, z)_H$ and $(Tx, y_n)_H \to (Tx, y)_H$ for every $x \in \mathcal{D}(T)$, with $(x, T^*y_n)_H = (Tx, y_n)_H$. Hence $(x, z)_H = (Tx, y)_H$ and $z = T^*y$, so that $(y, z) \in \mathcal{G}(T^*)$.

In (d) the inverse $T^{-1} : \operatorname{Im} T \to \mathcal{D}(T)$ is a well-defined operator with dense domain and image. We need to prove that $(T^*)^{-1}$ exists and coincides with $(T^{-1})^*$.

First note that $T^*y \in \mathcal{D}((T^{-1})^*)$ for every $y \in \mathcal{D}(T^*)$, since the linear form $x \mapsto (T^{-1}x, T^*y)_H = (x, y)_H$ on $\mathcal{D}(T^{-1})$ is bounded and $T^*y$ is well-defined. Moreover $(T^{-1})^*T^*y = y$, so that $(T^{-1})^*T^* = I$ on $\mathcal{D}(T^*)$, $(T^*)^{-1} : \operatorname{Im} T^* \to \mathcal{D}(T^*)$, and $(T^*)^{-1} \subset (T^{-1})^*$ since, for $y = (T^*)^{-1}z$ in $(T^{-1})^*T^*y = y$, we have $(T^{-1})^*z = (T^*)^{-1}z$.


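To also prove that $(T^{-1})^* \subset (T^*)^{-1}$, let $x \in \mathcal{D}(T)$ and $y \in \mathcal{D}((T^*)^{-1})$. Then $Tx \in \operatorname{Im}(T) = \mathcal{D}(T^{-1})$ and
\[ (Tx, (T^{-1})^*y)_H = (x, y)_H, \qquad (Tx, (T^{-1})^*y)_H = (x, T^*(T^{-1})^*y)_H. \]
Thus, $(T^{-1})^*y \in \mathcal{D}(T^*)$ and $T^*(T^{-1})^*y = y$, so that $T^*(T^{-1})^* = I$ on $\mathcal{D}((T^*)^{-1}) = \operatorname{Im} T^*$, and $(T^*)^{-1} : \operatorname{Im}(T^*) \to \mathcal{D}(T^*)$ is bijective. $\square$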
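For a matrix acting on $\mathbf{C}^2$ the domain questions disappear and the adjoint is the conjugate transpose, which makes the defining identity $(Tx, y)_H = (x, T^*y)_H$ and property (a) easy to test. The sketch below assumes, as in the text, a scalar product that is linear in the first variable; the sample data and the function names are ours.

```python
def inner(x, y):
    """Scalar product on C^n, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(t, x):
    """Matrix-vector product."""
    return [sum(t[i][j] * x[j] for j in range(len(x))) for i in range(len(t))]

def adjoint(t):
    """Conjugate transpose of a matrix given as nested lists."""
    return [[t[j][i].conjugate() for j in range(len(t))] for i in range(len(t[0]))]

T = [[1 + 2j, 3j], [0.5 + 0j, -1j]]     # arbitrary sample matrix
x = [1 - 1j, 2 + 0j]
y = [3j, 1 + 1j]

lhs = inner(apply(T, x), y)             # (Tx, y)
rhs = inner(x, apply(adjoint(T), y))    # (x, T*y)

lam = 2 - 3j
scaled_adjoint = adjoint([[lam * v for v in row] for row in T])           # (lam T)*
expected = [[lam.conjugate() * v for v in row] for row in adjoint(T)]     # conj(lam) T*
```

Both `lhs` and `rhs` evaluate to the same complex number, and `scaled_adjoint` agrees with `expected`, in accordance with Theorem 9.5(a).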

It is useful to consider the "rotation operator" $G : H \times H \to H \times H$, such that $G(x, y) = (-y, x)$. It is an isometric isomorphism with respect to the norm $\|(x, y)\| := (\|x\|_H^2 + \|y\|_H^2)^{1/2}$ associated to the scalar product
\[ ((x, y), (x', y'))_{H \times H} := (x, x')_H + (y, y')_H, \]
which makes $H \times H$ a Hilbert space. Observe that $G^2 = -I$.

Theorem 9.6. If $T$ is closed and densely defined, then
\[ H \times H = G(\mathcal{G}(T)) \oplus \mathcal{G}(T^*) = \mathcal{G}(T) \oplus G(\mathcal{G}(T^*)), \]
orthogonal direct sums, $T^*$ is also closed and densely defined, and $T^{**} = T$.

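Proof. Let us first prove that $\mathcal{G}(T^*) = G(\mathcal{G}(T))^{\perp}$, showing the first equality, and that $T^*$ is closed. Since $(y, z) \in \mathcal{G}(T^*)$ if and only if $(Tx, y)_H = (x, z)_H$ for every $x \in \mathcal{D}(T)$, we have
\[ (G(x, Tx), (y, z))_{H \times H} = ((-Tx, x), (y, z))_{H \times H} = 0, \]
and this means that $(y, z) \in G(\mathcal{G}(T))^{\perp}$, so that $\mathcal{G}(T^*) = G(\mathcal{G}(T))^{\perp}$. Also, since $G^2 = -I$,
\[ H \times H = G\big(G(\mathcal{G}(T)) \oplus \mathcal{G}(T^*)\big) = \mathcal{G}(T) \oplus G(\mathcal{G}(T^*)). \]

If $(z, y)_H = 0$ for all $y \in \mathcal{D}(T^*)$, then $((0, z), (-T^*y, y))_{H \times H} = 0$. Hence, $(0, z) \in G(\mathcal{G}(T^*))^{\perp} = \mathcal{G}(T)$ and it follows that $z = T0 = 0$. Thus, $\mathcal{D}(T^*)$ is dense in $H$.

Finally, since also $H \times H = G(\mathcal{G}(T^*)) \oplus \mathcal{G}(T^{**})$ and $\mathcal{G}(T)$ is the orthogonal complement of $G(\mathcal{G}(T^*))$, we obtain the identity $T = T^{**}$. $\square$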

9.2. Unbounded self-adjoint operators

T : D(T ) ⊂ H → H is still a possibly unbounded linear operator on the complex Hilbert space H.


9.2.1. Self-adjoint operators. We say that the operator $T$ is symmetric if it is densely defined and
\[ (Tx, y)_H = (x, Ty)_H \qquad (x, y \in \mathcal{D}(T)). \]
Note that this condition means that $T \subset T^*$.

Theorem 9.7. Every symmetric operator $T$ is closable and its closure is $T^{**}$.

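Proof. Since $T$ is symmetric, $T \subset T^*$ and, $\mathcal{G}(T^*)$ being closed,
\[ \mathcal{G}(T) \subset \overline{\mathcal{G}(T)} \subset \mathcal{G}(T^*). \]
Hence $\overline{\mathcal{G}(T)}$ is a graph, $T$ is closable, and $\overline{\mathcal{G}(T)}$ is the graph of $\bar{T}$.

As a consequence, let us show that the domain of $T^*$ is dense. According to Theorem 9.6, $(x, y) \in \mathcal{G}(T^*)$ if and only if $(-y, x) \in \mathcal{G}(T)^{\perp}$, in $H \times H$. Hence,
\[ \overline{\mathcal{G}(T)} = \mathcal{G}(T)^{\perp\perp} = \{(T^*x, -x);\ x \in \mathcal{D}(T^*)\}^{\perp}. \]

This subspace is not a graph if and only if $(y, z_1), (y, z_2) \in \overline{\mathcal{G}(T)}$ for two different points $z_1, z_2 \in H$; that is, $(0, z) \in \{(T^*x, -x);\ x \in \mathcal{D}(T^*)\}^{\perp}$ for some $z \neq 0$. Then $(z, x)_H = 0$ for all $x \in \mathcal{D}(T^*)$, which means that $0 \neq z \in \mathcal{D}(T^*)^{\perp}$, and it follows that $\overline{\mathcal{D}(T^*)} \neq H$.

Since $\overline{\mathcal{D}(T^*)} = H$, $T^{**}$ is well-defined. We need to prove that
\[ \overline{\mathcal{G}(T)} = \mathcal{G}(\bar{T}) = \{(T^*x, -x);\ x \in \mathcal{D}(T^*)\}^{\perp} \]
is $\mathcal{G}(T^{**})$. But $(v, u) \in \overline{\mathcal{G}(T)}$ if and only if $(T^*x, v)_H - (x, u)_H = 0$ for all $x \in \mathcal{D}(T^*)$; that is, $v \in \mathcal{D}(T^{**})$ and $u = T^{**}v$, which means that $(v, u) \in \mathcal{G}(T^{**})$. $\square$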

The operator $T$ is called self-adjoint if it is densely defined and $T = T^*$, i.e., if it is symmetric and
\[ \mathcal{D}(T^*) \subset \mathcal{D}(T), \]
this inclusion meaning that the existence of $y^* \in H$ such that $(Tx, y)_H = (x, y^*)_H$ for all $x \in \mathcal{D}(T)$ implies $y \in \mathcal{D}(T)$ and $y^* = Ty$.

Theorem 9.8. If $T$ is self-adjoint and $S$ is a symmetric extension of $T$, then $S = T$. Hence $T$ does not have any strict symmetric extension; it is "maximally symmetric".

Proof. It is clear that $T = T^* \subset S$ and $S \subset S^*$, since $S$ is symmetric. It follows from the definition of the adjoint that $T \subset S$ implies $S^* \subset T^*$. From $S \subset S^* \subset T^* = T \subset S$ we obtain the identity $S = T$. $\square$


We are going to show that, in the unbounded case, the spectrum of a self-adjoint operator is also real. This property characterizes the closed symmetric operators that are self-adjoint.

First note that, if $T = T^*$, the point spectrum is real, since if $Tx = \lambda x$ and $0 \neq x \in \mathcal{D}(T)$, then
\[ \bar{\lambda}(x, x)_H = (x, Tx)_H = (Tx, x)_H = \lambda(x, x)_H \]
and $\bar{\lambda} = \lambda$.

Theorem 9.9. Suppose that $T$ is self-adjoint. The following properties hold:

(a) $\lambda \in \sigma(T)^c$ if and only if $\|Tx - \lambda x\|_H \geq c\|x\|_H$ for all $x \in \mathcal{D}(T)$, for some constant $c > 0$.

(b) The spectrum $\sigma(T)$ is real and closed.

(c) $\lambda \in \sigma(T)$ if and only if $Tx_n - \lambda x_n \to 0$ for some sequence $\{x_n\}$ in $\mathcal{D}(T)$ such that $\|x_n\|_H = 1$ ($\lambda$ is an approximate eigenvalue).

(d) The inequality $\|R_T(\lambda)\| \leq 1/|\Im\lambda|$ holds.

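Proof. (a) If $\lambda \in \sigma(T)^c$, then $R_T(\lambda) \in \mathcal{L}(H)$ and
\[ \|x\|_H \leq \|R_T(\lambda)\|\,\|(T - \lambda I)x\|_H = c^{-1}\|(T - \lambda I)x\|_H. \]

Suppose now that $\|Tx - \lambda x\|_H \geq c\|x\|_H$ and let $M = \operatorname{Im}(T - \lambda I)$, so that we have $T - \lambda I : \mathcal{D}(T) \to M$ with continuous inverse. To prove that $M = H$, let us first show that $M$ is dense in $H$. If $z \in M^{\perp}$, then for every $Tx - \lambda x \in M$ we have
\[ 0 = (Tx - \lambda x, z)_H = (Tx, z)_H - \lambda(x, z)_H. \]
Hence $(Tx, z)_H = (x, \bar{\lambda}z)_H$ if $x \in \mathcal{D}(T)$, and then $z \in \mathcal{D}(T^*) = \mathcal{D}(T)$ and $Tz = \bar{\lambda}z$. Suppose $z \neq 0$, so that $\bar{\lambda} = \lambda$ (the point spectrum is real) and we arrive at $Tz - \lambda z = 0$ with $z \neq 0$, a contradiction to the estimate $\|Tz - \lambda z\|_H \geq c\|z\|_H$. Thus, $M^{\perp} = 0$ and $M$ is dense.

To prove that $M$ is closed in $H$, let $M \ni y_n = Tx_n - \lambda x_n \to y$. Then $\|x_p - x_q\|_H \leq c^{-1}\|y_p - y_q\|_H$, and there exist $x = \lim x_n \in H$ and $\lim_n Tx_n = y + \lambda x$. But $T$ is closed, so that $Tx = y + \lambda x$ and $y \in M$.

(b) To show that every $\lambda = \alpha + i\beta \in \sigma(T)$ is real, observe that, if $x \in \mathcal{D}(T)$,
\[ (Tx - \lambda x, x)_H = (Tx, x)_H - \lambda(x, x)_H, \qquad \overline{(Tx - \lambda x, x)_H} = (Tx, x)_H - \bar{\lambda}(x, x)_H, \]
since $(Tx, x)_H \in \mathbf{R}$. Subtracting,
\[ \overline{(Tx - \lambda x, x)_H} - (Tx - \lambda x, x)_H = 2i\beta\|x\|_H^2, \]
where $\overline{(Tx - \lambda x, x)_H} - (Tx - \lambda x, x)_H = -2i\,\Im(Tx - \lambda x, x)_H$. Hence,
\[ |\beta|\,\|x\|_H^2 = |\Im(Tx - \lambda x, x)_H| \leq |(Tx - \lambda x, x)_H| \leq \|Tx - \lambda x\|_H\|x\|_H \]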


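and then $|\beta|\,\|x\|_H \leq \|Tx - \lambda x\|_H$ if $x \in \mathcal{D}(T)$. As seen in the proof of (a), the assumption $\beta \neq 0$ would imply $\lambda \in \sigma(T)^c$.

(c) If $\lambda \in \sigma(T)$, the estimate in (a) does not hold and then, for every $c = 1/n$, we can choose $x_n \in \mathcal{D}(T)$ with norm one such that $\|Tx_n - \lambda x_n\|_H \leq 1/n$, and $\lambda$ is an approximate eigenvalue. Every approximate eigenvalue $\lambda$ is in $\sigma(T)$, since, if $(T - \lambda I)^{-1}$ were bounded on $H$, then it would follow from $Tx_n - \lambda x_n \to 0$ that $x_n = (T - \lambda I)^{-1}(Tx_n - \lambda x_n) \to 0$, a contradiction to $\|x_n\|_H = 1$.

(d) If $y \in \mathcal{D}(T)$ and $\lambda = \Re\lambda + i\,\Im\lambda \notin \mathbf{R}$, then it follows that
\[ \|(T - \lambda I)y\|_H^2 = (Ty - \lambda y, Ty - \lambda y)_H \geq ((\Im\lambda)y, (\Im\lambda)y)_H = |\Im\lambda|^2\|y\|_H^2. \]
If $x = (T - \lambda I)y \in H$, then $y = R_T(\lambda)x$ and $\|x\|_H^2 \geq |\Im\lambda|^2\|R_T(\lambda)x\|_H^2$; thus $|\Im\lambda|\,\|R_T(\lambda)\| \leq 1$. $\square$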
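The lower bound $\|(T - \lambda I)y\|_H \geq |\Im\lambda|\,\|y\|_H$ used in (b) and (d) can be tried out on a Hermitian matrix, the finite-dimensional counterpart of a self-adjoint operator. The following sketch tests it on random vectors; the matrix, the sample values of $\lambda$ and the function names are assumptions made for the illustration.

```python
import math
import random

def apply(t, x):
    """Matrix-vector product."""
    return [sum(t[i][j] * x[j] for j in range(len(x))) for i in range(len(t))]

def norm(x):
    """Euclidean norm on C^n."""
    return math.sqrt(sum(abs(a) ** 2 for a in x))

def lower_bound_holds(t, lam, trials=1000, seed=0):
    """Test ||(T - lam I) y|| >= |Im lam| ||y|| on random vectors y."""
    rng = random.Random(seed)
    for _ in range(trials):
        y = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(len(t))]
        image = [v - lam * c for v, c in zip(apply(t, y), y)]
        if norm(image) < abs(lam.imag) * norm(y) - 1e-12:
            return False
    return True

# Hermitian sample matrix: it equals its own conjugate transpose.
T = [[2 + 0j, 1 - 1j], [1 + 1j, -1 + 0j]]
ok = lower_bound_holds(T, 0.5 + 0.25j)
```

A random test of this kind cannot prove the inequality, it only fails to find a counterexample; the proof is the computation in (d).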

The condition $\sigma(T) \subset \mathbf{R}$ is sufficient for a symmetric operator to be self-adjoint. In fact we have more:

Theorem 9.10. Suppose that $T$ is symmetric. If there exists $z \in \mathbf{C} \setminus \mathbf{R}$ such that $z, \bar{z} \in \sigma(T)^c$, then $T$ is self-adjoint.

Proof. Let us first show that $((T - zI)^{-1})^* = (T - \bar{z}I)^{-1}$, that is,
\[ ((T - zI)^{-1}x_1, x_2)_H = (x_1, (T - \bar{z}I)^{-1}x_2)_H. \]
We denote $(T - zI)^{-1}x_1 = y_1$ and $(T - \bar{z}I)^{-1}x_2 = y_2$. The desired identity means that $(y_1, (T - \bar{z}I)y_2)_H = ((T - zI)y_1, y_2)_H$ and it is true if $y_1, y_2 \in \mathcal{D}(T)$, since $T$ is symmetric. But the images of $T - zI$ and $T - \bar{z}I$ are both the whole space $H$, so that $((T - zI)^{-1}x_1, x_2)_H = (x_1, (T - \bar{z}I)^{-1}x_2)_H$ holds for any $x_1, x_2 \in H$.

Now we can prove that $\mathcal{D}(T^*) \subset \mathcal{D}(T)$. Let $v \in \mathcal{D}(T^*)$ and $w = T^*v$, i.e.,
\[ (Ty_1, v)_H = (y_1, w)_H \qquad (\forall y_1 \in \mathcal{D}(T)). \]
We subtract $z(y_1, v)_H$ to obtain
\[ ((T - zI)y_1, v)_H = (y_1, w - \bar{z}v)_H. \]
Still with the notation $(T - zI)^{-1}x_1 = y_1$ and $(T - \bar{z}I)^{-1}x_2 = y_2$, but now with $x_2 = w - \bar{z}v$, since $(x_1, v)_H = ((T - zI)y_1, v)_H = (y_1, w - \bar{z}v)_H$,
\[ (x_1, v)_H = \big((T - zI)^{-1}x_1, w - \bar{z}v\big)_H = \big(x_1, (T - \bar{z}I)^{-1}(w - \bar{z}v)\big)_H \qquad (\forall x_1 \in H). \]
Thus, $v = (T - \bar{z}I)^{-1}(w - \bar{z}v)$ and $v \in \operatorname{Im}(T - \bar{z}I)^{-1} = \mathcal{D}(T)$. $\square$

In the preceding proof, we have only needed the existence of $z \notin \mathbf{R}$ such that $(T - zI)^{-1}$ and $(T - \bar{z}I)^{-1}$ are defined on $H$, but not their continuity.


Example 9.11. The position operator Q of Example 9.2 is self-adjoint and σ(Q)=R, but it does not have an eigenvalue.

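Obviously $Q$ is symmetric, since, if $f, g, [xf(x)], [xg(x)] \in L^2(\mathbf{R})$, then
\[ (Qf, g)_2 = \int_{\mathbf{R}} xf(x)\overline{g(x)}\,dx = \int_{\mathbf{R}} f(x)\overline{xg(x)}\,dx = (f, Qg)_2. \]
Let us show that $\mathcal{D}(Q^*) \subset \mathcal{D}(Q)$. If $g \in \mathcal{D}(Q^*)$, then there exists $g^* \in L^2(\mathbf{R})$ such that $(Qf, g)_2 = (f, g^*)_2$ for all $f \in \mathcal{D}(Q)$. So $\int_{\mathbf{R}} \varphi(x)x\overline{g(x)}\,dx = \int_{\mathbf{R}} \varphi(x)\overline{g^*(x)}\,dx$ if $\varphi \in \mathcal{D}(\mathbf{R})$, and then $g^*(x) = xg(x)$ a.e. on $\mathbf{R}$. I.e., $xg(x)$ is in $L^2(\mathbf{R})$ and $g \in \mathcal{D}(Q)$. Hence $Q$ is self-adjoint, since it is symmetric and $\mathcal{D}(Q^*) \subset \mathcal{D}(Q)$.

If $\lambda \notin \sigma(Q)$, then $T = (Q - \lambda I)^{-1} \in \mathcal{L}(H)$ and, for every $g \in L^2(\mathbf{R})$, the equality $(Q - \lambda I)Tg = g$ implies that $(Tg)(x) = g(x)/(x - \lambda) \in L^2(\mathbf{R})$ and $\lambda \notin \mathbf{R}$, since, when $\lambda \in \mathbf{R}$ and $g := \chi_{(\lambda,b)}$, $g(x)/(x - \lambda) \notin L^2(\mathbf{R})$.

Example 9.12. The adjoint of the derivative operator $D$ of Example 9.1 is $-D$, and $iD$ is self-adjoint. The spectrum of $iD$ is also $\mathbf{R}$ and it has no eigenvalues.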

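By means of the Fourier transform we can transfer the properties of $Q$. If $f, g \in H^1(\mathbf{R})$, then $(Df, g)_2 = (\widehat{Df}, \hat{g})_2$. From $(Q\hat{f}, \hat{g})_2 = (\hat{f}, Q\hat{g})_2$ and $\widehat{Df}(x) = 2\pi i x\hat{f}(x) = 2\pi i (Q\hat{f})(x)$ we obtain
\[ (Df, g)_2 = (2\pi i Q\hat{f}, \hat{g})_2 = (\hat{f}, -2\pi i Q\hat{g})_2 = (f, -Dg)_2 \]
and $-D \subset D^*$. As in the case of $Q$, also $\mathcal{D}(D^*) \subset \mathcal{D}(D)$.

Furthermore, $(iD)^* = -iD^* = iD$ and $\sigma(iD) \subset \mathbf{R}$. If $T = (iD - \lambda I)^{-1}$ and $g \in L^2$, then an application of the Fourier transform to $(iD - \lambda I)Tg = g$ gives $-2\pi x\widehat{Tg}(x) - \lambda\widehat{Tg}(x) = \hat{g}(x)$, and
\[ \widehat{Tg}(x) = -\frac{\hat{g}(x)}{2\pi x + \lambda} \]
has to lie in $L^2(\mathbf{R})$. So we arrive at $\lambda \notin \mathbf{R}$ by taking convenient functions $\hat{g} = \chi_{(a,b)}$.

Example 9.13. The Laplace operator $\Delta$ of $L^2(\mathbf{R}^n)$ with domain $H^2(\mathbf{R}^n)$ is self-adjoint. Its spectrum is $\sigma(\Delta) = [0, \infty)$.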

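Recall that
\[ H^2(\mathbf{R}^n) = \{u \in L^2(\mathbf{R}^n);\ D^{\alpha}u \in L^2(\mathbf{R}^n),\ |\alpha| \leq 2\} = \Big\{u \in L^2(\mathbf{R}^n);\ \int_{\mathbf{R}^n} |(1 + |\xi|^2)\hat{u}(\xi)|^2\,d\xi < \infty\Big\} \]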


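and $\Delta$, with this domain, is symmetric: $(\Delta u, v)_2 = (u, \Delta v)_2$ follows from the Fourier transforms, since
\[ \int_{\mathbf{R}^n} \hat{u}(\xi)|\xi|^2\overline{\hat{v}(\xi)}\,d\xi = \int_{\mathbf{R}^n} \hat{u}(\xi)\overline{|\xi|^2\hat{v}(\xi)}\,d\xi. \]
To prove that it is self-adjoint, let $u \in \mathcal{D}(\Delta^*) \subset L^2(\mathbf{R}^n)$. If $w \in L^2(\mathbf{R}^n)$ is such that
\[ (\Delta v, u)_2 = (v, w)_2 \qquad (v \in H^2(\mathbf{R}^n)), \]
then, up to a nonzero multiplicative constant,
\[ \int_{\mathbf{R}^n} \hat{v}(\xi)|\xi|^2\overline{\hat{u}(\xi)}\,d\xi = \int_{\mathbf{R}^n} \hat{v}(\xi)\overline{\hat{w}(\xi)}\,d\xi \]
for every $v \in H^2(\mathbf{R}^n)$, a dense subspace of $L^2(\mathbf{R}^n)$, and $|\xi|^2\hat{u}(\xi) = c\hat{w}(\xi)$, in $L^2(\mathbf{R}^n)$. Hence, $\int_{\mathbf{R}^n} |(1 + |\xi|^2)\hat{u}(\xi)|^2\,d\xi < \infty$ and $u \in H^2(\mathbf{R}^n) = \mathcal{D}(\Delta)$. Thus, $\mathcal{D}(\Delta^*) \subset \mathcal{D}(\Delta)$.

The Fourier transform, $\mathcal{F}$, is a unitary operator of $L^2(\mathbf{R}^n)$, so that the spectrum of $\Delta$ is the same as the spectrum of the multiplication operator $\mathcal{F}\Delta\mathcal{F}^{-1} = 4\pi^2|\xi|^2\cdot$, which is self-adjoint with domain
\[ \Big\{f \in L^2(\mathbf{R}^n);\ \int_{\mathbf{R}^n} |(1 + |\xi|^2)f(\xi)|^2\,d\xi < \infty\Big\}, \]
and $\lambda \in \sigma(\mathcal{F}\Delta\mathcal{F}^{-1})^c$ if and only if the multiplication by $4\pi^2|\xi|^2 - \lambda$ has a continuous inverse on $L^2(\mathbf{R}^n)$, the multiplication by $1/(4\pi^2|\xi|^2 - \lambda)$. This means that $\lambda \neq 4\pi^2|\xi|^2$ for every $\xi \in \mathbf{R}^n$, i.e., $\lambda \notin [0, \infty)$.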

An application of Theorem 9.10 shows that a perturbation of a self-adjoint operator with a "small" symmetric operator is still self-adjoint. For a more precise statement of this fact, let us say that an operator $S$ is relatively bounded, with constant $\alpha$, with respect to another operator $A$ if $\mathcal{D}(A) \subset \mathcal{D}(S)$ and there are two constants $\alpha, c \geq 0$ such that
\[ (9.1)\qquad \|Sx\|_H^2 \leq \alpha^2\|Ax\|_H^2 + c^2\|x\|_H^2 \qquad (x \in \mathcal{D}(A)). \]
Let us check that this kind of estimate is equivalent to
\[ (9.2)\qquad \|Sx\|_H \leq \alpha'\|Ax\|_H + c'\|x\|_H \qquad (x \in \mathcal{D}(A)) \]
and that we can take $\alpha' < 1$ if $\alpha < 1$ and $\alpha < 1$ if $\alpha' < 1$.

By completing the square, it is clear that (9.2) follows from (9.1) with $\alpha' = \alpha$ and $c' = c$. Also, from (9.2) we obtain (9.1) with $\alpha^2 = (1 + \varepsilon^{-1})\alpha'^2$ and $c^2 = (1 + \varepsilon)c'^2$, for any $\varepsilon > 0$, since
\[ 2\alpha'\|Ax\|_H\,c'\|x\|_H \leq \varepsilon^{-1}\alpha'^2\|Ax\|_H^2 + \varepsilon c'^2\|x\|_H^2 \]
and an easy substitution shows that
\[ (\alpha'\|Ax\|_H + c'\|x\|_H)^2 \leq \alpha^2\|Ax\|_H^2 + c^2\|x\|_H^2. \]
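The passage from (9.2) to (9.1) only involves the numbers $a = \|Ax\|_H$ and $b = \|x\|_H$, so it can be tried out numerically. The sketch below computes the constants $\alpha, c$ of (9.1) from those of (9.2) for a given $\varepsilon$ and tests the resulting inequality on random nonnegative pairs; the function names and the sample constants are assumptions made for the illustration.

```python
import math
import random

def squared_constants(alpha1, c1, eps):
    """Constants (alpha, c) of (9.1) obtained from those of (9.2) for a given eps > 0."""
    return math.sqrt(1 + 1 / eps) * alpha1, math.sqrt(1 + eps) * c1

def estimate_holds(alpha1, c1, eps, trials=1000, seed=0):
    """Test (alpha1*a + c1*b)^2 <= alpha^2 a^2 + c^2 b^2 on random a, b >= 0."""
    alpha, c = squared_constants(alpha1, c1, eps)
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.uniform(0, 10), rng.uniform(0, 10)   # stand-ins for ||Ax||, ||x||
        lhs = (alpha1 * a + c1 * b) ** 2
        rhs = alpha ** 2 * a ** 2 + c ** 2 * b ** 2
        if lhs > rhs * (1 + 1e-12):
            return False
    return True

# With alpha' = 0.5 and eps = 1 the new constant is sqrt(2) * 0.5 < 1.
alpha, c = squared_constants(0.5, 2.0, 1.0)
```

In general $\alpha < 1$ is obtained from $\alpha' < 1$ by taking $\varepsilon > \alpha'^2/(1 - \alpha'^2)$, at the price of a larger constant $c$.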


Theorem 9.14 (Rellich$^4$). Let $A$ be a self-adjoint operator and let $S$ be symmetric, with the same domain $D \subset H$. If $S$ is relatively bounded with constant $\alpha < 1$ with respect to $A$, then $T = A + S$ is also self-adjoint with domain $D$.

Proof. Let us first check that the symmetric operator $T$ is closed. If $(x, y) \in \overline{\mathcal{G}(T)}$, then we choose $x_n \in D$ so that $x_n \to x$ and $Tx_n \to y$. From the hypothesis we have
\[ \|Ax_n - Ax_m\|_H \leq \|Tx_n - Tx_m\|_H + \alpha\|Ax_n - Ax_m\|_H + c\|x_n - x_m\|_H, \]
which implies
\[ \|Ax_n - Ax_m\|_H \leq \frac{1}{1 - \alpha}\|Tx_n - Tx_m\|_H + \frac{c}{1 - \alpha}\|x_n - x_m\|_H, \]
and there exists $z = \lim_n Ax_n$. But $A$ is closed and $z = Ax$ with $x \in D$.

Moreover $\|Sx_n - Sx\|_H \leq \alpha\|A(x_n - x)\|_H + c\|x_n - x\|_H$ and $Sx_n \to Sx$. Hence $y = \lim_n Tx_n = Tx$ and $(x, y) \in \mathcal{G}(T)$.

The operators $T - zI$ ($z \in \mathbf{C}$), with domain $D$, are also closed. With Theorem 9.10 in hand, we only need to check that $\pm\lambda i \in \sigma(T)^c$ when $\lambda \in \mathbf{R}$ is large enough ($|\lambda| > c$).

To show that $T - \lambda i I$ is one-to-one if $\lambda \neq 0$, let $(T - \lambda i I)x = y$ ($x \in D$) and note that the absolute values of the imaginary parts of both sides of
\[ (Tx, x)_H - \lambda i(x, x)_H = (y, x)_H \]
are equal, so that $|\lambda|\,\|x\|_H^2 = |\Im(y, x)_H| \leq \|y\|_H\|x\|_H$ and
\[ \|x\|_H \leq |\lambda|^{-1}\|y\|_H \qquad (x \in D). \]
Thus, $y = 0$ implies $x = 0$.

Let us prove now that $T - \lambda i I$ has a closed image. Let $y_n \to y$ with $y_n = (T - \lambda i I)x_n$. Then $\|x_n - x_m\|_H \leq |\lambda|^{-1}\|y_n - y_m\|_H$ and the limit $x = \lim x_n \in H$ exists. Since $(T - \lambda i I)x_n \to y$ and the graph of $T - \lambda i I$ is closed, $x \in D$ and $y = (T - \lambda i I)x \in \operatorname{Im}(T - \lambda i I)$.

Let us also show that $\operatorname{Im}(T - \lambda i I) = H$ by proving that the orthogonal is zero. Let $v \in H$ be such that
\[ (Ax + Sx - \lambda i x, v)_H = 0 \qquad (x \in D). \]

$^4$F. Rellich worked on the foundations of quantum mechanics and on partial differential equations, and his most important contributions, around 1940, refer to the perturbation of the spectrum of self-adjoint operators $A(\varepsilon)$ which depend on a parameter $\varepsilon$. See also footnote 15 in Chapter 7.


Then $(A - \lambda i)(D) = H$, since $\lambda i \in \sigma(A)^c$. If $(A - \lambda i I)u = v$, let $x = u$, and then we obtain that
\[ ((A - \lambda i)u, (A - \lambda i)u)_H + (Su, (A - \lambda i)u)_H = 0. \]
From the Cauchy-Schwarz inequality, $\|Au - \lambda i u\|_H^2 \leq \|Su\|_H\|Au - \lambda i u\|_H$ and
\[ \|Au - \lambda i u\|_H \leq \|Su\|_H. \]
Since $A$ is symmetric, $(Ay - \lambda i y, Ay - \lambda i y)_H = \|Ay\|_H^2 + \lambda^2\|y\|_H^2$ and
\[ \|Au\|_H^2 + \lambda^2\|u\|_H^2 = \|Au - \lambda i u\|_H^2 \leq \|Su\|_H^2 \leq \alpha^2\|Au\|_H^2 + c^2\|u\|_H^2. \]
But, if $|\lambda| > c$, the condition $\alpha^2 < 1$ implies $u = 0$, and then $v = 0$.

We have proved that $(T - \lambda i I)^{-1} : H \to H$ is well-defined and closed, i.e., it is bounded. Hence, $\pm i\lambda \in \sigma(T)^c$ and the symmetric operator $T$ is self-adjoint. $\square$

Example 9.15. The operator $H = -\Delta - |x|^{-1}$ on $L^2(\mathbf{R}^3)$, with domain $H^2(\mathbf{R}^3)$, is self-adjoint.

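Let $V(x) := -|x|^{-1} = V_0(x) + V_1(x)$ with $V_0(x) := \chi_B(x)V(x)$ ($B = \{|x| \geq 1\}$), so that $\|V_0\|_\infty = 1$ and $V_1 \in L^2(\mathbf{R}^3)$. Multiplication by the real function $|x|^{-1}$ is a symmetric operator whose domain contains $H^2(\mathbf{R}^3)$, the domain of $-\Delta$, since $V_0u \in L^2(\mathbf{R}^3)$ if $u \in H^2(\mathbf{R}^3)$, with
\[ \|V_0u\|_2 \leq \|V_0\|_\infty\|u\|_2 = \|u\|_2 \]
and $\|V_1u\|_2 \leq \|V_1\|_2\|u\|_\infty$, where $\|u\|_\infty \leq \|\hat{u}\|_1$.

To apply Theorem 9.14, we will show that multiplication by $-|x|^{-1}$ is relatively bounded with respect to $-\Delta$. From the Cauchy-Schwarz inequality and from the relationship between the Fourier transform and the derivatives, we obtain
\[ \Big(\int_{\mathbf{R}^3} |\hat{u}(\xi)|\,d\xi\Big)^2 \leq \int_{\mathbf{R}^3} \frac{d\xi}{(|2\pi\xi|^2 + \beta^2)^2}\,\|(-\Delta + \beta^2 I)u\|_2^2 = \frac{C}{\beta}\,\|(-\Delta + \beta^2 I)u\|_2^2, \]
where $C$ is a constant that does not depend on $\beta$. From the inversion theorem we obtain that $u$ is bounded and continuous, since it is the Fourier co-transform of the integrable function $\hat{u}$. Then
\[ \|V_1u\|_2 \leq c\big(\beta^{-1/2}\|{-\Delta u}\|_2 + \beta^{3/2}\|u\|_2\big) \qquad (u \in H^2(\mathbf{R}^3)) \]
so that
\[ \|Vu\|_2 \leq c\beta^{-1/2}\|{-\Delta u}\|_2 + (c\beta^{3/2} + 1)\|u\|_2 \]
and $c\beta^{-1/2} < 1$ if $\beta$ is large. It follows from the Rellich theorem that $H$ is self-adjoint with domain $H^2(\mathbf{R}^3)$.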


9.2.2. Essentially self-adjoint operators. Very often, operators appear to be symmetric but they are not self-adjoint, and in order to apply the spectral theory, it will be useful to know whether they have a self-adjoint extension.

Recall that a symmetric operator is closable and that a self-adjoint operator is always closed and maximally symmetric. A symmetric operator is said to be essentially self-adjoint if its closure is self-adjoint. In this case, the closure is the unique self-adjoint extension of the operator.

Example 9.16. It follows from Example 9.13 that the Laplacian $\Delta$, as an operator on $L^2(\mathbf{R}^n)$ with domain $\mathcal{S}(\mathbf{R}^n)$, is essentially self-adjoint. Its closure is again $\Delta$, but with domain $H^2(\mathbf{R}^n)$.

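Theorem 9.17. If $T$ is symmetric and a sequence $\{u_n\}_{n \in \mathbf{N}} \subset \mathcal{D}(T)$ is an orthonormal basis of $H$ such that $Tu_n = \lambda_nu_n$ ($n \in \mathbf{N}$), then $T$ is essentially self-adjoint and the spectrum of its self-adjoint extension $\bar{T}$ is $\sigma(\bar{T}) = \overline{\{\lambda_n;\ n \in \mathbf{N}\}}$.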

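Proof. The eigenvalues $\lambda_n$ are all real. Define
\[ \mathcal{D}(\bar{T}) := \Big\{\sum_{n=1}^{\infty} \alpha_nu_n;\ \sum_{n=1}^{\infty} |\alpha_n|^2 + \sum_{n=1}^{\infty} \lambda_n^2|\alpha_n|^2 < \infty\Big\}, \]
a linear subspace of $H$ that contains $\mathcal{D}(T)$, since, if $x = \sum_{n=1}^{\infty} \alpha_nu_n \in \mathcal{D}(T)$ and $Tx = \sum_{n=1}^{\infty} \beta_nu_n \in H$, the Fourier coefficients $\alpha_n$ and $\beta_n$ satisfy
\[ \beta_n = (Tx, u_n)_H = (x, Tu_n)_H = \lambda_n(x, u_n)_H = \lambda_n\alpha_n \]
and $\{\alpha_n\}, \{\lambda_n\alpha_n\} \in \ell^2$. We can define the operator $\bar{T}$ on $\mathcal{D}(\bar{T})$ by
\[ \bar{T}\Big(\sum_{n=1}^{\infty} \alpha_nu_n\Big) := \sum_{n=1}^{\infty} \lambda_n\alpha_nu_n. \]
Let us show that $\bar{T}$ is a self-adjoint extension of $T$.

It is clear that $\bar{T}$ is symmetric, $T \subset \bar{T}$, and every $\lambda_n$ is an eigenvalue of $\bar{T}$, so that $\{\lambda_n;\ n \in \mathbf{N}\} \subset \sigma(\bar{T})$.

If $\lambda \notin \overline{\{\lambda_n;\ n \in \mathbf{N}\}}$, so that $|\lambda - \lambda_n| \geq \delta > 0$, then it follows that $\lambda \notin \sigma(\bar{T})$ since we can construct the inverse of
\[ (\bar{T} - \lambda I)\sum_{n=1}^{\infty} \alpha_nu_n = \sum_{n=1}^{\infty} \alpha_n(\lambda_n - \lambda)u_n \]
by defining
\[ R\sum_{n=1}^{\infty} \alpha_nu_n := \sum_{n=1}^{\infty} \frac{\alpha_n}{\lambda_n - \lambda}u_n. \]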


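Indeed, we obtain an operator $R \in \mathcal{L}(H)$ ($\|R\| \leq 1/\delta$) which obviously is one-to-one and its image is $\mathcal{D}(\bar{T})$, since
\[ \sum_{n=1}^{\infty} \Big|\frac{\alpha_n}{\lambda_n - \lambda}\Big|^2\lambda_n^2 \leq \sum_{n=1}^{\infty} |\alpha_n|^2\Big(1 + \frac{|\lambda|}{\delta}\Big)^2 < \infty \]
and, moreover, we can associate to every $x = \sum_{n=1}^{\infty} \alpha_nu_n \in \mathcal{D}(\bar{T})$ the element
\[ y = \sum_{n=1}^{\infty} \beta_nu_n = \sum_{n=1}^{\infty} \alpha_n(\lambda_n - \lambda)u_n \in H \]
such that $Ry = x$. Also,
\[ (\bar{T} - \lambda I)R\sum_{n=1}^{\infty} \alpha_nu_n = \sum_{n=1}^{\infty} \frac{\alpha_n}{\lambda_n - \lambda}(\bar{T} - \lambda I)u_n = \sum_{n=1}^{\infty} \alpha_nu_n \]
and $R = (\bar{T} - \lambda I)^{-1}$.

To prove that $\bar{T}$ is self-adjoint, we need to see that $\mathcal{D}((\bar{T})^*) \subset \mathcal{D}(\bar{T})$. If $x = \sum_{n=1}^{\infty} \alpha_nu_n \in \mathcal{D}((\bar{T})^*)$ and $y = (\bar{T})^*x$, then, for every $n \in \mathbf{N}$,
\[ (y, u_n)_H = (x, \bar{T}u_n)_H = \lambda_n(x, u_n)_H = \lambda_n\alpha_n \]
and $\sum_{n=1}^{\infty} |\lambda_n\alpha_n|^2 < \infty$, i.e., $x \in \mathcal{D}(\bar{T})$.

Finally, to prove that $\bar{T}$ is the closure of $T$, consider
\[ (x, \bar{T}x) = \Big(\sum_{n=1}^{\infty} \alpha_nu_n, \sum_{n=1}^{\infty} \lambda_n\alpha_nu_n\Big) \in \mathcal{G}(\bar{T}). \]
Then $x_N := \sum_{n=1}^{N} \alpha_nu_n \in \mathcal{D}(T)$ and
\[ (x_N, Tx_N) = \Big(\sum_{n=1}^{N} \alpha_nu_n, \sum_{n=1}^{N} \lambda_n\alpha_nu_n\Big) \to (x, \bar{T}x) \]
in $H \times H$, since $\{\alpha_n\}, \{\lambda_n\alpha_n\} \in \ell^2$. $\square$

Remark 9.18. A symmetric operator $T$ may have no self-adjoint extensions at all, or many self-adjoint extensions. According to Theorems 9.7 and 9.8, if $T$ is essentially self-adjoint, $T^{**}$ is the unique self-adjoint extension of $T$.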
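The operator $R$ built in the proof of Theorem 9.17 acts on Fourier coefficients by division by $\lambda_n - \lambda$, so its main properties can be observed on a finite truncation. The sketch below assumes the eigenvalues $\lambda_n = n$ ($n = 1, \dots, 200$), the point $\lambda = 2.5$ and the coefficients $\alpha_n = 1/n$; these data and the function names are illustrative and are not taken from the text.

```python
import math

def resolvent_coeffs(alphas, eigenvalues, lam):
    """Coefficients of R x: alpha_n / (lambda_n - lam)."""
    return [a / (ev - lam) for a, ev in zip(alphas, eigenvalues)]

def shifted_coeffs(alphas, eigenvalues, lam):
    """Coefficients of (T - lam I) x: alpha_n * (lambda_n - lam)."""
    return [a * (ev - lam) for a, ev in zip(alphas, eigenvalues)]

def l2_norm(coeffs):
    """Norm of the vector with the given coefficients in an orthonormal basis."""
    return math.sqrt(sum(abs(a) ** 2 for a in coeffs))

eigenvalues = [float(n) for n in range(1, 201)]   # assumed sample spectrum
alphas = [1.0 / n for n in range(1, 201)]         # assumed sample coefficients
lam = 2.5
delta = min(abs(ev - lam) for ev in eigenvalues)  # distance from lam to the eigenvalues

r_alphas = resolvent_coeffs(alphas, eigenvalues, lam)
back = shifted_coeffs(r_alphas, eigenvalues, lam)
round_trip_error = max(abs(a - b) for a, b in zip(alphas, back))
```

For these data `delta` is $0.5$, the norm of `r_alphas` does not exceed `l2_norm(alphas) / delta`, and applying $\bar{T} - \lambda I$ to $Rx$ recovers the original coefficients up to rounding.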

9.2.3. The Friedrichs extensions. A sufficient condition for a symmetric operator $T$ to have self-adjoint extensions, known as the Friedrichs extensions, concerns the existence of a lower bound for the quadratic form $(Tx, x)_H$.


We say that $T$, symmetric, is semi-bounded$^5$ with constant $c$, if
\[ c := \inf_{x \in \mathcal{D}(T),\ \|x\|_H = 1} (Tx, x)_H > -\infty, \]
so that $(Tx, x)_H \geq c\|x\|_H^2$ for all $x \in \mathcal{D}(T)$.

In this case, for any $c' \in \mathbf{R}$, $T - c'I$ is also symmetric on the same domain and semi-bounded, with constant $c - c'$. If $\widetilde{T}$ is a self-adjoint extension of $T$, then $\widetilde{T} - c'I$ is a self-adjoint extension of $T - c'I$, and we will choose a convenient constant in our proofs.

Let us denote
\[ (x, y)_T := (Tx, y)_H, \]
a sesquilinear form on $\mathcal{D}(T)$ such that $(y, x)_T = \overline{(x, y)_T}$. If $c > 0$, then we have an inner product.

Theorem 9.19 (Friedrichs-Stone$^6$). If $T$ is a semi-bounded symmetric operator, with constant $c$, then it has a self-adjoint extension $\widetilde{T}$ such that $(\widetilde{T}x, x)_H \geq c\|x\|_H^2$, if $x \in \mathcal{D}(\widetilde{T})$.

Proof. We can suppose that $c = 1$, and then $(x, y)_T$ is a scalar product on $D = \mathcal{D}(T)$ which defines a norm $\|x\|_T = (x, x)_T^{1/2} \geq \|x\|_H$.

Let $D_T$ be the $\|\cdot\|_T$-completion of $D$. Since $\|x\|_H \leq \|x\|_T$, every $\|\cdot\|_T$-Cauchy sequence $\{x_n\} \subset D$, which represents a point $\widetilde{x} \in D_T$, has a limit $x$ in $H$, and we have a natural mapping $J : D_T \to H$, such that $J\widetilde{x} = x$.

This mapping $J$ is one-to-one, since, if $Jy = 0$ and $x_n \to y$ in $D_T$, $\{x_n\} \subset D$ is also a Cauchy sequence in $H$ and there exists $x = \lim x_n$ in $H$. Then $x = Jy = 0$ and, from the definition of $(y, x)_T$ and by the continuity of the scalar product, it follows that, for every $v \in D$,
\[ (v, y)_T = \lim_n (v, x_n)_T = \lim_n (Tv, x_n)_H = (Tv, x)_H = 0. \]
But $D$ is dense in $D_T$ and $y = 0$.

We have $D = \mathcal{D}(T) \subset D_T \hookrightarrow H$ and, to define the Friedrichs extension $\widetilde{T}$ of $T$, we observe that, for every $u = (\cdot, y)_H \in H'$,
\[ |u(x)| \leq \|x\|_H\|y\|_H \leq \|x\|_T\|y\|_H \qquad (x \in D_T) \]
and there exists a unique element $w \in D_T$ such that $u = (\cdot, w)_T$ on $D_T$. We define $\mathcal{D}(\widetilde{T})$ as the set of all these elements,
\[ \mathcal{D}(\widetilde{T}) = \{w \in D_T;\ (\cdot, w)_T = (\cdot, y)_H \text{ on } D_T \text{ for some } y \in H\}, \]

$^5$In 1929 J. von Neumann and also A. Wintner identified this class of operators that admit self-adjoint extensions.

$^6$Kurt Otto Friedrichs (1901–1982) made contributions to the theory of partial differential equations, operators in Hilbert space, perturbation theory, and bifurcation theory. He published his extension theorem in Göttingen in 1934, and M. Stone did the same in New York in 1932.


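i.e., $\mathcal{D}(\widetilde{T}) = \mathcal{D}(T^*) \cap D_T$ and $T^*w = y$ for a unique $w \in \mathcal{D}(\widetilde{T})$, for every $y \in H$. Next we define
\[ \widetilde{T}w = y \quad \text{if } (\cdot, y)_H = (\cdot, w)_T \text{ over } D_T \qquad (w \in \mathcal{D}(\widetilde{T})), \]
so that $\widetilde{T}$ is the restriction of $T^*$ to $\mathcal{D}(\widetilde{T}) = \mathcal{D}(T^*) \cap D_T$.

This new operator is a linear extension of $T$, since, for all $v \in D_T$,
\[ (9.3)\qquad (v, w)_T = (v, \widetilde{T}w)_H \qquad (w \in \mathcal{D}(\widetilde{T})) \]
and, if $y = Tx \in H$ with $x \in D$,
\[ (v, y)_H = (v, Tx)_H = (Tv, x)_H = (v, x)_T \qquad (v \in D). \]
Thus, $x = w$ and $\widetilde{T}w = Tx$, i.e., $D \subset \mathcal{D}(\widetilde{T})$ and $T \subset \widetilde{T}$.

To show that $\widetilde{T}$ is symmetric, apply (9.3) to $w \in \mathcal{D}(\widetilde{T}) \subset D_T$. If $v, w \in \mathcal{D}(\widetilde{T})$, then $(w, v)_T = (w, \widetilde{T}v)_H$ and the scalar product is symmetric, so that $(\widetilde{T}w, v)_H = (w, \widetilde{T}v)_H$.

Observe that $\widetilde{T} : \mathcal{D}(\widetilde{T}) \to H$ is bijective, since, in our construction, for every $y \in H$, $w$ was the unique solution of the equation $\widetilde{T}w = y$. Moreover, the closed graph theorem shows that $A := \widetilde{T}^{-1} : H \to \mathcal{D}(\widetilde{T}) \subset H$ is a bounded operator, since $y_n \to 0$ and $\widetilde{T}^{-1}y_n \to w$ imply
\[ 0 = \lim_n (\widetilde{T}^{-1}x, y_n)_H = \lim_n (x, \widetilde{T}^{-1}y_n)_H = (x, w)_H \]
for every $x \in H$, and then $w = 0$.

This bounded operator, being the inverse of a symmetric operator, is also symmetric, i.e., it is self-adjoint. But then, every $z \in \mathbf{C} \setminus \mathbf{R}$ is in $\sigma(A)^c$. The identity $z^{-1}I - A^{-1} = A^{-1}(A - zI)z^{-1}$ shows that $z^{-1} \in \sigma(A^{-1})^c = \sigma(\widetilde{T})^c$, if $z \notin \mathbf{R}$. By Theorem 9.10, $\widetilde{T}$ is self-adjoint. $\square$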

9.3. Spectral representation of unbounded self-adjoint operators

$T : \mathcal{D}(T) \subset H \to H$ is still a possibly unbounded linear operator.

The functional calculus for a bounded normal operator $T$ has been based on the spectral resolution
\[ T = \int_{\sigma(T)} \lambda\,dE(\lambda), \]
where $E$ represents a spectral measure on $\sigma(T)$. If $f$ is bounded, then this representation allows us to define
\[ f(T) = \int_{\sigma(T)} f(\lambda)\,dE(\lambda). \]


This functional calculus can be extended to unbounded functions, $h$, and then it can be used to set a spectral theory for unbounded self-adjoint operators.

The last section of this chapter is devoted to the proof of the following result:

Theorem 9.20 (Spectral theorem). For every self-adjoint operator $T$ on $H$, there exists a unique spectral measure $E$ on $\mathbf{R}$ which satisfies
\[ T = \int_{\mathbf{R}} t\,dE(t) \]
in the sense that
\[ (Tx, y)_H = \int_{-\infty}^{+\infty} t\,dE_{x,y}(t) \qquad (x \in \mathcal{D}(T),\ y \in H). \]
If $f$ is a Borel measurable function on $\mathbf{R}$, then a densely defined operator
\[ f(T) = \int_{\mathbf{R}} f(t)\,dE(t) \]
is obtained such that
\[ (f(T)x, y)_H = (\Phi_E(f)x, y)_H = \int_{-\infty}^{+\infty} f(t)\,dE_{x,y}(t) \qquad (x \in \mathcal{D}(f),\ y \in H), \]
where
\[ \mathcal{D}(f) = \Big\{x \in H;\ \int_{-\infty}^{+\infty} |f(\lambda)|^2\,dE_{x,x} < \infty\Big\}. \]
For this functional calculus,

(a) $\|f(T)x\|_H^2 = \int_{\sigma(T)} |f|^2\,dE_{x,x}$ if $x \in \mathcal{D}(f(T))$,

(b) $f(T)h(T) \subset (fh)(T)$, $\mathcal{D}(f(T)h(T)) = \mathcal{D}(h(T)) \cap \mathcal{D}((fh)(T))$, and

(c) $f(T)^* = \bar{f}(T)$ and $f(T)^*f(T) = |f|^2(T) = f(T)f(T)^*$.

If $f$ is bounded, then $\mathcal{D}(f) = H$ and $f(T)$ is a bounded normal operator. If $f$ is real, then $f(T)$ is self-adjoint.

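The following example will be useful in the next section.

Example 9.21. The spectral measure of the position operator of Examples 9.2 and 9.11,
\[ Q = \int_{\mathbf{R}} t\,dE(t), \]
is $E(B) = \chi_B\cdot$ and $dE_{\varphi,\psi}(t) = \varphi(t)\overline{\psi(t)}\,dt$, i.e.,
\[ \int_{\mathbf{R}} t\,dE_{\varphi,\psi}(t) = (Q\varphi, \psi)_2 = \int_{\mathbf{R}} t\varphi(t)\overline{\psi(t)}\,dt \qquad (\varphi \in \mathcal{D}(Q)). \]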


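This is proved by defining $F(B)\psi := \chi_B\psi$ for every Borel set $B \subset \mathbf{R}$; that is, $F(B) = \chi_B\cdot$, a multiplication operator. It is easy to check that $F : \mathcal{B}_{\mathbf{R}} \to \mathcal{L}(L^2(\mathbf{R}))$ is a spectral measure, and to show that $F = E$, we only need to see that
\[ \int_{\mathbf{R}} t\varphi(t)\overline{\psi(t)}\,dt = \int_{\mathbf{R}} t\,dF_{\varphi,\psi}(t), \]
where $F_{\varphi,\psi}(B) = (F(B)\varphi, \psi)_2 = \int_{\mathbf{R}} (F(B)\varphi)(t)\overline{\psi(t)}\,dt$. But the integral for the complex measure $F_{\varphi,\psi}$ is a Lebesgue-Stieltjes integral with the distribution function
\[ F(t) = F_{\varphi,\psi}\big((-\infty, t]\big) = (\chi_{(-\infty,t]}\varphi, \psi)_2 = \int_{-\infty}^{t} \varphi(s)\overline{\psi(s)}\,ds, \]
and then $dF(t) = \varphi(t)\overline{\psi(t)}\,dt$.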

The spectrum $\sigma(T)$ of a self-adjoint operator can be described in terms of its spectral measure $E$:

Theorem 9.22. If $T = \int_{\mathbf{R}} t\,dE(t)$ is the spectral representation of a self-adjoint operator $T$, then

(a) $\sigma(T) = \operatorname{supp} E$,

(b) $\sigma_p(T) = \{\lambda \in \mathbf{R};\ E\{\lambda\} \neq 0\}$, and

(c) $\operatorname{Im} E\{\lambda\}$ is the eigenspace of every $\lambda \in \sigma_p(T)$.

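Proof. We will use the fact that
\[ \|(T - \lambda I)x\|_H^2 = \int_{\mathbf{R}} (t - \lambda)^2\,dE_{x,x}(t) \qquad (x \in \mathcal{D}(T),\ \lambda \in \mathbf{R}), \]
which follows from Theorem 9.20(a).

(a) If $\lambda \notin \operatorname{supp} E$, then $E(\lambda - \varepsilon, \lambda + \varepsilon) = 0$ for some $\varepsilon > 0$, and
\[ \|(T - \lambda I)x\|_H^2 = \int_{(\lambda - \varepsilon, \lambda + \varepsilon)^c} (t - \lambda)^2\,dE_{x,x}(t) \geq \varepsilon^2\|x\|_H^2, \]
which means that $\lambda \notin \sigma(T)$, by Theorem 9.9.

Conversely, if $\lambda \in \operatorname{supp} E$, then $E(\lambda - 1/n, \lambda + 1/n) \neq 0$ for every $n > 0$ and we can choose $0 \neq x_n \in \operatorname{Im} E(\lambda - 1/n, \lambda + 1/n)$. Then $\operatorname{supp} E_{x_n,x_n} \subset [\lambda - 1/n, \lambda + 1/n]$, since it follows from $V \cap (\lambda - 1/n, \lambda + 1/n) = \emptyset$ that $E(V)$ and $E(\lambda - 1/n, \lambda + 1/n)$ are orthogonal and $E_{x_n,x_n}(V) = (E(V)x_n, x_n)_H = 0$. Thus
\[ \|(T - \lambda I)x_n\|_H^2 = \int_{\mathbf{R}} (t - \lambda)^2\,dE_{x_n,x_n}(t) \leq \frac{1}{n^2}\|x_n\|_H^2 \]
and $\lambda$ is an approximate eigenvalue.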


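(b) $Tx = \lambda x$ for $0 \neq x \in \mathcal{D}(T)$ if and only if $\int_{\mathbf{R}} (t - \lambda)^2\,dE_{x,x}(t) = 0$, meaning that $E_{x,x}\{\lambda\} \neq 0$ and $E_{x,x}(\mathbf{R} \setminus \{\lambda\}) = 0$.

(c) The identity $E_{x,x}(\mathbf{R} \setminus \{\lambda\}) = 0$ means that $x = E\{\lambda\}(x)$ satisfies $Tx = \lambda x$. $\square$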

Since $E(B) = E(B \cap \operatorname{supp} E)$, in the spectral representation of the self-adjoint operator $T$, $\mathbf{R}$ can be replaced by $\operatorname{supp} E = \sigma(T)$; that is,
\[ T = \int_{\mathbf{R}} t\,dE(t) = \int_{\sigma(T)} t\,dE(t), \]
and also
\[ h(T) = \int_{\mathbf{R}} h\,dE = \int_{\sigma(T)} h\,dE(t). \]
As an application, we define the square root of a positive operator:

Theorem 9.23. A self-adjoint operator $T$ is positive ($(Tx, x) \geq 0$ for all $x \in \mathcal{D}(T)$) if and only if $\sigma(T) \subset [0, \infty)$. In this case there exists a unique self-adjoint operator $R$ which is also positive and satisfies $R^2 = T$, so that $R = \sqrt{T}$, the square root of $T$.

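Proof. If $(Tx, x)_H \geq 0$ for every $x \in \mathcal{D}(T)$ and $\lambda > 0$, we have
\[ \lambda\|x\|_H^2 \leq ((T + \lambda I)x, x)_H \leq \|(T + \lambda I)x\|_H\|x\|_H, \]
so that
\[ \|(T + \lambda I)x\|_H \geq \lambda\|x\|_H \qquad (x \in \mathcal{D}(T)). \]
By Theorem 9.9 there exists $(T + \lambda I)^{-1} \in \mathcal{L}(H)$ and $-\lambda \notin \sigma(T)$.

Conversely, if $\sigma(T) \subset [0, \infty)$ and $x \in \mathcal{D}(T)$, then $(Tx, x)_H = \int_0^{\infty} t\,dE_{x,x}(t) \geq 0$. Moreover
\[ (Tx, y)_H = \int_0^{\infty} t\,dE_{x,y}(t) \qquad (x \in \mathcal{D}(T),\ y \in H). \]
Define $R = f(T)$ with $f(t) = t^{1/2}$. Then $\mathcal{D}(R) = \{x;\ \int_0^{\infty} t\,dE_{x,x} < \infty\}$, which contains $\mathcal{D}(T) = \{x;\ \int_0^{\infty} t^2\,dE_{x,x} < \infty\}$. Thus,
\[ R = \sqrt{T} = \int_0^{\infty} t^{1/2}\,dE(t). \]
From Theorem 9.29(b), $R^2 = T$, since $\mathcal{D}(f^2) = \mathcal{D}(T) \subset \mathcal{D}(f)$.

To prove the uniqueness, suppose that we also have
\[ S = \int_0^{\infty} t\,dF(t) \]
such that $S^2 = T$ and
\[ T = \int_0^{\infty} t^2\,dF(t). \]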


With the substitution $\lambda = t^2$ we obtain a spectral measure $E'(\lambda) = F(\lambda^{1/2})$ such that $T = \int_0^{\infty} \lambda\,dE'(\lambda)$. From the uniqueness of the spectral measure, $E' = E$ and then $S = R$. $\square$
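For a symmetric matrix the spectral integral reduces to a finite sum, $f(T) = \sum_j f(\mu_j)E\{\mu_j\}$, and the square root can be computed explicitly. The sketch below does this for a real symmetric $2 \times 2$ matrix with two distinct eigenvalues $\mu_1 < \mu_2$, using the projections $E\{\mu_1\} = (T - \mu_2 I)/(\mu_1 - \mu_2)$ and $E\{\mu_2\} = (T - \mu_1 I)/(\mu_2 - \mu_1)$. The sample matrix and the function names are assumptions made for the illustration.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spectral_function(t, f):
    """f(T) = f(mu1) E{mu1} + f(mu2) E{mu2} for a real symmetric 2x2 matrix
    with two distinct eigenvalues."""
    a, b, d = t[0][0], t[0][1], t[1][1]
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    mu1, mu2 = mean - radius, mean + radius
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # Spectral projections onto the two eigenspaces.
    p1 = [[(t[i][j] - mu2 * identity[i][j]) / (mu1 - mu2) for j in range(2)]
          for i in range(2)]
    p2 = [[(t[i][j] - mu1 * identity[i][j]) / (mu2 - mu1) for j in range(2)]
          for i in range(2)]
    return [[f(mu1) * p1[i][j] + f(mu2) * p2[i][j] for j in range(2)]
            for i in range(2)]

T = [[2.0, 1.0], [1.0, 2.0]]            # positive sample matrix, eigenvalues 1 and 3
R = spectral_function(T, math.sqrt)     # R = f(T) with f(t) = t^(1/2)
R_squared = mat_mul(R, R)
```

Up to rounding, `R_squared` reproduces `T`, which is the finite-dimensional form of $R^2 = T$.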

9.4. Unbounded operators in quantum mechanics

To show how unbounded self-adjoint operators are used in the fundamentals of quantum mechanics, we are going to start by studying the case of a single particle constrained to move along a line.

9.4.1. Position, momentum, and energy. In quantum mechanics, what matters about the position is the probability that the particle is in $[a, b] \subset \mathbf{R}$, and this probability is given by an integral
\[ \int_a^b |\psi(x)|^2\,dx. \]
The density distribution $|\psi(x)|^2$ is defined by some $\psi \in L^2(\mathbf{R})$, which is called the state function, such that $\int_{\mathbf{R}} |\psi(x)|^2\,dx = \|\psi\|_2^2 = 1$ is the total probability. Here $\psi$ is a complex-valued function and a complex factor $\alpha$ in $\psi$ is meaningless ($|\alpha| = 1$ is needed to obtain $\|\psi\|_2 = 1$). There is a dependence on the time, $t$, which can be considered as a parameter.

The mean position of the particle will be
\[ \mu_\psi = \int_{\mathbf{R}} x|\psi(x)|^2\,dx = \int_{\mathbf{R}} x\psi(x)\overline{\psi(x)}\,dx = \int_{\mathbf{R}} x\,dE_{\psi,\psi}(x) \]
with $dE_{\psi,\psi} = \psi(x)\overline{\psi(x)}\,dx$.

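If $Q$ denotes the position operator, $Q\varphi(x) = x\varphi(x)$, note that $\mu_\psi = (Q\psi, \psi)_2$.

The dispersion of the position with respect to its mean value is measured by the variance,
\[ \operatorname{var}_\psi = \int_{\mathbf{R}} (x - \mu_\psi)^2|\psi(x)|^2\,dx = \int_{\mathbf{R}} (x - \mu_\psi)^2\,dE_{\psi,\psi}(x) = ((Q - \mu_\psi I)^2\psi, \psi)_2. \]
Similarly, if $\int_{\mathbf{R}} |f(x)||\psi(x)|^2\,dx < \infty$, the mathematical expectation of $f$ is
\[ (9.4)\qquad \int_{\mathbf{R}} f(x)|\psi(x)|^2\,dx = (f\psi, \psi)_2 = \int_{\mathbf{R}} f(x)\,dE_{\psi,\psi}(x). \]
The momentum of the particle is defined as mass $\times$ velocity:
\[ p = m\dot{x}. \]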
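As a numerical illustration of (9.4), the total probability, the mean position and the variance can be approximated by a midpoint rule. The state below is a Gaussian chosen for the example (it is not taken from the text), for which $|\psi(x)|^2 = \pi^{-1/2}e^{-(x-1)^2}$ has mean $1$ and variance $1/2$; the function names and the integration window are also assumptions.

```python
import math

def expectation(f, psi, a=-20.0, b=20.0, steps=40000):
    """Midpoint-rule approximation of the integral of f(x) |psi(x)|^2 over [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        x = a + (k + 0.5) * h
        total += f(x) * abs(psi(x)) ** 2
    return total * h

def gaussian_state(x, center=1.0):
    """Normalized Gaussian state centered at `center` (an assumed sample state)."""
    return math.pi ** -0.25 * math.exp(-(x - center) ** 2 / 2)

total_probability = expectation(lambda x: 1.0, gaussian_state)
mean = expectation(lambda x: x, gaussian_state)                    # (Q psi, psi)
variance = expectation(lambda x: (x - mean) ** 2, gaussian_state)  # ((Q - mean)^2 psi, psi)
```

The three numbers approximate $1$, $1$ and $1/2$ respectively; the window $[-20, 20]$ is wide enough for the neglected tails to be far below rounding level.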


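Note that, from the properties of the Fourier transform,
\[ (9.5)\qquad \int_{\mathbf{R}} \xi|\hat{\psi}(\xi)|^2\,d\xi = \int_{\mathbf{R}} \xi\hat{\psi}(\xi)\overline{\hat{\psi}(\xi)}\,d\xi = \frac{1}{2\pi i}\int_{\mathbf{R}} \widehat{\psi'}(\xi)\overline{\hat{\psi}(\xi)}\,d\xi = \frac{1}{2\pi i}(\psi', \psi)_2. \]
By assuming that the probability that $p \in [a, b]$ is given by
\[ \frac{1}{h}\int_a^b \Big|\hat{\psi}\Big(\frac{p}{h}\Big)\Big|^2\,dp = \int_{a/h}^{b/h} |\hat{\psi}(\xi)|^2\,d\xi, \]
where $h = 6.62607095(44) \cdot 10^{-34}\ \mathrm{J \cdot s}$ is the Planck constant,$^7$ the average value of $p$ is
\[ \frac{1}{h}\int_{\mathbf{R}} p\,\Big|\hat{\psi}\Big(\frac{p}{h}\Big)\Big|^2\,dp = h\int_{\mathbf{R}} \xi|\hat{\psi}(\xi)|^2\,d\xi. \]
Here the Fourier transform can be avoided by considering the momentum operator $P$ defined as
\[ P = \frac{h}{2\pi i}D \qquad \Big(D = \frac{d}{dx}\Big), \]
since then, as noted in (9.5), this average is
\[ (9.6)\qquad h\int_{\mathbf{R}} \xi|\hat{\psi}(\xi)|^2\,d\xi = (P\psi, \psi)_2. \]
If $\int_{\mathbf{R}} |f(h\xi)||\hat{\psi}(\xi)|^2\,d\xi < \infty$, then the functional calculus gives the value
\[ \mu_\psi(f) = \int_{\mathbf{R}} f(h\xi)|\hat{\psi}(\xi)|^2\,d\xi \]
for the mathematical expectation of $f$, which in the case $f(p) = p^n$ is
\[ \mu_\psi(p^n) = (P^n\psi, \psi)_2. \]
The kinetic energy is
\[ T = \frac{p^2}{2m}, \]
so that its mathematical expectation will be
\[ \mu_\psi(T) = \frac{1}{2m}(P^2\psi, \psi)_2. \]
The potential energy is given by a real-valued function $V(x)$ and from (9.4) we obtain the value
\[ \mu_\psi(V) = \int_{\mathbf{R}} V(x)|\psi(x)|^2\,dx = (V\psi, \psi)_2 \]
for the mathematical expectation of $V$ if $\int_{\mathbf{R}} |V(x)||\psi(x)|^2\,dx < \infty$.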
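The right-hand side of (9.5) can be evaluated without computing any Fourier transform. As an illustration, the sketch below assumes the wave packet $\psi(x) = \pi^{-1/4}e^{-x^2/2}e^{2\pi i k x}$, for which $\psi' = (-x + 2\pi i k)\psi$ and hence $\frac{1}{2\pi i}(\psi', \psi)_2 = k$; the mean momentum $(P\psi, \psi)_2$ is then $hk$. The packet, the value of $k$ and the function names are our own choices.

```python
import cmath
import math

def wave_packet(x, k):
    """Normalized Gaussian modulated by exp(2 pi i k x) (an assumed sample state)."""
    return math.pi ** -0.25 * math.exp(-x * x / 2) * cmath.exp(2j * math.pi * k * x)

def wave_packet_derivative(x, k):
    """Exact derivative of the wave packet with respect to x."""
    return (-x + 2j * math.pi * k) * wave_packet(x, k)

def mean_frequency(k, a=-20.0, b=20.0, steps=40000):
    """Midpoint-rule value of (1 / (2 pi i)) (psi', psi)_2, cf. (9.5)."""
    h = (b - a) / steps
    total = 0j
    for n in range(steps):
        x = a + (n + 0.5) * h
        total += wave_packet_derivative(x, k) * wave_packet(x, k).conjugate()
    return total * h / (2j * math.pi)

frequency = mean_frequency(1.5)     # multiplying by the Planck constant gives (P psi, psi)
```

The computed value is close to $k = 1.5$, with an imaginary part at rounding level, as expected for the mean of the self-adjoint operator $\frac{1}{2\pi i}D$.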

$^{7}$ This is the value reported in October 2007 by the National Physical Laboratory for this constant, named in honor of Max Planck, considered to be the founder of quantum theory in 1901 when, in his description of the black-body radiation, he assumed that the electromagnetic energy could be emitted only in quantized form, $E=h\nu$, where $\nu$ is the frequency of the radiation.


The mathematical expectation is additive, so that the average of the total energy is
$$\mu_\psi(T+V)=\Big(\frac{1}{2m}P^2\psi+V\psi,\psi\Big)_2=(H\psi,\psi)_2,$$
where
$$H=\frac{1}{2m}P^2+V$$
is the energy operator, or Hamiltonian, of the particle.

9.4.2. States, observables, and Hamiltonian of a quantum system. As in the case of classical mechanics, the basic elements in the description of a general quantum system are those of state and observable. Classical mechanics associates with a given system a phase space, so that for an $N$-particle system we have a $6N$-dimensional phase space. Similarly, quantum mechanics associates with a given system a complex Hilbert space $\mathcal{H}$ as the state space, which is $L^2(\mathbb{R})$ in the case of a single particle on the line. In a quantum system the observables are self-adjoint operators, such as the position, momentum, and energy operators. A quantum system, in the Schrödinger picture, is ruled by the following postulates:

Postulate 1: States and observables
A state of a physical system at time $t$ is a line $[\psi]\subset\mathcal{H}$, which we represent by $\psi\in\mathcal{H}$ such that $\|\psi\|_{\mathcal{H}}=1$. A wave function is an $\mathcal{H}$-valued function of the time parameter, $t\in\mathbb{R}\mapsto\psi(t)\in\mathcal{H}$. If $\psi(t)$ describes the state, then $c\psi(t)$, for any nonzero constant $c$, represents the same state.
The observable values of the system are magnitudes such as position, momentum, angular momentum, spin, charge, and energy that can be measured. They are associated to self-adjoint operators. In a quantum system, an observable is a time-independent$^{8}$ self-adjoint operator $A$ on $\mathcal{H}$, which has a spectral representation
$$A=\int_{\mathbb{R}}\lambda\,dE(\lambda).$$
By the “superposition principle”, all self-adjoint operators on $\mathcal{H}$ are assumed to be observable,$^{9}$ and all lines $[\psi]\subset\mathcal{H}$ are admissible states.

$^{8}$ In the Heisenberg picture of quantum mechanics, the observables are represented by time-dependent operator-valued functions $A(t)$ and the state $\psi$ is time-independent.
$^{9}$ Here we are following the early assumptions of quantum mechanics, but the existence of “superselection rules” in quantum field theories indicated that this superposition principle lacks experimental support in relativistic quantum mechanics.


The elements of the spectrum, λ ∈ σ(A), are the observable values of the observable A.

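Postulate 2: Distribution of an observable in a given state
The values $\lambda\in\sigma(A)$ in a state $\psi$ are observable in terms of a probability distribution $P_\psi^A$.
As in the case of the position operator $Q$ for the single particle on $\mathbb{R}$, the observable $A=\int_{\mathbb{R}}\lambda\,dE(\lambda)$ on $\mathcal{H}$ is evaluated in a state $\psi$ at a given time in terms of the probability $P_\psi^A(B)$ of belonging to a set $B\subset\mathbb{R}$ with respect to the distribution $dE_{\psi,\psi}(\lambda)$ (we are assuming that $\|\psi\|_{\mathcal{H}}=1$), so that
$$P_\psi^A(B)=\int_B dE_{\psi,\psi}(\lambda)=(E(B)\psi,\psi)_{\mathcal{H}}$$
and the mean value is
$$\widehat{A}_\psi:=\int_{\mathbb{R}}\lambda\,dE_{\psi,\psi}(\lambda)=(A\psi,\psi)_{\mathcal{H}}.$$
When $\psi\in\mathcal{D}(A)$, this mean value $\widehat{A}_\psi$ exists, since $\lambda^2$ is integrable with respect to the finite measure $E_{\psi,\psi}$, and also $\int_{\mathbb{R}}|\lambda|\,dE_{\psi,\psi}(\lambda)<\infty$. In general, if $f$ is $E_{\psi,\psi}$-integrable,
$$\widehat{f(A)}_\psi=(f(A)\psi,\psi)_{\mathcal{H}}$$
is the expected value of $f$, the mean value with respect to $E_{\psi,\psi}$.
The variance of $A$ in the state $\psi\in\mathcal{D}(A)$ is then
$$\operatorname{var}_\psi(A)=\int_{\mathbb{R}}(\lambda-\widehat{A}_\psi)^2\,dE_{\psi,\psi}(\lambda)=((A-\widehat{A}_\psi I)^2\psi,\psi)_{\mathcal{H}}=\|A\psi-\widehat{A}_\psi\psi\|_{\mathcal{H}}^2.$$
It is said that $A$ certainly takes the value $\lambda_0$ in the state $\psi$ if $\widehat{A}_\psi=\lambda_0$ and $\operatorname{var}_\psi(A)=0$.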

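This means that $\psi$ is an eigenvector of $A$ with eigenvalue $\lambda_0$, since it follows from $A\psi=\lambda_0\psi$ that $\widehat{A}_\psi=(A\psi,\psi)_{\mathcal{H}}=\lambda_0$, and also
$$\operatorname{var}_\psi(A)=\|A\psi-\widehat{A}_\psi\psi\|_{\mathcal{H}}^2=0.$$
Conversely, $\operatorname{var}_\psi(A)=0$ if and only if $A\psi-\widehat{A}_\psi\psi=0$.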
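A finite-dimensional model may help to see this characterization at work. The sketch below is only an illustration under stated assumptions: the Hilbert space is replaced by $\mathbb{C}^2$, the observable by a hypothetical Hermitian matrix `A`, and the helper names are invented for the example. It computes the mean value $(A\psi,\psi)$ and the variance $\|A\psi-\widehat{A}_\psi\psi\|^2$ for an eigenvector and for a vector that is not an eigenvector.

```python
import math

def apply(A, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def inner(u, v):
    """Inner product, linear in the first argument: sum of u_k * conj(v_k)."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(u, v))

def mean_and_variance(A, psi):
    """Return ((A psi, psi), ||A psi - mean * psi||^2) for a unit vector psi."""
    a_psi = apply(A, psi)
    mean = inner(a_psi, psi).real
    residual = [y - mean * x for y, x in zip(a_psi, psi)]
    return mean, inner(residual, residual).real

A = [[2.0, 1.0], [1.0, 2.0]]                        # Hermitian, eigenvalues 1 and 3
eigen_state = [1 / math.sqrt(2), 1 / math.sqrt(2)]  # eigenvector for the eigenvalue 3
other_state = [1.0, 0.0]                            # unit vector, not an eigenvector

eigen_mean, eigen_var = mean_and_variance(A, eigen_state)
other_mean, other_var = mean_and_variance(A, other_state)
print(eigen_mean, eigen_var, other_mean, other_var)
```

Up to rounding, the eigenvector gives mean 3 with variance 0, while the second state gives mean 2 with variance 1, so only in the first case does the observable "certainly" take a value.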

Postulate 3: Hamiltonians and the Schrödinger equation
There is an observable, $H$, the Hamiltonian, defining the evolution of the system
$$\psi(t)=U_t\psi_0,$$
where $\psi_0$ is the initial state and $U_t$ is an operator defined as follows:

If $h$ is the Planck constant and $g_t(\lambda)=e^{-\frac{it}{h}\lambda}$, a continuous function with its values in the unit circle, then using the functional calculus, we can define the unitary operators
$$U_t:=g_t(H)\in\mathcal{L}(\mathcal{H})\qquad(t\in\mathbb{R})$$
that satisfy the conditions
$$U_0=I,\qquad U_sU_t=U_{s+t},\qquad\text{and}\qquad\lim_{t\to s}\|U_tx-U_sx\|_{\mathcal{H}}=0\quad\forall x\in\mathcal{H},$$
since $g_t\bar{g}_t=1$, $g_0=1$, $g_sg_t=g_{s+t}$, and, if $H=\int_{\sigma(H)}\lambda\,dE(\lambda)$ is the spectral representation of $H$, then the continuity property
$$\|U_t\psi-U_s\psi\|_{\mathcal{H}}^2=\int_{\sigma(H)}\big|e^{-\frac{it}{h}\lambda}-e^{-\frac{is}{h}\lambda}\big|^2\,dE_{\psi,\psi}(\lambda)\to0\quad\text{as }t\to s$$
follows from the dominated convergence theorem.

Such a family of operators $U_t$ is called a strongly continuous one-parameter group of unitary operators, and we say that $A=-(i/h)H$ is the infinitesimal generator. It can be shown (Stone’s theorem) that the converse is also true: every strongly continuous one-parameter group of unitary operators $\{U_t\}_{t\in\mathbb{R}}$ has an infinitesimal generator of the form $A=-(i/h)H$ with $H$ self-adjoint; that is, $U_t=e^{-\frac{it}{h}H}$ for some self-adjoint operator $H$. It is said that $U_t=e^{-\frac{it}{h}H}$ is the time-evolution operator of the system.

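It is worth noticing that, if $\psi\in\mathcal{D}(H)$, the function $t\mapsto U_t\psi$ is differentiable and
$$\frac{d}{dt}U_t\psi=U_tA\psi=AU_t\psi$$
at every point $t\in\mathbb{R}$. Indeed,
$$\frac{1}{s}(U_sU_t\psi-U_t\psi)=U_t\,\frac{1}{s}(U_s\psi-\psi)$$
and
$$\lim_{s\to0}\frac{1}{s}(U_s\psi-\psi)=A\psi=-\frac{i}{h}H\psi,\qquad(9.7)$$
since
$$\Big\|\frac{1}{s}(U_s\psi-\psi)+\frac{i}{h}H\psi\Big\|_{\mathcal{H}}^2=\int_{\sigma(H)}\Big|\frac{e^{-is\lambda/h}-1}{s}+\frac{i}{h}\lambda\Big|^2\,dE_{\psi,\psi}(\lambda)\to0$$
as $s\to0$, again by dominated convergence.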
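The group properties of $U_t$ and the limit (9.7) can be observed in a finite-dimensional model. The following sketch assumes a diagonal Hamiltonian with a hypothetical list of eigenvalues `ENERGIES` and units in which the constant $h$ equals 1; in this model $U_t=g_t(H)$ is simply multiplication of each coordinate by $e^{-it\lambda/h}$.

```python
import cmath

PLANCK = 1.0                  # assumption: units in which h = 1
ENERGIES = [0.5, 1.5, 2.5]    # hypothetical spectrum of a diagonal Hamiltonian

def evolve(t, psi):
    """U_t psi = g_t(H) psi with g_t(lam) = exp(-i t lam / h)."""
    return [cmath.exp(-1j * t * lam / PLANCK) * c for lam, c in zip(ENERGIES, psi)]

def norm(v):
    return sum(abs(c) ** 2 for c in v) ** 0.5

def distance(u, v):
    return norm([a - b for a, b in zip(u, v)])

psi0 = [0.6, 0.0, 0.8j]       # unit vector playing the role of the initial state
s, t = 0.7, -1.9

group_defect = distance(evolve(s, evolve(t, psi0)), evolve(s + t, psi0))
norm_defect = abs(norm(evolve(t, psi0)) - 1.0)

# Difference quotient (U_s psi - psi)/s compared with A psi = -(i/h) H psi.
step = 1e-6
quotient = [(a - b) / step for a, b in zip(evolve(step, psi0), psi0)]
generator = [-1j * lam / PLANCK * c for lam, c in zip(ENERGIES, psi0)]
derivative_defect = distance(quotient, generator)
print(group_defect, norm_defect, derivative_defect)
```

The first two numbers vanish up to rounding, reflecting $U_sU_t=U_{s+t}$ and unitarity; the third is of the order of `step`, as expected from a first-order difference quotient.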

For a given initial state $\psi_0$, it is said that $\psi(t)=U_t\psi_0$ is the corresponding wave function.


If $\psi(t)\in\mathcal{D}(H)$ for every $t\in\mathbb{R}$, then the vector-valued function $t\mapsto\psi(t)$ is differentiable and satisfies the Schrödinger equation$^{10}$
$$ih\,\psi'(t)=H\psi(t),$$
since by (9.7)
$$\psi'(t)=\frac{d}{dt}(U_t\psi_0)=AU_t\psi_0=-\frac{i}{h}H\psi(t).$$
In this way, from a given initial state, subsequent states can be calculated causally from the Schrödinger equation.$^{11}$

9.4.3. The Heisenberg uncertainty principle and compatible observables. To illustrate the role of probabilities in the postulates, let us consider again the case of a single particle on $\mathbb{R}$. Recall that the momentum operator,
$$P\psi(q)=\frac{h}{2\pi i}\psi'(q),$$
is self-adjoint on $L^2(\mathbb{R})$ and with domain $H^1(\mathbb{R})$. From Example 9.4 we know that the commutator of $P$ and $Q$ is bounded and
$$[P,Q]=PQ-QP=\frac{h}{2\pi i}I,$$
where $\mathcal{D}([P,Q])=\mathcal{D}(PQ)\cap\mathcal{D}(QP)$, or extended to all $L^2(\mathbb{R})$ by continuity.
Lemma 9.24. The commutator $C=[S,T]=ST-TS$ of two self-adjoint operators on $L^2(\mathbb{R})$ satisfies the estimate
$$|\widehat{C}_\psi|\le2\sqrt{\operatorname{var}_\psi(S)}\,\sqrt{\operatorname{var}_\psi(T)}$$
for every $\psi\in\mathcal{D}(C)$.
Proof. Obviously, $A=S-\widehat{S}_\psi I$ and $B=T-\widehat{T}_\psi I$ are self-adjoint (note that $\widehat{S}_\psi,\widehat{T}_\psi\in\mathbb{R}$) and $C=[A,B]$. From the definition of the expected value,
$$|\widehat{C}_\psi|\le|(B\psi,A\psi)_2|+|(A\psi,B\psi)_2|\le2\|B\psi\|_2\|A\psi\|_2,$$
where, $A$ being self-adjoint, $\|A\psi\|_2^2=(A^2\psi,\psi)_2=\operatorname{var}_\psi(S)$. Similarly, $\|B\psi\|_2^2=\operatorname{var}_\psi(T)$. □

$^{10}$ E. Schrödinger published his equation and the spectral analysis of the hydrogen atom in a series of four papers in 1926, which were followed the same year by Max Born’s interpretation of $\psi(t)$ as a probability density.
$^{11}$ We have assumed that the energy is constant and the Hamiltonian does not depend on $t$ but, if the system interacts with another one, the Hamiltonian is an operator-valued function $H(t)$ of the time parameter. In the Schrödinger picture, all the observables except the Hamiltonian are time-invariant.


Theorem 9.25 (Uncertainty principle).
$$\sqrt{\operatorname{var}_\psi(Q)}\,\sqrt{\operatorname{var}_\psi(P)}\ge\frac{h}{4\pi}.$$
Proof. In the case $C=[P,Q]$, $|\widehat{C}_\psi|=|((h/2\pi i)I\psi,\psi)_2|=h/2\pi$ and we can apply Lemma 9.24. □
The standard deviations $\sqrt{\operatorname{var}_\psi(Q)}$ and $\sqrt{\operatorname{var}_\psi(P)}$ measure the uncertainties of the position and momentum, and the uncertainty principle shows that both uncertainties cannot be arbitrarily small simultaneously. Position and momentum are said to be incompatible observables.
It is a basic principle of all quantum theories that if $n$ observables $A_1,\dots,A_n$ are compatible in the sense of admitting arbitrarily accurate simultaneous measurements, they must commute. However, since these operators are only densely defined, the commutators $[A_j,A_k]$ are not always densely defined. Moreover, the condition $AB=BA$ for two commuting operators is unsatisfactory; for example, taking it literally, $A0\ne0A$ if $A$ is unbounded, but $0A\subset A0$ and $[A,0]=0$ on the dense domain of $A$.
This justifies saying that $A_j=\int_{\mathbb{R}}\lambda\,dE^j(\lambda)$ and $A_k=\int_{\mathbb{R}}\lambda\,dE^k(\lambda)$ commute, or that their spectral measures commute, if
$$[E^j(B_1),E^k(B_2)]=0\qquad(B_1,B_2\in\mathcal{B}_{\mathbb{R}}).\qquad(9.8)$$

If both $A_j$ and $A_k$ are bounded, then this requirement is equivalent to $[A_j,A_k]=0$ (see Exercise 9.18).$^{12}$ For such commuting observables and a given (normalized) state $\psi$, there is a probability measure $P_\psi$ on $\mathbb{R}^n$ so that
$$P_\psi(B_1\times\cdots\times B_n)=(E^1(B_1)\cdots E^n(B_n)\psi,\psi)_{\mathcal{H}}$$
is the predicted probability that a measurement to determine the values $\lambda_1,\dots,\lambda_n$ of the observables $A_1,\dots,A_n$ will lie in $B=B_1\times\cdots\times B_n$. See Exercise 9.19, where it is shown how a spectral measure $E$ on $\mathbb{R}^n$ can be defined so that $dE_{\psi,\psi}$ is the distribution of this probability; with this spectral measure there is an associated functional calculus $f(A_1,\dots,A_n)$ of $n$ commuting observables.
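The bound of the uncertainty principle can be tested numerically on explicit states. The sketch below is an illustration under assumptions that are not part of the text: units with $h=1$, a midpoint rule for the integrals, and two real test states (a normalized Gaussian and the normalized function $x e^{-x^2/2}$) for which both mean values vanish, so that $\operatorname{var}_\psi(Q)=\|x\psi\|_2^2$ and $\operatorname{var}_\psi(P)=(h/2\pi)^2\|\psi'\|_2^2$.

```python
import math

PLANCK = 1.0  # assumption: units in which h = 1

def integrate(f, lo=-12.0, hi=12.0, n=24000):
    """Midpoint rule on [lo, hi]; adequate for rapidly decaying integrands."""
    dx = (hi - lo) / n
    return dx * sum(f(lo + (k + 0.5) * dx) for k in range(n))

def uncertainty_product(psi, dpsi):
    """sqrt(var Q) * sqrt(var P) for a real, even or odd, normalized psi.

    For such psi the mean values of Q and P are zero, so
    var Q = ||x psi||^2 and var P = (h / 2 pi)^2 ||psi'||^2.
    """
    var_q = integrate(lambda x: (x * psi(x)) ** 2)
    var_p = (PLANCK / (2 * math.pi)) ** 2 * integrate(lambda x: dpsi(x) ** 2)
    return math.sqrt(var_q) * math.sqrt(var_p)

def gauss(x):
    return math.pi ** -0.25 * math.exp(-x * x / 2)

def d_gauss(x):
    return -x * gauss(x)

C1 = math.sqrt(2 / math.sqrt(math.pi))  # normalizes x * exp(-x^2 / 2)

def first(x):
    return C1 * x * math.exp(-x * x / 2)

def d_first(x):
    return C1 * (1 - x * x) * math.exp(-x * x / 2)

bound = PLANCK / (4 * math.pi)
gauss_product = uncertainty_product(gauss, d_gauss)
first_product = uncertainty_product(first, d_first)
print(bound, gauss_product, first_product)
```

The Gaussian attains the bound $h/4\pi$, while the second state gives three times that value, consistent with the inequality.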

9.4.4. The harmonic oscillator. A heuristic recipe to determine a quantum system from a classical system of energy
$$T+V=\sum_{j=1}^n\frac{p_j^2}{2m_j}+V(q_1,\dots,q_n)$$

12 But E. Nelson proved in 1959 that there exist essentially self-adjoint operators A1 and A2 with a common and invariant domain, so that [A1,A2] is defined on this domain and [A1,A2]=0 but with noncommuting spectral measures.


is to make a formal substitution of the generalized coordinates $q_j$ by the position operators $Q_j$ (multiplication by $q_j$) and every $p_j$ by the corresponding momentum $P_j$. Then the Hamiltonian or energy operator should be a self-adjoint extension of
$$H=\sum_{j=1}^n\frac{P_j^2}{2m_j}+V(Q_1,\dots,Q_n).$$
For instance, in the case of the two-body problem under Coulomb force, which derives from the potential $-e/|x|$, $n=3$ and the energy of the system is
$$E=T+V=\frac{1}{2m}|p|^2-\frac{e^2}{|x|}.$$
Hence, in a convenient scale, $H=-\Delta-\frac{1}{|x|}$ is the possible candidate for the Hamiltonian of the hydrogen atom. In Example 9.15 we have seen that it is a self-adjoint operator with domain $H^2(\mathbb{R}^3)$.
With the help of his friend Hermann Weyl, Schrödinger calculated the eigenvalues of this operator. The coincidence of his results with the spectral lines of the hydrogen atom was considered important evidence for the validity of Schrödinger’s model for quantum mechanics.

Several problems appear with this quantization process, such as finding the self-adjoint extension of H, determining the spectrum, and describing the evolution of the system for large values of t (“scattering”).

Let us consider again the simple classical one-dimensional case of a single particle with mass $m$, now in a Newtonian field with potential $V$, so that
$$-\nabla V=F=\frac{d}{dt}(m\dot{q}),$$
$q$ denoting the position. We have the linear momentum $p=m\dot{q}$, the kinetic energy $T=(1/2)m\dot{q}^2=p^2/2m$, and the total energy $E=T+V$.

The classical harmonic oscillator corresponds to the special case of the field $F(q)=-m\omega^2q$ on a particle bound to the origin by the potential
$$V(q)=m\frac{\omega^2}{2}q^2$$
if $q\in\mathbb{R}$ is the position variable. Hence, in this case,
$$E=T+V=\frac{1}{2}m\dot{q}^2+m\frac{\omega^2}{2}q^2=\frac{1}{2m}p^2+m\frac{\omega^2}{2}q^2.$$
From Newton’s second law, the initial state $\dot{q}(0)=0$ and $q(0)=a>0$ determines the state of the system at every time, $q=a\cos(\omega t)$.


The state space for the quantum harmonic oscillator is $L^2(\mathbb{R})$, and the position $Q=q\cdot$ and the momentum $P$ are two observables. By making the announced substitutions, we obtain as a possible Hamiltonian the operator
$$H=\frac{1}{2m}P^2+m\frac{\omega^2}{2}Q^2.$$
On the domain $\mathcal{S}(\mathbb{R})$, which is dense in $L^2(\mathbb{R})$, it is readily checked that
$$(H\varphi,\psi)_2=(\varphi,H\psi)_2,$$
so that $H$ is a symmetric operator. We will prove that it is essentially self-adjoint and the Hamiltonian will be its unique self-adjoint extension $\bar{H}=H^{**}$, which is also denoted $H$. In coordinates,
$$H=-\frac{h^2}{2m\cdot4\pi^2}\frac{d^2}{dq^2}+\frac{m\omega^2}{2}q^2,$$
which after the substitution $x=aq$, with $a^2=2\pi m\omega/h$, can be written
$$H=\frac{h\omega}{2}\Big(x^2-\frac{d^2}{dx^2}\Big).$$
Without loss of generality, we suppose $h\omega=1$, and it will be useful to consider the action of
$$H=\frac{1}{2}\Big(x^2-\frac{d^2}{dx^2}\Big)$$
on
$$\mathcal{F}:=\{P(x)e^{-x^2/2};\ P\text{ polynomial}\},$$
the linear subspace of $\mathcal{S}(\mathbb{R})$ that has the functions $x^ne^{-x^2/2}$ as an algebraic basis. Since $H(x^ne^{-x^2/2})\in\mathcal{F}$, we have $H(\mathcal{F})\subset\mathcal{F}$. Similarly, $A(\mathcal{F})\subset\mathcal{F}$ and $B(\mathcal{F})\subset\mathcal{F}$ if
$$A:=\frac{1}{\sqrt{2}}\Big(x+\frac{d}{dx}\Big),$$
the annihilation operator, and
$$B:=\frac{1}{\sqrt{2}}\Big(x-\frac{d}{dx}\Big),$$
the creation operator.
Theorem 9.26. The subspace $\mathcal{F}$ of $\mathcal{S}(\mathbb{R})$ is dense in $L^2(\mathbb{R})$, and the Gram-Schmidt process applied to $\{x^ne^{-x^2/2}\}_{n=0}^\infty$ generates an orthonormal basis $\{\tilde\psi_n\}_{n=0}^\infty$ of $L^2(\mathbb{R})$. The functions $\tilde\psi_n$ are in the domain $\mathcal{S}(\mathbb{R})$ of $H$ and they are eigenfunctions with eigenvalues $\lambda_n=n+1/2$. According to Theorem 9.17, the operator $H$ is essentially self-adjoint.


Proof. On $\mathcal{F}$, a simple computation gives
$$H=BA+\frac{1}{2}I=AB-\frac{1}{2}I;$$
hence $HB=BAB+\frac{1}{2}B$ and $BH=BAB-\frac{1}{2}B$, so that $[H,B]=B$. Then, if $H\psi=\lambda\psi$ and $B\psi\ne0$ with $\psi\in\mathcal{F}$, it follows that $\lambda+1$ is also an eigenvalue of $H$, with the eigenfunction $B\psi$, since
$$H(B\psi)=B(H\psi)+B\psi=\lambda B\psi+B\psi=(\lambda+1)B\psi.$$
For
$$\psi_0(x):=e^{-x^2/2},$$
we have $2H\psi_0(x)=x^2e^{-x^2/2}-(e^{-x^2/2})''=e^{-x^2/2}$, so that
$$H\psi_0=\frac{1}{2}\psi_0$$
and $\psi_0$ is an eigenfunction with eigenvalue $1/2$.
We have $\sqrt{2}B\psi_0(x)=2xe^{-x^2/2}\ne0$ and, if we denote
$$\psi_n:=(\sqrt{2}B)^n\psi_0=\sqrt{2}B\psi_{n-1},$$
from the above remarks we obtain
$$H\psi_n=\Big(n+\frac{1}{2}\Big)\psi_n\qquad(n=0,1,2,\dots)$$
and $\psi_n(x)=H_n(x)e^{-x^2/2}$. By induction over $n$, it follows that $H_n$ is a polynomial with degree $n$. It is called a Hermite polynomial.

The functions $\psi_n$ are mutually orthogonal, since they are eigenfunctions with different eigenvalues, and they generate $\mathcal{F}$.
To prove that $\mathcal{F}$ is a dense subspace of $L^2(\mathbb{R})$, let $f\in L^2(\mathbb{R})$ be such that $\int_{\mathbb{R}}f(x)x^ne^{-x^2/2}\,dx=(x^ne^{-x^2/2},\bar{f})_2=0$ for all $n\in\mathbb{N}$. Then
$$F(z):=\int_{\mathbb{R}}f(x)e^{-x^2/2}e^{-2\pi ixz}\,dx$$
is defined and continuous on $\mathbb{C}$, and the Morera theorem shows that $F$ is an entire function, with
$$F^{(n)}(z)=(-2\pi i)^n\int_{\mathbb{R}}x^nf(x)e^{-x^2/2}e^{-2\pi ixz}\,dx.$$
But $F^{(n)}(0)=(-2\pi i)^n(x^ne^{-x^2/2},\bar{f})_2=0$ for all $n\in\mathbb{N}$, so that $F=0$. From the Fourier inversion theorem we obtain $f(x)e^{-x^2/2}=0$ and $f=0$.
It follows that the eigenfunctions $\tilde\psi_n:=\|\psi_n\|_2^{-1}\psi_n$ of $H$ are the elements of an orthonormal basis of $L^2(\mathbb{R})$, all of them contained in $\mathcal{S}(\mathbb{R})$, which is the domain of the essentially self-adjoint operator $H$. □
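The recursion $\psi_n=\sqrt{2}B\psi_{n-1}$ used in the proof can be carried out exactly on the polynomial factors. A minimal sketch, assuming polynomials are stored as integer coefficient lists `[c0, c1, ...]` (the helper names are invented for the example): since $\sqrt{2}B(pe^{-x^2/2})=(2xp-p')e^{-x^2/2}$ and $2H(pe^{-x^2/2})=(p+2xp'-p'')e^{-x^2/2}$, both operators act on the polynomial $p$ alone.

```python
def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def derivative(p):
    """Derivative of the polynomial with coefficient list p."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def add(p, q, scale=1):
    """Return p + scale * q as a trimmed coefficient list."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return trim([a + scale * b for a, b in zip(p, q)])

def two_x_times(p):
    """Multiply the polynomial p by 2x."""
    return [0] + [2 * c for c in p]

def raise_level(p):
    """Polynomial part of sqrt(2) B (p e^{-x^2/2}): 2x p - p'."""
    return add(two_x_times(p), derivative(p), scale=-1)

def twice_hamiltonian(p):
    """Polynomial part of 2H (p e^{-x^2/2}): p + 2x p' - p''."""
    dp = derivative(p)
    return add(add(p, two_x_times(dp)), derivative(dp), scale=-1)

hermite = [[1]]                      # H_0 = 1
for n in range(1, 6):
    hermite.append(raise_level(hermite[-1]))

# 2 H psi_n = (2n + 1) psi_n, checked exactly on the coefficients.
eigen_checks = [twice_hamiltonian(p) == [(2 * n + 1) * c for c in p]
                for n, p in enumerate(hermite)]
print(hermite[3], eigen_checks)
```

The list `hermite[3]` encodes $H_3(x)=8x^3-12x$, and every entry of `eigen_checks` is `True`, in agreement with $H\psi_n=(n+1/2)\psi_n$.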


Remark 9.27. In the general setting, for any $m$ and $h$,
$$H=\frac{h\omega}{2}\Big(x^2-\frac{d^2}{dx^2}\Big),$$
and we have $H\psi_n=h\omega(n+\frac{1}{2})\psi_n$. Thus
$$\sigma(H)=\{h\omega/2,\ h\omega(1+1/2),\ h\omega(2+1/2),\dots\}.$$
The wave functions $\psi_n$ are known as the bound states, and the numbers are the energy eigenvalues of these bound states. The minimal energy is $h\omega/2$,$^{13}$ and $\psi_0$ is the “ground state”.

9.5. Appendix: Proof of the spectral theorem

The proof of Theorem 9.20 will be obtained in several steps. First, in Theorem 9.28, we define a functional calculus with bounded functions for spectral measures. Then this functional calculus will be extended to unbounded functions in Theorem 9.29. The final step will prove the spectral theorem for unbounded self-adjoint operators by the von Neumann method based on the use of the Cayley transform.

9.5.1. Functional calculus of a spectral measure. Our first step in the proof of the spectral theorem for unbounded self-adjoint operators will be to define a functional calculus associated to a general spectral measure

$$E:\mathcal{B}_K\to\mathcal{L}(H)$$
as the integral with respect to this operator-valued measure.
Denote by $L^\infty(E)$ the complex normed space of all $E$-essentially bounded complex functions (the functions coinciding $E$-a.e. being identified as usual) endowed with the natural operations and the norm
$$\|f\|_\infty=E\text{-}\sup|f|.$$
With the multiplication and complex conjugation, it becomes a commutative $C^*$-algebra, and the constant function 1 is the unit. Every $f\in L^\infty(E)$ has a bounded representative.
We always represent simple functions as
$$s=\sum_{n=1}^N\alpha_n\chi_{B_n}\in S(K),$$
where $\{B_1,\dots,B_N\}$ is a partition of $K$. Since every bounded measurable function is the uniform limit of simple functions, $S(K)$ is dense in $L^\infty(E)$, and we will start by defining the integral of simple functions:

13Max Planck first applied his quantum postulate to the harmonic oscillator, but he assumed that the lowest level energy was 0 instead of hω/2. See footnote 7 in this chapter.


As in the scalar case,
$$\int s\,dE:=\sum_{n=1}^N\alpha_nE(B_n)\in\mathcal{L}(H)$$
is well-defined and uniquely determined, independently of the representation of $s$, by the relation
$$\Big(\Big(\int s\,dE\Big)x,y\Big)_H=\int_Ks\,dE_{x,y}\qquad(x,y\in H),$$
since $\int_Ks\,dE_{x,y}=\sum_{n=1}^N\alpha_n(E(B_n)x,y)_H=(\sum_{n=1}^N\alpha_nE(B_n)x,y)_H$.
It is readily checked that this integral is linear, $\int1\,dE=I$, and $(\int s\,dE)^*=\int\bar{s}\,dE$. It is also multiplicative,
$$\int st\,dE=\int s\,dE\int t\,dE=\int t\,dE\int s\,dE,\qquad(9.9)$$
since for a second simple function $t$ we can suppose that $t=\sum_{n=1}^N\beta_n\chi_{B_n}$, with the same sets $B_n$ as in $s$, and then
$$\int s\,dE\int t\,dE=\sum_{n=1}^N\beta_n\Big(\int s\,dE\Big)E(B_n)=\sum_{n=1}^N\beta_n\alpha_nE(B_n)E(B_n)=\sum_{n=1}^N\alpha_n\beta_nE(B_n)=\int st\,dE.$$

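Also
$$\Big\|\Big(\int s\,dE\Big)x\Big\|_H^2=\int_K|s|^2\,dE_{x,x}\qquad(x\in H,\ s\in S(K))$$
since
$$\Big(\Big(\int s\,dE\Big)x,\Big(\int s\,dE\Big)x\Big)_H=\Big(\Big(\int s\,dE\Big)^*\Big(\int s\,dE\Big)x,x\Big)_H=\Big(\Big(\int|s|^2\,dE\Big)x,x\Big)_H.$$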
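The multiplicativity (9.9) and the norm identity just proved can be verified by hand in the smallest nontrivial case. The sketch below assumes $H=\mathbb{C}^2$, $K=\{0,1\}$, and a hypothetical spectral measure for which $E(\{0\})$ and $E(\{1\})$ are the orthogonal projections onto the lines spanned by $(1,1)$ and $(1,-1)$; a simple function is then just a pair of coefficients.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def combine(coeffs, mats):
    """Linear combination sum_n coeffs[n] * mats[n] of square matrices."""
    size = len(mats[0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(size)]
            for i in range(size)]

# E[n] = E({n}): orthogonal projections onto span(1, 1) and span(1, -1).
E = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, -0.5], [-0.5, 0.5]]]

def integral(s):
    """Integral of the simple function s = sum_n s[n] * chi_{n} against E."""
    return combine(s, E)

s = [2 + 1j, -3.0]
t = [0.5, 4j]
st = [a * b for a, b in zip(s, t)]

product = matmul(integral(s), integral(t))   # (int s dE)(int t dE)
direct = integral(st)                        # int st dE
mult_defect = max(abs(product[i][j] - direct[i][j])
                  for i in range(2) for j in range(2))

x = [1.0, 2.0]                               # a real test vector
sx = [sum(row[j] * x[j] for j in range(2)) for row in integral(s)]
lhs = sum(abs(c) ** 2 for c in sx)           # ||(int s dE) x||^2
# E_{x,x}({n}) = (E({n}) x, x); x is real, so no conjugation is needed.
weights = [sum(E[n][i][j] * x[j] * x[i] for i in range(2) for j in range(2))
           for n in range(2)]
rhs = sum(abs(a) ** 2 * w for a, w in zip(s, weights))   # int |s|^2 dE_{x,x}
print(mult_defect, lhs, rhs)
```

For these data both sides of the norm identity equal 27, and `mult_defect` vanishes up to rounding.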

This yields $\|\int s\,dE\|\le\|s\|_\infty$ and, in fact, the integral is isometric. Indeed, if we choose $n$ so that $\|s\|_\infty=|\alpha_n|$ with $E(B_n)\ne0$ and $x\in\operatorname{Im}E(B_n)$, then
$$\Big(\int s\,dE\Big)x=\alpha_nE(B_n)x=\alpha_nx$$


and necessarily $\|\int s\,dE\|=\|s\|_\infty$.
Now the integral can be extended over $L^\infty(E)$ by continuity, since it is a bounded linear map from the dense vector subspace $S(K)$ of $L^\infty(E)$ to the Banach space $\mathcal{L}(H)$. We will denote

$$\Phi_E(f):=\int f\,dE=\lim_k\int s_k\,dE$$
if $s_k\to f$ in $L^\infty(E)$ ($s_k\in S(K)$). The identities $(\Phi_E(s_k)x,y)_H=\int_Ks_k\,dE_{x,y}$ extend to
$$(\Phi_E(f)x,y)_H=\int_Kf\,dE_{x,y}$$
by taking limits. All the properties of $\Phi_E$ contained in the following theorem are now obvious:

Theorem 9.28. If $E:\mathcal{B}_K\to\mathcal{L}(H)$ is a spectral measure, then there is a unique homomorphism of $C^*$-algebras $\Phi_E:L^\infty(E)\to\mathcal{L}(H)$ such that
$$(\Phi_E(f)x,y)_H=\int_Kf\,dE_{x,y}\qquad(x,y\in H,\ f\in L^\infty(E)).$$
This homomorphism also satisfies
$$\|\Phi_E(f)x\|_H^2=\int_K|f|^2\,dE_{x,x}\qquad(x\in H,\ f\in L^\infty(E)).\qquad(9.10)$$

9.5.2. Unbounded functions of bounded normal operators. To extend the functional calculus $f(T)=\Phi_T(f)$ of a bounded normal operator with bounded functions to unbounded measurable functions $h$, we start by extending to unbounded functions the functional calculus of Theorem 9.28 for any spectral measure $E$:

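Theorem 9.29. Suppose $K$ a locally compact subset of $\mathbb{C}$, $E:\mathcal{B}_K\to\mathcal{L}(H)$ a spectral measure, $h$ a Borel measurable function on $K\subset\mathbb{C}$, and
$$\mathcal{D}(h):=\Big\{x\in H;\ \int_K|h(\lambda)|^2\,dE_{x,x}<\infty\Big\}.$$
Then there is a unique linear operator $\Phi_E(h)$ on $H$, represented as
$$\Phi_E(h)=\int_Kh\,dE,$$
with domain $\mathcal{D}(\Phi_E(h))=\mathcal{D}(h)$ and such that
$$(\Phi_E(h)x,y)_H=\int_Kh(\lambda)\,dE_{x,y}(\lambda)\qquad(x\in\mathcal{D}(h),\ y\in H).$$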


This operator is densely defined and, if $f$ and $h$ are Borel measurable functions on $K$, the following properties hold:
(a) $\|\Phi_E(h)x\|_H^2=\int_K|h|^2\,dE_{x,x}$, if $x\in\mathcal{D}(h)$.
(b) $\Phi_E(f)\Phi_E(h)\subset\Phi_E(fh)$ and $\mathcal{D}(\Phi_E(f)\Phi_E(h))=\mathcal{D}(h)\cap\mathcal{D}(fh)$.
(c) $\Phi_E(h)^*=\Phi_E(\bar{h})$ and $\Phi_E(h)^*\Phi_E(h)=\Phi_E(|h|^2)=\Phi_E(h)\Phi_E(h)^*$.

Proof. It is easy to check that $\mathcal{D}(h)$ is a linear subspace of $H$. For instance,
$$\|E(B)(x+y)\|_H^2\le2\|E(B)x\|_H^2+2\|E(B)y\|_H^2$$
so that
$$E_{x+y,x+y}(B)\le2E_{x,x}(B)+2E_{y,y}(B)$$
and $\mathcal{D}(h)+\mathcal{D}(h)\subset\mathcal{D}(h)$.
This subspace is dense. Indeed, if $y\in H$, we consider $B_n:=\{|h|\le n\}\uparrow K$, so that, from the strong $\sigma$-additivity of $E$,
$$y=E(K)y=\lim_nE(B_n)y,$$
where $x_n:=E(B_n)y\in\mathcal{D}(h)$ since
$$E(B)x_n=E(B)E(B_n)x_n=E(B\cap B_n)x_n\qquad(B\subset K)$$
and $E_{x_n,x_n}(B)=E_{x_n,x_n}(B\cap B_n)$, the restriction of $E_{x_n,x_n}$ to $B_n$, so that
$$\int_K|h|^2\,dE_{x_n,x_n}=\int_{B_n}|h|^2\,dE_{x_n,x_n}\le n^2\|x_n\|_H^2<\infty.$$
If $h$ is bounded, then let us also prove the estimate
$$\Big|\int_Kh\,dE_{x,y}\Big|\le\int_K|h|\,d|E_{x,y}|\le\Big(\int_K|h|^2\,dE_{x,x}\Big)^{1/2}\|y\|_H<\infty,\qquad(9.11)$$
where $|E_{x,y}|$ is the total variation of the Borel complex measure $E_{x,y}$.
From the polar representation of a complex measure (see Lemma 4.12), we obtain a Borel measurable function $\varrho$ such that $|\varrho|=1$ and
$$\varrho h\,dE_{x,y}=|h|\,d|E_{x,y}|.$$
Thus,
$$\Big|\int_Kh\,dE_{x,y}\Big|\le\int_K|h|\,d|E_{x,y}|=\int_K\varrho h\,dE_{x,y}=(\Phi_E(\varrho h)x,y)_H$$
$$\le\|\Phi_E(\varrho h)x\|_H\|y\|_H=\Big(\int_K|\varrho h|^2\,dE_{x,x}\Big)^{1/2}\|y\|_H=\Big(\int_K|h|^2\,dE_{x,x}\Big)^{1/2}\|y\|_H,$$
where in the second line we have used (9.10), and (9.11) holds.


When $h$ is unbounded, to define $\Phi_E(h)x$ for every $x\in\mathcal{D}(h)$, we are going to show that $y\mapsto\int_Kh\,dE_{x,y}$ is a bounded conjugate-linear form on $H$. Let us consider $h_n(z)=h(z)\chi_{B_n}(z)\to h(z)$ if $z\in K$, so that
$$\Big|\int_Kh_n\,dE_{x,y}\Big|\le\Big(\int_K|h_n|^2\,dE_{x,x}\Big)^{1/2}\|y\|_H,$$
and by letting $n\to\infty$, we also obtain (9.11) for $h$ in this unbounded case if $x\in\mathcal{D}(h)$.
Then the conjugate-linear functional $y\mapsto\int_Kh\,dE_{x,y}$ is bounded with norm $\le(\int_K|h|^2\,dE_{x,x})^{1/2}$, and by the Riesz representation theorem there is a unique $\Phi_E(h)x\in H$ such that, for every $y\in H$,
$$(\Phi_E(h)x,y)_H=\int_Kh(\lambda)\,dE_{x,y}(\lambda),\qquad\|\Phi_E(h)x\|_H\le\Big(\int_K|h|^2\,dE_{x,x}\Big)^{1/2}.$$

The operator $\Phi_E(h)$ is linear, since $E_{x,y}$ is linear in $x$, and densely defined.
We know that (a) holds if $h$ is bounded. If it is unbounded, then let $h_k=h\chi_{B_k}$ and observe that $\mathcal{D}(h-h_k)=\mathcal{D}(h)$. By dominated convergence,
$$\|\Phi_E(h)x-\Phi_E(h_k)x\|_H^2=\|\Phi_E(h-h_k)x\|_H^2\le\int_K|h-h_k|^2\,dE_{x,x}\to0$$
as $k\to\infty$; according to Theorem 9.28, every $h_k$ satisfies (a), which will follow for $h$ by letting $k\to\infty$.
To prove (b) when $f$ is bounded, we note that $\mathcal{D}(h)\subset\mathcal{D}(fh)$ and $dE_{x,\Phi_E(\bar{f})z}=f\,dE_{x,z}$, since both complex measures coincide on every Borel set. It follows that, for every $z\in H$,
$$(\Phi_E(f)\Phi_E(h)x,z)_H=(\Phi_E(h)x,\Phi_E(\bar{f})z)_H=\int_Kh\,dE_{x,\Phi_E(\bar{f})z}=(\Phi_E(fh)x,z)_H$$
and, if $x\in\mathcal{D}(h)$, we obtain from (a) that
$$\int_K|f|^2\,dE_{\Phi_E(h)x,\Phi_E(h)x}=\int_K|fh|^2\,dE_{x,x}\qquad(x\in\mathcal{D}(h)).$$

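Hence, $\Phi_E(f)\Phi_E(h)\subset\Phi_E(fh)$. If $f$ is unbounded, then we take limits and
$$\int_K|f|^2\,dE_{\Phi_E(h)x,\Phi_E(h)x}=\int_K|fh|^2\,dE_{x,x}\qquad(x\in\mathcal{D}(h))$$
holds, so that $\Phi_E(h)x\in\mathcal{D}(f)$ if and only if $x\in\mathcal{D}(fh)$, and
$$\mathcal{D}(\Phi_E(f)\Phi_E(h))=\{x\in\mathcal{D}(h);\ \Phi_E(h)x\in\mathcal{D}(f)\}=\mathcal{D}(h)\cap\mathcal{D}(fh),$$
as stated in (b).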


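Now let $x\in\mathcal{D}(h)\cap\mathcal{D}(fh)$ and consider the bounded functions $f_k=f\chi_{B_k}$, so that $f_kh\to fh$ in $L^2(E_{x,x})$. From (a) we know that $\Phi_E(f_kh)x\to\Phi_E(fh)x$,
$$\Phi_E(f)\Phi_E(h)x=\lim_k\Phi_E(f_k)\Phi_E(h)x=\lim_k\Phi_E(f_kh)x=\Phi_E(fh)x,$$
and (b) is true.
To prove (c), let $x,y\in\mathcal{D}(h)=\mathcal{D}(\bar{h})$. If $h_k=h\chi_{B_k}$, then
$$(\Phi_E(h)x,y)_H=\lim_k(\Phi_E(h_k)x,y)_H=\lim_k(x,\Phi_E(\bar{h}_k)y)_H=(x,\Phi_E(\bar{h})y)_H,$$
and it follows that $y\in\mathcal{D}(\Phi_E(h)^*)$ and $\Phi_E(\bar{h})\subset\Phi_E(h)^*$. To finish the proof, let us show that $\mathcal{D}(\Phi_E(h)^*)\subset\mathcal{D}(h)=\mathcal{D}(\bar{h})$.
Let $z\in\mathcal{D}(\Phi_E(h)^*)$. We apply (b) to $h_k=h\chi_{B_k}$ and we have $\Phi_E(h_k)=\Phi_E(h)\Phi_E(\chi_{B_k})$ with $\Phi_E(\chi_{B_k})$ bounded and self-adjoint. Then
$$\Phi_E(\chi_{B_k})\Phi_E(h)^*=\Phi_E(\chi_{B_k})^*\Phi_E(h)^*\subset(\Phi_E(h)\Phi_E(\chi_{B_k}))^*=\Phi_E(h_k)^*=\Phi_E(\bar{h}_k)$$
and $\Phi_E(\chi_{B_k})\Phi_E(h)^*z=\Phi_E(\bar{h}_k)z$. But $|\chi_{B_k}|\le1$, so that
$$\int_K|h_k|^2\,dE_{z,z}=\int_K|\chi_{B_k}|^2\,dE_{\Phi_E(h)^*z,\Phi_E(h)^*z}\le E_{\Phi_E(h)^*z,\Phi_E(h)^*z}(K).$$
We obtain that $z\in\mathcal{D}(h)$ by letting $k\to\infty$.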

The last part follows from (b), since $\mathcal{D}(\Phi_E(h\bar{h}))\subset\mathcal{D}(h)$. □

Remark 9.30. In Theorem 9.29, if $E(B_0)=0$, we can change $K$ to $K\setminus B_0$:
$$(\Phi_E(h)x,y)_H:=\int_{K\setminus B_0}h(\lambda)\,dE_{x,y}(\lambda)\qquad(x\in\mathcal{D}(\Phi_E(h)),\ y\in H)$$
if $h$ is Borel measurable on $K\setminus B_0$.

If $E$ is the spectral measure of a bounded normal operator $T$, then we write $h(T)$ for $\Phi_E(h)$, and then the results of Theorem 9.29 read
$$h(T)=\int_{\sigma(T)}h\,dE$$
on $\mathcal{D}(h)=\{x\in H;\ \|h\|_{L^2(E_{x,x})}<\infty\}$, in the sense that
$$(h(T)x,y)_H=\int_{\sigma(T)}h\,dE_{x,y}\qquad(x\in\mathcal{D}(h),\ y\in H).$$
Also
(a) $\|h(T)x\|_H^2=\int_{\sigma(T)}|h|^2\,dE_{x,x}$ if $x\in\mathcal{D}(h(T))$,


(b) $f(T)h(T)\subset(fh)(T)$, $\mathcal{D}(f(T)h(T))=\mathcal{D}(h(T))\cap\mathcal{D}((fh)(T))$ with $f(T)h(T)=(fh)(T)$ if and only if $\mathcal{D}((fh)(T))\subset\mathcal{D}(h(T))$, and
(c) $h(T)^*=\bar{h}(T)$ and $h(T)^*h(T)=|h|^2(T)=h(T)h(T)^*$.

9.5.3. The Cayley transform. We shall obtain a spectral representation theorem for self-adjoint operators using von Neumann’s method of making a reduction to the case of unitary operators.
If $T$ is a bounded self-adjoint operator on $H$, then the continuous functional calculus allows a direct definition of the Cayley transform of $T$ as$^{14}$
$$U=g(T)=(T-iI)(T+iI)^{-1},$$
where $g(t)=(t-i)/(t+i)$, a continuous bijection from $\mathbb{R}$ onto $S\setminus\{1\}$, and it is a unitary operator (cf. Theorem 8.24). Let us show that in fact this is also true for unbounded self-adjoint operators.

Let $T$ be a self-adjoint operator on $H$. By the symmetry of $T$ and from the identity $\|Ty\pm iy\|_H^2=\|y\|_H^2+\|Ty\|_H^2\pm(iy,Ty)_H\pm(Ty,iy)_H$,
$$\|Ty\pm iy\|_H^2=\|y\|_H^2+\|Ty\|_H^2\qquad(y\in\mathcal{D}(T)).$$
The operators $T\pm iI:\mathcal{D}(T)\to H$ are bijective and with continuous inverses, since $\pm i\in\sigma(T)^c$.
For every $x=Ty+iy\in\operatorname{Im}(T+iI)=H$ ($y\in\mathcal{D}(T)$), we define $Ux=U(Ty+iy):=Ty-iy$; that is,
$$Ux=(T-iI)(T+iI)^{-1}x\qquad(x\in H).$$
Then $U$ is a bijective isometry of $H$, since $\|Ty+iy\|_H^2=\|Ty-iy\|_H^2$ and $\operatorname{Im}(T\pm iI)=H$, and $U$ is called the Cayley transform of $T$.
Lemma 9.31. The Cayley transform
$$U=(T-iI)(T+iI)^{-1}$$
of a self-adjoint operator $T$ is unitary, $I-U$ is one-to-one, $\operatorname{Im}(I-U)=\mathcal{D}(T)$, and
$$T=i(I+U)(I-U)^{-1}$$
on $\mathcal{D}(T)$.

14Named after Arthur Cayley, this transform was originally described by Cayley (1846) as a mapping between skew-symmetric matrices and special orthogonal matrices. In complex analysis, the Cayley transform is the conformal mapping between the upper half-plane and the unit disc given by g(z)=(z−i)/(z+i). It was J. von Neumann who, in 1929, first used it to map self-adjoint operators into unitary operators.


Proof. We have proved that $U$ is unitary and, from the definition, $Ux=(T-iI)y$ if $x=(T+iI)y$ for every $y\in\mathcal{D}(T)$ and every $x\in H$. It follows that $(I+U)x=2Ty$ and $(I-U)x=2iy$, with $(I-U)(H)=\mathcal{D}(T)$. If $(I-U)x=0$, then $y=0$ and also $(I+U)x=0$, so that a subtraction gives $2Ux=0$, and $x=0$. Finally, if $y\in\mathcal{D}(T)$, $2Ty=(I+U)(I-U)^{-1}(2iy)$. □
Remark 9.32. Since $I-U$ is one-to-one, 1 is not an eigenvalue of $U$.
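For a self-adjoint operator with a basis of eigenvectors, Lemma 9.31 reduces to a statement about scalars: each real eigenvalue $t$ of $T$ corresponds to the eigenvalue $(t-i)/(t+i)$ of $U$. The following sketch, in which the list `eigenvalues_T` is a hypothetical spectrum chosen for illustration, checks that these values lie on the unit circle, differ from 1, and are mapped back to $t$ by $u\mapsto i(1+u)/(1-u)$.

```python
def cayley(t):
    """g(t) = (t - i) / (t + i), the scalar Cayley transform of a real number t."""
    return (t - 1j) / (t + 1j)

def inverse_cayley(u):
    """i (1 + u) / (1 - u), defined for points u != 1 of the unit circle."""
    return 1j * (1 + u) / (1 - u)

eigenvalues_T = [-3.0, 0.0, 0.5, 10.0]             # hypothetical spectrum of T
eigenvalues_U = [cayley(t) for t in eigenvalues_T]
recovered = [inverse_cayley(u) for u in eigenvalues_U]

modulus_defect = max(abs(abs(u) - 1.0) for u in eigenvalues_U)
distance_to_one = min(abs(u - 1.0) for u in eigenvalues_U)
recovery_defect = max(abs(r - t) for r, t in zip(recovered, eigenvalues_T))
print(modulus_defect, distance_to_one, recovery_defect)
```

Note that $|g(t)-1|=2/|t+i|$ tends to 0 as $|t|\to\infty$, which is how an unbounded $T$ produces a unitary $U$ whose spectrum accumulates at 1 without 1 being an eigenvalue.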

9.5.4. Proof of Theorem 9.20: Let $T$ be a self-adjoint operator on $H$. To construct the (unique) spectral measure $E$ on $\sigma(T)\subset\mathbb{R}$ such that
$$T=\int_{\sigma(T)}t\,dE(t),$$
the Cayley transform $U$ of $T$ will help us to transfer the spectral representation of $U$ to the spectral representation of $T$.
According to Theorem 8.24, the spectrum of $U$ is a closed subset of the unit circle $S$, and 1 is not an eigenvalue, so that the spectral measure $E'$ of $U$ satisfies $E'\{1\}=0$, by Theorem 8.26. We can assume that it is defined on $\Omega=S\setminus\{1\}$ and we have the functional calculus
$$f(U)=\int_{\sigma(U)}f(\lambda)\,dE'(\lambda)=\int_\Omega f(\lambda)\,dE'(\lambda)\qquad(f\in B(\Omega)),$$
which was extended to unbounded functions in Subsection 9.5.2. If $h(\lambda):=i(1+\lambda)/(1-\lambda)$ on $\Omega$, then we also have
$$(h(U)x,y)_H=\int_\Omega h(\lambda)\,dE'_{x,y}(\lambda)\qquad(x\in\mathcal{D}(h(U)),\ y\in H),$$
with
$$\mathcal{D}(h(U))=\Big\{x\in H;\ \int_\Omega|h|^2\,dE'_{x,x}<\infty\Big\}.$$
The operator $h(U)$ is self-adjoint, since $h$ is real and $h(U)^*=\bar{h}(U)=h(U)$.
From the identity $h(\lambda)(1-\lambda)=i(1+\lambda)$, an application of (b) in Theorem 9.29 gives
$$h(U)(I-U)=i(I+U),$$
since $\mathcal{D}(I-U)=H$. In particular, $\operatorname{Im}(I-U)\subset\mathcal{D}(h(U))$.
From the properties of the Cayley transform, $T=i(I+U)(I-U)^{-1}$, and then
$$T(I-U)=i(I+U),\qquad\mathcal{D}(T)=\operatorname{Im}(I-U)\subset\mathcal{D}(h(U)),$$
so that $h(U)$ is a self-adjoint extension of the self-adjoint operator $T$. But, $T$ being maximally symmetric, $T=h(U)$. That is,
$$(Tx,y)_H=\int_\Omega h(\lambda)\,dE'_{x,y}(\lambda)\qquad(x\in\mathcal{D}(T),\ y\in H).$$
The function $t=h(\lambda)$ is a homeomorphism between $\Omega$ and $\mathbb{R}$ that allows us to define $E(B):=E'(h^{-1}(B))$, and it is readily checked that $E$ is a spectral measure on $\mathbb{R}$ such that
$$(Tx,y)_H=\int_{\mathbb{R}}t\,dE_{x,y}(t)\qquad(x\in\mathcal{D}(T),\ y\in H).$$
Conversely, if $E$ is a spectral measure on $\mathbb{R}$ which satisfies
$$(Tx,y)_H=\int_{\mathbb{R}}t\,dE_{x,y}(t)\qquad(x\in\mathcal{D}(T),\ y\in H),$$
by defining $E'(B):=E(h(B))$, we obtain a spectral measure on $\Omega$ such that
$$(h(U)x,y)_H=\int_\Omega h(\lambda)\,dE'_{x,y}(\lambda)\qquad(x\in\mathcal{D}(h(U)),\ y\in H).$$
But $U=h^{-1}(h(U))$ and
$$(Ux,y)_H=\int_\Omega\lambda\,dE'_{x,y}(\lambda)\qquad(x,y\in H).$$
From the uniqueness of $E'$ with this property, the uniqueness of $E$ follows.
Of course, the functional calculus for the spectral measure $E$ defines the functional calculus $f(T)=\int_{-\infty}^{+\infty}f\,dE$ for $T=\int_{-\infty}^{+\infty}\lambda\,dE(\lambda)$, and $f(T)=f(h(U))$.

9.6. Exercises

Exercise 9.1. Let $T:\mathcal{D}(T)\subset H\to H$ be a linear and bounded operator. Prove that $T$ has a unique continuous extension on $\overline{\mathcal{D}(T)}$ and that it has a bounded linear extension to $H$. Show that this last extension is unique if and only if $\mathcal{D}(T)$ is dense in $H$.
Exercise 9.2. Prove that if $T$ is a symmetric operator on a Hilbert space $H$ and $\mathcal{D}(T)=H$, then $T$ is bounded.
Exercise 9.3. Prove that the derivative operator $D$ is unbounded on $L^2(\mathbb{R})$.
Exercise 9.4. If $T$ is an unbounded densely defined linear operator on a Hilbert space, then prove that $(\operatorname{Im}T)^\perp=\operatorname{Ker}T^*$.
Exercise 9.5. If $T$ is a linear operator on $H$ and $\lambda\in\sigma(T)^c$, then prove that $\|R_T(\lambda)\|\ge1/d(\lambda,\sigma(T))$.


Exercise 9.6. Show that, if $T$ is a symmetric operator on $H$ and $\operatorname{Im}T=H$, then $T$ is self-adjoint.
Exercise 9.7. If $T$ is an injective self-adjoint operator on $\mathcal{D}(T)\subset H$, then show that $\operatorname{Im}T=\mathcal{D}(T^{-1})$ is dense in $H$ and that $T^{-1}$ is also self-adjoint.
Exercise 9.8. Prove that the residual spectrum of a self-adjoint operator on a Hilbert space $H$ is empty.
Exercise 9.9. Suppose $A$ is a bounded self-adjoint operator on a Hilbert space $H$ and let
$$A=\int_{\sigma(A)}\lambda\,dE(\lambda)$$
be the spectral representation of $A$. A vector $z\in H$ is said to be cyclic for $A$ if the set $\{A^nz\}_{n=0}^\infty$ is total in $H$.
If $A$ has a cyclic vector $z$ and $\mu=E_{z,z}$, then prove that $A$ is unitarily equivalent to the multiplication operator $M:f(t)\mapsto tf(t)$ of $L^2(\mu)$; that is, $M=U^{-1}AU$ where $U:L^2(\mu)\to H$ is unitary.
Exercise 9.10. Let
$$A=\int_{\sigma(A)}\lambda\,dE(\lambda)$$
be the spectral resolution of a bounded self-adjoint operator of $H$ and denote
$$F(t):=E(-\infty,t]=E(\sigma(A)\cap(-\infty,t]).$$
Prove that the operator-valued function $F:\mathbb{R}\to\mathcal{L}(H)$ satisfies the following properties:

(a) If $s\le t$, then $F(s)\le F(t)$; that is, $(F(s)x,x)_H\le(F(t)x,x)_H$ for every $x\in H$.
(b) F (t)=0ift

(c) $F(t+)=F(t)$; that is, $\lim_{s\downarrow t}F(s)=F(t)$ in $\mathcal{L}(H)$.
If aM(A), then show that with convergence in $\mathcal{L}(H)$
$$A=\int_a^bt\,dF(t)=\int_{m(A)+}^{M(A)}t\,dF(t)=\int_{\mathbb{R}}t\,dF(t)$$
as a Stieltjes integral.
Exercise 9.11. On $L^2(0,1)$, let $S=iD$ with domain $H^1(0,1)$. Prove the following facts:
(a) $\operatorname{Im}S=L^2(0,1)$.
(b) $S^*=iD$ with domain $H_0^1(0,1)$.
(c) $S$ is a non-symmetric extension of $iD$ with $\mathcal{D}(iD)=H^2(0,1)$.

Exercise 9.12. On $L^2(0,1)$, let $R=iD$ with domain $H_0^1(0,1)$ (i.e., $S^*$ in Exercise 9.11). Prove the following facts:
(a) $\operatorname{Im}R=\{u\in L^2(0,1);\ \int_0^1u(t)\,dt=0\}$.
(b) $R^*=iD$ with domain $H^1(0,1)$ (i.e., $R^*=S$ of Exercise 9.11).
Exercise 9.13. As an application of Theorem 9.17, show that the operator $-D^2=-d^2/dx^2$ in $L^2(0,1)$ with domain the $C^\infty$ functions $f$ on $[0,1]$ such that $f(0)=f(1)=0$ is essentially self-adjoint.
Exercise 9.14. Show also that the operator $-D^2=-d^2/dx^2$ in $L^2(0,1)$ with domain the $C^\infty$ functions $f$ on $[0,1]$ such that $f'(0)=f'(1)=0$ is essentially self-adjoint.
Exercise 9.15. Prove that $-D^2=-d^2/dx^2$ with domain $\mathcal{D}(0,1)$ is not an essentially self-adjoint operator in $L^2(0,1)$.
Exercise 9.16. Let $V$ be a nonnegative continuous function on $[0,1]$. Then the differential operator $T=-d^2/dx^2+V$ on $L^2(0,1)$ with domain $\mathcal{D}^2(0,1)$ has a self-adjoint Friedrichs extension.
Exercise 9.17. Let
$$Q_k\varphi(x)=x_k\varphi(x),\qquad P_k\varphi=\frac{h}{2\pi i}\partial_k\varphi\qquad(1\le k\le n)$$
represent the position and momentum operators on $L^2(\mathbb{R}^n)$.
Prove that they are unbounded self-adjoint operators whose commutators satisfy the relations
$$[Q_j,Q_k]=0,\qquad[P_j,P_k]=0,\qquad[P_j,Q_k]=\delta_{j,k}\frac{h}{2\pi i}I.$$
Note: These are called the canonical commutation relations satisfied by the system $\{Q_1,\dots,Q_n;P_1,\dots,P_n\}$ of $2n$ self-adjoint operators, and it is said that $Q_k$ is canonically conjugate to $P_k$.

Exercise 9.18. Prove that, if $A_1$ and $A_2$ are two bounded self-adjoint operators in a Hilbert space, then $A_1A_2=A_2A_1$ if and only if their spectral measures $E^1$ and $E^2$ commute as in (9.8): $E^1(B_1)E^2(B_2)=E^2(B_2)E^1(B_1)$ for all $B_1,B_2\in\mathcal{B}_{\mathbb{R}}$.

Exercise 9.19. Let
$$A_1=\int_{\mathbb{R}}\lambda\,dE^1(\lambda),\qquad A_2=\int_{\mathbb{R}}\lambda\,dE^2(\lambda)$$
be two self-adjoint operators in a Hilbert space $H$. If they commute (in the sense that their spectral measures commute), prove that there exists a unique spectral measure $E$ on $\mathbb{R}^2$ such that
$$E(B_1\times B_2)=E^1(B_1)E^2(B_2)\qquad(B_1,B_2\in\mathcal{B}_{\mathbb{R}}).$$

In the case of the position operators $A_1=Q_1$ and $A_2=Q_2$ on $L^2(\mathbb{R}^2)$, show that $E(B)=\chi_B\cdot$ ($B\in\mathcal{B}_{\mathbb{R}^2}$).
Exercise 9.20. Find the infinitesimal generator of the one-parameter group of unitary operators $U_tf(x):=f(x+t)$ on $L^2(\mathbb{R})$.
Exercise 9.21. Suppose that $g:\mathbb{R}\to\mathbb{R}$ is a continuous function. Describe the multiplication $g\cdot$ as a self-adjoint operator in $L^2(\mathbb{R})$ and $U_tf:=e^{itg}f$ as a one-parameter group of unitary operators. Find the infinitesimal generator $A$ of $U_t$ ($t\in\mathbb{R}$).

References for further reading

E. Hille and R. S. Phillips, Functional Analysis and Semigroups.
T. Kato, Perturbation Theory for Linear Operators.
P. D. Lax, Functional Analysis.
E. H. Lieb and M. Loss, Analysis.
E. Prugovečki, Quantum Mechanics in Hilbert Space.
M. Reed and B. Simon, Methods of Modern Mathematical Physics.
F. Riesz and B. Sz. Nagy, Leçons d'analyse fonctionnelle.
W. Rudin, Functional Analysis.
A. E. Taylor and D. C. Lay, Introduction to Functional Analysis.
J. von Neumann, Mathematical Foundations of Quantum Mechanics.
K. Yosida, Functional Analysis.

Hints to exercises

Chapter 1

Exercise 1.1. The identity is a homeomorphism between the two metric spaces, and {1, 2, 3,...} is a Cauchy sequence in (R,d) but not in (R, |·|).

Exercise 1.2. A finite number of balls B(a, m)(m ∈ N) cover the compact set.

Exercise 1.3. If every sequence $\{a_k\} \subset A$ has a convergent subsequence in M, for every sequence $\{x_k\} \subset \bar A$, choose $\{a_k\} \subset A$ with $d(x_k, a_k) < 1/k$. If $a_{k_m} \to x$, also $x_{k_m} \to x$.

Exercise 1.4. If U is an open neighborhood of x ∈ K, using the fact that $\bar U$ is compact and that K is Hausdorff, show that U contains a compact neighborhood W of x in $\bar U$.

Exercise 1.5. If X = F ∩ G (F closed and G open) and a ∈ X, then $\bar B(a, R) \subset G$ for some R > 0 and $\bar B(a, r) \cap F$ (r


Exercise 1.8. Consider I :(K, T ) → (K, T ) and use that I(F ) is a compact subset in (K, T ) to show that I−1 is continuous.

Exercise 1.9. (b) Show that xn → x0 in X if and only if xn = x0 for every n ≥ n0.

Exercise 1.10. See W. Rudin [39, Chapter 1].

Exercise 1.11. Every function is measurable and, according to the definition (1.3) of the Lebesgue integral, the integral of a nonnegative function $f = \{f(j)\}_{j\in J}$ is the sum
$\sum_{j\in J} f(j) := \sup\Big\{\sum_{k\in F} f(k);\ F \subset J,\ F \text{ finite}\Big\}.$
If $\sum_{j\in J} |f(j)| < \infty$, then $N := \{j \in J;\ f(j) \ne 0\} = \bigcup_{n=1}^\infty \{j \in J;\ |f(j)| \ge 1/n\}$ is at most countable, since every $\{j \in J;\ |f(j)| \ge 1/n\}$ is finite. Hence $\sum_{j\in J} |f(j)| = \sum_{n\in N} |f(n)|$ and
$\int f\,d\nu = \int f^+\,d\nu - \int f^-\,d\nu = \sum_{k\in N} f^+(k) - \sum_{k\in N} f^-(k) = \sum_{k\in N} f(k).$

Exercise 1.12. Use monotone convergence (first prove that $(1 + x/n)^n$ is increasing), dominated convergence, and Fatou's lemma, respectively.

Exercise 1.13. Show that I = 2J with
$J = \int_0^1 dx \int_0^x \frac{dy}{(x-y)^\alpha}, \qquad \int_0^x \frac{dy}{(x-y)^\alpha} = \lim_{\varepsilon\downarrow 0} \int_0^{x-\varepsilon} \frac{dy}{(x-y)^\alpha}.$

Exercise 1.14. If $E = \bigcup_k (a_k, b_k)$, then $\sum_k |F(b_k) - F(a_k)| \le \int_E |f(t)|\,dt = \mu(E)$, where μ is a finite measure such that μ(A) = 0 if |A| = 0. It is shown that μ(E) → 0 as |E| → 0 by supposing that, for some ε > 0, there exists $E_n$ such that $\mu(E_n) \ge \varepsilon$ and $|E_n| \le 1/2^n$ for every n ∈ N, since then $E := \bigcap_{k\ge 1} \bigcup_{n\ge k} E_n$ would satisfy |E| = 0 but $\mu(E) \ge \varepsilon$.

Exercise 1.15. See Royden [37, III.3].

Exercise 1.16. μ is the linear combination of four finite measures.

Exercise 1.17. Using the representation $\int_B f\,d\mu = \int_B fh\,d|\mu|$ with h as in (1.10), we have both $\int_B \bar h\,d\mu = |\mu|(B)$ with $|h| = 1$ and also $\big|\int_B f\,d\mu\big| \le \int_B |f|\,d|\mu| \le |\mu|(B)$ if $|f| \le 1$.

Exercise 1.18. The Riesz-Markov theorem with $u(g) = \sum_{k=1}^\infty \lambda_k g(a_k)$ yields the first part.


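Obviously $\big|\int g\,d\mu\big| \le \sum_{k=1}^\infty |\lambda_k|$ if $g \in \mathcal C_c(\mathbb R^n)$ and $|g| \le 1$. For the opposite estimate, if $\sum_{k>n_0} |\lambda_k| \le \varepsilon$, construct a convenient function g such that $g(a_k) = \operatorname{sgn}\lambda_k$ if $k \le n_0$. Then $\sum_{k\le n_0} |\lambda_k| \le |u(g)| + \varepsilon$ and
$\sum_{k=1}^\infty |\lambda_k| = \sup_{n_0} \sum_{k\le n_0} |\lambda_k| \le \sup_{|g|\le 1} |u(g)|.$
Similarly $\sum_{a_k\in G} |\lambda_k| = |\mu|(G)$ for any open set $G \subset \mathbb R^n$ and |μ| is associated to $\{|\lambda_k|\}$ in the same way that μ is associated to $\{\lambda_k\}$. Clearly |μ|(G) = 0 if $G = F^c$.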

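Chapter 2

Exercise 2.1. If n = 1, use induction and $\binom{m+1}{k} = \binom{m}{k} + \binom{m}{k-1}$. If n > 1 and $\alpha = (\alpha_1,\dots,\alpha_{n+1})$, write $\bar\alpha := (\alpha_1,\dots,\alpha_n,0)$ and
$D^{\bar\alpha}\partial_{n+1}^{\alpha_{n+1}}(fg) = \sum_{k=0}^{\alpha_{n+1}} \binom{\alpha_{n+1}}{k} D^{\bar\alpha}\big(\partial_{n+1}^{k}f\,\partial_{n+1}^{\alpha_{n+1}-k}g\big).$
Then
$D^{\bar\alpha}\big(\partial_{n+1}^{k}f\,\partial_{n+1}^{\alpha_{n+1}-k}g\big) = \sum_{\beta\le\bar\alpha} \binom{\bar\alpha}{\beta} D^{\beta}(\partial_{n+1}^{k}f)\,D^{\bar\alpha-\beta}(\partial_{n+1}^{\alpha_{n+1}-k}g).$
Finally,
$\sum_{k=0}^{\alpha_{n+1}} \binom{\alpha_{n+1}}{k} \sum_{\beta\le\bar\alpha} \binom{\bar\alpha}{\beta} D^{\beta}(\partial_{n+1}^{k}f)\,D^{\bar\alpha-\beta}(\partial_{n+1}^{\alpha_{n+1}-k}g) = \sum_{\beta\le\alpha} \binom{\alpha}{\beta} (D^{\beta}f)\,D^{\alpha-\beta}g.$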

Exercise 2.2. If U(x) = x + U ⊂ F, a neighborhood of x, then also U ⊂ F and $\bigcup_{m=1}^\infty mB_E(0,r) = E$.

Exercise 2.3. Suppose x ∈ E \ {0} and let U = {0}. The product λx should be continuous in λ, so that we should have λx ∈ U if |λ| ≤ ε for some ε > 0.

Exercise 2.4. If 0

Exercise 2.5. Define u(e1)=1,u(e2)=2,... if {en} is the canonical basis sequence.

Exercise 2.6. If g and h are two different bounded continuous functions, then $G = \{g \ne h\}$ is a nonempty open set and |G| > 0, so that $g \ne h$ in $L^\infty(\mathbb R^n)$. Choose an uncountable family of intervals $I_\alpha$, so that $\|\chi_{I_\alpha} - \chi_{I_\beta}\|_\infty \ge 1$ if $\alpha \ne \beta$, to show that $L^\infty(\mathbb R^n)$ is not separable. See Remark 2.18.


Exercise 2.7. Use the fact that, as a subspace of $\mathcal C(\bar B(0,m))$, $\mathcal C_c(B(0,m))$ is separable and it is dense in $L^p(B(0,m))$. Then approximate every $f \in L^p(\mathbb R^n)$ by $f\chi_{B(0,m)}$.

Exercise 2.8. $\mathcal C_0(\mathbb R^n)$ is complete and $\varrho(x/k)g(x) \to g(x)$ uniformly if $\bar B(0,1) \prec \varrho \prec B(0,2)$. Every $\{f \in \mathcal C_c(\mathbb R^n);\ \operatorname{supp} f \subset \bar B(0,m)\}$ is separable.

Exercise 2.9. (a) For every n ∈ N, $K = \bigcup_{j=1}^{N(n)} B(c_{n,j}, 1/n)$, and the set $\{c_{n,j}\}_{n,j}$ is dense in K. (b) Apply the Stone-Weierstrass theorem to the subalgebra of $\mathcal C(K)$ generated by the functions $\varphi_{m,n}$.

Exercise 2.10. Since $\mathcal C(0,T)$ is dense in $L^2(0,T)$, $\mathcal C(\mathbf T)$ is also dense in $L^2(\mathbf T)$. The Stone-Weierstrass theorem shows that the algebra of trigonometric polynomials is dense in $\mathcal C(\mathbf T)$, and the uniform convergence implies the convergence in $L^2(\mathbf T)$.

Exercise 2.11. Describe C1[a, b] as a closed subspace of C[a, b] ×C[a, b].

Exercise 2.12. The constants are 1 and $n^{1/p}$.
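
The two constants can be checked directly. The following is only a sketch, under the assumption (the exercise statement is not reproduced in these hints) that the norms being compared are $\|x\|_\infty = \max_k |x_k|$ and $\|x\|_p = (\sum_k |x_k|^p)^{1/p}$ on $\mathbf K^n$ with $1 \le p < \infty$:

```latex
% Assumed setting: x = (x_1, ..., x_n), 1 <= p < infinity.
\[
\|x\|_\infty
  = \Big(\max_{1\le k\le n} |x_k|^p\Big)^{1/p}
  \le \Big(\sum_{k=1}^{n} |x_k|^p\Big)^{1/p}
  = \|x\|_p
  \le \big(n\,\|x\|_\infty^{p}\big)^{1/p}
  = n^{1/p}\,\|x\|_\infty .
\]
```

Under this assumption neither constant can be improved: $x = (1,0,\dots,0)$ gives equality on the left and $x = (1,\dots,1)$ gives equality on the right.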

Exercise 2.13. Show that |u(f)| = 1 with |f| ≤ 1 implies f = sgn g on every interval (1/(n + 1), 1/n) and that f cannot be continuous, so that $u(B_C) \subset (-1,1)$. If −1

Exercise 2.14. $\|T_K 1\| = \|T_K\|$.

Exercise 2.15. If $M = \max_{0\le y\le 1} \int_0^1 |K(x,y)|\,dx$, then $\|Tf\|_1 \le M\|f\|_1$ by Fubini-Tonelli, and $\|T\| \le M$. To prove that $\|Th\|_1 \ge M - \varepsilon$ with $\|h\|_1 \le 1$, note that $M = \int_0^1 |K(x,y_0)|\,dx$ for some $y_0 \in [0,1]$, use the uniform continuity of K to choose $y_1 \le y_0 \le y_2$ in [0, 1] so that $|K(x,y) - K(x,y_0)| \le \varepsilon$ if $y \in [y_1,y_2]$, and define $h = (y_2 - y_1)^{-1}\chi_{[y_1,y_2]}$.

Exercise 2.16. $\|T_K\| = \|T_K 1\|_p$ (p = ∞ or 1).

Exercise 2.17. Show that $T^n = 0$, since its kernel is $K_n = 0$ if n > 1, and then $(I - T)^{-1} = I + T$. Note that $v_0(y)\sin(ny)$ is an odd function when $v_0$ is even.

Exercise 2.18. Write the equations in the form $(pu')' - qu = f$ and the Cauchy problems as $\int_0^1 K(x,y)u(y)\,dy - u(x) = g(x)$. Then (a) f = 0, K(x, y) = y − x, g(x) = −x; (b) f(x) = cos(x), K(x, y) = y − x, g(x) = −x + cos x; (c) f = 0, $K(x,y) = (a_0/a_1)(\exp(a_1y - a_1x) - 1)$, $g(x) = -\alpha + (b/\alpha_1)(\exp(-a_1x) - 1)$; (d) f = 0, $K(x,y) = x(1 - \exp(x^2/2 - y^2/2))$, $g(x) = -x\exp(x^2/2)$.

Exercise 2.19. By induction and Fubini's theorem,
$\int_a^x dx_{n-1} \int_a^{x_{n-1}} dx_{n-2} \cdots \int_a^{x_1} f(t)\,dt = \frac{1}{(n-1)!} \int_a^x (x-t)^{n-1} f(t)\,dt.$
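
As a sanity check of this formula, the case n = 2 and the inductive step can be written out by hand. This is only an illustration of the Fubini exchange over the triangle $a \le t \le s \le x$; the integration variable s is introduced here for readability and does not appear in the hint:

```latex
% Case n = 2: exchange the order of integration over a <= t <= s <= x.
\[
\int_a^x ds \int_a^{s} f(t)\,dt
  = \int_a^x f(t)\Big(\int_t^x ds\Big)dt
  = \int_a^x (x-t)\,f(t)\,dt .
\]
% Inductive step: integrate the formula for n once more and exchange again.
\[
\int_a^x \frac{1}{(n-1)!}\int_a^{s}(s-t)^{n-1}f(t)\,dt\,ds
  = \frac{1}{(n-1)!}\int_a^x f(t)\int_t^x (s-t)^{n-1}\,ds\,dt
  = \frac{1}{n!}\int_a^x (x-t)^{n}f(t)\,dt .
\]
```

The inner integral $\int_t^x (s-t)^{n-1}\,ds = (x-t)^n/n$ is what produces the factorial.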

Exercise 2.20. Consider $v = u^{(n)}$, integrate on [a, x], use the initial conditions $u^{(j)}(a) = c_j$, and apply the result of Exercise 2.19.

Exercise 2.21. Show that supp f ∗ g contains [0, ∞) × {0} but supp f + supp g = (0, ∞) × (0, ∞), a sum of noncompact sets.

Exercise 2.22. If s(f)(x) = f(−x), show that $|(f*g)(b) - (f*g)(a)| \le C\|\tau_b s(f) - \tau_a s(f)\|$ and apply Theorem 2.14.

Exercise 2.23. Just compute the convolutions.

Exercise 2.24. See the proof of Theorem 2.41(a).

Exercise 2.25. Write
$2F_{2N} - F_N = \frac{1}{N}\sum_{n=0}^{2N-1} D_n - \frac{1}{N}\sum_{n=0}^{N-1} D_n = \frac{1}{N}\sum_{n=N}^{2N-1} D_n.$

Exercise 2.26. c =1.

Exercise 2.27. $W_t(x) = (1/t^n)W(x/t)$.

Exercise 2.28. If $P_F(x) = y$, then $y = \sum_{n\ge 1} (y,e_n)_H e_n$ (Fischer-Riesz). If x = y + z, then $(y,e_n)_H = (x,e_n)_H$.

Exercise 2.29. Check that
$\int_Y |K(x,y)||f(y)|\,dy \le C^{1/p'} \Big(\int_Y |K(x,y)||f(y)|^p\,dy\Big)^{1/p}$
and then integrate.

Exercise 2.30. Note that T is (1, 1) and (∞, ∞) with $M_0 = M_1 = C$ and 1/p = 1/q = 1 − ϑ.

Exercise 2.31. $\|\tilde T\| \le 2\|T\|$.

Exercise 2.32. Let ϑ = 1/2 and note that T is of type (4, 4/3) with constant $2\cdot 2^{1/2}(2^{1/2})^{1/2}$ (Exercise 2.31). Show that $\|T(1,2)\|_{4/3} > 2^{3/4}\|(1,2)\|_4$ if (x, y) = (1, 2).


Exercise 2.33. Obviously $|c_k(f)| \le \|f\|_1$, and also $\|c(f)\|_2 = \|f\|_2$ (Parseval). If 1

Chapter 3

Exercise 3.1. (a) See (4.11).

Exercise 3.2. If A ⊂ rU and U is convex, also co(A) ⊂ rU. If A is open, then $\operatorname{co}(A) = \bigcup\{t_1A + \cdots + t_nA;\ n \in \mathbf N,\ t_j \ge 0,\ t_1 + \cdots + t_n = 1\}$ is also open.

Exercise 3.3. Let U be a neighborhood of zero in E. Then λV ⊂ U if |λ| ≤ δ, for some δ > 0 and some neighborhood of zero, V. It follows that $\bigcup\{\lambda V;\ |\lambda| \le \delta\} \subset U$ is balanced.

Exercise 3.4. If $f_n \to f$ uniformly on compact sets, then $0 = \int_\gamma f_n(z)\,dz \to \int_\gamma f(z)\,dz = 0$, and f is analytic. Hence, $\mathcal H(\Omega)$ is closed in $\mathcal C(\Omega)$.

Exercise 3.5. Use the open mapping theorem.

Exercise 3.6. (d) $\|\lambda x\| \le \sum_{n=1}^{N} 2^{-n}\frac{p_n(\lambda x)}{1 + p_n(\lambda x)} + 2^{-N} \le \varepsilon$ if N is such that $2^{-N} \le \varepsilon/2$, and allow λ → 0, so that $\sum_{n=1}^{N} 2^{-n}\frac{p_n(\lambda x)}{1 + p_n(\lambda x)} \le \varepsilon/2$. (f) $\|nx\| \le 1$ and $\|nx\| \ne n\|x\| \to \infty$.

Exercise 3.7. It should be $\|h\| \le 1/2$.

Exercise 3.8. If the semi-norms $p_n^j$ define the topology of $E^j$, then the semi-norms $q_n(x) := \max(p_n^1(x^1),\dots,p_n^m(x^m))$ $(x = (x^1,\dots,x^m))$ define the product topology on E. If $E = E^1 \times \cdots \times E^m \times \cdots$, a countable product, the topology is defined by the sequence of semi-norms $q_m$ $(x = (x^1,\dots,x^m,\dots))$. Embed $\mathcal E(\Omega) \hookrightarrow \prod_{k=0}^\infty \mathcal C(\Omega)$ by $f \mapsto (f^{(k)})$.

Exercise 3.9. (a) If $\{f_{n_k}\}$ is a subsequence, then $f_{n_k}(t) \to 0$ if $t \ne 0$, and $f_{n_k}(0) \to 1$. (b) Choose a ∈ Ω and define
$f_n(t) := \frac{d(t, B(a,1/n)^c)}{d(t,a) + d(t, B(a,1/n)^c)}$
with n large enough.

Exercise 3.10. Consider $M \ni x_n \to x \in E_1$ and prove that $Tx_n$ is a Cauchy sequence with a limit $y \in E_2$ such that, if also $M \ni x_n' \to x \in E_1$, then necessarily $Tx_1, Tx_1', Tx_2, Tx_2', \dots$ is a Cauchy sequence with the same limit y. Define $\tilde T\tilde x := y$.


Exercise 3.11. (a) First show that the topology in E/F is the collection of all sets G ⊂ E/F such that $\pi^{-1}(G)$ is an open set in E, so that π is continuous, and if a set M ⊂ E/F is closed, then $\pi^{-1}(M)$ is closed in E.

(b) If $\|\cdot\|_E$ is an F-norm associated to the topology on E, then check that $\|\pi(x)\| = \inf_{z\in F} \|x - z\|_E$ is a corresponding F-norm for E/F. To prove that every Cauchy sequence $\{u_n\}$ in E/F must converge, choose $\|u_{n_k} - u_{n_{k+1}}\| < 1/2^k$ and $\pi(x_k) = u_{n_k}$ so that $\|x_k - x_{k+1}\|_E < 1/2^k$ still. If $x_k \to x$ in E, then $u_{n_k} \to \pi(x)$ in E/F and also $u_n \to \pi(x)$ in E/F.

(c) $F + M = \pi^{-1}(\pi(M))$ is closed, since π(M) is finite dimensional in E/F and every finite dimensional subspace is closed by Theorem 2.25 (in the case of a Fréchet space, cf. Köthe [26, 15.5] or Rudin [38, 1.21]).

Exercise 3.12. Show that $\{e_n\} \subset F + M$ and F + M is dense. The element $x_0 = \sum_{n=1}^\infty \frac{1}{n} e_{-n}$ is well-defined. Suppose $x_0 = x_1 + x_2$ with $x_2 = \lim b_m$ ($b_m \in [u_n;\ n \ge 1]$) and $x_1 \in F$ and show that, if n ≥ 1, then $(-ne_{-n} + e_n, x_2)_H = 0$ and also $(x_1, e_{-n})_H = 0$. It follows that $(x_0, -ne_{-n} + e_n)_H = -1$ so that $(x_1, e_n)_H = -1$, and then $\sum_n |(x_1, e_n)_H|^2 = \infty$, a contradiction.

Exercise 3.13. Write $[a,b] = \bigcup_\alpha \{x_\alpha\}$, where $\operatorname{Int}(\{x_\alpha\}) = \emptyset$.

Exercise 3.14. If $F = \bigcup_{n=1}^\infty [e_1,\dots,e_n]$ and $\operatorname{Int}([e_1,\dots,e_m]) \ne \emptyset$, then $F = [e_1,\dots,e_m]$.

Exercise 3.15. Consider f(x)=1/x (f(0) := 0).

Exercise 3.16. Use the closed graph theorem.

Exercise 3.17. If $f_n(t) = t^n/n$, then $f_n \to 0$ in $(\mathcal C^1[0,1], \|\cdot\|_{[0,1]})$ but $f_n'(t) = t^{n-1}$ does not converge in $(\mathcal C^1[0,1], \|\cdot\|_{[0,1]})$.
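
The two norms in this hint can be computed explicitly. The display below is a short sketch, assuming that $\|\cdot\|_{[0,1]}$ denotes the sup norm on [0, 1], as the notation suggests:

```latex
% Assumed: ||g||_{[0,1]} = sup of |g(t)| over 0 <= t <= 1.
\[
\|f_n\|_{[0,1]} = \sup_{0\le t\le 1} \frac{t^n}{n} = \frac{1}{n} \longrightarrow 0,
\qquad
\|f_n'\|_{[0,1]} = \sup_{0\le t\le 1} t^{\,n-1} = 1 \quad (n \ge 1).
\]
```

So the functions tend to zero uniformly while their derivatives keep sup norm 1, and the pointwise limit of $t^{n-1}$ is discontinuous at t = 1.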

Exercise 3.18. See Remark 3.12.

Exercise 3.19. The set $F_n := \{\psi(x) \le n\}$ is closed and $M = \bigcup_n F_n$. By Corollary 3.9 of Baire's theorem, $\operatorname{Int} F_m \ne \emptyset$ for some m.

Exercise 3.20. Apply the uniform boundedness principle (Theorem 3.14) to the sequence {Jn}.

Exercise 3.21. (a) If $(x_n, y_n) \to (x_0, y_0)$ in $E_1 \times E_2$, show that the linear mappings $T_n = B(\cdot, y_n)$ are uniformly bounded and write $B(x_n,y_n) - B(x_0,y_0) = T_n(x_n - x_0) + B(x_0, y_n - y_0)$. (b) Choose $P_n$ polynomials such that $P_n(x) \to x^{-1/2}$ in $L^1(0,1)$ (polynomials are dense in $L^1(0,1)$, since $\mathcal C_c(0,1)$ is dense by Corollary 2.13 and polynomials are uniformly dense in $\mathcal C_c(0,1)$) and observe that $\{P_n\}$ is a bounded sequence such that $\{B(P_n,P_n)\}$ is unbounded.


Exercise 3.22. See Exercise 3.21.

Exercise 3.23. Use the closed graph theorem.

Chapter 4

Exercise 4.1. See the proof of Theorem 4.14, now for the vector subspace $F = \{x \in \ell^\infty;\ \exists \lim_n \Lambda_n x \in \mathbf R\}$.

Exercise 4.2. See Theorem 4.17.

Exercise 4.3. See, for instance, K¨othe [26, 15.8].

Exercise 4.4. If $\pi_n$ is the n-projection of $\ell^\infty$, apply the Hahn-Banach theorem to extend $\pi_n \circ T$ to $T_n \in E'$ and define $\tilde T(x) = (T_n(x)) \in \ell^\infty$.

Exercise 4.5. If $u \ne 0$ and Ker u is closed, then x + U ⊂ E \ Ker u for some neighborhood of zero, U, which can be supposed balanced, and then u(U) ⊂ K is also balanced. Show that either u(U) is bounded, and then u is continuous, or u(U) = K, in which case u(x) = −u(y) for some y ∈ U and then x + y ∈ (x + U) ∩ Ker u, a contradiction.

Exercise 4.6. Note that $\{e_n\}$ is a linearly independent system. Check that $\operatorname{Ker}\pi_n = [e_j;\ j \ne n] = \overline{[e_j;\ j \ne n]}$ (show that $x \in \overline{[e_j;\ j \ne n]} \setminus [e_j;\ j \ne n]$ would imply $\pi_n(x)e_n \in \overline{[e_j;\ j \ne n]}$), so that $\pi_n$ is continuous (cf. Exercise 4.5). Observe that x ∈ co(K) if and only if $x = \sum_{n=1}^N \pi_n(x)e_n$ with $\pi_n(x) \ge 0$ and $\sum_{n=1}^N \pi_n(x) \le 1$. To prove that co(K) is closed, if $x = \sum_{n=1}^N \pi_n(x)e_n \in H$, consider the cases (a) $\pi_{n_0}(x) \notin [0,1]$ and (b) $\sum_{n=1}^N \pi_n(x) \notin [0,1]$. Check that $x \in V \subset \operatorname{co}(K)^c$ if $V = \pi_{n_0}^{-1}([0,1]^c)$ in case (a), and $V = \big(\sum_{n=1}^N \pi_n\big)^{-1}([0,1]^c)$ in case (b).

Example: The linear hull of an orthonormal sequence un in a Hilbert space and en =(1/n)un.

Exercise 4.7. If $\tilde x = \lim_n x_n$ and $\tilde y = \lim_n y_n$ in $\tilde H$ with $x_n, y_n \in H$, show that $\{(x_n,y_n)_H\}$ is a Cauchy sequence of numbers and that $(\tilde x,\tilde y) := \lim (x_n,y_n)_H$ is a well-defined inner product.

Exercise 4.8. If $u_x = (\cdot,x)$, then $(a,x)_H = 0$ if and only if $u_x(a) = 0$. Thus, $x \in A^\perp$ if and only if $u_x \in A^o$.

Exercise 4.9. $\|x\|_2 \le \|x\|_V$ and V is dense in $\ell^2$.


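Exercise 4.10. If B is continuous at (0, 0), then $\|B(x,y)\|_G \le 1$ if $\|x\|_E \le \varepsilon$ and $\|y\|_F \le \varepsilon$, for some ε > 0. Then $\|B((\varepsilon/\|x\|_E)x, (\varepsilon/\|y\|_F)y)\|_G \le 1$ (if $x, y \ne 0$), and $\|B(x,y)\|_G \le \varepsilon^{-2}\|x\|_E\|y\|_F$.

Conversely, the condition $\|B(x,y)\|_G \le C\|x\|_E\|y\|_F$ yields

$\|B(x,y) - B(a,b)\|_G \le C\|x - a\|_E\|y\|_F + C\|a\|_E\|y - b\|_F.$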
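
Both directions of this hint rest on two elementary identities for a bilinear map. The display below is only a sketch of that algebra, using the norm subscripts of the hint; nothing beyond bilinearity is assumed:

```latex
% Scaling: for x, y nonzero, bilinearity pulls out the two scalar factors.
\[
B(x,y) = \frac{\|x\|_E\,\|y\|_F}{\varepsilon^{2}}\,
         B\Big(\frac{\varepsilon}{\|x\|_E}\,x,\ \frac{\varepsilon}{\|y\|_F}\,y\Big),
\qquad\text{hence}\qquad
\|B(x,y)\|_G \le \varepsilon^{-2}\,\|x\|_E\,\|y\|_F .
\]
% Telescoping: add and subtract B(a,y).
\[
B(x,y) - B(a,b) = B(x-a,\,y) + B(a,\,y-b) .
\]
```

Applying the bound $\|B(\cdot,\cdot)\|_G \le C\|\cdot\|_E\|\cdot\|_F$ to each term of the second identity gives the final estimate, and the right-hand side tends to zero as (x, y) → (a, b) because $\|y\|_F$ stays bounded near b.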

Exercise 4.11. Suppose $\|f_k^{(j)}\|_{[a,b]} \le 1$ (k ∈ N and 0 ≤ j ≤ m + 1) and use the mean value theorem to apply Theorem 4.28.

Exercise 4.12. Similar to Exercise 4.11.

Exercise 4.13. If m(E) = 0 and μ(E) > 0, then m([a, c] \ E) = c; if also μ(A) = c, then m(A ∪ E) = c but μ(B) > c. To prove that μ = m, show first that also m(A) = c/2 ⇒ μ(A) = c/2 when c ≤ 1/2. If, for instance, m(A1) = c/2 and μ(A1) c/2, μ(A3) c/2, a contradiction. If c > 1/2, extend μ to [0, 2] by defining μ(A) = μ(A − 1) if A ⊂ (1, 2].

Exercise 4.14. The condition is m(E \ F) = 0, and then $d\mu_E = \chi_E\,d\mu_F$.

Exercise 4.15. If $\|x\|_p = 1$, define $y_k = \operatorname{sgn} x_k\,|x_k|^{p-1}$.

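Exercise 4.16. Consider $x_k = 1 - 1/k$.

Exercise 4.17. $|v(x)| \le \|x\|_\infty$ and v extends to $\tilde v \in (\ell^\infty)'$. From $\tilde v(e_k) = \langle e_k, y\rangle = y^k$ it would follow that y = 0.

Exercise 4.18. See Exercise 4.17, with $\mathcal C[a,b]$ as a substitute of c.

Exercise 4.19. If $v \in c_0'$, define $y^k = v(e_k)$. If $x_N = \sum_{k=1}^N x^k e_k \to x$ in $c_0$, so that $v(x) = \lim_N \sum_{k=1}^N x^k y^k = \langle x, y\rangle$, choose $z^k = \operatorname{sgn} y^k$ ($z^k = 0$ if $y^k = 0$), so that $|v(z)| \le \|v\|$, and $|v(x)| \le \|y\|_1\|x\|_\infty$.

Exercise 4.20. Represent the transpose of T as $T' : (y_n) \in \ell^\infty \mapsto (y_n/n) \in \ell^\infty$, with $\operatorname{Im} T' \subset c_0$ and $\operatorname{Ker} T' = \{0\}$.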

Exercise 4.21. Approximate uniformly every g ∈Cc(a, b) by step functions. Exercise 4.22. If K ≥ 0, H¨older’s inequality yields +∞ +∞ +∞ 1/p p K(x, y) dy |g(x)| dx ≤ K(x, y) dx gp . −∞ −∞ −∞ If K(·,y) ∈ Lp(R), consider |K(x, y)|. Exercise 4.23. Note thatμ ˜ has to be finite.


| | ≤ ≤ Exercise 4.24. Let u(1) = sup|g|≤1 u(g) and assume u(1) = 1. If 0 f 1, define g =2f − 1 and then |u(g)|≤1; thus u(f)=(1+u(g))/2 ≥ 0. 1 Exercise 4.25. (a) |ug(f)|≤ | f(z)g(z) dz|≤f ¯ g c .More- 2π γr D(0,r) D(0,r) over, f(z)g(z) dz = f(z)g(z) dz if 

Exercise 4.26. Consider T as a restriction of $T''$.

Exercise 4.27. If $Tf(x) := \int_0^x f(t)\,dt$ (0 ≤ x ≤ 1) and Tf = 0, then differentiation shows that f = 0.

Exercise 4.28. If $K = \{\lambda_1, \lambda_2, \dots\}$, define $T \in \mathcal L(\ell^2)$ such that $Te_k = \lambda_k e_k$.

Exercise 4.29. See Exercise 4.28.

Exercise 4.30. If $K = T(B_E)$, consider $K \subset \bigcup_{k=1}^{N(n)} B(c_k^n, 1/n)$ and $F_n = [c_1^n,\dots,c_{N(n)}^n]$. Show that $\lim_{n\to\infty} \|T - P_{F_n}T\| = 0$.

Chapter 5

Exercise 5.1. Not every $x' \in (\ell^1)'$ attains its norm on the unit sphere of $\ell^1$ (Exercise 4.16).

Exercise 5.2. Since $L^2(\mathbf R) = L^2(\mathbf R)'$ and $L^2(\mathbf R)$ is separable, $B_{L^2(\mathbf R)}$ is w*-compact and metrizable. For the last part, see Exercise 5.7.

Exercise 5.3. Suppose $u(f_n) \not\to 0$, so that there exist δ > 0 and $\{f_{n_k}\}$ so that $n_{k+1} > 2n_k$ and $u(f_{n_k}) > \delta$ (or < −δ). Define $y_N := \sum_{k=1}^N f_{n_k}$, which satisfies $y_N(x) < 4$ everywhere. It follows from $u(f_{n_k}) > \delta$ that $u(y_N) > N\delta$, and u would be unbounded. Hence, $f_n \to 0$ weakly, but $\|f_n\|_{[0,1]} = 1$.

Exercise 5.4. Suppose that g(0) = 0 and choose φ to be zero near 0 and such that $\|g - \varphi\|_{[-1,1]} \le \varepsilon$. Show that (3) implies $\big|\int_{-1}^1 (g - \varphi)h_n\big| \le c\varepsilon$ and $\limsup_n \big|\int_{-1}^1 gh_n\big| \le c\varepsilon$, so that $\langle g, h_n\rangle \to g(0)$. If g(0) = C, write $g = g_0 + C$. For the converse, choose g = 1 to prove (1). To prove (3), apply the uniform boundedness principle to the sequence $\langle\cdot, h_n\rangle$.

Exercise 5.5. Let $\|x\|_E = u(x)$, $\|u\|_{E'} = 1$. Then $\|x\|_E = \lim_n |u(x_n)| \le \liminf_n \|x_n\|_E$.


Exercise 5.6. $K = B_{E'}$ is w*-compact, and $x \mapsto \hat x$ ($\hat x(u) = u(x)$ if $\|u\|_{E'} \le 1$) is a linear isometry of E into $\mathcal C(K)$.

Exercise 5.7. If $x_n \to 0$ in E, then $p_u(x_n) = |u(x_n)| \to 0$ with $u \in E'$. Let $\{e_n\}_{n=1}^\infty$ be an orthonormal system in a Hilbert space. Then $e_n \not\to 0$ in H, but $e_n \to 0$ weakly, since $\sum_{n=1}^\infty |(e_n,x)_H|^2 \le \|x\|_H^2 < \infty$ and $(e_n,x)_H \to 0$.

Exercise 5.8. By Theorem 5.3, the weak closure K of $\operatorname{co}(\{x_n\})$ is also its closure, x ∈ K, and x is the limit of points in $\operatorname{co}(\{x_n\})$.

Exercise 5.9. Consider A ⊂ E as a subset of $E'' = \mathcal L(E', \mathbf K)$.

Exercise 5.10. Assume (b): the graph of T is closed and (a) holds, since if $x_n \to x$ in E and $Tx_n \to y$ in F, then it follows from $u(x_n) \to u(x)$ for all $u \in E'$ that $v(Tx_n) \to v(Tx)$ ($u = v \circ T \in E'$) and also $v(Tx_n) \to v(y)$. Thus, v(Tx) = v(y) for all $v \in F'$, and Tx = y. To show that (b) and (c) follow from (a), note that $p_x(T'u) = |u \circ T(x)| = p_{Tx}(u)$.

Exercise 5.11. Let $U_n = \{\max\{p_{x_1},\dots,p_{x_N}\} < 1/n\}$. If x ∈ E, every ball $\{p_x < \varepsilon\}$ contains some $U_n$; that is, $|u(x_j)| < 1/n\ \forall j \Rightarrow |u(x)| < 1$. Hence $u(x_1) = \cdots = u(x_N) = 0 \Rightarrow u(x) = 0$ and $x \in [x_1,\dots,x_N]$.

Exercise 5.12. None of the semi-norms $\sup_{1\le j\le N} |\langle x, u_j\rangle|$ defining the weak topology can be a norm.

Exercise 5.13. If T is compact and $x_n \to 0$ weakly, every subsequence of $\{Tx_n\}$ has a subsequence which is norm-convergent to 0, so $\|Tx_n\|_H \to 0$.

For the converse note that $B_H$ is weakly compact and metrizable. Then, for every $\{x_k\} \subset B_H$, there is a weakly convergent subsequence, $x_{k_n} \to x$ in $B_H$, so that $x_{k_n} - x \to 0$ weakly and $\|Tx_{k_n} - Tx\|_H \to 0$.

Exercise 5.14. There exists f in $E'$ which strictly separates $K_1$ and $K_2$.

Exercise 5.15. Denote by $\bar B_E$ the weak* closure in $E''$ of the closed unit ball $B_E$ and suppose that K = R. To prove that $\bar B_E = B_{E''}$, let $w_0 \in E''$ be any point not in $\bar B_E$, a weak* closed convex set. Then there is a $u \in E'$, $\|u\|_{E'} = 1$, such that $\sup_{w\in\bar B_E} \langle u, w\rangle < \langle u, w_0\rangle$, and it follows that $\|w_0\|_{E''} > 1$. Therefore $B_{E''} \subset \bar B_E$.

Exercise 5.16. (a) If $B_E$ is weakly compact, use Goldstine's theorem to show that $B_E = B_{E''}$. The converse follows from Alaoglu's theorem.

(b) The weak topology of F is the restriction of the weak topology of E.

(c) If E is reflexive, then $\sigma(E',E) = \sigma(E',E'')$ and $B_{E'}$ is $\sigma(E',E)$-compact and $\sigma(E',E'')$-compact, so that $E'$ is reflexive. If $E'$ is reflexive, then E is a closed subspace of the reflexive space $E''$.


Exercise 5.17. To show that $w = \langle x_0, \cdot\rangle$ for every $w \in H''$, write τu = z if $u = (\cdot,z)_H \in H'$. Then $\tau : H' \to H$ is a bijective skewlinear isometry, and $H'$ is a Hilbert space with the inner product defined by $(u_1,u_2)_{H'} = (\tau u_2, \tau u_1)_H$. Finally, $w(u) = (u,u_0)_{H'} = (\tau u_0, \tau u)_H$ for some $u_0 \in H'$. Choose $x_0 = \tau u_0$.

The mapping $f \in L^1(0,1) \mapsto \langle\cdot, f\rangle \in L^1(0,1)''$ is not exhaustive: if $w = \langle\cdot, f_0\rangle$ for some $f_0 \in L^1(0,1)$, since $L^1(0,1)' = \{u_g;\ g \in L^\infty(0,1)\}$ with $u_g(f) = \int_0^1 f(t)g(t)\,dt$, necessarily $w(u_g) = u_g(f_0)\ \forall g \in L^\infty(0,1)$, and the mapping J of Exercise 4.18 is not exhaustive. As the dual of $L^1(0,1)$, $L^\infty(0,1)$ is not reflexive (Exercise 5.16(c)).

Exercise 5.18. For the last part, see K¨othe [26, 22.4(2)].

Exercise 5.19. Choose $f_n(t) = e^{int}$ and apply the Riemann-Lebesgue lemma.

Exercise 5.20. If $\int_0^1 fg = 0$ for all $g \in \mathcal C[0,1]$, with $f \in L^1(0,1)$, then also $\int_a^b f = 0$ for every (a, b) ⊂ (0, 1) and it follows that f = 0. In $L^\infty(0,1)$ a limit of continuous functions is also continuous.

See the proof of Theorem 5.11 to solve Exercises 5.21–5.24.

Chapter 6

Exercise 6.1. $\psi'(x) = g(x)g(1 - r - x) \ge 0$, $\psi(x) = \int_0^{1-r} g(t)g(1 - r - t)\,dt = C$, a constant, if x ≥ 1 − r, and ψ(x) = 0 if and only if x ≤ 0. Hence $\varrho(x) = 0$ if and only if x ≤ −1 or x ≥ 1, and $\varrho(x) = C^2$, a constant, if −r ≤ x

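Exercise 6.2. Since $\mathcal C_c(\Omega)$ is dense in $L^p(\Omega)$ (Corollary 2.13), approximate every $f \in \mathcal C_K(\Omega) \subset \mathcal C_c(\mathbf R^n)$ by $f * \varrho_\varepsilon$ as in Theorem 6.2, with $\operatorname{supp} f * \varrho_\varepsilon \subset \Omega$.

Exercise 6.3. The distance $d(\varphi,\psi) := \sum_{N=0}^\infty 2^{-N}\|\varphi - \psi\|_N/(1 + \|\varphi - \psi\|_N)$ defines the topology. Choose $\varrho \in \mathcal D_{[0,1]}(\mathbf R)$ and show that the test functions $\varphi_N = \sum_{k=0}^N \tau_k\varrho$ form a Cauchy sequence in the new topology but $\{\varphi_N\}$ is not convergent in the convergence we are considering for test functions.

Exercise 6.4. Note that $|\varphi_n^{(m)}| \le n^{-m-1}\sup_{\mathbf R} |\varphi^{(m)}|$ and that $\bigcup_n \operatorname{supp}\varphi_n$ is unbounded.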

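Exercise 6.5. The family $\mathcal T$ of all unions of sets of the form φ + U ($\varphi \in \mathcal D(\Omega)$, $U \in \mathcal U$) is a topology on $\mathcal D(\Omega)$. To check (a), first show that if $\varphi \in G_1 \cap G_2$ with $G_1, G_2 \in \mathcal T$, then $\varphi + U \subset G_1 \cap G_2$ for some $U \in \mathcal U$.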

To prove that every Cauchy sequence is contained in some $\mathcal D_K(\Omega)$, suppose that this is not the case, so that there are terms $\varphi_k$ of the sequence and distinct points $x_k \in \Omega$ (k ∈ N) such that $\varphi_k(x_k) \ne 0$ and with no limit points in Ω. Then $U := \{\varphi;\ k|\varphi(x_k)| < |\varphi_k(x_k)|\} \in \mathcal U$, since $K \in \mathcal K(\Omega)$ contains finitely many points $x_k$ and then $U \cap \mathcal D_K(\Omega) \in \mathcal T$.

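Exercise 6.6. (a) By the Leibniz formula,
$f_n^{(m)}(t) = \frac{c_n}{n!} \sum_{j=0}^{m} \binom{m}{j} \frac{n!}{(n-j)!}\,t^{n-j}\,r_n^{m-j}\,\varphi^{(m-j)}(r_nt)$
and $\operatorname{supp}\varphi(r_nt) \subset [-R/r_n, R/r_n]$.

(b) $r_n^{m-n} \le 1/r_n$ when mp, so that $\sum_{n=1}^\infty f_n$ is uniformly convergent on compact subsets and $f \in \mathcal E(\mathbf R)$. Also $f_n(t) = c_nt^n/n!$ if $|t| \le 1/r_n$, so that $f_n^{(p)}(0) = 0$ when $p \ne n$ and $f_n^{(n)}(0) = c_n$.

Exercise 6.7. Note that $|\langle\varphi, f\rangle| \le \big(\int_K |f|\big)q_0(\varphi)$ and $|\langle\varphi, \mu\rangle| \le |\mu|(K)q_0(\varphi)$ for every $\varphi \in \mathcal D_K(\Omega)$.

Exercise 6.8. We would have $\int \varphi f = 0$ if φ(a) = 0 and then, as in Theorem 6.5, f = 0 a.e. on $\mathbf R^n \setminus \{a\}$, that is, f = 0 a.e. but $\delta_a \ne 0$.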

Exercise 6.9. Apply $\delta'$ to the test functions $\psi_n(t) = -\sin(nt)\varphi(t)$, with [−1/2, 1/2] ≺ φ ≺ (−1, 1). Then $\delta'(\psi_n) = n$ and, if $\delta' = \mu$,
$n \le \Big|\int_{[-1,1]} \varphi(t)\sin(nt)\,d\mu(t)\Big| \le |\mu|([-1,1]).$

Exercise 6.10. Write $\int K_\lambda(x)\varphi(x)\,dx - \varphi(0) = \int K_\lambda(x)(\varphi(x) - \varphi(0))\,dx$ and see the proof of Theorem 2.41.

Exercise 6.11. If supp φ ⊂ [−n, n] and N ≥ n, then $\sum_{k=-N}^{N} \delta_k(\varphi) = \sum_{k=-n}^{n} \varphi(k)$. We can define the distribution $\varphi \mapsto \sum_{k=-\infty}^{+\infty} \varphi(k)$, which equals $\sum_{k=-n}^{+n} \delta_k$ on $\mathcal D_{[-n,n]}(\mathbf R)$.

Exercise 6.12. The order is ≤ m, since $|\langle\varphi, \delta^{(m)}\rangle| \le \sum_{j=0}^{m} |\varphi^{(j)}(0)|$. To show that it is > m − 1, apply $\delta^{(m)}$ to the functions $\psi_k$, where $[-1,1] \prec \varrho \prec (-2,2)$ and $\psi_k(x) = k^{1-m}\varrho(x)\cos(kx)$ if m is even, else $\psi_k(x) = k^{1-m}\varrho(x)\sin(kx)$.

Exercise 6.13. The distribution u satisfies $|u(\varphi)| \le C_K \sum_{n=0}^{N} \|\varphi^{(n)}\|_K = C_Kq_N(\varphi)$ if $\varphi \in \mathcal D_K$ and K ⊂ (1/N, ∞). To prove that there is no N ∈ N such that the above estimate holds for every compact set K ⊂ (0, ∞), apply Exercise 6.12. There is no extension v of u to R, since the continuity of v on $\mathcal D_{[-2,2]}$ would imply that the order should be finite.


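Exercise 6.14. Let $K \prec \varrho \prec \Omega$, $|\varphi| \le \|\varphi\|_K\varrho$ for every $\varphi \in \mathcal D_K(\Omega)$. If φ is real, then $-\|\varphi\|_K\varrho \le \varphi \le \|\varphi\|_K\varrho$ and $|u(\varphi)| \le u(\varrho)\|\varphi\|_K$. If φ is not real, then $|u(\varphi)| \le 2u(\varrho)\|\varphi\|_K$.

Exercise 6.15. Let $\psi \in \mathcal C[-a,b]$ be such that ψ(t) = (φ(t) − φ(0))/t if $t \ne 0$. Then $u_2(\varphi) := \int_{-a}^{b} \frac{\varphi(t) - \varphi(0)}{t}\,dt = \int_{-a}^{b} \psi(t)\,dt$ defines a distribution on R, and $u_f \in \mathcal D'(\mathbf R)$ as the sum of three distributions. If $0 \notin \operatorname{supp}\varphi$, then ψ(t) = φ(t)/t and $u_f(\varphi) = \int_{\mathbf R} \varphi(t)/t\,dt$.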

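Exercise 6.16. (a) If $\varphi_k \to 0$ in $\mathcal D(\mathbf R)$, define $\psi_k(t) := (\varphi_k(t) - \varphi_k(0))/t$ ($\varphi_k'(0)$ if t = 0), continuous and such that $|\psi_k(t)| \le \|\varphi_k'\|_{[-r,r]}$, to prove that $u_r(\varphi_k) \to 0$. (b) Observe that $\int_{\varepsilon\le|t|\le r} \frac{\varphi(t) - \varphi(0)}{t}\,dt \to \int_{-r}^{r} \frac{\varphi(t) - \varphi(0)}{t}\,dt$ if ε ↓ 0.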

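Exercise 6.17. Write

$u_f(\varphi_k) = \int_{\{|x|\ge 1\}} f(x)\varphi_k(x)\,dx + \int_{\{|x|<1\}} f(x)\big(\varphi_k(x) - T_m\varphi_k(x)\big)\,dx$

to prove that $u_f(\varphi_k) \to 0$ as $\varphi_k \to 0$ in $\mathcal D(\mathbf R^n)$. Note that if $0 \notin \operatorname{supp}\varphi$, then

$u_f(\varphi) = \int_{\mathbf R^n} f(x)\varphi(x)\,dx = \langle\varphi, f\rangle,$

since $\varphi(x) - T_m\varphi(x) = \varphi(x)$.

Exercise 6.18. If supp φ ⊂ [−N, N], then $-\int_0^\infty \varphi' = -\varphi(N) + \varphi(0) = \varphi(0) = \delta(\varphi)$, so that $Y' = \delta$. By induction, $Y^{(n+1)}(\varphi) = \pm\varphi^{(n)}(0)$.

Exercise 6.19. If supp φ ⊂ [−n, n], then $\int_{[-n,n]} f(t)\,dt = 2\int_0^n \log t\,dt = 2(n\log n - n)$ and $f \in L^1_{loc}(\mathbf R)$. Moreover $-\langle\varphi', f\rangle = -\lim_{\varepsilon\downarrow 0} \int_{\varepsilon\le|t|\le n} \varphi'(t)f(t)\,dt$ and
$\int_{\varepsilon\le|t|\le n} \varphi'(t)f(t)\,dt = -\varphi(\varepsilon)\log\varepsilon + \varphi(-\varepsilon)\log\varepsilon - \int_{\varepsilon\le|t|\le n} \frac{\varphi(t)}{t}\,dt \to -\Big\langle\varphi, \mathrm{vp}\,\frac{1}{t}\Big\rangle$
since $|\log\varepsilon\,(\varphi(-\varepsilon) - \varphi(\varepsilon))| \le 2\varepsilon|\log\varepsilon|\,\|\varphi'\|_{[-n,n]} \to 0$ as ε → 0.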

Exercise 6.20. If supp φ ⊂ [−a, a], Δ = {−a ≤ x ≤ y ≤ a}, and $I = \iint_\Delta \varphi'(x)f'(y)\,dx\,dy$, then by Fubini's theorem
$I = \int_{\mathbf R} f'(y)\varphi(y)\,dy = -\int_{\mathbf R} \varphi'(x)f(x)\,dx.$
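
The two iterated integrals behind the Fubini step can be written out as follows. This is a sketch under the assumption (not stated in the hint) that f is regular enough, for instance absolutely continuous on [−a, a], for the fundamental theorem of calculus to apply to $f'$; the test function vanishes at ±a because supp φ ⊂ [−a, a]:

```latex
% Integrate first in x over -a <= x <= y, using phi(-a) = 0.
\[
I = \int_{-a}^{a} f'(y)\Big(\int_{-a}^{y} \varphi'(x)\,dx\Big)dy
  = \int_{-a}^{a} f'(y)\,\varphi(y)\,dy .
\]
% Integrate first in y over x <= y <= a, then use phi(a) = phi(-a) = 0.
\[
I = \int_{-a}^{a} \varphi'(x)\Big(\int_{x}^{a} f'(y)\,dy\Big)dx
  = \int_{-a}^{a} \varphi'(x)\,\big(f(a) - f(x)\big)\,dx
  = -\int_{-a}^{a} \varphi'(x)\,f(x)\,dx .
\]
```

The term $f(a)\int_{-a}^{a} \varphi'(x)\,dx = f(a)(\varphi(a) - \varphi(-a))$ vanishes, which is why only the second integral survives.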

Exercise 6.21. $\langle\varphi, D^\alpha u\rangle := (-1)^{|\alpha|}\langle D^\alpha\varphi, u\rangle$.

Exercise 6.22. If $P(D) = cD^\alpha$, note that $\langle\varphi, cD^\alpha u\rangle = (-1)^{|\alpha|}\langle D^\alpha(c\varphi), u\rangle$.

Exercise 6.23. See Theorems 6.17 and 6.18.


Exercise 6.24. If u(φ) = 0 when supp φ ⊂ G, then $D^\alpha u(\varphi) = \pm u(D^\alpha\varphi) = 0$, since $\operatorname{supp} D^\alpha\varphi \subset G$.

Exercise 6.25. supp Y =[0, ∞) and supp δ = {0}.

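Exercise 6.26. If $v \in \mathcal E'(\Omega)$ and $\operatorname{supp} v \prec \varrho \prec \operatorname{Int} K$, then $v = \varrho v$. This yields $|v(\varphi)| \le C_Kq_N(\varphi)$ for every $\varphi \in \mathcal D_K(\Omega)$ and the Leibniz formula shows that $q_N(\varrho\varphi) \le Mq_N(\varphi)$. Hence $|v(\varphi)| = |v(\varrho\varphi)| \le C_KMq_N(\varphi)$.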

Exercise 6.27. $1 * (\delta' * Y) = 1$ and $(1 * \delta') * Y = 0$. Two of the supports are not compact.
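
The two values can be traced step by step. The computation below is a sketch that assumes the standard facts that δ is the unit for convolution, that convolving with $\delta'$ amounts to differentiating, and that the Heaviside function Y satisfies $Y' = \delta$:

```latex
% Left grouping: delta' * Y = Y' = delta, and delta is the convolution unit.
% Right grouping: 1 * delta' is the derivative of the constant 1, which is 0.
\[
1 * (\delta' * Y) = 1 * Y' = 1 * \delta = 1,
\qquad
(1 * \delta') * Y = (1)' * Y = 0 * Y = 0 .
\]
```

Only $\delta'$ has compact support here, while 1 and Y do not, so this does not contradict associativity of the convolution in the cases where at most one factor has noncompact support.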

Exercise 6.28. See Theorem 6.18.

Exercise 6.29. $Y_m(x) \to Y(x)$ if $x \ne 0$ and $|Y_m(x)| \le \int_{-N}^{N} |d_m(t)|\,dt < \infty$ on [−N, N] (N ≥ 1). Then $Y_m \to Y$ in $\mathcal D'(\mathbf R)$ by dominated convergence. Moreover $Y_m' \to \delta$ in $\mathcal D'(\mathbf R)$ and $Y_m' = d_m$ (integrate by parts in $\int_{[a,b]} \varphi'(t)Y_m(t)\,dt$, when $Y_m$ is $C^1$ or absolutely continuous and $Y_m' = d_m$ in the distributional sense). Finally, $d_m(t) := \frac{1}{\pi}\,\frac{m}{m^2t^2 + 1}$ is an approximation of δ.

Exercise 6.30. Two different solutions are u = 1 and u = Y.

Exercise 6.31. The general solution of P(D)F = 0 is $F(t) = Ae^t + B\cos t + C\sin t$. From $f''(0) = 1$ and $f(0) = f'(0) = 0$ we obtain A = 1/2, B = C = −1/2. Hence, $E(t) = F(t) + \big(\tfrac12 e^t - \tfrac12\cos t - \tfrac12\sin t\big)Y(t)$.
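
The coefficients can be checked by writing out the linear system given by the three initial conditions. This is only a verification sketch for the general solution $f(t) = Ae^t + B\cos t + C\sin t$ and the conditions stated in the hint:

```latex
% Evaluate f, f', f'' at t = 0 and impose f(0) = f'(0) = 0, f''(0) = 1.
\begin{align*}
f(t)   &= Ae^{t} + B\cos t + C\sin t, & f(0)   &= A + B = 0,\\
f'(t)  &= Ae^{t} - B\sin t + C\cos t, & f'(0)  &= A + C = 0,\\
f''(t) &= Ae^{t} - B\cos t - C\sin t, & f''(0) &= A - B = 1.
\end{align*}
% Adding the first and third equations gives 2A = 1.
\[
A = \tfrac12, \qquad B = -\tfrac12, \qquad C = -\tfrac12 .
\]
```

With these values, $f(t) = \tfrac12(e^t - \cos t - \sin t)$ is the function multiplied by Y(t) in the fundamental solution.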

Exercise 6.32. The general solution of P(D)F = 0 is $F(t) = Ae^{-t} + Bte^{-t} + Ct^2e^{-t}$. From $f''(0) = 1$ and $f(0) = f'(0) = 0$ we obtain A = B = 0 and C = 1/2. Hence, $E(t) = F(t) + \big(\tfrac12 t^2e^{-t}\big)Y(t)$.

Exercise 6.33. Take derivatives under the integral.

Exercise 6.34. Check that $\partial_{\bar z}f = \pi\delta$ as follows: Define the continuous functions $f_n(z) = 1/z$ if |z| > 1/n and $f_n(z) = n^2\bar z$ if |z| ≤ 1/n. Then show that $\partial_{\bar z}f_n = n^2\chi_{\bar D(0,1/n)}$, which tend to πδ as n → ∞. Finally prove that $f_n \to 1/z$ in $\mathcal D'(\mathbf R^2)$ (use dominated convergence).

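Exercise 6.35. It is easy to see that $F(y) := \langle\varphi(x,y), u(x - cy)\rangle$ defines a continuous function and that $\langle\varphi_k, v\rangle \to 0$ as $\varphi_k \to 0$ in $\mathcal D(\mathbf R^2)$ (use dominated convergence). If $u \in \mathcal E(\mathbf R)$, a direct computation shows that $\partial_t^2v - c^2\partial_x^2v = 0$. When $u \in L^1_{loc}(\mathbf R)$ and L is the wave operator, use the fact that $\langle\varphi, Lv\rangle = \langle L\varphi, v\rangle$.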

Exercise 6.36. Apply Theorem 6.31.


Exercise 6.37. Write $E_2 = E_2^0 + E_2^\infty$, $E_2^0 = \chi_{\{|x|\le 1\}}E_2$. To check that $u = f * E_2^0 + f * E_2^\infty$ is defined and locally integrable, note that $f * E_2^\infty$ exists everywhere and is locally bounded, since log |x − y| − log |y| → 0 as y → ∞. Every function $f_n := f\chi_{\{|x|\le n\}}$ satisfies the same conditions as f and $u_n := f_n * E_2 \to u$ in the distributional sense, by dominated convergence, so that also $\Delta u_n \to \Delta u$ in $\mathcal D'(\mathbf R^2)$. Moreover $\langle u_n, \Delta\varphi\rangle = \langle E_2, \Delta(\tilde f_n * \varphi)\rangle = \langle f_n, \varphi\rangle$. Therefore $\Delta u = \lim_n \Delta u_n = \lim_n f_n = f$.

Exercise 6.38. Write $u_r(y) = u(ry)$, choose $g \in \mathcal C(S)$ so that $\|f - g\|_p \le \varepsilon$, and let $v(x) = \int_S P(x,y)g(y)\,d\sigma(y)$. Then

$\|f - u_r\|_p \le \|f - g\|_p + \|g - v_r\|_p + \|v_r - u_r\|_p \le 2\varepsilon + \|v_r - u_r\|_p$

when r is close to 1. Prove that $\int_S P(rx,y)\,d\sigma(y) = \int_S P(rx,y)\,d\sigma(x) = 1$ (use the mean value theorem) to show that $f \in L^p(S) \mapsto u_r \in L^p(S)$ has norm 1, so that $\|v_r - u_r\|_p \to 0$ as g → f in $L^p(S)$.

Chapter 7

Exercise 7.1. Note that $f_1$ is the derivative of $-e^{-t^2}/2$, $f_2$ is the translation of a dilation of $\chi_{(-1/2,1/2)}$, and $f_3$ and $f_4$ are related to the Poisson kernel.

Exercise 7.2. Take derivatives under the integral, and use the properties of summability kernels.

Exercise 7.3. Compare $K_t$ with the Gauss-Weierstrass kernel $W_h$.

Exercise 7.4. $\|\hat f\|_\infty \le \|f\|_1$ and $\hat W = W$, so that $\|\mathcal F\| = 1$.

Exercise 7.5. If $f = \hat\varphi$, $g = \hat\psi$, and supp φ ∩ supp ψ = ∅, then f ∗ g = 0.

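Exercise 7.6. If $g(x) = x^\beta D^\alpha\varphi(x)$ is in $\mathcal S(\mathbf R^n)$, then also $g \in \mathcal S(\mathbf R^n) \subset L^p(\mathbf R^n)$. Moreover, $\|\varphi\|_p = \|\omega_{-2N}\omega_{2N}\varphi\|_p \le Cq_N(\varphi)$ if Np > n/2. If $\omega_{2N}\varphi \in L^p(\mathbf R^n)$ for all N, then $\varphi \in L^1(\mathbf R^n)$. If the functions $x^\beta D^\alpha\varphi(x)$ are in $L^1(\mathbf R^n) \cap \mathcal E(\mathbf R^n)$, then φ is bounded, and so is every $x^\beta D^\alpha\varphi(x)$.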

Exercise 7.7. From Qf = P ∈S(Rn), P a polynomial, it follows that P =0.

Exercise 7.8. Apply the closed graph theorem.

Exercise 7.9. Write $f = \varrho_Rf$ with $[-R,R] \prec \varrho_R \prec (-R-1, R+1)$ and apply the Fourier transform to $f^{(n)}$.

Exercise 7.10. Consider $f = f\chi_{\{|f|>1\}} + f\chi_{\{|f|\le 1\}} \in L^1(\mathbf R^n) + L^2(\mathbf R^n)$. Since $\|\hat f\|_\infty \le \|f\|_1$ and $\|\hat f\|_2 = \|f\|_2$, the result follows from the Riesz-Thorin theorem by choosing $\theta = 2/p'$.


Exercise 7.11. F(sinc) = F(sinc) = χ(−1/2,1/2) is not continuous, and  sinc 2 = χ(−1/2,1/2)2 =1. Exercise 7.12. Define v(x, y)=u(x, y)+y. ∈ p ⊂ ∞ | |≤| |≤ | | ∈ Exercise 7.13. If x[k]   ,then x[k] x[k] C k .Ifϕ D   N → → D [−N,N](R), then ϕ, ux = k=−N x[k]ϕ(k) 0asϕ 0in [−N,N](R). Exercise 7.14. See Exercise 7.10. Exercise 7.15. Since 1/2 k=∞ |ϕ(t − k)| dt = ϕ1 < ∞, − 1/2 k=−∞ +∞ − − ϕ1(t)= k=−∞ ϕ(t k) is well-defined on interval ( 1/2, 1/2] and the series converges a.e. and in L1. The Fourier coefficients are +∞ 1/2 −2kπit ck(ϕ1)= ϕ(t − k)e dt = ϕ(k) − k=−∞ 1/2 +∞ | | ∞ ∈S and k=−∞ ϕ(k) < since ϕ (R). Therefore the Fourier series is +∞ +∞ uniformly convergent and k=−∞ ϕ(k)=ϕ1(0) = k=−∞ ϕ(k), the sum of the Fourier series at the origin. Note that (d) follows from (c). N N | | N | | N | |p 1/p Exercise 7.16. In R the norms k=1 xk ,maxk=1 x k, and ( k=1 xk ) are equivalent.

Exercise 7.17. $|u(x) - u(y)| \le |x - y|^{1/p'}\|u'\|_p$.

Exercise 7.18. As in Theorem 7.25, $v(t) = \int_c^t u'(s)\,ds + C$ is continuous and $|v(t)| \le \|u'\|_1 + |C|$. Also $W^{1,p}(a,\infty) \hookrightarrow \mathcal C[a,\infty) \cap L^\infty(a,\infty)$ is continuous, by the closed graph theorem.

Exercise 7.19. By Theorem 7.25, $Ru \in \mathcal C(\mathbf R)$, and $\|Ru\|_p^p = 2\|u\|_p^p$. If $v = u'$, the distributional derivative of u on (0, ∞), define $v(t) := -v(-t)$ if t < 0. Then
$Ru(x) - u(0) = u(-x) - u(0) = \int_x^0 v(-t)\,dt = \int_0^x v(t)\,dt \qquad (x < 0)$
and $Ru(x) - u(0) = u(x) - u(0) = \int_0^x v(t)\,dt$ also holds if x ≥ 0. Then $(Ru)' = v$, $\|(Ru)'\|_p^p = 2\|u'\|_p^p$, and $Ru \in W^{1,p}(\mathbf R)$ with $\|Ru\|_{1,p} = 2^{1/p}\|u\|_{1,p}$.

Exercise 7.20. If $u_k \to 0$ in $H_0^1(\Omega)$ and $\bar u_k \to v$ in $H^1(\mathbf R^n)$, then v = 0.

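Exercise 7.21. $(\partial_jPu)|_\Omega = (\partial_ju)|_\Omega$ and $(\partial_jPu)|_{\bar\Omega^c} = (\partial_ju)|_{\bar\Omega^c} = 0$, so that $(\partial_jPu)|_{\partial\Omega^c} = (\partial_ju)|_{\partial\Omega^c}$ and ∂Ω is a null set. Hence $\partial_jPu = P\partial_ju$ in $L^2(\mathbf R^n)$ and as distributions.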


Exercise 7.22. Choose u = 1, a constant on (−1, 1). Then $u^o \in L^2(\mathbf R)$, but $(u^o)' = \delta_{-1} - \delta_1 \notin L^2(\mathbf R)$, since it is not a function.

Exercise 7.23. Use the closed graph theorem.

Exercise 7.24. The Fourier transform of u is of type $(1 + |\xi|^2)^{-1}$ and
$\int_0^\infty \big[(1 + \xi^2)^{-1}\omega_s(\xi)\big]^2\,d\xi = \int_0^\infty (1 + \xi^2)^{s-2}\,d\xi < \infty$
if s < 3/2.

Exercise 7.25. $B(u,u) = \int_\Omega \big(|\nabla u(x)|^2 + |u(x)|^2\big)\,dx = \|u\|_{H_0^1}^2$, coercive.
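
The threshold s < 3/2 in the hint to Exercise 7.24 can be read off from the behaviour of the integrand at infinity. The following sketch assumes the weight is $\omega_s(\xi) = (1 + \xi^2)^{s/2}$, which is the choice consistent with the displayed identity:

```latex
% Assumed weight: omega_s(xi) = (1 + xi^2)^{s/2}.
\[
\big[(1+\xi^2)^{-1}\,\omega_s(\xi)\big]^2 = (1+\xi^2)^{s-2}
\sim \xi^{\,2s-4} \quad (\xi \to \infty),
\]
% A power xi^a is integrable at infinity exactly when a < -1.
\[
\int_1^\infty \xi^{\,2s-4}\,d\xi < \infty
\iff 2s - 4 < -1
\iff s < \tfrac32 .
\]
```

On [0, 1] the integrand is bounded for every s, so only the tail decides convergence.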

Chapter 8

Exercise 8.1. If e = (e, 0) is the unit in $A \subset A_1$, then $\delta e = e \ne \delta$.

Exercise 8.2. Note that χ(1)χ(1) = χ(1) for every χ ∈ Δ(C1), and χ(δ)=1. Here 1 = (1, 0) and δ =(0, 1).

Exercise 8.3. (a) $z \in A(D)$ but $\bar z \notin A(D)$. (b) Suppose χ ∈ Δ and let g(z) = z. Then $\sigma(g) = \bar D$, χ(P) = P(α), and, by continuity, $\chi(f) = f(\alpha) = \delta_\alpha(f)$. (c) If $\alpha \in \bar D$, then $|f_j(\alpha)| > 0$ for at least one j and $\chi(f_j) \ne 0$. Then $J = f_1A(D) + \cdots + f_nA(D)$ is not contained in a maximal ideal, since there is no $\delta_\alpha$ such that $\delta_\alpha(f_j) = 0$ for every j (1 ≤ j ≤ n).

Exercise 8.4. Suppose e ∗ f = f for every $f \in L^1(\mathbf R)$. From the properties of the Fourier transform, $\hat e\varphi = \varphi$ for every $\varphi \in \mathcal S(\mathbf R)$, which implies $\hat e = 1$.

Exercise 8.5. Similar to Exercise 8.4, with Fourier coefficients.

Exercise 8.6. The unit is δ = {δ[k]} such that δ[0] = 1 and δ[k] = 0 if $k \ne 0$.

Exercise 8.7. Note that $L_{ab} = L_aL_b$, $L_e = I$, and $\|L_a\| = \sup_{\|x\|\le 1} \|ax\| \le \|a\|$ and $L_ae = a$.

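Exercise 8.8. (a) $|\gamma_\chi(\alpha + \beta) - \gamma_\chi(\alpha)| = |\chi(\tau_{-\alpha}(\tau_{-\beta}u - u))| \le \|\tau_{-\beta}u - u\|_1 \to 0$ as β → 0 and $\gamma_\chi(\alpha + \beta) = \chi(\tau_{-\alpha}u * \tau_{-\beta}u) = \gamma_\chi(\alpha)\gamma_\chi(\beta)$. It follows from $\gamma_\chi(\alpha)\gamma_\chi(-\alpha) = \gamma_\chi(0) = 1$ and from $\gamma_\chi(n\alpha) = \gamma_\chi(\alpha)^n$ that $|\gamma_\chi(\alpha)| = 1$ ($\gamma_\chi$ is bounded). (b) A continuous solution $\gamma : \mathbf R \to \mathbf T$ of γ(α + β) = γ(α)γ(β) has the form $\gamma(\alpha) = e^{i\xi\alpha}$ for some ξ ∈ R. The following steps lead to a proof: (1) $\gamma(n\alpha) = \gamma(\alpha)^n$. (2) $\omega_n := \operatorname{Arg}(\gamma(2^{-n})) \to 0$ as n → ∞. (3) $2\omega_{n+1} - \omega_n \in 2\pi\mathbf Z$. (4) $2\omega_{n+1} = \omega_n$ for all n ≥ p for some p ∈ N. (5) $\omega_n = 2^{p-n}\omega_p$ for all n ≥ p. (6) If $\xi = 2^p\omega_p$, then $\gamma(\alpha) = e^{i\xi\alpha}$ for all $\alpha \in \{m2^{-n};\ m, n \in \mathbf Z,\ n \ge p\}$, which is a dense subset of R.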


∗ ∗ − (c) From u f(ξ)= Rf(α)τ−αu(ξ) dα, χ(f)=χ(u f)= Rf(α)γχ( a) dα and χ is continuous on L1(R). c c −1 c −1 {| | Exercise 8.9. If Fα = Gα,then Gα = f (Fα). Also μ(f ( z > − f∞})) = 0 and μ(f 1({|z| > |λ|})) =0if |λ| < f∞. Finally, λ ∈ σ(f) iff 1/(f(x) − λ) exists a.e. and is bounded a.e.

Exercise 8.10. See Rickart [35, I.7].

Exercise 8.11. $\chi(e) = 1$ follows from $0 \neq \chi(a) = \chi(a)\chi(e)$.

Exercise 8.12. See Exercise 1.8.

Exercise 8.13. For every $f \in \mathcal{C}(K)$, $\hat f(\delta_t) = f(t)$ is continuous with respect to the initial compact topology, which coincides with the coarser Gelfand topology. Moreover, $\hat f(\delta_t) = f(t)$ shows that $\|\hat f\|_\Delta = \|f\|_K$ and $\mathcal{C}(\Delta) = \mathcal{G}(\mathcal{C}(K))$.

Exercise 8.14. $U$ is open in $\Delta$ because $U = \{\chi;\ |\hat z(\chi)| < 1\}$, where $z$ denotes the coordinate function. The rotation $z \mapsto \xi z$ induces an isomorphism $f \mapsto f(\xi\,\cdot)$ of $H^\infty(U)$, and the adjoint of this isomorphism maps $\Delta_1$ onto $\Delta_\xi$.

Exercise 8.15. To prove that every $\chi \in \Delta(W)$ has the form $\chi = \delta_\tau$, let $u(t) = e^{it}$, so that $\|u\| = \|1/u\| = 1$ and $|\chi(u)| = 1$, since $|\chi(u)|, |\chi(1/u)| \le 1$. There is some $\tau \in \mathbf{R}$ such that $\chi(u) = e^{i\tau} = u(\tau)$ and $\chi(u^k) = \chi(u)^k = u^k(\tau)$. Hence $\chi(P) = P(\tau)$ if $P(t) = \sum_{|k| \le N} c_ke^{ikt}$. These trigonometric polynomials are dense in $W$, so $\chi(f) = \delta_\tau(f)$ for every $f \in W$. If $f$ does not vanish at any point, then $f$ has an inverse in $W$ which must be $1/f$.

Exercise 8.16. $\delta_t = \delta_s$ if and only if $t - s = 2k\pi$, and $\delta_tf = \delta_zF$. Hence $\Delta = \mathbf{T}$ and $\hat f = f$ (or $F$) as in Exercise 8.13. But not every function in $\mathcal{C}(\mathbf{T})$ is the sum of an absolutely convergent Fourier series, and $W$ is dense in $\mathcal{C}(\mathbf{T})$ (use the Stone-Weierstrass theorem). This implies that $\mathcal{G}$ cannot be an isometry.

Exercise 8.17. From the properties of the resolvent, there is a number M>0 such that (λe − x)−1

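Exercise 8.18. Note that $\|\hat a\|_\Delta = r(a) = \lim_n \|a^{2^n}\|^{1/2^n}$. If $\mathcal{G}$ is an isometry, so that $\|\hat a\|_\Delta = \|a\|$, it is clear that $\|a^2\| = \|\widehat{a^2}\|_\Delta = \|\hat a\|_\Delta^2 = \|a\|^2$.

Exercise 8.19. $e^*x = (x^*e)^* = x^{**} = x$ and $(x^{-1})^*x^* = (xx^{-1})^* = e^* = e$.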

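Exercise 8.21. If $M_n = P_n(H)$ and $M = \bigoplus M_n$, then $y \in M$ if and only if
$$y = \sum_{n=1}^\infty y_n, \quad\text{where } \sum_{n=1}^\infty \|y_n\|^2 < \infty \text{ and } y_n \in M_n.$$
Consider $y_n \in M_n$ such that $\sum_{n=1}^\infty \|y_n\|^2 < \infty$, with $y_n \perp y_m$ when $n \neq m$. Then $S_N = \sum_{n=1}^N y_n$ is a Cauchy sequence, since $\|S_p - S_q\|^2 = \sum_{n=q+1}^p \|y_n\|^2 \le \sum_{n=q+1}^\infty \|y_n\|^2 \to 0$ as $q \to \infty$ (with $p > q$).

Show that $M := \{\sum_{n=1}^\infty y_n;\ \lim_N \|S_N\| < \infty\}$ is the smallest closed subspace which contains every $M_n$. Then $Px := \sum_{n=1}^\infty P_nx \in M$, $Px = x$ if and only if $x \in M$, $P^2 = P$, and $(Px_1, x_2)_H = (x_1, Px_2)_H$.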

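Exercise 8.22. See Theorem 4.36.

Exercise 8.23. For $B \subset K$, $(E(B)f, g)_H = \int_K \chi_Bf\bar g = \int_K f\,\overline{\chi_Bg} = (f, E(B)g)_H$ and $E(B)E(B)f = \chi_B\chi_Bf = E(B)f$, so that $E(B)$ is an orthogonal projection. Also, $E(A \cap B)f = \chi_{A \cap B}f = \chi_A\chi_Bf = E(A)E(B)f$. Moreover, $E(\biguplus_{n=1}^\infty B_n)\,\cdot = \chi_{\biguplus_{n=1}^\infty B_n}\,\cdot = \sum_{n=1}^\infty \chi_{B_n}\,\cdot$.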

Exercise 8.24. $E(B) = 0$ if and only if $(E(B)x, x)_H = 0$ for every $x \in H$.

Exercise 8.25. Try $T = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.

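Exercise 8.26. By Theorem 8.25, $T = T^*$ and $\sigma(T) \subset [0, \infty)$. Note that $0 \le \hat T \in \mathcal{C}(\sigma(T))$, $\hat T = f^2$ for a unique $f \in \mathcal{C}(\sigma(T))$, $f \ge 0$, and there is a unique $S \in \langle T\rangle$ such that $\hat S = f$ and $\hat S \ge 0$, which is equivalent to $S \ge 0$.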

Exercise 8.27. On $\sigma(T) = \sigma_{\langle T\rangle}(T)$, let $\lambda = p(\lambda)s(\lambda)$ with $p(\lambda) = |\lambda| \ge 0$ and $|s(\lambda)| = 1$ everywhere. Define $P = p(T)$ and $U = s(T)$.

Chapter 9

Exercise 9.1. If $F = \overline{\mathcal{D}(T)}$ is not the whole space $H$, write $H = F \oplus F^\perp$ and consider two different operators on $F^\perp$ to extend $T$ from $F$ to $H$.

Exercise 9.2. Since $|(Tx, y)_H| \le \|x\|_H\|Ty\|$, $\{Tx;\ \|x\|_H \le 1\}$ is weakly bounded and it is bounded by the uniform boundedness principle.

Exercise 9.3. A reduction to Example 9.2 is obtained by considering the Fourier transform of $\chi_{(n,n+1)}$.

Exercise 9.4. Note that $y \in (\operatorname{Im} T)^\perp$ if and only if $x \in \mathcal{D}(T) \mapsto (Tx, y)_H = 0$, so $y \in \mathcal{D}(T^*)$ and $T^*y = 0$.

Exercise 9.5. $\lambda \in \sigma(T)^c$ if $|\lambda - \lambda_0| < 1/\|R_T(\lambda_0)\|$; then $d(\lambda_0, \sigma(T)) \ge 1/\|R_T(\lambda_0)\|$.

Exercise 9.6. To show that $\mathcal{D}(T^*) \subset \mathcal{D}(T)$, let $y^* = T^*y$ for any $y \in \mathcal{D}(T^*)$, and choose $x \in \mathcal{D}(T)$ so that $Tx = y^*$. Then $y = x$ since, for every $z \in \mathcal{D}(T)$, $(Tz, y)_H = (z, y^*)_H = (Tz, x)_H$, that is, $(u, y)_H = (u, x)_H$ for all $u$.

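Exercise 9.7. $\overline{\operatorname{Im} T} = (\operatorname{Ker} T)^\perp = H$. Let $y \in \mathcal{D}((T^{-1})^*)$; then $(T^{-1}x, y)_H = (x, y^*)_H$ for any $x \in \mathcal{D}(T^{-1})$ and $(z, y)_H = (Tz, y^*)_H$ if $z \in \mathcal{D}(T)$ ($z = T^{-1}x$), where $y^* \in \mathcal{D}(T)$ and $Ty^* = y$ since $T = T^*$. Then $y \in \operatorname{Im} T = \mathcal{D}(T^{-1})$ and $(T^{-1})^*y = y^* = T^{-1}y$, so that $(T^{-1})^* = T^{-1}$.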

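Exercise 9.8. Let $\lambda \notin \sigma_p(T)$, assume that $F = \operatorname{Im}(T - \lambda I)$ is not dense, and choose $0 \neq y \in F^\perp$. But $(Tx - \lambda x, y)_H = 0$ for all $x \in \mathcal{D}(T)$ and, since $\lambda \in \mathbf{R}$ and $T = T^*$, also $(x, (T - \lambda I)y)_H = 0$, and for $x = (T - \lambda I)y$ we obtain $Ty = \lambda y$, a contradiction to $\lambda \notin \sigma_p(T)$.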

Exercise 9.9. Show that $U(f) = f(A)z$ defines a bijective isometry (note that $U[t^n] = A^nz$) and check that $AUf = U[tf(t)]$.

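Exercise 9.10. Cf. the constructions in Yosida [44, XI.5].

Exercise 9.11. Note that $\int_0^1 i\varphi'(t)\overline{\psi(t)}\,dt = \int_0^1 \varphi(t)\overline{i\psi'(t)}\,dt$ to check that $S \subset S^*$. Denote $V(x) = \int_0^x v^*(t)\,dt$ ($v \in \mathcal{D}(S^*)$, $v^* = S^*v$), and choose $u = 1$ in
$$\int_0^1 iu'(t)\overline{v(t)}\,dt = u(1)\overline{V(1)} - \int_0^1 u'(t)\overline{V(t)}\,dt \qquad (u \in \mathcal{D}(S))$$
to show that $V(1) = 0$. It follows that $iv - V \in (\operatorname{Im} S)^\perp = \{0\}$ and $\mathcal{D}(S^*) = H_0^1(0, 1)$.

Exercise 9.12. See Exercise 9.11.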

Exercise 9.13. Show that $-D^2f = \lambda f$ with the conditions $f(0) = f(1) = 0$ has the solutions $\lambda_n = \pi^2n^2$, $f_n(x) = \sin(\pi nx)$ and prove that $\{f_n;\ n = 1, 2, 3, \ldots\}$ is an orthogonal total system in $L^2(0, 1)$.
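
The eigenfunctions of Exercise 9.13 can be checked numerically before the proof is attempted. The sketch below, in plain Python, approximates $-f_n''$ by a central difference and the $L^2(0, 1)$ inner product by a midpoint rule, so that the relations $-f_n'' = \pi^2n^2f_n$ and $(f_m, f_n)_2 = 0$ for $m \neq n$ can be observed. The helper names, the step size `h` and the number of quadrature nodes are choices made for this illustration; the check does not address totality.

```python
import math

def f(n, x):
    """Candidate eigenfunction f_n(x) = sin(pi n x); note f_n(0) = f_n(1) = 0."""
    return math.sin(math.pi * n * x)

def neg_second_derivative(n, x, h=1e-4):
    """Central difference approximation of -f_n''(x)."""
    return -(f(n, x + h) - 2.0 * f(n, x) + f(n, x - h)) / (h * h)

def inner(m, n, steps=2000):
    """Composite midpoint rule for the L^2(0,1) inner product (f_m, f_n)."""
    dx = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        total += f(m, x) * f(n, x)
    return total * dx

# compare -f_2''(0.3) with pi^2 * 2^2 * f_2(0.3)
lhs = neg_second_derivative(2, 0.3)
rhs = (math.pi * 2) ** 2 * f(2, 0.3)
print(lhs, rhs)

# orthogonality of f_1 and f_2, and the squared norm of f_3
print(inner(1, 2), inner(3, 3))
```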

Exercise 9.14. Consider $-D^2f = \lambda f$ with the conditions $f'(0) = f'(1) = 0$.

Exercise 9.15. Show that the operator has at least two different self-adjoint extensions, obtained in Exercises 9.13 and 9.14.

Exercise 9.16. $(T\psi, \psi)_2 = (\psi', \psi')_2 + \int_0^1 V(x)|\psi(x)|^2\,dx \ge \|\psi\|_2^2$. Indeed, $|\psi(t)| \le \int_0^t |\psi'(x)|\,dx \le t^{1/2}\|\psi'\|_2$; hence $\int_0^1 |\psi(t)|^2\,dt \le \|\psi'\|_2^2$.
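
The two estimates in the hint to Exercise 9.16 can be tested on a concrete function. The following Python sketch uses the hypothetical choice $\psi(t) = \sin(\pi t)$, which satisfies $\psi(0) = 0$ (the condition that the pointwise bound needs, assumed here to be part of the domain of $T$), and approximates the integrals with a midpoint rule. All names are illustrative.

```python
import math

def midpoint_integral(g, a=0.0, b=1.0, steps=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    dx = (b - a) / steps
    return sum(g(a + (k + 0.5) * dx) for k in range(steps)) * dx

def psi(t):
    # test function with psi(0) = 0 (an assumption of this sketch)
    return math.sin(math.pi * t)

def dpsi(t):
    # derivative of psi
    return math.pi * math.cos(math.pi * t)

# L^2(0,1) norm of psi' and squared L^2(0,1) norm of psi
norm_dpsi = math.sqrt(midpoint_integral(lambda t: dpsi(t) ** 2))
norm_psi_sq = midpoint_integral(lambda t: psi(t) ** 2)

# pointwise bound |psi(t)| <= t^(1/2) * ||psi'||_2 on a grid of [0, 1]
pointwise_ok = all(
    abs(psi(k / 100)) <= math.sqrt(k / 100) * norm_dpsi + 1e-12
    for k in range(101)
)

print(pointwise_ok, norm_psi_sq, norm_dpsi ** 2)
```

Integrating the square of the pointwise bound over $(0, 1)$ gives the factor $\int_0^1 t\,dt = 1/2$, so the inequality stated in the hint holds with room to spare.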

Exercise 9.17. $Q_j$ and $P_j$ are $Q$ and $P = \frac{h}{2\pi i}D$ in the case $n = 1$.

Exercise 9.18. Prove that $A_1$ and $A_2$ commute if and only if $A_2$ commutes with every $E^1(B_1)$, and then $A_2$ commutes with $E^1(B_1)$ if and only if $E^1(B_1)$ commutes with every $E^2(B_2)$. Indeed,
$$(A_2A_1x, y)_H = \int t\,dE^1_{x,A_2y}, \qquad (A_1A_2x, y)_H = \int t\,dE^1_{A_2x,y}$$
and also
$$(A_2E^1(B)x, y)_H = E^1_{x,A_2y}(B), \qquad (E^1(B)A_2x, y)_H = E^1_{A_2x,y}(B).$$
It follows from $A_1A_2 = A_2A_1$ that $AA_2 = A_2A$ for all $A \in \langle A_1\rangle = \{g(A_1);\ g \in \mathcal{C}(\sigma(A_1))\}$, and then $E^1_{x,A_2y} = E^1_{A_2x,y}$. If $A_2E^1(B) = E^1(B)A_2$, then also $E^1_{x,A_2y} = E^1_{A_2x,y}$.

Exercise 9.19. Show first that every $E(B_1 \times B_2)$ is an orthogonal projection and that $E$ satisfies the conditions (1)–(3) of spectral measures on $\mathbf{R}^2$ (see Subsection 8.5.2) on Borel sets of type $B = B_1 \times B_2$. Then extend $E$ to all Borel sets in $\mathbf{R}^2$ as in the construction of scalar product measures.

Exercise 9.20. Solution: $Au = -u'$ with $\mathcal{D}(A) = H^1(\mathbf{R})$. Note that $u \in H^1(\mathbf{R})$ when $u \in L^2(\mathbf{R})$ and the distributional limit $\lim_{h \to 0} h^{-1}[u(x - h) - u(x)]$ exists in $L^2(\mathbf{R})$.

Exercise 9.21. $Au = gu$, and $\mathcal{D}(A) = \{u \in L^2(\mathbf{R});\ gu \in L^2(\mathbf{R})\}$.

Bibliography

[1] R. A. Adams, Sobolev Spaces, Academic Press, Inc., 1975.
[2] N. I. Akhiezer and I. M. Glazman, Theory of Linear Operators in Hilbert Space, Dover, 1993.
[3] S. Banach, Théorie des opérations linéaires, Monografje Matematyczne, Warsaw, 1932, and Chelsea Publishing Co., 1955.
[4] S. K. Berberian, Lectures in Functional Analysis and Operator Theory, Springer-Verlag, 1974.
[5] H. Brezis, Analyse fonctionnelle: Théorie et applications, Masson, 1983.
[6] J. Cerdà, Análisis Real, Edicions de la Universitat de Barcelona, 1996.
[7] J. B. Conway, A Course in Functional Analysis, Springer-Verlag, 1985.
[8] R. Courant and D. Hilbert, Methods of Mathematical Physics, Oxford University Press, 1953.
[9] J. Dieudonné, Foundations of Modern Analysis, Academic Press, 1960.
[10] J. Dieudonné, History of Functional Analysis, North-Holland, 1981.
[11] P. A. M. Dirac, The Principles of Quantum Mechanics, Clarendon Press, 1930.
[12] N. Dunford and J. T. Schwartz, Linear Operators: Part 1, Interscience Publishers, Inc., 1958.
[13] R. E. Edwards, Functional Analysis, Theory and Applications, Holt, Rinehart and Winston, 1965.
[14] G. B. Folland, Introduction to Partial Differential Equations, Princeton University Press and University of Tokyo Press, 1976.

[15] I. M. Gelfand and G. E. Chilov, Generalized Functions, Academic Press, New York, 1964 (translated from the 1960 Russian edition).
[16] I. M. Gelfand, D. A. Raikov and G. E. Chilov, Commutative Normed Rings, Chelsea Publishing Co., New York, 1964.
[17] D. Gilbarg and N. S. Trudinger, Elliptic Partial Differential Equations of Second Order, Springer-Verlag, 2001 (revision of the 1983 edition).
[18] P. R. Halmos, Measure Theory, D. Van Nostrand Company, Inc., 1950.
[19] E. Hille and R. S. Phillips, Functional Analysis and Semigroups, Amer. Math. Soc. Colloquium Publ., vol. 31, 1957.
[20] L. Hörmander, Linear Partial Differential Operators, Springer-Verlag, 1963.
[21] J. Horvath, Topological Vector Spaces and Distributions, Addison Wesley, 1966.
[22] L. Kantorovitch and G. Akilov, Analyse fonctionnelle, "Mir", 1981 (translated from the 1977 Russian edition).
[23] T. Kato, Perturbation Theory for Linear Operators, Springer-Verlag, 1976.
[24] J. L. Kelley, General Topology, Van Nostrand, Princeton, NJ, 1963.
[25] A. N. Kolmogorov and S. V. Fomin, Elements of the Theory of Functions and Functional Analysis, Graylock Press, 1961 (translated from the 1960 Russian edition).
[26] G. Köthe, Topological Vector Spaces I, Springer-Verlag, 1969.
[27] P. D. Lax, Functional Analysis, Wiley, 2002.
[28] E. H. Lieb and M. Loss, Analysis, Graduate Studies in Mathematics, American Mathematical Society, 1997.
[29] V. Mazya, Sobolev Spaces, Springer-Verlag, 1985.
[30] R. Meise and D. Vogt, Introduction to Functional Analysis, The Clarendon Press, Oxford University Press, 1997 (translated from the 1992 German edition).
[31] A. F. Monna, Functional Analysis in Historical Perspective, John Wiley & Sons, 1973.
[32] M. A. Naimark, Normed Rings, Erven P. Noordhoff, Ltd., 1960 (translation of the 1955 Russian edition).
[33] E. Prugovečki, Quantum Mechanics in Hilbert Space, Academic Press, 1971.
[34] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Academic Press, 1972.
[35] C. E. Rickart, General Theory of Banach Algebras, D. Van Nostrand Company, Inc., 1960.

[36] F. Riesz and B. Sz. Nagy, Leçons d'analyse fonctionnelle, Akadémiai Kiadó, Budapest, 1952, and Functional Analysis, F. Ungar, 1955.
[37] H. L. Royden, Real Analysis, The Macmillan Company, 1968.
[38] W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
[39] W. Rudin, Real and Complex Analysis, McGraw-Hill Book Company, 1966.
[40] L. Schwartz, Théorie des distributions, Hermann & Cie, 1969 (new edition, 1973).
[41] E. M. Stein and G. Weiss, Introduction to Fourier Analysis on Euclidean Spaces, Princeton University Press, 1971.
[42] A. E. Taylor and D. C. Lay, Introduction to Functional Analysis, John Wiley & Sons, Inc., second edition, 1980.
[43] J. von Neumann, Mathematical Foundations of Quantum Mechanics, Princeton University Press, 1996 (translation of the 1932 German edition).
[44] K. Yosida, Functional Analysis, Springer-Verlag, 1968.


Functional analysis studies the algebraic, geometric, and topological structures of spaces and operators that underlie many classical problems. Individual functions satisfying specific equations are replaced by classes of functions and transforms that are determined by the particular problems at hand. This book presents the basic facts of linear functional analysis as related to fundamental aspects of mathematical analysis and their applications. The exposition avoids unnecessary terminology and generality and focuses on showing how the knowledge of these structures clarifies what is essential in analytic problems. The material in the first part of the book can be used for an introductory course on functional analysis, with an emphasis on the role of duality. The second part introduces distributions and Sobolev spaces and their applications. Convolution and the Fourier transform are shown to be useful tools for the study of partial differential equations. Fundamental solutions and Green's functions are considered and the theory is illustrated with several applications. In the last chapters, the Gelfand transform for Banach algebras is used to present the spectral theory of bounded and unbounded operators, which is then used in an introduction to the basic axioms of quantum mechanics. The presentation is intended to be accessible to readers whose backgrounds include basic linear algebra, integration theory, and general topology. Almost 240 exercises will help the reader in better understanding the concepts employed.

American Mathematical Society www.ams.org

GSM/116 Real Sociedad Matemática Española www.rsme.es