<<

Ralph Chill, Eva Fasangovˇ a´

Gradient Systems

– 13th International Internet Seminar –

February 10, 2010

c by R. Chill, E. Faˇsangov´a

Contents

0 Introduction ...... 1

1 systems in ...... 5 1.1 Thespace Rd ...... 5 1.2 Fr´echetderivative ...... 7 1.3 Euclideangradient...... 8 1.4 Euclideangradientsystems ...... 9 1.5 Algebraicequationsand steepest descent ...... 12 1.6 Exercises ...... 13

2 Gradient systems in finite dimensional space ...... 15 2.1 Definitionofgradient ...... 17 2.2 Definitionofgradientsystem...... 18 2.3 Local existence for ordinary differential equations ...... 18 2.4 Local and global existence for gradientsystems ...... 21 2.5 Newton’smethod...... 23 2.6 Exercises ...... 25

3 in infinite dimensional space: the Laplace on a bounded interval ...... 27 3.1 Definitionofgradient ...... 28 3.2 Gradientsofquadraticforms ...... 30 3.3 Sobolevspacesinonespacedimension ...... 31 3.4 The Dirichlet- on a bounded interval ...... 33 3.5 The Dirichlet-Laplace operator with multiplicative coefficient ..... 35 3.6 Exercises ...... 37

4 Gradients in infinite dimensional space: the Laplace operator and the p-Laplace operator ...... 39 4.1 Sobolevspacesinhigherdimensions ...... 39 4.2 TheDirichlet-Laplaceoperator ...... 42

v vi Contents

4.3 The Dirichlet-p-Laplaceoperator ...... 44 4.4 Anauxiliaryresult...... 46 4.5 OriginsoftheLaplaceoperator...... 49 4.6 Exercises ...... 52

5 Bochner-Lebesgue and Bochner-Sobolev spaces ...... 55 5.1 TheBochnerintegral...... 56 5.2 Bochner-Lebesguespaces ...... 58 5.3 Bochner-Sobolevspaces in one space dimension ...... 59 5.4 Exercises ...... 63

6 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions ...... 65 6.1 Gradient systems in infinite dimensional spaces ...... 65 6.2 Global existence and uniqueness of solutions for gradient systems withconvexenergy...... 66 6.3 Exercises ...... 75

7 equations ...... 77 7.1 Heatconduction...... 77 7.2 Thereaction-diffusionequation...... 80 7.3 The Perona-Malikmodelin imageprocessing...... 81 7.4 Global existenceand uniquenessof solutions ...... 82 7.5 Exercises ...... 84

8 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions II ...... 87 8.1 Global existence and uniqueness of solutions for gradient systems withellipticenergy ...... 88 8.2 Exercises ...... 97

9 Regularity of solutions ...... 99 9.1 Interpolation...... 99 9.2 Timeregularity–Angenent’strick ...... 101 9.3 Exercises ...... 104

10 Existence of local solutions ...... 107 10.1 The local inverse theorem and existence of local solutions of abstractdifferentialequations ...... 108 10.2 Gradients of quadratic forms and identification of the space.. . 109 10.3 Asemilineardiffusionequation...... 112 10.4 Exercises ...... 117 Contents vii

11 Asymptotic behaviour of solutions of gradient systems ...... 119 11.1 The ω- set of a continuous on R+ ...... 120 11.2 A stabilisation result for global solutions of gradientsystems ...... 121 11.3 Nonstabilisation results for global solutions of gradient systems. . . . 123 11.4 Exercises ...... 126

12 Asymptotic behaviour of solutions of gradient systems II ...... 129 12.1 The Łojasiewicz-Simon inequality and stabilisation of global solutionsofgradientsystems...... 130 12.2 The Łojasiewicz-Simon inequality in Hilbert spaces ...... 133 12.3 Exercises ...... 138

13 The universe, soap bubbles and curve shortening ...... 141

References ...... 151

A Primerontopology ...... 1001 A.1 Metricspaces...... 1001 A.2 Sequences,convergence...... 1004 A.3 Compactness ...... 1006 A.4 Continuity ...... 1007 A.5 Completionofametricspace...... 1008

B Banach spaces and bounded linear operators ...... 1011 B.1 Normedspaces...... 1011 B.2 Productspacesandquotientspaces...... 1017 B.3 Boundedlinearoperators ...... 1020 B.4 TheArzela-Ascolitheorem ...... 1025

C on Banach spaces ...... 1029 C.1 Differentiable functions between Banach spaces ...... 1029 C.2 Local inverse theorem and theorem ...... 1030 C.3 *Newton’smethod...... 1034

D Hilbert spaces ...... 1035 D.1 Innerproductspaces ...... 1035 D.2 Orthogonaldecomposition...... 1039 D.3 *Fourierseries ...... 1044 D.4 LinearfunctionalsonHilbertspaces...... 1048 D.5 WeakconvergenceinHilbertspaces ...... 1049

E Dual spaces and weak convergence ...... 1053 E.1 ThetheoremofHahn-Banach ...... 1053 E.2 Weak∗ convergenceand the theorem of Banach-Alaoglu ...... 1059 E.3 Weakconvergenceandreflexivity ...... 1060 E.4 *Minimizationofconvexfunctionals...... 1065 viii Contents

E.5 *ThevonNeumannminimaxtheorem...... 1068

F The theorem (BY WOLFGANG ARENDT) ...... 1071 F.1 Open sets of class C1 ...... 1071 F.2 Thedivergencetheorem ...... 1074 F.3 Proofofthedivergencetheorem ...... 1077

G Sobolev spaces ...... 1083 G.1 Test functions, convolution and regularization ...... 1083 G.2 Sobolevspacesinonedimension ...... 1086 G.3 Sobolevspacesinseveraldimensions...... 1092 G.4 * Elliptic partial differentialequations ...... 1094

H Bochner-Lebesgue and Bochner-Sobolev spaces ...... 1097 H.1 TheBochnerintegral...... 1097 H.2 Bochner-Lebesguespaces ...... 1100 H.3 Bochner-Sobolevspaces in onedimension...... 1102

Index ...... 1105 Lecture 0 Introduction

Philosophy is written in that gigantic book which is perpetually open in front of our eyes (I allude to the universe), but no one can understand it who does not strive beforehand to learn the language and recognize the letters in which it is written. It is written in mathematical language, and its letters are triangles, circles and other geometrical figures, and without these means it is humanly impossible to understand any of it; without them, all we can do is to wander aimlessly in an obscure labyrinth. Galileo Galilei

The evolution of many systems from , biology, chemistry, engineering, economics, social sciences can be rewritten in the mathematical language which uses ordinary or partial differential equations. Due to the analysis of the differen- tial equations we have the hope that we need not wander aimlessly in an obscure labyrinth but that we understand the corresponding evolution problems. In order to illustrate this statement, think of the motion of a pendulum, for example the oscillations of the great lantern in the nave of Pisa cathedral. At the end of the 16th century, after mere observation of this lantern and probably other pendula, Galilei conjectured that on the one hand the period of one swing is proportional to the length of the pendulum and on the other hand this period is independent of the amplitude of the swing. The conjecture about the precise period of one swing would have many consequences if it was true, especially since it is connected to the problem of measuring time. But how could Galilei confirm his conjecture? Could or can it be confirmed mathematically? Discovered some 30 years after Galilei by Newton and Leibniz, the (the calculus of fluxions, as it was called by Newton) turned out to be an appropriate mathematical language into which the oscillations of the great lantern in the nave of Pisa cathedral could be translated. At least then, by the mathematical analysis of the corresponding , there was an explanation why Galilei’s conjecture was very close to the truth but not quite the truth; in particular, there was an explanation why the period of one swing does depend on the amplitude. Moreover, the differential calculus could be used to construct noncircular pendula for which the period does not depend on the amplitude, that is, to solve the so-called

1 2 0 Introduction tautochrone problem1. But the success of the new language of differential calculus may be illustrated by a second classical example: the motion of planets and stars. The problem to describe and predict their motions fascinated astronomers since the ancient times, but there was, like in the case of the pendulum, a language missing to explain all the observations of the great astronomers. Starting from his law of gravitation and his second law of equilibrium of , and by translating these laws into the mathematical language of differential equations, Newton was able to confirm Kepler’s laws about the elliptic orbits and the regular motions of the planets in our solar system.

Today, more than three centuries after Newton wrote down his treatise on the theory of fluxions, and after the rise of the , it has turned out that a large number of evolution models may be expressed in the mathematical language of ordinary and partial differential equations. Besides the evolution models from , the mechanics of points and solids, we want to mention other phenomena such as the evolution of heat or diffusing particles, the evolution of waves (water waves, electromagnetic waves, the oscillations of elastic materials), the evolution of cells or other populations, the evolution of soap bubbles or other surfaces. These are only a few, however important examples besides many others. Moreover, it is natural to expect that the list of all phenomena which may be expressed in the mathematical language using ordinary and partial differential equations is still open. For all the phenomena, one aspect, which is mentioned in Galilei’s citation, is important for our : the mathematical analysis of the various differential equations and the interpretations of the qualitative properties which we deduce from this analysis allow us to understand the various evolution phenomena in a deeper way. One aim of this course is to make a small step in this process of understanding, as far as gradient systems are concerned. We believe that the class of gradient systems forms a fundamental class within the differential equations.

The of evolution models into the mathematical language using ordinary or partial differential equations very often starts from first principles like the equilibrium of forces, the conservation of total energy, the conservation of total mass or the conservation of total number of individuals. One may therefore be surprised when discovering that the resulting differential equations are dissipative in the sense that a characteristic quantity - like for example again some form of energy - is strictly decreasing in time, unless the system is at rest. Although in the theory of dynamical systems the word dissipative is used in a wider sense, we think it reflects well the existence of a strictly decreasing quantity: the Concise Oxford Dictionary translates the verb dissipate by disperse, dispel, (cause to) disappear, (cloud, vapour, care, fear, darkness); break up entirely, bring or come

1 To tell the truth, by geometrical arguments, Huygens disproved Galilei’s conjecture before the discovery of differential calculus and in addition solved the tautochrone problem. Only later, by us- ing differential calculus, Lagrange, Euler and Abel gave analytic proofs of the fact that the cycloid is the solution to this problem. 0 Introduction 3 to nothing; squander (money); fritter away (energy, attention). Galilei’s lantern in Pisa, like any other mechanical pendulum, stabilizes and finally comes to rest because of inner friction forces and because of the friction between the lantern and the surrounding air. The friction forces, however small they are compared to the leading gravitational , can not be completely neglected when translating Newton’s second law into a second order differential equation. The analysis of this differential equation shows that the friction is the reason for the dissipation of the total energy of the pendulum ( plus ). For a similar reason, a vibrating string on a guitar or a violon eventually ceases to vibrate if let alone, and the same is true for a drum or any other vibrating plate.

The simplest partial differential equation arising from the problem of heat con- duction is the linear

ut ∆u = 0. − It is a result of the principle of conservation of total heat and of Fourier’s law, as we will see in this course. At the same time we shall see that the linear heat equation is a dissipative evolution equation. Both properties, conservation (of heat) and dissi- pation (of energy), are not necessarily contradictory: experiments confirm that due to heat conduction in a physical material the heat distribution in an isolated volume eventually tends to a uniform, constant distribution. In other words, there is an ef- fect of stabilization of the heat distribution, while the total amount of heat remains conserved. The effect of stabilization can be explained by the existence of a charac- teristic quantity which is dissipated by the heat equation, and it is only a side-order remark that in the case of the linear heat equation there are actually infinitely many such quantities. But even more is true, as we shall see: the linear heat equation, as well as the more general diffusion equation

ut divc(x, ∇u )∇u = 0, − | | the Cahn-Hilliard equation from phase separation models, the mean equation, and many other dissipative parabolic evolution equations can be written as abstract gradient systems.

The abstract gradient system

u˙ + ∇E (u)= 0 is the prototype example of a dissipative evolution equation. In this abstract differential equation, E is a real-valued function defined on an open subset of Rd, of a , or of a finite or infinite dimensional . The function E plays the role of the characteristic quantity mentioned above. Its gradient ∇E depends on the ambient metric and will be defined later; for the moment being it suffices to think of the case that E is defined on Rd and ∇E is the usual euclidean 4 0 Introduction

∂ ∂ gradient: ∇E (u) = ( E (u),..., E (u)). ∂ u1 ∂ ud Let u be a solution of the above gradient system, that is, u is a differentiable Rd-valued function defined on an interval I R which satisfies the equationu ˙ + ∇E (u)= 0 everywhere on I. If E (u) denotes⊆ the composition of the functions E and u, then one calculates d E (u(t)) = ∇E (u(t)),u˙(t) dt h i = u˙(t),u˙(t) 0; −h i≤ here we denoted by , the euclidean inner product, we used the and the assumption thath· ·iu is a solution. This simple but fundamental relation shows that the function E (u) is strictly decreasing in time unless the solution u is constant, which would mean that the underlying evolution is at rest. In other words, the quantity E , which may be some form of energy, is dissipated by the gradient system. This fundamental property of a gradient system, namely dissipativity, but also the gradient structure itself may be used to prove existence and uniqueness of solutions, and to obtain qualitative results about their regularity and their long-time behaviour, like the stabilization or non-stabilization of solutions.

Let us end this Introduction by a philosophical question. The dissipation of some form of energy is a characteristic property of many evolution phenomena, and many results which will be proved in this course remain true in a more general context of dissipative evolution equations; we see this as a motivation for our approach, too. But at the same time some questions arise: how much larger is the class of dissipative evolution equations compared to the class of gradient systems? How to show that some dissipative ordinary or partial differential equation is a gradient system? And how to show that it is not a gradient system? Very probablywe will not be able to answer these farreaching questions which are already interesting for the Navier-Stokes equations. Nevertheless, by modifying a well known citation from Einar Hille we would like to say: we hail a gradient system whenever we see one and we seem to see them everywhere2 ... We hope that the reader of this course will share our impression.

2 The original citation from the 1948 edition of his book on Analysis is: I hail a semi- group whenever I see one and I seem to see them everywhere. This vision has been largely shared by many respectful mathematicians. Lecture 1 Gradient systems in euclidean space

We start by recalling the definition of the of a real-valued function on Rd, we define the euclidean gradient of such a function, we define what we mean by a euclidean gradient system, and we present an example of such a gradient system. We believe that the definition of derivative and euclidean gradient should be well known for the reader, and we shall in the sequel use several results from differential calculus, like for example the chain rule, the , and also the definition of the derivative of a function between finite (or infinite) dimensional spaces without proving or even stating them (some results, however, can be found in the Appendix). By recalling the definition of derivative and gradient, despite of our believe that they are well known, we only want to make sure from the very beginning that they are different objects. Let us say it in this form: a function comes always along with its derivative, while this is not immediately the case for a gradient which depends in addition on the geometry of the ambient space on which the function is defined. First properties of the gradient system

u˙ + ∇eucE (u)= 0 are studied. We introduce the concept of energy for a general ordinary differential equation, that is, a quantity which is decreasing along solutions of gradient systems. The phenomenon of a decreasing quantity is very frequent in many examples of ordinary differential equations which the nature provides to us, and it is this phe- nomenon and its consequences which are discussed in several places of this course.

1.1 The space Rd

We fix some notation in Rd. Elements of Rd are denoted in one of the following ways:

5 6 1 Gradient systems in euclidean space

u1 . u or (ui) or (ui)1 i d or u1,...,ud or . ; ≤ ≤  .  u   d  in particular, we do not distinguish between column vectorsand row vectors.

The euclidean inner product on Rd is given by

d d u,v = u,v euc = ∑ uivi for every u = (ui), v = (vi) R . h i h i i=1 ∈

The subscript euc may be omitted if it is clear from the context that the euclidean in- ner product is meant. An inner product, and in particular the euclidean inner product on Rd, induces a norm by setting u := u,u 1/2. d k k hd i We denote by (R )′ the of R , that is, the space of all linearfunctionals Rd R. Given a linear functional u (Rd ) , we write → ′ ∈ ′

u′(u) or u′ u or u′,u or u′,u Rd Rd h i h i( )′, d for the value of u′ at the point u. The dual space (R )′ is equipped with the norm

u′ Rd = sup u′,u . ( )′ k k u 1 |h i| k kRd ≤

d d It may happen that we simply use for the norm both on R and on (R )′, for example, if it is clear from the context,k·k and especially from the element to which the norm is applied, which norm is meant. Similarly, there will be no confusion if we use , for an inner product of two elements of Rd, and for the duality between h· ·i d d an element in (R )′ and an element in R .

d d Continuity of functions from or to R or (R )′ is always meant with respect to the topology induced by a norm. Recall on this occasion that on a given finite d d dimensional space like R or (R )′ there is in fact only one such topology since any two norms on this space are equivalent.

d Lemma 1.1 (Representation lemma). For every linear functional u′ (R )′ there exists a unique vector u Rd such that ∈ ∈ d u′(v)= u,v euc for every v R . (1.1) h i ∈ Proof. For every u Rd we define a linear functional u (Rd ) by ∈ ′ ∈ ′ d u′(v) := u,v euc, v R . h i ∈ Due to the bilinearity and definiteness of the euclidean inner product, the mapping 1.2 Fr´echet derivative 7

d d J : R (R )′, → u u′ 7→ d d thus defined is linear and injective. Since R and (R )′ are both d-dimensional (d < ∞), the mapping J is actually a linear isomorphism. This means that every linear d d functional u′ on R is represented by some unique element u R via the identity (1.1) and the lemma is proved. ∈

1.2 Frechet´ derivative

Let U be an open subset of Rd, and let E : U R be a function. We say that E is differentiable if for every u U there exists a→ linear functional u (Rd) such that ∈ ′ ∈ ′ E (u + h) E (u) u ,h lim − − h ′ i = 0. (1.2) h 0 h k k→ k k By definition, if a function is differentiable, then in the neighbourhood of every point u U it can be written as the sum of a constant term, a linear term, and a rest of order∈o(h). The function h E (u + h) is approximated by the affine function h E (u)+ u ,h , with a rest7→ of order o(h), so that a particular case of Taylor’s 7→ h ′ i theorem (Taylor expansion up to order 1) is true. The functional u′, which represents the linear term, is uniquely determined (exercise!). d The derivative is the function function E ′ : U (R )′ which assigns to every u U the unique linear functional u (Rd ) for which→ (1.2) holds. Consequently, ∈ ′ ∈ ′ we denote the derivative of E at a point u by E ′(u). We say that E is continuously differentiable if E is differentiable and if the derivative is continuous from U into d (R )′. The set of all continuously differentiable functions U R is a vector space which is denoted by C1(U). →

Given u U, we say that the function E admits a in di- rection h ∈Rd if the limit ∈ ∂E E (u +th) E (u) (u) := lim − ∂h t 0 t → ∂ E ∂ E exists. If h = ei is a canonical basis vector, then we call (u)=: (u) also the ∂ ei ∂ ui of E with respect to ei. It is an exercise to show that if E is differentiable, then E admits at every u U directional in all directions h Rd and ∈ ∈ ∂E (u)= E (u)h. ∂h ′ 8 1 Gradient systems in euclidean space 1.3 Euclidean gradient

There are several ways to introduce the gradient of a , and the most familiar but least flexible is perhaps the one from Lemma 1.2 below. But the gradient is an object, as we already pointed out, which depends on the geometry of the ambient space. This is euclidean space here, with the euclidean inner product and the euclidean norm. The following definition seems therefore to be more appropriate.

Let U be an open subset of Rd, and let E : U R be a differentiable function. → The euclidean gradient of E is the function ∇eucE which assigns to every point d u U the unique element ∇eucE (u) R such that ∈ ∈ d E ′(u)v = ∇eucE (u),v euc for every v R . (1.3) h i ∈

By the Representation Lemma (Lemma 1.1) the euclidean gradient ∇eucE is well defined in the sense that it exists and that it is unique. We emphasize that the eu- clidean gradient at some point u is an element of Rd and that this vector has to be d distinguished from the derivative E ′(u) which is an element of the dual space (R )′. d d It should be clear that the two sets R and (R )′ are different from the set theo- retical point of view. On the other hand, as it is often done in mathematics, one may identify the two spaces via some linear isomorphism. One natural isomorphism is d given via the euclidean inner product , euc on R and the Representation Lemma (Lemma 1.1). h· ·i The following lemma gives an alternative expression of the euclidean gradient in terms of partial derivatives.

Lemma 1.2 (Euclidean gradient and partial derivatives). For every u U, ∈ ∂ E ∂ E ∇eucE (u) = ( (u),..., (u)). ∂ e1 ∂ ed ∇ E E Proof. Let euc (u) i be the i-th component of the euclidean gradient of . By definition of the euclidean inner product and by definition of the canonical basis ∇ E ∇ E vectors ei we have euc (u) i = euc (u),ei euc. Moreover, by definition of the partial derivatives and the euclideanh gradient, fori every 1 i d,  ≤ ≤ ∂E (u)= E ′(u)ei ∂ei = ∇eucE (u),ei euc h i ∇ E = euc (u) i. This was the claim. 

More important than the previous identification of the gradient is the property that the gradient direction coincides with the direction of steepest ascent, that is the direction into which E has greatest directional derivative. The last line of the 1.4 Euclidean gradient systems 9 following lemma characterizes the euclidean gradient in terms of the derivative and the steepest ascent direction.

Lemma 1.3 (Euclidean gradient and steepest ascent). Assume that E is differ- entiable at u U and that E ′(u) = 0. Then there exists a unique steepest ascent direction, that∈ is, there exists a unique6 vector v Rd with v = 1 such that ∈ k k

E ′(u) = sup E ′(u)w = E ′(u)v. k k w =1 k k Given this steepest ascent direction v one has

∇eucE (u)= E ′(u) v. k k Proof. This statement is a consequence of the Cauchy-Schwarz inequality. By defi- nition of the gradient and by the Cauchy-Schwarz inequality we have for w = 1 k k

E ′(u)w = ∇eucE (u),w euc ∇eucE (u) w = ∇eucE (u) , h i ≤k kk k k k and equality holds if and only if ∇eucE (u) and w are colinear. Hence, if E (u) = 0, ′ 6 then v = ∇eucE (u)/ ∇eucE (u) is the desired unique vector in the unit ball which maximizes the directionalk derivativesk of E .

Lemma 1.4. Let U Rd be an , and let E : U R be a differentiable function. Then E is continuously⊆ differentiable if and only→ if the euclidean gradient d ∇eucE : U R is continuous. → Proof. Let J : Rd (Rd) be the isomorphism from the proof of the Representation → ′ Lemma (Lemma 1.1). With this isomorphism one has E ′ = J∇eucE , or ∇eucE = 1 1 J− E ′. The claim follows from these two equalities and the fact that J and J− are continuous.

1.4 Euclidean gradient systems

A euclidean gradient system is an ordinary differential equation of the form

u˙ + ∇eucE (u)= 0, (1.4) where E is a given continuously differentiable, real-valued function on an open subset U of Rd. The unknown in this equation is an Rd-valued function u defined on an interval I R, which is in general also unknown. ⊆ A euclidean gradient system is a special case of the general ordinary differential equation u˙ + F(t,u)= 0, (1.5) 10 1 Gradient systems in euclidean space where F : D Rd is a function on an open set D R1+d; consider → ⊆ F(t,u) = ∇eucE (u) for (t,u) D = R U. In the general ordinary differen- tial equation, the unknown is again∈ a function× u : I Rd defined on an interval I R. The equalities (1.4) and (1.5) have to be understood→ as equalities of functions defined⊆ on some interval I. In this way, we avoid to write the argument of u in both differential equations. The argument of u will in the following usually be denoted by t and is interpreted as time, although in concrete applications it may have a different interpretation. Together with the argument, the gradient system has the formu ˙(t)+ ∇eucE (u(t)) = 0.

We call a function u : I Rd a solution of the differential equation (1.5) if u is continuous, F( ,u( )) is locally→ integrable, and if · · t u(t)= u(s) F(r,u(r)) dr for every s, t I. − s ∈ Z Since the gradient system (1.4) is a special case of the ordinary differential equation (1.5), a solution of the gradient system (1.4) is therefore a u : I Rd such that → t u(t)= u(s) ∇eucE (u(r)) dr for every s, t I. − s ∈ Z A solution in the above sense is a priori not differentiable in the classical sense and is not a solution in the sense which one might expect from the formulation of the equation. Instead, a solution satisfies the differential equations (1.4) and (1.5) only in an integrated form. However, if E is continuously differentiable as we assumed, then its euclidean gradient is continuous by Lemma 1.4, and therefore, by the preceding equality and the fundamental theorem of calculus, a solution of the gradient system is necessarily continuously differentiable and satisfies the equationu ˙ + ∇eucE (u) = 0 pointwise everywhere. However, if u is a solution of the differential equationu ˙ + F(t,u) = 0 and if the function F satisfies only weak regularity conditions (F is not necessarily continuous), then there may exist solutions which are not differentiable in the classical sense.

If u is a solution of the gradient system (1.4), then the time derivative of u is always equal to the negative gradient ∇E (u) and therefore points into the direction of steepest descent (Lemma 1.3). It is− therefore natural to expect that the energy E is decreasing along any solution of the gradient system. The proof of this fact has been given in the Introduction and we recall it here.

Lemma 1.5. Whenever u is a solution of the gradient system (1.4), and E is contin- uously differentiable, then the composition E (u)= E u is a decreasing function. Moreover, if the composition E (u) is constant, then the◦ solution u itself is constant.

Proof. Since the function E and the solution u are continuously differentiable, the composition E (u) is continuously differentiable, too. Hence, it suffices to prove that 1.4 Euclidean gradient systems 11 the derivative of E (u) is nonpositive. For this, we simply calculate the derivative of this function: d E E dt (u)= ′(u)u˙ (chain rule)

= ∇eucE (u),u˙ euc (definition of the euclidean gradient) h i = u˙,u˙ euc (u is a solution of the gradient system) −h i 0 (positivityoftheeuclideaninnerproduct). ≤ This inequality implies that E (u) is decreasing, but it also implies that if E (u) is constant, then u˙,u˙ euc = 0. Since the euclidean inner product is definite, one then obtainsu ˙ = 0, andh thereforei u is constant, too.

Let U Rd be an open set and let F : U Rd be a continuous function. We call a function⊆E : U R an energy function for→ the ordinary differential equation → u˙ + F(u)= 0, (1.6) if for every solution u of this differential equation the composition E (u)= E u is decreasing. By the preceding lemma, E is always an energy function for the gradi-◦ ent system (1.4). The existence of an energy function is quite common for ordinary differential equations coming from applications. For example, in ordinary differen- tial equations arising from physical models it is natural that some physical energy is decreasing or conserved along solutions. These models are certainly a motivation for the term ”energy function”. Very often, an energy function is also called Lya- punov function. In some places of the literature, an energy function is also called cost function, mainly because of applications to optimisation problems. Differential equations admitting an energy function may be called dissipative systems1.

Lemma 1.6. Every limit point of a global solution of the euclidean gradient system (1.4) is an equilibrium point of E . d In other words: if u : R+ R is a solution, if ϕ = lim u(tn) exists for some n ∞ → → (tn) ∞, and if ϕ U, then ∇eucE (ϕ)= 0. ր ∈ Proof. By the preceding lemma, the composed function E (u) is decreasing. More- over, since lim E (u(tn)) = E (ϕ) exists by assumption and by continuity of E , it n ∞ →E d E 2 follows that (u) is bounded below. Integrating the equality dt (u)= u˙ we ∞ −k k then obtain that the u˙ 2 is finite. This implies lim t+1 u˙ 2 = 0. 0 t ∞ t k k → k k Hence, R R

1 to dissipate (lat.: dissipare): to cause to lose energy (such as heat) irreversibly (http://www. thefreedictionary.com/). Other explanations are: to spend or expend intemperately or wastefully. Or: to attenuate to or almost to the point of disappearing. Of course, a constant function is an energy function for every ordinary differential equation, but this trivial energy function does not allow us to call every differential equation dissipative. One should have the additional property from Lemma 1.5 in mind. 12 1 Gradient systems in euclidean space

tn+s lim u(tn + s)= lim u(tn)+ u˙ = ϕ uniformly in s [0,1], n ∞ n ∞ t ∈ → → Z n  since, by H¨older’s inequality,

tn+s tn+s tn+1 1 sup u˙ sup u˙ u˙ 2 2 0 as n ∞. t ≤ t k k≤ t k k → → s [0,1] Z n s [0,1]Z n Z n ∈ ∈ 

By continuity of ∇eucE , this implies lim ∇eucE (u(tn +s)) = ∇eucE (ϕ) uniformly in n ∞ s [0,1], and therefore → ∈ 1 ∇eucE (ϕ)= ∇eucE (ϕ) Z0 1 = lim ∇eucE (u(tn + s)) ds n ∞ → Z0 1 = lim u˙(tn + s) ds − n ∞ 0 → Z = 0.

This is the claim.

1.5 Algebraic equations and steepest descent

Let U Rd be an open set and let F : U Rd be a continuously differentiable function.⊆ Consider the problem of finding a solution→ u ¯ U of the algebraic equation ∈ F(u¯)= 0.

One way to solve this problem could be to consider the function E : U R given E 1 2 → by (u)= 2 F(u) , where is the euclidean norm. Then one has F(u¯)= 0 if and only if Ek(u¯)=k0. Since Ek·kis a positive function, the problem amounts to search for a point where the infimum of E is attained and, if it is attained, to hope that this infimum is equal to 0. The infimum of E can be searched for by a method2. It is an exercice to show that the derivative of the function E is given by

d E ′(u)v = F(u),F′(u)v for every u U, v R . h i ∈ ∈

We define the adjoint of F′(u) with respect to the euclidean inner product to be the linear mapping F (u) : Rd Rd given by ′ ∗ →

2 This idea has already been proposed by Augustin Cauchy in a short communication to the Comptes Rendus de l’Acad´emie des Sciences de Paris in 1847, M´ethodes g´en´erale pour la r´esolution des syst`emes d’´equations simultan´ees, volume 25 (1847), pages 536–538. 1.6 Exercises 13

d F′(u)∗v,w = v,F′(u)w for every v, w R . h i h i ∈ With this definition, the euclidean gradient of the function E is given by

∇E (u)= F′(u)∗F(u), and the euclidean gradient system associated with E is hence the differential equa- tion u˙ + F′(u)∗F(u)= 0. (1.7)

Proposition 1.7. Assume that F′(v) is invertible for every v U. Assume further d ∈ that there exists a (global) solution u : R+ R of (1.7) with relatively compact range in U. Then there exists u¯ U such that→ F(u¯)= 0. ∈ Proof. Since the solution u has relatively compact range in U, there exist a sequence (tn) ∞ and an elementu ¯ U such that limn ∞ u(tn)= u¯. By what has been shown above,ր the differential equation∈ (1.7) is the→ euclidean gradient system associated with the function E (u)= 1 F(u) 2. By Lemma 1.6, the elementu ¯ U is therefore 2 k k ∈ a critical point of E , that is F′(u¯)∗F(u¯)= 0. Since F′(u¯), and therefore also F′(u¯)∗, is invertible by assumption, this implies F(u¯)= 0. From the proof of the preceding proposition it turns out that every limit point of every global solution of the gradient system (1.7) having relatively compact range in U is a solution to the algebraic equation F(u¯)= 0. In other words, the problem of finding a solution to an algebraic equation is transformed into the problem of finding a global solution of a gradient system having relatively compact range. We will study the problem of global existence for the first time in the following lecture and we will see that the gradient structure also helps for that problem.

1.6 Exercises

d d 1.1. The sublevels of an energy function E : R R are the sets Kc = u R : E (u) c with c R. Show that each sublevel→ is under the{ gradient∈ ≤ } ∈ d systemu ˙ + ∇eucE (u)= 0, that is, if u : [0,T ] R is a solution and u(0) Kc, then → ∈ u(t) Kc for every t [0,T]. ∈ ∈ 1.2. a) Prove that every ordinary differential equation of the formu ˙ + F(u)= 0 with continuous F : R R is a euclidean gradient system. → b) Prove that every solution of the above ordinary differential equation is mono- tone. 1.3. a) Prove that the system 2 u˙1 + u1u2 = 0 1 3 u˙2 + 3 u1 = 0 is a euclidean gradient system. 14 1 Gradient systems in euclidean space

b) Prove that the differential system of the damped pendulum

u˙1 u2 = 0 − u˙2 + αu2 + β sinu1 = 0

α > 0, β > 0, is not a euclidean gradient system.

1.4. Let Rd be equipped with a norm which is strictly convex, that is, for any two points x, y Rd with x = y = 1 and x = y one has x+y < 1. Let E : Rd R ∈ k k k k 6 k 2 k → be a differentiable function. Prove that for every u Rd there exists a unique vector v Rd, v = 1, such that ∈ ∈ k k

E ′(u) := sup E ′(u)w = E ′(u)v. k k w =1 k k Given this vector v, we define the gradient ∇E (u) := E (u) v. By Lemma 1.3, k ′ k this gradient coincides with the euclidean gradient if = euc. Prove that the euclidean norm is strictly convex. k·k k·k p 1 Remark: The p-norm given by u p = (∑ ui ) p is strictly convex if 1 < p < ∞. k k i | | 1.5. Consider the differential equation

∇E (u) u˙ + = 0, ∇E (u) k k and show that solutions of this differential equation and solutions of the gradient systemv ˙+ ∇E (v)= 0 can be transformed into each other by a change of time vari- able u(t)= v(α(t)) and v(t)= u(β(t)) for some functions α and β depending on v and u, respectively. Find the differential equations determining the functions α and β.

1.6. Draw the phase portrait (or alternatively: the direction field) of some planar gradient system and insert the level curves of the underlying energy function. What do you observe? Lecture 2 Gradient systems in finite dimensional space

In this lecture we continue to consider functions, gradients and gradient systems on Rd. But we generalize the notion of gradient and gradient system in two steps. The key is to replace the euclidean inner product in the definition of the euclidean gradient by an arbitrary inner product or by a Riemannian metric.

Recall that an inner product on Rd is a bilinear, symmetric, positive definite form , : Rd Rd R, that is, for every u, v, w Rd and every λ R one has h· ·i × → ∈ ∈ (i) λu + v,w = λ u,w + v,w and u,λv + w = λ u,v + u,w (bilinearity), h i h i h i h i h i h i (ii) u,v = v,u (symmetry) and h i h i (iii) u,u 0 (positivity) with equality only for u = 0 (definiteness). h i≥ d d In the following, we denote by L2(R ;R) the space of all bilinear forms a : R Rd R, that is, of all forms satisfying property (i) above. This space is a vector× space→ for the natural addition and the natural multiplication,and it is a Banach space for the norm a := sup a(u,v) . k k u 1 | | kvk≤1 k k≤ The set of all inner products on Rd, which we denote by Inner(Rd), is a subset of d L2(R ;R). It is therefore naturally equipped with the metric induced by the above norm.

Lemma 2.1 (Representation lemma). Let , be an inner product on Rd. For every linear functional u (Rd) there existsh· a·i unique vector u Rd such that ′ ∈ ′ ∈ d u′(v)= u,v for every v R . (2.1) h i ∈ d d We say that the functional u′ (R )′ is represented by the element u R with respect to the inner product , ∈. ∈ h· ·i Proof. Note that for every u Rd one may define a linear functional u (Rd ) by ∈ ′ ∈ ′

15 16 2 Gradient systems in finite dimensional space

d u′(v) := u,v , v R . h i ∈ Due to the bilinearity and definiteness of the inner product , , the mapping h· ·i d d J : R (R )′, → u u′ 7→ d d thus defined is linear and injective. Since R and (R )′ are both d-dimensional (d < ∞), the mapping J is actually a linear isomorphism. This means that every linear d d functional u′ on R is represented by some unique element u R via the identity (2.1) and the lemma is proved. ∈

Lemma 2.2. Let , be an inner product on Rd. Then there exists a linear mapping Q : Rd Rd whichh· ·i is → d (i) symmetric, that is, for every v, w R one has Qv,w euc = v,Qw euc, ∈ h i h i d (ii) positive, that is, for every v R one has Qv,v euc 0, ∈ h i ≥ d (iii) definite, that is, if Qv,w euc = 0 for every w R , then v = 0, and h i ∈ (iv) for every v, w Rd one has ∈

v,w = Qv,w euc. h i h i d d d Proof. For every v R the mapping v′ : R R given by v′(w)= v,w (w R ) is linear and continuous.∈ The Representation→ Lemma 2.1 applied toh thei euclidean∈ inner product (or: the Representation Lemma 1.1) yields that there exists an element d d Qv R such that v′(w)= Qv,w euc for every w R . By definition, Q satisfies property∈ (iv). Moreover, ash a consequencei of the bilinearit∈ y of , , the mapping Q is linear. By definition of Q and by the symmetry of the innerh· products·i , and h· ·i , euc, h· ·i d Qv,w euc = v,w = w,v = Qw,v euc = v,Qw euc for every v, w R . h i h i h i h i h i ∈ Hence, Q is symmetric. Moreover, for every v Rd, ∈

Qv,v euc = v,v 0 h i h i≥ by positivity of the inner product , . Finally, by definiteness of the inner product h· d·i , , Qv,w euc = 0 for every w R implies v = 0, and hence Q is definite, too. h· ·i h i ∈ Let U Rd be an open set. A continuous function g : U Inner(Rd) is called a ⊆ → Riemannian metric on U. Given a Riemannian metric, we write , g(u) to denote the inner product g(u) at the point u U, and we denote by h· the·i correspond- ∈ k·kg(u) ing norm on Rd.

Lemma 2.3. Let g : U Inner(Rd) be a Riemannian metric, and let Q : U L (Rd ) be the function given→ by → 2.1 Definition of gradient 17

d Q(u)v,w euc = v,w for every u U, v, w R . h i h ig(u) ∈ ∈ Then Q is continuous.

Proof. For every u1, u2 U we have ∈

Q(u1) Q(u2) = sup Q(u1)v Q(u2)v k − k v 1 k − k k k≤ = sup Q(u1)v Q(u2)v,w euc v 1 |h − i | kwk≤1 k k≤

= sup v,w g(u1) v,w g(u2) v 1 |h i − h i | kwk≤1 k k≤ = g(u1) g(u2) . k − k Since g is continuous, the continuity of Q follows.

2.1 Definition of gradient

Let U Rd be an open set and let E : U R be a continuously differentiable function.⊆ We recall from the preceding chapter→ that for every u U and every v Rd the derivative and the euclidean gradient are related by the equality∈ ∈

E ′(u)v = ∇eucE (u),v euc. h i Starting from this definition and from the general Representation Lemma 2.1, it is natural to generalize the notion of gradient. We actually consider two generaliza- tions.

First, given an arbitrary inner product , on Rd, the gradient of E with respect to this inner product is the function ∇Eh·which·i assigns to each u U the unique element ∇E (u) Rd such that ∈ ∈ d E ′(u)v = ∇E (u),v for every v R . h i ∈ Second, given a Riemannian metric g on U, the gradient of E with respect to g d is the function ∇gE which assigns to each u U the unique element ∇gE (u) R such that ∈ ∈ d E ′(u)v = ∇gE (u),v for every v R . h ig(u) ∈ If the metric g is clear from the context, then we may simply write ∇E instead of ∇gE . Clearly, the second generalization includes the first one: it suffices to take a constant Riemannian metric. Lemma 2.4. If E : U R is continuously differentiable, and if g is a Riemannian → d metric on U, then the gradient ∇gE : U R is continuous. → 18 2 Gradient systems in finite dimensional space

1 Proof. If Q is the function from Lemma 2.3, then ∇gE (u)= Q(u)− ∇eucE (u). Con- tinuity of ∇gE is therefore a direct consequence of Lemma 2.3 and Lemma 1.4.

2.2 Definition of gradient system

We call a gradient system any ordinary differential equation in Rd which is of the form u˙ + ∇gE (u)= 0, (2.2) where ∇gE is the gradient of a continuously differentiable function E : U R with respect to a Riemannian metric g. This gradient system admits E as an energy→ function. The proof is a repetition of the proof of Lemma 1.5.

Lemma 2.5. Whenever u is a solution of the gradient system (2.2), and E is contin- uously differentiable, then the composition E (u)= E u is a decreasing function. Moreover, if the composition E (u) is constant, then the◦ solution u itself is constant.

Proof. Since the function E and the solution u are continuously differentiable, the composition E (u) is continuously differentiable, too. Hence, it suffices to prove that the derivative of E (u) is non-positive. Using the chain rule, the definition of the gradient ∇gE , and the differential equa- tion (2.2), we can calculate d E E dt (u)= ′(u)u˙ (chain rule)

= ∇gE (u),u˙ (definition of the gradient) h ig(u) = u˙,u˙ (u is solution of the gradient system) −h ig(u) 0 (positivity of the inner product , ). ≤ h· ·ig(u) This inequality implies that E (u) is decreasing, but it also implies that if E (u) is constant, then u˙,u˙ g(u) = 0. Since , g(u) is definite, one then obtainsu ˙ = 0, and therefore u is constant,h i too. h· ·i

2.3 Local existence for ordinary differential equations

This section is a small digression. In view of the existence results for gradient sys- tems in infinite dimensional spaces, we discuss in some detail the question of local existence of solutions of gradient systems in finite dimensions. However, since local existence for general ordinary differential equations is not more difficult to prove, we state the general result. The following classical result from the theory of ordinary differential equations, Carath´eodory’s theorem, is perhaps known and one may skip the proof in this case. 2.3 Local existence for ordinary differential equations 19

Theorem 2.6 (Caratheodory).´ Let D R Rd be anopenset andlet F : D Rd, F = F(t,u), be a function which satisfies⊆ the× following Caratheodory´ conditions→ :

F( ,u) is measurable for every u, (2.3) · F(t, ) is continuous for every t, and (2.4) · 1 for every (t0,u0) D there exist α > 0, r > 0 and m L (t0,t0 + α) (2.5) ∈ ∈ such that F(t,u) m(t) for every t (t0,t0 + α), u B(u0,r). k k≤ ∈ ∈

Then, for every (t0,u0) D the ordinary differential equation with initial condition ∈ u˙ + F(t,u)= 0, (2.6) ( u(t0)= u0, admits a local solution. This means that there exists an interval I = [t0,t0 + α] R which is not reduced to one point, and there exists a continuous function u : I ⊆Rd such that for every t I → ∈ t u(t)= u0 F(s,u(s)) ds. (2.7) − t Z 0 Note that this definition of local solution coincides with the definition from Lec- ture 1. We give a “constructive” proof of Carath´eodory’s theorem in the sense that we first define a sequence of local approximate solutions, which “almost” satisfy the integral equation (2.7), and we then prove that a subsequence of these approxi- mate solutions converges to an exact solution of the differential equation (2.6). The compactness argument behind the existence of such a subsequence is based on the Arzel`a-Ascoli theorem.

Proof (Proof of Theorem 2.6). Let (t0,u0) D, and let the constants α > 0, r > 0 and the function m be as in the condition (2.5).∈ Define M(t) := t m(s) ds for t t0 [t ,t + α]. Then the function M is continuous and M(t )= 0. Using this continuity∈ 0 0 0 R and choosing α > 0 small enough, we may assume that M(t) r for every t | |≤ ∈ [t0,t0 + α]. d For every k 2 we define a function uk C([t0,t0 + α];R ) by ≥ ∈ α u0 if t [t0,t0 + ], ∈ k uk(t) := α α  t k α 2α (k 1) u0 − F(s,uk(s)) ds if t (t0 + ,t0 + ],...,(t0 + − ,t0 + α].  − t0 ∈ k k k R Note first that the functions uk are well defined in the sense that the function F( ,uk( )) in the integral is defined. In fact, one proves easily that · ·

uk(t) u0 r for every k 2, t [t0,t0 + α]. (2.8) k − k≤ ≥ ∈ 20 2 Gradient systems in finite dimensional space

Fix ε > 0. Since the function M is uniformly continuous on the compact interval [t0,t0 + α], there exists δ > 0 such that for every t, t [t0,t0 + α] the implication ′ ∈

t t′ δ M(t) M(t′) ε | − |≤ ⇒ | − |≤ is true. Moreover, for every k 2 and every t, t [t0,t0 + α], t t , one has ≥ ′ ∈ ≥ ′

uk(t) uk(t′) k − k≤α t k α α α − α F(s,uk(s)) ds M(t ) M(t′ ) if t t′ t0 + , t′ k k k k −α k k ≤ | − − − | ≥ ≥  R t k α α − F(s,uk(s)) ds M(t ) if t t0 + t , ≤  t0 k k ≤ | − k | ≥ k ≥ ′  α 0R if t0 + t t . k ≥ ≥ ′   In any case, for every k 2 and every t, t [t0,t0 + α] with t t δ one has ≥ ′ ∈ | − ′|≤

uk(t) uk(t′) ε. k − k≤ d This means that the sequence (uk) C([t0,t0 +α];R ) is equicontinuous. Moreover, by the estimate (2.8), this sequence∈ is also uniformly bounded. Hence, by the Arzel`a-Ascoli theorem (Theorem B.43), there exists a subsequence d (uk ) of (uk) and a continuous function u : [t0,t0 + α] R such that uk converges j → j uniformly on [t0,t0 + α] to u. By condition (2.4), for every t [t0,t0 + α] one has ∈

lim F(t,uk (t)) = F(t,u(t)). j ∞ j → By the estimate (2.8) and by condition (2.5),

F(t,uk(t)) m(t) for every k 2, t [t0,t0 + α]. k k≤ ≥ ∈

Hence, by definition of the functions uk and by the dominated convergencetheorem, for every t [t0,t0 + α] ∈

u(t)= lim uk (t) j ∞ j → α t − k j = u0 lim F(s,uk j (s)) ds − j ∞ t → Z 0 t = u0 F(s,u(s)) ds. − t Z 0 Hence, u is a local solution.

Theorem 2.6 is like a key which opens the door to study ordinary differential equations, or like a source which provides us with solutions of ordinary differential equations. The question of local existence, under the Carath´eodory assumptions, 2.4 Local and global existence for gradient systems 21 is completely solved, and we can now study properties of these solutions. Note, however, that without any further assumptions on the function F, one can neither expect existence of global solutions (defined on [t0,∞), for example), nor uniqueness of local solutions. 1 Example 2.7. a) The function u(t)= 1 t , t [0,1), is a solution of the ordinary differential equation − ∈ u˙ u2 = 0, − ( u(0)= 1, and can not be extended to a global solution (defined on [0,∞)). 1 2 R b) The functions u(t)= 0 and u(t)= 4 t , t +, are two distinct solutions of the ordinary differential equation ∈

u˙ u = 0, − | | ( u(0)=p 0.

By Exercise 1.2, both differential equations in (a) and (b) are gradient systems.

d A solution u : [t0,t0 +α) R of the ordinary differential equation (2.6) is called a maximal solution if it can→ not be extended to a solution on a strictly larger interval [t0,t0 + β). Local existence of solutions of ordinary differential equations always implies the existence of maximal solutions. The proof of this fact is based on the Lemma of Zorn.

Corollary 2.8. Under the assumptions of Theorem 2.6, for every (t0,u0) D there exists a maximal solution of the problem (2.6). ∈ Proof (Idea). Consider the set S of all pairs (α,u) of positive numbers α (0,∞] d ∈ (∞ is included) and of continuous functions u : [t0,t0 +α) R which are solutions of the initial value problem (2.6). By Carath´eodory’s Theo→rem, this set is nonempty. On the set S , we consider the following partial ordering: we say that (α,u) (β,v), if α β with respect to the usual ordering in (0,∞], and if u is a restriction≤ ≤ of the function v to the interval [t0,t0 + α). We leave it as an exercise to show that is really a partial ordering on S . ≤ We also leave it as an exercise to show that every totally ordered subset of S admits a maximal element. The claim then follows from Zorn’s Lemma.

2.4 Local and global existence for gradient systems

From Carath´eodory’s local existence theorem we immediately obtain the following corollary about local existence of solutions of non-autonomous gradient systems. Corollary 2.9 (Local existence for non-autonomous gradient systems). Let U Rd be an open set, let E : U R be a continuously differentiable function, and let⊆ → 22 2 Gradient systems in finite dimensional space g : U Inner(Rd) be a Riemannian metric on U. Let I R be an open interval, → 1 d ⊆ andlet f L (I;R ). Then, for every t0 I and every u0 U the problem ∈ ∈ ∈

u˙ + ∇gE (u)= f , (2.9) ( u(t0)= u0, admits a local solution. This means that there exists an interval J = [t0,t0 + α] I which is not reduced to one point, and there exists a continuous function u : J ⊆Rd such that for every t J → ∈ t t u(t) u0 + ∇gE (u(s)) ds = f (s) ds. − t t Z 0 Z 0 Proof. It suffices to check that the assumptions of Carath´eodory’s theorem are sat- isfied. We leave this as an exercise. Maximal solutions of the gradient system (2.9) are defined in the same way as for the general ordinary differential equation (2.6). It follows from Corollary 2.8 that the gradient system (2.9) always admits a maximal solution. For the general ordinary differential equation as well as for the gradient system an important question is to determine the maximal existence interval or to show that maximal solutions are global. By Example 2.7 (a), global existence of solutions does not always hold, neither for ordinary differential equations nor for gradient systems. But under a natural assumption on the energy, global existence of solutions of gradient systems can be proved via a priori estimates. A function E : Rd R is coercive if for every c R the sub- → ∈ d Kc := u R : E (u) c is bounded. { ∈ ≤ } Theorem 2.10 (Global existence for gradient systems). Let E : Rd R be a con- tinuously differentiable, coercive function, and let g be a Riemannian→ metric on Rd Rd such that v g(u) C v euc for every u, v and some constant C 0. Then, for k kd ≤ k k ∈ d ≥ every u0 R and every continuous f : [0,T] R (T > 0), every maximal solution u of the non-autonomous∈ gradient system →

u˙ + ∇gE (u)= f , (2.10) ( u(0)= u0, exists on [0,T ]. Moreover,

u(t) : t [0,T] Kc, { ∈ }⊆ E C T 2 where c = (u0)+ 2 0 f euc depends only on the data u0, f and the embedding constant C. k k R d Proof. Let u : [0,Tmax) R (Tmax T ) be any maximal solution of the non- autonomous gradient system→ (2.10);≤ the existence of such a maximal solution is 2.5 Newton’s method 23 guaranteed by Corollaries 2.9 and 2.8. Since ∇gE (u) and f are continuous func- tions, any (maximal) solution of (2.10) is continuously differentiable and satisfies the differential equation (2.10) in the classical sense (and not only in the integrated sense from Corollary 2.9). We will show that Tmax = T . We multiply equation (2.10) byu ˙ with respect to the inner product , , inte- h· ·ig(u) grate the result over [0,t] (0 < t < Tmax T ), apply the Cauchy-Schwarz inequality and the assumption on the metric g in order≤ to obtain

t t t 2 ∇ E u˙ g(u) + g (u),u˙ g(u) = f ,u˙ g(u) 0 k k 0 h i 0 h i Z Z Z t t 1 2 1 2 f g(u) + u˙ g(u) ≤ 2 0 k k 2 0 k k Z Z T t C 2 1 2 f euc + u˙ g(u). ≤ 2 0 k k 2 0 k k Z Z ∇ By definition of the gradient and by the chain rule, we have gE (u),u˙ g(u) = E d E h i ′(u)u˙ = dt (u). This and the preceding inequality imply that

t T 1 2 E E C 2 ∞ u˙ g(u) + (u(t)) (u0)+ f euc =: c < . 2 0 k k ≤ 2 0 k k Z Z

The right-hand side of this inequality is independent of t [0,Tmax), and the first term on the left-hand side is positive. As a consequence,∈ the range u(t) : t { ∈ [0,Tmax) is a subset of the sub-level set Kc. By coercivity and continuity of E , } d Kc is bounded and closed in R which is finite dimensional. Hence, Kc is compact. Since the gradient ∇gE is continuous on Kc (Lemma 2.4) and since the range ∇ E ∞ of u is contained in Kc, we obtain supt [0,Tmax) g (u(t)) euc < . The differential ∈ k k equation (2.10) then implies that u˙ euc is bounded, and integrable, on [0,Tmax). k k Hence, u extends to a continuous function on the closed interval [0,Tmax]. Now, if Tmax < T , then we could extend the solution u to a larger interval by solving the gradient systemu ˙ + ∇gE (u)= f with initial time Tmax and initial value u(Tmax). This would contradict to the fact that u is already a maximal solution, and hence Tmax < T is not possible. As a consequence, Tmax = T , that is, u is a global solution.

2.5 Newton’s method

Let U Rd be an open set, and let F : U Rd be a continuously differentiable function.⊆ Consider again the problem of finding→ a solutionu ¯ U of the algebraic equation ∈ F(u¯)= 0. We repeat the idea from the previous lecture and consider the function E : U R given by E (u)= 1 F(u) 2, where is the euclidean norm on Rd. Then one→ has 2 k k k·k 24 2 Gradient systems in finite dimensional space

F(u¯)= 0 if and only if E (u¯)= 0, and one may try to find a solution of the latter problem by considering a gradient system associated with E .

In this example, we assume that the derivative

F′(u) is invertible for every u U, ∈ and we define a Riemannian metric g : U Inner(Rd ) by setting → d v,w = F′(u)v,F′(u)w euc, u U, v, w R . h ig(u) h i ∈ ∈ For every u U the bracket , is clearly bilinear and symmetric. Moreover, ∈ h· ·ig(u) for every u U, v Rd, ∈ ∈

v,v = F′(u)v,F′(u)v euc h ig(u) h i 2 = F′(u)v 0, k keuc ≥ so that , is in addition positive semidefinite. If v,v = 0, then h· ·ig(u) h ig(u) F (u)v 2 = 0 and hence F (u)v = 0 by the definiteness of the euclidean inner k ′ keuc ′ product. Since F′(u) is invertible by assumption, this implies v = 0, and therefore , is definite, too. Continuity of g follows from the continuity of F . h· ·ig(u) ′ For the gradient of E with respect to the metric g we have

∇gE (u),v = E (u)v (definition of gradient) h ig(u) ′ = F(u),F (u)v euc (compute E ) h ′ i ′ 1 1 = F (u)F (u) F(u),F (u)v euc (insert the term F (u)F (u) ) h ′ ′ − ′ i ′ ′ − = F (u) 1F(u),v (definition of metric g). h ′ − ig(u) Comparing left-hand side and right-hand side yields

1 ∇gE (u)= F′(u)− F(u).

Hence, the differential equation

1 u˙ + F′(u)− F(u)= 0 (2.11) is a gradient system associated with the energy E . We call this differential equa- tion continuous Newton’s method since it is a continuous (in time) version of the discrete Newton algorithm

1 un 1 un + F′(un)− F(un)= 0, n 0. (2.12) + − ≥ Much could be said about this algorithm which Newton proposed in his De analysi per aequationes numero terminorum infinitas (1669) and in his Methods of 2.6 Exercises 25 and fluxions (1671) for a special polynomial F1. It was originally not designed to be a steepest descent algorithm, that is, a discrete version of a gradient system. The original idea of Newton’s method is very different and involves no energy function: given an approximation un of the exact solutionu ¯ of the equation F(u¯)= 0, Taylor expansion of order 1 yields the equation

0 = F(u¯)= F(un)+ F′(un)(u¯ un)+ o(u¯ un). − −

Dropping the rest term o(u¯ un) by assuming that it is small compared to the other − terms if un is close tou ¯, and solving the resulting linear equation

0 = F(un)+ F′(un)(un 1 un) + − leads to the iteration formula (2.12). It turns out that if u0 is an initial value close to u¯ (close in a sense which can in general not be quantified), then the sequence (un) of iterates given by the Newton algorithm converges tou ¯ as n ∞. We will not give a proof of this fact, but we rather content ourselves of having→proved that continuous Newton’s method is a gradient system2.

Proposition 2.11. Assume that F′(v) is invertible for every v U. Assume further d ∈ that there exists a (global) solution u : R+ R of continuous Newton’s method (2.11), and assume that u has relatively compactrange→ inU. Then there exists u¯ U such that F(u¯)= 0. ∈

Proof. See the proof of Proposition 1.7.

2.6 Exercises

2.1. Prove Corollary 2.9.

2.2. Show that a function E : Rd R is coercive if and only if lim E (u)=+∞. → u ∞ k k→ 2.3 (Damped pendulum). Given α > 0, β > 0, ε > 0, consider the two functions G : R2 R2 and E : R2 R given by → →

u1 u2 G = − u2 αu2 + β sinu1     and

1 See D. T. Whiteside (ed.), The Mathematical Papers of , Volume II, 1667–1670, Cambridge University Press, Cambridge, 1968 and D. T. Whiteside (ed.), The Mathematical Pa- pers of Isaac Newton, Volume III, 1670–1673, Cambridge University Press, Cambridge, 1969. 2 The observation that the continuous Newton method is a gradient system is taken from J. W. Neuberger, Prospects for a central theory of partial differential equations, Math. Intelligencer 27 (2005), 47–55 26 2 Gradient systems in finite dimensional space 1 E u1 β 2 ε = cosu1 + u2 + u2 sinu1. u2 − 2  

a) Prove that for every u R2 (kπ,0) : k Z and every ε > 0 small enough one has ∈ \{ ∈ } ∇eucE (u),G(u) euc > 0. h i 1 2 λ 2 Hint: Recall the special case of Young’s inequality, ab 2λ a + 2 b , which is true for every a, b 0, λ > 0. ≤ ≥ b) For every u R2, consider the matrix ∈

u2 ε sinu1 u2 A(u)= − − − . β sinu1 + εu2 cosu1 αu2 + β sinu1 !

Prove that for every u R2 (kπ,0) : k Z and every ε > 0 small enough the matrix A(u) is invertible.∈ \{ ∈ } Hint: Calculate detA(u) and compare with (a). c) Prove that for every u R2 (kπ,0) : k Z and every ε > 0 small enough ∈ \{ ∈ } 1 1 2 v,w := ∇eucE (u),G(u) euc A(u)− v,A(u)− w euc, v, w R , h ig(u) h i h i ∈ defines an inner product on R2. The function g : R2 (kπ,0) : k Z Inner(R2) thus defined is a Riemannian metric. \{ ∈ } →

d) Show that (for ε > 0 small enough) ∇gE = G. e) Conclude that the differential system modeling the damped pendulum

u˙1 u2 = 0 − u˙2 + αu2 + β sinu1 = 0

is a gradient system on R2 (kπ,0) : k Z , that is, outside the set of equi- librium points. \{ ∈ }

d d 2.4. Let F : R R be a continuously differentiable function such that F′(u) is invertible for every→ u Rd and such that lim F(u) =+∞. ∈ u ∞k k k k→ a) Show that the equation F(u¯)= 0 has a solution. b) Show that F is surjective. Hint: Replace the function F by F v, where v Rd is an arbitrary given vector. − ∈ Lecture 3 Gradients in infinite dimensional space: the Laplace operator on a bounded interval

Starting with this lecture, we consider real-valued functions which are defined on infinite dimensional Banach spaces. We study their gradients and the associated gradient systems. Many of the aspects and objects which we know for gradient systems in finite dimensional space reappear in the following study: the notion of derivative, ambient inner product or metric, gradient, gradient system, energy function, dissipation. But the analysis of gradient systems in infinite dimensional spaces – local and global existence of solutions, for example – requires more effort.

The effort is worthwhile! Many elliptic partial differential operators which appear in evolution models from physics, biology, chemistry or engineering turn out to be gradients, and the corresponding evolution models like diffusion models, phase separation models are gradient systems.

The most prominent example of an is the Laplace operator

d ∂ 2 ∆ = div∇ = ∑ ∂ 2 i=1 xi acting on functions defined on domains in Rd. We show that the Laplace operator on a bounded interval (the !), equipped with Dirichlet boundary conditions, is a gradient. This fundamental observation is not only important for the study of the heat equation ut uxx = 0 or similar parabolic equations. It is also important for the study of elliptic− boundary value problems by variational methods. These problems and the beautiful theory of the calculus of variations are touched very slightly in this and the following lecture. The study of gradient systems is postponed for a while.

In a first step, we define the gradient of a real-valued function defined on an infinite dimensional Banach space. Our first examples of gradients, in particular the Laplace operator and other elliptic partial differential operators, are gradients of

27 28 3 Gradients in infinite dimensional space: the Laplace operator on a bounded interval

quadratic forms. We devote a short separate section to quadratic forms.

Throughout this lecture, V is a real Banach space with norm V . We denote by k·k V ′ the dual space of V, that is, the space of all continuous linear functionals V R. If u V , then we write → ′ ∈ ′

u′(u) or u′ u or u′,u or u′,u h i h iV ′,V

for the value of u′ at the element u V. The dual space V ′ is equipped with the norm u = sup u ,u . ∈ ′ V ′ ′ k k u V 1h i k k ≤

3.1 Definition of gradient

Let U V be an open subset of V, and let E : U R be a function.We say that E is ⊆ → differentiable if for every u U there exists a continuous linear functional u′ V ′ such that ∈ ∈ E (u + h) E (u) u ,h lim − − h ′ i = 0. h V 0 h V k k → k k The functional u V , if it exists, is unique. The derivative is the function ′ ∈ ′ E ′ : U V ′ which assigns to every u U the unique linear functional u′ V ′ for which the→ above equality holds. Consequently,∈ we denote the derivative of∈E at a point u by E ′(u). We say that E is continuously differentiable if E is differentiable and if the derivative is continuous from U into V ′. The set of all continuously differentiable functions U R is a vector space which is denoted by C1(U). →

Let H be a real with inner product , H and associated norm h· ·i H . Assume that V is continuously and densely embedded into H. This means k·kthat we can identify V with a dense subspace of H (via some injection V H) and → there exists a constant C 0 such that u H C u V for every u V. We will ∋ write V ֒ H for this situation.≥ k k ≤ k k →

Given a differentiable function E : U R, we define the gradient ∇H E with → respect to the inner product , H by h· ·i

D(∇H E ) := u U : there exists v H such that for every { ∈ ∈ ϕ V one has E ′(u)ϕ = v,ϕ H , and ∈ h i } ∇H E (u) := v,

that is, ∇H E (u) is the unique element in H, if it exists, which represents the deriva- tive E ′(u) with respect to the inner product in H:

E ′(u)ϕ = ∇H E (u),ϕ H for every ϕ V. (3.1) h i ∈ 3.1 Definition of gradient 29

The “if it exists” is important and marks a difference to gradients in finite dimen- sional space. The fact that we consider two (in general different) spaces V and H leads to the necessity to consider the domain D(∇H E ) of the gradient of ∇H E . This domain is in general strictly contained in U; not every derivative E ′(u) can be represented by some element in H (see the various examples in this lecture). As soon as V is densely and continuously embedded into H, the dual H′ is con- tinuously embedded into the dual V ′; the restriction to V of a continuous linear functional H R defines a continuous linear functional V R. The resulting op- erator H V→is linear and continuous, and it is injective by→ the fact that V is dense ′ → ′ in H. But the embedding H′ V ′ is in general not surjective, that is, not every bounded linear functional u →V extends to a bounded linear functional on H.Asa ′ ∈ ′ consequence, it may happen that a derivative E ′(u) V ′ does not extend to a linear functional on H. ∈ However, if a bounded linear functional u V (for example, the derivative ′ ∈ ′ E ′(u)) extends to a bounded linear functional on H, that is, if u′ H′, then this functional may be represented by some element u H due to a generalization∈ of the Representation Lemmas 1.1 and 2.1. The Riesz-Fr´echet∈ representation theorem (Theorem D.53) says that if H is a Hilbert space, then for every bounded linear functional u H there exists a unique element u H such that ′ ∈ ′ ∈

u′,v = u,v H for every v H. h iH′,H h i ∈ By the Riesz-Fr´echet representation theorem, an element u U belongs to the ∈ domain D(∇H E ) if and only if the derivative E ′(u) extends to a continuous linear functional on H.

Similarly as above, one may define the gradient with respect to a metric. Given a Hilbert space H with inner product , H , we let L2(H;R) be the space of all bounded bilinear forms a : H H Rh·, that·i is, the space of all forms such that for every u, v, w H and every λ× R→one has ∈ ∈ a(λu + v,w)= λa(u,w)+ a(v,w) and a(u,λv + w)= λa(u,v)+ a(u,w), and a(u,v) C u H v H for every u, v H and some C 0. | |≤ k k k k ∈ ≥ This space is a vector space for the natural addition and the natural scalar multipli- cation, and it is a Banach space for the norm

a := sup a(u,v) . k k u H 1 | | kvk ≤1 k kH ≤ Besides the norm convergence we consider also strong convergence of sequences. We say that a sequence (an) L2(H;R) converges strongly to an element a ⊂ ∈ 30 3 Gradients in infinite dimensional space: the Laplace operator on a bounded interval

L2(H;R) if for every u, v H onehas lim an(u,v)= a(u,v). Every norm convergent ∈ n ∞ sequence is strongly convergent. → The set of all inner products on H, which we denote by Inner(H), is a subset of L2(H;R). It is therefore natural to speak of norm convergent sequences and strongly convergent sequences in Inner(H).A metric on an open subset U V is a function g : U Inner(H) which maps norm convergent sequences into strongly⊆ → convergent sequences, that is, whenever limn ∞ un u V = 0, un, u U, then → k − k ∈ (g(un)) converges strongly to g(u). For every u U, the inner product g(u) is also denoted by the bracket , , and the induced∈ norm is denoted by . h· ·ig(u) k·kg(u) Given a differentiable function E : U R and given a metric g, we define the → gradient ∇gE with respect to g by

D(∇gE ) := u U : there exists v H such that for every { ∈ ∈ ϕ V one has E ′(u)ϕ = v,ϕ , and ∈ h ig(u)} ∇gE (u) := v, that is, ∇gE (u) is the unique element in H, if it exists, which represents the deriva- tive E (u) with respect to the inner product , : ′ h· ·ig(u)

E ′(u)ϕ = ∇gE (u),ϕ for every ϕ V. h ig(u) ∈ If, for every u U, the inner product , is equivalent to the inner product ∈ h· ·ig(u) , H , then an element u U belongs to the domain D(∇gE ) if and only if the h· ·i ∈ derivative E ′(u) extends to a continuous linear functional on H.

3.2 Gradients of quadratic forms

A function E : V R is a quadratic form if there exists a bilinear, symmetric form a : V V R such→ that × → 1 E (u)= a(u,u). 2 Recall that a bilinear, symmetric form is a bilinear form a : V V R such that for every u, v V one has a(u,v)= a(v,u). × → ∈ Proposition 3.1. A quadratic form E : V R is continuous if and only if the asso- ciated bilinear form a is continuous. →

Proof. Clearly, if a bilinear form a is continuous, then the associated quadratic form E is continuous. In order to prove the converse, assume that the function u 1 a(u,u) is continuous. Then, by the polarization identity 7→ 2 1 a(u,v)= a(u + v,u + v) a(u v,u v) , 4 − − −  3.3 Sobolev spaces in one space dimension 31

a is continuous, too.

Proposition 3.2. If E : V R is a continuous quadratic form with associated bilin- ear form a : V V R, then→ E is continuously differentiable and × →

E ′(u)v = a(u,v) for every u, v V. ∈ The proof of this proposition is left as an exercise, but we point out that actually more is true: every continuous quadratic form is infinitely many times continuously differentiable.

Given a continuous quadratic form E : V R with associated bilinear form a : → V V R, and given a Hilbert space H with inner product , H , such that V is× densely→ and continuously embedded into H, we may computeh· the·i gradient of E with respect to the inner product in H by using the definition of the gradient and Proposition 3.2. They imply that

D(∇H E )= u V : there exists v H such that for every { ∈ ∈ ϕ V one has a(u,ϕ)= v,ϕ H and ∈ h i } ∇H E (u)= v.

It turns out that the gradient of a quadratic form E with respect to an inner product , H is a linear operator on H. The domain D(∇H E ) is a linear subspace of H and h· ·i the mapping u ∇H E (u) is linear (exercise). We point out that the gradient of a quadratic form7→ with respect to a metric is in general not linear. Quadratic forms arise naturally in the study of linear elliptic operators. These elliptic operators, in turn, appear in many parabolic equations. As already indicated in the introduction of this lecture, we spend some time in order to introduce elliptic operators. This implies that we have to define Sobolev spaces on domains. To start with, we confine ourselves to the one-dimensional case.

3.3 Sobolev spaces in one space dimension

In this section we define Sobolev spaces on intervals. We will not prove any properties of Sobolev spaces, and actually, for this lecture it is not necessary to prove any properties. It suffices to know the definition of Sobolev spaces and the notion of weak derivative.

Let I R be an open interval. For every continuous function ϕ : I R we define the support⊆ → suppϕ := x I : ϕ(x) = 0 , { ∈ 6 } where the closure is to be taken in I. The space 32 3 Gradients in infinite dimensional space: the Laplace operator on a bounded interval

∞ ∞ D(I) := C (I) := ϕ C (I) : suppϕ I is compact c { ∈ ⊂ } is called the space of all test functions on I.

By the fundamentalrule of partial integration, if u, v : [a,b] R are continuously differentiable functions on some compact interval [a,b], then →

b b uv′ = u(b)v(b) u(a)v(a) u′v. a − − a Z Z In particular, for every u C1([a,b]) and every ϕ D(a,b) ∈ ∈ b b uϕ′ = u′ϕ, (3.2) a − a Z Z since ϕ(a)= ϕ(b)= 0. In the following definition, this partial integration rule is not a theorem, but it serves for the definition of a derivative. For every ∞ a < b ∞ and every 1 p ∞, we define the first − ≤ ≤ ≤ ≤ W 1,p(a,b) := u Lp(a,b) : there exists v Lp(a,b) such that for every { ∈ ∈ b b ϕ D(a,b) one has uϕ′ = vϕ . ∈ a − a } Z Z If p = 2, then we also write H1(a,b) := W 1,2(a,b). By Lemma G.4, the function v Lp(a,b) is uniquely determined, if it exists. In ∈ the following, we write u′ := v, in accordance with the partial integration rule (3.2), 1,p and we call u′ the weak derivative of u. With this notation, the space W (a,b) is the space of all functions in Lp(a,b) which admit a weak derivative in Lp(a,b). The spaces W 1,p(a,b) are Banach spaces for the norms

1 p p p u 1,p := u p + u′ p , 1 p < ∞, and k kW k kL k kL ≤ u 1,∞ := sup u L∞, u′ L∞ , k kW {k k k k } and the space H1(a,b) is a Hilbert space for the inner product

b b u,v H1 := uv + u′v′. h i a a Z Z We further define 1 p 1,p , D k·kW W0 (a,b) := (a,b) , 1,p 1,p that is, W0 (a,b) is the closure of the test functions in W (a,b). The space 1,2 1 W0 (a,b) is also denoted H0 (a,b). The following result about Sobolev spaces on the interval will be proved later.

Theorem 3.3. Let (a,b) be a bounded interval, and let 1 p ∞. Then: ≤ ≤ 3.4 The Dirichlet-Laplace operator on a bounded interval 33

a) Every function u W 1,p(a,b) is continuous on [a,b]. More precisely: every function u W 1,p∈(a,b) has a continuous representant u : [a,b] R. ∈ → b) A function u W 1,p(a,b) belongs to W 1,p(a,b) if and only if u(a)= u(b)= 0. ∈ 0 For every 1 p ∞ and every k 2 we define inductively the k-th Sobolev space ≤ ≤ ≥ k,p 1,p k 1,p W (a,b) := u W (a,b) : u′ W − (a,b) , { ∈ ∈ } which is a Banach space for the norm

k 1 ( j) p p ∞ u Wk,p := ∑ u Lp if 1 p < ,or k k j=0 k k ≤  (k) u 1,∞ := sup u L∞, u′ L∞ ,..., u L∞ if p = ∞. k kW {k k k k k k } Here, u( j) denotes the j-th weak derivative of u, u(0) = u. The space Hk(a,b) := W k,2(a,b) is a Hilbert space for the inner product

k ( j) ( j) u,v Hk := ∑ u ,v L2 . h i j=0h i

3.4 The Dirichlet-Laplace operator on a bounded interval

1 1 We consider the Sobolev space V = H0 (0,1), equipped with the H norm which turns it into a Hilbert space. We consider in addition the quadratic form E : H1(0,1) R given by 0 → 1 1 E 2 1 (u)= (u′) , u H0 (0,1). 2 0 ∈ Z The associated bilinear form a : H1(0,1) H1(0,1) R is given by 0 × 0 → 1 1 a(u,v)= u′v′, u, v H0 (0,1). 0 ∈ Z Both functions E and a are continuous. We consider the space H = L2(0,1) equipped with the usual inner product, that is 1 u,v L2 = uv, h i 0 Z and compute the gradient of E with respect to this inner product. We only use the definition of the gradient, Proposition 3.2 and the definition of the Sobolev spaces H1(0,1) and H2(0,1). First, when we translate the definition of the gradient, we find 34 3 Gradients in infinite dimensional space: the Laplace operator on a bounded interval

∇ E 1 2 D( L2 )= u H0 (0,1) : there exists v L (0,1) such that for every { ∈ ∈ 1 1 ϕ 1 ϕ ϕ H0 (0,1) one has u′ ′ = v , ∈ 0 0 } Z Z ∇ E L2 (u)= v. This gradient may be characterized in a more explicit way, as follows.

∇ E Let u D( L2 ). Then, by the above characterization of the domain, u H1(0,1) and∈ there exists v L2(0,1) such that for every ϕ H1(0,1) one has ∈ 0 ∈ ∈ 0 1 1 1 u′ϕ′ = vϕ = ( v)ϕ. 0 0 − 0 − Z Z Z In particular, this equality holds for every test function ϕ D(0,1). Recalling now 1 ∈ 1 the definition of the Sobolev space H (0,1), we see that u′ belongs to H (0,1) 2 2 1 (that is u H (0,1)) and u′′ = v. This proves that u H (0,1) H0 (0,1) and ∇ E ∈ − ∈ ∩ L2 (u)= u′′. − 2 1 2 1 Conversely, let u H (0,1) H0 (0,1). Then u′′ L (0,1), u′ H (0,1), and by ∈ ∩ 1 ∈ ∈1 ϕ 1 ϕ definition of the Sobolev space H0 (0,1) we obtain the equality 0 u′ ′ = 0 u′′ ϕ D 1 − for every test function (0,1). By definition of the space H0 (0,1), the test 1∈ R R functions are dense in H0 (0,1). It is then an exercise to show that the equality 1 u ϕ = 1 u ϕ holds for every ϕ H1(0,1). Since u L2(0,1), this proves 0 ′ ′ − 0 ′′ ∈ 0 ′′ ∈ u D(∇ 2 E ). R L R ∈Putting the above inclusions together, we have therefore proved that

2 1 D(∇ 2 E )= H (0,1) H (0,1) and L ∩ 0 ∇ 2 E (u)= u′′. L − D We set ∆ := ∇ 2 E and we call this negative gradient the Dirichlet-Laplace (0,1) − L operator on the interval (0,1). Solving the boundary value problem

u H2(0,1), ∈  u (x)= f (x) for x (0,1), (3.3) − ′′ ∈   u(0)= u(1)= 0,  for a given f L2(0,1) is equivalent to solving the abstract equation ∈ D ∇ 2 E (u)= f (or: ∆u = f ), L − (0,1) in particular because the domain of the gradient coincides exactly with the space of all functions in u H2(0,1) which satisfy the Dirichlet boundary condition u(0)= u(1)= 0. ∈ 3.5 The Dirichlet-Laplace operator with multiplicative coefficient 35 3.5 The Dirichlet-Laplace operator with multiplicative coefficient

1 We consider, as in the previous example, the Sobolev space V = H0 (0,1) and the function E : H1(0,1) R given by 0 → 1 E 1 2 1 (u)= (u′) , u H0 (0,1). 2 0 ∈ Z We also consider, as in the previous example, the space H = L2(0,1), but now it is equipped with the inner product

1 1 v,w = vw , h i 0 m Z where m L∞(0,1) is a given positive function such that 1 L∞(0,1). This inner ∈ m ∈ product is equivalent to the usual inner product on L2(0,1), as one easily verifies. What is the gradient of E with respect to this new inner product?

As in the previous example, we first translate the definition of the gradient into this situation, by using Proposition 3.2, and we now find

∇E 1 2 D( )= u H0 (0,1) : there exists v L (0,1) such that for every { ∈ ∈ 1 1 ϕ 1 ϕ ϕ 1 H0 (0,1) one has u′ ′ = v , ∈ 0 0 m} Z Z ∇E (u)= v.

This gradient may be characterized in a more explicit way, as follows.

∇E 1 2 Let u D( ). Then u H0 (0,1) and there exists v L (0,1) such that for every ϕ ∈H1(0,1) ∈ ∈ ∈ 0 1 1 1 1 1 u′ϕ′ = vϕ = ( v )ϕ. 0 0 m − 0 − m Z Z Z In particular, this equality holds for every test function ϕ D(0,1). Recalling the 1 ∈ 1 definition of the Sobolev space H (0,1), we see that u′ belongs to H (0,1) (that is, u H2(0,1)) and u = v 1 . This proves u H2(0,1) H1(0,1) and ∇E (u)= ∈ − ′′ m ∈ ∩ 0 mu′′. − 2 1 2 1 Conversely, let u H (0,1) H0 (0,1). Then u′′ L (0,1), u′ H (0,1), and ∈ ∩ 1 ∈ ∈ 1 ϕ by definition of the Sobolev space H (0,1), we obtain the equality 0 u′ ′ = 1 ϕ 1 ϕ 1 ϕ D D 0 u′′ = 0 mu′′ m first for every (0,1), and then, since R (0,1) is − 1 − ϕ 1 ∈ 2 denseR in H0 (0,R1), also for every H0 (0,1). Since mu′′ L (0,1), this proves u D(∇E ). ∈ ∈ ∈Putting the above inclusions together, we have therefore proved that 36 3 Gradients in infinite dimensional space: the Laplace operator on a bounded interval

D(∇E )= H2(0,1) H1(0,1) and ∩ 0 ∇E (u)= mu′′. − Hence, solving the abstract equation

∇E (u)= f is now equivalent to solving the boundary value problem

u H2(0,1), ∈  m(x)u (x)= f (x) for x (0,1), − ′′ ∈   u(0)= u(1)= 0.   We consider also the following nonlinear variant of the above example. Let ε R ε 1 1 ∈ (0,1), and let m : [ , ε ] be a continuous function. For every u H0 (0,1) we consider the inner product→ ∈

1 1 v,w g(u) = vw , h i 0 m(u) Z where m(u)= m u is the composition of the functions m and u. The resulting func- 1 ◦ 2 tion g : H0 (0,1) Inner(L (0,1)), u , g(u) maps norm convergent sequences into strongly convergent→ sequences (exercise)7→ h· ·i and is therefore a metric. By repeating the above computation, one sees that the gradient of E with respect to the metric g is given by

2 1 D(∇gE )= H (0,1) H (0,1) and ∩ 0 ∇gE (u)= m(u)u′′. − Hence, solving the abstract equation

∇gE (u)= f is now equivalent to solving the boundary value problem

u H2(0,1), ∈  m(u(x))u (x)= f (x) for x (0,1), (3.4) − ′′ ∈   u(0)= u(1)= 0.  This problem is in general nonlinear. 3.6 Exercises 37 3.6 Exercises

3.1. Prove Proposition 3.2. 3.2. Show that the gradient of a quadratic function E : V R with respect to an → inner product , H is a linear operator, that is, the domain D(∇H E ) is a linear h· ·i subspace of H and the mapping u ∇H E (u) is linear. 7→ 3.3 (The Dirichlet-Laplace operator with first order coefficient). Let b : [0,1] R x → be a continuous function and let B(x) := 0 b(y) dy. Consider the energy function E 1 RR : H0 (0,1) , → 1 1 2 B u (u′) e− . 7→ 2 0 Z ∇ E 1 B Compute the gradient H with respect to the inner product v,w H = 0 vwe− on H = L2(0,1). More precisely, show that h i R 2 1 D(∇H E )= H (0,1) H (0,1), ∩ 0 ∇H E (u)= u′′ + bu′. − 3.4 (A second order operator in “divergence form”). Let a L∞(0,1). Consider the energy function ∈

E 1 R : H0 (0,1) , → 1 1 2 u a (u′) . 7→ 2 0 · Z ∇ E 2 Compute the gradient L2 with respect to the usual inner product on L (0,1), 1 v,w 2 = vw. More precisely, show that h iL 0 R 1 1 D(∇ 2 E )= u H (0,1) : au′ H (0,1) , L { ∈ 0 ∈ } ∇ 2 E (u)= (au′)′. L − 1 3.5. Let ε (0,1), and let m : R [ε, ε ] be a continuous function. For every u 1 ∈ → ∈ H0 (0,1) we consider the inner product 1 1 v,w g(u) = vw . h i 0 m(u) Z 1 2 Show that the function g : H0 (0,1) Inner(L (0,1)), u , g(u) is a metric, that is, it maps norm convergent sequences→ into strongly converg7→ h·ent·i sequences. Hint. One may use, without proof, the Sobolev embedding theorem (Theorem 1 G.14, Theorem 3.3) which says that the space H0 (0,1) is continuously embedded 1 into C([0,1]), that is, every function u H0 (0,1) is continuous on [0,1] and there ∈ ∞ exists a constant C > 0 (independent of u) such that u L C u H1 . k k ≤ k k 0

Lecture 4 Gradients in infinite dimensional space: the Laplace operator and the p-Laplace operator

In this lecture, we present further examples of gradients in infinite dimensional space: given an open set Ω Rd, we define the Dirichlet-Laplace operator on L2(Ω) and the Dirichlet-p-Laplace⊆ operator on L2(Ω). The reason, why we give these operators so much importance, can only be given later, when we will discuss linear and nonlinear diffusion equations as examples of gradient systems. But actually the Laplace operator and the p-Laplace operator appear not only in diffusion equations. They also appear in many other models from mathematical physics, such as wave propagation models, models from fluid mechanics, or in elasticity.

In order to introduce the new examples, we need to define Sobolev spaces on open sets Ω Rd. In this lecture, we write also ⊆ d uv := u,v euc = ∑ uivi for the euclidean inner product, and h i i=1 u for the euclidean norm on Rd. | | For a differentiable function u : Ω R, where Ω Rd is an open set, we denote by ∇u its euclidean gradient. → ⊆

4.1 Sobolev spaces in higher dimensions

Let Ω Rd be an open set with boundary ∂Ω. For every continuous function ϕ : Ω R⊆we define the support → suppϕ := x Ω : ϕ(x) = 0 , { ∈ 6 } where the closure is to be taken in Ω. Let

39 40 4 Gradients in infinite dimensional space: the Laplace operator and the p-Laplace operator

C1(Ω) := ϕ C1(Ω) : suppϕ is compact . c { ∈ } The following theorem, the in higher dimensions, is a con- sequence of the (Theorem F.6). In this theorem, C1(Ω¯ ) is the set of all continuously differentiable functions u : Ω R such that u and ∇u admit continuous extensions to the closure Ω¯ . For the definition→ of what we mean by an open set of class C1 (a notion related to the regularity of the boundary), and the def- inition of the outer vector, we refer to Appendix F. Domains with “smooth” boundary (balls, for example) are of class C1, while domains with corners (triangles, rectangles) or slits are not of class C1.

A C1 domain with A domain with corners A domain with a slit some outer normal is not of class C1 (here an ellipse minus vectors a curve) is not of class C1. Theorem 4.1 (Integration by parts). Let Ω Rd be open, bounded and of class C1. Then there exists a unique Borel measure σ⊆on ∂Ω (called the measure) such that for every u, v C1(Ω¯ ) and every 1 i d ∈ ≤ ≤ ∂v ∂u u = uvni dσ v, Ω ∂x ∂Ω − Ω ∂x Z i Z Z i

where n(x) = (ni(x))1 i d denotes the outer normal vector at a point x ∂Ω. ≤ ≤ ∈ In particular, if u C1(Ω¯ ) and ϕ C1(Ω), then ∈ ∈ c ∂ϕ ∂u u = ϕ, Ω ∂x − Ω ∂x Z i Z i and one may prove that this equality holds without any regularity assumption on Ω. Similarly as in the case of Sobolev spaces on intervals and weak derivatives of functions of one variable, the rule of integration by parts serves for the definition of weak partial derivatives. For every 1 p ∞ and every open Ω Rd we define the first Sobolev space ≤ ≤ ⊆

1,p p p W (Ω) := u L (Ω) : for every i = 1,...,d there exists gi L (Ω) such { ∈ ∂ϕ ∈ ϕ 1 Ω ϕ that for every Cc ( ) : u = gi . ∈ Ω ∂x − Ω } Z i Z 1 1,2 p We also write H (Ω) instead of W (Ω). By Lemma G.4, the functions gi L (Ω) ∂ u ∈ are uniquely determined, if they exist. In the following, we write := gi, and we ∂ xi 4.1 Sobolev spaces in higher dimensions 41

∂ ∂ ∂ call u the i-th weak partial derivative of u and ∇u := ( u ,..., u ) the weak ∂ xi ∂ x1 ∂ xd euclidean gradient of u. With these definitions, the Sobolev space W 1,p(Ω) is the space of all functions in Lp(Ω) which admit all weak partial derivatives in Lp(Ω). If u is continuously differentiable, then partial derivatives coincide with weak partial derivatives by Gauß’ theorem, and the euclidean gradient coincides with the weak euclidean gradient by Lemma 1.2.

The Sobolev spaces W 1,p(Ω) are Banach spaces for the norms

d ∂u 1 p p p ∞ u W1,p := u Lp + ∑ ∂ Lp , 1 p < , and k k k k i=1 k xi k ≤ ∂u  ∂u ∞ ∞ ∞ u W1,∞ := sup u L , L ,..., L , k k {k k k∂x1 k k∂xd k }

and H1(Ω) is a Hilbert space for the inner product

d ∂u ∂v u,v H1 := u,v L2 + ∑ ∂ , ∂ L2 . h i h i i=1h xi xi i We define 1,p Ω 1 Ω k·kW1,p W0 ( ) := Cc ( ) , 1,p Ω 1 Ω 1,p Ω that is, W0 ( ) is the closure of the space Cc ( ) in W ( ), and we put 1 Ω 1,2 Ω H0 ( ) := W0 ( ).

Not all properties of Sobolev spaces on intervals carry over to Sobolev spaces on open sets Ω Rd. For example, it is not true that every function u W 1,p(Ω) is continuous on⊆ the closure Ω¯ , or even in the interior Ω, unless∈ one makes further assumptions on p and Ω (dimension of Ω, regularity of the boundary)! It is therefore not immediately clear how to give a meaning to the restriction to the boundary ∂Ω of a function in W 1,p(Ω), since the boundary ∂Ω is in general a set of Lebesgue measure zero. The problem of defining such restrictions to the boundary – one calls them also traces – is not discussed here. We only say that it is possible to define such traces on the boundary, provided the boundary is regular enough. If the boundary ∂Ω is regular enough, then one can show that the space 1,p Ω 1,p Ω ∂Ω W0 ( ) is the space of all functions in W ( ) whose traces on the boundary vanish.

For every 1 p ∞ and every k 2 we define inductively the k-th Sobolev space ≤ ≤ ≥ ∂ k,p 1,p u k 1,p W (Ω) := u W (Ω) : for every i = 1,..., d one has W − (Ω) , { ∈ ∂xi ∈ } which is a Banach space for the norm 42 4 Gradients in infinite dimensional space: the Laplace operator and the p-Laplace operator

d ∂ 1 p u p p u k,p := u p + ∑ if 1 p < ∞, and W L ∂ W k 1,p k k k k i=1 k xi k − ≤ ∂u  ∂u ∞ ∞ u W k,∞ := sup u L , W k 1,∞ ,..., W k 1,∞ if p = . k k {k k k∂x1 k − k∂xd k − }

The space Hk(Ω) := W k,2(Ω) is a Hilbert space for the inner product

d ∂u ∂v u,v k := u,v 2 + ∑ , k 1 . H L ∂ ∂ H − h i h i i=1h xi xi i

For further results about Sobolev spaces, we refer to the Appendix or to [Adams (1975)], [Adams and Fournier (2003)].

4.2 The Dirichlet-Laplace operator

Ω Rd 1 Ω Let be an open set and consider the Banach space V = H0 ( ). We consider the function⊆ E : H1(Ω) R given by 0 → 1 E (u)= ∇u 2, u H1(Ω). 2 Ω | | ∈ 0 Z This function is a quadratic form with associated bilinear form a given by

∇ ∇ 1 Ω a(u,v)= u v, u, v H0 ( ). Ω ∈ Z We consider the Hilbert space H = L2(Ω), equipped with the usual inner product. Then V is densely and continuously embedded into H (for the density, it suffices 1 Ω 2 Ω E to prove that Cc ( ) is dense in L ( ), see Theorem G.3). The gradient of with respect to the usual L2 inner product is, as computed in Section 3.2, given by

1 2 D(∇ 2 E )= u H (Ω) : there exists v L (Ω) such that for every L { ∈ 0 ∈ ϕ 1 Ω ∇ ∇ϕ ϕ H0 ( ) one has u = v and ∈ Ω Ω } ∇ E Z Z L2 (u)= v. D∆ ∇ E We write Ω := L2 and we call this negative gradient the Dirichlet-Laplace operator on L2(Ω−). This term is justified by the following facts.

First, the Laplace operator is the partial

d ∂ 2 ∆ = div∇ = ∑ . ∂ 2 i=1 xi 4.2 The Dirichlet-Laplace operator 43

2 ∂ u ∂ 2u Second, if u H (Ω), then, by definition, all weak partial derivatives ∂ , ∂ ∂ ∈ xi xi x j exist and belong to L2(Ω). In particular, ∆u L2(Ω), if we understand by ∆u the ∂ 2u ∈ sum of the weak partial derivatives ∂ 2 . By definition of the weak derivatives (or: xi by definition of the Sobolev spaces H1 and H2), for every function ϕ C1(Ω) ∈ c d ∂u ∂ϕ d ∂u ∂ϕ ∇u∇ϕ = ∑ = ∑ Ω Ω ∂x ∂x Ω ∂x ∂x Z Z i=1 i i i=1 Z i i d ∂ 2u d ∂ 2u = ∑ ϕ = ∑ ϕ − Ω ∂x2 − Ω ∂x2 i=1 Z i Z i=1 i = ∆uϕ. − Ω Z 1 Ω 1 Ω ∇ ∇ϕ Since, by definition, the space Cc ( ) is dense in H0 ( ), the equality Ω u = ∆ ϕ ϕ 1 Ω Ω u actually holds for every H0 ( ). As a consequence of this and the − D∆ ∈ 2 Ω 1 Ω R definition of Ω , we obtain that every u H ( ) H0 ( ) belongs to the domain RD ∈ ∩ D(Ω ∆) and D Ω ∆u = ∆u. D 1 Third, since D(Ω ∆) H (Ω), every function in the domain of the Dirichlet- ⊆ 0 Laplace operator satisfies the Dirichlet boundary condition u ∂Ω = 0 in the sense that its trace (the restriction of the function to the boundary,| in the weak sense of Sobolev spaces) vanishes.

The Dirichlet-Laplace operator is of importance when one studies the boundary value problem ∆u = f in Ω, − ( u = 0 on ∂Ω, 2 Ω 1 Ω for some given function f L ( ). We call a function u H0 ( ) a weak solution of this boundary value problem∈ if for every ϕ H1(Ω) ∈ ∈ 0 ∇u∇ϕ = f ϕ. Ω Ω Z Z Moreover, we call a function u C2(Ω¯ ) a classical solution if it satisfies the equa- tion ∆u = f in Ω (all second∈ derivatives being now classical derivatives) and the boundary− condition u = 0 on ∂Ω, pointwise everywhere. By Gauß’ theorem, every classical solution is a weak solution. Moreover, one shows in an exercise that u is a weak solution if and only if D Ω ∆u = f . − Remark. Unfortunately and unlikely to the case of the Dirchlet-Laplace operator on the interval, we can not characterize the domain of the Dirichlet-Laplace operator d D on Ω, if Ω R is a generalopen set. For example, one can not provethat D(Ω ∆)= H2(Ω) H⊆1(Ω); only the inclusion is true in general, as we have shown above. ∩ 0 ⊇ 44 4 Gradients in infinite dimensional space: the Laplace operator and the p-Laplace operator

D 2 As soon as the dimension d 2, one can not conclude from Ω ∆u L (Ω) that each 2 ≥ ∈ ∂ u 2 Ω weak partial derivative ∂ 2 exists and belongs to L ( ), not to speak of the weak xi 2 mixed partial derivatives ∂ u . In fact, this is a difficult problem from the theory ∂ xi∂ x j of elliptic operators, and we refer to [Gilbarg and Trudinger (2001), Chapter 8], for example. Only two remarks, without proof: first, if Ω ′ Ω is an open subset such Ω¯ Ω ⊂ D∆ Ω that the closure ′ is still contained in , then the restriction of u D(Ω ) to ′ 2 ∈2 belongs to H (Ω ′) (“interior regularity”). Second, if Ω is of class C (a regularity D∆ 2 Ω 1 Ω property of the boundary again), then one does have D(Ω )= H ( ) H0 ( ) (“global regularity”); for example, this equality is true if Ω is a ball. These∩ results D∆ 2 Ω 1 Ω are not trivial and we summarize that the equality D(Ω )= H ( ) H0 ( ) is false in general. ∩

4.3 The Dirichlet-p-Laplace operator

Let Ω Rd be an open set and let p > 1. We consider the Banach space V = ⊆ W 1,p(Ω) and the function E : W 1,p(Ω) R given by 0 0 → 1 E (u)= ∇u p, u W 1,p(Ω). p Ω | | ∈ 0 Z This function is not a quadratic form (unless p = 2), but still we can show that it is continuously differentiable and compute its derivative, by applying a general result which we state and prove only in the following section. Theorem 4.2. The function E is continuously differentiable, and

p 2 1,p E ′(u)v = ∇u − ∇u∇v foreveryu, v W (Ω). (4.1) Ω | | ∈ 0 Z

Moreover, E and E ′ map bounded sets into bounded sets. Ω Rd R 1 p Proof. Consider the function F : given by F(x,u) = p u . Then × → ∂ F | | F(x,0)= 0, and the function F is continuously differentiable. If ∂ u (x,u) denotes Rd ∂ F p 2 the derivative of the function F(x, ) at u , then ∂ u (x,u)v = u − uv and ∂ · ∈ | | F (x,u) = u p 1 for every u, v Rd. Therefore, the function F satisfies the con- | ∂ u | | | − ∈ ditions of Theorem 4.3 below. By Theorem 4.3, the function F : Lp(Ω;Rd ) R → given by F (u)= Ω F(x,u(x)) dx is continuously differentiable and R p 2 p d F ′(u)v = u − uv for every u, v L (Ω;R ). Ω | | ∈ Z Now the statement of the theorem follows by observing that the function E is the composition of the function F above and the continuous, linear mapping L : W 1,p(Ω) Lp(Ω;Rd ), u ∇u, that is, E = F L . Every continuous lin- 0 → 7→ ◦ 4.3 The Dirichlet-p-Laplace operator 45 ear mapping is continuously differentiable, so that E is continuously differentiable, too. The chain rule yields (4.1). Since F and F ′ map bounded sets into bounded sets, by Theorem 4.3, and since L is bounded, E and E ′ map bounded sets into bounded sets, too. 1,p Ω 2 Ω We consider next the Banach space V = W0 ( ) L ( ), equipped with the ∩ 2 Ω sum norm u V := u W 1,p + u L2 , and the Hilbert space H = L ( ), equipped with the usualk k innerk product.k Thenk k V is continuously and densely embedded into 2 Ω 1,p Ω E L ( ) and into W0 ( ). By Theorem 4.2, the restriction of to the space V is continuously differentiable, E and E ′ map bounded sets into bounded sets, and

p 2 E ′(u)v = ∇u − ∇u∇v for every u, v V. Ω | | ∈ Z The gradient of E with respect to the usual L2 inner product is, according to the definition, the operator

2 D(∇ 2 E )= u V : there exists v L (Ω) such that for every L { ∈ ∈ p 2 ϕ V one has ∇u − ∇u∇ϕ = vϕ and ∈ Ω | | Ω } ∇ E Z Z L2 (u)= v. D∆ ∇ E We write Ω p := L2 and we call this negative gradient the Dirichlet-p- Laplace operator on−L2(Ω). This term is justified by the following considerations.

The p-Laplace operator is the suitable partial differential operator which to every function u : Ω R assigns the function → p 2 ∆pu := div ∇u − ∇u . | |

Note that ∆2 is equal to the Laplace operator ∆. 

Consider the boundary value problem

∆pu = f in Ω, − (4.2) ( u = 0 on ∂Ω, where f L2(Ω) is a given function. We call a function u W 1,p(Ω) L2(Ω) a ∈ ∈ 0 ∩ weak solution of this problem if for every ϕ W 1,p(Ω) L2(Ω) one has ∈ 0 ∩ p 2 ∇u − ∇u∇ϕ = f ϕ. Ω | | Ω Z Z With this definition of weak solution, by the definition of the Dirichlet-p-Laplace operator, and by Theorem 4.2, a function u is a weak solution of the above boundary value problem if and only if D Ω ∆pu = f . − 46 4 Gradients in infinite dimensional space: the Laplace operator and the p-Laplace operator 4.4 An auxiliary result

Theorem 4.3. Let Ω Rd be an open set, and let 1 < p < ∞. Let F : Ω Rm R (m 1) be a measurable⊆ function such that × → ≥ F(x, ) is continuously differentiable for every x Ω, and · ∈ p p 1 there exist C1 L ′ (Ω) (p′ := ), C2 L (Ω), C 0 such that ∈ p 1 ∈ ≥ ∂ − p 1 m F(x,u) C1(x)+C u − for every x Ω, u R , and |∂u |≤ | | ∈ ∈ F(x,0) C2(x) for every x Ω, | |≤ ∈ ∂ where ∂ u F(x,u) is the derivative of the function F(x, ) at the point u. Then the function F : Lp(Ω;Rm) R given by · → F (u)= F(x,u(x)) dx, u Lp(Ω;Rm), Ω ∈ Z is well defined, continuously differentiable, and ∂ p m F ′(u)v = F(x,u(x))v(x) dx foreveryu, v L (Ω;R ). Ω ∂u ∈ Z

Moreover, F and F ′ map bounded sets into bounded sets. Proof. We first prove that the function F is well defined. The two growth conditions on F and the imply that for every x Ω and every u Rm ∈ ∈ F(x,u) F(x,0) + F(x,u) F(x,0) | | ≤ | | | ∂− | F ξ F(x,0) + sup ∂ (x, ) u ≤ | | ξ [0,u]| u || | ∈ p 1 C2(x)+ C1(x)+C sup ξ − u ≤ ξ [0,u] | | | | ∈ p  = C2(x)+C1(x) u +C u . | | | | For every u Lp(Ω;Rm), H¨older’s inequality implies that ∈ p F(x,u(x)) dx C2 1 + C1 p u Lp +C u p . Ω | | ≤k kL k kL ′ k k k kL Z By this estimate, the function F is well defined, and we also see that it maps bounded sets into bounded sets. Fix u Lp(Ω;Rm) and define the linear functional u Lp(Ω;Rm) by ∈ ′ ∈ ′ ∂ F p m u′(v)= (x,u(x))v(x), v L (Ω;R ). Ω ∂u ∈ Z 4.4 An auxiliary result 47

This linear functional u′ is well defined since, by H¨older’s inequality, for every v Lp(Ω;Rm) ∈

∂F ∂F (x,u(x))v(x) dx ( ,u) p v Lp Ω | ∂u | ≤k ∂u · kL ′ k k Z p 1 C1 p +C u −p v Lp . (4.3) ≤ k kL ′ k kL k k  We prove that F is differentiable and that F ′(u)= u′. For this, we have to prove that F (u + h) F (u) u (h) lim | − − ′ | = 0. h p 0 h Lp k kL → k k For every h Lp(Ω;Rm), by H¨older’s inequality, ∈

F (u + h) F (u) u′(h) (4.4) | − − |≤ ∂F F(x,u(x)+ h(x)) F(x,u(x)) (x,u(x))h(x) dx ≤ Ω − − ∂u Z ∂ F F(x,u(x)+ h(x)) F(x,u(x)) ∂ (x,u(x))h(x) = − − u h(x) dx Ω h(x) | | Z | | ∂ F 1 F(x,u(x)+ h(x)) F(x,u(x)) ∂ u (x,u(x))h(x ) p′ − − dx p′ h Lp , ≤ Ω h(x) k k Z | |  where we interpret the term under the integral as 0 if h(x)= 0. Let now (hn) Lp(Ω;Rm) be a sequence converging to 0. From this sequence, we can extract⊂ a subsequence (which we denote for simplicity again (hn)) and we can find a g Lp(Ω) such that ∈

hn 0 almost everywhere on Ω and → hn g almost everywhere, for every n. | |≤ Since F(x, ) is differentiable for every x Ω, we have for this subsequence · ∈ ∂ F F(x,u(x)+ hn(x)) F(x,u(x)) ∂ (x,u(x))hn(x) − − u 0 almost everywhere on Ω. hn(x) → | | ∂ F Moreover, using the growth assumptions on ∂ u , for every n and almost every x we have

∂ F F(x,u(x)+ hn(x)) F(x,u(x)) (x,u(x))hn(x) − − ∂ u hn(x) ≤ | | ∂ ∂ F ξ F sup ∂ (x, (x)) + ∂ (x,u(x)) ≤ ξ (x) [u(x),u(x)+hn(x)] | u | | u ∈

48 4 Gradients in infinite dimensional space: the Laplace operator and the p-Laplace operator

p 1 p 1 2C1(x)+C sup ξ (x) − +C u(x) − ≤ ξ (x) [u(x),u(x)+hn(x)] | | | | ∈ p 1 p 1 2C1(x)+C ( u(x) + hn(x) ) − +C u(x) − ≤ | | | p| 1 | | p 1 2C1(x)+C (Cp + 1) u(x) − +CCp hn(x) − ≤ | | | | p 1 p 1 2C1(x)+C (Cp + 1) u(x) − +CCp g(x) − . ≤ | | Here, we have also used the estimate

p 1 p 1 p 1 (a + b) − Cp a − +Cp b − , ≤ which is true for every a, b 0 and some constant Cp 0. Hence, by Lebesgue’s dominated convergence theorem,≥ ≥

∂ F F(x,u(x)+ hn(x)) F(x,u(x)) ∂ (x,u(x))hn(x) p lim − − u ′ dx = 0. n ∞ Ω h (x) → Z n | | p m We have thus proved that for every sequence (hn) L (Ω;R ) converging to 0 ⊂ there exists a subsequence (which we denote for simplicity again by (hn)) such that the preceding equality holds. We claim that this implies

∂ F F(x,u(x)+ h(x)) F(x,u(x)) ∂ (x,u(x))h(x) p lim − − u ′ dx = 0. (4.5) h p 0 Ω h(x) k kL → Z | | p m In fact, if this was not true, then there exist ε > 0 and a sequence (hn) L (Ω;R ) converging to 0 such that ⊂

∂ F F(x,u(x)+ hn(x)) F(x,u(x)) ∂ (x,u(x))hn(x) p − − u ′ ε, Ω h (x) ≥ Z n | | which is not possible by what we have proved above. From (4.5) and the estimate (4.4) we deducethat F is differentiable and F ′(u)= u′. In particular, F is continuous, and the estimate (4.3) says

p 1 F ′(u) p C1 p +C u −p , k k(L )′ ≤k kL ′ k kL so that F ′ maps bounded sets into bounded sets. p m It remains only to prove that F ′ is continuous. Fix u L (Ω;R ), and let (hn) Lp(Ω;Rm) be a sequence converging to 0. As before, we∈ can extract a subsequence⊂ p (which we denote again by (hn)) and we can find a g L (Ω) such that ∈

hn 0 almost everywhere on Ω and → hn g almost everywhere, for every n. | |≤ ∂ F Then, using the growth assumptions on ∂ u as above, by H¨older’s inequality and by Lebesgue’s dominated convergence theorem, 4.5 Origins of the Laplace operator 49

F F limsup ′(u + hn) ′(u) (Lp) n ∞ k − k ′ ≤ → ∂F ∂F limsup sup ∂ (x,u(x)+ hn(x)) ∂ (x,u(x)) v(x) dx ≤ n ∞ v p 1 Ω u − u | | → k kL ≤ Z ∂ ∂ 1 F F p′ limsup (x,u(x)+ hn(x)) (x,u(x)) dx p′ ≤ ∞ Ω ∂u − ∂u n Z →  = 0. In particular, limsup and liminf coincide. We have proved that for every sequence p m (hn) L (Ω;R ) converging to 0 there exists a subsequence (which we denote ⊂ again by (hn)) such that

lim F ′(u + hn) F ′(u) Lp = 0. n ∞ ( )′ → k − k

An argument by contradiction, as it was already used in this proof, implies that F ′ is continuous.

4.5 Origins of the Laplace operator

... et j’ose me flatter de pr´esenter aux g´eom`etres, dans cet Ouvrage, une th´eorie des attrac- tions des sph´ero¨ıdes et de la figure des plan`etes plus g´en´erale et plus simple que celles qui sont d´ej`aconnues.1 Pierre Simon de Laplace, 1785

In this short section, we try to describe the context in which the so-called Laplace operator and Laplace’s equation appeared for the first time. As we already indicated, the Laplace operator arises in many partial differential equations describing various physical problems. One such physical problem is the problem of attraction of masses and Newton’s gravitational force.

In 1785, in his memoir Theorie´ des attractions des sphero´ ¨ıdes et de la figure des planetes` 2, Pierre Simon de Laplace describes the gravitational forces induced by mass distributions concentrated on spheroids (closed surfaces in the space), in particular by planets. It had been known that the gravitational force induced by a mass distribution is a vector field which is the gradient of a so-called potential.

... the recognition that a function exists which is a potential for the Newtonian gravitational force appeared first in Lagrange’s 1773/1774 memoir Sur l’´equation s´eculaire de la lune in

1 Oeuvres compl`etes de Laplace. Tome 10 / publi´ees sous les auspices de l’Acad´emie des sciences, par MM. les secr´etaires perp´etuels – Gauthier-Villars (Paris) - 1878–1912. See the memoir Th´eorie des attractions des sph´ero¨ıdes et de la figure des plan`etes on page 341. It is for us difficult to translate the beginning of the citation, but a translation could be: ... and I dare to be proud to present to the geometers, in this work, a theory of attractions of spheroids and of the shape of the planets which is more general and simpler than those which are already known. 2 Op. cit., pages 341–. 50 4 Gradients in infinite dimensional space: the Laplace operator and the p-Laplace operator

which it seems to serve principally as a convenient method for calculating the components of force in various coordinate systems. In 1777, Lagrange devoted a short paper to some of the properties of potential functions, including an application to the question of replacing a system of bodies by the centre of mass of the system for the purposes of calculating their motions.3

The , denoted by V in Laplace’s memoir, can be obtained by integrating the potential of a point mass over the mass distribution. The representation of the gravitational potential in this form allows one to consider very general mass distributions, in particular mass distributions on arbitrary spheroids, not only round spheres or other regular surfaces. This is an advantage compared to earlier works on , as Laplace remarks in his introduction.

Considering spherical coordinates (r,ϕ,θ) and putting cosθ = µ, Laplace ob- tains the equation4

∂ ∂V 1 ∂ 2V ∂ 2(rV) [(1 µ2) ]+ + r = 0, ∂µ − ∂µ 1 µ2 ∂ϕ2 ∂r2 − which is the so-called Laplace equation for the gravitational potential

∆V = 0 in spherical coordinates.

In a later memoir from 1789 on Saturn rings5, Laplace writes the above equation in cartesian coordinates. He writes:

V est la somme des mol´ecules du sph´ero¨ıde divis´ees par leurs distances respectives au point attir´em ; pour avoir l’attraction du sph´ero¨ıde sur ce point parall`element `aune droite quelconque, il faut donc consid´erer V comme une fonction de trois coordonn´ees rectangles dont l’une soit parall`ele `acette droite, et diff´erentier cette fonction relativement `acette coordonn´ee : le coefficient de la diff´erentielle de la coordonn´ee, pris avec un signe contraire, sera la valeur de l’attraction du sph´ero¨ıde d´ecompos´ee parall`element `ala droite donn´ee et dirig´ee vers l’origine de la coordonn´ee qui lui est parall`ele. Si l’on repr´esent par β la fonction

1 , (x x )2 + (y y )2 + (z z )2 − ′ − ′ − ′ on aura p V = βρ dx′ dy′ dz′. Z

3 Jahnke, H. N. (Ed.). : A history of analysis. Vol. 24 of History of Mathematics. American Math- ematical Society, Providence, RI, 2003, p. 198. 4 Oeuvres compl`etes de Laplace. Tome 10. Op. cit., page 362. We denote the spherical coordinates slightly differently. 5 Oeuvres compl`etes de Laplace. Tome 11 / publi´ees sous les auspices de l’Acad´emie des sciences, par MM. les secr´etaires perp´etuels – Gauthier-Villars (Paris) - 1878–1912. See the M´emoire sur la th´eorie de l’anneau de Saturne. 4.5 Origins of the Laplace operator 51

... Mais il est facile de s’assurer, par la diff´erentiation, que l’on a

∂ 2β ∂ 2β ∂ 2β 0 = + + ; ∂x2 ∂y2 ∂z2

on aura donc pareillement ∂ 2V ∂ 2V ∂ 2V 0 = + + . ∂x2 ∂y2 ∂z2 Cette ´equation, rapport´ee `a d’autres coordonn´ees, est la base de la th´eorie que j’ai pr´esent´ee dans nos M´emoires de 1782 sur les attractions des sph´ero¨ıdes et sur la figure des plan`etes.6

The english translation is taken from [Jahnke (2003), pages 198-199]:

V is the sum of the molecules of the spheroid divided by their respective distances to the attracted point. To find the attraction of the spheroid on this point parallel to a given line, one must therefore consider V as a function of three rectangular coordinates of which one is parallel to this line, and differentiate this function with respect to this coordinate. The differential coefficient of the coordinate, taken with the opposite sign, is the value of the component of the attraction of the spheroid parallel to the given line and directed toward the origin of the coordinate which is parallel to it. If we represent by β the function

1 , (x x )2 + (y y )2 + (z z )2 − ′ − ′ − ′ [β is the graviational potentialp of a mass point] then we obtain

V = βρ dx′ dy′ dz′. Z [ρ is the density]... But it is easy to verify by differentiation that we have

∂ 2β ∂ 2β ∂ 2β 0 = + + ; ∂x2 ∂y2 ∂z2

likewise we have ∂ 2V ∂ 2V ∂ 2V 0 = + + . ∂x2 ∂y2 ∂z2 Our translation of the last sentence of the french quotation is

This equation, given in different coordinates, is the basis of the theory which I presented in our M´emoires from 17827 on the attraction of the spheroids and of the shape of the planets.

Laplace’s memoirs from 1785 and 1789 are perhaps the first occasions on which Laplace’s equation and the Laplace operator appear. The fact that graviational potentials V satisfy Laplace’s equation is a direct consequence of the fact that the potential β satisfies Laplace’s equation; this may be checked by a straightforward calculation. Note that the function β is, up to a constant, the so-called fundamental

6 Oeuvres compl`etes de Laplace. Tome 11. Op. cit., pages 277–278. 7 The memoir Th´eorie des attractions ... was submitted to the Acad´emie des Sciences in 1782, and published in 1785. 52 4 Gradients in infinite dimensional space: the Laplace operator and the p-Laplace operator solution of Laplace’s equation.

In the following, when studying the problem of heat conduction, we shall see how the Laplace operator appears as a consequence of Lemma 1.3, Fourier’s law and the Divergence theorem (Theorem F.6). We shall see that the problem of heat conduction leads to Laplace’s equation, too, if the heat conduction is stationary (constant in time); the unkown function V then represents the heat density.

4.6 Exercises

1 Ω Ω Rd 4.1. Show that every function u Cc ( ) ( open) vanishes in a neighbour- hood of the boundary ∂Ω. ∈ ⊆

D 2 2 4.2. Let Ω ∆ be the Dirichlet-Laplace operator on L (Ω), and let f L (Ω). Show that u H1(Ω) is a weak solution of the elliptic boundary value problem∈ ∈ 0 ∆u = f in Ω, − ( u = 0 on ∂Ω,

D if and only if u D(Ω ∆) and ∈ D Ω ∆u = f . − 4.3 (The Neumann-Laplace operator on a bounded interval). Consider the Sobolev space V = H1(0,1) and the quadratic form E : H1(0,1) R given by → 1 1 E (u)= (u )2. 2 ′ Z0 ∇ E 2 Compute the gradient L2 with respect to the usual inner product in H = L (0,1). More precisely, show that

2 D(∇ 2 E )= u H (0,1) : u′(0)= u′(1)= 0 , L { ∈ } ∇ 2 E (u)= u′′. L − 1 Hint. One may use, without proof, the formula of integration by parts 0 vw′ = 1 1 v(1)w(1) v(0)w(0) 0 v′w for functions v, w H (0,1). The inclusionR ∇ E − 2 − ∈ D( L2 ) H (0,1) is shown similarly as in Section 3.4. In order to show that ⊂ ∇ E R every u D( L2 ) satisfies the boundary conditions u′(0)= u′(1)= 0, use special “test functions”∈ ϕ H1(0,1). ∈ N 2 Remark. We call ∆ := ∇ 2 E the Neumann-Laplace operator on L (0,1). (0,1) − L N∆ Solving the equation (0,1) u = f is equivalent to solving the boundary value prob- lem − 4.6 Exercises 53

u H2(0,1), ∈  u (x)= f (x) for x (0,1), (4.6) − ′′ ∈   u′(0)= u′(1)= 0.  The boundary conditions u′(0)= 0 and u′(1)= 0 are called Neumann boundary conditions or no-flux boundary conditions.

4.4 (An Ornstein-Uhlenbeck operator with Dirichlet boundary conditions). Let Ω Rd be an open set, and let B : Ω R be a bounded, continuously differentiable ⊆ ∇ → E 1 Ω R function with bounded gradient B. Consider the quadratic form : H0 ( ) E 1 ∇ 2 B ∇ E → given by (u)= 2 Ω u e− , and let H be its gradient with respect to the | | B 2 Ω 1 Ω ∇ E inner product v,w HR = Ω vwe− . Show that H ( ) H0 ( ) D( H ), and that for every u Hh 2(Ωi ) H1(Ω) one has ∇ E (u)= ∆∩u + ∇B∇⊆u. R0 H ∈ ∩ − ∂ ∂ v ∂ w Hint. You may use, without proof, the product rule ∂ (vw)= ∂ w + v ∂ for v xi xi xi ∈ H1(Ω), w C1(Ω). ∈ 4.5. Let E : V R be a quadratic form on a Banach space V which embeds densely → and continuously into a Hilbert space H. Show that the gradient ∇H E is a symmetric operator on H, that is, for every u, v D(∇H E ) one has ∈

∇H E (u),v H = u,∇H E (v) H . h i h i 4.6. a) Let F : (0,1) R R be a continuously differentiable function satisfying × → ∂ F the hypotheses of Theorem 4.3 for some p > 1, and let f (x,u) := ∂ u (x,u). Show that the function E : H1(0,1) R given by 0 → 1 1 2 E (u)= (u′) + F( ,u) 0 2 · Z   is continously differentiable and compute the gradient with respect to the usual inner product in L2(0,1). More precisely, show that

2 1 D(∇ 2 E )= H (0,1) H (0,1), L ∩ 0 ∇ 2 E (u)= u′′ + f ( ,u). L − · b) Let F and f be as in (a), and let further m L∞(0,1) be a positive function 1 ∞ ∈ E R such that m L (0,1). Find a Banach space V, a function : V and a → Hilbert space∈H such that V ֒ H and → 2 1 D(∇H E )= H (0,1) H (0,1), ∩ 0 ∇H E (u)= mu′′ + f ( ,u). − · 4.7 (The Dirichlet-Laplace operator). 1 a) (Poincare’s´ inequality). Show that u,v H1 = 0 u′v′ defines an inner product h i 0 1 1 on H (0,1) which is equivalent to the usual RH inner product , 1 (that 0 h· ·iH 54 4 Gradients in infinite dimensional space: the Laplace operator and the p-Laplace operator

is, the corresponding norms are equivalent). Note that it suffices to prove that there exists a constant λ > 0 such that

1 1 λ 2 2 1 u (u′) for every u H0 (0,1). 0 ≤ 0 ∈ Z Z Hint. Prove this inequality first for u C1(0,1). ∈ c 2 1 E b) Onthespace V = H (0,1) H0 (0,1) we consider the quadratic form : V R E 1 1 ∩2 E → given by (u)= 2 0 (u′′) . Compute the gradient of with respect to the inner product , H1 defined in (a). More precisely, show that h· ·i 0 R ∇ E 3 1 1 D( H1 )= u H (0,1) H0 (0,1) : u′′ H0 (0,1) , 0 { ∈ ∩ ∈ } ∇ E H1 (u)= u′′. 0 − ψ 1 Hint. One may use the following two facts: first, for every Cc (0,1) there exists a (unique) solution of the problem ∈

ϕ C2([0,1]), ∈  ϕ (x)= ψ(x) for x (0,1), − ′′ ∈   ϕ(0)= ϕ(1)= 0.  ϕ  1 1 ψ 1 ψ ψ Note that V. Second, if v, w L (0,1) and if 0 v = 0 w for every C1(0,1), then∈ v = w. The first fact∈ may be proved by integrating the equation∈ c R R ϕ = ψ twice and adjusting constants. For the second fact, see Lemma G.4. − ′′ d d 4.8. a) Let A : R R be a linear mapping and assume that A = ∇H E is the gradient of a quadratic→ form E : Rd R with respect to an inner product → d , H . Show that the spectrum of A (considered also as a linear mapping C h·Cd·i) is real, that is, whenever Au = λu for some λ C and some nonzero→ u Cd, then λ R. ∈ ∈ ∈ b) (The damped mathematical pendulum). Let α, β > 0. Show that the ordi- nary differential equation of the damped mathematical pendulum

u˙1 u2 = 0 − u˙2 + αu2 + β u1 = 0

is not a gradient system of the formu ˙ + ∇H E (u)= 0 if α < 2 β (that is, if the damping is weak). p Remark. By changing Exercise 2.3 in an appropriate way, one can show that the damped mathematical pendulum is a gradient system of the form 2 u˙ + ∇gE (u)= 0, at least on R 0 . \{ } Lecture 5 Bochner-Lebesgue and Bochner-Sobolev spaces

This lecture on integration theory is a short introduction to the Bochner integral of Banach space-valued functions, to Bochner-Lebesgue spaces and to Bochner- Sobolev spaces. These notions are necessary notions in order to study abstract differential equations in Banach spaces and, in particular, gradient systems in Banach spaces; the solutions of such abstract differential equations naturally live in Bochner-Lebesgueor Bochner-Sobolev spaces of Banach space-valued functions.

We suppose that the reader is familiar with the Lebesgue integral of real-valued functions. While it is then relatively straightforward to define , Lebesgue and Sobolev spaces of functions with values in Rd – by reducing everything to the components of such functions – the situation is a priori less clear if one considers functions with values in general Banach spaces. Should one argue componentwise, which means here, if u : Ω X is a function with values in a Banach space X, to reduce everything to the→ scalar-valued case by considering the functions x ,u with x X ? This seems to be one possibility, and if properly done, it ′ X′,X ′ ′ leadsh i to the definition∈ of the so-called Pettis integral. For our purposes, however, we follow another strategy by repeating the ideas of the Lebesgue integral for functions with values in Banach spaces (notion of step function, measurability, integrability, integral). This leads to the definition of the Bochner integral which shares many properties of the Lebesgue integral, to Bochner-Lebesgue spaces and to Bochner-Sobolev spaces.

Here and throughout, we consider only open subsets in Rd with the Lebesgue measure as measure spaces, but most of the results on the Bochner integral and Bochner-Lebesgue spaces remain true for general measure spaces. We take the op- portunity to prove some results for Bochner-Sobolev spaces of functions defined on intervals – which are equally true for functions with values in Banach spaces and for scalar-valued functions –, which were not proved in Lecture 3.

55 56 5 Bochner-Lebesgue and Bochner-Sobolev spaces 5.1 The Bochner integral

Let X be a real Banach space with norm , let Ω Rd be an open set, and denote by A the Lebesgue σ-algebra on Ω, thatk·k is, the smallest⊆ σ-algebra which contains the Borel σ-algebra (generated by the open sets) and all subsets of sets of Lebesgue measure zero. The Lebesgue measure on Ω is denoted by µ. A function f : Ω X is called step function, if there exists a sequence (An) A → ⊆ of mutually disjoint Lebesgue measurable sets and a sequence (xn) X such that ∑ ⊂ f = n 1An xn, where 1A denotes the characteristic function of the set A. A function f : Ω X is mesurable, if there exists a sequence ( fn) of step functions fn : Ω → → X such that fn f pointwise almost everywhere. Note that in the case X = R, this definition of→ measurability is equivalent to the one which says that preimages of measurable sets should be measurable. With the definition of measurability, the proofs of the statements (a)-(e) in the following lemma are straightforward. Lemma 5.1. Let X andY be two real Banach spaces. a) Every continuous function f : Ω X is measurable. → b) If f : Ω X is measurable, then f : Ω R is measurable. → k k → c) If f : Ω X is measurable and if g : X Y is continuous, then the composite function→ g f : Ω Y is measurable. → ◦ → d) If f : Ω X andg : Ω R are measurable, then the product fg : Ω X is measurable.→ → → e) If f : Ω X and g : Ω X are measurable, then the product g, f : ′ X′,X Ω R is→ measurable. → h i → f) If ( fn) is a sequence of measurable functions Ω X such that fn f point- wise almost everywhere, then f is measurable. → → The statement (f) may be obtained as a consequence of the correspond- ing statement with X = R and the following theorem which due to Pettis; see [Hille and Phillips (1957), Theorem 3.5.3]. Theorem 5.2 (Pettis). A function f : Ω X is measurable if and only if x , f is → h ′ i measurable for every x′ X ′ (we say that f is weakly measurable) and there exists a Lebesgue null set N ∈A such that f (Ω N) is separable. ∈ \ We say that a function f : Ω X is integrable if f is measurable and Ω f < ∞, that is, if f is measurable and→ the positive function f : Ω R is integrablek k R in the usual Lebesgue sense. For every integrable stepk functionk → f : Ω X, f = ∑ → n 1An xn, we define the (Bochner) integral

f dµ := ∑ µ(An)xn. Ω Z n µ The series ∑n (An)xn converges absolutely and the limit is independent of the rep- resentation of f , as one can easily show. Hence, the Bochner integral for integrable 5.1 The Bochner integral 57 step functions is well defined; the integral Ω f dµ is an element of X. For every integrable function f : Ω X we define the (Bochner) integral → R

f dµ := lim fn dµ, Ω n ∞ Ω Z → Z where ( fn) is any sequence of step functions Ω X such that fn f and fn → k k≤k k → f pointwise almost everywhere. We note without proof that such a sequence ( fn) always exists, that the step functions fn are integrable, that the limit of the integrals Ω fn dµ exists and that the limit is independent of the choice of the sequence ( fn). The Bochner integral is therefore a natural generalization of the Lebesgue integral R of a scalar-valued function. The Bochner integral enjoys many properties known from the Lebesgue integral. For example, the triangle inequality

f dµ f dµ Ω ≤ Ω k k Z Z is true. Lebesgue’s dominated convergence theorem holds true, too.

Theorem 5.3 (Lebesgue, dominated convergence). Let ( fn) be a sequence of in- tegrable functions Ω X and let f : Ω X be a function. Suppose that there → → exists an integrable function g : Ω R such that fn g for every n and fn f pointwise almost everywhere. Then→ f is integrablek andk≤ →

f dµ = lim fn dµ. Ω n ∞ Ω Z → Z It is easy to prove that the Bochner integral is linear in the following sense.

Lemma 5.4 ( of the Bochner integral). a) For every integrable f , g : Ω X the sum f + g is integrable and → ( f + g) dµ = f dµ + gdµ. Ω Ω Ω Z Z Z b) For every integrable f : Ω X and every linear continuous T : X Y the function T f : Ω Y is integrable→ and → → Tfdµ = T f dµ. Ω Ω Z Z We also use the following notation for the Bochner integral:

f or f (t) dµ(t), Ω Ω Z Z and if Ω = (a,b) is an interval in R, then we write

b b b f or f (t) dµ(t) or f (t) dt Za Za Za 58 5 Bochner-Lebesgue and Bochner-Sobolev spaces for the Bochner integral of an integrable function f : (a,b) X. →

5.2 Bochner-Lebesgue spaces

For every measurable function f : Ω X and every 1 p < ∞ we put → ≤ p 1/p f Lp := f dµ . k k Ω k k Z  We also put f L∞ := inf C 0 : µ( f C )= 0 . k k { ≥ {k k≥ } } For every 1 p ∞ we then define ≤ ≤ p L (Ω;X) := f : Ω X measurable : f Lp < ∞ . { → k k } Similarly as in the scalar case, one can show that L p(Ω;X) is a vector space and p that Lp is a seminorm on L (Ω;X). If we let k·k p Np := f L (Ω;X) : f Lp = 0 { ∈ k k } = f L p(Ω;X) : f = 0 almost everywhere { ∈ } then the quotient space

p p p L (Ω;X) := L (Ω;X)/Np := f + Np : f L (Ω;X) { ∈ } becomes a Banach space for the norm

[ f ] Lp := f Lp ([ f ]= f + Np). k k k k The norm of the equivalence class [ f ] is well defined, that is, it is independent of the representative f in this class. We call the spaces Lp(Ω;X) Bochner-Lebesgue spaces. As in the scalar case, we identify functions f L p(Ω;X) with their equiv- alence classes [ f ] Lp(Ω;X), and we say that Lp is a∈. In particular, we identify two functions∈ if they are equal almost everywhere. If Ω = (a,b) is an interval in R we simply write

Lp(a,b;X) := Lp((a,b);X).

For bounded Ω Rd and if 1 q p ∞ we have the inclusions ⊆ ≤ ≤ ≤ ∞ C(Ω¯ ;X) L (Ω;X) Lp(Ω;X) Lq(Ω;X) L1(Ω;X), ⊆ ⊆ ⊆ ⊆ as can be easily proved by using H¨older’s inequality. In particular, if Ω is bounded and f is continuous on the closure Ω¯ , then f belongs to Lp(Ω;X) for every 1 p ∞. ≤ ≤ 5.3 Bochner-Sobolev spaces in one space dimension 59

Let us collect some results (without proof) about Bochner-Lebesgue spaces which are used in following lectures.

Theorem 5.5. If 1 p < ∞ and if X is separable, then Lp(Ω;X) is separable. More ≤ p precisely, whenever (hn) L (Ω) and (xn) X are two dense sequences, then ⊆ ⊆

F := f : Ω X : f = hn xm for some n, m N { → ∈ } is countable and total in Lp(Ω;X), that is, spanF is dense in Lp(Ω;X).

Let 1 p ∞ and let p = p be the conjugate exponent (with usual interpreta- ≤ ≤ ′ p 1 tions if p = 1 or p = ∞). By H¨older’s− inequality, for every f Lp(Ω;X) and every ∈ g Lp′ (Ω;X ) the function f ,g is integrable and ∈ ′ h iX,X′

f ,g X,X f Lp(Ω;X) g p Ω . Ω h i ′ ≤k k k kL ′ ( ;X′) Z

Moreover,

g p = sup f ,g X,X . k kL ′ Ω h i ′ f Lp 1 Z k k ≤ p Thus, similarly as in the scalar case, we may identify L ′ (Ω;X ′) with a closed sub- space of Lp(Ω;X) . The linear mapping which maps every g Lp′ (Ω;X ) to the ′ ∈ ′ continuous f Ω f ,g is, as in the scalar case, an isometry. However, and this is different to the7→ scalarh case,i this mapping is in general not surjective, even R ∞ p Ω p′ Ω ∞ if p < . The usual identification L ( )′ ∼= L ( ) (for p < ) is no longer true for general Bochner-Lebesgue spaces, that is, we do not have a full identification of the dual space of Lp(Ω;X) in terms of the Bochner integral and Bochner-Lebesgue spaces. However, the following results are true; for the proof of the first two state- ments, see [Diestel and Uhl (1977)], the third statement is an easy exercise.

Theorem 5.6. a) If 1 p < ∞ and if X is reflexive, then Lp(Ω;X) = Lp′ (Ω;X ). ≤ ′ ∼ ′ b) If 1 < p < ∞ and if X is reflexive, then the space Lp(Ω;X) is reflexive, too. c) If H is a Hilbert space, then the space L2(Ω;H) is a Hilbert space for the inner product

2 f ,g 2 Ω := f ,g H , f , g L (Ω;H). h iL ( ;H) Ω h i ∈ Z

5.3 Bochner-Sobolev spaces in one space dimension

Let X be a real Banach space, ∞ a < b ∞, and p [1,∞]. The first Bochner- Sobolev space is the space − ≤ ≤ ∈ 60 5 Bochner-Lebesgue and Bochner-Sobolev spaces

W 1,p(a,b;X) := u Lp(a,b;X) : there exists v Lp(a,b;X) such that for { ∈ ∈ b b ϕ 1 ϕ ϕ every Cc (a,b) one has u ′ = v . ∈ a − a } Z Z

By Lemma G.4, the function v is uniquely determined, if it exists. We write u′ := v 1,p and we call u′ the weak derivative of u. The spaces W (a,b;X) are Banach spaces for the norms

1 p p p u 1,p := u p + u′ p if 1 p < ∞, and k kW k kL k kL ≤ u 1,∞ := sup u L∞ , u′ L∞ . k kW {k k k k } The linear mapping

1,p p p T : W (a,b;X) L (a,b;X) L (a,b;X), u (u,u′), → × 7→ shows that W 1,p(a,b;X) is isomorphic to a closed subspace of Lp(a,b;X) Lp(a,b;X). From this and from Theorems 5.5 and 5.6 we immediately obtain the× following properties of Bochner-Sobolev spaces. Theorem 5.7. a) If 1 p < ∞ and if X is separable, then W 1,p(a,b;X) is sepa- rable. ≤ b) If 1 < p < ∞ and if X is reflexive, then the space W 1,p(a,b;X) is reflexive, too. c) If H is a Hilbert space,then the spaceH1(a,b;H) := W 1,2(a,b;H) is a Hilbert space for the inner product

b b 1 u,v H1(a,b;H) := u,v H + u′,v′ H , u, v H (a,b;H). h i a h i a h i ∈ Z Z In the rest of this section, we prove the analog of Theorem 3.3 and a little bit more. 1,p Lemma 5.8. Let u W (a,b;X) be such that u′ = 0. Then u is constant almost everywhere. ∈ ψ 1 b ψ ϕ 1 Proof. Choose Cc (a,b) such that a = 1. Then, for every Cc (a,b), ϕ ∈b ϕ ψ 1 ∈ b ϕ the function ( a ) is the derivativeR of a function in Cc (a,b) since a ( b ϕ ψ − − ( a ) )= 0. Hence,R by assumption and by definition of the weak derivative,R

R b b ϕ ϕ ψ ϕ 1 0 = u( ( ) ) for every Cc (a,b). a − a ∈ Z Z If we put c := b uψ X, then the preceding equality becomes a ∈ R b ϕ ϕ 1 (u c) = 0 for every Cc (a,b). a − ∈ Z By Lemma G.4, u = c almost everywhere. 5.3 Bochner-Sobolev spaces in one space dimension 61

p Lemma 5.9. Let (a,b) be a bounded interval, t0 [a,b],g L (a,b;X), and set ∈ ∈ t u(t) := g(s) ds for every t [a,b]. t ∈ Z 0 Then u W 1,p(a,b;X) and u = g. ∈ ′ Proof. Let ϕ C1(a,b). Then, by Fubini’s theorem, ∈ c b b t uϕ′ = g(s) dsϕ′(t) dt Za Za Zt0 t0 t b t = g(s) dsϕ′(t) dt + g(s) dsϕ′(t) dt Za Zt0 Zt0 Zt0 t0 s b b = ϕ′(t) dtg(s) ds + ϕ′(t) dtg(s) ds − a a t s Z Z Z 0 Z t0 b = ϕ(s)g(s) ds ϕ(s)g(s) ds − a − t Z Z 0 b = gϕ. − a Z Theorem 5.10. Let (a,b) be a bounded interval and u W 1,p(a,b;X). Then there exists a continuous function u˜ : [a,b] X, which coincides∈ with u almost everywhere and such that for every s, t [a,b] → ∈ t u˜(t) u˜(s)= u′(r) dr. − s Z Proof. Fix t (a,b) and set v(t) := t u (s) ds for every t [a,b]. Clearly, the func- 0 t0 ′ ∈ 1,p ∈ tion v is continuous. By Lemma 5.9,Rv W (a,b;X) and v′ = u′. By Lemma 5.8, u v = c almost everywhere for some constant∈ c X. This proves that u coincides almost− everywhere with the continuous functionu ˜∈= v + c, and that

t u˜(t) u˜(s)= v(t) v(s)= u′(r) dr. − − s Z Due to Theorem 5.10, we may identify every function u W 1,p(a,b;X) with its continuous representativeu ˜, and we simply say that every function∈ in W 1,p(a,b;X) is continuous. We state this in the following form.

Theorem 5.11 (Sobolev embedding theorem). Let (a,b) be a bounded interval. Then W 1,p(a,b;X) is contained in C([a,b];X) and there exists a constant C 0 such that ≥ 1,p u L∞ C u 1,p for every u W (a,b;X). k k ≤ k kW ∈ Proof. The fact that every function u W 1,p(a,b;X) is continuous on [a,b] fol- lows from Theorem 5.10. The boundednessof∈ the identity mapping W 1,p(a,b;X) → 62 5 Bochner-Lebesgue and Bochner-Sobolev spaces

C([a,b];X), u u may be seen as a consequence of the closed graph theorem. In 7→ 1,p fact, if un u in W (a,b;X) and un v in C([a,b];X), then, since both spaces → p → p are continuously embedded into L (a,b;X), un u and un v in L (a,b;X). By uniqueness of the limit, this is only possible if u =→v. Hence, the→ identity mapping is closed.

Theorem 5.12 (Product rule, integration by parts). Let (a,b) be a bounded inter- val, fix 1 p ∞, andlet u W 1,p(a,b;X) and v W 1,p(a,b). ≤ ≤ ∈ ∈ a) (Product rule). The product uv belongs to W 1,p(a,b;X) and

(uv)′ = u′v + uv′.

b) (Integration by parts).

b b u′v = u(b)v(b) u(a)v(a) uv′. a − − a Z Z For every 1 p ∞ and every k 2 we define inductively the k-th Bochner- Sobolev spaces≤ ≤ ≥

k,p 1,p k 1,p W (a,b;X) := u W (a,b;X) : u′ W − (a,b;X) , { ∈ ∈ } which are Banach spaces for the norms

k 1 ( j) p p ∞ u Wk,p := ∑ u Lp if 1 p < , and k k j=0 k k ≤  (k) u 1,∞ := sup u L∞, u′ L∞ ,..., u L∞ , k kW {k k k k k k } If H is a Hilbert space, then Hk(a,b;H) := W k,2(a,b;H) is a Hilbert space for the inner product k ( j) ( j) u,v Hk := ∑ u ,v L2 . h i j=0h i Finally, we define

1,p 1 k·kW1,p W0 (a,b;X) := Cc (a,b;X) ,

1 1,2 and we put H0 (a,b;H) := W0 (a,b;H). Theorem 5.13. Let (a,b) be a bounded interval. A function u W 1,p(a,b;X) be- 1,p ∈ longs to W0 (a,b;X) if and only if u(a)= u(b)= 0.

1,p Proof. The “only if” part is a consequence of the definition of W0 (a,b;X) and the Sobolev embedding theorem (Theorem 5.11). The less obvious “if” part is left without proof. 5.4 Exercises 63

Theorem 5.14 (Poincare’s´ inequality). Let (a,b) be a bounded interval and 1 p < ∞. Then there exists a constant λ > 0 such that ≤

b b λ p p 1,p u u′ for every u W0 (a,b;X). a k k ≤ a k k ∈ Z Z Proof. Let u C1(a,b;X). Then ∈ c b b t p p u(t) dt = u′(s) ds dt a k k a a Z Z Z b t p u′(s) ds dt ≤ a a k k Z Z b t p 1  p (t a) − u′(s) ds dt ≤ a − a k k Z Z b b p 1 p (b a) − u′(s) ds dt ≤ − a a k k Z Z b p p = (b a) u′(s) ds. − a k k Z 1 λ This calculation yields Poincar´e’s inequality for every u Cc (a,b;X) with = p 1 1,p∈ (b a)− . Since, by definition, Cc (a,b;X) is dense in W (a,b;X), the full claim may− be obtained by an approximation argument.

5.4 Exercises

5.1. Prove Theorem 5.12. Hint. Prove the statement first for u W 1,p(a,b;X) and v C1([a,b]). ∈ ∈ 5.2. Let X and Y be two Banach spaces such that Y is continuously embedded into X. Fix p [1,∞) and let ∈ W = W 1,p(0,1;X) Lp(0,1;Y) ∩ be equipped with the norm

p p p u = u p + u′ p , k kW k kL (0,1;Y) k kL (0,1;X) so that W is a Banach space. We recall from Theorem 5.10 that every function u W is continuous with values in X. ∈ a) Show that the space W0 := u W : u(0)= 0 is a closed subspace of W . { ∈ } b) Showthatthe trace space

Tr := x X : there exists u W such that u(0)= x { ∈ ∈ } 64 5 Bochner-Lebesgue and Bochner-Sobolev spaces

equipped with the norm

x Tr := inf u W : u W and u(0)= x k k {k k ∈ } is a Banach space. Hint. Find a natural bijection W /W0 Tr. → c) Show that Y Tr X and that the embeddings are continuous. ⊆ ⊆ d) Show that if X and Y are Hilbert spaces and if p = 2, then Tr is a Hilbert space. e) Show that W C([0,1];Tr) ⊆ with continuous embedding. p Remark. The trace space Tr is also denoted by (X,Y) 1 ,p (with p′ = p 1 ). p′ − It is a particular example in the scale of so-called real interpolation spaces p (X,Y )θ,p which can be defined similarly by considering weighted L spaces, [Lunardi (1995)]. In general, Tr is a strict subspace of X, so that the embedding from (e) is not a direct consequence of the Sobolev embedding theorem 5.11. 5.3. Let Ω Rd be open and bounded, and let p [1,∞). ⊆ ∈ a) Show that the mapping

J : C([0,1];C(Ω¯ )) C([0,1] Ω¯ ) → × given by (Ju)(t,x)= u(t)(x) ((t,x) [0,1] Ω¯ ) is bijective. ∈ × b) Show that for every u C([0,1];C(Ω¯ )) one has ∈

u Lp(0,1;Lp(Ω)) = Ju Lp((0,1) Ω). k k k k × p m c) Using the fact that Cc(U) is dense in L (U) whenever U R is open and p < ∞ (Theorem G.3), show that C([0,1];C(Ω¯ )) is dense in⊆Lp(0,1;Lp(Ω)). d) Conclude that J extends to an isometric isomorphism Lp(0,1;Lp(Ω)) Lp((0,1) Ω). → × Remark. While the above isomorphy seems to be natural, it is not immediately clear how to assign to a function (equivalence class) u Lp(0,1;Lp(Ω)) a function (equivalence class) Ju Lp((0,1) Ω). The direct definition∈ (Ju)(t,x)= u(t)(x) leads to the problem of∈ measurability× of Ju. One way to circumvent this problem of measurability is the way described in this exercise.

Let X and Y be two Banach spaces such that Y ֒ X and X ′ ֒ Y ′ (dense and .5.4 continuous embeddings), and let p [1,∞). Show that→ every function→ ∈ ∞ u W 1,p(0,1;X) L (0,1;Y ) ∈ ∩ (admits a representative which) is weakly continuous with values in Y, that is, for every y Y the function y ,u is continuous. ′ ∈ ′ h ′ iY ′,Y Lecture 6 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions

In this lecture we consider gradient systems in infinite-dimensional spaces. Under the assumption that the underlying energy is a convex function and under certain other assumptions we obtain existence and uniqueness of solutions. The proof of this key theorem (the proof of existence of a solution) is based on the so-called Ritz / Ritz-Galerkin / Faedo-Galerkin method. We see three advantages of this method. First, it is constructive, and it is therefore particularly interesting from the point of view of numerical analysis. Second, the method is intuitive and the four important steps of the proof are easy to remember (we hope that the reader agrees with this opinion; we do not claim that the whole proof with all technical details is easy to remember). Finally, we think that the method is efficient since it leads directly to a maximal regularity result.

Throughout, let V be a Banach space with norm V . Let U V be an open subset, and let E : U R be a continuously differentiablek·k function.⊆ In addition, let → H be a Hilbert space with inner product , H , and assume that V is densely and h· ·i continuously embedded into H. The gradient ∇H E : D(∇H E ) H is defined as in Lecture 3. →

6.1 Gradient systems in infinite dimensional spaces

A non-autonomous1 gradient system is a differential equation of the form

u˙ + ∇H E (u)= f , (6.1)

1 autonomous (greek: auto+nomos, nomos = law; http://www.thefreedictionary. com/): not controlled by others or by outside forces; independent. Or: Independent of the laws of another state or government; self-governing.

65 66 6 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions

E 2 R where as above and f Lloc(I;H), I being an interval (the subscript means that f is square integrable∈ on every compact⊆ subinterval of I). We call this gradient system non-autonomoussince the function f depends explicitly on the time variable. A solution of the gradient system (6.1) is a measurable function u : I V such that →

∞ u W 1,2(I;H) L (I;V ), ∈ loc ∩ loc u(t) D(∇H E ) for almost every t I, and ∈ ∈ the equality (6.1) holds almost everywhere on I.

This definition of solution may look arbitrary and is motivated mainly by the existence and uniqueness result below, and by the energy inequality therein. We point out that the solutions in the above sense are exactly the solutions which have maximal possible regularity in the sense that if u is a solution, then the two terms on the left-hand side of (6.1) have the same regularity (local square integrability) as the given right-hand side.

By the Sobolev embeddingtheorem (Theorem 5.11), every solution is continuous with values in X; this will allow us to give a sense to the initial value problem in Theorem 6.1. If V = H is finite-dimensional and if ∇H E is continuous (that is, if E is continuously differentiable), then every solution in the above sense is also a solution as defined in Lecture 1, and in particular as considered in Carath´eodory’s theorem (Theorem 2.6). Conversely, if u is a solution in the sense of Lecture 1, then u is a solution in the above sense. This follows from the fact that the composite function ∇H E (u) is, by continuity, locally square-integrable, that for every s, t I ∈ t t u(t) u(s)+ ∇H E (u(τ)) dτ = f (τ) dτ, − s s Z Z and from Lemma 5.9. Thus, in the finite-dimensional case, both notions of solutions coincide.

By definition of the gradient, u is a solutionof (6.1)if andonlyif u W 1,2(I;H) L∞(I;V ) and ∈ ∩

u˙,v H + E ′(u)v = f ,v H for every v V and almost every t I. (6.2) h i h i ∈ ∈ We call (6.2) the variational form of the gradient system (6.1).

6.2 Global existence and uniqueness of solutions for gradient systems with convex energy

Let E : V R be a function. We say that E is convex if → 6.2 Global existence and uniqueness of solutions for gradient systems with convex energy 67

E (λu + (1 λ)v) λE (u) + (1 λ)E (v) for every u, v V, λ [0,1]. − ≤ − ∈ ∈ Moreover, we say that E is coercive if for every c R the sublevel set ∈

Kc = u V : E (u) c is bounded in V. { ∈ ≤ } The following existence and uniqueness result for the gradient systems (6.1) with initial value is the main result of this lecture and a key result in the study of gradient systems. We say that it is a global existence result since the statement asserts exis- tence of solutions on the whole interval on which the right-hand side f is defined. Note that this interval is bounded in the statement, that is, T < ∞.

Theorem 6.1. Suppose that V is a reflexive, separable Banach space, and suppose that E : V R is a convex, coercive, continuously differentiable function and that E maps bounded→ sets into bounded sets. Then, for every f L2(0,T ;H) (T < ∞) ′ ∈ and every u0 V the gradient system with initial value ∈

u˙ + ∇H E (u)= f , (6.3) ( u(0)= u0,

admits a unique solution u W 1,2(0,T ;H) L∞(0,T;V ). For this solution, and for every t [0,T ], the energy inequality∈ ∩ ∈ t t 2 E E u˙ H + (u(t)) (u0)+ f ,u˙ H (6.4) 0 k k ≤ 0 h i Z Z holds true.

Uniqueness

Uniqueness of solutions is a consequence of differentiability and convexity of E , and the following lemma.

Lemma 6.2. Let E : V R be a differentiable, convex function on a Banach space V. Then, for every u, v →V, ∈

E ′(u) E ′(v),u v 0. h − − iV′,V ≥ Proof. By convexity, for every u, v V and every λ (0,1) ∈ ∈ E (u + λ(v u)) (1 λ)E (u)+ λE (v). − ≤ − Hence, E (u + λ(v u)) E(u) − − E (v) E (u). λ ≤ − Letting λ 0 in this inequality gives → 68 6 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions

E ′(u),v u E (v) E (u). h − iV′,V ≤ − Changing the roles of u and v in this inequality gives

E ′(v),u v E (u) E (v). h − iV′,V ≤ − Summing up the preceding two inequalities yields the claim.

Proof (Proof of Theorem 6.1 – Uniqueness). Let u1 and u2 be two solutions of (6.3). Then, by Lemma 6.2, for almost every t [0,T], ∈

1 d 2 u1(t) u2(t) = u1(t) u2(t),u˙1(t) u˙2(t) H 2 dt k − kH h − − i = u1(t) u2(t),∇H E (u1(t)) ∇H E (u2(t)) H −h − − i = u1(t) u2(t),E ′(u1(t)) E ′(u2(t)) −h − − iV,V ′ 0. ≤ As a consequence, for almost every t [0,T], ∈ 2 2 u1(t) u2(t) u1(0) u2(0) = 0, k − kH ≤k − kH and therefore u1 = u2.

Existence

Existence of a solution of the gradient system (6.3) is proved by constructing a sequence of solutions of approximating gradient systems on finite-dimensional spaces, by extracting a convergent subsequence, and by showing that the limit is a solution we are looking for. For the existence part of the proof of Theorem 6.1, we need the following two compactness results from functional analysis; see Theorems E.18 and D.57. We recall the necessary notions of weak convergence and weak∗ convergence.

Given a Banach space X, we say that a sequence (un) X converges weakly to ⊆ an element u X (and we write un ⇀ u) if for every u′ X ′ limn ∞ u′,un X ,X = ∈ ∈ → h i ′ u ,u . If X is a Hilbert space with inner product , X , then, by the Riesz- h ′ iX′,X h· ·i Fr´echet representation theorem (Theorem D.53), un ⇀ u if and only if for every v X limn ∞ un,v X = u,v X . ∈ → h i h i Let X ′ be the dual space of X. We say that a sequence (un′ ) X ′ converges weak ⊆ weakly to an element u X (and we write u ∗ u ) if for every u X ∗ ′ ∈ ′ n′ → ′ ∈ limn ∞ un′ ,u X ,X = u′,u X ,X . → h i ′ h i ′ Theorem 6.3. a) (Banach-Alaoglu). Let X be a separable Banach space. Then every bounded sequence in X ′ admits a weak∗ convergent subsequence. 6.2 Global existence and uniqueness of solutions for gradient systems with convex energy 69

b) Every bounded sequence in a Hilbert space admits a weakly convergent sub- sequence.

Proof (Proof of Theorem 6.1 – Existence). To prove existence of a solution, we use the so called Ritz / Ritz-Galerkin / Faedo-Galerkinapproximation.We consider a se- quence of appropriate gradient systems on appropriate finite dimensional subspaces of V, for which local existence is known. Then, by proving appropriate norm and energy bounds, we show that the approximating problems admit global solutions. The bounds for the solutions are also used to extract weak and weak∗ convergent subsequences. Finally, we show that the weak limit is a solution of the gradient sys- tem (6.3). It turns out that especially in the crucial step of proving norm and energy bounds, the gradient structure of our problem is particularly helpful. Part 1 (Formulation of approximating problems in finite-dimensional spaces): Let (wn) be an arbitrary sequence in V such that span wn : n 1 is dense in V ; such a sequence exists because V is separable. { ≥ } For every m N, we put ∈

Vm = span wn :1 n m , { ≤ ≤ } m and we choose u Vm such that 0 ∈ m lim u = u0 in V. m ∞ 0 →

This is possible since Vm is dense in V and since Vm Vm 1. m ⊆ + For every m N, we consider the variational problem of finding um 1,2 ∈ S ∈ Wloc ([0,Tm);Vm) such that

u˙m,v H + E (um)v = f ,v H h i ′ h i  for every v Vm and almost every t (0,Tm), (6.5)  ∈ ∈  m um(0)= u0 .  Problem (6.5) is equivalent to the problem of finding a solution um 1,2 ∈ Wloc ([0,Tm);Vm) of the non-autonomous gradient system ∇ E u˙m + Hm m(um)= Pm f (6.6) m ( um(0)= u0 ,

where Em is the restriction of E to Vm, Hm = Vm is equipped with the inner product ∇ E E induced by H, Hm m is the gradient of m in Vm with respect to the inner product , H , and Pm : H H is the orthogonal projection from H onto Vm with respect to h· ·i → the inner product , H . Since Vm is finite dimensional, for every u Vm the gradient ∇ E h· ·i ∈ Hm m(u) exists and belongs to Vm. By the corollary to Carath´eodory’s theorem (Corollary 2.8), problem (6.6) 1,2 admits a maximal solution um Wloc ([0,Tm);Vm) (remember that solutions in the sense of Carath´eodory’s theorem∈ are weakly differentiable). Maximal solution 70 6 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions means that either Tm = T, or Tm < T and the solution um can not be extended to any larger interval. For every m N, let um be a maximal solution of (6.6). ∈

Part 2 (Bounds for the solutions um of the approximating problems): We show that the maximal solutions um are global, that is, Tm = T, and that the sequence (um) is bounded in appropriate function spaces. This part of the proof essentially repeats arguments from the proof of Theorem 2.10.

We multiply the equation (6.6) byu ˙m with respect to the inner product , H (or: h· ·i we take v = u˙m in (6.5)), integrate the result over [0,t] (t (0,Tm)), and apply the Cauchy-Schwarz inequality in order to obtain ∈

t 2 E E m u˙m(s) H ds + (um(t)) (u0 )= (6.7) 0 k k − Z t = f (s),u˙m(s) H ds 0 h i Z t t 1 2 1 2 f (s) H ds + u˙m(s) H ds. ≤ 2 0 k k 2 0 k k Z Z m E E m Since limm ∞ u0 = u0 in V, and since is continuous, we have limm ∞ (u0 )= E → E m → (u0). In particular, the sequence ( (u0 )) is bounded. Hence, there exists a con- stant C 0 which is independent of m such that, for every t (0,Tm), ≥ ∈ t T 1 2 E 1 2 u˙m(s) H ds + (um(t)) C + f (s) H ds. (6.8) 2 0 k k ≤ 2 0 k k Z Z

The right-hand side of this inequality is independent of m and t (0,Tm), and the ∈ first term on the left-hand side is positive. This implies that the set um(t) : m N 1 T{ 2 ∈ , t (0,Tm) is contained in the sublevel set Kc, where c := C + 2 0 f (s) H ds. By coercivity∈ } of E , K is bounded in V, and therefore k k c R

sup sup um(t) V < ∞. m N t [0,Tm) k k ∈ ∈ Since the continuous, convex, coercive function E is bounded from below (see ex- ercise!), we in addition deduce from the inequality (6.8) that

2 ∞ sup u˙m L (0,Tm;H) < . m N k k ∈

Since Tm T is finite, this implies that for each m N the functionu ˙m is integrable ≤ ∈ on [0,Tm). Hence, um extends to a continuous function on the closed interval [0,Tm], and Carath´eodory’s theorem and the definition of maximal solution imply that this is only possible if Tm = T, that is, the um are global. From the preceding two inequalities and the continuous embedding V ֒ H we obtain that → 1,2 ∞ (um) is bounded in W (0,T ;H) L (0,T;V ). (6.9) ∩ 6.2 Global existence and uniqueness of solutions for gradient systems with convex energy 71

By assumption, the derivative E ′ : V V ′ maps bounded sets into bounded sets, so ∞ → that the boundedness of (um) in L (0,T ;V) implies that ∞ (E ′(um)) is bounded in L (0,T ;V ′).

Part 3 (Extracting a convergent subsequence): The space W 1,2(0,T;H) is ∞ 1 ∞ a Hilbert space and the spaces L (0,T;V ) = L (0,T ;V ′)′ and L (0,T ;V ′) = 1 ∼ ∼ L (0,T ;V)′ are dual spaces by reflexivity of V (and V ′) and by Theorem 5.6. More- 1 1 over, the spaces L (0,1;V ′) and L (0,1;V) are separable by Theorem 5.5. Hence, by Theorem 6.3, there exist u W 1,2(0,T ;H), v L∞(0,T;V ), χ L∞(0,T ;V ) and ∈ ∈ ∈ ′ a subsequence of (um) (which we denote for simplicity again by (um)) such that

1,2 um ⇀ u in W (0,T ;H), weak ∗ ∞ um v in L (0,T ;V), and → weak ∗ ∞ E ′(um) χ in L (0,T ;V ′). → We leave it as an exercise to show that u W 1,2(0,T ;H) L∞(0,T;V ) and u = v. Moreover, it is an exercise to show that every∈ continuous∩ linear operator between two Hilbert spaces maps weakly convergent sequences into weakly convergent se- quences. Since the operator W 1,2(0,T ;H) L2(0,T ;H), u u˙ and the point eval- uations W 1,2(0,T;H) H, u u(0) and→W 1,2(0,T ;H) 7→H, u u(T ) are lin- ear and continuous (for→ the continuity7→ of the point evalutio→ ns, use7→ the Sobolev embedding theorem 5.11 in order to see this), the weak convergence of (um) in W 1,2(0,T ;H) therefore implies

2 u˙m ⇀ u˙ in L (0,T;H),

um(0) ⇀ u(0) in H and

um(T ) ⇀ u(T) in H.

The above weak respectively weak∗ convergences in the respective function spaces mean that

T T 1 v,um V ,V v,u V ,V for every v L (0,T ;V ′), 0 h i ′ → 0 h i ′ ∈ Z Z T T 2 u˙m,v H u˙,v H for every v L (0,T ;H), and (6.10) 0 h i → 0 h i ∈ Z Z T T χ 1 E ′(um),v V ,V ,v V ,V for every v L (0,T ;V). 0 h i ′ → 0 h i ′ ∈ Z Z Part 4 (Showing that the limit u is a solution): First of all, we have just seen m m that um(0) ⇀ u(0) in H. On the other hand, um(0)= u0 by (6.5), and u0 u0 in V m → by the choice of the sequence (u0 ). Since V is continuously embedded into H, we obtain u(0)= u0, that is, u satisfies the initial condition in (6.3). It remains to show that u satisfies also the differential equation. 72 6 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions

2 Let w Vm and ϕ L (0,T). For every n m we multiply equation (6.6) by ∈ ∈ ≥ ϕ( )w with respect to the inner product , H , integrate over (0,T) and obtain that · h· ·i T T T ϕ ϕ ϕ u˙n(t), (t)w H dt + E ′(un(t)), (t)w V ,V dt = f (t), (t)w H dt. 0 h i 0 h i ′ 0 h i Z Z Z Letting n ∞ in this last equality and using (6.10), we obtain → T T T ϕ χ ϕ ϕ u˙(t), (t)w H dt + (t), (t)w V ,V dt = f (t), (t)w H dt. 0 h i 0 h i ′ 0 h i Z Z Z 2 Using the fact that ϕ( )w : w Vm, ϕ L (0,T) spans a dense subspace of { · ∈ m ∈ } [ L2(0,T ;V) (Theorem 5.5), we obtain for every v L2(0,T ;V) ∈ T T T χ u˙,v H dt + ,v V ,V dt = f ,v H dt. (6.11) 0 h i 0 h i ′ 0 h i Z Z Z

It is left to show that χ = E ′(u). We multiply equation (6.6) by um with respect to the inner product , H , integrate the result over (0,T), and obtain h· ·i T T T E ′(um)um dt = f ,um H dt u˙m,um H dt (6.12) 0 0 h i − 0 h i Z Z Z T T 1 d 2 = f ,um H dt um H dt 0 h i − 0 2 dt k k Z Z T 1 2 1 m 2 = f ,um H dt um(T ) H + u0 H . 0 h i − 2k k 2k k Z

The weak convergence um(T ) ⇀ u(T ) in H implies

2 2 u(T) liminf um(T ) . H m ∞ H k k ≤ → k k m This estimate, the convergence u0 u0 in H, the equality (6.12) and the weak 2 → convergence um ⇀ u in L (0,T ;H) imply that

T T E 1 2 1 2 limsup ′(um)um dt f ,u H dt u(T) H + u0 H (6.13) m ∞ 0 ≤ 0 h i − 2k k 2k k → Z Z T T 1 d 2 = f ,u H dt u H dt 0 h i − 0 2 dt k k Z Z T T = f ,u H dt u˙,u H dt 0 h i − 0 h i Z Z T χ = ,u V ,V dt. 0 h i ′ Z In the last equality we have also used (6.11). By Lemma 6.2 and integration over (0,T ), 6.2 Global existence and uniqueness of solutions for gradient systems with convex energy 73

T T E ′(um),um u V ,V E ′(u),um u V ,V , 0 h − i ′ ≥ 0 h − i ′ Z Z weak weak ∗ ∗ so that the weak convergences um u and E (um) χ imply ∗ → ′ → T T χ liminf E ′(um),um V ,V ,u V ,V . m ∞ 0 h i ′ ≥ 0 h i ′ → Z Z This inequality together with (6.13) implies

T T χ lim E ′(um),um V ,V = ,u V ,V . (6.14) m ∞ 0 h i ′ 0 h i ′ → Z Z ∞ Let v L (0,T;V ) be arbitrary, and set wλ = (1 λ)u + λv, where λ (0,1). By Lemma∈ 6.2 and integration over (0,T), − ∈

T E ′(um) E ′(wλ ),um wλ V ,V 0. 0 h − − i ′ ≥ Z Using the definition of wλ , this inequality can be rewritten as

T λ E ′(um),u v V ,V 0 h − i ′ ≥ Z T T T λ E ′(wλ ),u v V ,V + E ′(wλ ),um u V ,V E ′(um),um u V ,V . ≥ 0 h − i ′ 0 h − i ′ − 0 h − i ′ Z Z Z

Letting m ∞ in this inequality, we obtain on using again the weak∗ convergences weak → weak ∗ ∗ um u, E (um) χ, and (6.14) that → ′ → T T λ χ λ ,u v V ,V E ′(wλ ),u v V ,V . 0 h − i ′ ≥ 0 h − i ′ Z Z

We divide by λ > 0 and let λ 0, use the continuity of E ′ and Lebesgue’s domi- nated convergence theorem. Then→

T T χ ,u v V ,V E ′(u),u v V ,V . 0 h − i ′ ≥ 0 h − i ′ Z Z Since v L∞(0,T;V ) is arbitrary in this inequality, and since u L∞(0,T ;V ), we therefore∈ obtain ∈

T T χ ∞ ,v V ,V E ′(u),v V ,V for every v L (0,T ;V). 0 h i ′ ≥ 0 h i ′ ∈ Z Z This inequality implies E ′(u)= χ.

Replacing χ in (6.11) by E ′(u), it follows that u is a solution of the differential equation in (6.2). 74 6 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions

Energy inequality

The following lemma is a consequence of the Hahn-Banach theorem and will not be proved here (see Corollary E.37).

Lemma 6.4. Let X beaBanachspace,un,u X,andlet F : X R be a continuous ∈ → convex function. Then un ⇀ u implies F (u) liminfn ∞ F (un). ≤ → Proof (Proof of Theorem 6.1 – Energy inequality). We maintain the notation from the preceding proof of existence of a solution. In particular, (um) is the sequence of solutions of the approximating problems (6.6) which we obtained after extracting a subsequence, and u is the solution of the gradient system (6.3) which we obtained as a weak limit of the sequence (um). We recall from (6.7) that for every m one has the energy equality

t t 2 E E m u˙m(s) H ds + (um(t)) = (u0 )+ f (s),u˙m(s) H ds. (6.15) 0 k k 0 h i Z Z m E We also recall that limm ∞ u0 = u0 in V. The continuity of thus yields → m lim E (u )= E (u0). m ∞ 0 → 2 The weak convergenceu ˙m ⇀ u˙ in L (0,T ;H) implies that for every t [0,T ] ∈ t t lim f (s),u˙m(s) H ds = f (s),u˙(s) H ds. m ∞ 0 h i 0 h i → Z Z 2 F t 2 By Lemma 6.4 applied with X = L (0,T ;H) and (v)= 0 v H , the same weak convergence implies that for every t [0,T] k k ∈ R t t 2 2 u˙(s) H ds liminf u˙m(s) H ds. 0 k k ≤ m ∞ 0 k k Z → Z 1,2 The weak convergence um ⇀ u in W (0,T ;H) implies that for every t [0,T] one ∞ ∈ has um(t) ⇀ u(t) in H. On the other hand, since (um) is bounded in L (0,T;V ), we deduce that for almost every t [0,T ] one has um(t) ⇀ u(t) in V. By Lemma 6.4 applied with X = V and F = E∈, this implies that for almost every t [0,T ] ∈

E (u(t)) liminfE (um(t)). m ∞ ≤ → Taking the limes inferior (as m ∞) on both sides of (6.15) yields the energy in- equality (6.4). Theorem 6.1 is completely→ proved. 6.3 Exercises 75 6.3 Exercises

6.1. Show that a quadratic form E : V R on a Banach space V is convex if and only if it is nonnegative, that is, E 0.→ ≥ 6.2. Let Ω Rd be open,and let p (1,∞). Show that the energy E : W 1,p(Ω) R, 1 ⊂ p ∈ → E (u)= Ω ∇u is convex. p | | 6.3. Let VRbe a Banach space, and let C V be a bounded convex set. ⊆ a) Show that every continuous, convex function E : C R is bounded from be- low. → b) Show that every continuous, convex, coercive function E : V R is bounded from below. → Remark. In this exercise, continuity of E may be replaced by lower semicontinuity. The function E : C R is lower semicontinuous if limn ∞ un = u (with un, u C) → → ∈ implies E (u) liminfn ∞ E (un). There is a theorem≤ which→ asserts that every continuous, convex, coercive function E : V R on a reflexive Banach space attains its infimum (and in particular it is bounded→ from below; see Theorem E.38). While that result is nontrivial, this exer- cise here may be solved by using only the definition of convexity and continuity.

6.4 (De Giorgi – Energy dissipation rate). In this exercise, we come back to fi- nite dimensional gradient systems. Let E : Rd R be a continuously differentiable function, and let g : Rd Inner(Rd) be a Riemannian→ metric. Show that a contin- uously differentiable function→ u : I Rd (I R an interval) is a solution of the gradient system → ⊆ u˙ + ∇gE (u)= 0 if and only if, for every t I, the energy inequality ∈ d 1 2 1 2 E (u(t)) u˙(t) ∇gE (u(t)) dt ≤−2 k kg(u) − 2 k kg(u) holds. Remark. According to a remark in L. Ambrosio, Gradient flows in metric spaces and in the space of probability measures, and applications to Fokker-Planck equa- tions with respect to log-concave measures, Bolletino della Unione Matematica Ital- iana IX 1 (2008), 223-240, this observation is due to E. De Giorgi; see E. De Giorgi, A. Marino and M. Tosques, Problems of evolution in metric spaces and maximal de- creasing curves, Atti Accad. Naz. Lincei rend. Cl. Sci. Fis. Mat. Natur. 68 (1980), 180-187 and G. Perelman and A. Petrunin, Quasigeodesics and gradient curves in Alexandrov spaces, unpublished preprint (1994).

Lecture 7 Diffusion equations

As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality. 1 Albert Einstein

Diffusion equations are prominent examples of gradient systems, and they provide us with the first examples of abstract gradient systems where we can apply the Theorem 6.1 on global existence and uniqueness of solutions.

Although it is not strictly necessary for this course on gradient systems, we spend some time on the derivation of the various mathematical models related to the prob- lem of heat conduction, of diffusion of particles, and of denoising of images. While the assumptions which lead one to the linear heat equation – the heat and Fourier’s law – may be criticized, it is certainly useful to keep them in mind throughout the following study. The comparison of the results obtained from the mathematical analysis of the various models and of the physical experiments can help to justify or reject a particular model. In this respect, the proof of existence and uniqueness of solutions is a test for the validity of a model. We consider two comparatively simple diffusion equations for which we can prove global existence and uniqueness of solutions.

7.1 Heat conduction

Let Ω Rd be an open set respresenting a spatial domain (so usually d 3). We consider⊆ the problem of heat conduction inside Ω. For this, we consider a≤ function u = u(t,x) which representsthe heat density dependingon the time variable t [0,T ] and on the space variable x Ω. Given an open set O Ω, the integral ∈ ∈ ⊆ 1 Albert Einstein, Geometrie und Erfahrung (Geometry and experience), An Address to the Prus- sian Academy of Sciences in Berlin on January 27th, 1921. German original quotation: Insofern sich die S¨atze der Mathematik auf die Wirklichkeit beziehen, sind sie nicht sicher, und insofern sie sicher sind, beziehen sie sich nicht auf die Wirklichkeit.

77 78 7 Diffusion equations

u(t,x) dx O Z is then the total amount of heat, at time t, inside the volume O. This total amount of heat depends explicitly on the time variable; it changes infinitesimally only if heat is transported through the boundary ∂O or if there is a heat source or sink inside O. This is expressed by the heat conservation law

d u(t,x) dx = j(t,x)n(x) dσ(x) + f (t,x) dx . (7.1) dt O − ∂ O O Z Z Z infinitesimal heat transport heat production O |change{z of heat} | through{z boundary } or| loss{z inside} Here j = j(t,x) is the heat flux vector which indicates into which direction and, by its length, how much heat is transported, n = n(x) is the outer normal vector at x ∂O, and the Euclidean inner product jn is the amount of heat which is transported∈ into normal direction; heat, which is transported into tangential direction does not leave or enter the volume O and has therefore no influence on the infinitesimal change of total amount of heat. Note that the minus sign on the right-hand side is necessary, for if the heat flux points into O, then the total amount of heat should increase and the left-hand side and the right-hand side should be positive. However, for inwards pointing j, the inner product jn is negative. The term f = f (t,x) stands for a heat source or sink (depending on whether it is positive or negative), and it depends here on time and space. We have also assumed that O is of class C1, and we denote by σ the surface measure on ∂O.

When we interchange the time derivative with the integral on the left-hand side (by assuming that the function u is regularenough),and when we apply Gauß’ diver- gence theorem (Theorem F.6) to the first term on the right-hand side, then equation (7.1) becomes ∂ u(t,x) dx = divj(t,x) dx + f (t,x) dx, O ∂t − O O Z Z Z ∂ where divj = ∑d ji is the divergence of j. The preceding formula is true for every i=1 ∂ xi volume O Ω (O of class C1), which is only possible if ⊆ ∂ u(t,x)= divj(t,x)+ f (t,x) for every (t,x) (0,T ) Ω. (7.2) ∂t − ∈ × Experiments show that heat is transported away from zones with higher heat density into zones with lower heat density. Recall from Lemma 1.3 that the negative euclidean gradient ∇u (the gradientis to be taken in the x variable only) points into the direction of steepest− descent, that is, into the direction where u decreases most rapidly. It is therefore natural to assume that the heat flux vector j and the negative ∇u point into the same direction. In the simplest model, one − 7.1 Heat conduction 79 may assume thatthe length of theheat flux j and of the negative temperature gradient ∇u are proportional to each other. This leads to the linear constitutive relation − j(t,x)= c∇u(t,x) (Fourier’s law), (7.3) − where c > 0 is the heat conductivity characterizing the material; we assume here that c does not depend on time or space, that is, the material filling Ω is ideally homogeneous. Inserting Fourier’s law into (7.2) and noting that div∇ = ∆ is the Laplace operator, we obtain the heat equation ∂ u(t,x)= c∆u(t,x)+ f (t,x) for every (t,x) (0,T) Ω. (7.4) ∂t ∈ × If instead of Fourier’s law (7.3) we consider a nonlinear constitutive relation with the heat conductivity depending on u and ∇u, c = c(u,∇u), then we are lead to the nonlinear heat equation ∂ u = div(c(u,∇u)∇u)+ f in (0,T) Ω. (7.5) ∂t × The choice c = c( ∇u )= ∇u p 2 leads to the nonlinear heat equation | | | | − ∂ u = ∆pu + f in (0,T) Ω, ∂t ×

p 2 where ∆p is the p-Laplace operator: ∆pu = div( ∇u − ∇u). The choice c = c(u)= m 1 | | mu − leads to the porous medium equation ∂ u = ∆(um)+ f in (0,T ) Ω. ∂t ×

On the boundary ∂Ω, we consider three types of behaviour. Either, we assume that the boundary is ideally conducting and that the heat density outside Ω is kept at a certain level,

u = uΩ c on (0,T ) ∂Ω (Dirichlet boundary condition), × or we assume that the boundary is ideally isolating and there is no heat flux through the boundary,

jn = 0 on (0,T ) ∂Ω (Neumann boundary condition), × or we assume that the heat flux through the boundary depends on the difference between the heat density inside and outside Ω, for example linearly,

jn = b(u uΩ c) on (0,T ) ∂Ω (Robin boundary condition). − × 80 7 Diffusion equations

In the first and in the last equality, uΩ c is the heat density outside Ω and in the simple models one sets uΩ c = 0. Of course, one may imagine other types of boundary conditions, for example mixed boundary conditions (one part of the boundary ∂Ω might be conducting, while the other part is isolating).

In the case that the constitutive relation for the heat flux vector is Fourier’s law, and if uΩ c = 0, the Dirichlet, Neumann and Robin boundary conditions can be rewritten as

u = 0 on (0,T ) ∂Ω (Dirichlet boundary condition), × ∂ u = 0 on (0,T ) ∂Ω (Neumann boundary condition), ∂ n × and ∂ u + b u = 0 on (0,T) ∂Ω (Robin boundary condition), ∂ n c × ∂ ∇ respectively. Here, ∂ n u = un is the outer normal derivative of u.

7.2 The reaction-diffusion equation

Again, let Ω Rd be an open set representing a spatial domain. We consider now a function u =⊆u(t,x) which represents the concentration of a chemical component or the density of a biological population occupying the region Ω. In this case, for a given open set O Ω, the integral ⊆ u(t,x) dx O Z is the total amount of the chemical component or of the population, at time t, inside the volume O. The conservation law can be expressed similarly as in (7.1),

d u(t,x) dx = j(t,x)n(x) dσ(x)+ f (t,x,u(t,x)) dx. (7.6) dt O − ∂ O O Z Z Z where j = j(t,x) is now the drift vector of particles resp. individuals, and f = f (t,x,u) indicates the production of u due to chemical reaction, resp. birth or death of individuals (depending on whether f is positive or negative). Experiments show that particles resp. individuals wander from regions where they are dense to regions where they are rarer, and that the drift rate is proportional to the norm of the gradient of the density and points to the opposite direction of the gradient. This is expressed by

j(t,x)= c∇u(t,x) (Fick’s law). (7.7) − Here, the diffusion coefficient c might be constant, or depend on u and/or on ∇u. In this way we obtain the reaction-diffusion equation 7.3 The Perona-Malik model in image processing 81 ∂ u(t,x)= div(c∇u)+ f ( , ,u) in (0,T ) Ω. (7.8) ∂t · · × This equation is equipped with boundary conditions, too. The Dirichlet boundary conditions mean that there are no particles resp. individuals on the boundary (or that their number is fixed), while Neumann boundary conditions mean that particles resp. individuals can not leave the region Ω.

7.3 The Perona-Malik model in image processing

Let Ω Rd be an open set (d = 2 or d = 3). A black-and-white image is a function u : Ω ⊆R. At every point x Ω, the value u(x) stands for a grey-level. A reasonable image→ takes only values in∈ the interval [0,1]; for example, the value 0 stands for a black pixel, the value 1 for a white pixel, and values in (0,1) for a grey pixel with a grey-level between black and white (the role of 0 and 1 is just convention and may be inverted of course). Suppose that u0 describes the input image which is an image destroyed by a noise. The task is to remove the noise. One possible algorithm to denoise the image is to solve the problem

ut div(c( ∇u )∇u)= 0 in (0,T ) Ω, − | | × ∂  u = 0 on (0,T) ∂Ω, (7.9) ∂ n ×   u(0,x)= u0(x) for x Ω, ∈  and to take u(T,.) for some T > 0 as the output image. Perona and Malik [Perona and Malik (1990)] proposed to take the diffusion coefficient c depending on the of the gradient, ∇u , in such a way that if ∇u is small (in zones where the grey level does not vary| too| much), then the system| (|7.9) smoothes the image (it behaves like a diffusion equation), and if ∇u is large, then the diffusion (the regularization) is stopped and edges will be preserved| | (observe that ∇u is large where the image has edges). This is achieved by a function c = c( ∇u )|, where| c is decreasing, c(0)= 1, and lim c(s)= 0 (zero diffusion for large s =| ∇u| ). The func- s ∞ | | p 2 → s s2 tions c(s) = (1 + s) − (p < 2), c(s)= e− or c(s)= e− are possible candidates, and according to the modelit seems that the faster the function c decreases as s ∞, the better the edges in the image are preserved. → 82 7 Diffusion equations 7.4 Global existence and uniqueness of solutions

The linear heat equation

We first consider the initial-boundary value problem

ut uxx = f in (0,T ) (0,1), − ×  u(t,0)= u(t,1)= 0 for t (0,T ), (7.10) ∈   u(0,x)= u0(x) for x (0,1), ∈  where u0 : [0,1] Rand f : (0,T ) (0,1) R, f = f (t,x), are two given func- tions. This problem→ is a model for the× heat→ conduction in a rod (one dimensional spatial domain), with ideally conducting end points, heat conductivity c = 1, given initial heat distribution u0 and heat source f .

We recall from Lecture 3 the observation that solving the time-independent / stationary problem u H2(0,1), ∈  uxx(x)= f (x) for x (0,1), − ∈   u(0)= u(1)= 0, for a given f L2(0,1) is equivalent to solving the abstract equation ∈  D∆u = f , − (0,1) D 2 where the Dirichlet-Laplace operator ∆u = ∇ 2 E is the L -gradient of the (0,1) − L energy E : H1(0,1) R given by 0 → 1 1 E (u)= (u )2. 2 ′ Z0 Without addressing the problem what could be a solution of the problem (7.10) (certainly it is a function of the two variables t and x), it seems to be natural to replace the problem (7.10) by the abstract gradient system

u˙ D∆u = f , − (0,1) (7.11) ( u(0)= u0, in which the unknown u is a function of the time variable only and takes its values in L2(0,1).

1 2 2 Theorem 7.1. For every u0 H0 (0,1) and every f L (0,T ;L (0,1)) (T > 0) the gradient system (7.11) ∈admits a unique solution∈ u H1(0,T ;L2(0,1)) ∞ 1 ∈ ∩ L (0,T ;H0 (0,1)). 7.4 Global existence and uniqueness of solutions 83

E 1 Proof. The domain of the energy , that is, the space V = H0 (0,1) is separable and reflexive. The energy E is a continuous quadratic form, and therefore it is continuously differentiable by Proposition 3.2. The derivative E ′ is linear and con- tinuous, and therefore bounded. Moreover, since E is a nonnegative (E 0), E is convex by Exercise 6.1. By Poincar´e’s inequality (see Exercise 4.7 or Theorem≥ 1 5.14), the inner product v,w H1 = 0 v′w′ is equivalent to the usual inner product h i 0 1 1 v,w 1 = vw + v w , and in particularR there exists a constant C > 0 such that h iH 0 0 ′ ′ R R 2 2 1 v H1 C v H1 for every v H0 (0,1); k k ≤ k k 0 ∈

1+λ in fact, one may take C = λ , where λ > 0 is the constant from Poincar´e’s inequal- E 1 ity. As a consequence of this estimate, every sublevel of is bounded in H0 (0,1), that is, E is coercive. The claim now follows from Theorem 6.1.

A nonlinear reaction-diffusion equation

We consider the nonlinear initial-boundary value problem

3 ut uxx + u = f in (0,T) (0,1), − ×  ux(t,0)= ux(t,1)= 0 for t (0,T), (7.12) ∈   u(0,x)= u0(x) for x (0,1), ∈  where u0 : [0,1] R and f : (0,T) (0,1) R, f = f (t,x), are two given functions. This is→ a model for the evolution× of→ the concentration of a chemical component in an isolated channel (one dimensional spatial domain), undergoing diffusion with Fick’s law and diffusion coefficient equal to 1, reaction with production rate u3, and external supply f . Note that the “production” rate is really negative; compare− the problem (7.12) with the reaction-diffusion equation (7.8) and note our convention to write the reaction term on a different side of the equality sign.

Similarly as in Exercise 4.3 on the Neumann-Laplace operator and in Exercise 4.6, by using Theorem 4.3, one shows that the energy E : H1(0,1) R given by → 1 1 1 1 E (u)= (u )2 + u4 2 ′ 4 Z0 Z0 is continuously differentiable and that

∇ E 2 D( L2 )= u H (0,1) : ux(0)= ux(1)= 0 , { ∈ 3 } ∇ 2 E (u)= uxx + u . L − It therefore seems to be natural to replace the problem (7.10) by the abstract gradient system 84 7 Diffusion equations ∇ E u˙ + L2 (u)= f , (7.13) ( u(0)= u0.

1 2 2 Theorem 7.2. For every u0 H (0,1) and every f L (0,T;L (0,1)) the gradient system (7.13) admits a unique∈ solution u H1(0,T ∈;L2(0,1)) L∞(0,T ;H1(0,1)). ∈ ∩ Proof. By Theorem 6.1, it suffices to check that the energy E is continuously dif- 1 ferentiable, convex and coercive, and that E ′ is maps bounded sets of H (0,1) into 1 1 bounded sets of H (0,1)′ (we note that the energy space H (0,1) is separable and reflexive). We have already observed that E is continuously differentiable, and we leave it as an exercise to show that E is convex and E ′ bounded. By H¨older’s in- equality, 1 1 1 2 1 2 2 E (u) (u′) + u , ≥ 2 0 4 0 Z Z and this inequality implies that E is coercive. 

Remark 7.3. The question of existence and uniqueness of local and/or global solu- tions becomes more difficult if we want to study the nonlinear diffusion equation

3 ut uxx u = f in (0,T ) (0,1), − − ×  ux(t,0)= ux(t,1)= 0 for t (0,T), (7.14) ∈   u(0,x)= u0(x) for x (0,1), ∈  in which – with respect to the problem (7.12) – only the sign of the nonlinearity changed. In order to guess the difference, it is instructive to compare the solutions of the two ordinary differential equations

u˙ + u3 = 0 andu ˙ u3 = 0. − The second equation does not admit positive global solutions.

7.5 Exercises

7.1. Throughout, let p > 1. a) Show that the embedding W 1,p(0,1) ֒ C([0,1]) is compact. Conclude that .the embedding W 1,p(0,1) ֒ Lp(0,1) is→ compact, too Hint. Use the Arzel`a-Ascoli→ theorem. b) Show that the infimum

1 p λ 0 u′ 1,p 1 := inf 1| | : u W0 (0,1), u = 0 ( R u p ∈ 6 ) 0 | | R 7.5 Exercises 85

1,p λ 1 p is attained, that is, there exists u W0 (0,1), u = 0 such that 1 0 u = 1 p ∈ 6 | | u . Recall from Poincar´e’s inequality that λ1 > 0. 0 | ′| R s c)R Let f : R R be a continuous function, F(s) := f (r) dr. Assume that there → 0 exist constants c1, c2, c3 0 and λ < λ1 such that for every s R ≥ R ∈ λ p 1 p f (s) c1 + c2 s − and F(s) c3 s . | |≤ | | ≥− − p | |

E 1,p R E 1 1 p Show that the energy : W0 (0,1) given by (u)= 0 [ p u′ + F(u)] is continuously differentiable and coercive.→ | | R λ λ E 1 R E d) Show that for every (0, 1] the energy : H0 (0,1) , (u) = 1 1 2 2 ∈ → [ u λ u ] is convex, and that it is not convex if λ > λ1. 2 0 | ′| − | | Remark.R In (b), (c) and (d) one may replace the interval (0,1) by an ar- bitrary open bounded set Ω Rd. By the Rellich-Kondrachov theorem (see ⊆ Ω 1,p Ω ֒ ( ) Adams and Fournier (2003)]), for every bounded the embedding W0] Lp(Ω) is compact. →

7.2. Let C : R+ R be a continuously differentiable function such that C′(0)= 0. → p 1 Assume that there exist p > 1, c1 0, R 0 such that C′(s) c1 s − for every s R. Let Ω Rd be an open and bounded.≥ ≥ ≤ ≥ ⊆ a) Show that the energy

E : W 1,p(Ω) R, → u C( ∇u ) 7→ Ω | | Z is continuously differentiable, and that for every u, h W 1,p(Ω), ∈ ∇u E ′(u)h = C′( ∇u ) ∇h, Ω | | ∇u Z | | where the term under the integral is interpreted as 0 if ∇u = 0. Moreover, E : W 1,p(Ω) W 1,p(Ω) maps bounded sets into bounded sets. ′ → ′ b) Assume in addition that C is increasing and convex. Show that the energy E from (a) is convex. c) Show that the energy E is not coercive. 7.3. Let Ω Rd be open. We return to the model of heat conduction in Ω, starting from the conservation⊂ law (7.2). Find an interpretation of the constitutive relation

j(t,x)= c∇u(t,x)+ b(x)u(t,x), − in which b : Ω Rd is a given function. Which partial differential equation does one obtain from→ this constitutive relation and (7.2)?

Lecture 8 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions II

The assumptions of Theorem 6.1 on the existence and uniqueness of global solutions of gradient systems are natural and easy to verify in applications arising, for example, from diffusion models; we refer to the problems (7.10) and (7.12) which were considered and solved in the previous lecture. However, there are examples of diffusion models / gradient systems to which Theorem 6.1 does not apply: this is the case for the problem (7.12) if the reaction term u3 is dropped (the energy of the Laplace operator with Neumann boundary conditions is not coercive), for the problem (7.14), in which the reaction term had a different sign, and this is the case for the Perona-Malik model even if the diffusion coefficient c is chosen appropriately so that the associated energy is convex; compare with Exercises 7.2 (c) and 8.3 (b). It is therefore necessary to think about more general existence and uniqueness theorems.

In this lecture we prove a variant of Theorem 6.1. We would like to call it a generalisation of Theorem 6.1, although it is not a strict generalisation. First, in Theorem 8.1 below, the energy is assumed to be H-elliptic instead of coercive and convex. This new property is weaker than coercivitiy and convexity. For example, the energy of the Neumann-Laplace operator and the energy associated with the Perona-Malik model are L2(Ω)-elliptic – if in the latter the diffusion coefficient is chosen appropriately. Second, we consider gradients taken with respect to general metrics. The price which we pay for both generalisations is that we can not prove uniqueness of solutions in general, and the embedding of the energy space V into the Hilbert space H is assumed to be compact. Using again the Ritz / Ritz-Galerkin / Faedo-Galerkin method in the proof, the compact embedding seems to be a neces- sary assumption. At the same time, it is an open problem whether the hypotheses of Theorem 8.1 imply uniqueness of solutions or not.

87 88 8 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions II 8.1 Global existence and uniqueness of solutions for gradient systems with elliptic energy

Let V be a Banach space with norm V . Let U V be an open subset, and let E : U R be a continuously differentiablek·k function.⊆ In addition, let H be a Hilbert → space with inner product , H , and assume that V is densely and continuously em- bedded into H. Let finallyh·g·i: U Inner(H) be a metric on U.A non-autonomous gradient system is a differential→ equation of the form

u˙ + ∇gE (u)= f , (8.1) E 2 R with and g as above and f Lloc(I;H), I being an interval. A solution of this gradient system is a measurable∈ function ⊆u : I V such that → ∞ u W 1,2(I;H) L (I;V ), ∈ loc ∩ loc u(t) D(∇gE ) for almost every t I, and ∈ ∈ the equality (8.1) holds almost everywhere on I.

By the Sobolev embedding theorem (Theorem 5.11), every solution is continuous with values in H; this allows us to give a sense to the initial value problem in The- orem 8.1. Moreover, by a similar reasoning as in Lecture 6, if V = H is finite- dimensional, then every solution in the above sense is also a solution as defined in Lecture 1 and vice versa. By definition of the gradient, u is a solution of (8.1) if and only if u W 1,2(I;H) L∞ (I;V) and ∈ loc ∩ loc

u˙,v + E ′(u)v = f ,v for every v V, almost everywhere on I. (8.2) h ig(u) h ig(u) ∈ We call (8.2) the variational form of the gradient system (8.1).

We say that a function E : V R is H-elliptic if there exists ω R such that the → ω ∈ function Eω : V R, u E (u)+ u 2 is coercive and convex. If the space H is → 7→ 2 k kH clear from the context, then we may simply say that E is elliptic. Note that if Eω is coercive (resp. convex) for some ω R, then Eω is coercive (resp. convex) for ′ every ω ω. ∈ ′ ≥ Theorem 8.1. Suppose that V is a reflexive, separable Banach space which is com- pactly embedded into H. Suppose that E is an H-elliptic, continuously differentiable function such that E ′ : V V ′ maps bounded sets into bounded sets. Let T (0,∞). Suppose in addition that→ the metric g is continuous in the sense that ∈

1,2 un ⇀ u in W (0,T ;H), weak ∗ ∞ T T un u in L (0,T;V ),  v ,w v,w . (8.3) → 2 n g(un) g(u) vn ⇀ v in L (0,T ;H), and  ⇒ 0 h i → 0 h i  Z Z w L2(0,T ;H) ∈   8.1 Global existence and uniqueness of solutions for gradient systems with elliptic energy 89

Suppose finally that there exist two constants c1,c2 > 0 such that for every u V and every v H ∈ ∈ c1 v H v c2 v H . k k ≤k kg(u) ≤ k k 2 Then, for every f L (0,T;H) and every initial value u0 V, there exists a solution u W 1,2(0,T;H)∈ L∞(0,T ;V) of the problem ∈ ∈ ∩

u˙ + ∇gE (u)= f , (8.4) ( u(0)= u0.

If, in addition, the metric g is constant – that is, , g(u) = , H for every u V –, then the above problem admits a unique solution.h· ·i For thish· s·iolution, the energy∈ inequality t t 2 E E u˙ H + (u(t)) (u0)+ f ,u˙ H (8.5) 0 k k ≤ 0 h i Z Z holds true.

Remark 8.2. Theorem8.1 is an L2-maximal regularity result for the nonlinear prob- 2 lem (8.4) in the sense that for every f L (0,T;H) and every u0 V, the problem ∈ ∈ (8.4) admits a solution u such that the two membersu ˙ and ∇gE (u) of the left-hand side of (8.4) belong also to L2(0,T ;H).

Existence

In the proof of Theorem 8.1, we need the following three lemmas.

Lemma 8.3 (Gronwall). Let ϕ : [0,T ] R+ be a nonnegative continuous function. Assume that there exist two constants C,→ω 0 such that for every t [0,T ] ≥ ∈ t ϕ(t) C + ω ϕ(s) ds. ≤ 0 Z Then, for every t [0,T ], ω ∈ ϕ(t) Ce t . ≤ ε ψ ε ω t ϕ ψ Proof. Let > 0 and set (t) := + C + 0 (s) ds (t [0,T]). Then is a positive, nondecreasing, continuously differentiable function.∈ By assumption and R since ε > 0, for every t [0,T ], ∈

ψ′(t)= ω ϕ(t) ωψ(t). ≤ Dividing by ψ(t) > 0, and integrating the resulting inequality yields that for every t [0,T] ω ω ∈ ψ(t) ψ(0)e t = (C + ε)e t . ≤ Since ϕ(t) ψ(t) by assumption, and since ε > 0 is arbitrary, this implies the claim. ≤ 90 8 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions II

Lemma 8.4. a) (Chain rule). Let X and Y be two Banach spaces, p [1,∞), (a,b) R a bounded interval, and F : X Y a continuously differentiable∈ function.⊆ Then, for every u W 1,p(a,b;X)→the composite function F u be- longs to W 1,p(a,b;Y ) and d∈(F u)(t)= F (u(t))u˙(t) for almost every◦ t. dt ◦ ′ b) (Product rule). Let H be a Hilbert space and (a,b) R be an interval. 1 2 ⊆ 1,1 Then, for every u H (a,b;H) the function u H belongs to W (a,b) and d 2 ∈ k k u(t) = 2 u(t),u˙(t) H for almost every t (a,b). dt k kH h i ∈ Proof (Sketch). (a) The claim is true forevery u C1([a,b];X) by the classical chain rule. Since C1([a,b];X) is dense in W 1,p(a,b;∈X) (compare with Corollary G.13, where this claim is proved for X = R; the proof of Corollary G.13 remains valid in the case of vector-valued functions), the claim follows upon an approximation argument which uses also the Sobolev embedding theorem (Theorem 5.11). The proof of (b) is very similar: the product rule is true for continuously differ- entiable functions u H1(a,b;H), and the subspace of continuously differentiable functions is dense in∈H1(a,b;H) by the vector-valued variant of Corollary G.13. The claim thus follows upon an approximation argument.

Lemma 8.5 (Aubin - Lions). Let X andY be two Banach spaces such thatY ֒ X with compact embedding. Then, for every p (1,∞), the embedding → ∈ ∞ (W 1,p(0,T ;X) L (0,T;Y ) ֒ C([0,T ];X ∩ → is compact, too.

1,p ∞ Proof. Let (un) W (0,T;X) L (0,T ;Y ) be a bounded sequence. It suffices to ⊆ ∩ prove that (un) is relatively compact in C([0,T];X). For every n N and every s, t [0,T], s t, ∈ ∈ ≤ t un(t) un(s) X = u˙n(r) dr X (Theorem 5.10) k − k s Z t u˙n(r) X dr (triangle inequality) ≤ s k k Z 1 u˙n p (t s) p′ (H¨older’s inequality) ≤k kL (0,T;X) − 1 C (t s) p′ , ≤ − p ∞ where the constant C 0 does not depend on n, and p′ = p 1 < . This inequality ≥ − implies that the sequence (un) is uniformly equicontinuous with values in X. ∞ Since (un) is uniformly bounded in L (0,T ;Y ), there exists a set N of Lebesgue measure 0 such that for every t [0,T ] N the sequence (un(t)) is bounded in Y. Since Y is compactly embedded∈ into X,\ we obtain that for every t [0,T ] N the ∈ \ sequence (un(t)) is relatively compact in X. Since [0,T ] N is dense in [0,T ], the \ Arzel`a-Ascoli theorem (Theorem B.43) implies that the sequence (un) is relatively compact in C([0,T ];X). 8.1 Global existence and uniqueness of solutions for gradient systems with elliptic energy 91

Proof (Proof of Theorem 8.1 – Existence). We proceed similarly as in the proof of Theorem 6.1. In particular, we use the Ritz / Ritz-Galerkin / Faedo-Galerkin approximation. Part 1 (Formulation of the finite dimensional approximating problems): We choose a total sequence (wn) V, we define the finite dimensional spaces Vm, and m ⊆ m we choose u Vm such that u0 = lim u , just as in Part 1 of the proofof Theorem 0 ∈ m ∞ 0 6.1. →

For every m N, we consider the variational problem of finding um 1,2 ∈ ∈ Wloc ([0,T );Vm) such that

u˙m,v + E (um)v = f ,v h ig(um) ′ h ig(um)  for every v Vm, almost everywhere on (0,T ), (8.6)  ∈  m um(0)= u0 .  Problem (8.6) is equivalent to the problem of finding a solution um 1,2 ∈ Wloc ([0,T );Vm) of the non-autonomous gradient system in Vm

∇ E um u˙m + gm m(um)= Pm f , (8.7) m ( um(0)= u0 , E E ∇ E where m and gm are the restrictions of and g, respectively, to Vm, gm m is the u gradient of Em in Vm with respect to the metric gm, and P m : H H is the orthogonal m → projection from H onto Vm with respect to the inner product , g(um). Since Vm is ∇ E h· ·i finite dimensional, for every u Vm the gradient gm m(u) exists and belongs to Vm. In order to obtain existence∈ of a maximal solutions, we check that the function ∇ E u F : (0,T ) Vm Vm, (t,u) gm m(u) Pm f (t) satisfies the Carath´eodory con- ditions. Since× g→: V Inner7→(H) is a metric,− it maps norm convergent sequences → in V into strongly convergent sequences in Inner(H). Since Inner(Vm) is finite- dimensional, norm convergence and strong convergence coincide, and therefore the restriction gm : Vm Inner(Vm) is a Riemannian metric. By Lemma 2.4, the gra- ∇ E → dient gm m is continuous, and the first term in the definition of F satisfies the Carath´eodory conditions. We check that the second term in the definition of F satisfies the Carath´eodory conditions, too. For every u Vm, let Qm(u) L (Vm) be the operator given by ∈ ∈ Qm(u)v,w H = v,w gm(u) = v,w g(u) (v, w Vm). Then, for every u, v Vm and almosth everyi t h[0,Ti], h i ∈ ∈ ∈ u u u Qm(u)P f (t),v H = P f (t),v = f (t),P v = f (t),v . h m i h m ig(u) h m ig(u) h ig(u)

Since g is a metric, the right-hand side depends continuously on u Vm. Since Vm is finite-dimensional, weak convergence and norm convergence coincide,∈ and there- u fore for almost every t [0,T] the function Vm Vm, u Qm(u)P f (t) is continu- ∈ → 7→ m ous. Since gm is continuous, Qm is continuous, too (compare with Lemma 2.3), and 92 8 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions II since for every u Vm the operator Qm(u) is invertible, we obtain that the second term in the definition∈ of F is continuous with respect to the second variable. More- over, since , is uniformly equivalent to , H , for almost every t [0,T] and h· ·ig(u) h· ·i ∈ every u Vm, ∈ u 1 u 1 c2 Pm f (t) H Pm f (t) g(u) f (t) g(u) f (t) H . k k ≤ c1 k k ≤ c1 k k ≤ c1 k k

Since f L2(0,T ;H), this implies that F satisfies the Carath´eodory conditions. By the corollary∈ to Carath´eodory’s theorem (Corollary 2.8), problem (8.7) admits a 1,2 maximal solution um W ([0,Tm);Vm). Maximal means here that either Tm = T , ∈ loc or Tm < T and the solution um can not be extended to any larger interval. For every m N, let um be a maximal solution of (8.7). ∈

Part 2 (Bounds for the solutions um of the approximating problems): We show that the maximal solutions um are global, that is, Tm = T, and that the sequence (um) is bounded in appropriate function spaces. This part of the proof essentially repeats arguments from the proof of Theorem 2.10.

We multiply the equation (8.7) byu ˙m with respect to the inner product , h· ·ig(um) (or: we take v = u˙m in (8.6)), integrate the result over [0,t] (t (0,Tm)), and apply Lemma 8.4 (a) and the Cauchy-Schwarz inequality in order to∈ obtain

t 2 E E m u˙m(s) g(u (s)) ds + (um(t)) (u0 )= 0 k k m − Z t

= f (s),u˙m(s) g(um(s)) ds 0 h i Z t t 1 2 1 2 f (s) g(u (s)) ds + u˙m(s) g(u (s)) ds. ≤ 2 0 k k m 2 0 k k m Z Z m E E m Since limm ∞ u0 = u0 in V, and since is continuous, we have limm ∞ (u0 )= E → E m → (u0). In particular, the sequence ( (u0 )) is bounded. Hence, there exists a con- stant C1 0 which is independent of m and t [0,Tm) such that ≥ ∈ t T c1 2 E c2 2 u˙m(s) H ds + (um(t)) C1 + f (s) H ds. (8.8) 2 0 k k ≤ 2 0 k k Z Z In this estimate we have also used the assumption that the inner products , h· ·ig(um) are uniformly equivalent to the inner product , H . Let ω R be such that Eω is convex and coercive. Then the preceding inequalityh· ·i can be re∈written as

c t 1 2 2 E u˙m(s) H ds + um(t) H + ω (um(t)) 2 0 k k k k ≤ Z ω T + 2 2 c2 2 C1 + um(t) H + f (s) H ds. (8.9) ≤ 2 k k 2 0 k k Z 8.1 Global existence and uniqueness of solutions for gradient systems with elliptic energy 93

1 2 We estimate the term um as follows: 2 k kH t 1 2 1 2 1 d 2 um(t) H = um(0) H + um(s) H ds 2k k 2k k 0 2 dsk k Z t 1 2 = um(0) H + u˙m(s),um(s) H ds 2k k 0 h i Z t ω t c1 2 + 2 2 C2 + u˙m(s) H ds + um H ds, ≤ 4(ω + 2) 0 k k c 0 k k Z 1 Z 1 m 2 where C2 0 is an upperboundforthe sequence ( u ). Since Eω is continuous, ≥ 2 k 0 kH convex and coercive, Eω is bounded from below by Exercise 6.3 (b), that is, there exists a constant C3 0 such that Eω (u) C3 for every u V. Inserting this estimate and the preceding≥ estimate into (8.9),≥− we obtain ∈

t c1 2 2 u˙m(s) H ds + um(t) H 4 0 k k k k ≤ Z ω 2 t T ω ( + 2) 2 c2 2 C1 + ( + 2)C2 +C3 + um(s) H ds + f (s) H ds ≤ c 0 k k 2 0 k k 1 Z Z ω 2 t ( + 2) 2 = C4 + um(s) H ds, c 0 k k 1 Z where C4 0 does not depend on m and t [0,Tm). The first term on the left-hand side is positive.≥ Hence, by applying Gronwall’s∈ inequality to the function ϕ(t)= 2 um(t) , we obtain that for every m and every t [0,Tm), k kH ∈ (ω+2)2 (ω+2)2 2 c t c T um(t) C4 e 1 C4 e 1 . k kH ≤ ≤

The right-hand side of this inequality does not depend on m and t [0,Tm), and therefore ∈ 2 ∞ sup sup um(t) H < . m N t [0,Tm) k k ∈ ∈ Inserting this estimate into (8.9), we obtain the existence of a constant C5 0 which is independent of m and t such that ≥

t c1 2 2 E u˙m(s) H ds + um(t) H + ω (um(t)) C5. (8.10) 2 0 k k k k ≤ Z Since the first two terms on the left-hand side are positive, this inequality implies N E that the set um(t) : m , t (0,Tm) is contained in the sublevel set KC5 of ω . { E ∈ ∈ } By coercivity of ω , KC5 is bounded in V, and therefore

sup sup um(t) V < ∞. m N t [0,Tm) k k ∈ ∈ We use again the fact that Eω is bounded from below and deduce in addition from inequality (8.10) that 94 8 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions II

1,2 ∞ sup um W (0,Tm;H) < . m N k k ∈ Since Tm T is finite, this implies that for each m N the functionu ˙m is integrable ≤ ∈ on [0,Tm). Hence, um extends to a continuous function on the closed interval [0,Tm], and Carath´eodory’s theorem and the definition of maximal solution imply that this is only possible if Tm = T, that is, the solutions um are global. From the preceding two inequalities and the continuous embedding V ֒ H we obtain that → 1,2 ∞ (um) is bounded in W (0,T ;H) L (0,T;V ). (8.11) ∩ By assumption, the derivative E ′ maps bounded sets into bounded sets, so that the ∞ boundedness of (um) in L (0,T;V ) implies that

∞ (E ′(um)) is bounded in L (0,T ;V ′).

Part 3 (Extracting a convergent subsequence): The space W 1,2(0,T;H) is ∞ 1 ∞ a Hilbert space and the spaces L (0,T;V ) = L (0,T ;V ′)′ and L (0,T ;V ′) = 1 ∼ ∼ L (0,T ;V)′ are dual space by reflexivity of V (and V ′) and by Theorem 5.6. More- 1 1 over, the spaces L (0,T;V ′) and L (0,T ;V) are separable by Theorem 5.5. Hence, by Theorem 6.3, by the assumption that V is compactly embedded into H, and by Lemma 8.5, there exist u W 1,2(0,T ;H), v L∞(0,T;V ), w C([0,T ];H), ∞ ∈ ∈ ∈ χ L (0,T ;V ) and a subsequence of (um) (which we denote for simplicity again ∈ ′ by (um)) such that

1,2 um ⇀ u in W (0,T ;H), weak ∗ ∞ um v in L (0,T ;V), → um w in C([0,T ];H), and (8.12) → weak ∗ ∞ E ′(um) χ in L (0,T ;V ′). → We leave it as an exercise to show that u W 1,2(0,T ;H) L∞(0,T;V ) and u = v = w. Moreover, it is an exercise to show that∈ every continuous∩ linear operator between two Hilbert spaces maps weakly convergent sequences into weakly convergent se- quences. Since the operator W 1,2(0,T;H) L2(0,T ;H), u u˙ is continuous and linear, and by (8.12), → 7→

2 u˙m ⇀ u˙ in L (0,T;H),

um(0) u(0) in H and → um(T ) u(T) in H. →

The above weak respectively weak∗ convergences in the respective function spaces mean that 8.1 Global existence and uniqueness of solutions for gradient systems with elliptic energy 95

T T 1 v,um V ,V v,u V ,V for every v L (0,T ;V ′), 0 h i ′ → 0 h i ′ ∈ Z Z T T 2 u˙m,v H u˙,v H for every v L (0,T ;H), and (8.13) 0 h i → 0 h i ∈ Z Z T T χ 1 E ′(um),v V ,V ,v V ,V for every v L (0,T ;V). 0 h i ′ → 0 h i ′ ∈ Z Z Part 4 (Showing that the limit u is a solution): First of all, we have just seen m m that um(0) u(0) in H. On the other hand, um(0)= u0 by (8.6), and u0 u0 in V → m → by the choice of the sequence (u0 ). Since V is continuously embedded into H, we obtain u(0)= u0, that is, u satisfies the initial condition in (8.4). It remains to show that u satisfies also the differential equation. 2 Let w Vm and ϕ L (0,T). For every n m we multiply equation (8.7) by ϕ ∈ ∈ ≥ ( )w with respect to the inner product , g(un), integrate over (0,T) and obtain that· h· ·i

T T ϕ E ϕ u˙n(t), (t)w g(un(t)) dt + ′(un(t)), (t)w V ,V dt = 0 h i 0 h i ′ Z Z T ϕ = f (t), (t)w g(un(t)) dt. 0 h i Z Letting n ∞ in this last equality and using (8.13) and the continuity assumption on g, we obtain→

T T T ϕ χ ϕ ϕ u˙(t), (t)w g(u(t)) dt + (t), (t)w V ,V dt = f (t), (t)w g(u(t)) dt. 0 h i 0 h i ′ 0 h i Z Z Z 2 Using the fact that ϕ( )w : w Vm, ϕ L (0,T) spans a dense subspace of { · ∈ m ∈ } [ L2(0,T ;V) (Theorem 5.5), we obtain for every v L2(0,T ;V) ∈ T T T χ u˙,v g(u) + ,v V ,V = f ,v g(u). (8.14) 0 h i 0 h i ′ 0 h i Z Z Z

It is left to show that χ = E ′(u). We multiply equation (8.7) by um with respect to the inner product , , integrate the result over (0,T ), and obtain h· ·ig(um) T T T E ′(un),un V ,V = f ,un g(un) u˙n,un g(un). (8.15) 0 h i ′ 0 h i − 0 h i Z Z Z The continuity assumption (8.3) on g implies

T T

f ,un g(un) f ,u g(u). 0 h i −→ 0 h i Z Z

Moreover, the continuity assumption on g, the uniform convergence of (un) (see (8.12)), the Cauchy-Schwarz inequality and the uniform equivalence of , and h· ·ig(u) 96 8 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions II

, H imply h· ·i T T T

u˙n,un g(un) = u˙n,u g(un) + u˙n,un u g(un) 0 h i 0 h i 0 h − i Z Z Z T u˙,u g(u). → 0 h i Z Hence, if we let n ∞ in equation (8.15) and if we use the equality (8.14) with v = u, we obtain →

T T χ E ′(un),un V ,V dt ,u V ,V dt. 0 h i ′ −→ 0 h i ′ Z Z

Since Eω′ (u)= E ′(u)+ ωu, since un u in C([0,T];H) (see (8.12)), and since 2 → un,un = un , this last convergence implies h iV ′,V k kH T T χ ω Eω′ (un),un V ,V + u,u V ,V . (8.16) 0 h i ′ −→ 0 h i ′ Z Z ∞ Let v L (0,T;V ) and λ R. By Lemma 6.2 (applied to the convex function Eω ) and integration∈ over (0,T),∈

T T λ λ λ Eω′ (um),um u v V ,V Eω′ (u + v),um u v V ,V . 0 h − − i ′ ≥ 0 h − − i ′ Z Z

Letting m ∞ in this inequality, we obtain on using again the weak∗ convergences weak → weak ∗ ∗ um u, Eω (um) χ + ωu, and (8.16) that → ′ → T T χ ω λ λ λ + u, v V ,V Eω′ (u + v), v V ,V . − 0 h i ′ ≥− 0 h i ′ Z Z We divide by λ > 0 and by λ < 0, let λ 0+ and λ 0 , and use the continuity → → − of E ′ in order to obtain that T T χ ,v V ,V = E ′(u),v V ,V . 0 h i ′ 0 h i ′ Z Z Since v L∞(0,T;V ) is arbitrary, this implies ∈

E ′(u)= χ.

Hence, we may replace χ by E ′(u) in equality (8.14) and we deduce that u is a solution of the variational form ofu ˙ + ∇gE (u)= f on [0,T ]. In particular, u is a solution of the gradient system (8.4). 8.2 Exercises 97

Uniqueness and energy inequality

See Exercise 8.1 (c).

8.2 Exercises

8.1. a) Prove that a constant metric g : V Inner(H) – that is, , g(u) = , H for every u V – satisfies the continuity→ condition (8.3) fromh· Theorem·i 8.1.h· ·i ∈ b) Show that if a function g : V Inner(H) satisfies the continuity condition (8.3), then it is continuous in the→ following sense:

un ⇀ u in V,

vn ⇀ v in H, and vn,w g(un) v,w g(u); (8.17) w H  ⇒ h i → h i ∈ 

In particular, g is a metric, that is, un u in V implies v,w g(un) v,w g(u) for every v, w H. → h i → h i Open problem.∈ Prove or disprove that the condition (8.17) implies (8.3). c) Prove the addendum in Theorem 8.1, that is, prove uniqueness of solutions and the energy inequality (8.5).

8.2 (Quadratic forms). Let E : V R be a continuous quadratic form on a Banach space V which is densely and continuously→ embedded into a Hilbert space H. a) Show that if E is coercive, then E is convex. b) Show that E is H-elliptic if and only if there exist ω 0, η > 0 such that ≥ ω E (u)+ u 2 η u 2 for every u V. 2 k kH ≥ k kV ∈ c) Show that if E is H-elliptic, then there exists an equivalent norm on V which comes from an inner product (that is, V is a Hilbert space). d) Show that if E is H-elliptic, then

2 2 2 u ∇ E := u + ∇H E (u) k kD( H ) k kH k kH

defines a complete norm on the domain D(∇H E ). 2 e) Show that if E is H-elliptic, then for every u0 V and every f L (0,T ;H) the gradient system ∈ ∈

u˙ + ∇H E (u)= f , u(0)= u0,

1 ∞ 2 admits a unique solution u H (0,T ;H) L (0,T;V ) L (0,T;D(∇H E )). -Hint. We do not assume that∈ the embedding∩ V ֒ H is∩ compact so that The → 98 8 Gradient systems in infinite dimensional spaces: existence and uniqueness of solutions II

orem 8.1 does not apply. Use Theorem 6.1 and the observation that u is a ωt solution if and only if v(t)= e− u(t) is a solution of

ωt v˙+ ∇H E (v)+ ωv = e− f , v(0)= u0.

d 8.3 (Perona-Malik model). Let Ω R be open and bounded. Let c : R+ (0,∞) ⊆ → be a continuously differentiable function such that there exist p > 1, c1 0 and p 1 2 ≥ s0 0 such that c(s)s c1 s − for every s s0. Given f L (Ω), we call a function u ≥W 1,p(Ω) L2(Ω)≤a weak solution of≥ the problem ∈ ∈ ∩ div(c( ∇u )∇u)= f in Ω, − | | (8.18) ∂ u ∂Ω ( ∂ n = 0 on , if for every ϕ W 1,p(Ω) L2(Ω) one has ∈ ∩ c( ∇u )∇u∇ϕ = f ϕ. Ω | | Ω Z Z

a) Assume in addition that Ω is of class C1 and f C(Ω¯ ). Show that u C2(Ω¯ ) is a weak solution of the problem (8.18) if and only∈ if u is a classical∈ solution, that is, all derivatives are understood to be classical derivatives and u satisfies the partial differential equation and the boundary condition in the usual sense. p 2 b) Let c(s) = (1 + s) − with p (1,2), so that c satisfies the properties from ∈ s the Perona-Malik model. Let C(s) := 0 c(r)r dr. Consider the Banach space 1,p Ω 2 Ω V =W ( ) L ( ) with norm u V = u W1,p + u L2 . Showthat the func- ∩ k k R k k k k 2 tion E : V R, E (u)= Ω C( ∇u ) is continuously differentiable, L (Ω)- → | | elliptic, and that E ′ maps bounded sets into bounded sets. Show that u is a R ∇ E weak solution of (8.18) if and only if L2 (u)= f . Remark. Recall from Exercise 7.2 that E is convex if C is convex and non- decreasing. Note that C is convex if and only if sc(s) is nondecreasing. In particular, if C is convex, then the coefficient c can not decrease arbitrarily rapidly. c) In addition to the assumptions of (b), assume that Ω R2 is of class C1. Con- 1,p 2 ⊆ 2 clude that for every u0 W (Ω) and every f L (0,T;L (Ω)) there exists a unique solution u W∈1,2(0,T ;L2(Ω)) L∞(0∈,T ;W 1,p(Ω)) of the problem ∈ ∩ ∇ E u˙ + L2 (u)= f , u(0)= u0. Write the variational form of this gradient system. Remark. If p (1,2) and Ω R2 is bounded and of class C1, then, by the .Rellich-Kondrachev∈ theorem, W⊆1,p(Ω) ֒ L2(Ω) with compact embedding → Lecture 9 Regularity of solutions

After having proved existence of solutions of abstract gradient systems, several questions about their qualitative behaviour arise naturally. Some of them are about regularity of the solutions. Is the regularity of the solutions which we obtained in Theorems 6.1 and 8.1 the optimal one or can we expect higher regularity if the data in the gradient system (6.3) and (8.4) are more regular? For example, when can we expect that solutions of gradient systems are continuous or continuously differentiable with values in the energy space V? When does the energy inequality (8.5) hold, and when is it actually an energy equality? Finding answers to these questions is a crucial step before studying, for example, the long time behaviour of solutions of gradient systems.

It turns out that for a certain class of gradient systems, including linear, and so-called semilinear and quasilinear problems, higher regularity can be obtained from mere existence and maximal regularity results such as Theorems 6.1 and 8.1. The idea of the proof, which we present in this lecture, is the so-called parameter trick and a clever application of the implicit function theorem; it is due to Angenent (see [Angenent (1990b)] and also [Angenent (1990a)]).

The regularity result in this lecture is true for abstract differential equations and there is no necessity to confine the study to gradient systems. We come back to gradient systems in the following lectures.

9.1 Interpolation

In this section, we recall shortly some results which were the subject of Exercise 5.2. Let D and X be two Banach spaces such that D is densely and continuously embedded into X. For every p [1,∞) we consider the space ∈ 1,p p MRp(X,D) := W (0,1;X) L (0,1;D) ∩

99 100 9 Regularity of solutions equipped with the norm

p p p u := u + u p , k kMRp(X,D) k kW 1,p(0,T ;X) k kL (0,T ;D) which turns it into a Banach space. We recall from Theorem 5.10 that every func- tion u MRp(X,D) is continuous with values in X; more precisely, every element ∈ u MRp(X,D) admits a representative which is continuous with values in X. It ∈ makes therefore sense to evaluate functions u MRp(X,D) in every t [0,1] and in particular to define the trace space ∈ ∈

Trp(X,D) := x X : there exists u MRp(X,D) such that u(0)= x . { ∈ ∈ } This space is a Banach space for the norm

x Tr := inf u MR : u MRp(X,D) and u(0)= x . k k p {k k p ∈ }

The fact that Trp(X,D) is a Banach space becomes clear from considering the com- mutative diagram

δ0 MRp(X,D) X −−−−→ q i

 0 x MRp(X,D)/MRp(X,D) Trp(X,D) y −−−−→b  in which δ0 is the point evaluation at t = 0 (a continuous linear operator, by The- 0 orem 5.10), MR (X,D) := u MRp(X,D) : u(0) = 0 is its kernel (a closed p { ∈ } subspace, by continuity of δ0), q is the canonical quotient map, b is the canoni- cal bijection onto the range of δ0, and i is the canonical injection. The quotient 0 space MRp(X,D)/MRp(X,D) is clearly a Banach space and Trp is the natural norm which turns b into an isometric isomorphism. In the literature,k·k the trace space p Trp(X,D) is also denoted by (X,D) 1 ,p (with p′ = p 1 ); it is a particular example p′ − in the scale of so-called real interpolation spaces (X,D)θ,p which depend on two parameters (0 < θ < 1, 1 p ∞ or (θ, p) = (1,∞)); see [Lunardi (1995), Chapter 1], [Lunardi (2009)]. One≤ has≤

D Trp(X,D) X, ⊆ ⊆ and the embeddings are dense and continuous. It is also true – but a little bit less obvious to prove – that

MRp(X,D) C([0,1];Trp(X,D)) ⊆ with dense and continuous embedding. Using translations and dilations, it is clear from this embedding that for every a, b R, a < b, one has ∈ 1,p p (W (a,b;X) L (a,b;D) ֒ C([a,b];Trp(X,D)). (9.1 ∩ → 9.2 Time regularity – Angenent’s trick 101 9.2 Time regularity – Angenent’s trick

Let D and X be two Banach spaces such that D is densely and continuously em- bedded into X. Let F : D X be a continuously differentiable function and let f : [0,T ] X be an integrable→ function. In this section we study the regularity of solutions→ of the abstract differential equation

u˙ + F(u)= f . (9.2)

Theorem 9.1 (Time regularity). Let p [1,∞) and 0 < T < T ′. Assume that, for some k 1, the composition operator∈ F : W 1,p(0,T ;X) Lp(0,T ;D) Lp(0,T ;X),≥ u F(u) = F u is k times continuously differentiable.∩ Let f → W k,p(0,T ;X),7→ andlet u W 1◦,p(0,T ;X) Lp(0,T ;D) be a solution of (9.2) in the∈ ′ ∈ ′ ∩ ′ sense that the equality holds almost everywhere on (0,T ′). Assume that the linear problem v˙+ F′(u)v = g (9.3) ( v(0)= 0, admits for every g Lp(0,T ;X) a unique solution v W 1,p(0,T;X) Lp(0,T ;D). Then ∈ ∈ ∩

k+1,p k,p k u W ((0,T ];X) W ((0,T ];D) C ((0,T ];Trp(X,D)) and ∈ loc ∩ loc ∩ t t ju( j)(t) W 1,p(0,T ;X) Lp(0,T ;D) for every j = 0,...,k. 7→ ∈ ∩ If F and f are ofclassC∞, then u C∞((0,T ];D). ∈ Observe that we assume no higher regularity of the function F, but we rather assume higher regularity of the composition operator F . Of course, one then pre- sumes higher regularity of the function F, but note carefully that F Ck alone does in general not imply F Ck. The implication “F Ck F Ck” is∈ true in special situations, for example∈ if one assumes additional∈ growth⇒ conditions∈ on the deriva- tives of F; see also Theorem 4.3. Here are two natural cases in which the regularity assumption on F is satisfied (see Exercises). a) The function F : D X is continuous and linear. → b) The function F : D X extends to a k times continuously differentiable func- → tion from Trp(X,D) into X. c) The function F : D X is the sum of two functions as in (a) or (b). → Maximal regularity plays an important role in Theorem 9.1, in fact in two ways. On the one hand, there is the maximal regularity assumption on the linear, nonau- tonomous problem (9.3) (the linear operator F′(u)= F′ u depends in fact on time): for every given right-hand side g Lp(0,T ;X) of (9.3)◦ there exists a unique solu- tion v such that the two terms on∈ the left-hand side have the same regularity as g. On the other hand, the existence of the solution u of the problem (9.2), which is as- sumed in Theorem 9.1, is in concrete applications obtained by some existence and 102 9 Regularity of solutions maximal regularity result for which only f Lp(0,T ;X) was assumed: two such results are our Theorems 6.1 and 8.1 which∈ are existence and maximal regularity results for gradient systems. The maximal regularity assumptions in Theorem 9.1 are the optimal assumptions from the point of view of the proof in which we apply the following classical theorem from calculus (see Theorem C.6 for the proof).

Theorem 9.2 (Implicit function theorem). Let X1, X2 and Y be three Banach spaces, and let U X1 X2 be an open set. Let G : U Y be continuously dif- ⊆ × ∂ G → ferentiable. Let x¯ = (x¯1,x¯2) U be such that ∂ (x¯) : X2 Y is an isomorphism. ∈ x2 → Then there exist neighbourhoodsU1 X1 of x¯1 andU2 X2 of x¯2,U1 U2 U, and ⊆ ⊆ × ⊆ a continuously differentiable function g : U1 U2 such that →

(x1,x2) U1 U2 : G(x1,x2)= G(x¯1,x¯2) = (x1,g(x1)) : x1 U1 . { ∈ × } { ∈ } If, in addition, G is k times continuously differentiable, then the implicit function g is k times continuously differentiable, too.

Lemma 9.3. Let A : W 1,p(0,T ;X) Lp(0,T;D) Lp(0,T;X) be a continuous, lin- ear operator. Assume that the problem∩ →

v˙+ Av = g (9.4) ( v(0)= 0, admits for every g Lp(0,T ;X) a unique solution v W 1,p(0,T;X) Lp(0,T ;D). ∈ p ∈ ∩ Then, for every g L (0,T ;X) and every v0 Trp(X,D) the problem ∈ ∈ v˙+ Av = g (9.5) ( v(0)= v0, admits a unique solution v W 1,p(0,T ;X) Lp(0,T ;D). ∈ ∩ Proof. Uniqueness of solutions of (9.5) follows from linearity and uniqueness of p solutions of (9.4). In order to show existence, let g L (0,T ;X) and v0 Trp(X,D). By definition of the trace space (we use in fact the∈ definition and a dilation∈ of the interval (0,1) onto (0,T )), there exists w W 1,p(0,T ;X) Lp(0,T;D) such that ∈ p ∩ w(0)= v0. For this function w onehasw ˙ + Aw L (0,T ;X). By assumption, there exists a unique z W 1,p(0,T ;X) Lp(0,T ;D) which∈ is solution of ∈ ∩ z˙+ Az = g w˙ Aw, − − ( z(0)= 0.

Setting v = w + z, we obtain a solution of (9.5).

Proof (Proof of Theorem 9.1). It is useful to set

1,p p MRp(0,T ;X,D) := W (0,T;X) L (0,T;D). ∩ 9.2 Time regularity – Angenent’s trick 103

T T Let ε (0, ′− ). For every λ ( ε,ε) and every t (0,T) we define ∈ T ∈ − ∈ uλ (t) := u(t + λt).

Then, for every λ ( ε,ε) the function uλ belongs to MRp(0,T;X,D) and it solves the nonlinear problem∈ −

u˙λ + (1 + λ)F(uλ ) = (1 + λ) f ((1 + λ) ), · (9.6) ( uλ (0)= u(0). Starting from this observation, we consider the nonlinear operator

p G :( ε,ε) MRp(0,T;X,D) L (0,T ;X) Trp(X,D), − × → × (λ,v) (v˙+ (1 + λ)F(v) (1 + λ) f ((1 + λ) ),v(0) u(0)). 7→ − · − It follows from the assumptions – F is k times continuously differentiable and f W k,p(0,T ;X) –, that the operator G is k times continuously differentiable. ∈ ′ Moreover, by definition of the operator G, and since the functions uλ are solutions of (9.6), one has G(λ,uλ ) = (0,0) for every λ ( ε,ε). ∈ − We show that G satisfies the assumptions of the implicit function theorem at ∂ G (0,u) = (0,u0). For this, we have to consider the partial derivative ∂ v at (0,u) which is the linear operator given by ∂ G p (0,u) : MRp(0,T;X,D) L (0,T ;X) Trp(X,D), ∂v → × v (v˙+ F′(u)v,v(0)). 7→ Observe at this point that, by our assumption on the linear problem (9.3) and by Lemma 9.3, the problem v˙+ F′(u)v = g

( v(0)= v0, p admits for every g L (0,T;X) and every v0 Trp(X,D) a unique solution ∈ ∈ ∂ G v MRp(0,T;X,D). In other words, the operator ∂ v (0,u) is bijective. By the ∈ ∂ G bounded inverse theorem, the linear operator ∂ v (0,u) is continuously invertible. Hence, by the implicit function theorem, there exists ε (0,ε), a neighbourhood ′ ∈ U MRp(0,T;X,D) of u, and a k times continuously differentiable implicit func- tion⊆g : ( ε ,ε ) U such that − ′ ′ →

G(λ,g(λ)) = G(0,u0) = (0,0).

Moreover, all solutions in ( ε ,ε ) U of the equation G(λ,v) = (0,0) are of the − ′ ′ × form (λ,g(λ)). Since, for every λ ( ε ,ε ), the elements (λ,uλ ) are solutions ∈ − ′ ′ of this equation, we obtain that uλ = g(λ). In particular, we have obtained that the 104 9 Regularity of solutions function

g : ( ε′,ε′) MRp(0,T ;X,D), − → λ uλ = u((1 + λ) ), 7→ · is k times continuously differentiable. This is the desired information. Since the point evaluation MRp(0,T;X,D) X, v v(t) is linear, we obtain that the function λ u(t + λt) is k times continuously→ differentiable7→ with values in X. In particular, → d j u is k times continuously differentiable with values in X and u(t + λt) λ = dλ j | =0 t ju( j)(t) for every t (0,T ) and every j 0, ..., k . Coming back to the function g, we see that the consecutive∈ derivatives∈{ of g at λ =}0 are given by

t tu˙(t) MRp(0,T ;X,D), 7→ ∈ . . k (k) t t u (t) MRp(0,T;X,D). 7→ ∈ This is the stated regularity of the solution u.

9.3 Exercises

In order to formulate the exercises, consider first the following definition and the extension of the implicit function theorem (see [Zeidler (1990), Corollary 4.23]). Let X and Y be two Banach spaces, and let U X be an open set. We say that a function F : U Y is analytic if it is infinitely many⊆ times differentiable, and if for every x U there→ exists r > 0 such that ∈ ∞ 1 ∑ F(k)(x) rk < ∞ and k=0 k! k k ∞ 1 F(x + h)= ∑ F(k)(x)hk for every h X with h r and x + h U. k=0 k! ∈ k k≤ ∈

In this definition, F(k)(x) is the k-th derivative of F at the point x. It is identified with a k-linear operator X X Y , and F(k)(x)hk stands for F(k)(x)(h,...,h) (the k-th derivative applied×···× to the vector→ with k identical entries h). For every x U the supremum over all r > 0 such that the above relations are true is called the∈radius of convergence of F at x.

Theorem 9.4 (Addendum to the implicit function theorem, ). If, in the implicit function theorem 9.2 the function G is analytic, then the implicit function g is ana- lytic, too. 9.3 Exercises 105

9.1. a) By analyzingthe proofof Theorem9.1,by using the addendumtothe Im- plicit function theorem, and by using the embedding (9.1), prove the following result.

Theorem 9.5 (Addendum to Theorem 9.1). If, in Theorem 9.1, the function F is analytic and f = 0, then the solution u : (0,T) Trp(X,D) is analytic (u is even analytic with values in D). Morever, if r(t) is→ the radius of convergence ofuatt (0,T), then r(t) ct for some constant c > 0 independent of t. ∈ ≥ b) Every continuous linear operator is analytic. Starting from this observation, prove the following: Assume that A : D X is a continuous, linear operator, and that the linear problem →

u˙ + Au = f , (9.7) ( u(0)= 0

has Lp-maximal regularity in the sense that for every f Lp(0,T;X) it ad- mits a unique solution u W 1,p(0,T ;X) Lp(0,T ;D). Show∈ that then for ev- ∈ ∩ ery u0 Trp the initial value problem ∈ u˙ + Au = 0, (9.8) ( u(0)= u0

admits a unique solution u W 1,p(0,T;X) Lp(0,T ;D). This solution is ana- ∈ ∩ lytic on (0,T) with values in Trp and there exists c > 0 such that the radius of convergence r(t) of u at t satisfies r(t) ct. ≥ Remark for the readerwho is familiar with the theory of linear C0-semigroups. It has beenshown in [Dore (1993)] that if the problem(9.7)has Lp-maximal regularity, then A with domain D generates an analytic C0-semigroup. Theorem 9.5 may be seen− as a nonlinear variant of this result.

9.2 (Nemytski operators). Let F : R R be continuously differentiable. → a) Show that for every compact K R one has ⊆ F(s + h) F(s) F (s)h lim − − ′ = 0 uniformly in s K. h 0 h ∈ → b) Let Ω Rd be an open set. Show that the composition operator F : L∞(Ω) L∞(Ω)⊆, u F(u)= F u is continuously differentiable and that for every→u, h L∞(Ω)7→ ◦ ∈

F ′(u)h = F′(u( ))h( ) almost everywhere on Ω. · · Remark. The operator, which maps every function m L∞(Ω) to the multi- plication operator M L (L∞(Ω)) given by Mu(x) :=∈m(x)u(x) (u L∞(Ω), x Ω), is isometric.∈ We may therefore identify its range with L∞(Ω∈). In this ∈ 106 9 Regularity of solutions

∞ sense, we may identify the derivative F ′(u) L (L (Ω)) with the function F (u) L∞(Ω). With this identification, F is∈ again a Nemytski operator. ′ ∈ ′ c) Show that if F is k times continuously differentiable, then F is k times con- tinuously differentiable. d) Show that if F is analytic, then F is analytic, too.

9.3. Prove statements (a)-(c) on page 101.

9.4. It may be instructive to consider the problem of regularity of solutions of the ordinary differential equation u˙ + F(u)= 0, where F : Rd Rd is a continuousfunctions. Show by induction that if the function F is k times continuously→ differentiable (k 0), then every solution of this differen- tial equation is k +1 times continuously differentiable≥ (solution is meant here in the sense of Lectures 1 and 2. Compare your proof to the proof of Theorem 9.1. What about the case of analytic F? Lecture 10 Existence of local solutions

In many textbooks on differential calculus, one can find the implicit function the- orem and the local inverse theorem nearby. Depending on the particular textbook, one theorem is deduced from the other, and perhaps it is justified to call them brother and sister. Lecture 9 and the present lecture are somehow a copy of this reality in textbooks, because implicit function theorem and local inverse theorem appear nearby; we apply them both, in fact, to the same evolution equation.

Once again, in this lecture we turn our attention to the abstract differential equa- tion u˙ + F(u)= f . While in the preceding lecture we applied the implicit function theorem in order to prove regularity of solutions, in this lecture we prove existence of local solutions by the help of the local inverse theorem. It turns out that we meet in the assumptions of both theorems the same linear problem for which one has to prove existence, uniqueness and maximal regularity. Once one has existence, uniqueness and maximal regularity for the linear problem, existence of solutions is an immediate consequence of the local inverse theorem (Theorem 10.1 below).

In the second part of this lecture, we show how Theorems 9.1 and 10.1 may be applied to a semilinear diffusion equation, or, more precisely, to a gradient system associated with a semilinear diffusion equation. Due to the importance of an exis- tence, uniqueness and maximal regularity result for a linear problem, Theorem 6.1 on the existence and uniqueness of global solutions of gradient systems becomes in this example an essential ingredient, although its application in the example of the semilinear diffusion equation is a little bit hidden. Find the two places where Theorem 6.1 is applied.

107 108 10 Existence of local solutions 10.1 The local inverse theorem and existence of local solutions of abstract differential equations

Let D and X be two Banach spaces such that D is densely and continuously embed- ded into X. Let F : D X be a continuously differentiable function, p 1, T > 0, p → ≥ f L (0,T;X), and u0 Trp(X,D). We study existence of local solutions of the abstract∈ differential equation∈

u˙ + F(u)= f , (10.1) ( u(0)= u0.

As in the previous lecture, for simplicity of notation, we let

1,p p MRp(0,T ;X,D) := W (0,T;X) L (0,T;D), ∩ and we call a function u MRp(0,T ;X,D) (0 < T T) a (local) solution if it ∈ ′ ′ ≤ satisfies the initial condition u(0)= u0 and the differential equationu ˙ + F(u)= f almost everywhere on (0,T ′). Theorem 10.1 (Existence of local solutions). Assume that the composition oper- p ator MRp(0,T ;X,D) L (0,T;X), u F(u)= F u is everywhere defined and → 7→ ◦ continuously differentiable. Assume further that there exists u¯ MRp(0,T ;X,D) p ∈ such that u¯(0)= u0 and for every g L (0,T ;X) the linear problem ∈

v˙+ F′(u¯)v = g (10.2) ( v(0)= 0, admits a unique solution v MRp(0,T ;X,D). Then the problem (10.1) admits a local solution. ∈

The proof of this theorem is almost an immediate consequence of the local in- verse theorem (see Theorem C.5).

Theorem 10.2 (Local inverse theorem). Let X and Y be two Banach spaces and let U X be open. Let G : U Y be a continuously differentiable function and ⊆ → let x¯ U be such that G′(x¯) : X Y is an isomorphism, that is, continuous, linear, bijective∈ and the inverse is also continuous.→ Then there exist neighbourhoodsV U of x¯ andW Y ofG(x¯) such that G : V W is a C1 diffeomorphism, that is⊆ G is ⊆ → 1 continuously differentiable, bijective and the inverse G− : W V is continuously differentiable, too. →

Proof (of Theorem 10.1). Consider the operator

p G : MRp(0,T ;X,D) L (0,T ;X) Trp(X,D), → × u (u˙ + F(u),u(0)). 7→ 10.2 Gradients of quadratic forms and identification of the trace space 109

By the assumption on F, and since the mappings u u˙ and u u(0) are continuous and linear, the operator G is continuously differentiable.7→ Its7→ derivative at a point u MRp(0,T ;X,D) is the linear operator given by ∈ p G′(u) :MRp(0,T;X,D) L (0,T ;X) Trp(X,D), → × v (v˙+ F′(u)v,v(0)). 7→ Observe that the assumption on the linear problem (10.2), together with Lemma 9.3, just translates the fact that G′(u¯) is bijective. Therefore, by the bounded in- verse theorem, G′(u¯) is an isomorphism. By the local inverse theorem (Theorem 10.2), there exists a neighbourhood V MRp(0,T;X,D) ofu ¯ and a neighbourhood p ⊆ 1 W L (0,T;X) Trp(X,D) of G(u¯) = (u¯˙ + F(u¯),u0) such that G : V W is a C ⊆ × → diffeomorphism. In particular, for every (h,z0) W there exists a unique z V such that ∈ ∈ z˙+ F(z)= h, (10.3) ( z(0)= z0. For T (0,T], consider the function h Lp(0,T ;X) given by ′ ∈ ∈

f (t) if t (0,T ′), h(t)= ∈ ( u¯˙ + F(u¯) if t (T ′,T ). ∈

Let T > 0 be small enough, so that (h,u0) W . Let z V be the solution of (10.3) ′ ∈ ∈ and let u MRp(0,T ′;X,D) be the restriction of z to the interval (0,T ′). Then u is a solution∈ of (10.1).

10.2 Gradients of quadratic forms and identification of the trace space

Let V and H be two separable Hilbert spaces with inner products , V and , H , respectively, and assume that V is continuously and densely embeddedh· ·i into Hh·. Con-·i 1 2 sider the quadratic form E : V R, E (u)= u , and let ∇H E be the gradient of → 2 k kV E with respect to the inner product , H . We recall from Exercise 8.2 that ∇H E is h· ·i a linear operator and that the domain D := D(∇H E ) is a Hilbert space for the inner product u,v D := ∇H E (u),∇H E (v) H + u,v H (u, v D). h i h i h i ∈ One has D V H ⊆ ⊆ with continuous embeddings. Moreover, the space

1,2 2 MR2(0,T ;H,D) := W (0,T;H) L (0,T ;D) ∩ 110 10 Existence of local solutions

is a Hilbert space for the inner product

T T

u,v MR2 := u(t),v(t) D dt + u˙(t),v˙(t) H dt. h i 0 h i 0 h i Z Z We omit the proof of the following lemma. It is very similar to the proof that C1([0,T ]) is dense in W 1,p(0,T ) (compare with Corollary G.13). 1 Lemma 10.3. The space C ([0,T];D) is dense in MR2(0,T ;H,D). 2 1,1 Lemma 10.4. For every u MR2(0,T ;H,D) one has u W (0,T ) and ∈ k kV ∈

1 d 2 u = ∇H E (u),u˙ H , 2 dt k kV h i the derivative on the left-hand side of this inequality being the weak derivative. Proof. If u C1([0,T ];D), then the function u 2 is continuously differentiable ∈ k kV and, by definition of the gradient ∇H E ,

d 2 d u = u,u V = 2 u,u˙ V = 2 ∇H E (u),u˙ H . dt k kV dt h i h i h i

Let u MR2(0,T ;H,D). By Lemma 10.3, there exists a sequence (un) C1([0,T∈];D) such that ⊆

2 un u in L (0,T;D), → 2 ∇H E (un) ∇H E (u) in L (0,T;H), and (10.4) → 2 u˙n u˙ in L (0,T;H). → By the above calculation and by the rule of integration by parts, for every ϕ 1 ∈ Cc (0,T ) and every n,

T T 2 ϕ ∇ E ϕ un V ˙ = 2 H (un),u˙n H . 0 k k − 0 h i Z Z The convergences in (10.4) allow us to pass to the limit on both sides of this equa- 1 tion. We thus obtain that for every u MR2(0,T ;H,D) and every ϕ C (0,T ) ∈ ∈ c T T 2 ϕ ∇ E ϕ u V ˙ = 2 H (u),u˙ H . 0 k k − 0 h i Z Z The claim follows from this equality, the definition of the Sobolev space W 1,1(0,1) 1 and the fact that, by the Cauchy-Schwarz inequality, ∇H Eω (u),u˙ H L (0,T ). h i ∈ Lemma 10.5. One has

MR2(0,T;H,D) C([0,T ];V), ⊆ with continuous embedding. 10.2 Gradients of quadratic forms and identification of the trace space 111

Proof. For every u C1([0,T ];D) we have, by Lemma 10.4, and by continuity of the Sobolev embedding∈ W 1,1(0,T) C([0,T ]) (applied to the function u 2 ) ⊆ k kV 2 2 2 u C([0,T];V) = sup u(t) V = u V ∞ k k t [0,T ] k k kk k k ∈ T T 2 d 2 C u(t) V dt + u(t) V dt ≤ 0 k k 0 dt k k Z Z T T  2 ∇ E C u(t) D dt + 2 H (u(t )) H u˙(t) H dt ≤ 0 k k 0 k k k k Z Z C u 2 .  ≤ k kMR2(0,T;H,D) The constants C here may change from line to line, they depend on the Sobolev embedding constant and on the constant of the embedding D ֒ V , but they do not depend on u. The claim thus follows from this estimate and Lemma→ 10.3.

Lemma 10.6. One has Tr2(H,D)= V.

Proof. The inclusion Tr2(H,D) V is a direct consequence of Lemma 10.5. For the ⊆ other inclusion, recall that, by Theorem 6.1, for every u0 V the gradient system ∈

u˙ + ∇H E (u)= 0, u(0)= u0,  1,2 ∞ admits a (unique) solution u W (0,1;H) L (0,1;V). Since ∇H E (u) = u˙ L2(0,T ;H), this solution∈ belongs also to∩L2(0,1;D), and consequently to − ∈ MR2(0,1;H,D). Thus, by solving the gradient system, we have proved that for ev- ery u0 V there exists u MR2(0,1;H,D) such that u(0)= u0. Since u0 V was ∈ ∈ ∈ arbitrary, this proves the inclusion V Tr2(H,D). ⊆ 1 2 2 Example 10.7. Let V = H0 (0,1) and H = L (0,1). We equip L (0,1) with the 2 1 1 usual L inner product, and H0 (0,1) with the inner product u,v H1 = 0 u′v′. h i 0 By Poincar´e’s inequality, this inner product is equivalent to the usual innerR prod- uct which is induced from H1(0,1) (Exercise 4.7). The gradient of energy E : 1 R E 1 2 2 H0 (0,1) , (u)= 2 u 1 with respect to the usual L inner product is the → k kH0 negative Dirichlet-Laplace operator on L2(0,1), that is

2 1 D(∇ 2 E )= H (0,1) H (0,1), L ∩ 0 D ∇ 2 E (u)= u′′ = ∆u; L − − (0,1) see Lecture 3. By Lemma 10.6, we therefore find

2 2 1 1 Tr2(L (0,1),H (0,1) H (0,1)) = H (0,1). (10.5) ∩ 0 0 d D Remark 10.8. a) More generally, if Ω R is a bounded,open set, and if Ω ∆ is the Dirichlet-Laplace operator on L2⊆(Ω) (see Lecture 4), then 2 Ω D∆ 1 Ω Tr2(L ( ),D(Ω )) = H0 ( ). 112 10 Existence of local solutions

One may argue as in the preceding example, by noting that the Poincar´e inequality holds for every bounded open Ω Rd ([Adams (1975)], [Br´ezis (1992)]). ⊆ b) It is actually not necessary to appeal to Poincar´e’s inequality, for Lemma 10.6 remains true in a slightly more general setting. Assume that E : V R is a continuous, H-elliptic, quadratic form. Still, by Exercise 8.2, the grad→ ient ∇H E is a linear operator and the domain D := D(∇H E ) is a Hilbert space. The equality Tr2(H,D)= V is still true in this situation (Exercise). c) In particular, if H = L2(0,1) and D = u H2(0,1) : u (0)= u (1)= 0 , then { ∈ ′ ′ } 1 Tr2(H,D)= H (0,1).

This can be seen by considering the Neumann-Laplace operator on L2(0,1), for D is just the domain of this operator, and H1(0,1) is the domain of the associated energy (see Exercise 4.3).

10.3 A semilinear diffusion equation

Let f : R R be k times continuously differentiable for some k 1, and consider the semilinear→ diffusion equation ≥

ut uxx + f (u)= 0 in (0,∞) (0,1), − ×  u(t,0)= u(t,1)= 0 for t (0,∞), (10.6) ∈   u(0,x)= u0(x) for x (0,1). ∈  We rewrite the equation (10.6) as an abstract gradient system and we prove existence and regularity of a solution for that problem. We consider the energy E : H1(0,1) R given by 0 → 1 1 1 E (u)= u2 + F(u), 2 x Z0 Z0 s E where F(s)= 0 (r) dr. By Exercise 4.6, the energy is continuously differentiable. 2 Moreover, if ∇ 2 E is the gradient with respect to the usual L inner product, then RL 2 1 D(∇ 2 E )= H (0,1) H (0,1), L ∩ 0 D ∇ 2 E (u)= ∆u + f (u). L − (0,1) 10.3 A semilinear diffusion equation 113

D∆ 2 Here (0,1) is the Dirichlet-Laplace operator on L (0,1) as it was defined in Lec- 1 2 ture 3, and f is the composition operator H0 (0,1) L (0,1), u f (u)= f u; we use the same letter for the original function f and→ any composition7→ operator◦ asso- ciated with this function since we think that there is no danger of confusion. In the following, we denote

H := L2(0,1), 1 V := H0 (0,1), and 2 1 D := D(∇ 2 E )= H (0,1) H (0,1). L ∩ 0 Recall from Example 10.7 that

2 2 1 1 Tr2(L (0,1),H (0,1) H (0,1)) = H (0,1). ∩ 0 0 1 Moreover, by Lemma 10.5, and by the Sobolev embedding H0 (0,1) C([0,1]) (Theorem 5.11), we have the continuous embeddings ⊆

1 MR2(0,T;H,D) C([0,T ];H (0,1)) C([0,T ];C([0,1])). (10.7) ⊆ 0 ⊆ Throughout this section, we deliberately use the identifications

C([0,T ];C([0,1])) = C([0,T ] [0,1]) and × C((0,T ];C([0,1])) = C((0,T ] [0,1]) × and their consequences. It follows from these identifications that we may identify a solution u of the gradient system (10.8) below with a real-valued function of two variables (time and space variable), or, vice versa, we may identify a solution of the semilinear diffusion equation (10.6) with an element of a Bochner or Bochner- Sobolev space. The aim is of course to show that every solution of the gradient sys- tem (10.8) below gives rise to a solution of the semilinear diffusion equation (10.6), and vice versa. However, here we concentrate on the abstract gradient system.

1 Theorem 10.9. For every u0 H (0,1) the problem ∈ 0 u˙ D∆u + f (u)= 0, − (0,1) (10.8) ( u(0)= u0, admits a unique local solution

u MR2(0,T ;H,D) satisfying in addition ∈ u W k+1,2((0,T ];L2(0,1)) W k,2((0,T ];H2(0,1)) Ck((0,T ];H1(0,1)). ∈ loc ∩ loc ∩ 0 This theorem is an application of the Theorems 10.1 and 9.1. In order to be able to apply both theorems, we need the following lemmas.

k Lemma 10.10. a) The gradient ∇ 2 E : D H is of class C . L → 114 10 Existence of local solutions

2 ∇ E b) The composition operator MR2(0,T ;H,D) L (0,T;H), u L2 (u) k → ∇7→ E is everywhere defined and of class C . The derivative ( L2 )′(u) : 2 MR2(0,T ;H,D) L (0,T ;H) at a point u is given by → D (∇ 2 E )′(u)v = ∆v + f ′(u)v, L − (0,1)

where f ′(u)v stands for the pointwise multplication of the composite function f (u)= f u and the function v. ′ ′ ◦ Proof. The Dirichlet-Laplace operator is continuous and linear from D into H. Hence, by Exercise 9.3, it suffices to study only the composition operator (Nemytski operator) associated with the function f . (a) We may factorize the composition operator associated with f in the following way: f D H −−−−→ i i ,  x C([0,1]) C([0,1]) y −−−−→f  where i stands for the natural embeddings. The natural embedding D C([0,1]) is continuous by the Sobolev embedding theorem (Theorem 5.11), and⊆ the natural embedding C([0,1]) H is continuous as a consequence of H¨older’s inequality. With this factorization,⊆ since the embeddings i are linear (and therefore of class C∞), and since the composition operator associated with f leaves the C([0,1]) invariant, the claim follows from Exercise 9.2. (b) We may factorize the composition operator associated with f in the following way: f 2 MR2(0,T ;H,D) L (0,T ;H) −−−−→ i i  x C([0,T ];C([0,1])) C([0,T ];C([0,1])), y  b b  x C([0,T] [0,1]) C([0,T] [0,1]) y× −−−−→f × where i stands for the natural embeddings and b for the natural bijections. In the first embedding on the left, we used (10.7). With the above factorization, since the operators i and b are linear, and since the composition operator associated with f leaves the C([0,1]) invariant, the claim follows from Exercise 9.2. 2 2 1 Lemma 10.11. For every T > 0, every c L (0,T ;L (0,1)), every v0 H0 (0,1) and every g L2(0,T;L2(0,1)) the problem∈ ∈ ∈ v˙ D∆v + cv = g, − (0,1) (10.9) ( v(0)= v0 10.3 A semilinear diffusion equation 115

admits a unique solution v MR2(0,T;H,D). ∈ The first line in (10.9) meansv ˙(t) D∆v(t)+ c(t)v(t)= g(t) for almost ev- − (0,1) ery t (0,T) (equality in L2(0,1)), and the term c(t)v(t) stands for the point- wise multiplication∈ of the functions c(t) L2(0,1) and v(t) H1(0,1) C([0,1]). ∈ ∈ 0 ⊆ Note that, by the embedding (10.7), for every v MR2(0,T ;H,D) and every c L2(0,T;L2(0,1)) one has cv L2(0,T ;L2(0,1)).∈ In the proof of Lemma 10.11 we∈ need the following variant of∈ Gronwall’s lemma.

Lemma 10.12 (Gronwall). Let λ : [a,b] R be integrable, and let ϕ : [a,b] R be a continuous functions on the bounded interval→ [a,b]. Assume that, for some→ C 0 and every t [a,b], ≥ ∈ t ϕ(t) C + λ(s)ϕ(s) ds. ≤ a Z Then, for every t [a,b], ∈ Λ ϕ(t) Ce (t), ≤ Λ t λ where (t)= 0 (s) ds. Proof (of LemmaR 10.11). Fix T > 0. It suffices to show that for every c L2(0,T ;L2(0,1)) the linear operator ∈

2 2 1 Gc : MR2(0,T ;H,D) L (0,T ;L (0,1)) H (0,1), → × 0 v (v˙ D∆v + cv,v(0)) 7→ − (0,1) is an isomorphism. Consider the set

2 2 C := c L (0,T;L (0,1)) : Gc is an isomorphism . { ∈ } 2 2 For every c1, c2 L (0,T ;L (0,1)) one has ∈

Gc1 Gc2 L = sup Gc1 u Gc2 u L2(0,T ;L2) H1 k − k u 1 k − k × 0 k kMR2 ≤ = sup (c1 c2)u L2(0,T;L2) u 1 k − k k kMR2 ≤ ∞ ∞ sup c1 c2 L2(0,T;L2) u L (0,T;L ) ≤ u 1 k − k k k k kMR2 ≤ C c1 c2 2 2 , ≤ k − kL (0,T;L )

where C is the constant of the embedding (10.7). Hence, the mapping c Gc is continuous from the space L2(0,T ;L2(0,1)) into the space of continuous linear7→ op- 2 2 1 erators from MR2(0,T ;H,D) into L (0,T ;L (0,1)) H0 (0,1). Since the isomor- phisms form an open subset in the set of all continuous,× linear operators, we see that C is open. Next, we remark that for c 0, the problem (10.9) is nothing else than equation ≡ 2 2 1 (7.11). By Theorem 7.1, for every g L (0,T;L (0,1)) and every u0 H (0,1) ∈ ∈ 0 116 10 Existence of local solutions the problem (10.9) (with c 0) admits a unique solution v W 1,2(0,T ;L2(0,1)) ∞ 1 ≡ ∈ ∩ L (0,T ;H0 (0,1)). It follows from equation (10.9) that this solution satisfies D 2 ∆v L (0,T;H), and hence v MR2(0,T ;H,D). Hence, G0 is continuous, (0,1) ∈ ∈ linear and bijective. By the bounded inverse theorem, the operator G0 is an iso- morphism. Hence, the set C is nonempty. 2 2 1 Finally, let c C. Let g L (0,T ;L (0,1)), v0 H (0,1), and let v ∈ ∈ ∈ 0 ∈ MR2(0,T ;H,D) be the unique solution of the problem (10.9). By Lemma 10.4, 2 1,1 the function t v(t) 1 belongs to W (0,T ) and 7→ k kH0

1 d 2 D∆ v H1 = (0,1) v,v˙ L2 . 2 dt k k 0 −h i Hence, if we multiply equation (10.9) byv ˙ with respect to the L2 inner product and integrate the result over (0,t) (with t (0,T)), then we obtain ∈ t t 2 1 2 1 2 v˙ L2 + v(t) H1 = v0 H1 + g cv,v˙ L2 0 k k 2 k k 0 2 k k 0 0 h − i Z Z 1 t t 1 t 2 2 2 2∞ 2 v0 H1 + g L2 + c L2 v L + v˙ L2 , ≤ 2 k k 0 0 k k 0 k k k k 2 0 k k Z Z Z or

t T t 1 2 1 2 1 2 2 2 2 v˙ L2 + v(t) H1 v0 H1 + g L2 + c L2 v H1 . (10.10) 2 0 k k 2 k k 0 ≤ 2 k k 0 0 k k 0 k k k k 0 Z Z Z Here, we have also applied the Sobolev embedding theorem (Theorem 5.11). By the variant of Gronwall’s lemma (Lemma 10.12), the preceding inequality implies that

T t 1 2 1 2 2 2 c 2 0 k kL2 v(t) H1 ( v0 H1 + g L2 )e for every t (0,T ). 2 k k 0 ≤ 2k k 0 0 k k R ∈ Z When we insert this estimate into (10.10), we obtain that for every t (0,T ) ∈ t T t 1 2 1 2 1 2 2 2 c 2 0 k kL2 v˙ L2 + v(t) H1 ( v0 H1 + g L2 )e . 2 0 k k 2 k k 0 ≤ 2k k 0 0 k k R Z Z Hence,

1 2 c 2 1 T 2 2 k kL2(0,T;L2) 2 2 v˙ L2(0,T;L2) + v L∞ 0 T ;H1 2e ( v0 H1 + g L2 ). k k 2 k k ( , 0 ) ≤ 2k k 0 0 k k Z From equation (10.9) and the Sobolev embedding H1(0,1) C([0,1]) we obtain 0 ⊆ D∆ 0 1 v L2(0,T;L2) v˙ L2(0,T ;L2) + c L2(0,T ;L2) v L∞ 0 T ;H1 + g L2(0,T ;L2). k ( , ) k ≤k k k k k k ( , 0 ) k k This inequality and the preceding estimate finally imply that for every R 0 there ≥ exists KR 0 such that whenever c 2 2 R, then ≥ k kL (0,T ;L ) ≤ 10.4 Exercises 117

v 2 G 1 g v 2 K g 2 v 2 MR2 = c− ( , 0) MR2 R ( L2(0,T;L2) + 0 H1 ). k k k k ≤ k k k k 0 1 In other words, Gc− is uniformly bounded whenever c varies in bounded subsets of C. This implies that C must be closed. Since C L2(0,T ;L2(0,1)) is open, closed and nonempty, and since the space L2(0,T ;L2(⊆0,1)) is connected, we obtain that C = L2(0,T;L2(0,1)). The claim fol- lows from the definition of C and since T > 0 was arbitrary.

1 Proof (of Theorem 10.9). Existence. Let u0 H0 (0,1). Since, by Lemma 10.6, V = 1 ∈ H0 (0,1)= Tr2(H,D), we can chooseu ¯ MR2(0,1;H,D) such thatu ¯(0)= u0. By (10.7),u ¯ C([0,1];C([0,1])), and therefore∈ ∈ 2 2 f ′(u¯) C([0,1];C([0,1])) L (0,1;L (0,1)), ∈ ⊆ 2 2 since f ′ is continuous. By Lemma 10.11, for every g L (0,1;L (0,1)) and every 1 ∈ v0 H (0,1), the problem ∈ 0 v˙ D∆v + f (u¯)v = g, − (0,1) ′ ( v(0)= v0

admits a unique solution v MR2(0,1;H,D). Since, by Lemma 10.10, D ∈ (∇ 2 E ) (u¯)v = ∆v + f (u¯)v, we may therefore apply Theorem 10.1 and we L ′ − (0,1) ′ obtain a local solution u MR2(0,T ′;H,D) of the problem (10.8). Again, for this solution we have ∈

2 2 f ′(u) C([0,T ];C([0,1])) L (0,T ;L (0,1)) (T (0,T ′)), ∈ ⊆ ∈ 2 2 since f ′ is continuous. By Lemma 10.11, for every g L (0,T;L (0,1)) and every 1 ∈ v0 H (0,1), the problem ∈ 0 v˙ D∆v + f (u)v = g, − (0,1) ′ ( v(0)= v0

admits a unique solution v MR2(0,T ;H,D). By Theorem 9.1 and Lemma 10.10, we obtain that the solution∈ satisfies the stated regularity. Uniqueness. See Exercises.

10.4 Exercises

10.1. Let V and H be two Hilbert spaces such that V is densely and continuously embedded into H. Let E : V R be a continuous, H-elliptic, quadratic form. By → Exercise 8.2, the gradient ∇H E is a linear operator and the domain D := D(∇H E ) is a Hilbert space. Show the equality 118 10 Existence of local solutions

Tr2(H,D)= V.

10.2 (Gronwall’s lemma). Prove the Lemma 10.12.

10.3. Prove the uniqueness of local solutions in Theorem 10.9. Hint: Assume that u MR2(0,T ;H,D) and v MR2(0,T ′;H,D) are two local so- lutions. By using the embedding(10.7)and∈ the∈ fact that the function f is locally Lip- 2 schitz continuous, establish a differential inequality for the function u( ) v( ) L2 (0 t min T,T ). k · − · k ≤ ≤ { ′} 10.4. Consider the semilinear diffusion equation

3 ut uxx u = 0 in (0,∞) (0,1), − − ×  u(t,0)= u(t,1)= 0 for t (0,∞), (10.11) ∈   u(0,x)= u0(x) for x (0,1). ∈  This equation can be rewritten as an abstract gradient system for the energy E : H1(0,1) R given by 0 → 1 1 E 1 2 1 4 (u)= ux u . 2 0 − 4 0 Z Z

a) Show that the energy E is not L2(0,1)-elliptic. 1 b) Show that for every u0 H (0,1) satisfying E (u0) < 0 the unique local solu- tion of (10.8) (which exists∈ by Theorem 10.9) can not be extended to a global solution defined on (0,∞). Hint. Multiply the gradient system (10.8) by u with respect to the usual L2 inner product. By using that E (u) is decreasing with respect to the time 2 variable, establish a differential inequality for u L2 and deduce that there must be blow-up in finite time, that is, that therek k exists T < ∞ such that 2 ∞ limt T u(t) 2 = . → − k kL Lecture 11 Asymptotic behaviour of solutions of gradient systems

The gradient system u˙ + ∇gE (u)= 0 (11.1) is a prototype example of a dissipative system, that is, by our loose definition from Lecture 1, a system which admits an energy function. An energy function is, by def- inition, a function which is decreasing along every solution. In the case of a gradient system in finite-dimensionalspace, by Lemma 2.5, the function E itself is an energy, and it satsfies the stronger property that if it is constant along a solution then that so- lution must already be constant. For gradient systems in infinite-dimensional spaces we did not prove an analogous statement, but the energy inequalities in Theorems 6.1 and 8.1 and the energy equality in Lemma 10.5 indicate that some gradient sys- tems in infinite-dimensional spaces are dissipative, too. Actually, we are not aware of a gradient system which is not dissipative. Concerning the meaning of “dissipa- tive” or “to dissipate”, let us recall the footnote from page 11:

to dissipate (lat.: dissipare): to cause to lose energy (such as heat) irreversibly. Or: to spend or expend intemperately or wastefully. Or: to attenuate to or almost to the point of disappearing.

In this explanation, the last point is of particular interest in connection with this lecture in which we study the asymptotic behaviour of solutions of gradient systems. What does it mean, “to attenuate to oralmost to the point of disappearing”?

First, it should be noted that, by Lemma 1.6, every limit point of a global solution of a Euclidean gradient system is an equilibrium point, that is, a point in which the gradient vanishes. It is not too difficult to see that this result is also true for general gradient systems in finite-dimensional spaces, and for gradient systems in infinite- dimensional spaces if some energy inequality is true along the solution (Lemma 11.3). A basic problem, however, is to know whether every global “bounded” solu- tions has only one limit point, that is, whether it converges to a single equilibrium point = “the point of disappearing”. Moreover, if a global, bounded solution con- verges to a single equilibrium point, can we determine the rate of convergence?Such questions are studied in this and the following lecture. The problem of convergence

119 120 11 Asymptotic behaviour of solutions of gradient systems to equilibrium admits a sort of dichotomy: there is a relatively simple convergence result which applies in situations where we know something about the topology of the set of equilibrium points, in particular, that the set of equilibrium points is dis- crete (Theorem 11.4). However, if we know nothing about the set of equilibrium points, or if we know that there is a continuum of equilibrium points, then the prob- lem of convergence becomes very quickly quite involved. In general, we must say that the above description of dissipativity does not coincide with the reality of gra- dient systems: there exists an example of a bounded solution of a gradient system which does not converge! The existence of such an example was already suggested in words by Curry in 1944; a concrete example was then given in 1982 by Palis and de Melo. We describe this example of a nonconvergent soltuion at the end of this lecture.

11.1 The ω-limit set of a continuous function on R+

Let (M,d) be a metric space, and let u : R+ M be a continuous function. The set →

ω(u)= ϕ M : there exists an unbounded (tn) R+ such that lim u(tn)= ϕ . n ∞ { ∈ ⊆ → } is called the ω-limit set1 of u. Clearly, in this definition, by passing to appropriate subsequences, one may replace “unbounded (tn)” by “unbounded and increasing (tn)”. Alternatively, one may define the ω-limit set in the following way.

Lemma 11.1. One has ω(u)= u(s) : s t . t 0 { ≥ } \≥

Proof. Let ϕ ω(u). By definition, there exists an unbounded sequence (tn) R+ ∈ ⊆ such that limn ∞ u(tn)= ϕ. Since the sequence (tn) is unbounded, this implies ϕ → ϕ ∈ u(s) : s t for every t 0, and hence t 0 u(s) : s t . { ≥ } ϕ ≥ ∈ ≥ { ≥ N} ϕ Conversely, let t 0 u(s) : s t . Then,T forevery n , u(s) : s n . ∈ ≥ { ≥ } ∈ ∈ { 1≥ } In particular, for every n N, there exists tn n such that d(u(tn),ϕ) . We T ∈ ≥ ≤ n obtain therefore an unbounded sequence (tn) R+ such that limn ∞ u(tn)= ϕ. By definition, ϕ ω(u). ⊆ → ∈ Lemma 11.2. Assume that u : R+ M has relatively compact range. Then: → a) The ω-limit set is non-empty, compact and connected. b) If one denotes dist(u(t),ω(u)) := inf d(u(t),ϕ) : ϕ ω(u) , then one has { ∈ } limt ∞ dist(u(t),ω(u)) = 0. → 1 The letters α and ω are the first and the last letter, respectively, of the greek alphabet. While the “ω-limit set” gives information about the behaviour of a continuous function near +∞, the “α-limit set” is defined analogously for a continuous function defined on R and gives information about the behaviour near ∞. − − 11.2 A stabilisation result for global solutions of gradient systems 121

c) The limit limt ∞ u(t) exists if and only if ω(u) contains exactly one element. → Proof. (a) Since u is continuous and has relatively compact range, for every t 0 ≥ the set Kt := u(s) : s t is non-empty, compact and connected. Moreover, the { ≥ } family (Kt ) is decreasing, that is, Kt Ks for t s. Then the characterization from Lemma 11.1 implies that ω(u) is non-empty,⊆ compact≥ and connected, too. (b) Assume that the claim is not true. Then there exists an unbounded se- quence (tn) R+ such that inf dist(u(tn),ω(u)) : n 1 > 0. Since u has rela- ⊆ { ≥ } tively compact range, we can extract an unbounded subsequence (tnk ) of (tn), and ϕ ϕ ϕ ω find V such that limk ∞ u(tnk )= . By definition, (u). This implies ∈ ω → ∈ limk ∞ dist(u(tnk ), (u)) = 0, a contradiction. → (c) If the limit limt ∞ u(t)=: ϕ exists, then ω(u)= ϕ ; this follows from the definition of the ω-limit→ set. On the other hand, if ω(u)={ ϕ} , then (b) implies that { } limt ∞ u(t)= ϕ. →

11.2 A stabilisation result for global solutions of gradient systems

Let U be an open subset of a Banach space V which embeds densely and continu- ously into a Hilbert space H. Let E : U R be a continuously differentiable func- tion, and let g : U Inner(H) be a metric.→ We assume that there exists a constant c 0 such that → ≥ v H c v for every v H, u U. (11.2) k k ≤ k kg(u) ∈ ∈ 1,2 R R Lemma 11.3. Let u Wloc ( +;H) C( +;V) be a global solution of the gradient system (11.1). Assume∈ that u has relatively∩ compact range inU, and that the energy inequality t 2 E E u˙(s) g(u(s)) ds + (u(t)) (u(0)) (11.3) 0 k k ≤ Z holds for every t R+. Then every ω-limit point of u is an equilibrium point of E , ∈ that is, every element ϕ ω(u) satisfies ϕ D(∇gE ) and ∇gE (ϕ)= 0. ∈ ∈ Proof. Since u has relatively compact range in U, and since E is continuous on U, the composite function E (u) is necessarily bounded. Then the energy inequality ∞ 2 (11.3) implies that the integral 0 u˙(s) g(u(s)) ds is finite. By assumption (11.2), ∞ k k the integral u˙(s) 2 is thus finite, too. 0 k kH R Let ϕ ω(u). Then there exists an unbounded increasing sequence (tn) R+ ∈ R ⊆ such that limn ∞ u(tn) ϕ V = 0. Since V is continuously embedded into H, and by H¨older’s inequality,→ k we− obtaink for every s [0,1] ∈ tn+s limsup u(tn + s) ϕ H limsup u(tn) ϕ H + u˙(r) H dr n ∞ k − k ≤ n ∞ k − k tn k k → → Z tn+s 1  2 2 limsup u˙(r) H dr = 0. ≤ ∞ t k k n Z n →  122 11 Asymptotic behaviour of solutions of gradient systems

Using again the relative compactness of the range of u and a sub-subsequence argu- ment, this implies that

lim u(tn + s) ϕ V = 0 for every s [0,1]. n ∞ → k − k ∈

By continuity of E ′,

lim E ′(u(tn + s)) E ′(ϕ) V = 0 for every s [0,1]. n ∞ ′ → k − k ∈ Therefore, by the dominated convergencetheorem, by definition of the gradient, and since u is a solution of the gradient system (11.1), for every v V, ∈ 1 E ′(ϕ),v = E ′(ϕ),v ds |h i| 0 h i Z 1 = lim E ′(u(tn + s)),v ds n ∞ 0 h i → Z 1

= lim u˙(tn + s),v g(u(tn+s)) ds . n ∞ 0 h i → Z

By continuity of the metric g, and since u has relatively compact range in U, for R every v, w H the set v,w g(u(t)) : t + is bounded. Hence, by the uniform boundedness∈ principle,{|h there existsi a| constant∈ }C 0 such that for every v H and 2 2 ≥ ∈ every t R+ one has v C v . Hence, for every v V, ∈ k kg(u(t)) ≤ k kH ∈

1 1 1 1 E ϕ 2 2 2 2 ′( ),v lim u˙(tn + s) g(u(t +s)) ds v g(u(t +s)) |h i| ≤ n ∞ 0 k k n 0 k k n → Z Z 1  1  2 2 √ lim u˙(tn + s) g(u(t +s)) ds C v H ≤ n ∞ 0 k k n k k → Z = 0. 

This yields E ′(ϕ)= 0 and the claim. A topological Hausdorff space S is discrete if the topology on S is the discrete topology, that is, every subset of S is open and closed. In a discrete space the only connected subsets are the subsets which contain at most one element. This observa- tion leads to the following stabilisation result for global solutions of gradient sys- tems.

Theorem 11.4. Assume that the set of equilibrium points S := ϕ D(∇gE ) : { ∈ ∇gE (ϕ)= 0 U is discrete. Then every global solution u of (11.1) with rela- tively compact}⊆ range in U and satisfying the energy inequality (11.3) converges to an equilibrium, that is, there exists ϕ S such that limt ∞ u(t)= ϕ inV. ∈ → Proof. By Lemma 11.3, the ω-limit set of every continuous, global solution hav- ing relatively compact range in U is contained in the set of equilibrium points. By Lemma 11.2 (a), the ω-limit set is non-empty and connected. Since, by assumption, 11.3 Nonstabilisation results for global solutions of gradient systems 123 the set of equilibrium points is discrete, we deduce that the ω-limit set contains exactly one point. The claim now follows from Lemma 11.2 (c).

11.3 Nonstabilisation results for global solutions of gradient systems

Theorem 11.4 provides a very useful criterion for stabilization of global solutions having relatively compact range. Exercise 11.3 contains a nontrivial example of a semilinear diffusion equation where the set of equilibrium points is discrete. How- ever, one can easily imagine situations where the set of equilibrium points is non- discrete: think of an energy which is constant on a nonempty open set or look at Exercise 11.4. In those situations, can we still expect convergence to a single equi- librium point? Is convergence a consequence of the gradient structure and/or dissi- pativity? The following suggests that the answer is no, although there is no explicit example.

The following example shows that we cannot expect a better result without further restric- tions on E . Let E (x,y)= 0 on the unit circle and E (x,y) > 0 elsewhere. Outside the unit circle let the surface have a spiral gully making infinitely many turns about the circle. Then the path C will evidently follow the gully and have all points of the unit circle as limit points.2 Haskell B. Curry

It is not clearwhetherCurry had a concreteenergy E in mind.In fact, it took quite a while until in [Palis and de Melo (1982), page 14] the following C∞ “Mexican- hat” function

1 exp( r2 1 ) if r < 1, − E (r cosθ,r sinθ)=  0 if r = 1,   1 1 θ exp( r2 1 ) sin( r 1 ) if r > 1, − − − −  appeared. Palis and de Melo showed by some abstract arguments that the Euclidean gradient system associated with this energy possesses a global, bounded solution which has the whole unit circle as ω-limit set. For curiosity we note that in their example the solution stays outside the unit circle and the particular shape of E inside the unit circle is not important. In fact, there are many ways how to extend E to a global C∞ function inside the unit disk, and the particular definition of Palis and de Melo is perhaps only motivated by the desire to obtain a “Mexican hat”. We do not repeat the analysis of Palis and de Melo, but rather consider the fol- lowing energy

2 The citation is taken from H. B. Curry, The method of steepest descent for non-linear minimiza- tion problems, Quart. Appl. Math. 2 (1944). We have changed the notation for the energy E which was denoted by G in the original. 124 11 Asymptotic behaviour of solutions of gradient systems

Fig. 11.1 The graph of the “mexican hat” of Palis & de Melo

1 exp( r2 1 ) if r < 1, − E (r cosθ,r sinθ)=  0 if r = 1,  4  exp 1 1 4r sin 1 θ if r 1 ( r2− 1 ) + 4r4+(1 r2)4 ( r2 1 ) > , − − − −  which was essentially proposed by Absil, Mahony and Andrews in [Absil et al. (2005)]. It looks more complicated than the energy of Palis and de Melo, but it is easier to show that this energy gives an example of nonstabiliza- tion. Note that the Euclidean gradient systemu ˙ +∇eucE (u)= 0 in polar coordinates becomes ∂E˜ r˙+ (r,θ)= 0, ∂r 1 ∂E˜ θ˙ + (r,θ)= 0. r2 ∂θ

Here, E˜(r,θ)= E (r cosθ,r sinθ). Let r : R+ (0,∞) be a global solution of → 11.3 Nonstabilisation results for global solutions of gradient systems 125

Fig. 11.2 The level curves of the “mexican hat” of Palis & de Melo near the unit circle

1 2r (r2 1)2 r˙+ exp( − ) − = 0 (11.4) r2 1 4r4 + (r2 1)4 − − with r(0) > 1. Then limt ∞ r(t)= 1 (exercise!). Moreover, if we put in addition → 1 θ(t) := , r(t)2 1 − then a straightforward calculation shows that the pair (r,θ) is a solution of the gra- dient system above (in polar coordinates). However, for this solution one has

lim r(t)= 1 and lim θ(t)= ∞. t ∞ t ∞ → → 2 Hence, the function u : R+ R given by →

u(t) = (r(t)cosθ(t),r(t)sinθ(t)) (t R+) ∈ is a global solution of the euclidean gradient system associated with E , and has the whole unit circle as ω-limit set. It “follows the gully”, as described by Curry. Thus, in general, global and bounded solutions of gradient systems need not converge. 126 11 Asymptotic behaviour of solutions of gradient systems 11.4 Exercises

11.1. Show that every global and bounded solution of a scalar ordinary differential equation converges to an equilibrium point. 11.2. Show that every maximal solution r of (11.4) with r(0) > 1 is global and satisfies limt ∞ r(t)= 1 → 11.3 (Shooting method). Let p > 1 and consider the set

2 1 p 1 S := ϕ H (0,1) H (0,1) : ϕ′′ ϕ − ϕ = 0 { ∈ ∩ 0 − − | | } of all equilibrium points of the problem

p 1 ut uxx u u = 0 in (0,∞) (0,1), − − | | − × ( u(t,0)= u(t,1)= 0 for every t (0,∞). ∈

a) By recalling an existence and uniqueness result for ordinary differential equa- tions, show that for every c R the initial-value problem ∈ ϕ ϕ p 1ϕ = 0 on R, − ′′ − | | −  ϕ(0)= 0, (11.5)   ϕ′(0)= c,  admits a unique maximal solution ϕ : [0,xmax) R. → b) Show that the for every solution ϕ of (11.5) the quantity E(ϕ(x),ϕ′(x)) := 1 ϕ (x)2 + 1 ϕ(x) p+1 does not depend on x, that is, 2 ′ p+1 | |

1 2 1 p+1 ϕ′(x) + ϕ(x) = const =: d for every x 0. (11.6) 2 p + 1| | ≥ Deduce from this that the maximal solution of (11.5) is global, that is, exists on R+. Moreover, express the constant d in terms of c.

c) Let ϕ be a solution of (11.5) and assume that ϕ′(0)= c > 0. Then, by conti- nuity, x0 := inf x 0 : ϕ′(x) 0 > 0. { ≥ ≥ } Express x0 as a function of c. Hint. On the interval [0,x0), one has ϕ′ > 0. Equation (11.6) can thus be rewritten in the form

2 p+1 ϕ′(x)= 2d ϕ(x) . s − p + 1| |

Integrate this ordinary differential equation with separated variables. 11.4 Exercises 127

d) Let ϕ and x0 be as in the previous item. Show that ϕ(2nx0)= 0 for every 1 ϕ integer n 1. Show that x0 = 2n for some integer n 1 if and only if (restricted≥ to the interval [0,1]) is an equilibrium point. ≥ e) Show that the mapping S R, ϕ ϕ (0) is continuous and injective. → 7→ ′ f) Show that the set S is discrete.

11.4. Let f : R R be a continuously differentiable function, and consider the semilinear diffusion→ equation

ut uxx + f (u)= 0 in (0,∞) R, − × ( limx ∞ u(t,x)= limx +∞ u(t,x)= 0. →− → ζ Define F(ζ)= f (s) ds, let ζ0 := inf ζ > 0 : F(ζ) 0 , and assume that 0 { ≤ } R f (0)= 0 and f ′(0) > 0,

0 < ζ0 < ∞ and f (ζ0) < 0.

a) Show that the problem

ϕ + f (ϕ)= 0 in R, − ′′ ( limx ∞ ϕ(x)= limx +∞ ϕ(x)= 0, →− → admits a strictly positive solution (called ground state). Hint. Similarly as in the previous exercise, one may employ a shooting method. Show that the initial value problem

ϕ + f (ϕ)= 0 in R, − ′′  ϕ(0)= ζ,   ϕ′(0)= 0,  admits for appropriate ζ >0 a global solution which is positive, symmetric with respect to the origin, and which satisfies limx ∞ ϕ(x)= 0. In order to study solutions of this initial value problem, it is conveni→± ent to note that the 1 ϕ 2 ϕ quantity 2 ′(x) F( (x)) does not depend on x, and to study a scalar first order ordinary differential− equation instead. b) Show that the solution found in (a) belongs to H1(R). c) Show that the semilinear diffusion equation above admits a continuumof equi- librium points in H1(R). Hint. Consider translates of ϕ.

Lecture 12 Asymptotic behaviour of solutions of gradient systems II

In the years 1963 and 1965 appeared the following result of S. Łojasiewicz [Łojasiewicz (1963)], [Łojasiewicz (1965)].

Theorem 12.1 (Łojasiewicz). Let E : U R be an analytic function defined on an open set U Rd, and let ϕ U be an→ equilibrium point of E . Then there exist θ ⊆ 1 σ ∈ ϕ σ constants (0, 2 ], > 0 and C 0 such that for every u U with u one has ∈ ≥ ∈ k − k≤ 1 θ E (u) E (ϕ) − C E ′(u) Rd . (12.1) | − | ≤ k k( )′ The inequality (12.1), which may also be written in the form

θ θ ((E (u) E (ϕ)) )′ Rd > 0, k − k( )′ ≥ C is usually called Łojasiewicz inequality or, more precisely, Łojasiewicz gradient inequality. Theorem 12.1 expresses a particular, regular behaviour of analytic functions of several variables near equilibrium points, a behaviour which is in general not shared by C∞ functions. Its proof, however, is quite involved from the point of view of this lecture course, and we therefore omit it.

We start this lecture by stating the above theorem, because the Łojasiewicz inequality is a (or rather the?) fundamental tool in order to show convergence to equilibrium of global and bounded solutions of gradient systems. The discovery that the inequality may be applied in the context of gradient systems is due to Łojasiewicz himself. In his articles he proved that every global and bounded solution of a gradient system in Rd and with analytic energy converges to a single equilibrium. This result is in contrast to the mexican hat example of Palis and de Melo where the energy is C∞ (!) and the associated gradient system admits a global and bounded solution which does not converge.

We emphasize that one should really distinguish two independent problems in the context of the Łojasiewicz inequality and the convergence to equilibrium,

129 130 12 Asymptotic behaviour of solutions of gradient systems II although Łojasiewicz’ name is connected to both of them: the first problem is the problem of convergence to equilibrium of global solutions of gradient systems under the assumption that the Łojasiewicz inequality holds – this problem is basically solved with Theorem 12.2 below, be repeating essentially Łojasiewicz’ idea. The second and more difficult problem is to know whether a concrete energy does satisfy the Łojasiewicz inequality near some particular equilibrium point. This problem is independent of the first one and independent of the formulation of a particular gradient system. A positive answer to the second question, that is, a proof of the Łojasiewicz inequality for a particular energy function, may have consequences for many different gradient systems, since out of one energy function one can generate a variety of gradient systems through the choice of the metric.

In this lecture we discuss both problems in two independent sections. In the first section we show that the infinite-dimensional variant of the Łojasiewicz inequal- ity implies convergence to equilibrium of global and bounded solutions of abstract gradient systems. In the second section we prove the Łojasiewicz inequality in two comparatively simple situations.

12.1 The Łojasiewicz-Simon inequality and stabilisation of global solutions of gradient systems

Let V be a Banach space, and let H be a Hilbert space such that V is densely and continuously embedded into H. Let U V be open, and let E : U R be a contin- uously differentiable function. Let g : U⊆ Inner(H) be a metric on→U. We assume → that there exist constants c1, c2 > 0 such that, for every v H and every u U, ∈ ∈

c1 v H v c2 v H . (12.2) k k ≤k kg(u) ≤ k k 1,2 R R Theorem 12.2. Let u Wloc ( +;H) C( +;V) be a solution of the gradient sys- tem ∈ ∩ u˙ + ∇gE (u)= 0. (12.3)

Assume that u has relatively compact range in U, and assume that for every t R+ the energy equality ∈

t 2 E E u˙(s) g(u(s)) ds + (u(t)) = (u(0)) (12.4) 0 k k Z holds. Assume further that there exists ϕ ω(u), θ (0, 1 ], σ > 0 and C 0 such ∈ ∈ 2 ≥ that for every v U with v ϕ V σ one has ∈ k − k ≤ 1 θ E (v) E (ϕ) − C E ′(v) . (12.5) | − | ≤ k kV ′

Then limt ∞ u(t) ϕ V = 0. Moreover, as t ∞, → k − k → 12.1 The Łojasiewicz-Simon inequality and stabilisation of global solutions of gradient systems131

O(e ct ) for some c > 0, if θ = 1 , ϕ − 2 u(t) H = θ θ (12.6) k − k ( O(t /(1 2 )) if θ (0, 1 ). − − ∈ 2 In the following, the inequality (12.5) is called Łojasiewicz-Simon inequality, in order to honour also the work of L. Simon who generalized Łojasiewicz’ ideas to energies and gradient systems on infinite-dimensional spaces. For example, Simon applied the Łojasiewicz-Simon inequality in order to prove convergence to equilibrium of solutions of semilinear diffusion equations with analytic energy [Simon (1983)].

Note that the energy equality (12.4) implies that the composite function E (u) 1,1 R belongs to Wloc ( +), by Lemma 5.9. In the following, derivatives of the composite function are to be understood as weak derivatives. By (12.3), the energy equality may also be written in the form

t ∇ E 2 E E g (u(s)) g(u(s)) ds + (u(t)) = (u(0)) or 0 k k Z t ∇ gE (u(s)) g(u(s)) u˙(s) g(u(s)) ds + E (u(t)) = E (u(0)). 0 k k k k Z Proof (of Theorem 12.2). By the energy equality (12.4), the function E is nonin- creasing along u, and if E (u) is constant, then the function u is constant. Since ϕ ω(u) by assumption, there exists an unbounded sequence (tn) R+ such that ∈ ⊆ limn ∞ u(tn)= ϕ. Since E is continuous, we obtain limn ∞ E (u(tn)) = E (ϕ). Since → → E is nonincreasing along u, it follows that E (u(t)) E (ϕ) and limt ∞ E (u(t)) = E (ϕ). ≥ → If E (u(t0)) = E (ϕ) for some t0 0, then E (u(t)) = E (ϕ) for every t t0, and ≥ ≥ therefore, by the energy equality (12.4),u ˙(t)= 0 for t t0. In this case, the function ≥ u is constant for t t0, and the assertions about convergence and decay estimate hold trivially. ≥ So we can assume that E (u(t)) > E (ϕ) for every t 0. Let, for every t 0, ≥ ≥ θ H (t) := (E (u(t)) E (ϕ)) . −

Then H is nonincreasing, H (t) > 0 for every t 0, and limt ∞ H (t)= 0. Let ≥ → t0 0 be such that u(t0) ϕ V < σ, and define ≥ k − k

t1 := inf t t0 : u(t) ϕ V = σ . { ≥ k − k }

By continuity of the function u, we have t1 > t0. By using successively the chain rule, the energy equality (12.4) and the fact that u is a solution of (12.3), and the assumption (12.2), we obtain for almost every t [t0,t1), ∈ 132 12 Asymptotic behaviour of solutions of gradient systems II

d θ 1 d H (t)= θ (E (u(t)) E (ϕ)) − E (u(t)) −dt − − dt θ 1 = θ (E (u(t)) E (ϕ)) − ∇gE (u(t))  u˙(t) − k kg(u(t)) k kg(u(t)) θ 1 θ c1 (E (u(t)) E (ϕ)) − ∇gE (u(t)) u˙(t) H . ≥ − k kg(u(t)) k k

By using the assumption (12.2) again, we obtain that for every t R+ the estimate ∈ ∇ E ∇ E g (u(t)) g(u(t)) = sup g (u(t)),v g(u(t)) k k v 1h i k kg(u(t))≤ = sup E (u(t)),v ′ V ′,V v 1h i k kg(u(t))≤ sup E (u(t)),v ′ V ′,V ≥ c v H 1h i 2 k k ≤ sup E (u(t)),v ′ V ′,V ≥ c c v V 1h i 3 2 k k ≤ 1 = E ′(u(t)) V , c3 c2 k k ′ where c3 > 0 is the constant of the embedding V ֒ H. This estimate and the → Łojasiewicz-Simon inequality (12.5) imply that, for almost every t [t0,t1), ∈ θ d c1 θ 1 H (t) (E (u(t)) E (ϕ)) − E ′(u(t)) V u˙(t) H (12.7) −dt ≥ c2 c3 − k k ′ k k θ c1 u˙(t) H . ≥ Cc2 c3 k k

Hence, for almost every t [t0,t1), ∈

u(t) ϕ H u(t) u(t0) H + u(t0) ϕ H (12.8) k − k ≤k t − k k − k u˙(s) H ds + u(t0) ϕ H ≤ t k k k − k Z 0 Cc2 c3 H (t0)+ u(t0) ϕ H. ≤ θ c1 k − k n R σ n Now let (t0 ) + be an unbounded, increasing sequence such that > u(t0 ) ϕ ⊆ n n k − V 0, and define the corresponding t1 as above. Assume that t1 was finite for k → n n ϕ σ every n. Then, by definition of t1 and by continuity of u, u(t1 ) V = for every n. Since u has relatively compact range in U V, we cank extract− ak subsequence of n ⊆ n n ψ (t1 ) (which we denote for simplicity again by (t1 )) such that limn ∞ u(t1 )=: . By → continuity of the norm, ψ ϕ V = σ > 0. On the other hand, from the inequality ψ ϕk − k n ϕ (12.8) we obtain H = limn ∞ u(t1 ) H = 0, which is a contradiction. k − k →n k − k 1 Hence, for some n large enough, t =+∞. By (12.7), this impliesu ˙ L (R+;H). 1 ∈ By Cauchy’s criterion, limt ∞ u(t) exists in H. By using the relative compactness of the range of u in V, a subsubsequence→ argument, and since ϕ ω(u), this implies ∈ limt ∞ u(t)= ϕ in V. → 12.2 The Łojasiewicz-Simon inequality in Hilbert spaces 133

In order to prove the decay estimate, let t0 0 be large enough such that u(t) ≥ k − ϕ V σ for every t t0. By the inequality (12.7), for every t t0, k ≤ ≥ ≥ ∞ u(t) ϕ H u˙(s) H ds (12.9) k − k ≤ t k k Z Cc c 2 3 H (t). ≤ θ c1

Moreover, by the energy equality (12.4), for almost every t t0, ≥ d θ 1 d H (t)= θ (E (u(t)) E (ϕ)) − ( E (u(t))) −dt − −dt θ 1 2 = θ (E (u(t)) E (ϕ)) − ∇gE (u(t)) − k kg(u(t)) θ E E ϕ θ 1 E 2 = 2 2 ( (u(t)) ( )) − ′(u(t)) V c2 c3 − k k θ θ E u t E ϕ 1 2 2 2 ( ( ( )) ( )) − ≥ C c2 c3 − θ 1 θ H t −θ = 2 2 2 ( ) . C c2 c3 Now, if we integrate the resulting inequality

d H θ 1 d 1 θ dt (log (t)) if = 2 θ H t H t −θ − ( ) ( )− = θ 1 2θ 2 2 2 −dt  d H −θ θ 1  ≥ C c c 1 2θ dt (t)− if (0, 2 ) 2 3  − ∈  over the interval (t0,t), then we obtain the estimate 

O(e ct ) if θ = 1 , H − 2 (t)= θ θ ( O(t /(1 2 )) if θ (0, 1 ). − − ∈ 2

Combining this estimate with (12.9), the decay estimate for u ϕ H follows. k − k

12.2 The Łojasiewicz-Simon inequality in Hilbert spaces

Let V be a Hilbert space and let U V be an open subset. Let E C2(U), and let ⊆ ∈ ϕ U be an equilibrium point of E , that is, E ′(ϕ)= 0. We formulate conditions which∈ imply that E satisfies the Łojasiewicz-Simon inequality near ϕ, that is, that θ 1 σ ϕ σ there exists (0, 2 ], > 0 and C 0 such that for every u U with u V one has ∈ ≥ ∈ k − k ≤ 1 θ E (u) E (ϕ) − C E ′(u) . (12.10) | − | ≤ k kV ′ 134 12 Asymptotic behaviour of solutions of gradient systems II

Although in this inequality only the first derivative of E appears, the conditions and arguments in this section involve the second derivative, too. Recall that the Fr´echet derivative E maps V into V , and that the second derivative at a point u U ′ ′ ∈ is thus a linear operator from V into V ′.

Theorem 12.3. Assume that E ′′(ϕ) is continuously invertible. Then E satisfies the ϕ θ 1 Łojasiewicz-Simon inequality near with = 2 .

Proof. Consider the Taylor expansions of E and E ′,

2 E (u)= E (ϕ)+ E ′(ϕ)(u ϕ)+ O( u ϕ ) and − k − kV E ′(u)= E ′(ϕ)+ E ′′(ϕ)(u ϕ)+ o( u ϕ V). − k − k

Since E ′(ϕ)= 0, the first one yields

E (u) E (ϕ) C u ϕ 2 | − |≤ k − kV in a neighbourhood of ϕ. Since E (ϕ) is invertible, one has E (ϕ)(u ϕ) ′′ k ′′ − kV ′ ≥ c u ϕ V for every u U and some constant c > 0. Hence, the second Taylor expansionk − k yields ∈ c E ′(u) u ϕ V k kV′ ≥ 2 k − k in a neighbourhood of ϕ. Combining the preceding two inequalities yields

1 E (u) E (ϕ) 2 C E ′(u) | − | ≤ k kV ′ in a neighbourhood of ϕ, and this is the claim.

The short proof of Theorem 12.3 relies only on Taylor expansion and the definition of the Fr´echet derivative. This idea can be generalized to situations in which E ′′(ϕ) is not necessarily invertible, but in which, by means of the implicit function theorem and a nonlinear decomposition of the space V, one can reduce the arguments to the invertible case. In the rest of this section, we assume that E ′′(ϕ) is a Fredholm operator, that is, the kernel kerE (ϕ)= u V : E (ϕ)u = 0 is ′′ { ∈ ′′ } finite dimensional and the range rgE ′′(ϕ)= E ′′(ϕ)u : u V is closed in V ′ and has finite codimension. { ∈ }

Let P L (V ) be any projection onto kerE ′′(ϕ). Then V is the direct topological sum ∈

V = V0 V1 ⊕ = rgP kerP ⊕ = kerE ′′(ϕ) kerP. ⊕ Moreover, the spaces rgP , kerP V , where P L (V ) is the adjoint projection, ′ ′ ⊆ ′ ′ ∈ ′ may be naturally identified with the dual spaces V0′ and V1′, respectively, and we 12.2 The Łojasiewicz-Simon inequality in Hilbert spaces 135 simply write V1′ = kerP′.

Symmetry of the second derivative of real-valued functions is a well known property of twice continuously differentiable functions defined on Rd; see [Cartan (1967), Th´eor`eme 5.1.1] for the analogous result for functions defined on a Banach space.

Theorem 12.4 (Schwarz). For every u U the linear operator E ′′(u) : V V ′ is symmetric, that is, for every v, w V onehas∈ → ∈

E ′′(u)v,w = E ′′(u)w,v . h iV ′,V h iV ′,V By Schwarz’ theorem, for every u, v V, ∈

P′E ′′(ϕ)u,v = E ′′(ϕ)u,Pv h iV ′,V h iV ′,V = E ′′(ϕ)Pv,u h iV ′,V = 0, where in the last inequality we have used that P projects onto the kernel of E ′′(ϕ). This equality implies that

rgE ′′(ϕ) kerP′ = V ′ (12.11) ⊆ 1

Lemma 12.5. The linear operator E (ϕ) : V1 V is continuously invertible. ′′ → 1′

Proof. Since V1 V0 = V1 kerE ′′(ϕ)= 0 , the operator E ′′(ϕ) is injective on V1. ∩ E∩ ϕ { } E ϕ We next prove that ′′( ) has dense range in V1′. Note that ′′( )(V1) = E ϕ E ϕ ′′( )(V ). Hence, in order to show that ′′( ) has dense range in V1′, it suffices show that u V1 and ∈ u = 0. E ′′(ϕ)v,u V V = 0 for every v V ⇒ h i ′, ∈  This implication, however, follows immediately from Schwarz’ theorem and injec- tivity of E ′′(ϕ) on V1. By assumption, E ′′(ϕ) has closed range in V ′, so that E ′′(ϕ) is surjective, hence bijective from V1 onto V1′. The assertion of the lemma follows now from the bounded inverse theorem.

Lemma 12.6. Let P L (V) be a projection onto kerE (ϕ), and define the set ∈ ′′

S := u U : (I P′)E ′(u)= 0 , { ∈ − } Then, locally near ϕ, S a , called critical manifold, satisfy- ing dimS = dimkerE ′′(ϕ). k k 1 If E C (U) for some k 2,thenS is aC − -manifold. If E is analytic, then S is analytic.∈ ≥ 136 12 Asymptotic behaviour of solutions of gradient systems II

Proof. Consider the function

G : V = V0 V1 U V ′, ⊕ ⊇ → 1 u = u0 + u1 (I P′)E ′(u). 7→ −

Since E ′ is continuously differentiable, the function G is continuously differen- tiable, too. Moreover, G(ϕ)= G(ϕ0 +ϕ1)= 0 and G′(ϕ) = (I P′)E ′′(ϕ)= E ′′(ϕ) ∂ (see (12.11) for the last equality). By Lemma 12.5, the partial− derivative G (ϕ)= ∂ u1 E (ϕ) V : V1 V is an isomorphism (continuously invertible). Hence, by the im- ′′ | 1 → 1′ plicit function theorem (Theorem 9.2), there exists a neighbourhood U0 V0 of ϕ0, 1 ⊆ a neighbourhood U1 V1 of ϕ1, U0 +U1 U, and a function g C (U0;U1) such ⊆ ⊆ ∈ that g(ϕ0)= ϕ1 and

u U0 +U1 : G(u)= 0 = (u0,g(u0)) : u0 U0 . { ∈ } { ∈ } By definition of the function G and the set S, the set on the left-hand side of this equality is just the intersection of S with the neighbourhoodU0 +U1 of ϕ. So, locally near ϕ, S is the graph of the differentiable function g, that is, S is a differentiable manifold. Higher regularity of the manifold S in the case of higher regularity of E follows immediately from the implicit function theorem. The lemma is proved.

Note that the critical manifold may not be a submanifold of V, but by Lemma 12.6 it is locally near ϕ a submanifold(it is the graph of the implicit function g). The critical manifold depends on the choice of the projection P, but it always contains the set of all equilibrium points S0:

S0 := u U : E ′(u)= 0 S. { ∈ }⊆ Theorem 12.7. Define the critical manifold as in Lemma 12.6, and assume that the restriction E S satisfies the Łojasiewicz-Simon inequality near ϕ, that is, there exist constants θ| (0, 1 ], σ > 0 and C 0 such that for every u U S (!) with ∈ 2 ≥ ∈ ∩ u ϕ V σ one has k − k ≤ 1 θ E (u) E (ϕ) − C E ′(u) . | − | ≤ k kV ′ Then E itself satisfies the Łojasiewicz-Simon inequality near ϕ with the same expo- nent θ.

Proof. Choose the neighbourhood U := U0 +U1 of ϕ and the implicit function g : U0 U1 as in the proof of Lemma 12.6. Suppose that U is sufficiently small so that → the restriction E S satisfies the Łojasiewicz-Simon inequality in U S. We define a nonlinear| projection Q : U U by ∩ →

Qu = Q(u0 + u1) := u0 + g(u0).

Note that, for every u U, ∈ 12.2 The Łojasiewicz-Simon inequality in Hilbert spaces 137

Qu S and ∈ u Qu V1. − ∈ Moreover, Qϕ = ϕ. For every u U, the Taylor expansion of E at Qu is ∈ E (u) E (Qu)= − 1 2 = E ′(Qu),u Qu + E ′′(Qu)(u Qu),u Qu + o( u Qu ). h − iV′,V 2h − − iV′,V k − k

By definition of V1, and by definition of the manifold S,

E ′(Qu),u Qu = E ′(Qu),(I P)(u Qu) h − iV′,V h − − iV′,V = (I P′)E ′(Qu),u Qu h − − iV′,V = 0, that is, the first term on the right-hand side of the Taylor expansion of E is zero. Therefore, and since E ′′ is uniformly bounded on U by continuity, if we choose U small enough, we have for every u U ∈ E (u) E (Qu) C u Qu 2 . (12.12) | − |≤ k − kV From now on, the constant C may vary from line to line. By the definition of differ- entiability, E ′(u) E ′(Qu)= E ′′(Qu)(u Qu)+ o( u Qu ). (12.13) − − k − k We apply the projection I P to this equality and use that Qu S in order to obtain − ′ ∈

(I P′)E ′(u) = (I P′)E ′′(Qu)(u Qu)+ o( u Qu ). (12.14) − − − k − k E ϕ E ϕ By Lemma 12.5, the operator (I P′) ′′( ) = ′′( ) : V1 V1′ is continu- ously invertible. Hence, by continuity− and if we choose U small→ enough, then E (I P′) ′′(Qu) : V1 V1′ is continuously invertible for all u U and the inverses are− uniformly bounded→ in U. Hence, by (12.14), there exists a∈ constant C 0 such that for every u U ≥ ∈

u Qu V C (I P′)E ′(u) C E ′(u) . (12.15) k − k ≤ k − kV′ ≤ k kV ′ By (12.13) and (12.15),

E ′(Qu) E ′(u) +C u Qu V C E ′(u) . (12.16) k kV ′ ≤k kV′ k − k ≤ k kV ′

Combining the estimates (12.12) and (12.15) with the assumption that E S satisfies the Łojasiewicz-Simon inequality in U S, and using also the estimate (12.16),| we obtain that for every u U ∩ ∈ 138 12 Asymptotic behaviour of solutions of gradient systems II

E (u) E (ϕ) E (u) E (Qu) + E (Qu) E (ϕ) | − | ≤ | − | | − | E 2 E 1/(1 θ) C ′(u) V +C ′(Qu) − ≤ k k ′ k kV ′ E 2 E 1/(1 θ) C ′(u) V + ′(u) − ≤ k k ′ k kV′ Choosing U sufficiently small, we can by continuity assume that E (u) 1 for ′ V ′ every u U. Since θ (0, 1 ], we then obtain k k ≤ ∈ ∈ 2 1/(1 θ) E (u) E (ϕ) C E ′(u) − for every u U, | − |≤ k kV ′ ∈ and this is the claim of Theorem 12.7.

Corollary 12.8. Let E C2(U), let ϕ V be an equilibrium point, and assume that ∈ ∈ E ′′(ϕ) is a Fredholm operator. Define the critical manifold S as in Lemma 12.6. Assume that the set of all equilibrium points,

S0 := u V : E ′(u)= 0 , { ∈ } forms a neighbourhood of ϕ in S. Then E satisfies the Łojasiewicz-Simon inequality θ 1 with exponent = 2 .

Proof. By assumption, the derivative E ′ is zero in a neighbourhood of ϕ in S. This implies that the restriction E S is constant in the same neighbourhood. A constant function trivially satisfies the| Łojasiewicz-Simon inequality for the Łojasiewicz ex- θ 1 ponent = 2 . The claim follows from Theorem 12.7.

12.3 Exercises

12.1. a) Let F : R R be a k times continuously differentiable function (k 2) → (k 1) (k) ≥ such that F(0)= F′(0)= = F − (0)= 0 and F (0) = 0. Show that F ··· θ 6 1 satisfies the Łojasiewicz inequality near 0 with exponent = k . Deduce that every analytic function F : R R satisfies the Łojasiewicz inequality near every equilibrium point. → b) Let F(u)= uk for some k 2. Solve the differential equationu ˙ + F (u)= ≥ ′ 0, show that every solution satisfies limt ∞ u(t)= 0, and compare the actual decay rate of u with the decay rate obtained→ by Theorem 12.2 and (a). Remark. The proofs in (a) rely on Taylor expansions and should be elementary. However, the proofs in the one-dimensional case considered here do not give a real hint how to prove the corresponding result in higher dimensions like, for example, Theorem 12.1.

12.2. a) Let V be a Hilbert space and E : V R be a continuous, coercive, → quadratic form. Show that E ′′(0) is invertible. Hint. Use the Riesz-Fr´echet theorem (Theorem D.53). 12.3 Exercises 139

b) Let c : R R be a continuous function such that limx ∞ c(x)= 0. Show that the (linear)→ multiplication operator →±

H1(R) L2(R), → u cu 7→ is compact. Hint. One may use that for every bounded interval (a,b), the embedding .(H1(a,b) ֒ L2(a,b) is compact (see Exercise 7.1 → c) Let c : R R be a continuous function such that limx ∞ c(x)= c0 > 0, and consider the→ quadratic form →±

E : H1(R) R, → 1 2 2 u (ux + cu ). 7→ 2 R Z

Show that E ′′(0) is a Fredholm operator. Hint. By the Riesz-Schauder Theorem, every operator of the form I + K with K compact, is a Fredholm operator. 12.3. We recall the setting of Exercise 11.4. Let f : R R be a continuously differ- ζ ζ ζ → ζ ζ entiable function, define F( ) := 0 f (s) ds, let 0 := inf > 0 : F( ) 0 , and assume that { ≤ } R

f (0)= 0 and f ′(0) > 0,

0 < ζ0 < ∞ and f (ζ0) < 0.

Consider the energy

E : H1(R) R, → 1 2 u ux + F(u). 7→ 2 R R Z Z

a) Show that E ′′(0) is invertible. Conclude that E satisfies the Łojasiewicz- θ 1 Simon inequality near 0 with exponent = 2 . b) Let ϕ be the ground state found in Exercise 11.4 (a), that is, a solution ϕ 0, ϕ = 0 of the problem ≥ 6 ϕ + f (ϕ)= 0 in R, − ′′ ( limx ∞ ϕ(x)= 0. →±

Show that E ′′(ϕ) is a Fredholm operator.

c) In addition to (b), show that the kernel of E ′′(ϕ) is one-dimensional. Hint. Show that ψ C2(R) H1(R) belongs to the kernel of E (ϕ) if and ∈ ∩ ′′ 140 12 Asymptotic behaviour of solutions of gradient systems II

only if ψ + f (ϕ)ψ = 0 in R, − ′′ ′ ( limx ∞ ψ(x)= 0. →± One solution is ψ = ϕ′. A second solution ψ may be found by substitution ψ z := ( ϕ )′ which leads to a first order differential equation. ′ 1 d) Show that, locally near ϕ, the set of all equilibrium points S0 = u H (R) : { ∈ E ′(u)= 0 and the critical manifold S defined in Lemma 12.6 coincide. Con- clude that}E satisfies the Łojasiewicz-Simon inequality near 0 with exponent θ 1 = 2 . Lecture 13 The universe, soap bubbles and curve shortening

I do not suppose that there is any one in this room who has not occasionally blown a common soap-bubble, and while admiring the perfection of its form, and the marvellous brilliancy of its colours, wondered how it is that such a magnificent object can be so easily produced. I hope that none of you are yet tired of playing with bubbles, ... C. V. Boys, author of Soap-bubbles and the Forces which Mould Them, 1896

My grandfather talked continuously about soap bubbles, and of course in mathematical terms. I did not understand a word of what he said. Bernhard Caesar Einstein, the grandson of Albert Einstein

To be clear from the very beginning of this last ISEM lecture: there will be no mathematical discussion of any models describing the evolution of the universe or of soap bubbles in the air. However, the title shall suggest that there is an important class of evolution problems in physics, chemistry or engineering dealing with moving curves and surfaces. Some of these models are gradient systems.

Moving curves and surfaces can be observed in everyday’s life, for example in the form of soap bubbles blown into the air and moving away by wind or gravitational forces. However, these soap bubbles are not only moving as a whole through the air, but they exhibit also inner deformations due to various surface forces acting in the soap film. Especially for large bubbles such deformations can be observed with the eye; smaller bubbles seem to exist only as round Soap bubbles, Jean Sim´eon Chardin, spheres. Everybody who is not yet tired mid-18th century of playing with bubbles, can repeat this experiment, and everybody who is curious to understand this model in a deeper way may formulate the evolution equations behind it and analyze it from a mathematical point of view. Besides soap bubbles, vibrating strings or plates, the water surface on a lake or in a glass of water, the sharp interface separating two chemical

141 142 13 The universe, soap bubbles and curve shortening phases, or the interface between a piece of ice and surrounding water are further examples of surfaces which change their shape with time and which may be easily observed in our daily life. The description of the physical and chemical forces or processes acting on / in these surfaces leads to partial differential equations, and the mathematical analysis of these partial differential equations may allow us to obtain a deeper understanding of the various models.

In the theory of relativity or gravitational physics, one deals with a surface at a much different scale at which our human intuition is of no big help. If we believe the experts, then the universe is a 3-dimensional manifold which changes its shape and size with time due to the existence of matter, and it is a challenge to understand this evolution, too. Perhaps this evolution is driven by an energy which is, from the mathematical point of view, very similar to the elastic energy inside a surface, for example, a football. Perhaps this elastic energy is exactly the energy which is nowadays also used in image processing where moving surfaces also appear.

For esthetical reasons we would very much like to discuss the evolution of soap bubbles, believing that this model is relatively accessible. However, in order to sim- plify the presentation, we considers the evolution of curves instead of surfaces.

Curves and curvature vector

A smooth curve is a subset Γ Rd which is the image of a continuously differ- entiable function u : I Rd (I⊆ R an interval) with never vanishing derivative. The function u is called→ parametrization⊆ of the curve Γ . One curve admits infinitely many parametrizations, since for every C1 diffeomorphism θ : J I between two intervals J, I R the composite function u θ is also a parametrization.→ ⊆ ◦ Let Γ be a curve with a C2 parametrization u : I Rd. Then, for every x I the → ∈ vector u′(x) is the to Γ at u(x). The curvature vector at u(x) is defined by

1 u′(x) κ(u(x)) := ′, u′(x) u′(x) | || | where denotes the euclidean norm in Rd. The curvature vector κ measures, by its length,| · | how much the curve Γ is curved (see Lemma 13.1 below). Moreover, for planar curves it points into the direction where the curve is convex. The curvature vector depends only on the point u(x) Γ and shape of the curve Γ , but it does not depend on the particular parametrization∈ of Γ . In fact, if v : J Rd is another C2 parametrization of Γ , v = u θ for some C2 diffeomorphism θ→: J I, then, at the point v(x)= u(θ(x)) Γ , ◦ → ∈ 13 The universe, soap bubbles and curve shortening 143

1 v′ 1 (u θ)′(x) ′ (x)= θ ◦ θ ′ v′ v′ (u )′(x) (u )′(x) | | | |  | ◦ | | ◦ |  1 u′(θ(x)) = sgnθ ′(x) ′ u (θ(x)) θ (x) u (θ(x)) | ′ || ′ | | ′ | 1 u′  = ′(θ(x))θ ′(x) sgnθ ′(x) u (θ(x)) θ (x) u | ′ || ′ | | ′| 1 u′  = ′ (θ(x)). u′ u′ | || |  2 d Given a C parametrization u : I R of a curve Γ , ds = u′(x) dx is the in- → x+h ξ ξ | | finitesimal arclength of the curve at u(x), x u′( ) d is the length of the arc segment running from u(x) to u(x + h) (respecting| | the orientation of u), and R E (u)= u (x) dx is the total length of the curve Γ . I | ′ | LemmaR 13.1. Fix x I. For h R small, let θ(h) be the angle between u (x + h) ∈ ∈ ′ and u′(x), let s(h) be the length of the arc segment running from u(x) to u(x + h). Then θ(h) κ(u(x)) = lim . | | h 0 s(h) →

u x u x h Proof. Consider the triangle with vortices 0, ′( ) and ′( + ) . By definition, θ(h) u′(x) u′(x+h) is the angle at the vortex 0, and the two sides| forming| this| ang| le have length equal to 1. By bisecting the angle θ(h), one easily sees that

θ(h) 1 u (x + h) u (x) sin = ′ ′ . 2 2 u′(x + h) − u′(x) | | | | Then θ(h) θ(h)/2 2 sin(θ(h)/2) lim = lim h 0 s(h) h 0 sin(θ(h)/2) s(h) → → x+h u′(x + h) u′(x) 1 1 = lim h u′(ξ ) dξ − h 0 u′(x + h) − u′(x) h x | | → | | | | Z u ′(x) 1   = ′ u′(x) u′(x) | | | | = κ(u(x)) . | | Given a C2 parametrization u : [a,b] Rd of a curve Γ , we define s(x) := x ξ ξ → Γ a u′( ) d . Then s : [a,b] [0,L] (L the length of the curve ) is strictly increas- ing,| surjective| and twice continuously→ differentiable. Moreover, γ(x) := u(s 1(x)) R − defines another C2 parametrization of Γ , called the arclength parametrization. The arclength parametrization has unit speed, that is, the first derivative has unit length, and the second derivative coincides with the curvature vector. Moreover, with the help of the arclength parametrization one easily sees that the curvature vec- 144 13 The universe, soap bubbles and curve shortening tor is always orthogonal to the . These properties are summarized in the following lemma.

Lemma 13.2. For every x [0,L], the arclength parametrization γ satisfies ∈ a) γ (x) = 1, | ′ | b) κ(γ(x)) = γ′′(x), and

c) γ′(x)γ′′(x)= 0. Proof. One has,

1 1 1 u′(s− (x)) (a) γ′(x) = u′(s− (x))(s− )′(x) = | | = 1, | | | | s (s 1(x)) | ′ − | 1 γ′(x) (b) κ(γ(x)) = ′ = γ′′(x) (by part (a)), and γ (x) γ (x) | ′ | | ′ | d 1 2  (c) γ′(x)γ′′(x)= γ′(x) = 0 (by part (a)). dx 2| | Example 13.3. Let Γ be the circle with radius r > 0 and center 0. The function u : R R2, u(x) = (r cosx,r sinx) is a parametrization. For every x R, → ∈ 1 κ(u(x)) = (cosx,sinx) − r is the curvature vector at u(x) Γ . The curvature vector points into the disk enclosed by Γ and has length = 1/r. ∈

The curve shortening flow

The curve shortening problem arises in several contexts and applications. Consider first the following problem: given two points in Rd, find the shortest curve con- necting the two points. If one gently ignores the well-known solution, that is, the straight line between the two points, then this problem – as it is posed – is exactly the problem of finding a minimizer of a real-valued function defined on a certain set of curves. This function assigns to each curve its length and is called the length functional in the following.

Similarly, one may consider the analogous problem of finding curves with minimal length for curves in Riemannian . A solution is then not so easy to imagine. By finding curves with minimal length, G. D. Birkhoff solved in 1917 a problem about the existence of periodic solutions of the Lagrange equations from classical mechanics; for this he was awarded the Bˆocher Memorial Prize by the American Mathematical Society in 1923. Periodic solutions of the Lagrange equations correspond to closed curves of minimal length, that is, closed geodesics, 13 The universe, soap bubbles and curve shortening 145 in certain Riemannian surfaces determined by the Lagrangian. Starting from appropriate closed curves, Birkhoff proposed a discrete curve shortening algorithm which, step by step, constructed closed curves of shorter and shorter length and he showed that the algorithm converges to a desired solution.

In R2, Birkhoff’s idea of curve shorten- ing can be described as follows. Let Γ0 be an initial curve, and take three points A, B, Γ C 0 sufficiently close together⌢ and such that∈ B lies inside the arc segment AC. Then we may construct a shorter curve Γ1 by sim- Γ ply taking⌢ the curve 0 outside the arc seg- ⌢ment AC, and by replacing the arc segment AC by the straight line AC. This shortens the curve Γ0 but the new curve Γ1 is in general 1 no longer parametrized by a C function; Γ1 is actually not really a curve in the sense of Fig. 13.1 Curve shortening by moving B our definition. Intuitively, instead of replac- into the direction of the curvature vector ⌢ ing the arc segment⌢ AC by the line segment AC, we may say that the point B on the arc segment AC should be slightly moved into the direction of the curvature vector κ(B). This should also shorten the curve since the curvature vector points into the direction where the straight line AC lies. Similarly, if A , A , ..., A is a finite family of points on Γ such that the open arc ⌢0 1 n 0 segments A A are mutually disjoint, we may shorten the curve Γ , by taking for j j+1 ⌢ 0 every j = 0, ..., n 1 some point B j A jA j 1 and by moving the curve slightly into − ∈ + the direction of the curvature vector κ(B j). By taking finer and finer partitions of the curve Γ0, and by making shorter and shorter steps into the direction of the curvature vector, one may arrive at a continuous version of curve shortening. Without making a formal limiting process, it is conceivable that we are looking for a family of curves Γ ( t )t [0,T ] parametrized by a function u = u(t,x) (that is, u(t, ) is a parametrization ∈ · of Γt ) such that ut (t,x)= κ(u(t,x)). (13.1) Indeed, this equation expresses the idea that, at every time t, points of the curve Γt moves into the direction of the curvature vector. In (13.1), the ut with which the points move is equal to the curvature vector. Equation (13.1) is called the curve shortening flow equation. Depending on the particular context, it has to be equipped with boundary conditions and an initial condition.

Let us approach the curve shortening problem in a different way. Although the objects of interest are curves and although the length functionalis definedon a set of curves, we think of it being defined on the correspondingset of all parametrizations. In this way, the length functional is defined on a subset of a linear space, and it is possible to study its continuity or differentiability, to consider gradients of the 146 13 The universe, soap bubbles and curve shortening

length functional and to search for a minimizer by a steepest descent method, that is, by a gradient system. If u is a solution of a gradient system associated with the length functional, then, for each “time” t, u(t) is a parametrization of a curve Γt and we expect that the length of Γt is decreasing with time.

Instead of considering curves connecting two given points, let us consider in the following only closed curves, that is, by definition, curves which admit a periodic parametrization defined on R. Without loss of generality, it suffices to consider 2π- periodic parametrizations, and we define accordingly

1 d P2π := u C (R;R ) : u(x)= u(x + 2π) and u′(x) = 0 for every x R . { ∈ 6 ∈ } This set is an open subset of the Banach space

1 1 d C π := u C (R;R ) : u(x)= u(x + 2π) for every x R . 2 { ∈ ∈ }

Every element u P2π is a parametrization of some closed curve Γ with finite ∈ length. In fact, ds = u′(x) dx is the infinitesimal arclength of the curve at u(x), b | | a u′(x) dx is the length of the arc segment running from u(a) to u(b) (respecting the| orientation| of u), and R 2π E (u)= u′(x) dx 0 | | Z is the total length of Γ . The function E : P2π R thus defined is the length func- → tional on the set of closed curves. If u, v P2π are two parametrizations of the same curve Γ , v = u θ for some diffeomorphism∈ θ : [0,2π] [0,2π], then a implies◦ →

2π E (v)= (u θ)′(x) dx 0 | ◦ | Z 2π = u′(θ(x)) θ ′(x) dx 0 | || | Z 2π = u′(y) dy 0 | | Z = E (u).

Hence, the length functional E depends only on the curve Γ and not on the particular parametrization u P2π . Since the euclidean norm is continuously differentiable d ∈ |·| in R 0 , and since the derivative of elements in P2π is bounded away from 0, it \{ } is straightforward to show that E is continuously differentiable on the set P2π (the C1 topology is also crucial here), and that

2π u′(x)ϕ′(x) E ′(u)ϕ = dx; 0 u (x) Z | ′ | 13 The universe, soap bubbles and curve shortening 147 compare with Exercise 7.2. Given a parametrization u P2π of a curve Γ , it seems natural to consider the following inner product on ∈

2 2 d L π = u L (R;R ) : u(x)= u(x + 2π) for almost every x R , 2 { ∈ loc ∈ } namely the L2 inner product with respect to arclength:

2π 2 v,w g(u) := v(x)w(x) u′(x) dx (v, w L2π ). h i 0 | | ∈ Z 2 In this way, we obtain a metric g : P2π Inner(L π ). We calculate the gradient of → 2 E with respect to this metric g. First, let u D(∇gE ). By definition, there exists 2 ∈ 1 v L π , v = ∇gE (u), such that, for every ϕ C π , ∈ 2 ∈ 2 2π ϕ 2π u′(x) ′(x) E ϕ ϕ ϕ dx = ′(u) = v, g(u) = v(x) (x) u′(x) dx. (13.2) 0 u (x) h i 0 | | Z | ′ | Z 1 By definition of the Sobolev space H2π , this implies

u′ 1 u′ H π and ′ = v u′ . u ∈ 2 u − | | | ′| | ′|  P u′ 1 1 u′ 2 Conversely, let u 2π be such that H2π . Then ( )′ L2π , and the equa- ∈ u′ ∈ u′ u′ ∈ 1 u | | | | | | tion (13.2) holds with v = ( ′ )′. Hence, u D(∇gE ), by the definition of the u′ u′ ∈ domain of the gradient. We| have| | thus| proved that

u′ 1 D(∇gE )= u P2π : H π , { ∈ u ∈ 2 } | ′| 1 u′ ∇gE (u)= ′ = κ(u), − u u − | ′| | ′|  where κ(u) is the curvature vector defined in the previous section.

The gradient system associated with the length functional and the metric g is thus exactly the curve shortening flow equation for closed curves:

u 1 ux = 0 in [0,T ] R, t ux ux x − | | | | ×  u(t,x)=u(t,x + 2π) for (t,x) [0,T ] R, (13.3) ∈ ×   u(0,x)= u0(x) for x R. ∈  Let us repeat that u(t, ) P2π is a parametrization of a curve Γt , and a solution of · ∈ Γ this geometric evolution equation corresponds to a family ( t )t [0,T ] of curves. In particular, if u and v are solutions of the partial differential equation∈ (13.3) such that, for every time t, u(t, ) and v(t, ) are parametrizations of the same curve Γt , · · Γ then the two solutions are identified. We call a family ( t )t [0,T ) of curves a curve ∈ 148 13 The universe, soap bubbles and curve shortening shortening flow if it admits a parametrization u = u(t,x) which is a maximal solution to the curve shortening flow equation (we skip the precise definition of solution, so one may think of classical solution).

Several questions arise in the context of problem (13.3): does one have existence and uniqueness of local or maximal solutions? For which initial curves? May we expect smooth, that is, C∞ soluions? What is the maximal existence time of a curve shortening flow? What is the behaviour near the maximal existence time? Do singularities develop, and if yes, which kind of singularities?

In order to get a first idea how answers may look like, the following example is helpful.

Example 13.4 (of a solution of the curve shortening flow equation). Let Γ0 be the circle of radius r0 > 0 and center 0. We are looking for a solution family (Γt ) of the curve shortening flow equation. By reasons of symmetry, we make the ansatz that Γt is a circle of radius r(t) and center 0. We look for a solution of the form

u(t,x) := (r(t)cosx,r(t)sinx).

Inserting this function into (13.3) and recalling Example 13.3, we see that u is a solution of (13.3)if and only if the function r is a solution of the ordinary differential equation 1 r˙ + = 0, r(0)= r . r 0 Hence, 1 u(t,x) := r2 2t (cosx,sinx) for (t,x) [0, r2) R 0 − ∈ 2 0 × q is indeed a solution of the curve shortening flow equation. This solution is maximal 1 2 but not global. The maximal existence time Tmax = 2 r0 is finite, and the solution shrinks to a point as time approaches the maximal existence time.

2 By using maximal regularity results in the Sobolev space H2π , by using fixed point theorems or the local inverse theorem, and by using regularity results for gra- dient systems, it is possible to prove the following local / maximal existence and uniqueness result. It is, however, not a straightforward consequence of Theorems 6.1 or 8.1 for abstract gradient systems.

2 Theorem 13.5. For every initial curve in H2π , the problem (13.3) admits a unique 2 ∞ ∞ maximal solution u C([0,Tmax);H π ) C ((0,Tmax);C π ). ∈ 2 ∩ 2 More difficult is the question about the maximal existence time and the be- haviour of the curve shortening flow near the maximal existence time. In order to give some short outlook, let us cite a few results in this context. 13 The universe, soap bubbles and curve shortening 149

A closed, planar curve Γ R2 is called a Jordan curve if it has no self- ⊆ intersections, that is, if every parametrization u P2π of the curve is injective on [0,2π). By the Jordan curve theorem, every Jordan∈ curve divides the plane into two connected components having the curve as boundary. One component is bounded, the other one is unbounded. We say that a Jordan curve is convex if the enclosed bounded component is convex. For convex initial curves and the curve shortening flow, the following can be said [Gage and Hamilton (1986)].

Theorem 13.6 (Gage & Hamilton). Let (Γt ) be a curve shortening flow. If Γ0 is a convex Jordan curve, then the Γt are convex Jordan curves for every t and they shrink to a point as time approaches the maximal existence time.

This result has been strengthened to general Jordan curves [Grayson (1987)].

Theorem 13.7 (Grayson). Let (Γt ) be a curve shortening flow. If Γ0 is a Jordan Γ Γ curve, then the t are Jordan curves for every t. Moreover, t0 is convex for some t0 and Γt shrinks to a point as time approaches the maximal existence time.

Variants of the length functional may be considered on Riemannian manifolds, for example on the space Rd equipped with a metric g. The length of a closed curve is in this case 2π E (u)= u′(x) g(u(x)) dx. 0 | | Z Clearly, minimizers of this functional, defined on the set of all curves connecting two given points, need no longer be straight lines. For closed curves, solutions 2 of the associated curve shortening equation (the associated L2π gradient system) need no longer shrink to points. In the Lagrange equations from classical mechan- ics, the above more general situation appears very naturally, and we refer back to [Birkhoff (1917)] or [Cartan (1967), pp. 287-].

References

[Absil et al. (2005)] Absil, P.-A., Mahony, R., Andrews, B. : Convergence of the iterates of descent methods for analytic cost functions. SIAM J. Optim. 16, 2005, 531–547 (electronic). [Adams (1975)] Adams, R. A. : Sobolev Spaces. Academic Press, New York, 1975. [Adams and Fournier (2003)] Adams, R. A., Fournier, J. J. F. : Sobolev Spaces. Vol. 140 of Pure and applied mathematics. Academic Press, Amsterdam, 2003. [Amann (1995)] Amann, H. : Linear and Quasilinear Parabolic Problems: Abstract Linear The- ory. Vol. 89 of Monographs in Mathematics. Birkh¨auser Verlag, Basel, 1995. [Ambrosio (2008)] Ambrosio, L. : Gradient flows in metric spaces and in the space of probability measures, and applications to Fokker-Planck equations with respect to log-concave measures. Bolletino della Unione Matematica Italiana IX 1, 2008, 223–240. [Ambrosio et al. (2005)] Ambrosio, L., Gigli, N., Savar´e, G.: Gradient Flows. Lectures in Math- ematics ETH Z¨urich. Birkh¨auser, Basel, 2005. [Angenent (1990a)] Angenent, S. : Parabolic equations for curves on surfaces. I. Curves with p- integrable curvature. Ann. of Math. (2) 132 (3), 1990a, 451–483. URL http://dx.doi.org/10.2307/1971426 [Angenent (1990b)] Angenent, S. B. : Nonlinear analytic semiflows. Proc. Roy. Soc. Edinburgh 115A, 1990b, 91–107. [Arendt (2004)] Arendt, W., 2004. : Semigroups and evolution equations: , regularity and kernel estimates. In: Handbook of Differential Equations (C. M. Dafermos, E. Feireisl eds.). Elsevier/North Holland, pp. 1–85. [Aubin (1979)] Aubin, J. : Applied Functional Analysis. Pure Applied Mathematics. John Wiley & Sons, New York, Chichester, Brisbane, Toronto, 1979. [Birkhoff (1917)] Birkhoff, G. D. : Dynamical systems with two degrees of freedom. Trans. Amer. Math. Soc. 18, 1917, 199–300. [Boys (1896)] Boys, C. V. : Soap Bubbles and the Forces which Mould Them. E. & J. B. Young & Co., New York, 1896. [Br´ezis (1973)] Br´ezis, H. : Op´erateurs maximaux monotones et semi-groupes de contractions dans les espaces de Hilbert. Vol. 5 of North Holland Mathematics Studies. North-Holland, Amsterdam, London, 1973. [Br´ezis (1992)] Br´ezis, H. : Analyse fonctionnelle. Masson, Paris, 1992. [Cartan (1967)] Cartan, H. : Calcul diff´erentiel. Hermann, Paris, 1967. [Cipriani and Grillo (2003)] Cipriani, F., Grillo, G. : Nonlinear Markov semigroups, nonlinear Dirichlet forms and application to minimal surfaces. J. reine angew. Math. 562, 2003, 201– 235. [Cl´ement and Li (1994)] Cl´ement, P., Li, S. : Abstract parabolic quasilinear problems and appli- cation to a groundwater flow problem. Adv. Math. Sci. Appl. 3, 1994, 17–32. [Curry (1944)] Curry, H. B. : The method of steepest descent for non-linear minimization prob- lems. Quart. Appl. Math. 2, 1944.

151 152 References

[Dautray and Lions (1985)] Dautray, R., Lions, J. : Analyse math´ematique et calcul num´erique pour les sciences et les techniques. Vol. I. INSTN: Collection Enseignement. Masson, Paris, 1985. [Dautray and Lions (1987)] Dautray, R., Lions, J. : Analyse math´ematique et calcul num´erique pour les sciences et les techniques. Vol. VIII. INSTN: Collection Enseignement. Masson, Paris, 1987. [Dautray and Lions (1988)] Dautray, R., Lions, J. : Analyse math´ematique et calcul num´erique pour les sciences et les techniques. Vol. VI. INSTN: Collection Enseignement. Masson, Paris, 1988. [De Giorgi et al. (1980)] De Giorgi, E., Marino, A., Tosques, M.: Problems of evolution in metric spaces and maximal decreasing curve. Atti Accad. Naz. Lincei Rend. Cl. Sci. Fis. Mat. Natur. (8) 68 (3), 1980, 180–187. [Denk et al. (2003)] Denk, R., Hieber, M., Pr¨uss, J. : R-Boundedness, Fourier Multipliers and Problems of Elliptic and Parabolic Type. Vol. 166 of Memoirs Amer. Math. Soc. Amer. Math. Soc., Providence, R.I., 2003. [Diestel and Uhl (1977)] Diestel, J., Uhl, J. J. : Vector Measures. Vol. 15 of Mathematical Surveys. Amer. Math. Soc., Providence, R.I., 1977. [Dore (1993)] Dore, G., 1993. : Lp regularity for abstract differential equations. In: Functional analysis and related topics, 1991 (Kyoto). Vol. 1540 of Lecture Notes in Math. Springer, Berlin, pp. 25–38. [Dr´abek and Milota (2007)] Dr´abek, P., Milota, J. : Methods of nonlinear analysis. Birkh¨auser Advanced Texts: Basler Lehrb¨ucher. [Birkh¨auser Advanced Texts: Basel Textbooks]. Birkh¨auser Verlag, Basel, 2007. [Escher et al. (2003)] Escher, J., Pr¨uss, J., Simonett, G., 2003. : A new approach to the regularity of solutions of parabolic equations. In: Evolution Equations: Proceedings in Honor of J. A. Goldstein’s 60th Birthday. Marcel Dekker, New York, pp. 167–190. [Evans (1998)] Evans, L. C. : Partial Differential Equations. Vol. 19 of Graduate Studies in Math- ematics. American Mathematical Society, Providence, RI, 1998. [Feireisl and Simondon (2000)] Feireisl, E., Simondon, F. : Convergence for semilinear degener- ate parabolic equations in several space dimensions. J. Dynam. Differential Equations 12, 2000, 647–673. [Gage and Hamilton (1986)] Gage, M., Hamilton, R. S. : The heat equation shrinking convex plane curves. J. Differential Geom. 23 (1), 1986, 69–96. URL http://projecteuclid.org/getRecord?id=euclid.jdg/ 1214439902 [Gilbarg and Trudinger (2001)] Gilbarg, D., Trudinger, N. S.: Elliptic Partial Differential Equa- tions of Second Order. Springer Verlag, Berlin, Heidelberg, New York, 2001. [Goldstein (1962)] Goldstein, A. A. : Cauchy’s method of minimization. Numer. Math. 4, 1962, 146–150. [Grayson (1987)] Grayson, M. A. : The heat equation shrinks embedded plane curves to round points. J. Differential Geom. 26 (2), 1987, 285–314. URL http://projecteuclid.org/getRecord?id=euclid.jdg/ 1214441371 [Haraux (1990)] Haraux, A. : Syst`emes dynamiques dissipatifs et applications. Masson, Paris, 1990. [Hille and Phillips (1957)] Hille, E., Phillips, R. S. : Functional Analysis and Semi-Groups. Amer. Math. Soc., Providence, R.I., 1957. [Huisken and Polden (1999)] Huisken, G., Polden, A., 1999. : Geometric evolution equations for . In: Calculus of variations and geometric evolution problems (Cetraro, 1996). Vol. 1713 of Lecture Notes in Math. Springer, Berlin, pp. 45–84. [Jahnke (2003)] Jahnke, H. N. (Ed.). : A history of analysis. Vol. 24 of History of Mathematics. American Mathematical Society, Providence, RI, 2003. [Kantoroviˇc(1947)] Kantoroviˇc, L. V. : On the method of steepest descent. Doklady Akad. Nauk SSSR (N. S.) 56, 1947, 233–236. 1000 References

[Kunstmann and Weis (2004)] Kunstmann, P. C., Weis, L., 2004.: Maximal Lp regularity for parabolic equations, Fourier multiplier theorems and H∞ functional calculus. In: Levico Lectures, Proceedings of the Autumn School on Evolution Equations and Semigroups (M. Iannelli, R. Nagel, S. Piazzera eds.). Vol. 69. Springer Verlag, Heidelberg, Berlin, pp. 65–320. [Ladyˇzenskaja et al. (1967)] Ladyˇzenskaja, O. A., Solonnikov, V. A., Ural′ceva, N. N. : Linear and quasilinear equations of parabolic type. Translated from the Russian by S. Smith. Trans- lations of Mathematical Monographs, Vol. 23. American Mathematical Society, Providence, R.I., 1967. [Lions (1969)] Lions, J. : Quelques m´ethodes de r´esolution des probl`emes aux limites non lin´eaires. Dunod, Gauthier-Villars, Paris, 1969. [Łojasiewicz (1963)] Łojasiewicz, S., 1963. : Une propri´et´etopologique des sous-ensembles ana- lytiques r´eels. In: Colloques internationaux du C.N.R.S.: Les ´equations aux d´eriv´ees partielles, Paris (1962). Editions du C.N.R.S., Paris, pp. 87–89. [Łojasiewicz (1965)] Łojasiewicz, S. : Ensembles semi-analytiques, 1965, preprint, I.H.E.S. Bures-sur-Yvette. [Lunardi (1995)] Lunardi, A. : Analytic Semigroups and Optimal Regularity in Parabolic Prob- lems. Vol. 16 of Progress in Nonlinear Differential Equations and Their Applications. Birkh¨auser, Basel, 1995. [Lunardi (2009)] Lunardi, A. : Interpolation theory, 2nd Edition. Appunti. Scuola Normale Supe- riore di Pisa (Nuova Serie). [Lecture Notes. Scuola Normale Superiore di Pisa (New Series)]. Edizioni della Normale, Pisa, 2009. [Neuberger (1997)] Neuberger, J. W. : Sobolev Gradients and Differential Equations. Vol. 1670 of Lect. Notes Math. Springer Verlag, Berlin, Heidelberg, New York, 1997. [Neuberger (2005)] Neuberger, J. W. : Prospects for a central theory of partial differential equa- tions. Math. Intelligencer 27, 2005, 47–55. [Ouhabaz (2004)] Ouhabaz, E. M. : Analysis of Heat Equations on Domains. Vol. 30 of London Mathematical Society Monographs. Princeton University Press, Princeton, 2004. [Palis and de Melo (1982)] Palis, J., de Melo, W. : Geometric Theory of Dynamical Systems. An Introduction. Springer Verlag, New York, Heidelberg, Berlin, 1982. [Perelman and Petrunin (1994)] Perelman, G., Petrunin, A. : Quasigeodesics and gradient curves in Alexandrov spaces, 1994, unpublished preprint. [Perona and Malik (1990)] Perona, P., Malik, J. : Scale-Space and Using . IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (7), 1990, 629–639. [Simon (1983)] Simon, L. : Asymptotics for a class of non-linear evolution equations, with appli- cations to geometric problems. Ann. of Math. 118, 1983, 525–571. [Simon (1996)] Simon, L. : Theorems on Regularity and Singularity of Energy Minimizing Maps. Lecture in Mathematics ETH Z¨urich. Birkh¨auser Verlag, Basel, 1996. [Temam (1984)] Temam, R. : Navier-Stokes Equations. Vol. 2 of Studies in Mathematics and its Applications. Elsevier Science Publishers, 1984. [Triebel (1983)] Triebel, H. : Theory of Function Spaces. Birkh¨auser, Basel, 1983. [Whiteside (ed.) (1968)] Whiteside (ed.), D. T. : The Mathematical Papers of Isaac Newton, Vol- ume II, 1667-1670. Cambridge University Press, Cambridge, 1968. [Whiteside (ed.) (1969)] Whiteside (ed.), D. T. : The Mathematical Papers of Isaac Newton, Vol- ume III, 1670-1673. Cambridge University Press, Cambridge, 1969. [Wolfe (1969)] Wolfe, P. : Convergence conditions for ascent methods. SIAM Rev. 11, 1969, 226– 235. [Zeidler (1990)] Zeidler, E. : Nonlinear Functional Analysis and Its Applications. I. Springer Ver- lag, New York, Berlin, Heidelberg, 1990. Appendix A Primer on topology

It is the purpose of this introductory chapter to recall some basic facts about metric spaces, sequences in metric spaces, compact metric spaces, and continuous func- tions between metric spaces. Most of the material should be known, and if it is not known in the context of metric spaces, it has certainly been introduced on Rd. The generalization to metric spaces should be straightforward, but it is nevertheless worthwhile to spend some time on the examples. We also introduce some further notions from topology which may be new; see for example the definitions of density or of completion of a metric space.

A.1 Metric spaces

Definition A.1. Let M be a set. We call a function d : M M R+ a metric or a distance on M if for every x, y, z M × → ∈ (i) d(x,y)= 0 if and only if x = y, (ii) d(x,y)= d(y,x) (symmetry), and (iii) d(x,y) d(x,z)+ d(z,y) (triangle inequality). ≤ A pair (M,d) of a set M and a metric d on M is called a metric space.

It will be convenient to write only M instead of (M,d) if the metric d on M is known from the context, and to speak of a metric space M.

Example A.2. 1. Let M Rd and ⊆ d d(x,y) := ∑ xi yi i=1 | − | or

1001 1002 A Primer on topology

1 d 2 2 d(x,y) := ∑ xi yi . i=1 | − | ! Then (M,d) is a metric space. The second metric is called the euclidean metric. Often, if the metric on Rd is not explicitly given, we mean the euclidean metric. 2. Let M C([0,1]), the space of all continuous functions on the interval [0,1], and ⊆ d( f ,g) := sup f (x) g(x) . x [0,1] | − | ∈ Then (M,d) is a metric space. 3. Let M be any set and 0 if x = y, d(x,y) := ( 1 otherwise. Then (M,d) is a metric space. The metric d is called the discrete metric. 4. Let (M,d) be a metric space. Then

d(x,y) d (x,y) := 1 1 + d(x,y)

and d2(x,y) := min d(x,y),1 { } define also metrics on M. 5. Let M = C(R), the space of all continuous functions on R, and let

dn( f ,g) := sup f (x) g(x) (n N) x [ n,n] | − | ∈ ∈ − and n dn( f ,g) d( f ,g) := ∑ 2− . n N 1 + dn( f ,g) ∈ Then (M,d) is a metric space. Note that the functions dn are not metrics for any n N! 6. Let∈ (M,d) be a metric space. Then any subset M˜ M is a metric space for the induced metric ⊆ d˜(x,y)= d(x,y), x, y M˜ . ∈ We may sometimes say that M˜ is a subspace of M, that is, a subset and a metric space, but certainly this is not to be understood in the sense of linear subspaces of vector spaces (M need not be a vector space). 7. Let (Mn,dn) be metric spaces (n N). Then the cartesian product M := n N Mn is a metric space for the metric ∈ ∈ N n d(x,y) := ∑ 2− min dn(xn,yn),1 . n N { } ∈ A.1 Metric spaces 1003

Clearly, in a similar way, every finite cartesian product of metric spaces is a metric space.

Definition A.3. Let (M,d) be a metric space. a) For every x M and every r > 0 we define the open ball B(x,r) := y M : d(x,y) < r ∈with center x and radius r. { ∈ } b) A set O M is called open if for every x O there exists some r > 0 such that B(x,r) ⊆O. ∈ ⊆ c) A set A M is called closed if its complement Ac = M A is open. ⊆ \ d) A set U M is called a neighbourhoodof x M if there exists r > 0 such that B(x,r) ⊆U. ∈ ⊆ Remark A.4. (a) The notions open, closed, neighbourhood depend on the set M!! For example, M is always closed and open in M. The set Q is not closed in R (for the euclidean metric), but it is closed in Q for the induced metric! Therefore, one should always say in which metric space some given set is open or closed. (b) Clearly, a set O M is open (in M) if and only if it is a neighbourhoodof every of its elements. ⊆

Lemma A.5. Let (M,d) be a metric space. The following are true:

a) Arbitrary unions of open sets are open. That means: if (Oi)i I is an arbitrary ∈ family of open sets (no restrictions on the index set I), then i I Oi is open. ∈ b) Arbitrary intersections of closed sets are closed. That means:S if (Ai)i I is an ∈ arbitrary family of closed sets, then i I Ai is closed. ∈ c) Finite intersections of open sets areT open. d) Finite unions of closed sets are closed.

Proof. (a) Let (Oi)i I be an arbitrary family of open sets and let O := i I Oi. If x O, then x O for∈ some i I, and since O is open, B(x,r) O for some∈ r > 0. i i i S This∈ implies that∈ B(x,r) O,∈ and therefore O is open. ⊆ ⊆ (c) Next let (Oi)i I be a finite family of open sets and let O := i I Oi. If x O, ∈ ∈ ∈ then x Oi for every i I. Since the Oi are open, there exist ri such that B(x,ri) Oi. ∈ ∈ T ⊆ Let r := mini I ri which is positive since I is finite. By construction, B(x,r) Oi for every i I, and∈ therefore B(x,r) O, that is, O is open. ⊆ The∈ proofs for closed sets are⊆ similar or follow just from the definition of closed sets and the above two assertions. Exercise A.6 Determine all open sets (respectively, all closed sets) of a metric space (M,d), where d is the discrete metric. Exercise A.7 Show that a ball B(x,r) in a metric space M is always open. Show also that B¯(x,r) := y M : d(x,y) r { ∈ ≤ } is always closed. 1004 A Primer on topology

Definition A.8 (Closure, interior, boundary). Let (M,d) be a metric space and let S M be a subset. Then the set S¯ := A : A M is closed and S A is called the closure⊆ of S. The set S := O : O {M is open⊆ and O S is called⊆ } the interior ◦ T of S. Finally, we call ∂S := x{ M : ⊆ε > 0B(x,ε) S = ⊆/0and} B(x,ε) Sc = /0 the S boundary of S. { ∈ ∀ ∩ 6 ∩ 6 }

By Lemma A.5, the closure of a set S is always closed (arbitrary intersections of closed sets are closed). By definition, S¯ is the smallest closed set which contains S. Similarly, the interior of a set S is always open, and by definition it is the largest open set which is contained in S. Note that the interior might be empty.

Exercise A.9 Give an example of a metric space M and some x M, r > 0, to show that B¯(x,r) need not coincide with the closure of B(x,r). ∈

Exercise A.10 Let (M,d) be a metric space and consider the metrics d1 and d2 from Example A.2 (4). Show that the set of all open subsets, closed subsets or neigh- bourhoods of M is the same for the three given metrics. The set of all open subsets is also called the topology of M. The three metrics d, d1 and d2 thus induce the same topology. Sometimes it is good to know that one can pass from a given metric d to a finite metric (d1 and d2 take only values between 0 and 1) without changing the topology.

A.2 Sequences, convergence

Throughout the following, sequences will be denoted by (xn). Only when it is nec- essary, we make precise the index n; usually, n 0 or n 1, but sometimes we will also consider finite sequences or sequences indexed≥ by Z≥.

Definition A.11. Let (M,d) be a metric space.

a) We callasequence (xn) M a Cauchy sequence if for every ε > 0 there exists ⊆ n0 such that for every n, m n0 one has d(xn,xm) < ε. ≥ b) Wesaythatasequence (xn) M converges to some element x M if for every ⊆ ∈ ε > 0 there exists n0 such that for every n n0 one has d(xn,x) < ε. If (xn) ≥ converges to x, we also write limn ∞ xn = x or xn x as n ∞. → → → Exercise A.12 Let C([0,1]) be the metric space from Example A.2 (2). Show that a sequence ( fn) C([0,1]) converges to some f for the metric d if and only if it converges uniformly.⊆ We say that the metric d induces the topology of uniform convergence. Show also that a sequence ( fn) C(R) (Example A.2 (5)) converges to some f for the metric d if and only if it converges⊆ uniformly on compact subsets of R. In this example, we say that the metric d induces the topology of local uniform convergence. A.2 Sequences, convergence 1005

Exercise A.13 Determine all Cauchy sequences and all convergent sequencesina discrete metric space.

Lemma A.14. Let M be a metric space and (xn) M be a sequence. Then: ⊆ a) limn ∞ xn = x for some element x M if and only if for every neighbourhood → ∈ U of x there exists n0 such that for every n n0 one has xn U. ≥ ∈ b) (Uniqueness of the limit) If limn ∞ xn = x and limn ∞ xn = y, then x = y. → → Lemma A.15. AsetA M is closed if andonly if for every sequence (xn) A which converges to some x ⊆M onehasx A. ⊆ ∈ ∈ Proof. Assume first that A is closed and let (xn) A be convergent to x M. If x does not belong to A, then it belongs to Ac which is⊆ open. By definition, there∈ exists c ε > 0 such that B(x,ε) A . Given this ε, there exists n0 such that xn B(x,ε) for ⊆ ∈ every n n0, a contradiction to the assumption that xn A. Hence, x A. ≥ ∈ ∈ On the other hand, assume that limn ∞ xn = x A for every convergent (xn) A and assume in addition that A is not closed→ or,∈ equivalently, that Ac is not open.⊆ Then there exists x Ac such that for every n N the set B(x, 1 ) A is nonempty. ∈ ∈ n ∩ From this one can construct a sequence (xn) A which converges to x, which is a contradiction because x Ac. ⊆ ∈ Lemma A.16. Let (M,d) be a metric space, and let S M be a subset. Then ⊆

S¯ = x M : (xn) S s.t. lim xn = x n ∞ { ∈ ∃ ⊆ → } = x M : d(x,S) := inf d(x,y)= 0 . { ∈ y S } ∈ Proof. Let A := x M : (xn) S s.t. lim xn = x n ∞ { ∈ ∃ ⊆ → } and B := x M : d(x,S) := inf d(x,y)= 0 . { ∈ y S } ∈ These two sets are clearly equal by the definition of the inf and the definition of convergence. Moreover, the set B is closed by the following argument. Assume that (xn) B is convergent to x M. By definition of B, for every n there exists y S ⊆ ∈ ∈ such that d(xn,yn) 1/n. Hence, ≤

limsupd(x,yn) limsupd(x,xn)+ limsupd(xn,yn)= 0, n ∞ ≤ n ∞ n ∞ → → → so that x B. Clearly,∈ B contains S, and since B is closed, B contains S¯. It remains to show that B S¯. If this is not true, then there exists x B S¯. Since the complement of S¯ is open⊆ in M, there exists r > 0 such that B(x,∈r) \S¯ = /0, a contradiction to the definition of B. ∩ Definition A.17. A metric space (M,d) is called complete if every Cauchy sequence converges. 1006 A Primer on topology

Exercise A.18 Show that the spaces Rd, C([0,1]) and C(R) are complete. Show also that any discrete metric space is complete. Lemma A.19. A subspace N M of a complete metric space is complete if and only if it is closed in M. ⊆

Proof. Assume that N M is closed, and let (xn) be a Cauchy sequence in N. By ⊆ the assumption that M is complete, (xn) is convergent to some element x M. Since N is closed, x N. ∈ ∈ Assume on the other hand that N is complete, and let (xn) N be convergent to some element x M. Clearly, every convergentsequence is also⊆ a Cauchy sequence, ∈ and since N is complete, (xn) converges to some element y N. By uniqueness of the limit, x = y N. Hence, N is closed. ∈ ∈

A.3 Compactness

Definition A.20. We say that a metric space (M,d) is compact if for every open covering there exists a finite subcovering, that is, whenever (Oi)i I is a family of ∈ open sets (no restrictions on the index set I) such that M = i I Oi, then there exists ∈ a finite subset I0 I such that M = i I Oi. ⊆ ∈ 0 S Lemma A.21. A metric space (M,dS) is compactif andonlyif it is sequentially com- pact, that is, if and only if every sequence (xn) M has a convergent subsequence. ⊆ Proof. Assume that M is compact and let (xn) M. Assume that (xn) does not ⊆ have a convergent subsequence. Then for every x M there exists εx > 0 such that ∈ B(x,εx) contains only finitely many elements of xn . Note that (B(x,εx))x M is an open covering of M so that by the compactness of{M}there exists a finite subset∈ N ε ⊆ M such that M = x N B(x, x). But this means that (xn) takes only finitely many values, and hence there∈ exists even a constant subsequence which is in particular S also convergent; a contradiction to the assumption on (xn). On the other hand, assume that M is sequentially compact and let (Oi)i I be an open covering of M. We first show that there exists ε > 0 such that for every∈ x M ε ∈ there exists ix I with B(x, ) Oix . If this were not true, then for every n N ∈ 1 ⊆ ∈ there exists xn such that B(xn, ) Oi for every i I. Passing to a subsequence, we n 6⊆ ∈ may assume that (xn) is convergent to some x M. There exists some i0 I such ∈ ε ε∈ that x Oi0 , and since Oi0 is open, we find some > 0 such that B(x, ) Oi0 . ∈ 1 ε ⊆ Let n0 be such that < . By the triangle inequality, for every n n0 we have n0 2 ≥ 1 ε B(xn, n ) B(x, ) Oi0 , a contradiction to the construction of the sequence (xn). ⊆ ⊆ n Next we show that M = B(x j,ε) for a finite family of x j M. Choose any j=1 ∈ x1 M. If B(x1,ε)= M, then we are already done. Otherwise we find x2 M ∈ S ∈ \ B(x1,ε). If B(x1,ε) B(x2,ε) = M, then we even find x3 M which does not belong ε ε∪ 6 n ε ∈ to B(x1, ) B(x2, ), and so on. If j=1 B(x j, ) is never all of M, then we find actually a sequence∪ (x ) such that d(x ,x ) ε for all j = k. This sequence can not j Sj k have a convergent subsequence, a contradiction≥ to sequenti6 al compactness. A.4 Continuity 1007

Since every of the B(x j,ε) is a subset of Oi for some ix I, we have proved that x j j ∈ M = n O , i.e. the open covering (O ) admits a finite subcovering. The proof is j=1 ix j i complete.S Lemma A.22. Any compact metric space is complete.

Proof. Let (xn) be a Cauchy sequence in M. By the preceeding lemma, there ex- ists a subsequence which converges to some x M. If a subsequence of a Cauchy sequence converges, then the sequence itself converges,∈ too.

A.4 Continuity

Definition A.23. Let (M1,d1), (M2,d2) be two metric spaces, and let f : M1 M2 be a function. →

a) We say that f is continuous at some point x M1 if ∈

ε > 0 δ > 0 y B(x,δ) : d2( f (x), f (y)) < ε. ∀ ∃ ∀ ∈ b) We say that f is continuous if it is continuous at every point. c) We say that f is uniformly continuous if

ε > 0 δ > 0 x, y M1 : d1(x,y) < δ d2( f (x), f (y)) < ε. ∀ ∃ ∀ ∈ ⇒ d) We say that f is Lipschitz continuous if

L 0 x, y M : d2( f (x), f (y)) Ld1(x,y). ∃ ≥ ∀ ∈ ≤

Lemma A.24. A function f : M1 M2 between two metric spaces is continuous at → some point x M1 if and only if it is sequentially continuous at x, that is, if and only ∈ if for every sequence (xn) M1 which converges to x one has limn ∞ f (xn)= f (x). ⊆ → Proof. Assume that f is continuous at x M1 and let (xn) be convergent to x. Let ε > 0. There exists δ > 0 such that for every∈ y B(x,δ) one has f (y) B( f (x),ε). ∈ ∈ By definition of convergence, there exists n0 such that for every n n0 one has ≥ xn B(x,δ). For this n0 and every n n0 one has f (xn) B( f (x),ε). Hence, ∈ ≥ ∈ limn ∞ f (xn)= f (x). Assume→ on the other hand that f is sequentially continuous at x. If f was not con- 1 tinuous in x then there exists ε > 0 such that for every n N there exists xn B(x, ) ∈ ∈ n with f (xn) B( f (x),ε). By construction, limn ∞ xn = x. Since f is sequentially con- 6∈ → tinuous, limn ∞ f (xn)= f (x). But this is a contradiction to f (xn) B( f (x),ε), and therefore f is→ continuous. 6∈

Lemma A.25. A function f : M1 M2 between two metric spaces is continuous if and only if preimages of open sets→ are open, that is, if and only if for every open set 1 O M2 the preimage f (O) is open in M1. ⊆ − 1008 A Primer on topology

1 Proof. Let f : M1 M2 be continuous and let O M2 be open. Let x f − (O). Since O is open, there→ exists ε > 0 such that B( f (x)⊆,ε) O. Since f is continuous,∈ there exists δ > 0 such that for every y B(x,δ) one has⊆ f (y) B( f (x),ε). Hence, 1 1 ∈ ∈ B(x,δ) f − (O) so that f − (O) is open. On the⊆ other hand, if the preimage of every open set is open, then for every 1 x M1 and every ε > 0 the preimage f − (B( f (x),ε)) is open. Clearly, x belongs to ∈ 1 this preimage, and therefore there exists δ > 0 such that B(x,δ) f − (B( f (x),ε)). This proves continuity. ⊆ Lemma A.26. Let f : K M be a continuousfunction from a compact metric space K into a metric space M.→ Then: a) The image f (K) is compact. b) The function f is uniformly continuous.

1 Proof. (a) Let (Oi)i I be an open covering of f (K). Since f is continuous, f − (Oi) ∈ 1 is open in K. Moreover, ( f − (Oi))i I is an open covering of K. Since K is compact, ∈ 1 there exists a finite subcovering: K = i I f − (Oi) for some finite I0 I. Hence, ∈ 0 ⊆ (Oi)i I is a finite subcovering of f (K). ∈ 0 S (b) Let ε > 0. Since f is continuous, for every x K there exists δx > 0 such that ∈ for all y B(x,δx) one has f (y) B( f (x),ε). By compactness, there exists a finite ∈ ∈ n δ δ δ family (xi)1 i n K such that K = i=1 B(xi, xi /2). Let = min xi /2:1 i n ≤ ≤ ⊆ δ δ { ≤ ≤ } and let x, y K such that d(x,y) < . Since x B(xi, xi /2) for some 1 i n, we ∈ δ S ∈ ε ≤ ≤ find that y B(xi, xi ). By construction, f (x), f (y) B( f (xi), ) so that the triangle inequality∈ implies d( f (x), f (y)) < 2ε. ∈

Lemma A.27. Any Lipschitz continuous function f : M1 M2 between two metric spaces is uniformly continuous. →

Proof. Let L > 0 be a Lipschitz constant for f and let ε > 0. Define δ := ε/L. Then, for every x, y M such that d1(x,y) δ one has ∈ ≤

d2( f (x), f (y)) Ld1(x,y) ε, ≤ ≤ and therefore f is uniformly continuous.

A.5 Completion of a metric space

Definition A.28. We say that a subset D M of a metric space (M,d) is dense in ⊆ M if D¯ = M. Equivalently, D is dense in M if for every x M there exists (xn) D ∈ ⊆ such that limn ∞ xn = x. → Lemma A.29 (Completion). Let (M,d) be a metric space. Then there exists a com- plete metric space (Mˆ ,dˆ) and a continuous, injective j : M Mˆ such that → d(x,y)= dˆ( j(x), j(y)), x, y M, ∈ A.5 Completion of a metric space 1009 and such that the image j(M) is dense in M.ˆ

Definition A.30. Let (M,d) be a metric space. A complete metric space (Mˆ ,dˆ) ful- filling the properties from Lemma A.29 is called a completion of M.

Proof (Proof of Lemma A.29). Let

M¯ := (xn) M : (xn) is a Cauchy sequence . { ⊆ }

We say that two Cauchy sequences (xn), (yn) M¯ are equivalent (and we write ⊆ (xn) (yn)) if limn ∞ d(xn,yn)= 0. Clearly, is an equivalence relation on M¯ . ∼ → ∼ We denote by [(xn)] the equivalence class in M¯ of a Cauchy sequence (xn), and we let Mˆ := M¯ / = [(xn)] : (xn) M¯ ∼ { ∈ } be the set of all equivalence classes. If we define

dˆ([(xn)],[(yn)]) := lim d(xn,yn), n ∞ → then dˆ is well defined (the definition is independent of the choice of representatives) and it is a metric on Mˆ . The fact that dˆ is a metric and also that (Mˆ ,dˆ) is a complete metric space are left as exercises. One also easily verifies that j : M Mˆ defined by j(x) = [(x)] (the equivalence class of the constant sequence (x)) is continuous,→ injective and in fact isometric, i.e.

d(x,y)= dˆ( j(x), j(y)) for every x, y M. The proof is here complete. ∈ Lemma A.31. Let (Mˆ i,dˆi) (i = 1, 2) be two completions of a metric space (M,d). Then there exists a bijection b : Mˆ 1 Mˆ 2 such that for every x, y Mˆ 1 → ∈

dˆ1(x,y)= dˆ2(b(x),b(y)).

Lemma A.31 shows that up to isometric bijections there exists only one comple- tion of a given metric space and it allows us to speak of the completion of a metric space.

Lemma A.32. Let f : M1 M2 be a uniformly (!) continuous function between two → metric spaces. Let Mˆ1 and Mˆ2 be the completions of M1 and M2, respectively. Then there exists a unique continuous extension fˆ : Mˆ1 Mˆ2 of f. → Proof. Since f is uniformly continuous, it maps equivalent Cauchy sequences into equivalent Cauchy sequences (equivalence of Cauchy sequences is defined as in the proof of Lemma A.29). Hence, the function fˆ([(xn)]) := [( f (xn))] is well defined. It is easy to check that fˆ is an extension of f and that fˆ is continuous (even uniformly continuous). 1010 A Primer on topology

The assumption of uniform continuity in Lemma A.32 is necessary in general. The functions f (x)= sin(1/x) and f (x)= 1/x on the open interval (0,1) do not admit continuous extensions to the closed interval [0,1] (which is the completion of (0,1)). Appendix B Banach spaces and bounded linear operators

Throughout, let K R,C . ∈{ }

B.1 Normed spaces

Definition B.1. Let X be a vector space over K. A function : X R+ is called a norm if for every x, y X and every λ K k·k → ∈ ∈ (i) x = 0 if and only if x = 0, k k (ii) λx = λ x , and k k | |k k (iii) x + y x + y (triangle inequality). k k≤k k k k A pair (X, ) of a vector space X and a norm is called a normed space. k·k k·k Often, we will speak of a normed space X if it is clear which norm is given on X.

Example B.2. 1. (Finite dimensional spaces) Let X = Kd. Then

1/p d p x p := ∑ xi , 1 p < ∞, k k i=1 | | ! ≤

and x ∞ := sup xi k k 1 i d | | ≤ ≤ are norms on X. 2. (Sequence spaces) Let 1 p < ∞, and let ≤ p p l := (xn) K : ∑ xn < ∞ { ⊂ n | | } with norm

1011 1012 B Banach spaces and bounded linear operators

1/p p x p := ∑ xn . k k | |  n  p Then (l , p) is a normed space. 3. (Sequencek·k spaces) Let X be one of the spaces

∞ l := (xn) K : sup xn < ∞ , { ⊂ n | | } c := (xn) K : lim xn exists , or n ∞ { ⊂ → } c0 := (xn) K : lim xn = 0 , or n ∞ { ⊂ → } c00 := (xn) K : the set n : xn = 0 is finite , { ⊂ { 6 } } and let x ∞ := sup xn . k k n | | Then (X, ∞) is a normed space. 4. (Functionk·k spaces: continuous functions) Let C([a,b]) be the space of all continu- ous, K-valued functions on a compact interval [a,b] R. Then ⊂ b 1/p p f p := f (x) dx , 1 p < ∞, k k a | | ≤ Z  and f ∞ := sup f (x) k k x [a,b] | | ∈ are norms on C([a,b]). 5. (Function spaces: continuous functions) Let K be a compact metric space and let C(K) be the space of all continuous, K-valued functions on K. Then

f ∞ := sup f (x) k k x K | | ∈ is a norm on C(K). 6. (Function spaces: integrable functions) Let (Ω,A , µ) be a measure space and let p Xp = L (Ω) (1 p ∞). Let ≤ ≤ 1/p p f p := f dµ , 1 p < ∞, k k Ω | | ≤ Z  or f ∞ := ess sup f (x) := inf c R+ : µ( f > c )= 0 . k k | | { ∈ {| | } } Then (Xp, p) is a normed space. 7. (Functionk·k spaces: differentiable functions) Let

C1([a,b]) := f C([a,b]) : f is continuously differentiable . { ∈ } B.1 Normed spaces 1013

Then ∞ and k·k f 1 := f ∞ + f ′ ∞ k kC k k k k are norms on C1([a,b]).

We will see more examples in the sequel.

Lemma B.3. Every normed space is a metric space for the metric

d(x,y) := x y , x, y X. k − k ∈ By the above lemma, also every subset of a normedspace becomesa metric space in a natural way. Moreover, it is natural to speak of closed or open subsets (or linear subspaces!) of normed spaces, or of closures and interiors of subsets.

Exercise B.4 Show that in a normed space X, for every x X and every r > 0 the closed ball B¯(x,r) coincides with closure B(x,r) of the open∈ ball. Also the notion of continuity of functions between normed spaces (or between a metric space and a normed space) makes sense. The following is a first example of a continuous function.

Lemma B.5. Given a normed space, the norm is a continuous function.

This lemma is a consequence of the following lemma.

Lemma B.6 (Triangle inequality from below). Let X be a normed space. Then, for every x, y X, ∈ x y x y . k − k≥ k k−k k

Proof. The triangle inequality implies x = x y + y k k k − k x y + y , ≤k − k k k so that x y x y . k k−k k≤k − k Changing the role of x and y implies

y x y x = x y , k k−k k≤k − k k − k and the claim follows.

A notion which can not really be defined in metric spaces but in normed spaces is the following.

Definition B.7. A subset B of a normed space X is called bounded if

sup x : x B < ∞. {k k ∈ } 1014 B Banach spaces and bounded linear operators

It is easy to check that if X is a normed space, and M is a metric space, then the set C(M;X) of all continuous functions from M into X is a vector space for the obvious addition and scalar multiplication. If M is in addition compact, then f (M) X is also compact for every such function, and hence f (M) is necessarily bounded⊂ (every compact subset of a normed space is bounded!). So we can give a new example of a normed space.

Example B.8. 8. (Function spaces: vector-valued continuous functions) Let (X, ) be a normed space and let K be a compact metric space. Let E = C(K;X) bek · thek space of all X-valued continuous functions on K. Then

f ∞ := sup f (x) k k x K k k ∈ is a norm on C(K;X).

Also the notions of Cauchy sequences and convergent sequences make sense in normed spaces. In particular, one can speak of a complete normed space, that is, a normed space in which every Cauchy sequence converges.

Definition B.9. A complete normed space is called a Banach space.

Example B.10. The finite dimensional spaces, the sequence spaces l p (1 p ∞), p ≤ ≤ c, and c0, and the function spaces (C([a,b]), ∞), (L (Ω), p) are Banach spaces. k·k k·k The spaces (c00, ∞), (C([a,b]), p) (1 p < ∞) are not Banach spaces. k·k k·k ≤ If X is a Banach space, then also (C(K;X), ∞) is a Banach space. k·k Definition B.11. We say that two norms 1 and 2 on a real or complex vector space X are equivalent if there exist two constantsk·k ck·k, C > 0 such that for every x X ∈

c x 1 x 2 C x 1. k k ≤k k ≤ k k

Lemma B.12. Let 1, 2 be two norms on a vector space X (over K). The following are equivalent:k·k k·k

(i) The norms 1, 2 are equivalent. k·k k·k (ii) A set O X is open for the norm 1 if and only if it is open for the norm ⊂ k·k 2 (and similarly for closed sets). k·k (iii) A sequence (xn) X converges to 0 for the norm 1 if and only if it con- ⊂ k·k verges to 0 for the norm 2. k·k In other words, if two norms 1, 2 on a vector space X are equivalent, then the open sets, the closed sets andk·k the nullk·k sequences are the same. We also say that the two norms define the same topology. In particular, if X is a Banach space for one norm then it is also a Banach space for the other (equivalent) norm.

Exercise B.13 The norms ∞ and p are not equivalent on C([0,1]). k·k k·k B.1 Normed spaces 1015

Theorem B.14. Any two norms on a finite dimensionalreal or complex vector space are equivalent.

Proof. We may without loss of generality consider Kd. Let be a norm on Kd d k·kd and let (ei)1 i d be the canonical basis of K . For every x K ≤ ≤ ∈ d x = ∑ xiei k k k i=1 k d ∑ xi ei ≤ i=1 | |k k C x 1, ≤ k k

where C := sup ei < ∞ and 1 is the norm from Example B.2.1. By the 1 i d k k k·k triangle inequality≤ ≤ from below, for every x, y Kd, ∈

x y x y C x y 1. |k k−k k|≤k − k≤ k − k d d Hence, the norm : (K , 1) R+ is continuous (on K equipped with the k·k d k·k → norm 1). If S := x K : x 1 = 1 denotes the for the norm 1, then Sk·kis compact. As{ a∈ consequencek k } k·k

c := inf x : x S > 0, {k k ∈ } since the infimum is attained by the continuity of . This implies k·k d c x 1 x for every x K . k k ≤k k ∈ d We have proved that every norm on K is equivalent to the norm 1. Hence, any two norms on Kd are equivalent. k·k

Corollary B.15. Any finite dimensional normed space is complete. Any finite dimen- sional subspace of a normed space is closed.

d d Proof. The space (K , 1) is complete (exercise!). If is a second norm on K k·k k·k and if (xn) is a Cauchy sequence for that norm, then it is also a Cauchy sequence d in (K , 1) (use that the norms 1 and are equivalent), and therefore con- k·k d k·k k·k vergent in (K , 1). By equivalence of norms again, the sequence (xn) is also convergent in (Kk·kd, ), and therefore (Kd, ) is complete. k·k k·k Let Y be a finite dimensional subspace of a normed space X, and let (xn) Y bea ⊂ convergent sequence with x = limn ∞ xn X. Since (xn) is also a Cauchy sequence, and since Y is complete,we find (by→ uniquenessof∈ the limit) that x Y, and therefore Y is closed (Lemma A.15). ∈

Definition B.16. Let (xn) be a sequence in a normed space X. We say that the series ∑n xn is convergent if the sequence (∑ j n x j) of partial sums is convergent. We say ≤ that the series ∑ xn is absolutely convergent if ∑ xn < ∞. n n k k 1016 B Banach spaces and bounded linear operators

Lemma B.17. Let (xn) be a sequence in a normed space X. If the series ∑n xn is convergent, then necessarily limn ∞ xn = 0. → Note that in a normed space not every absolutely convergentseries is convergent. In fact, the following is true.

Lemma B.18. A normed space X is a Banach space if and only if every absolutely convergent series converges.

Proof. Assume that X is a Banach space, and let ∑n xn be absolutely convergent. It follows easily from the triangle inequality that the corresponding sequence of partial sums is a Cauchy sequence, and since X is complete, the series ∑n xn is convergent. On the other hand, assume that every absolutely convergent series is convergent. Let (xn)n 1 X be a Cauchy sequence. From this Cauchy sequence, one can ex- ≥ ⊂ k tract a subsequence (xnk )k 1 such that xnk+1 xnk 2− , k 1. Let y0 = xn1 and ≥ k ∑ − k≤ ≥ yk = xnk+1 xnk , k 1. Then the series k 0 yk is absolutely convergent. By as- − ≥ ≥ ∑k sumption, it is also convergent. But by construction, ( l=0 yl) = (xnk ), so that (xnk ) is convergent. Hence, we have extracted a subsequence of the Cauchy sequence (xn) which converges. As a consequence, (xn) is convergent, and since (xn) was an arbitrary Cauchy sequence, X is complete.

Lemma B.19 (Riesz). Let X be a normed space and let Y X be a closed linear subspace. If Y = X, then for every δ > 0 there exists x X Y⊂ such that x = 1 and 6 ∈ \ k k dist(x,Y )= inf x y : y Y 1 δ. {k − k ∈ }≥ − Proof. Let z X Y. Since Y is closed, ∈ \ d := dist(z,Y ) > 0.

Let δ > 0. By definition of the infimum, there exists y Y such that ∈ d z y . k − k≤ 1 δ − z y Let x := z−y . Then x X Y, x = 1, and for every u Y k − k ∈ \ k k ∈ 1 x u = z y − z (y + z y u) k − k k − k 1 k − k − k k z y − d 1 δ, ≥k − k ≥ − since (y + z y u) Y. k − k ∈ Theorem B.20. A normed space is finite dimensional if and only if every closed bounded set is compact.

Proof. If the normed space is finite dimensional, then every closed bounded set is compact by the Theorem of Heine-Borel. Note that by Theorem B.14 it is not B.2 Product spaces and quotient spaces 1017 important which norm on the finite dimensional space is considered. By Lemma B.12, the closed and bounded sets do not change. On the other hand, if the normed space is infinite dimensional, then, by the Lemma of Riesz, one can construct inductively a sequence (xn) X such that 1 ⊂ xn = 1 and dist(xn 1,Xn) for every n N, where Xn = span xi :1 i n k k + ≥ 2 ∈ { ≤ ≤ } (note that Xn is closed by Corollary B.15). By construction, (xn) belongs to the closed unit ball, but it can not have a convergent subsequence (even not a Cauchy subsequence). Hence, the closed unit ball is not compact. We state this result sepa- rately.

Theorem B.21. In an infinite dimensional Banach space the closed unit ball is not compact.

Lemma B.22 (Completion of a normed space). For every normed space X there exists a Banach space Xˆ and a linear injective j : X Xˆ such that j(x) = x (x X)and j(X) is dense in X.ˆ Up to isometry, the Banach→ space Xˆ k is uniquek k (upk to∈ isomorphism). It is called the completion of X.

Proof. It suffices to repeatthe proof of Lemma A.29 and to note that the completion Xˆ of X (considered as a metric space) carries in a natural way a linear structure: addition of - equivalence classes of - Cauchy sequences is their componentwise addition, and also multiplication of - an equivalence class - of a Cauchy sequence and a scalar is done componentwise. Moreover, for every [(xn)], one defines the norm [(xn)] := lim xn . n ∞ k k → k k Uniqueness of Xˆ follows from Lemma A.31.

B.2 Product spaces and quotient spaces

Lemma B.23 (Product spaces). Let (Xi)i I be a finite (!) family of normed spaces, X ∈ and let := i I Xi be the cartesian product. Then ∈ N 1/p x := ∑ x p (1 p < ∞), p i Xi k k i I k k ! ≤ ∈ and

x ∞ := sup xi Xi k k i I k k ∈ define equivalent norms on X . In particular, the cartesian product is a normed space.

Proof. The easy proof is left to the reader. 1018 B Banach spaces and bounded linear operators

X Lemma B.24. Let (Xi)i I be a finite family of normed spaces, and let := i I Xi ∈ ∈ be the cartesian product equipped with one of the equivalent norms p from n n X k·kN Lemma B.23. Then a sequence (x ) = ((xi )i) converges (is a Cauchy se- n ⊂ quence) if and only if (xi ) Xi is convergent (is a Cauchy sequence) for every i I. ⊂ ∈ As a consequence, X is a Banach space if and only if all the Xi are Banach spaces.

Proposition B.25 (Quotient space). Let X be a vector space (!) over K, and let Y X be a linear subspace. Define, for every x X, the affine subspace ⊂ ∈ x +Y := x + y : y Y , { ∈ } and define the quotient space or factor space

X/Y := x +Y : x X . { ∈ } Then X/Y is a vector space for the addition

(x +Y) + (z +Y) := (x + z +Y),

and the scalar multiplication

λ(x +Y) := (λx +Y).

The neutral element is Y.

For the definition of quotient spaces, it is not important that we consider real or complex vector spaces.

Examples of quotient spaces are already known. In fact, Lp is such an example. Usually, one defines L p(Ω,A , µ) p to be the space of all mesurable functions f : Ω K such that Ω f dµ < ∞. Moreover, → | | R N := f L p(Ω,A , µ) : f p = 0 . { ∈ Ω | | } Z Note that N is a linear subspace of L p(Ω,A , µ), and that N is the space of all functions f L p which vanish almost everywhere. Then ∈ Lp(Ω,A , µ) := L p(Ω,A , µ)/N.

Proposition B.26. Let X be a normed space and let Y X be a linear subspace. Then ⊂ x +Y := inf x y : y Y k k {k − k ∈ } B.2 Product spaces and quotient spaces 1019

defines a norm on X/Y if and only ifY is closed in X. If X is a Banach space and Y X closed, then X/Y is also a Banach space. ⊂ Proof. We have to check that satisfies all properties of a norm. Recall that 0 = Y , and that for all x Xk·k X/Y ∈ x +Y = 0 k k inf x y : y Y = 0 ⇔ {k − k ∈ } (yn) Y : lim yn = x n ∞ ⇔ ∃ ⊂ → ( if Y closed) : x Y ⇔ ⇒ ∈ x +Y = Y. ⇔ Second, for every x X and every λ K 0 , ∈ ∈ \{ } λ(x +Y) = λx +Y k k k k = inf λx y : y Y {k − k ∈ } = inf λ(x y) : y Y {k − k ∈ } = λ inf x y : y Y | | {k − k ∈ } = λ x +Y . | |k k Third, for every x, z X, ∈ (x +Y) + (z +Y) = (x + z)+Y k k k k = inf x + z y : y Y {k − k ∈ } = inf x + z y1 y2 : y1, y2 Y {k − − k ∈ } inf x y1 + z y2 : y1, y2 Y ≤ {k − k k − k ∈ } inf x y : y Y + inf z y : y Y ≤ {k − k ∈ } {k − k ∈ } = x +Y + z +Y . k k k k Hence, X/Y is a normed space if Y is closed. Assume next that X is a Banach space. Let (xn) X be such that the series ⊂∞ ∑n 1 xn +Y converges absolutely, i.e. ∑n 1 xn +Y < . By definition of the norm ≥ ≥ k k n in X/Y, we find (yn) Y such that xn yn xn +Y + 2 . Replacing (xn) by ⊂ k − k≤k k − (xˆn) = (xn yn), we findthat xn +Y = xˆn +Y and that the series ∑n 0 xˆn is absolutely − ≥ convergent. Since X is complete, by Lemma B.18, the limit ∑n 1 xˆn = x X exists. As a consequence, ≥ ∈

n n (x +Y) ∑ (xˆk +Y) = (x ∑ xˆk)+Y k − k=1 k k − k=1 k n x ∑ xˆk 0, ≤k − k=1 k →

i.e. the series ∑n 1 xn +Y converges. By Lemma B.18, X/Y is complete. ≥ 1020 B Banach spaces and bounded linear operators B.3 Bounded linear operators

In the following a linear mapping between two normed spaces X and Y will also be called a linear operator or just operator.If Y = K, then we call linear operators also linear functionals. If T : X Y is a linear operator between two normed spaces, then we denote by → kerT := x X : Tx = 0 { ∈ } its kernel or null space, and by

ranT := Tx : x X { ∈ } its range or image. Observe that we simply write Tx instead of T (x), meaning that T is applied to x X. The identity operator X X, x x is denoted by I. ∈ → 7→ Lemma B.27. Let T : X Y be a linear operator between two normed spaces X and Y. Then the following→ are equivalent (i) T is continuous. (ii) T is continuous at 0. (iii) TB is boundedinY, where B = B(0,1) denotes the unit ball in X. (iv) There exists a constant C 0 such that for every x X ≥ ∈ Tx C x . k k≤ k k Proof. The implication (i) (ii) is trivial. (ii) (iii). If T is continuousat⇒ 0, then there exists some δ > 0 such that for every x B(⇒0,δ) one has Tx B(0,1) (so the ε from the ε-δ definition of continuity is chosen∈ to be 1 here). By∈ linearity, for every x B = B(0,1) ∈ 1 1 Tx = T(δx) , k k δ k k≤ δ and this means that TB is bounded. (iii) (iv). The set TB being bounded in Y means that there exists some constant C 0 such⇒ that for every x B one has Tx C. By linearity, for every x X 0 , ≥ ∈ k k≤ ∈ \{ } x Tx = T x C x . k k k x kk k≤ k k k k (iv) (i). Let x X, and assume that limn ∞ xn = x. Then ⇒ ∈ →

Txn Tx = T(xn x) C xn x 0 as n ∞, k − k k − k≤ k − k → →

so that limn ∞ Txn = Tx. → Definition B.28. We call a continuous linear operator T : X Y between two normed spaces X and Y also a (since it maps→ the unit ball of B.3 Bounded linear operators 1021

X to a bounded subset of Y). The set of all bounded linear operators is denoted by L (X,Y ). Special cases: If X = Y , then we write L (X,X)=: L (X). If Y = K, then we write L (X,K)=: X ′. Lemma B.29. The set L (X,Y ) is a vector space and

T := inf C 0 : Tx C x for all x X (B.1) k k { ≥ k k≤ k k ∈ } = sup Tx : x 1 {k k k k≤ } = sup Tx : x = 1 {k k k k } is a norm on L (X,Y).

Proof. We first show that the three quantities on the right-hand side of (B.1) are equal. In fact, the equality

sup Tx : x 1 = sup Tx : x = 1 {k k k k≤ } {k k k k } is easy to check so that it remains only to show that

A := inf C 0 : Tx C x for all x X = sup Tx : x = 1 =: B. { ≥ k k≤ k k ∈ } {k k k k } If C > A, then for every x X 0 , Tx C x or T x C. Hence, C B ∈ \{ } k k≤ k k k x k≤ ≥ which implies that A B. If C > B, then for every x X k k 0 , T x C, and ≥ ∈ \{ } k x k≤ therefore Tx C x . Hence, C A which implies that A B. k k Now wek checkk≤ thatk k is a norm≥ on L (X,Y). First, for every≤ T L (X,Y ), k·k ∈ T = 0 sup Tx : x 1 = 0 k k ⇔ {k k k k≤ } x X, x 1 : Tx = 0 ⇔ ∀ ∈ k k≤ k k ( is a norm on Y ) x X, x 1 : Tx = 0 ⇔ k·k ∀ ∈ k k≤ ( linearity of T) x X : Tx = 0 ⇔ ⇒ ∀ ∈ T = 0. ⇔ Second, for every T L (X,Y) and every λ K ∈ ∈ λT = sup (λT)x : x 1 k k {k k k k≤ } = sup λ Tx : x 1 {| |k k k k≤ } = λ T . | |k k Finally, for every T , S L (X,Y), ∈ T + S = sup (T + S)x : x 1 k k {k k k k≤ } sup Tx + Sx : x 1 ≤ {k k k k k k≤ } T + S . ≤k k k k The proof is complete. 1022 B Banach spaces and bounded linear operators

Remark B.30. (a) Note that the infimum on the right-hand side of (B.1) in Lemma B.29 is always attained. Thus, for every operator T L (X,Y) and every x X, ∈ ∈ Tx T x . k k≤k kk k This inequality shall be frequently used in the sequel! Note that on the other handthe suprema on the right-hand side of (B.1) are not always attained. (b) From Lemma B.29 we can learn how to show that some operator T : X Y is bounded and how to calculate the norm T . Usually (in most cases), one should→ prove in the first step some inequality of thek formk

Tx C x , x X, k k≤ k k ∈ because this inequality shows on the one hand that T is bounded, and on the other hand it shows the estimate T C. In the second step one should prove that the estimate C was optimal by findingk k≤ some x X of norm x = 1 such that Tx = C, ∈ k k k k or by finding some sequence (xn) X of norms xn 1 such that limn ∞ Txn = C, because this shows that T =C⊂. Of course,thek secondstepk≤ onlyworksif→ k onehask not lost anything in the estimatek k of the first step. There are in fact many examples of bounded operators for which it is difficult to estimate their norm. Example B.31. 1. (Shift-operator). On l p(N) consider the left-shift operator

Lx = L(xn) = (xn+1).

Then 1/p 1/p p p L(xn) p = ∑ xn 1 ∑ xn , k k | + | ≤ | |  n   n  so that L is bounded and L 1. On the other hand, for x = (0,1,0,0,...) one k k≤ computes that x p = 1 and Lx p = (1,0,0,...) p = 1, and one concludes that L = 1. k k k k k k 2. (Shift-operator).k k Similarly, one shows that the right-shift operator R on l p(N) defined by Rx = R(xn) = (0,x0,x1,...) p is bounded and R = 1. Note that actually Rx p = x p for every x l . 3. (Multiplicationk operator).k Let m l∞ and considerk k onkl pkthe multiplication∈ oper- ator ∈ Mx = M(xn) = (mnxn). 4. (Functionals on C). Consider the linear functional ϕ : C([0,1]) K defined by → 1 2 ϕ( f ) := f (x) dx. Z0 Then 1 2 1 ϕ( f ) f (x) dx f ∞, | |≤ 0 | | ≤ 2k k Z B.3 Bounded linear operators 1023

ϕ ϕ 1 so that is bounded and 2 . On the other hand, for the constant function k k≤ 1 1 f = 1 one has f ∞ = 1 and ϕ( f ) = , so that ϕ = . k k | | 2 k k 2 Lemma B.32. Let X, Y, Z be three Banach spaces, and let T L (X,Y ) and S L (Y,Z). Then ST L (X,Z) and ∈ ∈ ∈ ST S T . k k≤k kk k Proof. The boundedness of ST is clear since compositions of continuous functions are again continuous. To obtain the bound on ST, we calculate

ST = sup STx k k x 1 k k k k≤ sup S Tx ≤ x 1 k kk k k k≤ S T . ≤k kk k Lemma B.33. IfY is a Banach space then L (X,Y) is a Banach space.

Proof. Assume that Y is a Banach space and let (Tn) be a Cauchy sequence in L (X,Y ). By the estimate

Tnx Tmx = (Tn Tm)x Tn Tm x , k − k k − k≤k − kk k

the sequence (Tnx) is a Cauchy sequence in Y for every x X. Since Y is complete, ∈ the limit limn ∞ Tnx exists for every x X. Define Tx := limn ∞ Tnx. Clearly, T : X Y is linear.→ Moreover, since any Cauchy∈ sequence is bounded,→ we find that →

Tx sup Tnx C x k k≤ n k k≤ k k for some constant C 0, i.e. T is bounded. Moreover, for every n N we have the estimate ≥ ∈

T Tn = sup Tx Tnx k − k x 1 k − k k k≤ sup sup Tmx Tnx ≤ x 1 m n k − k k k≤ ≥ sup Tm Tn . ≤ m n k − k ≥ Since that right-hand side of this inequality becomes arbitrarily small for large n, we see that limn ∞ Tn = T exists, and so we have proved that L (X,Y) is a Banach space. → Remark B.34. The converse of the statement in Lemma B.33 is also true, i.e if L (X,Y ) is a Banach space then necessarily Y is a Banach space. For the proof, however, one has to know that there are nontrivial operators in L (X,Y ) as soon as Y is nontrivial (i.e. Y = 0 ). For this, we need the Theorem of Hahn-Banach and its consequences discussed6 { } in Chapter E. 1024 B Banach spaces and bounded linear operators

Corollary B.35. The space X ′ = L (X,K) of all bounded linear functionals on X is always a Banach space. Definition B.36. Let X, Y be two normed spaces. a) We call T L (X,Y ) an isomorphism if T is bijective and T 1 L (Y,X). ∈ − ∈ b) We call T L (X,Y ) an isometry if Tx = x for every x X. ∈ k k k k ∈ c) We say that X and Y are isomorphic (and we write X ∼= Y) if there exists an isomorphism T L (X,Y). ∈ d) We say that X and Y are isometrically isomorphic if there exists an isometric isomorphism T L (X,Y). ∈ Remark B.37. 1. Two norms 1, 2 on a K vector space X are equivalent if k·k k·k and only if the identity operator I : (X, 1) (X, 2) is an isomorphism. 2. Saying that two normed spaces X and Yk·kare isomorphic→ k·k means that they are not only ’equal’ as vector spaces (in the sense that we find a bijective linear operator) but also as normed spaces (i.e. the bijection is continuous as well as its inverse). 3. If T L (X,Y) and S L (Y,Z) are isomorphisms, then ST L (X,Z) is an ∈ ∈1 1 1 ∈ isomorphism and (ST)− = T − S− . 4. Every isometry T L (X,Y) is clearly injective. If it is also surjective, then T is ∈ 1 an isometric isomorphism, i.e. the inverse T − is also bounded (even isometric). 5. Clearly, if T L (X,Y) is isometric, then it is an isometric isomorphism from X onto ranT , and∈ we may say that X is isometrically embedded into Y (via T ). Example B.38. The right-shift operator from Example B.31 (2) is isometric,but not surjective. In particular, l p is isometrically isomorphic to a proper subspace of l p.

Exercise B.39 Show that the spaces (c, ∞) of all convergent sequences and k·k (c0, ∞) of all null sequences are isomorphic. k·k Exercise B.40 Show that (c0, ∞) is (isometrically) isomorphic to a linear sub- k·k space of (C([0,1]), ∞), i.e. find an isometry T : c0 C([0,1]). k·k → Lemma B.41 (Neumann series). Let X be a Banach space and let T L (X) be such that T < 1. Then I T is boundedly invertible, i.e. it is an isomorphism.∈ k k 1 − n Moreover, (I T)− = ∑n 0 T . − ≥ Proof. Since X is a Banach space, L (X) is also a Banach space by Lemma B.33. n By assumption on T , the series ∑n 0 T is absolutely convergent, and hence, by Lemma B.18, it is convergentk k to some≥ element S L (X). Moreover, ∈ n (I T)S = lim (I T) ∑ T k = lim (I T k+1)= I, n ∞ n ∞ − → − k=0 → − and similarly, S(I T)= I. − Corollary B.42. Let X and Y be two Banach spaces. Then the set I (X,Y) of all 1 isomorphisms in L (X,Y ) is open, and the mapping T T − is continuous from I (X,Y ) onto I (Y,X). 7→ B.4 The Arzela-Ascoli theorem 1025

Proof. Let I L (X,Y ) be the set of all isomorphisms, and assume that I is not ⊂ I 1 empty(if it is empty,then it is also open).Let T . Then for every S B(T, 1 ) ∈ ∈ T − we have k k 1 S = T + S T = T (I + T − (S T)), − − 1 1 1 and since T − (S T) T − S T < 1, the operator I +T − (S T) L (X) is an isomorphismk − by Lemmak≤k B.41.kk As− ak composition of two isomorphisms,− ∈S I , and hence I is open. The continuity is also a direct consequence of the ab∈ove representation of S (and thus of its inverse), using the Neumann series.

B.4 The Arzela-Ascoli theorem

It is a consequence of Riesz’ Lemma (Lemma B.19) that the unit ball in an infinite dimensional Banach space is not compact; see also Theorem B.21. But compact sets play an importantrole in many theoremsfrom analysis, in particular when one wants to prove the existence of some fixed point, the existence of a solution to an algebraic equation, the existence of a solution of a differential equation, the existence of a solution of a partial differential equation etc. It is therefore important to identify the compact sets in Banach spaces, in particular in the classical Banach spaces. The Arzela-Ascoli theorem characterizes the compact subsets of C(K;X), where (K,d) is a compact metric space and X is a Banach space. We say that a subset B C(K;X) is equicontinuous at some point x K if for every ε > 0 there exists⊆δ > 0 such that for every y K and every f ∈B the implication ∈ ∈ d(x,y) < δ f (x) f (y) < ε ⇒ k − k holds.

Theorem B.43 (Arzela-Ascoli). Let (K,d) be a compact metric space, X be a Ba- nach space and consider the Banach space C(K;X) of all continuous functions K X equipped with the supremum norm f ∞ = supx K f (x) . For a subset B →C(K;X), the following assertions are equivalent:k k ∈ k k ⊆ (i) The set B is compact. (ii) The set B is closed, equicontinuous at every x K and pointwise compact in ∈ the sense that for every x K theset Bx = f (x) : f B is compact. ∈ { ∈ } We point out that, by the Heine-Borel theorem, the condition of pointwise com- pactness of B can be replaced by mere pointwise boundedness or boundedness as soon as the space X is finite dimensional.

Corollary B.44 (Arzela-Ascoli). Let (K,d) be a compact metric space, and con- sider the Banach spaceC(K;Rd ) of all continuous functions K Rd equipped with → d the supremum norm f ∞ = supx K f (x) . For a subset B C(K;R ), the follow- ing assertions are equivalent:k k ∈ k k ⊆ 1026 B Banach spaces and bounded linear operators

(i) The set B is compact. (ii) The set B is closed, equicontinuous at every x K and pointwise bounded in ∈ the sense that for every x K theset Bx = f (x) : f B is bounded. ∈ { ∈ } Proof (Proof of Theorem B.43). The proof of the Arzela-Ascoli theorem is a nice application of Cantor’s diagonal sequence argument which we see here for the first time, but which we will see again below when we prove that every bounded sequence in a reflexive Banach space admits a weakly convergent subsequence. Given a sequence, Cantor’s argument allows us to construct a subsequence which satisfies a countable number of properties. It is instructive to learn the idea of Cantor’s argument since it can be help in various situations.

We first assume that B C(K;X) is compact. Any compact subset of a Banach space is closed and bounded,⊆ and therefore B is closed and bounded, too. For every x K, the point evaluation C(K;X) X, f f (x) is linear and continuous. Since continuous∈ images of compact sets→ are compact,7→ the image of B under the point evaluation, that is the set Bx = f (x) : f B , is compact. We show that B is equicontinuous{ at every∈ } x. Assume that this was not the case. Then there exist x K and ε > 0 such that for every n 1 there exist yn K and ∈ 1 ≥ ∈ fn B such that d(x,yn) < and fn(x) fn(yn) ε. Since B is compact, there ∈ n k − k≥ exists a subsequence of ( fn) (which we denote for simplicity again by ( fn)) such that limn ∞ fn = f in C(K;X). Then, by the triangle inequality from below, →

liminf f (x) f (yn) = liminf f (x) fn(x)+ fn(x) fn(yn)+ fn(yn) f (yn) n ∞ n ∞ → k − k → k − − − k liminf fn(x) fn(yn) 2 f fn ∞ n ∞ ≥ → k − k− k − k ε.  ≥

This inequality, however, contradicts to the continuity of f (note that limn ∞ yn = x), and therefore, B is equicontinuous at every x K. → Assume now that B satisfies the properties∈ from assertion (ii). In order to show that B is compact, it suffices to show that every sequence ( fn) B admits a conver- ⊆ gent subsequence, that is, B is sequentially compact. So let ( fn) B be an arbitrary sequence. ⊆ Recall that every compact metric space is separable. Hence, there exists a se- quence (xm)m 1 K which is dense in K. ≥ ⊆ Consider the sequence ( fn(x1)) Bx X. Since Bx is compact by assumption, ⊆ 1 ⊆ 1 there exists a subsequence ( fϕ (n)) of ( fn) such that limn ∞ fϕ (n)(x1) exists. 1 → 1 Consider next the sequence ( fϕ (x2)) Bx X. Since Bx is compact by as- 1(n) ⊆ 2 ⊆ 2 sumption, there exists a subsequence ( fϕ (n)) of ( fϕ (n)) such that limn ∞ fϕ (n)(x2) 2 1 → 2 exists. Note that we have also the existence of the limit limn ∞ fϕ (n)(x1). → 2 Iterating this argument, we obtain for every m 2 a subsequence ( fϕ ) of ≥ m(n) ∞ ( fϕm 1(n)) such that limn fϕm(n)(xi) exists for every 1 i m. These subse- quences− converge therefore→ pointwise at a finite number of el≤ements≤ of K.

We now consider the diagonal subsequence ( fϕ(n)) = ( fϕn(n)). This diagonal sub- sequence has the property of being a subsequence of ( fϕ ) for every m 1, up m(n) ≥ B.4 The Arzela-Ascoli theorem 1027

to a finite number of initial elements perhaps. It enjoys therefore the property that limn ∞ fϕ(n)(xm) exists for every m 1, that is, it converges pointwise on a dense → ≥ subset of K. We will show that ( fϕ(n)) converges everywhere and uniformly on K. Since C(K;X) is complete, it suffices to show that ( fϕ(n)) is a Cauchy sequence in C(K;X). Let ε > 0. Since B is equicontinuous at every x K, for every x K there exists ∈ ∈ δx > 0 such that for every y K and every f B the implication ∈ ∈ d(x,y) < δ f (x) f (y) < ε (B.2) ⇒ k − k δ is true. We clearly have K = x K B(x, x), and since K is compact, we find finitely ∈ k many points x′ , ..., x′ such that K = B(x′,δi) (with δi = δ ). Since the se- 1 k S i=1 i xi′ quence (xm) is dense in K, for every 1 i k there exists mi 1 such that S ≤ ≤ ≥ xm B(x ,δi). Since the sequence ( fϕ ) converges pointwise on (xm), there ex- i ∈ i′ (n) ists n0 0 such that ≥

for every n, n′ n0 and every 1 i k fϕ (xm ) fϕ (xm ) < ε. ≥ ≤ ≤ k (n) i − (n′) i k

Let now x K be arbitrary. Then x B(xi,δi) for some 1 i k. Hence, for every ∈ ∈ ≤ ≤ n, n n0, by the preceding estimate and by the implication (B.2), ′ ≥

fϕ (x) fϕ (x) fϕ (x) fϕ (x′) + k (n) − (n′) k≤ k (n) − (n) i k + fϕ (x′) fϕ (xm ) + k (n) i − (n) i k + fϕ (xm ) fϕ (xm ) + k (n) i − (n′) i k + fϕ (xm ) fϕ (x′) + k (n′) i − (n′) i k + fϕ (x′) fϕ (x) k (n′) i − (n′) k 5ε. ≤

Since n0 0 did not depend on x K, and since ε > 0 was arbitrary, this proves ≥ ∈ that ( fϕ(n)) is a Cauchy sequence in C(K;X). We have therefore proved that every sequence in B admits a convergent subsequence. Since B is closed, we obtain that B is sequentially compact, and hence compact.

Appendix C Calculus on Banach spaces

C.1 Differentiable functions between Banach spaces

Definition C.1. Let X, Y be two Banach spaces, and let U X be open. A function f : U Y is ⊂ → a) differentiable at x U if there exists a bounded linear operator T L (X,Y ) such that ∈ ∈ f (x + h) f (x) Th lim − − = 0, (C.1) h 0 h k k→ k k b) differentiable if it is differentiable at every point x U. ∈ If f is differentiable at a point x U, then T L (X,Y) is uniquely determined. We ∈ ∈ write D f (x) := f ′(x) := T and call D f (x)= f ′(x) the derivative of f at x. Lemma C.2. If a function f : U Y is differentiable at x U, then it is continuous at x. In particular, every differentiable→ function is continuous.∈

Proof. Let (xn) U be convergent to x. By definition (equation (C.1)) and continu- ⊂ ity of f ′(x),

f (xn) f (x) f (xn) f (x) f ′(x)(x xn) + f ′(x)(x xn) k − k≤k − − − k k − k 0, → as n ∞. → Definition C.3. Let X, Y be two Banach spaces, and let U X be open. A function ⊂ f : U Y is called continuously differentiable if it is differentiable and if f ′ : U L (X,→Y ) is continuous. We denote by →

1 C (U;Y ) := f : U Y : f differentiable and f ′ C(U;L (X,Y)) { → ∈ } the space of all continuously differentiable functions. Moreover, for k 2, we de- note by ≥

1029 1030 C Calculus on Banach spaces

k k 1 C (U;Y) := f : U Y : f differentiable and f ′ C − (U;L (X,Y )) { → ∈ } the space of all k times continuously differentiable functions. n Definition C.4. Let Xi (1 i n) and Y be Banach spaces. Let U i=1 Xi be open. We say that a function≤ f≤: U Y is at a = (a ) U partially⊂ differen- i 1 i n N tiably with respect to the i-th coordinate→ if the function ≤ ≤ ∈

fi : Ui Xi Y, xi f (a1,...,xi,...,an) ⊂ → 7→ ∂ f is differentiable in ai. We write ∂ (a) := f ′(ai) L (Xi,Y ). xi i ∈

C.2 Local inverse theorem and implicit function theorem

Let X and Y be two Banach spaces and let U X be an open subset. The following are two classical theorems in differential calculus.⊆ Theorem C.5 (Local inverse theorem). Let f : U Y be continuously differen- → tiable and x¯ U such that f ′(x¯) : X Y is an isomorphism, that is, bounded, bi- jective and the∈ inverse is also bounded.→ Then there exist neighbourhoods V U of x¯ andW Y of f (x¯) such that f : V W is a C1 diffeomorphism, that is⊂ f is ⊂ → 1 continuously differentiable, bijective and the inverse f − : W V is continuously differentiable, too. →

Theorem C.6 (Implicit function theorem). Assume that X = X1 X2 for two Ba- × nach spaces X1, X2, and let f : X U Y be continuously differentiable. Let ∂ f ⊃ → x¯ = (x¯1,x¯2) U be such that ∂ (x¯) : X2 Y is an isomorphism. Then there exist ∈ x2 → neighbourhoodsU1 X1 of x¯1 andU2 X2 of x¯2,U1 U2 U, and a continuously ⊂ ⊂ × ⊂ differentiable function g : U1 U2 such that →

x U1 U2 : f (x)= f (x¯) = (x1,g(x1)) : x1 U1 . { ∈ × } { ∈ } For the proof of the local inverse theorem, we need the following lemma. Lemma C.7. Let f : U Y be continuously differentiable such that f : U f (U) is a homeomorphism, that→ is, continuous, bijective and with continuous inverse.→ Then 1 f is a C diffeomorphism if and only if for every x U the derivative f ′(x) : X Y is an isomorphism. ∈ → Proof. Assume first that f is a C1 diffeomorphism. When we differentiate the iden- 1 1 tities x = f − ( f (x)) and y = f ( f − (y)), which are true for every x U and every y f (U), then we find ∈ ∈ 1 IX = ( f − )′( f (x)) f ′(x) for every x U and 1 1 ∈ IY = f ′( f − (y))( f − )′(y) 1 1 = f ′(x)( f − )′( f (x)) for every x = f − (y) U. ∈ C.2 Local inverse theorem and implicit function theorem 1031

As a consequence, f ′(x) is an isomorphism for every x U. For the converse, assume that f (x) is an isomorphism∈ for every x U. For every ′ ∈ x1, x2 U one has, by differentiability, ∈

f (x2)= f (x1)+ f ′(x1)(x2 x1)+ o(x2 x1), − − o(x2 x1) 1 where o depends on x1 and limx x − = 0. We have x1 = f − (y1) and x2 = 2→ 1 x2 x1 1 k − k f − (y2) if we put yi := f (xi). Hence, the above identity becomes

1 1 1 1 1 y2 = y1 + f ′( f − (y1))( f − (y2) f − (y1)) + o( f − (y2) f − (y1)). − − 1 1 To this identity, we apply the inverse operator ( f ′( f − (y1)))− and we obtain

1 1 1 1 1 1 1 1 f − (y2)= f − (y1)+( f ′( f − (y1)))− (y2 y1) ( f ′( f − (y1)))− o( f − (y2) f − (y1)). − − − 1 Since f − is continuous, the last term on the right-hand side of the last equality is 1 sublinear. Hence, f − is differentiable and

1 1 1 ( f − )′(y1) = ( f ′( f − (y1)))− .

1 1 From this identity (using that f − and f ′ are continuous) we obtain that f − is continuously differentiable. The claim is proved.

Proof (Proof of the local inverse theorem). Consider the function

g : U X, → 1 x f ′(x¯)− f (x). 7→ It suffices to show that g : V W is a C1 diffeomorphism for appropriate neigh- bourhoods V ofx ¯ and W of g(→x¯). Consider also the function

ϕ : U X, → x x g(x). 7→ − This function ϕ is continuously differentiable and ϕ (x)= I f (x¯) 1 f (x) for every ′ − ′ − ′ x U. In particular, ϕ′(x¯)= 0. By continuity of ϕ′, there exists r > 0 and L < 1 such that∈ ϕ (x) L for every x B¯(x¯,r) U. Hence, k ′ k≤ ∈ ⊂

ϕ(x1) ϕ(x2) L x1 x2 for every x1, x2 B¯(x¯,r). k − k≤ k − k ∈ By the definition of ϕ, this implies

g(x1) g(x2) = x1 x2 (ϕ(x1) ϕ(x2)) (C.2) k − k k − − − k x1 x2 L x1 x2 ≥k − k− k − k = (1 L) x1 x2 . − k − k 1032 C Calculus on Banach spaces

We claim that for every y B¯(g(x¯),(1 L)r) there exists a unique x B¯(x¯,r) such that g(x)= y. ∈ − ∈ The uniqueness follows from (C.2). In order to prove existence, let x0 = x¯, and then define recursively xn+1 = y + 1 ϕ(xn)= y + xn f (x¯) f (xn) for every n 0. Then − ′ − ≥ n 1 − xn x¯ = ∑ xk+1 xk k − k k k=0 − k n 1 − x1 x0 + ∑ ϕ(xk) ϕ(xk 1) ≤k − k k=1 k − − k n 1 − k ∑ L x1 x0 ≤ k=0 k − k 1 Ln = − y g(x¯) 1 L k − k − (1 Ln)r r, ≤ − ≤ which implies xn B¯(x¯,r) for every n 0. Similarly, for every n m 0, ∈ ≥ ≥ ≥ n 1 − k xn xm ∑ L y g(x¯) , k − k≤ k=m k − k so that the sequence (xn) is a Cauchy sequence in B¯(x¯,r). Since B¯(x¯,r) is complete, there exists limn ∞ xn =: x B¯(x¯,r). By continuity, → ∈ x = y + ϕ(x)= y + x g(x), − or g(x)= y. This proves the above claim, that is, g is locally invertible. It remains to show that 1 1 g− is continuous (then g is a homeomorphism, and therefore a C diffeomorphism by Lemma C.7). Contiunity of the inverse function, however, is a direct consequence of (C.2) (which even implies Lipschitz continuity). Remark C.8. The iteration formula

1 xn 1 = y + xn f ′(x¯)− f (xn) + − used in the proof of the local inverse theorem in order to find a solution of g(x)= 1 f ′(x¯)− f (x)= y should be compared to the discrete Newton iteration

1 xn 1 = y + xn f ′(xn)− f (xn); + − see Theorem C.11 below. Proof (Proof of the implicit function theorem). Consider the function C.2 Local inverse theorem and implicit function theorem 1033

F : U X1 Y, → × (x1,x2) (x1, f (x1,x2)). 7→ Then F is continuously differentiable and

∂ f ∂ f F′(x¯)(h1,h2) = (h1, (x¯)h1 + (x¯)h2). ∂x1 ∂x2

In particular, by the assumption, F′(x¯) is locally invertible with inverse ∂ ∂ 1 f 1 f F′(x¯)− (y1,y2) = (y1,( (x¯))− (y2 (x¯)y1)). ∂x2 − ∂x1

By the local inverse theorem (Theorem C.5), there exists a neighbourhood U1 ofx ¯1, a neighbourhood U2 ofx ¯2 and a neighbourhood V of (x¯1, f (x¯)) = F(x¯) such that 1 F : U1 U2 V is a C diffeomorphism. The inverse is of the form × → 1 F− (y1,y2) = (y1,h2(y1,y2)), where h2 is a function such that f (y1,h2(y1,y2)) = y2. Let

U˜1 := x1 U1 : (x1, f (x¯)) V . { ∈ ∈ }

Then U˜1 is open by continuity of the function x1 (x1, f (x¯)), andx ¯1 U˜1. We 7→ ∈ restrict F to U˜1 U2, and we define ×

g : U˜1 X2, (C.3) → 1 x1 g(x1)= F− (x1, f (x¯))2, 7→ 1 1 where F ( )2 denotes the second component of F ( ). Then g is continuously dif- − · − · ferentiable, g(U˜1) U2 and g satisfies the required property of the implicit function. ⊂ Lemma C.9 (Higher regularity of the local inverse). Let f Ck(U;Y) for some k 1 and assum that f : U f (U) is a C1 diffeomorphism. Then∈ f is a Ck diffeo- ≥ 1 → morphism, that is, f − is k times continuously differentiable.

Proof. For every y f (U) we have ∈ 1 1 1 ( f − )′(y)= f ′( f − (y))− .

The proof therefore follows by induction on k.

Lemma C.10 (Higher regularity of the implicit function). If, in the implicit func- tion theorem (Theorem C.6), the function f is k times continuously differentiable, then the implicit function g is also k times continuously differentiable.

Proof. This follows from the previous lemma (Lemma C.9) and the definition of the implicit function in the proof of the implicit function theorem. 1034 C Calculus on Banach spaces C.3 * Newton’s method

Theorem C.11 (Newton’s method). Let X andY be two Banachspaces,U X an open set. Let f C1(U;Y) and assume that there exists x¯ U such that (i) f ⊂(x¯)= 0 and (ii) f (x¯) ∈L (X,Y ) is an isomorphism. Then there∈ exists a neighbourhood ′ ∈ V U of x¯ such that for every x0 V the operator f (x0) is an isomorphism, the ⊂ ∈ ′ sequence (xn) defined iteratively by

1 xn 1 = xn f ′(xn)− f (xn), n 0, (C.4) + − ≥

remains in V and limn ∞ xn = x.¯ → Proof. By Corollary B.42 and continuity, there exists a neighbourhood V˜ U of ⊂ x¯ such that f ′(x) is isomorphic for all x V˜ . Next, it will be useful to define the auxiliary function ϕ : V˜ X by ∈ → 1 ϕ(x) := x f ′(x)− f (x), x V˜ . − ∈ Since f (x¯)= 0, we find that for every x V˜ ∈ 1 ϕ(x) ϕ(x¯)= x f ′(x)− ( f (x) f (x¯)) x¯ − − 1 − − = x x¯ f ′(x)− ( f ′(x¯)(x x¯)+ r(x x¯)), − − − − so that by the continuity of f ( ) 1 ′ · − ϕ(x) ϕ(x¯) lim k − k = 0. x x¯ x x¯ → k − k Hence, there exists r > 0 such that V := B(x¯,r) V˜ U and such that for every x V ⊂ ⊂ ∈ 1 ϕ(x) x¯ = ϕ(x) ϕ(x¯) x x¯ . k − k k − k≤ 2 k − k

This implies that for every x0 V one has ϕ(x0) V and if we define iteratively n+1 ∈ ∈ xn+1 = ϕ(xn)= ϕ (x0), then

1 n xn x¯ x0 x¯ 0 as n ∞. k − k≤ 2 k − k → →  Appendix D Hilbert spaces

Let H be a vector space over K.

D.1 Inner product spaces

Definition D.1. A function , : H H K is called an inner product if for every x, y, z H and every λ Kh· ·i × → ∈ ∈ (i) x,x 0 for every x H and x,x = 0 if and only if x = 0, h i≥ ∈ h i (ii) x,y = y,x , h i h i (iii) λx + y,z = λ x,z + y,z . h i h i h i A pair (H, , ) of a vector space over K and a scalar product is called an inner product spaceh· ·i.

Example D.2. 1. On the space H = Kd,

d x,y := ∑ xiy¯i h i i=1 defines an inner product. 2 2 2. On the space H = l := (xn) K : ∑ xn < ∞ , { ⊂ | | }

x,y := ∑xny¯n h i n defines an inner product. 3. On the space H = C([0,1]), the

1 f ,g := f (x)g(x) dx h i 0 Z

1035 1036 D Hilbert spaces

defines an inner product. 4. On the space H = L2(Ω), the integral

f ,g := f gd¯ µ h i Ω Z defines an inner product. Lemma D.3. Let , be an inner product on a vector space H. Then, for every x, y, z H and λ Kh· ·i ∈ ∈ (iv) x,λy + z = λ¯ x,y + x,z . h i h i h i Proof.

x,λy + z = λy + z,x = λ¯ y,x + z,x = λ¯ x,y + x,z . h i h i h i h i h i h i In the following, if H is an , then we put

x := x,x , x H. k k h i ∈ Lemma D.4 (Cauchy-Schwarz inequality).p Let H be an inner product space. Then, for every x, y H, ∈ x,y x y , |h i|≤k kk k and equality holds if and only if x and y are colinear. Proof. Let λ K. Then ∈ 0 x + λy,x + λy ≤ h i = x,x + λy,x + x,λy + λ 2 y,y h i h i h i | | h i = x,x + λ x,y + λ¯ x,y + λ 2 y,y , h i h i h i | | h i that is, 0 x + λy 2 = x 2 + 2Reλ¯ x,y + λ 2 y 2. (D.1) ≤k k k k h i | | k k Assuming that y = 0 (for y = 0 the Cauchy-Schwarz inequality is trivial), we may put λ := x,y /6 y 2. Then −h i k k x,y x,y 0 x h iy,x h i y ≤ h − y 2 − y 2 i k k k k x,y 2 = x 2 |h i| , k k − y 2 k k which is the Cauchy-Schwarz inequality. The calculation also shows that equality holds if and only if x = λy, that is, if x and y are colinear. Lemma D.5. Every inner product space H is a normed linear space for the norm

x = x,x , x H. k k h i ∈ p D.1 Inner product spaces 1037

Proof. Properties(i) and (ii) in the definition of a norm follow from the properties (i) and (iii) (together with Lemma D.3) in the definition of an inner product. The only difficulty is to show that satisfies the triangle inequality. This, however, follows from putting λ = 1 in (D.1)k·k and estimating with the Cauchy-Schwarz inequality:

x + y 2 ( x + y )2. k k ≤ k k k k Definition D.6. A complete inner product space is called a Hilbert space.

Example D.7. The spaces Kd (with euclidean inner product), l2 and L2(Ω) are Hilbert spaces. More examples are given by the Sobolev spaces defined below.

Lemma D.8 (Completion of an inner product space). Let H be an inner product space. Then there exists a Hilbert space K and a bounded linear operator j : H K such that for every x, y H → ∈

x,y H = j(x), j(y) K , h i h i and such that j(H) is dense in K. The Hilbert space K is unique up to isometry. It is called the completion of H.

Lemma D.9 (Parallelogram identity). Let H be an inner product space. Then for every x, y H ∈ x + y 2 + x y 2 = 2( x 2 + y 2). k k k − k k k k k Proof. The parallelogram identity follows immediately from (D.1) by putting λ = 1 and adding up. ± Exercise D.10 (von Neumann) Show that a norm satisfying the parallelogram identity comes from a scalar product. That means, the parallelogram identity char- acterises inner product spaces.

Definition D.11. A subset K of a vector space X (over K) is convex if for every x, y K and every t [0,1] one has tx + (1 t)y K. ∈ ∈ − ∈ Theorem D.12 (Projection onto closed, convex sets). Given a nonempty closed, convex subset K of a Hilbert space H, and given a point x H, there exists a unique y K such that ∈ ∈ x y = inf x z : z K . k − k {k − k ∈ } Proof. Let d := inf x z : z K , and choose (yn) K such that {k − k ∈ } ∈

lim x yn = d. (D.2) n ∞ → k − k

Applying the parallelogram identity to (x yn)/2 and (x ym)/2, we obtain − − yn + ym 2 1 2 1 2 2 x + yn ym = ( x yn + x ym ). k − 2 k 4k − k 2 k − k k − k 1038 D Hilbert spaces

Since K is convex, yn+ym K and hence x yn+ym 2 d2. Using this and (D.2), 2 ∈ k − 2 k ≥ the last identity implies that (yn) is a Cauchy sequence. Since H is complete, y := limn ∞ yn exists. Since K is closed, y K. Moreover, x y = limn ∞ x yn = d, so that→ y is a minimizer for the distance∈ to x. To seek − thatk there→ is onlyk − onk such minimizer, suppose that y′ K is a second one, and apply the parallelogram identity to x y and x y . ∈ − − ′ Definition D.13. Let H be an inner product space. We say that two vectors x, y H are orthogonal (and we write x y), if x,y = 0. Given a subset S H, we define∈ the orthogonal space S := y ⊥H : x hy fori all x S .If S = K is a⊂ linear subspace ⊥ { ∈ ⊥ ∈ } of H, then we call K⊥ also the orthogonal complement of K. Theorem D.14. Let H be a Hilbert space, S H be a subset and K a closed linear subspace. Then: ⊂

a) S⊥ is a closed linear subspace of H, b) K andK are complementary subspaces, i.e. every x H can be decomposed ⊥ ∈ uniquely as a sum of an x0 K andanx1 K , ∈ ∈ ⊥ c) (K⊥)⊥ = K and (S⊥)⊥ = spanS. d) spanS is densein H if andonlyif S = 0 . ⊥ { }

Proof. (a) It follows from the bilinearity of the inner product that S⊥ is a linear subspace of H. Let (yn) S⊥ be convergent to some y H. Then, for every x S, by the Cauchy-Schwarz inequality,∈ ∈ ∈

x,y = lim x,yn = 0, n ∞ h i → h i that is, y S and therefore S is closed. ∈ ⊥ ⊥ (b) For every x H we let x0 K be the unique element (Theorem D.12) such that ∈ ∈ x x0 = inf x y : y K . k − k {k − k ∈ } Put x1 = x x0. For every y K and every λ K, by the minimum property of x0, − ∈ ∈ 2 2 x1 x1 λy k k ≤k − k 2 ¯ 2 2 = x1 2Reλ x1,y + λ y . k k − h i | | k k

This implies that x1,y = 0, that is, x1 K . Every decomposition x = x0 +x1 with h i ∈ ⊥ x0 K and x1 K⊥ is unique since x K K⊥ implies x,x = 0, that is, x = 0. ∈(c) and (d) follow∈ immediately from∈ (a)∩ and (b). h i

Lemma D.15 (Pythagoras). Let H be an inner product space. Whenever x, y H are orthogonal, then ∈ x + y 2 = x 2 + y 2. k k k k k k Proof. The claim follows from (D.1) and putting λ = 1. D.2 Orthogonal decomposition 1039

Definition D.16. Let X be a normed space. We call an operator P : X X a projec- tion if P2 = P. → Lemma D.17. Let X bea normedspaceandlet P L (X) be a bounded projection. Then the following are true: ∈ a) Q = I P is a projection. − b) Either P = 0 or P 1. k k≥ c) The kernel kerP and the range ranP are closed in X.

d) Every x X can be decomposed uniquely as a sum of an x0 kerP and an ∈ ∈ x1 ranP, and X = kerP ranP. ∈ ∼ ⊕ Proof. (a) Q2 = (I P)2 = I 2P+ P2 = I P = Q. (b) follows from− P = P−2 P 2. − k k k k≤k k 1 (c) Since 0 is closed in X and since P is continuous, kerP = P− ( 0 ) is closed. Similarly, ran{P}= ker(I P) is closed. { } − (d) For every x X we can write x = Px + (I P)x = x1 + x2 with x1 ranP and ∈ − ∈ x2 kerP. The decomposition is unique since if x kerP ranP, then x = Px = 0. This∈ proves that the vector spaces X and kerP ran∈P are isomorphic.∩ That they are also isomorphic as normed spaces follows from⊕ the continuity of P. Lemma D.18. LetH bea HilbertspaceandK H be a closed linear subspace. For ⊂ every x H weletx1 = Px be the unique element in K which minimizes the distance to x (Theorem∈ D.12). Then P : H H is a bounded projection satisfying ranP = K. → Moreover, kerP = K⊥. We call P the orthogonal projection onto K.

D.2 Orthogonal decomposition

Definition D.19. We call a metric space separable if there exists a countable dense subset. Example D.20. The space Rd (or Cd) is separable: one may take Qd as an example of a dense countable subset. It is not too difficult to see that subsets of separable metric spaces are separable (note, however, that in general the dense subset has to be constructed carefully), and that finite products of separable metric spaces are separable. Lemma D.21. A normed space X is separable if and only if there exists a sequence (xn) X such that span xn : n N is dense in X (such a sequence is in general called⊂ a total sequence). { ∈ }

Proof. If X is separable, then there exists a sequence (xn) X such that xn : n N ⊂ { ∈ } is dense. In particular, the larger set span xn : n N is dense. { ∈ } If, one the other hand, there exists a total sequence (xn) X,andifweput D = Q in the case K = R and D = Q + iQ in the case K = C, then⊂ the set 1040 D Hilbert spaces

m λ λ ∑ ixni : m N, i D, ni N {i=1 ∈ ∈ ∈ }

is dense in X (in fact, the closure contains all finite linear combinationsofthe xn, that is, it contains span xn : n N ). It is an exercise to show that this set is countable. The claim follows.{ ∈ }

Corollary D.22. The space (C([0,1]), ∞) is separable. k·k Proof. By Weierstrass’ theorem, the subspace of all polynomials is dense in C([0,1]) (Weierstrass’ theorem says that every continuous function f : [0,1] R can be uniformly approximated by polynomials). The polynomials, however, are→ the n linear span of the monomials fn(t)= t . The claim therefore follows from Lemma D.21.

p Corollary D.23. The space l is separable if 1 p < ∞. The space c0 is separable. ≤ p p Proof. Let en = (δnk)k l be the n-th in l (here δnk denotes the Kro- ∈ necker symbol: δnk = 1 if n = k and δnk = 0 otherwise). Then span en : n N = c00 (the space of all finite sequences) is dense in l p if 1 p < ∞. The{ claim∈ for}l p fol- ≤ lows from Lemma D.21. The argument for c0 is similar. Lemma D.24. The space l∞ is not separable.

Proof. The set 0,1 N l∞ of all sequences taking only values 0 or 1 is uncount- able. Moreover,{ whenever} ⊂ x, y 0,1 N, x = y, then ∈{ } 6 x y ∞ = 1. k − k 1 N 1 Hence, the balls B(x, 2 ) with centers x 0,1 and radius 2 are mutually disjoint. If l∞ was separable, that is, if there exists∈{ a dense} countable set D l∞, then in each B(x, 1 ) there exists at least one element y D, a contradiction. ⊂ 2 ∈ Definition D.25. Let H be an inner product space. A family (el )l I H is called ∈ ⊂ a) an orthogonal system if (el,ek)= 0 whenever l = k, 6 b) an orthonormal system if it is an orthogonal system and el = 1 for every l I, and k k ∈ c) an if it is an orthonormal system and span el : l I is dense in H. { ∈ }

Lemma D.26 (Gram-Schmidt process). Let (xn) be a sequence in an inner prod- uct space H. Then there exists an orthonormal system (en) such that span xn = { } span en . { } Proof. Passing to a subsequence, if necessary, we may assume that the (xn) are linearly independent. Let e1 := x1/ x1 . Then e1 and x1 span the same linear subspace. Next, assume k k that we have constructed an orthonormal system (ek)1 k n such that ≤ ≤ D.2 Orthogonal decomposition 1041

span xk :1 k n = span ek :1 k n . { ≤ ≤ } { ≤ ≤ } n Let e := xn 1 ∑ xn 1,ek ek. Since the xn are linearly independent, we n′ +1 + − k=1h + i find e = 0. Let en 1 := e / e . By construction, for every 1 k n, n′ +1 6 + n′ +1 k n′ +1k ≤ ≤ en 1,ek = 0, and h + i

span xk :1 k n + 1 = span ek :1 k n + 1 . { ≤ ≤ } { ≤ ≤ } Proceeding inductively, the claim follows.

Corollary D.27. Every separable inner product space admits an orthonormal basis.

Example D.28. Consider the inner product space C([ 1,1]) equiped with the scalar 1 − n product f ,g = 1 f (t)g(t) dt and resulting norm 2. Let fn(t) := t (n 0), so h i − k·k ≥ that span fn is the space of all polynomials on the interval [ 1,1]. Applying the { } R − Gram-Schmidt process to the sequence ( fn) yields a orthonormal sequence (pn) of polynomials. The pn are called Legendre polynomials. Recall that the space of all polynomials is dense in C([ 1,1]) by Weierstrass’ − theorem (even for the uniform norm; a fortiori also for the norm 2). Hence, the Legendre polynomials form an orthonormal basis in C([ 1,1]). k·k − Lemma D.29 (Bessel’s inequality). Let H be an inner product space, (en)n N H an orthonormal system. Then, for every x H, ∈ ⊂ ∈ 2 2 ∑ x,en x . n N |h i| ≤k k ∈ ∑N Proof. Let N N. Put xN = x n=1 x,en en so that xN en for every 1 n N. By Pythagoras∈ (Lemma D.15),− h i ⊥ ≤ ≤

N 2 2 2 x = xN + ∑ x,en en k k k k k n=1h i k N 2 2 = xN + ∑ x,en k k n=1 |h i| N 2 ∑ x,en . ≥ n=1 |h i| Since N was arbitrary, the claim follows.

Lemma D.30. Let H be a (separable) Hilbert space, (en)n N H an orthonormal system. Then: ∈ ⊂

a) For every x H, the series ∑n N x,en en converges. ∈ ∈ h i b) P : H H,x ∑n N x,en en is the orthogonal projection onto span en : n N . → 7→ ∈ h i { ∈ } 1042 D Hilbert spaces

Proof. (a) Let x H. Since (en) is an orthonormal system, by Pythagoras (Lemma D.15), for every ∈l > k 1, ≥ l k l 2 2 ∑ x,en en ∑ x,en en = ∑ x,en en k n=1h i − n=1h i k k n=k+1h i k l 2 = ∑ x,en . n=k+1 |h i|

l Hence, by Bessel’s inequality, the sequence (∑ x,en en) of partial sums forms a n=1h i Cauchy sequence. Since H is complete, the series ∑n N x,en en converges. (b) is an exercise. ∈ h i

Theorem D.31. Let H be a (separable) Hilbert space, (en)n N an orthonormal sys- tem. Then the following are equivalent: ∈

(i) (en)n N is an orthonormal basis. ∈ (ii) If x en for every n N, then x = 0. ⊥ ∈ (iii) x = ∑n N x,en en for every x H. ∈ h i ∈ (iv) x,y = ∑n N x,en en,y for every x, y H. h i ∈ h ih i ∈ (v) (Parseval’s identity) For every x H, ∈ 2 2 x = ∑ x,en . k k n N |h i| ∈ Proof. (i) (ii) follows from Theorem D.14. ⇒ (ii) (iii) follows from Lemma D.30 (i). In fact, let x0 = ∑n N x,en en (which ⇒ ∈ h i exists by Lemma D.30 (i)). Then x x0,en = 0 for every n N, and by assumption h − i ∈ (ii), this implies x = x0. (iii) (iv) follows when multiplying x scalarly with y, applying also the Cauchy- ⇒ 2 Schwarz inequality for the sequences ( x,el ), ( el,y ) l . (iv) (v) follows from putting x = yh. i h i ∈ ⇒ 2 (v) (i). Let x span en : n N . Then Parseval’s identity implies x = 0, ⇒ ∈ { ∈ }⊥ k k that is, x = 0. By Theorem D.14, span en : n N is dense in H, that is, (en) is an orthonormal basis. { ∈ }

Definition D.32. A bounded linear operator U L (H,K) between two Hilbert spaces is called a unitary operator if it is invertible∈ and for every x, y H, ∈

x,y H = Ux,Uy K. h i h i Two Hilbert spaces H and K are unitarily equivalent if there exists a unitary operator U L (H,K). ∈ Corollary D.33. Every infinite dimensional separable Hilbert space H is unitarily equivalent to l2. D.2 Orthogonal decomposition 1043

Proof. Choose an orthonormalbasis (en)n N of H (which exists by Corollary D.27), 2 ∈ and define U : H l by U(x) = ( x,en )n N. Then x,y H = U(x),U(y) l2 by Theorem D.31; in→ particular, U is bounded,h i isometric∈ andh injective.i h The fact thati U 2 is surjective, that is, that ∑n cnen converges for every c = (cn) l , follows as in the proof of Lemma D.30 (i). ∈

Clearly, if a sequence (en) in a Hilbert space H is an orthonormal basis, then necessarily H is separable by Lemma D.21. Hence, the equivalent statements of Theorem D.31 are only satisfied in separable Hilbert spaces. In most of the applica- tions (if not all!), we will only deal with separable Hilbert spaces so that Theorem D.31 is sufficient for our purposes. However, what is true in general Hilbert spaces? The following sequence of re- sults generalizes the preceeding results to arbitrary Hilbert spaces.

Definition D.34. Let X be a normed space, (xi)i I be a family. We say that the series ∈ ∑i I xi converges unconditionally if the set I0 := i I : xi = 0 is countable, and ∈ ∞ { ∈ 6 } for every bijective ϕ : N I0 the series ∑ xϕ converges. → n=1 (n) Corollary D.35 (Bessel’s inequality, general case). Let H be an inner product space, (el)l I H an orthonormal system. Then, for every x H, the set l I : ∈ ⊂ ∈ { ∈ x,el = 0 is countable and h i 6 } 2 2 ∑ x,el x . (D.3) l I |h i| ≤k k ∈

Proof. By Bessel’s inequality, the sets l I : x,el 1/n must be finite for { ∈ |h i| ≥ } every n N. The countability of l I : x,el = 0 follows. The inequality (D.3) is then a∈ direct consequence of Bessel’s{ ∈ inequality.h i 6 }

Lemma D.36. Let H be a Hilbert space, (el)l I H an orthonormal system. Then: ∈ ⊂ a) For every x H, the series ∑l I x,el el converges unconditionally. ∈ ∈ h i b) P : H H,x ∑l I x,el el is the orthogonal projection onto span el : l I . → 7→ ∈ h i { ∈ } Corollary D.37. Every Hilbert space admits an orthonormal basis.

Proof. If H is separable, the claim follows directly from the Gram-Schmidt process and has already been stated in Corollary D.27. In general, one may argue as follows: The set of all orthonormal systems in H forms a partially ordered set by inclu- sion. Given a totally ordered collection of orthonormal systems, the union of all vectors contained in all systems in this collection forms a supremum. By Zorn’s lemma, there exists an orthonormal system (el )l I which is maximal. It follows from Bessel’s inequality (D.3) that this system is actually∈ an orthonormal basis.

Theorem D.31 remains true for arbitrary Hilbert spaces when replacing the countable orthonormal system (en)n N by an arbitrary orthonormal system (el )l I . ∈ ∈ 1044 D Hilbert spaces D.3 *

In the following we will identify the space L1(0,2π) with

2π 1 π λ ∞ L2π (R) := f : R C measurable, 2 -periodic : f d < . { → 0 | | } Z 2 π 2 Similarly, we identify L (0,2 ) with L2π (R), and we define

C2π (R) := f C(R) : f is 2π-periodic . { ∈ } 1 1 Definition D.38. For every f L (0,2π)= L π (R) and every n Z we call ∈ 2 ∈ 1 2π fˆ(n) := f (t)e int dt 2π − Z0 the n-th Fourier coefficient of f . The sequence fˆ = ( fˆ(n)) is called the Fourier 1 ∑ ˆ in transform of f . The formal series π n Z f (n)e · is called the Fourier series of √2 ∈ f . 1 π 1 ˆ ∞ Lemma D.39. For every f L (0,2 )= L2π (R) we have f l (Z) and the Fourier transformˆ: L1(0,2π) l∞∈is a bounded, linear operator. More∈ precisely, → 1 1 fˆ ∞ f 1, f L (0,2π). k k ≤ 2π k k ∈ Proof. For every f L1(0,2π) and every n Z, ∈ ∈ 2π 2π 1 int 1 fˆ(n) = f (t)e− dt f (t) dt. | | 2π | 0 |≤ 2π 0 | | Z Z ∞ This proves that fˆ l and the required bound on fˆ ∞. Linearity ofˆis clear. ∈ k k 1 1 Lemma D.40 (Riemann-Lebesgue). For every f L (0,2π)= L π (R) we have ∈ 2 fˆ c0(Z), i.e. ∈ lim fˆ(n) = 0. n ∞ | | | |→ 1 1 Proof. Let f L (0,2π)= L π (R) and n Z, n = 0. Then ∈ 2 ∈ 6 1 2π fˆ(n)= f (t)e int dt 2π − Z0 2π 1 int iπ n = f (t)e− (1 e n ) dt 4π 0 − Z 2π π 1 int in(t ) = f (t)(e− e− − n ) dt 4π 0 − Z 2π π 1 int = ( f (t) f (t + ))e− dt, 4π 0 − n Z D.3 * Fourier series 1045 so that 1 2π π fˆ(n) f (t) f (t + ) dt. | |≤ 4π 0 | − n | Z 1 Hence, if f = 1O L (0,2π) for some open set O [0,2π], then fˆ c0(Z) by ∈ ⊂ ∈ Lebesgue dominated convergence theorem. On the other hand, since span 1O : O [0,2π] open is dense in L1(0,2π), since the Fourier transform is bounded{ with⊂ ∞ } ∞ values in l (Z) (Lemma D.39), and since c0(Z) is a closed subspace of l (Z), we 1 find that fˆ c0(Z) for every f L (0,2π). ∈ ∈ Remark D.41. At the end of the proof of the Lemma of Riemann-Lebesgue, we used the following general principle: if T L (X,Y ) is a bounded linear operator between two normed linear spaces X, Y , and∈ if M X is dense, then ranT T (M). ∞ ⊂ ⊂ We used in addition that c0(Z) is closed in l (Z).

Theorem D.42. Let f C2π (R) be differentiable in some point s R. Then ∈ ∈ f (s)= ∑ fˆ(n)eins. n Z ∈

Proof. Note that for fs(t) := f (s +t),

1 2π 1 2π fˆ (n)= f (s +t)e int dt = f (t)e in(t s) dt = eins fˆ(n). s 2π − 2π − − Z0 Z0 Hence, replacing f by fs, if necessary, we may without loss of generality assume that s = 0. Moreover, replacing f by f f (0), if necessary, we may without loss of generality assume that f (0)= 0. We hence− have to show that if f is differentiable in ˆ 0 and if f (0)= 0, then ∑n Z f (n)= 0. f (t) ∈ π Let g(t) := 1 eit . Since f is differentiable in 0, f (0)= 0, and since f is 2 - − periodic, the function g belongs to C2π (R). By the Lemma of Riemann-Lebesgue, gˆ c0(Z). Note that ∈ 2π 1 it int fˆ(n)= g(t)(1 e )e− dt = gˆ(n) gˆ(n 1). 2π 0 − − − Z Hence,

n n ∑ fˆ(k)= ∑ gˆ(k) gˆ(k 1) k= n k= n − − − − = gˆ(n) gˆ( n 1) 0 (n ∞). − − − → → This is the claim.

1 1 Corollary D.43. For every f C π (R) := C2π (R) C (R) and every t R ∈ 2 ∩ ∈ f (t)= ∑ fˆ(n)eint . n Z ∈ 1046 D Hilbert spaces

Remark D.44. We will see that the convergence in the preceeding corollary is even uniform in t R. ∈ 2 π 2 Throughoutthe following, we equip the space L (0,2 )= L2π (R) with the scalar product given by 1 2π f ,g := f (t)g(t) dt, h i 2π 0 Z 1 which differs from the usual scalar product by the factor 2π .

1 2 Lemma D.45. The space C2π (R) is dense in L2π (R). π 2 π 2 Proof. We first prove that C([0,2 ]) is dense in L (0,2 )= L2π (R). For this, con- 2 π π sider first a characteristic function f = 1(a,b) L (0,2 ). Let (gn) C([0,2 ]) be defined by ∈ ⊂ 1, t [a,b], ∈  1 + n(t a), t [a 1/n,a), − ∈ − gn(t) :=   1 n(t b), t (b,b + 1/n],  − − ∈ 0, else.    π k·kL2 It is then easy to see that limn ∞ f gn L2 = 0, so that f = 1(a,b) C([0,2 ]) . → k − k ∈ In the second step, consider a characteristic function f = 1A of an arbitrary Borel set A B([0,2π]), and let ε > 0. By outer regularity of the Lebesgue mea- sure, there exists∈ an open set O A such that λ(O A) < ε2. Recall that O is the countable union of mutually disjoint⊃ intervals. Since\O has finite measure, there ex- ist finitely many (mutually disjoint) intervals (an,bn) O (1 n N) such that λ N ε2 ⊂ ≤ ≤ (O n=1(an,bn)) . By the preceeding step, for every 1 n N there exists \ ≤ ε N ≤ ≤ gn C([0,2π]) such that 1 gn 2 . Let g := ∑ gn C([0,2π]). Then ∈ S k (an,bn) − k ≤ N n=1 ∈

f g 2 1A 1O 2 + 1O 1 N a b 2 + 1 N a b g 2 k − k ≤k − k k − n=1( n, n)k k n=1( n, n) − k N S S ε ε + + ∑ (1(an,bn) gn) 2 ≤ k n=1 − k 3ε. ≤

L2 This proves 1A C([0,2π])k·k for every Borel set A B([0,2π]). Since span 1A : A B([0,2π])∈= L2(0,2π), we find that C([0,2π]) is∈ dense in L2(0,2π). { ∈ } 1 π It remains to show that C2π (R) is dense in C([0,2 ]) for the norm 2. So let f C([0,2π]) and let ε > 0. By Weierstrass’ theorem, there exists ak·k function ∈∞ 1 g0 C ([0,2π]) (even a polynomial!) such that f g0 ∞ ε. Let g1 C ([0,2π]) ∈ π π k π− k ≤ ∈ π be such that g1(2 )= g1′ (2 )= 0, g1(0)= g0(2 ) g0(0) and g1′ (0)= g0′ (2 ) ε − − g0′ (0) and g1 2 . Such a function g1 exists: it suffices for example to consider functions fork whichk ≤ the derivative is of the form D.3 * Fourier series 1047

g0(2π) g0(0)+ ct, t [0,h1], − ∈ g′ (t)=  g0(2π) g0(0)+ ch1 + d(t h1), t (h1,h2), 1 − − ∈   0, t [h2,2π], ∈  with appropriate constants 0 h1 h2 and c, d C. Having chosen g1, we let ≤ ≤ ∈ g = g0 + g1 and we calculate that

f g 2 f g0 2 + g1 2 2ε. k − k ≤k − k k k ≤ 1 1 Since g extends to a function in C2π (R), we have thus proved that C2π (R) is dense 2 in L2π (R). Remark D.46. An adaptation of the above proof actually shows that for every1 p < ∞ and every compact interval [a,b] R, the space C([a,b]) is dense in Lp(a,b≤). A further application of Weierstrass’ theorem⊂ actually shows that the space of all polynomials is dense in Lp(a,b). In particular, we may obtain the following result. Corollary D.47. The space Lp(a,b) is separable if 1 p < ∞. The space L∞(a,b) is not separable. ≤ int Corollary D.48. Let en(t) := e , n Z, t R. Then (en)n Z is an orthonormal 2 ∈ ∈ ∈ basis in L2π (R). 2 Proof. The fact that (en)n Z is an orthonormal system in L2π (R) is an easy calcu- ∈ 2 lation. We only have to prove that span en : n Z is dense in L2π (R). Note that ˆ 2 { ∈ } f (n) = ( f ,en) for every f L2π (R) and every n Z. By Lemma D.30, we know 2 ∈ ∈ that for every f L π (R) ∈ 2 ˆ 2 g := ∑ f (n)en exists in L2π (R). n Z ∈ ∑k ˆ In particular, a subsequence of ( n= k f (n)en) converges almost everywhere to g. − ∑k ˆ But by Corollary D.43 we know that ( n= k f (n)en) converges pointwise every- 1 − 1 where to f if f C π (R). As a consequence, for every f C π (R), ∈ 2 ∈ 2 k ˆ 2 lim ∑ f (n)en = f in L2π (R), k ∞ → n= k − 1 1 so that span en : n Z is dense in (C2π (R), L2 ). Since C2π (R) is dense in { ∈ } k·k 2π 2 2 L2π (R) by Lemma D.45, we find that (en)n Z is an orthonormal basis in L2π (R). ∈ 2 ˆ 2 Theorem D.49 (Plancherel). For every f L2π (R) we have f l (Z) and the 2 2 ∈ ∈ Fourier transform ˆ: L2π (R) l (Z) is an isometric isomorphism. Moreover, for 2 → every f L2π (R), ∈ ˆ 2 ∑ f (n)en = f in L2π (R), n Z ∈ that is, the Fourier series of f converges to f in the L2 sense. 1048 D Hilbert spaces

2 Proof. By Corollary D.48, the sequence (en)n Z is an orthonormal basis in L2π (R). 2 ∈ ˆ Moreover, recall that for every f L2π (R) and every n Z, f (n)= f ,en . Hence, ˆ 2 ∈∑ ˆ ∈ ˆ h i by Theorem D.31, f l (Z), f = n Z f (n)en, and f L2 = f l2 (the last prop- ∈ ∈ k k 2π k k erty being Parseval’s identity).

1 Corollary D.50. Let f C2π (R) be such that fˆ l (Z). Then ∈ ∈

∑ fˆ(n)en = f in C2π (R), n Z ∈ that is, the Fourier series of f converges uniformly to f .

1 Proof. Note that for every n Z, en ∞ = 1. The assumption fˆ l (Z) therefore ˆ∈ k k ∈ implies that the series ∑n Z f (n)en converges absolutely in C2π (R), i.e. for the uni- ∈ ˆ form norm ∞. Since (C2π (R), ∞) is complete, the series ∑n Z f (n)en con- k·k k·k ∈ verges uniformly to some element g C2π (R). By Plancherel, g = f . ∈ Remark D.51. The assumption fˆ l1(Z) in Corollary D.50 is essential. For general ∈ ˆ f C2π (R), the Fourier series ∑n Z f (n)en need not not converge uniformly. Ques- tions∈ regarding the convergence of∈ Fourier series (which type of convergence? for which function?)can go deeply into the theory of harmonicanalysis and answers are sometimes quite involved. The L2 theory gives in this context satisfactory answers with relatively easy proofs (see Plancherel’s theorem). For continuous functions we state the following result without giving a proof.

Theorem D.52 (Fejer).´ For every f C2π (R) one has ∈ 1 K k lim ∑ ∑ fˆ(n)en = f in C2π (R), K ∞ K → k=1 n= k − i.e. the Fourier series of f converges in the C´esaro mean uniformly to f .

D.4 Linear functionals on Hilbert spaces

In this section, we discuss bounded functionals on Hilbert spaces. Compared to the case of bounded linear functionals on general Banach spaces, the case of bounded linear functionals on Hilbert spaces is considerably easy but it has far reaching con- sequences.

Theorem D.53 (Riesz-Frechet).´ Let H be a Hilbert space. Then for every bounded linear functional ϕ H there exists a unique y H such that ∈ ′ ∈ ϕ(x)= x,y for every x H. h i ∈

Proof. Uniqueness. Let y1, y2 H be two elements such that ∈ D.5 Weak convergence in Hilbert spaces 1049

ϕ(x)= x,y1 = x,y2 for every x H. h i h i ∈

Then x,y1 y2 = 0 for every x H, in particular also for x = y1 y2. This implies h 2 − i ∈ − y1 y2 = 0, that is, y1 = y2. k Existence.− k We may assume that ϕ = 0 since the case ϕ = 0 is trivial. Lety ˜ (kerϕ) 0 . Since H = kerϕ and since6 kerϕ is closed, such ay ˜ exists. Next, let∈ ⊥ \{ } 6 y := ϕ(y˜)/ y˜ 2 y˜. k k Note that ϕ(y)= y 2 = y,y . Recall that every x H can be uniquely written as k k h i ∈ x = x0 + λy with x0 kerϕ and λ K so that λy (kerϕ)⊥. Note that (kerϕ)⊥ is one-dimensional. Hence,∈ for every∈x H, ∈ ∈

ϕ(x)= ϕ(x0 + λy)

= ϕ(x0)+ λϕ(y) = λϕ(y) = λ y,y h i = λy,y h i = x0,y + λy,y h i h i = x,y . h i The claim is proved.

Corollary D.54. Let J : H H be the mapping which maps to every y H the → ′ ∈ functional Jy H′ given by Jy(x)= x,y . Then J is antilinear if K = C and linear if K = R. Moreover,∈ J is isometric andh bijective.i

Proof. The fact that J is isometric follows from the Cauchy-Schwarz inequality. Antilinearity (or linearity in case K = R) follows from the sesquilinearity (resp. bilinearity) of the scalar product on H. Since J is isometric, it is injective. The surjectivity of J follows from Theorem D.53.

Remark D.55. The theorem of Riesz-Fr´echet allows us to identify any (real) Hilbert space H with its dual space H′. Note, however, that there are situations in which one does not identify H′ with H. This is for example the case when V is a second Hilbert space which embeds continuously and densely into H, that is, for which there exists a bounded, injective J : V H with dense range. →

D.5 Weak convergence in Hilbert spaces

Definition D.56. Let H be a Hilbert space. We say that a sequence (xn) H con- ⊂ verges weakly to some element x H if for every y H one has limn ∞ xn,y = weak ∈ ∈ → h i x,y . We write xn ⇀ x or xn x if (xn) converges weakly to x. h i → 1050 D Hilbert spaces

Theorem D.57. Every bounded sequence (xn) in a Hilbert space H admits a weakly convergent subsequence, that is, there exists x H and there exists a subsequence weak ∈ (xn ) of (xn) such that xn x. k k → In the proof of this theorem, we will use the following general result.

Lemma D.58. Let X andY be two normedspaces, let (Tn) L (X,Y ) be a bounded sequence of bounded operators. Assume that there exists∈ a dense set M X such ⊂ that limn ∞ Tnx exists for every x M. Then limn ∞ Tnx =: T x exists for every x X and T →L (X,Y). ∈ → ∈ ∈ Proof. Define Tx := limn ∞ Tnx for every x spanM. Then → ∈

Tx = lim Tnx sup Tn x , n ∞ k k → k k≤ n k kk k that is. T : spanM Y is a bounded linear operator. Since M is dense in X, T admits a unique bounded→ extension T : X Y. Let x X and ε > 0. Since M is→ dense in X, there exists y M such that x y ∈ ∈ k − k≤ ε. By assumption, there exists n0 such that for every n n0 we have Tny Ty ε. ≥ k − k≤ Hence, for every n n0, ≥

Tnx Tx Tnx Tny + Tny Ty + Ty Tx k − k≤k − k k − k k − k sup Tn x y + ε + T x y ≤ n k kk − k k kk − k ε(sup Tn + 1 + T ), ≤ n k k k k and therefore limn ∞ Tnx = Tx. → Proof (Proof of Theorem D.57). As in the proof of the Arzela-Ascoli theorem (The- orem B.43), we use Cantor’s diagonal sequence argument. Let (xn) be a bounded sequence in H. We first assume that H is separable, and we let (ym) H be a dense sequence. ⊂ Since ( xn,y1 ) is bounded by the boundedness of (xn), there exists a subse- h i quence (xϕ ) of (xn) (ϕ1 : N N is increasing, unbounded) such that 1(n) →

lim xϕ n ,y1 exists. n ∞ 1( ) → h i

Similarly, there exists a subsequence (xϕ2(n)) of (xϕ1(n)) such that

lim xϕ n ,y2 exists. n ∞ 2( ) → h i Note that for this subsequence, we also have that

lim xϕ n ,y1 exists. n ∞ 2( ) → h i

Iterating this argument, we find a subsequence (xϕ3(n)) of (xϕ2(n)) and finally for N every m , m 2, a subsequence (xϕm(n)) of (xϕm 1(n)) such that ∈ ≥ − D.5 Weak convergence in Hilbert spaces 1051

lim xϕ n ,y j exists for every 1 j m. n ∞ m( ) → h i ≤ ≤

Let (xn′ ) := (xϕn(n)) be the ’diagonal sequence’. Then (xn′ ) is a subsequence of (xn) and lim x′ ,ym exists for every m N. n ∞ n → h i ∈ By Lemma D.58 and the Riesz-Fr´echet representation theorem (Theorem D.53), there exists x H such that ∈

lim x′ ,y = x,y for every y H, n ∞ n → h i h i ∈ and the claim is proved in the case when H is separable. If H is not separable as we first assumed, then one may replace H by H˜ := span xn : n N which is separable. By the above, there exists x H˜ and a subse- { ∈ } ∈ quence of (xn) (which we denote again by (xn)) such that for every y H˜ , ∈

lim xn,y = x,y , n ∞ → h i h i that is, (xn) converges weakly in H˜ . On the other hand, for every y H˜ ⊥ and every n, ∈ xn,y = x,y = 0. h i h i The decomposition H = H˜ H˜ therefore yields that (xn) converges weakly in H. ⊕ ⊥

Appendix E Dual spaces and weak convergence

E.1 The theorem of Hahn-Banach

Given a normed space X, we denote by X ′ := L (X,K) the space of all bounded linear functionals on X. Recall that X ′ is always a Banach space by Corollary B.35 of Chapter B.

However, a priori it is not clear whether there exists any bounded linear func- tional on a normed space X (apart from the zero functional). This fundamental ques- tion and the analysis of dual spaces (analysis of functionals) shall be developed in this chapter. The existence of nontrivial bounded functionals is guaranteed by the Hahn- Banach theorem which actually admits several versions. However, before stating the first version, we need the following definition.

Definition E.1. Let X be a real or complex vector space. A function p : X R is called sublinear if → (i) p(λx)= λ p(x) for every λ > 0, x X, and ∈ (ii) p(x + y) p(x)+ p(y) for every x, y X. ≤ ∈ Example E.2. On a normed space X, the norm is sublinear. Every linear p : X R is sublinear. k·k → Theorem E.3 (Hahn-Banach; version of linear algebra, real case). Let X be a real vector space,U X a linear subspace,and p : X R sublinear. Let ϕ : U R be linear such that ⊂ → → ϕ(x) p(x) for all x U. ≤ ∈ Then there exists a linear ϕ˜ : X R such that ϕ˜(x)= ϕ(x) for every x U (that is, ϕ˜ is an extension of ϕ) and → ∈

ϕ˜(x) p(x) for all x X. (E.1) ≤ ∈

1053 1054 E Dual spaces and weak convergence

The following lemma asserts that this version of Hahn-Banach is true in the spe- cial case when X/U has dimension 1. It is an essential step in the proof of Theorem E.3.

Lemma E.4. Take the assumptions of Theorem E.3 and assume in addition that dimX/U = 1. Then the assertion of Theorem E.3 is true.

Proof. If dimX/U = 1, then there exists x0 X U such that every x X can be ∈ \ ∈ uniquely written in the form x = u + λx0 with u U and λ R. So we define ϕ˜ : X R by ∈ ∈ → ϕ˜(x) := ϕ˜(u + λx0) := ϕ(u)+ λr, where r R is a parameter which has to be chosen such that (E.1) holds, that is, such that∈ for every u U, λ R, ∈ ∈

ϕ(u)+ λr p(u + λx0). (E.2) ≤ If λ = 0, then this condition clearly holds for every u U by the assumption on ϕ. If λ > 0, then (E.2) holds for every u U if and only if∈ ∈

λr p(u + λx0) ϕ(u) for every u U ≤ u − u ∈ r p( + x0) ϕ( ) for every u U ⇔ ≤ λ − λ ∈ r inf p(v + x0) ϕ(v). ⇔ ≤ v U − ∈ Similarly, if λ < 0, then (E.2) holds for every u U if and only if ∈

λr p(u + λx0) ϕ(u) for every u U ≤ u − u ∈ r p( x0) ϕ( ) for every u U ⇔− ≤ λ − − λ ∈ − − r sup ϕ(w) p(w x0). ⇔ ≥ w U − − ∈ So it is possible to find an appropriate r R in the definition of ϕ˜ if and only if ∈

ϕ(w) p(w x0) p(v + x0) ϕ(v) for all v, w U, − − ≤ − ∈ or, equivalently, if

ϕ(w)+ ϕ(v) p(v + x0)+ p(w x0) for all v, w U. ≤ − ∈ However, by the assumptions on ϕ and p, for every v, w U, ∈

ϕ(w)+ ϕ(v)= ϕ(w + v) p(w + v)= p(v + x0 + w x0) p(v + x0)+ p(w x0). ≤ − ≤ − For the second step in the proof of Theorem E.3, we need the Lemma of Zorn.

Lemma E.5 (Zorn). Let (M, ) be a ordered set. Assume that every totally ordered subset T M (i.e. for every≤ x, y T one either has x y or y x) admits an ⊂ ∈ ≤ ≤ E.1 The theorem of Hahn-Banach 1055 upper bound. Then for every x M there exists a maximal element m x (that is, an element m such that m m˜ implies∈ m = m˜ for every m˜ M). ≥ ≤ ∈ Proof (Proof of Theorem E.3). Define the following set

M := (V,ϕV ) : V X linear subspace,U V, ϕV : V R linear, s.t. { ⊂ ⊂ → ϕ(x)= ϕV (x)(x U) and ϕV (x) p(x)(x V) , ∈ ≤ ∈ } and equip it with the order relation defined by ≤

(V1,ϕV ) (V2,ϕV ) : V1 V2 and ϕV (x)= ϕV (x) for all x V1. 1 ≤ 2 ⇔ ⊂ 1 2 ∈ ϕ Then (M, ) is an ordered set. Let T = ((Vi, Vi ))i I M be a totally ordered subset. ≤ ∈ ⊂ Then the element (V,ϕV ) M defined by ∈ ϕ ϕ V := Vi and V (x)= Vi (x) for x Vi i I ∈ [∈ is an upper boundof T . By the Lemma of Zorn,the set M admits a maximal element ϕ (X0, X0 ). Assume that X0 = X. Then, by Lemma E.4, we could constructan element 6 ϕ ϕ which is strictly larger than (X0, X0 ), a contradiction to the maximality of (X0, X0 ). ϕ ϕ Hence, X = X0, and ˜ := X0 is an element we are looking for. The complex version of the Hahn-Banach theorem reads as follows. Theorem E.6 (Hahn-Banach; version of linear algebra, complex case). Let X be a complex vector space, U X a linear subspace, and p : X R sublinear. Let ϕ : U C be linear such that⊂ → → Reϕ(x) p(x) for all x U. ≤ ∈ Then there exists a linear ϕ˜ : X C such that ϕ˜(x)= ϕ(x) for every x U (that is ϕ˜ is an extension of ϕ) and → ∈

Reϕ˜(x) p(x) for all x X. (E.3) ≤ ∈ Proof. We may consider X also as a real vector space. Note that ψ(x) := Reϕ(x) is an R-linear functional on X. By Theorem E.3, there exists an extension ψ˜ : X R of ψ satisfying → ψ˜ (x) p(x) for every x X. ≤ ∈ Let ϕ˜(x) := ψ˜ (x) iψ˜ (ix), x X. − ∈ It is an exercise to show that ϕ˜ is C-linear, that ϕ(x)= ϕ˜(x) for every x U and it is clear from the definition that Reϕ˜(x)= ψ˜ (x). Thus, ϕ˜ is a possible element∈ we are looking for. Theorem E.7 (Hahn-Banach; extension of bounded linear functionals). Let X be a normed space and U X a linear subspace. Then for every bounded linear ⊂ 1056 E Dual spaces and weak convergence u′ : U K there exists a bounded linear extension x′ : X K (that is, x′ U = u′) such that→ x = u . → | k ′k k ′k Proof. We first assume that X is a real normed space. The function p : X R defined by p(x) := u x is sublinear and → k ′kk k

u′(x) p(x) for every x U. ≤ ∈ By the first Hahn-Banach theorem (Theorem E.3), there exists a linear x : X R ′ → extending u′ such that

x′(x) p(x)= u′ x for every x X. ≤ k kk k ∈ Replacing x by x, this implies that −

x′(x) u′ x for every x X. | |≤k kk k ∈ Hence, x is bounded and x u . On the other hand, one trivially has ′ k ′k≤k ′k

x′ = sup x′(x) sup x′(x) = sup u′(x) = u′ . k k x X | |≥ x U | | x U | | k k x∈ 1 x∈ 1 x∈ 1 k k≤ k k≤ k k≤ If X is a complex normed space, then the second Hahn-Banach theorem (Theorem E.6) implies that there exists a linear x : X C such that ′ →

Rex′(x) p(x)= u′ x for every x X. ≤ k kk k ∈ In particular,

iθ x′(x) = sup Rex′(e x) u′ x for every x X, | | θ [0,2π] ≤k kk k ∈ ∈ so that again x′ is bounded and x′ u′ . The inequality x′ u′ follows as above. k k≤k k k k≥k k

Corollary E.8. If X isa normed space,thenfor every x X 0 there exists x′ X ′ such that ∈ \{ } ∈ x′ = 1 and x′(x)= x . k k k k In particular, X separates the points of X, i.e. for every x1,x2 X, x1 = x2, there ′ ∈ 6 exists x X such that x (x1) = x (x2). ′ ∈ ′ ′ 6 ′ Proof. By the Hahn-Banach theorem (Theorem E.7), there exists an extension x ′ ∈ X ′ of the functional u′ : span x K defined by u′(λx)= λ x such that x′ = u = 1. { } → k k k k k ′k For the proof of the second assertion, set x := x1 x2. − Corollary E.9. If X is a normed space, then for every x X ∈ E.1 The theorem of Hahn-Banach 1057

x = sup x′(x) . (E.4) k k | | x′ X′ x ∈ 1 k ′k≤ Proof. For every x X with x 1 one has ′ ∈ ′ k ′k≤

x′(x) x′ x x , | |≤k kk k≤k k which proves one of the required inequalities. The other inequality follows from Corollary E.8. Remark E.10. The equality (E.4) should be compared to the definition of the norm of an element x X : ′ ∈ ′ x′ = sup x′(x) . k k x X | | x∈ 1 k k≤ From now on, it will be convenient to use the following notation. Given a normed space X and elements x X, x X , we write ∈ ′ ∈ ′

x′,x := x′,x X X := x′(x). h i h i ′× For the bracket , , we note the following properties. The function h· ·i

, : X ′ X K, h· ·i × → (x′,x) x′,x = x′(x) 7→ h i is bilinear and for every x X , x X, ′ ∈ ′ ∈

x′,x x′ x . |h i|≤k kk k The bracket , thus appeals to the notion of the scalar product on inner product spaces, and theh· ·i last inequality appeals to the Cauchy-Schwarz inequality, but note, however, that the bracket is not a scalar product since it is defined on a pair of two different spaces. Moreover, even if X = H is a complex Hilbert space, then the bracket differs from the scalar product in that it is bilinear instead of sesquilinear. Corollary E.11. Let X be a normed space, U X a closed linear subspace and x X U. Then there exists x X such that ⊂ ∈ \ ′ ∈ ′

x′(x) = 0 and x′(u)= 0 for every u U. 6 ∈ Proof. Let π : X X/U be the quotient map (π(x)= x +U). Since x U, we have → 6∈ π(x) = 0. By Corollary E.8, there exists ϕ (X/U)′ such that ϕ(π(x)) = 0. Then x :=6ϕ π X is a functional we are looking∈ for. 6 ′ ◦ ∈ ′ Definition E.12. A linear subspace U of a normed space X is called complemented if there exists a projection P L (X) such that ranP = U. ∈ Remark E.13. If P is a projection (that means, if P2 = P), then Q = I P is also a projection and ranP = kerQ. Hence, if P is a bounded projection, then− ranP is 1058 E Dual spaces and weak convergence necessarily closed. Thus, a necessary condition for U to be complemented is that U is closed.

Corollary E.14. Every finite dimensional subspace of a normed space is comple- mented.

Proof. Let U be a finite dimensional subspace of a normed space X. Let (bi)1 i N be a basis of U. By Corollary E.11, there exist functionals x X such that ≤ ≤ i′ ∈ ′ 1 if i = j, xi′,b j = h i ( 0 otherwise.

Let P : X X be defined by → N Px := ∑ xi′,x bi, x X. i=1h i ∈

2 Then Pbi = bi for every 1 i N, and thus P = P, that is, P is a projection. Moreover, ranP = U by construction.≤ ≤ By the estimate

N Px ∑ xi′,x bi k k≤ i=1 |h i|k k N ∑ xi′ bi x , ≤ i=1 k kk k k k  the projection P is bounded.

The following lemma which does not depend on the Hahn-Banach theorem is stated for completeness.

Lemma E.15. In a Hilbert space every closed linear subspace is complemented.

Proof. Take the orthogonal projection onto the closed subspace as a possible pro- jection.

Corollary E.16. If X is a normed space such that X ′ is separable, then X is separa- ble, too.

Proof. Let D′ = xn′ : n N be a dense subset of the unit sphere of X ′. For every { ∈ } 1 n N we choose an element xn X such that xn 1 and x ,xn . We claim ∈ ∈ k k≤ |h n′ i| ≥ 2 that D := span xn : n N is dense in X. If this was not true, i.e. if D¯ = X, then, { ∈ } 6 by Corollary E.11, we find an element x X 0 such that x (xn)= 0 for every ′ ∈ ′ \{ } ′ n N. We may without loss of generality assume that x′ = 1. Since D′ is dense in ∈ k k 1 the unit sphere of X , we find n0 N such that x x . But then ′ ∈ k ′ − n′ 0 k≤ 4 1 1 x′ ,xn = x′ x′,xn x′ x′ xn , 2 ≤ |h n0 0 i| |h n0 − 0 i|≤k n0 − kk 0 k≤ 4 E.2 Weak∗ convergence and the theorem of Banach-Alaoglu 1059 which is a contradiction. Hence, D¯ = X and X is separable by Lemma D.21 of Chapter D.

E.2 Weak∗ convergence and the theorem of Banach-Alaoglu

Definition E.17. Let X be a Banach space. We say that a sequence (x ) X con- n′ ⊂ ′ verges weak∗ to some element x′ X ′ if for every x X one has limn ∞ xn′ ,x = weak ∈ ∈ → h i x ,x . We write x ∗ x if (x ) converges weak to x . h ′ i n′ → ′ n′ ∗ ′ Theorem E.18 (Banach-Alaoglu). Let X be a separable Banach space. Then every bounded sequence (xn′ ) X ′ admits a weak∗ convergent subsequence, that is, there ⊂ weak exists x X and there exists a subsequence (x ) of (x ) such that x ∗ x . ′ ∈ ′ n′ k n′ n′ k → ′ Proof. As in the proof of the Arzela-Ascoli theorem (Theorem B.43) and the the- orem about weak sequential compactness of the unit ball in Hilbert spaces (The- orem D.57), we use Cantor’s diagonal sequence argument. Let (xn′ ) be a bounded sequence in X ′. Since X is separable by assumption, we can choose a dense sequence (xm) X. ⊂ Since ( x ,x1 ) is bounded by the boundedness of (x ), there exists a subsequence h n′ i n′ (x′ϕ ) of (x′ ) (ϕ1 : N N is increasing, unbounded) such that 1(n) n →

lim x′ϕ ,x1 exists. n ∞ 1(n) → h i Similarly, there exists a subsequence (x ) of (x ) such that ϕ′ 2(n) ϕ′ 1(n)

lim x′ϕ ,x2 exists. n ∞ 2(n) → h i Note that for this subsequence, we also have that

lim xϕ′ ,x1 exists. n ∞ 2(n) → h i Iterating this argument, we find a subsequence (x ) of (x ) and finally for ϕ′ 3(n) ϕ′ 2(n) every m N, m 2, a subsequence (x ) of (x ) such that ϕ′ m(n) ϕ′ m 1(n) ∈ ≥ −

lim xϕ′ ,x j exists for every 1 j m. n ∞ m(n) → h i ≤ ≤ Let (y ) := (x ) be the ’diagonal sequence’. Then (y ) is a subsequence of n′ ϕ′ n(n) n′ (xn′ ) and lim y′ ,xm exists for every m N. n ∞ n → h i ∈ By Lemma D.58 of Chapter D, there exists x X such that ′ ∈ ′

lim y′ ,x = x′,x for every x X. n ∞ n → h i h i ∈ 1060 E Dual spaces and weak convergence

This is the claim.

E.3 Weak convergence and reflexivity

Given a normed space X, we call X ′′ := (X ′)′ = L (X ′,K) the bidual of X. Lemma E.19. Let X be a normed space. Then the mapping

J : X X ′′, → x (x′ x′,x ), 7→ 7→ h i is well defined and isometric.

Proof. The linearity of x x ,x is clear, and from the inequality ′ 7→ h ′ i

Jx(x′) = x′,x x′ x , | | |h i|≤k kk k follows that Jx X ′′ (i.e. J is well defined) and Jx x . The fact that J is isometric follows∈ from Corollary E.8. k k≤k k

Definition E.20. A normed space X is called reflexive if the isometry J from Lemma E.19 is surjective, i.e. if JX = X ′′. In other words: a normed space X is reflexive if for every x X there exists x X such that ′′ ∈ ′′ ∈

x′′,x′ = x′,x for all x′ X ′. h i h i ∈

Remark E.21. If a normed space is reflexive then X and X ′′ are isometrically iso- morphic (via the operator J). Since X ′′ is always complete, a reflexive space is nec- essarily a Banach space. Note that it can happen that X and X ′′ are isomorphic without X being reflexive (the example of such a Banach space is however quite involved). We point out that reflexivity means that the special operator J is an isomorphism.

Lemma E.22. Every Hilbert space is reflexive.

Proof. By the Theorem of Riesz-Fr´echet, we may identify H with its dual H′ and thus also H with its bidual H′′. The identification is done via the scalar product. It is an exercise to show that this identification of H with H′′ coincides with the mapping J from Lemma E.19.

Remark E.23. It should be noted that for complex Hilbert spaces, the identification of H with its dual H′ is only antilinear, but after the second identification (H′ with H′′) it turns out that the identification of H with H′′ is linear. Lemma E.24. Every finite dimensional Banach space is reflexive. E.3 Weak convergence and reflexivity 1061

Proof. It suffices to remark that if X is finite dimensional, then

dimX = dimX ′ = dimX ′′ < ∞.

Surjectivity of the mapping J (which is always injective) thus follows from linear algebra.

Theorem E.25. The space Lp(Ω) is reflexive if 1 < p < ∞ ((Ω,A , µ) being an arbitrary measure space).

We will actually only prove the following special case.

Theorem E.26. The spaces l p are reflexive if 1 < p < ∞.

The proof of Theorem E.26 is based on the following lemma. ∞ p Lemma E.27. Let 1 p < and let q := p 1 be the conjugate exponent so that 1 1 ≤ − p + q = 1. Then the operator

q p T : l (l )′, → (an) ((xn) ∑anxn), 7→ 7→ n

p q is an isometric isomorphism, i.e. (l )′ = l .

Proof. Linearity of T is obvious. Assume first p > 1, so that q < ∞. Note that for q q 2 q/p every a := (an) l 0 the sequence (xn) := (ca¯n an − ) (c = a q− ) belongs to l p and ∈ \{ } | | k k p q (q 1)p x p = a q− ∑ an − = 1. k k k k n | | This particular x l p shows that ∈ q/p q q(p 1)/p Ta p ∑a x = a − ∑ a = a − = a . (l )′ n n q n q q k k ≥ n k k n | | k k k k On the other hand, by H¨older’s inequality,

Ta p = sup ∑a x a , (l )′ n n q k k x p 1 | n |≤k k k k ≤ so that T is isometric in the case p (1,∞). The case p = 1 is very similar and will be omitted. ∈ p In order to show that T is surjective, let ϕ (l )′. Denote by en the n-th unit p ∈ ∞ q vector in l , and let an := ϕ(en). If p = 1, then (an) l = l by the trivial estimate ∈

an = ϕ(en) ϕ en 1 = ϕ . | | | |≤k kk k k k If p > 1, then we may argue as follows. For every N N, ∈ 1062 E Dual spaces and weak convergence

N N q q 2 ∑ an = ∑ an a¯n an − n=1 | | n=1 | | N q 2 = ϕ( ∑ a¯n an − en) n=1 | |

N 1 (q 1)p p ϕ ∑ an − ≤k k n=1 | |  N 1 q p = ϕ ∑ an , k k n=1 | |  which is equivalent to

N 1 N 1 q 1 p q q ∑ an − = ∑ an ϕ . n=1 | | n=1 | | ≤k k   Since the right-hand side of this inequality does not depend on N N, we obtain q ∈ that a := (an) l and a q ϕ . Next, observe∈ that fork everyk ≤kx kl p one has ∈ N x = ∑xnen = lim ∑ xnen, N ∞ n → n=1 the series converging in l p (here we need the restriction p < ∞!). Hence, for every x l p, by the boundedness of ϕ, ∈ N ϕ(x)= lim ϕ( ∑ xnen) N ∞ → n=1 N = lim ∑ xnan N ∞ → n=1 = ∑xnan n = Ta(x).

Hence, T is surjective.

p q Proof (Proof of Theorem E.26). By Lemma E.27, we may identify (l )′ with l and, p q p if 1 < p < ∞ (!), also (l )′′ = (l )′ with l . One just has to notice that this identifi- p p p p cation of l with (l )′′ = l (the identity map on l ) coincides with the operator J from Lemma E.19, so that l p is reflexive if 1 < p < ∞. Lemma E.28. The spaces l1, L1(Ω) (Ω RN) andC([0,1]) are not reflexive. ⊂ Proof. For every t [0,1], let δt C([0,1]) be defined by ∈ ∈ ′

δt , f := f (t), f C([0,1]). h i ∈ E.3 Weak convergence and reflexivity 1063

Then δt = 1 and whenever t = s, then k k 6

δt δs = 2. k − k 1 In particular, the uncountably many balls B(δt , ) (t [0,1]) are mutually disjoint 2 ∈ so that C([0,1])′ is not separable. Now, if C([0,1]) were reflexive, then C([0,1])′′ = C([0,1]) would be separable (since C([0,1]) is separable), and then, by Corollary E.16, C([0,1])′ would be sepa- rable; a contradiction to what has been said before. This proves that C([0,1]) is not reflexive. The cases of l1 and L1(Ω) are proved similarly. They are separable Banach spaces with nonseparable dual.

Theorem E.29. Every closed subspace of a reflexive Banach space is reflexive.

Proof. Let X be a reflexive Banach space, and let U X be a closed subspace. Let u U . Then the mapping x : X K defined by ⊂ ′′ ∈ ′′ ′′ ′ →

x′′,x′ = u′′,x′ U , x′ X ′, h i h | i ∈ is linear and bounded, i.e. x X . By reflexivity of X, there exists x X such that ′′ ∈ ′′ ∈

x′,x = u′′,x′ U , x′ X ′. (E.5) h i h | i ∈

Assume that x U. Then, by Corollary E.9, there exists x X such that x U = 0 6∈ ′ ∈ ′ ′| and x′,x = 0; a contradiction to the last equality. Hence, x U. We need to show that h i 6 ∈ u′′,u′ = u′,x , u′ U ′. (E.6) h i h i ∀ ∈ However, if u U , then, by Hahn-Banach we can choose an extension x X , i.e. ′ ∈ ′ ′ ∈ ′ x U = u . The equation (E.6) thus follows from (E.5). ′| ′ Corollary E.30. The Sobolev spaces W k,p(Ω) (Ω RN open) are reflexive if 1 < p < ∞,k N. ⊂ ∈ Proof. For example, for k = 1, the operator

T : W 1,p(Ω) Lp(Ω)1+N, → ∂u ∂u u (u, ,..., ), 7→ ∂x1 ∂xN is isometric, so that we may consider W 1,p(Ω) as a closed subspace of Lp(Ω)1+N which is reflexive by Theorem E.25. The claim follows from Theorem E.29.

Corollary E.31. A Banach space is reflexive if and only if its dual is reflexive.

Proof. Assume that the Banach space X is reflexive. Let x′′′ X ′′′ (the tridual!). Then the mapping x : X K defined by ∈ ′ → 1064 E Dual spaces and weak convergence

x′,x := x′′′,JX (x) , x X, h i h i ∈ is linear and bounded, i.e. x X (here JX denotes the isometry X X ). Let x ′ ∈ ′ → ′′ ′′ ∈ X be arbitrary. Since X is reflexive, there exists x X such that JX x = x . Hence, ′′ ∈ ′′

x′′′,x′′ = x′′′,JX x = x′,x = x′′,x′ , h i h i h i h i which proves that J x = x , i.e. the isometry J : X X is surjective. Hence, X′ ′ ′′′ X′ ′ → ′′′ X ′ is reflexive. On the other hand, assume that X ′ is reflexive. Then X ′′ is reflexive by the pre- ceeding argument, and therefore X (considered as a closed subspace of X ′′ via the isometry J) is reflexive by Theorem E.29.

Definition E.32. Let X be a normed space. We say that a sequence (xn) X con- verges weakly to some x X if ⊂ ∈

lim x′,xn = x′,x for every x′ X ′. n ∞ → h i h i ∈

Notations: if (xn) converges weakly to x, then we write xn ⇀ x, w limn ∞ xn = x, − → xn x in σ(X,X ), or xn x weakly. → ′ → Theorem E.33. In a reflexive Banach space every bounded sequence admits a weakly convergent subsequence.

Proof. Let (xn) be a bounded sequence in a reflexive Banach space X. We first assume that X is separable. Then X ′′ is separable by reflexivity, and X ′ is separable by Corollary E.16. Let (x ) X be a dense sequence. m′ ⊂ ′ Since ( x ,xn ) is bounded by the boundedness of (xn), there exists a subse- h 1′ i quence (xϕ ) of (xn) (ϕ1 : N N is increasing, unbounded) such that 1(n) →

lim x′ ,xϕ n exists. n ∞ 1 1( ) → h i

Similarly, there exists a subsequence (xϕ2(n)) of (xϕ1(n)) such that

lim x′ ,xϕ n exists. n ∞ 2 2( ) → h i Note that for this subsequence, we also have that

lim x′ ,xϕ n exists. n ∞ 1 2( ) → h i

Iterating this argument, we find a subsequence (xϕ3(n)) of (xϕ2(n)) and finally for N every m , m 2, a subsequence (xϕm(n)) of (xϕm 1(n)) such that ∈ ≥ −

lim x′ ,xϕ n exists for every 1 j m. n ∞ j m( ) → h i ≤ ≤

Let (yn) := (xϕn(n)) be the ’diagonal sequence’. Then (yn) is a subsequence of (xn) and E.4 * Minimization of convex functionals 1065

lim x′ ,yn exists for every m N. n ∞ m → h i ∈ By Lemma D.58 of Chapter D, there exists x X such that ′′ ∈ ′′

lim x′,yn = x′,x′′ for every x′ X ′. n ∞ → h i h i ∈

Since X is reflexive, there exists x X such that Jx = x′′. For this x, we have by definition of J ∈ lim x′,yn = x′,x exists for every x′ X ′, n ∞ → h i h i ∈ i.e. (yn) converges weakly to x. If X is not separable as we first assumed, then one may replace X by X˜ := span xn : n N which is separable. By the above, there exists x X˜ and a sub- { ∈ } ∈ sequence of (xn) (which we denote again by (xn)) such that for everyx ˜ X˜ , ′ ∈ ′

lim x˜′,xn = x˜′,x , n ∞ → h i h i i.e. (xn) converges weakly in X˜ . If x X , then x ˜ X˜ , and it follows easily that ′ ∈ ′ ′|X ∈ ′ the sequence (xn) also converges weakly in X to the element x.

E.4 * Minimization of convex functionals

Theorem E.34 (Hahn-Banach; separation of convex sets). Let X be a Banach space, K X a closed, nonempty, convex subset, and x0 X K. Then there exists x X and⊂ ε > 0 such that ∈ \ ′ ∈ ′

Re x′,x + ε Re x′,x0 , x K. h i ≤ h i ∈ Lemma E.35. Let K be an open, nonempty,convex subset of a Banachspace X such that 0 K. Define the Minkowski functional p : X R by ∈ → x p(x)= inf λ > 0 : K . { λ ∈ } Then p is sublinear, there exists M 0 such that ≥ p(x) M x , x X, ≤ k k ∈ and K = x X : p(x) < 1 . { ∈ } Proof. Since B(0,r) K for some r > 0, we find that ⊂ 1 p(x) x for every x X. ≤ r k k ∈ The property p(αx)= α p(x) for every α > 0 and every x X is obvious. ∈ 1066 E Dual spaces and weak convergence

Next, if p(x) < 1, then there exists λ (0,1) such that x/λ K. Hence, by x x ∈ ∈ convexity, x = λ λ = λ λ +(1 λ)0 K. On theotherhand,if x K, then (1+ε)x − ∈ 1 ∈ ∈ K, since K is open. Hence, p(x) (1 + ε)− < 1, so that K = x X : p(x) < 1 . Let finally x, y X. Then for every≤ ε > 0, x/(p(x)+ε) K and{ ∈y/(p(y)+ε) }K. In particular, for every∈ t [0,1], ∈ ∈ ∈ t 1 t x + − y K. p(x)+ ε p(y)+ ε ∈

Setting t = (p(x)+ ε)/(p(x)+ p(y)+ 2ε), one finds that x + y K, p(x)+ p(y)+ 2ε ∈

so that p(x + y) p(x)+ p(y)+ 2ε. Since ε > 0 was arbitrary, we find p(x + y) p(x)+ p(y). The≤ claim is proved. ≤

Proof (Proof of Theorem E.34). We prove the theorem for the case when X is a real Banach space. The complex case is proved similarly. We may without loss of generality assume that 0 K; it suffices to translate K ∈ and x0 for this. Since x0 K and since K is closed, we find that d := dist(x0,K) > 0. Put 6∈ Kd := x X : dist(x,K) < d/2 , { ∈ } so that Kd is an open, convex subset such that 0 Kd. Let p be the corresponding Minkowski functional (see Lemma E.35). ∈ Define on the one-dimensional subspace U := λx0 : λ R the functional u : { ∈ } ′ U R by u ,λx0 = λ. Then → h ′ i

u′,u p(u), u U. h i≤ ∈

By the Hahn-Banach theorem E.3, there exists a linear extension x′ : X R such that → x′,x p(x), x X. (E.7) h i≤ ∈ In particular, by Lemma E.35,

x′,x M x , |h i| ≤ k k

so that x X and x M. By construction, x ,x0 = 1. Moreover, by (E.7) and ′ ∈ ′ k ′k≤ h ′ i Lemma E.35, x ,x < 1 for every x K Kd, so that h ′ i ∈ ⊂

x′,x x′,x0 (= 1), x Kd. h i ≤ h i ∈

Replacing the above argument with (1 ε )x0 instead of x0 (where ε > 0 is chosen − ′ ′ so small that (1 ε )x0 Kd ), we find that − ′ 6∈

x′,x + ε′ x′,x0 x′,x0 , x K Kd, h i h i ≤ h i ∈ ⊂ E.4 * Minimization of convex functionals 1067 and putting ε := ε = ε x ,x0 > 0 yields the claim. ′ ′h ′ i Corollary E.36. Let X be a Banach space and K X a closed, convex subset ⊂ (closed for the norm topology). If (xn) K converges weakly to some x X, then x K. ⊂ ∈ ∈ Proof. Assume the contrary, i.e. x K. By the Hahn-Banach theorem (Theorem E.34), there exist x X and ε > 0 such6∈ that ′ ∈ ′

Re x′,xn + ε Re x′,x for every n N, h i ≤ h i ∈ a contradiction to the assumption that xn ⇀ x. A function f : K R on a convex subset K of a Banach space X is called convex if for every x, y K→, and every t [0,1], ∈ ∈ f (tx + (1 t)y) t f (x) + (1 t) f (y). (E.8) − ≤ − Corollary E.37. Let X be a Banach space, K X a closed, convex subset, and ⊂ f : K R a continuous, convex function. If (xn) K converges weakly to x K, then → ⊂ ∈ f (x) liminf f (xn). n ∞ ≤ → Proof. For every l R, the set Kl := x K : f (x) l is closed (by continuity of f ) and convex (by convexity∈ of f ). After{ ∈ extracting≤ a subsequence,} if necessary, we may assume that l := liminfn ∞ f (xn)= limn ∞ f (xn). Then for every ε > 0 the → → sequence (xn) is eventually in Kl+ε , i.e. except for finitely many xn, the sequence (xn) lies in Kl+ε . Hence, by Corollary E.36, x Kl+ε , which means that f (x) l +ε. Since ε > 0 was arbitrary, the claim follows. ∈ ≤

Theorem E.38. Let X be a reflexive Banach space, K X a closed, convex, nonempty subset, and f : K R a continuous, convex function⊂ such that → lim f (x)=+∞ (coercivity). x ∞ k xk→K ∈

Then there exists x0 K such that ∈

f (x0)= inf f (x) : x K > ∞. { ∈ } −

Proof. Let (xn) K be such that limn ∞ f (xn)= inf f (x) : x K . By the coerciv- ⊂ → { ∈ } ity assumption on f , the sequence (xn) is bounded. Since X is reflexive, there exists a weakly convergent subsequence (Theorem E.33); we denote by x0 the limit. By Corollary E.36, x0 K. By Corollary E.37, ∈

f (x0) lim f (xn)= inf f (x) : x K . n ∞ ≤ → { ∈ } The claim is proved. 1068 E Dual spaces and weak convergence

Remark E.39. Theorem E.38 remains true if f is only lower semicontinuous, i.e. if

liminf f (xn) f (x) n ∞ → ≥ for every convergent (xn) K with x = limxn. In fact, already Corollary E.37 re- mains true if f is only lower⊂ semicontinuous (and then Corollary E.37 says that lower semicontinuity of a convex function in the norm topology implies lower semi- continuity in the weak topology). It suffices for example to remark that the sets Kl := f l (l R) are closed as soon as f is lower semicontinuous. { ≤ } ∈

E.5 * The von Neumann minimax theorem

In the following theorem, we call a function f : K R on a convex subset K of a Banach space X concave if f is convex, or, equivalently,→ if for every x, y K and every t [0,1], − ∈ ∈ f (tx + (1 t)y) t f (x) + (1 t) f (y). (E.9) − ≥ − A function f : K R is called strictly convex (resp. strictly concave) if for every x, y K, x = y, f (x)=→ f (y) the inequality in (E.8) (resp. (E.9)) is strict for t (0,1). ∈ 6 ∈ Theorem E.40 (von Neumann). Let K and L be two closed, bounded, nonempty, convex subsets of reflexive Banach spaces X and Y, respectively. Let f : K L R be a continuous function such that × →

x f (x,y) is strictly convex for every y L, and 7→ ∈ y f (x,y) is concave for every x K. 7→ ∈ Then there exists (x¯,y¯) K L such that ∈ × f (x¯,y) f (x¯,y¯) f (x,y¯) for every x K, y L. (E.10) ≤ ≤ ∈ ∈ Remark E.41. A point (x¯,y¯) K L satisfying (E.10) is called a saddle point of f . A saddle point is a point of∈ equilibrium× in a two-person zero-sum game in the following sense: If the player controlling the strategy x modifies his strategy when the second player playsy ¯, he increases his loss; hence, it is his interest to playx ¯. Similarly, if the player controlling the strategy y modifies his strategy when the first player playsx ¯, he diminishes his gain; thus it is in his interest to playy ¯. This property of equilibrium of saddle points justifies their use as a (reasonable) solution in a two-person zero-sum game ([Aubin (1979)]).

Proof. Define the function F : L R by F(y) := infx K f (x,y) (y L). By Theorem E.38, for every y L there exists→x K such that F(y)=∈ f (x,y). By∈ strict convexity, this element x is uniquely∈ determined.∈ We denote x := Φ(y) and thus obtain

F(y)= inf f (x,y)= f (Φ(y),y), y L. (E.11) x K ∈ ∈ E.5 * The von Neumann minimax theorem 1069

By concavity of the function y f (x,y) and by the definition of F, for every y1, 7→ y2 L and every t [0,1], ∈ ∈

F(ty1 + (1 t)y2)= f (Φ(ty1 + (1 t)y2),ty1 + (1 t)y2) − − − t f (Φ(ty1 + (1 t)y2),y1) + (1 t) f (Φ(ty1 + (1 t)y2),y2) ≥ − − − tF(y1) + (1 t)F(y2), ≥ − so that F is concave. Moreover, F is upper semicontinuous: let (yn) L be conver- ⊂ gent to y L. For every x K and every n N one has F(yn) f (x,yn), and taking the limes∈ superior on both∈ sides, we obtain,∈ by continuity of f≤,

limsupF(yn) limsup f (x,yn)= f (x,y). n ∞ ≤ n ∞ → →

Since x K was arbitrary, this inequality implies limsupn ∞ F(yn) F(y), i.e. F is upper semicontinuous.∈ → ≤ By Theorem E.38 (applied to F; use also Remark E.39), there existsy ¯ L such that − ∈ f (Φ(y¯),y¯)= F(y¯)= supF(y). y L ∈ We putx ¯ = Φ(y¯) and show that (x¯,y¯) is a saddle point. Clearly, for every x K, ∈ f (x¯,y¯) f (x,y¯). (E.12) ≤ Therefore it remains to show that for every y L, ∈ f (x¯,y¯) f (x¯,y). (E.13) ≥ 1 1 Φ Let y L be arbitrary and put yn := (1 n )y¯ + n y and xn = (yn). Then, by concavity,∈ −

F(y¯) F(yn) = f (xn,yn) ≥ 1 1 (1 ) f (xn,y¯)+ f (xn,y) ≥ − n n 1 1 (1 )F(y¯)+ f (xn,y), ≥ − n n or F(y¯) f (xn,y) for every n N. ≥ ∈ Since K is bounded and closed, the sequence (xn) K has a weakly convergent ⊂ subsequence which convergesto some element x0 K (Theorem E.33 and Corollary E.36). By the preceeding inequality and Corollary∈ E.37,

F(y¯) f (x0,y). ≥

This is just the remaininginequality (E.13)if we can provethat x0 = x¯. By concavity, for every x K and every n N, ∈ ∈ 1070 E Dual spaces and weak convergence

f (x,yn) f (xn,yn) ≥ 1 1 (1 ) f (xn,y¯)+ f (xn,y) ≥ − n n 1 1 (1 ) f (xn,y¯)+ F(y). ≥ − n n Letting n ∞ in this inequality and using Corollary E.37 again, we obtain that for every x →K, ∈ f (x,y¯) f (x0,y¯). ≥ Hence, x0 = Φ(y¯)= x¯ and the theorem is proved. Appendix F The divergence theorem

by WOLFGANG ARENDT

Introduction

This is an appendix to the course “PARTIAL DIFFERENTIAL EQUATIONS” given in Summer Semester 2006. We give a proof of the Divergence Theorem. The idea is to use Riesz’ Theorem to prove uniqueness of the surface measure, i.e., a measure on the boundary of a C1-domain Ω Rn such that ⊂

D jvdx = vν jdσ ΩZ ΓZ for all j = 1,...n and all v C1(Ω¯ ), where ν C(Γ ,Rn) denotes the outer normal. In the constructionof the surface∈ measure, this∈ uniqueness results makes superfluous the tedious proof of independanceof the chosen graph and the partition of the unity.

F.1 Open sets of class C1

Let Ω Rn be an open, bounded set and Γ = ∂Ω its boundary. For x,y Rn we denote⊂ by ∈ n x y = ∑ x jy j · j=1 the scalar product, by x := √x x the Eucleadian norm and by x ∞ := max x j the | | · | | j=1...n | | supremum norm.

1071 1072 F The divergence theorem (BY WOLFGANG ARENDT)

Definition F.1. a) Let U Rn be open. We say that Γ U is a normal C1-graph (with Ω on one side), if⊂ there exist g C1(Rn 1),r > 0∩,h > 0 such that ∈ − n 1 U = (y,g(y)+ s) : y R − , y ∞ < r,s R, s < h { ∈ | | ∈ | | } such that for x = (y,g(y)+ s) U one has ∈ x Ω if and only if s > 0 ; ∈ x Γ if and only if s = 0 and ∈ x Ω¯ if and only if s < 0 . 6∈ b) Let U Rn be open. We say that U Γ is a C1-graph with Ω on one side if there exist⊂ an orthonormal n n matrix∩B and b Rn such that φ(U) φ(Γ ) is a normal C1-graph with φ(Ω) on× one side where φ(∈x)= Bx + b. ∩ c) We say that Ω is of class C1 or that Ω has C1-boundary if for each z Γ there exists an open neighborhood U Rn of z such that U Γ is a C1-graph with∈ Ω on one side. ⊂ ∩

Remark F.2. Similarly, we define a normal Ck-graph or a normal Lipschitz graph, if the function g in a) is of class Ck or Lipschitz continuous, respectively. Accordingly, we define Ck-graphs and Lipschitz graphs as in b). We say that Ω is of class Ck or Ω has Lipschitz boundary, if for each z Γ there exists an open neighborhood U such that U Γ is a Ck-graph or a Lipschitz-graph,∈ respectively. ∩ Next we give the definition of the of Γ at a point z Γ . Here we need this notion merely to define the outer normal intrinsically (Theorem∈ F.5 below).

Definition F.3. Let z Γ . A vector v Rn is said to be tangent to Γ at z if there exists ψ C1(( ε,ε)∈,Rn) such that ∈ ∈ −

ψ(t) Γ for t < ε and ψ(0)= z,ψ′(0)= v . ∈ | |

We denote by Tz the set of all vectors v which are tangent to Γ at z. It is easy to see n that Tz is a vector subspace of R . We call it the tangent space of Γ at z.

n 1 Proposition F.4. Let Ω R be of class C , and let z Γ . Then Tz has dimension n 1. ⊆ ∈ − Proof. Let U Rn be an open neighborhood of z such that U Γ is a C1-graph with Ω on one⊂ side. We may assume that the graph is normal and∩ keep the notations n n 1 of Definition F.1.a). For x = (x1,...xn) R we let x := (x1,...xn 1) R − . Thus ∈ − ∈ zn = g(x). We show that

n 1 n − Tz = v R : vn = ∑ v jD jg(x) (F.1) { ∈ j=1 } F.1 Open sets of class C1 1073

n 1 n − which implies the claim. Let v R such that vn = ∑ v jD jg(z). Define ψ(t)= ∈ j=1 (z)+ tv,g(z + tv)). Then ψ C1(( ε,ε),Rn) and ψ(t) Γ for t < ε if ε > 0 ∈ − n 1 ∈ | | − is small enough. Then ψ(0)= z and ψ′(0) = (v, ∑ v jD jg(z)) = v. This proves j=1 one inclusion. Conversely, let v Tz. Consider ψ as in Definition F.3. Then ∈ ψ(0) = z,ψ′(0) = v. Since ψ(t) = (ψ1(t),...,ψn(t)) Γ for t < ε, it fol- ψ ψ ψ ε ∈ | | ψ lows that n 1(t)= g( 1(t),..., n(t)) for t < . Consequently, vn = n′ (0)= n 1 − n 1 | | − ψ ψ ψ − ∑ ′j(0)D jg( 1(0),..., n 1(0)) = ∑ v jD jg(z). This proves the other inclusion. j=1 − j=1 For A Rn we denote by ⊂ n A⊥ := v R : x v = 0 forall x A { ∈ · ∈ } n the space which is orthogonal to A. For x = (x1,...,xn) R we let x := n 1 ∈ (x1,...,xn 1) R − . − ∈ Theorem F.5 (Definition of the outer normal). Assume that Ω is of class C1. Then for each z Γ there exists a unique vector ν(z) Rn satisfying ∈ ∈ a) ν(z) T , ν(z) = 1, ∈ z⊥ | | b) z +tν(z) Ω for t ( ε,0), ∈ ∈ − c) z +tν(z) Ω for t [0,ε) 6∈ ∈ for some ε > 0. We call ν(z) the outer normal of Ω at z. Moreover, ν C(Γ ,Rn). ∈ Proof. Since dTz = n 1 it follows that for each z Γ there exists at most one ν(z) Rn satisfying (a),− (b), (c). Let U Rn be open∈ such that U Γ is a C1-graph. We show∈ that for z Γ U there exists⊂ν(z) satisfying (a), (b), (c).∩ We may assume that the graph is normal∈ ∩ and keep the notation of Definition F.1. For x Rn we let n 1 ∈ x = (x1,...xn 1) R − . Let − ∈ (∇g(z), 1) ν(z)= − for z Γ U . (F.2) 1 + ∇g(z) 2 ∈ ∩ | | ν ν p ψ Then (z) = 1 and (z) Tz⊥ by (F.1). It remains to show (b) and (c). Let (x)= | | 1 ∈n g(x) xn. Then ψ C (R ) and for x U one has − ∈ ∈ ψ(x) < 0 if and only if x Ω and ∈ ψ(x) 0 if and only if x Ω . ≥ 6∈ Let z Γ U. By Taylor’s formula, ∈ ∩ ψ(z + h) = (∇ψ(z) h)+ o(h) | ∇ψ with o(h) 0 as h 0. Note that ∇ψ(z) = (∇g(x), 1), hence ν(z)= (z) . h → | | → − ∇ψ(z) Thus | | | | 1074 F The divergence theorem (BY WOLFGANG ARENDT)

o(tν(z)) ψ(z +tν(z)) = t(∇ψ(z) ν(z)) + o(tν(z)) = t ∇ψ(z) + . · {| | t } ∇ψ o(tν(z)) ε ψ Since (z) 1 and t 0 as t 0 there exists > 0 such that (z + tν(z))| 0 for|≥t [0,ε) and ψ(z→+ tν(z)) →< 0 for t ( ε,0). Thus (b) and (c) are proved.≥ It follows∈ from (F.2) that ν : U Γ Rn is∈ continuous.− ∩ →

F.2 The divergence theorem

Let Ω Rn be a bounded, open set with C1-boundary Γ = ∂Ω. In this section we introduce⊂ the surface measure on Γ as the unique measure on Γ such that the Divergence Theorem holds. The Divergence Theorem is the extension of the Fundamental Theorem of Calculus to higher dimension. It implies the formula of partial integration in higher dimension and the Green’s formulas. The proof of the Divergence Theorem will be given in the next section.

Let Ω Rn be open and bounded with boundary Γ := ∂Ω. By C1(Ω¯ ) we denote ⊂ 1 the space of all continuous functions u : Ω¯ R which are C in Ω such that D ju → ∂ u has a continuous extension to Ω¯ for j = 1,...n. Here D ju(x)= (x1,...,xn) is the ∂ x j jth partial derivative at x = (x1,...,xn).

Theorem F.6 (Divergence Theorem). Assume that Ω is of class C1. Then there exists a unique Borel measure σ on Γ such that

1 D judx = uν jdσ ( j = 1,...n) forall u C (Ω¯ ) . (F.3) ∈ ΩZ ΓZ

Here ν C(Γ ,Rn) is the outer normal with components ∈

ν(z) = (ν1(z),...,νn(z)) (z Γ ) . ∈ The measure σ is called the surface measure or the (n 1)-dimensional Haus- dorff measure. − The Divergence Theorem is also called Gauß’s Theorem. It is an extension of the Fundamental Theorem of Calculus

b

u′(x)dx = u(b) u(a) − Za

(u C1[a,b]) to higher dimension. The following formula of integration by parts in higher∈ dimension is frequently used. F.2 The divergence theorem 1075

Corollary F.7 (Integration by parts).

(D ju)vdx = uD jvdx+ uvν jdσ (F.4) − ΩZ ΩZ ΓZ

1 (u,v C (Ω¯ )), j = 1,...n, where ν(z) = (ν1(z),...,νn(z)) is the outer normal. ∈ Proof. By the Divergence Theorem applied to uv we have

uvν jdσ = D j(uv)dx ΓZ ΩZ

= (D ju)vdx + u(D jv)dx . ΩZ ΩZ

Taking v = 1 in (F.4) we recover the original formula (F.3). Next we prove the Green’s formula. For u C1(Ω¯ ) we denote by ∈ ∂ n u ν ∂ν (z) := ∑ D ju(z) j(z) j=1 the normal derivative. Recall, that for x Ω,v Rn, ∈ ∈

∂ D u(x) := u(x +tv) v ∂t |t=0 = ∇u(x) v · is the directional derivative (or Gateaux derivative) of u at x in the direction v. Here n n ∇u(x) v = ∑ D ju(x)v j is the scalar product in R . Thus · j=1

∂u ∂ν (z)= Dν(z)u(z) if u has an extension in C1(Rn). Corollary F.8 (Green’s formula). Let u,v C2(Ω¯ ). Then ∈ ∂ u a) ∆u vdx = ∇u∇vdx+ ∂ν vdσ Ω · −Ω Γ R ∂ u R R b) ∆udx = ∂ν dσ. Ω Γ R 2 R 1 1 Here C (Ω¯ )= u C (Ω¯ ) : D ju C (Ω¯ ), j = 1...n . { ∈ ∈ } Proof. (a) Applying the integration-by-parts formula (F.4) to D ju instead of u we obtain 1076 F The divergence theorem (BY WOLFGANG ARENDT)

n ∆uvdx = ∑ D j(D ju)vdx ΩZ j=1ΩZ n = ∑ D juD jvdx+ ν jD juvdσ {− } j=1 ΩZ ΓZ ∂u = ∇u ∇vdx+ vdσ . − · ∂ν ΩZ ΓZ

(b) follows from (a) taking v = 1. Exercise. For u,v C2(Ω¯ ), ∈ ∂u ∂v ((∆u)v u∆v)dx = ( v u )dσ . − ∂ν − ∂ν ΩZ ΓZ

Finally we give some comments explaining the name “Divergence Theorem” and also the physical significance of this result. Let v C1(Ω¯ ,Rn), be a vectorfield, 1 ∈ v(x) = (v1(x),...,vn(x)) with v j C (Ω¯ ), j = 1,...n. Denote by ∈ n (div v)(x) := ∑ D jv j(x) j=1

n the divergence of v. Note that div v C(Ω¯ ). By (ν(z) v(z)) = ∑ ν j(z)v j(z) (z ∈ · j=1 ∈ Γ ) we denote the scalar product of v and the normal vector on Γ . Corollary F.9. Let v C1(Ω¯ ,Rn). Then ∈ (div v)(x)dx = v ν dσ . (F.5) · ΩZ ΓZ

Proof. One has

n n (div v)(x)dx = ∑ D jv j(x)dx = ∑ ν j(z)v j(z)dσ(z)= v ν dσ · ΩZ j=1ΩZ j=1ΓZ ΓZ by Theorem F.6. Remark F.10 (Physical interpretation). The scalar product v(z) ν(z) = v(z) cosα, where α is the angle between the vectors v(z) and ν(z), is· the pro- kjectionk of v(z) to the normal vector ν(z). If dσ(z) is identified with a small square (we consider dimension n = 3), then v(z) ν(z)dσ(z) is the flux of the vectorfield v through the surface element dσ(z). Thus,· if v is the velocity of a liquid, measured in m/sec, then v(z) ν(z)dσ(z) is the quantity of liquid going through the surface element dσ(z) in 1 second.· The integral F.3 Proof of the divergence theorem 1077

v(z) ν(z)dσ(z) (F.6) · ∂ωZ is the quantity of liquid going through the surface of a body ω in 1 second. If the liquid is incompressible, then the quantity going into the body is equal to the quan- tity going out of it; hence the flux (F.5) is 0. By the Divergence Theorem in the form of Corollary F.9 this implies that

(div v)(x)dx = 0 ωZ for all open sets ω of class C1 satisfying ω¯ Ω. This implies that ⊂ div v(x)= 0 forall x Ω . ∈ This result is rephrased by saying that the velocity field of an incompressible liquid is divergence free.The same is true for the electrical field in a space withoutcharge.

Example F.11 (Archimedean Principle). A body A is in a liquid of constant den- sity c > 0, whose surface coincides with the plane (x1,x2,0) : x1,x2 R . At the { ∈ } point z ∂A the liquid yields a pressure onto the body A which is equal to cz3ν(z). ∈ Observe that z3 is negative and the presure is directed to the interior of A. Thus, the force F executed onto the body A is

F = cz3ν(z)dσ(z) ∂ZA i.e., Fi = cz3νi(z)dσ(z) (i = 1,2,3). By the Divergence Theorem, ∂ A R ∂ Fi = c x3 dx . ∂xi ZA

Hence F1 = F2 = 0 and

F3 = c dx = c A | | ZA where A is the volume of A. | |

F.3 Proof of the divergence theorem

We start recalling Riesz’s Theorem. Let K be a compact space and C(K) the space of all continuous functions f : K R. By a positive linear form ϕ on C(K) we understand a linear mapping ϕ : C→(K) R such that → 1078 F The divergence theorem (BY WOLFGANG ARENDT)

f 0 implies ϕ( f ) 0 ≥ ≥ for all f C(K). If µ is a Borel measure on K, then ∈ ϕ( f )= f (x)dµ(x) KZ

defines a positive linear form on C(K). Conversely, Riesz’s Theorem says that each positive linear form is induced by a measure.

Theorem F.12 (Riesz). Let ϕ be a positive linear form on C(K). Then there exists a unique Borel measure µ on K such that

ϕ( f )= f (x)dµ(x) (F.7) KZ

for all f C(K). ∈ Next we recall the Stone-Weierstraß Theorem. A subalgebra of C(K) is a sub- space A of C(K) such that

f ,g A implies f g A . ∈ · ∈ We say that A separates the points of K if for all x,y K satisfying x = y there exists f A such that f (x) = f (y). ∈ 6 ∈ 6 Theorem F.13 (Stone-Weierstraß). Let A be a subalgebra of C(K) which sepa- rates the points of K and contains the constant-1-function 1K. Then A is dense in C(K).

Now let Ω Rn be an open, bounded set of class C1. We first prove uniqueness of the surface measure.⊂

Proposition F.14 (Uniqueness of the surface measure). Let σ be a Borel measure on Γ such that uν jdσ = 0 ( j = 1,...,n) (F.8) ΓZ for all u C1(Ω¯ ). Then σ = 0. ∈ A 1 n Γ Proof. The set = u Γ : u C (R ) is dense in C( ) by the Stone- { | ∈ } Weierstraß Theorem. It follows from this and the assumption (F.8) that uν jdσ = 0 Γ Γ ν ν2 σ R Γ for all u C( ). Replacing u by u j we conclude that u j d = 0 forall u C( ). ∈ Γ ∈ n R ν2 σ σ Since ∑ j = 1, it follows that ud = 0. This implies that = 0, by Riesz’ The- j=1 Γ orem. R F.3 Proof of the divergence theorem 1079

Next we prove existence of the surface measure. Let U Rn be open such that U Γ = /0. Let ⊂ ∩ 6

Cc(U Γ ) := u C(Γ ) : K U compact such that u(z)= 0 for z Γ K . ∩ { ∈ ∃ ⊂ ∈ \ }

We denote by Cc(U Γ ) := ϕ : Cc(U Γ ) R : ϕ is linear such that f 0 ∩ +′ { ∩ → ≥ implies ϕ( f ) 0 for all f Cc(U Γ ) the set of all positive linear forms on ≥ ∈ ∩ } Cc(U Γ ). Let U Γ be a normal graph. Define ∩ ∩ ϕ( f )= f (y,g(y)) 1 + ∇g(y) 2 dy (F.9) | | y ∞Z

for all f Cc(U Γ ) where we use the notation of Definition F.1. Then ϕ Cc(U Γ ∈ ∩ ∈ ∩ )+′ . Lemma F.15. Let U Γ be a normal C1-graph as above. Let u C1(Ω¯ ) be such that supp u U. Then∩ ∈ ⊂ ϕ ν D judx = (u Γ j) ( j = 1...n) . (F.10) | ΩZ

Here supp u := x Ω : u(x) = 0 (i.e. the closure of the open set x Ω : u(x) = 0 ) is a compact{ ∈ subset of6 Rn.} We assume that supp u U, hence{u(x∈)= 0 6 } Γ ⊂ for all x U supp u. Thus u Γ Cc(U ). ∈ \ | ∈ ∩ Proof. Denote by n 1 φ : R − ( h,h) U × − → the diffeomorphism given by φ(y,s) = (y,g(y)+ s). Then det Dφ(y,s)= 1 for all (y,s) Rn 1 ( h,h). ∈ − × − a) Let j 1,...,n 1 . Let u C1(Ω¯ ) such that supp u U. Then ∈{ − } ∈ ⊂ h

D judx = D ju(y,g(y)+ s) dy ΩZ y ∞Z

= u(y,g(y))D jg(y)dy y ∞Z

where we used the Fundamental Theorem of Calculus in 1 variable and the fact that ν D jg(y) ϕ ν supp u U. Since j(y,g(y)) = , the last term equals ( ju Γ ). √1+ ∇g(y) 2 | ⊂ | | ∂ u b) Let j = n. Then ∂ s (y,g(y)+ s)= Dnu(y,g(y)+ s). Hence

h

Dnu(x)dx = Dnu(y,g(y)+ s) dy = ΩZ y ∞Z

ν˜ (φ(z)) = Bν(z) (F.11)

for all z Γ . Consider the positive linear form ϕ˜ on Cc(Γ˜ U˜ ) constructed above. ϕ∈ Γ ∩ Define Cc( U)+′ by ∈ ∩ 1 ϕ( f )= ϕ˜( f φ − ) (F.12) ◦ for f Cc(Γ U). ∈ ∩ Lemma F.16. LetU Γ beaC1-graph as above. Let u C1(Ω¯ ) such that supp u U. Then ∩ ∈ ⊂ ϕ ν D judx = (u Γ j) ( j = 1,...,n) . | ΩZ

1 1 Proof. Let B = (bk j)k j 1 n.Letu ˜ = u φ .Thenu ˜ C (Ω˜ ) and by Lemma F.15. , = ... ◦ − ∈ F.3 Proof of the divergence theorem 1081 ∂ D ju = u˜(φ(x))dx ∂x j ΩZ ΩZ n = ∑ Dku˜(φ(x))bk j dx ΩZ k=1 n = ∑ Dku˜(y)bk j dy ΩZ˜ k=1 n ϕ ν˜ = ∑ bk j ˜(u˜ Γ˜ k) k=1 | 1 1 = ϕ˜(u φ − (B− ν˜ ) j) Γ˜ ◦ | ϕ 1ν φ = (u Γ (B− ˜ ) j) | ◦ ϕ ν = (u Γ j) . | Now we piece together the surface measure using the positive linear forms on the 1 n graphs. Since Ω is of class C there exist open sets Um R ,m = 1,...,M such that ⊂ M Γ Um ⊂ m[=1 1 and Γ Um is a C -graph for each m 1,...,M . Let U0 be an open set such that M∩ ∈{ } Ω¯ Um. We recall that there exists a partition of unity on Ω¯ subordinate to the ⊂ m=0 ∞ n open setsS Um, m = 0,1...M; that is, there exist functions ηm C (R ) such that ∈ 0 ηm 1, supp ηm Um and ≤ ≤ ⊂ M ∑ ηm(x)= 1 m=0 for all x Ω¯ . For m 1,...,M let ϕm Cc(Γ Um) be the positive linear form ∈ ∈{ } ∈ ∩ +′ constructed above; that is, ϕm satisfies

D u = ϕ (ν u ) j m j Γ Um | ∩ ΩZ

1 for all u C (Ω¯ ) satisfying supp u Um. Define the positive linear form ϕ on C(Γ ) by ∈ ⊂ M ϕ( f ) := ∑ ϕm(ηm f ) . m=1 By Riesz’ Theorem there exists a Borel measure σ on Γ such that 1082 F The divergence theorem (BY WOLFGANG ARENDT)

ϕ( f )= f (z)dσ(z) ΓZ

1 for all f C(Γ ). Let u C (Ω¯ ) and let j 1,...n . Then uη0 has compact support ∈ ∈ ∈{ } in Ω. Hence D j(uη0)dx = 0. Indeed, extending uη0 by 0 outside of Ω we obtain Ω R 1 n a function v C (R ) vanishing outside of a cube Q := x : x j R for all j = 1,...,n . Since∈ { | |≤ } R ∂v (x1 ...x j,x...xn)dx j = v(x1,...,R,...xn) v(x1,..., R,...xn)= 0 ∂x j − − ZR − for all x1,...x j 1,x j+1,...xn R it follows that − ∈

D j(uη0)dx = D jvdx Ω Q R R R R ∂ v = ... ∂ x (x1 ...x j ...xn)dx j dx1 ...dx j 1 dx j+1 ...dx1 = 0 . R R j − −R −R Let m 1...M . Then uηm Cc(Γ Uj). Hence for j 1,...n , ∈{ } ∈ ∩ ∈{ } η ϕ η ν D j( mu)dx = m( mu Γ j) . | ΩZ

Thus

M D judx = D j( ∑ ηmu)dx ΩZ ΩZ m=0 M = ∑ D j(ηmu)dx m=0ΩZ M = ∑ D j(ηmu)dx m=1ΩZ M ϕ η ν = ∑ m( mu Γ j) m=1 | ϕ ν = (u Γ j) | ν σ = u Γ jd . | ΓZ

Thus σ is a Borel measure satisfying the requirements of the Divergence Theorem. We have seen before that it is unique. In particular, the construction given above does not depend on the choice of the graphs and the choice of the partition of unity. Appendix G Sobolev spaces

G.1 Test functions, convolution and regularization

Let Ω Rd be an open set. For every continuous function ϕ C(Ω) we define the support⊆ ∈ suppϕ := x Ω : ϕ(x) = 0 , { ∈ 6 } where the closure is to be understoodin Rd. Thus, the supportis by definition always closed in Rd, but it is not necessarily a subset of Ω. Next we let

∞ ∞ D(Ω) := C (Ω) := ϕ C (Ω) : suppϕ Ω is compact c { ∈ ⊂ } be the space of test functions on Ω, and

1 Ω Ω ∞ Ω Lloc( ) := f : K measurable : f < K compact { → K | | ∀ ⊂ } Z the space of locally integrable functions on Ω. For every f L1 (Rd) and every ϕ D(Rd) we define the convolution f ϕ by ∈ loc ∈ ∗ f ϕ(x) := f (x y)ϕ(y) dy ∗ Rd − Z = f (y)ϕ(x y) dy. Rd − Z 1 d ϕ D d ϕ ∞ d Lemma G.1. For every f Lloc(R ) and every (R ) onehas f C (R ) and for every 1 i d, ∈ ∈ ∗ ∈ ≤ ≤ ∂ ∂ϕ ( f ϕ)= f . ∂xi ∗ ∗ ∂xi

d Proof. Let ei R be the i-th unit vector. Then ∈ 1 ∂ϕ lim (ϕ(x + hei) ϕ(x)) = (x) h 0 h − ∂xi →

1083 1084 G Sobolev spaces uniformly in x Rd (note that ϕ has compact support). Hence, for every x Rd ∈ ∈ 1 ( f ϕ(x + hei) f ϕ(x)) h ∗ − ∗ 1 = f (y)(ϕ(x + hei y) ϕ(x y)) dy h Rd − − − Z ∂ϕ f (y) (x y) dy. → Rd ∂x − Z i The following theorem is proved in courses on measure theory. We omit the proof.

Theorem G.2 (Young’s inequality). Let f Lp(Rd ) and ϕ D(Rd). Then f ϕ Lp(Rd) and ∈ ∈ ∗ ∈ f ϕ p f p ϕ 1. k ∗ k ≤k k k k Theorem G.3. For every 1 p < ∞ and every open Ω Rd the space D(Ω) is dense in Lp(Ω). ≤ ⊂

Proof. The technique of this proof (regularization and truncation) is important in the theory of partial differential equations, distributions and Sobolev spaces. The first step (regularization) is based on Lemma G.1. The truncation step is in this case relatively easy. ϕ D d ϕ ϕ Regularization. Let (R ) be a positive function such that 1 = Rd = 1. One may take for example∈ the function k k R 1/(1 x 2) ce −| | if x < 1, ϕ(x) := | | (G.1) ( 0 otherwise,

d with an appropriate constant c > 0. Then let ϕn(x) := n ϕ(nx), so that ϕn 1 = ϕ k k Rd n = 1 for every n N. Let f Lp(Rd ). By∈ Lemma G.1 and Young’sinequality (Theorem G.2), for every R ∈ ∞ d p d n N, fn := f ϕn C (R ) L (R ) and fn p f p. Hence, for every n N ∈ ∗ p ∈ d p ∩ d k k ≤k k ∈ the operator Tn : L (R ) L (R ), f f ϕn is linear and bounded and Tn 1. → 7→ ∗ k k≤ Moreover, if f = 1I for some bounded interval I = (a1,b1) (ad,bd) Ω, then ×···× ⊂ p p ϕ d fn f p = f (x y) (ny)n dy f (x) dx k − k Rd Rd − − Z Z p y = ( f (x ) f (x))ϕ(y) dy dx Rd Rd − n − Z Z p y f (x ) f (x) ϕ(y) dy dx 0 ≤ Rd Rd | − n − | → Z Z  as n ∞ by Lebesgue’s dominated convergence theorem. In other words, → d limn ∞ Tn f f p = 0 for every f = 1I with I as above. Since span 1I : I R → k − k { ⊂ G.1 Test functions, convolution and regularization 1085

p d bounded interval is dense in L (R ), we find that limn ∞ Tn f f p = 0 for ev- } p d → k − k ery f from a dense subset M of L (R ). Since the Tn are bounded, we conclude p d p d from Lemma D.58 that Tn f f in L (R ) for every f L (R ). This proves that Lp C∞(Rd ) is dense in Lp(→Rd). ∈ ∩Truncation. Now we consider a general open set Ω Rd and prove the claim. ϕ D d ⊂ϕ ϕ Let (R ) be a positive test function such that supp B(0,1) and Rd = 1 (one may∈ take for example the function from (G.1)). Then let⊂ ϕ (x) := ndϕ(nx). n R For every n N we let ∈ 1 Kn := x Ω : dist(x,∂Ω) B(0,n), { ∈ ≥ n} ∩ so that Kn Ω is compact for every n N. Now let⊂f Lp(Ω) Lp(Rd) and ε∈> 0. Let ∈ ⊂

f (x) if x Kn, ∈ f 1Kn (x)= ( 0 if x Ω Kn. ∈ \ Ω By Lebesgue’s dominated convergence theorem (since n Kn = ), S p p p ∞ f f 1Kn p = f (1 1Kn) 0 as n . k − k Ω | | − → → Z ε In particular, there exists n N such that f f 1Kn p . ∈ k −ϕ k p≤ ∞ d For every m 4n we define gm := ( f 1Kn ) m L C (R ); note that we here consider Lp(Ω)≥as a subspace of Lp(Rd ) by∗ extending∈ ∩ functions in Lp(Ω) by 0 outside Ω. However, since gm = 0 outside K2n, we find that actually gm D(Ω). ∈ By the first step (regularisation), there exists m 4n so large that gm f 1K p ε. ≥ k − n k ≤ For such m we have f gm p 2ε, and the claim is proved. k − k ≤ Lemma G.4. Let f L1 (Ω) be such that ∈ loc f ϕ = 0 for every ϕ D(Ω). Ω ∈ Z Then f = 0. Proof. We first assume that f L1(Ω) is real and that Ω has finite measure. By ∈ Theorem G.3, for every ε > 0 there exists g D(Ω) such that f g 1 ε. By assumption, this implies ∈ k − k ≤

gϕ = ( f g)ϕ ε ϕ ∞ ϕ D(Ω). | Ω | | Ω − |≤ k k ∀ ∈ Z Z

Let K1 := x Ω : g(x) ε and K2 := x Ω : g(x) ε . Since g is a test { ∈ ≥ } { ∈ ≤− } function, the sets K1, K2 are compact. Since they are disjoint and do not touch the boundary of Ω,

inf x y , x z , y z : x K1, y K2, z ∂Ω =: δ > 0. {| − | | − | | − | ∈ ∈ ∈ } 1086 G Sobolev spaces

δ Ω δ δ δ Let Ki := x : dist(x,Ki) /4 (i = 1, 2). Then K1 and K2 are two compact disjoint subsets{ ∈ of Ω. Let ≤ }

1 if x Kδ , ∈ 1 δ h(x) :=  1 if x K2 ,  − ∈  0 else,

 d choose a positive test function ϕ D(R ) such that d ϕ = 1 and suppϕ ∈ R ⊂ B(0,δ/8), and let ψ := h ϕ. Then ψ D(Ω), 1 ψ 1, ψ = 1 on K1 and ∗ ∈ − ≤ R ≤ ψ = 1 on K2. Let K := K1 K2. Then − ∪ g = gψ ε + gψ ε + g . K | | K ≤ Ω K | |≤ Ω K | | Z Z Z \ Z \ Hence, g = g + g ε + 2 g ε(1 + 2 Ω ), Ω | | K | | Ω K | |≤ Ω K | |≤ | | Z Z Z \ Z \ which implies f f g + g 2ε(1 + Ω ). Ω | |≤ Ω | − | Ω | |≤ | | Z Z Z Since ε > 0 was arbitrary, we find that f = 0. The general case can be obtained from the particular case ( f L1 and Ω < ∞) by considering first real and imaginary part of f separately, and then∈ by considering| | f 1B for all closed (compact) balls B Ω. ⊂

G.2 Sobolev spaces in one dimension

Recall the fundamental rule of partial integration: if f , g C1([a,b]) on some com- pact interval [a,b], then ∈

b b fg′ = f (b)g(b) f (a)g(a) f ′g. a − − a Z Z In particular, for every f C1([a,b]) and every ϕ D(a,b) ∈ ∈ b b f ϕ′ = f ′ϕ, (G.2) a − a Z Z since ϕ(a)= ϕ(b)= 0.

Definition G.5 (Sobolev spaces). Let ∞ a < b ∞ and 1 p ∞. We define − ≤ ≤ ≤ ≤ b b 1,p p p W (a,b) := u L (a,b) : g L (a,b) ϕ D(a,b) : uϕ′ = gϕ . { ∈ ∃ ∈ ∀ ∈ a − a } Z Z G.2 Sobolev spaces in one dimension 1087

The space W 1,p(a,b) is called (first) Sobolev space. If p = 2, then we also write H1(a,b) := W 1,2(a,b). By Lemma G.4, the function g Lp(a,b) is uniquely determined if it exists. In ∈ 1,p the following, we will write u′ := g, in accordance with (G.2). We equip W (a,b) with the norm u 1,p := u p + u′ p, k kW k k k k and if p = 2, then we define the inner product

b b u,v H1 := uv + u′v′, h i a a Z Z 2 2 1 which actually yields the norm u 1 = ( u + u ) 2 (which is equivalent to k kH k k2 k ′k2 1,2 ). k·kW Lemma G.6. The Sobolev spaces W 1,p(a,b) are Banach spaces, which are separa- ble if p = ∞. The space H1(a,b) is a separable Hilbert space. 6 Proof. The fact that the W 1,p are Banach spaces, or that H1 is a Hilbert space, is an exercise. Recall that Lp(a,b) is separable (Remark D.46). Hence, the product space Lp(a,b) Lp(a,b) is separable, and also every subspace of this product space is separable.× Now consider the linear mapping

1,p p p T : W (a,b) L (a,b) L (a,b), u (u,u′), → × 7→ which is bounded and even isometric. Hence, W 1,p is isometrically isomomorphic to a subspace of Lp Lp which is separable. Hence W 1,p is separable. × Lemma G.7. Let u W 1,p(a,b) be such that u = 0. Then u is constant. ∈ ′ ψ D b ψ ϕ D Proof. Choose (a,b) such that a = 1. Then, for every (a,b), the ϕ b ϕ∈ψ b ϕ ∈ b ϕ ψ function ( a ) is the derivative ofR a test function since a ( ( a ) )= 0. Hence, by definition,− − R b b R R 0 = u(ϕ ( ϕ)ψ), a − a Z Z b ψ or, with c = a u = const,

R b (u c)ϕ = 0 ϕ D(a,b). a − ∀ ∈ Z By Lemma G.4, u = c almost everywhere. p Lemma G.8. Let ∞ < a < b < ∞ and let t0 [a,b].Let g L (a,b) and define − ∈ ∈ t u(t) := g(s) ds, t [a,b]. t ∈ Z 0 Then u W 1,p(a,b) and u = g. ∈ ′ 1088 G Sobolev spaces

Proof. Let ϕ D(a,b). Then, by Fubini’s theorem, ∈ b b t uϕ′ = g(s) dsϕ′(t) dt Za Za Zt0 t0 t b t = g(s) dsϕ′(t) dt + g(s) dsϕ′(t) dt Za Zt0 Zt0 Zt0 t0 s b b = ϕ′(t) dtg(s) ds + ϕ′(t) dtg(s) ds − a a t s Z Z Z 0 Z t0 b = ϕ(s)g(s) ds ϕ(s)g(s) ds − a − t Z Z 0 b = gϕ. − a Z Theorem G.9. Let u W 1,p(a,b) (bounded or unbounded interval). Then there ex- ists u˜ C((a,b)) which∈ is continuous up to the boundary of (a,b), which coincides with u∈ almost everywhere and such that for every s, t (a,b) ∈ t u˜(t) u˜(s)= u′(r) dr. − s Z t Proof. Fix t0 (a,b) and define v(t) := u (s) ds (t (a,b)). Clearly, the function ∈ t0 ′ ∈ v is continuous. By Lemma G.8, v W 1,p(c,d) for every bounded interval (c,d) ∈ R ⊂ (a,b), and v′ = u′. By Lemma G.7, u v = C for some constant C which clearly does not depend on the choice of the− interval (c,d). This proves that u coincides almost everywhere with the continuous functionu ˜ = v +C. By Lemma G.8,

t u˜(t) u˜(s)= v(t) v(s)= u′(r) dr. − − s Z Remark G.10. By Theorem G.9, we will identify every function u W 1,p(a,b) with its continuous representant, and we say that every function in W 1,p∈(a,b) is continu- ous.

Lemma G.11 (Extension lemma). Let u W 1,p(a,b). Then there exists u˜ W 1,p(R) such that u˜ = u on (a,b). ∈ ∈

Proof. Assume first that a and b are finite and define

u (t) if t [a,b], ′ ∈  u(a) if t [a 1,a), ∈ − g(t) :=   u(b) if t (b,b + 1],  − ∈ 0 else.    G.2 Sobolev spaces in one dimension 1089

p t Then g L (R). Letu ˜(t) := ∞ g(s) ds, so thatu ˜ = u on (a,b). By Lemma G.8, 1,∈p − u˜ W (c,d) for every boundedR interval (c,d) R. However,u ˜ = 0 outside (a 1,∈b + 1) which implies thatu ˜ W 1,p(R). ∈ − The case of a = ∞ or b =∈∞ is treated similarly. − Lemma G.12. For every 1 p < ∞, the space D(R) is dense in W 1,p(R). ≤ Proof. Let u W 1,p(R). ∈ ϕ D ϕ Regularization: Choose a positive test function (R) such that R = 1 and ∞ p ∈ p put ϕn(x)= nϕ(nx). Then un := u ϕn C L (R), u = u ϕn L (R) and ∗ ∈ ∩ n′ ′ ∗ ∈ R

lim u un p = 0 and n ∞ → k − k lim u′ u′ p = 0, n ∞ n → k − k 1,p ∞ so that limn ∞ u un W 1,p = 0. This proves that W (R) C (R) is dense in W 1,p(R). → k − k ∩ Truncation: Choose a sequence (ψn) D(R) such that 0 ψn 1, ψn = 1 on ⊂ ∞≤ ≤ [ n,n] and ψ ∞ C for all n N. Let ε > 0. Choose v C W 1,p(R) such that − k n′ k ≤ ∈ ∈ ∩ u v 1,p ε (regularization step). For every n N, one has vψn D(R) and it k − kW ≤ ∈ ∈ is easy to check that for all n large enough, v vψn 1,p ε. The claim is proved. k − kW ≤ Corollary G.13. For every u W 1,p(a,b) (bounded or unbounded interval, 1 p < ∈ ≤ ∞) and every ε > 0, there exists v D(R) such that u v 1,p ε. ∈ k − |(a,b)kW ≤ Proof. Given u W 1,p(a,b), we first choose an extensionu ˜ W 1,p(R) (extension ∈ ∈ lemma G.11) and then a test function v D(R) such that u˜ v 1,p ε (Lemma ∈ k − kW (R) ≤ G.12). Then u˜ v 1,p = u v 1,p ε. k − kW (a,b) k − kW (a,b) ≤ Corollary G.14 (Sobolev embedding theorem). Every function u W 1,p(a,b) is continuous and bounded and there exists a constant C 0 such that∈ ≥ 1,p u ∞ C u 1,p for every u W (a,b). k k ≤ k kW ∈ Proof. If p = ∞, there is nothing to prove. We first prove the claim for the case (a,b)= R. ∞ D p 1 1 So let 1 p < and let v (R). Then G(v) := v − v Cc (R) and G(v)′ = p v p 1v . By≤ H¨older’s inequality,∈ | | ∈ | | − ′ x p 1 p 1 G(v)(x) = p v − v′ p v p− v′ p, | | | ∞ | | |≤ k k k k Z− so that by Young’s inequality (ab 1 ap + 1 bp′ ) ≤ p p′ 1/p v ∞ = G(v) ∞ C v 1,p . k k k k ≤ k kW Since D(R) is dense in W 1,p(R) by Lemma G.12, the claim for (a,b)= R follows by an approximation argument. The case (a,b) = R is an exercise. 6 1090 G Sobolev spaces

Theorem G.15 (Product rule, partial integration). Let u, v W 1,p(a,b) (1 p ∞). Then: ∈ ≤ ≤ (i) (Product rule). The product uv belongs to W 1,p(a,b) and

(uv)′ = u′v + uv′.

(ii) (Partial integration). If ∞ < a < b < ∞, then − b b u′v = u(b)v(b) u(a)v(a) uv′. a − − a Z Z 1,p Proof. Since every function in W (a,b) is bounded, we find that uv, u′v + uv′ p D ∈ L (a,b). Choose sequences (un), (vn) (R) such that limn ∞ un (a,b) = u and 1,p ⊂ → | limn ∞ vn (a,b) = v in W (a,b) (Corollary G.13). By Corollary G.14, this implies → | also limn ∞ un (a,b) u ∞ = limn ∞ vn (a,b) v ∞ = 0. The classical product rule implies → k | − k → k | − k (unvn)′ = u′ vn + unv′ for every n N, n n ∈ and the classical rule of partial integration implies

b b un′ vn = un(b)vn(b) un(a)vn(a) unvn′ for every n N. a − − a ∈ Z Z The claim follows upon letting n tend to ∞.

Definition G.16. For every 1 p ∞ and every k 2 we define inductively the Sobolev spaces ≤ ≤ ≥

k,p 1,p k 1,p W (a,b) := u W (a,b) : u′ W − (a,b) , { ∈ ∈ } which are Banach spaces for the norms

k ( j) u Wk,p := ∑ u p. k k j=0 k k

We denote Hk(a,b) := W k,2(a,b) which is a Hilbert space for the scalar product

k ( j) ( j) u,v Hk := ∑ u v L2 . h i j=0

Finally, we define k p k,p , D k·kW W0 (a,b) := (a,b) , k,p k,p that is, W0 (a,b) is the closure of the test functions in W (a,b), and we put k k,2 H0 (a,b) := W0 (a,b). G.2 Sobolev spaces in one dimension 1091

∞ ∞ 1,p Theorem G.17. Let < a < b < . A function u W0 (a,b) if and only if u W 1,p(a,b) and u(a)=−u(b)= 0. ∈ ∈ Theorem G.18 (Poincare´ inequality). Let ∞ < a < b < ∞ and 1 p < ∞. Then there exists a constant λ > 0 such that − ≤

b b λ p p 1,p u u′ for every u W0 (a,b). a | | ≤ a | | ∈ Z Z Proof. Let u W 1,p(a,b). Then ∈ b b x p p u(x) dx = u′(y) dy dx a | | a a Z Z Z b b p

u′(y) dy dx ≤ a a | | Z Z  b b p 1 p (b a) − u′(y) dydx ≤ a − a | | Z Z b p p = (b a) u′(y) dy. − a | | Z Between the first and the second line, we have used the assumption that u(a)= 0, while in the following inequality we applied H¨older’s inequality. Theorem G.19. Let ∞ < a < b < ∞. For every f L2(a,b) there exists a unique function u H1(a,b)− H2(a,b) such that ∈ ∈ 0 ∩

u u′′ = f and − (G.3) ( u(a)= u(b)= 0 .

1 2 Proof. We first note that if u H0 (a,b) H (a,b) is a solution, then, by partial integration (Theorem G.15), for∈ every v ∩H1(a,b) ∈ 0 b b (uv + u′v′) = (u,v) 1 = fv. (G.4) H0 Za Za ϕ 1 By the Cauchy-Schwarz inequality, the linear functional H0 (a,b)′ defined ϕ b ∈ by (v)= a fv is bounded: R ϕ (v) f 2 v 2 f 2 v H1 . | |≤k k k k ≤k k k k 0 1 By the theorem of Riesz-Fr´echet, there exists a unique u H0 (a,b) such that 1 ∈ (G.4)holds true for all v H0 (a,b). This proves uniqueness of a solution of (G.3), and if we provethat in addition∈ u H2(a,b), then we prove existence, too. However, (G.4)holds in particular for all v ∈D(a,b), i.e. ∈ b b u′v′ = (u f )v v D(a,b) a − a − ∀ ∈ Z Z 1092 G Sobolev spaces

2 1 and u f L (a,b) by assumption. Hence, by definition, u′ H (a,b), i.e. u H2(a,b−) and∈ u = u f . Using also Theorem G.17, the claim is∈ proved. ∈ ′′ −

G.3 Sobolev spaces in several dimensions

In order to motivate Sobolev spaces in several space dimensions, we have to recall the partial integration rule in this case (see the Divergence Theorem F.6). Theorem G.20 (Gauß). Let Ω Rd be open and bounded such that ∂Ω is of class C1. Then there exists a unique⊆ Borel measure σ on ∂Ω such that for every u, v C1(Ω¯ ) and every 1 i d ∈ ≤ ≤ ∂v ∂u u = uvni dσ v, Ω ∂x ∂Ω − Ω ∂x Z i Z Z i

where n(x) = (ni(x))1 i d denotes the outer normal vector at a point x ∂Ω. ≤ ≤ ∈ In particular, if u C1(Ω¯ ) and ϕ D(Ω), then ∈ ∈ ∂ϕ ∂u u = ϕ. Ω ∂x − Ω ∂x Z i Z i Definition G.21 (Sobolev spaces). Let Ω Rd be any open set and 1 p ∞. We define ⊆ ≤ ≤

1,p p p W (Ω) := u L (Ω) : 1 i d gi L (Ω) { ∈ ∀ ≤ ≤ ∃ ∈ ∂ϕ ϕ D(Ω) : u = giϕ . ∀ ∈ Ω ∂x − Ω } Z i Z The space W 1,p(Ω) is called (first) Sobolev space. If p = 2, then we also write H1(Ω) := W 1,2(Ω).

1,p Let u W (Ω). By Lemma G.4, the functions gi are uniquely determined. We ∂ u ∈ ∂ u write := gi and call the partial derivative of u with respect to xi. As in the ∂ xi ∂ xi one-dimensional case, the following holds true. Lemma G.22. The Sobolev spaces W 1,p(Ω) are Banach spaces for the norms

d ∂ u ∞ u W1,p := u p + ∑ ∂ p (1 p ), k k k k i=1 k xi k ≤ ≤

and H1(Ω) is a Hilbert space for the inner product

d ∂u ∂v u,v H1 := u,v L2 + ∑ ∂ , ∂ L2 . h i h i i=1h xi xi i G.3 Sobolev spaces in several dimensions 1093

Proof. Exercise. Not all properties of Sobolev spaces on intervals carry over to Sobolev spaces on open sets Ω Rd. For example, it is not true that every function u W 1,p(Ω) is continuous (without⊂ any further restrictions on p and Ω)! ∈ Definition G.23. For every open Ω Rd, 1 p ∞ and every k 2 we define inductively the Sobolev spaces ⊂ ≤ ≤ ≥ ∂ k,p 1,p u k 1,p W (Ω) := u W (Ω) : 1 i d : W − (Ω) , { ∈ ∀ ≤ ≤ ∂xi ∈ } which are Banach spaces for the norms

k ∂u u k,p := u p + ∑ k 1,p . W ∂ W − k k k k i=0 k xi k

We denote Hk(Ω) := W k,2(Ω) which is a Hilbert space for the inner product

k ∂u ∂v u,v k := u,v 2 + ∑ , k 1 . H L ∂ ∂ H − h i h i i=0h xi xi i Finally, we define k p k,p , Ω D Ω k·kW W0 ( ) := ( ) , k,p Ω k,p Ω that is, W0 ( ) is the closure of the test functions in W ( ), and we put k Ω k,2 Ω H0 ( ) := W0 ( ). Theorem G.24 (Poincare´ inequality). Let Ω Rd be a bounded domain, and let 1 p < ∞. Then there exists a constant C 0 ⊂such that ≤ ≥ u p Cp ∇u p for every u W 1,p(Ω). Ω | | ≤ Ω | | ∈ 0 Z Z We note that the Poincar´einequality implies that

1 p u := ∇u p k k Ω | | Z  defines an equivalent norm on W 1,p(Ω) if Ω Rd is bounded. Clearly, 0 ⊂ 1,p u u 1,p for every u W W 0 , k k≤k k 0 ∈ by the definition of the norm in W 1,p. On the other hand,

u 1,p C u p ∇u p W ( L + L ) k k 0 ≤ k k k k C ∇u Lp = C u , ≤ k k k k 1094 G Sobolev spaces

by the Poincar´einequality.

We also state the following two theorems without proof.

Theorem G.25 (Sobolev embedding theorem). Let Ω Rd be an open set with C1 boundary. Let 1 p ∞ and define ⊂ ≤ ≤ dp d p if 1 p < d p∗ := − ≤ ( ∞ if d < p,

andif p = d, then p [1,∞). Then, for every p q p we have ∗ ∈ ≤ ≤ ∗ W 1,p(Ω) Lq(Ω) ⊂ with continuous embedding, that is, there exists C = C(p,q) 0 such that ≥ 1,p u Lq C u 1,p for every u W (Ω). k k ≤ k kW ∈ Theorem G.26 (Rellich-Kondrachov). Let Ω Rd be an open and bounded set 1 ⊂ with C boundary. Let 1 p ∞ and define p∗ as in the Sobolev embedding theo- rem. Then, for every p ≤q <≤∞ the embedding ≤ W 1,p(Ω) Lq(Ω) ⊂ is compact, that is, every bounded sequence in W 1,p(Ω) has a subsequence which converges in Lq(Ω).

G.4 * Elliptic partial differential equations

Let Ω Rd be an open, bounded set, f L2(Ω), and consider the elliptic partial differential⊂ equation ∈ u ∆u = f in Ω, − (G.5) ( u = 0 in ∂Ω, where d ∂ 2 ∆u(x) := ∑ u(x) ∂ 2 i=1 xi stands for the Laplace operator. 1 Ω 2 Ω If u H0 ( ) H ( ) is a solution of (G.5), then, by definition of the Sobolev spaces,∈ for every∩v D(a,b) ∈ G.4 * Elliptic partial differential equations 1095

d ∂u ∂v u,v H1 = uv + ∑ h i 0 Ω ∂x ∂x Z i=1 i i  d ∂ 2u = uv ∑ v Ω − ∂x2 Z i=1 i  = (u ∆u)v Ω − Z = fv. Ω Z 1 Ω By density of the test functions in H0 ( ), this equality holds actually for all v 1 Ω ∈ H0 ( ). This may justify the following definition of a weak solution. 1 Ω Definition G.27. A function u H0 ( ) is called a weak solution of (G.5) if for every v H1(Ω) ∈ ∈ 0 ∇ ∇ u,v H1 = uv + u v = fv, (G.6) h i 0 Ω Ω Ω Z Z Z where ∇u is the usual, euclidean gradient of u.

Theorem G.28. Let Ω Rd be an open, bounded set. Then, for every f L2(Ω) there exists a unique weak⊂ solution u H1(Ω) of the problem (G.5). ∈ ∈ 0 Proof. By the Cauchy-Schwarz inequality, the linear functional ϕ H1(Ω) defined ∈ 0 ′ by ϕ(v)= Ω fv is bounded: R ϕ (v) f 2 v 2 f 2 v H1 . | |≤k k k k ≤k k k k 0 1 Ω By the theorem of Riesz-Fr´echet, there exists a unique u H0 ( ) such that (G.6) holds true for all v H1(a,b). The claim is proved. ∈ ∈ 0