
Introduction to Analysis in Several Variables (Advanced Calculus)

Michael Taylor

Math. Dept., UNC

E-mail address: [email protected]

2010 Mathematics Subject Classification. 26B05, 26B10, 26B12, 26B15, 26B20

Key words and phrases. real numbers, complex numbers, Euclidean space, metric spaces, compact spaces, Cauchy sequences, continuous functions, power series, derivative, mean value theorem, Riemann integral, fundamental theorem of calculus, arclength, exponential function, logarithm, trigonometric functions, Euler's formula, multiple integrals, surfaces, surface integrals, differential forms, Stokes theorem, degree, Riemannian manifold, metric tensor, geodesics, curvature, Gauss-Bonnet theorem

Contents

Preface ix
Some basic notation xv
Chapter 1. Background 1
§1.1. One variable calculus 2
§1.2. Euclidean spaces 19
§1.3. Vector spaces and linear transformations 25
§1.4. Determinants 35
Chapter 2. Multivariable differential calculus 45
§2.1. The derivative 46
§2.2. Inverse function and implicit function theorems 66
§2.3. Systems of differential equations and vector fields 80
Chapter 3. Multivariable integral calculus and calculus on surfaces 101
§3.1. The Riemann integral in n variables 102
§3.2. Surfaces and surface integrals 135
§3.3. Partitions of unity 167
§3.4. Sard's theorem 168
§3.5. Morse functions 169
§3.6. The tangent space to a manifold 171
Chapter 4. Differential forms and the Gauss-Green-Stokes formula 177
§4.1. Differential forms 179
§4.2. Products and exterior derivatives of forms 186


§4.3. The general Stokes formula 190
§4.4. The classical Gauss, Green, and Stokes formulas 196
§4.5. Differential forms and the change of variable formula 207
Chapter 5. Applications of the Gauss-Green-Stokes formula 213
§5.1. Holomorphic functions and harmonic functions 215
§5.2. Differential forms, homotopy, and the Lie derivative 230
§5.3. Differential forms and degree theory 236
Chapter 6. Differential geometry of surfaces 255
§6.1. Geometry of surfaces I: geodesics 260
§6.2. Geometry of surfaces II: curvature 274
§6.3. Geometry of surfaces III: the Gauss-Bonnet theorem 291
§6.4. Smooth matrix groups 306
§6.5. The derivative of the exponential map 327
§6.6. A spectral mapping theorem 332
Chapter 7. Fourier analysis 335
§7.1. Fourier series 339
§7.2. The Fourier transform 356
§7.3. Poisson summation formulas 380
§7.4. Spherical harmonics 383
§7.5. Fourier series on compact matrix groups 428
§7.6. The isoperimetric inequality 436
Appendix A. Complementary material 439
§A.1. Metric spaces, convergence, and compactness 440
§A.2. Inner product spaces 453
§A.3. Eigenvalues and eigenvectors 459
§A.4. Complements on power series 465
§A.5. The Weierstrass theorem and the Stone-Weierstrass theorem 471
§A.6. Further results on harmonic functions 475
Bibliography 481
Index 485

Preface

This text was produced for the second part of a two-part series on advanced calculus, whose aim is to provide a firm logical foundation for analysis, for students who have had 3 semesters of calculus and a course in linear algebra. The first part treats analysis in one variable, and the text [44] was written to cover that material. The text at hand treats analysis in several variables. These two texts can be used as companions, but they are written so that they can be used independently, if desired.

Chapter 1 treats background needed for multivariable analysis. The first section gives a brief treatment of one-variable calculus, including the Riemann integral and the fundamental theorem of calculus. This section distills material developed in more detail in the companion text [44]. We have included it here to facilitate the independent use of this text. Subsequent sections in Chapter 1 present basic linear algebra background of use for the rest of the text. They include material on n-dimensional Euclidean spaces and other vector spaces, on linear transformations on such spaces, and on determinants of such linear transformations.

Chapter 2 develops multi-dimensional differential calculus on domains in n-dimensional Euclidean space Rn. The first section defines the derivative of a differentiable map F : O → Rm, at a point x ∈ O, for O open in Rn, as a linear map from Rn to Rm, and establishes basic properties, such as the chain rule. The next section deals with the Inverse Function Theorem, giving a condition for such a map to have a differentiable inverse, when n = m. The third section treats n × n systems of differential equations, bringing in the concepts of vector fields and flows on an open set O ⊂ Rn. While the emphasis here is on differential calculus, we do make use of integral calculus in one variable, as exposed in Chapter 1.


Chapter 3 treats multi-dimensional integral calculus. We define the Riemann integral for a class of functions on Rn, and establish basic properties, including a change of variable formula. We then study smooth m-dimensional surfaces in Rn, and extend the Riemann integral to a class of functions on such surfaces. Going further, we abstract the notion of surface to that of a manifold, and study a class of manifolds known as Riemannian manifolds. These possess an object known as a "metric tensor." We also define the Riemann integral for a class of functions on such manifolds. The change of variable formula is instrumental in this extension of the integral.

In Chapter 4 we introduce a further class of objects that can be defined on surfaces, differential forms. A k-form can be integrated over a k-dimensional surface, endowed with an extra structure, an "orientation." Again the change of variable formula plays a role in establishing this. Important operations on differential forms include products and the exterior derivative. A key result of Chapter 4 is a general Stokes formula, an important integral identity that can be seen as a multi-dimensional version of the fundamental theorem of calculus. In §4.4 we specialize this general Stokes formula to classical cases, known as theorems of Gauss, Green, and Stokes. A concluding section of Chapter 4 makes use of material on differential forms to give another proof of the change of variable formula for the integral, much different from the proof given in Chapter 3.

Chapter 5 is devoted to several applications of the material on the Gauss-Green-Stokes theorems from Chapter 4. In §5.1 we use Green's Theorem to derive fundamental properties of holomorphic functions of a complex variable. Sprinkled throughout earlier sections are some allusions to functions of complex variables, particularly in some of the exercises in §§2.1–2.2. Readers with no previous exposure to complex variables might wish to return to these exercises after getting through §5.1.
In this section, we also discuss some results on the closely related study of harmonic functions. One result is Liouville's Theorem, stating that a bounded harmonic function on all of Rn must be constant. When specialized to holomorphic functions on C = R2, this yields a proof of the Fundamental Theorem of Algebra.

In §5.2 we define the notion of smoothly homotopic maps and consider the behavior of closed differential forms under pull back by smoothly homotopic maps. This material is then applied in §5.3, which introduces degree theory and derives some interesting consequences. Key results include the Brouwer fixed point theorem, the existence of critical points of smooth vector fields tangent to an even-dimensional sphere, and the Jordan-Brouwer separation theorem (in the smooth case). We also show how degree theory yields another proof of the fundamental theorem of algebra.

Chapter 6 applies results of Chapters 2–5 to the study of the geometry of surfaces (and more generally of Riemannian manifolds). Section 6.1 studies geodesics, which are locally length-minimizing curves. Section 6.2 studies curvature. Several varieties of curvature arise, including Gauss curvature and Riemann curvature, and it is of great interest to understand the relations between them. Section 6.3 ties the curvature study of §6.2 to material on degree theory from §5.3, in a result known as the Gauss-Bonnet theorem. Section 6.4 studies smooth matrix groups, which are smooth surfaces in M(n, F) that are also groups. These carry left and right invariant metric tensors, with important consequences for the application of such groups to other aspects of analysis, including results presented in §7.4.

Chapter 7 is devoted to an introduction to multi-dimensional Fourier analysis. Section 7.1 treats Fourier series on the n-dimensional torus Tn, and §7.2 treats the Fourier transform for functions on Rn. Section 7.3 introduces a topic that ties the first two together, known as Poisson's summation formula. We apply this formula to establish a classical result of Riemann, his functional equation for the Riemann zeta function.

The material in §§7.1–7.3 bears on topics rather different from the geometrical material emphasized in the latter part of Chapter 3 and in Chapters 4–6. In fact, this part of Chapter 7 could be tackled right after one gets through §3.1. On the other hand, the last three sections of Chapter 7 make strong contact with this geometrical material. Section 7.4 treats Fourier analysis on the sphere Sn−1, which involves expanding a function on Sn−1 in terms of eigenfunctions of the Laplace operator ∆S, arising from the Riemannian metric on Sn−1. This study of course includes integrating functions over Sn−1. It also brings in the matrix group SO(n), introduced in Chapter 3, which acts on each eigenspace Vk of ∆S, and its subgroup SO(n − 1), and makes use of integrals over SO(n − 1).
Section 7.4 also makes use of the Gauss-Green-Stokes formula, and applications to harmonic functions, from §§4.4 and 5.1. We believe the reader will gain a good appreciation of the utility of unifying geometrical concepts with those aspects of Fourier analysis developed in the first part of Chapter 7. We complement §7.4 with a brief discussion of Fourier series on compact matrix groups, in §7.5.

Section 7.6 deals with the purely geometric problem of showing that, among smoothly bounded planar domains Ω ⊂ R2, with fixed area, the disks have the smallest perimeter. This is the 2D isoperimetric inequality. Its placement here is due to the fact that its proof is an application of Fourier series.

The text ends with a collection of appendices, some giving further background material, others providing complements to results of the main text.

Appendix A.1 covers some basic notions of metric spaces and compactness, used from time to time throughout the text, such as in the study of the Riemann integral and in the proof of the fundamental existence theorem for ODE. As is the case with §1.1, this appendix distills material developed at a more leisurely pace in [44], again serving to make this text independent of the first one.

Appendices A.2 and A.3 complement results on linear algebra presented in Chapter 1 with some further results. Appendix A.2 treats a general class of inner product spaces, both finite and infinite dimensional. Treatments in the latter case are relevant to results on Fourier analysis in Chapter 7. Appendix A.3 treats eigenvalues and eigenvectors of linear transformations on finite dimensional vector spaces, providing results useful in various places, from §2.1 to §6.6.

Appendix A.4 discusses the remainder term in the power series of a function. Appendix A.5 deals with the Weierstrass theorem, on approximating a continuous function by polynomials, and an extension, known as the Stone-Weierstrass theorem, a useful tool in analysis, with applications in Sections 5.3, 7.1, and 7.4. Appendix A.6 builds on material on harmonic functions presented in Chapters 5 and 7. Results range from a removable singularity theorem to extensions of Liouville's theorem.

We point out some distinctive features of this treatment of Advanced Calculus.

1) Applications of the Gauss-Green-Stokes formulas. These formulas form a high point in any advanced calculus course, but we do not want them to be seen as the culmination of the course. Their significance arises from their many applications.

The first application we treat is to the theory of functions of a complex variable, including the Cauchy integral theorem and basic consequences. This basically constitutes a mini-course in complex analysis. (A much more extensive treatment can be found in [46].) We also derive applications to the study of harmonic functions, in n variables, a study that is closely related to complex analysis when n = 2.

We also apply differential forms and the Stokes formula to results of a topological flavor, involving a set of tools known as degree theory. We start with a result known as the Brouwer fixed point theorem. We give a short proof, as a direct application of the Stokes formula, thus making this theorem a precursor to degree theory, rather than an application.

2) The unity of analysis and geometry. This starts with calculus on surfaces, computing surface areas and surface integrals, given in terms of the metric tensors these surfaces inherit, but it proceeds much further. There is the question of finding geodesics, shortest paths, described by certain differential equations, whose coefficients arise from the metric tensor. Another issue is what makes a curved surface curved. One particular measure is called the Gauss curvature. There are formulas for the integrated Gauss curvature, which in turn make contact with degree theory. Such are examples of connections uniting analysis and geometry, pursued in the text.

Other connections arise in the treatment of Fourier analysis. In addition to Fourier analysis on Euclidean space, the text treats Fourier analysis on spheres. Matrix groups, such as rotation groups SO(n), make an appearance, both as tools to study Fourier analysis on spheres, and as further sources of problems in Fourier analysis, thereby expanding the theater on which to bring to bear techniques of advanced calculus developed here.

We follow this preface with a list of some basic notation, of use through- out the text.

Acknowledgment. During the preparation of this book, my research has been supported by a number of NSF grants, most recently DMS-1500817.

Some basic notation

R is the set of real numbers.

C is the set of complex numbers.

Z is the set of integers.

Z+ is the set of integers ≥ 0.

N is the set of integers ≥ 1 (the “natural numbers”).

Q is the set of rational numbers.

x ∈ R means x is an element of R, i.e., x is a real number.

(a, b) denotes the set of x ∈ R such that a < x < b.

[a, b] denotes the set of x ∈ R such that a ≤ x ≤ b.

{x ∈ R : a ≤ x ≤ b} denotes the set of x in R such that a ≤ x ≤ b.

[a, b) = {x ∈ R : a ≤ x < b} and (a, b] = {x ∈ R : a < x ≤ b}.


z̄ = x − iy if z = x + iy ∈ C, x, y ∈ R.

Ω denotes the closure of the set Ω.

f : A → B denotes that the function f takes points in the set A to points in B. One also says f maps A to B.

x → x0 means the variable x tends to the limit x0.

f(x) = O(x) means f(x)/x is bounded. Similarly g(ε) = O(εk) means g(ε)/εk is bounded.

f(x) = o(x) as x → 0 (resp., x → ∞) means f(x)/x → 0 as x tends to the specified limit.

S = sup_n |a_n| means S is the smallest real number that satisfies S ≥ |a_n| for all n. If there is no such real number, then we take S = +∞.

\limsup_{k→∞} |a_k| = \lim_{k→∞} \Big( \sup_{n≥k} |a_n| \Big).

Chapter 1

Background

This first chapter provides background material on one-variable calculus, the geometry of n-dimensional Euclidean space, and linear algebra.

We begin in §1.1 with a presentation of the elements of calculus in one variable. We first define the Riemann integral for a class of functions on an interval. We then introduce the derivative, and establish the Fundamental Theorem of Calculus, relating differentiation and integration as essentially inverse operations. Further results are dealt with in the exercises, such as the change of variable formula for integrals, and the Taylor formula with remainder, for power series.

Next we introduce the n-dimensional Euclidean spaces Rn, in §1.2. The dot product on Rn gives rise to a norm, hence to a distance function, making Rn a metric space. (More general metric spaces are studied in Appendix A.1.) We define the notion of open and closed subsets of Rn, of convergent sequences, and of compactness, and establish that nonempty closed, bounded subsets of Rn are compact. This material makes use of results on the real line R, dealt with at length in [44], and reviewed in Appendix A.1.

The spaces Rn are special cases of vector spaces, explored in greater generality in §1.3. We also study linear transformations T : V → W between two vector spaces. We define the class of finite-dimensional vector spaces, and show that the dimension of such a space is well defined. If V is a real vector space and dim V = n, then V is isomorphic to Rn. Linear transformations from Rn to Rm are given by m × n matrices.

In §1.4 we define the determinant, det A, of an n × n matrix A, and show that A is invertible if and only if det A ≠ 0. In Chapter 2, such linear transformations arise as derivatives of nonlinear maps, and understanding the behavior of these derivatives is basic to many key results in differential calculus, both in Chapter 2 and in subsequent chapters.

1.1. One variable calculus

In this brief discussion of one variable calculus, we introduce the Riemann integral, and relate it to the derivative. We will define the Riemann integral of a bounded function over an interval I = [a, b] on the real line. For now, we assume f is real valued. To start, we partition I into smaller intervals. A partition P of I is a finite collection of subintervals {Jk : 0 ≤ k ≤ N}, disjoint except for their endpoints, whose union is I. We can order the Jk so that Jk = [xk, xk+1], where

(1.1.1) x0 < x1 < ··· < xN < xN+1, x0 = a, xN+1 = b.

We call the points xk the endpoints of P. We set

(1.1.2) ℓ(J_k) = x_{k+1} − x_k, \quad maxsize(P) = \max_{0≤k≤N} ℓ(J_k).

We then set

(1.1.3) \overline{I}_P(f) = \sum_k \big( \sup_{J_k} f \big)\, ℓ(J_k), \qquad \underline{I}_P(f) = \sum_k \big( \inf_{J_k} f \big)\, ℓ(J_k).

Definitions of sup and inf are given in (A.1.17)–(A.1.18). We call \overline{I}_P(f) and \underline{I}_P(f) respectively the upper sum and lower sum of f, associated to the partition P. See Figure 1.1.1 for an illustration. Note that \underline{I}_P(f) ≤ \overline{I}_P(f). These should approximate the Riemann integral of f, if the partition P is sufficiently "fine." To be more precise, if P and Q are two partitions of I, we say P refines Q, and write P ≻ Q, if P is formed by partitioning each interval in Q. Equivalently, P ≻ Q if and only if all the endpoints of Q are also endpoints of P. It is easy to see that any two partitions have a common refinement; just take the union of their endpoints, to form a new partition. Note also that refining a partition lowers the upper sum of f and raises its lower sum:

(1.1.4) P ≻ Q =⇒ \overline{I}_P(f) ≤ \overline{I}_Q(f), \ \text{and}\ \underline{I}_P(f) ≥ \underline{I}_Q(f).
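As an informal numerical sketch (our own illustration, not part of the text), the sums in (1.1.3) and the refinement inequalities (1.1.4) can be checked for an increasing f, for which the sup and inf over each [x_k, x_{k+1}] are attained at the right and left endpoints. The helper `upper_lower_sums` below is hypothetical, introduced only for this sketch.

```python
# Upper and lower sums of (1.1.3) for an *increasing* f, where the sup and
# inf over [x_k, x_{k+1}] are attained at the right and left endpoints.
def upper_lower_sums(f, pts):
    """Return (upper, lower) sums of an increasing f over the partition
    with endpoints pts = [x_0 < x_1 < ... < x_{N+1}]."""
    upper = sum(f(pts[k + 1]) * (pts[k + 1] - pts[k]) for k in range(len(pts) - 1))
    lower = sum(f(pts[k]) * (pts[k + 1] - pts[k]) for k in range(len(pts) - 1))
    return upper, lower

f = lambda x: x * x                 # increasing on [0, 1]
Q = [0.0, 0.5, 1.0]                 # coarse partition
P = [0.0, 0.25, 0.5, 0.75, 1.0]     # refinement: P succeeds Q

uQ, lQ = upper_lower_sums(f, Q)
uP, lP = upper_lower_sums(f, P)
# (1.1.4): refining lowers the upper sum and raises the lower sum
assert lQ <= lP <= uP <= uQ
```

Here the refinement P of Q tightens both bounds on the eventual integral (which is 1/3), in accordance with (1.1.4).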

Consequently, if P_1 and P_2 are any two partitions and Q is a common refinement, we have

(1.1.5) \underline{I}_{P_1}(f) ≤ \underline{I}_Q(f) ≤ \overline{I}_Q(f) ≤ \overline{I}_{P_2}(f).

Figure 1.1.1. Upper and lower sums

Now, whenever f : I → R is bounded, the following quantities are well defined:

(1.1.6) \overline{I}(f) = \inf_{P ∈ Π(I)} \overline{I}_P(f), \qquad \underline{I}(f) = \sup_{P ∈ Π(I)} \underline{I}_P(f),

where Π(I) is the set of all partitions of I. We call \underline{I}(f) the lower integral of f and \overline{I}(f) its upper integral. Clearly, by (1.1.5), \underline{I}(f) ≤ \overline{I}(f). We then say that f is Riemann integrable provided \underline{I}(f) = \overline{I}(f), and in such a case, we set

(1.1.7) \int_a^b f(x)\,dx = \int_I f(x)\,dx = \underline{I}(f) = \overline{I}(f).

We will denote the set of Riemann integrable functions on I by R(I). We derive some basic properties of the Riemann integral.

Proposition 1.1.1. If f, g ∈ R(I), then f + g ∈ R(I), and

(1.1.8) \int_I (f + g)\,dx = \int_I f\,dx + \int_I g\,dx.

Proof. If J_k is any subinterval of I, then

\sup_{J_k} (f + g) ≤ \sup_{J_k} f + \sup_{J_k} g, \quad \text{and} \quad \inf_{J_k} (f + g) ≥ \inf_{J_k} f + \inf_{J_k} g,

so, for any partition P, we have \overline{I}_P(f + g) ≤ \overline{I}_P(f) + \overline{I}_P(g). Also, using common refinements, we can simultaneously approximate \overline{I}(f) and \overline{I}(g) by \overline{I}_P(f) and \overline{I}_P(g), and ditto for \overline{I}(f + g). Thus the characterization (1.1.6) implies \overline{I}(f + g) ≤ \overline{I}(f) + \overline{I}(g). A parallel argument implies \underline{I}(f + g) ≥ \underline{I}(f) + \underline{I}(g), and the proposition follows. □

Next, there is a fair supply of Riemann integrable functions. Proposition 1.1.2. If f is continuous on I, then f is Riemann integrable.

Proof. Any continuous function on a compact interval is bounded and uniformly continuous (see Propositions A.1.15 and A.1.16). Let ω(δ) be a modulus of continuity for f, so

(1.1.9) |x − y| ≤ δ =⇒ |f(x) − f(y)| ≤ ω(δ), \quad ω(δ) → 0 \ \text{as}\ δ → 0.

Then

(1.1.10) maxsize(P) ≤ δ =⇒ \overline{I}_P(f) − \underline{I}_P(f) ≤ ω(δ) · ℓ(I),

which yields the proposition. □

We denote the set of continuous functions on I by C(I). Thus Proposition 1.1.2 says C(I) ⊂ R(I).

The proof of Proposition 1.1.2 provides a criterion on a partition guaranteeing that \overline{I}_P(f) and \underline{I}_P(f) are close to \int_I f\,dx when f is continuous. We produce an extension, giving a condition under which \overline{I}_P(f) and \overline{I}(f) are close, and \underline{I}_P(f) and \underline{I}(f) are close, given f bounded on I. Given a partition P_0 of I, set

(1.1.11) minsize(P_0) = \min\{ℓ(J_k) : J_k ∈ P_0\}.

Lemma 1.1.3. Let P and Q be two partitions of I. Assume

(1.1.12) maxsize(P) ≤ \frac{1}{k}\, minsize(Q).

Let |f| ≤ M on I. Then

(1.1.13) \overline{I}_P(f) ≤ \overline{I}_Q(f) + \frac{2M}{k} ℓ(I), \qquad \underline{I}_P(f) ≥ \underline{I}_Q(f) − \frac{2M}{k} ℓ(I).

Proof. Let P_1 denote the minimal common refinement of P and Q. Consider on the one hand those intervals in P that are contained in intervals in Q, and on the other hand those intervals in P that are not contained in intervals in Q. Each interval of the first type is also an interval in P_1. Each interval of the second type gets partitioned, to yield two intervals in P_1. Denote by \hat{P}_1 the collection of such divided intervals. By (1.1.12), the lengths of the intervals in \hat{P}_1 sum to ≤ ℓ(I)/k. It follows that

|\overline{I}_P(f) − \overline{I}_{P_1}(f)| ≤ \sum_{J ∈ \hat{P}_1} 2M ℓ(J) ≤ 2M \frac{ℓ(I)}{k},

and similarly |\underline{I}_P(f) − \underline{I}_{P_1}(f)| ≤ 2Mℓ(I)/k. Therefore

\overline{I}_P(f) ≤ \overline{I}_{P_1}(f) + \frac{2M}{k} ℓ(I), \qquad \underline{I}_P(f) ≥ \underline{I}_{P_1}(f) − \frac{2M}{k} ℓ(I).

Since also \overline{I}_{P_1}(f) ≤ \overline{I}_Q(f) and \underline{I}_{P_1}(f) ≥ \underline{I}_Q(f), we obtain (1.1.13). □

The following consequence is sometimes called Darboux’s Theorem.

Theorem 1.1.4. Let Pν be a sequence of partitions of I into ν intervals Jνk, 1 ≤ k ≤ ν, such that

maxsize(Pν) −→ 0. If f : I → R is bounded, then

(1.1.14) \overline{I}_{P_ν}(f) → \overline{I}(f) \quad \text{and} \quad \underline{I}_{P_ν}(f) → \underline{I}(f).

Consequently,

(1.1.15) f ∈ R(I) ⟺ I(f) = \lim_{ν→∞} \sum_{k=1}^{ν} f(ξ_{νk})\, ℓ(J_{νk}),

for arbitrary ξ_{νk} ∈ J_{νk}, in which case the limit is \int_I f\,dx.

Proof. As before, assume |f| ≤ M. Pick ε = 1/k > 0. Let Q be a partition such that

\overline{I}(f) ≤ \overline{I}_Q(f) ≤ \overline{I}(f) + ε, \qquad \underline{I}(f) ≥ \underline{I}_Q(f) ≥ \underline{I}(f) − ε.

Now pick N such that

ν ≥ N =⇒ maxsize(P_ν) ≤ ε · minsize(Q).

Lemma 1.1.3 yields, for ν ≥ N,

\overline{I}_{P_ν}(f) ≤ \overline{I}_Q(f) + 2Mℓ(I)ε, \qquad \underline{I}_{P_ν}(f) ≥ \underline{I}_Q(f) − 2Mℓ(I)ε.

Hence, for ν ≥ N,

\overline{I}(f) ≤ \overline{I}_{P_ν}(f) ≤ \overline{I}(f) + [2Mℓ(I) + 1]ε, \qquad \underline{I}(f) ≥ \underline{I}_{P_ν}(f) ≥ \underline{I}(f) − [2Mℓ(I) + 1]ε.

This proves (1.1.14). □

Remark. The sums on the right side of (1.1.15) are called Riemann sums, approximating \int_I f\,dx (when f is Riemann integrable).
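To illustrate Darboux's theorem numerically (a sketch with our own choice of f, not from the text): for f(x) = x² on [0, 1], partitions into ν equal subintervals, and two choices of sample points ξ_νk, the Riemann sums of (1.1.15) approach \int_0^1 x²\,dx = 1/3.

```python
# Riemann sums of (1.1.15) for f(x) = x^2 on [0, 1], over nu equal
# subintervals, with sample points chosen by `pick` from each J_{nu k}.
def riemann_sum(f, a, b, nu, pick):
    h = (b - a) / nu        # all subintervals have equal length
    return sum(f(pick(a + k * h, a + (k + 1) * h)) * h for k in range(nu))

f = lambda x: x * x
left = riemann_sum(f, 0.0, 1.0, 10_000, lambda l, r: l)            # left endpoints
mid = riemann_sum(f, 0.0, 1.0, 10_000, lambda l, r: 0.5 * (l + r)) # midpoints

# Both tend to 1/3 as nu -> infinity; the midpoint choice converges faster.
assert abs(left - 1 / 3) < 1e-3
assert abs(mid - 1 / 3) < 1e-7
```

Since f is increasing, the left-endpoint sum is in fact the lower sum \underline{I}_{P_ν}(f), so its convergence is an instance of (1.1.14).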

Remark. A second proof of Proposition 1.1.1 can readily be deduced from Theorem 1.1.4.

One should be warned that, once such a specific choice of P_ν and ξ_{νk} has been made, the limit on the right side of (1.1.15) might exist for a bounded function f that is not Riemann integrable. This and other phenomena are illustrated by the following example of a function which is not Riemann integrable. For x ∈ I, set

(1.1.16) ϑ(x) = 1 \ \text{if}\ x ∈ Q, \qquad ϑ(x) = 0 \ \text{if}\ x ∉ Q,

where Q is the set of rational numbers. Now every interval J ⊂ I of positive length contains points in Q and points not in Q, so for any partition P of I we have \overline{I}_P(ϑ) = ℓ(I) and \underline{I}_P(ϑ) = 0, hence

(1.1.17) \overline{I}(ϑ) = ℓ(I), \qquad \underline{I}(ϑ) = 0.
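Concretely (an illustrative sketch of ours, not from the text): the Riemann sums in (1.1.15) for ϑ are determined entirely by how many sample points ξ_νk are chosen rational. Since ϑ cannot be evaluated on floating-point inputs, the sketch records that choice as a boolean flag per subinterval and uses exact rational arithmetic.

```python
from fractions import Fraction

# Riemann sums of (1.1.15) for the function theta of (1.1.16) on I = [0, 1],
# over nu equal subintervals.  flags[k] records whether the sample point
# xi_{nu k} was chosen rational (theta = 1) or irrational (theta = 0).
def dirichlet_sum(flags):
    h = Fraction(1, len(flags))     # each subinterval has length 1/nu
    return sum((1 if rational else 0) * h for rational in flags)

nu = 1000
assert dirichlet_sum([True] * nu) == 1                  # all xi rational  -> l(I)
assert dirichlet_sum([False] * nu) == 0                 # all xi irrational -> 0
assert dirichlet_sum([k % 2 == 0 for k in range(nu)]) == Fraction(1, 2)
```

The three runs realize the limits ℓ(I), 0, and ℓ(I)/2 described next, so the "limit" in (1.1.15) depends on the sample points and ϑ is not Riemann integrable.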

Note that, if P_ν is a partition of I into ν equal subintervals, then we could pick each ξ_{νk} to be rational, in which case the limit on the right side of (1.1.15) would be ℓ(I), or we could pick each ξ_{νk} to be irrational, in which case this limit would be zero. Alternatively, we could pick half of them to be rational and half to be irrational, and the limit would be ℓ(I)/2.

Associated to the Riemann integral is a notion of size of a set S, called content. If S is a subset of I, define the "characteristic function"

(1.1.18) χ_S(x) = 1 \ \text{if}\ x ∈ S, \quad 0 \ \text{if}\ x ∉ S.

We define "upper content" cont^+ and "lower content" cont^− by

(1.1.19) cont^+(S) = \overline{I}(χ_S), \qquad cont^−(S) = \underline{I}(χ_S).

We say S "has content," or "is contented" if these quantities are equal, which happens if and only if χ_S ∈ R(I), in which case the common value of cont^+(S) and cont^−(S) is

(1.1.20) m(S) = \int_I χ_S(x)\,dx.

It is easy to see that

(1.1.21) cont^+(S) = \inf \Big\{ \sum_{k=1}^{N} ℓ(J_k) : S ⊂ J_1 ∪ ··· ∪ J_N \Big\},

where J_k are intervals. Here, we require S to be contained in the union of a finite collection of intervals.

See the appendix at the end of this section for a generalization of Proposition 1.1.2, giving a sufficient condition for a bounded function to be Riemann integrable on I, in terms of the upper content of its set of discontinuities.

There is a more sophisticated notion of the size of a subset of I, called Lebesgue measure. The key to the construction of Lebesgue measure is to cover a set S by a countable (either finite or infinite) set of intervals. The outer measure of S ⊂ I is defined by

(1.1.22) m^*(S) = \inf \Big\{ \sum_{k≥1} ℓ(J_k) : S ⊂ \bigcup_{k≥1} J_k \Big\}.

Here {J_k} is a finite or countably infinite collection of intervals. Clearly

(1.1.23) m^*(S) ≤ cont^+(S).
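A sketch (our own illustration, not from the text) of why the countable covers in (1.1.22) can do much better than the finite covers in (1.1.21): covering the k-th point of a countable set by an interval of length ε·2^{−(k+1)} gives total length below ε, however many points are covered. The helper `rationals_in_unit_interval` is hypothetical, introduced only for this sketch.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_den):
    """The rationals m/n in [0, 1] with denominator n <= max_den."""
    return sorted({Fraction(m, n)
                   for n in range(1, max_den + 1) for m in range(n + 1)})

eps = Fraction(1, 100)
rats = rationals_in_unit_interval(20)
# Cover the k-th rational by an interval of length eps * 2^-(k+1); the total,
# a finite geometric sum, is eps * (1 - 2^-K) < eps, exact in Fractions.
total = sum(eps * Fraction(1, 2 ** (k + 1)) for k in range(len(rats)))
assert total == eps * (1 - Fraction(1, 2 ** len(rats)))
assert total < eps
```

Letting the enumeration run over all of I ∩ Q and ε → 0 gives m^*(I ∩ Q) = 0, while no finite cover does better than total length ℓ(I), as noted next.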

Note that, if S = I ∩ Q, then χ_S = ϑ, defined by (1.1.16). In this case it is easy to see that cont^+(S) = ℓ(I), but m^*(S) = 0. Zero is the "right" measure of this set. More material on the development of measure theory can be found in a number of books, including [15] and [42].

It is useful to note that \int_I f\,dx is additive in I, in the following sense.

Proposition 1.1.5. If a < b < c, f : [a, c] → R, f_1 = f|_{[a,b]}, f_2 = f|_{[b,c]}, then

(1.1.24) f ∈ R([a, c]) ⟺ f_1 ∈ R([a, b]) \ \text{and}\ f_2 ∈ R([b, c]),

and, if this holds,

(1.1.25) \int_a^c f\,dx = \int_a^b f_1\,dx + \int_b^c f_2\,dx.

Proof. Since any partition of [a, c] has a refinement for which b is an endpoint, we may as well consider a partition P = P_1 ∪ P_2, where P_1 is a partition of [a, b] and P_2 is a partition of [b, c]. Then

(1.1.26) \overline{I}_P(f) = \overline{I}_{P_1}(f_1) + \overline{I}_{P_2}(f_2), \qquad \underline{I}_P(f) = \underline{I}_{P_1}(f_1) + \underline{I}_{P_2}(f_2),

so

(1.1.27) \overline{I}_P(f) − \underline{I}_P(f) = \big\{ \overline{I}_{P_1}(f_1) − \underline{I}_{P_1}(f_1) \big\} + \big\{ \overline{I}_{P_2}(f_2) − \underline{I}_{P_2}(f_2) \big\}.

Since both terms in braces in (1.1.27) are ≥ 0, we have equivalence in (1.1.24). Then (1.1.25) follows from (1.1.26) upon taking sufficiently fine partitions. 

Let I = [a, b]. If f ∈ R(I), then f ∈ R([a, x]) for all x ∈ [a, b], and we can consider the function

(1.1.28) g(x) = \int_a^x f(t)\,dt.

If a ≤ x_0 ≤ x_1 ≤ b, then

(1.1.29) g(x_1) − g(x_0) = \int_{x_0}^{x_1} f(t)\,dt,

so, if |f| ≤ M,

(1.1.30) |g(x_1) − g(x_0)| ≤ M|x_1 − x_0|.

In other words, if f ∈ R(I), then g is Lipschitz continuous on I.

A function g : (a, b) → R is said to be differentiable at x ∈ (a, b) provided there exists the limit

(1.1.31) \lim_{h→0} \frac{1}{h} \big[ g(x + h) − g(x) \big] = g′(x).

When such a limit exists, g′(x), also denoted dg/dx, is called the derivative of g at x. Clearly g is continuous wherever it is differentiable. The next result is part of the Fundamental Theorem of Calculus.

Theorem 1.1.6. If f ∈ C([a, b]), then the function g, defined by (1.1.28), is differentiable at each point x ∈ (a, b), and

(1.1.32) g′(x) = f(x).

Proof. Parallel to (1.1.29), we have, for h > 0,

(1.1.33) \frac{1}{h} \big[ g(x + h) − g(x) \big] = \frac{1}{h} \int_x^{x+h} f(t)\,dt.

If f is continuous at x, then, for any ε > 0, there exists δ > 0 such that |f(t) − f(x)| ≤ ε whenever |t − x| ≤ δ. Thus the right side of (1.1.33) is within ε of f(x) whenever h ∈ (0, δ]. Thus the desired limit exists as h ↘ 0. A similar argument treats h ↗ 0. □
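A numerical sketch of Theorem 1.1.6 (our own illustration, with f = cos and the integral in (1.1.28) approximated by a fine midpoint Riemann sum): the difference quotient of g approximates f.

```python
import math

# g(x) = integral_0^x cos(t) dt, as in (1.1.28), approximated by a fine
# midpoint Riemann sum; the difference quotient (g(x+h) - g(x))/h should
# then approximate f(x) = cos(x), as in (1.1.32).
def integral(f, a, b, n=20_000):
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) * h for k in range(n))

def g(x):
    return integral(math.cos, 0.0, x)

x, h = 1.0, 1e-4
diff_quot = (g(x + h) - g(x)) / h
assert abs(diff_quot - math.cos(x)) < 1e-3
```

The residual error comes from two sources the proof separates: the oscillation of f over [x, x + h] (of size about |sin x|·h/2 here) and the accuracy of the integral approximation.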

The next result is the rest of the Fundamental Theorem of Calculus.

Theorem 1.1.7. If G is differentiable and G′(x) is continuous on [a, b], then

(1.1.34) \int_a^b G′(t)\,dt = G(b) − G(a).

Figure 1.1.2. Illustration of the Mean Value Theorem

Proof. Consider the function

(1.1.35) g(x) = \int_a^x G′(t)\,dt.

We have g ∈ C([a, b]), g(a) = 0, and, by Theorem 1.1.6,

g′(x) = G′(x), \quad ∀ x ∈ (a, b).

Thus f(x) = g(x) − G(x) is continuous on [a, b], and

(1.1.36) f′(x) = 0, \quad ∀ x ∈ (a, b).

We claim that (1.1.36) implies f is constant on [a, b]. Granted this, since f(a) = g(a) − G(a) = −G(a), we have f(x) = −G(a) for all x ∈ [a, b], so the integral (1.1.35) is equal to G(x) − G(a) for all x ∈ [a, b]. Taking x = b yields (1.1.34). □

The fact that (1.1.36) implies f is constant on [a, b] is a consequence of the following result, known as the Mean Value Theorem. This is illustrated in Figure 1.1.2.

Theorem 1.1.8. Let f : [a, β] → R be continuous, and assume f is differentiable on (a, β). Then ∃ ξ ∈ (a, β) such that

(1.1.37) f′(ξ) = \frac{f(β) − f(a)}{β − a}.

Proof. Set g(x) = f(x) − κ(x − a), where κ is the right side of (1.1.37). Note that g′(ξ) = f′(ξ) − κ, so it suffices to show that g′(ξ) = 0 for some ξ ∈ (a, β). Note also that g(a) = g(β). Since [a, β] is compact, g must assume a maximum and a minimum on [a, β]. Since g(a) = g(β), one of these must be assumed at an interior point, at which g′ vanishes. □

Now we show that (1.1.36) implies f is constant on [a, b]. If not, there exists β ∈ (a, b] such that f(β) ≠ f(a). Applying Theorem 1.1.8 to f on [a, β] then yields ξ ∈ (a, β) with f′(ξ) = (f(β) − f(a))/(β − a) ≠ 0, contradicting (1.1.36). This completes the proof of Theorem 1.1.7.
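Theorem 1.1.7 can be checked numerically (an illustrative sketch with our own choice of G, not from the text): for G(x) = x³ on [0, 1], fine Riemann sums of G′ approach G(1) − G(0) = 1, realizing (1.1.34).

```python
# Numerical sketch of (1.1.34) for G(x) = x^3 on [0, 1].  G' is continuous,
# so Theorem 1.1.7 applies, and fine Riemann sums of G' (here with midpoint
# sample points; any choice works in the limit) approach G(b) - G(a).
G = lambda x: x ** 3
dG = lambda x: 3 * x ** 2
a, b, n = 0.0, 1.0, 100_000
h = (b - a) / n
riemann = sum(dG(a + (k + 0.5) * h) * h for k in range(n))
assert abs(riemann - (G(b) - G(a))) < 1e-8
```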

We now extend Theorems 1.1.6–1.1.7 to the setting of Riemann integrable functions.

Proposition 1.1.9. Let f ∈ R([a, b]), and define g by (1.1.28). If x ∈ [a, b] and f is continuous at x, then g is differentiable at x, and g′(x) = f(x).

The proof is identical to that of Theorem 1.1.6. Proposition 1.1.10. Assume G is differentiable on [a, b] and G′ ∈ R([a, b]). Then (1.1.34) holds.

Proof. We have

G(b) − G(a) = \sum_{k=0}^{n−1} \Big[ G\Big(a + \frac{k+1}{n}(b − a)\Big) − G\Big(a + \frac{k}{n}(b − a)\Big) \Big] = \frac{b − a}{n} \sum_{k=0}^{n−1} G′(ξ_{kn}),

for some ξ_{kn} satisfying

a + \frac{k}{n}(b − a) < ξ_{kn} < a + \frac{k+1}{n}(b − a),

as a consequence of the Mean Value Theorem. Given G′ ∈ R([a, b]), Darboux's theorem (Theorem 1.1.4) implies that as n → ∞ one gets G(b) − G(a) = \int_a^b G′(t)\,dt. □

Note that the beautiful symmetry in Theorems 1.1.6–1.1.7 is not preserved in Propositions 1.1.9–1.1.10. The hypothesis of Proposition 1.1.10 requires G to be differentiable at each x ∈ [a, b], but the conclusion of Proposition 1.1.9 does not yield differentiability at all points. For this reason, we regard Propositions 1.1.9–1.1.10 as less "fundamental" than Theorems 1.1.6–1.1.7. There are more satisfactory extensions of the fundamental theorem of calculus, involving the Lebesgue integral, and a more subtle notion of the "derivative" of a non-smooth function. For this, we can point the reader to Chapters 10–11 of the text [42], Measure Theory and Integration.

So far, we have dealt with integration of real valued functions. If f : I → C, we set f = f_1 + if_2 with f_j : I → R, and say f ∈ R(I) if and only if f_1 and f_2 are in R(I). Then

\int_I f\,dx = \int_I f_1\,dx + i \int_I f_2\,dx.

There are straightforward extensions of Propositions 1.1.5–1.1.10 to complex valued functions. Similar comments apply to functions f : I → Rn.

If a function G is differentiable on (a, b), and G′ is continuous on (a, b), we say G is a C^1 function, and write G ∈ C^1((a, b)). Inductively, we say G ∈ C^k((a, b)) provided G′ ∈ C^{k−1}((a, b)).

An easy consequence of the definition (1.1.31) of the derivative is that, for any real constants a, b, and c,

f(x) = ax^2 + bx + c =⇒ \frac{df}{dx} = 2ax + b.

Now, it is a simple enough step to replace a, b, c by y, z, w, in these formulas. Having done that, we can regard y, z, and w as variables, along with x:

F(x, y, z, w) = yx^2 + zx + w.

We can then hold y, z, and w fixed (e.g., set y = a, z = b, w = c), and then differentiate with respect to x; we get

\frac{∂F}{∂x} = 2yx + z,

the partial derivative of F with respect to x. Generally, if F is a function of n variables, x_1, . . . , x_n, we set

(1.1.38) \frac{∂F}{∂x_j}(x_1, . . . , x_n) = \lim_{h→0} \frac{1}{h} \big[ F(x_1, . . . , x_{j−1}, x_j + h, x_{j+1}, . . . , x_n) − F(x_1, . . . , x_{j−1}, x_j, x_{j+1}, . . . , x_n) \big],

whenever the limit exists. Section 2.1 carries on with a further investigation of the derivative of a function of several variables.
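The limit (1.1.38) can be sketched numerically (our own illustration) for F(x, y, z, w) = yx² + zx + w, whose partial derivative in x is 2yx + z; the difference quotient below uses a small finite h in place of the limit.

```python
# Numerical sketch of (1.1.38), j = 1, for F(x, y, z, w) = y*x^2 + z*x + w;
# holding y, z, w fixed and perturbing only x gives dF/dx = 2*y*x + z.
def F(x, y, z, w):
    return y * x ** 2 + z * x + w

def partial_x(x, y, z, w, h=1e-6):
    # forward difference quotient approximating the limit in (1.1.38)
    return (F(x + h, y, z, w) - F(x, y, z, w)) / h

x, y, z, w = 1.5, 2.0, -3.0, 4.0
assert abs(partial_x(x, y, z, w) - (2 * y * x + z)) < 1e-4
```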

Complementary results on Riemann integrability

Here we provide a condition, more general than Proposition 1.1.2, which guarantees Riemann integrability.

Proposition 1.1.11. Let f : I → R be a bounded function, with I = [a, b]. Suppose that the set S of points of discontinuity of f has the property

(1.1.39) cont^+(S) = 0.

Then f ∈ R(I).

Proof. Say |f(x)| ≤ M. Take ε > 0. As in (1.1.21), take intervals J_1, . . . , J_N such that S ⊂ J_1 ∪ ··· ∪ J_N and \sum_{k=1}^{N} ℓ(J_k) < ε. In fact, fatten each J_k such that S is contained in the interior of this collection of intervals. Consider a partition P_0 of I, whose intervals include J_1, . . . , J_N, amongst others, which we label I_1, . . . , I_K. Now f is continuous on each interval I_ν, so, subdividing each I_ν as necessary, hence refining P_0 to a partition P_1, we arrange that sup f − inf f < ε on each such subdivided interval. Denote these subdivided intervals I′_1, . . . , I′_L. It readily follows that

0 ≤ \overline{I}_{P_1}(f) − \underline{I}_{P_1}(f) < \sum_{k=1}^{N} 2Mℓ(J_k) + \sum_{k=1}^{L} εℓ(I′_k) < 2εM + εℓ(I).

Since ε can be taken arbitrarily small, this establishes that f ∈ R(I). □

Remark. An even better result is that such f is Riemann integrable if and only if (1.1.40) m∗(S) = 0, where m∗(S) is defined by (1.1.22). The implication m∗(S) = 0 ⇒ f ∈ R(I) (in the n-dimensional setting) is established in Proposition 3.1.31 of this text. For the 1-dimensional case, see also Proposition 4.2.12 of [44]. For the reverse implication f ∈ R(I) ⇒ m∗(S) = 0, one can see standard books on measure theory, such as [15] and [42].

We give an example of a function to which Proposition 1.1.11 applies, and then an example for which Proposition 1.1.11 fails to apply, though the function is Riemann integrable.

Example 1. Let I = [0, 1]. Define f : I → R by

f(0) = 0, f(x) = (−1)^j for x ∈ (2^{−(j+1)}, 2^{−j}], j ≥ 0.

Then |f| ≤ 1 and the set of points of discontinuity of f is

S = {0} ∪ {2^{−j} : j ≥ 1}.

It is easy to see that cont^+ S = 0. Hence f ∈ R(I).

See Exercises 19–20 below for a more elaborate example to which Proposition 1.1.11 applies.

Example 2. Again I = [0, 1]. Define f : I → R by

f(x) = 0 if x ∉ Q,
       1/n if x = m/n, in lowest terms.

Then |f| ≤ 1 and the set of points of discontinuity of f is S = I ∩ Q. As we have seen below (1.1.23), cont^+ S = 1, so Proposition 1.1.11 does not apply. Nevertheless, it is fairly easy to see directly that Ī(f) = I(f) = 0, so f ∈ R(I). In fact, given ε > 0, f ≥ ε only on a finite set, hence Ī(f) ≤ ε, ∀ ε > 0. As indicated below (1.1.23), (1.1.40) does apply to this function.

By contrast, the function ϑ in (1.1.16) is discontinuous at each point of I.

We mention an alternative characterization of Ī(f) and I(f), which can be useful. Given I = [a, b], we say g : I → R is piecewise constant on I (and write g ∈ PK(I)) provided there exists a partition P = {J_k} of I such that g is constant on the interior of each interval J_k. Clearly PK(I) ⊂ R(I). It is easy to see that, if f : I → R is bounded,

(1.1.41) Ī(f) = inf { ∫_I f_1 dx : f_1 ∈ PK(I), f_1 ≥ f },
         I(f) = sup { ∫_I f_0 dx : f_0 ∈ PK(I), f_0 ≤ f }.

Hence, given f : I → R bounded,

(1.1.42) f ∈ R(I) ⇔ for each ε > 0, ∃ f_0, f_1 ∈ PK(I) such that
         f_0 ≤ f ≤ f_1 and ∫_I (f_1 − f_0) dx < ε.

This can be used to prove (1.1.43) f, g ∈ R(I) =⇒ fg ∈ R(I), via the fact that

(1.1.44) fj, gj ∈ PK(I) =⇒ fjgj ∈ PK(I). In fact, we have the following, which can be used to prove (1.1.43). Proposition 1.1.12. Let f ∈ R(I), and assume |f| ≤ M. Let φ :[−M,M] → R be continuous. Then φ ◦ f ∈ R(I).

Proof. We proceed in steps.

Step 1. We can obtain φ as a uniform limit on [−M,M] of a sequence φν of continuous, piecewise linear functions. Then φν ◦ f → φ ◦ f uniformly on I. A uniform limit g of functions gν ∈ R(I) is in R(I) (see Exercise 12). So it suffices to prove Proposition 1.1.12 when φ is continuous and piecewise linear.

Step 2. Given φ :[−M,M] → R continuous and piecewise linear, it is an exercise to write φ = φ1−φ2, with φj :[−M,M] → R monotone, continuous, and piecewise linear. Now φ1 ◦ f, φ2 ◦ f ∈ R(I) ⇒ φ ◦ f ∈ R(I).

Step 3. We now demonstrate Proposition 1.1.12 when φ :[−M,M] → R is monotone and Lipschitz. By Step 2, this will suffice. So we assume

−M ≤ x1 < x2 ≤ M =⇒ φ(x1) ≤ φ(x2) and φ(x2) − φ(x1) ≤ L(x2 − x1), for some L < ∞. Given ε > 0, pick f0, f1 ∈ PK(I), as in (1.1.42). Then

φ ◦ f_0, φ ◦ f_1 ∈ PK(I), φ ◦ f_0 ≤ φ ◦ f ≤ φ ◦ f_1, and

∫_I (φ ◦ f_1 − φ ◦ f_0) dx ≤ L ∫_I (f_1 − f_0) dx ≤ Lε.

This proves φ ◦ f ∈ R(I). □

Exercises

1. Let c > 0 and let f : [ac, bc] → R be Riemann integrable. Working directly with the definition of integral, show that

(1.1.45) ∫_a^b f(cx) dx = (1/c) ∫_{ac}^{bc} f(x) dx.

More generally, show that

(1.1.46) ∫_{a−d/c}^{b−d/c} f(cx + d) dx = (1/c) ∫_{ac}^{bc} f(x) dx.
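Both change-of-scale formulas can be checked with Riemann sums. A small sketch (Python; the midpoint-sum helper, the test function f(x) = x^2, and the constants are illustrative choices, not from the text):

```python
def midpoint_sum(f, a, b, n=2000):
    # midpoint Riemann sum for the integral of f over [a, b]
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: x**2
a, b, c, d = 0.0, 1.0, 3.0, 0.5
# (1.1.46): integral of f(cx + d) over [a - d/c, b - d/c]
# equals (1/c) times the integral of f over [ac, bc]
lhs = midpoint_sum(lambda x: f(c * x + d), a - d / c, b - d / c)
rhs = midpoint_sum(f, a * c, b * c) / c
print(abs(lhs - rhs))  # small
```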

2. Let f : I × S → R be continuous, where I = [a, b] and S ⊂ R^n. Take φ(y) = ∫_I f(x, y) dx. Show that φ is continuous on S.
Hint. If f_j : I → R are continuous and |f_1(x) − f_2(x)| ≤ δ on I, then

(1.1.47) |∫_I f_1 dx − ∫_I f_2 dx| ≤ ℓ(I)δ.

Hint. Suppose y_j ∈ S, y_j → y ∈ S. Let S̃ = {y_j} ∪ {y}. This is compact. Thus f : I × S̃ → R is uniformly continuous. Hence

|f(x, y_j) − f(x, y)| ≤ ω(|y_j − y|), ∀ x ∈ I,

where ω(δ) → 0 as δ → 0.

3. With f as in Exercise 2, suppose g_j : S → R are continuous and a ≤ g_0(y) < g_1(y) ≤ b. Take φ(y) = ∫_{g_0(y)}^{g_1(y)} f(x, y) dx. Show that φ is continuous on S.
Hint. Make a change of variable, linear in x, to reduce this to Exercise 2.

4. Suppose f :(a, b) → (c, d) and g :(c, d) → R are differentiable. Show that h(x) = g(f(x)) is differentiable and h′(x) = g′(f(x))f ′(x). This is the chain rule. Hint. Peek at the proof of the chain rule in §1.

5. If f_1 and f_2 are differentiable on (a, b), show that f_1(x)f_2(x) is differentiable and

(d/dx)(f_1(x)f_2(x)) = f_1′(x)f_2(x) + f_1(x)f_2′(x).

If f_2(x) ≠ 0, ∀ x ∈ (a, b), show that f_1(x)/f_2(x) is differentiable, and

(d/dx)(f_1(x)/f_2(x)) = (f_1′(x)f_2(x) − f_1(x)f_2′(x)) / f_2(x)^2.

6. Let φ : [a, b] → [A, B] be C^1 on a neighborhood J of [a, b], with φ′(x) > 0 for all x ∈ [a, b]. Assume φ(a) = A, φ(b) = B. Show that the identity

(1.1.48) ∫_A^B f(y) dy = ∫_a^b f(φ(t)) φ′(t) dt,

for any f ∈ C(I), I = [A, B], follows from the chain rule and the Fundamental Theorem of Calculus. This is the change of variable formula for the one-dimensional integral.
Hint. Replace b by x, B by φ(x), and differentiate.
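The identity (1.1.48) can be verified numerically for a particular f and φ. A sketch (Python; f = sin and φ(t) = t^2 on [1, 2] are illustrative choices, not from the text):

```python
import math

def midpoint_sum(f, a, b, n=4000):
    # midpoint Riemann sum for the integral of f over [a, b]
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# phi(t) = t^2 maps [1, 2] onto [1, 4], with phi' > 0 there
phi = lambda t: t * t
dphi = lambda t: 2 * t
f = math.sin

lhs = midpoint_sum(f, 1.0, 4.0)                              # left side of (1.1.48)
rhs = midpoint_sum(lambda t: f(phi(t)) * dphi(t), 1.0, 2.0)  # right side
print(abs(lhs - rhs))  # small
```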

7. Show that (1.1.48) holds for each f ∈ PK(I). Using (1.1.41)–(1.1.42), show that f ∈ R(I) ⇒ f ◦ φ ∈ R([a, b]) and (1.1.48) holds. (This result contains that of Exercise 1.)

8. Show that, if f and g are C^1 on a neighborhood of [a, b], then

(1.1.49) ∫_a^b f(s)g′(s) ds = − ∫_a^b f′(s)g(s) ds + [f(b)g(b) − f(a)g(a)].

This transformation of integrals is called "integration by parts."
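Formula (1.1.49) can likewise be checked numerically. A sketch (Python; f(s) = s and g = sin on [0, 2] are illustrative choices, not from the text):

```python
import math

def midpoint_sum(f, a, b, n=4000):
    # midpoint Riemann sum for the integral of f over [a, b]
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

a, b = 0.0, 2.0
f, df = (lambda s: s), (lambda s: 1.0)   # f(s) = s, f'(s) = 1
g, dg = math.sin, math.cos               # g(s) = sin s, g'(s) = cos s

lhs = midpoint_sum(lambda s: f(s) * dg(s), a, b)
rhs = -midpoint_sum(lambda s: df(s) * g(s), a, b) + f(b) * g(b) - f(a) * g(a)
print(abs(lhs - rhs))  # small
```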

9. Let f : (−a, a) → R be a C^{j+1} function. Show that, for x ∈ (−a, a),

(1.1.50) f(x) = f(0) + f′(0)x + (f′′(0)/2) x^2 + · · · + (f^{(j)}(0)/j!) x^j + R_j(x),

where

(1.1.51) R_j(x) = ∫_0^x ((x − s)^j / j!) f^{(j+1)}(s) ds.

This is Taylor's formula with remainder.
Hint. Use induction. If (1.1.50)–(1.1.51) holds for 0 ≤ j ≤ k, show that it holds for j = k + 1, by showing that

(1.1.52) ∫_0^x ((x − s)^k / k!) f^{(k+1)}(s) ds = (f^{(k+1)}(0)/(k + 1)!) x^{k+1} + ∫_0^x ((x − s)^{k+1} / (k + 1)!) f^{(k+2)}(s) ds.

To establish this, use the integration by parts formula (1.1.49), with f(s) replaced by f^{(k+1)}(s), and with appropriate g(s). See §2.1 for another approach to the remainder formula. Note that another presentation of (1.1.51) is

(1.1.53) R_j(x) = (x^{j+1}/(j + 1)!) ∫_0^1 f^{(j+1)}((1 − t^{1/(j+1)}) x) dt.
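One can watch (1.1.50)–(1.1.51) reproduce f exactly. A sketch (Python; f = exp, j = 3, and x = 0.7 are illustrative choices, not from the text), evaluating the degree-j Taylor polynomial plus the remainder integral (1.1.51) by a midpoint sum:

```python
import math

def midpoint_sum(f, a, b, n=4000):
    # midpoint Riemann sum for the integral of f over [a, b]
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# f = exp, whose derivatives are all exp
j, x = 3, 0.7
taylor = sum(x**i / math.factorial(i) for i in range(j + 1))
# remainder integral (1.1.51): R_j(x) = int_0^x (x - s)^j / j! * f^(j+1)(s) ds
Rj = midpoint_sum(lambda s: (x - s)**j / math.factorial(j) * math.exp(s), 0.0, x)
print(abs(math.exp(x) - (taylor + Rj)))  # small
```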

10. Assume f : (−a, a) → R is a C^j function. Show that, for x ∈ (−a, a), (1.1.50) holds, with

(1.1.54) R_j(x) = (1/(j − 1)!) ∫_0^x (x − s)^{j−1} [f^{(j)}(s) − f^{(j)}(0)] ds.

Hint. Apply (1.1.51) with j replaced by j − 1. Add and subtract f (j)(0) to the factor f (j)(s) in the resulting integrand.

11. Given I = [a, b], show that

(1.1.55) f, g ∈ R(I) =⇒ fg ∈ R(I),

as advertised in (1.1.43).

12. Assume f_k ∈ R(I) and f_k → f uniformly on I. Prove that f ∈ R(I) and

(1.1.56) ∫_I f_k dx −→ ∫_I f dx.

13. Given I = [a, b],Iε = [a + ε, b − ε], assume fk ∈ R(I), |fk| ≤ M on I for all k, and

(1.1.57) fk −→ f uniformly on Iε, for all ε ∈ (0, (b − a)/2). Prove that f ∈ R(I) and (1.1.56) holds.

14. Use the fundamental theorem of calculus to compute

(1.1.58) ∫_a^b x^r dx, r ∈ Q \ {−1},

where 0 ≤ a < b < ∞ if r ≥ 0 and 0 < a < b < ∞ if r < 0.

15. Use the change of variable result of Exercise 6 to compute

∫_0^1 x √(1 + x^2) dx.
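For this exercise, the change of variable y = 1 + x^2 gives the value (2√2 − 1)/3; a numerical check (Python sketch; the midpoint-sum helper is an illustrative choice, not from the text):

```python
import math

def midpoint_sum(f, a, b, n=4000):
    # midpoint Riemann sum for the integral of f over [a, b]
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

val = midpoint_sum(lambda x: x * math.sqrt(1 + x * x), 0.0, 1.0)
exact = (2 * math.sqrt(2) - 1) / 3   # via the substitution y = 1 + x^2
print(abs(val - exact))  # small
```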

16. We say f ∈ R(R) provided f|_{[k,k+1]} ∈ R([k, k + 1]) for each k ∈ Z, and

(1.1.59) ∑_{k=−∞}^∞ ∫_k^{k+1} |f(x)| dx < ∞.

If f ∈ R(R), we set

(1.1.60) ∫_{−∞}^∞ f(x) dx = lim_{k→∞} ∫_{−k}^k f(x) dx.

Formulate and demonstrate basic properties of the integral over R of elements of R(R).

17. This exercise discusses the integral test for absolute convergence of an infinite series, which goes as follows. Let f be a positive, monotonically decreasing, continuous function on [0, ∞), and suppose |a_k| = f(k). Then

∑_{k=0}^∞ |a_k| < ∞ ⇐⇒ ∫_0^∞ f(x) dx < ∞.

Prove this.
Hint. Use

∑_{k=1}^N |a_k| ≤ ∫_0^N f(x) dx ≤ ∑_{k=0}^{N−1} |a_k|.

18. Use the integral test to show that, if p > 0,

(1.1.61) ∑_{k=1}^∞ 1/k^p < ∞ ⇐⇒ p > 1.

Hint. Use Exercise 14 to evaluate I_N(p) = ∫_1^N x^{−p} dx, for p ≠ 1, and let N → ∞. See if you can show ∫_1^∞ x^{−1} dx = ∞ without knowing about log N.
Subhint. Show that ∫_1^2 x^{−1} dx = ∫_N^{2N} x^{−1} dx.
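The dichotomy in (1.1.61) is visible in partial sums. A sketch (Python; the cutoff N = 10000 and the bounds 2.0 and 9.0 are illustrative choices, not from the text):

```python
def partial_sum(p, N):
    # partial sum of the series sum_{k=1}^N 1/k^p
    return sum(1.0 / k**p for k in range(1, N + 1))

# For p = 2 the partial sums stay bounded (by 2, via the integral test);
# for p = 1 they grow like log N and exceed any bound.
assert partial_sum(2, 10000) < 2.0
assert partial_sum(1, 10000) > 9.0
print(partial_sum(2, 10000), partial_sum(1, 10000))
```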

In Exercises 19–20, C ⊂ R is the Cantor set, defined as follows. Take a closed, bounded interval [a, b] = C_0. Let C_1 be obtained from C_0 by deleting the open middle third interval, of length (b − a)/3. At the jth stage, C_j is a disjoint union of 2^j closed intervals, each of length 3^{−j}(b − a). Then C_{j+1} is obtained from C_j by deleting the open middle third of each of these 2^j intervals. We have C_0 ⊃ C_1 ⊃ · · · ⊃ C_j ⊃ · · ·, each a closed subset of [a, b]. The Cantor set is C = ∩_{j≥0} C_j.

19. Show that cont^+ C_j = (2/3)^j (b − a), and conclude that cont^+ C = 0.

20. Define f : [a, b] → R as follows. We call an interval of length 3^{−j}(b − a), omitted in passing from C_{j−1} to C_j, a "j-interval." Set

f(x) = 0 if x ∈ C, (−1)^j if x belongs to a j-interval.

Show that the set of discontinuities of f is C. Hence Proposition 1.1.11 implies f ∈ R([a, b]).

21. Generalize Exercise 8 as follows. Assume f and g are differentiable on a neighborhood of [a, b] and f′, g′ ∈ R([a, b]). Then show that (1.1.49) holds.
Hint. Use the results of Exercise 11 to show that (fg)′ ∈ R([a, b]).

22. Let f : I → R be bounded, I = [a, b]. Show that

Ī(f) = inf { ∫_I f_1 dx : f_1 ∈ C(I), f_1 ≥ f },
I(f) = sup { ∫_I f_0 dx : f_0 ∈ C(I), f_0 ≤ f }.

Compare (1.1.41). Then show that

(1.1.62) f ∈ R(I) ⇐⇒ for each ε > 0, ∃ f_0, f_1 ∈ C(I) such that
         f_0 ≤ f ≤ f_1 and ∫_I (f_1 − f_0) dx < ε.

Compare (1.1.42).

1.2. Euclidean spaces

The space Rn, n-dimensional Euclidean space, consists of n-tuples of real numbers:

(1.2.1) x = (x_1, . . . , x_n) ∈ R^n, x_j ∈ R, 1 ≤ j ≤ n.

The number x_j is called the jth component of x. Here we discuss some important algebraic and metric structures on R^n. First, there is addition. If x is as in (1.2.1) and also y = (y_1, . . . , y_n) ∈ R^n, we have

(1.2.2) x + y = (x_1 + y_1, . . . , x_n + y_n) ∈ R^n.

Addition is done componentwise. Also, given a ∈ R, we have

(1.2.3) ax = (ax_1, . . . , ax_n) ∈ R^n.

This is scalar multiplication. In (1.2.1), we represent x as a row vector. Sometimes we want to represent x by a column vector,

(1.2.4)
        [ x_1 ]
    x = [  ⋮  ] .
        [ x_n ]

Then (1.2.2)–(1.2.3) are converted to

(1.2.5)
            [ x_1 + y_1 ]         [ ax_1 ]
    x + y = [     ⋮     ] ,  ax = [  ⋮   ] .
            [ x_n + y_n ]         [ ax_n ]

We also have the dot product,

(1.2.6) x · y = ∑_{j=1}^n x_j y_j = x_1 y_1 + · · · + x_n y_n ∈ R,

given x, y ∈ R^n. The dot product has the properties

(1.2.7) x · y = y · x,
        x · (ay + bz) = a(x · y) + b(x · z),
        x · x > 0 unless x = 0.

Note that

(1.2.8) x · x = x_1^2 + · · · + x_n^2.

We set

(1.2.9) |x| = √(x · x),

which we call the norm of x. Note that (1.2.7) implies

(1.2.10) (ax) · (ax) = a^2 (x · x),

hence

(1.2.11) |ax| = |a| · |x|, for a ∈ R, x ∈ R^n.

Taking a cue from the Pythagorean theorem, we say that the distance from x to y in R^n is

(1.2.12) d(x, y) = |x − y|.

For us, (1.2.9) and (1.2.12) are simply definitions. We do not need to depend on a derivation of the Pythagorean theorem via classical Euclidean geometry. Significant properties will be derived below, without recourse to a prior theory of Euclidean geometry. A set X equipped with a distance function is called a metric space. We consider metric spaces in general in Appendix A.1. Here, we want to show that the Euclidean distance, defined by (1.2.12), satisfies the "triangle inequality,"

(1.2.13) d(x, y) ≤ d(x, z) + d(z, y), ∀ x, y, z ∈ R^n.

This in turn is a consequence of the following, also called the triangle inequality.

Proposition 1.2.1. The norm (1.2.9) on R^n has the property

(1.2.14) |x + y| ≤ |x| + |y|, ∀ x, y ∈ R^n.

Proof. We compare the squares of the two sides of (1.2.14). First,

(1.2.15) |x + y|^2 = (x + y) · (x + y) = x · x + x · y + y · x + y · y = |x|^2 + 2x · y + |y|^2.

Next,

(1.2.16) (|x| + |y|)^2 = |x|^2 + 2|x| · |y| + |y|^2.

We see that (1.2.14) holds if and only if x · y ≤ |x| · |y|. Thus the proof of Proposition 1.2.1 is finished off by the following result, known as Cauchy's inequality. □

Proposition 1.2.2. For all x, y ∈ R^n,

(1.2.17) |x · y| ≤ |x| · |y|.

Proof. We start with the chain (1.2.18) 0 ≤ |x − y|2 = (x − y) · (x − y) = |x|2 + |y|2 − 2x · y, which implies (1.2.19) 2x · y ≤ |x|2 + |y|2, ∀ x, y ∈ Rn. If we replace x by tx and y by t−1y, with t > 0, the left side of (1.2.19) is unchanged, so we have (1.2.20) 2x · y ≤ t2|x|2 + t−2|y|2, ∀ t > 0. Now we pick t so that the two terms on the right side of (1.2.20) are equal, namely |y| |x| (1.2.21) t2 = , t−2 = . |x| |y| (At this point, note that (1.2.17) is obvious if x = 0 or y = 0, so we will assume that x ≠ 0 and y ≠ 0.) Plugging (1.2.21) into (1.2.20) gives (1.2.22) x · y ≤ |x| · |y|, ∀ x, y ∈ Rn. This is almost (1.2.17). To finish, we can replace x in (1.2.22) by −x = (−1)x, getting (1.2.23) −(x · y) ≤ |x| · |y|, and together (1.2.22) and (1.2.23) give (1.2.17). 
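Both (1.2.14) and (1.2.17) are easy to test on random vectors. A sketch (Python; dimension 5, the seed, and the sample count are illustrative choices, not from the text):

```python
import random

def dot(x, y):
    # dot product (1.2.6)
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    # Euclidean norm (1.2.9)
    return dot(x, x) ** 0.5

random.seed(0)
for _ in range(100):
    x = [random.uniform(-1, 1) for _ in range(5)]
    y = [random.uniform(-1, 1) for _ in range(5)]
    s = [a + b for a, b in zip(x, y)]
    assert abs(dot(x, y)) <= norm(x) * norm(y) + 1e-12   # Cauchy (1.2.17)
    assert norm(s) <= norm(x) + norm(y) + 1e-12          # triangle (1.2.14)
print("both inequalities hold on all samples")
```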

We now discuss a number of notions and results related to convergence in R^n. First, a sequence of points (p_j) in R^n converges to a limit p ∈ R^n (we write p_j → p) if and only if

(1.2.24) |p_j − p| −→ 0,

where | · | is the Euclidean norm on R^n, defined by (1.2.9), and the meaning of (1.2.24) is that for every ε > 0 there exists N such that

(1.2.25) j ≥ N =⇒ |pj − p| < ε.

If we write p_j = (p_{1j}, . . . , p_{nj}) and p = (p_1, . . . , p_n), then (1.2.24) is equivalent to

(p_{1j} − p_1)^2 + · · · + (p_{nj} − p_n)^2 −→ 0, as j → ∞,

which holds if and only if

|p_{ℓj} − p_ℓ| −→ 0 as j → ∞, for each ℓ ∈ {1, . . . , n}.

That is to say, convergence p_j → p in R^n is equivalent to convergence of each component. A set S ⊂ R^n is said to be closed if and only if

(1.2.26) p_j ∈ S, p_j → p =⇒ p ∈ S.

The complement R^n \ S of a closed set S is open. Alternatively, Ω ⊂ R^n is open if and only if, given q ∈ Ω, there exists ε > 0 such that B_ε(q) ⊂ Ω, where

(1.2.27) B_ε(q) = {p ∈ R^n : |p − q| < ε},

so q cannot be a limit of a sequence of points in R^n \ Ω.
An important property of R^n is completeness, a property defined as follows. A sequence (p_j) of points in R^n is called a Cauchy sequence if and only if

(1.2.28) |p_j − p_k| −→ 0, as j, k → ∞.

Again we see that (p_j) is Cauchy in R^n if and only if each component is Cauchy in R. It is easy to see that if p_j → p for some p ∈ R^n, then (1.2.28) holds. The completeness property is the converse.

Theorem 1.2.3. If (p_j) is a Cauchy sequence in R^n, then it has a limit, i.e., (1.2.24) holds for some p ∈ R^n.

n Proof. Since convergence pj → p in R is equivalent to convergence in R of each component, the result is a consequence of the completeness of R. This is proved in [44]. 

Completeness provides a path to the following key notion of compactness. A nonempty set K ⊂ Rn is said to be compact if and only if the following property holds.

(1.2.29) Each infinite sequence (p_j) in K has a subsequence that converges to a point in K.

It is clear that if K is compact, then it must be closed. It must also be bounded, i.e., there exists R < ∞ such that K ⊂ B_R(0). Indeed, if K is not bounded, there exist p_j ∈ K such that |p_{j+1}| ≥ |p_j| + 1. In such a case, |p_j − p_k| ≥ 1 whenever j ≠ k, so (p_j) cannot have a convergent subsequence. The following converse result is the n-dimensional Bolzano-Weierstrass theorem.

Theorem 1.2.4. If a nonempty K ⊂ R^n is closed and bounded, then it is compact.

Proof. If K ⊂ Rn is closed and bounded, it is a closed subset of some box n (1.2.30) B = {(x1, . . . , xn) ∈ R : a ≤ xk ≤ b, ∀ k}. Clearly every closed subset of a compact set is compact, so it suffices to show that B is compact. Now, each closed bounded interval [a, b] in R is compact, as shown in Appendix A.1, and (by reasoning similar to the proof of Theorem 1.2.3) the compactness of B follows readily from this. 

We establish some further properties of compact sets K ⊂ R^n, leading to the important result, Proposition 1.2.8 below. A generalization of this result is given in Appendix A.1.

Proposition 1.2.5. Let K ⊂ R^n be compact. Assume X_1 ⊃ X_2 ⊃ X_3 ⊃ · · · form a decreasing sequence of closed subsets of K. If each X_m ≠ ∅, then ∩_m X_m ≠ ∅.

Proof. Pick x_m ∈ X_m. Since K is compact, (x_m) has a convergent subsequence, x_{m_k} → y. Since {x_{m_k} : k ≥ ℓ} ⊂ X_{m_ℓ}, which is closed, we have y ∈ ∩_m X_m. □

Corollary 1.2.6. Let K ⊂ R^n be compact. Assume U_1 ⊂ U_2 ⊂ U_3 ⊂ · · · form an increasing sequence of open sets in R^n. If ∪_m U_m ⊃ K, then U_M ⊃ K for some M.

Proof. Consider X_m = K \ U_m. □

Before getting to Proposition 1.2.8, we bring in the following. Let Q denote the set of rational numbers, and let Q^n denote the set of points in R^n all of whose components are rational. The set Q^n ⊂ R^n has the following "denseness" property: given p ∈ R^n and ε > 0, there exists q ∈ Q^n such that |p − q| < ε. Let

(1.2.31) R = {B_r(q) : q ∈ Q^n, r ∈ Q ∩ (0, ∞)}.

Note that Q and Q^n are countable, i.e., they can be put in one-to-one correspondence with N. Hence R is a countable collection of balls. The following lemma is left as an exercise for the reader.

Lemma 1.2.7. Let Ω ⊂ R^n be a nonempty open set. Then

(1.2.32) Ω = ∪ {B : B ∈ R, B ⊂ Ω}.

To state the next result, we say that a collection {Uα : α ∈ A} covers K n if K ⊂ ∪α∈AUα. If each Uα ⊂ R is open, it is called an open cover of K. If B ⊂ A and K ⊂ ∪β∈BUβ, we say {Uβ : β ∈ B} is a subcover. The following is the n-dimensional Heine-Borel theorem.

Proposition 1.2.8. If K ⊂ Rn is compact, then it has the following prop- erty.

(1.2.33) Every open cover {Uα : α ∈ A} of K has a finite subcover.

Proof. By Lemma 1.2.7, it suffices to prove the following.

(1.2.34) Every countable cover {B_j : j ∈ N} of K by open balls has a finite subcover.

To see this, write R = {B_j : j ∈ N}. Given the cover {U_α}, pass to {B_j : j ∈ J}, where j ∈ J if and only if B_j is contained in some U_α. By (1.2.32), {B_j : j ∈ J} covers K. If (1.2.34) holds, we have a subcover {B_ℓ : ℓ ∈ L} for some finite L ⊂ J. Pick α_ℓ ∈ A such that B_ℓ ⊂ U_{α_ℓ}. Then {U_{α_ℓ} : ℓ ∈ L} is the desired finite subcover advertised in (1.2.33). Finally, to prove (1.2.34), we set

(1.2.35) Um = B1 ∪ · · · ∪ Bm and apply Corollary 1.2.6. 

Exercises

1. Identifying z = x + iy ∈ C with (x, y) ∈ R^2 and w = u + iv ∈ C with (u, v) ∈ R^2, show that the dot product satisfies z · w = Re(z w̄).

2. Show that the inequality (1.2.14) implies (1.2.13).

3. Prove Lemma 1.2.7.

4. Use Proposition 1.2.8 to prove the following extension of Proposition 1.2.5.

Proposition 1.2.9. Let K ⊂ R^n be compact. Assume {X_α : α ∈ A} is a collection of closed subsets of K. Assume that for each finite set B ⊂ A, ∩_{α∈B} X_α ≠ ∅. Then

∩_{α∈A} X_α ≠ ∅.

Hint. Consider U_α = R^n \ X_α.

5. Let K ⊂ R^n be compact. Show that there exist x_0, x_1 ∈ K such that

|x0| ≤ |x|, ∀ x ∈ K,

|x_1| ≥ |x|, ∀ x ∈ K.

We say |x_0| = min_{x∈K} |x|, |x_1| = max_{x∈K} |x|.

1.3. Vector spaces and linear transformations

We have seen in §1.2 how R^n is a vector space, with vector operations given by (1.2.2)–(1.2.3), for row vectors, and by (1.2.4)–(1.2.5) for column vectors. We could also use complex numbers, replacing R^n by C^n, and allowing a ∈ C in (1.2.3) and (1.2.5). We will use F to denote R or C.
Many other vector spaces arise naturally. We define this general notion now. A vector space over F is a set V, endowed with two operations, that of vector addition and multiplication by scalars. That is, given v, w ∈ V and a ∈ F, then v + w and av are defined in V. Furthermore, the following properties are to hold, for all u, v, w ∈ V, a, b ∈ F. First there are laws for vector addition:

(1.3.1) Commutative law : u + v = v + u,
        Associative law : (u + v) + w = u + (v + w),
        Zero vector : ∃ 0 ∈ V, v + 0 = v,
        Negative : ∃ −v, v + (−v) = 0.

Next there are laws for multiplication by scalars:

(1.3.2) Associative law : a(bv) = (ab)v,
        Unit : 1 · v = v.

Finally there are two distributive laws:

(1.3.3) a(u + v) = au + av,
        (a + b)u = au + bu.

It is easy to see that R^n and C^n satisfy all these rules. We will present a number of other examples below. Let us also note that a number of other simple identities are automatic consequences of the rules given above. Here are some, which the reader is invited to verify:

(1.3.4) v + w = v ⇒ w = 0,
        v + 0 · v = (1 + 0)v = v, 0 · v = 0,
        v + w = 0 ⇒ w = −v,
        v + (−1)v = 0 · v = 0, (−1)v = −v.

Here are some other examples of vector spaces. Let I = [a, b] denote an interval in R, and take a non-negative integer k. Then C^k(I) denotes the set of functions f : I → F whose derivatives up to order k are continuous. We denote by P the set of polynomials in x, with coefficients in F. We denote by P_k the set of polynomials in x of degree ≤ k. In these various cases,

(1.3.5) (f + g)(x) = f(x) + g(x), (af)(x) = af(x).

Such vector spaces and certain of their linear subspaces play a major role in the material developed in these notes. Regarding the notion just mentioned, we say a subset W of a vector space V is a linear subspace provided

(1.3.6) wj ∈ W, aj ∈ F =⇒ a1w1 + a2w2 ∈ W. Then W inherits the structure of a vector space.

Linear transformations and matrices

If V and W are vector spaces over F (R or C), a map

(1.3.7) T : V −→ W is said to be a linear transformation provided

(1.3.8) T (a1v1 + a2v2) = a1T v1 + a2T v2, ∀ aj ∈ F, vj ∈ V. We also write T ∈ L(V,W ). In case V = W , we also use the notation L(V ) = L(V,V ). Linear transformations arise in a number of ways. For example, an m×n matrix A with entries in F defines a linear transformation

(1.3.9) A : F^n −→ F^m

by

(1.3.10)
    [ a_11 · · · a_1n ] [ b_1 ]   [ Σ_ℓ a_1ℓ b_ℓ ]
    [  ⋮          ⋮   ] [  ⋮  ] = [      ⋮       ] .
    [ a_m1 · · · a_mn ] [ b_n ]   [ Σ_ℓ a_mℓ b_ℓ ]

We also have linear transformations on function spaces, such as multiplication operators

(1.3.11) M_f : C^k(I) −→ C^k(I), M_f g(x) = f(x)g(x),

given f ∈ C^k(I), I = [a, b], and the operation of differentiation:

(1.3.12) D : C^{k+1}(I) −→ C^k(I), Df(x) = f′(x).

We also have integration:

(1.3.13) I : C^k(I) −→ C^{k+1}(I), If(x) = ∫_a^x f(y) dy.

Note also that

(1.3.14) D : Pk+1 −→ Pk, I : Pk −→ Pk+1, where Pk denotes the space of polynomials in x of degree ≤ k.

Two linear transformations Tj ∈ L(V,W ) can be added:

(1.3.15) T_1 + T_2 : V −→ W, (T_1 + T_2)v = T_1 v + T_2 v.

Also T ∈ L(V, W) can be multiplied by a scalar:

(1.3.16) aT : V −→ W, (aT)v = a(Tv).

This makes L(V, W) a vector space.
We can also compose linear transformations S ∈ L(W, X), T ∈ L(V, W):

(1.3.17) ST : V −→ X, (ST)v = S(Tv).

For example, we have

(1.3.18) M_f D : C^{k+1}(I) −→ C^k(I), M_f Dg(x) = f(x)g′(x),

given f ∈ C^k(I). When two transformations

(1.3.19) A : F^n −→ F^m, B : F^k −→ F^n

are represented by matrices, e.g., A as in (1.3.10) and

(1.3.20)
    B = [ b_11 · · · b_1k ]
        [  ⋮          ⋮   ] ,
        [ b_n1 · · · b_nk ]

then

(1.3.21) AB : F^k −→ F^m

is given by matrix multiplication:

(1.3.22)
    AB = [ Σ a_1ℓ b_ℓ1 · · · Σ a_1ℓ b_ℓk ]
         [      ⋮                 ⋮      ] .
         [ Σ a_mℓ b_ℓ1 · · · Σ a_mℓ b_ℓk ]

For example,

    [ a_11 a_12 ] [ b_11 b_12 ]   [ a_11 b_11 + a_12 b_21   a_11 b_12 + a_12 b_22 ]
    [ a_21 a_22 ] [ b_21 b_22 ] = [ a_21 b_11 + a_22 b_21   a_21 b_12 + a_22 b_22 ] .

Another way of writing (1.3.22) is to represent A and B as

(1.3.23) A = (a_ij), B = (b_ij),

and then we have

(1.3.24) AB = (d_ij), d_ij = ∑_{ℓ=1}^n a_iℓ b_ℓj.

To establish the identity (1.3.22), we note that it suffices to show the two sides have the same effect on each e_j ∈ F^k, 1 ≤ j ≤ k, where e_j is the column vector in F^k whose jth entry is 1 and whose other entries are 0. First note that

(1.3.25) Be_j = (b_{1j}, . . . , b_{nj})^t,

the jth column in B, as one can see via (1.3.10). Similarly, if D denotes the right side of (1.3.22), De_j is the jth column of this matrix, i.e.,

(1.3.26) De_j = (Σ a_1ℓ b_ℓj, . . . , Σ a_mℓ b_ℓj)^t.

On the other hand, applying A to (1.3.25), via (1.3.10), gives the same result, so (1.3.22) holds.
Associated with a linear transformation as in (1.3.7) there are two special linear spaces, the null space of T and the range of T. The null space of T is

(1.3.27) N(T) = {v ∈ V : Tv = 0},

and the range of T is

(1.3.28) R(T) = {Tv : v ∈ V}.

Note that N(T) is a linear subspace of V and R(T) is a linear subspace of W. If N(T) = 0 we say T is injective; if R(T) = W we say T is surjective. Note that T is injective if and only if T is one-to-one, i.e.,

(1.3.29) Tv_1 = Tv_2 =⇒ v_1 = v_2.
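Returning to (1.3.24), the componentwise description of matrix multiplication translates directly into code. A sketch (Python; the 2 × 2 matrices are illustrative choices, not from the text):

```python
def matmul(A, B):
    # (1.3.24): (AB)_{ij} = sum_l a_{il} b_{lj}; A is m x n, B is n x k
    m, n, k = len(A), len(B), len(B[0])
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(k)]
            for i in range(m)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
```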

If T is surjective, we also say T is onto. If T is one-to-one and onto, we say it is an isomorphism. In such a case the inverse

(1.3.30) T −1 : W −→ V is well defined, and it is a linear transformation. We also say T is invertible, in such a case.

Basis and dimension

Given a finite set S = {v1, . . . , vk} in a vector space V , the span of S is the set of vectors in V of the form

(1.3.31) c1v1 + ··· + ckvk, with cj arbitrary scalars, ranging over F = R or C. This set, denoted Span(S) is a linear subspace of V . The set S is said to be linearly dependent if and only if there exist scalars c1, . . . , ck, not all zero, such that (1.3.31) vanishes. Otherwise we say S is linearly independent.

If {v_1, . . . , v_k} is linearly independent, we say S is a basis of Span(S), and that k is the dimension of Span(S). In particular, if this holds and Span(S) = V, we say k = dim V. We also say V has a finite basis, and that V is finite dimensional. By convention, if V has only one element, the zero element, we say V = 0 and dim V = 0.

It is easy to see that any finite set S = {v_1, . . . , v_k} ⊂ V has a maximal subset that is linearly independent, and such a subset has the same span as S, so Span(S) has a basis. To take a complementary perspective, S will have a minimal subset S_0 with the same span, and any such minimal subset will be a basis of Span(S). Soon we will show that any two bases of a finite-dimensional vector space V have the same number of elements (so dim V is well defined). First, let us relate V to F^k.

So say V has a basis S = {v1, . . . , vk}. We define a linear transformation

(1.3.32) A : Fk −→ V by

(1.3.33) A(c_1 e_1 + · · · + c_k e_k) = c_1 v_1 + · · · + c_k v_k,

where

(1.3.34) e_1 = (1, 0, . . . , 0)^t, . . . , e_k = (0, . . . , 0, 1)^t.

We say {e_1, . . . , e_k} is the standard basis of F^k. The linear independence of S is equivalent to the injectivity of A and the statement that S spans V is equivalent to the surjectivity of A. Hence the statement that S is a basis of V is equivalent to the statement that A is an isomorphism, with inverse uniquely specified by

(1.3.35) A^{−1}(c_1 v_1 + · · · + c_k v_k) = c_1 e_1 + · · · + c_k e_k.

We begin our demonstration that dim V is well defined with the following concrete result.

Lemma 1.3.1. If v_1, . . . , v_{k+1} are vectors in F^k, then they are linearly dependent.

Proof. We use induction on k. The result is obvious if k = 1. We can suppose the last component of some v_j is nonzero, since otherwise we can regard these vectors as elements of F^{k−1} and use the inductive hypothesis. Reordering these vectors, we can assume the last component of v_{k+1} is nonzero, and it can be assumed to be 1. Form

w_j = v_j − v_{kj} v_{k+1}, 1 ≤ j ≤ k,

where v_j = (v_{1j}, . . . , v_{kj})^t. Then the last component of each of the vectors w_1, . . . , w_k is 0, so we can regard these as k vectors in F^{k−1}. By induction, there exist scalars a_1, . . . , a_k, not all zero, such that

a1w1 + ··· + akwk = 0, so we have

a_1 v_1 + · · · + a_k v_k = (a_1 v_{k1} + · · · + a_k v_{kk}) v_{k+1},

the desired linear dependence relation on {v_1, . . . , v_{k+1}}. □

With this result in hand, we proceed.

Proposition 1.3.2. If V has a basis {v1, . . . , vk} with k elements and {w1, . . . , wℓ} ⊂ V is linearly independent, then ℓ ≤ k.

Proof. Take the isomorphism A : F^k → V described in (1.3.32)–(1.3.33). The hypotheses imply that {A^{−1}w_1, . . . , A^{−1}w_ℓ} is linearly independent in F^k, so Lemma 1.3.1 implies ℓ ≤ k. □

Corollary 1.3.3. If V is finite-dimensional, any two bases of V have the same number of elements. If V is isomorphic to W , these spaces have the same dimension.

Proof. If S (with #S elements) and T are bases of V , we have #S ≤ #T and #T ≤ #S, hence #S = #T . For the latter part, an isomorphism of V onto W takes a basis of V to a basis of W . 

The following is an easy but useful consequence. Proposition 1.3.4. If V is finite dimensional and W ⊂ V a linear subspace, then W has a finite basis, and dim W ≤ dim V .

Proof. Suppose {w_1, . . . , w_ℓ} is a linearly independent subset of W. Proposition 1.3.2 implies ℓ ≤ dim V. If this set spans W, we are done. If not, there is an element w_{ℓ+1} ∈ W not in this span, and {w_1, . . . , w_{ℓ+1}} is a linearly independent subset of W. Again ℓ + 1 ≤ dim V. Continuing this process a finite number of steps must produce a basis of W. □

A similar argument establishes: Proposition 1.3.5. Suppose V is finite dimensional, W ⊂ V a linear sub- space, and {w1, . . . , wℓ} a basis of W . Then V has a basis of the form {w1, . . . , wℓ, u1, . . . , um}, and ℓ + m = dim V .

Having this, we can establish the following result, sometimes called the fundamental theorem of linear algebra. Proposition 1.3.6. Assume V and W are vector spaces, V finite dimen- sional, and (1.3.36) A : V −→ W a linear map. Then (1.3.37) dim N (A) + dim R(A) = dim V.

Proof. Let {w1, . . . , wℓ} be a basis of N (A) ⊂ V , and complete it to a basis

{w1, . . . , wℓ, u1, . . . , um} of V . Set L = Span{u1, . . . , um}, and consider

(1.3.38) A_0 : L −→ W, A_0 = A|_L.

Clearly w ∈ R(A) ⇒ w = A(a1w1 + ··· + aℓwℓ + b1u1 + ··· + bmum) = A0(b1u1 + ··· + bmum), so

(1.3.39) R(A_0) = R(A).

Furthermore,

(1.3.40) N (A0) = N (A) ∩ L = 0.

Hence A0 : L → R(A0) is an isomorphism. Thus dim R(A) = dim R(A0) = dim L = m, and we have (1.3.37). 

The following is a significant special case.

Corollary 1.3.7. Let V be finite dimensional, and let A : V → V be linear. Then A injective ⇐⇒ A surjective ⇐⇒ A isomorphism.

We mention that these equivalences can fail for infinite dimensional spaces. For example, if P denotes the space of polynomials in x, then Mx : P → P (Mxf(x) = xf(x)) is injective but not surjective, while D : P → P (Df(x) = f ′(x)) is surjective but not injective. Next we have the following important characterization of injectivity and surjectivity.

Proposition 1.3.8. Assume V and W are finite dimensional and A : V → W is linear. Then

(1.3.41) A surjective ⇐⇒ AB = IW , for some B ∈ L(W, V ), and

(1.3.42) A injective ⇐⇒ CA = IV , for some C ∈ L(W, V ).

Proof. Clearly AB = I ⇒ A surjective and CA = I ⇒ A injective. We establish the converses.

First assume A : V → W is surjective. Let {w1, . . . , wℓ} be a basis of W . Pick vj ∈ V such that Avj = wj. Set

(1.3.43) B(a1w1 + ··· + aℓwℓ) = a1v1 + ··· + aℓvℓ.

This works in (1.3.41).

Next assume A : V → W is injective. Let {v1, . . . , vk} be a basis of V . Set wj = Avj. Then {w1, . . . , wk} is linearly independent, hence a basis of R(A), and we then can produce a basis {w1, . . . , wk, u1, . . . , um} of W . Set

(1.3.44) C(a1w1 + ··· + akwk + b1u1 + ··· + bmum) = a1v1 + ··· + akvk.

This works in (1.3.42). □

An m × n matrix A defines a linear transformation A : F^n → F^m, as in (1.3.9)–(1.3.10). The columns of A are

(1.3.45) a_j = (a_{1j}, . . . , a_{mj})^t.

As seen in (1.3.25),

(1.3.46) Ae_j = a_j,

where e_1, . . . , e_n is the standard basis of F^n. Hence

(1.3.47) R(A) = linear span of the columns of A,

so

(1.3.48) R(A) = F^m ⇐⇒ a_1, . . . , a_n span F^m.

Furthermore,

(1.3.49) A(∑_{j=1}^n c_j e_j) = 0 ⇐⇒ ∑_{j=1}^n c_j a_j = 0,

so

(1.3.50) N(A) = 0 ⇐⇒ {a_1, . . . , a_n} is linearly independent.

We have the following conclusion, in case m = n.

Proposition 1.3.9. Let A be an n × n matrix, defining A : F^n → F^n. Then the following are equivalent:

(1.3.51) A is invertible,
         The columns of A are linearly independent,
         The columns of A span F^n.

Exercises

1. Show that the results in (1.3.4) follow from the basic rules (1.3.1)–(1.3.3). Hint. To start, add −v to both sides of the identity v + w = v, and take account first of the associative law in (1.3.1), and then of the rest of (1.3.1). For the second line of (1.3.4), use the rules (1.3.2) and (1.3.3). Then use the first two lines of (1.3.4) to justify the third line...

2. Demonstrate the following results for any vector space. Take a ∈ F, v ∈ V.

a · 0 = 0 ∈ V,
a(−v) = −av.

Hint. Feel free to use the results of (1.3.4).

Let V be a vector space (over F) and W, X ⊂ V linear subspaces. We say

(1.3.52) V = W + X

provided each v ∈ V can be written

(1.3.53) v = w + x, w ∈ W, x ∈ X.

We say

(1.3.54) V = W ⊕ X

provided each v ∈ V has a unique representation (1.3.53).

3. Show that V = W ⊕ X ⇐⇒ V = W + X and W ∩ X = 0.

4. Let A : Fⁿ → Fᵐ be defined by an m × n matrix, as in (1.3.9)–(1.3.10).
(a) Show that R(A) is the span of the columns of A. Hint. See (1.3.25).
(b) Show that N(A) = 0 if and only if the columns of A are linearly independent.

5. Define the transpose of an m × n matrix A = (a_{jk}) to be the n × m matrix A^t = (a_{kj}). Thus, if A is as in (1.3.9)–(1.3.10),

(1.3.55) A^t = \begin{pmatrix} a_{11} & \cdots & a_{m1} \\ \vdots & & \vdots \\ a_{1n} & \cdots & a_{mn} \end{pmatrix}.

For example,

A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix} =⇒ A^t = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}.

Suppose also B is an n × k matrix, as in (2.14), so AB is defined, as in (1.3.21). Show that

(1.3.56) (AB)^t = B^t A^t.
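The identity (1.3.56) is easy to test on small examples, including the 3 × 2 matrix displayed above; a minimal sketch, with helper names of our choosing:

```python
def transpose(A):
    # the rows of A^t are the columns of A
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    # (AB)_{jk} = sum_i a_{ji} b_{ik}
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2], [3, 4], [5, 6]]        # the 3 x 2 matrix from the example
B = [[1, 0, 2], [0, 1, 1]]          # an arbitrary 2 x 3 matrix
print(transpose(A))                  # [[1, 3, 5], [2, 4, 6]]
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))  # True
```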

6. Let

A = \begin{pmatrix} 1 & 2 & 3 \end{pmatrix}, B = \begin{pmatrix} 2 \\ 0 \\ 2 \end{pmatrix}.

Compute AB and BA. Then compute AtBt and BtAt.

7. Let P_5 be the space of real polynomials in x of degree ≤ 5 and set

T : P_5 −→ R³, T p = (p(−1), p(0), p(1)).

Specify R(T) and N(T), and verify (1.3.37) for V = P_5, W = R³, A = T.

8. Denote the space of m × n matrices with entries in F (as in (1.3.10)) by

(1.3.57) M(m × n, F).

If m = n, denote it by

(1.3.58) M(n, F).

Show that dim M(m × n, F) = mn; in particular, dim M(n, F) = n².

1.4. Determinants

Determinants arise in the study of inverting a matrix. To take the 2 × 2 case, solving for x and y the system

(1.4.1) ax + by = u, cx + dy = v

can be done by multiplying these equations by d and b, respectively, and subtracting, and by multiplying them by c and a, respectively, and subtracting, yielding

(1.4.2) (ad − bc)x = du − bv, (ad − bc)y = av − cu.

The factor on the left is

(1.4.3) det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad − bc,

and solving (1.4.2) for x and y leads to

(1.4.4) A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} =⇒ A⁻¹ = (1/det A) \begin{pmatrix} d & −b \\ −c & a \end{pmatrix},

provided det A ≠ 0.
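The 2 × 2 formulas (1.4.3)–(1.4.4) can be coded directly; the function names here are illustrative, not from the text:

```python
def det2(A):
    # (1.4.3): determinant of a 2 x 2 matrix
    (a, b), (c, d) = A
    return a * d - b * c

def inv2(A):
    # (1.4.4): A^{-1} = (1/det A) [[d, -b], [-c, a]], provided det A != 0
    (a, b), (c, d) = A
    D = det2(A)
    assert D != 0, "matrix is not invertible"
    return [[d / D, -b / D], [-c / D, a / D]]

A = [[2.0, 1.0], [5.0, 3.0]]          # det A = 2*3 - 1*5 = 1
print(inv2(A))                         # [[3.0, -1.0], [-5.0, 2.0]]
```

Multiplying the result back against A returns the identity, which is exactly the content of (1.4.2)–(1.4.4).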

We now consider determinants of n × n matrices. Let M(n, F) denote the set of n × n matrices with entries in F = R or C. We write

(1.4.5) A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix} = (a_1, . . . , a_n),

where

(1.4.6) a_j = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{nj} \end{pmatrix}

is the jth column of A. The determinant is defined as follows.

Proposition 1.4.1. There is a unique function

(1.4.7) ϑ : M(n, F) −→ F, satisfying the following three properties:

(a) ϑ is linear in each column a_j of A,
(b) ϑ(Ã) = −ϑ(A) if Ã is obtained from A by interchanging two columns,
(c) ϑ(I) = 1.

This defines the determinant:

(1.4.8) ϑ(A) = det A.

If (c) is replaced by

(c′) ϑ(I) = r, then

(1.4.9) ϑ(A) = r det A.

The proof will involve constructing an explicit formula for det A by following the rules (a)–(c). We start with the case n = 3. We have

(1.4.10) det A = ∑_{j=1}^3 a_{j1} det(e_j, a_2, a_3),

by applying (a) to the first column of A, a_1 = ∑_j a_{j1} e_j. Here and below, {e_j : 1 ≤ j ≤ n} denotes the standard basis of Fⁿ, so e_j has a 1 in the jth slot and 0s elsewhere. Applying (a) to the second and third columns gives

(1.4.11) det A = ∑_{j,k=1}^3 a_{j1} a_{k2} det(e_j, e_k, a_3) = ∑_{j,k,ℓ=1}^3 a_{j1} a_{k2} a_{ℓ3} det(e_j, e_k, e_ℓ).

This is a sum of 27 terms, but most of them are 0. Note that rule (b) implies

(1.4.12) det B = 0 whenever B has two identical columns.

Hence det(ej, ek, eℓ) = 0 unless j, k, and ℓ are distinct, that is, unless (j, k, ℓ) is a permutation of (1, 2, 3). Now rule (c) says

(1.4.13) det(e1, e2, e3) = 1, and we see from rule (b) that det(ej, ek, eℓ) = 1 if one can convert (ej, ek, eℓ) to (e1, e2, e3) by an even number of column interchanges, and det(ej, ek, eℓ) = −1 if it takes an odd number of interchanges. Explicitly,

det(e1, e2, e3) = 1, det(e1, e3, e2) = −1,

(1.4.14) det(e2, e3, e1) = 1, det(e2, e1, e3) = −1,

det(e3, e1, e2) = 1, det(e3, e2, e1) = −1. Consequently (1.4.11) yields

det A = a11a22a33 − a11a32a23

(1.4.15) + a21a32a13 − a21a12a33

+ a31a12a23 − a31a22a13. Note that the second indices occur in (1, 2, 3) order in each product. We can rearrange these products so that the first indices occur in (1, 2, 3) order:

det A = a11a22a33 − a11a23a32

(1.4.16) + a13a21a32 − a12a21a33

+ a₁₂a₂₃a₃₁ − a₁₃a₂₂a₃₁.

Now we tackle the case of general n. Parallel to (1.4.10)–(1.4.11), we have

(1.4.17) det A = ∑_j a_{j1} det(e_j, a_2, . . . , a_n) = ··· = ∑_{j_1,...,j_n} a_{j_1 1} ··· a_{j_n n} det(e_{j_1}, . . . , e_{j_n}),

by applying rule (a) to each of the n columns of A. As before, (1.4.12) implies det(e_{j_1}, . . . , e_{j_n}) = 0 unless (j_1, . . . , j_n) are all distinct, that is, unless

(j1, . . . , jn) is a permutation of the set (1, 2, . . . , n). We set

(1.4.18) Sn = set of permutations of (1, 2, . . . , n).

That is, Sn consists of elements σ, mapping the set {1, . . . , n} to itself, (1.4.19) σ : {1, 2, . . . , n} −→ {1, 2, . . . , n}, that are one-to-one and onto. We can compose two such permutations, obtaining the product στ ∈ Sn, given σ and τ in Sn. A permutation that interchanges just two elements of {1, . . . , n}, say j and k (j ≠ k), is called a transposition, and labeled (jk). It is easy to see that each permutation of {1, . . . , n} can be achieved by successively transposing pairs of elements of this set. That is, each element σ ∈ Sn is a product of transpositions. We claim that

(1.4.20) det(e_{σ(1)}, . . . , e_{σ(n)}) = (sgn σ) det(e_1, . . . , e_n) = sgn σ,

where

(1.4.21) sgn σ = +1 if σ is a product of an even number of transpositions, −1 if σ is a product of an odd number of transpositions.

In fact, the first identity in (1.4.20) follows from rule (b) and the second identity from rule (c). There is one point to be checked here. Namely, we claim that a given σ ∈ S_n cannot simultaneously be written as a product of an even number of transpositions and an odd number of transpositions. If σ could be so written, sgn σ would not be well defined, and it would be impossible to satisfy condition (b), so Proposition 1.4.1 would fail. One neat way to see that sgn σ is well defined is the following. Let σ ∈ S_n act on functions of n variables by

(1.4.22) (σf)(x1, . . . , xn) = f(xσ(1), . . . , xσ(n)).

It is readily verified that if also τ ∈ S_n,

(1.4.23) g = σf =⇒ τg = (τσ)f.

Now, let P be the polynomial

(1.4.24) P(x_1, . . . , x_n) = ∏_{1≤j<k≤n} (x_j − x_k).

One sees that each transposition applied to the variables of P changes the sign of P, so

(1.4.26) (σP)(x) = (sgn σ)P(x), ∀ σ ∈ S_n,

and sgn σ is well defined.

The proof of (1.4.20) is complete, and substitution into (1.4.17) yields the formula

(1.4.27) det A = ∑_{σ∈S_n} (sgn σ) a_{σ(1)1} ··· a_{σ(n)n}.

It is routine to check that this satisfies the properties (a)–(c). Regarding (b), note that if ϑ(A) denotes the right side of (1.4.27) and Ã is obtained from A by applying a permutation τ to the columns of A, so Ã = (a_{τ(1)}, . . . , a_{τ(n)}), then

ϑ(Ã) = ∑_{σ∈S_n} (sgn σ) a_{σ(1)τ(1)} ··· a_{σ(n)τ(n)}
 = ∑_{σ∈S_n} (sgn σ) a_{στ⁻¹(1)1} ··· a_{στ⁻¹(n)n}
 = ∑_{ω∈S_n} (sgn ωτ) a_{ω(1)1} ··· a_{ω(n)n}
 = (sgn τ) ϑ(A),

the last identity because

sgn ωτ = (sgn ω)(sgn τ), ∀ ω, τ ∈ Sn.

As for the final part of Proposition 1.4.1, if (c) is replaced by (c′), then (1.4.20) is replaced by

(1.4.28) ϑ(eσ(1), . . . , eσ(n)) = r(sgn σ), and (1.4.9) follows.

Remark. (1.4.27) is taken as a definition of the determinant by some au- thors. While it is a useful formula for the determinant, it is a bad definition, which has perhaps led to a bit of fear and loathing among math students.

Remark. Here is another formula for sgn σ, which the reader is invited to verify. If σ ∈ S_n,

sgn σ = (−1)^{κ(σ)},

where κ(σ) = number of pairs (j, k) such that 1 ≤ j < k ≤ n, but σ(j) > σ(k).
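The inversion count κ(σ) from this remark gives a direct way to compute sgn σ, and with it the formula (1.4.27) becomes computable by brute force. A sketch, with 0-based indices and names of our choosing:

```python
from itertools import permutations
from math import prod

def sgn(sigma):
    # sgn sigma = (-1)^{kappa(sigma)}, kappa = number of inverted pairs
    n = len(sigma)
    kappa = sum(1 for j in range(n) for k in range(j + 1, n)
                if sigma[j] > sigma[k])
    return -1 if kappa % 2 else 1

def det(A):
    # (1.4.27), with sigma running over all permutations of {0, ..., n-1}
    n = len(A)
    return sum(sgn(s) * prod(A[s[j]][j] for j in range(n))
               for s in permutations(range(n)))

print(sgn((1, 0, 2)))                          # -1: a single transposition
print(det([[1, 2], [3, 4]]))                   # -2, matching (1.4.3)
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))  # 24: product of the diagonal
```

The n! terms make this impractical for large n, but it follows (1.4.27) exactly.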

Note that

(1.4.29) a_{σ(1)1} ··· a_{σ(n)n} = a_{1τ(1)} ··· a_{nτ(n)}, with τ = σ⁻¹,

and sgn σ = sgn σ⁻¹, so, parallel to (1.4.16), we also have

(1.4.30) det A = ∑_{σ∈S_n} (sgn σ) a_{1σ(1)} ··· a_{nσ(n)}.

Comparison with (1.4.27) gives

(1.4.31) det A = det A^t,

where A = (a_{jk}) ⇒ A^t = (a_{kj}). Note that the jth column of A^t has the same entries as the jth row of A. In light of this, we have:

Corollary 1.4.2. In Proposition 1.4.1, one can replace “columns” by “rows.”

The following is a key property of the determinant, called multiplicativity:

Proposition 1.4.3. Given A and B in M(n, F),

(1.4.32) det(AB) = (det A)(det B).

Proof. For fixed A, apply Proposition 1.4.1 to

(1.4.33) ϑ1(B) = det(AB).

If B = (b1, . . . , bn), with jth column bj, then

(1.4.34) AB = (Ab_1, . . . , Ab_n).

Clearly rule (a) holds for ϑ₁. Also, if B̃ = (b_{σ(1)}, . . . , b_{σ(n)}) is obtained from B by permuting its columns, then AB̃ has columns (Ab_{σ(1)}, . . . , Ab_{σ(n)}), obtained by permuting the columns of AB in the same fashion. Hence rule (b) holds for ϑ₁. Finally, rule (c′) holds for ϑ₁, with r = det A, and (1.4.32) follows. □

Corollary 1.4.4. If A ∈ M(n, F) is invertible, then det A ≠ 0.

Proof. If A is invertible, there exists B ∈ M(n, F) such that AB = I. Then, by (1.4.32), (det A)(det B) = 1, so det A ≠ 0. □
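Multiplicativity can be sanity-checked by brute force on small matrices; a self-contained sketch, with det re-implemented from (1.4.27) and helper names of our choosing:

```python
from itertools import permutations
from math import prod

def det(A):
    # (1.4.27): sum over permutations, signed by the inversion count
    n = len(A)
    def sgn(s):
        inv = sum(s[a] > s[b] for a in range(n) for b in range(a + 1, n))
        return -1 if inv % 2 else 1
    return sum(sgn(s) * prod(A[s[j]][j] for j in range(n))
               for s in permutations(range(n)))

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]
B = [[1, 0, 2], [0, 1, 1], [1, 1, 0]]
print(det(matmul(A, B)) == det(A) * det(B))  # True: (1.4.32)
```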

The converse of Corollary 1.4.4 also holds. Before proving it, it is convenient to show that the determinant is invariant under a certain class of column operations, given as follows.

Proposition 1.4.5. If Ã is obtained from A = (a_1, . . . , a_n) ∈ M(n, F) by adding c a_ℓ to a_k for some c ∈ F, ℓ ≠ k, then

(1.4.35) det Ã = det A.

Proof. By rule (a), det Ã = det A + c det Â, where  is obtained from A by replacing the column a_k by a_ℓ. Hence  has two identical columns, so det  = 0, and (1.4.35) holds. □

We now extend Corollary 1.4.4. Proposition 1.4.6. If A ∈ M(n, F), then A is invertible if and only if det A ≠ 0.

Proof. We have half of this from Corollary 1.4.4. To finish, assume A is not invertible. As seen in §1.3, this implies the columns a_1, . . . , a_n of A are linearly dependent. Hence, for some k,

(1.4.36) a_k + ∑_{ℓ≠k} c_ℓ a_ℓ = 0,

with c_ℓ ∈ F. Now we can apply Proposition 1.4.5 to obtain det A = det Ã, where Ã is obtained by adding ∑_{ℓ≠k} c_ℓ a_ℓ to a_k. But then the kth column of Ã is 0, so det A = det Ã = 0. This finishes the proof of Proposition 1.4.6. □

Further useful facts about determinants arise in the following exercises.

Exercises

1. Show that

(1.4.37) det \begin{pmatrix} 1 & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & a_{n2} & \cdots & a_{nn} \end{pmatrix} = det \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & a_{n2} & \cdots & a_{nn} \end{pmatrix} = det A_{11},

where A_{11} = (a_{jk})_{2≤j,k≤n}.

Hint. Do the first identity using Proposition 1.4.5. Then exploit uniqueness for det on M(n − 1, F).

2. Deduce that det(e_j, a_2, . . . , a_n) = (−1)^{j−1} det A_{1j}, where A_{kj} is formed by deleting the kth column and the jth row from A.

3. Deduce from the first sum in (1.4.17) that

(1.4.38) det A = ∑_{j=1}^n (−1)^{j−1} a_{j1} det A_{1j}.

More generally, for any k ∈ {1, . . . , n},

(1.4.39) det A = ∑_{j=1}^n (−1)^{j−k} a_{jk} det A_{kj}.

This is called an expansion of det A by minors, down the kth column.
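The expansion (1.4.39) translates directly into a recursive evaluation of det A; a sketch with 0-based indices, so the sign (−1)^{j−k} appears as (−1)^{(j−k) mod 2} (names ours):

```python
def det_by_minors(A, k=0):
    # (1.4.39): expand det A down column k; A_{kj} deletes column k and row j
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [[A[i][c] for c in range(n) if c != k]
                 for i in range(n) if i != j]
        total += (-1) ** ((j - k) % 2) * A[j][k] * det_by_minors(minor)
    return total

A = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]
# expanding down any column gives the same answer
print(det_by_minors(A, 0), det_by_minors(A, 1), det_by_minors(A, 2))  # 8 8 8
```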

4. By definition, the cofactor matrix of A is given by

Cof(A)_{jk} = c_{kj} = (−1)^{j−k} det A_{kj}.

Show that

(1.4.40) ∑_{j=1}^n a_{jℓ} c_{kj} = 0, if ℓ ≠ k.

Deduce from this and (1.4.39) that

(1.4.41) Cof(A)^t A = (det A) I.

Hint. Reason as in Exercises 1–3 that the left side of (1.4.40) is equal to

det(a_1, . . . , a_ℓ, . . . , a_ℓ, . . . , a_n),

with a_ℓ in the kth column as well as in the ℓth column. The identity (1.4.41) is known as Cramer's formula. Note how this generalizes (1.4.4).
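Cramer's formula (1.4.41) can likewise be tested numerically. In the sketch below, adjugate builds (Cof A)^t with 0-based indices; the names are ours:

```python
from itertools import permutations
from math import prod

def det(A):
    # (1.4.27)
    n = len(A)
    def sgn(s):
        inv = sum(s[a] > s[b] for a in range(n) for b in range(a + 1, n))
        return -1 if inv % 2 else 1
    return sum(sgn(s) * prod(A[s[j]][j] for j in range(n))
               for s in permutations(range(n)))

def minor(A, row, col):
    return [[A[i][j] for j in range(len(A)) if j != col]
            for i in range(len(A)) if i != row]

def adjugate(A):
    # (Cof A)^t: entry (j, k) is (-1)^{j+k} det of A with row k, column j deleted
    n = len(A)
    return [[(-1) ** ((j + k) % 2) * det(minor(A, k, j)) for k in range(n)]
            for j in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]
d = det(A)
print(matmul(adjugate(A), A) == [[d, 0, 0], [0, d, 0], [0, 0, d]])  # True
```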

5. Show that

(1.4.42) det \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ & a_{22} & \cdots & a_{2n} \\ & & \ddots & \vdots \\ & & & a_{nn} \end{pmatrix} = a_{11} a_{22} ··· a_{nn}.

Hint. Use (1.4.37) and induction. Alternative: Use (1.4.27). Show that σ ∈ S_n, σ(k) ≤ k ∀ k ⇒ σ(k) ≡ k.

Exercises on the cross product

1. If u, v ∈ R³, show that the formula

(1.4.43) w · (u × v) = det \begin{pmatrix} w_1 & u_1 & v_1 \\ w_2 & u_2 & v_2 \\ w_3 & u_3 & v_3 \end{pmatrix}

for u × v = Π(u, v) defines uniquely a bilinear map Π : R³ × R³ → R³. Show that it satisfies

i × j = k, j × k = i, k × i = j,

where {i, j, k} is the standard basis of R³.
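The determinant formula (1.4.43) determines u × v, and a direct computation confirms both the defining identity and i × j = k; a minimal sketch (names ours):

```python
def cross(u, v):
    # components read off from (1.4.43) by expanding down the first column
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(cross(i, j) == k and cross(j, k) == i and cross(k, i) == j)  # True

u, v, w = (1, 2, 3), (4, 5, 6), (7, 8, 10)
lhs = sum(wi * ci for wi, ci in zip(w, cross(u, v)))       # w . (u x v)
rhs = det3([[w[0], u[0], v[0]], [w[1], u[1], v[1]], [w[2], u[2], v[2]]])
print(lhs == rhs)  # True: the identity (1.4.43)
```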

2. We say T ∈ SO(3) provided that T is a real 3 × 3 matrix satisfying T^t T = I and det T > 0 (hence det T = 1). Show that

(1.4.44) T ∈ SO(3) =⇒ T u × T v = T (u × v).

Hint. Multiply the 3 × 3 matrix in Exercise 1 on the left by T.

3. Show that, if θ is the angle between u and v in R³, then

(1.4.45) |u × v| = |u| |v| | sin θ|.

Hint. Check this for u = i, v = ai + bj, and use Exercise 2 to show this suffices.

4. Show that, for all u, v, w, x ∈ R³,

(1.4.46) (u × v) · (w × x) = det \begin{pmatrix} u · w & v · w \\ u · x & v · x \end{pmatrix}.

Taking w = u, x = v, show that this implies (1.4.45). Hint. Using Exercise 2, show that it suffices to check this for

w = i, x = ai + bj, in which case w × x = bk. Then the left side of (1.4.46) is

(u × v) · bk = det \begin{pmatrix} 0 & u · i & v · i \\ 0 & u · j & v · j \\ b & u · k & v · k \end{pmatrix}.

Show that this equals the right side of (1.4.46).

5. Show that κ : R³ → Skew(3), the set of antisymmetric real 3 × 3 matrices, given by

(1.4.47) κ(y_1, y_2, y_3) = \begin{pmatrix} 0 & −y_3 & y_2 \\ y_3 & 0 & −y_1 \\ −y_2 & y_1 & 0 \end{pmatrix},

satisfies

(1.4.48) Kx = y × x, K = κ(y).

Show that, with [A, B] = AB − BA,

(1.4.49) κ(x × y) = [κ(x), κ(y)], Tr(κ(x) κ(y)^t) = 2 x · y.
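The identities (1.4.48)–(1.4.49) are straightforward to verify numerically; a sketch with helper names of our choosing:

```python
def kappa(y):
    # (1.4.47)
    y1, y2, y3 = y
    return [[0, -y3, y2], [y3, 0, -y1], [-y2, y1, 0]]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def matmul(A, B):
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

x, y = (1, 2, 3), (4, 5, 6)

# (1.4.48): K v = y x v, for K = kappa(y)
K = kappa(y)
print(tuple(sum(K[i][j] * x[j] for j in range(3)) for i in range(3))
      == cross(y, x))                                   # True

# (1.4.49), first identity: kappa(x x y) = [kappa(x), kappa(y)]
comm = sub(matmul(kappa(x), kappa(y)), matmul(kappa(y), kappa(x)))
print(comm == kappa(cross(x, y)))                       # True

# (1.4.49), second identity: Tr(kappa(x) kappa(y)^t) = 2 x . y
P = matmul(kappa(x), [list(r) for r in zip(*kappa(y))])
print(sum(P[i][i] for i in range(3)) == 2 * sum(a * b for a, b in zip(x, y)))  # True
```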

The trace of a matrix

Let A ∈ M(n, F) be as in (1.4.5). We define the trace of A as

Tr A = ∑_{j=1}^n a_{jj}.

1. If also B ∈ M(n, F), show that Tr AB = Tr BA.

2. Deduce that if B is invertible, Tr B−1AB = Tr A.

3. Show that, as t → 0, det(I + tA) = 1 + t Tr A + O(t2).

4. Deduce that, if B is invertible,

det(B + tA) = (det B)(1 + t Tr B⁻¹A) + O(t²).

Chapter 2
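Exercise 3 on the trace can be checked numerically on a 3 × 3 example; a sketch (names ours):

```python
def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

A = [[1, 2, 0], [0, 1, 1], [2, 0, 1]]
t = 1e-6
I_tA = [[(1.0 if i == j else 0.0) + t * A[i][j] for j in range(3)]
        for i in range(3)]
err = abs(det3(I_tA) - (1 + t * trace(A)))
print(err < 1e-10)  # True: the discrepancy is O(t^2)
```

Halving t quarters the discrepancy, consistent with the O(t²) error term.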

Multivariable differential calculus

This chapter develops differential calculus on domains in n-dimensional Euclidean space Rⁿ.

In §2.1 we define the derivative of a function F : O → Rᵐ, where O is an open subset of Rⁿ, as a linear map from Rⁿ to Rᵐ. We establish some basic properties, such as the chain rule. We use the one-dimensional integral as a tool to show that, if the matrix of first order partial derivatives of F is continuous on O, then F is differentiable on O. We also discuss two convenient multi-index notations for higher derivatives, and derive the Taylor formula with remainder for a smooth function F on O ⊂ Rⁿ.

In §2.2 we establish the Inverse Function Theorem, stating that a smooth map F : O → Rⁿ with an invertible derivative DF(p) has a smooth inverse defined near q = F(p). We derive the Implicit Function Theorem as a consequence of this. As a tool in proving the Inverse Function Theorem, we use a fixed point theorem known as the Contraction Mapping Principle.

In §2.3 we treat systems of differential equations. We establish a basic existence and uniqueness theorem and also study the smooth dependence of a solution on initial data. We interpret the solution operator as a flow generated by a vector field and introduce the concept of the Lie bracket of vector fields. We also consider the linearization of a system of ODEs about a solution. Within the setting of linear systems, we introduce the matrix exponential as a tool and derive a number of its basic properties.


2.1. The derivative

Let O be an open subset of Rn, and F : O → Rm a continuous function. We say F is differentiable at a point x ∈ O, with derivative L, if L : Rn → Rm is a linear transformation such that, for y ∈ Rn, small,

(2.1.1) F (x + y) = F (x) + Ly + R(x, y) with

∥R(x, y)∥ (2.1.2) → 0 as y → 0. ∥y∥

We denote the derivative at x by DF(x) = L. With respect to the standard bases of Rⁿ and Rᵐ, DF(x) is simply the matrix of partial derivatives,

(2.1.3) DF(x) = (∂F_j/∂x_k) = \begin{pmatrix} ∂F_1/∂x_1 & \cdots & ∂F_1/∂x_n \\ \vdots & & \vdots \\ ∂F_m/∂x_1 & \cdots & ∂F_m/∂x_n \end{pmatrix},

so that, if v = (v_1, . . . , v_n)^t (regarded as a column vector), then

(2.1.4) DF(x)v = \begin{pmatrix} ∑_k (∂F_1/∂x_k) v_k \\ \vdots \\ ∑_k (∂F_m/∂x_k) v_k \end{pmatrix}.

Recall the definition of the partial derivative ∂F_j/∂x_k from §1.1. It will be shown below that F is differentiable whenever all the partial derivatives exist and are continuous on O. In such a case we say F is a C¹ function on O. More generally, F is said to be C^k if all its partial derivatives of order ≤ k exist and are continuous. If F is C^k for all k, we say F is C^∞.

In (2.1.2), we can use the Euclidean norm on Rⁿ and Rᵐ. As seen in §1.2, this norm is defined by

(2.1.5) ∥x∥ = (x_1² + ··· + x_n²)^{1/2}

for x = (x_1, . . . , x_n) ∈ Rⁿ.

An application of the Fundamental Theorem of Calculus, to functions of each variable x_j separately, yields the following. If we assume F : O → Rᵐ is differentiable in each variable separately, and that each ∂F/∂x_j is continuous on O, then

F(x + y) = F(x) + ∑_{j=1}^n [F(x + z_j) − F(x + z_{j−1})]
(2.1.6)  = F(x) + ∑_{j=1}^n A_j(x, y) y_j,
A_j(x, y) = ∫_0^1 (∂F/∂x_j)(x + z_{j−1} + t y_j e_j) dt,

where z_0 = 0, z_j = (y_1, . . . , y_j, 0, . . . , 0), and {e_j} is the standard basis of Rⁿ. Consequently,

F(x + y) = F(x) + ∑_{j=1}^n (∂F/∂x_j)(x) y_j + R(x, y),
(2.1.7)  R(x, y) = ∑_{j=1}^n y_j ∫_0^1 {(∂F/∂x_j)(x + z_{j−1} + t y_j e_j) − (∂F/∂x_j)(x)} dt.

Now (2.1.7) implies F is differentiable on O, as we stated below (2.1.4). Thus we have established the following.

Proposition 2.1.1. If O is an open subset of Rⁿ and F : O → Rᵐ is of class C¹, then F is differentiable at each point x ∈ O.

One can use the Mean Value Theorem in place of the fundamental theorem of calculus and obtain a slightly more general result. See Exercise 0 below for prompts on how to accomplish this. Let us give some examples of derivatives. First, take n = 2, m = 1, and set

(2.1.8) F (x) = (sin x1)(sin x2). Then

(2.1.9) DF(x) = ((cos x_1)(sin x_2), (sin x_1)(cos x_2)).

Next, take n = m = 2 and

(2.1.10) F(x) = \begin{pmatrix} x_1 x_2 \\ x_1² − x_2² \end{pmatrix}.

Then

(2.1.11) DF(x) = \begin{pmatrix} x_2 & x_1 \\ 2x_1 & −2x_2 \end{pmatrix}.

We can replace Rⁿ and Rᵐ by more general finite-dimensional real vector spaces, isomorphic to Euclidean space. For example, the space M(n, R) of real n × n matrices is isomorphic to R^{n²}. Consider the function

(2.1.12) S : M(n, R) −→ M(n, R), S(X) = X².

We have

(2.1.13) (X + Y)² = X² + XY + YX + Y² = X² + DS(X)Y + R(X, Y),

with R(X, Y) = Y², and hence

(2.1.14) DS(X)Y = XY + YX.

For our next example, we take

(2.1.15) O = Gℓ(n, R) = {X ∈ M(n, R) : det X ≠ 0},

which, as shown below, is open in M(n, R). We consider

(2.1.16) Φ : Gℓ(n, R) −→ M(n, R), Φ(X) = X⁻¹,

and compute DΦ(I). We use the following. If, for A ∈ M(n, R),

(2.1.17) ∥A∥ = sup{∥Av∥ : v ∈ Rⁿ, ∥v∥ ≤ 1},

then

(2.1.18) A, B ∈ M(n, R) ⇒ ∥A + B∥ ≤ ∥A∥ + ∥B∥ and ∥AB∥ ≤ ∥A∥ · ∥B∥,

so Y ∈ M(n, R) ⇒ ∥Y^k∥ ≤ ∥Y∥^k. Also

S_k = I − Y + Y² − ··· + (−1)^k Y^k
(2.1.19) ⇒ Y S_k = S_k Y = Y − Y² + Y³ − ··· + (−1)^k Y^{k+1}
⇒ (I + Y) S_k = S_k (I + Y) = I + (−1)^k Y^{k+1},

hence

(2.1.20) ∥Y∥ < 1 =⇒ (I + Y)⁻¹ = ∑_{k=0}^∞ (−1)^k Y^k = I − Y + Y² − ···,

so

(2.1.21) DΦ(I)Y = −Y.

Related calculations show that Gℓ(n, R) is open in M(n, R). In fact, given X ∈ Gℓ(n, R), Y ∈ M(n, R),

(2.1.22) X + Y = X(I + X⁻¹Y),

which by (2.1.20) is invertible as long as

(2.1.23) ∥X⁻¹Y∥ < 1.

One can proceed from here to compute DΦ(X). See the exercises.
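The formula DS(X)Y = XY + YX in (2.1.14) can be compared with a finite-difference approximation of the directional derivative of S(X) = X²; a sketch on 2 × 2 matrices (names ours):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lincomb(A, B, c):
    # A + c*B, entrywise
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

X = [[1.0, 2.0], [3.0, 4.0]]
Y = [[0.0, 1.0], [1.0, 0.0]]
h = 1e-6

# finite-difference approximation of (S(X + hY) - S(X)) / h
S_X = matmul(X, X)
S_Xh = matmul(lincomb(X, Y, h), lincomb(X, Y, h))
num = [[(S_Xh[i][j] - S_X[i][j]) / h for j in range(2)] for i in range(2)]

# (2.1.14): the exact derivative is XY + YX
exact = lincomb(matmul(X, Y), matmul(Y, X), 1.0)
err = max(abs(num[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(err < 1e-4)  # True: the residual R(X, hY) = h^2 Y^2 vanishes to first order
```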

We return to general considerations, and derive the chain rule for the derivative. Let F : O → Rᵐ be differentiable at x ∈ O, as above, let U be a neighborhood of z = F(x) in Rᵐ, and let G : U → R^k be differentiable at z. Consider H = G ∘ F. We have

H(x + y) = G(F(x + y))
 = G(F(x) + DF(x)y + R(x, y))
(2.1.24)  = G(z) + DG(z)(DF(x)y + R(x, y)) + R₁(x, y)
 = G(z) + DG(z) DF(x)y + R₂(x, y),

with

∥R₂(x, y)∥/∥y∥ → 0 as y → 0.

Thus G ∘ F is differentiable at x, and

(2.1.25) D(G ◦ F )(x) = DG(F (x)) · DF (x).

Another useful remark is that, by the Fundamental Theorem of Calculus, applied to φ(t) = F(x + ty),

(2.1.26) F(x + y) = F(x) + ∫_0^1 DF(x + ty)y dt,

provided F is C¹. For a typical application, see (4.1.6). For the study of higher order derivatives of a function, the following result is fundamental.

Proposition 2.1.2. Assume F : O → Rᵐ is of class C², with O open in Rⁿ. Then, for each x ∈ O, 1 ≤ j, k ≤ n,

(2.1.27) (∂/∂x_j)(∂F/∂x_k)(x) = (∂/∂x_k)(∂F/∂x_j)(x).

Proof. It suffices to take m = 1. We label our function f : O → R. For 1 ≤ j ≤ n, we set

(2.1.28) ∆_{j,h} f(x) = (1/h)(f(x + h e_j) − f(x)),

where {e_1, . . . , e_n} is the standard basis of Rⁿ. The mean value theorem (for functions of x_j alone) implies that if ∂_j f = ∂f/∂x_j exists on O, then, for x ∈ O, h > 0 sufficiently small,

(2.1.29) ∆_{j,h} f(x) = ∂_j f(x + α_j h e_j),

for some αj ∈ (0, 1), depending on x and h. Iterating this, if ∂j(∂kf) exists on O, then, for x ∈ O, h > 0 sufficiently small,

∆k,h∆j,hf(x) = ∂k(∆j,hf)(x + αkhek)

(2.1.30) = ∆j,h(∂kf)(x + αkhek)

= ∂j∂kf(x + αkhek + αjhej), with αj, αk ∈ (0, 1). Here we have used the elementary result

(2.1.31) ∂_k ∆_{j,h} f = ∆_{j,h}(∂_k f). □

We deduce the following.

Proposition 2.1.3. If ∂kf and ∂j∂kf exist on O and ∂j∂kf is continuous at x0 ∈ O, then

(2.1.32) ∂_j ∂_k f(x_0) = lim_{h→0} ∆_{k,h} ∆_{j,h} f(x_0).

Clearly

(2.1.33) ∆k,h∆j,hf = ∆j,h∆k,hf, so we have the following, which easily implies Proposition 2.1.2.

Corollary 2.1.4. In the setting of Proposition 2.1.3, if also ∂jf and ∂k∂jf exist on O and ∂k∂jf is continuous at x0, then

(2.1.34) ∂_j ∂_k f(x_0) = ∂_k ∂_j f(x_0).

We now describe two convenient notations to express higher order derivatives of a C^k function f : Ω → R, where Ω ⊂ Rⁿ is open. In one, let J be a k-tuple of integers between 1 and n; J = (j_1, . . . , j_k). We set

(2.1.35) f^{(J)}(x) = ∂_{j_k} ··· ∂_{j_1} f(x), ∂_j = ∂/∂x_j.

We set |J| = k, the total order of differentiation. As we have seen in Proposition 2.1.2, ∂_i ∂_j f = ∂_j ∂_i f provided f ∈ C²(Ω). It follows that, if f ∈ C^k(Ω), then ∂_{j_k} ··· ∂_{j_1} f = ∂_{ℓ_k} ··· ∂_{ℓ_1} f whenever {ℓ_1, . . . , ℓ_k} is a permutation of {j_1, . . . , j_k}. Thus, another convenient notation to use is the following. Let α be an n-tuple of non-negative integers, α = (α_1, . . . , α_n). Then we set

(2.1.36) f^{(α)}(x) = ∂_1^{α_1} ··· ∂_n^{α_n} f(x), |α| = α_1 + ··· + α_n.

Note that, if |J| = |α| = k and f ∈ C^k(Ω),

(2.1.37) f^{(J)}(x) = f^{(α)}(x), with α_i = #{ℓ : j_ℓ = i}.

Correspondingly, there are two expressions for monomials in x = (x_1, . . . , x_n):

(2.1.38) x^J = x_{j_1} ··· x_{j_k}, x^α = x_1^{α_1} ··· x_n^{α_n},

and x^J = x^α provided J and α are related as in (2.1.37). Both these notations are called “multi-index” notations.

We now derive Taylor's formula with remainder for a smooth function F : Ω → R, making use of these multi-index notations. We will apply the one variable formula (1.1.50)–(1.1.51), i.e.,

(2.1.39) φ(t) = φ(0) + φ′(0)t + (1/2)φ′′(0)t² + ··· + (1/k!)φ^{(k)}(0)t^k + r_k(t),

with

(2.1.40) r_k(t) = (1/k!) ∫_0^t (t − s)^k φ^{(k+1)}(s) ds,

given φ ∈ C^{k+1}(I), I = (−a, a). (See Exercise 9 of §0, Exercise 7 of this section, and also Appendix A.4 for further discussion.) Let us assume 0 ∈ Ω, and that the line segment from 0 to x is contained in Ω. We set φ(t) = F(tx), and apply (2.1.39)–(2.1.40) with t = 1. Applying the chain rule, we have

(2.1.41) φ′(t) = ∑_{j=1}^n ∂_j F(tx) x_j = ∑_{|J|=1} F^{(J)}(tx) x^J.

Differentiating again, we have

(2.1.42) φ′′(t) = ∑_{|J|=1,|K|=1} F^{(J+K)}(tx) x^{J+K} = ∑_{|J|=2} F^{(J)}(tx) x^J,

where, if |J| = k, |K| = ℓ, we take J + K = (j_1, . . . , j_k, k_1, . . . , k_ℓ). Inductively, we have

(2.1.43) φ^{(k)}(t) = ∑_{|J|=k} F^{(J)}(tx) x^J.

Hence, from (2.1.39) with t = 1,

(2.1.44) F(x) = F(0) + ∑_{|J|=1} F^{(J)}(0) x^J + ··· + (1/k!) ∑_{|J|=k} F^{(J)}(0) x^J + R_k(x),

or, more briefly,

(2.1.45) F(x) = ∑_{|J|≤k} (1/|J|!) F^{(J)}(0) x^J + R_k(x),

where

(2.1.46) R_k(x) = (1/k!) ∑_{|J|=k+1} (∫_0^1 (1 − s)^k F^{(J)}(sx) ds) x^J.

This gives Taylor's formula with remainder for F ∈ C^{k+1}(Ω), in the J-multi-index notation.

We also want to write the formula in the α-multi-index notation. We have

(2.1.47) ∑_{|J|=k} F^{(J)}(tx) x^J = ∑_{|α|=k} ν(α) F^{(α)}(tx) x^α,

where

(2.1.48) ν(α) = #{J : α = α(J)},

and we define the relation α = α(J) to hold provided the condition (2.1.37) holds, or equivalently provided x^J = x^α. Thus ν(α) is uniquely defined by

(2.1.49) ∑_{|α|=k} ν(α) x^α = ∑_{|J|=k} x^J = (x_1 + ··· + x_n)^k.

To evaluate ν(α), we can expand (x_1 + ··· + x_n)^k in terms of x^α by a repeated application of the binomial formula:

(x_1 + ··· + x_n)^k = (x_1 + (x_2 + ··· + x_n))^k
 = ∑_{α_1≤k} \binom{k}{α_1} x_1^{α_1} (x_2 + ··· + x_n)^{k−α_1}
 = ∑_{α_1+α_2≤k} \binom{k}{α_1} \binom{k−α_1}{α_2} x_1^{α_1} x_2^{α_2} (x_3 + ··· + x_n)^{k−α_1−α_2}
(2.1.50)
 = ···
 = ∑_{|α|=k} \binom{k}{α_1} \binom{k−α_1}{α_2} ··· \binom{k−α_1−···−α_{n−1}}{α_n} x_1^{α_1} ··· x_n^{α_n}
 = ∑_{|α|=k} ν(α) x^α.

We have ν(α) equal to the product of binomial coefficients given above, i.e., to

k!/(α_1!(k−α_1)!) · (k−α_1)!/(α_2!(k−α_1−α_2)!) ··· (k−α_1−···−α_{n−1})!/(α_n!(k−α_1−···−α_n)!) = k!/(α_1! ··· α_n!).

In other words, for |α| = k,

(2.1.51) ν(α) = k!/α!, where α! = α_1! ··· α_n!.

Thus the Taylor formula (2.1.45) can be rewritten

(2.1.52) F(x) = ∑_{|α|≤k} (1/α!) F^{(α)}(0) x^α + R_k(x),

where

(2.1.53) R_k(x) = ∑_{|α|=k+1} ((k+1)/α!) (∫_0^1 (1 − s)^k F^{(α)}(sx) ds) x^α.

The formula (2.1.52)–(2.1.53) holds for F ∈ C^{k+1}. It is significant that (2.1.52), with a variant of (2.1.53), holds for F ∈ C^k. In fact, for such F, we can apply (2.1.53) with k replaced by k − 1, to get

(2.1.54) F(x) = ∑_{|α|≤k−1} (1/α!) F^{(α)}(0) x^α + R_{k−1}(x),

with

(2.1.55) R_{k−1}(x) = ∑_{|α|=k} (k/α!) (∫_0^1 (1 − s)^{k−1} F^{(α)}(sx) ds) x^α.

We can add and subtract F^{(α)}(0) to F^{(α)}(sx) in the integrand above, and obtain the following.

Proposition 2.1.5. If F ∈ C^k on a ball B_r(0), the formula (2.1.52) holds for x ∈ B_r(0), with

(2.1.56) R_k(x) = ∑_{|α|=k} (k/α!) (∫_0^1 (1 − s)^{k−1} [F^{(α)}(sx) − F^{(α)}(0)] ds) x^α.

Remark. Note that (2.1.56) yields the estimate

(2.1.57) |R_k(x)| ≤ ∑_{|α|=k} (|x^α|/α!) sup_{0≤s≤1} |F^{(α)}(sx) − F^{(α)}(0)|.

The term corresponding to |J| = 2 in (2.1.45), or |α| = 2 in (2.1.52), is of particular interest. It is

(2.1.58) (1/2) ∑_{|J|=2} F^{(J)}(0) x^J = (1/2) ∑_{j,k=1}^n (∂²F/∂x_k∂x_j)(0) x_j x_k.

We define the Hessian of a C² function F : O → R as an n × n matrix:

(2.1.59) D²F(y) = ((∂²F/∂x_k∂x_j)(y)).

Then the power series expansion of second order about 0 for F takes the form

(2.1.60) F(x) = F(0) + DF(0)x + (1/2) x · D²F(0)x + R₂(x),

where, by (2.1.57),

(2.1.61) |R₂(x)| ≤ C_n |x|² sup_{0≤s≤1, |α|=2} |F^{(α)}(sx) − F^{(α)}(0)|.

In all these formulas we can translate coordinates and expand about y ∈ O. For example, (2.1.60) extends to

(2.1.62) F(x) = F(y) + DF(y)(x − y) + (1/2)(x − y) · D²F(y)(x − y) + R₂(x, y),

with

(2.1.63) |R₂(x, y)| ≤ C_n |x − y|² sup_{0≤s≤1, |α|=2} |F^{(α)}(y + s(x − y)) − F^{(α)}(y)|.

Example. If we take F(x) as in (2.1.8), so DF(x) is as in (2.1.9), then

D²F(x) = \begin{pmatrix} −sin x_1 sin x_2 & cos x_1 cos x_2 \\ cos x_1 cos x_2 & −sin x_1 sin x_2 \end{pmatrix}.
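For this F, the second-order expansion (2.1.62) can be written out with the exact derivatives above and compared with F itself; a sketch (names and sample points are ours):

```python
from math import sin, cos

def F(x1, x2):
    return sin(x1) * sin(x2)

def taylor2(y1, y2, x1, x2):
    # (2.1.62) about y, with DF from (2.1.9) and the Hessian in the example above
    s1, c1, s2, c2 = sin(y1), cos(y1), sin(y2), cos(y2)
    u, v = x1 - y1, x2 - y2
    return (s1 * s2 + c1 * s2 * u + s1 * c2 * v
            + 0.5 * (-s1 * s2 * u * u + 2 * c1 * c2 * u * v - s1 * s2 * v * v))

y, x = (0.3, 0.4), (0.31, 0.41)
err = abs(F(*x) - taylor2(*y, *x))
print(err < 1e-5)  # True: the remainder R2(x, y) is O(|x - y|^3)
```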

The results (2.1.62)–(2.1.63) are useful for extremal problems, i.e., determining where a sufficiently smooth function F : O → R has local maxima and local minima. Clearly if F ∈ C¹(O) and F has a local maximum or minimum at x_0 ∈ O, then DF(x_0) = 0. In such a case, we say x_0 is a critical point of F. To check what kind of critical point x_0 is, we look at the n × n matrix A = D²F(x_0), assuming F ∈ C²(O). By Proposition 2.1.2, A is a symmetric matrix. A basic result in linear algebra is that if A is a real, symmetric n × n matrix, then Rⁿ has an orthonormal basis of eigenvectors, {v_1, . . . , v_n}, satisfying Av_j = λ_j v_j; the real numbers λ_j are the eigenvalues of A. We say A is positive definite if all λ_j > 0, and we say A is negative definite if all λ_j < 0. We say A is strongly indefinite if some λ_j > 0 and another λ_k < 0. Equivalently, given a real, symmetric matrix A,

(2.1.64) A positive definite ⇐⇒ v · Av ≥ C|v|², A negative definite ⇐⇒ v · Av ≤ −C|v|²,

for some C > 0, all v ∈ Rⁿ, and

(2.1.65) A strongly indefinite ⇐⇒ ∃ v, w ∈ Rⁿ, nonzero, such that v · Av ≥ C|v|², w · Aw ≤ −C|w|²,

for some C > 0. In light of (2.1.45)–(2.1.46), we have the following result.

Proposition 2.1.6. Assume F ∈ C²(O) is real valued, O open in Rⁿ. Let x_0 ∈ O be a critical point for F. Then

(i) D²F(x_0) positive definite ⇒ F has a local minimum at x_0,
(ii) D²F(x_0) negative definite ⇒ F has a local maximum at x_0,
(iii) D²F(x_0) strongly indefinite ⇒ F has neither a local maximum nor a local minimum at x_0.
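Proposition 2.1.6 suggests a numerical test: approximate the Hessian by second differences and inspect the signs of its eigenvalues. A sketch for F(x) = (sin x₁)(sin x₂), from (2.1.8), at its critical point (π/2, π/2), using the closed form for the eigenvalues of a symmetric 2 × 2 matrix (names and step size are ours):

```python
from math import sin, cos, pi, sqrt

def F(x1, x2):
    return sin(x1) * sin(x2)

def hessian2(f, x, h=1e-4):
    # 2 x 2 Hessian by second-order central differences
    def d2(i, j):
        e = [(h, 0.0), (0.0, h)]
        return (f(x[0] + e[i][0] + e[j][0], x[1] + e[i][1] + e[j][1])
              - f(x[0] + e[i][0] - e[j][0], x[1] + e[i][1] - e[j][1])
              - f(x[0] - e[i][0] + e[j][0], x[1] - e[i][1] + e[j][1])
              + f(x[0] - e[i][0] - e[j][0], x[1] - e[i][1] - e[j][1])) / (4 * h * h)
    return [[d2(0, 0), d2(0, 1)], [d2(1, 0), d2(1, 1)]]

H = hessian2(F, (pi / 2, pi / 2))
a, b, d = H[0][0], H[0][1], H[1][1]
# eigenvalues of the symmetric matrix [[a, b], [b, d]]
lam1 = (a + d) / 2 + sqrt(((a - d) / 2) ** 2 + b ** 2)
lam2 = (a + d) / 2 - sqrt(((a - d) / 2) ** 2 + b ** 2)
print(lam1 < 0 and lam2 < 0)  # True: negative definite, a local maximum
```

This agrees with the exact Hessian at (π/2, π/2), which is −I.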

In case (iii), we say x0 is a saddle point for F . The following is a test for positive definiteness.

Proposition 2.1.7. Let A = (aij) be a real, symmetric, n × n matrix. For 1 ≤ ℓ ≤ n, form the ℓ × ℓ matrix Aℓ = (aij)1≤i,j≤ℓ. Then

(2.1.66) A positive definite ⇐⇒ det Aℓ > 0, ∀ ℓ ∈ {1, . . . , n}.

Regarding the implication ⇒, note that if A is positive definite, then det A = det An is the product of its eigenvalues, all > 0, hence is > 0. Also in this case, it follows from the hypothesis on the left side of (2.1.66) that each Aℓ must be positive definite, hence have positive determinant, so we have ⇒. The implication ⇐ is easy enough for 2 × 2 matrices. If A is symmetric and det A > 0, then either both its eigenvalues are positive (so A is positive definite) or both are negative (so A is negative definite). In the latter case, A1 = (a11) must be negative, so we have ⇐ in this case. We prove ⇐ for n ≥ 3, using induction. The inductive hypothesis implies that if det Aℓ > 0 for each ℓ ≤ n, then An−1 is positive definite. The next lemma then guarantees that A = An has at least n − 1 positive eigenvalues. The hypothesis that det A > 0 does not allow that the remaining eigenvalue be ≤ 0, so all the eigenvalues of A must be positive. Thus Proposition 2.1.7 is proven, once we have the following.

Lemma 2.1.8. In the setting of Proposition 2.1.7, if An−1 is positive defi- nite, then A = An has at least n − 1 positive eigenvalues.

Proof. Since A is symmetric, Rⁿ has an orthonormal basis v_1, . . . , v_n of eigenvectors of A; Av_j = λ_j v_j. If the conclusion of the lemma is false, at least two of the eigenvalues, say λ_1, λ_2, are ≤ 0. Let W = Span(v_1, v_2), so

w ∈ W =⇒ w · Aw ≤ 0.

Since W has dimension 2, Rⁿ⁻¹ ⊂ Rⁿ satisfies Rⁿ⁻¹ ∩ W ≠ 0, so there exists a nonzero w ∈ Rⁿ⁻¹ ∩ W, and then

w · A_{n−1} w = w · Aw ≤ 0,

contradicting the hypothesis that A_{n−1} is positive definite. □
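The criterion (2.1.66) translates into a few lines of code; a sketch with a recursive determinant (names ours):

```python
def det(A):
    # cofactor expansion down the first column
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (j % 2) * A[j][0]
               * det([row[1:] for i, row in enumerate(A) if i != j])
               for j in range(n))

def positive_definite(A):
    # (2.1.66): all leading minors det A_l must be positive
    return all(det([row[:l] for row in A[:l]]) > 0
               for l in range(1, len(A) + 1))

A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # leading minors 2, 3, 4
B = [[1, 2], [2, 1]]                         # leading minors 1, -3
print(positive_definite(A), positive_definite(B))  # True False
```

The first matrix is the classic positive definite tridiagonal example; the second has det B < 0, so by (1.4.27) its eigenvalues have opposite signs and it is strongly indefinite.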

Remark. Given (2.1.66), we see by taking A ↦ −A that if A is a real, symmetric n × n matrix,

(2.1.67) A negative definite ⇐⇒ (−1)^ℓ det A_ℓ > 0, ∀ ℓ ∈ {1, . . . , n}.

We return to higher order power series formulas with remainder and complement Proposition 2.1.5. Let us go back to (2.1.39)–(2.1.40) and note that the integral in (2.1.40) is 1/(k+1) times a weighted average of φ^{(k+1)}(s) over s ∈ [0, t]. Hence we can write

r_k(t) = (1/(k+1)!) φ^{(k+1)}(θt), for some θ ∈ [0, 1],

if φ is of class C^{k+1}. This is the Lagrange form of the remainder; see Appendix A.4 for more on this, and for a comparison with the Cauchy form of the remainder. If φ is of class C^k, we can replace k + 1 by k in (2.1.39) and write

(2.1.68) φ(t) = φ(0) + φ′(0)t + ··· + (1/(k−1)!) φ^{(k−1)}(0)t^{k−1} + (1/k!) φ^{(k)}(θt)t^k,

for some θ ∈ [0, 1]. Plugging (2.1.68) into (2.1.43) for φ(t) = F(tx) gives

(2.1.69) F(x) = ∑_{|J|≤k−1} (1/|J|!) F^{(J)}(0) x^J + (1/k!) ∑_{|J|=k} F^{(J)}(θx) x^J,

for some θ ∈ [0, 1] (depending on x and on k, but not on J), when F is of class C^k on a neighborhood B_r(0) of 0 ∈ Rⁿ. Similarly, using the α-multi-index notation, we have, as an alternative to (2.1.54)–(2.1.55),

(2.1.70) F(x) = ∑_{|α|≤k−1} (1/α!) F^{(α)}(0) x^α + ∑_{|α|=k} (1/α!) F^{(α)}(θx) x^α,

for some θ ∈ [0, 1] (depending on x and on |α|, but not on α), if F ∈ C^k(B_r(0)). Note also that

(2.1.71) (1/2) ∑_{|J|=2} F^{(J)}(θx) x^J = (1/2) ∑_{j,k=1}^n (∂²F/∂x_k∂x_j)(θx) x_j x_k = (1/2) x · D²F(θx)x,

with D²F(y) as in (2.1.59), so if F ∈ C²(B_r(0)), we have, as an alternative to (2.1.60),

(2.1.72) F(x) = F(0) + DF(0)x + (1/2) x · D²F(θx)x,

for some θ ∈ [0, 1].

We next complement the multi-index notations for higher derivatives of a function F by a multi-linear notation, defined as follows. If k ∈ N, F ∈ C^k(U), and y ∈ U ⊂ Rⁿ, set

(2.1.73) D^k F(y)(u_1, . . . , u_k) = ∂_{t_1} ··· ∂_{t_k} F(y + t_1 u_1 + ··· + t_k u_k) |_{t_1=···=t_k=0},

for u_1, . . . , u_k ∈ Rⁿ. For k = 1, this formula is equivalent to the definition of DF given at the beginning of this section. For k = 2, we have

(2.1.74) D²F(y)(u, v) = u · D²F(y)v,

with D²F(y) on the right as in (2.1.59). Generally, (2.1.73) defines D^k F(y) as a symmetric, k-linear form in u_1, . . . , u_k ∈ Rⁿ. We can relate (2.1.73) to the J-multi-index notation as follows. We start with

(2.1.75) ∂_{t_1} F(y + t_1 u_1 + ··· + t_k u_k) = ∑_{|J|=1} F^{(J)}(y + Σ t_j u_j) u_1^J,

and inductively obtain

(2.1.76) ∂_{t_1} ··· ∂_{t_k} F(y + Σ t_j u_j) = ∑_{|J_1|=···=|J_k|=1} F^{(J_1+···+J_k)}(y + Σ t_j u_j) u_1^{J_1} ··· u_k^{J_k},

hence

(2.1.77) D^k F(y)(u_1, . . . , u_k) = ∑_{|J_1|=···=|J_k|=1} F^{(J_1+···+J_k)}(y) u_1^{J_1} ··· u_k^{J_k}.

In particular, if u_1 = ··· = u_k = u,

(2.1.78) D^k F(y)(u, . . . , u) = ∑_{|J|=k} F^{(J)}(y) u^J.

Hence (2.1.69) yields the multi-linear Taylor formula with remainder

(2.1.79) F(x) = F(0) + DF(0)x + ··· + (1/(k−1)!) D^{k−1}F(0)(x, . . . , x) + (1/k!) D^k F(θx)(x, . . . , x),

for some θ ∈ [0, 1], if F ∈ C^k(B_r(0)). In fact, rather than appealing to (2.1.69), we can note that

φ(t) = F(tx) ⟹ φ^{(k)}(t) = ∂_{t_1} ⋯ ∂_{t_k} φ(t + t_1 + ⋯ + t_k) |_{t_1=⋯=t_k=0} = D^k F(tx)(x, …, x),

and obtain (2.1.79) directly from (2.1.68). We can also use the notation

(2.1.80) D^j F(y) x^{⊗j} = D^j F(y)(x, …, x),

with j copies of x within the last set of parentheses, and rewrite (2.1.79) as

(2.1.81) F(x) = F(0) + DF(0)x + ⋯ + (1/(k − 1)!) D^{k−1}F(0) x^{⊗(k−1)} + (1/k!) D^k F(θx) x^{⊗k}.

Note how (2.1.79) and (2.1.81) generalize (2.1.72).
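The second-order formula (2.1.72) can be probed numerically. The following sketch (the test function F(x, y) = eˣ cos y and the sample point are our own choices, not from the text) checks the equivalent statement that the error after the quadratic Taylor term is O(|x|³): halving x should cut the error by roughly 2³ = 8.

```python
import math

# Numerical illustration (our choice of F) of the second-order Taylor
# formula (2.1.72).  Replacing theta*x by 0 changes the remainder only by
# o(|x|^2), so F(tx) - F(0) - t*DF(0)x - (t^2/2) x.D^2F(0)x = O(t^3).
# For F(x, y) = exp(x)*cos(y): DF(0) = (1, 0), D^2F(0) = [[1, 0], [0, -1]].

def F(x, y):
    return math.exp(x) * math.cos(y)

def DF0(x, y):
    # DF(0,0) applied to (x, y)
    return 1.0 * x + 0.0 * y

def quad0(x, y):
    # x . D^2F(0) x
    return 1.0 * x * x - 1.0 * y * y

x, y = 0.3, 0.2
errs = []
for t in (1.0, 0.5, 0.25, 0.125):
    err = F(t * x, t * y) - F(0, 0) - t * DF0(x, y) - 0.5 * t**2 * quad0(x, y)
    errs.append(abs(err))

# Each halving of t should shrink the error by roughly a factor of 8.
ratios = [errs[i] / errs[i + 1] for i in range(3)]
print(ratios)
```

The ratios hover near 8, as the cubic-order remainder predicts; they drift from 8 only by higher-order corrections.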

Convergent power series and their derivatives

Here we consider functions given by convergent power series, of the form

(2.1.82) F(x) = ∑_{α≥0} b_α x^α.

We allow b_α ∈ ℂ, and take x = (x_1, …, x_n) ∈ ℝ^n, with x^α given by (2.1.38). Here is our first result.

Proposition 2.1.9. Assume there exist y ∈ ℝ^n and C_0 < ∞ such that

(2.1.83) |y_k| = a_k > 0, ∀ k, |b_α y^α| ≤ C_0, ∀ α.

Then, for each δ ∈ (0, 1), the series (2.1.82) converges absolutely and uniformly on each set

(2.1.84) R_δ = {x ∈ ℝ^n : |x_k| ≤ (1 − δ)a_k, ∀ k}.

The sum F(x) is continuous on R̃ = {x ∈ ℝ^n : |x_k| < a_k, ∀ k}.

Proof. We have

(2.1.85) x ∈ R_δ ⟹ |b_α x^α| ≤ C_0 (1 − δ)^{|α|}, ∀ α,

hence

(2.1.86) ∑_{α≥0} |b_α x^α| ≤ C_0 ∑_{α≥0} (1 − δ)^{|α|} < ∞.

Thus the power series (2.1.82) is absolutely convergent whenever x ∈ R_δ. We also have, for each N ∈ ℕ,

(2.1.87) F(x) = ∑_{|α|≤N} b_α x^α + R_N(x),

and, for x ∈ R_δ,

(2.1.88) |R_N(x)| ≤ ∑_{|α|>N} |b_α x^α| ≤ C_0 ∑_{|α|>N} (1 − δ)^{|α|}

= ε_N → 0 as N → ∞.

This shows that RN (x) → 0 uniformly for x ∈ Rδ, and completes the proof of Proposition 2.1.9. 

We next discuss differentiability of power series.

Proposition 2.1.10. In the setting of Proposition 2.1.9, F is differentiable on R̃ and, for each j ∈ {1, …, n},

(2.1.89) (∂F/∂x_j)(x) = ∑_{α≥ε_j} α_j b_α x^{α−ε_j}, ∀ x ∈ R̃.

Here, we set ε_j = (0, …, 1, …, 0), with the 1 in the jth slot. It is convenient to begin the proof of Proposition 2.1.10 with the following.

Lemma 2.1.11. In the setting of Proposition 2.1.9, for each j ∈ {1, …, n},

(2.1.90) G_j(x) = ∑_{α≥ε_j} α_j b_α x^{α−ε_j}

is absolutely convergent for x ∈ R̃, uniformly on R_δ for each δ ∈ (0, 1), therefore defining G_j as a continuous function on R̃.

Proof. Take a = (a_1, …, a_n), with a_j as in (2.1.83). Given x ∈ R_δ, we have

(2.1.91) ∑_{α≥ε_j} α_j |b_α x^{α−ε_j}| ≤ ∑_{α≥ε_j} α_j (1 − δ)^{|α|−1} |b_α a^{α−ε_j}| ≤ (C_0/(a_j(1 − δ))) ∑_{α≥0} α_j (1 − δ)^{|α|},

and this is

(2.1.92) ≤ M_δ < ∞, ∀ δ ∈ (0, 1).

This gives the asserted convergence on R_δ, and hence defines the function G_j, continuous on R̃. ∎

To prove Proposition 2.1.10, we need to show that

(2.1.93) ∂F/∂x_j = G_j on R̃,

for each j. Let us use the notation

(2.1.94) x̂_j = (x_1, …, x_{j−1}, 0, x_{j+1}, …, x_n) = x − x_j e_j,

where e_j is the jth standard basis vector of ℝ^n. Now, given x ∈ R_δ, δ ∈ (0, 1), the uniform convergence of (2.1.90) on R_δ implies

(2.1.95) ∫_0^{x_j} G_j(x̂_j + t e_j) dt = ∑_{α≥ε_j} α_j b_α ∫_0^{x_j} (x̂_j + t e_j)^{α−ε_j} dt = ∑_{α≥ε_j} α_j b_α α_j^{−1} x^α = ∑_{α≥ε_j} b_α x^α = F(x) − F(x̂_j).

Applying ∂/∂x_j to the left side of (2.1.95) and using the fundamental theorem of calculus then yields (2.1.93), as desired. This gives the identity (2.1.89). Since each G_j is continuous on R̃, this implies F is differentiable on R̃.

We can iterate Proposition 2.1.10, obtaining ∂_k∂_jF(x) = ∂_kG_j(x) as a convergent power series on R̃, etc. In particular, we have the following.

Corollary 2.1.12. In the setting of Proposition 2.1.9, we have F ∈ C^∞(R̃).
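Proposition 2.1.10 can be illustrated numerically. In the sketch below (our own example, not from the text), F(x, y) = ∑_{j,k≥0} xʲyᵏ = 1/((1 − x)(1 − y)), so b_α ≡ 1 and we may take a₁ = a₂ = 1 in (2.1.83); the termwise-differentiated series should converge to ∂F/∂x = 1/((1 − x)²(1 − y)) on R̃.

```python
# Termwise differentiation of a convergent power series (Proposition
# 2.1.10), on the illustrative example F(x, y) = sum_{j,k} x^j y^k.
# The differentiated series sum_{j,k} j x^(j-1) y^k should converge to
# dF/dx = 1/((1-x)^2 (1-y)) for |x|, |y| < 1.

def dFdx_series(x, y, N=200):
    # truncated differentiated series; the tail is negligible for these x, y
    return sum(j * x**(j - 1) * y**k for j in range(1, N) for k in range(N))

x, y = 0.4, -0.3   # a point of R~ = {|x| < 1, |y| < 1}
exact = 1.0 / ((1.0 - x)**2 * (1.0 - y))
approx = dFdx_series(x, y)
print(abs(approx - exact))  # small
```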

Exercises

0. Here we provide a path to a strengthening of Proposition 2.1.1. Let O ⊂ ℝ^n be open, f : O → ℝ. Assume ∂f/∂x_j exists on O for each j. Fix x ∈ O and assume that

(2.1.96) ∂f/∂x_j is continuous at x, for each j.

Task: prove that f is differentiable at x.
Hint. Start as in (2.1.6), with

f(x + y) = f(x) + ∑_{j=1}^n {f(x + z_j) − f(x + z_{j−1})},

where z_0 = 0, z_j = (y_1, …, y_j, 0, …, 0) = z_{j−1} + y_j e_j, z_n = y. Deduce from the mean value theorem that, for each j,

f(x + z_j) − f(x + z_{j−1}) = (∂f/∂x_j)(x + z_{j−1} + θ_j y_j e_j) y_j,

for some θ_j ∈ (0, 1). Deduce that

f(x + y) = f(x) + ∑_{j=1}^n (∂f/∂x_j)(x) y_j + R(x, y),

where

R(x, y) = ∑_{j=1}^n {(∂f/∂x_j)(x + z_{j−1} + θ_j y_j e_j) − (∂f/∂x_j)(x)} y_j.

Show that the hypothesis (2.1.96) implies that R(x, y) = o(∥y∥).

1. Consider the following function f : R2 → R: f(x, y) = (cos x)(cos y). Find all its critical points, and determine which of these are local maxima, local minima, and saddle points.

2. Let M(n, R) denote the space of real n×n matrices, and let Ω ⊂ M(n, R) be open. Assume F,G :Ω → M(n, R) are of class C1. Show that H(X) = F (X)G(X) defines a C1 map H :Ω → M(n, R), and DH(X)Y = DF (X)YG(X) + F (X)DG(X)Y.
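The product-rule identity in Exercise 2 can be sanity-checked with finite differences. In this sketch the maps F(X) = X², G(X) = X³ (so DF(X)Y = XY + YX and DG(X)Y = X²Y + XYX + YX²) and the 2 × 2 matrices are our own illustrative choices.

```python
# Finite-difference check of DH(X)Y = DF(X)Y G(X) + F(X) DG(X)Y for
# the illustrative maps F(X) = X^2, G(X) = X^3 on 2x2 real matrices.

def mm(A, B):
    # 2x2 matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(A, B, s=1.0):
    # entrywise A + s*B
    return [[A[i][j] + s * B[i][j] for j in range(2)] for i in range(2)]

X = [[1.0, 2.0], [0.5, -1.0]]
Y = [[0.3, -0.7], [1.1, 0.2]]

F = lambda A: mm(A, A)
G = lambda A: mm(mm(A, A), A)
H = lambda A: mm(F(A), G(A))

# exact directional derivative predicted by the product rule
DFY = lin(mm(X, Y), mm(Y, X))                      # XY + YX
DGY = lin(lin(mm(mm(X, X), Y), mm(mm(X, Y), X)),   # X^2 Y + X Y X + Y X^2
          mm(mm(Y, X), X))
exact = lin(mm(DFY, G(X)), mm(F(X), DGY))

# centered difference (H(X + hY) - H(X - hY)) / (2h)
h = 1e-6
Hp, Hm = H(lin(X, Y, h)), H(lin(X, Y, -h))
fd = [[(Hp[i][j] - Hm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

err = max(abs(fd[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(err)  # small
```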

3. Let Gl(n, ℝ) ⊂ M(n, ℝ) denote the set of invertible matrices. Show that

Φ : Gl(n, ℝ) → M(n, ℝ), Φ(X) = X^{−1}

is of class C^1 and that

DΦ(X)Y = −X^{−1}YX^{−1}.

Hint. The series expansion (2.1.20) should be useful.
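The formula DΦ(X)Y = −X⁻¹YX⁻¹ asserted in Exercise 3 can likewise be checked by a centered difference quotient; the 2 × 2 matrices below are arbitrary illustrative choices of ours.

```python
# Finite-difference check of D Phi(X)Y = -X^{-1} Y X^{-1} for Phi(X) = X^{-1}.

def inv2(A):
    # explicit 2x2 inverse
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

X = [[2.0, 1.0], [0.0, 3.0]]   # invertible
Y = [[0.4, -1.0], [0.7, 0.2]]

Xi = inv2(X)
M = mm(mm(Xi, Y), Xi)
exact = [[-M[i][j] for j in range(2)] for i in range(2)]

h = 1e-6
Xp = [[X[i][j] + h * Y[i][j] for j in range(2)] for i in range(2)]
Xm = [[X[i][j] - h * Y[i][j] for j in range(2)] for i in range(2)]
Ip, Im = inv2(Xp), inv2(Xm)
fd = [[(Ip[i][j] - Im[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

err = max(abs(fd[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(err)  # small
```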

3A. Define S, Φ, F : Gl(n, ℝ) → M(n, ℝ) by

S(X) = X², Φ(X) = X^{−1}, F(X) = X^{−2}.

Compute DF(X)Y using each of the following approaches:
(a) Take F(X) = Φ(X)Φ(X) and use the product rule (Exercise 2).
(b) Take F(X) = Φ(S(X)) and use the chain rule.
(c) Take F(X) = S(Φ(X)) and use the chain rule.

4. Identify ℝ² and ℂ via z = x + iy. Then multiplication by i on ℂ corresponds to applying

J = ( 0  −1 )
    ( 1   0 ).

Let O ⊂ ℝ² be open, f : O → ℝ² be C^1. Say f = (u, v). Regard Df(x, y) as a 2 × 2 real matrix. One says f is holomorphic, or complex-analytic, provided the Cauchy-Riemann equations hold:

∂u/∂x = ∂v/∂y, ∂u/∂y = −∂v/∂x.

Show that this is equivalent to the condition

Df(x, y) J = J Df(x, y).

Generalize to O open in ℂ^m, f : O → ℂ^n.

5. Let f be C^1 on a region in ℝ² containing [a, b] × {y}. Show that, as h → 0,

(1/h)[f(x, y + h) − f(x, y)] → (∂f/∂y)(x, y), uniformly on [a, b] × {y}.

Hint. Show that the left side is equal to

(1/h) ∫_0^h (∂f/∂y)(x, y + s) ds,

and use the uniform continuity of ∂f/∂y on [a, b] × [y − δ, y + δ]; cf. Proposition A.1.16.

6. In the setting of Exercise 5, show that

(d/dy) ∫_a^b f(x, y) dx = ∫_a^b (∂f/∂y)(x, y) dx.
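A hedged numerical check of Exercise 6, with an integrand of our own choosing, f(x, y) = sin(xy) on [0, 1], and a hand-rolled composite Simpson rule: the difference quotient of y ↦ ∫₀¹ f dx should match ∫₀¹ ∂f/∂y dx.

```python
import math

# Numerical check that d/dy of the integral equals the integral of df/dy,
# for the illustrative integrand f(x, y) = sin(x*y) on [0, 1].

def simpson(g, a, b, n=200):
    # composite Simpson rule (n even)
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3

f = lambda x, y: math.sin(x * y)
df_dy = lambda x, y: x * math.cos(x * y)

y0, h = 0.7, 1e-5
lhs = (simpson(lambda x: f(x, y0 + h), 0, 1)
       - simpson(lambda x: f(x, y0 - h), 0, 1)) / (2 * h)
rhs = simpson(lambda x: df_dy(x, y0), 0, 1)
print(abs(lhs - rhs))  # small
```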

7. Considering the power series

f(x) = f(y) + f′(y)(x − y) + ⋯ + (f^{(j)}(y)/j!)(x − y)^j + R_j(x, y),

show that

∂R_j/∂y = −(1/j!) f^{(j+1)}(y)(x − y)^j, R_j(x, x) = 0.

Use this to re-derive (1.1.51), and hence (2.1.39)–(2.1.40).

We define “big oh” and “little oh” notation:

f(x) = O(x) (as x → 0) ⟺ |f(x)/x| ≤ C as x → 0,
f(x) = o(x) (as x → 0) ⟺ f(x)/x → 0 as x → 0.

8. Let O ⊂ ℝ^n be open and y ∈ O. Show that

f ∈ C^{k+1}(O) ⟹ f(x) = ∑_{|α|≤k} (1/α!) f^{(α)}(y)(x − y)^α + O(|x − y|^{k+1}),

f ∈ C^k(O) ⟹ f(x) = ∑_{|α|≤k} (1/α!) f^{(α)}(y)(x − y)^α + o(|x − y|^k).

9. Assume G : U → O,F : O → Ω. Show that (2.1.97) F,G ∈ C1 =⇒ F ◦ G ∈ C1. More generally, show that, for k ∈ N, (2.1.98) F,G ∈ Ck =⇒ F ◦ G ∈ Ck.

Hint. Write H = F ◦ G, with h_ℓ(x) = f_ℓ(g_1(x), …, g_n(x)), and use (2.1.25) to get

(2.1.99) ∂_j h_ℓ(x) = ∑_{k=1}^n ∂_k f_ℓ(g_1, …, g_n) ∂_j g_k.

Show that this yields (2.1.97). To proceed, deduce from (2.1.99) that

(2.1.100) ∂_{j_1}∂_{j_2} h_ℓ(x) = ∑_{k_1,k_2=1}^n ∂_{k_1}∂_{k_2} f_ℓ(g_1, …, g_n)(∂_{j_1} g_{k_1})(∂_{j_2} g_{k_2}) + ∑_{k=1}^n ∂_k f_ℓ(g_1, …, g_n) ∂_{j_1}∂_{j_2} g_k.

Use this to get (2.1.98) for k = 2. Proceeding inductively, show that there exist constants C(µ, J^#, k^#) = C(µ, J_1, …, J_µ, k_1, …, k_µ) such that if F, G ∈ C^k and |J| ≤ k,

(2.1.101) h_ℓ^{(J)}(x) = ∑ C(µ, J^#, k^#) g_{k_1}^{(J_1)} ⋯ g_{k_µ}^{(J_µ)} f_ℓ^{(k_1,…,k_µ)}(g_1, …, g_n),

where the sum is over

µ ≤ |J|, J_1 + ⋯ + J_µ ∼ J, |J_ν| ≥ 1,

and J_1 + ⋯ + J_µ ∼ J means J is a rearrangement of J_1 + ⋯ + J_µ. Show that (2.1.98) follows from this.

10. Show that the map Φ : Gl(n, ℝ) → Gl(n, ℝ) given by Φ(X) = X^{−1} is C^k for each k, i.e., Φ ∈ C^∞.
Hint. Start with the material of Exercise 3. Write DΦ(X)Y = −X^{−1}YX^{−1} as

∂_{ℓm}Φ(X) = (∂/∂x_{ℓm})Φ(X) = DΦ(X)E_{ℓm} = −Φ(X)E_{ℓm}Φ(X),

where X = (x_{ℓm}) and E_{ℓm} has just one nonzero entry, at position (ℓ, m). Iterate this to get

∂_{ℓ_2m_2}∂_{ℓ_1m_1}Φ(X) = −(∂_{ℓ_2m_2}Φ(X))E_{ℓ_1m_1}Φ(X) − Φ(X)E_{ℓ_1m_1}(∂_{ℓ_2m_2}Φ(X)),

and continue.

Exercises 11–13 deal with properties of the determinant, as a differentiable function on spaces of matrices.

11. Let M(n, ℝ) be the space of n × n matrices with real coefficients, det : M(n, ℝ) → ℝ the determinant. Show that, if I is the identity matrix,

D det(I)B = Tr B,

i.e.,

(d/dt) det(I + tB)|_{t=0} = Tr B.
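Exercise 11 can be previewed numerically: for a 3 × 3 matrix B of our choosing, a centered difference quotient of t ↦ det(I + tB) at t = 0 should reproduce Tr B.

```python
# Finite-difference check of (d/dt) det(I + tB)|_{t=0} = Tr B, for an
# arbitrary illustrative 3x3 matrix B.

def det3(A):
    # cofactor expansion along the first row
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

B = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.5], [0.0, 3.0, -1.25]]
trB = B[0][0] + B[1][1] + B[2][2]

h = 1e-6
I = [[float(i == j) for j in range(3)] for i in range(3)]
Ap = [[I[i][j] + h * B[i][j] for j in range(3)] for i in range(3)]
Am = [[I[i][j] - h * B[i][j] for j in range(3)] for i in range(3)]
fd = (det3(Ap) - det3(Am)) / (2 * h)

print(abs(fd - trB))  # small
```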

12. If A(t) = (a_{jk}(t)) is a smooth curve in M(n, ℝ), use the expansion of (d/dt) det A(t) as a sum of n determinants, in which the rows of A(t) are successively differentiated, to show that

(d/dt) det A(t) = Tr(Cof(A(t))^t · A′(t)),

and deduce that, for A, B ∈ M(n, ℝ),

D det(A)B = Tr(Cof(A)^t · B).

Here Cof(A), the cofactor matrix, is defined in Exercise 4 of §1.4.

13. Suppose A ∈ M(n, ℝ) is invertible. Using

det(A + tB) = (det A) det(I + tA^{−1}B),

show that

D det(A)B = (det A) Tr(A^{−1}B).

Comparing this result with that of Exercise 12, establish Cramer's formula:

(det A)A^{−1} = Cof(A)^t.

Compare the derivation in Exercise 4 of §1.4.
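The identities of Exercises 12–13 can be checked together: building A⁻¹ from the adjugate uses Cramer's formula (det A)A⁻¹ = Cof(A)ᵗ, and a difference quotient of det then tests D det(A)B = (det A) Tr(A⁻¹B). The 3 × 3 matrices are our own illustrative choices.

```python
# Check D det(A)B = (det A) Tr(A^{-1} B) by finite differences, with A^{-1}
# built from the cofactor matrix via Cramer's formula.

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def cof3(A):
    # cofactor matrix: C[i][j] = (-1)^(i+j) * det of the (i,j) minor
    C = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            m = [[A[r][c] for c in range(3) if c != j]
                 for r in range(3) if r != i]
            C[i][j] = (-1) ** (i + j) * (m[0][0] * m[1][1] - m[0][1] * m[1][0])
    return C

A = [[2.0, 1.0, 0.0], [0.5, -1.0, 1.0], [1.0, 0.0, 3.0]]
B = [[0.3, -0.2, 1.0], [0.8, 0.1, -0.4], [0.0, 2.0, 0.6]]

dA = det3(A)
C = cof3(A)
Ainv = [[C[j][i] / dA for j in range(3)] for i in range(3)]  # Cramer's formula

# (det A) Tr(A^{-1} B)
pred = dA * sum(sum(Ainv[i][k] * B[k][i] for k in range(3)) for i in range(3))

h = 1e-6
Ap = [[A[i][j] + h * B[i][j] for j in range(3)] for i in range(3)]
Am = [[A[i][j] - h * B[i][j] for j in range(3)] for i in range(3)]
fd = (det3(Ap) - det3(Am)) / (2 * h)

print(abs(fd - pred))  # small
```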

14. Define f(x, y) on ℝ² by

f(x, y) = xy/√(x² + y²), (x, y) ≠ (0, 0); f(0, 0) = 0.

Show that f is continuous on ℝ² and smooth on ℝ² \ {(0, 0)}. Show that ∂f/∂x and ∂f/∂y exist at each point of ℝ², and are continuous on ℝ² \ {(0, 0)}, but not on ℝ². Show that

(∂f/∂x)(0, 0) = (∂f/∂y)(0, 0) = 0.

Show that f is not differentiable at (0, 0).
Hint. Show that f(x, y) is not o(∥(x, y)∥) as (x, y) → (0, 0), by considering f(x, x).
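A quick numerical companion to Exercise 14: both partials of f vanish at the origin, yet along the diagonal f(t, t)/∥(t, t)∥ = 1/2 for every t ≠ 0, so f(x, y) is not o(∥(x, y)∥) at the origin.

```python
import math

# Illustration that f(x, y) = xy/sqrt(x^2 + y^2) has vanishing partials at
# the origin yet fails to be differentiable there.

def f(x, y):
    return 0.0 if (x, y) == (0.0, 0.0) else x * y / math.sqrt(x * x + y * y)

# partials at the origin via the defining difference quotients
h = 1e-8
fx0 = (f(h, 0.0) - f(0.0, 0.0)) / h   # f(h, 0) = 0, so this is exactly 0
fy0 = (f(0.0, h) - f(0.0, 0.0)) / h

# along the diagonal, f(t, t)/||(t, t)|| = (t/sqrt(2)) / (t*sqrt(2)) = 1/2
ratios = [f(t, t) / math.hypot(t, t) for t in (1e-2, 1e-4, 1e-6)]
print(fx0, fy0, ratios)
```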

15. Define g(x, y) on ℝ² by

g(x, y) = xy³/(x² + y²), (x, y) ≠ (0, 0); g(0, 0) = 0.

Show that g is smooth on ℝ² \ {(0, 0)} and of class C^1 on ℝ². Show that ∂_x∂_y g and ∂_y∂_x g exist at each point of ℝ², and are continuous on ℝ² \ {(0, 0)}, but not on ℝ². Show that

(∂/∂y)(∂g/∂x)(0, 0) = 1, (∂/∂x)(∂g/∂y)(0, 0) = 0.

2.2. Inverse function and implicit function theorem

The Inverse Function Theorem gives a condition under which a function can be locally inverted. This theorem and its corollary the Implicit Function Theorem are fundamental results in multivariable calculus. First we state the Inverse Function Theorem. Here, we assume k ≥ 1.

Theorem 2.2.1. Let F be a C^k map from an open neighborhood Ω of p_0 ∈ ℝ^n to ℝ^n, with q_0 = F(p_0). Suppose the derivative DF(p_0) is invertible. Then there is a neighborhood U of p_0 and a neighborhood V of q_0 such that F : U → V is one-to-one and onto, and F^{−1} : V → U is a C^k map. (One says F : U → V is a diffeomorphism.)

First we show that F is one-to-one on a neighborhood of p0, under these hypotheses. In fact, we establish the following result, of interest in its own right. Proposition 2.2.2. Assume Ω ⊂ Rn is open and convex, and let f :Ω → Rn be C1. Assume that the symmetric part of Df(u) is positive-definite, for each u ∈ Ω. Then f is one-to-one on Ω.

Proof. Take distinct points u1, u2 ∈ Ω, and set u2 − u1 = w. Consider φ : [0, 1] → R, given by

φ(t) = w · f(u_1 + tw).

Then φ′(t) = w · Df(u_1 + tw)w > 0 for t ∈ [0, 1], so φ(0) ≠ φ(1). But φ(0) = w · f(u_1) and φ(1) = w · f(u_2), so f(u_1) ≠ f(u_2). ∎

To continue the proof of Theorem 2.2.1, let us set

(2.2.1) f(u) = A(F(p_0 + u) − q_0), A = DF(p_0)^{−1}.

Then f(0) = 0 and Df(0) = I, the identity matrix. We will show that f maps a neighborhood of 0 one-to-one and onto some neighborhood of 0. We can write

(2.2.2) f(u) = u + R(u), R(0) = 0, DR(0) = 0,

and R is C^1. Pick b > 0 such that

(2.2.3) ∥u∥ ≤ 2b ⟹ ∥DR(u)∥ ≤ 1/2.

Then Df = I + DR has positive definite symmetric part on

B_{2b}(0) = {u ∈ ℝ^n : ∥u∥ < 2b},

so by Proposition 2.2.2,

f : B_{2b}(0) → ℝ^n is one-to-one.

We will show that the range f(B2b(0)) contains Bb(0), that is to say, we can solve (2.2.4) f(u) = v, given v ∈ Bb(0), for some (unique) u ∈ B2b(0). This is equivalent to u + R(u) = v. To get the solution, we set

(2.2.5) Tv(u) = v − R(u). Then solving (2.2.4) is equivalent to solving

(2.2.6) T_v(u) = u.

We look for a fixed point

(2.2.7) u = K(v) = f^{−1}(v).

Also, we want to show that DK(0) = I, i.e., that

(2.2.8) K(v) = v + r(v), r(v) = o(∥v∥).

The “little oh” notation is defined in Exercise 8 of §2.1. If we succeed in doing this, it follows that, for y close to q_0, G(y) = F^{−1}(y) is defined. Also, taking

x = p_0 + u, y = F(x), v = f(u) = A(y − q_0),

as in (2.2.1), we have, via (2.2.8),

G(y) = p_0 + u = p_0 + K(v) = p_0 + K(A(y − q_0)) = p_0 + A(y − q_0) + o(∥y − q_0∥).

A parallel argument, with p_0 replaced by a nearby x and y = F(x), gives

(2.2.10) DG(y) = DF(G(y))^{−1}.

Thus our task is to solve (2.2.6). To do this, we use the following general result, known as the Contraction Mapping Theorem.

Theorem 2.2.3. Let X be a complete metric space, and let T : X → X satisfy

(2.2.11) dist(Tx, Ty) ≤ r dist(x, y),

for some r < 1. (We say T is a contraction.) Then T has a unique fixed point x. For any y_0 ∈ X, T^k y_0 → x as k → ∞.

Proof. Pick y_0 ∈ X and let y_k = T^k y_0. Then dist(y_k, y_{k+1}) ≤ r^k dist(y_0, y_1), so

(2.2.12) dist(y_k, y_{k+m}) ≤ dist(y_k, y_{k+1}) + ⋯ + dist(y_{k+m−1}, y_{k+m}) ≤ (r^k + ⋯ + r^{k+m−1}) dist(y_0, y_1) ≤ r^k (1 − r)^{−1} dist(y_0, y_1).

It follows that (yk) is a Cauchy sequence, so it converges; yk → x. Since T yk = yk+1 and T is continuous, it follows that T x = x, i.e., x is a fixed point. Uniqueness of the fixed point is clear from the estimate dist(T x, T x′) ≤ r dist(x, x′), which implies dist(x, x′) = 0 if x and x′ are fixed points. This proves Theorem 2.2.3. 
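A concrete instance of Theorem 2.2.3 (our own example, not from the text): T(x) = cos x maps [0, 1] into itself with |T′(x)| = |sin x| ≤ sin 1 < 1 there, so it is a contraction, and iterating T from any starting point converges to the unique fixed point of x = cos x.

```python
import math

# Contraction mapping iteration for T(x) = cos x on [0, 1]; the iterates
# T^k(y_0) converge to the unique solution of x = cos x.

T = math.cos
x = 0.0                 # y_0
for _ in range(100):    # x becomes T^100(y_0)
    x = T(x)

print(x, abs(x - math.cos(x)))  # fixed point; residual essentially 0
```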

Returning to the task of solving (2.2.6), having b as in (2.2.3), we claim that

(2.2.13) ∥v∥ ≤ b ⟹ T_v : X_v → X_v,

where

(2.2.14) X_v = {u ∈ B_{2b}(0) : ∥u − v∥ ≤ A_v}, A_v = sup_{∥w∥≤2∥v∥} ∥R(w)∥.

See Figure 2.2.1. Note from (2.2.2)–(2.2.3) that

∥w∥ ≤ 2b ⟹ ∥R(w)∥ ≤ (1/2)∥w∥, and ∥R(w)∥ = o(∥w∥).

Hence

(2.2.15) ∥v∥ ≤ b ⟹ A_v ≤ ∥v∥, and A_v = o(∥v∥).

Thus ∥u − v∥ ≤ A_v ⇒ u ∈ X_v. Also

(2.2.16) u ∈ X_v ⟹ ∥u∥ ≤ 2∥v∥ ⟹ ∥R(u)∥ ≤ A_v ⟹ ∥T_v(u) − v∥ ≤ A_v,

so (2.2.13) holds.

As for the contraction property, given u_j ∈ X_v, ∥v∥ ≤ b,

(2.2.17) ∥T_v(u_1) − T_v(u_2)∥ = ∥R(u_2) − R(u_1)∥ ≤ (1/2)∥u_1 − u_2∥,

the last inequality by (2.2.3), so the map (2.2.13) is a contraction. Hence, by Theorem 2.2.3, there is a unique fixed point, u = K(v) ∈ X_v. Also, since u ∈ X_v,

(2.2.18) ∥K(v) − v∥ ≤ A_v = o(∥v∥).

Figure 2.2.1. Tv : Xv → Xv

Thus we have (2.2.8). This establishes the existence of the inverse function G = F −1 : V → U, and we have the formula (2.2.10) for the derivative DG. Since G is differentiable on V , it is certainly continuous, so (2.2.10) implies DG is continuous, given F ∈ C1(U). To finish the proof of the Inverse Function Theorem and show that G is Ck if F is Ck, for k ≥ 2, one uses an inductive argument. See Exercise 6 at the end of this section for an approach to this last argument.

Thus if DF is invertible on the domain of F, F is a local diffeomorphism. Stronger hypotheses are needed to guarantee that F is a global diffeomorphism onto its range. Proposition 2.2.2 provides one tool for doing this. Here is a slight strengthening.

Corollary 2.2.4. Assume Ω ⊂ ℝ^n is open and convex, and that F : Ω → ℝ^n is C^1. Assume there exist n × n matrices A and B such that the symmetric part of A DF(u) B is positive definite for each u ∈ Ω. Then F maps Ω diffeomorphically onto its image, an open set in ℝ^n.

Proof. Exercise. ∎

We make a comment about solving the equation F(x) = y, under the hypotheses of Theorem 2.2.1, when y is close to q_0. The fact that the fixed point for T_v in (2.2.13) is found by taking the limit of T_v^k(v) implies that, when y is sufficiently close to q_0, the sequence (x_k), defined by

(2.2.19) x_0 = p_0, x_{k+1} = x_k + DF(p_0)^{−1}(y − F(x_k)),

converges to the solution x. An analysis of the rate at which x_k → x, and F(x_k) → y, can be made by applying F to (2.2.19), yielding

F(x_{k+1}) = F(x_k + DF(p_0)^{−1}(y − F(x_k))) = F(x_k) + DF(x_k)DF(p_0)^{−1}(y − F(x_k)) + R(x_k, DF(p_0)^{−1}(y − F(x_k))),

and hence

(2.2.20) y − F(x_{k+1}) = (I − DF(x_k)DF(p_0)^{−1})(y − F(x_k)) + R̃(x_k, y − F(x_k)),

with ∥R̃(x_k, y − F(x_k))∥ = o(∥y − F(x_k)∥).

It turns out that replacing p0 by xk in (2.2.19) yields a faster approxima- tion. This method, known as Newton’s method, is described in the exercises.
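In one dimension the contrast is easy to watch. The sketch below (a scalar example of our own choosing: F(x) = x³ + x, y = 2.5, p₀ = 1) runs the frozen-derivative iteration (2.2.19), which converges linearly, against Newton's iteration, which replaces p₀ by x_k and converges in a handful of steps.

```python
# Frozen-derivative iteration (2.2.19) versus Newton's method, solving
# F(x) = x^3 + x = 2.5 starting from p0 = 1 (illustrative choices).

F = lambda x: x**3 + x
dF = lambda x: 3 * x**2 + 1

y, p0 = 2.5, 1.0

# (2.2.19): DF(p0)^{-1} is computed once and reused -> linear convergence
x = p0
for _ in range(200):
    x = x + (y - F(x)) / dF(p0)
frozen = x

# Newton's method: derivative refreshed each step -> quadratic convergence
x = p0
for _ in range(8):
    x = x + (y - F(x)) / dF(x)
newton = x

print(abs(F(frozen) - y), abs(F(newton) - y))  # both tiny; Newton needs far fewer steps
```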

We consider some examples of maps to which Theorem 2.2.1 applies. First, we look at

(2.2.21) F : (0, ∞) × ℝ → ℝ², F(r, θ) = (r cos θ, r sin θ) = (x(r, θ), y(r, θ)).

Then

(2.2.22) DF(r, θ) = ( ∂_r x  ∂_θ x ; ∂_r y  ∂_θ y ) = ( cos θ  −r sin θ ; sin θ  r cos θ ),

so

(2.2.23) det DF(r, θ) = r cos² θ + r sin² θ = r.

Hence DF(r, θ) is invertible for all (r, θ) ∈ (0, ∞) × ℝ. Theorem 2.2.1 implies that each (r_0, θ_0) ∈ (0, ∞) × ℝ has a neighborhood U, and (x_0, y_0) = (r_0 cos θ_0, r_0 sin θ_0) has a neighborhood V, such that F is a smooth diffeomorphism of U onto V. In this simple situation, it can be verified directly that

(2.2.24) F : (0, ∞) × (−π, π) → ℝ² \ {(x, 0) : x ≤ 0}

is a smooth diffeomorphism.

Note that DF(1, 0) = I in (2.2.22). Let us check the domain of applicability of Proposition 2.2.2. The symmetric part of DF(r, θ) in (2.2.22) is

(2.2.25) S(r, θ) = ( cos θ  (1/2)(1 − r) sin θ ; (1/2)(1 − r) sin θ  r cos θ ).

By Proposition 2.1.7, this is positive definite if and only if

(2.2.26) cos θ > 0,

and

(2.2.27) det S(r, θ) = r cos² θ − (1/4)(1 − r)² sin² θ > 0.

Now (2.2.26) holds for θ ∈ (−π/2, π/2), but not on all of (−π, π). Furthermore, (2.2.27) holds for (r, θ) in a neighborhood of (r_0, θ_0) = (1, 0), but it does not hold on all of (0, ∞) × (−π/2, π/2). We see that Proposition 2.2.2 does not capture the full force of the diffeomorphism property of (2.2.24).

We move on to another example. As in §2.1, we can extend Theorem 2.2.1, replacing ℝ^n by a finite dimensional real vector space, isometric to a Euclidean space, such as M(n, ℝ) ≈ ℝ^{n²}. As an example, consider

(2.2.28) Exp : M(n, ℝ) → M(n, ℝ), Exp(X) = e^X = ∑_{k=0}^∞ (1/k!) X^k.

Smoothness of Exp follows from Corollary 2.1.12. Since

(2.2.29) Exp(Y) = I + Y + (1/2)Y² + ⋯,

we have

(2.2.30) D Exp(0)Y = Y, ∀ Y ∈ M(n, ℝ),

so D Exp(0) is invertible. Then Theorem 2.2.1 implies that there exist a neighborhood U of 0 ∈ M(n, ℝ) and a neighborhood V of I ∈ M(n, ℝ) such that Exp : U → V is a smooth diffeomorphism.

To motivate the next result, we consider the following example. Take a > 0 and consider the equation

(2.2.31) x² + y² = a², F(x, y) = x² + y².

Note that

(2.2.32) DF(x, y) = (2x  2y), D_xF(x, y) = 2x, D_yF(x, y) = 2y.

The equation (2.2.31) defines y “implicitly” as a smooth function of x if |x| < a. Explicitly,

(2.2.33) |x| < a ⟹ y = √(a² − x²).

Similarly, (2.2.31) defines x implicitly as a smooth function of y if |y| < a; explicitly,

(2.2.34) |y| < a ⟹ x = √(a² − y²).

Now, given x_0 ∈ ℝ, a > 0, there exists y_0 ∈ ℝ such that F(x_0, y_0) = a² if and only if |x_0| ≤ a. Furthermore,

(2.2.35) given F(x_0, y_0) = a², D_yF(x_0, y_0) ≠ 0 ⟺ |x_0| < a.

Similarly, given y_0 ∈ ℝ, there exists x_0 such that F(x_0, y_0) = a² if and only if |y_0| ≤ a, and

(2.2.36) given F(x_0, y_0) = a², D_xF(x_0, y_0) ≠ 0 ⟺ |y_0| < a.

Note also that, whenever (x, y) ∈ ℝ² and F(x, y) = a² > 0,

(2.2.37) DF(x, y) ≠ 0,

so either D_xF(x, y) ≠ 0 or D_yF(x, y) ≠ 0, and, as seen above, whenever (x_0, y_0) ∈ ℝ² and F(x_0, y_0) = a² > 0, we can solve F(x, y) = a² either for y as a smooth function of x, for x near x_0, or for x as a smooth function of y, for y near y_0. We move from these observations to the next result, the Implicit Function Theorem.

Theorem 2.2.5. Suppose U is a neighborhood of x_0 ∈ ℝ^m, V a neighborhood of y_0 ∈ ℝ^ℓ, and we have a C^k map

(2.2.38) F : U × V → ℝ^ℓ, F(x_0, y_0) = u_0.

Assume D_yF(x_0, y_0) is invertible. Then the equation F(x, y) = u_0 defines y = g(x, u_0) for x near x_0 (satisfying g(x_0, u_0) = y_0), with g a C^k map.

Proof. Consider H : U × V → ℝ^m × ℝ^ℓ defined by

(2.2.39) H(x, y) = (x, F(x, y)).

(Actually, regard (x, y) and (x, F(x, y)) as column vectors.) We have

(2.2.40) DH = ( I  0 ; D_xF  D_yF ).

Thus DH(x_0, y_0) is invertible, so G = H^{−1} exists, on a neighborhood of (x_0, u_0), and is C^k, by the Inverse Function Theorem. Let us set

(2.2.41) G(x, u) = (ξ(x, u), g(x, u)).

Then

(2.2.42) H ◦ G(x, u) = H(ξ(x, u), g(x, u)) = (ξ(x, u), F(ξ(x, u), g(x, u))).

Since H ◦ G(x, u) = (x, u), we have ξ(x, u) = x, so

(2.2.43) G(x, u) = (x, g(x, u)),

and hence

(2.2.44) H ◦ G(x, u) = (x, F(x, g(x, u))),

hence

(2.2.45) F(x, g(x, u)) = u.

Note that G(x0, u0) = (x0, y0), so g(x0, u0) = y0, and g is the desired map. 
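Theorem 2.2.5 also suggests how to compute g in practice: for each fixed x near x₀, solve F(x, y) = u₀ for y by a Newton iteration started at y₀. A sketch for the circle example F(x, y) = x² + y², u₀ = 1, near (0.6, 0.8), where g(x) = √(1 − x²) is known explicitly:

```python
import math

# Computing the implicit function y = g(x, u0) by Newton iteration in y,
# for the illustrative case F(x, y) = x^2 + y^2, u0 = 1, near (0.6, 0.8),
# where D_y F = 2y is invertible.

F = lambda x, y: x * x + y * y
dFdy = lambda x, y: 2 * y

def g(x, u0=1.0, y=0.8):
    # Newton's iteration in y, for fixed x, started at y0 = 0.8
    for _ in range(30):
        y = y - (F(x, y) - u0) / dFdy(x, y)
    return y

errs = [abs(g(x) - math.sqrt(1 - x * x)) for x in (0.55, 0.6, 0.65)]
print(errs)  # all tiny: g matches the explicit branch sqrt(1 - x^2)
```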

Here is an example where Theorem 2.2.5 applies. Set

(2.2.46) F : ℝ⁴ → ℝ², F(u, v, x, y) = (x(u² + v²), xu + yv).

We have

(2.2.47) F(2, 0, 1, 1) = (4, 2).

Note that

(2.2.48) D_{u,v}F(u, v, x, y) = ( 2xu  2xv ; x  y ),

hence

(2.2.49) D_{u,v}F(2, 0, 1, 1) = ( 4  0 ; 1  1 )

is invertible, so Theorem 2.2.5 (with (u, v) in place of y and (x, y) in place of x) implies that the equation

(2.2.50) F(u, v, x, y) = (4, 2)

defines smooth functions

(2.2.51) u = u(x, y), v = v(x, y),

for (x, y) near (x_0, y_0) = (1, 1), satisfying (2.2.50), with (u(1, 1), v(1, 1)) = (2, 0).

Let us next focus on the case ℓ = 1 of Theorem 2.2.5, so

(2.2.52) z = (x, y) ∈ ℝ^n, x ∈ ℝ^{n−1}, y ∈ ℝ, F(z) ∈ ℝ.

Then DyF = ∂yF . If F (x0, y0) = u0, Theorem 2.2.5 says that if

(2.2.53) ∂_yF(x_0, y_0) ≠ 0,

then one can solve

(2.2.54) F(x, y) = u_0 for y = g(x, u_0),

for x near x_0 (satisfying g(x_0, u_0) = y_0), with g a C^k function. This phenomenon was illustrated in (2.2.31)–(2.2.35). To generalize the observations involving (2.2.36)–(2.2.37), we note the following. Set (x, y) = z = (z_1, …, z_n), z_0 = (x_0, y_0). The condition (2.2.53) is that ∂_{z_n}F(z_0) ≠ 0. Now a simple permutation of variables allows us to assume

(2.2.55) ∂_{z_j}F(z_0) ≠ 0, F(z_0) = u_0,

and deduce that one can solve

(2.2.56) F (z) = u0, for zj = g(z1, . . . , zj−1, zj+1, . . . , zn). Let us record this result, changing notation and replacing z by x.

Proposition 2.2.6. Let Ω be a neighborhood of x_0 ∈ ℝ^n. Assume we have a C^k function

(2.2.57) F :Ω −→ R,F (x0) = u0, and assume

(2.2.58) DF (x0) ≠ 0, i.e., (∂1F (x0), . . . , ∂nF (x0)) ≠ 0.

Then there exists j ∈ {1, . . . , n} such that one can solve F (x) = u0 for

(2.2.59) x_j = g(x_1, …, x_{j−1}, x_{j+1}, …, x_n),

with x_{j0} = g(x_{10}, …, x_{j−1,0}, x_{j+1,0}, …, x_{n0}), where x_0 = (x_{10}, …, x_{n0}), for a C^k function g.

Remark. For F :Ω → R, it is common to denote DF (x) by ∇F (x),

(2.2.60) ∇F (x) = (∂1F (x), . . . , ∂nF (x)).

Here is an example to which Proposition 2.2.6 applies. Using the notation (x, y) = (x_1, x_2), set

(2.2.61) F : ℝ² → ℝ, F(x, y) = x² + y² − x.

Then

(2.2.62) ∇F(x, y) = (2x − 1, 2y),

which vanishes if and only if x = 1/2, y = 0. Hence Proposition 2.2.6 applies if and only if (x_0, y_0) ≠ (1/2, 0).

Let us give an example involving a real valued function on M(n, ℝ), namely

(2.2.63) det : M(n, ℝ) → ℝ.

As indicated in Exercise 13 of §2.1, if det X ≠ 0,

(2.2.64) D det(X)Y = (det X) Tr(X^{−1}Y),

so

(2.2.65) det X ≠ 0 ⟹ D det(X) ≠ 0.

We deduce that, if

(2.2.66) X0 ∈ M(n, R), det X0 = a ≠ 0, then, writing

(2.2.67) X = (x_{jk})_{1≤j,k≤n},

there exist µ, ν ∈ {1, …, n} such that the equation

(2.2.68) det X = a

has a smooth solution of the form

(2.2.69) x_{µν} = g(x_{αβ} : (α, β) ≠ (µ, ν)),

such that, if the argument of g consists of the matrix entries of X_0 other than the (µ, ν) entry, then the left side of (2.2.69) is the (µ, ν) entry of X_0.

Let us return to the setting of Theorem 2.2.5, with ℓ not necessarily equal to 1. In notation parallel to that of (2.2.55), we assume F is a C^k map,

(2.2.70) F : Ω → ℝ^ℓ, F(z_0) = u_0,

where Ω is a neighborhood of z_0 in ℝ^n. We assume

(2.2.71) DF(z_0) : ℝ^n → ℝ^ℓ is surjective.

Then, upon reordering the variables z = (z_1, …, z_n), we can write z = (x, y), x = (x_1, …, x_{n−ℓ}), y = (y_1, …, y_ℓ), such that D_yF(z_0) is invertible, and Theorem 2.2.5 applies. Thus (for this reordering of variables), we have a C^k solution to

(2.2.72) F(x, y) = u_0, y = g(x, u_0),

satisfying y_0 = g(x_0, u_0), z_0 = (x_0, y_0).

To give one example to which this result applies, we take another look at F : ℝ⁴ → ℝ² in (2.2.46). We have

(2.2.73) DF(u, v, x, y) = ( 2xu  2xv  u² + v²  0 ; x  y  u  v ).

The reader is invited to determine for which (u, v, x, y) ∈ ℝ⁴ the matrix on the right side of (2.2.73) has rank 2.

Here is another example, involving a map defined on M(n, ℝ). Set

(2.2.74) F : M(n, ℝ) → ℝ², F(X) = (det X, Tr X).

Parallel to (2.2.64), if det X ≠ 0, Y ∈ M(n, ℝ),

(2.2.75) DF(X)Y = ((det X) Tr(X^{−1}Y), Tr Y).

Hence, given det X ≠ 0, DF(X) : M(n, ℝ) → ℝ² is surjective if and only if

(2.2.76) L : M(n, ℝ) → ℝ², LY = (Tr(X^{−1}Y), Tr Y)

is surjective. This is seen to be the case if and only if X is not a scalar multiple of the identity I ∈ M(n, ℝ).

Exercises

1. Suppose F : U → ℝ^n is a C² map, U open in ℝ^n, p ∈ U, and DF(p) is invertible. With q = F(p), define a map N on a neighborhood of p by

(2.2.77) N(x) = x + DF(x)^{−1}(q − F(x)).

Show that there exists ε > 0 and C < ∞ such that, for 0 ≤ r < ε,

∥x − p∥ ≤ r ⟹ ∥N(x) − p∥ ≤ C r².

Conclude that, if ∥x_1 − p∥ ≤ r with r < min(ε, 1/2C), then x_{j+1} = N(x_j) defines a sequence converging very rapidly to p. This is the basis of Newton's method, for solving F(p) = q for p.
Hint. Apply F to both sides of (2.2.77).

2. Applying Newton’s method to f(x) = 1/x, show that you get a fast approximation to division using only addition and multiplication. Hint. Carry out the calculation of N(x) in this case and notice a “miracle.”

3. Identify R2n with Cn via z = x + iy, as in Exercise 4 of §2.1. Let U ⊂ R2n be open, F : U → R2n be C1. Assume p ∈ U, DF (p) invertible. If F −1 : V → U is given as in Theorem 2.2.1, show that F −1 is holomorphic provided F is.

4. Let O ⊂ ℝ^n be open. We say a function f ∈ C^∞(O) is real analytic

provided that, for each x0 ∈ O, we have a convergent power series expansion

(2.2.78) f(x) = ∑_{α≥0} (1/α!) f^{(α)}(x_0)(x − x_0)^α,

valid in a neighborhood of x_0. Show that we can let x be complex in (2.2.78), and obtain an extension of f to a neighborhood of O in ℂ^n. Show that the extended function is holomorphic, i.e., satisfies the Cauchy-Riemann equations.
Hint. Use Proposition 2.1.10.
Remark. It can be shown that, conversely, any holomorphic function has a power series expansion. See §5.1. For the next exercise, assume this as known.

5. Let O ⊂ Rn be open, p ∈ O, f : O → Rn be real analytic, with Df(p) invertible. Take f −1 : V → U as in Theorem 2.2.1. Show f −1 is real analytic. Hint. Consider a holomorphic extension F :Ω → Cn of f and apply Exercise 3.

6. Use (2.2.10) to show that if a C^1 diffeomorphism F has a C^1 inverse G, and if actually F is C^k, then also G is C^k.
Hint. Use induction on k. Write (2.2.10) as

𝒢(x) = Φ ◦ ℱ ◦ G(x),

with Φ(X) = X^{−1}, as in Exercises 3 and 10 of §2.1, 𝒢(x) = DG(x), ℱ(x) = DF(x). Apply Exercise 9 of §2.1 to show that, in general,

G, ℱ, Φ ∈ C^ℓ ⟹ 𝒢 ∈ C^ℓ.

Deduce that if one is given F ∈ C^k and one knows that G ∈ C^{k−1}, then this result applies to give 𝒢 = DG ∈ C^{k−1}, hence G ∈ C^k.

7. Show that there is a neighborhood O of (1, 0) ∈ ℝ² and there are functions u, v, w ∈ C^1(O) (u = u(x, y), etc.) satisfying the equations

u³ + v³ − xw³ = 0,
u² + yw² + v = 1,
xu + yvw = 1,

for (x, y) ∈ O, and satisfying

u(1, 0) = 1, v(1, 0) = 0, w(1, 0) = 1.

Hint. Define F : ℝ⁵ → ℝ³ by

F(u, v, w, x, y) = (u³ + v³ − xw³, u² + yw² + v, xu + yvw)^t.

Then F(1, 0, 1, 1, 0) = (0, 1, 1)^t. Evaluate the 3 × 3 matrix D_{u,v,w}F(1, 0, 1, 1, 0).
Compare (2.2.46)–(2.2.51).

8. Consider F : M(n, ℝ) → M(n, ℝ), given by F(X) = X². Show that F is a diffeomorphism of a neighborhood of the identity matrix I onto a neighborhood of I. Show that F is not a diffeomorphism of a neighborhood of

( 1   0 )
( 0  −1 )

onto a neighborhood of I (in case n = 2).

9. Prove Corollary 2.2.4.

10. Let f : ℝ² → ℝ³ be a C^1 map. Assume f(0) = (0, 0, 0) and

(∂f/∂x)(0) × (∂f/∂y)(0) = (0, 0, 1).

Show that there exist neighborhoods O and Ω of 0 ∈ ℝ² and a C^1 map u : Ω → ℝ such that the image of O under f in ℝ³ is the graph of u over Ω.
Hint. Let Π : ℝ³ → ℝ² be Π(x, y, z) = (x, y), and consider φ(x, y) = Π(f(x, y)), φ : ℝ² → ℝ². Show that Dφ(0) : ℝ² → ℝ² is invertible, and apply the inverse function theorem. Then let u be the z-component of f ◦ φ^{−1}.

11. Generalize Exercise 10 to the setting where f : Rm → Rn (m < n) is C1 and Df(0) : Rm −→ Rn is injective. Remark. For related results, see the opening paragraphs of §3.2.

12. Let Ω ⊂ ℝ^n be open and bounded, containing p_0. Assume F : Ω̄ → ℝ^n is continuous, and F(p_0) = q_0. Assume F is C^1 on Ω and DF(x) is invertible for all x ∈ Ω. Finally, assume there exists R > 0 such that

(2.2.79) x ∈ ∂Ω =⇒ ∥F (x) − q0∥ ≥ R. Show that

(2.2.80) F (Ω) ⊃ BR/2(q0). 2.2. Inverse function and implicit function theorem 79

Hint. Given y_0 ∈ B_{R/2}(q_0), use compactness to show that there exists x_0 ∈ Ω̄ such that

∥F(x_0) − y_0∥ = inf_{x∈Ω̄} ∥F(x) − y_0∥.

Use the hypothesis (2.2.79) to show that x_0 ∈ Ω. If F(x_0) ≠ y_0, use

F(x_0 + tz) = F(x_0) + tDF(x_0)z + o(∥tz∥),

to produce z ∈ ℝ^n (say DF(x_0)z = y_0 − F(x_0)) such that F(x_0 + tz) is closer to y_0 than F(x_0) is, for small t > 0. Contradiction.

13. Do Exercise 12 with the conclusion (2.2.80) strengthened to

(2.2.81) F (Ω) ⊃ BR(q0).

Hint. It suffices to show that F(Ω) ⊃ B_S(q_0) for each S < R. Given such S, produce a diffeomorphism φ : ℝ^n → ℝ^n such that Exercise 12 applies to φ ◦ F, and yields the desired conclusion.

2.3. Systems of differential equations and vector fields

In this section we study n × n systems of ODE,

(2.3.1) dy/dt = F(t, y), y(t_0) = y_0.

To begin, we prove the following fundamental existence and uniqueness result.

Theorem 2.3.1. Let y_0 ∈ Ω, an open subset of ℝ^n, I ⊂ ℝ an interval containing t_0. Suppose F is continuous on I × Ω and satisfies the following Lipschitz estimate in y:

(2.3.2) ∥F(t, y_1) − F(t, y_2)∥ ≤ L∥y_1 − y_2∥,

for t ∈ I, y_j ∈ Ω. Then the equation (2.3.1) has a unique solution on some t-interval containing t_0.

To begin the proof, we note that the equation (2.3.1) is equivalent to the integral equation

(2.3.3) y(t) = y_0 + ∫_{t_0}^t F(s, y(s)) ds.

Existence will be established via the Picard iteration method, which is the following. Guess y_0(t), e.g., y_0(t) ≡ y_0. Then set

(2.3.4) y_k(t) = y_0 + ∫_{t_0}^t F(s, y_{k−1}(s)) ds.

We aim to show that, as k → ∞, y_k(t) converges to a (unique) solution of (2.3.3), at least for t close enough to t_0. To do this, we use the Contraction Mapping Theorem, established in §2.2. We look for a fixed point of Φ, defined by

(2.3.5) (Φy)(t) = y_0 + ∫_{t_0}^t F(s, y(s)) ds.

Let

(2.3.6) X = {u ∈ C(J, ℝ^n) : u(t_0) = y_0, sup_{t∈J} ∥u(t) − y_0∥ ≤ R}.

Here J = [t_0 − T, t_0 + T], where T will be chosen, sufficiently small, below. The radius R is picked so that

B_R(y_0) = {y : ∥y − y_0∥ ≤ R}

is contained in Ω, and we also suppose J ⊂ I. Then there exists M such that

(2.3.7) sup_{s∈J, ∥y−y_0∥≤R} ∥F(s, y)∥ ≤ M.

Then, provided

(2.3.8) T ≤ R/M,

we have

(2.3.9) Φ : X → X.

Now, using the Lipschitz hypothesis (2.3.2), we have, for t ∈ J,

(2.3.10) ∥(Φy)(t) − (Φz)(t)∥ ≤ ∫_{t_0}^t L∥y(s) − z(s)∥ ds ≤ TL sup_{s∈J} ∥y(s) − z(s)∥,

assuming y and z belong to X. It follows that Φ is a contraction on X provided one has

(2.3.11) T < 1/L,

in addition to the hypotheses above. This proves Theorem 2.3.1.

Note that the bound M and the Lipschitz hypothesis on F were needed only on B_R(y_0). Thus we can extend Theorem 2.3.1 to the following setting:

For each compact K ⊂ Ω, there exists M_K < ∞ such that

(2.3.12) ∥F(t, x)∥ ≤ M_K, ∀ x ∈ K, t ∈ I,

and

for each K as above, there exists L_K < ∞ such that

(2.3.13) ∥F(t, x) − F(t, y)∥ ≤ L_K∥x − y∥, ∀ x, y ∈ K, t ∈ I.

Note that, if K ⊂ Ω is compact, there exists R_K > 0 such that

K̃ = {x ∈ ℝ^n : dist(x, K) ≤ R_K} ⊂ Ω,

and K̃ is compact. It follows that, for each y_0 ∈ K, the solution to (2.3.1) exists on the interval

(2.3.14) {t ∈ I : |t − t_0| ≤ min(R_K/M_{K̃}, 1/2L_{K̃})}.

Now that we have local solutions to (2.3.1), it is of interest to investigate when global solutions exist. Here is an example where breakdown occurs:

(2.3.15) dy/dt = y², y(0) = 1.

The solution blows up in finite time. See Exercise 1. It is useful to know that “blowing up” is the only way a solution can fail to exist globally. We have the following result.

Proposition 2.3.2. Let F be as in Theorem 2.3.1, but with the boundedness and Lipschitz hypotheses replaced by (2.3.12)–(2.3.13). Assume [a, b] is contained in the open interval I, and assume y(t) solves (2.3.1) for t ∈ (a, b). Assume there exists a compact K ⊂ Ω such that y(t) ∈ K for all t ∈ (a, b). Then there exist a1 < a and b1 > b such that y(t) solves (2.3.1) for t ∈ (a1, b1).

Proof. We deduce from (2.3.14) that there exists δ > 0 such that for each y1 ∈ K, t1 ∈ [a, b], the solution to

(2.3.16) dy/dt = F(t, y), y(t1) = y1

exists on the interval [t1 − δ, t1 + δ]. Now, under the hypotheses, take t1 ∈ (b − δ/2, b) and y1 = y(t1), with y(t) solving (2.3.1). Then solving (2.3.16) continues y(t) past t = b. Similarly one can continue y(t) past t = a. □
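The breakdown in (2.3.15) is easy to watch numerically; the exact solution is y(t) = 1/(1 − t) (cf. Exercise 1 below), which blows up at t = 1. Here is a minimal sketch using a classical fourth-order Runge–Kutta step (the helper names are ours, not the text's):

```python
# Numerical look at (2.3.15): y' = y^2, y(0) = 1, whose exact solution
# y(t) = 1/(1 - t) blows up as t -> 1.  We integrate with classical RK4
# up to t = 0.9 and compare against the exact solution while it is
# still moderate in size.

def rk4(f, t, y, h):
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

f = lambda t, y: y * y
h = 1e-4
t, y = 0.0, 1.0
while t < 0.9:
    y = rk4(f, t, y, h)
    t += h

exact = 1.0 / (1.0 - t)   # about 10 at t = 0.9
```

Pushing the loop closer to t = 1 makes y grow without bound, in accordance with Proposition 2.3.2: the orbit leaves every compact subset of Ω = ℝ.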

Here is an example of a global existence result that can be deduced from Proposition 2.3.2. Consider the 2 × 2 system for y = (x, v):

(2.3.17) dx/dt = v, dv/dt = −x³.

Here we take Ω = ℝ², F(t, y) = F(t, x, v) = (v, −x³). If (2.3.17) holds for t ∈ (a, b), we have

(2.3.18) d/dt (v²/2 + x⁴/4) = v dv/dt + x³ dx/dt = 0,

so each y(t) = (x(t), v(t)) solving (2.3.17) lies on a level curve x⁴/4 + v²/2 = C, hence is confined to a compact subset of ℝ², yielding global existence of solutions to (2.3.17). For more examples of global existence, see Exercises 2–4 below, and also further material below, treating linear systems.

The discussion above dealt with first-order systems. Often one wants to deal with a higher-order ODE. There is a standard method of reducing an nth-order ODE

(2.3.19) y⁽ⁿ⁾(t) = f(t, y, y′, …, y⁽ⁿ⁻¹⁾)

to a first-order system. One sets u = (u0, …, u_{n−1}) with

(2.3.20) u0 = y, u_j = y⁽ʲ⁾,

and then

(2.3.21) du/dt = (u1, …, u_{n−1}, f(t, u0, …, u_{n−1})) = g(t, u).

If y takes values in ℝᵏ, then u takes values in ℝᵏⁿ. If the system (2.3.1) is non-autonomous, i.e., if F explicitly depends on t, it can be converted to an autonomous system (one with no explicit t-dependence) as follows. Set z = (t, y). We then have

(2.3.22) dz/dt = (1, dy/dt) = (1, F(z)) = G(z).

Sometimes this process destroys important features of the original system (2.3.1). For example, if (2.3.1) is linear, (2.3.22) might be nonlinear. Nevertheless, the trick of converting (2.3.1) to (2.3.22) has some uses.
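Returning to the system (2.3.17), the conservation law (2.3.18) can be checked numerically: along a Runge–Kutta orbit, the "energy" E = v²/2 + x⁴/4 should stay (nearly) constant. A minimal sketch (helper names are ours):

```python
# Checking (2.3.18): integrate dx/dt = v, dv/dt = -x^3 with RK4 and
# record the maximal drift of E = v^2/2 + x^4/4 from its initial value.

def step(x, v, h):
    # one classical RK4 step for the system (2.3.17)
    def f(x, v):
        return v, -x ** 3
    k1x, k1v = f(x, v)
    k2x, k2v = f(x + h * k1x / 2, v + h * k1v / 2)
    k3x, k3v = f(x + h * k2x / 2, v + h * k2v / 2)
    k4x, k4v = f(x + h * k3x, v + h * k3v)
    return (x + (h / 6) * (k1x + 2 * k2x + 2 * k3x + k4x),
            v + (h / 6) * (k1v + 2 * k2v + 2 * k3v + k4v))

E = lambda x, v: v * v / 2 + x ** 4 / 4
x, v, h = 1.0, 0.0, 1e-3
E0 = E(x, v)                      # = 1/4 for this initial condition
drift = 0.0
for _ in range(20000):            # integrate out to t = 20
    x, v = step(x, v, h)
    drift = max(drift, abs(E(x, v) - E0))
```

The tiny drift reflects discretization error only; the exact flow stays on the level curve x⁴/4 + v²/2 = 1/4, a compact set, which is what Proposition 2.3.2 uses.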

Linear systems

Here we consider linear systems, of the form

(2.3.23) dx/dt = A(t)x, x(0) = x0,

given A(t) continuous in t ∈ I (an interval about 0), with values in M(n, ℝ). We will apply Proposition 2.3.2 to establish global existence of solutions. It suffices to establish the following.

Proposition 2.3.3. If ∥A(t)∥ ≤ K for t ∈ I, then the solution to (2.3.23) satisfies

(2.3.24) ∥x(t)∥ ≤ e^{K|t|} ∥x0∥.

Proof. It suffices to prove (2.3.24) for t ≥ 0. Then y(t) = e^{−Kt}x(t) satisfies

(2.3.25) dy/dt = C(t)y, y(0) = x0, C(t) = A(t) − KI.

We claim that, for t ≥ 0,

(2.3.26) ∥y(t)∥ ≤ ∥y(0)∥,

which then implies (2.3.24), for t ≥ 0. In fact,

(2.3.27) d/dt ∥y(t)∥² = y′(t) · y(t) + y(t) · y′(t) = 2y(t) · (A(t) − K)y(t).

Now y(t) · A(t)y(t) ≤ ∥y(t)∥ · ∥A(t)y(t)∥ ≤ ∥A(t)∥ · ∥y(t)∥², so the hypothesis ∥A(t)∥ ≤ K implies

(2.3.28) d/dt ∥y(t)∥² ≤ 0,

yielding (2.3.26). □

Thanks to Proposition 2.3.3, we have, for s, t ∈ I, the solution operator for (2.3.23),

(2.3.29) S(t, s) ∈ M(n, ℝ), S(t, s)x(s) = x(t).

We have

(2.3.30) ∂/∂t S(t, s) = A(t)S(t, s), S(s, s) = I.

Note that S(t, s)S(s, r) = S(t, r). In particular, S(t, s) = S(s, t)⁻¹. We can use the solution operator S(t, s) to solve the inhomogeneous system

(2.3.31) dx/dt = A(t)x + f(t), x(t0) = x0.

Namely, we can take

(2.3.32) x(t) = S(t, t0)x0 + ∫_{t0}^t S(t, s)f(s) ds.

This is known as Duhamel’s formula. Verifying that this solves (2.3.31) is an exercise. We will make good use of this in the next subsection.
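Duhamel's formula (2.3.32) is easy to test numerically in the scalar, constant-coefficient case, where the solution operator is simply S(t, s) = e^{a(t−s)}. A minimal sketch (the function names and the Simpson-rule quadrature are ours, not the text's):

```python
# Scalar sanity check of Duhamel's formula (2.3.32): for constant a,
# the solution of x' = a x + f(t), x(0) = x0, should be
#   x(t) = e^{at} x0 + integral_0^t e^{a(t-s)} f(s) ds.
# We evaluate the integral with the composite Simpson rule and compare
# against the exact solution for f(t) = 1, namely
#   x(t) = (x0 + 1/a) e^{at} - 1/a.
import math

def duhamel(a, x0, f, t, n=1000):
    """x(t) via (2.3.32); the integral uses composite Simpson (n even)."""
    h = t / n
    total = f(0.0) * math.exp(a * t) + f(t)   # endpoint terms of Simpson
    for i in range(1, n):
        w = 4.0 if i % 2 else 2.0
        s = i * h
        total += w * math.exp(a * (t - s)) * f(s)
    return math.exp(a * t) * x0 + (h / 3) * total

a, x0, t = -0.5, 0.0, 3.0
approx = duhamel(a, x0, lambda s: 1.0, t)
exact = (x0 + 1 / a) * math.exp(a * t) - 1 / a
```

The agreement to quadrature accuracy is exactly the content of Exercise 5 below, specialized to n = 1.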

Dependence of solutions on initial data and other parameters

We study how the solution to a system of differential equations

(2.3.33) dx/dt = F(x), x(0) = y

depends on the initial condition y. As shown in (2.3.22), there is no loss of generality in considering the autonomous system (2.3.33). We will assume F : Ω → ℝⁿ is smooth, Ω ⊂ ℝⁿ open and convex, and denote the solution to (2.3.33) by x = x(t, y). We want to examine smoothness in y.

Let DF(x) denote the n × n matrix-valued function of partial derivatives of F. To start, we assume F is of class C¹, i.e., DF is continuous on Ω, and we want to show x(t, y) is differentiable in y. Let us recall what this means. Take y ∈ Ω and pick R > 0 such that B_R(y) is contained in Ω. We seek an n × n matrix W(t, y) such that, for w0 ∈ ℝⁿ, ∥w0∥ ≤ R,

(2.3.34) x(t, y + w0) = x(t, y) + W (t, y)w0 + r(t, y, w0), where

(2.3.35) r(t, y, w0) = o(∥w0∥),

which means

(2.3.36) lim_{w0→0} r(t, y, w0)/∥w0∥ = 0.

When this holds, x(t, y) is differentiable in y, and

(2.3.37) Dyx(t, y) = W (t, y). In other words,

(2.3.38) x(t, y + w0) = x(t, y) + D_y x(t, y)w0 + o(∥w0∥).

In the course of proving this differentiability, we also want to produce an equation for W(t, y) = D_y x(t, y). This can be done as follows. Suppose x(t, y) were differentiable in y. (We do not yet know that it is, but that is okay.) Then F(x(t, y)) is differentiable in y, so we can apply D_y to (2.3.33). Using the chain rule, we get the following equation,

(2.3.39) dW/dt = DF(x)W, W(0, y) = I,

called the linearization of (2.3.33). Here, I is the n × n identity matrix. Equivalently, given w0 ∈ ℝⁿ,

(2.3.40) w(t, y) = W(t, y)w0

is expected to solve

(2.3.41) dw/dt = DF(x)w, w(0) = w0.

Now, we do not yet know that x(t, y) is differentiable, but we do know from results above on linear systems that (2.3.39) and (2.3.41) are uniquely solvable. It remains to show that, with such a choice of W(t, y), (2.3.34)–(2.3.35) hold. To rephrase the task, set

(2.3.42) x(t) = x(t, y), x1(t) = x(t, y + w0), z(t) = x1(t) − x(t), and let W (t) solve (2.3.39), and w(t) satisfy (2.3.40)–(2.3.41). We then have

x(t, y + w0) = x(t, y) + W (t, y)w0 + {z(t) − w(t)}, so the task of verifying (2.3.34)–(2.3.35) is equivalent to the task of verifying

(2.3.43) ∥z(t) − w(t)∥ = o(∥w0∥).

To establish (2.3.43), we will obtain for z(t) an equation similar to (2.3.41). To begin, (2.3.42) implies

(2.3.44) dz/dt = F(x1) − F(x), z(0) = w0.

Now the fundamental theorem of calculus gives

(2.3.45) F(x1) − F(x) = G(x1, x)(x1 − x),

with

(2.3.46) G(x1, x) = ∫_0^1 DF(τx1 + (1 − τ)x) dτ.

If F is C¹, then G is continuous. Then (2.3.44)–(2.3.45) yield

(2.3.47) dz/dt = G(x1, x)z, z(0) = w0.

Given that

(2.3.48) ∥DF(u)∥ ≤ L, ∀ u ∈ Ω,

which we have by continuity of DF, after possibly shrinking Ω slightly, we deduce from Proposition 2.3.3 that

(2.3.49) ∥z(t)∥ ≤ e^{|t|L} ∥w0∥,

that is,

(2.3.50) ∥x(t, y) − x(t, y + w0)∥ ≤ e^{|t|L} ∥w0∥.

This establishes that x(t, y) is Lipschitz in y. To proceed, since G is continuous and G(x, x) = DF(x), we can rewrite (2.3.47) as

(2.3.51) dz/dt = G(x + z, x)z = DF(x)z + R(x, z), z(0) = w0,

where

(2.3.52) F ∈ C¹(Ω) ⟹ ∥R(x, z)∥ = o(∥z∥) = o(∥w0∥).

Now comparing (2.3.51) with (2.3.41), we have

(2.3.53) d/dt (z − w) = DF(x)(z − w) + R(x, z), (z − w)(0) = 0.

Then Duhamel’s formula gives

(2.3.54) z(t) − w(t) = ∫_0^t S(t, s)R(x(s), z(s)) ds,

where S(t, s) is the solution operator for d/dt − B(t), with B(t) = G(x1(t), x(t)), which, as in (2.3.49), satisfies

(2.3.55) ∥S(t, s)∥ ≤ e^{|t−s|L}.

We hence have (2.3.43), i.e.,

(2.3.56) ∥z(t) − w(t)∥ = o(∥w0∥).

This is precisely what is required to show that x(t, y) is differentiable with respect to y, with derivative W = D_y x(t, y) satisfying (2.3.39). Hence we have:

Proposition 2.3.4. If F ∈ C¹(Ω) and if solutions to (2.3.33) exist for t ∈ (−T0, T1), then, for each such t, x(t, y) is C¹ in y, with derivative D_y x(t, y) satisfying (2.3.39).

We have shown that x(t, y) is both Lipschitz and differentiable in y. The continuity of W(t, y) in y follows easily by comparing the differential equations of the form (2.3.39) for W(t, y) and W(t, y + w0), in the spirit of the analysis of z(t) done above.

If F possesses further smoothness, we can establish higher differentiability of x(t, y) in y by the following trick. Couple (2.3.33) and (2.3.39), to get a system of differential equations for (x, W):

(2.3.57) dx/dt = F(x), dW/dt = DF(x)W,

with initial conditions

(2.3.58) x(0) = y, W(0) = I.

We can reiterate the preceding argument, getting results on D_y(x, W), hence on D_y²x(t, y), and continue, proving:

Proposition 2.3.5. If F ∈ Cᵏ(Ω), then x(t, y) is Cᵏ in y.

Similarly, we can consider dependence of the solution to

(2.3.59) dx/dt = F(τ, x), x(0) = y

on a parameter τ, assuming F smooth jointly in (τ, x). This result can be deduced from the previous one by the following trick. Consider the system

(2.3.60) dx/dt = F(z, x), dz/dt = 0, x(0) = y, z(0) = τ.

Then we get smoothness of x(t, τ, y) jointly in (τ, y). As a special case, let F(τ, x) = τF(x). In this case x(t0, τ, y) = x(τt0, y), so we can improve the conclusion in Proposition 2.3.5 to the following:

(2.3.61) F ∈ Cᵏ(Ω) ⟹ x ∈ Cᵏ jointly in (t, y).
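The coupled system (2.3.57)–(2.3.58) can be run numerically and checked against a finite-difference approximation of D_y x(t, y). A scalar sketch with F(x) = sin x, so DF(x) = cos x (all names are ours, not the text's):

```python
# Checking the linearization (2.3.39) in the scalar case F(x) = sin(x):
# integrate the coupled system (2.3.57)-(2.3.58),
#   x' = sin(x),  W' = cos(x) W,   x(0) = y, W(0) = 1,
# with RK4, and compare W(t, y) against the finite difference
# (x(t, y + eps) - x(t, y)) / eps.
import math

def solve(y, t=1.0, n=1000):
    h = t / n
    x, W = y, 1.0
    for _ in range(n):
        # RK4 on the pair (x, W); here DF(x) = cos(x)
        k1 = (math.sin(x), math.cos(x) * W)
        k2 = (math.sin(x + h * k1[0] / 2),
              math.cos(x + h * k1[0] / 2) * (W + h * k1[1] / 2))
        k3 = (math.sin(x + h * k2[0] / 2),
              math.cos(x + h * k2[0] / 2) * (W + h * k2[1] / 2))
        k4 = (math.sin(x + h * k3[0]),
              math.cos(x + h * k3[0]) * (W + h * k3[1]))
        x += (h / 6) * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        W += (h / 6) * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x, W

y, eps = 0.7, 1e-6
x0, W = solve(y)
x1, _ = solve(y + eps)
fd = (x1 - x0) / eps   # finite-difference approximation to D_y x(t, y)
```

The close agreement of W and fd is Proposition 2.3.4 in action.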

Vector fields and flows

Let U ⊂ ℝⁿ be open. A vector field on U is a smooth map

(2.3.62) X : U → ℝⁿ.

Consider the corresponding ODE

(2.3.63) dy/dt = X(y), y(0) = x,

with x ∈ U. A curve y(t) solving (2.3.63) is called an integral curve of the vector field X. It is also called an orbit. For fixed t, write

(2.3.64) y = y(t, x) = F^t_X(x).

The locally defined map F^t_X, mapping (a subdomain of) U to U, is called the flow generated by the vector field X. As a consequence of the results on smooth dependence of solutions to ODE, in (2.3.64), y is a smooth function of (t, x).

The vector field X defines a differential operator on scalar functions, as follows:

(2.3.65) L_X f(x) = lim_{h→0} h⁻¹[f(F^h_X x) − f(x)] = (d/dt) f(F^t_X x)|_{t=0}.

We also use the common notation

(2.3.66) L_X f(x) = Xf,

that is, we apply X to f as a first-order differential operator. Note that, if we apply the chain rule to (2.3.65) and use (2.3.63), we have

(2.3.67) L_X f(x) = X(x) · ∇f(x) = Σ_j a_j(x) ∂f/∂x_j,

if X = Σ a_j(x)e_j, with {e_j} the standard basis of ℝⁿ. In particular, using the notation (2.3.66), we have

(2.3.68) a_j(x) = Xx_j.

In the notation (2.3.66),

(2.3.69) X = Σ_j a_j(x) ∂/∂x_j.

We note that X is a derivation, that is, a map on C^∞(U), linear over ℝ, satisfying

(2.3.70) X(fg) = (Xf)g + f(Xg).

Conversely, any derivation on C^∞(U) defines a vector field, i.e., has the form (2.3.69), as we now show.

Proposition 2.3.6. If X is a derivation on C^∞(U), then X has the form (2.3.69).

Proof. Set a_j(x) = Xx_j, X^# = Σ a_j(x)∂/∂x_j, and Y = X − X^#. Then Y is a derivation satisfying Yx_j = 0 for each j. We aim to show that Yf = 0 for all f. Note that, whenever Y is a derivation,

1 · 1 = 1 ⟹ Y(1) = 2Y(1) ⟹ Y(1) = 0.

Thus Y annihilates constants, and hence, in this case, Y annihilates all polynomials of degree ≤ 1.

Now we show that Yf(p) = 0 for all p ∈ U. Without loss of generality, we can suppose p = 0. Then, with b_j(x) = ∫_0^1 (∂_j f)(tx) dt, we can write

f(x) = f(0) + Σ_j b_j(x)x_j.

It immediately follows that Yf vanishes at 0, so the proposition is proved. □

A fundamental fact about vector fields is that they can be “straightened out” near points where they do not vanish. To see this, let X be a smooth vector field on U, and suppose X(p) ≠ 0. Then near p there is a hyperplane H that is not tangent to X near p, say on a portion we denote M. We can choose coordinates near p so that p is the origin and M is given by {xn = 0}. Thus we can identify a point x′ ∈ Rn−1 near the origin with x′ ∈ M. We can define a map

(2.3.71) F : M × (−t0, t0) → U

by

(2.3.72) F(x′, t) = F^t_X(x′).

This is C^∞ and has surjective derivative at (0, 0), and so by the inverse function theorem is a local diffeomorphism. This defines a new coordinate system near p, in which the flow generated by X has the form

(2.3.73) F^s_X(x′, t) = (x′, t + s).

If we denote the new coordinates by (u1, …, u_n), we see that the following result is established.

Theorem 2.3.7. If X is a smooth vector field on U and X(p) ≠ 0, then there exists a coordinate system (u1, …, u_n), centered at p (so u_j(p) = 0), with respect to which

(2.3.74) X = ∂/∂u_n.

By contrast with the situation in Theorem 2.3.7, if X is a vector field on U, p ∈ U, and X(p) = 0, we say p is a critical point of X. It is of interest to understand the behavior of X and its flow near such a critical point. One special feature that arises here is that if X(p) = 0, then the linearization of (2.3.33) at p, given in general by (2.3.41), takes the special form

(2.3.75) dw/dt = Aw, A = DX(p),

since the solution to (2.3.33) satisfying x(0) = p is x(t) ≡ p. (Here F has been relabeled X.) The solution to (2.3.75) is given explicitly as a matrix exponential:

(2.3.76) w(t) = e^{tA}w0,

explored in an exercise set at the end of this section. We say p is a non-degenerate critical point of X if A = DX(p) is invertible. In such a case, the behavior of e^{tA} is governed by that of the eigenvalues {λ_j} of A. In particular,

Re λ_j < 0 ∀ j ⟹ e^{tA}w0 → 0 as t → +∞,
Re λ_j > 0 ∀ j ⟹ e^{tA}w0 → 0 as t → −∞.

We say X has a sink at p in the first case, and a source at p in the second case. If Re λ_j is positive for some j and negative for some j, but never 0, we say X has a saddle at p. If Re λ_j ≡ 0, we say X has a center at p. This exhausts the possibilities for non-degenerate critical points in dimension 2. In higher dimensions there are other possibilities, which the reader can catalogue. In dimension 2, we illustrate these cases in Figure 2.3.1, showing two sinks, a saddle, and a center, taking, respectively,

(2.3.77) A = ( −1 0 ; 0 −1 ), ( −a 1 ; −1 −a ), ( 1 0 ; 0 −1 ), ( 0 −1 ; 1 0 ),

with a > 0 in the second case. Reverse the signs on the first two matrices to exhibit sources.
It follows from material above that the action of F^t_X on p + w0 is close to that of e^{tA} on w0, for small w0, and for t in a bounded interval, say [−T0, T0]. Of course, for t ∈ [−T0, T0], both F^t_X(p + w0) and e^{tA}w0 move very little when w0 is small. If one is to show that the structure of the orbits of X near p is close to that of the linearization, a finer analysis, involving large t, is needed. For sources and sinks, this analysis is fairly straightforward, and can be found in §3, Chapter 4, of [45]. The case of saddles is more subtle, and is treated in Appendix C to Chapter 4 of [45]. When one has a center for e^{tA}, the behavior of F^t_X can be rather different. The following example is considered in §3, Chapter 4 of [45]:

(2.3.78) X(x) = Jx − |x|²x, x ∈ ℝ², J = ( 0 −1 ; 1 0 ).

This has a critical point at x = 0, and DX(0) = J. It is shown that

(2.3.79) F^t_X(x) → 0, as t ↗ +∞,

in this case. In summary, when one passes from the behavior of the linearization of X to X near a non-degenerate critical point, sources are sources, sinks are sinks, and saddles are saddles,

Figure 2.3.1. Two sinks, a saddle, and a center

but the center does not hold.
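The spiraling in the example (2.3.78)–(2.3.79) can be seen numerically: in polar coordinates the radius satisfies r′ = −r³, so r(t) = r0/√(1 + 2r0²t) → 0, while the J-part merely rotates. A small RK4 sketch (helper names are ours):

```python
# The example (2.3.78): X(x) = Jx - |x|^2 x with J = (0 -1; 1 0).
# The linearization has a center, but the full flow spirals in to 0,
# illustrating (2.3.79).  We integrate with RK4 and compare |x(t)|
# with the exact radial decay r(t) = r0 / sqrt(1 + 2 r0^2 t).
import math

def X(p):
    x, y = p
    r2 = x * x + y * y
    return (-y - r2 * x, x - r2 * y)     # Jx - |x|^2 x

def rk4_step(p, h):
    k1 = X(p)
    k2 = X((p[0] + h * k1[0] / 2, p[1] + h * k1[1] / 2))
    k3 = X((p[0] + h * k2[0] / 2, p[1] + h * k2[1] / 2))
    k4 = X((p[0] + h * k3[0], p[1] + h * k3[1]))
    return (p[0] + (h / 6) * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            p[1] + (h / 6) * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

p, h, t = (1.0, 0.0), 1e-3, 10.0
for _ in range(int(t / h)):
    p = rk4_step(p, h)

r = math.hypot(*p)
r_exact = 1.0 / math.sqrt(1.0 + 2.0 * t)   # from r' = -r^3, r(0) = 1
```

The orbit of e^{tJ} would keep |w(t)| = 1 forever; the nonlinear flow does not.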

We turn to further mapping properties of vector fields. If F : V → W is a diffeomorphism between two open domains in ℝⁿ, and Y is a vector field on W, we define a vector field F_#Y on V so that

(2.3.80) F^t_{F_#Y} = F⁻¹ ∘ F^t_Y ∘ F,

or equivalently, by the chain rule,

(2.3.81) F_#Y(x) = (DF(x))⁻¹ Y(F(x)).

In particular, if U ⊂ ℝⁿ is open and X is a vector field on U, defining a flow F^t_X, then for a vector field Y, F^t_{X#}Y is defined on most of U, for |t| small, and we can define the Lie derivative:

(2.3.82) L_X Y = lim_{h→0} h⁻¹(F^h_{X#}Y − Y) = (d/dt) F^t_{X#}Y |_{t=0},

as a vector field on U.

Another natural construction is the operator-theoretic bracket, also called the Lie bracket:

(2.3.83) [X, Y] = XY − YX,

where the vector fields X and Y are regarded as first-order differential operators on C^∞(U). One verifies that (2.3.83) defines a vector field on U. In fact, if X = Σ a_j(x)∂/∂x_j, Y = Σ b_j(x)∂/∂x_j, then

(2.3.84) [X, Y] = Σ_{j,k} ( a_k ∂b_j/∂x_k − b_k ∂a_j/∂x_k ) ∂/∂x_j.

The basic fact about the Lie bracket is the following.

Theorem 2.3.8. If X and Y are smooth vector fields, then

(2.3.85) LX Y = [X,Y ].

Proof. We examine L_X Y = (d/ds) F^s_{X#}Y |_{s=0}, using (2.3.81), which implies that

(2.3.86) Y_s(x) = F^s_{X#}Y(x) = DF^{−s}_X(F^s_X(x)) Y(F^s_X(x)).

Let us set G^s = DF^{−s}_X. Note that G^s : U → L(ℝⁿ). Hence, for x ∈ U, DG^s(x) is an element of L(ℝⁿ, L(ℝⁿ)). To take the s-derivative of (2.3.86), we can apply the product rule and the chain rule. In particular,

(2.3.87) (d/ds) Y(F^s_X(x)) = DY(F^s_X(x)) X(F^s_X(x)),

in view of the master identity

(2.3.88) (d/ds) F^s_X(x) = X(F^s_X(x)).

To differentiate the first factor on the right side of (2.3.86), we start with

(2.3.89) (d/ds) G^s(F^s_X(x)) = (d/ds G^s)(F^s_X(x)) + DG^s(F^s_X(x)) (d/ds) F^s_X(x),

and then write

(2.3.90) (d/ds G^s)(F^s_X(x)) = (d/ds DF^{−s}_X)(F^s_X(x)) = D((d/ds) F^{−s}_X)(F^s_X(x)),

and, as in (2.3.88),

(2.3.91) (d/ds) DF^{−s}_X(y) = −DX(F^{−s}_X(y)),

so the right side of (2.3.90) is equal to

(2.3.92) −DX(x).

Putting together (2.3.87)–(2.3.92), we have

(2.3.93) (d/ds) Y_s(x) = −DX(x) Y(F^s_X(x)) + DG^s(F^s_X(x)) X(F^s_X(x)) Y(F^s_X(x)) + DF^{−s}_X(F^s_X(x)) DY(F^s_X(x)) X(F^s_X(x)).

Note that G^0(x) = I ∈ L(ℝⁿ) for all x ∈ U, so DG^0 = 0. Thus

(2.3.94) L_X Y = (d/ds) Y_s |_{s=0} = −DX(x)Y(x) + DY(x)X(x),

which agrees with the formula (2.3.84) for [X, Y]. □

Corollary 2.3.9. If X and Y are smooth vector fields on U, then

(2.3.95) (d/dt) F^t_{X#}Y = F^t_{X#}[X, Y]

for all t.

Proof. Since locally F^{t+s}_X = F^s_X F^t_X, we have the same identity for F^{t+s}_{X#}. Hence

(2.3.96) (d/dt) F^t_{X#}Y = (d/ds) F^t_{X#} F^s_{X#}Y |_{s=0} = F^t_{X#} L_X Y,

which yields (2.3.95). □

Here is one useful application of Corollary 2.3.9. Note that (2.3.81) implies

(2.3.97) F^t_{F^s_{X#}Y} = F^{−s}_X ∘ F^t_Y ∘ F^s_X,

while (2.3.95) yields the implication

(2.3.98) [X, Y] = 0 ⟹ F^s_{X#}Y ≡ Y.

Putting together (2.3.97) and (2.3.98), we have the following.

Proposition 2.3.10. Let X and Y be smooth vector fields on U. Let O ⊂ U be open, and assume that, for |s| < a and |t| < b,

(2.3.99) F^s_X, F^t_Y, F^t_Y ∘ F^s_X, F^s_X ∘ F^t_Y, and F^{−s}_X ∘ F^t_Y ∘ F^s_X

all map O → U. Then

(2.3.100) [X, Y] = 0 ⟹ F^t_Y ∘ F^s_X(x) = F^s_X ∘ F^t_Y(x),

for x ∈ O, |s| < a, and |t| < b.
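Proposition 2.3.10 can be illustrated with two explicit planar fields whose flows are known in closed form. Take X = ∂/∂x and Y = x ∂/∂y, so [X, Y] = ∂/∂y ≠ 0; the flows then fail to commute, by exactly the amount one would predict. A sketch (the flow formulas below are computed by hand for these two fields; all names are ours):

```python
# X = d/dx has flow F_X^s(x, y) = (x + s, y);
# Y = x d/dy has flow F_Y^t(x, y) = (x, y + t x).
# Since [X, Y] = d/dy is nonzero, the composed flows differ,
# and the gap in the y-coordinate is exactly s * t.

def flow_X(s, p):
    return (p[0] + s, p[1])

def flow_Y(t, p):
    return (p[0], p[1] + t * p[0])

p, s, t = (1.0, 2.0), 0.3, 0.5
a = flow_Y(t, flow_X(s, p))   # apply F_X^s first, then F_Y^t
b = flow_X(s, flow_Y(t, p))   # the other order
gap = a[1] - b[1]             # equals s * t for these fields
```

For commuting fields (e.g., X = ∂/∂x, Y = ∂/∂y) the same experiment gives gap = 0, which is the content of (2.3.100).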

Exercises

1. Solve the initial value problem

dy/dt = y², y(0) = a,

given a ∈ ℝ. On what t-interval is the solution defined?

2. Assume in (2.3.1) that F : ℝ × ℝⁿ → ℝⁿ is C¹ and satisfies ∥F(t, y)∥ ≤ M for all (t, y) ∈ ℝ × ℝⁿ. Use Proposition 2.3.2 to show that (2.3.1) has a unique solution for all t ∈ ℝ.

3. Let M be a compact smooth surface in ℝⁿ. Suppose F : ℝⁿ → ℝⁿ is a smooth map (vector field), such that, for each x ∈ M, F(x) is tangent to M, i.e., the line γ_x(t) = x + tF(x) is tangent to M at x, at t = 0. Show that, if x ∈ M, then the initial value problem

dy/dt = F(y), y(0) = x

has a solution for all t ∈ ℝ, and y(t) ∈ M for all t.
Hint. Locally, straighten out M to be a linear subspace of ℝⁿ, to which F is tangent. Use uniqueness. Material in §2.2 helps do this local straightening.

4. Show that the initial value problem

dx/dt = −x(x² + y²), dy/dt = −y(x² + y²), x(0) = x0, y(0) = y0

has a solution for all t ≥ 0, but not for all t < 0, unless (x0, y0) = (0, 0).

5. Verify Duhamel’s formula (2.3.32), for the solution to (2.3.31). That is, show that the solution to

dx/dt = A(t)x + f(t), x(t0) = x0

is given by

x(t) = S(t, t0)x0 + ∫_{t0}^t S(t, s)f(s) ds.

Hint. Set x(t) = S(t, t0)y(t) and seek a simpler differential equation for y(t).

Exercises on exponential functions

1. Let a ∈ ℝ. Show that the unique solution to u′(t) = au(t), u(0) = 1 is given by

(2.3.101) u(t) = Σ_{j=0}^∞ (aʲ/j!) tʲ.

We denote this function by u(t) = e^{at}, the exponential function. We also write exp(t) = eᵗ.
Hint. Integrate the series term by term and use the Fundamental Theorem of Calculus.
Alternative. Setting u0(t) = 1, and using the Picard iteration method (2.3.4) to define the sequence u_k(t), show that u_k(t) = Σ_{j=0}^k aʲtʲ/j!.
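The alternative hint can be carried out in exact arithmetic: Picard iteration for u′ = au, u(0) = 1, with the integral of a polynomial done term by term, produces precisely the partial sums of (2.3.101). A sketch (the helper name is ours):

```python
# Picard iteration (2.3.4) for u' = a u, u(0) = 1, on polynomials:
# (Phi u)(t) = 1 + a * integral_0^t u(s) ds.  Integrating t^j gives
# t^{j+1}/(j+1), so after k iterations the coefficient of t^j is a^j/j!.
import math
from fractions import Fraction

def picard_coeffs(a, iterations):
    """Coefficients of the k-th Picard iterate u_k(t) in powers of t."""
    coeffs = [Fraction(1)]                 # u_0(t) = 1
    for _ in range(iterations):
        new = [Fraction(1)]                # constant term from u(0) = 1
        for j, c in enumerate(coeffs):
            new.append(a * c / (j + 1))    # a * t^{j+1}/(j+1) from c t^j
        coeffs = new
    return coeffs

c = picard_coeffs(Fraction(1), 10)         # a = 1: c[j] should be 1/j!
```

Summing the coefficients (i.e., evaluating u_10 at t = 1) already approximates e to about eight digits.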

2. Show that, for all s, t ∈ ℝ,

(2.3.102) e^{a(s+t)} = e^{as}e^{at}.

Hint. Show that u1(t) = e^{a(s+t)} and u2(t) = e^{as}e^{at} solve the same initial value problem.
Alternative. Apply d/dt to e^{a(s+t)}e^{−at}.

3. Show that exp : ℝ → (0, ∞) is a diffeomorphism. We denote the inverse by

log : (0, ∞) → ℝ.

Show that v(x) = log x solves the ODE dv/dx = 1/x, v(1) = 0, and deduce that

(2.3.103) ∫_1^x (1/y) dy = log x.

4. Here we significantly expand the scope of Problems 1–2. Let a ∈ ℂ. Show that the unique solution to f′(t) = af, f(0) = 1 is given by

(2.3.104) f(t) = Σ_{j=0}^∞ (aʲ/j!) tʲ.

We denote this function by f(t) = e^{at}. Show that, for all t ∈ ℝ, a, b ∈ ℂ,

(2.3.105) e^{(a+b)t} = e^{at}e^{bt}.

Hint. See the alternative hint for Problem 2.

5. Write

(2.3.106) e^{it} = Σ_{j=0}^∞ ((−1)ʲ/(2j)!) t^{2j} + i Σ_{j=0}^∞ ((−1)ʲ/(2j+1)!) t^{2j+1} = u(t) + iv(t).

Show that

(2.3.107) u′(t) = −v(t), v′(t) = u(t).

We denote these functions by u(t) = cos t, v(t) = sin t. The identity

(2.3.108) e^{it} = cos t + i sin t

is called Euler’s formula. For a presentation of the standard geometric definition of sin t and cos t, and its equivalence with (2.3.108), see exercises in §3.2.

Auxiliary exercises on trigonometric functions

We use the definitions of the trigonometric functions sin t and cos t given in Exercise 5 of the last exercise set.

1. Use (2.3.105) to derive the identities

(2.3.109) sin(x + y) = sin x cos y + cos x sin y,
          cos(x + y) = cos x cos y − sin x sin y.

2. Use (2.3.105)–(2.3.109) to show that

(2.3.110) sin² t + cos² t = 1, cos² t = ½(1 + cos 2t).

Hint. Show that the complex conjugate of e^{it} is e^{−it}, and hence |e^{it}|² = 1, for t ∈ ℝ.

3. Show that

(2.3.111) γ(t) = (cos t, sin t)

is a map of ℝ onto the unit circle S¹ ⊂ ℝ² with non-vanishing derivative, and, as t increases, γ(t) moves monotonically, counterclockwise. Use Exercise 5 above to show that

(2.3.112) γ′(t) = (− sin t, cos t),

and deduce that γ(t) has unit speed.

We define π to be the smallest number t1 ∈ (0, ∞) such that γ(t1) = (−1, 0), so cos π = −1, sin π = 0.

Show that 2π is the smallest number t2 ∈ (0, ∞) such that γ(t2) = (1, 0), so cos 2π = 1, sin 2π = 0.

Show that

cos(t + 2π) = cos t, cos(t + π) = − cos t,
sin(t + 2π) = sin t, sin(t + π) = − sin t.

Show that γ(π/2) = (0, 1), and that

cos(t + π/2) = − sin t, sin(t + π/2) = cos t.

4. Show that sin : (−π/2, π/2) → (−1, 1) is a diffeomorphism. We denote its inverse by

arcsin : (−1, 1) → (−π/2, π/2).

Show that u(t) = arcsin t solves the ODE

du/dt = 1/√(1 − t²), u(0) = 0.

Hint. Apply the chain rule to sin u(t) = t. Deduce that, for t ∈ (−1, 1),

(2.3.113) arcsin t = ∫_0^t dx/√(1 − x²).

5. Show that

e^{πi/3} = 1/2 + (√3/2)i, e^{πi/6} = √3/2 + (1/2)i.

Hint. First compute

(1/2 + (√3/2)i)³

and use Exercise 3. Then compute e^{πi/2}e^{−πi/3}. For the geometry behind these formulas, look at Figure 2.3.2.

6. Show that sin(π/6) = 1/2, and hence that

π/6 = ∫_0^{1/2} dx/√(1 − x²) = Σ_{n=0}^∞ (a_n/(2n+1)) (1/2)^{2n+1},

where

a_0 = 1, a_{n+1} = ((2n+1)/(2n+2)) a_n.

Show that

| π/6 − Σ_{n=0}^k (a_n/(2n+1)) (1/2)^{2n+1} | < 4^{−k}/(3(2k+3)).

Using a calculator, sum the series over 0 ≤ n ≤ 20, and verify that

π ≈ 3.141592653589⋯
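The computation suggested here takes a few lines of code (all names are ours):

```python
# Summing the arcsin(1/2) series of Exercise 6 over 0 <= n <= 20.
# The recursion a_{n+1} = (2n+1)/(2n+2) * a_n generates the series
# coefficients; 6 times the sum approximates pi.
import math

a, s = 1.0, 0.0
for n in range(21):
    s += a / (2 * n + 1) * 0.5 ** (2 * n + 1)
    a *= (2 * n + 1) / (2 * n + 2)

pi_approx = 6 * s
```

The error bound 4^{−20}/(3·43) from the exercise predicts agreement with π to better than twelve digits, and that is what one observes.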

Figure 2.3.2. Regular hexagon, a = (1 + √3 i)/2

7. For x ≠ (k + 1/2)π, k ∈ ℤ, set

tan x = sin x / cos x.

Show that 1 + tan² x = 1/cos² x. Show that w(x) = tan x satisfies the ODE

dw/dx = 1 + w², w(0) = 0.

8. Show that tan : (−π/2, π/2) → ℝ is a diffeomorphism. Denote the inverse by

arctan : ℝ → (−π/2, π/2).

Show that

(2.3.114) arctan y = ∫_0^y dx/(1 + x²).

9. Use the identity

tan(π/6) = 1/√3

together with (2.3.114) to produce an infinite series for π that is different from that of Exercise 6.

Exercises on the matrix exponential

1. Let A be an n × n matrix. Parallel to Exercises 1 and 3 of the exercise set on exponential functions, show that

(2.3.115) e^{tA} = Σ_{k=0}^∞ (tᵏ/k!) Aᵏ

solves

(2.3.116) (d/dt) e^{tA} = Ae^{tA} = e^{tA}A, e^{tA}|_{t=0} = I.

2. Show that, for t ∈ ℝ, A ∈ M(n, ℂ),

(2.3.117) e^{tA}e^{−tA} = I.

Hint. Compute (d/dt)e^{tA}e^{−tA}. Show it is 0.

3. In the setting of Exercise 2, show that for s, t ∈ ℝ,

(2.3.118) e^{(s+t)A} = e^{sA}e^{tA}.

Hint. Compute (d/dt) e^{(s+t)A}e^{−tA}. Show it is 0.

4. Let A, B ∈ M(n, ℂ), and assume

(2.3.119) AB = BA.

Show that

(2.3.120) e^{t(A+B)} = e^{tA}e^{tB}.

Hint. Compute

(2.3.121) (d/dt) e^{t(A+B)}e^{−tB}e^{−tA}.

Show it is 0. To get this, you will want to show that, if A and B commute, then e^{−tB}A = Ae^{−tB}.

5. We desire to compute e^{tJ} when

(2.3.122) J = ( 0 −1 ; 1 0 ).

Noting that J² = −I, J³ = −J, J⁴ = I, show that the power series for e^{tJ} resembles (2.3.106), and deduce that

(2.3.123) e^{tJ} = (cos t)I + (sin t)J = ( cos t −sin t ; sin t cos t ).

Note the resemblance of (2.3.123) and (2.3.108).
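Both sides of (2.3.123) can be compared directly by summing the matrix series (2.3.115). A sketch with hand-rolled 2 × 2 matrix arithmetic (names are ours):

```python
# Computing e^{tJ} for J = (0 -1; 1 0) from the power series (2.3.115),
# and comparing with the closed form (2.3.123).
import math

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(A, t, terms=30):
    # partial sum of sum_k (t^k / k!) A^k for a 2x2 matrix A
    S = [[1.0, 0.0], [0.0, 1.0]]       # k = 0 term: the identity
    P = [[1.0, 0.0], [0.0, 1.0]]       # running power A^k
    coef = 1.0                          # running t^k / k!
    for k in range(1, terms):
        P = mat_mul(P, A)
        coef *= t / k
        S = [[S[i][j] + coef * P[i][j] for j in range(2)] for i in range(2)]
    return S

J = [[0.0, -1.0], [1.0, 0.0]]
t = 0.7
E = mat_exp(J, t)
closed = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
```

Thirty terms already agree with the closed form to machine precision for moderate t.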

6. Note that X(t) = Exp(tA) = e^{tA} is given as the solution to a system of the form (2.3.59):

dX/dt = F(A, X), X(0) = I,

where F(A, X) = AX. Deduce from the discussion of (2.3.59)–(2.3.60) that

(2.3.124) Exp : M(n, ℂ) → M(n, ℂ)

is smooth, of class Cᵏ, for each k ∈ ℕ. Note that such smoothness also follows from Corollary 2.1.12.

7. Given A ∈ M(n, ℂ) and continuous f : ℝ → ℂⁿ, show that the solution to

(2.3.125) dx/dt = Ax + f, x(0) = x0 ∈ ℂⁿ

is given by the following, known as Duhamel’s formula:

(2.3.126) x(t) = e^{tA}x0 + ∫_0^t e^{(t−s)A}f(s) ds.

Hint. Derive a simpler ODE for y(t) = e^{−tA}x(t). Extend the argument, and provide a proof of (2.3.32).
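For A = J from Exercise 5 and f(t) = (1, 0), the integral in (2.3.126) can be done by hand using (2.3.123): with x0 = 0, one gets x(t) = (sin t, 1 − cos t). This can be checked against a direct RK4 integration of (2.3.125) (the helper names and this particular test case are ours):

```python
# Checking Duhamel's formula (2.3.126) for A = J = (0 -1; 1 0),
# f(t) = (1, 0), x0 = 0.  Integrating e^{uJ} = (cos u)I + (sin u)J over
# [0, t] gives (sin t)I + (1 - cos t)J, so (2.3.126) predicts
#   x(t) = (sin t, 1 - cos t).
import math

def rhs(x):
    # Ax + f with A = J, f = (1, 0)
    return (-x[1] + 1.0, x[0])

def rk4_step(x, h):
    k1 = rhs(x)
    k2 = rhs((x[0] + h * k1[0] / 2, x[1] + h * k1[1] / 2))
    k3 = rhs((x[0] + h * k2[0] / 2, x[1] + h * k2[1] / 2))
    k4 = rhs((x[0] + h * k3[0], x[1] + h * k3[1]))
    return (x[0] + (h / 6) * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            x[1] + (h / 6) * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

t, n = 2.0, 2000
x = (0.0, 0.0)
for _ in range(n):
    x = rk4_step(x, t / n)

duhamel = (math.sin(t), 1.0 - math.cos(t))   # prediction of (2.3.126)
```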

8. Given A ∈ M(n, ℂ), show that det e^A = e^{Tr A}.

Chapter 3

Multivariable integral calculus and calculus on surfaces

This central chapter develops integral calculus on domains in ℝⁿ, and then pursues notions of calculus to a higher level, for surfaces in ℝⁿ, and more generally for a class of objects known as “manifolds.”

In §3.1 we take up the multidimensional Riemann integral. The basic definition is quite parallel to the one-dimensional case, but a number of central results, while parallel in statement to the one-dimensional case, require more elaborate demonstrations in higher dimensions. This section is one of the most demanding in this text, and it is in a sense the heart of the course. One central result is the change of variable formula for multidimensional integrals. Another is the reduction of multiple integrals to iterated integrals.

In §3.2 we define the notion of a smooth m-dimensional surface in ℝⁿ and study properties of these objects. We associate to such a surface a “metric tensor,” and make use of this to define the integral of functions on a surface. This includes the study of surface area. Examples include the computation of areas of higher-dimensional spheres. We also explore integration on the group of rotations on ℝⁿ, leading to the notion of “averaging over rotations.” In addition, we introduce a class of objects more general than surfaces, called manifolds. Manifolds can also be endowed with metric tensors. These are called Riemannian manifolds, and one can again define the integral of functions.


These two main sections are followed by several short sections on supplementary material, including a section on partitions of unity, useful to localize analysis on an n-dimensional surface to analysis on an open subset of ℝⁿ. A section on Sard’s theorem leads to one on a special family of functions known as Morse functions, whose existence allows one to construct lots of conveniently behaved vector fields on a surface. We end with a notion of tangent space to a manifold that conveniently abstracts the notion of the tangent space to a surface in Euclidean space.

3.1. The Riemann integral in n variables

We define the Riemann integral of a bounded function f : R → ℝ, where R ⊂ ℝⁿ is a cell, i.e., a product of intervals R = I_1 × ⋯ × I_n, where I_ν = [a_ν, b_ν] are intervals in ℝ. Recall that a partition of an interval I = [a, b] is a finite collection of subintervals {J_k : 0 ≤ k ≤ N}, disjoint except for their endpoints, whose union is I. We can take J_k = [x_k, x_{k+1}], where

(3.1.1) a = x0 < x1 < ··· < xN < xN+1 = b.

Now, if one has a partition of each I_ν into J_{ν1} ∪ ⋯ ∪ J_{ν,N(ν)}, then a partition P of R consists of the cells

(3.1.2) R_α = J_{1α_1} × J_{2α_2} × ⋯ × J_{nα_n},

where 0 ≤ α_ν ≤ N(ν). For such a partition, define

(3.1.3) maxsize(P) = max_α diam R_α,

where (diam R_α)² = ℓ(J_{1α_1})² + ⋯ + ℓ(J_{nα_n})². Here, ℓ(J) denotes the length of an interval J. Each cell has n-dimensional volume

(3.1.4) V(R_α) = ℓ(J_{1α_1}) ⋯ ℓ(J_{nα_n}).

Sometimes we use V_n(R_α) for emphasis on the dimension. We also use A(R) for V_2(R), and, of course, ℓ(R) for V_1(R). We set

(3.1.5) Ī_P(f) = Σ_α sup_{R_α} f(x) · V(R_α),
        I̲_P(f) = Σ_α inf_{R_α} f(x) · V(R_α).

Note that I̲_P(f) ≤ Ī_P(f). These quantities should approximate the Riemann integral of f, if the partition P is sufficiently “fine.” To be more precise, if P and Q are two partitions of R, we say P refines Q, and write P ≻ Q, if each partition of each interval factor I_ν of R involved in the definition of Q is further refined in order to produce the partitions of

the factors Iν, used to define P, via (3.1.2). It is an exercise to show that any two partitions of R have a common refinement. Note also that

(3.1.6) P ≻ Q ⟹ Ī_P(f) ≤ Ī_Q(f), and I̲_P(f) ≥ I̲_Q(f).

Consequently, if P_1 and P_2 are any two partitions of R and Q is a common refinement, we have

(3.1.7) I̲_{P_1}(f) ≤ I̲_Q(f) ≤ Ī_Q(f) ≤ Ī_{P_2}(f).

Now, whenever f : R → ℝ is bounded, the following quantities are well defined:

(3.1.8) Ī(f) = inf_{P∈Π(R)} Ī_P(f), I̲(f) = sup_{P∈Π(R)} I̲_P(f),

where Π(R) is the set of all partitions of R, as defined above. Clearly, by (3.1.7), I̲(f) ≤ Ī(f). We then say that f is Riemann integrable (on R) provided I̲(f) = Ī(f), and in such a case, we set

(3.1.9) ∫_R f(x) dV(x) = Ī(f) = I̲(f).

We will denote the set of Riemann integrable functions on R by R(R). If dim R = 2, we will often use dA(x) instead of dV(x) in (3.1.9). For general n, we might also use simply dx. We derive some basic properties of the Riemann integral. First, the proofs of Lemma 1.1.3 and Theorem 1.1.4 readily extend, to give:

Proposition 3.1.1. Let Pν be any sequence of partitions of R such that

maxsize(P_ν) = δ_ν → 0, and let ξ_να be any choice of one point in each cell R_να in the partition P_ν. Then, whenever f ∈ R(R),

(3.1.10) ∫_R f(x) dV(x) = lim_{ν→∞} Σ_α f(ξ_να) V(R_να).

This is the multidimensional Darboux theorem. The sums that arise in (3.1.10) are Riemann sums. Also, we can extend the proof of Proposition 1.1.1, to obtain:

Proposition 3.1.2. If f_j ∈ R(R) and c_j ∈ ℝ, then c_1f_1 + c_2f_2 ∈ R(R), and

(3.1.11) ∫_R (c_1f_1 + c_2f_2) dV = c_1 ∫_R f_1 dV + c_2 ∫_R f_2 dV.

Next, we establish an integrability result analogous to Proposition 1.1.2.

Proposition 3.1.3. If f is continuous on R, then f ∈ R(R).

Proof. As in the proof of Proposition 1.1.2, we have that,

(3.1.12) maxsize(P) ≤ δ ⟹ Ī_P(f) − I̲_P(f) ≤ ω(δ) · V(R),

where ω(δ) is a modulus of continuity for f on R. This proves the proposition. □

When the number of variables exceeds one, it becomes more important to identify some nice classes of discontinuous functions on R that are Riemann integrable. A useful tool for this is the following notion of size of a set S ⊂ R, called content. Extending (1.1.18)–(1.1.19), we define “upper content” cont⁺ and “lower content” cont⁻ by

(3.1.13) cont⁺(S) = Ī(χ_S), cont⁻(S) = I̲(χ_S),

where χ_S is the characteristic function of S. We say S has content, or “is contented,” if these quantities are equal, which happens if and only if χ_S ∈ R(R), in which case the common value of cont⁺(S) and cont⁻(S) is

(3.1.14) V(S) = ∫_R χ_S(x) dV(x).

We mention that, if S = I₁ × · · · × I_n is a cell, it is readily verified that the definitions in (3.1.5), (3.1.8), and (3.1.13) yield

cont⁺(S) = cont⁻(S) = ℓ(I₁) · · · ℓ(I_n),

so the definition of V(S) given by (3.1.14) is consistent with that given in (3.1.4). It is easy to see that

(3.1.15) cont⁺(S) = inf { ∑_{k=1}^N V(R_k) : S ⊂ R₁ ∪ · · · ∪ R_N },

where R_k are cells contained in R. In the formal definition, the R_k in (3.1.15) should be part of a partition P of R, as defined above, but if {R₁, . . . , R_N} are any cells in R, they can be chopped up into smaller cells, some perhaps thrown away, to yield a finite cover of S by cells in a partition of R, so one gets the same result. It is an exercise to see that, for any set S ⊂ R,

(3.1.16) cont⁺(S) = cont⁺(S̄),

where S̄ is the closure of S. We note that, generally, for a bounded function f on R,

(3.1.17) \underline{I}(f) + \overline{I}(1 − f) = V(R).

3.1. The Riemann integral in n variables 105

This follows directly from (3.1.5). In particular, given S ⊂ R,

(3.1.18) cont⁻(S) + cont⁺(R \ S) = V(R).

Using this together with (3.1.16), with S and R \ S switched, we have

(3.1.19) cont⁻(S) = cont⁻(S°),

where S° denotes the interior of S. The difference S̄ \ S° is called the boundary of S, and denoted bS. Note that

(3.1.20) cont⁻(S) = sup { ∑_{k=1}^N V(R_k) : R₁ ∪ · · · ∪ R_N ⊂ S },

where here we take {R₁, . . . , R_N} to be cells within a partition P of R, and let P vary over all partitions of R. Now, given a partition P of R, classify each cell in P as either being contained in R \ S̄, or intersecting bS, or contained in S°. Letting P vary over all partitions of R, we see that

(3.1.21) cont⁺(S) = cont⁺(bS) + cont⁻(S°).

In particular, we have:

Proposition 3.1.4. If S ⊂ R, then S is contented if and only if cont⁺(bS) = 0.
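To make the inner/outer approximation behind Proposition 3.1.4 concrete, here is a hedged numerical sketch (the disc and the grid sizes are illustrative choices): for S the closed unit disc inside R = [−1, 1]², the total volume of cells contained in S and of cells meeting S squeeze toward the area π as the partition refines, reflecting cont⁺(bS) = 0.

```python
import math

def disc_contents(n):
    """For the n x n partition of R = [-1,1]^2 into equal cells, return
    (inner, outer): total volume of cells contained in the unit disc S,
    and of cells meeting S.  A cell lies in S iff its farthest corner
    does; it meets S iff its point closest to the origin does."""
    h = 2.0 / n
    inner = outer = 0
    for i in range(n):
        for j in range(n):
            x0, x1 = -1 + i * h, -1 + (i + 1) * h
            y0, y1 = -1 + j * h, -1 + (j + 1) * h
            far = max(x0 * x0, x1 * x1) + max(y0 * y0, y1 * y1)
            cx = min(max(0.0, x0), x1)   # point of [x0, x1] nearest 0
            cy = min(max(0.0, y0), y1)
            near = cx * cx + cy * cy
            if far <= 1.0:
                inner += 1               # cell contained in S
            if near <= 1.0:
                outer += 1               # cell meets S
    return inner * h * h, outer * h * h

lo, hi = disc_contents(400)   # lo <= pi <= hi, with hi - lo small
```

The gap hi − lo is exactly the total volume of the cells meeting bS, which shrinks like the perimeter times the mesh size.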

If a set Σ ⊂ R has the property that cont⁺(Σ) = 0, we say that Σ has content zero, or is a nil set. Clearly Σ is nil if and only if Σ̄ is nil. It follows easily from Proposition 3.1.2 that, if Σ_j are nil, 1 ≤ j ≤ K, then ∪_{j=1}^K Σ_j is nil.

If S₁, S₂ ⊂ R and S = S₁ ∪ S₂, then S̄ = S̄₁ ∪ S̄₂ and S° ⊃ S₁° ∪ S₂°. Hence bS ⊂ b(S₁) ∪ b(S₂). It follows then from Proposition 3.1.4 that, if S₁ and S₂ are contented, so is S₁ ∪ S₂. Clearly, if S_j are contented, so are S_j^c = R \ S_j. It follows that, if S₁ and S₂ are contented, so is S₁ ∩ S₂ = (S₁^c ∪ S₂^c)^c. A family F of subsets of R is called an algebra of subsets of R provided the following conditions hold: R ∈ F,

S₁, S₂ ∈ F ⇒ S₁ ∪ S₂ ∈ F, and S ∈ F ⇒ R \ S ∈ F.

Algebras of sets are automatically closed under finite intersections also. We see that:

Proposition 3.1.5. The family of contented subsets of R is an algebra of sets.

The following result specifies a useful class of Riemann integrable functions. For a sharper result, see Proposition 3.1.31.

Proposition 3.1.6. If f : R → R is bounded and the set S of points of discontinuity of f is a nil set, then f ∈ R(R).

Proof. Suppose |f| ≤ M on R, and take ε > 0. Take a partition P of R, and write P = P′ ∪ P′′, where cells in P′ do not meet S, and cells in P′′ do intersect S. Since cont⁺(S) = 0, we can pick P so that the cells in P′′ have total volume ≤ ε. Now f is continuous on each cell in P′. Further refining the partition if necessary, we can assume that f varies by ≤ ε on each cell in P′. Thus

(3.1.22) \overline{I}_P(f) − \underline{I}_P(f) ≤ [V(R) + 2M]ε.

This proves the proposition. □

To give an example, suppose K ⊂ R is a closed set such that bK is nil. Let f : K → R be continuous. Define f̃ : R → R by

(3.1.23) f̃(x) = f(x) for x ∈ K,  0 for x ∈ R \ K.

Then the set of points of discontinuity of f̃ is contained in bK. Hence f̃ ∈ R(R). We set

(3.1.24) ∫_K f dV = ∫_R f̃ dV.

In connection with this, we note the following fact, whose proof is an exercise. Suppose R and R̃ are cells, with R ⊂ R̃. Suppose that g ∈ R(R) and that g̃ is defined on R̃, to be equal to g on R and to be 0 on R̃ \ R. Then

(3.1.25) g̃ ∈ R(R̃), and ∫_R g dV = ∫_{R̃} g̃ dV.

This can be shown by an argument involving refining any given pair of partitions of R and R̃, respectively, to a pair of partitions P_R and P_{R̃} with the property that each cell in P_R is a cell in P_{R̃}. The following describes an important class of sets S ⊂ Rⁿ that have content zero.

Proposition 3.1.7. Let Σ ⊂ Rⁿ⁻¹ be a closed bounded set and let g : Σ → R be continuous. Then the graph of g,

G = {(x, g(x)) : x ∈ Σ},

is a nil subset of Rⁿ.

Proof. Put Σ in a cell R₀ ⊂ Rⁿ⁻¹. Suppose |g| ≤ M on Σ. Take N ∈ Z⁺ and set ε = M/N. Pick a partition P₀ of R₀, sufficiently fine that g varies by at most ε on each set Σ ∩ R_α, for any cell R_α ∈ P₀. Partition the interval I = [−M, M] into 2N equal intervals J_ν, of length ε. Then {R_α × J_ν} = {Q_{αν}} forms a partition of R₀ × I. Now, over each cell R_α ∈ P₀, there lie at most 2 cells Q_{αν} which intersect G, so cont⁺(G) ≤ 2ε · V(R₀). Letting N → ∞, we have the proposition. □

Similarly, for any j ∈ {1, . . . , n}, the graph of xj as a continuous function of the complementary variables is a nil set in Rn. So are finite unions of such graphs. Such sets arise as boundaries of many ordinary-looking regions in Rn. Here is a further class of nil sets. Proposition 3.1.8. Let O ⊂ Rn be open and let S ⊂ O be a compact nil subset. Assume f : O → Rn is a Lipschitz map. Then f(S) is a nil subset of Rn.

Proof. The Lipschitz hypothesis on f is that there exists L < ∞ such that, for p, q ∈ O,

|f(p) − f(q)| ≤ L|p − q|.

If we cover S with k cubical cells (in a partition), each with edge δ and of total volume ≤ α, then f(S) is covered by k sets of diameter ≤ L√n δ, hence it can be covered by k cubical cells of edge L√n δ, having total volume ≤ (L√n)ⁿ α. From this we have the (not very sharp) general bound

(3.1.26) cont⁺(f(S)) ≤ (L√n)ⁿ cont⁺(S),

which proves the proposition. □

In evaluating n-dimensional integrals, it is usually convenient to reduce them to iterated integrals. The following is a special case of a result known as Fubini's Theorem.

Theorem 3.1.9. Let Σ ⊂ Rⁿ⁻¹ be a closed, bounded, contented set and let g_j : Σ → R be continuous, with g₀(x) < g₁(x) on Σ. Take

(3.1.27) Ω = {(x, y) ∈ Rⁿ : x ∈ Σ, g₀(x) ≤ y ≤ g₁(x)}.

Then Ω is a contented set in Rⁿ. If f : Ω → R is continuous, then

(3.1.28) φ(x) = ∫_{g₀(x)}^{g₁(x)} f(x, y) dy

is continuous on Σ, and

(3.1.29) ∫_Ω f dV_n = ∫_Σ φ dV_{n−1},

i.e.,

(3.1.30) ∫_Ω f dV_n = ∫_Σ ( ∫_{g₀(x)}^{g₁(x)} f(x, y) dy ) dV_{n−1}(x).

Proof. The continuity of (3.1.28) is an exercise in one-variable integration; see Exercises 2–3 of §1.1. Let ω(δ) be a modulus of continuity for g₀, g₁, f, and also φ. We also can assume that ω(δ) ≥ δ.

Put Σ in a cell R₀ and let P₀ be a partition of R₀. If A ≤ g₀ ≤ g₁ ≤ B, partition the interval [A, B], and from this and P₀ construct a partition P of R = R₀ × [A, B]. We denote a cell in P₀ by R_α and a cell in P by R_{αℓ} = R_α × J_ℓ. Pick points ξ_{αℓ} ∈ R_{αℓ}.

Write P₀ = P₀′ ∪ P₀′′ ∪ P₀′′′, consisting respectively of cells inside Σ°, meeting bΣ, and inside R₀ \ Σ. Similarly write P = P′ ∪ P′′ ∪ P′′′, consisting respectively of cells inside Ω°, meeting bΩ, and inside R \ Ω, as illustrated in Figure 3.1.1. For fixed α, let

z′(α) = {ℓ : R_{αℓ} ∈ P′},

and let z′′(α) and z′′′(α) be similarly defined. Note that

z′(α) ≠ ∅ ⇐⇒ R_α ∈ P₀′,

provided we assume maxsize(P) ≤ δ and 2δ < min[g₁(x) − g₀(x)], as we will from here on. It follows from (1.1.9)–(1.1.10) that

(3.1.31) |∑_{ℓ∈z′(α)} f(ξ_{αℓ}) ℓ(J_ℓ) − ∫_{A(α)}^{B(α)} f(x, y) dy| ≤ (B − A)ω(δ), ∀ x ∈ R_α,

where ∪_{ℓ∈z′(α)} J_ℓ = [A(α), B(α)]. Note that A(α) and B(α) are within 2ω(δ) of g₀(x) and g₁(x), respectively, for all x ∈ R_α, if R_α ∈ P₀′. Hence, if |f| ≤ M,

(3.1.32) |∫_{A(α)}^{B(α)} f(x, y) dy − φ(x)| ≤ 4Mω(δ), ∀ x ∈ R_α.

Thus, with C = B − A + 4M,

(3.1.33) |∑_{ℓ∈z′(α)} f(ξ_{αℓ}) ℓ(J_ℓ) − φ(x)| ≤ Cω(δ), ∀ x ∈ R_α ∈ P₀′.

Multiplying by V_{n−1}(R_α) and summing over R_α ∈ P₀′, we have

(3.1.34) |∑_{R_{αℓ}∈P′} f(ξ_{αℓ}) V_n(R_{αℓ}) − ∑_{R_α∈P₀′} φ(x_α) V_{n−1}(R_α)| ≤ C V(R₀) ω(δ),

where x_α is an arbitrary point in R_α.

Figure 3.1.1. Fubini partition

Now, if P₀ is a sufficiently fine partition of R₀, it follows from the proof of Proposition 3.1.6 that the second sum in (3.1.34) is arbitrarily close to ∫_Σ φ dV_{n−1}, since bΣ has content zero. Furthermore, an argument such as used to prove Proposition 3.1.7 shows that bΩ has content zero, and one verifies that, for a sufficiently fine partition, the first sum in (3.1.34) is arbitrarily close to ∫_Ω f dV_n. This proves the desired identity (3.1.29). □
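The iterated-integral formula (3.1.30) can be illustrated numerically. In this hedged sketch, the region Ω = {(x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ x} and the integrand f(x, y) = x + y are hypothetical test choices; the exact value is ∫₀¹ (x² + x²/2) dx = 1/2.

```python
def inner_integral(f, x, a, b, m=400):
    """Midpoint Riemann sum of y -> f(x, y) over [a, b]; this is the
    function phi(x) of (3.1.28), computed approximately."""
    h = (b - a) / m
    return sum(f(x, a + (j + 0.5) * h) for j in range(m)) * h

def iterated_integral(f, g0, g1, m=400):
    """Outer midpoint sum over Sigma = [0, 1] of phi, as in (3.1.29)."""
    h = 1.0 / m
    total = 0.0
    for i in range(m):
        x = (i + 0.5) * h
        total += inner_integral(f, x, g0(x), g1(x), m) * h
    return total

# hypothetical region between g0 = 0 and g1(x) = x, integrand x + y
val = iterated_integral(lambda x, y: x + y, lambda x: 0.0, lambda x: x)
```

Computing the inner integral first, then summing over Σ, mirrors the reduction the theorem makes rigorous.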

We next take up the change of variables formula for multiple integrals, extending the one-variable formula, (1.1.44). We begin with a result on linear changes of variables. The set of invertible real n×n matrices is denoted Gl(n, R). In (3.1.35) and subsequent formulas, ∫ f dV denotes ∫_R f dV for some cell R on which f is supported. The integral is independent of the choice of such a cell; cf. (3.1.25).

Proposition 3.1.10. Let f be a continuous function with compact support in Rⁿ. If A ∈ Gl(n, R), then

(3.1.35) ∫ f(x) dV = |det A| ∫ f(Ax) dV.

Proof. Let G be the set of elements A ∈ Gl(n, R) for which (3.1.35) is true. Clearly I ∈ G. Using det A⁻¹ = (det A)⁻¹ and det AB = (det A)(det B), we can conclude that G is a subgroup of Gl(n, R). In more detail, for A ∈ Gl(n, R) and f as above, let

I_A(f) = ∫ f_A dV = I(f_A),  f_A(x) = f(Ax).

Then

A ∈ G ⇐⇒ I_A(f) = |det A|⁻¹ I(f), for all such f.

We see that

I_{AB}(f) = I(f_{AB}) = I_B(f_A),

so

A, B ∈ G =⇒ I_{AB}(f) = |det B|⁻¹ I(f_A) = |det B|⁻¹ |det A|⁻¹ I(f) = |det AB|⁻¹ I(f) =⇒ AB ∈ G.

Applying a similar argument to I_{AA⁻¹}(f) = I(f) also yields the implication A ∈ G ⇒ A⁻¹ ∈ G. To prove the proposition, it will therefore suffice to show that G contains all elements of the following 3 forms, since (as shown in the exercises on row reduction at the end of this section) the method of applying elementary row operations to reduce a matrix shows that any element of Gl(n, R) is a product of a finite number of these elements. Here, {e_j : 1 ≤ j ≤ n} denotes the standard basis of Rⁿ, and σ a permutation of {1, . . . , n}.

A₁e_j = e_{σ(j)},

(3.1.36) A₂e_j = c_j e_j,  c_j ≠ 0,

A₃e₂ = e₂ + ce₁,  A₃e_j = e_j for j ≠ 2.

The proofs of (3.1.35) in the first two cases are elementary consequences of the definition of the Riemann integral, and can be left as exercises.

We show that (3.1.35) holds for transformations of the form A₃ by using Theorem 3.1.9 (in a special case), to reduce it to the case n = 1. Given f ∈ C(R), compactly supported, and b ∈ R, we clearly have

(3.1.37) ∫ f(x) dx = ∫ f(x + b) dx.

Now, for the case A = A₃, with x = (x₁, x′), we have

(3.1.38) ∫ f(x₁ + cx₂, x′) dV_n(x) = ∫ ( ∫ f(x₁ + cx₂, x′) dx₁ ) dV_{n−1}(x′) = ∫ ( ∫ f(x₁, x′) dx₁ ) dV_{n−1}(x′),

the second identity by (3.1.37). Thus we get (3.1.35) in case A = A3, so the proposition is proved. 
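Here is a hedged numerical sanity check of (3.1.35) in the case n = 2 (the matrix A and the rapidly decaying integrand are illustrative choices, and the integral is truncated to a large cell):

```python
import math

A = ((2.0, 1.0), (0.5, 3.0))                   # a hypothetical A in Gl(2,R)
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]  # = 5.5

def f(x, y):
    # smooth and rapidly decaying; treated as supported in [-8, 8]^2
    return math.exp(-(x * x + y * y))

def integrate(g, lo=-8.0, hi=8.0, n=400):
    """Midpoint Riemann sum of g over the cell [lo, hi]^2."""
    h = (hi - lo) / n
    s = 0.0
    for i in range(n):
        for j in range(n):
            s += g(lo + (i + 0.5) * h, lo + (j + 0.5) * h)
    return s * h * h

lhs = integrate(f)                             # int f(x) dV (equals pi)
rhs = abs(det_A) * integrate(
    lambda x, y: f(A[0][0] * x + A[0][1] * y,  # f(Ax)
                   A[1][0] * x + A[1][1] * y))
```

Both sides should agree, and equal π by the Gaussian integral computed later in this section.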

It is desirable to extend Proposition 3.1.10 to some discontinuous functions. Given a cell R and f : R → R, bounded, we say

(3.1.39) f ∈ 𝒞(R) ⇐⇒ the set of discontinuities of f is nil.

Proposition 3.1.6 implies

(3.1.40) 𝒞(R) ⊂ R(R).

From the closure of the class of nil sets under finite unions it is clear that 𝒞(R) is closed under sums and products, i.e., that 𝒞(R) is an algebra of functions on R. We will denote by 𝒞_c(Rⁿ) the set of bounded functions f : Rⁿ → R such that f has compact support and its set of discontinuities is nil. Any f ∈ 𝒞_c(Rⁿ) is supported in some cell R, and f|_R ∈ 𝒞(R). Here is another useful class of functions. Given a cell R ⊂ Rⁿ and f : R → R bounded, we say

(3.1.41) f ∈ PK(R) ⇐⇒ ∃ a partition P of R such that f is constant on the interior of each cell R_α ∈ P.

The following will be a useful tool for extending Proposition 3.1.10. It is also of interest in its own right, and it will have other uses.

Proposition 3.1.11. Given a cell R ⊂ Rⁿ and f : R → R bounded,

(3.1.42) \overline{I}(f) = inf { ∫_R g dV : g ∈ PK(R), g ≥ f } = inf { ∫_R g dV : g ∈ 𝒞(R), g ≥ f } = inf { ∫_R g dV : g ∈ C(R), g ≥ f },

where 𝒞(R) denotes the class (3.1.39) of bounded functions with nil discontinuity set, and C(R) the continuous functions on R. Similarly,

(3.1.43) \underline{I}(f) = sup { ∫_R g dV : g ∈ PK(R), g ≤ f } = sup { ∫_R g dV : g ∈ 𝒞(R), g ≤ f } = sup { ∫_R g dV : g ∈ C(R), g ≤ f }.

Proof. Denote the three quantities on the right side of (3.1.42) by I₁(f), I₂(f), and I₃(f), respectively. The definition of I₁(f) is sufficiently close to that of \overline{I}(f) in (3.1.8) that the identity \overline{I}(f) = I₁(f) is apparent. Now I₂(f) is an inf over a larger class of functions g than that defining I₁(f), so I₂(f) ≤ I₁(f). On the other hand, \overline{I}(g) ≥ \overline{I}(f) for all g involved in defining I₂(f), so I₂(f) ≥ \overline{I}(f), hence I₂(f) = \overline{I}(f).

Next, I₃(f) is an inf over a smaller class of functions g than that defining I₂(f), so I₃(f) ≥ \overline{I}(f). On the other hand, given ε > 0 and ψ ∈ PK(R), one can readily find g ∈ C(R) such that g ≥ ψ and ∫_R (g − ψ) dV < ε. This implies I₃(f) ≤ \overline{I}(f) + ε for all ε > 0, and finishes the proof of (3.1.42). The proof of (3.1.43) is similar. □

We can now extend Proposition 3.1.10. Say f ∈ R_c(Rⁿ) if f has compact support, say in some cell R, and f ∈ R(R). Also say f ∈ C_c(Rⁿ) if f is continuous on Rⁿ, with compact support.

Proposition 3.1.12. Given A ∈ Gl(n, R), the identity (3.1.35) holds for all f ∈ R_c(Rⁿ).

Proof. We have from Proposition 3.1.11 that, for each ν ∈ N, there exist g_ν, h_ν ∈ C_c(Rⁿ) such that h_ν ≤ f ≤ g_ν and, with B = ∫ f dV,

B − 1/ν ≤ ∫ h_ν dV ≤ B ≤ ∫ g_ν dV ≤ B + 1/ν.

Now Proposition 3.1.10 applies to g_ν and h_ν, so

(3.1.44) B − 1/ν ≤ |det A| ∫ h_ν(Ax) dV ≤ B ≤ |det A| ∫ g_ν(Ax) dV ≤ B + 1/ν.

Furthermore, with f_A(x) = f(Ax), we have h_ν(Ax) ≤ f_A(x) ≤ g_ν(Ax), so (3.1.44) gives

(3.1.45) B − 1/ν ≤ |det A| \underline{I}(f_A) ≤ |det A| \overline{I}(f_A) ≤ B + 1/ν,

for all ν, and letting ν → ∞ we obtain (3.1.35). □

Corollary 3.1.13. If Σ ⊂ Rⁿ is a compact, contented set and A ∈ Gl(n, R), then A(Σ) = {Ax : x ∈ Σ} is contented, and

(3.1.46) V(A(Σ)) = |det A| V(Σ).

We now extend Proposition 3.1.10 to nonlinear changes of variables.

Proposition 3.1.14. Let O and Ω be open in Rⁿ, G : O → Ω a C¹ diffeomorphism, and f a continuous function with compact support in Ω. Then

(3.1.47) ∫_Ω f(y) dV(y) = ∫_O f(G(x)) |det DG(x)| dV(x).

Figure 3.1.2. Image of a cell

Proof. It suffices to prove the result under the additional assumption that f ≥ 0, which we make from here on. Also, using a partition of unity (see §3.3), we can write f as a finite sum of continuous functions with small supports, so it suffices to treat the case where f is supported in a cell R̃ ⊂ Ω and f ∘ G is supported in a cell R ⊂ O. See Figure 3.1.2. Let P = {R_α} be a partition of R. Note that for each R_α ∈ P, bG(R_α) = G(bR_α), so G(R_α) is contented, in view of Propositions 3.1.4 and 3.1.8. Let ξ_α be the center of R_α, and let R̃_α = R_α − ξ_α, a cell with center at the origin. Then

(3.1.48) G(ξ_α) + DG(ξ_α)(R̃_α) = η_α + H_α

is an n-dimensional parallelepiped, each point of which is very close to a point in G(R_α), if R_α is small enough. To be precise, for y ∈ R̃_α,

G(ξ_α + y) = η_α + DG(ξ_α)y + Φ(ξ_α, y)y,  Φ(ξ_α, y) = ∫₀¹ [DG(ξ_α + ty) − DG(ξ_α)] dt.

See Figure 3.1.3.

Figure 3.1.3. Cell image closeup

Consequently, given ε > 0, if δ > 0 is small enough and maxsize(P) ≤ δ, then we have

(3.1.49) ηα + (1 + ε)Hα ⊃ G(Rα), for all Rα ∈ P. Now, by (3.1.46),

(3.1.50) V(H_α) = |det DG(ξ_α)| V(R_α).

Hence

(3.1.51) V(G(R_α)) ≤ (1 + ε)ⁿ |det DG(ξ_α)| V(R_α).

Now we have

(3.1.52) ∫ f dV = ∑_α ∫_{G(R_α)} f dV ≤ ∑_α sup_{R_α} (f ∘ G) · V(G(R_α)) ≤ (1 + ε)ⁿ ∑_α sup_{R_α} (f ∘ G) |det DG(ξ_α)| V(R_α).

To see that the first line of (3.1.52) holds, note that fχ_{G(R_α)} is Riemann integrable, by Proposition 3.1.6; note also that ∑_α fχ_{G(R_α)} = f except on a set of content zero. Then the additivity result in Proposition 3.1.2 applies. The first inequality in (3.1.52) is elementary; the second inequality uses (3.1.51) and f ≥ 0. If we set

(3.1.53) h(x) = f ∘ G(x) |det DG(x)|,

then we have

(3.1.54) sup_{R_α} (f ∘ G) |det DG(ξ_α)| ≤ sup_{R_α} h(x) + Mω(δ),

provided |f| ≤ M and ω(δ) is a modulus of continuity for DG. Taking arbitrarily fine partitions, we get

(3.1.55) ∫_Ω f dV ≤ ∫_O h dV.

If we apply this result, with G replaced by G⁻¹, O and Ω switched, and f replaced by h, given by (3.1.53), we have

(3.1.56) ∫_O h dV ≤ ∫_Ω h ∘ G⁻¹(y) |det DG⁻¹(y)| dV(y) = ∫_Ω f dV.

The inequalities (3.1.55) and (3.1.56) together yield the identity (3.1.47). □

We now extend Proposition 3.1.14 to more general Riemann integrable functions. Recall that f ∈ R_c(Rⁿ) if f has compact support, say in some cell R, and f ∈ R(R). If Ω ⊂ Rⁿ is open and f ∈ R_c(Rⁿ) has support in Ω, we say f ∈ R_c(Ω). We also say f ∈ 𝒞_c(Ω) if f ∈ 𝒞_c(Rⁿ) (the class of compactly supported bounded functions whose discontinuity set is nil) has support in Ω, and we say f ∈ C_c(Ω) if f is continuous with compact support in Ω.

Theorem 3.1.15. Let O and Ω be open in Rⁿ, G : O → Ω a C¹ diffeomorphism. If f ∈ R_c(Ω), then f ∘ G ∈ R_c(O), and (3.1.47) holds.

Proof. The proof is similar to that of Proposition 3.1.12. Given ν ∈ N, we have from Proposition 3.1.11 that there exist g_ν, h_ν ∈ C_c(Ω) such that h_ν ≤ f ≤ g_ν and, with B = ∫_Ω f dV,

B − 1/ν ≤ ∫ h_ν dV ≤ B ≤ ∫ g_ν dV ≤ B + 1/ν.

Then Proposition 3.1.14 applies to h_ν and g_ν, so

B − 1/ν ≤ ∫_O h_ν(G(x))|det DG(x)| dV(x) ≤ B ≤ ∫_O g_ν(G(x))|det DG(x)| dV(x) ≤ B + 1/ν.

Now, with f_G(x) = f(G(x)), we have h_ν(G(x)) ≤ f_G(x) ≤ g_ν(G(x)), so

(3.1.57) B − 1/ν ≤ \underline{I}(f_G |det DG|) ≤ \overline{I}(f_G |det DG|) ≤ B + 1/ν,

for all ν, and letting ν → ∞, we obtain (3.1.47). □

We have seen how Proposition 3.1.11 has been useful. The following result, to some degree a variant of Proposition 3.1.11, is also useful.

Lemma 3.1.16. Let F : R → R be bounded, B ∈ R. Suppose that, for each ν ∈ Z⁺, there exist Ψ_ν, Φ_ν ∈ R(R) such that

(3.1.58) Ψ_ν ≤ F ≤ Φ_ν

and

(3.1.59) B − δ_ν ≤ ∫_R Ψ_ν(x) dV(x) ≤ ∫_R Φ_ν(x) dV(x) ≤ B + δ_ν,  δ_ν → 0.

Then F ∈ R(R) and

(3.1.60) ∫_R F(x) dV(x) = B.

Furthermore, if there exist Ψ_ν, Φ_ν ∈ R(R) such that (3.1.58) holds and

(3.1.61) ∫_R (Φ_ν(x) − Ψ_ν(x)) dV ≤ δ_ν → 0,

then there exists B such that (3.1.59) holds. Hence F ∈ R(R) and (3.1.60) holds.

The most frequently invoked case of the change of variable formula, in the case n = 2, involves the following change from Cartesian to polar coordinates:

(3.1.62) x = r cos θ,  y = r sin θ.

See (2.3.111). Thus, take G(r, θ) = (r cos θ, r sin θ). We have

(3.1.63) DG(r, θ) = [ cos θ  −r sin θ
                      sin θ   r cos θ ],   det DG(r, θ) = r.

For example, if ρ ∈ (0, ∞) and

(3.1.64) D_ρ = {(x, y) ∈ R² : x² + y² ≤ ρ²},

then, for f ∈ C(D_ρ),

(3.1.65) ∫_{D_ρ} f(x, y) dA = ∫₀^ρ ∫₀^{2π} f(r cos θ, r sin θ) r dθ dr.
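Before turning to the justification, here is a hedged numerical check of (3.1.65); the integrand f(x, y) = x² + y² and ρ = 1 are hypothetical test choices, for which both sides equal 2π · ρ⁴/4 = π/2.

```python
import math

rho, n = 1.0, 500

# Cartesian side of (3.1.65): midpoint Riemann sum of f = x^2 + y^2 over
# the cells of [-rho, rho]^2 whose centers lie in the disc D_rho.
h = 2.0 * rho / n
cart = 0.0
for i in range(n):
    for j in range(n):
        x = -rho + (i + 0.5) * h
        y = -rho + (j + 0.5) * h
        if x * x + y * y <= rho * rho:
            cart += (x * x + y * y) * h * h

# Polar side: f(r cos t, r sin t) = r^2, so the integrand r^2 * r does
# not depend on theta, and the theta-integral gives a factor 2*pi.
dr = rho / n
polar = 0.0
for i in range(n):
    r = (i + 0.5) * dr
    polar += (r * r) * r * dr * 2.0 * math.pi
```

The Cartesian side converges more slowly because of the cells straddling the circle, consistent with bD_ρ having content zero.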

To get this, we first apply Proposition 3.1.14, with O = [ε, ρ] × [0, 2π − ε], then apply Theorem 3.1.9, then let ε ↘ 0. We next use Lemma 3.1.16 to establish the following useful result on products of Riemann integrable functions.

Proposition 3.1.17. Given f1, f2 ∈ R(R), we have f1f2 ∈ R(R).

Proof. It suffices to prove this when f_j ≥ 0. Take partitions P_ν and functions ψ_{jν}, φ_{jν} ≥ 0, constant in the interior of each cell in P_ν, such that

0 ≤ ψ_{jν} ≤ f_j ≤ φ_{jν} ≤ B, and

∫ ψ_{jν} dV, ∫ φ_{jν} dV −→ ∫ f_j dV.

We apply Lemma 3.1.16 with

F = f₁f₂,  Ψ_ν = ψ_{1ν}ψ_{2ν},  Φ_ν = φ_{1ν}φ_{2ν}.

Note that

Φ_ν − Ψ_ν = φ_{1ν}(φ_{2ν} − ψ_{2ν}) + ψ_{2ν}(φ_{1ν} − ψ_{1ν}) ≤ B(φ_{2ν} − ψ_{2ν}) + B(φ_{1ν} − ψ_{1ν}).

Hence (3.1.61) holds, giving F ∈ R(R). □

As a consequence of Proposition 3.1.17, we can make the following construction. Assume R is a cell and S ⊂ R is a contented set. If f ∈ R(R), we have χ_S f ∈ R(R), by Proposition 3.1.17. We define

(3.1.66) ∫_S f(x) dV(x) = ∫_R χ_S(x)f(x) dV(x).

Note how this extends the scope of (3.1.24).

Integrals over Rn

It is often useful to integrate a function whose support is not bounded. Generally, given a bounded function f : Rⁿ → R, we say f ∈ R(Rⁿ) provided f|_R ∈ R(R) for each cell R ⊂ Rⁿ, and

∫_R |f| dV ≤ C,

for some C < ∞, independent of R. If f ∈ R(Rⁿ), we set

(3.1.67) ∫_{Rⁿ} f dV = lim_{s→∞} ∫_{R_s} f dV,  R_s = {x ∈ Rⁿ : |x_j| ≤ s, ∀ j}.

The existence of the limit in (3.1.67) can be established as follows. If M < N, then

∫_{R_N} f dV − ∫_{R_M} f dV = ∫_{R_N \ R_M} f dV,

which is dominated in absolute value by ∫_{R_N \ R_M} |f| dV. If f ∈ R(Rⁿ), then a_N = ∫_{R_N} |f| dV is a bounded monotone sequence, which hence converges, so

∫_{R_N \ R_M} |f| dV = ∫_{R_N} |f| dV − ∫_{R_M} |f| dV −→ 0, as M, N → ∞.

RN \RM RN RM The following simple but useful result is an exercise.

Proposition 3.1.18. If K_ν is any sequence of compact contented subsets of Rⁿ such that each R_s, for s < ∞, is contained in all K_ν for ν sufficiently large, i.e., for ν ≥ N(s), then, whenever f ∈ R(Rⁿ),

(3.1.68) ∫_{Rⁿ} f dV = lim_{ν→∞} ∫_{K_ν} f dV.

Change of variables formulas and Fubini's Theorem extend to this case. For example, the limiting case of (3.1.65) as ρ → ∞ is

(3.1.69) f ∈ C(R²) ∩ R(R²) =⇒ ∫_{R²} f(x, y) dA = ∫₀^∞ ∫₀^{2π} f(r cos θ, r sin θ) r dθ dr.

To see this, use Proposition 3.1.18 with K_ν = D_ν, defined as in (3.1.64), to write

(3.1.70) ∫_{R²} f(x, y) dA = lim_{ν→∞} ∫_{D_ν} f(x, y) dA,

and apply (3.1.65) to write the integral on the right as

∫₀^ν ∫₀^{2π} f(r cos θ, r sin θ) r dθ dr.

You get the right side of (3.1.69) in the limit ν → ∞. The following is a good example. Take f(x, y) = e^{−x²−y²}. We have

(3.1.71) ∫_{R²} e^{−x²−y²} dA = ∫₀^∞ ∫₀^{2π} e^{−r²} r dθ dr = 2π ∫₀^∞ e^{−r²} r dr.

Now, methods of §1.1 allow the substitution s = r², so

∫₀^∞ e^{−r²} r dr = (1/2) ∫₀^∞ e^{−s} ds = 1/2.

Hence

(3.1.72) ∫_{R²} e^{−x²−y²} dA = π.

On the other hand, Theorem 3.1.9 extends to give

(3.1.73) ∫_{R²} e^{−x²−y²} dA = ∫_{−∞}^∞ ∫_{−∞}^∞ e^{−x²−y²} dy dx = ( ∫_{−∞}^∞ e^{−x²} dx )( ∫_{−∞}^∞ e^{−y²} dy ).

Note that the two factors in the last product are equal. We deduce that

(3.1.74) ∫_{−∞}^∞ e^{−x²} dx = √π.

We can generalize (3.1.73) to obtain, via (3.1.74),

(3.1.75) ∫_{Rⁿ} e^{−|x|²} dV_n = ( ∫_{−∞}^∞ e^{−x²} dx )ⁿ = π^{n/2}.

The integrals (3.1.71)–(3.1.75) are called Gaussian integrals, and their evaluation has many uses. We shall see some in §3.2. We record the following additivity result for the integral over Rⁿ, whose proof is also an exercise.

Proposition 3.1.19. If f, g ∈ R(Rⁿ), then f + g ∈ R(Rⁿ), and

(3.1.76) ∫_{Rⁿ} (f + g) dV = ∫_{Rⁿ} f dV + ∫_{Rⁿ} g dV.
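The value (3.1.74) is easy to confirm numerically. In this hedged sketch, the truncation to [−10, 10] and the step size are illustrative choices; the neglected tail is on the order of e^{−100}, far below the discretization error.

```python
import math

L, n = 10.0, 200000
h = 2.0 * L / n
# midpoint Riemann sum of exp(-x^2) over the truncated interval [-L, L]
gauss = sum(math.exp(-(-L + (k + 0.5) * h) ** 2) for k in range(n)) * h
# gauss should be extremely close to sqrt(pi) = 1.7724538509...
```

The rapid decay of e^{−x²} is what makes the truncated computation so accurate, mirroring why f ∈ R(Rⁿ) in the sense of (3.1.67).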

Unbounded integrable functions

There are lots of unbounded functions we would like to be able to integrate. For example, consider f(x) = x^{−1/2} on (0, 1] (defined any way you like at x = 0). Since, for ε ∈ (0, 1),

(3.1.77) ∫_ε^1 x^{−1/2} dx = 2 − 2√ε,

this has a limit as ε ↘ 0, and it is natural to set

(3.1.78) ∫₀^1 x^{−1/2} dx = 2.

Sometimes (3.1.78) is called an "improper integral," but we do not consider that to be a proper designation. We aim for a treatment of the integral for a natural class of unbounded functions. To this end, we define a class R#(I) of not necessarily bounded "integrable" functions on I. The set I will stand for either Rⁿ or a cell in Rⁿ. To start, assume f ≥ 0 on I, and for A ∈ (0, ∞), set

(3.1.79) f_A(x) = f(x) if f(x) ≤ A,  A if f(x) > A.

(We hereby abandon the use of f_A as in the proof of Proposition 3.1.10.) We say f ∈ R#(I) provided f_A ∈ R(I), ∀ A < ∞, and

(3.1.80) ∃ uniform bound ∫_I f_A dV ≤ M.

If f ≥ 0 satisfies (3.1.80), then ∫_I f_A dV increases monotonically to a finite limit as A ↗ +∞, and we call the limit ∫_I f dV:

(3.1.81) ∫_I f_A dV ↗ ∫_I f dV, for f ∈ R#(I), f ≥ 0.

If I is understood, we might just write ∫ f dV.

Remark. If f ∈ R(I) is ≥ 0, then fA ∈ R(I) for all A < ∞. See the easy part of Exercise 15.
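The truncation scheme (3.1.79)–(3.1.81) is easy to visualize numerically. In this hedged sketch (the grid size and cutoff values are illustrative choices), f(x) = x^{−1/2} on I = [0, 1], as in (3.1.78); one can check that ∫_I f_A dV = 2 − 1/A exactly, so the computed values should increase toward 2.

```python
def truncated_integral(A, n=200000):
    """Midpoint Riemann sum of f_A = min(x^{-1/2}, A) over [0, 1].
    The cutoff at height A makes the integrand bounded, as in (3.1.79)."""
    h = 1.0 / n
    return sum(min(((k + 0.5) * h) ** -0.5, A) for k in range(n)) * h

# monotone increase toward int_0^1 x^{-1/2} dx = 2; exactly 2 - 1/A
vals = [truncated_integral(A) for A in (10.0, 100.0, 1000.0)]
```

The uniform bound (3.1.80) holds here with M = 2, which is precisely why f belongs to R#(I).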

It is valuable to have the following.

Proposition 3.1.20. If f, g : I → R⁺ are in R#(I), then f + g ∈ R#(I), and

(3.1.82) ∫_I (f + g) dV = ∫_I f dV + ∫_I g dV.

Proof. To start, note that (f + g)_A ≤ f_A + g_A. In fact,

(3.1.83) (f + g)_A = (f_A + g_A)_A.

Hence (f + g)_A ∈ R(I) and ∫ (f + g)_A dV ≤ ∫ f_A dV + ∫ g_A dV ≤ ∫ f dV + ∫ g dV, so we have f + g ∈ R#(I) and

(3.1.84) ∫ (f + g) dV ≤ ∫ f dV + ∫ g dV.

On the other hand, if B > 2A, then (f + g)_B ≥ f_A + g_A, so

(3.1.85) ∫ (f + g) dV ≥ ∫ f_A dV + ∫ g_A dV,

for all A < ∞, and hence

(3.1.86) ∫ (f + g) dV ≥ ∫ f dV + ∫ g dV.

Together, (3.1.84) and (3.1.86) yield (3.1.82). 

Next, we take f : I → R and set f = f⁺ − f⁻, where

(3.1.87) f⁺(x) = f(x) if f(x) ≥ 0,  0 if f(x) < 0.

Then we say

(3.1.88) f ∈ R#(I) ⇐⇒ f⁺, f⁻ ∈ R#(I),

and set

(3.1.89) ∫_I f dV = ∫_I f⁺ dV − ∫_I f⁻ dV,

where the two terms on the right are defined as in (3.1.81). To extend the additivity, we begin as follows.

Lemma 3.1.21. Assume that g ∈ R#(I) and that g_j ≥ 0, g_j ∈ R#(I), and

(3.1.90) g = g₀ − g₁.

Then

(3.1.91) ∫ g dV = ∫ g₀ dV − ∫ g₁ dV.

Proof. Take g = g⁺ − g⁻ as in (3.1.87). Then (3.1.90) implies

(3.1.92) g⁺ + g₁ = g₀ + g⁻,

which by Proposition 3.1.20 yields

(3.1.93) ∫ g⁺ dV + ∫ g₁ dV = ∫ g₀ dV + ∫ g⁻ dV.

This implies

(3.1.94) ∫ g⁺ dV − ∫ g⁻ dV = ∫ g₀ dV − ∫ g₁ dV,

which yields (3.1.91). □

We now extend additivity.

Proposition 3.1.22. Assume f₁, f₂ ∈ R#(I). Then f₁ + f₂ ∈ R#(I) and

(3.1.95) ∫_I (f₁ + f₂) dV = ∫_I f₁ dV + ∫_I f₂ dV.

Proof. If g = f₁ + f₂ = (f₁⁺ − f₁⁻) + (f₂⁺ − f₂⁻), then

(3.1.96) g = g₀ − g₁,  g₀ = f₁⁺ + f₂⁺,  g₁ = f₁⁻ + f₂⁻.

We have g_j ∈ R#(I), and then

(3.1.97) ∫ (f₁ + f₂) dV = ∫ g₀ dV − ∫ g₁ dV = ∫ (f₁⁺ + f₂⁺) dV − ∫ (f₁⁻ + f₂⁻) dV = ∫ f₁⁺ dV + ∫ f₂⁺ dV − ∫ f₁⁻ dV − ∫ f₂⁻ dV,

the first equality by Lemma 3.1.21, the second tautologically, and the third by Proposition 3.1.20. Since

(3.1.98) ∫ f_j dV = ∫ f_j⁺ dV − ∫ f_j⁻ dV,

this gives (3.1.95). □

If f : I → C, we set f = f₁ + if₂, f_j : I → R, and say f ∈ R#(I) if and only if f₁ and f₂ belong to R#(I). Then we set

(3.1.99) ∫ f dV = ∫ f₁ dV + i ∫ f₂ dV.

Similar comments apply to f : I → Rⁿ. We next establish a useful result on products.

Proposition 3.1.23. Assume f ∈ R#(Rⁿ), g ∈ R(Rⁿ), and f, g ≥ 0. Then fg ∈ R#(Rⁿ) and

(3.1.100) ∫ f_A g dV ↗ ∫ fg dV as A ↗ +∞.

Proof. Given the additivity properties just established, it would be equivalent to prove this with g replaced by g + 1, so we will assume from here on that g ≥ 1. Then

(3.1.101) (fg)_A = (f_A g)_A.

By Proposition 3.1.17, f_A g|_R ∈ R(R) for each cell R. Hence (e.g., by the easy part of Exercise 15), (f_A g)_A|_R ∈ R(R) for each cell R. Thus

(3.1.102) (fg)_A|_R ∈ R(R), for each cell R.

Now there exists K < ∞ such that 1 ≤ g ≤ K, so

(3.1.103) f_A g ≤ K f_A, hence (fg)_A ≤ K f_A.

The hypothesis f ∈ R#(Rⁿ) implies there exists M < ∞ such that

(3.1.104) ∫_R f_A dV ≤ M,

for all A < ∞ and each cell R. Hence, by (3.1.103),

(3.1.105) sup_A ∫_R (fg)_A dV ≤ MK,

independent of R. This implies fg ∈ R#(Rⁿ). By definition,

(3.1.106) ∫ (fg)_A dV ↗ ∫ fg dV, as A ↗ +∞.

Meanwhile, clearly f_A g ↗ as A ↗, so the estimate (3.1.103) implies

(3.1.107) ∫ f_A g dV ↗ L, as A ↗ +∞,

for some L ∈ R⁺. It remains to identify the limits in (3.1.106) and (3.1.107). Now (3.1.101) implies

(3.1.108) (fg)_A ≤ f_A g, hence ∫ fg dV ≤ L.

Finally, since f_A g ≤ fg and f_A g ≤ KA, we have

(3.1.109) f_A g ≤ (fg)_B for B ≥ KA.

This implies

(3.1.110) L ≤ sup_B ∫ (fg)_B dV = ∫ fg dV,

and hence we have (3.1.100). □

We now extend the change of variable formula in Theorem 3.1.15 to unbounded functions. It is convenient to introduce the following notation. Given an open set Ω ⊂ Rⁿ, we say f ∈ R#_c(Ω) provided f ∈ R#(Rⁿ) and f is supported on a compact subset of Ω.

Proposition 3.1.24. Let O and Ω be open in Rⁿ, G : O → Ω a C¹ diffeomorphism. If f ∈ R#_c(Ω), then f ∘ G ∈ R#_c(O) and

(3.1.111) ∫_Ω f(y) dV(y) = ∫_O f(G(x))|det DG(x)| dV(x).

Proof. It suffices to establish this in case f ≥ 0, which we assume from here. Then

(3.1.112) ∫_Ω f_A dV ↗ ∫_Ω f dV.

We set φ = f ∘ G and note that f_A ∘ G = φ_A. Hence, by Theorem 3.1.15, for each A ∈ (0, ∞),

(3.1.113) ∫_Ω f_A(y) dV(y) = ∫_O φ_A(x)|det DG(x)| dV(x).

If f is supported on a compact set K ⊂ Ω, then φ_A is supported on G⁻¹(K) ⊂ O, also compact, on which |det DG| has a positive lower bound. Hence an upper bound on the right side of (3.1.113) implies an upper bound on ∫ φ_A dV, independent of A, so φ ∈ R#(Rⁿ). Then Proposition 3.1.23 implies φ|det DG| ∈ R#(Rⁿ) and

(3.1.114) ∫ φ_A(x)|det DG(x)| dV(x) ↗ ∫ φ(x)|det DG(x)| dV(x).

Together, (3.1.112)–(3.1.114) yield (3.1.111). □

One also has versions of Proposition 3.1.24 where f need not have compact support. See Exercise 13 below for an example. Our next result, on a class of elements of R#(I), ties in closely with the example in (3.1.77). As before, I is either Rⁿ or a cell in Rⁿ.

Proposition 3.1.25. Let f : I → [0, ∞) and assume f_A ∈ R(I) for each A < ∞. Assume there are nested contented subsets of I:

(3.1.115) U₁ ⊃ U₂ ⊃ U₃ ⊃ · · · ,  V(U_ν) → 0.

Assume f(1 − χ_{U_ν}) ∈ R(I) for each ν and that there exists C < ∞ such that

(3.1.116) ∫_{I\U_ν} f dV = J_ν ≤ C, ∀ ν.

Then f ∈ R#(I) and

(3.1.117) J_ν ↗ ∫_I f dV.

Proof. The hypothesis (3.1.116) implies J_ν ↗ J for some J ∈ [0, ∞). Also, since 0 ≤ f_A ≤ f, we have

(3.1.118) ∫_{I\U_ν} f_A dV ≤ J, ∀ ν, A.

Furthermore,

(3.1.119) ∫_{U_ν} f_A dV ≤ A·V(U_ν),

so

(3.1.120) ∫_I f_A dV ≤ J + A·V(U_ν), ∀ ν, A,

hence

(3.1.121) ∫_I f_A dV ≤ J, ∀ A.

It follows that f ∈ R#(I) and

(3.1.122) ∫_I f dV ≤ J.

On the other hand,

(3.1.123) ∫_I f dV ≥ ∫_{I\U_ν} f dV = J_ν,

for each ν, so we have (3.1.117). □

Monotone convergence theorem

We aim to establish a circle of results known as monotone convergence theorems. Here is the first result.

Proposition 3.1.26. Let R ⊂ Rⁿ be a cell. Assume f_k ∈ R(R). Then

(3.1.124) f_k(x) ↘ 0 ∀ x ∈ R =⇒ ∫_R f_k dV ↘ 0.

Proof. It suffices to assume V(R) = 1. Say 0 ≤ f₁ ≤ K on R, so also 0 ≤ f_k ≤ K. We have

(3.1.125) ∫_R f_k dV ↘ α,

for some α ≥ 0, and we want to show that α = 0. Suppose α > 0. Pick a partition P_k of R such that \underline{I}_{P_k}(f_k) ≥ α/2. Thus f_k ≥ φ_k ≥ 0 for some φ_k ∈ PK(R), constant on the interior of each cell in P_k, with integral ≥ α/2. The contribution to ∫_R φ_k dV from the cells on which φ_k ≤ α/4 is ≤ α/4, so the contribution from the cells on which φ_k ≥ α/4 must be ≥ α/4. Since φ_k ≤ K on R, it follows that the latter class of cells must have total volume ≥ α/4K. Consequently, for each k, there exists S_k ⊂ R, a finite union of cells in P_k, such that

(3.1.126) V(S_k) ≥ α/4K, and f_k ≥ α/4 on S_k.

Then f_ℓ ≥ α/4 on S_k for all ℓ ≤ k. Hence, with

(3.1.127) O_ℓ = ∪_{k≥ℓ} S_k,

we have

(3.1.128) cont⁻(O_ℓ) ≥ α/4K,  f_ℓ ≥ α/4 on O_ℓ.

The hypothesis f_ℓ ↘ 0 on R implies

(3.1.129) Oℓ ↘ ∅ as ℓ ↗ ∞.

Without loss of generality, we can take S_k open in (3.1.126), hence each O_ℓ is open. The conclusion of Proposition 3.1.26 is hence a consequence of the following, which implies that (3.1.128) and (3.1.129) are contradictory. □

Proposition 3.1.27. If O_ℓ ⊂ R are open sets, for ℓ ∈ N, then

(3.1.130) O_ℓ ↘ ∅ =⇒ cont⁻(O_ℓ) ↘ 0.

Proof. Assume O_ℓ ↘ ∅. If the conclusion of (3.1.130) fails, then

(3.1.131) cont⁻(O_ℓ) ↘ b

for some b > 0. Passing to a subsequence if necessary, we can assume

(3.1.132) cont⁻(O_ℓ) ≤ b + δ_ℓ,  δ_ℓ < 2^{−ℓ} · 10^{−9} · b.

Then we can pick Kℓ ⊂ Oℓ, a compact union of finitely many cells in a partition of R, such that

(3.1.133) V (Kℓ) ≥ b − δℓ.

We claim that ∩ℓKℓ ≠ ∅, which will provide a contradiction.

Place K1 ∪ K2 in a finite union C1 of cells, contained in O1. We then have

(3.1.134) V(K₁ ∩ K₂) ≥ V(K₁) − V(C₁ \ K₂) ≥ b − (2δ₁ + δ₂),

since V(C₁ \ K₂) = V(C₁) − V(K₂) ≤ cont⁻(O₁) − V(K₂) ≤ δ₁ + δ₂. Next, place (K₁ ∩ K₂) ∪ K₃ in a finite union C₂ of cells, contained in O₂. Then

(3.1.135) V(K₁ ∩ K₂ ∩ K₃) ≥ V(K₁ ∩ K₂) − V(C₂ \ K₃) ≥ b − (2δ₁ + δ₂) − (2δ₂ + δ₃),

since V(C_2 \ K_3) = V(C_2) − V(K_3) ≤ cont⁻(O_2) − V(K_3) ≤ δ_2 + δ_3. Proceeding in this fashion, we get

(3.1.136) V(∩_{ℓ=1}^k K_ℓ) ≥ b − Σ_{ℓ=1}^k (2δ_ℓ + δ_{ℓ+1}) > 0, ∀ k.

Thus, K̃_k = ∩_{ℓ=1}^k K_ℓ is a decreasing sequence of nonempty compact sets. Hence

(3.1.137) ∩_{ℓ≥1} O_ℓ ⊃ ∩_{ℓ≥1} K_ℓ ≠ ∅,

contradicting the hypothesis of (3.1.130). □
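The phenomenon in Propositions 3.1.26–3.1.27 can be illustrated numerically. The following sketch (our own example, not part of the text's argument) takes f_k(x) = x^k(1 − x), which decreases pointwise to 0 on R = [0, 1], and checks that midpoint-rule approximations of ∫_R f_k dV decrease to 0 as well.

```python
# Numerical sketch of a Dini-type monotone convergence on R = [0, 1]:
# f_k(x) = x^k (1 - x) decreases pointwise to 0, and so do its integrals.
# (Illustration only; the example f_k is ours.)

def midpoint_integral(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

integrals = []
for k in range(1, 41):
    fk = lambda x, k=k: x ** k * (1 - x)
    integrals.append(midpoint_integral(fk, 0.0, 1.0))

# The exact value is 1/(k+1) - 1/(k+2), which tends to 0 as k grows.
decreasing = all(integrals[i + 1] <= integrals[i]
                 for i in range(len(integrals) - 1))
```

The exact integrals 1/(k+1) − 1/(k+2) make the decrease to 0 transparent; the numerical sums track them closely.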

Having Proposition 3.1.26, we proceed to the following significant im- provement.

Proposition 3.1.28. Assume f_k ∈ R^#(R). Then

(3.1.138) f_k(x) ↘ 0 ∀ x ∈ R =⇒ ∫_R f_k dV ↘ 0.

Proof. Again we have (3.1.125) for some α ≥ 0, and again we want to show that α = 0. For each A ∈ (0, ∞) and each k ∈ N, form (f_k)_A, as in (3.1.79). Thus (f_k)_A ∈ R(R), and the hypothesis of (3.1.138) implies (f_k)_A ↘ 0 as k ↗ ∞. Thus, by Proposition 3.1.26,

(3.1.139) ∫_R (f_k)_A dV ↘ 0 as k → ∞, for each A < ∞.

We note that

(3.1.140) f_{k+1}(x) − (f_{k+1})_A(x) ≤ f_k(x) − (f_k)_A(x)

for each x ∈ R, k ∈ N. In fact, if f_k(x) ≤ A (so f_{k+1}(x) ≤ A), both sides of (3.1.140) are 0; if f_{k+1}(x) ≥ A (so f_k(x) ≥ A), we get f_{k+1}(x) − A ≤ f_k(x) − A; and if f_{k+1}(x) < A < f_k(x), we get 0 ≤ f_k(x) − A. It follows that, for each A < ∞,

(3.1.141) ∫_R [f_k − (f_k)_A] dV ↘ α, as k → ∞.

However, for each δ > 0, there exists A = A(δ) < ∞ such that ∫_R [f_1 − (f_1)_A] dV ≤ δ. This forces α = 0, and proves Proposition 3.1.28. □

Applying Proposition 3.1.28 to f_k = g − g_k, we have the following.

Corollary 3.1.29. Assume g, g_k ∈ R^#(R). Then

(3.1.142) g_k(x) ↗ g(x) ∀ x ∈ R =⇒ ∫_R g_k dV ↗ ∫_R g dV.

Finally, we remove the support constraint.

Proposition 3.1.30. Assume g, g_k ∈ R^#(R^n). Then

(3.1.143) g_k(x) ↗ g(x) ∀ x ∈ R^n =⇒ ∫_{R^n} g_k dV ↗ ∫_{R^n} g dV.

Proof. Clearly

(3.1.144) ∫_{R^n} g_k dV ↗ c, and c ≤ ∫_{R^n} g dV.

Now, given ε > 0, there is a cell R ⊂ R^n such that

(3.1.145) ∫_{R^n \ R} (|g| + |g_1|) dV < ε,

and Corollary 3.1.29 gives

(3.1.146) ∫_R g_k dV ↗ ∫_R g dV.

We deduce that c ≥ ∫_{R^n} g dV − ε for all ε > 0, so (3.1.143) holds. □

In the Lebesgue theory of integration, there is a stronger result. Namely, if g_k are integrable on R^n and g_k(x) ↗ g(x) for each x, and if there is a uniform upper bound ∫_{R^n} g_k dx ≤ B < ∞, then g is integrable on R^n and the conclusion of (3.1.143) holds. Such a result can be found in [42].
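As a numerical illustration of (3.1.143) (our own example, not the text's): g(x) = x^{−1/2} on (0, 1] is unbounded but has finite improper integral 2, and the truncations g_k = min(g, k), of the sort used in the proof of Proposition 3.1.28, increase pointwise to g while their integrals increase to 2.

```python
# Monotone convergence sketch: g_k = min(x^{-1/2}, k) increases to
# g(x) = x^{-1/2} on (0, 1], and the integrals increase to 2.
# The exact value of the k-th integral is 2 - 1/k.

def midpoint_integral(f, a, b, n=50000):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

vals = []
for k in [1, 2, 4, 8, 16, 32]:
    gk = lambda x, k=k: min(x ** -0.5, k)
    vals.append(midpoint_integral(gk, 0.0, 1.0))

increasing = all(vals[i] <= vals[i + 1] for i in range(len(vals) - 1))
```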

Upper content and outer measure

Given a bounded set S ⊂ R^n, its upper content is defined in (3.1.13), and an equivalent characterization is given in (3.1.15). A related quantity is the outer measure of S, defined by

(3.1.147) m*(S) = inf { Σ_{k≥1} V(R_k) : R_k ⊂ R^n cells, S ⊂ ∪_{k≥1} R_k }.

The difference between (3.1.15) and (3.1.147) is that in (3.1.15) we require the cover of S by cells to be finite, while in (3.1.147) we allow any countable cover of S by cells. Clearly (3.1.147) is an inf over a larger collection of objects than (3.1.15), so

(3.1.148) m*(S) ≤ cont⁺(S).

We get the same result in (3.1.147) if we require

(3.1.149) S ⊂ ∪_{k≥1} R_k°,

where R_k° denotes the interior of R_k

(just expand each R_k by a factor of (1 + 2^{−k}ε)). Since any open cover of a compact set has a finite subcover (see Appendix A.1), it follows that

(3.1.150) S compact =⇒ m*(S) = cont⁺(S).

On the other hand, it is readily verified from (3.1.147) that

(3.1.151) S countable =⇒ m*(S) = 0.
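A cover realizing (3.1.151) can be written down explicitly: enumerate the countable set and give the k-th point a cell of volume ε/2^{k+1}, so the total volume of the cover is less than ε. The following sketch (ours) does this for the rationals in [0, 1] with bounded denominator; no finite subfamily of such intervals with total length below 1 can cover them, by density.

```python
# Covering the rationals in [0, 1]: the k-th rational gets an interval of
# length eps / 2^(k+1), so the countable cover has total length < eps.
from fractions import Fraction

def rationals_in_unit_interval(max_den):
    """All rationals p/q in [0, 1] with denominator q <= max_den."""
    seen = set()
    for q in range(1, max_den + 1):
        for p in range(0, q + 1):
            seen.add(Fraction(p, q))
    return sorted(seen)

eps = 1e-3
rats = rationals_in_unit_interval(30)
cover_length = sum(eps / 2 ** (k + 1) for k in range(len(rats)))
```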

For example, if R = {x ∈ R^n : 0 ≤ x_j ≤ 1, ∀ j}, then

(3.1.152) m*(R ∩ Q^n) = 0, but cont⁺(R ∩ Q^n) = 1,

the latter result by (3.1.16). We now establish the following integrability criterion, which sharpens Proposition 3.1.6.

Proposition 3.1.31. Let f : R → R be bounded, and let S ⊂ R be the set of points of discontinuity of f. Then

(3.1.153) m*(S) = 0 =⇒ f ∈ R(R).

Proof. Assume |f| ≤ M and pick ε > 0. Take a countable collection {R_k} of cells that are open (in R) such that S ⊂ ∪_{k≥1} R_k and Σ_{k≥1} V(R_k) < ε. Now f is continuous at each p ∈ R \ S, so there exists a cell R_p^#, open (in R), containing p, such that sup_{R_p^#} f − inf_{R_p^#} f < ε. Then {R_k : k ∈ N} ∪ {R_p^# : p ∈ R \ S} is an open cover of R. Since R is compact, there is a finite subcover, which we denote R_1, …, R_N, R_1^#, …, R_M^#. We have

(3.1.154) Σ_{k=1}^N V(R_k) < ε, and sup_{R_j^#} f − inf_{R_j^#} f < ε, ∀ j ∈ {1, …, M}.

Recall that R = I_1 × ⋯ × I_n is a product of n closed, bounded intervals. Also each cell R_k and R_j^# is a product of intervals. For each ν ∈ {1, …, n}, take the collection of all endpoints in the νth factor of each of these cells, and use these to form a partition of I_ν. Taking products yields a partition P of R. We can write

(3.1.155) P = {L_k : 1 ≤ k ≤ µ} = (∪_{k∈A} L_k) ∪ (∪_{k∈B} L_k),

where we say k ∈ A provided L_k is contained in a cell of the form R_j^# for some j ∈ {1, …, M}, as in (3.1.154). Consequently, if k ∈ B, then L_k ⊂ R_ℓ for some ℓ ∈ {1, …, N}, so

(3.1.156) ∪_{k∈B} L_k ⊂ ∪_{ℓ=1}^N R_ℓ.

We therefore have

(3.1.157) Σ_{k∈B} V(L_k) < ε, and sup_{L_j} f − inf_{L_j} f < ε, ∀ j ∈ A.

It follows that

(3.1.158) 0 ≤ Ī_P(f) − I̲_P(f) < 2M Σ_{k∈B} V(L_k) + ε Σ_{j∈A} V(L_j) < 2εM + εV(R).

Since ε can be taken arbitrarily small, this establishes that f ∈ R(R). 

Remark. The condition (3.1.153) is sharp. That is, given f : R → R bounded, f ∈ R(R) ⇔ m∗(S) = 0. Proofs of this can be found in standard measure theory texts, such as [42].

Exercises

1. Show that any two partitions of a cell R have a common refinement. Hint. Consider the argument given for the one-dimensional case in §1.1.

2. Write down a careful proof of the identity (3.1.16), i.e., cont⁺(S) = cont⁺(S̄), where S̄ denotes the closure of S.

3. Write down the details of the argument giving (3.1.25), on the indepen- dence of the integral from the choice of cell.

4. Write down a direct proof that the transformation formula (3.1.35) holds for those linear transformations of the form A1 and A2 in (3.1.36). Compare Exercise 1 of §1.1.

5. Consider spherical polar coordinates on R3, given by

x = ρ sin φ cos θ, y = ρ sin φ sin θ, z = ρ cos φ, 3.1. The Riemann integral in n variables 131

i.e., take G(ρ, φ, θ) = (ρ sin φ cos θ, ρ sin φ sin θ, ρ cos φ). Show that

det DG(ρ, φ, θ) = ρ² sin φ.

Use this to compute the volume of the unit ball in R³.
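Exercise 5 can be checked numerically; the sketch below (ours, not part of the exercise) approximates det DG by central differences at a sample point and recovers the volume 4π/3 of the unit ball by integrating ρ² sin φ with the midpoint rule.

```python
# Check det DG = rho^2 sin(phi) at a point, then integrate rho^2 sin(phi)
# over (0,1) x (0,pi) x (0,2*pi) to get the volume of the unit ball.
import math

def G(rho, phi, theta):
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def jacobian_det(p, h=1e-5):
    cols = []
    for j in range(3):              # derivative in the j-th variable
        dp, dm = list(p), list(p)
        dp[j] += h
        dm[j] -= h
        fp, fm = G(*dp), G(*dm)
        cols.append([(fp[i] - fm[i]) / (2 * h) for i in range(3)])
    # m[i][j] = dG_i / dx_j
    m = [[cols[j][i] for j in range(3)] for i in range(3)]
    return det3(m)

p = (0.7, 1.1, 2.0)
num = jacobian_det(p)
exact = p[0] ** 2 * math.sin(p[1])

# Midpoint-rule integration; the theta integral contributes a factor 2*pi.
n = 80
vol = 0.0
for i in range(n):
    rho = (i + 0.5) / n
    for j in range(n):
        phi = (j + 0.5) * math.pi / n
        vol += rho ** 2 * math.sin(phi) * (1 / n) * (math.pi / n) * (2 * math.pi)
```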

6. If B is the unit ball in R³, show that Theorem 3.1.9 implies

V(B) = 2 ∫_D √(1 − |x|²) dA(x),

where D = {x ∈ R² : |x| ≤ 1} is the unit disk. Use polar coordinates, as in (3.1.62)–(3.1.65), to compute this integral. Compare the result with that of Exercise 5.

7. Apply Corollary 3.1.13 and the answers to Exercises 5 and 6 to compute the volume of the ellipsoidal region in R³ defined by

x²/a² + y²/b² + z²/c² ≤ 1,

given a, b, c ∈ (0, ∞).

8. Prove Lemma 3.1.16.

9. If R is a cell, S ⊂ R is a contented set, and f ∈ R(R), we have, via Proposition 3.1.17,

∫_S f(x) dV(x) = ∫_R χ_S(x) f(x) dV(x).

Show that, if S_j ⊂ R are contented and disjoint (or, more generally, cont⁺(S_1 ∩ S_2) = 0), then, for f ∈ R(R),

∫_{S_1 ∪ S_2} f(x) dV(x) = ∫_{S_1} f(x) dV(x) + ∫_{S_2} f(x) dV(x).

10. Establish the convergence result (3.1.68), for all f ∈ R(Rn).

11. Take B = {x ∈ R^n : |x| ≤ 1/2}, and let f : B → R⁺. Assume f is continuous on B \ 0. Show that

f ∈ R^#(B) ⇔ ∫_{|x|>ε} f dV is bounded as ε ↘ 0.

12. With B ⊂ R^n as in Exercise 11, define q_b : B → R by

q_b(x) = 1 / (|x|^n |log |x||^b)

for x ≠ 0. Say q_b(0) = 0. Show that q_b ∈ R^#(B) ⇔ b > 1.

13. Show that

f(x) = |x|^{−a} e^{−|x|²} ∈ R^#(R^n) ⇐⇒ a < n.

14. Theorem 3.1.9, relating multiple integrals and iterated integrals, played the following role in the proof of the change of variable formula (3.1.47). Namely, it was used to establish the identity (3.1.50) for the volume of the parallelepiped Hα, via an appeal to Corollary 3.1.13, hence to Proposition 3.1.10, whose proof relied on Theorem 3.1.9. Try to establish Corollary 3.1.13 directly, without using Theorem 3.1.9, in the case when Σ is either a cell or the image of a cell under an element of Gl(n, R).

In preparation for the next three exercises, review the proof of Proposition 1.1.12.

15. Assume f ∈ R(R), |f| ≤ M, and let φ :[−M,M] → R be Lipschitz and monotone. Show directly from the definition that φ ◦ f ∈ R(R).

16. If φ :[−M,M] → R is continuous and piecewise linear, show that you can write φ = φ1 − φ2 with φj Lipschitz and monotone. Deduce that f ∈ R(R) ⇒ φ ◦ f ∈ R(R) when φ is piecewise linear.

17. Assume uν ∈ R(R) and that uν → u uniformly on R. Show that u ∈ R(R). Deduce that if f ∈ R(R), |f| ≤ M, and ψ :[−M,M] → R is continuous, then ψ ◦ f ∈ R(R).

18. Let R ⊂ R^n be a cell and let f, g : R → R be bounded. Show that

Ī(f + g) ≤ Ī(f) + Ī(g), I̲(f + g) ≥ I̲(f) + I̲(g).

Hint. Look at the proof of Proposition 1.1.1.

19. Let R ⊂ R^n be a cell and let f : R → R be bounded. Assume that for

each ε > 0, there exist bounded fε, gε such that

f = f_ε + g_ε, f_ε ∈ R(R), Ī(|g_ε|) ≤ ε.

Show that f ∈ R(R) and

∫_R f_ε dV −→ ∫_R f dV.

Hint. Use Exercise 18.

20. Use the result of Exercise 19 to produce another proof of Proposition 3.1.6.

21. Behind (3.1.45) is the assertion that if R is a cell, g is supported on K ⊂ R, and |g| ≤ M, then Ī(|g|) ≤ M cont⁺(K). Prove this. More generally, if g, h : R → R are bounded and |g| ≤ M, show that Ī(|gh|) ≤ M Ī(|h|).

22. Establish the following Fubini-type theorem, and compare it with Theorem 3.1.9.

Proposition 3.1.32. Let A ⊂ R^m and B ⊂ R^n be cells, and take f ∈ R(A × B). For x ∈ A, define f_x : B → R by f_x(y) = f(x, y). Define L_f, U_f : A → R by

L_f(x) = I̲(f_x), U_f(x) = Ī(f_x).

Then L_f and U_f belong to R(A), and

∫_{A×B} f dV = ∫_A L_f(x) dx = ∫_A U_f(x) dx.

Hint. Given ε > 0, use Proposition 3.1.11 to take φ, ψ ∈ PK(A × B) such that

φ ≤ f ≤ ψ, ∫ ψ dV − ∫ φ dV < ε.

With definitions of φ_x and ψ_x analogous to that of f_x, show that

∫_{A×B} φ dV = ∫_A (∫_B φ_x dy) dx ≤ I̲(L_f) ≤ Ī(U_f) ≤ ∫_A (∫_B ψ_x dy) dx = ∫_{A×B} ψ dV.

Deduce that I̲(L_f) = Ī(U_f), and proceed.

Exercises on row reduction and matrix products

We consider the following three types of row operations on an n × n matrix A = (a_{jk}). If σ is a permutation of {1, …, n}, let

ρ_σ(A) = (a_{σ(j)k}).

If c = (c_1, …, c_n), and all c_j are nonzero, set

µ_c(A) = (c_j^{−1} a_{jk}).

ε_{µνc}(A) = (b_{jk}), b_{νk} = a_{νk} − c a_{µk}, b_{jk} = a_{jk} for j ≠ ν.

We relate these operations to left multiplication by matrices P_σ, M_c, and E_{µνc}, defined by the following actions on the standard basis {e_1, …, e_n} of R^n:

P_σ e_j = e_{σ(j)}, M_c e_j = c_j e_j, and

E_{µνc} e_µ = e_µ + c e_ν, E_{µνc} e_j = e_j for j ≠ µ.

1. Show that

A = P_σ ρ_σ(A), A = M_c µ_c(A), A = E_{µνc} ε_{µνc}(A).

2. Show that P_σ^{−1} = P_{σ^{−1}}.

3. Show that, if µ ≠ ν, then E_{µνc} = P_σ^{−1} E_{21c} P_σ, for some permutation σ.

4. If B = ρ_σ(A) and C = µ_c(B), show that A = P_σ M_c C. Generalize this to other cases where a matrix C is obtained from a matrix A via a sequence of row operations.

5. If A is an invertible, real n × n matrix (i.e., A ∈ Gl(n, R)), then the rows of A form a basis of R^n. Use this to show that A can be transformed to the identity matrix via a sequence of row operations. Deduce that any A ∈ Gl(n, R) can be written as a finite product of matrices of the form P_σ, M_c, and E_{µνc}, hence as a finite product of matrices of the form listed in (3.1.36).
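The identities of Exercise 1 can be tested on a small example. The following sketch (ours, not part of the exercises) builds P_σ and M_c for n = 3 from the definitions above and verifies A = P_σ ρ_σ(A) and A = M_c µ_c(A), using plain nested lists; indices are 0-based rather than the text's 1-based convention.

```python
# Row operations as left multiplications: A = P_sigma rho_sigma(A) and
# A = M_c mu_c(A), verified for n = 3 (0-based indices).

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

n = 3
sigma = [1, 2, 0]          # the permutation j -> sigma(j)
c = [2.0, -1.0, 0.5]       # nonzero scalars (exact binary fractions)

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]

# The row operations: rho_sigma(A) = (a_{sigma(j) k}), mu_c(A) = (a_{jk}/c_j)
rho_sigma = [[A[sigma[j]][k] for k in range(n)] for j in range(n)]
mu_c = [[A[j][k] / c[j] for k in range(n)] for j in range(n)]

# P_sigma e_j = e_{sigma(j)}: column j of P_sigma is e_{sigma(j)}
P = [[1.0 if i == sigma[j] else 0.0 for j in range(n)] for i in range(n)]
# M_c e_j = c_j e_j: the diagonal matrix diag(c)
M = [[c[j] if i == j else 0.0 for j in range(n)] for i in range(n)]

ok_P = matmul(P, rho_sigma) == A
ok_M = matmul(M, mu_c) == A
```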

3.2. Surfaces and surface integrals

A smooth m-dimensional surface M ⊂ Rn is characterized by the following property. Given p ∈ M, there is a neighborhood U of p in M and a smooth map φ : O → U, from an open set O ⊂ Rm bijectively to U, with injective derivative at each point, and continuous inverse φ−1 : U → O. Such a map φ is called a coordinate chart on M. We call U ⊂ M a coordinate patch. If all such maps φ are smooth of class Ck, we say M is a surface of class Ck. In §4.3 we will define analogous notions of a Ck surface with boundary, and of a Ck surface with corners. There is an abstraction of the notion of a surface, namely the notion of a manifold, which we will discuss at the end of this section. Examples include projective spaces and other spaces obtained as quotients of surfaces. If φ : O → U is a Ck coordinate chart, such as described above, or more n k generally φ : O → R is a C map with injective derivative, and φ(x0) = p, we set

(3.2.1) TpM = Range Dφ(x0),

a linear subspace of R^n of dimension m, and we denote by N_pM its orthogonal complement. It is useful to consider the following map. Pick a linear isomorphism A : R^{n−m} → N_pM, and define

(3.2.2) Φ : O × R^{n−m} −→ R^n, Φ(x, z) = φ(x) + Az.

Thus Φ is a C^k map defined on an open subset of R^n. Note that

(3.2.3) DΦ(x_0, 0)(v, w) = Dφ(x_0)v + Aw,

so DΦ(x_0, 0) : R^n → R^n is surjective, hence bijective, so the Inverse Function Theorem applies; Φ maps some neighborhood of (x_0, 0) diffeomorphically onto a neighborhood of p ∈ R^n. Suppose there is another C^k coordinate chart, ψ : Ω → U. Since φ and ψ are by hypothesis one-to-one and onto, it follows that F = ψ^{−1} ∘ φ : O → Ω is a well-defined map, which is one-to-one and onto. See Figure 3.2.1. Also F and F^{−1} are continuous. In fact, we can say more.

Lemma 3.2.1. Under the hypotheses above, F is a Ck diffeomorphism.

Proof. It suffices to show that F and F^{−1} are C^k on a neighborhood of x_0 and y_0, respectively, where φ(x_0) = ψ(y_0) = p. Let us define a map Ψ in a fashion similar to (3.2.2). To be precise, we set T̃_pM = Range Dψ(y_0), and let Ñ_pM be its orthogonal complement. (Shortly we will show that

[Figure 3.2.1. Coordinate charts]

T̃_pM = T_pM, but we are not quite ready for that.) Then pick a linear isomorphism B : R^{n−m} → Ñ_pM and consider

Ψ : Ω × R^{n−m} −→ R^n, Ψ(y, z) = ψ(y) + Bz.

Again, Ψ is a C^k diffeomorphism from a neighborhood of (y_0, 0) onto a neighborhood of p. To be precise, there exist neighborhoods Õ of (x_0, 0) in O × R^{n−m}, Ω̃ of (y_0, 0) in Ω × R^{n−m}, and Ũ of p in R^n such that

Φ : Õ −→ Ũ and Ψ : Ω̃ −→ Ũ

are C^k diffeomorphisms. It follows that Ψ^{−1} ∘ Φ : Õ → Ω̃ is a C^k diffeomorphism. Now note that, for (x, 0) ∈ Õ and (y, 0) ∈ Ω̃,

(3.2.4) Ψ^{−1} ∘ Φ(x, 0) = (F(x), 0), Φ^{−1} ∘ Ψ(y, 0) = (F^{−1}(y), 0).

In fact, to verify the first identity in (3.2.4), we check that Ψ(F (x), 0) = ψ(F (x)) + B0 = ψ(ψ−1 ◦ φ(x)) = φ(x) = Φ(x, 0). The identities in (3.2.4) imply that F and F −1 have the desired regularity. 

Thus, when there are two such coordinate charts, φ : O → U, ψ : Ω → U, we have a C^k diffeomorphism F : O → Ω such that

(3.2.5) φ = ψ ∘ F.

By the chain rule,

(3.2.6) Dφ(x) = Dψ(y) DF(x), y = F(x).

In particular this implies that Range Dφ(x0) = Range Dψ(y0), so TpM in (3.2.1) is independent of the choice of coordinate chart. It is called the tangent space to M at p.

Remark. An application of the inverse function theorem related to the proof of Lemma 3.2.1 can be used to show that if O ⊂ R^m is open, m < n, and φ : O → R^n is a C^k map such that Dφ(p) : R^m → R^n is injective (p ∈ O), then there is a neighborhood Õ of p in O such that the image of Õ under φ is a C^k surface in R^n. Compare Exercise 11 in §2.2.

We next define an object called the metric tensor on M. Given a coordinate chart φ : O → U, there is associated an m × m matrix G(x) = (g_{jk}(x)) of functions on O, defined in terms of the inner product of vectors tangent to M:

(3.2.7) g_{jk}(x) = Dφ(x)e_j · Dφ(x)e_k = (∂φ/∂x_j) · (∂φ/∂x_k) = Σ_{ℓ=1}^n (∂φ_ℓ/∂x_j)(∂φ_ℓ/∂x_k),

where {e_j : 1 ≤ j ≤ m} is the standard orthonormal basis of R^m. Equivalently,

(3.2.8) G(x) = Dφ(x)^t Dφ(x).

We call (g_{jk}) the metric tensor of M, on U, with respect to the coordinate chart φ : O → U. Note that this matrix is positive-definite. From a coordinate-independent point of view, the metric tensor on M specifies inner products of vectors tangent to M, using the inner product of R^n.
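To make (3.2.7)–(3.2.8) concrete, here is a numerical sketch (our own example, not the text's) for the chart φ(x_1, x_2) = (sin x_1 cos x_2, sin x_1 sin x_2, cos x_1) on the unit sphere S² ⊂ R³: it forms G = (Dφ)^t(Dφ) by central differences and checks that G = diag(1, sin² x_1), hence √g = sin x_1.

```python
# Metric tensor of a sphere chart, computed numerically from (3.2.8):
# G(x) = Dphi(x)^t Dphi(x), with Dphi approximated by central differences.
import math

def phi(x1, x2):
    return (math.sin(x1) * math.cos(x2),
            math.sin(x1) * math.sin(x2),
            math.cos(x1))

def metric_tensor(x1, x2, h=1e-6):
    def d(j):
        p, q = [x1, x2], [x1, x2]
        p[j] += h
        q[j] -= h
        fp, fq = phi(*p), phi(*q)
        return [(fp[i] - fq[i]) / (2 * h) for i in range(3)]
    cols = [d(0), d(1)]            # the two columns of Dphi
    # g_{jk} = (column j) . (column k)
    return [[sum(cols[j][i] * cols[k][i] for i in range(3))
             for k in range(2)] for j in range(2)]

x1, x2 = 0.9, 2.3
G = metric_tensor(x1, x2)
sqrt_g = math.sqrt(G[0][0] * G[1][1] - G[0][1] * G[1][0])
```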

If we take another coordinate chart ψ : Ω → U, we want to compare (g_{jk}) with H = (h_{jk}), given by

(3.2.9) h_{jk}(y) = Dψ(y)e_j · Dψ(y)e_k, i.e., H(y) = Dψ(y)^t Dψ(y).

As seen above, we have a diffeomorphism F : O → Ω such that (3.2.5)–(3.2.6) hold. Consequently,

(3.2.10) G(x) = DF(x)^t H(y) DF(x), for y = F(x),

or equivalently,

(3.2.11) g_{jk}(x) = Σ_{i,ℓ} (∂F_i/∂x_j)(∂F_ℓ/∂x_k) h_{iℓ}(y).

We now define the notion of the surface integral on M. If f : M → R is a continuous function supported on U, we set

(3.2.12) ∫_M f dS = ∫_O f ∘ φ(x) √g(x) dx,

where

(3.2.13) g(x) = det G(x).

We need to know that this is independent of the choice of coordinate chart φ : O → U. Thus, if we use ψ : Ω → U instead, we want to show that (3.2.12) is equal to ∫_Ω f ∘ ψ(y) √h(y) dy, where h(y) = det H(y). Indeed, since f ∘ ψ ∘ F = f ∘ φ, we can apply the change of variable formula, Theorem 3.1.15, to get

(3.2.14) ∫_Ω f ∘ ψ(y) √h(y) dy = ∫_O f ∘ φ(x) √h(F(x)) |det DF(x)| dx.

Now, (3.2.10) implies that

(3.2.15) √g(x) = |det DF(x)| √h(y),

so the right side of (3.2.14) is seen to be equal to (3.2.12), and our surface integral is well defined, at least for f supported in a coordinate patch. More generally, if f : M → R has compact support, write it as a finite sum of terms, each supported on a coordinate patch, and use (3.2.12) on each patch. Using (3.1.11), one readily verifies that

(3.2.16) ∫_M (f_1 + f_2) dS = ∫_M f_1 dS + ∫_M f_2 dS,

if f_j : M → R are continuous functions with compact support. Let us pause to consider the special cases m = 1 and m = 2. For m = 1, we are considering a curve in R^n, say φ : [a, b] → R^n. Then G(x) is a 1 × 1 matrix, namely G(x) = |φ′(x)|². If we denote the curve in R^n by γ, rather than M, the formula (3.2.12) becomes the integral

(3.2.17) ∫_γ f ds = ∫_a^b f ∘ φ(x) |φ′(x)| dx.

In case m = 2, let us consider a surface M ⊂ R³, with a coordinate chart φ : O → U ⊂ M. For f supported in U, an alternative way to write the surface integral is

(3.2.18) ∫_M f dS = ∫_O f ∘ φ(x) |∂_1φ × ∂_2φ| dx_1 dx_2,

where u × v is the cross product of vectors u and v in R³. To see this, we compare this integrand with the one in (3.2.12). In this case,

(3.2.19) g = det ( ∂_1φ·∂_1φ  ∂_1φ·∂_2φ ; ∂_2φ·∂_1φ  ∂_2φ·∂_2φ ) = |∂_1φ|² |∂_2φ|² − (∂_1φ · ∂_2φ)².

Recall from (1.4.45) that |u × v| = |u| |v| |sin θ|, where θ is the angle between u and v. Equivalently, since u · v = |u| |v| cos θ,

(3.2.20) |u × v|² = |u|²|v|²(1 − cos² θ) = |u|²|v|² − (u · v)².

Thus we see that |∂_1φ × ∂_2φ| = √g in this case, and (3.2.18) is equivalent to (3.2.12).

An important class of surfaces is the class of graphs of smooth functions. Let u ∈ C¹(Ω), for an open Ω ⊂ R^{n−1}, and let M be the graph of z = u(x). The map φ(x) = (x, u(x)) provides a natural coordinate system, in which the metric tensor formula (3.2.7) becomes

(3.2.21) g_{jk}(x) = δ_{jk} + (∂u/∂x_j)(∂u/∂x_k).

If u is C¹, we see that g_{jk} is continuous. To calculate g = det(g_{jk}) at a given point p ∈ Ω, if ∇u(p) ≠ 0, rotate coordinates so that ∇u(p) is parallel to the x_1 axis. We obtain

(3.2.22) √g = (1 + |∇u|²)^{1/2}.

(See Exercise 31 for another take on this formula.) In particular, the (n−1)-dimensional volume of the surface M is given by

(3.2.23) V_{n−1}(M) = ∫_M dS = ∫_Ω (1 + |∇u(x)|²)^{1/2} dx.

Particularly important examples of surfaces are the unit spheres S^{n−1} in R^n,

(3.2.24) S^{n−1} = {x ∈ R^n : |x| = 1}.
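Formula (3.2.23) can be tested on a concrete graph (our own example, not the text's): for u(x) = x_1² + x_2² over the unit disk, passing to polar coordinates turns the area integral into ∫_0^{2π} ∫_0^1 (1 + 4r²)^{1/2} r dr dθ, which has the closed form (π/6)(5^{3/2} − 1).

```python
# Area of the graph of u(x) = x1^2 + x2^2 over the unit disk, via (3.2.23)
# in polar coordinates, compared with the closed-form antiderivative
# (1/12)(1 + 4r^2)^{3/2}.
import math

n = 2000
area = 0.0
for i in range(n):
    r = (i + 0.5) / n
    area += math.sqrt(1 + 4 * r * r) * r * (1 / n) * (2 * math.pi)

exact = (math.pi / 6) * (5 ** 1.5 - 1)
```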

Spherical polar coordinates on R^n are defined in terms of a smooth diffeomorphism

(3.2.25) R : (0, ∞) × S^{n−1} −→ R^n \ 0, R(r, ω) = rω.

Let (h_{ℓm}) denote the metric tensor on S^{n−1} (induced from its inclusion in R^n) with respect to some coordinate chart φ : O → U ⊂ S^{n−1}. Then we have a coordinate chart Φ : (0, ∞) × O → U ⊂ R^n given by Φ(r, y) = rφ(y). Take y_0 = r, y = (y_1, …, y_{n−1}). In the coordinate system Φ, the Euclidean metric tensor (e_{jk}) is given by

e_{00} = ∂_0Φ · ∂_0Φ = φ(y) · φ(y) = 1,

e_{0j} = ∂_0Φ · ∂_jΦ = φ(y) · ∂_jφ(y) = 0, 1 ≤ j ≤ n − 1,

e_{jk} = r² ∂_jφ · ∂_kφ = r² h_{jk}, 1 ≤ j, k ≤ n − 1.

The fact that φ(y) · ∂_jφ(y) = 0 follows by applying ∂/∂y_j to the identity φ(y) · φ(y) ≡ 1. To summarize,

(3.2.26) (e_{jk}) = ( 1  0 ; 0  r² (h_{ℓm}) ), a block diagonal matrix.

Now (3.2.26) yields

(3.2.27) √e = r^{n−1} √h.

We therefore have the following result for integrating a function in spherical polar coordinates:

(3.2.28) ∫_{R^n} f(x) dx = ∫_{S^{n−1}} [∫_0^∞ f(rω) r^{n−1} dr] dS(ω).

We next compute the (n−1)-dimensional area A_{n−1} of the unit sphere S^{n−1} ⊂ R^n, using (3.2.28) together with the computation

(3.2.29) ∫_{R^n} e^{−|x|²} dx = π^{n/2},

from (3.1.75). First note that, whenever f(x) = φ(|x|), (3.2.28) yields

(3.2.30) ∫_{R^n} φ(|x|) dx = A_{n−1} ∫_0^∞ φ(r) r^{n−1} dr.

In particular, taking φ(r) = e^{−r²} and using (3.2.29), we have

(3.2.31) π^{n/2} = A_{n−1} ∫_0^∞ e^{−r²} r^{n−1} dr = (1/2) A_{n−1} ∫_0^∞ e^{−s} s^{n/2−1} ds,

where we used the substitution s = r² to get the last identity. We hence have

(3.2.32) A_{n−1} = 2π^{n/2} / Γ(n/2),

where Γ(z) is Euler's Gamma function, defined for z > 0 by

(3.2.33) Γ(z) = ∫_0^∞ e^{−s} s^{z−1} ds.

We need to complement (3.2.32) with some results on Γ(z) allowing a computation of Γ(n/2) in terms of more familiar quantities. Of course, setting z = 1 in (3.2.33), we immediately get

(3.2.34) Γ(1) = 1.

Also, setting n = 1 in (3.2.31), we have

π^{1/2} = 2 ∫_0^∞ e^{−r²} dr = ∫_0^∞ e^{−s} s^{−1/2} ds,

or

(3.2.35) Γ(1/2) = π^{1/2}.

We can proceed inductively from (3.2.34)–(3.2.35) to a formula for Γ(n/2) for any n ∈ Z⁺, using the following.

Lemma 3.2.2. For all z > 0,

(3.2.36) Γ(z + 1) = zΓ(z).

Proof. We can write

Γ(z + 1) = −∫_0^∞ (d/ds)(e^{−s}) s^z ds = ∫_0^∞ e^{−s} (d/ds)(s^z) ds,

the last identity by integration by parts. The last integral here is seen to equal the right side of (3.2.36). □
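The recursion (3.2.36), together with (3.2.34)–(3.2.35), determines Γ(n/2), and (3.2.32) then gives the sphere areas. A quick numerical check (ours, using the standard-library Gamma function):

```python
# Sphere areas from (3.2.32), A_{n-1} = 2 pi^{n/2} / Gamma(n/2),
# plus a spot-check of the recursion (3.2.36).
import math

def sphere_area(n):
    """(n-1)-dimensional area of S^{n-1} in R^n, by (3.2.32)."""
    return 2 * math.pi ** (n / 2) / math.gamma(n / 2)

A1 = sphere_area(2)   # circumference of the unit circle in R^2
A2 = sphere_area(3)   # area of the unit sphere in R^3
z = 2.7
recursion_gap = abs(math.gamma(z + 1) - z * math.gamma(z))
```

The values A_1 = 2π and A_2 = 4π agree with (3.2.38) for k = 1.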

Consequently, for k ∈ Z⁺,

(3.2.37) Γ(k) = (k − 1)!, Γ(k + 1/2) = (k − 1/2)(k − 3/2) ⋯ (1/2) π^{1/2}.

Thus (3.2.32) can be rewritten

(3.2.38) A_{2k−1} = 2π^k / (k − 1)!, A_{2k} = 2π^k / [(k − 1/2)(k − 3/2) ⋯ (1/2)].

We discuss another important example of a smooth surface, in the space M(n, R) ≈ R^{n²} of real n × n matrices, namely SO(n), the set of matrices T ∈ M(n, R) satisfying T^t T = I and det T > 0 (hence det T = 1). The exponential map Exp : M(n, R) → M(n, R) defined by Exp(A) = e^A has the property

(3.2.39) Exp : Skew(n) −→ SO(n),

where Skew(n) is the set of skew-symmetric matrices in M(n, R). As seen in (2.2.28)–(2.2.30),

(3.2.40) D Exp(0)Y = Y, ∀ Y ∈ M(n, R),

and hence the Inverse Function Theorem implies that there is a ball Ω centered at 0 in M(n, R) that is mapped diffeomorphically by Exp onto a neighborhood Ω̃ of I in M(n, R). From the identities Exp X^t = (Exp X)^t, Exp(−X) = (Exp X)^{−1}, we see that, given X ∈ Ω, A = Exp X ∈ Ω̃,

A ∈ SO(n) ⇐⇒ X ∈ Skew(n).

Thus there is a neighborhood O of 0 in Skew(n) that is mapped by Exp diffeomorphically onto a smooth surface U ⊂ M(n, R), of dimension m = n(n − 1)/2. Furthermore, U is a neighborhood of I in SO(n). For general T ∈ SO(n), we can define maps

(3.2.41) φ_T : O −→ SO(n), φ_T(A) = T Exp(A),

and obtain coordinate charts in SO(n), which is consequently a smooth surface of dimension n(n − 1)/2 in M(n, R). Note that SO(n) is a closed bounded subset of M(n, R); hence it is compact. We use the inner product on M(n, R) computed componentwise; equivalently,

(3.2.42) ⟨A, B⟩ = Tr (B^t A) = Tr (B A^t).

This produces a metric tensor on SO(n). The surface integral over SO(n) has the following important invariance property.

Proposition 3.2.3. Given f ∈ C(SO(n)), if we set

(3.2.43) ρ_T f(X) = f(XT), λ_T f(X) = f(TX),

for T, X ∈ SO(n), we have

(3.2.44) ∫_{SO(n)} ρ_T f dS = ∫_{SO(n)} λ_T f dS = ∫_{SO(n)} f dS.

Proof. Given T ∈ SO(n), the maps R_T, L_T : M(n, R) → M(n, R) defined by R_T(X) = XT, L_T(X) = TX are easily seen from (3.2.42) to be isometries. Thus they yield maps of SO(n) to itself which preserve the metric tensor, proving (3.2.44). □

Since SO(n) is compact, its total volume V(SO(n)) = ∫_{SO(n)} 1 dS is finite. We define the integral with respect to "Haar measure"

(3.2.45) ∫_{SO(n)} f(g) dg = (1 / V(SO(n))) ∫_{SO(n)} f dS.

This is used in many arguments involving "averaging over rotations." Examples of such averaging arise in §§5.1, 5.3, and 7.4. See also §6.4 for generalizations to other matrix groups.
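For n = 2, Haar integration over SO(2) reduces to averaging over the rotation angle, and the invariance (3.2.44) can be observed directly. A sketch (our own example, with an arbitrarily chosen test function f):

```python
# Haar averaging over SO(2): the average of f(X), f(TX), and f(XT) over the
# rotation angle coincide, illustrating the invariance (3.2.44).
import math

def R(t):
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def f(X):
    # an arbitrary test function of the matrix entries
    return X[0][0] ** 2 + X[0][1]

def haar_average(g, n=2000):
    # midpoint rule over one full period of the rotation angle
    return sum(g(R((i + 0.5) * 2 * math.pi / n)) for i in range(n)) / n

T = R(0.813)  # a fixed rotation
avg = haar_average(f)
avg_left = haar_average(lambda X: f(matmul(T, X)))
avg_right = haar_average(lambda X: f(matmul(X, T)))
```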

Extended notion of coordinates

Basic calculus as developed in this text so far has involved maps from one Euclidean space to another, of the type F : R^n → R^m. It is convenient and useful to extend our setting to F : V → W, where V and W are general finite-dimensional real vector spaces. There is the following notion of the derivative. Let V and W be as above, and let Ω ⊂ V be open. We say F : Ω → W is differentiable at x ∈ Ω provided there exists a linear map L : V → W such that, for y ∈ V small,

(3.2.46) F(x + y) = F(x) + Ly + r(x, y),

with r(x, y) → 0 faster than y → 0, i.e.,

(3.2.47) ∥r(x, y)∥ / ∥y∥ −→ 0 as y → 0.

For this to be meaningful, we need norms on V and W. Often these norms come from inner products. See Appendix A.2 for a discussion of inner product spaces. If (3.2.46)–(3.2.47) hold, we set DF(x) = L, and call the linear map DF(x) : V −→ W the derivative of F at x. We say F is C¹ if DF(x) is continuous in x. Notions of F in C^k are produced in analogy with the situation in §2.1. Of course, we can reduce all this to the setting of §2.1 by picking bases of V and W.

Often such V and W arise as linear subspaces of R^n, such as T_pM in (3.2.1), or V = N_pM, mentioned right below that. As noted there, we can take a linear isomorphism of such V with R^k for some k, and keep working in the context of maps between such standard Euclidean spaces, as in (3.2.2). However, it can be convenient to avoid this distraction, and, for example, replace (3.2.2) by

(3.2.48) Φ : O × N_pM −→ R^n, Φ(x, z) = φ(x) + z,

and (3.2.3) by

(3.2.49) DΦ(x_0, 0)(v, w) = Dφ(x_0)v + w.

In order to carry out Lemma 3.2.1 in this setting, we want the following version of the Inverse Function Theorem.

Proposition 3.2.4. Let V and W be real vector spaces, each of dimension n. Let F be a C^k map from an open neighborhood Ω of p_0 ∈ V to W, with q_0 = F(p_0), k ≥ 1. Assume the derivative

DF(p_0) : V → W is an isomorphism.

Then there exist a neighborhood U of p_0 and a neighborhood Ũ of q_0 such that F : U → Ũ is one-to-one and onto, and F^{−1} : Ũ → U is a C^k map.

While Proposition 3.2.4 is apparently an extension of Theorem 2.2.1, there is no extra work required to prove it. One can simply take linear isomorphisms A : R^n → V and B : R^n → W and apply Theorem 2.2.1 to the map G(x) = B^{−1}F(Ax). Thus Proposition 3.2.4 is not a technical improvement of Theorem 2.2.1, but it is a useful conceptual extension. With this in mind, we can define the notion of an m-dimensional surface M ⊂ V (an n-dimensional vector space) as follows. Take a vector space W, of dimension m. Given p ∈ M, we require there to be a neighborhood U of p in M and a smooth map φ : O → U, from an open set O ⊂ W bijectively to U, with an injective derivative at each point. We call such a map a coordinate chart. If all such maps are smooth of class C^k, we say M is a surface of class C^k. As a further wrinkle, we could take different vector spaces W_p for different p ∈ M, as long as they all have dimension m. The reader is invited to formulate the appropriate modification of Lemma 3.2.1 in this setting.

Submersions

Let V and W be finite-dimensional real vector spaces, Ω ⊂ V open, and F : Ω → W a C^k map, k ≥ 1. We say F is a submersion provided that, for each x ∈ Ω, DF(x) : V → W is surjective. (This requires dim V ≥ dim W.) We establish the following Submersion Mapping Theorem, which the reader might recognize as a variant of the Implicit Function Theorem. In the statement, ker T denotes the null space ker T = {v ∈ V : Tv = 0}, if T : V → W is a linear transformation.

Proposition 3.2.5. With V, W, and Ω ⊂ V as above, assume F : Ω → W is a C^k map, k ≥ 1. Fix p ∈ W, and consider

(3.2.50) S = {x ∈ V : F(x) = p}.

Assume that, for each x ∈ S, DF(x) : V → W is surjective. Then S is a C^k surface in Ω. Furthermore, for each x ∈ S,

(3.2.51) TxS = ker DF (x).

Proof. Given q ∈ S, set Kq = ker DF (q) and define

(3.2.52) G_q : V −→ W ⊕ K_q, G_q(x) = (F(x), P_q(x − q)),

where P_q is a projection of V onto K_q. Note that

(3.2.53) G_q(q) = (F(q), 0) = (p, 0).

Also

(3.2.54) DG_q(x) = (DF(x), P_q), x ∈ V.

We claim that

(3.2.55) DG_q(q) = (DF(q), P_q) : V → W ⊕ K_q is an isomorphism.

This is a special case of the following general observation.

Lemma 3.2.6. If A : V → W is a surjective linear map and P is a projection of V onto ker A, then

(3.2.56) (A, P) : V −→ W ⊕ ker A is an isomorphism.

We postpone the proof of this lemma and proceed with the proof of Proposition 3.2.5. Having (3.2.55), we can apply the Inverse Function Theorem (Proposition 3.2.4) to obtain a neighborhood U of q in V and a neighborhood O of (p, 0) in W ⊕ K_q such that G_q : U → O is bijective, with C^k inverse

(3.2.57) G_q^{−1} : O −→ U, G_q^{−1}(p, 0) = q.

By (3.2.52), given x ∈ U,

(3.2.58) x ∈ S ⇐⇒ G_q(x) = (p, v), for some v ∈ K_q.

Hence S ∩ U is the image under the C^k diffeomorphism G_q^{−1} of O ∩ {(p, v) : v ∈ K_q}. Hence S is smooth of class C^k and dim T_qS = dim K_q. It follows from the chain rule that T_qS ⊂ K_q, so the dimension count yields T_qS = K_q. This proves Proposition 3.2.5.

Note that we have the following coordinate chart on a neighborhood of q ∈ S:

(3.2.59) ψ_q(v) = G_q^{−1}(p, v), ψ_q : Ω_q → S,

where Ω_q is a neighborhood of 0 in T_qS = K_q = ker DF(q).

It remains to prove Lemma 3.2.6. Indeed, given that A : V → W is surjective, the fundamental theorem of linear algebra implies dim V = dim(W ⊕ ker A), and it is clear that (A, P ) in (3.2.56) is injective, so the isomorphism property follows.

Remark. In case V = R^n and W = R, DF(x) is typically denoted ∇F(x), the hypothesis on DF(x) becomes ∇F(x) ≠ 0, and (3.2.51) is equivalent to the assertion that dim S = n − 1 and, for x ∈ S,

(3.2.60) ∇F(x) ⊥ T_xS.

Compare the discussion following Proposition 2.2.6. We illustrate Proposition 3.2.5 with another proof that

(3.2.61) SO(n) ⊂ M(n, R)

is a smooth surface, different from the argument involving (3.2.39)–(3.2.41). To get this, we take

(3.2.62) V = M(n, R), W = {A ∈ M(n, R) : A = A^t},

and

(3.2.63) F : V −→ W, F(X) = X^t X.

Now, given X, Y ∈ V, Y small,

(3.2.64) F(X + Y) = X^t X + X^t Y + Y^t X + O(∥Y∥²),

so

(3.2.65) DF(X)Y = X^t Y + Y^t X.

We claim that

(3.2.66) X ∈ SO(n) =⇒ DF(X) : M(n, R) → W is surjective.

Indeed, given A ∈ W, i.e., A ∈ M(n, R) and A^t = A, and X ∈ SO(n), we have

(3.2.67) Y = (1/2) XA =⇒ DF(X)Y = A.

This establishes (3.2.66), so Proposition 3.2.5 applies. Again we conclude that SO(n) is a smooth surface in M(n, R).
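The computation (3.2.64)–(3.2.67) is easy to confirm numerically. The sketch below (ours) takes X ∈ SO(2) and a symmetric A, and checks that Y = (1/2)XA satisfies DF(X)Y = X^tY + Y^tX = A, which is the surjectivity claim (3.2.66) in action.

```python
# Verify (3.2.67): for F(X) = X^t X, DF(X)Y = X^t Y + Y^t X, and
# Y = (1/2) X A hits any symmetric A when X is a rotation.
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

t = 0.6
X = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]  # X in SO(2)
A = [[1.0, 2.0], [2.0, -3.0]]                                  # A = A^t

Y = [[0.5 * v for v in row] for row in matmul(X, A)]           # Y = (1/2) X A
DF = add(matmul(transpose(X), Y), matmul(transpose(Y), X))     # X^tY + Y^tX

gap = max(abs(DF[i][j] - A[i][j]) for i in range(2) for j in range(2))
```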

Riemann integrable functions on a surface

Let M ⊂ R^n be an m-dimensional surface, smooth of class C¹. We define the class R_c(M) of compactly supported Riemann integrable functions as follows, guided by Proposition 3.1.11. If f : M → R is bounded and has compact support, we set

(3.2.68) Ī(f) = inf { ∫_M g dS : g ∈ C_c(M), g ≥ f }, I̲(f) = sup { ∫_M h dS : h ∈ C_c(M), h ≤ f },

where C_c(M) denotes the set of continuous functions on M with compact support. Then

(3.2.69) f ∈ R_c(M) ⇐⇒ I̲(f) = Ī(f),

and if such is the case, we denote the common value by ∫_M f dS. It follows readily from the definition and arguments produced in §3.1 that

(3.2.70) f_1, f_2 ∈ R_c(M) ⇒ f_1 + f_2 ∈ R_c(M) and ∫_M (f_1 + f_2) dS = ∫_M f_1 dS + ∫_M f_2 dS.

In fact, using (3.2.16) for functions that are continuous on M with compact support, one obtains from the definition (3.2.68) that, if f_j : M → R are bounded and have compact support,

Ī(f_1 + f_2) ≤ Ī(f_1) + Ī(f_2), I̲(f_1 + f_2) ≥ I̲(f_1) + I̲(f_2),

which yields (3.2.70). Also one can modify the proof of Proposition 3.1.17 to show that

(3.2.71) f ∈ Rc(M), u ∈ C(M) =⇒ uf ∈ Rc(M).

Furthermore, if φ : O → U ⊂ M is a coordinate chart and f ∈ R_c(U), then an application of Proposition 3.1.11 gives

(3.2.72) f ∘ φ ∈ R_c(O), and ∫_M f dS = ∫_O f(φ(x)) √g(x) dx,

with g(x) as in (3.2.12)–(3.2.13). Given any f ∈ R_c(M), we can take a continuous partition of unity {u_j}, write f = Σ_j f_j = Σ_j u_j f, and use (3.2.70)–(3.2.72) to express ∫_M f dS as a sum of integrals over coordinate charts. If Σ ⊂ M has compact closure, then

(3.2.73) cont⁺ Σ = Ī(χ_Σ),

and Σ is contented if and only if χ_Σ ∈ R_c(M). In such a case, (3.2.73) is the area of Σ. Given f : M → R, bounded and compactly supported, in parallel with (3.1.39) we say

(3.2.74) f ∈ C_c(M) ⟺ the set Σ of points of discontinuity of f satisfies cont⁺ Σ = 0.

We have

(3.2.75) C_c(M) ⊂ R_c(M),

and (again parallel to Proposition 3.1.11) if f : M → R is bounded and compactly supported,

(3.2.76) Ī(f) = inf {∫_M g dS : g ∈ C_c(M), g ≥ f}, I(f) = sup {∫_M h dS : h ∈ C_c(M), h ≤ f}.

One can proceed from here to define the spaces

(3.2.77) R(M), R#(M),

and establish properties of functions in these spaces, in analogy with work in §3.1 on R(R^n) and R#(R^n). We leave such an investigation to the reader.

Vector fields and flows on surfaces

Let M ⊂ R^n be a smooth, m-dimensional surface. A smooth vector field X on M (sometimes called a tangent vector field) is a smooth map

(3.2.78) X : M → R^n such that X(p) ∈ T_pM, ∀ p ∈ M.

If φ : O → U ⊂ M is a coordinate chart, then there is a unique smooth vector field X_φ : O → R^m such that

(3.2.79) X(φ(x)) = Dφ(x) X_φ(x).

The vector field X generates a flow F^t_X on M, satisfying

(3.2.80) F^t_X(φ(x)) = φ(F^t_{X_φ}(x)),

at least for small |t|. As before, we have the defining property

(3.2.81) (d/dt) F^t_X(x) = X(F^t_X(x)).

Note also that

(3.2.82) F^{s+t}_X(x) = F^t_X ∘ F^s_X(x).

With slight abuse of notation we will use the same symbol X for the vector field X on M and for the associated vector field X_φ on a coordinate patch.

Valuable information on the behavior of the flow F^t_X can be obtained by investigating the t-derivative of

(3.2.83) v_t(x) = v(F^t_X(x)),

given v ∈ C¹_0(M), i.e., v is of class C¹ and vanishes outside some compact subset of M. In fact, we take v ∈ C¹_0(U), and identify this with v ∈ C¹_0(O), with O ⊂ R^m as above. The chain rule plus (3.2.81) yields

(3.2.84) (d/dt) v_t(x) = X(F^t_X(x)) · ∇v(F^t_X(x)).

In particular,

(3.2.85) (d/ds) v(F^s_X(x))|_{s=0} = X(x) · ∇v(x).

Here ∇v is the gradient of v, given by ∇v = (∂v/∂x_1, …, ∂v/∂x_m). A useful alternative formula to (3.2.84) is

(3.2.86) (d/dt) v_t(x) = (d/ds) v_t(F^s_X(x))|_{s=0} = X(x) · ∇v_t(x),

the first equality following from (3.2.82) and the second from (3.2.85), with v replaced by v_t. One significant consequence of (3.2.86), which will lead to the formula (3.2.91) below, is that, for v ∈ C¹_0(O),

(3.2.87) (d/dt) ∫_O v(F^t_X(x)) √g dx = ∫_O X(x) · ∇v_t(x) √g dx = −∫_O div X(x) v(F^t_X(x)) √g dx.

Here, div X(x) is the divergence of the vector field X(x) = (X_1(x), …, X_m(x)), defined (in local coordinates) by

(3.2.88) div X(x) = g^{-1/2} Σ_j (∂/∂x_j)(g^{1/2} X_j(x)).

The last equality in (3.2.87) follows by integration by parts,

∫_O G_k(x) (∂v_t/∂x_k) dx = −∫_O v_t(x) (∂G_k/∂x_k) dx, G_k(x) = √g X_k(x),

followed by summation over k. We restate (3.2.87) in global terms:

(3.2.89) (d/dt) ∫_M v(F^t_X(x)) dV = −∫_M div X(x) v(F^t_X(x)) dV.

So far, we have (3.2.89) for v ∈ C¹_0(M). We extend this to less regular functions. First, note that (3.2.89) implies

(3.2.90) ∫_M v(F^t_X(x)) dV − ∫_M v(x) dV = −∫_0^t ∫_M div X(x) v(F^s_X(x)) dV ds.

Basic results on the integral allow one to pass from v ∈ C¹_0(M) in (3.2.90) to more general v, including v = χ_Ω (the characteristic function of Ω, defined to be equal to 1 on Ω and 0 on M ∖ Ω), for smoothly bounded compact Ω ⊂ M. In more detail, if Ω ⊂ M is a compact, smoothly bounded subset, let B_δ = {x ∈ M : dist(x, Ω) ≤ δ}. There exists δ_0 > 0 such that B_δ ⊂ M for δ ∈ (0, δ_0]. For such δ, one can produce v_δ ∈ C¹_0(M) such that

v_δ = 1 on Ω, 0 ≤ v_δ ≤ 1, v_δ = 0 on M ∖ B_δ.

Then

|∫_M χ_Ω(x) dV − ∫_M v_δ(x) dV| ≤ Vol(B_δ ∖ Ω) → 0, as δ → 0,

so, as δ → 0,

∫_M v_δ(x) dV → ∫_M χ_Ω(x) dV.

Similar arguments give

∫_M v_δ(F^t_X(x)) dV → ∫_M χ_Ω(F^t_X(x)) dV,

and

∫_0^t ∫_M div X(x) v_δ(F^s_X(x)) dV ds → ∫_0^t ∫_M div X(x) χ_Ω(F^s_X(x)) dV ds.

These results allow one to take v = χ_Ω in (3.2.90). Now one can pass from (3.2.90) back to (3.2.89), via the fundamental theorem of calculus. Note that

Vol F^t_X(Ω) = ∫_M χ_Ω(F^{-t}_X(x)) dV.

We can apply (3.2.89), with t replaced by −t, and v by χ_Ω, and deduce the following.

Proposition 3.2.7. If X is a C¹ vector field, generating the flow F^t_X, well defined on M for t ∈ I, and Ω ⊂ M is compact and smoothly bounded, then, for t ∈ I,

(3.2.91) (d/dt) Vol F^t_X(Ω) = ∫_{F^t_X(Ω)} div X(x) dV.

This result is behind the notation div X, i.e., the divergence of X. Vector fields with positive divergence generate flows F^t_X that magnify volumes as t increases, while vector fields with negative divergence generate flows that shrink volumes as t increases. We will see more of the divergence operation on vector fields in Sections 4.4 and 5.2.
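To see Proposition 3.2.7 in the simplest case, consider a linear vector field X(x) = Ax on flat M = R², with flow F^t_X(x) = e^{tA}x. Then div X = tr A is constant, Vol F^t_X(Ω) = e^{t tr A} Vol Ω, and (3.2.91) reduces to (d/dt) Vol F^t_X(Ω) = (tr A) Vol F^t_X(Ω). The following sketch (numpy; an illustration, not from the text) verifies this identity by finite differences.

```python
import numpy as np

# Linear field X(x) = Ax on R^2: div X = tr A, and the flow e^{tA} scales
# volumes by det(e^{tA}) = e^{t tr A}.
A = np.array([[0.3, 1.0],
              [0.0, -0.1]])
trA = np.trace(A)
vol0 = 2.5                       # volume of some initial region Omega

def vol(t):
    """Vol F^t_X(Omega) = e^{t tr A} Vol(Omega)."""
    return np.exp(t * trA) * vol0

t, h = 0.7, 1e-6
lhs = (vol(t + h) - vol(t - h)) / (2 * h)   # d/dt Vol F^t_X(Omega)
rhs = trA * vol(t)                          # = integral of div X over F^t_X(Omega)
assert abs(lhs - rhs) < 1e-6
```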

Projective spaces, quotient surfaces, and manifolds

Real projective space P^{n-1} is obtained from the sphere S^{n-1} by identifying each pair of antipodal points:

(3.2.92) P^{n-1} = S^{n-1}/∼,

where

(3.2.93) x ∼ y ⟺ x = ±y, for x, y ∈ S^{n-1} ⊂ R^n.

More generally, if M ⊂ R^n is an m-dimensional surface, smooth of class C^k, satisfying

(3.2.94) 0 ∉ M, x ∈ M ⟹ −x ∈ M,

we define

(3.2.95) P(M) = M/∼,

using the equivalence relation (3.2.93). Note that M has the metric space structure d(x, y) = ∥x − y∥, and then P(M) becomes a metric space with metric

(3.2.96) d([x], [y]) = min {d(x′, y′) : x′ ∈ [x], y′ ∈ [y]},

or, in view of (3.2.93),

(3.2.97) d([x], [y]) = min {d(x, y), d(x, −y)}.

Here, x ∈ M and [x] ∈ P(M) is its associated equivalence class. The map x ↦ [x] is a continuous map

(3.2.98) ρ : M → P(M).

It has the following readily established property.

Lemma 3.2.8. Each p ∈ P(M) has an open neighborhood U ⊂ P(M) such that ρ^{-1}(U) = U_0 ∪ U_1 is the disjoint union of two open subsets of M, and, for j = 0, 1, ρ : U_j → U is a homeomorphism, i.e., it is continuous, one-to-one, and onto, with continuous inverse.

Given p ∈ P(M), {p_0, p_1} = ρ^{-1}(p), let U_0 be a neighborhood of p_0 in M for which there is a C^k coordinate chart φ_0 : O → U_0 (O ⊂ R^m open). Then φ_1(x) = −φ_0(x) gives a coordinate chart φ_1 : O → U_1 onto a neighborhood U_1 of p_1 ∈ M. If U_0 is picked small enough, U_0 and U_1 are disjoint. The projection ρ maps U_0 and U_1 homeomorphically onto a neighborhood U of p in P(M), and we have “coordinate charts”

(3.2.99) ρ ∘ φ_j : O → U.

In fact, ρ ∘ φ_1 = ρ ∘ φ_0. If ψ_0 : Ω → U_0 is another C^k coordinate chart, then, as in Lemma 3.2.1, we have a C^k diffeomorphism F : O → Ω such that ψ_0 ∘ F = φ_0. Similarly ψ_1 ∘ F = φ_1, with ψ_1(x) = −ψ_0(x), and we have ρ ∘ ψ_j ∘ F = ρ ∘ φ_j. The structure just placed on the “quotient surface” P(M) makes it a manifold, an object we now define.

Given a metric space X, we say X has the structure of a C^k manifold of dimension m provided the following conditions hold. First, for each p ∈ X, we have an open neighborhood U_p of p in X, an open set O_p ⊂ R^m, and a homeomorphism

(3.2.100) φp : Op −→ Up.

Next, if also q ∈ X and U_pq = U_p ∩ U_q ≠ ∅, then the homeomorphism from O_pq = φ_p^{-1}(U_pq) to O_qp = φ_q^{-1}(U_pq),

(3.2.101) F_pq = φ_q^{-1} ∘ φ_p|_{O_pq},

is a C^k diffeomorphism. As before, we call the maps φ_p : O_p → U_p ⊂ X coordinate charts on X.

A metric tensor on a C^k manifold X is defined by positive-definite, symmetric m × m matrices G_p ∈ C^{k-1}(O_p), satisfying the compatibility condition

(3.2.102) G_p(x) = DF_pq(x)^t G_q(y) DF_pq(x),

for

(3.2.103) x ∈ O_pq ⊂ O_p, y = F_pq(x) ∈ O_qp ⊂ O_q.

We then set

(3.2.104) g_p = det G_p ∈ C^{k-1}(O_p),

satisfying

(3.2.105) √g_p(x) = |det DF_pq(x)| √g_q(y),

for x and y as in (3.2.103). If f : X → R is a continuous function supported in U_p, we set

(3.2.106) ∫_X f dS = ∫_{O_p} f(φ_p(x)) √g_p(x) dx.

As in (3.2.14)–(3.2.15), this leads to a well defined integral ∫_X f dS for f ∈ C_c(X), obtained by writing f as a finite sum of continuous functions supported on various coordinate patches U_p. From here we can develop the class of functions R_c(X) and their integrals over X, in a fashion parallel to that done above when X is a surface in R^n.

The quotient surfaces P(M) are examples of C^k manifolds as defined above. They get natural metric tensors with the property that ρ in (3.2.98) is a local isometry. In such a case,

(3.2.107) ∫_{P(M)} f dS = (1/2) ∫_M f ∘ ρ dS.

Another important quotient manifold is the “flat torus”

(3.2.108) T^n = R^n/Z^n.

Here the equivalence relation on R^n is x ∼ y ⟺ x − y ∈ Z^n. Natural local coordinates on T^n are given by the projection ρ : R^n → T^n, restricted to sufficiently small open sets in R^n. The quotient T^n gets a natural metric tensor for which ρ is a local isometry.

Given two C^k manifolds X and Y, a continuous map ψ : X → Y is said to be smooth of class C^k provided that for each p ∈ X, there are neighborhoods U of p and Ũ of q = ψ(p), and coordinate charts φ_1 : O → U, φ_2 : Õ → Ũ, such that φ_2^{-1} ∘ ψ ∘ φ_1 : O → Õ is a C^k map. We say ψ is a C^k diffeomorphism if it is one-to-one and onto and ψ^{-1} : Y → X is a C^k map. If X is a C^k manifold and M ⊂ R^n a C^k surface, a C^k diffeomorphism ψ : X → M is called a C^k embedding of X into R^n.

Here is an embedding of T^n into R^{2n}:

(3.2.109) ψ(x) = Σ_{j=1}^n (cos 2πx_j) e_j + Σ_{j=1}^n (sin 2πx_j) e_{n+j}.

A priori, ψ : R^n → R^{2n}, but ψ(x) = ψ(y) whenever x − y ∈ Z^n, so this naturally induces a smooth map T^n → R^{2n}, which can be seen to be an embedding.
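As a numerical companion to (3.2.109) (a sketch, not from the text), one can check that ψ is invariant under integer translations, so it descends to T^n, and that it separates representatives of distinct classes:

```python
import numpy as np

def psi(x):
    """The map (3.2.109): R^n -> R^{2n}, coordinates (cos 2pi x_j, sin 2pi x_j)."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([np.cos(2 * np.pi * x), np.sin(2 * np.pi * x)])

x = np.array([0.2, 0.77])
k = np.array([3.0, -5.0])                  # an integer translation
assert np.allclose(psi(x), psi(x + k))     # psi descends to T^n = R^n/Z^n

y = np.array([0.2, 0.5])                   # a different equivalence class
assert not np.allclose(psi(x), psi(y))     # distinct classes separate
```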

If M ⊂ Rn is an m-dimensional surface satisfying (3.2.94), an embedding of P(M) into M(n, R) can be constructed via the map (3.2.110) ψ : Rn −→ M(n, R), ψ(x) = xxt.

Note that

(3.2.111) x = (x_1, …, x_n)^t ⟹ xx^t is the n × n matrix with (j,k) entry x_j x_k.

We need a couple of lemmas.

Lemma 3.2.9. For ψ as in (3.2.110), x, y ∈ R^n,

(3.2.112) ψ(x) = ψ(y) ⟺ x = ±y.

Proof. The map ψ is characterized by ψ(x)e_j = x_j x, where x is as in (3.2.111) and {e_j} is the standard basis of R^n. It follows that if x ≠ 0, ψ(x) has exactly one nonzero eigenvalue, namely |x|², and ψ(x)x = |x|²x. Thus ψ(x) = ψ(y) implies that |x|² = |y|² and that x and y are parallel. Thus x = ay and a = ±1. □

Lemma 3.2.10. In the setting of Lemma 3.2.9, if x ≠ 0,

(3.2.113) Dψ(x) : R^n → M(n,R) is injective.

Proof. A calculation gives

(3.2.114) Dψ(x)v = xv^t + vx^t.

Thus, if v ∈ ker Dψ(x),

(3.2.115) xv^t = −vx^t.

Both sides are rank 1 elements of M(n,R). The range of the left side is spanned by x and that of the right side is spanned by v, so v = ax for some a ∈ R. Then (3.2.115) becomes

(3.2.116) axx^t = −axx^t,

which implies a = 0 if x ≠ 0. □

Remark. Here is a refinement of Lemma 3.2.10. Using the inner product on M(n,R) given by (3.2.42), we can calculate

(3.2.117) ⟨Dψ(x)v, Dψ(x)v⟩ = 2(|x|²|v|² + (x·v)²).
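The eigenvalue claim in the proof of Lemma 3.2.9 and the identity (3.2.117) can both be checked numerically. The sketch below assumes the inner product (3.2.42) on M(n,R) is ⟨A,B⟩ = Tr(B^tA) (an assumption on our part, since (3.2.42) is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
x = rng.standard_normal(n)
v = rng.standard_normal(n)

P = np.outer(x, x)                   # psi(x) = x x^t

# Exactly one nonzero eigenvalue, |x|^2, with eigenvector x; psi(x) = psi(-x):
evals = np.sort(np.linalg.eigvalsh(P))
assert np.allclose(evals[:-1], 0) and np.isclose(evals[-1], x @ x)
assert np.allclose(P @ x, (x @ x) * x)
assert np.allclose(np.outer(-x, -x), P)

# (3.2.117), with <A,B> = Tr(B^t A) assumed as the inner product on M(n,R):
Dpsi_v = np.outer(x, v) + np.outer(v, x)      # Dpsi(x)v = x v^t + v x^t
lhs = np.trace(Dpsi_v.T @ Dpsi_v)
rhs = 2 * ((x @ x) * (v @ v) + (x @ v) ** 2)
assert np.isclose(lhs, rhs)
```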

Lemmas 3.2.9 and 3.2.10 imply that if M ⊂ R^n is an m-dimensional surface satisfying (3.2.94), then ψ|_M yields an embedding of P(M) into M(n,R). Denote the image surface by M^#. As we see from (3.2.117), this embedding is not typically an isometry. However, if M = S^{n-1} and v is tangent to S^{n-1} at x, then v · x = 0, and (3.2.117) implies that in this case the embedding of P^{n-1} into M(n,R) is an isometry, up to a factor of 2.

It is the case that if X is any C^k manifold that is a countable union of compact sets, then X can be embedded into R^n for some n. In case X is compact, this is not very hard to prove, using local coordinate charts and smooth cutoffs, and the interested reader might take a crack at it. If X is provided with a metric tensor, this embedding might not preserve this metric tensor. If it does, one calls it an isometric embedding. It is the case that any such manifold has an isometric embedding into R^n for some n (if k is sufficiently large). This result is the famous Nash embedding theorem, and its proof is quite difficult. For X compact and C^∞, a proof is given in Chapter 14 of [41].

Polar decomposition of matrices

We define the spaces Sym(n) and P(n) by Sym(n) = {A ∈ M(n, R): A = At}, (3.2.118) P(n) = {A ∈ Sym(n): x · Ax > 0, ∀ x ∈ Rn \ 0}.

It is easy to show that P(n) is an open, convex subset of the linear space Sym(n). We aim to prove the following result.

Proposition 3.2.11. Given A ∈ Gl+(n, R), there exist unique U ∈ SO(n) and Q ∈ P(n) such that

(3.2.119) A = UQ.

The representation (3.2.119) is called the polar decomposition of A. Note that

(3.2.120) (UQ)^t UQ = Q U^t U Q = Q²,

so if the identity (3.2.119) were to hold, we would have

(3.2.121) AtA = Q2.

Note also that

(3.2.122) A ∈ Gl(n,R) ⟹ A^tA ∈ P(n),

since x · A^tAx = (Ax) · (Ax) = |Ax|². To prove Proposition 3.2.11, we bring in the following basic result of linear algebra. See Appendix A.3.

Proposition 3.2.12. Given B ∈ Sym(n), there is an orthonormal basis of R^n consisting of eigenvectors of B, with eigenvalues λ_j ∈ R. Equivalently, there exists V ∈ SO(n) such that

(3.2.123) B = VDV^{-1},

with

(3.2.124) D = diag(λ_1, …, λ_n), λ_j ∈ R.

If B ∈ P(n), then each λ_j > 0. We can then set

(3.2.125) Q = V diag(λ_1^{1/2}, …, λ_n^{1/2}) V^{-1},

and obtain the following.

Corollary 3.2.13. Given B ∈ P(n), there is a unique Q ∈ P(n) satisfying

(3.2.126) Q² = B.

We say Q = B^{1/2}.
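Corollary 3.2.13 translates directly into a computation: diagonalize B and take positive square roots of the eigenvalues, as in (3.2.125). A minimal numpy sketch (not from the text):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
C = rng.standard_normal((n, n))
B = C.T @ C + np.eye(n)              # a concrete element of P(n)

lam, V = np.linalg.eigh(B)           # B = V diag(lam) V^t, lam_j > 0
assert np.all(lam > 0)
Q = V @ np.diag(np.sqrt(lam)) @ V.T  # Q = B^{1/2}, as in (3.2.125)

assert np.allclose(Q @ Q, B)         # Q^2 = B, as in (3.2.126)
assert np.allclose(Q, Q.T) and np.all(np.linalg.eigvalsh(Q) > 0)
```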

To obtain the decomposition (3.2.119), we set (3.2.127) Q = (AtA)1/2,U = AQ−1. Note that (3.2.128) U tU = Q−1AtAQ−1 = Q−1Q2Q−1 = I, and (det U)(det Q) = det A > 0, so det U > 0, and hence U ∈ SO(n), as desired. By (3.2.121) and Corollary 3.2.13, the factor Q ∈ P(n) in (3.2.119) is unique, and hence so is the factor U. We can use Proposition 3.2.11 to prove the following.

Proposition 3.2.14. The set Gl+(n, R) is connected. In fact, given A ∈ Gl+(n, R), there is a smooth path γ : [0, 1] → Gl+(n, R) such that γ(0) = I and γ(1) = A.

Proof. To start, we have that

(3.2.129) Exp : Skew(n) → SO(n) is onto.

See Exercise 14 below for this (or Corollary A.3.9). Hence, with A = UQ as in (3.2.119), we have a smooth path α(t) = Exp(tS), α : [0,1] → SO(n), such that α(0) = I and α(1) = U. Since P(n) is a convex subset of Sym(n), we can take β(t) = (1 − t)I + tQ, obtaining a smooth path β : [0,1] → P(n), such that β(0) = I and β(1) = Q. Then

(3.2.130) γ(t) = α(t)β(t)

does the trick. □
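The polar decomposition (3.2.127) and the path (3.2.130) can be assembled concretely. The sketch below (numpy/scipy; an illustration, under the assumption that scipy's `logm` returns the real skew logarithm of a 2×2 rotation different from −I) computes U and Q for a sample A ∈ Gl⁺(2,R) and checks that γ(t) = Exp(tS)((1 − t)I + tQ) runs from I to A through Gl⁺(2,R):

```python
import numpy as np
from scipy.linalg import expm, logm

A = np.array([[2.0, 1.0],
              [0.5, 1.5]])
assert np.linalg.det(A) > 0          # A is in Gl^+(2,R)

# Polar factors, as in (3.2.127): Q = (A^t A)^{1/2}, U = A Q^{-1}.
lam, V = np.linalg.eigh(A.T @ A)
Q = V @ np.diag(np.sqrt(lam)) @ V.T
U = A @ np.linalg.inv(Q)

S = np.real(logm(U))                 # skew generator with Exp(S) = U
assert np.allclose(S, -S.T, atol=1e-10)

def gamma(t):
    """The path (3.2.130): alpha(t) beta(t) = Exp(tS)((1-t)I + tQ)."""
    return expm(t * S) @ ((1 - t) * np.eye(2) + t * Q)

assert np.allclose(gamma(0.0), np.eye(2))
assert np.allclose(gamma(1.0), A)
# the path stays in Gl^+(2,R):
assert all(np.linalg.det(gamma(t)) > 0 for t in np.linspace(0, 1, 50))
```

Convexity of P(2) guarantees det((1 − t)I + tQ) > 0, and det Exp(tS) = 1, so positivity of det γ(t) is automatic.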

Exercises

1. The following exercise deals with parametrizing a curve by arc length.
(a) Let γ : [a,b] → R^n be a C^1 curve, and assume γ′(t) ≠ 0 for t ∈ [a,b]. By (3.2.17), the length of the segment γ([a,t]) is

ℓ(t) = ∫_a^t |γ′(x)| dx.

We have a C^1 map ℓ : [a,b] → [0,L] (where L = ℓ(b)), satisfying ℓ′(t) = |γ′(t)| > 0. Hence (by the 1D inverse function theorem) there is a C^1 inverse, ℓ^{-1} : [0,L] → [a,b]. We set

σ(s) = γ(ℓ^{-1}(s)), s ∈ [0,L].

Show that σ : [0,L] → R^n is a C^1 curve and |σ′(s)| ≡ 1. We say that σ is obtained from γ by parametrization by arc length.

(b) Consider the circle C = {(x,y) ∈ R² : x² + y² = 1}. Show that C has a parametrization by arc length, say γ, such that γ(0) = (1,0) and γ′(0) = (0,1). Let us say γ(t) = (c(t), s(t)).
Remark. The curve segment γ([0,t]) has length t. In trigonometry, the line segments from (0,0) to (1,0) = γ(0) and from (0,0) to (c(t), s(t)) = γ(t) are said to meet at an angle, measured in radians, equal to the length of this curve, i.e., to t radians. Then the geometric definition of the trigonometric functions cos t and sin t yields

cos t = c(t), sin t = s(t).

(c) Consult the auxiliary exercises on trigonometric functions in §2.3, and deduce from (2.3.111) that cos t and sin t, defined above, are identical with cos t and sin t as they appear in (2.3.108). For a related approach, see Exercises 11–13 below.

(d) Taking π as characterized below (2.3.112), show that the arc length of the unit circle C is equal to 2π (reiterating the case n = 2 of (3.2.32)).

2. Compute the volume of the unit ball B^n = {x ∈ R^n : |x| ≤ 1}.
Hint. Apply (3.2.30) with φ = χ_{[0,1]}.
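For a quick check of the answer to Exercise 2 (a sketch, not from the text): the method of the hint leads to the closed form Vol(B^n) = π^{n/2}/Γ(n/2 + 1), and a seeded Monte Carlo estimate agrees with it for n = 3:

```python
import math
import numpy as np

# Monte Carlo estimate of Vol(B^n) as the fraction of the cube [-1,1]^n
# landing inside the unit ball, compared with pi^{n/2}/Gamma(n/2 + 1).
rng = np.random.default_rng(3)
n, N = 3, 200_000
pts = rng.uniform(-1.0, 1.0, size=(N, n))
inside = np.sum(np.sum(pts ** 2, axis=1) <= 1.0)
vol_mc = (2.0 ** n) * inside / N

vol_exact = math.pi ** (n / 2) / math.gamma(n / 2 + 1)   # = 4*pi/3 for n = 3
assert abs(vol_mc - vol_exact) / vol_exact < 0.02
```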

3. Taking the upper half of the sphere S^n to be the graph of x_{n+1} = (1 − |x|²)^{1/2}, for x ∈ B^n, the unit ball in R^n, deduce from (3.2.23) and (3.2.30) that

A_n = 2A_{n-1} ∫_0^1 r^{n-1}(1 − r²)^{-1/2} dr = 2A_{n-1} ∫_0^{π/2} (sin θ)^{n-1} dθ.

Use this to get an alternative derivation of the formula (3.2.38) for A_n.
Hint. Rewrite this formula as

A_n = A_{n-1} b_{n-1}, b_k = ∫_0^π sin^k θ dθ.

To analyze b_k, you can write, on the one hand,

b_{k+2} = b_k − ∫_0^π sin^k θ cos² θ dθ,

and on the other, using integration by parts,

b_{k+2} = ∫_0^π cos θ (d/dθ)(sin^{k+1} θ) dθ.

Deduce that

b_{k+2} = ((k+1)/(k+2)) b_k.
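The recursion in the hint is easy to confirm by quadrature (a numerical sketch, not from the text):

```python
import numpy as np

# Midpoint-rule check of b_{k+2} = ((k+1)/(k+2)) b_k for
# b_k = \int_0^pi sin^k(theta) dtheta.
N = 100_000
h = np.pi / N
theta = (np.arange(N) + 0.5) * h        # midpoints of N subintervals

def b(k):
    return np.sum(np.sin(theta) ** k) * h

for k in range(6):
    assert abs(b(k + 2) - (k + 1) / (k + 2) * b(k)) < 1e-6

# sample values: b_1 = 2, b_2 = pi/2
assert abs(b(1) - 2.0) < 1e-6 and abs(b(2) - np.pi / 2) < 1e-6
```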

4. Suppose M is a surface in R^n of dimension 2, and φ : O → U ⊂ M is a coordinate chart, with O ⊂ R². Set φ_jk(x) = (φ_j(x), φ_k(x)), so φ_jk : O → R². Show that the formula (3.2.12) for the surface integral is equivalent to

∫_M f dS = ∫_O f ∘ φ(x) √(Σ_{j<k} (det Dφ_jk(x))²) dx.

5. If M is an m-dimensional surface, φ : O → U ⊂ M a coordinate chart, for J = (j_1, …, j_m) set

φ_J(x) = (φ_{j_1}(x), …, φ_{j_m}(x)), φ_J : O → R^m.

Show that the formula (3.2.12) is equivalent to

∫_M f dS = ∫_O f ∘ φ(x) √(Σ_{j_1<⋯<j_m} (det Dφ_J(x))²) dx.

6. Let M be the graph in R^{n+1} of x_{n+1} = u(x), x ∈ O ⊂ R^n. Show that, for p = (x, u(x)) ∈ M, T_pM (given as in (3.2.1)) has a 1-dimensional orthogonal complement N_pM, spanned by (−∇u(x), 1). We set N = (1 + |∇u|²)^{-1/2}(−∇u, 1), and call it the (upward-pointing) unit normal to M.

7. Let M be as in Exercise 6, and define N as done there. Show that, for a continuous function f : M → R^{n+1},

∫_M f · N dS = ∫_O f(x, u(x)) · (−∇u(x), 1) dx.

The left side is often denoted ∫_M f · dS.

8. Let M be a 2-dimensional surface in R³, covered by a single coordinate chart, φ : O → M. Suppose f : M → R³ is continuous. Show that, if ∫_M f · dS is defined as in Exercise 7, then

∫_M f · dS = ∫_O f(φ(x)) · (∂_1φ × ∂_2φ) dx.

9. Consider a symmetric n × n matrix A = (ajk) of the form ajk = vjvk. Show that the range of A is the one-dimensional space spanned by v = (v1, . . . , vn) (if this is nonzero). Deduce that A has exactly one nonzero eigenvalue, namely λ = |v|2. Use this to give another derivation of (3.2.22) from (3.2.21). Hint. Show that Aej = vjv, for each j.

10. Let Ω ⊂ R^n be open and u : Ω → R be a C^k map. Fix c ∈ R and consider S = {x ∈ Ω : u(x) = c}. Assume S ≠ ∅ and that ∇u(x) ≠ 0 for all x ∈ S. As seen after Proposition 3.2.5, S is a C^k surface of dimension n − 1, and, for each p ∈ S, T_pS has a 1-dimensional orthogonal complement N_pS spanned by ∇u(p). Assume now that there is a C^k map φ : O → R, with O ⊂ R^{n-1} open, such that u(x′, φ(x′)) = c, and that x′ ↦ (x′, φ(x′)) parametrizes S. Show that

∫_S f dS = ∫_O f (|∇u|/|∂_n u|) dx′,

where the functions in the integrand on the right are evaluated at (x′, φ(x′)).
Hint. Compare the formula in Exercise 6 for N with the fact that N = ∇u/|∇u|, and keep in mind the formula (3.2.23).

In the next exercises, we study Exp tJ = e^{tJ}, where

J = ( 0 −1
      1  0 ).

11. Show that if v ∈ R², then

(d/dt)∥e^{tJ}v∥² = 2 e^{tJ}v · J e^{tJ}v = 0,

and deduce that ∥e^{tJ}v∥ = ∥v∥ for all v ∈ R², t ∈ R.

12. Define c(t) and s(t) by

e^{tJ} (1, 0)^t = (c(t), s(t))^t.

Show that the identity (d/dt)e^{tJ} = J e^{tJ} implies c′(t) = −s(t), s′(t) = c(t). Deduce that (c(t), s(t)) is a unit speed curve, starting at (c(0), s(0)) = (1,0), with initial velocity (c′(0), s′(0)) = (0,1), and tracing out the unit circle x² + y² = 1 in R². See Figure 3.2.2. Compare the derivation of (2.3.110)–(2.3.112).

13. Using Exercise 12 and (3.2.17), show that for t > 0, the curve γ : [0,t] → R² given by γ(τ) = (c(τ), s(τ)) has length t. As stated in Exercise 1, in trigonometry the line segments from (0,0) to (1,0) and from (0,0) to (c(t), s(t)) are said to meet at an angle, measured in radians, equal to the length of this curve, i.e., to t radians. Then the geometric definitions of the trigonometric functions cos t and sin t yield

(3.2.131) cos t = c(t), sin t = s(t).

Deduce that

(3.2.132) e^{tJ} (1, 0)^t = (cos t, sin t)^t,

and from this, using e^{tJ}J = Je^{tJ}, that

(3.2.133) e^{tJ} = ( cos t  −sin t
                     sin t   cos t ) = (cos t)I + (sin t)J.

Figure 3.2.2. Unit circle

Compare (2.3.123). However, the derivation of (3.2.133) here is completely independent of the one in §2.3, which used (2.3.106)–(2.3.108). It shows that the definition of cos t and sin t given here is equivalent to that given in §2.3, and again leads to Euler’s formula (2.3.108).

14. The following result in linear algebra is established in Proposition A.3.8 of Appendix A.3.

Proposition. If A : R^n → R^n is orthogonal, so A^tA = I, then R^n has an orthonormal basis in which the matrix representation of A consists of blocks

( c_j  −s_j
  s_j   c_j ),  c_j² + s_j² = 1,

plus perhaps an identity matrix block if 1 is an eigenvalue of A, and a block that is −I if −1 is an eigenvalue of A.

Use this and (3.2.133) to prove that

(3.2.134) Exp : Skew(n) → SO(n) is onto.

Figure 3.2.3. Inner tube


In the next exercise, T denotes the “inner tube” obtained as follows. Take the circle in the (y, z)-plane, centered at y = a, z = 0, of radius b, with 0 < b < a. Rotate this circle about the z-axis. Then T is the surface so swept out. See Figure 3.2.3.

15. Define ψ : R2 → R3 by ψ(θ, φ) = (x(θ, φ), y(θ, φ), z(θ, φ)), with

x(θ, φ) = (a + b cos φ) cos θ, y(θ, φ) = (a + b cos φ) sin θ, z(θ, φ) = b sin φ.

Show that ψ maps [0, 2π] × [0, 2π] onto T . Show that |∂θψ × ∂φψ| = b(a + b cos φ). Using (3.2.18), show that

Area T = 4π²ab.
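A quick numerical check of the area formula in Exercise 15 (a sketch, not from the text): integrate |∂_θψ × ∂_φψ| = b(a + b cos φ) over [0,2π]² by the midpoint rule and compare with 4π²ab.

```python
import numpy as np

a, b = 2.0, 0.5                      # sample torus radii, 0 < b < a
N = 400
h = 2 * np.pi / N
phi = (np.arange(N) + 0.5) * h       # midpoints in the phi variable

# The theta-integral of b(a + b cos phi) contributes a factor 2*pi:
area = 2 * np.pi * np.sum(b * (a + b * np.cos(phi))) * h

assert abs(area - 4 * np.pi ** 2 * a * b) < 1e-8
```

(The cosine term sums to zero over a full period of equispaced midpoints, so the quadrature is exact here up to floating point.)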

16. In the setting of Exercise 15, compute the following integrals.

∫_T x² dS, ∫_T y² dS, ∫_T z² dS.

In the next exercise, M is a surface of revolution, obtained by taking the graph of y = f(x), a ≤ x ≤ b (assuming f > 0) and rotating it about the x-axis, in R³.

17. Define ψ : [a,b] × R → R³ by

ψ(s,t) = (s, f(s) cos t, f(s) sin t).

Show that ψ maps [a,b] × [0, 2π] onto M.
Show that |∂_sψ × ∂_tψ| = f(s)√(1 + f′(s)²). Using (3.2.18), show that if u : M → R is continuous,

∫_M u dS = ∫_0^{2π} ∫_a^b u(s, f(s) cos t, f(s) sin t) f(s)√(1 + f′(s)²) ds dt.

In particular,

Area M = 2π ∫_a^b f(s)√(1 + f′(s)²) ds.

18. Consider the ellipsoid of revolution E_a, given for a > 0 by

x² + y² + z²/a² = 1.

Use the method of Exercise 17 to show that

Area E_a = 4π ∫_0^a √(1 − βs²) ds, β = 1/a² − 1/a⁴.

19. Given a, b, c > 0, consider the ellipsoid E(a,b,c), given by

x²/a² + y²/b² + z²/c² = 1.

Using (3.2.23), write down a formula for the area of E(a,b,c) as an integral over the region

E_{a,b} = {(x,y) ∈ R² : x²/a² + y²/b² ≤ 1}.

20. Consider the parabolic curve

γ(t) = (t, t²/2).

Show that the length of γ([0,x]) is

ℓ(x) = ∫_0^x √(1 + t²) dt.

Evaluate this integral using the substitution t = sinh u.

21. Let M be the surface of revolution obtained by taking the graph of the function y = e^x, a ≤ x ≤ b, and rotating it about the x-axis in R³. Show that Exercise 17 yields

Area M = 2π ∫_a^b e^s √(1 + e^{2s}) ds.

Taking t = e^s, show that this is equal to

2π ∫_α^β √(1 + t²) dt, α = e^a, β = e^b.

Relate this to Exercise 20.

In the next exercise, M is an n-dimensional “surface of revolution” given by a smooth map ψ :[a, b] × Sn−1 −→ M ⊂ R × Rn of the form ψ(s, ω) = (s, f(s)ω). A coordinate chart φ :Ω → Sn−1, with Ω open in Rn−1, gives rise to a coordinate chart ψ˜ :[a, b] × Ω −→ M, ψ˜(s, x) = (s, f(s)φ(x)).

We set x0 = s, x = (x1, . . . , xn−1).

22. Show that, in ψ̃-coordinates, the metric tensor of M takes the form (g_jk), for 0 ≤ j, k ≤ n − 1, with the following components:

g_00 = (∂ψ̃/∂x_0) · (∂ψ̃/∂x_0) = 1 + f′(s)²,
g_0j = (∂ψ̃/∂x_0) · (∂ψ̃/∂x_j) = 0, for 1 ≤ j ≤ n − 1,
g_jk = (∂ψ̃/∂x_j) · (∂ψ̃/∂x_k) = f(s)² h_jk, for 1 ≤ j, k ≤ n − 1,

where (h_ℓm) is the metric tensor of S^{n-1} in the φ-coordinates. Otherwise said, (g_jk) is the block diagonal matrix

(g_jk) = ( 1 + f′(s)²       0
              0        f(s)² (h_ℓm) ).

Compare (3.2.26).

23. In the setting of Exercise 22, deduce that if u : M → R is continuous,

∫_M u dS = ∫_a^b ∫_{S^{n-1}} u(s, f(s)ω) f(s)^{n-1} √(1 + f′(s)²) dS(ω) ds.

In particular, with A_{n-1} as in (3.2.30)–(3.2.32),

Area M = A_{n-1} ∫_a^b f(s)^{n-1} √(1 + f′(s)²) ds.

Note how this generalizes the conclusion of Exercise 17.

24. In the setting of Exercises 22–23, let M = S^n, with f(s) = √(1 − s²). Show that

∫_{S^n} u(x_0) dS(x) = A_{n-1} ∫_{-1}^1 u(s)(1 − s²)^{(n-2)/2} ds.

25. Let ψ : SO(n) → M(k,R) be continuous and satisfy the following properties:

ψ(gh) = ψ(g)ψ(h), ψ(g^{-1}) = ψ(g)^{-1}, for all g, h ∈ SO(n).

We say ψ is a representation of SO(n) on R^k. Form

P = ∫_{SO(n)} ψ(g) dg, P ∈ M(k,R),

using the integral (3.2.45) (but here with a matrix valued integrand). Show that

P : R^k → V, and v ∈ V ⟹ Pv = v,

where

V = {v ∈ R^k : ψ(g)v = v, ∀ g ∈ SO(n)}.

Thus P is a projection of R^k onto V.
Hint. With h ∈ SO(n) arbitrary, express ψ(h)P as the integral ∫ ψ(hg) dg, and apply (3.2.44).

26. In the setting of Exercise 25, show that

dim V = ∫_{SO(n)} χ(g) dg, χ(g) = Tr ψ(g).

27. Given u ∈ C(R^n), define Au ∈ C(R^n) by

Au(x) = ∫_{SO(n)} u(gx) dg.

Show that Au is a radial function, in fact

Au(x) = Su(|x|), with Su(r) = (1/A_{n-1}) ∫_{S^{n-1}} u(rω) dS(ω).

28. In the setting of Exercise 27, show that if u(x) is a polynomial in x, then Su(r) is a polynomial in r. Hint. Show that Au(x) is a polynomial in x. Look at Au(re1).

29. Let M be a C¹ surface, K ⊂ M compact. Let φ_j : O_j → U_j be coordinate charts on M and assume K ⊂ ∪_{j=1}^k U_j. Take v_j ∈ C_c(U_j) such that Σ_{j=1}^k v_j > 0 on K. Let f : M → R be bounded and supported on K. Show that

f ∈ Rc(M) ⇐⇒ (vjf) ◦ φj ∈ Rc(Oj), for each j.

Here, use the definition (3.2.68)–(3.2.69) for Rc(M) and define Rc(Oj) as in §3.1.

30. Let M ⊂ R^n be a compact, m-dimensional, C¹ surface. We define a contented partition of M to be a finite collection P = {Σ_k} of contented subsets of M such that

M = ∪_k Σ_k, cont⁺(Σ_j ∩ Σ_k) = 0, ∀ j ≠ k.

We say

maxsize P = max_k diam(Σ_k),

where diam Σ_k = sup_{x,y∈Σ_k} ∥x − y∥. Establish the following variant of the Darboux theorem (Proposition 3.1.1).

Proposition. Let P_ν = {Σ_kν : 1 ≤ k ≤ N(ν)} be a sequence of contented partitions of M such that maxsize P_ν → 0. Pick points ξ_kν ∈ Σ_kν. Then, given f ∈ R(M), we have

∫_M f dS = lim_{ν→∞} Σ_{k=1}^{N(ν)} f(ξ_kν) V(Σ_kν),

where V(Σ_kν) = ∫_M χ_{Σ_kν} dS is the content of Σ_kν.

Hint. First treat the case f ∈ C(M). Use the material in (3.2.68)–(3.2.76) to extend this to f ∈ R(M).

31. We desire to compute det G when G = (gjk) is an m × m matrix given by

gjk = δjk + vjvk. Compare (3.2.21). In other words,

G = I + T, T = (t_jk), t_jk = v_j v_k.

(a) Let v ∈ R^m have components v_j. Show that, for w ∈ R^m, Tw = (v·w)v.
(b) Deduce that T has one nonzero eigenvalue, |v|².
(c) Deduce that one eigenvalue of G is 1 + |v|², and the other m − 1 eigenvalues are 1.
(d) Deduce that g = det G = 1 + |v|², so √g = √(1 + |v|²). Compare (3.2.22), with v = ∇u.

3.3. Partitions of unity

In the text we have made occasional use of partitions of unity, and we include some material on this topic here.
We begin by defining and constructing a continuous partition of unity on a compact metric space X, subordinate to an open cover {U_j : 1 ≤ j ≤ N} of X. By definition, this is a family of continuous functions φ_j : X → R such that

(3.3.1) φ_j ≥ 0, supp φ_j ⊂ U_j, Σ_j φ_j = 1.

To construct such a partition of unity, we do the following. First, it can be shown that there is an open cover {V_j : 1 ≤ j ≤ N} of X and open sets W_j such that

(3.3.2) V̄_j ⊂ W_j ⊂ W̄_j ⊂ U_j.

Given this, let ψ_j(x) = dist(x, X ∖ W_j). Then ψ_j is continuous, supp ψ_j ⊂ W̄_j ⊂ U_j, and ψ_j is strictly positive on V̄_j. Hence Ψ = Σ_j ψ_j is continuous and strictly positive on X, and we see that

(3.3.3) φ_j(x) = Ψ(x)^{-1} ψ_j(x)

yields such a partition of unity.

We indicate how to construct the sets V_j and W_j used above, starting with V_1 and W_1. Note that the set K_1 = X ∖ (U_2 ∪ ⋯ ∪ U_N) is a compact

subset of U_1. Assume it is nonempty; otherwise just throw U_1 out and relabel the sets U_j. Now set

V_1 = {x ∈ U_1 : dist(x, K_1) < (1/3) dist(x, X ∖ U_1)},

and

W_1 = {x ∈ U_1 : dist(x, K_1) < (2/3) dist(x, X ∖ U_1)}.

To construct V_2 and W_2, proceed as above, but use the cover {U_2, …, U_N, V_1}. Continue until done.

Given a smooth compact surface M (perhaps with boundary), covered by coordinate patches U_j (1 ≤ j ≤ N), one can construct a smooth partition of unity on M, subordinate to this cover. The main additional tool for this is the construction of a function ψ ∈ C_0^∞(R^n) such that

(3.3.4) ψ(x) = 1 for |x| ≤ 1/2, ψ(x) = 0 for |x| ≥ 1.

One way to get this is to start with the function on R given by

(3.3.5) f_0(x) = e^{−1/x} for x > 0, 0 for x ≤ 0.

It is an exercise to show that f_0 ∈ C^∞(R). Now the function

f_1(x) = f_0(x) f_0(1/2 − x)

belongs to C^∞(R) and is zero outside the interval [0, 1/2]. Hence the function

f_2(x) = ∫_{−∞}^x f_1(s) ds

belongs to C^∞(R), is zero for x ≤ 0, and equals some positive constant (say C_2) for x ≥ 1/2. Then

ψ(x) = (1/C_2) f_2(1 − |x|)

is a function on R^n with the desired properties. With this function in hand, to construct the smooth partition of unity mentioned above is an exercise we recommend to the reader.
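The construction (3.3.4)–(3.3.5) can be carried out numerically (a sketch, not from the text; f_2 is approximated by midpoint quadrature):

```python
import numpy as np

def f0(x):
    """f0(x) = exp(-1/x) for x > 0, and 0 for x <= 0 (vectorized)."""
    return np.where(x > 0, np.exp(-1.0 / np.maximum(x, 1e-300)), 0.0)

def f1(x):
    """f1 = f0(x) f0(1/2 - x), supported in [0, 1/2]."""
    return f0(x) * f0(0.5 - x)

# f2(x) = integral of f1 from -inf to x, via midpoint quadrature on [0, 1/2]
grid = np.linspace(0.0, 0.5, 20_001)
h = grid[1] - grid[0]
mid = (grid[:-1] + grid[1:]) / 2

def f2(x):
    return np.sum(f1(mid[mid < x]) * h) if x > 0 else 0.0

C2 = f2(0.5)                         # the constant value of f2 for x >= 1/2

def psi(x):
    """The cutoff (3.3.4): 1 on |x| <= 1/2, 0 on |x| >= 1."""
    return f2(1.0 - abs(x)) / C2

assert abs(psi(0.0) - 1.0) < 1e-6 and abs(psi(0.4) - 1.0) < 1e-6
assert psi(1.1) == 0.0
assert 0.0 < psi(0.75) < 1.0         # smooth transition region
```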

3.4. Sard’s theorem

Let F : O → R^n be a C¹ map, with O open in R^n. If p ∈ O and DF(p) : R^n → R^n is not surjective, then p is said to be a critical point, and F(p) a critical value. The set C of critical points can be a large subset of O, even all of it, but the set of critical values F(C) must be small in R^n, as the following result implies.

Proposition 3.4.1. If F : O → Rn is a C1 map, C ⊂ O its set of critical points, and K ⊂ O compact, then F (C ∩ K) is a nil subset of Rn.

Proof. Without loss of generality, we can assume K is a cubical cell. Let P be a partition of K into cubical cells R_α, all of diameter δ. Write P = P′ ∪ P″, where cells in P′ are disjoint from C, and cells in P″ intersect C. Pick x_α ∈ R_α ∩ C, for R_α ∈ P″. Fix ε > 0. Now we have

(3.4.1) F(x_α + y) = F(x_α) + DF(x_α)y + r_α(y),

and, if δ > 0 is small enough, then |r_α(y)| ≤ ε|y| ≤ εδ, for x_α + y ∈ R_α. Thus F(R_α) is contained in an εδ-neighborhood of the set H_α = F(x_α) + DF(x_α)(R_α − x_α), which is a parallelepiped of dimension ≤ n − 1, and diameter ≤ Mδ, if |DF| ≤ M. Hence

(3.4.2) cont⁺ F(R_α) ≤ Cεδ^n ≤ C′εV(R_α), for R_α ∈ P″.

Thus

(3.4.3) cont⁺ F(C ∩ K) ≤ Σ_{R_α∈P″} cont⁺ F(R_α) ≤ C″ε.

Taking ε → 0, we have the proof. □

This is the easy case of a result known as Sard's Theorem, which also treats the case F : O → R^n when O is an open set in R^m, m > n. Then a more elaborate argument is needed, and one requires more differentiability, namely that F is of class C^k, with k = m − n + 1. A proof can be found in [31] or [39].

3.5. Morse functions

If Ω ⊂ Rn is open, a C2 function f :Ω → R is said to be a Morse function if each critical point of f is nondegenerate, i.e.,

(3.5.1) ∀ p ∈ Ω, ∇f(p) = 0 ⟹ D²f(p) is invertible,

where D²f(p) is the symmetric n × n matrix of second order partial derivatives defined in §2.1. More generally, if M is an n-dimensional surface, a C² function f : M → R is said to be a Morse function if f ∘ φ is a Morse function on Ω for each coordinate patch φ : Ω → U ⊂ M.
Our goal here is to establish the existence of lots of Morse functions on an n-dimensional surface M. For simplicity, we restrict attention to the case where M is compact. Here is our main result.

Theorem 3.5.1. Let M ⊂ R^N be a compact, smooth, n-dimensional surface. For a ∈ R^N, set

(3.5.2) φ_a : M → R, φ_a(x) = a · x, x ∈ M.

Take f ∈ C²(M). Then the set O_f of a ∈ R^N such that

(3.5.3) f + φ_a : M → R is a Morse function

is a dense open subset of R^N.

It is easy to verify that Of is open, since when (3.5.1) holds, a small C² perturbation g of f has the property that D²g(x) is invertible for x near p. What is not so easy is to show that Of is dense (or even nonempty!). Our proof of such denseness will make use of Sard's theorem, from §3.4. We begin with an easy special case.

Proposition 3.5.2. In the setting of Theorem 3.5.1, assume N = n + 1 and M = ∂Ω, with Ω ⊂ Rn+1 open. Then

(3.5.4) {a ∈ S^n : a ∉ O0}

is a nil set, hence has empty interior in the unit sphere S^n.

Proof. Here we are examining when φa is a Morse function on M. Let N : M → S^n be the exterior unit normal. Then x0 ∈ M is a critical point of φa if and only if N(x0) = a. Such a point x0 is a nondegenerate critical point of φa if and only if it is not a critical point of N. Hence, if a ∈ S^n is a regular value of N, then φa is a Morse function, i.e., a ∈ O0. By Sard's theorem, the set of points in S^n that are critical values of N is a nil set, so the proof of Proposition 3.5.2 is done. □

We begin to tackle Theorem 3.5.1 with the following result.

Lemma 3.5.3. Let Ω ⊂ R^n be open, and take g ∈ C²(Ω). Let U ⊂ Ω be a smoothly bounded open set whose closure U̅ is contained in Ω. Set ga(x) = g(x) + a · x. Let Og ⊂ R^n denote the set of a such that ga restricted to U̅ has only nondegenerate critical points. Then R^n \ Og is a nil set.

Proof. Consider

(3.5.5) F(x) = −∇g(x), F : Ω −→ R^n.

A point x ∈ Ω is a critical point of ga if and only if F(x) = a, and this critical point is degenerate only if, in addition, a is a critical value of F. Hence the desired conclusion holds for all a ∈ R^n that are not critical values of F restricted to U̅. Again Sard's theorem applies. □

Proof of Theorem 3.5.1. Each p ∈ M has a neighborhood Up in M such that U̅p ⊂ Ωp ⊂ M and some n of the coordinates xj on R^N produce coordinates on Ωp. Say x1, . . . , xn do it. Let (an+1, . . . , aN) be fixed, but arbitrary. Then Lemma 3.5.3 can be applied to g = f + ∑_{j=n+1}^N aj xj, treated as a function of (x1, . . . , xn). It follows that, for all (a1, . . . , an) but a nil set, f + φa has only nondegenerate critical points in U̅p. Thus

(3.5.6) {a ∈ R^N : f + φa has only nondegenerate critical points in U̅p}

is dense in R^N. We also know this set is open. Now M can be covered by a finite collection of such sets Up, so Of, defined in Theorem 3.5.1, is a finite intersection of open dense subsets of R^N, hence it is open and dense, as asserted. □
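The theme of Theorem 3.5.1, that a generic linear perturbation produces a Morse function, can be illustrated in one variable. The sketch below is not from the text; the cubic f(x) = x³ is an illustrative choice. Its degenerate critical point at 0 disappears under the perturbation f(x) + ax, for every a ≠ 0.

```python
import math

def critical_points(a):
    """Critical points of f_a(x) = x^3 + a*x, i.e. roots of 3x^2 + a = 0."""
    if a > 0:
        return []
    if a == 0:
        return [0.0]
    r = math.sqrt(-a / 3)
    return [-r, r]

def second_derivative(x):
    return 6 * x          # f_a''(x) = 6x, independent of a

# a = 0: the only critical point is degenerate.
assert second_derivative(critical_points(0)[0]) == 0
# Any a != 0 yields only nondegenerate critical points (Morse).
for a in (-2.0, -0.5, 0.3, 1.7):
    assert all(second_derivative(x) != 0 for x in critical_points(a))
print("x^3 + a*x is a Morse function for all tested a != 0")
```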

3.6. The tangent space to a manifold

Let X be a C^k manifold of dimension m, as defined in §3.2. Thus, for each p ∈ X, there are an open set Up ⊂ X, an open set Op ⊂ R^m, and a homeomorphism

(3.6.1) φp : Op −→ Up

of Op onto Up. One requires the following compatibility. If also q ∈ X and Upq = Up ∩ Uq ≠ ∅, then the homeomorphism from Opq = φp^{-1}(Upq) to Oqp = φq^{-1}(Upq),

(3.6.2) Fpq = φq^{-1} ∘ φp, restricted to Opq,

is a C^k diffeomorphism. The maps φp : Op → Up ⊂ X are coordinate charts on X.

In §3.2 we defined the tangent space TpX as an m-dimensional linear subspace of R^n in case X is a C^k surface in R^n. Here we will give an intrinsic definition of TpX, for a general C^k manifold, and show that it is naturally isomorphic to the tangent space defined in §3.2 when X is a C^k surface in R^n. We proceed as follows. Given p ∈ X, let Cp(X) denote the set of C^1 curves

(3.6.3) γ : (−1, 1) −→ X, γ(0) = p.

The notion that such γ is a C^1 map is as defined below (3.2.108).

Definition. The tangent space TpX is the set of equivalence classes of curves in Cp(X), where, given γ0, γ1 ∈ Cp(X),

(3.6.4) γ0 ∼ γ1 ⟺ σ0′(0) = σ1′(0),

where

(3.6.5) σj(t) = φp^{-1} ∘ γj(t), for |t| < ε,

and ε > 0 is sufficiently small that |t| < ε ⇒ γj(t) ∈ Up.

As we will see shortly, the set TpX is unchanged if one takes some other coordinate chart about p. Given the characterization above, we have a bijective map Dφp^{-1} : TpX −→ R^m, defined by

(3.6.6) Dφp^{-1}([γ]) = (φp^{-1} ∘ γ)′(0),

where, for γ ∈ Cp(X), [γ] denotes its equivalence class in TpX.

Given some other coordinate chart ψ : O → Up, we likewise set

(3.6.7) Dψ^{-1}(p) : TpX −→ R^m, Dψ^{-1}(p)([γ]) = (ψ^{-1} ∘ γ)′(0),

with γ ∈ Cp(X). To compare (3.6.6) and (3.6.7), we apply the chain rule to

(3.6.8) ψ^{-1} ∘ γ = ψ^{-1} ∘ φp ∘ φp^{-1} ∘ γ,

obtaining

(3.6.9) Dψ^{-1}(p)([γ]) = D(ψ^{-1} ∘ φp)(o)(φp^{-1} ∘ γ)′(0) = D(ψ^{-1} ∘ φp)(o) Dφp^{-1}([γ]),

where o = φp^{-1}(p). Now D(ψ^{-1} ∘ φp)(o) : R^m → R^m is a linear isomorphism, so the bijective map (3.6.6) gives TpX the structure of an m-dimensional vector space, independent of the choice of coordinate chart about p. Note that if X = R^m and p ∈ R^m, then we can use the identity map as the coordinate chart φp, i.e., φp(x) = x. This leads to the natural isomorphism

(3.6.10) Ip : TpR^m −→ R^m, Ip([γ]) = γ′(0), γ ∈ Cp(R^m).

Now suppose X and Y are two C^k manifolds, of dimension m and n, respectively, and

(3.6.11) f : X −→ Y

is a C^1 map, as defined below (3.2.108). Take p ∈ X, q = f(p). Note that

(3.6.12) γ ∈ Cp(X) =⇒ f ◦ γ ∈ Cq(Y ). We can define

(3.6.13) Df(p) : TpX −→ TqY

by

(3.6.14) Df(p)([γ]) = [f ∘ γ], γ ∈ Cp(X).

In view of (3.6.10), we see that if X ⊂ R^m and Y ⊂ R^n are open subsets, then

(3.6.15) Iq ∘ Df(p) ∘ Ip^{-1} : R^m −→ R^n

gives the same map as we specified for Df(p) as it was defined in §2.1. Here are some related results, whose proofs are left to the reader.

Proposition 3.6.1. Given a C^1 coordinate chart ψ : O → Up ⊂ X, leading to (3.6.7), we have ψ^{-1} : Up → O ⊂ R^m, and

(3.6.16) Dψ^{-1}(p) : TpX −→ TpR^m ≈ R^m

coincides with Dψ^{-1}(p) in (3.6.7).

Proposition 3.6.2. In the setting of (3.6.11)–(3.6.14), the map Df(p) in (3.6.13) is linear.
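The definition (3.6.14) makes sense because [f ∘ γ] depends only on the equivalence class [γ]. A finite-difference sketch in coordinates (not from the text; the map f and the two curves are illustrative choices):

```python
def velocity(curve, h=1e-6):
    """Central-difference velocity of a curve R -> R^2 at t = 0."""
    p, m = curve(h), curve(-h)
    return tuple((a - b) / (2 * h) for a, b in zip(p, m))

# a sample smooth map f : R^2 -> R^2 (illustrative choice)
f = lambda xy: (xy[0]**2 - xy[1], xy[0]*xy[1] + xy[1])

# two distinct curves through p = (1, 2) with the same velocity (3, -1)
g0 = lambda t: (1 + 3*t, 2 - t)
g1 = lambda t: (1 + 3*t + t**2, 2 - t + 5*t**3)

v0 = velocity(lambda t: f(g0(t)))
v1 = velocity(lambda t: f(g1(t)))
print(v0, v1)  # equal up to finite-difference error
assert all(abs(a - b) < 1e-4 for a, b in zip(v0, v1))
```

Both image curves have the same velocity at t = 0, namely Df(p) applied to (3, −1), illustrating that Df(p)[γ] = [f ∘ γ] is well defined.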

Going further, suppose Z is a C^1 manifold and we have a C^1 map

(3.6.17) g : Y −→ Z, g(q) = r,

so, parallel to (3.6.13),

(3.6.18) Dg(q) : TqY −→ TrZ.

We then have the chain rule

(3.6.19) D(g ∘ f)(p)([γ]) = [g ∘ f ∘ γ] = Dg(q) Df(p)([γ]), for γ ∈ Cp(X).

If X is a C^k manifold of dimension m, we can form the disjoint union of the tangent spaces TpX,

(3.6.20) TX = ∪_{p∈X} TpX,

called the tangent bundle of X. It is useful to know that TX has a natural structure of a C^{k−1} manifold, produced as follows. Suppose φ : O → U ⊂ X is a coordinate chart, and set TU = ∪_{p∈U} TpX ⊂ TX. We have

(3.6.21) Dφ : O × R^m → TU, Dφ(x, v) = Dφ(x)v, Dφ(x) : R^m ≈ TxR^m → Tφ(x)U.

If ψ : Ω → U is a smooth coordinate chart, with

(3.6.22) Dψ : Ω × R^m → TU, Dψ(y, w) = Dψ(y)w,

then we have

(3.6.23) Fφ,ψ = (Dψ)^{-1} ∘ (Dφ) : O × R^m −→ Ω × R^m, Fφ,ψ(x, v) = (ψ^{-1} ∘ φ(x), D(ψ^{-1} ∘ φ)(x)v),

where

(3.6.24) ψ^{-1} ∘ φ : O −→ Ω

is a C^k diffeomorphism and D(ψ^{-1} ∘ φ)(x) : R^m → R^m is defined as in §2.1. It follows that Fφ,ψ in (3.6.23) is a C^{k−1} diffeomorphism, with inverse Fψ,φ : Ω × R^m → O × R^m. Thus covering X by coordinate charts that are C^k compatible leads to a covering of TX by coordinate charts that are C^{k−1} compatible. Note that there is a natural projection

(3.6.25) π : TX −→ X, π : TpX → {p}.

There is also a natural inclusion

(3.6.26) ι : X −→ TX,

namely, given p ∈ X, ι(p) is the zero element of TpX. Given that X is a C^k manifold and ℓ ≤ k − 1, we can define a C^ℓ vector field V on X as a C^ℓ map

(3.6.27) V : X −→ TX, such that π(V(x)) = x, ∀ x ∈ X.

Let us now recall how we defined a metric tensor on a Ck manifold in §3.2. Namely, to each coordinate chart φp as in (3.6.1), there is a positive definite m × m matrix valued function

(3.6.28) Gp : Op −→ M(m, R),

smooth of class C^ℓ (ℓ ≤ k − 1), satisfying the compatibility condition

(3.6.29) Gp(x) = DFpq(x)^t Gq(y) DFpq(x),

for

(3.6.30) x ∈ Opq ⊂ Op, y = Fpq(x) ∈ Oqp ⊂ Oq.

We can now see that such a family {Gp} gives an inner product on each tangent space TpX, as follows. Take p ∈ X, and suppose p ∈ Uq, with φq : Oq → Uq. Take V, W ∈ TpX. Then we set

(3.6.31) ⟨V, W⟩(p) = Dφq^{-1}(p)V · Gq(φq^{-1}(p)) Dφq^{-1}(p)W.

Here we take the standard dot product on R^m. The compatibility condition (3.6.29) gives

(3.6.32) ⟨V, W⟩(p) = Dφp^{-1}(p)V · Gp(φp^{-1}(p)) Dφp^{-1}(p)W,

so the inner product is independent of the choice of q in (3.6.31), as long as p ∈ Up ∩ Uq.

Chapter 4

Differential forms and the Gauss-Green-Stokes formula

The calculus of differential forms, one of E. Cartan's fundamental contributions to analysis, provides a superb set of tools for calculus on surfaces and other manifolds. A 1-form α on an open set Ω ⊂ R^n can be written

α = a1(x) dx1 + ··· + an(x) dxn.

One can integrate such a 1-form over a smooth curve γ : I → Ω, via

(4.0.1) ∫_γ α = ∫_I γ*α,

where γ*α is the pull-back of α, given by ∑_j aj(γ(t)) γj′(t) dt. More generally, a k-form is a finite sum of terms

(4.0.2) aj(x) dxj1 ∧ ··· ∧ dxjk, j = (j1, . . . , jk),

where the "wedge product" satisfies the anticommutativity relation

(4.0.3) dxℓ ∧ dxm = −dxm ∧ dxℓ.

If φ : O → Ω is a smooth map, and α is a k-form on Ω, one has the pull-back φ*α to a k-form on O, satisfying

(4.0.4) φ*(α ∧ β) = φ*α ∧ φ*β, (φ ∘ ψ)*α = ψ*(φ*α),

if also ψ : U → O is a smooth map.


Another fundamental ingredient is the exterior derivative,

(4.0.5) d :Λk(Ω) −→ Λk+1(Ω), where Λk(Ω) denotes the space of smooth k-forms on Ω. One has the crucial identities

(4.0.6) ddα = 0, d(φ∗α) = φ∗(dα).

The action of φ∗ on n-forms (for Ω, O open in Rn) is given by

(4.0.7) φ*(F(x) dx1 ∧ ··· ∧ dxn) = F(φ(x))(det Dφ(x)) dx1 ∧ ··· ∧ dxn.

Hence, the change of variable formula established in §3.1 yields

(4.0.8) ∫_Ω α = ∫_O φ*α,

provided φ : O → Ω is a diffeomorphism such that det Dφ(x) > 0 on O. (One says φ preserves orientation.) Given this, one can define

(4.0.9) ∫_M β

whenever β is a k-form and M ⊂ R^n is a k-dimensional surface, assuming M possesses an "orientation." Complementing the important identities (4.0.6)–(4.0.8), one has the following, which could be called the "fundamental theorem of the calculus of differential forms,"

(4.0.10) ∫_M dα = ∫_{∂M} α,

M whenever β is a k form and M ⊂ Rn is a k-dimensional surface, assuming M possesses an “orientation.” Complementing the important identities (4.0.6)–(4.0.8), one has the fol- lowing, which could be called the “fundamental theorem of the calculus of differential forms,” ∫ ∫ (4.0.10) dα = α,

when M is a k-dimensional oriented surface (or manifold) with smooth boundary ∂M, and α is a smooth (k − 1)-form on M. This identity generalizes classical identities of Gauss, Green, and Stokes, and is called the general Stokes formula. The proof of this, in §4.3, is followed by a discussion of these classical cases in §4.4. In §4.5 we will use the theory of differential forms to obtain another proof of the change of variable formula for the integral, a proof very much different from the one given in §3.1. Chapter 5 will present a number of substantial applications of the calculus of differential forms, particularly of the Gauss-Green-Stokes formula.

4.1. Differential forms

It is very desirable to be able to make constructions that depend as little as possible on a particular choice of coordinate system. The calculus of differential forms, whose study we now take up, is one convenient set of tools for this purpose. We start with the notion of a 1-form. It is an object that gets integrated over a curve; formally, a 1-form on Ω ⊂ R^n is written

(4.1.1) α = ∑_j aj(x) dxj.

If γ : [a, b] → Ω is a smooth curve, we set

(4.1.2) ∫_γ α = ∫_a^b ∑_j aj(γ(t)) γj′(t) dt.

In other words,

(4.1.3) ∫_γ α = ∫_I γ*α,

where I = [a, b] and

γ*α = ∑_j aj(γ(t)) γj′(t) dt

is the pull-back of α under the map γ. More generally, if F : O → Ω is a smooth map (O ⊂ R^m open), the pull-back F*α is a 1-form on O defined by

(4.1.4) F*α = ∑_{j,k} aj(F(y)) (∂Fj/∂yk) dyk.

The usual change of variable formula for integrals gives

(4.1.5) ∫_γ α = ∫_σ F*α

if γ is the curve F ∘ σ. If F : O → Ω is a diffeomorphism, and

(4.1.6) X = ∑_j bj(x) ∂/∂xj

is a vector field on Ω, recall from (2.3.40) that we have the vector field on O:

(4.1.7) F#X(y) = (DF(p))^{-1} X(p), p = F(y).

If we define a pairing between 1-forms and vector fields on Ω by

(4.1.8) ⟨X, α⟩ = ∑_j b^j(x) aj(x) = b · a,

a simple calculation gives

(4.1.9) ⟨F#X, F*α⟩ = ⟨X, α⟩ ∘ F.

Thus, a 1-form on Ω is characterized at each point p ∈ Ω as a linear transformation of the space of vectors at p to R. More generally, we can regard a k-form α on Ω as a k-multilinear map on vector fields:

(4.1.10) α(X1, . . . , Xk) ∈ C^∞(Ω);

we impose the further condition of anti-symmetry when k ≥ 2:

(4.1.11) α(X1, . . . , Xj, . . . , Xℓ, . . . , Xk) = −α(X1, . . . , Xℓ, . . . , Xj, . . . , Xk).

Let us note that a 0-form is simply a function.

There is a special notation we use for k-forms. If 1 ≤ j1 < ··· < jk ≤ n, j = (j1, . . . , jk), we set

(4.1.12) α = (1/k!) ∑_j aj(x) dxj1 ∧ ··· ∧ dxjk,

where

(4.1.13) aj(x) = α(Dj1, . . . , Djk), Dj = ∂/∂xj.

More generally, we assign meaning to (4.1.12) summed over all k-indices (j1, . . . , jk), where we identify

(4.1.14) dxj1 ∧ ··· ∧ dxjk = (sgn σ) dxjσ(1) ∧ ··· ∧ dxjσ(k),

σ being a permutation of {1, . . . , k}. If any jm = jℓ (m ≠ ℓ), then (4.1.14) vanishes. A common notation for the statement that α is a k-form on Ω is

(4.1.15) α ∈ Λ^k(Ω).

In particular, we can write a 2-form β as

(4.1.16) β = (1/2) ∑_{j,k} bjk(x) dxj ∧ dxk

and pick coefficients satisfying bjk(x) = −bkj(x). According to (4.1.12)–(4.1.13), if we set U = ∑ uj(x) ∂/∂xj and V = ∑ vj(x) ∂/∂xj, then

(4.1.17) β(U, V) = ∑_{j,k} bjk(x) u^j(x) v^k(x).

If bjk is not required to be antisymmetric, one gets β(U, V) = (1/2) ∑ (bjk − bkj) u^j v^k.

If F : O → Ω is a smooth map as above, we define the pull-back F*α of a k-form α, given by (4.1.12), to be

(4.1.18) F*α = ∑_j aj(F(y)) (F*dxj1) ∧ ··· ∧ (F*dxjk),

where

(4.1.19) F*dxj = ∑_ℓ (∂Fj/∂yℓ) dyℓ,

the algebraic computation in (4.1.18) being performed using the rule (4.1.14). Extending (4.1.9), if F is a diffeomorphism, we have

(4.1.20) (F*α)(F#X1, . . . , F#Xk) = α(X1, . . . , Xk) ∘ F.

If B = (bjk) is an n × n matrix, then, by (4.1.14),

(4.1.21) (∑_k b1k dxk) ∧ (∑_k b2k dxk) ∧ ··· ∧ (∑_k bnk dxk)
= ∑_{k1,...,kn} b1k1 ··· bnkn dxk1 ∧ ··· ∧ dxkn
= (∑_{σ∈Sn} (sgn σ) b1σ(1) b2σ(2) ··· bnσ(n)) dx1 ∧ ··· ∧ dxn
= (det B) dx1 ∧ ··· ∧ dxn.

Here Sn denotes the set of permutations of {1, . . . , n}, and the last identity is the formula for the determinant presented in (1.4.30). It follows that if F : O → Ω is a C1 map between two domains of dimension n, and

(4.1.22) α = A(x) dx1 ∧ ··· ∧ dxn

is an n-form on Ω, then

(4.1.23) F*α = (det DF(y)) A(F(y)) dy1 ∧ ··· ∧ dyn.
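The identity (4.1.21) can be tested directly by implementing the wedge product with the rule (4.1.14). A sketch (not from the text; the sample matrix B and the dictionary representation of forms are illustrative choices):

```python
def sort_sign(K):
    """Sign of the permutation that sorts the index tuple K (rule (4.1.14))."""
    inv = sum(1 for i in range(len(K))
                for j in range(i + 1, len(K)) if K[i] > K[j])
    return -1 if inv % 2 else 1

def wedge(a, b):
    """Wedge product of constant-coefficient forms, stored as
    {sorted-index-tuple: coefficient} dictionaries."""
    out = {}
    for I, c in a.items():
        for J, d in b.items():
            K = I + J
            if len(set(K)) < len(K):
                continue                    # repeated dx_i: term vanishes
            S = tuple(sorted(K))
            out[S] = out.get(S, 0) + sort_sign(K) * c * d
    return out

def det(B):
    """Determinant by cofactor expansion along the first row."""
    if len(B) == 1:
        return B[0][0]
    return sum((-1) ** j * B[0][j] * det([row[:j] + row[j+1:] for row in B[1:]])
               for j in range(len(B)))

B = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
rows = [{(j,): B[i][j] for j in range(3)} for i in range(3)]  # rows as 1-forms
w = rows[0]
for r in rows[1:]:
    w = wedge(w, r)
print(w)  # {(0, 1, 2): -3}, i.e. (det B) dx1 ^ dx2 ^ dx3
assert w[(0, 1, 2)] == det(B) == -3
```

Wedging the rows of B, viewed as constant-coefficient 1-forms, produces (det B) dx1 ∧ dx2 ∧ dx3, as (4.1.21) asserts.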

Comparison with the change of variable formula for multiple integrals suggests that one has an intrinsic definition of ∫_Ω α when α is an n-form on Ω, n = dim Ω. To implement this, we need to take into account that det DF(y) rather than |det DF(y)| appears in (4.1.23). We say a smooth map F : O → Ω between two open subsets of R^n preserves orientation if det DF(y) is everywhere positive. The object called an "orientation" on Ω can be identified as an equivalence class of nowhere vanishing n-forms on Ω, two such forms being equivalent if one is a multiple of another by a positive function in C^∞(Ω); the standard orientation on R^n is determined by dx1 ∧ ··· ∧ dxn. If S is an n-dimensional surface in R^{n+k}, an orientation on S can also be specified by a nowhere vanishing form ω ∈ Λ^n(S). If such a form exists, S is said to be orientable. The equivalence class of positive multiples a(x)ω is said to consist of "positive" forms. A smooth map ψ : S → M between oriented n-dimensional surfaces preserves orientation provided ψ*σ is positive on S whenever σ ∈ Λ^n(M) is positive. If S is oriented, one can choose coordinate charts which are all orientation preserving. We mention that there exist surfaces that cannot be oriented, such as the famous "Möbius strip," and also the projective space P², discussed in §3.2. We define the integral of an n-form over an oriented n-dimensional surface as follows. First, if α is an n-form supported on an open set Ω ⊂ R^n, given by (4.1.22), then we set

(4.1.24) ∫_Ω α = ∫_Ω A(x) dV(x),

the right side defined as in §3.1. If O is also open in R^n and F : O → Ω is an orientation preserving diffeomorphism, we have

(4.1.25) ∫_O F*α = ∫_Ω α,

as a consequence of (4.1.23) and the change of variable formula (3.1.47). More generally, if S is an n-dimensional surface with an orientation, say the image of an open set O ⊂ R^n by φ : O → S, carrying the natural orientation of O, we can set

(4.1.26) ∫_S α = ∫_O φ*α

for an n-form α on S. If it takes several coordinate patches to cover S, define ∫_S α by writing α as a sum of forms, each supported on one patch. We need to show that this definition of ∫_S α is independent of the choice of coordinate system on S (as long as the orientation of S is respected). Thus, suppose φ : O → U ⊂ S and ψ : Ω → U ⊂ S are both coordinate patches, so that F = ψ^{-1} ∘ φ : O → Ω is an orientation-preserving diffeomorphism, as in Figure 3.2.1 of §3.2. We need to check that, if α is an n-form on S, supported on U, then

(4.1.27) ∫_O φ*α = ∫_Ω ψ*α.

To establish this, we first show that, for any form α of any degree,

(4.1.28) ψ ∘ F = φ ⟹ φ*α = F*ψ*α.

It suffices to check (4.1.28) for α = dxj. Then (4.1.19) gives ψ*dxj = ∑_ℓ (∂ψj/∂xℓ) dxℓ, so

(4.1.29) F*ψ*dxj = ∑_{ℓ,m} (∂Fℓ/∂xm)(∂ψj/∂xℓ) dxm, φ*dxj = ∑_m (∂φj/∂xm) dxm;

but the identity of these forms follows from the chain rule:

(4.1.30) Dφ = (Dψ)(DF) ⟹ ∂φj/∂xm = ∑_ℓ (∂ψj/∂xℓ)(∂Fℓ/∂xm).

Now that we have (4.1.28), we see that the left side of (4.1.27) is equal to

(4.1.31) ∫_O F*(ψ*α),

which is equal to the right side of (4.1.27), by (4.1.25). Thus the integral of an n-form over an oriented n-dimensional surface is well defined.

Exercises

1. If F : U0 → U1 and G : U1 → U2 are smooth maps and α ∈ Λ^k(U2), then (4.1.28) implies

(4.1.32) (G ∘ F)*α = F*(G*α) in Λ^k(U0).

In the special case that Uj = R^n and F and G are linear maps, and k = n, show that this identity implies

(4.1.33) det(GF ) = (det F )(det G).

Compare this with the derivation of (1.4.32).

2. Let Λ^k R^n denote the space of k-forms (4.1.12) with constant coefficients. Show that

(4.1.34) dim_R Λ^k R^n = n!/(k!(n − k)!).

If T : R^m → R^n is linear, then T* preserves this class of spaces; we denote the map

(4.1.35) Λ^k T* : Λ^k R^n −→ Λ^k R^m.

Similarly, replacing T by T ∗ yields

(4.1.36) Λ^k T : Λ^k R^m −→ Λ^k R^n.

3. Show that Λ^k T is uniquely characterized as a linear map from Λ^k R^m to Λ^k R^n which satisfies

(Λ^k T)(v1 ∧ ··· ∧ vk) = (T v1) ∧ ··· ∧ (T vk), vj ∈ R^m.

4. Show that if S, T : Rn → Rn are linear maps, then (4.1.37) Λk(ST ) = (ΛkS) ◦ (ΛkT ). Relate this to (4.1.28).

If {e1, . . . , en} is the standard orthonormal basis of R^n, define an inner product on Λ^k R^n by declaring an orthonormal basis to be

(4.1.38) {ej1 ∧ ··· ∧ ejk : 1 ≤ j1 < ··· < jk ≤ n}.

If A : Λ^k R^n → Λ^k R^n is a linear map, define A^t : Λ^k R^n → Λ^k R^n by

(4.1.39) ⟨Aα, β⟩ = ⟨α, A^t β⟩, α, β ∈ Λ^k R^n,

where ⟨ , ⟩ is the inner product on Λ^k R^n defined above.

5. Show that, if T : R^n → R^n is linear, with transpose T^t, then

(4.1.40) (Λ^k T)^t = Λ^k(T^t).

Hint. Check the identity ⟨(Λ^k T)α, β⟩ = ⟨α, (Λ^k T^t)β⟩ when α and β run over the orthonormal basis (4.1.38). That is, show that if α = ej1 ∧ ··· ∧ ejk and β = ei1 ∧ ··· ∧ eik, then

(4.1.41) ⟨T ej1 ∧ ··· ∧ T ejk, ei1 ∧ ··· ∧ eik⟩ = ⟨ej1 ∧ ··· ∧ ejk, T^t ei1 ∧ ··· ∧ T^t eik⟩.

Hint. Say T = (tij). In the spirit of (4.1.21), expand T ej1 ∧ ··· ∧ T ejk, and show that the left side of (4.1.41) is equal to

(4.1.42) ∑_{σ∈Sk} (sgn σ) t_{iσ(1) j1} ··· t_{iσ(k) jk},

where Sk denotes the set of permutations of {1, . . . , k}. Similarly, show that the right side of (4.1.41) is equal to

(4.1.43) ∑_{τ∈Sk} (sgn τ) t_{i1 jτ(1)} ··· t_{ik jτ(k)}.

To compare these two formulas, see the treatment of (1.4.31).

6. Show that if {u1, . . . , un} is any orthonormal basis of R^n, then the set {uj1 ∧ ··· ∧ ujk : 1 ≤ j1 < ··· < jk ≤ n} is an orthonormal basis of Λ^k R^n.
Hint. Use Exercises 4 and 5 to show that if T : R^n → R^n is an orthogonal transformation on R^n (i.e., preserves the inner product), then Λ^k T is an orthogonal transformation on Λ^k R^n.

7. Let vj, wj ∈ R^n, 1 ≤ j ≤ k (k < n). Form the matrices V, whose k columns are the column vectors v1, . . . , vk, and W, whose k columns are the column vectors w1, . . . , wk. Show that

(4.1.44) ⟨v1 ∧ ··· ∧ vk, w1 ∧ ··· ∧ wk⟩ = det W^t V = det V^t W.

Hint. Show that both sides are linear in each vj and in each wj. Use this to reduce the problem to verifying (4.1.44) when each vj and each wj is chosen from among the set of basis vectors {e1, . . . , en}. Use anti-symmetries to reduce the problem further.

8. Deduce from Exercise 7 that if vj, wj ∈ R^n, then

(4.1.45) ⟨v1 ∧ ··· ∧ vk, w1 ∧ ··· ∧ wk⟩ = ∑_π (sgn π) ⟨v1, wπ(1)⟩ ··· ⟨vk, wπ(k)⟩,

where π ranges over the set of permutations of {1, . . . , k}.
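Formulas (4.1.44)–(4.1.45) can be checked on concrete vectors. A sketch (not from the text; the vectors vs, ws are illustrative choices):

```python
import math
from itertools import permutations

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def perm_sign(p):
    inv = sum(1 for i in range(len(p))
                for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def inner_via_4145(vs, ws):
    """Right-hand side of (4.1.45): permutation-sum inner product."""
    k = len(vs)
    return sum(perm_sign(p) * math.prod(dot(vs[j], ws[p[j]]) for j in range(k))
               for p in permutations(range(k)))

vs = [(1, 2, 0), (0, 1, 3)]
ws = [(2, 0, 1), (1, 1, 1)]
WtV = [[dot(w, v) for v in vs] for w in ws]     # (W^t V)_{ij} = w_i . v_j
det2 = WtV[0][0] * WtV[1][1] - WtV[0][1] * WtV[1][0]
print(inner_via_4145(vs, ws), det2)  # both equal -1
assert inner_via_4145(vs, ws) == det2 == -1
```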

9. Show that the conclusion of Exercise 6 also follows from (4.1.45).

10. Let A, B : R^k → R^n be linear maps and set ω = e1 ∧ ··· ∧ ek ∈ Λ^k R^k. We have Λ^k A ω, Λ^k B ω ∈ Λ^k R^n. Deduce from (4.1.44) that

(4.1.46) ⟨Λ^k A ω, Λ^k B ω⟩ = det B^t A.

11. Let φ : O → Rn be smooth, with O ⊂ Rm open. Deduce from Exercise 10 that, for each x ∈ O,

(4.1.47) ∥Λ^m Dφ(x)ω∥² = det(Dφ(x)^t Dφ(x)),

where ω = e1 ∧ ··· ∧ em. Deduce that if φ : O → U ⊂ M is a coordinate patch on a smooth m-dimensional surface M ⊂ R^n and f ∈ C(M) is supported on U, then

(4.1.48) ∫_M f dS = ∫_O f(φ(x)) ∥Λ^m Dφ(x)ω∥ dx.

12. Show that the result of Exercise 5 in §3.2 follows from (4.1.48), via (4.1.41)–(4.1.42).

13. Recall the projective spaces Pn, constructed in §3.2. Show that Pn is orientable if and only if n is odd.

4.2. Products and exterior derivatives of forms

Having discussed the notion of a differential form as something to be integrated, we now consider some operations on forms. There is a wedge product, or exterior product, characterized as follows. If α ∈ Λ^k(Ω) has the form (4.1.12) and if

(4.2.1) β = ∑_i bi(x) dxi1 ∧ ··· ∧ dxiℓ ∈ Λ^ℓ(Ω),

define

(4.2.2) α ∧ β = ∑_{j,i} aj(x) bi(x) dxj1 ∧ ··· ∧ dxjk ∧ dxi1 ∧ ··· ∧ dxiℓ

in Λ^{k+ℓ}(Ω). A special case of this arose in (4.1.18)–(4.1.21). We retain the equivalence (4.1.14). It follows easily that

(4.2.3) α ∧ β = (−1)^{kℓ} β ∧ α.

In addition, there is an interior product of α ∈ Λ^k(Ω) with a vector field X on Ω, producing ιX α = α⌋X ∈ Λ^{k−1}(Ω), defined by

(4.2.4) (α⌋X)(X1, . . . , Xk−1) = α(X, X1, . . . , Xk−1).

Consequently, if α = dxj1 ∧ ··· ∧ dxjk and Di = ∂/∂xi, then

(4.2.5) α⌋Djℓ = (−1)^{ℓ−1} dxj1 ∧ ··· ∧ dx̂jℓ ∧ ··· ∧ dxjk,

where dx̂jℓ denotes removing the factor dxjℓ. Furthermore,

i ∉ {j1, . . . , jk} ⟹ α⌋Di = 0.

If F : O → Ω is a diffeomorphism and α, β are forms and X a vector field on Ω, it is readily verified that

(4.2.6) F*(α ∧ β) = (F*α) ∧ (F*β), F*(α⌋X) = (F*α)⌋(F#X).

We make use of the operators ∧k and ιk on forms:

(4.2.7) ∧kα = dxk ∧ α, ιkα = α⌋Dk. There is the following useful anticommutation relation:

(4.2.8) ∧k ιℓ + ιℓ ∧k = δkℓ,

where δkℓ is 1 if k = ℓ, 0 otherwise. This is a fairly straightforward consequence of (4.2.5). We also have

(4.2.9) ∧j ∧k + ∧k ∧j = 0, ιj ιk + ιk ιj = 0.

From (4.2.8)–(4.2.9) one says that the operators {ιj, ∧j : 1 ≤ j ≤ n} generate a "Clifford algebra." Another important operator on forms is the exterior derivative:

(4.2.10) d : Λ^k(Ω) −→ Λ^{k+1}(Ω),

defined as follows. If α ∈ Λ^k(Ω) is given by (4.1.12), then

(4.2.11) dα = ∑_{j,ℓ} (∂aj/∂xℓ) dxℓ ∧ dxj1 ∧ ··· ∧ dxjk.

Equivalently,

(4.2.12) dα = ∑_{ℓ=1}^n ∂ℓ ∧ℓ α,

where ∂ℓ = ∂/∂xℓ and ∧ℓ is given by (4.2.7). The antisymmetry dxm ∧ dxℓ = −dxℓ ∧ dxm, together with the identity ∂²aj/∂xℓ∂xm = ∂²aj/∂xm∂xℓ, implies

(4.2.13) d(dα) = 0,

for any differential form α. We also have a product rule:

(4.2.14) d(α ∧ β) = (dα) ∧ β + (−1)^k α ∧ (dβ), α ∈ Λ^k(Ω), β ∈ Λ^j(Ω).

The exterior derivative has the following important property under pull-backs:

(4.2.15) F*(dα) = d(F*α),

if α ∈ Λ^k(Ω) and F : O → Ω is a smooth map. To see this, extending (4.2.14) to a formula for d(α ∧ β1 ∧ ··· ∧ βℓ) and using this to apply d to F*α, we have

(4.2.16) dF*α = ∑_{j,ℓ} (∂/∂xℓ)(aj ∘ F(x)) dxℓ ∧ (F*dxj1) ∧ ··· ∧ (F*dxjk)
+ ∑_{j,ν} (−1)^{ν−1} aj(F(x)) (F*dxj1) ∧ ··· ∧ d(F*dxjν) ∧ ··· ∧ (F*dxjk).

Now the definition (4.1.18)–(4.1.19) of pull-back gives directly that

(4.2.17) F*dxi = ∑_ℓ (∂Fi/∂xℓ) dxℓ = dFi,

and hence d(F*dxi) = ddFi = 0, so only the first sum in (4.2.16) contributes to dF*α. Meanwhile,

(4.2.18) F*dα = ∑_{j,m} (∂aj/∂xm)(F(x)) (F*dxm) ∧ (F*dxj1) ∧ ··· ∧ (F*dxjk),

so (4.2.15) follows from the identity

(4.2.19) ∑_ℓ (∂/∂xℓ)(aj ∘ F(x)) dxℓ = ∑_m (∂aj/∂xm)(F(x)) F*dxm,

which in turn follows from the chain rule. If dα = 0, we say α is closed; if α = dβ for some β ∈ Λ^{k−1}(Ω), we say α is exact. Formula (4.2.13) implies that every exact form is closed. The converse is not always true globally. Consider the multi-valued angular coordinate θ on R² \ {(0, 0)}; dθ is a single valued closed form on R² \ {(0, 0)} which is not globally exact. An important result of H. Poincaré is that every closed form is locally exact. A proof is given in §5.2. (A special case is established in §4.3.)

Exercises

1. If α is a k-form, verify the formula (4.2.14), i.e.,

d(α ∧ β) = (dα) ∧ β + (−1)^k α ∧ dβ.

If α is closed and β is exact, show that α ∧ β is exact.

2. Let F be a vector field on U, open in R³, F = ∑_{j=1}^3 fj(x) ∂/∂xj. The vector field G = curl F is classically defined as a formal determinant

(4.2.20) curl F = det [ e1 e2 e3 ; ∂1 ∂2 ∂3 ; f1 f2 f3 ],

where {ej} is the standard basis of R³. Consider the 1-form φ = ∑_{j=1}^3 fj(x) dxj. Show that dφ and curl F are related in the following way:

(4.2.21) curl F = ∑_{j=1}^3 gj(x) ej = ∑_{j=1}^3 gj(x) ∂/∂xj,
dφ = g1(x) dx2 ∧ dx3 + g2(x) dx3 ∧ dx1 + g3(x) dx1 ∧ dx2.

See (4.4.30)–(4.4.37) for more on this connection.

3. If F and φ are related as in Exercise 2, show that curl F is uniquely specified by the relation

(4.2.22) dφ ∧ α = ⟨curl F, α⟩ ω

for all 1-forms α on U ⊂ R³, where ω = dx1 ∧ dx2 ∧ dx3 is the volume form.

4. Let B be a ball in R³, F a smooth vector field on B. Show that

(4.2.23) ∃ u ∈ C^∞(B) s.t. F = grad u ⟹ curl F = 0.

Hint. Compare F = grad u with φ = du.

5. Let B be a ball in R³ and G a smooth vector field on B. Show that

(4.2.24) ∃ vector field F s.t. G = curl F ⟹ div G = 0.

Hint. If G = ∑_{j=1}^3 gj(x) ∂/∂xj, consider

(4.2.25) ψ = g1(x) dx2 ∧ dx3 + g2(x) dx3 ∧ dx1 + g3(x) dx1 ∧ dx2. Show that

(4.2.26) dψ = (div G) dx1 ∧ dx2 ∧ dx3.
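Exercises 4 and 5 assert the vector-calculus shadows of d(dα) = 0: curl grad u = 0 and div curl F = 0. A finite-difference sketch (not from the text; the sample functions u, F, the test point, and the tolerances are illustrative choices):

```python
def partial(f, p, i, h=1e-5):
    """Central-difference partial derivative of scalar f at point p."""
    q = list(p); q[i] += h
    r = list(p); r[i] -= h
    return (f(q) - f(r)) / (2 * h)

def grad(u):
    return lambda p: [partial(u, p, i) for i in range(3)]

def curl(F):
    def c(p):
        d = lambda i, j: partial(lambda q: F(q)[j], p, i)  # d_i F_j
        return [d(1, 2) - d(2, 1), d(2, 0) - d(0, 2), d(0, 1) - d(1, 0)]
    return c

def div(F):
    return lambda p: sum(partial(lambda q: F(q)[i], p, i) for i in range(3))

u = lambda p: p[0]**2 * p[1] + p[2]**3                # a sample potential
F = lambda p: [p[1]*p[2], -p[0]*p[2], p[0]*p[1]**2]   # a sample vector field

p = [0.3, -0.7, 1.1]
assert all(abs(c) < 1e-4 for c in curl(grad(u))(p))   # curl grad u = 0
assert abs(div(curl(F))(p)) < 1e-4                    # div curl F = 0
print("curl grad u = 0 and div curl F = 0 (numerically) at", p)
```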

6. Show that the 1-form dθ mentioned below (4.2.19) is given by

dθ = (x dy − y dx)/(x² + y²).
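As noted below (4.2.19), dθ is closed but not globally exact, so its integral over the unit circle is 2π rather than 0. A numerical sketch (not from the text; the discretization is an illustrative choice):

```python
import math

# d(theta) = (x dy - y dx)/(x^2 + y^2) on R^2 minus the origin
dtheta = lambda x, y: (-y / (x**2 + y**2), x / (x**2 + y**2))

def loop_integral(a, gamma, n=20000):
    """Midpoint-rule integral of the 1-form a over the closed curve gamma."""
    total, h = 0.0, 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * h
        x, y = gamma(t)
        xp, yp = gamma(t + 1e-6)
        xm, ym = gamma(t - 1e-6)
        dx, dy = (xp - xm) / 2e-6, (yp - ym) / 2e-6
        a1, a2 = a(x, y)
        total += (a1 * dx + a2 * dy) * h
    return total

circle = lambda t: (math.cos(t), math.sin(t))
val = loop_integral(dtheta, circle)
print(val)  # approximately 2*pi
assert abs(val - 2 * math.pi) < 1e-3
```

A nonzero integral over a closed loop is exactly the obstruction to exactness.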

For the next set of exercises, let Ω be a planar domain, X = f(x, y) ∂/∂x + g(x, y) ∂/∂y a nonvanishing vector field on Ω. Consider the 1-form α = g(x, y) dx − f(x, y) dy.

7. Let γ : I → Ω be a smooth curve, I = (a, b). Show that the image C = γ(I) is the image of an integral curve of X if and only if γ∗α = 0. Consequently, with slight abuse of notation, one describes the integral curves by g dx − f dy = 0. If α is exact, i.e., α = du, conclude the level curves of u are the integral curves of X.

8. A function φ is called an integrating factor if α̃ = φα is exact, i.e., if d(φα) = 0, provided Ω is simply connected. Show that an integrating factor always exists, at least locally. Show that φ = e^v is an integrating factor if and only if Xv = −div X. Find an integrating factor for α = (x² + y² − 1) dx − 2xy dy.
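For the last part of Exercise 8, one candidate integrating factor is φ(x, y) = 1/x², away from x = 0; this is a claim to be checked, not stated in the text. The sketch below verifies numerically that φα is closed:

```python
def partial(f, x, y, wrt, h=1e-6):
    """Central-difference partial derivative of f(x, y)."""
    if wrt == 'x':
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

P = lambda x, y: x**2 + y**2 - 1    # coefficient of dx in alpha
Q = lambda x, y: -2 * x * y         # coefficient of dy in alpha
phi = lambda x, y: 1.0 / x**2       # candidate integrating factor

x0, y0 = 1.3, -0.8
# alpha itself is not closed: Q_x - P_y is far from 0 here
assert abs(partial(Q, x0, y0, 'x') - partial(P, x0, y0, 'y')) > 1.0
# but phi * alpha is closed: d(phi * alpha) = 0
Pt = lambda x, y: phi(x, y) * P(x, y)
Qt = lambda x, y: phi(x, y) * Q(x, y)
closed_defect = partial(Qt, x0, y0, 'x') - partial(Pt, x0, y0, 'y')
print(closed_defect)  # approximately 0
assert abs(closed_defect) < 1e-6
```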

9. Define the radial vector field R = x1 ∂/∂x1 + ··· + xn ∂/∂xn on R^n. Show that

ω = dx1 ∧ ··· ∧ dxn ⟹ ω⌋R = ∑_{j=1}^n (−1)^{j−1} xj dx1 ∧ ··· ∧ dx̂j ∧ ··· ∧ dxn.

Show that d(ω⌋R) = n ω.

10. Show that if F : R^n → R^n is a linear rotation (i.e., F ∈ SO(n)), then β = ω⌋R in Exercise 9 has the property that F*β = β.

4.3. The general Stokes formula

The Stokes formula involves integrating a k-form over a k-dimensional surface with boundary. We first define that concept. Let S be a smooth k-dimensional surface (say in R^N), and let M be an open subset of S, such that its closure M̅ (in R^N) is contained in S. Its boundary is ∂M = M̅ \ M. We say M is a smooth surface with boundary if also ∂M is a smooth (k−1)-dimensional surface. In such a case, any p ∈ ∂M has a neighborhood U ⊂ S with a coordinate chart φ : O → U, where O is an open neighborhood of 0 in R^k, such that φ(0) = p and φ maps {x ∈ O : x1 = 0} onto U ∩ ∂M. If S is oriented, then M is oriented, and ∂M inherits an orientation, uniquely determined by the following requirement: if

(4.3.1) M = R^k_− = {x ∈ R^k : x1 ≤ 0},

then ∂M = {(x2, . . . , xk)} has the orientation determined by dx2 ∧ ··· ∧ dxk. We can now state the Stokes formula.

Proposition 4.3.1. Given a compactly supported (k−1)-form β of class C^1 on an oriented k-dimensional surface M (of class C²) with boundary ∂M, with its natural orientation,

(4.3.2) ∫_M dβ = ∫_{∂M} β.

Proof. Using a partition of unity and invariance of the integral and the exterior derivative under coordinate transformations, it suffices to prove this when M has the form (4.3.1). In that case, we will be able to deduce (4.3.2) from the Fundamental Theorem of Calculus. Indeed, if

(4.3.3) β = bj(x) dx1 ∧ ··· ∧ dx̂j ∧ ··· ∧ dxk,

with bj(x) of bounded support, we have

(4.3.4) dβ = (−1)^{j−1} (∂bj/∂xj) dx1 ∧ ··· ∧ dxk.

If j > 1, we have

(4.3.5) ∫_M dβ = (−1)^{j−1} ∫ {∫_{−∞}^{∞} (∂bj/∂xj) dxj} dx′ = 0,

and also κ*β = 0, where κ : ∂M → M is the inclusion. On the other hand, for j = 1, we have

(4.3.6) ∫_M dβ = ∫ {∫_{−∞}^{0} (∂b1/∂x1) dx1} dx2 ··· dxk = ∫ b1(0, x′) dx′ = ∫_{∂M} β.

This proves Stokes' formula (4.3.2). □
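The model case just treated reduces Stokes' formula to the Fundamental Theorem of Calculus. It can also be checked numerically on a simple surface with corners, the unit square. A sketch (not from the text; the 1-form β = x y² dx + x² dy is an illustrative choice):

```python
P = lambda x, y: x * y * y      # coefficient of dx in beta
Q = lambda x, y: x * x          # coefficient of dy in beta
dQdx = lambda x, y: 2 * x       # hand-computed partial derivatives
dPdy = lambda x, y: 2 * x * y

n = 400
h = 1.0 / n
mid = [(k + 0.5) * h for k in range(n)]

# interior integral of d(beta) = (Q_x - P_y) dx ^ dy, midpoint rule
interior = sum(dQdx(x, y) - dPdy(x, y) for x in mid for y in mid) * h * h

# boundary integral of beta, traversed counterclockwise
boundary = (sum(P(t, 0.0) for t in mid) * h      # bottom edge, dx > 0
            + sum(Q(1.0, t) for t in mid) * h    # right edge, dy > 0
            - sum(P(t, 1.0) for t in mid) * h    # top edge, dx < 0
            - sum(Q(0.0, t) for t in mid) * h)   # left edge, dy < 0

print(interior, boundary)  # both approximately 0.5
assert abs(interior - boundary) < 1e-6
```

This is the classical Green's theorem, the k = 2 case of (4.3.2), taken up in §4.4.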

It is useful to allow singularities in ∂M. We say a point p ∈ M is a corner of dimension ν if there is a neighborhood U of p in M and a C2 diffeomorphism of U onto a neighborhood of 0 in

(4.3.7) K = {x ∈ R^k : xj ≤ 0, for 1 ≤ j ≤ k − ν},

where k is the dimension of M. If M is a C² surface and every point p ∈ ∂M is a corner (of some dimension), we say M is a C² surface with corners. In such a case, ∂M is a locally finite union of C² surfaces with corners. The following result extends Proposition 4.3.1.

Proposition 4.3.2. If M is a C2 surface of dimension k, with corners, and β is a compactly supported (k − 1)-form of class C1 on M, then (4.3.2) holds.

Proof. It suffices to establish this when β is supported on a small neighborhood of a corner p ∈ ∂M, of the form U described above. Hence it suffices to show that (4.3.2) holds whenever β is a (k − 1)-form of class C^1, with compact support on K in (4.3.7); and we can take β to have the form (4.3.3). Then, for j > k − ν, (4.3.5) still holds, while, for j ≤ k − ν, we have, as in (4.3.6),

(4.3.8) ∫_K dβ = (−1)^{j−1} ∫ {∫_{−∞}^{0} (∂bj/∂xj) dxj} dx1 ··· dx̂j ··· dxk
= (−1)^{j−1} ∫ bj(x1, . . . , xj−1, 0, xj+1, . . . , xk) dx1 ··· dx̂j ··· dxk
= ∫_{∂K} β.

This completes the proof. □

The reason we required M to be a surface of class C² (with corners) in Propositions 4.3.1 and 4.3.2 is the following. Due to the formulas (4.1.18)–(4.1.19) for a pull-back, if β is of class C^j and F is of class C^ℓ, then F*β is generally of class C^μ, with μ = min(j, ℓ − 1). Thus, if j = ℓ = 1, F*β might be only of class C^0, so there is not a well-defined notion of a differential form of class C^1 on a C^1 surface, though such a notion is well defined on a C² surface. This problem can be overcome, and one can extend Propositions 4.3.1–4.3.2 to the case where M is a C^1 surface (with corners), and β is a (k − 1)-form with the property that both β and dβ are continuous. We will not go into the details. Substantially more sophisticated generalizations are given in [12]. We will mention one useful extension of the scope of Proposition 4.3.2, to surfaces with piecewise smooth boundary that do not satisfy the corner condition. An example is illustrated in Figure 4.3.1. There the point p is a singular point of ∂M that is not a corner, according to the definition using (4.3.7). However, in many cases M can be divided into pieces (M1 and M2 for the example presented in Figure 4.3.1) and each piece Mj is a surface with corners. Then Proposition 4.3.2 applies to each piece separately:

(4.3.9) ∫_{Mj} dβ = ∫_{∂Mj} β,

Mj ∂Mj and one can sum over j to get (4.3.2) in this more general setting. We next apply Proposition 4.3.2 to prove the following special case of the Poincar´elemma, which will be used in §5.1.

Proposition 4.3.3. If α is a 1-form on B = {x ∈ R2 : |x| < 1} and dα = 0, then there exists a real valued u ∈ C∞(B) such that α = du.

In fact, let us set

(4.3.10)  u_j(x) = ∫_{γ_j(x)} α,

where γ_j(x) is a path from 0 to x = (x₁, x₂) which consists of two line segments. If j = 1, the path first goes from 0 to (0, x₂) and then from (0, x₂) to x, while if j = 2 it first goes from 0 to (x₁, 0) and then from (x₁, 0) to x. See Figure 4.3.2. It is easy to verify that ∂u_j/∂x_j = α_j(x). We claim that u₁ = u₂, or equivalently that

(4.3.11)  ∫_{σ(x)} α = 0,

Figure 4.3.1. Division into surfaces with corners

where σ(x) is a closed path consisting of γ₂(x) followed by γ₁(x), in reverse. In fact, Stokes' formula, Proposition 4.3.2, implies that

(4.3.12)  ∫_{σ(x)} α = ∫_{R(x)} dα,

where R(x) is the rectangle whose boundary is σ(x). If dα = 0, then (4.3.12) vanishes, and we have the desired function u: u = u₁ = u₂.
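The two-path construction (4.3.10) is easy to test numerically. The sketch below (plain Python; the closed 1-form α = 2xy dx + x² dy = d(x²y) is a hypothetical test case, not from the text) evaluates u₁ and u₂ by midpoint-rule quadrature along the two broken-line paths and checks that they agree, recovering the potential u(x, y) = x²y.

```python
# Numerical sketch of (4.3.10)-(4.3.11): for the closed 1-form
# alpha = 2xy dx + x^2 dy = d(x^2 y), integrate along the two
# broken-line paths gamma_1, gamma_2 from 0 to (x1, x2).

def alpha(x, y):
    """Components (a1, a2) of the closed 1-form alpha = a1 dx + a2 dy."""
    return 2.0 * x * y, x * x

def line_integral(p, q, n=2000):
    """Integrate alpha over the segment from p to q (midpoint rule)."""
    (px, py), (qx, qy) = p, q
    dx, dy = (qx - px) / n, (qy - py) / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        a1, a2 = alpha(px + t * (qx - px), py + t * (qy - py))
        total += a1 * dx + a2 * dy
    return total

def u1(x1, x2):  # path: 0 -> (0, x2) -> (x1, x2)
    return line_integral((0, 0), (0, x2)) + line_integral((0, x2), (x1, x2))

def u2(x1, x2):  # path: 0 -> (x1, 0) -> (x1, x2)
    return line_integral((0, 0), (x1, 0)) + line_integral((x1, 0), (x1, x2))

x1, x2 = 0.4, -0.3
# u1 = u2, and both recover the potential u(x, y) = x^2 y.
assert abs(u1(x1, x2) - u2(x1, x2)) < 1e-9
assert abs(u1(x1, x2) - x1 * x1 * x2) < 1e-9
```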

Exercises

1. In the setting of Proposition 4.3.1, show that

∂M = ∅ ⟹ ∫_M dβ = 0.

2. Consider the region Ω ⊂ R² defined by

Ω = {(x, y) : 0 ≤ y ≤ x², 0 ≤ x ≤ 1}.

Figure 4.3.2. Integral of a closed 1-form

Show that the boundary points (1, 0) and (1, 1) are corners, but (0, 0) is not a corner. The boundary of Ω is too sharp at (0, 0) to be a corner; it is called a "cusp." Extend Proposition 4.3.2 to treat this region.

3. Suppose U ⊂ Rⁿ is an open set with smooth boundary M = ∂U, and U has the standard orientation, determined by dx₁ ∧ ··· ∧ dx_n. (See the paragraph above (4.1.23).) Let φ ∈ C¹(Rⁿ) satisfy φ(x) < 0 for x ∈ U, φ(x) > 0 for x ∈ Rⁿ \ U̅, and grad φ(x) ≠ 0 for x ∈ ∂U, so grad φ points out of U. Show that the natural orientation on ∂U, as defined just before Proposition 4.3.1, is the same as the following. The equivalence class of forms β ∈ Λ^{n−1}(∂U) defining the orientation on ∂U satisfies the property that dφ ∧ β is a positive multiple of dx₁ ∧ ··· ∧ dx_n, on ∂U.

4. Suppose U = {x ∈ Rⁿ : x_n < 0}. Show that the orientation on ∂U described above is that of (−1)^{n−1} dx₁ ∧ ··· ∧ dx_{n−1}. If V = {x ∈ Rⁿ : x_n > 0}, what orientation does ∂V inherit?

5. Extend the special case of Poincaré's Lemma given in Proposition 4.3.3 to the case where α is a closed 1-form on B = {x ∈ Rⁿ : |x| < 1}, i.e., from the case dim B = 2 to higher dimensions.

6. Define β ∈ Λ^{n−1}(Rⁿ) by

β = Σ_{j=1}^n (−1)^{j−1} x_j dx₁ ∧ ··· ∧ \widehat{dx_j} ∧ ··· ∧ dx_n.

Let Ω ⊂ Rⁿ be a smoothly bounded compact subset. Show that

(1/n) ∫_{∂Ω} β = Vol(Ω).
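For n = 2, the identity of Exercise 6 reads (1/2)∮_{∂Ω}(x₁ dx₂ − x₂ dx₁) = Area(Ω). A quick quadrature sketch on the unit disk (an illustrative check, not part of the exercise set):

```python
import math

# Exercise 6 with n = 2: beta = x1 dx2 - x2 dx1 on the unit circle,
# parametrized by t -> (cos t, sin t); (1/2) * integral of beta = pi.
N = 20000
total = 0.0
for k in range(N):
    t = 2 * math.pi * (k + 0.5) / N
    x1, x2 = math.cos(t), math.sin(t)
    dx1, dx2 = -math.sin(t), math.cos(t)  # derivative of the parametrization
    total += (x1 * dx2 - x2 * dx1) * (2 * math.pi / N)

area = 0.5 * total
assert abs(area - math.pi) < 1e-9  # Vol of the unit disk
```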

7. In the setting of Exercise 6, show that if f ∈ C¹(Ω), then

∫_{∂Ω} fβ = ∫_Ω (Rf + nf) dx,

where

Rf = Σ_{j=1}^n x_j ∂f/∂x_j.

8. In the setting of Exercises 6–7, and with S^{n−1} ⊂ Rⁿ the unit sphere, show that

∫_{S^{n−1}} fβ = ∫_{S^{n−1}} f dS.

Hint. Let B ⊂ R^{n−1} be the unit ball, and define φ : B → S^{n−1} by φ(x′) = (x′, √(1 − |x′|²)). Compute φ*β. Compare surface area formulas derived in §3.2.
Another approach. The unit sphere S^{n−1} ↪ Rⁿ, with inclusion map j, has a volume form (cf. (4.4.13)); it must be a scalar multiple g(j*β), and, by Exercise 10 of §4.2, g must be constant. Then Exercise 6 identifies this constant, in light of results from §3.2. See the exercises in §4.4 for more on this.

9. Given β as in Exercise 6, show that the (n − 1)-form

ω = |x|^{−n} β

on Rⁿ \ 0 is closed. Use Exercise 6 to show that ∫_{S^{n−1}} ω ≠ 0, and hence that ω is not exact.

10. Let Ω ⊂ Rⁿ be a compact, smoothly bounded subset. Take ω as in Exercise 9. Show that

∫_{∂Ω} ω = A_{n−1} if 0 ∈ Ω,
          0 if 0 ∉ Ω.

4.4. The classical Gauss, Green, and Stokes formulas

The case of (4.3.1) where S = Ω is a region in R² with smooth boundary yields the classical Green Theorem. In this case, we have

(4.4.1)  β = f dx + g dy,  dβ = (∂g/∂x − ∂f/∂y) dx ∧ dy,

and hence (4.3.1) becomes the following.

Proposition 4.4.1. If Ω is a region in R² with smooth boundary, and f and g are smooth functions on Ω̅, which vanish outside some compact set in Ω̅, then

(4.4.2)  ∫∫_Ω (∂g/∂x − ∂f/∂y) dx dy = ∫_{∂Ω} (f dx + g dy).

Note that, if we have a vector field X = X₁∂/∂x + X₂∂/∂y on Ω, then the integrand on the left side of (4.4.2) is

(4.4.3)  ∂X₁/∂x + ∂X₂/∂y = div X,

provided g = X₁, f = −X₂. We obtain

(4.4.4)  ∫∫_Ω (div X) dx dy = ∫_{∂Ω} (−X₂ dx + X₁ dy).

If ∂Ω is parametrized by arc-length, as γ(s) = (x(s), y(s)), with orientation as defined for Proposition 4.3.1, then the unit normal ν, to ∂Ω, pointing out of Ω, is given by ν(s) = (y′(s), −x′(s)), and (4.4.4) is equivalent to

(4.4.5)  ∫∫_Ω (div X) dx dy = ∫_{∂Ω} ⟨X, ν⟩ ds.

This is a special case of Gauss' Divergence Theorem. We now derive a more general form of the Divergence Theorem. We begin with a definition of the divergence of a vector field on a surface M.

Let M be a region in Rⁿ, or an n-dimensional surface in R^{n+k}, provided with a volume form

(4.4.6)  ω_M ∈ Λⁿ M.

Let X be a vector field on M. Then the divergence of X, denoted div X, is a function on M given by

(4.4.7)  (div X) ω_M = d(ω_M ⌋ X).

If M = Rⁿ, with the standard volume form

(4.4.8)  ω = dx₁ ∧ ··· ∧ dx_n,

and if

(4.4.9)  X = Σ X^j(x) ∂/∂x_j,

then

(4.4.10)  ω ⌋ X = Σ_{j=1}^n (−1)^{j−1} X^j(x) dx₁ ∧ ··· ∧ \widehat{dx_j} ∧ ··· ∧ dx_n.

Hence, in this case, (4.4.7) yields the familiar formula

(4.4.11)  div X = Σ_{j=1}^n ∂_j X^j,

where we use the notation

(4.4.12)  ∂_j f = ∂f/∂x_j.

Suppose now that M is endowed with both an orientation and a metric tensor g_{jk}(x). Then M carries a natural volume element ω_M, determined by the condition that, if one has an orientation-preserving coordinate system in which g_{jk}(p₀) = δ_{jk}, then ω_M(p₀) = dx₁ ∧ ··· ∧ dx_n. This condition produces the following formula, in any orientation-preserving coordinate system:

(4.4.13)  ω_M = √g dx₁ ∧ ··· ∧ dx_n,  g = det(g_{jk}),

by the same sort of calculations as done in (3.2.10)–(3.2.15).

We now compute div X when the volume element on M is given by (4.4.13). We have

(4.4.14)  ω_M ⌋ X = Σ_j (−1)^{j−1} X^j √g dx₁ ∧ ··· ∧ \widehat{dx_j} ∧ ··· ∧ dx_n,

and hence

(4.4.15)  d(ω_M ⌋ X) = ∂_j(√g X^j) dx₁ ∧ ··· ∧ dx_n.

Here, as below, we use the summation convention. Hence the formula (4.4.7) gives

(4.4.16)  div X = g^{−1/2} ∂_j(g^{1/2} X^j).

Compare (3.2.56).
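As a consistency check on (4.4.16), take polar coordinates on R² \ 0, where √g = r. For the radial unit field X = ∂/∂r, (4.4.16) gives div X = r⁻¹ ∂_r(r · 1) = 1/r, which must agree with the Cartesian formula (4.4.11) applied to X = (x/r, y/r). A small finite-difference sketch (the step h is an arbitrary choice):

```python
import math

# Check div X = (1/r) d/dr (r * 1) = 1/r for the radial unit field,
# against the Cartesian divergence of X = (x/r, y/r) via (4.4.11),
# computed by centered finite differences (step h is an assumption).
def X(x, y):
    r = math.hypot(x, y)
    return x / r, y / r

def div_cartesian(x, y, h=1e-5):
    return ((X(x + h, y)[0] - X(x - h, y)[0]) / (2 * h)
            + (X(x, y + h)[1] - X(x, y - h)[1]) / (2 * h))

x, y = 0.6, 0.8           # a point with r = 1
r = math.hypot(x, y)
div_polar = 1.0 / r       # g^{-1/2} d_r(g^{1/2} X^r) with sqrt(g) = r
assert abs(div_cartesian(x, y) - div_polar) < 1e-6
```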

We now derive the Divergence Theorem, as a consequence of Stokes' formula, which we recall is

(4.4.17)  ∫_M dα = ∫_{∂M} α,

for an (n − 1)-form α on M, assumed to be a smooth compact oriented surface with boundary. If α = ω_M ⌋ X, formula (4.4.7) gives

(4.4.18)  ∫_M (div X) ω_M = ∫_{∂M} ω_M ⌋ X.

This is one form of the Divergence Theorem. We will produce an alternative expression for the integrand on the right before stating the result formally.

Given that ω_M is the volume form for M determined by a Riemannian metric, we can write the interior product ω_M ⌋ X in terms of the volume element ω_{∂M} on ∂M, with its induced orientation and Riemannian metric, as follows. Pick coordinates on M, centered at p₀ ∈ ∂M, such that ∂M is tangent to the hyperplane {x₁ = 0} at p₀ = 0 (with M to the left of ∂M), and such that g_{jk}(p₀) = δ_{jk}, so ω_M(p₀) = dx₁ ∧ ··· ∧ dx_n. Consequently, ω_{∂M}(p₀) = dx₂ ∧ ··· ∧ dx_n. It follows that, at p₀,

(4.4.19)  j*(ω_M ⌋ X) = ⟨X, ν⟩ ω_{∂M},

where ν is the unit vector normal to ∂M, pointing out of M, and j : ∂M ↪ M the natural inclusion. The two sides of (4.4.19), which are both defined in a coordinate-independent fashion, are hence equal on ∂M, and the identity (4.4.18) becomes

(4.4.20)  ∫_M (div X) ω_M = ∫_{∂M} ⟨X, ν⟩ ω_{∂M}.

Finally, we adopt the notation of the sort used in §§3.1–3.2. We denote the volume element on M by dV and that on ∂M by dS, obtaining the Divergence Theorem:

Theorem 4.4.2. If M is a compact surface with boundary, and X a smooth vector field on M, then

(4.4.21)  ∫_M (div X) dV = ∫_{∂M} ⟨X, ν⟩ dS,

where ν is the unit outward-pointing normal to ∂M.
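Theorem 4.4.2 can be checked numerically in a simple case: take M to be the closed unit disk in R² and X = (x, y), so div X = 2 and, on the unit circle, ν = (x, y) and ⟨X, ν⟩ = 1. Both sides of (4.4.21) should equal 2π (a sketch with an arbitrary quadrature resolution):

```python
import math

N = 4000

# Left side of (4.4.21): integrate div X = 2 over the unit disk,
# in polar coordinates (the integrand does not depend on theta).
lhs = 0.0
for i in range(N):
    r = (i + 0.5) / N
    lhs += 2.0 * r * (1.0 / N) * (2 * math.pi)

# Right side: integrate <X, nu> over the unit circle.
rhs = 0.0
for k in range(N):
    t = 2 * math.pi * (k + 0.5) / N
    x, y = math.cos(t), math.sin(t)
    rhs += (x * x + y * y) * (2 * math.pi / N)   # <X, nu> = x^2 + y^2 = 1

assert abs(lhs - 2 * math.pi) < 1e-9
assert abs(rhs - 2 * math.pi) < 1e-9
```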

The only point left to mention here is that M need not be orientable. Indeed, we can treat the integrals in (4.4.21) as surface integrals, as in §3.2, and note that all objects in (4.4.21) are independent of a choice of orientation. To prove the general case, just use a partition of unity supported on orientable pieces.

We obtain some further integral identities. First, we apply (4.4.21) with X replaced by uX. We have the following "derivation" identity:

(4.4.22)  div uX = u div X + ⟨du, X⟩ = u div X + Xu,

which follows easily from the formula (4.4.16). The Divergence Theorem immediately gives

(4.4.23)  ∫_M (div X)u dV + ∫_M Xu dV = ∫_{∂M} ⟨X, ν⟩u dS.

Replacing u by uv and using the derivation identity X(uv) = (Xu)v + u(Xv), we have

(4.4.24)  ∫_M [(Xu)v + u(Xv)] dV = − ∫_M (div X)uv dV + ∫_{∂M} ⟨X, ν⟩uv dS.

It is very useful to apply (4.4.23) to a gradient vector field X. If v is a smooth function on M, grad v is a vector field satisfying

(4.4.25)  ⟨grad v, Y⟩ = ⟨Y, dv⟩,

for any vector field Y, where the inner product on the left is given by the metric tensor on M and that on the right by the natural pairing of vector fields and 1-forms. Hence grad v = X has components X^j = g^{jk} ∂_k v, where (g^{jk}) is the matrix inverse of (g_{jk}). Applying div to grad v defines the Laplace operator:

(4.4.26)  Δv = div grad v = g^{−1/2} ∂_j (g^{jk} g^{1/2} ∂_k v).

When M is a region in Rⁿ and we use the standard Euclidean metric, so div X is given by (4.4.11), we have the Laplace operator on Euclidean space:

(4.4.27)  Δv = ∂²v/∂x₁² + ··· + ∂²v/∂x_n².

Now, setting X = grad v in (4.4.23), we have Xu = ⟨grad u, grad v⟩, and ⟨X, ν⟩ = ⟨ν, grad v⟩, which we call the normal derivative of v, and denote ∂v/∂ν. Hence

(4.4.28)  ∫_M u(Δv) dV = − ∫_M ⟨grad u, grad v⟩ dV + ∫_{∂M} u (∂v/∂ν) dS.

If we interchange the roles of u and v and subtract, we have

(4.4.29)  ∫_M u(Δv) dV = ∫_M (Δu)v dV + ∫_{∂M} [u (∂v/∂ν) − v (∂u/∂ν)] dS.

Formulas (4.4.28)–(4.4.29) are also called Green formulas. We will make further use of them in §5.1.

We return to the Green formula (4.4.2), and give it another formulation. Consider a vector field Z = (f, g, h) on a region in R³ containing the planar surface U = {(x, y, 0) : (x, y) ∈ Ω}. If we form

(4.4.30)  curl Z = det [ i  j  k ; ∂_x  ∂_y  ∂_z ; f  g  h ],

we see that the integrand on the left side of (4.4.2) is the k-component of curl Z, so (4.4.2) can be written

(4.4.31)  ∫∫_U (curl Z) · k dA = ∫_{∂U} (Z · T) ds,

where T is the unit tangent vector to ∂U. To see how to extend this result, note that k is a unit normal field to the planar surface U.

To formulate and prove the extension of (4.4.31) to any compact oriented surface with boundary in R³, we use the relation between curl and exterior derivative discussed in Exercises 2–3 of §4.2. In particular, if we set

(4.4.32)  F = Σ_{j=1}^3 f_j(x) ∂/∂x_j,  φ = Σ_{j=1}^3 f_j(x) dx_j,

then curl F = Σ_{j=1}^3 g_j(x) ∂/∂x_j, where

(4.4.33)  dφ = g₁(x) dx₂ ∧ dx₃ + g₂(x) dx₃ ∧ dx₁ + g₃(x) dx₁ ∧ dx₂.

Now suppose M is a smooth oriented (n − 1)-dimensional surface with boundary in Rⁿ. Using the orientation of M, we pick a unit normal field N to M as follows. Take a smooth function v which vanishes on M but such that ∇v(x) ≠ 0 on M. Thus ∇v is normal to M. Let σ ∈ Λ^{n−1}(M) define the orientation of M. Then dv ∧ σ = a(x) dx₁ ∧ ··· ∧ dx_n, where a(x) is nonvanishing on M. For x ∈ M, we take N(x) = ∇v(x)/|∇v(x)| if a(x) > 0, and N(x) = −∇v(x)/|∇v(x)| if a(x) < 0. We call N the "positive" unit normal field to the oriented surface M, in this case. Part of the motivation for this characterization of N is that, if Ω ⊂ Rⁿ is an open set with smooth boundary M = ∂Ω, and we give M the induced orientation, as described in §4.3, then the positive normal field N just defined coincides with the unit normal field pointing out of Ω. Compare Exercises 2–3 of §4.3.

Now, if G = (g₁, ..., g_n) is a vector field defined near M, then

(4.4.34)  ∫_M (N · G) dS = ∫_M ( Σ_{j=1}^n (−1)^{j−1} g_j(x) dx₁ ∧ ··· \widehat{dx_j} ··· ∧ dx_n ).

This result follows from (4.4.19). When n = 3 and G = curl F, we deduce from (4.4.32)–(4.4.33) that

(4.4.35)  ∫∫_M dφ = ∫∫_M (N · curl F) dS.

Furthermore, in this case we have

(4.4.36)  ∫_{∂M} φ = ∫_{∂M} (F · T) ds,

where T is the unit tangent vector to ∂M, specified as follows by the orientation of ∂M: if τ ∈ Λ¹(∂M) defines the orientation of ∂M, then ⟨T, τ⟩ > 0 on ∂M. We call T the "forward" unit tangent vector field to the oriented curve ∂M. By the calculations above, we have the classical Stokes formula:

Proposition 4.4.3. If M is a compact oriented surface with boundary in R³, and F is a C¹ vector field on a neighborhood of M, then

(4.4.37)  ∫∫_M (N · curl F) dS = ∫_{∂M} (F · T) ds,

where N is the positive unit normal field on M and T the forward unit tangent field to ∂M.

Remark. The right side of (4.4.37) is called the of F about ∂M. Proposition 4.4.3 shows how curl F arises to measure this circulation.
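A numerical sanity check of (4.4.37): take M to be the upper unit hemisphere and F = (−y, x, 0), so curl F = (0, 0, 2) and N(x) = x (the outward radial field). Both sides should equal 2π. (A sketch; the grid sizes are arbitrary.)

```python
import math

Nphi, Nth = 2000, 2000

# Surface side: N . curl F = 2 cos(phi); dS = sin(phi) dphi dtheta,
# and the integrand is independent of theta.
surf = 0.0
for i in range(Nphi):
    phi = (math.pi / 2) * (i + 0.5) / Nphi
    surf += 2 * math.cos(phi) * math.sin(phi) * (math.pi / 2 / Nphi) * (2 * math.pi)

# Boundary side: the equator, counterclockwise as seen from +z;
# F . T ds = -y dx + x dy.
line = 0.0
for k in range(Nth):
    t = 2 * math.pi * (k + 0.5) / Nth
    x, y = math.cos(t), math.sin(t)
    dx, dy = -math.sin(t), math.cos(t)
    line += (-y * dx + x * dy) * (2 * math.pi / Nth)

assert abs(surf - 2 * math.pi) < 1e-5
assert abs(line - 2 * math.pi) < 1e-9
```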

Direct proof of the Divergence Theorem

Let Ω be a bounded open subset of Rⁿ, with C¹ smooth boundary ∂Ω. Hence, for each p ∈ ∂Ω, there is a neighborhood U of p in Rⁿ, a rotation of coordinate axes, and a C¹ function u : O → R, defined on an open set O ⊂ R^{n−1}, such that

Ω ∩ U = {x ∈ Rⁿ : x_n ≤ u(x′), x′ ∈ O} ∩ U,

where x = (x′, x_n), x′ = (x₁, ..., x_{n−1}). We aim to prove that, given f ∈ C¹(Ω̅), and any constant vector e ∈ Rⁿ,

(4.4.38)  ∫_Ω e · ∇f(x) dx = ∫_{∂Ω} (e · N) f dS,

where dS is surface measure on ∂Ω and N(x) is the unit normal to ∂Ω, pointing out of Ω. At x = (x′, u(x′)) ∈ ∂Ω, we have

(4.4.39)  N = (1 + |∇u|²)^{−1/2} (−∇u, 1).

To prove (4.4.38), we may as well suppose f is supported in such a neighborhood U. Then we have

(4.4.40)  ∫_Ω ∂f/∂x_n dx = ∫_O ( ∫_{x_n ≤ u(x′)} ∂_n f(x′, x_n) dx_n ) dx′
        = ∫_O f(x′, u(x′)) dx′
        = ∫_{∂Ω} (e_n · N) f dS.

The first identity in (4.4.40) follows from Theorem 3.1.9, the second identity from the Fundamental Theorem of Calculus, and the third identity from the identification

dS = (1 + |∇u|²)^{1/2} dx′,

established in (3.2.22). We use the standard basis {e₁, ..., e_n} of Rⁿ.

Such an argument works when e_n is replaced by any constant vector e with the property that we can represent ∂Ω ∩ U as the graph of a function y_n = ũ(y′), with the y_n-axis parallel to e. In particular, it works for e = e_n + a e_j, for 1 ≤ j ≤ n − 1 and for |a| sufficiently small. Thus, we have

(4.4.41)  ∫_Ω (e_n + a e_j) · ∇f(x) dx = ∫_{∂Ω} (e_n + a e_j) · N f dS.

If we subtract (4.4.40) from this and divide the result by a, we obtain (4.4.38) for e = e_j, for all j, and hence (4.4.38) holds in general.

Note that replacing e by e_j and f by f_j in (4.4.38), and summing over 1 ≤ j ≤ n, yields

(4.4.42)  ∫_Ω (div F) dx = ∫_{∂Ω} N · F dS,

for the vector field F = (f₁, ..., f_n). This is the usual statement of Gauss' Divergence Theorem, as given in Theorem 4.4.2 (specialized to domains in Rⁿ). Reversing the argument leading from (4.4.2) to (4.4.5), we also obtain another proof of Green's Theorem, in the form (4.4.2).

Exercises

1. Newton's equation m d²x/dt² = −∇V(x) for the motion in Rⁿ of a body of mass m, in a potential force field F = −∇V, can be converted to a first-order system for (x, ξ), with ξ = m dx/dt. One gets

d/dt (x, ξ) = H_f(x, ξ),

where H_f is a "Hamiltonian vector field" on R^{2n}, given by

H_f = Σ_{j=1}^n [ (∂f/∂ξ_j) ∂/∂x_j − (∂f/∂x_j) ∂/∂ξ_j ].

In the case described above,

f(x, ξ) = (1/2m)|ξ|² + V(x).

Calculate div H_f from (4.4.11).

2. Let X be a smooth vector field on a smooth surface M, generating a flow F_X^t. Let O ⊂ M be a compact, smoothly bounded subset, and set O_t = F_X^t(O). As seen in Proposition 3.2.7,

(4.4.43)  d/dt Vol(O_t) = ∫_{O_t} (div X) dV.

Use the Divergence Theorem to deduce from this that

(4.4.44)  d/dt Vol(O_t) = ∫_{∂O_t} ⟨X, ν⟩ dS.

Remark. Conversely, a direct proof of (4.4.44), together with the Divergence Theorem, would lead to another proof of (4.4.43).

3. Show that, if F : R3 → R3 is a linear rotation, then, for a C1 vector field Z on R3,

(4.4.45) F#(curl Z) = curl(F#Z).

4. Let M be the graph in R³ of a smooth function z = u(x, y), (x, y) ∈ O ⊂ R², a bounded region with smooth boundary (maybe with corners). Show that

(4.4.46)  ∫∫_M (curl F · N) dS = ∫∫_O [ (∂F₃/∂y − ∂F₂/∂z)(−∂u/∂x) + (∂F₁/∂z − ∂F₃/∂x)(−∂u/∂y) + (∂F₂/∂x − ∂F₁/∂y) ] dx dy,

where the derivatives ∂F_j/∂x, etc., are evaluated at (x, y, u(x, y)). Show that

(4.4.47)  ∫_{∂M} (F · T) ds = ∫_{∂O} (F̃₁ + F̃₃ ∂u/∂x) dx + (F̃₂ + F̃₃ ∂u/∂y) dy,

where F̃_j(x, y) = F_j(x, y, u(x, y)). Apply Green's Theorem, with f = F̃₁ + F̃₃(∂u/∂x), g = F̃₂ + F̃₃(∂u/∂y), to show that the right sides of (4.4.46) and (4.4.47) are equal, hence proving Stokes' Theorem in this case.

5. Let M ⊂ Rⁿ be the graph of a function x_n = u(x′), x′ = (x₁, ..., x_{n−1}). If

β = Σ_{j=1}^n (−1)^{j−1} g_j(x) dx₁ ∧ ··· ∧ \widehat{dx_j} ∧ ··· ∧ dx_n,

as in (4.4.34), and φ(x′) = (x′, u(x′)), show that

φ*β = (−1)^n [ Σ_{j=1}^{n−1} g_j(x′, u(x′)) ∂u/∂x_j − g_n(x′, u(x′)) ] dx₁ ∧ ··· ∧ dx_{n−1}
    = (−1)^{n−1} G · (−∇u, 1) dx₁ ∧ ··· ∧ dx_{n−1},

where G = (g₁, ..., g_n), and verify the identity (4.4.34) in this case.
Hint. For the last part, recall Exercises 2–3 of §4.3, regarding the orientation of M.

6. Let S be a smooth oriented 2-dimensional surface in R3, and M an open subset of S, with smooth boundary; see Figure 4.4.1. Let N be the positive unit normal field to S, defined by its orientation. For x ∈ ∂M, let ν(x) be the unit vector, tangent to M, normal to ∂M, and pointing out of M, and let T be the forward unit tangent vector field to ∂M. Show that, on ∂M, N × ν = T, ν × T = N.

7. If M is an oriented (n − 1)-dimensional surface in Rⁿ, with positive unit normal field N, show that the volume element ω_M on M is given by

ω_M = ω ⌋ N,

where ω = dx₁ ∧ ··· ∧ dx_n is the standard volume form on Rⁿ. Deduce that the volume element on the unit sphere S^{n−1} ⊂ Rⁿ is given by

ω_{S^{n−1}} = Σ_{j=1}^n (−1)^{j−1} x_j dx₁ ∧ ··· ∧ \widehat{dx_j} ∧ ··· ∧ dx_n,

if S^{n−1} inherits the orientation as the boundary of the unit ball.

Figure 4.4.1. Setup for classical Stokes formula

8. Let M be a Ck surface, k ≥ 2. Suppose φ : M → M is a Ck isometry, i.e., it preserves the metric tensor. Taking φ∗u(x) = u(φ(x)) for u ∈ C2(M), show that

(4.4.48) ∆φ∗u = φ∗∆u.

Hint. The Laplace operator is uniquely specified by the metric tensor on M, via (4.4.26).

9. Let X and Y be smooth vector fields on an open set Ω ⊂ R3. Show that

Y · curl X − X · curl Y = div(X × Y ).

10. In the setting of Exercise 9, assume Ω is compact and smoothly bounded, and that X and Y are C¹ on Ω̅. Show that

∫_Ω X · curl Y dx = ∫_Ω Y · curl X dx,

provided either (a) X is normal to ∂Ω, or (b) X is parallel to Y on ∂Ω.

11. Recall the formula (3.2.26) for the metric tensor of Rⁿ in spherical polar coordinates R : (0, ∞) × S^{n−1} → Rⁿ, R(r, ω) = rω. Using (4.4.26), show that if u ∈ C²(Rⁿ), then

(4.4.49)  Δu(rω) = ∂²u/∂r²(rω) + ((n − 1)/r) ∂u/∂r(rω) + (1/r²) Δ_S u(rω),

where Δ_S is the Laplace operator on S^{n−1}. Deduce that

(4.4.50)  u(x) = f(|x|) ⟹ Δu(rω) = f″(r) + ((n − 1)/r) f′(r).
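Formula (4.4.50) is easy to test: for u(x) = |x|² in R³, Δu = 6, and f(r) = r² gives f″(r) + ((n − 1)/r)f′(r) = 2 + 2(n − 1) = 6 as well. A finite-difference sketch (the point p and step h are arbitrary choices):

```python
n = 3
h = 1e-3

def u(p):
    return sum(c * c for c in p)        # u(x) = |x|^2, so Delta u = 2n = 6

def laplacian(p):
    """Sum of centered second differences, approximating (4.4.27)."""
    total = 0.0
    for j in range(n):
        pp, pm = list(p), list(p)
        pp[j] += h
        pm[j] -= h
        total += (u(pp) - 2 * u(p) + u(pm)) / (h * h)
    return total

p = (0.3, -0.5, 0.7)
r = u(p) ** 0.5
radial = 2.0 + (n - 1) / r * (2.0 * r)   # f'' + ((n-1)/r) f' for f(r) = r^2
assert abs(laplacian(p) - 6.0) < 1e-6
assert abs(radial - 6.0) < 1e-12
```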

12. Show that

(4.4.51)  |x|^{−(n−2)} is harmonic on Rⁿ \ 0.

In case n = 2, show that

(4.4.52)  log |x| is harmonic on R² \ 0.

In Exercise 13, we take n ≥ 3 and consider

Gf(x) = (1/C_n) ∫_{Rⁿ} f(y) |x − y|^{−(n−2)} dy = (1/C_n) ∫_{Rⁿ} f(x − y) |y|^{−(n−2)} dy,

with C_n = −(n − 2) A_{n−1}.

13. Assume f ∈ C₀²(Rⁿ). Let Ω_ε = Rⁿ \ B_ε, where B_ε = {x ∈ Rⁿ : |x| < ε}. Verify that

C_n ΔGf(0) = lim_{ε→0} ∫_{Ω_ε} Δf(x) · |x|^{2−n} dx
 = lim_{ε→0} ∫_{Ω_ε} [ Δf(x) · |x|^{2−n} − f(x) Δ|x|^{2−n} ] dx
 = −lim_{ε→0} ∫_{∂B_ε} [ ε^{2−n} ∂f/∂r − (2 − n) ε^{1−n} f ] dS
 = −(n − 2) A_{n−1} f(0),

using (4.4.29) for the third identity. Use this to show that ΔGf(x) = f(x).

14. Work out the analogue of Exercise 13 in case n = 2 and

Gf(x) = (1/2π) ∫_{R²} f(y) log |x − y| dy.

4.5. Differential forms and the change of variable formula

The change of variable formula for one-variable integrals,

(4.5.1)  ∫_a^t f(φ(x)) φ′(x) dx = ∫_{φ(a)}^{φ(t)} f(x) dx,

given f continuous and φ of class C¹, is easily established via the fundamental theorem of calculus and the chain rule. We recall how this was done in §1.1. If we denote the left side of (4.5.1) by A(t) and the right side by B(t), we apply these results to get

(4.5.2)  A′(t) = f(φ(t)) φ′(t) = B′(t),

and since A(a) = B(a) = 0, another application of the fundamental theorem of calculus (or simply the mean value theorem) gives A(t) = B(t).

For multiple integrals, the change of variable formula takes the following form, given in Proposition 3.1.14:

Theorem 4.5.1. Let O, Ω be connected, open sets in Rⁿ and let φ : O → Ω be a C¹ diffeomorphism. Given f continuous on Ω, with compact support, we have

(4.5.3)  ∫_O f(φ(x)) |det Dφ(x)| dx = ∫_Ω f(x) dx.

There are many variants of Theorem 4.5.1. In particular, one wants to extend the class of functions f for which (4.5.3) holds, but once one has Theorem 4.5.1 as stated, such extensions are relatively painless. See the derivation of Theorem 3.1.15.

Let's face it; the proof of Theorem 4.5.1 given in §3.1 was a grim affair, involving careful estimates of volumes of images of small cubes under the map φ and numerous pesky details. Recently, P. Lax [28] found a fresh approach to the proof of the multidimensional change of variable formula. More precisely, [28] established the following result, of which Theorem 4.5.1 is a consequence.

Theorem 4.5.2. Let φ : Rⁿ → Rⁿ be a C¹ map. Assume φ(x) = x for |x| ≥ R. Let f be a continuous function on Rⁿ with compact support. Then

(4.5.4)  ∫ f(φ(x)) det Dφ(x) dx = ∫ f(x) dx.

We will give a variant of the proof of [28]. One difference between this proof and that of [28] is that we use the language of differential forms.
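Before turning to the proof, Theorem 4.5.2 can be probed numerically even when φ is far from injective. The one-dimensional sketch below uses the hypothetical choices φ(x) = x + 2e^{−x²} (whose derivative changes sign, so φ is not one-to-one) and a Gaussian f standing in for a compactly supported function; φ(x) = x then holds only up to an exponentially small error at the ends of the quadrature interval.

```python
import math

phi = lambda x: x + 2.0 * math.exp(-x * x)           # not one-to-one
dphi = lambda x: 1.0 - 4.0 * x * math.exp(-x * x)    # changes sign for moderate x > 0
f = lambda x: math.exp(-(x - 1.0) ** 2)              # integral = sqrt(pi)

a, b, N = -12.0, 12.0, 100000
h = (b - a) / N
xs = [a + (k + 0.5) * h for k in range(N)]           # midpoint rule
lhs = sum(f(phi(x)) * dphi(x) for x in xs) * h       # left side of (4.5.4)
rhs = sum(f(x) for x in xs) * h                      # right side of (4.5.4)

assert abs(rhs - math.sqrt(math.pi)) < 1e-6
assert abs(lhs - rhs) < 1e-4
```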

Proof of Theorem 4.5.2. Via standard approximation arguments, it suffices to prove this when φ is C² and f ∈ C₀¹(Rⁿ), which we will assume from here on.

To begin, pick A > 0 such that f(x − Ae₁) is supported in {x : |x| > R}, where e₁ = (1, 0, ..., 0). Also take A large enough that the image of {x : |x| ≤ R} under φ does not intersect the support of f(x − Ae₁). We can set

(4.5.5)  F(x) = f(x) − f(x − Ae₁) = ∂ψ/∂x₁ (x),

where

(4.5.6)  ψ(x) = ∫_0^A f(x − se₁) ds,  ψ ∈ C₀¹(Rⁿ).

Then we have the following identities involving n-forms:

(4.5.7)  α = F(x) dx₁ ∧ ··· ∧ dx_n = (∂ψ/∂x₁) dx₁ ∧ ··· ∧ dx_n
       = dψ ∧ dx₂ ∧ ··· ∧ dx_n
       = d(ψ dx₂ ∧ ··· ∧ dx_n),

i.e., α = dβ, with β = ψ dx₂ ∧ ··· ∧ dx_n a compactly supported (n − 1)-form of class C¹. Now the pull-back of α under φ is given by

(4.5.8)  φ*α = F(φ(x)) det Dφ(x) dx₁ ∧ ··· ∧ dx_n.

Furthermore, the right side of (4.5.8) is equal to

(4.5.9)  f(φ(x)) det Dφ(x) dx₁ ∧ ··· ∧ dx_n − f(x − Ae₁) dx₁ ∧ ··· ∧ dx_n.

Hence we have

(4.5.10)  ∫ f(φ(x)) det Dφ(x) dx₁ ··· dx_n − ∫ f(x) dx₁ ··· dx_n = ∫ φ*α = ∫ φ*dβ = ∫ d(φ*β),

where we use the general identity

(4.5.11)  φ* dβ = d(φ*β),

a consequence of the chain rule. On the other hand, a very special case of Stokes' theorem applies to

(4.5.12)  φ*β = γ = Σ_j γ_j(x) dx₁ ∧ ··· ∧ \widehat{dx_j} ∧ ··· ∧ dx_n,

with γ_j ∈ C₀¹(Rⁿ). Namely,

(4.5.13)  dγ = Σ_j (−1)^{j−1} (∂γ_j/∂x_j) dx₁ ∧ ··· ∧ dx_n,

and hence, by the fundamental theorem of calculus,

(4.5.14)  ∫ dγ = 0.

This gives the desired identity (4.5.4), from (4.5.10). □

We make some remarks on Theorem 4.5.2. Note that φ is not assumed to be one-to-one or onto. In fact, as noted in [28], the identity (4.5.4) implies that such φ must be onto, and this has important topological implications. We mention that, if one puts absolute values around det Dφ(x) in (4.5.4), the appropriate formula is

(4.5.15)  ∫ f(φ(x)) |det Dφ(x)| dx = ∫ f(x) n(x) dx,

where n(x) = #{y : φ(y) = x}. A proof of (4.5.15) can be found in texts on geometric measure theory.

As noted in [28], Theorem 4.5.2 was proven in [4]. The proof there makes use of differential forms and Stokes' theorem, but it is quite different from the proof given here. A crucial difference is that the proof in [4] requires that one already knows the change of variable formula as formulated in Theorem 4.5.1.

We now show how Theorem 4.5.1 can be deduced from Theorem 4.5.2. We will use the following lemma.

Lemma 4.5.3. In the setting of Theorem 4.5.1, and with det Dφ > 0, given p ∈ Ω, there exists a neighborhood U of p and a C¹ map Φ : Rⁿ → Rⁿ such that

(4.5.16)  Φ = φ on φ⁻¹(U),  Φ(x) = x for |x| large,

and

(4.5.17)  Φ(x) ∈ U ⟹ x ∈ φ⁻¹(U).

Granted the lemma, we proceed as follows. Assume det Dφ > 0 on O. Given f ∈ C(Ω), supp f ⊂ K, compact in Ω, cover K with a finite number of subsets U_j as in Lemma 4.5.3, and, using a continuous partition of unity (cf. §3.3), write f = Σ_j f_j, supp f_j ⊂ U_j. Also, let Φ_j have the obvious significance. By Theorem 4.5.2, we have

(4.5.18)  ∫ f_j(Φ_j(x)) det DΦ_j(x) dx = ∫ f_j dx.

But we also have

(4.5.19)  ∫ f_j(Φ_j(x)) det DΦ_j(x) dx = ∫_O f_j(φ(x)) det Dφ(x) dx.

Now summing over j gives (4.5.3). If we do not have det Dφ > 0 on O, then det Dφ < 0 on O. In this case, one can compose with the map

(4.5.20)  κ : Rⁿ → Rⁿ,  κ(x₁, x′) = (−x₁, x′)

(for which Theorem 4.5.1 is elementary) and readily recover the desired result.

We turn to the proof of Lemma 4.5.3. Say q = φ⁻¹(p), Dφ(q) = A ∈ Gℓ⁺(n, R), i.e., A ∈ Gℓ(n, R) and det A > 0. Translating coordinates, we can assume p = q = 0. We set

(4.5.21)  Ψ(x) = β(x)φ(x) + (1 − β(x))Ax,

where β ∈ C₀^∞(Rⁿ) has support in a small neighborhood of q and β ≡ 1 on a smaller neighborhood V = φ⁻¹(U), chosen so that we can apply Corollary 2.2.4, to deduce that

(4.5.22)  Ψ maps Rⁿ diffeomorphically onto its image, an open set in Rⁿ.

In fact, estimates behind the proof of Proposition 2.2.2 imply that, for appropriately chosen β, there exists b > 0 such that |Ψ(x) − Ψ(y)| ≥ b|x − y| for all x, y ∈ Rⁿ. Hence the image Ψ(Rⁿ) is closed in Rⁿ, as well as open, so actually Ψ maps Rⁿ diffeomorphically onto Rⁿ.

Note that Ψ = φ on V = φ⁻¹(U). We want to alter Ψ(x) for large |x| to obtain Φ, satisfying (4.5.16)–(4.5.17). To do this, we use the fact that Gℓ⁺(n, R) is connected (see Proposition 3.2.14). Pick a smooth path Γ : [0, 1] → Gℓ⁺(n, R) such that Γ(t) = A for t ∈ [0, 1/4] and Γ(t) = I for t ∈ [3/4, 1]. Let

(4.5.23)  M = sup_{0 ≤ t ≤ 1} ‖Γ(t)⁻¹‖,  so |Γ(t)x| ≥ M⁻¹|x|, ∀ x ∈ Rⁿ.

Now assume U ⊂ B_{R₁} = {x ∈ Rⁿ : |x| < R₁}, so Ψ(V) ⊂ B_{R₁}. Next, take R₂ so large that V = φ⁻¹(U) ⊂ B_{R₂} and

(4.5.24)  |x| ≥ R₂ ⟹ |Ψ(x)| > M R₁ and Ψ(x) = Ax.

Now set

(4.5.25)  Φ(x) = Ψ(x) for |x| ≤ R₂,
          Γ(t)x for |x| = R₂ + t, 0 ≤ t ≤ 1,
          x for |x| ≥ R₂ + 1.

This map has the properties (4.5.16)–(4.5.17). □

Chapter 5

Applications of the Gauss-Green-Stokes formula

In this chapter we present two major types of applications of the theory of differential forms developed in Chapter 4.

The first set of applications, given in §5.1, deals with complex function theory. If Ω ⊂ C is an open set, a C¹ function f : Ω → C is said to be holomorphic if it is complex differentiable, or equivalently if it satisfies a set of equations called the Cauchy-Riemann equations. We deduce from Green's theorem that if Ω is a smoothly bounded domain and f ∈ C¹(Ω̅) is holomorphic on Ω, then we have the Cauchy integral theorem,

(5.0.1)  ∫_{∂Ω} f(z) dz = 0,

and the Cauchy integral formula,

(5.0.2)  f(z₀) = (1/2πi) ∫_{∂Ω} f(z)/(z − z₀) dz,  z₀ ∈ Ω.

These key results lead to further results on holomorphic functions, such as power series developments. In §5.1 we also consider functions on domains Ω ⊂ Rⁿ that are harmonic, and use Gauss-Green formulas to establish results about such functions, such as mean value properties, and Liouville's theorem, which states that a bounded harmonic function on all of Rⁿ must be constant. These results specialize to

holomorphic functions on C. One consequence is the fundamental theorem of algebra, which states that if p(z) is a nonconstant polynomial on C, it must have a complex root.

The second set of applications, given in §§5.2–5.3, yields important results on the topological behavior of smooth maps on regions in Rⁿ, and on surfaces and more generally on manifolds. A central notion here is that of degree theory. If X is a smooth, compact, oriented, n-dimensional surface, and F : X → Y is a smooth map to a compact, connected, oriented, n-dimensional surface Y, then the degree of F is given by

(5.0.3)  Deg(F) = ∫_X F*ω,

where ω is an n-form on Y such that ∫_Y ω = 1. That this is well defined, independent of the choice of such ω, is a consequence of the fundamental exactness criterion, given in Proposition 5.3.6, which says a smooth n-form α on Y is exact, i.e., has the form α = dβ, if and only if ∫_Y α = 0. With this, we are able to develop degree theory as a powerful tool. Applications range from the Brouwer fixed-point theorem (actually arising here as a precursor to degree theory) and the Jordan-Brouwer separation theorem (in the smooth case) to a degree-theory proof of the fundamental theorem of algebra.

We also consider on a compact surface M a vector field X with nondegenerate critical points, define the index of such a vector field, and show that

(5.0.4)  Index X = χ(M)

is independent of the choice of such a vector field. This defines an invariant χ(M), called the Euler characteristic. Investigations of χ(M) will play an important role in Chapter 6.

5.1. Holomorphic functions and harmonic functions

Let f be a complex-valued C¹ function on a region Ω ⊂ R². We identify R² with C, via z = x + iy, and write f(z) = f(x, y). We say f is holomorphic on Ω provided it is complex differentiable, in the sense that

(5.1.1)  lim_{h→0} (1/h)(f(z + h) − f(z)) exists,

for each z ∈ Ω. When this limit exists, we denote it f′(z), or df/dz. An equivalent condition (given f ∈ C¹) is that f satisfies the Cauchy-Riemann equation:

(5.1.2)  ∂f/∂x = (1/i) ∂f/∂y.

In such a case,

(5.1.3)  f′(z) = ∂f/∂x (z) = (1/i) ∂f/∂y (z).

Note that f(z) = z has this property, but f(z) = z̄ does not. The following is a convenient tool for producing more holomorphic functions.

Lemma 5.1.1. If f and g are holomorphic on Ω, so is fg.

Proof. We have

(5.1.4)  ∂(fg)/∂x = (∂f/∂x)g + f(∂g/∂x),  ∂(fg)/∂y = (∂f/∂y)g + f(∂g/∂y),

so if f and g satisfy the Cauchy-Riemann equation, so does fg. Note that

(5.1.5)  d(fg)/dz (z) = f′(z)g(z) + f(z)g′(z).  □

Using Lemma 5.1.1, one can show inductively that if k ∈ N, z^k is holomorphic on C, and

(5.1.6)  d/dz z^k = k z^{k−1}.

Also, a direct analysis of (5.1.1) gives this for k = −1, on C \ 0, and then an inductive argument gives it for each negative integer k, on C \ 0. The exercises explore various other important examples of holomorphic functions.

Our goal in this section is to show how Green's theorem can be used to establish basic results about holomorphic functions on domains in C (and also develop a study of harmonic functions on domains in Rⁿ). In Theorems 5.1.2–5.1.4, Ω will be a bounded domain with piecewise smooth boundary, and we assume Ω̅ can be partitioned into a finite number of C² domains with corners, as defined in §4.3.
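Complex differentiability in the sense of (5.1.1), and formula (5.1.6), can be probed numerically: for f(z) = z³ the difference quotient approaches 3z² no matter from which direction h → 0, while f(z) = z̄ gives direction-dependent quotients. (A sketch; the base point and step size are arbitrary choices.)

```python
import cmath

f = lambda z: z ** 3
z = 0.7 + 0.4j
exact = 3 * z ** 2                      # (5.1.6) with k = 3

# Difference quotients from several directions of h all agree.
for direction in (1, 1j, cmath.exp(0.5j), cmath.exp(2.3j)):
    h = 1e-6 * direction
    quotient = (f(z + h) - f(z)) / h
    assert abs(quotient - exact) < 1e-4

# By contrast, f(z) = conj(z) is not complex differentiable:
g = lambda w: w.conjugate()
q1 = (g(z + 1e-6) - g(z)) / 1e-6        # h real: quotient = 1
q2 = (g(z + 1e-6j) - g(z)) / 1e-6j      # h imaginary: quotient = -1
assert abs(q1 - q2) > 1.0
```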

To begin, we apply Green’s theorem to the ∫ ∫ f dz = f(dx + i dy).

∂Ω ∂Ω Clearly (9.2) applies to complex-valued functions, and if we set g = if, we get ∫ ∫∫ ( ) ∂f ∂f (5.1.7) f dz = i − dx dy. ∂x ∂y ∂Ω Ω Whenever f is holomorphic, the integrand on the right side of (5.1.7) van- ishes, so we have the following result, known as Cauchy’s Integral Theorem:

Theorem 5.1.2. If f ∈ C^1(\overline{Ω}) is holomorphic, then

(5.1.8)   \int_{\partial\Omega} f(z)\, dz = 0.

Using (5.1.8), we can establish Cauchy's Integral Formula:

Theorem 5.1.3. If f ∈ C^1(\overline{Ω}) is holomorphic and z_0 ∈ Ω, then

(5.1.9)   f(z_0) = \frac{1}{2\pi i} \int_{\partial\Omega} \frac{f(z)}{z - z_0}\, dz.

Proof. Note that g(z) = f(z)/(z − z_0) is holomorphic on Ω \ {z_0}. Let D_r be the disk of radius r centered at z_0. Pick r so small that \overline{D}_r ⊂ Ω. See Figure 5.1.1. Then (5.1.8), applied to Ω \ \overline{D}_r, implies

(5.1.10)   \int_{\partial\Omega} \frac{f(z)}{z - z_0}\, dz = \int_{\partial D_r} \frac{f(z)}{z - z_0}\, dz.

To evaluate the integral on the right, parametrize the curve ∂D_r by γ(θ) = z_0 + re^{iθ}. Hence dz = i re^{iθ} dθ, so the integral on the right is equal to

(5.1.11)   \int_0^{2\pi} \frac{f(z_0 + re^{iθ})}{re^{iθ}}\, i re^{iθ}\, dθ = i \int_0^{2\pi} f(z_0 + re^{iθ})\, dθ.

As r → 0, this tends in the limit to 2πi f(z_0), so (5.1.9) is established.  □
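Cauchy's integral formula (5.1.9) is easy to test numerically. The sketch below (hypothetical code, not from the text) discretizes the contour integral over a circle with the trapezoid rule and recovers f(z_0) for f = exp:

```python
import cmath, math

def cauchy_formula(f, z0, center=0, radius=1.0, n=2000):
    """Approximate (1/2πi) ∮ f(z)/(z - z0) dz over the circle
    |z - center| = radius, parametrized by z = center + r e^{iθ}."""
    total = 0
    for k in range(n):
        t = 2 * math.pi * k / n
        z = center + radius * cmath.exp(1j * t)
        dz = 1j * radius * cmath.exp(1j * t) * (2 * math.pi / n)
        total += f(z) / (z - z0) * dz
    return total / (2j * math.pi)

z0 = 0.3 + 0.2j   # any point inside the unit circle
assert abs(cauchy_formula(cmath.exp, z0) - cmath.exp(z0)) < 1e-10
```

The trapezoid rule on a smooth periodic integrand converges extremely fast, which is why a modest n already reproduces f(z_0) to high accuracy.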

Suppose f ∈ C^1(\overline{Ω}) is holomorphic, z_0 ∈ D_r ⊂ Ω, where D_r is the disk of radius r centered at z_0, and suppose z ∈ D_r. See Figure 5.1.2. Then Theorem 5.1.3 implies

(5.1.12)   f(z) = \frac{1}{2\pi i} \int_{\partial\Omega} \frac{f(ζ)}{(ζ - z_0) - (z - z_0)}\, dζ.

Figure 5.1.1. Proving Cauchy’s integral formula

We have the infinite series expansion

(5.1.13)   \frac{1}{(ζ - z_0) - (z - z_0)} = \frac{1}{ζ - z_0} \sum_{n=0}^{\infty} \Bigl( \frac{z - z_0}{ζ - z_0} \Bigr)^n,

valid as long as |z − z_0| < |ζ − z_0|. Hence, given |z − z_0| < r, this series is uniformly convergent for ζ ∈ ∂Ω, and we have

(5.1.14)   f(z) = \frac{1}{2\pi i} \sum_{n=0}^{\infty} \int_{\partial\Omega} \frac{f(ζ)}{ζ - z_0} \Bigl( \frac{z - z_0}{ζ - z_0} \Bigr)^n\, dζ.

We summarize what has been established.

Theorem 5.1.4. Given f ∈ C^1(\overline{Ω}), holomorphic on Ω, and a disk D_r ⊂ Ω as above, for z ∈ D_r, f(z) has the convergent power series expansion

(5.1.15)   f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n, \qquad a_n = \frac{1}{2\pi i} \int_{\partial\Omega} \frac{f(ζ)}{(ζ - z_0)^{n+1}}\, dζ.

Figure 5.1.2. Convergent power series on Dr(z0)

Note that, when (5.1.9) is applied to Ω = D_r, the disk of radius r centered at z_0, the computation (5.1.11) yields

(5.1.16)   f(z_0) = \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + re^{iθ})\, dθ = \frac{1}{\ell(\partial D_r)} \int_{\partial D_r} f(z)\, ds(z),

when f is holomorphic and C^1 on \overline{D}_r, and ℓ(∂D_r) = 2πr is the length of the circle ∂D_r. This is a mean value property, which extends to harmonic functions on domains in R^n, as we will see below.

Note that we can write the Cauchy-Riemann equation (5.1.2) as (∂_x + i∂_y)f = 0; applying the operator ∂_x − i∂_y to this gives

(5.1.17)   \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0

for any holomorphic function. A general C^2 solution to (5.1.17) on a region Ω ⊂ R^2 is called a harmonic function. More generally, if O is an open set in R^n, a function f ∈ C^2(O) is called harmonic if Δf = 0 on O, where, as in (4.4.27), Δ is the Laplace operator,

(5.1.18)   Δf = \frac{\partial^2 f}{\partial x_1^2} + \cdots + \frac{\partial^2 f}{\partial x_n^2}.

Generalizing (5.1.16), we have the following, known as the mean value property of harmonic functions:

Proposition 5.1.5. Let Ω ⊂ R^n be open, u ∈ C^2(Ω) be harmonic, p ∈ Ω, and B_R(p) = {x ∈ R^n : |x − p| ≤ R} ⊂ Ω. Then

(5.1.19)   u(p) = \frac{1}{A(\partial B_R(p))} \int_{\partial B_R(p)} u(x)\, dS(x).

For the proof, set

(5.1.20)   ψ(r) = \frac{1}{A(S^{n-1})} \int_{S^{n-1}} u(p + rω)\, dS(ω),

for 0 < r ≤ R. We have ψ(R) equal to the right side of (5.1.19), while clearly ψ(r) → u(p) as r → 0. Now

(5.1.21)   ψ'(r) = \frac{1}{A(S^{n-1})} \int_{S^{n-1}} ω \cdot ∇u(p + rω)\, dS(ω) = \frac{1}{A(\partial B_r(p))} \int_{\partial B_r(p)} \frac{\partial u}{\partial ν}\, dS(x).

At this point, we establish:

Lemma 5.1.6. If O ⊂ R^n is a bounded domain with smooth boundary and u ∈ C^2(\overline{O}) is harmonic in O, then

(5.1.22)   \int_{\partial O} \frac{\partial u}{\partial ν}(x)\, dS(x) = 0.

Proof. Apply the Green formula (4.4.29), with M = \overline{O} and v = 1. If Δu = 0, every integrand in (4.4.29) vanishes, except the one appearing in (5.1.22), so this integrates to zero.  □

It follows from this lemma that (5.1.21) vanishes, so ψ(r) is constant. This completes the proof of (5.1.19).

We can integrate the identity (5.1.19), to obtain

(5.1.23)   u(p) = \frac{1}{V(B_R(p))} \int_{B_R(p)} u(x)\, dV(x),

where u ∈ C^2(\overline{B_R(p)}) is harmonic. This is another expression of the mean value property.
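For n = 2, the mean value property (5.1.19) says the average of a harmonic function over a circle equals its value at the center. A quick numerical sanity check (an illustration, not part of the text), using the harmonic polynomial u = x^2 − y^2 + 3x:

```python
import math

def u(x, y):
    # harmonic on R^2: u_xx + u_yy = 2 - 2 = 0
    return x*x - y*y + 3*x

def circle_average(p, R, n=10000):
    """Average of u over the circle of radius R centered at p."""
    s = 0.0
    for k in range(n):
        t = 2 * math.pi * k / n
        s += u(p[0] + R*math.cos(t), p[1] + R*math.sin(t))
    return s / n

p = (0.5, -1.2)
assert abs(circle_average(p, 2.0) - u(*p)) < 1e-8
```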

The mean value property of harmonic functions has a number of important consequences, a couple of which we mention here. The first is known as the maximum principle.

Proposition 5.1.7. Let Ω ⊂ R^n be bounded, open, and connected. Assume u : Ω → R is harmonic. If x_0 ∈ Ω and

(5.1.24)   u(x_0) ≥ u(x), ∀ x ∈ Ω,

then u is constant. Consequently, if u ∈ C(\overline{Ω}) ∩ C^2(Ω) is harmonic, then

(5.1.25)   \sup_{x ∈ \overline{Ω}} |u(x)| = \sup_{y ∈ \partial\Omega} |u(y)|.

Proof. Assume (5.1.24) holds, and set a = u(x_0). We see from (5.1.23) that if B_R(x_0) ⊂ Ω, then u(x) = a for all x ∈ B_R(x_0): the average of u over B_R(x_0) is a, while u ≤ a, so u ≡ a there. It follows that {x ∈ Ω : u(x) = a} is open in Ω. It is also clearly closed in Ω, so, since Ω is connected, this set must be all of Ω, and we have the first part of Proposition 5.1.7. As indicated, the second part is a consequence. We leave this argument to the reader.  □

The next result is known as Liouville’s Theorem. Proposition 5.1.8. If u ∈ C2(Rn) is harmonic on all of Rn and bounded, then u is constant.

Proof. Pick any two points p, q ∈ R^n. We have, for any r > 0,

(5.1.26)   u(p) - u(q) = \frac{1}{V(B_r(0))} \Bigl( \int_{B_r(p)} u(x)\, dx - \int_{B_r(q)} u(x)\, dx \Bigr).

Note that V(B_r(0)) = C_n r^n, where C_n is evaluated in problem 2 of §5. Thus

(5.1.27)   |u(p) - u(q)| ≤ \frac{C_n^{-1}}{r^n} \int_{Δ(p,q,r)} |u(x)|\, dx,

where

(5.1.28)   Δ(p,q,r) = B_r(p) \,△\, B_r(q) = \bigl(B_r(p) \setminus B_r(q)\bigr) ∪ \bigl(B_r(q) \setminus B_r(p)\bigr).

See Figure 5.1.3. Note that, if a = |p − q|, then Δ(p,q,r) ⊂ B_{r+a}(p) \setminus B_{r-a}(p); hence

(5.1.29)   V(Δ(p,q,r)) ≤ C(p,q)\, r^{n-1}, \quad r ≥ 1.

It follows that, if |u(x)| ≤ M for all x ∈ R^n, then

(5.1.30)   |u(p) - u(q)| ≤ M C_n^{-1} C(p,q)\, r^{-1}, ∀ r ≥ 1.

Taking r → ∞, we obtain u(p) − u(q) = 0, so u is constant.  □

Figure 5.1.3. Two large balls, with centers at p and q

We will now use Liouville’s Theorem to prove the Fundamental Theorem of Algebra:

Theorem 5.1.9. If p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 is a polynomial of degree n ≥ 1 (a_n ≠ 0), then p(z) must vanish somewhere in C.

Proof. Consider

(5.1.31)   f(z) = \frac{1}{p(z)}.

If p(z) does not vanish anywhere in C, then f(z) is holomorphic on all of C. (See Exercise 9 below.) On the other hand,

(5.1.32)   f(z) = \frac{1}{z^n} \cdot \frac{1}{a_n + a_{n-1} z^{-1} + \cdots + a_0 z^{-n}},

so

(5.1.33)   |f(z)| → 0, as |z| → ∞.

Thus f is bounded on C, if p(z) has no roots. By Proposition 5.1.8, f(z) must be constant, which is impossible, so p(z) must have a complex root.  □

From the fact that every holomorphic function f on O ⊂ R^2 is harmonic, it follows that its real and imaginary parts are harmonic. This result has a converse. Let u ∈ C^2(O) be harmonic. Consider the 1-form

(5.1.34)   α = -\frac{\partial u}{\partial y}\, dx + \frac{\partial u}{\partial x}\, dy.

We have dα = (Δu)\, dx ∧ dy, so α is closed if and only if u is harmonic. Now, if O is diffeomorphic to a disk, it follows from Proposition 4.3.3 that α is exact on O, whenever it is closed, so, in such a case,

(5.1.35)   Δu = 0 on O ⟹ ∃ v ∈ C^1(O) s.t. α = dv.

In other words,

(5.1.36)   \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.

This is precisely the Cauchy-Riemann equation (5.1.2) for f = u + iv, so we have:

Proposition 5.1.10. If O ⊂ R^2 is diffeomorphic to a disk and u ∈ C^2(O) is harmonic, then u is the real part of a holomorphic function on O.

The function v (which is unique up to an additive constant) is called the harmonic conjugate of u.

We close this section with a brief mention of holomorphic functions on a domain O ⊂ C^n. We say f ∈ C^1(O) is holomorphic provided it satisfies

(5.1.37)   \frac{\partial f}{\partial x_j} = \frac{1}{i}\, \frac{\partial f}{\partial y_j}, \quad 1 ≤ j ≤ n.

Suppose z ∈ O, z = (z_1, …, z_n), and suppose ζ ∈ O whenever |z − ζ| < r. Then, by successively applying Cauchy's integral formula (5.1.9) to each complex variable z_j, we have

(5.1.38)   f(z) = (2\pi i)^{-n} \int_{γ_n} \cdots \int_{γ_1} f(ζ)\,(ζ_1 - z_1)^{-1} \cdots (ζ_n - z_n)^{-1}\, dζ_1 \cdots dζ_n,

where γ_j is any simple counterclockwise curve about z_j in C with the property that |ζ_j − z_j| < r/\sqrt{n} for all ζ_j ∈ γ_j. Consequently, if p ∈ C^n and O contains the "polydisc"

D = \{z ∈ C^n : |z_j - p_j| ≤ δ, ∀ j\},

then, for z in the interior of D, we have

(5.1.39)   f(z) = (2\pi i)^{-n} \int_{C_n} \cdots \int_{C_1} f(ζ)\, \bigl[(ζ_1 - p_1) - (z_1 - p_1)\bigr]^{-1} \cdots \bigl[(ζ_n - p_n) - (z_n - p_n)\bigr]^{-1}\, dζ_1 \cdots dζ_n,

where C_j = \{ζ ∈ C : |ζ - p_j| = δ\}. Then, parallel to (5.1.12)–(5.1.15), we have

(5.1.40)   f(z) = \sum_{α ≥ 0} c_α (z - p)^α,

for z in the interior of D, where α = (α_1, …, α_n) is a multi-index,

(z - p)^α = (z_1 - p_1)^{α_1} \cdots (z_n - p_n)^{α_n},

as in (2.1.13), and

(5.1.41)   c_α = (2\pi i)^{-n} \int_{C_n} \cdots \int_{C_1} f(ζ)\,(ζ_1 - p_1)^{-α_1-1} \cdots (ζ_n - p_n)^{-α_n-1}\, dζ_1 \cdots dζ_n.

Thus holomorphic functions on open domains in C^n have convergent power series expansions.

We refer to [2], [22], and [46] for more material on holomorphic functions of one complex variable, and to [27] for material on holomorphic functions of several complex variables. A source of much information on harmonic functions is [26]. Also further material on these subjects can be found in [41].

Exercises

1. Let fk :Ω → C be holomorphic on an open set Ω ⊂ C. Assume fk → f and ∇fk → ∇f locally uniformly in Ω. Show that f :Ω → C is holomorphic. See Exercise 26 for a stronger result.

2. Assume

(5.1.42)   f(z) = \sum_{k=0}^{\infty} a_k z^k

is absolutely convergent for |z| < R. Deduce from Proposition 2.1.10 and Exercise 1 above that f is holomorphic on |z| < R, and that

(5.1.43)   f'(z) = \sum_{k=1}^{\infty} k a_k z^{k-1}, for |z| < R.

3. As in (2.3.104), the exponential function e^z is defined by

(5.1.44)   e^z = \sum_{k=0}^{\infty} \frac{1}{k!} z^k.

Deduce from Exercise 2 that e^z is holomorphic in z.

4. By (2.3.105), we have e^{z+h} = e^z e^h, ∀ z, h ∈ C. Use this to show directly from (5.1.1) that e^z is complex differentiable and (d/dz)e^z = e^z on C, giving another proof that e^z is holomorphic on C.
Hint. Use the power series for e^h to show that

\lim_{h → 0} \frac{e^h - 1}{h} = 1.

5. For another approach to the fact that e^z is holomorphic, use e^z = e^x e^{iy} and (2.3.104) to verify that e^z satisfies the Cauchy-Riemann equation.

6. For z ∈ C, set

(5.1.45)   \cos z = \frac{1}{2}\bigl(e^{iz} + e^{-iz}\bigr), \qquad \sin z = \frac{1}{2i}\bigl(e^{iz} - e^{-iz}\bigr).

Show that these functions agree with the definitions of cos t and sin t given in (2.3.106)–(2.3.108), for z = t ∈ R. Show that cos z and sin z are holomorphic in z ∈ C. Show that

(5.1.46)   \frac{d}{dz} \cos z = -\sin z, \qquad \frac{d}{dz} \sin z = \cos z,

and

(5.1.47)   \cos^2 z + \sin^2 z = 1, for all z ∈ C.

7. Let O, Ω be open in C. If f is holomorphic on O, with range in Ω, and g is holomorphic on Ω, show that h = g ◦ f is holomorphic on O, and h′(z) = g′(f(z))f ′(z). Hint. See the proof of the chain rule in §2.1.

8. Let Ω ⊂ C be a connected open set and let f be holomorphic on Ω.
(a) Show that if f(z_j) = 0 for distinct z_j ∈ Ω and z_j → z_0 ∈ Ω, then f(z) = 0 for z in a neighborhood of z_0.
Hint. Use the power series expansion (5.1.15).
(b) Show that if f = 0 on a nonempty open set O ⊂ Ω, then f ≡ 0 on Ω.
Hint. Let U ⊂ Ω denote the interior of the set of points where f vanishes. Use part (a) to show that U is also closed in Ω.

9. Let Ω = C \ (−∞, 0] and define log : Ω → C by

(5.1.48)   \log z = \int_{γ_z} \frac{1}{ζ}\, dζ,

where γ_z is a path from 1 to z in Ω. Use Theorem 5.1.2 to show that this is independent of the choice of such path. Show that it yields a holomorphic function on C \ (−∞, 0], satisfying

\frac{d}{dz} \log z = \frac{1}{z}, \quad z ∈ C \ (−∞, 0].

10. Taking log z as in Exercise 9, show that (5.1.49) elog z = z, ∀ z ∈ C \ (−∞, 0]. Hint. If φ(z) denotes the left side, show that φ(1) = 1 and φ′(z) = φ(z)/z. Use uniqueness results from §2.3 to deduce that φ(x) = x for x ∈ (0, ∞), and from there deduce that φ(z) ≡ z, using Exercise 8. Alternative. Apply d/dz to show that log ez = z, for z in some neighborhood of 0. Deduce from this (and Exercise 3 of §2.2) that (5.1.49) holds for z in some neighborhood of 1. Then get it for all z ∈ C \ (−∞, 0] using Exercise 8.

11. With Ω = C \ (−∞, 0] as in Exercise 9, and a ∈ C, define z^a for z ∈ Ω by

(5.1.50)   z^a = e^{a \log z}.

Show that this is holomorphic on Ω and

(5.1.51)   \frac{d}{dz} z^a = a z^{a-1}, \qquad z^a z^b = z^{a+b}, ∀ z ∈ Ω.

12. Let O = C \ {[1, ∞) ∪ (−∞, −1]}, and define As : O → C by

As(z) = \int_{σ_z} (1 - ζ^2)^{-1/2}\, dζ,

where σ_z is a path from 0 to z in O. Show that this is independent of the choice of such a path, and that it yields a holomorphic function on O.

13. With As as in Exercise 12, show that

As(\sin z) = z, for z in some neighborhood of 0.

(Hint. Apply d/dz.) From here, show that

\sin(As(z)) = z, ∀ z ∈ O.

Thus we write

(5.1.52)   \arcsin z = \int_0^z (1 - ζ^2)^{-1/2}\, dζ.

Compare (2.3.113).

14. Look again at Exercise 4 in §2.1.

15. Look again at Exercises 3–5 in §2.2. Write the result as an inverse function theorem for holomorphic maps.

16. Differentiate (5.1.9) to show that, in the setting of Theorem 5.1.3, for k ∈ N, we have the derivative Cauchy integral formula,

(5.1.53)   f^{(k)}(z_0) = \frac{k!}{2\pi i} \int_{\partial\Omega} \frac{f(z)}{(z - z_0)^{k+1}}\, dz.

Show that this also follows from (5.1.15).

17. Assume f is holomorphic on C, and set

M(z_0, R) = \sup_{|z - z_0| ≤ R} |f(z)|.

Use the k = 1 case of Exercise 16 to show that

|f'(z_0)| ≤ \frac{M(z_0, R)}{R}, ∀ R ∈ (0, ∞).

18. In the setting of Exercise 17, assume f is bounded, say |f(z)| ≤ M for all z ∈ C. Deduce that f'(z_0) = 0 for all z_0 ∈ C, and in that way obtain another proof of Liouville's theorem, in the setting of holomorphic functions on C. (Note that Proposition 5.1.8 is more general.)

The next four exercises deal with the function

(5.1.54)   G(z) = \int_{-\infty}^{\infty} e^{-t^2 + tz}\, dt, \quad z ∈ C.

19. Show that G is continuous on C.

20. Show that G is holomorphic on C, with

G'(z) = \int_{-\infty}^{\infty} t\, e^{-t^2 + tz}\, dt.

Hint. Write

\frac{1}{h}\bigl[G(z+h) - G(z)\bigr] = \int_{-\infty}^{\infty} e^{-t^2 + tz}\, \frac{1}{h}(e^{th} - 1)\, dt,

and

\frac{1}{h}(e^{th} - 1) = t + \frac{1}{h} R(th),

where e^w = 1 + w + R(w), |R(w)| ≤ C|w|^2 e^{|w|}, so

\Bigl| \frac{1}{h} R(th) \Bigr| ≤ C t^2 |h|\, e^{|th|}.

21. Show that, for x ∈ R,

G(x) = \sqrt{\pi}\, e^{x^2/4}.

Hint. Write

G(x) = e^{x^2/4} \int_{-\infty}^{\infty} e^{-(t - x/2)^2}\, dt,

and make a change of variable in the integral.

22. Deduce from Exercises 21 and 8 that

(5.1.55)   G(z) = \sqrt{\pi}\, e^{z^2/4}, ∀ z ∈ C.
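Identity (5.1.55) can be checked by numerical quadrature. The sketch below (illustrative only, not from the text) truncates the integral (5.1.54) to [−L, L], which is safe because of the Gaussian decay, and applies the trapezoid rule:

```python
import cmath, math

def G(z, L=10.0, n=4000):
    """Trapezoid approximation of ∫_{-L}^{L} e^{-t^2 + t z} dt."""
    h = 2 * L / n
    total = 0.5 * (cmath.exp(-L*L - L*z) + cmath.exp(-L*L + L*z))
    for k in range(1, n):
        t = -L + k * h
        total += cmath.exp(-t*t + t*z)
    return total * h

z = 1.0 + 1.0j
exact = math.sqrt(math.pi) * cmath.exp(z*z/4)
assert abs(G(z) - exact) < 1e-8
```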

The next exercises deal with the Gamma function,

(5.1.56)   Γ(z) = \int_0^{\infty} e^{-s} s^{z-1}\, ds,

defined for z > 0 in (3.2.33).

23. Show that the integral is absolutely convergent for Re z > 0 and defines Γ(z) as a holomorphic function on {z ∈ C : Re z > 0}.

24. Extend the identity (3.2.36), i.e.,

(5.1.57)   Γ(z + 1) = z\, Γ(z),

to Re z > 0.
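The functional equation (5.1.57) can be tested numerically for complex z. The sketch below (an illustration, not from the text) uses the substitution s = e^u, under which (5.1.56) becomes ∫ exp(−e^u + zu) du over R; the double-sided exponential decay of this integrand makes the trapezoid rule very accurate:

```python
import cmath

def gamma(z, lo=-40.0, hi=8.0, n=9600):
    """Γ(z) = ∫_0^∞ e^{-s} s^{z-1} ds via s = e^u, i.e. the trapezoid
    rule applied to ∫ exp(-e^u + z u) du on [lo, hi]. Valid for Re z > 0."""
    h = (hi - lo) / n
    total = 0.5 * (cmath.exp(-cmath.exp(lo) + z*lo) +
                   cmath.exp(-cmath.exp(hi) + z*hi))
    for k in range(1, n):
        u = lo + k * h
        total += cmath.exp(-cmath.exp(u) + z*u)
    return total * h

z = 1.5 + 0.5j
assert abs(gamma(z + 1) - z * gamma(z)) < 1e-10
```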

25. Use (5.1.57) to extend Γ to be holomorphic on C \{0, −1, −2, −3,... }.

26. Use the result of Exercise 16 to show that if f_ν are holomorphic on an open set Ω ⊂ C and f_ν → f uniformly on compact subsets of Ω, then f is holomorphic on Ω and f'_ν → f' uniformly on compact subsets.

27. The Riemann zeta function ζ(z) is defined for Re z > 1 by

(5.1.58)   ζ(z) = \sum_{k=1}^{\infty} \frac{1}{k^z}.

Show that ζ(z) is holomorphic on {z ∈ C : Re z > 1}.

The following exercises deal with harmonic functions on domains in Rn.

28. Using the formula (4.4.26) for the Laplace operator together with the formula (3.2.26) for the metric tensor on R^n in spherical polar coordinates x = rω, x ∈ R^n, r = |x|, ω ∈ S^{n−1}, show that if u ∈ C^2(Ω), Ω ⊂ R^n, then

(5.1.59)   Δu(rω) = \frac{\partial^2}{\partial r^2} u(rω) + \frac{n-1}{r}\, \frac{\partial}{\partial r} u(rω) + \frac{1}{r^2}\, Δ_S u(rω),

where Δ_S is the Laplace operator on S^{n−1}.

29. If f(x) = φ(|x|) on R^n, show that

(5.1.60)   Δf(x) = φ''(|x|) + \frac{n-1}{|x|}\, φ'(|x|).

In particular, show that

(5.1.61)   |x|^{-(n-2)} is harmonic on R^n \ 0, if n ≥ 3,

and

(5.1.62)   \log |x| is harmonic on R^2 \ 0.

If O, Ω are open in R^n, a smooth map φ : O → Ω is said to be conformal provided the matrix function G(x) = Dφ(x)^t Dφ(x) is a multiple of the identity, G(x) = γ(x) I. Recall formula (3.2.2).

30. Suppose n = 2 and φ preserves orientation. Show that φ (pictured as a function φ : O → C) is conformal if and only if it is holomorphic. If φ reverses orientation, φ is conformal ⇔ \bar{φ} is holomorphic (we say φ is anti-holomorphic).

31. If O and Ω are open in R2 and u is harmonic on Ω, show that u ◦ φ is harmonic on O, whenever φ : O → Ω is a smooth conformal map. Hint. Use Exercise 7 and Proposition 5.1.10.

The following exercises will present an alternative approach to the proof of Proposition 5.1.5 (the mean value property of harmonic functions). For this, let B_R = {x ∈ R^n : |x| ≤ R}. Assume u is continuous on B_R, and C^2 and harmonic on the interior of B_R. We assume n ≥ 2.

32. Given g ∈ SO(n), show that u_g(x) = u(gx) is harmonic on the interior of B_R.
Hint. See Exercise 7 of §4.4.

33. As in Exercise 27 of §3.2, define Au ∈ C(B_R) by

Au(x) = \int_{SO(n)} u(gx)\, dg.

Thus Au(x) is a radial function:

Au(x) = Su(|x|), \qquad Su(r) = \frac{1}{A_{n-1}} \int_{S^{n-1}} u(rω)\, dS(ω).

Deduce from Exercise 32 above that Au is harmonic on the interior of B_R.

34. Use Exercise 29 to show that φ(r) = Su(r) satisfies

φ''(r) + \frac{n-1}{r}\, φ'(r) = 0,

for r ∈ (0, R). Deduce from this differential equation that there exist constants C_0 and C_1 such that

φ(r) = C_0 + C_1 r^{-(n-2)}, if n ≥ 3,
φ(r) = C_0 + C_1 \log r, if n = 2.

Then show that, since Au(x) does not blow up at x = 0, C_1 = 0. Hence

Au(x) = C_0, ∀ x ∈ B_R.

35. Note that Au(0) = u(0). Deduce that for each r ∈ (0, R],

(5.1.63)   u(0) = Su(r) = \frac{1}{A_{n-1}} \int_{S^{n-1}} u(rω)\, dS(ω).

5.2. Differential forms, homotopy, and the Lie derivative

Let X and Y be smooth surfaces. Two smooth maps f_0, f_1 : X → Y are said to be smoothly homotopic provided there is a smooth F : [0, 1] × X → Y such that F(0, x) = f_0(x) and F(1, x) = f_1(x). The following result illustrates the significance of maps being homotopic.

Proposition 5.2.1. Assume X is a compact, oriented, k-dimensional surface and α ∈ Λ^k(Y) is closed, i.e., dα = 0. If f_0, f_1 : X → Y are smoothly homotopic, then

(5.2.1)   \int_X f_0^* α = \int_X f_1^* α.

In fact, with [0, 1] × X = Ω, this is a special case of the following.

Proposition 5.2.2. Assume Ω is a smoothly bounded, compact, oriented (k+1)-dimensional surface, and α ∈ Λ^k(Y) is closed. If F : Ω → Y is a smooth map, then

(5.2.2)   \int_{\partial\Omega} F^* α = 0.

Proof. Stokes' theorem gives

(5.2.3)   \int_{\partial\Omega} F^* α = \int_Ω dF^* α = 0,

since dF^* α = F^* dα and, by hypothesis, dα = 0.  □

Proposition 5.2.2 is one generalization of Proposition 5.2.1. Here is another.

Proposition 5.2.3. Assume X is a k-dimensional surface and α ∈ Λ^ℓ(Y) is closed. If f_0, f_1 : X → Y are smoothly homotopic, then f_0^* α − f_1^* α is exact, i.e.,

(5.2.4)   f_0^* α - f_1^* α = dβ,

for some β ∈ Λ^{ℓ−1}(X).

Proof. Take a smooth F : R × X → Y such that F(j, x) = f_j(x). Consider

(5.2.5)   α̃ = F^* α ∈ Λ^ℓ(R × X).

Note that dα̃ = F^* dα = 0. Now consider

(5.2.6)   Φ_s : R × X → R × X, \quad Φ_s(t, x) = (s + t, x).

We claim that

(5.2.7)   α̃ - Φ_1^* α̃ = dβ̃,

for some β̃ ∈ Λ^{ℓ−1}(R × X). Now take

(5.2.8)   β = j^* β̃, \quad j : X → R × X, \quad j(x) = (0, x).

We have F ∘ j = f_0, F ∘ Φ_1 ∘ j = f_1, so it follows that

(5.2.9)   f_0^* α - f_1^* α = j^* α̃ - j^* Φ_1^* α̃ = j^* dβ̃,

given (5.2.7), which yields (5.2.4) with β as in (5.2.8).  □

It remains to prove (5.2.7), under the hypothesis that dα̃ = 0. The following result gives this. The formula (5.2.10) uses the interior product, defined by (4.2.4)–(4.2.5).

Lemma 5.2.4. If α̃ ∈ Λ^ℓ(R × X) and Φ_s is as in (5.2.6), then

(5.2.10)   \frac{d}{ds} Φ_s^* α̃ = Φ_s^* \bigl( d(α̃⌋∂_t) + (dα̃)⌋∂_t \bigr).

Hence, if dα̃ = 0, (5.2.7) holds with

(5.2.11)   β̃ = -\int_0^1 (Φ_s^* α̃)⌋∂_t\, ds.

Proof. Since Φ_{s+σ}^* = Φ_s^* Φ_σ^* = Φ_σ^* Φ_s^*, it suffices to show that (5.2.10) holds at s = 0. It also suffices to work in local coordinates on X. Say

(5.2.12)   α̃ = \sum_i α_i^{\#}(t, x)\, dx_{i_1} ∧ \cdots ∧ dx_{i_ℓ} + \sum_j α_j^{b}(t, x)\, dt ∧ dx_{j_1} ∧ \cdots ∧ dx_{j_{ℓ-1}}.

We have Φ_s^* α̃ given by a similar formula, with coefficients replaced by α_i^{\#}(t+s, x) and α_j^{b}(t+s, x), hence

(5.2.13)   \frac{d}{ds} Φ_s^* α̃ \Big|_{s=0} = \sum_i ∂_t α_i^{\#}(t, x)\, dx_{i_1} ∧ \cdots ∧ dx_{i_ℓ} + \sum_j ∂_t α_j^{b}(t, x)\, dt ∧ dx_{j_1} ∧ \cdots ∧ dx_{j_{ℓ-1}}.

Meanwhile

(5.2.14)   α̃⌋∂_t = \sum_j α_j^{b}(t, x)\, dx_{j_1} ∧ \cdots ∧ dx_{j_{ℓ-1}},

so

(5.2.15)   d(α̃⌋∂_t) = \sum_j ∂_t α_j^{b}(t, x)\, dt ∧ dx_{j_1} ∧ \cdots ∧ dx_{j_{ℓ-1}} + \sum_{j,ν} ∂_{x_ν} α_j^{b}(t, x)\, dx_ν ∧ dx_{j_1} ∧ \cdots ∧ dx_{j_{ℓ-1}}.

A similar calculation yields

(5.2.16)   (dα̃)⌋∂_t = \sum_i ∂_t α_i^{\#}(t, x)\, dx_{i_1} ∧ \cdots ∧ dx_{i_ℓ} - \sum_{j,ν} ∂_{x_ν} α_j^{b}(t, x)\, dx_ν ∧ dx_{j_1} ∧ \cdots ∧ dx_{j_{ℓ-1}}.

Comparison of (5.2.15)–(5.2.16) with (5.2.13) yields (5.2.10) at s = 0, proving Lemma 5.2.4.  □

The following consequence of Proposition 5.2.3 is the Poincaré lemma.

Proposition 5.2.5. Let X be a smooth k-dimensional surface. Assume the identity map I : X → X is smoothly homotopic to a constant map K : X → X, satisfying K(x) ≡ p. Then, for all ℓ ∈ {1, …, k},

(5.2.17)   α ∈ Λ^ℓ(X), dα = 0 ⟹ α is exact.

Proof. By Proposition 5.2.3, α − K^*α is exact. However, K^*α = 0.  □

Proposition 5.2.5 applies to any open X ⊂ R^k that is star-shaped about the origin, via the homotopy

(5.2.18)   D_s : X → X for s ∈ [0, 1], \quad D_s(x) = sx.

Thus, for any open star-shaped X ⊂ R^k, each closed α ∈ Λ^ℓ(X) is exact.

We next present an important generalization of Lemma 5.2.4. Let Ω be a smooth n-dimensional surface. If α ∈ Λ^k(Ω) and X is a vector field on Ω, generating a flow F_X^t, the Lie derivative L_X α is defined to be

(5.2.19)   L_X α = \frac{d}{dt} (F_X^t)^* α \Big|_{t=0}.

Note the similarity to the definition (2.3.82) of L_X Y for a vector field Y, for which there was the alternative formula (2.3.85). The following useful result is known as Cartan's formula for the Lie derivative.

Proposition 5.2.6. We have

(5.2.20)   L_X α = d(α⌋X) + (dα)⌋X.

Proof. We can assume Ω is an open subset of R^n. First we compare both sides in the special case X = ∂/∂x_ℓ = ∂_ℓ. Note that

(5.2.21)   (F_{∂_ℓ}^t)^* α = \sum_j a_j(x + t e_ℓ)\, dx_{j_1} ∧ \cdots ∧ dx_{j_k},

so

(5.2.22)   L_{∂_ℓ} α = \sum_j ∂_{x_ℓ} a_j(x)\, dx_{j_1} ∧ \cdots ∧ dx_{j_k} = ∂_ℓ α.

To evaluate the right side of (5.2.20), with X = ∂_ℓ, we could parallel the calculation (5.2.14)–(5.2.16). Alternatively, we can use (4.2.12) to write this as

(5.2.23)   d(ι_ℓ α) + ι_ℓ\, dα = \sum_{j=1}^{n} (∂_j ∧_j ι_ℓ + ι_ℓ\, ∂_j ∧_j)\, α.

Using the commutativity of ∂_j with ∧_j and with ι_ℓ, and the anticommutativity relations (4.2.8), we see that the right side of (5.2.23) is ∂_ℓ α, which coincides with (5.2.22). Thus the proposition holds for X = ∂/∂x_ℓ.

Now we prove the proposition in general, for a smooth vector field X on Ω. It is to be verified at each point x_0 ∈ Ω. If X(x_0) ≠ 0, we can apply Theorem 2.3.7 to choose a coordinate system about x_0 so X = ∂/∂x_1, and use the calculation above. This shows that the desired identity holds on the set {x_0 ∈ Ω : X(x_0) ≠ 0}, and by continuity it holds on the closure of this set. However, if x_0 has a neighborhood on which X vanishes, it is clear that L_X α = 0 near x_0 and also α⌋X and dα⌋X vanish near x_0. This completes the proof.  □

From (5.2.19) and the identity F_X^{s+t} = F_X^s F_X^t, it follows that

(5.2.24)   \frac{d}{dt} (F_X^t)^* α = L_X (F_X^t)^* α = (F_X^t)^* L_X α.

It is useful to generalize this. Let F_t be a smooth family of diffeomorphisms of M into M. Define vector fields X_t on F_t(M) by

(5.2.25)   \frac{d}{dt} F_t(x) = X_t(F_t(x)).

Then, given α ∈ Λ^k(M),

(5.2.26)   \frac{d}{dt} F_t^* α = F_t^* L_{X_t} α = F_t^* \bigl[ d(α⌋X_t) + (dα)⌋X_t \bigr].

In particular, if α is closed, then, if F_t are diffeomorphisms for 0 ≤ t ≤ 1,

(5.2.27)   F_1^* α - F_0^* α = dβ, \quad β = \int_0^1 F_t^* (α⌋X_t)\, dt.

The fact that the left side of (5.2.27) is exact is a special case of Proposition 5.2.3, but the explicit formula given in (5.2.27) can be useful.

More on the divergence of a vector field

Let M ⊂ R^n be an m-dimensional, oriented surface, with volume form ω. Then dω = 0 on M, so, if X is a vector field on M,

(5.2.28)   L_X ω = d(ω⌋X).

Comparison with (4.4.7) gives

(5.2.29)   (div X)\, ω = L_X ω.

This is sometimes taken as the definition of div X. It readily leads to a formula for how the flow F_X^t affects volumes. To get this, we start with

(5.2.30)   \frac{d}{dt} (F_X^t)^* ω = (F_X^t)^* L_X ω = (F_X^t)^* ((div X)\, ω).

Hence, if Ω ⊂ M is a smoothly bounded domain on which the flow F_X^t is defined for t ∈ I, then, for such t,

(5.2.31)   \frac{d}{dt} \mathrm{Vol}\, F_X^t(Ω) = \frac{d}{dt} \int_Ω (F_X^t)^* ω = \int_Ω (F_X^t)^* ((div X)\, ω) = \int_{F_X^t(Ω)} (div X)\, ω.

In other words,

(5.2.32)   \frac{d}{dt} \mathrm{Vol}\, F_X^t(Ω) = \int_{F_X^t(Ω)} (div X)\, dV.

This result is equivalent to Proposition 3.2.7, but the derivation here is substantially different. Compare also the discussion in Exercise 2 of §4.4.
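For a linear vector field X(x) = Ax on R^2 (my example, not from the text), the flow is F_X^t(x) = e^{tA}x and div X = tr A is constant, so (5.2.32) integrates to Liouville's volume formula det e^{tA} = e^{t\, tr A}. A numerical sketch:

```python
import math

def mat_exp(A, t, terms=30):
    """2x2 matrix exponential of tA via its power series sum (tA)^k / k!."""
    M = [[t*A[0][0], t*A[0][1]], [t*A[1][0], t*A[1][1]]]
    E = [[1.0, 0.0], [0.0, 1.0]]   # running sum, starts at the identity
    P = [[1.0, 0.0], [0.0, 1.0]]   # running term (tA)^k / k!
    for k in range(1, terms):
        P = [[sum(P[i][m]*M[m][j] for m in range(2)) / k for j in range(2)]
             for i in range(2)]
        E = [[E[i][j] + P[i][j] for j in range(2)] for i in range(2)]
    return E

A = [[0.3, -1.1], [0.7, 0.2]]      # div X = tr A = 0.5
t = 0.8
E = mat_exp(A, t)
detE = E[0][0]*E[1][1] - E[0][1]*E[1][0]
# det e^{tA} = e^{t tr A}: volumes grow at the rate given by (5.2.32)
assert abs(detE - math.exp(t * (A[0][0] + A[1][1]))) < 1e-10
```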

Exercises

1. Show that if α is a k-form and X, X_j are vector fields,

(5.2.33)   (L_X α)(X_1, …, X_k) = X \cdot α(X_1, …, X_k) - \sum_j α(X_1, …, L_X X_j, …, X_k).

Recall from (2.3.85) that L_X X_j = [X, X_j], and rewrite (5.2.33) accordingly.

2. Writing (5.2.20) as ι_X dα = L_X α − dι_X α, deduce that

(5.2.34)   (dα)(X_0, X_1, …, X_k) = (L_{X_0} α)(X_1, …, X_k) - (dι_{X_0} α)(X_1, …, X_k).

3. In case α is a one-form, deduce from (5.2.33)–(5.2.34) that

(5.2.35)   (dα)(X_0, X_1) = X_0 \cdot α(X_1) - X_1 \cdot α(X_0) - α([X_0, X_1]).
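Identity (5.2.35) lends itself to a finite-difference check in R^2. The sketch below (hypothetical helper names, not from the text) builds the scalar functions α(Y), the bracket [X, Y], and both sides of (5.2.35) from central differences:

```python
def partials(g, p, h=1e-5):
    """Central-difference partial derivatives of a scalar g at p = (x, y)."""
    x, y = p
    return ((g((x + h, y)) - g((x - h, y))) / (2*h),
            (g((x, y + h)) - g((x, y - h))) / (2*h))

def deriv(V, g, p):
    """Directional derivative V·g of the scalar g along the field V."""
    gx, gy = partials(g, p)
    return V(p)[0]*gx + V(p)[1]*gy

def bracket(X, Y, p):
    """[X, Y] = (X·∇)Y − (Y·∇)X, computed componentwise."""
    return tuple(deriv(X, lambda q, i=i: Y(q)[i], p)
                 - deriv(Y, lambda q, i=i: X(q)[i], p)
                 for i in (0, 1))

# α = a dx + b dy and two vector fields on R² (arbitrary test data)
a = lambda p: p[0]**2 * p[1]
b = lambda p: p[0] + p[1]**3
X = lambda p: (p[1], p[0]*p[0])
Y = lambda p: (1.0, p[0]*p[1])
alpha = lambda V: (lambda p: a(p)*V(p)[0] + b(p)*V(p)[1])

p = (0.6, -0.8)
# dα(X, Y) = (∂b/∂x − ∂a/∂y)(X₁Y₂ − X₂Y₁)
lhs = (partials(b, p)[0] - partials(a, p)[1]) * \
      (X(p)[0]*Y(p)[1] - X(p)[1]*Y(p)[0])
# X·α(Y) − Y·α(X) − α([X, Y])
rhs = (deriv(X, alpha(Y), p) - deriv(Y, alpha(X), p)
       - alpha(lambda q: bracket(X, Y, q))(p))
assert abs(lhs - rhs) < 1e-6
```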

4. Using (5.2.33)–(5.2.34) and induction on k, show that, if α is a k-form,

(5.2.36)   (dα)(X_0, …, X_k) = \sum_{ℓ=0}^{k} (-1)^ℓ X_ℓ \cdot α(X_0, …, \hat{X}_ℓ, …, X_k) + \sum_{0 ≤ ℓ < j ≤ k} (-1)^{j+ℓ}\, α([X_ℓ, X_j], X_0, …, \hat{X}_ℓ, …, \hat{X}_j, …, X_k).

5. Show that if X is a vector field, β a 1-form, and α a k-form, then

(5.2.37)   (∧_β ι_X + ι_X ∧_β)\, α = ⟨X, β⟩\, α.

Deduce that

(5.2.38)   (df) ∧ (α⌋X) + (df ∧ α)⌋X = (Xf)\, α.

6. Show that the definition (5.2.19) implies

(5.2.39)   L_X (fα) = f L_X α + (Xf)\, α.

7. Show that the definition (5.2.19) implies

(5.2.40)   d L_X α = L_X (dα).

8. Denote the right side of (5.2.20) by \tilde{L}_X α, i.e., set

(5.2.41)   \tilde{L}_X α = d(α⌋X) + (dα)⌋X.

Show that this definition directly implies

(5.2.42)   \tilde{L}_X (dα) = d(\tilde{L}_X α).

9. With \tilde{L}_X defined by (5.2.41), show that

(5.2.43)   \tilde{L}_X (fα) = f \tilde{L}_X α + (Xf)\, α.

Hint. Use (5.2.38).

10. Use the results of Exercises 6–9 to give another proof of Proposition 5.2.6, i.e., that \tilde{L}_X α = L_X α.
Hint. Start with \tilde{L}_X f = Xf = ⟨X, df⟩ = L_X f.

In Exercises 11–12, let X and Y be smooth vector fields on M and α ∈ Λk(M).

11. Show that L_{[X,Y]} α = L_X L_Y α − L_Y L_X α.

12. Using Exercise 11 and (5.2.29), show that div [X, Y] = X(div Y) − Y(div X).

5.3. Differential forms and degree theory

Degree theory assigns an integer, Deg(f), to a smooth map f : X → Y, when X and Y are smooth, compact, oriented surfaces of the same dimension, and Y is connected. This has many uses, as we will see. Results of §5.2 provide tools for this study. A major ingredient is Stokes' theorem.

As a prelude to our development of degree theory, we use the calculus of differential forms to provide simple proofs of some important topological results of Brouwer. The first two results concern retractions. If Y is a subset of X, by definition a retraction of X onto Y is a map φ : X → Y such that φ(x) = x for all x ∈ Y.

Proposition 5.3.1. There is no smooth retraction φ : B → S^{n−1} of the closed unit ball B in R^n onto its boundary S^{n−1}.

This is Brouwer's no-retraction theorem. In fact, it is just as easy to prove the following more general result. The approach we use is adapted from [25].

Proposition 5.3.2. If M is a compact oriented n-dimensional surface with nonempty boundary ∂M, there is no smooth retraction φ : M → ∂M.

Proof. Pick ω ∈ Λ^{n−1}(∂M) to be the volume form on ∂M, so \int_{∂M} ω > 0. Now apply Stokes' theorem to β = φ^* ω. If φ is a retraction, then φ ∘ j(x) = x, where j : ∂M ↪ M is the natural inclusion. Hence j^* φ^* ω = ω, so we have

(5.3.1)   \int_{∂M} ω = \int_M dφ^* ω.

But dφ^* ω = φ^* dω = 0, so the integral (5.3.1) is zero. This is a contradiction, so there can be no retraction.  □

A simple consequence of this is the famous Brouwer Fixed-Point Theorem. We first present the smooth case.

Theorem 5.3.3. If F : B → B is a smooth map on the closed unit ball in R^n, then F has a fixed point.

Proof. We are claiming that F(x) = x for some x ∈ B. If not, define φ(x) to be the endpoint of the ray from F(x) to x, continued until it hits ∂B = S^{n−1}. See Figure 5.3.1. An explicit formula is

φ(x) = x + t(x - F(x)), \qquad t = \frac{\sqrt{b^2 + 4ac} - b}{2a},

a = ‖x - F(x)‖^2, \quad b = 2x \cdot (x - F(x)), \quad c = 1 - ‖x‖^2.

Here t is picked to solve the equation ‖x + t(x − F(x))‖^2 = 1. Note that ac ≥ 0, so t ≥ 0. It is clear that φ would be a smooth retraction, contradicting Proposition 5.3.1.  □

Now we give the general case, using the Stone-Weierstrass theorem (established in Appendix A.5) to reduce it to Theorem 5.3.3.

Theorem 5.3.4. If G : B → B is a continuous map on the closed unit ball in R^n, then G has a fixed point.

Proof. If not, then

\inf_{x ∈ B} |G(x) - x| = δ > 0.

The Stone-Weierstrass theorem implies there exists a polynomial P such that |P(x) − G(x)| < δ/8 for all x ∈ B. Set

F(x) = \Bigl(1 - \frac{δ}{8}\Bigr) P(x).

Then F : B → B and |F(x) − G(x)| < δ/2 for all x ∈ B, so

\inf_{x ∈ B} |F(x) - x| > \frac{δ}{2}.

Figure 5.3.1. Purported retraction of B onto ∂B

This contradicts Theorem 5.3.3. 
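The retraction used in the proof of Theorem 5.3.3 can be written out and tested numerically. The sketch below (illustrative code, with a hypothetical helper name) implements φ(x) = x + t(x − F(x)) in the plane and checks that it lands on the unit circle and fixes boundary points:

```python
import math

def retract(x, Fx):
    """Endpoint of the ray from F(x) through x, extended to the unit
    sphere: phi(x) = x + t (x - F(x)), with t >= 0 so that |phi(x)| = 1."""
    d = [xi - fi for xi, fi in zip(x, Fx)]
    a = sum(di*di for di in d)                 # ||x - F(x)||^2
    b = 2 * sum(xi*di for xi, di in zip(x, d)) # 2 x . (x - F(x))
    c = 1 - sum(xi*xi for xi in x)             # 1 - ||x||^2
    t = (math.sqrt(b*b + 4*a*c) - b) / (2*a)
    return [xi + t*di for xi, di in zip(x, d)]

# an interior point is pushed out to the unit circle ...
p = retract([0.1, 0.3], [0.5, 0.0])
assert abs(p[0]**2 + p[1]**2 - 1) < 1e-12
# ... while a boundary point is left fixed (t = 0 there)
q = retract([0.0, 1.0], [0.5, 0.0])
assert abs(q[0]) < 1e-12 and abs(q[1] - 1.0) < 1e-12
```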

As a second precursor to degree theory, we next show that an even dimensional sphere cannot have a smooth nonvanishing vector field.

Proposition 5.3.5. There is no smooth nonvanishing vector field on Sn if n = 2k is even.

Proof. If X were such a vector field, we could arrange it to have unit length, so we would have X : S^n → S^n with X(v) ⊥ v for v ∈ S^n ⊂ R^{n+1}. Thus there would be a unique unit speed curve γ_v along the great circle from v to X(v), of length π/2. Define a smooth family of maps F_t : S^n → S^n by F_t(v) = γ_v(t). Thus F_0(v) = v, F_{π/2}(v) = X(v), and F_π = A would be the antipodal map, A(v) = −v. By Proposition 5.2.3, we deduce that A^* ω − ω = dβ is exact, where ω is the volume form on S^n. Hence, by Stokes' theorem,

(5.3.2)   \int_{S^n} A^* ω = \int_{S^n} ω.

Alternatively, (5.3.2) follows directly from Proposition 5.2.1. On the other hand, it is straightforward that A^* ω = (−1)^{n+1} ω, so (5.3.2) is possible only when n is odd.  □

Note that an important ingredient in the proof of both Proposition 5.3.2 and Proposition 5.3.5 is the existence of n-forms on a compact oriented n-dimensional surface M that are not exact (though of course they are closed). We next establish the following exactness criterion, a counterpoint to the Poincaré lemma.

Proposition 5.3.6. If M is a compact, connected, oriented surface of dimension n and α ∈ Λ^n M, then α = dβ for some β ∈ Λ^{n−1}(M) if and only if

(5.3.3)   \int_M α = 0.

We have already discussed the necessity of (5.3.3). To prove the sufficiency, we first look at the case M = S^n. In that case, any n-form α is of the form a(x)ω, a ∈ C^∞(S^n), ω the volume form on S^n, with its standard metric. The group G = SO(n+1) of rotations of R^{n+1} acts as a transitive group of isometries on S^n. In §3.2 we constructed the integral of functions over SO(n+1), with respect to Haar measure. As seen in §3.2, we have the smooth map

Exp : Skew(n+1) → SO(n+1),

giving a diffeomorphism from a ball O about 0 in Skew(n+1) onto an open set U ⊂ SO(n+1) = G, a neighborhood of the identity. Since G is compact, we can pick a finite number of elements ξ_j ∈ G such that the open sets U_j = {ξ_j g : g ∈ U} cover G. Using Corollary A.3.9, we can pick η_j ∈ Skew(n+1) such that Exp η_j = ξ_j. Define Φ_{jt} : U_j → G for 0 ≤ t ≤ 1 by

(5.3.4)   Φ_{jt}\bigl(ξ_j \mathrm{Exp}(A)\bigr) = (\mathrm{Exp}\, tη_j)(\mathrm{Exp}\, tA), \quad A ∈ O.

Now partition G into subsets Ω_j, each of whose boundaries has content zero, such that Ω_j ⊂ U_j. If g ∈ Ω_j, set g(t) = Φ_{jt}(g). This family of elements of SO(n+1) defines a family of maps F_{gt} : S^n → S^n. Now by (5.2.27) we have

(5.3.5)   α = g^* α - dκ_g(α), \qquad κ_g(α) = \int_0^1 F_{gt}^* (α⌋X_{gt})\, dt,

for each g ∈ SO(n+1), where X_{gt} is the family of vector fields on S^n associated to F_{gt}, as in (5.2.25). Therefore,

(5.3.6)   α = \int_G g^* α\, dg - d \int_G κ_g(α)\, dg.

Now the first term on the right is equal to \bar{a}\, ω, where \bar{a} = \int_G a(g \cdot x)\, dg is a constant; in fact, the constant is

(5.3.7)   \bar{a} = \frac{1}{\mathrm{Vol}\, S^n} \int_{S^n} α.

Thus in this case (5.3.3) is precisely what serves to make (5.3.6) a representation of α as an exact form. This takes care of the case M = S^n.

For a general compact, oriented, connected M, proceed as follows. Cover M with open sets O_1, …, O_K such that each \overline{O}_j is diffeomorphic to the closed unit ball in R^n. Set U_1 = O_1, and inductively enlarge each O_j to U_j, so that \overline{U}_j is also diffeomorphic to the closed ball, and such that U_{j+1} ∩ U_j ≠ ∅, 1 ≤ j < K. You can do this by drawing a simple curve from O_{j+1} to a point in U_j and thickening it. Pick a smooth partition of unity φ_j, subordinate to this cover. (See Section 3.3.)

Given α ∈ Λ^n M, satisfying (5.3.3), take α̃_j = φ_j α. Most likely \int α̃_1 = c_1 ≠ 0, so take σ_1 ∈ Λ^n M, with compact support in U_1 ∩ U_2, such that \int σ_1 = c_1. Set α_1 = α̃_1 − σ_1, and redefine α̃_2 to be the old α̃_2 plus σ_1. Make a similar construction using \int α̃_2 = c_2, and continue. When you are done, you have

(5.3.8) α = α_1 + ··· + α_K,

with α_j compactly supported in U_j. By construction,

(5.3.9) ∫ α_j = 0

for 1 ≤ j < K. But then (5.3.3) implies ∫ α_K = 0 too.

Now pick p ∈ S^n and define smooth maps

(5.3.10) ψ_j : M −→ S^n

which map U_j diffeomorphically onto S^n \ p, and map M \ U_j to p. There is a unique v_j ∈ Λ^n S^n, with compact support in S^n \ p, such that ψ_j^* v_j = α_j. Clearly

∫_{S^n} v_j = 0,

so by the case M = S^n of Proposition 5.3.6 already established, we know that v_j = dw_j for some w_j ∈ Λ^{n−1} S^n, and then

(5.3.11) α_j = dβ_j, β_j = ψ_j^* w_j.

This concludes the proof of Proposition 5.3.6.

We are now ready to introduce the notion of the degree of a map between compact oriented surfaces. Let X and Y be compact oriented n-dimensional surfaces, and assume Y is connected. We want to define the degree of a smooth map F : X → Y. We pick ω ∈ Λ^n Y such that

(5.3.12) ∫_Y ω = 1.

We propose to define

(5.3.13) Deg(F) = ∫_X F^*ω.

The following result shows that Deg(F) is indeed well defined by this formula. The key argument is an application of Proposition 5.3.6.

Lemma 5.3.7. The quantity (5.3.13) is independent of the choice of ω, as long as (5.3.12) holds.

Proof. Pick ω_1 ∈ Λ^n Y satisfying ∫_Y ω_1 = 1, so ∫_Y (ω − ω_1) = 0. By Proposition 5.3.6, this implies

(5.3.14) ω − ω_1 = dα, for some α ∈ Λ^{n−1} Y.

Thus

(5.3.15) ∫_X F^*ω − ∫_X F^*ω_1 = ∫_X dF^*α = 0,

and the lemma is proved. □

The following homotopy invariance of degree is a most basic property.

Proposition 5.3.8. If F0 and F1 are smoothly homotopic, then Deg(F0) = Deg(F1).

Proof. By Proposition 5.2.1, if F_0 and F_1 are smoothly homotopic, then ∫_X F_0^*ω = ∫_X F_1^*ω. □

The following result is a simple but powerful extension of Proposition 5.3.8. Compare the relation between Propositions 5.2.1 and 5.2.2.

Proposition 5.3.9. Let M be a compact oriented surface with boundary, dim M = n + 1. Take Y as above, n = dim Y. Given a smooth map F : M → Y, let f = F|_{∂M} : ∂M → Y. Then Deg(f) = 0.

Proof. Applying Stokes' Theorem to α = F^*ω, we have

∫_{∂M} f^*ω = ∫_M dF^*ω.

But dF^*ω = F^*dω, and dω = 0 if dim Y = n, so we are done. □

Brouwer’s no-retraction theorem is an easy corollary of Proposition 5.3.9. Compare the proof of Proposition 5.3.2.

Corollary 5.3.10. If M is a compact oriented surface with nonempty boundary ∂M, then there is no smooth retraction φ : M → ∂M.

Proof. Without loss of generality, we can assume M is connected. If there were a retraction, then ∂M = φ(M) must also be connected, so Proposition 5.3.9 applies. But then we would have, for the map id = φ|_{∂M}, the contradiction that its degree is both zero and 1. □

We next give an alternative formula for the degree of a map, which is very useful in many applications. In particular, it implies that the degree is always an integer.

A point y_0 ∈ Y is called a regular value of F, provided that, for each x ∈ X satisfying F(x) = y_0, DF(x) : T_xX → T_{y_0}Y is an isomorphism. The easy case of Sard's Theorem, discussed in Section 3.4, implies that most points in Y are regular. Endow X with a volume element ω_X, and similarly endow Y with ω_Y. If DF(x) is invertible, define J_F(x) ∈ R \ 0 by F^*(ω_Y) = J_F(x) ω_X. Clearly the sign of J_F(x), i.e., sgn J_F(x) = ±1, is independent of choices of ω_X and ω_Y, as long as they determine the given orientations of X and Y.

Proposition 5.3.11. If y_0 is a regular value of F, then

(5.3.16) Deg(F) = Σ {sgn J_F(x_j) : F(x_j) = y_0}.

Proof. Pick ω ∈ Λ^n Y, satisfying (5.3.12), with support in a small neighborhood of y_0. Then F^*ω will be a sum Σ ω_j, with ω_j supported in a small neighborhood of x_j, and ∫ ω_j = ±1, according as sgn J_F(x_j) = ±1. □

For an application of Proposition 5.3.11, let X be a compact smooth oriented hypersurface in R^{n+1}, and set Ω = R^{n+1} \ X. Given p ∈ Ω, define

(5.3.17) F_p : X −→ S^n, F_p(x) = (x − p)/|x − p|.

It is clear that Deg(Fp) is constant on each connected component of Ω. It is also easy to see that, when p crosses X, Deg(Fp) jumps by 1. Thus Ω has at least two connected components. This is most of the smooth case of the Jordan-Brouwer separation theorem: Theorem 5.3.12. If X is a smooth compact oriented hypersurface of Rn+1, which is connected, then Ω = Rn+1 \ X has exactly 2 connected components.

Proof. X being oriented, it has a smooth global normal vector field. Use this to separate a small collar neighborhood C of X into 2 pieces; C \ X = C_0 ∪ C_1. The collar C is diffeomorphic to [−1, 1] × X, and each C_j is clearly connected. It suffices to show that any connected component O of Ω intersects either C_0 or C_1. Take p ∈ ∂O. If p ∉ X, then p ∈ Ω, which is open, so p cannot be a boundary point of any component of Ω. Thus ∂O ⊂ X, so O must intersect a C_j. This completes the proof. □

Let us note that, of the two components of Ω, exactly one is unbounded, say Ω0, and the other is bounded; call it Ω1. Then we claim

(5.3.18) p ∈ Ω_j =⇒ Deg(F_p) = j.

Indeed, for p very far from X, F_p : X → S^n is not onto, so its degree is 0. And when p crosses X, from Ω_0 to Ω_1, the degree jumps by +1.

For a simple closed curve in R^2, Theorem 5.3.12 is the smooth case of the Jordan curve theorem. That special case of the argument given above can be found in [40]. The existence of a smooth normal field simplifies the use of basic degree theory to prove such a result. For a general continuous, simple closed curve in R^2, such a normal field is not available, and the proof of the Jordan curve theorem in this more general context requires a different argument, which can be found in [19].
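The jump behavior of Deg(F_p) can be checked numerically in the lowest-dimensional case n = 1, where Deg(F_p) is the classical winding number of the curve X about p. The following sketch (the function name `winding_degree` and the discretization are our own choices, not the text's) computes the degree as the total angular variation of x − p, divided by 2π:

```python
import numpy as np

def winding_degree(curve_pts, p):
    """Degree of F_p(x) = (x - p)/|x - p| for a closed planar curve,
    computed as the total change of angle divided by 2*pi."""
    v = curve_pts - p                      # vectors from p to curve points
    ang = np.arctan2(v[:, 1], v[:, 0])     # angle of each vector
    d = np.diff(np.append(ang, ang[0]))    # increments, closing the loop
    d = (d + np.pi) % (2 * np.pi) - np.pi  # wrap each increment to (-pi, pi]
    return round(d.sum() / (2 * np.pi))

t = np.linspace(0, 2 * np.pi, 400, endpoint=False)
circle = np.column_stack([np.cos(t), np.sin(t)])

print(winding_degree(circle, np.array([0.0, 0.0])))   # p inside:  1
print(winding_degree(circle, np.array([2.0, 0.0])))   # p outside: 0
```

As p moves across the curve, the computed degree jumps between the two values, matching (5.3.18).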

We apply results just established on degree theory to properties of vector fields, particularly of their critical points. A critical point of a vector field V is a point where V vanishes. Let V be a vector field defined on a neighborhood O of p ∈ R^n, with a single critical point, at p. Then, for any small ball B_r about p, B_r ⊂ O, we have a map

(5.3.19) V_r : ∂B_r → S^{n−1}, V_r(x) = V(x)/|V(x)|.

The degree of this map is called the index of V at p, denoted ind_p(V); it is clearly independent of r. If V has a finite number of critical points, then the index of V is defined to be

(5.3.20) Index(V) = Σ_j ind_{p_j}(V).

If ψ : O → O′ is an orientation preserving diffeomorphism, taking p to p and V to W, then we claim

(5.3.21) ind_p(V) = ind_p(W).

In fact, Dψ(p) is an element of GL(n, R) with positive determinant, so it is homotopic to the identity, and from this it readily follows that V_r and W_r are homotopic maps of ∂B_r → S^{n−1}. Thus one has a well defined notion of the index of a vector field with a finite number of critical points on any oriented surface M.

There is one more wrinkle. Suppose X is a smooth vector field on M and p an isolated critical point. If you change the orientation of a small coordinate neighborhood O of p, then the orientations of both ∂B_r and S^{n−1} in (5.3.19) get changed, so the associated degree is not changed. Hence one has a well defined notion of the index of a vector field with a finite number of critical points on any smooth surface M, oriented or not.

A vector field V on O ⊂ R^n is said to have a non-degenerate critical point at p provided DV(p) is a nonsingular n × n matrix. The following formula is convenient.

Proposition 5.3.13. If V has a nondegenerate critical point at p, then

(5.3.22) indp(V ) = sgn detDV (p).

Proof. If p is a nondegenerate critical point, and we set ψ(x) = DV(p)x, ψ_r(x) = ψ(x)/|ψ(x)|, for x ∈ ∂B_r, it is readily verified that ψ_r and V_r are homotopic, for r small. The fact that Deg(ψ_r) is given by the right side of (5.3.22) is an easy consequence of Proposition 5.3.11. □
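Formula (5.3.22) is easy to test on the standard model planar fields. A minimal sketch (the Jacobians below are the usual linearizations of a saddle, a source, and a center, chosen by us for illustration):

```python
import numpy as np

# Jacobians DV(p) at a nondegenerate critical point, for model planar fields.
saddle = np.array([[1.0, 0.0], [0.0, -1.0]])   # V(x,y) = (x, -y)
source = np.array([[1.0, 0.0], [0.0, 1.0]])    # V(x,y) = (x, y)
center = np.array([[0.0, -1.0], [1.0, 0.0]])   # V(x,y) = (-y, x)

def index(DV):
    # (5.3.22): ind_p(V) = sgn det DV(p)
    return int(np.sign(np.linalg.det(DV)))

print(index(saddle), index(source), index(center))  # -1 1 1
```

These values agree with the case-by-case computation requested in Exercise 7 below.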

The following is an important global relation between index and degree.

Proposition 5.3.14. Let Ω be a smooth bounded region in R^{n+1}. Let V be a vector field on Ω̄, with a finite number of critical points p_j, all in the interior Ω. Define F : ∂Ω → S^n by F(x) = V(x)/|V(x)|. Then

(5.3.23) Index(V) = Deg(F).

Proof. If we apply Proposition 5.3.9 to M = Ω̄ \ ∪_j B_ε(p_j), we see that Deg(F) is equal to the sum of degrees of the maps of ∂B_ε(p_j) to S^n, which gives (5.3.23). □

Next we look at a process of producing vector fields in higher dimensional spaces from vector fields in lower dimensional spaces.

Proposition 5.3.15. Let W be a vector field on R^n, vanishing only at 0. Define a vector field V on R^{n+k} by V(x, y) = (W(x), y). Then V vanishes only at (0, 0), and we have

(5.3.24) ind0W = ind(0,0)V.

Proof. If we use Proposition 5.3.11 to compute degrees of maps, and choose y_0 ∈ S^{n−1} ⊂ S^{n+k−1}, a regular value of W_r, and hence also for V_r, this identity follows. □

We turn to a more sophisticated variation. Let X be a compact n-dimensional surface in R^{n+k}, W a (tangent) vector field on X with a finite number of critical points p_j. Let Ω be a small tubular neighborhood of X, π : Ω → X mapping z ∈ Ω to the nearest point in X. Let φ(z) = dist(z, X)^2. Now define a vector field V on Ω by

(5.3.25) V(z) = W(π(z)) + ∇φ(z).

Proposition 5.3.16. If F : ∂Ω → S^{n+k−1} is given by F(z) = V(z)/|V(z)|, then

(5.3.26) Deg(F) = Index(W).

Proof. We see that all the critical points of V are points in X that are critical for W, and, as in Proposition 5.3.15, Index(W ) = Index(V ). Then Proposition 5.3.14 implies Index(V ) = Deg(F ). 

Since φ(z) is increasing as one goes away from X, it is clear that, for z ∈ ∂Ω, V(z) points out of Ω, provided Ω is a sufficiently small tubular neighborhood of X. Thus F : ∂Ω → S^{n+k−1} is homotopic to the Gauss map

(5.3.27) N : ∂Ω −→ S^{n+k−1},

given by the outward pointing normal. This immediately gives:

Corollary 5.3.17. Let X be a compact n-dimensional surface in R^{n+k}, Ω a small tubular neighborhood of X, and N : ∂Ω → S^{n+k−1} the Gauss map. If W is a vector field on X with a finite number of critical points, then

(5.3.28) Index(W) = Deg(N).

Clearly the right side of (5.3.28) is independent of the choice of W. Thus any two vector fields on X with a finite number of critical points have the same index, i.e., Index(W) is an invariant of X. This invariant is denoted

(5.3.29) Index(W) = χ(X),

and is called the Euler characteristic of X.

Remark. The existence of smooth vector fields with only nondegenerate critical points (hence only finitely many critical points) on a given compact surface X follows from results presented in Section 3.5.

Exercises

1. Let X be a compact, oriented, connected surface. Show that the identity map I : X → X has degree 1.

2. Suppose Y is also a compact, oriented, connected surface. Show that if F : X → Y is not onto, then Deg(F ) = 0.

3. If A : Sn → Sn is the antipodal map, show that Deg(A) = (−1)n−1.

4. Show that the homotopy invariance property given in Proposition 5.3.8 can be deduced as a corollary of Proposition 5.3.9. Hint. Take M = X × [0, 1].

5. Let p(z) = z^n + a_{n−1}z^{n−1} + ··· + a_1 z + a_0 be a polynomial of degree n ≥ 1. The fundamental theorem of algebra, proved in §5.1, states that p(z_0) = 0 for some z_0 ∈ C. We aim for another proof, using degree theory. To get this, by contradiction, assume p : C → C \ 0. For r ≥ 0, define

F_r : S^1 −→ S^1, F_r(e^{iθ}) = p(re^{iθ})/|p(re^{iθ})|.

Show that each Fr is smoothly homotopic to F0, and note that Deg(F0) = 0. Then show that there exists r0 such that

r ≥ r_0 ⇒ F_r is homotopic to Φ,

where Φ(e^{iθ}) = e^{inθ}. Show that Deg(Φ) = n, and obtain a contradiction.

Note. Regarding the use of degree theory here, show how the proof of Proposition 5.3.6 vastly simplifies when M = S^1.
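A numerical illustration of this argument (a sketch; the routine `poly_winding` and the sample polynomial are our own): the winding number of θ ↦ p(re^{iθ}) about 0 is 0 for small r and n for large r, and since the degree is a homotopy invariant, the jump forces p to vanish on some circle |z| = r.

```python
import numpy as np

def poly_winding(coeffs, r, m=2000):
    """Winding number of theta -> p(r e^{i theta}) about 0,
    for p given by coeffs = [a_0, a_1, ..., a_n]."""
    theta = np.linspace(0, 2 * np.pi, m, endpoint=False)
    w = np.polyval(coeffs[::-1], r * np.exp(1j * theta))  # p on the circle
    ang = np.angle(w)
    d = np.diff(np.append(ang, ang[0]))
    d = (d + np.pi) % (2 * np.pi) - np.pi   # wrap increments to (-pi, pi]
    return round(d.sum() / (2 * np.pi))

# p(z) = z^3 - 2z + 7: winding 0 for small r, winding 3 = n for large r
coeffs = [7.0, -2.0, 0.0, 1.0]
print(poly_winding(coeffs, 0.1))    # 0
print(poly_winding(coeffs, 100.0))  # 3
```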

6. Show that each odd-dimensional sphere S^{2k−1} has a smooth, nowhere vanishing tangent vector field.
Hint. Regard S^{2k−1} ⊂ C^k, and multiply the unit normal by i.
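The hint can be checked directly in coordinates. A sketch (assuming the pairing of R^{2k} coordinates indicated in the comments, which is our own convention):

```python
import numpy as np

def tangent_field(x):
    """On S^{2k-1} viewed in C^k = R^{2k} (coordinates paired as
    (x_1, y_1, ..., x_k, y_k)), multiply by i: (x_j, y_j) -> (-y_j, x_j)."""
    v = np.empty_like(x)
    v[0::2] = -x[1::2]
    v[1::2] = x[0::2]
    return v

rng = np.random.default_rng(0)
for _ in range(5):
    x = rng.normal(size=6)
    x /= np.linalg.norm(x)                        # a point on S^5
    v = tangent_field(x)
    assert abs(np.dot(v, x)) < 1e-12              # tangent: v is orthogonal to x
    assert abs(np.linalg.norm(v) - 1.0) < 1e-12   # |v| = |x| = 1: nowhere zero
print("ok")
```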

7. Let V be a planar vector field. Assume it has a nondegenerate critical point at p. Show that

p saddle =⇒ indp(V ) = −1

p source =⇒ indp(V ) = 1

p sink =⇒ indp(V ) = 1

p center =⇒ indp(V ) = 1 Refer to Figure 2.3.1 for illustrations of these cases.

8. Let M be a compact oriented 2-dimensional surface. Given a triangulation of M, within each triangle construct a vector field, vanishing at 7 points as illustrated in Figure 5.3.2, with the vertices as sinks, the center as a source, and the midpoints of each edge as saddle points. Fit these together to produce a smooth vector field X on M. Show directly that

Index(X) = V − E + F,

where

V = # vertices, E = # edges, F = # faces,

in the triangulation. The resulting identity χ(M) = V − E + F is known as Euler's formula for χ(M).
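For a concrete instance of Euler's formula, one can count V, E, F for the octahedral triangulation of S^2; the count V − E + F then matches χ(S^2) = 2 from (5.3.30) below. A sketch (the vertex and face lists are ours):

```python
# Octahedron as a triangulation of S^2: V - E + F should equal chi(S^2) = 2.
vertices = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]
faces = [(0,2,4), (2,1,4), (1,3,4), (3,0,4),
         (2,0,5), (1,2,5), (3,1,5), (0,3,5)]
# Each triangle contributes 3 edges; shared edges are deduplicated via frozenset.
edges = {frozenset(e) for f in faces
         for e in [(f[0], f[1]), (f[1], f[2]), (f[2], f[0])]}

V, E, F = len(vertices), len(edges), len(faces)
print(V, E, F, V - E + F)  # 6 12 8 2
```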

9. With X = Sn ⊂ Rn+1, note that the manifold ∂Ω in (5.3.27) consists of two copies of Sn, with opposite orientations. Compute the degree of the map N in (5.3.27)–(5.3.28), and use this to show that (5.3.30) χ(Sn) = 2 if n even, 0 if n odd, granted (5.3.28)–(5.3.29).

10. Consider the vector field R on S2 generating rotation about an axis. Show that R has two critical points, at the “poles.” Classify the critical points, compute Index(R), and compare the n = 2 case of (5.3.30).

11. Generalizing Exercise 9, let X ⊂ R^{n+1} be a smooth, compact, oriented, n-dimensional surface, so again the neighborhood Ω of X as in (5.3.27) has boundary ∂Ω consisting essentially of two copies of X, with opposite orientations. Let Ñ : X → S^n be the outward pointing unit normal. Show that

(5.3.31) Deg Ñ = (1/2) χ(X), if n is even.

Figure 5.3.2. Vector field on a triangle, with source, sinks, and saddles

Remark. If ω_S is the volume form on S^n and ω_X that on X, then Ñ^*ω_S = K ω_X, and K : X → R is called the Gauss curvature of X. Then (5.3.31) implies

∫_X K(x) dS(x) = (1/2) A_n χ(X),

if n is even. This is a basic case of the Gauss-Bonnet formula. See §6.3 for more on this.

12. In the setting of Exercise 11, assume n is odd. Show that

(5.3.32) χ(X) = 0.

Give examples where the identity in (5.3.31) fails.

13. Retain the setting of Exercise 12, especially that n is odd. Let X = ∂O, with O ⊂ Rn+1 bounded and open. Take a smooth function

φ : O −→ [0, ∞), φ(x) = dist(x, X) near X, φ(x) > 0 on O.

Let Σ ⊂ R^{n+2} be the surface Σ = {(x, y) : x ∈ O, y^2 = φ(x)}, and let ν : Σ → S^{n+1} be the outward pointing unit normal. Show that

(5.3.33) Deg ν = Deg Ñ,

and deduce that

(5.3.34) Deg Ñ = (1/2) χ(Σ).

Hint. Taking Ñ : X → S^n ⊂ S^{n+1}, show that each regular value of Ñ is also a regular value of ν, with the same preimage in X ⊂ Σ. Then show that Proposition 5.3.11 applies.

14. Actually, (5.3.33) holds whether n is odd or even. Can you get anything else from this?

15. In the setting of Exercise 12 (n is odd), generalize the construction of Exercise 6 to show directly that there is a smooth, nowhere vanishing vector field tangent to X.

16. Let O ⊂ R^n be open and f : O → R smooth of class C^2. Let V = ∇f. Assume p ∈ O is a nondegenerate critical point of f, so its Hessian D^2f(p) is a nondegenerate n × n symmetric matrix. Say

(5.3.35) D^2f(p) has ℓ positive eigenvalues and n − ℓ negative eigenvalues.

Show that Proposition 5.3.13 implies

(5.3.36) ind_p(V) = (−1)^{n−ℓ}.

17. Let X ⊂ Rn+k be a smooth, compact, n-dimensional surface. Assume there exists f ∈ C2(X), with just two critical points, a max and a min, both nondegenerate. Use Exercise 16 to show that (5.3.37) χ(X) = 2 if n is even, 0 if n is odd. Considering Sn ⊂ Rn+1, use this to give another demonstration of (5.3.30).
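The count in Exercise 17 can be made concrete: at a nondegenerate minimum the Hessian is positive definite (ℓ = n), at a maximum negative definite (ℓ = 0), so (5.3.36) gives indices 1 and (−1)^n. A tiny sketch (the matrices are our own model Hessians):

```python
import numpy as np

def grad_index(hessian):
    # (5.3.36): ind_p(grad f) = sgn det D^2 f(p) = (-1)^(n - l)
    return int(np.sign(np.linalg.det(hessian)))

n = 4                                   # an even dimension
H_min = np.eye(n)                       # minimum: l = n positive eigenvalues
H_max = -np.eye(n)                      # maximum: l = 0 positive eigenvalues
chi = grad_index(H_min) + grad_index(H_max)
print(chi)  # 1 + (-1)^4 = 2, matching chi(S^4) = 2
```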

18. Let T ⊂ R3 be the “inner tube” surface described in Exercise 15 of §3.2.

(a) Show that rotation about the z-axis is generated by a vector field that is tangent to T and nowhere vanishing on T.

Figure 5.3.3. Three-holed torus, Euler characteristic −4

(b) Define f : T → R by f(x, y, z) = x, (x, y, z) ∈ T . Show that f has four critical points, a max, a min, and two saddles. Deduce from Exercise 16 that ∇f is a vector field on T of index 0.

(c) Show that both part (a) and part (b) imply χ(T ) = 0.

19. Figure 5.3.3 shows a 3-holed torus M in R^3, lined up along the x-axis. Define

φ : M → R, φ = x|_M,

and consider the vector field X = ∇φ on M.

(a) Show that X has 8 critical points: one source, 6 saddles, and one sink.

(b) Deduce from part (a) that χ(M) = 2 − 6 = −4.

20. Extending the scope of Exercise 19, consider a g-holed torus, M_g, again lined up along the x-axis, and define a similar vector field X on M_g. Show that X has 2g + 2 critical points: one source, 2g saddles, and one sink. Deduce that χ(M_g) = 2 − 2g. Compare part (b) of Exercise 18.

21. Let M ⊂ Rn be a smooth, compact, m-dimensional surface. Assume (5.3.38) x ∈ M ⇐⇒ −x ∈ M, 0 ∈/ M, and form the projective manifold P(M), as in §3.2. Let X be a vector field on P(M) with only nondegenerate critical points. Show that there is naturally associated a vector field Y on M such that Index Y = 2 Index X. Deduce that χ(M) = 2χ(P(M)).

22. In the setting of Exercise 20, with M_g arranged to satisfy (5.3.38), show that χ(P(M_g)) = 1 − g.

Let X ⊂ Rn be an m-dimensional surface, and let Y ⊂ Rν be a µ-dimensional surface, both smooth of class Ck. Then the Cartesian product X × Y = {(x, y) ∈ Rn × Rν : x ∈ X, y ∈ Y } ⊂ Rn × Rν has a natural structure of an (m + µ)-dimensional Ck surface.

23. Let X and Y be as above, and assume they are both compact. Let V_1 be a smooth vector field tangent to X and V_2 a smooth vector field tangent to Y, both with only nondegenerate critical points. Say {p_i} are the critical points of V_1 and {q_j} those of V_2.

(a) Show that W (x, y) = V1(x) + V2(y) is a smooth vector field tangent to X × Y . Show that its critical points are precisely the points {(pi, qj)}, each nondegenerate. Show that Proposition 5.3.13 gives

(5.3.39) ind(pi,qj ) W = (indpi V1)(indqj V2). (b) Show that

(5.3.40) Index W = (Index V_1)(Index V_2).

(c) Deduce that

(5.3.41) χ(X × Y) = χ(X)χ(Y).
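Part (a) reduces, via (5.3.22), to the fact that the Jacobian of W at (p_i, q_j) is block diagonal, so its determinant is det DV_1(p_i) · det DV_2(q_j). A quick numerical sanity check of this sign identity (random matrices stand in for the Jacobians; this is our illustration, not the text's):

```python
import numpy as np

def index(DV):
    # ind_p = sgn det DV(p) at a nondegenerate critical point, as in (5.3.22)
    return int(np.sign(np.linalg.det(DV)))

rng = np.random.default_rng(1)
for _ in range(20):
    J1 = rng.normal(size=(2, 2))     # stands in for DV1 at p_i
    J2 = rng.normal(size=(3, 3))     # stands in for DV2 at q_j
    # DW at (p_i, q_j) for W(x, y) = V1(x) + V2(y) is block diagonal
    DW = np.block([[J1, np.zeros((2, 3))], [np.zeros((3, 2)), J2]])
    assert index(DW) == index(J1) * index(J2)
print("ok")
```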

Let X and Y be smooth, compact, oriented surfaces in R^n. Assume k = dim X, ℓ = dim Y, and k + ℓ = n − 1. Assume X ∩ Y = ∅. Set

(5.3.42) φ : X × Y −→ S^{n−1}, φ(x, y) = (x − y)/|x − y|.

We define the linking number

(5.3.43) λ(X, Y, R^n) = Deg φ.

24. Let γ and σ ⊂ R^3 be the following simple closed curves, parametrized by s, t ∈ R/(2πZ):

γ(s) = (cos s, sin s, 0), σ(t) = (0, 1 + cos t, sin t).

Thus γ is a circle in the (x, y)-plane centered at (0, 0, 0) and σ is a circle in the (y, z)-plane, centered at (0, 1, 0), both of unit radius. See Figure 5.3.4. Show that

λ(γ, σ, R^3) = 1.

Hint. With φ as above, show that e_2 = (0, 1, 0) ∈ S^2 has exactly one preimage point, under φ : γ × σ → S^2.

25. Let M be a smooth, compact, oriented, (n−1)-dimensional surface, and assume φ : M → R^n \ 0 is a smooth map. Set

F(x) = φ(x)/|φ(x)|, F : M −→ S^{n−1}.

Take ω ∈ Λ^{n−1}(R^n \ 0) to be the form considered in Exercises 9–10 of §4.3, i.e.,

ω = |x|^{−n} Σ_{j=1}^n (−1)^{j−1} x_j dx_1 ∧ ··· ∧ d̂x_j ∧ ··· ∧ dx_n,

where d̂x_j indicates that the factor dx_j is omitted. Show that

Deg(F) = (1/A_{n−1}) ∫_M φ^*ω.

Hint. Use Proposition 5.2.1 (with Y = R^n \ 0), plus Exercise 9 of §8, to show that

∫_M φ^*ω = ∫_M F^*ω,

and show that, under j : S^{n−1} ↪ R^n, j^*ω is the area form on S^{n−1}.

Figure 5.3.4. Two curves in R3 with linking number 1

26. In Exercise 25, take n = 3 and M = T^2 = R^2/(2πZ)^2, parametrized by (s, t) ∈ R^2. Show that

φ^*ω = |φ|^{−3} det [ φ_1      φ_2      φ_3
                      ∂_s φ_1  ∂_s φ_2  ∂_s φ_3
                      ∂_t φ_1  ∂_t φ_2  ∂_t φ_3 ] (ds ∧ dt)
     = |φ|^{−3} φ · (∂φ/∂s × ∂φ/∂t) (ds ∧ dt).

In case φ(s, t) = γ(s) − σ(t), deduce the Gauss linking number formula:

λ(γ, σ, R^3) = −(1/4π) ∫_{T^2} [(γ(s) − σ(t)) / |γ(s) − σ(t)|^3] · (γ′(s) × σ′(t)) ds dt.
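The Gauss integral can be evaluated numerically for the circles of Exercise 24. A sketch using a Riemann sum over the (s, t) torus (the grid size m is our choice; the sum converges rapidly here since the two curves stay at distance ≥ 1 from each other), which should recover the linking number 1:

```python
import numpy as np

# Gauss linking integral for the two circles of Exercise 24,
# approximated by a Riemann sum over the (s, t) torus.
m = 200
s = np.linspace(0, 2*np.pi, m, endpoint=False)
t = np.linspace(0, 2*np.pi, m, endpoint=False)
S, T = np.meshgrid(s, t, indexing="ij")

gamma  = np.stack([np.cos(S), np.sin(S), np.zeros_like(S)], axis=-1)
dgamma = np.stack([-np.sin(S), np.cos(S), np.zeros_like(S)], axis=-1)
sigma  = np.stack([np.zeros_like(T), 1 + np.cos(T), np.sin(T)], axis=-1)
dsigma = np.stack([np.zeros_like(T), -np.sin(T), np.cos(T)], axis=-1)

d = gamma - sigma
integrand = np.einsum("ijk,ijk->ij", d, np.cross(dgamma, dsigma)) \
            / np.linalg.norm(d, axis=-1)**3
link = -integrand.sum() * (2*np.pi/m)**2 / (4*np.pi)
print(round(link))  # 1
```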

27. Take X, Y ⊂ R^n as in (5.3.42)–(5.3.43). We say an unlinking of X and Y is a pair of smooth families ξ_t : X → R^n, η_t : Y → R^n of smooth maps such that

ξ_0(x) = x, η_0(y) = y, ∀ x ∈ X, y ∈ Y,

ξ_t(X) ∩ η_t(Y) = ∅, ∀ t ∈ [0, 1],

ξ_1(X) and η_1(Y) are separated by a hyperplane in R^n.

Show that if there is an unlinking of X and Y, then λ(X, Y, R^n) = 0. Deduce that there is no unlinking of the curves σ and γ in Exercise 24.

Hint. Consider φ_t : X × Y → S^{n−1}, given by

φ_t(x, y) = (ξ_t(x) − η_t(y))/|ξ_t(x) − η_t(y)|.

Chapter 6

Differential geometry of surfaces

Here we study the geometry of an n-dimensional surface M in R^k, or more generally of an n-dimensional manifold M, equipped with a metric tensor (a Riemannian manifold). The first object of our study is the class of geodesics, curves γ : I → M that are critical points for the length functional

(6.0.1) L(γ) = ∫_a^b ∥γ′(t)∥ dt,

with fixed endpoints, say γ(a) = p, γ(b) = q. If we parametrize γ to have constant speed, these curves are equivalently critical points of the functional

(6.0.2) E(γ) = (1/2) ∫_a^b ∥γ′(t)∥^2 dt,

and, moreover, critical points of (6.0.2) are seen to automatically have constant speed. This condition can be expressed as a differential equation. We provide three approaches to this geodesic equation.

In the first approach, M ⊂ R^k is an n-dimensional surface. We see that a smooth curve γ : I → M is a critical point of (6.0.2) if and only if, for each t ∈ (a, b), γ′′(t) is normal to M at γ(t). To derive a differential equation from this characterization, we bring in the matrix-valued function P : M → M(k, R), given by

(6.0.3) P(x) = orthogonal projection of R^k onto T_xM,

for x ∈ M. We derive the geodesic equation

(6.0.4) γ′′(t) + [DP^⊥(γ(t))γ′(t)] γ′(t) = 0,

where P^⊥(x) = I − P(x). (DP(x) will appear again, in (6.0.11).)

The second approach uses local coordinates and works on an arbitrary Riemannian manifold. We take γ(t) = (x^1(t), . . . , x^n(t)) and write (6.0.2) as

(6.0.5) E(γ) = (1/2) ∫_a^b g_{jk}(x(t)) ẋ^j(t) ẋ^k(t) dt,

using the summation convention (sum over repeated indices). We obtain the geodesic equation in the form

(6.0.6) ẍ^ℓ + ẋ^j ẋ^k Γ^ℓ_{jk} = 0,

where Γ^ℓ_{jk} are the Christoffel symbols, given by (6.1.47). We also rewrite this as a first-order system, for (x, ξ), where ξ_ℓ = g_{ℓk} ẋ^k; see (6.1.49). This is a Hamiltonian system. The fact that solutions are constant speed curves is manifested as a conservation law.

Our third approach to the geodesic equation brings in the notion of a covariant derivative, which associates to each tangent vector field X on M a first-order differential operator ∇_X, itself acting on vector fields on M. To each Riemannian manifold M there is a naturally associated Levi-Civita covariant derivative, given by the formula (6.1.62). We show that when M ⊂ R^k is an n-dimensional surface, then the Levi-Civita covariant derivative is given by

(6.0.7) ∇X Y = P (x)DX Y, at x ∈ M, where DX acts componentwise on Y and P (x) is the projection (6.0.3). For a general Riemannian manifold M, the geodesic equation takes the form

(6.0.8) ∇_T T = 0,

at each point γ(t), with T(t) = γ′(t).

Using the basic existence, uniqueness, and smooth dependence on parameters for systems of ODE, we obtain, for each p ∈ M, an exponential map

(6.0.9) Exp_p : U −→ M,

defined on a neighborhood of 0 in T_pM by

(6.0.10) Exp_p(v) = γ_v(1),

where γ_v(t) is the unique constant-speed geodesic satisfying γ_v(0) = p, γ_v′(0) = v. The derivative D Exp_p(0) is the identity map on T_pM, so Exp_p gives a

diffeomorphism from some open neighborhood of 0 ∈ TpM onto a neighbor- hood of p in M, called an exponential coordinate system. We next take up the study of curvature, in §6.2. If M ⊂ Rk is a connected n-dimensional surface, it is part of a flat plane in Rk if and only if P (x) is constant on M. Hence a measure of how M curves at x is given by

(6.0.11) DP (x): TxM −→ M(k, R).

Given X(x) ∈ TxM, we set DX P (x) = DP (x)X(x), yielding

(6.0.12) DX P : M −→ M(k, R), when X is a vector field on M. One sees that, for x ∈ M,

(6.0.13) D_X P(x) maps T_xM to ν_xM, and ν_xM to T_xM,

where ν_xM = (T_xM)^⊥ ⊂ R^k. If X and Y are tangent vector fields to M, there is the second fundamental form, given by

(6.0.14) II(X,Y ) = (DX P )Y, which is normal to M. If ξ is a normal field to M, we define the Weingarten map,

Aξ(x): TxM −→ TxM, by (6.0.15) ⟨AξX,Y ⟩ = ⟨ξ, II(X,Y )⟩. One has the Weingarten formula,

(6.0.16) P D_X ξ = −A_ξ X.

In case k = n + 1, so M has codimension 1, we take a smooth unit normal field N to M, so

(6.0.17) N : M −→ S^n

is the Gauss map, introduced in §5.3, giving rise to the Gauss curvature,

(6.0.18) K(x) = det DN(x),

where DN(x) : T_xM → T_{N(x)}S^n = T_xM. In this case, the Weingarten formula becomes

(6.0.19) D_X N = −A_N X,

so we have

(6.0.20) K(x) = (−1)^n det A_N(x).

Section 6.2 explores a number of explicit computations of the Gauss curvature of special classes of surfaces.

The measures of curvature mentioned above are extrinsically defined, i.e., they are defined by how the surface M sits in R^k. There is also an intrinsically defined curvature, the Riemann curvature, defined as follows. If M is a Riemannian manifold and X, Y, Z are vector fields on M, we set

(6.0.21) R(X,Y )Z = ∇X ∇Y Z − ∇Y ∇X Z − ∇[X,Y ]Z, where ∇ is the Levi-Civita covariant derivative on M. There is a formula for R in terms of the Christoffel symbols, presented in (6.2.65)–(6.2.69). In case M ⊂ Rk is an n-dimensional surface, we have a formula relating the Riemann curvature to the Weingarten map and the second fundamental form:

(6.0.22) R(X, Y)Z = A_{II(Y,Z)}X − A_{II(X,Z)}Y.

In case k = n + 1, we can set

(6.0.23) II(X, Y) = ĨI(X, Y) N,

and deduce that

(6.0.24) ⟨R(X, Y)Z, W⟩ = det ( ĨI(X, W)  ĨI(X, Z)
                               ĨI(Y, W)  ĨI(Y, Z) ).

In case n = 2, k = 3, this yields the formula

(6.0.25) K(x) = ⟨R(U, V)V, U⟩(x)

for the Gauss curvature in terms of the Riemann curvature (where U and V form an orthonormal basis of T_xM). This is the Gauss theorema egregium. It implies that the Gauss curvature of M, defined initially extrinsically, is actually an intrinsic measure of the curvature of M, when M ⊂ R^3 is a 2D surface. One can use the right side of (6.0.25) to define the Gauss curvature of any 2D Riemannian manifold, whether or not it is a surface in R^3. In (6.2.102)–(6.2.107), we show that the Gauss curvature of a surface M ⊂ R^{n+1} of dimension n is intrinsically defined whenever n = 2m is even, and we extend the definition of Gauss curvature to all Riemannian manifolds of dimension 2m.

In §6.3 we tie results on curvature to results on degree theory from Chapter 5. Results of §5.3 show that if M ⊂ R^{n+1} is a compact, n-dimensional surface, with Gauss map N : M → S^n, and if n = 2m is even, then

(6.0.26) Deg(N) = (1/A_n) ∫_M K(x) dS(x), and Deg(N) = (1/2) χ(M),

where A_n is the n-dimensional area of the unit sphere S^n, K(x) is the Gauss curvature of M, given by (6.0.18), and χ(M) is the Euler characteristic of

M. Putting these identities together and specializing to n = 2 yields

(6.0.27) ∫_M K dS = 2πχ(M),

for M ⊂ R^3, which is part of the classical Gauss-Bonnet theorem. We have two primary goals in this section. One is to establish (6.0.27) for an arbitrary 2D compact Riemannian manifold, regardless of whether it is a surface in R^3. Going further, we consider a domain Ω in such a 2D Riemannian manifold M and seek to extend (6.0.27) to a formula for ∫_Ω K dS. For example, we show that if Ω ⊂ M is smoothly bounded and M \ Ω has k connected components, each diffeomorphic to a closed disk, then

(6.0.28) ∫_Ω K dS + ∫_{∂Ω} κ ds = 2π(χ(M) − k),

where κ is the geodesic curvature of ∂Ω. We make some comments on higher dimensional versions of the Gauss-Bonnet theorem, beyond the case of codimension-1 surfaces treated in (6.0.26), noting that pursuing this is a task for a more advanced course.

In §6.4 we study smooth matrix groups, which are subsets G ⊂ M(n, F) that are smooth surfaces and have the property

(6.0.29) g, h ∈ G =⇒ gh, g^{−1} ∈ G.

Such surfaces get both left and right invariant metric tensors, and associated volume elements, and we investigate properties of the resulting invariant integrals. We show that the matrix exponential has the property

(6.0.30) A ∈ g = T_I G =⇒ e^{tA} ∈ G, ∀ t ∈ R,

and explore a Lie algebra structure on g. In case G has a bi-invariant metric tensor, we show that these curves e^{tA} are geodesics on G, and obtain formulas for the covariant derivative and the Riemann curvature in terms of the Lie algebra structure on g.

We conclude this chapter with some results on the derivative of the exponential map, actually two exponential maps, first the matrix exponential, and then the map Exp_p in (6.0.9)–(6.0.10). We particularly consider existence of critical points, associated to conjugate points on M, and examine the influence of curvature on the existence of such critical points. This also serves as an introduction to a topic that can be pursued much further in a more advanced course.

6.1. Geometry of surfaces I: geodesics

Let M be a smooth, n-dimensional surface in R^k (k > n), or more generally a smooth, n-dimensional manifold equipped with a metric tensor (a Riemannian manifold). Let γ : [a, b] → M be a smooth curve. As seen in §3.2, its length is

(6.1.1) L(γ) = ∫_a^b ∥γ′(t)∥ dt,

where

(6.1.2) ∥γ′(t)∥^2 = ⟨γ′(t), γ′(t)⟩ = γ′(t) · G(γ(t))γ′(t) = Σ_{j,k} g_{jk}(γ(t)) γ_j′(t) γ_k′(t),

the last expression being given in a local coordinate system, with G = (g_{jk}) denoting the metric tensor in this coordinate system. We use ⟨ , ⟩ to denote the inner product of two vectors defined by this metric tensor:

(6.1.3) ⟨V(x), W(x)⟩ = V(x) · G(x)W(x) = Σ_{j,k} g_{jk}(x) V_j(x) W_k(x).

In case M is a surface in R^k, with the induced metric tensor, ⟨V(x), W(x)⟩ is given by the standard dot product on R^k, applied to V(x), W(x) ∈ T_xM ⊂ R^k. We aim to study smooth curves γ that are length minimizing, among curves with the same endpoints. Such curves are called geodesics. They have the following property. Let γ_s be a smooth family of curves satisfying

(6.1.4) γs :[a, b] −→ M, γs(a) ≡ p, γs(b) ≡ q, with γ0 = γ. Then L(γs) ≥ L(γ0) for all s, so

(6.1.5) (d/ds) L(γ_s)|_{s=0} = 0.

In other words, γ_0 is a critical point of the length functional. We define "geodesic" to include all such critical paths. Later we will investigate the length minimizing properties of general geodesics.

Note that L(γ) is unchanged under reparametrization. We will parametrize γ_0 so that ∥γ_0′(t)∥ ≡ c_0 is constant. Then

(6.1.6) (d/ds) L(γ_s)|_{s=0} = (d/ds) ∫_a^b ⟨γ_s′(t), γ_s′(t)⟩^{1/2} dt |_{s=0}
       = (1/2c_0) ∫_a^b (∂/∂s) ⟨γ_s′(t), γ_s′(t)⟩ dt |_{s=0}.

Equivalently,

(6.1.7) (d/ds) L(γ_s)|_{s=0} = (1/c_0) (d/ds) E(γ_s)|_{s=0},

where

(6.1.8) E(γ_s) = (1/2) ∫_a^b ∥γ_s′(t)∥^2 dt = (1/2) ∫_a^b ⟨γ_s′(t), γ_s′(t)⟩ dt

is the energy of the curve γ_s : [a, b] → M. Thus we shift from seeking a critical point of the length functional to finding a critical point of the energy functional. As we will see, such a critical path automatically has the property that ∥γ′(t)∥ is constant.

Our next goal is to derive a differential equation characterizing critical points of the energy functional. We will discuss three approaches to such a geodesic equation. In all cases, we start with

(6.1.9) (d/ds) E(γ_s) = (1/2) ∫_a^b (∂/∂s) ⟨γ_s′(t), γ_s′(t)⟩ dt.

In our first approach, we assume M is a smooth surface in R^k. Then we have from (6.1.9) that

(6.1.10) (d/ds) E(γ_s) = ∫_a^b ⟨(∂/∂s) γ_s′(t), γ_s′(t)⟩ dt.

Using the identity

(6.1.11) (d/dt) ⟨(∂/∂s) γ_s(t), γ_s′(t)⟩ = ⟨(∂/∂s) γ_s′(t), γ_s′(t)⟩ + ⟨(∂/∂s) γ_s(t), γ_s′′(t)⟩,

together with the fundamental theorem of calculus and the fact that (given (6.1.4))

(6.1.12) (∂/∂s) γ_s(t) = 0, at t = a and b,

we have

(6.1.13) (d/ds) E(γ_s)|_{s=0} = − ∫_a^b ⟨V(t), γ_0′′(t)⟩ dt,

Figure 6.1.1. The projection P(x) of R^k onto T_xM

where

(6.1.14) V(t) = (∂/∂s) γ_s(t)|_{s=0} ∈ T_{γ_0(t)}M.

Now, given any smooth V : [a, b] → R^k satisfying V(t) ∈ T_{γ_0(t)}M and V(a) = V(b) = 0, one can find a smooth family of curves γ_s : [a, b] → M satisfying (6.1.4), such that γ_0 = γ and (6.1.14) holds. We have the following.

Proposition 6.1.1. If M ⊂ R^k is a smooth surface, a smooth curve γ = γ_0 : [a, b] → M is a critical point of the energy functional if and only if, for each t ∈ (a, b),

(6.1.15) γ′′(t) ⊥ V(t) for each V(t) ∈ T_{γ(t)}M.

A convenient restatement of (6.1.15) can be formulated as follows. We take

(6.1.16) P : M −→ M(k, R), P(x) = orthogonal projection of R^k onto T_xM.

See Figure 6.1.1. Then (6.1.15) is equivalent to the statement that

(6.1.17) P(γ(t))γ′′(t) = 0,

for all t ∈ (a, b). In order to get a differential equation in standard form, we complement (6.1.17) with

(6.1.18) P^⊥(γ(t))γ′(t) = 0

(where P^⊥(x) = I − P(x)), which follows from the fact that γ′(t) ∈ T_{γ(t)}M. Applying d/dt to (6.1.18), and using the product rule and the chain rule, we have

(6.1.19) 0 = (d/dt) P^⊥(γ(t))γ′(t) = [DP^⊥(γ(t))γ′(t)] γ′(t) + P^⊥(γ(t))γ′′(t).

Here, for x ∈ M,

(6.1.20) DP^⊥(x) : T_xM −→ M(k, R),

so DP^⊥(γ(t))γ′(t) ∈ M(k, R). Adding (6.1.19) to (6.1.17) gives

(6.1.21) γ′′(t) + [DP^⊥(γ(t))γ′(t)] γ′(t) = 0.

In order to apply ODE theory to (6.1.21), it is convenient to have the following set-up. Assume M = M_0 and that there is a diffeomorphism B_a × M → O, where B_a is an open ball about 0 in R^{k−n} and O is an open neighborhood of M_0 in R^k. Then O is a union of surfaces M_y, y ∈ B_a. We extend P in (6.1.16) to a smooth map

(6.1.22) P : O → M(k, R), P(x) = orthogonal projection of R^k onto T_xM_y, if x ∈ M_y.

Thus we can regard (6.1.21) as an ODE on O. By results of §2.3, it has a unique short time solution, given initial data γ(a) = p ∈ O, γ'(a) = v ∈ R^k. The following result shows when solutions to (6.1.21) give geodesics on M.

Proposition 6.1.2. Assume γ(t) solves (6.1.21), on an interval I, containing a, with initial data

(6.1.23) γ(a) = p ∈ M, γ'(a) = v ∈ T_pM.

Then γ(t) satisfies (6.1.17)–(6.1.18), and γ(t) ∈ M, for t ∈ I.

Proof. To start, we derive an identity based on the fact that each P(x) is a projection. Applying d/dt to

(6.1.24) P(γ(t)) P(γ(t)) = P(γ(t))

gives

(6.1.25) [DP(γ(t)) γ'(t)] P(γ(t)) + P(γ(t)) [DP(γ(t)) γ'(t)] = DP(γ(t)) γ'(t),

hence

(6.1.26) P(γ(t)) [DP(γ(t)) γ'(t)] = [DP(γ(t)) γ'(t)] P^⊥(γ(t)).

Meanwhile, applying P(γ(t)) to (6.1.21) yields

(6.1.27) P(γ(t)) γ''(t) = P(γ(t)) [DP(γ(t)) γ'(t)] γ'(t).

Hence, by (6.1.26),

(6.1.28) P(γ(t)) γ''(t) = [DP(γ(t)) γ'(t)] P^⊥(γ(t)) γ'(t).

Now set

(6.1.29) σ(t) = P^⊥(γ(t)) γ'(t).

Applying d/dt, we have

(6.1.30) σ'(t) = [DP^⊥(γ(t)) γ'(t)] γ'(t) + P^⊥(γ(t)) γ''(t) = −P(γ(t)) γ''(t) = −[DP(γ(t)) γ'(t)] σ(t),

the second identity by (6.1.21) and the third identity by (6.1.28). Hence σ satisfies a first-order, homogeneous, linear ODE, so

(6.1.31) γ'(a) ∈ T_{γ(a)}M ⇒ σ(a) = 0 ⇒ σ(t) ≡ 0.

From (6.1.28)–(6.1.29), it is clear that

(6.1.32) σ(t) ≡ 0 ⟹ P(γ(t)) γ''(t) ≡ 0,

so we have (6.1.17)–(6.1.18), and (6.1.18) yields γ(t) ∈ M for all t ∈ I. □

Corollary 6.1.3. In the setting of Proposition 6.1.2, if (6.1.23) holds, then ∥γ'(t)∥ is constant.

Proof. We have

(6.1.33) (d/dt) ⟨γ'(t), γ'(t)⟩ = 2 ⟨γ''(t), γ'(t)⟩,

which vanishes when (6.1.17)–(6.1.18) hold. □

We give an alternative presentation of the geodesic equation in case k = n + 1, i.e., M has codimension 1 in R^k. Suppose M is defined by u(x) = c, ∇u(x) ≠ 0. Then (6.1.17) is equivalent to

(6.1.34) γ''(t) = K(t) ∇u(γ(t)),

for a real valued function K, which remains to be determined. Meanwhile, the condition that u(γ(t)) ≡ c implies

(6.1.35) ⟨γ'(t), ∇u(γ(t))⟩ = 0

(cf. (6.1.18)), and differentiating this gives

(6.1.36) ⟨γ''(t), ∇u(γ(t))⟩ = −⟨γ'(t), D²u(γ(t)) γ'(t)⟩,

where D²u is the k × k matrix of second-order partial derivatives of u. Comparing (6.1.34) and (6.1.36) gives K(t), and we have the ODE

(6.1.37) γ''(t) = − ( ⟨γ'(t), D²u(γ(t)) γ'(t)⟩ / ∥∇u(γ(t))∥² ) ∇u(γ(t)),

for a geodesic γ lying in M.
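As a quick numerical sanity check (an illustrative sketch, not from the text), one can integrate the ODE (6.1.37) for the unit sphere, taking u(x) = ∥x∥², c = 1, so that ∇u = 2x and D²u = 2I. Geodesics should be great circles, traversed at constant speed:

```python
import numpy as np

# Integrate the geodesic ODE (6.1.37) on the unit sphere u(x) = ||x||^2 = 1,
# where grad u = 2x and D^2 u = 2I.  Geodesics should be great circles.

def acc(x, v):
    """Right side of (6.1.37): gamma'' = -(<v, D2u v>/||grad u||^2) grad u."""
    grad = 2.0 * x                  # grad u at x
    d2uv = 2.0 * v                  # D^2 u(x) applied to v
    return -(v @ d2uv) / (grad @ grad) * grad

def rk4_step(x, v, h):
    # classical RK4 on the first-order system (x, v)' = (v, acc(x, v))
    k1x, k1v = v, acc(x, v)
    k2x, k2v = v + 0.5*h*k1v, acc(x + 0.5*h*k1x, v + 0.5*h*k1v)
    k3x, k3v = v + 0.5*h*k2v, acc(x + 0.5*h*k2x, v + 0.5*h*k2v)
    k4x, k4v = v + h*k3v,     acc(x + h*k3x,     v + h*k3v)
    return (x + h*(k1x + 2*k2x + 2*k3x + k4x)/6,
            v + h*(k1v + 2*k2v + 2*k3v + k4v)/6)

x = np.array([1.0, 0.0, 0.0])       # gamma(0) = p on the sphere
v = np.array([0.0, 1.0, 0.0])       # gamma'(0) = v in T_p M, unit speed
h, n = 1e-3, int(np.pi/2 / 1e-3)
for _ in range(n):
    x, v = rk4_step(x, v, h)

print(np.linalg.norm(x), np.linalg.norm(v))   # both remain ~1
```

Starting at p = (1, 0, 0) with unit velocity (0, 1, 0), the solution tracks the great circle (cos t, sin t, 0), with ∥γ(t)∥ and ∥γ'(t)∥ staying equal to 1, illustrating Proposition 6.1.2 and Corollary 6.1.3.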

For our second approach to the geodesic equation, we let M be a general Riemannian manifold. We assume γs :[a, b] → M, satisfying (6.1.4), are contained in a single coordinate patch, on which the metric tensor is given by the n × n symmetric, positive-definite matrix G = (gjk). We use the following notation:

(6.1.38) γ_s(t) = x_s(t) = (x_s^1(t), …, x_s^n(t)), (d/dt) γ_s(t) = ẋ_s(t) = (ẋ_s^1(t), …, ẋ_s^n(t)).

We use the following summation convention, converting (6.1.2) to

(6.1.39) ∥ẋ_s(t)∥² = g_{jk}(x_s(t)) ẋ_s^j(t) ẋ_s^k(t).

(We sum over repeated indices.) Thus the energy functional is

(6.1.40) E(x_s) = (1/2) ∫_a^b g_{jk}(x_s(t)) ẋ_s^j(t) ẋ_s^k(t) dt.

Hence, with

(6.1.41) V^j(t) = (∂/∂s) x_s^j(t)|_{s=0},

and x(t) = x_0(t), we have

(6.1.42) (d/ds) E(x_s)|_{s=0} = ∫_a^b [ g_{jk}(x(t)) ((∂/∂s) ẋ_s^j(t)|_{s=0}) ẋ^k(t) + (1/2) V^j(t) (∂g_{kℓ}/∂x^j) ẋ^k(t) ẋ^ℓ(t) ] dt,

where we have made use of the symmetry g_{jk} = g_{kj}. Now, in analogy with (6.1.11), we can write

(6.1.43) (d/dt) ( g_{jk}(x(t)) V^j(t) ẋ^k(t) ) = g_{jk}(x(t)) ((∂/∂s) ẋ_s^j(t)|_{s=0}) ẋ^k(t) + g_{jk}(x(t)) V^j(t) ẍ^k(t) + ẋ^ℓ(t) (∂g_{jk}/∂x^ℓ) V^j(t) ẋ^k(t).

Thus, by the fundamental theorem of calculus,

(6.1.44) (d/ds) E(x_s)|_{s=0} = − ∫_a^b [ g_{jk} V^j ẍ^k + ẋ^ℓ (∂g_{jk}/∂x^ℓ) V^j ẋ^k − (1/2) V^j (∂g_{kℓ}/∂x^j) ẋ^k ẋ^ℓ ] dt,

and the stationary condition (d/ds) E(x_s)|_{s=0} = 0 for all variations of the form (6.1.4) becomes

(6.1.45) g_{jk} ẍ^k(t) = − ( ∂g_{jk}/∂x^ℓ − (1/2) ∂g_{kℓ}/∂x^j ) ẋ^k ẋ^ℓ.

Symmetrizing the quantity in parentheses with respect to k and ℓ yields the geodesic equation

(6.1.46) ẍ^ℓ + ẋ^j ẋ^k Γ^ℓ_{jk} = 0,

where Γ^ℓ_{jk} is defined by

(6.1.47) g_{kℓ} Γ^ℓ_{ij} = (1/2) ( ∂g_{jk}/∂x^i + ∂g_{ik}/∂x^j − ∂g_{ij}/∂x^k ).

The functions Γ^ℓ_{ij} are called the Christoffel symbols. We will see more of them.

We next convert (6.1.46) to a first-order system. It is convenient to set

(6.1.48) ξ_ℓ = g_{ℓk} ẋ^k.

Then consider the system

(6.1.49) ẋ^ℓ = g^{ℓk}(x) ξ_k, ξ̇_ℓ = −(1/2) (∂g^{jk}/∂x^ℓ) ξ_j ξ_k,

where (g^{jk}) is the matrix inverse to (g_{jk}). If we apply d/dt to the first equation and plug in the second one for ξ̇_k, we get

(6.1.50) ẍ^ℓ = ( −(1/2) g^{ℓj} ∂g^{ik}/∂x^j + g^{kj} ∂g^{iℓ}/∂x^j ) ξ_i ξ_k,

and using (6.1.48), together with

(6.1.51) (∂/∂x^j) G(x)^{-1} = −G(x)^{-1} (∂G/∂x^j) G(x)^{-1},

straightforward manipulations yield the geodesic equation (6.1.46). A special structure of the system (6.1.49) is revealed by writing this system as

(6.1.52) ẋ^ℓ = ∂f/∂ξ_ℓ, ξ̇_ℓ = −∂f/∂x^ℓ,

where

(6.1.53) f(x, ξ) = (1/2) g^{jk}(x) ξ_j ξ_k.

A system of ODEs of the form (6.1.52) is called a Hamiltonian system. Note that if (6.1.52) holds, then

(6.1.54) (d/dt) f(x, ξ) = (∂f/∂x^ℓ) ẋ^ℓ + (∂f/∂ξ_ℓ) ξ̇_ℓ = −ξ̇_ℓ ẋ^ℓ + ẋ^ℓ ξ̇_ℓ = 0.

Hence f(x(t), ξ(t)) is constant for a solution to (6.1.52). Since

(6.1.55) g^{jk}(x) ξ_j ξ_k = ξ · G(x)^{-1} ξ = G(x) ẋ · G(x)^{-1} G(x) ẋ = g_{jk}(x) ẋ^j ẋ^k,

this implies that the solution curve to the geodesic equation (6.1.46) has constant speed, parallel to the result of Corollary 6.1.3.

The passage from the geodesic equation (6.1.46) to the system (6.1.52) is a special case of a more general passage from a "Lagrangian equation" to an associated Hamiltonian system. More on this can be found in Chapter 4, §7 of [45], and in Chapter 1, §12 of [41].
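To illustrate the conservation law (6.1.54) numerically (an illustrative sketch, not from the text), one can integrate the Hamiltonian system (6.1.52) for the Poincaré upper half-plane metric g_{jk} = x_2^{−2} δ_{jk} (cf. (6.2.98)), for which f(x, ξ) = (1/2) x_2² (ξ_1² + ξ_2²):

```python
import numpy as np

# Hamiltonian form (6.1.52) of the geodesic flow on the Poincare upper
# half plane, g_jk = x2^{-2} delta_jk, so f(x, xi) = (1/2) x2^2 (xi1^2 + xi2^2).
# By (6.1.54), f is conserved along solutions; the geodesic through (0, 1)
# with xi = (1, 0) is the unit semicircle centered at the origin.

def rhs(z):
    x1, x2, xi1, xi2 = z
    return np.array([x2**2 * xi1,                  # x1'  =  df/dxi1
                     x2**2 * xi2,                  # x2'  =  df/dxi2
                     0.0,                          # xi1' = -df/dx1 = 0
                     -x2 * (xi1**2 + xi2**2)])     # xi2' = -df/dx2

def f(z):
    x1, x2, xi1, xi2 = z
    return 0.5 * x2**2 * (xi1**2 + xi2**2)

z = np.array([0.0, 1.0, 1.0, 0.0])     # start at (0, 1), momentum (1, 0)
h = 1e-3
f0 = f(z)
for _ in range(3000):                  # RK4 steps
    k1 = rhs(z); k2 = rhs(z + 0.5*h*k1)
    k3 = rhs(z + 0.5*h*k2); k4 = rhs(z + h*k3)
    z += h*(k1 + 2*k2 + 2*k3 + k4)/6

print(abs(f(z) - f0))                  # ~0: f is constant along the flow
```

The run also stays on the unit semicircle x_1² + x_2² = 1, consistent with the geodesics of the hyperbolic plane being semicircles orthogonal to the boundary.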

We now discuss a third approach to the geodesic equation. Again, M is a smooth n-dimensional Riemannian manifold, γ_s : [a, b] → M is a smooth family of curves satisfying (6.1.4), and V(t) is defined as in (6.1.14). Let

(6.1.56) T = γ'_s(t).

Then, parallel to (6.1.9),

(6.1.57) (d/ds) E(γ_s) = (1/2) ∫_a^b V ⟨T, T⟩ dt.

Now we need a generalization of (∂/∂s) γ'_s(t) and of the formulas (6.1.10)–(6.1.11). To achieve this, we introduce the notion of a covariant derivative.

If X and Y are vector fields on M, the covariant derivative ∇_X Y is a vector field on M. The following properties are required. We assume that ∇_X Y is additive in both X and Y, that

(6.1.58) ∇_{fX} Y = f ∇_X Y,

for f ∈ C^∞(M), and that

(6.1.59) ∇_X (fY) = f ∇_X Y + (Xf) Y.

Thus ∇_X acts as a derivation. The operator ∇_X is also required to have the following relation to the Riemannian metric:

(6.1.60) X⟨Y, Z⟩ = ⟨∇_X Y, Z⟩ + ⟨Y, ∇_X Z⟩.

One further property will uniquely specify ∇:

(6.1.61) ∇_X Y − ∇_Y X = [X, Y].

If all these properties hold, we say ∇ is the Levi-Civita covariant derivative on the Riemannian manifold M. We have the following basic existence and uniqueness result.

Proposition 6.1.4. Associated with a Riemannian metric is a unique Levi-Civita covariant derivative, given by

(6.1.62) 2⟨∇_X Y, Z⟩ = X⟨Y, Z⟩ + Y⟨X, Z⟩ − Z⟨X, Y⟩ + ⟨[X, Y], Z⟩ − ⟨[X, Z], Y⟩ − ⟨[Y, Z], X⟩.

Proof. To obtain the formula (6.1.62), cyclically permute X, Y, and Z in (6.1.60) and take the appropriate alternating sum, using (6.1.61) to cancel all terms involving ∇ except two copies of ⟨∇_X Y, Z⟩. This derives the formula and establishes uniqueness. On the other hand, if (6.1.62) is taken as the definition of ∇_X Y, then verification of the properties (6.1.58)–(6.1.61) is a straightforward exercise. □

To see that the passage from (6.1.57) to the next step ((6.1.64) below) generalizes the passage from (6.1.9) to (6.1.10), we note the following.

Proposition 6.1.5. If M is a smooth surface in R^k, with the induced Riemannian metric, and if ∇^M is its Levi-Civita covariant derivative, then, for X and Y tangent to M,

(6.1.63) ∇^M_X Y = P(x) D_X Y, at x ∈ M,

where D_X acts componentwise on Y, and P(x) is the projection (6.1.16).

Proof. It is routine to verify that the right side of (6.1.63) satisfies all the conditions described in (6.1.58)–(6.1.61) that define the Levi-Civita covariant derivative. □

The identity (6.1.63) will also play an important role in the study of curvature, in the next section. We resume our analysis of (6.1.57), which becomes

(6.1.64) (d/ds) E(γ_s)|_{s=0} = ∫_a^b ⟨∇_V T, T⟩ dt.

Since ∂/∂s and ∂/∂t commute, we have [V, T] = 0 on γ_0, and (6.1.61) implies

(6.1.65) (d/ds) E(γ_s)|_{s=0} = ∫_a^b ⟨∇_T V, T⟩ dt.

The replacement for (6.1.11) is

(6.1.66) T⟨V, T⟩ = ⟨∇_T V, T⟩ + ⟨V, ∇_T T⟩,

so, by the fundamental theorem of calculus,

(6.1.67) (d/ds) E(γ_s)|_{s=0} = − ∫_a^b ⟨V, ∇_T T⟩ dt.

If this is to vanish for all smooth vector fields V over γ_0, vanishing at p and q, we must have

(6.1.68) ∇_T T = 0.

We show how this leads again to (6.1.46). If M has a coordinate chart Ω ⊂ R^n that carries a Riemannian metric (g_{jk}) and a corresponding Levi-Civita covariant derivative, the Christoffel symbols can be defined by

(6.1.69) ∇_{D_i} D_j = Σ_k Γ^k_{ji} D_k,

where D_k = ∂/∂x^k. The formula (6.1.62) implies

(6.1.70) g_{kℓ} Γ^ℓ_{ij} = (1/2) ( ∂g_{jk}/∂x^i + ∂g_{ik}/∂x^j − ∂g_{ij}/∂x^k ),

in agreement with (6.1.47). We can rewrite the geodesic equation (6.1.68) for γ(t) = x(t) as follows. With x = (x^1, …, x^n) and T = (ẋ^1, …, ẋ^n), we have

(6.1.71) 0 = ∇_T ( Σ_ℓ ẋ^ℓ D_ℓ ) = Σ_ℓ ( ẍ^ℓ D_ℓ + ẋ^ℓ ∇_T D_ℓ ).

In view of (6.1.69), this becomes

(6.1.72) ẍ^ℓ + ẋ^j ẋ^k Γ^ℓ_{jk} = 0,

where we bring back the summation convention. We have recovered the geodesic equation (6.1.46). Note that if T = γ'(t), then T⟨T, T⟩ = 2⟨∇_T T, T⟩ = 0, so if (6.1.68) holds, γ(t) automatically has constant speed. Shortly we will verify that a curve satisfying the geodesic equation is indeed locally length minimizing.

For a given p ∈ M, the exponential map

(6.1.73) Exp_p : U → M

is defined on a neighborhood U of 0 in T_pM ≈ R^n by

(6.1.74) Exp_p(v) = γ_v(1),

Figure 6.1.2. The exponential map: Exp_p(tv) = γ_v(t)

where γ_v(t) is the unique constant-speed geodesic satisfying

(6.1.75) γ_v(0) = p, γ'_v(0) = v.

See Figure 6.1.2. Note that Exp_p(tv) = γ_v(t). It is clear that Exp_p is well defined and smooth on a sufficiently small neighborhood U of 0 in T_pM, and its derivative at 0 is the identity. Thus, perhaps shrinking U, we have that Exp_p is a diffeomorphism of U onto a neighborhood O of p in M. This provides what is called an exponential coordinate system on a neighborhood of p ∈ M. Clearly the geodesics through p are the straight lines through the origin in this coordinate system. We now establish a result, known as the Gauss lemma, which will imply that each geodesic is locally length minimizing. For a > 0 small, let Σ_a = {v ∈ T_pM : ∥v∥ = a}, and let S_a = Exp_p(Σ_a).

Proposition 6.1.6. Any unit-speed geodesic through p hitting Sa at t = a is orthogonal to Sa.

Proof. If γ_0(t) is a unit-speed geodesic, γ_0(0) = p, γ_0(a) = q ∈ S_a, and V ∈ T_qM is tangent to S_a, there is a smooth family of unit-speed geodesics γ_s(t), such that γ_s(0) = p and (∂/∂s) γ_s(a)|_{s=0} = V. Since each γ_s has unit speed, E(γ_s) = a/2 is constant in s, so we can use (6.1.65)–(6.1.68) to conclude that

(6.1.76) 0 = ∫_0^a T⟨V, T⟩ dt = ⟨V, γ'_0(a)⟩,

which proves the proposition. □

Corollary 6.1.7. Suppose Exp_p : B_a → M is a diffeomorphism of B_a = {v ∈ T_pM : ∥v∥ ≤ a} onto its image B. Then, for each q ∈ B, q = Exp_p(w), the curve γ(t) = Exp_p(tw), 0 ≤ t ≤ 1, is the unique shortest path from p to q.

Proof. We can assume ∥w∥ = a. Let σ : [0, 1] → M be another constant speed path from p to q, say ∥σ'(t)∥ = b. We can assume σ(t) ∈ B for all t ∈ [0, 1]; otherwise restrict σ to [0, β], where β = inf{t ≥ 0 : σ(t) ∈ ∂B}, and the argument below will show this segment has length ≥ a. For all t such that σ(t) ∈ B \ {p}, we can write σ(t) = Exp_p(r(t) ω(t)), for uniquely determined ω(t) in the unit sphere of T_pM, and r(t) ∈ (0, a]. If we pull the metric tensor of M back to B_a, we have

(6.1.77) ∥σ'(t)∥² = r'(t)² + r(t)² ∥ω'(t)∥²,

by the Gauss lemma. Hence

(6.1.78) b = L(σ) = ∫_0^1 ∥σ'(t)∥ dt = (1/b) ∫_0^1 ∥σ'(t)∥² dt ≥ (1/b) ∫_0^1 r'(t)² dt.

Cauchy's inequality yields

(6.1.79) ∫_0^1 |r'(t)| dt ≤ ( ∫_0^1 r'(t)² dt )^{1/2},

so the last quantity in (6.1.78) is ≥ a²/b. This implies b ≥ a, with equality only if ∥ω'(t)∥ = 0 for all t. The corollary is proved. □

The following is a useful converse. Proposition 6.1.8. Let γ : [0, 1] → M be a constant speed Lipschitz curve from p to q that is absolutely length minimizing. Then γ is a smooth curve satisfying the geodesic equation.

Proof. We make use of the following fact, which will be established below. Namely, there exists a > 0 such that, for each point x ∈ γ, Exp_x : B_a → M is a diffeomorphism of B_a onto its image (and a is independent of x ∈ γ).

So choose t0 ∈ [0, 1] and consider x0 = γ(t0). The hypothesis implies that γ must be a length-minimizing curve from x0 to γ(t), for all t ∈ [0, 1]. By Corollary 6.1.7, γ(t) coincides with a geodesic for t ∈ [t0, t0 + α] and for t ∈ [t0 − β, t0], where t0 + α = min(t0 + a, 1) and t0 − β = max(t0 − a, 0). We need only show that, if t0 ∈ (0, 1), these two geodesic segments fit together smoothly, i.e., that γ is smooth in a neighborhood of t0.

To see this, pick ε > 0 such that ε < min(t0, a), and consider t1 = t0 − ε. The same argument as above applied to this case shows that γ coincides with a smooth geodesic on a segment including t0 in its interior, so we are done. 

The asserted lower bound on a follows from compactness plus the following observation. Given p ∈ M, there is a neighborhood O of (p, 0) in TM on which

(6.1.80) E : O → M, E(x, v) = Exp_x(v) (v ∈ T_xM)

is defined. Let us set

(6.1.81) F(x, v) = (x, Exp_x(v)), F : O → M × M.

We readily compute that

(6.1.82) DF(p, 0) = [ I 0 ; I I ],

as a map on T_pM ⊕ T_pM, where we use Exp_p to identify T_{(p,0)}(TM) ≈ T_pM ⊕ T_pM ≈ T_{(p,p)}(M × M). Hence the inverse function theorem implies that F is a diffeomorphism from a neighborhood of (p, 0) in TM onto a neighborhood of (p, p) in M × M.

Let us remark that, though a geodesic is locally length minimizing, it need not be globally length minimizing. Easy examples arise when M is the standard unit sphere in Euclidean space, for which the great circles are the geodesics.

Exercises

1. Let u : R^k → R be smooth, and assume γ : (a, b) → R^k is a smooth solution to (6.1.37). Note that, in general,

(d/dt) u(γ(t)) = ⟨γ'(t), ∇u(γ(t))⟩.

Show that (6.1.37) implies

(6.1.83) (d/dt) ⟨γ'(t), ∇u(γ(t))⟩ ≡ 0.

Deduce that, if t_0 ∈ (a, b),

γ'(t_0) ⊥ ∇u(γ(t_0)) ⟹ (d/dt) u(γ(t)) ≡ 0.

2. In the setting of Exercise 1, note that

(d/dt) ⟨γ'(t), γ'(t)⟩ = 2 ⟨γ''(t), γ'(t)⟩.

Deduce that, if (6.1.37) holds, then

γ'(t_0) ⊥ ∇u(γ(t_0)) ⟹ ∥γ'(t)∥ ≡ ∥γ'(t_0)∥.

Hint. Use (6.1.83) again.

3. Suppose M is a smooth surface of dimension k − 1 in R^k. Let N : M → R^k be a smooth unit normal field to M. Show that an alternative form of the geodesic equation (6.1.37) is

γ''(t) = − ⟨γ'(t), (d/dt) N(γ(t))⟩ N(γ(t)).

4. We say a 2D Riemannian manifold has a Clairaut parametrization if there are coordinates (u, v) in which the metric tensor takes the form

G(u, v) = [ G_1(u) 0 ; 0 G_2(u) ],

with no v dependence. Compute Γ^ℓ_{jk} in this case, and show that the geodesic equations (6.1.46) become

ü + (G'_1 / 2G_1) u̇² − (G'_2 / 2G_1) v̇² = 0,
v̈ + (G'_2 / G_2) u̇ v̇ = 0.

Note that the second equation is equivalent to

(d/dt) (G_2(u) v̇) = 0, hence G_2(u) v̇ = a.

Use this and the constant speed condition

G_1(u) u̇² + G_2(u) v̇² = c²

to get a first-order ODE for u, to which you can apply separation of variables.
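A numerical check of Exercise 4 (an illustrative sketch, not part of the exercise): for the round sphere in the Clairaut parametrization G_1(u) = 1, G_2(u) = sin²u (u the polar angle, v the longitude), one can integrate the geodesic equations and verify that the Clairaut quantity G_2(u) v̇ and the speed are both conserved.

```python
import numpy as np

# Geodesics on the round sphere in a Clairaut parametrization:
# G1(u) = 1, G2(u) = sin^2(u).  The geodesic equations of Exercise 4 become
#   u'' = sin(u) cos(u) v'^2,        v'' = -2 cot(u) u' v'.
# Conserved along solutions: G2(u) v' (Clairaut) and G1 u'^2 + G2 v'^2.

def rhs(z):
    u, v, du, dv = z
    return np.array([du, dv,
                     np.sin(u)*np.cos(u)*dv**2,
                     -2.0*np.cos(u)/np.sin(u)*du*dv])

z = np.array([np.pi/3, 0.0, 0.2, 1.0])      # initial (u, v, u', v')
clairaut = lambda z: np.sin(z[0])**2 * z[3]             # G2(u) v'
speed2   = lambda z: z[2]**2 + np.sin(z[0])**2 * z[3]**2  # G1 u'^2 + G2 v'^2
c0, s0 = clairaut(z), speed2(z)

h = 1e-3
for _ in range(3000):                       # RK4
    k1 = rhs(z); k2 = rhs(z + 0.5*h*k1)
    k3 = rhs(z + 0.5*h*k2); k4 = rhs(z + h*k3)
    z += h*(k1 + 2*k2 + 2*k3 + k4)/6

print(abs(clairaut(z) - c0), abs(speed2(z) - s0))   # both ~0
```

The initial data keep u away from the poles, where the cot(u) term is singular, so the integration is well behaved.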

5. Show that

∂g_{jk}/∂x^ℓ = g_{ak} Γ^a_{jℓ} + g_{aj} Γ^a_{kℓ}.

Hint. Start with

D_ℓ ⟨D_j, D_k⟩ = ⟨∇_{D_ℓ} D_j, D_k⟩ + ⟨D_j, ∇_{D_ℓ} D_k⟩,

and plug in (6.1.69).

6. Consider the exponential map Exp_p : U → M defined in (6.1.74)–(6.1.75). Show that, in this coordinate system,

Γ^a_{bk}(p) = 0.

Hint. Since the line through the origin in any direction aD_j + bD_k is a geodesic, we have

∇_{aD_j + bD_k} (aD_j + bD_k) = 0, at p, ∀ a, b ∈ R, j, k.

7. In the setting of Exercise 6, deduce that, at the center p of an exponential coordinate system,

(∂g_{jk}/∂x^ℓ)(p) = 0, ∀ j, k, ℓ.

8. Let M be a compact, connected Riemannian manifold, and take p, q ∈ M, p ≠ q. Let F denote the set of smooth curves σ : [0, 1] → M satisfying

σ(0) = p, σ(1) = q, ∥σ'(t)∥ independent of t.

Let d(p, q) = inf{L(σ) : σ ∈ F}, and take σ_ν ∈ F such that L(σ_ν) → d(p, q). Use the Arzela-Ascoli theorem to show that there is a uniformly convergent subsequence

σ_{ν_k} → γ : [0, 1] → M, γ(0) = p, γ(1) = q,

such that γ is length minimizing, among such curves. Use Proposition 6.1.8 to establish that γ is a geodesic from p to q.

6.2. Geometry of surfaces II: curvature

The curvature of a surface (or, in the 1D case, a curve) is a measure of its not being flat. To take the simplest case, let γ : (a, b) → R^k be smooth, with non-vanishing velocity. We can parametrize γ by arclength, so

(6.2.1) T(t) = γ'(t), ∥T(t)∥ ≡ 1.

Then γ is a straight line if and only if T(t) is constant. Thus a measure of how γ curves is given by

(6.2.2) T'(t).

We call this the curvature vector of γ. Note that (6.2.3) T · T ≡ 1 =⇒ T ′(t) · T (t) = 0, so T ′(t) is orthogonal to T (t). In connection with this, we can interpret Proposition 6.1.1 as follows. Proposition 6.2.1. Let M ⊂ Rk be a smooth surface, γ :(a, b) → M a smooth curve, parametrized by arc length. Then γ is a geodesic on M if and only if the curvature vector of γ is normal to M at each point of γ(t).

Let us now specialize to planar curves, so γ : (a, b) → R². In such a case, we apply counterclockwise rotation by 90° to T(t) to get a unit normal to γ:

(6.2.4) N(t) = J T(t), J = [ 0 −1 ; 1 0 ].

In such a case, (6.2.3) implies that T'(t) is parallel to N(t), say

(6.2.5) T'(t) = κ(t) N(t),

and we call κ(t) the curvature of γ. Note that, by (6.2.4),

(6.2.6) N'(t) = κ(t) J N(t) = κ(t) J² T(t) = −κ(t) T(t).
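The definitions (6.2.4)–(6.2.5) are easy to test numerically (an illustrative sketch, not from the text): for a circle of radius r parametrized by arclength, γ(t) = (r cos(t/r), r sin(t/r)), the curvature should come out κ = 1/r.

```python
import numpy as np

# Planar curvature via (6.2.4)-(6.2.5), for a circle of radius r
# parametrized by arclength.  T and T' are approximated by central
# differences; the curvature kappa = T' . N should equal 1/r.

r, t, h = 2.0, 0.7, 1e-4
gamma = lambda t: np.array([r*np.cos(t/r), r*np.sin(t/r)])

T  = (gamma(t + h) - gamma(t - h)) / (2*h)                 # unit tangent
Tp = (gamma(t + h) - 2*gamma(t) + gamma(t - h)) / h**2     # T'(t) = gamma''
J  = np.array([[0.0, -1.0], [1.0, 0.0]])                   # rotation by +90 deg
N  = J @ T                                                 # unit normal (6.2.4)

kappa = Tp @ N                                             # T' = kappa N
print(kappa)        # ~ 1/r = 0.5
```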

We move on to the case that M ⊂ R^{n+1} is a smooth, codimension-one surface, with smooth unit normal N. See Figure 6.2.1. Then

(6.2.7) N : M → S^n ⊂ R^{n+1}

is called the Gauss map. In this case M is flat if and only if N is constant, so we measure its curvature by

(6.2.8) DN(x) : T_xM → T_{N(x)}S^n = T_xM.

In particular, we have a well defined real-valued function

(6.2.9) K(x) = det ( DN(x)|_{T_xM} ),

called the Gauss curvature of M at x. In case n = 1, (6.2.6) yields K(γ(t)) = −κ(t). Of course, the map (6.2.8) contains further curvature information when n > 1. We return to this below.

For a general n-dimensional surface M ⊂ R^k, a natural object with which to measure how M curves in R^k is the map P, introduced in (6.1.16), i.e.,

(6.2.10) P : M → M(k, R), P(x) = orthogonal projection of R^k onto T_xM.

Figure 6.2.1. The unit normal to M at x

We see that M is flat if and only if P is constant, so a natural measure of curvature is

(6.2.11) DP (x): TxM −→ M(k, R). We use the notation

(6.2.12) D_X P : M → M(k, R),

for a vector field X on M; D_X P(x) = DP(x) X(x). Since we are differentiating a smooth family of symmetric k × k matrices, it is clear that

(6.2.13) D_X P(x) = D_X P(x)^t in M(k, R),

for each x ∈ M. The following is another useful fact.

DX P (x): TxM −→ νxM, and (6.2.15) DX P (x): νxM −→ TxM. 6.2. Geometry of surfaces II: curvature 277

Proof. We start with the projection identity, P = PP , and apply DX , using the product rule, to obtain

(6.2.16) D_X P = (D_X P) P + P (D_X P),

which in turn yields

(6.2.17) (D_X P) P = P^⊥ (D_X P), (D_X P) P^⊥ = P (D_X P),

where P^⊥ = I − P. This gives (6.2.15). □

To proceed, we bring in an object called the second fundamental form of M ⊂ R^k, defined for vector fields X and Y on M by

(6.2.18) II(X, Y) = D_X Y − ∇^M_X Y,

where ∇^M is the intrinsic covariant derivative on M, given by Proposition 6.1.4. Clearly, for f ∈ C^∞(M),

(6.2.19) II(fX, Y) = f II(X, Y).

Furthermore, if g ∈ C^∞(M),

(6.2.20) II(X, gY) = D_X(gY) − ∇^M_X(gY) = g D_X Y + (Xg) Y − g ∇^M_X Y − (Xg) Y = g II(X, Y).

We also note that

(6.2.21) II(X, Y) = P^⊥ D_X Y,

as a consequence of Proposition 6.1.5. In particular,

(6.2.22) II(X, Y)(x) ∈ ν_xM, ∀ x ∈ M.

A smooth function ξ : M → R^k with the property that ξ(x) ∈ ν_xM for all x is called a normal field.

The following important identity connects DX P to the second funda- mental form. Proposition 6.2.3. If X and Y are smooth vector fields on M,

(6.2.23) (DX P )Y = II(X,Y ).

Proof. We have

(6.2.24) D_X Y = D_X(PY) = (D_X P) Y + P(D_X Y) = (D_X P) Y + ∇^M_X Y,

hence

(6.2.25) (D_X P) Y = D_X Y − ∇^M_X Y = II(X, Y). □



From this, (6.2.13), and (6.2.15), we have:

Corollary 6.2.4. If ξ is a smooth normal field on M, then (DX P )ξ is a vector field on M, uniquely specified by the identity

(6.2.26) ⟨(DX P )ξ, Y ⟩ = ⟨ξ, II(X,Y )⟩, for all vector fields Y on M.

This last identity motivates us to bring in an object called the Weingarten map, defined as follows. If ξ is a normal field on M, we define

(6.2.27) A_ξ(x) : T_xM → T_xM

by

(6.2.28) ⟨A_ξ X, Y⟩ = ⟨ξ, II(X, Y)⟩.

Let us note that

(6.2.29) II(X, Y) − II(Y, X) = (D_X Y − D_Y X) − (∇^M_X Y − ∇^M_Y X) = [X, Y] − [X, Y] = 0,

so

(6.2.30) II(X, Y) = II(Y, X),

and hence

(6.2.31) A_ξ = A_ξ^t on T_xM.

From (6.2.26) we have

(6.2.32) (D_X P) ξ = A_ξ X.

Using

(6.2.33) 0 = D_X(Pξ) = (D_X P) ξ + P D_X ξ,

we are led from (6.2.32) to the following result, known as the Weingarten formula.

Proposition 6.2.5. If ξ is a normal field and X is a vector field on M, then

(6.2.34) P D_X ξ = −A_ξ X.

Second proof. We know PDX ξ is tangent to M. Given a vector field Y ,

(6.2.35) ⟨P D_X ξ, Y⟩ = ⟨D_X ξ, Y⟩ = D_X ⟨ξ, Y⟩ − ⟨ξ, D_X Y⟩ = −⟨ξ, D_X Y − ∇^M_X Y⟩ = −⟨ξ, II(X, Y)⟩,

where we have used ⟨ξ, Y⟩ ≡ 0 and ⟨ξ, ∇^M_X Y⟩ ≡ 0. The last identity in (6.2.35) yields (6.2.34). □

In order to state a more definitive result, we can define a covariant derivative ∇^ν on normal fields on M by

(6.2.36) ∇^ν_X ξ = P^⊥ D_X ξ,

when X is a vector field on M and ξ is a normal field. One readily verifies analogues of (6.1.58)–(6.1.59):

(6.2.37) ∇^ν_{fX} ξ = f ∇^ν_X ξ, ∇^ν_X(fξ) = f ∇^ν_X ξ + (Xf) ξ.

Furthermore, parallel to (6.1.60), if η is also a normal field,

(6.2.38) X⟨ξ, η⟩ = ⟨∇^ν_X ξ, η⟩ + ⟨ξ, ∇^ν_X η⟩,

thanks to the identity X⟨ξ, η⟩ = ⟨D_X ξ, η⟩ + ⟨ξ, D_X η⟩. Using (6.2.36), we deduce the following extension of Proposition 6.2.5.

Proposition 6.2.6. In the setting of Proposition 6.2.5,

(6.2.39) D_X ξ = ∇^ν_X ξ − A_ξ X.

The right side splits DX ξ into the sum of a normal field on M and a vector field tangent to M.

In case k = n + 1, so M has codimension 1, with smooth unit normal N, we can combine (6.2.34) with (6.2.8), to deduce the following classical Weingarten formula: Corollary 6.2.7. If M has codimension 1 in Rn+1, with smooth unit normal field N, and X is a vector field on M, then

(6.2.40) D_X N = −A_N X.

Consequently, (6.2.9) yields the formula

(6.2.41) K(x) = (−1)^n det A_N(x),

for the Gauss curvature of M at x.

In case M has codimension 1 in R^k and we have in hand the smooth unit normal N, it is natural to define a real-valued form II~ by

(6.2.42) II(X, Y) = II~(X, Y) N.

This is the classical case of the second fundamental form. It is useful to note the following characterization of II~ when M has codimension 1 in R^k. Translating and rotating coordinates, we can move a specific point p ∈ M to the origin in R^k, and suppose M is given locally by

(6.2.43) x_k = f(x'), ∇f(0) = 0,

where x' = (x_1, …, x_{k−1}). We can then identify the tangent space of M at p with R^{k−1}.

Proposition 6.2.8. In the setting described in the previous paragraph, the second fundamental form of M at p is given by

(6.2.44) II~(X, Y) = Σ_{j,ℓ=1}^{k−1} (∂²f/∂x_j∂x_ℓ)(0) X_j Y_ℓ.

Proof. From (6.2.34) we have, for any ξ normal to M,

(6.2.45) ⟨II(X, Y), ξ⟩ = −⟨D_X ξ, Y⟩.

Taking

(6.2.46) ξ = ( −∂f/∂x_1, …, −∂f/∂x_{k−1}, 1 )

gives the desired formula. □

If M is a surface in R³, given locally by x_3 = f(x_1, x_2) with f(0) = 0 and ∇f(0) = 0, then the Gauss curvature of M at the origin is seen by (6.2.44) to equal

(6.2.47) det ( ∂²f(0)/∂x_j∂x_ℓ ).

Besides providing a good conception of the second fundamental form of a codimension 1 surface in R^k, Proposition 6.2.8 leads to useful formulas for computation, one of which we will give in (6.2.53). First, we give a more invariant reformulation of Proposition 6.2.8. Suppose the (k−1)-dimensional surface M in R^k is given by

(6.2.48) u(x) = c, with ∇u ≠ 0 on M.

Then we can use the computation (6.2.45) with ξ = ∇u to obtain

(6.2.49) ⟨II(X, Y), ∇u⟩ = −Y · (D²u) X,

where D²u is the k × k matrix of second-order partial derivatives of u. In other words,

(6.2.50) II~(X, Y) = −∥∇u∥^{−1} Y · (D²u) X,

for X and Y tangent to M. In particular, if M is a two-dimensional surface in R³, given by (6.2.48), then the Gauss curvature at p ∈ M is given by

(6.2.51) K(p) = ∥∇u(p)∥^{−2} det ( D²u|_{T_pM} ),

where D²u|_{T_pM} denotes the restriction of the quadratic form D²u to the tangent space T_pM. With this calculation we can derive the following formula, extending (6.2.47).

Proposition 6.2.9. If M ⊂ R³ is given by

(6.2.52) x_3 = f(x_1, x_2),

then, at p = (x', f(x')) ∈ M, the Gauss curvature is given by

(6.2.53) K(p) = (1 + ∥∇f(x')∥²)^{−2} det ( ∂²f/∂x_j∂x_ℓ ).

Proof. We can apply (6.2.51) with u(x) = f(x_1, x_2) − x_3. Note that ∥∇u∥² = 1 + ∥∇f(x')∥² and

(6.2.54) D²u = [ D²f 0 ; 0 0 ].

Noting that a basis of T_pM is given by v_1 = (1, 0, ∂_1f), v_2 = (0, 1, ∂_2f), we obtain

(6.2.55) det ( D²u|_{T_pM} ) = det( v_j · (D²u) v_k ) / det( v_j · v_k ) = (1 + ∥∇f(x')∥²)^{−1} det D²f,

which yields (6.2.53). □
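Formula (6.2.53) lends itself to a direct numerical check (an illustrative sketch, not from the text): for the upper unit hemisphere, f(x_1, x_2) = √(1 − x_1² − x_2²), the Gauss curvature should equal 1 at every point of the graph.

```python
import numpy as np

# Gauss curvature of a graph x3 = f(x1, x2) via (6.2.53),
#   K = (1 + ||grad f||^2)^{-2} det D^2 f,
# with grad f and the Hessian approximated by central differences.
# For the upper unit hemisphere, K should be 1 everywhere.

f = lambda x: np.sqrt(1.0 - x[0]**2 - x[1]**2)

def gauss_curvature(f, x, h=1e-4):
    e = np.eye(2)
    grad = np.array([(f(x + h*e[j]) - f(x - h*e[j]))/(2*h) for j in range(2)])
    hess = np.array([[(f(x + h*e[j] + h*e[l]) - f(x + h*e[j] - h*e[l])
                       - f(x - h*e[j] + h*e[l]) + f(x - h*e[j] - h*e[l]))
                      / (4*h*h) for l in range(2)] for j in range(2)])
    return np.linalg.det(hess) / (1.0 + grad @ grad)**2   # formula (6.2.53)

print(gauss_curvature(f, np.array([0.3, 0.2])))   # ~ 1.0
```

The same helper applied to, say, a paraboloid or a saddle reproduces the sign behavior of K one expects from (6.2.47).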

Intrinsic curvature

So far, we have considered how M is curved in R^k. That is, we have examined extrinsic measures of curvature of M. We now look at intrinsic measures of curvature of M. In this setting, M can be any n-dimensional Riemannian manifold, not necessarily a surface in R^k. One distinguishing property of R^n with the standard flat metric (δ_{jk}) and associated covariant derivative D is that, for any two vector fields X and Y on R^n,

(6.2.56) D_X D_Y − D_Y D_X = D_{[X,Y]}.

With this in mind, if M is an n-dimensional Riemannian manifold, with Levi-Civita covariant derivative ∇, we set

(6.2.57) R(X, Y) Z = ∇_X ∇_Y Z − ∇_Y ∇_X Z − ∇_{[X,Y]} Z.

We call R the Riemann curvature of M. It is clearly additive in each of X, Y, and Z. The following property implies it is linear in each variable, over C^∞(M).

Proposition 6.2.10. Given f, g, h ∈ C^∞(M),

(6.2.58) R(fX, gY) hZ = fgh R(X, Y) Z.

Proof. It suffices to treat f, g and h separately. To start,

(6.2.59) R(fX, Y) Z = ∇_{fX} ∇_Y Z − ∇_Y ∇_{fX} Z − ∇_{[fX,Y]} Z.

Use the identity [fX, Y] = f[X, Y] − (Yf) X to write this as

(6.2.60) f ∇_X ∇_Y Z − ∇_Y (f ∇_X Z) − f ∇_{[X,Y]} Z + (Yf) ∇_X Z,

and then write the second term in (6.2.60) as

(6.2.61) −f ∇_Y ∇_X Z − (Yf) ∇_X Z,

to get (6.2.58) when g = h = 1. The other cases are similar. □

It follows that R(X, Y) Z is determined by its behavior on the coordinate vector fields D_j = ∂/∂x^j. We define the components R^a_{bjk} of the Riemann curvature (in a coordinate system) by

(using the summation convention). Since [Dj,Dk] = 0, ∇ ∇ − ∇ ∇ (6.2.63) R(Dj,Dk)Db = Dj Dk Db Dk Dj Db. Recall from (6.1.69) that ∇ a (6.2.64) Dk Db = Γ bkDa, a where Γ bk are the Christoffel symbols, given by (6.1.70). Plugging this into (6.2.63) and using the derivation property (6.1.59) readily gives the formula

(6.2.65) R^a_{bjk} = (∂/∂x^j) Γ^a_{bk} − (∂/∂x^k) Γ^a_{bj} + Γ^a_{cj} Γ^c_{bk} − Γ^a_{ck} Γ^c_{bj}.

These formulas can be written in shorter form, as follows. For each j and k in {1, …, n}, we define n × n matrices

(6.2.66) Γ_j = (Γ^a_{bj}), R_{jk} = (R^a_{bjk}).

Then (6.2.65) is equivalent to

(6.2.67) R_{jk} = (∂/∂x^j) Γ_k − (∂/∂x^k) Γ_j + [Γ_j, Γ_k],

where [Γ_j, Γ_k] = Γ_j Γ_k − Γ_k Γ_j is the matrix commutator. Note that R_{jk} is antisymmetric in j and k. Now we can define a "connection 1-form" Γ and a "curvature 2-form" Ω by

(6.2.68) Γ = Σ_j Γ_j dx_j, Ω = (1/2) Σ_{j,k} R_{jk} dx_j ∧ dx_k,

and the formula (6.2.67) is equivalent to

(6.2.69) Ω = dΓ + Γ ∧ Γ.

We next mention a couple of basic symmetries of the Riemann curvature.

Proposition 6.2.11. The Riemann curvature satisfies

(6.2.70) R(X, Y) Z = −R(Y, X) Z,

and

(6.2.71) ⟨R(X, Y) Z, W⟩ = −⟨Z, R(X, Y) W⟩.

Proof. The identity (6.2.70) is immediate from the definition (6.2.57). Next, the metric property (6.1.60) of ∇ yields 0 = (XY − YX − [X,Y ])⟨Z,W ⟩ (6.2.72) = ⟨R(X,Y )Z,W ⟩ + ⟨Z,R(X,Y )W ⟩, which gives (6.2.71). 

We set

(6.2.73) R_{abjk} = g_{ac} R^c_{bjk} = ⟨D_a, R(D_j, D_k) D_b⟩.

Then the identities (6.2.70)–(6.2.71) become

(6.2.74) R_{abjk} = −R_{abkj} = −R_{bajk}.

Having defined the Riemann curvature of a general n-dimensional Riemannian manifold and developed some of its basic properties, we turn back to the case of a smooth n-dimensional surface M ⊂ R^k, with its induced metric, and relate the Riemann curvature to the second fundamental form. We again make use of Proposition 6.1.5, which says the Levi-Civita covariant derivative ∇ on M is given by

(6.2.75) ∇_X Y = P D_X Y.

Consequently, (6.2.57) implies

(6.2.76) R(X, Y) Z = P D_X (P D_Y Z) − P D_Y (P D_X Z) − P D_{[X,Y]} Z = P D_X D_Y Z − P D_Y D_X Z − P D_{[X,Y]} Z + P (D_X P)(D_Y Z) − P (D_Y P)(D_X Z).

Now D_X D_Y Z − D_Y D_X Z − D_{[X,Y]} Z = 0, so we are left with the last two terms. Furthermore,

(6.2.77) P (D_X P)(D_Y Z) = (D_X P) P^⊥ (D_Y Z) = (D_X P) II(Y, Z) = A_{II(Y,Z)} X,

the first identity by (6.2.17), the second by (6.2.21), and the third by (6.2.32). This yields the following result.

Proposition 6.2.12. The Riemann curvature of a surface M ⊂ Rk satisfies

(6.2.78) R(X, Y) Z = A_{II(Y,Z)} X − A_{II(X,Z)} Y,

if X, Y, and Z are vector fields on M. Equivalently, if also W is a vector field on M,

(6.2.79) ⟨R(X, Y) Z, W⟩ = ⟨II(Y, Z), II(X, W)⟩ − ⟨II(X, Z), II(Y, W)⟩.

Corollary 6.2.13. Assume k = n + 1, so M is a codimension 1 surface. Then

(6.2.80) ⟨R(X, Y) Z, W⟩ = det [ II~(X, W) II~(X, Z) ; II~(Y, W) II~(Y, Z) ].

In the setting of Corollary 6.2.13, we have

(6.2.81) ⟨R(X, Y) Y, X⟩ = det [ II~(X, X) II~(X, Y) ; II~(Y, X) II~(Y, Y) ].

Also, in this setting,

(6.2.82) II~(X, Y) = ⟨A_N X, Y⟩.

We therefore have the following, known as Gauss' Theorema Egregium.

Proposition 6.2.14. Assume M ⊂ R3 is a 2D surface. Take p ∈ M, and let U and V be vector fields on M such that {U(p),V (p)} forms an orthonormal basis of TpM. Then the Gauss curvature of M at p is given by (6.2.83) K(p) = ⟨R(U, V )V,U⟩(p).

Proof. By (6.2.41), K(p) = det A_N(p), but

(6.2.84) det A_N(p) = det [ ⟨A_N U, U⟩ ⟨A_N U, V⟩ ; ⟨A_N V, U⟩ ⟨A_N V, V⟩ ],

and the result then follows from (6.2.81)–(6.2.82). □

It follows that the Gauss curvature is an intrinsic quantity on a 2D surface. In fact, if M is a 2D Riemannian manifold, not necessarily a surface in R³, one can define its Gauss curvature at p ∈ M by (6.2.83). One can readily show that the right side of (6.2.83) is independent of the choice of orthonormal basis of T_pM. Going further, if we take two vectors X, Y ∈ T_pM and expand them in terms of an orthonormal basis {U, V}, the following result is a straightforward consequence of Propositions 6.2.10–6.2.11 and (6.2.83).

Proposition 6.2.15. If X and Y are vector fields on a 2D Riemannian manifold M, with Gauss curvature K(x), then

(6.2.85) ⟨R(X, Y) Y, X⟩ = K(x) det [ ⟨X, X⟩ ⟨X, Y⟩ ; ⟨X, Y⟩ ⟨Y, Y⟩ ].

If we take a coordinate chart on M and set X = D1,Y = D2, the coordinate vector fields, then the left side of (6.2.85) becomes

(6.2.86) ⟨R(D1,D2)D2,D1⟩ = R1212,

as defined in (6.2.73), and the matrix on the right is G(x), the matrix defining the metric tensor. Then (6.2.85) yields the identity

(6.2.87) K(x) = R1212(x)/g(x),  g(x) = det G(x),

when dim M = 2. Here is an explicit calculation of the Gauss curvature for an important class of 2D manifolds.

Proposition 6.2.16. Let M be a 2D Riemannian manifold. Suppose that one has a coordinate chart in which the metric tensor takes the diagonal form

(6.2.88) G(x) = ( G1(x)   0
                   0      G2(x) ),  g(x) = G1(x)G2(x).

Then the Gauss curvature is given by

(6.2.89) K(x) = −(1/(2√g)) [ ∂1( ∂1G2/√g ) + ∂2( ∂2G1/√g ) ].

Proof. One can first compute that

         Γ1 = (Γ^a_{b1}) = (1/2) ( ∂1G1/G1    ∂2G1/G1
                                   −∂2G1/G2   ∂1G2/G2 ),
(6.2.90)
         Γ2 = (Γ^a_{b2}) = (1/2) ( ∂2G1/G1   −∂1G2/G1
                                    ∂1G2/G2   ∂2G2/G2 ).

Then, computing R12 = (R^a_{b12}) = ∂1Γ2 − ∂2Γ1 + Γ1Γ2 − Γ2Γ1, we have

(6.2.91) R^1_{212} = −(1/2)[ ∂1( ∂1G2/G1 ) + ∂2( ∂2G1/G1 ) ]
                   + (1/4)[ −(∂1G1/G1)(∂1G2/G1) + (∂2G1/G1)(∂2G2/G2) ]
                   − (1/4)[ (∂2G1/G1)(∂2G1/G1) − (∂1G2/G1)(∂1G2/G2) ].

Now R1212 = G1 R^1_{212} in this case, and (6.2.87) yields

(6.2.92) K(x) = (1/(G1G2)) R1212 = (1/G2) R^1_{212}.

If we divide (6.2.91) by G2, then in the resulting formula for K(x) interchange G1 and G2, and ∂1 and ∂2, and sum the two formulas for K(x), we get

(6.2.93) K(x) = −(1/4)[ (1/G2) ∂1( ∂1G2/G1 ) + (1/G1) ∂1( ∂1G2/G2 ) ]
              − (1/4)[ (1/G1) ∂2( ∂2G1/G2 ) + (1/G2) ∂2( ∂2G1/G1 ) ],

which is readily transformed into (6.2.89). □

Coordinates in which the metric tensor takes the diagonal form (6.2.88) are called orthogonal coordinates. If in addition we have G1 = G2, they are called isothermal coordinates. In such a case, (6.2.89) specializes to the following neat formula.

Corollary 6.2.17. Suppose dim M = 2, and one has an isothermal coordinate system, in which

(6.2.94) gjk(x) = e^{2v(x)} δjk,

for a smooth v. Then the Gauss curvature is given by

(6.2.95) K(x) = −e^{−2v} ( ∂²v/∂x1² + ∂²v/∂x2² ).

A significant example of Corollary 6.2.17 is provided by the Poincaré disk,

(6.2.96) D = {x ∈ R2 : ∥x∥ < 1},  gjk(x) = 4(1 − ∥x∥²)^{−2} δjk.

Application of (6.2.95) yields

(6.2.97) K ≡ −1,

for this metric. A related example is the Poincaré upper half plane,

(6.2.98) U = {x ∈ R2 : x2 > 0},  gjk(x) = x2^{−2} δjk,

again yielding (6.2.97).

It is clear that one cannot obtain the Gauss curvature of an n-dimensional M ⊂ Rn+1 from its Riemann curvature R when n = 1. In fact, if n = 1, (6.2.70) implies R ≡ 0, while the Gauss curvature is given by (6.2.6). More generally, one cannot obtain K(x) from R when n is odd. On the other hand, we can obtain K(x) from R when the dimension of M is even. We discuss how this works.

Assume M is a smooth surface of dimension n = 2m in Rn+1. Take p ∈ M, and let {e1, . . . , en} be an orthonormal basis of TpM. For short, set A = AN. By (6.2.70),

(6.2.99) ⟨R(ej,ek)eb, ea⟩ = det ( ⟨Aej, ea⟩  ⟨Aej, eb⟩
                                  ⟨Aek, ea⟩  ⟨Aek, eb⟩ ).

It follows that

(6.2.100) Aej ∧ Aek = (1/2) Σ_{b1,b2} ⟨R(ej,ek)eb2, eb1⟩ eb1 ∧ eb2.

Now

(6.2.101) (det A) e1 ∧ ⋯ ∧ en = (Ae1 ∧ Ae2) ∧ ⋯ ∧ (Aen−1 ∧ Aen)
   = 2^{−m} Σ_b ⟨R(e1,e2)eb2, eb1⟩ ⋯ ⟨R(en−1,en)ebn, ebn−1⟩ eb1 ∧ eb2 ∧ ⋯ ∧ ebn−1 ∧ ebn,

where the sum runs over b ∈ Sn, the set of permutations of {1, . . . , n}. Since eb1 ∧ ⋯ ∧ ebn = (sgn b) e1 ∧ ⋯ ∧ en, this leads to a formula for det A.
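The isothermal-coordinate formula (6.2.95) and the two Poincaré metrics (6.2.96), (6.2.98) are easy to test numerically. The following sketch (the function name and the finite-difference scheme are ours, not from the text) evaluates K at sample points; for both metrics the result should be −1 up to discretization error.

```python
import math

def gauss_curvature_isothermal(v, x1, x2, h=1e-4):
    """Evaluate K = -e^{-2v} (d^2 v/dx1^2 + d^2 v/dx2^2), formula (6.2.95),
    using a central finite-difference Laplacian; v is a callable of (x1, x2)."""
    lap = (v(x1 + h, x2) + v(x1 - h, x2) + v(x1, x2 + h) + v(x1, x2 - h)
           - 4.0 * v(x1, x2)) / h**2
    return -math.exp(-2.0 * v(x1, x2)) * lap

# Poincare disk (6.2.96): g_jk = 4 (1 - |x|^2)^{-2} delta_jk,
# so e^{2v} = 4/(1 - |x|^2)^2, i.e. v = log(2/(1 - |x|^2)).
def v_disk(x1, x2):
    return math.log(2.0 / (1.0 - x1**2 - x2**2))

# Poincare upper half plane (6.2.98): g_jk = x2^{-2} delta_jk, so v = -log(x2).
def v_half(x1, x2):
    return -math.log(x2)

K_disk = gauss_curvature_isothermal(v_disk, 0.3, -0.4)
K_half = gauss_curvature_isothermal(v_half, 1.7, 0.8)
```

Changing the sample points changes nothing: both curvatures are constant, in agreement with (6.2.97).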
Recalling (6.2.41), we obtain the following higher-dimensional extension of Proposition 6.2.14.

Proposition 6.2.18. Assume M ⊂ Rn+1 is a surface of dimension n = 2m. Pick p ∈ M, and let {e1, . . . , en} be an orthonormal basis of TpM. Then the Gauss curvature of M at p is given by

(6.2.102) K(p) = 2^{−n/2} Σ_{b∈Sn} (sgn b) ⟨R(e1,e2)eb2, eb1⟩ ⋯ ⟨R(en−1,en)ebn, ebn−1⟩.

A variant of the formula (6.2.102) is given by taking arbitrary permutations of {e1, . . . , en} and summing. This gives

(6.2.103) K(p) = (2^{−n/2}/n!) Σ_{a,b∈Sn} (sgn a)(sgn b) ⟨R(ea1,ea2)eb2, eb1⟩ ⋯ ⟨R(ean−1,ean)ebn, ebn−1⟩.

We can use this formula to define the Gauss curvature of any Riemannian manifold M of dimension n = 2m, regardless of whether it is a surface in Rn+1. One needs to check that the right side of (6.2.103) is independent of

the choice of an orthonormal basis of TpM. One way to get this involves the following objects. In addition to the curvature 2-form Ω in (6.2.68), one has the “curvature (2,2)-form”

(6.2.104) Ω̃ = (1/4) Σ_{a,b,j,k=1}^n Rabjk (dxa ∧ dxb) ⊗ (dxj ∧ dxk).

One can define a product on (k, ℓ)-forms by

(6.2.105) (α1 ⊗ β1) ∧ (α2 ⊗ β2) = (α1 ∧ α2) ⊗ (β1 ∧ β2),

given kj-forms αj and ℓj-forms βj. The product is a (k1 + k2, ℓ1 + ℓ2)-form. Now, if dim M = n = 2m, we form the (n, n)-form

(6.2.106) P(Ω̃) = Ω̃ ∧ ⋯ ∧ Ω̃ (m factors).

If we take p ∈ M and choose a coordinate system about p in which the coordinate vector fields D1, . . . , Dn are orthonormal, then comparison with the calculations (6.2.103)–(6.2.104) above gives, at p,

(6.2.107) P(Ω̃) = (n!/2^{n/2}) K(p) ωM ⊗ ωM,

where, at p, ωM = dx1 ∧ ⋯ ∧ dxn (in this coordinate system). Consequently, if we take ωM to be the volume form on M, then (6.2.106)–(6.2.107) can be taken as a definition of the Gauss curvature on a general Riemannian manifold of dimension n = 2m. Note that

(6.2.108) n = 2 ⟹ Ω̃ = R1212 (dx1 ∧ dx2) ⊗ (dx1 ∧ dx2), ωM = √g dx1 ∧ dx2, and P(Ω̃) = Ω̃,
          ⟹ P(Ω̃) = (1/g) R1212 ωM ⊗ ωM,

in which case (6.2.107) implies

(6.2.109) K = (1/g) R1212, for n = 2,

and we recover (6.2.87).

Exercises

1. Give another proof of Proposition 6.2.3, starting with

0 = DX(P⊥Y) = −(DX P)Y + P⊥ DX Y,

and using (6.2.21).

2. Discuss how (6.2.6) is a special case of the Weingarten identity (6.2.40).

3. We say a 2D Riemannian manifold has a Clairaut parametrization if there are coordinates (u, v) in which the metric tensor takes the form

G(u, v) = ( G1(u)   0
             0      G2(u) )

(with no v dependence). Show that the Gauss curvature is given by

K(u) = −(1/(2√g(u))) d/du ( G2′(u)/√g(u) ),  g = G1G2.

4. Let M ⊂ R3 be a surface of revolution, with coordinate chart

X(u, v) = ( g(u), h(u) cos v, h(u) sin v ),

obtained by taking the curve γ(u) = (g(u), h(u), 0) in the xy-plane and rotating it about the x-axis in R3. Show that in these coordinates the metric tensor is given by

G(u, v) = ( |γ′(u)|²   0
             0         h(u)² ).

Apply Exercise 3 to compute the Gauss curvature.
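The Clairaut formula of Exercise 3 can be tested on concrete surfaces of revolution. The sketch below (helper names are ours) evaluates K(u) = −(1/(2√g)) d/du(G2′/√g) by finite differences, for the unit sphere (γ(u) = (sin u, cos u, 0), so G1 = 1, G2 = cos²u) and for the catenoid (G1 = G2 = cosh²u), where the expected values are K = 1 and K = −1/cosh⁴u, respectively.

```python
import math

def clairaut_curvature(G1, G2, u, h=1e-5):
    """Numerically evaluate the Gauss curvature of Exercise 3,
    K(u) = -(1/(2*sqrt(g))) * d/du( G2'(u)/sqrt(g(u)) ),  g = G1*G2,
    using nested central finite differences. G1, G2 are callables of u."""
    def g(t):
        return G1(t) * G2(t)
    def G2prime_over_sqrtg(t):
        dG2 = (G2(t + h) - G2(t - h)) / (2.0 * h)
        return dG2 / math.sqrt(g(t))
    d = (G2prime_over_sqrtg(u + h) - G2prime_over_sqrtg(u - h)) / (2.0 * h)
    return -d / (2.0 * math.sqrt(g(u)))

# Unit sphere about the x-axis: gamma(u) = (sin u, cos u, 0), |gamma'|^2 = 1,
# h(u) = cos u, so G1 = 1, G2 = cos^2 u; K should be identically 1.
K_sphere = clairaut_curvature(lambda t: 1.0, lambda t: math.cos(t)**2, 0.3)

# Catenoid: gamma(u) = (u, cosh u, 0), so G1 = G2 = cosh^2 u; K = -sech^4 u.
K_cat = clairaut_curvature(lambda t: math.cosh(t)**2,
                           lambda t: math.cosh(t)**2, 0.5)
```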

5. Let γs : [a, b] → M be a smooth family of curves satisfying γs(a) ≡ p, γs(b) ≡ q. Assume γ0 has unit speed. Define

V(t) = ∂s γs(t)|_{s=0} ∈ Tγ0(t)M.

Recall the energy function E(γs). Show that

(d²/ds²) E(γs)|_{s=0} = ∫_a^b [ ⟨R(V,T)V,T⟩ + ⟨∇T V, ∇T V⟩ + ⟨∇V V, ∇T T⟩ ] dt.

Note that the last term in the integrand vanishes if γ0 is a geodesic. Show that, since V(a) = V(b) = 0, the middle term in the integrand can be replaced by −⟨V, ∇T∇T V⟩, so, if γ0 is a geodesic,

(6.2.110) (d²/ds²) E(γs)|_{s=0} = ∫_a^b [ −⟨R(V,T)T,V⟩ − ⟨V, ∇T∇T V⟩ ] dt.

6. Let M ⊂ Rk be a smooth n-dimensional surface, and let X, Y, and Z be smooth vector fields on M. Use (6.2.18) and (6.2.39) to verify that

DX DY Z = ∇^M_X ∇^M_Y Z + II(X, ∇^M_Y Z) − AII(Y,Z)X + ∇^ν_X II(Y,Z).

Get a similar expression for DY DX Z, and apply (6.2.18) to D[X,Y]Z. Use the fact that DX DY Z − DY DX Z − D[X,Y]Z = 0 to deduce that the Riemann

curvature of M satisfies

(6.2.111) R(X,Y)Z = { −II(X, ∇^M_Y Z) + II(Y, ∇^M_X Z) + II([X,Y],Z) − ∇^ν_X II(Y,Z) + ∇^ν_Y II(X,Z) }
                  + { AII(Y,Z)X − AII(X,Z)Y }.

The quantity in the first set of braces is normal to M, so it vanishes. This vanishing is called Codazzi’s equation. The identity of R(X,Y)Z with the quantity in the second set of braces is equivalent to the Gauss equation (6.2.78).

7. In the setting of Exercise 6, assume k = n + 1. Take the inner product of both sides of (6.2.111) with N and deduce that Codazzi’s equation is equivalent to

(6.2.112) ∇^M_X (AN Y) − ∇^M_Y (AN X) = AN([X,Y]).

Hint. Start with

⟨N, II(X, ∇^M_Y Z)⟩ = ⟨AN X, ∇^M_Y Z⟩ = Y⟨AN X, Z⟩ − ⟨∇^M_Y (AN X), Z⟩,

and then show that

⟨N, ∇^ν_Y II(X,Z)⟩ = Y⟨AN X, Z⟩.

8. Take M ⊂ Rk as in Exercise 7 (k = dim M + 1). We say a point p ∈ M is an umbilic if AN(p) = λ(p)I, for some λ(p) ∈ R. Assume that each p ∈ M is an umbilic and that M is connected. Show that λ must be constant.
Hint. Apply the Codazzi equation (6.2.112). Deduce that (Xλ)Y = (Y λ)X for arbitrary smooth vector fields X and Y on M, hence Xλ ≡ Y λ ≡ 0.

6.3. Geometry of surfaces III: the Gauss-Bonnet theorem

Here we establish results that make contact between material on curvature in §6.2 and material of §5.3, including material on the degree of the Gauss map, and material on the Euler characteristic, defined in (5.3.29). We will also bring in the notion of parallel transport.

To begin, assume M is a smooth, compact surface of dimension n in Rn+1, with smooth unit normal N : M → Sn. As seen in §5.3, the degree of this map satisfies

(6.3.1) Deg(N) = ∫_M N*ω,

for any n-form ω on Sn that integrates to 1. In particular, we can take

(6.3.2) ω = A_n^{−1} ωS,

where ωS is the volume form of Sn, and (cf. (3.2.32))

(6.3.3) A_n = ∫_{Sn} ωS = 2π^{(n+1)/2}/Γ((n+1)/2).

Recall that for each x ∈ M, DN(x) : TxM → TN(x)Sn = TxM. We leave the following result as an exercise.

Proposition 6.3.1. In the setting described above,

(6.3.4) N*ωS = (det DN) ωM,

where ωM is the volume form of M.

Recalling the characterization of Gauss curvature in (6.2.9), we deduce the following.

Proposition 6.3.2. If M ⊂ Rn+1 is a compact, n-dimensional surface, with smooth unit normal N : M → Sn, and associated Gauss curvature K, then

(6.3.5) Deg(N) = (1/A_n) ∫_M K(x) dS(x).

Putting this together with Corollary 5.3.17, we have the following.

Proposition 6.3.3. Take M as in Proposition 6.3.2, and assume dim M = n = 2m is even. Then

(6.3.6) Deg(N) = (1/2) χ(M),

where χ(M) is the Euler characteristic of M, defined by (5.3.29). Consequently,

(6.3.7) ∫_M K(x) dS(x) = (A_n/2) χ(M).

Proof. (Compare Exercise 11 of §5.3.) Corollary 5.3.17 applies directly to a neighborhood Ω of M whose boundary is essentially two copies of M, with normals N and −N. The conclusion (5.3.28)–(5.3.29) then becomes

(6.3.8) χ(M) = Deg(N) + Deg(−N) = (1 + (−1)^n) Deg(N),

which yields (6.3.6) when n is even. □

Specializing (6.3.7) to n = 2, we get

(6.3.9) ∫_M K(x) dS(x) = 2πχ(M),

when M ⊂ R3 is a compact, 2D surface. This is the classical Gauss-Bonnet formula, for surfaces without boundary. See Figure 6.3.1 for an example, involving a three-holed torus. As seen in §5.3, Exercise 19, χ(M) = −4 in this case, so

∫_M K dS = −8π.

The classical formula also treats 2D surfaces with boundary, and we will want to derive such a result below. We also want to extend (6.3.9) from compact 2D surfaces in R3 to arbitrary compact 2D Riemannian manifolds. One approach to such extensions involves the notion of parallel transport, which we now introduce.

Let M be an n-dimensional Riemannian manifold and γ : [0, τ] → M be a smooth (or piecewise smooth) curve. Say γ(0) = p, and take V0 ∈ TpM. We say a vector field V along γ is defined by parallel transport (or “parallel translation”) if

(6.3.10) ∇T V = 0,  T = γ′,

with V(t) ∈ Tγ(t)M, V(0) = V0. If γ is contained in a coordinate patch, in which

(6.3.11) γ(t) = (x1(t), . . . , xn(t)),

we can set

(6.3.12) V(t) = v^j(t)Dj,

Figure 6.3.1. Three-holed torus, χ(M) = −4, ∫_M K dS = −8π.

where Dj = ∂/∂xj. Here and below we use the summation convention. We obtain from (6.3.10) and (6.1.69) the following linear system of ODE for the components of V:

(6.3.13) dv^a/dt = −Γ^a_{bk} v^b dx_k/dt.

The following is a key result, relating curvature and parallel translation.

Proposition 6.3.4. Let γ : [0, τ] → M be a piecewise smooth closed loop on M, parametrized by arc length, with γ(τ) = γ(0). If V(t) is a vector field over γ defined by parallel translation, with components as in (6.3.12), then

(6.3.14) v^a(τ) − v^a(0) = −(1/2) Σ_{j,k,b} ( ∫_A R^a_{bjk} dxj ∧ dxk ) v^b(0) + O(τ³),

where A is an oriented 2-surface in M with ∂A = γ.

Proof. If we put a coordinate system on a neighborhood of p = γ(0) ∈ M, then parallel transport is defined by (6.3.13), so

(6.3.15) v^a(t) = v^a(0) − ∫_0^t Γ^a_{bk}(γ(s)) v^b(s) (dx_k/ds) ds.

We hence have

(6.3.16) v^a(t) = v^a(0) − Γ^a_{bk}(p) v^b(0)(x_k − p_k) + O(t²).

We can solve (6.3.13) up to O(t³) if we use

(6.3.17) Γ^a_{bj}(x) = Γ^a_{bj}(p) + (x_k − p_k)∂_kΓ^a_{bj} + O(|x − p|²).

Hence

(6.3.18) v^a(t) = v^a(0) − ∫_0^t [ Γ^a_{bk}(p) + (x_j − p_j)∂_jΓ^a_{bk}(p) ] · [ v^b(0) − Γ^b_{cℓ}(p) v^c(0)(x_ℓ − p_ℓ) ] (dx_k/ds) ds + O(t³).

If γ(τ) = γ(0), we get

(6.3.19) v^a(τ) = v^a(0) − ∫_0^τ x_j dx_k (∂_jΓ^a_{bk}) v^b(0) + ∫_0^τ x_j dx_k Γ^a_{bk}Γ^b_{cj} v^c(0) + O(τ³),

the components of Γ and their first derivatives being evaluated at p. Now Stokes’ theorem gives

∫_γ x_j dx_k = ∫_A dx_j ∧ dx_k,

so

(6.3.20) v^a(τ) − v^a(0) = −[ ∂_jΓ^a_{bk} − Γ^a_{ck}Γ^c_{bj} ] ( ∫_A dx_j ∧ dx_k ) v^b(0) + O(τ³).

Recall that the Riemann curvature is given by (6.2.65), i.e.,

(6.3.21) R^a_{bjk} = ∂_jΓ^a_{bk} − ∂_kΓ^a_{bj} + Γ^a_{cj}Γ^c_{bk} − Γ^a_{ck}Γ^c_{bj}.

Now the right side of (6.3.21) is the antisymmetrization, with respect to j and k, of the quantity in brackets in (6.3.20). Since ∫_A dx_j ∧ dx_k is antisymmetric in j and k, we get the desired formula (6.3.14). □

Let us specialize to dim M = 2, and use exponential coordinates centered at p, so gjk(x) = δjk + O(|x − p|²). Then

(6.3.22) ∫_A dx1 ∧ dx2 = ∫_A √g dx1 dx2 (1 + O(τ²)) = (Area A)(1 + O(τ²)) = Area A + O(τ³),

and (6.3.14) becomes

(6.3.23) v^a(τ) − v^a(0) = −R^a_{b12}(p)(Area A) v^b(0) + O(τ³).

Furthermore,

(6.3.24) (R^a_{b12}) = K(p) (  0   1
                              −1   0 ) = −K(p)J,

where K(p) is the Gauss curvature of M at p and J : TpM → TpM is counterclockwise rotation by 90°. We have the following.

Corollary 6.3.5. In the setting of Proposition 6.3.4, if dim M = 2 and γ = ∂A, then

(6.3.25) V(τ) − V(0) = K(p)(Area A) JV(0) + O(τ³).
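Corollary 6.3.5 can be watched in action on the unit sphere. A minimal numerical sketch (helper names are ours): integrate the parallel-transport ODE (6.3.13) around the latitude circle θ = θ0 in spherical coordinates (θ, φ), metric diag(1, sin²θ), and compare the resulting holonomy rotation with the area 2π(1 − cos θ0) of the enclosed cap, as (6.3.14) with K ≡ 1 predicts.

```python
import math

def transport_around_latitude(theta0, v0, n=20000):
    """Integrate the parallel-transport ODE (6.3.13) on the unit sphere,
    coordinates (theta, phi), metric diag(1, sin^2 theta), around the loop
    theta = theta0, phi = t, 0 <= t <= 2*pi.  Nonzero Christoffel symbols:
    Gamma^theta_{phi phi} = -sin*cos, Gamma^phi_{theta phi} = cot(theta).
    v0 = (v^theta, v^phi); returns the transported components."""
    s, c = math.sin(theta0), math.cos(theta0)
    def f(v):
        vth, vph = v
        # dv^a/dt = -Gamma^a_{b phi} v^b  (only phi varies along the loop)
        return (s * c * vph, -(c / s) * vth)
    h = 2.0 * math.pi / n
    v = list(v0)
    for _ in range(n):                      # classical RK4 steps
        k1 = f(v)
        k2 = f([v[i] + 0.5 * h * k1[i] for i in range(2)])
        k3 = f([v[i] + 0.5 * h * k2[i] for i in range(2)])
        k4 = f([v[i] + h * k3[i] for i in range(2)])
        v = [v[i] + h * (k1[i] + 2*k2[i] + 2*k3[i] + k4[i]) / 6.0
             for i in range(2)]
    return v

theta0 = math.pi / 3
vth, vph = transport_around_latitude(theta0, (1.0, 0.0))
# Angle of the result in the orthonormal frame (e_theta, sin(theta0) e_phi);
# mod 2*pi it should equal the cap area 2*pi*(1 - cos(theta0)) = pi here.
angle = math.atan2(math.sin(theta0) * vph, vth)
```

Note also that the transported vector keeps unit length, illustrating (6.3.26).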

In order to make use of (6.3.25), we examine further the parallel transport of a vector V along a smooth segment of γ. Say γ : [a, b] → M is smooth, and, as before, γ′ = T, ∥T∥ ≡ 1. Note that, for a ≤ t ≤ b,

(6.3.26) (d/dt)∥V(t)∥² = T⟨V,V⟩ = 2⟨∇T V,V⟩ = 0,

so V(t) has constant length. Next,

(6.3.27) T⟨V,T⟩ = ⟨∇T V,T⟩ + ⟨V, ∇T T⟩ = ⟨V, ∇T T⟩.

In particular, if γ is a geodesic for t ∈ [a, b], then ⟨V(t),T(t)⟩ is constant on that interval. Consequently, the angle between V(t) and T(t) is constant on that interval.

With these observations in hand, we can examine what happens when one takes a vector V0 ∈ TpM and parallel transports it along a curve γ that bounds a geodesic triangle A ⊂ M, having one vertex at p. Here and below, a “triangle” in a 2D manifold M is assumed to be homeomorphic to a standard triangle in R2. As one sees from Figure 6.3.2, the resulting vector V3 ∈ TpM is obtained from V0 by rotation in TpM through an angle that depends on the angle defect δ = α + β + γ − π of the triangle. Indeed, if ξ is the angle that V0 (hence V1) makes with the geodesic from p to q, then the angle from V0 to V3 is

(6.3.28) (π + α) − (2π − β − γ − ξ) − ξ = α + β + γ − π.

That is to say,

(6.3.29) V3 = e^{(α+β+γ−π)J} V0.

We thus have from (6.3.25) that

(6.3.30) α + β + γ − π = ∫_A K dS + O(τ³),

if the length of γ = ∂A is τ. Finally, we can get rid of the remainder term and obtain the following celebrated formula of Gauss.

Figure 6.3.2. Parallel transport along a geodesic triangle

Proposition 6.3.6. If A is a geodesic triangle in a 2D Riemannian manifold M, with angles α, β, and γ, then

(6.3.31) α + β + γ − π = ∫_A K dS.

Proof. Break up the geodesic triangle A into N² little geodesic triangles, each of diameter O(N^{−1}) and area O(N^{−2}). Since the angle defects are additive, the estimate (6.3.30) implies

(6.3.32) α + β + γ − π = ∫_A K dS + N²·O(N^{−3}) = ∫_A K dS + O(N^{−1}),

and taking the limit as N → ∞ yields (6.3.31). □
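Proposition 6.3.6 is easy to confirm on the unit sphere, where K ≡ 1, so the angle excess of a geodesic triangle should equal its area. The sketch below (helper names are ours) computes the area by the Van Oosterom-Strackee solid-angle formula and the vertex angles by projecting chords onto tangent planes.

```python
import math

def solid_angle(a, b, c):
    """Area of the geodesic triangle on the unit sphere with vertices a, b, c
    (unit vectors, positively oriented), via 2*atan2(det, 1 + sums of dots)."""
    det = (a[0]*(b[1]*c[2] - b[2]*c[1])
           - a[1]*(b[0]*c[2] - b[2]*c[0])
           + a[2]*(b[0]*c[1] - b[1]*c[0]))
    dot = lambda u, v: sum(ui*vi for ui, vi in zip(u, v))
    return 2.0 * math.atan2(det, 1.0 + dot(a, b) + dot(b, c) + dot(c, a))

def vertex_angle(a, b, c):
    """Angle of the spherical triangle at vertex a, between the great-circle
    arcs from a to b and from a to c."""
    dot = lambda u, v: sum(ui*vi for ui, vi in zip(u, v))
    def tangent(p, q):                      # unit tangent at p toward q
        t = [q[i] - dot(p, q)*p[i] for i in range(3)]
        n = math.sqrt(dot(t, t))
        return [ti/n for ti in t]
    u, v = tangent(a, b), tangent(a, c)
    return math.acos(max(-1.0, min(1.0, dot(u, v))))

# A non-symmetric geodesic triangle on S^2; since K = 1, excess = area.
a = [1.0, 0.0, 0.0]
b = [0.0, 1.0, 0.0]
n = math.sqrt(6.0)
c = [1.0/n, 1.0/n, 2.0/n]
excess = (vertex_angle(a, b, c) + vertex_angle(b, a, c)
          + vertex_angle(c, a, b) - math.pi)
area = solid_angle(a, b, c)
```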

Our next goal is to extend the formula (6.3.9) to general compact, 2D Riemannian manifolds. To begin, we assert that we can partition M into geodesic triangles. Suppose such a triangulation of M has

(6.3.33) F faces (triangles),E edges,V vertices.

If the angles of the jth triangle are αj, βj, and γj, then summing all the angles clearly produces 2πV. On the other hand, (6.3.31) applied to the jth triangle, and summed over j, yields

(6.3.34) Σ_j (αj + βj + γj) = πF + ∫_M K dS.

Hence ∫_M K dS = (2V − F)π. Since in this case all the faces are triangles, counting each triangle three times will count each edge twice, so 3F = 2E. Thus we obtain

(6.3.35) ∫_M K dS = 2π(V − E + F).

Now we have Euler’s formula,

(6.3.36) χ(M) = V − E + F,

whose derivation is discussed in Exercise 8 of §5.3. Putting together (6.3.35) and (6.3.36), we have the desired extension of (6.3.9).

Remark. This argument depends on the existence of a triangulation of M as described above. See Exercise 8 below for a proof of (6.3.9), for a general compact 2D manifold M, that does not depend on a triangulation.

We next want to extend Proposition 6.3.6 to cases where A is a triangle whose sides are not geodesics. To start, we pursue the study of parallel transport of V (t) along a smooth curve γ :[a, b] → M, begun in (6.3.26)– (6.3.27). Now we no longer assume γ is a geodesic, so (6.3.27) does not imply T ⟨V,T ⟩ = 0. Note that

(6.3.37) ⟨T,T⟩ ≡ 1 ⟹ 0 = T⟨T,T⟩ = 2⟨∇T T,T⟩,

so ∇T T is orthogonal to T. Let us set

(6.3.38) N(t) = JT(t),

where J : Tγ(t)M → Tγ(t)M is counterclockwise rotation by 90°. (Note: N is not the Gauss map of M; in fact, we do not assume M ⊂ R3, so we do not have such a Gauss map.) Then ∇T T is parallel to N(t), so we set

(6.3.39) ∇T T = κ(t)N(t),

and this defines κ(t), which we call the geodesic curvature of γ at t. Compare (6.2.5), which deals with the case M = R2. Next, note that

(6.3.40) ⟨N,T⟩ ≡ 0 ⟹ 0 = T⟨N,T⟩ = ⟨∇T N,T⟩ + ⟨N, ∇T T⟩ ⟹ ⟨∇T N,T⟩ = −κ(t),

and

(6.3.41) ⟨N,N⟩ ≡ 1 =⇒ 0 = 2⟨∇T N,N⟩, so ∇T N is parallel to T . Hence

(6.3.42) ∇T N = −κ(t)T (t).

Compare (6.2.6). Returning to V (t), solving ∇T V = 0, we have

(6.3.43) T⟨V,T⟩ = ⟨V, ∇T T⟩ = κ⟨V,N⟩,  T⟨V,N⟩ = ⟨V, ∇T N⟩ = −κ⟨V,T⟩.

In other words, if we expand

(6.3.44) V (t) = v1(t)T (t) + v2(t)N(t), we obtain the 2 × 2 system

(6.3.45) dv1/dt = κ(t)v2(t),  dv2/dt = −κ(t)v1(t).

Hence

(6.3.46) z(t) = v1(t) + iv2(t) ⟹ dz/dt = −iκ(t)z(t) ⟹ z(t) = e^{−i∫_a^t κ(s) ds} z(a).

Hence

(6.3.47) V(t) = (v1(t)I + v2(t)J)T(t) = e^{−J∫_a^t κ(s) ds} (v1(a)I + v2(a)J)T(t).

This leads to the following result.

Proposition 6.3.7. Let M be an oriented 2D Riemannian manifold, γ : [a, b] → M a smooth, unit speed curve. Let V(t) be obtained from V(a) ∈ Tγ(a)M by parallel transport. Assume V(a) makes the angle ξ0 with T(a), i.e., up to a positive real factor,

(6.3.48) V(a) = e^{ξ0 J} T(a).

Then, for t ∈ [a, b], V(t) makes with T(t) the angle

(6.3.49) ξ(t) = ξ0 − ∫_a^t κ(s) ds,

where κ(s) is the geodesic curvature of γ at γ(s).

From here, a straightforward modification of the proof of Proposition 6.3.6 yields the following generalization.

Proposition 6.3.8. Let A be a triangle in a 2D Riemannian manifold, whose boundary consists of three smooth arcs, meeting at angles α, β, and γ. Then

(6.3.50) α + β + γ − π = ∫_A K dS + ∫_∂A κ ds,

where κ is the geodesic curvature of (the smooth part of) ∂A.

Remark. When one divides A into N² little triangles Aj, there arises a sum of N² terms ∫_{∂Aj} κ ds. The integrals along curve segments that are interfaces of two triangles cancel out, so the sum of these terms is ∫_∂A κ ds.

Note that if O ⊂ M is a smoothly bounded domain whose closure is diffeomorphic to a closed disk, then O is a limiting case of the “triangles” treated in Proposition 6.3.8, in which α = β = γ = π, so we have the following.

Proposition 6.3.9. If O is a smoothly bounded open subset of a compact 2D Riemannian manifold M, and O̅ is diffeomorphic to a closed disk, then

(6.3.51) ∫_O K dS + ∫_∂O κ ds = 2π.

The following result deals with a broad class of 2D manifolds with boundary.

Proposition 6.3.10. Let M be a compact, oriented, 2D manifold, and let Ω ⊂ M be a smoothly bounded open subset. Assume that

(6.3.52) M \ Ω = ∪_{j=1}^k Oj,

where each Oj is connected, with closure O̅j diffeomorphic to a closed disk. Then

(6.3.53) ∫_Ω K dS + ∫_∂Ω κ ds = 2π( χ(M) − k ).

Proof. First observe the cancellation property

(6.3.54) ∫_∂Ω κ ds + Σ_{j=1}^k ∫_{∂Oj} κ ds = 0.

Hence, by (6.3.9), extended from surfaces in R3 to M in the current setting,

(6.3.55) 2πχ(M) = ∫_M K dS
       = ∫_Ω K dS + Σ_{j=1}^k ∫_{Oj} K dS
       = ∫_Ω K dS + ∫_∂Ω κ ds + Σ_{j=1}^k [ ∫_{Oj} K dS + ∫_{∂Oj} κ ds ].

By Proposition 6.3.9, each term in the last sum over j is equal to 2π, so (6.3.53) follows. □

Moving back to higher dimensions, we state the following extension of Proposition 6.3.3, known as the generalized Gauss-Bonnet theorem.

Proposition 6.3.11. Let M be a compact Riemannian manifold of dimension n = 2m, and define its Gauss curvature K by (6.2.104)–(6.2.107). Then

(6.3.56) ∫_M K(x) dS(x) = (A_n/2) χ(M).

Such a result is the subject of a full-blown course in differential geometry. Treatments can be found in [38] and in Appendix C (Vol. 2) of [41].

Exercises

1. Let M ⊂ Rk be an n-dimensional surface, with the induced metric tensor, and let γ : [a, b] → M be a smooth curve. For t ∈ [a, b], let P(t) denote the orthogonal projection of Rk onto Tγ(t)M. Take V0 ∈ Tγ(a)M. Show that the solution to

dV/dt = P′(t)V(t),  V(a) = V0,

is the vector field along γ obtained from V0 by parallel transport.
Hint. Show that W(t) = P⊥(t)V(t) satisfies

W′(t) = −P(t)P′(t)V(t) = −P′(t)P⊥(t)V(t) = −P′(t)W(t),

to deduce that W(t) ≡ 0. Then check Proposition 6.1.5.

2. Let M1 and M2 be n-dimensional surfaces in Rk. Suppose M1 ∩ M2 contains a curve γ and

p ∈ γ =⇒ TpM1 = TpM2.

Show that parallel transport along γ for M1 coincides with parallel transport along γ for M2.
Hint. Use Exercise 1.

3. Let M be an oriented 2D Riemannian manifold. For each p ∈ M, define J : TpM → TpM to be counterclockwise rotation by 90°. Let ∇ be the Levi-Civita covariant derivative on M. Show that, if X and Y are vector fields on M,

∇X JY = J∇X Y. Hint. See if you can get this from (6.3.39) and (6.3.42).

4. In the setting of Exercise 3, show that if X,Y , and Z are vector fields on M, R(X,Y )JZ = JR(X,Y )Z.

5. Let M be an oriented 2D Riemannian manifold, O ⊂ M an open set. Assume you have a smooth vector field E1 on O such that ⟨E1,E1⟩ ≡ 1. Set E2 = JE1. Define a 1-form ω on O by

(6.3.57) ω(X) = ⟨∇X E1,E2⟩. Show (using (5.2.35)) that, for all vector fields X,Y,Z on O, R(X,Y )Z = dω(X,Y ) JZ. In particular,

R(E1,E2)E2 = −dω(E1,E2)E1.

Since the Gauss curvature K(x) = ⟨R(E1,E2)E2,E1⟩, deduce that

(6.3.58) dω = −K σM , where σM denotes the area 2-form on M.

6. In the setting of Exercise 5, let γ :[a, b] → O be a smooth, unit-speed curve, T (t) = γ′(t). Define a smooth function φ :[a, b] → R such that

T(t) = cos φ(t) E1 + sin φ(t) E2.

Show that the geodesic curvature of γ is given by

κ(t) = φ′(t) + ω(T).

7. In the setting of Exercise 6, let Ω ⊂ O be a smoothly bounded domain, γ = ∂Ω. Apply Stokes’ theorem to get

∫_Ω K dS = −∫_∂Ω ω = −∫_∂Ω κ(s) ds + ∫_∂Ω φ′(s) ds.

Obtain variants of this when Ω has piecewise smooth boundary. Use this approach to produce alternative proofs of Propositions 6.3.6 and 6.3.8–6.3.10.

8. Let M be a compact, oriented, 2D Riemannian manifold, V a smooth vector field on M, {pj} its set of critical points, each assumed to be nondegenerate. Set

E1 = ∥V∥^{−1} V,  E2 = JE1,  on O = M \ {pj}.

Define the 1-form ω on O as in (6.3.57). For small ε > 0, let Dj(ε) denote the disk of radius ε centered at pj, and set

Ωε = M \ ∪_j Dj(ε).

By (6.3.58),

∫_{Ωε} K dS = −∫_{∂Ωε} ω = Σ_j ∫_{∂Dj(ε)} ω.

Show that

lim_{ε→0} ∫_{∂Dj(ε)} ω = 2π ind_{pj}(V).

(Recall (5.3.19)–(5.3.20) and Exercise 7 of §5.3.) Deduce that

∫_M K dS = 2π Index(V),

thus obtaining another proof of (6.3.9), for general compact, oriented 2D Riemannian manifolds.

9. In this exercise, we consider the compact 2D surface M depicted in Figure 6.3.3. This is shown sitting in a cube C ⊂ R3, say C = [−π, π]3, but we identify opposite faces of C and take

(6.3.59) M ⊂ T3 = R3/2πZ3,

a compact surface without boundary.

Figure 6.3.3. A surface M ⊂ T3 satisfying χ(M) = −4

(a) Show that the natural isomorphism TxT3 = R3 leads to a well defined Gauss map

(6.3.60) M ⊂ T3 ⟹ N : M → S2,

and that, parallel to the case M ⊂ R3,

(6.3.61) Deg(N) = (1/2) χ(M).

Furthermore, show that Proposition 6.3.2 continues to hold. (These observations extend more generally to n-dimensional M ⊂ Tn+1, n even.)

(b) Show that the surface M depicted in Figure 6.3.3 is diffeomorphic to the 3-holed torus depicted in Figure 6.3.1, hence

(6.3.62) χ(M) = −4.

(c) Combining (6.3.62) with either Part (a) or the Gauss-Bonnet theorem for general compact 2D manifolds, deduce that

(6.3.63) ∫_M K dS = −8π.

Figure 6.3.4. Regular tetrahedron in R3

(d) Now regard M as a compact surface with boundary (consisting of 6 circles) in R3, and use Proposition 6.3.10, with M relabeled as Ω, and putting Ω in a surface diffeomorphic to S2. In this case, κ ≡ 0 on ∂Ω. Show that Proposition 6.3.10 (with these changes in labeling) also implies (6.3.63). This approach to (6.3.63) does not use χ(M) = −4, just χ(S2) = 2. (e) See if you can produce a surface M ⊂ T3, looking like that in Figure 6.3.3, whose Gauss curvature satisfies (6.3.64) K(x) < 0, ∀ x ∈ M.

10. Let T ⊂ R3 be a (solid) regular tetrahedron; cf. Figure 6.3.4. The angle θ between two faces is specified by

cos θ = 1/3,  sin θ = 2√2/3,  sin(π/2 − θ) = 1/3.

Assume one vertex of T is the origin in R3. Dilate T by a factor of 2, and consider Ω = S2 ∩ 2T.

(a) Show that Ω ⊂ S2 is a geodesic triangle, each of whose angles is equal to θ, specified above.

(b) Deduce from the Gauss formula, Proposition 6.3.6, that

Area(Ω) = 3θ − π,

hence

Area(Ω) = π/2 − 3 sin^{−1}(1/3).

Remark. It is shown in §7.4 of [47] that sin^{−1}(1/3) is not a rational multiple of π.
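A quick numerical confirmation of part (b) (variable names are ours): the excess 3θ − π computed from cos θ = 1/3 should agree with the closed form π/2 − 3 sin⁻¹(1/3).

```python
import math

# Exercise 10(b): the dihedral angle theta of the regular tetrahedron has
# cos(theta) = 1/3, and Omega = S^2 ∩ 2T is a geodesic triangle with all three
# angles equal to theta, so by Proposition 6.3.6 (K = 1 on S^2) its area is
# the angle excess 3*theta - pi.
theta = math.acos(1.0 / 3.0)
area = 3.0 * theta - math.pi
closed_form = math.pi / 2.0 - 3.0 * math.asin(1.0 / 3.0)
```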

6.4. Smooth matrix groups

A smooth matrix group is a subset (6.4.1) G ⊂ M(n, F), F = R or C, with the following two properties:

(i) G is a smooth m-dimensional surface in the vector space M(n, F),

(ii) G is a group, i.e.,

g1, g2 ∈ G ⟹ g1g2 ∈ G, and
(6.4.2)
g ∈ G ⟹ g is invertible and g−1 ∈ G.

We assume G is nonempty, and note that the two conditions in (6.4.2) imply I ∈ G, where I ∈ M(n, F) is the n × n identity matrix. Here are some examples of smooth matrix groups:

        Gl(n, F) = {g ∈ M(n, F) : det g ≠ 0},
        Sl(n, F) = {g ∈ M(n, F) : det g = 1},
        O(n) = {g ∈ M(n, R) : g*g = I},
(6.4.3)
        SO(n) = {g ∈ O(n) : det g = 1},
        U(n) = {g ∈ M(n, C) : g*g = I},
        SU(n) = {g ∈ U(n) : det g = 1}.

The fact that all these sets of matrices satisfy (6.4.2) follows from the identities

(6.4.4) det(g1g2) = (det g1)(det g2),  (g1g2)* = g2*g1*,

and the fact that g ∈ M(n, F) is invertible if and only if det g ≠ 0. These facts also imply that if G ⊂ M(n, F) is a matrix group, then in fact

(6.4.5) G ⊂ Gl(n, F).

Note that the defining property det g ≠ 0 implies

(6.4.6) Gl(n, F) is open in M(n, F).

The fact that the other groups listed in (6.4.3) are smooth surfaces can be established using the submersion mapping theorem, Proposition 3.2.5, which we recall here.

Proposition 6.4.1. Let V and W be finite dimensional vector spaces, Ω ⊂ V open, F : Ω → W a smooth map. Fix p ∈ W, and consider

(6.4.7) S = {x ∈ Ω : F(x) = p}.

Assume that, for each x ∈ S, DF(x) : V → W is surjective. Then S is a smooth surface in Ω. Furthermore, for each x ∈ S,

(6.4.8) TxS = N(DF(x)),

the null space of DF(x).

To apply this to the groups listed in (6.4.3), we start with Sl(n, F). Here we take

(6.4.9) V = M(n, F),  W = F,  F : V → W,  F(A) = det A.

Now, given A invertible,

(6.4.10) F(A + B) = det(A + B) = (det A) det(I + A−1B),

and we have, for X ∈ M(n, F),

(6.4.11) det(I + X) = 1 + Tr X + O(∥X∥²),

so

(6.4.12) DF(A)B = (det A) Tr(A−1B).

Now, given A ∈ Sl(n, F), or even A ∈ Gl(n, F), it is readily verified that

(6.4.13) τA : M(n, F) → F,  τA(B) = Tr(A−1B),

is nonzero, hence surjective, and Proposition 6.4.1 applies.

We turn to O(n). In this case,

(6.4.14) V = M(n, R),  W = {X ∈ M(n, R) : X = X*},  F : V → W,  F(A) = A*A.

Now, given A ∈ V,

(6.4.15) F(A + B) = A*A + A*B + B*A + O(∥B∥²),

so

(6.4.16) DF(A)B = A*B + B*A = A*B + (A*B)*.

We claim that

(6.4.17) A ∈ O(n) ⟹ DF(A) : M(n, R) → W is surjective.

Indeed, given X ∈ W, i.e., X = X* ∈ M(n, R), we have

(6.4.18) B = (1/2)AX ⟹ DF(A)B = X.

Again Proposition 6.4.1 applies, so O(n) is a smooth surface. Similar arguments apply to U(n). For SU(n), we take

(6.4.19) V = M(n, C),  W = {X ∈ M(n, C) : X = X*} ⊕ R,  F : V → W,  F(A) = (A*A, Im det A).

Note that A ∈ U(n) implies |det A| = 1, so Im det A = 0 ⇔ det A = 1.

As a further comment on O(n), we note that, given A ∈ M(n, R), defining A : Rn → Rn,

(6.4.20) A ∈ O(n) ⟺ ⟨Au, Av⟩ = ⟨u, v⟩, ∀ u, v ∈ Rn,

where ⟨u, v⟩ is the Euclidean inner product on Rn,

(6.4.21) ⟨u, v⟩ = Σ_j uj vj,

where u = (u1, . . . , un)^t, etc. Similarly, given A ∈ M(n, C), defining A : Cn → Cn,

(6.4.22) A ∈ U(n) ⟺ (Au, Av) = (u, v), ∀ u, v ∈ Cn,

where (u, v) denotes the Hermitian inner product on Cn,

(6.4.23) (u, v) = Σ_j uj v̄j.

Note that

(6.4.24) ⟨u, v⟩ = Re(u, v)

defines the Euclidean inner product on Cn ≈ R2n, and we have

(6.4.25) U(n) ↪ O(2n).

If G ⊂ M(n, F) is a smooth matrix group, it is of particular interest to consider the tangent space to G at the identity element,

(6.4.26) g = TI G ⊂ M(n, F), an R-linear subspace. For the groups listed in (6.4.3), we have the following specific identifications:

        TI Sl(n, F) = {A ∈ M(n, F) : Tr A = 0},
        TI O(n) = {A ∈ M(n, R) : A* = −A} = TI SO(n),
(6.4.27)
        TI U(n) = {A ∈ M(n, C) : A* = −A},
        TI SU(n) = {A ∈ M(n, C) : A* = −A, Tr A = 0}.

For the first two identities, take A = I in (6.4.12) and (6.4.16), respectively, yielding DF(I)A = Tr A and DF(I)A = A + A*, respectively. Having (6.4.27) and making use of the identities

(6.4.28) e^{tB−1AB} = B−1 e^{tA} B,  e^{tA*} = (e^{tA})*,  det e^{tA} = e^{t Tr A},

one readily verifies the following, for

(6.4.29) Exp(A) = e^A = Σ_{k=0}^∞ (1/k!) A^k.

Proposition 6.4.2. For each matrix group listed in (6.4.3),

(6.4.30) Exp : TI G −→ G.
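Proposition 6.4.2 can be illustrated numerically for G = SO(3): exponentiating a skew-symmetric (hence traceless) A via the series (6.4.29) should produce g with g*g = I and det g = 1. A minimal pure-Python sketch (helper names are ours; a truncated series suffices for the small matrix used):

```python
import math

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k]*B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=30):
    """Exp(A) = sum_k A^k / k!, formula (6.4.29), truncated power series."""
    n = len(A)
    E = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(1, terms):
        P = mat_mul(P, A)
        E = [[E[i][j] + P[i][j] / math.factorial(k) for j in range(n)]
             for i in range(n)]
    return E

# A skew-symmetric matrix (A* = -A), i.e. A in T_I SO(3); by (6.4.30),
# g = Exp(A) should lie in SO(3).
A = [[0.0, 0.3, -0.2],
     [-0.3, 0.0, 0.5],
     [0.2, -0.5, 0.0]]
g = mat_exp(A)
gtg = mat_mul([list(r) for r in zip(*g)], g)   # g* g, should be I
```

Since Tr A = 0, the identity det e^{A} = e^{Tr A} from (6.4.28) forces det g = 1, which the check below confirms.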

Here is a significant extension of Proposition 6.4.2.

Proposition 6.4.3. Let G ⊂ M(n, F) be a smooth matrix group, g = TI G. Then, for A ∈ M(n, F), we have

(6.4.31) A ∈ g ⟺ e^{tA} ∈ G, ∀ t ∈ R.

The “⇐” part is clear from the identity (d/dt)e^{tA}|_{t=0} = A. As for the “⇒” part, we have this for the groups G in (6.4.3), by Proposition 6.4.2. We will postpone the proof of the rest of Proposition 6.4.3 until later in this section, but will proceed to develop some consequences of this result, starting with the following. For A, B ∈ M(n, F), set

(6.4.32) [A, B] = AB − BA.

This is called the commutator, or the Lie bracket, of A and B.

Proposition 6.4.4. Let G ⊂ M(n, F) be a smooth matrix group, g = TI G. Then (6.4.33) A, B ∈ g =⇒ [A, B] ∈ g.

Remark. For the tangent spaces TI G listed in (6.4.27), the implication (6.4.33) is straightforward, via such identities as (6.4.34) [A, B]∗ = −[A∗,B∗], Tr[A, B] = 0. We turn to the general situation.

Proof. Given $g \in G$, $A \in \mathfrak{g}$,

(6.4.35) $g^{-1}e^{tA}g = e^{t g^{-1}Ag}, \ \forall\, t$,

and the left side of (6.4.35) belongs to $G$, so by (6.4.31) we have

(6.4.36) $g^{-1}Ag \in \mathfrak{g}, \ \forall\, g \in G,\ A \in \mathfrak{g}$.

Setting $g = e^{tB}$, $B \in \mathfrak{g}$, we have

(6.4.37) $e^{-tB}Ae^{tB} \in \mathfrak{g}, \ \forall\, A, B \in \mathfrak{g}$.

Applying $d/dt$ at $t = 0$ gives (6.4.33). □

The commutator $[A, B] = AB - BA$ gives $\mathfrak{g}$ the structure of a Lie algebra. We aim to establish further relations between the Lie algebra structure of $\mathfrak{g}$ and the group structure of $G$.

To begin, for $A, B \in \mathfrak{g}$, we have, for small $t$,

(6.4.38) $e^{tA}e^{tB} = \Big(I + tA + \frac{t^2}{2}A^2 + O(t^3)\Big)\Big(I + tB + \frac{t^2}{2}B^2 + O(t^3)\Big) = I + t(A+B) + \frac{t^2}{2}(A^2 + 2AB + B^2) + O(t^3)$,

and similarly

(6.4.39) $e^{tB}e^{tA} = I + t(A+B) + \frac{t^2}{2}(A^2 + 2BA + B^2) + O(t^3)$,

hence

(6.4.40) $e^{tA}e^{tB} = e^{tB}e^{tA} + t^2[A,B] + O(t^3)$.

Consequently,

(6.4.41) $e^{tA}e^{tB}e^{-tA}e^{-tB} = I + t^2[A,B] + O(t^3) = e^{t^2[A,B]} + O(t^3)$.
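The $O(t^3)$ error in (6.4.41) can be checked numerically: halving $t$ should shrink the discrepancy between the group commutator $e^{tA}e^{tB}e^{-tA}e^{-tB}$ and $e^{t^2[A,B]}$ by roughly a factor of $2^3 = 8$. A sketch, assuming `numpy`/`scipy` (arbitrary test matrices):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
comm = A @ B - B @ A  # the Lie bracket [A, B] of (6.4.32)

def err(t):
    # distance between the two sides of (6.4.41)
    lhs = expm(t * A) @ expm(t * B) @ expm(-t * A) @ expm(-t * B)
    return np.linalg.norm(lhs - expm(t * t * comm))

# halving t cuts an O(t^3) error by about 8
e1, e2 = err(0.1), err(0.05)
assert e1 / e2 > 6
```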

We apply these calculations to show how the Lie algebra structure is preserved under smooth homomorphisms of G. Thus, assume we have a smooth homomorphism (6.4.42) π : G −→ Gl(m, F), i.e., a smooth map satisfying

(6.4.43) $\pi(g_1 g_2) = \pi(g_1)\pi(g_2), \ \forall\, g_1, g_2 \in G$.

Note that (6.4.43) implies

(6.4.44) $\pi(I) = \pi(I \cdot I) = \pi(I)\pi(I)$, hence $\pi(I) = I$.

Let us set $\sigma = D\pi(I) : \mathfrak{g} \to M(m,\mathbb{F})$, so

(6.4.45) $\sigma(A) = \frac{d}{ds}\pi(e^{sA})\Big|_{s=0}$,

for $A \in \mathfrak{g}$. Note that, for such $A$,

(6.4.46) $\frac{d}{dt}\pi(e^{tA}) = \frac{d}{ds}\pi(e^{(s+t)A})\Big|_{s=0} = \frac{d}{ds}\pi(e^{sA})\pi(e^{tA})\Big|_{s=0} = \sigma(A)\pi(e^{tA})$,

and since $\gamma(t) = \pi(e^{tA})$ satisfies $\gamma(0) = I$, this gives

(6.4.47) $\pi(e^{tA}) = e^{t\sigma(A)}$.

We are ready to prove the following.

Proposition 6.4.5. If π is a smooth homomorphism in (6.4.42) and σ : g → M(m, F) is given by (6.4.45), then, for A, B ∈ g, we have (6.4.48) σ([A, B]) = [σ(A), σ(B)].

Proof. Applying $\pi$ to (6.4.41), we have

(6.4.49) $\pi(e^{tA}e^{tB}e^{-tA}e^{-tB}) = \pi(e^{t^2[A,B]}) + O(t^3)$.

By (6.4.47), and a second application of (6.4.41), the left side of (6.4.49) is equal to

(6.4.50) $e^{t\sigma(A)}e^{t\sigma(B)}e^{-t\sigma(A)}e^{-t\sigma(B)} = e^{t^2[\sigma(A),\sigma(B)]} + O(t^3)$.

Comparing the right sides of (6.4.49) and (6.4.50), we have

(6.4.51) $\frac{d}{ds}\pi(e^{s[A,B]})\Big|_{s=0} = [\sigma(A), \sigma(B)]$,

which gives (6.4.48). □

We next associate to each A ∈ TI G = g a certain vector field on G. To start, take A ∈ M(n, F), the Lie algebra of Gl(n, F). We define a vector field XA on Gl(n, F) by

(6.4.52) $X_A(g) = gA$,

for $g \in Gl(n,\mathbb{F})$. This vector field is left invariant. That is to say, if for each $h \in Gl(n,\mathbb{F})$ we define left translation $L_h : Gl(n,\mathbb{F}) \to Gl(n,\mathbb{F})$ by

(6.4.53) $L_h(g) = hg$,

then we have

(6.4.54) $X_A(hg) = DL_h(g)X_A(g)$.

We now have the following simple result:

Proposition 6.4.6. If A ∈ g = TI G, then XA is tangent to G.

Proof. Given g ∈ G, we have Lg : G → G, and hence

(6.4.55) $DL_g(I) : T_I G \to T_g G$,

hence $A \in \mathfrak{g} \Rightarrow X_A(g) \in T_g G$. □

Given $A \in M(n,\mathbb{F})$, the flow $\mathcal{F}^t_A$ on $Gl(n,\mathbb{F})$ generated by $X_A$ is given by

(6.4.56) $\mathcal{F}^t_A(g) = g e^{tA}$,

as is readily checked:

(6.4.57) $\frac{d}{dt}\mathcal{F}^t_A(g)\Big|_{t=0} = X_A(g)$ (by definition) $= gA$,

which coincides with $(d/dt)g e^{tA}\big|_{t=0}$. With these observations, we can give a short

Proof of Proposition 6.4.3 (the “⇒” part). As seen in §3.2, the flow $\mathcal{F}^t_A$ generated by $X_A$ leaves invariant each smooth surface in $Gl(n,\mathbb{F})$ to which $X_A$ is tangent. If $A \in \mathfrak{g}$, then, by Proposition 6.4.6, $G$ has this property, and $e^{tA} = \mathcal{F}^t_A(I)$. □

We record the following useful complement to Proposition 6.4.3.

Proposition 6.4.7. If G ⊂ M(n, F) is a smooth matrix group, g = TI G, then Exp : g −→ G is a smooth map and there exist neighborhoods O of 0 ∈ g and Ω of I ∈ G such that Exp : O → Ω is a diffeomorphism.

Proof. The mapping property has just been proved. The smoothness follows from the smoothness of $\mathrm{Exp} : M(n,\mathbb{F}) \to Gl(n,\mathbb{F})$. We also have, for $D\,\mathrm{Exp}(0) : \mathfrak{g} \to T_I G = \mathfrak{g}$, that $D\,\mathrm{Exp}(0)A = A$, $\forall\, A \in \mathfrak{g}$, so the inverse function theorem applies. □

As seen in §2.3, a smooth vector field $X$ defines a differential operator (also denoted $X$) on smooth functions by

$Xu(x) = \frac{d}{dt}u(\mathcal{F}^t(x))\Big|_{t=0}$,

where $\mathcal{F}^t$ is the flow generated by $X$. In particular, for $A \in M(n,\mathbb{F})$,

(6.4.58) $X_A u(g) = \frac{d}{dt}u(g e^{tA})\Big|_{t=0} = Du(g) \cdot gA$,

where the “dot product” gives the action of $Du(g) \in \mathcal{L}(M(n,\mathbb{F}),\mathbb{R})$ on $gA \in M(n,\mathbb{F})$. Recall from §2.3 that the Lie bracket of vector fields is given by

(6.4.59) $[X_A, X_B] = X_A X_B - X_B X_A$.

The following result provides an equivalence between the Lie algebra structure on $\mathfrak{g}$ and the Lie bracket in (6.4.59).

Proposition 6.4.8. Given $A, B \in M(n,\mathbb{F})$, we have

(6.4.60) $[X_A, X_B] = X_{[A,B]}$.

Proof. To begin, we have

(6.4.61) $X_A X_B u(g) = \frac{\partial^2}{\partial s\,\partial t} u(g e^{tA} e^{sB})\Big|_{s,t=0}$,

and hence

(6.4.62) $(X_A X_B - X_B X_A)u(g) = \frac{\partial^2}{\partial s\,\partial t}\Big[u(g e^{tA}e^{sB}) - u(g e^{sB}e^{tA})\Big]\Big|_{s,t=0}$.

We can extend (6.4.40) to

(6.4.63) $e^{tA}e^{sB} = e^{sB}e^{tA} + st[A,B] + O((|s|+|t|)^3)$.

Consequently,

(6.4.64) $u(g e^{tA}e^{sB}) = u\big(g e^{sB}e^{tA} + st\,g[A,B] + O((|s|+|t|)^3)\big) = u(g e^{sB}e^{tA}) + st\,Du(g e^{sB}e^{tA}) \cdot g[A,B] + O((|s|+|t|)^3)$.

Applying $\partial^2/\partial s\,\partial t\big|_{s,t=0}$, we obtain

(6.4.65) $(X_A X_B - X_B X_A)u(g) = Du(g) \cdot g[A,B] = X_{[A,B]}u(g)$,

the last identity holding by (6.4.58). This proves (6.4.60). □

We turn to the production of metric tensors on a smooth matrix group $G \subset M(n,\mathbb{R})$. (For simplicity, we restrict attention to $\mathbb{F} = \mathbb{R}$ here, which is no real loss of generality, since $Gl(n,\mathbb{C}) \subset Gl(2n,\mathbb{R})$.) To start, we use the inner product on $M(n,\mathbb{R})$ computed componentwise; equivalently,

(6.4.66) ⟨A, B⟩ = Tr(B∗A) = Tr(BA∗).

This produces a metric tensor on G ⊂ M(n, R), by restriction. This Rie- mannian metric interfaces well with the group structure of G in the following situation.
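The two trace formulas in (6.4.66) and the componentwise description agree, as a quick check confirms (a sketch assuming `numpy`; arbitrary test matrices):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# componentwise inner product on M(3, R): sum of entrywise products
componentwise = np.sum(A * B)

# ⟨A, B⟩ = Tr(B* A) = Tr(B A*), with * = transpose over R
assert np.isclose(componentwise, np.trace(B.T @ A))
assert np.isclose(componentwise, np.trace(B @ A.T))
```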

Proposition 6.4.9. Assume the smooth matrix group G ⊂ M(n, R) satisfies

(6.4.67) G ⊂ O(n).

Then the metric tensor on G induced from the inner product (6.4.66) on M(n, R) is invariant under both left and right translations, i.e., under

(6.4.68) Lg,Rg : G −→ G, g ∈ G, defined by

(6.4.69) $L_g(x) = gx$, $R_g(x) = xg$, $g, x \in G$.

Proof. Indeed, we have

(6.4.70) Lg,Rg : M(n, R) −→ M(n, R), isometries, ∀ g ∈ O(n), where

(6.4.71) LgA = gA, RgA = Ag, A ∈ M(n, R), g ∈ O(n). 

In the setting of Proposition 6.4.9, we say the Riemannian metric on $G$ defined above is bi-invariant. When $G$ does not satisfy (6.4.67), the metric tensor on $G$ induced from $M(n,\mathbb{R})$ is generally not what we want to deal with. In such cases, take some positive-definite inner product $\langle\ ,\ \rangle$ on $\mathfrak{g} = T_I G$ (perhaps that induced from (6.4.66)), and define inner products $\langle\ ,\ \rangle_{\ell,g}$ and $\langle\ ,\ \rangle_{r,g}$ on $T_g G$ by

(6.4.72) $\langle DL_g(I)A, DL_g(I)B\rangle_{\ell,g} = \langle A, B\rangle$, $\langle DR_g(I)A, DR_g(I)B\rangle_{r,g} = \langle A, B\rangle$,

for $A, B \in \mathfrak{g}$. We have the following.

Proposition 6.4.10. Given an inner product ⟨ , ⟩ on g = TI G, there are unique extensions to Riemannian metric tensors ⟨ , ⟩ℓ and ⟨ , ⟩r on G such that, for all g ∈ G,

(6.4.73) $L_g : G \to G$ is an isometry for $\langle\ ,\ \rangle_\ell$, and $R_g : G \to G$ is an isometry for $\langle\ ,\ \rangle_r$.

These metric tensors give rise to volume elements, and hence to integrals that are, respectively, left and right invariant, so

(6.4.74) $\int_G f(x)\,dV_\ell(x) = \int_G f(gx)\,dV_\ell(x)$, $\int_G f(x)\,dV_r(x) = \int_G f(xg)\,dV_r(x)$,

for all $g \in G$ and all integrable $f$ (e.g., $f$ continuous and compactly supported on $G$).

Generally, if $dV_1$ and $dV_2$ are two smooth volume elements on a smooth surface $M$, they differ by a smooth positive factor, so

(6.4.75) $\int_M f(x)\,dV_1(x) = \int_M f(x)\varphi_{12}(x)\,dV_2(x)$.

If $M = G$ and $dV_\ell$, $d\widetilde V_\ell$ are both smooth left-invariant volume elements, then

(6.4.76) $\int_G f(x)\,dV_\ell(x) = \int_G f(x)\varphi(x)\,d\widetilde V_\ell(x)$

equals both

(6.4.77) $\int_G f(gx)\,dV_\ell(x) = \int_G f(gx)\varphi(x)\,d\widetilde V_\ell(x)$

and

(6.4.78) $\int_G f(gx)\varphi(gx)\,d\widetilde V_\ell(x)$,

for all $f \in C_0(G)$, $g \in G$. This forces $\varphi(x) = \varphi(gx)$ for all $x, g \in G$, hence $\varphi =$ constant. We can apply this observation to compare a left-invariant volume element $dV_\ell$ and a right translation of this, say $dV_{\ell h}$, defined by

(6.4.79) $\int_G f(x)\,dV_{\ell h}(x) = \int_G f(xh)\,dV_\ell(x)$.

We have

(6.4.80) $dV_{\ell h}(x) = \psi(h)\,dV_\ell(x)$, $h \in G$.

Note that, if $h_1, h_2 \in G$, then

(6.4.81) $\int_G f(xh_1h_2)\,dV_\ell(x) = \psi(h_1)\int_G f(xh_2)\,dV_\ell(x) = \psi(h_2)\psi(h_1)\int_G f(x)\,dV_\ell(x)$,

so

(6.4.82) $\psi : G \to (0,\infty)$ is a smooth multiplicative homomorphism,

(6.4.83) ψ(h1h2) = ψ(h1)ψ(h2). This leads to the following. Proposition 6.4.11. If the only smooth homomorphism ψ : G → (0, ∞) is the trivial homomorphism ψ(g) ≡ 1, then the left-invariant volume element dVℓ is also right invariant.

Note that the image of $G$ under $\psi$ must be a multiplicative subgroup of $(0,\infty)$, and the only compact subgroup of $(0,\infty)$ is $\{1\}$. Since the image is compact if $G$ is compact, we have the following.

Proposition 6.4.12. If $G$ is a compact, smooth matrix group, then each left-invariant volume element on $G$ is also right invariant.

Given $G$ compact, we normalize the bi-invariant volume element, and define

(6.4.84) $\int_G f(g)\,dg = \frac{1}{V_\ell(G)}\int_G f(x)\,dV_\ell(x)$.

Recall that we already have a bi-invariant Riemannian metric tensor, hence a bi-invariant integral, in case $G \subset O(n)$. We come full circle with the following result.

Proposition 6.4.13. Let $G \subset M(n,\mathbb{R})$ be a smooth, compact matrix group. Then there is an inner product on $\mathbb{R}^n$ preserved by the action of $G$.

Proof. Let $\langle\ ,\ \rangle$ denote the standard inner product on $\mathbb{R}^n$, and define a new inner product $(\ ,\ )$ by

(6.4.85) $(v, w) = \int_G \langle gv, gw\rangle\,dg$.

Then, for $h \in G$,

(6.4.86) $(hv, hw) = \int_G \langle ghv, ghw\rangle\,dg = \int_G \langle gv, gw\rangle\,dg$,

by right invariance of the integral, and the last integral is equal to $(v, w)$. □

A matrix group $G$ for which the left-invariant volume form is also right invariant is said to be unimodular. One can deduce from Proposition 6.4.11 that

(6.4.87) $Sl(n,\mathbb{R})$ is unimodular.

On the other hand, also

(6.4.88) $Gl(n,\mathbb{R})$ is unimodular,

though Proposition 6.4.11 does not apply to this case. An example of a group that is not unimodular is the two-dimensional group

(6.4.89) $G = \Big\{\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} : a > 0,\ b \in \mathbb{R}\Big\}$.

We refer the reader to such sources as [48] or [49] for details on which matrix groups are unimodular. One application of integration over $G$ arises in the proof of the following significant result.

Proposition 6.4.14. If G is a smooth matrix group, every continuous ho- momorphism (6.4.90) π : G −→ Gl(m, F) is smooth (of class C∞).

To set up the proof, we define

(6.4.91) $\pi(f) \in M(m,\mathbb{F})$, for $f \in C_0(G)$,

by

(6.4.92) $\pi(f)v = \int_G f(x)\pi(x)v\,dV_\ell(x), \quad v \in \mathbb{F}^m$.

We set

(6.4.93) $C^\infty(\pi) = \{v \in \mathbb{F}^m : \pi(g)v \text{ is } C^\infty \text{ in } g\}$.

Clearly $C^\infty(\pi)$ is a linear subspace of $\mathbb{F}^m$. Our goal is to show that $C^\infty(\pi) = \mathbb{F}^m$. To this end, we have:

Lemma 6.4.15. Let $f_\nu \in C_0(G)$ satisfy

(6.4.94) $f_\nu \ge 0$, $f_\nu(x) = 0$ for $\|x - I\| \ge 2^{-\nu}$, $\int_G f_\nu(x)\,dV_\ell(x) = 1$.

Then

(6.4.95) $\pi(f_\nu)v \to v$, as $\nu \to \infty$, $\forall\, v \in \mathbb{F}^m$.

We complement this with:

Lemma 6.4.16. In the setting described above,

(6.4.96) $f \in C_0^\infty(G),\ v \in \mathbb{F}^m \Rightarrow \pi(f)v \in C^\infty(\pi)$.

Proof. For $g \in G$, we have

(6.4.97) $\pi(g)\pi(f)v = \int_G f(x)\,\pi(gx)v\,dV_\ell(x) = \int_G f(g^{-1}y)\pi(y)v\,dV_\ell(y)$.

We leave (6.4.95) and the smoothness of the last integral as a function of $g$ as exercises. □

Proof of Proposition 6.4.14. Take $f_\nu \in C_0^\infty(G)$ satisfying (6.4.94). We have $\pi(f_\nu)v \in C^\infty(\pi)$ and $\pi(f_\nu)v \to v$ as $\nu \to \infty$, for each $v \in \mathbb{F}^m$. Hence $C^\infty(\pi)$ is dense in $\mathbb{F}^m$. However, the only dense linear subspace of $\mathbb{F}^m$ is $\mathbb{F}^m$ itself. □

It is useful to broaden the class of group homomorphisms (6.4.90), as follows. Let V be a finite-dimensional vector space, G a smooth matrix group. A representation of G on V is a continuous map (6.4.98) π : G −→ L(V ), satisfying

(6.4.99) $\pi(I) = I$, $\pi(g_1g_2) = \pi(g_1)\pi(g_2)$, $\forall\, g_j \in G$.

Picking a basis of $V$ yields an isomorphism $J : V \to \mathbb{F}^m$, hence a homomorphism

(6.4.100) $\pi_J : G \to Gl(m,\mathbb{F})$, $\pi_J(g) = J\pi(g)J^{-1}$.

We can readily translate results on πJ to results on π. For example, each continuous representation π in (6.4.98) is smooth, and we have analogues of (6.4.47)–(6.4.48). The language of representation theory is a convenient one in which to formulate results. Here is an example.

Proposition 6.4.17. Let $G$ be a compact, smooth matrix group, $\pi$ a continuous representation of $G$ on a finite-dimensional vector space $V$. Then

(6.4.101) $P = \int_G \pi(g)\,dg$

is a projection of $V$ onto the subspace

(6.4.102) V0 = {v ∈ V : π(g)v = v, ∀ g ∈ G}, on which π acts trivially.

Proof. For each $h \in G$, $v \in V$, we have

(6.4.103) $\pi(h)Pv = \int_G \pi(hg)v\,dg = \int_G \pi(g)v\,dg$,

the second identity by left invariance of the integral. The last integral is equal to $Pv$, so we have $Pv \in V_0$ for each $v \in V$, i.e.,

(6.4.104) $P : V \to V_0$.

Furthermore,

(6.4.105) $v \in V_0 \Rightarrow Pv = \int_G \pi(g)v\,dg = \int_G v\,dg = v$,

so indeed $P$ is a projection onto $V_0$. □

Examples of representations of a matrix group $G$ include representations on spaces of functions. In §7.4 we will see representations of $SO(n)$ on spaces of functions on the sphere $S^{n-1}$, known as spherical harmonics. Analysis of these representations of $SO(n)$ will be seen to provide a useful tool in the study of spherical harmonics.

We now focus on matrix groups with the following property:

(6.4.106) $G$ is a smooth matrix group, endowed with a bi-invariant Riemannian metric,

and examine differential geometric properties of these groups. Recall that compact, smooth matrix groups possess bi-invariant metric tensors. In fact, these are the main examples. To start, consider

(6.4.107) $\psi : G \to G$, $\psi(x) = x^{-1}$.

This map takes a left-invariant metric tensor to a right-invariant metric tensor, and vice-versa. We have

(6.4.108) $D\psi(I) = -I$ on $\mathfrak{g} = T_I G$.

We can deduce from this that, if (6.4.106) holds, then $\psi$ is an isometry. This generalizes as follows.

Proposition 6.4.18. If (6.4.106) holds, then, for each $g \in G$,

(6.4.109) $\psi_g : G \to G$, $\psi_g(x) = g x^{-1} g$,

is an isometry of $G$, fixing $g$ and satisfying

(6.4.110) $D\psi_g(g) = -I$ on $T_g G$.

Using this, we can prove the following. Proposition 6.4.19. If γ is a unit-speed geodesic on G satisfying γ(0) = I, then (6.4.111) γ(s + t) = γ(s)γ(t). 320 6. Differential geometry of surfaces

Proof. Fix t ∈ R and consider (6.4.112) σ(s) = γ(t + s). This is a unit-speed geodesic satisfying σ(0) = γ(t), σ′(0) = γ′(t). It follows that

(6.4.113) $\tilde\sigma(s) = \psi_{\gamma(t)}(\sigma(s))$

is the unit-speed geodesic satisfying $\tilde\sigma(0) = \gamma(t)$, $\tilde\sigma'(0) = -\gamma'(t)$. This forces $\tilde\sigma(s) = \gamma(t-s)$, i.e.,

(6.4.114) $\gamma(t-s) = \psi_{\gamma(t)}(\gamma(t+s)) = \gamma(t)\gamma(t+s)^{-1}\gamma(t)$.

Taking $t = 0$ gives

(6.4.115) $\gamma(-s) = \gamma(s)^{-1}$,

and taking $t = s$ in (6.4.114) gives $I = \gamma(t)\gamma(2t)^{-1}\gamma(t)$, hence $\gamma(2t)^{-1} = \gamma(t)^{-1}\gamma(t)^{-1}$, so

(6.4.116) $\gamma(2t) = \gamma(t)\gamma(t)$.

Inductively, we obtain $\gamma(kt) = \gamma(t)^k$, for $k = 2^\nu$, which leads to (6.4.111) when $s$ and $t$ have the same sign. In such a case, (6.4.114) implies

(6.4.117) $\gamma(t-s) = \gamma(t)\gamma(s)^{-1}\gamma(t)^{-1}\gamma(t) = \gamma(t)\gamma(-s)$,

so we have (6.4.111) in general. □
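The homomorphism property (6.4.111) and the inverse formula (6.4.115) can be observed numerically for the geodesics $\gamma(t) = e^{tA}$ through $I$ in $SO(3)$. A sketch, assuming `numpy`/`scipy` (an arbitrary skew-symmetric test matrix):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
X = rng.standard_normal((3, 3))
A = X - X.T  # A ∈ so(3): γ(t) = e^{tA} is a constant-speed geodesic through I in SO(3)

def gamma(t):
    return expm(t * A)

s, t = 0.7, -1.3
assert np.allclose(gamma(s + t), gamma(s) @ gamma(t))   # (6.4.111)
assert np.allclose(gamma(-s), np.linalg.inv(gamma(s)))  # (6.4.115)
```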

Proposition 6.4.19 implies that if $A \in \mathfrak{g}$, then

(6.4.118) $\gamma_A(t) = e^{tA}$

is a constant-speed geodesic through $I$. In other words, when (6.4.106) holds, the exponential map

(6.4.119) $\mathrm{Exp}_I : T_I G \to G$,

defined in §6.1, coincides with the matrix exponential

(6.4.120) $\mathrm{Exp} : \mathfrak{g} \to G$

used in this section. We also get that $\mathrm{Exp}_I$ in (6.4.119) is defined on all of $T_I G$.

Given $A \in \mathfrak{g}$, the associated left-invariant vector field $X = X_A$ generates the flow

(6.4.121) $\mathcal{F}^t_X(g) = g e^{tA} = L_g(e^{tA})$,

which for each $g \in G$ is a geodesic on $G$, since $L_g : G \to G$ is an isometry. In other words, each integral curve of $X$ is a constant-speed geodesic. Recalling the geodesic equation from §6.1, we have the following.

Proposition 6.4.20. If G satisfies (6.4.106), then each left-invariant vector field X on G satisfies

(6.4.122) ∇X X = 0 on G. If also Y is a left-invariant vector field on G, so is X + Y , and (6.4.122) applies in these cases. Expanding

(6.4.123) ∇X+Y (X + Y ) = 0 then yields

(6.4.124) ∇X Y + ∇Y X = 0. Since

(6.4.125) $\nabla_X Y - \nabla_Y X = [X, Y]$,

we obtain the following.

Proposition 6.4.21. If $G$ satisfies (6.4.106) and $X$ and $Y$ are left-invariant vector fields on $G$, then

(6.4.126) $\nabla_X Y = \frac{1}{2}[X, Y]$.

As seen in §6.2, the Riemann curvature $R$ is given by

(6.4.127) $R(X,Y)Z = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]}Z$.

We hence obtain the following formula.

Proposition 6.4.22. If $G$ satisfies (6.4.106) and $X, Y, Z$ are left-invariant vector fields on $G$, then

(6.4.128) $R(X,Y)Z = -\frac{1}{4}[[X,Y],Z]$.

Lie groups and Lie algebras

The matrix groups we have studied in this section fall into a broader category of objects called Lie groups. Generally, a Lie group is a smooth manifold G that also has the structure of a group. That is, there is a map µ : G×G → G, called multiplication, µ(g1, g2) = g1g2, that has the following properties. First, the associative law,

(6.4.129) $(g_1g_2)g_3 = g_1(g_2g_3)$, $\forall\, g_j \in G$;

next, the existence of an identity element $e \in G$ such that

(6.4.130) $ge = eg = g$, $\forall\, g \in G$;

and third, that each $g \in G$ has an inverse $g^{-1} = \mathrm{Inv}(g) \in G$, satisfying

(6.4.131) $gg^{-1} = g^{-1}g = e$, $\forall\, g \in G$.

Furthermore, we require that

(6.4.132) $\mu : G \times G \to G$ and $\mathrm{Inv} : G \to G$ are smooth.

Such groups arise as sets of symmetries of various structures. An example is $E(n)$, the group of isometric mappings of $\mathbb{R}^n$ onto itself. A typical element of $E(n)$ has the form

(6.4.133) $T_1(v) = A_1 v + w_1$, $A_1 \in O(n)$, $w_1 \in \mathbb{R}^n$.

If also $T_2(v) = A_2 v + w_2$, we have

(6.4.134) $T_1T_2(v) = A_1(T_2 v) + w_1 = A_1(A_2 v + w_2) + w_1 = A_1A_2 v + A_1 w_2 + w_1$.

Hence, as a smooth manifold,

(6.4.135) $E(n) = O(n) \times \mathbb{R}^n$,

and the group operation is given by

(6.4.136) $(A_1, w_1) \cdot (A_2, w_2) = (A_1A_2,\ w_1 + A_1w_2)$.

While $E(n)$ is not defined as a matrix group, it is nevertheless isomorphic to a matrix group, namely

(6.4.137) $\Big\{\begin{pmatrix} A & w \\ 0 & 1 \end{pmatrix} : A \in O(n),\ w \in \mathbb{R}^n\Big\}$,

as one sees from the computation

(6.4.138) $\begin{pmatrix} A_1 & w_1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} A_2 & w_2 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} A_1A_2 & A_1w_2 + w_1 \\ 0 & 1 \end{pmatrix}$.

There exist Lie groups that are not isomorphic to matrix groups, though it turns out that each Lie group is “locally isomorphic” to a matrix group. For an abstract Lie group $G$, we define $\mathfrak{g}$ to consist of the set of left-invariant vector fields on $G$. This is a linear space, and if $X_1, X_2 \in \mathfrak{g}$, then the Lie bracket $[X_1, X_2]$ also belongs to $\mathfrak{g}$, and this gives $\mathfrak{g}$ the structure of a Lie algebra. Restriction to the identity element $e \in G$ gives a linear isomorphism
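The embedding (6.4.137) of $E(n)$ and the computation (6.4.138) can be checked numerically. A sketch, assuming `numpy` (the rotation, reflection, and translation data are arbitrary; `embed` is a hypothetical helper name):

```python
import numpy as np

def embed(A, w):
    """Represent the rigid motion v ↦ Av + w as an (n+1)×(n+1) matrix, as in (6.4.137)."""
    n = len(w)
    M = np.eye(n + 1)
    M[:n, :n] = A
    M[:n, n] = w
    return M

theta = 0.8
A1 = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
w1 = np.array([1.0, 2.0])
A2 = np.array([[0.0, 1.0], [1.0, 0.0]])  # a reflection, also in O(2)
w2 = np.array([-3.0, 0.5])

# the matrix product realizes the group law (A1, w1)·(A2, w2) = (A1 A2, w1 + A1 w2)
prod = embed(A1, w1) @ embed(A2, w2)
assert np.allclose(prod, embed(A1 @ A2, w1 + A1 @ w2))

# and it acts correctly on points: T1(T2(v)) = A1 A2 v + A1 w2 + w1, as in (6.4.134)
v = np.array([0.3, -0.7])
assert np.allclose((prod @ np.append(v, 1.0))[:2], A1 @ (A2 @ v + w2) + w1)
```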

(6.4.139) $\mathfrak{g} \approx T_e G$.

The exponential map

(6.4.140) $\mathrm{Exp} : \mathfrak{g} \to G$

is given by

(6.4.141) $\mathrm{Exp}(tX) = \mathcal{F}^t_X(e)$,

where $\mathcal{F}^t_X$ is the flow generated by $X$.

There are a number of books that develop the theory of Lie groups, such as [10] and [49]. The reader who has gotten this far should be in a good position to pursue this topic in such texts.

Exercises

In Exercises 1–7, let G be a compact, smooth matrix group, endowed with Haar measure, as in (6.4.84).

1. Show that, for all $f \in C(G)$,

$\int_G f(g^{-1})\,dg = \int_G f(g)\,dg$.

G G 2. Let V be a finite-dimensional inner product space and assume π : G → L(V ) is a continuous representation that is unitary, i.e. (π(g)u, v) = (u, π(g−1)v), ∀ u, v ∈ V, g ∈ G. (We say π is a unitary representation of G on V .) Show that ∫ P = π(g) dg

G is the orthogonal projection of V onto

V0 = {v ∈ V : π(g)v = v, ∀ g ∈ G}. Hint. By Proposition 6.4.17, one need only show P = P ∗. Note that ∫ ∫ P ∗ = π(g)∗ dg = π(g−1) dg,

G G and use Exercise 1.

3. Let $\pi$ and $\lambda$ be unitary representations of $G$ on finite-dimensional inner product spaces $V$ and $W$, respectively. Give $\mathcal{L}(W, V)$ the inner product $(A, B) = \mathrm{Tr}\,B^*A$, and define a unitary representation $\nu$ of $G$ on $\mathcal{L}(W, V)$ by

$\nu(g)A = \pi(g)A\lambda(g^{-1})$.

Define $Q_{\pi\lambda} : \mathcal{L}(W, V) \to \mathcal{L}(W, V)$ by

$Q_{\pi\lambda}A = \int_G \nu(g)A\,dg = \int_G \pi(g)A\lambda(g^{-1})\,dg$.

Show that $Q_{\pi\lambda}$ is the orthogonal projection of $\mathcal{L}(W, V)$ onto

$I(\pi, \lambda) = \{A \in \mathcal{L}(W, V) : \pi(g)A = A\lambda(g),\ \forall\, g \in G\}$.

4. Assume $\pi$ and $\lambda$ are irreducible unitary representations, i.e., $V$ and $W$ have no proper linear subspaces invariant under the action of $G$. Show that if $A \in I(\pi, \lambda)$ is not zero, then $A : W \to V$ must be an isomorphism, i.e., $\mathcal{N}(A) = 0$ and $\mathcal{R}(A) = V$.

Hint. $\mathcal{N}(A)$ is invariant under $\lambda$ and $\mathcal{R}(A)$ is invariant under $\pi$.

5. In the setting of Exercise 4, suppose $A \in I(\pi, \lambda)$ is an isomorphism of $W$ onto $V$, and consider

$T = A^*A : W \to W$.

Show that $T = T^*$ and that

$T\lambda(g) = \lambda(g)T$, $\forall\, g \in G$.

Show that this forces

$T = \alpha I$ on $W$, for some $\alpha > 0$.

Hint. Show that each eigenspace of $T$ is invariant under $\lambda(g)$, for all $g \in G$.

When this holds, show that

$A_0 = \alpha^{-1/2}A : W \to V$

is unitary (i.e., preserves inner products) and

$\pi(g) = A_0\lambda(g)A_0^{-1}$, $\forall\, g \in G$.

We say $\pi$ and $\lambda$ are unitarily equivalent.

6. Deduce from the exercises above that, if $\pi$ and $\lambda$ are irreducible unitary representations of $G$, then

$\pi$ and $\lambda$ are not unitarily equivalent $\iff I(\pi, \lambda) = 0 \iff Q_{\pi\lambda} = 0$,

and $I(\pi, \pi) = \mathrm{Span}(I)$, hence

$Q_{\pi\pi}A = (\dim V)^{-1}(\mathrm{Tr}\,A)I$.

7. Take $\pi$ and $\lambda$, irreducible unitary representations of $G$, as in Exercise 4. Take orthonormal bases of $V$ and $W$, so $\pi(g)$ and $\lambda(g)$ have matrix representations $\pi(g)_{jk}$ and $\lambda(g)_{m\ell}$. Deduce from Exercise 6 that

$\int_G \pi(g)_{jk}\overline{\pi(g)_{m\ell}}\,dg = (\dim V)^{-1}\delta_{jm}\delta_{k\ell}$,

while, if $\pi$ is not unitarily equivalent to $\lambda$, then

$\int_G \pi(g)_{jk}\overline{\lambda(g)_{m\ell}}\,dg \equiv 0$.

These identities are known as the Weyl orthogonality relations. Here is a restatement.

Proposition 6.4.23. Let $G$ be a compact, smooth matrix group, and $\{\pi_\alpha : \alpha \in \mathcal{A}\}$ a set of mutually inequivalent, irreducible unitary representations of $G$ on spaces $V_\alpha$, of dimension $d_\alpha$. Pick orthonormal bases of $V_\alpha$, so that $\pi_\alpha(g)$ has the matrix representation $(\pi_\alpha(g)_{jk})$. Then

(6.4.142) $\mathcal{F} = \{d_\alpha^{1/2}\pi_\alpha(g)_{jk} : \alpha \in \mathcal{A},\ 1 \le j, k \le d_\alpha\}$

is an orthonormal set of functions on $G$.

In Exercises 8–10, we retain the hypothesis that G is a compact, smooth matrix group. Pick A such that each irreducible, unitary representation of G is equivalent to πα for some α ∈ A.

8. Let $\pi$ be a unitary representation of $G$ on a finite-dimensional inner product space $V$. Show that if $V_1 \subset V$ is invariant under $\pi(g)$ for all $g$, so is its orthogonal complement $V_1^\perp$. Deduce that, if $\pi$ is not irreducible, there exists an orthogonal decomposition

V = V1 ⊕ · · · ⊕ VK such that π acts irreducibly on each factor Vj. Hint. Induction on dimension.

9. Let $\pi$ and $\lambda$ be irreducible unitary representations of $G$, with matrix entries $\pi(g)_{jk}$ and $\lambda(g)_{m\ell}$. Apply Exercise 8 to the representation of $G$ introduced in Exercise 3 to show that

$\pi(g)_{jk}\lambda(g)_{m\ell} \in \mathrm{Span}\,\mathcal{F}$.

10. Note that if $(\pi(g)_{jk})$ is a unitary representation of $G$, so is $(\overline{\pi(g)_{jk}})$, so $\mathrm{Span}\,\mathcal{F}$ is closed under complex conjugation. Then we can deduce from Exercise 9 that $\mathrm{Span}\,\mathcal{F}$ is an algebra. Use the Stone-Weierstrass theorem to establish the following complement to Proposition 6.4.23.

Proposition 6.4.24. The orthonormal set $\mathcal{F}$ in (6.4.142) has the property that $\mathrm{Span}\,\mathcal{F}$ is dense in $C(G)$.

To set up the next few exercises, for $g \in G$, define

$C_g : G \to G$, $C_g(x) = gxg^{-1} = L_gR_{g^{-1}}(x)$.

Note that $C_g(I) = I$ and, with $\mathfrak{g} = T_I G$,

$DC_g(I) : \mathfrak{g} \to \mathfrak{g}$

is given by

$DC_g(I)A = gAg^{-1}$, $A \in \mathfrak{g}$.

11. Let Ad(g)A = gAg−1. Show that Ad is a representation of G on g. Show that the derived repre- sentation of g on g, arising as in (6.4.45), is given by ad(A)B = [A, B], A, B ∈ g.

12. Give g a positive-definite inner product ⟨ , ⟩, and use {Lg} to extend this to a left-invariant metric tensor on G. Show that this metric tensor is also right invariant (hence bi-invariant) if and only if, for each g ∈ G, Ad(g): g → g preserves this inner product, i.e., ⟨Ad(g)A, Ad(g)B⟩ = ⟨A, B⟩, ∀ A, B ∈ g, g ∈ G. Show that, if this holds, ⟨ad(C)A, B⟩ = −⟨A, ad(C)B⟩, ∀ A, B, C ∈ g.

13. Assume $G$ has a bi-invariant metric tensor. Using Proposition 6.4.22 and Exercise 12, show that, if $X, Y, Z$, and $W$ are left-invariant vector fields on $G$, then

$\langle R(X,Y)Z, W\rangle = -\frac{1}{4}\langle [X,Y], [Z,W]\rangle$.

In particular,

$\langle R(X,Y)Y, X\rangle = \frac{1}{4}\|[X,Y]\|^2$.

6.5. The derivative of the exponential map

Here we look at formulas for the derivative of two types of exponential maps, first the matrix exponential

(6.5.1) $\mathrm{Exp} : M(n,\mathbb{C}) \to M(n,\mathbb{C})$,

introduced in §2.3 and studied further in §6.4, and then the exponential map

(6.5.2) $\mathrm{Exp}_p : \mathcal{U} \to M$,

introduced in §6.1, where $M$ is a smooth surface in $\mathbb{R}^k$, or more generally a smooth Riemannian manifold, $p \in M$, and $\mathcal{U}$ is a neighborhood of $0$ in the tangent space $T_pM$. We also investigate where $D\,\mathrm{Exp}$ is invertible, and draw conclusions from this. We start with the matrix exponential

(6.5.3) $\mathrm{Exp}(tA) = e^{tA} = \sum_{k=0}^{\infty} \frac{t^k}{k!}A^k$, $A \in M(n,\mathbb{C})$,

introduced in (2.3.115), satisfying

(6.5.4) $\frac{d}{dt}e^{tA} = Ae^{tA}$, $\mathrm{Exp}(0) = I$.

As noted there, $\mathrm{Exp}$ is a $C^\infty$ map. We will derive a formula for

(6.5.5) $D\,\mathrm{Exp}(A)B = \frac{d}{ds}e^{A+sB}\Big|_{s=0}$, $B \in M(n,\mathbb{C})$.

To get this, it is convenient to bring in the matrix-valued function

(6.5.6) $U(s,t) = e^{t(A+sB)}$.

Note that

(6.5.7) $D\,\mathrm{Exp}(A)B = V(1)$, where $V(t) = \partial_s U(s,t)\big|_{s=0}$.

We will derive a differential equation for $V(t)$. To start, (6.5.6) yields

(6.5.8) ∂tU(s, t) = (A + sB)U(s, t), and applying ∂s to this gives

(6.5.9) ∂s∂tU = (A + sB)∂sU + BU.

Since $\partial_s\partial_t U = \partial_t\partial_s U$, we obtain

(6.5.10) $V'(t) = AV(t) + BU(0,t) = AV(t) + Be^{tA}$.

Note that $V(0) = 0$. Thus Duhamel’s formula (2.3.126) yields

(6.5.11) $V(t) = \int_0^t e^{(t-s)A}Be^{sA}\,ds$.

Consequently,

(6.5.12) $D\,\mathrm{Exp}(A)B = e^A\int_0^1 e^{-sA}Be^{sA}\,ds$.

Here is a convenient alternative formula. Define

(6.5.13) $\mathcal{A} : M(n,\mathbb{C}) \to M(n,\mathbb{C})$, $\mathcal{A}(B) = [A, B] = AB - BA$.

Then

(6.5.14) $e^{-sA}Be^{sA} = e^{-s\mathcal{A}}B$.
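Formula (6.5.12) can be checked numerically by approximating the integral with a quadrature rule and comparing against a finite difference of $s \mapsto e^{A+sB}$. A sketch, assuming `numpy`/`scipy` (arbitrary test matrices, scaled down to keep quadrature error small; the midpoint rule and step sizes are ad hoc choices):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(5)
A = 0.5 * rng.standard_normal((3, 3))
B = 0.5 * rng.standard_normal((3, 3))

# right side of (6.5.12): e^A ∫_0^1 e^{-sA} B e^{sA} ds, via the midpoint rule
N = 1000
nodes = (np.arange(N) + 0.5) / N
integral = sum(expm(-s * A) @ B @ expm(s * A) for s in nodes) / N
dexp = expm(A) @ integral

# left side of (6.5.5): centered difference of s ↦ e^{A + sB} at s = 0
h = 1e-5
fd = (expm(A + h * B) - expm(A - h * B)) / (2 * h)

assert np.allclose(dexp, fd, atol=1e-4)
```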

In fact, the two sides, $Y_1(s)$ and $Y_2(s)$, both satisfy the same differential equation,

(6.5.15) $Y_j'(s) = -\mathcal{A}Y_j(s) = Y_j(s)A - AY_j(s)$, $Y_j(0) = B$.

Thus

(6.5.16) $\int_0^1 e^{-sA}Be^{sA}\,ds = \int_0^1 e^{-s\mathcal{A}}B\,ds = \Phi(\mathcal{A})B$,

where

(6.5.17) $\Phi(z) = \frac{1 - e^{-z}}{z} = \sum_{k=1}^{\infty}\frac{1}{k!}(-z)^{k-1}$, $\Phi(\mathcal{A}) = \sum_{k=1}^{\infty}\frac{1}{k!}(-\mathcal{A})^{k-1}$.

In this setting, we can rewrite (6.5.12) as

(6.5.18) $D\,\mathrm{Exp}(A)B = e^A\Phi(\mathcal{A})B$.

Thus, given $A \in M(n,\mathbb{C})$,

(6.5.19) $D\,\mathrm{Exp}(A)$ is invertible on $M(n,\mathbb{C}) \iff \Phi(\mathcal{A})$ is invertible on $M(n,\mathbb{C})$.

We will now consider the spectrum of $\Phi(\mathcal{A})$. Given a linear transformation $T$ on a finite-dimensional vector space $X$, $\mathrm{Spec}\,T$ is the set of eigenvalues of $T$. Thus $T$ is invertible on $X$ if and only if $0 \notin \mathrm{Spec}\,T$. The following is a version of the spectral mapping theorem.

Proposition 6.5.1. Let $\Phi(z) = \sum_{k\ge 0} a_kz^k$ converge for all $z \in \mathbb{C}$. Let $\mathcal{A} : X \to X$ be a linear transformation on a finite-dimensional complex vector space $X$. Set $\Phi(\mathcal{A}) = \sum_{k\ge 0} a_k\mathcal{A}^k$. Then

(6.5.20) $\mathrm{Spec}\,\Phi(\mathcal{A}) = \{\Phi(\lambda_j) : \lambda_j \in \mathrm{Spec}\,\mathcal{A}\}$.

A proof of this result is given in §6.6.

In case $\Phi$ is given by (6.5.17), we deduce that

(6.5.21) $\Phi(\mathcal{A})$ is invertible on $M(n,\mathbb{C}) \iff \mathrm{Spec}\,\mathcal{A}$ is disjoint from $\{2\pi ik : k \in \mathbb{Z}\setminus 0\}$.

Furthermore, if $\mathcal{A}$ is related to $A$ as in (6.5.13),

(6.5.22) $\mathrm{Spec}\,\mathcal{A} = \{\lambda - \mu : \lambda, \mu \in \mathrm{Spec}\,A\}$.

Given these results, we have the following conclusion.

Proposition 6.5.2. Given $\mathrm{Exp}$ as in (6.5.3) and $A \in M(n,\mathbb{C})$,

(6.5.23) $D\,\mathrm{Exp}(A)$ is invertible on $M(n,\mathbb{C}) \iff$ no two eigenvalues of $A/2\pi i$ differ by a nonzero integer.
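The criterion of Proposition 6.5.2 can be illustrated numerically: for $A = \mathrm{diag}(\pi i, -\pi i)$, the eigenvalues differ by $2\pi i$, so $D\,\mathrm{Exp}(A)$ should be singular, while a generic diagonal $A$ gives an invertible derivative. The sketch below (assuming `numpy`/`scipy`; `dexp_matrix` is a hypothetical helper that assembles the matrix of $B \mapsto D\,\mathrm{Exp}(A)B$ via finite differences) checks this through singular values:

```python
import numpy as np
from scipy.linalg import expm

def dexp_matrix(A, h=1e-6):
    """Matrix of the C-linear map B ↦ D Exp(A)B on M(n, C), column by column,
    using centered differences (e^{A+hB} - e^{A-hB})/(2h) on basis matrices."""
    n = A.shape[0]
    cols = []
    for idx in range(n * n):
        B = np.zeros((n, n), dtype=complex)
        B.flat[idx] = 1.0
        cols.append(((expm(A + h * B) - expm(A - h * B)) / (2 * h)).flatten())
    return np.array(cols).T

# eigenvalues πi and -πi differ by 2πi: D Exp(A) is singular
A = np.diag([np.pi * 1j, -np.pi * 1j])
assert min(np.linalg.svd(dexp_matrix(A))[1]) < 1e-4

# generic A: no eigenvalue gap in 2πi(Z \ 0), so D Exp(A) is invertible
A2 = np.diag([0.3 + 0.2j, -0.1 - 0.5j])
assert min(np.linalg.svd(dexp_matrix(A2))[1]) > 1e-2
```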

Remark. It is also of interest to consider

(6.5.24) $\mathrm{Exp} : M(n,\mathbb{R}) \to M(n,\mathbb{R})$.

Of course, the formulas (6.5.12) and (6.5.18) for $D\,\mathrm{Exp}(A) : M(n,\mathbb{R}) \to M(n,\mathbb{R})$, given $A \in M(n,\mathbb{R})$, work without change. Furthermore, given a linear transformation $T : \mathbb{R}^m \to \mathbb{R}^m$ ($m = n^2$), extended to a linear transformation $T : \mathbb{C}^m \to \mathbb{C}^m$, it is clear that $T$ is invertible on $\mathbb{R}^m$ if and only if it is invertible on $\mathbb{C}^m$. Hence we have the following immediate consequence of Proposition 6.5.2.

Corollary 6.5.3. Given $\mathrm{Exp}$ as in (6.5.24) and $A \in M(n,\mathbb{R})$, $D\,\mathrm{Exp}(A)$ is invertible on $M(n,\mathbb{R})$ if and only if no two eigenvalues of $A$ differ by $2\pi i$ times a nonzero integer.

We next look at the second type of exponential map, (6.5.2). Let $M$ be a smooth Riemannian manifold, $p \in M$. As seen in §6.1, there is a neighborhood $\mathcal{U}$ of $0$ in $T_pM$ on which we have a smooth map

(6.5.25) $\mathrm{Exp}_p : \mathcal{U} \to M$,

satisfying the condition that, for $v \in \mathcal{U}$,

(6.5.26) $\mathrm{Exp}_p(tv) = \gamma(t)$, $0 \le t \le 1$,

is a geodesic of constant speed $\|v\|$, satisfying

(6.5.27) $\gamma(0) = p$, $\gamma'(0) = v$.

We desire to analyze

(6.5.28) $D\,\mathrm{Exp}_p(v)w = \frac{d}{ds}\mathrm{Exp}_p(v + sw)\Big|_{s=0}$,

given v ∈ U, w ∈ TpM. To do this, we assume v ≠ 0 and bring in the family of curves

(6.5.29) $\gamma_s(t) = \mathrm{Exp}_p(t(v + sw))$,

defined for $t \in [0,1]$ and $|s| \le \delta$, a sufficiently small positive quantity. This is a family of geodesics on $M$, of speed $\|v + sw\|$, with tangent

(6.5.30) $T_s(t) = \gamma_s'(t)$, satisfying $\nabla_{T_s}T_s = 0$.

We also bring in the vector field

(6.5.31) $W_s(\gamma_s(t)) = \partial_s\gamma_s(t) \in T_{\gamma_s(t)}M$.

In rough analogy to (6.5.8)–(6.5.9), we apply $\nabla_{W_s}$ to the geodesic equation in (6.5.30), obtaining

(6.5.32) $0 = \nabla_{W_s}(\nabla_{T_s}T_s) = \nabla_{T_s}\nabla_{W_s}T_s + \nabla_{[W_s,T_s]}T_s + R(W_s,T_s)T_s$,

the latter identity using the characterization (6.2.57) of the Riemann curvature $R$ of $M$. It is useful to note that

(6.5.33) $[W_s, T_s] = 0$.

Furthermore, the identity (6.1.61) for the Levi-Civita covariant derivative implies

(6.5.34) $\nabla_{W_s}T_s = \nabla_{T_s}W_s + [W_s, T_s] = \nabla_{T_s}W_s$.

Hence (6.5.32) yields the equation

(6.5.35) $\nabla_{T_s}\nabla_{T_s}W_s + R(W_s,T_s)T_s = 0$.

We specialize to $s = 0$, obtaining

(6.5.36) $\nabla_T\nabla_T W + R(W,T)T = 0$,

where

(6.5.37) $T(t) = \gamma'(t)$, $\gamma(t) = \mathrm{Exp}_p(tv)$, $W = W_0 = \partial_s\gamma_s(t)\big|_{s=0}$.

Note that $\|T(t)\| = \|v\|$. The equation (6.5.36) is called the Jacobi variational equation for the geodesic flow. It has the form of a second-order ODE for

(6.5.38) $W : [0,1] \to TM$, $W(t) \in T_{\gamma(t)}M$.

It is appropriate to impose the following initial condition at $t = 0$:

(6.5.39) W (0) = 0, ∇T W (0) = w.

In fact, close enough to $p$ one can use $\mathrm{Exp}_p$ as an exponential coordinate system, in which $W(t) = tw$, and this leads to (6.5.39). We denote the solution to (6.5.36)–(6.5.39) by

(6.5.40) $W(t) = J_{p,v,w}(t)$, $p \in M$, $v, w \in T_pM$, $v \ne 0$.

Then

(6.5.41) $D\,\mathrm{Exp}_p(v) : T_pM \to T_{\mathrm{Exp}_p(v)}M$

is given by

(6.5.42) $D\,\mathrm{Exp}_p(v)w = J_{p,v,w}(1)$.

A solution to (6.5.36) is called a Jacobi field along the geodesic $\gamma$. The computation (6.5.42) has the following consequence.

Proposition 6.5.4. Given $\mathrm{Exp}_p$ as in (6.5.25)–(6.5.27), and $v \in \mathcal{U} \subset T_pM$, $v \ne 0$, $D\,\mathrm{Exp}_p(v)$ in (6.5.41) is not invertible if and only if there exists a Jacobi field $J(t)$ along $\gamma(t) = \mathrm{Exp}_p(tv)$ such that $J(0) = 0$ and $J(1) = 0$. The null space of $D\,\mathrm{Exp}_p(v)$ is given by

(6.5.43) $\{w \in T_pM : J_{p,v,w}(1) = 0\}$.

Remark. In (6.5.29)–(6.5.42) we have assumed $v \ne 0$. From these formulas one can show that

(6.5.44) $\lim_{v\to 0} J_{p,v,w}(1) = w$,

consistent with the formula

(6.5.45) $D\,\mathrm{Exp}_p(0)w = w$,

which also follows immediately from (6.5.26)–(6.5.27), as observed below (6.1.75).

We say two points $p$ and $q$ on a geodesic $\gamma$ on $M$ are conjugate if there exists a (not identically zero) Jacobi field $J$ along $\gamma$ that vanishes at $p$ and $q$. By Proposition 6.5.4, an equivalent condition is that $D\,\mathrm{Exp}_p$ is singular at $v \in T_pM$ if $\gamma(t) = \mathrm{Exp}_p(tv)$, $p = \gamma(0)$, and $q = \gamma(1)$. A certain negative curvature condition can guarantee that there are no conjugate points.

Proposition 6.5.5. Suppose that for all vector fields $X$ and $Y$ on $M$,

(6.5.46) $\langle R(X,Y)Y, X\rangle \le 0$.

Then no two points of $M$ are conjugate along any geodesic.

Proof. Let $\gamma$ be a constant-speed geodesic, $p = \gamma(0)$, $q = \gamma(b)$. Suppose $J$ is a Jacobi field along $\gamma$ satisfying $J(p) = 0$. Let $T = \gamma'$. A computation gives

(6.5.47) $\frac{1}{2}T^2\langle J, J\rangle = T\langle\nabla_T J, J\rangle = \langle\nabla_T\nabla_T J, J\rangle + \langle\nabla_T J, \nabla_T J\rangle = \langle\nabla_T J, \nabla_T J\rangle - \langle R(J,T)T, J\rangle$,

the last identity by (6.5.36). The hypothesis (6.5.46) then gives

(6.5.48) $\frac{1}{2}T^2\langle J, J\rangle \ge \|\nabla_T J\|^2 \ge 0$.

If $J$ is not identically zero along $\gamma([0,b])$, then $\nabla_T J$ must be nonzero at some point $\gamma(\xi)$, $\xi \in (0,b)$. Since $J(\gamma(0)) = 0$, (6.5.48) forces $\|J(\gamma(b))\|^2 > 0$. □

For example, if $M$ is a 2D Riemannian manifold with Gauss curvature $K(x) \le 0$ for all $x \in M$, then there are no conjugate points. In contrast, for positive curvature, there is the following result.

Proposition 6.5.6. Let $M$ be a 2D Riemannian manifold. Assume its Gauss curvature $K$ satisfies

(6.5.49) $K(x) \ge \kappa > 0$, $\forall\, x \in M$.

Then any geodesic on $M$ of length $> \pi/\sqrt{\kappa}$ has conjugate points.

We refer to [8] for a proof of this, and to [9] for a higher dimensional gen- eralization. We state a few other results about Jacobi fields and conjugate points, referring to [9] for proofs. We mention that the proof of Proposition 6.5.6 makes use of the second variational formula (6.2.110), as does the proof of the next result. Proposition 6.5.7. If γ is a unit speed geodesic on a Riemannian manifold M, and p = γ(a) and q = γ(b) are conjugate along γ (b > a), then (6.5.50) d(p, γ(t)) < t − a, for t > b.

To formulate the next result, we say a Riemannian manifold M is complete if Exp_p is defined on all of T_pM, for each p ∈ M.

Proposition 6.5.8. Assume M is a complete Riemannian manifold of dimension 2. If (6.5.49) holds, then M is compact and

(6.5.51) d(p, q) ≤ π/√κ, ∀ p, q ∈ M.

Note that a sphere of radius R = 1/√κ in R³, whose Gauss curvature is ≡ κ, illustrates the sharpness of Propositions 6.5.6 and 6.5.8. The proof of Proposition 6.5.8 uses Propositions 6.5.6 and 6.5.7. Higher dimensional extensions can also be found in [9].
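The constant-curvature picture behind Propositions 6.5.5, 6.5.6, and 6.5.8 can be watched numerically. On a surface, the normal component of a Jacobi field along a unit speed geodesic satisfies the scalar Jacobi equation j″ + Kj = 0 (a standard reduction, not carried out in the text here). The sketch below, which assumes nothing beyond the Python standard library and is only an illustration, integrates this equation with an RK4 step for constant K ≡ κ and locates the first conjugate point, which should sit at t = π/√κ when κ > 0 and not exist when κ < 0.

```python
import math

def first_conjugate_point(kappa, h=1e-4, tmax=10.0):
    """Integrate j'' + kappa*j = 0, j(0) = 0, j'(0) = 1, by RK4; return the
    first t > 0 with j(t) = 0 (a conjugate point), or None if none occurs
    before tmax."""
    def deriv(j, v):
        return v, -kappa * j
    t, j, v = 0.0, 0.0, 1.0
    while t < tmax:
        k1j, k1v = deriv(j, v)
        k2j, k2v = deriv(j + 0.5 * h * k1j, v + 0.5 * h * k1v)
        k3j, k3v = deriv(j + 0.5 * h * k2j, v + 0.5 * h * k2v)
        k4j, k4v = deriv(j + h * k3j, v + h * k3v)
        jn = j + h / 6 * (k1j + 2 * k2j + 2 * k3j + k4j)
        vn = v + h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += h
        if jn < 0.0 <= j:          # sign change: j just vanished
            return t
        j, v = jn, vn
    return None

# Constant curvature κ = 4: first conjugate point at π/√κ = π/2.
assert abs(first_conjugate_point(4.0) - math.pi / 2) < 1e-3
# Negative curvature: j(t) = sinh(t) never vanishes again (Proposition 6.5.5).
assert first_conjugate_point(-1.0) is None
```

The κ = 4 case matches the sphere of radius 1/2 mentioned above, on which antipodal points are conjugate at distance π/2.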

6.6. A spectral mapping theorem

Let A ∈ M(n, C), and let Φ(z) be given by

(6.6.1) Φ(z) = ∑_{k=0}^∞ c_k z^k,

a power series that is convergent for all z ∈ C. We define Φ(A) ∈ M(n, C) by

(6.6.2) Φ(A) = ∑_{k=0}^∞ c_k A^k.

Convergence of this series follows by estimates of the form (2.1.18), which hold for complex as well as real matrices. We want to describe Spec Φ(A) in terms of Spec A, where Spec A denotes the set of eigenvalues of A. We prove the following, which plays a role in the proof of Proposition 6.5.2.

Proposition 6.6.1. For A ∈ M(n, C) and Φ(A) defined above,

(6.6.3) Spec Φ(A) = {Φ(λ_j) : λ_j ∈ Spec A}.
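Before turning to the proof, the statement can be checked numerically. The sketch below (an illustration only, assuming NumPy is available) evaluates Φ(A) by a truncated version of the power series (6.6.2) with Φ(z) = e^z, for a matrix A that is deliberately not diagonalizable, and compares Spec Φ(A) with {e^λ : λ ∈ Spec A}, as (6.6.3) predicts.

```python
import math
import numpy as np

def phi_of_matrix(A, coeffs):
    """Evaluate Phi(A) = sum_k c_k A^k for a truncated power series (6.6.2)."""
    result = np.zeros_like(A, dtype=complex)
    power = np.eye(A.shape[0], dtype=complex)
    for c in coeffs:
        result += c * power
        power = power @ A
    return result

# Phi(z) = e^z; 30 terms put the truncation error far below machine precision
# for this A.
coeffs = [1.0 / math.factorial(k) for k in range(30)]

# A is NOT diagonalizable: a 2x2 Jordan block for eigenvalue 2, plus eigenvalue 5.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])

spec_phi = np.sort(np.linalg.eigvals(phi_of_matrix(A, coeffs)).real)
expected = np.sort(np.exp(np.linalg.eigvals(A)).real)
assert np.allclose(spec_phi, expected)   # Spec e^A = {e^2, e^2, e^5}
```

Since A has a nontrivial Jordan block, this example exercises exactly the generalized-eigenspace argument developed below, not just the easy diagonalizable case.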

The proof will make use of some basic results of linear algebra, discussed in Appendix A.3, and treated in more detail in Chapter 2 of [45] and in §§6–7 of [47]. To start, let us note that (6.6.3) is quite easy to prove when A is diagonalizable, i.e., C^n has a basis {u_j : 1 ≤ j ≤ n} of eigenvectors of A, satisfying

(6.6.4) Au_j = λ_j u_j, 1 ≤ j ≤ n.

In this case, A^k u_j = λ_j^k u_j, so (6.6.2) gives

(6.6.5) Φ(A)u_j = ∑_{k=0}^∞ c_k λ_j^k u_j = Φ(λ_j)u_j.

This immediately gives (6.6.3). The proof of (6.6.3) requires more work when A is not diagonalizable. In that case, for each λ_j ∈ Spec A, define the “generalized eigenspace”

(6.6.6) GE(A, λ_j) = {v ∈ C^n : (A − λ_j I)^ν v = 0 for some ν}.

It is a general result (proved in [45] and [47]) that each v ∈ C^n has a unique decomposition

(6.6.7) v = ∑_j v_j, v_j ∈ GE(A, λ_j), λ_j ∈ Spec A.

We also write

(6.6.8) C^n = ⊕_{λ∈Spec A} GE(A, λ).

Similarly,

(6.6.9) C^n = ⊕_{µ∈Spec Φ(A)} GE(Φ(A), µ).

In light of this, (6.6.3) is a consequence of the following.

Lemma 6.6.2. In the setting of Proposition 6.6.1,

(6.6.10) GE(A, λ_j) ⊂ GE(Φ(A), Φ(λ_j)).

Proof. If we set Ψ(z) = Φ(z + λ_j) − Φ(λ_j), so Ψ(0) = 0, and set B = A − λ_j I, it suffices to show that

(6.6.11) GE(B, 0) ⊂ GE(Ψ(B), 0).

Now B is nilpotent on GE(B, 0), i.e., for some ν ∈ N, B^ν = 0 on GE(B, 0). Noting that

(6.6.12) Ψ(z) = zF(z), so Ψ(B) = BF(B),

where F(z) is a convergent power series, we see that (since B and F(B) commute)

(6.6.13) B^ν = 0 on GE(B, 0) =⇒ Ψ(B)^ν = 0 on GE(B, 0),

which gives (6.6.11) and completes the proof of Lemma 6.6.2. □

Chapter 7

Fourier analysis

This chapter is devoted to Fourier series and the Fourier transform, in several variables. We begin in §7.1 with Fourier series. Given f ∈ R(T^n) (where T^n is the n-dimensional torus R^n/2πZ^n), or more generally f ∈ R#(T^n), we define its Fourier coefficients

(7.0.1) fˆ(k) = (2π)^{-n} ∫_{T^n} f(θ)e^{-ik·θ} dθ,

where k = (k₁, . . . , k_n) ∈ Z^n. A key result is the Fourier inversion formula,

(7.0.2) f(θ) = ∑_{k∈Z^n} fˆ(k)e^{ik·θ}.

We first establish this for f ∈ A(T^n), where we say

(7.0.3) f ∈ A(T^n) ⇐⇒ ∑_{k∈Z^n} |fˆ(k)| < ∞.

In this case, (7.0.2) holds, with absolute and uniform convergence. We show that

(7.0.4) f ∈ C^m(T^n), m > n/2 =⇒ f ∈ A(T^n).

It is also important to deal with Fourier series of discontinuous functions. For this, we set

(7.0.5) S_N f(θ) = ∑_{|k|≤N} fˆ(k)e^{ik·θ},

and show that

(7.0.6) S_N f −→ f, as N → ∞,

in L²-norm, when f ∈ R(T^n), or more generally when f, |f|² ∈ R#(T^n). In such a case, L²-norm convergence in (7.0.6) yields the Plancherel identity

(7.0.7) (2π)^{-n} ∫_{T^n} |f(θ)|² dθ = ∑_{k∈Z^n} |fˆ(k)|².

We give a brief discussion of how the Lebesgue theory of integration produces L²(T^n), which is the completion of R(T^n) with respect to the L²-norm, and which is naturally the largest space on which (7.0.6) holds, in L²-norm. We then consider Fourier series of more singular objects, namely distributions, which arise as continuous linear functionals

(7.0.8) w : C^∞(T^n) −→ C.

We say w ∈ D′(T^n), and write the action on f ∈ C^∞(T^n) as f ↦ ⟨f, w⟩. As an example, if M ⊂ T^n is a compact, m-dimensional, C¹ surface, we can define δ_M ∈ D′(T^n) by

(7.0.9) ⟨f, δ_M⟩ = ∫_M f(x) dS(x).

In this distributional setting, the Fourier coefficients are given by

(7.0.10) ŵ(k) = (2π)^{-n} ⟨e_k, w⟩, e_k(θ) = e^{-ik·θ}.

We extend (7.0.6) to

(7.0.11) S_N w −→ w, as N → ∞,

in the sense that ⟨f, S_N w⟩ → ⟨f, w⟩, for all f ∈ C^∞(T^n). In §7.2 we turn to the Fourier transform, given by

(7.0.12) fˆ(ξ) = (2π)^{-n/2} ∫_{R^n} f(x)e^{-ix·ξ} dx,

for ξ ∈ R^n, for f ∈ R(R^n), or more generally f ∈ R#(R^n). In this case, the Fourier inversion formula takes the form

(7.0.13) f(x) = (2π)^{-n/2} ∫_{R^n} fˆ(ξ)e^{ix·ξ} dξ.

We first establish this for f ∈ A(R^n), where, given f ∈ R(R^n) ∩ C(R^n), we say

(7.0.14) f ∈ A(R^n) ⇐⇒ fˆ ∈ R(R^n).

We derive a sufficient condition for f to belong to A(R^n), somewhat similar to (7.0.4). For such f, (7.0.13) holds, the right side being an absolutely convergent integral.

Pursuing a Fourier inversion formula for discontinuous functions, we define

(7.0.15) S_R f(x) = (2π)^{-n/2} ∫_{|ξ|≤R} fˆ(ξ)e^{ix·ξ} dξ,

and show that

(7.0.16) S_R f −→ f, as R → ∞,

in L²-norm, for f ∈ R(R^n), or more generally for f, |f|² ∈ R#(R^n). In such a case, we also have the Plancherel identity

(7.0.17) ∫_{R^n} |f(x)|² dx = ∫_{R^n} |fˆ(ξ)|² dξ.

Parallel to the toral case, the Lebesgue theory of integration gives rise to L²(R^n), as the maximal space for which (7.0.16) holds, in L²-norm. In the Euclidean setting, we also have an extension of the Fourier transform to a class of distributions, defined by L. Schwartz, known as tempered distributions, and results on this are derived in §7.2.

In §7.3 we bring Fourier series and the Fourier transform together to produce an important family of identities known as Poisson summation formulas. They are obtained by taking a sufficiently nice function f on R^n, periodizing it to obtain a function φ on T^n, and evaluating φ in two ways, both in terms of fˆ(ξ) and in terms of φ̂(k). A striking example of this, obtained for f(x) = e^{-x²} on R, is known as the Jacobi inversion formula. We show how it leads to an identity for the Riemann zeta function, known as the Riemann functional equation.

In §7.4 we study spherical harmonics, that is, Fourier analysis on spheres. In this case, in place of using exponentials e^{ik·θ}, we seek to write a function f on the unit sphere S^{n-1} in R^n as an infinite series in functions Y_k^ℓ ∈ C^∞(S^{n-1}) that are eigenfunctions of the Laplace operator Δ_S on the surface S^{n-1}, with its naturally arising Riemannian metric. We see that Δ_S has eigenvalues of the form −λ_k², where

(7.0.18) λ_k² = k² + (n−2)k, with k ∈ Z⁺.

The associated eigenspaces

(7.0.19) V_k = {g ∈ C^∞(S^{n-1}) : Δ_S g = −λ_k² g}

are seen to consist of the restrictions to S^{n-1} of the class of harmonic polynomials on R^n that are homogeneous of degree k. We define the orthogonal

projection E_k f of f on V_k, and the version of Fourier series that arises is

(7.0.20) f = ∑_{k=0}^∞ E_k f.

Summing over k ∈ {0, . . . , N} yields S_N f. Parallel to (7.0.6), we have L²-norm convergence S_N f → f. Also, parallel to (7.0.4), we have absolute and uniform convergence of (7.0.20), when f is a sufficiently smooth function on S^{n-1}. One significant tool for this study involves a formula for the projection E_k, of the form

(7.0.21) E_k f(ω) = γ_{nk} ∫_{S^{n-1}} C_k^{(n-2)/2}(ω · y)f(y) dS(y),

where γ_{nk} are certain explicit constants and C_k^α(t) are polynomials in t, of degree k, known as Gegenbauer polynomials, which specialize to Legendre polynomials P_k(t) in the classical case n = 3, leading to analysis on S². Our approach to (7.0.21) goes through the Dirichlet problem, to solve for u on the unit ball B ⊂ R^n the equation

(7.0.22) Δu = 0 on B, u = f on ∂B = S^{n-1}.

On the one hand, there is the Poisson integral formula for the solution,

(7.0.23) u(x) = (1 − |x|²)/A_{n-1} ∫_{S^{n-1}} f(y)/|x − y|^n dS(y), x ∈ B,

and on the other hand the identity

(7.0.24) u(rω) = ∑_{k=0}^∞ r^k E_k f(ω), ω ∈ S^{n-1}, r ∈ [0, 1).

The simultaneous use of these two formulas leads to many interesting results about spherical harmonics.
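For n = 3, (7.0.18) gives λ_k² = k(k + 1), which can be checked symbolically in the zonal case: for functions on S² depending only on the polar angle, with t the cosine of that angle, Δ_S reduces to the Legendre operator Lg = d/dt[(1 − t²)g′] (a standard reduction, not derived in the text at this point), and the Legendre polynomials P_k — the Gegenbauer case α = 1/2 mentioned above — satisfy LP_k = −k(k + 1)P_k. A quick sketch with NumPy's polynomial module, offered only as an illustration:

```python
import numpy as np
from numpy.polynomial import Polynomial, legendre

t = Polynomial([0, 1])           # the polynomial "t"
for k in range(8):
    # P_k as an ordinary polynomial in t.
    P = legendre.Legendre.basis(k).convert(kind=Polynomial)
    # Legendre operator: L P = d/dt[(1 - t^2) P'].
    LP = ((1 - t ** 2) * P.deriv()).deriv()
    # Legendre's equation: L P_k + k(k+1) P_k = 0, i.e. eigenvalue -k(k+1).
    residual = LP + k * (k + 1) * P
    assert np.allclose(residual.coef, 0.0)
```

The eigenvalue −k(k + 1) here is exactly −λ_k² from (7.0.18) with n = 3.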

For another tool, we study the action of SO(n) on the spaces V_k and develop some results on the representation theory of these rotation groups, with particular emphasis on SO(3) and its applications to spherical harmonics on S². We complement this material with a brief discussion of Fourier series on compact matrix groups, in §7.5. In §7.6 we give a geometric application of Fourier series, to the isoperimetric inequality, which states that, among smoothly bounded planar domains whose boundaries have a fixed length, disks have the largest area. The proof brings in both Fourier series and Green's theorem.

7.1. Fourier series

We consider Fourier series of functions on the n-dimensional torus T^n = R^n/(2πZ^n). Given f ∈ R(T^n) (or more generally f ∈ R#(T^n)) we set, for k ∈ Z^n,

(7.1.1) fˆ(k) = (2π)^{-n} ∫_{T^n} f(θ)e^{-ik·θ} dθ,

i.e.,

(7.1.2) fˆ(k) = (2π)^{-n} ∫₀^{2π} ··· ∫₀^{2π} f(θ₁, . . . , θ_n)e^{-i(k₁θ₁+···+k_nθ_n)} dθ₁ ··· dθ_n.

We call fˆ(k) the Fourier coefficients of f. The first major problem is to recover f from its Fourier coefficients. We first accomplish this when f belongs to the space

(7.1.3) A(T^n) = {f ∈ C(T^n) : ∑_{k∈Z^n} |fˆ(k)| < ∞}.

Proposition 7.1.1. If f ∈ A(T^n), then the following Fourier inversion formula holds:

(7.1.4) f(θ) = ∑_{k∈Z^n} fˆ(k)e^{ik·θ}.

Proof. Given ∑ |fˆ(k)| < ∞, the right side of (7.1.4) is absolutely and uniformly convergent, defining

(7.1.5) g(θ) = ∑_{k∈Z^n} fˆ(k)e^{ik·θ}, g ∈ C(T^n),

and our task is to show that f ≡ g. Making use of the identities

(7.1.6) (2π)^{-n} ∫_{T^n} e^{iℓ·θ} dθ = 0 if ℓ ≠ 0, 1 if ℓ = 0,

for ℓ ∈ Z^n, we get ĝ(k) = fˆ(k) for all k ∈ Z^n. Let us set u = f − g. We have

(7.1.7) u ∈ C(T^n), û(k) = 0, ∀ k ∈ Z^n.

It remains to show that this implies u ≡ 0. To prove this, we use Corollary A.5.5 (a consequence of the Stone-Weierstrass theorem, treated in Appendix A.5), which implies that for each v ∈ C(T^n), there exist trigonometric polynomials, i.e., finite linear combinations v_N of {e^{ik·θ} : k ∈ Z^n}, such that

(7.1.8) v_N −→ v uniformly on T^n.

Now (7.1.7) implies

(7.1.9) ∫_{T^n} u(θ)v_N(θ) dθ = 0, ∀ N,

and passing to the limit, using (7.1.8), gives

(7.1.10) ∫_{T^n} u(θ)v(θ) dθ = 0, ∀ v ∈ C(T^n).

Taking v = u gives

(7.1.11) ∫_{T^n} |u(θ)|² dθ = 0,

forcing u ≡ 0 and completing the proof. □
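The inversion formula (7.1.4) lends itself to a direct numerical check for a trigonometric polynomial, which certainly lies in A(T¹). The sketch below (assuming NumPy; the test function and the grid size M = 256 are arbitrary choices) approximates (7.1.1) by a Riemann sum on a uniform grid — which on the torus is exact, up to roundoff, for trigonometric polynomials of degree below M/2 — and then sums the series (7.1.4):

```python
import numpy as np

M = 256
theta = 2 * np.pi * np.arange(M) / M
f = 3.0 + np.cos(theta)            # fˆ(0) = 3, fˆ(±1) = 1/2, all others 0

def fourier_coeff(values, k):
    # Riemann sum for (2π)^{-1} ∫ f(θ) e^{-ikθ} dθ over the uniform grid.
    return np.mean(values * np.exp(-1j * k * theta))

coeffs = {k: fourier_coeff(f, k) for k in range(-5, 6)}
assert np.isclose(coeffs[0], 3.0)
assert np.isclose(coeffs[1], 0.5) and np.isclose(coeffs[-1], 0.5)

# Fourier inversion (7.1.4): the series reproduces f on the grid.
recon = sum(c * np.exp(1j * k * theta) for k, c in coeffs.items())
assert np.allclose(recon.real, f)
```

Only the coefficients with |k| ≤ 1 are (numerically) nonzero here, so the truncated series already recovers f exactly.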

We seek conditions on f implying that f ∈ A(T^n). Assume f ∈ C^ℓ(T^n). Then integration by parts gives

(7.1.12) (2π)^{-n} ∫_{T^n} f^{(α)}(θ)e^{-ik·θ} dθ = (ik)^α fˆ(k), for |α| ≤ ℓ.

Hence

(7.1.13) f ∈ C^ℓ(T^n) =⇒ |fˆ(k)| ≤ C(1 + |k|)^{-ℓ} ∑_{|α|≤ℓ} ∥f^{(α)}∥_{L¹},

where we set

(7.1.14) ∥u∥_{L¹} = (2π)^{-n} ∫_{T^n} |u(θ)| dθ.

Since ∑_{k∈Z^n} (1 + |k|)^{-(n+1)} < ∞, we deduce that

(7.1.15) C^{n+1}(T^n) ⊂ A(T^n).

We will sharpen this implication below. We next make use of (7.1.6) to produce results on ∫_{T^n} |f(θ)|² dθ, starting with the following.

Proposition 7.1.2. Given f ∈ A(T^n),

(7.1.16) ∑_{k∈Z^n} |fˆ(k)|² = (2π)^{-n} ∫_{T^n} |f(θ)|² dθ.

More generally, if also g ∈ A(T^n),

(7.1.17) ∑_{k∈Z^n} fˆ(k)\overline{ĝ(k)} = (2π)^{-n} ∫_{T^n} f(θ)\overline{g(θ)} dθ.

Proof. Switching order of summation and integration and using (7.1.6), we have

(7.1.18) (2π)^{-n} ∫_{T^n} f(θ)\overline{g(θ)} dθ = (2π)^{-n} ∑_{j,k∈Z^n} fˆ(j)\overline{ĝ(k)} ∫_{T^n} e^{i(j-k)·θ} dθ = ∑_k fˆ(k)\overline{ĝ(k)},

giving (7.1.17). Taking g = f gives (7.1.16). □

We will extend the scope of Proposition 7.1.2 below. A related issue is the convergence of S_N f to f as N → ∞, where

(7.1.19) S_N f(θ) = ∑_{|k|≤N} fˆ(k)e^{ik·θ}.

Here we take |k| = (k₁² + ··· + k_n²)^{1/2}. Clearly f ∈ A(T^n) ⇒ S_N f → f uniformly on T^n as N → ∞. Here, we are interested in convergence in L²-norm, where

(7.1.20) ∥f∥²_{L²} = (2π)^{-n} ∫_{T^n} |f(θ)|² dθ.

Given f ∈ R(T^n), this defines a “norm,” satisfying the following result, called the triangle inequality:

(7.1.21) ∥f + g∥_{L²} ≤ ∥f∥_{L²} + ∥g∥_{L²}.

See Appendix A.2 for details on this. Behind this result is the fact that

(7.1.22) ∥f∥²_{L²} = (f, f)_{L²},

where, when f, g ∈ L²(T^n), we set

(7.1.23) (f, g)_{L²} = (2π)^{-n} ∫_{T^n} f(θ)\overline{g(θ)} dθ.

Thus the content of (7.1.16) is that

(7.1.24) ∑_{k∈Z^n} |fˆ(k)|² = ∥f∥²_{L²},

and that of (7.1.17) is that

(7.1.25) ∑_{k∈Z^n} fˆ(k)\overline{ĝ(k)} = (f, g)_{L²}.

The left side of (7.1.24) can be regarded as the square norm of an element of ℓ²(Z^n). Generally, an element (a_k)_{k∈Z^n} belongs to ℓ²(Z^n) if and only if

(7.1.26) ∥(a_k)∥²_{ℓ²} = ∑_{k∈Z^n} |a_k|² < ∞.

There is an associated inner product

(7.1.27) ((a_k), (b_k))_{ℓ²} = ∑_{k∈Z^n} a_k \overline{b_k}.

As in (7.1.21), one has

(7.1.28) ∥(a_k) + (b_k)∥_{ℓ²} ≤ ∥(a_k)∥_{ℓ²} + ∥(b_k)∥_{ℓ²}.

As for the notion of L²-convergence, we say

(7.1.29) f_ν → f in L² ⇐⇒ ∥f − f_ν∥_{L²} → 0.

There is a similar notion of convergence in ℓ². Clearly

(7.1.30) ∥f − f_ν∥_{L²} ≤ sup_θ |f(θ) − f_ν(θ)|.

In view of the uniform convergence S_N f → f for f ∈ A(T^n), noted above, we have

(7.1.31) f ∈ A(T^n) =⇒ S_N f → f in L², as N → ∞.

The triangle inequality implies

(7.1.32) ∥f∥_{L²} − ∥S_N f∥_{L²} ≤ ∥f − S_N f∥_{L²},

and clearly (by Proposition 7.1.2)

(7.1.33) ∥S_N f∥²_{L²} = ∑_{|k|≤N} |fˆ(k)|²,

so

(7.1.34) ∥f − S_N f∥²_{L²} → 0 as N → ∞ =⇒ ∥f∥²_{L²} = ∑_{k∈Z^n} |fˆ(k)|².

We now consider more general functions f ∈ R(T^n). With fˆ(k) and S_N f defined by (7.1.1) and (7.1.19), we define R_N f by

(7.1.35) f = S_N f + R_N f.

Note that ∫_{T^n} f(θ)e^{-ik·θ} dθ = ∫_{T^n} S_N f(θ)e^{-ik·θ} dθ for |k| ≤ N. Hence

(7.1.36) (f, S_N f)_{L²} = (S_N f, S_N f)_{L²},

and hence

(7.1.37) (S_N f, R_N f)_{L²} = 0.

Consequently,

(7.1.38) ∥f∥²_{L²} = (S_N f + R_N f, S_N f + R_N f)_{L²} = ∥S_N f∥²_{L²} + ∥R_N f∥²_{L²}.

In particular,

(7.1.39) ∥S_N f∥_{L²} ≤ ∥f∥_{L²}.

We are now in a position to prove the following.

Lemma 7.1.3. Let f, f_ν ∈ R(T^n). Assume

(7.1.40) lim_{ν→∞} ∥f − f_ν∥_{L²} = 0,

and, for each ν,

(7.1.41) lim_{N→∞} ∥f_ν − S_N f_ν∥_{L²} = 0.

Then

(7.1.42) lim_{N→∞} ∥f − S_N f∥_{L²} = 0.

Proof. Writing f − S_N f = (f − f_ν) + (f_ν − S_N f_ν) + S_N(f_ν − f) and using the triangle inequality, we have, for each ν,

(7.1.43) ∥f − S_N f∥_{L²} ≤ ∥f − f_ν∥_{L²} + ∥f_ν − S_N f_ν∥_{L²} + ∥S_N(f_ν − f)∥_{L²}.

Taking N → ∞ and using (7.1.39), we have

(7.1.44) lim sup_{N→∞} ∥f − S_N f∥_{L²} ≤ 2∥f − f_ν∥_{L²},

for each ν. Then (7.1.40) yields the desired conclusion (7.1.42). □

Given f ∈ C(T^n), we have trigonometric polynomials f_ν → f uniformly on T^n (by Corollary A.5.5), and clearly (7.1.41) holds for each such f_ν. Thus Lemma 7.1.3 yields the following:

(7.1.45) f ∈ C(T^n) =⇒ S_N f → f in L², and ∑_{k∈Z^n} |fˆ(k)|² = ∥f∥²_{L²}.

To go further, we bring in the following lemma.

Lemma 7.1.4. Given f ∈ R(T^n), there exist f_ν ∈ C(T^n) such that f_ν → f in L².

Proof. One easily reduces this to treating the case f ≥ 0. Then Proposition 3.1.11 implies that there exist f_ν ∈ C(T^n) such that 0 ≤ f_ν ≤ f and ∫_{T^n} (f − f_ν) dθ → 0. Hence

∫_{T^n} |f − f_ν|² dθ ≤ (sup f) ∫_{T^n} (f − f_ν) dθ → 0. □

The last two lemmas and (7.1.45) yield the following result.

Proposition 7.1.5. We have

(7.1.46) f ∈ R(T^n) =⇒ S_N f → f in L², and ∑_{k∈Z^n} |fˆ(k)|² = ∥f∥²_{L²}.

The last identity is called the Plancherel identity. Having some general results guaranteeing the conclusion in (7.1.16), we now improve (7.1.15).

Proposition 7.1.6. We have

(7.1.47) ℓ > n/2 =⇒ C^ℓ(T^n) ⊂ A(T^n).

Proof. As before, we have (7.1.12). We deduce from this that

(7.1.48) ∑_{k∈Z^n} |k^α fˆ(k)|² = ∥f^{(α)}∥²_{L²}, for |α| ≤ ℓ.

Summing over |α| ≤ ℓ yields

(7.1.49) ∑_{k∈Z^n} (1 + |k|)^{2ℓ} |fˆ(k)|² ≤ C ∑_{|α|≤ℓ} ∥f^{(α)}∥²_{L²}.

Then Cauchy's inequality gives

(7.1.50) ∑_{k∈Z^n} |fˆ(k)| = ∑_{k∈Z^n} (1 + |k|)^{-ℓ}(1 + |k|)^ℓ |fˆ(k)| ≤ A_{n,ℓ}^{1/2} {∑_{k∈Z^n} (1 + |k|)^{2ℓ} |fˆ(k)|²}^{1/2},

where

(7.1.51) A_{n,ℓ} = ∑_{k∈Z^n} (1 + |k|)^{-2ℓ} < ∞, if 2ℓ > n.

This establishes the desired finiteness of ∑ |fˆ(k)|. □
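Both the rapid decay behind (7.1.13) and the Plancherel identity in (7.1.46) are easy to observe numerically. The sketch below (assuming NumPy; the smooth test function f(θ) = e^{cos θ} is an arbitrary choice) computes Fourier coefficients by a Riemann sum on a uniform grid, which is highly accurate for smooth periodic integrands:

```python
import numpy as np

M = 512
theta = 2 * np.pi * np.arange(M) / M
f = np.exp(np.cos(theta))          # smooth, hence rapidly decaying fˆ(k)

# fˆ(k) for |k| ≤ 60, via the periodic Riemann sum for (7.1.1).
fhat = np.array([np.mean(f * np.exp(-1j * k * theta)) for k in range(-60, 61)])

# Decay much faster than any fixed power (1 + |k|)^{-ℓ}, consistent with
# (7.1.13) holding for every ℓ: compare fˆ(10) with fˆ(0).
assert abs(fhat[60 + 10]) < 1e-7 * abs(fhat[60])

# Plancherel, as in (7.1.46): sum of |fˆ(k)|² matches (2π)^{-1} ∫ |f|² dθ,
# computed by the same quadrature rule.
assert np.isclose(np.sum(np.abs(fhat) ** 2), np.mean(np.abs(f) ** 2))
```

Repeating the experiment with a merely continuous f (say, the sawtooth of Exercise 1 below the section) would show only polynomial decay, illustrating why smoothness is needed in (7.1.47).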

It is illuminating to restate some of the results established above, bringing in the spaces ℓ^p(Z^n) of functions a : Z^n → C, with norm ∥·∥_{ℓ^p}, defined by

(7.1.52) ∥a∥^p_{ℓ^p} = ∑_{k∈Z^n} |a(k)|^p < ∞,

for 1 ≤ p < ∞, and

(7.1.53) ∥a∥_{ℓ^∞} = sup_k |a(k)| < ∞.

The case p = 2 arose in (7.1.26)–(7.1.27). Also, we denote by F the transformation that assigns to an integrable function f on T^n its Fourier coefficients (fˆ(k)). The definition (7.1.1) readily gives

(7.1.54) F : R#(T^n) −→ ℓ^∞(Z^n), ∥Ff∥_{ℓ^∞} ≤ ∥f∥_{L¹},

with ∥f∥_{L¹} defined as in (7.1.14). Proposition 7.1.5 gives

(7.1.55) F : R(T^n) −→ ℓ²(Z^n), ∥Ff∥_{ℓ²} = ∥f∥_{L²}.

Meanwhile, the definition (7.1.3) gives

(7.1.56) F : A(T^n) −→ ℓ¹(Z^n).

We can restate Proposition 7.1.1 by bringing in the transformation F*, defined on ℓ¹(Z^n) by

(7.1.57) F*(a)(θ) = ∑_{k∈Z^n} a(k)e^{ik·θ}.

The content of Proposition 7.1.1 is that

(7.1.58) F* : ℓ¹(Z^n) → A(T^n) is the 2-sided inverse of F in (7.1.56).

The step (7.1.7) in the proof of that result is equivalent to FF* = I on ℓ¹(Z^n), and the reverse result, F*F = I on A(T^n), made use of the Stone-Weierstrass theorem. The analytical apparatus for an extension of (7.1.55) to a result involving F and F* as inverses of each other was produced by H. Lebesgue, in what is now known as the theory of Lebesgue measure and integration. We give a brief description of this, referring the reader to other sources, such as [15] or [42], for a detailed presentation.

To start, we say a set S ⊂ T^n is measurable provided that

(7.1.59) m*(S) + m*(T^n \ S) = V(T^n),

where m* is the outer measure defined by (3.1.124). We say a function f : T^n → C is measurable provided that, for each open O ⊂ C, f^{-1}(O) ⊂ T^n is measurable. The Lebesgue integral associates a value in [0, +∞] to

(7.1.60) ∫_{T^n} f(θ) dθ

for each measurable f satisfying f(θ) ≥ 0 for all θ. The space L¹(T^n) consists of all measurable functions f such that

(7.1.61) ∥f∥_{L¹} = (2π)^{-n} ∫_{T^n} |f(θ)| dθ < ∞.

In such a case, one can write f = f_{0+} − f_{0−} + i(f_{1+} − f_{1−}), with all f_{j±} measurable, ≥ 0, and with finite integral, and the process alluded to above applies to evaluate these integrals, and hence to evaluate ∫_{T^n} f(θ) dθ. There is one further wrinkle. The space L¹(T^n) actually consists of equivalence classes of measurable functions satisfying (7.1.61), where one says f₁ ∼ f₂ provided {x ∈ T^n : f₁(x) ≠ f₂(x)} has outer measure zero. This makes L¹(T^n) a normed space. More generally, for p ∈ [1, ∞), L^p(T^n) consists of equivalence classes of measurable functions for which

(7.1.62) ∥f∥^p_{L^p} = (2π)^{-n} ∫_{T^n} |f(θ)|^p dθ < ∞.

The only case of p > 1 we work with here is p = 2. One has

(7.1.63) f, g ∈ L²(T^n) =⇒ fg ∈ L¹(T^n),

and L²(T^n) is an inner product space, via (7.1.23), extended to this setting. These L^p norms satisfy the triangle inequality

(7.1.64) ∥f + g∥_{L^p} ≤ ∥f∥_{L^p} + ∥g∥_{L^p}.

For p = 1, this inequality is a simple consequence of the pointwise inequality |f(θ) + g(θ)| ≤ |f(θ)| + |g(θ)|. For p = 2, the proof, as indicated in (7.1.21)–(7.1.23), follows from material in Appendix A.2. For other p, which we do not need here, the reader can consult the references mentioned above. Thanks to (7.1.64), L^p(T^n) has the structure of a metric space, with d(f, g) = ∥f − g∥_{L^p}. We say f_ν → f in L^p if ∥f − f_ν∥_{L^p} → 0. For all p ∈ [1, ∞), these spaces have the following important metric properties. The first is a denseness property.

Proposition A. Given f ∈ L^p(T^n) and ℓ ∈ N, there exist f_ν ∈ C^ℓ(T^n) such that f_ν → f in L^p.

The second is a completeness property.

Proposition B. If (f_ν) is a Cauchy sequence in L^p(T^n), then there exists f ∈ L^p(T^n) such that f_ν → f in L^p.

We refer to [15] or [42] for proofs of these results. The following two neat L² results illustrate the usefulness of the Lebesgue theory of integration in Fourier analysis.

Proposition 7.1.7. The maps F and F* have unique continuous linear extensions from

(7.1.65) F : A(T^n) → ℓ¹(Z^n), F* : ℓ¹(Z^n) → A(T^n)

to

(7.1.66) F : L²(T^n) → ℓ²(Z^n), F* : ℓ²(Z^n) → L²(T^n),

and the identities

(7.1.67) F*Ff = f, FF*a = a

and

(7.1.68) ∥Ff∥_{ℓ²} = ∥f∥_{L²}, ∥F*a∥_{L²} = ∥a∥_{ℓ²}

hold for all f ∈ L²(T^n) and a ∈ ℓ²(Z^n).

Proposition 7.1.8. Define S_N as in (7.1.19). Then S_N : L²(T^n) → L²(T^n) and

(7.1.69) f ∈ L²(T^n) =⇒ S_N f → f in L²(T^n).

Rather than presenting arguments here, we refer the reader to §7.2, and the analogous results, Propositions 7.2.10 and 7.2.11. As a complement to Proposition 7.1.7, we mention that, using the inner products (7.1.23) and (7.1.27), we have

(7.1.70) (a, Ff)_{ℓ²} = (F*a, f)_{L²},

for all f ∈ L²(T^n), a ∈ ℓ²(Z^n). In case f ∈ A(T^n) and a ∈ ℓ¹(Z^n), such an identity is elementary. We now discuss another way of taking Fourier analysis beyond the setting of A(T^n), which is a normed space, with norm

(7.1.71) ∥f∥_A = ∥Ff∥_{ℓ¹},

for which we have

(7.1.72) ∥F*a∥_A = ∥a∥_{ℓ¹},

for all a ∈ ℓ¹(Z^n). We define the dual space A′(T^n) to consist of continuous linear functionals

(7.1.73) w : A(T^n) −→ C,

i.e., those linear maps satisfying

(7.1.74) |w(f)| ≤ C∥f∥_A, ∀ f ∈ A(T^n).

The optimal constant in (7.1.74) is denoted ∥w∥_{A′}. Other useful notations are

(7.1.75) ⟨f, w⟩ = w(f), and (f, w) = w(f̄),

for f ∈ A(T^n), w ∈ A′(T^n). We want to pass from F and F* in (7.1.66) to

(7.1.76) F : A′(T^n) → ℓ^∞(Z^n), F* : ℓ^∞(Z^n) → A′(T^n).

To explain the use of ℓ^∞(Z^n), we establish the following.

Lemma 7.1.9. The dual space to ℓ¹(Z^n) is ℓ^∞(Z^n).

Proof. What is to be established is a natural identification of the set of continuous linear functionals on ℓ¹(Z^n),

(7.1.77) β : ℓ¹(Z^n) −→ C,

with the set of bounded functions b : Z^n → C,

(7.1.78) b ∈ ℓ^∞(Z^n).

The map ℓ^∞(Z^n) → ℓ¹(Z^n)′ is given by

(7.1.79) ⟨a, b⟩ = ∑_{k∈Z^n} a_k b_k, a ∈ ℓ¹(Z^n), b ∈ ℓ^∞(Z^n).

Note that

(7.1.80) |⟨a, b⟩| ≤ (sup_k |b_k|) ∑_k |a_k| = ∥a∥_{ℓ¹} ∥b∥_{ℓ^∞}.

Conversely, given β as in (7.1.77), satisfying

(7.1.81) |β(a)| ≤ B∥a∥_{ℓ¹},

let us define

(7.1.82) b_k = β(ε_k), where ε_k(ℓ) = 1 if ℓ = k, 0 if ℓ ≠ k.

We have |b_k| ≤ B for all k, so b = (b_k) ∈ ℓ^∞(Z^n), and ∥b∥_{ℓ^∞} ≤ B. Furthermore,

(7.1.83) ⟨a, b⟩ = β(a),

for a = ε_k, for each k ∈ Z^n, and hence, by linearity, continuity, and the density of finite linear combinations of the ε_k in ℓ¹(Z^n), (7.1.83) holds for all a ∈ ℓ¹(Z^n). This proves the lemma. □

Remark. An inspection of the proof shows that the correspondence β ↔ b of ℓ¹(Z^n)′ with ℓ^∞(Z^n) is norm preserving, i.e., the optimal constant B = ∥β∥ in (7.1.81) is equal to ∥b∥_{ℓ^∞}.

We are now ready to define F and F* in (7.1.76). Taking off from (7.1.70), we define Fw ∈ ℓ^∞(Z^n) for w ∈ A′(T^n) by

(7.1.84) (a, Fw) = (F*a, w), a ∈ ℓ¹(Z^n),

and we define F*b ∈ A′(T^n) for b ∈ ℓ^∞(Z^n) by

(7.1.85) (f, F*b) = (Ff, b), f ∈ A(T^n).

Note that (7.1.84) yields

(7.1.86) |(a, Fw)| = |(F*a, w)| ≤ ∥w∥_{A′} ∥F*a∥_A = ∥w∥_{A′} ∥a∥_{ℓ¹},

the last identity by (7.1.72). Hence, by Lemma 7.1.9 and the remark following it,

(7.1.87) ∥Fw∥_{ℓ^∞} ≤ ∥w∥_{A′}.

Next, (7.1.85) yields

(7.1.88) |(f, F*b)| ≤ ∥Ff∥_{ℓ¹} ∥b∥_{ℓ^∞} = ∥f∥_A ∥b∥_{ℓ^∞},

the last identity by (7.1.71), hence

(7.1.89) ∥F*b∥_{A′} ≤ ∥b∥_{ℓ^∞}.

Thus F and F* in (7.1.76) are well defined. The following Fourier inversion formula complements those in Propositions 7.1.1 and 7.1.7.

Proposition 7.1.10. The maps F and F* in (7.1.76) are two-sided inverses of each other, i.e.,

(7.1.90) F*Fw = w and FF*b = b, ∀ w ∈ A′(T^n), b ∈ ℓ^∞(Z^n).

Proof. Given f ∈ A(T^n), a ∈ ℓ¹(Z^n), we have

(7.1.91) (f, F*Fw) = (Ff, Fw) = (F*Ff, w) = (f, w),

the first identity by (7.1.85), the second by (7.1.84), and the third by (7.1.58). This implies F*Fw = w. Similarly,

(7.1.92) (a, FF*b) = (F*a, F*b) = (FF*a, b) = (a, b),

yielding FF*b = b. □

We can now sharpen (7.1.87) and (7.1.89).

Corollary 7.1.11. For w ∈ A′(T^n), b ∈ ℓ^∞(Z^n),

(7.1.93) ∥Fw∥_{ℓ^∞} = ∥w∥_{A′}, and ∥F*b∥_{A′} = ∥b∥_{ℓ^∞}.

Proof. First, w = F*Fw ⇒ ∥w∥_{A′} ≤ ∥Fw∥_{ℓ^∞}, by (7.1.89) with b = Fw. This together with (7.1.87) yields the first identity in (7.1.93). The second identity has a similar proof. □

Let us note that (7.1.84), applied to a = ε_k, defined as in (7.1.82), gives, for w ∈ A′(T^n),

(7.1.94) Fw(k) = (2π)^{-n} ⟨e_k, w⟩, e_k ∈ A(T^n), e_k(θ) = e^{-ik·θ}.

The space A′(T^n) contains an interesting variety of functions and other objects. For example, there is a natural map

(7.1.95) ι : R#(T^n) −→ A′(T^n),

given by

(7.1.96) ⟨f, ι(u)⟩ = (2π)^{-n} ∫_{T^n} f(θ)u(θ) dθ.

We have

(7.1.97) |⟨f, ι(u)⟩| ≤ (2π)^{-n} (sup |f|) ∥u∥_{L¹} ≤ (2π)^{-n} ∥f∥_A ∥u∥_{L¹}.

Given the results on the Lebesgue integral mentioned above, this extends to

(7.1.98) ι : L¹(T^n) −→ A′(T^n).

We have from (7.1.94) that

(7.1.99) Fι(u)(k) = (2π)^{-n} ∫_{T^n} u(θ)e^{-ik·θ} dθ, k ∈ Z^n.

The space A′(T^n) also contains some objects more singular than functions. For example, given p ∈ T^n, we define δ_p ∈ A′(T^n) by

(7.1.100) ⟨f, δ_p⟩ = f(p).

More generally, let M ⊂ T^n be a compact, m-dimensional, C¹ surface, and u ∈ C(M). Define uδ_M ∈ A′(T^n) by

(7.1.101) ⟨f, uδ_M⟩ = ∫_M f(x)u(x) dS(x).

We have

(7.1.102) |⟨f, uδ_M⟩| ≤ C_M (sup |u|)(sup |f|) ≤ C_M (sup |u|) ∥f∥_A.

Again (7.1.94) applies, to give

(7.1.103) F(uδ_M)(k) = (2π)^{-n} ∫_M u(θ)e^{-ik·θ} dS(θ), k ∈ Z^n.
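For the point mass δ_0 on T¹, (7.1.94) and (7.1.100) give Fδ_0(k) = (2π)^{-1} for every k, so the partial sum S_N δ_0 is (2π)^{-1} times the Dirichlet kernel, and pairing it with a smooth f produces S_N f(0) → f(0), an instance of the convergence (7.0.11). A numeric sketch (assuming NumPy; f(θ) = e^{cos θ} is just a convenient smooth test function):

```python
import numpy as np

M = 4096
theta = 2 * np.pi * np.arange(M) / M
f = np.exp(np.cos(theta))                 # smooth; f(0) = e

def pair_with_SN_delta0(N):
    # S_N δ_0 = (2π)^{-1} Σ_{|k|≤N} e^{ikθ}  (the Dirichlet kernel over 2π);
    # ⟨f, S_N δ_0⟩ = ∫ f · S_N δ_0 dθ, via the periodic trapezoid rule.
    DN = sum(np.exp(1j * k * theta) for k in range(-N, N + 1)).real / (2 * np.pi)
    return np.mean(f * DN) * 2 * np.pi

# ⟨f, S_N δ_0⟩ = S_N f(0) converges to ⟨f, δ_0⟩ = f(0) = e.
assert abs(pair_with_SN_delta0(40) - np.e) < 1e-8
```

For this smooth f the convergence is extremely fast; for a discontinuous f the same pairing would converge, but slowly, reflecting the slow decay of its Fourier coefficients.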

Objects such as δ_p and δ_M are examples of “distributions.” A beautiful theory of Fourier analysis on the space of distributions on T^n, which includes some more singular objects, was constructed by L. Schwartz. We present some of his results here. We start with the space C^∞(T^n). By (7.1.12)–(7.1.13),

(7.1.104) f ∈ C^∞(T^n) =⇒ Ff ∈ s(Z^n),

where

(7.1.105) s(Z^n) = {a ∈ ℓ^∞(Z^n) : |a(k)| ≤ C_N (1 + |k|)^{-N}, ∀ N}.

It is also easy to see that

(7.1.106) F* : s(Z^n) −→ C^∞(T^n),

and (7.1.58) specializes, to yield

(7.1.107) F*Ff = f, FF*a = a, ∀ f ∈ C^∞(T^n), a ∈ s(Z^n).

The space C^∞(T^n) carries the following sequence of norms:

(7.1.108) p_k(f) = max_{|α|≤k} sup_{θ∈T^n} |f^{(α)}(θ)|,

and the space s(Z^n) carries the norms

(7.1.109) q_ν(a) = sup_{k∈Z^n} (1 + |k|)^ν |a(k)|.

We see from (7.1.13) that

(7.1.110) q_ν(Ff) ≤ C_ν p_ν(f).

Also, since ∑_{k∈Z^n} (1 + |k|)^{-(n+1)} < ∞, we have

(7.1.111) p₀(F*a) ≤ Cq_{n+1}(a),

and from this we get

(7.1.112) p_k(F*a) ≤ Cq_{k+n+1}(a).

Now a distribution w ∈ D′(T^n) is a continuous linear functional

(7.1.113) w : C^∞(T^n) −→ C,

that is to say, w is a linear map from C^∞(T^n) to C with the property that there exist k ∈ Z⁺ and C < ∞ such that

(7.1.114) |w(f)| ≤ Cp_k(f), ∀ f ∈ C^∞(T^n).

As in (7.1.75), we use the notation

(7.1.115) ⟨f, w⟩ = w(f), (f, w) = w(f̄).

It follows from (7.1.15) that

(7.1.116) ∥f∥_A ≤ Cp_{n+1}(f),

so each w ∈ A′(T^n) also defines an element of D′(T^n). Thus δ_p in (7.1.100) and δ_M in (7.1.101) are examples of distributions on T^n. To produce more singular distributions, we can define

(7.1.117) ∂^α : D′(T^n) −→ D′(T^n),

by

(7.1.118) ⟨f, ∂^α w⟩ = (−1)^{|α|} ⟨f^{(α)}, w⟩.

We seek to define

(7.1.119) F : D′(T^n) −→ s′(Z^n), F* : s′(Z^n) −→ D′(T^n),

where

(7.1.120) s′(Z^n) = {b : (1 + |k|)^{-N} b ∈ ℓ^∞(Z^n), for some N ∈ Z⁺}.

The significance of s′(Z^n) is explained by the following.

Lemma 7.1.12. The dual space to s(Z^n) is s′(Z^n).

Proof. What we claim is that there is a natural identification of the set s(Z^n)′ of continuous linear functionals on s(Z^n),

(7.1.121) β : s(Z^n) −→ C,

with the set of functions s′(Z^n),

(7.1.122) b ∈ s′(Z^n).

The condition of continuity on β in (7.1.121) is that there exist ν ∈ N and C < ∞ such that

(7.1.123) |β(a)| ≤ Cq_ν(a).

The map s′(Z^n) → s(Z^n)′ is given by

(7.1.124) ⟨a, b⟩ = ∑_{k∈Z^n} a_k b_k, a ∈ s(Z^n), b ∈ s′(Z^n).

Note that

(7.1.125) |⟨a, b⟩| ≤ C (sup_k (1 + |k|)^{N+n+1} |a(k)|)(sup_k (1 + |k|)^{-N} |b(k)|).

Conversely, given β as in (7.1.121), satisfying (7.1.123), we define

(7.1.126) b_k = β(ε_k),

with ε_k as in (7.1.82), and verify that b ∈ s′(Z^n) and that

(7.1.127) ⟨a, b⟩ = β(a),

first for a = ε_k, for all k ∈ Z^n, and then for all a ∈ s(Z^n). □

We are now ready to define F and F* in (7.1.119). Parallel to (7.1.84), we define Fw ∈ s′(Z^n) for w ∈ D′(T^n) by

(7.1.128) (a, Fw) = (F*a, w), a ∈ s(Z^n),

and we define F*b ∈ D′(T^n) for b ∈ s′(Z^n) by

(7.1.129) (f, F*b) = (Ff, b), f ∈ C^∞(T^n).

As we have seen, a ∈ s(Z^n) ⇒ F*a ∈ C^∞(T^n), with estimates (7.1.112), and f ∈ C^∞(T^n) ⇒ Ff ∈ s(Z^n), with estimates (7.1.110). These results enable one to deduce that (7.1.128) defines Fw ∈ s′(Z^n) and (7.1.129) defines F*b ∈ D′(T^n). Here is a further extension of the Fourier inversion formula.

Proposition 7.1.13. The maps F and F* in (7.1.119) are two-sided inverses of each other, i.e.,

(7.1.130) F*Fw = w, and FF*b = b, ∀ w ∈ D′(T^n), b ∈ s′(Z^n).

The proof is parallel to that of Proposition 7.1.10. The result (7.1.12) implies

(7.1.131) Ff^{(α)}(k) = (ik)^α Ff(k), ∀ f ∈ C^∞(T^n).

We can extend this to D′(T^n), using (7.1.117)–(7.1.118) and (7.1.128)–(7.1.129), to obtain:

Proposition 7.1.14. Given w ∈ D′(T^n),

(7.1.132) F(∂^α w)(k) = (ik)^α Fw(k).

Exercises

1. Consider f(θ) = |θ| for −π ≤ θ ≤ π, extended periodically to define an element of C(T¹).
(a) Compute fˆ(k).
(b) Show that f ∈ A(T¹).
(c) Use the Fourier inversion formula (7.1.4) at θ = 0 to show that

(7.1.133) ∑_{ℓ=0}^∞ 1/(2ℓ + 1)² = π²/8.

(d) Deduce from (c) that

(7.1.134) ∑_{k=1}^∞ 1/k² = π²/6.

Hint. Decompose this sum into the sum over k odd and the sum over k even.
Remark. ∑_{k=1}^∞ k^{-2} is ζ(2).
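Parts (a) and (c) of Exercise 1 can be checked numerically; note that the sketch below (assuming NumPy) quotes the closed form for fˆ(k), so it gives away the answer to (a). It compares that closed form with quadrature and then sums the odd reciprocal squares against π²/8:

```python
import numpy as np

# Uniform grid on [−π, π); the periodic trapezoid rule is then just a mean.
M = 20000
theta = -np.pi + 2 * np.pi * np.arange(M) / M
f = np.abs(theta)

def fhat(k):
    return np.mean(f * np.exp(-1j * k * theta))

# (a): fˆ(0) = π/2 and, for k ≠ 0, fˆ(k) = ((−1)^k − 1)/(π k²),
# i.e. −2/(π k²) for odd k and 0 for even k ≠ 0.
assert abs(fhat(0) - np.pi / 2) < 1e-6
for k in (1, 2, 3, 4):
    assert abs(fhat(k) - ((-1) ** k - 1) / (np.pi * k ** 2)) < 1e-6

# (c): inversion at θ = 0 gives 0 = π/2 − (4/π) Σ_{ℓ≥0} (2ℓ+1)^{-2},
# i.e. the sum equals π²/8; check the partial sum.
s = sum(1.0 / (2 * l + 1) ** 2 for l in range(200000))
assert abs(s - np.pi ** 2 / 8) < 1e-5
```

The quadratic decay of fˆ(k) also confirms (b): ∑ |fˆ(k)| converges, so f ∈ A(T¹) even though f is only Lipschitz, not C¹.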

2. Consider g(θ) = 1 for 0 < θ < π, 0 for −π < θ < 0, defining a bounded integrable function on T¹.
(a) Compute ĝ(k).
(b) Use the Plancherel identity (7.1.46) to obtain another derivation of (7.1.133).

3. Apply the Plancherel identity to f and fˆ in Exercise 1. Use this to show that

(7.1.135) ∑_{k=1}^∞ 1/k⁴ = π⁴/90.

Note. This sum is ζ(4).

4. Given f ∈ R(T¹), set

(7.1.136) P_r f(θ) = ∑_{k∈Z} r^{|k|} fˆ(k)e^{ikθ}, 0 ≤ r < 1.

Show that

∥P_r f − f∥_{L²} −→ 0, as r ↗ 1.

5. In the setting of Exercise 4, show that

(7.1.137) P_r f(θ) = ∑_{k=0}^∞ fˆ(k)z^k + ∑_{k=1}^∞ fˆ(−k)z̄^k, z = re^{iθ}.

Deduce that u(re^{iθ}) = P_r f(θ) is harmonic on the unit disk D = {z ∈ C : |z| < 1}.

6. Interpret the results of Exercises 4–5 as providing a solution to the Dirichlet boundary problem

(7.1.138) Δu = 0 on D, u|_{∂D} = f.

7. In the setting of Exercises 4–6, establish the following Poisson integral formula:

(7.1.139) P_r f(θ) = ∫₀^{2π} f(φ)p(r, θ − φ) dφ,

where

(7.1.140) p(r, θ) = (1/2π) ∑_{k=-∞}^∞ r^{|k|} e^{ikθ}.

Decompose this sum into a sum of two geometric series and show that

(7.1.141) p(r, θ) = (1/2π) (1 − r²)/(1 − 2r cos θ + r²).
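The closed form (7.1.141) is easy to confirm numerically against a partial sum of the series (7.1.140); the sketch below assumes NumPy, and r = 0.7 and the grid are arbitrary choices. (The final line also checks the normalization ∫ p dθ = 1, which is the first claim of the next exercise.)

```python
import numpy as np

r = 0.7
theta = 2 * np.pi * np.arange(512) / 512

# Partial sum of the series (7.1.140); the tail is O(r^K), negligible here.
K = 200
series = sum(r ** abs(k) * np.exp(1j * k * theta)
             for k in range(-K, K + 1)).real / (2 * np.pi)

# Closed form (7.1.141).
closed = (1 - r ** 2) / (2 * np.pi * (1 - 2 * r * np.cos(theta) + r ** 2))
assert np.allclose(series, closed)

# Normalization: ∫_{T¹} p(r, θ) dθ = 1 (periodic trapezoid rule).
assert np.isclose(np.mean(closed) * 2 * np.pi, 1.0)
```

Plotting `closed` for r closer to 1 would show the kernel concentrating near θ = 0, the approximate-identity behavior used in Exercises 8–9.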

8. Show that

∫_{T¹} p(r, θ) dθ = 1, ∀ r ∈ [0, 1),

and, for all δ > 0, p(r, θ) → 0 uniformly for θ ∈ [−π, π] \ [−δ, δ], as r ↗ 1.
Hint. For the first part, integrate the series for p(r, θ) term by term.

9. Show that if f ∈ C(T¹), then P_r f → f uniformly on T¹, as r ↗ 1.
Hint. Look forward, to Lemma 7.2.3.

10. Adapt the proof of Lemma 7.2.3 to establish the following.

Proposition X. If f ∈ R#(T¹), I ⊂ T¹ is an open set on which f is continuous, and K ⊂ I is compact, then P_r f → f uniformly on K, as r ↗ 1.

For comparison, we mention the following result, given in Proposition 4.12 in Chapter 5 of [44].

Proposition Y. If f ∈ R#(T¹) and I ⊂ T¹ is an open set on which f is Hölder continuous, with some positive exponent, then S_N f(θ) → f(θ) as N → ∞, for each θ ∈ I.

Here S_N f is as in (7.1.19), which for n = 1 is

S_N f(θ) = ∑_{k=-N}^N fˆ(k)e^{ikθ}.

11. From the geometric series ∑_{k=0}^∞ z^k = 1/(1 − z), deduce that

∑_{k=1}^∞ z^k/k = log 1/(1 − z), for |z| < 1.

12. Show that

f(θ) = log 1/(1 − e^{iθ}) =⇒ f ∈ R#(T¹).

Set

f_r(θ) = log 1/(1 − re^{iθ}), for 0 ≤ r < 1.

Show that f_r ∈ C(T¹) for each r ∈ [0, 1) and

∥f − f_r∥_{L¹} −→ 0, as r ↗ 1.

Deduce that

f̂_r(k) −→ fˆ(k) as r ↗ 1, for each k ∈ Z.

13. In the setting of Exercise 12, note that, for 0 ≤ r < 1,

f̂_r(k) = (1/2π) ∫₀^{2π} (log 1/(1 − re^{iθ})) e^{-ikθ} dθ.

Input the power series from Exercise 11 and deduce (via Exercise 12) that

fˆ(k) = 1/k, if k ≥ 1, 0, if k ≤ 0.

14. Deduce from Exercise 13 and Proposition Y that there is pointwise convergence (though not absolute convergence)

∑_{k=1}^∞ z^k/k = log 1/(1 − z), for |z| = 1, z ≠ 1.

Note that taking z = −1 yields

1 − 1/2 + 1/3 − 1/4 + ··· = log 2.

For a more elementary approach to this special case, see Exercise 30 for Chapter 4, §5, of [44].
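Both convergence statements in Exercise 14 can be watched numerically; the sketch below uses only the standard library, and the partial-sum length N is an arbitrary choice. The convergence is slow — on the boundary |z| = 1 it is only conditional — which is why the error tolerances are so loose compared with the earlier, absolutely convergent examples.

```python
import cmath
import math

N = 200000

# z = −1: the alternating harmonic series. For an alternating series, the
# error of the Nth partial sum is below the first omitted term, 1/(N+1).
s = sum((-1) ** (k + 1) / k for k in range(1, N + 1))
assert abs(s - math.log(2)) < 1.0 / N

# z = i: conditional convergence at a non-real boundary point, toward the
# principal value log 1/(1 − i) = −log(1 − i).
z = 1j
t = sum(z ** k / k for k in range(1, N + 1))
assert abs(t - (-cmath.log(1 - z))) < 1e-4
```

Compare N terms here, giving roughly 5–6 correct digits, with the handful of terms needed for the rapidly decaying coefficients of smooth functions.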

15. Let f ∈ C(Tⁿ) and assume that there exist f_ν ∈ A(Tⁿ) such that f_ν → f uniformly, and, for some K < ∞, independent of ν,
∥fˆ_ν∥_{ℓ¹} ≤ K.
Show that f ∈ A(Tⁿ). Show that this holds if there exist m > n/2 and K₁ < ∞ such that each f_ν ∈ C^m(Tⁿ) and
Σ_{|α|≤m} ∥f_ν^{(α)}∥_{L²} ≤ K₁.

16. Deduce from Exercise 15 that Lip(T1) ⊂ A(T1). Note the improvement over (7.1.47) (in case n = 1). Note that this result applies to the function in Exercise 1.

7.2. The Fourier transform

Given f ∈ R(Rⁿ) (or more generally f ∈ R#(Rⁿ)), we define its Fourier transform to be
(7.2.1) fˆ(ξ) = Ff(ξ) = (2π)^{−n/2} ∫_{Rⁿ} f(x) e^{−ix·ξ} dx, ξ ∈ Rⁿ.


Similarly, we set
(7.2.2) F*f(ξ) = (2π)^{−n/2} ∫_{Rⁿ} f(x) e^{ix·ξ} dx, ξ ∈ Rⁿ,

and ultimately plan to identify F* as the inverse Fourier transform. Here, of course, we take R(Rⁿ) to consist of complex-valued functions. Recall from §3.1 that this means their real and imaginary parts are separately Riemann integrable. Clearly
(7.2.3) |fˆ(ξ)| ≤ (2π)^{−n/2} ∥f∥_{L¹},
where we use the notation

(7.2.4) ∥f∥_{L¹} = ∫_{Rⁿ} |f(x)| dx.
We also have continuity.
Proposition 7.2.1. If f ∈ R(Rⁿ), then fˆ is continuous on Rⁿ.
Proof. Given ε > 0, pick N < ∞ such that ∫_{|x|>N} |f(x)| dx < ε. Write f = f_N + g_N, where f_N(x) = f(x) for |x| ≤ N, 0 for |x| > N. Then
(7.2.5) fˆ(ξ) = fˆ_N(ξ) + ĝ_N(ξ), and

(7.2.6) |ĝ_N(ξ)| < ε, ∀ ξ.
Meanwhile, for ξ, ζ ∈ Rⁿ,
(7.2.7) fˆ_N(ξ) − fˆ_N(ζ) = (2π)^{−n/2} ∫_{|x|≤N} f(x) (e^{−ix·ξ} − e^{−ix·ζ}) dx,
and
(7.2.8) |e^{−ix·ξ} − e^{−ix·ζ}| ≤ |ξ − ζ| max_η |∇_η e^{−ix·η}| ≤ |x| · |ξ − ζ| ≤ N|ξ − ζ|, for |x| ≤ N,
so
(7.2.9) |fˆ_N(ξ) − fˆ_N(ζ)| ≤ (2π)^{−n/2} N ∥f∥_{L¹} |ξ − ζ|.
Hence each fˆ_N is continuous, and, by (7.2.6), fˆ is a uniform limit of continuous functions, so it is continuous. □

Remark. Proposition 7.2.1 also holds for f ∈ R#(Rn).

We compute some Fourier transforms, making use of the result (5.1.55) that
(7.2.10) ∫_{−∞}^{∞} e^{−x²+xz} dx = √π e^{z²/4}, ∀ z ∈ C.
Taking z = iξ, ξ ∈ R, gives
(7.2.11) ∫_{−∞}^{∞} e^{−x²+ixξ} dx = √π e^{−ξ²/4}, ∀ ξ ∈ R.
Writing

(7.2.12) e^{−|x|²+ix·ξ} = e^{−x₁²+ix₁ξ₁} ··· e^{−xₙ²+ixₙξₙ},
we get
(7.2.13) ∫_{Rⁿ} e^{−|x|²} e^{ix·ξ} dx = π^{n/2} e^{−|ξ|²/4}, ∀ ξ ∈ Rⁿ.

Thus

(7.2.14) g(x) = e^{−|x|²} on Rⁿ ⟹ Fg(ξ) = F*g(ξ) = 2^{−n/2} e^{−|ξ|²/4}.
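The n = 1 case of (7.2.14) can be checked by numerical quadrature. Here is a Python sketch (the helper name `fourier_transform` is ours); the trapezoid rule is extremely accurate here because the Gaussian integrand is smooth and decays rapidly.

```python
import cmath, math

def fourier_transform(f, xi, L=20.0, N=8001):
    # Trapezoid-rule approximation of (2π)^{-1/2} ∫_{-L}^{L} f(x) e^{-ixξ} dx,
    # the n = 1 case of (7.2.1).
    h = 2.0 * L / (N - 1)
    s = 0.0 + 0.0j
    for j in range(N):
        x = -L + j * h
        w = 0.5 if j in (0, N - 1) else 1.0
        s += w * f(x) * cmath.exp(-1j * x * xi)
    return s * h / math.sqrt(2.0 * math.pi)

g = lambda x: math.exp(-x * x)
xi = 1.7
approx = fourier_transform(g, xi)
exact = 2 ** -0.5 * math.exp(-xi * xi / 4.0)  # right side of (7.2.14), n = 1
```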

A change of variable gives generally, for f ∈ R(Rn), a > 0,

(7.2.15) f_a(x) = f(ax) ⟹ Ff_a(ξ) = a^{−n} fˆ(a^{−1}ξ),
and consequently, for b > 0,

(7.2.16) g_b(x) = e^{−b|x|²} on Rⁿ ⟹ Fg_b(ξ) = F*g_b(ξ) = (2b)^{−n/2} e^{−|ξ|²/4b}.
From (7.2.16) we see that Fg_{1/2} = g_{1/2} and also that F*Fg_b = FF*g_b = g_b. The Fourier inversion formula asserts that
(7.2.17) f(x) = (2π)^{−n/2} ∫_{Rⁿ} fˆ(ξ) e^{ix·ξ} dξ,

in appropriate senses, depending on the nature of f. We will approach this by examining
(7.2.18) J_ε f(x) = (2π)^{−n/2} ∫_{Rⁿ} fˆ(ξ) e^{−ε|ξ|²} e^{ix·ξ} dξ,
with ε > 0. By (6.2.3), fˆ(ξ)e^{−ε|ξ|²} is Riemann integrable over Rⁿ whenever f ∈ R(Rⁿ). We can plug in (7.2.1) for fˆ(ξ) and switch order of integration, getting
(7.2.19) J_ε f(x) = (2π)^{−n} ∫∫ f(y) e^{i(x−y)·ξ} e^{−ε|ξ|²} dy dξ = ∫ f(y) H_ε(x − y) dy,
where
(7.2.20) H_ε(x) = (2π)^{−n} ∫_{Rⁿ} e^{−ε|ξ|²+ix·ξ} dξ.
Using (7.2.16), we have
(7.2.21) H_ε(x) = (4πε)^{−n/2} e^{−|x|²/4ε}.
A change of variable and use of ∫_{Rⁿ} e^{−|x|²} dx = π^{n/2} gives

(7.2.22) ∫_{Rⁿ} H_ε(x) dx = 1, ∀ ε > 0.
Using this information, we will be able to prove the following.

Proposition 7.2.2. Assume f is bounded and continuous on Rⁿ, and take J_ε f(x) = ∫ f(y) H_ε(x − y) dy, with H_ε as in (7.2.21)–(7.2.22). Then, as ε ↘ 0,
(7.2.23) J_ε f(x) → f(x), ∀ x ∈ Rⁿ.

It is convenient to put this result in a more general context. If f is bounded and continuous on Rⁿ and h ∈ R(Rⁿ), we define the convolution h ∗ f by
(7.2.24) h ∗ f(x) = ∫_{Rⁿ} h(y) f(x − y) dy.

Clearly
(7.2.25) ∫ |h| dx = A, |f| ≤ M on Rⁿ ⟹ |h ∗ f| ≤ AM on Rⁿ.

Also, a change of variables gives
(7.2.26) h ∗ f(x) = ∫_{Rⁿ} h(x − y) f(y) dy.

We want to analyze the convolution action of a family of integrable functions h_ν on Rⁿ that satisfy the following conditions:

(7.2.27) h_ν ≥ 0, ∫ h_ν dx = 1, ∫_{Rⁿ\S_ν} h_ν dx ≤ ε_ν → 0,

where
(7.2.28) S_ν = {x ∈ Rⁿ : |x| ≤ δ_ν}, δ_ν → 0.
Assume
(7.2.29) f ∈ C(Rⁿ), |f| ≤ M on Rⁿ.
We aim to prove the following.

Lemma 7.2.3. If h_ν ∈ R(Rⁿ) satisfy (7.2.27)–(7.2.28) and if f ∈ C(Rⁿ) satisfies (7.2.29), then
(7.2.30) f_ν(x) = h_ν ∗ f(x) → f(x), ∀ x ∈ Rⁿ,
locally uniformly in x.

Proof. Given that f is continuous, it is uniformly continuous on compact sets, so we can supplement (7.2.29) with
(7.2.31) |x − x′| ≤ δ_ν, |x| ≤ R ⟹ |f(x) − f(x′)| ≤ ε̃_ν(R) → 0,
for each R < ∞. To proceed, write

(7.2.32) f_ν(x) = ∫_{S_ν} h_ν(y) f(x − y) dy + ∫_{Rⁿ\S_ν} h_ν(y) f(x − y) dy = g_ν(x) + r_ν(x).
Clearly
(7.2.33) |r_ν(x)| ≤ Mε_ν, ∀ x ∈ Rⁿ.
Next,

(7.2.34) g_ν(x) − f(x) = ∫_{S_ν} h_ν(y) [f(x − y) − f(x)] dy − ε_ν f(x),
so

(7.2.35) |g_ν(x) − f(x)| ≤ ε̃_ν(R) + Mε_ν, for |x| ≤ R,
hence

(7.2.36) |f_ν(x) − f(x)| ≤ ε̃_ν(R) + 2Mε_ν, for |x| ≤ R,
yielding (7.2.30). □
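The convergence in Lemma 7.2.3 can be watched numerically with the Gauss-Weierstrass kernel H_ε of (7.2.21) playing the role of h_ν. For f = cos, one has J_ε f = e^{−ε} cos exactly (the kernel acts as the Fourier multiplier e^{−ε|ξ|²} evaluated at ξ = 1), so the error shrinks linearly in ε. A Python sketch under these assumptions:

```python
import math

def H(eps, x):
    # Gauss-Weierstrass kernel (7.2.21) with n = 1.
    return math.exp(-x * x / (4.0 * eps)) / math.sqrt(4.0 * math.pi * eps)

def J(f, eps, x, L=12.0, N=24001):
    # Trapezoid-rule approximation of the convolution ∫ H_eps(y) f(x - y) dy.
    h = 2.0 * L / (N - 1)
    s = 0.0
    for j in range(N):
        y = -L + j * h
        w = 0.5 if j in (0, N - 1) else 1.0
        s += w * H(eps, y) * f(x - y)
    return s * h

x0 = 0.4
# For f = cos, J_eps f(x0) = e^{-eps} cos(x0), so the error is O(eps).
errs = [abs(J(math.cos, eps, x0) - math.cos(x0)) for eps in (0.1, 0.01, 0.001)]
```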

In view of (7.2.21)–(7.2.22), Proposition 7.2.2 follows from Lemma 7.2.3. From here, we obtain the following.

Proposition 7.2.4. Assume f is bounded and continuous on Rⁿ, and f, fˆ ∈ R(Rⁿ). Then the Fourier inversion formula (7.2.17) holds for all x ∈ Rⁿ.

Proof. If f ∈ R(Rⁿ), then fˆ is bounded and continuous. If also fˆ ∈ R(Rⁿ), then the right side of (7.2.18) converges to the right side of (7.2.17), i.e., to F*fˆ(x), for each x ∈ Rⁿ, as ε ↘ 0. That is to say,
(7.2.37) lim_{ε↘0} J_ε f(x) = F*fˆ(x), ∀ x ∈ Rⁿ.
In concert with (7.2.23), this proves the proposition. □

Remark. With some more work, one can omit the hypothesis in Proposition 7.2.4 that f be bounded and continuous, and use (7.2.37) to deduce these properties as a conclusion. This sort of reasoning is best carried out in a course on measure theory and integration.

In light of the arguments given above, we see that the following class of functions arises as one that is significant for Fourier analysis.
(7.2.38) A(Rⁿ) = {f ∈ R(Rⁿ) : f bounded and continuous, fˆ ∈ R(Rⁿ)}.
By Proposition 7.2.4, the Fourier inversion formula (7.2.17) holds for all f ∈ A(Rⁿ). It also follows that f ∈ A(Rⁿ) ⇒ fˆ ∈ A(Rⁿ). It is of interest to know when f ∈ A(Rⁿ). We begin with the following simple result. Suppose f ∈ C^k(Rⁿ) has compact support (we write f ∈ C^k_c(Rⁿ)). Then integration by parts yields
(7.2.39) (2π)^{−n/2} ∫_{Rⁿ} f^{(α)}(x) e^{−ix·ξ} dx = (iξ)^α fˆ(ξ), |α| ≤ k.

Hence
(7.2.40) f ∈ C^k_c(Rⁿ) ⟹ |fˆ(ξ)| ≤ C(1 + |ξ|)^{−k} ⟹ fˆ ∈ R(Rⁿ), if k > n,
so
(7.2.41) C^{n+1}_c(Rⁿ) ⊂ A(Rⁿ).
We will obtain some substantially sharper results below.

To proceed, it is useful to bring in the quantities ∥f∥_{L²} and (f, g)_{L²}, defined by
(7.2.42) ∥f∥²_{L²} = ∫_{Rⁿ} |f(x)|² dx,
and
(7.2.43) (f, g)_{L²} = ∫_{Rⁿ} f(x) g̅(x) dx.

(7.2.43) (f, g)L2 = f(x)g(x) dx. Rn 362 7. Fourier analysis

Note that ∥f∥²_{L²} = (f, f)_{L²}. Since elements of R(Rⁿ) are both bounded and integrable, we have
(7.2.44) f ∈ R(Rⁿ) ⟹ ∫_{Rⁿ} |f(x)|² dx ≤ (sup |f|) ∥f∥_{L¹},
where ∥f∥_{L¹} is defined by (7.2.4). Use of (7.2.43) makes R(Rⁿ) an inner product space and use of (7.2.42) makes it a normed space (if we identify f and g whenever ∫ |f − g| dx = 0). In particular, the triangle inequality holds:

(7.2.45) ∥f + g∥L2 ≤ ∥f∥L2 + ∥g∥L2 . A parallel result holds for L1:

(7.2.46) ∥f + g∥L1 ≤ ∥f∥L1 + ∥g∥L1 . The inequality (7.2.46) is immediate from the definition (7.2.4) and the pointwise estimate |f(x) + g(x)| ≤ |f(x)| + |g(x)|. The proof of (7.2.45) takes a longer argument, involving along the way Cauchy’s inequality,

(7.2.47) |(f, g)_{L²}| ≤ ∥f∥_{L²} ∥g∥_{L²}.
See Appendix A.2, on inner product spaces, for a derivation of (7.2.47) and (7.2.45). An important property of the Fourier transform F and of F* is that they preserve the L²-norm and inner product. We first derive this for elements of A(Rⁿ). Since F and F* differ only in replacing e^{−ix·ξ} by its complex conjugate e^{ix·ξ}, we have, for f, g ∈ A(Rⁿ),
(7.2.48) (Ff, g)_{L²} = (f, F*g)_{L²}.
Combining this with Proposition 7.2.4, we have
(7.2.49) f, g ∈ A(Rⁿ) ⟹ (Ff, Fg)_{L²} = (f, F*Fg)_{L²} = (f, g)_{L²}.
One readily obtains a similar result with F replaced by F*. Hence
(7.2.50) ∥Ff∥_{L²} = ∥F*f∥_{L²} = ∥f∥_{L²},
for all f ∈ A(Rⁿ). The result (7.2.50) is called the Plancherel identity. It extends beyond f ∈ A(Rⁿ). We aim to prove the following.
Proposition 7.2.5. If f ∈ R(Rⁿ), then |Ff|² and |F*f|² belong to R(Rⁿ), and (7.2.50) holds.

Note that we do not assert that f ∈ R(Rn) implies fˆ ∈ R(Rn). Indeed, that can fail, as the following simple example shows, in case n = 1. Namely, if

(7.2.51) χ_R(x) = 1 for |x| ≤ R, 0 for |x| > R,
then
(7.2.52) χ̂_R(ξ) = (2π)^{−1/2} ∫_{−R}^{R} e^{−ixξ} dx = (2π)^{−1/2} ∫_{−R}^{R} cos xξ dx = √(2/π) (sin Rξ)/ξ,
a function that is square integrable but not integrable. We use an approximation argument to prove Proposition 7.2.5, making use of the following lemma. Pick k > n.
Lemma 7.2.6. Given f ∈ R(Rⁿ), there exist f_ν ∈ C^k_c(Rⁿ) such that
sup |f_ν| ≤ sup |f|,

(7.2.53) ∥f − fν∥L1 −→ 0, and ∥f − fν∥L2 −→ 0.

Proof. Take f_R(x) = f(x) for |x| < R, 0 for |x| ≥ R. Then f ∈ R(Rⁿ) ⟹ ∫ |f − f_R| dx → 0 as R → ∞, so we specialize to the case f ∈ R_c(Rⁿ). It suffices to treat the case f ≥ 0. Then the existence of f_ν ∈ C_c(Rⁿ) such that 0 ≤ f_ν ≤ f and ∥f − f_ν∥_{L¹} → 0 follows from Proposition 3.1.11. Then also ∥f − f_ν∥_{L²} → 0. Further approximation by elements of C^k_c(Rⁿ) is left to the reader. One might use the Stone-Weierstrass theorem, or perhaps a convolution argument. □
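Returning to the example (7.2.52): the defining integral can be evaluated by direct quadrature and compared with the sine-kernel formula (a Python sketch of ours, with the sine part of e^{−ixξ} dropped since it integrates to zero by symmetry):

```python
import math

def chi_hat(R, xi, N=20001):
    # Trapezoid rule for (2π)^{-1/2} ∫_{-R}^{R} cos(xξ) dx, as in (7.2.52).
    h = 2.0 * R / (N - 1)
    s = 0.0
    for j in range(N):
        x = -R + j * h
        w = 0.5 if j in (0, N - 1) else 1.0
        s += w * math.cos(x * xi)
    return s * h / math.sqrt(2.0 * math.pi)

def chi_hat_exact(R, xi):
    # Closed form sqrt(2/π) sin(Rξ)/ξ from (7.2.52).
    return math.sqrt(2.0 / math.pi) * math.sin(R * xi) / xi
```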

We now move to the proof of Proposition 7.2.5. By (7.2.41), each f_ν belongs to A(Rⁿ), so, by (7.2.50),
(7.2.54) ∥fˆ_ν∥_{L²} = ∥f_ν∥_{L²}, ∀ ν.
By (7.2.3) we have
(7.2.55) sup_ξ |fˆ(ξ) − fˆ_ν(ξ)| ≤ (2π)^{−n/2} ∥f − f_ν∥_{L¹},
so fˆ_ν → fˆ uniformly on Rⁿ. Consequently, for each R < ∞,
(7.2.56) ∫_{|ξ|≤R} |fˆ(ξ)|² dξ = lim_{ν→∞} ∫_{|ξ|≤R} |fˆ_ν(ξ)|² dξ.
Meanwhile, the right side of (7.2.56) is dominated by ∥fˆ_ν∥²_{L²}, so
(7.2.57) ∫_{|ξ|≤R} |fˆ(ξ)|² dξ ≤ lim inf_{ν→∞} ∥fˆ_ν∥²_{L²} = lim inf_{ν→∞} ∥f_ν∥²_{L²} = ∥f∥²_{L²},
the second identity by (7.2.54) and the third by (7.2.53). Since (7.2.57) holds for all R < ∞, we have at this point that
(7.2.58) f ∈ R(Rⁿ) ⟹ ∥fˆ∥_{L²} ≤ ∥f∥_{L²}.

To proceed, we apply this implication to g_ν = f − f_ν, obtaining ∥ĝ_ν∥_{L²} ≤ ∥g_ν∥_{L²}, i.e.,
(7.2.59) ∥fˆ − fˆ_ν∥_{L²} ≤ ∥f − f_ν∥_{L²}.

By (7.2.53), ∥f − f_ν∥_{L²} → 0 as ν → ∞, so
(7.2.60) ∥fˆ − fˆ_ν∥_{L²} → 0.
Hence
(7.2.61) ∥fˆ∥_{L²} = lim_{ν→∞} ∥fˆ_ν∥_{L²}.
Applying (7.2.54) and (7.2.53) again, we get
(7.2.62) f ∈ R(Rⁿ) ⟹ ∥fˆ∥_{L²} = ∥f∥_{L²},
as desired. The same sort of argument applies to F*f. Proposition 7.2.5 is proved.
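The Plancherel identity has an exact discrete counterpart: the unitary discrete Fourier transform preserves the ℓ² norm. A small self-contained Python check (our own illustration, not from the text):

```python
import cmath

def dft(v):
    # Unitary DFT: v̂[k] = n^{-1/2} Σ_m v[m] e^{-2πi mk/n}, the discrete
    # analogue of (7.2.1).
    n = len(v)
    return [sum(v[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n)) / n ** 0.5
            for k in range(n)]

def energy(v):
    # Discrete squared L² norm, the analogue of (7.2.42).
    return sum(abs(z) ** 2 for z in v)

f = [complex((m * m) % 7, (3 * m) % 5) for m in range(64)]
fhat = dft(f)
```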

This L² material sets us up for another type of Fourier inversion formula. To state it, we bring in the following family of linear transformations. For R ∈ (0, ∞) and f ∈ R(Rⁿ), set
(7.2.63) S_R f(x) = (2π)^{−n/2} ∫_{|ξ|≤R} fˆ(ξ) e^{ix·ξ} dξ.
Equivalently,
(7.2.64) S_R f = F*(χ_R fˆ),
with χ_R as in (7.2.51), but this time x ∈ Rⁿ. Note that
(7.2.65) f ∈ R(Rⁿ) ⟹ fˆ bounded and continuous (and square integrable) ⟹ χ_R fˆ ∈ R(Rⁿ) ⟹ F*(χ_R fˆ) bounded and continuous, and square integrable.
We also have
(7.2.66) ∥χ_R fˆ − fˆ∥_{L²} → 0 as R → ∞.
This suggests the following result, our second Fourier inversion formula.
Proposition 7.2.7. Given f ∈ R(Rⁿ),

(7.2.67) ∥S_R f − f∥_{L²} → 0 as R → ∞.

It is convenient to approach this via a lemma. Lemma 7.2.8. If f ∈ A(Rn), then (7.2.67) holds.

Proof. We already know that χ_R fˆ ∈ R(Rⁿ). If f ∈ A(Rⁿ), then also fˆ ∈ R(Rⁿ), so Proposition 7.2.5 applies to χ_R fˆ − fˆ, yielding
(7.2.68) ∥S_R f − f∥_{L²} = ∥F*(χ_R fˆ − fˆ)∥_{L²} = ∥χ_R fˆ − fˆ∥_{L²} → 0. □

To prove Proposition 7.2.7, we use Lemma 7.2.6 to obtain f_ν ∈ A(Rⁿ) such that ∥f − f_ν∥_{L²} → 0. Note also that, by (7.2.65),
(7.2.69) f ∈ R(Rⁿ) ⇒ ∥S_R f∥_{L²} = ∥χ_R fˆ∥_{L²} ≤ ∥fˆ∥_{L²} = ∥f∥_{L²},
for all R ∈ (0, ∞). Now we can write

(7.2.70) SRf − f = (SRf − SRfν) + (SRfν − fν) + (fν − f), and hence

∥SRf − f∥L2 ≤ ∥SR(f − fν)∥L2 + ∥SRfν − fν∥L2 + ∥f − fν∥L2 (7.2.71) ≤ ∥SRfν − fν∥L2 + 2∥f − fν∥L2 , the last inequality by (7.2.69), with f replaced by f − fν. Consequently, by Lemma 7.2.8, for each f ∈ R(Rn),

(7.2.72) lim sup_{R→∞} ∥S_R f − f∥_{L²} ≤ 2∥f − f_ν∥_{L²}, ∀ ν,
and taking ν → ∞ gives (7.2.67). Proposition 7.2.7 is proved.

We next use the Plancherel theorem to establish the following sharpening of (7.2.41). See Propositions 7.2.15–7.2.16 for a further improvement.
Proposition 7.2.9. We have
(7.2.73) k > n/2 ⟹ C^k_c(Rⁿ) ⊂ A(Rⁿ).
Proof. Given f ∈ C^k_c(Rⁿ), we again use the identity (7.2.39). The Plancherel identity then gives
(7.2.74) ∫_{Rⁿ} |ξ^α fˆ(ξ)|² dξ = ∥f^{(α)}∥²_{L²}, ∀ |α| ≤ k.
Summing over |α| ≤ k gives
(7.2.75) ∫_{Rⁿ} (1 + |ξ|)^{2k} |fˆ(ξ)|² dξ ≤ C Σ_{|α|≤k} ∥f^{(α)}∥²_{L²}.

Now Cauchy’s inequality gives
(7.2.76) ∫_{Rⁿ} |fˆ(ξ)| dξ = ∫_{Rⁿ} (1 + |ξ|)^{−k} (1 + |ξ|)^{k} |fˆ(ξ)| dξ ≤ A_{n,k}^{1/2} {∫_{Rⁿ} (1 + |ξ|)^{2k} |fˆ(ξ)|² dξ}^{1/2},
where
(7.2.77) A_{n,k} = ∫_{Rⁿ} (1 + |ξ|)^{−2k} dξ < ∞, if 2k > n.
This establishes the desired finiteness of ∫_{Rⁿ} |fˆ(ξ)| dξ. □

Having seen important roles played by L¹ and L² norms and L² inner products, we are motivated to advertise how Fourier analysis is a natural setting in which to work with spaces of functions larger than R(Rⁿ) or R#(Rⁿ), spaces that are labeled L¹(Rⁿ) and L²(Rⁿ). These are defined using the Lebesgue theory of integration. We give a brief description of this, referring the reader to other sources, such as [15] or [42], for a detailed presentation. To start, we say a set S ⊂ Rⁿ is measurable provided that, for each cell R ⊂ Rⁿ,
(7.2.78) m*(S ∩ R) + m*(R \ S) = V(R),
where m* is the outer measure, defined in (3.1.124). We say a function f : Rⁿ → C is measurable provided that, for each open O ⊂ C, f^{−1}(O) ⊂ Rⁿ is measurable. The Lebesgue integral associates a value in [0, +∞] to
(7.2.79) ∫_{Rⁿ} f(x) dx

for each measurable f satisfying f(x) ≥ 0 for all x. The space L¹(Rⁿ) consists of all measurable functions f such that

(7.2.80) ∥f∥_{L¹} = ∫_{Rⁿ} |f(x)| dx < ∞.

In such a case, one can write f = f₀₊ − f₀₋ + i(f₁₊ − f₁₋) with all f_j measurable and ≥ 0, and with finite integral, and the process alluded to above applies to evaluate these integrals, and hence to evaluate ∫_{Rⁿ} f(x) dx. There is one further wrinkle. The space L¹(Rⁿ) actually consists of equivalence classes of measurable functions satisfying (7.2.80), where the equivalence is f₁ ∼ f₂ if and only if {x ∈ Rⁿ : f₁(x) ≠ f₂(x)} has outer measure 0. This makes L¹(Rⁿ) a normed space.

More generally, for p ∈ [1, ∞), Lᵖ(Rⁿ) consists of equivalence classes of measurable functions for which
(7.2.81) ∥f∥ᵖ_{Lᵖ} = ∫_{Rⁿ} |f(x)|ᵖ dx < ∞.
The only case of p > 1 that we work with here is p = 2. One has
(7.2.82) f, g ∈ L²(Rⁿ) ⟹ fg ∈ L¹(Rⁿ),
and L²(Rⁿ) is an inner product space, via (7.2.43), extended to this setting. These Lᵖ norms satisfy the triangle inequality:

(7.2.83) ∥f + g∥Lp ≤ ∥f∥Lp + ∥g∥Lp . For p = 1 or 2, the proofs of (7.2.83) are as described before. For other p ∈ (1, ∞), which we do not deal with here, the reader can consult the references mentioned above. Thanks to (7.2.83), Lp(Rn) has the structure p of a metric space, with d(f, g) = ∥f − g∥Lp . We say fν → f in L if ∥fν −f∥Lp → 0. For all p ∈ [1, ∞), these spaces have the following important metric properties. The first is a denseness property.

Proposition A. Given f ∈ Lᵖ(Rⁿ) and k ∈ N, there exist f_ν ∈ C^k_c(Rⁿ) such that f_ν → f in Lᵖ.

The next is a completeness property.

Proposition B. If (f_ν) is a Cauchy sequence in Lᵖ(Rⁿ), then there exists f ∈ Lᵖ(Rⁿ) such that f_ν → f in Lᵖ.

We refer to [15] or [42] for proofs of these results. We mention that if f is bounded and continuous on Rⁿ, then f ∈ L¹(Rⁿ) if and only if f ∈ R(Rⁿ). Hence (7.2.38) is equivalent to
(7.2.84) A(Rⁿ) = {f ∈ L¹(Rⁿ) : f is bounded and continuous, and fˆ ∈ L¹(Rⁿ)}.
We also have
(7.2.85) A(Rⁿ) ⊂ L²(Rⁿ),
either by (7.2.44) or by
(7.2.86) ∥f∥²_{L²} ≤ (sup |f|) ∥f∥_{L¹} ≤ (2π)^{−n/2} ∥fˆ∥_{L¹} ∥f∥_{L¹}.

The following neat extensions of Propositions 7.2.5 and 7.2.7 illustrate the usefulness of the Lebesgue theory of integration in Fourier analysis. 368 7. Fourier analysis

Proposition 7.2.10. The maps F and F* have unique continuous linear extensions from
(7.2.87) F, F* : A(Rⁿ) → A(Rⁿ)
to
(7.2.88) F, F* : L²(Rⁿ) → L²(Rⁿ),
and the identities
(7.2.89) F*Ff = f, FF*f = f,
and
(7.2.90) ∥Ff∥_{L²} = ∥F*f∥_{L²} = ∥f∥_{L²}
hold for all f ∈ L²(Rⁿ).
Proposition 7.2.11. Define S_R by
(7.2.91) S_R f(x) = (2π)^{−n/2} ∫_{|ξ|≤R} fˆ(ξ) e^{ix·ξ} dξ.
Then S_R : L²(Rⁿ) → L²(Rⁿ), and
(7.2.92) f ∈ L²(Rⁿ) ⟹ S_R f → f in L²(Rⁿ), as R → ∞.

Proposition 7.2.10 can be proven using Propositions A and B (with p = 2) and the inclusion (7.2.41), which, together with Proposition A, implies that
(7.2.93) given f ∈ L²(Rⁿ), there exist f_ν ∈ A(Rⁿ) such that f_ν → f in L².
The argument goes like this. Given f ∈ L²(Rⁿ), take f_ν as in (7.2.93). Then ∥f_µ − f_ν∥_{L²} → 0 as µ, ν → ∞. Now (7.2.50), applied to f_µ − f_ν ∈ A(Rⁿ), gives

(7.2.94) ∥Ff_µ − Ff_ν∥_{L²} = ∥f_µ − f_ν∥_{L²} → 0,
as µ, ν → ∞. Hence (Ff_µ) is a Cauchy sequence in L²(Rⁿ). By Proposition B, there exists a limit h ∈ L²(Rⁿ), that is, Ff_ν → h in L². One gets the same element h regardless of the choice of (f_ν) such that (7.2.93) holds, and so we set Ff = h. The same argument applies to F*f_ν, which hence converges to F*f. We have
(7.2.95) ∥Ff_ν − Ff∥_{L²}, ∥F*f_ν − F*f∥_{L²} → 0.
From here, the results (7.2.89)–(7.2.90) follow.

As for Proposition 7.2.11, we have ∗ (7.2.96) SRf = F (χRFf), and, thanks to Proposition 7.2.10, f ∈ L2(Rn) =⇒ Ff ∈ L2(Rn) 2 n (7.2.97) =⇒ χRFf → Ff in L (R ) ∗ ∗ 2 n =⇒ F (χRFf) → F Ff in L (R ), and finally, again by Proposition 7.2.10, F ∗Ff = f. We now discuss another way of taking Fourier analysis beyond the setting of A(Rn), which is a normed linear space, with norm ˆ (7.2.98) ∥f∥A = ∥f∥L1 + ∥f∥L1 , for which we have ∗ (7.2.99) ∥Ff∥A = ∥F f∥A = ∥f∥A, for all f ∈ A(Rn), thanks to Proposition 7.2.4 and the fact that F ∗f(ξ) = Ff(−ξ). We define the dual space A′(Rn) to consist of continuous linear functionals (7.2.100) w : A(Rn) −→ C, i.e., those linear maps w satisfying n (7.2.101) |w(f)| ≤ C∥f∥A, ∀ f ∈ A(R ).

The optimal constant in (7.2.101) is denoted ∥w∥A′ . We also use the notation (7.2.102) ⟨f, w⟩ = w(f), f ∈ A(Rn), w ∈ A′(Rn). Then, we define (7.2.103) F, F ∗ : A′(Rn) −→ A′(Rn) by (7.2.104) ⟨f, Fw⟩ = ⟨Ff, w⟩, ⟨f, F ∗w⟩ = ⟨F ∗f, w⟩. Note that

(7.2.105) |⟨f, Fw⟩| = |⟨Ff, w⟩| ≤ ∥w∥A′ ∥Ff∥A = ∥w∥A′ ∥f∥A, so

(7.2.106) ∥Fw∥A′ ≤ ∥w∥A′ . We also have the Fourier inversion formula on A′(Rn): (7.2.107) F ∗Fw = FF ∗w = w, ∀ w ∈ A′(Rn). Indeed, given f ∈ A(Rn), w ∈ A′(Rn), (7.2.108) ⟨f, F ∗Fw⟩ = ⟨F ∗f, Fw⟩ = ⟨F ∗Ff, w⟩ = ⟨f, w⟩, 370 7. Fourier analysis the last identity by Proposition 7.2.4. The proof that FF ∗w = w is similar. Incidentally, this Fourier inversion formula combines with (7.2.106) to yield

(7.2.109) ∥Fw∥A′ = ∥w∥A′ . The space A′(Rn) contains an interesting variety of functions and other objects. For example, there is a natural map (7.2.110) ι : BC(Rn) −→ A′(Rn), where BC(Rn) consists of the set of bounded continuous functions on Rn, with norm

(7.2.111) ∥w∥BC = sup |w|. The map is given by ∫ (7.2.112) ⟨f, ι(w)⟩ = f(x)w(x) dx.

Rn Clearly

(7.2.113) |⟨f, ι(w)⟩| ≤ (sup |w|) ∥f∥_{L¹}.
We also claim that the map ι in (7.2.110) is injective, i.e., w ∈ BC(Rⁿ), ⟨f, ι(w)⟩ = 0 for all f ∈ A(Rⁿ) implies w = 0. This is a consequence of the following more general result, whose proof we leave to the reader.
Lemma 7.2.12. If w ∈ BC(Rⁿ) and ∫_{Rⁿ} f(x) w(x) dx = 0 for all f ∈ C^∞_c(Rⁿ), then w ≡ 0.
The space A′(Rⁿ) also contains some objects more singular than functions. For example, given p ∈ Rⁿ, we define δ_p ∈ A′(Rⁿ) by

(7.2.114) ⟨f, δp⟩ = f(p).

For p = 0, we simply set δ = δ₀. Let us compute some Fourier transforms, and observe the Fourier inversion formula in action. We have
(7.2.115) ⟨f, Fδ⟩ = ⟨Ff, δ⟩ = fˆ(0) = (2π)^{−n/2} ∫ f(x) dx,
so
(7.2.116) Fδ(ξ) = (2π)^{−n/2}, a constant function.
By comparison,
(7.2.117) ⟨f, F*1⟩ = ⟨F*f, 1⟩ = ∫ F*f(ξ) dξ = (2π)^{n/2} FF*f(0) = (2π)^{n/2} f(0),
the last identity by Proposition 7.2.4. Hence
(7.2.118) F*1 = (2π)^{n/2} δ.
Thus (7.2.116) and (7.2.118) illustrate (7.2.107). Generalizing the construction of δ_p, let M ⊂ Rⁿ be a compact, m-dimensional, C¹ surface, and u ∈ C(M). Define uδ_M ∈ A′(Rⁿ) by

(7.2.119) ⟨f, uδ_M⟩ = ∫_M f(x) u(x) dS(x).
Then
(7.2.120) ⟨f, F(uδ_M)⟩ = ⟨Ff, uδ_M⟩ = ∫_M fˆ(ξ) u(ξ) dS(ξ) = (2π)^{−n/2} ∫_M (∫_{Rⁿ} f(x) e^{−ix·ξ} dx) u(ξ) dS(ξ).

Now covering M with coordinate charts and chopping u into pieces supported on coordinate patches, we can repeatedly apply the Fubini theorem to write
(7.2.121) ⟨f, F(uδ_M)⟩ = ∫_{Rⁿ} ((2π)^{−n/2} ∫_M u(ξ) e^{−ix·ξ} dS(ξ)) f(x) dx,
so F(uδ_M) is (the image under ι in (7.2.110) of) an element of BC(Rⁿ), given by
(7.2.122) F(uδ_M)(x) = (2π)^{−n/2} ∫_M u(ξ) e^{−ix·ξ} dS(ξ).
In particular,
(7.2.123) Fδ_M(x) = (2π)^{−n/2} ∫_M e^{−ix·ξ} dS(ξ).

Tempered distributions

Objects such as δ_p and δ_M are examples of “distributions.” A beautiful theory of Fourier analysis on the space of “tempered distributions,” which includes some more singular objects, was constructed by L. Schwartz. We present some of his results here. We start with the space S(Rⁿ), defined as
(7.2.124) S(Rⁿ) = {f ∈ C^∞(Rⁿ) : x^β f^{(α)} bounded, ∀ α, β}.

By this definition, we clearly have f ∈ S(Rⁿ) ⟹ x^β f^{(α)} ∈ S(Rⁿ), ∀ α, β. Using the integrability of (1 + |x|)^{−(n+1)} on Rⁿ, one can show that
(7.2.125) f ∈ S(Rⁿ) ⟹ x^β f^{(α)} ∈ R(Rⁿ), ∀ α, β.
Also, the integration by parts argument in (7.2.39) extends:
(7.2.126) f ∈ S(Rⁿ) ⟹ (2π)^{−n/2} ∫_{Rⁿ} f^{(α)}(x) e^{−ix·ξ} dx = (iξ)^α fˆ(ξ), ∀ α.

In particular, f ∈ S(Rⁿ) ⇒ fˆ ∈ R(Rⁿ), so
(7.2.127) S(Rⁿ) ⊂ A(Rⁿ).
Thus, by Proposition 7.2.4,
(7.2.128) f ∈ S(Rⁿ) ⟹ f = F*Ff = FF*f.
We can also complement (7.2.126) by
(7.2.129) f ∈ S(Rⁿ) ⟹ (2π)^{−n/2} ∫_{Rⁿ} x^β f(x) e^{−ix·ξ} dx = i^{|β|} ∂_ξ^β fˆ(ξ),
and deduce that
(7.2.130) F, F* : S(Rⁿ) → S(Rⁿ).
By (7.2.128), the operators F and F* are inverses of each other on S(Rⁿ). The space S(Rⁿ) carries the following sequence of norms:
(7.2.131) p_k(f) = max_{|α|+|β|≤k} sup_{x∈Rⁿ} |x^β f^{(α)}(x)|.
One says a linear map T : S(Rⁿ) → S(Rⁿ) is continuous provided that, for each k ∈ Z⁺, there exist ℓ ∈ Z⁺ and C_k < ∞ such that
(7.2.132) p_k(Tf) ≤ C_k p_ℓ(f), ∀ f ∈ S(Rⁿ).
The observations leading to (7.2.125) and (7.2.129) show that
(7.2.133) p_k(Ff) ≤ C_k p_{k+n+1}(f), ∀ f ∈ S(Rⁿ),
so F in (7.2.130) is continuous. The same goes for F*. Now a tempered distribution w ∈ S′(Rⁿ) is a continuous linear functional

(7.2.134) w : S(Rn) −→ C, that is to say, w is a linear map from S(Rn) to C with the property that there exists k ∈ Z+ and C < ∞ such that n (7.2.135) |w(f)| ≤ Cpk(f), ∀ f ∈ S(R ). 7.2. The Fourier transform 373

As in (7.2.102), we use the notation (7.2.136) ⟨f, w⟩ = w(f), f ∈ S(Rn), w ∈ S′(Rn). Analysis behind (7.2.127) gives

(7.2.137) ∥f∥_A ≤ C p_{2n+2}(f),
so each w ∈ A′(Rⁿ) also defines an element of S′(Rⁿ). Thus δ_p in (7.2.114) and δ_M in (7.2.119) are examples of tempered distributions. To produce more singular tempered distributions, we can define
(7.2.138) ∂^α : S′(Rⁿ) → S′(Rⁿ)
by
(7.2.139) ⟨f, ∂^α w⟩ = (−1)^{|α|} ⟨f^{(α)}, w⟩.
In this way, we get, for example, δ′ ∈ S′(R). We can also define x^β w for w ∈ S′(Rⁿ) by
(7.2.140) ⟨f, x^β w⟩ = ⟨x^β f, w⟩.

We now define
(7.2.141) F, F* : S′(Rⁿ) → S′(Rⁿ)
by
(7.2.142) ⟨f, Fw⟩ = ⟨Ff, w⟩, ⟨f, F*w⟩ = ⟨F*f, w⟩,
for f ∈ S(Rⁿ), w ∈ S′(Rⁿ). Note that, given w ∈ S′(Rⁿ),
(7.2.143) |⟨f, w⟩| ≤ C p_k(f), ∀ f ∈ S(Rⁿ) ⟹ |⟨f, Fw⟩| ≤ C p_k(Ff) ≤ C C_k p_{k+n+1}(f),
by (7.2.133), so indeed if w ∈ S′(Rⁿ), (7.2.142) defines Fw as an element of S′(Rⁿ). The same goes for F*w. Here is a further extension of the Fourier inversion formula.
Proposition 7.2.13. Given w ∈ S′(Rⁿ),
(7.2.144) F*Fw = FF*w = w.

Proof. Parallel to (7.2.108), we have, for f ∈ S(Rn), w ∈ S′(Rn), (7.2.145) ⟨f, F ∗Fw⟩ = ⟨F ∗f, Fw⟩ = ⟨FF ∗f, w⟩ = ⟨f, w⟩, the last identity by (7.2.128). The same goes for ⟨f, FF ∗w⟩. 

We can extend (7.2.126) and (7.2.129) from S(Rⁿ) to S′(Rⁿ), using (7.2.139)–(7.2.140) and an argument parallel to (7.2.143), to obtain:

Proposition 7.2.14. Given w ∈ S′(Rⁿ),
(7.2.146) F(∂^α w) = (iξ)^α Fw,
and
(7.2.147) F(x^β w) = i^{|β|} ∂_ξ^β Fw,
with similar formulas involving F*.

For example, with δ as in (7.2.114) (p = 0), and n = 1, we have from (7.2.116) and (7.2.146) (7.2.148) Fδ′(ξ) = (2π)−1/2iξ.

More general sufficient condition for f ∈ A(Rn)

We return to the task of identifying elements of A(Rn), and establish a result substantially sharper than Proposition 7.2.9. We mention that an analogous result holds for Fourier series. The interested reader can investi- gate this. To set things up, given f ∈ R(Rn), let

(7.2.149) f_h(x) = f(x + h).
Here is our result in the case n = 1.
Proposition 7.2.15. If f ∈ R(R) and there exists C < ∞ such that
(7.2.150) ∥f − f_h∥_{L²} ≤ C|h|^α, for |h| ≤ 1,
with
(7.2.151) α > 1/2,
then f ∈ A(R).

Proof. A calculation gives
(7.2.152) fˆ_h(ξ) = e^{ihξ} fˆ(ξ),
so, by the Plancherel identity,
(7.2.153) ∥f − f_h∥²_{L²} = ∫_{−∞}^{∞} |1 − e^{ihξ}|² |fˆ(ξ)|² dξ.
Now,
(7.2.154) π/2 ≤ |hξ| ≤ 3π/2 ⟹ |1 − e^{ihξ}|² ≥ 2,
so
(7.2.155) ∥f − f_h∥²_{L²} ≥ 2 ∫_{π/2 ≤ |hξ| ≤ 3π/2} |fˆ(ξ)|² dξ.
If (7.2.150) holds, we deduce that, for 0 < |h| ≤ 1,
(7.2.156) ∫_{2/|h| ≤ |ξ| ≤ 4/|h|} |fˆ(ξ)|² dξ ≤ C|h|^{2α},

hence (setting |h| = 2^{−ℓ+1}), for ℓ ≥ 1,
(7.2.157) ∫_{2^ℓ ≤ |ξ| ≤ 2^{ℓ+1}} |fˆ(ξ)|² dξ ≤ C 2^{−2αℓ}.
Cauchy’s inequality gives
(7.2.158) ∫_{2^ℓ ≤ |ξ| ≤ 2^{ℓ+1}} |fˆ(ξ)| dξ ≤ {∫_{2^ℓ ≤ |ξ| ≤ 2^{ℓ+1}} |fˆ(ξ)|² dξ}^{1/2} × {∫_{2^ℓ ≤ |ξ| ≤ 2^{ℓ+1}} 1 dξ}^{1/2} ≤ C 2^{−αℓ} · 2^{ℓ/2} = C 2^{−(α−1/2)ℓ}.
Summing over ℓ ∈ N and using (again by Cauchy’s inequality)
(7.2.159) ∫_{|ξ|≤2} |fˆ| dξ ≤ C ∥fˆ∥_{L²} = C ∥f∥_{L²},
then gives the proof. □

To see how close to sharp Proposition 7.2.15 is, consider

(7.2.160) f(x) = χ_I(x) = 1 if 0 ≤ x ≤ 1, 0 otherwise.
We have, for |h| ≤ 1,
(7.2.161) ∥f − f_h∥²_{L²} = 2|h|,
so (7.2.150) holds, with α = 1/2. Since A(R) ⊂ C(R), this function does not belong to A(R), so the condition (7.2.151) is about as sharp as it could be.
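The computation (7.2.161) can be confirmed by a crude Riemann sum; f and f_h differ by ±1 on two intervals, each of length |h|. A purely illustrative Python sketch:

```python
def f(x):
    # The indicator χ_I of I = [0, 1], as in (7.2.160).
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def sq_l2_diff(h, N=200000):
    # Midpoint Riemann sum of |f(x) - f(x + h)|^2 over [-2, 2], which
    # contains the support of the integrand for |h| <= 1.
    dx = 4.0 / N
    total = 0.0
    for j in range(N):
        x = -2.0 + (j + 0.5) * dx
        total += (f(x) - f(x + h)) ** 2
    return total * dx
```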

To produce the appropriate generalization to n variables, let us focus on (7.2.158), and note that when R is replaced by Rⁿ, the integral of 1 dξ becomes ∼ 2^{nℓ}, so to obtain the result
(7.2.162) Σ_{ℓ≥1} ∫_{2^ℓ ≤ |ξ| ≤ 2^{ℓ+1}} |fˆ(ξ)| dξ < ∞,
we want
(7.2.163) ∫_{2^ℓ ≤ |ξ| ≤ 2^{ℓ+1}} |fˆ(ξ)|² dξ ≤ C 2^{−2γℓ}, γ > n/2.

It is convenient to rewrite this as
(7.2.164) ∫_{2^ℓ ≤ |ξ| ≤ 2^{ℓ+1}} |ξ|^{2k} |fˆ(ξ)|² dξ ≤ C 2^{−2αℓ},
where
(7.2.165) α > 0, if n = 2k; α > 1/2, if n = 2k + 1.
Now we bring in Proposition 7.2.14. Assume
(7.2.166) ∂^β f = f_β ∈ R(Rⁿ), for |β| ≤ k,
where a priori ∂^β f is defined as an element of S′(Rⁿ), by (7.2.139). Then calculations parallel to (7.2.152)–(7.2.157), applied to f_β in place of f, show that, if
(7.2.167) ∥f_β − (f_β)_h∥_{L²} ≤ C|h|^α, for |β| ≤ k,
then, for ℓ ≥ 1,
(7.2.168) ∫_{2^ℓ ≤ |ξ| ≤ 2^{ℓ+1}} |ξ^β fˆ(ξ)|² dξ ≤ C 2^{−2αℓ}.
Summing over |β| ≤ k then yields (7.2.164). We hence have the following higher dimensional extension of Proposition 7.2.15.
Proposition 7.2.16. Assume n = 2k or n = 2k + 1. Take f ∈ R(Rⁿ) and assume that
(7.2.169) ∂^β f = f_β ∈ R(Rⁿ), for |β| ≤ k,
and that, for each such β,
(7.2.170) ∥f_β − (f_β)_h∥_{L²} ≤ C|h|^α, for |h| ≤ 1,
where α satisfies (7.2.165). Then f ∈ A(Rⁿ).

We mention a result that refines Proposition 7.2.16 when n > 1. To state it, we bring in the difference operators
(7.2.171) ∆_{j,ε} f(x) = ε^{−1}(f(x + εe_j) − f(x)), ∆_ε^β = ∆_{1,ε}^{β₁} ··· ∆_{n,ε}^{βₙ},
where {e₁, …, eₙ} is the standard basis of Rⁿ. Here is the result.
Proposition 7.2.17. Assume n = 2k or n = 2k + 1. Take f ∈ R(Rⁿ) and assume that there exists C < ∞ such that, for |β| ≤ k,
(7.2.172) ∥∆_ε^β f − (∆_ε^β f)_h∥_{L²} ≤ C|h|^α, for |h| ≤ 1, 0 < ε ≤ 1,
where α satisfies (7.2.165). Then f ∈ A(Rⁿ).

We will not present a proof of Proposition 7.2.17, but leave this as a challenge to the ambitious reader.

Exercises

1. Let f : Rn → C satisfy |f (α)(x)| ≤ C(1 + |x|)−(n+1) for |α| ≤ n + 1. Show that |fˆ(ξ)| ≤ C(1 + |ξ|)−(n+1). Deduce that f ∈ A(Rn).

2. Sharpen the result of Exercise 1 as follows. Assume f satisfies
|f^{(α)}(x)| ≤ C(1 + |x|)^{−(n+1)} for |α| ≤ [n/2] + 1.
Then show that f ∈ A(Rⁿ).

3. Take n = 1. For each of the following functions f : R → C, compute fˆ(ξ). (7.2.173) f(x) = e−|x|, 1 (7.2.174) f(x) = , 1 + x2 (7.2.175) f(x) = χ[−1,1](x),

(7.2.176) f(x) = (1 − |x|)χ[−1,1](x).
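For instance, for (7.2.173) one finds fˆ(ξ) = √(2/π)/(1 + ξ²), which follows from ∫₀^∞ e^{−x} cos(xξ) dx = 1/(1 + ξ²) and can be confirmed by quadrature. A Python sketch with our own helper name:

```python
import math

def fhat_exp(xi, L=40.0, N=80001):
    # (2π)^{-1/2} ∫ e^{-|x|} e^{-ixξ} dx = 2 (2π)^{-1/2} ∫_0^∞ e^{-x} cos(xξ) dx,
    # approximated by the trapezoid rule on [0, L]; the tail beyond L = 40
    # is below e^{-40} and is negligible.
    h = L / (N - 1)
    s = 0.0
    for j in range(N):
        x = j * h
        w = 0.5 if j in (0, N - 1) else 1.0
        s += w * math.exp(-x) * math.cos(x * xi)
    return 2.0 * s * h / math.sqrt(2.0 * math.pi)
```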

4. In each case of Exercise 3, record the identity that follows from the Plancherel identity (7.2.50), established in Proposition 7.2.5. 378 7. Fourier analysis

5. Define f_r ∈ C(R) by
f_r(x) = (1 − x²)^r for |x| ≤ 1, 0 for |x| > 1.

Show that fr ∈ A(R) for each r > 0, as a consequence of Proposition 7.2.15. What is the best conclusion one could draw from Proposition 7.2.9?

The next exercises bear on the function
Φₙ(x) = ∫_{S^{n−1}} e^{ix·ξ} dS(ξ), x ∈ Rⁿ.

6. Show that
(a) Φₙ ∈ C^∞(Rⁿ).
(b) (∆ + 1)Φₙ = 0 on Rⁿ.
(c) Φₙ is radial, i.e., Φₙ(x) = φₙ(|x|).

7. Deduce from Exercise 6 and the formula for ∆ in spherical polar coordinates that
φₙ″(s) + ((n − 1)/s) φₙ′(s) + φₙ(s) = 0.
Note that φₙ(s) = Φₙ(se₁) is a smooth, even function of s.

8. Convert the calculation
φₙ(s) = ∫_{S^{n−1}} e^{isξ₁} dS(ξ) = Σ_{k=0}^∞ ((is)^k/k!) ∫_{S^{n−1}} ξ₁^k dS(ξ) = Σ_{ℓ=0}^∞ ((−1)^ℓ/(2ℓ)!) s^{2ℓ} ∫_{S^{n−1}} ξ₁^{2ℓ} dS(ξ)
into a formula for
∫_{S^{n−1}} ξ₁^{2ℓ} dS(ξ), ℓ ∈ N.

9. More generally, produce a formula for
∫_{S^{n−1}} ξ^α dS(ξ), α = (α₁, …, αₙ), αⱼ ∈ Z⁺,
in terms of Φₙ^{(α)}(0).

10. In the spirit of Exercises 8–9, use
e^{−|x|²/4} = π^{−n/2} ∫_{Rⁿ} e^{−|ξ|²} e^{ix·ξ} dξ
to produce a formula for
∫_{Rⁿ} ξ^α e^{−|ξ|²} dξ
in terms of derivatives of e^{−|x|²/4} at x = 0.

11. Show that |x|^{2−n} defines an element of S′(Rⁿ), and
∆(|x|^{2−n}) = Cₙ δ on Rⁿ, Cₙ = −(n − 2)A_{n−1}, for n ≥ 3.
Similarly, show that ∆(log |x|) = 2πδ on R². Hint. Check Exercises 13–14 of §4.4.

7.3. Poisson summation formulas

Comparing Fourier transforms of functions on Rⁿ with Fourier series of related functions on Tⁿ leads to highly nontrivial identities, known as Poisson summation formulas. We derive some of them here. To start, we take
(7.3.1) f ∈ S(Rⁿ), φ(x) = Σ_{ℓ∈Zⁿ} f(x + 2πℓ).
We have
(7.3.2) φ ∈ C^∞(Rⁿ), φ(x) = φ(x + 2πk), ∀ k ∈ Zⁿ,
hence (with slight abuse of notation)
(7.3.3) φ ∈ C^∞(Tⁿ), Tⁿ = Rⁿ/(2πZⁿ).

We next observe that
(7.3.4) ∫_{Tⁿ} φ(x) e^{−ik·x} dx = ∫_{Rⁿ} f(x) e^{−ik·x} dx, ∀ k ∈ Zⁿ.
Consequently, with φ̂(k) defined as in (7.1.1) and fˆ(ξ) defined as in (7.2.1),
(7.3.5) φ̂(k) = (2π)^{−n/2} fˆ(k), ∀ k ∈ Zⁿ.
Now the Fourier inversion formula, in the form of Proposition 7.1.1, applies to φ:
(7.3.6) φ(x) = Σ_{k∈Zⁿ} φ̂(k) e^{ik·x}, ∀ x ∈ Tⁿ.
Putting this together with (7.3.1) and (7.3.5) gives the following general Poisson summation formula.
Proposition 7.3.1. Given f ∈ S(Rⁿ), we have, for each x ∈ Rⁿ,
(7.3.7) Σ_{ℓ∈Zⁿ} f(x + 2πℓ) = (2π)^{−n/2} Σ_{k∈Zⁿ} fˆ(k) e^{ik·x}.
In particular,
(7.3.8) Σ_{ℓ∈Zⁿ} f(2πℓ) = (2π)^{−n/2} Σ_{k∈Zⁿ} fˆ(k).
We can apply this to
(7.3.9) f(x) = e^{−t|x|²},
and use (7.2.16) to evaluate fˆ(k). This leads to
(7.3.10) Σ_{ℓ∈Zⁿ} e^{−4π²t|ℓ|²} = (4πt)^{−n/2} Σ_{k∈Zⁿ} e^{−|k|²/4t}, t > 0.

Taking τ = 4πt, we can rewrite this as
(7.3.11) Σ_{ℓ∈Zⁿ} e^{−πτ|ℓ|²} = τ^{−n/2} Σ_{k∈Zⁿ} e^{−π|k|²/τ}, τ > 0.
This result is known as the Jacobi inversion formula.
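The n = 1 case of the Jacobi inversion formula (7.3.11) is easy to verify numerically, since both sides converge extremely fast (a Python sketch, not part of the text):

```python
import math

def theta(tau, K=60):
    # Two-sided theta sum Σ_{|ℓ|≤K} e^{-π τ ℓ²}; the tail is below
    # e^{-π τ K²}, utterly negligible for the values of τ used here.
    return sum(math.exp(-math.pi * tau * l * l) for l in range(-K, K + 1))
```

The identity asserts θ(τ) = τ^{−1/2} θ(1/τ), and the truncated sums satisfy it to machine precision.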

The Riemann functional equation

The Riemann zeta function ζ(s) is defined by
(7.3.12) ζ(s) = Σ_{k=1}^∞ k^{−s}, Re s > 1.
This defines ζ(s) as a function holomorphic in {s ∈ C : Re s > 1}. See (5.1.58). Here we establish a formula of Riemann that extends ζ(s) beyond the half plane Re s > 1. To start the analysis, we relate ζ(s) to the function
(7.3.13) g(t) = Σ_{k=1}^∞ e^{−πk²t}.
We have
(7.3.14) ∫₀^∞ g(t) t^{s−1} dt = Σ_{k=1}^∞ k^{−2s} π^{−s} ∫₀^∞ e^{−t} t^{s−1} dt = ζ(2s) π^{−s} Γ(s),
for Re s > 1/2. The Gamma function Γ(s) is as in (5.1.56). This gives rise to further identities, via the n = 1 case of the Jacobi inversion formula (7.3.11), i.e.,
(7.3.15) Σ_{ℓ=−∞}^∞ e^{−πℓ²t} = (1/√t) Σ_{k=−∞}^∞ e^{−πk²/t},
which implies
(7.3.16) g(t) = −1/2 + (1/2) t^{−1/2} + t^{−1/2} g(1/t).
To use this, we first note from (7.3.14) that, for Re s > 1,
(7.3.17) Γ(s/2) π^{−s/2} ζ(s) = ∫₀^∞ g(t) t^{s/2−1} dt = ∫₀^1 g(t) t^{s/2−1} dt + ∫₁^∞ g(t) t^{s/2−1} dt.
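The identity (7.3.16), which drives the analytic continuation, follows from (7.3.15) and can be spot-checked numerically (a Python sketch of ours):

```python
import math

def g(t, K=80):
    # g(t) = Σ_{k≥1} e^{-π k² t}, as in (7.3.13); the truncation error
    # is below e^{-π t K²}, which is negligible for t ≥ 0.3.
    return sum(math.exp(-math.pi * k * k * t) for k in range(1, K + 1))

def g_via_inversion(t):
    # Right side of (7.3.16).
    return -0.5 + 0.5 * t ** -0.5 + t ** -0.5 * g(1.0 / t)
```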

Into the integral over $[0,1]$, we substitute the right side of (7.3.16) for $g(t)$, to obtain
\[
\Gamma\Bigl(\frac{s}{2}\Bigr)\pi^{-s/2}\zeta(s) = \int_0^1 \Bigl(-\frac12 + \frac12 t^{-1/2}\Bigr) t^{s/2-1}\, dt + \int_0^1 g(t^{-1}) t^{s/2-3/2}\, dt + \int_1^\infty g(t) t^{s/2-1}\, dt. \tag{7.3.18}
\]
We evaluate the first integral on the right and replace $t$ by $1/t$ in the second integral, to obtain, for $\operatorname{Re} s > 1$,
\[
\Gamma\Bigl(\frac{s}{2}\Bigr)\pi^{-s/2}\zeta(s) = \frac{1}{s-1} - \frac{1}{s} + \int_1^\infty \bigl[t^{s/2} + t^{(1-s)/2}\bigr] g(t) t^{-1}\, dt. \tag{7.3.19}
\]
Note that $g(t) \le C e^{-\pi t}$ for $t\in[1,\infty)$, so the integral on the right side of (7.3.19) defines a function holomorphic for all $s\in\mathbb{C}$. As seen in §5.1, $\Gamma(z)$ is holomorphic on $\mathbb{C}\setminus\{0,-1,-2,-3,\dots\}$. Further results on the Gamma function include the following.

Lemma 7.3.2. The function $1/\Gamma(z)$ extends to be holomorphic on all of $\mathbb{C}$, with zeros at $\{0,-1,-2,-3,\dots\}$.

We refer to [46], Chapter 4, for a proof. Given this, we have from (7.3.19) that $\zeta(s)$ extends to be holomorphic on $\mathbb{C}\setminus\{1\}$.

The formula (7.3.19) does more than establish such a holomorphic extension of the zeta function. Note that the right side of (7.3.19) is invariant under replacing $s$ by $1-s$. Thus we have the following identity, known as Riemann's functional equation:
\[
\Gamma\Bigl(\frac{s}{2}\Bigr)\pi^{-s/2}\zeta(s) = \Gamma\Bigl(\frac{1-s}{2}\Bigr)\pi^{-(1-s)/2}\zeta(1-s). \tag{7.3.20}
\]
The Riemann zeta function plays a central role in the study of prime numbers. Some basic material on this can be found in Chapter 4 of [46], and a great deal more in [11].
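One can spot-check (7.3.20) numerically at a point where both sides are classically known. The sketch below assumes the standard values $\zeta(2) = \pi^2/6$ and $\zeta(-1) = -1/12$ (the latter from the analytic continuation), which are imported from outside this section; the helper name `completed_zeta` is our own.

```python
import math

def completed_zeta(s, zeta_s):
    # the completed factor Gamma(s/2) * pi^(-s/2) * zeta(s) from (7.3.20)
    return math.gamma(s / 2) * math.pi ** (-s / 2) * zeta_s

lhs = completed_zeta(2.0, math.pi ** 2 / 6)   # uses zeta(2) = pi^2/6
rhs = completed_zeta(-1.0, -1.0 / 12)         # uses zeta(-1) = -1/12
assert abs(lhs - rhs) < 1e-12                  # both sides equal pi/6
```

Note that `math.gamma` accepts the negative non-integer argument $-1/2$ arising on the right side.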

Exercises

1. Show that the Poisson summation formula (7.3.8) applies when
\[
f\in\mathcal{A}(\mathbb{R}^n) \ \text{ and } \ \varphi\in\mathcal{A}(\mathbb{T}^n),
\]
where, as in (7.3.1), $\varphi(x) = \sum_\ell f(x+2\pi\ell)$.

7.4. Spherical harmonics

One type of generalization of Fourier series on the circle $S^1 \approx \mathbb{T}^1$ is Fourier series on the $n$-dimensional torus, treated in §7.1. Another, which we treat in this section, involves the unit sphere $S^{n-1}\subset\mathbb{R}^n$, leading to what are called spherical harmonics. Our approach to this theory will emphasize contact with the following Dirichlet problem for harmonic functions on the unit ball
\[
B^n = \{x\in\mathbb{R}^n : |x|<1\}, \qquad \partial B^n = S^{n-1}. \tag{7.4.1}
\]
Namely, given $f\in C(S^{n-1})$, we seek a function $u\in C(\overline{B}^n)\cap C^2(B^n)$ satisfying

\[
\Delta u = 0 \ \text{ on } B^n, \qquad u\big|_{S^{n-1}} = f. \tag{7.4.2}
\]
In case $n=2$, connections between this problem and Fourier series are explored in Exercises 4–8 of §7.1. In particular, we have from (7.1.139)–(7.1.141) that, when $n=2$,
\[
u(re^{i\theta}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{(1-r^2) f(\varphi)}{1 - 2r\cos(\theta-\varphi) + r^2}\, d\varphi. \tag{7.4.3}
\]
A change of variable gives, for $x\in\mathbb{R}^2$, $|x|<1$,
\[
u(x) = \frac{1}{2\pi}\int_{S^1} \frac{(1-|x|^2) f(y)}{|x-y|^2}\, ds(y), \tag{7.4.4}
\]
where $ds(y)$ denotes arc-length. Moving from dimension 2 to dimension $n\ge 3$, we are motivated to try a formula for the solution to (7.4.2) of the form
\[
u(x) = C_n (1-|x|^2)\int_{S^{n-1}} \frac{f(y)}{|x-y|^n}\, dS(y). \tag{7.4.5}
\]

We will show that this works, and along the way calculate the constant $C_n$. First we will show that, for each $f\in C(S^{n-1})$, the function $u$ is harmonic on $B^n$. This is a consequence of the following.

Lemma 7.4.1. For a given $y\in S^{n-1}$ (i.e., $|y|=1$), set
\[
v(x) = (1-|x|^2)|x-y|^{-n}. \tag{7.4.6}
\]
Then $v$ is harmonic on $\mathbb{R}^n\setminus\{y\}$.

Proof. It suffices to show that $w(x) = v(x+y)$ is harmonic on $\mathbb{R}^n\setminus 0$. Since $1 - |x+y|^2 = -(2x\cdot y + |x|^2)$ provided $|y|=1$, we have
\[
-w(x) = 2(y\cdot x)|x|^{-n} + |x|^{2-n}. \tag{7.4.7}
\]

That $|x|^{2-n}$ is harmonic on $\mathbb{R}^n\setminus 0$ we have already seen, as a consequence of the formula for $\Delta$ in polar coordinates (cf. (5.1.59)), which yields
\[
g(x) = \varphi(r) \Longrightarrow \Delta g = \varphi''(r) + \frac{n-1}{r}\varphi'(r). \tag{7.4.8}
\]
Now applying $\partial/\partial x_j$ to a smooth harmonic function on an open set in $\mathbb{R}^n$ gives another, so the following are harmonic on $\mathbb{R}^n\setminus 0$:

\[
w_j(x) = \frac{\partial}{\partial x_j}|x|^{2-n} = (2-n)x_j|x|^{-n}. \tag{7.4.9}
\]
For $n=2$ we take instead
\[
\frac{\partial}{\partial x_j}\log|x| = x_j|x|^{-2}. \tag{7.4.10}
\]
Thus the first term on the right side of (7.4.7) is a linear combination of these functions, so the lemma is proved. □
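Lemma 7.4.1 can be probed numerically (a sketch, not part of the text): a second-order finite-difference Laplacian of $v$, evaluated away from the singular point $y$, should vanish up to discretization error. The helper names, the test point, the step size, and the choice $n=3$ are all our own.

```python
def v(x, y, n):
    # v(x) = (1 - |x|^2) |x - y|^(-n), as in (7.4.6)
    d2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return (1 - sum(xi * xi for xi in x)) * d2 ** (-n / 2)

def laplacian(f, x, h=1e-3):
    # central-difference approximation of the Laplacian at x
    total = 0.0
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        total += (f(xp) + f(xm) - 2 * f(x)) / (h * h)
    return total

y = (0.0, 0.0, 1.0)            # a point of S^2
x = [0.3, -0.2, 0.1]           # a point away from y
lap = laplacian(lambda p: v(p, y, 3), x)
assert abs(lap) < 1e-2         # zero up to O(h^2) discretization error
```

The tolerance only reflects the finite-difference truncation error, not the exact identity $\Delta v = 0$.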

To justify (7.4.5), it remains to show that if $u$ is given by this formula, and $C_n$ is chosen correctly, then $u = f$ on $S^{n-1}$. Note that if we write $x = r\omega$, $\omega\in S^{n-1}$, then (7.4.5) yields
\[
u(r\omega) = \int_{S^{n-1}} p(r,\omega,y) f(y)\, dS(y), \tag{7.4.11}
\]

where
\[
p(r,\omega,y) = C_n(1-r^2)|r\omega - y|^{-n}. \tag{7.4.12}
\]
It is clear that
\[
p(r,\omega,y) \to 0 \ \text{ as } r\nearrow 1, \ \text{ if } \omega\ne y, \tag{7.4.13}
\]
with uniform convergence on each compact subset of $\{(\omega,y)\in S^{n-1}\times S^{n-1} : \omega\ne y\}$. We claim that
\[
\int_{S^{n-1}} p(r,\omega,y)\, dS(y) = C_n', \tag{7.4.14}
\]
a constant independent of $r$ and $\omega$. The independence of $\omega$ follows by rotational symmetry. Thus we can integrate with respect to $\omega$. But Lemma 7.4.1 implies that
\[
p(r,x,y) = C_n(1 - r^2|x|^2)|rx - y|^{-n} \tag{7.4.15}
\]
is harmonic in $x$, for $|x|<1/r$, so the mean value property for harmonic functions gives
\[
\frac{1}{A_{n-1}}\int_{S^{n-1}} p(r,\omega,y)\, dS(\omega) = C_n \tag{7.4.16}
\]
for all $r<1$, $y\in S^{n-1}$. This implies (7.4.14), with $C_n' = C_n A_{n-1}$.

Thus, in view of (7.4.13), $p(r,\omega,y)$ is highly peaked near $\omega = y\in S^{n-1}$ as $r\nearrow 1$, and an argument parallel to that used in the proof of Proposition 7.2.2 (see also Exercise 9 of §7.1) shows that the limit of (7.4.11) as $r\nearrow 1$ is equal to $C_n A_{n-1} f(\omega)$, for each $f\in C(S^{n-1})$. This justifies the formula (7.4.5) and fixes the constant: $C_n = 1/A_{n-1}$. We have proved most of the following.

Proposition 7.4.2. Given $f\in C(S^{n-1})$, the solution in $C(\overline{B}^n)\cap C^2(B^n)$ to (7.4.2) is given by the Poisson integral formula
\[
u(x) = \frac{1}{A_{n-1}}\int_{S^{n-1}} \frac{(1-|x|^2) f(y)}{|x-y|^n}\, dS(y). \tag{7.4.17}
\]
Furthermore this solution is unique.

Proof. It remains to establish uniqueness. In fact, the difference $v$ of two such solutions would be harmonic on $B^n$, belong to $C(\overline{B}^n)$, and vanish on $\partial B^n$. Hence the maximum principle, from Proposition 5.1.7, yields $v\equiv 0$ on $B^n$. □

Another way to write the conclusion (7.4.17) of Proposition 7.4.2 is
\[
u(r\omega) = \frac{1}{A_{n-1}}\int_{S^{n-1}} \frac{(1-r^2) f(y)}{(1 - 2r\,\omega\cdot y + r^2)^{n/2}}\, dS(y), \tag{7.4.18}
\]
for $\omega\in S^{n-1}$, $0\le r<1$.

The next step in the extension of results connecting Fourier series and harmonic functions in §7.1 brings in spherical harmonics. We look for harmonic functions in the form
\[
u(r\omega) = \varphi(r) g(\omega). \tag{7.4.19}
\]
To get this, we use the formula for $\Delta$ in spherical polar coordinates, derived in (5.1.59):
\[
\Delta u(r\omega) = \frac{\partial^2}{\partial r^2} u(r\omega) + \frac{n-1}{r}\frac{\partial}{\partial r} u(r\omega) + \frac{1}{r^2}\Delta_S u(r\omega), \tag{7.4.20}
\]
where $\Delta_S$ is the Laplace operator on the unit sphere $S^{n-1}$. In case $u$ is given by (7.4.19), we have
\[
\Delta u(r\omega) = \Bigl[\varphi''(r) + \frac{n-1}{r}\varphi'(r)\Bigr] g(\omega) + \frac{1}{r^2}\varphi(r)\Delta_S g(\omega). \tag{7.4.21}
\]
Thus (7.4.19) defines a harmonic function on $B^n\setminus 0$ if and only if there exists a constant $\mu$ such that
\[
\Delta_S g = \mu g \ \text{ on } S^{n-1}, \tag{7.4.22}
\]
and
\[
\varphi''(r) + \frac{n-1}{r}\varphi'(r) + \frac{\mu}{r^2}\varphi(r) = 0, \tag{7.4.23}
\]
for $0<r<1$. Note that (7.4.17) implies the harmonic function $u$ is $C^\infty$ on $B^n$. Hence, in (7.4.19), we must have $g\in C^\infty(S^{n-1})$. If $g$ satisfies (7.4.22), we say $g$ is an eigenfunction of $\Delta_S$, with eigenvalue $\mu$. Note that
\[
\mu\int_{S^{n-1}} |g|^2\, dS = \int_{S^{n-1}} (\Delta_S g)\bar g\, dS = -\int_{S^{n-1}} |\operatorname{grad} g|^2\, dS, \tag{7.4.24}
\]
so $\mu\le 0$. Say $\mu = -\lambda^2$. Now, the equation (7.4.23) is an Euler equation, whose solutions are linear combinations of

\[
\varphi(r) = r^k, \tag{7.4.25}
\]
where $k$ satisfies the equation
\[
k(k-1) + (n-1)k - \lambda^2 = 0, \tag{7.4.26}
\]
with roots
\[
k = -\frac{n-2}{2} \pm \frac12\sqrt{(n-2)^2 + 4\lambda^2}. \tag{7.4.27}
\]
The smoothness of $u$ on $B^n$ requires that the exponent in (7.4.25) be nonnegative, so we need
\[
\varphi(r) = r^k, \qquad k = -\frac{n-2}{2} + \frac12\sqrt{(n-2)^2 + 4\lambda^2}. \tag{7.4.28}
\]
Furthermore, such smoothness requires
\[
k\in\mathbb{Z}^+ = \{0,1,2,3,\dots\}. \tag{7.4.29}
\]
Then (7.4.27) yields the eigenvalue $\mu = -\lambda_k^2$, with
\[
\lambda_k^2 = k^2 + (n-2)k = \Bigl(k + \frac{n-2}{2}\Bigr)^2 - \Bigl(\frac{n-2}{2}\Bigr)^2. \tag{7.4.30}
\]
Let us set
\[
V_k = \{g\in C^\infty(S^{n-1}) : \Delta_S g = -\lambda_k^2 g\}. \tag{7.4.31}
\]
We see that if $k\in\mathbb{Z}^+$ and $g\in V_k$, then $u(r\omega) = r^k g(\omega)$ is harmonic on $B^n\setminus 0$ and bounded. It follows that $\{0\}$ is a removable singularity, and $u$ extends to be harmonic on all of $B^n$. (See §A.6 for this removable singularity theorem.) As seen above, this harmonic function $u$ belongs to $C^\infty(B^n)$. It is homogeneous of degree $k$, so $\partial_x^\alpha u(x)$ is homogeneous of degree $k - |\alpha|$. In particular, it is homogeneous of degree 0 for $|\alpha| = k$. Being smooth, it must be constant, so
\[
|\alpha| = k \Longrightarrow \partial_x^\alpha u = c_\alpha, \ \text{const.} \tag{7.4.32}
\]

Hence $u(x)$ is a polynomial in $x$, homogeneous of degree $k$. Let us set
\[
\mathcal{H}_k = \text{space of harmonic polynomials on } \mathbb{R}^n, \ \text{homogeneous of degree } k. \tag{7.4.33}
\]
We have the following.

Proposition 7.4.3. The map

\[
\tau : \mathcal{H}_k \longrightarrow V_k, \qquad \tau u = u\big|_{S^{n-1}}, \tag{7.4.34}
\]
is an isomorphism.

Next, we have the following important orthogonality result.

Proposition 7.4.4. Assume $g_j\in V_j$, $g_k\in V_k$, $j\ne k$. Then

\[
(g_j, g_k)_{L^2(S^{n-1})} = \int_{S^{n-1}} g_j \overline{g_k}\, dS = 0. \tag{7.4.35}
\]
Proof. By (4.4.26) (or (4.4.29), with $M = S^{n-1}$, so $\partial M = \emptyset$), for $g, h\in C^2(S^{n-1})$,

\[
(\Delta_S g, h)_{L^2} = (g, \Delta_S h)_{L^2}. \tag{7.4.36}
\]

Hence, for $g_j, g_k$ as above,
\[
-\lambda_j^2 (g_j, g_k)_{L^2} = (\Delta_S g_j, g_k)_{L^2} = (g_j, \Delta_S g_k)_{L^2} = -\lambda_k^2 (g_j, g_k)_{L^2}. \tag{7.4.37}
\]

Since $j\ne k \Rightarrow \lambda_j \ne \lambda_k$, we have (7.4.35). □

Second proof. Denote by $g_j, g_k$ the corresponding elements of $\mathcal{H}_j, \mathcal{H}_k$, and apply (4.4.29), with $M = B^n$, $\partial M = S^{n-1}$. Note that, due to the homogeneity,
\[
\frac{\partial g_j}{\partial\nu} = j g_j, \qquad \frac{\partial g_k}{\partial\nu} = k g_k, \ \text{ on } S^{n-1}, \tag{7.4.38}
\]
so we get
\[
(j-k)\int_{S^{n-1}} g_j \overline{g_k}\, dS = 0, \tag{7.4.39}
\]
which yields (7.4.35). □

The following result provides very valuable information on the spaces $\mathcal{H}_k$. Let
\[
\mathcal{P}_k = \text{space of polynomials on } \mathbb{R}^n, \ \text{homogeneous of degree } k. \tag{7.4.40}
\]

Proposition 7.4.5. For all $k\in\mathbb{Z}^+$, we have the direct sum decomposition
\[
\mathcal{P}_k = \mathcal{H}_k \oplus |x|^2\mathcal{H}_{k-2} \oplus\cdots\oplus |x|^{2j}\mathcal{H}_{k-2j}, \tag{7.4.41}
\]
with $k\in\{2j, 2j+1\}$.

Proof. By Proposition 7.4.4, the various summands on the right side of (7.4.41) are mutually orthogonal with respect to the L2-inner product on Sn−1, so the sum on the right is direct. It remains to do a dimension count.

We do this by induction on $k$. Note that $\mathcal{P}_0 = \mathcal{H}_0$ and $\mathcal{P}_1 = \mathcal{H}_1$, so (7.4.41) is clear for $k = 0, 1$. Now assume the analogue of (7.4.41) holds for $\mathcal{P}_\ell$, for all $\ell < k$. Given this, the right side of (7.4.41) is equal to
\[
\mathcal{H}_k \oplus |x|^2\mathcal{P}_{k-2} = \mathcal{Q}_k, \tag{7.4.42}
\]
and we need to show this has the same dimension as $\mathcal{P}_k$. The direct sum condition in (7.4.42) implies

\[
\dim\mathcal{H}_k + \dim\mathcal{P}_{k-2} = \dim\mathcal{Q}_k \le \dim\mathcal{P}_k. \tag{7.4.43}
\]
Now consider

\[
\Delta : \mathcal{P}_k \longrightarrow \mathcal{P}_{k-2}. \tag{7.4.44}
\]

The null space is $\mathcal{N}(\Delta) = \mathcal{H}_k$, so the fundamental theorem of linear algebra implies

\[
\dim\mathcal{P}_k = \dim\mathcal{N}(\Delta) + \dim\mathcal{R}(\Delta) \le \dim\mathcal{H}_k + \dim\mathcal{P}_{k-2}. \tag{7.4.45}
\]

Comparison with (7.4.43) gives $\dim\mathcal{P}_k = \dim\mathcal{Q}_k$, and finishes the proof. (It also shows that $\Delta$ in (7.4.44) is surjective.) □

We can use Proposition 7.4.5, and its corollary
\[
\mathcal{P}_k = \mathcal{H}_k \oplus |x|^2\mathcal{P}_{k-2}, \tag{7.4.46}
\]
to compute $\dim\mathcal{H}_k$ (hence $\dim V_k$). To approach this, let us use the notation $\mathcal{P}_k(\mathbb{R}^n)$ for the space of polynomials on $\mathbb{R}^n$ homogeneous of degree $k$, and $\mathcal{P}^k(\mathbb{R}^n)$ for the space of polynomials on $\mathbb{R}^n$ of degree $\le k$, and set
\[
d_k(n) = \dim\mathcal{P}_k(\mathbb{R}^n). \tag{7.4.47}
\]
We see that
\[
d_k(n) = \dim\mathcal{P}^k(\mathbb{R}^{n-1}) = d_k(n-1) + d_{k-1}(n-1) + \cdots + d_0(n-1), \tag{7.4.48}
\]
and similarly

\[
d_{k-1}(n) = d_{k-1}(n-1) + \cdots + d_0(n-1), \quad\text{hence}\quad d_k(n) - d_{k-1}(n) = d_k(n-1).
\]

Similarly $d_{k-1}(n) - d_{k-2}(n) = d_{k-1}(n-1)$, so
\[
d_k(n) - d_{k-2}(n) = d_k(n-1) + d_{k-1}(n-1).
\]
Hence (7.4.46) gives

\[
\dim\mathcal{H}_k = d_k(n-1) + d_{k-1}(n-1) = \dim\mathcal{P}^k(\mathbb{R}^{n-2}) + \dim\mathcal{P}^{k-1}(\mathbb{R}^{n-2}), \tag{7.4.49}
\]
and this is also equal to $\dim V_k$. As one example, we see that
\[
\text{on } S^2, \quad \dim V_k = 2k+1, \tag{7.4.50}
\]
since $\dim\mathcal{P}^k(\mathbb{R}^1) = k+1$. More generally, one can deduce inductively from (7.4.48) that
\[
\dim\mathcal{P}_k(\mathbb{R}^n) = \binom{k+n-1}{k}, \tag{7.4.51}
\]
and hence,
\[
\text{on } S^{n-1}, \quad \dim V_k = \binom{k+n-2}{k} + \binom{k+n-3}{k-1}. \tag{7.4.52}
\]
Returning from dimension counts to other consequences of Proposition 7.4.5, we have the following important algebraic result.

Proposition 7.4.6. Given $g_j\in V_j$ and $g_k\in V_k$, the product satisfies
\[
g_j g_k \in \bigoplus_{\ell=0}^{j+k} V_\ell. \tag{7.4.53}
\]

Proof. The functions $g_j$ and $g_k$ extend to elements of $\mathcal{P}_j$ and $\mathcal{P}_k$, so $g_j g_k$ extends to an element of $\mathcal{P}_{j+k}$. Then (7.4.53) follows from (7.4.41), applied to $\mathcal{P}_{j+k}$. □
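The dimension formulas (7.4.49)–(7.4.52) are easy to sanity-check with exact binomial arithmetic (a sketch with our own helper names; the ranges of $k$ and $n$ are arbitrary):

```python
from math import comb

def dim_homog(k, n):
    # dim P_k(R^n) = C(k+n-1, k), as in (7.4.51); zero for k < 0
    return comb(k + n - 1, k) if k >= 0 else 0

def dim_Vk(k, n):
    # (7.4.52): dim V_k on S^(n-1)
    return comb(k + n - 2, k) + (comb(k + n - 3, k - 1) if k >= 1 else 0)

# consistency with dim H_k = d_k(n) - d_{k-2}(n), from P_k = H_k + |x|^2 P_{k-2}
for n in range(2, 7):
    for k in range(10):
        assert dim_Vk(k, n) == dim_homog(k, n) - dim_homog(k - 2, n)

# on S^2 (n = 3), dim V_k = 2k + 1, as in (7.4.50)
assert [dim_Vk(k, 3) for k in range(5)] == [1, 3, 5, 7, 9]
```

The two asserted identities are exactly (7.4.46) and (7.4.50), checked over a small range of dimensions.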

We hence get the following important density result.

Proposition 7.4.7. The space
\[
\mathcal{V} = \bigcup_{\ell>0}\ \bigoplus_{k=0}^{\ell} V_k \tag{7.4.54}
\]
of finite linear combinations of eigenfunctions of $\Delta_S$ is dense in $C(S^{n-1})$.

Proof. We see that (7.4.54) is an algebra of continuous functions on $S^{n-1}$. Clearly it separates points (since $V_1$ does that) and it is closed under complex conjugates, so the Stone-Weierstrass theorem applies, to yield denseness in $C(S^{n-1})$. □

Corollary 7.4.8. For each $k\in\mathbb{Z}^+$, let
\[
\{Y_k^\ell : \ell\in\Sigma_k\} \tag{7.4.55}
\]
be an orthonormal basis of $V_k$, where $\Sigma_k$ is an index set of cardinality $\dim V_k$. Then
\[
\{Y_k^\ell : k\ge 0,\ \ell\in\Sigma_k\} \tag{7.4.56}
\]
is an orthonormal set of functions on $S^{n-1}$ whose linear span is dense in $C(S^{n-1})$.

These results give rise to constructions parallel to some done in §7.1. Given $f\in\mathcal{R}(S^{n-1})$, set
\[
\hat f(k,\ell) = \int_{S^{n-1}} f(y)\overline{Y_k^\ell(y)}\, dS(y). \tag{7.4.57}
\]
Then set
\[
E_k f(y) = \sum_{\ell\in\Sigma_k} \hat f(k,\ell) Y_k^\ell(y), \tag{7.4.58}
\]
yielding

\[
E_k : \mathcal{R}(S^{n-1}) \longrightarrow V_k. \tag{7.4.59}
\]

The map $E_k$ is an orthogonal projection of $\mathcal{R}(S^{n-1})$ onto $V_k$, satisfying

\[
E_k f = f, \ \forall\, f\in V_k, \qquad f - E_k f \perp V_k, \ \forall\, f\in\mathcal{R}(S^{n-1}), \tag{7.4.60}
\]
in the sense that

\[
(f - E_k f, g)_{L^2} = 0, \quad \forall\, g\in V_k. \tag{7.4.61}
\]

The properties (7.4.59)–(7.4.60) uniquely characterize $E_k$. As such, $E_k$ is independent of the choice of orthonormal basis of $V_k$. We set
\[
S_N f = \sum_{k=0}^N E_k f = \sum_{k=0}^N \sum_{\ell\in\Sigma_k} \hat f(k,\ell) Y_k^\ell. \tag{7.4.62}
\]
Given Corollary 7.4.8, arguments parallel to those used in Proposition 7.1.5 establish the following.

Proposition 7.4.9. We have
\[
f\in\mathcal{R}(S^{n-1}) \Longrightarrow S_N f \to f \ \text{in } L^2\text{-norm},
\]
and
\[
\sum_{k=0}^\infty \sum_{\ell\in\Sigma_k} |\hat f(k,\ell)|^2 = \|f\|_{L^2}^2. \tag{7.4.63}
\]

We are also interested in conditions guaranteeing that $S_N f\to f$ uniformly on $S^{n-1}$. This study is complicated by the fact that, for $n\ge 3$, the eigenfunctions $Y_k^\ell$ are not uniformly bounded. (See Corollary 7.4.22.) Somewhat compensating for this is the following interesting result.

Proposition 7.4.10. For each $k\in\mathbb{Z}^+$, if $\{Y_k^\ell : \ell\in\Sigma_k\}$ is an orthonormal basis of $V_k$, then
\[
\sum_{\ell\in\Sigma_k} |Y_k^\ell(\omega)|^2 = \frac{\dim V_k}{A_{n-1}}, \quad \forall\,\omega\in S^{n-1}. \tag{7.4.64}
\]

Proof. We define the map
\[
\psi_k : S^{n-1} \longrightarrow V_k \tag{7.4.65}
\]
by the property

\[
(g, \psi_k(\omega))_{L^2(S^{n-1})} = g(\omega), \quad \forall\, g\in V_k. \tag{7.4.66}
\]
Then, for $R\in SO(n)$, $g\in V_k$, $\omega\in S^{n-1}$,
\[
(g, \psi_k(R^{-1}\omega))_{L^2} = g(R^{-1}\omega) = \pi_k(R) g(\omega), \tag{7.4.67}
\]
where we define

\[
\pi_k : SO(n) \longrightarrow \mathcal{L}(V_k) \quad\text{by}\quad \pi_k(R) g(\omega) = g(R^{-1}\omega). \tag{7.4.68}
\]
In connection with this, it follows from (4.4.48) that the action of $SO(n)$ on $C^\infty(S^{n-1})$ commutes with $\Delta_S$, so this action preserves $V_k$. To continue, we have (7.4.67) equal to
\[
(\pi_k(R) g, \psi_k(\omega))_{L^2} = (g, \pi_k(R^{-1})\psi_k(\omega))_{L^2}, \tag{7.4.69}
\]
so
\[
\psi_k(R^{-1}\omega) = \pi_k(R^{-1})\psi_k(\omega), \quad \forall\,\omega\in S^{n-1},\ R\in SO(n). \tag{7.4.70}
\]
Now the action of $\pi_k(R^{-1})$ on $V_k$ given by (7.4.68) preserves the $L^2$-norm, so we have
\[
\|\psi_k(R^{-1}\omega)\|_{V_k} = \|\psi_k(\omega)\|_{V_k}, \quad \forall\, R\in SO(n),\ \omega\in S^{n-1}. \tag{7.4.71}
\]

On the other hand, we have, for each $\omega\in S^{n-1}$,
\[
\|\psi_k(\omega)\|_{V_k}^2 = \sum_{\ell\in\Sigma_k} |(Y_k^\ell, \psi_k(\omega))_{L^2}|^2 = \sum_{\ell\in\Sigma_k} |Y_k^\ell(\omega)|^2, \tag{7.4.72}
\]
by (7.4.66). This implies that
\[
\sum_{\ell\in\Sigma_k} |Y_k^\ell(\omega)|^2 = c_k \tag{7.4.73}
\]
is independent of $\omega\in S^{n-1}$. Integrating over $\omega\in S^{n-1}$ then gives (7.4.64). □

For a first application of Proposition 7.4.10, we have the following estimate on $E_k f$, given by (7.4.58). For all $\omega\in S^{n-1}$,
\[
|E_k f(\omega)| \le \sum_{\ell\in\Sigma_k} |\hat f(k,\ell)|\cdot|Y_k^\ell(\omega)|
\le \Bigl(\sum_\ell |\hat f(k,\ell)|^2\Bigr)^{1/2}\Bigl(\sum_\ell |Y_k^\ell(\omega)|^2\Bigr)^{1/2}
= A_{n-1}^{-1/2}(\dim V_k)^{1/2}\|E_k f\|_{V_k}, \tag{7.4.74}
\]
the second inequality by Cauchy's inequality, and the last identity by (7.4.64). Let us use the notation

(7.4.75) Dk = dim Vk.

To proceed, suppose now that f ∈ C2m(Sn−1), and set

(7.4.76) Lf = (1 − ∆S)f, so

\[
E_k L^m f = L^m E_k f = (1+\lambda_k^2)^m E_k f. \tag{7.4.77}
\]

Thus we deduce from (7.4.74) that

\[
A_{n-1}^{1/2}|E_k f(\omega)| \le (1+\lambda_k^2)^{-m} D_k^{1/2}\|E_k L^m f\|_{V_k}, \quad \forall\,\omega\in S^{n-1}. \tag{7.4.78}
\]

Another application of Cauchy's inequality gives
\[
A_{n-1}^{1/2}\sum_{k=M+1}^N |E_k f(\omega)|
\le \sum_{k=M+1}^N (1+\lambda_k^2)^{-m} D_k^{1/2}\|E_k L^m f\|_{V_k}
\le \Bigl(\sum_{k=M+1}^N (1+\lambda_k^2)^{-2m} D_k\Bigr)^{1/2}\Bigl(\sum_{k=M+1}^N \|E_k L^m f\|_{V_k}^2\Bigr)^{1/2}
= \Bigl(\sum_{k=M+1}^N (1+\lambda_k^2)^{-2m} D_k\Bigr)^{1/2}\|(S_N - S_M) L^m f\|_{L^2}. \tag{7.4.79}
\]
Recall that $\lambda_k^2$ is given by (7.4.30) and $D_k$ is given by (7.4.49)–(7.4.52). It follows that
\[
(1+\lambda_k^2)^{-2m} D_k \le c_n (1+k)^{-4m+n-2}. \tag{7.4.80}
\]
Hence
\[
\sum_{k=0}^\infty (1+\lambda_k^2)^{-2m} D_k < \infty, \quad\text{provided } 4m > n-1. \tag{7.4.81}
\]
This leads to the following result, which one can compare with Proposition 7.1.6.

Proposition 7.4.11. Assume
\[
f\in C^{2m}(S^{n-1}), \quad\text{and}\quad 2m > \frac{n-1}{2}. \tag{7.4.82}
\]
Then, for $0\le M < N < \infty$, we have, for all $\omega\in S^{n-1}$,
\[
\sum_{k=M+1}^N |E_k f(\omega)| \le \varepsilon_{mn}(M)\|(S_N - S_M) L^m f\|_{L^2}, \tag{7.4.83}
\]
with

\[
\varepsilon_{mn}(M) \longrightarrow 0, \quad\text{as } M\to\infty. \tag{7.4.84}
\]
Consequently
\[
S_N f(\omega) \longrightarrow f(\omega) \ \text{ as } N\to\infty, \ \text{ uniformly in } \omega\in S^{n-1}. \tag{7.4.85}
\]

Proof. The result (7.4.83) follows from (7.4.79)–(7.4.81). Hence $S_N f$ is uniformly convergent to some limit $g\in C(S^{n-1})$. It follows that $S_N f\to g$ in $L^2$-norm. But Proposition 7.4.9 yields $S_N f\to f$ in $L^2$-norm, so $f = g$, and we have (7.4.85). □

We return to the connection between the Dirichlet problem and spherical harmonic expansions, and examine
\[
u(r\omega) = \sum_{k=0}^\infty r^k E_k f(\omega) = \sum_{k=0}^\infty \sum_{\ell\in\Sigma_k} \hat f(k,\ell) r^k Y_k^\ell(\omega). \tag{7.4.86}
\]

Recall that each term $r^k Y_k^\ell(\omega) = \eta_k^\ell(r\omega)$ is a harmonic polynomial (homogeneous of degree $k$) on $\mathbb{R}^n$. We have from (7.4.52) and the estimate (7.4.74) that, for all $\omega\in S^{n-1}$,
\[
\sum_{k=0}^\infty r^k |E_k f(\omega)| \le c_n \sum_{k=0}^\infty (1+k)^{(n-2)/2} r^k \|E_k f\|_{L^2} \le c_n\Bigl(\sum_{k=0}^\infty (1+k)^{(n-2)/2} r^k\Bigr)\|f\|_{L^2}. \tag{7.4.87}
\]

It follows that, whenever $f\in\mathcal{R}(S^{n-1})$, or more generally $f, |f|^2\in\mathcal{R}^\#(S^{n-1})$, the series (7.4.86) converges uniformly on all balls $B_R = \{x\in\mathbb{R}^n : |x|\le R\}$, for $R<1$, to a function $u\in C(B^n)$. Of course, any partial sum

\[
S_N f(r\omega) = \sum_{k=0}^N r^k E_k f(\omega) \tag{7.4.88}
\]
is harmonic, being a finite sum of harmonic polynomials.

To pass from harmonicity of $S_N f$ to that of $u$ in (7.4.86), we have the following useful result.

Proposition 7.4.12. Let $\mathcal{O}\subset\mathbb{R}^n$ be open. Assume $u_\nu\in C^\infty(\mathcal{O})$ are harmonic and $u_\nu\to u$, uniformly on compact subsets of $\mathcal{O}$. Then

\[
u\in C^\infty(\mathcal{O}), \qquad \partial^\alpha u_\nu \to \partial^\alpha u \ \text{ uniformly on compact subsets of } \mathcal{O}, \tag{7.4.89}
\]
and $u$ is harmonic on $\mathcal{O}$.

Proof. It suffices to show that if $B_R(x_0)\subset\mathcal{O}$ and $0<\rho<R$, then

\[
u_\nu \to u \ \text{uniformly on } \partial B_R(x_0) \Longrightarrow \partial^\alpha u_\nu \ \text{uniformly Cauchy on } B_\rho(x_0). \tag{7.4.90}
\]

Translating and dilating, we can assume $x_0 = 0$ and $R = 1$, and apply (7.4.17), so
\[
u_\mu(x) - u_\nu(x) = \frac{1}{A_{n-1}}\int_{S^{n-1}} \frac{(1-|x|^2)\,[u_\mu(y) - u_\nu(y)]}{|x-y|^n}\, dS(y). \tag{7.4.91}
\]
We can apply $\partial_x^\alpha$ to the right side to get (7.4.90). □

Applying this result to (7.4.86)–(7.4.88), we have the following.

Proposition 7.4.13. Given $f\in\mathcal{R}(S^{n-1})$, or more generally $f, |f|^2\in\mathcal{R}^\#(S^{n-1})$, the function $u$ defined by (7.4.86) is harmonic on $B^n$, and the sequence $S_N f(r\omega)$ of harmonic polynomials given by (7.4.88) satisfies
\[
\partial^\alpha S_N f(x) \longrightarrow \partial^\alpha u(x), \ \text{ uniformly on each ball } B_\rho(0), \ \text{for } \rho<1. \tag{7.4.92}
\]

We now want to show that the solution to the Dirichlet problem (7.4.2), with $f\in C(S^{n-1})$, has the representation (7.4.86). Let us fix some notation, and denote by
\[
\mathrm{PI} : C(S^{n-1}) \longrightarrow \{u\in C(\overline{B}^n)\cap C^\infty(B^n) : \Delta u = 0 \text{ on } B^n\} \tag{7.4.93}
\]
the solution to (7.4.2), given by Proposition 7.4.2, i.e., the Poisson integral
\[
\mathrm{PI}\, f(x) = \frac{1}{A_{n-1}}\int_{S^{n-1}} \frac{(1-|x|^2) f(y)}{|x-y|^n}\, dS(y). \tag{7.4.94}
\]
Here is the result.

Proposition 7.4.14. For all $f\in C(S^{n-1})$,
\[
\mathrm{PI}\, f(r\omega) = \sum_{k=0}^\infty r^k E_k f(\omega), \tag{7.4.95}
\]
for $r\omega\in B^n$.

Proof. Denote the right side of (7.4.95) by $\widetilde{\mathrm{PI}} f(r\omega)$. As seen in (7.4.86)–(7.4.88) and Proposition 7.4.12, we have
\[
\widetilde{\mathrm{PI}} : \mathcal{R}(S^{n-1}) \longrightarrow \{u\in C^\infty(B^n) : \Delta u = 0 \text{ on } B^n\}, \tag{7.4.96}
\]
and
\[
|\widetilde{\mathrm{PI}} f(r\omega)| \le c_n\Bigl(\sum_{k=0}^\infty (1+k)^{(n-2)/2} r^k\Bigr)\|f\|_{L^2}. \tag{7.4.97}
\]
We want to show that
\[
\mathrm{PI}\, f(r\omega) = \widetilde{\mathrm{PI}} f(r\omega), \quad \forall\, r\omega\in B^n, \tag{7.4.98}
\]
for all $f\in C(S^{n-1})$. It is clear that if $f$ is a finite sum of eigenfunctions of $\Delta_S$, i.e., if $f\in\mathcal{V}$, given in (7.4.54), then $\widetilde{\mathrm{PI}} f$ is a smooth solution to (7.4.2), so by the uniqueness part of Proposition 7.4.2, (7.4.98) holds for all $f\in\mathcal{V}$. As seen in Proposition 7.4.7, $\mathcal{V}$ is dense in $C(S^{n-1})$. Thus, given $f\in C(S^{n-1})$, there exist $f_\nu\in\mathcal{V}$ such that
\[
f_\nu \to f \ \text{uniformly on } S^{n-1}, \ \text{hence in } L^2\text{-norm}. \tag{7.4.99}
\]
We have
\[
\mathrm{PI}\, f_\nu(r\omega) = \widetilde{\mathrm{PI}} f_\nu(r\omega), \tag{7.4.100}
\]
for all $\omega\in S^{n-1}$, $r\in[0,1)$. Furthermore, as $\nu\to\infty$,

\[
\mathrm{PI}\, f_\nu(r\omega) \longrightarrow \mathrm{PI}\, f(r\omega), \qquad \widetilde{\mathrm{PI}} f_\nu(r\omega) \longrightarrow \widetilde{\mathrm{PI}} f(r\omega), \tag{7.4.101}
\]
the latter result by (7.4.97), applied to $f - f_\nu$. Together, (7.4.100)–(7.4.101) give (7.4.98), hence (7.4.95). □

Using (7.4.18), we deduce from Proposition 7.4.14 that, for $f\in C(S^{n-1})$, $r\in[0,1)$,
\[
\sum_{k=0}^\infty r^k E_k f(\omega) = \frac{1}{A_{n-1}}\int_{S^{n-1}} \frac{(1-r^2) f(y)}{(1-2r\,\omega\cdot y + r^2)^{n/2}}\, dS(y). \tag{7.4.102}
\]
We are consequently motivated to expand the integrand on the right side of (7.4.102) in powers of $r$. The following "generating function identity,"
\[
(1-2tr+r^2)^{-\alpha} = \sum_{k=0}^\infty C_k^\alpha(t) r^k, \tag{7.4.103}
\]
defines a class of special functions known as Gegenbauer polynomials. To compute these, we can use the identity
\[
(1-z)^{-\alpha} = \sum_{j=0}^\infty \binom{j+\alpha-1}{j} z^j, \tag{7.4.104}
\]
with $z = r(2t-r)$, to write the left side of (7.4.103) as
\[
\sum_{j=0}^\infty \binom{j+\alpha-1}{j} r^j (2t-r)^j
= \sum_{j=0}^\infty \sum_{\ell=0}^j (-1)^\ell \binom{j+\alpha-1}{j}\binom{j}{\ell} r^{j+\ell}(2t)^{j-\ell}
= \sum_{k=0}^\infty \sum_{\ell=0}^{[k/2]} (-1)^\ell \binom{k-\ell+\alpha-1}{k-\ell}\binom{k-\ell}{\ell} (2t)^{k-2\ell} r^k. \tag{7.4.105}
\]

Hence
\[
C_k^\alpha(t) = \sum_{\ell=0}^{[k/2]} (-1)^\ell \binom{k-\ell+\alpha-1}{k-\ell}\binom{k-\ell}{\ell} (2t)^{k-2\ell}. \tag{7.4.106}
\]
We have
\[
\mathrm{PI}\, f(r\omega) = \frac{1-r^2}{A_{n-1}} \sum_{k=0}^\infty r^k \int_{S^{n-1}} C_k^{n/2}(\omega\cdot y) f(y)\, dS(y), \tag{7.4.107}
\]
for $0\le r<1$. Comparison with the left side of (7.4.102) gives
\[
E_k f(\omega) = \frac{1}{A_{n-1}}\int_{S^{n-1}} \bigl[C_k^{n/2}(\omega\cdot y) - C_{k-2}^{n/2}(\omega\cdot y)\bigr] f(y)\, dS(y), \tag{7.4.108}
\]
provided we make the convention that $C_k^\alpha(t) = 0$ for $k<0$. Summing (7.4.108) over $0\le k\le N$ yields
\[
S_N f(\omega) = \frac{1}{A_{n-1}}\int_{S^{n-1}} \bigl[C_N^{n/2}(\omega\cdot y) + C_{N-1}^{n/2}(\omega\cdot y)\bigr] f(y)\, dS(y). \tag{7.4.109}
\]
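The explicit formula (7.4.106) can be checked against (7.4.114): for $\alpha = 1/2$ it should reproduce the first few Legendre polynomials. A sketch (the helper `gen_binom`, implementing the generalized binomial coefficient needed for non-integer $\alpha$, is our own):

```python
import math

def gen_binom(a, j):
    # generalized binomial coefficient a(a-1)...(a-j+1)/j!
    p = 1.0
    for i in range(j):
        p *= a - i
    return p / math.factorial(j)

def gegenbauer(k, alpha, t):
    # C_k^alpha(t) via the explicit sum (7.4.106)
    return sum((-1) ** l
               * gen_binom(k - l + alpha - 1, k - l)
               * math.comb(k - l, l)
               * (2 * t) ** (k - 2 * l)
               for l in range(k // 2 + 1))

# alpha = 1/2 gives the Legendre polynomials, as in (7.4.114)
t = 0.37
expected = [1.0, t, (3 * t**2 - 1) / 2, (5 * t**3 - 3 * t) / 2]
for k, pk in enumerate(expected):
    assert abs(gegenbauer(k, 0.5, t) - pk) < 1e-12
```

The same routine can be compared term by term with a Taylor expansion of $(1-2tr+r^2)^{-\alpha}$ for other values of $\alpha$.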

We seek an alternative formula for $E_k$. For its derivation, it is convenient to temporarily replace the exponent $k$ in $r^k$ by $\nu_k$, given by
\[
\nu_k^2 = \lambda_k^2 + \Bigl(\frac{n-2}{2}\Bigr)^2, \qquad \nu_k = k + \frac{n-2}{2}. \tag{7.4.110}
\]
Multiplying (7.4.102) by $r^{(n-2)/2}$ and making the change of variable $r = e^{-s}$ yields
\[
\sum_{k=0}^\infty e^{-\nu_k s} E_k f(\omega) = \frac{2}{A_{n-1}}\int_{S^{n-1}} \frac{\sinh s}{(2\cosh s - 2\,\omega\cdot y)^{n/2}}\, f(y)\, dS(y). \tag{7.4.111}
\]
Now integrating over $s\in[s_1,\infty)$ and taking $r = e^{-s_1}$ (and dividing by $r^{(n-2)/2}$) gives
\[
\sum_{k=0}^\infty \nu_k^{-1} r^k E_k f(\omega) = \frac{2}{(n-2)A_{n-1}}\int_{S^{n-1}} \frac{f(y)}{(1-2r\,\omega\cdot y + r^2)^{(n-2)/2}}\, dS(y). \tag{7.4.112}
\]
We again apply the generating function identity (7.4.103), this time with $\alpha = (n-2)/2$, and compare coefficients of $r^k$, to get
\[
E_k f(\omega) = \frac{2\nu_k}{(n-2)A_{n-1}}\int_{S^{n-1}} C_k^{(n-2)/2}(\omega\cdot y) f(y)\, dS(y). \tag{7.4.113}
\]
In the classical case $S^2\subset\mathbb{R}^3$, these Gegenbauer polynomials specialize to Legendre polynomials
\[
P_k(t) = C_k^{1/2}(t). \tag{7.4.114}
\]

Since $A_2 = 4\pi$ and $\nu_k = k + 1/2$ in this case, we get
\[
E_k f(\omega) = \frac{2k+1}{4\pi}\int_{S^2} P_k(\omega\cdot y) f(y)\, dS(y). \tag{7.4.115}
\]

We denote the integral kernel of the projection $E_k$ by $E_k(\omega,y)$, so

\[
E_k f(\omega) = \int_{S^{n-1}} E_k(\omega,y) f(y)\, dS(y). \tag{7.4.116}
\]
Then the content of (7.4.113) is that

\[
E_k(\omega,y) = \frac{2\nu_k}{(n-2)A_{n-1}} C_k^{(n-2)/2}(\omega\cdot y). \tag{7.4.117}
\]
The following is a useful general identity.

Proposition 7.4.15. If $\{Y_k^\ell : \ell\in\Sigma_k\}$ is an orthonormal basis of $V_k$, then
\[
E_k(\omega,y) = \sum_{\ell\in\Sigma_k} Y_k^\ell(\omega)\overline{Y_k^\ell(y)}. \tag{7.4.118}
\]

Proof. Denote the right side of (7.4.118) by $F_k(\omega,y)$, and set

\[
F_k f(\omega) = \int_{S^{n-1}} F_k(\omega,y) f(y)\, dS(y). \tag{7.4.119}
\]
We see that $F_k Y_k^\ell = Y_k^\ell$ for all $\ell\in\Sigma_k$ and that $F_k f = 0$ if $f\perp V_k$, so indeed $F_k = E_k$. □

If we set $\omega = y$ in (7.4.118) and integrate over $S^{n-1}$, we get

\[
\int_{S^{n-1}} E_k(y,y)\, dS(y) = \dim V_k. \tag{7.4.120}
\]
Since $\omega = y\in S^{n-1} \Rightarrow \omega\cdot y = 1$, we deduce from (7.4.117) that

\[
\dim V_k = \frac{2\nu_k}{n-2} C_k^{(n-2)/2}(1). \tag{7.4.121}
\]
On the other hand, setting $t = 1$ in (7.4.103), obtaining $(1-r)^{-2\alpha}$, we have
\[
C_k^\alpha(1) = \binom{k+2\alpha-1}{k}, \tag{7.4.122}
\]
so (7.4.121) implies
\[
\dim V_k = \frac{2k+n-2}{n-2}\binom{k+n-3}{k}, \tag{7.4.123}
\]
which, as one can check (writing the numerator in the fraction above as $k + (k+n-2)$), agrees with (7.4.52). Specializing to $n = 3$, we have
\[
P_k(1) = C_k^{1/2}(1) = 1, \tag{7.4.124}
\]
and hence,
\[
\text{on } S^2, \quad \dim V_k = 2\nu_k = 2k+1, \tag{7.4.125}
\]
as in (7.4.50). The following identity is a useful complement to (7.4.120).

Proposition 7.4.16. For each $k\in\mathbb{Z}^+$,
\[
\int_{S^{n-1}}\int_{S^{n-1}} |E_k(\omega,y)|^2\, dS(\omega)\, dS(y) = \dim V_k. \tag{7.4.126}
\]
Proof. From (7.4.118), we have
\[
|E_k(\omega,y)|^2 = \sum_{\ell,m\in\Sigma_k} Y_k^\ell(\omega)\overline{Y_k^\ell(y)}\,\overline{Y_k^m(\omega)}\, Y_k^m(y). \tag{7.4.127}
\]
Integrating over $S^{n-1}\times S^{n-1}$ and using orthonormality of $\{Y_k^\ell : \ell\in\Sigma_k\}$ gives
\[
\int_{S^{n-1}}\int_{S^{n-1}} |E_k(\omega,y)|^2\, dS(\omega)\, dS(y) = \sum_{\ell,m\in\Sigma_k} \delta_{\ell m} = \sum_{\ell\in\Sigma_k} 1 = \dim V_k. \tag{7.4.128}
\]
□

To proceed, we would like to apply the formula (7.4.117) to (7.4.126). The following general comments are useful. Each $T\in SO(n)$ acts on $S^{n-1}$, and we have, for each $y\in S^{n-1}$,
\[
\int_{S^{n-1}} F(\omega\cdot y)\, dS(\omega) = \int_{S^{n-1}} F(T\omega\cdot Ty)\, dS(\omega) = \int_{S^{n-1}} F(\omega\cdot Ty)\, dS(\omega), \tag{7.4.129}
\]
the last identity because $T : S^{n-1}\to S^{n-1}$ is an isometry, and hence preserves volumes. It follows that the integral on the left side of (7.4.129) is independent of $y\in S^{n-1}$. We can fix $e\in S^{n-1}$, and obtain
\[
\int_{S^{n-1}}\int_{S^{n-1}} F(\omega\cdot y)\, dS(\omega)\, dS(y) = A_{n-1}\int_{S^{n-1}} F(\omega\cdot e)\, dS(\omega). \tag{7.4.130}
\]

It follows that, with
\[
\gamma_{nk} = \frac{2k+n-2}{(n-2)A_{n-1}}, \tag{7.4.131}
\]
we have
\[
\int_{S^{n-1}}\int_{S^{n-1}} |E_k(\omega,y)|^2\, dS(\omega)\, dS(y)
= \gamma_{nk}^2 \int_{S^{n-1}}\int_{S^{n-1}} C_k^{(n-2)/2}(\omega\cdot y)^2\, dS(\omega)\, dS(y)
= \gamma_{nk}^2 A_{n-1}\int_{S^{n-1}} C_k^{(n-2)/2}(\omega\cdot e)^2\, dS(\omega), \tag{7.4.132}
\]
and this is equal to $\dim V_k$. For the sake of simplicity, let us specialize this to $n = 3$, i.e., to analysis on $S^2\subset\mathbb{R}^3$. Then we have the Legendre polynomials $P_k$, given by (7.4.124), and (7.4.132) yields
\[
\frac{1}{4\pi}\int_{S^2} P_k(\omega\cdot e)^2\, dS(\omega) = \frac{1}{2k+1}. \tag{7.4.133}
\]
The identities obtained above from (7.4.126) can be generalized. In fact, from (7.4.118) (plus the fact that each $E_j(\omega,y)$ is real valued) we have
\[
\int_{S^{n-1}} E_j(\omega,z) E_k(z,y)\, dS(z) = \delta_{jk} E_k(\omega,y), \quad \forall\,\omega, y\in S^{n-1}. \tag{7.4.134}
\]
Using (7.4.117), we can rewrite this as
\[
\gamma_{nk}\int_{S^{n-1}} C_j^{(n-2)/2}(\omega\cdot z)\, C_k^{(n-2)/2}(z\cdot y)\, dS(z) = \delta_{jk}\, C_k^{(n-2)/2}(\omega\cdot y). \tag{7.4.135}
\]
In particular, for $n = 3$,
\[
\frac{2k+1}{4\pi}\int_{S^2} P_j(\omega\cdot z) P_k(z\cdot y)\, dS(z) = \delta_{jk} P_k(\omega\cdot y). \tag{7.4.136}
\]
Note that taking $j = k$ and $\omega = y = e\in S^2$ gives (7.4.133), since $P_k(1) = 1$.
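Since the integrand in (7.4.133) depends only on $t = \omega\cdot e$, the identity reduces to $(1/2)\int_{-1}^{1} P_k(t)^2\, dt = 1/(2k+1)$, which a crude midpoint-rule quadrature can confirm (helper names and the quadrature size `N` are our own choices):

```python
def legendre(k, t):
    # Legendre P_k via the three-term recurrence (j+1)P_{j+1} = (2j+1) t P_j - j P_{j-1}
    p0, p1 = 1.0, t
    if k == 0:
        return p0
    for j in range(1, k):
        p0, p1 = p1, ((2 * j + 1) * t * p1 - j * p0) / (j + 1)
    return p1

def avg_Pk_sq(k, N=20000):
    # (1/4pi) integral_{S^2} P_k(w.e)^2 dS(w) = (1/2) integral_{-1}^{1} P_k(t)^2 dt
    h = 2.0 / N
    return 0.5 * h * sum(legendre(k, -1 + (i + 0.5) * h) ** 2 for i in range(N))

for k in range(6):
    assert abs(avg_Pk_sq(k) - 1.0 / (2 * k + 1)) < 1e-6
```

The tolerance reflects only the midpoint rule's $O(h^2)$ error, not the exact identity.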

Zonal functions

A zonal function on $S^{n-1}$ is a continuous function of the form
\[
f(\omega) = \varphi(\omega\cdot e), \tag{7.4.137}
\]
where $e = (0,\dots,0,1)$ (which we might call the "north pole"). We write $f\in\mathcal{Z}(S^{n-1})$. Provided $n\ge 3$, an equivalent condition is the following. Consider $SO(n-1)$ as the subgroup of $SO(n)$ consisting of rotations about the $x_n$-axis:
\[
SO(n-1) = \{R\in SO(n) : Re = e\}. \tag{7.4.138}
\]
Then, given $f\in C(S^{n-1})$,
\[
f\in\mathcal{Z}(S^{n-1}) \iff f(R\omega) = f(\omega), \quad \forall\, R\in SO(n-1),\ \omega\in S^{n-1}. \tag{7.4.139}
\]
For $n = 2$, $SO(1)$ consists only of the identity transformation, and this equivalence fails. In the rest of this subsection, we will assume $n\ge 3$. We define the space of zonal harmonics
\[
\mathcal{Z}_k = V_k \cap \mathcal{Z}(S^{n-1}). \tag{7.4.140}
\]

We have seen examples of elements of $\mathcal{Z}_k$, namely
\[
Z_k(\omega) = C_k^{(n-2)/2}(\omega\cdot e). \tag{7.4.141}
\]
The following complement to Proposition 7.4.7 is of key importance.

Proposition 7.4.17. The linear span of $\{Z_k : k\in\mathbb{Z}^+\}$ is dense in $\mathcal{Z}(S^{n-1})$.

Proof. From (7.4.106) we see that $C_k^\alpha(t)$ is a polynomial in $t$ of degree $k$, whose leading term is
\[
\binom{k+\alpha-1}{k}(2t)^k. \tag{7.4.142}
\]
Thus the linear span of $\{C_k^\alpha : k\in\mathbb{Z}^+\}$ is the space of all polynomials in $t$, which, by the Weierstrass approximation theorem, is dense in $C([-1,1])$, so we have the proposition. □

Here is a basic corollary of Proposition 7.4.17. Proposition 7.4.18. For each k ∈ Z+,

\[
\mathcal{Z}_k = \operatorname{Span}(Z_k). \tag{7.4.143}
\]

In particular dim Zk = 1.

Proof. Suppose $f\in\mathcal{Z}_k$ and $f\perp Z_k$, i.e., $(f, Z_k)_{L^2} = 0$. Clearly $(f, Z_j)_{L^2} = 0$ for all $j\ne k$, so $(f, g)_{L^2} = 0$ for all $g$ that are finite linear combinations of $\{Z_j\}$, hence, by Proposition 7.4.17, for all $g\in\mathcal{Z}(S^{n-1})$. Taking $g = f$ yields $\int |f|^2\, dS = 0$, hence $f\equiv 0$. □

To proceed, let us set
\[
Y_k^0(\omega) = \|Z_k\|_{L^2}^{-1} Z_k(\omega), \tag{7.4.144}
\]

so $\{Y_k^0 : k\in\mathbb{Z}^+\}$ is an orthonormal set of functions in $\mathcal{Z}(S^{n-1})$. As observed below (7.4.68), the action of $SO(n)$ on $C(S^{n-1})$ given by
\[
\pi(R) f(\omega) = f(R^{-1}\omega) \tag{7.4.145}
\]
preserves each space $V_k$. Hence it commutes with $E_k$. Thus the characterization (7.4.139) implies
\[
E_k : \mathcal{Z}(S^{n-1}) \longrightarrow \mathcal{Z}_k. \tag{7.4.146}
\]
By (7.4.143), the identity (7.4.58) specializes to
\[
f\in\mathcal{Z}(S^{n-1}) \Longrightarrow E_k f(\omega) = (f, Y_k^0)_{L^2}\, Y_k^0(\omega) = \hat f(k,0)\, Y_k^0(\omega). \tag{7.4.147}
\]
Hence Proposition 7.4.9 specializes to the following.

Proposition 7.4.19. If $f\in\mathcal{Z}(S^{n-1})$, then
\[
S_N f(\omega) = \sum_{k=0}^N \hat f(k,0)\, Y_k^0(\omega) \tag{7.4.148}
\]
has the property that
\[
S_N f \longrightarrow f \ \text{ in } L^2\text{-norm}, \tag{7.4.149}
\]
and
\[
\sum_{k=0}^\infty |\hat f(k,0)|^2 = \|f\|_{L^2}^2. \tag{7.4.150}
\]
If we specialize to $n = 3$, then (7.4.133) yields
\[
Y_k^0(\omega) = \Bigl(\frac{2k+1}{4\pi}\Bigr)^{1/2} P_k(\omega\cdot e), \tag{7.4.151}
\]
and we have, for $f(\omega) = \varphi(\omega\cdot e)$,
\[
\varphi(\omega\cdot e) = \sum_{k=0}^\infty \varphi_k P_k(\omega\cdot e), \tag{7.4.152}
\]
with
\[
\varphi_k = \frac{2k+1}{4\pi}\int_{S^2} \varphi(\omega\cdot e) P_k(\omega\cdot e)\, dS(\omega) = \Bigl(k+\frac12\Bigr)\int_{-1}^1 \varphi(t) P_k(t)\, dt, \tag{7.4.153}
\]
a result known as the Funk-Hecke theorem. We turn to a consideration of the transformation
\[
\Pi : C(S^{n-1}) \longrightarrow \mathcal{Z}(S^{n-1}), \tag{7.4.154}
\]
given by
\[
\Pi f(\omega) = \int_{SO(n-1)} f(R^{-1}\omega)\, dR, \tag{7.4.155}
\]

where $dR$ denotes the Haar measure on $SO(n-1)$, introduced in (3.2.45). Since $\pi(R)$ commutes with $E_k$, so does $\Pi$, and we have

\[
\Pi_k : V_k \longrightarrow \mathcal{Z}_k, \qquad \Pi_k = \Pi\big|_{V_k}. \tag{7.4.156}
\]
Clearly

\[
f\in\mathcal{Z}_k \Longrightarrow \Pi_k f = f. \tag{7.4.157}
\]
Also, with
\[
\mathcal{Z}_k^\perp = \{f\in V_k : (f, Z_k)_{L^2} = 0\}, \tag{7.4.158}
\]
we have
\[
f\in\mathcal{Z}_k^\perp \Longrightarrow f_R\in\mathcal{Z}_k^\perp, \quad \forall\, R\in SO(n-1), \tag{7.4.159}
\]
where $f_R(\omega) = f(R^{-1}\omega)$, so
\[
\Pi_k : \mathcal{Z}_k^\perp \longrightarrow \mathcal{Z}_k^\perp \cap \mathcal{Z}_k = 0. \tag{7.4.160}
\]
To summarize:

Proposition 7.4.20. The map $\Pi_k$, given by (7.4.155)–(7.4.156), is the orthogonal projection of $V_k$ onto $\mathcal{Z}_k$, hence, for $f\in V_k$,
\[
\Pi_k f(\omega) = (f, Y_k^0)_{L^2}\, Y_k^0(\omega). \tag{7.4.161}
\]

Incidentally, note from (7.4.155) that, for all f ∈ Vk,

\[
\Pi_k f(e) = f(e). \tag{7.4.162}
\]
Thus we have:

Corollary 7.4.21. For $f\in V_k$,
\[
f\perp Y_k^0 \Longrightarrow f(e) = 0. \tag{7.4.163}
\]
In particular, if $\{Y_k^\ell : \ell\in\Sigma_k\}$ is an orthonormal basis of $V_k$,
\[
\ell\ne 0 \Longrightarrow Y_k^\ell(e) = 0. \tag{7.4.164}
\]

Taking into account the identity (7.4.64), we have:

Corollary 7.4.22. For $Y_k^0$ given by (7.4.141) and (7.4.144), we have
\[
|Y_k^0(e)|^2 = \frac{\dim V_k}{A_{n-1}}. \tag{7.4.165}
\]

This gives a sharp illustration of the fact, mentioned above Proposition 7.4.10, that the eigenfunctions $Y_k^\ell$ are not uniformly bounded. For $n = 3$, (7.4.165) specializes to
\[
Y_k^0(e) = \Bigl(\frac{2k+1}{4\pi}\Bigr)^{1/2}, \ \text{ on } S^2, \tag{7.4.166}
\]
which we have already seen in (7.4.151).
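The zonal expansion (7.4.152)–(7.4.153) can be illustrated with the elementary example $\varphi(t) = t^2$, whose Legendre expansion is $t^2 = \tfrac13 P_0(t) + \tfrac23 P_2(t)$. The sketch below uses a rough midpoint-rule quadrature; the helper names and the example function are our own choices.

```python
def legendre(k, t):
    # Legendre P_k via the standard three-term recurrence
    p0, p1 = 1.0, t
    if k == 0:
        return p0
    for j in range(1, k):
        p0, p1 = p1, ((2 * j + 1) * t * p1 - j * p0) / (j + 1)
    return p1

def zonal_coeff(phi, k, N=20000):
    # phi_k = (k + 1/2) integral_{-1}^{1} phi(t) P_k(t) dt, as in (7.4.153)
    h = 2.0 / N
    return (k + 0.5) * h * sum(
        phi(-1 + (i + 0.5) * h) * legendre(k, -1 + (i + 0.5) * h) for i in range(N))

coeffs = [zonal_coeff(lambda t: t * t, k) for k in range(4)]
# t^2 = (1/3) P_0(t) + (2/3) P_2(t)
assert abs(coeffs[0] - 1 / 3) < 1e-5
assert abs(coeffs[2] - 2 / 3) < 1e-5
assert abs(coeffs[1]) < 1e-5 and abs(coeffs[3]) < 1e-5
```

The vanishing of the odd coefficients reflects the parity of $\varphi$, and the computed nonzero coefficients match the exact expansion to quadrature accuracy.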

Action of SO(n) on the eigenspaces Vk

As we have seen above, the rotation group $SO(n)$ acts on functions on $S^{n-1}$ by
\[
\pi(R) f(\omega) = f(R^{-1}\omega). \tag{7.4.167}
\]
This action works on various function spaces, including $C^\infty(S^{n-1})$, $C(S^{n-1})$, and $\mathcal{R}(S^{n-1})$. Since the transformation $R$ is an isometry on $S^{n-1}$, this action on $C^\infty(S^{n-1})$ commutes with the Laplace operator $\Delta_S$, so one has each eigenspace $V_k$ invariant, yielding

(7.4.168) πk : SO(n) −→ L(Vk).

One can check from the definition (7.4.167) that the map πk is continuous, in fact smooth. Also, for Rj ∈ SO(n),

(7.4.169) πk(R1R2) = πk(R1)πk(R2), πk(I) = I.

We say $\pi_k$ is a representation of $SO(n)$ on $V_k$. We also see that $\pi(R)$, and hence each $\pi_k(R)$, preserves the $L^2$-norm. A smooth representation
\[
\rho : SO(n) \longrightarrow \mathcal{L}(V) \tag{7.4.170}
\]
on a finite-dimensional inner product space $V$ that preserves the norm is called a unitary representation. A linear subspace $W\subset V$ is said to be invariant under the representation $\rho$ provided
\[
\rho(R) : W \longrightarrow W, \quad \forall\, R\in SO(n). \tag{7.4.171}
\]
When this holds, and $\rho$ is unitary, we also have
\[
\rho(R) : W^\perp \longrightarrow W^\perp, \quad \forall\, R\in SO(n), \tag{7.4.172}
\]
where $W^\perp = \{v\in V : (v,w) = 0,\ \forall\, w\in W\}$. Indeed, if $w\in W$, $v\in W^\perp$, $R\in SO(n)$, then
\[
(w, \rho(R)v) = (\rho(R)^* w, v) = (\rho(R^{-1})w, v), \tag{7.4.173}
\]
which vanishes if (7.4.171) holds.

The unitary representation ρ is said to be irreducible if V has no proper invariant linear subspaces. As just seen, if W is a proper invariant subspace, then V = W ⊕ W ⊥ and ρ acts on each factor. If dim V < ∞, an inductive procedure decomposes

\[
V = W_0 \oplus\cdots\oplus W_M \tag{7.4.174}
\]
into factors $W_j$, on each of which $SO(n)$ acts irreducibly. With these notions in hand, we can make the following important statement about the representations $\pi_k$ defined by (7.4.167)–(7.4.168).

Proposition 7.4.23. For each $k\in\mathbb{Z}^+$, the representation $\pi_k$ of $SO(n)$ on the eigenspace $V_k$ is irreducible.

Proof. Suppose $W\subset V_k$ is invariant under $\pi_k$ and $f\in W$, $f\ne 0$. Then $f(\omega_0)\ne 0$ for some $\omega_0\in S^{n-1}$. Pick $R_0\in SO(n)$ such that $R_0\omega_0 = e$. Then $g = \pi(R_0) f\in W$ and $g(e)\ne 0$. Applying $\Pi_k$, defined by (7.4.156), we have $h = \Pi_k g\in W\cap\mathcal{Z}_k$, and $h(e) = g(e)\ne 0$. Thanks to Proposition 7.4.18, this implies $Z_k\in W$.

If $W$ is not all of $V_k$, then $W^\perp\subset V_k$ is a nonzero invariant subspace, and the same argument implies $Z_k\in W^\perp$. Contradiction. □

In an effort to understand the nature of unitary representations $\rho$ in (7.4.170), we will bring in the derived map

(7.4.175) σ = Dρ(I): TI SO(n) −→ L(V ). Recall that

(7.4.176) T_I SO(n) = Skew(n),

where

(7.4.177) Skew(n) = {A ∈ M(n, R) : Aᵗ = −A}.

We also bring in the notation

(7.4.178) G = SO(n), g = T_I G,

so g = Skew(n). A key connection between g and G is provided by the following result, involving the matrix exponential (defined in (2.3.115)).

Proposition 7.4.24. Given A ∈ M(n, R),

(7.4.179) A ∈ g ⟺ e^{tA} ∈ G, ∀ t ∈ R.

Proof. Indeed,

(7.4.180) (e^{tA})ᵗ = e^{tAᵗ}, (e^{tA})⁻¹ = e^{−tA},

so (e^{tA})ᵗ = (e^{tA})⁻¹ for all t ∈ R if and only if

(7.4.181) e^{tAᵗ} = e^{−tA}, ∀ t ∈ R,

which in turn holds if and only if Aᵗ = −A. □

There is an algebraic structure on g called the Lie bracket (or the commutator):

(7.4.182) A, B ∈ g ⟹ [A, B] = AB − BA ∈ g.

To establish this for g = Skew(n), note that

(7.4.183) (AB − BA)ᵗ = BᵗAᵗ − AᵗBᵗ = BA − AB,

if A, B ∈ Skew(n). The following calculation captures the interaction between the Lie bracket on g and the product on G. First, for A, B ∈ g, we have, for small t,

(7.4.184) e^{tA} e^{tB} = (I + tA + (t²/2)A² + O(t³))(I + tB + (t²/2)B² + O(t³)) = I + t(A + B) + (t²/2)(A² + 2AB + B²) + O(t³),

and similarly

(7.4.185) e^{tB} e^{tA} = I + t(A + B) + (t²/2)(A² + 2BA + B²) + O(t³),

hence

(7.4.186) e^{tA} e^{tB} = e^{tB} e^{tA} + t²[A, B] + O(t³).

Consequently,

(7.4.187) e^{tA} e^{tB} e^{−tA} e^{−tB} = I + t²[A, B] + O(t³) = e^{t²[A,B]} + O(t³).
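The expansion (7.4.187) is easy to probe numerically. The following sketch (our own illustration, not part of the text; the series-based `expm` helper is hypothetical, chosen to keep the example dependency-free) compares the group commutator with I + t²[A,B] for two skew matrices:

```python
import numpy as np

def expm(X, terms=40):
    """Matrix exponential summed from its Taylor series (fine for small matrices)."""
    out = np.eye(X.shape[0])
    term = np.eye(X.shape[0])
    for n in range(1, terms):
        term = term @ X / n
        out = out + term
    return out

# two elements of Skew(3)
A = np.array([[0., -1, 0], [1, 0, 0], [0, 0, 0]])
B = np.array([[0., 0, -1], [0, 0, 0], [1, 0, 0]])

t = 1e-3
group_comm = expm(t * A) @ expm(t * B) @ expm(-t * A) @ expm(-t * B)
approx = np.eye(3) + t**2 * (A @ B - B @ A)   # I + t^2 [A,B]
# the discrepancy is O(t^3), far below the size t^2 of the commutator term
assert np.linalg.norm(group_comm - approx) < 1e-7
assert np.linalg.norm(group_comm - np.eye(3)) > 1e-7
```

Shrinking t by a factor of 10 shrinks the discrepancy by roughly 1000, confirming the O(t³) error.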

We apply these calculations to show how the Lie bracket is preserved under maps on g arising from representations of G. Thus, assume we have a smooth representation ρ as in (7.4.170), and define σ : g → L(V) as in (7.4.175), so

(7.4.188) σ(A) = (d/ds) ρ(e^{sA}) |_{s=0},

for A ∈ g. Then, for such A,

(7.4.189) (d/dt) ρ(e^{tA}) = (d/ds) ρ(e^{(s+t)A}) |_{s=0} = (d/ds) ρ(e^{sA}) ρ(e^{tA}) |_{s=0} = σ(A) ρ(e^{tA}),

and since γ(t) = ρ(e^{tA}) satisfies γ(0) = I, this gives

(7.4.190) ρ(e^{tA}) = e^{tσ(A)}.

We are ready to prove the following.

Proposition 7.4.25. If ρ is a smooth representation of G on V and σ : g → L(V) is given by (7.4.175), then, for A, B ∈ g, we have

(7.4.191) σ([A, B]) = [σ(A), σ(B)].

Proof. Applying ρ to (7.4.187), we have

(7.4.192) ρ(e^{tA} e^{tB} e^{−tA} e^{−tB}) = ρ(e^{t²[A,B]}) + O(t³).

By (7.4.190) (and a second application of (7.4.187)), the left side of (7.4.192) is equal to

(7.4.193) e^{tσ(A)} e^{tσ(B)} e^{−tσ(A)} e^{−tσ(B)} = e^{t²[σ(A),σ(B)]} + O(t³).

Comparing the right sides of (7.4.192) and (7.4.193), we have

(7.4.194) (d/ds) ρ(e^{s[A,B]}) |_{s=0} = [σ(A), σ(B)],

which gives (7.4.191). □

A linear map σ : g → L(V) satisfying (7.4.191) is called a Lie algebra representation of g. The content of Proposition 7.4.25 is that each smooth representation ρ of G on V gives rise, via (7.4.175), to a Lie algebra representation σ of g on V. We say σ is the derived representation associated with ρ. It follows from (7.4.188) that if ρ is a unitary representation, then

(7.4.195) A ∈ g ⟹ σ(A)* = −σ(A),

i.e., σ is a representation of g by skew-adjoint linear transformations on V. Parallel to (7.4.171), we say a linear subspace W ⊂ V is invariant under the Lie algebra representation σ provided

(7.4.196) σ(A) : W −→ W, ∀ A ∈ g.

When this holds and σ is a skew-adjoint representation, then, parallel to (7.4.172), we have

(7.4.197) σ(A) : W⊥ −→ W⊥, ∀ A ∈ g.

The following result connects the two notions of invariance.

Proposition 7.4.26. A linear subspace W ⊂ V is invariant under the representation ρ of G if and only if it is invariant under the Lie algebra representation σ of g.

Proof. First we observe that, for each A ∈ g,

(7.4.198) ρ(e^{tA}) : W → W, ∀ t ∈ R ⟺ e^{tσ(A)} : W → W, ∀ t ∈ R ⟺ σ(A) : W → W.

This readily gives

(7.4.199) W invariant under ρ ⟹ W invariant under σ.

To finish the argument, we use the fact that, for each R ∈ SO(n), there exists A ∈ Skew(n) such that

(7.4.200) R = e^A,

to get the converse implication to (7.4.199). □

Remark. The content of (7.4.200) is that Exp : Skew(n) → SO(n) is onto. See Appendix A.3, Proposition A.3.8 and Corollary A.3.9.

We say a Lie algebra representation σ of g on V is irreducible if V has no proper linear subspaces invariant under σ. We then have the following immediate consequence of Proposition 7.4.26.

Proposition 7.4.27. Let V be a finite-dimensional vector space. A smooth representation ρ of G on V is irreducible if and only if its derived representation σ of g on V is irreducible.

It is a natural task to classify the family of irreducible skew-adjoint representations of g, up to equivalence, where we say representations σ and τ of g on V and W are equivalent if there is an isomorphism

(7.4.201) U : V −→ W, such that σ(A) = U⁻¹ τ(A) U, ∀ A ∈ g.

In light of Proposition 7.4.23, we expect that a useful approach to understanding the representations πk of SO(n) on Vk would be to classify the irreducible representations of Skew(n) and identify which are equivalent to the representation πk. In the case n = 3, carrying out this program is relatively straightforward, and this is what we proceed to do. To get started, we select the following basis of Skew(3):

(7.4.202)
         ⎛ 0 −1  0 ⎞        ⎛ 0  0 −1 ⎞        ⎛ 0  0  0 ⎞
    A1 = ⎜ 1  0  0 ⎟ , A2 = ⎜ 0  0  0 ⎟ , A3 = ⎜ 0  0 −1 ⎟ .
         ⎝ 0  0  0 ⎠        ⎝ 1  0  0 ⎠        ⎝ 0  1  0 ⎠

Note that the families of transformations e^{tAj} are groups of rotations about the x_{4−j}-axis, each periodic in t of period 2π. A straightforward calculation yields the following "commutator identities,"

(7.4.203) [A1, A2] = A3, [A2, A3] = A1, [A3, A1] = A2.

Now suppose σ : Skew(3) → L(V) is a skew-adjoint representation. Set

(7.4.204) Lj = σ(Aj) ∈ L(V), Lj* = −Lj.

The identities (7.4.203) yield

(7.4.205) [L1, L2] = L3, [L2, L3] = L1, [L3, L1] = L2.

One key to understanding the structure of this representation is provided by

(7.4.206) M = L1² + L2² + L3² ∈ L(V).

Note that

(7.4.207) M = M*, (Mv, v) = −Σ_{j=1}^{3} ∥Lj v∥² ≤ 0, ∀ v ∈ V,

so V has an orthonormal basis of eigenvectors of M, and all its eigenvalues are ≤ 0. Now, using the general identity

(7.4.208) [X, Y²] = [X, Y]Y + Y[X, Y], X, Y ∈ L(V),

we can readily verify that

(7.4.209) [Lj,M] = 0, ∀ j.

It follows that each eigenspace of M is invariant under L1, L2, and L3. This establishes the following.

Lemma 7.4.28. Assume σ is an irreducible skew-adjoint representation of Skew(3) on V. Then

(7.4.210) M = −λ²I, for some λ ∈ [0, ∞).
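As a quick sanity check (ours, not the text's), the commutator identities (7.4.203) for the basis (7.4.202) can be verified directly:

```python
import numpy as np

# The basis (7.4.202) of Skew(3)
A1 = np.array([[0., -1, 0], [1, 0, 0], [0, 0, 0]])
A2 = np.array([[0., 0, -1], [0, 0, 0], [1, 0, 0]])
A3 = np.array([[0., 0, 0], [0, 0, -1], [0, 1, 0]])

def bracket(X, Y):
    return X @ Y - Y @ X

assert np.allclose(bracket(A1, A2), A3)
assert np.allclose(bracket(A2, A3), A1)
assert np.allclose(bracket(A3, A1), A2)
# each Aj is skew: Aj^t = -Aj
for A in (A1, A2, A3):
    assert np.allclose(A.T, -A)
```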

To proceed, we diagonalize L1 on V. Set

(7.4.211) Wµ = {v ∈ V : L1v = iµv}, V = ⊕_{iµ ∈ Spec L1} Wµ.

The structure of σ is determined by how L2 and L3 behave on Wµ. It is convenient to set

(7.4.212) L± = L2 ∓ iL3.

The following key identity follows directly from (7.4.205):

(7.4.213) [L1, L±] = ±iL±.

This leads to the following.

Lemma 7.4.29. For each µ,

(7.4.214) L± : Wµ −→ W_{µ±1}.

In particular, if iµ ∈ Spec L1, then either L+ = 0 on Wµ or i(µ + 1) ∈ Spec L1, and also either L− = 0 on Wµ or i(µ − 1) ∈ Spec L1.

Proof. If v ∈ Wµ, then (7.4.213) implies

(7.4.215) L1L±v = L±L1v ± iL±v = i(µ ± 1)L±v,

and this identity yields the lemma. □

Remark. Because of (7.4.214) and (7.4.215), one calls L± ladder operators.

The following gives more precise information. Proposition 7.4.30. If σ is an irreducible skew-adjoint representation of Skew(3) on V , then Spec L1 must consist of a sequence,

(7.4.216) Spec L1 = i{µ0, µ0 + 1, ..., µ0 + ℓ = µ1},

with

(7.4.217) L+ : W_{µ0+j} −→ W_{µ0+j+1} an isomorphism, for 0 ≤ j ≤ ℓ − 1,

and

(7.4.218) L− : W_{µ1−j} −→ W_{µ1−j−1} an isomorphism, for 0 ≤ j ≤ ℓ − 1.

Proof. A computation gives

(7.4.219) L−L+ = L2² + L3² + i[L3, L2] = −λ² − L1² − iL1,

on V, and, similarly,

(7.4.220) L+L− = −λ² − L1² + iL1

on V. Hence

(7.4.221) L−L+ = µ(µ + 1) − λ² on Wµ, L+L− = µ(µ − 1) − λ² on Wµ.

Also, since L2 and L3 are skew-adjoint,

(7.4.222) L+* = −L−,

so

(7.4.223) L+L− = −L−*L−, L−L+ = −L+*L+.

Hence, we have the identity of null spaces,

(7.4.224) N(L+) = N(L−L+), N(L−) = N(L+L−).

These observations establish (7.4.216)–(7.4.218). □

In the setting of Proposition 7.4.30, we see that, if v ∈ W_{µ0} is nonzero, then

(7.4.225) Span{v, L+v, ..., L+^{µ1−µ0} v}

is invariant under L1, L+, and L−, hence under the representation σ, so it must be all of V, if σ is irreducible. This implies

(7.4.226) dim Wµ = 1, for iµ ∈ Spec L1, µ0 ≤ µ ≤ µ1.

From (7.4.221) we see that

(7.4.227) µ1(µ1 + 1) = λ² = µ0(µ0 − 1),

hence

(7.4.228) µ1 − µ0 = ℓ ⟹ µ0 = −ℓ/2, µ1 = ℓ/2, dim V = ℓ + 1, and λ² = ℓ(ℓ + 2)/4.

The vectors in the basis (7.4.225) of V are mutually orthogonal, since the various eigenspaces of L1 are, but they do not form an orthonormal basis. To nail down the structure of the action of Skew(3) on V, we have the following.

Proposition 7.4.31. Assume σ is an irreducible skew-adjoint representation of Skew(3) on V, iµ ∈ Spec L1, and wµ ∈ Wµ. Then the norms of the vectors L±wµ ∈ W_{µ±1} are given by

(7.4.229) ∥L+wµ∥² = [λ² − µ(µ + 1)] ∥wµ∥², ∥L−wµ∥² = [λ² − µ(µ − 1)] ∥wµ∥².

Proof. Using (7.4.221) and (7.4.222), we have

(7.4.230) ∥L+wµ∥² = (L+*L+wµ, wµ) = −(L−L+wµ, wµ) = [λ² − µ(µ + 1)] ∥wµ∥².

The computation of ∥L−wµ∥² is similar. □

This leads to the following explicit description of the action of Skew(3) on V.

Proposition 7.4.32. Let V be a complex inner product space of dimension ℓ + 1, ℓ ∈ Z⁺ = {0, 1, 2, ...}. If σ is an irreducible skew-adjoint representation of Skew(3) on V, then V has an orthonormal basis

(7.4.231) vµ, µ = −ℓ/2 + j, j ∈ {0, ..., ℓ},

with respect to which

(7.4.232) L1vµ = iµvµ, L+vµ = √(λ² − µ(µ + 1)) v_{µ+1}, L−vµ = −√(λ² − µ(µ − 1)) v_{µ−1},

where

(7.4.233) λ² = ℓ(ℓ + 2)/4.

Consequently, for each ℓ ∈ Z⁺, there is, up to equivalence, just one such representation of Skew(3).
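This description can be made concrete in matrix form. The sketch below is our own illustration, not part of the text; the sign on L− is the one forced by L+* = −L− in (7.4.222), given that the L+ entries are chosen positive. It builds L1, L2, L3 on C^{ℓ+1} from the ladder formulas and checks the commutation identities (7.4.205), skew-adjointness, and the identity (7.4.210) with λ² = ℓ(ℓ+2)/4:

```python
import numpy as np

def skew3_rep(ell):
    """L1, L2, L3 on an (ell+1)-dimensional space, built from the ladder
    description in Proposition 7.4.32 (sign of L- fixed by L+* = -L-)."""
    mus = np.array([-ell / 2 + j for j in range(ell + 1)])
    lam2 = ell * (ell + 2) / 4
    Lp = np.zeros((ell + 1, ell + 1), dtype=complex)
    for j in range(ell):
        mu = mus[j]
        Lp[j + 1, j] = np.sqrt(lam2 - mu * (mu + 1))   # L+ maps v_mu to v_{mu+1}
    Lm = -Lp.conj().T                                   # L- = -(L+)*
    L1 = np.diag(1j * mus)
    L2 = (Lp + Lm) / 2                                  # from L± = L2 ∓ i L3
    L3 = (Lm - Lp) / 2j
    return L1, L2, L3

br = lambda X, Y: X @ Y - Y @ X
for ell in range(6):
    L1, L2, L3 = skew3_rep(ell)
    assert np.allclose(br(L1, L2), L3)
    assert np.allclose(br(L2, L3), L1)
    assert np.allclose(br(L3, L1), L2)
    for L in (L1, L2, L3):
        assert np.allclose(L.conj().T, -L)              # skew-adjoint
    M = L1 @ L1 + L2 @ L2 + L3 @ L3
    assert np.allclose(M, -(ell * (ell + 2) / 4) * np.eye(ell + 1))
```

For ℓ = 1 this reproduces (up to equivalence) the familiar spin-1/2 matrices.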

Among the representations of Skew(3) arising in Proposition 7.4.32 are the derived representations (which we denote σk) associated to the representations πk of SO(n) on Vk ⊂ C(S^{n−1}), in case n = 3. Recall that

(7.4.234) n = 3 =⇒ dim Vk = 2k + 1. Thus we get “half” of the representations described in Proposition 7.4.32, those for which ℓ is even. If we denote by τℓ the representation of Skew(3) described in Proposition 7.4.32, when dim V = ℓ + 1, then

(7.4.235) σk is equivalent to τ2k.

Actually, for ℓ odd, the representation τℓ is not derived from a representation of SO(3). We can see this as follows. From (7.4.202), we have

(7.4.236) e^{2πA1} = I.

(7.4.237) e^{2πL1} = e^{2πσ(A1)} = ρ(e^{2πA1}) = I,

so Spec L1 ⊂ iZ. But if ℓ is odd, we have from (7.4.231)–(7.4.232) that i/2 ∈ Spec L1. This has the following consequence.

Proposition 7.4.33. Each irreducible unitary representation of SO(3) is equivalent to one of the representations πk of SO(3) on a space Vk ⊂ C(S²) of spherical harmonics.

On the other hand, there is a group, closely related to SO(3), whose irreducible unitary representations have derived representations equivalent to each of those given in Proposition 7.4.32, namely SU(2). Generally, for n ≥ 2, we set

(7.4.238) SU(n) = {U ∈ M(n, C) : U*U = I and det U = 1}.

As with SO(n), this is a group of matrices which is also a smooth surface in the vector space M(n, C). Its tangent space at the identity element is

(7.4.239) T_I SU(n) = su(n) = {X ∈ Skew(Cⁿ) : Tr X = 0},

where

(7.4.240) Skew(Cⁿ) = {X ∈ M(n, C) : X* = −X}.

The space su(2) is a 3-dimensional real vector space, closed under the Lie bracket. It has the basis

(7.4.241)
    X1 = (1/2) ⎛ i  0 ⎞ ,  X2 = (1/2) ⎛  0  1 ⎞ ,  X3 = (1/2) ⎛ 0  i ⎞ ,
               ⎝ 0 −i ⎠               ⎝ −1  0 ⎠               ⎝ i  0 ⎠

with commutator identities

(7.4.242) [X1, X2] = X3, [X2, X3] = X1, [X3, X1] = X2,

parallel to those in (7.4.203). Consequently, the map Aj ↦ Xj extends uniquely to a linear isomorphism

(7.4.243) τ : Skew(3) −→ su(2),

which preserves the Lie bracket (we call τ a Lie algebra isomorphism). In fact, composing τ with the inclusion su(2) ⊂ M(2, C) gives a representation of Skew(3) equivalent to τ1, described above.

The group SU(2) has representations on complex vector spaces of dimension ℓ + 1, described as follows. We take

(7.4.244) Pℓ(C²) = space of polynomials in z = (z1, z2), homogeneous of degree ℓ,

and define the representation γℓ of SU(2) on Pℓ(C²) by

(7.4.245) γℓ(U)p(z) = p(U⁻¹z), U ∈ SU(2), z ∈ C², p ∈ Pℓ(C²).

One can use on Pℓ(C²) the inner product

(7.4.246) (f, g) = ∫_{S³} f(z) \overline{g(z)} dS(z),

where S³ ⊂ C² ≈ R⁴ is the unit sphere, with its induced Riemannian metric. Then γℓ is a unitary representation. It is irreducible for each ℓ, a fact we leave as a challenge to the reader. Given this, it follows from Proposition 7.4.32

that the derived representation of γℓ is a representation of su(2) ≈ Skew(3) that is equivalent to τℓ, for each ℓ ∈ Z⁺.
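As a brief numerical aside (our own, not part of the text), the commutator identities (7.4.242) for the basis (7.4.241) of su(2) can be checked directly:

```python
import numpy as np

X1 = 0.5 * np.array([[1j, 0], [0, -1j]])
X2 = 0.5 * np.array([[0, 1], [-1, 0]], dtype=complex)
X3 = 0.5 * np.array([[0, 1j], [1j, 0]])

br = lambda A, B: A @ B - B @ A
assert np.allclose(br(X1, X2), X3)
assert np.allclose(br(X2, X3), X1)
assert np.allclose(br(X3, X1), X2)
# each Xj lies in su(2): skew-adjoint and trace-free
for X in (X1, X2, X3):
    assert np.allclose(X.conj().T, -X) and abs(np.trace(X)) < 1e-12
```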

We return to the study of the representations πk of SO(3), and derived representations σk, on the eigenspaces of ∆S, Vk ⊂ C^∞(S²), and see how Proposition 7.4.32 yields specific formulas. First, note that, since dim Vk = 2k + 1, (7.4.233) gives

(7.4.247) λ = √(k(k + 1)) = λk,

with λk as in (7.4.30), with n = 3. Equivalently,

(7.4.248) L1² + L2² + L3² = ∆S.

Next, we have an orthonormal basis of Vk of the form

(7.4.249) Y_k^ℓ, −k ≤ ℓ ≤ k,

satisfying

(7.4.250) L1 Y_k^ℓ = iℓ Y_k^ℓ.

For ℓ = 0, this identifies Y_k^0 as a zonal function. Following (7.4.151), we take

(7.4.251) Y_k^0(ω) = ((2k + 1)/4π)^{1/2} Pk(ω · e),

where e = (0, 0, 1) and Pk(t) are the Legendre polynomials. For each ℓ ∈ {−k, ..., 0, ..., k}, (7.4.250) says

(7.4.252) Y_k^ℓ(e^{tA1}ω) = e^{iℓt} Y_k^ℓ(ω),

with A1 given by (7.4.202). Writing a general element f ∈ Vk as f = Σ_j cj Y_k^j, we deduce the following result.

Proposition 7.4.34. Given f ∈ Vk, ℓ ∈ {−k, ..., 0, ..., k}, we have

(7.4.253) (1/2π) ∫_0^{2π} e^{−iℓt} f(e^{tA1}ω) dt = (f, Y_k^ℓ)_{L²} Y_k^ℓ(ω).

Elements of Vk that one could plug into (7.4.253) include

(7.4.254) f_k^y(ω) = Pk(ω · y), y ∈ S²,

and

(7.4.255) g_k^c(ω) = (Σ_j cj ωj)^k, cj ∈ C, Σ_j cj² = 0,

since then (Σ cj xj)^k is a harmonic polynomial, homogeneous of degree k. A particular case of this is

(7.4.256) gk(ω) = (ω1 + iω2)^k,

Figure 7.4.1. Spherical coordinates: x(θ, ψ) = (sin θ cos ψ, sin θ sin ψ, cos θ)

which satisfies L1 gk = ik gk, implying

(7.4.257) Y_k^k(ω) = αk (ω1 + iω2)^k,

for some constant αk.

A more direct path to explicit formulas for Y_k^ℓ is found by applying (7.4.232), to get

(7.4.258) L+ Y_k^ℓ(ω) = √(k(k + 1) − ℓ(ℓ + 1)) Y_k^{ℓ+1}(ω).

We can start at ℓ = 0 with (7.4.251) and apply this iteratively to obtain formulas for Y_k^ℓ(ω), for 1 ≤ ℓ ≤ k. For −k ≤ ℓ ≤ −1, we could work similarly with iterates of L−, or we could just take

(7.4.259) Y_k^{−ℓ}(ω) = \overline{Y_k^ℓ(ω)},

noting that L1 \overline{f} = \overline{L1 f}.

To apply (7.4.258) explicitly, we use spherical coordinates (θ, ψ), defined by

(7.4.260) x(θ, ψ) = (sin θ cos ψ, sin θ sin ψ, cos θ), 0 ≤ θ ≤ π, 0 ≤ ψ ≤ 2π,

so θ = 0 defines the "north pole," (0, 0, 1) = e, and θ = π defines the south pole, −e. See Figure 7.4.1. In these coordinates, we have

(7.4.261) L1 = ∂/∂ψ, L± = ±i e^{±iψ} [∂/∂θ ± i cot θ ∂/∂ψ].

Also, in these coordinates, (7.4.251) takes the form

(7.4.262) Y_k^0(θ, ψ) = ((2k + 1)/4π)^{1/2} Pk(cos θ).

To start the iteration (7.4.258) at ℓ = 0, we have the general formula

(7.4.263) L+ g(θ) = i e^{iψ} g′(θ),

hence

(7.4.264) L+ G(cos θ) = −i e^{iψ} (sin θ) G′(cos θ).

More generally, we calculate

(7.4.265) L+ (e^{iℓψ} sin^ℓ θ Gℓ(cos θ)) = −i e^{i(ℓ+1)ψ} sin^{ℓ+1} θ Gℓ′(cos θ).

Hence, inductively, we obtain the formula

(7.4.266) Y_k^ℓ(θ, ψ) = αkℓ e^{iℓψ} sin^ℓ θ P_k^{(ℓ)}(cos θ), 0 ≤ ℓ ≤ k,

with constants αkℓ obtainable via (7.4.258). Recall that Pk(t) is a polynomial in t of degree k, so P_k^{(k)}(t) is constant, so (7.4.266) gives

(7.4.267) Y_k^k(θ, ψ) = αk e^{ikψ} sin^k θ,

which recovers (7.4.257), since, by (7.4.260),

(7.4.268) e^{iψ} sin θ = ω1 + iω2.

In light of this identity, we see that another way to write (7.4.266) is

(7.4.269) Y_k^ℓ(ω) = αkℓ (ω1 + iω2)^ℓ P_k^{(ℓ)}(ω3), 0 ≤ ℓ ≤ k.

We record the conclusion.

Proposition 7.4.35. Each eigenspace Vk ⊂ C^∞(S²) of the Laplace operator ∆S on S² has an orthonormal basis {Y_k^ℓ : −k ≤ ℓ ≤ k} of the form

(7.4.270) Y_k^0(ω) = ((2k + 1)/4π)^{1/2} Pk(ω3),

and, for 1 ≤ ℓ ≤ k (if k ≥ 1),

(7.4.271) Y_k^ℓ(ω) = αkℓ (ω1 + iω2)^ℓ P_k^{(ℓ)}(ω3), Y_k^{−ℓ}(ω) = αkℓ (ω1 − iω2)^ℓ P_k^{(ℓ)}(ω3),

with coefficients αkℓ ∈ (0, ∞) obtainable from (7.4.258).

Figure 7.4.2. Normalized Legendre polynomials, yk(t) = √(2k + 1) Pk(t)

See Exercises 9–10 below for a convenient recursive formula for the Legendre polynomials Pk(t). See Figure 7.4.2 for graphs of the normalized Legendre polynomials

(7.4.272) yk(t) = √(2k + 1) Pk(t),

for 0 ≤ k ≤ 5, and Figure 7.4.3 for k = 10, 20, 30. Note the spiking at t = 1 as k increases. Since the argument of Pk(t) often appears as t = cos θ, it is also of interest to see graphs of yk(cos θ). See Figure 7.4.4.

It is of interest to contrast the behavior of the zonal eigenfunctions Y_k^0(ω) with that of the other extreme case, Y_k^k(ω). We have

(7.4.273) |Y_k^k(ω)|² = |αk|² (ω1² + ω2²)^k = |αk|² (1 − t²)^k,

Figure 7.4.3. Graphs of yk(t) for k = 10, 20, 30

with αk as in (7.4.257) and t = ω3 ∈ [−1, 1]. The normalization ∥Y_k^k∥_{L²(S²)} = 1 gives

(7.4.274) 1 = ∫_{S²} |Y_k^k(ω)|² dS(ω) = 2π|αk|² ∫_{−1}^{1} (1 − t²)^k dt.

To evaluate |αk|², it is convenient to have the following.

Lemma 7.4.36. With

(7.4.275) γk = ∫_{−1}^{1} (1 − t²)^k dt,

we have γ0 = 2 and, for k ≥ 1,

(7.4.276) γk = (2k/(2k + 1)) γ_{k−1}.

Figure 7.4.4. Graphs of yk(cos θ) = √(4π) Y_k^0(ω), −π/2 ≤ θ ≤ π/2

Proof. Integration by parts gives

(7.4.277) γk = −∫_{−1}^{1} [d/dt (1 − t²)^k] t dt = 2k ∫_{−1}^{1} (1 − t²)^{k−1} t² dt.

Meanwhile,

(7.4.278) γk = ∫_{−1}^{1} (1 − t²)^{k−1} (1 − t²) dt.

Comparing (7.4.277)–(7.4.278), we have γk = γ_{k−1} − γk/(2k), hence (7.4.276). □

We see from (7.4.274) that

(7.4.279) |αk|² = 1/(2πγk).

The following result describes the behavior of γk as k → ∞.

Lemma 7.4.37. We have

(7.4.280) γk ∼ √(π/k), as k → ∞.

Proof. Since 1 − t² and e^{−t²} both have nondegenerate maxima at t = 0 and agree up to O(t⁴) there, one can show that

(7.4.281) ∫_{−1}^{1} (1 − t²)^k dt ∼ ∫_{−∞}^{∞} e^{−kt²} dt = √(π/k).

The industrious reader is invited to tackle the demonstration of (7.4.281) directly. This is a special case of results established in Appendix A.3 of [46]. □

Remark. Using results on the gamma function given in Chapter 4 of [46] (cf. (4.3.40)–(4.3.41)), one can show that

(7.4.282) γk = Γ(1/2) Γ(k + 1) / Γ(k + 3/2).

Results on the gamma function established in §3.2 of this text also allow one to deduce (7.4.276) from (7.4.282). In addition, Stirling's formula, on the behavior of Γ(x) as x → ∞, treated in Appendix A.3 of [46], allows one to deduce (7.4.280) from (7.4.282) (though the derivation via (7.4.281) is more direct and elementary).

Now, parallel to (7.4.272), we set

(7.4.283) wk(t) = √(2/γk) (1 − t²)^{k/2},

so

(7.4.284) |Y_k^k(ω)|² = (1/4π) wk(t)²,

with t = ω3, and (7.4.274) is equivalent to

(7.4.285) (1/2) ∫_{−1}^{1} wk(t)² dt ≡ 1.
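The recursion (7.4.276) and the asymptotics (7.4.280) are easy to test. The following sketch (our own exact-arithmetic check, cross-validated against the binomial expansion of (1 − t²)^k; not part of the text) illustrates both:

```python
from fractions import Fraction
from math import comb, pi, sqrt

def gamma_seq(kmax):
    """gamma_k = integral over [-1,1] of (1 - t^2)^k dt, generated by the
    recursion (7.4.276): gamma_0 = 2, gamma_k = (2k/(2k+1)) gamma_{k-1}."""
    g = [Fraction(2)]
    for k in range(1, kmax + 1):
        g.append(Fraction(2 * k, 2 * k + 1) * g[-1])
    return g

def gamma_direct(k):
    # expand (1 - t^2)^k binomially and integrate term by term
    return sum(Fraction(2 * comb(k, j) * (-1) ** j, 2 * j + 1)
               for j in range(k + 1))

g = gamma_seq(40)
assert all(g[k] == gamma_direct(k) for k in range(12))
# (7.4.280): gamma_k ~ sqrt(pi/k); at k = 40 the ratio is within about 1%
assert abs(float(g[40]) * sqrt(40 / pi) - 1) < 0.02
```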

See Figure 7.4.5 for graphs of the functions wk(cos θ), for various values of k between 1 and 30.

The graphs of the functions wk(t) also spike as k → ∞, though the nature of the spiking differs from that of yk(t). For example, the spikes for wk(t) have a smaller amplitude. The maximum value of yk(t) is √(2k + 1), while that for wk(t) is

√(2/γk) ∼ (4k/π)^{1/4}.

Figure 7.4.5. Graphs of wk(cos θ) = √(4π) |Y_k^k(ω)|, ω3 = cos θ, 0 ≤ θ ≤ π

Regarding the associated behavior of Y_k^0 and of Y_k^k, we see that Y_k^k spikes on a neighborhood of the equator in S², while Y_k^0 spikes on a smaller area, around the poles. Another difference one can perceive from Figures 7.4.4–7.4.5 is that the amplitude of Y_k^k tends to 0 as k → ∞, outside any neighborhood of the equator, so Y_k^k really concentrates around the equator. On the other hand, Y_k^0 remains sizable outside its spike zone.

Figure 7.4.6. y30(t) and the upper envelope y = √(4/π) (1 − t²)^{−1/4}

In fact, we can state the following result, though we do not prove it here. (See [33], §4.8.)

Proposition 7.4.38. In the limit as k → ∞, the sequence of functions yk(t) has the upper envelope

y = √(4/π) (1 − t²)^{−1/4}.

The reader can examine Figure 7.4.6 and formulate a definition of "upper envelope" in this setting.

Exercises

1. Given that the solution to (7.4.2) is specified by (7.4.5), for some constant Cn, show that plugging in f = 1 and evaluating at x = 0 implies Cn must be equal to 1/A_{n−1}.

2. Refining Proposition 7.4.6, show that if f ∈ Vj and g ∈ Vk, then

(7.4.286) fg ∈ ⊕_{ℓ=|k−j|}^{k+j} Vℓ.

Hint. Suppose 0 ≤ j < k and m < k − j, so j + m < k. Deduce that

h ∈ Vm =⇒ h ⊥ fg, from the implication h ∈ Vm, f ∈ Vj ⇒ hf ⊥ Vk.

3. Recall the map ψk : S^{n−1} → Vk, given in (7.4.65)–(7.4.67).
(a) Implicit in the construction of (7.4.65)–(7.4.66) is the following assertion. Let V be a finite-dimensional (complex) inner product space, and λ : V → C a linear map. Then there is a unique ψ ∈ V such that

λ(g) = (g, ψ)_V, ∀ g ∈ V.

Prove this. (Hint. Consider the action of λ on an orthonormal basis of V.)
(b) Show that, in the setting of (7.4.66),

ψk(ω)(y) = Ek(ω, y) = βnk C_k^{(n−1)/2}(ω · y),

for an appropriate constant βnk.
(c) Use (7.4.117)–(7.4.118) to give another proof of Proposition 7.4.10.
(d) Assume k ≥ 1. Show that for each p ∈ S^{n−1},

Dψk(p) : T_p S^{n−1} → Vk

is injective. In fact, show that there exists bkn ∈ (0, ∞) such that

∥Dψk(p)X∥_{Vk} = bkn ∥X∥_{Rⁿ}, ∀ X ∈ T_p S^{n−1}.

(e) Show that if x, y ∈ S^{n−1}, then ψ2(x) = ψ2(y) if and only if y = ±x. Hence ψ2 induces a smooth map

ψ̃2 : P^{n−1} → V2,

of real projective space P^{n−1}, defined by (3.2.92). Show that this map is an embedding. How does this embedding compare to the one arising from (3.2.110)–(3.2.111)?

4. Verify the identity (7.4.104) for (1 − z)^{−α}, for |z| < 1, with the binomial coefficient

(7.4.287) (j + α − 1 choose j) = α(α + 1) ··· (α + j − 1) / j!.

5. Assume x, y ∈ Rⁿ and |x| < |y|. Deduce from (7.4.103) that

(7.4.288) 1/|x − y| = (1/|y|) Σ_{k=0}^{∞} Pk(cos θ) (|x|/|y|)^k,

where θ is the angle between x and y, i.e., x · y = |x| |y| cos θ. This expansion is Legendre's multipole expansion. It initiated the study of the Legendre polynomials.
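The multipole expansion (7.4.288) converges quickly when |x|/|y| is small, and one can test it numerically. The sketch below is our own illustration (the Legendre values are generated by the standard three-term recursion, cf. Exercises 9–10):

```python
import numpy as np

def legendre_vals(t, kmax):
    """P_0(t), ..., P_kmax(t) via the recursion
    (k+1) P_{k+1} = (2k+1) t P_k - k P_{k-1}."""
    P = [1.0, t]
    for k in range(1, kmax):
        P.append(((2 * k + 1) * t * P[k] - k * P[k - 1]) / (k + 1))
    return P[:kmax + 1]

x = np.array([0.1, 0.2, 0.1])
y = np.array([0.0, 0.6, 0.8])                    # |x| < |y| = 1
r, R = np.linalg.norm(x), np.linalg.norm(y)
cos_theta = (x @ y) / (r * R)

P = legendre_vals(cos_theta, 40)
series = sum(P[k] * (r / R) ** k for k in range(41)) / R
assert abs(series - 1.0 / np.linalg.norm(x - y)) < 1e-10
```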

6. Looking back at Exercise 2, show that if g ∈ Vk, then

(7.4.289) ωℓg(ω) ∈ Vk+1 ⊕ Vk−1.

Hint. Use a parity argument to show that the component in Vk is 0.

7. Deduce from Exercise 6 that, on S^{n−1}, ωn Y_k^0(ω) is a linear combination of Y_{k+1}^0(ω) and Y_{k−1}^0(ω). Equivalently, when α = (n − 2)/2,

(7.4.290) t C_k^α(t) = A_{kα} C_{k+1}^α(t) + B_{kα} C_{k−1}^α(t).

In particular,

(7.4.291) tPk(t) = akPk+1(t) + bkPk−1(t).

8. Recall that the leading coefficient of the polynomial C_k^α(t) is given by (7.4.142). Write down the leading coefficient of t^{k+1} in t C_k^α(t) and in C_{k+1}^α(t), and find the coefficient A_{kα} in (7.4.290).

9. Since Pk(1) ≡ 1, note that, in (7.4.291), ak + bk = 1. Use this together with Exercise 8 to show that

(7.4.292) t Pk(t) = ((k + 1)/(2k + 1)) P_{k+1}(t) + (k/(2k + 1)) P_{k−1}(t).

10. Rewrite (7.4.292) as a formula for P_{k+1}(t). Use it to write down the polynomials Pk(t) for 0 ≤ k ≤ 5. If you can program a computer, explore further, and draw graphs.
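Taking up the invitation in Exercise 10, here is one possible exact-arithmetic sketch (ours, with hypothetical helper names) that generates the coefficient lists of P_0 through P_5 from the recursion obtained by solving (7.4.292) for P_{k+1}:

```python
from fractions import Fraction

def legendre_polys(kmax):
    """Coefficient lists (ascending powers of t) for P_0, ..., P_kmax, from
    P_{k+1} = ((2k+1) t P_k - k P_{k-1}) / (k+1), with P_0 = 1, P_1 = t."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for k in range(1, kmax):
        tPk = [Fraction(0)] + polys[k]                  # multiply P_k by t
        prev = polys[k - 1] + [Fraction(0)] * 2         # pad P_{k-1} to same length
        nxt = [(Fraction(2 * k + 1) * a - k * b) / (k + 1)
               for a, b in zip(tPk, prev)]
        polys.append(nxt)
    return polys[:kmax + 1]

P = legendre_polys(5)
# e.g. P_2(t) = (3t^2 - 1)/2 and P_3(t) = (5t^3 - 3t)/2
assert P[2] == [Fraction(-1, 2), Fraction(0), Fraction(3, 2)]
assert P[3] == [Fraction(0), Fraction(-3, 2), Fraction(0), Fraction(5, 2)]
# P_k(1) = 1 for all k
assert all(sum(c) == 1 for c in P)
```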

11. Use the computation of C_k^α(1) in (7.4.122), together with Exercise 8, to evaluate the coefficient B_{kα} in (7.4.290), when α = (n − 2)/2.

12. Fix c = (c1, ..., cn) ∈ Cⁿ, satisfying

c1² + ··· + cn² = 0.

We say c ∈ Kⁿ. Set

G_k^c(x) = (c1x1 + ··· + cnxn)^k.

Show that G_k^c is a harmonic polynomial on Rⁿ, in Hk, hence

g_k^c = G_k^c |_{S^{n−1}} ∈ Vk.

Show that

Span{g_k^c : c ∈ Kⁿ} = Vk.

Hint. Denote the span by Ṽk. Show that πk : Ṽk → Ṽk and use irreducibility of πk.

13. For y ∈ S^{n−1}, set

f_k^y(ω) = C_k^{(n−2)/2}(ω · y),

which belongs to Vk. Show that

Span{f_k^y : y ∈ S^{n−1}} = Vk.

14. Take n = 3, and, for y ∈ S², set

(7.4.293) φ_k^y(ω) = ((2k + 1)/4π)^{1/2} Pk(ω · y),

so each φ_k^y ∈ Vk has unit L²-norm. Use (7.4.136) to show that, for y, η ∈ S²,

(7.4.294) (φ_k^y, φ_k^η)_{L²} = Pk(y · η).

15. Take {yj : 1 ≤ j ≤ 2k + 1} ⊂ S², and recall that dim Vk = 2k + 1. Then

{φ_k^{yj} : 1 ≤ j ≤ 2k + 1} ⊂ Vk

is a set of elements of unit L²-norm. Use Exercise 14 to formulate a condition that this set be a basis of Vk.
Hint. Define T : C^{2k+1} → Vk by T ej = φ_k^{yj}, where {ej} is the standard basis of C^{2k+1}. Show that the matrix entries of A = T*T ∈ M(2k + 1, C) are given by

a_{ij} = (φ_k^{yi}, φ_k^{yj})_{L²}.

16. Recall the formula (7.4.103) for the Gegenbauer polynomials:

(7.4.295) (1 − 2tr + r²)^{−α} = Σ_{k=0}^{∞} C_k^α(t) r^k.

(a) Show that for t ∈ [−1, 1] and z ∈ C, the roots of z² − 2tz + 1 are

z = t ± i√(1 − t²), so |z| = 1.

(b) Use this to show that, for t ∈ [−1, 1], α ∈ R,

F_{α,t}(z) = (1 − 2tz + z²)^{−α}

is holomorphic for |z| < 1, and that

(7.4.296) (1 − 2tz + z²)^{−α} = Σ_{k=0}^{∞} C_k^α(t) z^k,

with absolute convergence for |z| < 1.
(c) Show that (7.4.296) converges absolutely and uniformly for

t, z ∈ C, |t| ≤ 1/2, |z| ≤ 1/2,

and more generally for

t, z ∈ C, |t| ≤ T, |z| ≤ min(1/2, 1/(4T)).

17. Take α = 1/2 in Exercise 16, and recall that C_k^{1/2}(t) = Pk(t) are the Legendre polynomials. Use Cauchy's integral formula to show that, for t ∈ [−1, 1],

Pk(t) = (1/2πi) ∫_γ (1 − 2tz + z²)^{−1/2} z^{−k−1} dz,

where γ is an appropriate closed path about the origin.

18. Show that the formula (7.4.106) specialized to α = 1/2 yields

(7.4.297) Pk(t) = Σ_{ℓ=0}^{[k/2]} (−1)^ℓ ((2k − 2ℓ)! / (2^k ℓ! (k − ℓ)! (k − 2ℓ)!)) t^{k−2ℓ}.

Applying the binomial formula to (t² − 1)^k, show that

(7.4.298) (d^k/dt^k)(t² − 1)^k = (d^k/dt^k) Σ_{ℓ=0}^{k} (−1)^ℓ (k!/(ℓ!(k − ℓ)!)) t^{2k−2ℓ} = Σ_{ℓ=0}^{[k/2]} (−1)^ℓ (k!/(ℓ!(k − ℓ)!)) ((2k − 2ℓ)!/(k − 2ℓ)!) t^{k−2ℓ}.

Deduce that, for k ∈ Z⁺,

(7.4.299) Pk(t) = (1/(2^k k!)) (d^k/dt^k)(t² − 1)^k.

This is the Rodrigues formula for the Legendre polynomials.
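The Rodrigues formula can be checked in exact arithmetic; the sketch below (our own, following the term-by-term differentiation in (7.4.298)) reproduces the familiar low-degree Legendre polynomials:

```python
from fractions import Fraction
from math import comb, factorial

def legendre_rodrigues(k):
    """Coefficients of P_k (ascending powers) from the Rodrigues formula
    (7.4.299): P_k = (1/(2^k k!)) d^k/dt^k (t^2 - 1)^k."""
    coeffs = [Fraction(0)] * (k + 1)
    for l in range(k // 2 + 1):
        # the term (-1)^l C(k,l) t^{2k-2l} has k-th derivative
        # (-1)^l C(k,l) (2k-2l)!/(k-2l)! t^{k-2l}
        coeffs[k - 2 * l] = Fraction(
            (-1) ** l * comb(k, l) * factorial(2 * k - 2 * l),
            factorial(k - 2 * l) * 2 ** k * factorial(k))
    return coeffs

# P_5(t) = (63 t^5 - 70 t^3 + 15 t)/8, and P_k(1) = 1 for all k
assert legendre_rodrigues(5) == [Fraction(0), Fraction(15, 8), Fraction(0),
                                 Fraction(-70, 8), Fraction(0), Fraction(63, 8)]
assert all(sum(legendre_rodrigues(k)) == 1 for k in range(8))
```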

19. Recall the map Πk : Vk → Zk discussed in (7.4.155)–(7.4.161). Here, we specialize to n = 3.
(a) Show that, for f ∈ Vk,

Πk f(ω) = (1/2π) ∫_{−π}^{π} f((cos φ)ω1 + (sin φ)ω2, −(sin φ)ω1 + (cos φ)ω2, ω3) dφ.

(b) Show that

f ∈ Vk, f(e) = 1 ⟹ Πk f(ω) = Pk(ω3).

(c) Apply part (a) to fk(ω) = (ω3 + iω1)^k. Show that

Πk fk(ω) = (1/2π) ∫_{−π}^{π} (ω3 + i[(cos φ)ω1 + (sin φ)ω2])^k dφ.

Deduce from part (b) that

(7.4.300) Pk(t) = (1/2π) ∫_{−π}^{π} (t + i(sin φ)√(1 − t²))^k dφ.

(d) Using the binomial expansion, show that

Pk(t) = Σ_{j=0}^{k} i^j (k choose j) aj t^{k−j} (1 − t²)^{j/2},

with

aj = (1/2π) ∫_{−π}^{π} sin^j φ dφ.

Note that aj = 0 for j odd. Hence

Pk(t) = Σ_{ℓ=0}^{[k/2]} (−1)^ℓ (k choose 2ℓ) a_{2ℓ} t^{k−2ℓ} (1 − t²)^ℓ.
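The integral representation (7.4.300) is also easy to test numerically. In the sketch below (ours, not part of the text), the average of the integrand over equally spaced angles is exact up to rounding, since the integrand is a trigonometric polynomial of degree at most k:

```python
import numpy as np

def legendre_rec(t, kmax):
    """P_0(t), ..., P_kmax(t) by the three-term recursion (cf. Exercise 10)."""
    P = [1.0, t]
    for k in range(1, kmax):
        P.append(((2 * k + 1) * t * P[k] - k * P[k - 1]) / (k + 1))
    return P

k, t = 6, 0.3
n = 4096
phi = np.linspace(-np.pi, np.pi, n, endpoint=False)
vals = (t + 1j * np.sin(phi) * np.sqrt(1 - t * t)) ** k
# (7.4.300): the mean over a full period recovers P_k(t)
assert abs(vals.mean().real - legendre_rec(t, k)[k]) < 1e-12
```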

7.5. Fourier series on compact matrix groups

Let G be a smooth, compact matrix group. As seen in §6.4, G has a bi-invariant volume form of total mass 1, and we can also equip G with a bi-invariant Riemannian metric. We form a family

(7.5.1) πα, α ∈ A,

of mutually inequivalent, irreducible unitary representations of G on inner product spaces Vα, of dimension dα. We arrange that each irreducible unitary representation of G is equivalent to one of these. Choose an orthonormal basis of Vα, so that πα(g) has the matrix representation (πα(g)jk), 1 ≤ j, k ≤ dα. The following result summarizes Propositions 6.4.23–6.4.24.

Proposition 7.5.1. The collection of functions

(7.5.2) F = {dα^{1/2} πα(g)jk : α ∈ A, 1 ≤ j, k ≤ dα}

is an orthonormal set of functions on G. Furthermore,

(7.5.3) Span F is dense in C(G).

Given this result, we can parallel arguments used to prove Proposition 7.1.5 (see also Proposition 7.4.9), obtaining the following result, known as the Peter-Weyl theorem. To set it up, identify A with N, and for N ∈ N and f ∈ C(G), or more generally f ∈ R(G), set

(7.5.4) S_N f(x) = Σ_{α≤N} Σ_{j,k≤dα} dα (f, παjk)_{L²} πα(x)jk.

Proposition 7.5.2. For each f ∈ R(G),

(7.5.5) S_N f −→ f in L²-norm, as N → ∞,

and

(7.5.6) ∥f∥²_{L²} = Σ_α Σ_{j,k≤dα} dα |(f, παjk)_{L²}|².

Another way to write S_N is as

(7.5.7) S_N f = Σ_{α≤N} Pα f,

where, for each α ∈ A,

(7.5.8) Pα f(x) = Σ_{j,k≤dα} dα (f, παjk)_{L²} πα(x)jk,

so Pα is a projection of R(G) onto the space

(7.5.9) Vα = Span{πα(x)jk : 1 ≤ j, k ≤ dα} ⊂ C(G).

This choice of basis of Vα produces a linear isomorphism

(7.5.10) Jα : Vα −→ M(dα, C).

Each space Vα is invariant under left and right translations, so we have a representation γα of G × G on Vα,

(7.5.11) γα(g, h)u(x) = u(g⁻¹xh).

Setting u(x) = πα(x)jk, we have

(7.5.12) γα(g, h)πα(x)jk = πα(g⁻¹xh)jk = Σ_ℓ πα(g⁻¹)jℓ πα(xh)ℓk = Σ_{ℓ,m} πα(g⁻¹)jℓ πα(x)ℓm πα(h)mk.

Hence,

(7.5.13) A = Jα(u) ⟹ Jα(γα(g, h)u) = πα(h) A πα(g⁻¹).

The following result is to some degree parallel to Proposition 7.4.23.

Proposition 7.5.3. For each α ∈ A, the representation γα of G × G on Vα is irreducible.

To start the proof, suppose W ⊂ Vα is a linear subspace, invariant under the action of G × G via γα. Suppose w ∈ W is nonzero, say w(g0) ≠ 0. Then w0 = γα(I, g0)w ∈ W and w0(I) ≠ 0. Now consider

(7.5.14) w1(x) = ∫_G γα(g, g)w0(x) dg = ∫_G w0(g⁻¹xg) dg.

We have w1(I) = w0(I) ≠ 0, so w1 ∈ W and w1 ≠ 0. Also γα(g, g)w1 = w1 for all g ∈ G, i.e., −1 (7.5.15) w1(g xg) = w1(x), ∀ x, g ∈ G.

Now we can produce an element of Vα that has this conjugation invariance, namely

(7.5.16) χα(x) = Tr πα(x) = Σ_{j=1}^{dα} πα(x)jj.

It is useful to have the following.

Lemma 7.5.4. If w1 ∈ Vα is conjugation invariant, i.e., if (7.5.15) holds, then w1 is a constant multiple of χα. 430 7. Fourier analysis

Before proving Lemma 7.5.4, let us see how it finishes off the proof of Proposition 7.5.3. Indeed, if W ⊂ Vα is a linear subspace that is invariant under γα, then the argument above shows that W must contain χα, given the lemma above. If W were a proper linear subspace, then its orthogonal ⊥ complement W ⊂ Vα would also be invariant, so the same argument gives ⊥ χα ∈ W . This is a contradiction, so Proposition 7.5.3 is proved, modulo a proof of Lemma 7.5.4.

To prove Lemma 7.5.4, we use the isomorphism Jα in (7.5.10)–(7.5.12). We have A1 = Jα(w1) ∈ M(dα, C), satisfying

(7.5.17) πα(g) A1 πα(g)⁻¹ = A1, ∀ g ∈ G.

The following result, called Schur's lemma, applies, and yields Lemma 7.5.4.

Lemma 7.5.5. If πα is an irreducible unitary representation of G on C^{dα} and (7.5.17) holds for some A1 ∈ M(dα, C), then A1 is a scalar multiple of the identity matrix.

Proof. If (7.5.17) holds, then πα(g) also commutes with A1* for all g, hence with A1 + A1* and with (A1 − A1*)/i, so it suffices to consider A1 self-adjoint. Then (7.5.17) implies each eigenspace of A1 is invariant under the action of G via πα. The irreducibility of πα then implies A1 must be a scalar multiple of I. □

At this point, we endow G with a bi-invariant Riemannian metric and consider the associated Laplace operator ∆ on G. We aim to show that, for each α ∈ A,

(7.5.18) ∆ : Vα −→ Vα, and ∆ acts like a scalar multiple of the identity on each of these spaces. The following result enables us to establish (7.5.18), and is interesting in its own right.

Proposition 7.5.6. Let {Ej} be an orthonormal basis of g = T_I G, and consider the left-invariant vector fields Xj = X_{Ej}. Then

(7.5.19) ∆ = Σ_j Xj².

Proof. Denote the right side of (7.5.19) by L. Both ∆ and L are second order differential operators on G; L is manifestly left invariant, and ∆ is both left and right invariant. If we use the exponential coordinate system, associated to the map Exp : g → G, we see that both ∆ and L have expressions of the form

(7.5.20) Σ_{j,k} a^{jk}(x) ∂j∂k + Σ_j bj(x) ∂j + c(x),

with a^{jk}(0) = δjk, in both cases. Since both ∆ and L are left invariant, this implies ∆ − L is a first-order differential operator, so

(7.5.21) ∆ − L = X + c,

where X is a left-invariant vector field and c is a constant. Since both ∆1 = 0 and L1 = 0, we have c = 0. Also, X is skew-adjoint:

(7.5.22) (Xf, g) = −(f, Xg), f, g ∈ C^∞(G),

while ∆ and L are self-adjoint, i.e.,

(7.5.23) (∆f, g) = (f, ∆g),

and similarly for L. This implies X = 0, hence ∆ = L. □

The identity (7.5.19) applies to (7.5.18) as follows. First,

(7.5.24) A ∈ g =⇒ XA : Vα → Vα, via

(7.5.25) XA πα(x)jk = (d/dt) πα(x e^{tA})jk |_{t=0} = Σ_ℓ πα(x)jℓ σα(A)ℓk,

where σα : g → L(Vα) is the derived representation associated to πα. From (7.5.24) we have

(7.5.26) L = Σ_j Xj² : Vα −→ Vα,

which, via (7.5.19), implies (7.5.18). Combining this with the fact that ∆ commutes with left and right translations gives

(7.5.27) ∆γα(g, h) = γα(g, h)∆ on Vα, ∀ g, h ∈ G. In view of the irreducibility result, Proposition 7.5.3, we have the following.

Proposition 7.5.7. For each α ∈ A, we have λα ∈ [0, ∞) such that
(7.5.28) ∆u = −λα² u, ∀ u ∈ Vα.

The function χα = Tr πα defined in (7.5.16) is called the character of the representation πα. One can see that these characters play a role in the proof of irreducibility in Proposition 7.5.3 similar to that played by zonal harmonics in the proof of the irreducibility in Proposition 7.4.23. Going further, the role of zonal functions on spheres Sⁿ⁻¹, introduced in (7.4.139), has an analogue in the space of central functions,
(7.5.29) Z(G) = {f ∈ C(G) : f(g⁻¹xg) = f(x), ∀ x, g ∈ G}.
The content of Lemma 7.5.4 is that

(7.5.30) Vα ∩ Z(G) = Span(χα),
which is parallel to Proposition 7.4.18. The following consequence of Propositions 7.5.1–7.5.2 and the observations above is parallel to Proposition 7.4.19.
Proposition 7.5.8. The family of functions

(7.5.31) X = {χα : α ∈ A}
is an orthonormal set of functions on G. Furthermore,
(7.5.32) Span X is dense in Z(G).
Consequently, for each f ∈ Z(G),
(7.5.33) f = ∑_α (f, χα)_{L²} χα,
with convergence in L²-norm.

The expansion (7.5.33) is analogous to the Funk-Hecke identity, (7.4.152)–(7.4.153). For another example of the parallel between characters on compact matrix groups G and zonal harmonics on spheres Sⁿ⁻¹, we have the following result, which can be compared with (7.4.113)–(7.4.117).

Proposition 7.5.9. The projection Pα of R(G) onto Vα, introduced in (7.5.8), is given by
(7.5.34) Pαf(x) = dα ∫_G χα(g⁻¹x) f(g) dg.

Proof. The characterization (7.5.8) says

(7.5.35) Pαf(x) = ∫_G Kα(x, g) f(g) dg,
with
(7.5.36) Kα(x, g) = dα ∑_{j,k≤dα} π̄α(g)jk πα(x)jk.

By comparison,
(7.5.37) χα(g⁻¹x) = ∑_k πα(g⁻¹x)kk = ∑_{j,k} πα(g⁻¹)kj πα(x)jk = ∑_{j,k} π̄α(g)jk πα(x)jk,
the last identity due to unitarity: πα(g⁻¹) = πα(g)*. This gives (7.5.34). □
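Though the text works with general compact G, the content of (7.5.34) is easy to test numerically in the abelian case. The sketch below (our own illustration, not part of the text) takes G = Z/N, where every irreducible representation is a one-dimensional character χk(n) = e^{2πikn/N} (so dα = 1), and checks that f ∗ χk recovers the k-th Fourier mode of f, as (7.5.39) predicts:

```python
import cmath

# Abelian illustration of (7.5.34)/(7.5.39): on G = Z/N the irreducible
# representations are the characters chi_k(n) = e^{2 pi i k n / N}, with
# d_alpha = 1, and P_k f = f * chi_k projects onto the k-th Fourier mode.
N = 8

def chi(k, n):
    return cmath.exp(2j * cmath.pi * k * n / N)

def conv(u, v):
    # u*v(x) = (1/N) sum_g u(g) v(x - g); Haar measure = counting measure / N
    return [sum(u[g] * v[(x - g) % N] for g in range(N)) / N for x in range(N)]

f = [float(n * n - 3 * n) for n in range(N)]   # an arbitrary test function
fhat = [sum(f[n] * chi(k, n).conjugate() for n in range(N)) / N
        for k in range(N)]                     # Fourier coefficients of f

for k in range(N):
    Pk = conv(f, [chi(k, n) for n in range(N)])         # P_k f = f * chi_k
    expected = [fhat[k] * chi(k, n) for n in range(N)]  # k-th mode of f
    assert all(abs(a - b) < 1e-9 for a, b in zip(Pk, expected))
```

The computation behind the assertion is exactly the proof of Proposition 7.5.9 specialized to one-dimensional representations.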

The operation in (7.5.34) is an example of a convolution. Generally, the convolution of two functions u and v on G is defined as
(7.5.38) u ∗ v(x) = ∫_G u(g) v(g⁻¹x) dg.

Thus (7.5.34) says

(7.5.39) Pαf = dα f ∗ χα.
The convolution product is associative, as one can check:
(7.5.40) u ∗ (v ∗ w) = (u ∗ v) ∗ w.
This product is not generally commutative (unless G is commutative), but we have the following.
Proposition 7.5.10. If G is a compact matrix group and f ∈ R(G),
(7.5.41) χ ∈ Z(G) =⇒ f ∗ χ = χ ∗ f.
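The identities (7.5.40) and (7.5.41) (see Exercise 1 below) can be experimented with on a finite group, where the Haar integral becomes a normalized sum over the group. A minimal sketch, using the symmetric group S3 and the sign character as the central function (all function names here are ours, not from the text):

```python
from itertools import permutations
import random

# Finite-group analogue of the convolution (7.5.38): the Haar integral is
# replaced by (1/|G|) * sum over G.
G = list(permutations(range(3)))          # the symmetric group S_3

def mul(a, b):                            # composition: (a*b)(i) = a[b[i]]
    return tuple(a[b[i]] for i in range(3))

def inv(a):                               # inverse permutation
    b = [0] * 3
    for i, ai in enumerate(a):
        b[ai] = i
    return tuple(b)

def conv(u, v):                           # u*v(x) = (1/|G|) sum_g u(g) v(g^{-1} x)
    return {x: sum(u[g] * v[mul(inv(g), x)] for g in G) / len(G) for x in G}

def sign(a):                              # the sign character, a central function
    s = 1
    for i in range(3):
        for j in range(i + 1, 3):
            if a[i] > a[j]:
                s = -s
    return s

random.seed(0)
u = {g: random.random() for g in G}
v = {g: random.random() for g in G}
w = {g: random.random() for g in G}

# Associativity (7.5.40): u*(v*w) = (u*v)*w
lhs, rhs = conv(u, conv(v, w)), conv(conv(u, v), w)
assert all(abs(lhs[x] - rhs[x]) < 1e-12 for x in G)

# (7.5.41): f * chi = chi * f for central chi, even though S_3 is nonabelian
chi = {g: sign(g) for g in G}
fc, cf = conv(u, chi), conv(chi, u)
assert all(abs(fc[x] - cf[x]) < 1e-12 for x in G)
```

The second assertion uses only that a central function satisfies χ(ab) = χ(ba), which is the substitution made in verifying (7.5.41).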

We close this section with a brief discussion of a case where the methods of this section and of §7.4 are equally applicable, namely the matrix group
(7.5.42) SU(2) = { ( z  w ; −w̄  z̄ ) : z, w ∈ C, |z|² + |w|² = 1 },
consisting of
(7.5.43) U ∈ M(2, C) such that U*U = I, det U = 1.
There is a natural bijective map of SU(2) onto
(7.5.44) {(z, w) ∈ C² : |z|² + |w|² = 1} = S³.
The group SU(2) has a unique bi-invariant Riemannian metric, up to a constant factor, and this correspondence maps SU(2) isometrically onto S³, with its standard metric induced from inclusion in C² ≈ R⁴ (up to a constant factor). The identity element I = ( 1  0 ; 0  1 ) of SU(2) corresponds to (1, 0) = (1 + 0i, 0 + 0i) ∈ C², i.e., to (1, 0, 0, 0) ∈ R⁴, which differs just by a rotation from our choice of “north pole” (0, 0, 0, 1) in §7.4. If we make this adjustment, we can verify that, under the diffeomorphism SU(2) → S³ described above, the space Z(SU(2)) of central functions on SU(2) is taken isomorphically to the space Z(S³) of zonal functions on S³. Thanks to the isometry of SU(2) → S³, the bi-invariant Laplace operator on SU(2) considered here corresponds to the Laplace operator ∆S on S³ considered in §7.4, so we have a natural isomorphism of each eigenspace of ∆ (a subspace of C∞(SU(2))) with an eigenspace of ∆S (a subspace of C∞(S³)), and vice-versa. Recall that, with G = SU(2), G × G acts on G as a group of isometries, via x ↦ gxh⁻¹. We get the identity map here precisely when
(7.5.45) (g, h) ∈ {(I, I), (−I, −I)},
and this sets up a natural homomorphism
(7.5.46) SU(2) × SU(2) −→ SO(4),
onto the group of rotations of R⁴ (yielding isometries of S³), taking the set (7.5.45) to I ∈ SO(4).
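The description (7.5.42)–(7.5.44) is easy to verify directly: for |z|² + |w|² = 1 the displayed matrix is unitary with determinant 1, so U ↦ (z, w) identifies SU(2) with S³ ⊂ C² ≈ R⁴. A quick numerical check (illustrative only, not from the text):

```python
import cmath, math, random

# Check (7.5.42)-(7.5.44): for |z|^2 + |w|^2 = 1 the matrix
# [[z, w], [-conj(w), conj(z)]] satisfies U*U = I and det U = 1.
random.seed(1)
for _ in range(100):
    a, b, c, e = (random.gauss(0, 1) for _ in range(4))
    r = math.sqrt(a * a + b * b + c * c + e * e)   # normalize to the sphere S^3
    z, w = complex(a, b) / r, complex(c, e) / r    # |z|^2 + |w|^2 = 1
    U = [[z, w], [-w.conjugate(), z.conjugate()]]

    # det U = z*conj(z) + w*conj(w) = 1
    det = U[0][0] * U[1][1] - U[0][1] * U[1][0]
    assert abs(det - 1) < 1e-12

    # unitarity: (U* U)_{ij} = delta_{ij}
    for i in range(2):
        for j in range(2):
            s = sum(U[k][i].conjugate() * U[k][j] for k in range(2))
            assert abs(s - (1 if i == j else 0)) < 1e-12
```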

Recall that ∆ acts as a scalar on each space Vα, on which SU(2)×SU(2) acts irreducibly. Meanwhile, each eigenspace Vk of ∆S is a space of functions on S³ on which SO(4) acts irreducibly. Comparing these facts allows us to show that the isometry SU(2) → S³ induces an isomorphism
(7.5.47) Vα −→ Vk, for appropriate k = k(α).
These ingredients then set up an equivalence between Fourier series on SU(2), as considered in this section, and analysis of spherical harmonic expansions on S³, as considered in §7.4.

Exercises

1. Verify the assertions (7.5.40) and (7.5.41), regarding the convolution product.

2. Let π be a finite-dimensional continuous representation of the compact matrix group G. For f ∈ R(G), set
π(f) = ∫_G f(x) π(x) dx.

Show that π(u ∗ v) = π(u)π(v).

3. Show that ∥u ∗ v∥sup ≤ ∥u∥L2 ∥v∥L2 .

4. Use (7.5.38) in concert with (7.5.39)–(7.5.40) to show that

Pα(u ∗ v) = (Pαu) ∗ (Pαv).

Hint. Start with the observation that χα = Pαχα = dα χα ∗ χα, hence
Pα(u ∗ v) = dα² (u ∗ v) ∗ (χα ∗ χα).

5. Show that
2∥Pα(u ∗ v)∥sup ≤ 2∥Pαu∥L² ∥Pαv∥L² ≤ ∥Pαu∥²L² + ∥Pαv∥²L².

6. Noting that
∥u∥²L² = ∑_α ∥Pαu∥²L²,
show that
2 ∑_α ∥Pα(u ∗ v)∥sup ≤ ∥u∥²L² + ∥v∥²L².
Take u ↦ tu and v ↦ t⁻¹v and optimize the resulting estimates over t to deduce that
∑_α ∥Pα(u ∗ v)∥sup ≤ ∥u∥L² ∥v∥L².

7.6. Isoperimetric inequality

Let Ω ⊂ R2 be a smoothly bounded open set, with boundary ∂Ω. The following result, known as the isoperimetric inequality, implies that if the length ℓ(∂Ω) is fixed, then the area Area(Ω) is maximal when Ω is a disk.

Proposition 7.6.1. In the setting described above,

(7.6.1) ℓ(∂Ω)² ≥ 4π Area(Ω).

In case of equality in (7.6.1), Ω must be a disk.

Proof. We can assume ∂Ω has only one connected component. Otherwise, one component of ∂Ω, say γ, touches the unbounded component O of R2 \Ω. Then R2 \ γ has two connected components, O and U, each having γ as its boundary. We have

(7.6.2) ℓ(∂Ω) ≥ ℓ(∂U), Area(U) ≥ Area(Ω), so the inequality

(7.6.3) ℓ(∂U)² ≥ 4π Area(U)
would imply (7.6.1). So we can assume ∂Ω = γ. To proceed, parametrize γ to have constant speed, say

(7.6.4) γ : T = R/(2πZ) −→ ∂Ω is a diffeomorphism, with |γ′(t)| ≡ ℓ(∂Ω)/2π.
We will obtain formulas for ℓ(∂Ω) and Area(Ω) in terms of Fourier coefficients, as follows. Identifying R² with C, take
(7.6.5) γ(t) = ∑_k ak e^{ikt},  ak = γ̂(k) = (1/2π) ∫_{−π}^{π} γ(s) e^{−iks} ds.
Since γ has constant speed, we have
(7.6.6) ℓ(∂Ω)² = 4π² · (1/2π) ∫_{−π}^{π} |γ′(t)|² dt = 4π² ∑_k k² |ak|²,
by the Plancherel identity for Fourier series, plus the identity
(7.6.7) γ′(t) = i ∑_k k ak e^{ikt}.

Meanwhile, by Green’s theorem,
(7.6.8) Area(Ω) = ∫_{∂Ω} x dy = −∫_{∂Ω} y dx = (1/2i) ∫_{∂Ω} z̄ dz = (1/2i) ∫_{−π}^{π} γ′(t) γ̄(t) dt = −πi (γ′, γ)_{L²(T, dt/2π)} = π ∑_k k |ak|².
Hence
(7.6.9) ℓ(∂Ω)² − 4π Area(Ω) = 4π² ∑_k (k² − k) |ak|².
Note that k² − k ≥ 0 on Z, so the right side of (7.6.9) is ≥ 0, and we have (7.6.1). □
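As a sanity check of (7.6.1), one can compute the isoperimetric deficit ℓ(∂Ω)² − 4π Area(Ω) numerically for a family of ellipses; it is strictly positive except for circles. A small illustration (the function name is ours; arclength is approximated by a Riemann sum):

```python
import math

# Numerical check of the isoperimetric inequality (7.6.1) on the ellipses
# x = a cos t, y = b sin t: the deficit L^2 - 4*pi*Area is >= 0, vanishing
# (up to quadrature error) exactly when a = b, i.e., for a circle.
def ellipse_deficit(a, b, n=20000):
    L = 0.0
    for i in range(n):                    # Riemann sum of |gamma'(t)| dt
        t = 2 * math.pi * i / n
        L += math.hypot(a * math.sin(t), b * math.cos(t)) * (2 * math.pi / n)
    area = math.pi * a * b
    return L * L - 4 * math.pi * area

assert ellipse_deficit(2.0, 1.0) > 1.0        # genuinely positive deficit
assert abs(ellipse_deficit(1.5, 1.5)) < 1e-4  # circle: deficit ~ 0
```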

To prove the last assertion in Proposition 7.6.1, we will establish the following quantitative refinement.
Proposition 7.6.2. Let Ω ⊂ R² be a smoothly bounded domain. Assume ∂Ω has just one connected component. There exists C1 ∈ (0, ∞), independent of Ω, with the following property. If δ ∈ (0, 2π] and
(7.6.10) ℓ(∂Ω)² − 4π Area(Ω) ≤ δ Area(Ω),
then there is a disk D such that
(7.6.11) Area(Ω△D) ≤ C1 δ^{1/2} Area(Ω).

Here Ω△D denotes the symmetric difference, (Ω \ D) ∪ (D \ Ω). Note that
(7.6.12) Area(Ω△D) = ∥χΩ − χD∥²L².
Proof. Taking γ as in the proof of Proposition 7.6.1, let D be the disk with boundary curve
(7.6.13) γ0(t) = a0 + a1 e^{it}.
By (7.6.9),
(7.6.14) ℓ(∂Ω)² − 4π Area(Ω) = 4π² ∑_k (k² − k) |γ̂(k) − γ̂0(k)|²,
since |γ̂(k) − γ̂0(k)|² = |ak|² except for k = 0, 1. Also
(7.6.15) k² − k ≥ k²/2 on Z \ {0, 1},

so
(7.6.16) ℓ(∂Ω)² − 4π Area(Ω) ≥ 2π² ∑_k k² |γ̂(k) − γ̂0(k)|² = 2π² ∥γ′ − γ0′∥²L²(T).

In turn, since γ − γ0 has mean value zero, we have
(7.6.17) sup_{t∈T} |γ(t) − γ0(t)| ≤ C2 ∥γ′ − γ0′∥L²(T).
Then the hypothesis (7.6.10) gives
(7.6.18) sup_{t∈T} |γ(t) − γ0(t)| ≤ C2 δ^{1/2} ∥χΩ∥L².

Also, if ρ = |a1| is the radius of D, and if we denote by Dζ a concentric disk of radius ζ, we have

(7.6.19) sup_t |γ(t) − γ0(t)| ≤ σ =⇒ D_{ρ−σ} ⊂ Ω ⊂ D_{ρ+σ} =⇒ Area(Ω△D) ≤ 4πρσ = 4π^{1/2} σ ∥χD∥L².
Hence
(7.6.20) Area(Ω△D) ≤ 4π^{1/2} C2 δ^{1/2} ∥χD∥L² ∥χΩ∥L².
Meanwhile,
(7.6.21) Area(D) = π|a1|² ≤ Area(Ω),
by (7.6.8), so ∥χD∥L² ≤ ∥χΩ∥L², and we have the desired conclusion (7.6.11). □
Appendix A

Complementary material

Here we treat a number of topics that illuminate material in the main body of the text. In Appendix A.1 we discuss metric spaces. A metric space is a set X, equipped with a distance function d(x, y), satisfying some natural conditions (cf. (A.1.1)), notably the triangle inequality, d(x, y) ≤ d(x, z) + d(z, y), for all x, y, z ∈ X. The notion of a metric space is a natural abstraction of that of Euclidean space, treated in §1.2. More general types of spaces that fall under the category of metric spaces include various spaces of functions, and the unification of these notions is quite useful. Appendix A.2, treating inner product spaces, also complements §1.2 and extends its scope. It treats both finite and infinite dimensional inner product spaces, the latter class making contact with Fourier series in §7.1. Material in Appendix A.3, on eigenvalues and eigenvectors, is useful for the study of various types of critical points in §2.1. Material on power series in Appendix A.4 both complements results in §2.1 and provides a path to a proof of the Weierstrass approximation theorem in Appendix A.5. This result in turn is extended to the Stone-Weierstrass approximation theorem in that appendix, and this is of use in the treatment of Fourier series, in §7.1. In §A.6 we present some results on harmonic functions, complementing results established in Chapters 5 and 7. One such result is a removable singularity theorem, used in the proof of Proposition 7.4.3.


A.1. Metric spaces, convergence, and compactness

A metric space is a set X, together with a distance function d : X × X → [0, ∞), having the properties that
(A.1.1) d(x, y) = 0 ⇐⇒ x = y,  d(x, y) = d(y, x),  d(x, y) ≤ d(x, z) + d(y, z).
The third of these properties is called the triangle inequality. An example of a metric space is the set of rational numbers Q, with d(x, y) = |x − y|. Another example is X = Rⁿ, with
d(x, y) = ((x1 − y1)² + ··· + (xn − yn)²)^{1/2}.

If (xν) is a sequence in X, indexed by ν = 1, 2, 3, . . . , i.e., by ν ∈ Z⁺, one says xν → y if d(xν, y) → 0, as ν → ∞. One says (xν) is a Cauchy sequence if d(xν, xµ) → 0 as µ, ν → ∞. One says X is a complete metric space if every Cauchy sequence converges to a limit in X. Some metric spaces are not complete; for example, Q is not complete. You can take a sequence (xν) of rational numbers such that xν → √2, which is not rational. Then (xν) is Cauchy in Q, but it has no limit in Q.
If a metric space X is not complete, one can construct its completion X̂ as follows. Let an element ξ of X̂ consist of an equivalence class of Cauchy sequences in X, where we say (xν) ∼ (yν) provided d(xν, yν) → 0. We write the equivalence class containing (xν) as [xν]. If ξ = [xν] and η = [yν], we can set d(ξ, η) = lim_{ν→∞} d(xν, yν), and verify that this is well defined, and makes X̂ a complete metric space. The following is the major result of Chapter 1 of [44].
Theorem A.1.1. If the completion process described above is applied to the set Q of rational numbers, one obtains the set R of real numbers, as a complete space, with metric d(x, y) = |x − y|.
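A concrete instance of the incompleteness of Q: the Babylonian (Newton) iterates xν+1 = (xν + 2/xν)/2 form a Cauchy sequence of rationals whose squares tend to 2, so in the completion R they represent √2, yet no term of the sequence squares exactly to 2. A short sketch using exact rational arithmetic (illustrative, not from the text):

```python
from fractions import Fraction

# A Cauchy sequence in Q with no rational limit: the Babylonian iterates
# x_{n+1} = (x_n + 2/x_n)/2 stay rational and converge quadratically, so in
# the completion R the sequence represents sqrt(2).
x = Fraction(2)
seq = [x]
for _ in range(6):
    x = (x + 2 / x) / 2
    seq.append(x)

# successive differences shrink rapidly (the Cauchy property) ...
assert abs(seq[-1] - seq[-2]) < Fraction(1, 10**10)
# ... and x_n^2 approaches 2 ...
assert abs(seq[-1] ** 2 - 2) < Fraction(1, 10**10)
# ... yet no term squares exactly to 2, since sqrt(2) is irrational
assert all(t * t != 2 for t in seq)
```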

There are a number of useful concepts related to the notion of closeness. We define some of them here. First, if p is a point in a metric space X and r ∈ (0, ∞), the set

(A.1.2) Br(p) = {x ∈ X : d(x, p) < r}
is called the open ball about p of radius r. Generally, a neighborhood of p ∈ X is a set containing such a ball, for some r > 0. A set U ⊂ X is called open if it contains a neighborhood of each of its points. The complement of an open set is said to be closed. The following result characterizes closed sets.

Proposition A.1.2. A subset K ⊂ X of a metric space X is closed if and only if

(A.1.3) xj ∈ K, xj → p ∈ X =⇒ p ∈ K.

Proof. Assume K is closed, xj ∈ K, xj → p. If p∈ / K, then p ∈ X \ K, which is open, so some Bε(p) ⊂ X \ K, and d(xj, p) ≥ ε for all j. This contradiction implies p ∈ K.

Conversely, assume (A.1.3) holds, and let q ∈ U = X \ K. If B1/n(q) is not contained in U for any n, then there exists xn ∈ K ∩ B1/n(q), hence xn → q, contradicting (A.1.3). This completes the proof.  The following is straightforward.

Proposition A.1.3. If Uα is a family of open sets in X, then ∪αUα is open. If Kα is a family of closed subsets of X, then ∩αKα is closed.
Given S ⊂ X, we denote by S̄ (the closure of S) the smallest closed subset of X containing S, i.e., the intersection of all the closed sets Kα ⊂ X containing S. The following result is straightforward.

Proposition A.1.4. Given S ⊂ X, p ∈ S̄ if and only if there exist xj ∈ S such that xj → p.
Given S ⊂ X, p ∈ X, we say p is an accumulation point of S if and only if, for each ε > 0, there exists q ∈ S ∩ Bε(p), q ≠ p. It follows that p is an accumulation point of S if and only if each Bε(p), ε > 0, contains infinitely many points of S. One straightforward observation is that all points of S̄ \ S are accumulation points of S.
The interior of a set S ⊂ X is the largest open set contained in S, i.e., the union of all the open sets contained in S. Note that the complement of the interior of S is equal to the closure of X \ S. We now turn to the notion of compactness. We say a metric space X is compact provided the following property holds:

(A.1.4) Each sequence (xk) in X has a convergent subsequence.
We will establish various properties of compact metric spaces, and provide various equivalent characterizations. For example, it is easily seen that (A.1.4) is equivalent to:
(A.1.5) Each infinite subset S ⊂ X has an accumulation point.
The following property is known as total boundedness:
Proposition A.1.5. If X is a compact metric space, then

(A.1.6) Given ε > 0, ∃ finite set {x1, . . . , xN} such that Bε(x1), . . . , Bε(xN) covers X.

Proof. Take ε > 0 and pick x1 ∈ X. If Bε(x1) = X, we are done. If not, pick x2 ∈ X \ Bε(x1). If Bε(x1) ∪ Bε(x2) = X, we are done. If not, pick x3 ∈ X\[Bε(x1)∪Bε(x2)]. Continue, taking xk+1 ∈ X\[Bε(x1)∪· · ·∪Bε(xk)], if Bε(x1) ∪ · · · ∪ Bε(xk) ≠ X. Note that, for 1 ≤ i, j ≤ k,

i ≠ j =⇒ d(xi, xj) ≥ ε.

If one never covers X this way, consider S = {xj : j ∈ N}. This is an infinite set with no accumulation point, so property (A.1.5) is contradicted. 
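The proof of Proposition A.1.5 is effectively an algorithm: greedily pick points at mutual distance ≥ ε until the ε-balls cover. On a finite metric space (here, a random sample of the unit square) the procedure terminates and produces both a cover and an ε-separated set, exactly as in the proof. A sketch (all names ours):

```python
import math, random

# The greedy construction in the proof of Proposition A.1.5: repeatedly pick
# a point not covered by the epsilon-balls chosen so far.  On this finite
# sample it terminates with a finite epsilon-net whose centers are pairwise
# >= epsilon apart.
random.seed(2)
X = [(random.random(), random.random()) for _ in range(500)]
eps = 0.2

def d(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

centers = []
while True:
    uncovered = [x for x in X if all(d(x, c) >= eps for c in centers)]
    if not uncovered:
        break
    centers.append(uncovered[0])

assert all(any(d(x, c) < eps for c in centers) for x in X)   # balls cover X
assert all(d(a, b) >= eps for i, a in enumerate(centers)     # separated net
           for b in centers[:i])
```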

Corollary A.1.6. If X is a compact metric space, it has a countable dense subset.

Proof. Given ε = 2^{−n}, let Sn be a finite set of points xj such that {Bε(xj)} covers X. Then C = ∪nSn is a countable dense subset of X. □

Here is another useful property of compact metric spaces, which will eventually be generalized even further, in (A.1.10) below.

Proposition A.1.7. Let X be a compact metric space. Assume K1 ⊃ K2 ⊃ K3 ⊃ · · · form a decreasing sequence of closed subsets of X. If each Kn ≠ ∅, then ∩nKn ≠ ∅.

Proof. Pick xn ∈ Kn. If (A.1.4) holds, (xn) has a convergent subsequence, x_{n_k} → y. Since {x_{n_k} : k ≥ ℓ} ⊂ K_{n_ℓ}, which is closed, we have y ∈ ∩_n Kn. □

Corollary A.1.8. Let X be a compact metric space. Assume U1 ⊂ U2 ⊂ U3 ⊂ · · · form an increasing sequence of open subsets of X. If ∪nUn = X, then UN = X for some N.

Proof. Consider Kn = X \ Un. 

The following is an important extension of Corollary A.1.8. Proposition A.1.9. If X is a compact metric space, then it has the prop- erty:

(A.1.7) Every open cover {Uα : α ∈ A} of X has a finite subcover.

Proof. Each Uα is a union of open balls, so it suffices to show that (A.1.4) implies the following:

(A.1.8) Every cover {Bα : α ∈ A} of X by open balls has a finite subcover.

Let C = {zj : j ∈ N} ⊂ X be a countable dense subset of X, as in Corollary A.1.6. Each Bα is a union of balls B_{rj}(zj), with zj ∈ C ∩ Bα, rj rational.

Thus it suffices to show that

(A.1.9) Every countable cover {Bj : j ∈ N} of X by open balls has a finite subcover.
For this, we set

Un = B1 ∪ · · · ∪ Bn and apply Corollary A.1.8. 

The following is a convenient alternative to property (A.1.7): ∩ If Kα ⊂ X are closed and Kα = ∅, (A.1.10) α then some finite intersection is empty.

Considering Uα = X \ Kα, we see that (A.1.7) ⇐⇒ (A.1.10). The following result completes Proposition A.1.9. Theorem A.1.10. For a metric space X, (A.1.4) ⇐⇒ (A.1.7).

Proof. By Proposition A.1.9, (A.1.4) ⇒ (A.1.7). To prove the converse, it will suffice to show that (A.1.10) ⇒ (A.1.5). So let S ⊂ X and assume S has no accumulation point. We claim: such S must be closed. Indeed, if z ∈ S̄ and z ∉ S, then z would have to be an accumulation point. Say S = {xα : α ∈ A}. Set Kα = S \ {xα}. Then each Kα has no accumulation point, hence Kα ⊂ X is closed. Also ∩αKα = ∅. Hence, if (A.1.10) holds, there exists a finite set F ⊂ A such that ∩_{α∈F} Kα = ∅. Hence S = ∪_{α∈F} {xα} is finite, so indeed (A.1.10) ⇒ (A.1.5). □

Remark. So far we have that for every metric space X,
(A.1.4) ⇐⇒ (A.1.5) ⇐⇒ (A.1.7) ⇐⇒ (A.1.10) =⇒ (A.1.6).
We claim that (A.1.6) implies the other conditions if X is complete. Of course, compactness implies completeness, but (A.1.6) may hold for incomplete X, e.g., X = (0, 1) ⊂ R.

Proposition A.1.11. If X is a complete metric space with property (A.1.6), then X is compact.

Proof. It suffices to show that (A.1.6) ⇒ (A.1.5) if X is a complete metric space. So let S ⊂ X be an infinite set. Cover X by balls

B_{1/2}(x1), . . . , B_{1/2}(xN). One of these balls contains infinitely many points of S, and so does its closure, say X1 = B̄_{1/2}(y1). Now cover X by finitely many balls of radius 1/4; their intersection with X1 provides a cover of X1. One such set contains infinitely many points of S, and so does its closure X2 = B̄_{1/4}(y2) ∩ X1. Continue in this fashion, obtaining
X1 ⊃ X2 ⊃ X3 ⊃ · · · ⊃ Xk ⊃ Xk+1 ⊃ · · · ,  Xj ⊂ B̄_{2^{−j}}(yj),
each containing infinitely many points of S. One sees that (yj) forms a Cauchy sequence. If X is complete, it has a limit, yj → z, and z is seen to be an accumulation point of S. □

If Xj, 1 ≤ j ≤ m, is a finite collection of metric spaces, with metrics dj, we can define a Cartesian product metric space
(A.1.11) X = ∏_{j=1}^{m} Xj,  d(x, y) = d1(x1, y1) + ··· + dm(xm, ym).
Another choice of metric is δ(x, y) = (d1(x1, y1)² + ··· + dm(xm, ym)²)^{1/2}. The metrics d and δ are equivalent, i.e., there exist constants C0, C1 ∈ (0, ∞) such that

(A.1.12) C0 δ(x, y) ≤ d(x, y) ≤ C1 δ(x, y), ∀ x, y ∈ X.
A key example is Rᵐ, the Cartesian product of m copies of the real line R. We describe some important classes of compact spaces.
Proposition A.1.12. If Xj are compact metric spaces, 1 ≤ j ≤ m, so is X = ∏_{j=1}^{m} Xj.

Proof. If (xν) is an infinite sequence of points in X, say xν = (x1ν, . . . , xmν), pick a convergent subsequence of (x1ν) in X1, and consider the corresponding subsequence of (xν), which we relabel (xν). Using this, pick a convergent subsequence of (x2ν) in X2. Continue. Having a subsequence such that xjν → yj in Xj for each j = 1, . . . , m, we then have a convergent subsequence in X. 
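The equivalence (A.1.12) of the two product metrics in (A.1.11) holds with the sharp constants C0 = 1 and C1 = √m (compare Exercise 9 at the end of this section). A quick numerical confirmation on random distance vectors (illustrative only):

```python
import math, random

# Check delta <= d <= sqrt(m) * delta, where d is the sum of the factor
# distances and delta their Euclidean combination, as in (A.1.11)-(A.1.12).
random.seed(3)
m = 5
for _ in range(1000):
    parts = [abs(random.gauss(0, 1)) for _ in range(m)]   # d_j(x_j, y_j) >= 0
    d = sum(parts)
    delta = math.sqrt(sum(p * p for p in parts))
    assert delta <= d + 1e-12                   # C0 = 1
    assert d <= math.sqrt(m) * delta + 1e-12    # C1 = sqrt(m), Cauchy-Schwarz
```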

The following result (already stated in Theorem 1.2.4) is useful for calculus on Rⁿ.
Proposition A.1.13. If K is a closed bounded subset of Rⁿ, then K is compact.

Proof. The discussion above reduces the problem to showing that any closed interval I = [a, b] in R is compact. This compactness is a corollary of Proposition A.1.11. For pedagogical purposes, we redo the argument here, since in this concrete case it can be streamlined. Suppose S is a subset of I with infinitely many elements. Divide I into 2 equal subintervals, I1 = [a, b1], I2 = [b1, b], b1 = (a + b)/2. Then either I1 or I2 must contain infinitely many elements of S. Say Ij does. Let x1 be any element of S lying in Ij. Now divide Ij in two equal pieces, Ij = Ij1 ∪ Ij2. One of these intervals (say Ijk) contains infinitely many points of S. Pick x2 ∈ Ijk to be one such point (different from x1). Then subdivide Ijk into two equal subintervals, and continue. We get an infinite sequence of distinct points xν ∈ S, with |xν − xν+k| ≤ 2^{−ν}(b − a), for k ≥ 1. Since R is complete, (xν) converges, say to y ∈ I. Any neighborhood of y contains infinitely many points of S, so we are done. □

If X and Y are metric spaces, a function f : X → Y is said to be continuous provided xν → x in X implies f(xν) → f(x) in Y. An equivalent condition, which the reader is invited to verify, is
(A.1.13) U open in Y =⇒ f⁻¹(U) open in X.
Proposition A.1.14. If X and Y are metric spaces, f : X → Y continuous, and K ⊂ X compact, then f(K) is a compact subset of Y.

Proof. If (yν) is an infinite sequence of points in f(K), pick xν ∈ K such that f(xν) = yν. If K is compact, we have a subsequence x_{νj} → p in X, and then y_{νj} → f(p) in Y. □

If f : X → R is continuous, we say f ∈ C(X). A useful corollary of Proposition A.1.14 is:
Proposition A.1.15. If X is a compact metric space and f ∈ C(X), then f assumes a maximum and a minimum value on X.

Proof. We know from Proposition A.1.14 that f(X) is a compact subset of R. Hence f(X) is bounded, say f(X) ⊂ I = [a, b]. Repeatedly subdividing I into equal halves, as in the proof of Proposition A.1.13, at each stage throwing out intervals that do not intersect f(X), and keeping only the leftmost and rightmost interval amongst those remaining, we obtain points α ∈ f(X) and β ∈ f(X) such that f(X) ⊂ [α, β]. Then α = f(x0) for some x0 ∈ X is the minimum and β = f(x1) for some x1 ∈ X is the maximum. 

At this point, the reader might take a look at the proof of the Mean Value Theorem, given in §1.1, which applies this result.

If S ⊂ R is a nonempty, bounded set, Proposition A.1.13 implies S̄ is compact. The function η : S̄ → R, η(x) = x is continuous, so by Proposition A.1.15 it assumes a maximum and a minimum on S̄. We set
(A.1.14) sup S = max_{x∈S̄} x,  inf S = min_{x∈S̄} x,
when S is bounded. More generally, if S ⊂ R is nonempty and bounded from above, say S ⊂ (−∞, B], we can pick A < B such that S ∩ [A, B] is nonempty, and set
(A.1.15) sup S = sup S ∩ [A, B].
Similarly, if S ⊂ R is nonempty and bounded from below, say S ⊂ [A, ∞), we can pick B > A such that S ∩ [A, B] is nonempty, and set
(A.1.16) inf S = inf S ∩ [A, B].
If X is a nonempty set and f : X → R is bounded from above, we set
(A.1.17) sup_{x∈X} f(x) = sup f(X),
and if f : X → R is bounded from below, we set
(A.1.18) inf_{x∈X} f(x) = inf f(X).
If f is not bounded from above, we set sup f = +∞, and if f is not bounded from below, we set inf f = −∞. Given a set X, f : X → R, and a sequence (xn) in X, we set
(A.1.19) lim sup_{n→∞} f(xn) = lim_{n→∞} ( sup_{k≥n} f(xk) ),
and
(A.1.20) lim inf_{n→∞} f(xn) = lim_{n→∞} ( inf_{k≥n} f(xk) ).
We return to the notion of continuity. A function f ∈ C(X) is said to be uniformly continuous provided that, for any ε > 0, there exists δ > 0 such that
(A.1.21) x, y ∈ X, d(x, y) ≤ δ =⇒ |f(x) − f(y)| ≤ ε.
An equivalent condition is that f have a modulus of continuity, i.e., a monotonic function ω : [0, 1) → [0, ∞) such that δ ↘ 0 ⇒ ω(δ) ↘ 0, and such that
(A.1.22) x, y ∈ X, d(x, y) ≤ δ ≤ 1 =⇒ |f(x) − f(y)| ≤ ω(δ).
Not all continuous functions are uniformly continuous. For example, if X = (0, 1) ⊂ R, then f(x) = sin 1/x is continuous, but not uniformly continuous, on X. The following result is useful, for example, in the development of the Riemann integral in §2.1.

Proposition A.1.16. If X is a compact metric space and f ∈ C(X), then f is uniformly continuous.

Proof. If not, there exist xν, yν ∈ X and ε > 0 such that d(xν, yν) ≤ 2^{−ν} but

(A.1.23) |f(xν) − f(yν)| ≥ ε.
Taking a convergent subsequence x_{νj} → p, we also have y_{νj} → p. Now continuity of f at p implies f(x_{νj}) → f(p) and f(y_{νj}) → f(p), contradicting (A.1.23). □
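The counterexample mentioned before Proposition A.1.16 can be made quantitative: for f(x) = sin 1/x on (0, 1), the points xn = 1/(2πn + π/2) and yn = 1/(2πn + 3π/2) get arbitrarily close while |f(xn) − f(yn)| = 2, so no single δ works in (A.1.21). A sketch (illustrative only):

```python
import math

# f(x) = sin(1/x) is continuous on (0,1) but not uniformly continuous:
# along x_n = 1/(2*pi*n + pi/2), y_n = 1/(2*pi*n + 3*pi/2) the gap x_n - y_n
# shrinks to 0 while f(x_n) = 1 and f(y_n) = -1 for every n.
def f(x):
    return math.sin(1 / x)

for n in [1, 10, 100, 1000]:
    x = 1 / (2 * math.pi * n + math.pi / 2)      # f(x) = sin(2 pi n + pi/2) = 1
    y = 1 / (2 * math.pi * n + 3 * math.pi / 2)  # f(y) = -1
    assert abs(f(x) - f(y)) > 1.999
    assert abs(x - y) < 1 / n                    # gap shrinks like 1/n^2
```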

If X and Y are metric spaces, the space C(X, Y) of continuous maps f : X → Y has a natural metric structure, under some additional hypotheses. We use
(A.1.24) D(f, g) = sup_{x∈X} d(f(x), g(x)).
This sup exists provided f(X) and g(X) are bounded subsets of Y, where to say B ⊂ Y is bounded is to say d : B × B → [0, ∞) has bounded image. In particular, this supremum exists if X is compact. The following result is useful in the proof of the fundamental local existence result for ODE, in §2.3.
Proposition A.1.17. If X is a compact metric space and Y is a complete metric space, then C(X, Y), with the metric (A.1.24), is complete.

Proof. That D(f, g) satisfies the conditions to define a metric on C(X, Y) is straightforward. We check completeness. Suppose (fν) is a Cauchy sequence in C(X, Y), so, as ν → ∞,
(A.1.25) sup_{k≥0} sup_{x∈X} d(fν+k(x), fν(x)) ≤ εν → 0.

Then in particular (fν(x)) is a Cauchy sequence in Y for each x ∈ X, so it converges, say to g(x) ∈ Y. It remains to show that g ∈ C(X, Y) and that fν → g in the metric (A.1.24). In fact, taking k → ∞ in the estimate above, we have
(A.1.26) sup_{x∈X} d(g(x), fν(x)) ≤ εν → 0,
i.e., fν → g uniformly. It remains only to show that g is continuous. For this, let xj → x in X and fix ε > 0. Pick N so that εN < ε. Since fN is continuous, there exists J such that j ≥ J ⇒ d(fN(xj), fN(x)) < ε. Hence
j ≥ J ⇒ d(g(xj), g(x)) ≤ d(g(xj), fN(xj)) + d(fN(xj), fN(x)) + d(fN(x), g(x)) < 3ε.
This completes the proof. □

In case Y = R, C(X, R) = C(X), introduced earlier in this appendix. The distance function (A.1.24) can be written

D(f, g) = ∥f − g∥sup,  ∥f∥sup = sup_{x∈X} |f(x)|.

∥f∥sup is a norm on C(X). Generally, a norm on a vector space V is an assignment f ↦ ∥f∥ ∈ [0, ∞), satisfying
∥f∥ = 0 ⇔ f = 0,  ∥af∥ = |a| ∥f∥,  ∥f + g∥ ≤ ∥f∥ + ∥g∥,
given f, g ∈ V and a a scalar (in R or C). A vector space equipped with a norm is called a normed vector space. It is then a metric space, with distance function D(f, g) = ∥f − g∥. If the space is complete, one calls V a Banach space. In particular, by Proposition A.1.17, C(X) is a Banach space, when X is a compact metric space. We next give a couple of slightly more sophisticated results on compactness. The following extension of Proposition A.1.12 is a special case of Tychonov’s Theorem.

Proposition A.1.18. If {Xj : j ∈ Z⁺} are compact metric spaces, so is X = ∏_{j=1}^{∞} Xj.
Here, we can make X a metric space by setting
(A.1.27) d(x, y) = ∑_{j=1}^{∞} 2^{−j} dj(pj(x), pj(y)) / (1 + dj(pj(x), pj(y))),
where pj : X → Xj is the projection onto the jth factor. It is easy to verify that, if xν ∈ X, then xν → y in X, as ν → ∞, if and only if, for each j, pj(xν) → pj(y) in Xj.

Proof. Following the argument in Proposition A.1.12, if (xν) is an infinite sequence of points in X, we obtain a nested family of subsequences
(A.1.28) (xν) ⊃ (x^1_ν) ⊃ (x^2_ν) ⊃ · · · ⊃ (x^j_ν) ⊃ · · ·
such that pℓ(x^j_ν) converges in Xℓ, for 1 ≤ ℓ ≤ j. The next step is a diagonal construction. We set
(A.1.29) ξν = x^ν_ν ∈ X.
Then, for each j, after throwing away a finite number N(j) of elements, one obtains from (ξν) a subsequence of the sequence (x^j_ν) in (A.1.28), so pℓ(ξν) converges in Xℓ for all ℓ. Hence (ξν) is a convergent subsequence of (xν). □

The next result is the Arzela-Ascoli Theorem.

Proposition A.1.19. Let X and Y be compact metric spaces, and fix a modulus of continuity ω(δ). Then
(A.1.30) Cω = {f ∈ C(X, Y) : d(f(x), f(x′)) ≤ ω(d(x, x′)) ∀ x, x′ ∈ X}
is a compact subset of C(X, Y).

Proof. Let (fν) be a sequence in Cω. Let Σ be a countable dense subset of X, as in Corollary A.1.6. For each x ∈ Σ, (fν(x)) is a sequence in Y, which hence has a convergent subsequence. Using a diagonal construction similar to that in the proof of Proposition A.1.18, we obtain a subsequence (φν) of (fν) with the property that φν(x) converges in Y, for each x ∈ Σ, say

(A.1.31) φν(x) → ψ(x), for all x ∈ Σ,
where ψ : Σ → Y. So far, we have not used (A.1.30). This hypothesis will now be used to show that φν converges uniformly on X. Pick ε > 0. Then pick δ > 0 such that ω(δ) < ε/3. Since X is compact, we can cover X by finitely many balls Bδ(xj), 1 ≤ j ≤ N, xj ∈ Σ. Pick M so large that φν(xj) is within ε/3 of its limit for all ν ≥ M (when 1 ≤ j ≤ N). Now, for any x ∈ X, picking ℓ ∈ {1, . . . , N} such that d(x, xℓ) ≤ δ, we have, for k ≥ 0, ν ≥ M,
(A.1.32) d(φν+k(x), φν(x)) ≤ d(φν+k(x), φν+k(xℓ)) + d(φν+k(xℓ), φν(xℓ)) + d(φν(xℓ), φν(x)) ≤ ε/3 + ε/3 + ε/3.

Thus (φν(x)) is Cauchy in Y for all x ∈ X, hence convergent. Call the limit ψ(x), so we now have (A.1.31) for all x ∈ X. Letting k → ∞ in (A.1.32) we have uniform convergence of φν to ψ. Finally, passing to the limit ν → ∞ in
(A.1.33) d(φν(x), φν(x′)) ≤ ω(d(x, x′))
gives ψ ∈ Cω. □

We want to re-state Proposition A.1.19, bringing in the notion of equicontinuity. Given metric spaces X and Y, and a set of maps F ⊂ C(X, Y), we say F is equicontinuous at a point x0 ∈ X provided
(A.1.34) ∀ ε > 0, ∃ δ > 0 such that ∀ x ∈ X, f ∈ F, dX(x, x0) < δ =⇒ dY(f(x), f(x0)) < ε.
We say F is equicontinuous on X if it is equicontinuous at each point of X. We say F is uniformly equicontinuous on X provided
(A.1.35) ∀ ε > 0, ∃ δ > 0 such that ∀ x, x′ ∈ X, f ∈ F, dX(x, x′) < δ =⇒ dY(f(x), f(x′)) < ε.

Note that (A.1.35) is equivalent to the existence of a modulus of continuity ω such that F ⊂ Cω, given by (A.1.30). It is useful to record the following result.
Proposition A.1.20. Let X and Y be metric spaces, F ⊂ C(X, Y). Assume X is compact. Then
(A.1.36) F equicontinuous =⇒ F is uniformly equicontinuous.

Proof. The argument is a variant of the proof of Proposition A.1.16. In more detail, suppose there exist xν, x′ν ∈ X, ε > 0, and fν ∈ F such that d(xν, x′ν) ≤ 2^{−ν} but
(A.1.37) d(fν(xν), fν(x′ν)) ≥ ε.
Taking a convergent subsequence x_{νj} → p ∈ X, we also have x′_{νj} → p. Now equicontinuity of F at p implies that there exists N < ∞ such that
(A.1.38) d(g(x_{νj}), g(p)) < ε/2, ∀ j ≥ N, g ∈ F,
contradicting (A.1.37). □

Putting together Propositions A.1.19 and A.1.20 then gives the following.
Proposition A.1.21. Let X and Y be compact metric spaces. If F ⊂ C(X, Y) is equicontinuous on X, then it has compact closure in C(X, Y).

We next define the notion of a connected space. A metric space X is said to be connected provided that it cannot be written as the union of two disjoint nonempty open subsets. The following is a basic class of examples.

Proposition A.1.22. Each interval I in R is connected.

Proof. Suppose A ⊂ I is nonempty, with nonempty complement B ⊂ I, and both sets are open. Take a ∈ A, b ∈ B; we can assume a < b. Let
ξ = sup{x ∈ [a, b] : x ∈ A}.
This exists, as a consequence of the basic fact that R is complete. Now we obtain a contradiction, as follows. Since A is also closed in I (being the complement of the open set B), ξ ∈ A, and hence ξ < b. But then, since A is open, there must be a neighborhood (ξ − ε, ξ + ε) contained in A; this is not possible, since it would give points of A in [a, b] larger than ξ. □

We say X is path-connected if, given any p, q ∈ X, there is a continuous map γ : [0, 1] → X such that γ(0) = p and γ(1) = q. It is an easy consequence of Proposition A.1.22 that X is connected whenever it is path-connected.

The next result, known as the Intermediate Value Theorem, is frequently useful. Proposition A.1.23. Let X be a connected metric space and f : X → R continuous. Assume p, q ∈ X, and f(p) = a < f(q) = b. Then, given any c ∈ (a, b), there exists z ∈ X such that f(z) = c.

Proof. Under the hypotheses, A = {x ∈ X : f(x) < c} is open and contains p, while B = {x ∈ X : f(x) > c} is open and contains q. If X is connected, then A ∪ B cannot be all of X; so any point in its complement has the desired property. 

Exercises

1. If X is a metric space, with distance function d, show that |d(x, y) − d(x′, y′)| ≤ d(x, x′) + d(y, y′), and hence d : X × X −→ [0, ∞) is continuous.

2. Let φ : [0, ∞) → [0, ∞) be a C2 function. Assume φ(0) = 0, φ′ > 0, φ′′ < 0. Prove that if d(x, y) is symmetric and satisfies the triangle inequality, so does δ(x, y) = φ(d(x, y)). Hint. Show that such φ satisfies φ(s + t) ≤ φ(s) + φ(t), for s, t ∈ R+.

3. Show that the function d(x, y) defined by (A.1.27) satisfies (A.1.1). Hint. Consider φ(r) = r/(1 + r).
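The hints in Exercises 2 and 3 are easy to sanity-check numerically. A minimal sketch (the function name is ours), with φ(r) = r/(1 + r):

```python
import random

def phi(r):
    # phi(r) = r/(1 + r): phi(0) = 0, increasing, concave
    return r / (1 + r)

random.seed(0)
# subadditivity phi(s + t) <= phi(s) + phi(t) for s, t >= 0, the key step
for _ in range(1000):
    s, t = random.uniform(0, 10), random.uniform(0, 10)
    assert phi(s + t) <= phi(s) + phi(t) + 1e-12
# hence d'(x, y) = phi(|x - y|) satisfies the triangle inequality on R
for _ in range(1000):
    x, y, z = (random.uniform(-5, 5) for _ in range(3))
    assert phi(abs(x - z)) <= phi(abs(x - y)) + phi(abs(y - z)) + 1e-12
print("checks passed")
```

Of course the inequality holds exactly; the tolerance only guards against floating-point rounding.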

4. Let X be a compact metric space. Assume fj, f ∈ C(X) and

fj(x) ↗ f(x), ∀ x ∈ X.

Prove that fj → f uniformly on X. (This result is called Dini’s theorem.) Hint. For ε > 0, let Kj(ε) = {x ∈ X : f(x) − fj(x) ≥ ε}. Note that Kj(ε) ⊃ Kj+1(ε) ⊃ · · · .
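A numerical illustration of Dini's theorem (this particular sequence is ours, not the text's): on X = [0, 1], f_j(x) = x^{1+1/j} increases pointwise to the continuous limit f(x) = x, and the maximal gap indeed shrinks:

```python
def gap(j, pts=1000):
    # sup over a grid of f(x) - f_j(x), with f_j(x) = x^(1 + 1/j) and f(x) = x
    return max(x - x ** (1 + 1 / j)
               for x in (k / pts for k in range(pts + 1)))

gaps = [gap(j) for j in (1, 2, 4, 8, 16, 32)]
print(gaps)
# monotone convergence pointwise forces the sup to decrease as well
assert all(a >= b for a, b in zip(gaps, gaps[1:]))
```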

Given a metric space X and f : X → [−∞, ∞], we say f is lower semicontinuous at x ∈ X provided f^{−1}((c, ∞]) ⊂ X is open, ∀ c ∈ R.

We say f is upper semicontinuous provided f^{−1}([−∞, c)) is open, ∀ c ∈ R.

5. Show that f is lower semicontinuous ⇐⇒ f^{−1}([−∞, c]) is closed, ∀ c ∈ R, and f is upper semicontinuous ⇐⇒ f^{−1}([c, ∞]) is closed, ∀ c ∈ R.

6. Show that

f is lower semicontinuous ⇐⇒ xn → x implies lim inf f(xn) ≥ f(x). Show that

f is upper semicontinuous ⇐⇒ xn → x implies lim sup f(xn) ≤ f(x).

7. Given S ⊂ X, show that

χS is lower semicontinuous ⇐⇒ S is open.

χS is upper semicontinuous ⇐⇒ S is closed.

8. If X is a compact metric space, show that f : X → R is lower semicontinuous =⇒ min f is achieved.

9. In the setting of (A.1.11), let

δ(x, y) = \big\{ d_1(x_1, y_1)^2 + ··· + d_m(x_m, y_m)^2 \big\}^{1/2}.

Show that

δ(x, y) ≤ d(x, y) ≤ \sqrt{m}\, δ(x, y).

10. Let X and Y be compact metric spaces. Show that if F ⊂ C(X,Y) is compact, then F is equicontinuous. (This is a converse to Proposition A.1.20.)

11. Recall that a Banach space is a complete normed linear space. Consider C^1(I), where I = [0, 1], with norm

∥f∥_{C^1} = \sup_I |f| + \sup_I |f′|.

Show that C^1(I) is a Banach space.

12. Let F = {f ∈ C^1(I) : ∥f∥_{C^1} ≤ 1}. Show that F has compact closure in C(I). Find a function in the closure of F that is not in C^1(I).

A.2. Inner product spaces

In §6.1 we have looked at norms and inner products on finite-dimensional vector spaces other than Rn, and in §§7.1–7.2 we have looked at norms and inner products on spaces of functions, such as C(Tn) and R(Rn), which are infinite-dimensional vector spaces. We discuss general results on such objects here. Generally, as discussed in §1.3, a complex vector space V is a set on which there are operations of vector addition: (A.2.1) f, g ∈ V =⇒ f + g ∈ V, and multiplication by an element of C (called scalar multiplication): (A.2.2) a ∈ C, f ∈ V =⇒ af ∈ V, satisfying the following properties. For vector addition, we have (A.2.3) f + g = g + f, (f + g) + h = f + (g + h), f + 0 = f, f + (−f) = 0. For multiplication by scalars, we have (A.2.4) a(bf) = (ab)f, 1 · f = f. Furthermore, we have two distributive laws: (A.2.5) a(f + g) = af + ag, (a + b)f = af + bf. These properties are readily verified for the function spaces mentioned above. An inner product on a complex vector space V assigns to elements f, g ∈ V the quantity (f, g) ∈ C, in a fashion that obeys the following three rules:

(A.2.6) (a_1f_1 + a_2f_2, g) = a_1(f_1, g) + a_2(f_2, g), (f, g) = \overline{(g, f)}, (f, f) > 0 unless f = 0.

A vector space equipped with an inner product is called an inner product space. For example,

(A.2.7) (f, g) = \frac{1}{2π} \int_{S^1} f(θ)\overline{g(θ)}\, dθ

defines an inner product on C(S^1), and also on R(S^1), where we identify two functions that differ only on a set of upper content zero. Similarly,

(A.2.8) (f, g) = \int_{−∞}^{∞} f(x)\overline{g(x)}\, dx

defines an inner product on R(R) (where, again, we identify two functions that differ only on a set of upper content zero). As another example, we define ℓ^2 to consist of sequences (a_k)_{k∈Z} such that

(A.2.9) \sum_{k=−∞}^{∞} |a_k|^2 < ∞.

An inner product on ℓ^2 is given by

(A.2.10) \big( (a_k), (b_k) \big) = \sum_{k=−∞}^{∞} a_k \overline{b_k}.

Given an inner product on V, one says the object ∥f∥ defined by

(A.2.11) ∥f∥ = \sqrt{(f, f)}

is the norm on V associated with the inner product. Generally, a norm on V is a function f ↦ ∥f∥ satisfying

(A.2.12) ∥af∥ = |a| · ∥f∥, a ∈ C, f ∈ V,

(A.2.13) ∥f∥ > 0 unless f = 0,

(A.2.14) ∥f + g∥ ≤ ∥f∥ + ∥g∥.

The property (A.2.14) is called the triangle inequality. A vector space equipped with a norm is called a normed vector space. We can define a distance function on such a space by

(A.2.15) d(f, g) = ∥f − g∥.

Properties (A.2.12)–(A.2.14) imply that d : V × V → [0, ∞) satisfies the properties in (A.1.1), making V a metric space. If ∥f∥ is given by (A.2.11), from an inner product satisfying (A.2.6), it is clear that (A.2.12)–(A.2.13) hold, but (A.2.14) requires a demonstration. Note that

(A.2.16) ∥f + g∥^2 = (f + g, f + g) = ∥f∥^2 + (f, g) + (g, f) + ∥g∥^2 = ∥f∥^2 + 2 \operatorname{Re}(f, g) + ∥g∥^2,

while

(A.2.17) (∥f∥ + ∥g∥)^2 = ∥f∥^2 + 2∥f∥ · ∥g∥ + ∥g∥^2.

Thus to establish (A.2.14) it suffices to prove the following, known as Cauchy's inequality.

Proposition A.2.1. For any inner product on a vector space V , with ∥f∥ defined by (A.2.11), (A.2.18) |(f, g)| ≤ ∥f∥ · ∥g∥, ∀ f, g ∈ V.

Proof. We start with

(A.2.19) 0 ≤ ∥f − g∥^2 = ∥f∥^2 − 2 \operatorname{Re}(f, g) + ∥g∥^2,

which implies

(A.2.20) 2 \operatorname{Re}(f, g) ≤ ∥f∥^2 + ∥g∥^2, ∀ f, g ∈ V.

Replacing f by af for arbitrary a ∈ C of absolute value 1 yields 2 \operatorname{Re}\, a(f, g) ≤ ∥f∥^2 + ∥g∥^2, for all such a, hence

2|(f, g)| ≤ ∥f∥^2 + ∥g∥^2, ∀ f, g ∈ V.

Replacing f by tf and g by t^{−1}g for arbitrary t ∈ (0, ∞), we have

(A.2.21) 2|(f, g)| ≤ t^2∥f∥^2 + t^{−2}∥g∥^2, ∀ f, g ∈ V, t ∈ (0, ∞).

If we take t^2 = ∥g∥/∥f∥, we obtain the desired inequality (A.2.18). This assumes f and g are both nonzero, but (A.2.18) is trivial if f or g is 0.
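A quick numerical sanity check of Cauchy's inequality (A.2.18) and the resulting triangle inequality (A.2.14), for the standard inner product on C^4 (the helper names are ours):

```python
import math, random

def ip(f, g):
    # inner product on C^n, conjugate-linear in the second slot, as in (A.2.6)
    return sum(a * b.conjugate() for a, b in zip(f, g))

def norm(f):
    return math.sqrt(ip(f, f).real)

random.seed(1)
rc = lambda: complex(random.uniform(-1, 1), random.uniform(-1, 1))
for _ in range(500):
    f = [rc() for _ in range(4)]
    g = [rc() for _ in range(4)]
    assert abs(ip(f, g)) <= norm(f) * norm(g) + 1e-12   # Cauchy (A.2.18)
    s = [a + b for a, b in zip(f, g)]
    assert norm(s) <= norm(f) + norm(g) + 1e-12         # triangle (A.2.14)
print("checks passed")
```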

An inner product space V is called a Hilbert space if it is a complete metric space, i.e., if every Cauchy sequence (f_ν) in V has a limit in V. The space ℓ^2 has this completeness property, but C(S^1), with inner product (A.2.7), does not, nor does R(S^1). Appendix A.1 describes a process of constructing the completion of a metric space. When applied to an incomplete inner product space, it produces a Hilbert space. When this process is applied to C(S^1), the completion is the space L^2(S^1). An alternative construction of L^2(S^1) uses the Lebesgue integral. For this approach, one can consult Chapter 4 of [42]. For the rest of this appendix, we confine attention to finite-dimensional inner product spaces.

If V is a finite-dimensional inner product space, a basis {u1, . . . , un} of V is called an orthonormal basis of V provided

(A.2.22) (uj, uk) = δjk, 1 ≤ j, k ≤ n, i.e.,

(A.2.23) ∥u_j∥ = 1, j ≠ k ⇒ (u_j, u_k) = 0.

In such a case we see that

(A.2.24) v = a_1u_1 + ··· + a_nu_n, w = b_1u_1 + ··· + b_nu_n =⇒ (v, w) = a_1\overline{b_1} + ··· + a_n\overline{b_n}.

It is often useful to construct orthonormal bases. The construction we now describe is called the Gram-Schmidt construction.

Proposition A.2.2. Let {v1, . . . , vn} be a basis of V , an inner product space. Then there is an orthonormal basis {u1, . . . , un} of V such that

(A.2.25) Span{uj : j ≤ ℓ} = Span{vj : j ≤ ℓ}, 1 ≤ ℓ ≤ n.

Proof. To begin, take

(A.2.26) u_1 = \frac{1}{∥v_1∥} v_1.

Now define the linear transformation P_1 : V → V by P_1v = (v, u_1)u_1 and set

ṽ_2 = v_2 − P_1v_2 = v_2 − (v_2, u_1)u_1.

We see that (ṽ_2, u_1) = (v_2, u_1) − (v_2, u_1) = 0. Also ṽ_2 ≠ 0, since u_1 and v_2 are linearly independent. Hence we set

(A.2.27) u_2 = \frac{1}{∥ṽ_2∥} ṽ_2.

Inductively, suppose we have an orthonormal set {u1, . . . , um} with m < n and (A.2.25) holding for 1 ≤ ℓ ≤ m. Then define Pm : V → V by

(A.2.28) P_m v = (v, u_1)u_1 + ··· + (v, u_m)u_m,

and set

(A.2.29) ṽ_{m+1} = v_{m+1} − P_m v_{m+1} = v_{m+1} − (v_{m+1}, u_1)u_1 − ··· − (v_{m+1}, u_m)u_m.

We see that

(A.2.30) j ≤ m ⇒ (ṽ_{m+1}, u_j) = (v_{m+1}, u_j) − (v_{m+1}, u_j) = 0.

Also, since v_{m+1} ∉ Span{v_1, . . . , v_m} = Span{u_1, . . . , u_m}, it follows that ṽ_{m+1} ≠ 0. Hence we set

(A.2.31) u_{m+1} = \frac{1}{∥ṽ_{m+1}∥} ṽ_{m+1}.

This completes the construction.
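The construction is easy to machine-check. A minimal sketch (helper names ours), running Gram-Schmidt with exact rational arithmetic on the basis {1, x, x^2} of P_2, with inner product (p, q) = ∫_{−1}^{1} p(x)q(x) dx, and keeping the vectors unnormalized:

```python
from fractions import Fraction as F

def ip(p, q):
    # (p, q) = integral of p(x) q(x) over [-1, 1], exactly, from coefficients:
    # the monomial integral of x^(i+j) is 2/(i+j+1) for i+j even, 0 for odd
    s = F(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:
                s += a * b * F(2, i + j + 1)
    return s

def sub(p, q, c):
    # p - c*q, padding coefficient lists to a common length
    n = max(len(p), len(q))
    p = p + [F(0)] * (n - len(p)); q = q + [F(0)] * (n - len(q))
    return [a - c * b for a, b in zip(p, q)]

basis = [[F(1)], [F(0), F(1)], [F(0), F(0), F(1)]]   # 1, x, x^2
ortho = []
for v in basis:
    for u in ortho:
        v = sub(v, u, ip(v, u) / ip(u, u))   # subtract the projection on u
    ortho.append(v)

print(ortho[2])                     # coefficients of x^2 - 1/3
print([ip(u, u) for u in ortho])    # squared norms 2, 2/3, 8/45
```

Normalizing these three vectors reproduces the functions obtained in the example below.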

Example. Take V = P_2, with basis {1, x, x^2}, and inner product given by

(A.2.32) (p, q) = \int_{−1}^{1} p(x)q(x)\, dx.

The Gram-Schmidt construction gives first

(A.2.33) u_1(x) = \frac{1}{\sqrt{2}}.

Then

ṽ_2(x) = x,

since by symmetry (x, u_1) = 0. Now \int_{−1}^{1} x^2\, dx = 2/3, so we take

(A.2.34) u_2(x) = \sqrt{\frac{3}{2}}\, x.

Next

ṽ_3(x) = x^2 − (x^2, u_1)u_1 = x^2 − \frac{1}{3},

since by symmetry (x^2, u_2) = 0. Now \int_{−1}^{1} (x^2 − 1/3)^2\, dx = 8/45, so we take

(A.2.35) u_3(x) = \sqrt{\frac{45}{8}} \Big( x^2 − \frac{1}{3} \Big).

Let V be an n-dimensional inner product space, W ⊂ V an m-dimensional linear subspace. By Proposition A.2.2, W has an orthonormal basis

{w1, . . . , wm}. We know from §1.3 that V has a basis of the form

(A.2.36) {w1, . . . , wm, v1, . . . , vℓ}, ℓ + m = n. Applying Proposition A.2.2 again gives the following. Proposition A.2.3. If V is an n-dimensional inner product space and W ⊂ V an m-dimensional linear subspace, with orthonormal basis {w1, . . . , wm}, then V has an orthonormal basis of the form

(A.2.37) {w1, . . . , wm, u1, . . . , uℓ}, ℓ + m = n.

We see that, if we define the orthogonal complement of W in V as

(A.2.38) W^⊥ = {v ∈ V : (v, w) = 0, ∀ w ∈ W},

then

(A.2.39) W^⊥ = Span{u_1, . . . , u_ℓ}.

In particular,

(A.2.40) dim W + dim W^⊥ = dim V.

In the setting of Proposition A.2.3, we can define P_W ∈ L(V) by

(A.2.41) P_W v = \sum_{j=1}^{m} (v, w_j) w_j, for v ∈ V,

and see that P_W is uniquely defined by the properties

(A.2.42) P_W w = w, ∀ w ∈ W, P_W u = 0, ∀ u ∈ W^⊥.

We call P_W the orthogonal projection of V onto W. Note the appearance of such orthogonal projections in the proof of Proposition A.2.2, namely in (A.2.28).
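A small sketch of (A.2.41)–(A.2.42) in coordinates (the particular subspace and helper names are ours): P_W fixes vectors of W, and v − P_W v is orthogonal to W:

```python
import math, random

def ip(v, w):
    return sum(a * b for a, b in zip(v, w))

def proj(v, onb):
    # P_W v = sum_j (v, w_j) w_j, as in (A.2.41), for an orthonormal basis of W
    out = [0.0] * len(v)
    for w in onb:
        c = ip(v, w)
        out = [o + c * wi for o, wi in zip(out, w)]
    return out

# W = span of two orthonormal vectors in R^3
w1 = [1.0, 0.0, 0.0]
w2 = [0.0, 1 / math.sqrt(2), 1 / math.sqrt(2)]
onb = [w1, w2]

random.seed(0)
v = [random.uniform(-1, 1) for _ in range(3)]
p = proj(v, onb)
pp = proj(p, onb)
# P_W is a projection: P_W^2 = P_W
assert all(abs(a - b) < 1e-12 for a, b in zip(p, pp))
# v - P_W v lies in W-perp
r = [a - b for a, b in zip(v, p)]
assert abs(ip(r, w1)) < 1e-12 and abs(ip(r, w2)) < 1e-12
print("checks passed")
```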

Another object that arises in the setting of inner product spaces is the adjoint, defined as follows. If V and W are finite-dimensional inner product spaces and T ∈ L(V,W), we define the adjoint

(A.2.43) T^* ∈ L(W, V), (v, T^*w) = (Tv, w).

If V and W are real vector spaces, we also use the notation T^t for the adjoint, and call it the transpose. In case V = W and T ∈ L(V), we say

(A.2.44) T is self-adjoint ⇐⇒ T^* = T,

and

(A.2.45) T is unitary (if F = C), or orthogonal (if F = R) ⇐⇒ T^* = T^{−1}.

The following gives a significant connection between adjoints and orthogonal complements.

Proposition A.2.4. Let V be an n-dimensional inner product space, W ⊂ V a linear subspace. Take T ∈ L(V). Then

(A.2.46) T : W → W =⇒ T^* : W^⊥ → W^⊥.

Proof. Note that (A.2.47) (w, T ∗u) = (T w, u) = 0, ∀ w ∈ W, u ∈ W ⊥, if T : W → W . This shows that T ∗u ⊥ W for all u ∈ W ⊥, and we have (A.2.46). 

In particular,

(A.2.48) T = T^*, T : W → W =⇒ T : W^⊥ → W^⊥.

A.3. Eigenvalues and eigenvectors

Let T : V → V be linear. If there is a nonzero v ∈ V such that

(A.3.1) T v = λjv, for some λj ∈ F, we say λj is an eigenvalue of T , and v is an eigenvector. Let E(T, λj) denote the set of vectors v ∈ V such that (A.3.1) holds. It is clear that E(T, λj) (the λj-eigenspace of T ) is a linear subspace of V and

(A.3.2) T : E(T, λj) −→ E(T, λj).

The set of λj ∈ F such that E(T, λj) ≠ 0 is denoted Spec(T ). Clearly λj ∈ Spec(T ) if and only if T − λjI is not injective, so, if V is finite dimensional,

(A.3.3) λj ∈ Spec(T ) ⇐⇒ det(λjI − T ) = 0.

We call KT (λ) = det(λI − T ) the characteristic polynomial of T . If F = C, we can use the fundamental theorem of algebra, which says every non-constant polynomial with complex coefficients has at least one complex root. (See §5.1 for a proof of this result.) This proves the following.

Proposition A.3.1. If V is a finite-dimensional complex vector space and T ∈ L(V ), then T has at least one eigenvector in V .

Remark. If V is real and KT (λ) does have a real root λj, then there is a real λj-eigenvector.

Sometimes a linear transformation has only one eigenvector, up to a scalar multiple. Consider the transformation A : C^3 → C^3 given by

(A.3.4) A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}.

We see that det(λI − A) = (λ − 2)^3, so λ = 2 is a triple root. It is clear that

(A.3.5) E(A, 2) = Span{e_1},

where e_1 = (1, 0, 0)^t is the first standard basis vector of C^3. If one is given T ∈ L(V), it is of interest to know whether V has a basis of eigenvectors of T. The following result is useful.

Proposition A.3.2. Assume that the characteristic polynomial of T ∈ L(V) has k distinct roots, λ_1, . . . , λ_k, with eigenvectors v_j ∈ E(T, λ_j), 1 ≤ j ≤ k. Then {v_1, . . . , v_k} is linearly independent. In particular, if k = dim V, these vectors form a basis of V.

Proof. We argue by contradiction. If {v1, . . . , vk} is linearly dependent, take a minimal subset that is linearly dependent and (reordering if necessary) say this set is {v1, . . . , vm}, with T vj = λjvj, and

(A.3.6) c1v1 + ··· + cmvm = 0, with cj ≠ 0 for each j ∈ {1, . . . , m}. Applying T − λmI to (A.3.6) gives

(A.3.7) c1(λ1 − λm)v1 + ··· + cm−1(λm−1 − λm)vm−1 = 0, a linear dependence relation on the smaller set {v1, . . . , vm−1}. This contra- diction proves the proposition. 

Here is another important class of transformations that have a full complement of eigenvectors.

Proposition A.3.3. Let V be an n-dimensional inner product space, T ∈ L(V). Assume T is self-adjoint, i.e., T = T^*. Then V has an orthonormal basis of eigenvectors of T.

Proof. First, assume V is a complex vector space (F = C). Proposition A.3.1 implies that there exists an eigenvector v1 of T . Let W = Span{v1}. Then Proposition A.2.4 gives (A.3.8) T : W ⊥ −→ W ⊥, and dim W ⊥ = n − 1. The proposition then follows by induction on n. 

If V is a real vector space (F = R), then the characteristic polynomial det(λI − T) has a complex root, say λ_1 ∈ C. Denote by Ṽ the complexification of V. The transformation T extends to T ∈ L(Ṽ), as a self-adjoint transformation on this complex inner product space. Hence there exists nonzero v_1 ∈ Ṽ such that Tv_1 = λ_1v_1. We now take note of the following.

Proposition A.3.4. If T = T^*, every eigenvalue of T is real.

Proof. Say Tv_1 = λ_1v_1, v_1 ≠ 0. Then

(A.3.9) λ_1∥v_1∥^2 = (λ_1v_1, v_1) = (Tv_1, v_1) = (v_1, Tv_1) = (v_1, λ_1v_1) = \overline{λ_1}∥v_1∥^2.

Hence \overline{λ_1} = λ_1, so λ_1 is real.

Returning to the proof of Proposition A.3.3 when V is a real inner prod- uct space, we see that the (complex) root λ1 of det(λI − T ) must in fact be real. Hence λ1I −T : V → V is not injective, so there exists a λ1-eigenvector v1 ∈ V . Induction on n, as in the argument above, finishes the proof.

Here is a useful general result on orthogonality of eigenvectors.

Proposition A.3.5. Let V be an inner product space, T ∈ L(V ). If (A.3.10) T u = λu, T ∗v = µv, λ ≠ µ, then (A.3.11) u ⊥ v.

Proof. We have (A.3.12) λ(u, v) = (T u, v) = (u, T ∗v) = µ(u, v). 

As a corollary, if T = T^*, then Tu = λu, Tv = µv, λ ≠ µ ⇒ u ⊥ v.

Our next goal is to extend Proposition A.3.3 to a broader class of trans- formations. Given T ∈ L(V ), where V is an n-dimensional complex inner product space, we say T is normal if T and T ∗ commute, i.e., TT ∗ = T ∗T . Equivalently, taking (A.3.13) T = A + iB, A = A∗,B = B∗, we have (A.3.14) T normal ⇐⇒ AB = BA. Generally, for A, B ∈ L(V ), we see that

(A.3.15) BA = AB =⇒ B : E(A, λ_j) → E(A, λ_j).

Thus, in the setting of (A.3.13), we can find an orthonormal basis of each space E(A, λ), λ ∈ Spec A, consisting of eigenvectors of B, to get an orthonormal basis of V consisting of vectors that are simultaneously eigenvectors of A and B, hence eigenvectors of T. This establishes the following.

Proposition A.3.6. Let V be an n-dimensional complex inner product space, T ∈ L(V) a normal transformation. Then V has an orthonormal basis of eigenvectors of T.

Note that if T has the form (A.3.13)–(A.3.14) and λ = a + ib, a, b ∈ R, then

(A.3.16) E(T, λ) = E(A, a) ∩ E(B, b) = E(T^*, \overline{λ}).

We deduce from Proposition A.3.5 the following.

Proposition A.3.7. In the setting of Proposition A.3.6, with T normal,

(A.3.17) λ ≠ µ =⇒ E(T, λ) ⊥ E(T, µ).

An important class of normal operators is the class of unitary operators, defined in §A.2. We recall that if V is an inner product space and T ∈ L(V), then

(A.3.18) T is unitary ⇐⇒ T^* = T^{−1}.

We write T ∈ U(V), if V is a complex inner product space. We see from (A.3.16) (or directly) that

(A.3.19) T ∈ U(V), λ ∈ Spec T =⇒ \overline{λ} = λ^{−1} =⇒ |λ| = 1.

We deduce that if T ∈ U(V), then V has an orthonormal basis of eigenvectors of T, each eigenvalue being a complex number of absolute value 1.

If V is a real n-dimensional inner product space and (A.3.18) holds, we say T is an orthogonal transformation, and write T ∈ O(V ). In such a case, V typically does not have an orthonormal basis of eigenvectors of T . However, V does have an orthonormal basis with respect to which such an orthogonal transformation has a special structure, as we proceed to show. To get it, we construct the complexification of V ,

(A.3.20) V_C = {u + iv : u, v ∈ V},

which has a natural structure of a complex n-dimensional vector space, with a Hermitian inner product. A transformation T ∈ O(V) has a unique C-linear extension to a transformation on V_C, which we continue to denote by T, and this extended transformation is unitary on V_C. Hence V_C has an orthonormal basis of eigenvectors of T. Say u + iv ∈ V_C is such an eigenvector,

(A.3.21) T(u + iv) = e^{−iθ}(u + iv), e^{iθ} ∉ {1, −1}.

Writing e^{iθ} = c + is, c, s ∈ R, we have

(A.3.22) Tu + iTv = (c − is)(u + iv) = cu + sv + i(−su + cv),

hence

(A.3.23) Tu = cu + sv, Tv = −su + cv.

In such a case, applying complex conjugation to (A.3.21) yields T(u − iv) = e^{iθ}(u − iv), and e^{iθ} ≠ e^{−iθ} if e^{iθ} ∉ {1, −1}, so Proposition A.3.7 yields

(A.3.24) u + iv ⊥ u − iv,

hence

(A.3.25) 0 = (u + iv, u − iv) = (u, u) − (v, v) + i(v, u) + i(u, v) = |u|^2 − |v|^2 + 2i(u, v),

or equivalently

(A.3.26) |u| = |v| and u ⊥ v.

Now Span{u, v} ⊂ V has an (n − 2)-dimensional orthogonal complement, on which T acts, and an inductive argument gives the following.

Proposition A.3.8. Let V be an n-dimensional real inner product space, T : V → V an orthogonal transformation. Then V has an orthonormal basis in which the matrix representation of T consists of blocks

(A.3.27) \begin{pmatrix} c_j & −s_j \\ s_j & c_j \end{pmatrix}, c_j^2 + s_j^2 = 1,

plus perhaps an identity matrix block if 1 ∈ Spec T, and a block that is −I if −1 ∈ Spec T.

This result has the following consequence, advertised in Exercise 14 of §3.2. Corollary A.3.9. For each integer n ≥ 2, (A.3.28) Exp : Skew(n) −→ SO(n) is onto.
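The key identity behind this corollary, e^{θJ} = rotation by θ for the 2 × 2 matrix J with rows (0, −1) and (1, 0) (stated as (A.3.29) below), can be checked numerically. A sketch (helper names ours) that sums the matrix exponential series directly:

```python
import math

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm2(A, terms=30):
    # 2x2 matrix exponential via the power series sum_k A^k / k!
    E = [[1.0, 0.0], [0.0, 1.0]]   # running sum, starts at the identity
    P = [[1.0, 0.0], [0.0, 1.0]]   # running power A^k
    for k in range(1, terms):
        P = mat_mul(P, A)
        E = [[E[i][j] + P[i][j] / math.factorial(k) for j in range(2)]
             for i in range(2)]
    return E

theta = 0.7
J = [[0.0, -1.0], [1.0, 0.0]]
E = expm2([[theta * x for x in row] for row in J])
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]
assert all(abs(E[i][j] - R[i][j]) < 1e-9 for i in range(2) for j in range(2))
print("checks passed")
```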

As in §3.2 we leave the proof as an exercise for the reader. The key is to use the Euler-type identity

(A.3.29) J = \begin{pmatrix} 0 & −1 \\ 1 & 0 \end{pmatrix} =⇒ e^{θJ} = \begin{pmatrix} \cos θ & −\sin θ \\ \sin θ & \cos θ \end{pmatrix}.

In cases when T is a linear transformation on an n-dimensional complex vector space V, and V does not have a basis of eigenvectors of T, it is useful to have the concept of a generalized eigenspace, defined as

(A.3.30) GE(T, λ_j) = {v ∈ V : (T − λ_jI)^k v = 0 for some k}.

If λ_j is an eigenvalue of T, nonzero elements of GE(T, λ_j) are called generalized eigenvectors. Clearly E(T, λ_j) ⊂ GE(T, λ_j). Also T : GE(T, λ_j) → GE(T, λ_j). Furthermore, one has the following.

Proposition A.3.10. If µ ≠ λ_j, then

(A.3.31) T − µI : GE(T, λ_j) \overset{≈}{\longrightarrow} GE(T, λ_j).

It is useful to know the following.

Proposition A.3.11. If W is an n-dimensional complex vector space, and T ∈ L(W), then W has a basis of generalized eigenvectors of T.

We will not give a proof of this result here. A proof can be found in Chapter 2, §7 of [45], and also in [47].

A.4. Complements on power series

If a function f is sufficiently differentiable on an interval in R containing x and y, the Taylor expansion about y reads

(A.4.1) f(x) = f(y) + f′(y)(x − y) + ··· + \frac{1}{n!} f^{(n)}(y)(x − y)^n + R_n(x, y).

Here, T_n(x, y) = f(y) + ··· + f^{(n)}(y)(x − y)^n/n! is that polynomial of degree n in x all of whose x-derivatives of order ≤ n, evaluated at y, coincide with those of f. This prescription makes the formula for T_n(x, y) easy to derive. The analysis of the remainder term R_n(x, y) is more subtle. One useful result about this remainder is the following. Say x > y, and for simplicity assume f^{(n+1)} is continuous on [y, x]; we say f ∈ C^{n+1}([y, x]). Then

(A.4.2) m ≤ f^{(n+1)}(ξ) ≤ M, ∀ ξ ∈ [y, x] =⇒ m \frac{(x − y)^{n+1}}{(n + 1)!} ≤ R_n(x, y) ≤ M \frac{(x − y)^{n+1}}{(n + 1)!}.

Under our hypotheses, this result is equivalent to the Lagrange form of the remainder:

(A.4.3) R_n(x, y) = \frac{1}{(n + 1)!} (x − y)^{n+1} f^{(n+1)}(ζ_n),

for some ζ_n between x and y. A proof of (A.4.3) will be given below. One of our purposes here is to comment on how effective estimates on R_n(x, y) are in determining the convergence of the infinite series

(A.4.4) \sum_{k=0}^{∞} \frac{f^{(k)}(y)}{k!} (x − y)^k

to f(x). That is to say, we want to perceive that R_n(x, y) → 0 as n → ∞, in appropriate circumstances. Before we look at how effective the estimate (A.4.2) is at this job, we want to introduce another player, and along the way discuss the derivation of various formulas for the remainder in (A.4.1).

A simple formula for R_n(x, y) follows upon taking the y-derivative of both sides of (A.4.1); we are assuming that f is at least (n + 1)-fold differentiable. When we do this (applying the Leibniz formula to those terms that are products) an enormous amount of cancellation arises, and the formula collapses to

(A.4.5) \frac{∂R_n}{∂y} = −\frac{1}{n!} f^{(n+1)}(y)(x − y)^n, R_n(x, x) = 0.

If we concentrate on R_n(x, y) as a function of y and look at the difference quotient [R_n(x, y) − R_n(x, x)]/(y − x), an immediate consequence of the mean value theorem is that

(A.4.6) R_n(x, y) = \frac{1}{n!} (x − y)(x − ξ_n)^n f^{(n+1)}(ξ_n),

for some ξ_n between x and y. This result, known as Cauchy's formula for the remainder, has a slightly more complicated appearance than (A.4.3), but as we will see it has advantages over Lagrange's formula. The application of the mean value theorem to obtain (A.4.6) does not require the continuity of f^{(n+1)}, but we do not want to dwell on that point. If f^{(n+1)} is continuous, we can apply the Fundamental Theorem of Calculus to (A.4.5), in the y-variable, and obtain the basic integral formula

(A.4.7) R_n(x, y) = \frac{1}{n!} \int_y^x (x − s)^n f^{(n+1)}(s)\, ds.

Another proof of (A.4.7) is indicated in Exercise 9 of §1.1. If we think of the integral in (A.4.7) as (x − y) times the mean value of the integrand, we see (A.4.6) as a consequence. On the other hand, if we want to bring a factor of (x − y)^{n+1} outside the integral in (A.4.7), the change of variable x − s = t(x − y) gives the integral formula

(A.4.8) R_n(x, y) = \frac{1}{n!} (x − y)^{n+1} \int_0^1 t^n f^{(n+1)}\big(ty + (1 − t)x\big)\, dt.

If we think of this integral as 1/(n + 1) times a weighted mean value of f^{(n+1)}, we recover the Lagrange formula (A.4.3). From the Lagrange form (A.4.3) of the remainder in (A.4.1) we have the estimate

(A.4.9) |R_n(x, y)| ≤ \frac{|x − y|^{n+1}}{(n + 1)!} \sup_{ζ∈I(x,y)} |f^{(n+1)}(ζ)|,

where I(x, y) is the open interval from x to y (either (x, y) or (y, x), disregarding the trivial case x = y). Meanwhile, from the Cauchy form (A.4.6) of the remainder we have the estimate

(A.4.10) |R_n(x, y)| ≤ \frac{|x − y|}{n!} \sup_{ξ∈I(x,y)} |(x − ξ)^n f^{(n+1)}(ξ)|.

We now study how effective these estimates are in determining that various power series converge.
We begin with a look at these remainder estimates for the power series expansion about the origin of the simple function

(A.4.11) f(x) = \frac{1}{1 − x}.

We have, for x ≠ 1,

(A.4.12) f^{(k)}(x) = k!\, (1 − x)^{−k−1},

and formula (A.4.1) becomes

(A.4.13) \frac{1}{1 − x} = 1 + x + ··· + x^n + R_n(x, 0).

Of course, everyone knows that the infinite series

(A.4.14) 1 + x + ··· + x^n + ···

converges to f(x) in (A.4.11), precisely for x ∈ (−1, 1). What we are interested in is what can be deduced from the estimate (A.4.9), which, for the function (A.4.11), takes the form

(A.4.15) |R_n(x, 0)| ≤ |x|^{n+1} · \sup_{ζ∈I(x,0)} |1 − ζ|^{−n−2}.

We consider two cases. First, if x ≤ 0, then |1 − ζ| ≥ 1 for ζ ∈ I(x, 0), so

(A.4.16) x ≤ 0 =⇒ |R_n(x, 0)| ≤ |x|^{n+1}.

Thus the estimate (A.4.9) implies that Rn(x, 0) → 0 in (A.4.13), for all x ∈ (−1, 0]. Suppose however that x ≥ 0. What we have from (A.4.15) is

(A.4.17) x ≥ 0 =⇒ |R_n(x, 0)| ≤ |x|^{n+1} \sup_{0≤ζ≤x} |1 − ζ|^{−n−2} = \frac{1}{1 − x} \Big( \frac{x}{1 − x} \Big)^{n+1}.

This tends to 0 as n → ∞ if and only if x < 1 − x, i.e., if and only if x < 1/2. What we have is the following:

Conclusion. The estimate (A.4.9) implies the convergence of the Taylor series (about the origin) for the function f(x) = 1/(1 − x), only for −1 < x < 1/2.
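This conclusion is easy to see numerically. A small sketch (the sample point x = 0.7 is ours), comparing the true remainder in (A.4.13) with the bound (A.4.17):

```python
# f(x) = 1/(1 - x) at x = 0.7: the true remainder tends to 0, while the
# Lagrange-based bound (A.4.17) blows up, since x/(1 - x) = 7/3 > 1.
x = 0.7
f = 1 / (1 - x)
remainders, bounds = [], []
for n in (5, 10, 20):
    taylor = sum(x**k for k in range(n + 1))      # 1 + x + ... + x^n
    remainders.append(abs(f - taylor))            # equals x^(n+1)/(1-x)
    bounds.append((1 / (1 - x)) * (x / (1 - x))**(n + 1))
print(remainders)   # decreasing toward 0
print(bounds)       # increasing without bound
```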

This example points to a weakness in the estimate (A.4.9), coming from the Lagrange form of the remainder. Now let us see how well we can do with the estimate (A.4.10), coming from the Cauchy form of the remainder. For the function (A.4.11), this takes the form

(A.4.18) |R_n(x, 0)| ≤ (n + 1)\,|x| \sup_{ξ∈I(x,0)} \frac{|x − ξ|^n}{|1 − ξ|^{n+2}}.

For −1 < x ≤ 0 one has an estimate like (A.4.16), with a factor of (n + 1) thrown in. On the other hand, one readily verifies that

0 ≤ ξ ≤ x < 1 =⇒ \frac{x − ξ}{1 − ξ} ≤ x,

so we deduce from (A.4.18) that

(A.4.19) 0 ≤ x < 1 =⇒ |R_n(x, 0)| ≤ (n + 1) \frac{x^{n+1}}{1 − x},

which does tend to 0 for all x ∈ [0, 1). One might be wondering if one could come up with some more complicated example, for which Cauchy's form is effective only on an interval shorter than the interval of convergence. In fact, you can't. Cauchy's form of the remainder is always effective in the interior of the interval of convergence. This can be demonstrated using methods of complex analysis. We look at some more power series, and see when convergence can be established at an endpoint of an interval of convergence, using the estimate (A.4.10), i.e.,

(A.4.20) |R_n(x, y)| ≤ C_n(x, y), C_n(x, y) = \frac{|x − y|}{n!} \sup_{ξ∈I(x,y)} |(x − ξ)^n f^{(n+1)}(ξ)|.

We consider the following family of examples:

(A.4.21) f(x) = (1 − x)^a, a > 0.

The power series expansion has radius of convergence 1 (if a is not an integer) and, as we will see, one has convergence at both endpoints, +1 and −1, whenever a > 0. Let us see when C_n(1, 0) → 0. We have

(A.4.22) f^{(n+1)}(x) = (−1)^{n+1} a(a − 1) ··· (a − n)\, (1 − x)^{a−n−1}.

Hence

(A.4.23) C_n(−1, 0) = \frac{|a(a − 1) ··· (a − n)|}{n!} \sup_{−1<ξ<0} \frac{|−1 − ξ|^n}{|1 − ξ|^{n+1−a}} = a(1 − a)\Big(1 − \frac{a}{2}\Big) ··· \Big(1 − \frac{a}{n}\Big) = O(n^{−a}),

as one can see by applying the log, and using log(1 − a/k) ≤ −a/k for k > a. (Compare the proof of Proposition A.4.1.) Hence C_n(−1, 0) → 0 as n → ∞, whenever a > 0 in (A.4.21). On the other hand,

(A.4.24) C_n(1, 0) = \frac{|a(a − 1) ··· (a − n)|}{n!} \sup_{0<ξ<1} (1 − ξ)^{a−1}.

If a ∈ (0, 1), this is identically +∞, while if a ≥ 1 it is O(n^{−a}), as above.

Conclusion. The estimate (A.4.20) is successful at establishing the conver- gence of the Taylor series (about the origin) for the function f(x) = (1−x)a, at x = −1, whenever a > 0. It fails to establish the convergence at x = +1, when 0 < a < 1, but it is successful when a ≥ 1.

We mention that convergence for x ∈ (−1, 1) is easily checked for the power series of the functions (A.4.21). The failure of (A.4.20) to establish convergence at x = +1 does not imply failure of such convergence. In fact we have the following result, which will be useful in Appendix A.5. Proposition A.4.1. Given a > 0, the Taylor series about the origin for the function f(x) = (1 − x)a converges absolutely and uniformly to f(x) on the closed interval [−1, 1].

Proof. As noted, the series is

(A.4.25) \sum_{n=0}^{∞} c_n x^n, c_n = (−1)^n \frac{a(a − 1) ··· (a − n + 1)}{n!},

and, by an analysis parallel to (A.4.23),

(A.4.26) |c_n| ≤ C n^{−1−a}.

In more detail, if n − 1 > a,

(A.4.27) c_n = −\frac{a}{n} \prod_{1≤k≤a} \Big(1 − \frac{a}{k}\Big) \prod_{a<k≤n−1} \Big(1 − \frac{a}{k}\Big),

and the estimate (A.4.26) follows upon applying the log to the last product, using log(1 − a/k) ≤ −a/k for k > a. Since \sum n^{−1−a} < ∞ for a > 0, we see that the series (A.4.25) does converge absolutely and uniformly on [−1, 1]; so its limit is a continuous function f_a on [−1, 1]. The remark above has shown that f_a(x) = (1 − x)^a for x ∈ [−1, 1); by continuity this identity also holds at x = 1.
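As a numerical check of endpoint convergence (the parameter choices are ours), one can sum the series (A.4.25) for a = 1/2 at x = −1, where the limit should be (1 − (−1))^{1/2} = √2:

```python
import math

a, x = 0.5, -1.0
term, s = 1.0, 1.0              # term = c_n x^n, starting at n = 0
for n in range(1, 20001):
    # recursion c_n = -c_{n-1} (a - n + 1)/n, then multiply by x
    term *= -(a - n + 1) / n * x
    s += term
print(abs(s - math.sqrt(2)))    # small
```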

The material above has emphasized the study of the expansion (A.4.1) for infinitely smooth f, concentrating on the issue of convergence as n → ∞. The behavior for fixed n (e.g., n = 2) as x → y is also of great interest, and in this connection it is important to note that (A.4.1) holds, with a useful formula for R_n(x, y), when f is merely C^n, not necessarily C^{n+1}. So suppose f ∈ C^n, i.e., f, f′, . . . , f^{(n)} are continuous on an interval I about y. Then the result (A.4.7) holds, with n replaced by n − 1; i.e., for x ∈ I we have

(A.4.28) f(x) = f(y) + f′(y)(x − y) + ··· + \frac{1}{(n − 1)!} f^{(n−1)}(y)(x − y)^{n−1} + R_{n−1}(x, y),

with

(A.4.29) R_{n−1}(x, y) = \frac{1}{(n − 1)!} \int_y^x (x − s)^{n−1} f^{(n)}(s)\, ds.

Now we can add and subtract f^{(n)}(y) to the factor f^{(n)}(s) in the integrand above, and obtain

(A.4.30) R_{n−1}(x, y) = \frac{1}{n!} f^{(n)}(y)(x − y)^n + \frac{1}{(n − 1)!} \int_y^x (x − s)^{n−1} \big[ f^{(n)}(s) − f^{(n)}(y) \big]\, ds.

This establishes the following.

Proposition A.4.2. Assume f has n continuous derivatives on an interval I containing y. Then, for x ∈ I, the formula (A.4.1) holds, with

(A.4.31) R_n(x, y) = \frac{1}{(n − 1)!} \int_y^x (x − s)^{n−1} \big[ f^{(n)}(s) − f^{(n)}(y) \big]\, ds.

Note that since the integral in (A.4.31) equals x − y times the value of the integrand at some point s = ξ between x and y, we can write a "Cauchy form" of the remainder (A.4.31) as

(A.4.32) R_n(x, y) = \frac{1}{(n − 1)!} \big[ f^{(n)}(ξ) − f^{(n)}(y) \big] (x − ξ)^{n−1}(x − y).

Alternatively, parallel to (A.4.8), we can write

(A.4.33) R_n(x, y) = \frac{(x − y)^n}{(n − 1)!} \int_0^1 \big[ f^{(n)}(sx + (1 − s)y) − f^{(n)}(y) \big] (1 − s)^{n−1}\, ds,

and obtain a "Lagrange form":

(A.4.34) R_n(x, y) = \frac{(x − y)^n}{n!} \big[ f^{(n)}(ζ) − f^{(n)}(y) \big],

for some ζ between x and y. Note that (A.4.34) also follows by replacing n by n − 1 in (A.4.3).

A.5. The Weierstrass theorem and the Stone-Weierstrass theorem

The following result is a very useful tool in analysis known as the Weierstrass polynomial approximation theorem. Proposition A.5.1. Given a compact interval I, any continuous function f on I is a uniform limit of polynomials.

Otherwise stated, the result is that the space C(I) of continuous (real valued) functions on I is equal to P(I), the uniform closure in C(I) of the space of polynomials. To prove this, our starting point will be the result that the power series for (1 − x)^a converges uniformly on [−1, 1], for any a > 0. This was established in §A.4, and we will use it, with a = 1/2. From the identity x^{1/2} = (1 − (1 − x))^{1/2}, we have x^{1/2} ∈ P([0, 2]). More to the point, from the identity

(A.5.1) |x| = \big( 1 − (1 − x^2) \big)^{1/2},

we have |x| ∈ P([−\sqrt{2}, \sqrt{2}]). Using |x| = b^{−1}|bx|, for any b > 0, we see that |x| ∈ P(I) for any interval I = [−c, c], and also for any closed subinterval, hence for any compact interval I. By translation, we have

(A.5.2) |x − a| ∈ P(I)

for any compact interval I. Using the identities

(A.5.3) max(x, y) = \frac{1}{2}(x + y) + \frac{1}{2}|x − y|, min(x, y) = \frac{1}{2}(x + y) − \frac{1}{2}|x − y|,

we see that for any a ∈ R and any compact I,

(A.5.4) max(x, a), min(x, a) ∈ P(I).
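The approximation of |x| just described can be carried out numerically. The following sketch (names ours) truncates the series for (1 − t)^{1/2} at t = 1 − x^2, as in (A.5.1), and measures the uniform error on [−1, 1]:

```python
def p_abs(x, N):
    # partial sum of (1 - t)^{1/2} = sum_n c_n t^n at t = 1 - x^2,
    # a polynomial in x of degree 2N approximating |x|
    t = 1.0 - x * x
    term, s = 1.0, 1.0
    for n in range(1, N + 1):
        term *= -(0.5 - n + 1) / n * t   # c_n = -c_{n-1} (1/2 - n + 1)/n
        s += term
    return s

grid = [k / 500.0 for k in range(-500, 501)]
errs = [max(abs(p_abs(x, N) - abs(x)) for x in grid) for N in (50, 200)]
print(errs)   # uniform error on [-1, 1], decreasing as N grows
```

The worst error occurs near x = 0 (where t = 1, the endpoint of the interval of convergence), so convergence is slow but, as the text shows, still uniform.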

We next note that P(I) is an algebra of functions, i.e., (A.5.5) f, g ∈ P(I), c ∈ R =⇒ f + g, fg, cf ∈ P(I). Using this, one sees that, given f ∈ P(I), with range in a compact interval J, one has h ◦ f ∈ P(I) for all h ∈ P(J). Hence f ∈ P(I) ⇒ |f| ∈ P(I), and, via (A.5.3), we deduce that (A.5.6) f, g ∈ P(I) =⇒ max(f, g), min(f, g) ∈ P(I).

Suppose now that I′ = [a′, b′] is a subinterval of I = [a, b]. With the notation x_+ = max(x, 0), we have

(A.5.7) f_{II′}(x) = min\big( (x − a′)_+, (b′ − x)_+ \big) ∈ P(I).

This is a piecewise linear function, equal to zero on I \ I′, with slope 1 from a′ to the midpoint m′ of I′, and slope −1 from m′ to b′.

Now if I is divided into N equal subintervals, any continuous function on I that is linear on each such subinterval can be written as a linear combi- nation of such “tent functions,” so it belongs to P(I). Finally, any f ∈ C(I) can be uniformly approximated by such piecewise linear functions, so we have f ∈ P(I), proving the theorem.
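The final step can be sketched as follows (the choice f = exp and the helper names are ours): a piecewise-linear interpolant at N + 1 equally spaced nodes, which is exactly a combination of tent functions, converges uniformly to f:

```python
import math

def pl_interp(f, N):
    # piecewise-linear interpolant of f at the nodes k/N, 0 <= k <= N
    vals = [f(k / N) for k in range(N + 1)]
    def g(x):
        k = min(int(x * N), N - 1)
        t = x * N - k
        return (1 - t) * vals[k] + t * vals[k + 1]
    return g

errs = []
for N in (4, 16, 64):
    g = pl_interp(math.exp, N)
    errs.append(max(abs(math.exp(x) - g(x))
                    for x in (j / 1000 for j in range(1001))))
print(errs)   # uniform error, shrinking roughly like 1/N^2 for this smooth f
```

For merely continuous f the error still tends to 0, by uniform continuity, which is the point of the proof.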

A far reaching extension of Proposition A.5.1, due to M. Stone, is the following result, known as the Stone-Weierstrass theorem. Proposition A.5.2. Let X be a compact metric space, A a subalgebra of CR(X), the algebra of real valued continuous functions on X. Suppose 1 ∈ A and that A separates points of X, i.e., for distinct p, q ∈ X, there exists hpq ∈ A with hpq(p) ≠ hpq(q). Then the closure A is equal to CR(X).

We will derive this from the following lemma.

Lemma A.5.3. Let A ⊂ CR(X) satisfy the hypotheses of Proposition A.5.2, and let K,L ⊂ X be disjoint, compact subsets of X. Then there exists gKL ∈ A such that

(A.5.8) gKL = 1 on K, 0 on L, and 0 ≤ gKL ≤ 1 on X.

Proof of Proposition A.5.2. To start, take f ∈ CR(X) such that 0 ≤ f ≤ 1 on X. Set

(A.5.9) K = {x ∈ X : f(x) ≥ 2/3}, U = {x ∈ X : f(x) > 1/3}, L = X \ U.

Lemma A.5.3 implies that there exists g ∈ A such that

g = 1/3 on K, g = 0 on L, and 0 ≤ g ≤ 1/3.

Then 0 ≤ g ≤ f ≤ 1 on X, and, more precisely,

(A.5.10) 0 ≤ f − g ≤ 2/3, on X.

We can apply such reasoning with f replaced by f − g, obtaining g_2 ∈ A such that

(A.5.11) 0 ≤ f − g − g_2 ≤ (2/3)², on X,

and iterate, obtaining g_j ∈ A such that, for each k,

(A.5.12) 0 ≤ f − g − g_2 − ··· − g_k ≤ (2/3)^k, on X.

This yields f ∈ A, whenever f ∈ CR(X) satisfies 0 ≤ f ≤ 1. It is an easy step to see that f ∈ CR(X) ⇒ f ∈ A. □

Proof of Lemma A.5.3. We present the proof in six steps.

Step 1. By Proposition A.5.1, if f ∈ A and φ : R → R is continuous, then φ ◦ f ∈ A.

Step 2. Consequently, if f_j ∈ A, then

(A.5.13) max(f_1, f_2) = (1/2)|f_1 − f_2| + (1/2)(f_1 + f_2) ∈ A,

and similarly min(f_1, f_2) ∈ A.

Step 3. It follows from the hypotheses that if p, q ∈ X and p ≠ q, then there exists fpq ∈ A, equal to 1 at p and to 0 at q.

Step 4. Apply an appropriate continuous φ : R → R to get gpq = φ ◦ fpq ∈ A, equal to 1 on a neighborhood of p and to 0 on a neighborhood of q, and satisfying 0 ≤ gpq ≤ 1 on X.

Step 5. Let L ⊂ X be compact and fix p ∈ X \ L. By Step 4, given q ∈ L, there exists g_{pq} ∈ A such that g_{pq} = 1 on a neighborhood O_q of p, equal to 0 on a neighborhood Ω_q of q, satisfying 0 ≤ g_{pq} ≤ 1 on X. Now {Ω_q} is an open cover of L, so there exists a finite subcover Ω_{q_1}, ..., Ω_{q_N}. Let

(A.5.14) g_{pL} = min_{1≤j≤N} g_{pq_j} ∈ A.

Taking O = ∩_{j=1}^N O_{q_j}, an open neighborhood of p, we have

g_{pL} = 1 on O, 0 on L, and 0 ≤ g_{pL} ≤ 1 on X.

Step 6. Take K, L ⊂ X disjoint, compact subsets. By Step 5, for each p ∈ K, there exists g_{pL} ∈ A, equal to 1 on a neighborhood O_p of p, and equal to 0 on L. Now {O_p} covers K, so there exists a finite subcover O_{p_1}, ..., O_{p_M}. Let

(A.5.15) g_{KL} = max_{1≤j≤M} g_{p_j L} ∈ A.

We have

(A.5.16) g_{KL} = 1 on K, 0 on L, and 0 ≤ g_{KL} ≤ 1 on X,

as asserted in the lemma. □

Proposition A.5.2 has a complex analogue. In that case, we add the assumption that f ∈ A ⇒ f̄ ∈ A, and conclude that A = C(X). This is easily reduced to the real case. Here are a couple of applications of Proposition A.5.2, in its real and complex forms:

Corollary A.5.4. If X is a compact subset of R^n, then every f ∈ C(X) is a uniform limit of polynomials on R^n.

Corollary A.5.5. The space of trigonometric polynomials on the n-torus T^n, given by

(A.5.17) ∑_{|k|≤N} a_k e^{ik·θ},

is dense in C(T^n).
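For a concrete instance of Corollary A.5.5 on T¹, one can check numerically that partial Fourier sums, which are trigonometric polynomials of the form (A.5.17), converge uniformly to f(θ) = |cos θ|, a continuous function whose Fourier coefficients are absolutely summable. A Python sketch (all names here are illustrative; the coefficients are computed by a simple Riemann sum):

```python
import cmath
import math

def fourier_coeff(f, k, M=4096):
    """Approximate the k-th Fourier coefficient of a 2*pi-periodic f by a Riemann sum."""
    s = sum(f(2 * math.pi * j / M) * cmath.exp(-1j * k * 2 * math.pi * j / M)
            for j in range(M))
    return s / M

def partial_sum(f, N):
    """Return the trigonometric polynomial S_N f(theta) = sum_{|k|<=N} c_k e^{ik theta}."""
    coeffs = {k: fourier_coeff(f, k) for k in range(-N, N + 1)}
    def S(theta):
        return sum(c * cmath.exp(1j * k * theta) for k, c in coeffs.items()).real
    return S

f = lambda t: abs(math.cos(t))

def sup_error(N, grid=800):
    """Sup-norm distance between f and its N-th partial Fourier sum, on a grid."""
    S = partial_sum(f, N)
    return max(abs(f(2 * math.pi * j / grid) - S(2 * math.pi * j / grid))
               for j in range(grid))

print(sup_error(5), sup_error(50))
```

Since the Fourier coefficients of |cos θ| decay like k^{−2}, the sup-norm error of S_N is O(1/N); there is no Gibbs phenomenon here, as f is continuous with absolutely convergent Fourier series.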

A.6. Further results on harmonic functions

Recall that if Ω ⊂ Rn is open, a function u ∈ C2(Ω) is harmonic on Ω provided ∆u = 0, where

(A.6.1) ∆ = ∂²/∂x_1² + ··· + ∂²/∂x_n²

is the Laplace operator. Some useful results on harmonic functions have been presented in Chapters 5 and 7. In particular, we have established the mean value property (cf. (5.1.23)): if u is harmonic on Ω and the closed ball B̄_R(x_0) ⊂ Ω, then

(A.6.2) u(x_0) = (1/V(B_R)) ∫_{B_R(x_0)} u(x) dx.

This leads to the maximum principle, Proposition 5.1.7. In particular, if Ω is bounded and u ∈ C(Ω̄) is harmonic on Ω, then

(A.6.3) sup_{x∈Ω} |u(x)| = sup_{y∈∂Ω} |u(y)|.

One consequence of this is that, for such bounded Ω ⊂ R^n, a solution u ∈ C²(Ω) ∩ C(Ω̄) to the Dirichlet problem

(A.6.4) ∆u = 0 on Ω, u|_{∂Ω} = f,

with f ∈ C(∂Ω) given, is unique (provided it exists). In Chapter 7 we showed that solutions do exist if Ω = B_R(x_0) is a ball. In fact, we derived a formula, the Poisson integral formula, for the solution in case the ball is B = B_1(0). In that case, the solution is given by u = PI f, where

(A.6.5) PI f(x) = ((1 − |x|²)/A_{n−1}) ∫_{S^{n−1}} f(y)/|x − y|^n dS(y).

One can then treat arbitrary balls BR(x0), via translation and dilation. One consequence of this formula is that, whenever Ω ⊂ Rn is open and u ∈ C2(Ω) is harmonic, then in fact u ∈ C∞(Ω). It then follows that all derivatives ∂αu are harmonic in Ω. In case Ω is a bounded set and u ∈ C(Ω), and K ⊂ Ω is compact, we also have an estimate

(A.6.6) sup_{x∈K} |∂^α u(x)| ≤ C_{Kα} sup_{y∈∂Ω} |u(y)|.
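The Poisson integral formula (A.6.5) lends itself to a numerical sanity check. For n = 2, where A_1 = 2π and the integral is over the unit circle, we may take boundary data f(e^{iφ}) = cos 3φ, whose harmonic extension to the disk is Re z³, and compare. A Python sketch (the function names are ours, not the text's):

```python
import math

def poisson_integral_2d(f, x1, x2, M=2000):
    """Evaluate PI f at an interior point of the unit disk, cf. (A.6.5) with n = 2,
    using the trapezoid rule on the circle (spectrally accurate for smooth data)."""
    r2 = x1 * x1 + x2 * x2
    total = 0.0
    for j in range(M):
        phi = 2 * math.pi * j / M
        y1, y2 = math.cos(phi), math.sin(phi)
        dist2 = (x1 - y1) ** 2 + (x2 - y2) ** 2   # |x - y|^2
        total += f(phi) / dist2
    # (1 - |x|^2)/(2*pi) * integral, with d(phi) ~ 2*pi/M, so the 2*pi cancels
    return (1 - r2) * total / M

f = lambda phi: math.cos(3 * phi)      # boundary values of Re z^3
x1, x2 = 0.4, 0.2
exact = x1 ** 3 - 3 * x1 * x2 ** 2     # Re (x1 + i x2)^3
print(poisson_integral_2d(f, x1, x2), exact)
```

The trapezoid rule is exponentially accurate here because the integrand is a smooth periodic function of φ for each fixed interior point x; the agreement with Re z³ is essentially to machine precision.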

Our goal in this appendix is to establish some further useful results about harmonic functions. We start with the following removable singularity theorem, which is of use in the proof of Proposition 7.4.3. Take B = B_1(0).

Proposition A.6.1. Assume u ∈ C²(B \ 0) ∩ C(B̄ \ 0) is harmonic on B \ 0 and bounded, i.e., there exists M < ∞ such that

(A.6.7) |u(x)| ≤ M, ∀ x ∈ B \ 0.

Then u can be extended (in a unique fashion) to be harmonic on all of B.

Proof. Let f = u|_{∂B} ∈ C(∂B) and set

(A.6.8) v = PI f, v ∈ C(B̄) ∩ C²(B).

We claim v = u on B \ 0. To this end, consider w = u − v on B \ 0. We have w ∈ C(B̄ \ 0) ∩ C²(B \ 0), ∆w = 0 on B \ 0, and w = 0 on ∂B. Also |w| ≤ 2M on B \ 0. We claim w ≡ 0. To show this, we can assume that w is real valued. Now bring in the function H ∈ C(B̄ \ 0) ∩ C²(B \ 0), given by

(A.6.9) H(x) = |x|^{2−n} − 1, if n ≥ 3; log(1/|x|), if n = 2.

We see that H is harmonic on B \ 0, H ≥ 0 on B̄ \ 0, H = 0 on ∂B, and H(x) → +∞ as x → 0. Hence, for each ε > 0, there exists δ_0 > 0 such that

(A.6.10) εH − w ≥ 0 on ∂Bδ(0), ∀ δ ∈ (0, δ0]. The maximum principle implies that (A.6.11) εH − w ≥ 0 on B \ Bδ(0). Taking δ ↘ 0 yields (A.6.11) on B \ 0. Then taking ε ↘ 0 yields (A.6.12) w ≤ 0 on B \ 0. A similar argument gives w ≥ 0 on B \ 0, hence w ≡ 0, and the proof is complete. 

We turn now to a converse to the mean value property of harmonic functions. To state it, we say a function u ∈ C(Ω) has the mean value property provided that (A.6.2) holds whenever the ball BR(x0) ⊂ Ω. One point of abstracting this property is that such functions satisfy the maximum principle: Lemma A.6.2. If Ω is bounded and u ∈ C(Ω) has the mean value property on Ω, then (A.6.3) holds.

In fact, the proof of Proposition 5.1.7 works without change here. Here is our converse result.

Proposition A.6.3. If u ∈ C(Ω) has the mean value property, then u is harmonic on Ω. (In particular, u ∈ C∞(Ω).)

Proof. It suffices to show that if B ⊂ Ω is a ball, then u is harmonic on B. Translating and dilating, we can assume B = B1(0). Now take

(A.6.13) f = u|_{∂B}, v = PI f.

It suffices to show that v ≡ u on B, or equivalently that w = v − u ≡ 0 on B. Indeed, w has the mean value property on B, w ∈ C(B̄), and w = 0 on ∂B, so Lemma A.6.2 implies w ≡ 0. □
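The mean value property is easy to test numerically. In the Python sketch below (our own example, not the text's), we average the harmonic function u(x, y) = x² − y² over a circle, which agrees with the value at the center; averaging over circles of every radius and integrating in the radius recovers the solid-ball average in (A.6.2).

```python
import math

def circle_average(u, x0, y0, R, M=1000):
    """Average of u over the circle of radius R about (x0, y0), by Riemann sum."""
    total = 0.0
    for j in range(M):
        t = 2 * math.pi * j / M
        total += u(x0 + R * math.cos(t), y0 + R * math.sin(t))
    return total / M

u = lambda x, y: x * x - y * y   # a harmonic function on R^2

avg = circle_average(u, 1.3, -0.7, 2.0)
print(avg, u(1.3, -0.7))   # the two agree, illustrating the mean value property
```

For contrast, a non-harmonic function such as x² fails the identity: its average over the unit circle about the origin is R²/2 = 1/2, not the center value 0.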

The next result is known as the Schwarz reflection principle. To state it, let B = B_1(0) and set

(A.6.14) B+ = {x ∈ B : xn > 0}.

Proposition A.6.4. Assume u ∈ C(B̄⁺) is harmonic on B⁺ and

(A.6.15) u = 0 on S = B ∩ {x ∈ R^n : x_n = 0}.

Define v on B by

(A.6.16) v(x) = u(x), if x ∈ B̄⁺; −u(ρ(x)), if ρ(x) ∈ B̄⁺,

where

(A.6.17) ρ(x_1, . . . , x_n) = (x_1, . . . , x_{n−1}, −x_n).

Then v is harmonic on B.

Proof. The hypothesis (A.6.15) implies v ∈ C(B). In particular, f = v|_{∂B} belongs to C(∂B). Consider

(A.6.18) w = PI f.

We claim w ≡ v. Since x ↦ ρ(x) is an isometry, we have PI(f ◦ ρ) = w ◦ ρ, and since f ◦ ρ = −f, this gives

(A.6.19) w(ρ(x)) = −w(x).

Hence w = 0 on S. Also w = f = v on ∂B, so

(A.6.20) w = u on ∂B+.

Since u and w are harmonic on B+, we have

(A.6.21) w = u on B+, which, together with (A.6.19), yields w ≡ v, completing the proof. 

We turn to a circle of results on harmonic functions that satisfy one-sided bounds. To start, let u : B_1(0) → R be harmonic and assume u ≥ 0.

Also assume u ∈ C(B̄_1(0)), for now, so u|_{S^{n−1}} = f ≥ 0 and u is given by (A.6.5). Hence, for x ∈ B_1(0),

(A.6.22) u(x) ≥ (1 − |x|²) · min_{|y|=1} |x − y|^{−n} · Avg_{S^{n−1}} f = ((1 − |x|²)/(1 + |x|)^n) u(0),

so

(A.6.23) u(x) ≥ ((1 − |x|)/(1 + |x|)^{n−1}) u(0), ∀ x ∈ B_1(0).

If we omit the hypothesis that u ∈ C(B̄_1(0)) and apply this reasoning to u_b(x) = u(bx) and let b ↗ 1, we obtain (A.6.23) for this more general class. Going further, we can apply translations and dilations, and obtain the following result, known as Harnack’s inequality.

Proposition A.6.5. Assume u is harmonic on B_R(x_0) and ≥ 0 there. Then, for all x_1 ∈ B_R(x_0),

(A.6.24) u(x_1) ≥ ((1 − R^{−1}|x_1 − x_0|)/(1 + R^{−1}|x_1 − x_0|)^{n−1}) u(x_0).

Using Harnack’s inequality, we can establish the following stronger version of Liouville’s theorem.

Proposition A.6.6. Assume v : R^n → R is harmonic and bounded from below: v(x) ≥ −M, ∀ x ∈ R^n. Then v is constant.

Proof. The function u(x) = v(x) + M is harmonic and ≥ 0 on R^n. Given x_0, x_1 ∈ R^n, we can take R > |x_1 − x_0| and apply (A.6.24). Taking R → ∞ then gives

u(x_1) ≥ u(x_0), ∀ x_0, x_1 ∈ R^n.

Reversing roles also gives u(x_0) ≥ u(x_1), so u is constant, and so is v. □

It is useful to complement Harnack’s lower bound with an upper bound. To start, assume u ∈ C(B̄_1(0)) is harmonic and ≥ 0, and complement (A.6.22) with

(A.6.25) u(x) ≤ (1 − |x|²) · max_{|y|=1} |x − y|^{−n} · Avg_{S^{n−1}} f = ((1 − |x|²)/(1 − |x|)^n) u(0),

so

(A.6.26) u(x) ≤ ((1 + |x|)/(1 − |x|)^{n−1}) u(0), ∀ x ∈ B_1(0).

We can remove the hypothesis of continuity on B̄_1(0) by the dilation argument used above. Further translation and dilation gives the following complement to Proposition A.6.5.

Proposition A.6.7. Assume u is harmonic on B_R(x_0) and ≥ 0 there. Then, for all x_1 ∈ B_R(x_0),

(A.6.27) u(x_1) ≤ ((1 + R^{−1}|x_1 − x_0|)/(1 − R^{−1}|x_1 − x_0|)^{n−1}) u(x_0).

The following result will lead to further extensions of Liouville’s theorem.

Proposition A.6.8. For each n ≥ 2, there exist constants K_n ∈ (0, ∞) with the following property. Let u be harmonic on B_R(0) ⊂ R^n. Assume

(A.6.28) u(0) = 0, u(x) ≤ M on BR(0). Then

(A.6.29) u(x) ≥ −K_n M on B_{R/2}(0).

Proof. Apply Proposition A.6.7, with u replaced by M − u, which is ≥ 0 on B_R(0) and equal to M at 0. We see that

(A.6.30) |x_1| ≤ R/2 =⇒ M − u(x_1) ≤ (3/2) · 2^{n−1} M,

so (A.6.29) holds with K_n = (3/2) · 2^{n−1} − 1. □
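The two-sided Harnack bounds (A.6.24) and (A.6.27) can be sanity-checked numerically. In the Python sketch below (our own setup), we use the positive harmonic function u(x) = (1 − |x|²)/|x − e|² on the unit disk in R², with e = (1, 0), a multiple of the Poisson kernel, and test both inequalities on B_R(0) with R = 0.9 and x_0 = 0.

```python
import math

def u(x1, x2):
    """A positive harmonic function on the open unit disk (Poisson kernel, pole at (1,0))."""
    return (1 - x1 * x1 - x2 * x2) / ((x1 - 1) ** 2 + x2 ** 2)

def harnack_holds(R=0.9, n=2, samples=500):
    """Check (A.6.24) and (A.6.27) at a spread of points of B_R(0), with x0 = 0."""
    u0 = u(0.0, 0.0)
    for j in range(samples):
        r = R * (j + 0.5) / samples           # radii filling (0, R)
        t = 2.399963 * j                      # golden-angle sweep of directions
        x1, x2 = r * math.cos(t), r * math.sin(t)
        q = r / R                             # R^{-1} |x1 - x0|
        lower = (1 - q) / (1 + q) ** (n - 1) * u0
        upper = (1 + q) / (1 - q) ** (n - 1) * u0
        if not (lower <= u(x1, x2) <= upper):
            return False
    return True

print(harnack_holds())
```

Since u is harmonic and positive on the whole unit disk, it is in particular harmonic and ≥ 0 on B_{0.9}(0), so Propositions A.6.5 and A.6.7 apply; the check passes with room to spare, since the constants are sharp only in the limit R = 1.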

We can use Proposition A.6.8 to give a second proof of Proposition A.6.6. Indeed, if v ≥ 0 is harmonic on R^n, then u(x) = v(0) − v(x) satisfies (A.6.28), with M = v(0), for all R, so (A.6.29) implies u(x) ≥ −K_n v(0) for all x ∈ R^n. Hence u is harmonic and bounded on R^n, so the fact that u is constant follows from the version of Liouville’s theorem given in Proposition 5.1.8. An extension of this argument gives the following.

Proposition A.6.9. Assume that u is harmonic on R^n and that there exist C_0, C_1 ∈ (0, ∞) and k ∈ Z⁺ such that

(A.6.31) u(x) ≤ C_0 + C_1|x|^k, ∀ x ∈ R^n.

Then there exist C_2, C_3 ∈ (0, ∞) such that

(A.6.32) u(x) ≥ −C_2 − C_3|x|^k, ∀ x ∈ R^n.

Proof. Apply Proposition A.6.8 to u(x) − u(0), with M = C_0 + |u(0)| + C_1 R^k. □

The next result is an important extension of Liouville’s theorem.

Proposition A.6.10. Let u : R^n → R be harmonic, and assume there exist C_0, C_1, and k such that

(A.6.33) |u(x)| ≤ C_0 + C_1|x|^k, ∀ x ∈ R^n.

Then u(x) is a polynomial in x, of degree ≤ k.

Proof. Set v_R(x) = R^{−k} u(Rx). Then {v_R : R ∈ [1, ∞)} is uniformly bounded on B_1(0). By (A.6.6), we have

(A.6.34) |∂^α v_R(x)| ≤ C_α, for |x| ≤ 1/2, R ≥ 1,

or equivalently

(A.6.35) R^{−k+|α|} |∂^α u(Rx)| ≤ C_α, for |x| ≤ 1/2, R ≥ 1.

Taking R → ∞ gives

(A.6.36) ∂^α u(y) = 0, ∀ y ∈ R^n, |α| > k.

This implies u is a polynomial of degree ≤ k. □

Here is a basic application of Propositions A.6.9–A.6.10.

Proposition A.6.11. Let f : C → C be holomorphic. Assume that there is a bound

(A.6.37) |e^{f(z)}| ≤ C e^{A|z|}, ∀ z ∈ C.

Then f(z) has the form

(A.6.38) f(z) = az + b.

Proof. The bound (A.6.37) implies

(A.6.39) Re f(z) ≤ A|z| + A′.

Hence, by Proposition A.6.9,

(A.6.40) |Re f(z)| ≤ B|z| + B′.

By Proposition A.6.10, we have

(A.6.41) Re f(z) = αx + βy + γ.

Consequently the harmonic conjugate v(x, y) of u(x, y) = Re f(z) is also a first-order polynomial in x, y. This yields (A.6.38). □

Bibliography

[1] R. Abraham and J. Marsden, Foundations of Mechanics, Benjamin, Reading, Mass., 1978.
[2] L. Ahlfors, Complex Analysis, McGraw-Hill, New York, 1979.
[3] V. Arnold, Mathematical Methods of Classical Mechanics, Springer, New York, 1978.
[4] L. Baez-Duarte, Brouwer’s fixed-point theorem and a generalization of the formula for change of variables in multiple integrals, J. Math. Anal. Appl. 177 (1993), 412–414.
[5] R. Buck, Advanced Calculus, McGraw-Hill, New York, 1978.
[6] F. Dai and Y. Xu, Approximation Theory and Harmonic Analysis on Spheres and Balls, Springer, New York, 2013.
[7] A. Devinatz, Advanced Calculus, Holt, Rinehart and Winston, New York, 1968.
[8] M. do Carmo, Differential Geometry of Curves and Surfaces, Prentice-Hall, Englewood Cliffs, NJ, 1976.
[9] M. do Carmo, Riemannian Geometry, Birkhauser, Boston, 1992.
[10] J. J. Duistermaat and J. Kolk, Lie Groups, Springer-Verlag, New York, 2000.
[11] H. Edwards, Riemann’s Zeta Function, Dover, New York, 2001.
[12] H. Federer, Geometric Measure Theory, Springer, New York, 1969.
[13] H. Flanders, Differential Forms with Applications to the Physical Sciences, Academic Press, New York, 1963.
[14] W. Fleming, Functions of Several Variables, Addison-Wesley, Reading, Mass., 1965.
[15] G. Folland, Real Analysis: Modern Techniques and Applications, Wiley-Interscience, New York, 1984.
[16] P. Franklin, A Treatise on Advanced Calculus, John Wiley, New York, 1955.
[17] H. Goldstein, Classical Mechanics, Addison-Wesley, New York, 1950.
[18] E. Goursat, A Course in Mathematical Analysis, Vols. 1–3, Dover, New York, 1959. Translated from the French in 1904.
[19] M. Greenberg and J. Harper, Algebraic Topology, a First Course, Addison-Wesley, New York, 1981.


[20] J. Hale, Ordinary Differential Equations, Wiley, New York, 1969.
[21] N. Hicks, Notes on Differential Geometry, Van Nostrand, New York, 1965.
[22] E. Hille, Analytic Function Theory, Chelsea Publ., New York, 1977.
[23] M. Hirsch and S. Smale, Differential Equations, Dynamical Systems, and Linear Algebra, Academic Press, New York, 1974.
[24] E. W. Hobson, The Theory of Spherical and Ellipsoidal Harmonics, Chelsea, New York, 1965. (Reprint of 1931 Cambridge U. Press monograph)
[25] Y. Kannai, An elementary proof of the no-retraction theorem, Amer. Math. Monthly 88 (1981), 264–268.
[26] O. Kellogg, Foundations of Potential Theory, Dover, New York, 1953.
[27] S. Krantz, Function Theory of Several Complex Variables, Wiley, New York, 1982.
[28] P. Lax, Change of variables in multiple integrals, Amer. Math. Monthly 106 (1999), 497–501.
[29] S. Lefschetz, Differential Equations, Geometric Theory, J. Wiley, New York, 1957.
[30] L. Loomis and S. Sternberg, Advanced Calculus, Addison-Wesley, New York, 1968.
[31] J. Milnor, Topology from the Differentiable Viewpoint, Univ. Press of Virginia, Charlottesville VA, 1965.
[32] H. Nickerson, D. Spencer, and N. Steenrod, Advanced Calculus, Van Nostrand, Princeton, New Jersey, 1959.
[33] F. Olver, Asymptotics and Special Functions, Academic Press, New York, 1974.
[34] M. Protter and C. Morrey, A First Course in Real Analysis, 2nd ed., Springer-Verlag, New York, 1991.
[35] B. Simon, Harmonic Analysis, American Mathematical Society, Providence RI, 2015.
[36] K. Smith, Primer of Modern Analysis, Springer-Verlag, 1983.
[37] M. Spivak, Calculus on Manifolds, Benjamin, New York, 1965.
[38] M. Spivak, A Comprehensive Introduction to Differential Geometry, Vols. 1–5, Publish or Perish Press, Berkeley, CA, 1979.
[39] S. Sternberg, Lectures on Differential Geometry, Prentice Hall, New Jersey, 1964.
[40] J. J. Stoker, Differential Geometry, Wiley-Interscience, New York, 1969.
[41] M. Taylor, Partial Differential Equations, Vols. 1–3, Springer-Verlag, New York, 1996 (2nd ed., 2011).
[42] M. Taylor, Measure Theory and Integration, GSM #76, American Mathematical Society, Providence RI, 2006.
[43] M. Taylor, Differential Geometry, Lecture notes, Preprint, 2019.
[44] M. Taylor, Introduction to Analysis in One Variable, American Math. Society, Providence RI, to appear.
[45] M. Taylor, Introduction to Differential Equations, American Math. Soc., Providence, RI, 2011.
[46] M. Taylor, Introduction to Complex Analysis, GSM #202, American Math. Society, Providence RI, 2019.
[47] M. Taylor, Linear Algebra, American Math. Society, to appear.
[48] M. Taylor, Noncommutative Harmonic Analysis, Math. Surv. Monogr. #22, American Mathematical Society, Providence RI, 1986.
[49] V. S. Varadarajan, Lie Groups, Lie Algebras, and Their Representations, Prentice Hall, Englewood Cliffs NJ, 1974.

[50] E. Whittaker and G. Watson, A Course of Modern Analysis, Cambridge Univ. Press, 1927.
[51] E. B. Wilson, Advanced Calculus, Dover, New York, 1958. (Reprint of 1912 ed.)

Index

accumulation point, 441 Cauchy sequence, 22, 440 Ad, 326 Cauchy’s inequality, 21, 344, 362, 392, ad, 326 454 adjoint, 458 Cauchy-Riemann equations, 62, 77, 215, algebra of sets, 105 222 analysis, ix cell, 102 angle defect, 295 center, 90, 247 anti-symmetry, 180 central function, 432 anticommutation relation, 186 chain rule, 15, 49, 85, 88, 137, 172, 183, antipodal map, 238, 246 224 antipodal points, 151 change of variable formula, 16, 109, 112, arc length, 139 118, 123, 138, 179, 207 arcsin, 97, 226 character of a representation, 432 arctan, 98 characteristic function, 6, 104 area, 140 characteristic polynomial, 459 Arzela-Ascoli theorem, 274, 448 Christoffel symbols, 266, 269, 282 averaging over rotations, 143 Clairaut parametrization, 273, 289 classical Stokes formula, 201 ball, 440 Clifford algebra, 187 Banach space, 448 closed, 440 basis, 29, 425 closed form, 188 bi-invariant metric, 314 closed set, 22 binomial coefficient, 424 closure, 104, 441 Bolzano-Weierstrass theorem, 23 Codazzi equation, 290 bracket, 92 cofactor matrix, 42, 64 Brouwer fixed-point theorem, 237 column vector, 19, 28 commutator, 309, 406 Cantor set, 18 commutator identities, 409 Cartan’s formula for Lie derivative, 232 compact, 22, 441 Cartesian product, 251 complete metric space, 440 Cauchy integral formula, 216, 226, 426 completeness property, 22, 346, 367 Cauchy integral theorem, 216 completion, 440 Cauchy remainder formula, 56, 466, 470 complex analytic, 62


complexification, 462 distribution, 351 conformal, 228 div, 149, 189, 196, 234 conjugate points, 331 divergence, 149, 234 connected, 156, 210, 450 divergence theorem, 196, 198, 202 connection 1-form, 283 dot product, 20 content, 104 dual space, 347, 369 continuous, 4, 46, 103, 445 Duhamel’s formula, 84, 100, 327 contraction mapping theorem, 67, 80 convergence, 21 eigenfunction, 386 convergent power series, 58 eigenspace, 459 convolution, 359, 433 eigenvalue, 155, 386, 459 coordinate chart, 135, 171 eigenvector, 155, 459 coordinate syatem, 179 ellipsoid, 131 cos, 96, 157, 160, 224 embedding, 153 countable, 442 energy functional, 255, 261 covariant derivative, 267 Euclidean metric tensor, 140 covariant derivative on normal fields, Euclidean space, 19, 48 279 Euler characteristic, 246, 292 Cramer’s formula, 42, 65 Euler’s formula, 96, 161, 463 critical point, 54, 61 Euler’s other formula, 247, 297 critical point of a vector field, 89, 243 Expp, 269 cross product, 42, 139 exact form, 188, 230 curl, 188, 201 exactness criterion, 239 curvature, 257, 275 Exp, 71, 100, 142, 162, 463 curvature (2,2)-form, 288 expansion by minors, 41 curvature 2-form, 283 exponential function, 95 curvature vector, 275 exponential map, 142, 256, 269, 327 curve, 139, 157, 179 exterior derivative, 187 Darboux theorem, 5, 103, 166 exterior product, 186 definition vs. 
formula, 39 extremal problem, 54 Deg, 236, 241 degree, 236, 241, 246 fixed point, 67 degree of Gauss map, 291 flat torus, 153 dense, 389, 442 flow, 88 derivation, 88 Fourier coefficients, 339 derivative, 8, 46, 143 Fourier inversion formula, 335, 336, 339, derived representation, 407, 412 349, 352, 360, 364, 369, 373 det, 36, 100 Fourier series, 335, 339 determinant, 35, 36, 64, 181 Fourier transform, 336, 356 diffeomorphism, 66, 135, 153, 179 Fubini’s theorem, 107, 118, 133 differentiability of power series, 58 fundamental theorem of algebra, 221, differentiable, 8, 46, 143 246, 459 differential equation, 80 fundamental theorem of calculus, 8, 46, differential form, 179 190, 261, 266, 269, 466 differential operator, 88 fundamental theorem of linear algebra, dimension, 29 31, 146, 388 Dini’s theorem, 451 Funk-Hecke theorem, 402, 432 Dirichlet problem, 338, 354, 383, 394, 395, 475 Gl+(n, R), 155, 156, 210 disk, 436 Gamma function, 141, 227, 381 distance, 20, 440 Gauss angle defect formula, 295 Index 487

Gauss curvature, 248, 257, 275, 279, interior, 105, 441 281, 287, 289, 291, 301 interior product, 186, 231 Gauss linking number formula, 253 intermediate value theorem, 451 Gauss map, 245, 275 interval, 2 Gauss theorema egregium, 258, 284 inverse, 29 Gauss-Bonnet formula, 248 inverse Fourier transform, 357 Gauss-Bonnet theorem, 259, 292 inverse function theorem, 66, 135, 142, Gaussian integral, 119 144, 226 Gegenbauer polynomials, 396 invertible, 33, 41 generalized eigenspace, 333, 463 irreducible representation, 324, 405 generalized Gauss-Bonnet theorem, 300 isometric embedding, 155 generating function, 396 isomorphism, 29 geodesic curvature, 298 isoperimetric inequality, 436 geodesic equation, 256, 261, 264, 266, isothermal coordinates, 286 273, 320 iterated integral, 107 geodesic triangle, 295 geodesics, 255, 260 Jacobi field, 331 geometric series, 354 Jacobi inversion formula, 381 Gl(n, R), 48 Jacobi variational equation, 330 global diffeomorphism, 69 Jordan curve theorem, 243 grad, 188 Jordan-Brouwer separation theorem, Gramm-Schmidt construction, 455 243 Green formulas, 200 Green’s theorem, 196, 204, 215, 437 ladder operators, 410 group, 321 Lagrange remainder formula, 56, 465, 470 Haar measure, 143, 239, 403 Laplace operator, 199, 205, 206, 219, Hamiltonian system, 267 228, 385, 475 Hamiltonian vector field, 203 Lebesgue integral, 345, 366 harmonic, 206, 215, 219, 228, 354, 475 Lebesgue measure, 7 harmonic conjugate, 222 Legendre polynomials, 397, 414 harmonic function, 383 length, 260 harmonic polynomial, 387, 394 length functional, 255, 261 Harnack’s inequality, 478 Levi-Civita covariant derivative, 256, Heine-Borel theorem, 24 268 Hessian, 53 Lie algebra, 309 Hilbert space, 455 Lie algebra isomorphism, 413 holomorphic, 62, 77, 215 Lie algebra representation, 407 homotopic, 230, 245 Lie bracket, 92, 309, 406 homotopy invariance, 241, 246 Lie derivative, 91, 232 Lie group, 321 implicit function theorem, 72, 144 linear system of ODE, 83 index of a vector 
field, 244 linear transformation, 26, 46 inf, 2 linearization of an ODE, 85 injective, 28 linearly dependent, 29 inner product, 137, 174, 347, 362, 453 linearly independent, 29 inner tube, 162, 249 linking number, 252 integral equation, 80 Liouville’s theorem, 220, 226, 478 integral of an n-form, 182 Lipschitz, 107 integral test, 18 local diffeomorphism, 69 integrating factor, 189 local maximum, 54, 61 integration by parts, 16, 141 local minimum, 54, 61 488 Index

log, 95, 225 outer measure, 7, 128 lower content, 6, 104 parallel translation, 292 M(n, F), 35, 36, 100 parallel transport, 292 M(n, R), 142 parametrization by arc length, 157, 274 manifold, 152, 171 partial derivative, 11, 46 matrix, 26 partition, 2, 102 matrix exponential, 90, 99, 327, 405 partition of unity, 167, 190 matrix group, 306 permutation, 38 matrix multiplication, 28 Peter-Weyl theorem, 428 maximum, 445 PI, 395 maximum principle, 220, 385, 475 pi, 96, 97, 157 maxsize, 2, 102 Picard iteration method, 80 mean value property, 219, 384, 475 piecewise constant, 13 mean value theorem, 9, 49 PK, 111 measurable, 345 Plancherel identity, 336, 344, 353, 362, metric space, 20, 440 365, 374, 436 metric tensor, 137, 152, 174, 260 Poincar´edisk, 286 minimum, 445 Poincar´elemma, 192, 232, 239 modulus of continuity, 446 Poisson integral, 395 monotone convergence theorem, 125 Poisson integral formula, 338, 354, 385 Morse function, 169 Poisson summation formula, 380, 382 multi-index notation, 50 polar coordinates, 116, 384 multi-linear notation, 56 polar decomposition, 155 multi-linear Taylor formula, 57 polynomial, 27 multiplicativity, 40 positive definite, 54 multipole expansion, 424 power series, 217, 223, 465 prime numbers, 382 negative definite, 54 product rule, 187 neighborhood, 440 projection, 165, 318 Newton method miracle, 76 projective space, 151, 186, 423 Newton’s method, 70, 76 pull-back, 179, 181, 187 nil set, 105 Pythagorean theorem, 20 no-retraction theorem, 236, 242 non-degenerate critical point, 90, 244 quotient surface, 152 norm, 20, 46, 341, 362, 448, 454 normal field, 277 radial vector field, 189 normal transformation, 461 range, 28 null space, 28 rational numbers, 440 real analytic, 77 open, 440 real numbers, 440 open set, 22, 46 regular value, 242 orbit, 88 removable singularity theorem, 386, 475 orientable, 182, 186 representation, 165, 318, 404 orientation, 181 retraction, 236 orthogonal complement, 457 Riemann curvature, 258, 282, 321 orthogonal 
coordinates, 286 Riemann integrability criterion, 129 orthogonal projection, 263, 275, 390, Riemann integrable, 3, 103, 147 457 Riemann integral, 2, 3, 102 orthogonal transformation, 462 , 6, 103 orthogonality, 387 Riemann zeta function, 381 orthonormal basis, 137, 155, 390, 455 Riemann’s functional equation, 382 Index 489

Riemannian manifold, 260 tempered distribution, 372 Rodrigues formula, 427 torus, 339 rotation group, 404 total boundedness, 441 row operation, 110, 134 Tr, 44, 100 row vector, 19 trace, 44 transpose, 34 saddle, 90, 247 triangle inequality, 20, 341, 346, 362, saddle point, 55, 61 440, 454 Sard’s theorem, 169, 242 triangulation, 297 Schur’s lemma, 430 trigonometric function, 96, 157 Schwarz reflection principle, 477 trigonometric polynomial, 339, 474 second fundamental form, 257, 277, 280 Tychonov theorem, 448 self-adjoint, 458 sequence, 441 umbilic, 290 sin, 96, 157, 160, 224 unbounded integrable function, 119 sinh, 164 uniformly continuous, 446 sink, 90, 247 unimodular group, 316 Skew(n), 142, 162, 405, 463 unit normal, 275 SO(n), 43, 142, 162, 229, 404, 463 unitarily equivalent, 324 source, 90, 247 unitary, 458 span, 29 unitary representation, 323, 404 Spec, 459 unlinking, 254 spectral mapping theorem, 328, 332 upper content, 6, 104 sphere, 139, 140 vector, 180 spherical coordinates, 415 vector field, 87, 148, 174, 276 spherical harmonic expansion, 394 vector space, 25, 48, 453 spherical harmonics, 337, 383, 385 volume, 102, 139 spherical polar coordinates, 131, 140, volume of a ball, 131, 157 385 Stokes formula, 190, 204, 230, 242, 294, wedge product, 186 302 Weierstrass approximation theorem, 471 Stone-Weierstrass theorem, 237, 325, Weingarten formula, 257, 278, 279, 288 339, 389, 472 Weingarten map, 257, 278 SU(n), 413 Weyl orthogonality relations, 325 submersion, 144 submersion mapping theorem, 144 zonal function, 400 subsequence, 441 zonal harmonics, 401 summation convention, 197, 265 sup, 2 surface, 135 surface integral, 138 surface of revolution, 164, 289 surface with boundary, 190 surface with corners, 191 surjective, 28 Sym(n), 155 tan, 98 tangent bundle, 173 tangent space, 137, 171 tangent vector field, 148 Taylor formula with remainder, 16, 51