<<

1

-

Multivariable analysis

Roland van der Veen

Groningen, 20-1-2020 2 Contents

1 Introduction 5 1.1 Basic notions and notation ...... 5

2 How to solve equations 7 2.1 Linear algebra ...... 8 2.2 ...... 10 2.3 Elementary Riemann integration ...... 12 2.4 and Banach contraction ...... 15 2.5 Inverse and Implicit theorems ...... 17 2.6 Picard’s theorem on existence of solutions to ODE ...... 20

3 Multivariable fundamental theorem of 23 3.1 Exterior algebra ...... 23 3.2 Differential forms ...... 26 3.3 Integration ...... 27 3.4 More on cubes and their boundary ...... 28 3.5 ...... 30 3.6 The fundamental theorem of calculus (Stokes Theorem) ...... 31 3.7 Fundamental theorem of calculus: Poincar´elemma ...... 32

3 4 CONTENTS Chapter 1

Introduction

The goal of these notes is to explore the notions of differentiation and integration in arbitrarily many variables. The material is focused on answering two basic questions:

1. How to solve an equation? How many solutions can one expect?

2. Is there a higher dimensional analogue for the fundamental theorem of calculus? Can one find a primitive?

The equations we will address are systems of non-linear equations in finitely many variables and also ordinary differential equations. The approach will be mostly theoretical, schetching a framework in which one can predict how many solutions there will be without necessarily solving the equation. The key assumption is that everything we do can locally be approximated by linear functions. In other words, everything will be differentiable. One of the main results is that the linearization of the equation predicts the number of solutions and approximates them well locally. This is known as the theorem. For ordinary differential equations we will prove a similar result on the existence and uniqueness of solutions. To introduce the second question, recall what the fundamental theorem of calculus says.

Z b f 0(x)dx = f(b) − f(a) a What if f is now a function depending on two or more variables? In two and three dimensions, gives some partial answers involving div, grad, and the theorems of Gauss, Green and Stokes. How can one make sense of these and are there any more such theorems perhaps in higher dimensions? The key to understanding this question is to pass from functions to differential forms. In the example above this means passing from f(x) to the differential form f(x)dx. Taking the dx part of our integrands seriously clarifies all formulas and shows the way to a general fundamental theorem of calculus that works in any dimension, known as the (generalized) Stokes theorem: Z Z dω = ω Ω ∂Ω All the results mentioned in this paragraph are special cases of this powerful theorem. This is not calculus. We made an attempt to prove everything we say so that no black boxes have to be accepted on faith. This self-sufficiency is one of the great strengths of mathematics. The reader is asked to at least try the exercises. Doing exercises (and necessarily failing some!) is an part of mathematics.

1.1 Basic notions and notation

Most of the material is standard and can be found in references such as Calculus on manifolds by M. Spivak. n n 1 n Pn i We will mostly work in R whose elements are vectors R 3 x = (x , . . . , x ) = i=1 x ei where ei is the i-th standard basis vector whose coordinates are all 0 except a 1 at the i-th place. Throughout the text I try to write f functions as A 3 a 7−→ a2 + a + 1 ∈ B, instead of f : A → B defined by f(a) = a2 + a + 1. An important function || n p 2 2 2 is the (Euclidean) norm R 3 x 7−→ R where |(x1, x2, . . . , xn)| = x1 + x2 + . . . xn. As the name suggests it satisfies the triangle inequality |x + y| ≤ |x| + |y|. Another piece of notation is that of open and closed subsets of Rn. A set S ⊂ Rn is called open if it is the n n union of (possibly infinitely many) open balls Br(p) = {x ∈ R : |x − p| < r}. The set S ⊂ R is closed if its n n complement R − S is open. For example the closed disk Br(p) = {x ∈ R : |x − p| ≤ r} is closed.

5 6 CHAPTER 1. INTRODUCTION

n n i i For a sequence of points (pk) ∈ R and q ∈ R we write limk→∞ pk = q if for all i we have limk→∞ pk = q . i n f m Here q is the i-th coordinate of q (not i-th power!). For a function R ⊃ A −→ B ⊂ R we say limp→q f(p) = s if n for any sequence (pk) converging to q in R we have limk→∞ f(pk) = s. The function f is said to be continuous at q if limp→q f(p) = f(q) and f is continuous if it is continuous at every point. f is uniformly continuous on set U if for all  > 0 there is a δ > 0 such that for all u, v ∈ U |u − v| < δ ⇒ |f(u) − f(v)| < . A continuous function on a closed bounded U ⊂ Rn is uniformly continuous. Moreover, continuous functions Rm → Rn send closed bounded subsets to closed bounded subsets. For more on open sets and continuity we refer to the course on metric and topological spaces next block. The above results and definitions should all be familliar at least in the case n = 1. When formulated in terms of the norm all proofs are identical in the case of Rn.

Exercises Exercise 0. True or false:

1. B1(1) ∪ Bπ(5) is open. 2. [0, 1]10 is closed. 3. [0, 1) is open.

n 4. {(x1, . . . xn) ∈ R |x1 > 0} is open. 7 2 2 5. {x ∈ R : x1 + ··· + x7 = 1} is open. Chapter 2

How to solve equations

Under what conditions can a system of n real equations in k + n variables be solved? Naively one may hope that each equation can be used to determine a variable so that in the end k variables are left undetermined and all others are functions of those. For example consider the two systems of two equations on the left and on the right (k = 1, n = 2): x + y + z = 0 sin(x + y) − log(1 − z) = 0 (2.1) 1 −x + y + z = 0 ey − = 0 (2.2) 1 + x − z The system on the left is linear and easy to solve, we get x = 0 and y = −z. The system on the right is hard

Figure 2.1: Solutions to the two systems. The yellow surface is the solution to the first equation, blue the second. The positive x, y, z axes are drawn in red, green, blue respectively. to solve explicitly but looks very similar near (0, 0, 0) since sin(x + y) ≈ x + y and log(1 − z) ≈ −z near zero. We will be able to show that just like in the linear situation, a curve of solutions passes through the origin. The key point is that the derivative of the complicated looking functions at the origin is precisely the shown on the left. We will look at equations involving only differentiable functions. This means that locally they can be approximated well by linear functions. The goal of the chapter is to prove the implicit function theorem. Basically it says that the linear approximation decides whether or not a system of equations is solvable locally and if so how many solutions it has. This is illustrated in the figures above.

Exercises Exercise 0. The non-linear equation sin(x − y) = 0 has a solution (x, y) = (0, 0). Find a linear equation that approximates the non-linear equation well near (0, 0).

Exercise 1. (Linear case) Is it always true that a system of two linear equations in three unknowns has a line of solutions? Prove or

7 8 CHAPTER 2. HOW TO SOLVE EQUATIONS provide counter example.

Exercise 2. (Non-linear equation) Solve z and y as a function of x subject to the equations.

x2 + y2 + z2 = 1 x + y2 + z3 = 0

Exercise 3. (Linear equation) Write the system below as a matrix equation Ax = b for matrix A and vectors x, b.

x + 3y + 2z = 1 x + y + z = 0 −x + y + 4z = 0

Exercise 4. (Three holes) Give a single equation in three unknowns such that the solution set is a bounded subset of R3, looks smooth and two-dimensional everywhere and has a hole. Harder: Can you increase the number of holes to three?

2.1 Linear algebra

The basis for our investigation of equations is the linear case. Linear equations can neatly be summarized in terms of a single matrix equation Av = b. Here v is a vector in Rk+n, and b ∈ Rn and A is an n × (k + n) matrix. In case b = 0 we call the equation homogeneous and the solution set is some linear subspace ker A = {v ∈ Rk+n|Av = 0}, the kernel of the map defined by A. In general, given a single solution p ∈ Rk+n such that Ap = b the entire solution set {v ∈ Rk+n|Av = b} is the affine linear subspace (ker A) + p = {s + p|s ∈ ker A}. In discussing the qualitative properties of linear equations it is more convenient to think in terms of linear maps. Most of this material should be familiar from linear algebra courses but we give a few pointers here to establish notation and emphasize the important points. With some irony, the first and second rules of linear algebra are:

1. YOU DO NOT PICK A BASIS

2. IF YOU PICK A BASIS BE READY TO CHANGE IT

In this section W, V will always be real vector spaces of finite dimensions m and n.A basis for V is an ordered set of linearly independent vectors b1, . . . , bn that span the whole space. The dimension dim V is the number of basis elements one needs and does not depend on the basis chosen. The standard basis of Rn is n denoted by e1, . . . en. Once we choose a basis, b1, . . . bn for V we get a linear bijection between V and R P i i sending bi to ei. Vectors v ∈ V are written v = i v bi for some coefficients v ∈ R. Notice we use upper indices (not powers!) for the coefficients vi of a vector v ∈ V . While powerful and concrete choosing a basis is dangerous because it tends to destroy a lot of symmetry and structure, one basis may be natural for some purpose while another basis may be more approriate for doing another task. The set of all linear maps from V to W is denoted L(V,W ). If we would set V = Rn and W = Rm then i P i ϕ ∈ L(V,W ) could be described by a matrix (ϕj) defined by ϕej = i ϕjei. In our notation, upper indices indicate rows and lower indices are columns and the columns of the matrix are the images of the basis vectors. As we said, the matrix of ϕ might look easier with respect to another basis so we prefer to keep the V,W i abstract and express the linear map ϕ with respect to bases b1, . . . bm of W and c1, . . . cn of V as (ϕj) given by P i ϕcj = i ϕjbi. The dual space V ∗ = L(V, R) becomes a vector space in its own right when we define addition and scalar ∗ multiplication pointwise: (af + g)(v) = af(v) + g(v) for any f, g ∈ V and v ∈ V and a ∈ R. A basis b1, . . . bn ( 1 n ∗ i i i 1 if i = j of V yields a dual basis b , . . . b of V called the dual basis by requiring b (bj) = δ . Here δ = j j 0 if i 6= j is the Kronecker delta. Elements f ∈ V ∗ are thought of as row vectors and expressed in the dual basis as P i f = i fib . A useful feature of the dual space is the pull-back (also known as transpose). Given f ∈ L(V,W ) the pull-back is f ∗ ∈ L(W ∗,V ∗) defined by f ∗ϕ = ϕ ◦ f. n m n n Finally the (Euclidean) norm of ϕ ∈ L(R , R ) is defined as |ϕ| = maxv∈Sn |ϕ(v)|, where S = {v ∈ R : |v| = 1}. It satisfies |φ(v)| ≤ |φ||v|. 2.1. LINEAR ALGEBRA 9

Exercises

Exercise 0. b Why is our definition of basis as a linear isomorphism Rn −→ V consistent with the usual notion of a basis as an (ordered) set of vectors b1, b2 . . . bn ∈ V that is both linearly independent and spans V ? What is the linear n n n map in L(R , R ) that describes the standard basis e1, e2 . . . en of R ?

Exercise 1. 2 3 Set V = R and W = R and define a linear map ϕ ∈ L(V,W ) by ϕe1 = ϕe2 = e3. What is the matrix of ϕ 1 with respect to the standard bases of V and W ? What is the matrix of ϕ with respect to the bases b1 = 0 1 and b2 = 1 of V and c1 = e1 + e3 and c2 = e3 + e3 and c3 = e3 of W ?

Exercise 2. The vector space Λ2R3 is the vector space spanned by things of the form v ∧ w where ∧ satisfies the following 2 3 rules: v ∧w = −w ∧v and (αu+βv)∧w = αu∧w +βv ∧w. A basis for Λ R is given by the vectors f1 = e1 ∧e2, P i P i f2 = e2 ∧ e3, f3 = e3 ∧ e1. Show that if v × w = i c ei then v ∧ w = i c fi.

Exercise 3. The set Pn of polynomials of degree ≤ n in one variable x with real coefficients is a vector space with respect d dx to the usual addition and multiplication by scalars. Give a basis of P3 and express the linear map P3 −−→ P3 in your basis.

Exercise 4. Find a linear map from V to V ∗∗ without choosing a basis.

Exercise 5. Show that (g ◦ f)∗ = f ∗ ◦ g∗.

Exercise 6. How many solutions does a system of two equations in 4 unknowns generally have? What is the dimension of the solution space?

Exercise 7. Prove that any linear map ϕ ∈ L(Rn, Rm) is continuous. Also show that the image of ϕSn is closed and bounded. (Hint: use a property of continuous functions mentioned in the introduction).

Exercise 8. 1 n ∗ Let b1, b2, . . . , bn be a basis for V . Check that the dual basis b , . . . , b is a basis for V . Show that dim V = dim V ∗

Exercise 9. The set L(V,W ) is a vector space when we define addition and scalar multiplication pointwise: (af + g)(v) = af(v) + g(v). Find a basis for L(R2, R3) and compute dim L(R2, R3). Same questions for L(V,W ) where V,W are arbitrary vector spaces.

Exercise 10. In this exercise we identify the complex numbers C with R2 via x + iy ↔ (x, y). Show that the map C 3 z 7→ (a + bi)z corresponds to an element of L(R2, R2) whose matrix with respect to the standard bases is  a −b  . b a Exercise 11. 3 2 Consider the linear map L ∈ L(R , R ) defined by Lu = e1 and Lv = e2 and Lw = 0. 3 Suppose u = e1 + 2e2 + 3e3 and v = e1 − e2 − 4e3 and w = e1 − e2 − e3. Do u, v, w form a basis of R ? Write down the matrix of L with respect to u, v, w and the standard basis in R2. Also write down the matrix of L with respect to the standard bases on both sides.

Exercise 12. 1 n ∗ Pn i Prove if e1, . . . en is a basis of V with dual basis e , . . . e then f ∈ V satisfies f = i=1 f(ei)e . 10 CHAPTER 2. HOW TO SOLVE EQUATIONS

2.2 Derivative

f Now that we understand linear functions, we would like to use this to study more general functions Rm ⊃ P −→ Rn, where unless stated otherwise P is always a non-empty open subset of Rm. The key idea is to locally approximate non-linear objects by linear ones. In this case at every point p ∈ P we are looking for the linear map f 0(p) ∈ L(Rm, Rn) best approximating f close to p. This is just the first order Taylor approximation to f at p. f,g Since we are approximating, some specialized notation is useful. For functions Rm −−→ Rn we define f = o(g) |f(h)| 2 to mean limh→0 |g(h)| = 0, intuitively f goes to zero faster than g as h goes to 0. For example h = o(h). We often use the triangle inequality to show that f = o(h) and g = o(h) implies f + g = o(h) (Exercise!). Although our notation may be a little unfamiliar, the picture is just like in one variable, see figure 2.2.

Figure 2.2: The derivative D at p is the linear map that best approximates to f at point p.

Definition 2.2.1. (Differentiability) f A map Rm ⊃ P −→ Rn is called differentiable at p ∈ P if there exists a linear map D ∈ L(Rm, Rn) such that f(p + h) − f(p) − Dh = o(h). When f is differentiable for all p ∈ P we say f is differentiable.

To see the relation with the Taylor expansion we set f,D,p(h) = f(p + h) − f(p) − Dh and write

f(p + h) = f(p) + Dh + f,D,p(h) (2.3)

The function  represents the error in the first order Taylor approximation and differentiability means that the |f,D,p(h)| error is o(h) as h goes to 0, so limh→0 |h| = 0. f We start with a 1-dimensional example: R 3 x 7−→ x3 ∈ R and p = 1. We know we should have f 0(p) = 3p2 = 3 but notice our notion of derivarive should be a linear map, not a number. Take D ∈ L(R, R) to be 3 2 3 multiplication by 3, so De1 = 3e1. This works because f(p+h)−f(p)−Dh = (1+h) −1−3h = 3h +h = o(h). f For a higher dimensional example take R2 3 (x, y) 7−→ x2 − y2 ∈ R and p = (0, 1). In this case we may take D ∈ L(R2, R) to be given by the matrix (0, −2) with respect to the standard bases. To see that this works we 0 set h = (k, `) and show that the error f,D,p(h) = f(p + h) − f(p) − f (p)(h) goes to zero faster than h does:

2 2 2 2 f,D,p(h) = f(k, ` + 1) − f(0, 1) + 2` = k − (` + 1) + 1 + 2` = k − `

| (h)| |k2−`2| |k2+`2| So as promised f,D,p = √ ≤ √ = |h|. Taking the h → 0 shows D satisfies equation (2.3). |h| | k2+`2| k2+`2 Provided it exists, the linear approximation D above is actually unique. It therefore deserves a special name, the derivative of f at p, notation: f 0(p).

Definition 2.2.2. (Derivative) If f is differentiable at p then the derivative of f at p called f 0(p) ∈ L(Rm, Rn) is the unique linear map satisfying (2.3).

Proof. (Of uniqueness). Suppose we have another A ∈ L(Rm, Rn) also satisfies (2.3). Subtracting these two m equations gives (D − A)h = f,A,p(h) − f,D,p(h) = o(h). Therefore for any non-zero vector w ∈ R we have w 1 |(D−A) n | |(D−A)h| |w| |(D−A)w| = limn→∞ |w| = limh→0 |h| = 0 so that Dw = Aw. Since w was arbitrary A = D. n 2.2. DERIVATIVE 11

f For functions R −→ R our definition of derivative f 0(p) is just a complicated reformulation of the usual definition. Actually the matrix of the derivative with respect to the standard bases is just the matrix of partial ∂f ∂f . In the above example the linear map D is just ( ∂x (p), ∂y (p)) = (0, −2). This and much more will follow from the next theorem.

f Theorem 2.2.1. (Properties of derivative) Imagine a function Rk ⊃ Q −→ P ⊂ R`. g 1. (Chain-rule). If f is differentiable at q ∈ Q and R` ⊃ P −→ Rm is differentiable at f(q) ∈ P we have (g ◦ f)0(q) = g0(f(q))f 0(q).

2. If f is constant then f is differentiable and f 0(q) = 0 for all q ∈ Q.

3. If f ∈ L(Rk, R`) then f 0(q) = f for all q ∈ Rk. 1 2 ` P i 4. The function f = (f , f , . . . , f ) = i f ei is differentiable at q if and only if the component functions i f 0 1 0 ` 0 P i 0 P −→ R are. If so, f (q)(v) = ((f ) (q)(v),..., (f ) (q)(v)) = i(f ) (q)(v)ei.

× 5. The product R2 3 (x, y) 7−→ xy ∈ R is a differentiable function with ×0(x, y)(k, `) = yk + x`. Proof. Part 1 (). Set p = f(q). For the chain rule it suffices to show that the linear map g0(p)f 0(q) ∈ ` m 0 0 L(R , R ) satisfies equation (2.3). We know that f(q + h) = p + f (q)h + f,q(h) and g(p + k) = g(p) + g (p)k + g,p(k). Combining those we can approximate (g ◦ f)(q + h) =

0 0 0 0 g(p + f (q)h + f,q(h)) = g(p) + g (p)k + g,p(k) = g(p) + g (p)f (q)h + (g◦f),q(h)

0 0 where we set k = f (q)h + f,q(h) and (g◦f),q(h) = g (p)f,q(h) + g,p(k) = A + B. Now we need to show that (g◦f),q(h) = o(h) as h → 0. In fact A = o(h) and B = o(h). For A it follows from the differentiability of f and continuity of the linear map g0(p). For B we use differentiability of g to see that for any α > 0 we have 1 1 0 |g,p(k)| < αk whenever k is suitably small. So |h| |g,p(k(h))| < α |h| (f (q)h + f,q(h)) < Cα for some constant C, showing that B(h) = o(h). Here we used differentiability of f once more. Part 2 follows directly from the definition and uniqueness of the derivative because by assumption: f(p+h) = f(p) + 0h + 0 and 0 = o(h). Part 3. If we set f 0(p)h = f(h) then by linearity of f we get f(p + h) − f(p) − f 0(p)h = 0 = o(h). Part 4. Suppose f is differentiable at p then by the chain rule (part 1) so is f i = πi ◦ f where πi is projection onto the i-coordinate (a linear map, using part 3). Conversely suppose all the functions f i are differentiable i i i 0 1 at p then f (p + h) = f (p) + (f ) (p)h + f i,p(h) with f i,p(h) = o(h). Then limh→0 |h| |f(p + h) − f(p) − 1 0 ` 0 1 ((f ) (p)h, . . . , (f ) (p)h)| = limh→0 |h| |(f 1,p(h), . . . , f `,p(h))| = 0. Part 5. Setting h = (k+`) we may write ×(x+k, y+`)−×(x, y) = ky+x`+k`. We are done since k` = o(h).

In some sense this is all we need to know about differentiation. Using the chain rule and our knowledge of the one variable derivatives from calculus we are able to differentiate many complicated looking multivariate F functions step by step. For example: the function R2 ∈ (x, y) 7−→ (cos(xy), x3 + e−y) ∈ R2 can be differentiated as follows. By part 4 we can do the components F 1,F 2 separately so let us focus on computing (F 1)0(a, b) for some point (a, b) ∈ R2. Notice that F 1 = cos ◦× so (F 1)0(a, b) = cos0(ab)(×0(a, b)). In other words, 1 0 1 0 ∂F 1 (F ) (a, b)(v, w) = − sin(ab)(bv + aw) and setting (v, w) = e1 we get the more usual (F ) (a, b)e1 = ∂x (a, b) = − sin(ab)b. A slightly different way of thinking about derivatives is in terms of directional derivatives.

Definition 2.2.3. () F Given Rm ⊃ P −→ Rn we define the directional derivative of F at p ∈ P in direction w ∈ Rm as F (p + tw) − F (p) ∂wF (p) = lim t→0 t

In case w = ej the directional derivative is known as the j-th . F is called C1 if all partial derivatives exist and are continuous. Furthermore F is C2 if all its partial derivatives are C1 functions in their own right.

0 m Assuming F (p) exists and setting ιw : R → R given by ιw(t) = p + wt connects our two notions of 0 0 derivatives. From the chain rule and parts 2,3 of Theorem 2.2.1 we see ∂wF (p) = (F ◦ ιw) (0) = F (p)w. In 0 particular, this means that when it exists, the matrix of F (p) with respect to the standard bases ei is just the 0 i i matrix of partial derivatives: F (p)j = (∂jF )(p). In section 2.4 we will see how directional derivatives even shed light on the existence of F 0, see lemma 2.4.1. 12 CHAPTER 2. HOW TO SOLVE EQUATIONS

Of course one may go on and define even higher partial derivatives but we will not consider them in this course. Occasionally it is useful to also define differentiable and C1-functions on sets that are not necessarily open. f In that case we say Rm ⊃ P −→ Rn is differentiable if there exists an open subset Q ⊃ P and a differentiable g function Q −→ Rn such that f(p) = g(p) for all p ∈ P .

Exercises Exercise 0. Show that if h = (k, `) then k` = o(h) as h goes to 0. Hint: (k − `)2 ≥ 0.

Exercise 1. f x Compute the derivative f(1, 1)0 where R2 3 (x, y) 7−→ (log(xy), ee + y, 1) ∈ R3 using only the properties 1, 2, 3, 4, 5 mentioned in Theorem 2.2.1.

Exercise 2. f Compute the derivative f(1, 1)0 where R2 3 (x, y) 7−→ ((x + y)(2x + y), sin(2x + xy) + y, 1, y − x) ∈ R4 using only the properties 1, 2, 3, 4, 5 mentioned in Theorem 2.2.1.

Exercise 3. Show that if f = o(h) and g = o(h) then f + g = o(h) as h → 0.

Exercise 4. f Identify the complex numbers C with R2 via x + iy ↔ (x, y). A complex function C −→ C corresponds to a F function R2 −→ R2 via F (x, y) = (Ref(x + iy), Imf(x + iy)). Show that if f is complex differentiable then F is G C1 differentiable. Construct a C1 differentiable function R2 −→ R2 that does NOT correspond to a complex dif- ferentiable function, in the above way. Hint, take a good look at the linear map F 0(x, y), what is its determinant?

Exercise 5. n f n n Suppose R −→ R is a differentiable function with differentiable inverse g, so f ◦ g = g ◦ f = idR . Express g0(y) in terms of f 0(x) where y = f(x).

Exercise 6. q a Show that the function quotient q given by R × (0, ∞) 3 (a, b) 7−→ b ∈ R is differentiable and find its derivative. In finding the derivarive it suffices to provide a matrix with respect to a basis of your choice.

Exercise 7. f m f,g Formulate and prove the quotient rule for the derivative of the quotient g where R −−→ R are differentiable.

2.3 Elementary Riemann integration

Since we only plan to work with functions that are continuous in this course we choose to set up a very naive version of integration. We assume the domain of integration is a rectangle and just subdivide evenly. While limited this framework is complete and shows many arguments in their simplest form. Readers familiar with more advanced integration theories are welcome to substitute their preferred notion of integral. Definition 2.3.1. (, light) f Qk i i k Qk i i For a R −→ R on a rectangular box R = i=1[a , b ] ⊂ R we define vol(R) = i=1(b − a ) and define the integral as

Z k i i vol(R) X X i b − a f = lim IR,n(f) where IR,n = nk f(a + j n ei) R n→∞ 2 2 j∈{0,1,...,2n−1}k i=1

Be careful that the upper indices denote the coordinates of the vectors a,b and j. As an elementary example n −n P2 −1 j −2n n n 1 −n−1 take f(x) = x and R = [0, 1] then we get IR,n(f) = 2 j=0 2n = 2 2 (2 − 1)/2 = 2 − 2 so as R 1 expected [0,1] x = 2 . It is customary to write ’dx’ after the integrand. We will not do this because later in the chapter we will use ’dx’ in the sense of differential forms. When it is necessary to indicate the variable that is R 2 2 integrated we write for example y∈[0,1] x y to indicate that we are supposed to integrate the function y 7→ x y. 2.3. ELEMENTARY RIEMANN INTEGRATION 13

f Figure 2.3: Approximating the integral of a function R −→ R on rectangle R = [a, b] by IR,n, the sum of the function values on the binary subdivision, shown in dots.

Some elementary properties are given in the next lemma. As said before, all functions in this section are assumed to be continuous. Lemma 2.3.1. (Properties of R ) R f 1. The limit defining R f exists for any continuous R −→ R. R R R 2. For α ∈ R and function g on R we have R(f + αg) = R f + α R g. R 3. R f ∈ [vol(R) minR f, vol(R) maxR f]

f,g k R R 4. Given continuous functions U −−→ R defined on open subset U ⊂ R . If R f = R g for all rectangles R ⊂ U then f = g.

Proof. For part 1 we aim to show that (IR,n(f)) is a Cauchy sequence so choose  > 0. Notice that for m < n we have k vol(R) X X  ji  bi − ai I (f) = f(a + e ) R,m 2nk 2n−m 2m i j∈{0,1,...,2n−1}k i=1 So k k vol(R) X X ji bi − ai X  ji  bi − ai |I (f) − I (f)| = |f(a + e ) − f(a + e )| R,n R,m 2nk 2n−m 2m i 2n−m 2m i j∈{0,1,...,2n−1}k i=1 i=1 In view of continuity of f we want to know how close these points are:

n k k X ji bi − ai X  ji  bi − ai X |bi − ai| | e − e | ≤ 2n−m 2m i 2n−m 2m i 2m i=1 i=1 i=1 Moreover, R is closed and bounded so f is uniformly continuous on R. This means there exists δ > 0 such  that for all p ∈ R and all |p − q| < δ we have |f(p) − f(q)| < vol(R) . Now for n > m large enough such that Pk |bi−ai| i=1 2m < δ we find our Cauchy estimate: vol(R) X  |I (f) − I (f)| ≤ <  R,n R,m 2nk vol(R) j∈{0,1,...,2n−1}k

Part 2 follows from IR,n(f + αg) = IR,n(f) + αIR,n(g). For part 3 we note that any continuous function attains its max and min on the closed and bounded set R. For any n we have IR,n(f) ≤ (maxR f)IR,n(1) and similarly for the minimum. Part 4 follows from part 2: Set 1 k 1 R Rn = p + [0, ] . Then f ∈ [minR f, maxR f]. Since f is uniformly continuous on R, for any  n vol(Rn) Rn n n there is an n such that |f(q) − f(p)| <  for all q ∈ Rn, finishing the proof. The Fubini theorem about computing an integral by first integrating out a couple of variables is a simple matter in this framework. Lemma 2.3.2. (Fubini) For a continuous function f defined on a rectangle R × S ⊂ Rk × R` we have Z Z Z f = F where F (p) = f(p, ·) for p ∈ R R×S R S 14 CHAPTER 2. HOW TO SOLVE EQUATIONS

Qk Q` Proof. Assuming R = i=1[ai, bi] and S = i=1[ci, di] we have F (p) = limm→∞ IS,mf(p, ·) defines a continuous function F on R (Exercise!).

n i i vol(R) X X i b − a IR,n(F ) = lim IS,m(f(a + j ei, ·)) = m→∞ 2nk 2n j∈{0,1,...,2n−1}k i=1

n i i n i i vol(R × S) X X X i b − a X i d − c lim f(a + j ei, c + h ei) m→∞ 2n(k+`) 2n 2m j∈{0,1,...,2n−1}k h∈{0,1,...,2m−1}` i=1 i=1

If we denote the last formula as limm→∞ vm,n then notice that vn,n = IR×S,n(f). It follows that Z Z F = lim vm,n = lim vn,n = f R n,m→∞ n→∞ R×S

In one dimension the fundamental theorem of calculus is the following. One of the main aims of this course x=b is to find a multivariable analogue. In computations the following notation f(b) − f(a) = f(x)|x=a is often useful. Lemma 2.3.3. (Fundamental theorem of calculus) Suppose f is C1 on [a, b]. Then Z 0 b f = f(b) − f(a) = f|a [a,b] R 0 The function F (x) = [a,a+x] f then is differentiable and F (x) = f(x). Proof.

2n−1 2n−1 X b − a b − a X 0 b − a b − a b − a 0 f(b)−f(a) = f(a+(j+1) )−f(a+j ) = f (a+j ) +f,a+j b−a ( ) = Ib−a,n(f )+E 2n 2n 2n 2n 2n 2n j=0 j=0

P2n−1 b−a where E =  b−a ( n ) converges to 0 since (h) = o(h). j=0 f,a+j 2n 2 For the second equality use part 2 of Lemma 2.3.1 to get Z F (x + h) − F (x) = f ∈ [h min f(x + t), h max f(x + t)] [x,x+h] t∈[0,h] t∈[0,h]

Continuity of f means that limh→0 mint∈[0,h] f(x + t) = f(x) and the same for the maximum. Dividing by h and taking the limit on both sides finishes the proof. Taken together Fubini’s theorem and the fundamental theorem of calculus allow us to integrate many R f functions on rectangles. For example let us compute R f where R = [0, 1] × [2, 3] × [−1, 1] and R 3 (x, y, z) 7−→ 2 R z3 z=1 2 R R xy+z ∈ R. First set F (x, y) = [−1,1] f(x, y, ·) = xyz+ 3 |z=−1 = 2xy+ 3 then Fubini says R f = [0,1]×[2,3] F . R 2 2 y=3 2 R R 5x2 2 1 5 2 Again define G(x) = [2,3] F (x, ·) = (xy + 3 y)|y=2 = 5x+ 3 . So finally R f = [0,1] G = ( 2 + 3 x)|0 = 2 + 3 = 19 6 . Fubini’s theorem allows us to give a soft proof of the fact that mixed partial derivatives commute. This result will be very important later in discussing the exterior derivative. Recall that the partial derivative is 0 ∂if(p) = f (p)ei. Lemma 2.3.4. (Mixed partial derivatives commute) 2 For any C function f we have ∂i∂jf = ∂j∂if.

2 Proof. It suffices to prove the case of a function f defined on an open subset of R . This is because ∂i∂jf(p) = ˜ ˜ R R ∂1∂2f(0, 0) with fp(x, y) = f(p + xei + yej). We will show that I = [a,b]×[c,d] ∂1∂2f = [a,b]×[c,d] ∂2∂1f = J. Part 3 of Lemma 2.3.1 then implies ∂2∂1f = ∂1∂2f. R R 0 Using Fubini, I = [a,b] F where F (p) = [c,d] g and g(q) = ∂1f(p, q). By the fundamental theorem of R R R 0 calculus I = [a,b] g(d) − g(c) = [a,b] ∂1f(·, d) − ∂1f(·, c) = [a,b] h with h(p) = f(p, d) − f(p, c). So we conclude that I = h(b) − h(a) = f(b, d) − f(b, c) − f(a, d) + f(a, c). Splitting the integral in the other order and doing the same steps shows that J gives the same answer. Yet another application of Fubini is to prove that one can differentiate under the integral sign: 2.4. MEAN VALUE THEOREM AND BANACH CONTRACTION 15

Lemma 2.3.5. (Differentiation under the integral sign) n 1 R R For any rectangle R ⊂ R and a < b ∈ R and C function f defined on [a, b] × R we have ∂1 R f = R ∂1f, where ∂1 denotes the first partial derivative. Proof. By part 3 of the properties of integration lemma (Lemma 2.3.1), it suffices to prove that for all [c, d] R R R R we have [c,d] ∂1 R f = [c,d] R ∂1f. Using the fundamental theorem of calculus the left hand side is equal to R R R R R f(d, ·) − f(c, ·). Fubini says the right hand side is R [a,b] ∂1f = R f(d, ·) − f(c, ·) finishing the proof.

Exercises Exercise 0. f R Set R = [1, 2] × [3, 5] and R 3 (x, y) 7−→ 2xy ∈ R. Compute the integral R f directly from the definition given in the text.

Exercise 1. ϕ Prove the theorem for a C1 function [a, b] −→ R with ϕ(a) < ϕ(b) by applying the fundamental f theorem of calculus. So given a continuous function [ϕ(a), ϕ(b)] −→ R and ∀x ∈ [a, b]: ϕ0(x) ≥ 0 show that: Z Z (f ◦ ϕ)ϕ0 = f [a,b] [ϕ(a),ϕ(b)]

Exercise 2. R f 2 2 Compute the integral R f for R = [0, 2] × [0, 3] and R 3 (x, y) 7−→ x + y using Fubini’s theorem and the fundamental theorem of Calculus.

Exercise 3. R f 3 3 Compute the integral R f for R = [0, 2] × [0, 3] × [−2, 0] and R 3 (x, y, z) 7−→ x + y + cos(z) using Fubini’s theorem and the fundamental theorem of Calculus.

Exercise 4. n 1 2 n f 1 2 n i Compute the integral infR f for R = [0, 1] and R 3 (x , x , . . . x ) 7−→ x x . . . x (here x means the i-th coordinate of x) using Fubini’s theorem and the fundamental theorem of Calculus.

Exercise 5. Prove that the F from the statement of Fubini’s theorem is continuous.

2.4 Mean value theorem and Banach contraction

In this section we prepare a few results necessary for proving the inverse and implicit function theorems of next section. Recall the mean value theorem that says that if f is a differentiable function on [a, b] then there exists c ∈ (a, b) such that f 0(c)(b − a) = f(b) − f(a). This allows us to show differentiability of a function can be checked by looking at the partial derivatives.

Lemma 2.4.1. (C1 implies differentiable) f Suppose Rm ⊃ P −→ Rn is a C1 function at p ∈ P , then f 0(p) exists and is determined by the partial derivatives: 0 f (p)ei = ∂if(p), defined as in Definition 2.2.3.

P i Proof. According to Theorem 2.2.1 it suffices to treat the case n = 1. Writing h = i h ei and using the i i 1-dimensional mean value theorem to the function t 7→ f(p + tei) there is a ci ∈ (0, h ) such that h ∂if(ci) = i Pm P j P j f(q + h ei) − f(q) for any p ∈ P . We compute f(p + h) − f(p) = i=1 f(p + j≤i h ej) − f(p + j

Proof. Since |F 0(x)| is continuous and D closed and bounded it attains a maximum M. Fix a unit vector u ∈ Rm. g The function [0, 1] 3 t 7−→ u·F (x+th) ∈ R is differentiable with g0(t) = u·F 0(x+th)h. The mean value theorem 16 CHAPTER 2. HOW TO SOLVE EQUATIONS tells us there exists a c ∈ (0, 1) such that g(1) − g(0) = g0(c) so u · (F (x + h) − F (x)) = u · F 0(x + ch)h ≤ M|h|. Since this is true for any unit vector u we must have |F (x + h) − F (x)| ≤ M|h|. Finally to come up with solutions to equations we often use the following lemma.

Lemma 2.4.3. (Banach contraction lemma)

Φ 1. Suppose C ⊂ Rn is a non-empty closed bounded subset and α ∈ [0, 1). If C −→ C is continuous and for all x 6= y ∈ C we have |Φ(x) − Φ(y)| < α|x − y| then exists a unique fixed point p ∈ C with Φ(p) = p.

2. Suppose C is the set of continuous functions C = {γ :[−τ, τ] → D}, where D ⊂ Rn is a closed disk and α ∈ [0, 1). If Φ: C → C is such that sup|t|≤τ |Φ(γ(t)) − Φ(δ(t))| < α sup|t|≤τ |γ(t) − δ(t)| then again Φ(π) = π for a unique π ∈ C. Maps that satisfy the condition of the lemma are known as contractions. Each time we apply a contraction, the distance between points is multiplied by α < 1. The proof of this important lemma and its generalizations will be treated in the class on metric and topological spaces.

Exercises Exercise 0. (Multi-dimensional mean value theorem) F Prove the following generalization of the mean value theorem to multiple variables. Suppose Q −→ R is a differentiable map defined on an open subset Q ⊂ Rn. If Q contains the line segment between two points a, b then there exists a point c on this segment such that: F (b)−F (a) = F 0(c)(b−a). Hint: use the one-dimensional mean value theorem on F ◦ γ for γ a suitable curve.

Exercise 1. (Mean failure) F Why is there no version of the mean value theorem for R2 −→ R2? Give an example of a C2 function F R2 −→ R2 and a 6= b ∈ R2 such that there is no c on the line segment between a, b with the property that F (b) − F (a) = F 0(c)(b − a).

Exercise 2. (Constant?) F Suppose Rm ⊃ R −→ Rn is a C1 function defined on a rectangle R that satisfies F 0(p) = 0 for all p ∈ R. Is it true that F must be constant?

Exercise 3. (Contractions) f We say D −→ D is a contraction if for all x, y ∈ D we have |f(x) − f(y)| < α|x − y| for α ∈ [0, 1) as in the Banach lemma. Taking D = [0, 1], which of the following functions is a contraction? f(x) = x2, f(x) = x, x π 1−x f(x) = 2 , f(x) = sin( 2 x), f(x) = 1+x .

Exercise 4. (Contractions 2) Prove that if p is a fixed point of the function C −→Φ C satisfying the hypotheses of the Banach lemma, then for any x ∈ C we must have the sequence x, Φ(x), (Φ ◦ Φ)(x), (Φ ◦ Φ ◦ Φ)(x),... converging to p. 2.5. INVERSE AND IMPLICIT FUNCTION THEOREMS 17

2.5 Inverse and Implicit function theorems

In this section we provide some answers to the first of the two main questions we posed at the beginning of these notes: How many solutions does a system of equations have? Intuitively a system of n equations in n unknowns should have only one or at most finitely many solutions. At least for linear equations Ax = y this is true provided that the linear map A is invertible because the unique solution is then x = A−1y. Invertibility of a linear map is easy to check, it is equivalent to det A 6= 0. Gaussian elimination will determine the determinant efficiently by bringing a matrix for A into upper triangular form and multiplying the diagonal entries. What about a system f(x) = y of n equations given by differentiable functions of n unknowns? We investigate the situation near a solution f(x0) = y0. The inverse function theorem says we can find a (local) inverse of f 0 −1 provided f (x0) is invertible. Applying the inverse gives the unique solution in the form x = f (y) just like for the matrices. 0 Intuitively what is going on here is that we approximate f near x0 as f(x0 + h) ≈ f(x0) + f (x0)h. If we 0 −1 set x = x0 + h we can solve for x in f(x) = y to get x ≈ x0 + f (x0) (y − y0). f For example take R 3 x 7−→ x2 ∈ R. Close to any point p 6= 0 we have f 0(p) 6= 0 so the theorem says there is a unique solution x to x2 = y with the property that x is close to p.

0 Figure 2.4: The inverse function says that if f (x0) is invertible, so is f close to x0. Close means there g exists X × Y 3 (x0, y0) (green) on which the graph of f (blue) coincides with the graph of Y −→ X shown in red.

Theorem 2.5.1. (Inverse function theorem) 1 n f n 0 Imagine a C function between on an open set R ⊃ U −→ R with f(x0) = y0. If f (x0) is invertible, then n 1 g there are open sets x0 ∈ X ⊂ U and y0 ∈ Y ⊂ R and a C function Y −→ X such that f ◦ g = idY and 0 0 −1 g ◦ f = idX . Also g (y) = f (g(y)) .

0 n Proof. Without loss of generality we may assume that x0 = y0 = 0 and f (0) = IdR . 0 Since f is continuous we may choose δ > 0 such that the closed disk D = Bδ(0) ⊂ U and for all x ∈ Bδ(0) we 0 1 0 0 1 n n have |f (x)−IdR | < 2 . Set w(x) = f(x)−x so w (x) = f (x)−IdR and by Lemma 2.4.2 |w(x+h)−w(x)| ≤ 2 |h|. In other words |h| |f(x + h) − f(x) − h| ≤ ∀x, x + h ∈ B (0) (2.4) 2 δ

Take Y = B δ (0). Using Banach’s contraction Lemma 2.4.3 we will show that for any y ∈ Y there exists a 2 g unique x ∈ D such that f(x) = y. This defines a function Y −→ D by setting x = g(y). To this end define for Φ any fixed y ∈ Y the function D 3 z 7−→ z + y − f(z) ∈ Rn. If there is an x ∈ D with Φ(x) = x then this will be a solution to f(x) = y and vice versa. To get this fixed point x we check the hypotheses of Banach’s lemma. 1 1 First the image of Φ is contained in D because setting x = 0 in (2.4) gives |f(h)−h| ≤ 2 |h| ≤ 2 δ for any h ∈ D and replacing h by z in combination with the triangle inequality yields |Φ(z)| = |y−f(z)+z| ≤ |y|+|f(z)−z| < δ. |h| Second we show that Φ is a contraction. Using (2.4) we find |Φ(z +h)−Φ(z)| = |−f(z +h)+f(z)+h| ≤ 2 as required. The resulting fixed point x satisfies x = Φ(x) is in fact in the open ball X = Bδ(0) as we saw above. g We thus found a function Y −→ X such that g(f(x)) = x for all x ∈ X (by uniqueness of the fixed point). For the same reason f(g(y)) = y for all y ∈ Y . 18 CHAPTER 2. HOW TO SOLVE EQUATIONS

To see that g is continuous take any two points y, y + k ∈ Y and set x = g(y), x + h = g(y + k) for some h. 1 1 Equation (2.4) tells us |k − h| < 2 |h| and the triangle inequality says |h| ≤ |h − k| + |k| ≤ 2 |h| + |k| so |h| ≤ 2|k|. This shows continuity of g at y (why?). 0 Finally to show that g is differentiable recall that f is, so f(x+h) = f(x)+f (x)h+f,x(h) with f,x = o(h). 0 In the previous notation with A = f (x) this means k = A(g(y + k) − g(y)) + f,x(h). Now |Ax − x| ≤ 1 −1 |A − Id||x| ≤ 2 |x| by definition of D above. It follows that ker A = {0} so A is invertible. Applying A to −1 −1 the above equation yields g(y + k) = g(y) + A k + g,y(k) where g,y(k) = −A f,x(h). This means that 0 −1 g is differentiable with derivative g (y) = A because the estimate below shows that g,y = o(k) as k → 0: −1 g,y (k)| |A f,x(h)| −1 |f,x(h)||h| −1 |f,x(h)| |k| = |k| ≤ |A | |h||k| ≤ 2|A | |h| . Finally the derivative g0(y) = f 0(g(y))−1 is continuous since matrix inversion is continuous and so are g and f 0. Now that we know something about systems with as many equations as there are variables, what about if there are more variables than equations? Again the linear case decides what happens in the non-linear case, at least locally. If we have n + m variables and m equations, can we select n free variables and parametrize the solutions set in terms of those? The implicit function theorem says that locally the solution set is the as shown in Figure 2.5.

Figure 2.5: The implicit function says that the solutions to the equation f(x, y) = z0 (shown in red) near the solution (x0, y0), (i.e. in the green box N × M) look like the graph of a function.

Theorem 2.5.2. (Implicit function theorem) 1 n m f m 0 Imagine a C function R × R ⊃ U −→ R and (x0, y0) ∈ U and set z0 = f(x0, y0). If F (y0) is invertible, 1 where F (y) = f(x0, y), then there exist open sets N,M such that (x0, y0) ∈ N × M ⊂ U and a unique C g function N −→ M such that −1 (N × M) ∩ f (z0) = {(x, g(x))|x ∈ N} 2 2 2 x3 2 2 For example take n = 1, m = 2 and f(x, y, z) = (x + y + z , 3 + y − z ) and (x0, y0) = (0, (1, 1)) −1 and z0 = (2, 0). Then√ the solution set f ({z0}) is the green tennisball curve which is the intersection of the sphere with radius 2 and the surface shown in yellow in Figure 2.6 below. Adding the two equations we see q 2 x 2 x2 x3 x (1 + 3 ) + 2y = 2 so y = ± 1 − 2 − 6 and taking the difference tells us z, so in this case we can actually q x2 x3 q x2 x3 find g explicitly: g(x) = ( 1 − 2 − 6 , 1 − 2 + 6 ). In general we will not be so lucky and g is only defined implicitly. The uniqueness of g is assured by choosing the signs of the square roots to be positive, which is valid in N × M = [0, ∞)3. The function F is in this case F (y, z) = f(0, y, z) = (y2 + z2, y2 − z2) and F 0((1, 1)) has  2 2  matrix and non-zero determinant. 2 −2

n+m G n+m 1 0 Proof. Define R 3 (x, y) 7−→ (x, f(x, y)) ∈ R . Then G is C differentiable and the matrix for G (x0, y0)   Id 0 0 wrt the standard basis has the block form 0 . This means G (x0, y0) is invertible so by the inverse ZF (y0) n m 1 H function theorem there exist open subsets R1,S1 ⊂ R and R2,S2 ⊂ R and C -function S −→ R such that H ◦ G = idS and G ◦ H = idR. Here we set S = S1 × S2 and R = R1 × R2. 1 h H has to be of the form H(x, y) = (x, h(x, y)) for some C -function S −→ R1 because otherwise composing with G cannot give the identity. Also f(x, h(x, y)) = f(H(x, y)) = π2 ◦ G ◦ H(x, y) = y, where π2 means projection on the second coordinate. It follows that f(x, h(x, z0)) = z0 so we may take M = S1 and N = R1 g 1 and define M 3 x 7−→ h(x, z0) ∈ N. The function g is C by the chain rule. 2.5. INVERSE AND IMPLICIT FUNCTION THEOREMS 19

Figure 2.6: The solution to f = (2, 0) is shown in green, while the two level sets of f 1 and f 2 are the sphere and the yellow surface respectively. The linear approximation to the solution set around point (0, 1, 1) is also shown.

Exercises Exercise 0. (Loss of generality) 0 n Prove that the inverse function theorem in the special case where x0 = y0 = 0 and f (x0) = IdR implies the theorem in general. 0 −1 Hint: F (x) = f (x0) (f(x + x0) − y0)

Exercise 1. (More loss of generality) 0 n Prove that the implicit function theorem in the special case where x0 = y0 = z0 = 0 and F (y0) = IdR implies the theorem in general.

Exercise 2. (Inverse from implicit) Derive the inverse function theorem from the implicit function theorem. Hint: Apply implicit function theorem to the function B(y, x) = f(x) − y.

Exercise 3. (Derivative of implicit function) Use the chain rule to find an expression for the derivative of the function g in the implicit function theorem.

Exercise 4. (Scary system?) Consider the two equations in the four unknowns a, b, c, d with parameters z1, z2 given by

abc + abd + acd + bcd = z1

ab + ac + ad + bc + bd + cd = z2

Write the system as f(x, y) = z0 = (z1, z2) where x = (a, b) and y = (c, d). For z0 = (2, 0) = f(1, −1, −1, −1) −1 Can one write the solution set f ({z0}) near the point (x0, y0) = (1, −1, −1, −1) as the graph of a function of variables a, b? What about the case (x0, y0) = (1, 1, 1, 1)?

Exercise 5. () 0 Given f as in the implicit function theorem, show that the condition det F (y0) 6= 0 is equivalent to the con- 0 dition that the dimension of the projection of the tangent plane ker f (x0, y0) onto the first n coordinates (the x-coordinates) has dimension n.

Exercise 6. (Coordinate transformation) x −y Consider the coordinate transformation given by f(x, y) = ( x2+y2 , x2+y2 ). Does there exist an inverse to f close to the point (2, 1, f(2, 1))? 20 CHAPTER 2. HOW TO SOLVE EQUATIONS

2 1 Identifying C with R we see that f is really the function z 7→ z .

Exercise 7. (Invertible?) 1 n n Explain why |A − Id| < 2 implies that linear map A ∈ L(R , R ) is invertible. Hint: It is enough to show that Ax 6= 0 when x 6= 0. This follows from an estimate on |Ax − x|.

Exercise 8. (Sinful system) π π π −1 1 Set f(x, y, z) = (sin sin(x + y), sin sin sin(x + y + z)) and p = ( 4 , 2 , 2 ). Show that the solution set f ({ 2 }) is the graph of some function near point p.

Exercise 9. (Inverse) Suppose f is a C1 differentiable function and f 0(x) is invertible for all x in the domain of f. Does this mean that f is a bijection?

Exercise 10. (Implicit) Use the implicit function theorem to show that the solutions to f(x, y, z) = (5, 2) near (0, 1, 2) can be written as the graph of a function g. Here f(x, y, z) = (x2 + y2 + z2, yz). Also find a one variable function parametrizing the solutions of f(x, y, z) = (2, 1) near (0, 1, 1).

Exercise 11. (Flattening) 1 n+m f m 0 Given C -function R −→ R such that f(x0, y0) = z0 and the final m columns of f (x0, y0) form a basis of m 1 B R , show that there exist open sets N 3 x0 and M 3 y0 and an invertible C function N × M −→ N × M with 1 −1 C inverse such that f ({z0}) ∩ N × M) = B({0} × M).

Exercise 12. (Proof tracking) Write out explicitly what the proof of the implicit function theorem says in the case where f is a linear function. What is G, what is H what is h what is g and why does it make sense?

2.6 Picard’s theorem on existence of solutions to ODE

F Recall that a vector field on open set P ⊂ Rn is just a function P −→ Rn. Vector fields provide a way to encode ordinary differential equations (ODE) basically by ’following the arrows’. More precisely solving a differential equation comes down to finding an integral curve as defined below. Definition 2.6.1. (Integral curve) F Imagine a vector field P −→ Rn on open set P ⊂ Rn. An integral curve γ for F through p ∈ P is a differentiable γ map (−a, a) −→ P for some a > 0 such that γ0(t) = F (γ(t)) for all t ∈ (−a, a) and γ(0) = p. Theorem 2.6.1. (Existence of solutions to ODE) F If P −→ Rm is a C1 vector field on P then for any p ∈ P there exists an integral curve for F through p. Any two integral such integral curves have to agree on an interval (−a, a) for some a > 0. G Proof. We first reformulate the theorem in terms of integration. In what follows the integral of a function Rm −→ n R R P i P R i R over a rectangle R just means the integral of each of its components so R G = R i G ei = i( R G )ei. F is continuous so there is a closed ball Br(p) ⊂ P with radius r and center p and a constant M such that |F (x)| ≤ M for all x ∈ B (p). Choose τ > 0 such that τM ≤ r and τL < 1 where L = max |F 0(x)|. r x∈Br (p) γ R For a curve (−a, a) −→ P define a new curve Φ(γ) by Φ(γ)(t) = p+ [0,t] F ◦γ. Notice that if we had a curve γ such that Φ(γ) = γ, i.e a fixed point for Φ, then that γ would be an integral curve for F through p. Conversely, 0 R 0 any integral curve for F through p would be a fixed point of Φ since F ◦ γ = γ so Φ(γ)(t) = p + [0,t] γ = γ(t) by the fundamental theorem of calculus. γ Considering Φ as a function from the space of curves C = {[−τ, τ] −→ Br(p)|γ continuous} to itself we now seek to apply the Banach lemma 2.4.3 to find a fixed point. Recall that on C we measure distance using |γ −δ| = sup|t|≤τ |γ(t)−δ(t)|. Notice that Φ(γ) is in fact differentiable (by the fundamental theorem of calculus). R Also that Φ(γ) ∈ C for γ ∈ C because |p − Φ(γ)(t)| = | [0,t] Φ(γ)| ≤ τM < r. We will show that Φ is a contraction with respect to the sup norm. Once we do that the Banach contraction lemma 2.4.3 guarantees the existence of a unique fixed point of Φ. Φ is a contraction because Z |Φ(γ) − Φ(δ)| = sup | F (γ(·)) − F (δ(·))| ≤ τ sup |F (γ(s)) − F (δ(s))| ≤ τ sup L|γ(s) − δ(s)| = τL|γ − δ| |t|≤τ [0,t] s≤τ s≤τ 2.6. PICARD’S THEOREM ON EXISTENCE OF SOLUTIONS TO ODE 21

The first and last equality holds by definition of the sup-norm on C. The first inequality is an application of 2.3.1 and the second uses the mean value inequaltiy 2.4.2. Since the fixed point we found is unique and any integral curve of F through p gives such a fixed point when restricted to [−τ, τ], all such integral curves have to agree for |t| ≤ τ.

Exercises Exercise 0. (Polya vector field) f P For a complex differentiable C −→ C we define the Polya vector field R2 −→ R2 as follows. P (x, y) = (Ref(x + iy), −Imf(x + iy)). Recall the div of a vector field is the sum of the partial deriva- tives in all directions. What is div P ?

Exercise 1. γ If (−a, a) −→ P and (−b, b) −→δ P are two integral curves through p ∈ P for differentiable vector field F and 0 < a < b is it true that δ(t) = γ(t) for all |t| < a?

Exercise 2. f Compute curl and div of the vector field ∇f where R2 −→ R is given by f(x, y) = x2 − y. Same question for f(x, y) = cos(xy).

Exercise 3. F Sketch the arrows of the vector field R2 −→ R2 given by F (x, y) = (x2 − y2, −2xy). Also sketch the integral curves for F through the points (0, 0) , (1, 0) and (1, 1) and (0, 1).

Exercise 4. Linear systems of ordinary differential equations are often written asx ˙ = Ax for some matrix A and vector function x = x(t) and initial condition x(0) = x0. How can we formulate a vector field F such that the problem of finding an integral curve for F corresponds to solving this system? What F should we take? What about the initial condition?

Exercise 5. F Describe all integral curves for the vector field R2 3 (x, y) 7−→ (−y, x) ∈ R2.

Exercise 6. Show that for γ, δ ∈ C we have |γ + δ| ≤ |γ| + |δ|. Here C is the space of continuous functions on [−τ, τ] used in the proof of Picard’s theorem.

Exercise 7. What vector field corresponds to the differential equation x0 = x2? Dividing both sides of the equation by x2 and integrating with respect to t find an explicit solution, one for each initial value x(0). If you did it right you will see the solution is not defined for all t.

Exercise 8. Write down the first few terms of the Taylor of tan(t) around 0. Using the contraction map Φ from the proof of the Picard theorem we want to approximate the solution to the differential equation x0 = 1 + x2 with initial condition x(0) = 0. Our first guess is the constant function x(t) = γ(t) = 0. Now compute Φ(γ) and Φ(Φ(γ)) and Φ(Φ(Φ(γ))). The vector field is F (x) = 1 + x2 and p = 0.

Exercise 9. Describe the integral curve through the point (π, π, π) for the vector field F on R3 given by F (x, y, z) = e1 + e2 + e3. 22 CHAPTER 2. HOW TO SOLVE EQUATIONS Chapter 3

Multivariable fundamental theorem of calculus

The fundamental theorem of calculus states that the integral of the derivative is the function evaluated at the boundary and that every function has a primitive, an indefinite integral. Before setting out to generalize the fundamental theorem of calculus to arbitrary dimensions let us have a brief look at vector calculus. Recall the theorems of Gauss and Stokes and line . None of these concepts is made precise in the present section, they are just meant to guide us into the right direction. F A vector field F is a differentiable Rm ⊃ P −→ Rm where P is open. When m = 3 recall div, curl, grad were defined by

X i 2 3 3 1 1 2 div(F ) = ∂iF curl(F ) = (∂3F − ∂2F , ∂1F − ∂3F , ∂2F − ∂1F ) grad(f) = (∂1f, ∂2f, ∂3f) i

As written div, grad, curl depend very much on the choice of basis in R3. Gauss: For a vector field F defined on a three-dimensional domain D bounded by S we have Z Z div(F )dV = F · NdS D S Stokes: For a vector field F defined on a surface S bounded by a curve C we have Z Z curl(F ) · NdS = F · dr S C For a function f defined on a curve C with end-points a, b we have: Z grad(f) · dr = f(b) − f(a) C Without going into too much detail, we notice that the left hand side is integration over a k-dimensional object D involving some kind of derivative. The right hand side relates this to an integral over the boundary of D of the original function. Also the type of object integrated varies. The counts how many points fit in a cube. The counts how many arrows of our vector field pierce through the surface. The counts how many surfaces perpendicular to the vector field get pierced by the curve. Finally we note that finding a primitive/ in this context comes down to finding potentials. Not every vector field is the of a function. However in R3 vector fields F satisfying curlF = 0 are shown to be F = grad(f) for some function f. Likewise a vector field F is of the form F = curlG for some other vector field G, provided divF = 0. Hopefully at the end of this chapter we will have more insight into why this must be the case.

3.1 Exterior algebra

Although somewhat abstract, the exterior algebra is a great way to discuss determinants and the geometry of lines and planes that one naturally finds in multivariable analysis. They are defined starting with a huge vector space H that is then quotiented out by another enormous vector space. Both vector spaces are a little

23 24 CHAPTER 3. MULTIVARIABLE FUNDAMENTAL THEOREM OF CALCULUS artificial but that is perhaps also their charm. Starting from any set S one can consider the vector space W spanned by S. By definition the elements of S form a basis of W and general vectors of W look like finite linear combinations of elements of S. For example if S = {cats, dogs, frogs} then a typical vector in W would be 10cats − 2dogs + πfrogs. Much like quotients of Abelian groups one can quotient a vector space by one of its subspaces:

Definition 3.1.1. (Quotient vector space) Given a vector space V and a linear subspace W ⊂ V we define the quotient space V/W to be the vector space of elements v + W with addition and scalar multiplication given by (v + W ) + (v0 + W ) = v + v0 + W and λ(v + W ) = λv + W .

We are now ready to define one of the most important algebras you have not seen before:

Definition 3.1.2. (Exterior algebra) Given vector space V , consider the vector space H spanned1 by the set of all finite sequences of elements of V . The subspace R is spanned by all elements in H of the form

1. (...,v,w,... ) + (...,w,v,... )

2. (. . . , α1v1 + α2v2, w, . . . ) − α1(. . . , v1, w, . . . ) − α2(. . . , v2, w, . . . ) Here the dots indicate that the vectors in each of the three points are equal except in the places shown, also v, w ∈ V are arbitrary elements and α1, α2 ∈ R. The exterior algebra ΛV is the quotient vector space H/R Vk we use the notation (v1, v2) + R = v1 ∧ v2 and v1 ∧ v2 ∧ · · · ∧ vk = i=1 vi. Moreover ΛnV is the subspace of ΛV spanned by the wedges of length n.

In what follows we will identify Λ1V with V and also set Λ0V = R. The wedge ∧ can be viewed as a map ∧ :ΛV × ΛV → ΛV that is bilinear, associative and anti-symmetric. More concretely ΛV consists of combinations of +, ∧ applied to any finite number of vectors of V in any order subject to the rules

1. v ∧ w = −w ∧ v, v, w ∈ V

2. (α1v1 + α2v2) ∧ w = α1v1 ∧ w + α2v2 ∧ w, vi, w ∈ V , αi ∈ R. 3. (u ∧ v) ∧ w = u ∧ (v ∧ w), u, v, w ∈ V .

The above formal construction makes working with such relations rigourous but in practice one often just works with these relations without remembering they came from a quotient. We will mostly apply this construction to the dual space V ∗ to build Λ(V ∗). One benefit of the dual space is that it allows a pull-back map. Recall that for L ∈ L(V,W ) we had the pull-back L∗f = f ◦ L. On the exterior algebra we now apply the usual pull-back to all the dual vectors in a wedge product and extend linearly.

Definition 3.1.3. (Pull-back) ∗ The pull-back along a linear map V −→L W is ΛW ∗ −−→L ΛV ∗ defined by L∗(f 1 ∧ · · · ∧ f k) = (L∗f 1 ∧ · · · ∧ L∗f k). It is extended to all of ΛW ∗ by asking that it is a linear map.

∗ ∗ For example if Le1 = f1 + 2f3 and Le2 = 3f1 − f2 for bases e1, e2 of V and f1, f2, f3 of W then V and W 1 2 1 2 3 ∗ 1 1 ∗ ∗ 1 ∗ 1 have dual bases e , e and f , f , f . Also L f = f ◦ L ∈ W . More explicitly L f (e1) = 1 and L f (e2) = 3 so L∗f 1 = e1 + 3e2. Likewise L∗f 2 = −e2 and L∗f 3 = 2e1. Finally the pull-back of ω = 2f 1 ∧ f 2 + 4f 1 ∧ f 3 is L∗ω = L∗2f 1 ∧ L∗f 2 + L∗4f 1 ∧ L∗f 3 = 2(e1 + 3e2) ∧ (−e2) + 4(e1 + 3e2) ∧ (2e1) = −26e1 ∧ e2. It is helpful to remember that the matrix of the pull-back with respect to the dual bases is the transposef of P i the matrix of the original transformation. Indeed this follows from Lej = i Ljfi by applying the dual basis k ∗ k k k ∗ k P k j vector f to both sides: we get (L f )ej = f ◦ Lej = Lj so L f = j Lj e . The wedge product and the pull-back together allow us to express determinants elegantly. Recall the determinant of a matrix can be defined recursively in terms of the minors of the matrix. Given matrix A we i+j define the cofactor Cij = (−1) det(Mij) where Mij is the matrix obtained by deleting row i and column j. P i The determinant is then defined as by det(A) = i A1Ci1. Such a formula is known as expanding in the first column. Perhaps more conceptually we may also express the determinant in terms of the pull-back and wedge. Intuitively it says that the determinant records the (signed) change in volume when doing a linear change of coordinates. 1Please keep in mind these sequences should NOT be added pointwise and the empty sequence is included too. 3.1. EXTERIOR ALGEBRA 25

The wedge product and the pull-back together allow us to express determinants elegantly. Recall that the determinant of a matrix can be defined recursively in terms of the minors of the matrix. Given a matrix $A$ we define the cofactor $C_{ij} = (-1)^{i+j}\det(M_{ij})$ where $M_{ij}$ is the matrix obtained by deleting row $i$ and column $j$. The determinant is then defined by $\det(A) = \sum_i A^i_1 C_{i1}$. Such a formula is known as expanding in the first column. Perhaps more conceptually we may also express the determinant in terms of the pull-back and wedge. Intuitively it says that the determinant records the (signed) change in volume under a linear change of coordinates.

Lemma 3.1.1. (Wedge and det) When $n = \dim V$ and $L \in L(V,V)$ we have $L^*\bigwedge_{i=1}^n v^i = \det(L)\bigwedge_{i=1}^n v^i$.

Proof. We might as well assume that the vectors $v_i$ are linearly independent and hence form a basis, since otherwise both sides of the above equation are 0. With respect to the basis $v_i$ of $V$ we set $Lv_j = \sum_i L^i_j v_i$. Applying $v^k$ to both sides we get $L^* v^k(v_j) = L^k_j$ since $v^k(v_i) = 0$ unless $k = i$, in which case it is 1. This means $L^* v^k = \sum_j L^k_j v^j$. Now we argue by induction on $n$ and expand the pull-back by bringing to the front all terms $v^1$ that occur in the wedge product:
$$L^*(v^1\wedge\cdots\wedge v^n) = L^* v^1\wedge\cdots\wedge L^* v^n = \sum_{j_1,\ldots,j_n} L^1_{j_1} v^{j_1}\wedge\cdots\wedge L^n_{j_n} v^{j_n}$$
$$= L^1_1\, v^1\wedge\Big(\sum_{j_2,\ldots,j_n\neq 1} L^2_{j_2} v^{j_2}\wedge\cdots\wedge L^n_{j_n} v^{j_n}\Big) - L^2_1\, v^1\wedge\Big(\sum_{j_1,j_3,\ldots,j_n\neq 1} L^1_{j_1} v^{j_1}\wedge L^3_{j_3} v^{j_3}\wedge\cdots\wedge L^n_{j_n} v^{j_n}\Big) + \cdots$$
$$+ (-1)^{n+1} L^n_1\, v^1\wedge\Big(\sum_{j_1,\ldots,j_{n-1}\neq 1} L^1_{j_1} v^{j_1}\wedge\cdots\wedge L^{n-1}_{j_{n-1}} v^{j_{n-1}}\Big) = v^1\wedge\sum_i L^i_1 C_{i1}\, v^2\wedge\cdots\wedge v^n = \det L\;\, v^1\wedge\cdots\wedge v^n$$

The final equality is the above recursive definition of the determinant in terms of the cofactors $C_{i1}$. By induction this cofactor is seen to satisfy precisely
$$C_{i1}\, v^2\wedge\cdots\wedge v^n = (-1)^{i+1}\sum_{j_1,\ldots,j_{i-1},j_{i+1},\ldots,j_n\neq 1}\;\bigwedge_{s\neq i} L^s_{j_s} v^{j_s} = (-1)^{i+1} B^*(v^2\wedge\cdots\wedge v^n)$$
where $B$ has matrix $M_{i1}$, the minor obtained from the matrix of $L$ by erasing the $i$-th row and the first column.

To make the exterior computations more concrete it is useful to see how a basis of $V$ extends to one of $\Lambda V$.

Lemma 3.1.2. (Wedge basis) A basis $b_1,\ldots,b_n$ for $V$ gives a basis of $\Lambda^k V$ consisting of all elements $b_I = \bigwedge_{i\in I} b_i$ where $I$ is an increasing sequence of $k$ elements of $\{1,\ldots,n\}$. If $b^1,\ldots,b^n$ denotes the dual basis of $V^*$ we also define $b^I = \bigwedge_{i\in I} b^i$.

Proof. The fact that the $b_I$ span $\Lambda^k V$ follows directly from expressing each of the $v_i$ in $v_1\wedge\cdots\wedge v_k \in \Lambda^k V$ in terms of the basis $b_i$. The $b_I$ with $I$ an increasing sequence are also independent. To see why, assume $\sum_I c_I b_I = 0$. For an increasing sequence $J$ consider the complementary increasing sequence $\bar J$ of $n-k$ elements. Then $0 = b_{\bar J}\wedge\sum_I c_I b_I = \pm c_J\, b_{(1,\ldots,n)}$ so the coefficient $c_J$ must be 0.
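The wedge basis makes such elements very concrete. The sketch below (our own, taking $b_i = e_i$) expands a wedge of column vectors in the basis $e_I$: the coefficient of $e_I$ is the $k\times k$ minor with rows selected by $I$. It also spot-checks the push-forward counterpart of Lemma 3.1.1: pushing all vectors of a top wedge through $L$ multiplies its single coefficient by $\det L$.

```python
import itertools
import numpy as np

def wedge_coefficients(vectors):
    """Return {I: coefficient of e_I} for v_1 ∧ ... ∧ v_k in R^n (0-indexed I)."""
    V = np.column_stack(vectors)          # n x k matrix of column vectors
    n, k = V.shape
    return {I: np.linalg.det(V[list(I), :])
            for I in itertools.combinations(range(n), k)}

# Example: (e_1 + e_2) ∧ e_3 in R^3 equals e_{(1,3)} + e_{(2,3)}.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([0.0, 0.0, 1.0])
print(wedge_coefficients([v1, v2]))
# {(0, 1): 0.0, (0, 2): 1.0, (1, 2): 1.0}

# Top wedge: applying L to each vector scales the one coefficient by det(L).
L = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 3.0], [1.0, 0.0, 1.0]])
top = [np.eye(3)[i] for i in range(3)]
print(wedge_coefficients([L @ v for v in top])[(0, 1, 2)], np.linalg.det(L))
```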

Exercises

Exercise 0. (Tensor product) For vector spaces $V, W$ we consider the enormous vector space spanned by all pairs $v \otimes w$ where $v \in V$ and $w \in W$. The tensor product $V \otimes W$ is the quotient vector space where we work modulo all elements of the form

$$(\alpha_1 v_1 + \alpha_2 v_2)\otimes w - \alpha_1(v_1\otimes w) - \alpha_2(v_2\otimes w), \qquad v\otimes(\alpha_1 w_1 + \alpha_2 w_2) - \alpha_1(v\otimes w_1) - \alpha_2(v\otimes w_2)$$

where $v_i \in V$, $w_i \in W$ and $\alpha_i \in \mathbb{R}$. The tensor algebra is $T(V) = \bigoplus_{n=0}^\infty V^{\otimes n}$. Show that $\Lambda V = T(V)$ modulo the relations $v \otimes v = 0$ where $v \in V$.

Exercise 1. (Triangles) In this exercise we identify vectors in $\mathbb{R}^2$ with elements of $\Lambda^1\mathbb{R}^2$. Explain why for any $a, b \in \mathbb{R}^2$ there is a $\lambda \in \mathbb{R}$ such that $a \wedge b = \lambda\, e_1\wedge e_2$. We say that $\frac{\lambda}{2}$ is the oriented area of the triangle with vertices $0, a, b$. Prove that for $a, b, c \in \mathbb{R}^2$ we have $\frac{1}{2}(b-a)\wedge(c-a) = \frac{1}{2}a\wedge b + \frac{1}{2}b\wedge c + \frac{1}{2}c\wedge a$. What is the relationship between areas of triangles that is implied?

Exercise 2. Find a non-zero $f \in \Lambda^2\mathbb{R}^4$ such that $f \wedge f \neq 0$.

Exercise 3. Suppose $v_1,\ldots,v_k \in V$ are linearly independent vectors. The element $\bigwedge_{i=1}^k v_i \in \Lambda^k V$ has something to do with the subspace $W \subset V$ spanned by the $v_i$. Consider another sequence $w_1,\ldots,w_k \in V$ that also spans $W$. Show that $\bigwedge_{i=1}^k w_i$ differs from $\bigwedge_{i=1}^k v_i$ by only a scalar.

Exercise 4. Expand (ae1 + ce2) ∧ (be1 + de2) in terms of the basis elements.

Exercise 5. What is e1 ∧ e2 ∧ e3 + e2 ∧ e1 ∧ e3 + e3 ∧ e2 ∧ e1 + e1 ∧ e3 ∧ e2 + e2 ∧ e3 ∧ e1 + e3 ∧ e1 ∧ e2?

Exercise 6. Prove that our pull-back definition of determinant gives rise to the following general formula as a sum over the group Sn of permutations σ of n elements. Here n is the size of the matrix A:

$$\det(A) = \sum_{\sigma\in S_n}\operatorname{sign}(\sigma)\prod_{i=1}^n A^i_{\sigma(i)}$$

3.2 Differential forms

In our quest for the fundamental theorem of calculus the main tool is differential forms. These are the natural integrands, rather than functions. Roughly speaking they are integrands with a built-in device for dealing with the change of scale that comes with a change of variables. Later we will see that $f(x)dx$ can be understood as a differential form. In changing variables we know from calculus that we have to change both the $f$ and the $dx$ part. We will eventually make sense of this in terms of the following.

Definition 3.2.1. A differential $k$-form $\omega$ on $P \subset \mathbb{R}^n$ is a choice of an element $\omega(p) \in \Lambda^k(\mathbb{R}^n)^*$ for each $p \in P$. We may express $\omega(p)$ in terms of the $e^I$, and the $k$-form is called $C^1$ if its coefficients are $C^1$ functions. The set of all $C^1$ $k$-forms on $P$ is denoted $\Omega^k(P)$.

Recall that if the standard basis of $\mathbb{R}^n$ is $e_1,\ldots,e_n$ then the dual basis of $(\mathbb{R}^n)^*$ is denoted $e^1,\ldots,e^n$. We know from Lemma 3.1.2 that $\Lambda^k(\mathbb{R}^n)^*$ has basis $e^I$. Therefore any differential $k$-form on an open set $P$ can be written uniquely as $\sum_I \omega_I e^I$ for some functions $\omega_I$ defined on $P$. As always the sum runs over all increasing sequences $I$ with entries in $\{1,\ldots,n\}$.

For a differentiable function $P \xrightarrow{f} \mathbb{R}$ we get a natural 1-form called the differential $df$, defined by $df(p) = f'(p)$. Writing $x^i \in (\mathbb{R}^n)^*$ for the function that provides the coefficient of $e_i$, one often sees the notation $dx^i$ for the constant 1-form $p \mapsto e^i$.

Many constructions on $\Lambda^k$ can be extended to differential forms pointwise. For example if $\omega, \eta$ are $k$-forms and $f, g$ are real valued functions all defined on an open set $P$ then we define $f\omega + g\eta$ to be the differential $k$-form defined by $(f\omega + g\eta)(p) = f(p)\omega(p) + g(p)\eta(p)$. Likewise the wedge product is extended pointwise to forms: $(\omega\wedge\eta)(p) = \omega(p)\wedge\eta(p)$. Finally the pull-back along a differentiable map is defined on forms as follows.

Definition 3.2.2. (Pull-back on differential forms) For a map $P \xrightarrow{\varphi} Q$ we have a pull-back $\Omega^k(Q) \xrightarrow{\varphi^*} \Omega^k(P)$ defined by $(\varphi^*\omega)(p) = (\varphi'(p))^*\,\omega(\varphi(p))$. In the special case $k = 0$ of functions we define $\varphi^* f = f\circ\varphi$ for $f \in \Omega^0(Q)$.

As an example consider polar coordinates: $P = (0,\infty)\times(0,2\pi) \ni (r,t) \xrightarrow{\varphi} (r\cos t, r\sin t) \in Q$, where $Q$ is $\mathbb{R}^2$ minus the positive $x$-axis, and $\omega \in \Omega^2(Q)$ is given by $\omega(x,y) = (x-y)\,e^1\wedge e^2$. By definition $(\varphi^*\omega)(r,t) = r(\cos t - \sin t)\,\varphi'(r,t)^*(e^1\wedge e^2)$. Since $e^1\circ\varphi'(r,t) = \cos t\, e^1 - r\sin t\, e^2$ and $e^2\circ\varphi'(r,t) = \sin t\, e^1 + r\cos t\, e^2$ we get
$$\varphi'(r,t)^*(e^1\wedge e^2) = (e^1\circ\varphi'(r,t))\wedge(e^2\circ\varphi'(r,t)) = (\cos t\, e^1 - r\sin t\, e^2)\wedge(\sin t\, e^1 + r\cos t\, e^2) = r\, e^1\wedge e^2$$
so we found $(\varphi^*\omega)(r,t) = r^2(\cos t - \sin t)(e^1\wedge e^2)$. Beware that we used $e^1, e^2$ to denote the dual basis in both the domain and the range. One often sees this computation done in terms of $dx, dy$ and $dr, dt$ instead, just plugging in the formulas of $\varphi$ for $x, y$ to express $dx$ and $dy$ in terms of $dr, dt$ by differentiating. The end result would then look like $\varphi^*((x-y)\,dx\wedge dy) = r^2(\cos t - \sin t)\, dr\wedge dt$.
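The same computation can be delegated to sympy. In the sketch below (ours, not part of the notes) we do the $dx, dy$ versus $dr, dt$ bookkeeping by hand, representing a 1-form on the $(r,t)$ plane by its pair of coefficients, and recover the coefficient $r^2(\cos t - \sin t)$.

```python
import sympy as sp

r, t = sp.symbols('r t', positive=True)
x = r * sp.cos(t)          # the two components of phi
y = r * sp.sin(t)

# dx and dy expressed in dr, dt: we track only the pair of coefficients (dr, dt)
dx = (sp.diff(x, r), sp.diff(x, t))     # (cos t, -r sin t)
dy = (sp.diff(y, r), sp.diff(y, t))     # (sin t,  r cos t)

# the wedge of a dr + b dt and c dr + d dt is (ad - bc) dr ∧ dt
def wedge2(u, v):
    return sp.simplify(u[0] * v[1] - u[1] * v[0])

coeff = sp.simplify((x - y) * wedge2(dx, dy))   # coefficient of dr ∧ dt
print(coeff)   # equals r**2*(cos(t) - sin(t))
```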

Exercises

Exercise 0. Show that $f^*\circ g^* = (g\circ f)^*$ still holds for the pull-back of differential forms.

Exercise 1. Define $\mathbb{R}^2 \ni (x,y) \xrightarrow{f} (x^2 - y^2, -2xy, x) \in \mathbb{R}^3$. On $P = (-1,1)\times(2,0)\times(-2,0)$ we define the differential 1-form $\omega = \cos(xy)\,e^1 + \sin(x+y)\,e^2$. Compute $f^*\omega$ by providing its coefficients with respect to the basis $e^1, e^2$.

Exercise 2. Pull-back and wedge commute: $\varphi^*\bigwedge_{i=1}^s \eta^i = \bigwedge_{i=1}^s \varphi^*\eta^i$.

Exercise 3. First notice that 0-forms really are functions, because we set $\Lambda^0 V = \mathbb{R}$ for any vector space $V$. For functions $f \in \Omega^0(P)$ taking the pull-back commutes with the differential: $d\varphi^* f = \varphi^* df$.

Exercise 4. Can you make sense of the formula $df = \sum_{i=1}^n \partial_i f\, dx^i$ where $\mathbb{R}^n \xrightarrow{f} \mathbb{R}$ is a differentiable function?

3.3 Integration

Integration is defined much like the familiar case of line integrals. In general we integrate a $k$-form along a "$k$-dimensional parametrized curve" $[0,1]^k \xrightarrow{\gamma} \mathbb{R}^n$. We call $\gamma$ a (singular $k$-)cube.

Definition 3.3.1. (Integral of k-form along a k-cube) For a $C^1$ function $[0,1]^k \xrightarrow{f} \mathbb{R}$ define $\int_{[0,1]^k} f(e^1\wedge\cdots\wedge e^k) = \int_{[0,1]^k} f$ where the right hand side is the ordinary integral of the function $f$. More generally, we define
$$\int_\gamma \omega = \int_{[0,1]^k}\gamma^*\omega$$
where $[0,1]^k \xrightarrow{\gamma} P$ is a $C^1$ function (singular cube) and $\omega \in \Omega^k(P)$.

Notice that the final integrand $\gamma^*\omega$ is in $\Omega^k([0,1]^k)$. Any $k$-form on $[0,1]^k$ can be written as $f(e^1\wedge\cdots\wedge e^k)$ for some function $f$, so this definition actually covers all possible cases.

In the case $k = 1$ this corresponds to the usual definition of a line integral. To see how this works consider $P \subset \mathbb{R}^n$, $\omega \in \Omega^1(P)$ and a 1-cube $[0,1] \xrightarrow{\gamma} P$. Then $\gamma^*\omega(t) = \gamma'(t)^*\omega(\gamma(t)) = \omega(\gamma(t))\gamma'(t) \in \mathbb{R}^*$. Any $f \in \mathbb{R}^*$ satisfies $f = f(e_1)e^1$ so finally $\int_\gamma\omega = \int_{[0,1]}\omega(\gamma(\cdot))\gamma'(\cdot)e_1\, e^1 = \int_{[0,1]}\omega(\gamma(\cdot))\gamma'(\cdot)e_1$.

For example if $P = \mathbb{R}^2$ and $\omega(x,y) = (\cos y)e^1 + \sin(x+y)e^2$ and $\gamma(t) = (t, t^2)$ then $\omega(\gamma(t))\gamma'(t)e_1 = ((\cos t^2)e^1 + \sin(t+t^2)e^2)(e_1 + 2te_2) = \cos(t^2) + 2t\sin(t+t^2)$. Here we omitted the basis vector $e^1$ of $\mathbb{R}^*$. Therefore $\int_\gamma\omega = \int_{[0,1]}(\cos(t^2) + 2t\sin(t+t^2))dt \approx 1.7065$. Notice the $dt$ is in our notation completely optional.

Next we integrate the 2-form $\omega \in \Omega^2(\mathbb{R}^3)$ defined by $\omega(x,y,z) = (x+y)e^1\wedge e^3$ over a square parametrized by $[0,1]^2 \xrightarrow{\gamma} \mathbb{R}^3$ defined by $\gamma(s,t) = (s+t^2, t, -s)$. Again we integrate by computing the pull-back $\gamma^*\omega(s,t) = \gamma'(s,t)^*\omega(\gamma(s,t))$. Now $\gamma'(s,t)e_1 = e_1 - e_3$ and $\gamma'(s,t)e_2 = 2te_1 + e_2$, so applying the transpose: $\gamma'(s,t)^*e^1 = e^1 + 2te^2$, $\gamma'(s,t)^*e^2 = e^2$ and $\gamma'(s,t)^*e^3 = -e^1$. Therefore $\gamma^*\omega(s,t) = (s+t^2+t)\,\gamma'(s,t)^*e^1\wedge\gamma'(s,t)^*e^3 = (s+t^2+t)(e^1+2te^2)\wedge(-e^1) = 2t(s+t^2+t)\,e^1\wedge e^2$. So finally $\int_\gamma\omega = \int_{[0,1]^2} f$ where $f(s,t) = 2t(s+t^2+t)$. By Fubini and the fundamental theorem of calculus this is equal to $\frac{5}{3}$.
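Both of these integrals can be checked mechanically. The following sketch (ours, assuming sympy is available) applies the pull-back definition directly.

```python
import sympy as sp

t, s = sp.symbols('t s')

# Line integral: omega = cos(y) e^1 + sin(x+y) e^2 along gamma(t) = (t, t^2).
x, y = t, t**2
integrand = sp.cos(y) * sp.diff(x, t) + sp.sin(x + y) * sp.diff(y, t)
print(sp.N(sp.Integral(integrand, (t, 0, 1))))          # approx 1.7065

# 2-form: omega = (x+y) e^1 ∧ e^3 over gamma(s,t) = (s + t^2, t, -s).
g = sp.Matrix([s + t**2, t, -s])
J = g.jacobian([s, t])                 # columns are d gamma/ds and d gamma/dt
# gamma*(e^1 ∧ e^3) picks out rows 1 and 3 of the Jacobian (0-indexed 0 and 2)
pulled = (g[0] + g[1]) * (J[0, 0] * J[2, 1] - J[0, 1] * J[2, 0])
print(sp.integrate(pulled, (s, 0, 1), (t, 0, 1)))       # 5/3
```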

The fundamental theorem of calculus for 1-forms, a.k.a. line integrals, is as follows: for any $P \xrightarrow{f} \mathbb{R}$ and $C^1$ curve $\gamma$ we have
$$\int_\gamma df = f(\gamma(1)) - f(\gamma(0))$$
Unwinding the definitions this is just a restatement of the usual fundamental theorem, since $\int_\gamma df = \int_{[0,1]}\gamma^* df = \int_{[0,1]} f'(\gamma(\cdot))\gamma'(\cdot) = \int_{[0,1]}(f\circ\gamma)'(\cdot) = f(\gamma(1)) - f(\gamma(0))$. The importance of the pull-back is that it describes how the integrand changes under a change of coordinates:

Lemma 3.3.1. (Substitution lemma) For $\omega \in \Omega^k(Q)$ and $P \xrightarrow{\varphi} Q$ a $C^1$ function we have
$$\int_{\varphi\circ\gamma}\omega = \int_\gamma\varphi^*\omega$$

Proof. By definition the left hand side is $\int_{[0,1]^k}(\varphi\circ\gamma)^*\omega = \int_{[0,1]^k}\gamma^*\varphi^*\omega = \int_\gamma\varphi^*\omega$.

Exercises

Exercise 0. Compute $\int_\gamma\omega$ where:

1. $\gamma(t) = (1, t, -t, \cos t, \sin t)$ and $\omega \in \Omega^1(\mathbb{R}^5)$ is defined as $\omega(x) = x^1 e^2 + x^4 e^5$.

2. γ(s, t) = (2s − t, t + 1) and ω ∈ Ω2(R2) with ω(x, y) = e1 ∧ e2.

3. γ(r, s, t) = (rst, rs, r, 1) and ω ∈ Ω3(R4) with ω(x, y, z, w) = ze1 ∧ e2 ∧ e3.

4. $\gamma(s,t) = (\frac{s}{t}, 0, 0, 0, 0, 0, 1)$ and $\omega \in \Omega^2(\mathbb{R}^7)$ with $\omega(x) = (x^6 + x^7)\,e^1\wedge e^5$.

Exercise 1. For $\omega \in \Omega^2(\mathbb{R}^4)$ defined by $\omega(x,y,z,w) = (x+z)\,dx^1(x,y,z,w)\wedge e^3 + (y+w)\,e^2\wedge e^4$ and the 2-cube $\gamma$ given by $\gamma(s,t) = (s, 2t, t, -3s)$, compute the integral $\int_\gamma\omega$ explicitly. Hint: the $dx^1(x,y,z,w)$ is just to show off. It can safely be replaced by $e^i$ for some $i$ (which $i$?).

Exercise 2. Define the gradient of a $C^1$ function $\mathbb{R}^n \supset P \xrightarrow{f} \mathbb{R}$ as $\nabla f = \sum_{i=1}^n\frac{\partial f}{\partial x^i}e_i$. Prove that for any $v \in \mathbb{R}^n$ and $p \in P$ we have $\nabla f(p)\cdot v = (df(p))(v)$. Also show that for any $C^1$ curve $\gamma$ in a level set of $f$ with $\gamma(0) = p$ the velocity vector $\gamma'(0)$ is perpendicular to $\nabla f(p)$ and is in $\ker df(p)$.

Exercise 3. Define $(0,\infty)\times(0,2\pi) \ni (r,t) \xrightarrow{\varphi} (r\cos t, r\sin t) \in \mathbb{R}^2$. Let $x, y \in (\mathbb{R}^2)^*$ be the dual basis to the standard basis and $r, t$ the dual basis in the domain of $\varphi$.

a. Compute $\varphi^*(dx)$ and $d(\varphi^* x)$ explicitly from the definitions.

b. Compute $\int_\gamma\eta$ with $\eta = r\,dt$ where $\gamma$ is the 1-cube defined by $\gamma(s) = (s,s)$.

c. Compute $\int_\alpha\omega$ with $\alpha(u) = (u\cos u, u\sin u)$ and $\omega = \frac{-y}{\sqrt{x^2+y^2}}\,dx + \frac{x}{\sqrt{x^2+y^2}}\,dy$ by expressing it as the integral of a pull-back along $\varphi$.

Note that in this exercise we follow the common abuse of notation of using the symbols $r, t$ both for the coordinates of a point and for the dual vectors reading off these coordinates. The 1-form $\eta$ defined loosely by $r\,dt$ really sends the point $p$ to $e^1(p)\,e^2 \in \Lambda^1(\mathbb{R}^2)^* = (\mathbb{R}^2)^*$.

Exercise 4. Compute $\int_\gamma\omega$ and $\int_\gamma\eta$ where $\gamma$ is the 2-cube defined by $\gamma(s,t) = (s\cos(t), \sin(t), s)$ and the 2-forms $\omega, \eta \in \Omega^2(\mathbb{R}^3)$ are defined as $\omega(x,y,z) = yz\,e^1\wedge e^3$ and $\eta(x,y,z) = yz\,e^3\wedge e^1$. What is the sum of these integrals?

Exercise 5. Prove or disprove the following statement: for $k$-forms $\omega, \eta$ and a $k$-cube $\gamma$ we have $\int_\gamma\omega + \int_\gamma\eta = \int_\gamma(\omega + \eta)$.

3.4 More on cubes and their boundary

We would like to have a version of the fundamental theorem of calculus that says: integration of the derivative over a cube is integration of the function on the boundary of that cube. For example in one dimension we have $\int_{[a,b]} f' = f(b) - f(a)$. It is tempting to write the last integral as $\int_{\{a,b\}} f$ since $\{a,b\}$ is the set of boundary points of the interval $[a,b]$, however that way we would miss the crucial minus sign. The boundary needs to be oriented so that the point $a$ carries a minus sign and $b$ a plus sign. Also the boundary of the interval is not described by a single cube but rather by two, corresponding to its two faces. Notice we are talking about cubes as if they are actual geometric cubes, but in reality our cubes are maps $[0,1]^k \to \mathbb{R}^n$. Nevertheless it still makes sense to speak about faces, using the domain of the maps.

Definition 3.4.1. (Faces) The standard $k$-cube is the identity $I^k: [0,1]^k \to \mathbb{R}^k$. The faces of the standard $k$-cube are the $(k-1)$-cubes in $\mathbb{R}^k$ indexed by $i \in \{1,\ldots,k\}$ and $\sigma \in \{0,1\}$ defined by $I^k_{(i,\sigma)}(x_1,\ldots,x_{k-1}) = I^k(x_1,\ldots,\sigma,\ldots,x_{k-1})$ with the $\sigma$ inserted in the $i$-th place. For a general $k$-cube $\gamma: [0,1]^k \to \mathbb{R}^n$ we define the faces $\gamma_{i,\sigma} = \gamma\circ I^k_{i,\sigma}$.

Figure 3.1: The standard cube I2 and its faces.

The boundary of a $k$-cube is the union of all these faces. Instead of a union we prefer to write it as a linear combination. This is convenient for keeping track of the orientations and makes sense once we start integrating over the faces: the integral over the boundary will be the sum of the integrals over the faces anyway. In this context formal combinations of cubes are referred to as chains.

Definition 3.4.2. (k-Chain) A $k$-chain is a finite formal linear combination of $k$-cubes. In other words a $k$-chain is an element of the vector space spanned by all $k$-cubes. The integral is extended to $k$-chains by $\int_{a\gamma + b\tilde\gamma}\omega = a\int_\gamma\omega + b\int_{\tilde\gamma}\omega$.

For example the boundary of the standard 1-cube $I^1$ (the interval $[0,1]$) consists of the endpoints $\{0\}$ and $\{1\}$, where the first gets a minus sign and the second a plus sign. Written as a 0-chain the boundary of $I^1$ is $-(0\mapsto\{0\}) + (0\mapsto\{1\})$, a formal sum of maps from $[0,1]^0 = \{0\}$ to $[0,1]$. Likewise the boundary of the unit square is a sum of four terms $[0,1]\times\{0\}$, $\{1\}\times[0,1]$, $[0,1]\times\{1\}$ and $\{0\}\times[0,1]$. Taking the orientations as in figure 3.1 we write the 1-chain version of the boundary of $I^2$ as $-I^2_{1,0} - I^2_{2,1} + I^2_{1,1} + I^2_{2,0}$. In general we define the boundary of the standard $k$-cube to be the $(k-1)$-chain

$$\partial I^k = \sum_{i=1}^k\sum_{\sigma\in\{0,1\}} (-1)^{i+\sigma} I^k_{i,\sigma}$$
More generally we define the boundary of a $k$-chain as follows.

Definition 3.4.3. (Boundary) The boundary of the $k$-cube $\gamma$ is $\partial\gamma = \sum_{i=1}^k\sum_{\sigma\in\{0,1\}}(-1)^{i+\sigma}\gamma_{i,\sigma}$. The boundary of the chain $\sum_i a_i\gamma_i$ is by definition $\sum_i a_i\partial\gamma_i$.

Whenever the dimensions are clear from the context we will drop the subscript and write $\partial$ for all boundary maps. The boundary of the boundary is always zero; for example in the picture we see each of the four vertices appear twice in the expression for $\partial\partial I^2$, once with a plus sign and once with a minus sign. This is true in general too.

Lemma 3.4.1. (Boundary of boundary) For all chains $\gamma$ we have $\partial\partial\gamma = 0$.

Proof. It suffices to check this for the standard $k$-cube. When $i < j$ and $\tau, \sigma \in \{0,1\}$ we have $I^k_{i,\sigma}\circ I^{k-1}_{j-1,\tau} = I^k_{j,\tau}\circ I^{k-1}_{i,\sigma}$. It follows that every $(k-2)$-cube that is the intersection of two distinct faces appears twice in the formula for $\partial\partial I^k$, with opposite signs.

This lemma is the starting point of the subject of homology in algebraic topology. For any space $X$ the $k$-th homology is the vector space spanned by all $k$-chains in $X$ that have no boundary, modulo those that are the boundary of something. By the previous lemma this definition actually makes sense. More precisely the $k$-th homology of $X$ is defined to be $H_k(X) = \ker\partial_k/\operatorname{im}\partial_{k+1}$ where the subscript indicates the dimension of the chains we are to apply $\partial$ to.
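Lemma 3.4.1 can also be verified combinatorially by machine. In the sketch below (entirely ours), a composite face of $I^k$ is encoded by its list of (position, value) insertions, outermost first; `normal_form` converts such a list into the set of absolute constant coordinate slots, so two composites represent the same map exactly when their normal forms agree.

```python
from collections import Counter

def normal_form(ins):
    """Canonical (absolute slot, value) pairs for a composite face map."""
    raw, slots = [], []
    for pos, val in ins:               # outermost insertion first
        a = pos
        for q in reversed(raw):        # shift past the insertions outside this one
            if a >= q:
                a += 1
        raw.append(pos)
        slots.append((a, val))
    return tuple(sorted(slots))

def boundary(chain, k):
    """One application of the boundary formula to a chain of faces of I^k."""
    out = Counter()
    for ins, sign in chain.items():
        for i in range(1, k + 1):
            for sigma in (0, 1):
                out[tuple(ins) + ((i, sigma),)] += sign * (-1) ** (i + sigma)
    return out

k = 4
dI = boundary(Counter({(): 1}), k)     # the chain dI^k
ddI = boundary(dI, k - 1)              # the chain ddI^k, still with raw labels
grouped = Counter()
for ins, sign in ddI.items():          # identify equal composite faces
    grouped[normal_form(ins)] += sign
print(all(c == 0 for c in grouped.values()))   # True: everything cancels
```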

Exercises

Exercise 0. Spherical coordinates (rescaled slightly) give a 2-cube $[0,1]^2 \xrightarrow{\gamma} \mathbb{R}^3$ by setting
$$\gamma(\phi,\theta) = (\cos(2\pi\phi)\sin(\pi\theta),\; \sin(2\pi\phi)\sin(\pi\theta),\; \cos(\pi\theta))$$
Write down each of the cubes making up the 1-chain $\partial\gamma$. Same question for cylindrical coordinates, i.e. the 2-cube $\gamma$ defined by $\gamma(\phi, t) = (\cos\phi, \sin\phi, t)$.

Exercise 1. Find a 4-cube $\gamma$ in $\mathbb{R}^6$ such that $\partial\gamma = 0$.

3.5 Exterior derivative

The final ingredient for the generalization of the fundamental theorem of calculus is to define the appropriate notion of derivative on differential forms, called the exterior derivative $d$. It should be appropriate in the sense that it reflects the boundary $\partial$ of chains and cubes that we defined in the previous section. In particular we would like this derivative $d$ to satisfy $dd = 0$, reflecting $\partial\partial = 0$ for chains. To be more concrete we recall that any $k$-form may be expressed in terms of the constant $k$-forms $e^I$ multiplied by functions, see Lemma 3.1.2. The simplest way to take the derivative of a differential form would be to simply take the derivative of its coordinate functions. However this does not satisfy $dd = 0$ and is also dependent on the particular basis one chooses. Instead we take the following combination of derivatives:

Definition 3.5.1. (Exterior derivative) Define the exterior derivative $d\omega$ of $\omega = \sum_I f_I e^I \in \Omega^k(P)$ to be the $(k+1)$-form $d(\sum_I f_I e^I) = \sum_I df_I\wedge e^I$.

This definition may seem to depend on the particular basis of Rn we are using but this is not the case. The following lemma shows that the exterior derivative has many good properties that make it a respectable operation:

Lemma 3.5.1. (Properties of d) Assume $\alpha, \omega \in \Omega^k(Q)$ and let $P \xrightarrow{\varphi} Q \subset \mathbb{R}^m$ be a $C^1$ map between open sets.

1. $d(s\alpha + \omega) = s\,d\alpha + d\omega$ for any $s \in \mathbb{R}$.

2. $d(f\omega) = (df)\wedge\omega + f\,d\omega$ for any $f \in \Omega^0(Q)$.

3. If $\omega$ is a $k$-form and $\eta$ an $\ell$-form then $d(\omega\wedge\eta) = d\omega\wedge\eta + (-1)^k\omega\wedge d\eta$.

4. $dd\omega = 0$ for all $k$-forms that are $C^2$.

5. ϕ∗dω = dϕ∗ω

Proof. Property 1) follows directly from the definition. For the next property we use the product rule $d(fg) = (df)g + f\,dg$. Setting $\omega = g_I e^I$ we get $d(f\omega) = d(fg_I)\wedge e^I = (df)g_I\wedge e^I + f\,dg_I\wedge e^I = (df)\wedge\omega + f\,d\omega$. The general case follows by summing over all possible $I$.

For part 3) we observe that since $de^I = 0$ the formula is correct for any $\omega = e^I$, $\eta = e^J$. The general case follows by parts 1) and 2).

By linearity and part 2), part 4) just needs to be proved for $\omega = f\eta$ with $\eta$ a constant $k$-form, so $d\eta = 0$. We know $d\omega = df\wedge\eta$. Since $df = \sum_i\frac{\partial f}{\partial x^i}e^i$ we have
$$dd\omega = d\Big(\sum_i\frac{\partial f}{\partial x^i}e^i\wedge\eta\Big) = \sum_i d\Big(\frac{\partial f}{\partial x^i}\Big)\wedge e^i\wedge\eta = \sum_{i,j}\frac{\partial^2 f}{\partial x^i\partial x^j}\,e^j\wedge e^i\wedge\eta = 0$$
because the partial derivatives commute (see Lemma 2.3.4) and $e^i\wedge e^j = -e^j\wedge e^i$.

The last property follows from the chain rule when $k = 0$. We proceed by induction on $k$. Supposing the formula is true for $k$-forms, to check it for $(k+1)$-forms it suffices to consider the case $\omega = \eta\wedge e^i$. We have, using part 3) and $de^i = 0$:

$$\varphi^* d\omega = \varphi^*(d\eta\wedge e^i) + (-1)^k\varphi^*(\eta\wedge de^i) = d\varphi^*(\eta)\wedge\varphi^* e^i = d(\varphi^*(\eta)\wedge\varphi^* e^i) = d(\varphi^*\omega)$$
Here the second equality uses the induction hypothesis, and the third uses part 3) together with $d\varphi^* e^i = dd\varphi^i = 0$ by part 4).
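Property 5 is easy to spot-check with sympy for a concrete 1-form and the polar-coordinate map; the sketch below (ours, not part of the notes) compares the $dr\wedge dt$ coefficients of $\varphi^* d\omega$ and $d\varphi^*\omega$.

```python
import sympy as sp

r, t, x, y = sp.symbols('r t x y')
X, Y = r * sp.cos(t), r * sp.sin(t)              # the two components of phi

w1, w2 = x * y, x**2                             # omega = w1 dx + w2 dy

# phi*(d omega): d omega = (dw2/dx - dw1/dy) dx ∧ dy, and pulling back a 2-form
# multiplies its coefficient (composed with phi) by the Jacobian determinant.
g = (sp.diff(w2, x) - sp.diff(w1, y)).subs({x: X, y: Y})
J = sp.Matrix([[sp.diff(X, r), sp.diff(X, t)],
               [sp.diff(Y, r), sp.diff(Y, t)]])
lhs = g * J.det()

# d(phi* omega): phi* omega = a dr + b dt with the coefficients below.
a = w1.subs({x: X, y: Y}) * sp.diff(X, r) + w2.subs({x: X, y: Y}) * sp.diff(Y, r)
b = w1.subs({x: X, y: Y}) * sp.diff(X, t) + w2.subs({x: X, y: Y}) * sp.diff(Y, t)
rhs = sp.diff(b, r) - sp.diff(a, t)

print(sp.simplify(lhs - rhs))                    # 0, confirming phi* d = d phi*
```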

Exercises

Exercise 0. Given $\omega \in \Omega^2(\mathbb{R}^4)$ defined by $\omega(p) = \cos(e^1(p) + \sin e^4(p))\,e^1\wedge e^2$ compute $d\omega$. With the understanding that $x, y, z$ are the coordinate functions on $\mathbb{R}^3$, compute $d(x^2dx + y^2dz)$.

Exercise 1. In this exercise we identify the complex plane with $\mathbb{R}^2$. Let $x, y \in (\mathbb{R}^2)^*$ be the dual basis to the standard basis of $\mathbb{R}^2$ so that $dx, dy \in \Omega^1(\mathbb{R}^2)$. If we define $dz = dx + i\,dy$ and set $f(x,y) = u(x,y) + iv(x,y)$, explain how the Cauchy-Riemann equations are equivalent to $d(f\,dz) = 0$. Compute $\int_\gamma\frac{1}{z}dz$ where $\gamma(t) = (\cos 2\pi t, \sin 2\pi t)$ is a parameterization of the unit circle. Why does your answer imply that there is no $g \in \Omega^0(\mathbb{R}^2 - \{0\})$ such that $dg = \operatorname{Im}(\frac{1}{z}dz)$?

Exercise 2. For $x, y, z$ the dual standard basis of $(\mathbb{R}^3)^*$, verify that the curl of any $C^1$ vector field $f = f_1e_1 + f_2e_2 + f_3e_3$ is computed by $d(f_1dx + f_2dy + f_3dz)$.

Exercise 3. Compute $d\omega$ where $\omega = e^1\wedge e^3$, or $\omega(x,y,z) = y\,dx\wedge dy + x\,dy\wedge dz$, or $\omega(p) = e^1(p)e^2(p)\,e^{(1,3,5,7)} + \sin(e^3(p))\,e^{(2,4,6,8)}$.

Exercise 4. Compute $d\omega$ where $\omega \in \Omega^3(\mathbb{R}^{10})$ is given as $\omega(p) = 2x^4(p)x^5(p)\,dx^1\wedge dx^2\wedge dx^{10} + \exp(x^6(p))\,dx^2\wedge dx^4\wedge dx^{10}$.

Exercise 5. Prove that for any $\omega \in \Omega^k(\mathbb{R}^n)$ we can compute $d\omega$ using only properties 1-5 of Lemma 3.5.1.

Exercise 6. Using property 5 of Lemma 3.5.1, compute $F^*\omega$ where $\mathbb{R}^3 \ni (x,y,z) \xrightarrow{F} (x, y, z, xyz, x^3 - y, 6x + 3z) \in \mathbb{R}^6$ and $\omega \in \Omega^2(\mathbb{R}^6)$ is given by $\omega(w) = (w^1 + w^2)\,dw^3\wedge dw^4 + 2\,dw^6\wedge dw^5 + w^2\,dw^1\wedge dw^2$.

3.6 The fundamental theorem of calculus (Stokes Theorem)

Finally we are ready to prove the most important part of the fundamental theorem of calculus, known as Stokes theorem. It is the part that relates the integral of the derivative to the integral on the boundary.

Theorem 3.6.1. (Fundamental theorem of calculus (Stokes Theorem)) For $\omega \in \Omega^{k-1}(P)$ and any $k$-chain $\gamma$ we have
$$\int_\gamma d\omega = \int_{\partial\gamma}\omega$$

Proof. We start with a proof of the simplest case where $\gamma$ is the standard $k$-cube $I^k$ in $\mathbb{R}^k$. Also, assume $\omega = f e^J$ for some increasing $(k-1)$-tuple $J$ excluding a single index $j$ from $1,\ldots,k$. In that case we can explicitly compute the left hand side (below we comment on what was done at each step):
$$\int_{I^k} d\omega = \int_{I^k} df\wedge e^J = \int_{[0,1]^k}(-1)^{j-1}\partial_j f(\cdot)\,e^{(1\ldots k)} = (-1)^{j-1}\int_{[0,1]^k}\partial_j f \tag{3.1}$$
$$= (-1)^{j-1}\int_{[0,1]^{k-1}}\Big(\int_{[0,1]}\partial_j f\Big) = (-1)^{j-1}\int_{[0,1]^{k-1}}\big(f\circ I^k_{j,1} - f\circ I^k_{j,0}\big) \tag{3.2}$$

In the first step we computed the exterior derivative using the definition. In the second step we expanded $df = \sum_i\frac{\partial f}{\partial x^i}e^i$ and used $e^i\wedge e^J = \delta_{ij}(-1)^{j-1}e^{(1\ldots k)}$. In the third step we used the theorem of Fubini to first integrate in the $j$ direction. In the fourth step we carried out the integral in the $j$ direction using the one-dimensional fundamental theorem of calculus. Next we turn to the right hand side and compute:

$$\int_{\partial I^k}\omega = \sum_{i,\sigma}(-1)^{i+\sigma}\int_{I^k_{i,\sigma}}\omega = \sum_{i,\sigma}(-1)^{i+\sigma}\int_{[0,1]^{k-1}}(I^k_{i,\sigma})^*\omega = \sum_{\sigma=0}^{1}(-1)^{j+\sigma}\int_{[0,1]^{k-1}} f\circ I^k_{j,\sigma} \tag{3.3}$$

The first step is the definition of the boundary of the standard cube. The second equality is the definition of the integral. For the third, note that $(I^k_{i,\sigma})^* e^J = 0$ whenever $i \neq j$: in that case $i \in J$ while the $i$-th coordinate of $I^k_{i,\sigma}$ is constant, so only the faces with $i = j$ survive. This explains the final equality and finishes the proof of the special case.

Finally we explain how the general case reduces to the case we just treated. First we may reduce to the case of $k$-cubes $\gamma$ since both $\partial$ and the integral are additive. Given any $(k-1)$-form $\omega$ we have
$$\int_\gamma d\omega = \int_{I^k}\gamma^*(d\omega) = \int_{I^k} d(\gamma^*\omega) = \int_{\partial I^k}\gamma^*\omega = \int_{\partial\gamma}\omega$$

The second equality is property 5 of Lemma 3.5.1. The third equality comes from expressing $\gamma^*\omega$ as a sum $\sum_J f_J e^J$ of instances of the special case treated above.

As a simple example consider $\eta \in \Omega^2(\mathbb{R}^3)$ given by $\eta(x,y,z) = -x\,e^2\wedge e^3 + z\,e^1\wedge e^2$ and the 2-cube $\gamma(s,t) = (\cos 2\pi s)e_1 + (\sin 2\pi s)e_2 + te_3$. The integral $\int_\gamma\eta$ is simplified once we notice that $\eta = d\omega$ with $\omega(x,y,z) = zx\,e^2$. Using Stokes we get
$$\int_\gamma\eta = \int_{\partial\gamma}\omega = -\int_{\gamma_{1,0}}\omega + \int_{\gamma_{1,1}}\omega + \int_{\gamma_{2,0}}\omega - \int_{\gamma_{2,1}}\omega.$$
The first two integrals cancel because $\gamma_{1,0} = \gamma_{1,1}$. The third integral is 0 because $\omega(\gamma_{2,0}) = 0$, and finally the last term contributes $-\pi$ (exercise!), so $\int_\gamma\eta = -\pi$.
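A numeric cross-check of this example (our own sketch; it assumes numpy and scipy are available) integrates $\eta$ over $\gamma$ directly from the definition and recovers $-\pi$.

```python
import numpy as np
from scipy.integrate import dblquad

def gamma(s, t):
    return np.array([np.cos(2 * np.pi * s), np.sin(2 * np.pi * s), t])

def pullback(s, t, h=1e-6):
    # partial derivatives of gamma by central differences
    ds = (gamma(s + h, t) - gamma(s - h, t)) / (2 * h)
    dt = (gamma(s, t + h) - gamma(s, t - h)) / (2 * h)
    x, y, z = gamma(s, t)
    # evaluate eta = -x e^2∧e^3 + z e^1∧e^2 on the pair (dgamma/ds, dgamma/dt)
    return (-x * (ds[1] * dt[2] - ds[2] * dt[1])
            + z * (ds[0] * dt[1] - ds[1] * dt[0]))

val, err = dblquad(lambda t, s: pullback(s, t), 0, 1, 0, 1)
print(val, -np.pi)    # both approximately -3.14159
```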

The usual integral theorems of Gauss and Stokes from vector analysis in $\mathbb{R}^3$, mentioned at the beginning of the chapter, follow directly from our more general Stokes theorem. Using the inner product one identifies both the 1- and 2-forms with ordinary vector fields in $\mathbb{R}^3$. Under this identification gradient, curl and divergence are all instances of the exterior derivative, so our Stokes theorem can be applied.

Exercises

Exercise 0. (Cauchy meets Stokes) Suppose $\mathbb{C} \xrightarrow{f} \mathbb{C}$ is complex differentiable. Setting $u(x,y) = \operatorname{Re} f(x+iy)$ and $v(x,y) = \operatorname{Im} f(x+iy)$ we find two functions $\mathbb{R}^2 \xrightarrow{u,v} \mathbb{R}$ that are differentiable and satisfy the Cauchy-Riemann equations $\partial_1 u = \partial_2 v$ and $\partial_2 u = -\partial_1 v$. We previously showed that working formally with the definition $dz = dx + i\,dy$ we get $d(f\,dz) = 0$. What actually makes sense in our definitions is the statement that $d\omega = 0$ and $d\eta = 0$ where $\omega = u\,dx - v\,dy$ and $\eta = v\,dx + u\,dy$. Now show that for any $[0,1] \xrightarrow{\gamma} \mathbb{C}$
$$\int_\gamma f\,dz = \int_{\tilde\gamma}\omega + i\int_{\tilde\gamma}\eta$$

where $[0,1] \xrightarrow{\tilde\gamma} \mathbb{R}^2$ is the real equivalent of $\gamma$ defined by $\tilde\gamma(t) = \operatorname{Re}\gamma(t)\,e_1 + \operatorname{Im}\gamma(t)\,e_2$. The left hand integral is defined as in complex analysis class:
$$\int_\gamma f\,dz = \int_{[0,1]}\operatorname{Re}\big(\gamma'(t)f(\gamma(t))\big)\,dt + i\int_{[0,1]}\operatorname{Im}\big(\gamma'(t)f(\gamma(t))\big)\,dt$$

Finally use Stokes to prove Cauchy's theorem $\int_\gamma f\,dz = 0$, provided that $\tilde\gamma = \partial B$ for some 2-cube $B$ in $\mathbb{R}^2$. You can also prove the residue theorem this way.

Exercise 1. Consider the 2-cube $\gamma$ in $\mathbb{R}^2$ defined by $\gamma(s,t) = (hs, ht)$. For a $C^1$ function $\mathbb{R}^2 \xrightarrow{f} \mathbb{R}$ and the 1-form $\omega = f\,dy$, prove that $\int_{\partial\gamma}\omega = h\int_0^1\big(f(h, ht) - f(0, ht)\big)\,dt$. Show that $d\omega = g\,dx\wedge dy$ and that $\lim_{h\to 0} h^{-2}\int_\gamma d\omega = g(0)$. Conclude from Stokes' theorem that $g(0) = \partial_1 f(0)$ without using the definition of $d$. If we wanted we could actually define $d$ along these lines.

Exercise 2. Let $\omega, \eta \in \Omega^k(P)$ and let $\gamma$ be a $k$-cube such that $d\omega = d\eta$. Is it true that $\int_\gamma\omega = \int_\gamma\eta$?

Exercise 3. Define $\theta \in \Omega^2(\mathbb{R}^3 - \{0\})$ by
$$\theta(p) = \frac{e^1(p)\,e^2\wedge e^3 + e^2(p)\,e^3\wedge e^1 + e^3(p)\,e^1\wedge e^2}{|p|^3}.$$
Show that $d\theta = 0$ and also $\int_\gamma\theta = -4\pi$ where $\gamma(s,t) = (\cos 2\pi s\sin\pi t, \sin 2\pi s\sin\pi t, \cos\pi t)$ defines a 2-cube $\gamma$. Explain why there cannot be a 3-chain $\beta$ such that $\partial\beta = \gamma$.

3.7 Fundamental theorem of calculus: Poincar´elemma

The Stokes theorem is only one part of the fundamental theorem of calculus: the part telling us how to integrate the derivative. The other half is known as the Poincaré lemma. It provides a way to find a primitive, that is, to write your integrand as a derivative. More specifically it answers the question whether or not $\omega \in \Omega^k(P)$ is of the form $\omega = d\alpha$ for some $\alpha \in \Omega^{k-1}(P)$. We call $\alpha$ the primitive or potential. If $d\omega \neq 0$ then we cannot have $\omega = d\alpha$, because then $0 = dd\alpha = d\omega \neq 0$ by Lemma 3.5.1. So a necessary condition for finding a primitive is $d\omega = 0$. The Poincaré lemma states that when the domain $P$ is simple enough this condition is actually sufficient.

Theorem 3.7.1. (Poincaré lemma) Suppose $P \subset \mathbb{R}^n$ is an open set such that for any $p \in P$, $P$ contains the line segment connecting $0$ and $p$. If $\omega = \sum_I\omega_I e^I \in \Omega^k(P)$ is a $k$-form with $d\omega = 0$ then there exists a $(k-1)$-form $\alpha$ on $P$ with $d\alpha = \omega$, namely

$$\alpha(x) = \sum_I\Big(\int_0^1 t^{k-1}\omega_I(tx)\,dt\Big)\sum_{g=1}^k(-1)^{g-1}x^{i_g}\,e^{I-\{i_g\}}$$
where $x = (x^1,\ldots,x^n) \in P$ and $I = (i_1,\ldots,i_k)$.

Proof. The condition $d\omega = 0$ means that we have $\frac{\partial}{\partial x^u}\omega_I = \sum_{g=1}^k(-1)^{g-1}\frac{\partial}{\partial x^{i_g}}\omega_{I|i_g\mapsto u}$. To show that $d\alpha = \omega$ it suffices to prove that the coefficient of $e^I$ in $d\alpha$ is $\omega_I$. Differentiating the $x^{i_g}$ for $g = 1,\ldots,k$ gives the contribution $k\int_0^1 t^{k-1}\omega_I(tx)\,dt$, and differentiating the integral gives

$$\sum_u\sum_{g=1}^k\int_0^1 t^k(-1)^{g-1}\Big(\frac{\partial}{\partial x^{i_g}}\omega_{I|i_g\mapsto u}\Big)(tx)\,x^u\,dt = \sum_u\int_0^1 t^k\Big(\frac{\partial}{\partial x^u}\omega_I\Big)(tx)\,x^u\,dt \tag{3.4}$$
$$= \int_0^1 t^k\sum_u x^u\Big(\frac{\partial}{\partial x^u}\omega_I\Big)(tx)\,dt = \int_0^1 t^k\,\frac{d}{dt}\omega_I(tx)\,dt \tag{3.5}$$

Together the two contributions form precisely $\int_0^1\frac{d}{dt}\big(t^k\omega_I(tx)\big)\,dt = \omega_I(x)$, finishing the proof.

Lemma 3.7.1. (Invariance under reparameterization of cubes) Suppose $[0,1]^k \xrightarrow{\varphi} [0,1]^k$ is a $C^1$ function such that $\partial\varphi = \partial I^k$. Then $\int_{\gamma\circ\varphi}\omega = \int_\gamma\omega$ for any $k$-cube $\gamma$ that maps into $P \subset \mathbb{R}^n$ and any $\omega \in \Omega^k(P)$.

Proof. Find $\alpha \in \Omega^{k-1}([0,1]^k)$ such that $d\alpha = \gamma^*\omega$ using the Poincaré lemma on $[0,1]^k$. This is possible since $\gamma^*\omega$ is a $k$-form on a $k$-dimensional space, so $d(\gamma^*\omega) = 0$. Then using Stokes we find
$$\int_{\gamma\circ\varphi}\omega = \int_\varphi\gamma^*\omega = \int_\varphi d\alpha = \int_{\partial\varphi}\alpha = \int_{\partial I^k}\alpha = \int_{I^k}d\alpha = \int_{I^k}\gamma^*\omega = \int_\gamma\omega.$$
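Before the exercises, here is a small sympy sketch (ours; compare Exercise 0 below) of the primitive formula in the easiest case $k = 1$, where for a closed 1-form $\omega = p\,dx + q\,dy$ on $\mathbb{R}^2$ it reduces to $\alpha(x,y) = \int_0^1\big(p(tx,ty)\,x + q(tx,ty)\,y\big)\,dt$.

```python
import sympy as sp

x, y, t = sp.symbols('x y t')

# omega = p dx + q dy, chosen closed: diff(q, x) == diff(p, y)
p = 2 * x * y
q = x**2 + 1

# the Poincare primitive for k = 1
alpha = sp.integrate(p.subs({x: t * x, y: t * y}, simultaneous=True) * x
                     + q.subs({x: t * x, y: t * y}, simultaneous=True) * y,
                     (t, 0, 1))

print(sp.expand(alpha))                      # x**2*y + y
print(sp.simplify(sp.diff(alpha, x) - p))    # 0, so d alpha = omega
print(sp.simplify(sp.diff(alpha, y) - q))    # 0
```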

Exercises

Exercise 0. Set $k = 1$ in the proof of the Poincaré lemma and make all the steps as explicit as you can.

Exercise 1. For each of the $k$-forms $\omega$ below either find an $\alpha \in \Omega^{k-1}$ such that $d\alpha = \omega$ or prove that it cannot be done.

a. $\omega \in \Omega^2(\mathbb{R}^2 - \{0\})$ given by $\omega(p) = e^1\wedge e^2$.

b. $\omega \in \Omega^3(\mathbb{R}^4 - \{0\})$ given by $\omega(p) = e^1(p)\,e^1\wedge e^2\wedge e^4 + e^1(p)e^2(p)\,e^2\wedge e^3\wedge e^4$.

c. $\omega \in \Omega^1(\mathbb{R}^2 - \{0\})$ given by $\omega(p) = \dfrac{-e^2(p)\,e^1 + e^1(p)\,e^2}{|p|^2}$.

d. $\omega \in \Omega^1((-1,1)^2)$ given by $\omega(p) = \dfrac{-e^2(p-q)\,e^1 + e^1(p-q)\,e^2}{|p-q|^2}$ where $q = (2,2)$.