
Introduction to Manifolds

Roland van der Veen

2018

Contents

1 Introduction
  1.1 Overview

2 How to solve equations?
  2.1 Linear algebra
  2.2 Derivative
  2.3 Intermediate and mean value theorems
  2.4 Implicit function theorem

3 Is there a fundamental theorem of calculus in higher dimensions?
  3.1 Elementary Riemann integration
  3.2 k-vectors and k-covectors
  3.3 (co)-vector fields and integration
    3.3.1 Integration
  3.4 More on cubes and their boundary
  3.5 ......
  3.6 The fundamental theorem of calculus (Stokes Theorem)
  3.7 Fundamental theorem of calculus: Poincaré lemma

4 Geometry through the dot product
  4.1 Vector spaces with a scalar product
  4.2 Riemannian geometry

5 What if there is no good choice of coordinates?
  5.1 Atlasses and manifolds
  5.2 Examples of manifolds
  5.3 Analytic continuation
  5.4 Bump functions and partitions of unity
  5.5 Vector bundles
  5.6 The fundamental theorem of calculus on manifolds
  5.7 Geometry on manifolds

Chapter 1

Introduction

1.1 Overview

The goal of these notes is to explore the notions of differentiation and integration in a setting where there are no preferred coordinates. Manifolds provide such a setting. This is not a calculus course. We made an attempt to prove everything we say, so that no black boxes have to be accepted on faith. This self-sufficiency is one of the great strengths of mathematics. Sometimes mathematics texts start by giving answers, neglecting to properly state the questions they were meant to answer. We will try to motivate concepts and illustrate definitions. In turn the reader is asked to at least try some of the exercises. Doing exercises (and possibly failing some) is an essential part of mathematics. At the beginning of the Dutch national masters program in mathematics there is a one-day 'intensive reminder on manifolds' course that consists of the following topics:

1. definition of manifolds*,

2. tangent and cotangent bundles*,

3. vector bundles*,

4. differential forms and exterior derivative*,

5. flows and Lie derivative,

6. Cartan calculus,

7. integration and Stokes theorem*.

8. Frobenius theorem.

Since this is an introductory course we only treat the topics marked *. To illustrate our techniques we will touch upon some concepts in Riemannian, complex and symplectic geometry. More systematically, these lecture notes consist of five chapters, the first of which is this introduction. In chapter two we start by studying non-linear systems of equations by approximating them by linear ones, leading to the implicit function theorem. Basically it says that the solution

set looks like that of the linearized system in good cases. Along the way we develop a suitably general notion of derivative. The corresponding notion of integration is developed in the next chapter. This is more involved, as it requires defining new kinds of objects as the natural integrands (covector fields, differential forms). Differentiation and integration are connected by a generalization of the fundamental theorem of calculus (Stokes theorem) and the Poincaré lemma. In chapter four we briefly explore how our techniques are useful in setting up various kinds of geometry. In the final chapter we will show how to lift all the theory we developed so far to the context of manifolds. Basically a manifold is just several pieces of $\mathbb{R}^n$ linked by coordinate transformations. Given that we set up our theory in a way that makes coordinate transformations easy to deal with, most local aspects of the theory are no different from the way they are in $\mathbb{R}^n$. Most of the material is standard and can be found in references such as Calculus on Manifolds by M. Spivak or Introduction to Smooth Manifolds by J.M. Lee. However the proofs presented here are simplified and streamlined significantly. This especially goes for the proofs of the implicit function theorem, Stokes theorem and the Poincaré lemma. I tried to motivate the use of exterior calculus more than usual, while limiting its algebraic preliminaries. Throughout the text I try to write functions as $A \ni a \stackrel{f}{\longmapsto} a^2 + a + 1 \in B$, instead of $f: A \to B$ defined by $f(a) = a^2 + a + 1$.

Acknowledgement. Much of this material is presented in a way inspired by the work of my former master student Jules Jacobs. I would also like to thank Kevin van Helden for his helpful comments, exercises and excellent teaching assistance over the years.

Chapter 2

How to solve equations?

Postponing the formal definition until chapter 5, manifolds often arise as solution sets to equations. In this preliminary chapter we explore under what conditions a system of n real equations in k + n variables can be solved. Naively one may hope that each equation can be used to determine one variable, so that in the end k variables are left undetermined and all others are functions of those. For example consider the two systems of two equations, on the left and on the right (k = 1, n = 2):

$$x + y + z = 0 \qquad\qquad \sin(x + y) - \log(1 - z) = 0 \qquad (2.1)$$
$$-x + y + z = 0 \qquad\qquad e^y - \frac{1}{1 - x + z} = 0 \qquad (2.2)$$

The system on the left is linear and easy to solve: we get x = 0 and y = −z.

Figure 2.1: Solutions to the two systems. The yellow surface is the solution to the first equation, blue the second. The positive x, y, z axes are drawn in red, green, blue respectively.

Figure 2.2: Some random level sets.

The system on the right is hard to solve explicitly but looks very similar near (0, 0, 0), since $\sin(x + y) \approx x + y$ and $\log(1 - z) \approx -z$ near zero. We will be able to show that, just like in the linear situation, a curve of solutions passes through the origin. The key point is that the derivative of the complicated-looking functions at the origin is precisely the linear system shown on the left. We will look at equations involving only differentiable functions. This means that locally they can be approximated well by linear functions. The goal of the chapter is to prove the implicit function theorem. Basically it says that the linear approximation decides whether or not a system of equations is solvable locally. This is illustrated in the figures above. Later in the course solutions to equations will be an important source of examples of manifolds. Even the solution set to a single equation in three unknowns can take many forms. See for example figure 2.2, where we generated random equations and plotted the solution set.
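To make the comparison concrete, here is a small numerical sketch (an addition, not part of the notes): we fix a small value of z, solve the nonlinear system for (x, y) by Newton's method, and observe that the solution is close to the linear prediction x = 0, y = −z. We read the second equation as $e^y - 1/(1 - x + z) = 0$, the sign choice for which its linearization at the origin matches the linear system on the left.

```python
import math

# Solve the nonlinear system (2.1)-(2.2) for (x, y) with z held fixed,
# using Newton's method with a hand-coded Jacobian.
def F(x, y, z):
    return (math.sin(x + y) - math.log(1 - z),
            math.exp(y) - 1 / (1 - x + z))

def jacobian(x, y, z):
    # Partial derivatives with respect to x and y only (z is a parameter).
    return [[math.cos(x + y),       math.cos(x + y)],
            [-1 / (1 - x + z)**2,   math.exp(y)]]

def newton(z, x=0.0, y=0.0, steps=20):
    for _ in range(steps):
        f1, f2 = F(x, y, z)
        (a, b), (c, d) = jacobian(x, y, z)
        det = a * d - b * c
        # Solve J * (dx, dy) = (f1, f2) by Cramer's rule and step backwards.
        x -= (d * f1 - b * f2) / det
        y -= (a * f2 - c * f1) / det
    return x, y

z = 0.1
x, y = newton(z)
print(x, y)   # close to the linear prediction (0, -z) = (0, -0.1)
```

For z = 0.1 the nonlinear solution differs from the linear one only at the second decimal, as the linear approximation suggests.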

Exercises

Exercise 1. (Three holes)
Give a single equation in three unknowns such that the solution set is a bounded subset of $\mathbb{R}^3$, looks smooth and two-dimensional everywhere and has a hole. Harder: can you increase the number of holes to three?

2.1 Linear algebra

The basis for our investigation of equations is the linear case. Linear equations can neatly be summarized in terms of a single matrix equation $Av = b$. Here $v$ is a vector in $\mathbb{R}^{k+n}$, $b \in \mathbb{R}^n$ and $A$ is an $n \times (k+n)$ matrix. In case $b = 0$ we call the equation homogeneous and the solution set is some linear subspace $S = \{v \in \mathbb{R}^{k+n} \mid Av = 0\}$, the kernel of the map defined by $A$. In general, given a single solution $p \in \mathbb{R}^{k+n}$ such that $Ap = b$, the entire solution set $\{v \in \mathbb{R}^{k+n} \mid Av = b\}$ is the affine linear subspace $S + p = \{s + p \mid s \in S\}$.

In discussing the qualitative properties of linear equations it is more convenient to think in terms of linear maps. Most of this material should be familiar from linear algebra courses but we give a few pointers here to establish notation and emphasize the important points. With some irony, the first rule of linear algebra is YOU DO NOT PICK A BASIS; the second rule of linear algebra is YOU DO NOT PICK A BASIS. In this section $W, V$ will always be real vector spaces of finite dimensions $m$ and $n$. Of course $W$ is isomorphic to $\mathbb{R}^m$, and choosing such an isomorphism $b: \mathbb{R}^m \to W$ means choosing a basis. Usually we write $e_i$ for the standard basis of $\mathbb{R}^m$ and abbreviate $b(e_i) = b_i$. We may then write vectors $w \in W$ as $w = \sum_i w^i b_i$. However we do not want to pin ourselves down to a specific basis, since that makes it harder to switch between various interpretations of $W$ as a space of directions, complex numbers or linear transformations, holomorphic functions and so on.

A relevant example is the set of all linear maps from $V$ to $W$, denoted $L(V, W)$; it is a vector space in its own right. If we set $V = \mathbb{R}^n$ and $W = \mathbb{R}^m$ then $\varphi \in L(V, W)$ could be described by a matrix $\varphi^i_j$ defined by $\varphi e_j = \sum_i \varphi^i_j e_i$. However the matrix might look easier with respect to another basis, so we prefer to keep $V, W$ abstract and describe $\varphi$ using bases $c: \mathbb{R}^n \to V$ and $b: \mathbb{R}^m \to W$. With respect to these bases the matrix of $\varphi \in L(V, W)$ is defined to be $(b^{-1} \circ \varphi \circ c)^i_j$.
So $\varphi c_j = \sum_i \varphi^i_j b_i$.

An important special case of the previous is the dual space $V^* = L(V, \mathbb{R})$. Evaluation gives a bilinear map $V^* \times V \ni (f, v) \stackrel{\mathrm{ev}}{\longmapsto} f(v) \in \mathbb{R}$. In general a bilinear map $B$ is called non-degenerate if ($\forall f: B(f, v) = 0$) implies $v = 0$ and ($\forall v: B(f, v) = 0$) implies $f = 0$. In our case $\mathrm{ev}$ actually is non-degenerate. This also allows us to set up a basis $(b^i)$ of $V^*$ dual to any basis $(b_i)$ of $V$ by requiring $b^i(b_j) = \delta^i_j$. Here $\delta^i_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$ is the Kronecker delta.

A useful feature of the dual space is the pull-back (also known as transpose). Given $f \in L(V, W)$ there is a map $f^* \in L(W^*, V^*)$ defined by $f^* \varphi = \varphi \circ f$.

To better understand determinants we use Gaussian elimination to compute them. This can be implemented by the elementary operations $R^c_{ij}: \mathbb{R}^n \to \mathbb{R}^n$ defined by $R^c_{ij} = I + c\, e_i e^j$ for $i \neq j$. We have $\det R^c_{ij} = 1$, and for a square matrix $A$ the matrix $A R^c_{ij}$ is the result of adding $c$ times column $i$ to column $j$. Likewise $R^c_{ij} A$ is the result of adding $c$ times row $j$ to row $i$. Using these operations it is possible to interchange two rows or columns (exercise!).

Lemma 1. (Gaussian elimination)
Any $n \times n$ matrix $A$ can be written as a product $A = E D \tilde{E}$ where $D$ is diagonal and $E, \tilde{E}$ are products of $R^c_{ij}$. In particular $\det A = \det D$.

Proof. Induction on the size $n$. For $n = 1$ this is clear. For the induction step we consider an $n \times n$ matrix $A$. Unless all entries in the final row and column are zero, we may use elementary operations to make $A_{nn} \neq 0$. In either case we can use more elementary operations to make all off-diagonal entries in the final row and column equal to 0. By induction we can do the same for the $(n-1) \times (n-1)$ block.

Just like one can compute with integers modulo $n$, one can also compute with vectors modulo some subspace. Given a subspace $U \subset V$ this means that we compute with equivalence classes: $v \equiv v' \pmod{U}$ if $v - v' \in U$. The result is again a vector space called the quotient vector space $V/U$.
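The lemma's algorithm can be sketched in code (an added illustration): using only the operations $R^c_{ij}$, which all have determinant 1, we reduce a matrix and read off the determinant from the diagonal.

```python
def det_by_elimination(A):
    """Determinant using only the operations R_ij^c of the text
    (add c times one row to another), each of which has determinant 1.
    Clearing below the diagonal already gives det A as the product of
    the diagonal; clearing above it as well would yield the diagonal
    matrix D of the lemma."""
    A = [row[:] for row in A]
    n = len(A)
    for k in range(n):
        if A[k][k] == 0:
            # Create a non-zero pivot by adding a lower row (mimicking a
            # row interchange, as in the exercise); if none exists, det = 0.
            for i in range(k + 1, n):
                if A[i][k] != 0:
                    for j in range(n):
                        A[k][j] += A[i][j]
                    break
            else:
                return 0.0
        for i in range(k + 1, n):
            c = A[i][k] / A[k][k]
            for j in range(n):
                A[i][j] -= c * A[k][j]
    p = 1.0
    for k in range(n):
        p *= A[k][k]
    return p

print(det_by_elimination([[0, 1], [1, 0]]))                   # -1.0
print(det_by_elimination([[2, 1, 0], [1, 3, 1], [0, 1, 2]]))  # close to 8
```

Note how a row swap (determinant −1) is simulated by a row addition followed by elimination, exactly the trick asked for in the exercise above.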

Exercises

2.2 Derivative

Now that we understand linear functions, we would like to use this to study more general functions $\mathbb{R}^m \supset P \stackrel{f}{\to} \mathbb{R}^n$, where unless stated otherwise $P$ is always a non-empty open subset of $\mathbb{R}^m$. The key idea is to locally approximate non-linear objects by linear ones. In this case at every point $p \in P$ we are looking for the linear map $f'(p) \in L(\mathbb{R}^m, \mathbb{R}^n)$ best approximating $f$ close to $p$.

Since we are approximating, some specialized notation is useful. For functions $f, g: \mathbb{R}^m \to \mathbb{R}^n$ we define $f = o(g)$ to mean $\lim_{h \to 0} \frac{|f(h)|}{|g(h)|} = 0$; intuitively $f$ goes to zero faster than $g$. For example $e^h - 1 - h = o(h)$. We often use the triangle inequality to show that $f = o(h)$ and $g = o(h)$ implies $f + g = o(h)$ (Exercise!). Although our notation may be a little unfamiliar, the picture is just like the familiar one-dimensional picture, see figure 2.3.

Figure 2.3: The derivative $D$ at $p$ is the linear map that best approximates $f$ at the point $p$.

Definition 1. (Differentiability)
A map $\mathbb{R}^m \supset P \stackrel{f}{\to} \mathbb{R}^n$ is called differentiable at $p \in P$ if there exists a linear map $D \in L(\mathbb{R}^m, \mathbb{R}^n)$ such that for any $h \in \mathbb{R}^m$ converging to 0:

$$f(p + h) = f(p) + Dh + \varepsilon_{f,D,p}(h) \quad \text{with} \quad \varepsilon_{f,D,p}(h) = o(h) \qquad (2.3)$$

When $f$ is differentiable for all $p \in P$ we say $f$ is differentiable.

For example take $\mathbb{R}^2 \ni (x, y) \stackrel{f}{\longmapsto} x^2 - y^2 \in \mathbb{R}$ and $p = (0, 1)$. In this case we may take $D \in L(\mathbb{R}^2, \mathbb{R})$ to be given by the matrix $(0, -2)$ with respect to the standard bases. To see that this works we set $h = (k, \ell)$ and show that the error $\varepsilon_{f,D,p}(h) = f(p + h) - f(p) - f'(p)(h)$ goes to zero faster than $h$ does.

$$\varepsilon_{f,D,p}(h) = f(k, \ell + 1) - f(0, 1) + 2\ell = k^2 - (\ell + 1)^2 + 1 + 2\ell = k^2 - \ell^2$$

So as promised $\frac{|\varepsilon_{f,D,p}(h)|}{|h|} = \frac{|k^2 - \ell^2|}{\sqrt{k^2 + \ell^2}} \leq \frac{k^2 + \ell^2}{\sqrt{k^2 + \ell^2}} = |h|$. Taking the limit $h \to 0$ shows $D$ satisfies equation (2.3).

Provided it exists, the linear approximation $D$ above is actually unique. It therefore deserves a special name: the derivative of $f$ at $p$, written $f'(p)$.

Definition 2. (Derivative)
If $f$ is differentiable at $p$ then the derivative of $f$ at $p$, written $f'(p) \in L(\mathbb{R}^m, \mathbb{R}^n)$, is the unique linear map satisfying (2.3).

Proof. (Of uniqueness). Suppose another $A \in L(\mathbb{R}^m, \mathbb{R}^n)$ also satisfies (2.3). Subtracting these two equations gives $(D - A)h = \varepsilon_{f,A,p}(h) - \varepsilon_{f,D,p}(h) = o(h)$. Setting $h = tw$ with $w \in \mathbb{R}^m$ and $t \in \mathbb{R}$ non-zero shows that $\lim_{t \to 0} \frac{(D - A)(tw)}{t} = (D - A)w = 0$, so that $D = A$.
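As a quick numerical sanity check of the example above (an added sketch, not in the original text), the error ratio $|\varepsilon_{f,D,p}(h)|/|h|$ can be evaluated for shrinking $h$:

```python
import math

# f(x, y) = x^2 - y^2 at p = (0, 1) with candidate derivative D = (0, -2):
# the ratio |eps(h)| / |h| from (2.3) should tend to 0 with h.
f = lambda x, y: x**2 - y**2

def error_ratio(k, l):
    # eps(h) = f(p + h) - f(p) - Dh  for h = (k, l)
    eps = f(0 + k, 1 + l) - f(0, 1) - (0 * k + (-2) * l)
    return abs(eps) / math.hypot(k, l)

for t in [1e-1, 1e-2, 1e-3]:
    print(error_ratio(t, 2 * t))   # shrinks linearly in t, like |h|
```

Along the direction h = (t, 2t) the error is exactly $3t^2$, so the ratio is $3t/\sqrt{5}$, in line with the bound $|h|$ computed above.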

For functions $\mathbb{R} \stackrel{f}{\to} \mathbb{R}$ our definition of the derivative $f'(p)$ is just a complicated reformulation of the usual definition. Actually the matrix of the derivative with respect to the standard bases is just the matrix of partial derivatives. In the above example the linear map $D$ is just $(\frac{\partial f}{\partial x}(p), \frac{\partial f}{\partial y}(p)) = (0, -2)$. This and much more will follow from the next theorem.

Theorem 1. (Properties of the derivative)

1. (Chain rule). Given functions $\mathbb{R}^k \supset Q \stackrel{f}{\to} P \subset \mathbb{R}^\ell$ and $\mathbb{R}^\ell \supset P \stackrel{g}{\to} \mathbb{R}^m$, differentiable at $q \in Q$ and $f(q) \in P$ respectively, we have $(g \circ f)'(q) = g'(f(q)) f'(q)$.

2. If $f$ is constant then $f' = 0$.

3. If $f \in L(\mathbb{R}^k, \mathbb{R}^\ell)$ then $f'(q) = f$ for all $q \in \mathbb{R}^k$.

4. For any basis $(b_i)$ of $\mathbb{R}^\ell$ the function $f = \sum_i f_i b_i$ is differentiable at $q$ if and only if the component functions $P \stackrel{f_i}{\to} \mathbb{R}$ are, and in that case $f'(q)(v) = \sum_i (f_i)'(q)(v)\, b_i$.

Proof. Part 1 (Chain rule). Set $p = f(q)$. For the chain rule it suffices to show that the linear map $g'(p) f'(q) \in L(\mathbb{R}^k, \mathbb{R}^m)$ satisfies equation (2.3). We know that $f(q + h) = p + f'(q)h + \varepsilon_{f,q}(h)$ and $g(p + k) = g(p) + g'(p)k + \varepsilon_{g,p}(k)$. Combining those we can approximate $(g \circ f)(q + h) =$

$$g(p + f'(q)h + \varepsilon_{f,q}(h)) = g(p) + g'(p)k + \varepsilon_{g,p}(k) = g(p) + g'(p) f'(q) h + \varepsilon_{(g \circ f),q}(h)$$

where we set $k = f'(q)h + \varepsilon_{f,q}(h)$ and $\varepsilon_{(g \circ f),q}(h) = g'(p)\varepsilon_{f,q}(h) + \varepsilon_{g,p}(k) = A + B$. Now we need to show that $\varepsilon_{(g \circ f),q}(h) = o(h)$ as $h \to 0$. In fact $A = o(h)$ and $B = o(h)$. For $A$ this follows from the differentiability of $f$ and continuity of the linear map $g'(p)$. For $B$ we use differentiability of $g$ to see that for any $\alpha > 0$ we have $|\varepsilon_{g,p}(k)| < \alpha |k|$ whenever $k$ is suitably small. So $\frac{1}{|h|} |\varepsilon_{g,p}(k(h))| < \alpha \frac{1}{|h|} |f'(q)h + \varepsilon_{f,q}(h)| < C\alpha$ for some constant $C$, showing that $B(h) = o(h)$. Here we used differentiability of $f$ once more.

Part 2 follows directly from the definition and uniqueness of the derivative because by assumption $f(p + h) = f(p) + 0h + 0$ and $0 = o(h)$.

Part 3. For the same reason we may use linearity to write $f(p + h) = f(p) + f(h) + 0$.

Part 4. Suppose $f$ is differentiable at $p$; then by the chain rule (part 1) so is $f_i = \pi_i \circ f$, where $\pi_i$ is the projection onto the $i$-th basis vector $b_i$ (a linear map, using part 3). Conversely, suppose all the functions $f_i$ are differentiable at $p$; then $f_i(p + h) = f_i(p) + f_i'(p)h + \varepsilon(f_i, p, h)$. Now $f(p + h) = \sum_i f_i(p + h) b_i = \sum_i (f_i(p) + f_i'(p)h + \varepsilon(f_i, p, h)) b_i = f(p) + \sum_i f_i'(p)h\, b_i + \sum_i \varepsilon(f_i, p, h) b_i$. Since $\sum_i \varepsilon(f_i, p, h) = o(h)$ we find that indeed $f'(p) = \sum_i f_i'(p) b_i$.

In some sense this is all we need to know about differentiation, except perhaps the fact that the product is a differentiable function, see Exercise 1. Using the chain rule and our knowledge of one-variable derivatives from calculus we are able to differentiate many complicated-looking multivariate functions step by step. For example, the function $\mathbb{R}^2 \ni (x, y) \stackrel{F}{\longmapsto} (\cos(xy), x^3 + e^{-y}) \in \mathbb{R}^2$ can be differentiated as follows. By part 4 we can do the components $F_1, F_2$ separately, so let us focus on computing $F_1'(a, b)$ for some $(a, b) \in \mathbb{R}^2$. Using the linear maps $e^1, e^2 \in (\mathbb{R}^2)^*$ forming the basis dual to the standard basis we may rewrite $F_1 = \cos \circ (e^1 \cdot e^2)$, so $F_1'(a, b) = \cos'(ab)(e^1 \cdot e^2(a, b) + e^1(a, b) \cdot e^2) = -\sin(ab)(b e^1 + a e^2)$. In other words, $F_1'(a, b)(x, y) = -\sin(ab)(bx + ay)$.

A slightly different way of thinking about derivatives is in terms of directional derivatives.
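The worked example above can be checked numerically (an added sketch): we compare the chain-rule Jacobian of $F(x, y) = (\cos(xy), x^3 + e^{-y})$, whose first row was computed above, against central finite differences.

```python
import math

# Jacobian of F(x, y) = (cos(xy), x^3 + e^(-y)) from the chain rule,
# compared against a central finite-difference approximation.
def F(x, y):
    return (math.cos(x * y), x**3 + math.exp(-y))

def jacobian(a, b):
    # Row 1 from the text: F1'(a,b)(x,y) = -sin(ab)(bx + ay)
    # Row 2 componentwise:  F2'(a,b)(x,y) = 3a^2 x - e^(-b) y
    return [[-math.sin(a * b) * b, -math.sin(a * b) * a],
            [3 * a**2,             -math.exp(-b)]]

def numeric_jacobian(a, b, h=1e-6):
    cols = []
    for dx, dy in [(h, 0.0), (0.0, h)]:
        fp, fm = F(a + dx, b + dy), F(a - dx, b - dy)
        cols.append([(fp[i] - fm[i]) / (2 * h) for i in range(2)])
    return [[cols[j][i] for j in range(2)] for i in range(2)]

a, b = 0.7, -0.3
print(jacobian(a, b))
print(numeric_jacobian(a, b))   # the two matrices agree closely
```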

Definition 3. (Directional derivative)
Given $\mathbb{R}^m \supset P \stackrel{F}{\to} \mathbb{R}^n$ we define the directional derivative of $F$ at $p \in P$ in direction $w \in \mathbb{R}^m$ as
$$\partial_w F(p) = \lim_{t \to 0} \frac{F(p + tw) - F(p)}{t}$$

In case $w = e_j$ the directional derivative is known as the $j$-th partial derivative.

Assuming $F'(p)$ exists, setting $\iota_w: \mathbb{R} \to \mathbb{R}^m$ given by $\iota_w(t) = p + wt$ connects our two notions of derivative. From the chain rule and parts 2, 3 we see $\partial_w F(p) = (F \circ \iota_w)'(0) = F'(p)w$. In particular, this means that when it exists, the matrix of $F'(p)$ with respect to the standard bases $e_i$ is just the matrix of partial derivatives: $F'(p)_{ij} = \frac{\partial F_i}{\partial x_j}(p)$. In section 2.3 we will see how directional derivatives even shed light on the existence of $F'$, see lemma 2.

Provided they exist for all relevant points we may consider the directional derivatives as functions in their own right and attempt to differentiate them. This allows us to construct higher order derivatives as follows. For any finite sequence of vectors $(v_1, v_2, \dots)$ define the directional derivative of $F$ in direction $(v_1, v_2, \dots)$ at $p$ inductively by $\partial_{(v_1, v_2, \dots)} F(p) = \partial_{v_1}(\partial_{(v_2, \dots)} F)(p)$.
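The identity $\partial_w F(p) = F'(p)w$ can also be seen numerically (an added sketch, reusing $F(x, y) = x^2 - y^2$ from the earlier example):

```python
# Directional derivative of F(x, y) = x^2 - y^2 at p = (0, 1) in
# direction w = (1, 2), approximated by a symmetric difference quotient
# and compared with F'(p)w, where F'(p) has matrix (2x, -2y) = (0, -2).
F = lambda x, y: x**2 - y**2

def directional(p, w, t=1e-6):
    fp = F(p[0] + t * w[0], p[1] + t * w[1])
    fm = F(p[0] - t * w[0], p[1] - t * w[1])
    return (fp - fm) / (2 * t)

p, w = (0.0, 1.0), (1.0, 2.0)
grad = (2 * p[0], -2 * p[1])            # matrix of F'(p)
print(directional(p, w))                 # close to grad . w = -4
```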

Definition 4. ($C^k$ functions)
A function $P \stackrel{F}{\to} \mathbb{R}^n$ defined on an open $P$ is called $C^k$ if for all $p \in P$ the partial derivatives of order $\leq k$ of $F$ exist and are continuous at $p$. A function $F: X \to \mathbb{R}^n$ defined on a general subset $X \subset \mathbb{R}^m$ is said to be $C^k$ if it can be extended to a $C^k$ function on an open set $P$ containing $X$.

Perhaps a more natural definition of $C^k$ would be to require all directional derivatives at $p$ up to order $k$ to be continuous at $p$, but this is equivalent (Exercise!). In the next section we will see a connection between $C^1$ and differentiability. For convenience one often restricts attention to functions in $C^\infty$, also known as smooth functions in the literature.

Many of the functions used in practice are smooth; for example linear functions are always smooth. Functions defined by polynomials or power series are smooth as well. Differentiable functions often come up as changes of variables. To make sure one does not lose information the change of variable often has to be a bijection, and its inverse should also be differentiable. The technical term for such nice changes of variables is diffeomorphism.

Definition 5. (Diffeomorphism)
$P \stackrel{F}{\to} Q$ is called a $C^k$ diffeomorphism if $F$ is a bijection and both $F$ and its inverse are $C^k$.

Apart from linear changes of coordinates, perhaps polar coordinates are the best known example of a diffeomorphism: $P_\alpha: (0, \infty) \times (0, 2\pi) \to \mathbb{R}^2 - L_\alpha$. Here $L_\alpha$ is the half-line in the plane that starts at the origin and makes angle $\alpha$. $P_\alpha$ is given by $P_\alpha(r e_1 + \theta e_2) = r(\cos(\theta + \alpha) e_1 + \sin(\theta + \alpha) e_2)$. Rotations in the plane also give diffeomorphisms. Let $R_\theta: \mathbb{R}^2 - \{0\} \to \mathbb{R}^2 - \{0\}$ be the rotation of the plane with angle $\theta$. Then $R_\beta \circ P_\alpha$ is again a diffeomorphism; in fact we have $R_\beta \circ P_\alpha = P_{\beta + \alpha}$. A proof that $P_\alpha$ actually is a diffeomorphism will be given using the inverse function theorem later. Identifying $\mathbb{C} \cong \mathbb{R}^2$ by sending $x + iy$ to $(x, y)$, it follows from complex analysis that any injective holomorphic function $f$ defined on a region $P$ with $f'(z) \neq 0$ for all $z \in P$ is actually a diffeomorphism from $P$ to $f(P)$.

Since coordinates often depend on arbitrary choices it is a good habit to consider what happens under a change of coordinates (diffeomorphism). Often the powerful concepts are the ones that do not depend on the coordinates. In other words we will often try to phrase things in a diffeomorphism-invariant way. This is in the same spirit as linear algebra, where one prefers linear maps to matrices, group theory (groups up to isomorphism) and general relativity (principle of covariance). Diffeomorphisms also often arise as flows of solutions of ordinary differential equations. The derivative of a diffeomorphism has to be a linear isomorphism (why??). Surprisingly the converse is true too, at least locally. This is the inverse function theorem that we will prove at the end of this chapter.
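For the polar map this derivative condition is easy to check concretely (an added sketch, taking $\alpha = 0$): the Jacobian determinant equals $r$, which never vanishes on the domain $(0, \infty) \times (0, 2\pi)$.

```python
import math

# Polar coordinates P(r, θ) = (r cos θ, r sin θ): the Jacobian is
# [[cos θ, -r sin θ], [sin θ, r cos θ]] with determinant r > 0,
# so the (local) invertibility criterion of the inverse function
# theorem is satisfied everywhere on the domain.
def jac_det(r, theta):
    a, b = math.cos(theta), -r * math.sin(theta)
    c, d = math.sin(theta), r * math.cos(theta)
    return a * d - b * c

print(jac_det(2.0, 1.0))   # close to 2.0 = r, independent of θ
```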

Exercises

Exercise 1. (Derivative of product)
Define $\mathbb{R}^2 \ni (x, y) \stackrel{M}{\longmapsto} xy \in \mathbb{R}$. Show directly from the definition that for any $(a, b) \in \mathbb{R}^2$ the function $M$ is differentiable and its derivative satisfies $M'(a, b)(x, y) = bx + ay$.
Hint: First show that the error $\varepsilon$ in (2.3) must be $k\ell$, where $\mathbb{R}^2 \ni h = (k, \ell)$, and recall $|k\ell| \leq k^2 + \ell^2$.

Exercise 2. (Derivative of det)
Identify the space of all $n \times n$ matrices with $\mathbb{R}^{n^2}$ in the usual way. By writing the determinant as a sum over the symmetric group or otherwise, show that the determinant defines a smooth function on $\mathbb{R}^{n^2}$. Also compute for any $n \times n$ matrix $M$ the directional derivative $\partial_M \det(I)$.

2.3 Intermediate and mean value theorems

One of our goals is to give (relatively) simple and conceptual proofs for the correctness of all common constructions in multivariate calculus. Ultimately many such proofs rely on the following two theorems from elementary analysis that we will take for granted:

Theorem 2. (Intermediate and mean value theorems)
Imagine a continuous function $[a, b] \stackrel{f}{\to} \mathbb{R}$.
1. If $f(a) < \lambda < f(b)$ then there exists a $c \in [a, b]$ with $\lambda = f(c)$.
2. If $f$ is differentiable on $(a, b)$ then there exists a point $c \in (a, b)$ such that $f(b) - f(a) = f'(c)(b - a)$.

In the remainder of this section we prove a few simple lemmas directly using the mean value theorem. These are all used in the next section to prove the implicit function theorem and its friends. We start by connecting the notions of differentiability and $C^1$-functions:

Lemma 2. ($C^1$ implies differentiable)
Suppose $\mathbb{R}^m \supset P \stackrel{f}{\to} \mathbb{R}^n$ is a $C^1$ function at $p \in P$; then the derivative $f'(p)$ exists and is determined by $f'(p)e_i = \frac{\partial f}{\partial x_i}(p)$, defined as in definition 3.

Proof. According to Theorem 1 it suffices to treat the case $n = 1$. Writing $h = \sum_i h_i e_i$ and using the mean value theorem, which for any $q$ provides a point $c_i$ on the segment from $q$ to $q + h_i e_i$ such that $h_i \frac{\partial f}{\partial x_i}(c_i) = f(q + h_i e_i) - f(q)$, we compute
$$f(p + h) - f(p) = \sum_{i=1}^m \Big( f\big(p + \sum_{j \leq i} h_j e_j\big) - f\big(p + \sum_{j < i} h_j e_j\big) \Big) = \sum_{i=1}^m h_i \frac{\partial f}{\partial x_i}(c_i)$$
with each $c_i$ within distance $|h|$ of $p$. Since the partial derivatives are continuous at $p$, the difference between this sum and $\sum_i h_i \frac{\partial f}{\partial x_i}(p)$ is $o(h)$, which is exactly equation (2.3).

There is also a version of the mean value theorem for scalar-valued functions of several variables:

Lemma 3. (Mean value theorem)
Given a differentiable function $\mathbb{R}^m \supset Q \stackrel{F}{\to} \mathbb{R}$ and points $a, b \in Q$ such that the line segment between $a$ and $b$ lies in $Q$, there is a point $c$ on this segment such that

$$F(b) - F(a) = F'(c)(b - a)$$

Proof. By assumption the curve $\gamma(t) = a + t(b - a)$ maps $[0, 1]$ into $Q$. Applying the above one-variable mean value theorem to $F \circ \gamma$ we get $F(b) - F(a) = (F \circ \gamma)(1) - (F \circ \gamma)(0) = (F \circ \gamma)'(t_0) = F'(c)(b - a)$ with $\gamma(t_0) = c$.

A fairly typical application of this mean value theorem is the following. This lemma deals with an issue that will come up in the uniqueness part of the implicit function theorem in the next section.

Lemma 4. If a $C^1$ function $\mathbb{R}^n \supset P \stackrel{F}{\to} \mathbb{R}^n$ defined on open $P$ satisfies $\det F'(p) \neq 0$ then there is an open neighborhood of $p$ in which $F$ is injective.

Proof. Since the determinant function is continuous and $\det F'(p) \neq 0$, we may restrict to a ball $p \in B \subset P$ so small that for any choice of points $c_1, \dots, c_n \in B$ the linear map defined by the matrix $M = (\frac{\partial F_i}{\partial x_j}(c_i))_{i,j = 1 \dots n}$ is an isomorphism ($\det M \neq 0$). Suppose there are $a, b \in B$ such that $F(a) = F(b)$. Then the mean value theorem says that for all $i = 1 \dots n$ there must be a $c_i$ on the line segment between $a, b$ such that $F_i'(c_i)(b - a) = 0$. In other words $M(b - a) = 0$. Since $\det M \neq 0$ we must have $a = b$.

In fact under the same hypotheses the map $F$ must locally be invertible with differentiable inverse. This is the inverse function theorem, see corollary 2 in the next section.

Exercises

Exercise 1. (Mean failure)
Why is there no version of the mean value theorem for $\mathbb{R}^2 \stackrel{F}{\to} \mathbb{R}^2$? Give an example of a $C^1$ function $\mathbb{R}^2 \stackrel{F}{\to} \mathbb{R}^2$ and $a \neq b \in \mathbb{R}^2$ such that there is no $c$ on the line segment between $a, b$ with the property that $F(b) - F(a) = F'(c)(b - a)$.

Exercise 2. (Constant?)
Suppose $\mathbb{R}^m \supset P \stackrel{F}{\to} \mathbb{R}^n$ is a $C^1$ function that satisfies $F'(p) = 0$ for all $p \in P$, where $P$ is a non-empty open subset. Is it true that $F$ must be constant? What if $P$ is connected?

2.4 Implicit function theorem

The implicit function theorem tells us about the size of the set of solutions to $n$ equations in $n + k$ unknowns. Basically it says that if the system is given by a differentiable function then, locally, our intuition from pretending all equations are linear is correct. In solving linear equations an easy case is the system of equations $Fx = q$ with $F \in L(\mathbb{R}^m, \mathbb{R}^n)$ surjective. Given some solution $p$, all other solutions are parametrized by $\ker F$. More precisely, $F^{-1}(\{q\}) = p + \ker F$. When $F$ is non-linear but its derivative at a solution $p$ is still surjective, the implicit function theorem says that locally the solutions are still parametrized by the kernel $\ker F'(p)$.

Figure 2.4: The red curve can locally (in the green box) be viewed as the graph of a function. The map $\alpha$ sends the coordinate axes to the heavily drawn axes.

Theorem 3. (Implicit function theorem)
To describe the level set $F^{-1}(\{q\})$ close to $p \in F^{-1}(\{q\})$, for some $C^1$ function $\mathbb{R}^{k+n} \supset P \stackrel{F}{\to} \mathbb{R}^n \ni q$ with $k > 0$, find any linear isomorphism $\mathbb{R}^{k+n} \stackrel{\alpha}{\to} \mathbb{R}^k \times \mathbb{R}^n$ sending $\ker F'(p)$ to $\mathbb{R}^k \times \{0\}$. If $\alpha$ exists then¹:

$$\alpha(F^{-1}(\{q\})) \cap (\alpha(p) + X \times Y) = \alpha(p) + \{(x, f(x)) \mid x \in X\}$$

for some open subsets $0 \in X \subset \mathbb{R}^k$ and $0 \in Y \subset \mathbb{R}^n$ and a unique $C^1$-function $X \stackrel{f}{\to} Y$.

Figure 2.5: The tennis ball curve example (green) $F^{-1}(\{(2, 0)\})$ where $F(x, y, z) = (x^2 + y^2 + z^2,\ x^2 - \frac{y^3}{3} - z^2)$. You can see the level sets of $F_1$ (sphere) and $F_2$ (the big one), and also a solution $p = (1, 0, 1)$ (thick dot), the horizontal line $p + \ker F'(p)$ and the complementary space $p + C$ (square).

As an illustration of both the theorem and its proof with $k = 1$, $n = 2$ we consider $F(x, y, z) = (x^2 + y^2 + z^2,\ x^2 - \frac{y^3}{3} - z^2) \in \mathbb{R}^2$ defined on $P = \mathbb{R}^3$. The point $p = (1, 0, 1)$ satisfies $F(p) = q = (2, 0)$, see Figure 2.5. The matrix of $F'(p) \in L(\mathbb{R}^3, \mathbb{R}^2)$ is $\begin{pmatrix} 2 & 0 & 2 \\ 2 & 0 & -2 \end{pmatrix}$ and $\ker F'(p) = \mathrm{Span}(\{e_2\})$. For the change of coordinates $\alpha$ we pick the linear map that permutes $e_1, e_2$ leaving $e_3$ fixed. Notice that $F'(p)$ is surjective (the first and last columns span $\mathbb{R}^2$). According to theorem 3 above we should get a $C^1$ function $X \stackrel{f}{\to} Y$ between some opens $X \subset \ker F'(p)$, $Y \subset C$ describing the level set near $p$.

To foreshadow the proof let us make the implicit explicit and compute $f$ directly. Along the way we will run into restrictions on the domain, leading to a choice of $X$ and $Y$. The idea is very simple indeed. Just use the equations $F_i = q_i$, $i = 1 \dots n$ to eliminate the variables that do not span $\ker F'(p)$ one by one. We start with the last equation $F_2(x, y, z) = 0$ and solve for $z$ to get $z = f_3(x, y) = \sqrt{x^2 - \frac{y^3}{3}}$. Since we want to be close to $p$ we chose the positive branch of the square root; it is defined when $|y| < 1$. Plugging in the value for $z$ we continue with the smaller system $G(x, y) = F_1(x, y, \sqrt{x^2 - \frac{y^3}{3}}) = 2$, $|y| < 1$. In the proof this system has already been solved by the induction hypothesis, if we can just show that the derivative $G'(1, 0)$ is surjective. Setting $J(x, y) = (x, y, f_3(x, y))$ so that $G = F_1 \circ J$ we apply the chain rule to find it is indeed surjective:

 1 0  0 0 0 G (1, 0) = F1(p)J (1, 0) = (2, 0, 2)  0 1  = (4, 0) 1 0

Explicitly we just solve for $x$ (not for $y$!) in $G(x, y) = x^2 + y^2 + x^2 - \frac{y^3}{3} = 2$, giving $x = g(y) = \sqrt{1 - \frac{y^2}{6}(3 - y)}$; again we chose the positive square root to get close to $p$, and $|y| < 1$ is still sufficient. So finally the function $f$ we seek is
$$f(y) = (g(y), f_3(g(y), y)) = \left( \sqrt{1 - \frac{y^2}{6}(3 - y)},\ \sqrt{1 - \frac{y^2}{6}(3 - y) - \frac{y^3}{3}} \right)$$

From the formula it is clear that whenever defined the function is C1. For the domain X of f we may take X = {y ∈ R : |y| < 1} and as the range Y = {(x, z) ∈ R2| max(|x|, |z|) < 1}. Now that we have seen all the main features we proceed with a proof. Everything we did so far comes together in this proof so enjoy the ride.
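Before diving into the proof, the explicit solution above can be sanity-checked numerically (an added sketch; we read the second component of $F$ as $x^2 - y^3/3 - z^2$, the sign consistent with the formulas for $f_3$ and $g$):

```python
import math

# Plug the explicit solution back in: F(g(y), y, f3(g(y), y)) should
# equal q = (2, 0) for every |y| < 1.
def F(x, y, z):
    return (x**2 + y**2 + z**2, x**2 - y**3 / 3 - z**2)

def g(y):
    return math.sqrt(1 - y**2 / 6 * (3 - y))

def f3(x, y):
    return math.sqrt(x**2 - y**3 / 3)

for y in [-0.9, -0.3, 0.0, 0.5, 0.9]:
    x = g(y)
    z = f3(x, y)
    print(F(x, y, z))   # (2.0, 0.0) up to rounding
```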

Proof. We only have to prove the theorem in the special case where $\alpha$ is the identity; the general case follows by applying this special case to $\tilde{F} = F \circ \alpha^{-1}$. From now on we will assume that $\mathbb{R}^k \times \{0\} = \ker F'(p)$. The first step is to invoke lemma 4 to show that the function

$$P \ni (u, v) \stackrel{S}{\longmapsto} (u, F(u, v)) \in \mathbb{R}^{k+n}$$

is injective on a smaller domain $p + X_1 \times Y_1$, for some open $0 \in X_1 \subset \mathbb{R}^k$ and $0 \in Y_1 \subset \mathbb{R}^n$. This is allowed because by the chain rule $S'(p)$ is a linear isomorphism from $\mathbb{R}^{k+n}$ to itself (the

1For any set S we use the abbreviation p + S = {p + s|s ∈ S} 18 CHAPTER 2. HOW TO SOLVE EQUATIONS? matrix with respect to the standard bases is triangular). In what follows we will always restrict F to p + (X1 × Y1) ⊂ P so that S is injective. To prove the existence of X, Y, f we proceed by induction on n. The base case n = 1 will be settled using the intermediate value theorem as follows. Since 0 0 0 ker F (p) is k-dimensional we must have F (p)ek+1 6= 0, say it is positive. By continuity of F and the mean value theorem there must be a β > 0 such that F (p + ek+1) > q > F (p − ek+1) and by continuity even F (p + (x, β)) > q > F (p + (x, −β)) for all x ∈ X = {x ∈ Rk : |x| < β} (possibly after shrinking β). The intermediate value theorem applied to t 7→ F (p + (x, t)) gives for each x ∈ X a value y ∈ Rn such that F (p + (x, y)) = q. This y must be unique since if we had another suchy ˜ then S(x, y) = S(x, y˜) contradicting injectivity of S. We may thus call y = f(x) defining f the function X −→ Y we were looking for with Y = Y1. f Next we prove that the X −→ Y found above is C1 in any u ∈ X. This follows from 0 ∂(w,0)F (p + (u, f(u))) k ∂wf(u) = − 0 for any w ∈ R (2.4) ∂ek+1 F (p + (u, f(u))) To prove this formula we pick t > 0 and apply the multivariate mean value theorem (lemma 3) to F on the line L connecting a = p + (u, f(u)) to b = p + (u + tw, f(u + tw)) inside p + X × Y . Since F (b) = F (a) = q we find p + (x, y) ∈ L such that 0 = F 0(p + (x, y))(b − a) = tF 0(p + (x, y))(w, 0) + 0 0 f(u+tw)−f(u) F (p+(x,y))(w,0) (f(u + tw) − f(u))F (p + (x, y))ek+1. In other words = − 0 . Taking the t F (p+(x,y))ek+1 limit t → 0 proves formula (2.4). For the induction step assume the theorem holds whenever the number of equations is less Pn 0 k than some n > 1. To prove the theorem for the case F = i=1 Fiei with ker F (p) = R × {0} we argue in two steps. 
(1) Use the n-th equation to express one of the variables in terms of the others, (2) plug this expression into the remaining n − 1 equations and apply the induction hypothesis. n (1) We can apply the induction hypothesis to the equation Fn(p) = e (q) = qn. There must k+n γ k+n−1 0 k+n−1 be a linear isomorphism R −→ R × R with γ ker Fn(p) = R × {0} because otherwise dim ker F 0(p) > k, we choose it to fix ker F 0(p). By induction we obtain a C1-function Rk+n−1 ⊃ h −1 A −→ B ⊂ R such that γFn (qn) ∩ (γ(p) + A × B) = γ(p) + {(a, h(a))|a ∈ A}. Notice that ∂ −1 Fn(p) −1 γ ei Fn(p + γ (x, h(x))) = qn so differentiating in a = 0 gives ∂ −1 h(0) = − = 0 when γ ei ∂ −1 Fn(p) γ ek+n i = 1 . . . k. J (2) Now we plug in h. Setting Rk+n−1 3 u 7−→ p + γ−1(u, h(u)) ∈ Rk+n we define Rk+n−1 ⊃ G n−1 Pn−1 Q −→ R by G(z) = i=1 (Fi ◦J)(z)ei. Here 0 ∈ Q is an open set that J should send to P . For all −1 0 0 −1 −1 −1 −1 i = 1 . . . k we have γ ei ∈ ker G (0) because J (0)γ ei = γ ei + ∂γ ei fn(0) = γ ei. Therefore ker G0(0) is mapped onto Rk×{0} by γ. We may thus apply the induction hypothesis to G to obtain a 1 k g n−1 −1 C function R ⊃ X2 −→ Y2 ⊂ R such that γG (q1, . . . , qn−1)∩(X2 ×Y2) = {(x, g(x))|x ∈ X2}. −1 Setting f(x) = γ (g(x), h(x, g(x))) and 0 ∈ X ⊂ X2 and Y ⊂ Y1 such that A ⊃ X × Y2 finishes the proof. This is because f is C1 by the chain rule and f is well defined on X and unique because −1 −1 of S. Moreover z ∈ F ({q}) ∩ p + (X × Y ) implies Fn(z) = qn so z = p + γ (a, h(a)) and so −1 G(a) = (q1, . . . , qn−1). This means a = γ (x, g(x)) so z = p + (x, f(x)) for some x ∈ X. Here we used the fact that γ fixes the first k coordinates. Often the following more coordinate dependent version of the theorem is used: Corollary 1. (Explicit implicit function theorem) F Imagine a C1 function Rk × Rn ⊃ P −→ Rn with F (a, b) = q for some q ∈ Rn and n, k > 0. If 2.4. IMPLICIT FUNCTION THEOREM 19

Figure 2.6: The level set F = c locally (in the green area) looks like the graph of a function f of x ∈ R^k.

det F(a, ·)′(b) ≠ 0 then there are open sets a ∈ U ⊂ R^k, b ∈ V ⊂ R^n and a unique C^1 function f : U → V such that F^{−1}({q}) ∩ (U × V) = {(u, f(u)) | u ∈ U}.

Proof. Set α : R^k × R^n ∋ (x, y) ↦ (x, (F(a, ·)′(b))^{−1} y) ∈ R^k × R^n; then α is a linear isomorphism and must send ker F′(p) to the first k coordinates, where p = (a, b). By theorem 3 we thus have F^{−1}({q}) ∩ (p + (X × α^{−1}Y)) = {(a + x, b + α^{−1}g(x)) | x ∈ X} for some neighborhoods of zero X and Y. Here we used that α fixes the first k coordinates. A translation by p finishes the proof with f(u) = b + α^{−1}g(u − a) and U = a + X, V = b + Y.

An important corollary is the inverse function theorem. To solve G(x) = y for x we apply the implicit function theorem to the function F(y, x) = G(x) − y describing the graph of G, now as a graph of a function of y.

Corollary 2. (Inverse function theorem)
If G : R^n ⊃ P → R^n is C^1 and G(u) = v and det G′(u) ≠ 0 then there are open sets u ∈ U, v ∈ V and a C^1 inverse f : V → U, so G ◦ f = id_V and f ◦ G = id_U.

Proof. Apply the (explicit) implicit function theorem to F : R^n × P ∋ (y, x) ↦ G(x) − y ∈ R^n. The conditions for applying the theorem are satisfied since we have F(v, u) = 0 and F(v, ·)′(u) = G′(u) has non-zero determinant. Therefore there is a C^1-function f : V → R^n such that F(y, f(y)) = 0, or G(f(y)) = y, for all y in some open set V ∋ v. By lemma 4 we know G is injective and U = G^{−1}(V) is open and contains u. So G restricts to a bijection from U to V. It follows that f is a bijection from V to U.

The inverse function theorem makes it easy to prove that maps such as polar coordinates P_α are diffeomorphisms. One just has to check that the derivative of your C^1 bijection never has determinant equal to 0.

Another way to look at the implicit function theorem is to say the solution set F^{−1}(q) is made up from open pieces of R^k. Of course the open pieces have to fit together in a coherent way: any two descriptions of the same neighborhood should be related by a diffeomorphism. In chapter 5 we will formalize this into the notion of an atlas of a manifold. For now the conclusion is that any local construction on R^k that is invariant under diffeomorphisms can be lifted to the level sets.

For solving equations in practice the methods used in the above proofs may not be very effective. Our goal was mostly to find conditions that guarantee existence of the solutions. If necessary we can then attempt to find them explicitly using more specialized techniques. Often however one is only interested in some qualitative property of the solution, not the solution itself, so we sometimes get away without any explicit calculations.

In case one does need more explicit solutions, Newton's method may help. Let us look at solving y = G(x) for some G : R^n → R^n, compare the inverse function theorem. We try to approximate the solution x by a sequence x_n defined by x_{n+1} = x_n + G′(x_n)^{−1}(y − G(x_n)) for some initial guess x_0. This is motivated by assuming y = G(x) = G(x_n + x − x_n) ≈ G(x_n) + G′(x_n)(x − x_n) so x ≈ x_n + G′(x_n)^{−1}(y − G(x_n)). If G is C^2 and the guess x_0 is sufficiently near x one can show that this procedure actually converges² to the solution. Using the Banach contraction principle this numerical procedure can be elevated to a proof too. This type of argument is in some sense more powerful than the one we gave here; it works in infinite dimensions just as well and will prove the existence of solutions to ordinary differential equations.
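The iteration just described is easy to try out. Below is a minimal sketch of Newton's method for a map G : R² → R²; the particular G, the target y and the initial guess are illustrative choices, not taken from the text.

```python
# Newton's method for G(x) = y in R^2: x_{n+1} = x_n + G'(x_n)^{-1} (y - G(x_n)).
# G, y and the initial guess below are arbitrary illustrative choices.

def G(x):
    return (x[0] ** 2 + x[1] ** 2, x[0] * x[1])

def jacobian(x):
    # matrix of G'(x) with respect to the standard bases
    return ((2 * x[0], 2 * x[1]),
            (x[1], x[0]))

def newton_step(x, y):
    (a, b), (c, d) = jacobian(x)
    det = a * d - b * c                     # assumed non-zero near the solution
    gx = G(x)
    r0, r1 = y[0] - gx[0], y[1] - gx[1]
    # solve the 2x2 linear system G'(x) * delta = y - G(x) by hand
    return (x[0] + (d * r0 - b * r1) / det,
            x[1] + (-c * r0 + a * r1) / det)

x, y = (2.5, 0.5), (5.0, 2.0)
for _ in range(20):
    x = newton_step(x, y)
print(x)  # a solution of x0^2 + x1^2 = 5, x0*x1 = 2
```

Starting from a different initial guess may lead to a different solution, or to no convergence at all, which is exactly the sensitivity alluded to in the footnote.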

Exercises

Exercise 1. (Crazy system?)
Consider the following system of equations.

sin(a + b) + c + d + e + f + (1 + b)g = 5 ac + cos(b)d + e + f + g = 4 a + b(c + d) + e + exp(a)(f + g) = 3 b + bc + ad + ae + f + g = 2 ab + ac + ad + be + bf + exp(sin ab)g = 1

a. Give F : R^7 → R^5 and c ∈ R^5 such that the solution set to the system is of the form F^{−1}({c}) and explain why your F is a C^1-function.

b. Verify that (0, 0, 1, 1, 1, 1, 1) ∈ F^{−1}({c}).

c. Give a condition under which we can write the solution set F^{−1}({c}) close to the solution (0, 0, 1, 1, 1, 1, 1) as the graph of a C^1 function.

d. Check that the condition you found in part c. is actually satisfied.

²Even for simple G it is often hard to predict which initial guess x_0 will converge; fractals like the Mandelbrot set first appeared in this context.

Exercise 2. (Inverse implies implicit)
In this exercise we derive a version of the implicit function theorem from the inverse function theorem, where F : R^2 → R is a C^1 function with F(p) = q and ker F′(p) = R × {0}.

a. Define S : R^2 ∋ (x, y) ↦ (x, F(x, y)) ∈ R^2. Prove that det S′(p) ≠ 0.

b. Prove that there is a C^1 inverse function S^{−1} : A_1 × A_2 → B for some open sets p_1 ∈ A_1, q ∈ A_2, B ∋ p. Here p = (p_1, p_2).

c. Define f : A_1 → R by S^{−1}(x, q) = (x, f(x)). Why is f well-defined and C^1?

d. Prove that F^{−1}({q}) ∩ B = {(x, f(x)) | x ∈ A_1}.

e. Prove that there are open sets p ∈ U, V ∋ 0 such that there is a C^1 diffeomorphism from U ∩ F^{−1}({q}) to V ∩ (R × {0}).

Exercise 3. (Line bundle)
Identifying C ≅ R^2 and C^2 ≅ R^4 we consider the function F : {(z, w) ∈ C^2 | w ≠ 0} ∋ (z, w) ↦ z/w ∈ C. Determine ker F′(p) for p = (i, i) and describe F^{−1}({1}), both as subsets of R^4.

Chapter 3

Is there a fundamental theorem of calculus in higher dimensions?

The fundamental theorem of calculus states that the integral of the derivative is the function evaluated at the boundary and that every function has a primitive, an indefinite integral. Before setting out to generalize the fundamental theorem of calculus to arbitrary dimensions let us have a brief look at vector calculus. Recall the theorems of Gauss and Stokes and line integrals. None of these concepts is made precise in the present section; they are just meant to guide us in the right direction.

A vector field F is a differentiable function F : R^m ⊃ P → R^m where P is open. When m = 3 recall that div, curl, grad were defined by

div(F) = Σ_i (F′)^i_i        curl(F) = ((F′)^2_3 − (F′)^3_2, (F′)^3_1 − (F′)^1_3, (F′)^1_2 − (F′)^2_1)        grad(f) = (f′e_1, f′e_2, f′e_3)

Notice we take (F′)^i_j to mean the coefficients of the derivative with respect to the standard bases. As written div, grad, curl depend very much on the choice of basis in R^3.

Gauss: For a vector field F defined on a three-dimensional domain D bounded by S we have

∫_D div(F) dV = ∫_S F · N dS

Stokes: For a vector field F defined on a surface S bounded by a curve C we have

∫_S curl(F) · N dS = ∫_C F · dr

For a function f defined on a curve C with end-points a, b we have:

∫_C grad(f) · dr = f(b) − f(a)

Without going into too much detail, we notice that the left hand side is integration over a k-dimensional object D involving some kind of derivative. The right hand side relates this to an integral over the boundary of D of the original function.


Also the type of object integrated varies. The divergence counts how many points fit in a cube. The flux F · N counts how many arrows of our vector field pierce through the surface. The integrand grad(f) · dr counts how many surfaces perpendicular to the vector field get pierced by the curve.

Finally we note that finding a primitive/antiderivative in this context comes down to finding potentials. Not every vector field is the gradient of a function. However on a simply connected domain vector fields F satisfying curl F = 0 are shown to be F = grad(f) for some function f. Likewise a vector field F is of the form F = curl G for some other vector field G, provided div F = 0. Hopefully at the end of this chapter we will have more insight into why this must be the case.

3.1 Elementary Riemann integration

Since we only plan to work with functions that are differentiable in this course we choose to set up a very naive version of integration. While limited this framework is complete and shows many arguments in their simplest form. Readers familiar with more advanced integration theories are welcome to substitute their preferred notion of integral.

Definition 6. (Integral, light)
For a continuous function f : R → R on a rectangular box R = ∏_{i=1}^k [a_i, b_i] we define

∫_R f = lim_{n→∞} I_{R,n}(f)    where    I_{R,n}(f) = 2^{−nk} Σ_{p ∈ R ∩ (2^{−n}Z)^k} f(p)

As an elementary example take f(x) = x and R = [0, 1]; then we get I_{R,n}(f) = 2^{−2n} Σ_{j=0}^{2^n} j = 2^{−2n} 2^n(2^n + 1)/2 = 1/2 + 2^{−n−1}, so as expected ∫_{[0,1]} x = 1/2. It is customary to write 'dx' after the integrand. We will not do this because later in the chapter we will use 'dx' in the sense of differential forms or covector fields. Some elementary properties are given in the next lemma.
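The worked example above is easy to confirm by machine. The sketch below is an assumed direct translation of the definition into code for the one-dimensional box R = [0, 1]; it reproduces I_{R,n}(f) = 1/2 + 2^{−n−1} exactly.

```python
# Dyadic Riemann sum over [0,1]: I_n(f) = 2^{-n} * sum of f(p) over p in [0,1] ∩ 2^{-n}Z.

def I_n(f, n):
    # the sample points are p = j * 2^{-n} for j = 0, 1, ..., 2^n
    return 2 ** -n * sum(f(j * 2 ** -n) for j in range(2 ** n + 1))

for n in (2, 5, 10):
    print(n, I_n(lambda x: x, n), 0.5 + 2 ** -(n + 1))  # the two columns agree exactly
```

Since all the sample points and weights are dyadic, the agreement with 1/2 + 2^{−n−1} is exact even in floating point.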

Lemma 5. (Properties of ∫)

1. The limit defining ∫_R f exists for any continuous f.
2. ∫_R f ∈ vol(R) · [min_R f, max_R f]
3. ∀R : ∫_R f = ∫_R g ⇒ f = g

Proof. Since any continuous function on the compact set R is uniformly continuous we see the limit exists. To show this it suffices to check that |I_{R,n}(f) − I_{R,m}(f)| becomes small for sufficiently large n < m so that we have a Cauchy sequence. Given ε there is an N such that for all |q| < 2^{−N} and all p we have |f(p) − f(p + q)| < ε, so for n, m ≥ N:

|I_{R,n}(f) − I_{R,m}(f)| ≤ 2^{−mk} Σ_{p ∈ R ∩ (2^{−m}Z)^k} |f(p̄) − f(p)| < 2^{mk} · 2^{−mk} ε

here p̄ is p with all coordinates rounded to the closest multiple of 2^{−n}.

For part 2 we note that any continuous function attains its max and min on the compact set R. For any n we have I_{R,n}(f) ≤ vol(R) max_R f and similarly for min. Part 3 follows from part 2 and continuity of f: just take a sequence of smaller and smaller rectangles centered at a point p. Then f(p) = g(p).

The Fubini theorem about computing an integral by first integrating out a couple of variables is a simple matter in this framework.

Lemma 6. (Fubini)

∫_{R×S} f = ∫_R F    where    F(p) = ∫_S f(p, ·)

Proof. F(p) = lim_{m→∞} I_{S,m}(f(p, ·)) and

I_{R,n}(F) = 2^{−nk} Σ_{p ∈ R ∩ (2^{−n}Z)^k} lim_{m→∞} I_{S,m}(f(p, ·)) = lim_{m→∞} 2^{−nk} Σ_{p ∈ R ∩ (2^{−n}Z)^k} I_{S,m}(f(p, ·)) =

lim_{m→∞} Σ_{(p,q) ∈ (R×S) ∩ (2^{−n}Z)^k × (2^{−m}Z)^ℓ} 2^{−nk−mℓ} f(p, q) = lim_{m→∞} a_{m,n}

Notice that a_{n,n} = I_{R×S,n}(f) so finally ∫_R F = lim_{n,m→∞} a_{m,n} = lim_{n→∞} a_{n,n} = ∫_{R×S} f.

Lemma 7. (Fundamental theorem of calculus)
Suppose f is C^1 on [a, b]. Then

∫_{[a,b]} f′ = f(b) − f(a)

The function F(x) = ∫_{[a,a+x]} f then is differentiable and F′(x) = f(x).

Proof.

f(b) − f(a) = Σ_{p ∈ [a,b) ∩ 2^{−n}Z} f(p + 2^{−n}) − f(p) = Σ_{p ∈ [a,b) ∩ 2^{−n}Z} 2^{−n} f′(p) + ε(f, p, 2^{−n}) = ∫_{[a,b]} f′

The last equality is valid because for all p we have lim_{n→∞} 2^n ε(f, p, 2^{−n}) = 0. Taking p* to be the point where ε(f, p, 2^{−n}) is maximal we have |Σ_{p ∈ [a,b) ∩ 2^{−n}Z} ε(f, p, 2^{−n})| ≤ 2^n ε(f, p*, 2^{−n}), converging to 0. For the second equality consider

F(x + h) − F(x) = ∫_{[x,x+h]} f ∈ h · [min_{t∈[0,h]} f(x + t), max_{t∈[0,h]} f(x + t)]

Continuity of f means that lim_{h→0} min_{t∈[0,h]} f(x + t) = f(x) and the same for the maximum. Dividing by h and taking the limit on both sides finishes the proof.

Fubini's theorem allows us to give a soft proof of the fact that mixed partial derivatives commute. This result will be very important later in discussing the exterior derivative. Set ∂_i f(p) = f′(p)e_i.

Lemma 8. (Mixed partial derivatives commute)
For any C^2 function f we have ∂_i ∂_j f = ∂_j ∂_i f.

Proof. It suffices to prove the case of a function f defined on an open subset of R^2. This is because ∂_i ∂_j f(p) = ∂_1 ∂_2 f̃_p(0, 0) with f̃_p(x, y) = f(p + xe_i + ye_j). We will show that I = ∫_{[a,b]×[c,d]} ∂_1 ∂_2 f = ∫_{[a,b]×[c,d]} ∂_2 ∂_1 f = J. By continuity it then follows that ∂_2 ∂_1 f = ∂_1 ∂_2 f.

Using Fubini, I = ∫_{[a,b]} F where F(p) = ∫_{[c,d]} g′ and g(q) = ∂_1 f(p, q). By the fundamental theorem of calculus I = ∫_{[a,b]} g(d) − g(c) = ∫_{[a,b]} ∂_1 f(p, d) − ∂_1 f(p, c) = ∫_{[a,b]} h′ with h(p) = f(p, d) − f(p, c). So we conclude that I = h(b) − h(a) = f(b, d) − f(b, c) − f(a, d) + f(a, c). Splitting the integral in the other order and doing the same steps shows that J gives the same answer.

Yet another application of Fubini is to prove that one can differentiate under the integral sign:

Lemma 9. (Differentiation under the integral sign)
For any C^1 function f defined on a rectangle [a, b] × R we have ∂_1 ∫_R f = ∫_R ∂_1 f.

Proof. By part 3 of the properties of integration lemma, it suffices to prove that for all [c, d] we have ∫_{[c,d]} ∂_1 ∫_R f = ∫_{[c,d]} ∫_R ∂_1 f. Using the fundamental theorem of calculus the left hand side is equal to (∫_R f)(d) − (∫_R f)(c) = ∫_R f(d, ·) − f(c, ·). Fubini says the right hand side is ∫_R ∫_{[c,d]} ∂_1 f = ∫_R f(d, ·) − f(c, ·), finishing the proof.
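Lemma 8 can also be illustrated numerically (a sanity check, not a proof): nested central differences approximate ∂₁∂₂f and ∂₂∂₁f for an arbitrarily chosen C² function, and the two values agree up to finite-difference error.

```python
import math

def f(x, y):
    # an arbitrary C^2 test function
    return math.sin(x * y) + x ** 2 * y

def d1(g, x, y, h=1e-5):
    # central difference in the first variable
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d2(g, x, y, h=1e-5):
    # central difference in the second variable
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

d12 = d1(lambda x, y: d2(f, x, y), 0.3, 0.7)   # approximates the derivative of df/dy in x
d21 = d2(lambda x, y: d1(f, x, y), 0.3, 0.7)   # approximates the derivative of df/dx in y
print(d12, d21)  # the two values agree to several decimals
```

For this f both mixed partials equal cos(xy) − xy sin(xy) + 2x, which is what the two approximations converge to as h shrinks.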

Exercises

Exercise 1.
Prove the change of variables theorem for a C^1 function ϕ : [a, b] → R with ϕ(a) < ϕ(b) by applying the fundamental theorem of calculus. So given a continuous function f : [ϕ(a), ϕ(b)] → R and ∀x ∈ [a, b] : ϕ′(x) ≥ 0, show that:

∫_{[a,b]} (f ◦ ϕ)ϕ′ = ∫_{[ϕ(a),ϕ(b)]} f

Exercise 2.
Confirm that the usual Riemann integral coincides with our notion of integral for continuous functions on a product of closed intervals. Notice that since Fubini holds in either theory it suffices to consider the one-dimensional case.

3.2 k-vectors and k-covectors

In this section we describe generalizations of vectors and covectors that we will be integrating with. They are designed to capture the properties seen in the vector calculus of R3 and carry them over to Rn or really any finite dimensional vector space V . The foundation for our theory is the intersection map I:

I : L(R^k, V) × L(V, R^k) ∋ (B, C) ↦ det(C ◦ B) ∈ R    (3.1)

The number I(B, C) can be interpreted as the oriented intersection of the level sets of C with the box B([0, 1]^k). The following lemma makes this more precise.

Lemma 10. (Determinant counts oriented intersections)
For all B ∈ L(R^k, V) and C ∈ L(V, R^k) we have

I(B, C) = det(C ◦ B) = ± lim_{n→∞} n^{−k} #{w ∈ ((1/n)Z)^k : C^{−1}(w) ∩ B[0, 1]^k ≠ ∅}

Figure 3.1: Intersections in V = R^3 in various codimensions. The box B([0, 1]^k) is shown in purple, a few of the level sets of C are in yellow.

where # means the cardinality of the set.

Proof. If C ◦ B is not an isomorphism then the left hand side of the equation is 0 and so is the right hand side.

First the case k = 1. Suppose I(B, C) = C(B(e_1)) = x ∈ R. The condition z ∈ {C^{−1}(w) ∩ B[0, 1] | w ∈ (1/n)Z} means z = tB(e_1) and C(z) = m/n for some t ∈ [0, 1], m ∈ Z. In other words C(tB(e_1)) = tx = m/n and |m| = 0, 1, …, ⌊nx⌋. It follows that up to a sign, the right hand side is equal to lim_{n→∞} n^{−1}(⌊nx⌋ + 1) = x.

For the general case we make use of lemma 1. Replacing C by R^c_{ij} ◦ C does not change the determinant and does not change the intersection count either. The same goes for replacing B by B ◦ R^c_{ij}. By Gaussian elimination we may thus assume that C ◦ B is diagonal, so C^i(B_j) = 0 unless i = j. In that case we have det C ◦ B = ∏_i C^i(B_i). Using the k = 1 case above we see this corresponds to the right hand side as well.
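The k = 1 case of the proof can be made tangible in code: taking V = R, B(e₁) = b > 0 and C the identity we have det(C ◦ B) = b, and the normalised count of lattice values m/n landing in the segment [0, b] tends to b. The value of b below is an arbitrary choice.

```python
import math

def density(b, n):
    # n^{-1} times the number of points m/n (m = 0, 1, ...) lying in [0, b],
    # i.e. the right hand side of lemma 10 for k = 1
    return (math.floor(n * b) + 1) / n

b = 1.7
for n in (10, 100, 10000):
    print(n, density(b, n))  # approaches b = 1.7
```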

The sign of the determinant compares the orientations of the boxes B([0, 1]k) and C−1([0, 1]k)∩ imB, the sign is + if the orientations agree and minus otherwise. Of course this is no more than a visually appealing tautology. At the end of the day Gaussian elimination lemma 1 is what counts. Geometrically it just means that the determinant is supposed to be invariant under shears. In order to count several intersections at the same time it makes sense to extend I to formal linear combinations1 of B’s and C’s. In other words, for any set S we may consider the vector space Span(S) to be spanned by all finite linear combinations of elements of S. For example we might want to measure how a given C intersects two distinct boxes B and B˜ by adding the intersection numbers (with sign). Likewise we can ask how much in total the level sets of C and C˜ intersect a given box. Furthermore, thinking of intersection with a box that is twice as large, it makes sense to introduce scalar multiplication as well. Finally multiplication by −1 should reverse the orientation and change the sign of the intersection. All this motivates the definition

Definition 7. (Intersection map) I Define the map Span L(Rk,V ) × Span L(V, Rk) −→ R by

I(B + aB̃, C) = I(B, C) + aI(B̃, C)        I(B, C + aC̃) = I(B, C) + aI(B, C̃)

What really matters is not the B and C’s themselves but rather the effect of I on them. So we say that two combinations of B’s are equivalent if they always give the same value of I when paired with a C. Likewise two combinations of C’s are equivalent if they give the same answer when evaluated on all B’s. This brings us to the definition of k-vectors and k-covectors as a natural consequence of studying2 the intersection map I. 3

Definition 8. (k-(co)vectors)

1. The space of k-vectors Λ^k V is the vector space of finite formal linear combinations of elements of L(R^k, V), modulo the equivalence relation X ∼ X̃ if for all C ∈ L(V, R^k) we have I(X, C) = I(X̃, C).

¹Formal means we do NOT use the pointwise addition of L(R^k, V).
²Applying the above procedure to a map different from I produces other potentially interesting vector spaces such as tensor powers (use ev(B, C) = ∏_i C^i(B_i)) and symmetric powers (use the permanent) etc.
³In what follows we are taking the quotient of the vector space Span L(R^k, V) by the subspace spanned by all the differences of equivalent elements.

2. The space of k-covectors ΛkV ∗ is the vector space of finite formal linear combinations of elements of L(V, Rk), modulo the equivalence relation Y ∼ Y˜ if for all B ∈ L(Rk,V ) we have I(B,Y ) = I(B, Y˜ ).

The key feature of this definition is that although we work with equivalence classes, intuitively we may always think in terms of a specific B being intersected with a specific C. This way the geometry is never lost. Working with equivalence classes is common in integration; for example the space L^2(X) is not really a space of functions on X. Rather, two functions that differ only on a set of measure zero (are equal almost everywhere) are in the same equivalence class.

More technically speaking the definition is made to force I to be a non-degenerate pairing between Λ^k V and Λ^k V*. What is not yet clear is why these spaces are finite dimensional. At the end of the section we will see that dim Λ^k V = (n choose k). The k-vectors are meant to generalize usual vectors in V in the sense that 1-vectors correspond to usual vectors, elements of V. Or rather:

Lemma 11. There is a natural isomorphism ϕ : Λ^1 V → V given by ϕ([Σ_i a_i B_i]) = Σ_i a_i B_i(1). Likewise there is an isomorphism V* → Λ^1 V* sending the covector C ∈ V* to its equivalence class [C] ∈ Λ^1 V*.

Proof. The linear map ϕ is well-defined because if [Σ_i a_i B_i] = [Σ_i a′_i B′_i], then for all C ∈ V* we must have

Cϕ([Σ_i a_i B_i]) = Σ_i a_i C B_i(1) = I(Σ_i a_i B_i, C) = I(Σ_i a′_i B′_i, C) = Σ_i a′_i C B′_i(1) = Cϕ([Σ_i a′_i B′_i])

Since this holds for arbitrary C we must have ϕ([Σ_i a_i B_i]) = ϕ([Σ_i a′_i B′_i]). Surjectivity of ϕ follows from ϕ([1 ↦ v]) = v for any v ∈ V and finally if Z = [Σ_i a_i B_i] ∈ ker ϕ and C ∈ V* then I(Z, C) = Σ_i a_i (C ◦ B_i)(1) = C(Σ_i a_i B_i(1)) = Cϕ(Z) = 0. The case of covectors is left as an exercise.

Often 1-vectors and 1-covectors will be identified with the usual vectors and covectors using the above isomorphisms.

For k = 0 we get Λ^0 V = Λ^0 V* = R. Suppose C ∈ L(V, R^0) and B ∈ L(R^0, V). Then C is the 0-function and B the 0-vector but nevertheless we have the convention that I(B, C) = det(C ◦ B) = 1 since the determinant is an empty product. That means that we allow for example 2[C] and do not identify it with [C].

For k = 2 we already obtain something new. For example in Λ^2 R^3 we have [B] + [B̃] = [B̄], where Be_1 = e_1, Be_2 = e_1 + e_3, B̃e_1 = e_1 − e_2, B̃e_2 = e_2 and B̄e_1 = e_1, B̄e_2 = e_2 + e_3. To show equality we use brute force and write down matrices for B, B̃, B̄ and a general C ∈ L(R^3, R^2) and compute:

 1 1   a b c   a a + c  det 0 0 = det = a(d + f) − (a + c)d (3.2) d e f   d d + f 0 1  1 0   a b c   a − b b  det −1 1 = det = (a − b)e − b(d − e) (3.3) d e f   d − e e 0 0  1 0   a b c   a b + c  det 0 1 = det = a(e + f) − (b + c)d (3.4) d e f   d e + f 0 1

We see that as expected det(C ◦ B) + det(C ◦ B˜) = det(C ◦ B¯) i.e. the first two lines add to make the third for any C represented by the 2 × 3 matrix. Therefore [B] + [B˜] = [B¯]. Of course there are better and more geometric ways to do this as we will see below. Since the definition is really confusing to work with directly we will show how any basis of V gives rise to bases for the spaces of k-(co)vectors, see lemma 12. To construct the basis we first give a recipe for building k-vectors from usual ones.
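The brute-force verification can also be delegated to a few lines of code: pairing B, B̃ and B̄ with a handful of randomly sampled C ∈ L(R³, R²) (the random sampling is an illustrative choice) confirms det(C ◦ B) + det(C ◦ B̃) = det(C ◦ B̄).

```python
import random

# the three maps from the example, stored as 3x2 matrices (columns are images of e1, e2)
B    = [[1, 1], [0, 0], [0, 1]]    # Be1 = e1,        Be2 = e1 + e3
Bt   = [[1, 0], [-1, 1], [0, 0]]   # B~e1 = e1 - e2,  B~e2 = e2
Bbar = [[1, 0], [0, 1], [0, 1]]    # B¯e1 = e1,       B¯e2 = e2 + e3

def det_CB(C, M):
    # determinant of the 2x2 product of the 2x3 matrix C with the 3x2 matrix M
    P = [[sum(C[i][k] * M[k][j] for k in range(3)) for j in range(2)] for i in range(2)]
    return P[0][0] * P[1][1] - P[0][1] * P[1][0]

random.seed(0)
for _ in range(5):
    C = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
    assert abs(det_CB(C, B) + det_CB(C, Bt) - det_CB(C, Bbar)) < 1e-12
print("det(C∘B) + det(C∘B~) = det(C∘B¯) for all sampled C")
```

Of course five random samples prove nothing; the symbolic computation (3.2)-(3.4) above shows the identity holds for every C.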

Definition 9. (Wedge)
For B_1, …, B_k ∈ L(R, V) define ⋀_{i=1}^k B_i = B_1 ∧ B_2 ∧ ⋯ ∧ B_k = [B] ∈ Λ^k V with B(v_1, …, v_k) = Σ_i B_i v_i. Similarly for C^1, …, C^k ∈ L(V, R) define ⋀_{i=1}^k C^i = C^1 ∧ ⋯ ∧ C^k = [(C^1, …, C^k)] ∈ Λ^k V*.

Any element of L(Rk,V ) and L(V, Rk) can be written as a wedge product. The key thing to notice here is that taking the intersection maps of wedge products is easy:

I(⋀_j B_j, ⋀_i C^i) = det(C^i ◦ B_j)_{i,j=1…k}    (3.5)

Viewed as a matrix whose columns are determined by the vectors b_j = B_j(1) we get a function D(b_1, …, b_k) = det(C^i(b_j))_{i,j=1…k}.⁴ The two main properties of the determinant are that it is multilinear and alternating in the following sense (for all i ≠ j):

D(b_1, …, b_{j−1}, b_j + ab′_j, b_{j+1}, …, b_k) = D(b_1, …, b_j, …, b_k) + aD(b_1, …, b′_j, …, b_k)    (3.6)

D(b1, . . . , bi, . . . , bj, . . . bk) = −D(b1, . . . , bj . . . , bi, . . . , bk) (3.7)

For our wedges this means that we get the following relations in ΛkV :

B_1 ∧ ⋯ ∧ (B_j + aB′_j) ∧ ⋯ ∧ B_k = (B_1 ∧ ⋯ ∧ B_j ∧ ⋯ ∧ B_k) + a(B_1 ∧ ⋯ ∧ B′_j ∧ ⋯ ∧ B_k)    (3.8)

B_1 ∧ ⋯ ∧ B_i ∧ ⋯ ∧ B_j ∧ ⋯ ∧ B_k = −B_1 ∧ ⋯ ∧ B_j ∧ ⋯ ∧ B_i ∧ ⋯ ∧ B_k    (3.9)

This is proven by pairing with an arbitrary element C = ⋀_i C^i and unwinding the definitions. Notice that the addition on the left hand side can be taken either as vectors or as 1-vectors because of the isomorphism ϕ between V and Λ^1 V.

⁴This is a direct consequence of Gaussian elimination, lemma 1.

Precisely the same formulas hold in Λ^k V* where instead of the B_i we now have C^i ∈ L(V, R). This is because the determinant is not just multilinear and alternating in the columns but also in the rows of a matrix.

The example above can now be streamlined considerably. Notice [B] = e_1 ∧ (e_1 + e_3) while [B̃] = (e_1 − e_2) ∧ e_2. From the above rules it follows that [B] + [B̃] = e_1 ∧ (e_1 + e_3) + (e_1 − e_2) ∧ e_2 = e_1 ∧ e_3 + e_1 ∧ e_2 = e_1 ∧ (e_2 + e_3) = [B̄].

The wedge products also provide a convenient basis for our spaces of k-(co)vectors. Given a basis b_1, …, b_n with dual basis b^i of V we can build bases for the spaces of k-(co)vectors as follows. Define b_I = ⋀_{i∈I} b_i and b^I = ⋀_{i∈I} b^i. Here the wedge product is taken in increasing order. For example b_{1,5,2} = b_1 ∧ b_2 ∧ b_5.

Lemma 12. (Basis of k-(co)vectors)
{b_I | I ⊂ {1, …, n}, |I| = k} and {b^I | I ⊂ {1, …, n}, |I| = k} are a basis and dual basis for the spaces Λ^k V and Λ^k V*. In particular dim Λ^k V = (n choose k) = dim Λ^k V* and (Λ^k V)* ≅ Λ^k V* via the non-degenerate pairing I.

Proof. Since Λ^k V is spanned by elements ⋀_{i=1}^k B_i with B ∈ L(R^k, V) it suffices to write those in terms of the b_I. Expressing B_i ∈ L(R, V) ≅ V in terms of the basis and reordering the factors in the wedge product using (3.8) is enough. The same works to show that the b^I span the space of k-covectors. Linear independence follows from the formula

I(b_I, b^J) = 1 if I = J and 0 if I ≠ J,

that is a consequence of the determinant interpretation above.

In practice one often has to change coordinates or map to other vector spaces. To transfer k-covectors from vector space W to vector space V we use the pull-back. Recall that given a linear map H from V to W we have a transpose/dual map H∗ from W ∗ to V ∗ by pre-composing with H.

Definition 10. (Pull-back) For any H ∈ L(V,W ) we define

H* : Λ^k W* ∋ [Σ_i a_i C^i] ↦ [Σ_i a_i (C^i ◦ H)] ∈ Λ^k V*

This is well defined because if [Σ_i a_i C^i] = [Σ_i ā_i C̄^i] then for any B we have

I(B, H* Σ_i ā_i C̄^i) = Σ_i ā_i I(H ◦ B, C̄^i) = Σ_i a_i I(H ◦ B, C^i) = I(B, H* Σ_i a_i C^i)

Since this holds for all B we have H*[Σ_i a_i C^i] = H*[Σ_i ā_i C̄^i] as required. A similar definition allows us to transport k-vectors but the direction is opposite (Exercise). The pull-back factors nicely through compositions and wedge products: H* ⋀ C^i = ⋀ H*C^i and H*G* = (G ◦ H)*.

A simple example of pull-back is the following. Define H ∈ L(R^2, R^2) by He_1 = ae_1 + ce_2 and He_2 = be_1 + de_2. Then e^1 ∧ e^2 ∈ Λ^2 (R^2)* can be pulled back along H and this gives H*(e^1 ∧ e^2) = [(e^1 ◦ H, e^2 ◦ H)] = (ae^1 + be^2) ∧ (ce^1 + de^2) = (ad − bc)(e^1 ∧ e^2). It is not a coincidence that the determinant of the matrix of H pops up here.

Exercises

Exercise 1 (Connection with multilinear alternating functions)
In the literature one usually describes k-covectors using alternating multilinear maps. A function A : V^k → R is alternating if A(v_1, …, v_k) = 0 whenever v_i = v_j for some i ≠ j, and multilinear means linear in each component. Let Alt^k(V) be the space of all alternating multilinear maps on V. In this exercise we prove that Alt^k(V) ≅ Λ^k V*.

a. For any C ∈ L(V, R^k) show that the function A_C defined by A_C(v_1, …, v_k) = I(⋀_i v_i, C) is in Alt^k(V).

b. Prove that AltkV is a vector space with respect to point-wise linear combinations. Show that the previous part gives a linear map ΛkV ∗ → Altk(V ).

c. Show that the above map is an isomorphism.

Exercise 2 (Exploring 2-vectors) What are 2-vectors in R? In R2? and in R3 and R4?

Exercise 3
Wedge product. There is a bilinear map ∧ : Λ^k(R^n)* × Λ^ℓ(R^n)* → Λ^{k+ℓ}(R^n)* called the wedge product. It is defined by F ∧ G = [(F, G)] for any F ∈ L(R^n, R^k), G ∈ L(R^n, R^ℓ) and extended bilinearly. Why is it well-defined?

Exercise 4 One may push-forward a k-vector field on W to a k-vector field on V using formulas similar to those of the pull-back. Can you make this precise?

Exercise 5
Show that for B ∈ L(R^k, V) and C ∈ L(V, R^k) we have

I(B, C)^{−1} = ± lim_{n→∞} n^{−k} Σ_{w ∈ ((1/n)Z)^k} #(C^{−1}([0, 1]^k) ∩ B(w))

You may assume that C ◦ B is an isomorphism.

Exercise 6 Give an explicit isomorphism between the space of 1-covectors Λ1V ∗ and V ∗.

Exercise 7
When V = R^3 are there any 2-covectors Y ∈ Λ^2 V* such that Y ≠ [C] for any C ∈ L(V, R^2)? What about the case V = R^4? Hint: use lemma 12.

Exercise 8
Given H ∈ L(V, W) and G ∈ L(U, V) prove that for any Y ∈ Λ^k W* we have (G* ◦ H*)Y = (H ◦ G)*Y.

Figure 3.2: Some random vector fields on a square given by quadratic functions.

Exercise 9
In this exercise we identify vectors in R^2 with 1-vectors in Λ^1 R^2. Explain why for any a, b ∈ R^2 we have a λ ∈ R such that a ∧ b = λ e_1 ∧ e_2. We say that λ/2 is the oriented area of the triangle with vertices 0, a, b. Prove that for a, b, c ∈ R^2 we have (1/2)(b − a) ∧ (c − a) = (1/2) a ∧ b + (1/2) b ∧ c + (1/2) c ∧ a. What is the relationship between areas of triangles that is implied?

3.3 (co)-vector fields and integration

Now that we have created k-(co)vectors we would like them to vary as you move around an open set P ⊂ R^n. Perhaps the most familiar instance of this idea is that of a vector field. A vector field X is just a function X : P → R^n that we visualize by drawing an arrow X(p) at point p ∈ P. In our terminology vectors coincide with 1-vectors and so we can define k-vector fields and even k-covector fields in the same way.

Definition 11. ((co-)vector fields in R^n)
Let P ⊂ R^n be an open set. A C^m k-vector field is a C^m function X : P → Λ^k R^n. A C^m k-covector field⁵ is a C^m function ω : P → Λ^k (R^n)*. The set of all C^2 k-covector fields is called Ω^k(P). Our convention is that Ω^0(P) is the set of C^1-functions on P.⁶

We are using here the fact that the derivative can be defined for functions P → V for any vector space V. All the formulas and definitions generalize to this case (Exercise!). Alternatively one makes an explicit isomorphism between V and R^N by choosing a basis and checks that nothing really depended on the choices made.

The simplest examples of k-covector fields are the constant k-covector fields. For any Y ∈ Λ^k (R^n)* we get a k-covector field also called Y ∈ Ω^k(P) defined by Y(p) = Y for all p ∈ P. For example e^1 ∧ e^2 is a 2-covector in R^2 and can also be interpreted as the constant 2-covector field on some P ⊂ R^2. Constant functions and constant vector fields are used similarly.

Operations on k-covectors are extended pointwise to k-covector fields. For example for α, ω ∈ Ω^k(P) we define (α + ω) ∈ Ω^k(P) by (α + ω)(p) = α(p) + ω(p) and (⋀_{i=1}^s η^i)(p) = ⋀_{i=1}^s (η^i(p)) for η^1, …, η^s ∈ Ω^1(P). For any f ∈ Ω^0(P) we define fω ∈ Ω^k(P) by (fω)(p) = f(p)ω(p).

Actually the constant k-covector fields together with pointwise multiplication by functions already are enough to describe all fields.

⁵These are also known as k-forms or differential k-forms.
⁶This is consistent with Λ^0 (R^n)* ≅ R.

Lemma 13. (Basis for the k-(co)vector fields)
Given any basis b_1, …, b_n of R^n and P ⊂ R^n denote by b^I ∈ Ω^k(P) the constant k-covector field defined by b^I(p) = ⋀_{i∈I} b^i. Any ω ∈ Ω^k(P) can uniquely be expressed as ω = Σ_I f_I b^I for f_I ∈ Ω^0(P), where the sum ranges over all k-element subsets I ⊂ {1, …, n}. Likewise any C^m k-vector field on P may be expressed uniquely as Σ_I f_I b_I for C^m-functions f_I on P.

Proof. For any p ∈ P we have ω(p) ∈ Λ^k (R^n)*. By Lemma 12 there must exist numbers f_I(p) ∈ R such that ω(p) = Σ_I f_I(p) b^I. The functions p ↦ f_I(p) defined this way are C^1 because they are the components of the C^1-function ω : P → Λ^k (R^n)* with respect to the basis formed by the b^I, see property 4 of Theorem 1. The case for k-vector fields is similar.

Sometimes it is convenient to figure out the coefficient functions $f_I$ of some unknown $\omega = \sum_I f_I b^I$ by intersecting with the dual basis $b_I$. Recall $I(b_I, b^J) = 0$ unless $I = J$, in which case we get $1$. Therefore $f_I(p) = I(b_I, \omega(p))$ for any $p \in P$.

$k$-covector fields naturally arise as integrands and also as certain derivatives of ordinary functions. For example, important 1-covector fields are provided by the derivative of a function:

Definition 12. (differential of a function)
Define for any $f \in \Omega^0(P)$ the 1-covector field $df \in \Omega^1(P)$ by $df(p) = f'(p)$.

This 1-covector field plays the role of the gradient $\nabla f$, in the sense that its level sets are locally those of $f$. More precisely, using an inner product, 1-covector fields may also be written in terms of vector fields by setting $\omega(p)v = X(p)\cdot v$. In particular we claim $df(p)v = \nabla f(p)\cdot v$ for all vectors $v$ and $p \in P$ (Exercise!).

The differential also demystifies the $dx$ from calculus. Viewing $x$ as the function on, say, $\mathbb{R}^2$ that sends any vector to its first coordinate, its differential $dx$ is just a 1-covector field. Since our $x$ is a linear function we have $dx(p) = e^1$ for all $p \in \mathbb{R}^2$. Below we will see that this definition fits well with our integrals. More generally, denote the function on $\mathbb{R}^n$ that extracts the $i$-th coefficient of a vector by $x^i$, so really $x^i = e^i \in \mathbb{R}^{n*}$. Then $dx^i(p) = e^i$ for all $p$. As before set $dx^I = \bigwedge_{i\in I} dx^i$ for some $I \subset \{1,\dots,n\}$. Lemma 13 above then says that any $k$-covector field is a sum of terms $f_I\, dx^I$. This is how they often appear implicitly in calculus texts as integrands. There one writes $dx\,dy$ instead of the more correct $dx \wedge dy$, and usually also abuses notation by using $x$ for both the first coordinate and the function extracting the first coordinate. The formula for the differential can be written in terms of the basis $dx^1,\dots,dx^n$ as $df = \sum_i \frac{\partial f}{\partial x^i}\, dx^i$ (Exercise!). This formula is often used in calculus, but now it actually makes sense as an expansion of 1-covector fields.
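As a machine check (an illustration, not part of the argument), the expansion $df = \sum_i \frac{\partial f}{\partial x^i}dx^i$ and the claim $df(p)v = \nabla f(p)\cdot v$ can be verified symbolically with Python's sympy; the function $f(x,y) = x^2y$, the point and the vector are arbitrary choices:

```python
import sympy as sp

x, y, t = sp.symbols('x y t')
f = x**2 * y                              # an arbitrary example function

grad = [sp.diff(f, v) for v in (x, y)]    # coefficients of df in the basis dx, dy

# df(p)v: pair the coefficients with the components of v, at p = (1, 2), v = (3, -1)
p = {x: 1, y: 2}
df_p_v = sum(g.subs(p) * vi for g, vi in zip(grad, (3, -1)))

# compare with the directional derivative (f o gamma)'(0) along gamma(t) = p + t v
line = f.subs({x: 1 + 3*t, y: 2 - t})
assert sp.diff(line, t).subs(t, 0) == df_p_v
print(df_p_v)
```

Both computations give the same number, as the exercise above asks you to prove in general.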

3.3.1 Integration

Integration is defined much like the familiar case of line integrals. In a line integral one integrates a covector field over a curve by plugging in the velocity vector at each point and summing up the results. In general we integrate a $k$-covector field along a '$k$-dimensional parametrized curve' $\gamma\colon [0,1]^k \to \mathbb{R}^n$ by again plugging in the derivative at each point. Plugging in now means intersecting using $I$. We call $\gamma$ a (singular $k$-) cube.

Definition 13. (Integral of a $k$-covector field)
Define the integral by
$$\int_\gamma \omega = \int_{[0,1]^k} I(\gamma', \omega(\gamma))$$
where $\gamma\colon [0,1]^k \to P$ is a $C^1$ function (singular cube) and $\omega \in \Omega^k(P)$. The integrand is shorthand for the function $t \mapsto I(\gamma'(t), \omega(\gamma(t)))$.

A typical line integral arises if we set, say, $\gamma(t) = (\cos t, \sin t)$ and $\omega(x,y) = -(x^2+y^2)e^2$. Then $\int_\gamma \omega = \int_{[0,1]} -e^2(-\sin t, \cos t) = \int_{[0,1]} -\cos t = -\sin 1$.

For a two-dimensional example, take the 2-covector field $\omega \in \Omega^2(\mathbb{R}^3)$ defined by $\omega(x,y,z) = (x+y)\,e^1\wedge e^3$; we can integrate this over a square parametrized by $\gamma\colon [0,1]^2 \to \mathbb{R}^3$ defined by $\gamma(s,t) = (s+t^2, t, -s)$. The integral $\int_\gamma\omega$ is computed as follows. Notice $[\gamma'(s,t)] = b_1\wedge b_2 = 2t\, e_1\wedge e_3 + e_1\wedge e_2 + e_2\wedge e_3$, where $b_1 = (1,0,-1)$ and $b_2 = (2t,1,0)$. Therefore the integrand $I(\gamma'(s,t), \omega(\gamma(s,t)))$ can be computed as $I(b_1\wedge b_2, (s+t^2+t)\,e^1\wedge e^3) = 2t(s+t^2+t)$, so in the end we get an ordinary integral $\int_{[0,1]^2} 2t(s+t^2+t) = \int_{[0,1]} t + 2t^3 + 2t^2 = \frac{5}{3}$.

In our new notation the usual fundamental theorem of calculus for line integrals (1-covector fields) reads: for any $f\colon P \to \mathbb{R}$ and $C^1$ curve $\gamma$ we have
$$\int_\gamma df = f(\gamma(1)) - f(\gamma(0)).$$
Unwinding the definitions this is just a restatement of the usual fundamental theorem, since $\int_\gamma df = \int_{[0,1]} I(\gamma'(\cdot), df(\gamma(\cdot))) = \int_{[0,1]} f'(\gamma(\cdot))(\gamma'(\cdot)) = \int_{[0,1]} (f\circ\gamma)'(\cdot) = f(\gamma(1)) - f(\gamma(0))$.

As with ordinary $k$-covectors we may transfer $k$-covector fields by pulling them back along a map $\varphi\colon P \to Q$. In practice this just comes down to plugging in the equations for $\varphi$ and differentiating.
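The worked two-dimensional example above can be double-checked by machine. The following sympy sketch recomputes $\int_\gamma\omega = \frac{5}{3}$ directly from the definition, using that pairing $\gamma'$ with $e^1\wedge e^3$ amounts to taking the minor of the derivative matrix built from rows 1 and 3:

```python
import sympy as sp

s, t = sp.symbols('s t')
gamma = sp.Matrix([s + t**2, t, -s])       # gamma(s,t) = (s + t^2, t, -s)
d_s, d_t = gamma.diff(s), gamma.diff(t)    # the two columns of gamma'

# I(gamma', e^1 ^ e^3): the minor of gamma' built from rows 1 and 3
minor_13 = sp.Matrix([[d_s[0], d_t[0]], [d_s[2], d_t[2]]]).det()
integrand = (gamma[0] + gamma[1]) * minor_13   # omega = (x + y) e^1 ^ e^3 along gamma
val = sp.integrate(integrand, (s, 0, 1), (t, 0, 1))
assert val == sp.Rational(5, 3)
```

The symbolic result agrees with the hand computation.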

Definition 14. (Pull-back of covector fields)
For a map $\varphi\colon P \to Q$ we have a pull-back $\varphi^*\colon \Omega^k(Q) \to \Omega^k(P)$ defined by $(\varphi^*\omega)(p) = (\varphi'(p))^*\,\omega(\varphi(p))$. In the special case $k = 0$ of functions we define $\varphi^* f = f\circ\varphi$ for $f \in \Omega^0(Q)$.

As an example consider polar coordinates: $\varphi\colon P = (0,\infty)\times(0,2\pi) \ni (r,t) \mapsto (r\cos t, r\sin t) \in Q$, where $Q$ is $\mathbb{R}^2$ minus the positive $x$-axis, and $\omega \in \Omega^2(Q)$ is given by $\omega(x,y) = (x-y)\,e^1\wedge e^2$. By definition $(\varphi^*\omega)(r,t) = r(\cos t - \sin t)\,\varphi'(r,t)^*(e^1\wedge e^2)$. Since $e^1\circ\varphi'(r,t) = \cos t\, e^1 - r\sin t\, e^2$ and $e^2\circ\varphi'(r,t) = \sin t\, e^1 + r\cos t\, e^2$ we get
$$\varphi'(r,t)^*(e^1\wedge e^2) = (e^1\circ\varphi'(r,t)) \wedge (e^2\circ\varphi'(r,t)) = (\cos t\, e^1 - r\sin t\, e^2)\wedge(\sin t\, e^1 + r\cos t\, e^2) = r\, e^1\wedge e^2,$$
so we found $(\varphi^*\omega)(r,t) = r^2(\cos t - \sin t)\,(e^1\wedge e^2)$. Beware that we used $e^1, e^2$ to denote the dual basis in both the domain and the range. One often sees this computation done in terms of $dx, dy$ and $dr, dt$ instead, just plugging in the formulas of $\varphi$ for $x, y$ to express $dx$ and $dy$ in terms of $dr, dt$ by differentiating. The end result would then look like $\varphi^*((x-y)\,dx\wedge dy) = r^2(\cos t - \sin t)\,dr\wedge dt$. The pull-back is what transforms integrands correctly when changing coordinates:
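For a top-degree covector field on $\mathbb{R}^2$ the pull-back coefficient is just the composed coefficient times $\det\varphi'$, so the polar-coordinate example can be checked in two lines of sympy (an illustration only):

```python
import sympy as sp

r, t = sp.symbols('r t', positive=True)
phi = sp.Matrix([r*sp.cos(t), r*sp.sin(t)])   # polar coordinates
J = phi.jacobian([r, t])

# pull-back of (x - y) dx ^ dy: coefficient is ((x - y) o phi) * det(phi')
coeff = sp.simplify((phi[0] - phi[1]) * J.det())
assert sp.simplify(coeff - r**2*(sp.cos(t) - sp.sin(t))) == 0
```

This confirms the coefficient $r^2(\cos t - \sin t)$ found above.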

Lemma 14. (Substitution lemma)
For $\omega \in \Omega^k(Q)$ and $\varphi\colon P \to Q$ a $C^1$ function we have
$$\int_{\varphi\circ\gamma} \omega = \int_\gamma \varphi^*\omega.$$

Proof. By definition the integral on the left-hand side means
$$\int_{\varphi\circ\gamma} \omega = \int_{[0,1]^k} I((\varphi\circ\gamma)', \omega(\varphi\circ\gamma)).$$
Setting $\omega(\varphi\circ\gamma(t)) = \sum_i c_i [C^i]$ for suitable $C^i$, we see that the integrand equals (using the chain rule)
$$I(\varphi'(\gamma(t))\circ\gamma'(t), \omega(\varphi\circ\gamma(t))) = \sum_i c_i \det(C^i\circ\varphi'(\gamma(t))\circ\gamma'(t)) = \sum_i c_i\, I(\gamma'(t), C^i\circ\varphi'(\gamma(t))) = \sum_i c_i\, I(\gamma'(t), \varphi'(\gamma(t))^*[C^i]) = I(\gamma'(t), (\varphi^*\omega)(\gamma(t))),$$
which is precisely the integrand on the right-hand side.

The pull-back fits well with composition: we have $\varphi^*\circ\psi^* = (\psi\circ\varphi)^*$ for any $C^1$ functions $\varphi\colon P \to Q$ and $\psi\colon Q \to R$ (Exercise!). Also pull-back and wedge commute: $\varphi^*\bigwedge_{i=1}^s \eta^i = \bigwedge_{i=1}^s \varphi^*\eta^i$, because this is true pointwise. For functions $f \in \Omega^0(Q)$ taking pull-back also commutes with the differential: $d\varphi^* f = \varphi^* df$. This follows from the chain rule (Exercise).

Finally a word about our geometric interpretation. Recall we visualize a $k$-covector by considering its level sets, which are affine hyperplanes of codimension $k$. For a $k$-covector field $\omega \in \Omega^k(P)$ we can do the same at every point $p \in P$. One can imagine that the hyperplanes move and curve as we move $p$. Given a singular $k$-cube $\gamma\colon [0,1]^k \to P$, the interpretation of $\int_\gamma\omega$ is much like the interpretation of the intersection map $I$ itself: at every point $p$ we may ask how many times the level sets of $\omega(p)$ intersect the image of $\gamma$, or rather its linear approximation $\gamma'(p)$. In terms of the level sets, the pull-back simply applies the linear transformation $\varphi'(p)$ to all the level sets at that point. The substitution lemma expresses the fact that the derivative scales everything correctly, so that the intersections in $P$ and in $Q$ are counted the same. While perhaps not precise enough to count as proof, such visual arguments can help in getting a sense of what is going on and serve as an antidote to the rather heavy notation used in this subject.
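The substitution lemma can also be tested on a concrete case (a sketch, not part of the proof); here $\varphi$ is the polar map, $\omega = x\,dy$, and $\gamma(s) = (1, \pi s/2)$, so $\varphi\circ\gamma$ is a quarter circle — all arbitrary choices:

```python
import sympy as sp

s, r, t = sp.symbols('s r t')
phi = sp.Matrix([r*sp.cos(t), r*sp.sin(t)])   # phi(r,t) = (r cos t, r sin t)
gamma = sp.Matrix([sp.Integer(1), sp.pi*s/2]) # gamma(s) = (1, pi s / 2)

# left-hand side: integral of omega = x dy over phi o gamma
pg = phi.subs({r: gamma[0], t: gamma[1]})
lhs = sp.integrate(pg[0] * sp.diff(pg[1], s), (s, 0, 1))

# right-hand side: integral of phi* omega = (x o phi)(dy/dr dr + dy/dt dt) over gamma
w_r = phi[0] * sp.diff(phi[1], r)
w_t = phi[0] * sp.diff(phi[1], t)
sub = {r: gamma[0], t: gamma[1]}
rhs = sp.integrate(w_r.subs(sub)*sp.diff(gamma[0], s)
                   + w_t.subs(sub)*sp.diff(gamma[1], s), (s, 0, 1))

assert sp.simplify(lhs - rhs) == 0
```

Both sides evaluate to $\pi/4$, as the lemma predicts.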

Exercises

Exercise 1
For $\omega \in \Omega^2(\mathbb{R}^4)$ defined by $\omega(x,y,z,w) = (x+z)\,dx^1(x,y,z,w)\wedge e^3 + (y+w)\,e^2\wedge e^4$ and the 2-cube $\gamma$ given by $\gamma(s,t) = (s, 2t, t, -3s)$, compute the integral $\int_\gamma\omega$ explicitly.
Hint: the $dx^1(x,y,z,w)$ is just to show off. It can safely be replaced by $e^i$ for some $i$ (which $i$?).

Exercise 2
Define the gradient of a $C^1$ function $f\colon P \to \mathbb{R}$, $P \subset \mathbb{R}^n$, as $\nabla f = \sum_{i=1}^n \frac{\partial f}{\partial x^i}\, e_i$. Prove that for any $v \in \mathbb{R}^n$ and $p \in P$ we have $\nabla f(p)\cdot v = (df(p))(v)$. Also show that for any $C^1$ curve $\gamma$ in a level set of $f$ with $\gamma(0) = p$, the velocity vector $\gamma'(0)$ is perpendicular to $\nabla f(p)$ and is in $\ker df(p)$.

Exercise 3
Define $\varphi\colon (0,\infty)\times(0,2\pi) \ni (r,t) \mapsto (r\cos t, r\sin t) \in \mathbb{R}^2$. Let $x, y \in \mathbb{R}^{2*}$ be the dual basis to the standard basis and $r, t$ the dual basis in the domain of $\varphi$.
a. Compute $\varphi^*(dx)$ and $d(\varphi^* x)$ explicitly from the definitions.
b. Compute $\int_\gamma \eta$ with $\eta = r\,dt$ and $\gamma$ the 1-cube defined by $\gamma(s) = (s,s)$.
c. Compute $\int_\alpha \omega$ with $\alpha(u) = (u\cos u, u\sin u)$ and $\omega = \frac{-y}{\sqrt{x^2+y^2}}\,dx + \frac{x}{\sqrt{x^2+y^2}}\,dy$ by expressing it as the integral of a pull-back along $\varphi$.
Note that in this exercise we follow the common abuse of notation of using the symbols $r, t$ both for the coordinates of a point and for the dual vectors reading off these coordinates. The 1-covector field $\eta$ defined loosely by $r\,dt$ really sends the point $p$ to $e^1(p)\,e^2 \in \Lambda^1\mathbb{R}^{2*} = \mathbb{R}^{2*}$.

3.4 More on cubes and their boundary

We would like to have a version of the fundamental theorem of calculus that says: the integral of the derivative over a cube equals the integral of the function over the boundary of that cube. For example, in one dimension we have $\int_{[a,b]} f' = f(b) - f(a)$. It is tempting to write the right-hand side as $\int_{\{a,b\}} f$, since $\{a,b\}$ is the set of boundary points of the interval $[a,b]$; however, that way we would miss the crucial minus sign. The boundary needs to be oriented so that the point $a$ carries a minus sign and $b$ a plus sign. Also, the boundary of the interval is not described by a single cube but rather by two, corresponding to its two faces. Notice we are talking about cubes as if they were actual geometric cubes, but in reality our cubes are maps $[0,1]^k \to \mathbb{R}^n$. Nevertheless it still makes sense to speak about faces, using the domain of the maps.

Definition 15. (Faces)
The standard $k$-cube is the identity $I^k\colon [0,1]^k \to \mathbb{R}^k$. The faces of the standard $k$-cube are $(k-1)$-cubes in $\mathbb{R}^k$ indexed by $i \in \{1,\dots,k\}$ and $\sigma \in \{0,1\}$, defined by $I^k_{(i,\sigma)}(x) = I^k(x_1,\dots,\sigma,\dots,x_k)$ with the $\sigma$ in the $i$-th place. For a general $k$-cube $\gamma\colon [0,1]^k \to M$ we define the faces $\gamma_{i,\sigma} = \gamma\circ I^k_{i,\sigma}$.

Figure 3.3: The standard cube $I^2$ and its faces.

The boundary of a $k$-cube is the union of all these faces. Instead of a union we prefer to write it as a linear combination. This is convenient for keeping track of their orientations and makes sense once we start integrating over the faces: the integral over the boundary will be the sum of the integrals over the faces anyway. In this context formal combinations of cubes are referred to as chains.

Definition 16. ($k$-Chain)
A $k$-chain is a finite formal linear combination of $k$-cubes. The integral is extended to $k$-chains by $\int_{a\gamma + b\tilde\gamma}\omega = a\int_\gamma\omega + b\int_{\tilde\gamma}\omega$.

For example the boundary of the standard 1-cube $I^1$ (the interval $[0,1]$) consists of the endpoints $\{0\}$ and $\{1\}$, where the first gets a minus sign and the second a plus sign. Written as a 0-chain the boundary of $I^1$ is $-(0\mapsto\{0\}) + (0\mapsto\{1\})$, a formal sum of maps from $[0,1]^0 = \{0\}$ to $[0,1]$. Likewise the boundary of the unit square is a sum of four terms: $[0,1]\times\{0\}$, $\{1\}\times[0,1]$, $[0,1]\times\{1\}$ and $\{0\}\times[0,1]$. Taking the orientations as in Figure 3.3, we write the 1-chain version of the boundary of $I^2$ as $-I^2_{1,0} - I^2_{2,1} + I^2_{1,1} + I^2_{2,0}$. In general we define the boundary of the standard $k$-cube to be the $(k-1)$-chain
$$\partial I^k = \sum_{i=1}^k \sum_{\sigma\in\{0,1\}} (-1)^{i+\sigma}\, I^k_{i,\sigma}.$$

More generally we define the boundary of a k-chain by

Definition 17. (Boundary)
The boundary of the $k$-cube $\gamma$ is $\partial_k\gamma = \sum_{i=1}^k\sum_{\sigma\in\{0,1\}} (-1)^{i+\sigma}\,\gamma_{i,\sigma}$. The boundary of the chain $\sum_i a_i\gamma_i$ is by definition $\sum_i a_i\,\partial_k\gamma_i$. Whenever the dimensions are clear from the context we will drop the subscript and write $\partial$ for all boundary maps.

The boundary of the boundary is always zero. For example, in the picture each of the four vertices appears twice in the expression for $\partial\partial I^2$, once with a plus sign and once with a minus sign. This is true in general too.

Lemma 15. (Boundary of boundary) For all chains γ we have ∂∂γ = 0.

Proof. Exercise.

This lemma is the starting point of the subject of homology in algebraic topology. For any space $X$ the $k$-th homology is the vector space of all $k$-chains in $X$ that have no boundary, modulo those that are the boundary of something. By the previous lemma this definition actually makes sense. More precisely, the $k$-th homology of $X$ is defined to be $H_k(X) = \ker\partial_k/\mathrm{im}\,\partial_{k+1}$, where the subscript indicates the dimension of the chains we apply $\partial$ to.
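The cancellation behind Lemma 15 is purely combinatorial, so it can be exercised by machine. The sketch below (a hypothetical encoding, not from the text) represents a face of the standard $K$-cube by the set of coordinates it fixes and applies the sign rule $(-1)^{i+\sigma}$, where $i$ is the face index within the cell's own free directions; every cell of $\partial\partial I^K$ then appears once with each sign:

```python
from collections import Counter

K = 3  # dimension of the standard cube; a small arbitrary choice

def boundary(chain):
    """chain: Counter mapping cells to integer coefficients.
    A cell is a sorted tuple of fixed coordinates ((i, sigma), ...);
    the empty tuple is the standard K-cube itself."""
    out = Counter()
    for fixed, coeff in chain.items():
        fixed_dirs = {i for i, _ in fixed}
        free = [i for i in range(1, K + 1) if i not in fixed_dirs]
        # the face index in the cell's own coordinates is its position among free directions
        for pos, i in enumerate(free, start=1):
            for sigma in (0, 1):
                sign = (-1) ** (pos + sigma)
                out[tuple(sorted(fixed + ((i, sigma),)))] += coeff * sign
    return out

cube = Counter({(): 1})
d1 = boundary(cube)
assert sum(1 for c in d1.values() if c != 0) == 2 * K   # the 3-cube has 6 faces
dd = boundary(d1)
assert all(c == 0 for c in dd.values())                  # boundary of boundary vanishes
```

Removing direction $i$ then $j$ (with $i<j$) picks up $(-1)^{i+j-1+\sigma+\tau}$, while the other order gives $(-1)^{i+j+\sigma+\tau}$, which is exactly the cancellation the code observes.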

Exercises

3.5 Exterior derivative

At each point $p \in P \subset \mathbb{R}^n$ we imagine the $k$-covector $\omega(p)$ provides a stack of codimension-$k$ hyperplanes in some direction with a certain intensity. We can now ask how $\omega$ varies with $p$. In particular we can imagine the hyperplanes rotate a bit, and to increase the intensity new planes need to be added every now and then. Planes that get added have to start somewhere, at a certain codimension-$(k+1)$ stack of planes. Indeed, if a codimension-$k$ plane ends, it ends on a codimension-$(k+1)$ plane. This is what the exterior derivative does: it describes at each point the direction and intensity of the planes that start at that point. The notation is $d\omega$ and it is a $(k+1)$-covector field.

To be more concrete, recall that any $k$-covector field may be expressed in terms of the constant $k$-covector fields $e^I$ multiplied by functions, see Lemma 13. It suffices to define the exterior derivative for such $k$-covector fields, and this can be done using the differential of the functions:

Definition 18. (Exterior derivative)
Define $d\colon \Omega^k(P) \to \Omega^{k+1}(P)$ by $d\big(\sum_I f_I\, e^I\big) = \sum_I df_I \wedge e^I$.

This definition may seem to depend on the particular basis of $\mathbb{R}^n$ we are using, but this is not the case. The following lemma shows that the exterior derivative has many good properties that make it a respectable operation:

Lemma 16. (Properties of $d$)
Assume $\alpha, \omega \in \Omega^k(Q)$ and let $\varphi\colon P \to Q \subset \mathbb{R}^m$ be a $C^1$ map between open sets.
1. $d(\alpha + \omega) = d\alpha + d\omega$

2. $d(f\omega) = (df)\wedge\omega + f\,d\omega$ for any $f \in \Omega^0(Q)$

3. $dd\omega = 0$ for all $k$-covector fields $\omega$ that are $C^2$.
4. For $\beta^i \in \Omega^1(P)$ we have $d\bigwedge_{i=1}^m \beta^i = \sum_{j=1}^m (-1)^{j-1}\bigwedge_i \eta^i_j$, where $\eta^i_j = \beta^i$ if $i \neq j$ and $\eta^i_j = d\beta^i$ if $i = j$.

5. $\varphi^* d\omega = d\varphi^*\omega$

Proof. Property 1 follows directly from the definition. For the next property we use the product rule $d(fg) = (df)g + f\,dg$. Setting $\omega = g_I\, e^I$ we get $d(f\omega) = d(fg_I)\wedge e^I = (df)g_I\wedge e^I + f\,dg_I\wedge e^I = df\wedge\omega + f\,d\omega$. The general case follows by summing over all possible $I$.

By linearity and part 2, property 3 just needs to be proved for $\omega = f\eta$ with $\eta$ a constant $k$-covector field, so $d\eta = 0$. We know $d\omega = df\wedge\eta$. Since $df = \sum_i \frac{\partial f}{\partial x^i}\,e^i$ we have
$$dd\omega = d\sum_i \frac{\partial f}{\partial x^i}\,e^i\wedge\eta = \sum_i d\Big(\frac{\partial f}{\partial x^i}\Big)\wedge e^i\wedge\eta = \sum_{i,j}\frac{\partial^2 f}{\partial x^i\partial x^j}\,e^j\wedge e^i\wedge\eta = 0$$
because the partial derivatives commute, see Lemma 3.1, and $e^i\wedge e^j = -e^j\wedge e^i$.

Next, for property 4 it suffices to consider the case $\beta^i = f_i\, e^{k_i}$ for some functions $f_i$. Then $d\bigwedge_i\beta^i = d(f_1\cdots f_m)\wedge\bigwedge_i e^{k_i}$ and, using the product rule for the differential, $d(f_1\cdots f_m) = \sum_{j=1}^m \big(\prod_{i\neq j} f_i\big)\, df_j$. Moving the factor $df_j\wedge e^{k_j}$ to the $j$-th position produces the sign $(-1)^{j-1}$ and proves the statement.

The last property follows from the chain rule when $k = 0$. For the general case it suffices to take $\omega = f\,e^I$ and use properties 4 and 3: $d(\varphi^* f e^I) = d\big((f\circ\varphi)\wedge\bigwedge_{i\in I}\varphi^* e^i\big) = \big(d(\varphi^* f)\big)\wedge\varphi^* e^I = \varphi^* d(f e^I)$. The other terms arising from property 4 vanish because $d\varphi^* e^i = d\varphi^* dx^i = dd\varphi^* x^i = 0$; in the last equality we used the $k = 0$ case.

Thinking of $k$-covectors in terms of their level sets, for a $k$-covector field $\omega$ we may ask how the level sets of $\omega(p)$ vary with $p$. Intuitively at least, we may attempt to connect the level sets of $\omega(p)$ and $\omega(q)$ when $p$ and $q$ are close. In case there are more level sets around $q$, some of them must have been created in going from $p$ to $q$. The exterior derivative describes where these new level sets are created. For example, compare the two equations $d(x\,dy) = dx\wedge dy$ and $d(y\,dy) = 0$ on $\mathbb{R}^2$. Visualizing the level sets of $dy$ is easy: just horizontal lines, the same number of them at every point. The level sets of $x\,dy$ are similar horizontal lines, but now they get denser as we increase the $x$-coordinate. This means we cannot connect the lines: going to the right, more and more lines appear. How many and where? Precisely one for each level set of $dx\wedge dy$, which is a uniform grid of points in the plane. In contrast, the level sets of $y\,dy$ have the same horizontal direction but get denser as we go higher in the $y$-direction. This way it is no problem to connect up the level sets without creating more lines along the way. In one sentence then, our intuitive interpretation of the exterior derivative $d\omega$ is that it describes the boundaries of the level sets of $\omega$. Keeping this interpretation in mind, the above properties hopefully look more natural.
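On $\mathbb{R}^2$ the exterior derivative of a 1-covector field $f\,dx + g\,dy$ has the single coefficient $\partial g/\partial x - \partial f/\partial y$, so the two examples and property 3 ($dd = 0$) are easy to check symbolically (an illustration with an arbitrary test function $h$):

```python
import sympy as sp

x, y = sp.symbols('x y')

def d1(f, g):
    """Exterior derivative of f dx + g dy on R^2: the coefficient of dx ^ dy."""
    return sp.diff(g, x) - sp.diff(f, y)

assert d1(0, x) == 1    # d(x dy) = dx ^ dy
assert d1(0, y) == 0    # d(y dy) = 0

# dd = 0: for any C^2 function h, d(dh) has coefficient h_yx - h_xy = 0
h = sp.sin(x*y) + x**3*y
assert sp.simplify(d1(sp.diff(h, x), sp.diff(h, y))) == 0
```

The last assertion is exactly the commuting of partial derivatives used in the proof above.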

Exercises

Exercise 1
In this exercise we identify the complex plane with $\mathbb{R}^2$. Let $x, y \in \mathbb{R}^{2*}$ be the dual basis to the standard basis of $\mathbb{R}^2$, so that $dx, dy \in \Omega^1(\mathbb{R}^2)$. If we define $dz = dx + i\,dy$ and set $f(x,y) = u(x,y) + iv(x,y)$, explain how the Cauchy-Riemann equations are equivalent to $d(f\,dz) = 0$. Compute $\int_\gamma \frac{1}{z}\,dz$ where $\gamma(t) = (\cos 2\pi t, \sin 2\pi t)$ is a parametrization of the unit circle. Why does your answer imply that there is no $g \in \Omega^0(\mathbb{R}^2 - \{0\})$ such that $dg = \mathrm{Im}(\frac{1}{z}\,dz)$?

Exercise 2
For $x, y, z$ the dual standard basis of $\mathbb{R}^3$, verify that $\mathrm{curl}\, f$ corresponds to $d(f_1\,dx + f_2\,dy + f_3\,dz)$ for any $C^1$ vector field $f = f_1e_1 + f_2e_2 + f_3e_3$.

3.6 The fundamental theorem of calculus (Stokes Theorem)

Finally we are ready to prove the most important part of the fundamental theorem of calculus, known as Stokes theorem. It is the part that relates the integral of the derivative to the integral on the boundary.

Theorem 4. (Fundamental theorem of calculus (Stokes Theorem))
For $\omega \in \Omega^{k-1}(P)$ and $\gamma$ any $k$-chain we have $\int_\gamma d\omega = \int_{\partial\gamma}\omega$.

Proof. We start with a proof of the simplest case, where $\gamma$ is the standard $k$-cube $I^k$ in $\mathbb{R}^k$. Also, assume $\omega = f\,e^J$ for some ordered $(k-1)$-tuple $J$ excluding a single index $j$ from $1,\dots,k$. In that case we can explicitly compute the left-hand side (below we comment on what was done at each step):
$$\int_{I^k} d\omega = \int_{I^k} df\wedge e^J = \int_{[0,1]^k} (-1)^{j-1}\frac{\partial f}{\partial x^j}(\cdot)\,e^{\{1\dots k\}} = \int_{[0,1]^k} (-1)^{j-1}\frac{\partial f}{\partial x^j}(\cdot) \tag{3.10}$$
$$= (-1)^{j-1}\int_{[0,1]^{k-1}}\left(\int_{[0,1]}\frac{\partial f}{\partial x^j}\right) = (-1)^{j-1}\int_{[0,1]^{k-1}} \big(f\circ I^k_{j,1} - f\circ I^k_{j,0}\big) \tag{3.11}$$

In the first step we computed the exterior derivative using the definition. In the second step we expanded $df = \sum_i \frac{\partial f}{\partial x^i}e^i$ and used $e^i\wedge e^J = \delta_{ij}(-1)^{j-1}e^{\{1\dots k\}}$. In the third step we used the theorem of Fubini to first integrate in the $j$ direction. In the fourth step we carried out the integral in the $j$ direction using the one-dimensional fundamental theorem of calculus. Next we turn to the right-hand side and compute:

$$\int_{\partial I^k}\omega = \sum_{i,\sigma}(-1)^{i+\sigma}\int_{I^k_{i,\sigma}}\omega = \sum_{i,\sigma}(-1)^{i+\sigma}\int_{[0,1]^{k-1}} I\big((I^k_{i,\sigma})', \omega(I^k_{i,\sigma})\big) = \sum_{\sigma=0}^1 (-1)^{j+\sigma}\int_{[0,1]^{k-1}} f\circ I^k_{j,\sigma} \tag{3.12}$$

The first step is the definition of the boundary of the standard cube. The second equality is the definition of the integral. Next we notice that the derivative satisfies $[(I^k_{i,\sigma})'] = e_{\{1,\dots,\hat i,\dots,k\}}$, where the hat means $i$ is omitted. Also recall $\omega(I^k_{i,\sigma}) = (f\circ I^k_{i,\sigma})\,e^J$. Therefore the intersection can be computed using $I(e_I, e^J) = 0$ unless $I = J$, in which case we get $1$. This explains the final equality and finishes the proof of the special case.

Finally we explain how the general case reduces to the case we just treated. First we may reduce to the case of $k$-cubes $\gamma$, since both $\partial$ and the integral are additive. Given any $k$-covector field $\omega$ we have
$$\int_\gamma d\omega = \int_{I^k}\gamma^*(d\omega) = \int_{I^k} d(\gamma^*\omega) = \int_{\partial I^k}\gamma^*\omega = \int_{\partial\gamma}\omega.$$
The third equality comes from expressing $\gamma^*\omega$ as a sum of instances of the above special case where we integrate $f_J\, e^J$.

As a simple example consider $\eta \in \Omega^2(\mathbb{R}^3)$ given by $\eta(x,y,z) = -x\,e^2\wedge e^3 + z\,e^1\wedge e^2$ and the 2-cube $\gamma(s,t) = (\cos 2\pi s)e_1 + (\sin 2\pi s)e_2 + te_3$. The integral $\int_\gamma\eta$ is simplified once we notice that $\eta = d\omega$ with $\omega(x,y,z) = zx\,e^2$. Using Stokes we get $\int_\gamma\eta = \int_{\partial\gamma}\omega = -\int_{\gamma_{1,0}}\omega + \int_{\gamma_{1,1}}\omega + \int_{\gamma_{2,0}}\omega - \int_{\gamma_{2,1}}\omega$. The first two integrals cancel because $\gamma_{1,0} = \gamma_{1,1}$. The third integral is $0$ because $\omega(\gamma_{2,0}) = 0$, and finally the last integral equals $-\int_{[0,1]} I(\gamma_{2,1}'(s), \omega(\gamma_{2,1}(s))) = -\int_{[0,1]} I(-2\pi\sin 2\pi s\, e_1 + 2\pi\cos 2\pi s\, e_2,\ \cos 2\pi s\, e^2) = -2\pi\int_{[0,1]}\cos^2 2\pi s = -\pi$.

The usual integral theorems of Gauss and Stokes from vector analysis in $\mathbb{R}^3$, mentioned at the beginning of the chapter, follow directly from our more general Stokes theorem. Using the inner product one identifies both the 1- and 2-covector fields with ordinary vector fields in $\mathbb{R}^3$. Under this identification gradient, curl and divergence are all instances of the exterior derivative, so our Stokes theorem can be applied.

Stokes theorem also gives a more conceptual expression for the exterior derivative, as an infinitesimal version of the theorem.
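The value $-\pi$ can also be recomputed directly from Definition 13, without invoking Stokes; the following sympy sketch does exactly that (pairing $\gamma'$ with $e^i\wedge e^j$ is the corresponding $2\times 2$ minor of the derivative matrix):

```python
import sympy as sp

s, t = sp.symbols('s t')
gamma = sp.Matrix([sp.cos(2*sp.pi*s), sp.sin(2*sp.pi*s), t])
col_s, col_t = gamma.diff(s), gamma.diff(t)

def minor(i, j):
    """I(gamma', e^i ^ e^j): the minor of gamma' built from rows i and j (0-based)."""
    return sp.Matrix([[col_s[i], col_t[i]], [col_s[j], col_t[j]]]).det()

# eta = -x e^2 ^ e^3 + z e^1 ^ e^2, evaluated along gamma
integrand = -gamma[0]*minor(1, 2) + gamma[2]*minor(0, 1)
val = sp.integrate(integrand, (s, 0, 1), (t, 0, 1))
assert sp.simplify(val + sp.pi) == 0   # the integral is -pi
```

Both routes, direct integration and Stokes, give $-\pi$.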

Lemma 17. (Exterior derivative is a co-boundary)
For any $B \in L(\mathbb{R}^k, \mathbb{R}^n)$ denote by $\tilde B_\epsilon$ the $k$-cube given by $t \mapsto \epsilon B(t) + p$. Then
$$I(B, d\omega(p)) = \lim_{\epsilon\to 0}\,\epsilon^{-k}\int_{\partial\tilde B_\epsilon}\omega.$$

Proof. Indeed, by Stokes theorem
$$\epsilon^{-k}\int_{\partial\tilde B_\epsilon}\omega = \epsilon^{-k}\int_{\tilde B_\epsilon} d\omega \in \Big[\min_{t\in[0,1]^k} I(B, d\omega(p+\epsilon t)),\ \max_{t\in[0,1]^k} I(B, d\omega(p+\epsilon t))\Big],$$
using property 2 of Lemma 5 and the fact that $\tilde B_\epsilon' = \epsilon B$, so the integrand is $\epsilon^k I(B, d\omega(p+\epsilon t))$. Taking the limit $\epsilon\to 0$ of the right-hand side gives $I(B, d\omega(p))$.

The significance of this lemma is that once we know the intersection of $d\omega(p)$ with any $B \in L(\mathbb{R}^k, \mathbb{R}^n)$, we have determined $d\omega(p)$. Logically speaking we could even have used this formula as the definition of the exterior derivative. We chose not to do so because it makes the derivations more complicated, but it does give useful intuition for what $d$ is.

Exercises

Exercise 1
Recall that one can interpret the exterior derivative intuitively in terms of level sets as follows (see the end of section 3.5): the level sets of $d\omega(p)$ are the places where the new level sets of $\omega(p)$ appear at $p$. Can you give an interpretation of Stokes theorem in terms of level sets?

Exercise 2
Suppose $\omega, \eta \in \Omega^k(P)$ and $\gamma$ is a $k$-cube such that $d\omega = d\eta$. Is it true that $\int_\gamma\omega = \int_\gamma\eta$?

Exercise 3
Define $\theta \in \Omega^2(\mathbb{R}^3 - \{0\})$ by
$$\theta(p) = \frac{e^1(p)\,e^2\wedge e^3 + e^2(p)\,e^3\wedge e^1 + e^3(p)\,e^1\wedge e^2}{|p|^3}.$$
Show that $d\theta = 0$ and also $\int_\gamma\theta = -4\pi$, where $\gamma(s,t) = (\cos 2\pi s\sin\pi t, \sin 2\pi s\sin\pi t, \cos\pi t)$ defines a 2-cube $\gamma$. Explain why there cannot be a 3-chain $\beta$ such that $\partial\beta = \gamma$.

3.7 Fundamental theorem of calculus: Poincaré lemma

The Stokes theorem is only one part of the fundamental theorem of calculus: the part telling us how to integrate the derivative. The other half is known as the Poincaré lemma. It provides a way to find a primitive, that is, to write your integrand as a derivative. More specifically it answers the question whether or not $\omega \in \Omega^k(P)$ is of the form $\omega = d\alpha$ for some $\alpha \in \Omega^{k-1}(P)$. We call $\alpha$ the primitive or potential. Suppose $d\omega \neq 0$; then we cannot have $\omega = d\alpha$, because then $0 = dd\alpha = d\omega \neq 0$ by Lemma 16. So a necessary condition for finding a primitive is $d\omega = 0$. The Poincaré lemma states that when the domain $P$ is simple enough this condition is actually sufficient.

Theorem 5. (Poincaré lemma)
Suppose $P \subset \mathbb{R}^n$ is an open set such that for any $p \in P$, $P$ contains the line segment connecting $0$ and $p$. If $\omega = \sum_I \omega_I\, e^I \in \Omega^k(P)$ is a $k$-covector field with $d\omega = 0$, then $d\alpha = \omega$ for
$$\alpha(x) = \sum_I\left(\int_{[0,1]} t^{k-1}\,\omega_I(tx)\right)\sum_{g=1}^k (-1)^{g-1}\, x_{i_g}\, e^{I - \{i_g\}},$$
where $x = (x_1,\dots,x_n) \in P$ and $I = \{i_1 < \dots < i_k\}$.

Proof. The condition $d\omega = 0$ means that $\frac{\partial}{\partial x^u}\omega_I = \sum_{g=1}^k (-1)^{g-1}\frac{\partial}{\partial x^{i_g}}\,\omega_{I|i_g\mapsto u}$. To show that $d\alpha = \omega$ it suffices to prove that the coefficient of $e^I$ in $d\alpha$ is $\omega_I$. Differentiating the $x_{i_g}$ for $g = 1,\dots,k$ gives the contribution $k\int_{[0,1]} t^{k-1}\,\omega_I(tx)$, and differentiating inside the integral gives

$$\sum_u\sum_{g=1}^k\int_{[0,1]} t^k(-1)^{g-1}\Big(\frac{\partial}{\partial x^{i_g}}\,\omega_{I|i_g\mapsto u}\Big)(tx)\, x_u = \sum_u\int_{[0,1]} t^k\Big(\frac{\partial}{\partial x^u}\,\omega_I\Big)(tx)\, x_u \tag{3.13}$$
$$= \int_{[0,1]} t^k\sum_u x_u\Big(\frac{\partial}{\partial x^u}\,\omega_I\Big)(tx) = \int_{[0,1]} t^k\,\frac{d}{dt}\,\omega_I(tx) \tag{3.14}$$

Together the terms form precisely $\int_{[0,1]}\frac{d}{dt}\big(t^k\,\omega_I(tx)\big) = \omega_I(x)$, finishing the proof.

Lemma 18. (Invariance under reparametrization of cubes)
Suppose $\varphi\colon [0,1]^k \to [0,1]^k$ is a $C^1$ function such that $\partial\varphi = \partial I^k$. Then $\int_{\gamma\circ\varphi}\omega = \int_\gamma\omega$ for any $k$-cube $\gamma$ and $\omega \in \Omega^k(P)$.

Proof. Find $\alpha$ such that $d\alpha = \gamma^*\omega$ using the Poincaré lemma on the cube. This is possible because by definition a $C^1$-function on $[0,1]^k$ is defined on a slightly larger product of open intervals $(-\epsilon, 1+\epsilon)^k$ for some $\epsilon > 0$, which satisfies the requirements of the lemma; also $d(\gamma^*\omega) = 0$, since $\Lambda^{k+1}\mathbb{R}^{k*} = \{0\}$. Then using Stokes we find
$$\int_{\gamma\circ\varphi}\omega = \int_\varphi\gamma^*\omega = \int_\varphi d\alpha = \int_{\partial\varphi}\alpha = \int_{\partial I^k}\alpha = \int_{I^k} d\alpha = \int_{I^k}\gamma^*\omega = \int_\gamma\omega.$$
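For $k = 1$ the Poincaré primitive reduces to $\alpha(p) = \int_0^1 \sum_i \omega_i(tp)\,p_i\,dt$, which is easy to test symbolically. The closed 1-form below is an arbitrary example, not from the text:

```python
import sympy as sp

x, y, tt = sp.symbols('x y tt')
# a closed 1-covector field: omega = 2xy dx + x^2 dy
w = [2*x*y, x**2]
assert sp.diff(w[1], x) - sp.diff(w[0], y) == 0   # d(omega) = 0

# Poincare-lemma primitive for k = 1: alpha(p) = int_0^1 sum_i omega_i(t p) p_i dt
sub = {x: tt*x, y: tt*y}
alpha = sp.integrate(w[0].subs(sub)*x + w[1].subs(sub)*y, (tt, 0, 1))

# check d(alpha) = omega
assert sp.simplify(sp.diff(alpha, x) - w[0]) == 0
assert sp.simplify(sp.diff(alpha, y) - w[1]) == 0
```

Here the integral evaluates to $\alpha = x^2y$, whose differential is indeed $2xy\,dx + x^2\,dy$.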

Exercises

Exercise 1
For each of the $k$-covector fields $\omega$ below, either find an $\alpha \in \Omega^{k-1}$ such that $d\alpha = \omega$ or prove that it cannot be done.

a. $\omega \in \Omega^2(\mathbb{R}^2 - \{0\})$ given by $\omega(p) = e^1\wedge e^2$.
b. $\omega \in \Omega^3(\mathbb{R}^4 - \{0\})$ given by $\omega(p) = e^1(p)\,e^1\wedge e^2\wedge e^4 + e^1(p)e^2(p)\,e^2\wedge e^3\wedge e^4$.
c. $\omega \in \Omega^1(\mathbb{R}^2 - \{0\})$ given by $\omega(p) = \frac{-e^2(p)\,e^1 + e^1(p)\,e^2}{|p|^2}$.
d. $\omega \in \Omega^1((-1,1)^2)$ given by $\omega(p) = \frac{-e^2(p-q)\,e^1 + e^1(p-q)\,e^2}{|p-q|^2}$, where $q = (2,2)$.

Chapter 4

Geometry through the dot product

So far none of our constructions used the dot product of $\mathbb{R}^n$ in any essential way. In this chapter we will explore how geometric notions arise from the inner product. Beyond Euclidean geometry, much more general curved spaces can be described if we allow the inner product to vary as we move in space. We start by recalling some essential linear algebra.

4.1 Vector spaces with a scalar product

In this section we work with an $n$-dimensional real inner product space $V$ with inner product $\langle\cdot,\cdot\rangle\colon V\times V \to \mathbb{R}$. Recall that a basis of $V$ really is a linear isomorphism $b \in L(\mathbb{R}^n, V)$ and that we often use the abbreviation $b_i = b(e_i)$. An orthonormal basis $b$ of $V$ means that $\langle b_i, b_j\rangle = \delta_{ij}$ for all $i, j = 1,\dots,n$. The Gram-Schmidt algorithm allows us to produce orthonormal bases.

Definition 19. (Orthogonal transformations) An element M ∈ L(V,V ) is said to be orthogonal if it preserves the inner product: hMv, Mwi = hv, wi. The set of all orthogonal transformations is denoted O(V ).

Reflections and rotations are examples of orthogonal transformations. Here we focus on reflec- tions since they are easier to understand.

Definition 20. (Reflection) Given a unit vector m ∈ V we define the reflection Rm ∈ O(V ) by Rmm = −m and Rmv = v for all v ⊥ m.

The vector $m$ is the normal vector to the hyperplane that acts as the mirror. An explicit formula for the reflection is $R_m v = v - 2m\langle m, v\rangle$. As expected, a reflection is its own inverse and has determinant $-1$. Perhaps less expected is that reflections generate all orthogonal transformations and thus relate all orthonormal bases.

Lemma 19. (Reflections generate)
Any two orthonormal bases of $V$ are related by an element of $O(V)$. Such elements may be written as a composition of finitely many reflections. If $R \in O(V)$ fixes a $k$-dimensional subspace, then it is the composition of at most $n - k$ reflections.


Proof. If $b, c\colon \mathbb{R}^n \to V$ are orthonormal bases, then there exists an $R \in L(V,V)$ with $c = R\circ b$. We check that $\langle Rv, Rw\rangle = \langle v, w\rangle$; it suffices to do this for $v, w$ elements of the basis $b$, and then it is clear by orthonormality of $c_i = Rb_i$.

For the last statement we argue by induction on $n - k$. When $n - k = 0$ we must have $R = \mathrm{id}_V$. For the induction step, suppose $R$ fixes a $k$-dimensional subspace $U$ and $Rv = w \neq v$. Then $R_{\frac{v-w}{|v-w|}}\circ R$ fixes the $(k+1)$-dimensional subspace spanned by $v$ and $U$. This is because $v - w$ is orthogonal to $U$ (why?), so by the induction hypothesis the proof is complete.

Conjugating by a reflection has the same effect as reflecting the mirror itself. More precisely:

$R_w R_m R_w = R_{R_w(m)}$. To check this it suffices to note that both sides send $R_w(m)$ to its negative and both sides fix its orthogonal complement.

Definition 21. (Orientation)
Define an equivalence relation on the set of all bases of $V$: bases $b, c$ are said to have the same orientation if $\det c^{-1}\circ b > 0$. An orientation on $V$ is a choice of an equivalence class of bases.

Lemma 20. (Volume $n$-vector)
For any two orthonormal bases $b, c$ we have $[b] = \pm[c] \in \Lambda^n V$, where the sign is $+1$ iff the orientations are the same.

Proof. Since $b, c$ are related by a finite number of reflections and each reflection reverses orientation, we just have to prove the lemma in the case $c_i = R_m b_i$ for some reflection given by a unit vector $m = \sum_i w_i b_i$. Then
$$[c] = \bigwedge_i R_m b_i = \bigwedge_i (b_i - 2w_i m) = (1 - 2|w|^2)\,[b] = -[b]$$
since $|w| = 1$.

For example, when $V = \mathbb{R}^3$ with the standard inner product and $e$ is the standard basis, then $[e] = e_1\wedge e_2\wedge e_3$. As a consequence we see that if we fix the orientation, the volume $n$-vector is uniquely determined. Its dual $[b]^*$ will be used a lot in the next section to define lengths, areas and volumes by integration.

Going one step further, we define a complementary $(n-k)$-vector to each $k$-vector in $V$, called the Hodge star. It is convenient to refine our notation a little and define, for any sequence $I$ of elements of $\{1,\dots,n\}$ and any basis $b$, the element $b_I = \bigwedge_{i\in I} b_i$, where the wedge is taken in the order of the sequence $I$. Previously we defined for a set $S$ the element $b_S$ to be the wedge product taken in increasing order.

Definition 22. (Hodge star)
For any orthonormal basis $b$ define $\star_b\colon \Lambda^k V \to \Lambda^{n-k}V$ by $b_I\wedge\star_b b_I = [b]$ for all $k$-element sequences $I \subset \{1,\dots,n\}$, and extend linearly.

More concretely this means that $\star_b b_I = \sigma(IJ)\, b_J$, where $J$ is an $(n-k)$-element sequence in $\{1,\dots,n\}$ such that the concatenation $(IJ)$ is a permutation of $\{1,\dots,n\}$ with sign $\sigma(IJ)$. For example, when $V = \mathbb{R}^3$ with the standard inner product and $e$ is the standard basis, then $\star_e e_1 = e_2\wedge e_3$ and $\star_e e_2 = e_3\wedge e_1$; also $\star_e 1 = [e] = e_1\wedge e_2\wedge e_3$ and $\star_e e_{3,2} = -e_1$.

Just like the volume $n$-vector, the Hodge star actually does not depend on the chosen basis; only the orientation matters.

Lemma 21. (Hodge star)
For any two orthonormal bases $b, c$ we have $\star_b = \pm\star_c$, where the sign is $+1$ iff the orientations are the same.

Proof. As in the proof of the lemma for the volume element, we may assume that $c_i = R_m b_i$ for some reflection given by a unit vector $m = \sum_i w_i b_i$. We need to prove that $\star_c c_I = -\star_b c_I$ for any sequence $I \subset \{1,\dots,n\}$. Choose a sequence $J$ complementary to $I$, and notice that $(ij)J$ is complementary to $(ij)I$, where $(ij)$ is the transposition of $i$ and $j$. The right-hand side is
$$-\star_b c_I = -\star_b b_I\Big(1 - 2\sum_{i\in I}w_i^2\Big) + 2\star_b\!\!\sum_{i\in I,\, j\in J}\! w_i w_j\, b_{(ij)I} = -\sigma(IJ)\,b_J\Big(1 - 2\sum_{i\in I}w_i^2\Big) + 2\!\!\sum_{i\in I,\,j\in J}\!\sigma\big((ij)I\,(ij)J\big)\,w_iw_j\, b_{(ij)J}.$$
The left-hand side is
$$\star_c c_I = \sigma(IJ)\,c_J = \sigma(IJ)\,b_J\Big(1 - 2\sum_{j\in J}w_j^2\Big) - 2\,\sigma(IJ)\!\!\sum_{i\in I,\,j\in J}\! w_iw_j\, b_{(ij)J}.$$
The first terms are equal because $1 = |w|^2 = \sum_{i\in I}w_i^2 + \sum_{j\in J}w_j^2$, and the second terms are equal because $\sigma(IJ) = -\sigma\big((ij)I\,(ij)J\big)$.

When the orientation is clear we will simply write $\star$ instead of $\star_b$ for the Hodge star.

Finally we transfer our constructions to covectors in $V$, since those will be integrated in the next section. The inner product on $V$ gives in particular a basis-independent isomorphism $\Phi\colon V \to V^*$ sending $v$ to $\langle v, \cdot\rangle$. In terms of an orthonormal basis $b$ we have $\Phi(b_i) = b^i$. Identifying $V^{**}$ with $V$ as usual, we get an isomorphism $\Lambda^k\Phi\colon \Lambda^k V \to \Lambda^k V^*$. For orthonormal bases it just sends $b_I$ to $b^I$. This way we obtain a Hodge star $\star\colon \Lambda^k V^* \to \Lambda^{n-k}V^*$ by $\star b^I = \sigma(IJ)\, b^J$, much like the above.
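For an orthonormal basis the Hodge star is pure combinatorics: find the complementary sequence and the sign of the concatenated permutation. A small sketch (hypothetical helper names) that reproduces the $\mathbb{R}^3$ examples above:

```python
def perm_sign(p):
    """Sign of a permutation of 1..n, given as a tuple, via inversion count."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def hodge(I, n):
    """Hodge star of b_I for an orthonormal basis of an n-dim space: (sign, J)."""
    J = tuple(j for j in range(1, n + 1) if j not in I)
    return perm_sign(I + J), J

assert hodge((1,), 3) == (1, (2, 3))    # *e1 = e2 ^ e3
assert hodge((2,), 3) == (-1, (1, 3))   # *e2 = -e1 ^ e3 = e3 ^ e1
assert hodge((3, 2), 3) == (-1, (1,))   # *e_{3,2} = -e1
```

The second assertion shows why the text prefers to write $\star_e e_2 = e_3\wedge e_1$: the sign is absorbed by reordering the complementary sequence.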

Exercises

Exercise 1 (Rotations in $\mathbb{R}^2$)
A rotation is a composition of two reflections. The angle of the rotation $R_m\circ R_w$ is twice the angle between $m$ and $w$.

a. Prove that if two rotations have the same angle then they must be equal elements of $O(\mathbb{R}^2)$. Hint: if you don't want to just write things out on a basis, maybe you can solve b first and conclude that the composition of any three reflections is a reflection. From this it follows that conjugating a rotation by a reflection inverts the rotation. Conjugating by a suitable reflection sends one of your rotations to the inverse of the other one.

b. Show that an element of $O(\mathbb{R}^2)$ is either a reflection, a rotation or the identity.
c. Given a rotation $R \in O(\mathbb{R}^2)$, is it true that $R^*\!\star e^1 = \star R^* e^1$?

4.2 Riemannian geometry

In this section we study some consequences of the existence of an inner product at each point. This is usually known as Riemannian geometry, and the inner product is called the metric (not in the sense of metric spaces!).

Definition 23. (Riemannian metric)
A metric on an open set $P \subset \mathbb{R}^n$ is a choice of inner product $g(p)\colon \mathbb{R}^n\times\mathbb{R}^n \to \mathbb{R}$ for each point $p \in P$. The pair of an open set and a metric is denoted $(P, g)$. We say the metric is $C^k$ if for all $C^k$ vector fields $X, Y$ on $P$ the function $p \mapsto g(p)(X(p), Y(p))$ is $C^k$ on $P$.

Of course the most important metric is the standard Euclidean metric $g_{\mathrm{Eucl}}(p) = \langle\cdot,\cdot\rangle$, the usual dot product, for all $p \in P$. Another interesting metric is the hyperbolic metric on the half-plane $H = \mathbb{R}\times\mathbb{R}_{>0}$ given by $g_{\mathrm{hyp}}(x,y) = \frac{1}{y^2}\,g_{\mathrm{Eucl}}$. The pair $(H, g_{\mathrm{hyp}})$ is known as the hyperbolic plane.

In an open subset with a Riemannian metric $(P, g)$ we may define lengths and angles as usual. Given two 1-cubes $\beta, \gamma$ intersecting at $\beta(q) = \gamma(q) = p \in (P,g)$, the cosine of the angle $\alpha$ between $\beta$ and $\gamma$ is
$$\cos\alpha = \frac{g(p)(\beta'(q),\gamma'(q))}{\sqrt{g(p)(\beta'(q),\beta'(q))\;g(p)(\gamma'(q),\gamma'(q))}}.$$
The length of a 1-cube is the integral $L(\gamma) = \int_{[0,1]}\sqrt{g(\cdot)(\gamma'(\cdot),\gamma'(\cdot))}$. For example, in the hyperbolic plane the length of the vertical line between $(0,a)$ and $(0,b)$, given by the 1-cube $\gamma(t) = (t(b-a)+a)e_2$ with $a < b$, is $L(\gamma) = \int_{[0,1]}\frac{b-a}{t(b-a)+a}\,dt = \log b - \log a$. Also notice that angles in the hyperbolic plane are just the Euclidean angles.
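The hyperbolic length computation is a one-liner to verify symbolically; writing $b = a + c$ with $c > 0$ keeps the square root unambiguous (an illustration only):

```python
import sympy as sp

t, a, c = sp.symbols('t a c', positive=True)   # c = b - a > 0, so b = a + c
gamma = sp.Matrix([0, a + c*t])                # vertical segment from (0, a) to (0, a + c)
v = gamma.diff(t)
speed = sp.sqrt(v.dot(v)) / gamma[1]           # hyperbolic speed: Euclidean norm divided by y
L = sp.integrate(speed, (t, 0, 1))
assert sp.simplify(L - (sp.log(a + c) - sp.log(a))) == 0
```

With $b = a + c$ this is exactly $\log b - \log a$, as computed above.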

Figure 4.1: An Escher-like tessellation of the hyperbolic plane $H$ (left) and the same picture in the Poincaré disk model $D$ (right). All devils are the same size! And $H$ and $D$ are isometric.

Metrics can also be transferred by pull-back, much like 2-covector fields.

Definition 24. (Pull-back metric)
Given a $C^1$ map $\varphi\colon P \to Q$ such that $\varphi'(p)$ is injective for all $p \in P$, and a metric $g$ on $Q$, we may define a metric $\varphi^*g$ on $P$ by $(\varphi^*g)(p)(v,w) = g(\varphi(p))(\varphi'(p)v, \varphi'(p)w)$. The derivative of $\varphi$ is required to be injective in order to ensure non-degeneracy of the pulled-back inner product (Exercise).

Pulling back the Euclidean metric provides a way to precisely measure the shape of objects we see around us. This is the literal meaning of the word geometry. Given a function $\varphi\colon P \to \mathbb{R}^3$ with $P \subset \mathbb{R}^2$ and $\varphi'(p)$ injective for all $p$, we get a metric $\varphi^*g_{\mathrm{Eucl}}$ on $P$ describing the shape $\varphi(P)$ in terms of $P$. Calculating on $P$ is usually easier than working in the ambient space $\mathbb{R}^3$.

For example we may take the sphere in geographic coordinates: $P = (-\frac{\pi}{2},\frac{\pi}{2})\times(-\pi,\pi)$ and $\varphi\colon (\mu,\lambda)\mapsto(\cos\lambda\sin\mu,\ \sin\lambda\sin\mu,\ \cos\mu) \in \mathbb{R}^3$. Here $\mu$ is the latitude coordinate and $\lambda$ the longitude; for example Leiden is the point $\varphi(52.1601\cdot\frac{\pi}{180},\ 4.4970\cdot\frac{\pi}{180})$, written in degrees. Explicitly, the inner product $\varphi^*g_{\mathrm{Eucl}}$ is found by calculating it at every point on the basis vectors $e_1, e_2$. Since the matrix of $\varphi'(p)$ is
$$\begin{pmatrix}\cos\lambda\cos\mu & -\sin\lambda\sin\mu\\ \sin\lambda\cos\mu & \cos\lambda\sin\mu\\ -\sin\mu & 0\end{pmatrix}$$
we get $\varphi^*g_{\mathrm{Eucl}}(p)(e_1,e_1) = 1$, $\varphi^*g_{\mathrm{Eucl}}(p)(e_1,e_2) = 0$ and $\varphi^*g_{\mathrm{Eucl}}(p)(e_2,e_2) = \sin^2\mu$.
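The matrix of the pull-back metric in the basis $e_1, e_2$ is just $(\varphi')^T\varphi'$, so the sphere example can be confirmed in a few lines of sympy (an illustration only):

```python
import sympy as sp

mu, lam = sp.symbols('mu lam')
phi = sp.Matrix([sp.cos(lam)*sp.sin(mu), sp.sin(lam)*sp.sin(mu), sp.cos(mu)])
J = phi.jacobian([mu, lam])

G = sp.simplify(J.T * J)   # matrix of phi* g_Eucl in the basis e1, e2
assert sp.simplify(G[0, 0] - 1) == 0
assert sp.simplify(G[0, 1]) == 0
assert sp.simplify(G[1, 1] - sp.sin(mu)**2) == 0
```

This reproduces the entries $1$, $0$ and $\sin^2\mu$ computed by hand above.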

Now that we have some interesting examples of metrics we will start transferring the linear algebra from the previous section to the Riemannian setting. Since we will be working with orthonormal bases we have to apply the Gram-Schmidt algorithm to get an orthonormal basis for g(p) at every point p ∈ P.

Definition 25. (Volume n-covector field) An orientation on P ⊂ R^n is a choice of an orientation of R^n at each point p ∈ P. Given a metric g on P and an orientation, define ν_g ∈ Ω^n(P) by setting ν_g(p) = [b(p)]^* where b(p) is an orthonormal basis with respect to the inner product g(p) that agrees with the chosen orientation.

For example in the sphere example the volume 2-covector field in μ, λ coordinates must be (sin μ) e^1 ∧ e^2 because a positive orthonormal basis is e_1, (sin μ)^{-1} e_2. Volume is now defined by integrating the volume covector field.

Definition 26. (Volume) The volume of a k-cube γ in (P, g) is the integral ∫_{[0,1]^k} ν_{γ*g}.

The case k = 1 coincides with the length of a curve defined earlier. We should be careful however that this is really signed volume because when we orient our cube in the opposite direction the volume will be negative. We also note that the volume of a cube does not depend on the chosen parameterization because of lemma 18.

Definition 27. (Isometry) An isometry is a diffeomorphism that preserves the metric in the following sense: φ: (P, g) → (Q, g̃) is an isometry if g = φ*g̃.

Since all geometric properties derive from the metric, isometries can be thought of as those transformations that preserve shape.

For example consider the (Poincaré) disk model D = {u ∈ C : |u| < 1} with metric g(u) = (2/(1 − |u|^2))^2 g_Eucl(u). Here as usual we identify R^2 and C. Now we claim that φ: D → H given by φ(u) = (u + i)/(iu + 1) is an isometry. Here we also identified the upper half plane H with a subset of C.

To verify this we compute using complex numbers as much as possible to avoid lengthy expressions. First we use the fact that for a complex differentiable function f: C → C we may identify the derivative (a linear transformation in L(R^2, R^2)) with multiplication by f'(z). Also, identifying a vector v with a + ib, the Euclidean inner product becomes ⟨v, w⟩ = Re(v w̄). The metric on H is then written as g_hyp(z)(v, w) = Re(v w̄)/(Im z)^2. The pull-back of this metric along φ becomes (φ*g_hyp)(u)(v, w) = Re(φ'(u)v \overline{φ'(u)w})/(Im φ(u))^2 = (|φ'(u)|^2/(Im φ(u))^2) Re(v w̄) = (|φ'(u)|^2/(Im φ(u))^2) g_Eucl(u)(v, w). Finally |φ'(u)|^2 = 4/|iu + 1|^4 and Im φ(u) = (1 − |u|^2)/|iu + 1|^2, finishing the computation.

A similar computation shows that isometries of the hyperbolic plane H are given by the linear fractional transformations z ↦ (az + b)/(cz + d) where a, b, c, d ∈ R and ad − bc = 1. In fact these are all orientation-preserving isometries but we will not show this here.
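The claimed identity |φ'(u)|^2/(Im φ(u))^2 = (2/(1 − |u|^2))^2 can be spot-checked at a point using Python's complex numbers (a minimal sketch; the names are ours):

```python
def phi(u):
    """The claimed isometry D -> H from the text: phi(u) = (u + i)/(iu + 1)."""
    return (u + 1j) / (1j*u + 1)

def dphi(u):
    """Complex derivative phi'(u) = 2/(iu + 1)^2, by the quotient rule."""
    return 2 / (1j*u + 1)**2

u = 0.3 - 0.4j  # a sample point in the disk, |u| < 1
disk_factor = (2 / (1 - abs(u)**2))**2           # conformal factor of the disk metric at u
pulled_back = abs(dphi(u))**2 / phi(u).imag**2   # conformal factor of phi* g_hyp at u
```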

Definition 28. (Hodge star) The Hodge star on an oriented (P, g) is the map ⋆: Ω^k(P) → Ω^{n−k}(P) defined pointwise by (⋆ω)(p) = ⋆(ω(p)).

For example the pull-back of the Euclidean metric in spherical coordinates (μ, λ) had orthonormal basis b_1(μ, λ) = e_1, b_2(μ, λ) = (sin μ)^{-1} e_2 at the point (μ, λ), with dual basis b^1(μ, λ) = e^1 and b^2 = (sin μ) e^2. Therefore the Hodge star is ⋆b^1 = b^2.

The Hodge star allows us to formulate some of the fundamental partial differential equations in a truly geometric way. Many of the famous fundamental partial differential equations involve the Laplacian ∆. For example the Laplace equation ∆u = 0, the Poisson equation ∆u = f, the heat equation, the wave equation, the diffusion equation, the Schrödinger equation, the Klein-Gordon equation, the Helmholtz equation, the Maxwell equations and the Navier-Stokes equations.

Definition 29. (Hodge Laplacian) A choice of metric g and orientation on P defines for all k the Laplacian ∆_k: Ω^k(P) → Ω^k(P) by ∆_k ω = (⋆d⋆d + d⋆d⋆)ω.

As a basic example take R^n with the Euclidean metric. Viewing a C^2 function f: R^n → R as an element of Ω^0(P) we see that ⋆f = f[e] ∈ Ω^n(R^n) and so d⋆f = 0. Also df = Σ_i (∂f/∂x_i) e^i, so ⋆df = Σ_i (−1)^{i−1} (∂f/∂x_i) e^1 ∧ ⋯ ∧ e^{i−1} ∧ e^{i+1} ∧ ⋯ ∧ e^n and d⋆df = Σ_i (∂^2 f/(∂x_i)^2) [e], and so ∆_0 f = Σ_i ∂^2 f/(∂x_i)^2. So our Laplacian is a very complicated (but geometrically natural!) generalization of the sum of the second partial derivatives.

If we allow ourselves to use the Minkowski metric on P = R^4 (not positive definite) then the equation ∆_2 ω = 0 is precisely Maxwell's equations (in the absence of charges). Also ∆_0 u = 0 is the wave equation.

The famous Hodge conjecture (one of the million dollar problems) basically says that all harmonic k-covector fields with rational coefficients on a non-singular complex manifold correspond to complex submanifolds of dimension k. So what are manifolds?
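A quick finite-difference check that ∆_0 really is the sum of second partials in the Euclidean plane (a minimal sketch; the helper name is ours):

```python
def laplacian(f, x, y, h=1e-4):
    """Approximate f_xx + f_yy at (x, y) with second-order central differences."""
    fxx = (f(x + h, y) - 2*f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2*f(x, y) + f(x, y - h)) / h**2
    return fxx + fyy

# f(x, y) = x^2 + y^2 has Laplacian 4 everywhere; g(x, y) = x^2 - y^2 is harmonic.
val_f = laplacian(lambda x, y: x*x + y*y, 0.3, 0.7)
val_g = laplacian(lambda x, y: x*x - y*y, 0.3, 0.7)
```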

Exercises

Exercise 1 (Area of a hyperbolic disk) We work with the Poincaré disk model D. As usual we identify C with R^2.

a. Identifying u = x + iy ∈ D with (x, y) ∈ R^2, show that the hyperbolic metric on D is given by g(x, y) = (4/(1 − x^2 − y^2)^2) g_Eucl. Give an orthonormal basis for R^2 with respect to the inner product g(x, y).

b. Show that the volume 2-covector field ν_g ∈ Ω^2(D) may be written as ν_g(x, y) = (4/(1 − x^2 − y^2)^2) e^1 ∧ e^2.

c. Using the 2-cube γ: [0, 1]^2 → D given by γ(s, t) = (rs cos 2πt, rs sin 2πt), compute the area of the disk with Euclidean radius 0 < r < 1 by A(r) = ∫_γ ν_g.

d. Show that the length of the 1-cube β: [0, 1] → D given by β(t) = (0, rt) is R = log((1 + r)/(1 − r)) and that r = tanh(R/2).

e. Assuming the length of β was the hyperbolic radius of the disk with Euclidean radius r, show that the area of the disk with hyperbolic radius R is 4π sinh^2(R/2).

Chapter 5

What if there is no good choice of coordinates?

So far we’ve studied spaces and functions on them in a fixed coordinate system. This is comparable to doing linear algebra in a fixed basis (bad idea). For example the level sets X = F −1(q) we studied in chapter 2 can be expressed as the graph of a function as long as we stay sufficiently close to some p ∈ X. In the case of the sphere F (x, y, z) = x2 + y2 + z2 and q = 1. To describe the G+3 p sphere near the north pole we can use D2 3 (x, y) 7−−−→ (x, y, 1 − x2 + y2) ∈ S2, where D2 ⊂ R2 G is the open unit disk. Near the south pole we need the other square root: D2 3 (x, y) 7−−−→−3 (x, y, −p1 − x2 + y2) ∈ S2. Points on the equator are more conveniently described by one of the G G four maps D2 3 (x, y) 7−−−→±2 (x, ±p1 − x2 + y2, y) ∈ S2. D2 3 (x, y) 7−−−→±1 (±p1 − x2 + y2, x, y) ∈ S2. Transitioning between these descriptions is not hard because to get back we just project −1 −1 p 2 2 for example G1 (a, b, c) = (b, c) and so G1 ◦ G2(x, y) = ( 1 − x + y , y). This leads us to 2 an atlas of the sphere consisting of six disks D±i = D and transition functions between them. −1 −1 i τij = Gj ◦ Gi : Di ∩ Gi (Dj) → Dj ∩ G (Di) all given by some all given by the square roots as above.

5.1 Atlasses and manifolds

Before we begin we should note that our definitions of manifold and atlas are slightly different from the ones usually found in the literature. They are equivalent but somewhat more concrete, thus hopefully allowing a quicker entry into the essence of the subject. In the exercises the reader is invited to prove the equivalence.

Definition 30. (Atlas) An m-dimensional C^k atlas is a family of non-empty open sets (charts) {M^α ⊂ R^m}_{α∈A} together with a family of C^k-diffeomorphisms (transitions) between open subsets {M^α ⊃ M^α_β → M^β_α ⊂ M^β}_{α,β∈A}, written τ^α_β: M^α_β → M^β_α and indexed by some set A. We furthermore require that ∀α, β, γ ∈ A:

1. τ^α_β = (τ^β_α)^{-1} and τ^α_α = id|_{M^α}.


2. τ^γ_α ∘ τ^β_γ ∘ τ^α_β = id, whenever the composition is defined.

Atlasses are often meant to describe some particular space. The space is the equivalence class of points on the charts, where two points are equivalent if they are related by a transition map. To make sense of such spaces topologically, recall the definition of the quotient topology.

Definition 31. (Quotient topology) Given a topological space X and an equivalence relation ∼, the quotient X/∼ is the set of equivalence classes. The map q: X → X/∼ sending each point to its equivalence class is called the quotient map. X/∼ becomes a topological space by declaring a subset U ⊂ X/∼ to be open if q^{-1}(U) is.

For example take X = [0, 1] ⊂ R with the standard (subspace) topology and set 0 ∼ 1. Then X/∼ is homeomorphic to a circle. A homeomorphism is for example induced by g: [0, 1] → R^2 defined by g(t) = (cos 2πt, sin 2πt). An important property of the quotient topology is that a continuous function on X/∼ is precisely a continuous function on X that always takes the same value on equivalent points of X.

Atlasses provide many more examples of quotient spaces. The same point can be described differently on each chart of the atlas but the transition functions tell us it's still the same point. If the quotient is a nice topological space we call it a manifold.

Definition 32. (Manifold) Given an m-dimensional C^k atlas {M^α}_{α∈A} we define its quotient space as M = ⊔_{α∈A} M^α / ∼ where x ∼ y if ∃α, β ∈ A: y = τ^α_β(x). We say the atlas defines a manifold M when M has the following topological properties:

1. Hausdorff,

2. Every point has a neighborhood homeomorphic to R^m,

3. Has a countable basis for its topology.

In what follows we will often speak about manifolds as a topological space M. However the reader should remember that our manifolds are always intended to come with a specific atlas whose quotient is the topological space M. So by an m-dimensional C^k-manifold we really mean a nice m-dimensional C^k-atlas. Much of what we will say below actually works for quotients of atlasses that are not necessarily manifolds. The Hausdorff property is chosen to ensure uniqueness of limits, and having a countable basis allows partitions of unity, see section 5.4.

Setting A = {0} and M = M^0 any open subset of R^m yields a simple example of an m-dimensional C^∞ manifold.

A more interesting example is the circle T described by A = {0, 1}, charts T^0 = T^1 = (−1, 1) ⊂ R, overlaps T^0_1 = T^1_0 = (−1, 0) ∪ (0, 1), and transitions τ^0_1(t) = τ^1_0(t) = t − 1 if t > 0 and t + 1 if t < 0.

T is compact because it is the image of the compact sets [−1/2, 1/2] ⊂ T^0 and [−1/2, 1/2] ⊂ T^1 under the quotient map q. To identify T with the usual unit circle S^1 it is convenient to use the fact that a continuous bijection from a compact space to a Hausdorff space must be a homeomorphism. Consider the maps f^0: T^0 → S^1 and f^1: T^1 → S^1 given by f^0(t) = (cos πt, sin πt) and f^1(t) = (cos π(t + 1), sin π(t + 1)). They show how the two charts of T describe overlapping parts of the circle. Notice that f^1(t) = f^0(τ^1_0(t)) and so the pair f^0, f^1 defines a continuous map f: T → S^1. It is a bijection and the inverse is also continuous because S^1 is Hausdorff and T is compact. Let us emphasize that the abstract circle T is much simpler than the one sitting in the plane!
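The compatibility f^1 = f^0 ∘ τ^1_0 on the overlap can be verified numerically (a minimal sketch of the two charts and their gluing):

```python
import math

def tau(t):
    """Transition of the circle atlas T; defined on (-1,0) u (0,1)."""
    return t - 1 if t > 0 else t + 1

def f0(t):
    """Chart 0 into the unit circle."""
    return (math.cos(math.pi*t), math.sin(math.pi*t))

def f1(t):
    """Chart 1 into the unit circle."""
    return (math.cos(math.pi*(t + 1)), math.sin(math.pi*(t + 1)))

# On the overlap the two descriptions agree, so (f0, f1) glues to a map T -> S^1.
t = 0.6
lhs, rhs = f1(t), f0(tau(t))
```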

Lemma 22. (Coordinate patches) For every α ∈ A the quotient map q of an atlas restricts to a homeomorphism between M^α and q(M^α) ⊂ M.

Proof. No two points of M^α are equivalent under the equivalence relation of definition 32, so q restricts to a continuous bijection between M^α and q(M^α). To show it is a homeomorphism it suffices to prove that for any open set U ⊂ M^α the image q(U) is also open. By the quotient topology we should check that q^{-1}q(U) ∩ M^β is open in M^β for every β ∈ A. We may assume this intersection is non-empty, so suppose x ∈ q^{-1}q(U) ∩ M^β. Then q(x) ∈ q(M^α) so there exists a y ∈ M^α with y = τ^β_α(x). Since the transition maps τ are all homeomorphisms, we may take an open neighborhood y ∈ V ⊂ U ∩ M^α_β. Its image W = τ^α_β(V) satisfies q(W) ⊂ q(U) and x ∈ W ⊂ M^β, so q(U) is open.

Using the above lemma one often pretends M^α is the same as q(M^α) ⊂ M to simplify notation. We will try to refrain from this here.

Definition 33. (Local description of maps between manifolds) Suppose M has atlas {M^α}_{α∈A} and N has atlas {N^β}_{β∈B}. The local description of a continuous map f: M → N is the set

{f^{α,β}: M^{α,β} → N^β | α ∈ A, β ∈ B}

where M^{α,β} = M^α ∩ (f ∘ q)^{-1}(q(N^β)) and f^{α,β} = (q|_{N^β})^{-1} ∘ f ∘ q.

In practice one often does not start with a function between manifolds but rather tries to construct one by patching together local descriptions. The next lemma gives the conditions we need to check to do this.

Lemma 23. (How to define functions locally) Suppose M, N are manifolds as in the previous definition. A family of continuous maps

{f^{α,β}: M^{α,β} → N^β | α ∈ A, β ∈ B}

is the local description of a continuous map f: M → N if and only if M^{α,β} ⊂ M^α and ⋃_{β∈B} M^{α,β} = M^α and ∀α, γ ∈ A, ∀β, δ ∈ B we have f^{γ,δ} ∘ τ^α_γ = τ^β_δ ∘ f^{α,β}.

Proof. The only if part is clear by definition (Exercise!) so we focus on the if part here. First we check that for each fixed α ∈ A the f^{α,β} define a map f^α: M^α → N that satisfies f^α = q ∘ f^{α,β} on each M^{α,β}. Define f^α(x) = q ∘ f^{α,β}(x) for any β such that x ∈ M^{α,β}. This does not depend on the choice of β since q ∘ f^{α,δ}(x) = q ∘ f^{α,β}(x) for any δ ∈ B with the same property as β. Collecting all the f^α gives a map f̃: ⊔_{α∈A} M^α → N. Since the value of f̃ must be the same for any two equivalent points, the universal property of the quotient topology gives a continuous map f: M → N such that f̃ = f ∘ q.

For example recall the atlas for the circle T with two charts T^0, T^1 described in this section. A map f: T → T wrapping the circle around itself twice has the following local description: T^{0,0} =

(−1, 1) − {±1/2}, and f^{0,0}(t) = 2t for |t| < 1/2, f^{0,0}(t) = 2t − 2 for t > 1/2 and f^{0,0}(t) = 2t + 2 for t < −1/2. And T^{0,1} = (−1, 1) − {0} and f^{0,1}(t) = 2t − 1 for t > 0, f^{0,1}(t) = 2t + 1 for t < 0. Likewise T^{1,1} = T^{0,0} and f^{1,1} = f^{0,0}, and T^{1,0} = T^{0,1} and f^{1,0} = f^{0,1}.

Functions between manifolds may be differentiated in terms of their local descriptions too. By the chain rule the following definition makes sense.

Definition 34. (C^k maps between manifolds) A map f: M → N between C^k-manifolds is said to be C^k if all maps in the local description are C^k. A C^k map is said to be a C^k diffeomorphism if it is a bijection and the inverse is C^k as well.

For example, the map f: T → T doubling the angle as described above is C^∞. Suppose two atlasses have homeomorphic quotients; the resulting manifolds may or may not be C^k diffeomorphic. If they are, then we can take the union of the atlasses and the diffeomorphisms between the charts to get a bigger atlas with the same quotient. In such a way there is a maximal atlas describing the topological space. Of course such an atlas is not practical to work with but it clarifies the role of atlasses. Surprising things happen in dimension four. According to a theorem of Donaldson, there are infinitely many non-diffeomorphic C^∞ manifolds homeomorphic to R^n if and only if n = 4. The famous smooth Poincaré conjecture poses the same question for the sphere S^n. It is currently unknown whether there exists a fake S^4, i.e. a manifold homeomorphic but not diffeomorphic to the four-sphere. Milnor proved there do exist fake 7-spheres.

Exercises

Exercise 1 (Möbius strip) The Möbius strip is the quotient space M = [−1, 1] × (−1, 1)/∼ where (1, y) ∼ (−1, −y) for all y ∈ (−1, 1). The Klein bottle is K = [−1, 1]^2/∼ where (1, y) ∼ (−1, −y) and (x, 1) ∼ (x, −1).

a. Give two maps φ1, φ2 : M → K such that φi is a homeomorphism onto φi(M) and K − (φ1(M) ∪ φ2(M)) is homeomorphic to a circle and φ1(M) ∩ φ2(M) = ∅. A picture is sufficient to get points, providing the homeomorphisms will give you bonus points.

b. Provide a 2-dimensional C3 atlas whose quotient space is homeomorphic to M. Here you do need to give the homeomorphism.

Exercise 2 (Non-Manifolds) Find an atlas whose quotient space is not a Hausdorff space. Also find an atlas whose quotient space does not have a countable basis for its topology.

Exercise 3 (Product manifold) Given two manifolds M, N show how M × N naturally is a manifold too by taking the atlas defining M and the atlas defining N and taking the Cartesian product of each pair of charts M^α × N^β. Show that taking the product of transition functions gives a new atlas whose quotient is M × N.

Exercise 4 (Vector space as manifold) Imagine an n-dimensional vector space V and consider the set of all bases A = {F ∈ L(R^n, V) | ker F = 0}. Show that we can define an atlas indexed by A with M^α = M^α_β = R^n for all α, β ∈ A and τ^α_β = β^{-1} ∘ α. How does this turn V into a topological space?

Exercise 5 (Traditional definition of manifold) In the literature manifolds are usually defined in a slightly different way, as follows. To make the temporary distinction we call them T-manifolds. An m-dimensional C^k T-manifold is a second countable Hausdorff space M together with, for each p ∈ M, a homeomorphism φ: U → V between open subsets p ∈ U ⊂ M and V ⊂ R^m, such that for any two such homeomorphisms φ, ψ the map φ ∘ ψ^{-1} is a C^k-diffeomorphism. In this exercise you will check that the two definitions are really equivalent. To build an atlas from a T-manifold we take the charts to be the open sets V ⊂ R^m that are the targets of the homeomorphisms φ. The transition maps are just the φ ∘ ψ^{-1}. In the other direction, given an atlas, lemma 22 gives us homeomorphisms M^α → q(M^α) ⊂ M. Their inverses are the φ we are looking for in the definition of T-manifold.

5.2 Examples of manifolds

The implicit function theorem tells us that solving equations leads to manifolds.

Theorem 6. (Manifolds from implicit function theorem) Imagine F: P → R^n is a C^1 function with P ⊂ R^{k+n} open, and consider the level set L = F^{-1}({z}) for some z ∈ R^n. If for all p ∈ L we have dim ker F'(p) = k then L is a k-dimensional C^1 manifold.

Proof. The fact that the kernel is k-dimensional for each p implies that there exist isomorphisms α_p: R^k × R^n → R^{n+k}, and so the implicit function theorem provides C^1 maps f_p: X_p → Y_p. The maps φ_p: X_p ∋ x ↦ p + α_p^{-1}(x, f_p(x)) ∈ L are homeomorphisms onto their image. The inverse is simply φ_p^{-1}(y) = π_k α_p^{-1}(y − p), where π_k is projection onto the first k coordinates.

Consider the atlas whose charts X_p are indexed by the points p ∈ L. The transition maps are τ^p_t = φ_t^{-1} ∘ φ_p. These are diffeomorphisms between φ_p^{-1}(φ_t(X_t)) ⊂ X_p and φ_t^{-1}(φ_p(X_p)) ⊂ X_t. If we view L ⊂ R^{k+n} with the subspace topology, it must be homeomorphic to the quotient X of the atlas {X_p | p ∈ L}. The homeomorphism is defined by the maps φ_p. Taking balls with rational radius and rational centers we see that L is second countable, and of course it is also Hausdorff as a subset of R^{k+n}.

The atlas provided in the proof is rather large and mostly of theoretical value. In practice we could have worked with fewer charts; in case L is compact even with finitely many, because an open covering suffices. Also if F is C^k one can prove that the level sets are C^k as well.

A simple example of an atlas with two charts is the atlas {M^+, M^−} where M^± = R^n and M^+_− = M^−_+ = R^n − {0} and τ^+_− = τ^−_+ is given by τ^+_−(y) = y/|y|^2. Thinking of the stereographic projections, the quotient is actually homeomorphic to the n-sphere. Recall stereographic projection from the north/south pole is the map σ_±: S^n ∋ u ↦ (u − e_{n+1}u_{n+1})/(1 ± u_{n+1}) ∈ R^n, with inverse

σ_±^{-1}(x) = (2x ∓ e_{n+1}(|x|^2 − 1))/(|x|^2 + 1). However, as always, the atlas is often simpler than the actual realization of the sphere in R^{n+1}.

The Cayley map provides an explicit way to parametrize part of the orthogonal group O(n) = O(R^n), where we take R^n with the standard inner product. In terms of matrices, O(n) consists of the matrices M of size n satisfying MM^T = I. Define o(n) = {X ∈ Mat(n, n) | X^T = −X}. The Cayley map is the map C: o(n) ∋ X ↦ (1 + X)(1 − X)^{-1} ∈ O(n). It is well defined since the eigenvalues of elements of o(n) are purely imaginary and the determinant is the product of the eigenvalues, so det(1 − X) ≠ 0. To check the image is an orthogonal matrix we compute C(X)C(X)^T = (1 + X)(1 − X)^{-1}(1 − X^T)^{-1}(1 + X^T) = (1 + X)(1 − X)^{-1}(1 + X)^{-1}(1 − X) = (1 + X)(1 − X^2)^{-1}(1 − X) = (1 + X)(1 + X)^{-1}(1 − X)^{-1}(1 − X) = 1. Looking at the determinant, the image is not the whole of O(n), just a neighborhood of 1. Actually the neighborhood can be described precisely as the set B of matrices with determinant 1 that have no eigenvalue −1. We claim C is a diffeomorphism from o(n) onto B with inverse C^{-1}(M) = (M + 1)^{-1}(M − 1).

A useful class of manifolds is the class where all transition functions are linear.
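For n = 2 the Cayley map can be written out explicitly and its orthogonality checked by hand (a minimal sketch in pure Python, not from the text):

```python
def cayley_2x2(a):
    """Cayley map for X = [[0, a], [-a, 0]] in o(2): C(X) = (1 + X)(1 - X)^{-1}."""
    d = 1 + a*a                       # det(1 - X), never zero since a is real
    inv = [[1/d, a/d], [-a/d, 1/d]]   # (1 - X)^{-1} by the 2x2 inverse formula
    plus = [[1, a], [-a, 1]]          # 1 + X
    return [[sum(plus[i][k]*inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

C = cayley_2x2(0.7)
# C should be orthogonal with determinant 1 (a rotation matrix)
CCt = [[sum(C[i][k]*C[j][k] for k in range(2)) for j in range(2)] for i in range(2)]
det = C[0][0]*C[1][1] - C[0][1]*C[1][0]
```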

Exercises

Exercise 1 (Hyperbolic punctured torus) Consider the interior of the region B = {z ∈ C | −1 < Re(z) < 1, |z − 1/2| > 1/2, |z + 1/2| > 1/2}.

5.3 Analytic continuation

When describing level sets as manifolds we first have the level set and then produce from it an atlas using the implicit function theorem. By construction the quotient space of this atlas is homeomorphic to the level set. It is not always like this. Often the atlas comes first and the manifold arises from it. The process of analytic continuation from complex analysis is a clear example of this. Starting with a single holomorphic function defined on some subset of the complex plane one is naturally led to defining a 2-manifold. The manifold is the natural domain for the analytic continuation of our function. Riemann pioneered this field and so these manifolds are often called Riemann surfaces.

Definition 35. (Analytic atlas) A holomorphic pair is a pair (U, f) where U ⊂ C is open and f: U → C is holomorphic. A set of holomorphic pairs {(U^α, f^α) | α ∈ A} defines an atlas by taking charts {U^α | α ∈ A} and setting U^α_β = U^β_α = {z ∈ U^α ∩ U^β | f^α(z) = f^β(z)} and τ^α_β = id|_{U^α_β}.

It is not hard to see that the quotient space of any atlas coming from a set of holomorphic pairs is both Hausdorff and second countable. The analytic atlas is in some sense the right domain for the holomorphic function described by the pairs in the atlas. We may speak about the function because of the identity theorem from complex analysis. Recall it says that if two holomorphic functions defined on a connected open set U agree on a subset S ⊂ U containing a limit point, they must agree on all of U.

The square root in the complex plane provides a good example of a non-trivial analytic atlas. To set up the square root functions define for any θ the ray A_θ = {re^{iθ} ∈ C | r ≥ 0}. For any z ∈ C − A_θ the argument Arg_θ(z) ∈ [0, 2π) of z is the positive angle between A_θ and the vector pointing to z.

Define four holomorphic pairs (U^{θ,±}, f^{θ,±}) where U^{θ,±} = C − A_θ and f^{θ,±}(z) = ±|z|^{1/2} e^{(i/2)(Arg_θ(z) + θ)} for θ ∈ {0, π}. The resulting analytic atlas is as follows. We have U^{θ,±}_{θ,∓} = ∅ since there the functions take opposite values. For Im z > 0 we have f^{π,±}(z) = f^{0,±}(z) and for Im z < 0 we have f^{π,±}(z) = −f^{0,±}(z). Therefore U^{0,±}_{π,±} = {z | Im(z) > 0} = U^{π,±}_{0,±} and U^{0,±}_{π,∓} = {z | Im(z) < 0} = U^{π,∓}_{0,±}, with transition functions equal to the identity.

The quotient of our atlas for the square root is two linked copies of C − {0}, one copy for +√z and one for −√z, thus clarifying the multivaluedness of the square root.

The logarithm can be treated similarly, but this requires infinitely many holomorphic pairs as the ambiguity is integer multiples of 2πi. For any n ∈ Z and θ ∈ {0, π} define (U^{θ,n}, f^{θ,n}) where U^{θ,n} = C − A_θ and f^{θ,n}(z) = log|z| + i(θ + Arg_θ(z) + 2πn). As with the square roots we can compute the analytic atlas as follows. U^{θ,n}_{θ,m} = ∅ for n ≠ m. When Im z > 0 we have f^{π,n}(z) = f^{0,n}(z) and when Im z < 0 we have f^{π,n}(z) = f^{0,n+1}(z). Therefore

U^{0,m}_{π,n} = U^{π,n}_{0,m} = {z | Im z > 0} if m = n, {z | Im z < 0} if m = n + 1, and ∅ otherwise.

The quotient space of the atlas for the logarithm is an infinite spiral of copies of C − {0} glued together, one for each branch of the logarithm.

In general one may always try to analytically continue any holomorphic pair as far as possible using Taylor expansion. To this end, recall that any holomorphic function can be expanded as a power series around any point in its domain. The radius of convergence of the power series f = Σ_{n=0}^∞ a_n(z − w)^n around the point w is the number R_f = lim inf_n |a_n|^{-1/n}. If we denote the open disk with center w and radius R by D_{w,R}, then the series converges and is holomorphic on D_{w,R}.

In other words (D_{w,R_f}, f) is a holomorphic pair. Analytic continuation goes as follows. Start with a single holomorphic pair (U^0, f^0) where U^0 is a disk. Choose a point w ∈ U^0 close to its boundary and set f^1 to be the power series of f^0 around w. We get a new holomorphic pair (U^1, f^1) where U^1 = D_{w,R_{f^0}}. By the identity theorem f^1 = f^0 on U^0 ∩ U^1, but with some luck U^1 is not contained in U^0. This is what we mean by extending our f^0 analytically to a bigger domain. There is no reason to stop after one step. Pick another point v ∈ U^1 and expand f^1 as a power series around v. As above this yields a third holomorphic pair (U^2, f^2) extending our function even further, and so on.

One way to structure the process of analytic continuation is to choose a path (1-cube) γ: [0, 1] → C starting at a point inside the initial pair (U^0, f^0). Analytic continuation along the path is possible in finitely many steps because of compactness of the image of γ. Studying how the analytic continuation depends on the choice of path naturally leads to the idea of the fundamental group. Coming back to analytic atlasses, they really come with a function into the complex numbers as follows (identifying C and R^2 as usual):
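Here is the one-step idea in code for f(z) = 1/(1 − z): the series around 0 converges only on the unit disk, but re-expanding around w = −1/2 (where the coefficients are f^(n)(w)/n! = 1/(1 − w)^{n+1}) gives a disk of radius 3/2 reaching points outside the original one. A minimal sketch; the helper name is ours:

```python
def series_at(w, z, terms=200):
    """Evaluate the Taylor series of 1/(1 - z), recentered at w, at the point z."""
    return sum((z - w)**n / (1 - w)**(n + 1) for n in range(terms))

w = -0.5          # new center: the recentered series converges on |z - w| < 1.5
z = -1.2          # outside the original unit disk, inside the new disk
continued = series_at(w, z)
exact = 1 / (1 - z)   # the analytic continuation agrees with the closed form
```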

Lemma 24. On the manifold U defined as the quotient of an analytic atlas coming from holomorphic pairs (U^α, f^α), there is a C^∞ function f: U → C whose local descriptions are the f^α.

Proof. We may just apply lemma 23, noting the transitions are the identity precisely on the subsets where the functions from the pairs agree.

In this sense U really is a natural domain of the functions defined by the holomorphic pairs. In the theory of Riemann surfaces and complex manifolds one would define what it means for a function on manifolds to be holomorphic and our f would be an example of such.

5.4 Bump functions and partitions of unity

Recall that the support of a function is the closure of the set where it takes non-zero values.

Lemma 25. (Bump functions) Given a point p in an open subset U of a C^k manifold M there exists a C^k function b: M → [0, 1] with b(p) ≠ 0 and support contained in U.

Proof. The function η: R → R defined by η(t) = e^{-1/t} for t > 0 and η(t) = 0 for t ≤ 0 is a C^∞ function. So for every r > 0 and z ∈ R^n the function f: R^n → R defined by f(x) = η(r^2 − |x − z|^2) has support inside the ball B_r(z) with radius r centered at z.

Choose a chart M^α of the manifold such that q(z) = p and B_r(z) ⊂ M^α. The local description of the function b we are looking for is b^{α,0} = f and b^{γ,0} = f ∘ τ^γ_α on M^γ_α, and zero otherwise. This makes b well-defined by lemma 23 and C^k by the chain rule.

The essential tool for cutting and pasting differentiable objects is called a partition of unity.

Definition 36. (Partition of unity) A C^k partition of unity with respect to an open covering of a manifold M is a family of C^k functions h_i: M → [0, 1], i ∈ I, such that

1. For all i ∈ I the function hi has support inside some set of the covering.

2. For every p ∈ M there is a neighborhood V ⊂ M such that h_i|_V = 0 for all but finitely many i ∈ I.

3. Σ_{i∈I} h_i = 1.

One reason for demanding that our manifolds have a countable basis is that this allows partitions of unity to exist.

Theorem 7. (Existence of partitions of unity) For any C^k manifold and any open covering a partition of unity exists.

Proof. First suppose M is compact. Using lemma 25 choose for every p ∈ M a bump function b_p with b_p(p) > 0, each supported in some set of the open covering. By compactness of M there must be points p_1, …, p_n ∈ M such that M = ⋃_{j=1}^n b_{p_j}^{-1}((0, ∞)). Our partition of unity will be the functions h_j = b_{p_j}/(b_{p_1} + ⋯ + b_{p_n}) for j = 1 … n.

If M is not compact we use a countable basis {U_i}_{i∈N} for the topology of M to construct a sequence K_1 ⊂ K_2 ⊂ ⋯ ⊂ M of compact subsets with K_i ⊂ int(K_{i+1}) and M = ⋃_i K_i. First we throw away any basis elements that do not have compact closure. The result is still a basis for the topology (Exercise!). Now choose n_1 < n_2 < … such that K_i = ⋃_{j=1}^{n_i} Ū_j ⊂ ⋃_{j=1}^{n_{i+1}} U_j. For each i finitely many bump functions c^i_1, …, c^i_{s_i}: M → [0, 1] can be chosen so that c^i_1 + ⋯ + c^i_{s_i} > 0 for all x in the compact set K_i − int(K_{i−1}), and all supports fit in a member of the

open covering and inside the open set int(K_{i+1}) − K_{i−2}. The function c = Σ_{i,j} c^i_j is C^k and positive everywhere on M, so setting h^i_j = c^i_j/c gives the desired partition of unity.
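On the real line the whole construction — η, a bump, and a two-bump partition of unity — fits in a few lines (a minimal sketch, not from the text):

```python
import math

def eta(t):
    """The C-infinity function from lemma 25: exp(-1/t) for t > 0, else 0."""
    return math.exp(-1/t) if t > 0 else 0.0

def bump(x, z, r):
    """Bump supported in the ball of radius r around z (here on the real line)."""
    return eta(r*r - (x - z)**2)

# Two bumps whose positivity sets cover (-1, 2); normalizing them as in the
# compact case of theorem 7 gives a partition of unity subordinate to the
# covering {(-1, 1), (0, 2)}.
b1 = lambda x: bump(x, 0.0, 1.0)
b2 = lambda x: bump(x, 1.0, 1.0)
h1 = lambda x: b1(x) / (b1(x) + b2(x))
h2 = lambda x: b2(x) / (b1(x) + b2(x))
```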

Exercises Exercise 1 (C∞ function) Prove that the function η from the text is indeed C∞.

5.5 Vector bundles

Vector bundles are a way to transfer linear algebra constructions in calculus to manifolds. Basically we want a vector space above every point in our manifold and all such vector spaces should fit together smoothly as one walks in the manifold. Locally the structure should be the Cartesian product of manifold and vector space.

Definition 37. (Vector bundle) A C^k-manifold E is a vector bundle over a C^k-manifold M with fiber a vector space V if the atlas {E^α}_{α∈A} of E is described in terms of the atlas {M^α}_{α∈A} for M as follows. The charts are E^α = M^α × V and E^α_β = M^α_β × V, and the transition functions are λ^α_β = (τ^α_β, L^α_β) for some C^k-functions L^α_β on E^α_β such that L^α_β(p, ·) is an invertible element of L(V, V) for every fixed p.

The most basic vector bundle is a Cartesian product M × V for any manifold M and any finite dimensional vector space V. When all the L^α_β are the identity this is called the trivial vector bundle, for example the cylinder S^1 × R. A less trivial example closely related to the cylinder is the Möbius strip. To construct the strip as a vector bundle with fiber R and base the circle T we take charts E^i = T^i × R and overlaps E^0_1 = E^1_0 = T^0_1 × R = ((−1, 0) × R) ∪ ((0, 1) × R). The new transition map λ^0_1: E^0_1 → E^1_0 is defined by λ^0_1(t, v) = (τ^0_1(t), L^0_1(t, v)) with L^0_1(t, v) = v if t > 0 and L^0_1(t, v) = −v if t < 0.

Lemma 26. (Vector bundles are manifolds) Any vector bundle E over an m-dimensional C^k manifold M with fiber V is an (m + dim V)-dimensional C^k manifold and there is a C^k surjective map π: E → M with local description π^{α,β}(p, w) = τ^α_β(p).

Proof. It follows from the description of the atlasses that a countable basis of M extends to a countable basis of E. If there is no single chart containing both points a, b ∈ E then we managed to separate them already. If there is such a chart we use that this chart is a Hausdorff space M^α × V to separate them there. The fact that every point of M has a neighborhood homeomorphic to R^m means every point of E has a neighborhood homeomorphic to R^m × V ≅ R^{m+dim V}. The reader should check that the local maps are compatible by checking π^{γ,δ} ∘ λ^α_γ = τ^β_δ ∘ π^{α,β}. This follows from part 2 of the definition of atlas and the definition of the bundle transition function λ.

One of the most important vector bundles is the tangent bundle of a manifold. It is basically the manifold together with a linear approximation (tangent space) at every point. The cylinder above is for example the tangent bundle of the circle.

Definition 38. (Tangent bundle and derivative) Define the vector bundle TM to be the vector bundle over the m-manifold M with fiber R^m with the

new transition functions determined by L^α_β(p, v) = (τ^α_β)'(p) v, where τ^α_β is a transition function of M. Also, the derivative of a function f: M → N is a map f': TM → TN defined locally by (f')^{α,β}(p, v) = (f^{α,β}(p), (f^{α,β})'(p) v).

Notice we defined the vector bundle TM to have transition functions λ^α_β(p, v) = (τ^α_β(p), (τ^α_β)'(p) v). In other words the tangent bundle TM of an m-dimensional manifold M is a 2m-dimensional manifold that we can think of as follows. At every point we carry a linear approximation, or tangent space, with us. Usually a tangent plane to a surface M ⊂ R^3 such as a sphere is thought of as a plane touching M at a point. TM does the same in case M is not necessarily a subset of any R^N.

By the chain rule, the local descriptions of the derivative of a function f in the above definition give rise to a global definition. Indeed differentiating the equality τ^β_δ ∘ f^{α,β} = f^{γ,δ} ∘ τ^α_γ at p shows the extension to the tangent bundles is consistent.

Definition 39. (Section of a bundle) A C^ℓ section of a vector bundle E over M is a C^ℓ map s: M → E such that π ∘ s = id_M. Define for any chart M^α of M the function s^α: M^α → V by s^{α,α}(p) = (p, s^α(p)).

When the vector bundle is trivial, a section is always the graph of some function f: M → V, so s(p) = (p, f(p)). In other words, there is a bijection between the set of such sections and the set of functions M → V. The functions s^α always determine the whole section, since for every chart M^α there is the corresponding chart E^α = M^α × V of the vector bundle.

A section of the tangent bundle is usually known as a vector field on M. More generally we would like to introduce vector bundles whose sections will be k-(co)vector fields on M. The bundles used to transfer our k-covectors to manifolds are the following.

Definition 40. (Wedge bundles and their sections)
Define the vector bundle Λ^k TM^* to be the vector bundle over the m-manifold M with fiber Λ^k(R^m)^* and transitions given by L^α_β(p, v) = ((τ^β_α)'(τ^α_β(p)))^* v. Similarly the vector bundle Λ^k TM has fiber Λ^k R^m and its transition functions are given by L^α_β(p, v) = ((τ^α_β)'(p))_* v, in terms of the transitions τ^α_β of M. A k-vector field on manifold M is a section of Λ^k TM. A k-covector field on manifold M is a section of Λ^k TM^*. The set of all C^2 k-covector fields on M is called Ω^k(M).

We will often jump back and forth between a section of a trivial vector bundle s: M → M × V and a function f: M → V. The relationship is given by s(p) = (p, f(p)). This is convenient because right now we have two slightly different definitions of what a vector field X on an open set P ⊂ R^m is. According to chapter 3 it is a map X: P → R^m. But viewing P as an m-dimensional manifold (with an atlas with a single chart), a vector field is a section s: P → Λ^1 TP. With the above identification the two agree. The same goes for k-(co)vector fields on P.

The bundles Λ^1 TM and Λ^1 TM^* are usually known as the tangent and the cotangent bundle and are written TM and TM^*. In the above definition we used the following notion of push-forward along a linear map A ∈ L(V, W).
The push-forward A_* B of B ∈ L(R^k, V) is given by A_* B = A ∘ B. This extends to a map A_*: Λ^k V → Λ^k W. For a k-vector field Y on P and an injective C^1 map f: P → Q between open subsets of R^n and R^m we also define the push-forward, as f_* Y(q) = (f'(f^{-1}(q)))_* Y(f^{-1}(q)).

Notice that a k-covector field ω is described on chart α using the function ω^α, which is a k-covector field on M^α. Covector fields on manifolds may be pulled back much like their local counterparts from chapter 3.
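The push-forward A_* on 2-vectors can be made concrete in coordinates. Below is a small sketch (our own illustration; the helper names wedge2 and matvec are ad hoc) using the identification of Λ^2 R^2 with R via the coefficient of e_1 ∧ e_2. It confirms that A_*(v ∧ w) = Av ∧ Aw, which on Λ^2 R^2 is just multiplication by det(A).

```python
def wedge2(v, w):
    # coefficient of e1 ^ e2 in v ^ w, for v, w in R^2
    return v[0] * w[1] - v[1] * w[0]

def matvec(A, v):
    # apply the linear map A (2x2 matrix as nested tuples) to v
    return (A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1])

# push-forward of the 2-vector v ^ w along A is (Av) ^ (Aw)
A = ((2.0, 1.0), (0.5, 3.0))
v, w = (1.0, -1.0), (2.0, 5.0)
detA = A[0][0] * A[1][1] - A[0][1] * A[1][0]
lhs = wedge2(matvec(A, v), matvec(A, w))   # A_* (v ^ w)
rhs = detA * wedge2(v, w)                  # det(A) times v ^ w
assert abs(lhs - rhs) < 1e-12
print("A_* acts on 2-vectors as multiplication by det(A)")
```

This is the coordinate shadow of the fact that Λ^m of a linear map on an m-dimensional space is its determinant.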

Definition 41. (Pull-back)
The pull-back by a C^1 manifold map f: M → N is a map f^*: Ω(N) → Ω(M) defined in terms of the local descriptions f^{α,β}: M^{α,β} → N^β. On M^{α,β} we set (f^* ω)^α = (f^{α,β})^* ω^β.

There is a similar notion of push-forward, but it is not always defined when f is not injective, just like in the case of open subsets of R^n.
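In a single chart, Definition 41 reduces to the chapter 3 pull-back: evaluate ω at f(p) and precompose with f'(p). A minimal numeric sketch (our own; the specific ω and f are sample choices): pulling back ω = x dy along f(s) = (cos s, sin s) should give cos^2(s) ds.

```python
import math

def omega(x, y):
    # the 1-covector field x dy on R^2, stored as coefficients (a, b) of a dx + b dy
    return (0.0, x)

def f(s):
    return (math.cos(s), math.sin(s))

def df(s):
    # derivative of f as a vector (the image of d/ds)
    return (-math.sin(s), math.cos(s))

def pullback(s):
    # (f^* omega)(s) applied to the basis vector d/ds:
    # evaluate omega at f(s), then feed it the pushed-forward vector f'(s)
    a, b = omega(*f(s))
    dx, dy = df(s)
    return a * dx + b * dy

s = 0.4
assert abs(pullback(s) - math.cos(s) ** 2) < 1e-12
print("pull-back of x dy along the circle map is cos(s)^2 ds")
```

On a manifold the same computation is done chart by chart, which is exactly what the local descriptions (f^* ω)^α prescribe.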

Exercises

Exercise 1 (Cone)
Consider the 2-dimensional atlas {C^1, C^2} with C^i = {(x, y) ∈ R^2 | y > 0}, overlaps C^1_2 = C^2_1 = {(x, y) ∈ C^1 | x ≠ 0}, and transition map τ^1_2 = τ^2_1 = τ: C^1_2 → C^2_1 given by

τ(x, y) = (−y, x) if x > 0,  and  τ(x, y) = (y, −x) if x < 0.

Viewing the interval X = (−1, 1) as a manifold with a single chart X = X^0, consider the function f: X → C given on the atlasses by functions f^{0,i}: X^{0,i} → C^i with X^{0,1} = X, X^{0,2} = X − {0} and

f^{0,1}(t) = (−t, 1),
f^{0,2}(t) = (−1, −t) if t < 0,  and  f^{0,2}(t) = (1, t) if t > 0.

a. Check that f does indeed define a differentiable function from X to C.

b. Explain why the tangent bundle TX is just X × R.

c. The tangent bundle TC has an atlas with charts TC^i = C^i × R^2 and transition map TC^1 → TC^2 defined by (p, v) ↦ (τ(p), τ'(p)v). Compute the derivatives (f^{0,1})', (f^{0,2})' and show how they combine to give a well-defined map f': TX → TC.

d. Describe the bundle projection π: TC → C explicitly in terms of the charts of the atlas.

e. Give an example of a C^∞ vector field on C viewed as a section of the tangent bundle TC.

f. There is a bijective map h: C → R^2 − {(0, 0)}, where R^2 − {(0, 0)} is viewed as a manifold with a single chart with index 0. On one chart h is given by h^{1,0}(x, y) = (x^2 − y^2, 2xy). What should h^{2,0} be in order for h to be differentiable?

Bonus. Explain how C really describes a cone {(x, y, z) ∈ R^3 | z = √(x^2 + y^2), z > 0} with the pull-back of the Euclidean metric. C can be visualized by bringing together adjacent corners of a piece of paper, and τ just rotates the cone, moving the cut by 180 degrees around the z-axis. On C^1 this map is (x, y) ↦ (h^{1,0}(x, y), x^2 + y^2).

Exercise 2 (Möbius strip is not orientable)
Recall the Möbius strip M is a 2-dimensional C^∞ manifold defined by the atlas {M^i | i ∈ {0, 1}} where M^i = (−1, 1) × R and overlaps M^0_1 = M^1_0 = ((−1, 0) × R) ∪ ((0, 1) × R). The transition map τ^0_1: M^0_1 → M^1_0 is defined by

τ^0_1(x, y) = (x − 1, y) if x > 0,  and  τ^0_1(x, y) = (x + 1, −y) if x < 0.

In this exercise we aim to show that every continuous section s of Λ^2 TM^* must take the value 0 at some point of M. Suppose for a contradiction that there is a section s that never takes the value 0.

a. On M^i define the 2-covector field s^i by s^{i,i}(p) = (p, s^i(p)). More precisely, by s taking the value 0 at some point we really mean that s^i(p) = 0 for some p. Show that the functions f_i: M^i → R defined by f_i(p) = I(e_1 ∧ e_2, s^i(p)) are continuous and have constant sign.

b. Explain why s^1(p) = A^* s^0(q) for all p ∈ M^1_0, with q = τ^1_0(p) and A = (τ^1_0)'(p).

c. Compute (τ^1_0)'(p) explicitly for all p ∈ M^1_0.

d. Show that for any vector space V, 2-covector C ∈ Λ^2(V)^*, 2-vector B ∈ Λ^2(V) and linear map A ∈ L(V, V) we have I(B, A^* C) = I(A_* B, C).

e. Prove that for a = (1/2, 0) and b = (−1/2, 0) we have f_1(a) = −f_0(b) and also f_1(b) = f_0(a).

f. Derive a contradiction to finish the argument.

Bonus. Explain what this result has to do with non-orientability of M.

5.6 The fundamental theorem of calculus on manifolds

As before, a k-cube in a manifold M is a C^1 map γ: [0, 1]^k → M. By definition of differentiability, γ is the restriction of a differentiable function defined on a bigger product of open intervals containing [0, 1]. To avoid heavy notation we will also write this interval as [0, 1]. As such it makes sense to talk about the tangent bundle and the derivative γ': T[0, 1]^k → TM. The tangent bundle on the cube is the trivial bundle [0, 1]^k × R^k. Given ω ∈ Ω^k(M), the pull-back γ^* ω along γ is a k-covector field on [0, 1]^k. We may thus define the integral as follows. The reader is warned that here we jump back and forth between viewing a k-covector field on [0, 1]^k as a section of the wedge bundle and as a map [0, 1]^k → Λ^k(R^k)^*.

Definition 42. (Integration on manifolds)

∫_γ ω = ∫_{[0,1]^k} γ^* ω

The exterior derivative may be lifted to manifolds in terms of local descriptions.

Definition 43. (Exterior derivative on manifolds)
For any k-covector field ω on manifold M define

(dω)^{α,β} = dω^{α,β}

The reader should check this is well-defined and satisfies the compatibility conditions of lemma 23. With all definitions in place, lifting the fundamental theorems of calculus to manifolds is rather simple. The first part is Stokes' theorem, the second is the Poincaré lemma.

Theorem 8. (Fundamental theorem of calculus on manifolds)
Suppose γ is a k-chain and ω is a C^1 (k − 1)-covector field on manifold M. Then

∫_γ dω = ∫_{∂γ} ω

Moreover, for any C^1 k-covector field η and any point p ∈ M there exist an open neighborhood of p and a (k − 1)-covector field ω on it such that dω = η, if and only if dη = 0.

Proof. The first part is proven using the fact that the exterior derivative commutes with pull-back, together with Stokes' theorem on open subsets of R^m:

∫_γ dω = ∫_{[0,1]^k} γ^* dω = ∫_{[0,1]^k} dγ^* ω = ∫_{∂[0,1]^k} γ^* ω = ∫_{∂γ} ω

For the second part, we recall that p has a neighborhood homeomorphic to R^m; call the homeomorphism q. We intersect this neighborhood with a chart M^α also containing p and choose an open ball B around p in this intersection. On q^{-1}(B) we may apply the usual Poincaré lemma to get a ξ such that dξ = q^* η if and only if dη = 0. Define ω = (q^{-1})^* ξ on B.
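The first part of Theorem 8 is easy to sanity-check numerically in the simplest nontrivial case: k = 2, γ the identity 2-cube on [0, 1]^2 and ω = x dy, so dω = dx ∧ dy and both sides should equal 1. The Riemann-sum sketch below is our own illustration, not part of the notes.

```python
# Numerical check of Stokes' theorem for the 2-cube gamma = id on [0,1]^2
# and omega = x dy, so d(omega) = dx ^ dy.
n = 400
h = 1.0 / n

# left side: integral of dx ^ dy over [0,1]^2 (a plain Riemann sum of 1)
lhs = sum(1.0 * h * h for i in range(n) for j in range(n))

# right side: integral of x dy over the boundary with the orientations of
# the boundary of the 2-cube; on the horizontal edges y is constant so
# x dy contributes 0, on x = 0 the coefficient vanishes, so only x = 1 counts
def edge_integral(x_of_t, y_of_t, n=n):
    total = 0.0
    for i in range(n):
        t, t2 = i * h, (i + 1) * h
        tm = 0.5 * (t + t2)
        total += x_of_t(tm) * (y_of_t(t2) - y_of_t(t))
    return total

rhs = (edge_integral(lambda t: 1.0, lambda t: t)      # edge x = 1, y: 0 -> 1
       - edge_integral(lambda t: 0.0, lambda t: t))   # edge x = 0, with sign
assert abs(lhs - 1.0) < 1e-9 and abs(rhs - 1.0) < 1e-9
print("both sides of Stokes' theorem equal 1")
```

In a chart of a general manifold the same check applies to γ^* ω, which is exactly how the proof reduces the theorem to open subsets of R^m.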

Of course a specific formula for the ω from the Poincaré lemma part of the theorem is available too: just pull back the formula on R^m. Using partitions of unity, the locally defined (k − 1)-covector fields may be patched together unless topological obstructions arise. Studying these obstructions is part of de Rham cohomology.
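That explicit formula can be tried numerically for 1-covector fields, where the standard homotopy formula on a star-shaped domain reads ω(p) = ∫_0^1 η(tp) · p dt. The sketch below (our own illustration; η = 2xy dx + x^2 dy is a sample closed field) evaluates the integral by quadrature and checks dω = η by finite differences.

```python
# Poincare-lemma primitive of a closed 1-covector field eta = P dx + Q dy
# on R^2 via the homotopy formula omega(p) = int_0^1 eta(t p) . p dt.
def eta(x, y):
    # P = 2xy, Q = x^2, so dP/dy = dQ/dx = 2x and d(eta) = 0
    return (2 * x * y, x * x)

def omega(x, y, n=2000):
    # midpoint rule for int_0^1 (P(tx, ty) x + Q(tx, ty) y) dt
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        P, Q = eta(t * x, t * y)
        total += (P * x + Q * y) / n
    return total

# finite-difference check that d(omega) = eta at a sample point
x, y, h = 1.2, -0.7, 1e-5
dw_dx = (omega(x + h, y) - omega(x - h, y)) / (2 * h)
dw_dy = (omega(x, y + h) - omega(x, y - h)) / (2 * h)
P, Q = eta(x, y)
assert abs(dw_dx - P) < 1e-3 and abs(dw_dy - Q) < 1e-3
print("d(omega) recovers eta, as the Poincare lemma promises")
```

For this η the primitive is the polynomial x^2 y, which the quadrature reproduces to high accuracy; the same formula, pulled back along a chart, produces the local ω on a manifold.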

5.7 Geometry on manifolds

Of course the idea of Riemannian metrics used to set up geometry in the previous chapter generalizes to manifolds as well. The local description of a metric g on manifold M is a family of metrics g^α on M^α that are compatible in the following sense: (τ^α_β)^* g^β = g^α. Of course we could reformulate this in terms of sections of yet another type of vector bundle, but we leave this to the imagination of the reader.

To lift the Hodge star and volume form to manifolds we first need to recall the notion of orientation. An orientation on a vector space is a choice of one of the two equivalence classes of bases, where two bases are equivalent if they are related by a linear map of positive determinant. An orientation on an open P ⊂ R^m is a choice of an orientation of R^m at each point p ∈ P. Since we already know how to extend m-vector fields to manifolds, we reformulate the notion of orientation in terms of an ω ∈ Ω^m(P): a nowhere-vanishing field ω defines an orientation on P. The concept of orientation on an m-manifold M can likewise be dealt with in terms of m-dimensional covector fields on M.

Another type of geometry is called symplectic geometry. A symplectic form is a 2-covector field ω such that dω = 0 and at every point p the 2-covector ω(p) is non-degenerate. On the cotangent bundle there is a canonical 1-covector field; its exterior derivative is an important example of a symplectic form.
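The closing remark can be made concrete in the smallest case M = R, so the cotangent bundle is T^*R ≅ R^2 with coordinates (q, p). The canonical 1-covector field is then λ = p dq and dλ = dp ∧ dq. The sketch below (our own illustration) verifies non-degeneracy by computing the matrix of dλ in the standard basis and checking its determinant is nonzero.

```python
# The canonical 1-covector field on T*R with coordinates (q, p) is
# lambda = p dq; its exterior derivative d(lambda) = dp ^ dq is checked
# to be non-degenerate, i.e. a symplectic form.
def dlambda(u, v):
    # (dp ^ dq)(u, v) with u = (u_q, u_p), v = (v_q, v_p)
    return u[1] * v[0] - u[0] * v[1]

e_q, e_p = (1.0, 0.0), (0.0, 1.0)
Omega = [[dlambda(a, b) for b in (e_q, e_p)] for a in (e_q, e_p)]
det = Omega[0][0] * Omega[1][1] - Omega[0][1] * Omega[1][0]
assert det != 0                                  # non-degenerate everywhere
assert dlambda(e_q, e_p) == -1.0 and dlambda(e_p, e_q) == 1.0  # antisymmetric
print("d(lambda) = dp ^ dq is a symplectic form on T*R")
```

Since dp ∧ dq has constant coefficients, non-degeneracy at one point gives it at every point, and dω = 0 holds trivially.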