
AN INTRODUCTION TO

EUGENE LERMAN

Contents

1. Introduction: why manifolds?
2. Smooth manifolds
2.1. Digression: smooth maps from open subsets of Rn to Rm
2.2. Definitions and examples of manifolds
2.3. Maps of manifolds
2.4. Partitions of unity
3. Tangent vectors and tangent spaces
3.1. Tangent vectors and tangent spaces
3.2. Digression: vector spaces and their duals
3.3. Differentials
3.4. The tangent bundle
3.5. The cotangent bundle
3.6. Vector fields
4. Submanifolds and the implicit function theorem
4.1. The inverse function theorem and a few of its consequences
4.2. Transversality
4.3. Embeddings, Immersions, and Rank
5. Vector fields and flows
5.1. Definitions, examples, correspondence between vector fields and flows
5.2. The geometry of the Lie bracket
5.3. Map-related vector fields
6. (Multi)linear algebra
6.1. Tensor products
6.2. The Grassmann (exterior) algebra and alternating maps
6.3. Pairings
7. Differential forms and integration
7.1. Motivation
7.2. Definition of differential forms
7.3. Integration
8. Vector bundles
8.1. Sections
8.2. Frames and local frames
8.3. Vector bundles via transition maps
9. Exterior differentiation, contractions and Lie derivatives of forms
9.1. Exterior differentiation
9.2. Contractions of forms and vector fields
9.3. Lie derivatives of differential forms
9.4. de Rham cohomology
10. Stokes’s theorem
11. Connections on vector bundles
11.1. Connections
11.2.
12. Riemannian geometry
12.1. Levi-Civita connection
Fiber metrics
12.2. Connections induced on submanifolds
12.3. The second fundamental form of an embedding
13. Geodesics as critical points of the energy functional

typeset August 19, 2011.

1. Introduction: why manifolds?

There are many different ways to formulate mathematically the notion of a ‘space’ that occurs in different branches of science and engineering. For instance, one can talk about the space of configurations of a physical system. This, of course, requires a decision as to the level of detail one is trying to model. For example, we can regard the configuration space of a system consisting of a sun and a planet as R3 × R3. We use three real numbers to describe the position of the center of mass of the sun and three real numbers to describe the position of the center of mass of the planet. In this model we assume that the sun and the planet are simply two points in space. We also allow collisions. If we exclude collisions (but still allow the sun and the planet to come arbitrarily close to each other), the configuration space is then

Q = {(x, y) ∈ R3 × R3 | x ≠ y}.

Here is another idealized example: the configuration space of a penny tumbling through the air. Fix a frame of reference. We will need a triple of real numbers to describe the position of the penny’s center of gravity and three orthonormal vectors to describe the orientation of the penny. Thus the configuration space in question is

Q = R3 × O(3),

where O(3) denotes the set of 3 × 3 orthogonal matrices (recall that an n × n matrix is orthogonal if (and only if) its columns form an orthonormal basis of Rn).1

Exercise 1.1. What is the configuration space of a penny rolling on a plane?

Manifolds constitute a particular way to formalize the notion of a configuration space. These are the spaces that “locally look like Rn.” The reason we will limit ourselves to manifolds is that they are particularly suitable for generalizing the ideas of calculus — differentiation and integration. We will see that the two examples of configuration spaces given above, Q = {(x, y) ∈ R3 × R3 | x ≠ y} and Q = R3 × O(3), are indeed manifolds.

Remark 1.1.
There are, of course, many other notions of a “space.” In linear algebra one studies vector spaces and maps between them. In algebraic geometry one studies spaces of solutions of polynomial equations, which give rise to the notion of an algebraic variety. In metric topology/geometry one studies metric spaces, spaces with a notion of a distance. In point set topology and in algebraic topology one talks about topological spaces. In analysis one may study the space of solutions of a partial differential equation. In geometry and topology one may be forced to study spaces that have singularities, such as orbifolds and stratified spaces. Before we can discuss orbifolds and more complicated spaces we should first come to terms with smooth manifolds.

2. Smooth manifolds

2.1. Digression: smooth maps from open subsets of Rn to Rm. We start out by recalling the definition of a differentiable map.

Definition 2.1. Let U ⊂ Rn be an open subset. A map f : U → Rm is differentiable at a point x ∈ U if there is a linear map L : Rn → Rm so that

\[ \lim_{h \to 0} \frac{1}{\|h\|}\bigl( f(x+h) - f(x) - Lh \bigr) = 0. \]

It is not hard to show that if such a map L exists, it is unique. The linear map L is variously called the derivative of f at x, the differential of f at x, and so on, and is denoted by dfx or by Dfx or by Df(x) or by a similar notation. Moreover, the matrix corresponding to L with respect to the standard bases of Rn and Rm is the so-called Jacobian matrix. That is, if f = (f1, . . . , fm) then

\[ Df_x = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x) & \cdots & \frac{\partial f_1}{\partial x_n}(x) \\ \vdots & & \vdots \\ \frac{\partial f_m}{\partial x_1}(x) & \cdots & \frac{\partial f_m}{\partial x_n}(x) \end{pmatrix}. \]
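The defining limit says that Dfx is the best linear approximation of f near x. As a quick numerical sanity check (the map f below is an arbitrary choice, not taken from the text), one can watch the error ratio ‖f(x+h) − f(x) − Dfx h‖/‖h‖ shrink as h → 0:

```python
import math

def f(x, y):
    # a smooth map R^2 -> R^2 (arbitrary choice for this illustration)
    return (x * x * y, math.sin(x) + y)

def jacobian(x, y):
    # the partial derivatives of f, computed by hand
    return ((2 * x * y, x * x),
            (math.cos(x), 1.0))

def first_order_error(x, y, h1, h2):
    # ||f(x+h) - f(x) - Df_x h|| / ||h||, which should tend to 0 with h
    J = jacobian(x, y)
    fx = f(x, y)
    fxh = f(x + h1, y + h2)
    lin = (J[0][0] * h1 + J[0][1] * h2, J[1][0] * h1 + J[1][1] * h2)
    err = math.hypot(fxh[0] - fx[0] - lin[0], fxh[1] - fx[1] - lin[1])
    return err / math.hypot(h1, h2)

# the ratio shrinks roughly linearly in ||h||, as differentiability predicts
assert first_order_error(1.0, 2.0, 1e-2, 1e-2) < 0.1
assert first_order_error(1.0, 2.0, 1e-4, 1e-4) < 1e-3
```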

1 Strictly speaking the configuration space is R3 × SO(3), where SO(3) denotes the set of orthogonal matrices with positive determinant. Why?

Definition 2.2. Let U ⊂ Rn be an open subset. A map f : U → Rm is smooth (or C∞) on the set U if all partial derivatives of f to all orders exist at all points of U.

Here is a more “sophisticated” version of the definition above. Suppose f : U → Rm is differentiable at all points of U. Then we have a map g : U → Rnm, g(x) := Dfx. We can require that g is differentiable as a map from U to Rnm. The derivative of g is a map from U to a bigger RN for an appropriate N. We can require that this map is differentiable, and so on. In other words, if all derivatives of f : U → Rm exist and are differentiable, we say that f is smooth.

2.2. Definitions and examples of manifolds. A smooth manifold is a generalization of a smooth surface in R3. A smooth surface S ⊂ R3 has local parameterizations: for every point p ∈ S there is an open set V ⊂ R3 with p ∈ V and a map x : U → S ∩ V (where U ⊂ R2 is an open set) such that

(1) x is C∞. That is, x(u1, u2) = (x1(u1, u2), x2(u1, u2), x3(u1, u2)) and each xi(u1, u2), 1 ≤ i ≤ 3, is an infinitely differentiable function of u = (u1, u2) ∈ U;
(2) x is 1-1 (injective) and onto.

The map x is a local parameterization of S.

Example 2.3. The two-sphere

S2 = {x ∈ R3 | ||x|| = 1}

is a smooth surface: if p = (p1, p2, p3) ∈ S2 and p3 > 0, take V = {x ∈ R3 | x3 > 0}, U = {(u1, u2) | ||u|| < 1} and a local parameterization x : U → S2 ∩ V to be x(u1, u2) = (u1, u2, √(1 − u1² − u2²)). It’s easy to check that this x is 1-1, onto and C∞. If p3 < 0, take the local parameterization x(u) = (u1, u2, −√(1 − u1² − u2²)). If p3 = 0 then either p1 or p2 is non-zero (or both) and there are formulas for local parameterizations similar to the ones above.

Note that if S is a smooth surface and xα : R2 ⊃ Uα → S and xβ : R2 ⊃ Uβ → S are two local parameterizations with Wαβ := xα(Uα) ∩ xβ(Uβ) ≠ ∅, then

xβ⁻¹ ◦ xα : R2 ⊃ xα⁻¹(Wαβ) → xβ⁻¹(Wαβ) ⊂ R2

is C∞.

This motivates:

Definition 2.4 [of a C∞ manifold, first approximation, not quite right]. A C∞ manifold of dimension m is a set M and a family of injective maps {xα : Uα → M}, where Uα ⊂ Rm are open sets, such that

(1) ∪ xα(Uα) = M;
(2) if for some pair of indices α and β the set Wαβ := xα(Uα) ∩ xβ(Uβ) ≠ ∅, then xα⁻¹(Wαβ) and xβ⁻¹(Wαβ) are open in Rm and

xβ⁻¹ ◦ xα : xα⁻¹(Wαβ) → xβ⁻¹(Wαβ)

is C∞.

One thing that is wrong with this definition is that there is no topology specified on M. The other is that instead of parameterizations one usually works with charts that go the other way. Namely:

Definition 2.5 (Chart). Let X be a topological space. An Rn (coordinate) chart on X is a homeomorphism φ : X ⊃ U → U′ ⊂ Rn.

Notation. We will often write φ : U → Rn or even (U, φ) for a coordinate chart φ : X ⊃ U → U′ ⊂ Rn. Note that since φ takes values in Rn, it is an n-tuple of functions: φ = (x1, . . . , xn) for some functions xi : U → R, the coordinate functions on U associated to the coordinate chart φ : U → Rn.

Notation. When dealing with charts it will be convenient to adopt the notation where the standard coordinate functions on Rn are denoted by ri, 1 ≤ i ≤ n. That is, ri assigns to a point a = (a1, . . . , an) ∈ Rn the number ai. If φ : U → Rn is a chart then

xi = ri ◦ φ : U → R

are the coordinate functions on U.

Definition 2.6 (Atlas). A C∞ atlas on a topological space X is a collection of charts {φα : Uα → Uα′} (with all Uα′ being open subsets of one fixed Rn) such that

(1) {Uα} is an open cover of X,2 and
(2) if Uα ∩ Uβ ≠ ∅, then φβ ◦ φα⁻¹ : φα(Uα ∩ Uβ) → φβ(Uα ∩ Uβ) is C∞ as a map from an open subset of Rn to Rn. That is, changes of coordinates are smooth.

Example 2.7. The identity map f : R → R, f(x) = x, is the standard chart on R. The set {(f, R)} consisting of one chart is an atlas on R. The map g : R → R, g(x) = x3, is also a chart on R; it defines a different atlas on R. Here is a third atlas on R. For each integer n ∈ Z, φn : (n, n + 2) → R, φn(x) = x, is a chart. The set {(φn, (n, n + 2))} is an atlas on R.

Definition 2.8. We say that two atlases are equivalent if their union is also an atlas.

The definition above amounts to: an atlas {xα : Uα → Uα′} is equivalent to an atlas {yβ : Vβ → Vβ′} if for any indices α, β with Uα ∩ Vβ ≠ ∅ the map xα ◦ yβ⁻¹ : yβ(Uα ∩ Vβ) → xα(Uα ∩ Vβ) is smooth. One can easily verify that this is indeed an equivalence relation.

Exercise 2.1. Convince yourself that the first and the third atlases in Example 2.7 are equivalent. Show that the first and the second atlases are not equivalent.

Definition 2.9 (Manifold). An n-dimensional (C∞) manifold is a topological space M together with an equivalence class of C∞ atlases.

Notation. We will denote the manifold and the underlying topological space by the same letter, with the equivalence class of atlases usually understood.

Example 2.10. Let M = Rn. We cover M by one open set and take the identity map as our chart. This is the standard manifold structure on Rn. Example 2.11. Let M = Cn. Again we cover Cn by one open set U = Cn, and take as our coordinate chart the map φ : Cn → R2n which is given by

φ(z1, . . . , zn) = (Re z1, Im z1, . . . , Re zn, Im zn).

Example 2.13. The set Mn(R) of n × n matrices with real coefficients is a manifold, since it is R^(n²). The subset GL(n, R) ⊂ Mn(R) of invertible matrices is an open subset: a matrix A is invertible if and only if its determinant is non-zero, and the determinant det : Mn(R) → R is a polynomial map, hence continuous. Hence the subset {A ∈ Mn(R) | det A ≠ 0} is open. So by the previous example, GL(n, R) is a manifold.

Example 2.14. The two-sphere S2 := {x ∈ R3 | ||x||² = 1} is a manifold. To see this, we give S2 the subspace topology that it inherits as a subset of R3. Next we define charts. To do this, let

Ui+ = {x = (x1, x2, x3) ∈ S2 : xi > 0} and Ui− = {x = (x1, x2, x3) ∈ S2 : xi < 0},

i = 1, 2, 3 (6 charts altogether), which gives us an open cover of S2. Define φ1±(x) = (x2, x3), φ2±(x) = (x1, x3), and φ3±(x) = (x1, x2).

We need to verify that changes of coordinates are smooth. Consider, for example,

φ2+ ◦ (φ1+)⁻¹(u1, u2) = (√(1 − u1² − u2²), u2),

which is smooth in its region of definition. The other compositions yield similar results. It follows that S2 is indeed a manifold.
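The chart computations above can also be checked numerically. The sketch below (an illustration, not part of the notes) implements φ1+, its inverse, and φ2+ and confirms the displayed transition map at a sample point:

```python
import math

# Chart on the hemisphere U1+ = {x1 > 0}: forget the first coordinate.
def phi1_plus(x):
    return (x[1], x[2])

def phi1_plus_inv(u1, u2):
    # recover the point of S^2 with x1 > 0 over (u1, u2)
    return (math.sqrt(1 - u1 * u1 - u2 * u2), u1, u2)

# Chart on U2+ = {x2 > 0}: forget the second coordinate.
def phi2_plus(x):
    return (x[0], x[2])

# The change of coordinates computed in Example 2.14.
def transition(u1, u2):
    return (math.sqrt(1 - u1 * u1 - u2 * u2), u2)

p = phi1_plus_inv(0.3, 0.4)      # a point with x1 > 0 and x2 > 0
assert abs(p[0] ** 2 + p[1] ** 2 + p[2] ** 2 - 1) < 1e-12   # p lies on S^2

# phi2+ ∘ (phi1+)^{-1} agrees with the closed-form transition map.
lhs = phi2_plus(phi1_plus_inv(0.3, 0.4))
rhs = transition(0.3, 0.4)
assert max(abs(a - b) for a, b in zip(lhs, rhs)) < 1e-12
```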

2 That is, each Uα ⊂ X is open and ∪α Uα = X.

Example 2.15. Now we consider a slightly more interesting example of a manifold, the real projective space RPn−1, which is, by definition, the space of lines through the origin in Rn. To give RPn−1 a topology, we think of it as the set of equivalence classes of nonzero vectors in Rn. That is,

RPn−1 = (Rn ∖ {0})/∼,

where two non-zero vectors v and v′ are equivalent if and only if there is a constant λ ≠ 0 such that v = λv′. Note that this is an equivalence relation. We then have a surjective map

π : Rn ∖ {0} → RPn−1, π(v) = [v],

where [v] denotes the equivalence class of v ([v] is the line through v). We put on RPn−1 the quotient topology: U ⊂ RPn−1 is open if and only if π⁻¹(U) is open in Rn ∖ {0}. I leave it to the reader to check that this topology is Hausdorff. Charts here are given as follows: for each 1 ≤ i ≤ n, let

Ui = {[x1, . . . , xn] ∈ RPn−1 : xi ≠ 0}

and define φi : Ui → Rn−1 by

[x1, . . . , xn] ↦ (x1/xi, . . . , xi−1/xi, xi+1/xi, . . . , xn/xi).

Note that the inverse φi⁻¹ is given by

φi⁻¹ : (u1, . . . , un−1) ↦ [u1, . . . , ui−1, 1, ui, . . . , un−1].

We must check that the change of coordinates maps are smooth. If j < i, then on the intersection Ui ∩ Uj

φj ◦ φi⁻¹(u1, . . . , un−1) = φj([u1, . . . , ui−1, 1, ui, . . . , un−1]) = (u1/uj, . . . , uj−1/uj, uj+1/uj, . . . , ui−1/uj, 1/uj, ui/uj, . . . , un−1/uj),

which is smooth where it is defined (uj ≠ 0). The other computations are similar (and are left to the reader).
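For concreteness, here is a small computational sketch (my own illustration) of the charts φi on RP² (n = 3): chart values are scale-invariant, φi inverts φi⁻¹, and the transition φ1 ◦ φ2⁻¹ divides by the first coordinate, as the formula above predicts:

```python
# A point of RP^2 is a nonzero vector (x1, x2, x3) up to scale.

def phi(i, x):
    # chart phi_i: divide by the i-th coordinate and drop it (i is 1-based)
    xi = x[i - 1]
    return tuple(c / xi for j, c in enumerate(x) if j != i - 1)

def phi_inv(i, u):
    # inverse chart: insert a 1 in the i-th slot
    return u[:i - 1] + (1.0,) + u[i - 1:]

# Chart values depend only on the line through v, not on v itself.
v = (2.0, 3.0, 5.0)
w = (4.0, 6.0, 10.0)             # w = 2v, the same point of RP^2
assert phi(1, v) == phi(1, w)

# phi_i inverts phi_i^{-1}.
u = (0.7, -0.2)
assert phi(2, phi_inv(2, u)) == u

# The transition phi_1 ∘ phi_2^{-1} is (u1, u2) -> (1/u1, u2/u1), u1 != 0.
u1, u2 = 0.5, 3.0
assert phi(1, phi_inv(2, (u1, u2))) == (1 / u1, u2 / u1)
```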

Exercise 2.2. Define the complex projective space CPn−1 to be the set of complex lines through the origin in Cn and prove that it is a manifold.

Exercise 2.3. If M and N are manifolds, show that M × N is also naturally a manifold.

Exercise 2.4. Let V be a finite-dimensional vector space over R. Then V is a manifold: a choice of basis v1, . . . , vn (n = dim V) of V defines a linear bijection σ : Rn → V, σ(r1, . . . , rn) = Σ ri vi. Define a topology on V by requiring that σ is a homeomorphism (that is, U ⊂ V is open ⇔ σ⁻¹(U) ⊂ Rn is open). Check that this is indeed a Hausdorff second countable topology. Define σ⁻¹ : V → Rn to be a chart and {σ⁻¹ : V → Rn} to be an atlas (one chart!). Prove that a different choice of basis of V defines the same topology and an equivalent atlas.

Exercise 2.5. Let M be a manifold. Show that for each point x ∈ M there is a coordinate chart φ : U → Rn with x ∈ U such that φ(x) = 0 and φ(U) is B1(0), the ball of radius 1 centered at 0.

Remark 2.16. In Definition 2.9 we have made no assumption on the topology of our manifolds. It is standard to assume that the manifolds are Hausdorff. Otherwise all sorts of pathologies turn up. Another set of standard assumptions guarantees the existence of partitions of unity (see subsection 2.4 below). For this the simplest assumption to make is that the manifold in question is second countable. However, this assumption is too stringent and paracompactness is much more reasonable. All of this will be discussed later on.

2.3. Maps of manifolds. In the Bourbakist view every area of mathematics has its collection of objects and its collection of maps between objects (or, more generally, morphisms). While it is enjoyable to make fun of Bourbaki and Bourbakists, there is some merit to this point of view. A map f : M → N between two manifolds is smooth if it is continuous and is smooth in coordinates. More precisely we have:

Definition 2.17 (smooth map). Let M and N be two smooth manifolds with atlases {(Uα, φα)} and {(Vβ, ψβ)}, respectively. A continuous map f : M → N is a smooth map (or a morphism of C∞ manifolds) if for all α and β with f⁻¹(Vβ) ∩ Uα ≠ ∅, the composition

ψβ ◦ f ◦ φα⁻¹ : φα(Uα ∩ f⁻¹(Vβ)) → ψβ(Vβ)

is C∞. We will write C∞(M, N) to denote the set of all smooth maps from M to N.

Note that this definition does not depend on which atlases on M and N we choose [check this]. Also note that a special case of this definition is that of a smooth function on a manifold, which is a map from M to R. To wit:

Definition 2.18. A function f : M → R is smooth if f is continuous and if for all coordinate charts {(Uα, φα)}, f ◦ φα⁻¹ : φα(Uα) → R is C∞.

It’s consistent with the previous definition: we think of the real line R as a manifold with the standard coordinate chart id : R → R. We denote the collection of all smooth functions on a manifold M by C∞(M) = C∞(M, R).

Exercise 2.6. Let M be a manifold. Check that C∞(M) is a vector space over the reals under the standard addition of functions and multiplication by scalars. Is it finite dimensional?

Exercise 2.7. Let M be a manifold. Check that a constant function on M is smooth.

Here are some examples of smooth maps.

Example 2.19. Take M = Rn ∖ {0}, and let N = RPn−1. Let π : Rn ∖ {0} → RPn−1 be the projection π(v) = [v]. I claim that π is a smooth map. Let’s check it. The atlas on M is given by one chart — the inclusion φ of M into Rn. The charts on RPn−1 are the same as last time. Note that π⁻¹(Ui) = {v ∈ Rn ∖ {0} : vi ≠ 0}. To see that π is smooth, we need to check that φi ◦ π ◦ φ⁻¹ : π⁻¹(Ui) → Rn−1 is C∞. But note that

(φi ◦ π ◦ φ⁻¹)(v) = φi(π(v)) = φi([v]) = (v1/vi, . . . , vn/vi),

which is smooth on π⁻¹(Ui). □

Example 2.20. Let M = R with the coordinate chart φ(x) = x3. Let N = R with the coordinate chart ψ(x) = x. Let f : M → N be the map x ↦ x3. Is f a C∞ map? We compute:

(ψ ◦ f ◦ φ⁻¹)(x) = ψ(f(x^{1/3})) = ψ(x) = x,

which is smooth.
So f is smooth. Now let us see if the map h : M → N, h(x) = x, is smooth. We have ψ ◦ h ◦ φ⁻¹(x) = x^{1/3}, which is not differentiable at 0. So h is not smooth. Finally note that f⁻¹ : N → M is smooth: φ ◦ f⁻¹ ◦ ψ⁻¹(x) = (x^{1/3})³ = x.
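One can see the difference between f and h numerically. In the sketch below (an illustration only), ψ ◦ f ◦ φ⁻¹ is the identity in coordinates, while the difference quotients of ψ ◦ h ◦ φ⁻¹(x) = x^{1/3} blow up as x → 0:

```python
import math

# The two charts of Example 2.20 on the real line.
phi = lambda x: x ** 3                                        # chart on M
phi_inv = lambda x: math.copysign(abs(x) ** (1.0 / 3.0), x)   # cube root
psi = lambda x: x                                             # chart on N

f = lambda x: x ** 3
h = lambda x: x          # the identity map M -> N

# In coordinates f is the identity map of R, hence smooth.
coord_f = lambda x: psi(f(phi_inv(x)))
assert abs(coord_f(8.0) - 8.0) < 1e-12

# In coordinates h is x -> x^{1/3}; its difference quotients at 0 blow up,
# so h is not smooth even though it is the identity on points.
coord_h = lambda x: psi(h(phi_inv(x)))
q1 = coord_h(1e-3) / 1e-3        # difference quotient at scale 1e-3
q2 = coord_h(1e-6) / 1e-6        # much larger at scale 1e-6
assert q2 > q1 > 1.0
```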

Example 2.21. Constant functions are smooth maps of manifolds. □

The appropriate notion of “isomorphism” in differential geometry is the following one:

Definition 2.22 (Diffeomorphism). A C∞ map f : M → N between two smooth manifolds is a diffeomorphism if f is a homeomorphism and both f and f⁻¹ are C∞ maps.

Example 2.23. The map f : M → N of Example 2.20 is a diffeomorphism.

Exercise 2.8. If M and N are manifolds, prove that M × N is diffeomorphic to N × M.

Exercise 2.9. Show that the composition of smooth maps is smooth.

Exercise 2.10. Let LA : GL(n, R) → GL(n, R) be left multiplication by A ∈ GL(n, R). Prove that LA is a diffeomorphism. [Recall that GL(n, R) ⊂ R^(n²) is the set of all invertible n × n matrices and that it is open in R^(n²).]

2.4. Partitions of unity. In this subsection we define partitions of unity (that is, writing the constant function 1 as a sum of bump functions with certain properties) and prove the existence of a partition of unity subordinate to a cover on a second countable manifold. The existence of such partitions of unity is very useful. The proof of the existence of the partition of unity is not terribly useful and should be skipped on the first (and second) reading. The reason for this advice is that the proof is technical and the techniques will never be used again in this course. We start with a string of definitions.

Definition 2.24 (second countable). A topological space X is second countable if there is a countable collection of open subsets {Ui} of X such that any open set in X is the union of some collection of Ui’s. In other words, the topology of X has a countable basis.

Example 2.25. The real line R with the standard topology is second countable: the collection {Ui} consists of open intervals (a, b) where a and b are rational numbers. Similarly Rn is second countable: the collection {Ui} consists of open balls Br(x) of rational radius r centered at points x with rational coordinates.

Remark 2.26. Any (topological) subspace of a second countable space is second countable [prove it]. Hence any manifold that can be realized as a subspace of some Rn has to be second countable. The condition of second countability is much more than necessary for the existence of partitions of unity. One can get away with assuming only paracompactness. Here, for the record, is its definition. It takes a paragraph to state because we have to define a few more things first.

Definition 2.27. Let M be a topological space. A collection {Uα} of subsets of M is a cover of a subset W ⊂ M if W ⊂ ∪ Uα. It is an open cover if each Uα is open. A refinement {Vβ} of a cover {Uα} is a cover such that for each index β there is an index α = α(β) with Vβ ⊂ Uα. A collection of subsets {Uα} of M is locally finite if for every point m ∈ M there is a neighborhood W of m with W ∩ Uα ≠ ∅ for only finitely many α.

Example 2.28. The cover {(n, n + 2)}n∈Z is a locally finite cover of R. The cover {[−1/n, 1/n]} is a cover of (−1, 1) which is not locally finite — there is a problem at 0.

Definition 2.29 (paracompact). A topological space is paracompact if every open cover has a locally finite refinement.

Example 2.30. Any compact space is paracompact. We will see shortly that second countable Hausdorff manifolds are paracompact.

Definition 2.31 (support). The support supp f of a function f : X → R is the closure of the set of points where f is non-zero:

supp f = the closure of {x ∈ X : f(x) ≠ 0}.

Definition 2.32 (Partition of Unity). Let {Uα} be an open cover of a manifold M. A partition of unity subordinate to the cover {Uα} is a collection of smooth functions {ρβ : M → [0, 1]} such that:

(1) For each index β there is an index α with supp(ρβ) ⊂ Uα.
(2) For each point m ∈ M, there is a neighborhood W of m such that ρβ|W ≠ 0 for only finitely many β. That is, the collection of supports {supp ρβ} is locally finite.
(3) Σβ ρβ = 1.

Remark 2.33. Note that we need condition (2) to make sense of the sum in (3): by (2), for each point m ∈ M the sum Σβ ρβ(m) is actually a finite sum. So there are no problems with convergence.

Theorem 2.34. Let M be a second countable Hausdorff manifold. Then every open cover of M has a partition of unity subordinate to it.

Proof. (You should not read this proof the first time around.)

Step 1. We first construct a collection {Xk} of open subsets of M such that the closures X̄k are compact, X̄k ⊂ Xk+1 and M = ∪k Xk.

Since M is second countable, there is a countable basis of the topology of M. Out of this collection of open sets choose those that have compact closure and denote them by W1, W2, . . .. We claim that they cover M: M = ∪ Wi. Indeed, a point x ∈ M has a neighborhood homeomorphic to an open subset of Rn (n = dim M, of course). For any point y in an open set U ⊂ Rn there is a closed ball B̄r(y) centered at y with B̄r(y) ⊂ U. Closed balls in Rn are compact. Hence every point x ∈ M has a neighborhood U(x) whose closure is compact. Now U(x) is a union of a certain number of elements of the countable basis of the topology of M. The closure of each of these elements is compact. Therefore x ∈ Wi for some index i. This proves that M = ∪ Wi.

Let X1 = W1. The whole collection {Wi} covers X̄1. Since X̄1 is compact, X̄1 ⊂ Wi1 ∪ Wi2 ∪ · · · ∪ Wip

for some i1 < i2 < · · · < ip. Let X2 = Wi1 ∪ Wi2 ∪ · · · ∪ Wip. Then X̄2 = W̄i1 ∪ · · · ∪ W̄ip is compact. Continuing in this manner we get the desired collection {Xk}.

Step 2. We construct three countable open covers {Vβ,1}, {Vβ,2}, {Vβ,3} with Vβ,1 ⊂ Vβ,2 ⊂ Vβ,3 for each β, ∪β Vβ,1 = M, and with {Vβ,3} locally finite and subordinate to {Uα}, the cover we started out with. Note that this will prove that any Hausdorff second countable manifold is paracompact, as promised.

Fix an index k. For each point z ∈ X̄k ∖ Xk−1 choose an open set Vz,3 such that Vz,3 ⊂ Uα for some α, Vz,3 ⊂ Xk+1 and Vz,3 ∩ X̄k−2 = ∅. Additionally we require that there is a coordinate chart ψz mapping Vz,3 homeomorphically onto

B3(0) := {x ∈ Rn | ||x|| < 3}.

Let Vz,i = ψz⁻¹(Bi(0)) for i = 1, 2. The open sets Vz,1 cover the compact set X̄k ∖ Xk−1 (and are contained in Xk+1 ∖ X̄k−2). Therefore, for each k, there is a finite collection of Vz,1’s covering X̄k ∖ Xk−1. Take all of these finite collections: we get a cover {Vβ,1} of M. Similarly we get two more covers, {Vβ,2} and {Vβ,3}. Note that by construction they are locally finite and are subordinate to {Uα}: for each β there is α(β) with Vβ,i ⊂ Uα(β).

Step 3. Now we construct a partition of unity. The function

\[ f(t) = \begin{cases} e^{-1/t}, & t > 0 \\ 0, & t \le 0 \end{cases} \]

is smooth on all of R [this fact is not entirely trivial]. Hence

\[ \tilde{f}(t) = \begin{cases} e^{-1/(1-t)}, & t < 1 \\ 0, & t \ge 1 \end{cases} \]

is smooth on all of R. Therefore h : Rn → [0, ∞) given by

h(x) = f̃(||x||²/4)

is also smooth. Note that h(x) > 0 for all x ∈ B2(0) and h(x) = 0 for all x ∉ B2(0). Therefore, for each index β,

\[ g_\beta(x) = \begin{cases} h(\psi_\beta(x)), & x \in V_{\beta,3} \\ 0, & x \notin V_{\beta,3}, \end{cases} \]

where ψβ : Vβ,3 → B3(0) is the corresponding coordinate chart, is a smooth function on M. Moreover, gβ(x) > 0 for x ∈ Vβ,1. Since the cover {Vβ,3} is locally finite, the sum

G(x) = Σβ gβ(x)

makes sense [it converges for each x] and defines a smooth function on M. Since {Vβ,1} covers M, G(x) > 0 for all x ∈ M. Let

ρβ(x) = gβ(x)/G(x).

Then 1 ≥ ρβ(x) ≥ 0, Σ ρβ = 1 and supp ρβ ⊂ Vβ,3 ⊂ Uα(β). Thus the collection {ρβ} is the desired partition of unity. □

Corollary 2.34.1. Let M be a second countable Hausdorff manifold and {Ui} a countable open cover. Then there is a partition of unity {ρi} with supp ρi ⊂ Ui.

Proof. By Theorem 2.34 there is a partition of unity {τβ} with supp τβ ⊂ Ui for some i = i(β). Let

I(i) = {β | supp τβ ⊂ Ui and supp τβ ⊄ Uj for j < i}.

Define

ρi = Σβ∈I(i) τβ.

The collection {ρi} is the desired partition of unity. □

Proposition 2.35. Suppose that M is a second countable Hausdorff manifold, K ⊂ M a closed subset and U ⊂ M an open set with K ⊂ U. Then there is a smooth function f : M → [0, 1] such that

(1) f|K ≡ 1 and (2) supp(f) ⊂ U.

Proof. Let U1 = U and U2 = M ∖ K. By Corollary 2.34.1 there exist smooth functions ρ1, ρ2 : M → [0, 1] with supp ρi ⊂ Ui and ρ1 + ρ2 = 1. Since supp ρ2 ⊂ M ∖ K, we have ρ2|K ≡ 0. Hence ρ1|K ≡ 1. Now let f = ρ1. □

Corollary 2.35.1. Let M be a (second countable Hausdorff) manifold. For any point x ∈ M and any neighborhood U of x in M there is a smooth function f : M → R so that

(1) f ≡ 1 on a neighborhood V of x contained in U, and
(2) supp(f) ⊂ U.

Proof. Exercise. You can use the proposition above. Alternatively, prove it directly first in the case where M = Rn and then use a coordinate chart around x to prove it for arbitrary M. Is the condition that M is second countable really necessary? □
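To make the constructions of this subsection concrete, here is a sketch (my own illustration, specialized to M = R) of a partition of unity subordinate to the cover {(n, n + 2)}n∈Z of Example 2.28, built from the e^{−1/t} gadget of Step 3; the width 0.9 is an arbitrary choice that keeps each support inside its interval:

```python
import math

def q(t):
    # e^{-1/t} for t > 0 and 0 for t <= 0; smooth on all of R
    return math.exp(-1.0 / t) if t > 0 else 0.0

def bump(x):
    # smooth, positive exactly on (-0.9, 0.9), zero outside
    return q(1.0 - (x / 0.9) ** 2)

def g(n, x):
    # a bump centered in U_n = (n, n+2); supp g_n = [n+0.1, n+1.9] ⊂ U_n
    return bump(x - (n + 1))

def rho(n, x):
    # normalize; only the (at most two) nearby g_m are nonzero at x
    total = sum(g(m, x) for m in range(math.floor(x) - 2, math.floor(x) + 2))
    return g(n, x) / total

x = 0.37
assert abs(sum(rho(n, x) for n in range(-3, 3)) - 1.0) < 1e-12  # sums to 1
assert g(0, 0.05) == 0.0                   # supp g_0 stays inside (0, 2)
assert all(0.0 <= rho(n, x) <= 1.0 for n in range(-3, 3))
```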

3. Tangent vectors and tangent spaces

3.1. Tangent vectors and tangent spaces. We learn in physics that a vector is an arrow sticking out of a point in space and that a vector field assigns an arrow to each point in space. When we learn linear algebra, we are told to forget this point of view: all vectors are sticking out of one point — the origin. For the purposes of differential geometry the physics point of view is correct after all: all our vectors are anchored at various points in space.

There is another issue we need to deal with. If S ⊂ R3 is a smooth convex surface, one can imagine that for every point p ∈ S there is a two-plane TpS touching S at that point, a plane tangent to S at p. (It is not entirely clear that such a plane is unique, but that’s another story.) A vector tangent to S at p would be an arrow anchored at p and lying in TpS. This raises a problem: our manifolds are defined abstractly and not as subsets of some Rn. So what would a tangent plane be in this case? And what vector space would it lie in?

The solution is to think of vectors as directional derivatives. A directional derivative of a function on Rn depends on two things: a direction and the point at which the function is being differentiated. For a smooth function f ∈ C∞(Rn), we write

Dv f(p) = (d/dt)|t=0 f(p + tv)

for the directional derivative of f at a point p ∈ Rn in the direction v ∈ Rn. Observe that

(1) the directional derivatives are linear: for any f, g ∈ C∞(Rn) and any λ, µ ∈ R,

Dv(λf + µg)(p) = λ Dv f(p) + µ Dv g(p);

(2) the directional derivatives have a derivation property:

Dv(fg)(p) = f(p) Dv g(p) + Dv f(p) g(p).

This motivates the following definition:

Definition 3.1 (Tangent vector). Let M be a manifold and a ∈ M a point. A tangent vector to M at a is an R-linear map v : C∞(M) → R such that

(3.1) v(fg) = f(a)v(g) + g(a)v(f)

for all functions f, g ∈ C∞(M). Linear maps C∞(M) → R satisfying (3.1) are also said to have a derivation property and are called derivations (into R).
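Properties (1) and (2) can be tested numerically. The sketch below (an illustration; the functions f and g are arbitrary choices) models Dv by a central difference on R² and checks linearity and the Leibniz rule up to discretization error:

```python
import math

# Model the directional derivative D_v(.)(p) on R^2 by a central difference.
def D(v, p, fn, eps=1e-6):
    fwd = fn(p[0] + eps * v[0], p[1] + eps * v[1])
    bwd = fn(p[0] - eps * v[0], p[1] - eps * v[1])
    return (fwd - bwd) / (2 * eps)

f = lambda x, y: x * x + y          # arbitrary smooth test functions
g = lambda x, y: math.sin(x) * y

p = (0.5, 1.2)
v = (1.0, -2.0)

# Linearity in the function argument (property (1)).
lin = D(v, p, lambda x, y: 3 * f(x, y) + 5 * g(x, y))
assert abs(lin - (3 * D(v, p, f) + 5 * D(v, p, g))) < 1e-6

# The derivation (Leibniz) property (property (2), cf. equation (3.1)).
prod = D(v, p, lambda x, y: f(x, y) * g(x, y))
assert abs(prod - (f(*p) * D(v, p, g) + g(*p) * D(v, p, f))) < 1e-6
```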

Definition 3.2 (Tangent space). The tangent space TaM to a manifold M at a point a is the collection of all tangent vectors to M at a.

Exercise 3.1. The tangent space TaM is a vector space over the reals. [That’s why the elements of the tangent space are called “vectors”!] That is, if v, w ∈ TaM and λ, µ ∈ R then the linear map λv + µw : C∞(M) → R is a derivation.

Note that by our definition every directional derivative at a point p ∈ Rn is a tangent vector at p to Rn. This raises a question: are there tangent vectors that are not directional derivatives? The answer is no; tangent vectors at points of Rn are directional derivatives, and that’s all there is to it:

Proposition 3.3. Let w ∈ TaRn be a tangent vector. That is, suppose w : C∞(Rn) → R is a linear map satisfying (3.1). Then

w(f) = Dv f(a)

for some v ∈ Rn. The same result holds with Rn replaced by an open ball Br(a).

To prove the proposition we first “recall” a version of Taylor’s theorem.

Lemma 3.4. Let f be a smooth function on Rn and fix a point a ∈ Rn. Then for any x ∈ Rn,

(3.2) f(x) = f(a) + Σ (xi − ai)hi(x),

where the hi(x) are smooth functions with

hi(a) = ∂f/∂xi(a).

Proof. Suppose first that a = 0. Then, by the fundamental theorem of calculus and the chain rule,

\[ f(x) - f(0) = \int_0^1 \frac{d}{dt} f(tx)\, dt = \int_0^1 \Bigl( \sum x_i \frac{\partial f}{\partial x_i}(tx) \Bigr)\, dt = \sum x_i \int_0^1 \frac{\partial f}{\partial x_i}(tx)\, dt. \]

Let hi(x) = ∫₀¹ ∂f/∂xi(tx) dt. These are the desired functions. If a ≠ 0, apply the previous argument to f̄(x) = f(x + a). □
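The proof is constructive, and the functions hi can be computed by quadrature. In the sketch below (an illustration; f is an arbitrary choice) we approximate hi(x) = ∫₀¹ ∂f/∂xi(tx) dt by Simpson’s rule and verify (3.2) at a = 0:

```python
import math

# Verify Lemma 3.4 numerically for f(x1, x2) = x1 * e^{x2} at a = 0.
f = lambda x1, x2: x1 * math.exp(x2)
df1 = lambda x1, x2: math.exp(x2)          # ∂f/∂x1, by hand
df2 = lambda x1, x2: x1 * math.exp(x2)     # ∂f/∂x2, by hand

def simpson(fn, n=200):
    # ∫_0^1 fn(t) dt by the composite Simpson rule (n even)
    w = 1.0 / n
    total = fn(0.0) + fn(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * fn(k * w)
    return total * w / 3.0

def h(i, x1, x2):
    # h_i(x) = ∫_0^1 (∂f/∂x_i)(tx) dt
    dfi = df1 if i == 1 else df2
    return simpson(lambda t: dfi(t * x1, t * x2))

x = (0.4, -0.7)
# Equation (3.2) at a = 0: f(x) = f(0) + x1*h1(x) + x2*h2(x), with f(0) = 0.
recon = x[0] * h(1, *x) + x[1] * h(2, *x)
assert abs(recon - f(*x)) < 1e-8

# h_i(0) recovers the partial derivative at the origin.
assert abs(h(1, 0.0, 0.0) - df1(0.0, 0.0)) < 1e-12
```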

Remark 3.5. If f is a smooth function on an open ball Br(a) then (3.2) still holds at all x ∈ Br(a), except now hi ∈ C∞(Br(a)). The proof is exactly the same.

Before proving the proposition we need one more simple lemma.

Lemma 3.6. Let M be a manifold and w ∈ TaM a tangent vector. Then for any constant function c we have w(c) = 0.

Proof. Apply the tangent vector w to the constant function 1:

w(1) = w(1 · 1) = 1 · w(1) + w(1) · 1 = 2w(1),

hence w(1) = 0. Since w is linear, for any constant function c = c · 1,

w(c) = w(c · 1) = c w(1) = 0. □

Proof of Proposition 3.3. By Lemma 3.4, f(x) = f(a) + Σ (xi − ai)hi(x). Hence

w(f) = w(f(a)) + Σ ( w(xi − ai) hi(a) + (ai − ai) w(hi) ) = 0 + Σ w(xi) hi(a) + 0 = Σ w(xi) ∂f/∂xi(a).

Therefore w = Dv(·)(a), where v = (w(x1), . . . , w(xn)). We leave the ball version of the proof as an exercise. □

Remark 3.7. The proof above actually shows that the derivations {∂/∂xi|a} form a basis of TaRn.

For arbitrary manifolds a choice of coordinates near a point also defines a basis of the tangent space at the point. To express this precisely it will be convenient to slightly change our notation. To this end, denote the points of Rn by r = (r1, . . . , rn). We also think of ri as a function that assigns to a point its i-th coordinate. If φ : U → Rn is a coordinate chart on a manifold M, then φ = (r1 ◦ φ, . . . , rn ◦ φ). We then think of xi = ri ◦ φ as coordinate functions on U.

The coordinates define tangent vectors at points of U: for any a ∈ U and any f ∈ C∞(M) we define ∂/∂xi|a by

∂/∂xi|a (f) := ∂/∂ri|φ(a) (f ◦ φ⁻¹).

It is easy to see that these are, indeed, tangent vectors. It should come as no surprise that they form a basis of the tangent space TaM. After all, manifolds locally look like Rn, and in Rn the partial derivatives do form bases of tangent spaces. Now let’s prove this. We first observe that tangent vectors are local.

Lemma 3.8. Let M be a manifold and v ∈ TaM a tangent vector. Then for any two functions f, g ∈ C∞(M) with f = g in a neighborhood U of a, we have v(f) = v(g). In particular, if h is constant on a neighborhood U of a, then v(h) = 0 (cf. Lemma 3.6).

Proof. As v : C∞(M) → R is R-linear, it is enough to show that v(f − g) = 0. Choose a smooth bump function ρ : M → [0, 1] with supp ρ ⊂ U which is identically 1 on a neighborhood V of a. We then have that ρ · (f − g) = 0 on all of M by construction. Furthermore, because v is linear, v(0) = 0, hence

0 = v(ρ (f − g)) = v(ρ)(f − g)(a) + ρ(a) v(f − g) = v(f − g). □

What’s the point of the lemma, aside from its esthetic appeal? If φ = (x1, . . . , xn) : U → Rn is a coordinate chart on a manifold M and v ∈ TaM is a tangent vector at some point a ∈ U, then we cannot apply v to a coordinate function xi: the function xi is only defined on U; it is not a smooth function on all of M. However, there is a way around this problem. Pick a smooth bump function ρ : M → [0, 1] with supp ρ ⊂ U which is identically 1 on some neighborhood of a. Then xiρ is a smooth function on M and so v(xiρ) does make sense. Moreover, this number does not depend on the choice of the bump function: if τ : M → [0, 1] is another choice of a bump function with the same properties, then xiρ = xiτ on some (perhaps smaller) neighborhood of a. Therefore, by the preceding lemma, v(xiρ) = v(xiτ). We therefore define

v(xi) := v(xiρ)

for some choice of the bump function ρ. Similarly, if h ∈ C∞(U) we define v(h) := v(hρ) for some (any) choice of the appropriate bump function ρ.

Lemma 3.9. Let φ = (x1, . . . , xn) : U → Rn be a coordinate chart on a manifold M and let v ∈ TaM be a tangent vector at some point a ∈ U. Then

(3.3) v = Σi v(xi) ∂/∂xi|a.

Moreover, the vectors {∂/∂xi|a} form a basis of TaM.

Proof. We evaluate both sides of (3.3) on a function f ∈ C∞(M). It is no loss of generality to assume that φ(U) is a ball and that φ(a) = 0. By Lemma 3.4,

(f ◦ φ⁻¹)(r) = (f ◦ φ⁻¹)(0) + Σ ri hi(r),

where hi(0) = ∂/∂ri|0 (f ◦ φ−1). Thus, for all x ∈ U,

f(x) = f(a) + Σ_i xi(x) fi(x),

where

fi(a) = ∂/∂ri|0 (f ◦ φ−1) = ∂/∂xi|a (f).

Hence, for any v ∈ TaM, we have

v(f) = v(f(a) + Σ xi fi)
     = Σ xi(a) v(fi) + Σ v(xi) fi(a)
     = Σ v(xi) fi(a)
     = Σ v(xi) ∂/∂xi|a (f).

This shows that {∂/∂xi|a} span TaM. To check linear independence observe that

∂/∂xi|a (xj) = δij,

where δij denotes the Kronecker delta: it is 1 if i = j and zero otherwise. □

Remark 3.10. We have seen in the preceding discussion that for any p ∈ R^n the tangent space TpR^n is isomorphic to R^n. Explicitly the isomorphism is given by taking a vector v ∈ R^n to the directional derivative at p in the direction of v:

R^n → TpR^n,   v ↦ Dv(·)(p).

In particular

R → TaR,   s ↦ s d/dr|a.

3.2. Digression: vector spaces and their duals. Given two (finite dimensional) vector spaces V and W we denote the set of all linear maps from V to W by Hom(V, W). It is a vector space: any linear combination of two linear maps is again a linear map. Of special interest is the vector space V* := Hom(V, R) of linear maps from a vector space V to R, the so-called dual vector space. If {vi}_{i=1}^n is a basis of V, the dual basis is the basis {vi*} of V* defined by

vi*(vj) = δij for all 1 ≤ i, j ≤ n.

This is indeed a basis. If ℓ ∈ V* is an arbitrary functional, then

ℓ = Σ ℓ(vi) vi*

because both sides of the formula above agree on the basis vectors vj (I am tacitly using the fact that if two linear maps µ, ν : V → R agree on basis vectors, then they are equal). It follows that dim V* = dim V. Finally observe that for any vector u ∈ V,

u = Σ vi*(u) vi.

Why is the formula above true? Apply vj* to both sides.

Exercise 3.2. Show that a choice of bases of vector spaces V and W identifies Hom(V, W) with a space of matrices. Conclude that dim Hom(V, W) = dim V · dim W.

3.3. Differentials.

Definition 3.11. Let f : M → N be a smooth map of manifolds and a ∈ M a point. The differential of f at a is the linear map dfa : TaM → Tf(a)N defined by

(dfa(v))(h) = v(h ◦ f)

for all v ∈ TaM and all h ∈ C∞(N).

Exercise 3.3. Check that the definition above makes sense. That is, given v ∈ TaM, check that the map C∞(N) → R, h ↦ v(h ◦ f) is a linear map satisfying (3.1).

We will check shortly that in the case of a smooth map f : R^n → R^m, dfa = Dfa under the natural identification TaR^n ≅ R^n.

We next sort out what the definition of a differential amounts to in the case where f : M → R is a smooth function (in other words the target manifold N is R). By Definition 3.11, dfa is a map from TaM to Tf(a)R. If we compose dfa with the isomorphism Tf(a)R ≅ R (see Remark 3.10), we get a linear map

dfa : TaM → R.

By definition, dfa is then an element of the dual vector space Ta*M := Hom(TaM, R). I claim that the linear map dfa is given by

(3.4) dfa(v) = v(f).

for any tangent vector v ∈ TaM.

Proof. Let r : R → R denote the identity map. We think of it as the standard coordinate on R. Then for every point x ∈ R the vector d/dr|x is a basis vector of TxR, which gives us an isomorphism

TxR → R,   t d/dr|x ↦ t.

The map above has a "coordinate free" description as well. It is:

TxR ∋ v ↦ v(r).

Therefore dfa(v) = (dfa(v))(r) = v(r ◦ f) = v(f). □

Remark 3.12. It is customary not to distinguish between the two maps called dfa above. Thus, in the case of f ∈ C∞(M), the differential dfa denotes both the linear map dfa : TaM → Tf(a)R and the linear functional dfa : TaM → R. In other words, from now on we write (3.4) as

(3.5)   dfa(v) = v(f)

for all f ∈ C∞(M), a ∈ M, v ∈ TaM.

Definition 3.13. The vector space Ta*M := Hom(TaM, R) is called the cotangent space of M at a.

The new concept of the differential allows us to re-interpret the formula (3.3). Recall that a choice of coordinates φ = (x1, . . . , xn) : U → R^n on a manifold M gives rise to a basis {∂/∂xi|a} of TaM for any point a ∈ U. We claim that {(dxi)a} form the dual basis of the cotangent space Ta*M. Indeed, by (3.5),

(dxj)a (∂/∂xi|a) = ∂/∂xi|a (xj) = δij.

Since for v ∈ TaM we have v(xi) = (dxi)a(v), (3.3) becomes

(3.6)   v = Σ_i (dxi)a(v) ∂/∂xi|a.

Let f = (f1, . . . , fm) : R^n → R^m be a smooth map. We are now in the position to compare dfa : TaR^n → Tf(a)R^m with Dfa : R^n → R^m. Let r1, . . . , rn denote the standard coordinates on R^n and s1, . . . , sm the standard coordinates on R^m. Using (3.6) we compute:

(dsi)f(a)(dfa(∂/∂rj|a)) = (dfa(∂/∂rj|a))(si) = ∂/∂rj|a (si ◦ f) = ∂/∂rj|a (fi) = ∂fi/∂rj (a).

Thus the matrix of the linear map dfa : TaR^n → Tf(a)R^m with respect to the bases {∂/∂rj|a} and {∂/∂si|f(a)} is the Jacobian matrix of Dfa.
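The identification of dfa with the Jacobian can be checked numerically. The sketch below is not part of the notes; the map f, the point a and the direction v are arbitrary choices. It compares a finite-difference directional derivative of a map f : R^2 → R^2 with the product of its finite-difference Jacobian and the direction vector.

```python
# Numerical check (illustration only): the matrix of df_a in the coordinate
# bases is the Jacobian (∂f_i/∂r_j)(a), so J v agrees with D_v f(a).
import math

def f(x, y):
    return (x**2 * y, math.sin(x) + y)

def jacobian(a, h=1e-6):
    """Finite-difference Jacobian of f at a = (x, y)."""
    x, y = a
    col1 = [(p - m) / (2*h) for p, m in zip(f(x + h, y), f(x - h, y))]
    col2 = [(p - m) / (2*h) for p, m in zip(f(x, y + h), f(x, y - h))]
    return [[col1[0], col2[0]], [col1[1], col2[1]]]

def directional_derivative(a, v, h=1e-6):
    """(d/dt)|_0 f(a + t v), computed by central differences."""
    x, y = a
    fp = f(x + h*v[0], y + h*v[1])
    fm = f(x - h*v[0], y - h*v[1])
    return [(p - m) / (2*h) for p, m in zip(fp, fm)]

a, v = (0.7, -1.3), (2.0, 0.5)
J = jacobian(a)
Jv = [J[i][0]*v[0] + J[i][1]*v[1] for i in range(2)]
Dv = directional_derivative(a, v)
assert all(abs(p - q) < 1e-5 for p, q in zip(Jv, Dv))
```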

It is worth singling out another special case of the definition of a differential of a map: M = R. In this case f : R → N is a smooth curve. We define the tangent vector to f at t ∈ R to be

f′(t) := dft(d/dr|t).

Note that by definition f′(t) is a tangent vector in Tf(t)N, the tangent space to N at f(t).

Exercise 3.4. Let M be a manifold, p ∈ M a point and v ∈ TpM a tangent vector at the point p. Show that there is a curve γ : I → M (where I is an open interval containing 0) with γ(0) = p and γ′(0) = v.

We next observe that the chain rule holds for the differentials of smooth maps.

Theorem 3.14 (Chain Rule). If F : X → Y and H : Y → Z are smooth maps of manifolds, then

d(H ◦ F )a = dHF (a) ◦ dFa for any point a ∈ X.

∞ Proof. Fix a ∈ X, v ∈ TaX, and f ∈ C (Z). Then

(d(H ◦ F )a(v))(f) = v(f ◦ (H ◦ F )) = v((f ◦ H) ◦ F )

= (dFa(v))(f ◦ H)

= (dHF (a)(dFa(v)))(f). 

Remark 3.15. Theorem 3.14 and Exercise 3.4 give us a useful way of computing differentials dfa : TaM → Tf(a)N. By the exercise, for any v ∈ TaM we can find a curve γ : I → M with γ(0) = a and γ′(0) = v. Then, by the chain rule,

dfa(v) = dfa(γ′(0)) = dfa(dγ0(d/dr|0)) = d(f ◦ γ)0(d/dr|0) = (f ◦ γ)′(0).
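Remark 3.15 can be illustrated numerically: the value (f ◦ γ)′(0) depends only on γ(0) and γ′(0), not on the curve itself. The sketch below is not from the notes; the function and the two curves are arbitrary choices.

```python
# Two different curves through a with the same velocity v give the same df_a(v).
import math

def f(x, y):
    return x * math.exp(y)   # a sample smooth function f : R^2 -> R

a, v = (1.0, 0.5), (2.0, -1.0)

gamma1 = lambda t: (a[0] + t*v[0], a[1] + t*v[1])                # straight line
gamma2 = lambda t: (a[0] + t*v[0] + t*t, a[1] + t*v[1] - 3*t*t)  # curved, same velocity at 0

def deriv_at_zero(curve, h=1e-6):
    # central-difference derivative of f ∘ curve at t = 0
    return (f(*curve(h)) - f(*curve(-h))) / (2*h)

d1, d2 = deriv_at_zero(gamma1), deriv_at_zero(gamma2)
assert abs(d1 - d2) < 1e-4  # df_a(v) does not depend on the curve chosen
```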

Exercise 3.5. Prove that if F : M → N is a diffeomorphism then the differential dFa : TaM → TF (a)N is an isomorphism.

Exercise 3.6. Let M and N be manifolds. Prove that for any (a, b) ∈ M × N the tangent space T(a,b)(M × N) is isomorphic to TaM × TbN.

Exercise 3.7. Suppose that γ : R → R^n, γ(t) = (γ1(t), . . . , γn(t)) is a smooth curve. Show that

dγt(d/dt) = Σ_i γi′(t) ∂/∂ri|γ(t),

where the γi′(t) are ordinary derivatives.

3.4. The tangent bundle.

Definition 3.16 (provisional). The tangent bundle TM of a manifold M is (as a set)

TM = ⊔_{a∈M} TaM.

Note that there is a natural projection (the tangent bundle projection)

π : TM → M

which sends a tangent vector v ∈ TaM to the corresponding point a of M.

We want to show that the tangent bundle TM itself is a manifold in a natural way and that the projection map π : TM → M is smooth. Strictly speaking, we first should specify a topology on TM. However, our strategy will be different. We will first find candidates for coordinate charts on the tangent bundle TM. They will be constructed out of coordinate charts on M. We will check that the change of these candidate coordinates on TM is smooth. We will then use these candidate coordinates to manufacture a topology on TM.

Let φ = (x1, . . . , xn) : U → R^n be a coordinate chart on M. Out of it we construct a chart on TU. The first n functions come for free: we take the functions x1 ◦ π, . . . , xn ◦ π. Another set of n functions also comes for free: by (3.6), given a vector v ∈ TaU,

v = Σ (dxi)a(v) ∂/∂xi|a.

Hence, abusing the notation a bit, we get maps

dxi : TU → R,   TU ∋ v ↦ (dxi)a(v),   where a = π(v).

Thus we define a candidate coordinate chart

φ̃ := (x1 ◦ π, . . . , xn ◦ π, dx1, . . . , dxn) : TU → R^n × R^n

by

φ̃(v) = (x1(π(v)), . . . , xn(π(v)), (dx1)π(v)(v), . . . , (dxn)π(v)(v)).

If {(Uα, φα)} is an atlas on M, we get a candidate atlas {(TUα, φ̃α)} on TM. To see why this could possibly be an atlas, we need to check that the change of coordinates in this new purported atlas is smooth. To this end pick two coordinate charts (U, φ = (x1, . . . , xn)) and (V, ψ = (y1, . . . , yn)) on M with U ∩ V ≠ ∅. Then T(U ∩ V) = TU ∩ TV ≠ ∅. Let

φ̃ = (x1, . . . , xn, dx1, . . . , dxn) : TU → R^n × R^n

and

ψ̃ = (y1, . . . , yn, dy1, . . . , dyn) : TV → R^n × R^n

be the corresponding candidate charts on TM. Now let us compute the change of coordinates ψ̃ ◦ φ̃−1. First, note that

φ̃−1(r1, . . . , rn, u1, . . . , un) = Σ_i ui ∂/∂xi|φ−1(r1,...,rn) ∈ Tφ−1(r1,...,rn)M.

So

ψ̃(Σ_i ui ∂/∂xi|φ−1(r1,...,rn)) = (ψ(φ−1(r1, . . . , rn)), dy1(Σ_i ui ∂/∂xi), . . . , dyn(Σ_i ui ∂/∂xi)).

But

dyj(Σ_i ui ∂/∂xi) = Σ_i ui ∂/∂xi (yj) = Σ_i ∂yj/∂xi ui = Σ_i ∂(rj ◦ ψ ◦ φ−1)/∂ri ui.

Thus the change of the candidate coordinates is given by

(3.7)   ψ̃ ◦ φ̃−1(r1, . . . , rn, u1, . . . , un) = (ψ ◦ φ−1(r), Σ_i ∂y1/∂xi (r) ui, . . . , Σ_i ∂yn/∂xi (r) ui)
                                            = (ψ ◦ φ−1(r), [∂yj/∂xi (r)] (u1, . . . , un)^T),

where r = (r1, . . . , rn). Clearly ψ̃ ◦ φ̃−1 is smooth wherever it is defined.

It remains to define a topology on TM so that the charts φ̃ : TU → φ(U) × R^n are homeomorphisms. We declare a subset O ⊂ TM to be open if for any coordinate chart φ : U → R^n on M, the set φ̃(O ∩ TU) ⊂ R^n × R^n is open.

Proposition 3.17. The collection of open sets on TM defined above does indeed form a topology. Moreover, if M is Hausdorff and second countable, so is TM.

Proof. An exercise for the reader. □

We conclude that if M is an n-dimensional Hausdorff second countable manifold then its tangent bundle TM is a 2n-dimensional Hausdorff second countable manifold. Moreover, each coordinate chart (x1, . . . , xn) : U → R^n on M gives rise to a coordinate chart (x1 ◦ π, . . . , xn ◦ π, dx1, . . . , dxn) : TU → R^2n.
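The change-of-coordinates formula (3.7) — base point through the chart change, fiber coordinates through its Jacobian — can be tested in a familiar example, the polar and Cartesian charts on R² ∖ {0}. This sketch is not part of the notes; the sample point and tangent vector are arbitrary.

```python
# Chart change on M = R^2 \ {0}: polar (r, θ) to Cartesian (x, y).
import math

to_xy = lambda r, th: (r*math.cos(th), r*math.sin(th))

def jacobian(r, th):
    # Jacobian of the chart change; by (3.7) it transports fiber coordinates.
    return [[math.cos(th), -r*math.sin(th)],
            [math.sin(th),  r*math.cos(th)]]

p = (2.0, 0.6)      # base point in polar coordinates
u = (0.3, -1.1)     # fiber (tangent-vector) coordinates in the polar chart
J = jacobian(*p)
u_xy = (J[0][0]*u[0] + J[0][1]*u[1], J[1][0]*u[0] + J[1][1]*u[1])

# Sanity check: push the curve t -> to_xy(r + t u1, θ + t u2) forward and
# differentiate at t = 0; this must reproduce the transformed fiber.
h = 1e-6
cp = to_xy(p[0] + h*u[0], p[1] + h*u[1])
cm = to_xy(p[0] - h*u[0], p[1] - h*u[1])
vel = ((cp[0] - cm[0])/(2*h), (cp[1] - cm[1])/(2*h))
assert all(abs(a - b) < 1e-5 for a, b in zip(u_xy, vel))
```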

Remark 3.18. The following notation is suggestive: we write (m, v) ∈ TM for v ∈ Tm(M). Strictly speaking, it is redundant since m = π(v).

Remark 3.19. It is customary to simply write xi : TU → R for xi ◦ π : TU → R.

Exercise 3.8. Prove that the map π : TM → M is smooth and that the differential dπv : Tv(TM) → Tπ(v)M is surjective for all tangent vectors v ∈ TM. Hint: do it in (convenient) coordinates.

3.5. The cotangent bundle. As a set, the cotangent bundle T*M is the disjoint union of cotangent spaces:

T*M = ⊔_{a∈M} Ta*M.

Note that there is a natural projection (the cotangent bundle projection)

π : T*M → M

which sends a cotangent vector (a covector for short) η ∈ Ta*M to the corresponding point a of M. We make the cotangent bundle T*M into a manifold in more or less the same way we made the tangent bundle into a manifold. That is, we manufacture new coordinate charts on T*M out of coordinate charts on M and check that the transition maps between the new coordinate charts are smooth.

So let φ = (x1, . . . , xn) : U → R^n be a coordinate chart on M. Then for each point a ∈ U the covectors {(dxi)a} form a basis of Ta*M, and the partials {∂/∂xi|a} form the dual basis. Hence for any η ∈ Ta*M,

η = Σ η(∂/∂xi|a) (dxi)a.

Therefore the partials {∂/∂xi} give us coordinate functions on T*U:

∂/∂xi : T*U → R,   T*U ∋ η ↦ η(∂/∂xi|a),   where a = π(η).

We now define the candidate coordinates

φ̄ : T*U → R^n × R^n

by

φ̄ = (x1 ◦ π, . . . , xn ◦ π, ∂/∂x1, . . . , ∂/∂xn).

Note that

φ̄−1(r1, . . . , rn, w1, . . . , wn) = Σ_{i=1}^n wi (dxi)φ−1(r) ∈ Tφ−1(r)*M,

where again we have abbreviated (r1, . . . , rn) as r. We now check the transition maps. Let ψ = (y1, . . . , yn) : V → R^n be a coordinate chart on M with V ∩ U ≠ ∅. Then

ψ̄ ◦ φ̄−1(r1, . . . , rn, w1, . . . , wn) = ψ̄(Σ_{i=1}^n wi (dxi)φ−1(r))
  = ((ψ ◦ φ−1)(r), ∂/∂y1(Σ_i wi dxi), . . . , ∂/∂yn(Σ_i wi dxi))
  = ((ψ ◦ φ−1)(r), Σ_i wi ∂xi/∂y1, . . . , Σ_i wi ∂xi/∂yn).

We conclude that

(3.8)   ψ̄ ◦ φ̄−1(r1, . . . , rn, w1, . . . , wn) = (ψ ◦ φ−1(r), [∂xi/∂yj (r)] (w1, . . . , wn)^T),

which is smooth. The rest of the argument proceeds as in the case of the tangent bundle.

Remark 3.20. Later on, when we look at general vector bundles, it will be instructive to compare the formulas for the change of coordinates in the tangent and the cotangent bundles. In particular note that the matrices [∂yj/∂xi (r)] and [∂xi/∂yj (r)] are inverse transposes of each other.

3.6. Vector fields. A vector field X on a manifold M smoothly assigns to a point a ∈ M a tangent vector X(a) ∈ TaM (sometimes also written Xa). What does "smoothly" mean? If X is a vector field in R^n then

X(a) = Σ fi(a) ∂/∂ri|a

for certain functions fi of the point a ∈ R^n. So whatever we mean by "smooth" should amount to the functions fi being smooth. This suggests one definition of a smooth vector field:

Definition 3.21. A vector field X on a manifold M is smooth if for any coordinate chart φ = (x1, . . . , xn) : U → R^n we have, for any point a ∈ U,

(3.9)   X(a) = Σ fi(a) ∂/∂xi|a

for some smooth functions fi : U → R.

There is something a bit unsatisfying about this definition: is it possible that the functions fi in (3.9) are smooth for one choice of coordinates and not smooth for another choice? So we will use it as a starting point for a better one. Note that the functions fi in (3.9) are given by:

fi(a) = (dxi)a(X(a)), for any a ∈ U. Thus Definition 3.21 simply says that the composite (x1, . . . , xn, dx1, . . . , dxn) ◦ X : U → Rn × Rn is smooth. But this is the same thing as saying that the map X : M → TM is smooth. Not every map Z : M → TM is a vector field: we need to make sure that Z(a) ∈ TaM. The condition is equivalent to π(Z(a)) = a for all a ∈ M. Here, as before, π : TM → M is the natural projection. This gives us a slightly more “sophisticated” definition of a vector field: Definition 3.22. A (smooth) vector field X on a manifold M is a smooth map X : M → TM such that π ◦ X = id. There is yet another definition of a vector field, which is quite useful from some points of view: Definition 3.23. A smooth vector field X on a manifold M is a linear map X : C∞(M) → C∞(M) such that (3.10) X(fg) = fX(g) + gX(f) for all f, g ∈ C∞(M). Proposition 3.24. Definitions 3.22 and 3.23 are equivalent. Proof. Exercise. Here are a few hints. Given a vector field X : M → TM define a map X˜ from C∞(M) to functions on M by (X˜(f))(a) = Xa(f) for all f ∈ C∞(M) and all a ∈ M. Check that X˜(f) is a smooth function and that the map X˜ so defined is a derivation. That is, show that (3.10) holds with X replaced by X˜. Conversely, given a map X˜ : C∞(M) → C∞(M) with the derivation property as above, define X : M → TM by Xa(f) = (X˜(f))(a) ∞ for all f ∈ C (M) and all a ∈ M. Check that Xa is indeed a tangent vector in TaM and that the map X : M → TM, a 7→ Xa is smooth in a. 
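The derivation property (3.10) can be checked numerically for a vector field on R² written in components. The sketch below is illustrative only (the field X and the functions f, g are arbitrary choices, not from the notes); partial derivatives are approximated by central differences.

```python
# A vector field on R^2, given by its component functions, acts on a
# function h as X(h)(a) = f1(a) ∂h/∂x(a) + f2(a) ∂h/∂y(a).
import math

X = (lambda x, y: y,        # coefficient of ∂/∂x
     lambda x, y: x*x)      # coefficient of ∂/∂y

def apply_field(X, h, a, eps=1e-6):
    x, y = a
    hx = (h(x + eps, y) - h(x - eps, y)) / (2*eps)
    hy = (h(x, y + eps) - h(x, y - eps)) / (2*eps)
    return X[0](x, y)*hx + X[1](x, y)*hy

f = lambda x, y: math.sin(x) + y
g = lambda x, y: x * y
fg = lambda x, y: f(x, y) * g(x, y)

a = (0.4, -0.9)
lhs = apply_field(X, fg, a)
rhs = f(*a) * apply_field(X, g, a) + g(*a) * apply_field(X, f, a)
assert abs(lhs - rhs) < 1e-4   # Leibniz rule: X(fg) = f X(g) + g X(f)
```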

³ Sometimes this is also written Xa.

Remark 3.25. From now on we will not distinguish between the two definitions and will think of vector fields either as smooth maps M → TM satisfying certain conditions or as R-linear maps C∞(M) → C∞(M) satisfying the appropriate conditions. We will make no notational distinction between the two ways of looking at vector fields. Thus X(a) will stand for the value of a vector field at a point a if a is a point. On the other hand, if f is a smooth function, X(f) will stand for a new smooth function, the "derivative" of f with respect to the vector field X.

Notation. There are several standard ways to denote the space of all smooth vector fields on a given manifold M. The two most common ones are Γ(TM) [vector fields are sections of the tangent bundle, see below] and X(M).

Remark 3.26. 1. The space of vector fields Γ(TM) is a vector space over R: if X, Y ∈ Γ(TM) are (smooth) vector fields and λ, µ ∈ R are scalars, then their linear combination λX + µY is defined by

(λX + µY)(a) := λX(a) + µY(a) for any a ∈ M.

It is again a smooth vector field.

2. We can also multiply vector fields on M by smooth functions: if X ∈ Γ(TM) and f ∈ C∞(M) then fX is defined by

(fX)(a) := f(a)X(a) for all a ∈ M.

A fancy way of describing 2. is to say that Γ(TM) is a module over the ring of smooth functions C∞(M). See if you can impress your date.

If X, Y ∈ Γ(TM) are two vector fields on a manifold M then it is not true that the R-linear map

C∞(M) → C∞(M),   f ↦ X(Y(f))

is a vector field — it does not have the correct derivation property. For example, if M = R and X = Y = d/dt, then X(Y(f)) = f′′ and

(fg)′′ = (f′g + fg′)′ = f′′g + 2f′g′ + fg′′ ≠ f′′g + fg′′.

However,

Lemma 3.27. Let X, Y ∈ Γ(TM) be two smooth vector fields on a manifold M. Then the map

(3.11)   [X, Y] : C∞(M) → C∞(M),   f ↦ X(Y(f)) − Y(X(f))

is a vector field.

Proof. Clearly the map [X, Y] is R-linear. We need to check that it has the correct derivation property.
This is a mindless computation. Pick two functions f, g ∈ C∞(M). Then [X,Y ](fg) =X(Y (fg)) − Y (X(fg)) =X(Y (f)g + fY (g)) − Y (X(f)g + fX(g)) =X(Y (f))g + Y (f)X(g) + X(f)Y (g) + fX(Y (g)) − Y (X(f))g − X(f)Y (g) − Y (f)X(g) − fY (X(g)) =X(Y (f))g − Y (X(f))g + fX(Y (g)) − fY (X(g)) =([X,Y ](f))g + f([X,Y ](g)).  Definition 3.28. The Lie bracket of two vector fields X and Y on a manifold M is the vector field [X,Y ] defined by (3.11). We now quickly recall the definitions of bilinear and skew-symmetric bilinear maps, the point being that Lie bracket will turn out to be a skew-symmetric bilinear map. Definition 3.29. Let V , U and W be three vector spaces over the reals. A map b : V × U → W

is bilinear if it is (R-) linear in each argument: for all u1, u2 ∈ U, c1, c2 ∈ R and all v ∈ V ,

b(v, c1u1 + c2u2) = c1b(v, u1) + c2b(v, u2);

and for all v1, v2 ∈ V , c1, c2 ∈ R and all u ∈ U,

b(c1v1 + c2v2, u) = c1b(v1, u) + c2b(v2, u). 19 Definition 3.30. A bilinear map b : U × U → V is skew-symmetric if

b(u1, u2) = −b(u2, u1) for all u1, u2 ∈ U.

It is easy to see that the Lie bracket on a manifold M is R-bilinear and skew-symmetric. Note that it is not C∞(M)-bilinear: [X, hY ] = X(h)Y + h[X,Y ] for any X,Y ∈ Γ(TM), h ∈ C∞(M) (prove this). Somewhat surprisingly the Lie bracket has a kind of derivation property: Lemma 3.31 (Jacobi identity). For any three vector fields X,Y,Z ∈ Γ(TM) on a manifold M (3.12) [X, [Y,Z]] = [[X,Y ],Z] + [Y, [X,Z]]. Here is how one sees this as a derivation property: for a vector field X ∈ Γ(TM) define

LX : Γ(TM) → Γ(TM) by

LX (Y ) = [X,Y ]. With this definition (3.12) becomes:

LX ([Y,Z]) = [LX (Y ),Z] + [Y,LX (Z)]

Proof of Lemma 3.31. This is another computation that's easier to do yourself than watch someone else doing it. To keep the notation from getting out of hand, we will drop parentheses. Thus XYZf stands for X(Y(Z(f))), etc. We pick a function f ∈ C∞(M) and compute:

([[X,Y],Z] + [Y,[X,Z]])f = [X,Y]Zf − Z[X,Y]f + Y[X,Z]f − [X,Z]Yf
  = XYZf − YXZf − ZXYf + ZYXf + YXZf − YZXf − XZYf + ZXYf
  = XYZf + ZYXf − YZXf − XZYf
  = X(YZf − ZYf) + (ZY − YZ)Xf
  = [X,[Y,Z]]f.

This proves the Jacobi identity. □

Equation (3.12) is called the Jacobi identity and is often written as

[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0

(it is equivalent to (3.12) by the skew-symmetry of [·, ·]).
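The standard coordinate formula [X,Y]^j = Σ_i (X^i ∂Y^j/∂xi − Y^i ∂X^j/∂xi) makes the bracket and the Jacobi identity easy to test numerically. The sketch below is not from the notes; the three fields are arbitrary polynomial examples and derivatives are approximated by central differences.

```python
# The bracket in coordinates: [X,Y]^j = sum_i (X^i dY^j/dx_i - Y^i dX^j/dx_i).
def d(comp, i, a, h=1e-4):
    b = list(a); b[i] += h
    c = list(a); c[i] -= h
    return (comp(*b) - comp(*c)) / (2*h)

def bracket(X, Y):
    def comp(j):
        return lambda *a: sum(X[i](*a)*d(Y[j], i, a) - Y[i](*a)*d(X[j], i, a)
                              for i in range(len(a)))
    return tuple(comp(j) for j in range(len(X)))

# Three sample vector fields on R^2 (polynomial coefficients, chosen arbitrarily).
X = (lambda x, y: y,    lambda x, y: x*x)
Y = (lambda x, y: x*y,  lambda x, y: 1.0)
Z = (lambda x, y: x+y,  lambda x, y: x*y)

# Verify (3.12): [X,[Y,Z]] = [[X,Y],Z] + [Y,[X,Z]] at a sample point.
a = (0.3, 0.7)
lhs = bracket(X, bracket(Y, Z))
t1, t2 = bracket(bracket(X, Y), Z), bracket(Y, bracket(X, Z))
for j in range(2):
    assert abs(lhs[j](*a) - (t1[j](*a) + t2[j](*a))) < 1e-3
```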

Definition 3.32. A (real) Lie algebra is a vector space V over R (perhaps infinite dimensional) together with a map [·, ·]: V × V → V , a Lie bracket, such that (1) [·, ·] is bilinear, (2) [·, ·] is skew-symmetric, and (3) [·, ·] satisfies the Jacobi identity: for all v, u, w ∈ V [u, [v, w]] = [[u, v], w] + [v, [u, w]]. Example 3.33. We have proved that the space of vector fields Γ(TM) on a manifold M forms a Lie algebra.

Example 3.34. R³ with the cross (vector) product is a Lie algebra.

Remark 3.35. The bracket on a Lie algebra can be thought of as a multiplication. Note that it is not associative in general; the Jacobi identity is precisely what takes the place of associativity. The geometric meaning of the Lie brackets of vector fields will be discussed later.

4. Submanifolds and the implicit function theorem

Given a smooth function F : R^m → R^n and a point c ∈ R^n the level set

F−1(c) := {x ∈ R^m | F(x) = c}

may or may not be a smooth manifold. For example, take f(x, y) = x² − y², a smooth function on R². Then f−1(0) is the union of the two lines y = ±x. It is not a manifold. However, for c ≠ 0, f−1(c) is a union of two smooth curves, hence a 1-dimensional manifold. The goal of this section is to describe a sufficient condition for the level sets F−1(c) to be manifolds. We then generalize this to level sets of smooth maps between manifolds. The key technical result that makes it all possible is the inverse function theorem.

4.1. The inverse function theorem and a few of its consequences.

Theorem 4.1 (Inverse function theorem). Let U, U′ ⊂ R^n be open sets and F : U → U′ a smooth map. Suppose for some point a ∈ U the differential

dFa : R^n → R^n

is invertible. Then there are open neighborhoods U0 of a in U and U0′ of F(a) in U′ so that

F : U0 → U0′

is a diffeomorphism.
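A numerical illustration of Theorem 4.1 (not part of the notes; the map and points are our choices): the map F(x, y) = (e^x cos y, e^x sin y) has everywhere invertible differential (its Jacobian determinant is e^{2x}), so Newton's method started nearby recovers the unique local preimage of a given value.

```python
import math

def F(x, y):
    # Jacobian determinant is e^{2x} != 0, so F is locally invertible everywhere.
    return (math.exp(x)*math.cos(y), math.exp(x)*math.sin(y))

def newton_inverse(target, guess, steps=30):
    """Invert F near `guess` by Newton's method, using the explicit Jacobian."""
    x, y = guess
    for _ in range(steps):
        u, v = F(x, y)
        ex = math.exp(x)
        a, b = ex*math.cos(y), -ex*math.sin(y)   # Jacobian entries
        c, d = ex*math.sin(y),  ex*math.cos(y)
        det = a*d - b*c
        rx, ry = target[0] - u, target[1] - v    # residual
        x += ( d*rx - b*ry) / det
        y += (-c*rx + a*ry) / det
    return (x, y)

p = (0.3, 0.4)
q = F(*p)
rec = newton_inverse(q, (0.0, 0.0))   # start near p, recover the local preimage
assert abs(rec[0] - p[0]) < 1e-8 and abs(rec[1] - p[1]) < 1e-8
```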

We will assume this result without proof. It is not essential that U and U 0 are open subsets of Rn — any finite dimensional vector space will do. It is even true with Rn replaced by a Banach space. We now discuss various consequences of the inverse function theorem. The most famous one is the implicit function theorem. But first we prove the manifold version. Proposition 4.2. Let f : N → M be a smooth map of manifolds with f(p) = q (p ∈ N, q ∈ M). Suppose

dfp : TpN → TqM

is an isomorphism (invertible linear map). Then there are neighborhoods U of p in N and V of q in M so that

f|U : U → V is a diffeomorphism (invertible map with a smooth inverse).

Proof. Note first that if φ : U′ → R^n is a coordinate chart on N then for any z ∈ U′ the map dφz : TzN → Tφ(z)R^n is an isomorphism (for instance if φ = (x1, . . . , xn), then dφz(∂/∂xi) = ∂/∂ri).

So let φ : U′ → R^n and ψ : V′ → R^m be two coordinate charts on N and M respectively, with p ∈ U′ and q ∈ V′. Then the diagram

(4.1)
            f
     U′ ---------→ V′
     |             |
    φ|             |ψ
     ↓             ↓
   φ(U′) -------→ ψ(V′)
        ψ◦f◦φ−1

commutes: ψ ◦ f = (ψ ◦ f ◦ φ−1) ◦ φ. Hence the diagram of differentials

(4.2)
              dfp
     TpN ----------→ TqM
      |               |
   dφp|               |dψq
      ↓               ↓
  Tφ(p)φ(U′) ---→ Tψ(q)ψ(V′)
        d(ψ◦f◦φ−1)φ(p)

commutes as well. By the inverse function theorem, there are neighborhoods Ū of φ(p) and V̄ of ψ(q) so that

(ψ ◦ f ◦ φ−1)|Ū : Ū → V̄

is a diffeomorphism. Consequently,

f : φ−1(Ū) → ψ−1(V̄)

is a diffeomorphism. □

Next we turn to the implicit function theorem, the vector space version.

Theorem 4.3 (Implicit function theorem). Let F : R^n × R^k → R^k be a smooth map, (a, b) ∈ R^n × R^k a point and c = F(a, b). Suppose that the restriction of the differential

dF(a,b)|{0}×R^k : {0} × R^k → R^k

is onto. Then there are neighborhoods U of a in R^n and W of (a, b) in R^n × R^k, and a smooth map g : U → R^k with g(a) = b, such that

F−1(c) ∩ W = graph{g : U → R^k}.

That is, for (x, y) ∈ W,

F(x, y) = c ⇔ y = g(x).

In other words the function g is implicitly defined by the equation F(x, g(x)) = c.

Proof. We write suggestively ∂F/∂x(a, b) for the restriction dF(a,b)|R^n×{0} and ∂F/∂y(a, b) for dF(a,b)|{0}×R^k. Consider the smooth map H : R^n × R^k → R^n × R^k defined by

H(x, y) = (x, F(x, y)) for all (x, y) ∈ R^n × R^k.

Then the differential of H at (a, b) is of the form

dH(a,b) = [ I             0            ]
          [ ∂F/∂x(a, b)   ∂F/∂y(a, b)  ],

where I : R^n → R^n is the identity map. By assumption ∂F/∂y(a, b) is invertible. Hence dH(a,b) is invertible. By the inverse function theorem the map H is invertible on a neighborhood of (a, b). Let G(u, v) = (G1(u, v), G2(u, v)) denote its inverse, which is defined on a neighborhood of H(a, b) = (a, F(a, b)) = (a, c). We may take this neighborhood to be of the form U × V, with U ⊂ R^n and V ⊂ R^k open. Let W = G(U × V). Then

(u, v) = H(G(u, v)) = (G1(u, v), F(G1(u, v), G2(u, v)))

for all (u, v) ∈ U × V . Hence G1(u, v) = u. Therefore

F (u, G2(u, v)) = v for all (u, v) ∈ U × V . Conversely, if for any (x, y) ∈ W we have F (x, y) = v then

(x, y) = G(H(x, y)) = G(x, F (x, y)) = G(x, v) = (G1(x, v),G2(x, v))

and therefore y = G2(x, v).

Define the function g : U → R^k by g(x) = G2(x, c). It is a smooth function and, by the above discussion, F(x, y) = c ⇔ y = g(x) for any (x, y) ∈ W. □

Remark 4.4. Here is a slightly different and ultimately more useful way to look at what we have proved. The argument above shows that there is a diffeomorphism H : W → U × V mapping the set

{F = c} ∩ W := {(x, y) ∈ W | F(x, y) = c}

bijectively onto the set

H(W) ∩ (R^n × {c}).

This motivates the following definition.

Definition 4.5 (Submanifold). Let M be an m-dimensional manifold. A subset N ⊂ M is an n-dimensional embedded submanifold if for every point q ∈ N there is a coordinate chart φ = (x1, . . . , xm) : U → R^m with q ∈ U such that

φ(U ∩ N) = φ(U) ∩ (R^n × {0}).

That is, for all a ∈ N ∩ U,

φ(a) = (x1(a), . . . , xn(a), 0, . . . , 0).

Such charts are said to be adapted to N.
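The implicit function theorem can be illustrated numerically with F(x, y) = x² + y², c = 1 and base point (a, b) = (0.6, 0.8), where ∂F/∂y(a, b) = 1.6 ≠ 0. The sketch below (not part of the notes) produces the implicit function g by solving F(x, y) = 1 for y with Newton's method and checks that it agrees with the expected branch √(1 − x²).

```python
import math

# F(x, y) = x^2 + y^2, c = 1; dF/dy = 2y != 0 near (0.6, 0.8),
# so the theorem provides y = g(x) locally.
F = lambda x, y: x*x + y*y

def g(x, y0=0.8, steps=50):
    """Solve F(x, y) = 1 for y by Newton's method, starting near b = 0.8."""
    y = y0
    for _ in range(steps):
        y -= (F(x, y) - 1.0) / (2.0*y)
    return y

for x in [0.5, 0.55, 0.6, 0.65, 0.7]:
    assert abs(g(x) - math.sqrt(1 - x*x)) < 1e-10   # g is the upper branch
    assert abs(F(x, g(x)) - 1.0) < 1e-12            # F(x, g(x)) = c
```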

Example 4.6. The sphere S² is an embedded submanifold of R³. For example, if (x1, x2, x3) ∈ S² and x3 > 0, then

φ(x1, x2, x3) = (x1, x2, x3 − √(1 − x1² − x2²))

is a chart adapted to S² (and there are 5 more charts like this). Thus the implicit function theorem says that, under certain conditions, portions of a level set of a map F : R^n × R^k → R^k are embedded submanifolds.

Naturally, embedded submanifolds are manifolds in their own right.

Lemma 4.7. If N ⊂ M is an n-dimensional embedded submanifold of an m-dimensional manifold M then it is naturally an n-dimensional manifold in its own right, and the inclusion map ι : N ↪ M, ι(a) = a, is smooth.

Proof. We make N into a topological space by giving it the subspace topology. If φ : U → R^m is a chart on M adapted to N, then p ◦ φ|N∩U : N ∩ U → φ(U) ∩ R^n is a homeomorphism. Here p : R^m → R^n is the projection p(x1, . . . , xn, . . . , xm) = (x1, . . . , xn). If ψ : V → R^m is another chart adapted to N, then ψ ◦ φ−1 : φ(U ∩ V) → ψ(U ∩ V) maps φ(U ∩ V) ∩ (R^n × {0}) diffeomorphically to ψ(U ∩ V) ∩ (R^n × {0}). Hence if {φα : Uα → R^m} is a collection of charts on M adapted to N with M = ∪ Uα, then {p ◦ φα|Uα∩N : Uα ∩ N → R^n} is an atlas on N. Checking that the inclusion map ι is smooth is easy: in coordinates it is the inclusion R^n → R^m, (r1, . . . , rn) ↦ (r1, . . . , rn, 0, . . . , 0). □

We now generalize the implicit function theorem.

Proposition 4.8. Let F : R^m → R^k be a smooth map and c ∈ F(R^m) ⊂ R^k a point. Suppose that for all points q ∈ F−1(c) the differential

dFq : R^m → R^k

is onto. Then the level set F−1(c) is a submanifold of R^m and (if F−1(c) is nonempty)

dim F−1(c) = dim R^m − dim R^k.

Proof. Fix a point q ∈ F−1(c). Let Z = ker dFq and let X ⊂ R^m be a vector space complement to Z, so that

R^m = Z ⊕ X ≅ Z × X.

We can thus think of a point p ∈ R^m as a pair (z, x) ∈ Z × X. By the assumption on dFq and by the construction of X, the restriction

dFq|X : X → R^k

is an isomorphism of vector spaces. We now proceed as in the proof of the implicit function theorem. Consider

H : Z × X → Z × R^k,   H(z, x) = (z, F(z, x)).

Write ∂F/∂z for dF|Z and ∂F/∂x for dF|X. Then the differential of H is of the form

dH(z,x) = [ I       0     ]
          [ ∂F/∂z   ∂F/∂x ].

By construction ∂F/∂x(q) : X → R^k is a bijection. Hence dHq is a bijection. By the inverse function theorem there exist neighborhoods W of q in R^m and U × V of H(q) in Z × R^k so that H : W → U × V is a diffeomorphism. Moreover, as in the proof of the implicit function theorem, H maps {F = c} ∩ W bijectively to (U × V) ∩ (Z × {c}). Therefore F−1(c) = {F = c} is a submanifold of R^m of dimension

dim Z = dim R^m − dim R^k. □

Example 4.9. Consider F : R^n → R, F(x) = Σ xi². Then dFx = (2x1, . . . , 2xn). Hence dFx is surjective for all nonzero x. In particular F−1(1) = {x ∈ R^n | Σ xi² = 1} is a submanifold of R^n of dimension n − 1. This is, of course, the standard sphere of radius 1.

Definition 4.10 (Regular value). Suppose f : M → N is a smooth map of manifolds. A point c ∈ N is a regular value of f if for all x ∈ f−1(c) the differential

dfx : TxM → TcN

is surjective.

The previous proposition then simply states that non-empty preimages of regular values of a map F : R^m → R^k are submanifolds of R^m.

Remark 4.11. Note that if f−1(c) = ∅, then c is a regular value of f. It seems silly to construct a definition this way. The reason for the peculiar phrasing is that it makes it easier to state Sard's theorem.

Theorem 4.12 (Sard's Theorem). Let f : M → N be a smooth map. Then the set of regular values of f is dense in N (and in fact its complement has measure 0).

Note that if F : M → N maps everything to one point {c} then c is not a regular value (the differential of F is 0 everywhere), but N ∖ {c} does consist of regular values. So Sard's theorem does hold for constant maps; it is just that the preimage of every regular value of a constant map is empty. It will take us too far afield to prove Sard's theorem, so we won't do it. On the other hand, Proposition 4.8 nicely generalizes to manifolds:

Theorem 4.13. If c is a regular value of a smooth map of manifolds f : M → N and if f−1(c) ≠ ∅, then the level set f−1(c) is an embedded submanifold of M of dimension

dim f−1(c) = dim(M) − dim(N).

Before we proceed with the proof of Theorem 4.13, we make two observations.

1. Let {φα : Uα → R^m} be an atlas on a manifold M. Suppose for some index β there is a diffeomorphism σ : φβ(Uβ) → W ⊂ R^m (W is some open set). Then

(i) σ ◦ φβ : Uβ → R^m is a chart on M, and

(ii) this chart is compatible with the atlas {φα : Uα → R^m} we started out with.

This implies that

2. If Z is a submanifold of a manifold M and H : M → M′ is a diffeomorphism, then H(Z) is a submanifold of M′.

Proof of Theorem 4.13. It is enough to show that for every point a of f−1(c) there is a neighborhood U of a such that U ∩ f−1(c) is a submanifold of U of dimension m − n. Let a ∈ f−1(c) be a point. Let φ : U → R^m be a chart on M with a ∈ U and ψ : V → R^n a chart on N with c ∈ V. Then ψ ◦ f ◦ φ−1 : φ(U) → ψ(V) is a smooth map. Moreover, by the chain rule,

d(ψ ◦ f ◦ φ−1)φ(a) = dψc ◦ dfa ◦ d(φ−1)φ(a).

Since dψc and d(φ−1)φ(a) are isomorphisms and dfa is onto for any a ∈ f−1(c) by assumption, d(ψ ◦ f ◦ φ−1)φ(a) : Tφ(a)R^m → Tψ(c)R^n is onto for any a ∈ f−1(c) ∩ U. By Proposition 4.8, (ψ ◦ f ◦ φ−1)−1(ψ(c)) = φ(U ∩ f−1(c)) is a submanifold of φ(U) of dimension m − n. Therefore U ∩ f−1(c) is a submanifold of U ⊂ M of dimension m − n. Since a is arbitrary, f−1(c) is a submanifold of all of M of the desired dimension. □

The next statement describes the tangent spaces of a regular level set f−1(c).

Corollary 4.13.1. Suppose that c is a regular value of f : M → N and f−1(c) ≠ ∅. Then for all a ∈ f−1(c),

Ta f−1(c) = ker(dfa).

Proof. Since dim Ta f−1(c) = dim f−1(c) = dim M − dim N = dim ker dfa, it is enough to prove that Ta f−1(c) ⊂ ker dfa. Let v ∈ Ta f−1(c) be a vector. By Exercise 3.4 there is a curve γ : I → f−1(c) (where I is an interval containing 0) such that γ(0) = a and dγ0(d/dt) = v. Since f ◦ γ is a constant map, d(f ◦ γ)0 = 0. By the chain rule, 0 = d(f ◦ γ)0(d/dt) = dfγ(0)(dγ0(d/dt)) = dfa(v). Therefore Ta f−1(c) ⊂ ker dfa and we are done. □

Example 4.14. Let f : R^n → R be given by f(x) = Σ xi². Then, as we have seen before, 1 is a regular value of f and dfx = (2x1, . . . , 2xn) for all x ∈ R^n. Therefore, for any x ∈ f−1(1) = S^{n−1} the tangent space TxS^{n−1} is naturally isomorphic to ker{v ↦ Σ 2xi vi}, which is the (n − 1)-dimensional hyperplane in R^n ≅ TxR^n orthogonal to the vector x. □

Exercise 4.1.
Show that O(n), the set of all n × n orthogonal matrices, is a submanifold of GL(n, R). Hint: Consider the map f : GL(n, R) → Sym(n, R) given by A 7→ AAT . Show that the identity matrix I is a regular value of f. 4.2. Transversality. We now have enough tools to do a bit of differential topology. Definition 4.15 (Transversality). A smooth map F : M → N of manifolds is transverse to a submanifold Z of N if for every z ∈ Z and any m ∈ F −1(z), we have

TzZ + dFm(TmM) = TzN (not necessarily as a direct sum!).

Notation. We write F ⋔ Z if a map F is transverse to a submanifold Z.

Example 4.16.

Let N = R², M = R³, Z = S² ⊂ M and let f : N → M be given by f(x1, x2) = (x1, x2, 0). Then f ⋔ S².

Remark 4.17. A map F : M → N is transverse to a submanifold Z consisting of one point c if and only if c is a regular value of F.
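Remark 4.17 and Corollary 4.13.1 can be illustrated numerically for F(x) = x1² + x2² + x3² and the one-point submanifold {1}. The sketch below is not from the notes (the sample point is random): it checks that dFx is nonzero on the level set and that the velocity of a curve in the level set lies in ker dFx.

```python
import math, random

# F(x) = x1^2 + x2^2 + x3^2 has dF_x = (2x1, 2x2, 2x3), which is onto R at
# every point of F^{-1}(1); a great-circle curve through x stays in the
# level set, so its velocity must lie in ker dF_x (orthogonal to x).
random.seed(1)
x = [random.gauss(0, 1) for _ in range(3)]
n = math.sqrt(sum(t*t for t in x))
x = [t / n for t in x]                      # a point of F^{-1}(1)

grad = [2*t for t in x]                     # dF_x in the standard basis
assert max(abs(g) for g in grad) > 1.0      # dF_x is onto R

w = [random.gauss(0, 1) for _ in range(3)]
dot = sum(a*b for a, b in zip(w, x))
w = [a - dot*b for a, b in zip(w, x)]       # project w onto the x-perp plane

F = lambda p: sum(t*t for t in p)
gamma = lambda t: [math.cos(t)*a + math.sin(t)*b for a, b in zip(x, w)]

h = 1e-6
dF_along = (F(gamma(h)) - F(gamma(-h))) / (2*h)   # dF_x(gamma'(0))
assert abs(dF_along) < 1e-6                        # gamma'(0) lies in ker dF_x
assert abs(sum(2*a*b for a, b in zip(x, w))) < 1e-9
```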

Example 4.18. Take M = N = R2. Consider F : M → N given by F (x, y) = (x, x2). Then F is transverse to {0} × R, but it is not transverse to R × {0}.  Theorem 4.19. If a smooth map F : M → N of manifolds is transverse to a submanifold Z of N, then F −1(Z) is a submanifold of M. Moreover, −1 −1 Ta(F (Z)) = (dFa) (TF (a)Z), for all a ∈ F −1(Z), and dim(M) − dim(F −1(Z)) = dim(N) − dim(Z).

Proof. We first consider a special case: assume that N = R^n and Z = R^k × {0} ⊂ R^k × R^{n−k} = R^n. Let π : R^k × R^{n−k} → R^{n−k} denote the canonical projection. Then

π^{-1}(0) = R^k × {0} = Z,

hence (π ∘ F)^{-1}(0) = F^{-1}(Z). Additionally, for all a ∈ F^{-1}(Z),

d(π ∘ F)_a(T_aM) = dπ_{F(a)}(dF_a(T_aM)) = dπ_{F(a)}(dF_a(T_aM) + T_{F(a)}Z) = dπ_{F(a)}(T_{F(a)}R^n) = R^{n−k},

where for the second equality we used the fact that dπ_{F(a)}(T_{F(a)}Z) = 0. Therefore 0 is a regular value of π ∘ F, and consequently (π ∘ F)^{-1}(0) = F^{-1}(Z) is a submanifold of M. Moreover,

T_aF^{-1}(Z) = T_a(π ∘ F)^{-1}(0) = ker d(π ∘ F)_a = ker(dπ_{F(a)} ∘ dF_a) = (dF_a)^{-1}(ker dπ_{F(a)}) = (dF_a)^{-1}(T_{F(a)}Z).

Finally, since d(π ∘ F)_a is surjective,

dim F^{-1}(Z) = dim ker d(π ∘ F)_a = dim M − dim R^{n−k}.

Therefore

dim M − dim F^{-1}(Z) = dim M − (dim M − dim R^{n−k}) = n − k = dim N − dim Z.

What about the general case? Since Z is an embedded submanifold, for every z ∈ Z there is a coordinate chart ψ = (x₁, ..., x_n) : V → R^n on N adapted to Z with z ∈ V; hence ψ(V ∩ Z) = ψ(V) ∩ (R^k × {0}). Now apply the previous argument to ψ ∘ F : F^{-1}(V) → R^n and ψ(V) ∩ (R^k × {0}). □

Example 4.20. Consider two surfaces S₁ and S₂ in R³ such that T_xS₁ ≠ T_xS₂ for every x ∈ S₁ ∩ S₂. Then

T_xS₁ + T_xS₂ = R³ for all x ∈ S₁ ∩ S₂.

Let F : S₁ ↪ R³ be the inclusion map. Then dF_x(T_xS₁) = T_xS₁. Thus F is transverse to S₂. By the theorem above, F^{-1}(S₂) = S₁ ∩ S₂ is a submanifold of S₁ of dimension 1. In other words, if two surfaces are nowhere tangent then they intersect in a collection of curves.

4.3. Embeddings, Immersions, and Rank.

Definition 4.21 (Immersion). A smooth map of manifolds f : Z → M is an immersion if its differential is injective at every point of Z. Immersions need not be injective: consider the map f : S¹ → S¹, f(e^{iθ}) = e^{2iθ}. It is a 2-to-1 map, but its differential at every point is a bijection.

Example 4.22. The inclusion map of a submanifold is a 1-1 immersion.

Definition 4.23 (Submersion).
A map f : M → N is called a submersion if its differential at every point is surjective.

Exercise 4.2. Show that for any manifold M the canonical projection π : TM → M is a submersion — compute in the appropriate coordinates.

Exercise 4.3. Show that if Z ⊂ M is an embedded submanifold, then π^{-1}(Z) ⊂ TM is an embedded submanifold of the tangent bundle TM of M. Here again π : TM → M is the projection. Note that π^{-1}(Z) = ∪_{a∈Z} T_aM. It is often denoted by TM|_Z.

Definition 4.24 (Embedding). A smooth map of manifolds f : Z → M is an embedding if f(Z) ⊂ M is an embedded submanifold and f : Z → f(Z) is a diffeomorphism. This says, in particular, that every embedding is a 1-1 immersion. The converse is not true.

Example 4.25. Let Z be an interval and consider a map f that sends it to a figure 8 as in the picture. Then f : Z → R² is a 1-1 immersion which is not an embedding: the topology on f(Z) as a subspace of R² is coarser than the topology on f(Z) that makes f : Z → f(Z) a homeomorphism. Or, if you prefer: f^{-1} : f(Z) → Z is not continuous if f(Z) is given the subspace topology.

Example 4.26. Consider the map f : R → S¹ × S¹ given by

f(t) = (e^{2πit}, e^{2πi√2 t}).

The image of f is dense in S¹ × S¹. Hence f is a 1-1 immersion which is not an embedding. □

Definition 4.27 (Rank). The rank of a smooth map f : M → N of manifolds at a point a ∈ M is the rank of the linear map df_a : T_aM → T_{f(a)}N.

Proposition 4.28. If f : M → N is smooth and rank(f) = k at some point a ∈ M, then for all a′ sufficiently close to a, the rank of f at a′ is at least k.

Proof. The rank of f at a is the rank of the matrix (∂(y_i ∘ f)/∂x_j (a)), where (x₁, ..., x_m) are coordinates on M near a and (y₁, ..., y_n) are coordinates on N near f(a). By a suitable permutation of coordinates we may assume that det(∂(y_i ∘ f)/∂x_j (a))_{i,j≤k} ≠ 0. Since the determinant is a continuous function of the matrix entries, this determinant is also non-zero for points sufficiently close to a. □

The following theorem, which is a generalization of the implicit function theorem, applies in particular to immersions, but we state the more general version.

Theorem 4.29 (Rank Theorem). Suppose that a smooth map f : M → N has rank k at all points a ∈ M. Then for any point a ∈ M there are a coordinate chart φ : U → R^m on M about a and a chart ψ : V → R^n on N about f(a) such that

(ψ ∘ f ∘ φ^{-1})(r₁, ..., r_m) = (r₁, ..., r_k, 0, ..., 0).

We will not prove this theorem since we don't have the time.

Exercise 4.4. Define f : R³ → R⁶ by f(x, y, z) = (x², y², z², yz, zx, xy). Is f an immersion? Show that the restriction of f to S² is an immersion of S² into R⁶.

Exercise 4.5. Show that there is no immersion f : S² → R².
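A symbolic rank computation in the spirit of Exercise 4.4 (an illustration of mine, not a full solution): the Jacobian of f(x, y, z) = (x², y², z², yz, zx, xy) has maximal rank away from the origin but vanishes at 0, so f itself is not an immersion of R³.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = sp.Matrix([x**2, y**2, z**2, y*z, z*x, x*y])
J = f.jacobian([x, y, z])

# At the origin the differential vanishes, so f is not an immersion of R^3.
assert J.subs({x: 0, y: 0, z: 0}).rank() == 0

# At a point of the unit sphere, e.g. (1, 0, 0), the rank is maximal (3).
assert J.subs({x: 1, y: 0, z: 0}).rank() == 3
```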

Exercise 4.6. (a) Let N be a manifold. Prove that the diagonal ∆_N = {(n, n) ∈ N × N : n ∈ N} is an embedded submanifold of N × N. (b) Let f : M → N and g : L → N be smooth maps such that, for all m ∈ M and l ∈ L with f(m) = g(l), we have

df_m(T_mM) + dg_l(T_lL) = T_rN,  where r = f(m) = g(l).

Show that Z = {(m, l) ∈ M × L : f(m) = g(l)} is a submanifold of M × L.

Exercise 4.7. Let f : R^n → R^n be a smooth map such that for every x with ||x|| ≥ 2 we have ||f(x)|| < 1/||x||. Show that (a) ||f|| attains its maximum value at a point of R^n; (b) f is not an immersion.

Exercise 4.8. Let N be a closed embedded submanifold of M. Show that every vector field X on N can be extended to a vector field Y on M. Hint: first extend the vector field in adapted coordinates; then use a partition of unity to combine the locally defined extensions into a global vector field.

Exercise 4.9. Consider f(x, y) = y² + x⁶/6 − x²/2 on R². For each c ∈ R, determine whether or not f^{-1}(c) is a submanifold of R². Justify your answer.

5. Vector fields and flows

5.1. Definitions, examples, correspondence between vector fields and flows. We start with a few words about notation. In this section I and J will stand for open connected subsets of the reals containing the origin, such as an open interval (a, b), half-infinite intervals (−∞, b) and (a, +∞), or the whole of R (of course a < 0 < b). Recall next that given a curve γ : I → M in a manifold M, the tangent vector γ̇(t) to the curve at γ(t) is

γ̇(t) := dγ_t(d/dt).

As you have proved in the homework, if γ(t) = (γ₁(t), ..., γ_n(t)) is a curve in R^n, then γ̇(t) is the vector (γ₁′(t), ..., γ_n′(t)), where ′ denotes the ordinary derivative. Next, an important definition:

Definition 5.1. A curve γ : I → M is an integral curve of a vector field X on a manifold M through the point q if

γ̇(t) = X_{γ(t)} for all t ∈ I,  and  γ(0) = q.

In other words, the tangent vector to the curve γ at t is the value of the vector field X at γ(t).

We are now in a position to summarize the goals of this subsection. We will see that vector fields are the geometric version of ordinary differential equations (ODEs) and integral curves are the geometric version of the solutions of ODEs. Using this connection with ODEs we will show that integral curves of vector fields exist and that on Hausdorff manifolds integral curves are unique. We will then assume that all our manifolds are Hausdorff. With this assumption we will show that all integral curves of a given vector field can be put together to form a flow. Moreover, there is a bijective correspondence between vector fields and flows. We will then use flows to give the Lie bracket a geometric meaning.
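Definition 5.1 is an initial value problem in geometric disguise, so integral curves can be computed numerically. Here is a small sketch (my own illustration, not part of the notes) integrating the rotation field X = −y ∂/∂x + x ∂/∂y on R² with a classical Runge–Kutta step; the exact integral curve through (1, 0) is γ(t) = (cos t, sin t).

```python
import math

def X(p):
    # the vector field X = -y d/dx + x d/dy, written in coordinates
    x, y = p
    return (-y, x)

def rk4_step(f, p, h):
    # one classical fourth-order Runge-Kutta step for p' = f(p)
    k1 = f(p)
    k2 = f((p[0] + h/2*k1[0], p[1] + h/2*k1[1]))
    k3 = f((p[0] + h/2*k2[0], p[1] + h/2*k2[1]))
    k4 = f((p[0] + h*k3[0], p[1] + h*k3[1]))
    return (p[0] + h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
            p[1] + h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))

p, h, n = (1.0, 0.0), 0.01, 100      # integrate from t = 0 up to t = 1
for _ in range(n):
    p = rk4_step(X, p, h)

assert abs(p[0] - math.cos(1.0)) < 1e-6
assert abs(p[1] - math.sin(1.0)) < 1e-6
```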

We first interpret the problem of existence of integral curves in coordinates. Let φ = (x₁, ..., x_m) : U → R^m be a coordinate chart on a manifold M. Suppose γ : I → U is an integral curve of a vector field X. Since X is a smooth vector field, there are smooth functions f_i : U → R, 1 ≤ i ≤ m, so that

X_a = Σ_i f_i(a) ∂/∂x_i |_a

for all a ∈ U (of course f_i = dx_i(X)). Similarly,

γ̇(t) = dγ_t(d/dt)
     = Σ_i dx_i(dγ_t(d/dt)) ∂/∂x_i |_{γ(t)}
     = Σ_i d/dt|_t (x_i ∘ γ) ∂/∂x_i |_{γ(t)}
     = Σ_i (x_i ∘ γ)′(t) ∂/∂x_i |_{γ(t)}.

Therefore the equation γ̇(t) = X_{γ(t)} is equivalent to

Σ_i (x_i ∘ γ)′(t) ∂/∂x_i |_{γ(t)} = Σ_i f_i(γ(t)) ∂/∂x_i |_{γ(t)}

for all t ∈ I. Thus γ is an integral curve of X in U if and only if

(x_i ∘ γ)′(t) = f_i(γ(t)),  t ∈ I,  1 ≤ i ≤ m.

This is a system of ordinary differential equations. Conversely, any solution of the above system defines an integral curve of the vector field X inside the open set U. We now quote without proof the appropriate theorem from the theory of ODEs.

Theorem 5.2. Let V ⊂ R^m be an open set and F = (F₁, ..., F_m) : V → R^m a smooth map. For any point q₀ ∈ V there are an open neighborhood V₀ of q₀, an ε > 0 and a smooth map

Φ : (−ε, ε) × V₀ → V

so that for each q ∈ V₀ the curve γ_q(t) := Φ(t, q) is the unique solution of the ODE

(γ_q)_i′(t) = F_i(γ_q(t)),  t ∈ (−ε, ε),  1 ≤ i ≤ m,

subject to the initial condition γ_q(0) = q. The proof uses a contraction mapping principle and is similar to the proof of the inverse function theorem. We will have no time for it.
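The contraction-mapping idea behind Theorem 5.2 can be seen in miniature via Picard iteration (a sketch of mine, not from the notes): for the scalar ODE γ′ = γ, γ(0) = 1 on [0, 1/2], one iterates the integral operator (Pγ)(t) = 1 + ∫₀ᵗ γ(u) du, and the iterates converge to the solution eᵗ.

```python
import math

N = 1000                      # grid points on [0, 0.5]
h = 0.5 / N

gamma = [1.0] * (N + 1)       # initial guess: the constant curve gamma(t) = 1
for _ in range(30):           # Picard iterations converge to exp(t)
    integral, new = 0.0, [1.0]
    for i in range(1, N + 1):
        integral += 0.5 * h * (gamma[i - 1] + gamma[i])   # trapezoid rule
        new.append(1.0 + integral)
    gamma = new

# the iterates have converged (up to quadrature error) to exp(t) at t = 0.5
assert abs(gamma[-1] - math.exp(0.5)) < 1e-4
```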

Corollary 5.2.1. Suppose X is a vector field on a manifold M. For every point q₀ ∈ M there are a neighborhood U of q₀, an ε > 0 and a smooth map Φ : (−ε, ε) × U → M so that for any q ∈ U, γ_q(t) := Φ(t, q) is the unique integral curve of X through q. In particular, if σ : I → U is another integral curve of X with σ(0) = q, then σ(t) = γ_q(t) for all t ∈ I ∩ (−ε, ε).

It is important to note that uniqueness of integral curves does depend on the fact that we keep track of the initial conditions.

Lemma 5.3. If γ : I → M is an integral curve of a vector field X passing through p, then for any s ∈ R the curve σ(t) := γ(t + s) is also an integral curve of X. However, at time 0 it passes through q = γ(s).⁴ (The curve σ is defined on I′ = {t ∈ R | t + s ∈ I}.)

Proof. This is an easy application of the chain rule. Here are the gory details. Define the translation τ_s : I′ → I by τ_s(t) = t + s. Then σ = γ ∘ τ_s. Note that d(τ_s)_t : R → R is the identity. Hence

σ̇(t) = dσ_t(d/dt) = d(γ ∘ τ_s)_t(d/dt) = (dγ_{s+t} ∘ d(τ_s)_t)(d/dt) = dγ_{s+t}(d/dt) = γ̇(t + s) = X_{γ(t+s)} = X_{σ(t)}. □

The open set U in Corollary 5.2.1 above lies inside some coordinate chart on M. Therefore the general uniqueness of integral curves of X doesn't quite follow from the corollary. Here is an example where uniqueness fails.

Example 5.4. Consider first the real line R with the constant vector field d/dt. The corresponding differential equation is γ′(t) = 1, whose solutions are curves of the form γ(t) = p + t. Now consider the non-Hausdorff manifold M obtained by gluing two copies of R along R ∖ {0}. More precisely, let M̃ = R × {0, 1}. Define an equivalence relation ∼ by (x, 0) ∼ (x, 1) for all x ≠ 0, and let M = M̃/∼. We write [x, 0] and [x, 1] for the equivalence classes of (x, 0) and (x, 1) respectively. Note that by design [0, 0] ≠ [0, 1]; these are the "two origins" of the "line" M. For x ≠ 0 we have [x, 0] = [x, 1]. Note that M comes with two natural coordinate charts: φ([x, 0]) = x and ψ([x, 1]) = x for all x ∈ R. The change of coordinates φ ∘ ψ^{-1} is defined on all of R ∖ {0} and is the identity map. It follows that the constant vector field d/dt defines a vector field X on M. Moreover, γ(t) = φ^{-1}(t + 1) and σ(t) = ψ^{-1}(t + 1) are integral curves of X with γ(0) = [1, 0] = [1, 1] = σ(0). But γ(t) = σ(t) only for t ≠ −1: at t = −1 we have γ(−1) = [0, 0] ≠ [0, 1] = σ(−1), so uniqueness fails. □

Why do problems like these not occur on Hausdorff manifolds? The key point is: a manifold M is Hausdorff if and only if the diagonal

∆M := {(m, m) ∈ M × M | m ∈ M} is closed in M × M [prove it]. Consequently, if γ : I → M and σ : J → M are two curves, then the set K := {t ∈ I ∩ J | γ(t) = σ(t)},

where the two curves agree, is closed in I ∩ J. Indeed, K is the preimage of ∆_M under the map I ∩ J ∋ t ↦ (γ(t), σ(t)) ∈ M × M. Now suppose additionally that γ and σ are two integral curves of a vector field X ∈ Γ(TM). Then, by Corollary 5.2.1 and Lemma 5.3, the set K is also open in I ∩ J. Since I ∩ J is an interval and K is a nonempty open and closed subset, K has to be all of I ∩ J. This gives us uniqueness: two integral curves of a given vector field passing through a given point at t = 0 agree for all t in the intersection of their domains of definition. Furthermore, it makes sense to take the union of the integral curves γ and σ:

(γ ∪ σ)(t) := { γ(t) if t ∈ I;  σ(t) if t ∈ J }.

Taking the union of all integral curves of a vector field X passing through a given point p, we get a maximal integral curve γ_p : I_p → M of X passing through p. It is maximal in the following sense: if γ : I → M is

⁴ γ(s) is not γ(0) unless γ(t) = γ(0) for all t, in which case γ(t) = σ(t).

any other integral curve of X passing through p, then I ⊂ I_p and γ(t) = γ_p(t) for all t ∈ I. We have proved the following lemma.

Lemma 5.5. Let M be a Hausdorff manifold and X ∈ Γ(TM) a vector field. For any two integral curves γ : I → M and σ : J → M of X passing through the same point at t = 0,

{t ∈ I ∩ J | γ(t) = σ(t)} = I ∩ J.

Consequently, for any point p ∈ M there is a unique maximal integral curve γp of X passing through p. From now on, unless noted otherwise, all manifolds are assumed to be Hausdorff. Example 5.6. An integral curve of a vector field need not be defined for all time. Here is a simple example. d Let M = (−∞, 0) and X = dt . Then the maximal integral curve γp of X passing through p ∈ (−∞, 0) is given by γp(t) = p + t, hence is defined only when p + t < 0, i.e., t < −p.

Corollary 5.2.1 has another important consequence: the maximal integral curve γp of the vector field X depends smoothly on the point p. We can therefore put the maximal integral curves together and obtain a map

(5.1) Φ(t, p) = γp(t) for all t ∈ Ip and all p ∈ M. We have to be a bit careful about the set where the map Φ is defined. It is defined on a subset A of R × M containing {0} × M. Moreover, by Corollary 5.2.1, the subset A is open.

Definition 5.7. We use the notation above: γp denotes the maximal integral curve through the point p of a vector field X on a manifold M. The map

Φ : R × M ⊃ A → M defined by (5.1) is called the (local) flow of the vector field X. The word "local" refers to the fact that the flow Φ need not be defined for all time t, but only for t in some neighborhood of 0 that depends on the point p. If the set A in the definition above is all of R × M, we say that X has a global flow.

Lemma 5.8. Let X be a vector field on a Hausdorff manifold M and let Φ : R × M ⊃ A → M denote its local flow. Then

Φ(t, Φ(s, p)) = Φ(s + t, p)

for all p ∈ M and all s, t ∈ R for which both sides of the equation make sense.

Proof. Fix p ∈ M and s ∈ R. Let γ(t) = Φ(t, Φ(s, p)) and let σ(t) = Φ(s + t, p). Then γ is the maximal integral curve of X passing through Φ(s, p). By Lemma 5.3, σ is also an integral curve of X, and it passes through σ(0) = Φ(s, p). Therefore, since maximal integral curves are unique on Hausdorff manifolds, γ(t) = σ(t) for all t. □

This motivates the following definition.

Definition 5.9 (abstract local flow). A local flow on a manifold M is a map Ψ : A → M, where A is an open subset of R × M containing {0} × M, having the following two properties:

(1) Ψ(0, p) = p for all p ∈ M;
(2) Ψ(t, Ψ(s, p)) = Ψ(s + t, p) whenever both sides make sense.

Example 5.10. Let M = R^n. The map Ψ : R × R^n → R^n, Ψ(t, p) = e^t p, is a flow.

Example 5.11. Let M = R². The map

Ψ(t, (x, y)) = (x cos t + y sin t, −x sin t + y cos t)

is a flow. The example can be described more succinctly in complex coordinates: let M = C and write Ψ(t, z) = e^{−it}z.

It may be a bit hard to see what the meaning of the two conditions of Definition 5.9 really is. It is easier to understand what is going on in the case where Ψ is a global flow, that is, when the domain of definition A of Ψ is all of R × M. Given a global flow Ψ : R × M → M, we have, for each t ∈ R, a map

Ψt : M → M, Ψt(q) := Ψ(t, q).

Condition (1) in the definition of local flow then simply says that Ψ0 is the identity map idM on M. Condition (2) becomes

(5.2) Ψt(Ψs(q)) = Ψt+s(q) for all t, s ∈ R and q ∈ M. Hence Ψt ◦Ψ−t = idM = Ψ−t ◦Ψt. Consequently Ψt : M → M is a diffeomorphism for each t ∈ R. Moreover, we can interpret (5.2) as saying that we have a homomorphism of groups R 3 t 7→ Ψt ∈ Diff(M), where Diff(M) denotes the group of diffeomorphisms of M (it’s a group under composition). This is why global flows are also referred to as 1-parameter groups of diffeomorphisms.
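The group law (5.2) is easy to test numerically; here is a quick check (my own illustration) for the flow of Example 5.10 in one dimension, Ψ(t, p) = eᵗp:

```python
import math, random

def psi(t, p):
    # the flow of Example 5.10 on R: Psi(t, p) = e^t p
    return math.exp(t) * p

assert psi(0.0, 3.5) == 3.5          # condition (1): Psi_0 is the identity

random.seed(0)
for _ in range(100):                 # condition (2): Psi_t(Psi_s(p)) = Psi_{t+s}(p)
    t, s, p = random.uniform(-2, 2), random.uniform(-2, 2), random.uniform(-5, 5)
    assert abs(psi(t, psi(s, p)) - psi(t + s, p)) < 1e-9
```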

Now let’s return to vector fields. The point of much of the preceding discussion is that for a vector field X on (Hausdorff) manifold M the collection of integral curves taken together forms a local flow. The converse is true as well.

Lemma 5.12. Let Ψ : R × M ⊃ A → M be a local flow. Then the map X : C^∞(M) → C^∞(M) defined by

(Xf)(p) = d/dt|_{t=0} f(Ψ(t, p))

for all f ∈ C^∞(M) is a vector field. Moreover, Ψ is the local flow of X.

Proof. Since Ψ(t, p) is a smooth function of t and p, f(Ψ(t, p)) is also a smooth function of t and p, and its derivative d/dt|₀ f(Ψ(t, p)) is a smooth function of p. For any f, g ∈ C^∞(M) we have (fg)(Ψ(t, p)) = f(Ψ(t, p)) g(Ψ(t, p)). Hence X(fg) = X(f)g + fX(g), i.e., X is a vector field. It remains to check that Ψ is the local flow of X. We need to show that for each p ∈ M,

γ_p′(t) = X_{γ_p(t)},

where γ_p(t) := Ψ(t, p). Let f ∈ C^∞(M) be a function. Then

(γ_p′(t)) f = d/ds|_{s=t} f(γ_p(s))
           = d/ds|_{s=t} f(Ψ(s, p))
           = d/ds|_{s=0} f(Ψ(s + t, p))
           = d/ds|_{s=0} f(Ψ(s, Ψ(t, p)))
           = (Xf)(Ψ(t, p)) = X_{Ψ(t,p)} f = X_{γ_p(t)} f. □

Definition 5.13. A vector field is complete if its local flow is a global flow, that is, if each of its integral curves is defined for all t ∈ R.

Here are two examples of vector fields that are not complete.

Example 5.14. The vector field d/dt on (−∞, 0) is not complete.

Example 5.15. The vector field x² d/dx on R is not complete: Φ(t, x) = x/(1 − xt) is its local flow [check it]. The flow is defined for t ∈ (−∞, 1/x) if x > 0 and for t ∈ (1/x, +∞) if x < 0.

It is nice to know when a vector field has a global flow. For this purpose we define:

Definition 5.16. The support of a vector field X on a manifold M is

supp(X) = cl {p ∈ M | X_p ≠ 0},

the closure of the set of points where X is non-zero.
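Returning to Example 5.15, the claim that Φ(t, x) = x/(1 − xt) is the local flow of x² d/dx — the "[check it]" above — can be done symbolically (my own verification sketch):

```python
import sympy as sp

t, s, x = sp.symbols('t s x')
Phi = x / (1 - x*t)                   # the claimed local flow of x^2 d/dx

# flow equation: dPhi/dt = Phi^2, and Phi(0, x) = x
assert sp.simplify(sp.diff(Phi, t) - Phi**2) == 0
assert Phi.subs(t, 0) == x

# the group law Phi(t, Phi(s, x)) = Phi(s + t, x) also holds where defined
composed = Phi.subs(x, Phi.subs(t, s))
assert sp.simplify(composed - Phi.subs(t, s + t)) == 0
```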

Theorem 5.17. A vector field with compact support is complete. In particular any vector field on a compact manifold defines a global flow.

Recall that we are tacitly assuming throughout that all our manifolds are Hausdorff. Also recall that any closed subset of a compact space is compact; hence any vector field on a compact manifold has compact support. There is more than one way to prove the theorem above. For our proof we will need the following lemma.

Lemma 5.18. Let X ∈ Γ(TM) be a vector field with flow Φ : R × M ⊃ A → M. Suppose that {τ} × M ⊂ A, that is, the flow of X is defined for time τ at all points of M. Then

d(Φ_τ)_m(X_m) = X_{Φ_τ(m)}

for all m ∈ M. Here, as before, Φ_τ(m) := Φ(τ, m).

Proof. Let γ_m(t) be the maximal integral curve of X through m: γ_m(t) = Φ(t, m). Then X_m = γ_m′(0). Also,

Φ_τ(γ_m(t)) = Φ(τ, Φ(t, m)) = Φ(τ + t, m) = Φ(t, Φ(τ, m)) = γ_{Φ_τ(m)}(t).

Hence

d(Φ_τ)_m(X_m) = d(Φ_τ)_m(γ_m′(0)) = (d(Φ_τ)_m ∘ d(γ_m)_0)(d/dt) = d(Φ_τ ∘ γ_m)_0(d/dt) = (Φ_τ ∘ γ_m)′(0) = (γ_{Φ_τ(m)})′(0) = X_{γ_{Φ_τ(m)}(0)} = X_{Φ_τ(m)}. □
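Lemma 5.18 can be tested on a linear example (my own illustration): for X_p = Ap the flow is Φ_t(p) = e^{tA}p, so d(Φ_τ)_m = e^{τA}, and the lemma asserts e^{τA}(Am) = A(e^{τA}m). For the rotation generator the matrix exponential has a closed form:

```python
import math
import numpy as np

# X_p = A p with A = [[0, -1], [1, 0]] has global flow Phi_t(p) = R(t) p,
# where R(t) is rotation by angle t (so that d(Phi_t)_m = R(t)).
A = np.array([[0.0, -1.0], [1.0, 0.0]])

def R(t):
    return np.array([[math.cos(t), -math.sin(t)],
                     [math.sin(t),  math.cos(t)]])

rng = np.random.default_rng(0)
m = rng.standard_normal(2)
tau = 0.7

lhs = R(tau) @ (A @ m)        # d(Phi_tau)_m (X_m)
rhs = A @ (R(tau) @ m)        # X at the point Phi_tau(m)
assert np.allclose(lhs, rhs)
```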

Proof of Theorem 5.17. We want to show that the domain of definition A of the local flow Φ(t, p) of X is all of R × M. If X_m = 0, then the constant curve γ_m(t) = m is the integral curve of X through m; it is defined for all t. Therefore on M ∖ supp X the flow is defined for all t: R × (M ∖ supp X) ⊂ A. Also {0} × M ⊂ A by the definition of the flow; in particular {0} × supp X ⊂ A. Since supp X is compact and A is open, there is ε > 0 so that [−ε, ε] × supp X ⊂ A. Hence [−ε, ε] × M ⊂ A. We now define Φ̃ : [0, 2ε] × M → M by

Φ̃(t, p) = Φ(ε, Φ(t − ε, p)) = Φ_ε(Φ(t − ε, p)).

Here, as before, Φ_ε(q) = Φ(ε, q). We claim that for any p ∈ M the curve

γ̃_p(t) = { Φ(t, p) if t ∈ [−ε, ε];  Φ̃(t, p) if t ∈ [0, 2ε] }

is an integral curve of X. Indeed, for t ∈ [0, 2ε], by the definition of γ̃_p and Φ̃,

γ̃_p′(t) = d(Φ_ε)(γ_p′(t − ε)) = d(Φ_ε)(X_{Φ(t−ε, p)}) = X_{Φ_ε(Φ(t−ε, p))} = X_{Φ̃(t, p)} = X_{γ̃_p(t)},

where the third equality holds by Lemma 5.18. It follows that the maximal integral curve γ_p of X through p is defined for t ∈ [−ε, 2ε]. Hence [−ε, 2ε] × M ⊂ A. Arguing inductively we get [−kε, nε] × M ⊂ A for all positive integers k and n. Therefore A = R × M and X is complete. □

5.2. The geometry of the Lie bracket. As before, we continue to assume that all manifolds are Hausdorff. Additionally, in this subsection we will pretend that all flows are global; equivalently, that all vector fields are complete. Assuming completeness is not necessary, but carrying out the argument in full generality obscures the main simple ideas.

Definition 5.19. Let X and Y be two vector fields, with Φ denoting the flow of X. The Lie derivative L_X Y of Y with respect to X is the vector field defined by

(L_X Y)_p := lim_{t→0} (1/t)( d(Φ_{−t})_p(Y_{Φ_t(p)}) − Y_p ) = d/dt|_{t=0} d(Φ_{−t})_p(Y_{Φ_t(p)})

for all p ∈ M.

Several remarks are in order. First, it is not entirely clear that the Lie derivative, as defined above, is a smooth vector field; we will prove this shortly. Second, t ↦ d(Φ_{−t})_p(Y_{Φ_t(p)}) is a curve in the finite dimensional vector space T_pM. We have seen that for R^n we can always canonically identify T_pR^n with R^n. A moment of reflection should convince you that the same identification works for any finite dimensional vector space. Hence it does make sense to think of the Lie derivative (L_X Y)_p as a vector in the tangent space T_pM. Finally, note that if γ : I → T_pM is any smooth curve, then

(5.3)   ( d/dt|_{t=0} γ(t) ) f = d/dt|_{t=0} (γ(t)f)   for any f ∈ C^∞(M).

We will need this equation in the proof of Theorem 5.20 below. There are many ways to prove (5.3). For example, pick a basis of T_pM and compute both sides of (5.3) in the coordinates defined by the basis. What makes the proof work is the fact that partial derivatives commute. With these preliminaries out of the way, we are ready to state the main result of the subsection.

Theorem 5.20. The Lie derivative is a Lie bracket. That is, for any two vector fields X and Y on a manifold M,

(L_X Y)_p = ([X, Y])_p for all points p ∈ M.

Proof. Denote the flow of Y by Ψ. We evaluate (L_X Y)_p on an arbitrary smooth function f ∈ C^∞(M):

(L_X Y)_p f = ( d/dt|_{t=0} d(Φ_{−t})_p(Y_{Φ_t(p)}) ) f
  = d/dt|_{t=0} ( d(Φ_{−t})_p(Y_{Φ_t(p)}) f )
  = d/dt|_{t=0} ( Y_{Φ_t(p)}(f ∘ Φ_{−t}) )
  = d/dt|_{t=0} ( ∂/∂s|_{s=0} (f ∘ Φ_{−t})(Ψ_s(Φ_t(p))) )
  = ∂²/∂t∂s|_{(0,0)} (f ∘ Φ_{−t} ∘ Ψ_s ∘ Φ_t)(p)
  = d/ds|_{s=0} ( ∂/∂t|_{t=0} (f ∘ Φ_{−t} ∘ Ψ_s ∘ Φ_t)(p) )
  = d/ds|_{s=0} ( ∂/∂t|_{t=0} (f ∘ Φ_{−t})(Ψ_s(Φ_0(p))) + ∂/∂t|_{t=0} (f ∘ Φ_0 ∘ Ψ_s)(Φ_t(p)) )
  = d/ds|_{s=0} ( −X_{Ψ_s(p)}f + ∂/∂t|_{t=0} (f ∘ Ψ_s ∘ Φ_t)(p) )
  = −d/ds|_{s=0} (Xf)(Ψ_s(p)) + d/dt|_{t=0} ( ∂/∂s|_{s=0} (f ∘ Ψ_s)(Φ_t(p)) )
  = −Y(Xf)(p) + d/dt|_{t=0} (Yf)(Φ_t(p))
  = −Y(Xf)(p) + X(Yf)(p) = ([X, Y]_p)f. □

In particular this proves that the Lie derivative L_X Y is a vector field. As a corollary to the above proof we get:

Corollary 5.20.1. Let Φ and Ψ denote the flows of vector fields X and Y respectively. Then for any smooth function f,

(5.4)   ([X, Y]f)(p) = ∂²/∂t∂s|_{(0,0)} (f ∘ Φ_{−t} ∘ Ψ_s ∘ Φ_t)(p).
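Formula (5.4) can be verified symbolically for a concrete pair of fields with known flows (my own example, not from the notes): X = x ∂/∂x has flow Φ_t(x, y) = (eᵗx, y) and Y = x ∂/∂y has flow Ψ_s(x, y) = (x, y + sx); a direct computation gives [X, Y] = x ∂/∂y for this pair.

```python
import sympy as sp

t, s, x, y = sp.symbols('t s x y')
f = sp.sin(x * y)      # a concrete test function

def Phi(t, p):         # flow of X = x d/dx
    return (sp.exp(t) * p[0], p[1])

def Psi(s, p):         # flow of Y = x d/dy
    return (p[0], p[1] + s * p[0])

# right hand side of (5.4): the mixed partial of f ∘ Phi_{-t} ∘ Psi_s ∘ Phi_t at (0, 0)
q = Phi(-t, Psi(s, Phi(t, (x, y))))
mixed = sp.diff(f.subs({x: q[0], y: q[1]}, simultaneous=True), t, s).subs({t: 0, s: 0})

# left hand side: ([X, Y]f)(x, y) with [X, Y] = x d/dy for these two fields
bracket = x * sp.diff(f, y)
assert sp.simplify(mixed - bracket) == 0
```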

Note that if the flows Φt and Ψs commute, that is,

Φt ◦ Ψs = Ψs ◦ Φt for all t and s,

then Φ−t ◦ Ψs ◦ Φt = Ψs ◦ Φ−t ◦ Φt = Ψs. In particular, it’s independent of t. Hence the right hand side of (5.4) is 0. Therefore [X,Y ] = 0. The converse is true as well. Lemma 5.21. Let Φ and Ψ denote the flows on a manifold M of vector fields X and Y respectively. Then

[X, Y] = 0 if and only if Φ_t ∘ Ψ_s = Ψ_s ∘ Φ_t for all t and s.

Proof. We have just proved that if the flows commute, then the Lie bracket vanishes. Now suppose [X, Y] = 0. Our proof will use the following observation: let V and W be two finite dimensional vector spaces, T : V → W a linear map and γ : I → V a smooth curve. Then, since dT = T,

(T ∘ γ)′(t) = T(γ′(t)).

Here, again, we identify γ′(t) with a vector in V, and similarly for (T ∘ γ)′(t). With the preliminaries out of the way, we proceed with the actual proof. Since [X, Y] = 0,

0 = d/dh|_{h=0} d(Φ_{−h})(Y_{Φ_h(p)})

for all points p. Hence, with • denoting the appropriate point,

d/dt|_{t=s} d(Φ_{−t})_•(Y_{Φ_t(p)}) = d/dh|_{h=0} d(Φ_{−(s+h)})_•(Y_{Φ_{s+h}(p)})
  = d/dh|_{h=0} d(Φ_{−s})_•[ d(Φ_{−h})_•(Y_{Φ_h(Φ_s(p))}) ]
  = d(Φ_{−s})_•[ d/dh|_{h=0} d(Φ_{−h})_•(Y_{Φ_h(Φ_s(p))}) ] = 0.

Here, in the last equality, we used the fact that d(Φ_{−s})_• is a linear map between tangent spaces and h ↦ d(Φ_{−h})_•(Y_{Φ_h(Φ_s(p))}) is a curve in the tangent space T_{Φ_s(p)}M. Hence the curve t ↦ d(Φ_{−t})_•(Y_{Φ_t(p)}) is constant. In particular,

d(Φ_{−t})_•(Y_{Φ_t(p)}) = d(Φ_0)_•(Y_{Φ_0(p)}) = Y_p for all t.

Consequently

(5.5)   Y_{Φ_t(p)} = d(Φ_t)_p Y_p for all t.

We use the equation above to argue that σ(s) = Φ_t(Ψ_s(p)) is an integral curve of Y passing through Φ_t(p):

σ′(s) = d/dτ|_{τ=s} [Φ_t(Ψ_τ(p))]
      = d(Φ_t)( d/dτ|_{τ=s} Ψ_τ(p) )
      = d(Φ_t)(Y_{Ψ_s(p)})
      = Y_{(Φ_t∘Ψ_s)(p)}   by (5.5)
      = Y_{σ(s)}.

On the other hand, s ↦ Ψ_s(Φ_t(p)) is also an integral curve of Y passing through Φ_t(p). Therefore the two curves are equal: Φ_t(Ψ_s(p)) = Ψ_s(Φ_t(p)). □

We end the section with a somewhat technical subsection. The point of this subsection will not be apparent for some time.

5.3. Map-related vector fields. Recall that given a smooth map between manifolds f : M → N, for each point p ∈ M we get a map of tangent spaces df_p : T_pM → T_{f(p)}N. Therefore, given a vector field X : M → TM, we get for each p ∈ M a vector df_p(X_p) ∈ T_{f(p)}N. This is not, in general, a vector field on N. If additionally f is a diffeomorphism, we can make it into one: define

X′(q) = df_{f^{-1}(q)} X_{f^{-1}(q)}.

The new vector field X′ is related to the old vector field X by

df ∘ X = X′ ∘ f,

where we think of X, X′ and df as maps X : M → TM, X′ : N → TN and df : TM → TN respectively. In other words, the diagram

        df
   TM ------> TN
    ↑          ↑
    X          X′
    |          |
    M  ------> N
        f

commutes.

Definition 5.22. Let X : M → TM, Y : N → TN be two vector fields and f : M → N a smooth map. The two vector fields X and Y are f-related if df ∘ X = Y ∘ f.

Example 5.23. Let f : R² → R be the projection onto the first factor: f(x, y) = x. Then any vector field of the form X = ∂/∂x + g(x, y) ∂/∂y, where g : R² → R is a smooth function, is f-related to Y = d/dx. The main fact worth remembering about related vector fields is that Lie brackets go to Lie brackets. More precisely,

Lemma 5.24. Let X₁, X₂ : M → TM and Y₁, Y₂ : N → TN be two pairs of vector fields related by a map f : M → N:

df ∘ X_i = Y_i ∘ f,  i = 1, 2.

Then df ∘ [X₁, X₂] = [Y₁, Y₂] ∘ f.

Proof. Note that two vector fields X and Y are f-related (f : M → N) if and only if for any smooth function h ∈ C^∞(N),

Y_{f(p)}(h) = X_p(h ∘ f) for all p ∈ M,

or, more concisely, (Yh) ∘ f = X(h ∘ f). We now compute:

[X₁, X₂](h ∘ f) = X₁(X₂(h ∘ f)) − X₂(X₁(h ∘ f))
               = X₁((Y₂h) ∘ f) − X₂((Y₁h) ∘ f)
               = (Y₁(Y₂h)) ∘ f − (Y₂(Y₁h)) ∘ f
               = ([Y₁, Y₂]h) ∘ f. □

Exercise 5.1. Find the flows of the following vector fields on R²:

X = x₁ ∂/∂x₁ + x₂ ∂/∂x₂   and   Y = x₁ ∂/∂x₂ − x₂ ∂/∂x₁.

Exercise 5.2. Prove that if a vector field X on a manifold M vanishes at a point p, X(p) = 0, then there is an open set W containing p such that the flow of X on W exists for all t ∈ [0, 1].
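Returning to Lemma 5.24, here is a symbolic sanity check (my own example, in the spirit of Example 5.23). With f(x, y) = x, the fields X₁ = ∂/∂x + y ∂/∂y and X₂ = ∂/∂x + x ∂/∂y are both f-related to Y = d/du on R; the lemma then forces [X₁, X₂] to be f-related to [Y, Y] = 0.

```python
import sympy as sp

x, y, u = sp.symbols('x y u')

def X1(h):  # X1 = d/dx + y d/dy, acting on an expression h in x, y
    return sp.diff(h, x) + y * sp.diff(h, y)

def X2(h):  # X2 = d/dx + x d/dy
    return sp.diff(h, x) + x * sp.diff(h, y)

def bracket(A, B, h):
    return A(B(h)) - B(A(h))

h = sp.Function('h')(x, y)
# a direct computation gives [X1, X2] = (1 - x) d/dy
assert sp.simplify(bracket(X1, X2, h) - (1 - x) * sp.diff(h, y)) == 0

# f-relatedness to [Y, Y] = 0 means [X1, X2](g ∘ f) = 0 for any g(u):
g = sp.Function('g')(u)
assert sp.simplify(bracket(X1, X2, g.subs(u, x))) == 0
```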

Exercise 5.3. Let M be a manifold. An isotopy on M is a collection of diffeomorphisms {f_t : M → M}_{t∈(−ε,ε)} such that

(1) f₀ is the identity, and
(2) the map (−ε, ε) × M → M given by (t, m) ↦ f_t(m) is smooth.

A time-dependent vector field {X_t} is a smooth map (−ε, ε) × M → TM of the form (t, m) ↦ (X_t)_m =: X_t(m). An isotopy {f_t} defines a time-dependent vector field {X_t} by

X_s(f_s(m)) = d/dt|_{t=s} f_t(m).

Prove that, conversely, given a time-dependent vector field {X_t} there is an isotopy {f_t} such that the equation above holds. Hint: let X̃(t, m) = (d/dt, X_t(m)); it is a vector field on R × M. The local flow Φ_s(t, m) of X̃ is of the form Φ_s(t, m) = (Φ¹_s(t, m), Φ²_s(t, m)). Show that Φ¹_s(t, m) = s + t.

Exercise 5.4. Consider the time-dependent vector field X_t(m) = t d/dθ on S¹. Compute the corresponding isotopy.

Exercise 5.5. Suppose that M and N are manifolds. If X ∈ Γ(TM) is a vector field, show that X̄ : M × N → T(M × N) ≃ TM × TN given by X̄(m, n) = (X_m, 0) is a well-defined vector field on M × N. Similarly, given Y ∈ Γ(TN) we get Ȳ ∈ Γ(T(M × N)). Show that [X̄, Ȳ] = 0.

Exercise 5.6. Suppose that X and Y are vector fields on M. Compute an expression for [X, Y] in local coordinates.

6. (Multi)linear algebra

The goal of this section is to define tensors, the tensor algebra and the Grassmann (exterior) algebra. We will use these constructions to define tensors and differential forms on manifolds. In this section, unless noted otherwise, all vector spaces are over the reals and are finite dimensional. There are two ways to think about tensors: (1) tensors are multi-linear maps; (2) tensors are elements of a "tensor product" of two or more vector spaces. The first way is more concrete. The second is more abstract but also more powerful.

6.1. Tensor products. We start by reviewing multi-linear maps.

Definition 6.1. Let V₁, ..., V_n and U be vector spaces. A map

f : V₁ × · · · × V_n → U  (n factors),  (v₁, ..., v_n) ↦ f(v₁, ..., v_n),

is multi-linear if for each fixed index i and each fixed (n − 1)-tuple of vectors v₁, ..., v_{i−1}, v_{i+1}, ..., v_n the map

V_i → U,  w ↦ f(v₁, ..., v_{i−1}, w, v_{i+1}, ..., v_n),

is linear. When the number of factors is n, as above, we will also say that f is n-linear.
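Multi-linearity can be probed numerically (a quick illustration of mine): the determinant, viewed as a function of the columns of a matrix, is linear in each column separately.

```python
import numpy as np

rng = np.random.default_rng(3)
a, a2, b, c = (rng.standard_normal(3) for _ in range(4))
s = 1.7

def det(*cols):
    # the determinant as a function of the columns of a 3 x 3 matrix
    return np.linalg.det(np.column_stack(cols))

# linearity in the first slot, with the other two columns held fixed
assert np.isclose(det(a + s * a2, b, c), det(a, b, c) + s * det(a2, b, c))
```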

For example, if we identify R^{n²} ≃ R^n × · · · × R^n (n factors) by thinking of an n × n matrix as an n-tuple of column vectors, then the determinant

det : R^n × · · · × R^n → R,  (v₁, ..., v_n) ↦ det(v₁| ... |v_n),

is an n-linear map. Here is an example of a bilinear map: any inner product on a vector space V,

V × V ∋ (v, w) ↦ v · w ∈ R,

is bilinear. There is no standard notation for the space of n-linear maps from V₁ × · · · × V_n to U. We will denote it by

Mult(V₁ × · · · × V_n, U) = Mult_n(V₁ × · · · × V_n, U)

(the subscript n is to indicate that these are n-linear maps). This space Mult(V₁ × · · · × V_n, U) is a vector space: any linear combination of two n-linear maps is n-linear. We now take a closer look at the space of bilinear maps Mult₂(V × W, U). This case is complicated enough to show what happens with multi-linear maps in general, but simple enough not to bog down in notation.

Lemma 6.2. Let {v_i}, {w_j} and {u_k} denote bases of V, W and U respectively, and {v_i*}, {w_j*} and {u_k*} the corresponding dual bases. Then the maps

φ_{ij}^k : V × W → U,  φ_{ij}^k(v, w) = v_i*(v) w_j*(w) u_k,

are bilinear and form a basis of Mult₂(V × W, U). Hence

dim Mult₂(V × W, U) = dim V · dim W · dim U.

Proof. It is easy to see that the φ_{ij}^k are bilinear. Next, for any b ∈ Mult₂(V × W, U), any v ∈ V and any w ∈ W,

b(v, w) = b( Σ_i v_i*(v) v_i, Σ_j w_j*(w) w_j )
        = Σ_{i,j} v_i*(v) w_j*(w) b(v_i, w_j)
        = Σ_{i,j,k} v_i*(v) w_j*(w) u_k*(b(v_i, w_j)) u_k
        = Σ_{i,j,k} u_k*(b(v_i, w_j)) φ_{ij}^k(v, w).

Hence the maps φ_{ij}^k span Mult₂(V × W, U). Also, the collection of numbers u_k*(b(v_i, w_j)) uniquely determines the bilinear map b; hence the φ_{ij}^k are linearly independent. □

We now turn to the definition of the tensor product V ⊗ W [pronounced "V tensor W"] of two vector spaces V and W. Informally, it consists of finite linear combinations of symbols v ⊗ w, where v ∈ V and w ∈ W. Additionally, these symbols are subject to the following identities:

(v1 + v2) ⊗ w − v1 ⊗ w − v2 ⊗ w = 0

v ⊗ (w1 + w2) − v ⊗ w1 − v ⊗ w2 = 0 α (v ⊗ w) − (αv) ⊗ w = 0 α (v ⊗ w) − v ⊗ (αw) = 0, for all v, v1, v2 ∈ V , w, w1, w2 ∈ W and α ∈ R. These identities simply say that the map ⊗ : V ×W → V ⊗W , (v, w) 7→ v ⊗ w, is a bilinear map. The fact that everything in V ⊗ W is a linear combination of symbols v ⊗ w means that the image of the map ⊗ : V × W → V ⊗ W spans V ⊗ W .5 Here is the formal definition of the tensor product of two vector spaces. Definition 6.3. A tensor product of two finite dimensional vector spaces V and W is a vector space V ⊗ W together with a bilinear map ⊗ : V × W → V ⊗ W ,(v, w) 7→ v ⊗ w6 such that for any bilinear map b : V × W → U there is a unique linear map ¯b : V ⊗ W → U with ¯b(v ⊗ w) = b(v, w). That is, the diagram

   V × W ---b---> U
      |          ↗
      ⊗        b̄
      ↓
   V ⊗ W

commutes. The existence and uniqueness of the linear map b̄ satisfying the above condition is called the universal property of the tensor product.
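A concrete model may help here (an aside of mine, not from the notes): for V = R^m and W = R^n one can realize V ⊗ W as R^{mn}, with v ⊗ w given by the Kronecker product. numpy's np.kron then satisfies the bilinearity identities listed above, and the dimension count dim(V ⊗ W) = mn is visible in the shape.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4
v, v2 = rng.standard_normal(m), rng.standard_normal(m)
w = rng.standard_normal(n)
c = 2.5

assert np.kron(v, w).shape == (m * n,)                        # dim(V ⊗ W) = m * n
# the defining identities hold in this model:
assert np.allclose(np.kron(v + v2, w), np.kron(v, w) + np.kron(v2, w))
assert np.allclose(np.kron(c * v, w), c * np.kron(v, w))
assert np.allclose(np.kron(v, c * w), c * np.kron(v, w))
```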

5But the image of ⊗ is not all of V ⊗ W . The elements in the image are called decomposable tensors. 6The symbol v ⊗ w stands for the value of the map ⊗ on the pair (v, w) 37 This definition is quite abstract. It is not clear that such objects exist and, if they exist, that they are unique. Setting the question of existence and uniqueness of tensor products aside, let’s us sort out the relationship between V ⊗ W and bilinear maps Mult(V × W, U). Recall that Hom(X,Y ) is the space of all linear maps from a vector space X to a vector space Y and is itself a vector space (see p. 13). Lemma 6.4. Assume that V ⊗ W exists. Then Hom(V ⊗ W, U) −→' Mult(V × W, U). Proof. The isomorphism in question is built into the definition of the tensor product. Given a linear map A : V ⊗ W → U the composition A ◦ ⊗ : V × W → U is bilinear. And conversely, given a bilinear map b ∈ Mult(V × W, U) there is a unique linear map ¯b : V ⊗ W → U so that (¯b ◦ ⊗)(v, w) = b(v, w) for all (v, w) ∈ V × W . In other words the maps Hom(V ⊗ W, U) 3 A 7→ A ◦ ⊗ ∈ Mult(V × W, U) and Mult(V × W, U) 3 b 7→ ¯b ∈ Hom(V ⊗ W, U) are inverses of each other.  Next we observed that the uniqueness of the tensor product is also built into the definition of the tensor product. Proposition 6.5. If tensor products exist, they are unique up to isomorphism. Proof. The proof is quite formal and uses nothing but the universal property. Suppose there are two vector spaces V ⊗1W and V ⊗2W with corresponding bilinear maps ⊗1 : V ×W → V ⊗1W and ⊗2 : V ×W → V ⊗2W which satisfy the conditions of the Definition 6.3. We will argue that these vector spaces are isomorphic. By the universal property there exist a unique linear map ⊗1 : V ⊗2 W → V ⊗1 W so that the diagram

    V × W --⊗1--> V ⊗1 W
      ⊗2 ↓        ↗ ⊗̄1
      V ⊗2 W

commutes. By the same argument, switching the roles of ⊗1 and ⊗2, there is a unique linear map ⊗̄2 : V ⊗1 W → V ⊗2 W making the diagram

    V × W --⊗2--> V ⊗2 W
      ⊗1 ↓        ↗ ⊗̄2
      V ⊗1 W

commute. Define

T1 = ⊗̄1 ◦ ⊗̄2 : V ⊗1 W → V ⊗1 W
T2 = ⊗̄2 ◦ ⊗̄1 : V ⊗2 W → V ⊗2 W.

These are linear maps satisfying T1 ◦ ⊗1 = ⊗1 and T2 ◦ ⊗2 = ⊗2, that is, each Ti makes the corresponding diagram

commute. But the identity maps idi : V ⊗i W → V ⊗i W , i = 1, 2, are linear and also make the respective diagrams commute. By uniqueness Ti = idi. Hence ⊗1 and ⊗2 are inverses of each other and provide the desired isomorphisms.  Now we construct the tensor product as a quotient of an infinite dimensional vector space by an infinite dimensional subspace thereby proving its existence. Proposition 6.6. Tensor products exist. 38 Proof. Let V and W be two finite dimensional vector spaces. We want to construct a new vector space V ⊗ W and a bilinear map ⊗ : V × W → V ⊗ W satisfying the conditions of Definition 6.3. We start with a vector space F (V × W ) made of formal finite linear combinations of ordered pairs (v, w), v ∈ V , w ∈ W . Its basis is the set {(v, w) | v ∈ V, w ∈ W } = V × W . If you prefer you can think of F (V × W ) as the set of functions

{f : V × W → R | f(v, w) ≠ 0 for only finitely many pairs (v, w)}.
This set of functions is an infinite dimensional vector space. Its basis consists of the functions that take the value 1 on a given pair (v0, w0) and 0 on all other pairs; it is tempting to call such a function (v0, w0). The vector space F(V × W) is called the free vector space generated by the set V × W. Note that we have an inclusion map ι : V × W → F(V × W), ι(v, w) = (v, w). It is not bilinear since (v1 + v2, w) ≠ (v1, w) + (v2, w) in F(V × W). Consider the smallest subspace K of F(V × W) containing the following collection of vectors:

S = { (v1 + v2, w) − (v1, w) − (v2, w),
      (v, w1 + w2) − (v, w1) − (v, w2),
      c(v, w) − (cv, w),
      c(v, w) − (v, cw)  |  v, v1, v2 ∈ V, w, w1, w2 ∈ W and c ∈ R }.

In other words, consider the subspace K of F(V × W) spanned by the set S. Define V ⊗ W to be the quotient of F(V × W) by K:
V ⊗ W := F(V × W)/K.
Define the map ⊗ : V × W → V ⊗ W to be the composite of the inclusion ι : V × W ↪ F(V × W) and the quotient map F(V × W) → F(V × W)/K. The definition of K is rigged precisely so that this composite is bilinear. We write v ⊗ w for the value of ⊗ on the pair (v, w). By construction the set {v ⊗ w | (v, w) ∈ V × W} spans V ⊗ W [but it's much too big to be a basis]. We check that the map ⊗ : V × W → V ⊗ W has the required universal property. Suppose b : V × W → U is bilinear. Since V × W is a basis of F(V × W), b defines a unique linear map b̃ : F(V × W) → U given on the basis by b̃((v, w)) = b(v, w). As b is bilinear, b̃ is 0 on K by the definition of K. Thus we obtain a linear map b̄ : F(V × W)/K = V ⊗ W → U with b̄(v ⊗ w) = b̃((v, w)) = b(v, w). Since the vectors of the form v ⊗ w span V ⊗ W, b̄ is unique. This verifies the universal property and thereby proves the existence of the tensor product. □
Lemma 6.7. For any vector spaces V and W,
dim(V ⊗ W) = dim V · dim W.
Proof.
dim(V ⊗ W) = dim (V ⊗ W)∗ = dim Hom(V ⊗ W, R) = dim Mult(V × W, R) (by Lemma 6.4) = dim V · dim W · dim R = dim V · dim W. □
We are now in a position to quickly prove a number of results about tensor products.
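The free vector space F(V × W) used in the existence proof can be sketched concretely. The fragment below is an illustration, not from the text; the helper names `add` and `scale` are made up. It stores a finitely supported R-valued function on V × W as a Python dictionary keyed by pairs, and shows that (v1 + v2, w) is a basis vector distinct from (v1, w) + (v2, w); the bilinearity relations only hold after passing to the quotient by K.

```python
from collections import defaultdict

# Sketch: F(V x W) as finitely supported R-valued functions on pairs,
# stored as dicts keyed by hashable stand-ins for (v, w).
def add(f, g):
    h = defaultdict(float, f)
    for key, c in g.items():
        h[key] += c
    return {key: c for key, c in h.items() if c != 0.0}

def scale(a, f):
    return {key: a * c for key, c in f.items() if a * c != 0.0}

# basis elements of F(V x W): indicator functions of single pairs
e1 = {((1, 0), (0, 1)): 1.0}   # the pair (v1, w)
e2 = {((0, 1), (0, 1)): 1.0}   # the pair (v2, w)
s = add(e1, e2)                # (v1, w) + (v2, w) in F(V x W)

# In F(V x W) the pair (v1 + v2, w) is a *different* basis vector; the
# relation (v1 + v2, w) = (v1, w) + (v2, w) only holds in the quotient V (x) W.
assert ((1, 1), (0, 1)) not in s
assert s == {((1, 0), (0, 1)): 1.0, ((0, 1), (0, 1)): 1.0}
assert scale(2.0, e1) == {((1, 0), (0, 1)): 2.0}
```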

Corollary 6.7.1. If {vi} and {wj} are bases of V and W respectively, then {vi ⊗ wj} is a basis of V ⊗ W.

Proof. Since the vectors of the form v ⊗ w, v ∈ V, w ∈ W, span V ⊗ W, the much smaller set {vi ⊗ wj} also spans V ⊗ W.7 But dim(V ⊗ W) = dim V · dim W is precisely the number of elements in the set {vi ⊗ wj}. Hence the set {vi ⊗ wj} is a basis. □
Lemma 6.8. V ⊗ W is isomorphic to W ⊗ V.

7We are using here the fact that for any (v, w) ∈ V × W, the tensor v ⊗ w is a linear combination of the vi ⊗ wj's.
Proof. Consider the map b : W × V → V ⊗ W defined by b(w, v) = v ⊗ w. Since b is bilinear, there is a unique linear map b̄ : W ⊗ V → V ⊗ W with b̄(w ⊗ v) = v ⊗ w. Since the set {v ⊗ w | v ∈ V, w ∈ W} generates V ⊗ W, the map b̄ is surjective. It is an isomorphism by dimension count. □
Lemma 6.9. V∗ ⊗ W is isomorphic to Hom(V, W).
Proof. Consider b : V∗ × W → Hom(V, W) defined by
(b(v∗, w))(v) = v∗(v)w for all v∗ ∈ V∗, v ∈ V, w ∈ W.
Since b is bilinear, it induces a linear map b̄ : V∗ ⊗ W → Hom(V, W) with
(b̄(v∗ ⊗ w))(v) = v∗(v)w for all v∗ ∈ V∗, v ∈ V, w ∈ W.
Observe that linear maps of the form v ↦ v∗(v)w span Hom(V, W) (the proof of this fact is very similar to the proof of Lemma 6.2 and is left as an exercise). Hence b̄ is an isomorphism by dimension count. □
Exercise 6.1. Show that if {vi} is a basis of a vector space V, {vi∗} the dual basis and {wj} a basis of a vector space W, then {vi∗(·)wj} is a basis of Hom(V, W).
Lemma 6.10. If A : V → W and B : V′ → W′ are two linear maps, then there is a unique linear map A ⊗ B : V ⊗ V′ → W ⊗ W′ such that (A ⊗ B)(v ⊗ v′) = A(v) ⊗ B(v′) for all (v, v′) ∈ V × V′.
Proof. Consider b : V × V′ → W ⊗ W′ given by b(v, v′) = Av ⊗ Bv′. The map b is bilinear, whence the universal property gives us a unique linear map b̄ : V ⊗ V′ → W ⊗ W′ with b̄(v ⊗ v′) = Av ⊗ Bv′ for all (v, v′) ∈ V × V′. □

Exercise 6.2. Show that if A : V → W is represented by a matrix (aij) with respect to some bases of V and W, and B : V′ → W′ is represented by a matrix (bkl) with respect to bases of V′ and W′, then A ⊗ B is represented by the matrix (aij bkl) with respect to the appropriate bases.
Exercise 6.3. Show that there is a natural isomorphism φ : V∗ ⊗ W∗ → Mult(V × W, R) with φ(v∗ ⊗ w∗)(v, w) = v∗(v)w∗(w) for all v∗, w∗, v, w. Show that there is a natural isomorphism ψ : V∗ ⊗ W∗ → (V ⊗ W)∗ with ψ(v∗ ⊗ w∗)(v ⊗ w) = v∗(v)w∗(w) for all v∗, w∗, v, w.
Exercise 6.4. Show that the map R × V → V, (a, v) ↦ av, gives rise to an isomorphism R ⊗ V → V which sends a ⊗ v to av for all a ∈ R and v ∈ V.
Exercise 6.5. Show that taking tensor products is associative: V ⊗ (U ⊗ W) ≅ (V ⊗ U) ⊗ W for any three vector spaces V, U and W.
From now on we write V ⊗ U ⊗ W for V ⊗ (U ⊗ W), since the order of taking tensor products doesn't matter. Exercise 6.5 above also allows us to define recursively the tensor powers of a vector space V. We define
V⊗0 := R, V⊗1 := V and V⊗n := V⊗(n−1) ⊗ V for n > 1.
It is not hard to generalize the relationship between bilinear maps and tensor products to the relationship between n-linear maps and n-fold tensor products. For example:
Exercise 6.6. Prove that given an n-linear map f : V × · · · × V → U (n factors), there exists a unique linear map f̄ : V⊗n → U with

f̄(v1 ⊗ · · · ⊗ vn) = f(v1, . . . , vn) for all (v1, . . . , vn) ∈ V × · · · × V.
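In coordinates, the matrix of A ⊗ B is the Kronecker product of the matrices of A and B (this is the content of Exercise 6.2). A numerical sanity check, assuming numpy is available:

```python
import numpy as np

# A : V -> W and B : V' -> W' as matrices; A (x) B acts on V (x) V'
# and is represented by the Kronecker product (Exercise 6.2).
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [5.0, 2.0]])

AB = np.kron(A, B)  # matrix of A (x) B in the basis {v_i (x) w_j}

# A decomposable tensor v (x) v' has the Kronecker product of the
# coordinate vectors as coordinates, and (A (x) B)(v (x) v') = Av (x) Bv'.
v = np.array([1.0, -1.0])
vp = np.array([2.0, 3.0])

lhs = AB @ np.kron(v, vp)
rhs = np.kron(A @ v, B @ vp)
assert np.allclose(lhs, rhs)

# dim(V (x) W) = dim V * dim W (Lemma 6.7): 2x2 matrices give a 4x4 matrix.
assert AB.shape == (4, 4)
```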

Moreover, given a ∈ V⊗n and b ∈ V⊗m, a ⊗ b lies in V⊗n ⊗ V⊗m ≅ V⊗(n+m). This gives us an R-bilinear map
V⊗n × V⊗m → V⊗(n+m), (a, b) ↦ a ⊗ b.
Note that if n = 0 the map above is simply
R × V⊗m → V⊗m, (a, t) ↦ at
(cf. Exercise 6.4).

Definition 6.11. An algebra over R is a vector space A together with a bilinear map A × A → A, (a, a′) ↦ aa′ (“multiplication”). An algebra A is said to be an algebra with unity if there is an element 1 ∈ A such that 1 · a = a for all a ∈ A. An algebra A is associative if the multiplication is associative. Remark 6.12. Note that in any algebra A, 0a = a0 = 0 for all a ∈ A (this is because multiplication is required to be bilinear).

Remark 6.13. If A is an algebra with 1 then there is an injection R → A, x ↦ x1. We will always identify R with its image in A. Example 6.14. A Lie algebra is an algebra. In general it is neither associative nor unital (why not?).

Example 6.15. The space Mn(R) of n × n matrices forms an algebra under matrix multiplication. It is an algebra with unity: the identity matrix I is the unity.
Definition 6.16. An algebra A is graded if
A = ⊕_{i=0}^∞ Ai (direct sum)
and if for any a ∈ Ai and b ∈ Aj we have a · b ∈ Ai+j. We will refer to the elements of Ak as elements of degree k.
Given a vector space V we construct the corresponding tensor algebra T(V) as follows. As a vector space, T(V) is the direct sum:

T(V) = R ⊕ V ⊕ V⊗2 ⊕ · · · ⊕ V⊗n ⊕ · · · = ⊕_{i=0}^∞ V⊗i.

Thus the elements of T(V) are finite sums a_{i1} + a_{i2} + · · · + a_{ik} with a_{ij} ∈ V⊗ij. We define the multiplication on T(V) by extending the multiplication
V⊗n × V⊗m → V⊗(n+m), (a, b) ↦ a ⊗ b,
bilinearly to all of T(V). The tensor algebra T(V) of a vector space V is a graded associative algebra with 1. Note that by construction the elements of T(V) are sums of products of elements of V, that is, T(V) is generated by V.
6.2. The Grassmann (exterior) algebra and alternating maps. We have seen that tensor products are intimately related to multilinear maps. Exterior (Grassmann) algebras are just as intimately related to alternating multilinear maps. Recall that an n-linear map f : V × · · · × V → U is alternating if it changes sign whenever we switch two adjacent entries:

f(v1, . . . , vi, vi+1, . . . , vn) = −f(v1, . . . , vi+1, vi, . . . , vn)

for all (v1, . . . , vn) ∈ V × · · · × V and any index i.
Example 6.17. The determinant
det : Rn × · · · × Rn → R (n factors), (v1, . . . , vn) ↦ det(v1| · · · |vn),
is an alternating map.
Example 6.18. Consider a vector space V and a, b ∈ V∗. Define the bilinear map a ∧ b by

(a ∧ b)(v1, v2) := a(v1)b(v2) − a(v2)b(v1), v1, v2 ∈ V. The map a ∧ b (“a wedge b”) is alternating.

Definition 6.19 (Grassmann (exterior) algebra). Let V be a finite dimensional vector space over R. The Grassmann (exterior) algebra Λ∗(V) is an algebra over R with unity together with an injective linear map i : V → Λ∗(V), called the structure map, which has the following universal property: if A is an algebra over R with unity and j : V → A is a linear map such that j(v) · j(v) = 0 for all v ∈ V, then there is a unique algebra map8 j̄ : Λ∗(V) → A such that j̄ ◦ i = j, that is, the diagram

    V --j--> A
     i ↓   ↗ j̄
    Λ∗(V)

commutes.
Proposition 6.20. If Λ∗(V) exists, it is unique (up to isomorphism).

Proof. This is a formal exercise and is left to the reader. □
Proposition 6.21. For every vector space V the exterior algebra Λ∗(V) exists.
Proof. Let I be the two-sided ideal in the tensor algebra T(V) generated by the set {v ⊗ v : v ∈ V}. Note that R ∩ I = 0 and V ∩ I = 0 for degree reasons. Define
Λ∗(V) := T(V)/I,
the quotient of the tensor algebra by the ideal I. Then Λ∗(V) is an algebra: it inherits the multiplication from T(V). The induced multiplication in Λ∗(V) is denoted by ∧ (“wedge”). Since the tensor algebra is graded, so is I, and
I = (I ∩ V⊗2) ⊕ (I ∩ V⊗3) ⊕ · · ·
Since V ∩ I = 0, the composite i : V → T(V) → T(V)/I = Λ∗(V) is an injection. Note that any element of Λ∗(V) is a finite linear combination of products of elements of V.
Now that we have constructed the exterior algebra Λ∗(V), let us prove the universal property. Suppose that A is an algebra and that we are given a linear map j : V → A with j(v) · j(v) = 0 for all v ∈ V. Consider the map b : V × V → A given by b(v, w) = j(v) · j(w). Since the map b is bilinear, there is a unique linear map j(2) : V ⊗ V → A with j(2)(v ⊗ w) = j(v) · j(w). Similarly, for all positive integers k, we have linear maps j(k) : V⊗k → A with
j(k)(v1 ⊗ · · · ⊗ vk) = j(v1) · · · j(vk).
In addition, we define j(0)(a) = a · 1A for all a ∈ R. In this way, we obtain an algebra map j̃ : T(V) → A. By assumption, j̃(v ⊗ v) = 0 for all v ∈ V. Therefore j̃ vanishes on the ideal I. This implies that j̃ descends to an algebra map j̄ : Λ∗(V) = T(V)/I → A with j̄(v) = j(v) for all v ∈ V. Since an algebra map is uniquely determined on generators, and since V generates Λ∗(V), the map j̄ is unique. □

8A map f : A → B between two algebras is an algebra map if f is linear and preserves multiplication: f(a1a2) = f(a1)f(a2).
Remark 6.22. For any v ∈ V, we have v ∧ v = 0 in the exterior algebra Λ∗(V). Also,

0 = (v1 + v2) ∧ (v1 + v2) = v1 ∧ v1 + v1 ∧ v2 + v2 ∧ v1 + v2 ∧ v2
gives that v1 ∧ v2 = −v2 ∧ v1; that is, the wedge product is skew-commutative.
Remark 6.23. Let Λk(V) := V⊗k/(V⊗k ∩ I). The vector space Λk(V) is called the kth exterior power of V. Then
Λ∗(V) = ⊕_{k=0}^∞ Λk(V),
where Λ0(V) = R and Λ1(V) = V. Also, if α ∈ Λk(V) and β ∈ Λl(V), then α ∧ β ∈ Λk+l(V). Thus Λ∗(V) is a graded algebra with 1.

Remark 6.24. We know that if {v1, . . . , vn} is a basis for V, then {vi ⊗ vj} is a basis for V ⊗ V. By induction, {v_{i1} ⊗ · · · ⊗ v_{ik}} is a basis for V⊗k. Thus {v_{i1} ∧ · · · ∧ v_{ik}} generates Λk(V) = V⊗k/(I ∩ V⊗k). Since ∧ is skew-commutative, however, we can reduce this generating set to a smaller one:

(6.1) {v_{i1} ∧ · · · ∧ v_{ik} | i1 < · · · < ik}.
This implies that Λl(V) = 0 whenever l > dim V. We will see below that the set (6.1) is a basis of Λk(V). We now investigate the connection between the kth exterior power Λk(V) of a vector space V and alternating maps.
Proposition 6.25 (Universal property of the kth exterior power of a vector space). Let U and V be vector spaces. If f : V × · · · × V → U (k factors) is alternating, then there is a unique linear map f̄ : Λk(V) → U with

f̄(v1 ∧ · · · ∧ vk) = f(v1, . . . , vk).
Proof. By the universal property of V⊗k, there is a unique linear map f̃ : V⊗k → U such that f̃(v1 ⊗ · · · ⊗ vk) = f(v1, . . . , vk). Since f is alternating, f̃ vanishes on I ∩ V⊗k, where I is the ideal defined in the construction of Λ∗(V). This gives us the linear map f̄ : Λk(V) = V⊗k/(I ∩ V⊗k) → U with the desired property. □
Corollary 6.25.1. The space of k-linear alternating maps {f : V × · · · × V → U | f is alternating} is isomorphic to the space Hom(Λk(V), U).
Lemma 6.26. Let V be an n-dimensional vector space. Then Λn(V) is 1-dimensional.
Proof. We may assume that V = Rn. Let e1, . . . , en be the standard basis. Then e1 ∧ · · · ∧ en spans Λn(V). We need to show that e1 ∧ · · · ∧ en ≠ 0. The determinant det : Rn × · · · × Rn → R takes the value 1 on the identity matrix I = (e1| · · · |en): det(e1| · · · |en) = 1. Hence the induced linear map det̄ : Λn(Rn) → R takes the value 1 on e1 ∧ · · · ∧ en. Therefore e1 ∧ · · · ∧ en ≠ 0. □

Corollary 6.26.1. If {f1, . . . , fn} is a basis for a vector space V, then {f_{i1} ∧ · · · ∧ f_{ik} : 1 ≤ i1 < · · · < ik ≤ n} is a basis for its kth exterior power Λk(V).
Proof. By Remark 6.24 the above set generates Λk(V), so we only need to check independence. Suppose
0 = Σ_{i1<···<ik} a_{i1,...,ik} f_{i1} ∧ · · · ∧ f_{ik} for some a_{i1,...,ik} ∈ R.
Pick a sequence j1 < j2 < · · · < jk and let jk+1 < · · · < jn be the remaining indices. Then
(Σ a_{i1,...,ik} f_{i1} ∧ · · · ∧ f_{ik}) ∧ f_{jk+1} ∧ · · · ∧ f_{jn} = a_{j1,...,jk} f_{j1} ∧ · · · ∧ f_{jk} ∧ f_{jk+1} ∧ · · · ∧ f_{jn},
since a_{i1,...,ik} f_{i1} ∧ · · · ∧ f_{ik} ∧ f_{jk+1} ∧ · · · ∧ f_{jn} = 0 whenever i_s = j_r for some s, r. Also,
f_{j1} ∧ · · · ∧ f_{jk} ∧ f_{jk+1} ∧ · · · ∧ f_{jn} = ±f1 ∧ · · · ∧ fn ≠ 0.
This gives a_{j1,...,jk} = 0. Hence the set is linearly independent. □
Corollary 6.26.2. For any finite dimensional vector space V,
dim Λk(V) = C(dim V, k) = (dim V)! / (k!(dim V − k)!).
Lemma 6.27. Let A : V → W be a linear map. Then there is a unique linear map Λk(A) : Λk(V) → Λk(W) such that
(Λk(A))(v1 ∧ · · · ∧ vk) = Av1 ∧ · · · ∧ Avk
for all v1, . . . , vk ∈ V.
Proof. Consider the map b : V × · · · × V → Λk(W) given by

b(v1, . . . , vk) = Av1 ∧ · · · ∧ Avk.
Since ∧ is skew-commutative, b is an alternating map. By Proposition 6.25 there exists a unique linear map Λk(A) : Λk(V) → Λk(W) with the required properties. □
Exercise 6.7. Let A : V → W be a linear map as above. Choose bases of V and W and the corresponding bases of Λk(V) and of Λk(W). Show that the entries of the matrix representing Λk(A) are polynomial in the entries of the matrix representing A.
6.3. Pairings.
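Exercise 6.7 can be made concrete: in the bases of Corollary 6.26.1, the matrix of Λk(A) has the k × k minors of the matrix of A as its entries, which are indeed polynomial in the entries of A. Below is a sketch (the helper `exterior_power` is hypothetical, assuming numpy); the check Λ2(A)(v ∧ w) = Av ∧ Aw amounts to the Cauchy–Binet formula.

```python
import itertools
import math
import numpy as np

# Hypothetical helper: the matrix of the k-th exterior power of A in the
# lexicographically ordered bases; its (I, J) entry is the minor det A[I, J].
def exterior_power(A, k):
    m, n = A.shape
    rows = list(itertools.combinations(range(m), k))
    cols = list(itertools.combinations(range(n), k))
    return np.array([[np.linalg.det(A[np.ix_(I, J)]) for J in cols]
                     for I in rows])

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [4.0, 0.0, 1.0]])
L2 = exterior_power(A, 2)

# dim of the 2nd exterior power of R^3 is C(3, 2) = 3 (Corollary 6.26.2)
assert L2.shape == (math.comb(3, 2), math.comb(3, 2))

# functoriality: coordinates of v ^ w are the 2x2 minors of the matrix [v|w]
v = np.array([1.0, 0.0, 2.0])
w = np.array([0.0, 1.0, 1.0])
vw = exterior_power(np.column_stack([v, w]), 2).ravel()
AvAw = exterior_power(np.column_stack([A @ v, A @ w]), 2).ravel()
assert np.allclose(L2 @ vw, AvAw)
```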

Definition 6.28. Let V and W be two vector spaces. A pairing is a bilinear map ⟨·, ·⟩ : V × W → R.
Example 6.29. Let V be a vector space and V∗ its dual. The evaluation map V∗ × V → R, ⟨ℓ, v⟩ = ℓ(v), is a pairing.

Definition 6.30. A pairing ⟨·, ·⟩ : V × W → R is non-degenerate if
⟨v0, w⟩ = 0 for all w ∈ W implies v0 = 0, and
⟨v, w0⟩ = 0 for all v ∈ V implies w0 = 0.
Example 6.31. The evaluation map V∗ × V → R, ⟨ℓ, v⟩ = ℓ(v), is a non-degenerate pairing. In a sense it is the only nondegenerate pairing:

Proposition 6.32. If b : V × W → R is a nondegenerate pairing, then V ≅ W∗ and W ≅ V∗.
Proof. Consider b1♯ : V → W∗ given by
(b1♯(v))(w) = b(v, w).
The map b1♯ is linear, and
ker b1♯ = {v0 ∈ V : b1♯(v0) = 0} = {v0 ∈ V : b(v0, w) = 0 for all w} = {0}.
Thus dim V ≤ dim W∗ = dim W. By the same argument, we have dim W ≤ dim V∗ = dim V. Therefore dim V = dim W. Hence b1♯ is an isomorphism. By the same argument, b2♯ : W → V∗ given by w ↦ b(·, w) is an isomorphism as well. □
Proposition 6.33. There is a nondegenerate pairing
⟨·, ·⟩ : Λk(V∗) × Λk(V) → R with ⟨v1∗ ∧ · · · ∧ vk∗, v1 ∧ · · · ∧ vk⟩ = det(vi∗(vj)).
Hence Λk(V∗) ≅ (Λk(V))∗.
Proof. Consider b : V∗ × · · · × V∗ × V × · · · × V → R (k factors each) given by
b(l1, . . . , lk, v1, . . . , vk) = det(li(vj)).

For fixed (l1, . . . , lk) ∈ V∗ × · · · × V∗, b is alternating in the v's. So there is a map b̄ : (V∗ × · · · × V∗) × Λk(V) → R with
(l1, . . . , lk, v1 ∧ · · · ∧ vk) ↦ det(li(vj)).
Similarly, for fixed v1 ∧ · · · ∧ vk ∈ Λk(V), b̄ is alternating in the l's, which means that there is a map b̃ : Λk(V∗) × Λk(V) → R with the desired property. To check non-degeneracy, evaluate the pairing on the respective bases. □
Combining the proposition above with Corollary 6.25.1 we get:

Corollary 6.33.1. The space of k-linear alternating maps {f : V × · · · × V → R | f is alternating } is isomorphic to the k-th exterior power Λk(V ∗).

Remark 6.34. Explicitly, ℓ1 ∧ · · · ∧ ℓk ∈ Λk(V∗) defines a k-linear alternating map by

(ℓ1 ∧ · · · ∧ ℓk)(v1, . . . , vk) = det(ℓi(vj))

for all v1, . . . , vk ∈ V . In particular

(ℓ1 ∧ ℓ2)(v1, v2) = ℓ1(v1)ℓ2(v2) − ℓ1(v2)ℓ2(v1).
Exercise 6.8. Suppose that V is an n-dimensional vector space. Given a linear map A : V → V, we get a map Λn(A) : Λn(V) → Λn(V), and since dim Λn(V) = 1, the map Λn(A) is multiplication by a scalar. Show that this scalar is det A.
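Exercise 6.8 can be checked numerically. Modeling v1 ∧ · · · ∧ vn by the alternating map (v1, . . . , vn) ↦ det(v1| · · · |vn), the scalar by which Λn(A) acts is det A; the check below (a sketch, assuming numpy) is exactly the multiplicativity of the determinant.

```python
import numpy as np

# Model v1 ^ ... ^ vn by det(v1|...|vn).  Applying A to each factor turns
# this into det(Av1|...|Avn) = det(A) * det(v1|...|vn), i.e. the induced
# map on the top exterior power is multiplication by det A (Exercise 6.8).
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
V = rng.standard_normal((n, n))   # columns are v1, ..., vn

lhs = np.linalg.det(A @ V)                       # value after applying A
rhs = np.linalg.det(A) * np.linalg.det(V)        # det(A) times original value
assert np.isclose(lhs, rhs)
```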

7. Differential forms and integration
7.1. Motivation. Suppose we want to integrate a function f over a manifold M. We start with the easiest case: the support of f is contained inside a coordinate chart φ : U → Rm. We could then try to define
∫_M f = ∫_U f := ∫_{φ(U)⊂Rm} (f ◦ φ−1)(x) dx.
Right away we would then run into a problem when we try to compute this integral with respect to a different coordinate chart. Recall the change of variables formula for integrals:

Lemma 7.1. Let F : U → V be a diffeomorphism between two open subsets of Rm and f ∈ C∞(V) an integrable function. Then x ↦ f(F(x)) |det dFx| is an integrable function on U and
(7.1) ∫_{V=F(U)} f(y) dy = ∫_U f(F(x)) |det dFx| dx.
Now suppose ψ : U → Rm is another coordinate chart on M with the same domain. By our definition of ∫_M f we would want
∫_M f = ∫_{ψ(U)} (f ◦ ψ−1)(y) dy.
But the change of variables formula (7.1) gives us
∫_{ψ(U)} (f ◦ ψ−1)(y) dy = ∫_{φ(U)} ((f ◦ ψ−1) ◦ (ψ ◦ φ−1))(x) |det(d(ψ ◦ φ−1)x)| dx = ∫_{φ(U)} (f ◦ φ−1)(x) |det(d(ψ ◦ φ−1)x)| dx.
Since there is no reason for |det(d(ψ ◦ φ−1)x)| to be the constant function 1, the integral of f over M is ill-defined.
One solution is to integrate something other than functions. If µ is one of those somethings and F is a diffeomorphism, then µ should transform under F by the rule µ ↦ (µ ◦ F) det(dF). This will be made more precise shortly. Additionally we will need to confine ourselves to manifolds with atlases {φα} with the property that the differentials d(φα ◦ φβ−1) all have positive determinants. Such manifolds are called orientable.
It turns out that what one integrates over manifolds are differential forms, and we now proceed to define them. Again, let M be a manifold. Recall that we made the disjoint union ⊔_{q∈M} Tq∗M of its cotangent spaces into a manifold, the cotangent bundle T∗M of M. Moreover, we defined the manifold structure on T∗M in such a way that the natural projection
π : T∗M → M, Tq∗M ∋ η ↦ q ∈ M,
is smooth. Similarly one can make the disjoint union of the kth exterior powers of the cotangent spaces of M into a manifold Λk(T∗M):
Λk(T∗M) = ⊔_{q∈M} Λk(Tq∗M).
Moreover, the natural projection
π : Λk(T∗M) → M, Λk(Tq∗M) ∋ ν ↦ q ∈ M,
can be arranged to be smooth. We defer the details of this construction to Section 8, where we will carry it out for arbitrary vector bundles and not just for the cotangent bundle.
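A quick numerical sanity check of the change of variables formula (7.1), assuming numpy: for the polar map F(r, θ) = (r cos θ, r sin θ), which takes U = (0, 1) × (0, 2π) onto the unit disk minus a slit, and f ≡ 1, the right-hand side of (7.1) is the Riemann sum of |det dF| = r over U, which should recover the area π of the disk.

```python
import numpy as np

# Midpoint Riemann sum of |det dF| = r over U = (0,1) x (0,2*pi);
# by (7.1) this equals the area of the unit disk, i.e. pi.
n = 400
r = (np.arange(n) + 0.5) / n                  # midpoints in (0, 1)
theta = 2 * np.pi * (np.arange(n) + 0.5) / n  # midpoints in (0, 2*pi)
R, TH = np.meshgrid(r, theta, indexing="ij")
cell = (1.0 / n) * (2 * np.pi / n)            # area of one grid cell in U
integral = np.sum(R) * cell                   # Riemann sum of |det dF_x| over U
assert abs(integral - np.pi) < 1e-6
```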
The preimages π−1(q) of points under π : Λk(T∗M) → M are called the fibers of π. By design they are the vector spaces Λk(Tq∗M). Recall that for any vector space V, the 0th exterior power Λ0(V) is just the real numbers and the 1st exterior power Λ1(V) is the vector space V itself. It will turn out that Λ0(T∗M) = M × R and Λ1(T∗M) = T∗M.
Definition 7.2. A smooth k-form µ (a.k.a. a differential form of degree k) is a smooth map µ : M → Λk(T∗M), q ↦ µq, so that
µq ∈ Λk(Tq∗M) for all q ∈ M.
The last condition can be stated as: π ◦ µ : M → M is the identity map. The condition will be discussed a few paragraphs down.

Remark 7.3. By definition a 0-form on M is a smooth map µ : M → M × R such that µq = (q, f(q)) for all q ∈ M, where f(q) ∈ R depends on q. In other words, 0-forms are nothing but functions, and the smoothness of 0-forms is the smoothness of functions.
Notation. We denote the space of differential k-forms on a manifold M by Ωk(M). We denote the space of all differential forms by Ω∗(M). Thus
Ω∗(M) = Ω0(M) ⊕ Ω1(M) ⊕ · · · ⊕ Ωk(M) ⊕ · · ·
Let us try to get some feel for differential forms by considering the special case where the manifold M is an open subset of Rm. We denote the standard coordinates on M by x1, . . . , xm. Then for every q ∈ M, the differentials (dx1)q, . . . , (dxm)q form a natural basis of the cotangent space Tq∗M ≅ (Rm)∗. Hence the set

{(dx_{i1})q ∧ · · · ∧ (dx_{ik})q | 1 ≤ i1 < · · · < ik ≤ m}
is a basis of the kth exterior power Λk(Tq∗M). At this point it is convenient to have a bit more notation at our disposal: if I is an ordered k-tuple i1 < · · · < ik, then

(7.2) dx_I := dx_{i1} ∧ · · · ∧ dx_{ik}.
We write |I| to indicate the size of the tuple I: if I is a k-tuple, then |I| = k. With this notation, a typical k-form µ on an open subset of Rm has the following expression:
(7.3) µ = Σ_{|I|=k} a_I dx_I = Σ_{i1<···<ik} a_{i1···ik} dx_{i1} ∧ · · · ∧ dx_{ik}.

For µ ∈ Ωk(M) and ν ∈ Ωl(M), the pointwise wedge product
µq ∧ νq
makes sense since µq ∈ Λk(Tq∗M) and νq ∈ Λl(Tq∗M). This defines the exterior product on differential forms:
∧ : Ωk(M) × Ωl(M) → Ωk+l(M), (µ, ν) ↦ µ ∧ ν, with (µ ∧ ν)q := µq ∧ νq for all q ∈ M.
Note that if µ ∈ Ω0(M), that is, if µ is a function, then µ ∧ ν = µν. That is, wedging a function with a differential form is the same as multiplying the differential form by the function.
7.2. Pullback of differential forms. In order to discuss integration of differential forms we need to discuss their pullback under smooth maps. We start by discussing the underlying linear algebra. Recall that by Lemma 6.27, if A : W → V is a linear map, then there exists a unique linear map
Λk(A) : Λk(W) → Λk(V), with Λk(A)(w1 ∧ · · · ∧ wk) = Aw1 ∧ · · · ∧ Awk
for all w1, . . . , wk ∈ W. In fact, by the universal property of the exterior algebra, we have more than just a collection of linear maps Λk(A), k = 0, 1, . . .. Namely, composing A with the inclusion V ↪ Λ∗(V) gives a linear map W → Λ∗(V) with (Aw) ∧ (Aw) = 0 for all w ∈ W. Hence, by the universal property of Λ∗(W) there is a unique algebra map
Λ∗(A) : Λ∗(W) → Λ∗(V) with Λ∗(A)(w1 ∧ · · · ∧ wk) = Aw1 ∧ · · · ∧ Awk
for all w1, . . . , wk ∈ W and for all k > 0. Note that Λ0(A) : Λ0(W) = R → R = Λ0(V) is the identity map. This is the reason why the pullback of a 0-form, thought of as a function, is composition (see below).

If F : M → N is a smooth map between two manifolds, it defines a pullback map F ∗ : C∞(N) → C∞(M) on functions by F ∗f := f ◦ F for any f ∈ C∞(N). I claim that F ∗ extends to a map of algebras F ∗ :Ω∗(N) → Ω∗(M). Indeed, given F : M → N we have linear maps

dFq : TqM → T_{F(q)}N, q ∈ M,
and therefore dual maps
(dFq)∗ : T∗_{F(q)}N → Tq∗M,
which, in turn, induce maps on the exterior powers
Λk((dFq)∗) : Λk(T∗_{F(q)}N) → Λk(Tq∗M) with Λk((dFq)∗)(ν1 ∧ · · · ∧ νk) = ((dFq)∗ν1) ∧ · · · ∧ ((dFq)∗νk)
for ν1, . . . , νk ∈ T∗_{F(q)}N. Therefore if µ : N → Λk(T∗N) is a k-form, we define its pullback F∗µ ∈ Ωk(M) by

(7.5) (F∗µ)q := Λk((dFq)∗)(µ_{F(q)}) for all q ∈ M.
This looks a bit convoluted but has a simpler interpretation. Recall that for any vector space V we have a canonical isomorphism between the kth exterior power Λk(V∗) of its dual and the space of alternating k-linear maps f : V × · · · × V → R. The identification in question is given by

(ν1 ∧ · · · ∧ νk)(v1, . . . , vk) := det(νi(vj))
for all ν1, . . . , νk ∈ V∗ and all v1, . . . , vk ∈ V (cf. Remark 6.34). Hence, if A : W → V is a linear map and A∗ : V∗ → W∗ its dual, then
(Λk(A∗)(ν1 ∧ · · · ∧ νk))(w1, . . . , wk) = (A∗ν1 ∧ · · · ∧ A∗νk)(w1, . . . , wk) = det((A∗νi)(wj)) = det(νi(Awj)) = (ν1 ∧ · · · ∧ νk)(Aw1, . . . , Awk).
Hence for any µ ∈ Λk(V∗),
(Λk(A∗)µ)(w1, . . . , wk) = µ(Aw1, . . . , Awk).
Therefore, the pullback of a differential form µ ∈ Ωk(N) by F : M → N is given by
(7.6) (F∗µ)q(v1, . . . , vk) := µ_{F(q)}(dFqv1, . . . , dFqvk) for all q ∈ M, v1, . . . , vk ∈ TqM.
So why did we define the pullback by (7.5) and not by (7.6)? The reason is that the first definition tells us that pullback automatically respects exterior multiplication of forms:
(7.7) (F∗µ) ∧ (F∗ν) = F∗(µ ∧ ν)
for any two differential forms µ and ν on N. We will see later on that this is useful.
Remark 7.4. It is easy to see that if F : M → N and G : N → Z are two smooth maps, then (G ◦ F)∗µ = F∗(G∗µ) for any form µ ∈ Ω∗(Z).
Remark 7.5. If N is a submanifold of a manifold M and µ ∈ Ω∗(M) is a differential form on M, the restriction µ|N of µ to N is, by definition, the pullback of µ to N by the inclusion map ι : N → M.
Before we can get back to our original goal of integrating forms on manifolds, we need to take care of a preliminary observation and some definitions.

Lemma 7.6. Let U, V ⊂ Rm be two open subsets and F : U → V a diffeomorphism. Then for any smooth function f ∈ C∞(V),
(F∗(f dx1 ∧ · · · ∧ dxm))q = f(F(q)) (det dFq) (dx1 ∧ · · · ∧ dxm)q for all q ∈ U.

Proof. Let {e1, . . . , em} be the standard basis of Rm. Then for any point q, {(dx1)q, . . . , (dxm)q} is the dual basis. Hence
(F∗(f dx1 ∧ · · · ∧ dxm))q(e1, . . . , em) = (f dx1 ∧ · · · ∧ dxm)_{F(q)}(dFqe1, . . . , dFqem)
= f(F(q)) det((dxi)_{F(q)}(dFqej))
= f(F(q)) · det(dFq) · 1
= (f(F(q)) det(dFq)) (dx1 ∧ · · · ∧ dxm)q(e1, . . . , em). □

supp µ = closure of {q ∈ M | µq ≠ 0}.
We denote the space of compactly supported k-forms on M by Ωkc(M).
Definition 7.8. A manifold M is orientable if there is an atlas {φα : Uα → Rm} so that for any two indices α and β
(7.8) det(d(φα ◦ φβ−1)q) > 0

for all q ∈ φβ(Uα ∩ Uβ). A choice of such an atlas is an orientation of M. Two atlases on M define the same orientation if their union is an atlas satisfying (7.8).

Example 7.9. The identity map id : Rn → Rn defines an orientation of Rn called the standard orientation. The map φ : Rn → Rn, φ(x1, x2, . . . , xn) = (−x1, x2, . . . , xn), defines a different orientation.
Remark 7.10. It is not at all obvious at this point, but a given connected orientable manifold can have only two orientations.
Example 7.11. It should not be too hard to see that the n-sphere Sn is orientable. Somewhat harder is the fact that the real projective space RPn is orientable if and only if n is odd. The Klein bottle and the Möbius strip are not orientable.
7.3. Integration. We now proceed with defining integration of compactly supported m-forms over oriented manifolds of dimension m. Given an oriented manifold M of dimension m and a compactly supported form µ ∈ Ωmc(M) of top degree we want to define a number ∫_M µ in a reasonable way. For example we want the integration map
∫_M : Ωmc(M) → R, µ ↦ ∫_M µ,
to be linear.
If µ ∈ Ωmc(Rm) then µ = f dx1 ∧ · · · ∧ dxm for some compactly supported function f. We define
∫_{Rm} f(x) dx1 ∧ · · · ∧ dxm := ∫_{Rm} f(x) dx,
where the right hand side is the Riemann integral of the compactly supported function f over Rm. (Note that dx2 ∧ dx1 ∧ · · · ∧ dxm = −dx1 ∧ dx2 ∧ · · · ∧ dxm, hence ∫_{Rm} f(x) dx2 ∧ dx1 ∧ · · · ∧ dxm = −∫_{Rm} f(x) dx, and so on.) The definition naturally extends to arbitrary open subsets of Rm: if U ⊂ Rm is open and µ = f dx1 ∧ · · · ∧ dxm ∈ Ωmc(U), then
∫_U µ := ∫_U f(x) dx.
Clearly the map ∫_U : Ωmc(U) → R is linear. In particular if µ = 0, then ∫_U µ = 0 as well.
Next we consider the change of variables formula for the integration of m-forms over open subsets of Rm.
Definition 7.12. A diffeomorphism F : U → V between two open subsets of Rm is orientation-preserving if det(dFx) > 0 for all x ∈ U.
Lemma 7.13 (change of variables formula for differential forms).
Let F : U → V be an orientation-preserving diffeomorphism between two open subsets of Rm and let ω ∈ Ωmc(V) be a compactly supported form of top degree. Then
(7.9) ∫_U F∗ω = ∫_{F(U)} ω.
Proof. We know that ω = f dx1 ∧ · · · ∧ dxm for some f ∈ C∞c(V) and that
∫_V ω = ∫_V f(x) dx.
On the other hand, by Lemma 7.6,
F∗ω = (f ◦ F) · det dF · dx1 ∧ · · · ∧ dxm.
Hence
∫_U F∗ω = ∫_U (f ◦ F)(x) det dFx dx.
Since det(dFx) = |det(dFx)| for all x ∈ U by assumption, we have
∫_U F∗ω = ∫_U (f ◦ F)(x) |det dFx| dx = ∫_{F(U)} f(y) dy (by (7.1)) = ∫_V ω. □
Theorem 7.14. Let M be an oriented m-dimensional manifold. There exists a unique linear map (integration)
∫_M : Ωmc(M) → R, µ ↦ ∫_M µ,
such that if φ : U → Rm is a coordinate chart (in an atlas defining the orientation of M) and ω ∈ Ωmc(U) is a compactly supported form of top degree, then
∫_M ω = ∫_{φ(U)} (φ−1)∗ω.
Proof. We need to check that integration of forms is well-defined and unique (linearity follows from familiar properties of Riemann integrals). We do this in two steps.

Step I. We check that if the support of ω is in an open subset U ⊂ M and φ : U → Rm, ψ : U → Rm are two different charts on M defining the same orientation, then
∫_{φ(U)} (φ−1)∗ω = ∫_{ψ(U)} (ψ−1)∗ω.
Since φ−1 = ψ−1 ◦ (ψ ◦ φ−1),
(φ−1)∗ω = (ψ−1 ◦ (ψ ◦ φ−1))∗ω = (ψ ◦ φ−1)∗((ψ−1)∗ω) by Remark 7.4.
By the change of variables formula (7.9),
∫_{φ(U)} (φ−1)∗ω = ∫_{φ(U)} (ψ ◦ φ−1)∗((ψ−1)∗ω) = ∫_{ψ(U)} (ψ−1)∗ω.
Step II. We now deal with the general case. Let ω ∈ Ωmc(M) be an arbitrary compactly supported form on the manifold M of degree m = dim M. Let {φα : Uα → Rm} be an atlas on M giving it its orientation. Since supp ω is compact, there are finitely many sets U1, . . . , Un with U1 ∪ · · · ∪ Un ⊃ supp ω. Let U0 := M ∖ supp ω. Then U0, U1, . . . , Un is a cover of M. Let {ρi}_{i=0}^n be a partition of unity subordinate to this cover. Note that ρ0ω ≡ 0. Define the integral of ω over M by
(7.10) ∫_M ω := Σ_{i=1}^n ∫_{φi(Ui)} (φi−1)∗(ρiω).
We need to show that our definition of ∫_M ω does not depend on the choices we made. Accordingly, suppose that {ψβ : Vβ → Rm} is another atlas giving M the same orientation, V1, . . . , Vℓ a cover of supp ω, V0 = M ∖ supp ω, and {τj}_{j=0}^ℓ a partition of unity subordinate to the cover V0, V1, . . . , Vℓ of M. By Step I, for all indices i > 0 and j > 0,
(7.11) ∫_{ψj(Ui∩Vj)} (ψj−1)∗(τjρiω) = ∫_{φi(Ui∩Vj)} (φi−1)∗(τjρiω).
Therefore,
Σ_i ∫_{φi(Ui)} (φi−1)∗(ρiω) = Σ_i ∫_{φi(Ui)} (φi−1)∗(ρi(Σ_j τj)ω)
= Σ_{i,j} ∫_{φi(Ui∩Vj)} (φi−1)∗(ρiτjω)
= Σ_{i,j} ∫_{ψj(Ui∩Vj)} (ψj−1)∗(ρiτjω) (by (7.11))
= Σ_j ∫_{ψj(Vj)} (ψj−1)∗(τjω).
Therefore the integral of ω over M is well-defined. □
The following lemma is very useful for carrying out integration.
Lemma 7.15. Let M be an oriented manifold of dimension m, ω ∈ Ωmc(M) a compactly supported form and N ⊂ M an embedded submanifold of codimension 1 or greater. Then
∫_M ω = ∫_{M∖N} ω.
Proof.
In this case the result follows easily from the properties of Riemann integrals of functions.  To compute any integrals of forms, we also need to have a good way of computing pull-backs of forms. We have already seen that if f : N → R is a 0-form (i.e., a function) and F : M → N a smooth map of manifolds then F ∗f = f ◦ F .
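As a concrete sanity check of the change of variables formula (7.9), one can verify the equality symbolically. The map F(x, y) = (x, y + x²) and the square U = (0, 1)² below are hypothetical choices for illustration, not taken from the text; F is orientation-preserving since det dF ≡ 1 > 0.

```python
import sympy as sp

x, y, u, v = sp.symbols('x y u v', real=True)

# F(x, y) = (x, y + x**2), an orientation-preserving diffeomorphism of the plane.
F = sp.Matrix([x, y + x**2])
J = F.jacobian([x, y])            # det J == 1 > 0 everywhere

f = lambda p, q: q                # omega = f dx ^ dy with f(x, y) = y

# Left side of (7.9): integral over U of F*omega = f(F(x, y)) det(dF) dx dy.
lhs = sp.integrate(f(*F) * J.det(), (x, 0, 1), (y, 0, 1))

# Right side: F(U) = {(u, v) : 0 < u < 1, u**2 < v < 1 + u**2}.
rhs = sp.integrate(f(u, v), (v, u**2, 1 + u**2), (u, 0, 1))

assert lhs == rhs == sp.Rational(5, 6)
```

Both sides come out to 5/6, as (7.9) predicts for an orientation-preserving change of variables.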

Exercise 7.1. Let F : M → N be a smooth map of manifolds and f : N → R a smooth function. Then df is a 1-form on N and F ∗df = d(f ◦ F ).

Solution: for any point q ∈ M and any v ∈ T_qM, (F^*df)_q(v) = df_{F(q)}(dF_q(v)) = d(f ∘ F)_q(v) by the chain rule.

Exercise 7.2. Compute the integral of (the restriction of) the one-form x dy − y dx over the circle S^1 = {(x, y) ∈ R^2 | x^2 + y^2 = 1} (pick any orientation of the circle you want).

Solution: Consider the map F : (0, 2π) → S^1 given by F(t) = (cos t, sin t). Note that the image of F is all of S^1 except for one point. Therefore, by Lemma 7.15, ∫_{S^1} x dy − y dx = ∫_{F((0,2π))} x dy − y dx. Also F : (0, 2π) → S^1 is an open embedding, hence the inverse of F is a coordinate chart on S^1. Therefore,
∫_{F((0,2π))} x dy − y dx = ∫_{(0,2π)} F^*(x dy − y dx).
Since pull-back respects exterior multiplication, F^*(x dy − y dx) = (F^*x)(F^*dy) − (F^*y)(F^*dx). But F^*x = cos t and F^*y = sin t, while by Exercise 7.1 F^*dx = d(F^*x) = d cos t = −sin t dt and, similarly, F^*dy = d sin t = cos t dt. Therefore
F^*(x dy − y dx) = cos t d sin t − sin t d cos t = cos^2 t dt + sin^2 t dt = dt.
We conclude that
∫_{S^1} x dy − y dx = ∫_{(0,2π)} dt = 2π.

Exercise 7.3. Compute the pull-back of dx ∧ dy by the map F : (0, ∞) × R → R^2, F(r, θ) = (r cos θ, r sin θ).

Solution: F^*(dx ∧ dy) = F^*dx ∧ F^*dy = d(F^*x) ∧ d(F^*y) = d(r cos θ) ∧ d(r sin θ) = (cos θ dr − r sin θ dθ) ∧ (sin θ dr + r cos θ dθ) = r cos^2 θ dr ∧ dθ − r sin^2 θ dθ ∧ dr = r dr ∧ dθ.
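A quick symbolic check of this computation: the coefficient of dr ∧ dθ in F^*(dx ∧ dy) is the Jacobian determinant of F, which should simplify to r.

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)

# F(r, theta) = (r cos(theta), r sin(theta)), the polar coordinate map.
F = sp.Matrix([r * sp.cos(theta), r * sp.sin(theta)])
jac = F.jacobian([r, theta])

# The coefficient of dr ^ dtheta in F*(dx ^ dy) is det(dF).
coeff = sp.simplify(jac.det())
assert coeff == r            # F*(dx ^ dy) = r dr ^ dtheta
```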

The definition of an orientation of a manifold that we used above is convenient for defining integration. It is inconvenient for everything else. The following criterion is useful.

Proposition 7.16. An m-dimensional manifold M is orientable if and only if there is a form ν on M of degree m such that

ν_q ≠ 0 for all q ∈ M.

Remark 7.17. A nowhere vanishing form of top degree on a manifold M as in Proposition 7.16 is called a volume form.

m Proof of Proposition 7.16. Suppose M is orientable and {φα : Uα → R } is an atlas giving M an orientation. Let {ρα} be a partition of unity subordinate to the cover {Uα} of M. Define an m-form ν on M by

ν = Σ_α ρ_α · φ_α^*(dx_1 ∧ · · · ∧ dx_m).

We need to check that ν vanishes nowhere. Fix a point q ∈ M. Then ρ_α(q) ≠ 0 for only finitely many α, say α_1, . . . , α_k. Therefore
((φ_{α_1}^{-1})^* ν)_{φ_{α_1}(q)} = Σ_{i=1}^k ((φ_{α_1}^{-1})^*(ρ_{α_i} φ_{α_i}^*(dx_1 ∧ · · · ∧ dx_m)))_{φ_{α_1}(q)}
= Σ_{i=1}^k ρ_{α_i}(q) ((φ_{α_i} ∘ φ_{α_1}^{-1})^*(dx_1 ∧ · · · ∧ dx_m))_{φ_{α_1}(q)}
= (Σ_{i=1}^k ρ_{α_i}(q) det(d(φ_{α_i} ∘ φ_{α_1}^{-1})_{φ_{α_1}(q)})) (dx_1 ∧ · · · ∧ dx_m)_{φ_{α_1}(q)} ≠ 0,
since det(d(φ_{α_i} ∘ φ_{α_1}^{-1})_{φ_{α_1}(q)}) > 0 and ρ_{α_i}(q) > 0.
Conversely suppose ν ∈ Ω^m(M) is a volume form. We want to find an atlas {φ_α : U_α → R^m} so that
det(d(φ_α ∘ φ_β^{-1})_q) > 0
for all q and all α, β. Let {ψ_β : V_β → R^m} be an arbitrary atlas on M. It is no loss of generality to assume that all the sets V_β are connected. Then for each index β
(ψ_β^{-1})^* ν = f_β dx_1 ∧ · · · ∧ dx_m,

with fβ(x) 6= 0 for all x ∈ ψβ(Vβ). Since Vβ is connected fβ is either strictly positive or strictly negative. If m m fβ > 0, keep the chart ψβ. Otherwise replace it by T ◦ ψβ where T : R → R is the diffeomorphism given by

T (x1, x2, ··· , xm) = (−x1, x2, ··· , xm).  Exercise 7.4. Suppose that M and N are orientable manifolds. Prove that their product M × N is orientable. Exercise 7.5. Show that the tangent bundle TM is always orientable, regardless of whether or not the manifold M is.

Exercise 7.6. Evaluate ∫_S ω|_S where S is the helicoid in R^3 parameterized by φ(s, t) = (s cos t, s sin t, t), 0 < s < 1, 0 < t < 4π, and ω = z dx ∧ dy + 3 dz ∧ dx − x dy ∧ dz. Use the orientation of S defined by φ (that is, φ^{-1} : S → R^2 is a coordinate chart on S).

8. Vector bundles

Informally a vector bundle is a collection of vector spaces parameterized by points in a manifold. You have already seen two examples: the tangent bundle and the cotangent bundle. Here is the formal definition.

Definition 8.1. A real vector bundle E of rank k over a manifold M is a manifold E together with a smooth map π : E → M so that
(1) for each x ∈ M the fiber E_x := π^{-1}(x) is a real vector space of dimension k, and
(2) for all x ∈ M, there is an open neighborhood U of x in M and a diffeomorphism ψ : π^{-1}(U) → U × R^k such that pr ∘ ψ = π, where pr : U × R^k → U, pr(q, v) = q, is the natural projection.
Hence ψ maps the fiber E_y to {y} × R^k for all y ∈ U. Additionally we require that the restrictions ψ|_{E_y} : E_y → {y} × R^k are vector space isomorphisms for all y ∈ U.

Definition 8.2.
• The manifold E is called the total space of the bundle π : E → M.
• The manifold M is called the base of the bundle π : E → M.
• The maps ψ : π^{-1}(U) → U × R^k are called local trivializations of the bundle π : E → M.

Example 8.3. The projection π : M × R^k → M, π(m, v) = m is a vector bundle of rank k.

Example 8.4. I claim that the tangent bundle π : TM → M is a vector bundle of rank dim M. Let us construct local trivializations. Given a point q ∈ M choose a coordinate chart φ = (x_1, . . . , x_m) : U → R^m with q ∈ U. The map

ψ : TM|_U ≡ π^{-1}(U) → U × R^m,  ψ(v) = (π(v), (dx_1(v), . . . , dx_m(v)))
is a local trivialization. Note that its inverse ψ^{-1} : U × R^m → TM|_U is given by

−1 X ∂ ψ (p, (a1, . . . , am)) = ai . ∂xi p Example 8.5. The cotangent bundle T ∗M → M is also a vector bundle over M of rank dim M. It is useful to be able to say when two vector bundles are “the same.”

Definition 8.6. Let π_E : E → M and π_F : F → M be two vector bundles over a manifold M. A smooth map f : E → F is a vector bundle map if f(E_x) ⊂ F_x for all x and if the map f|_{E_x} : E_x → F_x is linear. A vector bundle map f : E → F is an isomorphism of vector bundles if it is invertible and if f^{-1} : F → E is a vector bundle map.

Definition 8.7. A vector bundle E → M of rank k is trivial if there is a vector bundle isomorphism E → M × R^k.

Example 8.8. The vector field X = x_1 ∂/∂x_2 − x_2 ∂/∂x_1 on R^2 is tangent to the unit circle S^1 and is not zero anywhere. Therefore the map f : S^1 × R → TS^1, f(q, t) = tX_q is an invertible vector bundle map. Convince yourself that the inverse f^{-1} : TS^1 → S^1 × R is smooth.

Exercise 8.1. Show that the tangent bundle of the n-torus T^n := S^1 × · · · × S^1 (n factors) is trivial.

Exercise 8.2. Let E → M, F → M be two vector bundles over a manifold M. Show that if f : E → F is a vector bundle map and f has a set-theoretic inverse f^{-1}, then f^{-1} is a vector bundle map.

Hints. Consider first the case where E and F are trivial bundles: E = M × Rk, F = M × Rl. Prove that the map inv : GL(R, k) → GL(R, k) given by A 7→ A−1 is smooth. Use trivializations to reduce everything to the trivial bundles case. Exercise 8.3. If π : E → M is a vector bundle and N ⊂ M is an embedded submanifold, show that −1 π : E|N := π (N) → N is a vector bundle over N, called the restriction of E to N.

Hints: to show that E|N is a submanifold of E prove that π : E → M is transverse to N. The local trivializations of E|N are the restrictions of the local trivializations of E. Example 8.9 (tautological ). (This is a sketch with no actual proofs.) Recall that the complex projective space CP n is the set of all complex lines in Cn+1. We identify lines CP n with equivalence classes of nonzero vectors [v]. The equivalence relation is given by v ' v0 if and only if v and v0 are collinear: v = λv0 for some 0 6= λ ∈ C. We define the tautological complex line bundle π : L → CP n as follows. We let n n+1 L = {(l, v) ∈ CP × C : v ∈ l}. and define π : L → CP n by π(l, v) = l. Thus the fiber π−1(l) consists of all vectors v ∈ Cn+1 that lie on the line l, that is, of the complex line l itself. Hence the name. I claim that π : L → CP n is indeed a real vector bundle of rank 2 (2 because C is a 2-dimensional vector space over R). Why is L a manifold? The relationship v ∈ l is really a collection of algebraic equations: if v ∈ [w] then (v1, . . . , vn+1) = λ(w1, . . . , wn+1) for some 0 6= λ ∈ C. Therefore vi/wi = λ = vj/wj for all i and j and hence

v_j w_i = v_i w_j for all i, j. From this one can deduce that L is indeed a manifold (in other words, I am not really giving you a proof). To construct local trivializations let
U_i = {[w] ∈ CP^n : w_i ≠ 0}.
Define ψ_i : π^{-1}(U_i) → U_i × C by ψ_i([w], v) = ([w], v_i).

8.1. Sections.

Definition 8.10 (Section). A section s : M → E of a vector bundle π : E → M is a C^∞ map such that π ∘ s = id_M. That is, a section is a smooth map from M to E such that s(x) ∈ E_x for all x.

Notation. We denote the set of smooth sections of a bundle π : E → M by Γ(E).

Example 8.11.
• The space of sections Γ(TM) of the tangent bundle of a manifold M is the space of vector fields on M.
• The space of sections Γ(T^*M) of the cotangent bundle of a manifold M is the space of 1-forms on M. That is, Γ(T^*M) = Ω^1(M).
• The space of sections Γ(Λ^k(T^*M)) of the kth exterior power of the cotangent bundle (which we have not constructed yet) is the space of k-forms Ω^k(M).
• The space of sections Γ(M × R) of the trivial bundle M × R → M is the space of smooth maps of the form m ↦ (m, f(m)), where f : M → R is a smooth function. Thus Γ(M × R) "is" C^∞(M).

Lemma 8.12. Let E → M be a vector bundle over a manifold M. The space of sections Γ(E) is a vector space over R under pointwise addition and multiplication by scalars. Moreover, if s : M → E is a section and f ∈ C^∞(M) is a function then we can define a new section fs : M → E by

(fs)q = f(q)sq for q ∈ M. Thus the space of sections Γ(E) is a module over the space of functions C∞(M). 54 Proof. The only possible worry is this: suppose f, f¯ ∈ C∞(M) are two smooth functions and s, s¯ ∈ Γ(E) are two smooth sections. Is the section fs + f¯s¯ smooth? Since every bundle is locally trivial and since smoothness is a local condition, we may assume that E is the trivial bundle M × Rk → M. In this case Γ(E) “is” the space of smooth maps from M to Rk, and the lemma is clearly true for these maps — they ∞ do form a module over C (M). 

Remark 8.13. The map that assigns to every point q ∈ M the origin 0q in the fiber Eq is smooth. It’s called the zero section and is often denoted by 0. Definition 8.14 (local section). A local section of a vector bundle π : E → M is a section of π−1(U) =:

E|_U → U for some open U ⊂ M. Equivalently, a local section is a smooth map s : U → E such that π ∘ s = id_U.

Example 8.15. If (U, x1, . . . , xn) is a coordinate chart on a manifold M, then for each index i, the map

q ↦ (∂/∂x_i)|_q is a local section of the tangent bundle TM → M. Similarly dx_i : U → T^*M is a local section of the cotangent bundle.

Exercise 8.4. Let E → M be a vector bundle, x ∈ M a point and v ∈ E_x. Show that there is a global section s with s_x = v.

8.2. Frames and local frames. We now address the issue raised in Section 7: a k-form µ ∈ Ω^k(M) on a manifold M is smooth if for any coordinate chart (x_1, . . . , x_m) : U → R^m on M, we have
µ = Σ_{|I|=k} a_I dx_I = Σ_{i_1<···<i_k} a_{i_1···i_k} dx_{i_1} ∧ · · · ∧ dx_{i_k},

where aI ’s are smooth functions on U. To this end we define frames on a vector bundle.

Definition 8.16. Let E → M be a vector bundle of rank k. A collection s1, . . . , sk ∈ Γ(E) of sections is a frame of E if for each point x ∈ M the vectors {s1(x), ··· , sk(x)} form a basis of the fiber Ex. Similarly, a collection of local sections s1, . . . , sk : U → E is a local frame of E if for each point x ∈ U the vectors {s1(x), ··· , sk(x)} form a basis of the fiber Ex. Example 8.17. A nowhere zero vector field X on a circle S1 is a frame of the tangent bundle TS1 → S1. Proposition 8.18. A vector bundle E → M of rank k is trivial if and only if it has a frame of k sections s1, . . . , sk ∈ Γ(E). Proof. Suppose that E is a trivial vector bundle over a manifold M. Then we have a global trivialization ψ : π−1(M) → M × Rk. Define −1 si(x) = ψx (ei) k where {e1, . . . , ek} is the canonical basis for R . Then the collection {s1, . . . , sk} satisfies the desired prop- erties. Conversely, suppose that we have smooth sections s1, . . . , sk that form a basis of Ex at every x. Then a global trivialization is given by X (q, v1, . . . , vk) 7→ visi(q). i  Exercise 8.5. A section s of a vector bundle E → M is smooth if and only if for each point q ∈ M there is ∞ a neighborhood U of q, a local frame s1, . . . , sk : U → E and smooth functions f1, . . . fk ∈ C (U) so that

s = f1s1 + ··· + fksk.

Exercise 8.6. Show that the discussion of smoothness of k-forms following (7.4) is correct: if (x1, . . . , xm): m ∗ U → R is a coordinate chart on a manifold M, then {dx1, . . . , dxm} is a local frame of T M over U. Hence

{dxI | |I| = k} 55 is a local frame of the kth exterior power Λk(T ∗M) of the cotangent bundle. By Exercise 8.5, a section k ∗ ω ∈ Γ(Λ (T M)) is smooth on U if and only if there are smooth functions aI : U → R such that X ω|U = aI dxI . 8.3. Vector bundles via transition maps. The goal of this section is to design a way of tearing vector bundles apart and then putting them back together in a new way. This will allow us to carry over the operations of direct sum ⊕, tensor ⊗, exterior power Λk, taking duals and so on from vector spaces to vector bundles.

Suppose that π : E → M is a vector bundle of rank k, {U_α} is a cover of M such that the E|_{U_α} are trivial, and let ψ_α : π^{-1}(U_α) → U_α × R^k denote the local trivializations. If U_α ∩ U_β ≠ ∅, we have a map
ψ_β ∘ ψ_α^{-1} : (U_α ∩ U_β) × R^k → (U_α ∩ U_β) × R^k.
Since the trivializations ψ_α map fibers E_y linearly to fibers {y} × R^k, the composition ψ_β ∘ ψ_α^{-1} maps {y} × R^k linearly to {y} × R^k for all y ∈ U_α ∩ U_β. Hence the map ψ_β ∘ ψ_α^{-1} has to be of the form
ψ_β ∘ ψ_α^{-1}(y, v) = (y, ψ_βα(y)v)
for some function
ψ_βα : U_α ∩ U_β → GL(R^k).
Note that ψ_βα is smooth, because for every basis vector e_j of R^k, the map

q ↦ ψ_βα(q)e_j is smooth. Such maps are called transition maps for the bundle π : E → M. It is not hard to see that the set of transition maps ψ_αβ : U_α ∩ U_β → GL(R^k) for the bundle E → M relative to the cover {U_α} satisfies the following three conditions, called the cocycle conditions:

(1) ψ_αα = id_{U_α} for all α.
(2) ψ_αβ · ψ_βα = id_{U_α∩U_β} for all pairs of indices α, β (the dot denotes multiplication in GL(R^k)).

(3) ψ_αβ · ψ_βγ · ψ_γα = id_{U_α∩U_β∩U_γ} for all triples of indices α, β and γ.
Note that (2) implies that ψ_βα = ψ_αβ^{-1}. The transition maps determine the vector bundle E.

Theorem 8.19. Let M be a manifold, {U_α} an open cover, and {ψ_αβ : U_α ∩ U_β → GL(R^k)} a collection of smooth maps satisfying the cocycle conditions. Then there is a vector bundle E over M of rank k with transition maps {ψ_αβ}.

Sketch of proof. Consider the disjoint union Ẽ of the trivial bundles U_α × R^k:
Ẽ = ⊔_α (U_α × R^k).
Define a relation on Ẽ by
U_α × R^k ∋ (q, v) ∼ (q′, v′) ∈ U_β × R^k if and only if q = q′ and ψ_βα(q)v = v′.
The cocycle conditions guarantee that ∼ is an equivalence relation. Let E = Ẽ/∼, and write [q, v] for the equivalence class of (q, v). Define the projection π : E → M by π([q, v]) = q. Then
π^{-1}(U_α) = {[q, v] | (q, v) ∈ U_α × R^k}.
Define the trivializations ψ_α : π^{-1}(U_α) → U_α × R^k by

ψα([q, v]) = (q, v).

It's not hard to check that the maps ψ_α are well-defined and that the corresponding transition maps are the maps ψ_αβ we started out with. It remains to check that E can be given the structure of a manifold so that all the maps in sight are smooth. But this is not bad. Let's examine what we have.
We have a topological space E covered by open sets π^{-1}(U_α). For each α we have a homeomorphism ψ_α : π^{-1}(U_α) → U_α × R^k. This suggests a way to get coordinate charts on our topological space E: compose the homeomorphisms ψ_α with charts on U_α × R^k. This gives us a cover of E by open sets and a collection of homeomorphisms from these sets to open subsets of R^n, where n = dim M + k. This is an atlas because the maps
ψ_β ∘ ψ_α^{-1} : (U_α ∩ U_β) × R^k → (U_α ∩ U_β) × R^k,  (q, v) ↦ (q, ψ_βα(q)v),
are smooth. Note that here we are given that the maps ψ_βα : U_α ∩ U_β → GL(R^k) are smooth and we are using it to conclude that the maps ψ_β ∘ ψ_α^{-1} : (U_α ∩ U_β) × R^k → (U_α ∩ U_β) × R^k are smooth. 

Remark 8.20. Naturally a different choice of a cover of M and a different choice of trivializations give rise to a different collection of transition maps. And we should worry whether two different sets of data (open cover, transition maps) give rise to the same bundle. But this would take us too far afield.

As a first application of Theorem 8.19 we construct the direct sum E ⊕ F → M of two vector bundles π_E : E → M and π_F : F → M over a manifold M. The direct sum E ⊕ F (also known as the Whitney sum) should be a vector bundle with the fiber (E ⊕ F)_q = E_q ⊕ F_q for q ∈ M. We define it by way of the transition maps. Pick an open cover {U_α} of M such that E|_{U_α} and F|_{U_α} are trivial. Let ψ_αβ^E : U_α ∩ U_β → GL(R^k) and ψ_αβ^F : U_α ∩ U_β → GL(R^l) be the associated transition maps (thus E is of rank k and F is of rank l). Define the maps ψ_αβ^{E⊕F} : U_α ∩ U_β → GL(R^k ⊕ R^l) by
ψ_αβ^{E⊕F}(q) = [ ψ_αβ^E(q)  0 ; 0  ψ_αβ^F(q) ]  (block-diagonal).
It is not hard to check that the maps ψ_αβ^{E⊕F} are smooth and satisfy the cocycle conditions.
Therefore, by Theorem 8.19 there is a vector bundle E ⊕ F → M with transition maps ψ_αβ^{E⊕F}. Its fibers are isomorphic to the direct sum of the corresponding fibers of E and F.
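The cocycle conditions for ψ^{E⊕F} follow because forming block-diagonal matrices respects composition, which is easy to check numerically (an illustrative sketch with random matrices):

```python
import numpy as np

def direct_sum(A, B):
    """Block-diagonal matrix A ⊕ B."""
    k, l = A.shape[0], B.shape[0]
    out = np.zeros((k + l, k + l))
    out[:k, :k], out[k:, k:] = A, B
    return out

rng = np.random.default_rng(2)
A, A2 = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
B, B2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))

# (A ∘ A') ⊕ (B ∘ B') = (A ⊕ B) ∘ (A' ⊕ B')
assert np.allclose(direct_sum(A @ A2, B @ B2),
                   direct_sum(A, B) @ direct_sum(A2, B2))
```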

It is worth reflecting on what made the construction above work. It is simply the fact that the map
GL(R^k) × GL(R^l) → GL(R^{k+l}),  (A, B) ↦ A ⊕ B := [ A  0 ; 0  B ]  (block-diagonal)

is smooth (as a map between open subspaces of R^{k^2} × R^{l^2} and R^{(k+l)^2}) and the fact that under this map compositions go to compositions: (A ∘ A′) ⊕ (B ∘ B′) = (A ⊕ B) ∘ (A′ ⊕ B′). There are many more examples of maps of this sort. For instance, consider the map that takes a matrix A ∈ GL(R^k) to its inverse transpose:
A ↦ (A^{-1})^* ∈ GL((R^k)^*).
The map is smooth (since the entries of the matrix (A^{-1})^* are rational functions of the entries of the matrix A), and ((AB)^{-1})^* = (A^{-1})^*(B^{-1})^*. Now let E → M be a vector bundle of rank k and {U_α} an open cover of M such that the E|_{U_α} are trivial. Let ψ_βα : U_β ∩ U_α → GL(R^k) be the corresponding transition maps. Then the maps ψ_βα^* : U_β ∩ U_α → GL((R^k)^*) defined by
ψ_βα^*(x) = (ψ_βα(x)^{-1})^*
are smooth and satisfy the cocycle conditions. By Theorem 8.19 there exists a vector bundle E^* → M whose transition maps are precisely {ψ_βα^*}. The bundle E^* is called the dual bundle of E. Its fibers E_q^* are vector spaces dual to the fibers E_q of E. We have seen this construction in one special case: the cotangent bundle T^*M is the dual bundle of the tangent bundle.
The maps (A, B) ↦ A ⊕ B and A ↦ (A^{-1})^* are what is known as smooth functors. They allowed us to define the direct sum of two bundles and the dual bundle, respectively. Here are a few more examples of functors that will be very useful for us. Let V and W be finite-dimensional vector spaces, A ∈ GL(V) and B ∈ GL(W). Then
• (A, B) ↦ A ⊗ B ∈ GL(V ⊗ W),
• A ↦ Λ^k(A) ∈ GL(Λ^k(V)), and
• (A, B) ↦ Hom(A, B) ∈ GL(Hom(V, W)), where Hom(A, B)T := B ∘ T ∘ A^{-1},
are smooth functors.⁹ Indeed, pick bases of V and W. Then the entries of the matrix representing A ⊗ B are products of entries of the matrices representing A and B. The entries of the matrix representing Λ^k(A) are polynomial in the entries of the matrix representing A. [If this is confusing, work out the following simple example and you'll see what I mean.
Let V = R^3, k = 2 and compute Λ^2(A)(e_i ∧ e_j) in terms of the basis {e_1 ∧ e_2, e_1 ∧ e_3, e_2 ∧ e_3}.] Similarly the entries of the matrix representing Hom(A, B) are polynomial in the entries of A and B. This allows us, given two vector bundles E → M and F → M, to construct the bundles
• E ⊗ F → M,
• Λ^k(E) → M, and
• Hom(E, F) → M.
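The simple example suggested above can also be sketched numerically: the matrix of Λ²(A) in the basis {e_1∧e_2, e_1∧e_3, e_2∧e_3} has 2×2 minors of A as entries (hence polynomials in the entries of A), and the assignment A ↦ Λ²(A) respects composition.

```python
import itertools
import numpy as np

def wedge2(A):
    """Matrix of Λ²(A) in the basis e_i ∧ e_j, i < j; entries are 2×2 minors of A."""
    n = A.shape[0]
    pairs = list(itertools.combinations(range(n), 2))
    L = np.empty((len(pairs), len(pairs)))
    for a, (i, j) in enumerate(pairs):
        for b, (k, l) in enumerate(pairs):
            # coefficient of e_i ∧ e_j in (A e_k) ∧ (A e_l)
            L[a, b] = A[i, k] * A[j, l] - A[i, l] * A[j, k]
    return L

rng = np.random.default_rng(0)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))

# Functoriality: Λ²(AB) = Λ²(A) Λ²(B)
assert np.allclose(wedge2(A @ B), wedge2(A) @ wedge2(B))
```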

Exercise 8.7. Check that the bundles E^* and Hom(E, M × R) are isomorphic.

Exercise 8.8. Let E → M and F → M be two vector bundles. Convince yourself that a section of Hom(E, F) "is" a vector bundle map from E to F.

Show that E∗ ⊗ F is isomorphic to Hom(E,F ).

Exercise 8.9. Compute transition maps for the tautological real line bundle L → RP^n:
L = {(l, v) ∈ RP^n × R^{n+1} | v ∈ l}.
Compute transition maps for L ⊗ L. (Hint: write down the isomorphism R ⊗ R → R.) Compute the transition maps for L^{⊗k} := L ⊗ · · · ⊗ L (k factors), k > 1.

Exercise 8.10. Let πE : E → M and πF : F → M be two vector bundles over M. (a) Show that E × F is a vector bundle over M × M. (b) Explain why G = {(e, f) ∈ E × F : πE(e) = πF (f)} can be considered a vector bundle over M. (c) Show that, as a vector bundle over M, G is isomorphic to the Whitney sum E ⊕ F .

9. Exterior differentiation, contractions and Lie derivatives of forms

9.1. Exterior differentiation. In this section we first learn how to differentiate differential forms. We define an operator d of exterior differentiation that raises the degree of a form by 1. It is a generalization of the div, grad and curl operators of vector calculus.

Theorem 9.1. For every manifold M, there is a unique R-linear operator ∗ ∗+1 dM :Ω (M) → Ω (M) with the following properties : k k+1 (1) dM raises the degrees by 1: dM (Ω (M)) ⊂ Ω (M); ∞ (2) dM f = df for all f ∈ C (M), that is, dM extends the operator d, which takes functions to 1-forms, to forms of arbitrary degree; ∗ (3) the operator dM commutes with restrictions to open sets: for all open sets U ⊂ M and all ω ∈ Ω (M), (dM ω)|U = dU (ω|U ); k k (4) the operator dM is a super-derivation: dM (ω ∧ η) = (dM ω) ∧ η + (−1) ω ∧ (dM η) for ω ∈ Ω (M), η ∈ Ωl(M); (5) dM ◦ dM = 0. Remark 9.2. Note that any open set U ⊂ M is a manifold, so the theorem asserts that there is an operator ∗ ∗ dU :Ω (U) → Ω (U) with properties (1) – (5), hence property (3) makes sense.

⁹ Note that Λ^0(A) = 1 ∈ GL(Λ^0(V)) = GL(R).

Proof of Theorem 9.1. We prove uniqueness of the operator d_M first. We then construct the operator locally, on open sets. By uniqueness, these locally defined operators patch together into a global operator. This proves existence.
Suppose an operator d_M with the desired properties exists. Fix a coordinate chart (x_1, . . . , x_m) : U → R^m on M. Then for all α ∈ Ω^k(M), α|_U = Σ_{|I|=k} a_I dx_I, where a_I ∈ C^∞(U) (cf. (7.2) and (7.3)). We claim that
(9.1)  (d_M α)|_U = Σ_{|I|=k} da_I ∧ dx_I.

This would prove uniqueness since the right hand side is defined independently of dM . We prove (9.1) in four steps. By property (3) of dM , (dM α)|U = dU (α|U ). By properties (2) and (5) dU (dxi) = dU (dU xi) = (dU ◦ dU )xi = 0. Hence, by property (4)

dU (dxi1 ∧ dxi2 ∧ · · · ∧ dxik ) = dU (dxi1 ) ∧ (dxi2 ∧ · · · ∧ dxik ) − dxi1 ∧ dU (dxi2 ∧ · · · ∧ dxik )

Since dU (dxi) = 0, induction on k then gives:

dU (dxI ) = dU (dxi1 ∧ dxi2 ∧ · · · ∧ dxik ) = 0. Hence for any multi-index I, dU (aI dxI ) = daI ∧ dxI .

Linearity of dU finishes the proof of (9.1). To prove existence of dM we run equation (9.1) backwards. Given a coordinate chart (x1, . . . , xm): U → m k k+1 R on M we define an operator dU :Ω (U) → Ω (U) by X X (9.2) dU ( aI dxI ) = daI ∧ dxI |I|=k |I|=k

(in particular, if k = 0, then dU a = da). Suppose, for the moment, that dU defined by (9.2) satisfies m properties (1) – (5). Then by uniqueness, for any two coordinate charts (x1, . . . , xm): U → R and m k (y1, . . . , ym): V → R and any k-form α ∈ Ω (M)

(dU α|U )|U∩V = (dV α|V )|U∩V . ∗ ∗+1 Consequently dM :Ω (M) → Ω (M), given by

(d_M α)|_U = d_U(α|_U) for all coordinate charts U, is well-defined. Since the d_U's have properties (1)–(5), so does d_M (check that). It remains to prove that the map d_U given by (9.2) has the desired properties. Clearly d_U is R-linear and raises degrees by 1. Property (2) holds by definition.
To prove (3) we want to show that for any open subset W ⊂ U and any k-form α = Σ a_I dx_I ∈ Ω^k(U)

(dU α)|W = dW (α|W ) m (Note that (x1, . . . , xm)|W : W → R is also a coordinate chart). Let j : W,→ U denote the inclusion. For ∞ ∗ any smooth function f ∈ C (U), j f = f|W . Hence, by Exercise 7.1,

d(f|W ) = (df)|W . Therefore X  X (dU α)|W = daI dxI |W = daI |W ∧ dxI |W X X  = d(aI |W ) ∧ dxI |W = dW aI |W dxI |W

= d_W(α|_W).
To prove (4) it's enough to show that for any a_I dx_I ∈ Ω^k(U) and any b_J dx_J ∈ Ω^l(U)
(9.3)  d_U(a_I dx_I ∧ b_J dx_J) = d_U(a_I dx_I) ∧ b_J dx_J + (−1)^k a_I dx_I ∧ d_U(b_J dx_J).
We compute:

dU (aI dxI ∧ bJ dxJ ) = dU (aI bJ dxI ∧ dxJ )

= d(aI bJ ) ∧ dxI ∧ dxJ

= (bJ daI + aI dbJ ) ∧ dxI ∧ dxJ k = daI ∧ dxI ∧ bJ dxJ + (−1) aI dxI ∧ dbJ ∧ dxJ k = dU (aI dxI ) ∧ bJ dxJ + (−1) (aI dxI ) ∧ dU (bJ dxJ ). k This proves (4). Similarly, if α = aI dxI ∈ Ω (U) then

d_U(d_U α) = d_U(da_I ∧ dx_I)
= d_U(Σ_{i=1}^m (∂a_I/∂x_i) dx_i ∧ dx_I)
= Σ_{i,j} (∂²a_I/∂x_j ∂x_i) dx_j ∧ dx_i ∧ dx_I.

Now, for i = j, dx_i ∧ dx_j = 0, so we are only summing over indices i and j with i ≠ j. Each unordered pair {i, j} with i ≠ j contributes two terms to the sum: (∂²a_I/∂x_j ∂x_i) dx_j ∧ dx_i and (∂²a_I/∂x_i ∂x_j) dx_i ∧ dx_j. These two terms cancel since dx_j ∧ dx_i = −dx_i ∧ dx_j while the mixed partials commute. Therefore

dU (dU α) = 0. By linearity this is true for all k forms on the coordinate patch U. This proves property (5) and we are done. 

Notation. From now on we drop the subscript M from d_M and simply write d instead.

Example 9.3. The exterior derivative of a form is easy to compute: let α = dz + x dy be a 1-form on R^3. Then
dα = d(dz) + d(x dy) = 0 + dx ∧ dy = dx ∧ dy.
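For 1-forms on R³ the coordinate formula (9.1) is easy to code directly. The helper below is an illustrative sketch (not from the text) that returns the three coefficients of dα and reproduces the example α = dz + x dy.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def d_of_one_form(a, b, c):
    """Return the (dx∧dy, dx∧dz, dy∧dz) coefficients of d(a dx + b dy + c dz)."""
    dxdy = sp.diff(b, x) - sp.diff(a, y)
    dxdz = sp.diff(c, x) - sp.diff(a, z)
    dydz = sp.diff(c, y) - sp.diff(b, z)
    return dxdy, dxdz, dydz

# α = dz + x dy, i.e. (a, b, c) = (0, x, 1):
assert d_of_one_form(0, x, 1) == (1, 0, 0)     # dα = dx ∧ dy
```

As a bonus, d ∘ d = 0 shows up concretely: applying the helper to the exact 1-form d(xyz) = yz dx + xz dy + xy dz returns (0, 0, 0).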

(ι(X)α)q = ι(Xq)αq. ∗ Here, on the right we are contracting a vector Xq ∈ TqM with αq ∈ Λ((TqM) ). In particular, if α is a 1-form, ι(X)α = α(X). And again, if α is a 0-form, then ι(X)α = 0 (and the space of (−1)-forms is 0). ∗ 2 ∗ Example 9.4. Suppose l1, l2 ∈ V , so that l1 ∧ l2 ∈ Λ (V ). Let u ∈ V be a vector. Then, for any v ∈ V ,

(ι(u)(l1 ∧ l2))(v) = (l1 ∧ l2)(u, v)

= l1(u)l2(v) − l1(v)l2(u)

= (l1(u)l2 − l2(u)l1)(v). Hence ι(u)(l1 ∧ l2) = l1(u)l2 − l2(u)l1. 60 This example suggests a general way of computing contractions.
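The two-covector case just computed can be checked numerically by representing covectors l as vectors in R³ via l(v) = l · v and evaluating wedges as determinants of the matrices (l_i(v_j)), as in the text. The random vectors below are hypothetical test data.

```python
import numpy as np

def wedge_eval(ls, vs):
    """Evaluate (l1 ∧ ... ∧ lk)(v1, ..., vk) = det of the matrix (li(vj))."""
    return np.linalg.det(np.array([[l @ v for v in vs] for l in ls]))

rng = np.random.default_rng(1)
l1, l2, u, v = (rng.standard_normal(3) for _ in range(4))

lhs = wedge_eval([l1, l2], [u, v])                 # (ι(u)(l1 ∧ l2))(v)
rhs = (l1 @ u) * (l2 @ v) - (l2 @ u) * (l1 @ v)    # (l1(u) l2 − l2(u) l1)(v)
assert np.isclose(lhs, rhs)
```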

∗ Lemma 9.5. If l1, . . . , lk ∈ V , u ∈ V , then k X j−1 ι(u)(l1 ∧ ... ∧ lk) = (−1) (ι(u)lj)(l1 ∧ ... ∧ lbj ∧ ... ∧ lk), j=1 where lbj means that lj is omitted from the expression.

Proof. For any k − 1 vectors v1, . . . , vk−1 ∈ V ,   l1(u) l1(v1) . . . l1(vk−1)  . .  (ι(u)l1 ∧ ... ∧ lk)(v1, . . . , vk−1) = det  . .  lk(u) lk(v1) . . . lk(vk−1) k X j−1 = (−1) lj(u) det Aj j=1 k X j−1 = (−1) lj(u)(l1 ∧ ... ∧ lbj ∧ ... ∧ lk)(v1, . . . , vk−1), j=1 where Aj is the matrix obtained from the matrix   l1(u) l1(v1) . . . l1(vk−1)  . .   . .  lk(u) lk(v1) . . . lk(vk−1) by deleting the first column and jth row.  Corollary 9.5.1. Let V be a vector spaces, u ∈ V a vector and α ∈ Λr(V ∗) and β ∈ Λs(V ∗) be two exterior forms. Then ι(u)(α ∧ β) = (ι(u)α) ∧ β + (−1)rα ∧ (ι(u)β).

∗ Proof. It’s enough to consider the case of α = l1 ∧ ... ∧ lr and β = lr+1 ∧ ... ∧ lr+s for some l1, . . . , lr+s ∈ V . Then

ι(u)(α ∧ β) = ι(u))(l1 ∧ ... ∧ lr ∧ lr+1 ∧ ... ∧ lr+s) r+s X j−1 = (−1) (ι(u)lj)(l1 ∧ ... ∧ lbj ∧ ... ∧ lr+s) j=1

 r  X j−1 =  (−1) (ι(u)lj)(l1 ∧ ... ∧ lbj ∧ lr ∧ lr+1 ∧ ... ∧ lr+s j=1

 r+s  X j−1 + l1 ∧ ... ∧ lr ∧  (−1) (ι(u)lj)(lr+1 ∧ ... ∧ lbj ∧ lr+s) j=r+1

 r+s  X r j0−1 = (ι(u)α) ∧ β + α ∧  (−1) (−1) (ι(u)lj0+r)(lr+1 ∧ ... ∧ blj0+r ∧ ... ∧ lr+s j0=1 = (ι(u)α) ∧ β + (−1)rα ∧ (ι(u)β).

 Corollary 9.5.2. Let M be a manifold, X ∈ Γ(TM) a vector field, and α ∈ Ω^r(M) and β ∈ Ω^s(M) two differential forms. Then
ι(X)(α ∧ β) = (ι(X)α) ∧ β + (−1)^r α ∧ (ι(X)β).

Example 9.6. Let W = x ∂/∂x + y ∂/∂y + z ∂/∂z be a vector field on R^3 and let ω = dx ∧ dy ∧ dz (ω is the standard volume form on R^3). Then
ι(W)ω = ι(W)(dx ∧ dy ∧ dz) = dx(W) dy ∧ dz − dy(W) dx ∧ dz + dz(W) dx ∧ dy = x dy ∧ dz − y dx ∧ dz + z dx ∧ dy.
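A numerical spot-check of a contraction of this type, with a hypothetical evaluation point: for W = x ∂/∂x + y ∂/∂y + z ∂/∂z and ω = dx ∧ dy ∧ dz, the coefficients of ι(W)ω in the basis dy∧dz, dx∧dz, dx∧dy at a point (x, y, z) should be (x, −y, z).

```python
import numpy as np

def contract_volume(w):
    """ι(w)(dx∧dy∧dz) as (dy∧dz, dx∧dz, dx∧dy) coefficients."""
    e = np.eye(3)
    vol = lambda a, b, c: np.linalg.det(np.column_stack([a, b, c]))
    # coefficient of dy∧dz is (ι(w)ω)(e2, e3) = ω(w, e2, e3), etc.
    return (vol(w, e[1], e[2]), vol(w, e[0], e[2]), vol(w, e[0], e[1]))

q = np.array([2.0, -3.0, 5.0])        # components of W at the point q = (x, y, z)
coeffs = contract_volume(q)
assert np.allclose(coeffs, [q[0], -q[1], q[2]])   # x dy∧dz − y dx∧dz + z dx∧dy
```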

Exercise 9.1. In R3, the standard inner product (·, ·) defines an isomorphism R3 → (R3)∗, v 7→ (v, ·), which in turn induces an isomorphism of spaces of sections 3 1 3 A : Γ(T R ) → Ω (R ),A(X) = (X, ·). 3 2 3 ∗ The standard volume form µ = dx1 ∧dx2 ∧dx3 defines an isomorphism R → Λ ((R ) ) by v 7→ ι(v)µ, which also induces an isomorphism 3 2 3 B : Γ(T R ) 7→ Ω (R ) B(X) = ι(X)µ. Finally, the map ∞ 3 3 3 C : C (R ) → Ω (R ) C(f) = fµ is also an isomorphism. (Check these facts!) Show that the standard vector calculus notions of div, grad, and curl can be defined as (1) grad(f) = A−1(df) for any smooth function f on R3. (2) curl(X) = B−1(d(A(X))) for any vector field X on R3. (3) div(X) = C−1(d(B(X))) for any vector field X on R3. 9.3. Lie derivatives of differential forms. In order to understand divergence of a vector field on a manifold we need to define Lie derivatives of differential forms. This is fairly easy to do, but then the definition is hard to compute with. Cartan’s formula makes computation of Lie derivatives of forms easy, but it requires understanding of the interaction between exterior differentiation and pull-backs. Which is why we address the pull-backs first. Lemma 9.7. Exterior differentiation d commutes with pull-backs. That is to say, let F : M → N be a smooth map between two manifold and α ∈ Ω∗(N) a differential form. Then (9.4) d(F ∗α) = F ∗(dα). Proof. We know that equation (9.4) holds if α is a zero-form, that is, a function (cf. Exercise 7.1). n We now argue that for any coordinate chart (x1, . . . , xn): U → R on N and for any k-form α on N, k > 1, we have ∗ ∗  (9.5) (F (dα))|F −1(U) = d F α|F −1(U) . ∗ ∗ P Equation 9.5 is enough to prove the lemma. Now, (F dα)|F −1(U) = F (dα|U ) and α|U = |I|=k aI dxI for ∞ all multi-indices I of size k and some functions aI ∈ C (U). Therefore X dα|U = daI ∧ dxI , I and ∗ X ∗ ∗ (F (dα))|F −1(U) = F daI ∧ F dxI . I Since (9.4) holds for functions, ∗ ∗ F daI = d(F aI ). 
Similarly, ∗ ∗ ∗ F dxI = d(F xi1 ) ∧ ... ∧ d(F xik ) for all I = (i1, . . . , ik). Therefore

∗ X ∗ ∗ ∗ −1 (F (dα))|F (U) = d(F aI ) ∧ d(F xi1 ) ∧ ... ∧ d(F xik ) I 62 ∗  We now argue that the right hand side of the equation above is d (F α)|F −1(U) . Properties (4) and (5) of the exterior derivative d and induction on k shows that for any k functions f1, . . . fk,

d(df1 ∧ ... ∧ dfk) = 0.

Hence for any functions f0, f1, . . . , fk,

d(f0df1 ∧ ... ∧ dfk) = df0 ∧ df1 ∧ ... ∧ dfk. In particular,

∗ ∗ ∗  ∗ ∗ ∗  ∗ d(F aI ) ∧ d(F xi1 ) ∧ ... ∧ d(F xik ) = d F aI d(F xi1 ) ∧ ... ∧ d(F xik ) = d (F (aI dxi1 ∧ ... ∧ dxik )) .

Therefore, ! ∗ X  ∗  ∗ X ∗ ∗ −1 −1 (F (dα))|F (U) = d F (aI dxi1 ∧ ... ∧ dxik ) = d(F aI dxI ) = d (F (α|U )) = d(F α|F (U)) I I and we are done. 

k Definition 9.8. Let X be a vector field on a manifold M and ω ∈ Ω (M) a k-form. Let φt denote the local flow of X. The Lie derivative LX ω of ω with respect to X is defined by

(L_X ω)_q = (d/dt)|_{t=0} (φ_t^* ω)_q for any point q ∈ M.

∞ Note that by definition of the flow φt, the Lie derivative of a 0-form f ∈ C (M) is

(L_X f)(q) = (d/dt)|_{t=0} (φ_t^* f)(q) = (d/dt)|_{t=0} (f ∘ φ_t)(q) = X_q(f).
As was mentioned above, the goal of this subsection is to prove Cartan's formula for Lie derivatives.
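The formula (L_X f)(q) = X_q(f) can be verified symbolically for a simple flow. The field X = x ∂/∂x on R, whose flow is φ_t(x) = e^t x, and the test function below are hypothetical choices for illustration.

```python
import sympy as sp

x, t = sp.symbols('x t')
f = sp.sin(x)                    # an arbitrary test function
flow = sp.exp(t) * x             # φ_t(x) = e^t x, the flow of X = x ∂/∂x

lie = sp.diff(f.subs(x, flow), t).subs(t, 0)   # d/dt|_{t=0} (f ∘ φ_t)
Xf = x * sp.diff(f, x)                         # X(f) = x f'(x)
assert sp.simplify(lie - Xf) == 0
```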

Theorem 9.9 (Cartan's Formula). Suppose that X is a vector field on a manifold M and ω ∈ Ω^*(M) a differential form. Then

L_X ω = d(ι(X)ω) + ι(X)dω.

We prove the theorem in a sequence of lemmas.
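Before diving into the lemmas, the formula can be sanity-checked symbolically in a case where the flow is explicit. The 1-form coefficients a, b below are illustrative choices; the vector field X = x ∂/∂x + y ∂/∂y on R^2 has flow φ_t(x, y) = (e^t x, e^t y), so the definition of L_X ω can be computed directly and compared with Cartan's right-hand side.

```python
import sympy as sp

x, y, t = sp.symbols('x y t')

# Illustrative 1-form ω = a dx + b dy and field X = x ∂/∂x + y ∂/∂y,
# whose flow φ_t(x, y) = (eᵗx, eᵗy) is explicit.
a, b = x*y, x**2 + y
phi = (sp.exp(t)*x, sp.exp(t)*y)

# φ_t*ω: since ∂φ¹/∂y = ∂φ²/∂x = 0 here, only diagonal terms survive.
pa = a.subs({x: phi[0], y: phi[1]}) * sp.diff(phi[0], x)  # dx-component
pb = b.subs({x: phi[0], y: phi[1]}) * sp.diff(phi[1], y)  # dy-component

# Left-hand side: L_X ω = d/dt|₀ φ_t*ω, componentwise
lhs = (sp.diff(pa, t).subs(t, 0), sp.diff(pb, t).subs(t, 0))

# Right-hand side: d(ι(X)ω) + ι(X)dω
X1, X2 = x, y
iXw = X1*a + X2*b                           # ι(X)ω, a function
d_iXw = (sp.diff(iXw, x), sp.diff(iXw, y))  # d(ι(X)ω)
c = sp.diff(b, x) - sp.diff(a, y)           # dω = c dx ∧ dy
iXdw = (-c*X2, c*X1)                        # ι(X)(c dx∧dy) = c(X¹ dy − X² dx)
rhs = (d_iXw[0] + iXdw[0], d_iXw[1] + iXdw[1])

assert all(sp.simplify(l - r) == 0 for l, r in zip(lhs, rhs))
```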

Lemma 9.10. Let X be a vector field on a manifold M. The Lie derivative L_X is a derivation on the space of forms Ω^*(M) which commutes with the exterior differentiation d. That is to say,
(1) L_X : Ω^*(M) → Ω^*(M) is R-linear;
(2) L_X(ω ∧ η) = (L_X ω) ∧ η + ω ∧ (L_X η) for all ω, η ∈ Ω^*(M);
(3) L_X(dω) = d(L_X ω) for all ω ∈ Ω^*(M).

Proof. The first property of the Lie derivative is easy to see: pull-backs and differentiation are both linear. Let us prove (2). Since pull-back respects exterior multiplication,

d/dt|_{t=0} (φ_t^*(ω ∧ η)) = d/dt|_{t=0} ((φ_t^*ω) ∧ (φ_t^*η)).

Since exterior multiplication is bilinear,

d/dt|_{t=0} ((φ_t^*ω) ∧ (φ_t^*η)) = (d/dt|_{t=0} φ_t^*ω) ∧ (φ_0^*η) + (φ_0^*ω) ∧ (d/dt|_{t=0} φ_t^*η) = (L_X ω) ∧ η + ω ∧ (L_X η).

This proves that the Lie derivative is a derivation. We now prove that it commutes with the exterior differentiation. For any form ω,

L_X(dω) = d/dt|_{t=0} (φ_t^*(dω)) = d/dt|_{t=0} d(φ_t^*ω) = d(d/dt|_{t=0} φ_t^*ω)  (since mixed partials commute)
= d(L_X ω). □

Lemma 9.11. Let X be a vector field on a manifold M. Let Q = dι(X) + ι(X)d : Ω^*(M) → Ω^*(M). The operator Q is also a derivation on the space of forms Ω^*(M) which commutes with the exterior differentiation d.

Proof. It's clear that Q is R-linear. We check that Q commutes with d:

Q ∘ d = dι(X)d + ι(X)dd = dι(X)d  (since d ∘ d = 0)
= ddι(X) + dι(X)d = d ∘ Q.

Now we need to check that Q is a derivation. Accordingly, let ω ∈ Ω^k(M), η ∈ Ω^l(M) be two forms on M. Then

Q(ω ∧ η) = d(ι(X)(ω ∧ η)) + ι(X)(d(ω ∧ η))
= d[(ι(X)ω) ∧ η + (−1)^k ω ∧ (ι(X)η)] + ι(X)[dω ∧ η + (−1)^k ω ∧ dη]
= d(ι(X)ω) ∧ η + (−1)^{k−1}(ι(X)ω) ∧ dη + (−1)^k dω ∧ ι(X)η + (−1)^k(−1)^k ω ∧ d(ι(X)η)
  + (ι(X)dω) ∧ η + (−1)^{k+1} dω ∧ ι(X)η + (−1)^k(ι(X)ω) ∧ dη + (−1)^k(−1)^k ω ∧ ι(X)dη
= Q(ω) ∧ η + ω ∧ Q(η).

□

Proof of Cartan's formula. If f ∈ Ω^0(M) is a function, then ι(X)f = 0 by definition. Hence Q(f) = (ι(X)d)f = ι(X)df = df(X), while

L_X f = X(f) = df(X).

We conclude that L_X and Q agree on functions. To prove Cartan's formula it is enough to prove

(L_X ω)|_U = (Qω)|_U,

where (x_1, …, x_m) : U → R^m is a coordinate chart. But both L_X and Q commute with restrictions, so it's enough to prove that

L_X(ω|_U) = Q(ω|_U).

Thus, we may further assume that ω = a_I dx_I = a_I dx_{i_1} ∧ … ∧ dx_{i_k} for some function a_I and multi-index I. Both the Lie derivative L_X and Q are derivations that commute with d and agree on functions, so

L_X(a_I dx_{i_1} ∧ … ∧ dx_{i_k}) = (L_X a_I) dx_{i_1} ∧ … ∧ dx_{i_k} + Σ_j a_I dx_{i_1} ∧ … ∧ d(L_X x_{i_j}) ∧ … ∧ dx_{i_k}
= (Q a_I) dx_{i_1} ∧ … ∧ dx_{i_k} + Σ_j a_I dx_{i_1} ∧ … ∧ d(Q x_{i_j}) ∧ … ∧ dx_{i_k}
= Q(a_I dx_{i_1} ∧ … ∧ dx_{i_k}). □

Exercise 9.2. Let M be an orientable m-dimensional manifold and µ ∈ Ω^m(M) a nowhere zero form of top degree. Show that for any vector field X on M the Lie derivative L_X µ satisfies

L_X µ = fµ

for some function f ∈ C^∞(M), which depends on X. We define the divergence of X with respect to µ to be this function f and denote it by div_µ(X). Thus,

L_X µ = div_µ(X) µ.

Show that for M = R^m and µ = dx_1 ∧ … ∧ dx_m,

div_µ(Σ_i v_i ∂/∂x_i) = Σ_i ∂v_i/∂x_i.

Exercise 9.3. Consider polar coordinates (r, θ) on R^2. The "function" θ is defined up to a constant. Show that dθ is a well-defined 1-form on R^2 − {0} and that

dθ = (x dy − y dx)/(x^2 + y^2).

Exercise 9.4. (1) Consider the two-form ω = x_1 dx_2 ∧ dx_3 + x_2 dx_3 ∧ dx_1 + x_3 dx_1 ∧ dx_2 in R^3. Compute dω.
(2) Compute ι(Σ_{i=1}^3 x_i ∂/∂x_i) dx_1 ∧ dx_2 ∧ dx_3.
(3) Compute L_X(dx_1 ∧ dx_2 ∧ dx_3), where X = Σ_{i=1}^3 x_i ∂/∂x_i.

Exercise 9.5. Consider k : R^2 → R^2 given by (u, v) ↦ (u^2 + 1, uv). Compute k^*((xy − y) dx ∧ dy).

Exercise 9.6. Let X and Y be vector fields and α a 1-form on a manifold M. Prove that
(1) L_X(ι(Y)α) = ι(Y)(L_X α) + α(L_X Y).
(2) Using (1), show that dα(X, Y) = X(α(Y)) − Y(α(X)) − α([X, Y]).

9.4. de Rham cohomology. One of the most interesting applications of Cartan's formula is the proof of smooth homotopy invariance of de Rham cohomology. We start by defining de Rham cohomology.

Definition 9.12. Let M be a manifold. A form α ∈ Ω^k(M) is closed if dα = 0. A form β ∈ Ω^k(M) is exact if there is a (k−1)-form γ with β = dγ.

Note that since d^2 = 0, any exact form is closed. The converse need not be true. The difference between the spaces of closed and exact forms is measured by the de Rham cohomology.

Definition 9.13. Let M be a manifold. The kth de Rham cohomology H^k(M) is defined by

H^k(M) := {closed k-forms}/{exact k-forms} = ker(d : Ω^k(M) → Ω^{k+1}(M)) / im(d : Ω^{k−1}(M) → Ω^k(M)).

H^k(M) is a vector space over the reals. Thus H^k(M) is the space of equivalence classes [α] of closed k-forms: two closed k-forms α and α′ are equivalent if and only if α − α′ = dγ for some (k−1)-form γ.

Remark 9.14. By definition Ω^{−1}(M) = 0, so

H^0(M) = {f ∈ C^∞(M) | df = 0} = {locally constant functions on M} = R^k,

where k is the number of connected components of M. In particular H^0(point) = R.

Definition 9.15.
We define the de Rham cohomology H^*(M) to be the direct sum of the de Rham cohomology groups:

H^*(M) := H^0(M) ⊕ ⋯ ⊕ H^k(M) ⊕ ⋯

It takes a bit of work to compute the de Rham cohomology of just about anything. Here is an important, but not very exciting, example of a computation directly from the definition.

Example 9.16. Let M be a connected zero-dimensional manifold, that is, a single point. Then Ω^k(M) = 0 for k > 0. Hence H^k(M) = 0 for k > 0. On the other hand H^0(M) = R since a point has one connected component.

Lemma 9.17. The de Rham cohomology H^*(M) has a well-defined multiplication given by

[α] ∧ [β] := [α ∧ β],

which makes H^*(M) into a ring.

Proof. We need to show that the space of exact forms is an ideal in the algebra of closed forms. That is, if dα = 0 then dβ ∧ α is exact for any β. But d(β ∧ α) = dβ ∧ α ± β ∧ dα = dβ ∧ α + 0, and we are done. □

Lemma 9.18. Let F : M → N be a smooth map. Then for each k the pull-back map F^* : Ω^*(N) → Ω^*(M) gives rise to a well-defined ring homomorphism

F^* : H^*(N) → H^*(M),  F^*[α] := [F^*α].

Moreover, if id_M : M → M is the identity map then id_M^* : H^*(M) → H^*(M) is also the identity map. Additionally, for any two maps F : M → N, G : N → Z we have (G ∘ F)^* = F^* ∘ G^*.

Proof. If dα = 0, then dF^*α = F^*dα = F^*0 = 0. Therefore F^* maps closed forms to closed forms. For the same reason, F^* maps exact forms to exact forms. Consequently the pull-back on forms gives rise to a well-defined pull-back of cohomology classes. Since F^*(α ∧ β) = (F^*α) ∧ (F^*β), the map on cohomology is a ring homomorphism. The rest of the lemma is left as an exercise. □

Definition 9.19 (Homotopy). Two maps f_0, f_1 : M → N of manifolds are (smoothly) homotopic if there is a smooth map F : (a, b) × M → N, where (a, b) is an open interval containing [0, 1], so that

F(0, x) = f_0(x) for all x ∈ M

and

F(1, x) = f_1(x) for all x ∈ M.

Example 9.20. The maps f_1 : R^n → R^n, f_1(x) = x, and f_0 : R^n → R^n, f_0(x) = 0, are smoothly homotopic: let F(t, x) = tx.

Lemma 9.21 (homotopy invariance of de Rham cohomology). If two smooth maps f_0, f_1 : M → N are homotopic, then f_0^*, f_1^* : H^*(N) → H^*(M) are the same map:

f_0^*[α] = f_1^*[α] for all [α] ∈ H^*(N).

To prove Lemma 9.21, we need the following simple observation.

Lemma 9.22. Let {φ_t} denote the flow of a vector field X on a manifold M. For any k-form α on M,

d/dt|_{t=τ} φ_t^*α = φ_τ^*(L_X α).

Proof. For any map f : M → M,

d/dt|_{t=0} (f^* φ_t^*α) = f^*(d/dt|_{t=0} φ_t^*α),

since for any point q ∈ M the map Λ^k((df_q)^*) : Λ^k(T^*_{f(q)}M) → Λ^k(T^*_q M) is linear. Therefore

d/dt|_{t=τ} φ_t^*α = d/dt|_{t=0} φ_{τ+t}^*α = d/dt|_{t=0} φ_τ^*(φ_t^*α) = φ_τ^*(d/dt|_{t=0} φ_t^*α) = φ_τ^*(L_X α). □

Proof of Lemma 9.21. Let F : (a, b) × M → N denote the homotopy between f_0 and f_1. It is no loss of generality to assume that the interval (a, b) is all of R. (If (a, b) is not all of R, let ρ : R → [0, 1] be a smooth function with supp ρ ⊂ (a, b) and ρ|_{[0,1]} = 1. Define the map F̄ : R × M → N by

F̄(t, x) = F(ρ(t)t, x) for t ∈ (a, b),  F̄(t, x) = F(0, x) for t ∉ (a, b).

The map F̄ is a homotopy between f_0 and f_1.) Let i_0 : M ↪ R × M denote the embedding given by

i_0(x) = (0, x)

and let φ_t : R × M → R × M be given by

φ_t(s, x) = (s + t, x).

Then f_1 = F ∘ φ_1 ∘ i_0 and f_0 = F ∘ φ_0 ∘ i_0. Therefore, since f_t^* = i_0^* ∘ φ_t^* ∘ F^*, the maps f_1 and f_0 are the same on cohomology if and only if φ_1, φ_0 : R × M → R × M induce the same map in cohomology. The collection of maps {φ_t} is the flow of the vector field X = ∂/∂t on R × M. Therefore, for any k-form α ∈ Ω^k(R × M),

φ_1^*α − φ_0^*α = ∫_0^1 d/dt(φ_t^*α) dt
= ∫_0^1 φ_t^*(L_X α) dt = ∫_0^1 φ_t^*((dι(X) + ι(X)d)α) dt
= d(∫_0^1 φ_t^*(ι(X)α) dt) + ∫_0^1 φ_t^*(ι(X)dα) dt
= dκ(α) + κ(dα),

where

κ(α) := ∫_0^1 φ_t^*(ι(X)α) dt.

Therefore, for any α ∈ Ω^k(R × M) with dα = 0,

φ_1^*α − φ_0^*α = d(κ(α)).

Hence

[φ_1^*α] = [φ_0^*α]

and we are done. □

Corollary 9.22.1 (Poincaré lemma).

H^k(R^n) = R for k = 0, and H^k(R^n) = 0 for k > 0.

Proof. Let ı : {0} → R^n be the inclusion and p : R^n → {0} be the map that sends every point to 0. We want to show that p^* : H^*({0}) → H^*(R^n) is an isomorphism. It's enough to show that ı^* : H^*(R^n) → H^*({0}) and p^* are inverses of each other. Define f_t : R^n → R^n by

f_t(x) = tx.

The map F(t, x) = f_t(x) is a homotopy between f_0 and f_1. Hence f_0^* = f_1^* as maps on H^*(R^n). Moreover,

p ∘ ı = id_{{0}} and ı ∘ p = f_0.

Therefore

id_{H^*({0})} = (p ∘ ı)^* = ı^* ∘ p^*

and

id_{H^*(R^n)} = f_1^* = f_0^* = (ı ∘ p)^* = p^* ∘ ı^*.

Therefore p^* and ı^* are inverses of each other, and H^*({0}) and H^*(R^n) are isomorphic. □

10. Stokes's theorem

There are two slightly different (but equivalent) ways of stating Stokes's theorem: for manifolds with boundary and for regular domains. Recall that manifolds are locally homeomorphic to open subsets of R^n. Manifolds with boundary are locally homeomorphic to open subsets of the half-space

H^n := {x ∈ R^n | x_1 ≤ 0}.

Technically it is slightly easier to work with regular domains, which is what we will do. Any regular domain is a manifold with boundary. And conversely, any manifold with boundary is a regular domain in some larger manifold. We will not prove the last two statements.

Definition 10.1. Let M be a manifold of dimension m. A closed subset D ⊂ M is a regular domain (or alternatively, a domain with smooth boundary) if for any point p ∈ D there is a coordinate chart φ = (x_1, …, x_m) : U → R^m on M such that p ∈ U and

φ(U ∩ D) = φ(U) ∩ {x ∈ R^m | x_1 ≤ 0} = φ(U) ∩ H^m.

Such a chart φ is said to be adapted to the domain D.

Example 10.2. The unit disk

D = {(x, y) ∈ R^2 | x^2 + y^2 ≤ 1}

is a regular domain in R^2. For example, if p = (1, 0), we may take the open set U = {(x, y) | x > 0} and φ(x, y) = (x − √(1 − y^2), y).
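The chart of Example 10.2 can be checked symbolically: on U = {x > 0}, the disk condition x^2 + y^2 ≤ 1 is equivalent to the half-space condition x − √(1 − y^2) ≤ 0, because the two quantities differ by a factor that is positive on U. A minimal sketch (variable names are ours):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)

# First component of the adapted chart from Example 10.2 on U = {x > 0}
phi1 = x - sp.sqrt(1 - y**2)

# x² + y² − 1 factors as (x − √(1−y²)) · (x + √(1−y²)), and the second
# factor is strictly positive on U, so {φ₁ ≤ 0} ∩ U = D ∩ U.
factorization = sp.expand(phi1 * (x + sp.sqrt(1 - y**2)))
assert sp.simplify(factorization - (x**2 + y**2 - 1)) == 0
```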

Recall that the interior int(Y) of a subset Y of a topological space X is the union of all open subsets of X which are contained in Y. We define the boundary ∂D of a regular domain D in a manifold M to be the set of points in D that are not in the interior of D. Alternatively, q ∈ ∂D if and only if any open set containing q contains points of D and points of M ∖ D.

Lemma 10.3. Let D be a regular domain in a manifold M. The boundary ∂D is a submanifold of M of codimension 1.

Proof. Let φ : U → R^m be a chart adapted to D. Then φ(U ∩ int(D)) ⊂ {x ∈ R^m | x_1 < 0} and φ(U ∩ ∂D) ⊂ {x ∈ R^m | x_1 = 0}. If ψ : V → R^m is another coordinate chart adapted to D, then ψ also maps V ∩ int(D) to an open subset of {x ∈ R^m | x_1 < 0} and V ∩ ∂D to a subset of {x ∈ R^m | x_1 = 0}. Therefore ψ ∘ φ^{−1} : φ(U ∩ V) → ψ(U ∩ V) maps φ(U ∩ V ∩ ∂D) = φ(U ∩ V) ∩ {x_1 = 0} smoothly to ψ(U ∩ V) ∩ {x_1 = 0}. It follows that the collection of charts {φ_α : U_α → R^m} of M which are adapted to D gives rise to an atlas

{φ_α|_{∂D} : ∂D ∩ U_α → {x_1 = 0} ≃ R^{m−1}}

on ∂D. □

Lemma 10.4. If D is a regular domain in an orientable manifold M then int(D) and ∂D are orientable.

Proof. An open subset of an orientable manifold is orientable. Hence int(D) is orientable. We now address the orientability of ∂D. If φ : U → R^m and ψ : V → R^m are two charts adapted to D, then ψ ∘ φ^{−1} maps φ(U ∩ V) ∩ {x_1 < 0} to ψ(U ∩ V) ∩ {x_1 < 0}. Therefore, at the points of φ(U ∩ V) ∩ {x_1 = 0} the differential d(ψ ∘ φ^{−1}) maps the vectors that point into {x_1 < 0} to vectors that point into {x_1 < 0}. In other words, at a point q = (0, x_2, …, x_m) the differential has the block form

d(ψ ∘ φ^{−1})_q = ( first row (a, 0, …, 0); lower-right (m−1)×(m−1) block d((ψ ∘ φ^{−1})|_{{0}×R^{m−1}}) )

for some smooth function a = a(x_2, …, x_m) > 0. Hence if det d(ψ ∘ φ^{−1})_q > 0 then det d((ψ ∘ φ^{−1})|_{{0}×R^{m−1}})_q > 0 as well. Therefore, if M is orientable, then so is the boundary ∂D of a regular domain D in M. □

We rephrase the lemma above in terms of volume forms (cf. Proposition 7.16).

Proposition 10.5. Let D be a regular domain in a manifold M and µ a non-vanishing top form on M. Then there is a vector field N defined on M near ∂D which points out of D. Moreover,

ν = (ι(N)µ)|∂D is an orientation on ∂D.

Proof. If D = {x ∈ R^m : x_1 ≤ 0}, take N = ∂/∂x_1. In general, cover ∂D by adapted charts {φ_i : U_i → R^m} (since all our manifolds are second countable, by passing to a subcover we may assume that the cover is countable). On each U_i there is a vector field N_i ∈ Γ(TU_i) such that N_i points outward. Pick a partition of unity {ρ_i} subordinate to {U_i}, and let N = Σ ρ_i N_i. The vector field N defined on ∪U_i is the desired vector field. Note that in adapted coordinates (x_1, …, x_m) : U_i → R^m it has the form

N = b_1 ∂/∂x_1 + ⋯ + b_m ∂/∂x_m

for some functions b_1, …, b_m with b_1 > 0. We now argue that ν := (ι(N)µ)|_{∂D} is a nowhere vanishing form on ∂D. We argue in adapted coordinates (x_1, …, x_m). The form µ satisfies

µ = f(x_1, …, x_m) dx_1 ∧ … ∧ dx_m

for some nowhere zero function f. Consequently

ι(N)µ = Σ_{j=1}^m (−1)^{j−1} f b_j dx_1 ∧ … ∧ d̂x_j ∧ … ∧ dx_m

(recall that d̂x_j means that dx_j is omitted). Since for j > 1

(dx_1 ∧ … ∧ d̂x_j ∧ … ∧ dx_m)|_{x_1=0} = 0,

we have

(ι(N)µ)|_{∂D} = (f b_1)|_{x_1=0} dx_2 ∧ … ∧ dx_m

with f b_1 ≠ 0. □

Definition 10.6. Let M be an oriented manifold and D ⊂ M a regular domain. We will refer to the orientation on the boundary ∂D defined by the orientation of M as in Proposition 10.5 above as the induced orientation.

Theorem 10.7 (Stokes's Theorem). Let M be an oriented m-dimensional manifold, D ⊂ M a regular domain, and ω ∈ Ω_c^{m−1}(M) a compactly supported form of degree one less than the dimension of M. Then

∫_{int(D)} dω = ∫_{∂D} ω|_{∂D}.

Here int(D) and ∂D are both given the orientation induced by the one on M.

Proof. First, consider the case M = R^m and D = {x ∈ R^m | x_1 ≤ 0}. It doesn't matter what orientation we choose on M; we just have to be consistent in orienting int(D) and ∂D. Choose the orientation on R^m defined by the standard volume form µ = dx_1 ∧ … ∧ dx_m. Let N = ∂/∂x_1, so that (ι(N)µ)|_{∂D} = dx_2 ∧ … ∧ dx_m. Let ω ∈ Ω_c^{m−1}(R^m) be a compactly supported form. Then

ω = Σ_j (−1)^{j−1} f_j dx_1 ∧ … ∧ d̂x_j ∧ … ∧ dx_m

for some compactly supported functions f_j. Note that

ω|_{∂D} = (Σ_j (−1)^{j−1} f_j dx_1 ∧ … ∧ d̂x_j ∧ … ∧ dx_m)|_{x_1=0} = f_1(0, x_2, …, x_m) dx_2 ∧ … ∧ dx_m.

On the other hand,

dω = Σ_j (−1)^{j−1} (∂f_j/∂x_j) dx_j ∧ dx_1 ∧ … ∧ d̂x_j ∧ … ∧ dx_m = Σ_j (∂f_j/∂x_j) dx_1 ∧ … ∧ dx_m.

Now,

∫_D dω = Σ_j ∫_{{x_1≤0}} (∂f_j/∂x_j) dx_1 ∧ … ∧ dx_m = Σ_j ∫_{{x_1≤0}} (∂f_j/∂x_j) dx_1 … dx_m.

Since the supports of the f_j are compact, there is an R > 0 such that

supp(f_j) ⊂ {x ∈ R^m | −R ≤ x_i ≤ R for all i}

for all j. For j > 1,

∫_{{x_1≤0}} (∂f_j/∂x_j) dx_1 … dx_m = ∫_{{x_1≤0}} (∫_{−R}^R (∂f_j/∂x_j) dx_j) dx_1 … d̂x_j … dx_m = 0,

since

∫_{−R}^R (∂f_j/∂x_j) dx_j = f_j(x_1, …, x_{j−1}, R, x_{j+1}, …, x_m) − f_j(x_1, …, x_{j−1}, −R, x_{j+1}, …, x_m) = 0 − 0 = 0.

For j = 1, we have

∫_{{x_1≤0}} (∂f_1/∂x_1) dx = ∫_{R^{m−1}} (∫_{−∞}^0 (∂f_1/∂x_1) dx_1) dx_2 … dx_m = ∫_{R^{m−1}} (∫_{−R}^0 (∂f_1/∂x_1) dx_1) dx_2 … dx_m
= ∫_{R^{m−1}} (f_1(0, x_2, …, x_m) − 0) dx_2 … dx_m = ∫_{R^{m−1}} f_1(0, x_2, …, x_m) dx_2 ∧ … ∧ dx_m = ∫_{∂D} ω|_{∂D}.

Therefore

∫_{int(D)} dω = ∫_{∂D} ω|_{∂D}

in the special case M = R^m, D = {x_1 ≤ 0}.

We now consider a slightly more general case: D is a regular domain in an oriented manifold M of dimension m, φ : U → R^m a chart adapted to D, and ω ∈ Ω_c^{m−1}(M) with supp ω ⊂ U. Then

∫_{int(D)} dω = ∫_{int(D)∩U} dω = ∫_{φ(int(D))} (φ^{−1})^*(dω) = ∫_{{x_1<0}} d((φ^{−1})^*ω)
= ∫_{∂{x_1≤0}} ((φ^{−1})^*ω)|_{∂{x_1≤0}}  (here we used the special case above)
= ∫_{φ(U∩∂D)} (φ^{−1})^*ω = ∫_{U∩∂D} ω = ∫_{∂D} ω|_{∂D}.

Finally we remove the restriction on the support of ω. Let ω ∈ Ω_c^{m−1}(M) be an arbitrary compactly supported form. Cover D ∩ supp ω by finitely many charts {φ_α : U_α → R^m} adapted to the domain D and giving D its orientation (we now have to make sure that changes of coordinates between charts preserve orientation). It is no loss of generality to assume that M = ∪_α U_α (after all, we are only interested in ω|_D, and supp ω|_D ⊂ ∪_α U_α). Let {ρ_α} be a partition of unity subordinate to the cover. Then Σ ρ_α ω = ω and supp(ρ_α ω) ⊂ U_α. By the previous discussion

∫_{int(D)} d(ρ_α ω) = ∫_{∂D} ρ_α ω

for each index α. Therefore

∫_{int(D)} dω = ∫_{int(D)} Σ_α d(ρ_α ω) = Σ_α ∫_{int(D)} d(ρ_α ω) = Σ_α ∫_{∂D} ρ_α ω = ∫_{∂D} Σ_α ρ_α ω = ∫_{∂D} ω. □

Exercise 10.1. Let M be an m-dimensional compact oriented manifold, D ⊂ M a domain with smooth boundary, f ∈ C^∞(M), and ω ∈ Ω^{m−1}(M). Show that

∫_D f dω = ∫_{∂D} fω − ∫_D df ∧ ω.

Exercise 10.2. Let M be an m-dimensional oriented manifold and µ ∈ Ω^m(M) a nowhere vanishing form. Recall that for any vector field X on M,

LX µ = divµ(X) µ,

where div_µ(X) is the divergence of X with respect to µ (cf. Exercise 9.2). Show that if D ⊂ M is a regular domain then

∫_D div_µ(X) µ = ∫_{∂D} ι(X)µ

for any vector field X with compact support.

Exercise 10.3. What is the integral of x dy − y dx over ∂D, where D is the unit disk in R2 (and R2 is given the standard orientation)?
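Stokes's theorem on the disk can be checked symbolically. The sketch below uses the 1-form ω = −y dx (an illustrative choice, close in spirit to Exercise 10.3 but not the same form), for which dω = dx ∧ dy; both sides of Stokes's theorem then compute the area π of the unit disk.

```python
import sympy as sp

t, r, theta = sp.symbols('t r theta')

# ω = -y dx on R², so dω = dx ∧ dy; D = unit disk.
# Boundary circle with the induced (counterclockwise) orientation:
# γ(t) = (cos t, sin t), t ∈ [0, 2π].
xg, yg = sp.cos(t), sp.sin(t)
boundary = sp.integrate(-yg * sp.diff(xg, t), (t, 0, 2*sp.pi))

# Interior integral ∫_D dx ∧ dy in polar coordinates (Jacobian r):
interior = sp.integrate(r, (r, 0, 1), (theta, 0, 2*sp.pi))

assert sp.simplify(boundary - interior) == 0  # both equal the area of D
```

The same boundary-parametrization pattern answers Exercise 10.3 directly.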

11. Connections on vector bundles

11.1. Connections. If X is a vector field on an open subset U of R^m, then X is determined by an m-tuple (a_1, …, a_m) of functions:

X = Σ_i a_i ∂/∂x_i.

Therefore we know how to take directional derivatives of X at a point q ∈ U in the direction of a vector v ∈ T_qU = R^m — we simply differentiate the coefficients:

(D_v X)_q = Σ_i (D_v a_i)(q) ∂/∂x_i|_q,

where D_v a_i is the directional derivative of the function a_i in the direction v. Consequently we know when a vector field does not change along a curve γ:

D_{γ̇} X = 0.

Covariant derivatives generalize the directional derivatives, allowing us to differentiate vector fields on arbitrary manifolds and, more generally, sections of arbitrary vector bundles.

Definition 11.1 (Covariant derivative of sections of a vector bundle). Let π : E → M be a vector bundle. A covariant derivative (also known as a connection) is an R-bilinear map

∇ : Γ(TM) × Γ(E) → Γ(E),  (X, s) ↦ ∇_X s,

such that

(1) ∇_{fX} s = f ∇_X s;
(2) ∇_X(fs) = X(f) · s + f ∇_X s

for all f ∈ C^∞(M), all X ∈ Γ(TM), and all s ∈ Γ(E).

Example 11.2. Let U ⊂ R^m be an open set and E = TU → U the tangent bundle. Define a connection D on TU → U by

D_X(Σ_i a_i ∂/∂x_i) = Σ_i X(a_i) ∂/∂x_i.

I leave it to the reader to check that this is indeed a connection.
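The check left to the reader in Example 11.2 can be carried out symbolically for sample data. Below, vector fields on U ⊂ R^2 are represented by their component tuples, the function D implements the flat connection of the example, and the specific X, s, f are illustrative choices.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
coords = (x1, x2)

# The flat connection of Example 11.2 on T U, U ⊂ R²:
# D_X(Σ a_i ∂/∂x_i) = Σ X(a_i) ∂/∂x_i, with fields as component tuples.
def D(X, s):
    return tuple(sum(X[j]*sp.diff(a, coords[j]) for j in range(2)) for a in s)

# Illustrative data
X = (x2, x1*x2)
s = (x1**2, sp.sin(x2))
f = x1*x2
Xf = sum(X[j]*sp.diff(f, coords[j]) for j in range(2))  # X(f)

# Leibniz rule (property 2): D_X(f s) = X(f) s + f D_X s
lhs = D(X, tuple(f*a for a in s))
rhs = tuple(Xf*a + f*b for a, b in zip(s, D(X, s)))
assert all(sp.simplify(l - r) == 0 for l, r in zip(lhs, rhs))

# Tensoriality in X (property 1): D_{fX} s = f D_X s
fX = tuple(f*c for c in X)
assert all(sp.simplify(l - f*r) == 0 for l, r in zip(D(fX, s), D(X, s)))
```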

Remark 11.3. The Lie derivative (X, Y) ↦ L_X Y is not a connection on the tangent bundle (why not?).

Example 11.4. Let π : E → M be a trivial bundle of rank k. Then there exist global sections {s_1, …, s_k} of E such that {s_j(x)} is a basis for E_x for all points x ∈ M ({s_j} is a frame of E). So for any s ∈ Γ(E), we have s = Σ_j f_j s_j for some C^∞ functions f_j. We define a bilinear map ∇ : Γ(TM) × Γ(E) → Γ(E) by

∇_X s = ∇_X(Σ_j f_j s_j) := Σ_j X(f_j) s_j.

It is easy to check that ∇ is indeed a connection on E:

∇_{fX} s = ∇_{fX}(Σ_j f_j s_j) = Σ_j (fX)(f_j) s_j = f Σ_j X(f_j) s_j = f ∇_X s;

and

∇_X(fs) = ∇_X(Σ_j f f_j s_j) = Σ_j X(f f_j) s_j = X(f) Σ_j f_j s_j + f Σ_j X(f_j) s_j = X(f)s + f ∇_X s.

Lemma 11.5. Any convex linear combination of two connections on a vector bundle E → M is a connection. More precisely, let ∇^1, ∇^2 be two connections on E and ρ_1, ρ_2 ∈ C^∞(M) two functions with ρ_1 + ρ_2 = 1. Then

Γ(TM) × Γ(E) ∋ (X, s) ↦ ∇_X s := ρ_1 ∇^1_X s + ρ_2 ∇^2_X s ∈ Γ(E)

is a connection.

Proof. Exercise. Check that the two properties of the connection hold. □

As a corollary we get:

Proposition 11.6. Any vector bundle π : E → M has a connection.

Proof. Choose a cover {U_α} of M such that E|_{U_α} is trivial. Let ∇^α be a connection on E|_{U_α}, as in Example 11.4. Let {ρ_β} be a partition of unity subordinate to {U_α}. Then supp ρ_β ⊂ U_α for some α = α(β). Define a map ∇ : Γ(TM) × Γ(E) → Γ(E) by

∇_X s = Σ_β ρ_β (∇^{α(β)}_{X|_{U_{α(β)}}} s|_{U_{α(β)}}).

This is indeed a connection, since a convex linear combination of any finite number of connections is a connection — see Lemma 11.5 above. □

Proposition 11.7. Let ∇ be a connection on a vector bundle π : E → M. Then ∇ is local: for any open set U, any vector fields X and Y, and any sections s and s′ of E such that X|_U = Y|_U and s|_U = s′|_U, we have

(∇_X s)|_U = (∇_Y s′)|_U.

Proof. Since ∇ is bilinear, it is enough to show two things: (a) if X|_U = 0, then (∇_X s)|_U = 0 for any s ∈ Γ(E); and (b) if s|_U = 0, then (∇_X s)|_U = 0 for any X ∈ Γ(TM). Fix a point x_0 ∈ U. Then there is a smooth function ρ : U → [0, 1] with supp ρ ⊂ U and ρ|_V = 1 for some open neighborhood V of x_0. If X|_U = 0 then ρX = 0, and hence for any section s of E,

0 = (∇ρX s)(x0) = ρ(x0)(∇X s)(x0) = (∇X s)(x0).

Since x0 ∈ U is arbitrary, (a) follows. If s|U = 0 then ρs = 0 on M. This in turn implies that

0 = (∇_X(ρs))(x_0) = (X(ρ)s + ρ∇_X s)(x_0) = 0 + ρ(x_0)(∇_X s)(x_0) = (∇_X s)(x_0). □

Remark 11.8. It follows that if ∇ is a connection on a vector bundle E → M then ∇ induces a connection

∇^U : Γ(TU) × Γ(E|_U) → Γ(E|_U)

on the restriction E|_U for any open set U ⊂ M. Namely, for any x_0 ∈ U let ρ : U → [0, 1] be a bump function as in the proof above. Then for any X ∈ Γ(TU) and any s ∈ Γ(E|_U) we have ρX ∈ Γ(TM) and ρs ∈ Γ(E) (with ρX and ρs extended to all of M by 0). We define:

(∇^U_X s)(x_0) = (∇_{ρX}(ρs))(x_0).

By Proposition 11.7, the right hand side does not depend on the choice of the function ρ. We leave it to the reader to check that ∇^U is a connection.

Definition 11.9 (Christoffel symbols). Let E → M be a vector bundle with a connection ∇. Let (x_1, …, x_m) : U → R^m be a coordinate chart on M small enough so that E|_U is trivial. Let {s_α} be a frame of E|_U: for each x ∈ U we require that {s_α(x)} is a basis of the fiber E_x. Then any local section s ∈ Γ(E|_U) can be written as a linear combination of the s_α's. In particular, for each index i and β,

∇^U_{∂/∂x_i} s_β = Σ_α Γ^α_{iβ} s_α

for some functions Γ^α_{iβ} ∈ C^∞(U). These functions are the Christoffel symbols of the connection ∇ relative to the coordinates (x_1, …, x_m) and the frame {s_α}. It follows easily that the Christoffel symbols determine the connection ∇^U on the coordinate chart U. It is customary not to distinguish between ∇ and its restriction ∇^U.

Proposition 11.10. Let ∇ be a connection on a vector bundle π : E → M. For any X ∈ Γ(TM), any s ∈ Γ(E) and any point q ∈ M, the value (∇_X s)(q) of the connection at q depends only on the vector X_q (and not on the values of X near q).

Proof. It's enough to show that if X_q = 0 then (∇_X s)(q) = 0. Since connections are local we can argue in coordinates. Choose a coordinate chart (x_1, …, x_m) : U → R^m on M with q ∈ U such that E|_U is trivial. Pick a local frame {s_j} of E|_U. Then, if X = Σ_i X^i ∂/∂x_i, s = Σ_j f_j s_j, and Γ^k_{ij} denote the associated Christoffel symbols,

∇_X s = ∇_{Σ_i X^i ∂/∂x_i}(Σ_j f_j s_j) = Σ_i X^i ∇_{∂/∂x_i}(Σ_j f_j s_j)
= Σ_{i,j} X^i (∂f_j/∂x_i) s_j + Σ_{i,j} X^i f_j ∇_{∂/∂x_i} s_j
= Σ_{i,k} X^i (∂f_k/∂x_i + Σ_j f_j Γ^k_{ij}) s_k.

If X_q = 0 then X^i(q) = 0 for all i. Hence (∇_X s)(q) = 0 and we are done. □

As a corollary of the computation above we get an expression for the connection in terms of the Christoffel symbols.

Corollary 11.10.1. Let ∇ be a connection on a vector bundle π : E → M and (x_1, …, x_m) : U → R^m a coordinate chart on M with E|_U trivial. Let {s_j} be a frame of E|_U. Then

(11.1)  ∇_{Σ_i X^i ∂/∂x_i}(Σ_j f_j s_j) = Σ_{i,k} X^i (∂f_k/∂x_i + Σ_j f_j Γ^k_{ij}) s_k.

We note one more corollary that will be useful when we try to define connections induced on submanifolds.

Corollary 11.10.2. Let ∇ be a connection on a vector bundle π : E → M. For any X ∈ Γ(TM), any s ∈ Γ(E) and any point q ∈ M, the value (∇_X s)(q) depends only on the values of s along the integral curve of X through q.

Proof. By the previous corollary, for X = Σ_i X^i ∂/∂x_i and s = Σ_j f_j s_j,

(∇_X s)(q) = Σ_k (X f_k)(q) s_k(q) + Σ_{i,j,k} X^i(q) f_j(q) Γ^k_{ij}(q) s_k(q).

And (X f_k)(q) depends only on the values of f_k along the integral curve of X. □

The proof that connections are local has an important generalization to maps of sections of vector bundles.

Definition 11.11. Let E → M and F → M be two vector bundles. We say that a map T : Γ(E) → Γ(F) is tensorial if T is R-linear and for any f ∈ C^∞(M),

T(fs) = f T(s)

for all sections s ∈ Γ(E).

Lemma 11.12. Let E → M and F → M be two vector bundles. If T : Γ(E) → Γ(F) is tensorial then there is a vector bundle map φ : E → F so that [T(s)](x) = φ(s(x)) for all s ∈ Γ(E) and x ∈ M. And conversely, any vector bundle map φ : E → F defines a tensorial map on sections T_φ : Γ(E) → Γ(F) by T_φ(s) = φ ∘ s.

Proof. The proof is in two steps. We first argue that T is local: if s ∈ Γ(E) vanishes on an open set U ⊂ M then T(s) vanishes on U as well. Pick a point x ∈ U and a smooth function ρ ∈ C^∞(M) with supp ρ ⊂ U and ρ ≡ 1 on a neighborhood V of x (V ⊂ U, of course). Then ρs is identically zero everywhere. Hence

0 = T(ρs)(x) = ρ(x) T(s)(x) = T(s)(x).

Since x ∈ U is arbitrary, T(s)|_U = 0. Since T is local and E, F are locally trivial, we may assume that E and F are, in fact, trivial. That is, E = M × R^k and F = M × R^l. Moreover, the sections of E and F are then simply k- and l-tuples of functions. We want to define a vector bundle map φ : E → F. Such a map φ : M × R^k → M × R^l has to be of the form

φ(x, v) = (x, A(x)v),

where A : M → Hom(R^k, R^l) is smooth, with the property that

T(f_1, …, f_k)(x) = A(x)(f_1(x), …, f_k(x))^T

for all x ∈ M. But this is easy: define the jth column of A(x) to be the l-tuple of functions T(e_j), where e_j is the section of E that assigns to every point the jth basis vector (0, …, 0, 1, 0, …, 0) (1 in the jth slot). Or, if you prefer, e_j is the k-tuple of functions whose jth function is identically 1 and all the others are zero. □

Remark 11.13. Lemma 11.12 above generalizes further: let E_1, E_2, …, E_k and F be vector bundles over a manifold M and T : Γ(E_1) × ⋯ × Γ(E_k) → Γ(F) a k-linear map which is tensorial in each slot:

T(f_1 s_1, …, f_k s_k) = f_1 ⋯ f_k T(s_1, …, s_k)

for all s_i ∈ Γ(E_i) and f_j ∈ C^∞(M). Then for every x ∈ M there is a unique k-linear map

T_x : (E_1)_x × ⋯ × (E_k)_x → F_x with T_x(s_1(x), …, s_k(x)) = [T(s_1, …, s_k)](x).

Globally this means that there is a vector bundle map

φ : E_1 ⊗ ⋯ ⊗ E_k → F so that T(s_1, …, s_k)(x) = φ(s_1(x) ⊗ ⋯ ⊗ s_k(x))

for all x ∈ M and all sections s_i ∈ Γ(E_i).

Remark 11.14. We add one more layer of abstraction to the remark above: there is a bijection between vector bundle maps φ : E → F and sections of the bundle Hom(E, F) ≃ E^* ⊗ F. Namely, if φ : E → F is a vector bundle map, then φ|_{E_x} : E_x → F_x is an element of Hom(E_x, F_x) = Hom(E, F)_x for each point x ∈ M. Thus x ↦ φ|_{E_x} is a section of the bundle Hom(E, F) → M. We summarize the preceding discussion as a proposition.

Proposition 11.15. Let E_1, E_2, …, E_k and F be vector bundles over a manifold M. There is a bijection between k-linear tensorial maps

T : Γ(E_1) × ⋯ × Γ(E_k) → Γ(F)

and the sections of the bundle E_1^* ⊗ ⋯ ⊗ E_k^* ⊗ F → M.

Here are a few instances where the above point of view is useful.

Lemma 11.16. Let ∇^1 and ∇^2 be two connections on a vector bundle E → M. Their difference ∇^1 − ∇^2 "is" a section of the bundle T^*M ⊗ E^* ⊗ E ≃ Hom(TM ⊗ E, E). Conversely, given a connection ∇ on E → M and a section A of the bundle Hom(TM ⊗ E, E), the map ∇^A : Γ(TM) × Γ(E) → Γ(E) given by

(∇^A_X s)(x) := (∇_X s)(x) + A_x(X_x ⊗ s(x))

is again a connection on E. Here, of course, x ∈ M is a point, X a vector field on M and s a section of E. Thus a choice of a connection on E → M defines a bijection

{space of all connections on E → M} ↔ Γ(T^*M ⊗ E^* ⊗ E) = Γ(Hom(TM ⊗ E, E)) = Γ(T^*M ⊗ Hom(E, E)).

Proof. In one direction it's enough to prove that ∇^1 − ∇^2 is tensorial in both slots. It's obviously tensorial in the vector field slot. The tensoriality in the second slot is an easy computation. We also leave it to the reader to check that ∇^A as defined above is a connection. □

Definition 11.17. A connection on a manifold M is a connection on its tangent bundle TM → M.

Definition 11.18. The torsion T^∇ of a connection ∇ on a manifold M is the bilinear map

T^∇ : Γ(TM) × Γ(TM) → Γ(TM),  T^∇(X, Y) := ∇_X Y − ∇_Y X − [X, Y].

If T^∇ = 0, the connection ∇ is called torsion-free.

Lemma 11.19. The torsion of a connection is tensorial, hence corresponds to a section of the bundle T^*M ⊗ T^*M ⊗ TM.

Proof. This is yet another computation left to the reader. □

Definition 11.20. The curvature R of a connection ∇ on a vector bundle E → M is the tri-linear map Γ(TM) × Γ(TM) × Γ(E) → Γ(E) defined by

R(X, Y)s = ∇_X(∇_Y s) − ∇_Y(∇_X s) − ∇_{[X,Y]} s.

Lemma 11.21. Curvature is tensorial, hence corresponds to a section of T^*M ⊗ T^*M ⊗ Hom(E, E) → M. Moreover, since R(X, Y)s = −R(Y, X)s, it actually corresponds to a section of Λ^2(T^*M) ⊗ Hom(E, E).

Proof. Once again this is a computation. We check tensoriality in one slot and leave the rest to the reader. For all vector fields X, Y, sections s and functions f,

R(X, Y)(fs) = ∇_X(∇_Y(fs)) − ∇_Y(∇_X(fs)) − ∇_{[X,Y]}(fs)
= ∇_X(Y(f)s + f∇_Y s) − ∇_Y(X(f)s + f∇_X s) − ([X, Y]f)s − f∇_{[X,Y]} s
= X(Y(f))s + Y(f)∇_X s + X(f)∇_Y s + f∇_X(∇_Y s)
  − Y(X(f))s − X(f)∇_Y s − Y(f)∇_X s − f∇_Y(∇_X s) − ([X, Y]f)s − f∇_{[X,Y]} s
= f R(X, Y)s,

since X(Y(f)) − Y(X(f)) = [X, Y](f). □

11.2. Parallel transport. In general there is no consistent way of identifying vectors in tangent spaces at different points of a manifold. More generally, there is no consistent way of identifying vectors in fibers of a vector bundle above different points of a manifold. However, we will see that given a connection ∇ on a vector bundle π : E → M, for any curve γ : [a, b] → M there is a family of vector space isomorphisms

P_{t_1}^{t_2}(γ) = P_{t_1}^{t_2} : E_{γ(t_1)} → E_{γ(t_2)},

depending smoothly on t_1, t_2 ∈ [a, b]. These isomorphisms P_{t_1}^{t_2} are called parallel transport along γ. The connection can then be recovered from parallel transport. We now proceed with the construction.

Definition 11.22. Let π : E → M be a vector bundle and γ : [a, b] → M a curve. A section σ of E → M along γ is a smooth map σ : [a, b] → E so that π(σ(t)) = γ(t) for all t ∈ [a, b]. We denote the space of sections of E along the map γ by Γ(γ^*E).

Example 11.23. If s : M → E is a section of E, then s ∘ γ is a section along γ.

Example 11.24. The derivative γ̇ := dγ_t(d/dt|_t) is a section of the tangent bundle TM → M along γ.

Remark 11.25. If E = TM then a section along a curve γ is also known as a vector field along γ. It is not true that every section σ along γ is of the form σ = s ∘ γ for some s ∈ Γ(E): if the curve γ crosses itself, then γ̇ cannot be of the form X ∘ γ for any vector field X on M.

Remark 11.26. Here is another way to think of sections along a curve γ. Suppose f : N → M is a smooth map of manifolds and π : E → M is a vector bundle. Define the pullback of the bundle E along f to be the set

f^*E = {(n, e) ∈ N × E | f(n) = π(e)}

together with the projection π′ : f^*E → N, f^*E ∋ (n, e) ↦ n. A transversality argument shows that f^*E is a submanifold of N × E, so π′ is smooth. It is not hard to see that f^*E is a vector bundle of the same rank as E. The point of this construction is that a section of a bundle E → M along a curve γ : [a, b] → M is simply a section of the pullback bundle γ^*E → [a, b]. Strictly speaking the construction above doesn't apply to maps from closed intervals, since a closed interval is not a manifold. However, a smooth map from a closed interval [a, b] is, by definition, a smooth curve from a slightly larger open interval (a′, b′) ⊃ [a, b], and pulling back E to a bundle over (a′, b′) does make sense.

Definition 11.27. Let π : E → M be a vector bundle and γ : [a, b] → M a smooth curve.
A covariant derivative ∇/dt along γ is an R-linear map

∇/dt : Γ(γ*E) → Γ(γ*E), σ ↦ ∇σ/dt,

such that for all functions f ∈ C∞([a,b]) and all sections σ ∈ Γ(γ*E)

(11.2) ∇/dt(fσ) = (df/dt)σ + f ∇σ/dt.

Proposition 11.28. Given a connection ∇ on a vector bundle π : E → M and a curve γ : [a,b] → M, there is a unique covariant derivative ∇/dt : Γ(γ*E) → Γ(γ*E) along γ such that

(11.3) (∇/dt)(s ∘ γ)(t) = (∇_{γ̇(t)} s)(γ(t))

for all sections s of the bundle E.

Proof. (Uniqueness) Arguing as in Proposition 11.7, it is not hard to show that ∇/dt is local: for a section σ of E along γ the value (∇σ/dt)(t) at a point t depends only on the values of σ near t. Therefore, in order to prove uniqueness it is no loss of generality to assume that the image γ([a,b]) of γ is contained in an open set U in M with E|_U trivial. Pick a frame {s_j} of E|_U. Then for any σ ∈ Γ(γ*E) there are smooth functions f_j ∈ C∞([a,b]) so that

σ(t) = Σ_j f_j(t) s_j(γ(t)) for all t ∈ [a,b].

Then, using (11.2) and (11.3), we get

(11.4) (∇σ/dt)(t) = Σ_j (∇/dt)(f_j (s_j ∘ γ))(t) = Σ_j (df_j/dt)(t) s_j(γ(t)) + Σ_j f_j(t)(∇_{γ̇(t)} s_j)(γ(t)).

Since the right hand side of (11.4) depends only on ∇, the operator ∇/dt is unique.

(Existence) Cover γ([a,b]) with open sets U_j such that E|_{U_j} is trivial. It is enough to construct ∇/dt on each Γ(γ*E|_{γ^{-1}(U_j)}), for by uniqueness these operators will patch together to a map ∇/dt : Γ(γ*E) → Γ(γ*E). Pick a frame {s_k^{(j)}} of E|_{U_j} and define ∇/dt on Γ(γ*E|_{γ^{-1}(U_j)}) by (11.4). □

Definition 11.29. We will refer to the covariant derivative ∇/dt along γ as in Proposition 11.28 above as being induced by the connection ∇.

Definition 11.30. Let E → M be a vector bundle with a connection ∇ and γ : [a,b] → M a curve. A section σ ∈ Γ(γ*E) is parallel if

∇σ/dt = 0,

where ∇/dt is the covariant derivative along γ induced by ∇.

To define parallel transport along a curve γ : [a,b] → M we want, for every vector v ∈ E_{γ(a)}, a section σ^v ∈ Γ(γ*E) such that σ^v(a) = v and ∇σ^v/dt = 0. We also want the map v ↦ σ^v to be linear. The existence of such sections and their linearity in v is the content of the next two lemmas. The first one is the standard result on linear time-dependent ODEs.

Lemma 11.31. Suppose B = (B_{jk}(t)) : [c,d] → R^{k×k} is a smooth curve in the space of k × k real matrices. Then there is a smooth curve R : [c,d] → GL(k, R) such that f(t) := R(t)f⁰ is the solution of the ODE

(11.5) (f_1′(t), …, f_k′(t))ᵀ = B(t)(f_1(t), …, f_k(t))ᵀ

with initial condition f(c) = f⁰.

Lemma 11.32. Let E → M be a vector bundle with a connection ∇ and γ : [a,b] → M a smooth curve. For any vector v ∈ E_{γ(a)} there is a section σ^v ∈ Γ(γ*E) such that σ^v(a) = v and ∇σ^v/dt = 0. Moreover, the map

E_{γ(a)} → Γ(γ*E), v ↦ σ^v

is a linear isomorphism onto the space of parallel sections.

Proof. As before, it is no loss of generality to assume that the image of γ is contained in a coordinate chart (x_1, …, x_m) : U → R^m with E|_U trivial. Let {s_j} be a frame of E|_U and Γ^k_{ij} the corresponding Christoffel symbols. Suppose σ is a section of E along γ which is parallel and satisfies σ(a) = v. Then there are smooth functions f_j ∈ C∞([a,b]) so that σ = Σ_j f_j (s_j ∘ γ). We argue that the f_j's satisfy a linear ODE as in Lemma 11.31 for some curve B. By (11.4), since ∇σ/dt = 0,

Σ_j (df_j/dt)(t) s_j(γ(t)) = −Σ_j f_j(t)(∇_{γ̇(t)} s_j)(γ(t)).

We also have γ̇ = Σ_i (dγ_i/dt) ∂/∂x_i, where γ_i := x_i ∘ γ. Therefore

∇_{γ̇} s_j = Σ_i γ̇_i (∇_{∂/∂x_i} s_j) ∘ γ = Σ_{i,k} γ̇_i (Γ^k_{ij} s_k) ∘ γ = Σ_k ( Σ_i γ̇_i (Γ^k_{ij} ∘ γ) ) (s_k ∘ γ).

We conclude that σ = Σ_j f_j (s_j ∘ γ) is parallel if and only if

(11.6) (df_k/dt)(t) = −Σ_{i,j} f_j(t) γ̇_i(t) Γ^k_{ij}(γ(t)).

That is, f = (f_1, …, f_k) satisfies the ODE (11.5) with

B_{kj}(t) = −Σ_i γ̇_i(t) Γ^k_{ij}(γ(t)).

By Lemma 11.31 the system of linear equations (11.6) has a solution defined for all time t ∈ [a,b] which depends linearly on the initial condition. Therefore the desired parallel transport exists. □

Parallel transport leads to one definition of geodesics.

Definition 11.33. Let ∇ be a connection on the tangent bundle TM → M of a manifold M. A curve γ : [a,b] → M is a geodesic if its velocity field γ̇(t) is parallel:

(11.7) (∇/dt)γ̇ = 0.

Remark 11.34. It will be useful to know what (11.7) means in coordinates. Let (x_1, …, x_m) : U → R^m be a coordinate chart on our manifold. Define γ_i = x_i ∘ γ, γ̇_i = dγ_i/dt and γ̈_i = dγ̇_i/dt. Then γ̇ = Σ_i γ̇_i ∂/∂x_i. Hence the functions f_k in (11.6) are the γ̇_k's. Therefore, in this case, (11.6) reads

(11.8) γ̈_k = −Σ_{i,j} γ̇_i γ̇_j Γ^k_{ij}(γ).

We conclude that a curve γ is a geodesic for a connection ∇ if and only if (11.8) holds in every coordinate chart.

Exercise 11.1. Consider the manifold R^n. We have seen that D_X Y = Σ_i X(Y_i) ∂/∂x_i is a connection. Suppose γ : R → R^n is a curve. Let D/dt denote the covariant derivative along γ induced by the connection D on R^n. Show that

(D/dt)γ̇ = γ̈ (= d²γ/dt²).

Conclude that the geodesics in R^n with respect to D are straight lines.

12. Riemannian geometry

12.1. Levi-Civita connection. We now specialize the discussion of connections and parallel transport to the case of manifolds with a choice of an inner product on each tangent space.

Definition 12.1 (Riemannian metric). A Riemannian metric g on a manifold M assigns smoothly to each point x ∈ M a positive definite inner product g_x on T_xM. A Riemannian manifold is a manifold M together with a choice of a Riemannian metric g; in other words, it is a pair (M, g).
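Equation (11.6) is a concrete linear ODE that one can integrate numerically. The following sketch is our own illustration, not from the text: on the round unit sphere with coordinates (θ, φ) and metric dθ² + sin²θ dφ², the only nonzero Christoffel symbols are Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = Γ^φ_{φθ} = cot θ, and transporting a vector once around the circle of latitude θ = θ₀ rotates it by the angle 2π cos θ₀ in an orthonormal frame.

```python
import math

# Numerical sketch of parallel transport via equation (11.6) on the round
# unit sphere (an assumed example, not from the notes).  Coordinates
# (theta, phi), metric d theta^2 + sin^2(theta) d phi^2; nonzero Christoffel
# symbols:
#   Gamma^theta_{phi phi} = -sin(theta) cos(theta),
#   Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cot(theta).
# Curve: the latitude circle gamma(t) = (theta0, t), t in [0, 2 pi],
# so gammadot = (0, 1) and (11.6) reduces to the 2x2 system below.

theta0 = math.pi / 3  # cos(theta0) = 1/2

def rhs(f):
    # (11.6): df^k/dt = - sum_{i,j} f^j gammadot^i Gamma^k_{ij}(gamma)
    f_th, f_ph = f
    d_th = f_ph * math.sin(theta0) * math.cos(theta0)   # -f^phi Gamma^theta_{phi phi}
    d_ph = -f_th * math.cos(theta0) / math.sin(theta0)  # -f^theta Gamma^phi_{phi theta}
    return (d_th, d_ph)

def rk4_step(f, h):
    # one classical Runge-Kutta step
    k1 = rhs(f)
    k2 = rhs((f[0] + h / 2 * k1[0], f[1] + h / 2 * k1[1]))
    k3 = rhs((f[0] + h / 2 * k2[0], f[1] + h / 2 * k2[1]))
    k4 = rhs((f[0] + h * k3[0], f[1] + h * k3[1]))
    return (f[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            f[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

n = 4000
h = 2 * math.pi / n
f = (1.0, 0.0)  # start with the coordinate vector d/d theta
for _ in range(n):
    f = rk4_step(f, h)

# Components in the orthonormal frame d/d theta, (1/sin theta) d/d phi:
a, b = f[0], f[1] * math.sin(theta0)
# After a full loop the vector is rotated by 2 pi cos(theta0) = pi,
# so (a, b) is approximately (-1, 0).
```

Note that for θ₀ = π/2 (the equator, which is a geodesic) the holonomy angle 2π cos θ₀ vanishes, consistent with the velocity field of a geodesic being parallel.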

Remark 12.2. An inner product h on a vector space V is a bilinear map h : V × V → R. Hence it is an element of the tensor product V* ⊗ V*. Therefore a Riemannian metric on a manifold M is nothing but a smooth section of the bundle (T*M)^{⊗2} := T*M ⊗ T*M → M. (Not all sections of (T*M)^{⊗2} → M are Riemannian metrics. For instance, the zero section is not. But all symmetric and positive definite sections of (T*M)^{⊗2} → M are Riemannian metrics.)

Theorem 12.3. Any second countable manifold M has a Riemannian metric.

Proof. Let {φ_i = (x_1^{(i)}, …, x_m^{(i)}) : U_i → R^m} be a countable collection of coordinate charts that cover M. On each chart U_i define a metric g^{(i)} = Σ_j dx_j^{(i)} ⊗ dx_j^{(i)}. Let {ρ_i} be a partition of unity subordinate to this cover. Define a section g of T*M ⊗ T*M → M by

g = Σ_i ρ_i g^{(i)}.

Then g is a Riemannian metric. □

Fiber metrics. The notion of a Riemannian metric generalizes to arbitrary vector bundles.

Definition 12.4. A fiber metric on a vector bundle E → M assigns smoothly to each point x ∈ M a positive definite symmetric bilinear form g_x : E_x × E_x → R. In particular a fiber metric is a section of E* ⊗ E* → M.

Proposition 12.5. Every vector bundle E → M over a paracompact manifold M has a fiber metric.

Proof. If {s_α : U → E} is a local frame, then

g_x( Σ_α a_α s_α(x), Σ_β b_β s_β(x) ) := Σ_{α,β} a_α b_β δ_{αβ}

is a fiber metric on E|_U. Patch these local fiber metrics together using a partition of unity. □

The next theorem is the fundamental theorem of Riemannian geometry. It says that for every Riemannian manifold (M, g) there is a unique connection ∇ (which depends on the metric g) with two important properties. Such a connection is called the Levi-Civita connection.

Theorem 12.6 (existence and uniqueness of the Levi-Civita connection). On every Riemannian manifold (M, g) there is a unique connection ∇ : Γ(TM) × Γ(TM) → Γ(TM) which is

(1) torsion-free: ∇_X Y − ∇_Y X = [X,Y] for all X, Y ∈ Γ(TM);
(2) metric (i.e. compatible with g): X(g(Y,Z)) = g(∇_X Y, Z) + g(Y, ∇_X Z) for all X, Y, Z ∈ Γ(TM).

Proof. (Uniqueness) The proof is a trick. Suppose that ∇ exists. Then for any X, Y, Z ∈ Γ(TM),

X g(Y,Z) = g(∇_X Y, Z) + g(Y, ∇_X Z)
Y g(Z,X) = g(∇_Y Z, X) + g(Z, ∇_Y X)
−Z g(X,Y) = −g(∇_Z X, Y) − g(X, ∇_Z Y)

since the connection is compatible with the metric. Adding up the three equations and using the fact that the connection is torsion-free, we get

X g(Y,Z) + Y g(Z,X) − Z g(X,Y) = g(∇_X Y, Z) + g(∇_Y X, Z) + g(Y, ∇_X Z − ∇_Z X) + g(X, ∇_Y Z − ∇_Z Y)
  = g(∇_X Y, Z) + g(∇_X Y − [X,Y], Z) + g(Y, [X,Z]) + g(X, [Y,Z])
  = 2g(∇_X Y, Z) − g([X,Y], Z) + g(Y, [X,Z]) + g(X, [Y,Z]).

Thus, we have

(12.1) 2g(∇_X Y, Z) = X(g(Y,Z)) + Y(g(Z,X)) − Z(g(X,Y)) + g([X,Y], Z) − g(Y, [X,Z]) − g(X, [Y,Z]).

Since Z is arbitrary and g is nondegenerate, the formula above uniquely determines ∇_X Y. This proves uniqueness of the Levi-Civita connection.

It remains to prove existence. The proof is very simple, if one is willing to skip all the details. Define an R-trilinear map Γ(TM) × Γ(TM) × Γ(TM) → C∞(M) by sending a triple of vector fields (X, Y, Z) to 1/2 of the right hand side of (12.1). Since g is nondegenerate this defines an R-bilinear map

Γ(TM) × Γ(TM) → Γ(TM), (X, Y) ↦ "∇"_X Y.

It remains to verify that "∇" so defined is a connection, and that it is metric and torsion-free. These minor details are traditionally left to the reader. We will provide a different and more detailed proof below after a brief detour. □

Equation (12.1) has the following interesting consequence:

Lemma 12.7. The Christoffel symbols of the Levi-Civita connection depend only on the metric and its first partials.

Proof. Given a coordinate chart (x_1, …, x_m) : U → R^m on M, the Christoffel symbols Γ^k_{ij} of the Levi-Civita connection ∇ are defined by

∇_{∂_i} ∂_j = Σ_k Γ^k_{ij} ∂_k,

where ∂_i = ∂/∂x_i. Plugging X = ∂_i, Y = ∂_j and Z = ∂_k into (12.1) we get

2g(∇_{∂_i} ∂_j, ∂_k) = ∂_i(g(∂_j, ∂_k)) + ∂_j(g(∂_k, ∂_i)) − ∂_k(g(∂_i, ∂_j))

since [∂_i, ∂_j] = [∂_j, ∂_k] = [∂_i, ∂_k] = 0. Writing g_{ij} = g(∂_i, ∂_j) etc., we obtain

(12.2) 2 Σ_l Γ^l_{ij} g_{lk} = ∂_i g_{jk} + ∂_j g_{ki} − ∂_k g_{ij}.

Since g is a metric, the matrix (g_{ij}) is nondegenerate. Let (g^{rs}) denote its inverse, so that

Σ_s g^{rs} g_{sk} = δ_{rk}.

Multiplying both sides of (12.2) by g^{sk} and summing over k we get

2 Σ_l δ_{sl} Γ^l_{ij} = Σ_k g^{sk} (∂_i g_{jk} + ∂_j g_{ki} − ∂_k g_{ij}),

and simplifying,

(12.3) Γ^s_{ij} = (1/2) Σ_k g^{sk} (∂_i g_{jk} + ∂_j g_{ki} − ∂_k g_{ij}).

This proves that the Christoffel symbols depend only on the metric and its first order partials. □

Proof of Theorem 12.6 continued. It remains to (re)prove the existence of the Levi-Civita connection. By uniqueness, it is enough to construct a Levi-Civita connection in each coordinate chart, for then, again by uniqueness, these coordinate chart connections patch together into a Levi-Civita connection on the whole manifold M. We have shown that if the Levi-Civita connection exists then its Christoffel symbols have to be given by (12.3). Therefore on a chart (x_1, …, x_m) : U → R^m we define a connection ∇ by

∇_{X_i ∂_i}(Y_j ∂_j) = X_i(∂_i Y_j)∂_j + X_i Y_j Γ^k_{ij} ∂_k

with Christoffel symbols Γ^k_{ij} given by (12.3). In the equation above we finally resorted to the Einstein summation convention: we sum on repeated indices and omit the symbol Σ. We now check that ∇ is a Levi-Civita connection.

Since Γ^k_{ij} = Γ^k_{ji} (cf. (12.3)),

∇_{∂_i} ∂_j − ∇_{∂_j} ∂_i = Γ^k_{ij} ∂_k − Γ^k_{ji} ∂_k = 0.

Thus, for two vector fields X = X_i ∂_i and Y = Y_j ∂_j, we have

∇_X Y − ∇_Y X = ∇_{X_i ∂_i}(Y_j ∂_j) − ∇_{Y_j ∂_j}(X_i ∂_i)
  = X_i(∂_i Y_j)∂_j + X_i Y_j ∇_{∂_i} ∂_j − Y_j(∂_j X_i)∂_i − Y_j X_i ∇_{∂_j} ∂_i
  = X_i(∂_i Y_j)∂_j − Y_j(∂_j X_i)∂_i
  = [X_i ∂_i, Y_j ∂_j].

Thus, ∇ is torsion-free.

Compatibility with g is a somewhat longer computation. First, note that

g(∇_{∂_i} ∂_j, ∂_k) + g(∂_j, ∇_{∂_i} ∂_k) = g(Γ^l_{ij} ∂_l, ∂_k) + g(∂_j, Γ^m_{ik} ∂_m) = Γ^l_{ij} g_{lk} + Γ^m_{ik} g_{jm} = ∂_i g_{jk},

where the last equality follows from (12.3). Thus, for vector fields X = X_j ∂_j, Y = Y_i ∂_i and Z = Z_k ∂_k, we have

(X_j ∂_j) g(Y_i ∂_i, Z_k ∂_k) = X_j ∂_j(Y_i Z_k g_{ik})
  = X_j(∂_j Y_i) Z_k g_{ik} + X_j Y_i(∂_j Z_k) g_{ik} + X_j Y_i Z_k(∂_j g_{ik})
  = g(X_j(∂_j Y_i)∂_i, Z_k ∂_k) + g(Y_i ∂_i, X_j(∂_j Z_k)∂_k) + X_j Y_i Z_k ( g(∇_{∂_j} ∂_i, ∂_k) + g(∂_i, ∇_{∂_j} ∂_k) )
  = g((X_j ∂_j Y_i)∂_i + Y_i ∇_{X_j ∂_j} ∂_i, Z_k ∂_k) + g(Y_i ∂_i, (X_j ∂_j Z_k)∂_k + Z_k ∇_{X_j ∂_j} ∂_k)
  = g(∇_{X_j ∂_j}(Y_i ∂_i), Z_k ∂_k) + g(Y_i ∂_i, ∇_{X_j ∂_j}(Z_k ∂_k)).

That is, the connection ∇ is compatible with the metric g. Therefore the connection with Christoffel symbols defined by (12.3) is a Levi-Civita connection. This finishes the proof of existence and uniqueness of the Levi-Civita connection. □

Example 12.8. Consider the manifold R^n. We have seen that D_X Y = Σ_i X(Y_i) ∂/∂x_i is a connection. An easy computation shows that D is the Levi-Civita connection on R^n with respect to the standard inner product on R^n.

We end this section with a brief discussion of the geometric meaning of a connection being metric.

Definition 12.9. Let E → M be a vector bundle with a fiber metric g. A connection ∇ on E is metric if

X(g(s, s′)) = g(∇_X s, s′) + g(s, ∇_X s′)

for all vector fields X and all sections s, s′ ∈ Γ(E).
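Formula (12.3) is easy to test numerically. The sketch below is our own illustration (the example metric — the round unit sphere, diag(1, sin²θ) — and all function names are assumptions, not from the notes): it computes Christoffel symbols from a metric by central finite differences and recovers Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = cot θ.

```python
import math

# Christoffel symbols from formula (12.3),
#   Gamma^s_{ij} = (1/2) sum_k g^{sk} (d_i g_{jk} + d_j g_{ki} - d_k g_{ij}),
# with the partials taken by central finite differences.  Illustrative
# metric (an assumed example): the round unit sphere in coordinates
# x = (theta, phi), with g = diag(1, sin^2 theta).

def g(x):
    th, ph = x
    return [[1.0, 0.0], [0.0, math.sin(th) ** 2]]

def christoffel(x, h=1e-5):
    m = 2
    # dg[k][i][j] = d g_{ij} / d x_k  via central differences
    dg = []
    for k in range(m):
        xp = list(x); xp[k] += h
        xm = list(x); xm[k] -= h
        gp, gm = g(xp), g(xm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(m)]
                   for i in range(m)])
    # inverse of the 2x2 matrix g(x)
    a, b, c, d = g(x)[0][0], g(x)[0][1], g(x)[1][0], g(x)[1][1]
    det = a * d - b * c
    ginv = [[d / det, -b / det], [-c / det, a / det]]
    # Gamma[s][i][j] = Gamma^s_{ij}
    return [[[0.5 * sum(ginv[s][k] * (dg[i][j][k] + dg[j][k][i] - dg[k][i][j])
                        for k in range(m))
              for j in range(m)] for i in range(m)] for s in range(m)]

Gamma = christoffel([1.0, 0.3])
# Gamma[0][1][1] should be -sin(1)cos(1); Gamma[1][0][1] should be cot(1).
```

The same routine, fed into the geodesic equation (11.8), would integrate geodesics for any metric given only as a black-box function.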

Definition 12.10. Let V_1, V_2 be two vector spaces with inner products g_1, g_2 respectively. A linear map A : V_1 → V_2 is an isometry if

g_2(Av, Aw) = g_1(v, w)

for all v, w ∈ V_1.

Lemma 12.11. If a connection ∇ is metric then the associated parallel transport is an isometry.

Proof. We will only prove the lemma for embedded curves and leave the general case as an exercise. If γ : [a,b] → M is an embedded curve, then locally any section σ : [a,b] → E is of the form s ∘ γ. Let v, w ∈ E_{γ(a)} be two vectors and σ^v, σ^w : [a,b] → E two parallel sections with σ^v(a) = v and σ^w(a) = w. We want to prove that the function t ↦ g_{γ(t)}(σ^v(t), σ^w(t)) is constant. For this it is enough to prove that its derivative is zero for all t. This condition is local in t, so we may assume, by the above remark, that σ^v = s^v ∘ γ and σ^w = s^w ∘ γ for some (local) sections s^v, s^w of E. Then

g_{γ(t)}(σ^v(t), σ^w(t)) = [g(s^v, s^w)](γ(t)).

Hence

(d/dt) g_{γ(t)}(σ^v(t), σ^w(t)) = γ̇(t)(g(s^v, s^w))
  = g(∇_{γ̇} s^v, s^w) + g(s^v, ∇_{γ̇} s^w)
  = g(0, s^w) + g(s^v, 0) = 0. □

12.2. Connections induced on submanifolds. Let (M, g) be a Riemannian manifold and N ↪ M an embedded submanifold (think of a surface in R³). We will see that the embedding induces a Levi-Civita connection on N in two ways that turn out to be equivalent. It will also turn out that for surfaces in R³ the curvature of the induced connection is intimately related to Gauss curvature.

Suppose f : N → M is a map of manifolds. Then we can use f to pull back a metric g on M to a positive semi-definite symmetric bilinear form on N:

(f*g)_x(v, w) = g_{f(x)}(df_x v, df_x w)

for all x ∈ N and v, w ∈ T_xN. Moreover, if df_x is injective then (f*g)_x is non-degenerate. Therefore if f : N → M is an immersion then g^N := f*g is a metric on N. The metric g^N defines a Levi-Civita connection ∇^N on N.

Suppose now that f : N ↪ M is an embedding. Then there is another way to induce a connection on N from a connection on M. First of all, for every point x ∈ N the tangent space T_xM splits as an orthogonal direct sum with respect to g_x:

T_xM = T_xN ⊕ (T_xN)^⊥.

Hence there is an orthogonal projection

Π_x : T_xM → T_xN.

Globally ν := ⊔_{x∈N} (T_xN)^⊥ is a vector bundle, the normal bundle of the embedding of N into M. Hence globally the first equation says that the restriction TM|_N is a direct sum of two bundles:

TM|N = TN ⊕ ν and the second equation says that we have a bundle map

Π : TM|_N → TN.

Here is how one can see that Π_x depends smoothly on x. Choose coordinates φ = (x_1, …, x_n, …, x_m) : U → R^m on M near a point x ∈ N that are adapted to N, that is,

φ(N ∩ U) = φ(U) ∩ {x_{n+1} = 0, …, x_m = 0}.

Apply Gram-Schmidt to the coordinate vector fields {∂/∂x_1, …, ∂/∂x_n, …, ∂/∂x_m} to obtain an orthonormal frame {e_1(x), …, e_n(x), …, e_m(x)} of TU. (Remember that every tangent space T_xM has an inner product g_x that depends smoothly on x, and that the Gram-Schmidt process depends smoothly on the inner product.) Define the projection Π by

Π_x(v) = Σ_{i=1}^n g_x(v, e_i(x)) e_i(x).

Definition 12.12. Let N ⊂ M be an embedded submanifold. A vector field X̃ ∈ Γ(TM) is an extension of a vector field X ∈ Γ(TN) if

X_x = X̃_x for all x ∈ N. We will also say that X̃ is tangent to N.

Lemma 12.13. Let N ⊂ M be an embedded submanifold and X ∈ Γ(TN) a vector field. Then for any x ∈ N there is a neighborhood U ⊂ M of x and an extension X̃ ∈ Γ(TM|_U) of X|_{N∩U}.

Proof. Let (x_1, …, x_n, …, x_m) : U → R^m be coordinates on M adapted to N. Then X = Σ_{i=1}^n X_i ∂/∂x_i, with the X_i smooth functions on U ∩ N. Extend the X_i to all of U by making them constant in x_{n+1}, …, x_m. This extends X to all of U. □

Lemma 12.14. Let N ⊂ M be an embedded submanifold, X, Y ∈ Γ(TN) two vector fields and X̃, Ỹ ∈ Γ(TM) their extensions. Then their Lie bracket [X̃, Ỹ] is tangent to N, hence is an extension of [X, Y].

Proof. We give two proofs. The first is computational. In coordinates (x_1, …, x_n, …, x_m) on M adapted to N, X̃ = Σ_{i=1}^m X̃_i ∂/∂x_i with X̃_i(x) = 0 for i > n for all x ∈ N; similarly Ỹ = Σ_{i=1}^m Ỹ_i ∂/∂x_i with Ỹ_i(x) = 0 for i > n for all x ∈ N. Since

[X̃, Ỹ] = Σ_{i,j} X̃_i (∂Ỹ_j/∂x_i) ∂/∂x_j − Σ_{i,j} Ỹ_j (∂X̃_i/∂x_j) ∂/∂x_i,

for i > n the coefficient in front of ∂/∂x_i vanishes at the points of N.

Here is a geometric proof. If X̃ is tangent to N, its flow φ_t preserves N (maps it into itself). Hence its differential dφ_t maps vectors tangent to N to vectors tangent to N. But Ỹ is tangent to N. Hence for any x ∈ N

(d(φ_{−t})Ỹ)_x ∈ T_xN for all t.

Differentiating with respect to t we get

[X̃, Ỹ]_x ∈ T_xN. □

We now define a connection ∇̄ on a manifold N induced by its embedding into a Riemannian manifold (M, g) by

∇̄_X Y(x) := Π_x(∇_{X̃} Ỹ(x)),

where x ∈ N is a point, X, Y ∈ Γ(TN) are two vector fields, X̃, Ỹ are their (local) extensions to M, Π_x : T_xM → T_xN is the orthogonal projection and ∇ is the Levi-Civita connection on (M, g).

We need to make sure that ∇̄ is well-defined, that is, that ∇̄_X Y(x) does not depend on the choice of the local extensions X̃, Ỹ. By Corollary 11.10.2, ∇_{X̃} Ỹ(x) depends only on X̃_x = X_x and the values of Ỹ along the integral curve of X̃ through x. Therefore ∇_{X̃} Ỹ(x) depends only on X_x and the values of Y along the integral curve of X through x. Hence ∇̄ is well-defined. Moreover, ∇̄_X Y is clearly tensorial in the X slot. To see that it is a connection, let f ∈ C∞(N) be a function and f̃ its (local) extension to M. Then, at the points of N,

∇̄_X(fY) = Π(∇_{X̃}(f̃Ỹ)) = Π((X̃f̃)Ỹ + f̃ ∇_{X̃} Ỹ) = (X̃f̃)Π(Ỹ) + f̃ Π(∇_{X̃} Ỹ) = (Xf)Y + f ∇̄_X Y.

We conclude that the induced connection ∇̄ is indeed a connection.
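The two linear-algebra ingredients in this construction — Gram-Schmidt in the inner product g_x and the projection Π_x(v) = Σ_{i≤n} g_x(v, e_i)e_i — can be sketched on their own. In the minimal illustration below the matrix G and the vectors are arbitrary made-up choices, not data from the notes:

```python
# Gram-Schmidt and orthogonal projection in an arbitrary inner product,
# as in the construction of Pi_x.  G below is an assumed positive definite
# matrix playing the role of g_x on R^3; the first two standard basis
# vectors play the role of a basis of T_xN.

G = [[2.0, 0.3, 0.1],
     [0.3, 1.0, 0.2],
     [0.1, 0.2, 1.5]]

def g(u, v):
    # the inner product g(u, v) = u^T G v
    return sum(u[i] * G[i][j] * v[j] for i in range(3) for j in range(3))

def gram_schmidt(basis):
    # orthonormalize with respect to g
    es = []
    for b in basis:
        w = list(b)
        for e in es:
            c = g(w, e)
            w = [w[i] - c * e[i] for i in range(3)]
        norm = g(w, w) ** 0.5
        es.append([wi / norm for wi in w])
    return es

e = gram_schmidt([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

def proj(v, n=2):
    # Pi_x(v) = sum_{i < n} g(v, e_i) e_i : projection onto the "tangent" plane
    out = [0.0, 0.0, 0.0]
    for i in range(n):
        c = g(v, e[i])
        out = [out[j] + c * e[i][j] for j in range(3)]
    return out

v = [0.7, -1.2, 2.0]
p = proj(v)
r = [v[i] - p[i] for i in range(3)]
# the residual v - Pi(v) is g-orthogonal to the plane, and Pi is idempotent
assert abs(g(r, [1, 0, 0])) < 1e-12 and abs(g(r, [0, 1, 0])) < 1e-12
q = proj(p)
assert all(abs(q[i] - p[i]) < 1e-12 for i in range(3))
```

Note that orthogonality here is with respect to g, not the Euclidean dot product; that is exactly why the projection depends on the metric.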

Remark 12.15. The projection Π is really necessary in the definition of the induced connection. This is because even if vector fields X̃ and Ỹ are tangent to a submanifold N, there is no reason for their covariant derivative ∇_{X̃} Ỹ to be tangent to N. Here is an example. Let

W = Z = x_2 ∂/∂x_1 − x_1 ∂/∂x_2,

two vector fields on M = R². Let D denote the Levi-Civita connection on R² for the standard metric dx ⊗ dx + dy ⊗ dy. Then

D_W Z = (W x_2) ∂/∂x_1 + (W(−x_1)) ∂/∂x_2 = −x_1 ∂/∂x_1 − x_2 ∂/∂x_2.

Let N = S¹. Then W and Z are tangent to N, hence are extensions of a vector field on N. But at the points of S¹ the field D_W Z is orthogonal to S¹.
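The computation in Remark 12.15 can be spot-checked numerically: with W = Z = x₂ ∂/∂x₁ − x₁ ∂/∂x₂, the flat covariant derivative D_W Z = dZ(W) comes out radial at points of the unit circle, hence orthogonal to S¹. A small sketch (the sample points are arbitrary choices of ours):

```python
import math

# On R^2 with the flat connection D, D_W Z = dZ(W): the directional
# derivative of the component functions of Z.  Take W = Z = (x2, -x1),
# the rotational field from Remark 12.15, and check at points of the unit
# circle that D_W Z equals -(x1, x2): purely radial, so orthogonal to S^1.

def Z(x):
    return (x[1], -x[0])

def DZ(x, v, h=1e-6):
    # directional derivative of Z at x in the direction v (central difference)
    zp = Z((x[0] + h * v[0], x[1] + h * v[1]))
    zm = Z((x[0] - h * v[0], x[1] - h * v[1]))
    return ((zp[0] - zm[0]) / (2 * h), (zp[1] - zm[1]) / (2 * h))

for t in (0.0, 0.7, 2.0):
    x = (math.cos(t), math.sin(t))
    w = Z(x)            # W = Z is tangent to the circle at x
    dwz = DZ(x, w)      # D_W Z at x
    tangential = dwz[0] * w[0] + dwz[1] * w[1]
    assert abs(tangential) < 1e-8                       # no tangential part
    assert abs(dwz[0] + x[0]) < 1e-6                    # D_W Z = -(x1, x2)
    assert abs(dwz[1] + x[1]) < 1e-6
```

So the induced connection on S¹ satisfies ∇̄_W Z = Π(D_W Z) = 0 here: the projection discards the entire covariant derivative.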

Lemma 12.16. Let (M, g) be a Riemannian manifold and i : N,→ M an embedded submanifold. Then the connection ∇¯ induced on N by the Levi-Civita connection ∇ on M is the Levi-Civita connection for the pullback metric gN := i∗g.

Proof. It is enough to check that (1) ∇̄ is torsion-free and that (2) ∇̄ is metric. For all X, Y ∈ Γ(TN) and their local extensions X̃, Ỹ ∈ Γ(TM),

∇̄_X Y − ∇̄_Y X = Π(∇_{X̃} Ỹ − ∇_{Ỹ} X̃) = Π([X̃, Ỹ]) = Π([X, Y]) = [X, Y].

To show that ∇̄ is metric we need to check that

Z(g^N(X, Y)) = g^N(∇̄_Z X, Y) + g^N(X, ∇̄_Z Y)

for any vector fields X, Y, Z on N. At any point of N,

Z(g^N(X, Y)) = Z̃(g(X̃, Ỹ))
  = g(∇_{Z̃} X̃, Ỹ) + g(X̃, ∇_{Z̃} Ỹ)
  = g(∇̄_Z X + (∇_{Z̃} X̃ − ∇̄_Z X), Y) + g(X, ∇̄_Z Y + (∇_{Z̃} Ỹ − ∇̄_Z Y))
  = g(∇̄_Z X, Y) + g(X, ∇̄_Z Y),

since ∇_{Z̃} Ỹ − ∇̄_Z Y and ∇_{Z̃} X̃ − ∇̄_Z X are perpendicular to N. □

12.3. The second fundamental form of an embedding. As before let N ↪ M be an embedded submanifold of a Riemannian manifold (M, g). We want to understand how much N curves inside M. To measure the extrinsic geometry of N in M we define a tensor, the second fundamental form II_x : T_xN × T_xN → (T_xN)^⊥. We first define II : Γ(TN) × Γ(TN) → Γ(TN^⊥) by

II(X, Y) = ∇_{X̃} Ỹ − ∇̄_X Y,

where, as before, ∇ is the Levi-Civita connection on M, ∇̄ is the induced Levi-Civita connection on N, and X̃, Ỹ ∈ Γ(TM) are local extensions of the vector fields X, Y ∈ Γ(TN).

Proposition 12.17. The map II defined above is symmetric and tensorial.

Proof. We first argue that II is symmetric:

II(X, Y) − II(Y, X) = (∇_{X̃} Ỹ − ∇̄_X Y) − (∇_{Ỹ} X̃ − ∇̄_Y X)
  = (∇_{X̃} Ỹ − ∇_{Ỹ} X̃) − (∇̄_X Y − ∇̄_Y X)
  = [X̃, Ỹ] − [X, Y] = 0.

Next we argue that II is tensorial in the first slot. Let f̃ be a local extension of a function f on N. Then at the points of N,

II(fX, Y) = ∇_{f̃X̃} Ỹ − ∇̄_{fX} Y = f̃ ∇_{X̃} Ỹ − f ∇̄_X Y = f II(X, Y). □

It follows that for every point x ∈ N there is a symmetric bilinear map

II_x : T_xN × T_xN → (T_xN)^⊥.

Remark 12.18. In classical terminology the first fundamental form of an embedding is the induced metric.

Next suppose that the embedded submanifold N is a hypersurface, that is, dim M − dim N = 1. Then the normal bundle TN^⊥ has 1-dimensional fibers; hence, locally, a frame of TN^⊥ is given by one nowhere zero vector field. By rescaling, if necessary, we may assume that this vector field n has length 1 everywhere:

g_x(n_x, n_x) = 1 for all points x ∈ N.

We furthermore make the extra assumption that the unit vector field n normal to N is defined on all of N; that is, N is orientable inside M. This is true for the sphere embedded in R³ but false for the central circle of the Möbius band inside the band. If N ⊂ M has a globally defined unit normal n, we can write

II_x(v, w) = h_x(v, w) n_x

for a symmetric bilinear map h_x : T_xN × T_xN → R. Unwinding the definitions we see that for any vector fields X, Y on N

h(X, Y) = g(∇_{X̃} Ỹ, n).

We will refer to h ∈ Γ(T*N ⊗ T*N) also as the second fundamental form. The second fundamental form h allows us to relate the curvature tensor R of the Levi-Civita connection on M (the Riemann curvature of M) and the curvature R̄ of the induced connection on N:

Theorem 12.19. Let N ↪ M be an embedded orientable hypersurface of a Riemannian manifold (M, g). Let h ∈ Γ((T*N)^{⊗2}) be the second fundamental form of the embedding. Then for any vector fields X, Y, Z, W ∈ Γ(TN)

(12.4) g(R(X,Y)Z, W) = g^N(R̄(X,Y)Z, W) − h(Y,Z)h(X,W) + h(X,Z)h(Y,W),

where R is the Riemann curvature tensor of M and R̄ is the Riemann curvature tensor of N.

We prove an easy lemma before tackling the computations involved in the proof of the theorem.

Lemma 12.20. Let (M, g), ∇, N, n and h be as above. Then

h(X, W) = −g(∇_X n, W)

for any vector fields X, W ∈ Γ(TN) (here we do not bother with putting tildes on the extensions).

Proof. The function g(n, W) is identically 0 on N. Hence

0 = X(g(n, W)) = g(∇_X n, W) + g(n, ∇_X W) = g(∇_X n, W) + h(X, W)

since ∇ is a metric connection. □

Proof of Theorem 12.19. Recall that

R(X,Y)Z = ∇_X(∇_Y Z) − ∇_Y(∇_X Z) − ∇_{[X,Y]}Z.

Now

∇_X(∇_Y Z) = ∇_X(∇̄_Y Z + h(Y,Z)n)
  = ∇̄_X(∇̄_Y Z) + h(X, ∇̄_Y Z)n + (X h(Y,Z))n + h(Y,Z)∇_X n.

Since W is tangent to N and n is normal to N, pairing with W kills the terms proportional to n, and by Lemma 12.20

(12.5) g(∇_X(∇_Y Z), W) = g(∇̄_X(∇̄_Y Z), W) + h(Y,Z) g(∇_X n, W) = g(∇̄_X(∇̄_Y Z), W) − h(Y,Z)h(X,W).

Similarly,

(12.6) g(∇_Y(∇_X Z), W) = g(∇̄_Y(∇̄_X Z), W) − h(X,Z)h(Y,W),

while

(12.7) g(∇_{[X,Y]}Z, W) = g(∇̄_{[X,Y]}Z, W).

Subtracting (12.6) and (12.7) from (12.5) we get (12.4). □

Let us see what the theorem tells us about the curvature of oriented surfaces in R³. If N ⊂ R³ is an oriented embedded surface, then the unit normal field n assigns to every point of N a unit vector in R³. Hence we can think of n as a map to the unit sphere,

n : N → S².

This is the Gauss map. Since T_xN and T_{n_x}S² are two planes perpendicular to the same vector n_x, they are the same 2-plane in R³. Therefore we may think of the differential dn_x of the Gauss map as a map

dn_x : T_xN → T_xN.

Definition 12.21. The Gauss curvature κ of an oriented surface N in R³ is the determinant of the differential of the Gauss map:

κ(x) = det dn_x.

We compute a few examples of Gauss curvature by brute force.

Example 12.22. Consider the plane

N = {(x_1, x_2, x_3) ∈ R³ | x_3 = 0}.

The normal vector field n(x) is constant, and so the Gauss curvature κ(x) is 0.

Example 12.23. Now let N be a round cylinder:

N = {(x_1, x_2, x_3) ∈ R³ | x_2² + x_3² = R²}.

Here the unit normal n(x) is constant in the x_1 direction. Hence dn_x(e_1) = 0, and so the Gauss curvature is again zero.

Example 12.24. Let N be the standard round sphere of radius R:

N = {(x_1, x_2, x_3) : x_1² + x_2² + x_3² = R²}.

Then the normal vector field n is given by n(x) = (1/R)x, hence

dn_x = (1/R) · id.

Therefore

κ(x) = 1/R².

Note that the Gauss curvature is constant and positive. Also, the bigger the radius of the sphere the smaller the Gauss curvature. This makes sense since the sphere gets flatter as its radius increases.

In general one computes the Gauss curvature from the first and second fundamental forms. Once again we denote the Levi-Civita connection on R³ by D. Then for any vector v and vector field Y : R³ → R³

DvY = dY (v). Hence for any two vector fields X,Y on a surface N,

(12.8) h_x(X_x, Y_x) = −g_x((D_X n)(x), Y_x) = −g_x(dn_x(X_x), Y_x).

In particular the differential of the Gauss map is completely determined by the induced metric and the second fundamental form. We will see shortly that the Gauss curvature depends only on the metric g and its first and second partials. But first we extract the Gauss curvature from the above equation.

Lemma 12.25. Let g be a positive definite inner product on a vector space V, h : V × V → R a symmetric bilinear map, and S : V → V the linear map uniquely defined by h(v, w) = g(Sv, w). Let {e_i} be a basis of V. Then

det(h(e_i, e_j)) = det(g(e_i, e_j)) det S.

Proof. The matrix (s_{ki}) of S with respect to the basis {e_i} is defined by

S e_i = Σ_k s_{ki} e_k.

Therefore

h(e_i, e_j) = g(S e_i, e_j) = g(Σ_k s_{ki} e_k, e_j) = Σ_k s_{ki} g(e_k, e_j).

Therefore the matrix (h(e_i, e_j)) is the product of the transpose of (s_{ki}) and the matrix (g(e_k, e_j)). Thus

det(h(e_i, e_j)) = det(g(e_k, e_j)) det(s_{ki}). □



Together, Lemma 12.25 above and (12.8) tell us how to compute the Gauss curvature: pick a basis {e_1, e_2} of the tangent space T_xN. Then

κ(x) = det(h(e_i, e_j)) / det(g(e_i, e_j)).

In particular, if the basis {e_1, e_2} is orthonormal with respect to the induced metric g,

κ(x) = det(h(e_i, e_j)).

We are now ready to prove Gauss' theorema egregium ("remarkable theorem") from 1828!

Theorem 12.26. Let N ↪ R³ be an oriented embedded surface. Let R̄ denote the Riemann curvature of N. Then the Gauss curvature κ is given by

κ(x) = −g^N_x(R̄_x(e_1, e_2)e_1, e_2),

where {e_1, e_2} is a basis of T_xN orthonormal with respect to the induced metric g^N. Hence the Gauss curvature depends only on the induced metric and its first and second partials, and not on the embedding.

Proof. The Riemann curvature of the standard Levi-Civita connection D on R³ is 0. Hence, by Theorem 12.19,

g^N_x(R̄_x(e_1, e_2)e_1, e_2) = h_x(e_2, e_1)h_x(e_1, e_2) − h_x(e_1, e_1)h_x(e_2, e_2) = −det(h_x(e_i, e_j)) = −κ(x).

The curvature of a connection depends on the Christoffel symbols and their first partials. The Christoffel symbols of a Levi-Civita connection are functions of the metric and its first partials. □

Exercise 12.1. Let f(x, y) be a smooth function on R² and N its graph in R³:

N = {(x, y, f(x, y)) | (x, y) ∈ R²}.

Show that the Gauss curvature κ is given by

κ = (f_xx f_yy − f_xy²) / (1 + f_x² + f_y²)²,

where f_xy = ∂²f/∂x∂y, and so on.
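The formula of Exercise 12.1 is easy to sanity-check with finite differences. In the sketch below the sample surfaces are illustrative choices of ours: the paraboloid x² + y² has κ = 4 at the origin, the saddle x² − y² has κ = −4, and the graph of the upper hemisphere of radius 2 has κ = 1/4, matching Example 12.24.

```python
import math

# Numerical check of the graph-curvature formula from Exercise 12.1:
#   kappa = (f_xx f_yy - f_xy^2) / (1 + f_x^2 + f_y^2)^2 .
# The sample surfaces below are illustrative choices, not from the notes.

def gauss_curvature(f, x, y, h=1e-4):
    # first and second partials by central differences
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return (fxx * fyy - fxy ** 2) / (1 + fx ** 2 + fy ** 2) ** 2

paraboloid = lambda x, y: x * x + y * y   # kappa = 4 at the origin
saddle = lambda x, y: x * x - y * y       # kappa = -4 at the origin
hemisphere = lambda x, y: math.sqrt(4.0 - x * x - y * y)  # radius 2: kappa = 1/4
```

The hemisphere value being independent of the sample point is another way of seeing the rotational symmetry behind Example 12.24.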

13. Geodesics as critical points of the energy functional

This section is a brief excursion into the calculus of variations. The basic setup is this. Let M be a manifold. Consider the set P of all paths from a fixed interval [a, b] to M with fixed endpoints:

P = P([a, b], q1, q2) = {γ :[a, b] → M | γ(a) = q1, γ(b) = q2},

where q_1, q_2 ∈ M are two points. Every path γ ∈ P gives rise to a path γ̇ : [a,b] → TM. Therefore a smooth function L : TM → R on the tangent bundle of M (a "Lagrangian") defines a map (an "action")

A : P → R, A(γ) = ∫_a^b L(γ̇(t)) dt.

For example, if g is a Riemannian metric on a manifold M then

L(x, v) = (1/2) g_x(v, v), x ∈ M, v ∈ T_xM,

is a Lagrangian and the corresponding action

A_L(γ) = ∫_a^b (1/2) g_{γ(t)}(γ̇(t), γ̇(t)) dt

is the "energy" of the path. The term "energy" comes from the fact that for a particle of mass m moving in R³ the quantity (1/2)m(v_1² + v_2² + v_3²) = (1/2)m‖v‖² is the kinetic energy.
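For the flat metric on R², the critical paths of this energy turn out to be straight lines, and in this case they actually minimize. A quick discretized check (the endpoints, the perturbation, and the step count are arbitrary choices of ours):

```python
import math

# Discretized energy A_L(gamma) = int_0^1 (1/2)|gamma'(t)|^2 dt on R^2 for
# paths from (0, 0) to (1, 0).  The straight line should have the least
# energy; a sinusoidal perturbation with the same endpoints has more.
# All parameters here are arbitrary illustrative choices.

def energy(path, n=10000):
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x0, y0 = path(i * h)
        x1, y1 = path((i + 1) * h)
        vx, vy = (x1 - x0) / h, (y1 - y0) / h  # chord approximation of gamma'
        total += 0.5 * (vx * vx + vy * vy) * h
    return total

straight = lambda t: (t, 0.0)
wiggly = lambda t: (t, 0.1 * math.sin(math.pi * t))  # same endpoints

E0 = energy(straight)   # exact value 1/2
E1 = energy(wiggly)     # exact value 1/2 + 0.0025 * pi^2, strictly larger
```

The gap E1 − E0 = (1/4)(0.1π)² is exactly the energy of the perturbation itself, a special case of the computation behind the Euler-Lagrange equations below.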

We want to make sense of a path γ ∈ P being critical for an action A_L : P → R. This is a bit delicate since we have been careless with the topology on P and since P is infinite dimensional. The cheapest way to do it is by analogy with the finite dimensional case: a point is critical for a function f if and only if for every path σ(s) through the point we have (d/ds)|_{s=0} f(σ(s)) = 0. Now, a path in the space P through γ⁰ ∈ P is a family of curves γ_s with γ_s|_{s=0} = γ⁰, where s varies in some open interval (−ε, ε). We say that γ_s depends smoothly on s if the map

(−ε, ε) × [a,b] → M, (s, t) ↦ γ_s(t)

is smooth.

Definition 13.1. Let P = P([a,b], q_1, q_2) be a space of paths in a manifold M and L : TM → R a Lagrangian. A path γ⁰ ∈ P is L-critical if for any family γ_s of paths through γ⁰ we have

(d/ds)|_{s=0} (A_L(γ_s)) = 0,

where AL is the associated action. A connection between variational problems and Riemannian geometry is provided by the following theorem.

Theorem 13.2. Let (M, g) be a Riemannian manifold and L(x, v) = (1/2) g_x(v, v) the associated Lagrangian. A path γ is L-critical if and only if γ is a geodesic of the Levi-Civita connection.

We will first prove the theorem locally, when the image of the path is contained in a coordinate chart. We will then show that any L-critical path is a geodesic. We will not have time to prove the converse. We start by examining what critical paths for an arbitrary Lagrangian look like locally.

Theorem 13.3. Let L : R^m × R^m → R, (x, v) ↦ L(x, v), be a Lagrangian. A path γ⁰(t) = (γ⁰_1(t), …, γ⁰_m(t)) : [a,b] → R^m is L-critical if and only if it satisfies the Euler-Lagrange equations:

(13.1) (d/dt)( ∂L/∂v_i (γ(t), γ̇(t)) ) − ∂L/∂x_i (γ(t), γ̇(t)) = 0, 1 ≤ i ≤ m.

Proof. Let γ_s(t) = γ(s, t) = (γ_1(s,t), …, γ_m(s,t)) be a variation of γ⁰. Then γ(0, t) = γ⁰(t) for all t, and γ(s, a) = γ⁰(a), γ(s, b) = γ⁰(b) for all s. Hence

h(t) := (∂/∂s)|_{s=0} γ(s, t) : [a,b] → R^m

has to vanish at t = a and at t = b. It is important that there are no other restrictions on h: given an arbitrary curve h : [a,b] → R^m which vanishes at the endpoints,

γ(s, t) := γ⁰(t) + s h(t)

is a variation of γ⁰. Note further that γ̇_s(t) = (∂/∂t)γ(s, t), and consequently

(∂/∂s)|_{s=0} γ̇_s(t) = (∂²γ/∂s∂t)(0, t) = (d/dt)( (∂/∂s)|_{s=0} γ(s, t) ) = ḣ(t).

Since γ⁰ is L-critical,

0 = (d/ds)|_{s=0} ∫_a^b L(γ(s,t), γ̇(s,t)) dt
  = ∫_a^b (∂/∂s)|_{s=0} L(γ(s,t), γ̇(s,t)) dt
  = ∫_a^b Σ_i ( (∂L/∂x_i)(γ⁰, γ̇⁰) (∂γ_i/∂s)|_{s=0} + (∂L/∂v_i)(γ⁰, γ̇⁰) (∂γ̇_i/∂s)|_{s=0} ) dt
  = Σ_i ∫_a^b ( (∂L/∂x_i)(γ⁰, γ̇⁰) h_i + (∂L/∂v_i)(γ⁰, γ̇⁰) ḣ_i ) dt.

Integration by parts gives

∫_a^b (∂L/∂v_i)(γ⁰, γ̇⁰) ḣ_i dt = [ (∂L/∂v_i)(γ⁰, γ̇⁰) h_i ]_a^b − ∫_a^b (d/dt)( (∂L/∂v_i)(γ⁰(t), γ̇⁰(t)) ) h_i(t) dt,

and the boundary term vanishes since h_i(a) = h_i(b) = 0. Therefore

0 = Σ_i ∫_a^b ( (∂L/∂x_i)(γ⁰, γ̇⁰) − (d/dt)( (∂L/∂v_i)(γ⁰(t), γ̇⁰(t)) ) ) h_i(t) dt.

Since the h_i(t) are arbitrary, the equation above forces (13.1): see Lemma 13.4 below. Running the computations backwards we see that if γ⁰ satisfies the Euler-Lagrange equations then γ⁰ is L-critical. □

Lemma 13.4. If f ∈ C∞([a,b]) is a smooth function and if for any h ∈ C∞([a,b]) with h(a) = h(b) = 0 we have ∫_a^b f(t)h(t) dt = 0, then f ≡ 0.

Proof. Exercise. □

Proposition 13.5. Let g be a metric on R^m and L(x, v) = (1/2) g_x(v, v) the associated Lagrangian. Then γ is L-critical if and only if it is a geodesic for the Levi-Civita connection defined by the metric g.

Proof. We have

2L(x, v) = Σ_{k,l} g_{kl}(x) v_k v_l.

Therefore, for each index i,

2 ∂L/∂x_i = Σ_{k,l} (∂g_{kl}/∂x_i) v_k v_l

and

2 ∂L/∂v_i = Σ_l g_{il} v_l + Σ_k g_{ki} v_k.

The Euler-Lagrange equations in this case then are

Σ_{k,l} (∂g_{kl}/∂x_i) γ̇_k γ̇_l = (d/dt)( Σ_l g_{il} γ̇_l + Σ_k g_{ki} γ̇_k ).

Differentiating and gathering the γ̈_s terms on one side, we get

(13.2) Σ_s g_{is} γ̈_s = −(1/2) Σ_{k,l} ( ∂g_{ki}/∂x_l + ∂g_{il}/∂x_k − ∂g_{kl}/∂x_i ) γ̇_l γ̇_k.

Here we used the fact that g_{is} = g_{si}; this is where the 1/2 comes from. As before we denote the entries of the inverse of the matrix (g_{αβ}) by g^{αβ}, so that Σ_β g^{αβ} g_{βγ} = δ_{αγ}. Therefore if we multiply both sides of (13.2) by g^{ji} and sum on i we get

γ̈_j = −(1/2) Σ_{i,k,l} g^{ji} ( ∂g_{ki}/∂x_l + ∂g_{il}/∂x_k − ∂g_{kl}/∂x_i ) γ̇_l γ̇_k = −Σ_{k,l} Γ^j_{kl} γ̇_k γ̇_l,

where the Γ^j_{kl} are the Christoffel symbols of the Levi-Civita connection (cf. (12.3)). We now see that this is the geodesic equation (11.8). Thus, L-critical curves are geodesics and vice versa. □

The result for Lagrangians on R^m, Theorem 13.3, and the corresponding result for geodesics, Proposition 13.5, generalize to the manifold setting. To be precise, recall that if (x_1, …, x_m) : U → R^m is a coordinate chart on a manifold M, then it defines an associated coordinate chart (x_1, …, x_m, v_1, …, v_m) : TU → R^m × R^m on the tangent bundle of M. Namely, if q ∈ U is a point and w ∈ T_qU = T_qM is a vector, then there are unique numbers v_1 = v_1(w), …, v_m = v_m(w) so that

w = Σ_i v_i(w) (∂/∂x_i)|_q,

since {(∂/∂x_i)|_q} is a basis of T_qM. Of course, v_i(w) = (dx_i)_q(w).

Proposition 13.6. Let M be a manifold and L : TM → R a Lagrangian. If a path γ⁰ : [a,b] → M lies entirely inside a coordinate chart (x_1, …, x_m) : U → R^m (i.e., γ⁰([a,b]) ⊂ U), then γ⁰ is L-critical if and only if

(γ⁰_1(t), …, γ⁰_m(t), γ̇⁰_1(t), …, γ̇⁰_m(t)) := (x_1 ∘ γ⁰(t), …, x_m ∘ γ⁰(t), v_1 ∘ γ̇⁰(t), …, v_m ∘ γ̇⁰(t))

satisfies the Euler-Lagrange equations. Here, as above, (x_1, …, x_m, v_1, …, v_m) : TU → R^m × R^m is the coordinate chart on the tangent bundle TM associated with the chart (x_1, …, x_m) : U → R^m on the manifold M.

Proof. The only possible concern is that the image of a variation γ_s of our curve γ⁰ lies outside the domain U of our coordinate chart. But we only care about γ_s for s small, and for small values of the parameter s the variation γ_s(t) is close to γ⁰(t), hence lies in U. □

From Propositions 13.5 and 13.6 we deduce:

Corollary 13.6.1. Let M be a manifold with a Riemannian metric g. A path γ⁰ : [a,b] → M lying inside a coordinate chart on M is a geodesic for the metric g if and only if γ⁰ is critical for the energy Lagrangian L(x, v) = (1/2) g_x(v, v).

What about L-critical paths whose images cannot be covered by a single coordinate chart? Suppose γ : [a,b] → M is L-critical and for some time t_0 the point γ(t_0) lies in a coordinate chart (x_1, …, x_m) : U → R^m. Then γ([a′, b′]) ⊂ U for some subinterval [a′, b′] ⊂ [a,b] containing t_0. Any variation of γ|_{[a′,b′]} (with fixed endpoints) extends to a variation of γ. Hence γ|_{[a′,b′]} is also L-critical. Therefore it satisfies the Euler-Lagrange equations in the chart U. In particular, if γ is critical for the energy Lagrangian, then γ is a geodesic in every coordinate chart, hence a geodesic. This proves one global direction of Theorem 13.2, as promised. The converse is true as well, but this requires a coordinate-free description of L-critical curves which we don't have time for.
