<<

True infinitesimal differential

Mikhail G. Katz∗ Vladimir Kanovei Tahl Nowik

M. Katz, Department of , Bar Ilan Univer- sity, Ramat Gan 52900 Israel E-mail address: [email protected] V. Kanovei, IPPI, Moscow, and MIIT, Moscow, Russia E-mail address: [email protected] T. Nowik, Department of Mathematics, Bar Ilan Uni- versity, Ramat Gan 52900 Israel E-mail address: [email protected] ∗Supported by the Israel Science Foundation grant 1517/12.

Abstract. This text is based on two courses taught at Bar Ilan University. One is the undergraduate course 89132 taught to a group of about 120 freshmen. This course used infinitesimals to introduce all the basic concepts of the , mainly following Keisler’s textbook. The other is the graduate course 88826 taught to about 10 graduate students. The graduate course developed two viewpoints on infinitesimal generators of flows on manifolds. Much of what we say here is deeply affected by the pedagogical experience of teaching the two groups. The classical notion of an infinitesimal generator of a flow on a manifold is a (classical) vector field whose relation to the actual flow is expressed via integration. A true infinitesimal framework allows one to re-develop the foundations of differential geometry in this area. The True Infinitesimal Differential Geometry (TIDG) framework enables a more transparent relation between the infini- tesimal generator and the flow. Namely, we work with a combinatorial object called a hyper- real walk constructed by hyperfinite iteration. We then deduce the continuous flow as the real shadow of the said walk. Namely, a vector field is defined via an infinitesimal displace- ment in the manifold itself. Then the walk is obtained by iteration rather than integration. We introduce synthetic combinatorial con- ditions Dk for the regularity of a vector field, replacing the classical analytic conditions of Ck type, that guarantee the usual theorems such as uniqueness and existence of solution of ODE locally, Frobe- nius theorem, Lie bracket, small oscillations of the pendulum, and other concepts. Here we cover vector fields, infinitesimal generator of flow, Frobenius theorem, hyperreals, infinitesimals, , ultrafilters, ultrapower construction, hyperfinite partitions, micro- continuity, internal sets, halo, prevector, prevector field, regularity conditions for prevector fields, flow of a prevector field, relation to classical flow. Contents

Chapter 1. Preface 11

Part 1. True Infinitesimal Differential Geometry 13 Chapter 2. Introduction 15 2.1. Breakdown by chapters 16 2.2. Historical remarks 17 2.3. Preview of flows and infinitesimal generators 18 Chapter 3. A hyperreal view 19 3.1. Successive extensions N, Z, Q, R, ∗R 19 3.2. Motivating discussion for infinitesimals 20 3.3. Introduction to the transfer principle 21 3.4. Transfer principle 24 3.5. Orders of magnitude 25 3.6. Standard part principle 27 3.7. Differentiation 27 Chapter 4. Ultrapower construction and transfer principle 31 4.1. Introduction to filters 31 4.2. Examples of Filters 32 4.3. Properties of filters 33 4.4. Real field as quotient of of rationals 34 4.5. Extension of Q 34 4.6. Examples of infinitesimals 34 4.7. Ultrapower construction of the hyperreal field 35 Chapter 5. , EVT, internal sets 37 5.1. Construction via equivalence relation 37 5.2. Extending the ordered field language 37 5.3. Existential and Universal Transfer 39 5.4. Examples of first order statements 39 5.5. Galaxies 41 5.6. Microcontinuity 43 5.7. Uniform continuity in terms of microcontinuity 45 3 4 CONTENTS

5.8. Hyperreal 46 5.9. Intermediate value theorem 47 5.10. Ultrapower construction applied to (R); internal sets 48 5.11. The set of infinite hypernaturals is notP internal 49 5.12. Transfering the well-ordering of N; Henkin semantics 50 5.13. From hyperrationals to reals via ultrapower 51 5.14. Repeatingtheconstruction 53

Chapter 6. Application: small oscillations of a pendulum 55 6.1. Small oscillations of a pendulum 55 6.2. Vector fields, walks, and flows 56 6.3. The hyperreal walk and the real flow 57 6.4. pq 58 6.5. Infinitesimal oscillations 59 6.6. Linearisation 59 6.7. Adjusting linear prevector field 60 6.8. Conclusion 60

Chapter 7. Differentiable manifolds 63 7.1. Definition of differentiable manifold 63 7.2. Open submanifolds, Cartesian products 64 7.3. Tori 65 7.4. Projective spaces 66 7.5. Definition of a vector field 67 7.6. Zeros of vector fields 68 7.7. 1-parameter groups of transformations of a manifold 69 7.8. Invariance of infinitesimal generator under flow 70 7.9. Lie derivative and Lie bracket 71

Chapter 8. Theorem of Frobenius 73 8.1. Invariance under diffeomorphism 73 8.2. Frobenius thm, commutation of vector fields and flows 75 8.3. Proof of Frobenius 76 8.4. A-trackandB-trackapproaches 78 8.5. Relations of and 78 8.6. Smooth manifolds;≺ notion≺≺ nearstandardness 79 8.7. Results for general culture on topology 80

Chapter 9. Gradients, Taylor, prevectorfields 83 9.1. Propertiesofthenaturalextension 83 9.2. Remarks on gradient 83 9.3. Prevectors 85 9.4. space to a manifold via prevectors 86 CONTENTS 5

9.5. Action of a prevector on smooth functions 87 9.6. Maps acting on prevectors 88 Chapter 10. Prevector fields, classes D1 and D2 91 10.1. Prevector fields 91 10.2. Realizing classical vector fields 92 10.3. Regularity class D1 for prevectorfields 94 10.4. Second differences, class D2 94 10.5. Secord-order mean value theorem 95 10.6. The condition C2 implies D2 96 10.7. Bounds for D1 prevector fields 97 Chapter 11. Prevectorfield bounds via underspill, walks, flows 99 11.1. Operations on prevector fields 99 11.2. Sharpening the bounds via underspill 100 11.3. Further bounds via underspill 101 11.4. Global prevector fields, walks, and flows 102 11.5. Internal induction 104 11.6. Dependence on initial condition and prevector field 104 11.7. Dependence on initial conditions for local prevector field 106 11.8. Inducing a real flow 108 11.9. Geodesic flow 109 Chapter 12. Universes, transfer 111 12.1. Universes 111 12.2. Therealsasthebaseset 112 12.3. More sets in universes 113 12.4. Membership relation 114 12.5. Ultrapower construction of nonstandard universes 114 12.6. Los theorem 117 12.7. Elementary embedding 119 12.8. Definability 122 12.9. Conservativity 123 Chapter13. Moreregularity 127 13.1. D2 implies D1 127 13.2. Injectivity 128 13.3. Time reversal 129

Part 2. 500 yrs of infinitesimal math: Stevin to Nelson 131 Chapter 14. 16th century 133 14.1. Simon Stevin 133 14.2. Decimals from Stevin to Newton and Euler 135 6 CONTENTS

14.3. Stevin’s cubic and the IVT 136 Chapter 15. 139 15.1. 139 15.2. James Gregory 139 15.3. Gottfried Wilhelm von Leibniz 139 Chapter 16. 18th century 141 16.1. 141 Chapter 17. 19th century 143 17.1. Augustin-Louis Cauchy 143 Chapter 18. 20th century 145 18.1. Thoralf Skolem 145 18.2. Edwin Hewitt 145 18.3. JerzyLo´s 145 18.4.AbrahamRobinson 145 18.5. Edward Nelson 145 Bibliography 147 Chapter 19. Theorema Egregium 153 19.1. Preliminaries to the Theorema Egregium of Gauss 153 19.2. The theorema egregium ofGauss 155 19.3. Intrinsic versus extrinsic 156 19.4. Gaussian curvature and Laplace-Beltrami operator 157 19.5. Duality in linear algebra 158 19.6. Polar, cylindrical, and spherical coordinates 159 19.7. Duality in calculus; derivations 161 Chapter20. Derivations,tori 163 20.1. Space of derivations 163 20.2. Constructing bilinear forms out of 1-forms 165 20.3. Firstfundamentalform 165 20.4. Expressing the metric in terms of 1-forms 166 20.5. Hyperbolic metric 167 20.6. Surface of revolution 167 20.7. Coordinate change 169 20.8. Conformal equivalence 170 20.9. Uniformisation theorem for tori 170 20.10. Isothermalcoordinates 171 20.11. Tori of revolution 172 Chapter 21. Conformal parameter 175 CONTENTS 7

21.1. Conformal parameter τ of standard tori of revolution 175 21.2. Comformal parameter ofstandard torus ofrev 176 21.3. Curvature expressed in terms of θ(s) 178 21.4. Total curvature of a convex Jordan curve 179 ˜ 21.5. Signed curvature kα of Jordan curves in the plane 180 21.6. Coordinate patch defined by family of normal vectors 183 21.7. Total Gaussian curvature and Gauss–Bonnet theorem 185

Chapter 22. Addendum on a Bernoullian framework 187 22.1. Ultrapower construction 187 22.2. Saturation and topology 187

Chapter 23. More on Gauss-Bonnet 189 23.1. Second proof of Gauss–Bonnet theorem 189 23.2. Euler characteristic 189 23.3. Exercise on index notation 190

Chapter 24. Bundles 191 24.1. Trivial bundle (E,B,π), as a section (mikta) 191 24.2. Mobius band as a first example of a nontrivial bundle 192 24.3. Tangent bundle 192 24.4. Hairy ball theorem 193 24.5. Cotangent bundle 193 24.6. Exterior derivative of a function 194 24.7. Extending gradient and exterior derivative 194 24.8. Plan for building de Rham cohomology 195 24.9. Exterioralgebra,area,determinant,rank 196 24.10. Formal definition, alternating property 196 24.11. Example: rank 1 197 24.12. Areas in the plane 197 24.13. Vectorandtripleproducts 199 24.14. Anticommutativity of the wedge product 200 24.15. The k-exterior power; simple (decomposable) multivectors200 24.16. Basis and dimension of exterior algebra 201 24.17. Rank of a multivector 201

Chapter 25. Multilinear forms; exterior differential complex 203 25.1. Formalconstructionoftheexterioralgebra 203 25.2. Exterior differential complex 204 25.3. Alternating multilinear forms 205 25.4. Case of 2-forms 206 25.5. Norm on 1-forms 206 25.6. Polarcoordinatesanddualnorms 207 8 CONTENTS

25.7. Dual bases and dual lattices in a Euclidean space 208

Chapter 26. Comass norm, Hermitian product, symplectic form 211 2 2 n 26.1. Λ (V ) is isomorphic to the standard model Λ0(R ) 211 26.2. Euclidean norm on k-multivectors and k-forms 212 26.3. Comassnormofaform 212 26.4. Symplectic form from a complex viewpoint 213 26.5. Hermitian product 214 26.6. Orthogonal diagonalisation 215

Chapter 27. Wirtinger inequality, projective spaces 217 27.1. Wirtinger inequality 217 27.2. Wirtinger inequality for an arbitrary 2-form 219 27.3. Realprojectivespace 220 27.4. Complex projective line CP1 221 Chapter 28. Complex projective spaces, de Rham cohomology 223 28.1. A clarification 223 28.2. Complex projective space CPn 223 28.3. Real projective space as a quotient 224 28.4. Unitary group 224 28.5. Hopf bundle 224 28.6. Flags and cell decomposition of CPn 225 28.7. g, A, and J in a complex 226 28.8. Fubini-Study 2-form on CP1 227 28.9. Exterior differential complex 228 28.10. De Rham cocycles and coboundaries 229 28.11. De Rham cohomology ring of M, Betti 229

Chapter 29. de Rham cohomology continued 231 29.1. Comparison of topology and de Rham cohomology 231 29.2. de Rham cohomology of a circle 231 29.3. 2-dimensional cohomology of CP1 232 29.4. 2-dimensionalcohomologyofthetorus 233

Chapter 30. Elements of homology theory 235 30.1. Relation between Fubini-Study 2-form and volume form on CPn 235 30.2. Cup product 236 30.3. Cohomology of complex projective space 236 30.4. Abelianisation in 237 30.5. The fundamental group π1(M) 237 30.6. The second homotopy group π2(M) 238 30.7. First singular homology group 238 CONTENTS 9

Chapter31. Stablenorminhomology 241 31.1. Length, area, volume of cycles 241 31.2. Length in 1-dimensional homology of surfaces 241 31.3. Stable norm for higher-dimensional manifolds 242 31.4. Stable 1-systole 243 31.5. 2-dimensional homology and systole 244 31.6. Stable norm in two-dimensional homology 244 31.7. Pu’s inequality 245 31.8. Integration of a 1-form over a circle 246 31.9. Volumeformofasurface 247 31.10. Duality of homology and de Rham cohomology 248

Chapter 32. Orientation and fundamental class 251 32.1. Orientation and induced orientation 251 32.2. Fundamental class in homology 252 32.3. Duality of comass and stable norm 253 32.4. Fubini-Study metric on complex projective line 254 32.5. Fubini-Study metric on complex projective space 255 32.6. Gromov’s inequality for complex projective space 256

Chapter33. ProofofGromov’stheorem 257 33.1. Homology class α and cohomology class ω 257 33.2. Comass and application of Wirtinger inequality 257 33.3. Volume calculation 258

Chapter 34. Eiseinstein, Hermite, Loewner, Pu, and Guth 261 34.1. Eisenstein and Hermite constant 261 34.2. Standard fundamental domain and Eisenstein integers 262 34.3. Loewner’s torus inequality 263 34.4. Loewner’s inequality with defect 263 34.5. Computational formula for the variance (shonut) 264 34.6. An application 264 34.7. Fundamental domain and Loewner’s torus inequality 265 34.8. Boundary case of equality 266 34.9. Hopf fibration h 267 34.10. Tangent map 267 34.11. Riemannian submersions 268 34.12. Hamilton 268 34.13. Complex structures on the algebra H 269 Chapter 35. From surface tension to Loewner’s inequality 271 35.1. Surfacetensionandshapeofawaterdrop 271 35.2. Isoperimetric inequality in the plane 271 10 CONTENTS

35.3. Central symmetry 271 35.4. Property of a centrally symmetric polyhedron in 3-space 272 35.5. Notion of systole 272 35.6. The real projective plane RP2 273 35.7. Pu’s inequality 273 35.8. Loewner’s torus inequality 274 35.9. Bonnesen’s inequality 274 Chapter 36. Pu’s inequality 275 36.1. Fromcomplexstructuretofibration 275 36.2. Lie groups 276 36.3. A special isomorphism 276 36.4. Sphere as homogeneous space 277 36.5. Dualrealprojectiveplane 278 36.6. A pair of fibrations 278 36.7. Double fibration of SO(3) and geometry on S2 279 36.8. An inequality 280 36.9. Proof of Pu’s inequality 281 Chapter 37. Gromov’s essential inequality 283 37.1. Essential manifolds 283 37.2. Gromov’s inequality for essential manifolds 283 37.3. Filling radius of a loop in the plane 283 37.4. The sup-norm 285 37.5. Riemannian manifolds as spaces with a distance function285 37.6. Kuratowski imbedding 285 37.7. Homologytheoryforgroups 285 37.8. Relative filling radius 286 37.9. Absolute filling radius 286 37.10. further 286 CHAPTER 1

Preface

Terence1 Tao has recently published a number of monographs where he exploits ultrapowers in general, and Robinson’s infinitesimals in par- ticular, as a fundamental tool; see e.g., [Tao 2014], [Tao & Vu 2016]. In the present text, we apply such an approach to some basic topics in differential geometry. Differential geometry has its roots in the study of curves and sur- faces by methods of in the 17th century. Ac- cordingly it inherited the infinitesimal methods typical of these early studies in analysis. Weierstrassian effort at the end of the 19th century solidified mathematical foundations by breaking with the infinitesimal mathematics of Leibniz, Euler, and Cauchy, but in many ways the baby was thrown out with the water, as well. This is because mathematics, including differential geometry, was stripped of the intuitive appeal and clarity of the earlier studies. Shortly afterwards, the construction of a non-Archimedean totally ordered field by Levi-Civita in 1890s paved the way for a mathemat- ically correct re-introduction of infinitesimals. Further progress even- tually lead to ’s hyperreal framework, a modern foundational theory which introduces and treats infinitesimals to great effect with full mathematical rigor. Since then, Robinson’s framework has been used in a foundational role both in mathematical analysis [Keisler 1986] and other branches of mathematics, such as differ- ential equations [Benoit 1997], measure, and stochastic analysis [Herzberg 2013], [Albeverio et al. 2009], asymptotic se- ries [Van den Berg 1987], mathematical economics [Sun 2000], etc. These developments have led both to new mathematical results and to new insights and simplified proofs of old results. The goal of the present monograph is to restore the original infini- tesimal approach in differential geometry, on the basis of true infinitesi- mals in Robinson’s framework. In the context of vector fields and flows, the idea is to work with a combinatorial object called a hyperreal walk

1Issues in note 7 (chapter 5), note 2 (chapter11), note 10 (chapter 11), (chap- ter 12), (chapter 12). 11 12 1. PREFACE constructed by hyperfinite iteration. We then deduce the continuous flow as the real shadow of the said walk. We address the book to a reader willing to benefit from both the in- tuitive clarity and mathematical rigor of this modern approach. Read- ers with prior experience in differential geometry will find new (or rather old but somewhat neglected since Weierstrass) insights on known concepts and arguments, while readers with experience in applications of the hyperreal framework will find a new and fascinating application in a thriving area of modern mathematics. Part 1

True Infinitesimal Differential Geometry

CHAPTER 2

Introduction

The classical notion of an infinitesimal generator of a flow on a manifold is a (classical) vector field whose relation to the actual flow is expressed via integration. A true infinitesimal framework allows one to re-develop the foundations of differential geometry in this area in such a way that the relation between the infinitesimal generator and the flow becomes more transparent. The idea is to work with a combinatorial object called a hyper- real walk constructed by hyperfinite iteration. We then deduce the continuous flow as the real shadow of the said walk. Thus, a vector field is defined via an infinitesimal displacement defined by a self-map F of the manifold itself. The flow is then obtained N by iteration F ◦ (rather than integration), as in Euler’s method. As a result, the proof of the invariance of a vector field under a flow becomes a consequence of basic set-theoretic facts such as the associativity of N composition, or more precisely the commutation relation F F ◦ = N ◦ F ◦ F . Furthermore,◦ there are synthetic, or combinatorial, conditions Dk for the regularity of a (pre)vector field, replacing the usual analytic con- ditions of Ck type, that guarantee the usual theorems such as unique- ness and existence of solution of ODE locally, Frobenius theorem, Lie bracket, and other concepts. A pioneering work in applying true infinitesimals to treat differen- tial geometry of curves and surfaces is [Stroyan 1977]. Our approach is different because Stroyan’s point of departure is the analytic class C1 of vector fields, whereas we use purely combinatorial synthetic condi- tions D1 (and D2) in place of C1 (and C2). Similarly, the Ck conditions were taken as the point of departure in [Stroyan & Luxemburg 1976], [Lutz & Goze 1981], and [Almeida, Neves & Stroyan 2014]. To illustrate the advantages of this approach, we provide a proof of a case of the theorem of Frobenius on commuting flows. We also treat small oscillations of a pendulum (see Section 6.1), where the idea that the period of oscillations with infinitesimal amplitude should be independent of the amplitude finds precise mathematical expression.

15 16 2. INTRODUCTION

One advantage of the hyperreal approach to solving a differential equation is that the hyperreal walk exists for all time, being defined combinatorially by iteration of a selfmap of the manifold. The focus therefore shifts away from proving the existence of a solution, to estab- lishing the properties of a solution. Thus, our estimates show that given a uniform Lipschitz bound on the vector field, the hyperreal walk for all finite time stays in the finite part of the ∗M. If M is complete then the finite part is nearstandard. Our estimates they imply that the hyperreal walk for all finite time descends to a real flow on M.

2.1. Breakdown by chapters In Chapters 3 and 4, we introduce the basic notions of an extended number system while avoiding excursions into mathematical logic that are not always accessible to students trained in today’s undergraduate and graduate programs. We start with a syntactic account of a number system containing infinitesimals and infinite numbers, and progress to the relevant concepts such as filter, ultrapower construction, the exten- sion principle (from sets and functions over the reals to their natural extensions over the hyperreals), and the transfer principle. In Chapter 5, we present some typical notions and results from undergraduate calculus and analysis, such as continuity and uniform continuity, extreme value theorem, etc., from the point of view of the extended number system, introduce the notion of gener- alizing that of a natural extention of a real set, and give a related construction of the reals out of hyperrationals. In Chapter 6.1 we apply the above to treat infinitesimal oscillations of the pendulum. In Chapter 7.1, we present the traditional approach to differentiable manifolds, vector fields, flows, and 1-parameter families of transforma- tions. In Chapter 7.8, we present the traditional “A-track” (see section 2.2) approach to the invariance of a vector field under its flow and to the theorem Frobenius on the commutation of flows. In Chapters 8.4 and 9.6, we present an approach to vectors and vec- tor fields as infinitesimal displacements, and obtain bounds necessary for proving the existence of the flow locally. In Chapters 10 and 11 we approach flows as iteration of the (pre)vector field. Chapters 12 and following contain more advanced material on uni- verses. 2.2. HISTORICAL REMARKS 17

2.2. Historical remarks Many histories of analysis are based on a default Weierstrassian foundation taken as a primary point of reference. In contrast, the article [Bair et al. 2013] developed an approach to the history of analysis as evolving along separate, and sometimes competing, tracks. These are: the A-track, based upon an Archimedean continuum; and • the B-track, based upon what we refer to as a Bernoullian (i.e., • infinitesimal-enriched) continuum.1 Historians often view the work in analysis from the 17th to the mid- dle of the 19th century as rooted in a background notion of continuum that is not punctiform. This necessarily creates a tension with modern, punctiform theories of the continuum, be it the A-type set-theoretic continuum as developed by Cantor, Dedekind, Weierstrass, and oth- ers, or B-type continua as developed by [Hewitt 1948], [Lo´s1955 ], [Robinson 1966], and others. How can one escape a trap of pre- sentism in interpreting the past from the viewpoint of set-theoretic foundations commonly accepted today, whether of type A or B? A possible answer to this query resides in a distinction between procedure and ontology. In analyzing the work of Fermat, Leibniz, Euler, Cauchy, and other great of the past, one must be careful to distinguish between (1) its practical aspects, i.e., actual mathematical practice involv- ing procedures and inferential moves, and, (2) semantic aspects related to the actual construction of the enti- ties such as points of the continuum, i.e., issues of the ontology of mathematical entities such as numbers or points. In Chapter 14 we provide historical comments so as to connect the notions of infinitesimal analysis and differential geometry with their historical counterparts, with the proviso that the connection is meant in the procedural sense as explained above.

1Scholars attribute the first systematic use of infinitesimals as a foundational concept to Johann Bernoulli. While Leibniz exploited both infinitesimal methods and “exhaustion” methods usually interpreted in the context of an Archimedean continuum, Bernoulli never wavered from the infinitesimal methodology. To note the fact of such systematic use by Bernoulli is not to say that Bernoulli’s foundation is adequate, or that that it could distinguish between manipulations with infinites- imals that produce only true results and those manipulations that can yield false results. 18 2. INTRODUCTION

2.3. Preview of flows and infinitesimal generators As discussed in Section 7.7, the infinitesimal generator X of a flow θt(p) = θ(t, p): R M M on a differentiable manifold M is a (classical) vector field× defined→ by the relation 1 Xp(f) = lim f(θ∆t(p)) f(p) ∆t 0 → ∆t − satisfied by all functions f on M. Even though the term “infinitesimal generator” uses the adjective “infinitesimal”, this traditional notion does not actually exploit any infinitesimals. The formalism is somewhat involved, as can be sensed already in the proof of the invariance of the infinitesimal generator under the flow it generates, as well as the proof of the special case of the Frobenius theorem on commuting flows in Section 8.2. We will develop an alternative formalism where the infinitesimal generator of a flow is an actual infinitesimal (pre)vector field. As an application, we will present more transparent proofs of both the invari- ance of a vector field under its flow and the Frobenius theorem in this case. CHAPTER 3

A hyperreal view

3.1. Successive extensions N, Z, Q, R, ∗R Our reference for true infinitesimal calculus is Keisler’s textbook [Keisler 1986], downloadable at http://www.math.wisc.edu/~keisler/calc.html We start by motivating the familiar of extensions of num- ber systems N ֒ Z ֒ Q ֒ R → → → in terms of their applications in arithmetic, algebra, and geometry. Each successive extension is introduced for the purpose of solving prob- lems, rather than enlarging the number system for its own sake. Thus, the extension Q R enables one to express the length of the diagonal of the unit square⊆ and the area of the unit disc in our number system. The familiar continuum R is an Archimedean continuum, in the sense that it satisfies the following Archimedean property: ( ǫ> 0)( n N)[nǫ > 1] . ∀ ∃ ∈ We therefore provisionally denote this continuum A where “A” stands for Archimedean. Thus we obtain a chain of extensions ,N ֒ Z ֒ Q ֒ A → → → as above. In each case one needs an enhanced ordered1 number system to solve an ever broader range of problems from algebra or geometry. Some historical remarks on Simon Stevin and his real numbers can be found in Section 14.1. The next stage is the extension ,N ֒ Z ֒ Q ֒ A ֒ B → → → → where B is a Bernoullian continuum containing infinitesimals, defined as follows. Definition 3.1.1. A Bernoullian extension of R is any proper ex- tension which is an ordered field.

1sadur 19 20 3. A HYPERREAL VIEW

Any Bernoullian extension allows us to define infinitesimals and do interesting things with those. But things become really interesting if we assume the Transfer Principle (Section 3.4), and work in a true hyperreal field, defined as in Definition 3.3.1 below. We will provide some motivating comments for the transfer principle in Section 3.3.

3.2. Motivating discussion for infinitesimals Infinitesimals can be motivated from three different angles: geo- metric, algebraic, and arithmetic/analytic.

θ

Figure 3.2.1. Horn angle θ is smaller than every recti- linear angle

Here is another figure (1) Geometric (horn angles): Some students have expressed the sentiment that they did not understand infinitesimals until they heard a geometric explanation of them in terms of what was classically known as horn angles. A horn angle is the crivice between a circle and its tangent line at the point of tangency. If one thinks of this crevice as a quantity, it is easy to convince oneself that it should be smaller than every rectilinear angle (see Figure 3.2.1). This is because a sufficiently small arc of the circle will be contained in the convex region cut out by the rectilinear angle no matter how small. When one renders this in terms of analysis and arithmetic, one gets a positive quantity smaller than every positive . We cite this example merely as intuitive motivation (our actual construction is different). 3.3. INTRODUCTION TO THE TRANSFER PRINCIPLE 21

(2) Algebraic (passage from ring to field): The idea is to represent an infinitesimal by a sequence tending to zero. One can get something in this direction without reliance on any form of the axiom of choice. Namely, take the ring S of all sequences of real numbers, with arithmetic operations defined term-by-term. Now quotient the ring S by the equivalence relation that declares two sequences to be equivalent if they differ only on a finite set of indices. The resulting object S/K is a proper ring extension of R, where R is embedded by means of the constant sequences. However, this object is not a field. For example, it has zero divisors. But quotienting it further in such a way as to get a field, by extending the kernel K to a maximal ideal K′, produces a field S/K′, namely a hyperreal field. (3) Analytic/arithmetic: One can mimick the construction of the reals out of the rationals as the set of equivalence classes of Cauchy sequences, and construct the hyperreals as equiva- lence classes of sequences of real numbers under an appropriate equivalence relation. This viewpoint is detailed in Section 4.5. Some more technical comments on the definability of a hyperreal line can be found in Section 12.8.

3.3. Introduction to the transfer principle The transfer principle is a type of theorem that, depending on the context, asserts that rules, laws or procedures valid for a certain num- ber system, still apply (i.e., are “transfered”) to an extended number system. Thus, the familiar extension Q R preserves the property of being an ordered field. To give a negative⊆ example, the exten- sion R R of the real numbers to the so-called extended reals does not⊆ preserve∪{±∞} the property of being an ordered field. The hyperreal extension R ∗R (defined below) preserves all first-order properties (i.e., properties⊆ involving quantification over elements but not over sets; see Section 5.4 for a fuller discussion) of ordered fields. For example, the formula sin2 x + cos2 x = 1, true over R for all real x, remains valid over ∗R for all hyperreal x, including infinitesimal and infinite values of x ∗R. ∈ Thus the transfer principle for the extension R ∗R is a theorem ⊆ asserting that any statement true over R is similarly true over ∗R, 22 3. A HYPERREAL VIEW and vice versa. Historically, the transfer principle has its roots in the procedures involving Leibniz’s .2 We will explain the transfer principle in several stages of increasing degree of abstraction. More details can be found in Sections 3.4, 5.2, and 5.3. Definition 3.3.1. An ordered field B, properly including the field A = R of real numbers (so that A $ B) and satisfying the Transfer Principle, is called a hyperreal field. If such an extended field B is fixed then elements of B are called hyperreal numbers,3 while the extended field itself is usually denoted ∗R. Theorem 3.3.2. Hyperreal fields exist. For example, a hyperreal field can be constructed as the quotient of the ring RN of sequences of real numbers, by an appropriate maximal ideal. This will be shown in Section 4.7. Definition 3.3.3. A positive infinitesimal is a positive ǫ such that ( n N)[nǫ < 1] ∀ ∈ More generally, we have the following. Definition 3.3.4. A hyperreal number ε is said to be infinitely small or infinitesimal if a<ε 0 1 is infinitesimal then N = ε is positive infinite, i.e., greater than every real number. A hyperreal number that is not an infinite number are called finite. Sometimes the term limited is used in place of finite. Keisler’s textbook exploits the technique of representing the hyper- real line graphically by means of dots indicating the separation between the finite realm and the infinite realm. One can view infinitesimals with microscopes as in Figure 3.6.1. One can also view infinite numbers with telescopes as in Figure 3.3.1. We have an important subset

finite hyperreals ∗R { } ⊆ 2Leibniz’s theoretical strategy in dealing with infinitesimals was analyzed in detailed studies [Katz & Sherry 2012] as well as [Katz & Sherry 2013], [Sherry & Katz 2014], [Bair et al. 2013]. We found Leibniz’ strategy to be more robust than George Berkeley’s flawed critique thereof. 3Similar terminology is used with regard to integers and ; see Section 5.5. 3.3. INTRODUCTION TO THE TRANSFER PRINCIPLE 23

Positive Finite infinite

2 1 0 1 2 − −

1/ǫ Infinite telescope 1 1 1 +1 ǫ − ǫ

Figure 3.3.1. Keisler’s telescope which is the domain of the function called the (also known as the shadow) which rounds off each finite hyperreal to the nearest real number (for details see Section 3.6).

2 Example 3.3.5. Slope calculation of y = x at x0. Here we use standard part function (also called the shadow)

st : finite hyperreals R { }→ (for details concerning the shadow see Section 3.6). If a curve is defined 2 by y = x we wish to find the slope at the point x0. To this end we use an infinitesimal x-increment ∆x and compute the corresponding y- increment ∆y =(x + ∆x)2 x2 =(x + ∆x + x )(x + ∆x x )=(2x + ∆x)∆x. 0 − 0 0 0 − 0 0 The corresponding “average” slope is therefore ∆y =2x + ∆x ∆x 0 which is infinitely close to 2x, and we are naturally led to the definition ∆y ∆y of the slope at x0 as the shadow of ∆x , namely st ∆x =2x0.  24 3. A HYPERREAL VIEW

The extension principle expresses the idea that all real objects have natural hyperreal counterparts. We will be mainly interested in sets, functions and relations. We then have the following extension principle. Extension principle. The order relation on the hyperreals con- tains the order relation on the reals. There exists a hyperreal number greater than zero but smaller than every positive real. Every set D R ⊆ has a natural extension ∗D ∗R. Every real function f with domain D ⊆ 4 has a natural hyperreal extension ∗f with domain ∗D. Here the naturality of the extension alludes to the fact that such an extension is unique, and the coherence refers to the fact that the domain of the natural extension of a function is the natural extension of its domain. A positive infinitesimal is a positive hyperreal smaller than every positive real. A negative infinitesimal is a negative hyperreal greater than every negative real. An arbitrary infinitesimal is either a positive infinitesimal, a negative infinitesimal, or zero. Ultimately it turns out counterproductive to employ asterisks for hyperreal functions (in fact we already dropped it in equation (3.4.1)). This issue will be dealt with in (5.4.3).

3.4. Transfer principle Definition 3.4.1. The Transfer Principle asserts that every first- order statement true over R is similarly true over ∗R, and vice versa. Here the adjective first-order alludes to the limitation on quantifi- cation to elements as opposed to sets, as discussed in more detail in Section 5.4. Listed below are a few examples of first-order statements. A de- tailed account in terms of a first-order limitation can be found in Sec- tion 5.4. Example 3.4.2. The commutativity rule for addition x+y = y +x is valid for all hyperreal x, y by the transfer principle. Example 3.4.3. The formula sin2 x + cos2 x = 1 (3.4.1) is valid for all hyperreal x by the transfer principle.

4Here the noun principle (in extension principle) means that we are going to ∗ assume that there is a function f : B B which satisfies certain properties. → ∗ It is a separate problem to define B which admits a coherent definition of f for all f : A A, to be solved below. → 3.5. ORDERS OF MAGNITUDE 25

Example 3.4.4. The statement 1 1 0

Example 3.4.5. The characteristic function χQ of the rational numbers equals 1 on rational inputs and 0 on irrational inputs. By ∗ the transfer principle, its natural extension ∗χQ = χ Q will be 1 on hyperrational numbers ∗Q and 0 on hyperirrational numbers (namely, numbers in the complement ∗R ∗Q). \ To give additional examples of real statements to which transfer applies, note that all ordered field-statements are subject to Trans- fer. As we will see below, it is possible to extend Transfer to a much broader category of statements, such as those containing the function symbols exp or sin or those that involve infinite sequences of reals. A hyperreal number x is finite if there exists a real number r such that x < r. A hyperreal number is called positive infinite if it is greater| | than every real number, and negative infinite if it is smaller than every real number.

3.5. Orders of magnitude Hyperreal numbers come in three orders of magnitude: infinitesi- mal, appreciable, and infinite. A number is appreciable if it is finite but not infinitesimal. In this section we will outline the rules for ma- nipulating hyperreal numbers. To give a typical proof, consider the rule that if ǫ is positive infini- 1 tesimal then ǫ is positive infinite. Indeed, for every positive real r we 1 have 0 <ǫ

Hb and HK are infinite. • We have the following rules for quotients. ǫ , ǫ , b are infinitesimal; • b H H b is appreciable; • c b , H , H are infinite provided ǫ =0. • ǫ ǫ b 6 We have the following rules for roots, where n is a standard . if ǫ> 0 then √n ǫ is infinitesimal; • n if b> 0 then √b is appreciable; • if H > 0 then √n H is infinite. • Note that the traditional topic of the so-called “indeterminate forms” can be treated without introducing any ad-hoc terminology by means of the following remark.

ǫ H Remark 3.5.2. We have no rules in certain cases, such as δ , K , Hǫ, and H + K. Theorem 3.5.3. Arithmetic operations on the hyperreal numbers are governed by the following rules. (1) every hyperreal number between two infinitesimals is infinites- imal. (2) Every hyperreal number which is between two finite hyperreal numbers, is finite. (3) Every hyperreal number which is greater than some positive infinite number, is positive infinite. (4) Every hyperreal number which is less than some negative infi- nite number, is negative infinite. Example 3.5.4. The difference √H +1 √H 1 (where H is infinite) is infinitesimal. Namely, − − √H +1 √H 1 √H +1+ √H 1 √H +1 √H 1= − − − − − √H +1+ √H 1  −  H +1 (H 1) = − −  √H +1+ √H 1 − 2 =  √H +1+ √H 1 − is infinitesimal. Once we introduce limits, this example can be refor- mulated as follows: limn √n +1 √n 1 = 0. →∞ − −  3.7. DIFFERENTIATION 27

Definition 3.5.5. Two hyperreal numbers a, b are said to be in- finitely close, written a b, ≈ if their difference a b is infinitesimal. − It is convenient also to introduce the following terminology and notation. Definition 3.5.6. Two nonzero hyperreal numbers a, b are said to be adequal, written a pq b, if either a 1 or a = b = 0. b ≈ Note that the relation sin x x for infinitesimal x is immediate from the continuity of sine at the≈ origin (in fact both sides are infin- itely close to 0), whereas the relation sin x pq x is a subtler relation equivalent to the computation of the first order Taylor approximation of sine.

3.6. Standard part principle Theorem 3.6.1 (Standard Part Principle). Every finite hyperreal number x is infinitely close to an appropriate real number. Proof. The result is generally true for an arbitrary proper ordered field extension E of R. Indeed, if x E is finite, then x induces a ∈ Dedekind cut on the subfield Q R E via the total order of E. The real number corresponding to the⊆ Dedekind⊆ cut is then infinitely close to x.  The real number infinitely close to x is called the standard part, or shadow, denoted st(x), of x. We will use the notation ∆x, ∆y for infinitesimals. Remark 3.6.2. There are three consecutive stages in a typical cal- culation: (1) calculations with hyperreal numbers, (2) calculation with standard part, (3) calculation with real numbers.

3.7. Differentiation An infinitesimal increment ∆x can be visualized graphically by means of a microscope as in the Figure 3.7.1. The slope s of a function f at a real point a is defined by setting f(a + ∆x) f(a) s = st − ∆x   28 3. A HYPERREAL VIEW

1 0 1 2 3 4 −

r + α r r + β r + γ

st

1 0 1 r 2 3 4 − Figure 3.6.1. The standard part function, st, “rounds off” a finite hyperreal to the nearest real number. The func- tion st is here represented by a vertical projection. Keisler’s “infinitesimal microscope” is used to view an infinitesimal neighborhood of a standard real number r, where α, β, and γ represent typical infinitesimals. Courtesy of Wikipedia. whenever the shadow exists (i.e., the ratio is finite) and is the same for each nonzero infinitesimal ∆x. The construction is illustrated in Figure 3.7.2. Definition 3.7.1. Let f be a real function of one real variable. The derivative of f is the new function f ′ whose value at a real x is the slope of f at x. In symbols, f(x + ∆x) f(x) f ′(x)=st − ∆x   whenever the slope exists.

f(x+∆x) f(x) − Equivalently, we can write f ′(x) ∆x . When y = f(x) we define a new dependent variable ∆≈y by settting

∆y = f(x + ∆x) f(x) − ∆y called the y-increment, so we can write the derivative as st ∆x .  3.7. DIFFERENTIATION 29

a

a a + ∆x

Figure 3.7.1. Infinitesimal increment ∆x under the microscope

Example 3.7.2. If f(x)= x2 we obtain the derivative of y = f(x) by the following direct calculation: ∆y f ′(x) ≈ ∆x (x + ∆x)2 x2 = − ∆x (x + ∆x x)(x + ∆x + x) = − ∆x ∆x(2x + ∆x) = ∆x =2x + ∆x 2x. ≈ 30 3. A HYPERREAL VIEW

f(a + ∆x) f(a) | − |

a a + ∆x

a

Figure 3.7.2. Defining slope of f at a

Given a function y = f(x) one defines the dependent variable ∆y = f(x+∆x) f(x) as above. One also defines a new dependent variable dy − by setting dy = f ′(x)∆x at a point where f is differentiable, and sets for symmetry dx = ∆x. Note that we have dy pq ∆y as in Definition 3.5.6 whenever dy = 0. We then have Leibniz’s notation 6 dy f ′(x)= dx and rules like the chain rule acquire an appealing form. CHAPTER 4

Ultrapower construction and transfer principle

To motivate the material on filters contained in this chapter, we will first outline a construction of a hyperreal field ∗R exploiting filters. A more detailed technical presentation of the construction appears in Section 4.7.

4.1. Introduction to filters Let RN denote the ring of sequences of real numbers, with arithmetic operations defined termwise. Then we have

N ∗R = R /MAX (4.1.1) where “MAX” is an appropriate maximal ideal. What we wish to emphasize is the formal analogy (4.1.1) and the construction of the real field as the quotient field of the ring of Cauchy sequences of rational numbers. Note that in both cases, the subfield is embedded in the superfield by means of constant sequences. We now describe a construction of such a maximal ideal. The ideal MAX consists of all “negligible” sequences un , i.e., sequences which vanish for a set of indices of full measure 1 namely,h i

m n N : un =0 =1. { ∈ } Here m : (N) 0,1 (thus m takes only two values, 0 and 1) is a finitelyP additive→ {measure} taking the value 1 on each cofinite set,1 where (N) is the set of subsets of N. The subset m (N) consisting of setsP of full measure m is called a free ultrafilter.F These⊆ P originate with [Tarski 1930]. The construction of a Bernoullian continuum outlined above was therefore not available prior to that date. The construc- tion outlined above is known as an ultrapower construction. The first construction of this type appeared in [Hewitt 1948], as did the term hyper-real.

1For each pair of complementary infinite subsets of N, such a measure m decides in a coherent way which one is negligible (i.e., of measure 0) and which is dominant (measure 1). 31 32 4. ULTRAPOWER CONSTRUCTION AND TRANSFER PRINCIPLE

As already mentioned, to present an infinitesimal-enriched contin- uum of hyperreals, we need some preliminaries on filters. Let I be a nonempty set (usually N). The power set of I is the set (I)= A : A I P { ⊆ } of all subsets of I. Definition 4.1.1. A filter on I is a nonempty collection (I) of subsets of I satisfying the following axioms: F ⊆ P Intersections: if A, B , then A B . • Supersets: if A and∈FA B ∩I, then∈FB . • ∈F ⊆ ⊆ ∈F Thus to show B , it suffices to show ∈F A A B, 1 ∩···∩ n ⊆ for some finite n and some A1,...,An . The full power set (I) is itself a filter.∈F P A filter contains the empty set ∅ if and only if = (I). F F P Definition 4.1.2. We say that is proper if ∅ . F 6∈ F Every filter contains I, and in fact the one-element set I is the smallest filter on I. { } Definition 4.1.3. An ultrafilter is a proper filter that satisfies the following additional property: for any A I, either A or Ac , where Ac = I A. • ⊆ ∈F ∈F \

4.2. Examples of Filters We present several examples of filters. (1) Principal ultrafilter. Choose i I. Then the collection ∈ i = A I : i A F { ⊆ ∈ } is an ultrafilter, called the principal ultrafilter generated by i. If I is finite, then every ultrafilter on I is of the form i for some i I, and so is principal. F (2) Fr´echet∈ filter. The filter Fre = A I : I A is finite is the cofinite, or Fr´echet, filterF on I, and{ ⊆ is proper\ 2 if and only} if I is infinite. Note that Fre is not an ultrafilter. F 2amiti 4.3. PROPERTIES OF FILTERS 33

(3) Filter generated by a collection. If ∅ = (I), then the filter generated by , i.e., the smallest6 filterH⊆ on PI including , is the collection H H

H = A I : A B B for some n and some B F { ⊆ ⊃ 1 ∩···∩ n i ∈ H} For = ∅ we put H = I . H F { } If has a single member B, then H = A I : A B , whichH is called the principal filter generatedF by{ B⊆. The ultrafil-⊃ } ter i of Example (1) is the special case of this when B = i , namelyF a set containing a single element. { }

4.3. Properties of filters Theorem 4.3.1. An ultrafilter satisfies F A B iff A and B , ∩ ∈F ∈F ∈F as well as A B iff A or B , ∪ ∈F ∈F ∈F and also Ac if and only if A . ∈F 6∈ F Theorem 4.3.2. Let be an ultrafilter and A ,...,A a finite F { 1 n} collection of pairwise disjoint (Ai Aj = ∅) sets such that ∩ A ... A . 1 ∪ ∪ n ∈F Then A for exactly one i such that 1 i n. i ∈F ≤ ≤ Theorem 4.3.3. A nonprincipal (or free) ultrafilter must contain each cofinite set. Thus every free ultrafilter includes the Fr´echet filter Fre, namely F F Fre . F ⊆F This is a crucial property used in the construction of infinitesimals and infinitely large numbers. Remark 4.3.4. A filter is an ultrafilter on I if and only if it is a maximal proper filter on I, i.e.,F a proper filter that cannot be extended to a larger proper filter on I. Definition 4.3.5. A collection (I) has the finite intersection property if the intersection of everyH nonempty ⊆ P finite subcollection of is nonempty, i.e., H

B1 Bn = ∅ for any n and any B1,...,Bn . ∩···∩ 6 ∈ H Note that the filter H is proper if and only if the collection has the finite intersection propertyF (see Definition 4.3.5). H 34 4. ULTRAPOWER CONSTRUCTION AND TRANSFER PRINCIPLE

4.4. Real number field as quotient of sequences of rationals To motivate the construction of the hyperreal numbers, we will first analyze the construction of the real numbers via Cauchy sequences. Let QN denote the ring of sequences of rational numbers. Let N QC denote the subring consisting of Cauchy sequences. The reals are by definition the quotient field N R := QC MAX (4.4.1) where MAX is the maximal ideal consisting of null sequences (i.e., N sequences tending to zero). Note that QC is only a ring, whereas the quotient (4.4.1) is a field. The point we wish to emphasize is that a field extension is con- structed starting with the base field and using sequences of elements in the base field.

4.5. Extension of Q We will think of sets in the ultrafilter as dominant and their com- plements as negligible.3 We choose a nonprincipal ultrafilter on N. WeformanidealMAX = N F MAX Q consisting of all sequences un : n N of rational num- bers suchF ⊆ that h ∈ i n N : un =0 . { ∈ }∈F An infinitesimal-enriched field extension ∗Q of Q may be obtained via the ultrapower construction by forming the quotient as follows: N ∗Q = Q MAX, where we dropped the subscript from MAX for simplicity. F  Remark 4.5.1. This is not the same maximal ideal as the one used in Section 4.4. We use the same notation to emphasize the similarity of the two constructions.

4.6. Examples of infinitesimals With respect to the construction presented in Section 4.5, we now give some examples of infinitesimals. Definition 4.6.1. The equivalence class of a sequence u = u h ni will be denoted [u] or alternatively [un].

3zaniach 4.7. ULTRAPOWER CONSTRUCTION OF THE HYPERREAL FIELD 35

1 Example 4.6.2. Let α = n . Then α is smaller than every positive real number r > 0.   In more detail, α = [u] where u = un : n N is the null sequence 1h ∈ i (i.e., sequence tending to zero) un = n . Consider the set

S = n N : un 0, there will be only finitely many terms in the sequence such that un ǫ. Therefore the corresponding set of indices is a member of the Fr´echet≥ filter. Therefore it is a member of every ultrafilter and in particular the chosen ultrafilter . If (the F collection of even natural numbers) 2N then the sequence (4.6.1) 1 ∈F is equivalent to n and therefore generates a positive infinitesimal. 4.7. Ultrapower construction of the hyperreal field To obtain a full hyperreal field model of a B-continuum, we re- place Q by R in the construction of Section 4.5, and form a similar quotient N ∗R := R MAX (4.7.1) as explained below. We wish to emphasize the analogy with for- mula (4.4.1) defining the A-continuum, and also a key difference: the N N basic domain is R rather than RC . In more detail, we proceed as follows. (1) We define RN to be a ring with componentwise operations. (2) Choose a nonprincipal ultrafilter (N). F ⊆ P (3) Define MAX to be a subset of RN consisting of real sequences F un such that n N : un =0 . h i { ∈ }∈F 36 4. ULTRAPOWER CONSTRUCTION AND TRANSFER PRINCIPLE

(4) We observe that MAX is an ideal of RN. F (5) We observe that the quotient RN / MAX is a field and in fact an ordered field (see Definition 4.7.1). F (6) We denote it by RN / to simplify notation. F Definition 4.7.1. A natural order relation <∗ on ∗R is defined by setting [un] <∗ [vn] if and only of the relation < holds for a “dominant” set of indices, where “dominance” is determined by a fixed ultrafilter : F [un] <∗ [vn] if and only if n N : un < vn . { ∈ }∈F The complement of a dominant set can be viewed as “negligible.”4 More details on the ultrapower construction can be found in [Davis 1977]. Theorem 4.7.2. The field RN / satisfies the transfer principle. F We give a proof of the transfer principle in Chapter 12. See [Chang & Keisler 1990, Chapter 4], on a more general model theoretic study of ultrapowers and their applications.

4zaniach CHAPTER 5

Microcontinuity, EVT, internal sets

5.1. Construction via equivalence relation From now on, we will work with the following version of the con- struction (4.7.1) formulated in terms of an equivalence relation, which is more readily generalizable to other contexts (see Section 5.10). Namely, we set N ∗R = R / ∼ where is an equivalence relation defined as follows: ∼ un vn if and only if n N : un = vn . h i∼h i { ∈ }∈F Since the relation depends on , so does the quotient RN/ . For this reason the logicians,∼ and they areF definitely a special breed,∼ decided to write directly RN / . F But we will see that this abuse of notation offers some insights.

Definition 5.1.1. Subsets ∗Q and ∗N of ∗R consist of -classes of sequences u with rational (respectively, natural) terms. F h ni If X R then the subset ∗X of ∗R consists of -classes of se- ⊆ F quences un with terms un X for each n. In particular the subsets ∗Q h i ∈ and ∗N of ∗R consist of -classes of sequences un with rational (re- spectively, natural) terms.F h i

Definition 5.1.2. If X R and f : X R, then the map ∗f : ⊆ → ∗X ∗R sends each -class of a sequence un (un X) to the -class of the→ sequence v Fwhere v = f(u ) forh eachi n. ∈ F h ni n n It is easy to check that this definition ∗f is independent of choices made in the construction. As noted in Definition 4.7.1, the order on ∗R is defined by set- ting [u] <∗ [v] if and only if n N : un < vn . { ∈ }∈F 5.2. Extending the ordered field language Let the ordered field language be defined starting with a pair of operations and +, and enhanced as follows. · 37 38 5. MICROCONTINUITY, EVT, INTERNAL SETS

(1) We allow expressions like (x + y) z +(x 2y) called terms; · − (2) we further define formulas like T = T ′ and T < T ′ where T, T ′ are terms; (3) we allow more complex formulas by means of logical connec- tives and quantifiers; (4) given a concrete ordered field F , we allow the replacement of free variables in formulas by elements of F ; (5) this leads to the notion of a formula being true in F ; (6) this finally enables us to express and study various properties of F in a formal and well defined way. However, the ordered field language is not sufficient as a basis for either calculus or analysis. For instance there is no way to express phe- nomena related to non-algebraic functions like exp, sin, etc. Therefore we need to extend the ordered field language further.

Definition 5.2.1. We will use the term extended real number lan- guage to refer to the ordered field language extended further as follows: (7) we can freely use expressions like f(x), where f : R R is any particular function; → (8) we can freely use formulas like x D, where D R is any particular set; ∈ ⊆ (9) we can freely use any particular real numbers as parameters.

Theorem 5.2.2 (Transfer revisited). Let A be a sentence of the extended real number language, and let ∗A be obtained from A by sub- stituting ∗f and ∗D for each f or D which occur in A. Then A is true over R if and only if ∗A is true over ∗R.

In Chapter 12 we will prove a generalisation of Theorem 5.2.2. In Remark 5.4.3 we will explain our approach to dropping the as- terisks on functions.

Remark 5.2.3. In the statement of the theorem we wrote true over R, or over ∗R, rather than the usual true in R, to emphasize the fact that the sentences involved refer not only to real numbers themselves but also to sets of reals as well as real functions, namely, objects usually thought of as elements of a superstructure over R; see Chapter 12 for more details.

Further analysis of the transfer principle appears in Section 5.3. A complete proof of the transfer principle appears in Chapter 12. 5.4. EXAMPLES OF FIRST ORDER STATEMENTS 39

5.3. Existential and Universal Transfer Some motivating comments for the transfer principle already ap- peared in Section 3.3. Recall that by Theorem 4.7.2 we have an exten- sion R ∗R of an Archimedean continuum by a Bernoullian one which ⊆ furthermore satisfies transfer. To define ∗R, we fix a nonprincipal ultra- N filter over N and let ∗R = R / . By construction, elements [u] of ∗R F F N are represented by sequences u = un : n N of real numbers, u R , with an appropriate equivalenceh relation∈ definedi in terms of ∈as in Section 4.7, or in symbols F N ∗R := R . F Here an order is defined as follows. We have [u] > 0 if and only if the set n N : un > 0 belongs to . { ∈ } F Definition 5.3.1. The inclusion R ∗R is defined by sending a → real number r R to the constant sequence un = r, i.e., ∈ r, r, r, . . . . h i Since we view R as a subset of ∗R, we will denote the resulting (stan- dard) hyperreal by the same symbol r. The transfer principle was formulated in Theorem 5.2.2. To summa- rize, it asserts that truth over R is equivalent truth over ∗R. Therefore there are two directions to transfer in. Definition 5.3.2. Transfer of statements in the direction

R ∗R is called universal,1 whereas transfer in the opposite direction

∗R R is called existential.2

5.4. Examples of first order statements In this section we will illustrate the application of the transfer prin- ciple by several examples of transfer, applied to sentences familiar from calculus. The statements transfer applies to are first-order quantified formulas, namely formulas involving quantification over field elements only. Quantification over subsets of the field is disallowed. Thus, the completeness property of the reals is not transferable because its for- mulation involves quantification over all subsets of the field.

1haavara colelet 2haavara yeshit 40 5. MICROCONTINUITY, EVT, INTERNAL SETS

Example 5.4.1. Rational numbers q > 0 satisfy the following: n ( q Q+)( n, m N) q = , ∀ ∈ ∃ ∈ m + h i where Q is the set of positive rationals. By universal transfer, we obtain the following statement, satisfied by all hyperrational numbers: + n ( q ∗Q )( n, m ∗N) q = . ∀ ∈ ∃ ∈ m Here of course n, m may be infinite. Hypernaturalh i numbers will be discussed in more detail in Section 5.5. Example 5.4.2. Recall that a function f is continuous at a real point c R if the following condition is satisfied: ∈ ( ǫ R+)( δ R+)( x R) x c < δ f(x) f(c) < ǫ . ∀ ∈ ∃ ∈ ∀ ∈ | − | ⇒ | − | By universal transfer, a continuous function f similarly satisfies the following formula over ∗R: + + ( ǫ ∗R )( δ ∗R )( x ∗R) x c < δ ∗f(x) ∗f(c) < ǫ . ∀ ∈ ∃ ∈ ∀ ∈ | − | ⇒ | − | Note that the formula is satisfied in particular for each positive infini- tesimal ǫ ∗R. We remark that Transfer is applicable to functions, as well (in addition∈ to the ordered field formulas). Remark 5.4.3. In practice most calculations exploit the hyperreal extension of a function f, rather than f itself. We will therefore con- tinue on occasion to denote by f the extended hyperreal function, as well. Example 5.4.4. A real function f is continuous in a domain D R if the following condition is satisfied: ⊆ + + ( x D)( ǫ R )( δ R )( x′ D) ∀ ∈ ∀ ∈ ∃ ∈ ∀ ∈ x′ x < δ f(x′) f(x) < ǫ . | − | ⇒ | − | or equivalently   + + ( ǫ R )( x D)( δ R )( x′ D) ∀ ∈ ∀ ∈ ∃ ∈ ∀ ∈ (5.4.1) x′ x < δ f(x′) f(x) < ǫ . | − | ⇒ | − | By universal transfer, the hyperreal extension ∗f of such a function will similarly satisfy the following formula over ∗R: + + ( ǫ ∗R )( x ∗D)( δ ∗R )( x′ ∗D) ∀ ∈ ∀ ∈ ∃ ∈ ∀ ∈ (5.4.2) x′ x < δ ∗f(x′) ∗f(x) < ǫ , | − | ⇒ | − | where ∗D ∗R is the natural extension of D R (see Section 3.1). Note that⊆ transfer is applicable to all such sets ⊆D. 5.5. GALAXIES 41

Example 5.4.5. A real function f is uniformly continuous in a domain D R if the following condition is satisfied: ⊆ + + ( ǫ R )( δ R )( x D)( x′ D) ∀ ∈ ∃ ∈ ∀ ∈ ∀ ∈ x′ x < δ f(x′) f(x) < ǫ . | − | ⇒ | − | By universal transfer, the hyperreal extension ∗f of such a function will similarly satisfy the following formula over ∗R: + + ( ǫ ∗R )( δ ∗R )( x ∗D)( x′ ∗D) ∀ ∈ ∃ ∈ ∀ ∈ ∀ ∈ x′ x < δ ∗f(x′) ∗f(x) < ǫ . | − | ⇒ | − | An alternative characterisation of uniform continuity, of reduced quantifier complexity, is presented in Section 5.7. Example 5.4.6. To give an example of existential transfer, consider the problem of showing that if f is differentiable at c R and f ′(c) > 0 ∈ then there is a point x R,x>c such that f(x) > f(c). Indeed, for ∈ f(c+α) f(c) − infinitesimal α> 0 we have st α > 0 and hence f(c + α) > f(c). Setting x = c + α, we see that the following formula is satisfied over ∗R: ( x)[f(x) > f(c)]. ∃ Therefore by existential transfer, the same formula holds over the real field R. Thus there is a real x such that f(x) > f(c). Remark 5.4.7. Another example of existential transfer will be given in Section 5.6 following formula (5.6.3). 5.5. Galaxies

Let us consider the hypernatural numbers ∗N (positive hyperinte- gers) in more detail. Definition 5.5.1. A hyperreal number x is called finite if x is less than some real number. Equivalently, x is finite if x is less| than| some natural number. A number is infinite if it is not finite.| | Thus an infinite positive hypereal is a hyperreal number bigger than each real number.

Now consider the extension N ∗N constructed by an ultrapower ⊆ as in Section 4.7. Recall that a sequence u NN given by ∈ u = un : n N h ∈ i N is equivalent to a sequence v N given by v = vn : n N if and only if the set of indices ∈ h ∈ i

n N : un = vn { ∈ } 42 5. MICROCONTINUITY, EVT, INTERNAL SETS is a member of a fixed ultrafilter (N). F ⊆ P We identify N with its image in ∗N under the natural embedding generated by n n, n, n, . . . . 7→ h i Theorem 5.5.2. Every finite hypernatural number is a natural num- ber.

Proof. Let [u] ∗N where u = un : n N and suppose [u] is finite. Then there exists∈ a real numberh ∈ i r> 0 such that [u]

n N : un

Si0 = n N : un = i0 , { ∈ } is dominant, i.e., it is a member of . Therefore [u]= i0 and so [u] is a natural number. F We provide an alternative proof exploiting the transfer principle. One observes that natural numbers satisfy the formula ( n N) n

( n ∗N) n

Definition 5.5.4. An equivalence relation g on ∗N is defined as ∼ follows: for n, m ∗N, we have ∈ n m if and only if n m is finite. ∼g | − | Definition 5.5.5. An equivalence class of the relation g is called a galaxy. The galaxy of an element x will be denoted gal(x∼).3 By Corollary 5.5.3, there is a unique galaxy containing finite num- bers, namely the galaxy gal(0) = 0, 1, 2,... = N . { } Each galaxy containing an infinite number such as H is of the form gal(H)= ...,H 3,H 2,H 1,H,H +1,H +2,H +3,... { − − − } and is therefore order-isomorphic to Z (rather than N). 4 Corollary 5.5.6. The ordered set (∗N,<) is not well-ordered.

5.6. Microcontinuity

Let D = Df R be the domain of a real function f, where the notation was discussed⊆ in Remark 5.4.3. Definition 5.6.1. We say that f is microcontinuous at x if when- ever x′ x, one also has f(x′) f(x) for all x′ is in the domain ∗D ≈ ≈ f of ∗f. Here x′ x means that x′ x is infinitesimal. ≈ − Remark 5.6.2. It is a good exercise to prove that ∗(Df )= D∗f . Remark 5.6.3. This definition can be applied not only at a real point x D but also at a hyperreal point x ∗D ; see Example 5.7.3. ∈ f ∈ f Microcontinuity at all points in a segment [x0,X] provides a useful modern proxy for Cauchy’s definition of continuity in his 1821 text Cours d’Analyse: the function f(x) is continuous with respect to x be- tween the given limits if, between these limits, an in- finitely small increment in the variable always pro- duces an infinitely small increment in the function itself. (translation by [Bradley & Sandifer 2009, p. 26]).

3Sometimes galaxies are defined in the context of larger number systems, such ∗ as R. Our main interest in galaxies lies in illustrating the concept of a non-internal set; see Section 5.11. 4Sadur tov 44 5. MICROCONTINUITY, EVT, INTERNAL SETS

In the current cultural climate it needs to be pointed out that Cauchy was obviously not familiar with our particular model of a B-continuum. On the other hand, his procedures and inferential moves find closer proxies in the context of modern infinitesimal-enriched continua than in the context of modern Archimedean continua. Theorem 5.6.4. Let c R. A real function f is continuous in ∈ the ǫ, δ sense at c if and only if ∗f is microcontinuous at c. Proof. We are following [Goldblatt 1998, p. 76]. Recall that a real function f is continuous at c in the ǫ, δ sense if + + ( ǫ R )( δ R )( x Df ) x c < δ f(x) f(c) < ǫ . ∀ ∈ ∃ ∈ ∀ ∈ | − | ⇒| − | (5.6.1)   Assume that ∗f is microcontinuous at c, so that

x c ∗f(x) ∗f(c). (5.6.2) ≈ ⇒ ≈ Let us prove that f is continuous in the sense of (5.6.1). Choose a real ǫ > 0 as in the leftmost quantifier in (5.6.1). Let d > 0 be infini- tesimal. If x c 0 as required.  Conversely, assume that the ǫ, δ condition (5.6.1) holds. Let ǫ be a positive real. By (5.6.1) there is an appropriate real δ > 0 such that the following sentence is true: ( x D ) x c < δ f(x) f(c) < ǫ . (5.6.4) ∀ ∈ f | − | ⇒| − | Applying universal transfer to formula (5.6.4), we obtain

( x ∗D ) x c < δ ∗f(x) ∗f(c) < ǫ . ∀ ∈ f | − | ⇒| − | But now if we choose x such that x c, the condition x c < δ is automatically satisfied since x c ≈is infinitesimal. Therefore| − | the − 5.7. UNIFORM CONTINUITY IN TERMS OF MICROCONTINUITY 45 inequality ∗f(x) ∗f(c) < ǫ holds for arbitrary real ǫ > 0. It fol- | − | lows that ∗f(x) ∗f(c), proving the relation (5.6.2) and the opposite implication. ≈ 

5.7. Uniform continuity in terms of microcontinuity The ǫ, δ definition of uniform continuity was reviewed in Section 5.3. We now introduce an equivalent definition in terms of microcontinuity. Definition 5.7.1 (Alternative definition). A real function f is uni- formly continuous in its domain D = Df if and only if

( x ∗D )( x′ ∗D )[x x′ ∗f(x) ∗f(x′)] . ∀ ∈ f ∀ ∈ f ≈ ⇒ ≈ Remark 5.7.2. This definition of uniform continuity of a real func- tion f can be reformulated in terms of its natural extension ∗f to the hyperreals as follows:

for all x ∗D , ∗f is microcontinuous at x. ∈ f This sounds startlingly similar to the definition of continuity it- self. What is the difference between the two definitions? The point is that microcontinuity is now required at every point of the Bernoullian continuum rather than merely at the points of the Archimedean con- tinuum, i.e., in the domain of ∗f which is the natural extension of the real domain of f. Example 5.7.3. The function f(x)= x2 fails to be uniformly con- tinuous on its domain D = R because of the failure of microcontinuity of its natural extension ∗f at any single infinite hyperreal H ∗D = ∈ f ∗R. The failure of microcontinuity at H is checked as follows. Consider 1 the infinitesimal e = H , and the point H + e infinitely close to H. To show that ∗f is not microcontinuous at H, we calculate 2 2 2 2 2 2 ∗f(H + e)=(H + e) = H +2He + e = H +2+ e H +2. ≈ 2 This value is not infinitely close to ∗f(H)= H :

∗f(H + e) ∗f(H)=2+ ǫ 0. − 6≈ Therefore microcontinuity fails at the point H ∗R. Thus the squaring ∈ function f(x)= x2 is not uniformly continuous on R. The term “microcontinuity” is exploited in [Davis 1977] as well as [Gordon et al. 2002]. It reflects the existence of two definitions of continuity, one using infinitesimals, and one using epsilons. The former is what we refer to as microcontinuity. It is given a special name to dis- tinguish it from the traditional definition of continuity. Note that mi- crocontinuity at a non-standard hyperreal does not correspond to any 46 5. MICROCONTINUITY, EVT, INTERNAL SETS notion available in the epsilontic framework limited to an Archimedean continuum. 1 Example 5.7.4. Consider the function f given by f(x)= x on the open interval D = (0, 1) R. Then ∗f fails to be microcontinuous at a positive infinitesimal.⊂ To see this, choose an infinite H > 0 and 1 1 let x = and x′ = . Clearly x x′. Both of these points are in H H+1 ≈ the extended domain ∗D = ∗(0, 1). Then

∗f(x′) ∗f(x)= H +1 H =1 0. − − 6≈ It follows that the real function f is not uniformly continuous on the open interval (0, 1) R. ⊆ 5.8. Hyperreal extreme value theorem Example 5.8.1. The unit interval [0, 1] R has a natural extension ⊆ ∗[0, 1] ∗R. This contains all positive infinitesimals as well as all numbers⊆ smaller than 1 and infinitely close to 1. Consider the extreme value theorem (EVT). The classical proof of the EVT usually proceeds in two or more stages. (1) Typically, one first shows that the function is bounded. (2) Then one proceeds to construct an extremum by one or an- other procedure involving choices of sequences. The hyperreal approach is both more economical (there is no need to prove boundedness first) and less technical. Furthermore, in the hyperreal approach one can actually write down a succinct formula for a maximum (rather than merely prove its exis- tence); see formula (5.8.3). Theorem 5.8.2. A f on [0, 1] R has a max- imum. ⊂ Proof. Let H ∗N N ∈ \ be an infinite hypernatural number.5 The real interval [0, 1] has a natural hyperreal extension

∗[0, 1] = x ∗R :0 x 1 . { ∈ ≤ ≤ } 1 Consider its partition into H subintervals of equal infinitesimal length H , with partition points i xi = H , i =0,...,H. 5For instance, the one represented by the sequence 1, 2, 3,... with respect to the ultrapower construction outlined in Section 4.7. h i 5.9. INTERMEDIATE VALUE THEOREM 47

The existence of such a partition follows by universal transfer (see Section 5.3) applied to the first order formula

( n N)( x [0, 1]) ( i < n) i x< i+1 . ∀ ∈ ∀ ∈ ∃ n ≤ n The function f has a natural extension ∗fdefined on the hyperreals between 0 and 1. Note that in the real setting (when the number of partition points is finite), a point with the maximal value of f among the partition points xi can always be chosen by induction. We have the following first order property expressing the existence of a maximum of f over a finite collection:

i0 i ( n N)( i0 n)( i n) f( ) f( ) . ∀ ∈ ∃ ≤ ∀ ≤ n ≥ n We now apply the transfer principle to obtain 

i0 i ( n ∗N)( i0 n)( i n) ∗f( ) ∗f( ) , (5.8.1) ∀ ∈ ∃ ≤ ∀ ≤ n ≥ n where ∗N is the collection of hypernatural numbers. Formula (5.8.1) is true in particular for a specific infinite hypernatural value of n given by H ∗N N. Thus, there is a hypernatural i0 such that 0 i0 H and ∈ \ ≤ ≤

( i ∗N) i H ∗f(xi0 ) ∗f(xi) . (5.8.2) ∀ ∈ ≤ ⇒ ≥ Consider the real point  

c = st(xi0 ) (5.8.3) where “st” is the standard part function. By continuity of f at c R, ∈ we have ∗f(x 0 ) ∗f(c), and therefore i ≈

st(∗f(xi0 )) = ∗f(st(xi0 )) = f(c). An arbitrary real point x lies in an appropriate sub-interval of the partition, namely x [xi, xi+1], so that st(xi)= x, or xi x. Applying “st” to the inequality∈ in the formula (5.8.2), we obtain ≈

st(∗f(x )) st(∗f(x )). i0 ≥ i Hence f(c) f(x), for all real x, proving c to be a maximum of f (and ≥ by transfer, of ∗f as well). 

5.9. Intermediate value theorem Theorem 5.9.1. Let f be a continuous real function on [0, 1] and assume f(0) < 0 and f(1) > 0. Then there is a point c [0, 1] such that f(c)=0. ∈ 48 5. MICROCONTINUITY, EVT, INTERNAL SETS

Proof. Let H ∗ N N and consider the corresponding partition ∈ \ i of ∗[0, 1] with partition points xi = H , i =0,...,H. Consider the set of all partition points xi such that for all points before xi, the function is nonpositive. By transfer this set has a last point xi0 . By hypothesis f(x ) 0 < f(x ). i0 ≤ i0+1 Applying standard part we obtain st(f(x )) 0 st(f(x )). i0 ≤ ≤ i0+1

But by microcontinuity st(f(xi0 )) = st(f(xi0+1)). Therefore the point c = st(xi0 ) is a zero of the function. 

5.10. Ultrapower construction applied to (R); internal sets P Consider the set of subsets of R, denoted (R). Thus A is contained P in (R), i.e., A (R) means that A is included in R, i.e., A R. By theP extension principle∈ P (see Section 3.3) we have the corresponding⊆ sub- set ∗A ∗R called the natural extension of A. We would like to apply ⊆ the ultrapower construction to (R). For convenience, we will use the P shortened notation P = (R). Then the natural extension ∗P of P is P constructed as before as a quotient of PN by an appropriate equivalence relation defined in terms of an ultrafilter using the dominant/negligible dichotomy. The natural extensions of subsets of R constitute a useful family of subsets. We will now construct an important larger class of subsets of ∗R called internal sets. These are the members of ∗P.

Definition 5.10.1. An α ∗P is an equivalence class α = [A] ∈ where A is a sequence A = An P : n N of elements of P (i.e., h ∈ ∈ i subsets of R), where A B if and only if n N : An = Bn . ∼ { ∈ }∈F In more detail, we have the following definition.

Definition 5.10.2. Consider a sequence A = An (R): n N h ∈ P ∈ i of subsets of R. We use it to define a set α = [A] ∗R as follows. A ⊆ hyperreal [u] = [un] is an element of the set α = [A] if and only if one has

n N : un An . { ∈ ∈ }∈F Such sets α ∗R are called internal. ⊆ Here as usual (N) is the fixed free ultrafilter used in the construction of theF hyperreal ⊆ P field. 5.11. THE SET OF INFINITE HYPERNATURALS IS NOT INTERNAL 49

Example 5.10.3. An example of a subset of ∗R which is internal but is not a natural extension of any real set is the interval [0,H] where H is an infinite hyperreal. To show that it is internal, represent H by a sequence u = un : n N , so that H = [u]. Then the set α = [0,H] is h ∈ i the equivalence class α = [A] of the sequence A = An : n N of real h ∈ i intervals An = [0,un] R. ⊆ 5.11. The set of infinite hypernaturals is not internal

Does the construction of Section 5.10 give all possible subsets of ∗R? The answer turns out to be negative even in the case of the hypernat- urals, as we show in Theorem 5.11.4.

Theorem 5.11.1. Each internal subset of ∗N has a least element.

Remark 5.11.2. In other words, ∗N is internally well-ordered in the sense that the well-ordering property is satisfied if we only deal with internal sets. Proof. Assume that an internal α ia given by α = [A] where A = A . Here each A can be assumed nonempty by Lemma 5.11.3 below. h ni n Since N is well-ordered we can choose a minimal element un An for ∈ each n. Consider the sequence u = un : n N . Its equivalence class [u] is the minimal element in theh set α. ∈ i  Lemma 5.11.3. A nonempty set α = [A] where A = A can always h ni be represented by a sequence where each of the sets An in the sequence is nonempty.

Proof. The set of indices n for which An = ∅ is negligible since otherwise α would be the empty set itself. For each n from this neg- ligible set of indices n, we can replace the corresponding set An = ∅ by An = N without affecting the equivalence class α = [A]. In the new sequence all the sets An are nonempty. 

Theorem 5.11.4. The set of infinite hypernaturals, ∗N N, is not \ an internal subset of ∗N.

Proof. If the set ∗N N were internal it would have a least ele- ment by Theorem 5.11.1. But\ there is no minimal infinite hypernatural because if H is an infinite hypernatural then H 1 is another infinite hypernatural. −  Recall that every galaxy of infinite hypernaturals is order-isomorphic to Z. For the structure of the galaxies see Section 5.5. Thus the set of infinite hypernaturals is external. It follows that every internal set including ∗N N will necessarily contain also some \ 50 5. MICROCONTINUITY, EVT, INTERNAL SETS elements that are finite natural numbers. This principle is called un- derspill. This property is expploited in Section 11.2.

5.12. Transfering the well-ordering of N; Henkin semantics Is it possible to transfer the well-ordering of the natural numbers? All the examples of transfer given in Section 5.3 deal with quantifi- cation over numbers, such as natural, rational, or real numbers. As already pointed out, quantification over sets cannot be encom- passed by the transfer principle because the least upper bound property for bounded sets in R fails when interpreted literally over ∗R.

Example 5.12.1. The set of all infinitesimals in ∗R does not admit a least upper bound. Indeed, if C > 0 were such a bound, then either C is infinitesimal or it is appreciable. If C were infinitesimal, then the infinitesimal 2C is greater than C, contradicting the claim that C is a l.u.b. If C is appreciable, then C/2 is also appreciable and therefore a smaller upper bound than C. The contradiction shows that there is no l.u.b. On the other hand we showed in Section 5.11 that, while the set of infinite hypernaturals is not well-ordered with respect to the natural ordering induced from ∗N, nevertheless internal subsets of ∗N do satisfy the well-ordering condition. In fact, quantification over sets in N can be transfered on condition of being interpreted as quantification over elements of P = (N), as follows. The condition of well-ordering can be stated as follows.P To simplify the formula we will use the symbol ′ to denote quantification over nonempty sets only: ∀

( ′A N)( u A)( x A) u x . ∀ ⊆ ∃ ∈ ∀ ∈ ≤ To make this transferable, we reformulate the condition as follows: ( A P ∅ )( u A)( x A) u x , ∀ ∈ \{ } ∃ ∈ ∀ ∈ ≤ where P = (N). At this point transfer can be applied, resulting in the followingP sentence:

A ∗P ∅ ( u A)( x A) u x . ∀ ∈ \{ } ∃ ∈ ∀ ∈ ≤ Here quantification is over elements of ∗P. In other words, quantifica- tion is over internal subsets of ∗N, resulting in a correct sentence. The point is that ∗ (N) $ (∗N). P P 5.13. FROM HYPERRATIONALS TO REALS VIA ULTRAPOWER 51

Remark 5.12.2. Robinson refers to this approach to quantification as Henkin semantics. All sentences involving quantification over subsets of A R (5.12.1) ⊆ can similarly be transfered provided we interpret the quantification as ranging over A (R) (5.12.2) ∈ P and transfering formula (5.12.2) instead of (5.12.1), to obtain

A ∗ (R), ∈ P entailing quantification over internal subsets. For example, the least upper bound property for bounded sets holds over the hyperreals, pro- vided we interpret it as applying to internal subsets only.

5.13. From hyperrationals to reals via ultrapower This section supplements the material of Section 4.7 on the ultra- power construction. We used the ultrapower construction to build the hyperreal field out of the real field. But the ultrapower construction can alo be used to give an alternative construction of the real number system starting with the rationals. Let Q be the field of rational numbers. The star-transform of Q gives an extension Q ∗Q ⊆ N to the hyperrationals. Here ∗Q = Q / MAX (see Section 4.5 for de- tails). Consider the subring F ∗Q consisting of all finite hyperrationals. More precisely, we have the⊆ following definition. Definition 5.13.1. We set

F = x ∗Q :( r Q) x 0 x

Theorem 5.13.2. The ideal I F is maximal, so that the quotient field ⊆ ρ = F/I is naturally isomorphic to R. Proof. We will indicate the isomorphisms in each direction, as follows: φ : ρ R and ψ : R ρ. Given an element→ x ρ, choose→ a representativex ˜ F . Sincex ˜ is finite, its standard part∈ is well-defined, and we set ∈ φ(x) = st(˜x), where “st” is the standard part. This is independent of the choice of the representative because the standard part function suppresses all infinitesimal differences. We therefore obtain a map φ : ρ R. To construct a homomorphism in the opposite direction,→ consider a positive real number y R. The procedure will be similar for a negative number but we describe∈ it for a positive number so as to fix ideas. Let K = 10H y ⌊ ⌋ so that 0 10H y K < 1. ≤ − Dividing by 10H we obtain K 1 0 y < ≤ − 10H 10H so that y K . In other words, we are considering the decimal ap- ≈ 10H proximation of y truncated6 at rank H, resulting in a hyperrational K F ∗Q. 10H ∈ ⊆ Then the map ψ : R ρ is defined by sending the real number y to the → K equivalence class of the finite hyperrational 10H in F/I, or in formulas K ψ(y)= + I. 10H K 7 Then we have φ(ψ(y)) = st( 10H )= y, proving the theorem.  6katum (with kuf)? mekutzatz? google translate gives “esroni mekutzatz” 7Should the following material be included in the main text? This can be elaborated as follows in terms of the extended decimal expansion. Consider the decimal expansion of y: y = a.a1a2a3 ...an ... defined for each natural index n N. By transfer, the decimal digits an are defined for each hypernatural ∈ rank, as well: y = a.a1a2a3 ...an ... ; ...aH−1aH aH+1 ... Here ranks to the left of the semicolon are finite, while ranks to the right of the semicolon are infinite. ∗ Choose a infinite hypernatural H N N. The product 10H y has the form 10H y = ∈ \ 5.14. REPEATING THE CONSTRUCTION 53

5.14. Repeating the construction What happens if we apply the F/I-type construction but now with R in place of Q? Namely, consider the star-transfer ∗R, and the ring of finite hyperreals ∗RF ∗R, as well as the ideal of infinitesimals ⊆ ∗RI ∗RF . ⊆ It turns out that we do not get anything new in this direction, as attested to by the following.

Theorem 5.14.1. The quotient ∗RF /∗RI is naturally isomorphic to R itself. The proof is essentially the same, with the main ingredient being the existence of the standard part function with range R:

st : ∗RF R → with kernel precisely ∗RI . Remark 5.14.2. Similar considerations apply to Rn, showing that n n n the quotient ∗RF /∗RI is naturally isomorphic to R itself. We will return to this observation in Section 8.5.

aa1a2a3 ...an ...aH−1aH .aH+1 ... with a decimal point after the digit aH . There- H fore its part is the K = 10 y = aa a a ...a ...a − a , ⌊ ⌋ 1 2 3 n H 1 H and we proceed as above.

CHAPTER 6

Application: small oscillations of a pendulum

The material in this chapter first appeared in Quantum Studies: Mathematics and Foundations [Kanovei, Katz & Nowik 2016].

6.1. Small oscillations of a pendulum In his 1908 book Elementary Mathematics from an Advanced Stand- point, Felix Klein advocated the introduction of calculus into the high- school curriculum. One of his arguments was based on the problem of small oscillations of the pendulum.1 The problem had been treated until then using a somewhat mysterious superposition principle. The latter involves (a vertical plane projection of) a hypothetical circular motion of the pendulum. Klein advocated what he felt was a bet- ter approach, involving the differential equation of the pendulum; see [Klein 1908, p. 187]. The classical problem of the pendulum translates into the second g order nonlinear differential equationx ¨ = ℓ sin x for the variable an- gle x with the vertical direction, where g is− the constant of gravity and ℓ is the length of the (massless) rod or string. The problem of small os- cillations deals with the case of small amplitude,2 i.e., x is small, so that sin x is approximately x. Then the equation is boldly replaced by the linear onex ¨ = g x, whose solution is harmonic motion with − ℓ period 2π ℓ/g. This suggests that the period of small oscillations should be in- dependentp of their amplitude. The intuitive solution outlined above may be acceptable to a physicist, or at least to the mathematicians’ proverbial physicist. The solution Klein outlined in his book does not go beyond the physicist’s solution. The Hartman–Grobman theorem [Hartman 1960], [Grobman 1959] provides a criterion for the flow of the nonlinear system to be conjugate to that of the linearized sys- tem, under the hypothesis that the linearized matrix has no eigenvalue with vanishing real part. However, the hypothesis is not satisfied for the pendulum problem.

1Tnudat metutelet 2misraat 55 56 6. APPLICATION: SMALL OSCILLATIONS OF A PENDULUM

To provide a rigorous mathematical treatment for the notion of small oscillation, it is tempting to exploit a hyperreal framework fol- lowing [Nowik & Katz 2015]. Here the notion of small oscillation can be given a precise sense, namely infinitesimal amplitude. Note however the following. Remark 6.1.1. Even for infinitesimal x one cannot boldly replace sin x by x. Therefore additional arguments are required. The linearisation of the pendulum is treated in [Stroyan 2015] us- ing Dieners’ “Short Shadow” Theorem; see Theorem 5.3.3 and Exam- ple 5.3.4 there. This text can be viewed as a self-contained treatement of Stroyan’s Example 5.3.4. The traditional setting in the context of the real continuum can only make sense of the claim that “the period of small oscillations is inde- pendent of the amplitude” by means of a paraphrase in terms of limits, rather than a specific oscillation. In the context of an infinitesimal- enriched continuum, such a claim can be formalized more literally; see Corollary 6.8.1.

6.2. Vector fields, walks, and flows The framework developed in [Robinson 1966] involves a proper extension ∗R R preserving the properties of R to a large extent dis- ⊇ cussed in Remark 6.3.3. Elements of ∗R are called hyperreal numbers. A positive hyperreal number is called infinitesimal if it is smaller than every positive real number. We choose a fixed positive infinitesimal λ ∗R (a restriction on the choice of λ appears in Section 6.8). ∈ Remark 6.2.1. We will work with flows in a two-dimensional con- text where the manifold can be conveniently identified with C. We will also exploit the natural extension ∗C C. ⊇ Definition 6.2.2. Given a classical vector field V = V (z) where z ∈ C, one forms an infinitesimal displacement δF (of the intended prevec- torfield Φ) by setting δΦ = λV .

Thus the aim is to construct the corresponding flow Φt in the plane. Note that a zero of δΦ corresponds to a fixed point of the flow. Definition 6.2.3. The infinitesimal generator of the flow is the function Φ : ∗C ∗C, also called a prevector field, defined by → Φ(z)= z + δΦ(z), (6.2.1) 6.3. THE HYPERREAL WALK AND THE REAL FLOW 57 where δΦ(z) = λV (z) in the case of a displacement generated by a classical vector field as above. Remark 6.2.4. An infinitesimal generator, or prevector field, could be a more general internal function. See Section 5.10 and Chapter 8.4 for details on internal sets, and Definition 10.3.3 for the D1 condition for prevectorfields. 6.3. The hyperreal walk and the real flow We propose a concept of solution of differential equation based on Euler’s method with infinitesimal step, with well-posedness based on a property of adequality (see Section 6.4), as follows.

Definition 6.3.1. The hyperreal flow, or walk, Φt(z) is a t-paramet- rized map ∗C ∗C defined whenever t is a hypernatural multiple t = Nλ of λ, by setting→ N Φ (z) = Φ (z) = Φ Φ ... Φ(z) = Φ◦ (z), (6.3.1) t Nλ ◦ ◦ ◦ N where Φ◦ is the N-fold composition. For arbitrary hyperreal t the walk can be defined by setting Φt = Φ t/λ λ. ⌊ ⌋ Remark 6.3.2. The fact that the infinitesimal generator Φ given by (6.2.1) is invariant under the flow Φt of (6.3.1) receives a transparent meaning in this framework, expressed by the commutation relation N N Φ Φ◦ = Φ◦ Φ ◦ ◦ due to transfer of associativity of composition of maps. Remark 6.3.3. As discussed in Section 3.3, the transfer principle is a type of theorem that, depending on the context, asserts that rules, laws or procedures valid for a certain number system, still apply (i.e., are transfered) to an extended number system. Thus, the familiar ex- tension Q R preserves the properties of an ordered field. To give a ⊆ negative example, the extension R R of the real numbers to the so-called extended reals does⊆ not preserve∪{±∞} the properties of an ordered field. The hyperreal extension R ∗R preserves all first-order properties, including the identity sin2 x +⊆ cos2 x = 1 (valid for all hy- perreal x, including infinitesimal and infinite values of x ∗R). The ∈ natural numbers N R ∗R are naturally extended to the hypernat- ⊆ ⊆ urals ∗N ∗R. For a more detailed discussion, see Chapter 12. ⊆ Definition 6.3.4. The real flow φt on C for t R when it exists is ∈ constructed as the shadow (i.e., standard part) of the hyperreal walk Φt by setting φt(z)=st ΦNλ(z)  58 6. APPLICATION: SMALL OSCILLATIONS OF A PENDULUM

t where N = λ , while x rounds off the hyperreal number x to the nearest integer no greater⌊ ⌋ than x, and “st” (standard part or shadow) rounds off each  finite hyperreal to its nearest real number. For t sufficiently small, appropriate regularity conditions ensure that the point ΦNλ(z) is finite so that the shadow is well-defined. The usual relation of being infinitely close is denoted . Thus for finite (i.e., non-infinite) z,w we have z w if and only if st(≈z) = st(w). This relation is an additive one. ≈ The appropriate relation for working with small prevector fields is a multiplicative rather than an additive one, as detailed in Section 6.4.

6.4. Adequality pq We will use Leibniz’s notation pq . Leibniz actually used a symbol that looks more like but the latter is commonly used to denote a product. Leibniz used⊓ the symbol to denote a generalized notion of equality “up to” (though he did not distinguish it from the usual sym- bol = which he also used in the same sense). A prototype of such a relation (though not the notation) appeared already in Fermat under the name adequality. We will use it for a multiplicative relation among (pre)vectors.

Definition 6.4.1. Let z,w ∗C. We say that z and w are adequal ∈ z z and write z pq w if either one has w 1 (i.e., w 1 is infinitesimal) or z = w = 0. ≈ − This implies in particular that the angle between z,w is infinites- imal (when they are nonzero), but the relation pq entails a stronger condition. If one of the numbers is appreciable, then so is the other and the relation z pq w is equivalent to z w. If one of z,w is infin- itesimal then so is the other, and the difference≈ z w is not merely infinitesimal, but so small that the quotients z | w−/z |and z w /w are infinitesimal, as well. | − | | − | We are interested in the behavior of orbits in a neighborhood of a fixed point 0, under the assumption that the infinitesimal displace- ment satisfies the Lipschitz condition. In such a situation, we have the following theorem. Theorem 6.4.2. Prevector fields defined by adequal infinitesimal displacements define hyperreal walks that are adequal at each finite time.

In particular, if δΦ pq δG then φt = gt where φt and gt are the corresponding real flows. This was shown in [Nowik & Katz 2015, Example 5.12]. 6.6. LINEARISATION 59

6.5. Infinitesimal oscillations Let x denote the variable angle between an oscillating pendulum and the downward vertical direction. By considering the projection of the force of gravity in the direction of motion, one obtains the equation of motion mℓx¨ = mg sin x where m is the mass of the bob of the pendulum, ℓ is the− length of its massless rod, and g is the constant of gravity. Thus we have a second order nonlinear differential equation x¨ = g sin x. (6.5.1) − ℓ The initial condition of releasing the pendulum at angle a (for ampli- tude) is described by x(0) = a (x˙(0) = 0 We replace (6.5.1) by the pair of first order equations g x˙ = ℓ y, y˙ = g sin x, ( −p ℓ and initial condition (x, y)=(a,p0). We identify (x, y) with x + iy and (a, 0) with a + i0 as in Section 6.2. The classical vector field corresponding to this system is then V (x, y)= g y g sin x i. (6.5.2) ℓ − ℓ The corresponding prevectorq field (i.e.,q infinitesimal generator of the flow) Φ is defined by the infinitesimal displacement δ =(λ g y) (λ g sin x)i Φ ℓ − ℓ so that q q Φ(z)= z + δΦ(z). We are interested in the flow of Φ, with initial condition a +0i. The flow is generated by hyperfinite iteration of Φ.

6.6. Linearisation

Consider also a prevector field E(z) = z + δE(z) defined by the displacement δ (z)= λ g y iλ g x E ℓ − ℓ where as before z = x + iy. We areq interestedq in small oscillations, i.e., the case of infinitesimal amplitude a. Since sin x pq x for infinitesimal x we have δE pq δΦ. 60 6. APPLICATION: SMALL OSCILLATIONS OF A PENDULUM

Due to the multiplicative nature of the relation of adequality, the rescal- ings of E and Φ by change of variable z = aZ remain adequal and therefore define adequal hyperreal walks by Theorem 6.4.2.

6.7. Adjusting linear prevector field We will compare E to another linear prevector field H defined by g iλ H(x + iy)= e− q ℓ (x + iy) = x cos λ g + y sin λ g + x sin λ g + y cos λ g i ℓ ℓ − ℓ ℓ  q q   q q g given by clockwise rotation of the x, y plane by infinitesimal angle λ ℓ . The corresponding hyperreal walk, defined by hyperfinite iteration of H, satisfies the exact equality p

H (a, 0) = a cos g t, a sin g t (6.7.1) t ℓ − ℓ whenever t is a hypernatural multipleq of λ. Inq particular, we have the periodicity property H 2π (z)= z and hence √g/ℓ

2 Ht+ π = Ht (6.7.2) √g/ℓ 2π whenever both t and g are hypernatural multiples of λ. The usual q ℓ estimates cos λ g 1 sin λ g ℓ − 0, ℓ 1. (6.7.3) λ ≈ λ g ≈ p pℓ imply an adequality δE pq δH whenever xpand y are finite. By Theo- rem 6.4.2 the hyperfinite walks of E and H satisfy

Et(a, 0) pq Ht(a, 0) for each finite initial amplitude a and for all finite time t which is a hypernatural multiple of λ.

6.8. Conclusion The advantage of the prevector field H is that its hyperreal walk is given by an explicit formula (6.7.1) and is therefore periodic with 2π period precisely g , provided we choose our base infinitesimal λ in q ℓ 2π such a way that g is hypernatural. We obtain the following con- λ q ℓ sequence of the periodicity property (6.7.2): modulo an appropriate choice of a representing prevector field (namely, H) in the adequality 6.8. CONCLUSION 61 class, the hyperreal walk is periodic with period 2π ℓ/g. This can be summarized as follows. p Corollary 6.8.1. The period of infinitesimal oscillations of the pendulum is independent of their amplitude. If one rescales such an infinitesimal oscillation to appreciable size by a change of variable z = aZ where a is the amplitude, and takes standard part, one obtains a standard harmonic oscillation with pe- riod 2π ℓ/g. The formulation contained in Corollary 6.8.1 has the advantage of involving neither rescaling nor shadow-taking. p

CHAPTER 7

Differentiable manifolds

7.1. Definition of differentiable manifold A n-dimensional manifold M is a set covered by a collection of sub- sets (called coordinate charts or neighborhoods), typically denoted A or B, and having the following properties. For each coordinate neigh- borhood A we have an injective map u : A Rn whose image is an n → open set in R . Thus, the coordinate neighborhood is a pair (A, u). The maps are required to satisfy the following compatibility condition. Let n i u : A R , u =(u )i=1,...,n, → and similarly n α v : B R , v =(v )α=1,...,n. → Whenever the overlap A B is nonempty, it has a nonempty image v(A B) in Euclidean space.∩ Restricting to this subset, we obtain a one-to-one∩ map 1 n u v− : v(A B) R (7.1.1) ◦ ∩ → from an open set v(A B) Rn to Rn. ∩ ⊆1 n n Similarly, the map v u− from the open set u(A B) R to R is one-to-one. ◦ ∩ ⊆ Definition 7.1.1. The defining condition of a smooth manifold is that the map (7.1.1) is differentiable for all choices of coordinate 1 neighborhoods A and B as above. The maps u v− are called the transition maps.1 The collection of coordinate charts◦ as above is called an atlas. The coordinate charts induce a topology on M by imposing the usual conditions (arbitrary unions of open sets are open; finite inter- sections of open sets are open).

1funktsiot maavar (nothing to do with ekron haavara) 63 64 7. DIFFERENTIABLE MANIFOLDS

Remark 7.1.2. We will usually assume that M is connected. Given the manifold structure as above, connectedness of M is equivalent to path-connectedness2 of M. Remark 7.1.3 (Metrizability). There are some pathological non- Hausdorff examples like two copies of R glued along an open halfline. To rule out such examples, the simplest condition is that of metrizabil- ity; see e.g., Example 7.3.2, Theorem 7.4.2. Identifying A B with a subset of Rn by means of the coordi- nates (ui), we can∩ think of v as given by n functions vα(u1,...,un) as α runs from 1 to n. Remark 7.1.4. The manifold condition can be stated as the re- quirement that the functions vα(ui) are all smooth. Example 7.1.5. The usual hierarchy of smoothness denoted Ck (or C∞) in Euclidean space generalizes to manifolds. Thus, for k = 1 a manifold M is C1 if and only if all the partial derivatives ∂vα , α =1,...,n, i =1,...,n ∂ui are continuous. Similarly, M is C2 if all second partials ∂2vα ∂ui∂uj are continuous, etc.

7.2. Open submanifolds, Cartesian products The notion of open and closed set in M is inherited from Euclidean space via the coordinate charts. An open subset C M of a mani- fold M is itself a submanifold, with differentiable structure⊆ obtained by the restriction of the coordinate map u =(ui)of(A, u) or in formu- las u⇂A C . ∩ Let Matn,n(R) be the set of square matrices with real coefficients. This is identified with Euclidean space of dimension n2, and is therefore a manifold.

Theorem 7.2.1. Define a subset GL(n, R) Matn,n(R) by setting ⊆ GL(n, R)= X Matn,n(R) : det(X) =0 . { ∈ 6 } Then GL(n, R) is a submanifold.

2kshir-mesila 7.3. TORI 65

Proof. The determinant function is a polynomial in the entries xij of the matrix X. Therefore it is a continuous function of the entries, 2 which are the coordinates in Rn . Hence GL(n, R) is an open set, and therefore a manifold.  Note that its complement is the closed set consisting of matrices of zero determinant. Theorem 7.2.2. Let M and N be two differentiable manifolds of dimensions m and n. Then the Cartesian product M N is a differ- entiable manifold of dimension m + n. The differentiable× structure is defined by coordinate neighborhoods of the form (A B,u v), where (A, u) is a coordinate chart on M, while (B, v) is× a coordinate× chart on N. Here the function u v on A B is defined by × × (u v)(a, b)=(u(a), v(b)) × for all a A, b B. ∈ ∈ 7.3. Tori Example 7.3.1. The torus T 2 = S1 S1 is a 2-dimensional man- ifold. More generally, the n-torus T n =×S1 S1 (product of n copies of the circle) is an n-dimensional manifold.×···× Example 7.3.2. The circle S1 = (x, y) R2 : x2 + y2 = 1 itself is a manifold. Let us give an explicit{ atlas for∈ the circle. Let A+} S1 be the open upper halfcircle ⊆ A+ = a =(x, y) S1 : y> 0 . { ∈ } Consider the coordinate chart (A+,u), namely u : A+ R, → defined by setting u(x, y)= x. The open lower halfcircle A− also gives a coordinate chart (A−,u) where the coordinate u is defined by the same formula. We similarly define the right halfcircle B+ = a =(x, y) S1 : x> 0 , { ∈ } yielding a coordinate chart (B+, v) where v(x, y) = y, and similarly + + for B−. The transition function between A and B is calculated by noting that in the overlap A+ B+ one has both x> 0 and y > 0. ∩ 1 1 Let us calculate a typical transition function u v− . The map v− ◦ sends y R1 to the point ( 1 y2,y) S1, and then the coordinate ∈ − ∈ map u sends ( 1 y2,y) to the first coordinate 1 y2 R1. Thus − 1 p − ∈ the composed map u v− given by p ◦ p y 1 y2 7→ − p 66 7. DIFFERENTIABLE MANIFOLDS is the transition function in this case. Since this is smooth for all y (0, 1), the circle is a C∞ manifold of dimension 1 (modulo spelling out∈ the remaining transition functions). Setting d(p, q) = arccos p, q we see that S1 is metrizable. h i

7.4. Projective spaces Definition 7.4.1. Let X = Rn+1 0 be the collection of (n + 1)-tuples x = (x0,...,xn). Define an\{ equivalence} relation among tuples x, y X by setting x y if and only if there is a real number∼ t = 0 such that∈y = tx, i.e., ∼ 6 yi = txi, i =0, . . . , n. The collection of equivalence classes is called the real projective space and denoted RPn. Theorem 7.4.2. The space RPn is a smooth n-dimensional mani- fold.

Proof. We define coordinate neighborhoods Ai, where i =0,...,n by setting i Ai = x : x =0 . (7.4.1) { 6 } n We will now define the coordinate pair (Ai,ui), where ui : Ai R , namely the corresponding coordinate chart. We can set → x0 xi 1 xi+1 xn u (x)= ,..., − , ,..., i xi xi xi xi   i since division by x is allowed in Ai by (7.4.1). The coordinate ui is well-defined because if x y then u (x) = u (y). To calculate the ∼ i i transition map, let u = ui and v = uj, where we assume for simplicity 1 that i < j. We wish to calculate the map u v− associated with Ai Aj. 0 n 1 n ◦ j ∩ Take a point z = (z ,...,z − ) R . Since we work with x = 0, we can rescale the homogeneous coordinates∈ so that xj = 1, so6 we can 1 represent v− (z) by 1 0 j 1 j n 1 v− (z)=(z ,...,z − , 1, z ,...,z − ).

Now we apply u = ui, obtaining 0 i 1 i+1 j 1 j+1 n 1 1 z z − z z − 1 z z − u v− (z)= ,..., , ,... , , ,..., (7.4.2) ◦ zi zi zi zi zi zi zi   All transition functions appearing in (7.4.2) are rational functions and are therefore smooth. Thus RPn is a differentiable manifold. Set- ting d(p, q) = arccos p, q for unit p, q we see that RPn is metriz- able. |h i|  7.5. DEFINITION OF A VECTOR FIELD 67

7.5. Definition of a vector field

Recall that the tangent space TpM at a point p M is the collection of all tangent vectors at the point p. Here a tangent∈ vector can be defined via derivations.3 i With respect to a coordinate chart (A, u) where u =(u )i=1,...,n, we ∂ have a basis ( ∂ui ) for TpM. Therefore an arbitrary vector X at p is a linear combination ∂ Xi , ∂ui for appropriate Xi. Here we use the Einstein summation convention. ∂ Since the vectors ∂ui form a basis for the tangent space at every point p A, a choice of component functions Xi(u1,...,un) in the  neighborhood∈ will define a vector field ∂ Xi(u1,...,un) ∂ui in A. Here the components Xi are required to be of an appropriate differentiability type. More precisely, we have the following definition.

Definition 7.5.1. (see [Boothby 1986, p. 117]) Let M be a C∞ manifold. A vector field X of class Cr on M is a function assigning to each point p of M a vector Xp TpM whose components in any local coordinate (A, u) are functions∈ of class Cr.

Example 7.5.2. Let M be the Euclidean plane R2. Via obvious identifications the Euclidean norm in the (x, y)-plane leads naturally to a Euclidean norm on the tangent space (i.e., tangent plane) at | | ∂ ∂ every point with respect to which both tangent vectors ∂x and ∂y are orthogonal and have unit norm: ∂ ∂ = =1. ∂x ∂y

Therefore the dual basis (dx,dy ) of 1-forms also satisfies dx = dy =

1, and the two basis 1-forms are orthogonal. | | | | ∂ ∂ 2 Note that each of ∂x and ∂y defines a global vector field on R (defined at every point). Example 7.5.3 (Vector fields defined by polar coordinates). In- teresting examples of vector fields are provided by polar coordinates. These may be undefined at the origin, i.e., apriori only defined in the

3This material was covered in the undergraduate course 88201. 68 7. DIFFERENTIABLE MANIFOLDS

2 ∂ open submanifold R 0 . Here ∂θ is represented by the path (r cos t, r sin t) with derivative ( r sin\{t,} r cos t). Hence − ∂ = r (7.5.1) ∂θ

1 ∂ so that r ∂θ is of norm 1. Passing to the dual basis we obtain that rdθ ∂ ∂ 4 has norm 1 because ∂r , ∂θ are orthogonal.

7.6. Zeros of vector fields Example 7.6.1 (Zero of type source/sink). We mentioned that the ∂ ∂ vector field ∂r in the plane is undefined at the origin, but r ∂r is a continuous (C0) vector field vanishing at the origin. Moreover it is ∂ ∂ equal to x ∂x + y ∂dy and hence smooth. ∂ The vector field in the plane defined by r ∂r is sometimes called a “source” while the vector field X = r ∂ is sometimes called a − ∂r “sink”.5 Note that the integral curves of a source flow from the origin and away from it, whereas the integral curves of a sink flow into the origin, and converge to it at for large time. At a point p R2 with polar coordinates (r, θ) this is given by ∈

r ∂ if p =0 X(p)= − ∂r 6 (0 if p =0.

∂ Example 7.6.2 (Zero of type “circulation”). The vector ∂θ in the plane, viewed as a tangent vector at a point at distance r from the origin, tends to zero as r tends to 0, as is evident from (7.5.1). Therefore the vector field defined by ∂ on R2 0 extends by continuity to the ∂θ \{ } point p = 0. Moreover it equals y ∂ + x ∂ and hence smooth. − ∂x ∂y

4Alternatively, since dr2 = dx2 + dy2 by Pythagoras, we have dr = 1 as well. 2 y 1 | x | xdy−ydx Meanwhile θ = arctan x and therefore dθ = 1+(y/x)2 d(y/x) = y2+x2 x2 = xdy−ydx r2 . Hence xdy ydx r 1 dθ = | − | = = . (7.5.2) | | r2 r2 r Thus rdθ is a unit covector. We therefore have an orthonormal basis (dr, rdθ) for ∂ the cotangent space. Since dθ ∂θ = 1, equation (7.5.2) implies (7.5.1). Therefore 1 ∂ is a unit vector. r ∂θ  5Would these be makor and kior? 7.7. 1-PARAMETER GROUPS OF TRANSFORMATIONS OF A MANIFOLD 69

Thus we obtain a continuous vector field p X(p)= y ∂ + x ∂ 7→ − ∂x ∂y on R2 which vanishes at the origin: ∂ if p =0 X(p)= ∂θ 6 (0 if p =0. Such a vector field is sometimes described as having circulation6 around the point 0. Note that the integral curves of a circulation are circles around the origin. Example 7.6.3 (Zero of type “circulation”on the sphere). Spherical coordinates (ρ,θ,ϕ) in R3 restrict to the unit sphere S2 R3 to give coordinates (θ,ϕ) on S2. The north pole is defined by ϕ⊆ = 0. Here ∂ the angle θ is undefined but the vector field ∂θ can be extended by continuity as in the plane, and we obtain a zero of type “circulation”. ∂ Similarly the south pole is defined by ϕ = π. Here ∂θ has a zero of type “circulation” but going clockwise (with respect to the natural orientation on the 2-sphere).

7.7. 1-parameter groups of transformations of a manifold Let θ : R M M be a smooth mapping (this θ is in general × → unrelated to polar coordinates). Denote coordinates t R and p M. ∈ ∈ We will write θt(p) for θ(t, p). Assume θ satisifies the following two conditions: (1) θ (p)= p for all p M; 0 ∈ (2) θt θs(p)= θt+s(p) for all p M and all s, t R. ◦ ∈ ∈ Then θ is called a 1-parameter group of transformations, a flow, or an action for short. An action θ defines a vector field X on M called the infinitesimal generator of θ, as follows.

Definition 7.7.1. The infinitesimal generator X of the flow θt is defined at a point p M as a derivation Xp defined by setting for each function f on M, ∈ 1 Xp(f) = lim f(θ∆t(p)) f(p) . ∆t 0 → ∆t − Remark 7.7.2. This is a generally accepted term defined as above in a context where actual infinitesimals are not present.

6machzor, tzirkulatsia. 70 7. DIFFERENTIABLE MANIFOLDS

7.8. Invariance of infinitesimal generator under flow

In this section we follow [Boothby 1986]. Consider a flow θt on a manifold M as in Section 7.7, namely a map θ : R M M, × → where we write θ (p) for θ(t, p), such that θ (p)= p for all p M, and t 0 ∈ θt θs(p)= θt+s(p) for all p M and all s, t R. ◦ ∈ ∈ Then one obtains an induced action θt∗ on a vector field Y on M, producing a new vector field θt∗ (Y ) as follows. We view Y as a deriva- tion acting on functions f on M. Given a function f on M, we need to specify the manner in which the new vector field θt∗ (Y ) differentiates the function f. Definition 7.8.1. The induced action on the vector field Y of the flow θ of the manifold M is defined by setting θ (Y )f = Y (f θ ) t∗ p p ◦ t so that the vector θ ∗ (Y ) resides at the point θ (p) M. t p t ∈ In more detail, let q = θt(p). If a function f is defined in a neigh- borhood of q M then the composed function f θ is defined in a ∈ ◦ t neighorhood of p and therefore it makes sense apply the derivation Yp to the function f θt. Recall that the◦ infinitesimal generator X of a flow θ is the vector field satisfying 1 Xp(f) = lim f(θ∆t(p)) f(p) . ∆t 0 → ∆t − What happens when we apply the induced action of the flow to the vector field which is the infinitesimal generator of the flow itself? An answer is provided by the following theorem. Theorem 7.8.2 (Boothby p. 124). The infinitesimal generator X of an action θt is invariant under the flow:

θt∗ (Xp)= Xθt(p). Proof. The proof is a direct computation, obtained by testing the field on a function f Dq where q = θt(p) (and Dq is the space of smooth functions defined∈ in a neighborhood of q) as follows:

θ ∗ (X )f = X (f θ ) t p p ◦ t 1 = lim f θt (θ∆t(p)) f θt(p) . ∆t 0 → ∆t ◦ − ◦  7.9. LIE DERIVATIVE AND LIE BRACKET 71

Since R is abelian, we have θt θ∆t = θt+∆t = θ∆t θt (by definition of a flow). Hence ◦ ◦ 1 θt∗ (Xp)f = lim f θ∆t(θt(p)) f(θt(p)) . ∆t 0 → ∆t ◦ − = Xθt(p)f,  proving the theorem.  Note that while theorem 7.8.2 is intuitively “obvious”, the received formalism involved in proving it is bulky. We will develop a more direct approach in Section 9.6. 7.9. Lie derivative and Lie bracket In this section as in the previous one we follow Boothby.

Definition 7.9.1 (Boothby p. 154). Let θt be a flow on M with infinitesimal generator X, and let Y be a vector field on M. The vector field LX Y , called the Lie derivative of Y with respect to X, is defined by setting 1 LX Y = lim θ t (Yθ(t,p)) Yp t 0 − ∗ → t − Theorem 7.9.2. If X and Yare vector fields on M then

LX Y = [X,Y ], where [X,Y ]f = X (Y f) Y (Xf) is the Lie bracket. p − p Proof. This can be checked in coordinates using a Taylor formula with remainder.  Example 7.9.3. The coordinate vector fields commute with respect to the Lie bracket, i.e., ∂ ∂ , =0 ∂ui ∂uj   for all i, j =1,...,n. This is an equivalent way of stating the equality of mixed partials (sometimes called Schwarz’s theorem or Clairaut’s theorem).

CHAPTER 8

Theorem of Frobenius

8.1. Invariance under diffeomorphism In this section as in Section 7.9 we follow Boothby. We first review some concepts from advanced calculus. If σ : M N → is a smooth map such that σ(p) = q N, we obtain an induced ∈ map σ : TpM TqN. Here σ is the map induced on the tangent space.∗ This can→ be defined as follows.∗ Definition 8.1.1. Choose a representing curve α(t) so that X = α′(0), and define σ (X) to be the vector represented by the curve de- fined by the composition∗ σ α on the image manifold N. ◦ In coordinate-free notation, chain rule takes the following form. Theorem 8.1.2 (Chain rule). for smooth maps σ, τ among mani- folds asserts that the composition σ τ satisfies ◦ (σ τ) = σ τ , ◦ ∗ ∗ ◦ ∗ i.e., (σ τ) (X)= σ τ (X) for all tangent vectors X. ◦ ∗ ∗ ◦ ∗ Definition 8.1.3. A diffeomorphism of M is a bijective 1-1 map φ : 1 M M such that both φ and φ− are C∞ functions for all coordinate charts.→ Recall that we have the notion of a flow θ : M M depending on t → parameter t R. ∈ Definition 8.1.4. A vector field X is said to generate a flow θt if X is the infinitesimal generator of θt as in Definition 7.7.1. Recall from Definition 7.7.1 that the vector field (which is the in- finitesimal generator of the flow) and the flow itself are related by 1 Xp(f) = lim f(θ∆t(p)) f(p) . (8.1.1) ∆t 0 → ∆t − 73  74 8. THEOREM OF FROBENIUS

Definition 8.1.5. An integral curve,1 or orbit, of a vector field X through a fixed point p M is a curve θ(t, p) satisfying θ(0,p)= p and ∂θ ∈ ∂t = X where t ranges through an open interval containing 0. Recall that a vector field X is said to be invariant under a smooth map σ if σ (X) = X at every point. In more detail, if q = σ(p) M ∗ ∈ then we require that σ (Xp)= Xq. ∗ Theorem 8.1.6 (Boothby p. 142). Let X be a vector field gener- ating a flow θ(t, p) on M and let F : M M be a diffeomorphism. Then X is invariant under F if and only if→F commutes with the flow, i.e., F (θ(t, p)) = θ(t, F (p)). Proof of the easy direction. First we show the easy direc- tion. Suppose X is invariant under the map F , so that F (Xp)= XF (p) ∗ for all p M. Let σp(t) be an integral curve through p M. Since we wish to use∈ p as the index, we use the letter σ in place∈ of θ to avoid clash of notation with θt introduced earlier (see (8.1.1)). By chain rule we have for each t that

(F σp) = F σp . ◦ ∗ ∗ ◦ ∗ Thus the image curve F σ sends the vector d (a tangent vector ◦ p dt to R) to F (X); see (8.1.3) for more details. Thus F takes the inte- ∗ gral curve σp of the vector field X to an integral curve of the vector field F (X). Since F (X) = X by hypothesis, the uniqueness of in- tegral∗ curves implies∗ that F (θ(t, p)) = θ(t, F (p)), proving the easy direction.  Proof of the opposite direction. To prove the opposite di- rection, suppose F (θ(t, p)) = θ(t, F (p)) (8.1.2) and let us prove that F (Xp)= XF (p). ∗ Consider an integral curve σp through p defined by σp : R M where →

σp(t)= θ(t, p), so that σ (0) = p M. Here σ may be defined on a smaller interval p ∈ p than R but we used R for simplicity. As before, we use the letter σ instead of θ so as to avoid a clash of notation with θt = θ(t, p) used previously.

1Akuma-integralit (or akom-integrali). The odd term akumat-iskum found at http://hebrew-terms.huji.ac.il/teva english result.asp? seems to be a canaan- ite innovation. 8.2. FROBENIUS THM, COMMUTATION OF VECTOR FIELDS AND FLOWS75

d Let dt be the natural basis for the tangent line T0 R at the origin. dσ Letσ ˙ = dt . Then d X =σ ˙ (0) = σ . (8.1.3) p p p dt ∗   We now apply the isomorphism F : TpM TF (p)M to formula (8.1.3). We obtain ∗ → d d d F (Xp)= F σp =(F σp) = σF (p) , ∗ ∗ ◦ ∗ dt ◦ ∗ dt dt     ∗   where the last equality follows from F σp(t) = σF (p)(t) as in (8.1.2). ◦ d Now the theorem results from the equality σF (p) dt = XF (p).  ∗ 8.2. Frobenius thm, commutation of vector fields and flows In this section as in the previous one we follow Boothby. Here we prove a special case of the Frobenius theorem on flows, in the case of commuting flows. Recall from Definition 7.9.1 that if θt is a flow on M with infin- itesimal generator X, and Y is a vector field on M, then the vector field LX Y , called the Lie derivative of Y with respect to X, is defined by setting 1 LX Y = lim θ t (Yθ(t,p)) Yp , t 0 − ∗ → t − and furthermore LX Y = [X,Y ]. 

Theorem 8.2.1 (Boothby p. 156). Let X and Y be C∞ vector fields on a manifold M. Let θ = θt be the flow generated by X, and η = ηs be the flow generated by Y . Then the following two conditions are equivalent: (1) [X,Y ]=0; (2) for each p M there exists a δp > 0 such that ηs θt(p) = θ η (p) whenever∈ s < δ and t < δ . ◦ t ◦ s | | p | | p Proof. First we show the easy direction (2) = (1). We apply ⇒ Theorem 8.1.6 with F = θt to conclude that the infinitesimal genera- tor Y of the flow ηs is invariant under θt for a fixed small parameter t, 2 i.e., θt (Yq)= Yθ(t,q). Now consider the Lie derivative. We obtain ∗ 1 1 [X,Y ]q =(LX Y )q = lim θ t (Yθ(t,q)) Yq = lim (Yq Yq)=0. t 0 − ∗ t 0 → t − → t − The other direction will be shown in the next segment. 

2The calculation in Boothby on page 157 contains some misprints involving misplaced parentheses. 76 8. THEOREM OF FROBENIUS

Proof of the opposite direction. Now let us show the oppo- site direction (1) = (2) in Theorem 8.2.1. Assume that [X,Y ]=0 ⇒ identically on M. Consider a point q M. Define a vector Zq(t) at q by setting ∈

Zq(t)= θ t Yθ(t,q) TqM. (8.2.1) − ∗ ∈ ˙ Let us show that the t-derivative Z(t) vanishes. Consider the point q′ = θ(t, q). Then 1 ˙ ′ Zq(t) = lim θ (t+∆t) Yθ(∆t,q′) θ t (Yq ) (8.2.2) ∆t 0 ∆t − ∗ − − ∗ →   Decomposing the action of θ (t+∆t) as composition − ∗ θ (t+∆t) = θ t θ ∆t , − ∗ − ∗ ◦ − ∗ we obtain from (8.2.2) and linearity of induced maps that 1 ˙ ′ Zq(t)= θ t lim θ ∆t Yθ(∆t,q′) Yq (8.2.3) − ∗ ∆t 0 ∆t − −  → ∗     1 ′ ′ Now we recognize the expression ∆t θ ∆t Yθ(∆t,q ) Yq as the Lie − ∗ − derivative LX Y of Y at the point q′ = θ(t, q). Therefore we obtain

Z˙q(t)= θ t ((LX Y )q′ )=0 − ∗ since by hypothesis (1), the Lie bracket vanishes everywhere, including at the point q′ = θ(t, q). Therefore the vector Z (t) of (8.2.1) is constant for t < δ, and q | | therefore equal to Yq. This means that Y is invariant under the flow θt. Therefore by Theorem 8.1.6, we have the commutation of flows ηs θ (q)= θ η (q). ◦ t t ◦ s 8.3. Proof of Frobenius n Theorem 8.3.1 (Frobenius). Let U R be open, and X1,...,Xk : n 1 ⊆ U R be k independent C classical vector fields such that [Xi,Xj]′ = →k m m m=1 Cij Xm for Cij : U R. (As before [ , ]′ is the classical Lie i → · · bracket.) Let gt be the classical flow of Xi. Given p U, let r > 0 P 1 2 k ∈ k be such that ϕ(t1,...,tk)= g g g (p) is defined on ( r, r) t1 ◦ t2 ◦···◦ tk − and ∂ϕ ,..., ∂ϕ are independent there. Then span ∂ϕ (a),..., ∂ϕ (a) = ∂t1 ∂tk { ∂t1 ∂tk } span X (ϕ(a)),...,X (ϕ(a)) for every a ( r, r)k. { 1 k } ∈ − Proof. By definition of gi , ∂ (gi gk )(p) = X . So to ti ∂ti ti tk i ∂ϕ ◦···◦ show span X1,...,Xk one needs to show that ∂ti ∈ { } 1 i 1 (gt1 gt−−1 ) (Xi) span X1,...,Xk . ◦···◦ i ∗ ∈ { } 8.3. PROOF OF FROBENIUS 77

j So one needs to show that (gtj ) (Xi) span X1,...,Xk for every i, j, ∗ ∈ { } 1 and for convenience of notation say j = 1. Using the flow gt we may ∂ change coordinates so that X1 = ∂x1 , and so the flow of X1 is sim- ply gt(x)= x+(t, 0,..., 0), and we need to show that span X1,...,Xk is invariant under the flow g , t ( r, r). { } t ∈ − For fixed q let Xi(t)= Xi(q +(t, 0,..., 0)) then

k d m X (t) = [X ,X ]′ = C (t)X (t). dt i 1 i 1i m m=1 X Since this simple flow maps a vector to the “same” (i.e. parallel) vector at the image point, span X ,...,X being invariant under the flow { 1 k} means that span X1(t),...,Xk(t) is the same subspace for all t ( r, r). { } ∈ − n Let us change basis in R such that X1(q),...,Xk(q) are the first k standard basis vectors. j Let us denote the coordinates of Xi by Xi , j =1,...,n, so we must j show that Xi (t) = 0 for all 1 i k, k +1 j n and t ( r, r). ≤ ≤ ≤ ≤ j ∈ − Let W ( r, r) be the set of all t satisfying Xi (t) = 0 for all 1 i k, k +1⊆ −j n. Then W is clearly closed and 0 W , so if we≤ show≤ that W ≤is also≤ open then W =( r, r) and we are∈ done. At this point we pass to the nonstandard− extension. To show that W is open we must show that for every w W and every s w, j ∈ ≈ s ∗W . By transfer s ∗W means that Xi (s) = 0 for all 1 i k,∈k +1 j n. Say s>w∈ . ≤ ≤ Let A≤=≤ max Xj(u) , w u s, 1 i k, k +1 j n, and | i | ≤ ≤ ≤ ≤ ≤ ≤ assume this maximum is attained at u = u′, with i = a, j = b. Let B = max d Xj(u) , w u s, 1 i k, k +1 j n, and | dt i | ≤ ≤ ≤ ≤ ≤ ≤ assume this maximum is attained at u = u′′, with i = c, j = d. b Then since Xa(w) = 0 we have

b A = X (u′) | a | b b = X (u′) X (w) u′ w B | a − a |≤| − | d d = u′ w X (u′′) | − |·|dt c | m d = u′ w C (u′′)X (u′′) | − |·| 1c m | m X u′ w A ≺| − | A. ≺≺ 78 8. THEOREM OF FROBENIUS

j We have A A and so necessarily A = 0 and so Xi (s) = 0 for all 1 i k,≺≺k +1 j n. The≤ notation≤ ≤is explained≤ in Section 8.5.  ≺ 8.4. A-track and B-track approaches Here we follow [Nowik & Katz 2015]. In Chapter 7.8, we pre- sented the received A-track3 formalism for dealing with the infinitesi- mal generator of a flow on a manifold. Note that the adjective “infin- itesimal” in this context is a dead metaphor as it no longer refers to true infinitesimals. We will now develop an infinitesimal formalism for dealing with the infinitesimal generator of a flow. Of course, an infinitesimal (B-track) formalism enables a more in- tuitive approach not merely to the infinitesimal generator of a flow, but to many other topics in calculus and analysis. Thus, the concepts of derivative, continuity, integral, and can all be defined via in- finitesimals. For example, in Section 5.8 we presented the infinitesimal approach to the extreme value theorem.

8.5. Relations of and ≺ ≺≺ We will work in the hyperreal extension R ∗R. ⊆ Definition 8.5.1. For r, s ∗R, we will write ∈ r s ≺ if r = as for finite a. We will also write r s ≺≺ if r = as for infinitesimal a. Thus r 1 means that r is finite, while ≺ r 1 ≺≺ means that r is infinitesimal. More generally, suppose we are given a finite dimensional vector space V over R. Then, given v ∗V , and s ∗R, we will write ∈ ∈ v s ≺ if ∗ v s for some norm on V . We will generally omit the from functionk k ≺ symbols and so willk·k simply write v s. This condition∗ is independent of the choice of norm since allk normsk ≺ on V are equivalent. Similarly we will write v s ≺≺ 3nativ 8.6. SMOOTH MANIFOLDS; NOTION NEARSTANDARDNESS 79 if v s. We will also write k k ≺≺ v w ≈ when v w 1. We choose a basis for V thus identifying it with Rn. − ≺≺ 1 n n Lemma 8.5.2. An element x = (x ,...,x ) ∗R satisfies x s or x s if and only if for each i, the component∈ xi satisfies≺ the corresponding≺≺ relation. This can be checked directly for the Euclidean norm on Rn. s Definition 8.5.3. Given 0

Definition 8.6.4. If a is nearstandard then st(a) is the unique p M such that a hal(p). ∈ ∈ Definition 8.6.5. For a subset A M, the halo hA of A is a subset ⊆ of ∗M defined by h A = halM (a). a A [∈ h In particular, M is the set of all nearstandard points in ∗M. The following is proved in [Davis 1977, p. 90, Theorem 5.6]. Let X be a metric space, e.g., a finite-dimensional smooth manifold equipped with a Riemannian metric. Theorem 8.6.6 (Davis). For a metric space X the following are equivalent: (1) every bounded closed set in X is compact; (2) every finite point of ∗X is nearstandard. In particular, in every finite-dimensional complete Riemannian man- ifold, finite and nearstandard are equivalent.

8.7. Results for general culture on topology We have the following characterisations of open sets and compact sets. A good reference is [Davis 1977]. h Theorem 8.7.1. A subset A is open in M if and only if A ∗A. ⊆ Example 8.7.2. The subset of M = R given by the closed inter- val A = [0, 1] is not open since ∗A fails to contain negative infinitesimals which are in the halo of A, i.e., hal (0) ∗A. M 6⊆ Theorem 8.7.3 (Davis p. 78, item 1.6). A set A M is compact h ⊆ if and only if ∗A A. ⊆ Example 8.7.4. The subset of R given by the open interval A = (0, 1) is not compact since a positive infinitesimal ǫ is in ∗A but ǫ is not in the halo of A, since it is not infinitely close to any point of A because every point of A is appreciable. Since every point of A is appreciable, so is every point of hA, but infinitesimals are not appreciable. h In particular, if A is compact then ∗A M. Recall that the halo both of a point and of a subset are defined⊆ relative to the ambient manifold M (see Remark 8.6.2). Theorem 8.7.5 (Davis p. 78, item 1.7). The manifold M itself is h compact if and only if M = ∗M. 8.7. RESULTS FOR GENERAL CULTURE ON TOPOLOGY 81

If M is noncompact then hM is an external set.5 Example 8.7.6. The 1-dimensional manifold M = R is not com- pact. The halo hR of R consists of all finite hyperreals. This is an h external subset of ∗R. Recall that R is the domain of the standard part function: st : hR R → defined in Section 3.6.

5A set is external if it is not internal. See Section 5.10.

CHAPTER 9

Gradients, Taylor, prevectorfields

9.1. Properties of the natural extension Given an open W Rn and a smooth function f : W R we ⊆ → note some properties of the extension ∗f : ∗W ∗R, obtained by transfer. When there is no risk of confusion, we will→ omit the from ∗ the function symbol ∗f and simply write f for both the original function h and its extension. Recall that W ∗W consists of all nearstandard points (see Section 8.6). ⊆ Lemma 9.1.1. For open W Rn, let f : W R be continuous. Then f(a) is finite for every a ⊆hW . → ∈ Proof. Given a hW , let U W be a neighborhood of st(a) such ∈ ⊆ that U W and U is compact. ⊆ By compactness, there is a constant C R such that f(x) C ∈ | | ≤ for all x U. By transfer f(x) C for all x ∗U. In particular we have f(a∈) C. Therefore| f(a|) ≤ is finite. (As∈ for functions, we omit | | ≤ the from relation symbols, writing in place of ∗ .)  ∗ ≤ ≤ 9.2. Remarks on gradient h Recall that U is the set of nearstandard points of ∗U, where ∗U is the star-transform of U. Given an open U Rn and a smooth ⊆ ∂f function f : U R, we can compute the partial derivatives ∂ui as usual. →

Definition 9.2.1. The partial derivatives of ∗f are by definition the functions ∗ ∂f , ∂ui   i.e., the extensions of the partial derivatives of f. Definition 9.2.2. Given f C1 we consider the row vector ∈ Da of partial derivatives at each point a ∗U. ∈ 83 84 9. GRADIENTS, TAYLOR, PREVECTORFIELDS

One similarly defines the higher partial derivatives at a point a ∈ ∗U. Definition 9.2.3. Given f C2 we consider the Hessian matrix ∈ Ha of the second partial derivatives at each a ∗U. ∈ h By Lemma 9.1.1, Da and Ha are finite throughout U. For fu- ture applications, we would like to consider the infinitesimal interval between two infinitely close points in the halo of U. Remark 9.2.4. Let a, b hU with a b, then the interval be- h∈ ≈ tween a and b is included in U ∗U. ⊆ 1 Theorem 9.2.5. If f is C then f(b) f(a)= Dx(b a) for some x in the interval between a and b. − − Proof. This results by transfer of the mean value theorem. Note that here Dx(b a) is a product of matrices (row vector times column vector). −  Equivalently, we can write f(b) f(a) D (b a)=0 − − x − for this specific choice of x. Meanwhile, for an arbitrary point in place of x we have the following weaker statement. Theorem 9.2.6. Let f be C1 and let a, b hU be infinitely close. Then for every y in the interval between a and∈b, we have f(b) f(a) D (b a) b a . − − y − ≺≺ k − k Proof. By hypothesis, the partial derivatives Dx are themselves continuous functions of x. Then the characterization of continuity via infinitesimals and microcontinuity in Section 5.6 implies that Dx D 1 for any y a (e.g., for y = st(a)). Therefore we can write − y ≺≺ ≈ f(b) f(a)= D (b a)+(D D )(b a). − y − x − y − By continuity of Dx at x0 = st(x) = st(y), the difference Dx Dy is infinitesimal, proving the theorem. −  Remark 9.2.7. If the first partial derivatives are Lipschitz (e.g., if f is C2), and we are given an infinitesimal constant β such that x y β for all x in the interval between a and b, then we have thek stronger− k ≺ condition f(b) f(a) D (b a) β b a . − − y − ≺ k − k 9.3. PREVECTORS 85

If f is C2 then by transfer of the Taylor approximation theorem we have 1 f(b) f(a)= D (b a)+ (b a)tH (b a) − a − 2 − x − for some x in the interval between a and b, and remarks similar to those we have made regarding Dx Dy apply to Hx Hy. n − − If ϕ =(ϕi): U R then the n rows Di corresponding to ϕi form → a the Jacobian matrix Ja of ϕ at a. By applying the above considerations to each ϕi we obtain that ϕ(b) ϕ(a) J (b a) b a . − − y − ≺≺ k − k If all partial derivatives are Lipschitz (e.g., if ϕ is C2) then ϕ(b) ϕ(a) J (b a) β b a with β as above. − − y − ≺ k − k 9.3. Prevectors

We now choose a positive infinitesimal λ ∗R, and fix it once and for all. We will now define the concept of a prevector.∈ 1 Definition 9.3.1. Given a hM, a prevector based at a is a pair (a, x), where x hM, such∈ that for every smooth function f : ∈ M R, we have f(x) f(a) λ. → − ≺ Note that the hypothesis f(x) f(a) λ is stronger than requir- ing f(x) and f(a) to be infinitely close.− ≺ We can avoid quantification over functions as in Definition 9.3.1 by giving an equivalent condition of being a prevector in coordinates as follows. Consider coordinates in a neighborhood W of st(a) in M, n whose image in Euclidean space is U R . Leta, ˆ xˆ ∗U be the ⊆ ∈ coordinates for a, x ∗W . ∈ Theorem 9.3.2. The pair (a, x) is a prevector based at the point a if and only if xˆ aˆ λ, − ≺ where the difference xˆ aˆ is defined using the linear structure of the n − vector space ∗R ∗U. ⊇ Proof. Let us show that the two definitions are indeed equivalent. Assume the first definition, and let (u1,...,un) be the chosen coordi- nate functions. Since each xi is smooth, we get ui(x) ui(a) λ for each i, i.e.,x ˆ aˆ λ. − ≺ Conversely,− assume≺ (a, x) satisfies the second definition. Let f : M R be a smooth function. Then by the intermediate value the- orem,→ f(x) f(a) = D (ˆx aˆ) for some c in the interval betweena ˆ − c − 1kdam-vector (with a kuf). 86 9. GRADIENTS, TAYLOR, PREVECTORFIELDS andx ˆ. Now the components of Dc are finite by Lemma 9.1.1. There- fore f(x) f(a) xˆ aˆ λ.  − ≺k − k ≺ 9.4. Tangent space to a manifold via prevectors

Definition 9.4.1. We denote by Pa = Pa(M) the set of prevectors based at a.

This is an infinite-dimensional space. We will use Pa to define the finite-dimensional tangent space of M. First we will define an equivalence relation on P as follows. ≡ a Definition 9.4.2. We write (a, x) (a, y) ≡ if one of the following two equivalent conditions is satisfied: (1) f(y) f(x) λ for every smooth function f : M R; (2) in coordinates− ≺≺ as above,y ˆ xˆ λ. → − ≺≺ The equivalence of the two definitions follows by the same argument as in Section 9.3. Definition 9.4.3. We denote by

Ta = Ta(M) the set of equivalence classes Pa/ . In the spirit of notation, the equivalence class of (a, x) P≡will be denoted ∈ a ax T , −→ ∈ a and referred to as an ivector (short for infinitesimal vector) so as to distinguish it from a classical vector. Theorem 9.4.4. A choice of coordinates in a neighborhood of M n gives an identification of Ta with R for every nearstandard a. Proof. Given a hM, let W M be a coordinate neighborhood ∈ ⊆ n of the point st(a) M. Thus we have a map W R with image U n ∈ → n λ ⊆ R . Here by definition x xˆ for all x W . Recall that (R )F is the n 7→ ∈ space of points in ∗R that are not infinite compared to λ. We define a map n λ Pa (R ) → F by sending (a, x) xˆ aˆ. This induces an identification of the 7→ − n n λ n λ space Ta = Pa/ with the space R = (R ) /(R ) . Under this ≡ F I identification, the space Ta inherits the structure of a vector space over R.  9.5. ACTION OF A PREVECTOR ON SMOOTH FUNCTIONS 87

Remark 9.4.5. If we choose a different coordinate patch in a neigh- n borhood of st(a) with image U ′ R , then if ϕ : U U ′ is the change of coordinates, then by Definition⊆ 9.2.1, we have → ϕ(ˆx) ϕ(ˆa) J (ˆx aˆ) xˆ aˆ λ, − − st(a) − ≺≺ k − k ≺ where J is the Jacobian matrix. Hence we obtain ϕ(ˆx) ϕ(ˆa) J (ˆx aˆ) λ. − − st(a) − ≺≺ This means that the map Rn Rn induced by the two identifications n → of Ta with R provided by the two coordinate maps, is given by multi- plication by the matrix Jst(a), and therefore is linear. Thus it preserves the linear space structure of the quotient.

We see that the vector space structure induced on Ta via coordinates is independent of the choice of coordinates, and so we have a well defined vector space structure on Ta over R. Note that the object denoted Ta is a mixture of standard and nonstandard notions. It is a vector space over R, but defined at every nearstandard point a hM.2 ∈

9.5. Action of a prevector on smooth functions Prevectors act on functions in a way that does not rely on either limits or standard part. Definition 9.5.1. A prevector (a, x) P acts on a smooth func- ∈ a tion f : M R as follows: → 1 (a, x) f = f(x) f(a) . λ − The right hand side is finite by definition of prevector. This action induces a differentiation as follows.

Definition 9.5.2. The differentiation of f : M R by an ivec- tor a x T is defined as follows: → −→ ∈ a −→a x f = st (a, x)f .  2If a,b hM and a b, then given coordinates in a neighborhood of st(a), ∈ ≈ n the identifications of Ta and Tb with R induced by these coordinates induces an identification between Ta and Tb. Given a different choice of coordinates, the ma- trix Jst(a) used in the previous paragraph is the same matrix for a and b, and so the identification of Ta with Tb is well defined, independent of a choice of coordinates. Thus when a b hM we may unambiguously add an ivector a x T and an ≈ ∈ −→ ∈ a ivector −→by T . ∈ b 88 9. GRADIENTS, TAYLOR, PREVECTORFIELDS

The action −→a xf is well defined by definition of the equivalence rela- tion . Note again that the ivector −→a x is a nonstandard object based at the≡ nonstandard point a, but it assigns a standard real number to the standard function f. Theorem 9.5.3. The action (a, x)f satisfies the Leibniz rule up to infinitesimals. Proof. Indeed, given functions f and g we can compute the action as follows: 1 (a, x)(fg)= f(x)g(x) f(a)g(a) λ − 1   = f(x)g(x) f(x)g(a)+ f(x)g(a) f(a)g(a) λ − −  1 1  = f(x) g(x) g(a) + f(x) f(a) g(a) λ − λ − 1   1   f(a) g(x) g(a) + f(x) f(a) g(a). ≈ λ − λ − Here the final relation is justified by the continuity of fand finiteness ≈ of the action 1 g(x) g(a) .  λ −   Theorem 9.5.4. Differentiation by an ivector −→a x satisfies the fol- lowing version of the Leibniz rule: a x(fg)= f(a ) a xg + a xf g(a ) −→ 0 · −→ −→ · 0 where a0 = st(a). Proof. By Theorem 9.5.3, we obtain the following for the differ- entiation −→a x(fg): a x(fg)= st(f(a)) axg + a xf st(g(a)) −→ · −→ −→ · = f(a ) a xg + a xf g(a ), 0 · −→ −→ · 0 where the second equality is by continuity of f and g.  When a is real we obtain the ordinary Leibniz rule for differentia- tion.

9.6. Maps acting on prevectors We continue to follow [Nowik & Katz 2015]. Recall that λ is a fixed infinitesimal, Pa denotes the space of prevectors (a, x) at a nearstandard point a hM, i.e., x a λ. Let f : M N∈be a smooth− map≺ between smooth manifolds. Let a hM be a→ nearstandard point. ∈ 9.6. MAPS ACTING ON PREVECTORS 89

Definition 9.6.1. We define the differential dfa of f, df : P (M) P (N), a a → f(a) by (a, x) (f(a), f(x)). 7→ Definition 9.6.2. The differential dfa induces a tangent map T f : T (M) T (N) a a → f(a) by choosing a prevector representing an ivector a x T M and setting −→ ∈ a T fa (−→a x)= f−−−−−−→(a) f(x). The relation between f and df seems more transparent here than in the corresponding classical definition.

Theorem 9.6.3. For a standard point a M, the space Ta is nat- urally identified with the classical tangent space∈ of M at a. Proof. It suffices to show this for an open set U Rn, where n ⊆ the classical tangent space at any point a is R itself. A classical vector v Rn is then identified with ∈ a x where x = a + λ v. −→ · Under this identification, our definitions of the differentiation −→a xf (see Section 9.5) and the action of the differential T fa(−→a x) coincide with the classical ones.  Remark 9.6.4. When the manifolds are open subsets of Euclidean space and T is the classical tangent space, the map T f : T T is a a a → f(a) identified with the Jacobian matrix Ja (at the point a) whose rows are the gradients Da of all n components. Recall that Da v for a column vector v is the matrix product yielding a scalar result.

CHAPTER 10

Prevector fields, classes D1 and D2

10.1. Prevector fields The notion of an internal set was discussed in Section 5.10. Recall that λ is a fixed infinitesimal. A prevector field is, intuitively, the assignment of an infinitesimal displacement (of size comparable to λ) at every point, given by an internal map. Remark 10.1.1. Here, as usual, a map is called internal if it is a member of a star transform of a standard object. For example, a map f : M N is defined by a subset of the Cartesian product M N. → × Therefore f can be thought of as an element of P = (M N). An P × internal map from ∗M to ∗N is then an element of ∗P.

More precisely, we have the following definition. Recall that Pa is the set of prevectors at a point a hM, i.e., pairs (a, x) such that in coordinatesx ˆ aˆ λ. ∈ − ≺ Definition 10.1.2. A prevector field Φ on a smooth manifold M is an internal map Φ : ∗M ∗M such that we have (a, Φ(a)) Pa for every a hM. → ∈ ∈ In coordinates where addition and subtraction are possible, this condition translates into Φ(a) a λ for every a hM. − ≺ ∈ Remark 10.1.3. Even though the condition is imposed only at h points of M, we define Φ on all of ∗M since we need Φ (and therefore both its domain and range) to be an internal entity so as to be able to apply hyperfinite iteration which is the basis of the hyperreal walk.

The relation between prevectors (a, x)and(a, y) in Pa was defined in Section 9.4. Namely,≡ in coordinatesy ˆ xˆ λ. − ≺≺ Definition 10.1.4. Let Φ and G be prevector fields. We say Φ is equivalent to G and write Φ G if Φ(a) G(a) for every a hM, or in coordinates Φ(a) G(a) ≡ λ. ≡ ∈ − ≺≺ Definition 10.1.5. A local prevector field on an open U M is an ⊆ internal map Φ : ∗U ∗V satisfying the above condition, where V U is open. → ⊇ 91 92 10. PREVECTOR FIELDS, CLASSES D1 AND D2

When the distinction is needed, we will call a prevector field defined on all of ∗M a global prevector field. The reason for allowing the values of a local prevector field defined on ∗U to lie in a possibly larger range ∗V is in order to allow a prevector field Φ to be restricted to a smaller domain which is not invariant under Φ. Example 10.1.6. Let M = R. If we wish to restrict the prevector field Φ on M given by Φ(a)= a + λ to the domain ∗(0, 1) ∗R, then we need the flexibility of allowing a slightly larger range for⊆ Φ. In the sequel we will usually not mention the slightly larger range V when describing a local prevector field, but will tacitly assume that we have such a V when needed. A second instance where it may be required for the range to be slightly larger than the domain is the following natural setting for defining a local prevector field. Example 10.1.7. Let p M. Let V M a coordinate neigh- ∈ n ⊆ borhood of p with imagesp ˆ V ′ R . Let X be a classical vector ∈ ⊆ n 1 field on V , given in coordinates by X′ : V ′ R . Then there is a n → neighborhood U ′ ofp ˆ R , with U ′ V ′, such that we can define a ∈ ⊆ local prevector field Φ′ : ∗U ′ ∗V ′ by setting → Φ′(a)= a + λ X′(a). (10.1.1) · Indeed, it suffices to choose U ′ such that U ′ is compact and further- more U V ′. For the corresponding open set U V in M, this ′ ⊆ ⊆ induces a local prevector field Φ : ∗U ∗V which realizes the classical vector field X on U in the sense of Definition→ 10.2.1 below (cf. Theo- rem 9.6.3).

10.2. Realizing classical vector fields Definition 10.2.1. We say that a local prevector field Φ on U realizes a classical vector field X on U if for every smooth function h : U R we have the following for the new function Xh: → Xh (a)= a−−−−→Φ(a) h for all a U, where a−−−−→Φ(a) is the ivector represented by the prevec- tor (a, Φ(∈a)).

1It is not entirely unambiguous how one is to choose a trivialized vector field X′ starting with a vector field X on M. One may want to make this more precise. 10.2. REALIZING CLASSICAL VECTOR FIELDS 93

Remark 10.2.2. When realizing a vector field as in Example 10.1.7 (namely, formula (10.1.1) above) it may be necessary to restrict to a smaller neighborhood U. For example, when M = V = V ′ = (0, 1) d and X = 1 (i.e., X =1 dx in classical notation), one needs to take U = (0,r) for some r R,r < 1 in order for Φ(a) = a + λ to lie always ∈ in ∗V . Note that Definition 10.2.1 involves only standard points. Different coordinates for the same neighborhood U will induce equiv- alent realizations in ∗U. More precisely, we show the following. Remark 10.2.3. Let U Rn. Let X : U Rn be a classical vec- ⊆ n → tor field. Let ϕ : U W R be a change of coordinates. Thus Tϕ → ⊆ n sends Ta to Tϕ(a). Let Y : W R be the classical vector field corre- sponding to X under the coordinate→ change. Then

Y (ϕ(a)) = JaX(a), (10.2.1) where Ja is the Jacobian matrix of ϕ at a. Proposition 10.2.4. In the notation of Remark 10.2.3, let Φ,G be the prevector fields given by (1) Φ(a)= a + λX(a), (2) G(a)= a + λY (a) 1 as in Example 10.1.7. Then Φ ϕ− G ϕ, or equivalently, ≡ ◦ ◦ ϕ Φ(a) G ϕ(a) λ ◦ − ◦ ≺≺ for all a hU. ∈ Proof. Let ϕi,Y i,Gi be the ith component of ϕ,Y,G respectively, i i i and let Da be the differential of ϕ at a, so that Da is the ith row of Ja. We apply the mean value theorem to ϕi together with (10.2.1) to obtain the following for a suitable point c: ϕi Φ(a) Gi ϕ(a)= ϕi(a + λX(a)) Gi ϕ(a) ◦ − ◦ − ◦ = ϕi(a + λX(a)) ϕi(a) λY i(ϕ(a)) − − = DiλX(a) λDi X(a) c − a = λ(Di Di )X(a) c − a λ, ≺≺ i i where Dc is the differential of ϕ at a suitable point c in the interval between a and a + λX(a). Such a point c exists by Theorem 9.2.5. Since this is true for each component i, we obtain ϕ Φ(a) G ϕ(a) λ ◦ − ◦ ≺≺ 94 10. PREVECTOR FIELDS, CLASSES D1 AND D2 proving the theorem. 

10.3. Regularity class D1 for prevectorfields Defining various operations on prevector fields, such as their flow, or Lie bracket, requires assuming some regularity properties. First we recall the following classical definition. Definition 10.3.1. A classical vector field X : U Rn is called K- → Lipschitz, where K R, if ∈ X(a) X(b) K a b for all a, b U. (10.3.1) k − k ≤ k − k ∈ For the local prevector field Φ of Example 10.1.7, where Φ(a) a = λX(a), this translates into − Φ(a) a Φ(b) b Kλ a b (10.3.2) − − − ≤ k − k     for a, b ∗ U. ∈ Remark 10.3.2. Note the presence of the factor of λ on the right hand side of (10.3.2) which is not in (10.3.1). This is due to the fact that (10.3.2) deals with infinitesimal prevectors rather than classical vectors. The bound (10.3.2) motivates the following definition. Definition 10.3.3. A prevector field Φ on a smooth manifold M 1 h is of class D if whenever a, b M and (a, b) Pa then in coordinates the following condition is satisfied:∈ ∈ Φ(a) a Φ(b)+ b λ a b . (10.3.3) − − ≺ k − k As mentioned earlier, condition (10.3.3) is equivalent to the Lips- chitz condition when Φ is defined by a classical vector field.

10.4. Second differences, class D2 To work with Lie brackets a stronger condition than D1 is necessary. Definition 10.4.1. A prevector field Φ on a smooth manifold M is of class D2 if for any a hM, the following condition is satisfied ∈ n in coordinates: for each pair v,w ∗R with v,w λ, the second difference ∈ ≺ ∆2 Φ(a) =Φ(a) Φ(a + v) Φ(a + w)+Φ(a + v + w) v,w − − is sufficiently small: ∆2 Φ(a) λ v w . v,w ≺ k kk k 10.5. SECORD-ORDER MEAN VALUE THEOREM 95

In Proposition 10.6.1 we will show that the prevector field of Ex- ample 10.1.7, induced by a classical vector field of class Ck, is a Dk prevector field (k =1, 2). This is in fact the central motivation for our definition of Dk. Remark 10.4.2. Our Dk is in fact a weaker condition than Ck, e.g. Φ of Example 10.1.7 is D1 if X is (order 1) Lipschitz, which is weaker than C1. A definition of D0 along the above lines would amount to requiring Φ(a) a λ, i.e. Φ being a prevector field. Note, however, that in our definitions− ≺ above, being a prevector field is part of the definition of an element of Dk. 10.5. Secord-order mean value theorem We will need the result from [Rudin 1976, Theorem 9.40]. We will assume that f is C2 for simplicity (Rudin presents weaker conditions). Theorem 10.5.1 (Rudin Theorem 9.40). Suppose a two-variable function f(u1,u2) is defined and of class C2 in an open set in the plane including a closed rectangle Q with sides parallel to the coordinate axes, having (a, b) and (a + h, b + k) as opposite vertices, where h> 0 and k > 0. Set ∆2(f, Q)= f(a + h, b + k) f(a + h, b) f(a, b + k)+ f(a, b). − − Then there is a point (x, y) in the interior of Q such that ∂2f ∆2(f, Q)= hk (x, y). ∂u1∂u2   Remark 10.5.2. Note the analogy with the mean value theorem. Proof. Define a 1-variable function u(t)= f(t, b + k) f(t, b) where t [a, a + h]. − ∈ Two applications of the mean value theorem will yield the result. Note that the mean value theorem is applicable to the first partial of f since f C2 by hypothesis. Namely there is an x between a and a + h, and there∈ is a y between b and b + k, such that ∆2(f, Q)= u(a + h) u(a) − = hu′(x) ∂f ∂f = h (x, b + k) (x, b) ∂u1 − ∂u1      ∂2f = hk (x, y) ∂u1∂u2   96 10. PREVECTOR FIELDS, CLASSES D1 AND D2 proving the theorem. 

10.6. The condition C2 implies D2 Recall our “second difference” notation for a prevector field Φ at a nearstandard point a: ∆2 Φ(a)=Φ(a) Φ(a + v) Φ(a + w)+Φ(a + v + w). v,w − − The condition D2 for prevector fields was defined in Section 10.4. This 2 is the condition ∆v,wΦ(a) λ v w for all v,w λ. This is in fact related to [Zygmund 1945≺ k].kk Givenk a classical vector≺ field X, one defines a prevector field Φ(a)= a + λX(a). We will show that if X is of class C2 then Φ is of class D2.

Proposition 10.6.1. Let W Rn be open. Let X : W Rn be 2 ⊆ → a classical C vector field, and consider its natural extension to ∗W . h n Then for each a W and each pair of infinitesimal v,w ∗R , we have ∈ ∈ ∆2 X(a) v w . v,w ≺k kk k We obtain the following immediate corollary. Corollary 10.6.2. Let X be of class C2. If Φ is the prevector 2 field on ∗W of Example 10.1.7, i.e., Φ(a)= a + λX(a), then Φ is D . Proof of Proposition 10.6.1. Let U W be a smaller neigh- borhood of st(a) for which all second partial⊆ derivatives of X are bounded by a constant Const R. Let (X1,...,Xn) be the com- ∈ n ponents of X. Given p U and v1, v2 R such that we have p + s v + s v U for all 0 ∈ s ,s 1, we∈ define functions ψi by 1 1 2 2 ∈ ≤ 1 2 ≤ i i ψ (s1,s2)= X (p + s1v1 + s2v2). Then by chain rule, the mixed second partial of ψ satisfies ∂2ψi Const v v ∂s ∂s ≤ | 1|| 2| 1 2

i where Const depends on the second partials of X . By Theorem 10.5.1 (second-order mean value theorem) there is a point (t , t ) [0, 1] [0, 1] such that 1 2 ∈ × 2 e1+e2 i ∂ i ( 1) ψ (e1, e2)= ψ (t1, t2). 2 − ∂s1∂s2 (e1,e2) 0,1 X∈{ } 10.7. BOUNDS FOR D1 PREVECTOR FIELDS 97

Therefore

2 e1+e2 i ∆v1,v2 X = ( 1) X (p + e1v1 + e2v2) 2 − (e1,e2) 0,1 X∈{ } e1+e2 i = ( 1) ψ (e1, e2) 2 − (e1,e2) 0,1 X∈{ } 2 ∂ i = ψ (t1, t2) ∂s1∂s2

Ci v1 v2 ≤ k kk k where Ci is determined by a bound for all second partial derivatives of Xi in U. Since this is valid for each Xi, i = 1,...,n, there is a K R such that ∈

e1+e2 ( 1) X(p + e1v1 + e2v2) K v1 v2 (10.6.1) 2 − ≤ k kk k (e1,e2) 0,1 X∈{ } n for every p U and v1, v2 R such that p + s1v1 + s2v2 U for ∈ ∈ ∈ all 0 s1,s2 1, si R. Applying≤ ≤ the transfer∈ principle to (10.6.1), we conclude that the n same formula holds, with the same K, for all p ∗U and v1, v2 ∗R ∈ ∈ provided that p + s1v1 + s2v2 ∗U for all 0 s1,s2 1, si ∗R. In ∈ ≤ ≤ ∈ n particular this is true for p = a and all infinitesimal v1, v2 ∗R .  ∈ 10.7. Bounds for D1 prevector fields Recall that a D1 prevector field Φ satisfies the condition Φ(a) a Φ(b)+ b λ a b − − ≺ k − k 1 whenever (a, b) Pa (see Section 10.3). We note that a D prevector field Φ satisfies the∈ following bounds. Proposition 10.7.1. If Φ is a D1 prevector field on M, and a, b hM with a b λ, then ∈ − ≺ a b Φ(a) Φ(b) a b . k − k≺k − k≺k − k Proof. By the triangle inequality we have Φ(a) Φ(b) a b Φ(a) Φ(b) a + b k − k−k − k ≤k − − k (10.7.1) = Kλ a b k − k where the hyperreal constant K is defined by the last equality in (10.7.1), where K is finite by the D1 condition. In other words, we have Kλ a b Φ(a) Φ(b) a b Kλ a b . − k − k≤k − k−k − k ≤ k − k 98 10. PREVECTOR FIELDS, CLASSES D1 AND D2

Rearranging terms, we obtain (1 Kλ) a b Φ(a) Φ(b) (1 + Kλ) a b , − k − k≤k − k ≤ k − k as required.  CHAPTER 11

Prevectorfield bounds via underspill, walks, flows

11.1. Operations on prevector fields We will now show that addition of D1 prevector fields in coordinates is equivalent to their composition (in the sense of the relation ). More precisely, we show the following. ≡ Proposition 11.1.1. Let Φ be a D1 prevector field and G any prevector field. Then for every a hM, we have the following relation among ivectors: ∈ a−−−−−−−→Φ(G(a)) = a−−−−→Φ(a)+ a−−−−→ G(a). In particular if both Φ,G are D1 then Φ G G Φ. ◦ ≡ ◦ Proof. In a coordinate chart, let x = a + (Φ(a) a)+(G(a) a)=Φ(a)+ G(a) a. − − − Passing to equivalence classes, we obtain the following relation among ivectors: a−−−−→Φ(a)+ a−−−−→ G(a)= −→a x. Now let b = G(a). Then Φ(G(a)) x = Φ(b) Φ(a) b + a λ b a λ − − − ≺ k − k ≺≺ by the D1 condition on Φ, proving the proposition.  We will now show that the composition of D1 prevector fields is D1. Proposition 11.1.2. If prevector fields Φ,G are D1 then prevector field Φ G is D1. ◦ Proof. Let c = G(a) and d = G(b). We have by the triangle inequality Φ G(a) Φ G(b) a + b k ◦ − ◦ − k Φ G(a) Φ G(b) G(a)+ G(b) + G(a) G(b) a + b ≤k ◦ − ◦ − k k − − k = Φ(c) Φ(d) c + d + G(a) G(b) a + b k − − k k − − k λ c d + λ a b λ a b ≺ k − k k − k ≺ k − k 99 100 11. PREVECTORFIELD BOUNDS VIA UNDERSPILL, WALKS, FLOWS where Proposition 10.7.1 is applied in order to bound c d in terms of a b. −  − A similar proposition can be proved for D2.1

11.2. Sharpening the bounds via underspill Here we are interested in sharpening the bounds for prevector- fields on M in terms of a specific real constant.≺ Proposition 11.2.1. Let Φ be a prevector field on W , where W n ⊆ M is an open coordinate neighborhood with image U R . Let B U be a closed ball. We will identify W with U to lighten⊆ the notation.⊆ Then we have the following.

(1) There is C R such that Φ(a) a Cλ for all a ∗B. ∈ k − k ≤ ∈ (2) If G is another prevector field, then there is a finite β ∗R ∈ such that Φ(a) G(a) βλ for all a ∗B. k − k ≤ ∈ (3) If furthermore Φ G then the constant β ∗R in (2) can be chosen to be infinitesimal.≡ ∈ Proof. Since the first statement is a special case of the second (take G(a)= a for all a), it suffices to prove the second statement. Let

A = n ∗N : Φ(a) G(a) nλ for every a ∗B . { ∈ k − k ≤ ∈ } Every infinite n ∗N is in A, so that ∈ ∗N N A. \ ⊆ 1Similarly, we have the following. Proposition 11.1.3. If Φ, G are D2 then Φ G is D2. ◦ Proof. In some coordinates let p = G(a), x = G(a + v) G(a) and y = G(a + w) G(a), and so by Propositions 13.1.2 and 10.7.1 we have −x v and y w . Also− by Propositions 13.1.2 and 10.7.1 we have ≺k k ≺k k

Φ(p + x + y) Φ(G(a + v + w)) p + x + y G(a + v + w) k − k≺k − k = G(a)+ G(a + v)+ G(a + w) G(a + v + w) λ v w . k− − k≺ k kk k Now

Φ G(a) Φ G(a + v) Φ G(a + w)+Φ G(a + v + w) k ◦ − ◦ − ◦ ◦ k = Φ(p) Φ(p + x) Φ(p + y)+Φ G(a + v + w) k − − ◦ k Φ(p) Φ(p + x) Φ(p + y)+Φ(p + x + y) + Φ(p + x + y) Φ(G(a + v + w)) ≤k − − k k − k λ x y + λ v w λ v w . ≺ k kk k k kk k≺ k kk k  11.3. FURTHER BOUNDS VIA UNDERSPILL 101

But A is an internal set, being defined in terms of internal entities Φ and G.2 Therefore by underspill there is a finite integer C in A. Recall that underspill is the fact that if A ∗N is an internal set, and A includes all infinite n then it must also⊆ include a finite n. This is based on the fact that infinite hypernaturals form an external set; see Section 5.11. To prove the third statement (for Φ G), let ≡ λ A = n ∗N : Φ(a) G(a) for every a ∗B . ∈ k − k ≤ n ∈   Every finite n ∗ N is in A and so by there is an infinite n ∗N ∈ 1 ∈ in A. We can therefore choose the infinitesimal value β = n . 

11.3. Further bounds via underspill Proposition 11.3.1. Let Φ be a D1 prevector field on M. Let W n ⊆ M be an open coordinate neighborhood with image U R . Let B U ⊆ ⊆ be a closed ball. Then there is K R such that ∈ Φ(a) a Φ(b)+ b Kλ a b (11.3.1) k − − k ≤ k − k for all a, b ∗B. It follows∈ that (1 Kλ) a b Φ(a) Φ(b) (1 + Kλ) a b . (11.3.2) − k − k≤k − k ≤ k − k Proof. Let N = 1/λ . Let a, b ∗B. We construct a hyperfinite partition of the segment⌊ between⌋ a and∈ b. For each k =0,...,N let k a = a + (b a). k N −

Then ak ak+1 λ. Let Cab ∗R be the maximum of the ratios − ≺ ∈ Φ(a ) a Φ(a )+ a k k − k − k+1 k+1k λ a a k k − k+1k 1 over all k =0,...,N 1. Since Φ is D , it follows that the constant Cab is finite. For every 0− k N 1 we have ≤ ≤ − a b Φ(a ) a Φ(a )+ a C λ a a = C λk − k. k k − k − k+1 k+1k ≤ ab k k − k+1k ab N

2Kanovei to comment on strong transitivity. 102 11. PREVECTORFIELD BOUNDS VIA UNDERSPILL, WALKS, FLOWS

Therefore by the triangle inequality Φ(a) a Φ(b)+ b k − − k N 1 − Φ(a ) a Φ(a )+ a (11.3.3) ≤ k k − k − k+1 k+1k Xk=0 C λ a b . ≤ ab k − k Now let

A = n ∗N : Φ(a) a Φ(b)+ b nλ a b for each a, b ∗B . { ∈ k − − k ≤ k − k ∈ } Since each Cab as in (11.3.3) is finite, every infinite n ∗N is in A. Hence by underspill, A also contains a finite integer K, proving∈ (11.3.1). The estimate (11.3.2) assertion follows from the first as in the proof of Proposition 10.7.1.  A similar proposition can be proved for D2.3

11.4. Global prevector fields, walks, and flows In this section we will define the hyperreal walk of a prevector field and relate it to the classical flow. Recall that a prevector field Φ is a map Φ : ∗M ∗M from ∗M to itself. We wish to view a prevector field as the first step→ of its walk at time λ> 0, and will define the hyperreal

3Similarly we have the following. Proposition 11.3.2. Let Φ be a D2 prevector field. If W is a coordinate neighborhood with image U Rn and B U is a closed ball, then there is K R such that ⊆ ⊆ ∈ Φ(a) Φ(a + v) Φ(a + w)+Φ(a + v + w) Kλ v w k − − k≤ k kk k ∗ ∗ ∗ for all a B and v, w Rn such that a + v,a + w,a + v + w B. ∈ ∈ ∈ Proof. The proof is similar to that of Proposition 11.3.1. Let N = 1/λ . ∗ ∗ n ∗ ⌊ ⌋ Given a B and v, w R such that a + v,a + w,a + v + w B, let ak,l = ∈ ∈ ∈ a + k v + l w, 0 k,l N. Let C be the maximum of N N ≤ ≤ avw Φ(a ) Φ(a ) Φ(a )+Φ(a ) k k,l − k+1,l − k,l+1 k+1,l+1 k λ v/N w/N k kk k for 0 k,l N 1 then C is finite. For every 0 k,l N 1 we have ≤ ≤ − avw ≤ ≤ − Φ(a ) Φ(a ) Φ(a )+Φ(a C λ v/N w/N . k k,l − k+1,l − k,l+1 k+1,l+1k≤ avw k kk k Summing over 0 k,l N 1 we get ≤ ≤ − Φ(a) Φ(a + v) Φ(a + w)+Φ(a + v + w) C λ v w . k − − k≤ avw k kk k By underspill as in the proof of Proposition 11.3.1, there is one finite K which works for all a,v,w.  11.4. GLOBAL PREVECTOR FIELDS, WALKS, AND FLOWS 103 walk and the real flow accordingly (see below). The flow at arbitrary time t is defined by iterating Φ the appropriate number of times. Remark 11.4.1. The idea of a vector field being the infinitesimal generator of its flow receives a literal meaning in the TIDG setting.

We view Φ as an element Φ ∗Map(M). Consider the map ∈ ∗Map(M) ∗N ∗Map(M) × → which is the natural extension of the map Map(M) N Map(M) taking (φ, n) to φn = φ φ φ (altogether n times).× → Thus Φn is defined for all hypernatural◦ ◦···◦n. We will now define the hyperfinite flow Φt of a prevector field Φ. The flow will be defined at times t ∗R that are hyperinteger multiples of the base infinitesimal λ. ∈

Definition 11.4.2. Let Φ : ∗M ∗M be a global prevector field. → For each hypernatural n ∗N, let t = nλ and define the hyperreal ∈ walk Φt of Φ at time t by setting n Φt(a) = Φ (a). (11.4.1)

Definition 11.4.3. We define the real flow φt on M by setting

φt(x)= st(Φnλ(x)) for all x M, where nλ t< (n + 1)λ, i.e., ∈ ≤ t n = . λ   The real flow will be defined for all real t that are sufficiently small so that the standard part is well-defined; see material around (11.8.1) for the details. Theorem 11.4.4. Let Φ be a prevector field of the class D1 on M, and let Φt be its hyperfinite walk as in (11.4.1). Then prevector field Φ is invariant under Φt. n Proof. The differential dΦt of the map Φt = Φ acts on each prevector (a, Φ(a)) by n n dΦt(a, Φ(a)) = (Φ (a), Φ (Φ(a))) = (Φn(a), Φ Φ Φ(a)) ◦ ◦···◦ = (Φn(a), Φ(Φn(a))) =(b, Φ(b)) by the associativity of composition, where b = Φn(a), proving the in- variance of the prevector field Φ. Here the relation Φn Φ = Φ Φn is proved by internal induction (see Section 11.5). ◦ ◦  104 11. PREVECTORFIELD BOUNDS VIA UNDERSPILL, WALKS, FLOWS

Remark 11.4.5. This proof of invariance compares favorably to the proof of invariance under the flow in the A-track framework (see Theorem 7.8.2). The hyperfinite walk for a local prevector field is defined similarly. A bit of care is required to specify its domain. Given a local prevector field Φ : ∗U ∗V , we extend Φ to Φ′ : ∗V → → ∗V by defining Φ′(a)= a for all a ∗V ∗U, and let Yn = a ∗U : n ∈ − { ∈ (Φ′) (a) ∗U . The domain of Φ will be Y (here as before n(t) is ∈ } t n(t) the integer part of t/λ), where it is defined by Φt(a) = Φt′ (a). We would also like to consider Φ for t 0. This can be defined for a t ≤ global prevector field Φ which is bijective on ∗M, or for a local prevector field which is bijective in the sense of Remark 13.2.5, in particular a D1 1 local prevector field. Namely, we define Φt for t 0 to be (Φ− ) t. ≤ − Note that for any global prevector field Φ, the hyperreal walk Φt is defined for all t 0. This is of course not the case for the classical flow ≥ of a classical vector field. Similarly Φt is defined for all t 0 if Φ is bijective. ≤

11.5. Internal induction Internal induction is treated in [Goldblatt 1998, p. 129].

Theorem 11.5.1. If X ∗N be an internal subset containing 1 ⊆ 4 and closed under the successor function n n +1. Then X = ∗N. 7→ Proof. If the set difference Y = ∗N X is nonempty, then since Y is internal, it has a least element n Y (see\ Section 5.11). Hence n 1 X. However, by closure under successor,∈ the number n must also− be∈ in X, yielding a contradiction.  This will be used in the proof of Theorem 11.6.2.

11.6. Dependence on initial condition and prevector field In this section we deal with initial conditions.5 Remark 11.6.1. We showed in Proposition 11.3.1 that each D1 prevector field Φ defined in an open neighborhood of a closed Euclidean ball B satisfies the bound Φ(a) a Φ(b) + b Kλ a b for k − − k ≤ k − k all a, b ∗B for an appropriate finite K. Such a K will be exploited in Theorem∈ 11.6.2.

4“peulat ha’okev” according to hebrew wikipedia at peano axioms 5tnai hatchalati 11.6. DEPENDENCE ON INITIAL CONDITION AND PREVECTOR FIELD 105

1 Theorem 11.6.2. Let Φ be a local D prevector field on ∗U. Given p n ∈ U and a coordinate neighborhood of p with image W R , let B′ B W be closed balls of radii r/2,r around the image⊆ of p. Sup-⊆ ⊆ pose Φ(a) a Φ(b)+ b Kλ a b for all a, b ∗B, with K a finitek constant,− 6−then k ≤ k − k ∈

(1) there is a positive T R such that Φt(a) ∗B for all a ∗B′ ∈ ∈ ∈ and T t T . Furthermore, for all a, b ∗B′ and 0 t − ≤ ≤ Kt ∈ ≤ ≤ T : Φt(a) Φt(b) e a b . k − k ≤ k − k 2 (2) If we take a slightly larger constant K′ = K/(1 Kλ) , then − for all a, b ∗B′ and T t T : ∈ − ≤ ≤ K′ t K′ t e− | | a b Φ (a) Φ (b) e | | a b . k − k≤k t − t k ≤ k − k Proof. We will give a proof for positive t.7 Let C R be as in Proposition 11.2.1 (on sharpening bounds via underspill and∈ overspill), so that Φ(a) a Cλ for all a ∗B. We define T by setting k − k ≤ ∈ r T = , 2C so that 0 < T R. Let a ∗B′ and 0 t T , and set ∈ ∈ ≤ ≤ n = n(t)= t/λ . (11.6.1) ⌊ ⌋ By internal induction, we obtain a bound for Φn(a) a by means of a hyperfinite sum as follows: −

n n m m 1 r Φ (a) a Φ (a) Φ − (a) nCλ . k − k ≤ k − k ≤ ≤ 2 m=1 X By the triangle inequality we obtain r r Φn(a) Φ(a) a + a + = r, k k≤k − k k k ≤ 2 2

n and therefore Φ (a) ∗B. By Proposition 11.3.1(2) we have ∈ (1 Kλ) a b Φ(a) Φ(b) (1 + Kλ) a b − k − k≤k − k ≤ k − k

6Such finite K exists by Proposition 11.3.1. 7The case of negative t then follows from Remark 13.3.2. 106 11. PREVECTORFIELD BOUNDS VIA UNDERSPILL, WALKS, FLOWS

8 for all a, b ∗B, and so by internal induction (see Theorem 11.5.1) we obtain ∈ (1 Kλ)n a b Φn(a) Φn(b) (1 + Kλ)n a b , − k − k≤k − k ≤ k − k proving the theorem in view of the fact that Kt n (1 + Kλ)n = 1+ eKt (11.6.2) n ≈   by formula (11.6.1) defining n. The choice of K′ results by elementary algebra. 

11.7. Dependence on initial conditions for local prevector field In this section we will provide bounds on how fast the flows of a pair of prevector fields can diverge from each other. In particular, this will show that if prevector fields are equivalent then the corresponding classical flows coincide. We will deal with a more general situation (than that dealt with in Section 11.6) where A′ is the set of initial conditions, whereas A is the set where we assume that the flow is contained until time T . The set A will typically be larger than A′. 1 Let Φ be a D local prevector field on ∗U and let G be any local prevector field on ∗U. Given a coordinate neighborhood included in U n with image W R , let A′ A ∗W be internal sets. Assume that⊆β is such that⊆ ⊆ Φ(a) G(a) βλ (11.7.1) k − k ≤ for all a A (in particular if β is infinitesimal then the prevector fields are equivalent).∈ Assume also that K > 0 is a finite constant such that Φ(a) a Φ(b)+ b Kλ a b k − − k ≤ k − k for all a, b A. Assume that 0 < T R is such that Φt(a) and Gt(a) ∈ ∈ are in A for all a A′ and 0 t T . ∈ ≤ ≤ 8This step requires internal induction and cannot be obtained by iterating the formula n times, writing down the resulting formula with integer parameter n, and applying the transfer principle. The problem is that the entities occurring in the formula are internal rather than standard, so there is no real statement to apply transfer to. However, one can perhaps use the same trick as in the definition of the hyperfinite iterate in the first place, by introducing an additional parameter as in Section 11.4. Namely, the formula holds for all triples (Φ,G,n) map(M) ∈ × map(M) N, and then apply the star transform. × 11.7. DEPENDENCE ON INITIAL CONDITIONS FOR LOCAL PREVECTOR FIELD107

Theorem 11.7.1. In the notation above, for all a A′ and 0 t T , we have the following bound: ∈ ≤ ≤ β Φ (a) G (a) (eKt 1) βteKt. k t − t k ≤ K − ≤ 1 1 If G− exists, e.g., if G is also D , and if Φt(a) and Gt(a) are in A for all a A′ and T t T , then ∈ − ≤ ≤ β K t K t Φ (a) G (a) (e | | 1) β t e | | k t − t k ≤ K − ≤ | | for all T t T . − ≤ ≤ Proof. It suffices to provide a proof for positive t. We will prove by internal induction that β Φn(a) Gn(a) (1 + Kλ)n 1 . (11.7.2) k − k ≤ K − This implies the statement when applied to n = n(t); cf. (11.6.2). The basis of the induction is provided by the estimate (11.7.1). By Propo- sition 11.3.1(2) we have Φn+1(a) Gn+1(a) k − k Φ(Φn(a)) Φ(Gn(a)) + Φ(Gn(a)) G(Gn(a)) ≤k − k k − k (1 + Kλ) Φn(a) Gn(a) + βλ. ≤ k − k Exploiting the inductive hypothesis (11.7.2), we obtain Φn+1(a) Gn+1(a) k − k β (1 + Kλ) (1 + Kλ)n 1 + βλ ≤ K − β β βKλ (1 + Kλ)n+1 + βλ ≤ K − K − K β = (1 + Kλ)n+1 1 K − proving the inductive step from n to n + 1.   Corollary 11.7.2. Assume Φ and G are as in Theorem 11.7.1. If we have Φ G, then there is a β 1 for the statement of Theo- rem 11.7.1, and≡ therefore Φ (a) G (≺≺a) for all 0 t T . t ≈ t ≤ ≤ Proof. By Proposition 11.2.1.  9

9

Remark 11.7.3. If the internal sets A′ A given in Theorem 11.7.1 are ′ ′ ∗ ⊆ balls B B of radii r < r R, and say we are not given that both Φt(a) ⊆ ∈ 108 11. PREVECTORFIELD BOUNDS VIA UNDERSPILL, WALKS, FLOWS

11.8. Inducing a real flow

The hyperreal walk Φt of a prevector field Φ induces a classical flow on M as follows.

1 Definition 11.8.1. Let Φ be a D local prevector field. Let B′ ⊆ Rn and [ T, T ] be as in Theorem 11.6.2. We define the real flow − φ : B′ M t → induced by Φ by setting

φt(x)= st(Φt(x)). (11.8.1) Recall that Φ is defined by hyperfinite iteration (of Φ) t/λ times. t ⌊ ⌋ The following are immediate consequences of Theorems 11.6.2 and Corollary 11.7.2. Theorem 11.8.2. Given a D1 prevector field Φ the following hold: 10 K t (1) the flow φt is Lipschitz continuous with constant e | |. (2) the real flow φt is injective. (3) If G is another D1 prevector field with real flow g , and Φ G, t ≡ then φt = gt. Remark 11.8.3. Suppose Φ is obtained from a classical vector field X by the procedure of Example 10.1.7. Then [Keisler 1976, Theorem 14.1] shows that φt is in fact the flow of the classical vector field X in the classical sense. By Theorem 11.8.2(3), each prevector field Φ that realizes X will yield a hyperfinite walk whose standard part will produce the same classical flow. The results of this subsection have the following application to the standard setting. Corollary 11.8.4. Consider an open set U Rn. Let X,Y : U ⊆ → Rn be classical vector fields, where X is Lipschitz with constant K,

′ and Gt(a) are in B for all a B and 0 t T , but rather we are given that only one of them remains in some∈ ball B′ ≤ B′′≤ B with radius r′ < r′′ < r such that β r r′′. Then it follows from the⊆ bound⊆ of Theorem 11.7.1 that the other flow remains≺≺ − in B. To be precise, one needs to repeat the induction of the proof of Theorem 11.7.1. This needs to be rewritten. 10 This follows from the estimates given if we interpret the flow as referring to the map M M at time t. However, if this refers to the full flow M I M then the claim→ apparently does not immediately follow and may not even× be→ true for a general prevectorfield. 11.9. GEODESIC FLOW 109 and X(x) Y (x) b for all x U. If x(t), x˜(t) are integral curves of Xkthen − k ≤ ∈ x(t) x˜(t) eKt x(0) x˜(0) . k − k ≤ k − k If y(t) is an integral curve of Y with x(0) = y(0) then b x(t) y(t) (eKt 1) bteKt. k − k ≤ K − ≤

Proof. Define prevector fields on ∗U as usual by setting Φ(a) a = λX(a) and G(a) a = λY (a) as in Example 10.1.7, and apply Theorems− 11.6.2, 11.7.1, and− Remark 11.8.3.  11.9. Geodesic flow One advantage of the hyperreal approach to solving a differential equation is that the hyperreal walk exists for all time, being defined combinatorially by iteration of a selfmap of the manifold. The focus therefore shifts away from proving the existence of a solution, to estab- lishing the properties of a solution. Thus, our estimates show that given a uniform Lipschitz bound on the vector field, the hyperreal walk for all finite time stays in the finite part of the ∗M. If M is complete then the finite part is nearstandard. Our estimates they imply that the hyperreal walk for all finite time descends to a real flow on M. In particular, the geodesic equation on an n-dimensional mani- fold M, k′′ k i′ j′ α +Γijα α =0, (11.9.1) can be interpreted as a first-order ODE on the (2n 1)-dimensional manifold SM (union of unit spheres in all tangent spaces).− The latter is known to satisfy a uniform Lipschitz bound. Therefore the geodesic flow on a complete Riemannian manifold M exists for all time t R. In more detail, consider a Riemannian metric , with metric∈ co- h i ∂ ∂ efficients written in coordinates as (gij) meaning that ∂ui , ∂uj = gij. kℓ The Γk can be expressed as Γk = g (g g + g ). A smooth ij ij 2 iℓ;j − ij;ℓ jℓ;i curve in M with components (αi) is a geodesic if and only if (11.9.1) holds for each k.

CHAPTER 12

Universes, transfer

This chapter develops a detailed rigorous set-theoretic setting for the TIDG framework.

12.1. Universes The idea of is to enlarge the mathematical world we are studying, in a way that does not change any of its proper- ties, this leading to new insights on the original world. So, our starting point is some set X, which in the case of differential geometry may be (the underlying set of) a smooth n-dimensional manifold M, or the disjoint union of M with finitely many other sets one might want to refer to, say Rn, R, and N. Let us review some set-theoretic background related to universes. Recall that given a set A, the symbol (A) denotes the set of all subsets of A. We define the universe V (X) asP follows. Let

V0(X)= X. At the next step we set V = V (X) (V (X)). More generally, we 1 0 ∪ P 0 define recursively for n N, ∈ V (X)= V (X) (V (X)). n+1 n ∪ P n Finally the universe V (X) is defined as follows. Definition 12.1.1. The universe, or superstructure, over a set X is the set V (X)= Vn(X). n N [∈ Corollary 12.1.2. If a V (X) r X then a V (X).  ∈ ⊆ Exercise 12.1.3. By definition, V (X) V (X). Prove that in m ⊆ m+1 fact Vm(X) $ Vm+1(X), and even, by Cantor, Vm+1(X) has a strictly bigger cardinality than Vm(X). For special needs of nonstandard analysis, one usually requires X to have the following key technical property; see e.g., [Keisler 1976, 15B] or [Chang & Keisler 1990, 4.4]. 111 112 12. UNIVERSES, TRANSFER

Definition 12.1.4 (base sets). A set X is a base set if X = ∅ and 6 if x X then x is an atom in V (X), so that x = ∅ while x V (X)= ∅. ∈ 6 ∩ The following are important consequences. Corollary 12.1.5 ([Chang & Keisler 1990], Corollary 4.4.1). Assume that X is a base set. If a V (X) and a b V (X) then ∈ ∈ ∈ m m 1 and a Vm 1(X).  ≥ ∈ − Corollary 12.1.6. Assume that X is a base set. If a, b V (X) ∈ and a V (X)= b V (X) then either a = b or a, b X ∅ . ∩ ∩ ∈ ∪{ } Proof. Suppose that a / X. Then a m 1 Vm(X) by construc- ∈ ∈ ≥ tion, and hence a V (X) and a V (X) = a. If also b / X then similarly b V (X)⊆ = b, thus a V∩(X) = b S V (X) implies∈a = b. If ∩ ∩ ∩ b X then b V (X) = ∅ (as X is a base set), and we conclude that ∈ ∩ a = a V (X)= ∅, as required.  ∩ For examples of base sets see Section 12.2. 12.2. The reals as the base set We recall Dedekind’s construction of the real field. Theorem 12.2.1. Each real x is defined following Dedekind as the pair x = Q , Q′ of two complementary sets of rationals Q = q { x x} x { ∈ Q : q < x and Q′ = q Q : q x . } x { ∈ ≥ } The key property of R which implies that R is a base set, is the fact that each real x consists of two elements Qx and Qx′ , which are sets of von Neumann rank (see below) equal exactly to ω. For those not versed in set theory we provide a definition. Definition 12.2.2. The von Neumann rank of a set x is an ordi- nal rank(x) defined so that rank(∅) = 0 and if x = ∅ then rank(x) is the least ordinal strictly bigger than all ordinals rank6 (y), y x. ∈ The existence and uniqueness of rank(x) follow from the axioms of modern Zermelo–Fraenkel set theory ZFC, with the key role of the regularity, or foundation axiom in the argument. Informally, rank(x) can be viewed as the finite or transfinite length of the cumulative con- struction of the set x from the empty set. Definition 12.2.3. Let γ be an ordinal. A set X is a γ-base set if X = ∅, X consists of non-empty elements, and we have rank(a) = γ whenever6 a x X. ∈ ∈ Lemma 12.2.4 ([Chang & Keisler 1990], Exercise 4.4.1). If γ is an infinite ordinal, and X is a γ-base set, then X is a base set. 12.3. MORE SETS IN UNIVERSES 113

Proof. If x X = V0(X) then rank(x) = γ + 1 by the choice of X. ∈ If x V1(X) then either x X with rank(x)= γ +1, or x = ∅ (the ∈ ∈ empty set) with rank(∅)=0, or x V1(X) X with rank(x)= γ + 2. Similarly if x V (X) then either∈ x \X with rank(x) = γ + 1, ∈ 2 ∈ or rank(x) 1 (for x = ∅ and x = ∅ ), or x V2(X) X with γ +2 rank≤(x) γ + 3. Extending this{ } argument,∈ one easily\ shows ≤ ≤ by induction that if m 1 then every set y Vm(X) X satisfies either γ+2 rank(y) γ+m≥+1 or rank(Y ) < m∈, hence never\ rank(y)= γ. This≤ rules out the≤ possibility of y x for any x X = V (X).  ∈ ∈ 0 Now consider the most important case X = R (the reals). Recall that ω is the least infinite ordinal. Lemma 12.2.5. The set X = R is an ω-base set, hence, a base set. Proof. Set theory defines natural numbers so that 0 = ∅ and n = 0, 1, 2,...,n 1 for all n > 0. It follows that rank(n) = n for any {natural number− n}. Next, according to the set theoretic definition of an ordered pair x, y as h i x, y = x , x, y , h i {{ } { }} all pairs and triples of natural numbers have finite ranks. Therefore each q, identified with a triple s, m, n , where s h i n ∈ 0, 1, 2 , m =1, 2, 3,... , n =0, 1, 2, 3,... , and q =(s 1) m , also has a{ finite} rank rank(q). − · We conclude that every infinite set S of rational numbers satis- fies rank(S) = ω exactly. Therefore each Dedekind real x consists of two sets (infinite sets of rationals Qx and Qx′ ) of rank rank(Qx) = rank(Qx′ )= ω. It remains to apply Lemma 12.2.4 with γ = ω.  From now on we concentrate on the case X = R. 12.3. More sets in universes

By definition the set V0(R) = R has cardinality card(R) = c (the n continuum), and then by induction card(Vn(R)) = exp (c). This array of cardinalities easily exceeds all needs of conventional mathematics. For instance, the underlying set M of a smooth connected manifold M of dimension n 1 is a set of| cardinality| c. Therefore we can iden- ≥ tify M with a set in the complement V1(R) V0(R). (This choice | | \ excludes the possibility of nonempty intersection of M and R.) Then every object of interest for studying the smooth manifold| | M will be an element of V (R), e.g., (1) The topology of M is an element of V3(R); 114 12. UNIVERSES, TRANSFER

(2) With the usual encoding of an ordered pair a, b as a , a, b , h i {{ } { }} the set M M is an element of V4(R); (3) The set Map(× M) of all maps from M to itself is an element of V5(R). Similarly, the field operations of R, the vector space operations of Rn, and the natural embedding of N into R, are elements of V (R), as is each function from Rn to R, etc. See also Lemma 4.4.3 in [Chang & Keisler 1990]. 12.4. Membership relation Statements about V (X) are made using a first order language with a binary relation , the membership relation, as the only primary re- lation. This language∈ will be called -language. ∈ Definition 12.4.1. An -formula φ is called bounded if and only if the quantifiers x and x ∈always appear in φ in the form ∀ ∃ x(x a ) ∀ ∈ ⇒··· and x(x a ) ∃ ∈ ∧··· where a is either a variable or a constant. These two forms are usually abbreviated respectively as x a ( ) and x a ( ). ∀ ∈ ··· ∃ ∈ ··· 12.5. Ultrapower construction of nonstandard universes In this section we will present a generalisation of the ultrapower construction of Section 4.7. For the purpose of such a generalisation, recall that f, g, . . . denote functions in a standard universe V (X), while their extensions in a nonstandard extension ∗V (X) will be ∗f, ∗g, . . . Definition 12.5.1. A nonstandard extension of V (X) consists of a base set ∗X with X $ ∗X, a subuniverse ∗V (X) V (∗X), and a map ⊆ : V (X) ∗V (X) (12.5.1) ∗ → satisfying the following two conditions:

(I) one has ∗X ∗V (X), and the map is an injective map from ⊆ ∗ V (X) into ∗V (X) such that if x X then ∗x ∗X; ∈ ∈ (II) for every bounded formula φ(x1,...,xk), and every k-tuple of constants a ,...,a V (X), the formula φ(a ,...,a ) is 1 k ∈ 1 k true in V (X) if and only if the formula φ(∗a1,..., ∗ak) is true in ∗V (X). 12.5. ULTRAPOWER CONSTRUCTION OF NONSTANDARD UNIVERSES 115

Example 12.5.2. An example is a formula expressing the statement of the extreme value theorem (EVT) for a function f; see Section 5.8. Therefore by (II) the same formula holds in the extension for ∗f where the quantification is now over the extended domain.

Remark 12.5.3. If a = b then ∗a = ∗b by (II), so is a bijection 6 6 ∗ of V (X) onto a subset of ∗V (X). Yet we introduce the bijectivity condition separately by (I) for the sake of convenience. Also, still by (II), if a A V (X) then ∗a ∗A, so if one identifies a ∈ ∈ ∈ with its image ∗a, then one may think of A as a subset of ∗A, that is, one can think of ∗A as an extension of A. Since functions are also sets, for every function f V (X), f : A B, we can consider the set ∗f, ∈ → which is also a function, ∗f : ∗A ∗B. If A and B are considered to be → subsets of ∗A and ∗B respectively, then the function ∗f is an extension of the function f. Theorem 12.5.4. Nonstandard extensions exist. This is a well-known result presented in many sources, including the fundamental monograph on model theory [Chang & Keisler 1990]. We will provide a proof here so as to make the exposition self-contained. Proof. The construction we employ is called reduced (or bounded) ultrapower. It consists of two parts: (1) the ultrapower itself, along with theLo´s theorem, namely Lemma 12.6.2; (2) the transformation of the bounded ultrapower into a universe. We start with an arbitrary ultrafilter (N) containing all cofi- F ⊆ P nite sets X N, namely a nonprincipal ultrafilter. The starting point for the construction⊆ of the full ultrapower is the product set V (X)N which consists of all infinite sequences xn = xn : n N of ele- h i h ∈ i ments xn V (X), or equivalently, all functions f : N V (X), via ∈ → the identification of each f with the sequence f(n): n N . The subsets V (X)N V N are defined accordingly. h ∈ i n ⊆ Definition 12.5.5. The set union N N N V (X)bd = Vm(X) $ V (X) m [  is the starting point for the construction of the bounded ultrapower. N N Exercise 12.5.6. Find an element in V (X) r V (X)bd. Let m be a natural number. N Definition 12.5.7. A sequence x V (X)bd is of type m if the h ni ∈ set Dm = n : xn Vm(X) Vm 1(X) belongs to . { ∈ \ − } F 116 12. UNIVERSES, TRANSFER

Definition 12.5.8. In the notation as above, a sequence is totally of type m if Dm = N.

In case m = 0, we naturally assume that V 1(X) = ∅, so that − D0 = n : xn V0(X) , and xn is totally of type 0 if and only if x X{= V (X∈) for all }n. h i n ∈ 0 N Lemma 12.5.9. Assume that x V (X)bd. Then there is a h ni ∈ unique number m N such that xn is of type m. ∈ h i Proof. By definition we have x V (X)N for some M and h ni ∈ M therefore we have a pairwise disjoint union N = m M Dm. The lemma now follows from the fact that is an ultrafilter. ≤  F S Following Section 5.1, we define an equivalence relation (or more N ∼ precisely as the relation depends on ) on V (X)bd as follows: ∼F F 1 xn yn if and only if n N : xn = yn . h i∼h i { ∈ }∈F N Lemma 12.5.10. Assume that xn V (X)bd is of type m. There N h i ∈ is a sequence yn V (X)bd totally of type m, such that xn yn . h i ∈ h i ∼F h i Proof. If n D then simply let y = x . If n / D then ∈ m n n ∈ m put yn = a, where a is a fixed element in Vm(X) Vm 1(X). (Or a X = V (X) in the case m = 0.) \ −  ∈ 0 We continue with the proof of Theorem 12.5.4. The -quotient F N N V (X)bd/ = xn : xn V (X)bd F {h iF h i ∈ } consists of all -classes, or -classes F ∼F N xn = yn V (X)bd : xn yn h iF {h i ∈ h i∼h i} N of sequences x V (X)bd. We define h ni ∈ N N Vm(X) / = xn : xn Vm(X) F {h iF h i ∈ } N N for all m, so that V (X)bd/ = V (X) / . F m m F N Corollary 12.5.11. EachS element of V (X)bd/ has the form N F xn , where xn V (X)bd is a sequence totally of type m for some h iF h i ∈ (unique) m, so that xn Vm Vm 1(X) for all n. ∈ \ − Proof. Use lemmas 12.5.9, 12.5.10. 

N If xn , yn belong to V (X)bd then we define a binary relation h iF h iF xn ∗ yn if and only if n N : xn yn . h iF ∈h iF { ∈ ∈ }∈F 1In this case there is a set X such that we have x = y , for all n X. ∈F n n ∈ 12.6. LOS THEOREM 117

N Lemma 12.5.12. If xn , yn belong to V (X)bd and we have h iF h iF xn ∗ yn then there is a set K and an index m 1 such h iF ∈ h iF ∈ F ≥ that xn Vm 1(X), yn Vm(X), and xn yn, for all n K. ∈ − ∈ ∈ ∈ Proof. By definition, the set L = n N : xn yn belongs N { ∈ ∈ } to , and, as xn , yn V (X)bd, there is an index M such that x ,yF V (Xh) foriF allh n.iF Consider∈ the sets n n ∈ M Km = n L : xn Vm 1(X) and yn Vm(X) , 1 m M. { ∈ ∈ − ∈ } ≤ ≤ It follows by Corollary 12.1.5 that L = 1 m M Km, therefore at least ≤ ≤ one of Km belongs to , and we can set K = Km.  F S If x V (X) then we can let x be the infinite constant sequence ∈ h i x, x, x, . . . = xn , where xn = x forall n. We let x denote its - hclass as above.i h i h iF F

12.6. Los theorem Our main goal in Chapter 12 is the proof of Transfer, an assertion to the effect that, roughly speaking, any sentence true or false in the standard universe remains true, resp., false in the nonstandard uni- verse, Corollary 12.7.1 and ultimately Corollary 12.7.9. This is not a really easy task, in part because such a typical tool of logic as proof by induction on the logical complexity of the sentence considered, does not seem to work. However there is another, and stronger claim which happens to admit such an inductive proof. This is the result below, in fact a cornerstone of the theory of . The theorem con- N nects the truth of sentences related to V (X)bd with their “truth sets” in the ultrafilter . F Definition 12.6.1. The truth set T N of a formula φ is defined by setting ⊆ 1 k Tφ a1 ,..., ak = n N : φ(an,...,an) is true in V (X) . (h ni h ni) ∈ Lemma 12.6.2 (theLo´stheorem) . If φ(x1,...,xk) is an arbitrary 1 k N bounded -formula and sequences an ,..., an belong to V (X)bd, then ∈ 1 k h i h i N the sentence φ( an ,..., an ) is true in V (X)bd/ ; ∗ if and only h iF h iF h F ∈i if the truth set Tφ a1 ,..., ak is a member of . (h ni h ni) F Here we use an upper index in the enumeration of free variables to distinguish it from the enumeration of terms of infinite sequences N in V (X)bd. Example 12.6.3. Let φ be the formula x y. Assume that se- N ∈ quences an , bn belong to V (X)bd. Then the formula an bn h i h i h iF ∈h iF 118 12. UNIVERSES, TRANSFER

N is true in V (X)bd/ ; ∗ if and only if an ∗ bn if and only if (by h F ∈i h iF ∈h iF definition) the set T an bn = n N : an bn belongs to . h i∈h i { ∈ ∈ } F Proof. The proof of Lemma 12.6.2 proceeds by induction on the construction of formula φ out of elementary formulas x y by means of logical connectives and quantifiers. Among them, it suffices∈ to consider only (the negation), (conjunction), and (the existence quantifier) since¬ the remaining ones∧ are known to be expressible∃ via combinations of those three. For instance, Φ Ψ is equivalent to (Φ Ψ), and the quantifier is equivalent to ∨. ¬ ∧ The base∀ of induction is¬∃¬ the case when φ is the formula x y, already checked in Example 12.6.3. As we go through the induction∈ steps, we will reduce the arbitrarily long list of free variables x1,...,xk to a minimally nontrivial amount for each step, for the sake of brevity. The step . Let φ(x) be a bounded -formula with a single free ¬ N ∈ variable x, and a V (X)bd. Suppose that the lemma is established h ni ∈ for φ( an ), and let us prove it for φ( an ). h i ¬ h i N Indeed, formula φ( an ) is true in V (X)bd/ ; ∗ if and only ¬ h iNF h F ∈i if φ( an ) is false in V (X)bd/ ; ∗ . The latter holds if and only if h iF h F ∈i (by the inductive hypothesis) Tφ( an ) / or equivalently T φ( an ) h i ∈F ¬ h i ∈F since obviously Tφ = N r T φ, and one and only one set in a pair of complementary sets belongs¬ to (as is an ultrafilter). The step . Here one provesF theF result for φ(a) ψ(a) assuming that the lemma∧ is established separately for φ(a) and∧ψ(a). We leave it as an exercise. The step . Assume that the lemma is established for a bounded ∃ formula φ(x, y, z), and prove it for ( z x) φ(x, y, z). Let an , bn N ∃ ∈ h i h i ∈ V (X)bd. We have to prove that the statement N (A) ( z an ) φ( an , bn , z) is true in V (X)bd/ ; ∗ ∃ ∈h iF h iF h iF h F ∈i is equivalent to

(B) the set L = T z an φ( an , bn ,z) belongs to . ∃ ∈h i h i h i F Note that L = n : z an (φ(an, bn, z) is true in V (X)) . By definition,{ (A)∃ is∈ equivalent to } N (C) there is a sequence cn V (X)bd such that both cn ∗ an h i ∈ N h iF ∈h iF and φ( an , bn , cn ) are true in V (X)bd/ ; ∗ . h iF h iF h iF h F ∈i By the inductive hypothesis for φ and the base of induction result, the requirements of (C) hold if and only if each of the sets

U = T = n : c a , and ′cn cn an n n h i h i∈h i { ∈ } U = T = n : φ(a , b ,c ) is true in V (X) ′′cn φ( an , bn , cn ) n n n h i h i h i h i { } 12.7. ELEMENTARY EMBEDDING 119

belongs to , which is equivalent (as is an ultrafilter) to U cn , where F F h i ∈F

U cn = U ′cn U ′′cn = T cn an φ( an , bn , cn ). h i h i ∩ h i h i∈h i∧ h i h i h i

On the other hand, the set L of (B) satisfies U cn L because if n h i ⊆ ∈ U cn then taking z = cn satisfies φ(an, bn, z) in V (X). Therefore if T h theni L as is an ultrafilter. This ends the proof that (A)∈ Fimplies (B).∈ F F N To prove the converse suppose that L . Recall that V (X)bd = V (X)N, hence the sequence a belongs∈F to some V (X). If n L m m h ni m ∈ then by definition there is some z = cn an such that φ(an, bn, z) is true S ∈ in V (X). It follows that m> 0 and cn Vm 1(X), by Corollary 12.1.5 ∈ − If n / L then let cn Vm 1(X) be arbitrary. The construction of ∈ ∈ − N the sequence cn : n N Vm 1(X) uses the axiom of choice, of h ∈ i ∈N − course. Thus c V (X)bd and if n L then n belongs to the h ni ∈ ∈ set U cn = U ′cn U ′′cn as above by construction. It follows that the h i h i ∩ h i sets U , U belong to , and hence we have (C) and (A). ′cn ′′cn h i h i F This completes the proof of theLo´stheorem. 

12.7. Elementary embedding Corollary 12.7.1. The map x x is an elementary embed- 7−→ h NiF ding of the structure V (X); to V (X)bd/ ; ∗ . In other words, if hφ(x1,...,x∈i k) ish a boundedF ∈i-formula and we have a1,...,ak V (X), then the sentence φ(a1,...,a∈k) is true in V (X) if ∈ 1 k N and only if the sentence φ( a ,..., a ) is true in V (X)bd/ ; ∗ . h iF h iF h F ∈i j j Proof. Recall that a is the infinite sequence an : n N such j j h i h ∈ i that an = a for all n. Therefore the truth set Tφ( a1 ,..., ak ) is equal h i h i to N if φ(a1,...,ak) is true in V (X) and is equal to ∅ otherwise. Now an application of Lemma 12.6.2 concludes the proof.  This result ends the first part of the proof of Theorem 12.5.4. We defined an embedding x x of the universe V (X); to 7−→ h iF h ∈i N V (X)bd/ ; ∗ h F ∈i which satisfies Transfer by Corollary 12.7.1. The final step of the proof N of the theorem is embedding V (X)bd/ ; ∗ into the universe V (∗X), hN F ∈i where ∗X = an : an X , and, we recall, X = V0(X). {h iF h i ∈ } Lemma 12.7.2. Assume that γ is an infinite ordinal and X is a γ-base set. Then ∗X is a (γ + 4)-base set, hence a base set by Lemma 12.2.4. 120 12. UNIVERSES, TRANSFER

Proof. By definition, each an ∗X is a non-empty set of h iF ∈ pairwise -equivalent sequences an = an : n N of elements ∼F h i h ∈ i an X. It remains to prove that rank( an ) = γ +4. Note that a sequence∈ of this form is formally equal toh thei set of all ordered pairs n, an = n , n, an : n N , where h i {{{ } { }} ∈ } rank(n)= n,

rank(an)= γ +1 (as X is a γ-base set), rank( n )= n + 1, { } rank( n, a )= γ + 2, { n} rank( n , n, a )= γ + 3, {{ } { n}} rank( an : n N )= γ + 4, h ∈ i as required. 

The lemma shows that V (∗X) is a legitimate superstructure which satisfies all results in sections 12.1 and 12.2. N Definition 12.7.3. Define a map h : V (X)bd/ V (∗X) as N F → follows. Suppose that an V (X)bd/ . It can be assumed, by Corollary 12.5.11, that ha iFis∈ totally of typeF m for some m, so that h ni an Vm(X) Vm 1(X) (or just an X = V0(X) provided m = 0) for ∈ \ − ∈ all n. The definition of h( an ) goes on by induction on m. h iF (1) If m = 0 then we put h( an )= an . h NiF h iF In other words, if an X then h( an )= an ∗X. h i ∈ h iF h iF ∈ (2) If m 1 then we put ≥ N h( an )= h( bn ): bn Vm 1(X) and bn ∗ an . h iF h iF h i ∈ − h iF ∈h iF This type of definitions is called the Mostowski collapse, see Theorem 4.4.9 in [Chang & Keisler 1990]. Exercise 12.7.4. Prove by induction on m that if a V (X)N h ni ∈ m then h( an ) Vm(∗X).  h iF ∈ Lemma 12.7.5. The map h is a bijection, and h is an isomorphism N in the sense that if a , b V (X)bd then h ni h ni ∈ an ∗ bn if and only if h( an ) h( bn ) . h iF ∈h iF h iF ∈ h iF Proof. To prove that h is a bijection, suppose that an , bn N h i h i ∈ V (X)bd and an = bn . Then the truth set T = n N : an = bn belongs to hby LemmaiF 6 h 12.6.2.iF It can be assumed, by{ Corollary∈ 12.5.11,6 } F that a , b are totally of type m, resp., m′, for some m, m′. h ni h ni Case 1 : m = m′ = 0, so that a , b X = V (X) for all n. Then n n ∈ 0 h( an ) = an and h( bn ) = bn by construction, and on the h iF h iF h iF h iF other hand, we have an = bn since T . h iF 6 h iF ∈F 12.7. ELEMENTARY EMBEDDING 121

Case 2 : m′ = 0 and m 1, so that b X = V (X) and ≥ n ∈ 0 an Vm(X) Vm 1(X) for all n. Then h( bn ) = bn ∗X while ∈ \ − h iF h iF ∈ h( an ) V (∗X) by construction, so that h( bn ) = h( an ) by Lemmah iF 12.7.2.⊆ (Recall Definition 12.1.4.) h iF 6 h iF

Case 3 : m = 0 and m′ 1, similar. ≥ Case 4 : m = m′ 1. Then, for any n, a and b belong to ≥ n n Vm(X) Vm 1(X), therefore an Vm 1(X) and bn Vm 1(X). In \ − ⊆ − ⊆ − addition, if n T then an = bn, hence there exists some cn Vm 1(X) satisfying c ∈a b or c6 b a . It follows that one of∈ the− sets n ∈ n \ n n ∈ n \ n T ′ = n T : c a b , T ′′ = n T : c b a { ∈ n ∈ n \ n} { ∈ n ∈ n \ n} belongs to ; suppose that T ′ . If n / T then let cn Vm 1(X) F ∈ F ∈ ∈ − be arbitrary. Then cn Vm 1(X). Now we have cn ∗ an but h i ∈ − h i ∈ h i cn ∗ bn , hence, by Definition 12.7.3, h( cn ) h( an ) but ¬h i ∈ h i h iF ∈ h iF h( cn ) / h( bn ). Therefore h( an ) = h( bn ). h iF ∈ h iF h iF 6 h iF Case 5 : m, m′ 1 and m = m′, say m′ < m. For any n we ≥ 6 have an Vm(X) Vm 1(X), hence an Vm 1(X). Note that an V (X) is∈ impossible\ for− k < m 1 as⊆ otherwise− a V (X) ⊆ k − n ∈ k+1 ⊆ Vm 1(X), a contradiction. Thus there is an element cn an, cn − ∈ ∈ Vm 1(X) Vm 2(X), in particular, cn Vm 1(X) Vm′ 1(X). Then − \ − ∈ − \ − cn Vm 1(X) and cn ∗ an , hence h( cn ) h( an ). On the h i ∈ − h i ∈ h i h iF ∈ h iF other hand, we have c / b by Corollary 12.1.5, hence c ∗ b n ∈ n ¬h ni ∈ h ni and h( cn ) / h( bn ). Thus still h( an ) = h( bn ). h iF ∈ h iF h iF 6 h iF The claim that h is a bijection is established. Now prove the isomorphism claim. Let an , bn be sequences in N h i h i V (X)bd. As above, it can be assumed that a , b are totally of h ni h ni type m, resp., m′, for some m, m′ N. Suppose that an ∗ bn . ∈ h iF ∈ h iF By definition the set N = n N : an bn belongs to . Then { ∈ ∈ } F an Vm′ 1(X) for all n N by Corollary 12.1.5, and hence we have ∈ − ∈ h( an ) h( bn ) by Definition 12.7.3. h iF ∈ h iF If conversely h( an ) h( bn ) then immediately an ∗ bn still by Definition 12.7.3.h iF ∈ h iF h iF ∈h iF N Definition 12.7.6. Let ∗V (X) = h( an ): an V (X)bd/ , { h iF h i ∈ F} the range of h. If m N then let ∈ N ∗Vm(X)= h( an ): an Vm(X) / .  { h iF h i ∈ F} Exercise 12.7.7. Prove, using the results of Lemma 12.7.5, that the sets ∗V (X) satisfy ∗V (X) = ∗V (X) V (∗X), ∗V (X) m m ∩ m m+1 ⊆ (∗V (X)), and finally ∗V (X)= ∗V (X) V (∗X).  P m m m ⊆ Thus ∗V (X) is a subuniverse ofS the full universe V (∗X) over ∗X = N ∗V (X). Let’s use letters ξ, η to denote arbitrary elements of V (X)bd/ , 0 F 122 12. UNIVERSES, TRANSFER

N so that if ξ V (X)bd/ then h(ξ) ∗V (X). As h is an isomorphism by Lemma 12.7.5,∈ we immediatelyF obtain∈ the following. Corollary 12.7.8. If φ(x1,...,xk) is a bounded -formula and 1 k N 1 k ∈ N ξ ,...,ξ V (X)bd/ , then φ(ξ ,...,ξ ) is true in V (X)bd/ ; ∗ if ∈ F 1 k h F ∈i and only if the sentence φ(h(ξ ),...,h(ξ )) is true in ∗V (X); .  h ∈i Now we are ready to accomplish the proof of Theorem 12.5.4. We claim that ∗X, ∗V (X), and the map x ∗x is a nonstandard exten- sion of V (X). The proof is maintained by7−→ combining Corollaries 12.7.1 and 12.7.8. Namely, if x V (X) then let ∗x = h( x ), so that x ∗x ∈ h iF 7−→ maps V (X) onto ∗V (X) V (∗X). Further, Lemma 12.7.5 ensures that condition (I) of Section 12.5⊆ holds. Now let’s validate condition (II) of 12.5. Consider a bounded formula φ(x1,...,xk), and arbitrary elements a1,...,ak V (X). Then the formula φ(a1,...,ak) is true in V (X) if ∈ 1 k N and only if φ( a ,..., a ) is true in V (X)bd/ ; ∗ (by Corollary h iF h iF 1 h k F ∈i 12.7.1). if and only if the formula φ(∗a ,..., ∗a ) is true in ∗V (X) (by Corollary 12.7.8). Note that both V (X) and ∗V (X) are considered as -structures with the true membership relation . ∈ ∈ It remains to prove that X $ ∗X properly, or, to be more precise, there is an element y ∗X not equal to any ∗x, x X. ∈ ∈ Let a0, a1, a2,... be an infinite sequence of pairwise distinct ele- N N ments an X. Then an = an : n N belongs to X = V0(X) . It ∈ h i h ∈ i follows that ξ = h( an ) ∗X. Prove that if x X then ξ = ∗x . By h iF ∈ ∈ 6 definition one has to show that an x , that is, the set N = n : a = x belongs to . However Nh containsi 6∼ h i all natural numbers with{ a n 6 } F possible exception of a single index n satisfying an = x. Hence N . This completes the proof of Theorem 12.5.4. ∈F

Rephrasing condition (II) of Section 12.5, we obtain:

Corollary 12.7.9 (Transfer). The map x ∗x is an elemen- 7−→ tary -embedding of the structure V (X); to ∗V (X); . In other words,∈ if φ(x1,...,xk) is a boundedh -formula∈i andh a1,...,a∈ik V (X), 1 k ∈ 1 k∈ then φ(a ,...,a ) is true in V (X) if and only if φ(∗a ,..., ∗a ) is true in ∗V (X). 

12.8. Definability In Chapter 4 we presented a construction of the hyperreal line via filters and ultrapowers. For a broader picture, it may be useful to comment on the issue of definability. 12.9. CONSERVATIVITY 123

Many people felt that the hyperreal line is not definable since its definition involves nonconstructive foundational material. There- fore it came as a surprise to many people when Kanovei and Shelah proved in 2004 that, in fact, the hyperreal line is indeed definable. See http://www.ams.org/mathscinet-getitem?mr=2039354 and the article itself [Kanovei & Shelah 2004]. What this means is that there exists a specific set-theoretic formula that defines the hyperreal line. This might seem surprising. To clarify the situation, one would compare it with that for Lebesgue measure. It is widely known and accepted as fact that the Lebesgue measure is σ-additive. Two items need to be pointed out here: (1) to define the Lebesgue measure, one does not need the axiom of choice. (2) to prove that the Lebesgue measure is σ-additive, one needs the axiom of choice. Here item (2) follows from the existence of a model of ZF where the real line is a countable union of countable sets. This model is called the Feferman-Levy model [Feferman & Levy 1963]. See also [Jech 1973, chapter 10]. We see that, while the Lebesgue measure can be defined in ZF, a proof of its σ-additivity necessarily requires Choice. With regard to its foundational status, the situation is similar for a hyperreal line. One can define a hyperreal line without Choice (as Kanovei and Shelah did), but actually to prove something about it one would need some version of Choice.

12.9. Conservativity The various frameworks developed by Robinson and since, including those of Nelson and Hrbacek, are conservative extensions of ZFC, and in this sense are part of classical mathematics. As far as conservativity is concerned, the following needs to be kept in mind. Robinson showed that, given a proof exploiting infinitesimals, there always exists an infinitesimal-free paraphrase. However, this is merely a theoretical result. In practice, such a paraphrase may involve an exponential increase in the complexity of the proof, and correspondingly an exponential decrease in its compre- hensibility, at least in principle. From the point of computer one should certainly be able to appreciate the difference. Similarly, one can note that any result using the real numbers can be restated in terms of the rationals using sequences, and even in terms 124 12. UNIVERSES, TRANSFER of the integers, etc. The ancient Greeks used formulations that would be very difficult for us to follow if we did not use modern translations, in terms of extended number systems that have since become widely accepted and are considered intuitive. The related controversy over Unguru’s proposal is well known. The following two additional points shouuld be kept in mind. First, the conservativity property of the pair ZFC, IST does not hold for pairs like An, An∗ , where An is the n-th order arithmetic and An∗ its nonstandard extension, in fact An∗ is rather comparable to the stan- dard An+1. This profound result of Henson–Keisler near 1994 is well- known to NSA people (provide references and exact formulations). Second, IST yields some principally new mathematics with respect to ZFC, in the sense that there are IST sentences not equivalent to any ZFC sentence. This is established in [Kanovei & Reeken 2004]. This can be compared to the obvious fact that any sentence about C admits an equivalent sentence about R. The following comments by Henson and Keisler usefully illustrate the point:

It is often asserted in the literature that any theorem which can be proved using nonstandard analysis can also be proved without it. The purpose of this paper is to show that this assertion is wrong, and in fact there are theorems which can be proved with nonstandard analysis but cannot be proved without it. There is currently a great deal of confusion among mathemati- cians because the above assertion can be interpreted in two different ways. First, there is the following correct statement: any theorem which can be proved using nonstandard analysis can be proved in Zermelo- Fraenkel set theory with choice, ZFC, and thus is acceptable by contemporary standards as a theorem in mathematics. Second, there is the erroneous con- clusion drawn by skeptics: any theorem which can be proved using nonstandard analysis can be proved without it, and thus there is no need for nonstandard analysis. The reason for this confusion is that the set of principles which are accepted by current math- ematics, namely ZFC, is much stronger than the set of principles which are actually used in mathematical practice. It has been observed (see [F] and [S]) that al- most all results in classical mathematics use methods 12.9. CONSERVATIVITY 125

available in second order arithmetic with appropriate comprehension and choice axiom schemes. This sug- gests that mathematical practice usually takes place in a conservative extension of some system of second order arithmetic, and that it is difficult to use the higher levels of sets. In this paper we shall consider systems of nonstandard analysis consisting of second order nonstandard arithmetic with saturation princi- ples (which are frequently used in practice in nonstan- dard arguments). We shall prove that nonstandard analysis (i.e. second order nonstandard arithmetic) with the ω1-saturation axiom scheme has the same strength as third order arithmetic. This shows that in principle there are theorems which can be proved with nonstandard analysis but cannot be proved by the usual standard methods. [Henson & Keisler 1986, p. 377]

In short, An∗ is comparable to the standard An+1 rather than to An itself.

CHAPTER 13

More regularity

13.1. D2 implies D1 Lemma 13.1.1. Let B Rn be an open ball around the origin 0, ⊆ n n let a hal(0), and let 0 = v ∗ R with v 1. If G : ∗B ∗ R is an internal∈ function satisfying6 ∈ ≺≺ → G(x) 2G(x + v)+ G(x +2v) v G(a) G(a + v) − ≺k kk − k h h for all x B, then there is m ∗ N such that a + mv B and ∈ ∈ ∈ G(a) G(a + v) v G(a) G(a + mv) . − ≺k kk − k Proof. Let N = r/ v where 0 < r R is slightly smaller ⌊ k k⌋ ∈ than the radius of B, and for 0 x ∗ R, x ∗ N is the integer part of x. So any m N satisfies≤ that∈ a ⌊+ ⌋mv ∈ hB. Let A = G(a) G(a + v). For≤ 0 j N let x = a +∈jv, then by our − ≤ ≤ j assumption on G we have G(xj) 2G(xj+1)+ G(xj+2) = Cj v A with C 1. Let C be the maximum− of C , ,C , then kCkk k1 j ≺ 0 ··· N ≺ and G(xj) 2G(xj+1) + G(xj+2) C v A for all 0 j N. Given k N− we have ≤ k kk k ≤ ≤ ≤ A G(x ) G(x ) = G(x ) G(x ) G(x ) G(x ) − k − k+1 0 − 1 − k − k+1 k 1      − G(x ) G(x ) G(x ) G(x ) Ck v A . ≤ j − j+1 − j+1 − j+2 ≤ k kk k j=0     X So for any m N, ≤ m 1 − mA G(x ) G(x ) = A G(x ) G(x ) Cm2 v A , − 0 − m − k − k+1 ≤ k kk k   k=0    X so mA G(x ) G( x ) = Km2 v A with K 1. It follows − 0 − m k kk k ≺ that  

m A Km2 v A G(x ) G(x ) , k k − k kk k≤k 0 − m k and so, multiplying by v , we have m v A (1 Km v ) v G(x0) G(x ) . k k k kk k − k k ≤k kk − m k 127 128 13. MORE REGULARITY

1 Now let m = min N, 2K v , then { ⌊ k k ⌋ } m v A /2 v G(x ) G(x ) . k kk k ≤k kk 0 − m k By definition of N and since K 1 we have that m v is appreciable, ≺ k k i.e. not infinitesimal, and so finally A v G(x0) G(xm) , that is, G(a) G(a + v) v G(a) G(a +≺kmv)kk. − k  − ≺k kk − k Proposition 13.1.2. If F is D2 for some choice of coordinates in W , then F is D1. Proof. Given a hW , in the given coordinates take some ball B around st(a). Define ∈G(x)= F (x) x, then we must show for any v λ that G(a) G(a + v) λ v (here− v = b a in Definition 10.3.3).≺ If G(a) G(a+−v) λ v≺thenk k we are certainly− done. Otherwise λv G(a) −G(a + v)≺≺, andk onk the other hand G(x) 2G(x + v)+ G(x ≺+ k2v) = −F (x) 2Fk(x + v)+ F (x +2v) λ v 2,− by taking v = w in Definition 10.4.1.− Together we have G(x≺) k2Gk(x + v)+ G(x +2v) − ≺ v G(a) G(a + v) , so by Lemma 13.1.1 there is m ∗ N such kthatkka + mv− hB andk ∈ ∈ G(a) G(a+v) v G(a) G(a+mv) v G(a) + G(a+mv) v λ − ≺k kk − k≤k k k k k k ≺k k since F is a prevector field and so G(x)= F (x) x λ for all x.  − ≺ 13.2. Injectivity When speaking about local D1 or D2 prevector fields, whenever needed we will assume, perhaps by passing to a smaller domain, that a constant K as in Propositions 11.3.1, 11.3.2 exists. Corollary 13.2.1. If F is a D1 prevector field then F is injective on hM. Proof. Let a = b hM. If st(a) = st(b) then clearly F (a) = F (b). Otherwise there exists6 ∈ a B including 6 a, b as in Proposition 11.3.1,6 and (2) of that proposition implies F (a) = F (b).  6 We will now show that a D1 prevector field is in fact bijective on hM. We first prove local surjectivity, as follows. n Proposition 13.2.2. Let B1 B2 B3 R be closed balls ⊆ ⊆ ⊆ centered at the origin, of radii r1 < r2 < r3. If F : ∗B2 ∗B3 is a 1 → local D prevector field then F (∗B ) ∗B . 2 ⊇ 1 Proof. Fix 0 < s R smaller than r2 r1 and r3 r2. We will ∈ − − apply transfer to the following fact: For every function f : B2 B3, if f(x) f(y) 2 x y for all x, y B and f(x) x

B then f(B ) B . This fact is indeed true since our assumptions 2 2 ⊇ 1 on f imply that it is continuous, and that for every x ∂B2 the straight interval between x and f(x) is included in B B , so ∈f is homotopic 3− 1 |∂B2 in B3 B1 to the inclusion of ∂B2. Now if some p B1 is not in f(B2) then f− is null-homotopic in B p , and so∈ the same is true for |∂B2 3 −{ } the inclusion of ∂B2, a contradiction. Applying transfer we get that for every internal function f : ∗B ∗B , if f(x) f(y) 2 x y 2 → 3 k − k ≤ k − k for all x, y ∗B2 and f(x) x

1 in a corresponding domain for F − , with K′ only slightly larger than K, namely K′ = K/(1 Kλ). − We conclude this section with the following observations. Lemma 13.3.3. Let F,G be D1 prevector fields. If F (a) G(a) for all standard a, then F (b) G(b) for all b, i.e. F G. ≡ ≡ ≡ Proof. Let K be as in Proposition 11.3.1 for both F and G. Let a = st(b), then F (b) G(b) = F (b) F (a) b+a+F (a) G(a)+G(a) G(b) a+b k − k k − − − − − k F (b) F (a) b + a + F (a) G(a) + G(a) G(b) a + b ≤k − − k k − k k − − k Kλ a b + F (a) G(a) + Kλ a b λ. ≤ k − k k − k k − k ≺≺  Recall that Definition 10.2.1, which defines when a prevector field F realizes a classical vector field X, involves only standard points. It follows from Lemma 13.3.3 that if F is D1 then this determines F up to equivalence. Namely, we have the following. Corollary 13.3.4. Let U Rn be open, and X : U Rn a vector field. If F,G are two D1 prevector⊆ fields that realize X then→ F G. In particular, if X is Lipschitz and G is a D1 prevector field that≡ realizes X, then F G, where F is the prevector field obtained from X as in Example 10.1.7.≡ Part 2

500 yrs of infinitesimal math: Stevin to Nelson

CHAPTER 14

16th century

14.1. Simon Stevin This section contains a discussion of Simon Stevin, unending dec- imals, and the real numbers. The material in this section is based in [B laszczyk, Katz & Sherry 2013]. Nearly a century before Newton and Leibniz, Simon Stevin (Stev- inus) sought to break with an ancient Greek heritage of working ex- clusively with relations among natural numbers,1 and developed an approach capable of representing both “discrete” number (αριθµ ’ oς´ )2 composed of units (µoναδων´ ) and continuous magnitude (µεγεθoς´ ) of geometric origin.3 According to van der Waerden, “[Stevin’s] general notion of a real number was accepted, tacitly or explicitly, by all later ” [Van der Waerden 1985, p. 69]. D. Fearnley-Sander wrote that “the modern concept of real num- ber . . . was essentially achieved by Simon Stevin, around 1600, and was thoroughly assimilated into mathematics in the following two cen- turies” [Fearney-Sander 1979, p. 809]. D. Fowler points out that “Stevin . . . was a thorough-going arithmetizer: he published, in 1585, the first popularization of decimal fractions in the West . . . in 1594, he described an algorithm for finding the decimal expansion of the root of any polynomial, the same algorithm we find later in Cauchy’s proof of the Intermediate Value Theorem” [Fowler 1992, p. 733]. Fowler also emphasizes that important foundational work was yet to be done

1The Greeks counted as numbers only 2, 3, 4,...; thus, 1 was not a number, nor are the fractions: ratios were relations, not numbers. Consequently, Stevin had to spend time arguing that the unit (µoνας´ ) was a number. 2Euclid [ 2007, Book VII, def. 2]. 3Euclid [Euclid 2007, Book V]. See also Aristotle’s Categories, 6.4b, 20-23: “Quantity is either discrete or continuous. [. . . ] Instances of discrete quantities are number and speech; of continuous, lines, surfaces, solids, and besides these, time and place”. 133 134 14. 16TH CENTURY by Dedekind, who proved that the field operations and other arith- metic operations extend from Q to R.4 Meanwhile, Stevin’s decimals stimulated the emergence of power series (see below) and other de- velopments. We will discuss Stevin’s contribution to the Intermediate Value Theorem in Subsection 14.3 below. In 1585, Stevin defined decimals in La Disme as follows: Decimal numbers are a kind of arithmetic based on the idea of the progression of tens, making use of the Arabic numerals in which any number may be written and by which all computations that are met in busi- ness may be performed by integers alone without the aid of fraction. (La Disme, On Decimal Fractions, tr. by V. Sanford, in [Smith 1920, p. 23]). By numbers “met in business” Stevin meant finite decimals,5 and by “computations” he meant addition, subtraction, multiplications, di- vision and extraction of square roots on finite decimals. Stevin argued that numbers, like the more familiar continuous mag- nitudes, can be divided indefinitely, and used a water metaphor to illustrate such an analogy: As to a continuous body of water corresponds a con- tinuous wetness, so to a continuous magnitude corre- sponds a continuous number. Likewise, as the con- tinuous body of water is subject to the same division and separation as the water, so the continuous num- ber is subject to the same division and separation as its magnitude, in such a way that these two quantities cannot be distinguished by continuity and discontinu- ity [Stevin 1585, p. 3]; quoted in [Malet 2006]. Stevin argued for equal rights in his system for rational and irrational numbers. He was critical of the complications in Euclid [Euclid 2007, Book X], and was able to show that adopting the arithmetic as a way of dealing with those theorems made many of them easy to understand and easy to prove. In his Arithmetique [Stevin 1585], Stevin proposed to represent all numbers systematically in decimal notation. P. Ehrlich notes that Stevin’s

4Namely, there is no easy way from representation of reals by decimals, to the field of reals, just as there is no easy way from continuous fractions, another well-known representation of reals, to operations on such fractions. 5But see Stevin’s comments on extending the process ad infinitum in main text at footnote 11. 14.2. DECIMALS FROM STEVIN TO NEWTON AND EULER 135

viewpoint soon led to, and was implicit in, the an- alytic geometry of Ren´eDescartes (1596-1650), and was made explicit by (1616-1703) and (1643-1727) in their arithmetizations thereof [Ehrlich 2005, p. 494].

14.2. Decimals from Stevin to Newton and Euler Stevin’s text La Disme on decimal notation was translated into English in 1608 by Robert Norton (cf. Cajori [Cajori 1928, p. 314]). The translation contains the first occurrence of the word “decimal” in English; the word will be employed by Newton in a crucial passage 63 years later.6 Wallis recognized the importance of unending decimal expressions in the following terms: Now though the Proportion cannot be accurately ex- pressed in absolute Numbers: Yet by continued Ap- proximation it may; so as to approach nearer to it, than any difference assignable (Wallis’s Algebra, p. 317, cited in [Crossley 1987]). Similarly, Newton exploits a power series expansion to calculate de- tailed decimal approximations to log(1 + x) for x = 0.1, 0.2,... (Newton [1]).7 By the time of Newton’s annus mirabilis,± the idea± of un- ending decimal representation was well established. Historian V. Katz calls attention to “Newton’s analogy of power series to infinite decimal expansions of numbers” [Katz, V. 1993, p. 245]. Newton expressed such an analogy in the following passage: Since the operations of computing in numbers and with variables are closely similar . . . I am amazed that it has occurred to no one (if you except N. Mer- cator with his quadrature of the hyperbola) to fit the doctrine recently established for decimal numbers in similar fashion to variables, especially since the way is then open to more striking consequences. For

6See main text at footnote 9. 7Newton scholar N. Guicciardini kindly provided a jpg of a page from New- ton’s manuscript containing such calculations, and commented as follows: “It was probably written in Autumn 1665 (see Mathematical papers 1, p. 134). White- side’s dating is sometimes too precise, but in any case it is a manuscript that was certainly written in the mid when Newton began annotating Wallis’s Arith- metica Infinitorum”[Guicciardini 2011]. The page from Newton’s manuscript can be viewed at http://u.math.biu.ac.il/~katzmik/newton.html 136 14. 16TH CENTURY

since this doctrine in species has the same relation- ship to Algebra that the doctrine in decimal numbers has to common Arithmetic, its operations of Addi- tion, Subtraction, Multiplication, Division and Root- extraction may easily be learnt from the latter’s pro- vided the reader be skilled in each, both Arithmetic and Algebra, and appreciate the correspondence be- tween decimal numbers and algebraic terms continued to infinity . . . And just as the advantage of decimals consists in this, that when all fractions and roots have been reduced to them they take on in a certain mea- sure the nature of integers, so it is the advantage of infinite variable-sequences that classes of more com- plicated terms (such as fractions whose denominators are complex quantities, the roots of complex quanti- ties and the roots of affected equations) may be re- duced to the class of simple ones: that is, to infinite series of fractions having simple numerators and de- nominators and without the all but insuperable en- cumbrances which beset the others [Newton 1671].8 In this remarkable passage dating from 1671, Newton explicitly names infinite decimals as the source of inspiration for the new idea of infi- nite series.9 The passage shows that Newton had an adequate number system for doing calculus and real analysis two centuries before the triumvirate.10 In 1742, John Marsh first used an abbreviated notation for repeat- ing decimals [Marsh 1742, p. 5]; cf. [Cajori 1928, p. 335]. Euler ex- ploits unending decimals in his Elements of Algebra in 1765, as when he sums an infinite series and concludes that 9.999 ... = 10 [Euler 1770, p. 170].

14.3. Stevin’s cubic and the IVT In the context of his decimal representation, Stevin developed nu- merical methods for finding roots of polynomials equations. He de- scribed an algorithm equivalent to finding zeros of polynomials (see Crossley [Crossley 1987, p. 96]). This occurs in a corollary to prob- lem 77 (more precisely, LXXVII) in (Stevin [Stevin 1625, p. 353]).

8We are grateful to V. Katz for signaling this passage. 9See footnote 6. 10The expression “the great triumvirate” is used by Boyer in [Boyer 1949, p. 298] to describe Cantor, Dedekind, and Weierstrass. 14.3. STEVIN’S CUBIC AND THE IVT 137

Here Stevin describes such an argument in the context of finding a root of the cubic equation (which he expresses as a proportion to con- form to the contemporary custom) x3 = 300x + 33915024. Here the whimsical coefficient seems to have been chosen to emphasize the fact that the method is completely general; Stevin notes further- more that numerous addional examples can be given. A textual dis- cussion of the method may be found in Struik [Stevin 1958, p. 476]. Centuries later, Cauchy would prove the Intermediate Value The- orem (IVT) for a continuous function f on an interval I by a divide- and-conquer algorithm. Cauchy subdivided I into m equal subinter- vals, and recursively picked a subinterval where the values of f have opposite signs at the endpoints [Cauchy 1821, Note III, p. 462]. To elaborate on Stevin’s argument following [Stevin 1958, 10, p. 475- 476], note that what Stevin similarly described a divide-and-conquer§ algorithm. Stevin subdivides the interval into ten equal parts, result- ing in a gain of a new decimal digit of the solution at every iteration of his procedure. Stevin explicitly speaks of continuing the iteration ad infinitum:11 Et procedant ainsi infiniment, l’on approche infini- ment plus pres au requis [Stevin 1625, p. 353, last 3 lines]. Arguably, the abstract existence proofs for the real numbers are not needed here since Stevin gives an algorithm for producing an explicit decimal representation of the solution. The IVT for polynomials would resurface in Lagrange before being generalized by Cauchy to the newly introduced class of continuous functions. One frequently hears sentiments to the effect that mathematicians before 1870 did not and could not have provided rigorous proofs, since the real number system was not even built yet, but such an attitude may be anachronistic. It overemphasizes the significance of the tri- umvirate project in an inappropriate context. Stevin is concerned with constructing an algorithm, whereas Cantor is concerned with develop- ing a foundational framework based upon the existence of the totality of the real numbers, as well as their power sets, etc. The latter project involves a number of non-constructive ingredients, including the axiom of infinity and the law of excluded middle. But none of it is needed for Stevin’s procedure, because he is not seeking to re-define “number”

11Cf. footnote 5. 138 14. 16TH CENTURY in terms of alternative (supposedly less troublesome) mathematical ob- jects. Many historians emphasize the approaches developed by Cantor, Dedekind, and Weierstrass to proofs of the existence of real numbers. Sometimes this is done at the expense, and almost to the exclusion, of Stevin’s approach. Stevin’s contribution to the construction of the real numbers deserves more explicit acknowledgment. CHAPTER 15

17th century

15.1. Pierre de Fermat 15.2. James Gregory 15.3. Gottfried Wilhelm von Leibniz

139

CHAPTER 16

18th century

16.1. Leonhard Euler

141

CHAPTER 17

19th century

17.1. Augustin-Louis Cauchy

143

CHAPTER 18

20th century

18.1. Thoralf Skolem 18.2. Edwin Hewitt 18.3. JerzyLo´s 18.4. Abraham Robinson 18.5. Edward Nelson

145

Bibliography

[Albeverio et al. 2009] Albeverio, S.; Fenstad, J.; Hoegh-Krohn, R.; Lindstrom, T. Nonstandard Methods in Stochastic Analysis and . Academic Press 1986, Dover 2009. [Almeida, Neves & Stroyan 2014] Almeida, R.; Neves, V.; Stroyan, K. “Infinites- imal differential geometry: cusps and envelopes.” Differ. Geom. Dyn. Syst. 16 (2014), 1–13. [Bair et al. 2013] Bair, J.; Blaszczyk, P.; Ely, R.; Henry, V.; Kanovei, V.; Katz, K.; Katz, M.; Kutateladze, S.; McGaffey, T.; Schaps, D.; Sherry, D.; Shnider, S. “Is mathematical history written by the victors?” Notices of the American Mathematical Society 60 (2013) no. 7, 886-904. See http://www.ams.org/notices/201307/rnoti-p886.pdf and http://arxiv.org/abs/1306.5973 [Benoit 1997] Benoit, E. “Applications of Nonstandard Analysis in Ordinary Dif- ferential Equations.” Nonstandard Analysis, Theory and Applications. Editors: Leif O. Arkeryd, Nigel J. Cutland, C. Ward Henson NATO ASI Series Volume 493, 1997, pp 153-182. [Blaszczyk, Katz & Sherry 2013] Blaszczyk, P.; Katz, M.; Sherry, D. “Ten miscon- ceptions from the history of analysis and their debunking.” Foundations of Science, 18 (2013), no. 1, 43-74. See http://dx.doi.org/10.1007/s10699-012-9285-8 and http://arxiv.org/abs/1202.4153 [Boothby 1986] Boothby, W. An introduction to differentiable manifolds and Rie- mannian geometry. Second edition. Pure and Applied Mathematics, 120. Academic Press, Inc., Orlando, FL, 1986. [Borovik & Katz 2012] Borovik, A., Katz, M. “Who gave you the Cauchy– Weierstrass tale? The dual history of rigorous calculus.” Foundations of Science 17, no. 3, 245-276. See http://dx.doi.org/10.1007/s10699-011-9235-x [Boyer 1949] Boyer, C. The concepts of the calculus. Hafner Publishing Company. [Bradley & Sandifer 2009] Bradley, R., Sandifer, C. Cauchy’s Cours d’analyse. An annotated translation. Sources and Studies in the and Physical Sciences. Springer, New York. [Cajori 1928] Cajori. A history of mathematical notations, Volumes 1-2. The Open Court Publishing Company, La Salle, Illinois, 1928/1929. Reprinted by Dover, 1993. [Cauchy 1821] Cauchy, A. L. Cours d’Analyse de L’Ecole Royale Polytechnique. Premi`ere Partie. Analyse alg´ebrique (Paris: Imprim´erie Royale, 1821).

147 148 BIBLIOGRAPHY

[Chang & Keisler 1990] Chang, C.; Keisler, H.J. Model Theory. Studies in Logic and the Foundations of Mathematics (3rd ed.), Elsevier (1990) [Origi- nally published in 1973]. [Crossley 1987] Crossley, J. The emergence of number. Second edition. World Sci- entific Publishing Co., Singapore. [Davis 1977] Davis, M.: Applied nonstandard analysis. Pure and Applied Math- ematics. Wiley-Interscience [John Wiley & Sons], New York-London- Sydney, 1977. Reprinted by Dover, NY, 2005, see http://store.doverpublications.com/0486442292.html [Do79] Dombrowski, P.: 150 years after Gauss’ “Disquisitiones generales circa superficies curvas”. With the original text of Gauss. Ast´erisque, 62. Soci´et´eMath´ematique de , Paris, 1979. [Ehrlich 2005] Ehrlich, P. “Continuity.” In Encyclopedia of philosophy, 2nd edition (Donald M. Borchert, editor), Macmillan Reference USA, Farmington Hills, MI, 2005 (The online version contains some slight improvements), pp. 489–517. [Euclid 2007] Euclid. Euclid’s Elements of Geometry, edited, and provided with a modern English translation, by Richard Fitzpatrick, 2007. See http://farside.ph.utexas.edu/euclid.html [Euler 1770] Euler, L. Elements of Algebra (3rd English edition). John Hewlett and Francis Horner, English translators. Orme Longman, 1822. [Feferman & Levy 1963] Feferman, S.; Levy, A. “Independence results in set theory by Cohen’s method II.” Notices Amer. Math. Soc. 10 (1963). [Fowler 1992] Fowler, D. “Dedekind’s theorem: √2 √3 = √6.” American Math- ematical Monthly 99, no. 8, 725–733. × [Goldblatt 1998] Goldblatt, R. Lectures on the hyperreals. An introduction to non- standard analysis. Graduate Texts in Mathematics, 188. Springer-Verlag, New York. [Gordon et al. 2002] Gordon, E.; Kusraev, A.; Kutateladze, S. Infinitesimal analy- sis. Updated and revised translation of the 2001 Russian original. Trans- lated by Kutateladze. Mathematics and its Applications, 544. Kluwer Academic Publishers, Dordrecht, 2002. [Grobman 1959] Grobman, D. “Homeomorphisms of systems of differential equa- tions.” Doklady Akademii Nauk SSSR 128, 880–881. [Guicciardini 2011] Guicciardini, N. Private communication, 27 november 2011. [Hartman 1960] Hartman, P. “A lemma in the theory of structural stability of differential equations.” Proc. A.M.S. 11 (4), 610–620. [Henson & Keisler 1986] Henson, C. W.; Keisler, H. J. “On the strength of non- standard analysis.” J. Symbolic Logic 51 (1986), no. 2, 377–386. [Herzberg 2013] Herzberg, F. Stochastic Calculus with Infinitesimals. Lecture Notes in Mathematics Volume 2067, Springer, 2013 [Hewitt 1948] Hewitt, E. “Rings of real-valued continuous functions. I.” Trans. Amer. Math. Soc. 64 (1948), 45–99. [Jech 1973] Jech, T. The axiom of choice. Studies in Logic and the Foundations of Mathematics, Vol. 75. North-Holland Publishing Co., Amsterdam- London; Amercan Elsevier Publishing Co., Inc., New York. BIBLIOGRAPHY 149

[Kanovei, Katz & Nowik 2016] Kanovei, V.; Katz, K.; Katz, M.; Nowik, T. “Small oscillations of the pendulum, Euler’s method, and adequality.” Quantum Studies: Mathematics and Foundations, to appear. See http://dx.doi.org/10.1007/s40509-016-0074-x and http://arxiv.org/abs/1604.06663 [Kanovei & Reeken 2004] Kanovei, V.; Reeken, M. Nonstandard analysis, axiomat- ically. Springer Monographs in Mathematics, Berlin: Springer. [Kanovei & Shelah 2004] Kanovei, V.; Shelah, S. A definable nonstandard model of the reals.” J. Symbolic Logic 69 (2004), no. 1, 159–164. [Katz & Sherry 2012] Katz, M.; Sherry, D. “Leibniz’s laws of continuity and ho- mogeneity.” Notices of the American Mathematical Society 59 (2012), no. 11, 1550-1558. See http://www.ams.org/notices/201211/rtx121101550p.pdf and http://arxiv.org/abs/1211.7188 [Katz & Sherry 2013] Katz, M., Sherry, D. “Leibniz’s infinitesimals: Their fiction- ality, their modern implementations, and their foes from Berkeley to Russell and beyond.” Erkenntnis 78, no. 3, 571–625. See http://dx.doi.org/10.1007/s10670-012-9370-y and http://arxiv.org/abs/1205.0174 [Katz, V. 1993] Katz, V. J. “Using the to teach calculus.” Sci- ence & Education. Contributions from History, Philosophy and Sociology of Science and Education 2, no. 3, 243–249. [Keisler 1986] Keisler, H. J.: Elementary Calculus: An Infinitesimal Approach. Second Edition. Prindle, Weber & Schimidt, Boston, 1986. See online version at http://www.math.wisc.edu/ keisler/calc.html [Keisler 1976] H. J. Keisler. Foundations of Infinitesimal∼ Calculus. On-line Edition, 2007, at the author’s web page: http://www.math.wisc.edu/ keisler/foundations.html (Originally published by Prindle,∼ Weber & Schmidt 1976.) [Klein 1908] Klein, F. Elementary Mathematics from an Advanced Standpoint. Vol. I. Arithmetic, Algebra, Analysis. Translation by E. R. Hedrick and C. A. Noble [Macmillan, New York, 1932] from the third German edition [Springer, Berlin, 1924]. Originally published as Elementarmathematik vom h¨oheren Standpunkte aus (Leipzig, 1908). [Leibniz 1989] Leibniz, G. W.: La naissance du calcul diff´erentiel. (French) [The birth of differential calculus] 26 articles des Acta Eruditorum. [26 ar- ticles of the Acta Eruditorum] Translated from the Latin and with an introduction and notes by Marc Parmentier. With a preface by Michel Serres. Mathesis. Librairie Philosophique J. Vrin, Paris, 1989. [Lo´s1955] Lo´s, J. “Quelques remarques, th´eor`emes et probl`emes sur les classes d´efi- nissables d’alg`ebres.” In Mathematical interpretation of formal systems, 98–113, North-Holland Publishing Co., Amsterdam, 1955. [Lutz & Goze 1981] Lutz, R.; Goze, M. Nonstandard Analysis, a Practical Guide with Applications. Lecture Notes in Mathematics 881, Springer-Verlag, 1981. [Malet 2006] Malet, A. “Renaissance notions of number and magnitude.” Historia Mathematica 33, no. 1, 63–81. [Marsh 1742] Marsh, J. Decimal arithmetic made perfect. London. 150 BIBLIOGRAPHY

[Newton 1671] Newton, I.: A Treatise on the Methods of Series and Fluxions, 1671. In Whiteside [4] (vol. III), pp. 33–35. [1] Newton, I.: Cambridge UL Add. 3958.4, f. 78v [2] Newton, I. 1946. Sir Isaac Newton’s mathematical principles of natural philosophy and his system of the world, a revision by F. Cajori of A. Motte’s 1729 translation. Berkeley: Univ. of California Press. [3] Newton, I. 1999. The Principia: Mathematical principles of natural phi- losophy, translated by I. B. Cohen & A. Whitman, preceded by A guide to Newton’s Principia by I. B. Cohen. Berkeley: Univ. of California Press. [Nowik & Katz 2015] Nowik, T., Katz, M. “Differential geometry via infinitesimal displacements.” Journal of Logic and Analysis 7:5 (2015), 1–44. See http://www.logicandanalysis.org/index.php/jla/article/view/237/106 and http://arxiv.org/abs/1405.0984 [Robinson 1966] Robinson, A. Non-standard analysis. North-Holland Publishing Co., Amsterdam 1966. [Rudin 1976] Rudin, W. Principles of mathematical analysis. Third edition. Inter- national Series in Pure and Applied Mathematics. McGraw-Hill Book Co., New York-Auckland-D¨usseldorf, 1976. [Fearney-Sander 1979] Fearnley-Sander, D. “Hermann Grassmann and the creation of linear algebra.” Amer. Math. Monthly 86, no. 10, 809–817. [Sherry & Katz 2014] Sherry, D., Katz, M. “Infinitesimals, imaginaries, ideals, and fictions.” Studia Leibnitiana 44 (2012), no. 2, 166–192 (the article ap- peared in 2014 even though the year given by the journal is 2012). See http://arxiv.org/abs/1304.2137 [Smith 1920] Smith, D. Source Book in Mathematics. Mc Grow-Hill, New York. [Spivak 1979] Spivak, M. A comprehensive introduction to differential geometry. Vol. 4. Second edition. Publish or Perish, Inc., Wilmington, Del., 1979. [Stevin 1585] Stevin, S. (1585) L’Arithmetique. In Albert Girard (Ed.), Les Oeu- vres Mathematiques de Simon Stevin (Leyde, 1634), part I, p. 1–101. [Stevin 1625] Stevin, S. L’Arithmetique. In Albert Girard (Ed.), 1625, part II. Online at http://www.archive.org/stream/larithmetiqvedes00stev#page/353/mode/1up [Stevin 1958] Stevin, S. The principal works of Simon Stevin. Vols. IIA, IIB: Math- ematics. Edited by D. J. Struik. C. V. Swets & Zeitlinger, Amsterdam 1958. Vol. IIA: v+pp. 1–455 (1 plate). Vol. IIB: 1958 iv+pp. 459–976. [Stroyan 1977] Stroyan, K. “Infinitesimal analysis of curves and surfaces.” In Hand- book of Mathematical Logic, J. Barwise, ed. North-Holland Publishing Company, 1977. [Stroyan 2015] Stroyan, K. Advanced Calculus using Mathematica: NoteBook Edi- tion. [Stroyan & Luxemburg 1976] Stroyan, K.; Luxemburg, W. Introduction to the The- ory of Infinitesimals. Academic Press, 1976. [Sun 2000] Sun, Y. “Nonstandard Analysis in Mathematical Economics.” In: Non- standard Analysis for the Working , Peter A. Loeb and Manfred Wolff, eds, Mathematics and Its Applications, Volume 510, 2000, pp 261-305. BIBLIOGRAPHY 151

[Tao 2014] Tao, T. Hilbert’s fifth problem and related topics. Graduate Studies in Mathematics, 153. American Mathematical Society, Providence, RI, 2014. [Tao & Vu 2016] Tao, T.; Van Vu, V. “Sum-avoiding sets in groups.” See http://arxiv.org/abs/1603.03068 [Tarski 1930] Tarski, A. “Une contribution `ala th´eorie de la mesure.” Fundamenta Mathematicae 15, 42–50. [Van den Berg 1987] Van den Berg, I. Nonstandard asymptotic analysis. Lecture Notes in Mathematics 1249, Springer 1987. [Van der Waerden 1985] Van der Waerden, B. L. A history of algebra. From al- Khwarizmi to Emmy Noether. Springer-Verlag, Berlin. [We55] Weatherburn, C. E.: Differential geometry of three dimensions. Vol. 1. Cambridge University Press, 1955. [4] Whiteside, D. T. (Ed.): The mathematical papers of Isaac Newton. Vol. III: 1670–1673. Edited by D. T. Whiteside, with the assistance in publica- tion of M. A. Hoskin and A. Prag, Cambridge University Press, London- New York, 1969. [Zygmund 1945] Zygmund, A. Smooth functions. Duke Math. J. 12. 47–76.

CHAPTER 19

Theorema Egregium

19.1. Preliminaries to the Theorema Egregium of Gauss In this section and the two following we will give a proof of Gauss’s theorem to the effect that the Gaussian curvature function of a surface is an intrinsic invariant of a surface. We denote the coordinates in R2 by (u1,u2). In R3, let , be the 2 3 h i Euclidean inner product. Let x = x(u1,u2): R R be a regular parametrized surface. Here “regular” means that→ the vector valued function x has Jacobian of rank 2 at every point. Consider partial ∂ derivatives xi = ∂ui (x), where i =1, 2. Thus, vectors x1 and x2 form a basis for the tangent plane to the surface at every point. Similarly, let ∂2x x = ij ∂ui∂uj be the second partial derivatives. Let n = n(u1,u2) be a unit normal to the surface at the point x(u1,u2), so that n, x = 0. h ii k Definition 19.1.1 (symbols gij, Lij, Γij). The first fundamental form (gij) at a point of the surface is given in coordinates by gij = xi, xj . The second fundamental form (Lij) is given in coordinates h i k by Lij = n, xij . The symbols Γij are uniquely defined by the decom- position h i k xij =Γijxk + Lijn 1 2 (19.1.1) =Γijx1 +Γijx2 + Lijn, where the repeated (upper and lower) index k implies summation, in accordance with the Einstein summation convention. i i Definition 19.1.2 (symbols L j). The Weingarten map (L j) is an endomorphism of the tangent plane, namely R x1 R x2. It is uniquely ∂ ⊕ defined by the decomposition of nj = ∂uj (n) as follows: i nj = L jxi 1 2 = L jx1 + L jx2. Note the relation L = Lk g . ij − j ki 153 154 19. THEOREMA EGREGIUM

Definition 19.1.3. We will denote by square brackets [ ] the anti- symmetrisation over the pair of indices found in between the brackets, e.g. a = 1 (a a ). [ij] 2 ij − ji k Note that g[ij] = 0, L[ij] = 0, and Γ[ij] = 0. We will use the k k notation Γij;ℓ for the ℓ-th partial derivative of the symbol Γij. Proposition 19.1.4. We have the following relation Γq +Γm Γq = L Lq (19.1.2) i[j;k] i[j k]m − i[j k] for each set of indices i, j, k, q (with, as usual, an implied summation over the index m). Proof. The formula will result by antisymmetrizing a formula for the third partial. Let us calculate the tangential component, with respect to the ba- sis x1, x2, n , of the third partial derivative xijk. Recall that nk = p { } ℓ L kxp and xjk =Γjkxℓ + Likn. Thus, we have (x ) = Γmx + L n ij k ij m ij k m m =Γ ij;kxm +Γij xmk + Lijnk + Lij;kn m m p p =Γij;kxm +Γij Γmkxp + Lmkn + Lij (L kxp)+ Lij;kn. Grouping together the tangential terms, we obtain

m m p p (xij)k =Γij;kxm +Γij Γmkxp + Lij (L kxp)+(...)n

q m q  q = Γij;k +Γij Γmk + LijL k xq +(...)n

 q m q q  = Γij;k +Γij Γkm + LijL k xq +(...)n,

 q  since the symbols Γkm are symmetric in the two subscripts. Now the symmetry in j, k (equality of mixed partials) implies the following iden- tity:

(xi[j)k] =0. Therefore

0=(xi[j)k]

q m q q = Γi[j;k] +Γi[jΓk]m + Li[jL k] xq +(...)n,

q m q q  and therefore Γi[j;k] +Γi[jΓk]m + Li[jL k] = 0 for each q =1, 2.  19.2. THE THEOREMA EGREGIUM OF GAUSS 155

19.2. The theorema egregium of Gauss Definition 19.2.1. The Gaussian curvature function K = K(u1,u2) is the determinant of the Weingarten map:

i 1 2 K = det(L j)=2L [1L 2]. (19.2.1) Theorem 19.2.2. We have the following three equivalent formulas for Gaussian curvature K: i 1 2 (a) K = det(L j)=2L [1L 2]; (b) K = det(Lij ) ; det(gij ) (c) K = 2 L L2 . − g11 1[1 2] Proof. The first formula is our definition of K. To prove the sec- k ond formula, we use the relation Lij = L jgki. By the multiplicativity of determinant with respect to matrix− multiplication, we have

i 2 det(Lij) det(L j)=( 1) . − det(gij) Let us now prove formula (c). The proof is a calculation. Note that by definition, 2L L2 = L L2 L L2 . Hence 1[1 2] 11 2 − 12 1 2L L2 = L1 g L2 g L2 L1 g L2 g L2 − 1[1 2] 1 11 − 1 21 2 − 2 11 − 2 21 1 = g L1 L2 g L2 L2 g L1 L2 + g L2 L2 11 1 2 − 21  1 2 − 11 2 1 21  2 1 = g L1 L2 L1 L2 11 1 2 − 2 1 i = g11det( L j)= g11K.  (19.2.2) k Note that we can either calculate using the formula Lij = L jgki, k − or the formula Lij = L igkj. Only the former one leads to the appro- priate cancellations as− in (19.2.2).  Theorem 19.2.3 (Theorema egregium). The Gaussian curvature function K = K(u1,u2) can be expressed in terms of the coefficients of the first fundamental form alone (and their first and second derivatives) as follows:

2 2 j 2 K = Γ1[1;2] +Γ1[1Γ2]j , (19.2.3) g11 k   where the symbols Γij can be expressed in terms of the derivatives of gij be the formula Γk = 1 (g g + g )gℓk, where (gij) is the inverse ij 2 iℓ;j − ij;ℓ jℓ;i matrix of (gij). This theorem will be proved in Section 19.3. 156 19. THEOREMA EGREGIUM

Weingarten Gaussian cur- mean cur- i map (L j) vature K vature H 0 0 plane 0 0 0 0   1 0 cylinder 0 1 0 0 2 invariance   yes no of curvature Table 19.3.1. Plane and cylinder have the same intrinsic geometry (K), but different extrinsic (H)

19.3. Intrinsic versus extrinsic Theorem 19.3.1. Unlike Gaussian curvature K, the mean curva- ture 1 i H = 2 L i = tr(W ) cannot be expressed in terms of the gij and their derivatives. Indeed, the plane and the cylinder have parametrisations with iden- tical gij, but with different mean curvature, cf. Table 19.3.1. To sum- marize, Gaussian curvature K is an intrinsic invariant, while mean 1 i curvature H = 2 L i is an extrinsic invariant, of the surface. Proof of theorema egregium. We present a streamlined version of do Carmo’s proof [?, p. 233]. The proof is in 3 steps.

(1) We express the third partial derivative xijk in terms of the Γ’s (intrinsic information) and the L’s (extrinsic information). (2) The equality of mixed partials yields an identification of an appropriate expression in terms of the Γ’s, with a certain com- bination of the L’s. (3) The combination of the L’s is expressed in terms of Gaussian curvature. The first two steps were carried out in Proposition 19.1.4. Choosing the valus i = j = 1 and k = q = 2 for the indices in equation (19.1.2), we obtain Γ2 +Γm Γ2 = L L2 1[1;2] 1[1 2]m − 1[1 2] i 2 = g1iL [1L 2] 1 2 = g11L [1L 2] 19.4. GAUSSIAN CURVATURE AND LAPLACE-BELTRAMI OPERATOR 157

1 2 Γij j =1 j =2 Γij j =1 j =2 i =1 µ1 µ2 i =1 µ2 µ1 i =2 µ µ i =2 µ− µ 2 − 1 1 2 k 2µ(u1,u2) Table 19.4.1. Symbols Γij of a metric e δij

2 2 since the term L [1L 2] = 0 vanishes. Applying Theorem 19.2.2(c), we obtain the desired formula for K and complete the proof of the theorema egregium. 

19.4. Gaussian curvature and Laplace-Beltrami operator In this section we will complement the material of section 19.2. A (u1,u2) chart in which the metric becomes conformal (see Defi- nition 20.8.1) to the standard flat metric, is referred to as isothermal coordinates. The existence of the latter is proved in [?]. Recall that a metric λ(x, y)(dx2 + dy2), where u1 = x and u2 = y, has metric coefficients g11 = g22 = λ whereas g12 = 0. Definition 19.4.1. The Laplace-Beltrami operator for a metric g = λ(x, y)(dx2 + dy2), where u1 = x and u2 = y, in isothermal coordinates is 1 ∂2 ∂2 ∆ = + . LB λ ∂(u1)2 ∂(u2)2  

The notation means that when we apply the operator ∆LB to a function f = f(u1,u2), we obtain 1 ∂2f ∂2f ∆ (f)= + . LB λ ∂(u1)2 ∂(u2)2   In more readable form for f = f(x, y), we have 1 ∂2f ∂2f ∆ (f)= + . LB λ ∂x2 ∂y2   Theorem 19.4.2. Given a metric in isothermal coordinates with 1 2 metric coefficients gij = λ(u ,u )δij, its Gaussian curvature is minus half the Laplace-Beltrami operator applied to the log of the conformal factor λ: 1 K = ∆ log λ. (19.4.1) −2 LB 1 Here log x is the natural logarithm so that (log x)′ = x . 158 19. THEOREMA EGREGIUM

Proof. Let λ = e2µ. We have from Table 19.4.1: 2Γ2 =Γ2 Γ2 = µ µ . 1[1;2] 11;2 − 12;1 − 22 − 11 It remains to verify that the ΓΓ term in the expression (19.2.3) for the Gaussian curvature vanishes: j 2 1 2 2 2 2Γ1[1Γ2]j = 2Γ1[1Γ2]1 + 2Γ1[1Γ2]2 =Γ1 Γ2 Γ1 Γ2 +Γ2 Γ2 Γ2 Γ2 11 21 − 12 11 11 22 − 12 12 = µ µ µ ( µ )+( µ )µ µ µ 1 1 − 2 − 2 − 2 2 − 1 1 =0. Then from formula (19.2.3) we have 2 1 K = Γ2 = (µ + µ )= ∆ µ. λ 1[1;2] −λ 11 22 − LB

Meanwhile, ∆LB log λ = ∆LB(2µ)=2∆LB(µ), proving the result.  Setting λ = f 2, we restate the theorem as follows. Theorem 19.4.3. Given a metric in isothermal coordinates with 2 1 2 metric coefficients gij = f (u ,u )δij, its Gaussian curvature is minus the Laplace-Beltrami operator of the log of the conformal factor f: K = ∆ log f. (19.4.2) − LB Either one of the formulas (19.2.3), (19.4.1), or (19.4.2) can serve as the intrinsic definition of Gaussian curvature, replacing the extrinsic definition (19.2.1), cf. Remark 34.7.1.

19.5. Duality in linear algebra Let V be a real vector space. We will assume all vector spaces to be finite dimensional unless stated otherwise. Example 19.5.1. Euclidean space Rn is a real vector space of di- mension n.

Example 19.5.2. The tangent plane TpM of a regular surface M at a point p M is a real vector space of dimension 2. ∈ Definition 19.5.3. A linear form, also called 1-form, φ on V is a linear functional φ : V R . → Definition 19.5.4. The dual space of V , denoted V ∗, is the space of all linear forms on V :

V ∗ = φ : V R φ is a 1-form on V . { → | } 19.6. POLAR, CYLINDRICAL, AND SPHERICAL COORDINATES 159

Evaluating φ at an element x V produces a scalar φ(x) R. ∈ ∈ Definition 19.5.5. The natural pairing between V and V ∗ is a linear map , : V V ∗ R, h i × → defined by setting x, y = y(x), for all x V and y V ∗. h i ∈ ∈ If V admits a basis of vectors (xi), then V ∗ admits a unique basis, called the dual basis (yj), satisfying x ,y = δ , (19.5.1) h i ji ij for all i, j =1,...,n, where δij is the Kronecker delta function. ∂ ∂ Example 19.5.6. The vectors ∂x and ∂y form a basis for the tan- gent plane T E of the Euclidean plane E at each point E. The dual p ∈ space is denoted Tp∗ and called the cotangent plane. The basis dual to ∂ ∂ ∂x , ∂y is denoted (dx,dy).  Polar coordinates will be defined in detail in Subsection 19.6. They provide helpful examples of vectors and 1-forms, as follows. ∂ ∂ Example 19.5.7. In polar coordinates, we have a basis ∂r , ∂θ for the tangent plane TpE of the Euclidean plane E at each point p E 0 . ∂ ∂ ∈ \{ } The dual space Tp∗ has a basis dual to ∂r , ∂θ and denoted (dr, dθ). Example 19.5.8. In polar coordinates, the 1-form rdr occurs frequently in calculus. This 1-form vanishes at the origin r = 0, and gets “bigger and bigger” as we get further away from the origin (see next section).

19.6. Polar, cylindrical, and spherical coordinates Definition 19.6.1. Polar coordinates (koordinatot koteviot) (r, θ) satisfy r2 = x2 + y2 and x = r cos θ, y = r sin θ. In R2 0 , one way of defining the ranges for the variables is to require \{ } r > 0 and θ [0, 2π). ∈ It is shown in elementary calculus that the area of a region D in the plane in polar coordinates is calculated using the area element dA = r dr dθ. 160 19. THEOREMA EGREGIUM

Thus, an integral is of the form

dA = rdrdθ. ZD ZZ Cylindrical coordinates in Euclidean 3-space are studied in Vector Calculus. Definition 19.6.2. Cylindrical coordinates (koordinatot gliliot) (r, θ, z) are a natural extension of the polar coordinates (r, θ) in the plane. The volume of an open region D is calculated with respect to cylin- drical coordinates using the volume element dV = r dr dθ dz. Namely, an integral is of the form

dV = rdr dθ dz. ZD ZZZ Example 19.6.3. Find the volume of a right circular cone with height h and base a circle of radius b. Spherical coordinates1 (ρ,θ,ϕ) in Euclidean 3-space are studied in Vector Calculus. Definition 19.6.4. Spherical coordinates (ρ,θ,ϕ) are defined as follows. The coordinate ρ is the distance from the point to the origin, satisfying ρ2 = x2 + y2 + z2, or ρ2 = r2 +z2, where r2 = x2 +y2. If we project the point orthogonally to the (x, y)-plane, the polar coordinates of its image, (r, θ), satisfy x = r cos θ and y = r sin θ. The last coordiate ϕ of the spherical coordinates is the angle between the position vector of the point and the third basis vector e3 in 3-space (pointing upward along the z-axis). Thus z = ρ cos ϕ while r = ρ sin ϕ .

1koordinatot kaduriot 19.7. DUALITY IN CALCULUS; DERIVATIONS 161

Here the ranges of the coordinates are often chosen as follows: 0 ρ ≤ 0 θ 2π ≤ ≤ and 0 ϕ π ≤ ≤ (note the different upper bounds for θ and ϕ). Recall that the area of a spherical region D is calculated using a volume element of the form dV = ρ2 sin ϕ dρ dθ d ϕ, so that the volume of a region D is

dV = ρ2 sin ϕ dρ dθ d ϕ . ZD ZZZD Example 19.6.5. Calculate the volume of the spherical shell be- tween spheres of radius α> 0 and β α. ≥ The area of a spherical region on a sphere of radius ρ = β is calcu- lated using the area element dA = β2 sin ϕ dθ d ϕ . Thus the area of a spherical region D on a sphere of radius β is given by the integral dA = β2 sin ϕ dθ d ϕ . ZD ZZ Example 19.6.6. Calculate the area of the spherical region on a sphere of radius β included in the first octant, (so that all three Carte- sian coordinates are positive).

19.7. Duality in calculus; derivations Let E be a Euclidean space of dimension n, and let p E be a fixed point. ∈ Definition 19.7.1. Let

Dp = f : f C∞ { ∈ } be the space of real C∞-functions f defined in a neighborhood of p E. ∈ Note that Dp is infinite-dimensional as it includes all polynomials. 162 19. THEOREMA EGREGIUM

∂ Theorem 19.7.2. A partial derivative ∂ui at p a 1-form ∂ : λp R ∂ui → on the space λp, satisfying Leibniz’s rule ∂(fg) ∂f ∂g = g(p)+ f(p) ∂ui ∂ui ∂ui p p p for all f,g λp. ∈ The formula can be written briefly as ∂ ∂ ∂ ∂ui (fg)= ∂ui (f)g + f ∂ui (g), at the point p. This was proved in calculus. The formula above moti- vates the following more general definition of a derivation. Definition 19.7.3. A derivation X at p is an R-linear form

X : Dp R → on the space Dp satisfying Leibniz’s rule: X(fg)= X(f)g(p)+ f(p)X(g) (19.7.1) for all f,g Dp. ∈ Remark 19.7.4. Linearity of a derivation is required only with regard to multiplication by scalars from R, not with regard to multi- plication by functions. CHAPTER 20

Derivations, tori

20.1. Space of derivations Let M be an n-dimensional manifold, and let p M. Recall that ∈ a derivation X is a 1-form on the space Dp of functions defined near p such that X satisfies the Leibniz rule (see Section 19.7). The following two results are familiar from advanced calculus. Proposition 20.1.1. The space of all derivations at p is a vector space of dimension n, and is called the tangent space Tp at p. Proof in case 1. Let us prove the result in the case n =1ofa single coordinate u = u1 at the point p = 0. Let X be a derivation. Then X(1) = X(1 1)=2X(1) · by Leibniz’s rule. Therefore X(1) = 0, and similarly for any constant by linearity of X. Now consider the monic polynomial u = u1 of degree 1, i.e., the linear function

u Dp=0. ∈ We evaluate the derivation X at u and set c = X(u) R. By the ∈ Taylor formula, any function f Dp=0, f = f(u), can be written as ∈ f(u)= a + bu + g(u)u where g is smooth and g(0) = 0. Now we have by Linearity and Leibniz rule X(f)= X(a + bu + g(u)u) = bX(u)+ X(g)u(0) + g(0) 1 · = bc ∂ = c (f). ∂u ∂ Thus X coincides with c ∂u for all f Dp. Hence the tangent space is 1-dimensional, proving the theorem in∈ this case. 

163 164 20. DERIVATIONS, TORI

Proof in case 2. Let us prove the result in the case of two vari- ables u, v at the point p = 0 (the general case is similar). Let X be a derivation. Then X(1) = X(1 1)=2X(1) · by Leibniz’s rule. Therefore X(1) = 0, and similarly for any constant by linearity of X. Now consider the monic polynomial u = u1 of degree 1, i.e., the linear function

u Dp=0. ∈ We evaluate the derivation X at u and set c = X(u). Similarly, consider the monic polynomial v = v1 of degree 1, i.e., the linear function

v Dp=0. ∈ We evaluate the derivation X at v and set c′ = X(v). By the Taylor formula, any function f Dp=0, f = f(u, v), can be written as ∈ 2 2 f(u, v)= a + bu + b′v + g(u, v)u + h(u, v)uv + k(u, v)v , where the functions g(u, v), h(u, v), and k(u, v) are smooth. Note that the coefficients b and b′ are the first partial derivatives of f. Now we have 2 2 X(f)= X(a + bu + b′v + g(u, v)u + h(u, v)uv + k(u, v)v ) 2 = bX(u)+ b′X(v)+ X(g)u + g(u, v)2uX(u)+ ··· |u=0,v=0 = bX(u)+ b′X(v)  = bc + b′c′ ∂ ∂ = c (f)+ c′ (f). ∂u ∂v Thus X coincides with the derivation ∂ ∂ c + c′ ∂u ∂v for all test functions f Dp. Hence the tangent space sis spanned by ∂ ∈ ∂ the tangent vectors ∂u and ∂v is therefore 2-dimensional, proving the theorem in this case.  We have the following immediate corollary. Corollary 20.1.2. Let (u1,...,un) be coordinates in a neighbor- hood of p M. Then a basis for the space Tp is given by the n partial derivatives∈ ∂ , i =1, . . . , n. ∂ui   20.3. FIRST FUNDAMENTAL FORM 165

Definition 20.1.3. The vector space dual to the tangent space Tp is called the cotangent space, and denoted Tp∗. Thus an element of a tangent space is a vector, while an element of a cotangent space is called a 1-form, or a covector.

∂ Definition 20.1.4. The basis dual to the basis ( ∂ui ) is denoted (dui), i =1, . . . , n.

i Thus each du is by definition a linear form on Tp, or a cotan- gent vector (covector for short). We are therefore working with dual ∂ i bases ∂ui for vectors, and (du ) for covectors. The pairing as in (19.5.1) gives  ∂ ∂ ,duj = duj = δj, (20.1.1) ∂ui ∂ui i     where δj is the Kronecker delta: δj = 1 if i = j and δj = 0 if i = j. i i i 6 Example 20.1.5. Examples of 1-forms in the plane are dx, dy, dr, rdr, dθ.

20.2. Constructing bilinear forms out of 1-forms Recall that the polarisation formula allows one to reconstruct a symmetric bilinear form B, from the quadratic form Q(v) = B(v, v), at least if the characteristic is not 2: 1 B(v,w)= (Q(v + w) Q(v w)). (20.2.1) 4 − − Similarly, one can construct bilinear forms out of the 1-forms dui, as follows. Consider a quadratic form defined by a linear combination of the rank-1 quadratic forms (dui)2. Polarizing the quadratic form, one obtains a bilinear form on the tangent space Tp.

20.3. First fundamental form Let M be a surface, and p M a point on the surface. ∈ Definition 20.3.1. The first fundamental form g on M is a sym- metric positive definite bilinear form on the tangent space Tp = TpM at p:

g : Tp Tp R, × → defined for all p M and varying continuously and smoothly in p. ∈ 166 20. DERIVATIONS, TORI

Example 20.3.2. Assume M is imbedded in R3. Euclidean 3- space R3 is equipped with a standard inner product which we de- 3 note , R . Its restriction to the tangent plane of M gives a first fundamentalh i form on M:

g : Tp Tp R, g(v,w) := v,w 3 . × → h iR The first fundamental form is traditionally expressed by a matrix of coefficients called metric coefficients gij, with respect to coordi- nates (u1,u2) on the surface. Thus, if a surface is imbedded in Euclidean space R3, the metric is obtained by restricting the Euclidean inner product to the tangent plane of the surface, as follows.

Definition 20.3.3. The metric coefficients gij are given by the inner product of the i-th and the j-th vector in the basis: ∂x ∂x gij = i , j . ∂u ∂u 3  R In particular, the diagonal coefficient satisfies ∂x 2 g = , ii ∂ui

namely it is the square length of the i-th basis vector ∂x . ∂ui

20.4. Expressing the metric in terms of 1-forms In the case of surfaces, at every point p = (u1,u2), we have the 1 2 metric coefficients gij = gij(u ,u ). Each metric coefficient is thus a function of two variables. Consider the case when the matrix is diagonal. This can always be achieved at a point by a change of coordinates. If this is true everywhere in a coordinate chart, such coordinates are called isothermal. In the notation developed in Section 20.2, we can write the first fundamental form in isothermal coordinats as follows: 1 2 1 2 1 2 2 2 g = g11(u ,u )(du ) + g22(u ,u )(du ) . (20.4.1) For example, if the metric coefficients form an identity matrix, we obtain the standard flat metric g = du1 2 + du2 2 , (20.4.2) where the interior superscript denotes an index, while exterior super- script denotes the squaring operation. 20.6. SURFACE OF REVOLUTION 167

Sometimes it is convenient to simplify notation by replacing the coordinates (u1,u2) simply by (x, y). Then the flat metric (20.4.2) will appear simply as dx2 + dy2.

20.5. Hyperbolic metric The hyperbolic metric can expressed by the bilinear form 1 (dx2 + dy2). y2 Note that this expression is undefined for y = 0. Traditionally one considers the upper half plane defined by the condition y > 0. The hyperbolic metric in the upper half plane is a complete metric. Its Gaussian curvature K = K(x, y) satisfies K = 1 at every point follows from the Laplace-Beltrami operator form− of the formula for the Gaussian curvature (see Section 19.4), and results from the fact 1 that the second derivative of log y is 2 . − y

20.6. Surface of revolution In the previous sections, we have outlined a formalism for describing a metric on an arbitrary surface, in coordinates denoted (u1,u2). In the special case of a surface of revolution, it is customary to use the notation u1 = θ and u2 = ϕ. Then the metric (20.4.1) can be written as 2 2 g = g11(θ,ϕ)(dθ) + g22(θ,ϕ)(dϕ) . The starting point in the construction of a surface of revolution in R3 is a curve C in the xz-plane, parametrized by a pair of functions x = f(ϕ), z = g(ϕ). We will assume that f(ϕ) > 0. The surface of revolu- tion (around the z-axis) defined by C is parametrized as follows: x(θ,ϕ)=(f(ϕ) cos θ, f(ϕ) sin θ,g(ϕ)). The condition f(ϕ) > 0 ensures that the resulting surface is imbedded in R3. It will be an imbedded torus if the original curve C itself is a Jordan curve. The pair of functions (f(ϕ),g(ϕ)) gives an arclength parametrisation of the curve if df 2 dg 2 + =1. dϕ dϕ     168 20. DERIVATIONS, TORI

Proposition 20.6.1. Assume (f(ϕ),g(ϕ)) gives an arclength para- metrisation of the curve. Then in the notation of Section 20.4, the metric g of the surface of revolution is g = f 2(ϕ)dθ2 + dϕ2. Example 20.6.2. Setting f(ϕ) = sin ϕ and g(ϕ) = cos ϕ, we obtain a parametrisation of the sphere S2 in spherical coordinates: 2 2 2 gS2 = sin ϕ dθ + dϕ . Proof of Proposition 20.6.1. To calculate the first fundamen- tal form of a surface of revolution, note that ∂x x = =( f sin θ, f cos θ, 0), 1 ∂θ − while ∂x df df dg x = = cos θ, sin θ, , 2 ∂ϕ dϕ dϕ dϕ   2 2 2 2 2 so that we have g11 = f sin θ + f cos θ = f , while df 2 dg 2 df 2 dg 2 g = (cos2 θ + sin2 θ)+ = + 22 dϕ dϕ dϕ dϕ         and df df g = f sin θ cos θ + f cos θ sin θ =0. 12 − dϕ dϕ Thus we obtain the first fundamental form f 2 0 2 2 (gij)= df dg , (20.6.1) 0 dϕ + dϕ ! completing the proof.      In fact, equation (20.6.1) has the following stronger consequence. Proposition 20.6.3. Assume (f,g) is a regular parametrisation of the curve. In the notation of Section 20.4, the metric g of the surface of revolution is df 2 dg 2 g = f 2dθ2 + + dϕ2. dϕ dϕ     ! In Section 20.10, we will express the metric of a surface of revolution by a first fundamental form expressed by a scalar matrix, after an appropriate change of coordinates. 20.7. COORDINATE CHANGE 169

20.7. Coordinate change In this section we will work with coordinate charts in the Euclidean plane. Given a metric in one coordinate chart, we would like to under- stand how the metric coefficients tranform under change of coordinates. Consider a change from a coordinate chart (ui), i =1, 2, to another coordinate chart, denoted (vα), α =1, 2. In the overlap of the two domains, the coordinates can be expressed in terms of each other, e.g. v = v(u). Denote by gij the metric coefficients i with respect to the chart (u ), and byg ˜αβ the metric coefficients with respect to the chart (vα). Definition 20.7.1 (Einstein summation convention). The rule is that whenever a product contains a symbol with a lower index and another symbol with the same upper index, take summation over this repeated index. Proposition 20.7.2. Under a coordinate change, the metric coef- ficients transform as follows: ∂ui ∂uj g˜ = g . αβ ij ∂vα ∂vβ Proof. In the case of a metric induced by a Euclidean imbedding x : R2 R3, → defined by x = x(u)= x(u1,u2), we obtain a new parametrisation y(v)= x(u(v)). Applying chain rule, we obtain ∂y ∂y g˜ = , αβ ∂vα ∂vβ   ∂x ∂ui ∂x ∂uj = , ∂ui ∂vα ∂uj ∂vβ   ∂ui ∂uj ∂x ∂x = , ∂vα ∂vβ ∂ui ∂uj   ∂ui ∂uj = g , ij ∂vα ∂vβ completing the proof.  170 20. DERIVATIONS, TORI

20.8. Conformal equivalence i j i j Definition 20.8.1. Two metrics, g = gijdu du and h = hijdu du , on a surface M are called conformally equivalent, or conformal for short, if there exists a function f = f(u1,u2) > 0 such that g = f 2h, in other words, g = f 2h i, j. (20.8.1) ij ij ∀ Definition 20.8.2. The function f above is called the conformal factor (note that sometimes it is more convenient to refer, instead, to the function λ = f 2 as the conformal factor). Remark 20.8.3. Note that the length of every vector at a point with coordinates (u1,u2) is multiplied precisely by f(u1,u2). More i ∂ speficially, a vector v = v ∂ui which is a unit vector for the metric h, is “stretched” by a factor of f, i.e. its length with respect to g equals f. Indeed, the new length of v is 1 ∂ ∂ 2 g(v, v)= g vi , vj ∂ui ∂uj   p 1 ∂ ∂ 2 = g , vivj ∂ui ∂uj     i j = gijv v 2 i j = pf hijv v i j = fp hijv v = f.p

20.9. Uniformisation theorem for tori Definition 20.9.1. An equivalence class of metrics on M confor- mal to each other is called a conformal structure (mivneh) on M. Theorem 20.9.2. Locally, every metric on a surface can be written as f(x, y)2(dx2 + dy2) with respect to suitable coordinates (x, y). There is a global version of this theorem for tori. The uniformisation theorem in the genus 1 case can be formulated as follows. Recall that by Pythagoras’ theorem, the square-length of a vector in the plane is the sum of the squares of its components, or ds2 = dx2 + dy2, 20.10. ISOTHERMAL COORDINATES 171 where dx and dy measure the components and ds is the arclength in the plane.

Definition 20.9.3. A lattice L R2 a subgroup isomorphic to Z2 ⊆2 which is not included in any line in R . Definition 20.9.4. A function f(x, y) on R2 is called L-periodic if f(x + ℓ)= f(x) for all ℓ L. ∈ Definition 20.9.5. A flat torus T2 is a quotient R2 /L, where L is a lattice in the plane. Theorem 20.9.6 (Uniformisation theorem). For every metric g on the 2-torus T2, there exists a lattice L R2 and a positive L-periodic 2 ⊆ 2 function f(x, y) on R such that the torus (T ,g) is isometric to R2 /L, f 2ds2 , (20.9.1) where ds2 = dx2 + dy2 is the standard flat metric of R2. More generally, there is also a global form of the theorem in arbi- trary genus, which can be stated in several ways. One of such ways involves Gaussian curvature. Theorem 20.9.7 (uniformisation). Every metric on a connected (kashir) surface is conformally equivalent to a metric of constant Gauss- ian curvature. From the complex analytic viewpoint, the uniformisation theorem states that every Riemann surface is covered by either the sphere, the plane, or the upper halfplane. Thus no notion of curvature is needed for the statement of the uniformisation theorem. However, from the differential geometric point of view, what is relevant is that every con- formal class of metrics contains a metric of constant Gaussian curva- ture. See [?] for a lively account of the history of the uniformisation theorem.

20.10. Isothermal coordinates Definition 20.10.1. Coordinates (u1,u2) on a surface M are called isothermal if the matrix (gij) of the first fundamental form of M, with respect to (u1,u2), is a scalar matrix at every point of M. The following lemma expresses the metric of a surface of revolution in isothermal coordinates. 172 20. DERIVATIONS, TORI

Theorem 20.10.2. Suppose (f(ϕ),g(ϕ)), where f(ϕ) > 0, is an arclength parametrisation of the generating curve1 of a surface of rev- olution. Then the change of variable 1 ψ = dϕ, f(ϕ) Z produces a new parametrisation (in terms of variables θ, ψ), with respect to which the first fundamental form is given by a scalar matrix (gij)= 2 (f δij), i.e. the metric is f 2(ϕ(ψ))(dθ2 + dψ2). In other words, we obtain an explicit conformal equivalence be- tween the metric on the surface of revolution and the standard flat metric on the quotient of the (θ, ψ) plane. Such coordinates are re- ferred to as “isothermal coordinates” in the literature. The existence of such a parametrisation is predicted by the uniformisation theorem (see Theorem 20.9.6) in the case of a general surface. df df dϕ Proof. Let ϕ = ϕ(ψ). By chain rule, dψ = dϕ dψ . Now consider again the first fundamental form of Proposition 20.6.3. To impose the 2 2 2 df dg condition g11 = g22, we need to solve the equation f = dψ + dψ , or     df 2 dg 2 dϕ 2 f 2 = + . dϕ dϕ dψ     !   In the case when the generating curve is parametrized by arclength, we dϕ are therefore reduced to the equation f = dψ , or dϕ ψ = . f(ϕ) Z Thus we obtain a parametrisation of the surface of revolution in co- ordinates (θ, ψ), such that the matrix of metric coefficients is a scalar matrix. 

20.11. Tori of revolution We will compute the conformal parameter of the specific torus of revolution (x a)2 + y2 = b2, where b < a. Note that the lattices obtained in the− previous section are all rectangular, with θ varying L dϕ from 0 to 2π and ψ varying from 0 to 0 f(ϕ) . The lemma of the previous section has the following corollary. R 1This curve does not have to be closed. We will specialize to Jordan curves in Section 20.11. 20.11. TORI OF REVOLUTION 173

Corollary 20.11.1. Consider a torus of revolution in R3 formed by rotating a Jordan curve of length L > 0, with unit speed param- etisation (f(ϕ),g(ϕ)) where ϕ [0, L]. Then the torus is conformally 2 ∈ equivalent to a flat torus R /Lc,d, defined by a rectangular lattice

Lc,d = c Z d Z, ⊕ L dϕ where c = 2π and d = 0 f(ϕ) , while the metric expressed in coordi- nates (θ, ψ) is R f 2(ϕ(ψ))(dθ2 + dψ2), dϕ for the change of coordinate ψ = f(ϕ) . R

CHAPTER 21

Conformal parameter

21.1. Conformal parameter τ of standard tori of revolution

Figure 21.1.1. Torus: lattice (left) and imbedding (right)

Definition 21.1.1. Every flat torus C/L is similar to the torus spanned by τ C and 1 C, where τ is in the standard fundamental ∈ ∈ 1 domain D = z = x + iy C : x 2 ,y > 0, z 1 . The parameter τ is{ called the conformal∈ parameter| | ≤ of the torus.| | ≥ } In the previous section we studied the torus of revolution gener- ated by a Jordan curve of length L with arclength parametrisation (f(ϕ),g(ϕ)). We showed that the conformal class of the torus contains the flat rectangular torus, whose lattice Lc,d is spanned by perpendic- L dϕ ular vectors of length c =2π and d = 0 f(ϕ) . We therefore obtain the following corollary. R Corollary 21.1.2. The conformal parameter τ of a torus of rev- olution is pure imaginary: τ = iσ2 of absolute value c d σ2 = max , 1. d c ≥   175 176 21. CONFORMAL PARAMETER

Proof. The proof is immediate from the fact that the lattice is rectangular. 

21.2. Comformal parameter of standard torus of rev We would like to compute the conformal parameter τ of the stan- dard tori imbedded in R3. We first modify the parametrisation so as to obtain a generating curve parametrized by arclength: ϕ ϕ f(ϕ)= a + b cos , g(ϕ)= b sin , (21.2.1) b b where ϕ [0, L] with L =2πb, and b < a. ∈ Theorem 21.2.1. The corresponding flat torus is given by the lat- tice in the (θ, ψ) plane of the form Lc,d = c Z +d Z where c =2π and 2π d = . (a/b)2 1 − Thus the conformal parameter pτ of the flat torus satisfies 1 τ = i max , (a/b)2 1 . (a/b)2 1 − ! − p Proof. By Corollary 21.1.2,p replacing ϕ by ϕ(ψ) produces isother- mal coordinates (θ, ψ) for the torus generated by (21.2.1), where dϕ dϕ ψ = = , f(ϕ) a + b cos ϕ Z Z b and therefore the flat metric is defined by a lattice in the (θ, ψ) plane with c =2π and L=2πb dϕ d = . a + b cos ϕ Z0 b Changing the variable to to ϕ t = b we obtain 2π dt d = . (a/b) + cos t Z0 Thus we are evaluating the definite integral 2π dt d = , (a/b) + cos t Z0 21.2. COMFORMAL PARAMETER OF STANDARD TORUS OF REV 177 where a/b > 1. Let α = a/b. We will use the residue theorem. First note that 2π dt d = α + cos t Z0 dt = α + Re(eit) Z 2dt = . 2α + eit + e it Z − The change of variables z = eit yields idz dt = − z and along the unit circle we have 2idz d = − z(2α + z + z 1) I − 2idz = − (21.2.2) z2 +2αz +1 I 2idz = − , (z λ )(z λ ) I − 1 − 2 where λ = α + √α2 1, λ = α √α2 1. (21.2.3) 1 − − 2 − − − From (21.2.3) we obtain λ λ =2√α2 1. (21.2.4) 1 − 2 − The root λ2 is outside the unit circle. Thus z = λ1 is the only singu- larity of the meromorphic function under the integral sign in (21.2.2) inside the unit circle. Hence we need the residue at λ1 to apply the residue theorem. The residue at λ1 is obtained by multiplying through by (z λ1) and then evaluating at z = λ1. We therefore obtain from (21.2.4)− 2i 2i i Resλ1 = − = − = − . λ1 λ2 2√α2 1 √α2 1 − − − The integral is determined by the residue theorem in terms of the residue at the pole z = λ1. Therefore the lattice parameter d from Corollary 21.1.2 equals 2π d = 2πi Resλ1 = , (α)2 1  − where α = a/b, proving the theorem. p  178 21. CONFORMAL PARAMETER

21.3. Curvature expressed in terms of θ(s) In Section 19.4 we dealt with the Gaussian curvature of surfaces. In the present section we treat a different notion of curvature, that of a differentiable curve in the plane. Let s be an arclength parameter of a plane curve α(s). Using the 2 identification R = C, the tangent vector v(s)= α′(s) can be expressed as v(s)= eiθ(s), where θ(s) is the angle measured counterclockwise between the posi- tive x-axis and the vector v(s). The (geodesic) curvature kα(s) is by definition the function kα(s)= α′′(s) . Note that one has k (s) 0 by definition. | | α ≥ Theorem 21.3.1. We have the following expression for the curva- ture kα of the curve α: dθ k (s)= . (21.3.1) α ds

Proof. iθ(s) Recall that v(s)= e . By chain rule, we have dv dθ = ieiθ(s) , ds ds and therefore the curvature kα(s) satisfies dv dθ k (s)= α′′(s) = = , α | | ds ds

since i = 1.  | | Corollary 21.3.2. Let α(t)=(x(t),y(t)) be an arbitrary parametri- sation (not necessarily arclength) of a curve. We have the following expression for the curvature kα:

x′y′′ y′x′′ k = | − | (21.3.2) α α 3 | ′| Proof. With respect to an arbitrary parameter t, the components of the tangent vector v(t)= α′(t) are x′(t) and y′(t). Hence we have y tan θ = ′ . (21.3.3) x′ Differentiating (21.3.3) with respect to t, we obtain

1 dθ x′y′′ x′′y′ 2 = −2 . (21.3.4) cos θ dt x′ 21.4. TOTAL CURVATURE OF A CONVEX JORDAN CURVE 179

Meanwhile dx dx ds ds x′ = = = cos θ dt ds dt dt and dθ dθ ds = . (21.3.5) dt ds dt dt dα dθ Furthermore, ds = v(t) = α′(t) = dt . Solving (21.3.5) for ds , we obtain from (21.3.4)| that| | | | |

dθ x′y′′ y′x′′ = | − | ds α 3 | ′| and by (21.3.1) we obtain (21.3.2). This is equivalent to the calcula- tion (??). 

21.4. Total curvature of a convex Jordan curve The result on the total curvature1 of a Jordan curve is of interest in its own right. Furthermore, the result serves to motivate an analogous statement of the Gauss–Bonnet theorem for surfaces in Section 21.6. Definition 21.4.1. The total curvature Tot(C) of a curve C with counterclockwise arclenth parametrisation α(s):[a, b] R2 is the integral → b Tot(C)= kα(s)ds. (21.4.1) Za When a curve C is closed, i.e, α(a)= α(b), it is more convenient to write the integral (21.4.1) using the notation of a line integral

Tot(C)= kα(s)ds. (21.4.2) IC Theorem 21.4.2. The total curvature of each strictly convex2 Jor- dan curve C with arclength parametrisation α(s) equals 2π, i.e.,

Tot(C)= kα(s)ds =2π. IC Proof. Let α(s) be a unit speed parametrisation so that α(0) be the lowest point of the curve, and assume the curve is parametrized i0 counterclockwise. Let v(s)= α′(s). Then v(0) = e = 1 and θ(0) = 0.

1Akmumiyut kolelet 2Kmura with a “kuf”. 180 21. CONFORMAL PARAMETER

The function θ = θ(s) is monotone increasing from 0 to 2π. Applying the change of variable formula for integration, we obtain dv dθ 2π Tot(C)= k (s)ds = ds = ds = dθ =2π, α ds ds IC IC IC Z0 proving the theorem. 

Remark 21.4.3. We can also consider the normal vector n(s) to π the curve. The normal vector satisfies n(s)= ei(θ+ 2 ) = iei(θ) and dn = | ds | dv = dθ . Therefore we can also calculate the total curvature as follows: | ds | ds dn dθ 2π k (s)ds = ds = ds = dθ =2π, α ds ds IC IC IC Z0 with n(s) in place of v(s).

Remark 21.4.4. The theorem in fact holds for an arbitrary regular Jordan curve, provided we adjust the definition of curvature to allow for negative sign (in such case θ(s) will not be a monotone function). We have dealt only with the convex case in order to simplify the topological considerations. See further in Section 21.5. Remark 21.4.5. A similar calculation will yield the Gauss–Bonnet theorem for convex surfaces in Section 21.6.

˜ 21.5. Signed curvature kα of Jordan curves in the plane Let α(s) be an arclength parametrisation of a Jordan curve in the plane. We assume that the curve is parametrized counterclockwise. As in Section 21.3, a continuous branch of θ(s) can be chosen where θ(s) is the angle measured counterclockwise from the positive x-axis to the tangent vector v(s)= α′(s). Note that if the Jordan curve is not convex, the function θ(s) will not be monotone and at certain points its derivative may take negative values: θ′(s) < 0. Once we have such a continuous branch, we can define the signed curvature k˜α as follows.

Definition 21.5.1. The signed curvature k˜α(s) of a plane Jordan curve α(s) oriented counterclockwise is defined by setting dθ k˜ (s)= . α ds We have the following generalisation of Theorem 21.4.2 on the total curvature of a convex curve. 21.5. SIGNED CURVATURE k˜α OF JORDAN CURVES IN THE PLANE 181

Theorem 21.5.2. The total signed curvature k˜α of each Jordan curve C with arclength parametrisation α(s) equals 2π, i.e., b Tot(C)= k˜α(s)ds =2π. Za Proof. We exploitg the continuous branch θ(s) as in Theorem 21.4.2, but without the absolute value signs on the derivative of θ(s). Applying the change of variable formula for integration, we obtain dθ 2π Tot(C)= k˜ (s)ds = ds = dθ =2π, α ds IC IC Z0 proving theg theorem.  Remark 21.5.3. For nonconvex Jordan curves, the Gauss map to the circle will not be one-to-one, but will still have an algebraic de- gree one. This phenomenon has a far-reaching analogue for imbedded surfaces where the algebraic degree is proportional to the Euler char- acteristic. Definition 21.5.4. The algebraic degree of a map θ : C S1 at a point z S1 is defined to be the sum → ∈ dθ sign , −1 ds θX(z)   where the sum runs over all points in the inverse image of z. For Jordan curves (i.e., imbedded curves), the algebraic degree is always 1. This is the topological ingredient behind the theorem on the total curvature of a Jordan curve. 21.5.1. Tangent vector defined by a curve. This subsection can be skipped. Let M be a manifold. A tangent vector X at a point p M was ∈ defined as a linear functional X : Dp R on the space Dp of smooth functions defined in a neighborhood of→p, where the 1-form is required to satisfy Leibniz’s rule (19.7.1). The tangent vectors at p form the tangent space Tp = TpM at p, which is an n-dimensional vector space. If (u1,...,un) are coordinates centered at p, then we can write down the ∂ following basis for Tp: ∂ui , i = 1,...,n. If α(t) is a smooth curve through p satisfying α(0) = p, the formula Xf = d f α(t) =  dt t=0 ◦ d f(α(t)) for all f , defines an initial vector3 X = α (0). The dt t=0 Dp ′ directional derivative only∈ depends on the values of f along a curve

3hat’chalati 182 21. CONFORMAL PARAMETER

Figure 21.5.1. Three points on a curve with the same tangent direction, but the middle one corresponds to θ′(s) < 0. with initial vector X, by elementary calculus. Thus we can think of a tangent vector as being defined by choosing a suitable smooth curve. Example 21.5.5. Let p be the point at the center of a coordi- 1 n nate system with coordinates (u ,...,u ). Consider the curve α1(t) defined in coordinates by the n-tuple (t, 0,..., 0). Similarly consider the curves αi, where i = 1,...,n, so that αn(t) = (0,..., 0, t). Then ∂ we have ∂ui = αi′(0), i =1,...,n. The advantage of the more abstract definition of a tangent vector is its independence of any choices. This will allow us to define the tangent bundle by considering all tangent vectors simultaneously.

21.5.2. Rotation index of a closed curve in the plane. This subsection is to be skipped. Given a regular closed plane curve α(s) : [0, L] R2, of length L, a continuous branch of → 1 θ(s)= log α′(s) i can be chosen even if the curve is not simple, obtaining a map θ : [0, L] R . → 21.6. COORDINATE PATCH DEFINED BY FAMILY OF NORMAL VECTORS183

As in Section 21.5, we can assume that θ(0) = 0. Then θ(L) is neces- sarily an integer multiple of 2π. See Millman & Parker [?, p. 55].

Definition 21.5.6. The rotation index ια of a closed unit speed plane curve α(s) is the integer θ(L) θ(0) ι = − . α 2π Example 21.5.7. The rotation index of a figure-8 curve is 0. Theorem 21.5.8. Let α(s) be an arclength parametrisation of a geometric curve C. Then the rotation index is related to the total signed curvature Tot(C) as follows: Tot(C) g ι = . (21.5.1) α 2π The proof (by change of variableg in integration) is the same as in Section 21.5. An analogous relation between the Euler characteristic of a surface and its total Gaussian curvature appears in Section 21.7. 21.6. Coordinate patch defined by family of normal vectors The Gauss–Bonnet theorem for surfaces is to a certain extent anal- ogous to the theorem on the total curvature of a plane curve (Theo- rem 21.5.2 and Theorem 21.5.8). In both cases, an integral of curvature turns out to have topological significance. In the notation of Section 19.1, we have the metric coefficients (gij) of the parametrisation x(u1,u2) of the surface Σ in a neighborhood of a point p Σ. The Gauss∈ map G : M S2 → is given by the unit normal vector n(u1,u2). Remark 21.6.1. The Gauss map defines a coordinate patch on the sphere S2 near the point G(p). In other words, we obtain a parametri- sation of a neighborhood of the point G(p) S2. ∈ Definition 21.6.2. In the new coordinate patch, we obtain the 2 metric coefficients (˜gα,β) of the sphere S relative to the parametrisa- tion n(u1,u2). 1 2 1 2 Here each coefficientg ˜αβ =g ˜αβ(u ,u ) is a function of u and u , as well. These metric coefficients are defined as usual by taking the scalar products of hte first partial derivatives of the parametrisation, this time given by n(u1,u2): 1 2 g˜ (u ,u )= n , n 3 , α,β =1, 2. αβ h α βiR 184 21. CONFORMAL PARAMETER

Lemma 21.6.3. We have the following identity relating the metric 2 coefficients of the metric (˜gαβ) on the sphere S , on the one hand, and the metric coefficients (gij) of the metric on M:

1 2 2 det(˜gαβ)= K(u ,u ) det(gij) where K = K(u1,u2) is the Gaussian curvature of the surface M.

i Proof. Let L = (L j) be the matrix of the Weingarten map with respect to the basis (x1, x2). By definition of curvature we have K = i det(L). Recall that the coefficients L j of the Weingarten map are defined by i i nα = L αxi = xi L α. (21.6.1)

Consider the 3 2-matrices A = [x1 x2] and B = [n1 n2]. Then (21.6.1) implies× by definition that B = AL. Therefore the Gram matrices satisfy

t t t t Gram(n1, n2)= B B =(AL) AL = L A AL t (21.6.2) = L Gram(x1, x2) L. Now we apply the determinant to both sides. We have

det(Gram(x1, x2)) = det(gij). Similarly,

det(Gram(n1, n2)) = det(˜gαβ). Therefore from (21.6.2) we obtain

2 det(˜gαβ) = det(gij)det(L) , and the theorem follows from the identity K = det(L). 4 

4Note that by the Cauchy–Binet formula ??, the desired identity is equivalent to the formula n n = det(Li ) x x . (21.6.3) | 1 × 2| | j | | 1 × 2| An alternative proof of (21.6.3) is to observe that a linear map multiplies the area of parallelograms by (the absolute value of) its determinant. The Weingarten map sends each vector xi to ni and therefore it sends the parallelogram spanned by x1 and x2 to the parallelogram spanned by n1 and n2. Therefore it modifies the area of the parallegram by det(Li ) . | j | 21.7. TOTAL GAUSSIAN CURVATURE AND GAUSS–BONNET THEOREM 185

21.7. Total Gaussian curvature and Gauss–Bonnet theorem Let Σ R3 be an imbedded surface. Recall that the area ele- ment dA of⊆ a surface M is given locally in coordinates (u1,u2) by

1 2 dAM = det(gij)du du . q Here the factor det(gij) should be thought of as the area of the par- ∂ ∂ allelogram spanned by 1 and 2 . For example, in polar coordinates p ∂u ∂u in the plane, the determinant is r2 and therefore we get dA = rdrdθ. On the unit sphere we have dA = sin ϕ dθ dϕ. Let M be a convex closed surface in R3, i.e., an imbedded closed surface of positive Gaussian curvature (a typical example is an ellip- soid). Theorem 21.7.1 (Special case of Gauss-Bonnet). The curvature integral of M satisfies

2 K(p) dAM =4π =2πχ(S ), ZM where K is the Gaussian curvature function defined at every point p of the surface. Proof. The convexity of the surface guarantees that the map n is one-to-one. 1 2 We examine the integrand K dAM in a coordinate chart (u ,u ) 1 2 where it can be written as K(u ,u ) dAM . By Lemma 21.6.3, we have

1 2 1 2 1 2 1 2 2 K(u ,u ) dAM = K(u ,u ) det(gij)du du = det(˜gαβ)du du = dAS .

q q 2 Thus, the expression K dAM coincides with the area element dAS of the unit sphere S2 in every coordinate chart. Hence we can write

2 K dAM = dAS =4π, 2 ZM ZS proving the theorem. 

CHAPTER 22

Addendum on a Bernoullian framework

22.1. Ultrapower construction We briefly describe a construction that yields a nonstandard ex- tension satisfying countable saturation. For every n N let Sn(X) ∈ be the set of all sequences xi i N such that xi Vn(X) for all i, and h i ∈ ∈ let S(X) = n N Sn(X). Let (N) be an ultrafilter on N such ∈ U ⊆ P that every cofinite subset of N is in . Define an equivalence relation S U ∼ on S(X) as follows: xi i N yi i N if the set i N : xi = yi is in . h i ∈ ∼h i ∈ { ∈ } U We denote the equivalence class of xi i N by [ xi i N]. We define a rela- h i ∈ h i ∈ tion ε on S(X) as follows: [ xi i N] ε [ yi i N] if the set i N : xi yi is in . h i ∈ h i ∈ { ∈ ∈ } U A16. Let S′(X) denote the set of equivalence classes of elements of S(X), and let ∗X denote the set of equivalence classes of elements of S (X). We recursively define a function T : S′(X) V (∗X), as 0 → follows: For a = xi i N S0(X) simply T ([a]) = [a]. For n > 0 h i ∈ ∈ and Ai i N Sn(X) Sn 1(X), let T ([ Ai i N]) = T ([ xi i N]) : h i ∈ ∈ − − h i ∈ { h i ∈ [ xi i N] ε [ Ai i N] , which is an element of Vn(∗X). For A S(X) leth ki ∈ denoteh thei ∈ sequence} with constant value A, then finally,∈ let : A ∗ V (X) V (∗X) be defined by ∗A = T ([kA]). In this construction, the internal→ objects are precisely those appearing in the image of T . 22.2. Saturation and topology In Theorem we further assume that our nonstandard extension satisfies countable saturation, which is the following property: When- ever An n N is a countable collection of internal sets such that 0 k n Ak = { } ∈ ≤ ≤ 6 ∅ for every n N, then n N An = ∅. Any nonstandard extension constructed via∈ the ultrapower∈ construction6 described below,T satisfies countable saturation. T

187

CHAPTER 23

More on Gauss-Bonnet

23.1. Second proof of Gauss–Bonnet theorem We will use the existence of global coordinates (with singularities only at the north and south poles) on the sphere to write down a more concrete version of the calculation. Choose global coordinates u1 = θ,uˆ 2 =ϕ ˆ on M by precomposing 1 the spherical coordinates by the inverse G− of the Gauss map G : M S2. Here we can dispense with local coordinate charts and compute→ the Gauss–Bonnet integral in these global coordinates. Using 1 2 the relation K(u ,u ) det(gij)= det(˜gαβ) we obtain as before

1 2 p pˆ K(u ,u ) dAM = K(θ, ϕˆ) dAM ZM ZM

= K(θ,ˆ ϕˆ) det(gij)dθdˆ ϕˆ M ZZ q = det(˜gαβ)dθdϕ = dAS2 . S2 S2 ZZ q ZZ Here each of the double can be replaced by the iterated integral π RR2π dϕ dθ(...) , Z0 Z0  where the appropriate integrand is placed where the dots (...) are.

23.2. Euler characteristic The Euler characteristic of a closed imbedded surface in Euclidean 3-space can be defined via the total Gaussian curvature. Definition 23.2.1. The Euler characteristic χ(M) of a surface M is defined by the relation

2πχ(M)= K(p)dAM , (23.2.1) ZM where K(p) is the Gaussian curvature at the point p M. ∈ 189 190 23. MORE ON GAUSS-BONNET

The relation (23.2.1) is similar to the line intergal expression for the rotation index in formula ??. To show that this definition of the Euler characteristic agrees with the usual one, it is necessary to use the notion of algebraic degree for maps between surfaces, similar to the algebraic degree of a self-map of a circle. 23.3. Exercise on index notation Theorem 23.3.1. Every 2 2 matrix A =(ai ) satisfies the identity × j i k i k i a ka j + q δ j = a ka j, (23.3.1) where q = det(A). Proof. We will use the Cayley-Hamilton theorem which asserts that pA(A) = 0 where pA(λ) is the characteristic polynomial of A. In 2 the rank 2 case, we have pA(λ)= λ (trA)λ + det(A), and therefore we obtain − A2 (trA)A + det(A)I =0, − which in index notation gives ai ak ak ai +q δi = 0. This is equivalent k j − k j j to (23.3.1). 

For a matrix A of size 3 3, the characteristic polynomial pA(λ) has the form p (λ)= λ3 Tr(A×)λ2 + s(A)λ q(A)λ0. Here q(A) = det(A) A − − and s(A)= λ1λ2 + λ1λ3 + λ2λ3, where λi are the eigenvalues of A. By Cayley-Hamilton theorem, we have

pA(A)=0. (23.3.2) Exercise 23.3.2. Express the equation (23.3.2) in index notation. Exercise 23.3.3. A matrix A is called idempotent if A2 = A. Write the idempotency condition in indices with Einstein summation conven- tion (without Σ’s). Exercise 23.3.4. Matrices A and B are similar if there exists an invertible matrix P such that AP P B = 0. Write the similarity condition in indices, as the vanishing− of the (i, j)th coefficient of the difference AP P B. − CHAPTER 24

Bundles

24.1. Trivial bundle (E,B,π), function as a section (mikta) Before introducing the notion of a tangent bundle, let us make some elementary remarks about graphs. Let M be a surface (more generally, a manifold). Given a function

f : M R → we can define its graph in M R as the collection of pairs × (p, f(p)) M R p M . { ∈ × | ∈ } We view M R as a trivial bundle over M, denoted π: × π : M R M. × → We can also consider the section s : M M R defined by the obvious formula → × s(p)=(p, f(p)). The section thus defined clearly satisfies the identity π s = Id , ◦ M in other words π(s(p)) = p for all p M. We will now develop the language of bundles and sections in∈ the context of the tangent and cotangent bundles of M. The total space of a bundle is typically denoted E, the base is denoted B, and the surjective bundle map is denoted π:

E

π  B The whole package is referred to as the triple (E,B,π).

191 192 24. BUNDLES

24.2. Mobius band as a first example of a nontrivial bundle The tangent bundle of a circle is a cylinder S1 R, which can be thought of as a trivial bundle over the circle. × A first example of a non-trivial bundle is provided by the Mobius band (tabaat mobius, rtzuat mobius). The latter can be thought of as a non-trivial bundle over the circle. Consider the parametrized circle c(θ) = 10(cos θ, sin θ, 0) with nor- mal vector n(θ) = (cos θ, sin θ, 0). The circle can be imbedded in a parametrized Mobius band in R3. This is done by adding the endpoints of a “rotating” interval at every point: µ(θ)= c(θ) cos θ n(θ) + sin θ e . ± 2 2 3 θ θ Now “filling in” the pair of points cos 2 n(θ) + sin 2 e3 by the in- terval joining them, we obtain a Mobius± band  θ θ MB(θ,s)= c(θ)+ s cos 2 n(θ) + sin 2 e3 , where 1 s 1. The projection back to the circle c defines a non-trivial− ≤ interval≤ bundle over the circle.

24.3. Tangent bundle The tangent bundle T M of a surface (or more generally mani- fold) M is by definition the collection of pairs T M := (p,X) p M,X T . { | ∈ ∈ p} We have a natural projection π : T M M, → defined by (p,X) p, “forgetting” the second component. Clearly, 7→ the inverse image of a point is precisely the tangent plane Tp at that point: 1 π− (p)= Tp. A vector field is a “section” of the tangent bundle, analogous to the splitting map for a homomorphism. Thus a section s : M T M is a map satisfying → π s = Id . ◦ M In other words, if s is a section, we have π(s(p)) = p for all p M. We first complete the preliminary material on bundles. ∈ 24.5. COTANGENT BUNDLE 193

24.4. Hairy ball theorem In Section 24.2, we discussed the simplest example of a nontrivial bundle, given by the Mobius band. Another example of a nontrivial bundle is provided by the following famous result. Theorem 24.4.1 (The hairy ball theorem). There is no nonvan- ishing continuous tangent vector field on a sphere. Thus, if f is a continuous function that assigns a vector in R3 to every point p on a sphere, such that f(p) is always tangent to the sphere at p, then there is at least one p such that f(p) = 0. The theorem was first stated by Henri Poincar´ein the late 19th century. The theorem is famously stated as “you can’t comb a hairy ball flat”, or sometimes, “you can’t comb the hair of a coconut”. It was first proved in 1912 by Brouwer.

24.5. Cotangent bundle In Section 24.3, we defined the tangent bundle of M. We now define the dual object, the cotangent bundle. Definition 24.5.1. The cotangent bundle of M is the collection of pairs T ∗M := (p, η) p M, η T ∗ . | ∈ ∈ p We have a natural projection (to the “first factor”)

π : T ∗M M, → so that we obtain the usual triple (E,B,π)= T ∗M,M,π). Given a coordinate chart (u1,...,un) in M, we obtain a basis du1,...,dun) at every point in the chart. Every cotangent vector ω at a point with coordinates (u1,...,un) is a linear combination ω = ω dui = ω du1 + ω du2 + + ω dun, i 1 2 ··· n and therefore we can coordinatize T ∗M by the 2n-tuple 1 n (u ,...,u ,ω1,...,ωn).

With respect to this coordinate chart in T ∗M, the projection π : T ∗M M “forgets” the last n coordinates. → Definition 24.5.2. A differential 1-form ω on M is by definition a section of the cotangent bundle of M. Thus at every point p M, a differential form ω gives a linear form ∈ ωp : TpM R . → 194 24. BUNDLES

In coordinates, a differential 1-form ω is given by 1 n i ω = ωi(u ,...,u )du , where the ωi are functions of n variables.

24.6. Exterior derivative of a function

Given a smooth function f C∞(M), we define a differential 1- form, denoted df, on M, in terms∈ of the directional derivative by setting df(X)= Xf, where Xf is the directional derivative of the function f in the di- rection of the vector X. This formula applies at every point of M. Denoting ω = df, we see that

ωp(Xp)= Xpf R ∈ where Xp is the value of the vector field X at the point p. Theorem 24.6.1. In coordinates (u1,...,un), we have ∂f df = dui, (24.6.1) ∂ui with Einstein summation convention. The proof is a restatement of chain rule in several variables. Definition 24.6.2. The collection of all differential 1-forms on M is denoted Ω1(M). Note that Ω1(M) is an infinite-dimensional vector space. The space Ω1(M) is by definition the space of sections of the cotangent bundle of M. In this notation, the exterior derivative is an R-linear map 1 d : C∞(M) Ω (M) → 1 from the space C∞(M) of functions on M to the space Ω (M) of 1- forms on M.

24.7. Extending gradient and exterior derivative Given a function f on an inner product space V , one defines the gradient f of f by setting ∇ f,X = Xf, h∇ i for every vector X. Note that we have a canonical isomorphism ♭ ♭ : V V ∗, X X . → 7→ 24.8. PLAN FOR BUILDING DE RHAM COHOMOLOGY 195

The isomorphism ♭ is defined by setting X♭(Y )= X,Y , h i for every vector Y V . The inverse is denoted ∈ ♯ ♯ : V ∗ V, ω ω , → 7→ for every 1-form ω. If the coordinates are chosen in such a way that ∂ ∂ the basis ( ∂u1 ,..., ∂un ) is an orthonormal basis of TpM, then ∂ ♭ = dui, ∂ui   and ∂ (dui)♯ = . ∂ui Theorem 24.7.1. The exterior derivative of a function f at p M is related to the gradient of f at p M as follows: ∈ ∈ f = ♯(df), ∇ at every point p of the surface M. The proof is immediate in an orthonormal basis.

24.8. Plan for building de Rham cohomology The construction of de Rham cohomology of a manifold involves several stages. It is helpful to keep these stages in mind as we build up the relevant mathematical machinery step-by-step. We start with linear algebra and end with linear algebra. (1) Linear-algebraic stage: from a vector space V to its exterior algebra ΛV = ΛkV. ⊕k (2) Topological stage: from a smooth manifold M to its tangent bundle T M and its cotangent bundle T ∗M. (3) A vector field as a section of T M. (4) A section of T ∗M is a differential 1-form on M. k (5) The exterior bundle ΛM = kΛ (T ∗M) is the bundle of ex- terior algebras parametrized⊕ by points of M. This is a finite- dimensional manifold. k (6) A section of Λ (T ∗M) is a differential k-form on M. k k (7) The space Ω (M) of sections of Λ (T ∗M) is an infinite-dimensional vector space: we are back to linear algebra. (8) The exterior derivative d turns Ω∗(M) into a differential graded algebra. 196 24. BUNDLES

24.9. Exterior algebra, area, determinant, rank The exterior product or wedge product of vectors is an algebraic construction generalizing certain features of the cross product to higher dimensions. Like the cross product, and the scalar triple product, the exterior product of vectors is used in Euclidean geometry to study areas, vol- umes, and their higher-dimensional analogs. In linear algebra, the exterior product provides a basis-independent manner for describing the determinant and the minors of a linear trans- formation, and is fundamentally related to ideas of rank and linear independence. The exterior algebra over a vector space V is generated by an oper- ation called exterior product. It is a key building block in the definition of the algebra of differential forms.

24.10. Formal definition, alternating property Formally, the exterior algebra is a certain unital associative (chok kibutz) algebra over the field K, including V as a subspace. It is denoted by Λ(V ) and its multiplication is also known as the wedge product or the exterior product and is written as

∧ Definition 24.10.1. The wedge product is an associative and bi- linear operation: : Λ(V ) Λ(V ) Λ(V ), ∧ × → sending (α, β) α β, 7→ ∧ with the essential feature that it is alternating on V : v v = 0 for all v V. (24.10.1) ∧ ∈ This implies in particular u v = v u (24.10.2) ∧ − ∧ for all u, v V , and ∈ v v v = 0 (24.10.3) 1 ∧ 2 ∧···∧ k whenever v ,...,v V are linearly dependent. 1 k ∈ 24.12. AREAS IN THE PLANE 197

Remark 24.10.2. Unlike associativity and bilinearity which are re- quired for all elements of the algebra Λ(V ), the last three properties are imposed only on the elements of the subspace V of the algebra. The defining property (24.10.1) and property (24.10.3) are equivalent; properties (24.10.1) and (24.10.2) are equivalent unless the character- istic of K is two.

24.11. Example: rank 1 The exterior algebra Λ(R) over the vector space V = R1 is 2- dimensional, and isomorphic to the subalgebra of the algebra of 2 by 2 matrices consisting of uppertriangular matrices with a pair of identical eigenvalues, i.e., matrices x y 0 x   where x, y R. As a vector space, the algebra is the sum Λ(R) = ∈ R 1 R e1, where e1 is the matrix · ⊕ · 0 1 e = . 1 0 0   2 The matrix e1 is nilpotent: e1 = 0, as required by (24.10.1). Here V = R e1.

24.12. Areas in the plane The purpose of this section is to motivate the skewness of the ex- terior product on vectors in V . The area of a parallelogram can be expressed in terms of the deter- minant of the matrix of coordinates of two of its vertices. The Cartesian plane R2 is a vector space equipped with a basis. The basis consists of a pair of unit vectors

e1 = (1, 0), e2 = (0, 1). Suppose that

v = v1e1 + v2e2, w = w1e1 + w2e2 are a pair of given vectors in R2, written in components. There is a unique parallelogram having v and w as two of its sides. The area of this parallelogram is given by the standard determinant formula: A = det v w = v1w2 v2w 1 . | − | 198 24. BUNDLES

Consider now the exterior product of v and w and exploit the properties stipulated above: v w =(v e + v e ) (w e + w e ) ∧ 1 1 2 2 ∧ 1 1 2 2 = v w e e + v w e e + v w e e + v w e e , 1 1 1 ∧ 1 1 2 1 ∧ 2 2 1 2 ∧ 1 2 2 2 ∧ 2 so that v w =(v w v w )e e , ∧ 1 2 − 2 1 1 ∧ 2 where the first step uses the distributive law for the wedge product, and the last uses the fact that the wedge product is alternating. Note that the coefficient in this last expression is precisely the determinant of the matrix [v w]. The fact that this may be positive or negative has the intuitive meaning that v and w may be oriented in a counterclockwise or clockwise sense as the vertices of the parallelogram they define. Such an area is called the signed area of the parallelogram: the absolute value of the signed area is the ordinary area, and the sign determines its orientation. The fact that this coefficient is the signed area is not an accident. In fact, it is relatively easy to see that the exterior product should be related to the signed area if one tries to axiomatize this area as an algebraic construct. In detail, if A(v,w) denotes the signed area of the parallelogram determined by the pair of vectors v and w, then A must satisfy the following properties: (1) A(av, bw) = abA(v,w) for any real numbers a and b, since rescaling either of the sides rescales the area by the same amount (and reversing the direction of one of the sides reverses the orientation of the parallelogram). (2) A(v, v) = 0, since the area of the degenerate (mathemat- ics)—degenerate parallelogram determined by v (i.e., a line segment) is zero. (3) A(w, v)= A(v,w), since interchanging the roles of v and w reverses the− orientation of the parallelogram. (4) A(v + aw,w) = A(v,w), since adding a multiple of w to v affects neither the base nor the height of the parallelogram and consequently preserves its area. (5) A(e1, e2) = 1, since the area of the unit square is one. With the exception of the last property, the wedge product satis- fies the same formal properties as the area. In a certain sense, the wedge product generalizes the final property by allowing the area of a parallelogram to be compared to that of any “standard” chosen paral- lelogram. In other words, the exterior product in two-dimensions is a 24.13. VECTOR AND TRIPLE PRODUCTS 199 basis-independent formulation of area. This axiomatization of areas is due to Leopold Kronecker and Karl Weierstrass.

24.13. Vector and triple products For vectors in R3, the exterior algebra is closely related to the vector product and triple product. Using the standard basis e1, e2, e3 , the wedge product of a pair of vectors { }

u = u1e1 + u2e2 + u3e3 and

v = v1e1 + v2e2 + v3e3 is u v =(u v u v )(e e )+(u v u v )(e e )+(u v u v )(e e ), ∧ 1 2− 2 1 1∧ 2 1 3− 3 1 1∧ 3 2 3− 3 2 2∧ 3 where e1 e2, e1 e3, e2 e3 is the basis for the three-dimensional { ∧3 ∧ ∧ } space Λ2(R ). This imitates the usual definition of the vector product of vectors in three dimensions. Bringing in a third vector

w = w1e1 + w2e2 + w3e3, the wedge product of three vectors is u v w =(u v w +u v w +u v w u v w u v w u v w )(e e e ), ∧ ∧ 1 2 3 2 3 1 3 1 2− 1 3 2− 2 1 3− 3 2 1 1∧ 2∧ 3 3 3 where e1 e2 e3 is the basis vector for the one-dimensional space Λ (R ). This imitates∧ ∧ the usual definition of the triple product. The vector product and triple product in three dimensions each admit both (1) geometric and (2) algebraic interpretations. The vector product u v can be interpreted as a vector which is perpendicular to both u and×v and whose magnitude is equal to the area of the parallelogram determined by the two vectors. It can also be interpreted as the vector consisting of the minors of the matrix with columns u and v. The triple product of u, v, and w is geometrically a (signed) volume. Algebraically, it is the determinant of the matrix with columns u, v, and w. The exterior product in three- dimensions allows for similar interpretations. In fact, in the presence of a positively oriented orthonormal basis, the exterior product generalizes these notions to higher dimensions. 200 24. BUNDLES

24.14. Anticommutativity of the wedge product Theorem 24.14.1. The wedge product is anticommutative on ele- ments of V .

Proof. Let x, y V . Then ∈ 0=(x + y) (x + y)= x x + x y + y x + y y = x y + y x. ∧ ∧ ∧ ∧ ∧ ∧ ∧ Hence x y = y x. ∧ − ∧ 

More generally, if x1, x2, ..., xk are elements of V , and σ is a permu- tation of the integers [1, ..., k], then

x x x = sgn(σ)x x x , σ(1) ∧ σ(2) ∧···∧ σ(k) 1 ∧ 2 ∧···∧ k where sgn(σ) is the signature of a permutation—signature of the per- mutation σ. A proof of this can be found in greater generality in Bourbaki (1989).

24.15. The k-exterior power; simple (decomposable) multivectors The k-th exterior power of V , denoted Λk(V ), is the vector subspace of Λ(V ) spanned by elements of the form

x x x , x V, i =1, 2, . . . , k. 1 ∧ 2 ∧···∧ k i ∈ Definition 24.15.1. If α Λk(V ), then α is said to be a k- multivector. If, furthermore, α∈can be expressed as a wedge product of k elements of V , then α is said to be decomposable, or simple.

Although decomposable multivectors span Λk(V ), not every ele- ment of Λk(V ) is decomposable.

Example 24.15.2. In R4, the following 2-multivector is not decom- posable: α = e e + e e . (24.15.1) 1 ∧ 2 3 ∧ 4 (This is in fact a symplectic form, since α α = 0. See S. Sternberg [?, III.6]. ∧ 6 24.17. RANK OF A MULTIVECTOR 201

24.16. Basis and dimension of exterior algebra

Theorem 24.16.1. If the dimension of V is n and e1, ..., en is a basis of V , then the set e e e 1 i < i < < i n (24.16.1) { i1 ∧ i2 ∧···∧ ik | ≤ 1 2 ··· k ≤ } is a basis for Λk(V ). Proof. Given any wedge product of the form v v 1 ∧···∧ k then every vector vj can be written as a linear combination of the basis vectors ei; using the bilinearity of the wedge product, this can be expanded to a linear combination of wedge products of those basis vectors. Any wedge product in which the same basis vector appears more than once is zero; any wedge product in which the basis vectors do not appear in the proper order can be reordered, changing the sign whenever two basis vectors change places.  In general, the resulting coefficients of the basis k-vectors can be computed as the minors of the matrix that describes the vectors vj in terms of the basis ei. By counting the basis elements, the dimension of Λk(V ) is the bi- nomial coefficient n n! = . k k!(n 1)!   − In particular, Λk(V )=0 for k > n. Theorem 24.16.2. The dimension of Λ(V ) equals 2n. Proof. Any element of the exterior algebra can be written as a sum of multivectors. Hence, as a vector space the exterior algebra is a direct sum Λ(V )=Λ0(V ) Λ1(V ) Λ2(V ) Λn(V ) ⊕ ⊕ ⊕···⊕ (where by convention Λ0(V ) = K and Λ1(V ) = V ), and therefore its dimension is equal to the sum of the binomial coefficients, which is 2n. 

24.17. Rank of a multivector If α Λk(V ), then it is possible to express α as a linear combination of decomposable∈ multivectors:

α = α(1) + α(2) + + α(s) ··· 202 24. BUNDLES

where each α(i) is decomposable, say α(i) = α(i) α(i), i =1, 2, . . . , s. 1 ∧···∧ k Definition 24.17.1. The rank of the multivector α is the minimal number of decomposable multivectors in such an expansion of α. Rank is particularly important in the study of 2-multivectors. Theorem 24.17.2. The rank of a 2-multivector α equals half the rank of the matrix of coefficients of α in a basis.

Thus if ei is a basis for V, then α can be expressed uniquely as α = a e e ij i ∧ j i,j X where a = a (the matrix of coefficients is skew-symmetric). The ij − ji rank of α agrees with the rank of the matrix (aij). For example, the symplectic from (24.15.1) has rank 2, whereas the rank of its matrix of coefficients is 4. Theorem 24.17.3. In characteristic 0, the 2-multivector α has rank p if and only if the p-fold product satisfies α α =0 ∧···∧ 6 p and | {z } α α =0. ∧···∧ p+1 | {z } CHAPTER 25

Multilinear forms; exterior differential complex

25.1. Formal construction of the exterior algebra The tensor product V W of two vector spaces V and W over a field K is defined as follows.⊗ Consider the set of ordered pairs in the Cartesian product V W . For the purposes of this construction, regard this Cartesian product× as a set rather than a vector space. Definition 25.1.1. The free vector space F on V W is the vector space in which the elements of V W are a basis. × × Thus, F (V W ) is the collection of all finite linear combinations of the form × F (V W )= α e α K, (v ,w ) V W , × i (vi,wi) i ∈ i i ∈ × where we have used the symbol e( v,w) to emphasize that these are taken to be linearly independent by definition for distinct pairs (v,w) V W. ∈ × The tensor product arises by defining the following equivalence re- lations in F (V W ): × e e + e (v1+v2,w) ∼ (v1,w) (v2,w) e e + e (v,w1+w2) ∼ (v,w1) (v,w2) ce e (v,w) ∼ (cv,w) ce e (v,w) ∼ (v,cw) where v1, v2 V , w1,w2 W , and c K. Denote by R F (V W ) the space generated∈ by these∈ four equivalence∈ relations, in⊆ other words,× by the differences

e(v1+v2,w) (e(v1,w) + e(v2,w)) e − (e + e ) (v,w1+w2) − (v,w1) (v,w2) ce(v,w) e(cv,w) ce − e (v,w) − (v,cw) Definition 25.1.2. The tensor product of the two vector spaces V and W is then the quotient space V W = F (V W )/R. ⊗ × 203 204 25. MULTILINEAR FORMS; EXTERIOR DIFFERENTIAL COMPLEX

One readily checks with respect to a basis that the dimensions of V and W multiply: dim(V W ) = dim V dim W. ⊗ Definition 25.1.3. The exterior product V W of V and W is the the vector space obtained as the quotient of∧V W by the idea generated by all the sums v w + w v. ⊗ ⊗ ⊗ The image of v w under the projection V W V W is ⊗ ⊗ → ∧ denoted v w. Denote by (ei) a standard basis for V . One readily checks that∧e e , where i < j, form a basis for i ∧ j V V =Λ2(V ). ∧ Similar remarks apply to the higher exterior powers Λk(V ). 25.2. Exterior differential complex

Associating to every cotangent space Tp∗ its exterior algebra Λ(Tp∗), we obtain the exterior bundle

ΛM := Λ(T ∗M). k Similarly putting together the k-exterior powers Λ Tp∗, we obtain k k the subbundle Λ M := Λ (T ∗M). Definition 25.2.1. A differential k-form is a section of the ex- terior bundle ΛkM of k-multivectors. The space of such sections is denoted Ωk(X). Recall that we have an exterior derivative d defined on functions f by ∂f df = dui, (25.2.1) ∂ui with Einstein summation convention, see (24.6.1). A similar formula defines the exterior derivative for an arbitrary k-form. Theorem 25.2.2. The exterior derivative d gives rise to an exterior differential complex 1 2 n 0 C∞(M) Ω (M) Ω (M) ... Ω (M) 0, → → → → → → satisfying d2 = d d =0. ◦ For instance, the differential d : Ω1(M) Ω2(M) is defined on a 1-form fdu by → d(fdu)= df du. ∧ Remark 25.2.3. The exterior differential complex leads to the def- inition of de Rham cohomology of M, see Section 28.9. 25.3. ALTERNATING MULTILINEAR FORMS 205

25.3. Alternating multilinear forms Let V be a vector space over a field K. Let k N. ∈ Definition 25.3.1. An alternating multilinear function f : V k K (25.3.1) → is called an alternating multilinear form on V . The set of all alternating k-multilinear forms is is a vector space, as the sum of two such maps, or the multiplication of such a map with a scalar, is again alternating. 1 Note that for k = 1, we obtain the space V ∗ =Λ (V ∗). In general, we have the following duality. Theorem 25.3.2. If V has finite dimension n, then the space of k alternating multilinear forms can be identified with Λ (V ∗), where V ∗ denotes the dual space of V . In particular, the dimension of the space of anti-symmetric maps k n from V to K is the binomial coefficient k . Proof. Let us make such an identification  explicit. Consider a k- tuple x =(x ,...,x ) V k. 1 k ∈ j We will denote by y , elements of the dual space V ∗. Given a decomposable element 1 k k y = y ... y Λ V ∗, ∧ ∧ ∈ we would like to define a multilinear map f = fy as in (25.3.1), associ- ated with y. We define it by setting

1 k fy(x1,...,xk)= sgn(σ) y (xσ(1)) ...y (xσ(k)). (25.3.2) σ S X∈ k In other words, i fy(x) = det(y (xj)) 1 is a k k-determinant. Note that we do not multiply by k! . In this we follow× S. Sternberg [?, p. 20]. 

k k Theorem 25.3.3. The space Λ (V ∗) is naturally dual to Λ (V ). The duality is given on simple elements of Λk(V ) by the pairing φ , v ... v = φ(v ,...,v ). h 1 ∧ ∧ ki 1 k The case k = 2 will be examined in detail in the next section. 206 25. MULTILINEAR FORMS; EXTERIOR DIFFERENTIAL COMPLEX

25.4. Case of 2-forms 1 2 Let y ,y V ∗ be 1-forms. Consider the decomposable (simple) exterior 2-form∈ 1 2 2 ω = y y Λ (V ∗). ∧ ∈ Then ω defines an alternating bilinear function f : V 2 K ω → defined as follows. Let u, v V . Following (25.3.2), we set ∈ y1(u) y1(v) f (u, v)= y1(u)y2(v) y1(v)y2(u) = det ω − y2(u) y2(v)   i i where y (u) is the evaluation of the covector y V ∗ on the vector u V . ∈ ∈ Example 25.4.1. In the (x, y)-plane, the form ω = dx dy defines ∧ a bilinear function fω whose geometric meaning is the signed area of the parallelogram spanned by the pair of vectors u, v.

Proposition 25.4.2. The value of fω on the pair u, v in fact de- pends only on its image ξ = u v in Λ2(V ). ∧ Proof. The proof is immediate from the interpretation of fω(u, v) as a generalized signed area of the parallelogram spanned by u and v. 2 2 Namely, fω is proportional to the standard area form dx dy since Λ (R )∗ is 1-dimensional, and dx dy calculates the area A if the∧ parallelogram ∧ 1 2 2 2  spanned by u and v, where u v = A e e Λ (R ).  ∧ ∧ ∈ Remark 25.4.3. We will sometimes write ω(ξ) in place of fω(u, v), where ξ = u v. Thus by definition, ∧ ω(ξ)= fω(u, v).

25.5. Norm on 1-forms Given a norm on a vector space V , there is a natural norm on kk the dual space V ∗, defined as follows. We will denote the new norm ∗ for the purposes of this section. kk

Definition 25.5.1. The norm ∗ on V ∗ is defined for y V ∗ by setting kk ∈ y ∗ = sup y(x) x V, x 1 . k k { | ∈ k k ≤ } In other words, we calculate the dual norm of the covector y by maximizing its value over vectors x V of norm at most 1. ∈ 25.6. POLAR COORDINATES AND DUAL NORMS 207

It is clear that the inequality x 1 can be replaced by equality in this definition: k k ≤

y ∗ = sup y(x) x V, x =1 , k k { | ∈ k k } i.e. the norm of y V ∗ can be calculated over the unit vectors x V . ∈ ∈ Example 25.5.2. If ∂ ∂ , ∂x ∂y   is an orthonormal basis for the Euclidean plane V = R2, then the dual space V ∗ admits an orthonormal basis (dx,dy)

∂ ∂ ∂ which is dual to the basis , . Indeed, dx( ) = 1 and dx ∗ = 1. ∂x ∂y ∂x k k   25.6. Polar coordinates and dual norms The Euclidean metric in the plane E defines a norm on the tangent plane by the usual identification of a space and its tangent space. Lemma 25.6.1. The 1-form xdy ydx has norm r. − Proof. This is immediate from the fact that dx and dy are or- thonormal.  Consider the polar coordinates (r, θ) in the plane. Note that θ is undefined at the origin. Then (dr, dθ) is a basis for the cotangent plane at every point of E 0 . However, the basis is not orthonormal. \{ } y Since tan θ = x , we have x tan θ = y, hence dθ/dx tan θ + x = dy/dx, cos2 θ so that dθ tan θ dx + x = dy. cos2 θ Multiplying by r cos θ, we obtain x r sin θ dx + rdθ = r cos θdy, cos θ or ydx + r2dθ = xdy. 208 25. MULTILINEAR FORMS; EXTERIOR DIFFERENTIAL COMPLEX

We then obtain r2dθ = xdy ydx. By lemma 25.6.1, we obtain − 1 dθ ∗ = , k k r so that it is the covector rdθ that has unit norm. It follows that in the (r, θ) coordinates, we obtain an orthonormal basis for the cotangent space consisting of the pair of covectors (dr, rdθ). 25.7. Dual bases and dual lattices in a Euclidean space The discussion is similar to the situation with a pair of dual vector spaces treated in Section 19.5. Thus, let V be a Euclidean vector space equipped with a scalar product (inner product), denoted , . Given h i a basis (xi) for V , there exists a unique basis (yj) satisfying x ,y = δ , h i ji ij n where δij is the Kronecker delta function. We will view R as equipped with its standard dot product, denoted x, y . h i n Definition 25.7.1. Given a lattice L R , the dual lattice L∗ n ⊆ ⊆ R is the lattice n L∗ = y R x L, x, y Z . { ∈ | ∀ ∈ h i ∈ } Thus the pairing between a point in a lattice and a point in its dual is by definition always an integer.

Theorem 25.7.2. Consider a Z-basis (xi) for L. Consider the ba- sis (yj) dual to the basis (xi). Then (yj) is a Z-basis for the dual lattice L∗. Definition 25.7.3. Given a lattice L Rn with its Euclidean norm , denote by λ (L) the least length of⊆ a nonzero vector in L: || 1 λ (L) = min v v L 0 . 1 | | ∈ \{ } We now consider the rank 1 case.

Proposition 25.7.4. Let L R be a lattice and L∗ R the dual lattice of L. Then ⊆ ⊆ 1 λ1(L∗)= . (25.7.1) λ1(L) Proof. Given a lattice L R, denote by α > 0 its generator, so ⊆ that L = Z α. For an element y R to pair integrally with α, it must 1 ∈ be a multiple of β = α . Thus L∗ = Z β 25.7. DUAL BASES AND DUAL LATTICES IN A EUCLIDEAN SPACE 209

1 and λ1(L∗)= β = α .  Remark 25.7.5. Note that the reciprocal relation (25.7.1) does not hold in general for rank greater than 1. Already in rank 2, the product λ1(L∗)λ1(L) can be greater than 1. For the Eisenstein lattice LE C ⊆ spanned by the cube roots of unity, we obtain the value λ1(LE∗ )λ1(LE)= 2/√3.

CHAPTER 26

Comass norm, Hermitian product, symplectic form

There are two distinct natural norms on k-multivectors: the Eu- cldean norm and the comass norm. First we complete the material on exterior algebra.

2 2 n 26.1. Λ (V ) is isomorphic to the standard model Λ0(R ) We let 2 n Λ0(R ) = Span ei ej :1 i < j n , 2 n n { ∧ ≤ ≤ } so that dim Λ0(R )= 2 . Let V be an n-dimensional vector space. As in Section 25.1, we set  F (V ) = Span e : v,w V , (v,w) ∈ and define the equivalence relation as in Section 25.1, with the ad- ditional relation stipulating that ∼ e e , (26.1.1) v,w ∼ − w,v and define Λ2(V ) := F (V )/ . ∼ Lemma 26.1.1. We have dim Λ2(V ) n . ≤ 2 Proof. Choose a basis a1,...,an for V. Each v V decomposes i i ∈ as a sum v = v ai with v R. Then ∈ i j i j i j e = e i j v w e v w e + v w e . (v,w) (v ai,w aj ) ∼ (ai,aj ) ∼ (ai,aj ) (ai,aj ) ij X X By 26.1.1, we have e viwje viwje . (v,w) ∼ (ai,aj ) − (aj ,ai) i

2 2 n Theorem 26.1.2. Λ (V ) is isomorphic to Λ0(R ).

211 212 26. COMASS NORM, HERMITIAN PRODUCT, SYMPLECTIC FORM

Proof. We define a map ζ :Λ2(V ) Λ2(Rn) → 0 by setting ζ(e )= (viwj vjwi)e e Λ2(V ). (v,w) − i ∧ j ∈ 0 i

26.2. Euclidean norm on k-multivectors and k-forms A natural Euclidean norm can be defined on the space Λk(Rn) of k- multivectors by declaring the basis e e e 1 i < i < < i n (26.2.1) { i1 ∧ i2 ∧···∧ ik | ≤ 1 2 ··· k ≤ } of (24.16.1) to be orthonormal. Then each simple k-form e ... e i1 ∧ ∧ ik has unit Euclidean norm: e ... e =1, | i1 ∧ ∧ ik | and they are all mutually perpendicular. Since we have identified the space of k-multivectors with the space of k-linear forms, this gives a Euclidean norm on the k-linear forms, as well.

Example 26.2.1. For the symplectic form α = e1 e2 + e3 e4 4 ∧ ∧ on V = R , we have α = √2. | | In the next section, we will define a different norm on the space of forms.

26.3. Comass norm of a form Let V be an inner product space, for instance Rn. Recall that an exterior form is called simple (or decomposable) if it can be expressed as a wedge product of 1-forms. Lemma 26.3.1. The comass norm for a simple k-form coincides with the natural Euclidean norm on k-forms. 26.4. SYMPLECTIC FORM FROM A COMPLEX VIEWPOINT 213

Proof. A k-tuple of 1-forms span a k-blade, or a k-dimensional subspace P V . The k-tuple can be replaced by an orthonormal k- tuple forming⊆ a basis for P . This can be done by means of a volume- preserving transformation, for instance by applying the Gram-Schmidt process.  In general, the comass is defined as follows. Recall that a k- multivector of V ∗ can be viewed as a k-linear form on V via (25.3.2). Definition 26.3.2. The comass of a k-linear form is its maximal value on a k-tuple of unit vectors. In formulas, the comass norm of a k-linear form ω on Rn is n ω = max ω(e1,...,ek) ei R , ei =1 for 1 i k . (26.3.1) k k { | ∈ | | ≤ ≤ } Here e denotes the Euclidean norm on Rn. | | Example 26.3.3. For the symplectic form α = e1 e2 + e3 e4 4 ∧ ∧ on V = R , we have α = √2, while α = 1. In other words, its comass norm is smaller| than| its Euclideank k norm: α < α . k k | | This will be proved in the context of Wirtinger’s inequality. Lemma 26.3.4. We have ω ω with equality if and only if ω is simple. k k≤| | Indeed, the Euclidean norm ω of ω is defined by formula similar to (26.3.1), where the maximum is| taken| over all k-forms in Λk(V ) and not merely the simple (decomposable) ones (by the finite-dimensional version of the Riesz representation). This proves the inequality. A characterisation of the boundary case of equality is postponed.

26.4. Symplectic form from a complex viewpoint Example 26.4.1. The 2-form dx dy viewed as a bilinear form 2 ∧ on R is represented by the antisymmetric matrix 0 1 . 1 0 −  ν Consider the ν-dimensional complex vector space C . Denote by Zj the j-th coordinate ν Zj : C C, → with the usual decomposition

Zj = Xj + iYj 214 26. COMASS NORM, HERMITIAN PRODUCT, SYMPLECTIC FORM into the real and imaginary parts. The complex conjugate Z¯j is by definition Z¯ = X iY , j j − j for all j =1,...,ν. Then the exterior product Zj Z¯j can be expressed as follows: ∧ 2 Z Z¯ =(X + iY ) (X iY )= 2iX Y = X Y . j ∧ j j j ∧ j − j − j ∧ j i j ∧ j Thus i Z Z¯ = X Y . 2 j ∧ j j ∧ j

Example 26.4.2 (Symplectic form). Let Z1,...,Zν be the coordi- nate functions in Cν. Then the standard symplectic 2-form (24.15.1), 2 ν denoted A ((C )∗), is given by ∈ ν V A = i Z Z¯ . (26.4.1) 2 j ∧ j j=1 X Remark 26.4.3. A more traditional way of writing the form is in terms of the real basis, by the expression A = X Y = X Y + ... + X Y . j ∧ j 1 ∧ 1 ν ∧ ν j X We have preferred the form (26.4.1) emphasizing the complex structure because of future applications. Example 26.4.4. The corresponding skew-symmetric matrix is a block diagonal matrix with ν diagonal blocks of the form 0 1 1 0 −  corresponding to each of the complex coordinates.

26.5. Hermitian product Let V be a ν-dimensional vector space over C. Let H = H(v,w) be a Hermitian product on V . Example 26.5.1. The standard Hermitian product on V = Cν is given by ν vjwj (26.5.1) j=1 X where v =(vj) and w =(wj). 26.6. ORTHOGONAL DIAGONALISATION 215

Remark 26.5.2. Here we adopt the convention that a Hermitian product H is complex linear in the second variable. There are varying conventions in textbooks regarding this issue. Federer’s book [?] uses the convention (26.5.1). Recall that we have H(v,w)= H(w, v). (26.5.2) Consider its real part v w, and imaginary part A = A(v,w): · H(v,w)= v w + iA(v,w). (26.5.3) · Then v w is a scalar product on V viewed as a 2ν-dimensional real vector space.· Lemma 26.5.3. The imaginary part A is skew-symmetric and there- fore can be viewed as an element in A 2 V , the second exterior power of V . ∈ V Proof. We have A(w, v)= (H(w, v)) = H(v,w) ℑ ℑ by (26.5.2). Therefore A(w, v)= (H(v,w)) = A(v,w).  −ℑ − Lemma 26.5.4. The comass of the standard symplectic form A sat- isfies A =1. k k Proof. It suffices to evaluate A on a pair ξ = v w (see Re- mark 25.4.3), where v and w are orthonormal. Thus∧v w = 0. We therefore have H(v,w) = iA(v,w). Since A(ξ) is real,· the pair- ing ξ, A = A(ξ) can be evaluated as follows: h i A(ξ)= A(v,w)= H(iv,w)= (H(iv,w))=(iv) w 1 (26.5.4) ℜ · ≤ by the Cauchy-Schwarz inequality; equality holds if and only if one has iv = w.  26.6. Orthogonal diagonalisation In this section, we recall some standard material from linear algebra. Theorem 26.6.1. Every skew-symmetric real matrix can be orthog- onally diagonalized into 2 by 2 blocks. Alternatively, the theorem can be formulated as follows (see Theo- rem 26.6.3). Definition 26.6.2. An endomorphism f of Rn is anti-selfadjoint if f(x),y = x, f(y) . h i −h i 216 26. COMASS NORM, HERMITIAN PRODUCT, SYMPLECTIC FORM

Theorem 26.6.3. Given an anti-selfadjoint endomorphism f of Rn, n there exists an orthogonal decomposition of R into subspaces Vj invari- ant under f, where dim V 2 for all j. j ≤ Recall that the rank of a 2-form α is the least possible number of simple (decomposable) 2-forms α(j) occurring in a presentation α = α(j). j X n n Corollary 26.6.4. The rank of a 2-form on V = R is at most 2 . Proof. A 2-form is given by a skew-symmetric matrix A, which can be thought of as an anti-selfadjoint endomorphism fA. Applying the diagonalisation theorem, we obtain an orthonormal basis e1,...,en with respect to which the endomorphism decomposes into µ blocks of size 2 by 2, with possibly an additional 1-dimensional block on which the endomorphism vanishes. With respect to this basis, the 2-form will have the following form. We have an orthonormal set ω ,...,ω V ∗ 1 2µ ∈ and nonnegative numbers λ1,...,λµ, so that µ

fA = λj (ω2j 1 ω2j) . (26.6.1) − ∧ j=1 X Therefore the rank of A is at most µ.  With respect to the new basis, the matrix will consist of 2 by 2 blocks of the form 0 λj λ 0 − j  (or half of it, depending on normalisation). Remark 26.6.5. Some comments on the diagonalisation theorem: n n suppose that A R × is anti-symmetric. Its eigenvalues are all pure ∈ imaginary. Suppose that v = v1 + iv2 is an eigenvector (separated into real and imaginary parts) corresponding to eigenvalue iλ. Then Av = 2n 2n iλv is equivalent to Bv = 0, where B R × is the symmetric block 0 ∈ λI A v1 matrix − and v0 = is a column vector of length 2n. A λI v2 This formulation is the standard eigenvalue problem for the real 0 A v1 symmetric matrix , and each real solution v0 = gives a 2- A 0 v2 dimensional real A-invariant− subspace span v , v . Of course, the { 1 2} v2 eigenvector w0 = − gives the same subspace. There may be an issue v1 in case of multiple eigenvalues, but this can be resolved. CHAPTER 27

Wirtinger inequality, projective spaces

27.1. Wirtinger inequality We exploit the material presented earlier to prove Wirtinger’s in- equality for 2-forms. Then we proceed to the complex projective space. Recall that the standard symplectic form on V (where V is isomor- phic to Cν) is denoted k A Λ (V ∗) ∈ k where Λ (V ∗) is the space of all k-linear alternating forms on V (see Theorem 25.3.3). The form A can be thought of as the imaginary part of the standard Hermitian product as in (27.1.1) below. Following H. Federer [?, p. 40], we prove an optimal upper bound for the comass norm , cf. Definition 26.3.2, of the exterior powers of a 2-form. k k We will use the decomposition H(v,w)= v w + A(v,w) (27.1.1) · of a Hermitian inner product into the sum of a scalar product and the symplectic form, see (26.5.3), similar to the decomposition

zz¯ ′ =(x iy)(x′ + iy′)=(xx′ + yy′)+ i(xy′ x′y) − − (the convention of “barring” the first variable was discussed in Re- mark 26.5.2). Theorem 27.1.1 (Wirtinger inequality). Let V be a ν-dimensional complex vector space. Let µ 1. If ξ 2µ V and ξ is simple, then ≥ ∈ µ ξ, A µ! Vξ ; h i ≤ | | equality holds if and only if there exist elements v1,...,vµ V such that ∈ ξ = v (iv ) v (iv ). 1 ∧ 1 ∧···∧ µ ∧ µ Consequently, the comass satisfies Aµ = µ! k k Proof. The main idea is that in real dimension 2µ, every 2-form splits into a sum of at most µ orthogonal simple (decomposable) pieces. 217 218 27. WIRTINGER INEQUALITY, PROJECTIVE SPACES

We assume that ξ = 1, where is the natural Euclidean norm 2µ | | | | in V . We therefore need to show that ξ, Aµ µ! for all such ξ. The case µ = 1 was treated in Lemma 26.5.4.h i ≤ VIn the general case µ 1, we consider the 2µ dimensional sub- space T associated with the≥ simple 2µ-vector ξ. Let f : T V be the inclusion map (hachala, not hachlala). Recall that →

2 A (V ∗) ∈ is a 2-linear alternating form on V^.

Definition 27.1.2. The restriction of A to the subspace T V is denoted ⊆ 2 2 ( f)A (T ∗). ∧ ∈ Lemma 27.1.3. The restiction of^Aµ to T is the µ-th power of the restriction of A to T .

The proof is trivial. We orthogonally diagonalize the skew-symmetric 2- form ( 2f)A. Thus, we decompose the form ( 2f)A into 2 2 diagonal ∧ ∧ × blocks. Thus, we can choose dual orthonormal bases e1,...,e2µ of T 1 and ω1,...,ω2µ of (T ∗), and nonnegative numbers λ1,...,λµ, so that V µ 2 ( f)A = λj (ω2j 1 ω2j) . (27.1.2) ∧ − ∧ j=1 X By Lemma 26.5.4 we have A = 1 and therefore k k

λj = A(e2j 1, e2j) A = 1 (27.1.3) − ≤k k for each j =1,..., 2µ. Noting that ξ = ǫe1 e2µ with ǫ = 1, we use Lemma 27.1.3 to compute ∧···∧ ±

2µf Aµ = µ! λ ...λ ω ω ∧ 1 µ 1 ∧···∧ 2µ by Lemma 28.1.1 below. Therefore

ξ, Aµ = ǫµ! λ ...λ µ! (27.1.4) h i 1 µ ≤ since λ 1 from (27.1.3). Note that equality occurs in (27.1.4) if and j ≤ only if ǫ = 1 and λj = 1 for each j. Applying the proof of Lemma 26.5.4, we conclude that e2j = ie2j 1, for each j =1,..., 2µ.  − 27.2. WIRTINGER INEQUALITY FOR AN ARBITRARY 2-FORM 219

27.2. Wirtinger inequality for an arbitrary 2-form Recall from Section 20.2 that the polarisation formula allows one to reconstruct a symmetric bilinear form B, from the quadratic form Q(v)= B(v, v), at least if the characteristic is not 2: 1 B(v,w)= (Q(v + w) Q(v w)). (27.2.1) 4 − − This will be exploited in the proof of Proposition 27.2.1 below. There is a useful generalisation of Wirtinger’s inequality for arbi- trary 2-forms. Let T = Cµ.

Proposition 27.2.1. Given an orthonormal basis (ω1,...,ω2µ) of 1 T ∗, and nonzero real numbers λ1,...,λµ, the form µ V α = λj (ω2j 1 ω2j) (27.2.2) − ∧ j=1 X has comass α = max λ . k k j | j| Proof. We can assume without loss of generality that each λj is positive. This can be attained in one of two ways. One can permute the coordinates, by applying the transposition flipping ω2j 1 and ω2j, − so as to change the sign of λj. Alternatively, one can replace, say, ω2j by ω2j, which similarly changes the sign of λj. − 1/2 1/2 Setting ω2′ j 1 = λj ω2j 1 and ω2′ j = λj ω2j, we can write − − µ

α = ω2′ j 1 ω2′ j − ∧ j=1 X We now exploit the polarisation formula (27.2.1). Remark 27.2.2. The polarisation works on a real slice; one then complexifies to go from a real dot product to a Hermitian product.

Consider the hermitian inner product Hα on T obtained by polar- izing the quadratic form 2 2 2 2 1/2 1/2 ω2′ j 1 + ω2′ j = λj ω2j 1 + λj ω2j , − − j j X   X     where the squares are understood in the sense of the symmetric product (see (20.2.1)).

Lemma 27.2.3. The Hα-norm v α of each H-unit vector v is bounded 1/2 | | above by (maxj λj) : 1/2 v α max(λj) . | | ≤ j 220 27. WIRTINGER INEQUALITY, PROJECTIVE SPACES

The Hermitian inner product Hα is set up in such a way as to satisfy

Hα(v,w)= iα(v,w), or α(v,w)= iH (v,w) − α for each Hα-orthogonal pair v,w. Now let ζ be a unit 2-vector such that α = α(ζ), (27.2.3) || || Consider its 2-plane T V . In the 2-plane T , the unit disc of the scalar ⊆ product defined by Hα is an ellipse with respect to the scalar product of H. We will exploit its (perpendicular) principal axes. Let v,w be an orthonormal pair proportional to the principal axes. We can then write ζ = v w. The H-orthonormal pair v,w is also H -orthogonal ∧ α (though in general not Hα-orthonormal), as well. Then, as in (26.5.4), we have

α(ζ)= iHα(ζ)= Hα(iv,w) iv α w α max λj − ≤| | | | ≤ j by Lemma 27.2.3. This proves the lemma.  The lemma generalizes the the µ-th powers of a 2-form, giving a generalisation of Wirtinger’s inequality. Corollary 27.2.4. Every real alternating 2-form A satisfies the comass bound Aµ µ! A µ. (27.2.4) k k ≤ k k Proof. We may assume that A = 1. By Lemma 27.2.1, we k k have λj 1 for each j =1,...,ν. An inspection≤ of the proof Proposition 27.1.1 reveals that the or- thogonal diagonalisation argument, cf. (27.1.3), applies to an arbi- trary 2-form A with comass A = 1.  k k 27.3. Real projective space The 0-dimensional sphere S0 R1 consists of two points, 1. We can therefore represent it as a disjoint⊆ union ± S0 = e0 a(e)0, ∪ where a(v) = v is the antipodal map. The 1-dimensional sphere 2 − S1 R can similarly be partitioned into a union ⊆ S1 = e0 a(e0) e1 a(e1), ∪ ∪ ∪ and similarly Sn = e0 a(e0) e1 a(e1) ... en a(en). ∪ ∪ ∪ ∪ ∪ ∪ 1 27.4. COMPLEX PROJECTIVE LINE CP 221 The quotient of Sn by the antipodal map gives a decomposition RPn = e0 e1 ... en. (27.3.1) ∪ ∪ ∪ Thus, the real projective space admits a cell decomposition with a single cell in each dimension. In the context of the de Rham cohomology, we will be interested in a similar decomposition of complex projective space.

27.4. Complex projective line CP1 In projective geometry, one starts with an arbitrary field K. The familiar construction of “adding a point at infinity” then produces the projective line {∞} KP1 = K ∪ {∞} over K. When the field K = C is that of the complex numbers, we thereby obtain the compactification C of C, namely the Riemann sphere S2: ∪{∞} CP1 = C = S2. ∪ {∞} The familiar stereographic projection provides an identification of the complement of a point in S2 and the complex numbers C.

CHAPTER 28

Complex projective spaces, de Rham cohomology

28.1. A clarification A havhara. We used a combinatorial result in the proof of Wirtinger’s inequality. µ µ Lemma 28.1.1. In j=1 λjxj with commuting variables xj, the coefficient of x1x2 xµ is  ··· P λ λ µ! 1 ··· µ Proof. The product of µ parentheses (λ x + λ x + ... + λ x )(λ x + λ x + ... + λ x ) (λ x + λ x + ... + λ x ) 1 1 2 2 µ µ 1 1 2 2 µ µ ··· 1 1 2 2 µ µ can be analyzed combinatorially as follows: we have µ possibilities of choosing the variable x1 out of the µ parenthetical expressions. Out of the remaining µ 1 parenthetical expressions, we have as many − possibilities for choosing x2, etc., resulting in µ! possibilities.  In the calculation of the comass of the powers of the symplectic form, we applied this combinatorial fact to the variables xj given by the 2-forms

xj = ω2j 1 ω2j. − ∧ 28.2. Complex projective space CPn Example 28.2.1. More generally, we have n n n 1 CP = C CP − , ∪ n 1 where CP − is thought of as the hyperplane at infinity. Complex projective n-space CPn is a complex manifold that may be described by n + 1 complex coordinates as n+1 (z0, z1,...,zn) C , (z1, z2,...,zn) = (0, 0,..., 0) ∈ 6 where the tuples differing by an overall rescaling are identified:

(z0, z1,...,zn) (λz0, λz1,...,λzn); λ C, λ =0. (28.2.1) ≡ ∈ 6 223 224 28. COMPLEX PROJECTIVE SPACES, DE RHAM COHOMOLOGY

These are homogeneous coordinates in the traditional sense of projec- tive geometry. A set of homogeneous coordinates of a point in CPn is denoted [z0, z1,...,zn] (using the traditional square brackets).

28.3. Real projective space as a quotient Recall that the real projective space can be thought of as an an- tipodal quotient of the sphere: RPn = Sn/G, where G = S0 = 1 is the cyclic group of order 2. There is a similar presentation of complex{± } projective space, where in place of the cyclic group G, we have the unitary group U(1).

28.4. Unitary group The unitary group U(n) is the group of complex n n matrices M satisfying × t MM¯ = In. Thus, the group U(1) is the circle of complex numbers of unit absolute value: U(1) = S1 C. ⊆ In the next section, we will exploit the action of U(1) on Cn+1 by scalar multiplication as in (28.2.1).

28.5. Hopf bundle Theorem 28.5.1. The space CPn is a quotient space of the unit (2n+ 1)-sphere S2n+1 in Cn+1 under the action of U(1): CPn = S2n+1/U(1). Thus, the map S2n+1 CPn → is a bundle (eged) with fiber (siv) S1. Proof. Every complex line Cv in Cn+1 intersects the unit sphere in a circle. We restrict the vectors in (28.2.1) to the unit sphere, and the scalars λ in (28.2.1) to the unit circle in C. We then identify under the natural action of U(1) to obtain CPn.  Example 28.5.2. For n = 1, this construction yields the classical Hopf bundle: S2 = S3/U(1), where S2 = CP1. 28.6. FLAGS AND CELL DECOMPOSITION OF CPn 225

28.6. Flags and cell decomposition of CPn Recall that the real projective space has a cell decomposition with a cell in each dimension, see formula (27.3.1). An analogous decompo- sition, but with cells n+1 Definition 28.6.1. A complete flag (Vj) (degel) in V = C is a choice of a nested (mekanenet) sequence of subspaces of dimensions 0, 1, 2,...,n +1, namely: 0 = V V V V = V. { } 0 ⊆ 1 ⊆ 2 ⊆···⊆ n+1 Theorem 28.6.2. A choice of a complete flag (Vj) determines a unique decomposition of CPn into cells in each even dimension: CPn = e0 e2 e2n. (28.6.1) ∪ ∪···∪ Proof. The 2j-dimensional cell e2j CPn consists of points with ⊆ homogeneous coordinates [z0, z1,...,zn] contained in the j-th set-theoretic difference: (z0, z1,...,zn) Vj Vj 1. ∈ \ −  Definition 28.6.3. The standard flag is the nested sequence 1 2 n+1 0 = V0 C C C , { } ⊆ ⊆ ⊆···⊆ where Cj is the subspace of points whose first j coordinates are arbi- trary and the remaining n +1 j coordinates vanish. − With respect to the standard flag, we obtain 2j n+1 e = (z0, z1,...,zj, 0,..., 0) C zj =0 . { ∈ | 6 } Proposition 28.6.4. The collection e2j of points with coordinates j in Vj Vj 1 is diffeomorphic to C . \ − Proof. A diffeomorphism e2j Cj → can be specified as follows:

z0 zj 1 j (z0,...,zj 1, zj, 0,..., 0) ,..., − C , − 7→ z z ∈  j j  proving the proposition.  Theorem 28.6.5. The closure e2j CPn of each cell e2j is the j n ⊆ complex projective subspace CP CP : ⊆ e2j = CPj. 226 28. COMPLEX PROJECTIVE SPACES, DE RHAM COHOMOLOGY

Proof. Consider a nonzero point

(z0,...,zj 1, 0,..., 0) − whose j-th complex coordinate vanishes. Such a point can be approx- imated by points with non-vanishing j-th coordinate, by considering the sequence 1 (z0,...,zj 1, ..., 0), k =1, 2,.... − k j 1 Therefore every point of CP − is in the closure of the 2j-cell, and therefore j j 1 j e2j = C CP − = CP , ∪ completing the proof. 

28.7. g, A, and J in a complex vector space Consider the relation sin(θ) = cos( π θ) 2 − in elementary trigonometry. The corresponding relation in terms of scalar products and determinants is the following. Let v,w R2. Then the 2 by 2 determinant satisfies ∈ det[v,w]=(iv) w, · π where i denotes the rotation by 2 resulting from the usual identifica- tion R2 = C. In other words, we have A(v,w)= g(iv,w), (28.7.1) where A is the standard symplectic form dx dy, and g is the standard ∧ metric dx2 + dy2 in C. This illustrates the passage from a symmetric form to an antisym- metric form. More generally, let V be a complex vector space, for instance Cµ. Thus, V is equipped with an endomorphism J : V V → satisfying J 2 = Id. The endomorphism is called an almost complex − structure. In Cµ, we have

J(v)=(iv1, iv2,...,ivµ). A Hermitian product H(v,w) on V decomposes as H(v,w)= g(v,w)+ iA(v,w), where g is a scalar product in Cµ = R2µ and A is the symplectic form. Note that g is symmetric in the two variables while A is antisymmetric. Moreover, we have the following relation among J, g, and A. 1 28.8. FUBINI-STUDY 2-FORM ON CP 227 Proposition 28.7.1. One has the relation A(v,w)= g(Jv,w). Proof. We have g(iv,w)= (H(iv,w)) = ( iH(v,w)). ℜ ℜ − Hence g(iv,w)= ( i(iA(v,w))) = i2A(v,w)= A(v,w), ℜ − − proving the proposition.  Remark 28.7.2. In this way, a metric on a complex manifold can be used to construct a symplectic form on the manifold, as we illustrate in Section 28.8.

28.8. Fubini-Study 2-form on CP1 We now examine in detail the case of the complex projective line CP1, i.e. the 2-sphere. Returning to CP1, we similarly obtain the differential 2-form from the metric by formula (28.7.1). The round metric of the sphere can be expressed in stereographic coordinates as dx2 + dy2 g = (1 + r2)2 where r = x2 + y2. This normalisation results in a round sphere of radius 1 . 2 p The Fubini-Study form is merely the area form of the metric. Theorem 28.8.1. In an affine neighborhood, we have the following formulas for the metric g and the corresponding Fubini-Study 2-form A (the area form of the metric) on the complex projective line: dx2 + dy2 g = (1 + r2)2 and dx dy A = ∧ (1 + r2)2 The proof is immediate from (32.4.2) and (28.7.1). 228 28. COMPLEX PROJECTIVE SPACES, DE RHAM COHOMOLOGY

28.9. Exterior differential complex Recall that given an n-dimensional manifold M, we have an exterior differential complex 1 2 n 0 C∞(M) Ω (M) Ω (M) ... Ω (M) 0. (28.9.1) → → → → → → i i Here each Ω (M) is the space of sections of the exterior bundle Λ (T ∗M), while the differential d : Ωi Ωi+1 i → raising the degree by 1, is defined as follows. For f C∞(M), we define the 1-form df Ω1(M) by setting ∈ ∈ ∂f df = d f = duj. 0 ∂uj Similarly, the differential d = d : Ω1(M) Ω2(M) 1 → is defined on a 1-form fdu by d(fdu)= df du, (28.9.2) ∧ and similarly for the higher differentials. Definition 28.9.1. A differential form ω is called closed if dω = 0. Proposition 28.9.2. The sequence of maps as in (28.9.1) is a dif- ferential complex, in the sense that d d =0, or more explicitly ◦ dk dk 1 =0 k. ◦ − ∀ Proof. The condition d d = 0 is sometimes written as d2 = 0 where d2 is understood as the◦ composition of two consecutive differen- tials. We check the condition d2 = 0 at the level of Ω1(M). Given a function f C∞(M), we first apply the differential d = d0 to obtain a ∈ ∂f i differential 1-form df = ∂ui du . Next, we apply the differential d = d : Ω1(M) Ω2(M) 1 → ∂f i to the 1-form ∂ui du . Using (28.9.2), we obtain ∂f ∂f ∂2f d dui = d dui = duj dui, ∂ui ∂ui ∂ui∂uj ∧     with a double summation with indices i and j running from 1 to n independently of each other. Since dui dui = 0, we exploit the equality of mixed partials to write ∧ ∂f ∂2f d2f = d dui = duj dui + dui duj =0, ∂ui ∂ui∂uj ∧ ∧   i

28.10. De Rham cocycles and coboundaries Definition 28.10.1. The group Zk (M) = Ker(d ) Ωk(M) dR k ⊆ is called the group of de Rham k-cocycles. Thus, a de Rham k-cocycle is by definition a closed differential k-form. Definition 28.10.2. The group k k BdR(M) = Im(dk 1) Ω (M) − ⊆ is called the group of de Rham k-coboundaries. An alternative term for a de Rham coboundary is an exact differ- ential form.

28.11. De Rham cohomology ring of M, Betti numbers Definition 28.11.1. The k-th de Rham cohomology group of M, k denoted HdR(M), is by definition the group k k k HdR(M)= ZdR(M)/BdR(M) = Ker(dk)/Im(dk 1) − (cocycles modulo the coboundaries). 230 28. COMPLEX PROJECTIVE SPACES, DE RHAM COHOMOLOGY

k Even though the object HdR(M) is traditionally referred to as a group, it is in fact a real vector space.

Definition 28.11.2. The k-th Betti number bk(M) of M is by k definition the dimension of the vector space HdR(M): k bk(M) = dim HdR(M). CHAPTER 29

de Rham cohomology continued

29.1. Comparison of topology and de Rham cohomology Proposition 29.1.1. Every connected manifold M has unit 0-th 0 Betti number b0(M)=1, i.e. HdR(M)= R. Proof. Consider the segment 1 0 C∞(M) Ω (M) → → of the exterior differential complex. The space of the 0-coboundaries is trivial. The space of 0-cocycles consists of all functions f C∞(M) satisfing df = 0. Thus at every point, all the partial derivatives∈ of f must vanish. Such a function must be locally constant. Since M is connected, the function f must also be globally constant. We therefore obtain an identification 0 H (M) R C∞(M) dR ≃ ⊆ of 0-th de Rham cohomology group with the space of constant func- tions.  Theorem 29.1.2. Every closed connected orientable n-dimensional n manifold M satisfies bn(M)=1, i.e. HdR(M)= R. Proof. The proof of this result is more difficult than the previous one, and is postponed. 

29.2. de Rham cohomology of a circle 1 1 1 Theorem 29.2.1. We have HdR(S )= R, i.e., b1(S )=1. 1 1 Proof. Let M = S . Consider the space of 1-cocycles ZdR(M). We define a homomorphism Φ: Z1 (M) R dR → 1 as follows. Let ω ZdR(M) be a de Rham 1-cocycle, i.e., a 1-form. We set ∈ Φ(ω)= ω. IM 231 232 29. DERHAMCOHOMOLOGYCONTINUED

Using the coordinate θ on the circle, we can express ω in terms of the standard 1-form dθ Ω1(M) by writing ω = f(θ)dθ. Thus, ∈ 2π Φ(ω)= f(θ)dθ = F (2π) F (0), − Z0 where F : [0, 2π] R is an antiderivative for f, so that dF = f(θ)dθ. Note that F does→ not descend to a single-valued function on the circle (only to a multiple-valued function). We have to show that the 1-form ω Φ(ω)dθ − is a 1-coboundary. Consider the function g(θ)= F (θ) Φ(ω)θ. − The function g satisfies g(0) = g(2π) and hence descends to a smooth single-valued function on the circle. Therefore by definition, dg B1 (M). ∈ dR Then dg = fdθ Φ(ω)dθ, − and therefore the kernel of Φ consists precisely of the de Rham 1- coboundaries, i.e., the exact forms. By the isomorphism theorem from elementary group theory, we conclude that the first cohomology group of the circle is isomorphic to R. 

29.3. 2-dimensional cohomology of CP1 2 1 We will show later that HdR(CP ) is isomorphic to R. A spe- cific representative of the non-trivial 2-dimensional cohomology class 2 1 in HdR(CP ) called the Fubini-Study form A was defined in Section 28.8. A more general formula for such a form on CPn will be defined later, 2 n and generates HdR(CP ). The form A is a globally defined closed 2- form on CP1. Its cohomology class [A] H2 (CP1) ∈ dR 2 1 spans HdR(CP ) R. The 2-form has unit comass at every point. Therefore we have≃ 1 vol(CP1)= A = dx dy. 1 2 2 CP (1 + r ) ∧ Z ZZC Hence rdrdθ rdr ∞ du vol(CP1)= =2π = π = π, (1 + r2)2 (1 + r2)2 u2 ZZ Z Z1 29.4. 2-DIMENSIONAL COHOMOLOGY OF THE TORUS 233 where we used the u-substitution u = 1+ r2. The area is π since 1 we are dealing with a 2-sphere of Gaussian curvature 4, radius 2 , and π Riemannian diameter 2 . Remark 29.3.1. Later we will define an integer lattice in coho- mology. The element 1 A H2 (CP1) is a generator of the integer π ∈ dR lattice L2 (CP1) Z. dR ≃ 29.4. 2-dimensional cohomology of the torus Theorem 29.4.1. We have H2 (T2) R. dR ≃ Proof. Consider a two-torus whose dimensions are 1 by 1. Let x and y be the standard coordinates on the torus, so they both go from zero to one. Let f(x, y) be a function with total integral 0. We would like to show that the 2-form fdA is exact. It suffices to find functions g and h on the torus such that fdxdy = d(gdx + hdy). (29.4.1) Let 1 a(x)= f(x, t)dt, Z0 and set y g(x, y)= f(x, t)dt + a(x)y, − Z0 where a(x) was chosen so that g(x, 1) = g(x, 0). Here 1 1 a(0) = f(0, t)dt = f(1, t)dt = a(1) Z0 Z0 since f is periodic in x. Thus, g is periodic in the x-direction, as well. Then (29.4.1) holds iff ∂ ∂ g + h = f. −∂y ∂x Since ∂ g = f + a, we need ∂ h = a(x). Thus we define h to be ∂y − ∂x x h(x)= a(s)ds Z0 so h depends only on x. Note that 1 1 1 h(1) = a(s)ds = f(s, t)dtds =0 Z0 Z0 Z0 since f has total integral zero. Of course h(0) = 0 also, so h is smooth on the torus.  234 29. DERHAMCOHOMOLOGYCONTINUED

To do the sphere case, use spherical coordinates φ and θ so that dA = cos2(φ)dφdθ. Then solve fdA = d(gdφ + h dθ) making the necessary changes to allow for the cos2(φ). CHAPTER 30

Elements of homology theory

30.1. Relation between Fubini-Study 2-form and volume form on CPn The Fubini-Study metric g and 2-form A on CP1 was defined in Section 28.8 as follows. The metric g is

dx2 + dy2 g = (1 + r2)2 and the corresponding Fubini-Study 2-form A (the area form of the metric) on the complex projective line is dx dy A = ∧ . (1 + r2)2

We will define the Fubini-Study 2-form A on CPn later. We are primarily interested in the the following theorem.

Theorem 30.1.1. The wedge power

An of the Fubini-Study form is a nondegenerate volume form at every point of CPn. Proof. It suffices to check such a relationship at a point. The n Fubini-Study form at a point is the symplectic form j=1 dxj dyj. Its top exterior power is ∧ P n! dx dy dx dy , 1 ∧ 1 ∧···∧ n ∧ n as a special case of the calculation we saw in the context of Wirtinger’s inequality. 

The following section should be viewed as “general culture” placing this fact in a more general setting of the cup product. 235 236 30. ELEMENTS OF HOMOLOGY THEORY

30.2. Cup product Definition 30.2.1. The cup-product of cohomology classes is the operation ∪ : Ha (M) Hb (M) Ha+b(M), ∪ dR × dR → dR defined by the alternating product in the exterior algebra Λ(Tp∗) at every point p M and in the complex of differential forms Ω(M). ∈ Theorem 30.2.2. The product in cohomology is independent of the choice of representatives at the level of the closed differential forms. Thus, we have α β Ha+b(M) ∪ ∈ dR if α Ha (M) and β Hb (M). ∈ dR ∈ dR Definition 30.2.3. The de Rham cohomology ring is the graded ring Hk (M), k =0,...,n ⊕k dR whether the additive structure comes from the real vector space struc- ture on the individual groups, whereas the multiplicative structure is given by the cup product . ∪ 30.3. Cohomology of complex projective space Theorem 30.3.1. The Betti numbers of complex projective space CPn n n satisfy bk(CP )=1 for all even k = 0, 2,..., 2n, and bk(CP )=0 for all other k. Remark 30.3.2. The geometric counterpart of this result is the decomposition of complex projective space with a single cell in each even dimension, see (28.6.1). The above theorem can be refined as follows, taking into account the ring structure. Theorem 30.3.3. The cohomology ring of CPn is the truncated polynomial ring in a single 2-dimensional variable. Namely, the ring is generated by a single class ω H2 (CPn), with the unique relation ∈ dR ωn+1 =0. Corollary 30.3.4. The 2n-dimensional class ωn H2n (CPn) ∈ dR 2n n spans the 1-dimensional vector space HdR(CP ) over R. 30.5. THE FUNDAMENTAL GROUP π1(M) 237

30.4. Abelianisation in group theory

Definition 30.4.1. Given a group π defined by generators gi and relations rk, one usually writes π = g r . h i | ki The commutator of two elements g, h of a group is the element 1 1 [g, h]= ghg− h− . By adding the commutation relations between each pair of generators, we obtain an abelian group called the abelianisation of π. Example 30.4.2. The group Z can be presented as follows:

Z = g1 ∅ . h | i Example 30.4.3. The group Zn = Z /n Z can be presented as fol- lows: n Z /n Z = g1 g , h | 1 i where the right-hand side is written multiplicatively. Thus, we can think of it as the subgroup of n-th roots of unity in S1 C. ⊆ Example 30.4.4. The group Z2 can be presented as follows: 2 Z = g1,g2 [g1,g2] . h | i Definition 30.4.5. Given a group π = gi rk , with neutral element 1, its abelianisation πab is the group h | i πab = g r , [g ,g ]=1 h i | k i j i obtained by adding commutation relations for each pair of generators. Example 30.4.6. The abelianisation of the fundamental group of a surface of genus g is Z2g.

30.5. The fundamental group π1(M) This is the group defined as follows. We consider based loops in (M, ) where is a fixed basepoint, and define equivalence using ho- motopy∗ as in Nowik’s∗ class. Remark 30.5.1. A based loop defines a trivial class in the fun- damental group if and only if the map S1 M can be “filled” by a disk D, i.e., extended to a map D M where→ ∂D = S1. → Example 30.5.2. The fundamental group of RP2 is the finite cyclic 2 2 group Z2, while the fundamental group of T is the infinite group Z . 238 30. ELEMENTS OF HOMOLOGY THEORY

30.6. The second homotopy group π2(M)

Similarly to the fundamental group π1(M), one defines the second homotopy group π2(M) as the group generated by based maps S2 M → modulo based homotopy connecting a pair of based maps. Similarly to the case of π1, a map belongs to a trivial class in π2(M) if it can be extended to a map of the 3-ball: B3 M. → Remark 30.6.1. Unlike the fundamental group, the second homo- topy group π2(M) is always abelian. Example 30.6.2. The second homotopy group of the 2-sphere S2 = 1 2 CP is infinite cyclic: π2(S )= Z. More generally, we have the following example. Example 30.6.3. The second homotopy group of the CPn is infinite n cyclic for all n: π2(CP )= Z. Discussion of long exact sequence of fibration and application in the case of the Hopf fibration.

30.7. First singular homology group

The singular homology groups with integer coefficients, Hk(M; Z) for k =0, 1,... of M are abelian groups which are homotopy invariants of M. Developing the singular homology theory is time-consuming. The case that we will be primarily interested in as far as these notes are concerned, is that of the 1-dimensional homology group:

H1(M; Z)

(as well as the 2-dimensional homology group H2(M; Z), see below). In this case, the homology groups can be characterized easily without the general machinery of singular simplices. Let S1 C be the unit circle, which we think of as a 1-dimensional manifold with⊆ an orientation given by the counterclockwise direction. Remark 30.7.1. The existence of orientations on parametrized loops in C is familiar from the course in complex functions. See Sec- tion 32.1 for a more general treatment of orientations. Definition 30.7.2. A 1-cycle α on a manifold M is an integer linear combination α = nifi i X 30.7. FIRST SINGULAR HOMOLOGY GROUP 239 where ni Z is called the multiplicity (ribui), while each ∈ f : S1 M i → is a loop given by a smooth map from the circle to M, and each loop is endowed with the orientation coming from S1.

Definition 30.7.3. The space of 1-cycles on M is denoted Z1(M; Z).

Let (Σg, ∂Σg) be a surface with boundary ∂Σg, where the genus g is irrelevant for the moment and is only added so as to avoid confusion with the summation symbol . The boundary ∂Σg is a disjoint union of circles. P Theorem 30.7.4. Assume the surface Σg is oriented. Then the orientation induces an orientation on each boundary circle. See Section 32.1 for a more detailed treatment of orientations. Thus we obtain an orientation-preserving identification of each bound- ary component with the standard unit circle S1 C (with its counter- clockwise orientation). ⊆ Given a map Σg M, its restriction to the boundary therefore produces a 1-cycle → ∂Σg Z1(M; Z), ∈ and 1-boundaries are precisely the 1-cycles that can be obtained in this way. Definition 30.7.5. A 1-boundary in M is a 1-cycle n f i i ∈ i X Z1(M; Z) such that there exists a map of a suitable oriented sur- face Σ M (for some g which may depend on the 1-cycle) satisfying g → ∂Σg = nifi. i X Definition 30.7.6. The space of all 1-boundaries is denoted

B1(M; Z) Z1(M; Z). ⊆ Definition 30.7.7. The 1-dimensional homology group of M with integer coefficients is the quotient group

H1(M; Z)= Z1(M; Z)/B1(M; Z).

Definition 30.7.8. Given a cycle C Z1(M; Z), its homology ∈ class will be denoted [C] H1(M; Z). ∈ We see that the definition of the homology group is somewhat more involved than that of the fundamental group. The relation between them is given by the following theorem. 240 30. ELEMENTS OF HOMOLOGY THEORY

Theorem 30.7.9. The 1-dimensional homology group H1(M; Z) is the abelianisation of the fundamental group π1(M): ab H1(M; Z)=(π1M) . Remark 30.7.10. A significant difference between the fundamen- tal group and the first homology group is the following. While only based loops participate in the definition of the fundamental group, the definition of H1(M; Z) involves free (not based) loops. Example 30.7.11. The fundamental groups of the real projective plane RP2 and the 2-torus T2 are already abelian. Therefore one ob- tains 2 H1(RP ; Z)= Z /2 Z, and 2 2 H1(T ; Z)= Z . Example 30.7.12. The fundamental group of an orientable closed surface Σg of genus g is known to be a group on 2g generators with a single relation which is a product of g commutators. Therefore one has 2g H1(Σg; Z)= Z . CHAPTER 31

Stable norm in homology

31.1. Length, area, volume of cycles Assume the manifold M has a Riemannian metric. Given a smooth loop f : S1 M, we can measure its volume (length) with respect to the metric of→M. We will denote this length by vol(f) with a view to higher-dimensional generalisation. Namely,

i j vol(f)= gij α ′(t)α ′(t)dt Z q where αi are the components of the loop with respect to a suitable coordinate chart. As usual, if the loop travels through several charts, we work with partitions.

Definition 31.1.1. The volume (length) of a 1-cycle C = i nifi is the combined length of all the individual loops, with multiplicity: P vol(C)= n vol(f ). | i| i i X Definition 31.1.2. Let α H1(M; Z) be a 1-dimensional homol- ogy class. We define the volume∈ of α as the infimum of volumes of representative 1-cycles: vol(α) = inf vol(C) C α , { | ∈ } where the infimum is over all cycles C = i nifi representing the class α H1(M; Z). ∈ P Example 31.1.3. In the case of the real projective plane, there is only one nontrivial class. The volume of this class with respect to a given metric g on RP2 is then the least length of a noncontractible loop for this metric.

31.2. Length in 1-dimensional homology of surfaces The following phenomenon occurs for orientable surfaces. 241 242 31.STABLENORMINHOMOLOGY

Theorem 31.2.1. Let M be an orientable surface, i.e. 2-dimensional manifold. Let α H1(M; Z). For all j N, we have ∈ ∈ vol(jα)= j vol(α), (31.2.1) where jα denotes the class α + α + ... + α, with j summands. Proof. To fix ideas, let j = 2. By Lemma 31.2.2 below, a mini- mizing loop C representing a multiple class 2α will necessarily intersect itself in a suitable point p. Then the 1-cycle represented by C can be decomposed into the sum of two 1-cycles (where each can be thought of as a loop based at p). The shorter of the two will then give a min- imizing loop in the class α which proves the identity (31.2.1) in this case. The general case follows similarly.  We have the following elementary fact about the topology of a cylin- der. Lemma 31.2.2. A loop going around a cylinder twice necessarily has a point of self-intersection. Proof. We think of the loop as the graph of a 4π-periodic func- tion f(t) (or alternatively a function on [0, 4π] with equal values at the endpoints). Consider the difference g(t) = f(t) f(t +2π). Then g takes both positive and negative values. Thus,− if g(t) is positive, then g(t + π) is negative. Therefore by the intermediate function theorem, the function g must have a zero t0. Then f(t0) = f(t0 + 2π) hence t0 is a point of self-intersection of the loop.  Another way of writing this identity is as follows: 1 vol(α)= vol(jα). (31.2.2) j

31.3. Stable norm for higher-dimensional manifolds The phenomenon captured by (31.2.2) is no longer true for higher- dimensional manifolds. Namely, the volume of a homology class is no longer multiplicative. However, the limit as j exists and is called the stable norm. →∞ Definition 31.3.1. Let M be a compact manifold of arbitrary dimension. The stable norm of a class α H1(M; Z) is the limit kk ∈ 1 α = lim vol(jα). (31.3.1) j k k →∞ j 31.4. STABLE 1-SYSTOLE 243

Remark 31.3.2. It is obvious from the definition that one has α vol(α). k k ≤ However, the inequality may be strict in general. As noted above, for 2-dimensional manifolds we have α = vol(α). k k Proposition 31.3.3. The stable norm vanishes for a class of finite order.

Proof. If α H1(M; Z) is a class of finite order, one has finitely ∈ 1 many possibilities for vol(jα) as j varies. The factor of j in (31.3.1) shows that α = 0.  k k Similarly, if two classes differ by a class of finite order, their stable norms coincide. Thus the stable norm passes to the quotient lattice defined below.

Definition 31.3.4. The torsion subgroup of H1(M; Z) will be de- noted T1(M) H1(M; Z). The quotient lattice L1(M) is the lattice ⊆ L1(M)= H1(M; Z)/T1(M).

b1(M) Proposition 31.3.5. The lattice L1(M) is isomorphic to Z , where b1 is called the first Betti number of M. Proof. This is a general result in the theory of finitely generated abelian groups. 

31.4. Stable 1-systole Definition 31.4.1. Let M be a manifold endowed with a Riemann- ian metric, and consider the associated stable norm . The stable 1- kk systole of M, denoted stsys1(M), is the least norm of a 1-homology class of infinite order:

stsys (M) = inf α α H (M; Z) T1(M) 1 k k| ∈ 1 \ = λ L (M), . 1  1 kk

The 1-systole sys1(M,g) of a metric g on a surface M is the least length of a noncontractible loop on M. Example 31.4.2. For an arbitrary metric on the 2-torus T2, the 1-systole and the stable 1-systole coincide by Theorem 31.2.1: 2 2 sys1(T ) = stsys1(T ), for every metric on T2. 244 31.STABLENORMINHOMOLOGY

31.5. 2-dimensional homology and systole Let M be a manifold. An example of particular interest to us is the complex projective space M = CPn.

Definition 31.5.1. A 2-cycle in M is a linear combination nifi, where ni Z, of maps fi : Σgi M of closed surfaces of arbitrary genus into∈M. → P A significant difference from the 1-dimensional case is that we allow arbitrary genus, whereas in the 1-dimensional case we used a unique manifold (the circle). Thus, each f is a map f : Σ M where Σ i i gi → gi is a closed surface of genus gi. A singular 1-boundary was defined in an earlier section as the boundary of a map from a surface to M. We define 2-boundaries in a similar way.

Definition 31.5.2. A 2-boundary is a sum nifi realizable as the boundary of a map of a 3-manifold into M. P For the orientation induced on the boundary, see Section 32.1 con- taining a more detailed treatment of orientations. Let Z2(M; Z) and B2(M; Z) be the spaces of 2-cycles and 2-boundaries, respectively. Then 2-dimensional homology is defined as before as the quotient group

H2(M; Z)= Z2(M; Z)/B2(M; Z). n Example 31.5.3. We have H2(S ; Z)= Z if n = 2 and 0 otherwise. n Example 31.5.4. We have H2(CP ; Z)= Z for all n =1, 2,.... The stable norm and the stable systole are defined identically to the 1-dimensional case. Thus, we have

stsys (M) = inf α α H (M; Z) T2(M) 2 k k ∈ 2 \ = λ1 L2(M ), , kk where L2(M) is the lattice given by the quotient by the torsion sub- group T2 H2(M; Z). The stable norm is defined in the next section. ⊆ 31.6. Stable norm in two-dimensional homology

If M is a manifold with a metric g, the stable norm in H2(M; Z) is defined in a way similar to the case k = 1, using thekk stabilisation formula (31.2.2). To define the area of a map into M, we proceed as follows. Given a domain U R2 with coordinates (u1,u2), and a ⊆ ∂f ∂f map x : U M, we form the tangent vectors x = 1 and x = 2 , → 1 ∂u 2 ∂u 31.7. PU’S INEQUALITY 245 in T M, calculate the area x x of the parallelogram in T M p k 1 ∧ 2k p spanned by x1 and x2 with respect to the metric on M. The area is then the integral x x du1du2. k 1 ∧ 2k ZU The area of the parallelogram can be calculated as usual as x x = x x sin α k 1 ∧ 2k | 1|| 2| where α is the angle between them. Here x = g(x , x )1/2, and | i| i i g(x , x ) cos α = 1 2 x x | 1|| 2| where g is the metric on M. Gromov’s stable systolic inequality (see chapter 32.5) for complex projective space is a higher dimensional analogue of Pu’s systolic in- equality for the real projective space, which we review below. 31.7. Pu’s inequality Recall that the real projective plane RP2 is a smooth non-orientable surface. Among its elementary properties are the following. Theorem 31.7.1. The real projective plane has the following prop- erties: 2 2 (1) π1(RP )= H1(RP ; Z)= Z2; (2) The natural orientable double cover of RP2 is the sphere S2. 2 (3) Every metric g on RP naturally lifts to a Z2-invariant met- ric g˜ on the double cover S2. Thus we have a smooth map S2 RP2 → which is 2-to-1. In other words, RP2 is the quotient of S2 by the action of Z /2 Z where the nontrivial element τ : S2 S2 (31.7.1) → is the antipodal map of S2, namely τ(p) = p when S2 is thought of 3 − as the unit sphere in R . 2 Definition 31.7.2. The systole sys1(RP ) can be thought of in two equivalent ways: (1) the least g-length of a loop representing the nontrivial homol- 2 ogy class in H1(RP ; Z); (2) the leastg ˜-length of a path joining a pair of antipodal points, p and τ(p), of the double cover. 246 31.STABLENORMINHOMOLOGY

Theorem 31.7.3 (P. M. Pu). Every metric g on the real projective plane RP2 satisfies the optimal inequality π sys (g)2 area(g), 1 ≤ 2 where sys1 is the systole. The boundary case of equality is attained precisely when the metric is of constant Gaussian curvature. Remark 31.7.4. There is an intriguing similarity between Pu’s in- equality and the isoperimetric inequality for Jordan curves in the plane. Both inequalities relate a length and an area, and both are sharp. Our main interest in Pu’s inequality in this chapter is as motiva- tion for the complex case. In the complex case, we have an analogous inequality stsys (g)2 2 vol(g), 2 ≤ for every metric g on CP2, with equality for the Fubini-Study metric. Before stating and proving Gromov’s stable systolic inequality for complex projective space, we review some material from calculus on manifolds.

31.8. Integration of a 1-form over a circle In previous chapters, we have developed a cohomology theory (de Rham cohomology), and a homology theory (singular homology). In this section we will begin to relate the two theories. Consider a circle C(r) of radius r > 0 in C. The counterclockwise orientation turns the circle into a 1-cycle. Thus we can think of C(r) as an element of the space of cycles with integer coefficients, Z1(C(r); Z). Consider the polar coordinates (r, θ) in C. The differential 1-form dθ is defined everywhere in C 0 . We will denote its restriction to the \{ } circle C(r) C by the same symbol dθ. ⊆ The function θ itself is a multi-valued function on C 0 and also on the circle C(r). We can parametrize the circle by the interval\{ } [0, 2π], for instance by means of the parametrisation reiθ, θ [0, 2π]. (31.8.1) ∈ Thus we can think of θ as a single-valued function on the circle with a point removed: θ : C(r) rei0 R . \{ }→ Here the point rei0 R C corresponds to the values 0 and 2π of the parameter θ. ∈ ⊆ 31.9. VOLUME FORM OF A SURFACE 247

Remark 31.8.1. The closed 1-form dθ, inspite of appearances, is not exact, i.e. it is not a coboundary. This is because the function θ is not a true function on the circle but only a multi-valued one. In fact, rdθ is the volume form of Cr. Proposition 31.8.2. We have the relation

dθ =2π, ZCr independent of r. Proof. We use the parametrisation (31.8.1) to perform the inte- gration using the fundamental theorem of calculus as follows:

2π dθ = θ 0 =2π , ZCr proving the proposition.  

1 Remark 31.8.3. The residue calculation of z at the origin gives dz 2π = i dθ =2πi, z =r z 0 I| | Z showing that the residue is 1 by Cauchy’s theorem.

31.9. Volume form of a surface The area of a surface is calculated by means of a familar formula from classical differential geometry. The area of a surface in a coordi- nate chart U R2 with coordinates x = u1,y = u2, with respect to ⊆ the first fundamental form gij is expressed by a double integral

area = √g g g g dxdy. 11 22 − 12 21 ZU From the differential geometric viewpoint, a more correct form of the expression for the area would be in terms of the absolute value of the integral of a 2-form:

area = det(g ) dx dy , ij ∧ ZU q where the symbol “ ” stands for the wedge product. In other words, the area 2-form dA of∧ the surface is the form dA := det(g ) dx dy. ij ∧

Definition 31.9.1. The expression det(gij) dxp dy, or alterna- tively f 2dx dy, is called the area form or the volume∧ form. ∧ p 248 31.STABLENORMINHOMOLOGY

Suppose the metric is expressed in isothermal coordinates in terms of a conformal factor f(x, y) > 0, so that g = f 2(dx2 + dy2) Remark 31.9.2. Here the squares of dx and dy in the expression for the metric g are the symmetric, not alternating, powers. Then the area becomes area = f 2(x, y) dx dy. ∧ ZU In Section 32.2 we will define the notion of the volume form for an arbitrary Riemannian manifold. The key property of the volume form η of a Riemannian manifold M is the expression for the total volume vol(M) of M in terms of the volume form:

vol(M)= η. ZM In line with the traditional notation of calculus, the volume form is sometimes denoted dvolM , so that we have an identity

vol(M)= dvolM . ZM For any function h C∞(M) on the manifold, we have the obvious bound ∈ hdvol (max ) vol(M), M ≤ h ZM where maxh is the maximal value of h on M.

31.10. Duality of homology and de Rham cohomology We have defined k-cycles only in the case k = 1 and k = 2. These two values are meant throughout this section. Recall that Ωk(M) on M is the space of sections of the exterior bundle of the manifold M. As illustrated in the previous sections, a differential k-form η Ωk(M) on M can be integrated over a k- cycle C Z (M) to obtain∈ a real number ∈ k η R . ∈ ZC Integration thus defines a pairing, which we will denote by the in- tegration symbol . R 31.10. DUALITY OF HOMOLOGY AND DE RHAM COHOMOLOGY 249

Definition 31.10.1. The pairing is between the space Ωk(M) of differential forms on M, and the space Zk(M; Z) of k-cycles in M:

k : Ω (M) Zk(M; Z) R . × → Z Theorem 31.10.2 (Stokes theorem). Let M be a closed manifold. Let η Ωk(M) be a differential form. Assume η is closed, i.e. dη =0. ∈ Let C Zk(M; Z) be a k-cycle. Then the value of the integral η only ∈ C depends on the cohomology class ω = [η] Hk (M) and the homology ∈ dR R class α = [C] Hk(M; Z). ∈ Moreover, the integral vanishes over any torsion class. Therefore we obtain a pairing

k : H (M) Lk(M) R, dR × → Z where Lk(M)= Hk(M; Z)/Tk(M). k Definition 31.10.3. We define an integral lattice LdR(M) in de k Rham cohomology HdR(M) to be the lattice dual to the homology lat- tice Lk(M) (here duality is understood in the sense of Definition 25.7.1). In more detail, we have

k k L (M)= ω H (M) η Z ( C α Lk(M)) ( η ω) . dR ∈ dR ∈ ∀ ∈ ∈ ∀ ∈  ZC 

Alternatively, choosing a basis (y1,...,yb) for Lk(M), we can define the dual lattice as follows:

Lk (M)= ω Hk (M) ω Z i =1,...,b . dR ∈ dR ∈ ∀  Zyi 

Theorem 31.10.4. The pairing between L (M) and Lk (M) is k dR non-degenerate for every closed orientable n-manifold M. R

CHAPTER 32

Orientation and fundamental class

32.1. Orientation and induced orientation Let M be an n-dimensional manifold, so that for every p M, we ∈ have dim(Tp∗m) = n. The n-th exterior algebra at p is 1-dimensional, i.e., n dim Λ (T ∗M) R . p ≃ Thus, any n-form η Ωn(M) locally can be written as ∈ η = f(u1,...,un)du1 dun. ∧···∧ In this section, we will only be concerned with n-forms which are nonvanishing at every point, i.e., f(u1,...,un) = 0. Such an n-form is 6 called a volume form. We will say that two volume forms η and η′ are equivalent if they differ by a positive function g = g(p) > 0 on M:

η η′ η(p)= g(p)η′(p). ∼ ⇐⇒ Definition 32.1.1. An orientation on M is an equivalence class of volume forms on M. Example 32.1.2. On the circle S1, we have the orientation given by dθ (or any positive multiple of it), and the “opposite” orientation given by dθ. This can be thought of as an arrow indicating the clockwise− or counterclockwise direction. Recall that an n-form η can be thought of as an antisymmetric multilinear form. Thus, for a 2-form we have η = η(v,w). Definition 32.1.3. Let v be a vector and η = η a 2-form. The interior product vy η Ω1 ∈ is a 1-form, defined by setting (vy η)( )= η(v, ), · · where the dot denotes the variable. In other words, (vy η)(w) = η(v,w), for every vector w. One sim- ilarly defines the interior product for n-forms, by substituting v into n 1 the first variable. Then vy η Ω − . ∈ 251 252 32. ORIENTATION AND FUNDAMENTAL CLASS

Example 32.1.4. Consider the area form η = dx dy in the plane. ∂ ∧ If v = ∂x then vy η = dy. Similarly, the orthonormal basis ∂ ∂r y η = rdθ since dr, rdθ is an orthonormal basis for the cotangent plane. Let M be an n-manifold with boundary. Thus, we have a bound- ary S M where S is (n 1)-dimensional, and a neighborhood of the boundary⊆ has the form−I N where I = [0, 1] is an interval. n 1 × Let u1,...,u − be coordinates in S, and let t I parametrize the ∂ ∈ interval, so that ∂t is a vector field not tangent to the boundary. Definition 32.1.5. Let η Ωn(M) generate an orientation on M. Then the induced orientation on∈ the boundary S M is generated by ∂ n 1 ⊆ the interior product y η Ω − (S). ∂t ∈ This is the formal way of defining the induced orientation on the boundary. 32.2. Fundamental class in homology Let M is a closed (i.e. compact without boundary) orientable man- ifold of dimension n. As in the case of the circle, a choice of an orien- tation on M gives an isomorphism Hn(M; Z) Z. ≃ Definition 32.2.1. The element which is a positive generator is called the fundamental class of M and denoted [M] Hn(M; Z). ∈ Thus, [M] corresponds to the element 1 Z under the isomorphism ∈ Hn(M; Z) Z. Similarly, we have an isomorphism ≃ Ln (M) Z . dR ≃ By the nondegeneracy of the pairing, we obtain the following. n Corollary 32.2.2. We can choose a generator ω = [η] LdR(M) in such a way as to satisfy ∈

η =1. (32.2.1) ZM This is immediate from Theorem 31.10.4. In an earlier chapter, we defined the notion of comass on alternating (exterior) forms. Definition 32.2.3. A volume form is a nonzero element ζ Ωn(M) such that the pointwise comass norm ζ equals 1 for every ∈p M. k pk ∈ The volume form satisfies ζ = vol(M). ZM 32.3. DUALITY OF COMASS AND STABLE NORM 253

Example 32.2.4. In the case of the unit circle S1, the 1-form ζ = dθ has unit pointwise norm, therefore

dθ =2π = vol(S1). 1 ZS 1 1 Note that the class [ζ] is not in the lattice LdR(S ). Meanwhile, the form 1 1 ζ = dθ 2π 2π 1 1 represents an element of HdR(S ) which is a generator of the integer 1 1 1 lattice LdR(S ), since it integrates to 1 over S . 32.3. Duality of comass and stable norm Definition 32.3.1. The comass norm of a differential k-form is the supremum of the pointwise comass norms,kk∞ cf. Definition 26.3.2:

η = sup ηp p M . k k∞ {k k| ∈ } Next, we define a notion of comass norm in de Rham cohomology, as follows. k Definition 32.3.2. The comass norm ω ∗ of a class ω HdR(M) in de Rham cohomology is the infimum ofk comassk norms η∈ of the representative differential forms η ω. k k∞ ∈ Thus we have

ω ∗ = inf sup ηp p M, η ω , k k η p {k k| ∈ ∈ } where denotes the comass norm in the exterior algebra Λ(T ∗M). kk p Theorem 32.3.3 (Federer; Gromov). The stable norm on Lk(M) k k kk and the comass norm ∗ on LdR(M) HdR(M) are dual to each other. k k ⊆ k Thus, the pair of normed lattices (Lk(M), ) and (LdR(M), ∗) are dual to each other. kk kk For the purpose of this introductory discussion, assume n 2. There are at least three equivalent ways of defining the Fubini-Stud≥ y metric on CPn: (1) It is the unique metric on CPn whose sectional curvature K satisfies 1 K 4; (2) it is defined≤ by≤ a distance function d(v,w) expressed in terms of the Hermitian inner product in homogeneous coordinates; (3) it is given by a metric ds2 written by a specific length element ds appearing below. 254 32. ORIENTATION AND FUNDAMENTAL CLASS

The case of the complex projective line CP1 is special in that the Gauss- ian curvature K is constant. The Fubini-Study metric on the complex projective line is dis- cussed in Section 32.4, and the Fubini-Study 2-form is discussed in Section 28.8.

32.4. Fubini-Study metric on complex projective line The complex projective line CP1 = C = S2 ∪∞ carries a natural metric called the Fubini-Study metric ds2. The distance function d(v,w), associated with the Fubini-Study metric, takes its simplest form in homogeneous coordinates [Z0,Z1]. Namely, given a pair of unit vectors v,w C2, we have the corre- ∈ sponding complex 1-dimensional subspaces Cv C2 and Cw C2, thought of as points in the projective line. The distance⊆ between⊆ them will be denoted, briefly, d(v,w). The distance is defined in terms of the standard Hermitian product H(v,w) as follows. Definition 32.4.1. The distance between points v,w in CP1 is defined by d(v,w) = arccos H(v,w) , (32.4.1) | | i.e. cos d(v,w)= H(v,w) = v w + v w . | | | 1 1 2 2| Remark 32.4.2. The distance vanishes (i.e. cos(d(v,w)) = 1) if v and w differ by a unit complex scalar: v = λw. This is consistent with the fact that the vectors are supposed to define the same point in complex projective space. 1 In an affine neighborhood Z1 = 0 in CP , we introduce a coordinate 6 z = Z0/Z1 = x + iy. Theorem 32.4.3. With respect to coordinate z, the Fubini-Study metric ds2 takes the form dzdz dx2 + dy2 dr2 ds2 = = = . (32.4.2) (1 + zz)2 (1 + r2)2 (1 + r2)2 Here r2 = zz = x2 + y2, where r is the first of the usual polar coordi- nates (r, θ). We present some elementary global-geometric properties of the met- ric. 32.5. FUBINI-STUDY METRIC ON COMPLEX PROJECTIVE SPACE 255

Theorem 32.4.4. The maximal distance between a pair of points 1 π of CP with respect to the distance (32.4.1) is 2 , i.e. its Riemannian π diameter is 2 . π Indeed, the arccosine of a positive value does not exceed 2 . Note that the unit sphere metric has Riemannian diameter π, which is twice the value that appears in the theorem. Theorem 32.4.5. The complex projective line CP1 with the Fubini- 1 3 Study metric gFS is isometric to the sphere of radius 2 in R and total area π.

Proof. The Gaussian curvature K of the Fubini-Study metric gFS is K = 4 at every point. The curvature can be calculated directly using 1 the formula K = ∆ log f where f is the conformal factor f = 2 − LB 1+r as in (32.4.2). Thus we obtain a metric on the 2-sphere CP1 of constant Gaussian curvature 4. By a general result in Riemannian geometry, such a metric is iso- metric to a concentric sphere in R3. The normalisation K = 4 forces 1 the radius of the sphere to be 2 . Hence the total area is π. Alterna- tively, this results from the Gauss-Bonnet theorem.  32.5. Fubini-Study metric on complex projective space The Fubini-Study metric on the complex projective line was dis- cussed in Section 32.4. The complex projective space CPn carries a natural metric called the Fubini-Study metric ds2. The distance function d(v,w), associated with the Fubini-Study metric, takes its simplest form in homogeneous coordinates [Z ,...,Z ]. Namely, given a pair of unit vectors v,w 0 n ∈ Cn+1, we have cos d(v,w)= H(v,w) = v w . | | | j j| j X In an affine neighborhood Z = 0, we introduce affine coordinates n 6 zj = Zj/Zn = xj + iyj. With respect to these coordinates, the Fubini-Study metric takes the n following form. The Fubini-Study metric gFS(X,X) at a point u C is given by the formula ∈ H(X,X)(1+ u 2) H(X,u) 2 g (X,X)= | | −| | . (32.5.1) FS 1+ u 2 | | In the case n = 1, we have H(X,u) = X u and the formula reduces to (32.4.2). | | | || | 256 32. ORIENTATION AND FUNDAMENTAL CLASS

Unlike the 1-dimensional case, sectional curvature is not constant. Theorem 32.5.1. The sectional curvature of the Fubini-Study met- ric on CPn satisfies 1 K 4 at every point and in every 2-dimensional direction. ≤ ≤ The Fubini-Study 2-form A is defined as before by the formula A = g(Jv,w) where J is the complex structure, i.e., multiplication by i. One can write down an explicit formula, as well.1 Proposition 32.5.2. A is a globally defined closed 2-form on CPn. 2 n Its cohomology class [A] spans HdR(CP ) and has unit comass at every point. The proof of Gromov’s inequality exploits the duality between the stable norm and the comass, defined in earlier chapters. It also relies on our generalized version of the Wirtinger inequality to obtain an optimal constant in the inequality. 32.6. Gromov’s inequality for complex projective space 2 n n Recall that if ω is a generator of LdR(CP ) then ω is a fundamental n 2n n cohomology class for CP , i.e. a generator of LdR(CP ). Theorem 32.6.1 (M. Gromov). Every Riemannian metric g on complex projective space CPn satisfies the inequality stsys (g)n n! vol (g); 2 ≤ 2n equality holds for the Fubini-Study metric on CPn. So as to clarify the nature of the proof, we will state the following slightly more general inequality. A similar proof works for the more general theorem. Theorem 32.6.2. Let M be a 2n-dimensional Riemannian mani- fold, satisfying the following conditions: 2 (1) b2(M)=1, in other words HdR(M)= R; 2 n 2n (2) If ω =0 in H (M) then ω∪ =0 in H (M). 6 dR 6 dR Then every Riemannian metric g on M satisfies the inequality stsys (g)n n! vol (g). 2 ≤ 2n

1More explicitly, one can write down A in an affine neighborhood as follows:

dzj dzj ( zj dzj) zj dzj A = i j ∧ ∧ 1+ r2 − (1 + r2)2 P P P ! related to (32.5.1). CHAPTER 33

Proof of Gromov’s theorem

33.1. Homology class α and cohomology class ω Following Gromov’s notation in [?, Theorem 4.36], we let

n α H2(CP ; Z)) Z (33.1.1) ∈ ≃ be the positive generator in homology. Recall that α is represented by the inclusion of the 2-sphere CP1 CPn. Let ⊆ L2 (CPn) Z dR ≃ 2 n n be the lattice in de Rham cohomology HdR(CP ) dual to H2(CP ; Z). Let ω L2 (CPn) ∈ dR be the dual generator in cohomology, so that α ω = 1, where integra- tion is understood in the sense of a representative cycle and 2-form. 1 R Here ω is represented by the form π A where A is the Fubini-Study form: ω = 1 A H2. π ∈ n 2n n Then the cup power ω is a generator  of LdR(CP ) Z, i.e. a fun- n ≃ damental cohomology class for CP . Let η ω be a closed differen- ∈ tial 2-form. Since wedge product in Ω∗(X) descends to cup product ∧ in HdR∗ (X), we have from (32.2.1)

n 1= η∧ . (33.1.2) n ZCP

33.2. Comass and application of Wirtinger inequality Let g be a metric on CPn. The metric defines a norm on each tangent and cotangent space, as well as all the exterior powers. We can then calculate the pointwise comass ηp at a point p, and the comass η of a differential form η. k k Thek comassk∞ norm of a differential k-form is, by definition, the supremum of the pointwise comass norms, cf. Definition 26.3.2. Let 257 258 33. PROOF OF GROMOV’S THEOREM

M = CPn. Choose a point p M. By the Wirtinger inequality and Corollary 27.2.4, we obtain the∈ following bound for the norm at p: n n n ηp∧ n!( ηp ) n!( η ) , (33.2.1) k k ≤ k k ≤ k k∞ where is the comass norm on differential forms. Since this is true for everykk∞p M, we obtain the following inequality for the comass of the form ηn∈: n n η∧ n!( η ) . k k∞ ≤ k k∞ 33.3. Volume calculation Now let dvol be the volume form of the metric (M,g). At every point, we have ηn = ηn dvol . p k p k M Then 1= ηn n ZCP = ηn dvol CPn k k Z n n n!( η ) vol2n(CP ,g) ≤ k k∞ We therefore obtain n n 1 n!( η ) vol2n(CP ,g). (33.3.1) ≤ k k∞ 2 This is true for every form η representing the class ω HdR(M). The infimum of (33.3.1) over all η ω gives the following∈ bound involving ∈ the comass ω ∗ of the class ω: k k n n 1 n!( ω ∗) vol2n (CP ,g) , (33.3.2) ≤ k k where ∗ is the comass norm in cohomology. Denote by the stable kk kk norm in homology. Let L2(M) = H2(M; Z). Recall that the normed lattices 2 (L2(M), ) and (LdR(M), ∗) kk n kk are dual to each other [?]. Since b1(CP ) = 1, it follows that the class α of (33.1.1) satisfies 1 α = , k k ω k k∗ and hence stsys (g)n = α n n! vol (g). (33.3.3) 2 k k ≤ 2n Equality is attained by the two-point homogeneous Fubini-Study met- ric, since the standard CP1 CPn is calibrated by the Fubini-Study Kahler 2-form, which satisfies⊆ equality in the Wirtinger inequality at every point. 33.3. VOLUME CALCULATION 259

Example 33.3.1. Every metric g on the complex projective plane satisfies the optimal inequality 2 2 2 stsys (CP ,g) 2 vol4(CP ,g). (33.3.4) 2 ≤

CHAPTER 34

Eiseinstein, Hermite, Loewner, Pu, and Guth

Here we prove the two classical results of systolic geometry, namely Loewner’s torus inequality as well as Pu’s inequality for the real pro- jective plane, as well as a more recent result by L. Guth.

34.1. Eisenstein integers and Hermite constant We now review some arithmetic preliminaries.

2 Definition 34.1.1. The lattice LE R = C of the Eisenstein ⊆ integers is the lattice in C spanned by the elements 1 and the sixth root of unity.

To visualize the lattice, start with an equilateral triangle in C with 1 √3 vertices 0, 1, and 2 +i 2 , and construct a tiling of the plane by repeat- edly reflecting in all sides. The Eisenstein integers are by definition the set of vertices of the resulting tiling.

Definition 34.1.2. Let b N. The Hermite constant γb is defined in one of the following two equivalent∈ ways:

(1) γb is the square of the biggest first successive minimum λ1(L) among all lattices L of unit covolume; (2) γb is defined by the formula

λ1(L) b √γb = sup L (R , ) , (34.1.1) vol( b /L)1/b ⊆ | |  R 

where the supremum is extended over all lattices L in Rb with a Euclidean norm . | | A lattice realizing the supremum is called a critical lattice. A crit- ical lattice may be thought of as the one realizing the densest packing b 1 in R when we place balls of radius 2 λ1(L) at the points of L. The lattice of Eisenstein integers is critical. 261 262 34. EISEINSTEIN, HERMITE, LOEWNER, PU, AND GUTH

34.2. Standard fundamental domain and Eisenstein integers Definition 34.2.1. The standard fundamental domain D C is the domain ⊆ D = z C z > 1, (z) < 1 , (z) > 0 (34.2.1) ∈ | | |ℜ | 2 ℑ It is a fundamental domain for the action of the group PSL(2, Z) in the upperhalf plane of C. Lemma 34.2.2. Given a lattice L C, one can choose a basis τ, 1 for a homothetic (similar) lattice in such⊆ a way that τ D¯. { } ∈ Proof. Consider a lattice L C = R2. Choose a “shortest” vec- ⊆ 1 tor z L, i.e. we have z = λ (L). By replacing L by the lattice z− L, ∈ | | 1 we may assume that the +1 C is a shortest element in the lattice. We will denote the new lattice∈ by the same letter L, so that now λ1(L) = 1. Now complete the element +1 L to a Z-basis ∈ τ, +1 (34.2.2) { } for L. Consider the real part (τ). Clearly, we can adjust the basis ℜ 1 by adding a suitable integer to τ, so as to satisfy the condition 2 1 − ≤ (τ) 2 . Since τ L, we have τ λ1(L) = 1. Then the basis vectorℜ ≤τ lies in the closure∈ of the domain| | ≥ D of Definition 34.2.1.  The following result is an essential part of the proof of Loewner’s torus inequality with isosystolic defect. Lemma 34.2.3. When b = 2, we have the following value for the Hermite constant: γ = 2 = 1.1547 .... The corresponding critical 2 √3 lattice is homothetic to the Z-span of the cube roots of unity in C, i.e. the Eisenstein integers. Proof. Clearly, multiplying L by nonzero complex numbers does not change the value of the quotient λ (L)2 1 . area(C/L) The imaginary part of τ chosen as in Lemma 34.2.2 satisfies (τ) πℑ ≥ √3 i 3 2 , with equality possible in the following two cases: τ = e D, 2π ∈ or τ = ei 3 D. Finally, we calculate the area of the parallelogram ∈ in C spanned by τ and +1, and write area(C/L) √3 2 = (τ) λ1(L) ℑ ≥ 2 to conclude the proof.  34.4. LOEWNER’S INEQUALITY WITH DEFECT 263

34.3. Loewner’s torus inequality Recall that the systole of a Riemannian manifold M is defined to be least length of a noncontractible loop on M. In the case of the torus T2, the fundamental group is abelian. Hence the systole can be expressed in this case as follows:

2 2 sys (T )= λ1 H (T ; Z), , 1 1 kk where is the stable norm. In this sense, Gromov’s inequality (33.3.4) for thekk complex projective plane is a higher-dimensional analogue of Loewner’s inequality. Loewner’s torus inequality relates the total area, to the systole, i.e. least length of a noncontractible loop on the torus (T2,g). Theorem 34.3.1 (Loewner’s torus inequality). Every metric on the torus satisfies area(g) √3 sys (g)2 0. (34.3.1) − 2 1 ≥ Theorem 34.3.2. The boundary case of equality in (34.3.1) is at- tained if and only if the metric is homothetic to the flat metric obtained as the quotient of R2 by the lattice formed by the Eisenstein integers.

34.4. Loewner’s inequality with defect Loewner’s torus inequality can be strengthened by introducing a “defect” (shegiya? she’erit?) term `ala Bonnesen. To write it down, we need to review the conformal representation theorem (uniformisation theorem). Theorem 34.4.1 (Conformal representation theorem). Every met- ric g on the torus T2 is isometric to a metric of the form f 2(dx2 + dy2), with respect to a unit area flat metric dx2 + dy2 on the torus T2 viewed as a quotient R2 /L of the (x, y) plane by a lattice. See (20.9.1) for more details. The defect term in question is simply the variance (shonut) of the conformal factor f above. The inequality with the defect term looks as follows. Theorem 34.4.2. Every metric on the torus satisfies area(g) √3 sys(g)2 Var(f). (34.4.1) − 2 ≥ 264 34. EISEINSTEIN, HERMITE, LOEWNER, PU, AND GUTH

Here the error term, or isosystolic defect, is given by the variance (shonut) Var(f)= (f m)2 (34.4.2) 2 − ZT of the conformal factor f of the metric g = f 2(dx2 + dy2) on the torus, 2 2 relative to the unit area flat metric g0 = dx +dy in the same conformal class. Here m = fdxdy (34.4.3) 2 T Z 2 2 is the mean (tochelet) of f. More concretely, if (T ,g0) = R /L where L is a lattice of unit coarea, and D is a fundamental domain for the action of L on R2 by translations, then the integral (34.4.3) can be written as m = f(x, y)dxdy ZD where dxdy is the standard Lebesgue measure of R2.

34.5. Computational formula for the variance (shonut) The proof of inequalities with isosystolic defect relies upon the fa- miliar computational formula for the variance of a random variable X in terms of expected values. Namely, we have the formula E (X2) (E (X))2 = Var(X), (34.5.1) µ − µ where µ is a probability measure. Here the variance is Var(X)= E (X m)2 , µ − where m = Eµ(X) is the (i.e. the mean) (tochelet).

34.6. An application Keeping our differential geometric application in mind, we will de- note the random variable f. 2 Now consider a flat metric g0 of unit area on the 2-torus T . Denote the associated measure by µ. In other words, µ is given by the usual Lebesgue measure dxdy. Since µ is a probability measure, we can apply 2 formula (34.5.1) to it. Consider a metric g = f g0 conformal to the flat one, with conformal factor f > 0. Then we have

2 2 Eµ(f )= f dxdy = area(g), 2 ZT 34.7. FUNDAMENTAL DOMAIN AND LOEWNER’S TORUS INEQUALITY 265 since f 2dx dy is precisely the area 2-form of the metric g. Equation∧ (34.5.1) therefore becomes area(g) (E (f))2 = Var(f). (34.6.1) − µ Next, we will relate the expected value Eµ(f) to the systole of the metric g. Then we will then relate (34.5.1) to Loewner’s torus inequality.

34.7. Fundamental domain and Loewner’s torus inequality 2 Let us prove Loewner’s torus inequality for the metric g = f g0 using the computational formula for the variance. We first analyze the expected value term E (f)= 2 fdxdy in (34.6.1). µ T By the proof of Lemma 34.2.3, the lattice of deck transformations R of the flat torus g0 admits a Z-basis similar to τ, 1 C, where τ belongs to the standard fundamental domain (34.2.1).{ } In ⊆ other words, the lattice is similar to Z τ + Z 1 C. ⊆ Consider the imaginary part (τ) and set ℑ σ2 := (τ) > 0. ℑ 2 √3 Lemma 34.7.1. We have σ 2 , with equality if and only if τ is the primitive cube or sixth root of≥ unity. Proof. This is immediate from the geometry of the fundamental domain. 

Since g0 is assumed to be of unit area, the basis for its group of deck tranformations can therefore be taken to be

1 1 σ− τ, σ− , { } 1 where (σ− τ) = σ. With these normalisations, we see that the flat torus isℑ ruled by a pencil of horizontal closed geodesics, denoted

γy = γy(x),

1 each of length σ− , where the “width” of the pencil equals σ, i.e. the parameter y ranges through the interval [0, σ], with γσ = γ0. Note that each of these closed geodesics is noncontractible in T2, and therefore by definition length(γ ) sys (g). y ≥ 1 266 34. EISEINSTEIN, HERMITE, LOEWNER, PU, AND GUTH

By Fubini’s theorem, we pass to the iterated integral (nishneh) to ob- tain the following lower bound for the expected value: σ Eµ(f)= f(x)dx dy Z0 Zγy ! σ = length(γy)dy Z0 σ sys(g), ≥ see [?, p. 41, 44] for details. The theorem now follows from the following more general inequality. Theorem 34.7.2. We have area(g) σ2 sys(g)2 Var(f). (34.7.1) − ≥ Proof. Substituting the inequality E (f) σ sys µ ≥ into (34.6.1), we obtain the inequality area(g) σ2 sys(g)2 Var(f), − ≥ where f is the conformal factor of the metric g with respect to the unit 2 √3 area flat metric g0. Since σ 2 , we obtain in particular Loewner’s torus inequality with isosystolic≥ defect, area(g) √3 sys(g)2 Var(f). (34.7.2) − 2 ≥ cf. [?]. 

34.8. Boundary case of equality Corollary 34.8.1. A metric satisfying the boundary case of equal- ity in Loewner’s torus inequality (35.8.1) is necessarily flat and homo- thetic to the quotient of R2 by the lattice of Eisenstein integers. Proof. If a metric f 2ds2 satisfies the boundary case of equality in (35.8.1), then the variance of the conformal factor f must vanish by (34.7.2). Hence f is a constant function. The proof is completed by applying Lemma 34.2.3 on the Hermite constant in dimension 2.  Now suppose τ is pure imaginary, i.e. the lattice L is a rectangular lattice of coarea 1. Corollary 34.8.2. If τ is pure imaginary, then the metric g = 2 f g0 satisfies the inequality area(g) sys(g)2 Var(f). (34.8.1) − ≥ 34.10. TANGENT MAP 267

Proof. If τ is pure imaginary then σ = (τ) 1, and the inequality follows from (34.7.1). ℑ ≥  p In particular, every surface of revolution satisfies (34.8.1), since its lattice is rectangular, cf. Corollary 20.11.1.

34.9. Hopf fibration h To prove Pu’s inequality, we will need to study the Hopf fibration more closely. The circle action in C2 restricts to the unit sphere S3 C2, which therefore admits a fixed point free circle action. Namely, the⊆ circle S1 = eiθ : θ R C { ∈ } ⊆ 2 acts on a point (z1, z2) C by ∈ iθ iθ iθ e .(z1, z2)=(e z1, e z2). (34.9.1) The quotient manifold (space of orbits) S3/S1 is the 2-sphere S2 (from the projective viewpoint, the quotient space is the complex projective line CP1). Definition 34.9.1. The quotient map h : S3 S2, (34.9.2) → called the Hopf fibration.

34.10. Tangent map Proposition 34.10.1. Consider a smooth map f : M N between differentiable manifolds. Then f defines a natural map → df : T M T N → called the tangent map. Proof. We represent a tangent vector v T M by a curve ∈ p c(t): I M, → such that c′(o)= v. The composite map σ = f c : I N is a curve ◦ → in N. Then the vector σ′(0) is the image of v under df:

df(v)= σ′(0), proving the proposition.  268 34. EISEINSTEIN, HERMITE, LOEWNER, PU, AND GUTH

34.11. Riemannian submersions Consider a fiber bundle map f : M N between closed manifolds. Let F M be the fiber over a point p →N. The differential df : T M ⊆ ∈ → T N vanishes on the subspace TxF T M at every point x F . More precisely, we have the vertical space⊆ (anchi) ∈

ker(dfx)= TxF. Given a Riemannian metric on M, we can consider the orthogonal complement of T F in T M, denoted H T M. Thus x x x ⊆ x T M = T F H x x ⊕ x is an orthogonal decomposition. Definition 34.11.1. A fiber bundle map f : M N between Riemannian manifolds is called a Riemannian submersion→ if at every point x M, the restriction of df : H T N is an isometry. ∈ x → f(x) Proposition 34.11.2. The Hopf fibration h : S3 S2 is a Rie- mannian submersion, for which the natural metric on→ the base is a metric of constant Gaussian curvature +4. The latter is explained by the fact that the maximal distance be- 1 π tween a pair of S orbits is 2 rather than π, in view of the fact that a pair of antipodal points of S3 lies in a common orbit and therefore descends to the same point of the quotient space. A pair of points at maximal distance is defined by a pair v,w S3 ∈ such that w is orthogonal to Cv.

34.12. Hamilton quaternions The quaternionic viewpoint will be applied in Section 36.7. The algebra of the Hamilton quaternions is the real 4-dimensional vector space with real basis 1, i, j, k , so that { } H = R 1+ R i + R j + R k equipped with an associative (chok kibutz), non-commutative product operation. This operation has the following properties: (1) the center of H is R 1; (2) the operation satisfies the relations i2 = j2 = k2 = 1 and ij = ji = k, jk = kj = i, ki = ik = j. − − − − Lemma 34.12.1. The algebra H is naturally isomorphic to the com- plex vector space C2 via the identification R 1+ R i C. ≃ 34.13. COMPLEX STRUCTURES ON THE ALGEBRA H 269

Proof. The subspace R j + R k can be thought of as R j + R ij =(R 1+ R i)j, and we therefore obtain a natural isomorphism H = C1+ Cj, showing that 1, j is a complex basis for H.  { } Theorem 34.12.2. Nonzero quaternions form a group under quater- nionic multiplication. Proof. Given q = a+bi+cj +dk H, let N(q)= a2 +b2 +c2 +d2, andq ¯ = a bi cj dk. Then qq¯∈= N(q). Therefore we have a multiplicative− inverse− − 1 1 q− = q,¯ N(q) proving the theorem. 

34.13. Complex structures on the algebra H Definition 34.13.1. A purely imaginary Hamilton q ∈ H0 in H = R 1+ R i + R j + R k, is a real linear combination of i, j, and k.

Thus H0 H is a real 3-dimensional subspace. ⊆ Proposition 34.13.2. A choice of a purely imaginary quaternion q ∈ H0 specifies a natural complex structure on H. Proof. We can assume without loss of generality that q = 1. | | We then have q2 = 1 and hence R 1+ R q is naturally isomorphic − to C. Choosing a unit-norm quaternion r H orthogonal to the real ∈ 2-dimensional subspace R 1+ R q H, we obtain a natural isomor- ⊆ phism H C + Cr.  ≃

CHAPTER 35

From surface tension to Loewner’s inequality

This chapter contains some introductory remarks.

35.1. Surface tension and shape of a water drop Perhaps the most familiar physical manifestation of the 3-dimensional isoperimetric inequality is the shape of a drop of water. Namely, a drop will typically assume a symmetric round shape. Since the amount of water in a drop is fixed, surface tension forces the drop into a shape which minimizes the surface area of the drop, namely a round sphere. Thus the round shape of the drop is a consequence of the phenomenon of surface tension (metach panim). Mathematically, this phenomenon is expressed by the isoperimetric inequality.

35.2. Isoperimetric inequality in the plane The solution to the isoperimetric problem in the plane is usually expressed in the form of an inequality that relates the length L of a closed curve and the area A of the planar region that it encloses. The isoperimetric inequality states that 4πA L2, ≤ and that the equality holds if and only if the curve is a round circle. The inequality is an upper bound for area in terms of length. It can be rewritten as follows: L2 4πA 0. − ≥ 35.3. Central symmetry Recall the notion of central symmetry: a Euclidean polyhedron is called centrally symmetric if it is invariant under the “antipodal” map x x. 7→ − Thus, in the plane central symmetry is the rotation by 180 degrees. For example, an ellipse is centrally symmetric, as is any ellipsoid in 3-space. 271 272 35. FROM SURFACE TENSION TO LOEWNER’S INEQUALITY

35.4. Property of a centrally symmetric polyhedron in 3-space There is a geometric inequality that is in a sense dual to the isoperi- metric inequality in the following sense. Both involve a length and an area. The isoperimetric inequality is an upper bound for area in terms of length. There is a geometric inequality which provides an upper bound for a certain length in terms of area. More precisely it can be described as follows. Any centrally symmetric convex body of surface area A can be squeezed through a noose of length √πA, with the tightest fit achieved by a sphere. This property is equivalent to a special case of Pu’s inequality (see below), one of the earliest systolic inequalities. Example 35.4.1. An ellipsoid is an example of a convex centrally symmetric body in 3-space. It may be helpful to the reader to develop an intuition for the property mentioned above in the context of thinking about ellipsoidal examples. An alternative formulation is as follows. Every convex centrally symmetric body P in R3 admits a pair of opposite (antipodal) points and a path of length L joining them and lying on the boundary ∂P of P , satisfying π L2 area(∂P ). ≤ 4 35.5. Notion of systole The systole of a compact metric space X is a metric invariant of X, defined to be the least length of a noncontractible loop in X. We will denote it as follows: sys(X). When X is a graph, the invariant is usually referred to as the girth, ever since the 1947 article by W. Tutte [?]. Possibly inspired by Tutte’s article, Loewner started thinking about systolic questions on surfaces in the late 1940s, resulting in a 1950 thesis by his student P.M. Pu [?]. The actual term “systole” itself was not coined until a quarter century later, by Marcel Berger. This line of research was, apparently, given further impetus by a remark of Ren´eThom, in a conversation with Berger in the library of Strasbourg University during the 1961-62 academic year, shortly after the publication of the papers of R. Accola and C. Blatter. Referring to these systolic inequalities, Thom reportedly exclaimed: “Mais c’est fondamental!” [These results are of fundamental importance!] 35.7. PU’S INEQUALITY 273

Subsequently, Berger popularized the subject in a series of articles and books, most recently in the march ’08 issue of the Notices of the American Mathematical Society [?]. A bibliography at the Website for systolic geometry and topology currently contains over 160 articles. Systolic geometry is a rapidly developing field, featuring a number of recent publications in leading journals. Recently, an intriguing link has emerged with the Lusternik-Schnirelmann category. The existence of such a link can be thought of as a theorem in “systolic topology”.

35.6. The real projective plane RP2 In projective geometry, the real projective plane RP2 is defined as the collection of lines through the origin in R3. The distance function on RP2 is most readily understood from this point of view. Namely, the distance between two lines through the origin is by definition the angle between them, or more precisely the lesser of the two angles. This distance function corresponds to the metric of constant Gaussian curvature +1. Alternatively, RP2 can be defined as the surface obtained by iden- tifying each pair of antipodal points on the 2-sphere. Other metrics on RP2 can be obtained by quotienting metrics on S2 imbedded in 3-space in a centrally symmetric way, see the discussion in Section 35.4. Topologically, RP2 can be obtained from the Mobius strip by at- taching a disk along the boundary. Among closed surfaces, the real projective plane is the simplest non-orientable such surface.

35.7. Pu’s inequality Pu’s inequality applies to general Riemannian metrics on RP2. A student of Charles Loewner’s, P.M. Pu proved in a 1950 thesis (published in 1952) that every metric g on the real projective plane RP2 satisfies the optimal inequality π sys(g)2 area(g), ≤ 2 where sys is the systole. The boundary case of equality is attained pre- cisely when the metric is of constant Gaussian curvature. Alternatively, the inequality can be presented as follows: 2 area(g) sys(g)2 0. − π ≥ There is a vast generalisation of Pu’s inequality, due to M. Gromov, called Gromov’s systolic inequality for essential manifolds. To state his 274 35. FROM SURFACE TENSION TO LOEWNER’S INEQUALITY result, we will need to review a topological notion of being essential in Chapter 37. 35.8. Loewner’s torus inequality Similarly to Pu’s inequality, Loewner’s torus inequality relates the total area, to the systole, i.e. least length of a noncontractible loop on the torus (T2,g): area(g) √3 sys(g)2 0. (35.8.1) − 2 ≥ The boundary case of equality is attained if and only if the metric is homothetic to the flat metric obtained as the quotient of R2 by the lattice formed by the Eisenstein integers. 35.9. Bonnesen’s inequality The classical Bonnesen inequality [?] is the strengthened isoperi- metric inequality L2 4πA π2(R r)2, (35.9.1) − ≥ − see [?, p. 3]. Here A is the area of the region bounded by a closed Jordan curve of length (perimeter) L in the plane, R is the circumradius of the bounded region, and r is its inradius. The error term π2(R r)2 on the right hand side of (35.9.1) is traditionally called the isoperimetric− defect (shegiya? she’erit?) A similar strengthening of Loewner’s inequality exists and will be discussed in Section CHAPTER 36

Pu’s inequality

Pu’s inequality for the real projective plane is the inequality π sys (g)2 area(g) 1 ≤ 2 for every metric g on RP2. The case of equality is attained precisely by the metrics of constant Gaussian curvature on RP2. The proof of the inequality exploits certain properties of circle fi- brations of the 3-sphere. A combination of a double fibration of the 3- sphere and a suitable integral-geometric identity will yield a proof of Pu’s inequality.

36.1. From complex structure to fibration

Recall that we have H = R +H0 where R = R 1 is the center while H0 is the space of pure imaginary quaternions. A complex struc- 4 ture on H R defined by q H0 leads to a fibration ≃ ∈ f : S3 S2 q → 1 3 of the unit 3-sphere, as in (34.9.2). Each fiber of fq is a circle S S which is an orbit of the action (34.9.1). It will be convenient in⊆ the sequel to denote this data by a two-arrow diagram of the following type: S1 S3 S2, (36.1.1) → → where the first arrow denotes the inclusion of a fiber, while the second arrow denotes the Hopf fibration. Lemma 36.1.1. Choosing a pair of pure imaginary quaternions q,r ∈3 H0 which are orthogonal, results in a pair of circle fibrations fq, fr of S such that a fiber of one fibration is perpendicular to a fiber of the other fibration whenever two such fibers meet. Such a pair of fibrations, illustrated in Figure 36.1.1 following the notation of (36.1.1), will play a crucial role in the proof of Pu’s in- equality. 275 276 36. PU’S INEQUALITY

S1 S2 A > AA }} AA fr }} AA }} A }} S3 > A }} AA f }} AA q }} AA }} A S1 S2

Figure 36.1.1. A pair of fibrations

36.2. Lie groups A Lie group is simultaneously a group and a smooth manifold, in such a way that the two structures are compatible. More precisely, we have the following definition. Definition 36.2.1. A Lie group is a manifold G with an associative (chok kibutz) operation µ : G G G and inverse ν : G G with the following properties: × → → (1) the operation defines the structure of a group on G, e.g. there exists an element e G such that µ(e, x) = x for all x G, and furthermore µ(x,∈ ν(x)) = e; ∈ (2) both µ and ν are smooth maps. Corollary 36.2.2. The manifold S3 admits the structure of a Lie group. Proof. Nonzero quaternions form a group by Theorem 34.12.2. The multiplication restricts to the unit sphere S3 R4 H by the 1 3 ⊆ ≃ rule q− =q ¯ for q S .  ∈ 36.3. A special isomorphism Definition 36.3.1. Let M be a Riemannian manifold. The unit sphere tangent bundle T uM, or the unit tangent bundle for short, is the subset of T M consisting of elements of unit norm: T uM = v T M v =1 . { ∈ | | | } The sphere S3 can be thought of as the Lie group formed of the unit (Hamilton) quaternions in H C2. In the sequel, an important role≃ will be played by the Lie group SO(3). It turns out SO(3) can be identified with S3/ 1 . Therefore the fi- bration (34.9.2) descends to a fibration {± } SO(3) S2, (36.3.1) → 36.4. SPHERE AS HOMOGENEOUS SPACE 277 exploited in the next section. Lemma 36.3.2. The construction of two “perpendicular” fibrations 3 of Section 34.13 descends to the quotient by the natural Z2 action on S , as well. Proposition 36.3.3. The group SO(3) can be identified with the unit tangent bundle of S2. Proof. Let G = SO(3). We will construct a natural identification F : G T u(S2). → The group G acts naturally on the unit sphere S2 R3 by matrix multiplication. We choose a fixed unit tangent vector⊆v T u(S2). To construct the required identification f, we send an∈ element g G, acting on S2, to the image of v under the tangent map dg : TS2 ∈ TS2. Namely, → F (g)= dg(v) T u(S2). ∈  To summarize, we have the following three distinct ways of viewing the same Lie group: (1) The group SO(3) of orthogonal matrices; (2) the quotient S3/ 1 of the 3-sphere; (3) the unit tangent{± bundle} of the 2-sphere. The underlying smooth manifold can in fact be identified with the real projective space RP3, as well, as is obvious from item (2). 36.4. Sphere as homogeneous space The sphere S2 is the homogeneous space of the Lie group SO(3), cf. (36.3.1), so that S2 = SO(3)/SO(2), cf. [?, p. 82]. The stabilizer of the north pole (0, 0, 1) S3 is clearly the subgroup of matrices of the form ∈ cos θ sin θ 0 sin θ −cos θ 0  0 0 1   which is clearly isomorphic to S0(2). We will denote by SO(2)σ the fiber over (i.e. stabilizer of) a typical point σ S2. The projection ∈ p : SO(3) S2 (36.4.1) → is a Riemannian submersion for the standard metric of constant curva- ture. If we choose a metric of constant Gaussian curvature +1 on S2, 278 36. PU’S INEQUALITY

1 then the metric on SO(3) is of constant sectional curvature 4 . We will view this fibration as a fibration of the unit circle tangent bundle of the sphere: p : T uS2 S2 (36.4.2) → using the identification of T uS2 and SO(3) discussed in Section 36.3. 36.5. Dual real projective plane 2 Denote by RP ∗ the dual real projective plane, namely the collection of lines in the real projective plane RP2 itself. The symbol 2 RP]∗ denotes the orientable double cover, diffeomorphic to the 2-sphere. We will avoid using the notation S2 for this space so as not to confuse two distinct fibrations. 2 We think of RP]∗ as the configuration space of oriented great cir- cles ν on the sphere, with measure dν. Theorem 36.5.1. Let M = S2. The unit tangent bundle T uM can be represented as follows:

u 2 T M = (x, ν) M RP]∗ x ν ∈ × ∈ n o Proof. An oriented (directed) line through x M defines a unique unit tangent vector at x, and vice versa. ∈  36.6. A pair of fibrations By definition the fibration p : T uS2 S2 sends (x, ν) to x: → p(x, ν)= x. By Theorem 36.5.1, there is a second fibration q of T uS2, defined by q(x, ν)= ν. Thus, the total space T uS2 SO(3) admits another Riemannian sub- mersion, which we denote ≃

2 q : SO(3) RP]∗, (36.6.1) → whose typical fiber ν is an orbit of the geodesic flow on SO(3) when the latter is viewed as the unit tangent bundle of S2, cf. Remark 36.3.3. Lemma 36.6.1. Each orbit ν SO(3) projects under p to a great circle on the sphere. ⊆ Proof. A fiber of fibration q is the collection of unit vectors tan- gent to a given directed closed geodesic (great circle) on S2.  36.7. DOUBLE FIBRATION OF SO(3) AND INTEGRAL GEOMETRY ON S2279

SO(2) RP2, dν G GG t9 GG q tt  GG tt GG tt g G# tt SO(3) : L uu LL p uu LLL uu LL uu LL& uu & ν / (S2, dσ) great circle

Figure 36.6.1. Integral geometry on S2, cf. (36.4.1) and (36.6.1)

While fibration q may seem more “mysterious” than fibration p, the two are actually equivalent from the quaternionic point of view, cf. Sections 34.12 and 36.1. The diagram of Figure 36.6.1 illustrates the maps defined so far.

36.7. Double fibration of SO(3) and integral geometry on S2 In this section, we present a self-contained account of an integral- geometric identity used in the proof of Pu’s inequality in Section 36.9. The identity has its origin in results of P. Funk [?] determining a sym- metric function on the two-sphere from its great circle integrals; see [?, Proposition 2.2, p. 59], as well as Preface therein. Denote by Eg(ν) the energy, and by Lg(ν) the length, of a curve ν 2 with respect to a metric g = f g0, where f > 0 is a function on the 2-sphere, while g0 is the standard metric of constant Gaussian cur- vature +1.

Theorem 36.7.1. We have the following integral-geometric iden- tity: 1 area(g)= Eg(ν)dν. 2 2π RP ZZ g Proof. Let ν(t) be a parametrisation of the great circle ν viewed as an orbit in T uS2 SO(3). Then p ν(t) is its projection to the 2- sphere. Note that by≃ definition of energy,◦ we have

E (ν)= f 2 p ν(t)dt. g ◦ ◦ Zν 280 36. PU’S INEQUALITY

We apply Fubini’s theorem [?, p. 164] to the Riemannian submersion q, to obtain 2π 2 Eg(ν)dν = dν f p ν(t)dt 2 2 RP RP 0 ◦ ◦ ZZ g ZZ g Z  (36.7.1) = f 2 p. ◦ ZZZSO(3) Next, we apply Fubini’s theorem to the Riemannian submersion p, to obtain

2 Eg(ν)dν = f p 2 RP SO(3) ◦ ZZ g ZZZ (36.7.2) = f 2 dσ 1 2 ZZS ZSO(2)σ  =2π area(g), completing the proof of the theorem. 

36.8. An inequality Theorem 36.8.1. Let f > 0 be a continuous function on S2. Then there is a great circle ν such that the following inequality is satisfied: 2 f(ν(t))dt π f 2(σ)dσ, (36.8.1) ≤ 2 Zν  ZS where t is the arclength parameter, and dσ is the Riemannian measure of the metric g0 of the standard unit sphere. Corollary 36.8.2. There is a great circle of the sphere whose 2 2 length is L in the metric f g0, so that L πA, where A is the Rie- 2 ≤ mannian surface area of the metric f g0. In the boundary case of equal- ity in (36.8.1), the function f must be constant. Proof of Theorem 36.8.1. The proof is an averaging argument and shows that the average length of great circles is short. Comparing length and energy yields the inequality

2 2 Lg(ν) RP g Eg(ν)dν (36.8.2)   2 R 2π ≤ RP Z g by the Cauchy-Schwarz inequality. Denoting the least length of a great circle ν by L min, we obtain 4π(L )2 min 2π area g 2π ≤ 36.9. PROOF OF PU’S INEQUALITY 281

2 by Theorem 36.7.1, since area(R]P ,g0)=4π. In the boundary case of equality in (36.8.1), one must have equality also in inequality (36.8.2) relating length and energy. It follows that the conformal factor f must be constant along every great circle. Hence f is constant everywhere on S2.  36.9. Proof of Pu’s inequality A metric g on RP2 lifts to a centrally symmetric metricg ˜ on S2. Applying Proposition 36.8.1 tog ˜, we obtain a great circle ofg ˜-length L satisfying L2 πA, where A is the Riemannian surface area ofg ˜. Thus L 2≤ π A L A we have 2 2 2 , where sysπ1(g) 2 while area(g)= 2 , proving Pu’s inequality.≤ See [?] for an alternative≤ proof.  

CHAPTER 37

Gromov’s essential inequality

This chapter deals with Gromov’s systolic inequality for essential manifolds.

37.1. Essential manifolds To give some examples, all surfaces are essential with the excep- tion of the sphere S2. Real projective spaces RPn are essential. If a manifold M admits a metric of negative curvature, then M is essential. More generally, to define the property of being essential for an n- dimensional closed manifold M, we need a minimum of homology the- ory. This will be done in section

37.2. Gromov’s inequality for essential manifolds The deepest result in the field is Gromov’s inequality for the homo- topy 1-systole of an essential n-manifold M: sys(M)n C vol(M), ≤ n where Cn is a universal constant only depending on the dimension of M. Here the systole sys(M) is by definition the least length of a noncontractible loop in M. Example 37.2.1. The inequality holds for all nonsimply connected surfaces, for real projective spaces, for all manifolds admitting a hy- perbolic metric. The proof involves a new invariant called the filling radius, intro- duced by Gromov, defined as follows.

37.3. Filling radius of a loop in the plane The filling radius of a simple loop C in the plane is defined as the largest radius, R> 0, of a circle that fits inside C, see Figure 37.2.1: Fillrad(C R2)= R. ⊆ There is a kind of a dual point of view that allows one to generalize this notion in an extremely fruitful way, as shown in [?]. Namely, we 283 284 37. GROMOV’S ESSENTIAL INEQUALITY

R

C

Figure 37.2.1. Largest inscribed circle has radius R

2 consider the ǫ-neighborhoods of the loop C, denoted UǫC R , see Figure 37.3.1. ⊆

C

Figure 37.3.1. Neighborhoods of loop C

As ǫ> 0 increases, the ǫ-neighborhood UǫC swallows up more and more of the interior of the loop. The “last” point to be swallowed up is precisely the center of a largest inscribed circle. Therefore we can reformulate the above definition by setting

2 Fillrad(C R ) = inf ǫ> 0 loop C contracts to a point in UǫC . ⊆ { | } To define an absolute filling radius in a situation where M is equipped with a Riemannian metric g, Gromov exploits an imbedding due to C. Kuratowski. 37.7. HOMOLOGY THEORY FOR GROUPS 285

37.4. The sup-norm

On the space L∞(X) of bounded Borel functions f on a set X, one can consider the norm f = sup f(x) . k k x X | | ∈ This norm is called the sup-norm.

37.5. Riemannian manifolds as spaces with a distance function We consider a manifold M equipped with a distance function d(x, y), where x, y M. In particular, a Riemannian metric on M defines such a distance∈ function, by minimizing the length of all paths joining x to y.

37.6. Kuratowski imbedding

One imbeds M in the Banach space L∞(X) of bounded Borel functions on M, equipped with the sup norm . Namely, we map k k a point x M to the function f L∞(M) defined by the for- ∈ x ∈ mula fx(y) = d(x, y) for all y M, where d is the distance function defined by the metric. By the∈ triangle inequality we have d(x, y) = fx fy , and therefore the imbedding is strongly isometric, in the pre- kcise− sensek that internal distance and ambient distance coincide. Such a strongly isometric imbedding is impossible if the ambient space is a Hilbert space, even when M is the Riemannian circle (the distance between opposite points must be π, not 2!). We then set E = L∞(M) in the formula above.

37.7. Homology theory for groups Recall that the n-dimensional homology group is nontrivial. If M is orientable, then Hn(M; Z)= Z. If M is non-orientable, then Hn(N, Z2)= Z2. In either case, a generator of the homology group is denoted [M] and called the fundamental class. There is a parallel homology theory for groups. In particular, one can consider the homology of the fundamental group π = π1(M), namely Hn(π). One additional ingredient we need to know is the exis- tence of a natural homomorphism φ : H (M) H (π). n → n 286 37. GROMOV’S ESSENTIAL INEQUALITY

Then M is called essential if its fundamental class [M] maps to a nonzero class in the homology of its fundamental group: φ([M]) =0 H (π). 6 ∈ n 37.8. Relative filling radius We first define a notion of a filling radius of M relative to an imbed- ding. Given an imbedding of M in Euclidean space E, let ǫ > 0, and denote by UǫM the neighborhood in E consisting of all points at dis- tance at most ǫ from M. Let φ : H (M) H (E) ǫ n → n be the inclusion homomorphism induced by the inclusion of M in its ǫ-neighborhood UǫM in E. We set Fillrad(M E) = inf ǫ> 0 φ ([M]) = 0 H (U M) , ⊆ { | ǫ ∈ n ǫ } 37.9. Absolute filling radius We define the filling radius of M to be its relative filling radius relative to the canonical imbedding in L∞(M), by setting

FillRad(M) = FillRad (M L∞(M)) . ⊆ Namely, Gromov proved a sharp inequality relating the systole and the filling radius, :sysπ1 6 FillRad(M), valid for all essential mani- folds M; as well as an inequality≤

1 FillRad C vol n (M), ≤ n n valid for all closed manifolds M. A summary of a proof, based on recent results in geometric measure theory by S. Wenger, building upon earlier work by L. Ambrosio and B. Kirchheim, appears in Section 12.2 of the book ”Systolic geometry and topology” referenced below. A completely different approach to the proof of Gromov’s inequality was recently proposed by L. Guth. 37.10. further Proof of optimal inequality relating systole and filling radius.