True infinitesimal differential geometry
Mikhail G. Katz∗ Vladimir Kanovei Tahl Nowik
M. Katz, Department of Mathematics, Bar Ilan Univer- sity, Ramat Gan 52900 Israel E-mail address: [email protected] V. Kanovei, IPPI, Moscow, and MIIT, Moscow, Russia E-mail address: [email protected] T. Nowik, Department of Mathematics, Bar Ilan Uni- versity, Ramat Gan 52900 Israel E-mail address: [email protected] ∗Supported by the Israel Science Foundation grant 1517/12.
Abstract. This text is based on two courses taught at Bar Ilan University. One is the undergraduate course 89132 taught to a group of about 120 freshmen. This course used infinitesimals to introduce all the basic concepts of the calculus, mainly following Keisler’s textbook. The other is the graduate course 88826 taught to about 10 graduate students. The graduate course developed two viewpoints on infinitesimal generators of flows on manifolds. Much of what we say here is deeply affected by the pedagogical experience of teaching the two groups. The classical notion of an infinitesimal generator of a flow on a manifold is a (classical) vector field whose relation to the actual flow is expressed via integration. A true infinitesimal framework allows one to re-develop the foundations of differential geometry in this area. The True Infinitesimal Differential Geometry (TIDG) framework enables a more transparent relation between the infini- tesimal generator and the flow. Namely, we work with a combinatorial object called a hyper- real walk constructed by hyperfinite iteration. We then deduce the continuous flow as the real shadow of the said walk. Namely, a vector field is defined via an infinitesimal displace- ment in the manifold itself. Then the walk is obtained by iteration rather than integration. We introduce synthetic combinatorial con- ditions Dk for the regularity of a vector field, replacing the classical analytic conditions of Ck type, that guarantee the usual theorems such as uniqueness and existence of solution of ODE locally, Frobe- nius theorem, Lie bracket, small oscillations of the pendulum, and other concepts. Here we cover vector fields, infinitesimal generator of flow, Frobenius theorem, hyperreals, infinitesimals, transfer principle, ultrafilters, ultrapower construction, hyperfinite partitions, micro- continuity, internal sets, halo, prevector, prevector field, regularity conditions for prevector fields, flow of a prevector field, relation to classical flow. Contents
Chapter 1. Preface 11
Part 1. True Infinitesimal Differential Geometry 13 Chapter 2. Introduction 15 2.1. Breakdown by chapters 16 2.2. Historical remarks 17 2.3. Preview of flows and infinitesimal generators 18 Chapter 3. A hyperreal view 19 3.1. Successive extensions N, Z, Q, R, ∗R 19 3.2. Motivating discussion for infinitesimals 20 3.3. Introduction to the transfer principle 21 3.4. Transfer principle 24 3.5. Orders of magnitude 25 3.6. Standard part principle 27 3.7. Differentiation 27 Chapter 4. Ultrapower construction and transfer principle 31 4.1. Introduction to filters 31 4.2. Examples of Filters 32 4.3. Properties of filters 33 4.4. Real number field as quotient of sequences of rationals 34 4.5. Extension of Q 34 4.6. Examples of infinitesimals 34 4.7. Ultrapower construction of the hyperreal field 35 Chapter 5. Microcontinuity, EVT, internal sets 37 5.1. Construction via equivalence relation 37 5.2. Extending the ordered field language 37 5.3. Existential and Universal Transfer 39 5.4. Examples of first order statements 39 5.5. Galaxies 41 5.6. Microcontinuity 43 5.7. Uniform continuity in terms of microcontinuity 45 3 4 CONTENTS
5.8. Hyperreal extreme value theorem 46 5.9. Intermediate value theorem 47 5.10. Ultrapower construction applied to (R); internal sets 48 5.11. The set of infinite hypernaturals is notP internal 49 5.12. Transfering the well-ordering of N; Henkin semantics 50 5.13. From hyperrationals to reals via ultrapower 51 5.14. Repeatingtheconstruction 53
Chapter 6. Application: small oscillations of a pendulum 55 6.1. Small oscillations of a pendulum 55 6.2. Vector fields, walks, and flows 56 6.3. The hyperreal walk and the real flow 57 6.4. Adequality pq 58 6.5. Infinitesimal oscillations 59 6.6. Linearisation 59 6.7. Adjusting linear prevector field 60 6.8. Conclusion 60
Chapter 7. Differentiable manifolds 63 7.1. Definition of differentiable manifold 63 7.2. Open submanifolds, Cartesian products 64 7.3. Tori 65 7.4. Projective spaces 66 7.5. Definition of a vector field 67 7.6. Zeros of vector fields 68 7.7. 1-parameter groups of transformations of a manifold 69 7.8. Invariance of infinitesimal generator under flow 70 7.9. Lie derivative and Lie bracket 71
Chapter 8. Theorem of Frobenius 73 8.1. Invariance under diffeomorphism 73 8.2. Frobenius thm, commutation of vector fields and flows 75 8.3. Proof of Frobenius 76 8.4. A-trackandB-trackapproaches 78 8.5. Relations of and 78 8.6. Smooth manifolds;≺ notion≺≺ nearstandardness 79 8.7. Results for general culture on topology 80
Chapter 9. Gradients, Taylor, prevectorfields 83 9.1. Propertiesofthenaturalextension 83 9.2. Remarks on gradient 83 9.3. Prevectors 85 9.4. Tangent space to a manifold via prevectors 86 CONTENTS 5
9.5. Action of a prevector on smooth functions 87 9.6. Maps acting on prevectors 88 Chapter 10. Prevector fields, classes D1 and D2 91 10.1. Prevector fields 91 10.2. Realizing classical vector fields 92 10.3. Regularity class D1 for prevectorfields 94 10.4. Second differences, class D2 94 10.5. Secord-order mean value theorem 95 10.6. The condition C2 implies D2 96 10.7. Bounds for D1 prevector fields 97 Chapter 11. Prevectorfield bounds via underspill, walks, flows 99 11.1. Operations on prevector fields 99 11.2. Sharpening the bounds via underspill 100 11.3. Further bounds via underspill 101 11.4. Global prevector fields, walks, and flows 102 11.5. Internal induction 104 11.6. Dependence on initial condition and prevector field 104 11.7. Dependence on initial conditions for local prevector field 106 11.8. Inducing a real flow 108 11.9. Geodesic flow 109 Chapter 12. Universes, transfer 111 12.1. Universes 111 12.2. Therealsasthebaseset 112 12.3. More sets in universes 113 12.4. Membership relation 114 12.5. Ultrapower construction of nonstandard universes 114 12.6. Los theorem 117 12.7. Elementary embedding 119 12.8. Definability 122 12.9. Conservativity 123 Chapter13. Moreregularity 127 13.1. D2 implies D1 127 13.2. Injectivity 128 13.3. Time reversal 129
Part 2. 500 yrs of infinitesimal math: Stevin to Nelson 131 Chapter 14. 16th century 133 14.1. Simon Stevin 133 14.2. Decimals from Stevin to Newton and Euler 135 6 CONTENTS
14.3. Stevin’s cubic and the IVT 136 Chapter 15. 17th century 139 15.1. Pierre de Fermat 139 15.2. James Gregory 139 15.3. Gottfried Wilhelm von Leibniz 139 Chapter 16. 18th century 141 16.1. Leonhard Euler 141 Chapter 17. 19th century 143 17.1. Augustin-Louis Cauchy 143 Chapter 18. 20th century 145 18.1. Thoralf Skolem 145 18.2. Edwin Hewitt 145 18.3. JerzyLo´s 145 18.4.AbrahamRobinson 145 18.5. Edward Nelson 145 Bibliography 147 Chapter 19. Theorema Egregium 153 19.1. Preliminaries to the Theorema Egregium of Gauss 153 19.2. The theorema egregium ofGauss 155 19.3. Intrinsic versus extrinsic 156 19.4. Gaussian curvature and Laplace-Beltrami operator 157 19.5. Duality in linear algebra 158 19.6. Polar, cylindrical, and spherical coordinates 159 19.7. Duality in calculus; derivations 161 Chapter20. Derivations,tori 163 20.1. Space of derivations 163 20.2. Constructing bilinear forms out of 1-forms 165 20.3. Firstfundamentalform 165 20.4. Expressing the metric in terms of 1-forms 166 20.5. Hyperbolic metric 167 20.6. Surface of revolution 167 20.7. Coordinate change 169 20.8. Conformal equivalence 170 20.9. Uniformisation theorem for tori 170 20.10. Isothermalcoordinates 171 20.11. Tori of revolution 172 Chapter 21. Conformal parameter 175 CONTENTS 7
21.1. Conformal parameter τ of standard tori of revolution 175 21.2. Comformal parameter ofstandard torus ofrev 176 21.3. Curvature expressed in terms of θ(s) 178 21.4. Total curvature of a convex Jordan curve 179 ˜ 21.5. Signed curvature kα of Jordan curves in the plane 180 21.6. Coordinate patch defined by family of normal vectors 183 21.7. Total Gaussian curvature and Gauss–Bonnet theorem 185
Chapter 22. Addendum on a Bernoullian framework 187 22.1. Ultrapower construction 187 22.2. Saturation and topology 187
Chapter 23. More on Gauss-Bonnet 189 23.1. Second proof of Gauss–Bonnet theorem 189 23.2. Euler characteristic 189 23.3. Exercise on index notation 190
Chapter 24. Bundles 191 24.1. Trivial bundle (E,B,π), function as a section (mikta) 191 24.2. Mobius band as a first example of a nontrivial bundle 192 24.3. Tangent bundle 192 24.4. Hairy ball theorem 193 24.5. Cotangent bundle 193 24.6. Exterior derivative of a function 194 24.7. Extending gradient and exterior derivative 194 24.8. Plan for building de Rham cohomology 195 24.9. Exterioralgebra,area,determinant,rank 196 24.10. Formal definition, alternating property 196 24.11. Example: rank 1 197 24.12. Areas in the plane 197 24.13. Vectorandtripleproducts 199 24.14. Anticommutativity of the wedge product 200 24.15. The k-exterior power; simple (decomposable) multivectors200 24.16. Basis and dimension of exterior algebra 201 24.17. Rank of a multivector 201
Chapter 25. Multilinear forms; exterior differential complex 203 25.1. Formalconstructionoftheexterioralgebra 203 25.2. Exterior differential complex 204 25.3. Alternating multilinear forms 205 25.4. Case of 2-forms 206 25.5. Norm on 1-forms 206 25.6. Polarcoordinatesanddualnorms 207 8 CONTENTS
25.7. Dual bases and dual lattices in a Euclidean space 208
Chapter 26. Comass norm, Hermitian product, symplectic form 211 2 2 n 26.1. Λ (V ) is isomorphic to the standard model Λ0(R ) 211 26.2. Euclidean norm on k-multivectors and k-forms 212 26.3. Comassnormofaform 212 26.4. Symplectic form from a complex viewpoint 213 26.5. Hermitian product 214 26.6. Orthogonal diagonalisation 215
Chapter 27. Wirtinger inequality, projective spaces 217 27.1. Wirtinger inequality 217 27.2. Wirtinger inequality for an arbitrary 2-form 219 27.3. Realprojectivespace 220 27.4. Complex projective line CP1 221 Chapter 28. Complex projective spaces, de Rham cohomology 223 28.1. A clarification 223 28.2. Complex projective space CPn 223 28.3. Real projective space as a quotient 224 28.4. Unitary group 224 28.5. Hopf bundle 224 28.6. Flags and cell decomposition of CPn 225 28.7. g, A, and J in a complex vector space 226 28.8. Fubini-Study 2-form on CP1 227 28.9. Exterior differential complex 228 28.10. De Rham cocycles and coboundaries 229 28.11. De Rham cohomology ring of M, Betti numbers 229
Chapter 29. de Rham cohomology continued 231 29.1. Comparison of topology and de Rham cohomology 231 29.2. de Rham cohomology of a circle 231 29.3. 2-dimensional cohomology of CP1 232 29.4. 2-dimensionalcohomologyofthetorus 233
Chapter 30. Elements of homology theory 235 30.1. Relation between Fubini-Study 2-form and volume form on CPn 235 30.2. Cup product 236 30.3. Cohomology of complex projective space 236 30.4. Abelianisation in group theory 237 30.5. The fundamental group π1(M) 237 30.6. The second homotopy group π2(M) 238 30.7. First singular homology group 238 CONTENTS 9
Chapter31. Stablenorminhomology 241 31.1. Length, area, volume of cycles 241 31.2. Length in 1-dimensional homology of surfaces 241 31.3. Stable norm for higher-dimensional manifolds 242 31.4. Stable 1-systole 243 31.5. 2-dimensional homology and systole 244 31.6. Stable norm in two-dimensional homology 244 31.7. Pu’s inequality 245 31.8. Integration of a 1-form over a circle 246 31.9. Volumeformofasurface 247 31.10. Duality of homology and de Rham cohomology 248
Chapter 32. Orientation and fundamental class 251 32.1. Orientation and induced orientation 251 32.2. Fundamental class in homology 252 32.3. Duality of comass and stable norm 253 32.4. Fubini-Study metric on complex projective line 254 32.5. Fubini-Study metric on complex projective space 255 32.6. Gromov’s inequality for complex projective space 256
Chapter33. ProofofGromov’stheorem 257 33.1. Homology class α and cohomology class ω 257 33.2. Comass and application of Wirtinger inequality 257 33.3. Volume calculation 258
Chapter 34. Eiseinstein, Hermite, Loewner, Pu, and Guth 261 34.1. Eisenstein integers and Hermite constant 261 34.2. Standard fundamental domain and Eisenstein integers 262 34.3. Loewner’s torus inequality 263 34.4. Loewner’s inequality with defect 263 34.5. Computational formula for the variance (shonut) 264 34.6. An application 264 34.7. Fundamental domain and Loewner’s torus inequality 265 34.8. Boundary case of equality 266 34.9. Hopf fibration h 267 34.10. Tangent map 267 34.11. Riemannian submersions 268 34.12. Hamilton quaternions 268 34.13. Complex structures on the algebra H 269 Chapter 35. From surface tension to Loewner’s inequality 271 35.1. Surfacetensionandshapeofawaterdrop 271 35.2. Isoperimetric inequality in the plane 271 10 CONTENTS
35.3. Central symmetry 271 35.4. Property of a centrally symmetric polyhedron in 3-space 272 35.5. Notion of systole 272 35.6. The real projective plane RP2 273 35.7. Pu’s inequality 273 35.8. Loewner’s torus inequality 274 35.9. Bonnesen’s inequality 274 Chapter 36. Pu’s inequality 275 36.1. Fromcomplexstructuretofibration 275 36.2. Lie groups 276 36.3. A special isomorphism 276 36.4. Sphere as homogeneous space 277 36.5. Dualrealprojectiveplane 278 36.6. A pair of fibrations 278 36.7. Double fibration of SO(3) and integral geometry on S2 279 36.8. An inequality 280 36.9. Proof of Pu’s inequality 281 Chapter 37. Gromov’s essential inequality 283 37.1. Essential manifolds 283 37.2. Gromov’s inequality for essential manifolds 283 37.3. Filling radius of a loop in the plane 283 37.4. The sup-norm 285 37.5. Riemannian manifolds as spaces with a distance function285 37.6. Kuratowski imbedding 285 37.7. Homologytheoryforgroups 285 37.8. Relative filling radius 286 37.9. Absolute filling radius 286 37.10. further 286 CHAPTER 1
Preface
Terence1 Tao has recently published a number of monographs where he exploits ultrapowers in general, and Robinson’s infinitesimals in par- ticular, as a fundamental tool; see e.g., [Tao 2014], [Tao & Vu 2016]. In the present text, we apply such an approach to some basic topics in differential geometry. Differential geometry has its roots in the study of curves and sur- faces by methods of mathematical analysis in the 17th century. Ac- cordingly it inherited the infinitesimal methods typical of these early studies in analysis. Weierstrassian effort at the end of the 19th century solidified mathematical foundations by breaking with the infinitesimal mathematics of Leibniz, Euler, and Cauchy, but in many ways the baby was thrown out with the water, as well. This is because mathematics, including differential geometry, was stripped of the intuitive appeal and clarity of the earlier studies. Shortly afterwards, the construction of a non-Archimedean totally ordered field by Levi-Civita in 1890s paved the way for a mathemat- ically correct re-introduction of infinitesimals. Further progress even- tually lead to Abraham Robinson’s hyperreal framework, a modern foundational theory which introduces and treats infinitesimals to great effect with full mathematical rigor. Since then, Robinson’s framework has been used in a foundational role both in mathematical analysis [Keisler 1986] and other branches of mathematics, such as differ- ential equations [Benoit 1997], measure, probability and stochastic analysis [Herzberg 2013], [Albeverio et al. 2009], asymptotic se- ries [Van den Berg 1987], mathematical economics [Sun 2000], etc. These developments have led both to new mathematical results and to new insights and simplified proofs of old results. The goal of the present monograph is to restore the original infini- tesimal approach in differential geometry, on the basis of true infinitesi- mals in Robinson’s framework. In the context of vector fields and flows, the idea is to work with a combinatorial object called a hyperreal walk
1Issues in note 7 (chapter 5), note 2 (chapter11), note 10 (chapter 11), (chap- ter 12), (chapter 12). 11 12 1. PREFACE constructed by hyperfinite iteration. We then deduce the continuous flow as the real shadow of the said walk. We address the book to a reader willing to benefit from both the in- tuitive clarity and mathematical rigor of this modern approach. Read- ers with prior experience in differential geometry will find new (or rather old but somewhat neglected since Weierstrass) insights on known concepts and arguments, while readers with experience in applications of the hyperreal framework will find a new and fascinating application in a thriving area of modern mathematics. Part 1
True Infinitesimal Differential Geometry
CHAPTER 2
Introduction
The classical notion of an infinitesimal generator of a flow on a manifold is a (classical) vector field whose relation to the actual flow is expressed via integration. A true infinitesimal framework allows one to re-develop the foundations of differential geometry in this area in such a way that the relation between the infinitesimal generator and the flow becomes more transparent. The idea is to work with a combinatorial object called a hyper- real walk constructed by hyperfinite iteration. We then deduce the continuous flow as the real shadow of the said walk. Thus, a vector field is defined via an infinitesimal displacement defined by a self-map F of the manifold itself. The flow is then obtained N by iteration F ◦ (rather than integration), as in Euler’s method. As a result, the proof of the invariance of a vector field under a flow becomes a consequence of basic set-theoretic facts such as the associativity of N composition, or more precisely the commutation relation F F ◦ = N ◦ F ◦ F . Furthermore,◦ there are synthetic, or combinatorial, conditions Dk for the regularity of a (pre)vector field, replacing the usual analytic con- ditions of Ck type, that guarantee the usual theorems such as unique- ness and existence of solution of ODE locally, Frobenius theorem, Lie bracket, and other concepts. A pioneering work in applying true infinitesimals to treat differen- tial geometry of curves and surfaces is [Stroyan 1977]. Our approach is different because Stroyan’s point of departure is the analytic class C1 of vector fields, whereas we use purely combinatorial synthetic condi- tions D1 (and D2) in place of C1 (and C2). Similarly, the Ck conditions were taken as the point of departure in [Stroyan & Luxemburg 1976], [Lutz & Goze 1981], and [Almeida, Neves & Stroyan 2014]. To illustrate the advantages of this approach, we provide a proof of a case of the theorem of Frobenius on commuting flows. We also treat small oscillations of a pendulum (see Section 6.1), where the idea that the period of oscillations with infinitesimal amplitude should be independent of the amplitude finds precise mathematical expression.
15 16 2. INTRODUCTION
One advantage of the hyperreal approach to solving a differential equation is that the hyperreal walk exists for all time, being defined combinatorially by iteration of a selfmap of the manifold. The focus therefore shifts away from proving the existence of a solution, to estab- lishing the properties of a solution. Thus, our estimates show that given a uniform Lipschitz bound on the vector field, the hyperreal walk for all finite time stays in the finite part of the ∗M. If M is complete then the finite part is nearstandard. Our estimates they imply that the hyperreal walk for all finite time descends to a real flow on M.
2.1. Breakdown by chapters In Chapters 3 and 4, we introduce the basic notions of an extended number system while avoiding excursions into mathematical logic that are not always accessible to students trained in today’s undergraduate and graduate programs. We start with a syntactic account of a number system containing infinitesimals and infinite numbers, and progress to the relevant concepts such as filter, ultrapower construction, the exten- sion principle (from sets and functions over the reals to their natural extensions over the hyperreals), and the transfer principle. In Chapter 5, we present some typical notions and results from undergraduate calculus and analysis, such as continuity and uniform continuity, extreme value theorem, etc., from the point of view of the extended number system, introduce the notion of internal set gener- alizing that of a natural extention of a real set, and give a related construction of the reals out of hyperrationals. In Chapter 6.1 we apply the above to treat infinitesimal oscillations of the pendulum. In Chapter 7.1, we present the traditional approach to differentiable manifolds, vector fields, flows, and 1-parameter families of transforma- tions. In Chapter 7.8, we present the traditional “A-track” (see section 2.2) approach to the invariance of a vector field under its flow and to the theorem Frobenius on the commutation of flows. In Chapters 8.4 and 9.6, we present an approach to vectors and vec- tor fields as infinitesimal displacements, and obtain bounds necessary for proving the existence of the flow locally. In Chapters 10 and 11 we approach flows as iteration of the (pre)vector field. Chapters 12 and following contain more advanced material on uni- verses. 2.2. HISTORICAL REMARKS 17
2.2. Historical remarks Many histories of analysis are based on a default Weierstrassian foundation taken as a primary point of reference. In contrast, the article [Bair et al. 2013] developed an approach to the history of analysis as evolving along separate, and sometimes competing, tracks. These are: the A-track, based upon an Archimedean continuum; and • the B-track, based upon what we refer to as a Bernoullian (i.e., • infinitesimal-enriched) continuum.1 Historians often view the work in analysis from the 17th to the mid- dle of the 19th century as rooted in a background notion of continuum that is not punctiform. This necessarily creates a tension with modern, punctiform theories of the continuum, be it the A-type set-theoretic continuum as developed by Cantor, Dedekind, Weierstrass, and oth- ers, or B-type continua as developed by [Hewitt 1948], [Lo´s1955 ], [Robinson 1966], and others. How can one escape a trap of pre- sentism in interpreting the past from the viewpoint of set-theoretic foundations commonly accepted today, whether of type A or B? A possible answer to this query resides in a distinction between procedure and ontology. In analyzing the work of Fermat, Leibniz, Euler, Cauchy, and other great mathematicians of the past, one must be careful to distinguish between (1) its practical aspects, i.e., actual mathematical practice involv- ing procedures and inferential moves, and, (2) semantic aspects related to the actual construction of the enti- ties such as points of the continuum, i.e., issues of the ontology of mathematical entities such as numbers or points. In Chapter 14 we provide historical comments so as to connect the notions of infinitesimal analysis and differential geometry with their historical counterparts, with the proviso that the connection is meant in the procedural sense as explained above.
1Scholars attribute the first systematic use of infinitesimals as a foundational concept to Johann Bernoulli. While Leibniz exploited both infinitesimal methods and “exhaustion” methods usually interpreted in the context of an Archimedean continuum, Bernoulli never wavered from the infinitesimal methodology. To note the fact of such systematic use by Bernoulli is not to say that Bernoulli’s foundation is adequate, or that that it could distinguish between manipulations with infinites- imals that produce only true results and those manipulations that can yield false results. 18 2. INTRODUCTION
2.3. Preview of flows and infinitesimal generators As discussed in Section 7.7, the infinitesimal generator X of a flow θt(p) = θ(t, p): R M M on a differentiable manifold M is a (classical) vector field× defined→ by the relation 1 Xp(f) = lim f(θ∆t(p)) f(p) ∆t 0 → ∆t − satisfied by all functions f on M. Even though the term “infinitesimal generator” uses the adjective “infinitesimal”, this traditional notion does not actually exploit any infinitesimals. The formalism is somewhat involved, as can be sensed already in the proof of the invariance of the infinitesimal generator under the flow it generates, as well as the proof of the special case of the Frobenius theorem on commuting flows in Section 8.2. We will develop an alternative formalism where the infinitesimal generator of a flow is an actual infinitesimal (pre)vector field. As an application, we will present more transparent proofs of both the invari- ance of a vector field under its flow and the Frobenius theorem in this case. CHAPTER 3
A hyperreal view
3.1. Successive extensions N, Z, Q, R, ∗R Our reference for true infinitesimal calculus is Keisler’s textbook [Keisler 1986], downloadable at http://www.math.wisc.edu/~keisler/calc.html We start by motivating the familiar sequence of extensions of num- ber systems N ֒ Z ֒ Q ֒ R → → → in terms of their applications in arithmetic, algebra, and geometry. Each successive extension is introduced for the purpose of solving prob- lems, rather than enlarging the number system for its own sake. Thus, the extension Q R enables one to express the length of the diagonal of the unit square⊆ and the area of the unit disc in our number system. The familiar continuum R is an Archimedean continuum, in the sense that it satisfies the following Archimedean property: ( ǫ> 0)( n N)[nǫ > 1] . ∀ ∃ ∈ We therefore provisionally denote this continuum A where “A” stands for Archimedean. Thus we obtain a chain of extensions ,N ֒ Z ֒ Q ֒ A → → → as above. In each case one needs an enhanced ordered1 number system to solve an ever broader range of problems from algebra or geometry. Some historical remarks on Simon Stevin and his real numbers can be found in Section 14.1. The next stage is the extension ,N ֒ Z ֒ Q ֒ A ֒ B → → → → where B is a Bernoullian continuum containing infinitesimals, defined as follows. Definition 3.1.1. A Bernoullian extension of R is any proper ex- tension which is an ordered field.
1sadur 19 20 3. A HYPERREAL VIEW
Any Bernoullian extension allows us to define infinitesimals and do interesting things with those. But things become really interesting if we assume the Transfer Principle (Section 3.4), and work in a true hyperreal field, defined as in Definition 3.3.1 below. We will provide some motivating comments for the transfer principle in Section 3.3.
3.2. Motivating discussion for infinitesimals Infinitesimals can be motivated from three different angles: geo- metric, algebraic, and arithmetic/analytic.
θ
Figure 3.2.1. Horn angle θ is smaller than every recti- linear angle
Here is another figure (1) Geometric (horn angles): Some students have expressed the sentiment that they did not understand infinitesimals until they heard a geometric explanation of them in terms of what was classically known as horn angles. A horn angle is the crivice between a circle and its tangent line at the point of tangency. If one thinks of this crevice as a quantity, it is easy to convince oneself that it should be smaller than every rectilinear angle (see Figure 3.2.1). This is because a sufficiently small arc of the circle will be contained in the convex region cut out by the rectilinear angle no matter how small. When one renders this in terms of analysis and arithmetic, one gets a positive quantity smaller than every positive real number. We cite this example merely as intuitive motivation (our actual construction is different). 3.3. INTRODUCTION TO THE TRANSFER PRINCIPLE 21
(2) Algebraic (passage from ring to field): The idea is to represent an infinitesimal by a sequence tending to zero. One can get something in this direction without reliance on any form of the axiom of choice. Namely, take the ring S of all sequences of real numbers, with arithmetic operations defined term-by-term. Now quotient the ring S by the equivalence relation that declares two sequences to be equivalent if they differ only on a finite set of indices. The resulting object S/K is a proper ring extension of R, where R is embedded by means of the constant sequences. However, this object is not a field. For example, it has zero divisors. But quotienting it further in such a way as to get a field, by extending the kernel K to a maximal ideal K′, produces a field S/K′, namely a hyperreal field. (3) Analytic/arithmetic: One can mimick the construction of the reals out of the rationals as the set of equivalence classes of Cauchy sequences, and construct the hyperreals as equiva- lence classes of sequences of real numbers under an appropriate equivalence relation. This viewpoint is detailed in Section 4.5. Some more technical comments on the definability of a hyperreal line can be found in Section 12.8.
3.3. Introduction to the transfer principle The transfer principle is a type of theorem that, depending on the context, asserts that rules, laws or procedures valid for a certain num- ber system, still apply (i.e., are “transfered”) to an extended number system. Thus, the familiar extension Q R preserves the property of being an ordered field. To give a negative⊆ example, the exten- sion R R of the real numbers to the so-called extended reals does not⊆ preserve∪{±∞} the property of being an ordered field. The hyperreal extension R ∗R (defined below) preserves all first-order properties (i.e., properties⊆ involving quantification over elements but not over sets; see Section 5.4 for a fuller discussion) of ordered fields. For example, the formula sin2 x + cos2 x = 1, true over R for all real x, remains valid over ∗R for all hyperreal x, including infinitesimal and infinite values of x ∗R. ∈ Thus the transfer principle for the extension R ∗R is a theorem ⊆ asserting that any statement true over R is similarly true over ∗R, 22 3. A HYPERREAL VIEW and vice versa. Historically, the transfer principle has its roots in the procedures involving Leibniz’s Law of continuity.2 We will explain the transfer principle in several stages of increasing degree of abstraction. More details can be found in Sections 3.4, 5.2, and 5.3. Definition 3.3.1. An ordered field B, properly including the field A = R of real numbers (so that A $ B) and satisfying the Transfer Principle, is called a hyperreal field. If such an extended field B is fixed then elements of B are called hyperreal numbers,3 while the extended field itself is usually denoted ∗R. Theorem 3.3.2. Hyperreal fields exist. For example, a hyperreal field can be constructed as the quotient of the ring RN of sequences of real numbers, by an appropriate maximal ideal. This will be shown in Section 4.7. Definition 3.3.3. A positive infinitesimal is a positive hyperreal number ǫ such that ( n N)[nǫ < 1] ∀ ∈ More generally, we have the following. Definition 3.3.4. A hyperreal number ε is said to be infinitely small or infinitesimal if a<ε 0 1 is infinitesimal then N = ε is positive infinite, i.e., greater than every real number. A hyperreal number that is not an infinite number are called finite. Sometimes the term limited is used in place of finite. Keisler’s textbook exploits the technique of representing the hyper- real line graphically by means of dots indicating the separation between the finite realm and the infinite realm. One can view infinitesimals with microscopes as in Figure 3.6.1. One can also view infinite numbers with telescopes as in Figure 3.3.1. We have an important subset
finite hyperreals ∗R { } ⊆ 2Leibniz’s theoretical strategy in dealing with infinitesimals was analyzed in detailed studies [Katz & Sherry 2012] as well as [Katz & Sherry 2013], [Sherry & Katz 2014], [Bair et al. 2013]. We found Leibniz’ strategy to be more robust than George Berkeley’s flawed critique thereof. 3Similar terminology is used with regard to integers and hyperintegers; see Section 5.5. 3.3. INTRODUCTION TO THE TRANSFER PRINCIPLE 23
Positive Finite infinite
2 1 0 1 2 − −
1/ǫ Infinite telescope 1 1 1 +1 ǫ − ǫ
Figure 3.3.1. Keisler’s telescope which is the domain of the function called the standard part function (also known as the shadow) which rounds off each finite hyperreal to the nearest real number (for details see Section 3.6).
2 Example 3.3.5. Slope calculation of y = x at x0. Here we use standard part function (also called the shadow)
st : finite hyperreals R { }→ (for details concerning the shadow see Section 3.6). If a curve is defined 2 by y = x we wish to find the slope at the point x0. To this end we use an infinitesimal x-increment ∆x and compute the corresponding y- increment ∆y =(x + ∆x)2 x2 =(x + ∆x + x )(x + ∆x x )=(2x + ∆x)∆x. 0 − 0 0 0 − 0 0 The corresponding “average” slope is therefore ∆y =2x + ∆x ∆x 0 which is infinitely close to 2x, and we are naturally led to the definition ∆y ∆y of the slope at x0 as the shadow of ∆x , namely st ∆x =2x0. 24 3. A HYPERREAL VIEW
The extension principle expresses the idea that all real objects have natural hyperreal counterparts. We will be mainly interested in sets, functions and relations. We then have the following extension principle. Extension principle. The order relation on the hyperreals con- tains the order relation on the reals. There exists a hyperreal number greater than zero but smaller than every positive real. Every set D R ⊆ has a natural extension ∗D ∗R. Every real function f with domain D ⊆ 4 has a natural hyperreal extension ∗f with domain ∗D. Here the naturality of the extension alludes to the fact that such an extension is unique, and the coherence refers to the fact that the domain of the natural extension of a function is the natural extension of its domain. A positive infinitesimal is a positive hyperreal smaller than every positive real. A negative infinitesimal is a negative hyperreal greater than every negative real. An arbitrary infinitesimal is either a positive infinitesimal, a negative infinitesimal, or zero. Ultimately it turns out counterproductive to employ asterisks for hyperreal functions (in fact we already dropped it in equation (3.4.1)). This issue will be dealt with in (5.4.3).
3.4. Transfer principle Definition 3.4.1. The Transfer Principle asserts that every first- order statement true over R is similarly true over ∗R, and vice versa. Here the adjective first-order alludes to the limitation on quantifi- cation to elements as opposed to sets, as discussed in more detail in Section 5.4. Listed below are a few examples of first-order statements. A de- tailed account in terms of a first-order limitation can be found in Sec- tion 5.4. Example 3.4.2. The commutativity rule for addition x+y = y +x is valid for all hyperreal x, y by the transfer principle. Example 3.4.3. The formula sin2 x + cos2 x = 1 (3.4.1) is valid for all hyperreal x by the transfer principle.
4Here the noun principle (in extension principle) means that we are going to ∗ assume that there is a function f : B B which satisfies certain properties. → ∗ It is a separate problem to define B which admits a coherent definition of f for all f : A A, to be solved below. → 3.5. ORDERS OF MAGNITUDE 25
Example 3.4.4. The statement 1 1 0 Example 3.4.5. The characteristic function χQ of the rational numbers equals 1 on rational inputs and 0 on irrational inputs. By ∗ the transfer principle, its natural extension ∗χQ = χ Q will be 1 on hyperrational numbers ∗Q and 0 on hyperirrational numbers (namely, numbers in the complement ∗R ∗Q). \ To give additional examples of real statements to which transfer applies, note that all ordered field-statements are subject to Trans- fer. As we will see below, it is possible to extend Transfer to a much broader category of statements, such as those containing the function symbols exp or sin or those that involve infinite sequences of reals. A hyperreal number x is finite if there exists a real number r such that x < r. A hyperreal number is called positive infinite if it is greater| | than every real number, and negative infinite if it is smaller than every real number. 3.5. Orders of magnitude Hyperreal numbers come in three orders of magnitude: infinitesi- mal, appreciable, and infinite. A number is appreciable if it is finite but not infinitesimal. In this section we will outline the rules for ma- nipulating hyperreal numbers. To give a typical proof, consider the rule that if ǫ is positive infini- 1 tesimal then ǫ is positive infinite. Indeed, for every positive real r we 1 have 0 <ǫ Hb and HK are infinite. • We have the following rules for quotients. ǫ , ǫ , b are infinitesimal; • b H H b is appreciable; • c b , H , H are infinite provided ǫ =0. • ǫ ǫ b 6 We have the following rules for roots, where n is a standard natural number. if ǫ> 0 then √n ǫ is infinitesimal; • n if b> 0 then √b is appreciable; • if H > 0 then √n H is infinite. • Note that the traditional topic of the so-called “indeterminate forms” can be treated without introducing any ad-hoc terminology by means of the following remark. ǫ H Remark 3.5.2. We have no rules in certain cases, such as δ , K , Hǫ, and H + K. Theorem 3.5.3. Arithmetic operations on the hyperreal numbers are governed by the following rules. (1) every hyperreal number between two infinitesimals is infinites- imal. (2) Every hyperreal number which is between two finite hyperreal numbers, is finite. (3) Every hyperreal number which is greater than some positive infinite number, is positive infinite. (4) Every hyperreal number which is less than some negative infi- nite number, is negative infinite. Example 3.5.4. The difference √H +1 √H 1 (where H is infinite) is infinitesimal. Namely, − − √H +1 √H 1 √H +1+ √H 1 √H +1 √H 1= − − − − − √H +1+ √H 1 − H +1 (H 1) = − − √H +1+ √H 1 − 2 = √H +1+ √H 1 − is infinitesimal. Once we introduce limits, this example can be refor- mulated as follows: limn √n +1 √n 1 = 0. →∞ − − 3.7. DIFFERENTIATION 27 Definition 3.5.5. Two hyperreal numbers a, b are said to be in- finitely close, written a b, ≈ if their difference a b is infinitesimal. − It is convenient also to introduce the following terminology and notation. Definition 3.5.6. Two nonzero hyperreal numbers a, b are said to be adequal, written a pq b, if either a 1 or a = b = 0. b ≈ Note that the relation sin x x for infinitesimal x is immediate from the continuity of sine at the≈ origin (in fact both sides are infin- itely close to 0), whereas the relation sin x pq x is a subtler relation equivalent to the computation of the first order Taylor approximation of sine. 3.6. Standard part principle Theorem 3.6.1 (Standard Part Principle). Every finite hyperreal number x is infinitely close to an appropriate real number. Proof. The result is generally true for an arbitrary proper ordered field extension E of R. Indeed, if x E is finite, then x induces a ∈ Dedekind cut on the subfield Q R E via the total order of E. The real number corresponding to the⊆ Dedekind⊆ cut is then infinitely close to x. The real number infinitely close to x is called the standard part, or shadow, denoted st(x), of x. We will use the notation ∆x, ∆y for infinitesimals. Remark 3.6.2. There are three consecutive stages in a typical cal- culation: (1) calculations with hyperreal numbers, (2) calculation with standard part, (3) calculation with real numbers. 3.7. Differentiation An infinitesimal increment ∆x can be visualized graphically by means of a microscope as in the Figure 3.7.1. The slope s of a function f at a real point a is defined by setting f(a + ∆x) f(a) s = st − ∆x 28 3. A HYPERREAL VIEW 1 0 1 2 3 4 − r + α r r + β r + γ st 1 0 1 r 2 3 4 − Figure 3.6.1. The standard part function, st, “rounds off” a finite hyperreal to the nearest real number. The func- tion st is here represented by a vertical projection. Keisler’s “infinitesimal microscope” is used to view an infinitesimal neighborhood of a standard real number r, where α, β, and γ represent typical infinitesimals. Courtesy of Wikipedia. whenever the shadow exists (i.e., the ratio is finite) and is the same for each nonzero infinitesimal ∆x. The construction is illustrated in Figure 3.7.2. Definition 3.7.1. Let f be a real function of one real variable. The derivative of f is the new function f ′ whose value at a real x is the slope of f at x. In symbols, f(x + ∆x) f(x) f ′(x)=st − ∆x whenever the slope exists. f(x+∆x) f(x) − Equivalently, we can write f ′(x) ∆x . When y = f(x) we define a new dependent variable ∆≈y by settting ∆y = f(x + ∆x) f(x) − ∆y called the y-increment, so we can write the derivative as st ∆x . 3.7. DIFFERENTIATION 29 a a a + ∆x Figure 3.7.1. Infinitesimal increment ∆x under the microscope Example 3.7.2. If f(x)= x2 we obtain the derivative of y = f(x) by the following direct calculation: ∆y f ′(x) ≈ ∆x (x + ∆x)2 x2 = − ∆x (x + ∆x x)(x + ∆x + x) = − ∆x ∆x(2x + ∆x) = ∆x =2x + ∆x 2x. ≈ 30 3. A HYPERREAL VIEW f(a + ∆x) f(a) | − | a a + ∆x a Figure 3.7.2. Defining slope of f at a Given a function y = f(x) one defines the dependent variable ∆y = f(x+∆x) f(x) as above. One also defines a new dependent variable dy − by setting dy = f ′(x)∆x at a point where f is differentiable, and sets for symmetry dx = ∆x. Note that we have dy pq ∆y as in Definition 3.5.6 whenever dy = 0. We then have Leibniz’s notation 6 dy f ′(x)= dx and rules like the chain rule acquire an appealing form. CHAPTER 4 Ultrapower construction and transfer principle To motivate the material on filters contained in this chapter, we will first outline a construction of a hyperreal field ∗R exploiting filters. A more detailed technical presentation of the construction appears in Section 4.7. 4.1. Introduction to filters Let RN denote the ring of sequences of real numbers, with arithmetic operations defined termwise. Then we have N ∗R = R /MAX (4.1.1) where “MAX” is an appropriate maximal ideal. What we wish to emphasize is the formal analogy (4.1.1) and the construction of the real field as the quotient field of the ring of Cauchy sequences of rational numbers. Note that in both cases, the subfield is embedded in the superfield by means of constant sequences. We now describe a construction of such a maximal ideal. The ideal MAX consists of all “negligible” sequences un , i.e., sequences which vanish for a set of indices of full measure 1 namely,h i m n N : un =0 =1. { ∈ } Here m : (N) 0, 1 (thus m takes only two values, 0 and 1) is a finitelyP additive→ {measure} taking the value 1 on each cofinite set,1 where (N) is the set of subsets of N. The subset m (N) consisting of setsP of full measure m is called a free ultrafilter.F These⊆ P originate with [Tarski 1930]. The construction of a Bernoullian continuum outlined above was therefore not available prior to that date. The construc- tion outlined above is known as an ultrapower construction. The first construction of this type appeared in [Hewitt 1948], as did the term hyper-real. 1For each pair of complementary infinite subsets of N, such a measure m decides in a coherent way which one is negligible (i.e., of measure 0) and which is dominant (measure 1). 31 32 4. ULTRAPOWER CONSTRUCTION AND TRANSFER PRINCIPLE As already mentioned, to present an infinitesimal-enriched contin- uum of hyperreals, we need some preliminaries on filters. Let I be a nonempty set (usually N). The power set of I is the set (I)= A : A I P { ⊆ } of all subsets of I. Definition 4.1.1. A filter on I is a nonempty collection (I) of subsets of I satisfying the following axioms: F ⊆ P Intersections: if A, B , then A B . • Supersets: if A and∈FA B ∩I, then∈FB . • ∈F ⊆ ⊆ ∈F Thus to show B , it suffices to show ∈F A A B, 1 ∩···∩ n ⊆ for some finite n and some A1,...,An . The full power set (I) is itself a filter.∈F P A filter contains the empty set ∅ if and only if = (I). F F P Definition 4.1.2. We say that is proper if ∅ . F 6∈ F Every filter contains I, and in fact the one-element set I is the smallest filter on I. { } Definition 4.1.3. An ultrafilter is a proper filter that satisfies the following additional property: for any A I, either A or Ac , where Ac = I A. • ⊆ ∈F ∈F \ 4.2. Examples of Filters We present several examples of filters. (1) Principal ultrafilter. Choose i I. Then the collection ∈ i = A I : i A F { ⊆ ∈ } is an ultrafilter, called the principal ultrafilter generated by i. If I is finite, then every ultrafilter on I is of the form i for some i I, and so is principal. F (2) Fr´echet∈ filter. The filter Fre = A I : I A is finite is the cofinite, or Fr´echet, filterF on I, and{ ⊆ is proper\ 2 if and only} if I is infinite. Note that Fre is not an ultrafilter. F 2amiti 4.3. PROPERTIES OF FILTERS 33 (3) Filter generated by a collection. If ∅ = (I), then the filter generated by , i.e., the smallest6 filterH⊆ on PI including , is the collection H H H = A I : A B B for some n and some B F { ⊆ ⊃ 1 ∩···∩ n i ∈ H} For = ∅ we put H = I . H F { } If has a single member B, then H = A I : A B , whichH is called the principal filter generatedF by{ B⊆. The ultrafil-⊃ } ter i of Example (1) is the special case of this when B = i , namelyF a set containing a single element. { } 4.3. Properties of filters Theorem 4.3.1. An ultrafilter satisfies F A B iff A and B , ∩ ∈F ∈F ∈F as well as A B iff A or B , ∪ ∈F ∈F ∈F and also Ac if and only if A . ∈F 6∈ F Theorem 4.3.2. Let be an ultrafilter and A ,...,A a finite F { 1 n} collection of pairwise disjoint (Ai Aj = ∅) sets such that ∩ A ... A . 1 ∪ ∪ n ∈F Then A for exactly one i such that 1 i n. i ∈F ≤ ≤ Theorem 4.3.3. A nonprincipal (or free) ultrafilter must contain each cofinite set. Thus every free ultrafilter includes the Fr´echet filter Fre, namely F F Fre . F ⊆F This is a crucial property used in the construction of infinitesimals and infinitely large numbers. Remark 4.3.4. A filter is an ultrafilter on I if and only if it is a maximal proper filter on I, i.e.,F a proper filter that cannot be extended to a larger proper filter on I. Definition 4.3.5. A collection (I) has the finite intersection property if the intersection of everyH nonempty ⊆ P finite subcollection of is nonempty, i.e., H B1 Bn = ∅ for any n and any B1,...,Bn . ∩···∩ 6 ∈ H Note that the filter H is proper if and only if the collection has the finite intersection propertyF (see Definition 4.3.5). H 34 4. ULTRAPOWER CONSTRUCTION AND TRANSFER PRINCIPLE 4.4. Real number field as quotient of sequences of rationals To motivate the construction of the hyperreal numbers, we will first analyze the construction of the real numbers via Cauchy sequences. Let QN denote the ring of sequences of rational numbers. Let N QC denote the subring consisting of Cauchy sequences. The reals are by definition the quotient field N R := QC MAX (4.4.1) where MAX is the maximal ideal consisting of null sequences (i.e., N sequences tending to zero). Note that QC is only a ring, whereas the quotient (4.4.1) is a field. The point we wish to emphasize is that a field extension is con- structed starting with the base field and using sequences of elements in the base field. 4.5. Extension of Q We will think of sets in the ultrafilter as dominant and their com- plements as negligible.3 We choose a nonprincipal ultrafilter on N. WeformanidealMAX = N F MAX Q consisting of all sequences un : n N of rational num- bers suchF ⊆ that h ∈ i n N : un =0 . { ∈ }∈F An infinitesimal-enriched field extension ∗Q of Q may be obtained via the ultrapower construction by forming the quotient as follows: N ∗Q = Q MAX, where we dropped the subscript from MAX for simplicity. F Remark 4.5.1. This is not the same maximal ideal as the one used in Section 4.4. We use the same notation to emphasize the similarity of the two constructions. 4.6. Examples of infinitesimals With respect to the construction presented in Section 4.5, we now give some examples of infinitesimals. Definition 4.6.1. The equivalence class of a sequence u = u h ni will be denoted [u] or alternatively [un]. 3zaniach 4.7. ULTRAPOWER CONSTRUCTION OF THE HYPERREAL FIELD 35 1 Example 4.6.2. Let α = n . Then α is smaller than every positive real number r > 0. In more detail, α = [u] where u = un : n N is the null sequence 1h ∈ i (i.e., sequence tending to zero) un = n . Consider the set S = n N : un (4) We observe that MAX is an ideal of RN. F (5) We observe that the quotient RN / MAX is a field and in fact an ordered field (see Definition 4.7.1). F (6) We denote it by RN / to simplify notation. F Definition 4.7.1. A natural order relation <∗ on ∗R is defined by setting [un] <∗ [vn] if and only of the relation < holds for a “dominant” set of indices, where “dominance” is determined by a fixed ultrafilter : F [un] <∗ [vn] if and only if n N : un < vn . { ∈ }∈F The complement of a dominant set can be viewed as “negligible.”4 More details on the ultrapower construction can be found in [Davis 1977]. Theorem 4.7.2. The field RN / satisfies the transfer principle. F We give a proof of the transfer principle in Chapter 12. See [Chang & Keisler 1990, Chapter 4], on a more general model theoretic study of ultrapowers and their applications. 4zaniach CHAPTER 5 Microcontinuity, EVT, internal sets 5.1. Construction via equivalence relation From now on, we will work with the following version of the con- struction (4.7.1) formulated in terms of an equivalence relation, which is more readily generalizable to other contexts (see Section 5.10). Namely, we set N ∗R = R / ∼ where is an equivalence relation defined as follows: ∼ un vn if and only if n N : un = vn . h i∼h i { ∈ }∈F Since the relation depends on , so does the quotient RN/ . For this reason the logicians,∼ and they areF definitely a special breed,∼ decided to write directly RN / . F But we will see that this abuse of notation offers some insights. Definition 5.1.1. Subsets ∗Q and ∗N of ∗R consist of -classes of sequences u with rational (respectively, natural) terms. F h ni If X R then the subset ∗X of ∗R consists of -classes of se- ⊆ F quences un with terms un X for each n. In particular the subsets ∗Q h i ∈ and ∗N of ∗R consist of -classes of sequences un with rational (re- spectively, natural) terms.F h i Definition 5.1.2. If X R and f : X R, then the map ∗f : ⊆ → ∗X ∗R sends each -class of a sequence un (un X) to the -class of the→ sequence v Fwhere v = f(u ) forh eachi n. ∈ F h ni n n It is easy to check that this definition ∗f is independent of choices made in the construction. As noted in Definition 4.7.1, the order on ∗R is defined by set- ting [u] <∗ [v] if and only if n N : un < vn . { ∈ }∈F 5.2. Extending the ordered field language Let the ordered field language be defined starting with a pair of operations and +, and enhanced as follows. · 37 38 5. MICROCONTINUITY, EVT, INTERNAL SETS (1) We allow expressions like (x + y) z +(x 2y) called terms; · − (2) we further define formulas like T = T ′ and T < T ′ where T, T ′ are terms; (3) we allow more complex formulas by means of logical connec- tives and quantifiers; (4) given a concrete ordered field F , we allow the replacement of free variables in formulas by elements of F ; (5) this leads to the notion of a formula being true in F ; (6) this finally enables us to express and study various properties of F in a formal and well defined way. However, the ordered field language is not sufficient as a basis for either calculus or analysis. For instance there is no way to express phe- nomena related to non-algebraic functions like exp, sin, etc. Therefore we need to extend the ordered field language further. Definition 5.2.1. We will use the term extended real number lan- guage to refer to the ordered field language extended further as follows: (7) we can freely use expressions like f(x), where f : R R is any particular function; → (8) we can freely use formulas like x D, where D R is any particular set; ∈ ⊆ (9) we can freely use any particular real numbers as parameters. Theorem 5.2.2 (Transfer revisited). Let A be a sentence of the extended real number language, and let ∗A be obtained from A by sub- stituting ∗f and ∗D for each f or D which occur in A. Then A is true over R if and only if ∗A is true over ∗R. In Chapter 12 we will prove a generalisation of Theorem 5.2.2. In Remark 5.4.3 we will explain our approach to dropping the as- terisks on functions. Remark 5.2.3. In the statement of the theorem we wrote true over R, or over ∗R, rather than the usual true in R, to emphasize the fact that the sentences involved refer not only to real numbers themselves but also to sets of reals as well as real functions, namely, objects usually thought of as elements of a superstructure over R; see Chapter 12 for more details. Further analysis of the transfer principle appears in Section 5.3. A complete proof of the transfer principle appears in Chapter 12. 5.4. EXAMPLES OF FIRST ORDER STATEMENTS 39 5.3. Existential and Universal Transfer Some motivating comments for the transfer principle already ap- peared in Section 3.3. Recall that by Theorem 4.7.2 we have an exten- sion R ∗R of an Archimedean continuum by a Bernoullian one which ⊆ furthermore satisfies transfer. To define ∗R, we fix a nonprincipal ultra- N filter over N and let ∗R = R / . By construction, elements [u] of ∗R F F N are represented by sequences u = un : n N of real numbers, u R , with an appropriate equivalenceh relation∈ definedi in terms of ∈as in Section 4.7, or in symbols F N ∗R := R . F Here an order is defined as follows. We have [u] > 0 if and only if the set n N : un > 0 belongs to . { ∈ } F Definition 5.3.1. The inclusion R ∗R is defined by sending a → real number r R to the constant sequence un = r, i.e., ∈ r, r, r, . . . . h i Since we view R as a subset of ∗R, we will denote the resulting (stan- dard) hyperreal by the same symbol r. The transfer principle was formulated in Theorem 5.2.2. To summa- rize, it asserts that truth over R is equivalent truth over ∗R. Therefore there are two directions to transfer in. Definition 5.3.2. Transfer of statements in the direction R ∗R is called universal,1 whereas transfer in the opposite direction ∗R R is called existential.2 5.4. Examples of first order statements In this section we will illustrate the application of the transfer prin- ciple by several examples of transfer, applied to sentences familiar from calculus. The statements transfer applies to are first-order quantified formulas, namely formulas involving quantification over field elements only. Quantification over subsets of the field is disallowed. Thus, the completeness property of the reals is not transferable because its for- mulation involves quantification over all subsets of the field. 1haavara colelet 2haavara yeshit 40 5. MICROCONTINUITY, EVT, INTERNAL SETS Example 5.4.1. Rational numbers q > 0 satisfy the following: n ( q Q+)( n, m N) q = , ∀ ∈ ∃ ∈ m + h i where Q is the set of positive rationals. By universal transfer, we obtain the following statement, satisfied by all hyperrational numbers: + n ( q ∗Q )( n, m ∗N) q = . ∀ ∈ ∃ ∈ m Here of course n, m may be infinite. Hypernaturalh i numbers will be discussed in more detail in Section 5.5. Example 5.4.2. Recall that a function f is continuous at a real point c R if the following condition is satisfied: ∈ ( ǫ R+)( δ R+)( x R) x c < δ f(x) f(c) < ǫ . ∀ ∈ ∃ ∈ ∀ ∈ | − | ⇒ | − | By universal transfer, a continuous function f similarly satisfies the following formula over ∗R: + + ( ǫ ∗R )( δ ∗R )( x ∗R) x c < δ ∗f(x) ∗f(c) < ǫ . ∀ ∈ ∃ ∈ ∀ ∈ | − | ⇒ | − | Note that the formula is satisfied in particular for each positive infini- tesimal ǫ ∗R. We remark that Transfer is applicable to functions, as well (in addition∈ to the ordered field formulas). Remark 5.4.3. In practice most calculations exploit the hyperreal extension of a function f, rather than f itself. We will therefore con- tinue on occasion to denote by f the extended hyperreal function, as well. Example 5.4.4. A real function f is continuous in a domain D R if the following condition is satisfied: ⊆ + + ( x D)( ǫ R )( δ R )( x′ D) ∀ ∈ ∀ ∈ ∃ ∈ ∀ ∈ x′ x < δ f(x′) f(x) < ǫ . | − | ⇒ | − | or equivalently + + ( ǫ R )( x D)( δ R )( x′ D) ∀ ∈ ∀ ∈ ∃ ∈ ∀ ∈ (5.4.1) x′ x < δ f(x′) f(x) < ǫ . | − | ⇒ | − | By universal transfer, the hyperreal extension ∗f of such a function will similarly satisfy the following formula over ∗R: + + ( ǫ ∗R )( x ∗D)( δ ∗R )( x′ ∗D) ∀ ∈ ∀ ∈ ∃ ∈ ∀ ∈ (5.4.2) x′ x < δ ∗f(x′) ∗f(x) < ǫ , | − | ⇒ | − | where ∗D ∗R is the natural extension of D R (see Section 3.1). Note that⊆ transfer is applicable to all such sets ⊆D. 5.5. GALAXIES 41 Example 5.4.5. A real function f is uniformly continuous in a domain D R if the following condition is satisfied: ⊆ + + ( ǫ R )( δ R )( x D)( x′ D) ∀ ∈ ∃ ∈ ∀ ∈ ∀ ∈ x′ x < δ f(x′) f(x) < ǫ . | − | ⇒ | − | By universal transfer, the hyperreal extension ∗f of such a function will similarly satisfy the following formula over ∗R: + + ( ǫ ∗R )( δ ∗R )( x ∗D)( x′ ∗D) ∀ ∈ ∃ ∈ ∀ ∈ ∀ ∈ x′ x < δ ∗f(x′) ∗f(x) < ǫ . | − | ⇒ | − | An alternative characterisation of uniform continuity, of reduced quantifier complexity, is presented in Section 5.7. Example 5.4.6. To give an example of existential transfer, consider the problem of showing that if f is differentiable at c R and f ′(c) > 0 ∈ then there is a point x R,x>c such that f(x) > f(c). Indeed, for ∈ f(c+α) f(c) − infinitesimal α> 0 we have st α > 0 and hence f(c + α) > f(c). Setting x = c + α, we see that the following formula is satisfied over ∗R: ( x)[f(x) > f(c)]. ∃ Therefore by existential transfer, the same formula holds over the real field R. Thus there is a real x such that f(x) > f(c). Remark 5.4.7. Another example of existential transfer will be given in Section 5.6 following formula (5.6.3). 5.5. Galaxies Let us consider the hypernatural numbers ∗N (positive hyperinte- gers) in more detail. Definition 5.5.1. A hyperreal number x is called finite if x is less than some real number. Equivalently, x is finite if x is less| than| some natural number. A number is infinite if it is not finite.| | Thus an infinite positive hypereal is a hyperreal number bigger than each real number. Now consider the extension N ∗N constructed by an ultrapower ⊆ as in Section 4.7. Recall that a sequence u NN given by ∈ u = un : n N h ∈ i N is equivalent to a sequence v N given by v = vn : n N if and only if the set of indices ∈ h ∈ i n N : un = vn { ∈ } 42 5. MICROCONTINUITY, EVT, INTERNAL SETS is a member of a fixed ultrafilter (N). F ⊆ P We identify N with its image in ∗N under the natural embedding generated by n n, n, n, . . . . 7→ h i Theorem 5.5.2. Every finite hypernatural number is a natural num- ber. Proof. Let [u] ∗N where u = un : n N and suppose [u] is finite. Then there exists∈ a real numberh ∈ i r> 0 such that [u] n N : un Si0 = n N : un = i0 , { ∈ } is dominant, i.e., it is a member of . Therefore [u]= i0 and so [u] is a natural number. F We provide an alternative proof exploiting the transfer principle. One observes that natural numbers satisfy the formula ( n N) n ( n ∗N) n Definition 5.5.4. An equivalence relation g on ∗N is defined as ∼ follows: for n, m ∗N, we have ∈ n m if and only if n m is finite. ∼g | − | Definition 5.5.5. An equivalence class of the relation g is called a galaxy. The galaxy of an element x will be denoted gal(x∼).3 By Corollary 5.5.3, there is a unique galaxy containing finite num- bers, namely the galaxy gal(0) = 0, 1, 2,... = N . { } Each galaxy containing an infinite number such as H is of the form gal(H)= ...,H 3,H 2,H 1,H,H +1,H +2,H +3,... { − − − } and is therefore order-isomorphic to Z (rather than N). 4 Corollary 5.5.6. The ordered set (∗N,<) is not well-ordered. 5.6. Microcontinuity Let D = Df R be the domain of a real function f, where the notation was discussed⊆ in Remark 5.4.3. Definition 5.6.1. We say that f is microcontinuous at x if when- ever x′ x, one also has f(x′) f(x) for all x′ is in the domain ∗D ≈ ≈ f of ∗f. Here x′ x means that x′ x is infinitesimal. ≈ − Remark 5.6.2. It is a good exercise to prove that ∗(Df )= D∗f . Remark 5.6.3. This definition can be applied not only at a real point x D but also at a hyperreal point x ∗D ; see Example 5.7.3. ∈ f ∈ f Microcontinuity at all points in a segment [x0,X] provides a useful modern proxy for Cauchy’s definition of continuity in his 1821 text Cours d’Analyse: the function f(x) is continuous with respect to x be- tween the given limits if, between these limits, an in- finitely small increment in the variable always pro- duces an infinitely small increment in the function itself. (translation by [Bradley & Sandifer 2009, p. 26]). 3Sometimes galaxies are defined in the context of larger number systems, such ∗ as R. Our main interest in galaxies lies in illustrating the concept of a non-internal set; see Section 5.11. 4Sadur tov 44 5. MICROCONTINUITY, EVT, INTERNAL SETS In the current cultural climate it needs to be pointed out that Cauchy was obviously not familiar with our particular model of a B-continuum. On the other hand, his procedures and inferential moves find closer proxies in the context of modern infinitesimal-enriched continua than in the context of modern Archimedean continua. Theorem 5.6.4. Let c R. A real function f is continuous in ∈ the ǫ, δ sense at c if and only if ∗f is microcontinuous at c. Proof. We are following [Goldblatt 1998, p. 76]. Recall that a real function f is continuous at c in the ǫ, δ sense if + + ( ǫ R )( δ R )( x Df ) x c < δ f(x) f(c) < ǫ . ∀ ∈ ∃ ∈ ∀ ∈ | − | ⇒| − | (5.6.1) Assume that ∗f is microcontinuous at c, so that x c ∗f(x) ∗f(c). (5.6.2) ≈ ⇒ ≈ Let us prove that f is continuous in the sense of (5.6.1). Choose a real ǫ > 0 as in the leftmost quantifier in (5.6.1). Let d > 0 be infini- tesimal. If x c ( x ∗D ) x c < δ ∗f(x) ∗f(c) < ǫ . ∀ ∈ f | − | ⇒| − | But now if we choose x such that x c, the condition x c < δ is automatically satisfied since x c ≈is infinitesimal. Therefore| − | the − 5.7. UNIFORM CONTINUITY IN TERMS OF MICROCONTINUITY 45 inequality ∗f(x) ∗f(c) < ǫ holds for arbitrary real ǫ > 0. It fol- | − | lows that ∗f(x) ∗f(c), proving the relation (5.6.2) and the opposite implication. ≈ 5.7. Uniform continuity in terms of microcontinuity The ǫ, δ definition of uniform continuity was reviewed in Section 5.3. We now introduce an equivalent definition in terms of microcontinuity. Definition 5.7.1 (Alternative definition). A real function f is uni- formly continuous in its domain D = Df if and only if ( x ∗D )( x′ ∗D )[x x′ ∗f(x) ∗f(x′)] . ∀ ∈ f ∀ ∈ f ≈ ⇒ ≈ Remark 5.7.2. This definition of uniform continuity of a real func- tion f can be reformulated in terms of its natural extension ∗f to the hyperreals as follows: for all x ∗D , ∗f is microcontinuous at x. ∈ f This sounds startlingly similar to the definition of continuity it- self. What is the difference between the two definitions? The point is that microcontinuity is now required at every point of the Bernoullian continuum rather than merely at the points of the Archimedean con- tinuum, i.e., in the domain of ∗f which is the natural extension of the real domain of f. Example 5.7.3. The function f(x)= x2 fails to be uniformly con- tinuous on its domain D = R because of the failure of microcontinuity of its natural extension ∗f at any single infinite hyperreal H ∗D = ∈ f ∗R. The failure of microcontinuity at H is checked as follows. Consider 1 the infinitesimal e = H , and the point H + e infinitely close to H. To show that ∗f is not microcontinuous at H, we calculate 2 2 2 2 2 2 ∗f(H + e)=(H + e) = H +2He + e = H +2+ e H +2. ≈ 2 This value is not infinitely close to ∗f(H)= H : ∗f(H + e) ∗f(H)=2+ ǫ 0. − 6≈ Therefore microcontinuity fails at the point H ∗R. Thus the squaring ∈ function f(x)= x2 is not uniformly continuous on R. The term “microcontinuity” is exploited in [Davis 1977] as well as [Gordon et al. 2002]. It reflects the existence of two definitions of continuity, one using infinitesimals, and one using epsilons. The former is what we refer to as microcontinuity. It is given a special name to dis- tinguish it from the traditional definition of continuity. Note that mi- crocontinuity at a non-standard hyperreal does not correspond to any 46 5. MICROCONTINUITY, EVT, INTERNAL SETS notion available in the epsilontic framework limited to an Archimedean continuum. 1 Example 5.7.4. Consider the function f given by f(x)= x on the open interval D = (0, 1) R. Then ∗f fails to be microcontinuous at a positive infinitesimal.⊂ To see this, choose an infinite H > 0 and 1 1 let x = and x′ = . Clearly x x′. Both of these points are in H H+1 ≈ the extended domain ∗D = ∗(0, 1). Then ∗f(x′) ∗f(x)= H +1 H =1 0. − − 6≈ It follows that the real function f is not uniformly continuous on the open interval (0, 1) R. ⊆ 5.8. Hyperreal extreme value theorem Example 5.8.1. The unit interval [0, 1] R has a natural extension ⊆ ∗[0, 1] ∗R. This contains all positive infinitesimals as well as all numbers⊆ smaller than 1 and infinitely close to 1. Consider the extreme value theorem (EVT). The classical proof of the EVT usually proceeds in two or more stages. (1) Typically, one first shows that the function is bounded. (2) Then one proceeds to construct an extremum by one or an- other procedure involving choices of sequences. The hyperreal approach is both more economical (there is no need to prove boundedness first) and less technical. Furthermore, in the hyperreal approach one can actually write down a succinct formula for a maximum (rather than merely prove its exis- tence); see formula (5.8.3). Theorem 5.8.2. A continuous function f on [0, 1] R has a max- imum. ⊂ Proof. Let H ∗N N ∈ \ be an infinite hypernatural number.5 The real interval [0, 1] has a natural hyperreal extension ∗[0, 1] = x ∗R :0 x 1 . { ∈ ≤ ≤ } 1 Consider its partition into H subintervals of equal infinitesimal length H , with partition points i xi = H , i =0,...,H. 5For instance, the one represented by the sequence 1, 2, 3,... with respect to the ultrapower construction outlined in Section 4.7. h i 5.9. INTERMEDIATE VALUE THEOREM 47 The existence of such a partition follows by universal transfer (see Section 5.3) applied to the first order formula ( n N)( x [0, 1]) ( i < n) i x< i+1 . ∀ ∈ ∀ ∈ ∃ n ≤ n The function f has a natural extension ∗f defined on the hyperreals between 0 and 1. Note that in the real setting (when the number of partition points is finite), a point with the maximal value of f among the partition points xi can always be chosen by induction. We have the following first order property expressing the existence of a maximum of f over a finite collection: i0 i ( n N)( i0 n)( i n) f( ) f( ) . ∀ ∈ ∃ ≤ ∀ ≤ n ≥ n We now apply the transfer principle to obtain i0 i ( n ∗N)( i0 n)( i n) ∗f( ) ∗f( ) , (5.8.1) ∀ ∈ ∃ ≤ ∀ ≤ n ≥ n where ∗N is the collection of hypernatural numbers. Formula (5.8.1) is true in particular for a specific infinite hypernatural value of n given by H ∗N N. Thus, there is a hypernatural i0 such that 0 i0 H and ∈ \ ≤ ≤ ( i ∗N) i H ∗f(xi0 ) ∗f(xi) . (5.8.2) ∀ ∈ ≤ ⇒ ≥ Consider the real point c = st(xi0 ) (5.8.3) where “st” is the standard part function. By continuity of f at c R, ∈ we have ∗f(x 0 ) ∗f(c), and therefore i ≈ st(∗f(xi0 )) = ∗f(st(xi0 )) = f(c). An arbitrary real point x lies in an appropriate sub-interval of the partition, namely x [xi, xi+1], so that st(xi)= x, or xi x. Applying “st” to the inequality∈ in the formula (5.8.2), we obtain ≈ st(∗f(x )) st(∗f(x )). i0 ≥ i Hence f(c) f(x), for all real x, proving c to be a maximum of f (and ≥ by transfer, of ∗f as well). 5.9. Intermediate value theorem Theorem 5.9.1. Let f be a continuous real function on [0, 1] and assume f(0) < 0 and f(1) > 0. Then there is a point c [0, 1] such that f(c)=0. ∈ 48 5. MICROCONTINUITY, EVT, INTERNAL SETS Proof. Let H ∗ N N and consider the corresponding partition ∈ \ i of ∗[0, 1] with partition points xi = H , i =0,...,H. Consider the set of all partition points xi such that for all points before xi, the function is nonpositive. By transfer this set has a last point xi0 . By hypothesis f(x ) 0 < f(x ). i0 ≤ i0+1 Applying standard part we obtain st(f(x )) 0 st(f(x )). i0 ≤ ≤ i0+1 But by microcontinuity st(f(xi0 )) = st(f(xi0+1)). Therefore the point c = st(xi0 ) is a zero of the function. 5.10. Ultrapower construction applied to (R); internal sets P Consider the set of subsets of R, denoted (R). Thus A is contained P in (R), i.e., A (R) means that A is included in R, i.e., A R. By theP extension principle∈ P (see Section 3.3) we have the corresponding⊆ sub- set ∗A ∗R called the natural extension of A. We would like to apply ⊆ the ultrapower construction to (R). For convenience, we will use the P shortened notation P = (R). Then the natural extension ∗P of P is P constructed as before as a quotient of PN by an appropriate equivalence relation defined in terms of an ultrafilter using the dominant/negligible dichotomy. The natural extensions of subsets of R constitute a useful family of subsets. We will now construct an important larger class of subsets of ∗R called internal sets. These are the members of ∗P. Definition 5.10.1. An α ∗P is an equivalence class α = [A] ∈ where A is a sequence A = An P : n N of elements of P (i.e., h ∈ ∈ i subsets of R), where A B if and only if n N : An = Bn . ∼ { ∈ }∈F In more detail, we have the following definition. Definition 5.10.2. Consider a sequence A = An (R): n N h ∈ P ∈ i of subsets of R. We use it to define a set α = [A] ∗R as follows. A ⊆ hyperreal [u] = [un] is an element of the set α = [A] if and only if one has n N : un An . { ∈ ∈ }∈F Such sets α ∗R are called internal. ⊆ Here as usual (N) is the fixed free ultrafilter used in the construction of theF hyperreal ⊆ P field. 5.11. THE SET OF INFINITE HYPERNATURALS IS NOT INTERNAL 49 Example 5.10.3. An example of a subset of ∗R which is internal but is not a natural extension of any real set is the interval [0,H] where H is an infinite hyperreal. To show that it is internal, represent H by a sequence u = un : n N , so that H = [u]. Then the set α = [0,H] is h ∈ i the equivalence class α = [A] of the sequence A = An : n N of real h ∈ i intervals An = [0,un] R. ⊆ 5.11. The set of infinite hypernaturals is not internal Does the construction of Section 5.10 give all possible subsets of ∗R? The answer turns out to be negative even in the case of the hypernat- urals, as we show in Theorem 5.11.4. Theorem 5.11.1. Each internal subset of ∗N has a least element. Remark 5.11.2. In other words, ∗N is internally well-ordered in the sense that the well-ordering property is satisfied if we only deal with internal sets. Proof. Assume that an internal α ia given by α = [A] where A = A . Here each A can be assumed nonempty by Lemma 5.11.3 below. h ni n Since N is well-ordered we can choose a minimal element un An for ∈ each n. Consider the sequence u = un : n N . Its equivalence class [u] is the minimal element in theh set α. ∈ i Lemma 5.11.3. A nonempty set α = [A] where A = A can always h ni be represented by a sequence where each of the sets An in the sequence is nonempty. Proof. The set of indices n for which An = ∅ is negligible since otherwise α would be the empty set itself. For each n from this neg- ligible set of indices n, we can replace the corresponding set An = ∅ by An = N without affecting the equivalence class α = [A]. In the new sequence all the sets An are nonempty. Theorem 5.11.4. The set of infinite hypernaturals, ∗N N, is not \ an internal subset of ∗N. Proof. If the set ∗N N were internal it would have a least ele- ment by Theorem 5.11.1. But\ there is no minimal infinite hypernatural because if H is an infinite hypernatural then H 1 is another infinite hypernatural. − Recall that every galaxy of infinite hypernaturals is order-isomorphic to Z. For the structure of the galaxies see Section 5.5. Thus the set of infinite hypernaturals is external. It follows that every internal set including ∗N N will necessarily contain also some \ 50 5. MICROCONTINUITY, EVT, INTERNAL SETS elements that are finite natural numbers. This principle is called un- derspill. This property is expploited in Section 11.2. 5.12. Transfering the well-ordering of N; Henkin semantics Is it possible to transfer the well-ordering of the natural numbers? All the examples of transfer given in Section 5.3 deal with quantifi- cation over numbers, such as natural, rational, or real numbers. As already pointed out, quantification over sets cannot be encom- passed by the transfer principle because the least upper bound property for bounded sets in R fails when interpreted literally over ∗R. Example 5.12.1. The set of all infinitesimals in ∗R does not admit a least upper bound. Indeed, if C > 0 were such a bound, then either C is infinitesimal or it is appreciable. If C were infinitesimal, then the infinitesimal 2C is greater than C, contradicting the claim that C is a l.u.b. If C is appreciable, then C/2 is also appreciable and therefore a smaller upper bound than C. The contradiction shows that there is no l.u.b. On the other hand we showed in Section 5.11 that, while the set of infinite hypernaturals is not well-ordered with respect to the natural ordering induced from ∗N, nevertheless internal subsets of ∗N do satisfy the well-ordering condition. In fact, quantification over sets in N can be transfered on condition of being interpreted as quantification over elements of P = (N), as follows. The condition of well-ordering can be stated as follows.P To simplify the formula we will use the symbol ′ to denote quantification over nonempty sets only: ∀ ( ′A N)( u A)( x A) u x . ∀ ⊆ ∃ ∈ ∀ ∈ ≤ To make this transferable, we reformulate the condition as follows: ( A P ∅ )( u A)( x A) u x , ∀ ∈ \{ } ∃ ∈ ∀ ∈ ≤ where P = (N). At this point transfer can be applied, resulting in the followingP sentence: A ∗P ∅ ( u A)( x A) u x . ∀ ∈ \{ } ∃ ∈ ∀ ∈ ≤ Here quantification is over elements of ∗P. In other words, quantifica- tion is over internal subsets of ∗N, resulting in a correct sentence. The point is that ∗ (N) $ (∗N). P P 5.13. FROM HYPERRATIONALS TO REALS VIA ULTRAPOWER 51 Remark 5.12.2. Robinson refers to this approach to quantification as Henkin semantics. All sentences involving quantification over subsets of A R (5.12.1) ⊆ can similarly be transfered provided we interpret the quantification as ranging over A (R) (5.12.2) ∈ P and transfering formula (5.12.2) instead of (5.12.1), to obtain A ∗ (R), ∈ P entailing quantification over internal subsets. For example, the least upper bound property for bounded sets holds over the hyperreals, pro- vided we interpret it as applying to internal subsets only. 5.13. From hyperrationals to reals via ultrapower This section supplements the material of Section 4.7 on the ultra- power construction. We used the ultrapower construction to build the hyperreal field out of the real field. But the ultrapower construction can alo be used to give an alternative construction of the real number system starting with the rationals. Let Q be the field of rational numbers. The star-transform of Q gives an extension Q ∗Q ⊆ N to the hyperrationals. Here ∗Q = Q / MAX (see Section 4.5 for de- tails). Consider the subring F ∗Q consisting of all finite hyperrationals. More precisely, we have the⊆ following definition. Definition 5.13.1. We set F = x ∗Q :( r Q) x Theorem 5.13.2. The ideal I F is maximal, so that the quotient field ⊆ ρ = F/I is naturally isomorphic to R. Proof. We will indicate the isomorphisms in each direction, as follows: φ : ρ R and ψ : R ρ. Given an element→ x ρ, choose→ a representativex ˜ F . Sincex ˜ is finite, its standard part∈ is well-defined, and we set ∈ φ(x) = st(˜x), where “st” is the standard part. This is independent of the choice of the representative because the standard part function suppresses all infinitesimal differences. We therefore obtain a map φ : ρ R. To construct a homomorphism in the opposite direction,→ consider a positive real number y R. The procedure will be similar for a negative number but we describe∈ it for a positive number so as to fix ideas. Let K = 10H y ⌊ ⌋ so that 0 10H y K < 1. ≤ − Dividing by 10H we obtain K 1 0 y < ≤ − 10H 10H so that y K . In other words, we are considering the decimal ap- ≈ 10H proximation of y truncated6 at rank H, resulting in a hyperrational K F ∗Q. 10H ∈ ⊆ Then the map ψ : R ρ is defined by sending the real number y to the → K equivalence class of the finite hyperrational 10H in F/I, or in formulas K ψ(y)= + I. 10H K 7 Then we have φ(ψ(y)) = st( 10H )= y, proving the theorem. 6katum (with kuf)? mekutzatz? google translate gives “esroni mekutzatz” 7Should the following material be included in the main text? This can be elaborated as follows in terms of the extended decimal expansion. Consider the decimal expansion of y: y = a.a1a2a3 ...an ... defined for each natural index n N. By transfer, the decimal digits an are defined for each hypernatural ∈ rank, as well: y = a.a1a2a3 ...an ... ; ...aH−1aH aH+1 ... Here ranks to the left of the semicolon are finite, while ranks to the right of the semicolon are infinite. ∗ Choose a infinite hypernatural H N N. The product 10H y has the form 10H y = ∈ \ 5.14. REPEATING THE CONSTRUCTION 53 5.14. Repeating the construction What happens if we apply the F/I-type construction but now with R in place of Q? Namely, consider the star-transfer ∗R, and the ring of finite hyperreals ∗RF ∗R, as well as the ideal of infinitesimals ⊆ ∗RI ∗RF . ⊆ It turns out that we do not get anything new in this direction, as attested to by the following. Theorem 5.14.1. The quotient ∗RF /∗RI is naturally isomorphic to R itself. The proof is essentially the same, with the main ingredient being the existence of the standard part function with range R: st : ∗RF R → with kernel precisely ∗RI . Remark 5.14.2. Similar considerations apply to Rn, showing that n n n the quotient ∗RF /∗RI is naturally isomorphic to R itself. We will return to this observation in Section 8.5. aa1a2a3 ...an ...aH−1aH .aH+1 ... with a decimal point after the digit aH . There- H fore its integer part is the hyperinteger K = 10 y = aa a a ...a ...a − a , ⌊ ⌋ 1 2 3 n H 1 H and we proceed as above. CHAPTER 6 Application: small oscillations of a pendulum The material in this chapter first appeared in Quantum Studies: Mathematics and Foundations [Kanovei, Katz & Nowik 2016]. 6.1. Small oscillations of a pendulum In his 1908 book Elementary Mathematics from an Advanced Stand- point, Felix Klein advocated the introduction of calculus into the high- school curriculum. One of his arguments was based on the problem of small oscillations of the pendulum.1 The problem had been treated until then using a somewhat mysterious superposition principle. The latter involves (a vertical plane projection of) a hypothetical circular motion of the pendulum. Klein advocated what he felt was a bet- ter approach, involving the differential equation of the pendulum; see [Klein 1908, p. 187]. The classical problem of the pendulum translates into the second g order nonlinear differential equationx ¨ = ℓ sin x for the variable an- gle x with the vertical direction, where g is− the constant of gravity and ℓ is the length of the (massless) rod or string. The problem of small os- cillations deals with the case of small amplitude,2 i.e., x is small, so that sin x is approximately x. Then the equation is boldly replaced by the linear onex ¨ = g x, whose solution is harmonic motion with − ℓ period 2π ℓ/g. This suggests that the period of small oscillations should be in- dependentp of their amplitude. The intuitive solution outlined above may be acceptable to a physicist, or at least to the mathematicians’ proverbial physicist. The solution Klein outlined in his book does not go beyond the physicist’s solution. The Hartman–Grobman theorem [Hartman 1960], [Grobman 1959] provides a criterion for the flow of the nonlinear system to be conjugate to that of the linearized sys- tem, under the hypothesis that the linearized matrix has no eigenvalue with vanishing real part. However, the hypothesis is not satisfied for the pendulum problem. 1Tnudat metutelet 2misraat 55 56 6. APPLICATION: SMALL OSCILLATIONS OF A PENDULUM To provide a rigorous mathematical treatment for the notion of small oscillation, it is tempting to exploit a hyperreal framework fol- lowing [Nowik & Katz 2015]. Here the notion of small oscillation can be given a precise sense, namely infinitesimal amplitude. Note however the following. Remark 6.1.1. Even for infinitesimal x one cannot boldly replace sin x by x. Therefore additional arguments are required. The linearisation of the pendulum is treated in [Stroyan 2015] us- ing Dieners’ “Short Shadow” Theorem; see Theorem 5.3.3 and Exam- ple 5.3.4 there. This text can be viewed as a self-contained treatement of Stroyan’s Example 5.3.4. The traditional setting in the context of the real continuum can only make sense of the claim that “the period of small oscillations is inde- pendent of the amplitude” by means of a paraphrase in terms of limits, rather than a specific oscillation. In the context of an infinitesimal- enriched continuum, such a claim can be formalized more literally; see Corollary 6.8.1. 6.2. Vector fields, walks, and flows The framework developed in [Robinson 1966] involves a proper extension ∗R R preserving the properties of R to a large extent dis- ⊇ cussed in Remark 6.3.3. Elements of ∗R are called hyperreal numbers. A positive hyperreal number is called infinitesimal if it is smaller than every positive real number. We choose a fixed positive infinitesimal λ ∗R (a restriction on the choice of λ appears in Section 6.8). ∈ Remark 6.2.1. We will work with flows in a two-dimensional con- text where the manifold can be conveniently identified with C. We will also exploit the natural extension ∗C C. ⊇ Definition 6.2.2. Given a classical vector field V = V (z) where z ∈ C, one forms an infinitesimal displacement δF (of the intended prevec- torfield Φ) by setting δΦ = λV . Thus the aim is to construct the corresponding flow Φt in the plane. Note that a zero of δΦ corresponds to a fixed point of the flow. Definition 6.2.3. The infinitesimal generator of the flow is the function Φ : ∗C ∗C, also called a prevector field, defined by → Φ(z)= z + δΦ(z), (6.2.1) 6.3. THE HYPERREAL WALK AND THE REAL FLOW 57 where δΦ(z) = λV (z) in the case of a displacement generated by a classical vector field as above. Remark 6.2.4. An infinitesimal generator, or prevector field, could be a more general internal function. See Section 5.10 and Chapter 8.4 for details on internal sets, and Definition 10.3.3 for the D1 condition for prevectorfields. 6.3. The hyperreal walk and the real flow We propose a concept of solution of differential equation based on Euler’s method with infinitesimal step, with well-posedness based on a property of adequality (see Section 6.4), as follows. Definition 6.3.1. The hyperreal flow, or walk, Φt(z) is a t-paramet- rized map ∗C ∗C defined whenever t is a hypernatural multiple t = Nλ of λ, by setting→ N Φ (z) = Φ (z) = Φ Φ ... Φ(z) = Φ◦ (z), (6.3.1) t Nλ ◦ ◦ ◦ N where Φ◦ is the N-fold composition. For arbitrary hyperreal t the walk can be defined by setting Φt = Φ t/λ λ. ⌊ ⌋ Remark 6.3.2. The fact that the infinitesimal generator Φ given by (6.2.1) is invariant under the flow Φt of (6.3.1) receives a transparent meaning in this framework, expressed by the commutation relation N N Φ Φ◦ = Φ◦ Φ ◦ ◦ due to transfer of associativity of composition of maps. Remark 6.3.3. As discussed in Section 3.3, the transfer principle is a type of theorem that, depending on the context, asserts that rules, laws or procedures valid for a certain number system, still apply (i.e., are transfered) to an extended number system. Thus, the familiar ex- tension Q R preserves the properties of an ordered field. To give a ⊆ negative example, the extension R R of the real numbers to the so-called extended reals does⊆ not preserve∪{±∞} the properties of an ordered field. The hyperreal extension R ∗R preserves all first-order properties, including the identity sin2 x +⊆ cos2 x = 1 (valid for all hy- perreal x, including infinitesimal and infinite values of x ∗R). The ∈ natural numbers N R ∗R are naturally extended to the hypernat- ⊆ ⊆ urals ∗N ∗R. For a more detailed discussion, see Chapter 12. ⊆ Definition 6.3.4. The real flow φt on C for t R when it exists is ∈ constructed as the shadow (i.e., standard part) of the hyperreal walk Φt by setting φt(z)=st ΦNλ(z) 58 6. APPLICATION: SMALL OSCILLATIONS OF A PENDULUM t where N = λ , while x rounds off the hyperreal number x to the nearest integer no greater⌊ ⌋ than x, and “st” (standard part or shadow) rounds off each finite hyperreal to its nearest real number. For t sufficiently small, appropriate regularity conditions ensure that the point ΦNλ(z) is finite so that the shadow is well-defined. The usual relation of being infinitely close is denoted . Thus for finite (i.e., non-infinite) z,w we have z w if and only if st(≈z) = st(w). This relation is an additive one. ≈ The appropriate relation for working with small prevector fields is a multiplicative rather than an additive one, as detailed in Section 6.4. 6.4. Adequality pq We will use Leibniz’s notation pq . Leibniz actually used a symbol that looks more like but the latter is commonly used to denote a product. Leibniz used⊓ the symbol to denote a generalized notion of equality “up to” (though he did not distinguish it from the usual sym- bol = which he also used in the same sense). A prototype of such a relation (though not the notation) appeared already in Fermat under the name adequality. We will use it for a multiplicative relation among (pre)vectors. Definition 6.4.1. Let z,w ∗C. We say that z and w are adequal ∈ z z and write z pq w if either one has w 1 (i.e., w 1 is infinitesimal) or z = w = 0. ≈ − This implies in particular that the angle between z,w is infinites- imal (when they are nonzero), but the relation pq entails a stronger condition. If one of the numbers is appreciable, then so is the other and the relation z pq w is equivalent to z w. If one of z,w is infin- itesimal then so is the other, and the difference≈ z w is not merely infinitesimal, but so small that the quotients z | w−/z |and z w /w are infinitesimal, as well. | − | | − | We are interested in the behavior of orbits in a neighborhood of a fixed point 0, under the assumption that the infinitesimal displace- ment satisfies the Lipschitz condition. In such a situation, we have the following theorem. Theorem 6.4.2. Prevector fields defined by adequal infinitesimal displacements define hyperreal walks that are adequal at each finite time. In particular, if δΦ pq δG then φt = gt where φt and gt are the corresponding real flows. This was shown in [Nowik & Katz 2015, Example 5.12]. 6.6. LINEARISATION 59 6.5. Infinitesimal oscillations Let x denote the variable angle between an oscillating pendulum and the downward vertical direction. By considering the projection of the force of gravity in the direction of motion, one obtains the equation of motion mℓx¨ = mg sin x where m is the mass of the bob of the pendulum, ℓ is the− length of its massless rod, and g is the constant of gravity. Thus we have a second order nonlinear differential equation x¨ = g sin x. (6.5.1) − ℓ The initial condition of releasing the pendulum at angle a (for ampli- tude) is described by x(0) = a (x˙(0) = 0 We replace (6.5.1) by the pair of first order equations g x˙ = ℓ y, y˙ = g sin x, ( −p ℓ and initial condition (x, y)=(a,p0). We identify (x, y) with x + iy and (a, 0) with a + i0 as in Section 6.2. The classical vector field corresponding to this system is then V (x, y)= g y g sin x i. (6.5.2) ℓ − ℓ The corresponding prevectorq field (i.e.,q infinitesimal generator of the flow) Φ is defined by the infinitesimal displacement δ =(λ g y) (λ g sin x)i Φ ℓ − ℓ so that q q Φ(z)= z + δΦ(z). We are interested in the flow of Φ, with initial condition a +0i. The flow is generated by hyperfinite iteration of Φ. 6.6. Linearisation Consider also a prevector field E(z) = z + δE(z) defined by the displacement δ (z)= λ g y iλ g x E ℓ − ℓ where as before z = x + iy. We areq interestedq in small oscillations, i.e., the case of infinitesimal amplitude a. Since sin x pq x for infinitesimal x we have δE pq δΦ. 60 6. APPLICATION: SMALL OSCILLATIONS OF A PENDULUM Due to the multiplicative nature of the relation of adequality, the rescal- ings of E and Φ by change of variable z = aZ remain adequal and therefore define adequal hyperreal walks by Theorem 6.4.2. 6.7. Adjusting linear prevector field We will compare E to another linear prevector field H defined by g iλ H(x + iy)= e− q ℓ (x + iy) = x cos λ g + y sin λ g + x sin λ g + y cos λ g i ℓ ℓ − ℓ ℓ q q q q g given by clockwise rotation of the x, y plane by infinitesimal angle λ ℓ . The corresponding hyperreal walk, defined by hyperfinite iteration of H, satisfies the exact equality p H (a, 0) = a cos g t, a sin g t (6.7.1) t ℓ − ℓ whenever t is a hypernatural multipleq of λ. Inq particular, we have the periodicity property H 2π (z)= z and hence √g/ℓ 2 Ht+ π = Ht (6.7.2) √g/ℓ 2π whenever both t and g are hypernatural multiples of λ. The usual q ℓ estimates cos λ g 1 sin λ g ℓ − 0, ℓ 1. (6.7.3) λ ≈ λ g ≈ p pℓ imply an adequality δE pq δH whenever xpand y are finite. By Theo- rem 6.4.2 the hyperfinite walks of E and H satisfy Et(a, 0) pq Ht(a, 0) for each finite initial amplitude a and for all finite time t which is a hypernatural multiple of λ. 6.8. Conclusion The advantage of the prevector field H is that its hyperreal walk is given by an explicit formula (6.7.1) and is therefore periodic with 2π period precisely g , provided we choose our base infinitesimal λ in q ℓ 2π such a way that g is hypernatural. We obtain the following con- λ q ℓ sequence of the periodicity property (6.7.2): modulo an appropriate choice of a representing prevector field (namely, H) in the adequality 6.8. CONCLUSION 61 class, the hyperreal walk is periodic with period 2π ℓ/g. This can be summarized as follows. p Corollary 6.8.1. The period of infinitesimal oscillations of the pendulum is independent of their amplitude. If one rescales such an infinitesimal oscillation to appreciable size by a change of variable z = aZ where a is the amplitude, and takes standard part, one obtains a standard harmonic oscillation with pe- riod 2π ℓ/g. The formulation contained in Corollary 6.8.1 has the advantage of involving neither rescaling nor shadow-taking. p CHAPTER 7 Differentiable manifolds 7.1. Definition of differentiable manifold A n-dimensional manifold M is a set covered by a collection of sub- sets (called coordinate charts or neighborhoods), typically denoted A or B, and having the following properties. For each coordinate neigh- borhood A we have an injective map u : A Rn whose image is an n → open set in R . Thus, the coordinate neighborhood is a pair (A, u). The maps are required to satisfy the following compatibility condition. Let n i u : A R , u =(u )i=1,...,n, → and similarly n α v : B R , v =(v )α=1,...,n. → Whenever the overlap A B is nonempty, it has a nonempty image v(A B) in Euclidean space.∩ Restricting to this subset, we obtain a one-to-one∩ map 1 n u v− : v(A B) R (7.1.1) ◦ ∩ → from an open set v(A B) Rn to Rn. ∩ ⊆1 n n Similarly, the map v u− from the open set u(A B) R to R is one-to-one. ◦ ∩ ⊆ Definition 7.1.1. The defining condition of a smooth manifold is that the map (7.1.1) is differentiable for all choices of coordinate 1 neighborhoods A and B as above. The maps u v− are called the transition maps.1 The collection of coordinate charts◦ as above is called an atlas. The coordinate charts induce a topology on M by imposing the usual conditions (arbitrary unions of open sets are open; finite inter- sections of open sets are open). 1funktsiot maavar (nothing to do with ekron haavara) 63 64 7. DIFFERENTIABLE MANIFOLDS Remark 7.1.2. We will usually assume that M is connected. Given the manifold structure as above, connectedness of M is equivalent to path-connectedness2 of M. Remark 7.1.3 (Metrizability). There are some pathological non- Hausdorff examples like two copies of R glued along an open halfline. To rule out such examples, the simplest condition is that of metrizabil- ity; see e.g., Example 7.3.2, Theorem 7.4.2. Identifying A B with a subset of Rn by means of the coordi- nates (ui), we can∩ think of v as given by n functions vα(u1,...,un) as α runs from 1 to n. Remark 7.1.4. The manifold condition can be stated as the re- quirement that the functions vα(ui) are all smooth. Example 7.1.5. The usual hierarchy of smoothness denoted Ck (or C∞) in Euclidean space generalizes to manifolds. Thus, for k = 1 a manifold M is C1 if and only if all the partial derivatives ∂vα , α =1,...,n, i =1,...,n ∂ui are continuous. Similarly, M is C2 if all second partials ∂2vα ∂ui∂uj are continuous, etc. 7.2. Open submanifolds, Cartesian products The notion of open and closed set in M is inherited from Euclidean space via the coordinate charts. An open subset C M of a mani- fold M is itself a submanifold, with differentiable structure⊆ obtained by the restriction of the coordinate map u =(ui)of(A, u) or in formu- las u⇂A C . ∩ Let Matn,n(R) be the set of square matrices with real coefficients. This is identified with Euclidean space of dimension n2, and is therefore a manifold. Theorem 7.2.1. Define a subset GL(n, R) Matn,n(R) by setting ⊆ GL(n, R)= X Matn,n(R) : det(X) =0 . { ∈ 6 } Then GL(n, R) is a submanifold. 2kshir-mesila 7.3. TORI 65 Proof. The determinant function is a polynomial in the entries xij of the matrix X. Therefore it is a continuous function of the entries, 2 which are the coordinates in Rn . Hence GL(n, R) is an open set, and therefore a manifold. Note that its complement is the closed set consisting of matrices of zero determinant. Theorem 7.2.2. Let M and N be two differentiable manifolds of dimensions m and n. Then the Cartesian product M N is a differ- entiable manifold of dimension m + n. The differentiable× structure is defined by coordinate neighborhoods of the form (A B,u v), where (A, u) is a coordinate chart on M, while (B, v) is× a coordinate× chart on N. Here the function u v on A B is defined by × × (u v)(a, b)=(u(a), v(b)) × for all a A, b B. ∈ ∈ 7.3. Tori Example 7.3.1. The torus T 2 = S1 S1 is a 2-dimensional man- ifold. More generally, the n-torus T n =×S1 S1 (product of n copies of the circle) is an n-dimensional manifold.×···× Example 7.3.2. The circle S1 = (x, y) R2 : x2 + y2 = 1 itself is a manifold. Let us give an explicit{ atlas for∈ the circle. Let A+} S1 be the open upper halfcircle ⊆ A+ = a =(x, y) S1 : y> 0 . { ∈ } Consider the coordinate chart (A+,u), namely u : A+ R, → defined by setting u(x, y)= x. The open lower halfcircle A− also gives a coordinate chart (A−,u) where the coordinate u is defined by the same formula. We similarly define the right halfcircle B+ = a =(x, y) S1 : x> 0 , { ∈ } yielding a coordinate chart (B+, v) where v(x, y) = y, and similarly + + for B−. The transition function between A and B is calculated by noting that in the overlap A+ B+ one has both x> 0 and y > 0. ∩ 1 1 Let us calculate a typical transition function u v− . The map v− ◦ sends y R1 to the point ( 1 y2,y) S1, and then the coordinate ∈ − ∈ map u sends ( 1 y2,y) to the first coordinate 1 y2 R1. Thus − 1 p − ∈ the composed map u v− given by p ◦ p y 1 y2 7→ − p 66 7. DIFFERENTIABLE MANIFOLDS is the transition function in this case. Since this is smooth for all y (0, 1), the circle is a C∞ manifold of dimension 1 (modulo spelling out∈ the remaining transition functions). Setting d(p, q) = arccos p, q we see that S1 is metrizable. h i 7.4. Projective spaces Definition 7.4.1. Let X = Rn+1 0 be the collection of (n + 1)-tuples x = (x0,...,xn). Define an\{ equivalence} relation among tuples x, y X by setting x y if and only if there is a real number∼ t = 0 such that∈y = tx, i.e., ∼ 6 yi = txi, i =0, . . . , n. The collection of equivalence classes is called the real projective space and denoted RPn. Theorem 7.4.2. The space RPn is a smooth n-dimensional mani- fold. Proof. We define coordinate neighborhoods Ai, where i =0,...,n by setting i Ai = x : x =0 . (7.4.1) { 6 } n We will now define the coordinate pair (Ai,ui), where ui : Ai R , namely the corresponding coordinate chart. We can set → x0 xi 1 xi+1 xn u (x)= ,..., − , ,..., i xi xi xi xi i since division by x is allowed in Ai by (7.4.1). The coordinate ui is well-defined because if x y then u (x) = u (y). To calculate the ∼ i i transition map, let u = ui and v = uj, where we assume for simplicity 1 that i < j. We wish to calculate the map u v− associated with Ai Aj. 0 n 1 n ◦ j ∩ Take a point z = (z ,...,z − ) R . Since we work with x = 0, we can rescale the homogeneous coordinates∈ so that xj = 1, so6 we can 1 represent v− (z) by 1 0 j 1 j n 1 v− (z)=(z ,...,z − , 1, z ,...,z − ). Now we apply u = ui, obtaining 0 i 1 i+1 j 1 j+1 n 1 1 z z − z z − 1 z z − u v− (z)= ,..., , ,... , , ,..., (7.4.2) ◦ zi zi zi zi zi zi zi All transition functions appearing in (7.4.2) are rational functions and are therefore smooth. Thus RPn is a differentiable manifold. Set- ting d(p, q) = arccos p, q for unit p, q we see that RPn is metriz- able. |h i| 7.5. DEFINITION OF A VECTOR FIELD 67 7.5. Definition of a vector field Recall that the tangent space TpM at a point p M is the collection of all tangent vectors at the point p. Here a tangent∈ vector can be defined via derivations.3 i With respect to a coordinate chart (A, u) where u =(u )i=1,...,n, we ∂ have a basis ( ∂ui ) for TpM. Therefore an arbitrary vector X at p is a linear combination ∂ Xi , ∂ui for appropriate Xi. Here we use the Einstein summation convention. ∂ Since the vectors ∂ui form a basis for the tangent space at every point p A, a choice of component functions Xi(u1,...,un) in the neighborhood∈ will define a vector field ∂ Xi(u1,...,un) ∂ui in A. Here the components Xi are required to be of an appropriate differentiability type. More precisely, we have the following definition. Definition 7.5.1. (see [Boothby 1986, p. 117]) Let M be a C∞ manifold. A vector field X of class Cr on M is a function assigning to each point p of M a vector Xp TpM whose components in any local coordinate (A, u) are functions∈ of class Cr. Example 7.5.2. Let M be the Euclidean plane R2. Via obvious identifications the Euclidean norm in the (x, y)-plane leads naturally to a Euclidean norm on the tangent space (i.e., tangent plane) at | | ∂ ∂ every point with respect to which both tangent vectors ∂x and ∂y are orthogonal and have unit norm: ∂ ∂ = =1. ∂x ∂y Therefore the dual basis (dx,dy ) of 1-forms also satisfies dx = dy = 1, and the two basis 1-forms are orthogonal. | | | | ∂ ∂ 2 Note that each of ∂x and ∂y defines a global vector field on R (defined at every point). Example 7.5.3 (Vector fields defined by polar coordinates). In- teresting examples of vector fields are provided by polar coordinates. These may be undefined at the origin, i.e., apriori only defined in the 3This material was covered in the undergraduate course 88201. 68 7. DIFFERENTIABLE MANIFOLDS 2 ∂ open submanifold R 0 . Here ∂θ is represented by the path (r cos t, r sin t) with derivative ( r sin\{t,} r cos t). Hence − ∂ = r (7.5.1) ∂θ 1 ∂ so that r ∂θ is of norm 1. Passing to the dual basis we obtain that rdθ ∂ ∂ 4 has norm 1 because ∂r , ∂θ are orthogonal. 7.6. Zeros of vector fields Example 7.6.1 (Zero of type source/sink). We mentioned that the ∂ ∂ vector field ∂r in the plane is undefined at the origin, but r ∂r is a continuous (C0) vector field vanishing at the origin. Moreover it is ∂ ∂ equal to x ∂x + y ∂dy and hence smooth. ∂ The vector field in the plane defined by r ∂r is sometimes called a “source” while the vector field X = r ∂ is sometimes called a − ∂r “sink”.5 Note that the integral curves of a source flow from the origin and away from it, whereas the integral curves of a sink flow into the origin, and converge to it at for large time. At a point p R2 with polar coordinates (r, θ) this is given by ∈ r ∂ if p =0 X(p)= − ∂r 6 (0 if p =0. ∂ Example 7.6.2 (Zero of type “circulation”). The vector ∂θ in the plane, viewed as a tangent vector at a point at distance r from the origin, tends to zero as r tends to 0, as is evident from (7.5.1). Therefore the vector field defined by ∂ on R2 0 extends by continuity to the ∂θ \{ } point p = 0. Moreover it equals y ∂ + x ∂ and hence smooth. − ∂x ∂y 4Alternatively, since dr2 = dx2 + dy2 by Pythagoras, we have dr = 1 as well. 2 y 1 | x | xdy−ydx Meanwhile θ = arctan x and therefore dθ = 1+(y/x)2 d(y/x) = y2+x2 x2 = xdy−ydx r2 . Hence xdy ydx r 1 dθ = | − | = = . (7.5.2) | | r2 r2 r Thus rdθ is a unit covector. We therefore have an orthonormal basis (dr, rdθ) for ∂ the cotangent space. Since dθ ∂θ = 1, equation (7.5.2) implies (7.5.1). Therefore 1 ∂ is a unit vector. r ∂θ 5Would these be makor and kior? 7.7. 1-PARAMETER GROUPS OF TRANSFORMATIONS OF A MANIFOLD 69 Thus we obtain a continuous vector field p X(p)= y ∂ + x ∂ 7→ − ∂x ∂y on R2 which vanishes at the origin: ∂ if p =0 X(p)= ∂θ 6 (0 if p =0. Such a vector field is sometimes described as having circulation6 around the point 0. Note that the integral curves of a circulation are circles around the origin. Example 7.6.3 (Zero of type “circulation”on the sphere). Spherical coordinates (ρ,θ,ϕ) in R3 restrict to the unit sphere S2 R3 to give coordinates (θ,ϕ) on S2. The north pole is defined by ϕ⊆ = 0. Here ∂ the angle θ is undefined but the vector field ∂θ can be extended by continuity as in the plane, and we obtain a zero of type “circulation”. ∂ Similarly the south pole is defined by ϕ = π. Here ∂θ has a zero of type “circulation” but going clockwise (with respect to the natural orientation on the 2-sphere). 7.7. 1-parameter groups of transformations of a manifold Let θ : R M M be a smooth mapping (this θ is in general × → unrelated to polar coordinates). Denote coordinates t R and p M. ∈ ∈ We will write θt(p) for θ(t, p). Assume θ satisifies the following two conditions: (1) θ (p)= p for all p M; 0 ∈ (2) θt θs(p)= θt+s(p) for all p M and all s, t R. ◦ ∈ ∈ Then θ is called a 1-parameter group of transformations, a flow, or an action for short. An action θ defines a vector field X on M called the infinitesimal generator of θ, as follows. Definition 7.7.1. The infinitesimal generator X of the flow θt is defined at a point p M as a derivation Xp defined by setting for each function f on M, ∈ 1 Xp(f) = lim f(θ∆t(p)) f(p) . ∆t 0 → ∆t − Remark 7.7.2. This is a generally accepted term defined as above in a context where actual infinitesimals are not present. 6machzor, tzirkulatsia. 70 7. DIFFERENTIABLE MANIFOLDS 7.8. Invariance of infinitesimal generator under flow In this section we follow [Boothby 1986]. Consider a flow θt on a manifold M as in Section 7.7, namely a map θ : R M M, × → where we write θ (p) for θ(t, p), such that θ (p)= p for all p M, and t 0 ∈ θt θs(p)= θt+s(p) for all p M and all s, t R. ◦ ∈ ∈ Then one obtains an induced action θt∗ on a vector field Y on M, producing a new vector field θt∗ (Y ) as follows. We view Y as a deriva- tion acting on functions f on M. Given a function f on M, we need to specify the manner in which the new vector field θt∗ (Y ) differentiates the function f. Definition 7.8.1. The induced action on the vector field Y of the flow θ of the manifold M is defined by setting θ (Y )f = Y (f θ ) t∗ p p ◦ t so that the vector θ ∗ (Y ) resides at the point θ (p) M. t p t ∈ In more detail, let q = θt(p). If a function f is defined in a neigh- borhood of q M then the composed function f θ is defined in a ∈ ◦ t neighorhood of p and therefore it makes sense apply the derivation Yp to the function f θt. Recall that the◦ infinitesimal generator X of a flow θ is the vector field satisfying 1 Xp(f) = lim f(θ∆t(p)) f(p) . ∆t 0 → ∆t − What happens when we apply the induced action of the flow to the vector field which is the infinitesimal generator of the flow itself? An answer is provided by the following theorem. Theorem 7.8.2 (Boothby p. 124). The infinitesimal generator X of an action θt is invariant under the flow: θt∗ (Xp)= Xθt(p). Proof. The proof is a direct computation, obtained by testing the field on a function f Dq where q = θt(p) (and Dq is the space of smooth functions defined∈ in a neighborhood of q) as follows: θ ∗ (X )f = X (f θ ) t p p ◦ t 1 = lim f θt (θ∆t(p)) f θt(p) . ∆t 0 → ∆t ◦ − ◦ 7.9. LIE DERIVATIVE AND LIE BRACKET 71 Since R is abelian, we have θt θ∆t = θt+∆t = θ∆t θt (by definition of a flow). Hence ◦ ◦ 1 θt∗ (Xp)f = lim f θ∆t(θt(p)) f(θt(p)) . ∆t 0 → ∆t ◦ − = Xθt(p)f, proving the theorem. Note that while theorem 7.8.2 is intuitively “obvious”, the received formalism involved in proving it is bulky. We will develop a more direct approach in Section 9.6. 7.9. Lie derivative and Lie bracket In this section as in the previous one we follow Boothby. Definition 7.9.1 (Boothby p. 154). Let θt be a flow on M with infinitesimal generator X, and let Y be a vector field on M. The vector field LX Y , called the Lie derivative of Y with respect to X, is defined by setting 1 LX Y = lim θ t (Yθ(t,p)) Yp t 0 − ∗ → t − Theorem 7.9.2. If X and Y are vector fields on M then LX Y = [X,Y ], where [X,Y ]f = X (Y f) Y (Xf) is the Lie bracket. p − p Proof. This can be checked in coordinates using a Taylor formula with remainder. Example 7.9.3. The coordinate vector fields commute with respect to the Lie bracket, i.e., ∂ ∂ , =0 ∂ui ∂uj for all i, j =1,...,n. This is an equivalent way of stating the equality of mixed partials (sometimes called Schwarz’s theorem or Clairaut’s theorem). CHAPTER 8 Theorem of Frobenius 8.1. Invariance under diffeomorphism In this section as in Section 7.9 we follow Boothby. We first review some concepts from advanced calculus. If σ : M N → is a smooth map such that σ(p) = q N, we obtain an induced ∈ map σ : TpM TqN. Here σ is the map induced on the tangent space.∗ This can→ be defined as follows.∗ Definition 8.1.1. Choose a representing curve α(t) so that X = α′(0), and define σ (X) to be the vector represented by the curve de- fined by the composition∗ σ α on the image manifold N. ◦ In coordinate-free notation, chain rule takes the following form. Theorem 8.1.2 (Chain rule). for smooth maps σ, τ among mani- folds asserts that the composition σ τ satisfies ◦ (σ τ) = σ τ , ◦ ∗ ∗ ◦ ∗ i.e., (σ τ) (X)= σ τ (X) for all tangent vectors X. ◦ ∗ ∗ ◦ ∗ Definition 8.1.3. A diffeomorphism of M is a bijective 1-1 map φ : 1 M M such that both φ and φ− are C∞ functions for all coordinate charts.→ Recall that we have the notion of a flow θ : M M depending on t → parameter t R. ∈ Definition 8.1.4. A vector field X is said to generate a flow θt if X is the infinitesimal generator of θt as in Definition 7.7.1. Recall from Definition 7.7.1 that the vector field (which is the in- finitesimal generator of the flow) and the flow itself are related by 1 Xp(f) = lim f(θ∆t(p)) f(p) . (8.1.1) ∆t 0 → ∆t − 73 74 8. THEOREM OF FROBENIUS Definition 8.1.5. An integral curve,1 or orbit, of a vector field X through a fixed point p M is a curve θ(t, p) satisfying θ(0,p)= p and ∂θ ∈ ∂t = X where t ranges through an open interval containing 0. Recall that a vector field X is said to be invariant under a smooth map σ if σ (X) = X at every point. In more detail, if q = σ(p) M ∗ ∈ then we require that σ (Xp)= Xq. ∗ Theorem 8.1.6 (Boothby p. 142). Let X be a vector field gener- ating a flow θ(t, p) on M and let F : M M be a diffeomorphism. Then X is invariant under F if and only if→F commutes with the flow, i.e., F (θ(t, p)) = θ(t, F (p)). Proof of the easy direction. First we show the easy direc- tion. Suppose X is invariant under the map F , so that F (Xp)= XF (p) ∗ for all p M. Let σp(t) be an integral curve through p M. Since we wish to use∈ p as the index, we use the letter σ in place∈ of θ to avoid clash of notation with θt introduced earlier (see (8.1.1)). By chain rule we have for each t that (F σp) = F σp . ◦ ∗ ∗ ◦ ∗ Thus the image curve F σ sends the vector d (a tangent vector ◦ p dt to R) to F (X); see (8.1.3) for more details. Thus F takes the inte- ∗ gral curve σp of the vector field X to an integral curve of the vector field F (X). Since F (X) = X by hypothesis, the uniqueness of in- tegral∗ curves implies∗ that F (θ(t, p)) = θ(t, F (p)), proving the easy direction. Proof of the opposite direction. To prove the opposite di- rection, suppose F (θ(t, p)) = θ(t, F (p)) (8.1.2) and let us prove that F (Xp)= XF (p). ∗ Consider an integral curve σp through p defined by σp : R M where → σp(t)= θ(t, p), so that σ (0) = p M. Here σ may be defined on a smaller interval p ∈ p than R but we used R for simplicity. As before, we use the letter σ instead of θ so as to avoid a clash of notation with θt = θ(t, p) used previously. 1Akuma-integralit (or akom-integrali). The odd term akumat-iskum found at http://hebrew-terms.huji.ac.il/teva english result.asp? seems to be a canaan- ite innovation. 8.2. FROBENIUS THM, COMMUTATION OF VECTOR FIELDS AND FLOWS75 d Let dt be the natural basis for the tangent line T0 R at the origin. dσ Letσ ˙ = dt . Then d X =σ ˙ (0) = σ . (8.1.3) p p p dt ∗ We now apply the isomorphism F : TpM TF (p)M to formula (8.1.3). We obtain ∗ → d d d F (Xp)= F σp =(F σp) = σF (p) , ∗ ∗ ◦ ∗ dt ◦ ∗ dt dt ∗ where the last equality follows from F σp(t) = σF (p)(t) as in (8.1.2). ◦ d Now the theorem results from the equality σF (p) dt = XF (p). ∗ 8.2. Frobenius thm, commutation of vector fields and flows In this section as in the previous one we follow Boothby. Here we prove a special case of the Frobenius theorem on flows, in the case of commuting flows. Recall from Definition 7.9.1 that if θt is a flow on M with infin- itesimal generator X, and Y is a vector field on M, then the vector field LX Y , called the Lie derivative of Y with respect to X, is defined by setting 1 LX Y = lim θ t (Yθ(t,p)) Yp , t 0 − ∗ → t − and furthermore LX Y = [X,Y ]. Theorem 8.2.1 (Boothby p. 156). Let X and Y be C∞ vector fields on a manifold M. Let θ = θt be the flow generated by X, and η = ηs be the flow generated by Y . Then the following two conditions are equivalent: (1) [X,Y ]=0; (2) for each p M there exists a δp > 0 such that ηs θt(p) = θ η (p) whenever∈ s < δ and t < δ . ◦ t ◦ s | | p | | p Proof. First we show the easy direction (2) = (1). We apply ⇒ Theorem 8.1.6 with F = θt to conclude that the infinitesimal genera- tor Y of the flow ηs is invariant under θt for a fixed small parameter t, 2 i.e., θt (Yq)= Yθ(t,q). Now consider the Lie derivative. We obtain ∗ 1 1 [X,Y ]q =(LX Y )q = lim θ t (Yθ(t,q)) Yq = lim (Yq Yq)=0. t 0 − ∗ t 0 → t − → t − The other direction will be shown in the next segment. 2The calculation in Boothby on page 157 contains some misprints involving misplaced parentheses. 76 8. THEOREM OF FROBENIUS Proof of the opposite direction. Now let us show the oppo- site direction (1) = (2) in Theorem 8.2.1. Assume that [X,Y ]=0 ⇒ identically on M. Consider a point q M. Define a vector Zq(t) at q by setting ∈ Zq(t)= θ t Yθ(t,q) TqM. (8.2.1) − ∗ ∈ ˙ Let us show that the t-derivative Z (t) vanishes. Consider the point q′ = θ(t, q). Then 1 ˙ ′ Zq(t) = lim θ (t+∆t) Yθ(∆t,q′) θ t (Yq ) (8.2.2) ∆t 0 ∆t − ∗ − − ∗ → Decomposing the action of θ (t+∆t ) as composition − ∗ θ (t+∆t) = θ t θ ∆t , − ∗ − ∗ ◦ − ∗ we obtain from (8.2.2) and linearity of induced maps that 1 ˙ ′ Zq(t)= θ t lim θ ∆t Yθ(∆t,q′) Yq (8.2.3) − ∗ ∆t 0 ∆t − − → ∗ 1 ′ ′ Now we recognize the expression ∆t θ ∆t Yθ(∆t,q ) Yq as the Lie − ∗ − derivative LX Y of Y at the point q′ = θ(t, q ). Therefore we obtain Z˙q(t)= θ t ((LX Y )q′ )=0 − ∗ since by hypothesis (1), the Lie bracket vanishes everywhere, including at the point q′ = θ(t, q). Therefore the vector Z (t) of (8.2.1) is constant for t < δ, and q | | therefore equal to Yq. This means that Y is invariant under the flow θt. Therefore by Theorem 8.1.6, we have the commutation of flows ηs θ (q)= θ η (q). ◦ t t ◦ s 8.3. Proof of Frobenius n Theorem 8.3.1 (Frobenius). Let U R be open, and X1,...,Xk : n 1 ⊆ U R be k independent C classical vector fields such that [Xi,Xj]′ = →k m m m=1 Cij Xm for Cij : U R. (As before [ , ]′ is the classical Lie i → · · bracket.) Let gt be the classical flow of Xi. Given p U, let r > 0 P 1 2 k ∈ k be such that ϕ(t1,...,tk)= g g g (p) is defined on ( r, r) t1 ◦ t2 ◦···◦ tk − and ∂ϕ ,..., ∂ϕ are independent there. Then span ∂ϕ (a),..., ∂ϕ (a) = ∂t1 ∂tk { ∂t1 ∂tk } span X (ϕ(a)),...,X (ϕ(a)) for every a ( r, r)k. { 1 k } ∈ − Proof. By definition of gi , ∂ (gi gk )(p) = X . So to ti ∂ti ti tk i ∂ϕ ◦···◦ show span X1,...,Xk one needs to show that ∂ti ∈ { } 1 i 1 (gt1 gt−−1 ) (Xi) span X1,...,Xk . ◦···◦ i ∗ ∈ { } 8.3. PROOF OF FROBENIUS 77 j So one needs to show that (gtj ) (Xi) span X1,...,Xk for every i, j, ∗ ∈ { } 1 and for convenience of notation say j = 1. Using the flow gt we may ∂ change coordinates so that X1 = ∂x1 , and so the flow of X1 is sim- ply gt(x)= x+(t, 0,..., 0), and we need to show that span X1,...,Xk is invariant under the flow g , t ( r, r). { } t ∈ − For fixed q let Xi(t)= Xi(q +(t, 0,..., 0)) then k d m X (t) = [X ,X ]′ = C (t)X (t). dt i 1 i 1i m m=1 X Since this simple flow maps a vector to the “same” (i.e. parallel) vector at the image point, span X ,...,X being invariant under the flow { 1 k} means that span X1(t),...,Xk(t) is the same subspace for all t ( r, r). { } ∈ − n Let us change basis in R such that X1(q),...,Xk(q) are the first k standard basis vectors. j Let us denote the coordinates of Xi by Xi , j =1,...,n, so we must j show that Xi (t) = 0 for all 1 i k, k +1 j n and t ( r, r). ≤ ≤ ≤ ≤ j ∈ − Let W ( r, r) be the set of all t satisfying Xi (t) = 0 for all 1 i k, k +1⊆ −j n. Then W is clearly closed and 0 W , so if we≤ show≤ that W ≤is also≤ open then W =( r, r) and we are∈ done. At this point we pass to the nonstandard− extension. To show that W is open we must show that for every w W and every s w, j ∈ ≈ s ∗W . By transfer s ∗W means that Xi (s) = 0 for all 1 i k,∈k +1 j n. Say s>w∈ . ≤ ≤ Let A≤=≤ max Xj(u) , w u s, 1 i k, k +1 j n, and | i | ≤ ≤ ≤ ≤ ≤ ≤ assume this maximum is attained at u = u′, with i = a, j = b. Let B = max d Xj(u) , w u s, 1 i k, k +1 j n, and | dt i | ≤ ≤ ≤ ≤ ≤ ≤ assume this maximum is attained at u = u′′, with i = c, j = d. b Then since Xa(w) = 0 we have b A = X (u′) | a | b b = X (u′) X (w) u′ w B | a − a |≤| − | d d = u′ w X (u′′) | − |·|dt c | m d = u′ w C (u′′)X (u′′) | − |·| 1c m | m X u′ w A ≺| − | A. ≺≺ 78 8. THEOREM OF FROBENIUS j We have A A and so necessarily A = 0 and so Xi (s) = 0 for all 1 i k,≺≺k +1 j n. The≤ notation≤ ≤is explained≤ in Section 8.5. ≺ 8.4. A-track and B-track approaches Here we follow [Nowik & Katz 2015]. In Chapter 7.8, we pre- sented the received A-track3 formalism for dealing with the infinitesi- mal generator of a flow on a manifold. Note that the adjective “infin- itesimal” in this context is a dead metaphor as it no longer refers to true infinitesimals. We will now develop an infinitesimal formalism for dealing with the infinitesimal generator of a flow. Of course, an infinitesimal (B-track) formalism enables a more in- tuitive approach not merely to the infinitesimal generator of a flow, but to many other topics in calculus and analysis. Thus, the concepts of derivative, continuity, integral, and limit can all be defined via in- finitesimals. For example, in Section 5.8 we presented the infinitesimal approach to the extreme value theorem. 8.5. Relations of and ≺ ≺≺ We will work in the hyperreal extension R ∗R. ⊆ Definition 8.5.1. For r, s ∗R, we will write ∈ r s ≺ if r = as for finite a. We will also write r s ≺≺ if r = as for infinitesimal a. Thus r 1 means that r is finite, while ≺ r 1 ≺≺ means that r is infinitesimal. More generally, suppose we are given a finite dimensional vector space V over R. Then, given v ∗V , and s ∗R, we will write ∈ ∈ v s ≺ if ∗ v s for some norm on V . We will generally omit the from functionk k ≺ symbols and so willk·k simply write v s. This condition∗ is independent of the choice of norm since allk normsk ≺ on V are equivalent. Similarly we will write v s ≺≺ 3nativ 8.6. SMOOTH MANIFOLDS; NOTION NEARSTANDARDNESS 79 if v s. We will also write k k ≺≺ v w ≈ when v w 1. We choose a basis for V thus identifying it with Rn. − ≺≺ 1 n n Lemma 8.5.2. An element x = (x ,...,x ) ∗R satisfies x s or x s if and only if for each i, the component∈ xi satisfies≺ the corresponding≺≺ relation. This can be checked directly for the Euclidean norm on Rn. s Definition 8.5.3. Given 0 Definition 8.6.4. If a is nearstandard then st(a) is the unique p M such that a hal(p). ∈ ∈ Definition 8.6.5. For a subset A M, the halo hA of A is a subset ⊆ of ∗M defined by h A = halM (a). a A [∈ h In particular, M is the set of all nearstandard points in ∗M. The following is proved in [Davis 1977, p. 90, Theorem 5.6]. Let X be a metric space, e.g., a finite-dimensional smooth manifold equipped with a Riemannian metric. Theorem 8.6.6 (Davis). For a metric space X the following are equivalent: (1) every bounded closed set in X is compact; (2) every finite point of ∗X is nearstandard. In particular, in every finite-dimensional complete Riemannian man- ifold, finite and nearstandard are equivalent. 8.7. Results for general culture on topology We have the following characterisations of open sets and compact sets. A good reference is [Davis 1977]. h Theorem 8.7.1. A subset A is open in M if and only if A ∗A. ⊆ Example 8.7.2. The subset of M = R given by the closed inter- val A = [0, 1] is not open since ∗A fails to contain negative infinitesimals which are in the halo of A, i.e., hal (0) ∗A. M 6⊆ Theorem 8.7.3 (Davis p. 78, item 1.6). A set A M is compact h ⊆ if and only if ∗A A. ⊆ Example 8.7.4. The subset of R given by the open interval A = (0, 1) is not compact since a positive infinitesimal ǫ is in ∗A but ǫ is not in the halo of A, since it is not infinitely close to any point of A because every point of A is appreciable. Since every point of A is appreciable, so is every point of hA, but infinitesimals are not appreciable. h In particular, if A is compact then ∗A M. Recall that the halo both of a point and of a subset are defined⊆ relative to the ambient manifold M (see Remark 8.6.2). Theorem 8.7.5 (Davis p. 78, item 1.7). The manifold M itself is h compact if and only if M = ∗M. 8.7. RESULTS FOR GENERAL CULTURE ON TOPOLOGY 81 If M is noncompact then hM is an external set.5 Example 8.7.6. The 1-dimensional manifold M = R is not com- pact. The halo hR of R consists of all finite hyperreals. This is an h external subset of ∗R. Recall that R is the domain of the standard part function: st : hR R → defined in Section 3.6. 5A set is external if it is not internal. See Section 5.10. CHAPTER 9 Gradients, Taylor, prevectorfields 9.1. Properties of the natural extension Given an open W Rn and a smooth function f : W R we ⊆ → note some properties of the extension ∗f : ∗W ∗R, obtained by transfer. When there is no risk of confusion, we will→ omit the from ∗ the function symbol ∗f and simply write f for both the original function h and its extension. Recall that W ∗W consists of all nearstandard points (see Section 8.6). ⊆ Lemma 9.1.1. For open W Rn, let f : W R be continuous. Then f(a) is finite for every a ⊆hW . → ∈ Proof. Given a hW , let U W be a neighborhood of st(a) such ∈ ⊆ that U W and U is compact. ⊆ By compactness, there is a constant C R such that f(x) C ∈ | | ≤ for all x U. By transfer f(x) C for all x ∗U. In particular we have f(a∈) C. Therefore| f(a|) ≤ is finite. (As∈ for functions, we omit | | ≤ the from relation symbols, writing in place of ∗ .) ∗ ≤ ≤ 9.2. Remarks on gradient h Recall that U is the set of nearstandard points of ∗U, where ∗U is the star-transform of U. Given an open U Rn and a smooth ⊆ ∂f function f : U R, we can compute the partial derivatives ∂ui as usual. → Definition 9.2.1. The partial derivatives of ∗f are by definition the functions ∗ ∂f , ∂ui i.e., the extensions of the partial derivatives of f. Definition 9.2.2. Given f C1 we consider the row vector ∈ Da of partial derivatives at each point a ∗U. ∈ 83 84 9. GRADIENTS, TAYLOR, PREVECTORFIELDS One similarly defines the higher partial derivatives at a point a ∈ ∗U. Definition 9.2.3. Given f C2 we consider the Hessian matrix ∈ Ha of the second partial derivatives at each a ∗U. ∈ h By Lemma 9.1.1, Da and Ha are finite throughout U. For fu- ture applications, we would like to consider the infinitesimal interval between two infinitely close points in the halo of U. Remark 9.2.4. Let a, b hU with a b, then the interval be- h∈ ≈ tween a and b is included in U ∗U. ⊆ 1 Theorem 9.2.5. If f is C then f(b) f(a)= Dx(b a) for some x in the interval between a and b. − − Proof. This results by transfer of the mean value theorem. Note that here Dx(b a) is a product of matrices (row vector times column vector). − Equivalently, we can write f(b) f(a) D (b a)=0 − − x − for this specific choice of x. Meanwhile, for an arbitrary point in place of x we have the following weaker statement. Theorem 9.2.6. Let f be C1 and let a, b hU be infinitely close. Then for every y in the interval between a and∈b, we have f(b) f(a) D (b a) b a . − − y − ≺≺ k − k Proof. By hypothesis, the partial derivatives Dx are themselves continuous functions of x. Then the characterization of continuity via infinitesimals and microcontinuity in Section 5.6 implies that Dx D 1 for any y a (e.g., for y = st(a)). Therefore we can write − y ≺≺ ≈ f(b) f(a)= D (b a)+(D D )(b a). − y − x − y − By continuity of Dx at x0 = st(x) = st(y), the difference Dx Dy is infinitesimal, proving the theorem. − Remark 9.2.7. If the first partial derivatives are Lipschitz (e.g., if f is C2), and we are given an infinitesimal constant β such that x y β for all x in the interval between a and b, then we have thek stronger− k ≺ condition f(b) f(a) D (b a) β b a . − − y − ≺ k − k 9.3. PREVECTORS 85 If f is C2 then by transfer of the Taylor approximation theorem we have 1 f(b) f(a)= D (b a)+ (b a)tH (b a) − a − 2 − x − for some x in the interval between a and b, and remarks similar to those we have made regarding Dx Dy apply to Hx Hy. n − − If ϕ =(ϕi): U R then the n rows Di corresponding to ϕi form → a the Jacobian matrix Ja of ϕ at a. By applying the above considerations to each ϕi we obtain that ϕ(b) ϕ(a) J (b a) b a . − − y − ≺≺ k − k If all partial derivatives are Lipschitz (e.g., if ϕ is C2) then ϕ(b) ϕ(a) J (b a) β b a with β as above. − − y − ≺ k − k 9.3. Prevectors We now choose a positive infinitesimal λ ∗R, and fix it once and for all. We will now define the concept of a prevector.∈ 1 Definition 9.3.1. Given a hM, a prevector based at a is a pair (a, x), where x hM, such∈ that for every smooth function f : ∈ M R, we have f(x) f(a) λ. → − ≺ Note that the hypothesis f(x) f(a) λ is stronger than requir- ing f(x) and f(a) to be infinitely close.− ≺ We can avoid quantification over functions as in Definition 9.3.1 by giving an equivalent condition of being a prevector in coordinates as follows. Consider coordinates in a neighborhood W of st(a) in M, n whose image in Euclidean space is U R . Leta, ˆ xˆ ∗U be the ⊆ ∈ coordinates for a, x ∗W . ∈ Theorem 9.3.2. The pair (a, x) is a prevector based at the point a if and only if xˆ aˆ λ, − ≺ where the difference xˆ aˆ is defined using the linear structure of the n − vector space ∗R ∗U. ⊇ Proof. Let us show that the two definitions are indeed equivalent. Assume the first definition, and let (u1,...,un) be the chosen coordi- nate functions. Since each xi is smooth, we get ui(x) ui(a) λ for each i, i.e.,x ˆ aˆ λ. − ≺ Conversely,− assume≺ (a, x) satisfies the second definition. Let f : M R be a smooth function. Then by the intermediate value the- orem,→ f(x) f(a) = D (ˆx aˆ) for some c in the interval betweena ˆ − c − 1kdam-vector (with a kuf). 86 9. GRADIENTS, TAYLOR, PREVECTORFIELDS andx ˆ. Now the components of Dc are finite by Lemma 9.1.1. There- fore f(x) f(a) xˆ aˆ λ. − ≺k − k ≺ 9.4. Tangent space to a manifold via prevectors Definition 9.4.1. We denote by Pa = Pa(M) the set of prevectors based at a. This is an infinite-dimensional space. We will use Pa to define the finite-dimensional tangent space of M. First we will define an equivalence relation on P as follows. ≡ a Definition 9.4.2. We write (a, x) (a, y) ≡ if one of the following two equivalent conditions is satisfied: (1) f(y) f(x) λ for every smooth function f : M R; (2) in coordinates− ≺≺ as above,y ˆ xˆ λ. → − ≺≺ The equivalence of the two definitions follows by the same argument as in Section 9.3. Definition 9.4.3. We denote by Ta = Ta(M) the set of equivalence classes Pa/ . In the spirit of physics notation, the equivalence class of (a, x) P≡will be denoted ∈ a ax T , −→ ∈ a and referred to as an ivector (short for infinitesimal vector) so as to distinguish it from a classical vector. Theorem 9.4.4. A choice of coordinates in a neighborhood of M n gives an identification of Ta with R for every nearstandard a. Proof. Given a hM, let W M be a coordinate neighborhood ∈ ⊆ n of the point st(a) M. Thus we have a map W R with image U n ∈ → n λ ⊆ R . Here by definition x xˆ for all x W . Recall that (R )F is the n 7→ ∈ space of points in ∗R that are not infinite compared to λ. We define a map n λ Pa (R ) → F by sending (a, x) xˆ aˆ. This induces an identification of the 7→ − n n λ n λ space Ta = Pa/ with the space R = (R ) /(R ) . Under this ≡ F I identification, the space Ta inherits the structure of a vector space over R. 9.5. ACTION OF A PREVECTOR ON SMOOTH FUNCTIONS 87 Remark 9.4.5. If we choose a different coordinate patch in a neigh- n borhood of st(a) with image U ′ R , then if ϕ : U U ′ is the change of coordinates, then by Definition⊆ 9.2.1, we have → ϕ(ˆx) ϕ(ˆa) J (ˆx aˆ) xˆ aˆ λ, − − st(a) − ≺≺ k − k ≺ where J is the Jacobian matrix. Hence we obtain ϕ(ˆx) ϕ(ˆa) J (ˆx aˆ) λ. − − st(a) − ≺≺ This means that the map Rn Rn induced by the two identifications n → of Ta with R provided by the two coordinate maps, is given by multi- plication by the matrix Jst(a), and therefore is linear. Thus it preserves the linear space structure of the quotient. We see that the vector space structure induced on Ta via coordinates is independent of the choice of coordinates, and so we have a well defined vector space structure on Ta over R. Note that the object denoted Ta is a mixture of standard and nonstandard notions. It is a vector space over R, but defined at every nearstandard point a hM.2 ∈ 9.5. Action of a prevector on smooth functions Prevectors act on functions in a way that does not rely on either limits or standard part. Definition 9.5.1. A prevector (a, x) P acts on a smooth func- ∈ a tion f : M R as follows: → 1 (a, x) f = f(x) f(a) . λ − The right hand side is finite by definition of prevector. This action induces a differentiation as follows. Definition 9.5.2. The differentiation of f : M R by an ivec- tor a x T is defined as follows: → −→ ∈ a −→a x f = st (a, x)f . 2If a,b hM and a b, then given coordinates in a neighborhood of st(a), ∈ ≈ n the identifications of Ta and Tb with R induced by these coordinates induces an identification between Ta and Tb. Given a different choice of coordinates, the ma- trix Jst(a) used in the previous paragraph is the same matrix for a and b, and so the identification of Ta with Tb is well defined, independent of a choice of coordinates. Thus when a b hM we may unambiguously add an ivector a x T and an ≈ ∈ −→ ∈ a ivector −→by T . ∈ b 88 9. GRADIENTS, TAYLOR, PREVECTORFIELDS The action −→a xf is well defined by definition of the equivalence rela- tion . Note again that the ivector −→a x is a nonstandard object based at the≡ nonstandard point a, but it assigns a standard real number to the standard function f. Theorem 9.5.3. The action (a, x)f satisfies the Leibniz rule up to infinitesimals. Proof. Indeed, given functions f and g we can compute the action as follows: 1 (a, x)(fg)= f(x)g(x) f(a)g(a) λ − 1 = f(x)g(x) f(x)g(a)+ f(x)g(a) f(a)g(a) λ − − 1 1 = f(x) g(x) g(a) + f(x) f(a) g(a) λ − λ − 1 1 f(a) g(x) g(a) + f(x) f(a) g(a). ≈ λ − λ − Here the final relation is justified by the continuity of fand finiteness ≈ of the action 1 g(x) g(a) . λ − Theorem 9.5.4. Differentiation by an ivector −→a x satisfies the fol- lowing version of the Leibniz rule: a x(fg)= f(a ) a xg + a xf g(a ) −→ 0 · −→ −→ · 0 where a0 = st(a). Proof. By Theorem 9.5.3, we obtain the following for the differ- entiation −→a x(fg): a x(fg)= st(f(a)) axg + a xf st(g(a)) −→ · −→ −→ · = f(a ) a xg + a xf g(a ), 0 · −→ −→ · 0 where the second equality is by continuity of f and g. When a is real we obtain the ordinary Leibniz rule for differentia- tion. 9.6. Maps acting on prevectors We continue to follow [Nowik & Katz 2015]. Recall that λ is a fixed infinitesimal, Pa denotes the space of prevectors (a, x) at a nearstandard point a hM, i.e., x a λ. Let f : M N∈be a smooth− map≺ between smooth manifolds. Let a hM be a→ nearstandard point. ∈ 9.6. MAPS ACTING ON PREVECTORS 89 Definition 9.6.1. We define the differential dfa of f, df : P (M) P (N), a a → f(a) by (a, x) (f(a), f(x)). 7→ Definition 9.6.2. The differential dfa induces a tangent map T f : T (M) T (N) a a → f(a) by choosing a prevector representing an ivector a x T M and setting −→ ∈ a T fa (−→a x)= f−−−−−−→(a) f(x). The relation between f and df seems more transparent here than in the corresponding classical definition. Theorem 9.6.3. For a standard point a M, the space Ta is nat- urally identified with the classical tangent space∈ of M at a. Proof. It suffices to show this for an open set U Rn, where n ⊆ the classical tangent space at any point a is R itself. A classical vector v Rn is then identified with ∈ a x where x = a + λ v. −→ · Under this identification, our definitions of the differentiation −→a xf (see Section 9.5) and the action of the differential T fa(−→a x) coincide with the classical ones. Remark 9.6.4. When the manifolds are open subsets of Euclidean space and T is the classical tangent space, the map T f : T T is a a a → f(a) identified with the Jacobian matrix Ja (at the point a) whose rows are the gradients Da of all n components. Recall that Da v for a column vector v is the matrix product yielding a scalar result. CHAPTER 10 Prevector fields, classes D1 and D2 10.1. Prevector fields The notion of an internal set was discussed in Section 5.10. Recall that λ is a fixed infinitesimal. A prevector field is, intuitively, the assignment of an infinitesimal displacement (of size comparable to λ) at every point, given by an internal map. Remark 10.1.1. Here, as usual, a map is called internal if it is a member of a star transform of a standard object. For example, a map f : M N is defined by a subset of the Cartesian product M N. → × Therefore f can be thought of as an element of P = (M N). An P × internal map from ∗M to ∗N is then an element of ∗P. More precisely, we have the following definition. Recall that Pa is the set of prevectors at a point a hM, i.e., pairs (a, x) such that in coordinatesx ˆ aˆ λ. ∈ − ≺ Definition 10.1.2. A prevector field Φ on a smooth manifold M is an internal map Φ : ∗M ∗M such that we have (a, Φ(a)) Pa for every a hM. → ∈ ∈ In coordinates where addition and subtraction are possible, this condition translates into Φ(a) a λ for every a hM. − ≺ ∈ Remark 10.1.3. Even though the condition is imposed only at h points of M, we define Φ on all of ∗M since we need Φ (and therefore both its domain and range) to be an internal entity so as to be able to apply hyperfinite iteration which is the basis of the hyperreal walk. The relation between prevectors (a, x)and(a, y) in Pa was defined in Section 9.4. Namely,≡ in coordinatesy ˆ xˆ λ. − ≺≺ Definition 10.1.4. Let Φ and G be prevector fields. We say Φ is equivalent to G and write Φ G if Φ(a) G(a) for every a hM, or in coordinates Φ(a) G(a) ≡ λ. ≡ ∈ − ≺≺ Definition 10.1.5. A local prevector field on an open U M is an ⊆ internal map Φ : ∗U ∗V satisfying the above condition, where V U is open. → ⊇ 91 92 10. PREVECTOR FIELDS, CLASSES D1 AND D2 When the distinction is needed, we will call a prevector field defined on all of ∗M a global prevector field. The reason for allowing the values of a local prevector field defined on ∗U to lie in a possibly larger range ∗V is in order to allow a prevector field Φ to be restricted to a smaller domain which is not invariant under Φ. Example 10.1.6. Let M = R. If we wish to restrict the prevector field Φ on M given by Φ(a)= a + λ to the domain ∗(0, 1) ∗R, then we need the flexibility of allowing a slightly larger range for⊆ Φ. In the sequel we will usually not mention the slightly larger range V when describing a local prevector field, but will tacitly assume that we have such a V when needed. A second instance where it may be required for the range to be slightly larger than the domain is the following natural setting for defining a local prevector field. Example 10.1.7. Let p M. Let V M a coordinate neigh- ∈ n ⊆ borhood of p with imagesp ˆ V ′ R . Let X be a classical vector ∈ ⊆ n 1 field on V , given in coordinates by X′ : V ′ R . Then there is a n → neighborhood U ′ ofp ˆ R , with U ′ V ′, such that we can define a ∈ ⊆ local prevector field Φ′ : ∗U ′ ∗V ′ by setting → Φ′(a)= a + λ X′(a). (10.1.1) · Indeed, it suffices to choose U ′ such that U ′ is compact and further- more U V ′. For the corresponding open set U V in M, this ′ ⊆ ⊆ induces a local prevector field Φ : ∗U ∗V which realizes the classical vector field X on U in the sense of Definition→ 10.2.1 below (cf. Theo- rem 9.6.3). 10.2. Realizing classical vector fields Definition 10.2.1. We say that a local prevector field Φ on U realizes a classical vector field X on U if for every smooth function h : U R we have the following for the new function Xh: → Xh (a)= a−−−−→Φ(a) h for all a U, where a−−−−→Φ(a) is the ivector represented by the prevec- tor (a, Φ(∈a)). 1It is not entirely unambiguous how one is to choose a trivialized vector field X′ starting with a vector field X on M. One may want to make this more precise. 10.2. REALIZING CLASSICAL VECTOR FIELDS 93 Remark 10.2.2. When realizing a vector field as in Example 10.1.7 (namely, formula (10.1.1) above) it may be necessary to restrict to a smaller neighborhood U. For example, when M = V = V ′ = (0, 1) d and X = 1 (i.e., X =1 dx in classical notation), one needs to take U = (0,r) for some r R,r < 1 in order for Φ(a) = a + λ to lie always ∈ in ∗V . Note that Definition 10.2.1 involves only standard points. Different coordinates for the same neighborhood U will induce equiv- alent realizations in ∗U. More precisely, we show the following. Remark 10.2.3. Let U Rn. Let X : U Rn be a classical vec- ⊆ n → tor field. Let ϕ : U W R be a change of coordinates. Thus Tϕ → ⊆ n sends Ta to Tϕ(a). Let Y : W R be the classical vector field corre- sponding to X under the coordinate→ change. Then Y (ϕ(a)) = JaX(a), (10.2.1) where Ja is the Jacobian matrix of ϕ at a. Proposition 10.2.4. In the notation of Remark 10.2.3, let Φ,G be the prevector fields given by (1) Φ(a)= a + λX(a), (2) G(a)= a + λY (a) 1 as in Example 10.1.7. Then Φ ϕ− G ϕ, or equivalently, ≡ ◦ ◦ ϕ Φ(a) G ϕ(a) λ ◦ − ◦ ≺≺ for all a hU. ∈ Proof. Let ϕi,Y i,Gi be the ith component of ϕ,Y,G respectively, i i i and let Da be the differential of ϕ at a, so that Da is the ith row of Ja. We apply the mean value theorem to ϕi together with (10.2.1) to obtain the following for a suitable point c: ϕi Φ(a) Gi ϕ(a)= ϕi(a + λX(a)) Gi ϕ(a) ◦ − ◦ − ◦ = ϕi(a + λX(a)) ϕi(a) λY i(ϕ(a)) − − = DiλX(a) λDi X(a) c − a = λ(Di Di )X(a) c − a λ, ≺≺ i i where Dc is the differential of ϕ at a suitable point c in the interval between a and a + λX(a). Such a point c exists by Theorem 9.2.5. Since this is true for each component i, we obtain ϕ Φ(a) G ϕ(a) λ ◦ − ◦ ≺≺ 94 10. PREVECTOR FIELDS, CLASSES D1 AND D2 proving the theorem. 10.3. Regularity class D1 for prevectorfields Defining various operations on prevector fields, such as their flow, or Lie bracket, requires assuming some regularity properties. First we recall the following classical definition. Definition 10.3.1. A classical vector field X : U Rn is called K- → Lipschitz, where K R, if ∈ X(a) X(b) K a b for all a, b U. (10.3.1) k − k ≤ k − k ∈ For the local prevector field Φ of Example 10.1.7, where Φ(a) a = λX(a), this translates into − Φ(a) a Φ(b) b Kλ a b (10.3.2) − − − ≤ k − k for a, b ∗ U. ∈ Remark 10.3.2. Note the presence of the factor of λ on the right hand side of (10.3.2) which is not in (10.3.1). This is due to the fact that (10.3.2) deals with infinitesimal prevectors rather than classical vectors. The bound (10.3.2) motivates the following definition. Definition 10.3.3. A prevector field Φ on a smooth manifold M 1 h is of class D if whenever a, b M and (a, b) Pa then in coordinates the following condition is satisfied:∈ ∈ Φ(a) a Φ(b)+ b λ a b . (10.3.3) − − ≺ k − k As mentioned earlier, condition (10.3.3) is equivalent to the Lips- chitz condition when Φ is defined by a classical vector field. 10.4. Second differences, class D2 To work with Lie brackets a stronger condition than D1 is necessary. Definition 10.4.1. A prevector field Φ on a smooth manifold M is of class D2 if for any a hM, the following condition is satisfied ∈ n in coordinates: for each pair v,w ∗R with v,w λ, the second difference ∈ ≺ ∆2 Φ(a) =Φ(a) Φ(a + v) Φ(a + w)+Φ(a + v + w) v,w − − is sufficiently small: ∆2 Φ(a) λ v w . v,w ≺ k kk k 10.5. SECORD-ORDER MEAN VALUE THEOREM 95 In Proposition 10.6.1 we will show that the prevector field of Ex- ample 10.1.7, induced by a classical vector field of class Ck, is a Dk prevector field (k =1, 2). This is in fact the central motivation for our definition of Dk. Remark 10.4.2. Our Dk is in fact a weaker condition than Ck, e.g. Φ of Example 10.1.7 is D1 if X is (order 1) Lipschitz, which is weaker than C1. A definition of D0 along the above lines would amount to requiring Φ(a) a λ, i.e. Φ being a prevector field. Note, however, that in our definitions− ≺ above, being a prevector field is part of the definition of an element of Dk. 10.5. Secord-order mean value theorem We will need the result from [Rudin 1976, Theorem 9.40]. We will assume that f is C2 for simplicity (Rudin presents weaker conditions). Theorem 10.5.1 (Rudin Theorem 9.40). Suppose a two-variable function f(u1,u2) is defined and of class C2 in an open set in the plane including a closed rectangle Q with sides parallel to the coordinate axes, having (a, b) and (a + h, b + k) as opposite vertices, where h> 0 and k > 0. Set ∆2(f, Q)= f(a + h, b + k) f(a + h, b) f(a, b + k)+ f(a, b). − − Then there is a point (x, y) in the interior of Q such that ∂2f ∆2(f, Q)= hk (x, y). ∂u1∂u2 Remark 10.5.2. Note the analogy with the mean value theorem. Proof. Define a 1-variable function u(t)= f(t, b + k) f(t, b) where t [a, a + h]. − ∈ Two applications of the mean value theorem will yield the result. Note that the mean value theorem is applicable to the first partial of f since f C2 by hypothesis. Namely there is an x between a and a + h, and there∈ is a y between b and b + k, such that ∆2(f, Q)= u(a + h) u(a) − = hu′(x) ∂f ∂f = h (x, b + k) (x, b) ∂u1 − ∂u1 ∂2f = hk (x, y) ∂u1∂u2 96 10. PREVECTOR FIELDS, CLASSES D1 AND D2 proving the theorem. 10.6. The condition C2 implies D2 Recall our “second difference” notation for a prevector field Φ at a nearstandard point a: ∆2 Φ(a)=Φ(a) Φ(a + v) Φ(a + w)+Φ(a + v + w). v,w − − The condition D2 for prevector fields was defined in Section 10.4. This 2 is the condition ∆v,wΦ(a) λ v w for all v,w λ. This is in fact related to [Zygmund 1945≺ k].kk Givenk a classical vector≺ field X, one defines a prevector field Φ(a)= a + λX(a). We will show that if X is of class C2 then Φ is of class D2. Proposition 10.6.1. Let W Rn be open. Let X : W Rn be 2 ⊆ → a classical C vector field, and consider its natural extension to ∗W . h n Then for each a W and each pair of infinitesimal v,w ∗R , we have ∈ ∈ ∆2 X(a) v w . v,w ≺k kk k We obtain the following immediate corollary. Corollary 10.6.2. Let X be of class C2. If Φ is the prevector 2 field on ∗W of Example 10.1.7, i.e., Φ(a)= a + λX(a), then Φ is D . Proof of Proposition 10.6.1. Let U W be a smaller neigh- borhood of st(a) for which all second partial⊆ derivatives of X are bounded by a constant Const R. Let (X1,...,Xn) be the com- ∈ n ponents of X. Given p U and v1, v2 R such that we have p + s v + s v U for all 0 ∈ s ,s 1, we∈ define functions ψi by 1 1 2 2 ∈ ≤ 1 2 ≤ i i ψ (s1,s2)= X (p + s1v1 + s2v2). Then by chain rule, the mixed second partial of ψ satisfies ∂2ψi Const v v ∂s ∂s ≤ | 1|| 2| 1 2 i where Const depends on the second partials of X . By Theorem 10.5.1 (second-order mean value theorem) there is a point (t , t ) [0, 1] [0, 1] such that 1 2 ∈ × 2 e1+e2 i ∂ i ( 1) ψ (e1, e2)= ψ (t1, t2). 2 − ∂s1∂s2 (e1,e2) 0,1 X∈{ } 10.7. BOUNDS FOR D1 PREVECTOR FIELDS 97 Therefore 2 e1+e2 i ∆v1,v2 X = ( 1) X (p + e1v1 + e2v2) 2 − (e1,e2) 0,1 X∈{ } e1+e2 i = ( 1) ψ (e1, e2) 2 − (e1,e2) 0,1 X∈{ } 2 ∂ i = ψ (t1, t2) ∂s1∂s2 Ci v1 v2 ≤ k kk k where Ci is determined by a bound for all second partial derivatives of Xi in U. Since this is valid for each Xi, i = 1,...,n, there is a K R such that ∈ e1+e2 ( 1) X(p + e1v1 + e2v2) K v1 v2 (10.6.1) 2 − ≤ k kk k (e1,e2) 0,1 X∈{ } n for every p U and v1, v2 R such that p + s1v1 + s2v2 U for ∈ ∈ ∈ all 0 s1,s2 1, si R. Applying≤ ≤ the transfer∈ principle to (10.6.1), we conclude that the n same formula holds, with the same K, for all p ∗U and v1, v2 ∗R ∈ ∈ provided that p + s1v1 + s2v2 ∗U for all 0 s1,s2 1, si ∗R. In ∈ ≤ ≤ ∈ n particular this is true for p = a and all infinitesimal v1, v2 ∗R . ∈ 10.7. Bounds for D1 prevector fields Recall that a D1 prevector field Φ satisfies the condition Φ(a) a Φ(b)+ b λ a b − − ≺ k − k 1 whenever (a, b) Pa (see Section 10.3). We note that a D prevector field Φ satisfies the∈ following bounds. Proposition 10.7.1. If Φ is a D1 prevector field on M, and a, b hM with a b λ, then ∈ − ≺ a b Φ(a) Φ(b) a b . k − k≺k − k≺k − k Proof. By the triangle inequality we have Φ(a) Φ(b) a b Φ(a) Φ(b) a + b k − k−k − k ≤k − − k (10.7.1) = Kλ a b k − k where the hyperreal constant K is defined by the last equality in (10.7.1), where K is finite by the D1 condition. In other words, we have Kλ a b Φ(a) Φ(b) a b Kλ a b . − k − k≤k − k−k − k ≤ k − k 98 10. PREVECTOR FIELDS, CLASSES D1 AND D2 Rearranging terms, we obtain (1 Kλ) a b Φ(a) Φ(b) (1 + Kλ) a b , − k − k≤k − k ≤ k − k as required. CHAPTER 11 Prevectorfield bounds via underspill, walks, flows 11.1. Operations on prevector fields We will now show that addition of D1 prevector fields in coordinates is equivalent to their composition (in the sense of the relation ). More precisely, we show the following. ≡ Proposition 11.1.1. Let Φ be a D1 prevector field and G any prevector field. Then for every a hM, we have the following relation among ivectors: ∈ a−−−−−−−→Φ(G(a)) = a−−−−→Φ(a)+ a−−−−→ G(a). In particular if both Φ,G are D1 then Φ G G Φ. ◦ ≡ ◦ Proof. In a coordinate chart, let x = a + (Φ(a) a)+(G(a) a)=Φ(a)+ G(a) a. − − − Passing to equivalence classes, we obtain the following relation among ivectors: a−−−−→Φ(a)+ a−−−−→ G(a)= −→a x. Now let b = G(a). Then Φ(G(a)) x = Φ(b) Φ(a) b + a λ b a λ − − − ≺ k − k ≺≺ by the D1 condition on Φ, proving the proposition. We will now show that the composition of D1 prevector fields is D1. Proposition 11.1.2. If prevector fields Φ,G are D1 then prevector field Φ G is D1. ◦ Proof. Let c = G(a) and d = G(b). We have by the triangle inequality Φ G(a) Φ G(b) a + b k ◦ − ◦ − k Φ G(a) Φ G(b) G(a)+ G(b) + G(a) G(b) a + b ≤k ◦ − ◦ − k k − − k = Φ(c) Φ(d) c + d + G(a) G(b) a + b k − − k k − − k λ c d + λ a b λ a b ≺ k − k k − k ≺ k − k 99 100 11. PREVECTORFIELD BOUNDS VIA UNDERSPILL, WALKS, FLOWS where Proposition 10.7.1 is applied in order to bound c d in terms of a b. − − A similar proposition can be proved for D2.1 11.2. Sharpening the bounds via underspill Here we are interested in sharpening the bounds for prevector- fields on M in terms of a specific real constant.≺ Proposition 11.2.1. Let Φ be a prevector field on W , where W n ⊆ M is an open coordinate neighborhood with image U R . Let B U be a closed ball. We will identify W with U to lighten⊆ the notation.⊆ Then we have the following. (1) There is C R such that Φ(a) a Cλ for all a ∗B. ∈ k − k ≤ ∈ (2) If G is another prevector field, then there is a finite β ∗R ∈ such that Φ(a) G(a) βλ for all a ∗B. k − k ≤ ∈ (3) If furthermore Φ G then the constant β ∗R in (2) can be chosen to be infinitesimal.≡ ∈ Proof. Since the first statement is a special case of the second (take G(a)= a for all a), it suffices to prove the second statement. Let A = n ∗N : Φ(a) G(a) nλ for every a ∗B . { ∈ k − k ≤ ∈ } Every infinite n ∗N is in A, so that ∈ ∗N N A. \ ⊆ 1Similarly, we have the following. Proposition 11.1.3. If Φ, G are D2 then Φ G is D2. ◦ Proof. In some coordinates let p = G(a), x = G(a + v) G(a) and y = G(a + w) G(a), and so by Propositions 13.1.2 and 10.7.1 we have −x v and y w . Also− by Propositions 13.1.2 and 10.7.1 we have ≺k k ≺k k Φ(p + x + y) Φ(G(a + v + w)) p + x + y G(a + v + w) k − k≺k − k = G(a)+ G(a + v)+ G(a + w) G(a + v + w) λ v w . k− − k≺ k kk k Now Φ G(a) Φ G(a + v) Φ G(a + w)+Φ G(a + v + w) k ◦ − ◦ − ◦ ◦ k = Φ(p) Φ(p + x) Φ(p + y)+Φ G(a + v + w) k − − ◦ k Φ(p) Φ(p + x) Φ(p + y)+Φ(p + x + y) + Φ(p + x + y) Φ(G(a + v + w)) ≤k − − k k − k λ x y + λ v w λ v w . ≺ k kk k k kk k≺ k kk k 11.3. FURTHER BOUNDS VIA UNDERSPILL 101 But A is an internal set, being defined in terms of internal entities Φ and G.2 Therefore by underspill there is a finite integer C in A. Recall that underspill is the fact that if A ∗N is an internal set, and A includes all infinite n then it must also⊆ include a finite n. This is based on the fact that infinite hypernaturals form an external set; see Section 5.11. To prove the third statement (for Φ G), let ≡ λ A = n ∗N : Φ(a) G(a) for every a ∗B . ∈ k − k ≤ n ∈ Every finite n ∗ N is in A and so by overspill there is an infinite n ∗N ∈ 1 ∈ in A. We can therefore choose the infinitesimal value β = n . 11.3. Further bounds via underspill Proposition 11.3.1. Let Φ be a D1 prevector field on M. Let W n ⊆ M be an open coordinate neighborhood with image U R . Let B U ⊆ ⊆ be a closed ball. Then there is K R such that ∈ Φ(a) a Φ(b)+ b Kλ a b (11.3.1) k − − k ≤ k − k for all a, b ∗B. It follows∈ that (1 Kλ) a b Φ(a) Φ(b) (1 + Kλ) a b . (11.3.2) − k − k≤k − k ≤ k − k Proof. Let N = 1/λ . Let a, b ∗B. We construct a hyperfinite partition of the segment⌊ between⌋ a and∈ b. For each k =0,...,N let k a = a + (b a). k N − Then ak ak+1 λ. Let Cab ∗R be the maximum of the ratios − ≺ ∈ Φ(a ) a Φ(a )+ a k k − k − k+1 k+1k λ a a k k − k+1k 1 over all k =0,...,N 1. Since Φ is D , it follows that the constant Cab is finite. For every 0− k N 1 we have ≤ ≤ − a b Φ(a ) a Φ(a )+ a C λ a a = C λk − k. k k − k − k+1 k+1k ≤ ab k k − k+1k ab N 2Kanovei to comment on strong transitivity. 102 11. PREVECTORFIELD BOUNDS VIA UNDERSPILL, WALKS, FLOWS Therefore by the triangle inequality Φ(a) a Φ(b)+ b k − − k N 1 − Φ(a ) a Φ(a )+ a (11.3.3) ≤ k k − k − k+1 k+1k Xk=0 C λ a b . ≤ ab k − k Now let A = n ∗N : Φ(a) a Φ(b)+ b nλ a b for each a, b ∗B . { ∈ k − − k ≤ k − k ∈ } Since each Cab as in (11.3.3) is finite, every infinite n ∗N is in A. Hence by underspill, A also contains a finite integer K, proving∈ (11.3.1). The estimate (11.3.2) assertion follows from the first as in the proof of Proposition 10.7.1. A similar proposition can be proved for D2.3 11.4. Global prevector fields, walks, and flows In this section we will define the hyperreal walk of a prevector field and relate it to the classical flow. Recall that a prevector field Φ is a map Φ : ∗M ∗M from ∗M to itself. We wish to view a prevector field as the first step→ of its walk at time λ> 0, and will define the hyperreal 3Similarly we have the following. Proposition 11.3.2. Let Φ be a D2 prevector field. If W is a coordinate neighborhood with image U Rn and B U is a closed ball, then there is K R such that ⊆ ⊆ ∈ Φ(a) Φ(a + v) Φ(a + w)+Φ(a + v + w) Kλ v w k − − k≤ k kk k ∗ ∗ ∗ for all a B and v, w Rn such that a + v,a + w,a + v + w B. ∈ ∈ ∈ Proof. The proof is similar to that of Proposition 11.3.1. Let N = 1/λ . ∗ ∗ n ∗ ⌊ ⌋ Given a B and v, w R such that a + v,a + w,a + v + w B, let ak,l = ∈ ∈ ∈ a + k v + l w, 0 k,l N. Let C be the maximum of N N ≤ ≤ avw Φ(a ) Φ(a ) Φ(a )+Φ(a ) k k,l − k+1,l − k,l+1 k+1,l+1 k λ v/N w/N k kk k for 0 k,l N 1 then C is finite. For every 0 k,l N 1 we have ≤ ≤ − avw ≤ ≤ − Φ(a ) Φ(a ) Φ(a )+Φ(a C λ v/N w/N . k k,l − k+1,l − k,l+1 k+1,l+1k≤ avw k kk k Summing over 0 k,l N 1 we get ≤ ≤ − Φ(a) Φ(a + v) Φ(a + w)+Φ(a + v + w) C λ v w . k − − k≤ avw k kk k By underspill as in the proof of Proposition 11.3.1, there is one finite K which works for all a,v,w. 11.4. GLOBAL PREVECTOR FIELDS, WALKS, AND FLOWS 103 walk and the real flow accordingly (see below). The flow at arbitrary time t is defined by iterating Φ the appropriate number of times. Remark 11.4.1. The idea of a vector field being the infinitesimal generator of its flow receives a literal meaning in the TIDG setting. We view Φ as an element Φ ∗Map(M). Consider the map ∈ ∗Map(M) ∗N ∗Map(M) × → which is the natural extension of the map Map(M) N Map(M) taking (φ, n) to φn = φ φ φ (altogether n times).× → Thus Φn is defined for all hypernatural◦ ◦···◦n. We will now define the hyperfinite flow Φt of a prevector field Φ. The flow will be defined at times t ∗R that are hyperinteger multiples of the base infinitesimal λ. ∈ Definition 11.4.2. Let Φ : ∗M ∗M be a global prevector field. → For each hypernatural n ∗N, let t = nλ and define the hyperreal ∈ walk Φt of Φ at time t by setting n Φt(a) = Φ (a). (11.4.1) Definition 11.4.3. We define the real flow φt on M by setting φt(x)= st(Φnλ(x)) for all x M, where nλ t< (n + 1)λ, i.e., ∈ ≤ t n = . λ The real flow will be defined for all real t that are sufficiently small so that the standard part is well-defined; see material around (11.8.1) for the details. Theorem 11.4.4. Let Φ be a prevector field of the class D1 on M, and let Φt be its hyperfinite walk as in (11.4.1). Then prevector field Φ is invariant under Φt. n Proof. The differential dΦt of the map Φt = Φ acts on each prevector (a, Φ(a)) by n n dΦt(a, Φ(a)) = (Φ (a), Φ (Φ(a))) = (Φn(a), Φ Φ Φ(a)) ◦ ◦···◦ = (Φn(a), Φ(Φn(a))) =(b, Φ(b)) by the associativity of composition, where b = Φn(a), proving the in- variance of the prevector field Φ. Here the relation Φn Φ = Φ Φn is proved by internal induction (see Section 11.5). ◦ ◦ 104 11. PREVECTORFIELD BOUNDS VIA UNDERSPILL, WALKS, FLOWS Remark 11.4.5. This proof of invariance compares favorably to the proof of invariance under the flow in the A-track framework (see Theorem 7.8.2). The hyperfinite walk for a local prevector field is defined similarly. A bit of care is required to specify its domain. Given a local prevector field Φ : ∗U ∗V , we extend Φ to Φ′ : ∗V → → ∗V by defining Φ′(a)= a for all a ∗V ∗U, and let Yn = a ∗U : n ∈ − { ∈ (Φ′) (a) ∗U . The domain of Φ will be Y (here as before n(t) is ∈ } t n(t) the integer part of t/λ), where it is defined by Φt(a) = Φt′ (a). We would also like to consider Φ for t 0. This can be defined for a t ≤ global prevector field Φ which is bijective on ∗M, or for a local prevector field which is bijective in the sense of Remark 13.2.5, in particular a D1 1 local prevector field. Namely, we define Φt for t 0 to be (Φ− ) t. ≤ − Note that for any global prevector field Φ, the hyperreal walk Φt is defined for all t 0. This is of course not the case for the classical flow ≥ of a classical vector field. Similarly Φt is defined for all t 0 if Φ is bijective. ≤ 11.5. Internal induction Internal induction is treated in [Goldblatt 1998, p. 129]. Theorem 11.5.1. If X ∗N be an internal subset containing 1 ⊆ 4 and closed under the successor function n n +1. Then X = ∗N. 7→ Proof. If the set difference Y = ∗N X is nonempty, then since Y is internal, it has a least element n Y (see\ Section 5.11). Hence n 1 X. However, by closure under successor,∈ the number n must also− be∈ in X, yielding a contradiction. This will be used in the proof of Theorem 11.6.2. 11.6. Dependence on initial condition and prevector field In this section we deal with initial conditions.5 Remark 11.6.1. We showed in Proposition 11.3.1 that each D1 prevector field Φ defined in an open neighborhood of a closed Euclidean ball B satisfies the bound Φ(a) a Φ(b) + b Kλ a b for k − − k ≤ k − k all a, b ∗B for an appropriate finite K. Such a K will be exploited in Theorem∈ 11.6.2. 4“peulat ha’okev” according to hebrew wikipedia at peano axioms 5tnai hatchalati 11.6. DEPENDENCE ON INITIAL CONDITION AND PREVECTOR FIELD 105 1 Theorem 11.6.2. Let Φ be a local D prevector field on ∗U. Given p n ∈ U and a coordinate neighborhood of p with image W R , let B′ B W be closed balls of radii r/2,r around the image⊆ of p. Sup-⊆ ⊆ pose Φ(a) a Φ(b)+ b Kλ a b for all a, b ∗B, with K a finitek constant,− 6−then k ≤ k − k ∈ (1) there is a positive T R such that Φt(a) ∗B for all a ∗B′ ∈ ∈ ∈ and T t T . Furthermore, for all a, b ∗B′ and 0 t − ≤ ≤ Kt ∈ ≤ ≤ T : Φt(a) Φt(b) e a b . k − k ≤ k − k 2 (2) If we take a slightly larger constant K′ = K/(1 Kλ) , then − for all a, b ∗B′ and T t T : ∈ − ≤ ≤ K′ t K′ t e− | | a b Φ (a) Φ (b) e | | a b . k − k≤k t − t k ≤ k − k Proof. We will give a proof for positive t.7 Let C R be as in Proposition 11.2.1 (on sharpening bounds via underspill and∈ overspill), so that Φ(a) a Cλ for all a ∗B. We define T by setting k − k ≤ ∈ r T = , 2C so that 0 < T R. Let a ∗B′ and 0 t T , and set ∈ ∈ ≤ ≤ n = n(t)= t/λ . (11.6.1) ⌊ ⌋ By internal induction, we obtain a bound for Φn(a) a by means of a hyperfinite sum as follows: − n n m m 1 r Φ (a) a Φ (a) Φ − (a) nCλ . k − k ≤ k − k ≤ ≤ 2 m=1 X By the triangle inequality we obtain r r Φn(a) Φ(a) a + a + = r, k k≤k − k k k ≤ 2 2 n and therefore Φ (a) ∗B. By Proposition 11.3.1(2) we have ∈ (1 Kλ) a b Φ(a) Φ(b) (1 + Kλ) a b − k − k≤k − k ≤ k − k 6Such finite K exists by Proposition 11.3.1. 7The case of negative t then follows from Remark 13.3.2. 106 11. PREVECTORFIELD BOUNDS VIA UNDERSPILL, WALKS, FLOWS 8 for all a, b ∗B, and so by internal induction (see Theorem 11.5.1) we obtain ∈ (1 Kλ)n a b Φn(a) Φn(b) (1 + Kλ)n a b , − k − k≤k − k ≤ k − k proving the theorem in view of the fact that Kt n (1 + Kλ)n = 1+ eKt (11.6.2) n ≈ by formula (11.6.1) defining n. The choice of K′ results by elementary algebra. 11.7. Dependence on initial conditions for local prevector field In this section we will provide bounds on how fast the flows of a pair of prevector fields can diverge from each other. In particular, this will show that if prevector fields are equivalent then the corresponding classical flows coincide. We will deal with a more general situation (than that dealt with in Section 11.6) where A′ is the set of initial conditions, whereas A is the set where we assume that the flow is contained until time T . The set A will typically be larger than A′. 1 Let Φ be a D local prevector field on ∗U and let G be any local prevector field on ∗U. Given a coordinate neighborhood included in U n with image W R , let A′ A ∗W be internal sets. Assume that⊆β is such that⊆ ⊆ Φ(a) G(a) βλ (11.7.1) k − k ≤ for all a A (in particular if β is infinitesimal then the prevector fields are equivalent).∈ Assume also that K > 0 is a finite constant such that Φ(a) a Φ(b)+ b Kλ a b k − − k ≤ k − k for all a, b A. Assume that 0 < T R is such that Φt(a) and Gt(a) ∈ ∈ are in A for all a A′ and 0 t T . ∈ ≤ ≤ 8This step requires internal induction and cannot be obtained by iterating the formula n times, writing down the resulting formula with integer parameter n, and applying the transfer principle. The problem is that the entities occurring in the formula are internal rather than standard, so there is no real statement to apply transfer to. However, one can perhaps use the same trick as in the definition of the hyperfinite iterate in the first place, by introducing an additional parameter as in Section 11.4. Namely, the formula holds for all triples (Φ,G,n) map(M) ∈ × map(M) N, and then apply the star transform. × 11.7. DEPENDENCE ON INITIAL CONDITIONS FOR LOCAL PREVECTOR FIELD107 Theorem 11.7.1. In the notation above, for all a A′ and 0 t T , we have the following bound: ∈ ≤ ≤ β Φ (a) G (a) (eKt 1) βteKt. k t − t k ≤ K − ≤ 1 1 If G− exists, e.g., if G is also D , and if Φt(a) and Gt(a) are in A for all a A′ and T t T , then ∈ − ≤ ≤ β K t K t Φ (a) G (a) (e | | 1) β t e | | k t − t k ≤ K − ≤ | | for all T t T . − ≤ ≤ Proof. It suffices to provide a proof for positive t. We will prove by internal induction that β Φn(a) Gn(a) (1 + Kλ)n 1 . (11.7.2) k − k ≤ K − This implies the statement when applied to n = n(t); cf. (11.6.2). The basis of the induction is provided by the estimate (11.7.1). By Propo- sition 11.3.1(2) we have Φn+1(a) Gn+1(a) k − k Φ(Φn(a)) Φ(Gn(a)) + Φ(Gn(a)) G(Gn(a)) ≤k − k k − k (1 + Kλ) Φn(a) Gn(a) + βλ. ≤ k − k Exploiting the inductive hypothesis (11.7.2), we obtain Φn+1(a) Gn+1(a) k − k β (1 + Kλ) (1 + Kλ)n 1 + βλ ≤ K − β β βKλ (1 + Kλ)n+1 + βλ ≤ K − K − K β = (1 + Kλ)n+1 1 K − proving the inductive step from n to n + 1. Corollary 11.7.2. Assume Φ and G are as in Theorem 11.7.1. If we have Φ G, then there is a β 1 for the statement of Theo- rem 11.7.1, and≡ therefore Φ (a) G (≺≺a) for all 0 t T . t ≈ t ≤ ≤ Proof. By Proposition 11.2.1. 9 9 Remark 11.7.3. If the internal sets A′ A given in Theorem 11.7.1 are ′ ′ ∗ ⊆ balls B B of radii r < r R, and say we are not given that both Φt(a) ⊆ ∈ 108 11. PREVECTORFIELD BOUNDS VIA UNDERSPILL, WALKS, FLOWS 11.8. Inducing a real flow The hyperreal walk Φt of a prevector field Φ induces a classical flow on M as follows. 1 Definition 11.8.1. Let Φ be a D local prevector field. Let B′ ⊆ Rn and [ T, T ] be as in Theorem 11.6.2. We define the real flow − φ : B′ M t → induced by Φ by setting φt(x)= st(Φt(x)). (11.8.1) Recall that Φ is defined by hyperfinite iteration (of Φ) t/λ times. t ⌊ ⌋ The following are immediate consequences of Theorems 11.6.2 and Corollary 11.7.2. Theorem 11.8.2. Given a D1 prevector field Φ the following hold: 10 K t (1) the flow φt is Lipschitz continuous with constant e | |. (2) the real flow φt is injective. (3) If G is another D1 prevector field with real flow g , and Φ G, t ≡ then φt = gt. Remark 11.8.3. Suppose Φ is obtained from a classical vector field X by the procedure of Example 10.1.7. Then [Keisler 1976, Theorem 14.1] shows that φt is in fact the flow of the classical vector field X in the classical sense. By Theorem 11.8.2(3), each prevector field Φ that realizes X will yield a hyperfinite walk whose standard part will produce the same classical flow. The results of this subsection have the following application to the standard setting. Corollary 11.8.4. Consider an open set U Rn. Let X,Y : U ⊆ → Rn be classical vector fields, where X is Lipschitz with constant K, ′ and Gt(a) are in B for all a B and 0 t T , but rather we are given that only one of them remains in some∈ ball B′ ≤ B′′≤ B with radius r′ < r′′ < r such that β r r′′. Then it follows from the⊆ bound⊆ of Theorem 11.7.1 that the other flow remains≺≺ − in B. To be precise, one needs to repeat the induction of the proof of Theorem 11.7.1. This needs to be rewritten. 10 This follows from the estimates given if we interpret the flow as referring to the map M M at time t. However, if this refers to the full flow M I M then the claim→ apparently does not immediately follow and may not even× be→ true for a general prevectorfield. 11.9. GEODESIC FLOW 109 and X(x) Y (x) b for all x U. If x(t), x˜(t) are integral curves of Xkthen − k ≤ ∈ x(t) x˜(t) eKt x(0) x˜(0) . k − k ≤ k − k If y(t) is an integral curve of Y with x(0) = y(0) then b x(t) y(t) (eKt 1) bteKt. k − k ≤ K − ≤ Proof. Define prevector fields on ∗U as usual by setting Φ(a) a = λX(a) and G(a) a = λY (a) as in Example 10.1.7, and apply Theorems− 11.6.2, 11.7.1, and− Remark 11.8.3. 11.9. Geodesic flow One advantage of the hyperreal approach to solving a differential equation is that the hyperreal walk exists for all time, being defined combinatorially by iteration of a selfmap of the manifold. The focus therefore shifts away from proving the existence of a solution, to estab- lishing the properties of a solution. Thus, our estimates show that given a uniform Lipschitz bound on the vector field, the hyperreal walk for all finite time stays in the finite part of the ∗M. If M is complete then the finite part is nearstandard. Our estimates they imply that the hyperreal walk for all finite time descends to a real flow on M. In particular, the geodesic equation on an n-dimensional mani- fold M, k′′ k i′ j′ α +Γijα α =0, (11.9.1) can be interpreted as a first-order ODE on the (2n 1)-dimensional manifold SM (union of unit spheres in all tangent spaces).− The latter is known to satisfy a uniform Lipschitz bound. Therefore the geodesic flow on a complete Riemannian manifold M exists for all time t R. In more detail, consider a Riemannian metric , with metric∈ co- h i ∂ ∂ efficients written in coordinates as (gij) meaning that ∂ui , ∂uj = gij. kℓ The Γk can be expressed as Γk = g (g g + g ). A smooth ij ij 2 iℓ;j − ij;ℓ jℓ;i curve in M with components (αi) is a geodesic if and only if (11.9.1) holds for each k. CHAPTER 12 Universes, transfer This chapter develops a detailed rigorous set-theoretic setting for the TIDG framework. 12.1. Universes The idea of nonstandard analysis is to enlarge the mathematical world we are studying, in a way that does not change any of its proper- ties, this leading to new insights on the original world. So, our starting point is some set X, which in the case of differential geometry may be (the underlying set of) a smooth n-dimensional manifold M, or the disjoint union of M with finitely many other sets one might want to refer to, say Rn, R, and N. Let us review some set-theoretic background related to universes. Recall that given a set A, the symbol (A) denotes the set of all subsets of A. We define the universe V (X) asP follows. Let V0(X)= X. At the next step we set V = V (X) (V (X)). More generally, we 1 0 ∪ P 0 define recursively for n N, ∈ V (X)= V (X) (V (X)). n+1 n ∪ P n Finally the universe V (X) is defined as follows. Definition 12.1.1. The universe, or superstructure, over a set X is the set V (X)= Vn(X). n N [∈ Corollary 12.1.2. If a V (X) r X then a V (X). ∈ ⊆ Exercise 12.1.3. By definition, V (X) V (X). Prove that in m ⊆ m+1 fact Vm(X) $ Vm+1(X), and even, by Cantor, Vm+1(X) has a strictly bigger cardinality than Vm(X). For special needs of nonstandard analysis, one usually requires X to have the following key technical property; see e.g., [Keisler 1976, 15B] or [Chang & Keisler 1990, 4.4]. 111 112 12. UNIVERSES, TRANSFER Definition 12.1.4 (base sets). A set X is a base set if X = ∅ and 6 if x X then x is an atom in V (X), so that x = ∅ while x V (X)= ∅. ∈ 6 ∩ The following are important consequences. Corollary 12.1.5 ([Chang & Keisler 1990], Corollary 4.4.1). Assume that X is a base set. If a V (X) and a b V (X) then ∈ ∈ ∈ m m 1 and a Vm 1(X). ≥ ∈ − Corollary 12.1.6. Assume that X is a base set. If a, b V (X) ∈ and a V (X)= b V (X) then either a = b or a, b X ∅ . ∩ ∩ ∈ ∪{ } Proof. Suppose that a / X. Then a m 1 Vm(X) by construc- ∈ ∈ ≥ tion, and hence a V (X) and a V (X) = a. If also b / X then similarly b V (X)⊆ = b, thus a V∩(X) = b S V (X) implies∈a = b. If ∩ ∩ ∩ b X then b V (X) = ∅ (as X is a base set), and we conclude that ∈ ∩ a = a V (X)= ∅, as required. ∩ For examples of base sets see Section 12.2. 12.2. The reals as the base set We recall Dedekind’s construction of the real field. Theorem 12.2.1. Each real x is defined following Dedekind as the pair x = Q , Q′ of two complementary sets of rationals Q = q { x x} x { ∈ Q : q < x and Q′ = q Q : q x . } x { ∈ ≥ } The key property of R which implies that R is a base set, is the fact that each real x consists of two elements Qx and Qx′ , which are sets of von Neumann rank (see below) equal exactly to ω. For those not versed in set theory we provide a definition. Definition 12.2.2. The von Neumann rank of a set x is an ordi- nal rank(x) defined so that rank(∅) = 0 and if x = ∅ then rank(x) is the least ordinal strictly bigger than all ordinals rank6 (y), y x. ∈ The existence and uniqueness of rank(x) follow from the axioms of modern Zermelo–Fraenkel set theory ZFC, with the key role of the regularity, or foundation axiom in the argument. Informally, rank(x) can be viewed as the finite or transfinite length of the cumulative con- struction of the set x from the empty set. Definition 12.2.3. Let γ be an ordinal. A set X is a γ-base set if X = ∅, X consists of non-empty elements, and we have rank(a) = γ whenever6 a x X. ∈ ∈ Lemma 12.2.4 ([Chang & Keisler 1990], Exercise 4.4.1). If γ is an infinite ordinal, and X is a γ-base set, then X is a base set. 12.3. MORE SETS IN UNIVERSES 113 Proof. If x X = V0(X) then rank(x) = γ + 1 by the choice of X. ∈ If x V1(X) then either x X with rank(x)= γ +1, or x = ∅ (the ∈ ∈ empty set) with rank(∅)=0, or x V1(X) X with rank(x)= γ + 2. Similarly if x V (X) then either∈ x \X with rank(x) = γ + 1, ∈ 2 ∈ or rank(x) 1 (for x = ∅ and x = ∅ ), or x V2(X) X with γ +2 rank≤(x) γ + 3. Extending this{ } argument,∈ one easily\ shows ≤ ≤ by induction that if m 1 then every set y Vm(X) X satisfies either γ+2 rank(y) γ+m≥+1 or rank(Y ) < m∈, hence never\ rank(y)= γ. This≤ rules out the≤ possibility of y x for any x X = V (X). ∈ ∈ 0 Now consider the most important case X = R (the reals). Recall that ω is the least infinite ordinal. Lemma 12.2.5. The set X = R is an ω-base set, hence, a base set. Proof. Set theory defines natural numbers so that 0 = ∅ and n = 0, 1, 2,...,n 1 for all n > 0. It follows that rank(n) = n for any {natural number− n}. Next, according to the set theoretic definition of an ordered pair x, y as h i x, y = x , x, y , h i {{ } { }} all pairs and triples of natural numbers have finite ranks. Therefore each rational number q, identified with a triple s, m, n , where s h i n ∈ 0, 1, 2 , m =1, 2, 3,... , n =0, 1, 2, 3,... , and q =(s 1) m , also has a{ finite} rank rank(q). − · We conclude that every infinite set S of rational numbers satis- fies rank(S) = ω exactly. Therefore each Dedekind real x consists of two sets (infinite sets of rationals Qx and Qx′ ) of rank rank(Qx) = rank(Qx′ )= ω. It remains to apply Lemma 12.2.4 with γ = ω. From now on we concentrate on the case X = R. 12.3. More sets in universes By definition the set V0(R) = R has cardinality card(R) = c (the n continuum), and then by induction card(Vn(R)) = exp (c). This array of cardinalities easily exceeds all needs of conventional mathematics. For instance, the underlying set M of a smooth connected manifold M of dimension n 1 is a set of| cardinality| c. Therefore we can iden- ≥ tify M with a set in the complement V1(R) V0(R). (This choice | | \ excludes the possibility of nonempty intersection of M and R.) Then every object of interest for studying the smooth manifold| | M will be an element of V (R), e.g., (1) The topology of M is an element of V3(R); 114 12. UNIVERSES, TRANSFER (2) With the usual encoding of an ordered pair a, b as a , a, b , h i {{ } { }} the set M M is an element of V4(R); (3) The set Map(× M) of all maps from M to itself is an element of V5(R). Similarly, the field operations of R, the vector space operations of Rn, and the natural embedding of N into R, are elements of V (R), as is each function from Rn to R, etc. See also Lemma 4.4.3 in [Chang & Keisler 1990]. 12.4. Membership relation Statements about V (X) are made using a first order language with a binary relation , the membership relation, as the only primary re- lation. This language∈ will be called -language. ∈ Definition 12.4.1. An -formula φ is called bounded if and only if the quantifiers x and x ∈always appear in φ in the form ∀ ∃ x(x a ) ∀ ∈ ⇒··· and x(x a ) ∃ ∈ ∧··· where a is either a variable or a constant. These two forms are usually abbreviated respectively as x a ( ) and x a ( ). ∀ ∈ ··· ∃ ∈ ··· 12.5. Ultrapower construction of nonstandard universes In this section we will present a generalisation of the ultrapower construction of Section 4.7. For the purpose of such a generalisation, recall that f, g, . . . denote functions in a standard universe V (X), while their extensions in a nonstandard extension ∗V (X) will be ∗f, ∗g, . . . Definition 12.5.1. A nonstandard extension of V (X) consists of a base set ∗X with X $ ∗X, a subuniverse ∗V (X) V (∗X), and a map ⊆ : V (X) ∗V (X) (12.5.1) ∗ → satisfying the following two conditions: (I) one has ∗X ∗V (X), and the map is an injective map from ⊆ ∗ V (X) into ∗V (X) such that if x X then ∗x ∗X; ∈ ∈ (II) for every bounded formula φ(x1,...,xk), and every k-tuple of constants a ,...,a V (X), the formula φ(a ,...,a ) is 1 k ∈ 1 k true in V (X) if and only if the formula φ(∗a1,..., ∗ak) is true in ∗V (X). 12.5. ULTRAPOWER CONSTRUCTION OF NONSTANDARD UNIVERSES 115 Example 12.5.2. An example is a formula expressing the statement of the extreme value theorem (EVT) for a function f; see Section 5.8. Therefore by (II) the same formula holds in the extension for ∗f where the quantification is now over the extended domain. Remark 12.5.3. If a = b then ∗a = ∗b by (II), so is a bijection 6 6 ∗ of V (X) onto a subset of ∗V (X). Yet we introduce the bijectivity condition separately by (I) for the sake of convenience. Also, still by (II), if a A V (X) then ∗a ∗A, so if one identifies a ∈ ∈ ∈ with its image ∗a, then one may think of A as a subset of ∗A, that is, one can think of ∗A as an extension of A. Since functions are also sets, for every function f V (X), f : A B, we can consider the set ∗f, ∈ → which is also a function, ∗f : ∗A ∗B. If A and B are considered to be → subsets of ∗A and ∗B respectively, then the function ∗f is an extension of the function f. Theorem 12.5.4. Nonstandard extensions exist. This is a well-known result presented in many sources, including the fundamental monograph on model theory [Chang & Keisler 1990]. We will provide a proof here so as to make the exposition self-contained. Proof. The construction we employ is called reduced (or bounded) ultrapower. It consists of two parts: (1) the ultrapower itself, along with theLo´s theorem, namely Lemma 12.6.2; (2) the transformation of the bounded ultrapower into a universe. We start with an arbitrary ultrafilter (N) containing all cofi- F ⊆ P nite sets X N, namely a nonprincipal ultrafilter. The starting point for the construction⊆ of the full ultrapower is the product set V (X)N which consists of all infinite sequences xn = xn : n N of ele- h i h ∈ i ments xn V (X), or equivalently, all functions f : N V (X), via ∈ → the identification of each f with the sequence f(n): n N . The subsets V (X)N V N are defined accordingly. h ∈ i n ⊆ Definition 12.5.5. The set union N N N V (X)bd = Vm(X) $ V (X) m [ is the starting point for the construction of the bounded ultrapower. N N Exercise 12.5.6. Find an element in V (X) r V (X)bd. Let m be a natural number. N Definition 12.5.7. A sequence x V (X)bd is of type m if the h ni ∈ set Dm = n : xn Vm(X) Vm 1(X) belongs to . { ∈ \ − } F 116 12. UNIVERSES, TRANSFER Definition 12.5.8. In the notation as above, a sequence is totally of type m if Dm = N. In case m = 0, we naturally assume that V 1(X) = ∅, so that − D0 = n : xn V0(X) , and xn is totally of type 0 if and only if x X{= V (X∈) for all }n. h i n ∈ 0 N Lemma 12.5.9. Assume that x V (X)bd. Then there is a h ni ∈ unique number m N such that xn is of type m. ∈ h i Proof. By definition we have x V (X)N for some M and h ni ∈ M therefore we have a pairwise disjoint union N = m M Dm. The lemma now follows from the fact that is an ultrafilter. ≤ F S Following Section 5.1, we define an equivalence relation (or more N ∼ precisely as the relation depends on ) on V (X)bd as follows: ∼F F 1 xn yn if and only if n N : xn = yn . h i∼h i { ∈ }∈F N Lemma 12.5.10. Assume that xn V (X)bd is of type m. There N h i ∈ is a sequence yn V (X)bd totally of type m, such that xn yn . h i ∈ h i ∼F h i Proof. If n D then simply let y = x . If n / D then ∈ m n n ∈ m put yn = a, where a is a fixed element in Vm(X) Vm 1(X). (Or a X = V (X) in the case m = 0.) \ − ∈ 0 We continue with the proof of Theorem 12.5.4. The -quotient F N N V (X)bd/ = xn : xn V (X)bd F {h iF h i ∈ } consists of all -classes, or -classes F ∼F N xn = yn V (X)bd : xn yn h iF {h i ∈ h i∼h i} N of sequences x V (X)bd. We define h ni ∈ N N Vm(X) / = xn : xn Vm(X) F {h iF h i ∈ } N N for all m, so that V (X)bd/ = V (X) / . F m m F N Corollary 12.5.11. EachS element of V (X)bd/ has the form N F xn , where xn V (X)bd is a sequence totally of type m for some h iF h i ∈ (unique) m, so that xn Vm Vm 1(X) for all n. ∈ \ − Proof. Use lemmas 12.5.9, 12.5.10. N If xn , yn belong to V (X)bd then we define a binary relation h iF h iF xn ∗ yn if and only if n N : xn yn . h iF ∈h iF { ∈ ∈ }∈F 1In this case there is a set X such that we have x = y , for all n X. ∈F n n ∈ 12.6. LOS THEOREM 117 N Lemma 12.5.12. If xn , yn belong to V (X)bd and we have h iF h iF xn ∗ yn then there is a set K and an index m 1 such h iF ∈ h iF ∈ F ≥ that xn Vm 1(X), yn Vm(X), and xn yn, for all n K. ∈ − ∈ ∈ ∈ Proof. By definition, the set L = n N : xn yn belongs N { ∈ ∈ } to , and, as xn , yn V (X)bd, there is an index M such that x ,yF V (Xh) foriF allh n.iF Consider∈ the sets n n ∈ M Km = n L : xn Vm 1(X) and yn Vm(X) , 1 m M. { ∈ ∈ − ∈ } ≤ ≤ It follows by Corollary 12.1.5 that L = 1 m M Km, therefore at least ≤ ≤ one of Km belongs to , and we can set K = Km. F S If x V (X) then we can let x be the infinite constant sequence ∈ h i x, x, x, . . . = xn , where xn = x forall n. We let x denote its - hclass as above.i h i h iF F 12.6. Los theorem Our main goal in Chapter 12 is the proof of Transfer, an assertion to the effect that, roughly speaking, any sentence true or false in the standard universe remains true, resp., false in the nonstandard uni- verse, Corollary 12.7.1 and ultimately Corollary 12.7.9. This is not a really easy task, in part because such a typical tool of logic as proof by induction on the logical complexity of the sentence considered, does not seem to work. However there is another, and stronger claim which happens to admit such an inductive proof. This is the result below, in fact a cornerstone of the theory of ultraproducts. The theorem con- N nects the truth of sentences related to V (X)bd with their “truth sets” in the ultrafilter . F Definition 12.6.1. The truth set T N of a formula φ is defined by setting ⊆ 1 k Tφ a1 ,..., ak = n N : φ(an,...,an) is true in V (X) . (h ni h ni) ∈ Lemma 12.6.2 (theLo´stheorem) . If φ(x1,...,xk) is an arbitrary 1 k N bounded -formula and sequences an ,..., an belong to V (X)bd, then ∈ 1 k h i h i N the sentence φ( an ,..., an ) is true in V (X)bd/ ; ∗ if and only h iF h iF h F ∈i if the truth set Tφ a1 ,..., ak is a member of . (h ni h ni) F Here we use an upper index in the enumeration of free variables to distinguish it from the enumeration of terms of infinite sequences N in V (X)bd. Example 12.6.3. Let φ be the formula x y. Assume that se- N ∈ quences an , bn belong to V (X)bd. Then the formula an bn h i h i h iF ∈h iF 118 12. UNIVERSES, TRANSFER N is true in V (X)bd/ ; ∗ if and only if an ∗ bn if and only if (by h F ∈i h iF ∈h iF definition) the set T an bn = n N : an bn belongs to . h i∈h i { ∈ ∈ } F Proof. The proof of Lemma 12.6.2 proceeds by induction on the construction of formula φ out of elementary formulas x y by means of logical connectives and quantifiers. Among them, it suffices∈ to consider only (the negation), (conjunction), and (the existence quantifier) since¬ the remaining ones∧ are known to be expressible∃ via combinations of those three. For instance, Φ Ψ is equivalent to (Φ Ψ), and the quantifier is equivalent to ∨. ¬ ∧ The base∀ of induction is¬∃¬ the case when φ is the formula x y, already checked in Example 12.6.3. As we go through the induction∈ steps, we will reduce the arbitrarily long list of free variables x1,...,xk to a minimally nontrivial amount for each step, for the sake of brevity. The step . Let φ(x) be a bounded -formula with a single free ¬ N ∈ variable x, and a V (X)bd. Suppose that the lemma is established h ni ∈ for φ( an ), and let us prove it for φ( an ). h i ¬ h i N Indeed, formula φ( an ) is true in V (X)bd/ ; ∗ if and only ¬ h iNF h F ∈i if φ( an ) is false in V (X)bd/ ; ∗ . The latter holds if and only if h iF h F ∈i (by the inductive hypothesis) Tφ( an ) / or equivalently T φ( an ) h i ∈F ¬ h i ∈F since obviously Tφ = N r T φ, and one and only one set in a pair of complementary sets belongs¬ to (as is an ultrafilter). The step . Here one provesF theF result for φ(a) ψ(a) assuming that the lemma∧ is established separately for φ(a) and∧ψ(a). We leave it as an exercise. The step . Assume that the lemma is established for a bounded ∃ formula φ(x, y, z), and prove it for ( z x) φ(x, y, z). Let an , bn N ∃ ∈ h i h i ∈ V (X)bd. We have to prove that the statement N (A) ( z an ) φ( an , bn , z) is true in V (X)bd/ ; ∗ ∃ ∈h iF h iF h iF h F ∈i is equivalent to (B) the set L = T z an φ( an , bn ,z) belongs to . ∃ ∈h i h i h i F Note that L = n : z an (φ(an, bn, z) is true in V (X)) . By definition,{ (A)∃ is∈ equivalent to } N (C) there is a sequence cn V (X)bd such that both cn ∗ an h i ∈ N h iF ∈h iF and φ( an , bn , cn ) are true in V (X)bd/ ; ∗ . h iF h iF h iF h F ∈i By the inductive hypothesis for φ and the base of induction result, the requirements of (C) hold if and only if each of the sets U = T = n : c a , and ′cn cn an n n h i h i∈h i { ∈ } U = T = n : φ(a , b ,c ) is true in V (X) ′′cn φ( an , bn , cn ) n n n h i h i h i h i { } 12.7. ELEMENTARY EMBEDDING 119 belongs to , which is equivalent (as is an ultrafilter) to U cn , where F F h i ∈F U cn = U ′cn U ′′cn = T cn an φ( an , bn , cn ). h i h i ∩ h i h i∈h i∧ h i h i h i On the other hand, the set L of (B) satisfies U cn L because if n h i ⊆ ∈ U cn then taking z = cn satisfies φ(an, bn, z) in V (X). Therefore if T h theni L as is an ultrafilter. This ends the proof that (A)∈ Fimplies (B).∈ F F N To prove the converse suppose that L . Recall that V (X)bd = V (X)N, hence the sequence a belongs∈F to some V (X). If n L m m h ni m ∈ then by definition there is some z = cn an such that φ(an, bn, z) is true S ∈ in V (X). It follows that m> 0 and cn Vm 1(X), by Corollary 12.1.5 ∈ − If n / L then let cn Vm 1(X) be arbitrary. The construction of ∈ ∈ − N the sequence cn : n N Vm 1(X) uses the axiom of choice, of h ∈ i ∈N − course. Thus c V (X)bd and if n L then n belongs to the h ni ∈ ∈ set U cn = U ′cn U ′′cn as above by construction. It follows that the h i h i ∩ h i sets U , U belong to , and hence we have (C) and (A). ′cn ′′cn h i h i F This completes the proof of theLo´stheorem. 12.7. Elementary embedding Corollary 12.7.1. The map x x is an elementary embed- 7−→ h NiF ding of the structure V (X); to V (X)bd/ ; ∗ . In other words, if hφ(x1,...,x∈i k) ish a boundedF ∈i-formula and we have a1,...,ak V (X), then the sentence φ(a1,...,a∈k) is true in V (X) if ∈ 1 k N and only if the sentence φ( a ,..., a ) is true in V (X)bd/ ; ∗ . h iF h iF h F ∈i j j Proof. Recall that a is the infinite sequence an : n N such j j h i h ∈ i that an = a for all n. Therefore the truth set Tφ( a1 ,..., ak ) is equal h i h i to N if φ(a1,...,ak) is true in V (X) and is equal to ∅ otherwise. Now an application of Lemma 12.6.2 concludes the proof. This result ends the first part of the proof of Theorem 12.5.4. We defined an embedding x x of the universe V (X); to 7−→ h iF h ∈i N V (X)bd/ ; ∗ h F ∈i which satisfies Transfer by Corollary 12.7.1. The final step of the proof N of the theorem is embedding V (X)bd/ ; ∗ into the universe V (∗X), hN F ∈i where ∗X = an : an X , and, we recall, X = V0(X). {h iF h i ∈ } Lemma 12.7.2. Assume that γ is an infinite ordinal and X is a γ-base set. Then ∗X is a (γ + 4)-base set, hence a base set by Lemma 12.2.4. 120 12. UNIVERSES, TRANSFER Proof. By definition, each an ∗X is a non-empty set of h iF ∈ pairwise -equivalent sequences an = an : n N of elements ∼F h i h ∈ i an X. It remains to prove that rank( an ) = γ +4. Note that a sequence∈ of this form is formally equal toh thei set of all ordered pairs n, an = n , n, an : n N , where h i {{{ } { }} ∈ } rank(n)= n, rank(an)= γ +1 (as X is a γ-base set), rank( n )= n + 1, { } rank( n, a )= γ + 2, { n} rank( n , n, a )= γ + 3, {{ } { n}} rank( an : n N )= γ + 4, h ∈ i as required. The lemma shows that V (∗X) is a legitimate superstructure which satisfies all results in sections 12.1 and 12.2. N Definition 12.7.3. Define a map h : V (X)bd/ V (∗X) as N F → follows. Suppose that an V (X)bd/ . It can be assumed, by Corollary 12.5.11, that ha iFis∈ totally of typeF m for some m, so that h ni an Vm(X) Vm 1(X) (or just an X = V0(X) provided m = 0) for ∈ \ − ∈ all n. The definition of h( an ) goes on by induction on m. h iF (1) If m = 0 then we put h( an )= an . h NiF h iF In other words, if an X then h( an )= an ∗X. h i ∈ h iF h iF ∈ (2) If m 1 then we put ≥ N h( an )= h( bn ): bn Vm 1(X) and bn ∗ an . h iF h iF h i ∈ − h iF ∈h iF This type of definitions is called the Mostowski collapse, see Theorem 4.4.9 in [Chang & Keisler 1990]. Exercise 12.7.4. Prove by induction on m that if a V (X)N h ni ∈ m then h( an ) Vm(∗X). h iF ∈ Lemma 12.7.5. The map h is a bijection, and h is an isomorphism N in the sense that if a , b V (X)bd then h ni h ni ∈ an ∗ bn if and only if h( an ) h( bn ) . h iF ∈h iF h iF ∈ h iF Proof. To prove that h is a bijection, suppose that an , bn N h i h i ∈ V (X)bd and an = bn . Then the truth set T = n N : an = bn belongs to hby LemmaiF 6 h 12.6.2.iF It can be assumed, by{ Corollary∈ 12.5.11,6 } F that a , b are totally of type m, resp., m′, for some m, m′. h ni h ni Case 1 : m = m′ = 0, so that a , b X = V (X) for all n. Then n n ∈ 0 h( an ) = an and h( bn ) = bn by construction, and on the h iF h iF h iF h iF other hand, we have an = bn since T . h iF 6 h iF ∈F 12.7. ELEMENTARY EMBEDDING 121 Case 2 : m′ = 0 and m 1, so that b X = V (X) and ≥ n ∈ 0 an Vm(X) Vm 1(X) for all n. Then h( bn ) = bn ∗X while ∈ \ − h iF h iF ∈ h( an ) V (∗X) by construction, so that h( bn ) = h( an ) by Lemmah iF 12.7.2.⊆ (Recall Definition 12.1.4.) h iF 6 h iF Case 3 : m = 0 and m′ 1, similar. ≥ Case 4 : m = m′ 1. Then, for any n, a and b belong to ≥ n n Vm(X) Vm 1(X), therefore an Vm 1(X) and bn Vm 1(X). In \ − ⊆ − ⊆ − addition, if n T then an = bn, hence there exists some cn Vm 1(X) satisfying c ∈a b or c6 b a . It follows that one of∈ the− sets n ∈ n \ n n ∈ n \ n T ′ = n T : c a b , T ′′ = n T : c b a { ∈ n ∈ n \ n} { ∈ n ∈ n \ n} belongs to ; suppose that T ′ . If n / T then let cn Vm 1(X) F ∈ F ∈ ∈ − be arbitrary. Then cn Vm 1(X). Now we have cn ∗ an but h i ∈ − h i ∈ h i cn ∗ bn , hence, by Definition 12.7.3, h( cn ) h( an ) but ¬h i ∈ h i h iF ∈ h iF h( cn ) / h( bn ). Therefore h( an ) = h( bn ). h iF ∈ h iF h iF 6 h iF Case 5 : m, m′ 1 and m = m′, say m′ < m. For any n we ≥ 6 have an Vm(X) Vm 1(X), hence an Vm 1(X). Note that an V (X) is∈ impossible\ for− k < m 1 as⊆ otherwise− a V (X) ⊆ k − n ∈ k+1 ⊆ Vm 1(X), a contradiction. Thus there is an element cn an, cn − ∈ ∈ Vm 1(X) Vm 2(X), in particular, cn Vm 1(X) Vm′ 1(X). Then − \ − ∈ − \ − cn Vm 1(X) and cn ∗ an , hence h( cn ) h( an ). On the h i ∈ − h i ∈ h i h iF ∈ h iF other hand, we have c / b by Corollary 12.1.5, hence c ∗ b n ∈ n ¬h ni ∈ h ni and h( cn ) / h( bn ). Thus still h( an ) = h( bn ). h iF ∈ h iF h iF 6 h iF The claim that h is a bijection is established. Now prove the isomorphism claim. Let an , bn be sequences in N h i h i V (X)bd. As above, it can be assumed that a , b are totally of h ni h ni type m, resp., m′, for some m, m′ N. Suppose that an ∗ bn . ∈ h iF ∈ h iF By definition the set N = n N : an bn belongs to . Then { ∈ ∈ } F an Vm′ 1(X) for all n N by Corollary 12.1.5, and hence we have ∈ − ∈ h( an ) h( bn ) by Definition 12.7.3. h iF ∈ h iF If conversely h( an ) h( bn ) then immediately an ∗ bn still by Definition 12.7.3.h iF ∈ h iF h iF ∈h iF N Definition 12.7.6. Let ∗V (X) = h( an ): an V (X)bd/ , { h iF h i ∈ F} the range of h. If m N then let ∈ N ∗Vm(X)= h( an ): an Vm(X) / . { h iF h i ∈ F} Exercise 12.7.7. Prove, using the results of Lemma 12.7.5, that the sets ∗V (X) satisfy ∗V (X) = ∗V (X) V (∗X), ∗V (X) m m ∩ m m+1 ⊆ (∗V (X)), and finally ∗V (X)= ∗V (X) V (∗X). P m m m ⊆ Thus ∗V (X) is a subuniverse ofS the full universe V (∗X) over ∗X = N ∗V (X). Let’s use letters ξ, η to denote arbitrary elements of V (X)bd/ , 0 F 122 12. UNIVERSES, TRANSFER N so that if ξ V (X)bd/ then h(ξ) ∗V (X). As h is an isomorphism by Lemma 12.7.5,∈ we immediatelyF obtain∈ the following. Corollary 12.7.8. If φ(x1,...,xk) is a bounded -formula and 1 k N 1 k ∈ N ξ ,...,ξ V (X)bd/ , then φ(ξ ,...,ξ ) is true in V (X)bd/ ; ∗ if ∈ F 1 k h F ∈i and only if the sentence φ(h(ξ ),...,h(ξ )) is true in ∗V (X); . h ∈i Now we are ready to accomplish the proof of Theorem 12.5.4. We claim that ∗X, ∗V (X), and the map x ∗x is a nonstandard exten- sion of V (X). The proof is maintained by7−→ combining Corollaries 12.7.1 and 12.7.8. Namely, if x V (X) then let ∗x = h( x ), so that x ∗x ∈ h iF 7−→ maps V (X) onto ∗V (X) V (∗X). Further, Lemma 12.7.5 ensures that condition (I) of Section 12.5⊆ holds. Now let’s validate condition (II) of 12.5. Consider a bounded formula φ(x1,...,xk), and arbitrary elements a1,...,ak V (X). Then the formula φ(a1,...,ak) is true in V (X) if ∈ 1 k N and only if φ( a ,..., a ) is true in V (X)bd/ ; ∗ (by Corollary h iF h iF 1 h k F ∈i 12.7.1). if and only if the formula φ(∗a ,..., ∗a ) is true in ∗V (X) (by Corollary 12.7.8). Note that both V (X) and ∗V (X) are considered as -structures with the true membership relation . ∈ ∈ It remains to prove that X $ ∗X properly, or, to be more precise, there is an element y ∗X not equal to any ∗x, x X. ∈ ∈ Let a0, a1, a2,... be an infinite sequence of pairwise distinct ele- N N ments an X. Then an = an : n N belongs to X = V0(X) . It ∈ h i h ∈ i follows that ξ = h( an ) ∗X. Prove that if x X then ξ = ∗x . By h iF ∈ ∈ 6 definition one has to show that an x , that is, the set N = n : a = x belongs to . However Nh containsi 6∼ h i all natural numbers with{ a n 6 } F possible exception of a single index n satisfying an = x. Hence N . This completes the proof of Theorem 12.5.4. ∈F Rephrasing condition (II) of Section 12.5, we obtain: Corollary 12.7.9 (Transfer). The map x ∗x is an elemen- 7−→ tary -embedding of the structure V (X); to ∗V (X); . In other words,∈ if φ(x1,...,xk) is a boundedh -formula∈i andh a1,...,a∈ik V (X), 1 k ∈ 1 k∈ then φ(a ,...,a ) is true in V (X) if and only if φ(∗a ,..., ∗a ) is true in ∗V (X). 12.8. Definability In Chapter 4 we presented a construction of the hyperreal line via filters and ultrapowers. For a broader picture, it may be useful to comment on the issue of definability. 12.9. CONSERVATIVITY 123 Many people felt that the hyperreal line is not definable since its definition involves nonconstructive foundational material. There- fore it came as a surprise to many people when Kanovei and Shelah proved in 2004 that, in fact, the hyperreal line is indeed definable. See http://www.ams.org/mathscinet-getitem?mr=2039354 and the article itself [Kanovei & Shelah 2004]. What this means is that there exists a specific set-theoretic formula that defines the hyperreal line. This might seem surprising. To clarify the situation, one would compare it with that for Lebesgue measure. It is widely known and accepted as fact that the Lebesgue measure is σ-additive. Two items need to be pointed out here: (1) to define the Lebesgue measure, one does not need the axiom of choice. (2) to prove that the Lebesgue measure is σ-additive, one needs the axiom of choice. Here item (2) follows from the existence of a model of ZF where the real line is a countable union of countable sets. This model is called the Feferman-Levy model [Feferman & Levy 1963]. See also [Jech 1973, chapter 10]. We see that, while the Lebesgue measure can be defined in ZF, a proof of its σ-additivity necessarily requires Choice. With regard to its foundational status, the situation is similar for a hyperreal line. One can define a hyperreal line without Choice (as Kanovei and Shelah did), but actually to prove something about it one would need some version of Choice. 12.9. Conservativity The various frameworks developed by Robinson and since, including those of Nelson and Hrbacek, are conservative extensions of ZFC, and in this sense are part of classical mathematics. As far as conservativity is concerned, the following needs to be kept in mind. Robinson showed that, given a proof exploiting infinitesimals, there always exists an infinitesimal-free paraphrase. However, this is merely a theoretical result. In practice, such a paraphrase may involve an exponential increase in the complexity of the proof, and correspondingly an exponential decrease in its compre- hensibility, at least in principle. From the point of computer scientist one should certainly be able to appreciate the difference. Similarly, one can note that any result using the real numbers can be restated in terms of the rationals using sequences, and even in terms 124 12. UNIVERSES, TRANSFER of the integers, etc. The ancient Greeks used formulations that would be very difficult for us to follow if we did not use modern translations, in terms of extended number systems that have since become widely accepted and are considered intuitive. The related controversy over Unguru’s proposal is well known. The following two additional points shouuld be kept in mind. First, the conservativity property of the pair ZFC, IST does not hold for pairs like An, An∗ , where An is the n-th order arithmetic and An∗ its nonstandard extension, in fact An∗ is rather comparable to the stan- dard An+1. This profound result of Henson–Keisler near 1994 is well- known to NSA people (provide references and exact formulations). Second, IST yields some principally new mathematics with respect to ZFC, in the sense that there are IST sentences not equivalent to any ZFC sentence. This is established in [Kanovei & Reeken 2004]. This can be compared to the obvious fact that any sentence about C admits an equivalent sentence about R. The following comments by Henson and Keisler usefully illustrate the point: It is often asserted in the literature that any theorem which can be proved using nonstandard analysis can also be proved without it. The purpose of this paper is to show that this assertion is wrong, and in fact there are theorems which can be proved with nonstandard analysis but cannot be proved without it. There is currently a great deal of confusion among mathemati- cians because the above assertion can be interpreted in two different ways. First, there is the following correct statement: any theorem which can be proved using nonstandard analysis can be proved in Zermelo- Fraenkel set theory with choice, ZFC, and thus is acceptable by contemporary standards as a theorem in mathematics. Second, there is the erroneous con- clusion drawn by skeptics: any theorem which can be proved using nonstandard analysis can be proved without it, and thus there is no need for nonstandard analysis. The reason for this confusion is that the set of principles which are accepted by current math- ematics, namely ZFC, is much stronger than the set of principles which are actually used in mathematical practice. It has been observed (see [F] and [S]) that al- most all results in classical mathematics use methods 12.9. CONSERVATIVITY 125 available in second order arithmetic with appropriate comprehension and choice axiom schemes. This sug- gests that mathematical practice usually takes place in a conservative extension of some system of second order arithmetic, and that it is difficult to use the higher levels of sets. In this paper we shall consider systems of nonstandard analysis consisting of second order nonstandard arithmetic with saturation princi- ples (which are frequently used in practice in nonstan- dard arguments). We shall prove that nonstandard analysis (i.e. second order nonstandard arithmetic) with the ω1-saturation axiom scheme has the same strength as third order arithmetic. This shows that in principle there are theorems which can be proved with nonstandard analysis but cannot be proved by the usual standard methods. [Henson & Keisler 1986, p. 377] In short, An∗ is comparable to the standard An+1 rather than to An itself. CHAPTER 13 More regularity 13.1. D2 implies D1 Lemma 13.1.1. Let B Rn be an open ball around the origin 0, ⊆ n n let a hal(0), and let 0 = v ∗ R with v 1. If G : ∗B ∗ R is an internal∈ function satisfying6 ∈ ≺≺ → G(x) 2G(x + v)+ G(x +2v) v G(a) G(a + v) − ≺k kk − k h h for all x B, then there is m ∗ N such that a + mv B and ∈ ∈ ∈ G(a) G(a + v) v G(a) G(a + mv) . − ≺k kk − k Proof. Let N = r/ v where 0 < r R is slightly smaller ⌊ k k⌋ ∈ than the radius of B, and for 0 x ∗ R, x ∗ N is the integer part of x. So any m N satisfies≤ that∈ a ⌊+ ⌋mv ∈ hB. Let A = G(a) G(a + v). For≤ 0 j N let x = a +∈jv, then by our − ≤ ≤ j assumption on G we have G(xj) 2G(xj+1)+ G(xj+2) = Cj v A with C 1. Let C be the maximum− of C , ,C , then kCkk k1 j ≺ 0 ··· N ≺ and G(xj) 2G(xj+1) + G(xj+2) C v A for all 0 j N. Given k N− we have ≤ k kk k ≤ ≤ ≤ A G(x ) G(x ) = G(x ) G(x ) G(x ) G(x ) − k − k+1 0 − 1 − k − k+1 k 1 − G(x ) G(x ) G(x ) G(x ) Ck v A . ≤ j − j+1 − j+1 − j+2 ≤ k kk k j=0 X So for any m N, ≤ m 1 − mA G(x ) G(x ) = A G(x ) G(x ) Cm2 v A , − 0 − m − k − k+1 ≤ k kk k k=0 X so mA G(x ) G( x ) = Km2 v A with K 1. It follows − 0 − m k kk k ≺ that m A Km2 v A G(x ) G(x ) , k k − k kk k≤k 0 − m k and so, multiplying by v , we have m v A (1 Km v ) v G(x0) G(x ) . k k k kk k − k k ≤k kk − m k 127 128 13. MORE REGULARITY 1 Now let m = min N, 2K v , then { ⌊ k k ⌋ } m v A /2 v G(x ) G(x ) . k kk k ≤k kk 0 − m k By definition of N and since K 1 we have that m v is appreciable, ≺ k k i.e. not infinitesimal, and so finally A v G(x0) G(xm) , that is, G(a) G(a + v) v G(a) G(a +≺kmv)kk. − k − ≺k kk − k Proposition 13.1.2. If F is D2 for some choice of coordinates in W , then F is D1. Proof. Given a hW , in the given coordinates take some ball B around st(a). Define ∈G(x)= F (x) x, then we must show for any v λ that G(a) G(a + v) λ v (here− v = b a in Definition 10.3.3).≺ If G(a) G(a+−v) λ v≺thenk k we are certainly− done. Otherwise λv G(a) −G(a + v)≺≺, andk onk the other hand G(x) 2G(x + v)+ G(x ≺+ k2v) = −F (x) 2Fk(x + v)+ F (x +2v) λ v 2,− by taking v = w in Definition 10.4.1.− Together we have G(x≺) k2Gk(x + v)+ G(x +2v) − ≺ v G(a) G(a + v) , so by Lemma 13.1.1 there is m ∗ N such kthatkka + mv− hB andk ∈ ∈ G(a) G(a+v) v G(a) G(a+mv) v G(a) + G(a+mv) v λ − ≺k kk − k≤k k k k k k ≺k k since F is a prevector field and so G(x)= F (x) x λ for all x. − ≺ 13.2. Injectivity When speaking about local D1 or D2 prevector fields, whenever needed we will assume, perhaps by passing to a smaller domain, that a constant K as in Propositions 11.3.1, 11.3.2 exists. Corollary 13.2.1. If F is a D1 prevector field then F is injective on hM. Proof. Let a = b hM. If st(a) = st(b) then clearly F (a) = F (b). Otherwise there exists6 ∈ a B including 6 a, b as in Proposition 11.3.1,6 and (2) of that proposition implies F (a) = F (b). 6 We will now show that a D1 prevector field is in fact bijective on hM. We first prove local surjectivity, as follows. n Proposition 13.2.2. Let B1 B2 B3 R be closed balls ⊆ ⊆ ⊆ centered at the origin, of radii r1 < r2 < r3. If F : ∗B2 ∗B3 is a 1 → local D prevector field then F (∗B ) ∗B . 2 ⊇ 1 Proof. Fix 0 < s R smaller than r2 r1 and r3 r2. We will ∈ − − apply transfer to the following fact: For every function f : B2 B3, if f(x) f(y) 2 x y for all x, y B and f(x) x B then f(B ) B . This fact is indeed true since our assumptions 2 2 ⊇ 1 on f imply that it is continuous, and that for every x ∂B2 the straight interval between x and f(x) is included in B B , so ∈f is homotopic 3− 1 |∂B2 in B3 B1 to the inclusion of ∂B2. Now if some p B1 is not in f(B2) then f− is null-homotopic in B p , and so∈ the same is true for |∂B2 3 −{ } the inclusion of ∂B2, a contradiction. Applying transfer we get that for every internal function f : ∗B ∗B , if f(x) f(y) 2 x y 2 → 3 k − k ≤ k − k for all x, y ∗B2 and f(x) x 1 in a corresponding domain for F − , with K′ only slightly larger than K, namely K′ = K/(1 Kλ). − We conclude this section with the following observations. Lemma 13.3.3. Let F,G be D1 prevector fields. If F (a) G(a) for all standard a, then F (b) G(b) for all b, i.e. F G. ≡ ≡ ≡ Proof. Let K be as in Proposition 11.3.1 for both F and G. Let a = st(b), then F (b) G(b) = F (b) F (a) b+a+F (a) G(a)+G(a) G(b) a+b k − k k − − − − − k F (b) F (a) b + a + F (a) G(a) + G(a) G(b) a + b ≤k − − k k − k k − − k Kλ a b + F (a) G(a) + Kλ a b λ. ≤ k − k k − k k − k ≺≺ Recall that Definition 10.2.1, which defines when a prevector field F realizes a classical vector field X, involves only standard points. It follows from Lemma 13.3.3 that if F is D1 then this determines F up to equivalence. Namely, we have the following. Corollary 13.3.4. Let U Rn be open, and X : U Rn a vector field. If F,G are two D1 prevector⊆ fields that realize X then→ F G. In particular, if X is Lipschitz and G is a D1 prevector field that≡ realizes X, then F G, where F is the prevector field obtained from X as in Example 10.1.7.≡ Part 2 500 yrs of infinitesimal math: Stevin to Nelson CHAPTER 14 16th century 14.1. Simon Stevin This section contains a discussion of Simon Stevin, unending dec- imals, and the real numbers. The material in this section is based in [B laszczyk, Katz & Sherry 2013]. Nearly a century before Newton and Leibniz, Simon Stevin (Stev- inus) sought to break with an ancient Greek heritage of working ex- clusively with relations among natural numbers,1 and developed an approach capable of representing both “discrete” number (αριθµ ’ oς´ )2 composed of units (µoναδων´ ) and continuous magnitude (µεγεθoς´ ) of geometric origin.3 According to van der Waerden, “[Stevin’s] general notion of a real number was accepted, tacitly or explicitly, by all later scientists” [Van der Waerden 1985, p. 69]. D. Fearnley-Sander wrote that “the modern concept of real num- ber . . . was essentially achieved by Simon Stevin, around 1600, and was thoroughly assimilated into mathematics in the following two cen- turies” [Fearney-Sander 1979, p. 809]. D. Fowler points out that “Stevin . . . was a thorough-going arithmetizer: he published, in 1585, the first popularization of decimal fractions in the West . . . in 1594, he described an algorithm for finding the decimal expansion of the root of any polynomial, the same algorithm we find later in Cauchy’s proof of the Intermediate Value Theorem” [Fowler 1992, p. 733]. Fowler also emphasizes that important foundational work was yet to be done 1The Greeks counted as numbers only 2, 3, 4,...; thus, 1 was not a number, nor are the fractions: ratios were relations, not numbers. Consequently, Stevin had to spend time arguing that the unit (µoνας´ ) was a number. 2Euclid [Euclid 2007, Book VII, def. 2]. 3Euclid [Euclid 2007, Book V]. See also Aristotle’s Categories, 6.4b, 20-23: “Quantity is either discrete or continuous. [. . . ] Instances of discrete quantities are number and speech; of continuous, lines, surfaces, solids, and besides these, time and place”. 133 134 14. 16TH CENTURY by Dedekind, who proved that the field operations and other arith- metic operations extend from Q to R.4 Meanwhile, Stevin’s decimals stimulated the emergence of power series (see below) and other de- velopments. We will discuss Stevin’s contribution to the Intermediate Value Theorem in Subsection 14.3 below. In 1585, Stevin defined decimals in La Disme as follows: Decimal numbers are a kind of arithmetic based on the idea of the progression of tens, making use of the Arabic numerals in which any number may be written and by which all computations that are met in busi- ness may be performed by integers alone without the aid of fraction. (La Disme, On Decimal Fractions, tr. by V. Sanford, in [Smith 1920, p. 23]). By numbers “met in business” Stevin meant finite decimals,5 and by “computations” he meant addition, subtraction, multiplications, di- vision and extraction of square roots on finite decimals. Stevin argued that numbers, like the more familiar continuous mag- nitudes, can be divided indefinitely, and used a water metaphor to illustrate such an analogy: As to a continuous body of water corresponds a con- tinuous wetness, so to a continuous magnitude corre- sponds a continuous number. Likewise, as the con- tinuous body of water is subject to the same division and separation as the water, so the continuous num- ber is subject to the same division and separation as its magnitude, in such a way that these two quantities cannot be distinguished by continuity and discontinu- ity [Stevin 1585, p. 3]; quoted in [Malet 2006]. Stevin argued for equal rights in his system for rational and irrational numbers. He was critical of the complications in Euclid [Euclid 2007, Book X], and was able to show that adopting the arithmetic as a way of dealing with those theorems made many of them easy to understand and easy to prove. In his Arithmetique [Stevin 1585], Stevin proposed to represent all numbers systematically in decimal notation. P. Ehrlich notes that Stevin’s 4Namely, there is no easy way from representation of reals by decimals, to the field of reals, just as there is no easy way from continuous fractions, another well-known representation of reals, to operations on such fractions. 5But see Stevin’s comments on extending the process ad infinitum in main text at footnote 11. 14.2. DECIMALS FROM STEVIN TO NEWTON AND EULER 135 viewpoint soon led to, and was implicit in, the an- alytic geometry of Ren´eDescartes (1596-1650), and was made explicit by John Wallis (1616-1703) and Isaac Newton (1643-1727) in their arithmetizations thereof [Ehrlich 2005, p. 494]. 14.2. Decimals from Stevin to Newton and Euler Stevin’s text La Disme on decimal notation was translated into English in 1608 by Robert Norton (cf. Cajori [Cajori 1928, p. 314]). The translation contains the first occurrence of the word “decimal” in English; the word will be employed by Newton in a crucial passage 63 years later.6 Wallis recognized the importance of unending decimal expressions in the following terms: Now though the Proportion cannot be accurately ex- pressed in absolute Numbers: Yet by continued Ap- proximation it may; so as to approach nearer to it, than any difference assignable (Wallis’s Algebra, p. 317, cited in [Crossley 1987]). Similarly, Newton exploits a power series expansion to calculate de- tailed decimal approximations to log(1 + x) for x = 0.1, 0.2,... (Newton [1]).7 By the time of Newton’s annus mirabilis,± the idea± of un- ending decimal representation was well established. Historian V. Katz calls attention to “Newton’s analogy of power series to infinite decimal expansions of numbers” [Katz, V. 1993, p. 245]. Newton expressed such an analogy in the following passage: Since the operations of computing in numbers and with variables are closely similar . . . I am amazed that it has occurred to no one (if you except N. Mer- cator with his quadrature of the hyperbola) to fit the doctrine recently established for decimal numbers in similar fashion to variables, especially since the way is then open to more striking consequences. For 6See main text at footnote 9. 7Newton scholar N. Guicciardini kindly provided a jpg of a page from New- ton’s manuscript containing such calculations, and commented as follows: “It was probably written in Autumn 1665 (see Mathematical papers 1, p. 134). White- side’s dating is sometimes too precise, but in any case it is a manuscript that was certainly written in the mid 1660s when Newton began annotating Wallis’s Arith- metica Infinitorum”[Guicciardini 2011]. The page from Newton’s manuscript can be viewed at http://u.math.biu.ac.il/~katzmik/newton.html 136 14. 16TH CENTURY since this doctrine in species has the same relation- ship to Algebra that the doctrine in decimal numbers has to common Arithmetic, its operations of Addi- tion, Subtraction, Multiplication, Division and Root- extraction may easily be learnt from the latter’s pro- vided the reader be skilled in each, both Arithmetic and Algebra, and appreciate the correspondence be- tween decimal numbers and algebraic terms continued to infinity . . . And just as the advantage of decimals consists in this, that when all fractions and roots have been reduced to them they take on in a certain mea- sure the nature of integers, so it is the advantage of infinite variable-sequences that classes of more com- plicated terms (such as fractions whose denominators are complex quantities, the roots of complex quanti- ties and the roots of affected equations) may be re- duced to the class of simple ones: that is, to infinite series of fractions having simple numerators and de- nominators and without the all but insuperable en- cumbrances which beset the others [Newton 1671].8 In this remarkable passage dating from 1671, Newton explicitly names infinite decimals as the source of inspiration for the new idea of infi- nite series.9 The passage shows that Newton had an adequate number system for doing calculus and real analysis two centuries before the triumvirate.10 In 1742, John Marsh first used an abbreviated notation for repeat- ing decimals [Marsh 1742, p. 5]; cf. [Cajori 1928, p. 335]. Euler ex- ploits unending decimals in his Elements of Algebra in 1765, as when he sums an infinite series and concludes that 9.999 ... = 10 [Euler 1770, p. 170]. 14.3. Stevin’s cubic and the IVT In the context of his decimal representation, Stevin developed nu- merical methods for finding roots of polynomials equations. He de- scribed an algorithm equivalent to finding zeros of polynomials (see Crossley [Crossley 1987, p. 96]). This occurs in a corollary to prob- lem 77 (more precisely, LXXVII) in (Stevin [Stevin 1625, p. 353]). 8We are grateful to V. Katz for signaling this passage. 9See footnote 6. 10The expression “the great triumvirate” is used by Boyer in [Boyer 1949, p. 298] to describe Cantor, Dedekind, and Weierstrass. 14.3. STEVIN’S CUBIC AND THE IVT 137 Here Stevin describes such an argument in the context of finding a root of the cubic equation (which he expresses as a proportion to con- form to the contemporary custom) x3 = 300x + 33915024. Here the whimsical coefficient seems to have been chosen to emphasize the fact that the method is completely general; Stevin notes further- more that numerous addional examples can be given. A textual dis- cussion of the method may be found in Struik [Stevin 1958, p. 476]. Centuries later, Cauchy would prove the Intermediate Value The- orem (IVT) for a continuous function f on an interval I by a divide- and-conquer algorithm. Cauchy subdivided I into m equal subinter- vals, and recursively picked a subinterval where the values of f have opposite signs at the endpoints [Cauchy 1821, Note III, p. 462]. To elaborate on Stevin’s argument following [Stevin 1958, 10, p. 475- 476], note that what Stevin similarly described a divide-and-conquer§ algorithm. Stevin subdivides the interval into ten equal parts, result- ing in a gain of a new decimal digit of the solution at every iteration of his procedure. Stevin explicitly speaks of continuing the iteration ad infinitum:11 Et procedant ainsi infiniment, l’on approche infini- ment plus pres au requis [Stevin 1625, p. 353, last 3 lines]. Arguably, the abstract existence proofs for the real numbers are not needed here since Stevin gives an algorithm for producing an explicit decimal representation of the solution. The IVT for polynomials would resurface in Lagrange before being generalized by Cauchy to the newly introduced class of continuous functions. One frequently hears sentiments to the effect that mathematicians before 1870 did not and could not have provided rigorous proofs, since the real number system was not even built yet, but such an attitude may be anachronistic. It overemphasizes the significance of the tri- umvirate project in an inappropriate context. Stevin is concerned with constructing an algorithm, whereas Cantor is concerned with develop- ing a foundational framework based upon the existence of the totality of the real numbers, as well as their power sets, etc. The latter project involves a number of non-constructive ingredients, including the axiom of infinity and the law of excluded middle. But none of it is needed for Stevin’s procedure, because he is not seeking to re-define “number” 11Cf. footnote 5. 138 14. 16TH CENTURY in terms of alternative (supposedly less troublesome) mathematical ob- jects. Many historians emphasize the approaches developed by Cantor, Dedekind, and Weierstrass to proofs of the existence of real numbers. Sometimes this is done at the expense, and almost to the exclusion, of Stevin’s approach. Stevin’s contribution to the construction of the real numbers deserves more explicit acknowledgment. CHAPTER 15 17th century 15.1. Pierre de Fermat 15.2. James Gregory 15.3. Gottfried Wilhelm von Leibniz 139 CHAPTER 16 18th century 16.1. Leonhard Euler 141 CHAPTER 17 19th century 17.1. Augustin-Louis Cauchy 143 CHAPTER 18 20th century 18.1. Thoralf Skolem 18.2. Edwin Hewitt 18.3. JerzyLo´s 18.4. Abraham Robinson 18.5. Edward Nelson 145 Bibliography [Albeverio et al. 2009] Albeverio, S.; Fenstad, J.; Hoegh-Krohn, R.; Lindstrom, T. Nonstandard Methods in Stochastic Analysis and Mathematical Physics. Academic Press 1986, Dover 2009. [Almeida, Neves & Stroyan 2014] Almeida, R.; Neves, V.; Stroyan, K. “Infinites- imal differential geometry: cusps and envelopes.” Differ. Geom. Dyn. Syst. 16 (2014), 1–13. [Bair et al. 2013] Bair, J.; Blaszczyk, P.; Ely, R.; Henry, V.; Kanovei, V.; Katz, K.; Katz, M.; Kutateladze, S.; McGaffey, T.; Schaps, D.; Sherry, D.; Shnider, S. “Is mathematical history written by the victors?” Notices of the American Mathematical Society 60 (2013) no. 7, 886-904. See http://www.ams.org/notices/201307/rnoti-p886.pdf and http://arxiv.org/abs/1306.5973 [Benoit 1997] Benoit, E. “Applications of Nonstandard Analysis in Ordinary Dif- ferential Equations.” Nonstandard Analysis, Theory and Applications. Editors: Leif O. Arkeryd, Nigel J. Cutland, C. Ward Henson NATO ASI Series Volume 493, 1997, pp 153-182. [Blaszczyk, Katz & Sherry 2013] Blaszczyk, P.; Katz, M.; Sherry, D. “Ten miscon- ceptions from the history of analysis and their debunking.” Foundations of Science, 18 (2013), no. 1, 43-74. See http://dx.doi.org/10.1007/s10699-012-9285-8 and http://arxiv.org/abs/1202.4153 [Boothby 1986] Boothby, W. An introduction to differentiable manifolds and Rie- mannian geometry. Second edition. Pure and Applied Mathematics, 120. Academic Press, Inc., Orlando, FL, 1986. [Borovik & Katz 2012] Borovik, A., Katz, M. “Who gave you the Cauchy– Weierstrass tale? The dual history of rigorous calculus.” Foundations of Science 17, no. 3, 245-276. See http://dx.doi.org/10.1007/s10699-011-9235-x [Boyer 1949] Boyer, C. The concepts of the calculus. Hafner Publishing Company. [Bradley & Sandifer 2009] Bradley, R., Sandifer, C. Cauchy’s Cours d’analyse. An annotated translation. Sources and Studies in the History of Mathematics and Physical Sciences. Springer, New York. [Cajori 1928] Cajori. A history of mathematical notations, Volumes 1-2. The Open Court Publishing Company, La Salle, Illinois, 1928/1929. Reprinted by Dover, 1993. [Cauchy 1821] Cauchy, A. L. Cours d’Analyse de L’Ecole Royale Polytechnique. Premi`ere Partie. Analyse alg´ebrique (Paris: Imprim´erie Royale, 1821). 147 148 BIBLIOGRAPHY [Chang & Keisler 1990] Chang, C.; Keisler, H.J. Model Theory. Studies in Logic and the Foundations of Mathematics (3rd ed.), Elsevier (1990) [Origi- nally published in 1973]. [Crossley 1987] Crossley, J. The emergence of number. Second edition. World Sci- entific Publishing Co., Singapore. [Davis 1977] Davis, M.: Applied nonstandard analysis. Pure and Applied Math- ematics. Wiley-Interscience [John Wiley & Sons], New York-London- Sydney, 1977. Reprinted by Dover, NY, 2005, see http://store.doverpublications.com/0486442292.html [Do79] Dombrowski, P.: 150 years after Gauss’ “Disquisitiones generales circa superficies curvas”. With the original text of Gauss. Ast´erisque, 62. Soci´et´eMath´ematique de France, Paris, 1979. [Ehrlich 2005] Ehrlich, P. “Continuity.” In Encyclopedia of philosophy, 2nd edition (Donald M. Borchert, editor), Macmillan Reference USA, Farmington Hills, MI, 2005 (The online version contains some slight improvements), pp. 489–517. [Euclid 2007] Euclid. Euclid’s Elements of Geometry, edited, and provided with a modern English translation, by Richard Fitzpatrick, 2007. See http://farside.ph.utexas.edu/euclid.html [Euler 1770] Euler, L. Elements of Algebra (3rd English edition). John Hewlett and Francis Horner, English translators. Orme Longman, 1822. [Feferman & Levy 1963] Feferman, S.; Levy, A. “Independence results in set theory by Cohen’s method II.” Notices Amer. Math. Soc. 10 (1963). [Fowler 1992] Fowler, D. “Dedekind’s theorem: √2 √3 = √6.” American Math- ematical Monthly 99, no. 8, 725–733. × [Goldblatt 1998] Goldblatt, R. Lectures on the hyperreals. An introduction to non- standard analysis. Graduate Texts in Mathematics, 188. Springer-Verlag, New York. [Gordon et al. 2002] Gordon, E.; Kusraev, A.; Kutateladze, S. Infinitesimal analy- sis. Updated and revised translation of the 2001 Russian original. Trans- lated by Kutateladze. Mathematics and its Applications, 544. Kluwer Academic Publishers, Dordrecht, 2002. [Grobman 1959] Grobman, D. “Homeomorphisms of systems of differential equa- tions.” Doklady Akademii Nauk SSSR 128, 880–881. [Guicciardini 2011] Guicciardini, N. Private communication, 27 november 2011. [Hartman 1960] Hartman, P. “A lemma in the theory of structural stability of differential equations.” Proc. A.M.S. 11 (4), 610–620. [Henson & Keisler 1986] Henson, C. W.; Keisler, H. J. “On the strength of non- standard analysis.” J. Symbolic Logic 51 (1986), no. 2, 377–386. [Herzberg 2013] Herzberg, F. Stochastic Calculus with Infinitesimals. Lecture Notes in Mathematics Volume 2067, Springer, 2013 [Hewitt 1948] Hewitt, E. “Rings of real-valued continuous functions. I.” Trans. Amer. Math. Soc. 64 (1948), 45–99. [Jech 1973] Jech, T. The axiom of choice. Studies in Logic and the Foundations of Mathematics, Vol. 75. North-Holland Publishing Co., Amsterdam- London; Amercan Elsevier Publishing Co., Inc., New York. BIBLIOGRAPHY 149 [Kanovei, Katz & Nowik 2016] Kanovei, V.; Katz, K.; Katz, M.; Nowik, T. “Small oscillations of the pendulum, Euler’s method, and adequality.” Quantum Studies: Mathematics and Foundations, to appear. See http://dx.doi.org/10.1007/s40509-016-0074-x and http://arxiv.org/abs/1604.06663 [Kanovei & Reeken 2004] Kanovei, V.; Reeken, M. Nonstandard analysis, axiomat- ically. Springer Monographs in Mathematics, Berlin: Springer. [Kanovei & Shelah 2004] Kanovei, V.; Shelah, S. A definable nonstandard model of the reals.” J. Symbolic Logic 69 (2004), no. 1, 159–164. [Katz & Sherry 2012] Katz, M.; Sherry, D. “Leibniz’s laws of continuity and ho- mogeneity.” Notices of the American Mathematical Society 59 (2012), no. 11, 1550-1558. See http://www.ams.org/notices/201211/rtx121101550p.pdf and http://arxiv.org/abs/1211.7188 [Katz & Sherry 2013] Katz, M., Sherry, D. “Leibniz’s infinitesimals: Their fiction- ality, their modern implementations, and their foes from Berkeley to Russell and beyond.” Erkenntnis 78, no. 3, 571–625. See http://dx.doi.org/10.1007/s10670-012-9370-y and http://arxiv.org/abs/1205.0174 [Katz, V. 1993] Katz, V. J. “Using the history of calculus to teach calculus.” Sci- ence & Education. Contributions from History, Philosophy and Sociology of Science and Education 2, no. 3, 243–249. [Keisler 1986] Keisler, H. J.: Elementary Calculus: An Infinitesimal Approach. Second Edition. Prindle, Weber & Schimidt, Boston, 1986. See online version at http://www.math.wisc.edu/ keisler/calc.html [Keisler 1976] H. J. Keisler. Foundations of Infinitesimal∼ Calculus. On-line Edition, 2007, at the author’s web page: http://www.math.wisc.edu/ keisler/foundations.html (Originally published by Prindle,∼ Weber & Schmidt 1976.) [Klein 1908] Klein, F. Elementary Mathematics from an Advanced Standpoint. Vol. I. Arithmetic, Algebra, Analysis. Translation by E. R. Hedrick and C. A. Noble [Macmillan, New York, 1932] from the third German edition [Springer, Berlin, 1924]. Originally published as Elementarmathematik vom h¨oheren Standpunkte aus (Leipzig, 1908). [Leibniz 1989] Leibniz, G. W.: La naissance du calcul diff´erentiel. (French) [The birth of differential calculus] 26 articles des Acta Eruditorum. [26 ar- ticles of the Acta Eruditorum] Translated from the Latin and with an introduction and notes by Marc Parmentier. With a preface by Michel Serres. Mathesis. Librairie Philosophique J. Vrin, Paris, 1989. [Lo´s1955] Lo´s, J. “Quelques remarques, th´eor`emes et probl`emes sur les classes d´efi- nissables d’alg`ebres.” In Mathematical interpretation of formal systems, 98–113, North-Holland Publishing Co., Amsterdam, 1955. [Lutz & Goze 1981] Lutz, R.; Goze, M. Nonstandard Analysis, a Practical Guide with Applications. Lecture Notes in Mathematics 881, Springer-Verlag, 1981. [Malet 2006] Malet, A. “Renaissance notions of number and magnitude.” Historia Mathematica 33, no. 1, 63–81. [Marsh 1742] Marsh, J. Decimal arithmetic made perfect. London. 150 BIBLIOGRAPHY [Newton 1671] Newton, I.: A Treatise on the Methods of Series and Fluxions, 1671. In Whiteside [4] (vol. III), pp. 33–35. [1] Newton, I.: Cambridge UL Add. 3958.4, f. 78v [2] Newton, I. 1946. Sir Isaac Newton’s mathematical principles of natural philosophy and his system of the world, a revision by F. Cajori of A. Motte’s 1729 translation. Berkeley: Univ. of California Press. [3] Newton, I. 1999. The Principia: Mathematical principles of natural phi- losophy, translated by I. B. Cohen & A. Whitman, preceded by A guide to Newton’s Principia by I. B. Cohen. Berkeley: Univ. of California Press. [Nowik & Katz 2015] Nowik, T., Katz, M. “Differential geometry via infinitesimal displacements.” Journal of Logic and Analysis 7:5 (2015), 1–44. See http://www.logicandanalysis.org/index.php/jla/article/view/237/106 and http://arxiv.org/abs/1405.0984 [Robinson 1966] Robinson, A. Non-standard analysis. North-Holland Publishing Co., Amsterdam 1966. [Rudin 1976] Rudin, W. Principles of mathematical analysis. Third edition. Inter- national Series in Pure and Applied Mathematics. McGraw-Hill Book Co., New York-Auckland-D¨usseldorf, 1976. [Fearney-Sander 1979] Fearnley-Sander, D. “Hermann Grassmann and the creation of linear algebra.” Amer. Math. Monthly 86, no. 10, 809–817. [Sherry & Katz 2014] Sherry, D., Katz, M. “Infinitesimals, imaginaries, ideals, and fictions.” Studia Leibnitiana 44 (2012), no. 2, 166–192 (the article ap- peared in 2014 even though the year given by the journal is 2012). See http://arxiv.org/abs/1304.2137 [Smith 1920] Smith, D. Source Book in Mathematics. Mc Grow-Hill, New York. [Spivak 1979] Spivak, M. A comprehensive introduction to differential geometry. Vol. 4. Second edition. Publish or Perish, Inc., Wilmington, Del., 1979. [Stevin 1585] Stevin, S. (1585) L’Arithmetique. In Albert Girard (Ed.), Les Oeu- vres Mathematiques de Simon Stevin (Leyde, 1634), part I, p. 1–101. [Stevin 1625] Stevin, S. L’Arithmetique. In Albert Girard (Ed.), 1625, part II. Online at http://www.archive.org/stream/larithmetiqvedes00stev#page/353/mode/1up [Stevin 1958] Stevin, S. The principal works of Simon Stevin. Vols. IIA, IIB: Math- ematics. Edited by D. J. Struik. C. V. Swets & Zeitlinger, Amsterdam 1958. Vol. IIA: v+pp. 1–455 (1 plate). Vol. IIB: 1958 iv+pp. 459–976. [Stroyan 1977] Stroyan, K. “Infinitesimal analysis of curves and surfaces.” In Hand- book of Mathematical Logic, J. Barwise, ed. North-Holland Publishing Company, 1977. [Stroyan 2015] Stroyan, K. Advanced Calculus using Mathematica: NoteBook Edi- tion. [Stroyan & Luxemburg 1976] Stroyan, K.; Luxemburg, W. Introduction to the The- ory of Infinitesimals. Academic Press, 1976. [Sun 2000] Sun, Y. “Nonstandard Analysis in Mathematical Economics.” In: Non- standard Analysis for the Working Mathematician, Peter A. Loeb and Manfred Wolff, eds, Mathematics and Its Applications, Volume 510, 2000, pp 261-305. BIBLIOGRAPHY 151 [Tao 2014] Tao, T. Hilbert’s fifth problem and related topics. Graduate Studies in Mathematics, 153. American Mathematical Society, Providence, RI, 2014. [Tao & Vu 2016] Tao, T.; Van Vu, V. “Sum-avoiding sets in groups.” See http://arxiv.org/abs/1603.03068 [Tarski 1930] Tarski, A. “Une contribution `ala th´eorie de la mesure.” Fundamenta Mathematicae 15, 42–50. [Van den Berg 1987] Van den Berg, I. Nonstandard asymptotic analysis. Lecture Notes in Mathematics 1249, Springer 1987. [Van der Waerden 1985] Van der Waerden, B. L. A history of algebra. From al- Khwarizmi to Emmy Noether. Springer-Verlag, Berlin. [We55] Weatherburn, C. E.: Differential geometry of three dimensions. Vol. 1. Cambridge University Press, 1955. [4] Whiteside, D. T. (Ed.): The mathematical papers of Isaac Newton. Vol. III: 1670–1673. Edited by D. T. Whiteside, with the assistance in publica- tion of M. A. Hoskin and A. Prag, Cambridge University Press, London- New York, 1969. [Zygmund 1945] Zygmund, A. Smooth functions. Duke Math. J. 12. 47–76. CHAPTER 19 Theorema Egregium 19.1. Preliminaries to the Theorema Egregium of Gauss In this section and the two following we will give a proof of Gauss’s theorem to the effect that the Gaussian curvature function of a surface is an intrinsic invariant of a surface. We denote the coordinates in R2 by (u1,u2). In R3, let , be the 2 3 h i Euclidean inner product. Let x = x(u1,u2): R R be a regular parametrized surface. Here “regular” means that→ the vector valued function x has Jacobian of rank 2 at every point. Consider partial ∂ derivatives xi = ∂ui (x), where i =1, 2. Thus, vectors x1 and x2 form a basis for the tangent plane to the surface at every point. Similarly, let ∂2x x = ij ∂ui∂uj be the second partial derivatives. Let n = n(u1,u2) be a unit normal to the surface at the point x(u1,u2), so that n, x = 0. h ii k Definition 19.1.1 (symbols gij, Lij, Γij). The first fundamental form (gij) at a point of the surface is given in coordinates by gij = xi, xj . The second fundamental form (Lij) is given in coordinates h i k by Lij = n, xij . The symbols Γij are uniquely defined by the decom- position h i k xij =Γijxk + Lijn 1 2 (19.1.1) =Γijx1 +Γijx2 + Lijn, where the repeated (upper and lower) index k implies summation, in accordance with the Einstein summation convention. i i Definition 19.1.2 (symbols L j). The Weingarten map (L j) is an endomorphism of the tangent plane, namely R x1 R x2. It is uniquely ∂ ⊕ defined by the decomposition of nj = ∂uj (n) as follows: i nj = L jxi 1 2 = L jx1 + L jx2. Note the relation L = Lk g . ij − j ki 153 154 19. THEOREMA EGREGIUM Definition 19.1.3. We will denote by square brackets [ ] the anti- symmetrisation over the pair of indices found in between the brackets, e.g. a = 1 (a a ). [ij] 2 ij − ji k Note that g[ij] = 0, L[ij] = 0, and Γ[ij] = 0. We will use the k k notation Γij;ℓ for the ℓ-th partial derivative of the symbol Γij. Proposition 19.1.4. We have the following relation Γq +Γm Γq = L Lq (19.1.2) i[j;k] i[j k]m − i[j k] for each set of indices i, j, k, q (with, as usual, an implied summation over the index m). Proof. The formula will result by antisymmetrizing a formula for the third partial. Let us calculate the tangential component, with respect to the ba- sis x1, x2, n , of the third partial derivative xijk. Recall that nk = p { } ℓ L kxp and xjk =Γjkxℓ + Likn. Thus, we have (x ) = Γmx + L n ij k ij m ij k m m =Γ ij;kxm +Γij xmk + Lijnk + Lij;kn m m p p =Γij;kxm +Γij Γmkxp + Lmkn + Lij (L kxp)+ Lij;kn. Grouping together the tangential terms, we obtain m m p p (xij)k =Γij;kxm +Γij Γmkxp + Lij (L kxp)+(...)n q m q q = Γij;k +Γij Γmk + LijL k xq +(...)n q m q q = Γij;k +Γij Γkm + LijL k xq +(...)n, q since the symbols Γkm are symmetric in the two subscripts. Now the symmetry in j, k (equality of mixed partials) implies the following iden- tity: (xi[j)k] =0. Therefore 0=(xi[j)k] q m q q = Γi[j;k] +Γi[jΓk]m + Li[jL k] xq +(...)n, q m q q and therefore Γi[j;k] +Γi[jΓk]m + Li[jL k] = 0 for each q =1, 2. 19.2. THE THEOREMA EGREGIUM OF GAUSS 155 19.2. The theorema egregium of Gauss Definition 19.2.1. The Gaussian curvature function K = K(u1,u2) is the determinant of the Weingarten map: i 1 2 K = det(L j)=2L [1L 2]. (19.2.1) Theorem 19.2.2. We have the following three equivalent formulas for Gaussian curvature K: i 1 2 (a) K = det(L j)=2L [1L 2]; (b) K = det(Lij ) ; det(gij ) (c) K = 2 L L2 . − g11 1[1 2] Proof. The first formula is our definition of K. To prove the sec- k ond formula, we use the relation Lij = L jgki. By the multiplicativity of determinant with respect to matrix− multiplication, we have i 2 det(Lij) det(L j)=( 1) . − det(gij) Let us now prove formula (c). The proof is a calculation. Note that by definition, 2L L2 = L L2 L L2 . Hence 1[1 2] 11 2 − 12 1 2L L2 = L1 g L2 g L2 L1 g L2 g L2 − 1[1 2] 1 11 − 1 21 2 − 2 11 − 2 21 1 = g L1 L2 g L2 L2 g L1 L2 + g L2 L2 11 1 2 − 21 1 2 − 11 2 1 21 2 1 = g L1 L2 L1 L2 11 1 2 − 2 1 i = g11det( L j)= g11K. (19.2.2) k Note that we can either calculate using the formula Lij = L jgki, k − or the formula Lij = L igkj. Only the former one leads to the appro- priate cancellations as− in (19.2.2). Theorem 19.2.3 (Theorema egregium). The Gaussian curvature function K = K(u1,u2) can be expressed in terms of the coefficients of the first fundamental form alone (and their first and second derivatives) as follows: 2 2 j 2 K = Γ1[1;2] +Γ1[1Γ2]j , (19.2.3) g11 k where the symbols Γij can be expressed in terms of the derivatives of gij be the formula Γk = 1 (g g + g )gℓk, where (gij) is the inverse ij 2 iℓ;j − ij;ℓ jℓ;i matrix of (gij). This theorem will be proved in Section 19.3. 156 19. THEOREMA EGREGIUM Weingarten Gaussian cur- mean cur- i map (L j) vature K vature H 0 0 plane 0 0 0 0 1 0 cylinder 0 1 0 0 2 invariance yes no of curvature Table 19.3.1. Plane and cylinder have the same intrinsic geometry (K), but different extrinsic geometries (H) 19.3. Intrinsic versus extrinsic Theorem 19.3.1. Unlike Gaussian curvature K, the mean curva- ture 1 i H = 2 L i = tr(W ) cannot be expressed in terms of the gij and their derivatives. Indeed, the plane and the cylinder have parametrisations with iden- tical gij, but with different mean curvature, cf. Table 19.3.1. To sum- marize, Gaussian curvature K is an intrinsic invariant, while mean 1 i curvature H = 2 L i is an extrinsic invariant, of the surface. Proof of theorema egregium. We present a streamlined version of do Carmo’s proof [?, p. 233]. The proof is in 3 steps. (1) We express the third partial derivative xijk in terms of the Γ’s (intrinsic information) and the L’s (extrinsic information). (2) The equality of mixed partials yields an identification of an appropriate expression in terms of the Γ’s, with a certain com- bination of the L’s. (3) The combination of the L’s is expressed in terms of Gaussian curvature. The first two steps were carried out in Proposition 19.1.4. Choosing the valus i = j = 1 and k = q = 2 for the indices in equation (19.1.2), we obtain Γ2 +Γm Γ2 = L L2 1[1;2] 1[1 2]m − 1[1 2] i 2 = g1iL [1L 2] 1 2 = g11L [1L 2] 19.4. GAUSSIAN CURVATURE AND LAPLACE-BELTRAMI OPERATOR 157 1 2 Γij j =1 j =2 Γij j =1 j =2 i =1 µ1 µ2 i =1 µ2 µ1 i =2 µ µ i =2 µ− µ 2 − 1 1 2 k 2µ(u1,u2) Table 19.4.1. Symbols Γij of a metric e δij 2 2 since the term L [1L 2] = 0 vanishes. Applying Theorem 19.2.2(c), we obtain the desired formula for K and complete the proof of the theorema egregium. 19.4. Gaussian curvature and Laplace-Beltrami operator In this section we will complement the material of section 19.2. A (u1,u2) chart in which the metric becomes conformal (see Defi- nition 20.8.1) to the standard flat metric, is referred to as isothermal coordinates. The existence of the latter is proved in [?]. Recall that a metric λ(x, y)(dx2 + dy2), where u1 = x and u2 = y, has metric coefficients g11 = g22 = λ whereas g12 = 0. Definition 19.4.1. The Laplace-Beltrami operator for a metric g = λ(x, y)(dx2 + dy2), where u1 = x and u2 = y, in isothermal coordinates is 1 ∂2 ∂2 ∆ = + . LB λ ∂(u1)2 ∂(u2)2 The notation means that when we apply the operator ∆LB to a function f = f(u1,u2), we obtain 1 ∂2f ∂2f ∆ (f)= + . LB λ ∂(u1)2 ∂(u2)2 In more readable form for f = f(x, y), we have 1 ∂2f ∂2f ∆ (f)= + . LB λ ∂x2 ∂y2 Theorem 19.4.2. Given a metric in isothermal coordinates with 1 2 metric coefficients gij = λ(u ,u )δij, its Gaussian curvature is minus half the Laplace-Beltrami operator applied to the log of the conformal factor λ: 1 K = ∆ log λ. (19.4.1) −2 LB 1 Here log x is the natural logarithm so that (log x)′ = x . 158 19. THEOREMA EGREGIUM Proof. Let λ = e2µ. We have from Table 19.4.1: 2Γ2 =Γ2 Γ2 = µ µ . 1[1;2] 11;2 − 12;1 − 22 − 11 It remains to verify that the ΓΓ term in the expression (19.2.3) for the Gaussian curvature vanishes: j 2 1 2 2 2 2Γ1[1Γ2]j = 2Γ1[1Γ2]1 + 2Γ1[1Γ2]2 =Γ1 Γ2 Γ1 Γ2 +Γ2 Γ2 Γ2 Γ2 11 21 − 12 11 11 22 − 12 12 = µ µ µ ( µ )+( µ )µ µ µ 1 1 − 2 − 2 − 2 2 − 1 1 =0. Then from formula (19.2.3) we have 2 1 K = Γ2 = (µ + µ )= ∆ µ. λ 1[1;2] −λ 11 22 − LB Meanwhile, ∆LB log λ = ∆LB(2µ)=2∆LB(µ), proving the result. Setting λ = f 2, we restate the theorem as follows. Theorem 19.4.3. Given a metric in isothermal coordinates with 2 1 2 metric coefficients gij = f (u ,u )δij, its Gaussian curvature is minus the Laplace-Beltrami operator of the log of the conformal factor f: K = ∆ log f. (19.4.2) − LB Either one of the formulas (19.2.3), (19.4.1), or (19.4.2) can serve as the intrinsic definition of Gaussian curvature, replacing the extrinsic definition (19.2.1), cf. Remark 34.7.1. 19.5. Duality in linear algebra Let V be a real vector space. We will assume all vector spaces to be finite dimensional unless stated otherwise. Example 19.5.1. Euclidean space Rn is a real vector space of di- mension n. Example 19.5.2. The tangent plane TpM of a regular surface M at a point p M is a real vector space of dimension 2. ∈ Definition 19.5.3. A linear form, also called 1-form, φ on V is a linear functional φ : V R . → Definition 19.5.4. The dual space of V , denoted V ∗, is the space of all linear forms on V : V ∗ = φ : V R φ is a 1-form on V . { → | } 19.6. POLAR, CYLINDRICAL, AND SPHERICAL COORDINATES 159 Evaluating φ at an element x V produces a scalar φ(x) R. ∈ ∈ Definition 19.5.5. The natural pairing between V and V ∗ is a linear map , : V V ∗ R, h i × → defined by setting x, y = y(x), for all x V and y V ∗. h i ∈ ∈ If V admits a basis of vectors (xi), then V ∗ admits a unique basis, called the dual basis (yj), satisfying x ,y = δ , (19.5.1) h i ji ij for all i, j =1,...,n, where δij is the Kronecker delta function. ∂ ∂ Example 19.5.6. The vectors ∂x and ∂y form a basis for the tan- gent plane T E of the Euclidean plane E at each point E. The dual p ∈ space is denoted Tp∗ and called the cotangent plane. The basis dual to ∂ ∂ ∂x , ∂y is denoted (dx,dy). Polar coordinates will be defined in detail in Subsection 19.6. They provide helpful examples of vectors and 1-forms, as follows. ∂ ∂ Example 19.5.7. In polar coordinates, we have a basis ∂r , ∂θ for the tangent plane TpE of the Euclidean plane E at each point p E 0 . ∂ ∂ ∈ \{ } The dual space Tp∗ has a basis dual to ∂r , ∂θ and denoted (dr, dθ). Example 19.5.8. In polar coordinates, the 1-form rdr occurs frequently in calculus. This 1-form vanishes at the origin r = 0, and gets “bigger and bigger” as we get further away from the origin (see next section). 19.6. Polar, cylindrical, and spherical coordinates Definition 19.6.1. Polar coordinates (koordinatot koteviot) (r, θ) satisfy r2 = x2 + y2 and x = r cos θ, y = r sin θ. In R2 0 , one way of defining the ranges for the variables is to require \{ } r > 0 and θ [0, 2π). ∈ It is shown in elementary calculus that the area of a region D in the plane in polar coordinates is calculated using the area element dA = r dr dθ. 160 19. THEOREMA EGREGIUM Thus, an integral is of the form dA = rdrdθ. ZD ZZ Cylindrical coordinates in Euclidean 3-space are studied in Vector Calculus. Definition 19.6.2. Cylindrical coordinates (koordinatot gliliot) (r, θ, z) are a natural extension of the polar coordinates (r, θ) in the plane. The volume of an open region D is calculated with respect to cylin- drical coordinates using the volume element dV = r dr dθ dz. Namely, an integral is of the form dV = rdr dθ dz. ZD ZZZ Example 19.6.3. Find the volume of a right circular cone with height h and base a circle of radius b. Spherical coordinates1 (ρ,θ,ϕ) in Euclidean 3-space are studied in Vector Calculus. Definition 19.6.4. Spherical coordinates (ρ,θ,ϕ) are defined as follows. The coordinate ρ is the distance from the point to the origin, satisfying ρ2 = x2 + y2 + z2, or ρ2 = r2 +z2, where r2 = x2 +y2. If we project the point orthogonally to the (x, y)-plane, the polar coordinates of its image, (r, θ), satisfy x = r cos θ and y = r sin θ. The last coordiate ϕ of the spherical coordinates is the angle between the position vector of the point and the third basis vector e3 in 3-space (pointing upward along the z-axis). Thus z = ρ cos ϕ while r = ρ sin ϕ . 1koordinatot kaduriot 19.7. DUALITY IN CALCULUS; DERIVATIONS 161 Here the ranges of the coordinates are often chosen as follows: 0 ρ ≤ 0 θ 2π ≤ ≤ and 0 ϕ π ≤ ≤ (note the different upper bounds for θ and ϕ). Recall that the area of a spherical region D is calculated using a volume element of the form dV = ρ2 sin ϕ dρ dθ d ϕ, so that the volume of a region D is dV = ρ2 sin ϕ dρ dθ d ϕ . ZD ZZZD Example 19.6.5. Calculate the volume of the spherical shell be- tween spheres of radius α> 0 and β α. ≥ The area of a spherical region on a sphere of radius ρ = β is calcu- lated using the area element dA = β2 sin ϕ dθ d ϕ . Thus the area of a spherical region D on a sphere of radius β is given by the integral dA = β2 sin ϕ dθ d ϕ . ZD ZZ Example 19.6.6. Calculate the area of the spherical region on a sphere of radius β included in the first octant, (so that all three Carte- sian coordinates are positive). 19.7. Duality in calculus; derivations Let E be a Euclidean space of dimension n, and let p E be a fixed point. ∈ Definition 19.7.1. Let Dp = f : f C∞ { ∈ } be the space of real C∞-functions f defined in a neighborhood of p E. ∈ Note that Dp is infinite-dimensional as it includes all polynomials. 162 19. THEOREMA EGREGIUM ∂ Theorem 19.7.2. A partial derivative ∂ui at p a 1-form ∂ : λp R ∂ui → on the space λp, satisfying Leibniz’s rule ∂(fg) ∂f ∂g = g(p)+ f(p) ∂ui ∂ui ∂ui p p p for all f,g λp. ∈ The formula can be written briefly as ∂ ∂ ∂ ∂ui (fg)= ∂ui (f)g + f ∂ui (g), at the point p. This was proved in calculus. The formula above moti- vates the following more general definition of a derivation. Definition 19.7.3. A derivation X at p is an R-linear form X : Dp R → on the space Dp satisfying Leibniz’s rule: X(fg)= X(f)g(p)+ f(p)X(g) (19.7.1) for all f,g Dp. ∈ Remark 19.7.4. Linearity of a derivation is required only with regard to multiplication by scalars from R, not with regard to multi- plication by functions. CHAPTER 20 Derivations, tori 20.1. Space of derivations Let M be an n-dimensional manifold, and let p M. Recall that ∈ a derivation X is a 1-form on the space Dp of functions defined near p such that X satisfies the Leibniz rule (see Section 19.7). The following two results are familiar from advanced calculus. Proposition 20.1.1. The space of all derivations at p is a vector space of dimension n, and is called the tangent space Tp at p. Proof in case 1. Let us prove the result in the case n =1ofa single coordinate u = u1 at the point p = 0. Let X be a derivation. Then X(1) = X(1 1)=2X(1) · by Leibniz’s rule. Therefore X(1) = 0, and similarly for any constant by linearity of X. Now consider the monic polynomial u = u1 of degree 1, i.e., the linear function u Dp=0. ∈ We evaluate the derivation X at u and set c = X(u) R. By the ∈ Taylor formula, any function f Dp=0, f = f(u), can be written as ∈ f(u)= a + bu + g(u)u where g is smooth and g(0) = 0. Now we have by Linearity and Leibniz rule X(f)= X(a + bu + g(u)u) = bX(u)+ X(g)u(0) + g(0) 1 · = bc ∂ = c (f). ∂u ∂ Thus X coincides with c ∂u for all f Dp. Hence the tangent space is 1-dimensional, proving the theorem in∈ this case. 163 164 20. DERIVATIONS, TORI Proof in case 2. Let us prove the result in the case of two vari- ables u, v at the point p = 0 (the general case is similar). Let X be a derivation. Then X(1) = X(1 1)=2X(1) · by Leibniz’s rule. Therefore X(1) = 0, and similarly for any constant by linearity of X. Now consider the monic polynomial u = u1 of degree 1, i.e., the linear function u Dp=0. ∈ We evaluate the derivation X at u and set c = X(u). Similarly, consider the monic polynomial v = v1 of degree 1, i.e., the linear function v Dp=0. ∈ We evaluate the derivation X at v and set c′ = X(v). By the Taylor formula, any function f Dp=0, f = f(u, v), can be written as ∈ 2 2 f(u, v)= a + bu + b′v + g(u, v)u + h(u, v)uv + k(u, v)v , where the functions g(u, v), h(u, v), and k(u, v) are smooth. Note that the coefficients b and b′ are the first partial derivatives of f. Now we have 2 2 X(f)= X(a + bu + b′v + g(u, v)u + h(u, v)uv + k(u, v)v ) 2 = bX(u)+ b′X(v)+ X(g)u + g(u, v)2uX(u)+ ··· |u=0,v=0 = bX(u)+ b′X(v) = bc + b′c′ ∂ ∂ = c (f)+ c′ (f). ∂u ∂v Thus X coincides with the derivation ∂ ∂ c + c′ ∂u ∂v for all test functions f Dp. Hence the tangent space sis spanned by ∂ ∈ ∂ the tangent vectors ∂u and ∂v is therefore 2-dimensional, proving the theorem in this case. We have the following immediate corollary. Corollary 20.1.2. Let (u1,...,un) be coordinates in a neighbor- hood of p M. Then a basis for the space Tp is given by the n partial derivatives∈ ∂ , i =1, . . . , n. ∂ui 20.3. FIRST FUNDAMENTAL FORM 165 Definition 20.1.3. The vector space dual to the tangent space Tp is called the cotangent space, and denoted Tp∗. Thus an element of a tangent space is a vector, while an element of a cotangent space is called a 1-form, or a covector. ∂ Definition 20.1.4. The basis dual to the basis ( ∂ui ) is denoted (dui), i =1, . . . , n. i Thus each du is by definition a linear form on Tp, or a cotan- gent vector (covector for short). We are therefore working with dual ∂ i bases ∂ui for vectors, and (du ) for covectors. The pairing as in (19.5.1) gives ∂ ∂ ,duj = duj = δj, (20.1.1) ∂ui ∂ui i where δj is the Kronecker delta: δj = 1 if i = j and δj = 0 if i = j. i i i 6 Example 20.1.5. Examples of 1-forms in the plane are dx, dy, dr, rdr, dθ. 20.2. Constructing bilinear forms out of 1-forms Recall that the polarisation formula allows one to reconstruct a symmetric bilinear form B, from the quadratic form Q(v) = B(v, v), at least if the characteristic is not 2: 1 B(v,w)= (Q(v + w) Q(v w)). (20.2.1) 4 − − Similarly, one can construct bilinear forms out of the 1-forms dui, as follows. Consider a quadratic form defined by a linear combination of the rank-1 quadratic forms (dui)2. Polarizing the quadratic form, one obtains a bilinear form on the tangent space Tp. 20.3. First fundamental form Let M be a surface, and p M a point on the surface. ∈ Definition 20.3.1. The first fundamental form g on M is a sym- metric positive definite bilinear form on the tangent space Tp = TpM at p: g : Tp Tp R, × → defined for all p M and varying continuously and smoothly in p. ∈ 166 20. DERIVATIONS, TORI Example 20.3.2. Assume M is imbedded in R3. Euclidean 3- space R3 is equipped with a standard inner product which we de- 3 note , R . Its restriction to the tangent plane of M gives a first fundamentalh i form on M: g : Tp Tp R, g(v,w) := v,w 3 . × → h iR The first fundamental form is traditionally expressed by a matrix of coefficients called metric coefficients gij, with respect to coordi- nates (u1,u2) on the surface. Thus, if a surface is imbedded in Euclidean space R3, the metric is obtained by restricting the Euclidean inner product to the tangent plane of the surface, as follows. Definition 20.3.3. The metric coefficients gij are given by the inner product of the i-th and the j-th vector in the basis: ∂x ∂x gij = i , j . ∂u ∂u 3 R In particular, the diagonal coefficient satisfies ∂x 2 g = , ii ∂ui namely it is the square length of the i-th basis vector ∂x . ∂ui 20.4. Expressing the metric in terms of 1-forms In the case of surfaces, at every point p = (u1,u2), we have the 1 2 metric coefficients gij = gij(u ,u ). Each metric coefficient is thus a function of two variables. Consider the case when the matrix is diagonal. This can always be achieved at a point by a change of coordinates. If this is true everywhere in a coordinate chart, such coordinates are called isothermal. In the notation developed in Section 20.2, we can write the first fundamental form in isothermal coordinats as follows: 1 2 1 2 1 2 2 2 g = g11(u ,u )(du ) + g22(u ,u )(du ) . (20.4.1) For example, if the metric coefficients form an identity matrix, we obtain the standard flat metric g = du1 2 + du2 2 , (20.4.2) where the interior superscript denotes an index, while exterior super- script denotes the squaring operation. 20.6. SURFACE OF REVOLUTION 167 Sometimes it is convenient to simplify notation by replacing the coordinates (u1,u2) simply by (x, y). Then the flat metric (20.4.2) will appear simply as dx2 + dy2. 20.5. Hyperbolic metric The hyperbolic metric can expressed by the bilinear form 1 (dx2 + dy2). y2 Note that this expression is undefined for y = 0. Traditionally one considers the upper half plane defined by the condition y > 0. The hyperbolic metric in the upper half plane is a complete metric. Its Gaussian curvature K = K(x, y) satisfies K = 1 at every point follows from the Laplace-Beltrami operator form− of the formula for the Gaussian curvature (see Section 19.4), and results from the fact 1 that the second derivative of log y is 2 . − y 20.6. Surface of revolution In the previous sections, we have outlined a formalism for describing a metric on an arbitrary surface, in coordinates denoted (u1,u2). In the special case of a surface of revolution, it is customary to use the notation u1 = θ and u2 = ϕ. Then the metric (20.4.1) can be written as 2 2 g = g11(θ,ϕ)(dθ) + g22(θ,ϕ)(dϕ) . The starting point in the construction of a surface of revolution in R3 is a curve C in the xz-plane, parametrized by a pair of functions x = f(ϕ), z = g(ϕ). We will assume that f(ϕ) > 0. The surface of revolu- tion (around the z-axis) defined by C is parametrized as follows: x(θ,ϕ)=(f(ϕ) cos θ, f(ϕ) sin θ,g(ϕ)). The condition f(ϕ) > 0 ensures that the resulting surface is imbedded in R3. It will be an imbedded torus if the original curve C itself is a Jordan curve. The pair of functions (f(ϕ),g(ϕ)) gives an arclength parametrisation of the curve if df 2 dg 2 + =1. dϕ dϕ 168 20. DERIVATIONS, TORI Proposition 20.6.1. Assume (f(ϕ),g(ϕ)) gives an arclength para- metrisation of the curve. Then in the notation of Section 20.4, the metric g of the surface of revolution is g = f 2(ϕ)dθ2 + dϕ2. Example 20.6.2. Setting f(ϕ) = sin ϕ and g(ϕ) = cos ϕ, we obtain a parametrisation of the sphere S2 in spherical coordinates: 2 2 2 gS2 = sin ϕ dθ + dϕ . Proof of Proposition 20.6.1. To calculate the first fundamen- tal form of a surface of revolution, note that ∂x x = =( f sin θ, f cos θ, 0), 1 ∂θ − while ∂x df df dg x = = cos θ, sin θ, , 2 ∂ϕ dϕ dϕ dϕ 2 2 2 2 2 so that we have g11 = f sin θ + f cos θ = f , while df 2 dg 2 df 2 dg 2 g = (cos2 θ + sin2 θ)+ = + 22 dϕ dϕ dϕ dϕ and df df g = f sin θ cos θ + f cos θ sin θ =0. 12 − dϕ dϕ Thus we obtain the first fundamental form f 2 0 2 2 (gij)= df dg , (20.6.1) 0 dϕ + dϕ ! completing the proof. In fact, equation (20.6.1) has the following stronger consequence. Proposition 20.6.3. Assume (f,g) is a regular parametrisation of the curve. In the notation of Section 20.4, the metric g of the surface of revolution is df 2 dg 2 g = f 2dθ2 + + dϕ2. dϕ dϕ ! In Section 20.10, we will express the metric of a surface of revolution by a first fundamental form expressed by a scalar matrix, after an appropriate change of coordinates. 20.7. COORDINATE CHANGE 169 20.7. Coordinate change In this section we will work with coordinate charts in the Euclidean plane. Given a metric in one coordinate chart, we would like to under- stand how the metric coefficients tranform under change of coordinates. Consider a change from a coordinate chart (ui), i =1, 2, to another coordinate chart, denoted (vα), α =1, 2. In the overlap of the two domains, the coordinates can be expressed in terms of each other, e.g. v = v(u). Denote by gij the metric coefficients i with respect to the chart (u ), and byg ˜αβ the metric coefficients with respect to the chart (vα). Definition 20.7.1 (Einstein summation convention). The rule is that whenever a product contains a symbol with a lower index and another symbol with the same upper index, take summation over this repeated index. Proposition 20.7.2. Under a coordinate change, the metric coef- ficients transform as follows: ∂ui ∂uj g˜ = g . αβ ij ∂vα ∂vβ Proof. In the case of a metric induced by a Euclidean imbedding x : R2 R3, → defined by x = x(u)= x(u1,u2), we obtain a new parametrisation y(v)= x(u(v)). Applying chain rule, we obtain ∂y ∂y g˜ = , αβ ∂vα ∂vβ ∂x ∂ui ∂x ∂uj = , ∂ui ∂vα ∂uj ∂vβ ∂ui ∂uj ∂x ∂x = , ∂vα ∂vβ ∂ui ∂uj ∂ui ∂uj = g , ij ∂vα ∂vβ completing the proof. 170 20. DERIVATIONS, TORI 20.8. Conformal equivalence i j i j Definition 20.8.1. Two metrics, g = gijdu du and h = hijdu du , on a surface M are called conformally equivalent, or conformal for short, if there exists a function f = f(u1,u2) > 0 such that g = f 2h, in other words, g = f 2h i, j. (20.8.1) ij ij ∀ Definition 20.8.2. The function f above is called the conformal factor (note that sometimes it is more convenient to refer, instead, to the function λ = f 2 as the conformal factor). Remark 20.8.3. Note that the length of every vector at a point with coordinates (u1,u2) is multiplied precisely by f(u1,u2). More i ∂ speficially, a vector v = v ∂ui which is a unit vector for the metric h, is “stretched” by a factor of f, i.e. its length with respect to g equals f. Indeed, the new length of v is 1 ∂ ∂ 2 g(v, v)= g vi , vj ∂ui ∂uj p 1 ∂ ∂ 2 = g , vivj ∂ui ∂uj i j = gijv v 2 i j = pf hijv v i j = fp hijv v = f.p 20.9. Uniformisation theorem for tori Definition 20.9.1. An equivalence class of metrics on M confor- mal to each other is called a conformal structure (mivneh) on M. Theorem 20.9.2. Locally, every metric on a surface can be written as f(x, y)2(dx2 + dy2) with respect to suitable coordinates (x, y). There is a global version of this theorem for tori. The uniformisation theorem in the genus 1 case can be formulated as follows. Recall that by Pythagoras’ theorem, the square-length of a vector in the plane is the sum of the squares of its components, or ds2 = dx2 + dy2, 20.10. ISOTHERMAL COORDINATES 171 where dx and dy measure the components and ds is the arclength in the plane. Definition 20.9.3. A lattice L R2 a subgroup isomorphic to Z2 ⊆2 which is not included in any line in R . Definition 20.9.4. A function f(x, y) on R2 is called L-periodic if f(x + ℓ)= f(x) for all ℓ L. ∈ Definition 20.9.5. A flat torus T2 is a quotient R2 /L, where L is a lattice in the plane. Theorem 20.9.6 (Uniformisation theorem). For every metric g on the 2-torus T2, there exists a lattice L R2 and a positive L-periodic 2 ⊆ 2 function f(x, y) on R such that the torus (T ,g) is isometric to R2 /L, f 2ds2 , (20.9.1) where ds2 = dx2 + dy2 is the standard flat metric of R2. More generally, there is also a global form of the theorem in arbi- trary genus, which can be stated in several ways. One of such ways involves Gaussian curvature. Theorem 20.9.7 (uniformisation). Every metric on a connected (kashir) surface is conformally equivalent to a metric of constant Gauss- ian curvature. From the complex analytic viewpoint, the uniformisation theorem states that every Riemann surface is covered by either the sphere, the plane, or the upper halfplane. Thus no notion of curvature is needed for the statement of the uniformisation theorem. However, from the differential geometric point of view, what is relevant is that every con- formal class of metrics contains a metric of constant Gaussian curva- ture. See [?] for a lively account of the history of the uniformisation theorem. 20.10. Isothermal coordinates Definition 20.10.1. Coordinates (u1,u2) on a surface M are called isothermal if the matrix (gij) of the first fundamental form of M, with respect to (u1,u2), is a scalar matrix at every point of M. The following lemma expresses the metric of a surface of revolution in isothermal coordinates. 172 20. DERIVATIONS, TORI Theorem 20.10.2. Suppose (f(ϕ),g(ϕ)), where f(ϕ) > 0, is an arclength parametrisation of the generating curve1 of a surface of rev- olution. Then the change of variable 1 ψ = dϕ, f(ϕ) Z produces a new parametrisation (in terms of variables θ, ψ), with respect to which the first fundamental form is given by a scalar matrix (gij)= 2 (f δij), i.e. the metric is f 2(ϕ(ψ))(dθ2 + dψ2). In other words, we obtain an explicit conformal equivalence be- tween the metric on the surface of revolution and the standard flat metric on the quotient of the (θ, ψ) plane. Such coordinates are re- ferred to as “isothermal coordinates” in the literature. The existence of such a parametrisation is predicted by the uniformisation theorem (see Theorem 20.9.6) in the case of a general surface. df df dϕ Proof. Let ϕ = ϕ(ψ). By chain rule, dψ = dϕ dψ . Now consider again the first fundamental form of Proposition 20.6.3. To impose the 2 2 2 df dg condition g11 = g22, we need to solve the equation f = dψ + dψ , or df 2 dg 2 dϕ 2 f 2 = + . dϕ dϕ dψ ! In the case when the generating curve is parametrized by arclength, we dϕ are therefore reduced to the equation f = dψ , or dϕ ψ = . f(ϕ) Z Thus we obtain a parametrisation of the surface of revolution in co- ordinates (θ, ψ), such that the matrix of metric coefficients is a scalar matrix. 20.11. Tori of revolution We will compute the conformal parameter of the specific torus of revolution (x a)2 + y2 = b2, where b < a. Note that the lattices obtained in the− previous section are all rectangular, with θ varying L dϕ from 0 to 2π and ψ varying from 0 to 0 f(ϕ) . The lemma of the previous section has the following corollary. R 1This curve does not have to be closed. We will specialize to Jordan curves in Section 20.11. 20.11. TORI OF REVOLUTION 173 Corollary 20.11.1. Consider a torus of revolution in R3 formed by rotating a Jordan curve of length L > 0, with unit speed param- etisation (f(ϕ),g(ϕ)) where ϕ [0, L]. Then the torus is conformally 2 ∈ equivalent to a flat torus R /Lc,d, defined by a rectangular lattice Lc,d = c Z d Z, ⊕ L dϕ where c = 2π and d = 0 f(ϕ) , while the metric expressed in coordi- nates (θ, ψ) is R f 2(ϕ(ψ))(dθ2 + dψ2), dϕ for the change of coordinate ψ = f(ϕ) . R CHAPTER 21 Conformal parameter 21.1. Conformal parameter τ of standard tori of revolution Figure 21.1.1. Torus: lattice (left) and imbedding (right) Definition 21.1.1. Every flat torus C/L is similar to the torus spanned by τ C and 1 C, where τ is in the standard fundamental ∈ ∈ 1 domain D = z = x + iy C : x 2 ,y > 0, z 1 . The parameter τ is{ called the conformal∈ parameter| | ≤ of the torus.| | ≥ } In the previous section we studied the torus of revolution gener- ated by a Jordan curve of length L with arclength parametrisation (f(ϕ),g(ϕ)). We showed that the conformal class of the torus contains the flat rectangular torus, whose lattice Lc,d is spanned by perpendic- L dϕ ular vectors of length c =2π and d = 0 f(ϕ) . We therefore obtain the following corollary. R Corollary 21.1.2. The conformal parameter τ of a torus of rev- olution is pure imaginary: τ = iσ2 of absolute value c d σ2 = max , 1. d c ≥ 175 176 21. CONFORMAL PARAMETER Proof. The proof is immediate from the fact that the lattice is rectangular. 21.2. Comformal parameter of standard torus of rev We would like to compute the conformal parameter τ of the stan- dard tori imbedded in R3. We first modify the parametrisation so as to obtain a generating curve parametrized by arclength: ϕ ϕ f(ϕ)= a + b cos , g(ϕ)= b sin , (21.2.1) b b where ϕ [0, L] with L =2πb, and b < a. ∈ Theorem 21.2.1. The corresponding flat torus is given by the lat- tice in the (θ, ψ) plane of the form Lc,d = c Z +d Z where c =2π and 2π d = . (a/b)2 1 − Thus the conformal parameter pτ of the flat torus satisfies 1 τ = i max , (a/b)2 1 . (a/b)2 1 − ! − p Proof. By Corollary 21.1.2,p replacing ϕ by ϕ(ψ) produces isother- mal coordinates (θ, ψ) for the torus generated by (21.2.1), where dϕ dϕ ψ = = , f(ϕ) a + b cos ϕ Z Z b and therefore the flat metric is defined by a lattice in the (θ, ψ) plane with c =2π and L=2πb dϕ d = . a + b cos ϕ Z0 b Changing the variable to to ϕ t = b we obtain 2π dt d = . (a/b) + cos t Z0 Thus we are evaluating the definite integral 2π dt d = , (a/b) + cos t Z0 21.2. COMFORMAL PARAMETER OF STANDARD TORUS OF REV 177 where a/b > 1. Let α = a/b. We will use the residue theorem. First note that 2π dt d = α + cos t Z0 dt = α + Re(eit) Z 2dt = . 2α + eit + e it Z − The change of variables z = eit yields idz dt = − z and along the unit circle we have 2idz d = − z(2α + z + z 1) I − 2idz = − (21.2.2) z2 +2αz +1 I 2idz = − , (z λ )(z λ ) I − 1 − 2 where λ = α + √α2 1, λ = α √α2 1. (21.2.3) 1 − − 2 − − − From (21.2.3) we obtain λ λ =2√α2 1. (21.2.4) 1 − 2 − The root λ2 is outside the unit circle. Thus z = λ1 is the only singu- larity of the meromorphic function under the integral sign in (21.2.2) inside the unit circle. Hence we need the residue at λ1 to apply the residue theorem. The residue at λ1 is obtained by multiplying through by (z λ1) and then evaluating at z = λ1. We therefore obtain from (21.2.4)− 2i 2i i Resλ1 = − = − = − . λ1 λ2 2√α2 1 √α2 1 − − − The integral is determined by the residue theorem in terms of the residue at the pole z = λ1. Therefore the lattice parameter d from Corollary 21.1.2 equals 2π d = 2πi Resλ1 = , (α)2 1 − where α = a/b, proving the theorem. p 178 21. CONFORMAL PARAMETER 21.3. Curvature expressed in terms of θ(s) In Section 19.4 we dealt with the Gaussian curvature of surfaces. In the present section we treat a different notion of curvature, that of a differentiable curve in the plane. Let s be an arclength parameter of a plane curve α(s). Using the 2 identification R = C, the tangent vector v(s)= α′(s) can be expressed as v(s)= eiθ(s), where θ(s) is the angle measured counterclockwise between the posi- tive x-axis and the vector v(s). The (geodesic) curvature kα(s) is by definition the function kα(s)= α′′(s) . Note that one has k (s) 0 by definition. | | α ≥ Theorem 21.3.1. We have the following expression for the curva- ture kα of the curve α: dθ k (s)= . (21.3.1) α ds Proof. iθ(s) Recall that v(s)= e . By chain rule, we have dv dθ = ieiθ(s) , ds ds and therefore the curvature kα(s) satisfies dv dθ k (s)= α′′(s) = = , α | | ds ds since i = 1. | | Corollary 21.3.2. Let α(t)=(x(t),y(t)) be an arbitrary parametri- sation (not necessarily arclength) of a curve. We have the following expression for the curvature kα: x′y′′ y′x′′ k = | − | (21.3.2) α α 3 | ′| Proof. With respect to an arbitrary parameter t, the components of the tangent vector v(t)= α′(t) are x′(t) and y′(t). Hence we have y tan θ = ′ . (21.3.3) x′ Differentiating (21.3.3) with respect to t, we obtain 1 dθ x′y′′ x′′y′ 2 = −2 . (21.3.4) cos θ dt x′ 21.4. TOTAL CURVATURE OF A CONVEX JORDAN CURVE 179 Meanwhile dx dx ds ds x′ = = = cos θ dt ds dt dt and dθ dθ ds = . (21.3.5) dt ds dt dt dα dθ Furthermore, ds = v(t) = α′(t) = dt . Solving (21.3.5) for ds , we obtain from (21.3.4)| that| | | | | dθ x′y′′ y′x′′ = | − | ds α 3 | ′| and by (21.3.1) we obtain (21.3.2). This is equivalent to the calcula- tion (??). 21.4. Total curvature of a convex Jordan curve The result on the total curvature1 of a Jordan curve is of interest in its own right. Furthermore, the result serves to motivate an analogous statement of the Gauss–Bonnet theorem for surfaces in Section 21.6. Definition 21.4.1. The total curvature Tot(C) of a curve C with counterclockwise arclenth parametrisation α(s):[a, b] R2 is the integral → b Tot(C)= kα(s)ds. (21.4.1) Za When a curve C is closed, i.e, α(a)= α(b), it is more convenient to write the integral (21.4.1) using the notation of a line integral Tot(C)= kα(s)ds. (21.4.2) IC Theorem 21.4.2. The total curvature of each strictly convex2 Jor- dan curve C with arclength parametrisation α(s) equals 2π, i.e., Tot(C)= kα(s)ds =2π. IC Proof. Let α(s) be a unit speed parametrisation so that α(0) be the lowest point of the curve, and assume the curve is parametrized i0 counterclockwise. Let v(s)= α′(s). Then v(0) = e = 1 and θ(0) = 0. 1Akmumiyut kolelet 2Kmura with a “kuf”. 180 21. CONFORMAL PARAMETER The function θ = θ(s) is monotone increasing from 0 to 2π. Applying the change of variable formula for integration, we obtain dv dθ 2π Tot(C)= k (s)ds = ds = ds = dθ =2π, α ds ds IC IC IC Z0 proving the theorem. Remark 21.4.3. We can also consider the normal vector n(s) to π the curve. The normal vector satisfies n(s)= ei(θ+ 2 ) = iei(θ) and dn = | ds | dv = dθ . Therefore we can also calculate the total curvature as follows: | ds | ds dn dθ 2π k (s)ds = ds = ds = dθ =2π, α ds ds IC IC IC Z0 with n(s) in place of v(s). Remark 21.4.4. The theorem in fact holds for an arbitrary regular Jordan curve, provided we adjust the definition of curvature to allow for negative sign (in such case θ(s) will not be a monotone function). We have dealt only with the convex case in order to simplify the topological considerations. See further in Section 21.5. Remark 21.4.5. A similar calculation will yield the Gauss–Bonnet theorem for convex surfaces in Section 21.6. ˜ 21.5. Signed curvature kα of Jordan curves in the plane Let α(s) be an arclength parametrisation of a Jordan curve in the plane. We assume that the curve is parametrized counterclockwise. As in Section 21.3, a continuous branch of θ(s) can be chosen where θ(s) is the angle measured counterclockwise from the positive x-axis to the tangent vector v(s)= α′(s). Note that if the Jordan curve is not convex, the function θ(s) will not be monotone and at certain points its derivative may take negative values: θ′(s) < 0. Once we have such a continuous branch, we can define the signed curvature k˜α as follows. Definition 21.5.1. The signed curvature k˜α(s) of a plane Jordan curve α(s) oriented counterclockwise is defined by setting dθ k˜ (s)= . α ds We have the following generalisation of Theorem 21.4.2 on the total curvature of a convex curve. 21.5. SIGNED CURVATURE k˜α OF JORDAN CURVES IN THE PLANE 181 Theorem 21.5.2. The total signed curvature k˜α of each Jordan curve C with arclength parametrisation α(s) equals 2π, i.e., b Tot(C)= k˜α(s)ds =2π. Za Proof. We exploitg the continuous branch θ(s) as in Theorem 21.4.2, but without the absolute value signs on the derivative of θ(s). Applying the change of variable formula for integration, we obtain dθ 2π Tot(C)= k˜ (s)ds = ds = dθ =2π, α ds IC IC Z0 proving theg theorem. Remark 21.5.3. For nonconvex Jordan curves, the Gauss map to the circle will not be one-to-one, but will still have an algebraic de- gree one. This phenomenon has a far-reaching analogue for imbedded surfaces where the algebraic degree is proportional to the Euler char- acteristic. Definition 21.5.4. The algebraic degree of a map θ : C S1 at a point z S1 is defined to be the sum → ∈ dθ sign , −1 ds θX(z) where the sum runs over all points in the inverse image of z. For Jordan curves (i.e., imbedded curves), the algebraic degree is always 1. This is the topological ingredient behind the theorem on the total curvature of a Jordan curve. 21.5.1. Tangent vector defined by a curve. This subsection can be skipped. Let M be a manifold. A tangent vector X at a point p M was ∈ defined as a linear functional X : Dp R on the space Dp of smooth functions defined in a neighborhood of→p, where the 1-form is required to satisfy Leibniz’s rule (19.7.1). The tangent vectors at p form the tangent space Tp = TpM at p, which is an n-dimensional vector space. If (u1,...,un) are coordinates centered at p, then we can write down the ∂ following basis for Tp: ∂ui , i = 1,...,n. If α(t) is a smooth curve through p satisfying α(0) = p, the formula Xf = d f α(t) = dt t=0 ◦ d f(α(t)) for all f , defines an initial vector3 X = α (0). The dt t=0 Dp ′ directional derivative only∈ depends on the values of f along a curve 3hat’chalati 182 21. CONFORMAL PARAMETER Figure 21.5.1. Three points on a curve with the same tangent direction, but the middle one corresponds to θ′(s) < 0. with initial vector X, by elementary calculus. Thus we can think of a tangent vector as being defined by choosing a suitable smooth curve. Example 21.5.5. Let p be the point at the center of a coordi- 1 n nate system with coordinates (u ,...,u ). Consider the curve α1(t) defined in coordinates by the n-tuple (t, 0,..., 0). Similarly consider the curves αi, where i = 1,...,n, so that αn(t) = (0,..., 0, t). Then ∂ we have ∂ui = αi′(0), i =1,...,n. The advantage of the more abstract definition of a tangent vector is its independence of any choices. This will allow us to define the tangent bundle by considering all tangent vectors simultaneously. 21.5.2. Rotation index of a closed curve in the plane. This subsection is to be skipped. Given a regular closed plane curve α(s) : [0, L] R2, of length L, a continuous branch of → 1 θ(s)= log α′(s) i can be chosen even if the curve is not simple, obtaining a map θ : [0, L] R . → 21.6. COORDINATE PATCH DEFINED BY FAMILY OF NORMAL VECTORS183 As in Section 21.5, we can assume that θ(0) = 0. Then θ(L) is neces- sarily an integer multiple of 2π. See Millman & Parker [?, p. 55]. Definition 21.5.6. The rotation index ια of a closed unit speed plane curve α(s) is the integer θ(L) θ(0) ι = − . α 2π Example 21.5.7. The rotation index of a figure-8 curve is 0. Theorem 21.5.8. Let α(s) be an arclength parametrisation of a geometric curve C. Then the rotation index is related to the total signed curvature Tot(C) as follows: Tot(C) g ι = . (21.5.1) α 2π The proof (by change of variableg in integration) is the same as in Section 21.5. An analogous relation between the Euler characteristic of a surface and its total Gaussian curvature appears in Section 21.7. 21.6. Coordinate patch defined by family of normal vectors The Gauss–Bonnet theorem for surfaces is to a certain extent anal- ogous to the theorem on the total curvature of a plane curve (Theo- rem 21.5.2 and Theorem 21.5.8). In both cases, an integral of curvature turns out to have topological significance. In the notation of Section 19.1, we have the metric coefficients (gij) of the parametrisation x(u1,u2) of the surface Σ in a neighborhood of a point p Σ. The Gauss∈ map G : M S2 → is given by the unit normal vector n(u1,u2). Remark 21.6.1. The Gauss map defines a coordinate patch on the sphere S2 near the point G(p). In other words, we obtain a parametri- sation of a neighborhood of the point G(p) S2. ∈ Definition 21.6.2. In the new coordinate patch, we obtain the 2 metric coefficients (˜gα,β) of the sphere S relative to the parametrisa- tion n(u1,u2). 1 2 1 2 Here each coefficientg ˜αβ =g ˜αβ(u ,u ) is a function of u and u , as well. These metric coefficients are defined as usual by taking the scalar products of hte first partial derivatives of the parametrisation, this time given by n(u1,u2): 1 2 g˜ (u ,u )= n , n 3 , α,β =1, 2. αβ h α βiR 184 21. CONFORMAL PARAMETER Lemma 21.6.3. We have the following identity relating the metric 2 coefficients of the metric (˜gαβ) on the sphere S , on the one hand, and the metric coefficients (gij) of the metric on M: 1 2 2 det(˜gαβ)= K(u ,u ) det(gij) where K = K(u1,u2) is the Gaussian curvature of the surface M. i Proof. Let L = (L j) be the matrix of the Weingarten map with respect to the basis (x1, x2). By definition of curvature we have K = i det(L). Recall that the coefficients L j of the Weingarten map are defined by i i nα = L αxi = xi L α. (21.6.1) Consider the 3 2-matrices A = [x1 x2] and B = [n1 n2]. Then (21.6.1) implies× by definition that B = AL. Therefore the Gram matrices satisfy t t t t Gram(n1, n2)= B B =(AL) AL = L A AL t (21.6.2) = L Gram(x1, x2) L. Now we apply the determinant to both sides. We have det(Gram(x1, x2)) = det(gij). Similarly, det(Gram(n1, n2)) = det(˜gαβ). Therefore from (21.6.2) we obtain 2 det(˜gαβ) = det(gij)det(L) , and the theorem follows from the identity K = det(L). 4 4Note that by the Cauchy–Binet formula ??, the desired identity is equivalent to the formula n n = det(Li ) x x . (21.6.3) | 1 × 2| | j | | 1 × 2| An alternative proof of (21.6.3) is to observe that a linear map multiplies the area of parallelograms by (the absolute value of) its determinant. The Weingarten map sends each vector xi to ni and therefore it sends the parallelogram spanned by x1 and x2 to the parallelogram spanned by n1 and n2. Therefore it modifies the area of the parallegram by det(Li ) . | j | 21.7. TOTAL GAUSSIAN CURVATURE AND GAUSS–BONNET THEOREM 185 21.7. Total Gaussian curvature and Gauss–Bonnet theorem Let Σ R3 be an imbedded surface. Recall that the area ele- ment dA of⊆ a surface M is given locally in coordinates (u1,u2) by 1 2 dAM = det(gij)du du . q Here the factor det(gij) should be thought of as the area of the par- ∂ ∂ allelogram spanned by 1 and 2 . For example, in polar coordinates p ∂u ∂u in the plane, the determinant is r2 and therefore we get dA = rdrdθ. On the unit sphere we have dA = sin ϕ dθ dϕ. Let M be a convex closed surface in R3, i.e., an imbedded closed surface of positive Gaussian curvature (a typical example is an ellip- soid). Theorem 21.7.1 (Special case of Gauss-Bonnet). The curvature integral of M satisfies 2 K(p) dAM =4π =2πχ(S ), ZM where K is the Gaussian curvature function defined at every point p of the surface. Proof. The convexity of the surface guarantees that the map n is one-to-one. 1 2 We examine the integrand K dAM in a coordinate chart (u ,u ) 1 2 where it can be written as K(u ,u ) dAM . By Lemma 21.6.3, we have 1 2 1 2 1 2 1 2 2 K(u ,u ) dAM = K(u ,u ) det(gij)du du = det(˜gαβ)du du = dAS . q q 2 Thus, the expression K dAM coincides with the area element dAS of the unit sphere S2 in every coordinate chart. Hence we can write 2 K dAM = dAS =4π, 2 ZM ZS proving the theorem. CHAPTER 22 Addendum on a Bernoullian framework 22.1. Ultrapower construction We briefly describe a construction that yields a nonstandard ex- tension satisfying countable saturation. For every n N let Sn(X) ∈ be the set of all sequences xi i N such that xi Vn(X) for all i, and h i ∈ ∈ let S(X) = n N Sn(X). Let (N) be an ultrafilter on N such ∈ U ⊆ P that every cofinite subset of N is in . Define an equivalence relation S U ∼ on S(X) as follows: xi i N yi i N if the set i N : xi = yi is in . h i ∈ ∼h i ∈ { ∈ } U We denote the equivalence class of xi i N by [ xi i N]. We define a rela- h i ∈ h i ∈ tion ε on S(X) as follows: [ xi i N] ε [ yi i N] if the set i N : xi yi is in . h i ∈ h i ∈ { ∈ ∈ } U A16. Let S′(X) denote the set of equivalence classes of elements of S(X), and let ∗X denote the set of equivalence classes of elements of S (X). We recursively define a function T : S′(X) V (∗X), as 0 → follows: For a = xi i N S0(X) simply T ([a]) = [a]. For n > 0 h i ∈ ∈ and Ai i N Sn(X) Sn 1(X), let T ([ Ai i N]) = T ([ xi i N]) : h i ∈ ∈ − − h i ∈ { h i ∈ [ xi i N] ε [ Ai i N] , which is an element of Vn(∗X). For A S(X) leth ki ∈ denoteh thei ∈ sequence} with constant value A, then finally,∈ let : A ∗ V (X) V (∗X) be defined by ∗A = T ([kA]). In this construction, the internal→ objects are precisely those appearing in the image of T . 22.2. Saturation and topology In Theorem we further assume that our nonstandard extension satisfies countable saturation, which is the following property: When- ever An n N is a countable collection of internal sets such that 0 k n Ak = { } ∈ ≤ ≤ 6 ∅ for every n N, then n N An = ∅. Any nonstandard extension constructed via∈ the ultrapower∈ construction6 described below,T satisfies countable saturation. T 187 CHAPTER 23 More on Gauss-Bonnet 23.1. Second proof of Gauss–Bonnet theorem We will use the existence of global coordinates (with singularities only at the north and south poles) on the sphere to write down a more concrete version of the calculation. Choose global coordinates u1 = θ,uˆ 2 =ϕ ˆ on M by precomposing 1 the spherical coordinates by the inverse G− of the Gauss map G : M S2. Here we can dispense with local coordinate charts and compute→ the Gauss–Bonnet integral in these global coordinates. Using 1 2 the relation K(u ,u ) det(gij)= det(˜gαβ) we obtain as before 1 2 p pˆ K(u ,u ) dAM = K(θ, ϕˆ) dAM ZM ZM = K(θ,ˆ ϕˆ) det(gij)dθdˆ ϕˆ M ZZ q = det(˜gαβ)dθdϕ = dAS2 . S2 S2 ZZ q ZZ Here each of the double integrals can be replaced by the iterated integral π RR2π dϕ dθ(...) , Z0 Z0 where the appropriate integrand is placed where the dots (...) are. 23.2. Euler characteristic The Euler characteristic of a closed imbedded surface in Euclidean 3-space can be defined via the total Gaussian curvature. Definition 23.2.1. The Euler characteristic χ(M) of a surface M is defined by the relation 2πχ(M)= K(p)dAM , (23.2.1) ZM where K(p) is the Gaussian curvature at the point p M. ∈ 189 190 23. MORE ON GAUSS-BONNET The relation (23.2.1) is similar to the line intergal expression for the rotation index in formula ??. To show that this definition of the Euler characteristic agrees with the usual one, it is necessary to use the notion of algebraic degree for maps between surfaces, similar to the algebraic degree of a self-map of a circle. 23.3. Exercise on index notation Theorem 23.3.1. Every 2 2 matrix A =(ai ) satisfies the identity × j i k i k i a ka j + q δ j = a ka j, (23.3.1) where q = det(A). Proof. We will use the Cayley-Hamilton theorem which asserts that pA(A) = 0 where pA(λ) is the characteristic polynomial of A. In 2 the rank 2 case, we have pA(λ)= λ (trA)λ + det(A), and therefore we obtain − A2 (trA)A + det(A)I =0, − which in index notation gives ai ak ak ai +q δi = 0. This is equivalent k j − k j j to (23.3.1). For a matrix A of size 3 3, the characteristic polynomial pA(λ) has the form p (λ)= λ3 Tr(A×)λ2 + s(A)λ q(A)λ0. Here q(A) = det(A) A − − and s(A)= λ1λ2 + λ1λ3 + λ2λ3, where λi are the eigenvalues of A. By Cayley-Hamilton theorem, we have pA(A)=0. (23.3.2) Exercise 23.3.2. Express the equation (23.3.2) in index notation. Exercise 23.3.3. A matrix A is called idempotent if A2 = A. Write the idempotency condition in indices with Einstein summation conven- tion (without Σ’s). Exercise 23.3.4. Matrices A and B are similar if there exists an invertible matrix P such that AP P B = 0. Write the similarity condition in indices, as the vanishing− of the (i, j)th coefficient of the difference AP P B. − CHAPTER 24 Bundles 24.1. Trivial bundle (E,B,π), function as a section (mikta) Before introducing the notion of a tangent bundle, let us make some elementary remarks about graphs. Let M be a surface (more generally, a manifold). Given a function f : M R → we can define its graph in M R as the collection of pairs × (p, f(p)) M R p M . { ∈ × | ∈ } We view M R as a trivial bundle over M, denoted π: × π : M R M. × → We can also consider the section s : M M R defined by the obvious formula → × s(p)=(p, f(p)). The section thus defined clearly satisfies the identity π s = Id , ◦ M in other words π(s(p)) = p for all p M. We will now develop the language of bundles and sections in∈ the context of the tangent and cotangent bundles of M. The total space of a bundle is typically denoted E, the base is denoted B, and the surjective bundle map is denoted π: E π B The whole package is referred to as the triple (E,B,π). 191 192 24. BUNDLES 24.2. Mobius band as a first example of a nontrivial bundle The tangent bundle of a circle is a cylinder S1 R, which can be thought of as a trivial bundle over the circle. × A first example of a non-trivial bundle is provided by the Mobius band (tabaat mobius, rtzuat mobius). The latter can be thought of as a non-trivial bundle over the circle. Consider the parametrized circle c(θ) = 10(cos θ, sin θ, 0) with nor- mal vector n(θ) = (cos θ, sin θ, 0). The circle can be imbedded in a parametrized Mobius band in R3. This is done by adding the endpoints of a “rotating” interval at every point: µ(θ)= c(θ) cos θ n(θ) + sin θ e . ± 2 2 3 θ θ Now “filling in” the pair of points cos 2 n(θ) + sin 2 e3 by the in- terval joining them, we obtain a Mobius± band θ θ MB(θ,s)= c(θ)+ s cos 2 n(θ) + sin 2 e3 , where 1 s 1. The projection back to the circle c defines a non-trivial− ≤ interval≤ bundle over the circle. 24.3. Tangent bundle The tangent bundle T M of a surface (or more generally mani- fold) M is by definition the collection of pairs T M := (p,X) p M,X T . { | ∈ ∈ p} We have a natural projection π : T M M, → defined by (p,X) p, “forgetting” the second component. Clearly, 7→ the inverse image of a point is precisely the tangent plane Tp at that point: 1 π− (p)= Tp. A vector field is a “section” of the tangent bundle, analogous to the splitting map for a homomorphism. Thus a section s : M T M is a map satisfying → π s = Id . ◦ M In other words, if s is a section, we have π(s(p)) = p for all p M. We first complete the preliminary material on bundles. ∈ 24.5. COTANGENT BUNDLE 193 24.4. Hairy ball theorem In Section 24.2, we discussed the simplest example of a nontrivial bundle, given by the Mobius band. Another example of a nontrivial bundle is provided by the following famous result. Theorem 24.4.1 (The hairy ball theorem). There is no nonvan- ishing continuous tangent vector field on a sphere. Thus, if f is a continuous function that assigns a vector in R3 to every point p on a sphere, such that f(p) is always tangent to the sphere at p, then there is at least one p such that f(p) = 0. The theorem was first stated by Henri Poincar´ein the late 19th century. The theorem is famously stated as “you can’t comb a hairy ball flat”, or sometimes, “you can’t comb the hair of a coconut”. It was first proved in 1912 by Brouwer. 24.5. Cotangent bundle In Section 24.3, we defined the tangent bundle of M. We now define the dual object, the cotangent bundle. Definition 24.5.1. The cotangent bundle of M is the collection of pairs T ∗M := (p, η) p M, η T ∗ . | ∈ ∈ p We have a natural projection (to the “first factor”) π : T ∗M M, → so that we obtain the usual triple (E,B,π)= T ∗M,M,π). Given a coordinate chart (u1,...,un) in M, we obtain a basis du1,...,dun) at every point in the chart. Every cotangent vector ω at a point with coordinates (u1,...,un) is a linear combination ω = ω dui = ω du1 + ω du2 + + ω dun, i 1 2 ··· n and therefore we can coordinatize T ∗M by the 2n-tuple 1 n (u ,...,u ,ω1,...,ωn). With respect to this coordinate chart in T ∗M, the projection π : T ∗M M “forgets” the last n coordinates. → Definition 24.5.2. A differential 1-form ω on M is by definition a section of the cotangent bundle of M. Thus at every point p M, a differential form ω gives a linear form ∈ ωp : TpM R . → 194 24. BUNDLES In coordinates, a differential 1-form ω is given by 1 n i ω = ωi(u ,...,u )du , where the ωi are functions of n variables. 24.6. Exterior derivative of a function Given a smooth function f C∞(M), we define a differential 1- form, denoted df, on M, in terms∈ of the directional derivative by setting df(X)= Xf, where Xf is the directional derivative of the function f in the di- rection of the vector X. This formula applies at every point of M. Denoting ω = df, we see that ωp(Xp)= Xpf R ∈ where Xp is the value of the vector field X at the point p. Theorem 24.6.1. In coordinates (u1,...,un), we have ∂f df = dui, (24.6.1) ∂ui with Einstein summation convention. The proof is a restatement of chain rule in several variables. Definition 24.6.2. The collection of all differential 1-forms on M is denoted Ω1(M). Note that Ω1(M) is an infinite-dimensional vector space. The space Ω1(M) is by definition the space of sections of the cotangent bundle of M. In this notation, the exterior derivative is an R-linear map 1 d : C∞(M) Ω (M) → 1 from the space C∞(M) of functions on M to the space Ω (M) of 1- forms on M. 24.7. Extending gradient and exterior derivative Given a function f on an inner product space V , one defines the gradient f of f by setting ∇ f,X = Xf, h∇ i for every vector X. Note that we have a canonical isomorphism ♭ ♭ : V V ∗, X X . → 7→ 24.8. PLAN FOR BUILDING DE RHAM COHOMOLOGY 195 The isomorphism ♭ is defined by setting X♭(Y )= X,Y , h i for every vector Y V . The inverse is denoted ∈ ♯ ♯ : V ∗ V, ω ω , → 7→ for every 1-form ω. If the coordinates are chosen in such a way that ∂ ∂ the basis ( ∂u1 ,..., ∂un ) is an orthonormal basis of TpM, then ∂ ♭ = dui, ∂ui and ∂ (dui)♯ = . ∂ui Theorem 24.7.1. The exterior derivative of a function f at p M is related to the gradient of f at p M as follows: ∈ ∈ f = ♯(df), ∇ at every point p of the surface M. The proof is immediate in an orthonormal basis. 24.8. Plan for building de Rham cohomology The construction of de Rham cohomology of a manifold involves several stages. It is helpful to keep these stages in mind as we build up the relevant mathematical machinery step-by-step. We start with linear algebra and end with linear algebra. (1) Linear-algebraic stage: from a vector space V to its exterior algebra ΛV = ΛkV. ⊕k (2) Topological stage: from a smooth manifold M to its tangent bundle T M and its cotangent bundle T ∗M. (3) A vector field as a section of T M. (4) A section of T ∗M is a differential 1-form on M. k (5) The exterior bundle ΛM = kΛ (T ∗M) is the bundle of ex- terior algebras parametrized⊕ by points of M. This is a finite- dimensional manifold. k (6) A section of Λ (T ∗M) is a differential k-form on M. k k (7) The space Ω (M) of sections of Λ (T ∗M) is an infinite-dimensional vector space: we are back to linear algebra. (8) The exterior derivative d turns Ω∗(M) into a differential graded algebra. 196 24. BUNDLES 24.9. Exterior algebra, area, determinant, rank The exterior product or wedge product of vectors is an algebraic construction generalizing certain features of the cross product to higher dimensions. Like the cross product, and the scalar triple product, the exterior product of vectors is used in Euclidean geometry to study areas, vol- umes, and their higher-dimensional analogs. In linear algebra, the exterior product provides a basis-independent manner for describing the determinant and the minors of a linear trans- formation, and is fundamentally related to ideas of rank and linear independence. The exterior algebra over a vector space V is generated by an oper- ation called exterior product. It is a key building block in the definition of the algebra of differential forms. 24.10. Formal definition, alternating property Formally, the exterior algebra is a certain unital associative (chok kibutz) algebra over the field K, including V as a subspace. It is denoted by Λ(V ) and its multiplication is also known as the wedge product or the exterior product and is written as ∧ Definition 24.10.1. The wedge product is an associative and bi- linear operation: : Λ(V ) Λ(V ) Λ(V ), ∧ × → sending (α, β) α β, 7→ ∧ with the essential feature that it is alternating on V : v v = 0 for all v V. (24.10.1) ∧ ∈ This implies in particular u v = v u (24.10.2) ∧ − ∧ for all u, v V , and ∈ v v v = 0 (24.10.3) 1 ∧ 2 ∧···∧ k whenever v ,...,v V are linearly dependent. 1 k ∈ 24.12. AREAS IN THE PLANE 197 Remark 24.10.2. Unlike associativity and bilinearity which are re- quired for all elements of the algebra Λ(V ), the last three properties are imposed only on the elements of the subspace V of the algebra. The defining property (24.10.1) and property (24.10.3) are equivalent; properties (24.10.1) and (24.10.2) are equivalent unless the character- istic of K is two. 24.11. Example: rank 1 The exterior algebra Λ(R) over the vector space V = R1 is 2- dimensional, and isomorphic to the subalgebra of the algebra of 2 by 2 matrices consisting of uppertriangular matrices with a pair of identical eigenvalues, i.e., matrices x y 0 x where x, y R. As a vector space, the algebra is the sum Λ(R) = ∈ R 1 R e1, where e1 is the matrix · ⊕ · 0 1 e = . 1 0 0 2 The matrix e1 is nilpotent: e1 = 0, as required by (24.10.1). Here V = R e1. 24.12. Areas in the plane The purpose of this section is to motivate the skewness of the ex- terior product on vectors in V . The area of a parallelogram can be expressed in terms of the deter- minant of the matrix of coordinates of two of its vertices. The Cartesian plane R2 is a vector space equipped with a basis. The basis consists of a pair of unit vectors e1 = (1, 0), e2 = (0, 1). Suppose that v = v1e1 + v2e2, w = w1e1 + w2e2 are a pair of given vectors in R2, written in components. There is a unique parallelogram having v and w as two of its sides. The area of this parallelogram is given by the standard determinant formula: A = det v w = v1w2 v2w 1 . | − | 198 24. BUNDLES Consider now the exterior product of v and w and exploit the properties stipulated above: v w =(v e + v e ) (w e + w e ) ∧ 1 1 2 2 ∧ 1 1 2 2 = v w e e + v w e e + v w e e + v w e e , 1 1 1 ∧ 1 1 2 1 ∧ 2 2 1 2 ∧ 1 2 2 2 ∧ 2 so that v w =(v w v w )e e , ∧ 1 2 − 2 1 1 ∧ 2 where the first step uses the distributive law for the wedge product, and the last uses the fact that the wedge product is alternating. Note that the coefficient in this last expression is precisely the determinant of the matrix [v w]. The fact that this may be positive or negative has the intuitive meaning that v and w may be oriented in a counterclockwise or clockwise sense as the vertices of the parallelogram they define. Such an area is called the signed area of the parallelogram: the absolute value of the signed area is the ordinary area, and the sign determines its orientation. The fact that this coefficient is the signed area is not an accident. In fact, it is relatively easy to see that the exterior product should be related to the signed area if one tries to axiomatize this area as an algebraic construct. In detail, if A(v,w) denotes the signed area of the parallelogram determined by the pair of vectors v and w, then A must satisfy the following properties: (1) A(av, bw) = abA(v,w) for any real numbers a and b, since rescaling either of the sides rescales the area by the same amount (and reversing the direction of one of the sides reverses the orientation of the parallelogram). (2) A(v, v) = 0, since the area of the degenerate (mathemat- ics)—degenerate parallelogram determined by v (i.e., a line segment) is zero. (3) A(w, v)= A(v,w), since interchanging the roles of v and w reverses the− orientation of the parallelogram. (4) A(v + aw,w) = A(v,w), since adding a multiple of w to v affects neither the base nor the height of the parallelogram and consequently preserves its area. (5) A(e1, e2) = 1, since the area of the unit square is one. With the exception of the last property, the wedge product satis- fies the same formal properties as the area. In a certain sense, the wedge product generalizes the final property by allowing the area of a parallelogram to be compared to that of any “standard” chosen paral- lelogram. In other words, the exterior product in two-dimensions is a 24.13. VECTOR AND TRIPLE PRODUCTS 199 basis-independent formulation of area. This axiomatization of areas is due to Leopold Kronecker and Karl Weierstrass. 24.13. Vector and triple products For vectors in R3, the exterior algebra is closely related to the vector product and triple product. Using the standard basis e1, e2, e3 , the wedge product of a pair of vectors { } u = u1e1 + u2e2 + u3e3 and v = v1e1 + v2e2 + v3e3 is u v =(u v u v )(e e )+(u v u v )(e e )+(u v u v )(e e ), ∧ 1 2− 2 1 1∧ 2 1 3− 3 1 1∧ 3 2 3− 3 2 2∧ 3 where e1 e2, e1 e3, e2 e3 is the basis for the three-dimensional { ∧3 ∧ ∧ } space Λ2(R ). This imitates the usual definition of the vector product of vectors in three dimensions. Bringing in a third vector w = w1e1 + w2e2 + w3e3, the wedge product of three vectors is u v w =(u v w +u v w +u v w u v w u v w u v w )(e e e ), ∧ ∧ 1 2 3 2 3 1 3 1 2− 1 3 2− 2 1 3− 3 2 1 1∧ 2∧ 3 3 3 where e1 e2 e3 is the basis vector for the one-dimensional space Λ (R ). This imitates∧ ∧ the usual definition of the triple product. The vector product and triple product in three dimensions each admit both (1) geometric and (2) algebraic interpretations. The vector product u v can be interpreted as a vector which is perpendicular to both u and×v and whose magnitude is equal to the area of the parallelogram determined by the two vectors. It can also be interpreted as the vector consisting of the minors of the matrix with columns u and v. The triple product of u, v, and w is geometrically a (signed) volume. Algebraically, it is the determinant of the matrix with columns u, v, and w. The exterior product in three- dimensions allows for similar interpretations. In fact, in the presence of a positively oriented orthonormal basis, the exterior product generalizes these notions to higher dimensions. 200 24. BUNDLES 24.14. Anticommutativity of the wedge product Theorem 24.14.1. The wedge product is anticommutative on ele- ments of V . Proof. Let x, y V . Then ∈ 0=(x + y) (x + y)= x x + x y + y x + y y = x y + y x. ∧ ∧ ∧ ∧ ∧ ∧ ∧ Hence x y = y x. ∧ − ∧ More generally, if x1, x2, ..., xk are elements of V , and σ is a permu- tation of the integers [1, ..., k], then x x x = sgn(σ)x x x , σ(1) ∧ σ(2) ∧···∧ σ(k) 1 ∧ 2 ∧···∧ k where sgn(σ) is the signature of a permutation—signature of the per- mutation σ. A proof of this can be found in greater generality in Bourbaki (1989). 24.15. The k-exterior power; simple (decomposable) multivectors The k-th exterior power of V , denoted Λk(V ), is the vector subspace of Λ(V ) spanned by elements of the form x x x , x V, i =1, 2, . . . , k. 1 ∧ 2 ∧···∧ k i ∈ Definition 24.15.1. If α Λk(V ), then α is said to be a k- multivector. If, furthermore, α∈can be expressed as a wedge product of k elements of V , then α is said to be decomposable, or simple. Although decomposable multivectors span Λk(V ), not every ele- ment of Λk(V ) is decomposable. Example 24.15.2. In R4, the following 2-multivector is not decom- posable: α = e e + e e . (24.15.1) 1 ∧ 2 3 ∧ 4 (This is in fact a symplectic form, since α α = 0. See S. Sternberg [?, III.6]. ∧ 6 24.17. RANK OF A MULTIVECTOR 201 24.16. Basis and dimension of exterior algebra Theorem 24.16.1. If the dimension of V is n and e1, ..., en is a basis of V , then the set e e e 1 i < i < < i n (24.16.1) { i1 ∧ i2 ∧···∧ ik | ≤ 1 2 ··· k ≤ } is a basis for Λk(V ). Proof. Given any wedge product of the form v v 1 ∧···∧ k then every vector vj can be written as a linear combination of the basis vectors ei; using the bilinearity of the wedge product, this can be expanded to a linear combination of wedge products of those basis vectors. Any wedge product in which the same basis vector appears more than once is zero; any wedge product in which the basis vectors do not appear in the proper order can be reordered, changing the sign whenever two basis vectors change places. In general, the resulting coefficients of the basis k-vectors can be computed as the minors of the matrix that describes the vectors vj in terms of the basis ei. By counting the basis elements, the dimension of Λk(V ) is the bi- nomial coefficient n n! = . k k!(n 1)! − In particular, Λk(V )=0 for k > n. Theorem 24.16.2. The dimension of Λ(V ) equals 2n. Proof. Any element of the exterior algebra can be written as a sum of multivectors. Hence, as a vector space the exterior algebra is a direct sum Λ(V )=Λ0(V ) Λ1(V ) Λ2(V ) Λn(V ) ⊕ ⊕ ⊕···⊕ (where by convention Λ0(V ) = K and Λ1(V ) = V ), and therefore its dimension is equal to the sum of the binomial coefficients, which is 2n. 24.17. Rank of a multivector If α Λk(V ), then it is possible to express α as a linear combination of decomposable∈ multivectors: α = α(1) + α(2) + + α(s) ··· 202 24. BUNDLES where each α(i) is decomposable, say α(i) = α(i) α(i), i =1, 2, . . . , s. 1 ∧···∧ k Definition 24.17.1. The rank of the multivector α is the minimal number of decomposable multivectors in such an expansion of α. Rank is particularly important in the study of 2-multivectors. Theorem 24.17.2. The rank of a 2-multivector α equals half the rank of the matrix of coefficients of α in a basis. Thus if ei is a basis for V, then α can be expressed uniquely as α = a e e ij i ∧ j i,j X where a = a (the matrix of coefficients is skew-symmetric). The ij − ji rank of α agrees with the rank of the matrix (aij). For example, the symplectic from (24.15.1) has rank 2, whereas the rank of its matrix of coefficients is 4. Theorem 24.17.3. In characteristic 0, the 2-multivector α has rank p if and only if the p-fold product satisfies α α =0 ∧···∧ 6 p and | {z } α α =0. ∧···∧ p+1 | {z } CHAPTER 25 Multilinear forms; exterior differential complex 25.1. Formal construction of the exterior algebra The tensor product V W of two vector spaces V and W over a field K is defined as follows.⊗ Consider the set of ordered pairs in the Cartesian product V W . For the purposes of this construction, regard this Cartesian product× as a set rather than a vector space. Definition 25.1.1. The free vector space F on V W is the vector space in which the elements of V W are a basis. × × Thus, F (V W ) is the collection of all finite linear combinations of the form × F (V W )= α e α K, (v ,w ) V W , × i (vi,wi) i ∈ i i ∈ × where we have used the symbol e( v,w) to emphasize that these are taken to be linearly independent by definition for distinct pairs (v,w) V W. ∈ × The tensor product arises by defining the following equivalence re- lations in F (V W ): × e e + e (v1+v2,w) ∼ (v1,w) (v2,w) e e + e (v,w1+w2) ∼ (v,w1) (v,w2) ce e (v,w) ∼ (cv,w) ce e (v,w) ∼ (v,cw) where v1, v2 V , w1,w2 W , and c K. Denote by R F (V W ) the space generated∈ by these∈ four equivalence∈ relations, in⊆ other words,× by the differences e(v1+v2,w) (e(v1,w) + e(v2,w)) e − (e + e ) (v,w1+w2) − (v,w1) (v,w2) ce(v,w) e(cv,w) ce − e (v,w) − (v,cw) Definition 25.1.2. The tensor product of the two vector spaces V and W is then the quotient space V W = F (V W )/R. ⊗ × 203 204 25. MULTILINEAR FORMS; EXTERIOR DIFFERENTIAL COMPLEX One readily checks with respect to a basis that the dimensions of V and W multiply: dim(V W ) = dim V dim W. ⊗ Definition 25.1.3. The exterior product V W of V and W is the the vector space obtained as the quotient of∧V W by the idea generated by all the sums v w + w v. ⊗ ⊗ ⊗ The image of v w under the projection V W V W is ⊗ ⊗ → ∧ denoted v w. Denote by (ei) a standard basis for V . One readily checks that∧e e , where i < j, form a basis for i ∧ j V V =Λ2(V ). ∧ Similar remarks apply to the higher exterior powers Λk(V ). 25.2. Exterior differential complex Associating to every cotangent space Tp∗ its exterior algebra Λ(Tp∗), we obtain the exterior bundle ΛM := Λ(T ∗M). k Similarly putting together the k-exterior powers Λ Tp∗, we obtain k k the subbundle Λ M := Λ (T ∗M). Definition 25.2.1. A differential k-form is a section of the ex- terior bundle ΛkM of k-multivectors. The space of such sections is denoted Ωk(X). Recall that we have an exterior derivative d defined on functions f by ∂f df = dui, (25.2.1) ∂ui with Einstein summation convention, see (24.6.1). A similar formula defines the exterior derivative for an arbitrary k-form. Theorem 25.2.2. The exterior derivative d gives rise to an exterior differential complex 1 2 n 0 C∞(M) Ω (M) Ω (M) ... Ω (M) 0, → → → → → → satisfying d2 = d d =0. ◦ For instance, the differential d : Ω1(M) Ω2(M) is defined on a 1-form fdu by → d(fdu)= df du. ∧ Remark 25.2.3. The exterior differential complex leads to the def- inition of de Rham cohomology of M, see Section 28.9. 25.3. ALTERNATING MULTILINEAR FORMS 205 25.3. Alternating multilinear forms Let V be a vector space over a field K. Let k N. ∈ Definition 25.3.1. An alternating multilinear function f : V k K (25.3.1) → is called an alternating multilinear form on V . The set of all alternating k-multilinear forms is is a vector space, as the sum of two such maps, or the multiplication of such a map with a scalar, is again alternating. 1 Note that for k = 1, we obtain the space V ∗ =Λ (V ∗). In general, we have the following duality. Theorem 25.3.2. If V has finite dimension n, then the space of k alternating multilinear forms can be identified with Λ (V ∗), where V ∗ denotes the dual space of V . In particular, the dimension of the space of anti-symmetric maps k n from V to K is the binomial coefficient k . Proof. Let us make such an identification explicit. Consider a k- tuple x =(x ,...,x ) V k. 1 k ∈ j We will denote by y , elements of the dual space V ∗. Given a decomposable element 1 k k y = y ... y Λ V ∗, ∧ ∧ ∈ we would like to define a multilinear map f = fy as in (25.3.1), associ- ated with y. We define it by setting 1 k fy(x1,...,xk)= sgn(σ) y (xσ(1)) ...y (xσ(k)). (25.3.2) σ S X∈ k In other words, i fy(x) = det(y (xj)) 1 is a k k-determinant. Note that we do not multiply by k! . In this we follow× S. Sternberg [?, p. 20]. k k Theorem 25.3.3. The space Λ (V ∗) is naturally dual to Λ (V ). The duality is given on simple elements of Λk(V ) by the pairing φ , v ... v = φ(v ,...,v ). h 1 ∧ ∧ ki 1 k The case k = 2 will be examined in detail in the next section. 206 25. MULTILINEAR FORMS; EXTERIOR DIFFERENTIAL COMPLEX 25.4. Case of 2-forms 1 2 Let y ,y V ∗ be 1-forms. Consider the decomposable (simple) exterior 2-form∈ 1 2 2 ω = y y Λ (V ∗). ∧ ∈ Then ω defines an alternating bilinear function f : V 2 K ω → defined as follows. Let u, v V . Following (25.3.2), we set ∈ y1(u) y1(v) f (u, v)= y1(u)y2(v) y1(v)y2(u) = det ω − y2(u) y2(v) i i where y (u) is the evaluation of the covector y V ∗ on the vector u V . ∈ ∈ Example 25.4.1. In the (x, y)-plane, the form ω = dx dy defines ∧ a bilinear function fω whose geometric meaning is the signed area of the parallelogram spanned by the pair of vectors u, v. Proposition 25.4.2. The value of fω on the pair u, v in fact de- pends only on its image ξ = u v in Λ2(V ). ∧ Proof. The proof is immediate from the interpretation of fω(u, v) as a generalized signed area of the parallelogram spanned by u and v. 2 2 Namely, fω is proportional to the standard area form dx dy since Λ (R )∗ is 1-dimensional, and dx dy calculates the area A if the∧ parallelogram ∧ 1 2 2 2 spanned by u and v, where u v = A e e Λ (R ). ∧ ∧ ∈ Remark 25.4.3. We will sometimes write ω(ξ) in place of fω(u, v), where ξ = u v. Thus by definition, ∧ ω(ξ)= fω(u, v). 25.5. Norm on 1-forms Given a norm on a vector space V , there is a natural norm on kk the dual space V ∗, defined as follows. We will denote the new norm ∗ for the purposes of this section. kk Definition 25.5.1. The norm ∗ on V ∗ is defined for y V ∗ by setting kk ∈ y ∗ = sup y(x) x V, x 1 . k k { | ∈ k k ≤ } In other words, we calculate the dual norm of the covector y by maximizing its value over vectors x V of norm at most 1. ∈ 25.6. POLAR COORDINATES AND DUAL NORMS 207 It is clear that the inequality x 1 can be replaced by equality in this definition: k k ≤ y ∗ = sup y(x) x V, x =1 , k k { | ∈ k k } i.e. the norm of y V ∗ can be calculated over the unit vectors x V . ∈ ∈ Example 25.5.2. If ∂ ∂ , ∂x ∂y is an orthonormal basis for the Euclidean plane V = R2, then the dual space V ∗ admits an orthonormal basis (dx,dy) ∂ ∂ ∂ which is dual to the basis , . Indeed, dx( ) = 1 and dx ∗ = 1. ∂x ∂y ∂x k k 25.6. Polar coordinates and dual norms The Euclidean metric in the plane E defines a norm on the tangent plane by the usual identification of a space and its tangent space. Lemma 25.6.1. The 1-form xdy ydx has norm r. − Proof. This is immediate from the fact that dx and dy are or- thonormal. Consider the polar coordinates (r, θ) in the plane. Note that θ is undefined at the origin. Then (dr, dθ) is a basis for the cotangent plane at every point of E 0 . However, the basis is not orthonormal. \{ } y Since tan θ = x , we have x tan θ = y, hence dθ/dx tan θ + x = dy/dx, cos2 θ so that dθ tan θ dx + x = dy. cos2 θ Multiplying by r cos θ, we obtain x r sin θ dx + rdθ = r cos θdy, cos θ or ydx + r2dθ = xdy. 208 25. MULTILINEAR FORMS; EXTERIOR DIFFERENTIAL COMPLEX We then obtain r2dθ = xdy ydx. By lemma 25.6.1, we obtain − 1 dθ ∗ = , k k r so that it is the covector rdθ that has unit norm. It follows that in the (r, θ) coordinates, we obtain an orthonormal basis for the cotangent space consisting of the pair of covectors (dr, rdθ). 25.7. Dual bases and dual lattices in a Euclidean space The discussion is similar to the situation with a pair of dual vector spaces treated in Section 19.5. Thus, let V be a Euclidean vector space equipped with a scalar product (inner product), denoted , . Given h i a basis (xi) for V , there exists a unique basis (yj) satisfying x ,y = δ , h i ji ij n where δij is the Kronecker delta function. We will view R as equipped with its standard dot product, denoted x, y . h i n Definition 25.7.1. Given a lattice L R , the dual lattice L∗ n ⊆ ⊆ R is the lattice n L∗ = y R x L, x, y Z . { ∈ | ∀ ∈ h i ∈ } Thus the pairing between a point in a lattice and a point in its dual is by definition always an integer. Theorem 25.7.2. Consider a Z-basis (xi) for L. Consider the ba- sis (yj) dual to the basis (xi). Then (yj) is a Z-basis for the dual lattice L∗. Definition 25.7.3. Given a lattice L Rn with its Euclidean norm , denote by λ (L) the least length of⊆ a nonzero vector in L: || 1 λ (L) = min v v L 0 . 1 | | ∈ \{ } We now consider the rank 1 case. Proposition 25.7.4. Let L R be a lattice and L∗ R the dual lattice of L. Then ⊆ ⊆ 1 λ1(L∗)= . (25.7.1) λ1(L) Proof. Given a lattice L R, denote by α > 0 its generator, so ⊆ that L = Z α. For an element y R to pair integrally with α, it must 1 ∈ be a multiple of β = α . Thus L∗ = Z β 25.7. DUAL BASES AND DUAL LATTICES IN A EUCLIDEAN SPACE 209 1 and λ1(L∗)= β = α . Remark 25.7.5. Note that the reciprocal relation (25.7.1) does not hold in general for rank greater than 1. Already in rank 2, the product λ1(L∗)λ1(L) can be greater than 1. For the Eisenstein lattice LE C ⊆ spanned by the cube roots of unity, we obtain the value λ1(LE∗ )λ1(LE)= 2/√3. CHAPTER 26 Comass norm, Hermitian product, symplectic form There are two distinct natural norms on k-multivectors: the Eu- cldean norm and the comass norm. First we complete the material on exterior algebra. 2 2 n 26.1. Λ (V ) is isomorphic to the standard model Λ0(R ) We let 2 n Λ0(R ) = Span ei ej :1 i < j n , 2 n n { ∧ ≤ ≤ } so that dim Λ0(R )= 2 . Let V be an n-dimensional vector space. As in Section 25.1, we set F (V ) = Span e : v,w V , (v,w) ∈ and define the equivalence relation as in Section 25.1, with the ad- ditional relation stipulating that ∼ e e , (26.1.1) v,w ∼ − w,v and define Λ2(V ) := F (V )/ . ∼ Lemma 26.1.1. We have dim Λ2(V ) n . ≤ 2 Proof. Choose a basis a1,...,an for V . Each v V decomposes i i ∈ as a sum v = v ai with v R. Then ∈ i j i j i j e = e i j v w e v w e + v w e . (v,w) (v ai,w aj ) ∼ (ai,aj ) ∼ (ai,aj ) (ai,aj ) i 2 2 n Theorem 26.1.2. Λ (V ) is isomorphic to Λ0(R ). 211 212 26. COMASS NORM, HERMITIAN PRODUCT, SYMPLECTIC FORM Proof. We define a map ζ :Λ2(V ) Λ2(Rn) → 0 by setting ζ(e )= (viwj vjwi)e e Λ2(V ). (v,w) − i ∧ j ∈ 0 i