Gravity, a Geometrical Course

Volume 1: Development of the Theory and Basic Physical Applications

Pietro Giuseppe Frè
Dipartimento di Fisica Teorica
University of Torino
Torino, Italy

Additional material to this book can be downloaded from http://extras.springer.com.

ISBN 978-94-007-5360-0
ISBN 978-94-007-5361-7 (eBook)
DOI 10.1007/978-94-007-5361-7
Springer Dordrecht Heidelberg New York London

Library of Congress Control Number: 2012950601

© Springer Science+Business Media Dordrecht 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

This book is dedicated to my beloved daughter Laura and to my darling wife Olga.

Preface

This book grew out of the Lecture Notes of the course in General Relativity which I gave for more than 15 years at the University of Torino. That course has a long tradition, since it was attached to the Chair of Relativity created at the beginning of the 1960s for prof. Tullio Regge. In the years 1990–1996, while prof. Regge was a Member of the European Parliament, the course was given by my long-time excellent friend and collaborator prof. Riccardo D’Auria. In 1996 I had the honor to be appointed to Regge’s chair¹ and I left SISSA of Trieste to take up this momentous and challenging legacy. Feeling the burden of the task laid on my shoulders, I humbly tried to do my best and create a new course which might keep alive the tradition established by my so distinguished predecessors. In my efforts to cope with the expected standards, I obviously introduced my own choices, viewpoints and opinions, which are widely reflected in the present book. The length of the original course was about 120 hours (without exercises). In the new 3 + 2 system introduced by the Bologna agreements it was split into two courses but, apart from minor readjustments, I continued to consider them just as part one and part two of a unique entity. This was not a random choice; it sprang from the views that inspired my teaching and the present book. I always held the opinion that University courses should be long, complex and articulated in many aspects. They should not aim at a quick transmission of calculating abilities and of ready-to-use information; rather they should be as much formative as informative. They should offer a general overview of a subject as seen by the professor, in this way giving the students the opportunity of developing their own opinions through the critical absorption of those of the teacher.

¹ At that time Regge had shifted from the University to the Politecnico of Torino.

One aspect that I always considered essential is the historical one, concerning on one side the facts, the lives and the personalities of the scientists who shaped our present understanding, and on the other side the usually intricate development of fundamental ideas.

The second aspect to which I paid a lot of attention is the use of an updated and, as much as possible, rigorous mathematical formalism. Moreover I always tried to convey the view that Mathematics should not be regarded as a technical tool for the solution of Physical Problems or simply as a language for the formulation of Physical Laws, but rather as an essential integral part of the whole game.

The third aspect, taken not only into account but also into prominence, is the emphasis on important physical applications of the theory: not just exercises, from which I completely refrained, but the full-fledged ab initio development of relevant applications in Astrophysics, Cosmology or Particle Theory. The aim was that of showing, from A to Z, how one goes from first principles to the actual prediction of experimentally verifiable numbers. For the reader’s or student’s convenience I included the listing of some computer codes, written in MATHEMATICA, that solve some of the posed problems or parts thereof. The aim was, once again, formative. In the course of their theoretical studies the students should develop the ability to implement formal calculations on a machine, freeing themselves from the slavery to accidental errors and focusing instead all their mental energies on conceptual points. Furthermore, implementation of formulae in a computer code is the real test of their comprehension by the learners, more efficient in its task than any ad hoc prepared exercise.

As for the actual choice of the included and developed material, I was inspired by the following view on the role of the course I used to give, which I extended as a mission to the present book. General Relativity, Quantum Mechanics, Gauge Theories and Statistical Mechanics are the four pillars of the Physical Thought developed in the XX century.
That century also laid the foundations for new theories, whose actual relations with the experimental truth and with observations will be clarified only in the present millennium, but whose influence on current thought is so profound that no one approaching theoretical studies can ignore them: I refer to supersymmetry, supergravity, strings and branes. The role of the course in General Relativity, which I assumed as given, was not only that of presenting Einstein’s Theory, in its formulation, historical development and applications, but also that of comparing the special structure of Gravity with the structure of the Gauge Theories describing the other fundamental interactions. This was specially aimed at the development of critical thinking in the student and as a tool of formative education, preparatory to the study of unified theories.

The present one is a Graduate Text Book, but it is also meant to be a self-contained account of Gravitational Theory, attractive for the person with a basic scientific education and a curiosity for the topic who would like to learn it from scratch, being his/her own instructor.

Just as the original course given in Torino after the implementation of the Bologna agreements, this book is divided into two volumes:

1. Volume 1: Development of the Theory and Basic Physical Applications.
2. Volume 2: Black Holes, Cosmology and Introduction to Supergravity.

Volume 1, starting from a summary of Special Relativity and a sketchy historical introduction of its birth, given in Chap. 1, develops the general current description of the physical world in terms of gauge connections and sections of the bundles on which such connections are constructed. The special role of Gravity as the gauge theory of the tangent bundle to the base of all other bundles is emphasized. The mathematical foundations of the theory are contained in Chaps. 2 and 3.
Chapter 2 introduces the basic notions of differential geometry, the definition of manifolds and fibre-bundles, differential forms, vector fields, homology and cohomology. Chapter 3 introduces the theory of connections and metrics. It includes an extensive historical account of the development of mathematical and physical ideas which eventually led to both general relativity and modern gauge theories of the non-gravitational interactions. The notion of geodesics is introduced and exemplified with the detailed presentation of a pair of examples in two dimensions, one with Lorentzian signature, the other with Euclidean signature. Chapter 4 is devoted to the Schwarzschild metric. It is shown how geodesics of the Schwarzschild metric retrieve the whole building of Newtonian Physics plus corrections that can be very tiny in weak gravitational fields, like that of the Solar System, or gigantic in strong fields, where they lead to qualitatively new physics. The classical tests of General Relativity are discussed here, in particular the perihelion advance and the bending of light rays. Chapter 5 introduces the Cartan approach to differential geometry, the vielbein and the spin connection, discusses Bianchi identities and their relation with gauge invariances and eventually introduces Einstein field equations. The dynamical equations of gravity and their derivation from an action principle are developed in parallel with their analogues for electrodynamics and non-Abelian gauge theories, whose structure and features are constantly compared to those of gravity. The linearization of Einstein field equations and the spin of the graviton are then discussed. After that the bottom-up approach to gravity is discussed: following Feynman’s ideas, it is shown how a special relativistic linear theory of the graviton field could be uniquely inferred from the conservation of the stress-energy tensor and how its non-linear upgrading follows, once the stress-energy tensor of the gravitational field itself is taken into account.
The last section of Chap. 5 contains the derivation of the Schwarzschild metric from Einstein equations. Chapter 6 addresses the issue of stellar equilibrium in General Relativity, derives the Tolman-Oppenheimer-Volkoff equation and the corresponding mass limits. Next, considering the role of quantum mechanics, the Chandrasekhar mass limits for white dwarfs and neutron stars are derived. Chapter 7 is devoted to the emission of gravitational waves and to the tests of General Relativity based on the slowing down of the period of double star systems.

In Volume 2, after a short introductory chapter, the following two chapters are devoted to Black Holes. In Chap. 2 we begin with a historical account of the notion of black holes from Laplace to the present identification of stellar mass black holes in the galaxy and elsewhere. Next the Kruskal extension of the Schwarzschild solution is considered in full detail, preceded by the pedagogical toy example of Rindler space-time. Basic concepts about Future, Past and Causality are introduced next. Conformal Mappings, the Causal Structure of infinity and Penrose diagrams are discussed and exemplified.

Chapter 3 deals with rotating black holes and the Kerr-Newman metric. The usually skipped form of the spin connection and of the Riemann tensor of this metric is calculated and presented in full detail, together with the electric and magnetic field strengths associated with it in the case of a charged hole. This is followed by a careful discussion of the static limit, of locally non-rotating observers, of the horizon and of the ergosphere. In a subsequent section the geodesics of the Kerr metric are studied by using the Hamilton-Jacobi method and the system is shown to be Liouville integrable with the derivation of the fourth Hamiltonian (the Carter constant) completing the needed shell of four, together with the energy, the angular momentum and the mass. The last section contains a discussion of the analogy between the Laws of Thermodynamics and those of Black Hole dynamics, including the Bekenstein-Hawking entropy interpretation of the horizon area.

Chapters 4 and 5 are devoted to cosmology. Chapter 4 contains a historical outline of modern Cosmology, starting from Kant’s proposal that nebulae might be different island-universes (galaxies in modern parlance) and reaching the current space missions that have measured the Cosmic Microwave Background anisotropies. The crucial historical steps in building up the modern vision of a huge expanding Universe, which is even accelerating at the present moment, are traced back in some detail. From the Olbers paradox to the discovery of the stellar parallax by Bessel, to the Great Debate of 1920 between Curtis and Shapley, the enlargement of the human estimate of the Universe’s size is historically reported. The discovery of the Cepheid law by Henrietta Leavitt, the first determination of the distance to nearby galaxies by Hubble and finally the first measurement of the universal cosmic recession are the next episodes of this tale. The discovery of the CMB radiation, predicted by Gamow, the hunt for its anisotropies and the recent advent of the Inflationary Universe paradigm are the subsequent landmarks, which are reported together with biographical touches on the lives and personalities of the principal actors in this exciting adventure of human thought.
Chapter 5, entitled Cosmology and General Relativity: Mathematical Description of the Universe, provides a full-fledged introduction to Relativistic Cosmology. The chapter begins with a long mathematical interlude on the geometry of coset manifolds. These notions are necessary for the mathematical formulation of the Cosmological Principle, stating homogeneity and isotropy, but have a much wider spectrum of applications. In particular they will be very important in the subsequent chapters about Supergravity. Having prepared the stage with these mathematical preliminaries, the next sections deal with homogeneous but not isotropic cosmologies. The Bianchi classification of three-dimensional Lie groups is recalled, Bianchi metrics are defined and, within Bianchi type I, the Kasner metrics are discussed with some glimpses of the cosmic billiards realized in Supergravity. Next, as a pedagogical example of a homogeneous but not isotropic cosmology, an exact solution, with and without matter, of Bianchi type II space-time is treated in full detail. After this, we proceed to the Standard Cosmological Model, including both homogeneity and isotropy. Friedmann equations, all their implications and known solutions are discussed in detail, and special attention is given to the embedding of the three types of standard cosmologies (open, flat and closed) into de Sitter space. The concepts of particle and event horizons are discussed next, together with the derivation of exact formulae for red-shift distances. The conceptual problems (horizon and flatness) of the Standard Cosmological Model are then discussed as an introduction to the new inflationary paradigm. The basic inflationary model based on one scalar field and the slow rolling regime are addressed in the following sections with fully detailed calculations.
Perturbations, the spectrum of fluctuations up to the evaluation of the spectral index and the principles of comparison with the CMB data form the last part of this very long chapter.

The last four chapters of the book provide a conceptual, mathematical and descriptive introduction to Supergravity, namely to the Beyond GR World.

Chapter 6 starts with a historical outline that describes the birth of supersymmetry both in String Theory and in Field Theory, touching also on the biographies and personalities of the theorists who contributed to create this entirely new field through a complicated and, as usual, far from straight path. The chapter then proceeds with the conceptual foundations of Supergravity, in particular with the notion of Free Differential Algebras and with the principle of rheonomy. Sullivan’s structural theorems are discussed and it is emphasized how the existence of p-forms, which close the supermultiplets of fundamental fields appearing in higher dimensional supergravities, is at the end of the day a consequence of the superPoincaré Lie algebras through their cohomologies. The structure of M-theory, the constructive principles to build supergravity Lagrangians and the fundamental role of Bianchi identities are emphasized. The last two sections of the chapter contain a thorough account of type IIA and type IIB supergravities in D = 10, the structure of their FDAs, the rheonomic parameterization of their curvatures and the full-fledged form of their field equations.

Chapter 7 deals with the brane/bulk dualism. The first section contains a conceptual outline where the three-sided view of branes as 1) classical solitonic solutions of the bulk theory, 2) world-volume gauge theories described by suitable world-volume actions endowed with κ-supersymmetry and 3) boundary states in the superconformal field theory description of superstring vacua is spelled out.
Next a New First Order Formalism, invented by the author of this book at the beginning of the XXI century and allowing for an elegant and compact construction of κ-supersymmetric Born-Infeld type world-volume actions on arbitrary supergravity backgrounds, is described. It is subsequently applied to the case of the D3-brane, both as an illustration and for its intrinsic relevance in the gauge/gravity correspondence. The last sections of the chapter are devoted to the presentation of branes as classical solitonic solutions of the bulk theory. General features of the solutions in terms of harmonic functions are presented, including also a short review of domain walls and a sketchy description of the Randall-Sundrum mechanism.

Chapter 8 is a bestiary of Supergravity Special Geometries associated with its scalar sector. The chapter clarifies the codifying role of the scalar geometry in constructing the bosonic part of a supergravity Lagrangian. The dominant role among the scalar manifolds of homogeneous symmetric spaces is emphasized, illustrating the principles that allow the determination of such U/H cosets for any supergravity theory. The mechanism of symplectic embedding, which allows one to extend the action of U-isometries from the scalar to the vector field sector, is explained in detail within the general theory of electric/magnetic duality rotations. Next the chapter provides a self-contained summary of the most important special geometries appearing in D = 4 and D = 5 supergravity, namely Special Kähler Geometry, Very Special Real Geometry and Quaternionic Geometry.

Chapter 9 presents a limited anthology of supergravity solutions aimed at emphasizing a few relevant new concepts. Relying on the special geometries described in Chap. 8, a first section contains an introduction to supergravity spherical Black Holes, to the attraction mechanism and to the interpretation of the horizon area in terms of a quartic symplectic invariant of the U-duality group. The second and third sections deal instead with flux compactifications of both M-theory and type IIA supergravity. The main issue is that of the relation between supersymmetry preservation and the geometry of manifolds of restricted holonomy. The problem of supergauge completion and the role of orthosymplectic superalgebras are also emphasized.

Appendices contain the development of gamma matrix algebra necessary for the inclusion of spinors, details on superalgebras and the user guide to Mathematica codes for the computer-aided calculation of Einstein equations.

Moscow, Russia
Pietro Giuseppe Frè
University of Torino
presently Scientific Counselor of the Italian Embassy in Moscow

Acknowledgements

With great pleasure I would like to thank my collaborators and colleagues Pietro Antonio Grassi, Igor Pesando and Mario Trigiante for the many suggestions and discussions we had during the writing of the present book and also for their critical reading of several chapters. Similarly I express my gratitude to the Editors of Springer-Verlag, in particular to Dr. Maria Bellantone, for their continuous assistance, constructive criticism and suggestions. While finishing the writing of these volumes, which occurred during solitary winter weekends in Moscow, my thoughts were frequently directed to my late parents, whom I miss very much and will never forget. To them I also express my gratitude for all that they taught me in their life, in particular to my father who, with his own example, introduced me, since my childhood, to the great satisfaction and deep suffering of writing books. Furthermore it is my pleasure to thank my very close friend and collaborator Aleksander Sorin for his continuous encouragement and for many precious consultations.

Contents

1 Special Relativity: Setting the Stage
  1.1 Introduction
  1.2 Classical Physics Between the End of the XIX and the Dawn of the XX Century
    1.2.1 Maxwell Equations
    1.2.2 Luminiferous Aether and the Michelson Morley Experiment
    1.2.3 Maxwell Equations and Lorentz Transformations
  1.3 The Principle of Special Relativity
    1.3.1 Minkowski Space
  1.4 Mathematical Definition of the Lorentz Group
    1.4.1 The Lorentz Lie Algebra and Its Generators
    1.4.2 Retrieving Special Lorentz Transformations
  1.5 Representations of the Lorentz Group
    1.5.1 The Fundamental Representation
    1.5.2 The Two-Valued Homomorphism SO(1, 3) SL(2, C) in the Four-Dimensional Case
  1.6 Lorentz Covariant Field Theories and the Little Group
    1.6.1 Representations of the Massless Little Group in D = 4
  1.7 Noether’s Theorem, Noether’s Currents and the Stress-Energy Tensor
  1.8 Criticism of Special Relativity: Opening the Road to General Relativity
  References
2 Basic Concepts About Manifolds and Fibre Bundles
  2.1 Introduction
  2.2 Differentiable Manifolds
    2.2.1 Homeomorphisms and the Definition of Manifolds
    2.2.2 Functions on Manifolds
    2.2.3 Germs of Smooth Functions
  2.3 Tangent and Cotangent Spaces
    2.3.1 Tangent Vectors in a Point p ∈ M
    2.3.2 Differential Forms in a Point p ∈ M
  2.4 Fibre Bundles
  2.5 Tangent and Cotangent Bundles
    2.5.1 Sections of a Bundle
    2.5.2 The Lie Algebra of Vector Fields
    2.5.3 The Cotangent Bundle and Differential Forms
    2.5.4 Differential k-Forms
  2.6 Homotopy, Homology and Cohomology
    2.6.1 Homotopy
    2.6.2 Homology
    2.6.3 Homology and Cohomology Groups: General Construction
    2.6.4 Relation Between Homotopy and Homology
  References
3 Connections and Metrics
  3.1 Introduction
  3.2 A Historical Outline
    3.2.1 Gauss Introduces Intrinsic Geometry and Curvilinear Coordinates
    3.2.2 Riemann Introduces n-Dimensional Metric Manifolds
    3.2.3 Parallel Transport and Connections
    3.2.4 The Metric Connection and Tensor Calculus from Christoffel to Einstein, via Ricci and Levi Civita
    3.2.5 Mobile Frames from Frenet and Serret to Cartan
  3.3 Connections on Principal Bundles: The Mathematical Definition
    3.3.1 Mathematical Preliminaries on Lie Groups
    3.3.2 Ehresmann Connections on a Principal Fibre Bundle
  3.4 Connections on a Vector Bundle
  3.5 An Illustrative Example of Fibre-Bundle and Connection
    3.5.1 The Magnetic Monopole and the Hopf Fibration of S3
  3.6 Riemannian and Pseudo-Riemannian Metrics: The Mathematical Definition
    3.6.1 Signatures
  3.7 The Levi Civita Connection
    3.7.1 Affine Connections
    3.7.2 Curvature and Torsion of an Affine Connection
  3.8 Geodesics
  3.9 Geodesics in Lorentzian and Riemannian Manifolds: Two Simple Examples
    3.9.1 The Lorentzian Example of dS2
    3.9.2 The Riemannian Example of the Lobachevskij-Poincaré Plane
  References
4 Motion of a Test Particle in the Schwarzschild Metric
  4.1 Introduction
  4.2 Keplerian Motions in Newtonian Mechanics
  4.3 The Orbit Equations of a Massive Particle in Schwarzschild Geometry
    4.3.1 Extrema of the Effective Potential and Circular Orbits
  4.4 The Periastron Advance of Planets or Stars
    4.4.1 Perturbative Treatment of the Periastron Advance
  4.5 Light-Like Geodesics in the Schwarzschild Metric and the Deflection of Light Rays
  References
5 Einstein Versus Yang-Mills Field Equations: The Spin Two Graviton and the Spin One Gauge Bosons
  5.1 Introduction
  5.2 Locally Inertial Frames and the Vielbein Formalism
    5.2.1 The Vielbein
    5.2.2 The Spin-Connection
    5.2.3 The Poincaré Bundle
  5.3 The Structure of Classical Electrodynamics and Yang-Mills Theories
    5.3.1 Hodge Duality
    5.3.2 Geometrical Rewriting of the Gauge Action
    5.3.3 Yang-Mills Theory in Vielbein Formalism
  5.4 Soldering of the Lorentz Bundle to the Tangent Bundle
    5.4.1 Gravitational Coupling of Spinorial Fields
  5.5 Einstein Field Equations
  5.6 The Action of Gravity
    5.6.1 Torsion Equation
    5.6.2 The Einstein Equation
    5.6.3 Conservation of the Stress-Energy Tensor and Symmetries of the Gravitational Action
    5.6.4 Examples of Stress-Energy-Tensors
  5.7 Weak Field Limit of Einstein Equations
    5.7.1 Gauge Fixing
    5.7.2 The Spin of the Graviton
  5.8 The Bottom-Up Approach, or Gravity à la Feynman
  5.9 Retrieving the Schwarzschild Metric from Einstein Equations
  References
6 Stellar Equilibrium: Newton’s Theory, General Relativity, Quantum Mechanics
  6.1 Introduction and Historical Outline
  6.2 The Stress Energy Tensor of a Perfect Fluid
  6.3 Interior Solutions and the Stellar Equilibrium Equation
    6.3.1 Integration of the Pressure Equation in the Case of Uniform Density
    6.3.2 The Central Pressure of a Relativistic Star
  6.4 The Chandrasekhar Mass-Limit
    6.4.1 The Degenerate Fermi Gas of Very Many Spin One-Half Particles
    6.4.2 The Equilibrium Equation
    6.4.3 Polytropes and the Chandrasekhar Mass
  6.5 Conclusive Remarks on Stellar Equilibrium
  References
7 Gravitational Waves and the Binary Pulsars
  7.1 Introduction
    7.1.1 The Idea of GW Detectors
    7.1.2 The Arecibo Radio Telescope
    7.1.3 The Coalescence of Binaries and the Interferometer Detectors
  7.2 Green Functions
    7.2.1 The Laplace Operator and
    7.2.2 The Relativistic Propagators
  7.3 Emission of Gravitational Waves
    7.3.1 The Stress Energy 3-Form of the Gravitational Field
    7.3.2 Energy and Momentum of a Plane Gravitational Wave
    7.3.3 Multipolar Expansion of the Perturbation
    7.3.4 Energy Loss by Quadrupole Radiation
  7.4 Quadrupole Radiation from the Binary Pulsar System
    7.4.1 Keplerian Parameters of a Binary Star System
    7.4.2 Shrinking of the Orbit and Gravitational Waves
    7.4.3 The Fate of the Binary System
    7.4.4 The Double Pulsar
  7.5 Conclusive Remarks on Gravitational Waves
  References
8 Conclusion of Volume 1
Appendix A Spinors and Gamma Matrix Algebra
  A.1 Introduction to the Spinor Representations of SO(1, D − 1)
  A.2 The Clifford Algebra
  A.3 The Charge Conjugation Matrix
  A.4 Majorana, Weyl and Majorana-Weyl Spinors
  A.5 A Particularly Useful Basis for D = 4 γ-Matrices
Appendix B Mathematica Packages
  B.1 Periastropack
  B.2 Metrigravpack
Index

Chapter 1
Special Relativity: Setting the Stage

For a superficial observer, scientific truth is beyond the possibility of doubt; the logic of science is infallible, and if the scientists are sometimes mistaken, this is only from their mistaking its rule...
Henri Poincaré

1.1 Introduction

General Relativity and Special Relativity are both credited to Einstein; yet, while for the former he has an absolute credit beyond any possible doubt, the latter, notwithstanding Einstein’s essential role in its formulation, appears to spring from the work of several distinguished actors. As a matter of fact, Special Relativity was coming to ripeness in the scientific community just at the time it was formulated by Einstein and, although it is always risky to make such statements, I think that it might have been introduced by someone else, even if Einstein had not published it in 1905. The main difference between these two theoretical developments, which are historically separated by a decade of studies, resides in the following. Special Relativity grew from the need to reconcile the theory with some experimental facts, namely the independence of the speed of light from the state of motion of the observer, which was revealed by the Michelson and Morley experiment, and the invariance of Maxwell equations with respect to transformations different from those of Galileo, which was discovered by Lorentz. On the contrary, General Relativity was not motivated by any new experimental data; rather it sprang from a pure logical need, that of formulating the laws of physics in a frame-independent way, equally good for any observer, irrespective of his state of motion. The awareness of such a logical need was probably present in Einstein’s mind before 1905, yet it is doubtful that it might have developed into a concrete research programme without the intermediate step of Special Relativity. Indeed, once the Lorentz group replaced the Galilei group on the throne of inertial frames, the logical need of liberating physics from privileged observers was accompanied by another urgent clash: the Lorentz non-invariance of Newton’s theory of gravitation.

To solve this problem what was required was a substantial mathematical upgrading of early XX century Physics. Differential geometry and the theory of metrics and connections had developed in parallel in Mathematics, starting with the 1828 paper of Gauss on curved surfaces and, through the work of Riemann, Christoffel, Ricci-Curbastro, Bianchi, Levi Civita and the young Cartan, had reached a high degree of development. It was time to adapt Physics to this higher level mathematical language and to incorporate the basic concepts of the new geometry among the building blocks utilized to formulate the fundamental laws of Nature. Einstein did so with General Relativity, discovering that gravitation is nothing else but a manifestation of the curvature of space-time, interpreted as a Riemannian differentiable manifold. Curiously, Maxwell Theory which, via its Lorentz covariance, motivated Special Relativity, was also a theory of curvature in disguise: the curvature of a principal connection on a fibre-bundle. Indeed Electromagnetism is the simplest example of what we nowadays name gauge theories, and the electromagnetic potential A_μ is the simplest example of a principal connection. Today we know that more complicated connections on non-Abelian principal bundles describe the other non-gravitational fundamental interactions.

P.G. Frè, Gravity, a Geometrical Course, DOI 10.1007/978-94-007-5361-7_1, © Springer Science+Business Media Dordrecht 2013

1.2 Classical Physics Between the End of the XIX and the Dawn of the XX Century

James Clerk Maxwell (see Fig. 1.1) brought Classical Physics to perfection by performing, after the one encoded in Newton’s Law, the second great unification of Physics. Before Maxwell there were, on one side, electrostatics, electrodynamics and magnetism and, on the other, optics. After Maxwell there stood only electromagnetism and its corollary, namely propagating electromagnetic waves, which provide the explanation of what light is. Yet, while completing the classical building, Maxwell opened in it a small window, through which a completely different vision of Physics slipped in, first silently and almost reluctantly, to develop then, over the short period of just a few years, into a revolutionary reframing of the whole fabric of physical thinking.

1.2.1 Maxwell Equations

In his scientific masterpiece [2], Maxwell summarized in four differential equations for the electric field $\mathbf{E}(t, \mathbf{x})$ and the magnetic field $\mathbf{B}(t, \mathbf{x})$ all the laws of electricity and magnetism that had been explored in the course of the XIX century. When written in the standard notation of three-dimensional vector calculus, they read as follows:

$$\nabla \cdot \mathbf{B} = 0 \tag{1.2.1}$$

$$\nabla \times \mathbf{E} + \frac{1}{c} \frac{\partial \mathbf{B}}{\partial t} = 0 \tag{1.2.2}$$

$$\nabla \cdot \mathbf{E} = 4\pi\rho \tag{1.2.3}$$

Fig. 1.1 James Clerk Maxwell (1831–1879) had a short life of only 47 years: he died from abdominal cancer in 1879, while he was the first occupant of the Cavendish Professor Chair at Cambridge University. He was born in Edinburgh into a family belonging to the peerage, namely to the upper nobility. He studied first at the Edinburgh Academy, then at the University of Edinburgh and finally at Cambridge University, which he attended as a member of Trinity College (from 1850 to 1856). After finishing his studies in mathematics at Cambridge, Maxwell went back to Scotland, where he was professor in Aberdeen. Then in 1860 he returned to England with an appointment at King’s College in London. After his resignation from King’s in 1865 he was once again in Scotland with his wife and lived on his estates. Finally in 1871 he was appointed by Cambridge as first Cavendish professor. The Treatise on Electricity and Magnetism, which contains a complete exposition of Maxwell Equations, was published in 1873 on the basis of previous articles that had appeared starting from 1861. Actually Maxwell began to study electromagnetism already in the years 1855–1856, while he was graduating from Cambridge. The first proposal that light might be identified with an electromagnetic wave dates to an article written by Maxwell in 1864 [1]. Besides his fundamental work on Electromagnetism, Maxwell made other fundamental contributions to mathematical physics. One was the explanation of the nature of Saturn’s rings, which Maxwell demonstrated to be necessarily composed of a dust of small rocky grains. The other monumental achievement of Maxwell’s studies is of course in the field of Thermodynamics where, independently from Boltzmann, he formulated in 1866 the kinetic theory of gases and introduced the celebrated Maxwell distribution, which gives the fraction of gas molecules moving at a specified velocity at any given temperature

$$\nabla \times \mathbf{B} - \frac{1}{c} \frac{\partial \mathbf{E}}{\partial t} = \frac{4\pi}{c} \mathbf{J} \tag{1.2.4}$$

If we introduce indices $i, j, k = 1, 2, 3$ for the vector components, the above equations take the following appearance:

$$\partial_i B^i = 0 \tag{1.2.5}$$

$$\varepsilon^{ijk} \partial_j E_k + \frac{1}{c} \frac{\partial B^i}{\partial t} = 0 \tag{1.2.6}$$

$$\partial_i E^i = 4\pi\rho \tag{1.2.7}$$

$$\varepsilon^{ijk} \partial_j B_k - \frac{1}{c} \frac{\partial E^i}{\partial t} = \frac{4\pi}{c} J^i \tag{1.2.8}$$

where, in both notations, $\rho$ denotes the electric charge density and $J^i$ the electric charge current. The number $c$ appearing in the above equations has the dimensions of a velocity and the genius of Maxwell led him to think that it was just the speed of light. His guess was motivated by the observation that in the vacuum, $\rho = J^i = 0$, namely in regions where there are neither charges nor currents, by taking a further derivative $\nabla\times$ of (1.2.2) one obtains that all three components of the electric field $\mathbf{E}$ satisfy the d’Alembert propagation equation with velocity $c$:

$$\nabla^2 \mathbf{E} = \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \mathbf{E} \tag{1.2.9}$$

With a similar procedure one obtains that in the vacuum the magnetic field also satisfies the same propagation equation:

$$\nabla^2 \mathbf{B} = \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \mathbf{B} \tag{1.2.10}$$

Hence Maxwell concluded that there are electromagnetic waves and he rightly guessed that visible light consists of nothing else but electromagnetic waves belonging to a particular region of the possible frequency spectrum. The first experimental detection of electromagnetic waves was done by Heinrich Rudolf Hertz¹ in the mid eighties of the XIX century, already after the death of Maxwell.
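For the reader's convenience, the step from (1.2.2) to (1.2.9) can be spelled out as follows. This is only a sketch of the standard argument: the one ingredient not stated in the text is the vector identity for the double curl, which is assumed here.

```latex
% Vacuum: \rho = 0 and \mathbf{J} = 0.
% Line 1: curl of (1.2.2).
% Line 2: identity \nabla\times(\nabla\times\mathbf{E})
%         = \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E},
%         and (1.2.4) with \mathbf{J} = 0 to replace \nabla\times\mathbf{B}.
% Line 3: (1.2.3) with \rho = 0, i.e. \nabla\cdot\mathbf{E} = 0.
\begin{align*}
0 &= \nabla\times(\nabla\times\mathbf{E})
     + \frac{1}{c}\frac{\partial}{\partial t}\,(\nabla\times\mathbf{B}) \\
  &= \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E}
     + \frac{1}{c}\frac{\partial}{\partial t}
       \left(\frac{1}{c}\frac{\partial\mathbf{E}}{\partial t}\right) \\
  &= -\nabla^2\mathbf{E} + \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2}
\end{align*}
```

The last line is (1.2.9). Exchanging the roles of (1.2.2) and (1.2.4) gives (1.2.10) for the magnetic field in the same way.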

1.2.2 Luminiferous Aether and the Michelson Morley Experiment

To the mind of XIX century physicists, the pillar of whose thinking was Newtonian mechanics, the detection of electromagnetic waves posed a severe problem. All the waves they knew corresponded to the propagation of oscillations of some mechanical medium. Hence a so far unknown medium, pervading the whole Universe, had to exist, whose propagating oscillations humans perceived as electromagnetic waves. Such a substance was named Luminiferous aether, following an old idea dating back to Newton himself. The seed of disruption of Newtonian physics was already contained in Maxwell equations, since, as Lorentz demonstrated a few years later, they are not invariant under the transformations of the Galilei group, which is the foundation stone on which the whole Newtonian building stands. Yet in the mid 1880s this fact was still unnoticed. So it was concluded that the existence of the Luminiferous aether was a logical necessity and it was also concluded that the aether provided the means of defining an absolute reference frame, the one where the aether is at rest. In 1887 the two American scientists Michelson and Morley constructed their interferometric apparatus aimed at measuring the velocity of the Earth with respect to the aether (see Fig. 1.2). Indeed, if the aether existed, then since the Earth moves the speed of light measured on Earth could not be the same in all directions and at all times throughout the year. In some directions and at some times, light goes against the Earth’s movement, in others almost along it. Hence one should necessarily measure interference fringes due to this fact. Yet subtle is the Lord, according to a famous phrasing of Einstein, and no such fringes were detected. The speed of light seemed to be the same in all directions and at all times. This negative result was received as a puzzle by the scientific community and caused a lot of thinking. In particular it motivated Hendrik Antoon Lorentz to look deeper into the transformation rules from one reference frame to another that are consistent with Maxwell equations.

¹ Heinrich Rudolf Hertz had a short life. He died in Bonn in 1894 at the age of thirty six. He was born in Hamburg in 1857. In his laboratory at the University of Karlsruhe, where he had been appointed full professor, Hertz constructed the first dipole antennas, both transmitter and receiver, and in this way produced the first radio waves, demonstrating the existence of the electromagnetic waves implied by Maxwell theory.

Fig. 1.2 The Michelson Morley interferometer experiment. Albert Abraham Michelson (1852–1931) obtained American citizenship and lived most of his life in the States; he was born in a Jewish family. He was the first American to get a Nobel Prize for science, which he obtained in 1907. The initial scientific career of Michelson developed in the American Navy, which he left in 1881 to become professor of Physics in Cleveland, Ohio, after having visited several European Universities. Edward Williams Morley (1838–1923) was also American, being born in Newark, New Jersey. He held a chair as professor of Chemistry at the Case Western Reserve University in Cleveland. There, together with Michelson, he constructed the famous interferometer experiment. The Michelson Morley apparatus is conceptually identical to the modern interferometers devised as detectors of gravitational waves. It aimed instead at measuring the motion of the Earth with respect to the luminiferous aether. The absolutely negative result of this experiment was a puzzle which could be resolved only by the theory of Special Relativity

1.2.3 Maxwell Equations and Lorentz Transformations

The equations of Newtonian mechanics are invariant under Galileo transformations that connect two inertial systems in relative motion. Let us denote by $\{t, x, y, z\}$ the time and space coordinates of a certain physical event in the coordinate frame $O$ and by $\{t', x', y', z'\}$ those of the same event in the coordinate frame $O'$. By hypothesis the two frames (or observers) are in relative motion with constant velocity $v$ with respect to each other. Just for simplicity and without any loss of generality let us suppose that the relative motion of the two frames is along the x-axis as shown in Fig. 1.3. The dogma of Galilean-Newtonian Physics was that time is universal and the same for everyone. So the Galileo transformation from one reference frame to the other is described by the following simple formula:

$$\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} t \\ x + vt \\ y \\ z \end{pmatrix} \tag{1.2.11}$$

Analogous transformations can be written for all the other axes and, together with the rotations, the set of all Galileo transformations turns out to be a Lie group with six parameters given by the three Euler rotation angles and the three components of the relative velocity $\{v_x, v_y, v_z\}$. The astonishing discovery of Lorentz (see Fig. 1.4), published in his 1904 paper [3], is that Maxwell equations are not invariant under Galileo transformations; rather they are invariant under modified transformations that break the dogma of universal time and introduce the speed of light $c$. The special Lorentz transformation which replaces the Galileo transformation (1.2.11) is the following one:

Fig. 1.3 Two inertial reference frames moving with constant relative velocity along the x-axis

Fig. 1.4 Hendrik Antoon Lorentz (1853–1928) was Dutch by nationality. In 1902 the Nobel Prize in Physics was shared by Lorentz with Pieter Zeeman for the theoretical explanation of the phenomenon discovered by the latter and named after him. Hendrik Lorentz was born in Arnhem. He studied physics and mathematics at the University of Leiden, of which he later became a professor. His doctoral degree was earned in 1875 under the supervision of Pieter Rijke with a thesis entitled “On the theory of reflection and refraction of light”, in which he refined the electromagnetic theory of James Clerk Maxwell. The proposal that moving bodies contract in the direction of motion was put forward by Lorentz in a paper of 1895, arriving at the same conclusion that had been reached also by George FitzGerald. Lorentz discovered that the transition from one reference frame to another could be simplified by using a new time variable which he called local time. In 1900, Henri Poincaré called Lorentz’s local time a “wonderful invention” and illustrated it by showing that clocks in moving frames are synchronized by exchanging light signals that are assumed to travel at the same speed against and with the motion of the frame. The transformations that we denote Lorentz transformations, following the name given to them by Poincaré in 1905, were published by Lorentz in a paper of 1904

$$\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \dfrac{1}{\sqrt{1 - \left(\frac{v}{c}\right)^2}} \left(t + \dfrac{v}{c^2}\, x\right) \\ \dfrac{1}{\sqrt{1 - \left(\frac{v}{c}\right)^2}} \,(x + vt) \\ y \\ z \end{pmatrix} \tag{1.2.12}$$

where $c$ is the speed of light and $v$ the relative velocity of the two frames. It is evident from their mathematical form that when $v \ll c$ the Lorentz transformation is approximated extremely well by the Galileo transformation (1.2.11). Just as in the Galileo case, one can write similar transformations for the cases where the relative motion occurs along other axes and mix them with ordinary rotations, building up, at the end of the day, another six parameter group of transformations. Such a group has a simple mathematical name, namely SO(1, 3), since it contains all the 4 × 4 matrices that leave invariant a quadratic form with one positive and three negative eigenvalues. It is not clear from the 1904 paper [3] that Lorentz was aware of the group structure he had discovered. Indeed in order to see such a structure one needs to change variables in a way which is somewhat involved. The right change of parameters could come only from a new physical principle that it was the mission of Albert Einstein to clarify, in his celebrated 1905 paper On the electrodynamics of moving bodies [4], and that of Minkowski to interpret geometrically. In previous years, starting from 1895, in an attempt to explain the puzzle provided by the Michelson Morley experiment, Lorentz had proposed that moving bodies contract in the direction of motion and, to this day, this relativistic effect is named the Lorentz contraction. He also realized that the transition from one reference frame to another could be simplified by using a new time variable which he called local time [5]. Such local time depended on two variables; the first is what Lorentz regarded as the universal time $t$, but was simply the time of one of the two considered frames.
The second variable entering the formula for the local time was the space-location under consideration. In 1900, Henri Poincaré declared that Lorentz’s local time was a wonderful invention and illustrated it by showing that clocks in moving frames are synchronized by exchanging light signals that are assumed to travel at the same speed in both directions, namely when they travel against and when they travel with the motion of the frame.
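The statement that the Lorentz transformation (1.2.12) reduces to the Galileo transformation (1.2.11) for $v \ll c$ can be checked numerically. The following is a minimal sketch: the function names and the sample event are illustrative choices, not taken from the text, and only the Python standard library is assumed.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)

def galileo(t, x, v):
    """Galileo transformation (1.2.11) along the x-axis: returns (t', x')."""
    return t, x + v * t

def lorentz(t, x, v, c=C):
    """Special Lorentz transformation (1.2.12) along the x-axis: returns (t', x')."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t + v * x / c**2), gamma * (x + v * t)

# Illustrative event: t = 1 s, x = 1000 km.
t, x = 1.0, 1.0e6
# Two sample velocities: about 30 km/s (roughly the Earth's orbital speed) and 0.6 c.
for v in (3.0e4, 0.6 * C):
    _, x_gal = galileo(t, x, v)
    _, x_lor = lorentz(t, x, v)
    rel = abs(x_lor - x_gal) / abs(x_gal)
    print(f"v/c = {v / C:.1e}: relative difference in x' = {rel:.1e}")
```

For the first velocity the two transformed positions agree to a few parts in a billion, while at $0.6\,c$ they differ by the full factor $1/\sqrt{1 - v^2/c^2} = 1.25$.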

1.3 The Principle of Special Relativity

It should be clear to the reader of the previous pages that all the tiles of the puzzle were, by the end of 1904, ready and just awaited a clear logical mind such as that of Einstein to be assembled together into a meaningful picture. On one side the Michelson Morley experiment had shown that light always travels at the same speed, independently of the state of motion of its observer. On the other side Lorentz had shown that the most important Laws of Nature, apart from Newton’s law of gravitational attraction, namely those codified in Maxwell equations, are covariant not with respect to the transformations of the Galilei group, but rather with respect to another set of transformations, those that bear his name. Albert Einstein (see Fig. 1.5) transformed these two facts into the axioms of his new Theory of Special Relativity:

(a) The speed of light $c$ is constant and the same in all inertial reference frames.
(b) All the Laws of Nature should, like Maxwell equations, have a form, in inertial reference frames, that is covariant with respect to Lorentz transformations.

Said differently, the correct transformations from one inertial frame to another are those of Lorentz and, rather than searching for complicated interpretations of the Lorentz covariance of electrodynamics, one should concentrate on mechanics and change the laws of Newtonian mechanics so that they become Lorentz covariant. Einstein showed that these principles implied a critical revision of the concept of contemporaneity. Namely events that happen at the same time for one inertial observer may happen at different times for another observer in relative motion with respect to the first. Secondly, using various arguments, he showed that the Principle of Special Relativity implied the equivalence of mass and energy, according to the celebrated formula $E = mc^2$. The meaning of this equivalence is that, even when at rest, a particle of mass $m$ has an energy, which through interaction with other particles or radiation can be extracted or exchanged. For instance a massive particle can decay by means of the emission of a light particle endowed with high kinetic energy and this kinetic energy is subtracted from the rest energy of the decaying particle. The remnant of the decay has necessarily a lower mass than its predecessor.

The essential implication of Einstein’s new approach to the formulation of natural laws was the suppression of the ancestral separation of time from space and the fusion of the former with the latter into a newly born stage for physical processes, named space-time. Intuitively this latter is a continuous space, whose points, named the events, are labeled by four parameters, the first of which, $t$, defines when the event occurred, while the last three, $x, y, z$, define where it happened. It was the historical mission of Hermann Minkowski (see Fig. 1.6) to make this intuitive idea mathematically sound and to construct explicitly the geometrical arena of special relativity. In terms of Minkowski space the formulation of special relativistic theories becomes extremely simple and Einstein’s ideas become algorithmic.

Fig. 1.5 Albert Einstein (1879–1955) is the most famous of all physicists of the XX century and he is the principal actor in the story told in the present book. He was born in Ulm and died in Princeton in the USA. His citizenship changed three times. Born German, he became Swiss, then German again and finally an American citizen. He was awarded the Nobel Prize in 1921 for his discovery of the law of the photoelectric effect. This discovery is contained in one of his three fundamental papers of 1905, dealing respectively with the photoelectric phenomenon, the Brownian motion of molecules and Special Relativity. His major achievement, namely the Theory of General Relativity, was published in 1915 after a decade of studies. We do not dwell here on Einstein’s biography, since many books have been published on the subject. Moreover his thoughts and ideas will be constantly recalled throughout the development of the present book and many citations will occur

Fig. 1.6 Hermann Minkowski (1864–1909) was born in Lithuania, belonging at that time to the Russian Empire. His family was Jewish, partly of Lithuanian, partly of Polish descent. His higher education, however, was German and took place in the historical University of Königsberg, where Immanuel Kant had taught and developed his philosophical ideas one century before. Having become a refined mathematician, whose scientific interests centered on the theory of quadratic forms, Minkowski received prestigious international recognition, including a Prize from the French Academy of Sciences, and taught in various Universities of Germanic language: Bonn, Göttingen, Königsberg and Zürich. In the Swiss Polytechnic of Zürich he happened to be one of the teachers of Albert Einstein. In 1902 he was appointed professor in Göttingen and became one of the closest friends and collaborators of David Hilbert. It was just in 1907, two years after the 1905 paper by Einstein and two years before his premature death, that he had the brilliant idea of interpreting Special Relativity in terms of a continuous geometrical space that joined space and time together and was endowed with the metric which bears his name and is invariant under Lorentz transformations

1.3.1 Minkowski Space

The basis of Minkowski’s construction is the realization that the two pillars of Special Relativity, i.e. constancy of light velocity and Lorentz covariance, are just two sides of the same coin. Let us introduce an $m$-dimensional vector space $\mathcal{M}_{\mathrm{Mink}}$ (with $m = 4$ in the physical case) whose elements are $m$-tuplets of real numbers named the events:

$$\mathcal{M}_{\mathrm{Mink}} \ni x^\mu = \{\underbrace{x^0}_{=ct}, x^1, x^2, \dots, x^{m-1}\} \tag{1.3.1}$$

where $c$ denotes the speed of light and $t$ the coordinate time; in this way $x^0$ denotes the when and $x^i$ ($i = 1, \dots, m - 1$) the where of a physical event. The statement that $\mathcal{M}_{\mathrm{Mink}}$ is a vector-space implies that events can be summed and subtracted:

$$\forall x^\mu, y^\mu \in \mathcal{M}_{\mathrm{Mink}} : \quad x^\mu + y^\mu = z^\mu \in \mathcal{M}_{\mathrm{Mink}} \tag{1.3.2}$$

or more generally linearly combined:

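$$\forall x^\mu, y^\mu \in \mathcal{M}_{\mathrm{Mink}} \text{ and } \forall \lambda, \rho \in \mathbb{R} : \quad \lambda x^\mu + \rho y^\mu = z^\mu \in \mathcal{M}_{\mathrm{Mink}} \tag{1.3.3}$$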

These are the same properties with which three-dimensional space is endowed in classical Newtonian mechanics and in Euclidian geometry, which provides its mathematical basis. A Euclidian $m$-dimensional space $\mathbb{E}^m \simeq \mathbb{R}^m$ admits a global notion of distance between any two points based on the existence of a scalar product. The latter is a symmetric bilinear form on $\mathbb{E}^m$:

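$$\begin{aligned} \langle\ ,\ \rangle &: \ \mathbb{E}^m \otimes \mathbb{E}^m \Longrightarrow \mathbb{R} \\ \forall x, y \in \mathbb{E}^m &: \ \mathbb{R} \ni \langle x, y \rangle = \langle y, x \rangle \\ \forall x, y, z \in \mathbb{E}^m \text{ and } \forall \lambda, \rho \in \mathbb{R} &: \ \langle \lambda x + \rho y, z \rangle = \lambda \langle x, z \rangle + \rho \langle y, z \rangle \end{aligned} \tag{1.3.4}$$

which is also assumed to be non-degenerate and positive definite: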

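$$\forall y \in \mathbb{E}^m \quad \langle x, y \rangle = 0 \ \Rightarrow \ x \equiv 0 \tag{1.3.5}$$

$$\forall x \in \mathbb{E}^m : \quad \langle x, x \rangle \geq 0 \tag{1.3.6}$$

$$\langle x, x \rangle = 0 \ \Rightarrow \ x = 0 \tag{1.3.7}$$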

Typically the scalar product in a Euclidian space is given by the sum of squares of the vector components:

$$\forall x, y \in \mathbb{E}^m : \quad \langle x, y \rangle \equiv \sum_{i=1}^{m} x^i y^i \tag{1.3.8}$$

but any symmetric, non-degenerate matrix $M_{ij}$ with all positive eigenvalues could be used to define $\langle\ ,\ \rangle$, respecting the same axioms (1.3.4) and (1.3.5, 1.3.6, 1.3.7):

$$\forall x, y \in \mathbb{E}^m : \quad \langle x, y \rangle \equiv \sum_{i,j=1}^{m} x^i M_{ij}\, y^j \tag{1.3.9}$$

The properties of Mij we spelled out in words correspond to the following formulae:

$$M_{ij} = M_{ji} \tag{1.3.10}$$

$$\operatorname{Det} M > 0 \tag{1.3.11}$$

$$M_{ij}\, x^j = \lambda x^i \ \Rightarrow \ \lambda > 0 \tag{1.3.12}$$

Given the bilinear form $\langle\ ,\ \rangle$, the absolute distance between any two points $x, y \in \mathbb{E}^m$ can be defined as follows:

$$\mathbb{R} \ni d(x, y)^2 \equiv |x - y|^2 \equiv \langle x - y, x - y \rangle \tag{1.3.13}$$

and by construction is positive definite and obeys the triangular inequality:

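$$\forall x, y \in \mathbb{E}^m \quad d(x, y) \geq 0 \tag{1.3.14}$$

$$\forall x, y \in \mathbb{E}^m \quad d(x, y) = d(y, x) \tag{1.3.15}$$

$$d(x, y) = 0 \ \Rightarrow \ x = y \tag{1.3.16}$$

$$\forall x, y, z \in \mathbb{E}^m \quad d(x, y) + d(y, z) \geq d(x, z) \tag{1.3.17}$$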

Once time and space are unified into Minkowski space-time, one can introduce a bilinear scalar product $(\ ,\ )$ which satisfies the axioms (1.3.4) and non-degeneracy (1.3.5), yet the Principles of Special Relativity require that we remove positive definiteness and choose a different quadratic form. In Sect. 3.6.1 we will tackle the rigorous mathematical definition of signatures of quadratic forms, which was clarified in the XIX century by J.J. Sylvester. In a nutshell the signature of a quadratic form defined as in (1.3.9) consists of the signs of the eigenvalues $\lambda_i$ of the matrix $M_{ij}$. When the scalar product is positive definite all the signs are plus:

$$\underbrace{+, +, \dots, +}_{D \text{ times}} \tag{1.3.18}$$

Minkowski understood that all the Principles of Special Relativity are encoded in the choice of another signature, the Lorentzian signature:

$$+, \underbrace{-, -, \dots, -}_{D-1 \text{ times}} \tag{1.3.19}$$

Explicitly the Lorentzian scalar product of Minkowski space can be defined as follows. Identifying the number $m$ with the space-time dimensions, namely with $D = 1 + (\text{number of space directions})$, consider the following diagonal matrix:

$$\eta = \begin{pmatrix} 1 & 0 & \cdots & \cdots & 0 \\ 0 & -1 & 0 & \cdots & 0 \\ 0 & 0 & -1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & \cdots & 0 & -1 \end{pmatrix} \tag{1.3.20}$$

which is named the flat Minkowski metric. Then for any pair of events $x^\mu, y^\mu$ their scalar product is:

$$(x, y) \equiv x^\mu \eta_{\mu\nu}\, y^\nu \tag{1.3.21}$$

The essential novelty attached to Lorentzian signature is that now the square norm of vectors belonging to Minkowski space can be of three types:

1. time-like vectors ⇔ $(x, x) > 0$.
2. space-like vectors ⇔ $(x, x) < 0$.
3. null-like vectors ⇔ $(x, x) = 0$

and, as will be clear from the mathematical definition of the Lorentz group discussed in the next section, the time, space or null-like character of a vector does not depend on the chosen inertial reference frame.
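The three-fold classification attached to the Lorentzian scalar product (1.3.21) can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions: events are plain tuples, the helper names `minkowski_product` and `classify` are hypothetical, and the numerical tolerance used to detect null vectors is an arbitrary choice.

```python
def minkowski_product(x, y):
    """Lorentzian scalar product (1.3.21) with eta = diag(+1, -1, ..., -1)."""
    return x[0] * y[0] - sum(a * b for a, b in zip(x[1:], y[1:]))

def classify(x, tol=1e-12):
    """Classify a vector by the sign of its square norm (x, x)."""
    norm2 = minkowski_product(x, x)
    if abs(norm2) <= tol:  # tolerance is an illustrative choice
        return "null-like"
    return "time-like" if norm2 > 0 else "space-like"

print(classify((2.0, 1.0, 0.0, 0.0)))  # (x, x) = 4 - 1 = 3
print(classify((1.0, 2.0, 0.0, 0.0)))  # (x, x) = 1 - 4 = -3
print(classify((1.0, 1.0, 0.0, 0.0)))  # (x, x) = 1 - 1 = 0
```

The same functions work in any dimension $D$, since only the first component enters with a plus sign.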
Indeed the Lorentz group is precisely defined as that group of linear substitutions which leaves the Lorentz product (1.3.21) invariant. Consider now the motion of a particle in Minkowski space-time. A generic motion is described by a world-line of the form:

$$x^\mu = x^\mu(\tau); \quad \tau \in \mathbb{R} \tag{1.3.22}$$

where $\tau$ is some real parameter. Just as in Classical Newtonian Physics we assume that:

Principle 1.3.1 A particle subject to the action of no force travels on a straight line with constant velocity. This means that for such a free particle the world-line is of the form:

$$x^\mu(\tau) = u^\mu \tau \tag{1.3.23}$$

where $u^\mu$ is a constant vector named the D-velocity.

The second principle, which encodes the whole of Special Relativity, is the following:

Principle 1.3.2 The D-velocity of a physical particle is always either time-like or null-like. It is never space-like. We have two possibilities:

Massive particles: When the rest-mass is larger than zero, namely $m > 0$, the D-velocity is time-like and $(u, u) = 1$.

Massless particles: When the rest-mass vanishes, namely $m = 0$, the D-velocity is null-like and $(u, u) = 0$.

This principle states that no physical signal can travel faster than light and establishes that all massless particles travel at the speed of light in whatever inertial reference frame. Indeed let us analyze the implications of $(u, u) = 0$. In order to satisfy the null-like constraint the vector $u^\mu$ must be of the form:

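$$u^0 = \pm|\mathbf{u}|; \quad u^i = \mathbf{u}^i \tag{1.3.24}$$

where $\mathbf{u}$ is any $(D - 1)$-component vector. Combining Principles 1.3.1 and 1.3.2, we obtain:

$$c \times t = \pm|\mathbf{u}|\,\tau; \quad x^i = \mathbf{u}^i \tau \quad \Rightarrow \quad x^i = \frac{c}{|\mathbf{u}|}\, \mathbf{u}^i\, t \tag{1.3.25}$$

This means that the considered particle travels with a $(D - 1)$-velocity given by:

$$\mathbf{v} = \frac{c}{|\mathbf{u}|}\, \mathbf{u} \tag{1.3.26}$$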

The D − 1 Euclidian squared norm of such a velocity is obviously:

$$\langle \mathbf{v}, \mathbf{v} \rangle = c^2 \tag{1.3.27}$$

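On the other hand combining Principles 1.3.1 and 1.3.2 for the case of a massive particle we get:

$$u^0 = \pm\sqrt{1 + |\mathbf{u}|^2}; \quad u^i = \mathbf{u}^i \tag{1.3.28}$$

where $\mathbf{u}$ is once again any $(D - 1)$-component vector. This implies:

$$c \times t = \pm\sqrt{1 + |\mathbf{u}|^2}\,\tau; \quad x^i = \mathbf{u}^i \tau \quad \Rightarrow \quad x^i = \frac{c}{\sqrt{1 + |\mathbf{u}|^2}}\, \mathbf{u}^i\, t \tag{1.3.29}$$

which means that the considered particle travels with the following $(D - 1)$-velocity:

$$\mathbf{v} = \frac{c}{\sqrt{1 + |\mathbf{u}|^2}}\, \mathbf{u} \tag{1.3.30}$$

whose $D - 1$ Euclidian squared norm is obviously: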

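$$\langle \mathbf{v}, \mathbf{v} \rangle = c^2\, \frac{|\mathbf{u}|^2}{1 + |\mathbf{u}|^2} < c^2$$

Inverting (1.3.30) gives $\mathbf{u} = \mathbf{v} \big/ \left(c \sqrt{1 - |\mathbf{v}|^2/c^2}\right)$. Defining the D-momentum of a particle of rest mass $m$ as:

$$p^\mu \equiv m c\, u^\mu \tag{1.3.33}$$

the space-part of this D-vector takes the form:

$$\mathbf{p} = \frac{m \mathbf{v}}{\sqrt{1 - \frac{|\mathbf{v}|^2}{c^2}}} \tag{1.3.34}$$

and it coincides with the Newtonian momentum $m\mathbf{v}$ when the velocity of the considered particle is much smaller than the speed of light, $v \ll c$. On the other hand the time component of the momentum D-vector is the following:

$$p^0 \equiv m c\, u^0 = \frac{mc}{\sqrt{1 - \frac{|\mathbf{v}|^2}{c^2}}} \tag{1.3.35}$$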

For small velocities, developing in series of $v/c$ we obtain:

$$p^0 = mc + \frac{1}{2}\, \frac{1}{c}\, m v^2 + \mathcal{O}\!\left(\frac{m v^4}{c^3}\right) \tag{1.3.36}$$

This suggests the interpretation:

$$p^0 = E/c \tag{1.3.37}$$

where $E$ is the energy of the considered particle. Indeed, so doing, we find:

$$E = \underbrace{mc^2}_{\text{rest energy}} + \underbrace{\frac{1}{2}\, m v^2}_{\text{Newtonian kinetic energy}} + \mathcal{O}\!\left(\frac{m v^4}{c^2}\right) \tag{1.3.38}$$

where we recognize the Newtonian kinetic energy plus an absolute normalization of the zeroth level of $E$, arbitrary in Newtonian mechanics and fixed to a precise value in the relativistic case, namely to the rest energy $E_0 = mc^2$. The third principle of special relativity, which concludes the construction, is:

Principle 1.3.3 The total D-momentum $P^\mu = \sum_{i=1}^{N} p^\mu_{(i)}$ of an isolated physical system made of $N$ components is a conserved quantity, namely all possible physical processes will preserve its value throughout time evolution. The correct transformations that relate inertial systems to each other are those of the Lorentz group, namely that group of linear substitutions which leaves the quadratic form (1.3.21) invariant.
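The chain from the D-velocity to the energy expansion (1.3.38) can be verified numerically. The sketch below assumes the relation $\mathbf{u} = \mathbf{v}/(c\sqrt{1 - |\mathbf{v}|^2/c^2})$ obtained by inverting (1.3.30); the function names and the sample mass and velocity are illustrative, not taken from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def d_velocity(v, c=C):
    """D-velocity u^mu of a massive particle with ordinary velocity v (a tuple)."""
    beta2 = sum(vi * vi for vi in v) / c**2
    gamma = 1.0 / math.sqrt(1.0 - beta2)  # equals u^0 = sqrt(1 + |u|^2)
    return (gamma,) + tuple(gamma * vi / c for vi in v)

def energy(m, v, c=C):
    """Relativistic energy E = c p^0 = m c^2 u^0, from (1.3.35) and (1.3.37)."""
    return m * c**2 * d_velocity(v, c)[0]

m = 1.0                   # kg, illustrative
v = (3.0e5, 0.0, 0.0)     # m/s, about one thousandth of c
u = d_velocity(v)
norm = u[0] ** 2 - sum(ui**2 for ui in u[1:])  # (u, u), should be close to 1
kinetic = energy(m, v) - m * C**2              # E minus the rest energy
newtonian = 0.5 * m * sum(vi**2 for vi in v)   # (1/2) m v^2
print(norm, kinetic / newtonian)               # both close to 1
```

At this velocity the ratio of the relativistic kinetic energy to $\frac{1}{2} m v^2$ departs from 1 only at relative order $v^2/c^2$, in agreement with the remainder term of (1.3.38).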

1.4 Mathematical Definition of the Lorentz Group

Let us define mathematically the Lorentz group, which, as we emphasized in previous sections, can be introduced for any space-time dimension $D = 1 + (D - 1)$, where one is the number of time-like directions and $D - 1$ is the number of space-like directions. In the classification of Classical Lie Groups, the Lorentz group is just $\mathrm{SO}(1, D - 1)$, whose elements are all those $D \times D$ matrices $\Lambda$ that satisfy the following defining relation:

$$\Lambda^T \eta \Lambda = \eta \tag{1.4.1}$$

The reason for the above definition and for the choice of the form of the invariant matrix $\eta$ was discussed in the previous section. It is dictated by the notion of Minkowski space-time and by the choice of (1.3.21) as the invariant quadratic form of Special Relativity. Let us consider two physical events that in one inertial frame are described by the D-vectors $\{x^\mu, y^\mu\}$. In another inertial frame the same events will be described by new D-vectors, obtained from the former ones by means of a linear substitution:

$$\tilde{x}^\mu = \Lambda^\mu{}_\nu\, x^\nu; \quad \tilde{y}^\mu = \Lambda^\mu{}_\nu\, y^\nu \tag{1.4.2}$$

The Minkowskian scalar product of the two events will be frame independent, namely:

$$(x, y) = (\tilde{x}, \tilde{y}) \tag{1.4.3}$$

if and only if the condition (1.4.1) is satisfied, as is immediately evident from the transcription in matrix notation of the fundamental quadratic form:

$$(x, y) = x^T \eta\, y \tag{1.4.4}$$
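The "if and only if" statement above follows in one line from (1.4.2) and (1.4.4). The short derivation below is a sketch that treats $x$ and $y$ as column vectors, an assumption consistent with the matrix notation of (1.4.4).

```latex
% Substitute \tilde{x} = \Lambda x and \tilde{y} = \Lambda y into (1.4.4):
\begin{align*}
(\tilde{x}, \tilde{y})
  &= \tilde{x}^T \eta\, \tilde{y}
   = (\Lambda x)^T \eta\, (\Lambda y)
   = x^T \left(\Lambda^T \eta \Lambda\right) y
\end{align*}
% This equals x^T \eta y for every pair x, y
% exactly when \Lambda^T \eta \Lambda = \eta, which is (1.4.1).
```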

So it is mandatory to study the structure of the Lie group $\mathrm{SO}(1, D - 1)$ and the properties of its representations. From a historical perspective it is worth recalling that by the end of the XIX century, the theory of Lie groups, namely of continuous groups whose product law has an analytic structure, had already reached perfection through the work of Killing and Cartan. As we discuss more extensively in Sect. 3.2.5, the classification of all simple Lie groups and the construction of their fundamental representations, including those of the exceptional ones, was presented in Cartan’s doctoral thesis of 1894. Hence the study of the D-dimensional Lorentz group $\mathrm{SO}(1, D - 1)$ could be considered at the time of Minkowski just an application of a well established theory to a specific case. Yet the history of science is never so linear and the $D = 4$ Lorentz group was separately studied in all of its aspects and for its own sake by several authors in a large number of physical and mathematical papers.

1.4.1 The Lorentz Lie Algebra and Its Generators

Let us consider a Lorentz matrix $\Lambda$ which is infinitesimally close to the identity, namely:

$$\Lambda = \mathbf{1} + M + \mathcal{O}\!\left(M^2\right) \tag{1.4.5}$$

where $M$ is a small matrix, all of its entries being $\ll 1$. The defining condition (1.4.1) translates at first order in the matrix $M$ into the condition:

$$M^T \eta + \eta M = 0 \tag{1.4.6}$$

which states that the matrix $\eta M$ is antisymmetric. Hence the Lorentz Lie algebra $\mathfrak{so}(1, D - 1)$ is composed of all those matrices that satisfy (1.4.6). We easily construct the solution of such a problem, since, as a matrix, the Minkowskian metric $\eta$ squares to unity, $\eta^2 = \mathbf{1}$. Hence it suffices to parameterize the space $\mathcal{A}$ of antisymmetric matrices and any matrix $M$ satisfying condition (1.4.6) will be of the form:

$$M = \eta X; \quad X \in \mathcal{A} \ \Leftrightarrow \ X^T = -X \tag{1.4.7}$$

The space $\mathcal{A}$ has dimension $\frac{1}{2} D(D - 1)$ and it is customary to introduce a basis of $\frac{1}{2} D(D - 1)$ generators $J_{\mu\nu}$ constructed in the following way:

$$J_{\mu\nu} \equiv -J_{\nu\mu} = \begin{pmatrix} 0 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\ \vdots & & \vdots & & \vdots & & \vdots \\ 0 & \cdots & 0 & \cdots & \eta_{\mu\mu} & \cdots & 0 \\ \vdots & & \vdots & & \vdots & & \vdots \\ 0 & \cdots & 1 & \cdots & 0 & \cdots & 0 \\ \vdots & & \vdots & & \vdots & & \vdots \\ 0 & \cdots & 0 & \cdots & 0 & \cdots & 0 \end{pmatrix} \begin{matrix} \\ \\ \leftarrow \mu\text{-row} \\ \\ \leftarrow \nu\text{-row} \\ \\ \\ \end{matrix} \tag{1.4.8}$$

Here the column containing the entry $1$ is the $\mu$-column and the column containing the entry $\eta_{\mu\mu}$ is the $\nu$-column.

Assuming by convention $\mu < \nu$, it follows from (1.4.8) that $J_{\mu\nu}$ is a matrix all of whose entries vanish, except those at the intersection of the $\mu$th row with the $\nu$th column and at the intersection of the $\nu$th row with the $\mu$th column. The entries $\mu\nu$ and $\nu\mu$ of $J_{\mu\nu}$ both have norm 1 and have the same sign if $\eta_{\mu\mu} = 1$, while they have opposite signs if $\eta_{\mu\mu} = -1$. This means that the set of $J_{\mu\nu}$-generators contains a subset of $D - 1$ matrices, i.e. $J_{0i}$, that are symmetric and a subset of $\frac{1}{2}(D - 1)(D - 2)$ antisymmetric ones, $J_{ij}$. The generators $J_{0i}$ are non-compact and give rise to special Lorentz transformations, while the generators $J_{ij}$ span the compact Lie subalgebra $\mathfrak{so}(D - 1) \subset \mathfrak{so}(1, D - 1)$. Altogether the commutation relations of the generators of this standard basis are:

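$$[J_{\mu\nu}, J_{\rho\sigma}] = -\eta_{\mu\rho} J_{\nu\sigma} + \eta_{\nu\rho} J_{\mu\sigma} - \eta_{\nu\sigma} J_{\mu\rho} + \eta_{\mu\sigma} J_{\nu\rho} \tag{1.4.9}$$
and a generic element of the Lorentz Lie algebra can be written as:

$$\mathrm{so}(1,D-1) \ni M = \frac{1}{2}\,\varepsilon^{\mu\nu} J_{\mu\nu} \tag{1.4.10}$$
where the parameters $\varepsilon^{\mu\nu} = -\varepsilon^{\nu\mu}$ constitute an antisymmetric tensor. If we focus on the physically relevant case of D = 4, the overall number of Lorentz generators is six, three non-compact and three compact. Specifically we have:
$$\begin{aligned}
J_{01} &= \begin{pmatrix} 0&1&0&0 \\ 1&0&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \end{pmatrix}; &
J_{12} &= \begin{pmatrix} 0&0&0&0 \\ 0&0&-1&0 \\ 0&1&0&0 \\ 0&0&0&0 \end{pmatrix} \\
J_{02} &= \begin{pmatrix} 0&0&1&0 \\ 0&0&0&0 \\ 1&0&0&0 \\ 0&0&0&0 \end{pmatrix}; &
J_{13} &= \begin{pmatrix} 0&0&0&0 \\ 0&0&0&-1 \\ 0&0&0&0 \\ 0&1&0&0 \end{pmatrix} \\
J_{03} &= \begin{pmatrix} 0&0&0&1 \\ 0&0&0&0 \\ 0&0&0&0 \\ 1&0&0&0 \end{pmatrix}; &
J_{23} &= \begin{pmatrix} 0&0&0&0 \\ 0&0&0&0 \\ 0&0&0&-1 \\ 0&0&1&0 \end{pmatrix}
\end{aligned} \tag{1.4.11}$$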
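The statements above can be checked numerically. The following is a minimal sketch for D = 4, assuming the metric η = diag(+1, −1, −1, −1) (the signature implied by the symmetric form of the $J_{0i}$); the helper names `generator`, `satisfies_1_4_6` and `algebra_closes` are illustrative choices, not notation from the text. It builds each generator as $J_{\mu\nu} = \eta\,(E_{\mu\nu} - E_{\nu\mu})$, where $E_{\mu\nu}$ is the matrix with a single 1 in row μ and column ν, and tests both the condition (1.4.6) and the commutation relations (1.4.9).

```python
# Sketch: check (1.4.6) and (1.4.9) for the D = 4 generators of (1.4.11).
# Assumption: eta = diag(+1, -1, -1, -1).
D = 4
ETA = [[(1.0 if a == 0 else -1.0) if a == b else 0.0 for b in range(D)] for a in range(D)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(D)) for j in range(D)] for i in range(D)]


def combine(a, b, cb):
    """Return a + cb * b, entry by entry."""
    return [[a[i][j] + cb * b[i][j] for j in range(D)] for i in range(D)]


def is_zero(a, tol=1e-12):
    return all(abs(v) < tol for row in a for v in row)


def generator(mu, nu):
    """J_{mu nu} = eta X with X = E_{mu nu} - E_{nu mu}, cf. (1.4.7)-(1.4.8)."""
    x = [[0.0] * D for _ in range(D)]
    if mu != nu:
        x[mu][nu], x[nu][mu] = 1.0, -1.0
    return matmul(ETA, x)


def satisfies_1_4_6(m):
    """True if M^T eta + eta M = 0."""
    mt = [list(col) for col in zip(*m)]
    return is_zero(combine(matmul(mt, ETA), matmul(ETA, m), 1.0))


def commutator(a, b):
    return combine(matmul(a, b), matmul(b, a), -1.0)


def rhs_1_4_9(mu, nu, rho, sigma):
    """Right-hand side of (1.4.9)."""
    total = [[0.0] * D for _ in range(D)]
    for coeff, gen in (
        (-ETA[mu][rho], generator(nu, sigma)),
        (+ETA[nu][rho], generator(mu, sigma)),
        (-ETA[nu][sigma], generator(mu, rho)),
        (+ETA[mu][sigma], generator(nu, rho)),
    ):
        total = combine(total, gen, coeff)
    return total


def algebra_closes():
    """True if (1.4.9) holds for every choice of the four indices."""
    r = range(D)
    return all(
        is_zero(combine(commutator(generator(m, n), generator(p, q)),
                        rhs_1_4_9(m, n, p, q), -1.0))
        for m in r for n in r for p in r for q in r
    )


if __name__ == "__main__":
    print(all(satisfies_1_4_6(generator(m, n)) for m in range(D) for n in range(D)))
    print(algebra_closes())
```

With these conventions `generator(0, 1)` and `generator(1, 2)` reproduce the matrices $J_{01}$ and $J_{12}$ displayed in (1.4.11).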

The subgroup of the Lorentz group connected to the identity is obtained by exponentiating the matrix M in (1.4.10).

1.4.2 Retrieving Special Lorentz Transformations

Let us consider the transformations generated by the non-compact generators $J_{0i}$. We can easily show that they are the special Lorentz transformations introduced in (1.2.12). As an example let us exponentiate the generator $J_{01}$ with a parameter ξ. We obtain:
$$\Lambda = \exp[\xi J_{01}] = \begin{pmatrix} \cosh\xi & \sinh\xi & 0 & 0 \\ \sinh\xi & \cosh\xi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{1.4.12}$$
Applying the matrix Λ to the four-vector of coordinates {ct, x, y, z} we obtain:
$$\Lambda \cdot \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} ct\cosh\xi + x\sinh\xi \\ x\cosh\xi + ct\sinh\xi \\ y \\ z \end{pmatrix} \equiv \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} \tag{1.4.13}$$

Now it suffices to identify the parameter ξ, usually named rapidity, with the following combination:
$$\xi = \log\frac{1 + \frac{v}{c}}{\sqrt{1 - \frac{v^2}{c^2}}} \tag{1.4.14}$$
and a straightforward calculation shows that the primed variables defined by (1.4.13) coincide with those spelled out in (1.2.12). Hence the somewhat mysterious Lorentz transformations can be reduced to the hyperbolic rotations contained in the group SO(1,D−1).
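The exponentiation in (1.4.12) and the identification (1.4.14) can be illustrated numerically. The sketch below sums the power series of the matrix exponential term by term (a deliberately simple method, chosen here only for transparency) and checks that cosh ξ and sinh ξ reproduce the familiar factors γ and γv/c. The function names are illustrative, and the sign convention of the velocity relative to (1.2.12) is an assumption, since that formula is not reproduced in this section.

```python
import math

# Sketch: exponentiate xi * J_01 by a truncated power series and compare with (1.4.12).
J01 = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_exp(m, terms=40):
    """exp(m) = sum_k m^k / k!, truncated after `terms` terms."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]
    for k in range(1, terms):
        # power now holds m^k / k!
        power = [[v / k for v in row] for row in matmul(power, m)]
        result = [[result[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return result


def boost(xi):
    """The matrix Lambda = exp(xi J_01)."""
    return mat_exp([[xi * v for v in row] for row in J01])


def rapidity(beta):
    """Rapidity for beta = v/c, written as in (1.4.14)."""
    return math.log((1.0 + beta) / math.sqrt(1.0 - beta * beta))


def apply(matrix, vector):
    return [sum(matrix[i][j] * vector[j] for j in range(len(vector)))
            for i in range(len(matrix))]


if __name__ == "__main__":
    beta = 0.6
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    lam = boost(rapidity(beta))
    print(lam[0][0], gamma)          # cosh(xi) versus gamma
    print(lam[0][1], gamma * beta)   # sinh(xi) versus gamma * v/c
    print(apply(lam, [1.0, 0.0, 0.0, 0.0]))
```

For β = 0.6 the rapidity is log 2, so that cosh ξ = 1.25 and sinh ξ = 0.75, which are indeed γ and γβ.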

1.5 Representations of the Lorentz Group

As we already stressed, the general advances in Lie group theory and Lie algebras were already conspicuous by the time Special Relativity was introduced, but the study of the Lorentz group proceeded for some time on an independent track, related to physics, and peculiarities of the case so(1, 3) were widely used and incorporated into the treatment. From the point of view of Mathematics, Élie Cartan discovered the representations of the so(D, ℂ) Lie algebras that we now name spinorial in 1913 [7], namely in between the advent of Special Relativity and that of General Relativity. From the point of view of Physics, Pauli introduced the intrinsic spin of quantum particles in 1927 [8] and, by means of the three σ-matrices named after him, he constructed the spinor representation of the three-dimensional rotation group SO(3). Pauli's construction and three-dimensional spinors are quite special since they appear as a manifestation of the sporadic isomorphism so(3) ≃ su(2). In 1928 Paul Dirac discovered the fully relativistic theory of the electron by introducing the anti-commuting γ-matrices and, in this way, he was able to show the connection between spinors and the Lorentz group [9]. Actually what Dirac did was the construction of the spinor representation of so(1, 3). Dirac spinors in D = 4 are once again special, since they appear as a manifestation of another sporadic isomorphism of Lie algebras, namely so(1, 3) ≃ sl(2, ℂ). Yet, as was already implicitly contained in Cartan's paper of 1913, the existence of spinor representations is an intrinsic property of all Lie algebras of type so(D) and the systematic way to construct them is via the study of the Clifford algebras of Γ-matrices, defined by the following anti-commutation relations:

{Γa,Γb}=2ηab × 1 (1.5.1)

An exhaustive study of Γ-matrices and spinors is contained in Appendix A, to which we also refer for conventions. In this chapter we will study all representations of the Lorentz group and for the physically relevant case D = 4 we will dwell on the special features provided by the sporadic isomorphisms mentioned above.

From a general point of view the irreducible representations of so(1,D−1) divide into two classes that have a profound physical significance, since they match with the spin-statistics theorem of Quantum Field Theory:

Bosons: The bosonic representations of so(1,D−1) are obtained from all tensor products of the fundamental representation; in other words they are tensors $t_{\mu_1\mu_2\dots\mu_n}$ with n indices. These tensors can be split into irreducible representations by means of two subsequent operations. First one applies to $t_{\mu_1\mu_2\dots\mu_n}$ one of the symmetrization-antisymmetrization schemes codified in the Young tableaux available for the considered rank n. For instance for the case of n = 5 we have the following possibilities:

[Display (1.5.2): the seven Young tableaux with five boxes, one for each of the partitions (5), (4,1), (3,2), (3,1,1), (2,2,1), (2,1,1,1), (1,1,1,1,1).] (1.5.2)

Secondly one subtracts from the symmetrized tensor all of its available η-traces so as to make it traceless.

Fermions: The fermionic representations of so(1,D−1) are obtained by taking the tensor product of any of the available bosonic representations with the fundamental spinor representation. In other words a fermionic representation is a spinor-tensor $\Xi^{\alpha}_{\mu_1\mu_2\dots\mu_n}$ with one spinor index α and n vector indices. The spinor-tensor can be made irreducible by subtracting all of its γ-traces in order to make it γ-traceless.

The spin-statistics theorem states that any quantum field which transforms in a bosonic representation of the Lorentz group as defined above obeys Bose-Einstein statistics, while any field which transforms in a fermionic representation obeys Fermi-Dirac statistics. At the classical level this implies that bosonic fields are commuting and real-number valued, while fermionic fields are anti-commuting and Grassmann-number valued.

The above description of irreducible bosonic and fermionic representations will become clear through the analysis of a couple of examples. Consider for simplicity the case n = 2, which means a tensor $t_{\mu\nu}$ with two indices. The irreducible bosonic representations contained a priori in this tensor are three:
1. A symmetric traceless tensor defined as $\hat{t}_{(\mu\nu)} = t_{(\mu\nu)} - \frac{1}{D}\,\eta_{\mu\nu}\,\eta^{\rho\sigma} t_{\rho\sigma}$.
2. An antisymmetric tensor defined as $t_{[\mu\nu]}$.
3. A scalar defined by the trace of the original tensor, $\eta^{\rho\sigma} t_{\rho\sigma}$.
In the above discussion the round bracket (...) denotes symmetrization on the enclosed indices while the square bracket [...] denotes anti-symmetrization of the same.
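The three-fold splitting of a rank-two tensor can be made concrete with a short numerical sketch. It assumes a diagonal metric with entries ±1 (so that $\eta^{\rho\sigma}$ and $\eta_{\rho\sigma}$ have the same entries) and unit-weight symmetrization, $t_{(\mu\nu)} = \frac{1}{2}(t_{\mu\nu} + t_{\nu\mu})$; the function names are illustrative only.

```python
def decompose(t, eta):
    """Split t_{mu nu} into (symmetric traceless part, antisymmetric part, eta-trace)."""
    d = len(t)
    # eta is assumed diagonal with entries +-1, so eta^{rho sigma} equals eta_{rho sigma}.
    trace = sum(eta[r][s] * t[r][s] for r in range(d) for s in range(d))
    sym_traceless = [
        [0.5 * (t[m][n] + t[n][m]) - eta[m][n] * trace / d for n in range(d)]
        for m in range(d)
    ]
    antisym = [[0.5 * (t[m][n] - t[n][m]) for n in range(d)] for m in range(d)]
    return sym_traceless, antisym, trace


def dimensions(d):
    """Number of independent components of the three pieces; they add up to d * d."""
    return d * (d + 1) // 2 - 1, d * (d - 1) // 2, 1


if __name__ == "__main__":
    d = 4
    eta = [[(1.0 if m == 0 else -1.0) if m == n else 0.0 for n in range(d)] for m in range(d)]
    t = [[float(d * m + n + 1) for n in range(d)] for m in range(d)]
    s_hat, a, tr = decompose(t, eta)
    # The eta-trace of the symmetric traceless part vanishes.
    print(sum(eta[m][n] * s_hat[m][n] for m in range(d) for n in range(d)))
    print(dimensions(d), sum(dimensions(d)))
```

The original tensor is recovered as $t_{\mu\nu} = \hat{t}_{(\mu\nu)} + t_{[\mu\nu]} + \frac{1}{D}\eta_{\mu\nu}\,\eta^{\rho\sigma}t_{\rho\sigma}$, and in D = 4 the component count reads 9 + 6 + 1 = 16.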

1.5.1 The Fundamental Spinor Representation

As usual, it is easier to discuss representations at the level of the corresponding Lie algebras rather than at the finite group level. We saw that the generators $J_{\mu\nu}$ of the Lorentz algebra so(1,D−1), forming a set of D × D matrices which contains $\frac{1}{2}D(D-1)$ elements, satisfy the commutation relations (1.4.9). If we construct a representation of the D-dimensional Clifford algebra (1.5.1), then according to the notation introduced in the Appendix (see (A.3.1)) we can set:

$$J^{(s)}_{\mu\nu} = \frac{1}{4}\,\Gamma_{\mu\nu} \tag{1.5.3}$$
and we can easily verify that these generators satisfy the same commutation relations (1.4.9) as $J_{\mu\nu}$. So doing we succeeded in constructing a representation of the Lorentz algebra in dimension $2^{[D/2]}$, which is the dimension of the gamma matrices. Such a representation is the spinor representation. Fields valued in the carrier vector space of the latter are the Dirac spinor fields. They are usually denoted as follows:
$$\psi = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_{\ell-1} \\ \psi_{\ell} \end{pmatrix}; \quad \ell \equiv 2^{[D/2]} \tag{1.5.4}$$
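A numerical check of this construction in D = 4 may be sketched as follows. Two assumptions are made here and are not taken from the text: the Γ-matrices are realized in the familiar Dirac representation, and $\Gamma_{\mu\nu}$ is read as the plain commutator $[\Gamma_\mu, \Gamma_\nu]$, so that the generators are $\frac{1}{4}[\Gamma_\mu, \Gamma_\nu]$. If (A.3.1) normalizes $\Gamma_{\mu\nu}$ differently, the prefactor changes accordingly. The function names are illustrative.

```python
# Sketch for D = 4. Assumptions: Dirac representation of the gamma matrices,
# eta = diag(+1, -1, -1, -1), and Gamma_{mu nu} read as [Gamma_mu, Gamma_nu].
D = 4   # space-time dimension
N = 4   # spinor dimension 2^[D/2]
ETA = [1, -1, -1, -1]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)] for i in range(N)]


def combine(a, b, cb):
    """Return a + cb * b, entry by entry."""
    return [[a[i][j] + cb * b[i][j] for j in range(N)] for i in range(N)]


def is_zero(a, tol=1e-12):
    return all(abs(v) < tol for row in a for v in row)


def gamma_matrices():
    g0 = [[(1 if i < 2 else -1) if i == j else 0 for j in range(N)] for i in range(N)]
    gammas = [g0]
    for sig in PAULI:
        g = [[0] * N for _ in range(N)]
        for i in range(2):
            for j in range(2):
                g[i][j + 2] = sig[i][j]       # upper-right block: +sigma
                g[i + 2][j] = -sig[i][j]      # lower-left block: -sigma
        gammas.append(g)
    return gammas


GAMMA = gamma_matrices()


def clifford_holds():
    """Check {Gamma_a, Gamma_b} = 2 eta_ab * 1, i.e. (1.5.1)."""
    for a in range(D):
        for b in range(D):
            anti = combine(mul(GAMMA[a], GAMMA[b]), mul(GAMMA[b], GAMMA[a]), 1)
            diag = 2 * ETA[a] if a == b else 0
            unit = [[diag if i == j else 0 for j in range(N)] for i in range(N)]
            if not is_zero(combine(anti, unit, -1)):
                return False
    return True


def spin_generator(mu, nu):
    """(1/4) [Gamma_mu, Gamma_nu]."""
    comm = combine(mul(GAMMA[mu], GAMMA[nu]), mul(GAMMA[nu], GAMMA[mu]), -1)
    return [[0.25 * v for v in row] for row in comm]


def eta(a, b):
    return ETA[a] if a == b else 0


def lorentz_algebra_holds():
    """Check that the spin generators satisfy (1.4.9)."""
    r = range(D)
    for m in r:
        for n in r:
            for p in r:
                for q in r:
                    lhs = combine(mul(spin_generator(m, n), spin_generator(p, q)),
                                  mul(spin_generator(p, q), spin_generator(m, n)), -1)
                    rhs = [[0] * N for _ in range(N)]
                    rhs = combine(rhs, spin_generator(n, q), -eta(m, p))
                    rhs = combine(rhs, spin_generator(m, q), eta(n, p))
                    rhs = combine(rhs, spin_generator(m, p), -eta(n, q))
                    rhs = combine(rhs, spin_generator(n, p), eta(m, q))
                    if not is_zero(combine(lhs, rhs, -1)):
                        return False
    return True


if __name__ == "__main__":
    print(clifford_holds(), lorentz_algebra_holds())
```

Under the stated assumptions both checks succeed, which illustrates that the 4 × 4 spinor generators close the same algebra as the 4 × 4 vector generators of (1.4.11) although the two sets of matrices are different.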

The entries of ψ are generically complex. As we discuss in Sect. A.4 of the appendix, Dirac spinors are not necessarily irreducible. Depending on the dimension D we can impose the Majorana or the Weyl condition, which are Lorentz invariant, or even both of them and, in this way, we obtain irreducible spinors. A spinor tensor $\Xi^{\alpha}_{\mu_1\mu_2\dots\mu_n}$ that is irreducible both as a spinor and as a tensor can be further reduced by subtracting Lorentz invariant γ-traces. Consider for instance a spinor tensor $\Xi_{(\mu\nu)}$ which is symmetric and traceless as a rank two tensor:

$$\eta^{\mu\nu}\,\Xi_{(\mu\nu)} = 0 \tag{1.5.5}$$

In a Lorentz invariant way we can extract from $\Xi_{(\mu\nu)}$ a spinor vector by setting:

$$\Theta_\mu = \Gamma^\nu\,\Xi_{(\mu\nu)} \tag{1.5.6}$$

In order to obtain a fully irreducible representation of the Lorentz group we have to subtract such γ-traces:

$$\hat{\Xi}_{(\mu\nu)} = \Xi_{(\mu\nu)} - \frac{a}{D}\,\Gamma_{(\mu}\Gamma^{\rho}\,\Xi_{\nu)\rho} \tag{1.5.7}$$
where a is an appropriate coefficient that can be calculated in each dimension D in such a way that the new object $\hat{\Xi}_{(\mu\nu)}$ satisfies the condition $\Gamma^\nu\,\hat{\Xi}_{(\mu\nu)} = 0$ and corresponds to a fully irreducible representation of the D-dimensional Lorentz group.

1.5.2 The Two-Valued Homomorphism SO(1, 3) ≃ SL(2, ℂ) in the Four-Dimensional Case

Let us enlarge the set of Pauli matrices introducing also:
$$\sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{1.5.8}$$
and let us define the following linear combination of the four sigmas:
$$X = x^\mu \sigma_\mu = \begin{pmatrix} x^0 + x^3 & x^1 - i x^2 \\ x^1 + i x^2 & x^0 - x^3 \end{pmatrix} \tag{1.5.9}$$

Consider now a generic element A ∈ SL(2, ℂ). By definition A is a complex unimodular 2 × 2 matrix:
$$A = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}; \quad \det A = 1 \;\Leftrightarrow\; \alpha\delta - \beta\gamma = 1 \tag{1.5.10}$$

Calculating the determinant of X we find:

$$\det X = x^\mu x^\nu \eta_{\mu\nu} \tag{1.5.11}$$

On the other hand, for each A ∈ SL(2, ℂ) we have:
$$\det \tilde{X} \equiv \det\left(A^\dagger X A\right) = \det X \tag{1.5.12}$$
Since the $\sigma_\mu$ provide a complete basis set for 2 × 2 matrices it follows that $\tilde{X}$ is some other linear combination of the same matrices with new coefficients $\tilde{x}^\mu$:

$$\tilde{X} = \tilde{x}^\mu \sigma_\mu \tag{1.5.13}$$

Necessarily the new coefficients must be linear combinations of the old ones:
$$\tilde{x}^\mu = \Lambda^\mu{}_\nu\, x^\nu \tag{1.5.14}$$
and from (1.5.12) we deduce:

$$\tilde{x}^\mu \tilde{x}^\nu \eta_{\mu\nu} = x^\mu x^\nu \eta_{\mu\nu} \tag{1.5.15}$$

By virtue of its own definition the 4 × 4 matrix Λ ∈ SO(1, 3) is an element of the Lorentz group. This simple construction shows that to each element A ∈ SL(2, ℂ) we can uniquely associate a Lorentz group element Λ. The explicit form of the latter is easily obtained using the trace orthogonality of the σ matrices, namely $\frac{1}{2}\,\mathrm{Tr}(\sigma_\mu \sigma_\nu) = \delta_{\mu\nu}$. Relying on this we can write:
$$\Lambda^\mu{}_\nu = \frac{1}{2}\,\mathrm{Tr}\left(\sigma_\mu A^\dagger \sigma_\nu A\right) \tag{1.5.16}$$

It is evident from (1.5.16) that such a relation is not an isomorphism; rather it is a two-valued homomorphism, since the same matrix Λ corresponds to the two matrices A and −A. In proper mathematical language this homomorphism is a local isomorphism, since the corresponding Lie algebras sl(2, ℂ) and so(1, 3) are isomorphic. We conclude that the fundamental representation of the group SL(2, ℂ) is actually a complex two-dimensional representation of the Lorentz group SO(1, 3). Which representation is it? The answer is easily given: it is that provided by a Weyl spinor. Indeed the Weyl condition halves the number of non-vanishing components of a Dirac spinor and from four we step down to two.
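The map from SL(2, ℂ) to the Lorentz group can be explored with a small numerical sketch. It assumes the transformation rule $\tilde{X} = A^\dagger X A$ of (1.5.12), which fixes the trace formula to $\Lambda^\mu{}_\nu = \frac{1}{2}\mathrm{Tr}(\sigma_\mu A^\dagger \sigma_\nu A)$, and the metric η = diag(+1, −1, −1, −1); the function names and the sample matrix are illustrative choices.

```python
import math

# Sketch of (1.5.16), assuming X~ = A^dagger X A as in (1.5.12).
SIGMA = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the Minkowski metric


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def lorentz_from_sl2c(a):
    """Lambda^mu_nu = (1/2) Tr(sigma_mu A^dagger sigma_nu A)."""
    ad = dagger(a)
    lam = []
    for mu in range(4):
        row = []
        for nu in range(4):
            m = mul(mul(mul(SIGMA[mu], ad), SIGMA[nu]), a)
            row.append(0.5 * complex(m[0][0] + m[1][1]).real)
        lam.append(row)
    return lam


def preserves_metric(lam, tol=1e-9):
    """Check Lambda^T eta Lambda = eta."""
    return all(
        abs(sum(lam[r][m] * ETA[r] * lam[r][n] for r in range(4))
            - (ETA[m] if m == n else 0.0)) < tol
        for m in range(4) for n in range(4)
    )


if __name__ == "__main__":
    # A sample unimodular matrix: (1 + i) * 1 - 2 * (i / 2) = 1.
    a = [[1 + 1j, 2], [0.5j, 1]]
    minus_a = [[-v for v in row] for row in a]
    print(preserves_metric(lorentz_from_sl2c(a)))
    print(lorentz_from_sl2c(a) == lorentz_from_sl2c(minus_a))
    # A diagonal A generates a boost along the third axis with rapidity xi.
    xi = 0.8
    boost = lorentz_from_sl2c([[math.exp(xi / 2), 0], [0, math.exp(-xi / 2)]])
    print(boost[0][0], math.cosh(xi), boost[0][3], math.sinh(xi))
```

Because A enters the trace formula quadratically, A and −A yield the same Λ, which is the two-valuedness discussed above.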

1.6 Lorentz Covariant Field Theories and the Little Group

Once the principles of special relativity have been accepted, the classical and quantum field theories one is led to consider are described, in D dimensions, by an action principle of the form:
$$\mathcal{A} = \int \mathcal{L}\left(\{\phi\}, \{\partial\phi\}, x\right) d^D x \tag{1.6.1}$$
where {φ(x)} denotes collectively a set of fields, each of which belongs to some representation of the Lorentz group, either bosonic or fermionic, and {∂φ(x)} denotes collectively the set of their derivatives with respect to the space-time coordinates:

$$\partial_\mu \phi(x) \equiv \frac{\partial}{\partial x^\mu}\,\phi(x) \tag{1.6.2}$$
The Lagrangian density $\mathcal{L}(\{\phi\}, \{\partial\phi\}, x)$ is required to be invariant under Lorentz transformations. In addition we always assume that the full action is invariant under space-time translations, namely under transformations of the following form:

$$x^\mu \to x^\mu + c^\mu \tag{1.6.3}$$
where $c^\mu$ is a set of constant parameters. As an abstract group, the translation group in D dimensions T(D) is isomorphic to the Abelian non-compact Lie group $\mathbb{R}^D$. Its generators are named $P^\mu$ and can be identified with the total momentum operators, which we declared to be constant in all physical processes (see Principle 1.3.3). This is automatically guaranteed by translation invariance of the action via Noether's theorem, which we recall later on in this chapter (see Sect. 1.7). Putting together space-time translations and the Lorentz group results in a semidirect product:

$$\mathrm{ISO}(1,D-1) = T(D) \rtimes \mathrm{SO}(1,D-1) \tag{1.6.4}$$
which is named the D-dimensional Poincaré group (see Fig. 1.7). The corresponding Lie algebra is described by the following commutation relations:

Fig. 1.7 Jules Henri Poincaré (1854–1912) was born near Nancy in a very influential French family. One of his cousins became President of the French Republic during the time of World War One, namely from 1913 to 1920. By that time, however, the great mathematician relative of the President was already dead. Henri Poincaré is often considered one of the last universal geniuses. His contributions to all branches of Mathematics are so extensive and profound that they produce a sense of astonishment. Poincaré's education was in Paris at the École Polytechnique, where he had such a teacher as Charles Hermite. After graduation he taught for some time at the University of Caen but, still very young, in 1881 he was appointed professor at the Sorbonne and at the age of 32 he was already elected member of the French Academy of Sciences. In 1909, three years before his death, he became a member of the Académie Française. The major contributions of Poincaré to Mathematics are the complete solution of the three-body problem in Newtonian mechanics, the foundation of algebraic geometry and topology, where in 1894 he introduced the notion of the fundamental group and posed one of the most famous mathematical conjectures, the clear-cut formulation of non-Euclidean hyperbolic geometry and finally his controversial contribution to the birth of special relativity [6]

$$[P_\mu, P_\nu] = 0 \tag{1.6.5}$$

$$[J_{\mu\nu}, P_\rho] = -\eta_{\mu\rho} P_\nu + \eta_{\nu\rho} P_\mu \tag{1.6.6}$$

$$[J_{\mu\nu}, J_{\rho\sigma}] = -\eta_{\mu\rho} J_{\nu\sigma} + \eta_{\nu\rho} J_{\mu\sigma} - \eta_{\nu\sigma} J_{\mu\rho} + \eta_{\mu\sigma} J_{\nu\rho} \tag{1.6.7}$$
which clearly expose the semidirect product structure. The momentum generators commute among themselves (1.6.5) but they transform in the fundamental representation of the Lorentz group as imposed by (1.6.6)–(1.6.7). We quote a couple of examples of Poincaré invariant action functionals that we also use later on, while discussing Noether's theorem (see Fig. 1.8). The first example is given by the free Dirac Lagrangian for an electron or another charged fermion which, utilizing the conventions and notations of Appendix A.4, takes the following form:
$$\mathcal{A}_{\rm Dirac} = \int \left( i\,\bar{\psi}\gamma^\mu \partial_\mu \psi - m\,\bar{\psi}\psi \right) d^D x \tag{1.6.8}$$

Fig. 1.8 Amalie Emmy Noether (1882–1935), together with Henrietta Leavitt and Madame Curie, is one among the very few but very great women scientists who lived at the end of the XIX and the beginning of the XX century. German by nationality, she was born in a Jewish family in the Bavarian city of Erlangen, the same city from where in 1872, ten years before her birth, Felix Klein had announced his famous programme, reducing the classification of possible geometries to the classification of Lie groups under which the geometric relations are invariant. Emmy's father was also a mathematician and she studied at the University of Erlangen. After working several years as a voluntary assistant without salary, in 1915, just after the outbreak of World War One, she was invited by David Hilbert and Felix Klein to what was, by that time, the very center of the scientific world, namely the University of Göttingen. She had to suffer the prejudiced opposition of the faculty against women and obtained her habilitation only in 1919, after the defeat of Germany and the end of the war. Her algebraic Göttingen school became renowned around the world and she was described by David Hilbert and Albert Einstein as the most important woman in the history of mathematics. Although in theoretical physics Emmy Noether is mostly known for her theorem on the relation between symmetries and conserved currents, her major contributions were in pure mathematics and in abstract algebra in particular, which she contributed to refound. To this effect it suffices to recall the notion of Noetherian rings. It must be noted that David Hilbert invited Miss Noether to Göttingen precisely because he was puzzled by the issue of energy conservation in Einstein's theory of Gravitation. The fact that gravitational energy could gravitate seemed to him a violation of the energy conservation theorem. By means of her theorem, Emmy Noether solved the problem not only for General Relativity but for all systems endowed with a continuous group of symmetries. In 1932, at the time of her plenary address to the International Congress of Mathematicians in Zürich, Emmy Noether was at the top of her mathematical career and a world-wide recognized authority. She had also worked, for the winter semester 1928–1929, at Moscow State University, where she collaborated with Lev Pontryagin and Nikolai Chebotaryov. In the same year 1932, together with Emil Artin, she received a long-due recognition by means of the Ackermann-Teubner Memorial Award for Mathematics. In 1933 Hitler rose to power, Emmy's chair in Göttingen was revoked and she emigrated to the United States of America, where she obtained a chair in Bryn Mawr College in Pennsylvania. Unfortunately two years later, in 1935, she died from cancer

The second example we mention is provided by the action functional for a scalar field, with a self-interaction encoded in a potential function W(ϕ):
$$\mathcal{A}_{\rm scalar} \to \mathcal{A}_{KG} = \frac{1}{4}\int \left( \frac{1}{2}\,\partial_\mu\varphi\,\partial_\nu\varphi\,\eta^{\mu\nu} - W(\varphi) \right) d^D x \tag{1.6.9}$$
As is extensively explained in most introductory textbooks on quantum field theory, under these conditions each Lorentz field determines an induced unitary irreducible representation (UIR) of the Poincaré group ISO(1,D−1), which is the mathematical concept corresponding to the physical concept of a particle. Such UIRs are characterized by the values of two Casimir invariants that we can identify with the mass and the spin of the corresponding particle. To make a long story very short, we can say that a UIR of the Poincaré group can be identified with the Hilbert space spanned by the finite norm solutions of the free field equation suitable to the field of spin s that we consider. For instance in the spin zero case, which corresponds to the case of a scalar field, the free equation of motion is:

$$\Box\,\phi(x) + m^2 \phi(x) = 0 \tag{1.6.10}$$

where $\Box \equiv \partial^\mu \partial_\mu$ is the d'Alembert operator, while the mass is determined by the expansion, up to quadratic order, of the potential function:

$$W(\varphi) = W_0 + \frac{1}{2}\,m^2 \phi^2 + \mathcal{O}\left(\phi^3\right) \tag{1.6.11}$$
The standard method of solution of (1.6.10) is through Fourier transforms. We write:
$$\phi(x) = \frac{1}{(2\pi)^D} \int d^D k\, \exp\left[-i\,k^\mu x^\nu \eta_{\mu\nu}\right] \varphi(k) \tag{1.6.12}$$
where $k^\mu$ is interpreted as the D-momentum of a particle state or the wave-vector of a free propagating wave, which amount to the same thing in quantum mechanics. In momentum space, after Fourier transform, the free equation (1.6.10) becomes:
$$\left(-k^\mu k_\mu + m^2\right) \varphi(k) = 0 \tag{1.6.13}$$
which simply requires that the momentum vector should be on the m² mass-shell:²
$$k^0 = \pm\sqrt{\mathbf{k}^2 + m^2}; \quad k^i = \mathbf{k}^i \tag{1.6.14}$$
where $\mathbf{k}$ is an arbitrary space momentum-vector.

The key point in discussing the induced UIRs is the fact that, for whatever type of Lorentz field, the momentum is always a vector, namely it belongs to the fundamental representation of SO(1,D−1). Hence we can use Lorentz transformations to reduce $k^\mu$ to a standard normal form and then study the so-called little group, which is defined as that subgroup G ⊂ SO(1,D−1) which leaves the normal form invariant. There are two cases:

² Note that from now on we use natural units where c = 1. The fundamental constants can be reinstated at any moment, if necessary, through the use of dimensional analysis.

Massive Fields: When the momentum vector $k^\mu$ is time-like, by means of a suitable Lorentz transformation we can always go to the particle rest frame where $\mathbf{k} = 0$ and $k^0 = \pm m$. The subgroup which leaves D-vectors of this form invariant is obviously the compact rotation subgroup SO(D − 1), which plays the role of little group in this case.

Massless Fields: When the momentum vector $k^\mu$ is null-like, by means of Lorentz transformations the best we can do is to rotate it to the normal form:

$$k^0 = \omega; \quad k^1 = \pm\omega; \quad k^i_\perp = 0; \quad i = 2, \dots, D-1 \tag{1.6.15}$$

which describes a free wave propagating in the direction of the first axis at the speed of light. In this case the little group is smaller and corresponds to the rotation group in the space perpendicular to the wave propagation line, namely it is SO(D − 2).

In the case of the scalar field, ϕ(k) is a singlet representation of the Lorentz group and as such it is also a singlet representation of the little group. For fields in non-trivial representations of the Lorentz group, the essential point is that, using all the global and local symmetries of the action, once the momentum vector is put into the normal form, ϕ(k) reduces to a representation of the little group. It is this representation that yields the spin of the corresponding particle and establishes the number of on-shell degrees of freedom. As an example we consider the action functional for a massive vector field, which reads as follows:³
$$\mathcal{A}_{MV} = \int d^D x \left[ -\frac{1}{4}\left(\partial_\mu V_\nu - \partial_\nu V_\mu\right)\left(\partial^\mu V^\nu - \partial^\nu V^\mu\right) + \frac{1}{2}\,m^2\,V_\mu V^\mu \right] \tag{1.6.16}$$

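The corresponding field equation reads as follows:

$$\Box\,V_\mu - \partial_\mu\,\partial\cdot V + m^2\,V_\mu = 0 \tag{1.6.17}$$

where ∂ · V is a shorthand notation for $\partial^\mu V_\mu$. Taking a further derivative $\partial^\mu$ of (1.6.17), the first two terms cancel against each other and we obtain:
$$0 = m^2\,\partial\cdot V \;\Rightarrow\; \partial\cdot V = 0 \tag{1.6.18}$$
Hence the original field equation is equivalent to the system:

$$\Box\,V_\mu(x) + m^2\,V_\mu(x) = 0 \tag{1.6.19}$$
$$\partial\cdot V = 0 \tag{1.6.20}$$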

³ From now on we use the Einstein convention according to which indices are raised and lowered with the Minkowski metric, namely $V^\mu \equiv \eta^{\mu\nu} V_\nu$, and repeated upper-lower indices (or vice-versa) denote summation.

By means of Fourier transform, (1.6.19) takes the same form as (1.6.13) with ϕ(k) substituted by $V_\nu(k)$. The auxiliary condition (1.6.20) becomes

$$k^\mu V_\mu(k) = 0 \tag{1.6.21}$$

So when the momentum vector is rotated to the rest frame, (1.6.21) implies $V_0 = 0$ and what remains is $V_i$, namely a vector representation of the little group SO(D − 1), which contains D − 1 states. In the massless case one arrives at the same reduction to a representation of the little group SO(D − 2) but in a different way, namely using local gauge invariances. For instance let us consider the case of a massless vector field. The action is the same as that in (1.6.16) but with m = 0. Correspondingly the field equation is just:

$$\Box\,V_\mu - \partial_\mu\,\partial\cdot V = 0 \tag{1.6.22}$$

In this case the condition ∂ · V = 0 cannot be derived from the equation, but it can be imposed as a gauge fixing condition since, at m = 0, the action is invariant under the following local symmetry:

$$V_\mu(x) \to V_\mu(x) + \partial_\mu \lambda(x) \tag{1.6.23}$$

A careful use of this symmetry allows one to show that, at the end of the day, when $k^\mu$ is reduced to the normal form (1.6.15) of a light-like vector, the only remaining degrees of freedom of $V_\nu(k)$ are those of an SO(D − 2) vector living in the space perpendicular to the wave propagation. We do not dwell on the details of this derivation since we will address it for the graviton in comparison with the photon in Sect. 5.7.1. The important message to be remembered is that the degrees of freedom of a Lorentz field are given by the dimension of the corresponding representation of the little group: SO(D − 1) in the massive case, SO(D − 2) in the massless one.
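The two normal forms of the momentum and the resulting count of vector polarizations can be illustrated with a few lines of code. This is only a sketch in natural units (c = 1): it uses the hyperbolic rotation (1.4.12) restricted to the components $(k^0, k^1)$, and the function names are illustrative.

```python
import math


def boost_along_axis_1(k0, k1, xi):
    """Apply the hyperbolic rotation (1.4.12) to the components (k^0, k^1)."""
    c, s = math.cosh(xi), math.sinh(xi)
    return c * k0 + s * k1, s * k0 + c * k1


def to_rest_frame(k0, k1):
    """For a time-like (k^0, k^1) return the boosted components, expected to be (+-m, 0)."""
    xi = -math.atanh(k1 / k0)
    return boost_along_axis_1(k0, k1, xi)


def on_shell_degrees_of_freedom(dimension, massive):
    """Size of the vector representation of the little group: SO(D-1) or SO(D-2)."""
    return dimension - 1 if massive else dimension - 2


if __name__ == "__main__":
    # Time-like momentum with m = 4: the boost reaches the rest frame.
    print(to_rest_frame(5.0, 3.0))
    # Null momentum: any boost only rescales omega, the vector stays of the form (omega, omega).
    print(boost_along_axis_1(2.0, 2.0, 0.5))
    print(on_shell_degrees_of_freedom(4, massive=True),
          on_shell_degrees_of_freedom(4, massive=False))
```

For a null momentum no boost can make the spatial part vanish, which is why the normal form (1.6.15) is the best one can do; in D = 4 the counting gives three states for a massive vector and two for a massless one.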

1.6.1 Representations of the Massless Little Group in D = 4

In view of the conclusions reached in the previous sections it is useful to consider the representations of the massless little group for the physically relevant case D = 4. In this case there are some peculiarities since all representations of SO(2) happen to be two-dimensional and characterized by a single number s that is the spin of the corresponding massless particles. Let us see how this happens. To begin with, an irreducible bosonic representation of SO(2) is a traceless symmetric tensor with s indices:
$$\underbrace{\square\,\square\,\cdots\,\square}_{s\ \text{boxes}} \qquad t_{a_1 \dots a_s} \tag{1.6.24}$$
The number of independent components of such a tensor is easily calculated. In d = 2 a symmetric object with s indices has $\frac{(1+s)!}{s!}$ components. Yet the trace of such an object with respect to an arbitrary pair of indices is again a tensor with s − 2 indices and hence with $\frac{(s-1)!}{(s-2)!}$ components. It follows that the total number of independent components is:
$$\frac{(1+s)!}{s!} - \frac{(s-1)!}{(s-2)!} = 2 \tag{1.6.25}$$
independently of s. As representatives of the independent components it is convenient to choose $x = t_{11\dots1}$ and $y = t_{22\dots2}$ and consider the identification of all the other components with one of these two or with its negative. For instance in the case s = 3 we have:
$$\begin{pmatrix} t_{111} \\ t_{112} \\ t_{122} \\ t_{222} \end{pmatrix} \overset{\text{traceless}}{\Longrightarrow} \begin{pmatrix} x \\ -y \\ -x \\ y \end{pmatrix} \tag{1.6.26}$$
Let
$$\mathrm{SO}(2) \ni A(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \tag{1.6.27}$$
be an element of the fundamental representation of SO(2). The standard transformation under A of a symmetric tensor:

$$t'_{a_1 \dots a_s} = A_{a_1}{}^{b_1} \cdots A_{a_s}{}^{b_s}\, t_{b_1 \dots b_s} \tag{1.6.28}$$
induces on the vector of the two independent components (x, y) another SO(2) transformation of the form:
$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos s\theta & -\sin s\theta \\ \sin s\theta & \cos s\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \equiv D_s\left[A(\theta)\right] \begin{pmatrix} x \\ y \end{pmatrix} \tag{1.6.29}$$
where the rotation angle is sθ, rather than the original θ. By definition, for all s ∈ ℕ
$$D_s\left[A(\theta)\right] = A(s\theta) \tag{1.6.30}$$
is the integer spin s representation of the SO(2) group element A(θ).
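The statement that the two independent components rotate through an angle of magnitude sθ can be tested numerically for s = 3. The sketch below builds the tensor from (x, y) following the identification (1.6.26), transforms it as in (1.6.28), and reads off the induced 2 × 2 matrix. One assumption should be flagged: the sense of the induced rotation (the sign of the off-diagonal entries) depends on the sign conventions adopted for A(θ) and for the identification of the components, so the check below only tests the magnitude of the angle. The function names are illustrative.

```python
import math
from itertools import product


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def tensor_from_xy(x, y):
    """Traceless symmetric rank-3 tensor in d = 2, following (1.6.26).

    Index value 0 stands for the label 1 and index value 1 for the label 2,
    so sum(idx) counts how many indices equal 2.
    """
    values = {0: x, 1: -y, 2: -x, 3: y}
    return {idx: values[sum(idx)] for idx in product((0, 1), repeat=3)}


def transform(t, a):
    """t'_{a1 a2 a3} = A_{a1}^{b1} A_{a2}^{b2} A_{a3}^{b3} t_{b1 b2 b3}, as in (1.6.28)."""
    return {
        idx: sum(a[idx[0]][b[0]] * a[idx[1]][b[1]] * a[idx[2]][b[2]] * t[b] for b in t)
        for idx in t
    }


def induced_matrix(theta):
    """2 x 2 matrix acting on (x, y) = (t_111, t_222)."""
    a = rotation(theta)
    columns = []
    for x, y in ((1.0, 0.0), (0.0, 1.0)):
        tp = transform(tensor_from_xy(x, y), a)
        columns.append((tp[(0, 0, 0)], tp[(1, 1, 1)]))
    return [[columns[0][0], columns[1][0]], [columns[0][1], columns[1][1]]]


if __name__ == "__main__":
    theta = 0.3
    m = induced_matrix(theta)
    print(m[0][0], math.cos(3 * theta))
    print(abs(m[1][0]), abs(math.sin(3 * theta)))
```

The transformed tensor remains of the form (1.6.26), so the pair (x', y') indeed carries all the information, and the diagonal entries of the induced matrix equal cos 3θ.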

1.7 Noether’s Theorem, Noether’s Currents and the Stress-Energy Tensor

We already touched upon the use of Noether's theorem in a previous section. Because of the fundamental relation between symmetries, conserved currents and Bianchi identities, which is at the heart of all gauge field theories, it is convenient to recall the form of this very general and fundamental theorem at the end of the present chapter. Let us consider a classical field theory, containing a set of fields $\phi^i(x)$ whose dynamics is dictated by the action (1.6.1). Let us moreover suppose that the above action admits a Lie group G of symmetries. Naming $T_A$ the generators of the corresponding Lie algebra $\mathbb{G}$:

$$[T_A, T_B] = f_{AB}{}^{C}\, T_C \tag{1.7.1}$$
and $\varepsilon^A$ the corresponding infinitesimal parameters, we assume the following concrete realization of the generators by infinitesimal transformations:
$$\begin{aligned}
\left(1 + \varepsilon^A T_A\right) x^\mu &= x^\mu + \delta x^\mu; & \delta x^\mu &= \varepsilon^A \Delta^\mu_A(x) \\
\left(1 + \varepsilon^A T_A\right) \phi^i &= \phi^i + \delta\phi^i; & \delta\phi^i &= \varepsilon^A \Theta^i_A(x)
\end{aligned} \tag{1.7.2}$$
which by hypothesis leave the action (1.6.1) invariant. Under these conditions Noether's theorem⁴ states that to each generator $T_A$ is associated a conserved current whose form is the following one:
$$j^\nu_A = -\frac{\partial\mathcal{L}}{\partial\,\partial_\nu\phi^i}\,\Theta^i_A + \left( \frac{\partial\mathcal{L}}{\partial\,\partial_\nu\phi^i}\,\partial_\sigma\phi^i - \mathcal{L}\,\delta^\nu_\sigma \right) \Delta^\sigma_A \tag{1.7.3}$$
$$\partial_\nu\, j^\nu_A = 0 \tag{1.7.4}$$

Examples of application of Noether's theorem are provided by all Lorentz invariant field theories. In Chap. 5 we will analyse its application to the calculation of the stress-energy tensor. Here let us just consider two examples related to the spinor and the scalar field. According to standard notations (for conventions see Appendix A) the traditional action for a free Dirac spinor field, which might describe the electron, the muon, the proton or the neutron, is that given in (1.6.8). Apart from Lorentz symmetry, another important symmetry of this action is that under phase transformations of a constant angle θ:

$$\psi \to \exp[i e \theta]\,\psi; \qquad \bar{\psi} \to \exp[-i e \theta]\,\bar{\psi} \tag{1.7.5}$$

As we shall argue in Chap. 5 this transformation is at the basis of first classical and then Quantum Electrodynamics. The infinitesimal form of this transformation fits into the scheme of Noether's theorem with:
$$\begin{aligned}
\left(1 + \theta\, T_\bullet\right) x^\mu &= x^\mu + 0; & \delta x^\mu &= 0; & \Delta^\mu_\bullet &= 0 \\
\left(1 + \theta\, T_\bullet\right) \psi &= \psi + \delta\psi; & \delta\psi &= \theta\,\Theta_\bullet; & \Theta_\bullet &= i e\,\psi
\end{aligned} \tag{1.7.6}$$

The fact that $\Delta^\mu_\bullet$ vanishes tells us that the considered transformation is an internal symmetry of the theory which affects only fields but has no action on the points of Minkowski space-time. Applying Noether's theorem as stated in (1.7.3) we can construct the corresponding conserved Noether current:

$$j^\mu = e\,\bar{\psi}\gamma^\mu\psi \tag{1.7.7}$$

⁴ Noether's theorem was derived in 1915 in Göttingen and was published in 1918 in [10].

This is the electric current which, by coupling to the electromagnetic gauge potential $A_\mu$, gives rise to Electrodynamics.

As a second example let us consider the case of a scalar field ϕ(x), whose standard action was written in (1.6.9). In Chap. 5 we shall reconsider this action from the point of view of its gravitational coupling and we shall rewrite it in the vielbein formalism. Here it is just considered as the starting point of a dynamical Poincaré invariant field theory in Minkowski space. Its invariance under space-time translations is evident and this leads to the conservation of an associated current, the stress-energy tensor. Let us compute this current following Noether's theorem. Naming $P_\rho$ the generators of space-time translations, as we already did above, the infinitesimal transformations are as follows:
$$\begin{aligned}
\left(1 + \varepsilon^\rho P_\rho\right) x^\mu &= x^\mu + \delta x^\mu; & \delta x^\mu &= \varepsilon^\rho \Delta^\mu_\rho; & \Delta^\mu_\rho &= \delta^\mu_\rho \\
\left(1 + \varepsilon^\rho P_\rho\right) \varphi &= \varphi + 0; & \delta\varphi &= \varepsilon^\rho \Theta_\rho; & \Theta_\rho &= 0
\end{aligned} \tag{1.7.8}$$

The fact that $\Theta_\rho = 0$ signals that translations are just the opposite case with respect to that considered before. Translations are purely space-time symmetries and the momentum operator $P_\mu$ has no non-trivial action in the space of fields. Applying formula (1.7.3) to the present case we obtain the conserved Noether current of translations:
$$T^\mu{}_\rho = \frac{1}{4}\left[ \partial^\mu\varphi\,\partial_\rho\varphi - \frac{1}{2}\,\delta^\mu_\rho\,\partial^\sigma\varphi\,\partial_\sigma\varphi + \delta^\mu_\rho\,W(\varphi) \right] \tag{1.7.9}$$
which, as we are going to see in Chap. 5, coincides with the definition of the stress-energy tensor as variation of the matter action with respect to the metric. Indeed it suffices to lower the first index of the calculated current with the Minkowski metric $\eta_{\mu\sigma}$ and we obtain the symmetric tensor:
$$T_{\rho\sigma} \equiv \eta_{\sigma\mu}\,T^\mu{}_\rho = \frac{1}{4}\left[ \partial_\rho\varphi\,\partial_\sigma\varphi - \frac{1}{2}\,\eta_{\rho\sigma}\,\partial^\lambda\varphi\,\partial_\lambda\varphi + \eta_{\rho\sigma}\,W(\varphi) \right] \tag{1.7.10}$$
which can be compared with the result (5.6.41) obtained in Sect. 5.6.4.

1.8 Criticism of Special Relativity: Opening the Road to General Relativity

Let us now consider Special Relativity in retrospect.

Through the implications of Maxwell's Electromagnetic Theory and by means of a complicated historical path, Special Relativity arrives at the unification of time and space into Minkowski space-time and replaces the Galileo group with the Lorentz group as the correct group of transformations that relate one inertial reference frame to the other. Special Relativity encodes a spectacular conceptual advance, yet it does not solve, rather it shares with Classical Newtonian Physics, the logical weakness of being founded on circular reasoning. Indeed both in Newtonian Physics and in Special Relativity we adhere to the following way of arguing:
• We have fundamental laws of Nature that apply only in special reference frames, the inertial ones.
• How are the inertial frames defined?
• As those where the fundamental laws of Nature that we have constructed apply.
Furthermore, while Maxwell Theory is automatically Lorentz covariant, gravitation, as described by Newton's law of universal attraction, is by no means Lorentz covariant and needs to be revised in order to be reconciled with relativity.

Where to start in order to overcome these two problems? For those who know a little bit of differential geometry, and the reader of the next chapter will be such a person, something which immediately appears very specific and probably too restrictive is the character of Minkowski space-time. It is, in the language of the next chapter, a manifold, actually a , but it is also an affine variety, namely it is a vector space. Physically this means that although we have given up absolute space-distance, we have not yet given up absolute space-time separation of events. Given any two events $x^\mu$, $y^\mu$ we can still define their absolute separation as:
$$\Delta^2(x, y) = (x - y,\, x - y) \tag{1.8.1}$$
where ( , ) denotes the Minkowskian scalar product.

Einstein's intuition was that in order to remove circular reasoning and formulate laws of Nature that apply in any reference frame, one had to give up the notion of absolute space-time distances. What we are allowed to do is just to measure the length of any curve drawn in space-time, which should not be required to be an affine manifold, rather just a manifold. The rule to calculate such distances is encoded in the metric tensor of Riemannian geometry and Einstein discovered that such a geometrical object is nothing else but the gravitational field. To arrive at these conclusions Einstein studied differential geometry, which had independently and slowly developed for about eighty years and was coming to maturity just at the dawn of the new century. The student has to do the same, and the next two chapters have been written to help in this task.

References

1. Maxwell, J.C.: A dynamical theory of the electromagnetic field. Philos. Trans. R. Soc. Lond. 155, 459–512 (1865)
2. Maxwell, J.C.: A Treatise on Electricity and Magnetism. Clarendon Press, Oxford (1873)
3. Lorentz, H.A.: Electromagnetic phenomena in a system moving with any velocity smaller than that of light. Proc. Acad. Sci. Amst. 6, 809–831 (1904)
4. Einstein, A.: Zur Elektrodynamik bewegter Körper. Ann. Phys. 17, 891 (1905)
5. Lorentz, H.A.: Simplified theory of electrical and optical phenomena in moving systems. Proc. Acad. Sci. Amst. 1, 427–442 (1899)
6. Poincaré, H.: La théorie de Lorentz et le principe de réaction. Arch. Néerl. Sci. Exactes Nat. V, 253–278 (1900)
7. Cartan, É.: Les groupes projectifs qui ne laissent invariante aucune multiplicité plane. Bull. Soc. Math. Fr. 41, 53–96 (1913)
8. Pauli, W.: Zur Quantenmechanik des magnetischen Elektrons. Z. Phys. 43 (1927)
9. Dirac, P.A.M.: The quantum theory of the electron. Proc. R. Soc. Lond. A 117, 610–624 (1928)
10. Noether, E.: Invariante Variationsprobleme. Nachr. König. Ges. Wiss. Gött., Math.-Phys. Kl., 235–257 (1918)

Chapter 2 Basic Concepts About Manifolds and Fibre Bundles

Mathematics, the Queen of Sciences. . . Carl Friedrich Gauss

2.1 Introduction

General Relativity is founded on the concept of differentiable manifolds. The mathematical model of space-time that we adopt is given by a pair (M, g) where M is a differentiable manifold of dimension D = 4 and g is a metric, that is a rule to calculate the length of curves connecting points of M. In physical terms the points of M take the name of events while every physical process is a continuous succession of events. In particular the motion of a point-like particle is represented by a world-line, namely a curve in M, while the motion of an extended object of dimension p is given by a d = p + 1 dimensional world-volume obtained as a continuous succession of p-dimensional hypersurfaces Σ_p ⊂ M.
Therefore, the discussion of such physical concepts is necessarily based on a collection of geometrical concepts that constitute the backbone of differential geometry. The latter is at the basis not only of General Relativity but of all Gauge Theories by means of which XX century Physics obtained a consistent and experimentally verified description of all Fundamental Interactions. The central notions are those which fix the geometric environment:
• Differentiable Manifolds
• Fibre-Bundles
and those which endow such environment with structures accounting for the measure of lengths and for the rules of parallel transport, namely:
• Metrics
• Connections
Once the geometric environments are properly mathematically defined, the metrics and connections one can introduce over them turn out to be the structures which encode the Fundamental Forces of Nature. The present chapter introduces Differentiable Manifolds and Fibre-Bundles while the next one is devoted to a thorough discussion of Metrics and Connections.

P.G. Frè, Gravity, a Geometrical Course, DOI 10.1007/978-94-007-5361-7_2, © Springer Science+Business Media Dordrecht 2013

2.2 Differentiable Manifolds

First and most fundamental in the list of geometrical concepts we need to introduce is that of a manifold, which corresponds, as we already explained, to our intuitive idea of a continuous space. In mathematical terms this is, to begin with, a topological space, namely a set of elements where one can define the notions of neighborhood and limit. This is the correct mathematical description of our intuitive ideas of vicinity and close-by points. Secondly, the characterizing feature that distinguishes a manifold from a simple topological space is the possibility of labeling its points with a set of coordinates. Coordinates are a set of real numbers $x^1(p), \dots, x^D(p) \in \mathbb{R}$ associated with each point p ∈ M that tell us where we are. Actually in General Relativity each point is an event, so that coordinates specify not only its where but also its when. In other applications the coordinates of a point can be the most disparate parameters specifying the state of some complex system of the most general kind (dynamical, biological, economical or whatever).
In classical physics the laws of motion are formulated as a set of differential equations of the second order where the unknown functions are the three Cartesian coordinates x, y, z of a particle and the variable t is time. Solving the dynamical problem amounts to determining the continuous functions x(t), y(t), z(t), which yield a parametric description of a curve in $\mathbb{R}^3$ or, better, define a curve in $\mathbb{R}^4$, having included the time t in the list of coordinates of each event. Coordinates, however, are not uniquely defined. Each observer has his own way of labeling space points and the laws of motion take a different form if expressed in the coordinate frames of different observers. There is however a privileged class of observers in whose frames the laws of motion always have the same form: these are the inertial frames, which are in rectilinear relative motion with constant velocity.
The existence of a privileged class of inertial frames is common to classical Newtonian physics and to Special Relativity: the only difference is the form of the coordinate transformations connecting them, Galileo transformations in the first case and Lorentz transformations in the second. This goes hand in hand with the fact that the space-time manifold is the flat affine¹ manifold $\mathbb{R}^4$ in both cases. By definition all points of $\mathbb{R}^N$ can be covered by one coordinate frame {x^i} and all frames with such a property are related to each other by general linear transformations, that is by the elements of the general linear group GL(N, R):
$$ x'^{\,i} = A^i{}_j \, x^j ; \qquad A^i{}_j \in \mathrm{GL}(N, \mathbb{R}) \tag{2.2.1} $$
The restriction to the Galilei or Lorentz subgroups of GL(4, R) is a consequence of the different scalar product on $\mathbb{R}^4$ vectors one wants to preserve in the two cases, but the relevant common feature is the fact that the space-time manifold has a vector-space structure. The privileged coordinate frames are those that use the corresponding vectors as labels of each point.
A different situation arises when the space-time manifold is not flat, like, for instance, the surface of a hypersphere $S^N$. As cartographers know very well, there is no way of representing all points of a curved surface in a single coordinate frame, namely in a single chart. However we can succeed in representing all points of a curved surface by means of an atlas, namely by a collection of charts, each of which maps one open region of the surface and such that the union of all these regions covers the entire surface. Knowing the transition rule from one chart to the next one, in the regions where they overlap, we obtain a complete coordinate description of the curved surface by means of our atlas.
The intuitive idea of an atlas of open charts, suitably reformulated in mathematical terms, provides the very definition of a differentiable manifold, the geometrical concept that generalizes our notion of space-time from $\mathbb{R}^N$ to more complicated non-flat situations.
There are many possible atlases that describe the same manifold M, related to each other by more or less complicated transformations. For a generic M no privileged choice of the atlas is available, differently from the case of $\mathbb{R}^N$: there the inertial frames are singled out by the additional vector space structure of the manifold, which allows one to label each point with the corresponding vector. Therefore, if the laws of physics have to be universal and have to accommodate non-flat space-times, then they must be formulated in such a way that they have the same form in whatsoever atlas. This is the principle of general covariance at the basis of General Relativity: all observers see the same laws of physics.
Similarly, in a wider perspective, the choice of a particular set of parameters to describe the state of a complex system should not be privileged with respect to any other choice. The laws that govern the dynamics of a system should be intrinsic and should not depend on the set of variables chosen to describe it.

¹ A manifold (defined in this section) is named affine when it is also a vector space.

2.2.1 Homeomorphisms and the Definition of Manifolds

A fundamental ingredient in formulating the notion of differential manifolds is that of homeomorphism.2

Definition 2.2.1 Let X and Y be two topological spaces and let h be a map:

h : X → Y (2.2.2)

If h is one-to-one and onto, and if both h and its inverse h^{-1} are continuous, then we say that h is a homeomorphism.

As a consequence of the theorems proved in all textbooks about elementary topology and calculus, homeomorphisms preserve all topological properties. Indeed let h be a homeomorphism mapping X onto Y and let A ⊂ X be an open subset: its image through h, namely h(A) ⊂ Y, is also an open subset in the topology of Y. Similarly the image h(C) ⊂ Y of a closed subset C ⊂ X is a closed subset. Furthermore for all A ⊂ X we have:
$$ h(\overline{A}) = \overline{h(A)} \tag{2.2.3} $$
namely the closure of the image of a set A coincides with the image of the closure.

2 We assume that the reader possesses the basic notions of general topology concerning the notions of bases of neighborhoods, open and closed subsets, boundary and limit.

Definition 2.2.2 Let X and Y be two topological spaces. If there exists a homeomorphism h : X → Y then we say that X and Y are homeomorphic.

It is easy to see that given a topological space X, the set of all homeomorphisms h : X → X constitutes a group, usually denoted Hom(X). Indeed if h ∈ Hom(X) is a homeomorphism, then also h^{-1} ∈ Hom(X) is a homeomorphism. Furthermore if h ∈ Hom(X) and h' ∈ Hom(X) then also h ◦ h' ∈ Hom(X). Finally the identity map:
$$ \mathbf{1} : X \to X \tag{2.2.4} $$
is certainly one-to-one and continuous and it coincides with its own inverse. Hence 1 ∈ Hom(X). As we discuss later on, for any manifold X the group Hom(X) is an example of an infinite and continuous group.
Let now M be a topological Hausdorff space. An open chart of M is a pair (U, ϕ) where U ⊂ M is an open subset of M and ϕ is a homeomorphism of U onto an open subset of $\mathbb{R}^m$ (m being a positive integer). The concept of open chart allows us to introduce the notion of coordinates for all points p ∈ U. Indeed the coordinates of p are the m real numbers that identify the point $\varphi(p) \in \varphi(U) \subset \mathbb{R}^m$.
Using the notion of open chart we can finally introduce the notion of differentiable structure.

Definition 2.2.3 Let M be a topological Hausdorff space. A differentiable structure of dimension m on M is an atlas $\mathcal{A} = \bigcup_{i \in A} (U_i, \varphi_i)$ of open charts $(U_i, \varphi_i)$ where ∀i ∈ A, $U_i \subset M$ is an open subset and

$$ \varphi_i : U_i \to \varphi_i(U_i) \subset \mathbb{R}^m \tag{2.2.5} $$

is a homeomorphism of $U_i$ in $\mathbb{R}^m$, namely a continuous, invertible map onto an open subset of $\mathbb{R}^m$ such that the inverse map

$$ \varphi_i^{-1} : \varphi_i(U_i) \to U_i \subset M \tag{2.2.6} $$

is also continuous (see Fig. 2.1). The atlas must fulfill the following axioms:

M1 It covers M, namely
$$ \bigcup_i U_i = M \tag{2.2.7} $$
so that each point of M is contained in at least one chart and generically in more than one: $\forall p \in M \;\to\; \exists (U_i, \varphi_i) \,/\, p \in U_i$.

Fig. 2.1 An open chart is a homeomorphism of an open subset Ui of the manifold M onto an open subset of Rm

Fig. 2.2 A transition function between two open charts is a differentiable map from an open subset of Rm to another open subset of the same

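M2 Chosen any two charts $(U_i, \varphi_i)$, $(U_j, \varphi_j)$ such that $U_i \cap U_j \neq \emptyset$, on the intersection
$$ U_{ij} \stackrel{\text{def}}{=} U_i \cap U_j \tag{2.2.8} $$
there exist two homeomorphisms:

$$ \varphi_i|_{U_{ij}} : U_{ij} \to \varphi_i(U_{ij}) \subset \mathbb{R}^m , \qquad \varphi_j|_{U_{ij}} : U_{ij} \to \varphi_j(U_{ij}) \subset \mathbb{R}^m \tag{2.2.9} $$

and the composite map:

$$ \psi_{ij} \stackrel{\text{def}}{=} \varphi_j \circ \varphi_i^{-1} , \qquad \psi_{ij} : \varphi_i(U_{ij}) \subset \mathbb{R}^m \to \varphi_j(U_{ij}) \subset \mathbb{R}^m \tag{2.2.10} $$

named the transition function, which is actually an m-tuple of real functions of m real variables, is required to be differentiable (see Fig. 2.2).

M3 The collection $(U_i, \varphi_i)_{i \in A}$ is the maximal family of open charts for which both M1 and M2 hold true.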

Next we can finally introduce the definition of differentiable manifold.

Definition 2.2.4 A differentiable manifold of dimension m is a topological space M that admits at least one differentiable structure (Ui,ϕi)i∈A of dimension m.

The definition of a differentiable manifold is constructive in the sense that it provides a way to construct it explicitly. What one has to do is to give an atlas of open charts $(U_i, \varphi_i)$ and the corresponding transition functions $\psi_{ij}$, which should satisfy the necessary consistency conditions:
$$ \forall i, j : \quad \psi_{ij} = \psi_{ji}^{-1} \tag{2.2.11} $$

$$ \forall i, j, k : \quad \psi_{ki} \circ \psi_{jk} \circ \psi_{ij} = \mathbf{1} \tag{2.2.12} $$

In other words a general recipe to construct a manifold is to specify the open charts and how they are glued together. The properties assigned to a manifold are the properties fulfilled by its transition functions. In particular we have:

Definition 2.2.5 A differentiable manifold M is said to be smooth if the transition functions (2.2.10) are infinitely differentiable:
$$ M \text{ is smooth} \iff \psi_{ij} \in C^\infty(\mathbb{R}^m) \tag{2.2.13} $$

Similarly one has the definition of a complex manifold.

Definition 2.2.6 A real manifold of even dimension m = 2ν is complex of dimension ν if the 2ν real coordinates in each open chart $U_i$ can be arranged into ν complex numbers so that (2.2.5) can be replaced by

$$ \varphi_i : U_i \to \varphi_i(U_i) \subset \mathbb{C}^\nu \tag{2.2.14} $$

and the transition functions $\psi_{ij}$ are holomorphic maps:

$$ \psi_{ij} : \varphi_i(U_{ij}) \subset \mathbb{C}^\nu \to \varphi_j(U_{ij}) \subset \mathbb{C}^\nu \tag{2.2.15} $$

Although the constructive definition of a differentiable manifold is always in terms of an atlas, in many occurrences we can have other intrinsic global definitions of what M is, and the construction of an atlas of coordinate patches is an a posteriori operation. Typically this happens when the manifold admits a description as an algebraic locus. The prototype example is provided by the $S^N$ sphere, which can be defined as the locus in $\mathbb{R}^{N+1}$ of points with distance r from the origin:

$$ \{X_i\} \in S^N \iff \sum_{i=1}^{N+1} X_i^2 = r^2 \tag{2.2.16} $$

In particular for N = 2 we have the familiar $S^2$, which is diffeomorphic to the compactified complex plane $\mathbb{C} \cup \{\infty\}$. Indeed we can easily verify that $S^2$ is a one-dimensional complex manifold considering the atlas of holomorphic open charts suggested by the geometrical construction named the stereographic projection. To this effect consider the picture in Fig. 2.3 where we have drawn the two-sphere $S^2$ of radius r = 1 centered in the origin of $\mathbb{R}^3$. Given a generic point $P \in S^2$ we can construct its image on the equatorial plane $\mathbb{R}^2 \sim \mathbb{C}$ drawing the straight line in $\mathbb{R}^3$ that goes through P and through the North Pole N of the sphere. Such a line will intersect the equatorial plane in the point $P_N$ whose value $z_N$, regarded as a complex

Fig. 2.3 Stereographic projection of the two sphere

number, we can identify with the complex coordinate of P in the open chart under consideration:

ϕN (P ) = zN ∈ C (2.2.17)

Alternatively we can draw the straight line through P and the South Pole S. This intersects the equatorial plane in another point $P_S$ whose value as a complex number, named $z_S$, is just the reciprocal of $z_N$: $z_S = 1/z_N$. We can take $z_S$ as the complex coordinate of the same point P. In other words we have another open chart:

ϕS(P ) = zS ∈ C (2.2.18)

What is the domain of these two charts, namely what are the open subsets $U_N$ and $U_S$? This is rather easily established considering that the North Pole projection yields a finite result $|z_N| < \infty$ for all points P except the North Pole itself. Hence $U_N \subset S^2$ is the open set obtained by removing one point (the North Pole) from the sphere. Similarly the South Pole projection yields a finite result for all points P except the South Pole itself, and $U_S$ is $S^2$ minus the South Pole. More specifically, we can choose for $U_N$ and $U_S$ any two open neighborhoods of the South and North Pole respectively with non-vanishing intersection (see Fig. 2.4). In this case the intersection $U_N \cap U_S$ is a band wrapped around the equator of the sphere and its image in the complex equatorial plane is a circular corona (an annulus) that excludes both a circular neighborhood of the origin and a circular neighborhood of infinity. On such an intersection we have the transition function:

$$ \psi_{NS} : \quad z_N = \frac{1}{z_S} \tag{2.2.19} $$
which is clearly holomorphic and satisfies the consistency conditions in (2.2.11), (2.2.12). Hence we see that $S^2$ is a complex 1-manifold that can be constructed with an atlas composed of two open charts related by the transition function (2.2.19). Obviously a complex 1-manifold is a fortiori a smooth real 2-manifold. Manifolds with infinitely differentiable transition functions are named smooth not without a reason. Indeed they correspond to our intuitive notion of smooth hypersurfaces without conical points or edges. The presence of such defects manifests itself through the lack of differentiability in some regions.
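The two stereographic charts and their transition function can be checked numerically. The following is only a minimal sketch: the function names are illustrative, and it assumes that the South Pole chart is taken with a complex conjugation (a reversal of orientation of the equatorial plane), which is the choice that makes the relation z_S = 1/z_N hold exactly for the unit sphere.

```python
import math

def chart_north(X, Y, Z):
    """Stereographic projection from the North Pole (0, 0, 1) onto the plane Z = 0."""
    return complex(X, Y) / (1.0 - Z)

def chart_south(X, Y, Z):
    """Projection from the South Pole (0, 0, -1), followed by complex conjugation.

    The conjugation is an assumption of this sketch: with it z_N * z_S = 1.
    """
    return complex(X, -Y) / (1.0 + Z)

def sphere_point(theta, phi):
    """Point of the unit sphere with polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

# A point away from both poles lies in the overlap of the two charts.
P = sphere_point(1.1, 0.7)
z_N = chart_north(*P)
z_S = chart_south(*P)

# Transition function (2.2.19): z_N = 1 / z_S on the overlap.
transition_error = abs(z_N - 1.0 / z_S)
print(z_N, z_S, transition_error)
```

Since the map z ↦ 1/z is its own inverse, the condition (2.2.11) is satisfied automatically in this two-chart atlas.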

Fig. 2.4 The open charts of the North and South Pole

2.2.2 Functions on Manifolds

Being the mathematical model of possible space-times, manifolds are the geometrical support of physics. They are the arenas where physical processes take place and where physical quantities take values. Mathematically, this implies that calculus, originally introduced on $\mathbb{R}^N$, must be extended to manifolds. The physical entities defined over manifolds with which we have to deal are mathematically characterized as scalar functions, vector fields, tensor fields, differential forms, sections of more general fibre-bundles. We introduce such basic geometrical notions slowly, beginning with the simplest concept of a scalar function.

Definition 2.2.7 A real scalar function on a differentiable manifold M is a map:

$$ f : M \to \mathbb{R} \tag{2.2.20} $$
that assigns a real number f(p) to every point p ∈ M of the manifold.

The properties of a scalar function, for instance its differentiability, are the properties characterizing its local description in the various open charts of an atlas. For each open chart $(U_i, \varphi_i)$ let us define:

$$ f_i \stackrel{\text{def}}{=} f \circ \varphi_i^{-1} \tag{2.2.21} $$
By construction
$$ f_i : \mathbb{R}^m \supset \varphi_i(U_i) \to \mathbb{R} \tag{2.2.22} $$
is a map of an open subset of $\mathbb{R}^m$ into the real line $\mathbb{R}$, namely a real function of m real variables (see Fig. 2.5). The collection of the real functions $f_i(x^{(i)}_1, \dots, x^{(i)}_m)$

Fig. 2.5 Local description of a scalar function on a manifold

constitute the local description of the scalar function f. The function is said to be continuous, differentiable, infinitely differentiable if the real functions $f_i$ have such properties. From Definition (2.2.21) of the local description and from Definition (2.2.10) of the transition functions it follows that we must have:

$$ \forall\, U_i, U_j : \quad f_i|_{U_i \cap U_j} = f_j|_{U_i \cap U_j} \circ \psi_{ij} \tag{2.2.23} $$

Let $x^\mu_{(i)}$ be the coordinates in the patch $U_i$ and $x^\mu_{(j)}$ be the coordinates in the patch $U_j$. For points p that belong to the intersection $U_i \cap U_j$ we have:
$$ x^\mu_{(j)}(p) = \psi^\mu_{(ij)}\big(x^1_{(i)}(p), \dots, x^m_{(i)}(p)\big) \tag{2.2.24} $$
and the gluing rule (2.2.23) takes the form:
$$ f(p) = f_j(x_{(j)}) = f_j\big(\psi_{ij}(x_{(i)})\big) = f_i(x_{(i)}) \tag{2.2.25} $$

The practical way of assigning a function on a manifold is therefore that of writing its local description in the open charts of an atlas, taking care that the various $f_i$ glue together correctly, namely through (2.2.23). Although the number of continuous and differentiable functions one can write on any open region of $\mathbb{R}^m$ is infinite, the smooth functions globally defined on a non-trivial manifold can be very few. Indeed it is only occasionally that we can consistently glue together various local functions $f_i \in C^\infty(U_i)$ into a global f. When this happens we say that $f \in C^\infty(M)$.
All that we said about real functions can be trivially repeated for complex functions. It suffices to replace $\mathbb{R}$ by $\mathbb{C}$ in (2.2.20).
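The gluing rule can be illustrated on the two-chart atlas of the sphere discussed above. The sketch below makes several assumptions that are not in the text: the global function is chosen to be the height Z of a point of the unit sphere, the local descriptions are written in the complex coordinates z_N and z_S with z_S = 1/z_N, and all function names are hypothetical.

```python
def height_north(z_N):
    """Local description of the height function Z in the North Pole chart."""
    r2 = abs(z_N) ** 2
    return (r2 - 1.0) / (r2 + 1.0)

def height_south(z_S):
    """Local description of the same function in the South Pole chart."""
    r2 = abs(z_S) ** 2
    return (1.0 - r2) / (1.0 + r2)

def psi(z_N):
    """Transition function from the North coordinate to the South coordinate."""
    return 1.0 / z_N

# On the overlap (z_N different from 0) the two local descriptions must agree
# once composed with the transition function: f_N = f_S o psi.
samples = [0.3 + 0.4j, 1.0 + 0.0j, -2.0 + 5.0j, 0.01j]
gluing_errors = [abs(height_north(z) - height_south(psi(z))) for z in samples]
print(gluing_errors)

# A local function that fails to glue: Re(z_N) is smooth in the North chart,
# but in the South chart it reads Re(1 / z_S), which is unbounded near z_S = 0.
def real_part_in_south_chart(z_S):
    return (1.0 / z_S).real

print(real_part_in_south_chart(1e-6 + 0j))
```

The second part shows in a simple case why globally defined smooth functions are scarce: a perfectly regular local expression in one chart may develop a singularity when rewritten in another.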

2.2.3 Germs of Smooth Functions

The local geometry of a manifold is studied by considering operations not on the space of smooth functions $C^\infty(M)$ which, as just explained, can be very small, but on the space of germs of functions defined at each point p ∈ M, which is always an infinite dimensional space.

Fig. 2.6 A germ of a smooth function is the equivalence class of all locally defined functions that coincide in some neighborhood of a point p

Definition 2.2.8 Given a point p ∈ M, the space of germs of smooth functions at p, denoted $C^\infty_p$, is defined as follows. Consider all the open neighborhoods of p, namely all the open subsets $U_p \subset M$ such that $p \in U_p$. Consider the space of smooth functions $C^\infty(U_p)$ on each $U_p$. Two functions $f \in C^\infty(U_p)$ and $g \in C^\infty(U'_p)$ are said to be equivalent if they coincide on the intersection $U_p \cap U'_p$ (see Fig. 2.6):
$$ f \sim g \iff f|_{U_p \cap U'_p} = g|_{U_p \cap U'_p} \tag{2.2.26} $$
The union of all the spaces $C^\infty(U_p)$ modded by the equivalence relation (2.2.26) is the space of germs of smooth functions at p:
$$ C^\infty_p \equiv \frac{\bigcup_{U_p} C^\infty(U_p)}{\sim} \tag{2.2.27} $$

What underlies the above definition of germs is the familiar principle of analytic continuation. Of the same function we can have different definitions that have different domains of validity: apparently we have different functions, but if they coincide on some open region then we consider them just as different representations of a single function. Given any germ in some open neighborhood $U_p$ we try to extend it to a larger domain by suitably changing its representation. In general there is a limit to such extension and only very special germs extend to globally defined functions on the whole manifold M. For instance the power series $\sum_{k \in \mathbb{N}} z^k$ defines a holomorphic function within its radius of convergence |z| < 1. As everybody knows, within the convergence radius the sum of this series coincides with 1/(1 − z), which is a holomorphic function defined on a much larger neighborhood of z = 0. According to our definition the two functions are equivalent and correspond to two different representatives of the same germ. The germ, however, does not extend to a holomorphic function on the whole Riemann sphere $\mathbb{C} \cup \{\infty\}$ since it has a singularity in z = 1. Indeed, as stated by Liouville's theorem, the space of global holomorphic functions on the Riemann sphere contains only the constant functions.
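The example of the geometric series can be made concrete with a few lines of code. This is only an illustrative sketch with hypothetical function names: it compares a truncated power series with the closed form 1/(1 − z) at one point inside the unit disk, where the two representatives of the germ agree, and at one point outside, where only the closed form is meaningful.

```python
def partial_sum(z, n_terms):
    """Sum of the first n_terms terms of the power series 1 + z + z^2 + ..."""
    total = 0j
    term = 1 + 0j
    for _ in range(n_terms):
        total += term
        term *= z
    return total

def closed_form(z):
    """The representative 1 / (1 - z), defined for every z different from 1."""
    return 1.0 / (1.0 - z)

# Inside the radius of convergence the two representatives coincide.
inside = 0.5 + 0.25j
gap_inside = abs(partial_sum(inside, 200) - closed_form(inside))

# Outside the unit disk the partial sums grow without bound,
# while the closed form still yields a finite value.
outside = 2.0 + 0j
sum_outside = partial_sum(outside, 60)
value_outside = closed_form(outside)

print(gap_inside, abs(sum_outside), value_outside)
```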

2.3 Tangent and Cotangent Spaces

In elementary geometry the notion of a tangent line is associated with the notion of a curve. Hence to introduce tangent vectors we have to begin with the notion of curves in a manifold.

Fig. 2.7 A curve in a manifold is a continuous map of an interval of the real line into the manifold itself

Definition 2.3.1 A curve C in a manifold M is a continuous and differentiable map of an interval of the real line (say [0, 1] ⊂ R) into M:

C :[0, 1]→M (2.3.1)

In other words a curve is a one-dimensional submanifold C ⊂ M (see Fig. 2.7). There are curves with a boundary, namely C(0) ∪ C(1), and open curves that do not contain their boundary. The latter case occurs if in (2.3.1) we replace the closed interval [0, 1] with the open interval ]0, 1[. Closed curves or loops correspond to the case where the initial and final point coincide, that is when $p_i \equiv C(0) = C(1) \equiv p_f$. Put differently:

Definition 2.3.2 A closed curve is a continuous differentiable map of a circle into the manifold:
$$ C : S^1 \to M \tag{2.3.2} $$

Indeed, identifying the initial and final point means to consider the points of the curve as being in one-to-one correspondence with the equivalence classes

$$ \mathbb{R}/\mathbb{Z} \equiv S^1 \tag{2.3.3} $$
which constitute the mathematical definition of the circle. Explicitly (2.3.3) means that two real numbers r and r' are declared to be equivalent if their difference r' − r = n is an integer number n ∈ Z. As representatives of these equivalence classes we have the real numbers contained in the interval [0, 1] with the proviso that 0 ∼ 1.
We can also consider semiopen curves corresponding to maps of the semiopen interval [0, 1[ into M. In particular, in order to define tangent vectors we are interested in open branches of curves defined in the neighborhood of a point.

2.3.1 Tangent Vectors in a Point p ∈ M

For each point p ∈ M let us fix an open neighborhood $U_p \subset M$ and let us consider the semiopen curves of the following type:
$$ C_p : [0, 1[ \,\to U_p , \qquad C_p(0) = p \tag{2.3.4} $$

Fig. 2.8 In a neighborhood Up of each point p ∈ M we consider the curves that go through p

Fig. 2.9 The tangent space in a generic point of an S2 sphere

In other words for each point p let us consider all possible curves $C_p(t)$ that go through p (see Fig. 2.8). Intuitively the tangent in p to a curve that starts from p is the vector that specifies the curve’s initial direction. The basic idea is that in an m-dimensional manifold there are as many directions in which the curve can depart as there are vectors in $\mathbb{R}^m$: furthermore for sufficiently small neighborhoods of p we cannot tell the difference between the manifold M and the flat vector space $\mathbb{R}^m$. Hence to each point p ∈ M of a manifold we can attach an m-dimensional real vector space

$$ \forall p \in M : \quad p \to T_pM , \qquad \dim T_pM = m \tag{2.3.5} $$
which parameterizes the possible directions in which a curve starting at p can depart. This vector space is named the tangent space to M at the point p and is, by definition, isomorphic to $\mathbb{R}^m$, namely $T_pM \sim \mathbb{R}^m$. For instance to each point of an $S^2$ sphere we attach a tangent plane $\mathbb{R}^2$ (see Fig. 2.9).
Let us now make this intuitive notion mathematically precise. Consider a point p ∈ M and a germ of smooth function $f_p \in C^\infty_p(M)$. In any open chart $(U_\alpha, \varphi_\alpha)$ that contains the point p, the germ $f_p$ is represented by an infinitely differentiable function of m variables:
$$ f_p\big(x^1_{(\alpha)}, \dots, x^m_{(\alpha)}\big) \tag{2.3.6} $$

Let us now choose an open curve $C_p(t)$ that lies in $U_\alpha$ and starts at p:
$$ C_p(t) : \quad C_p : [0, 1[ \,\to U_\alpha , \qquad C_p(0) = p \tag{2.3.7} $$
and consider the composite map:

$$ f_p \circ C_p : [0, 1[ \,\subset \mathbb{R} \to \mathbb{R} \tag{2.3.8} $$
which is a real function
$$ f_p\big(C_p(t)\big) \equiv g_p(t) \tag{2.3.9} $$
of one real variable (see Fig. 2.10).

Fig. 2.10 The composite map fp ◦ Cp where fp is a germ of smooth function in p and Cp is a curve departing from p ∈ M

We can calculate its derivative with respect to t in t = 0 which, in the open chart $(U_\alpha, \varphi_\alpha)$, reads as follows:
$$ \left. \frac{d}{dt} g_p(t) \right|_{t=0} = \frac{\partial f_p}{\partial x^\mu} \cdot \left. \frac{dx^\mu}{dt} \right|_{t=0} \tag{2.3.10} $$
We see from the above formula that the increment of any germ $f_p \in C^\infty_p(M)$ along a curve $C_p(t)$ is defined by means of the following m real coefficients:
$$ c^\mu \equiv \left. \frac{dx^\mu}{dt} \right|_{t=0} \in \mathbb{R} \tag{2.3.11} $$
which can be calculated whenever the parametric form of the curve is given: $x^\mu = x^\mu(t)$. Explicitly we have:

$$ \frac{df_p}{dt} = c^\mu \frac{\partial f_p}{\partial x^\mu} \tag{2.3.12} $$
Equation (2.3.12) can be interpreted as the action of a differential operator on the space of germs of smooth functions, namely:

$$ t_p \equiv c^\mu \frac{\partial}{\partial x^\mu} \quad \Rightarrow \quad t_p : C^\infty_p(M) \to C^\infty_p(M) \tag{2.3.13} $$
Indeed for any germ f and for any curve
$$ t_p f = \left. \frac{dx^\mu}{dt} \right|_{t=0} \frac{\partial f}{\partial x^\mu} \in C^\infty_p(M) \tag{2.3.14} $$
is a new germ of a smooth function in the point p. This discussion justifies the mathematical definition of the tangent space:

Definition 2.3.3 The tangent space $T_pM$ to the manifold M in the point p is the vector space of first order differential operators on the germs of smooth functions $C^\infty_p(M)$.
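The identification of a tangent vector with a first order differential operator can be tested numerically in a single chart. The sketch below is built on assumptions chosen only for illustration: a two-dimensional chart, a specific function f, a specific curve starting at p = (1, 2), and a forward finite difference to approximate the derivative at t = 0.

```python
import math

def f(x, y):
    """Local description of a smooth function in a chart (illustrative choice)."""
    return x * x * y + math.sin(y)

def curve(t):
    """A curve in the chart with curve(0) = p = (1, 2) (illustrative choice)."""
    return (1.0 + 3.0 * t, 2.0 - t + t * t)

def derivative_along_curve(h=1e-7):
    """Forward-difference approximation of d/dt f(curve(t)) at t = 0."""
    return (f(*curve(h)) - f(*curve(0.0))) / h

def tangent_vector_on_f(c, p):
    """Action of t_p = c^mu d/dx^mu on f at p, using the exact partial derivatives."""
    x, y = p
    df_dx = 2.0 * x * y
    df_dy = x * x + math.cos(y)
    return c[0] * df_dx + c[1] * df_dy

# Components c^mu = dx^mu/dt at t = 0, read off from the curve above.
c = (3.0, -1.0)
p = curve(0.0)

lhs = derivative_along_curve()
rhs = tangent_vector_on_f(c, p)
print(lhs, rhs)
```

Any other curve through p with the same initial velocity (3, −1) would give the same number, which is why the operator depends only on the coefficients c^μ and not on the rest of the curve.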

Next let us observe that the space of germs $C^\infty_p(M)$ is an algebra with respect to linear combinations with real coefficients $(\alpha f + \beta g)(p) = \alpha f(p) + \beta g(p)$ and pointwise multiplication $(f \cdot g)(p) \equiv f(p)\, g(p)$:

$$ \begin{aligned} &\forall \alpha, \beta \in \mathbb{R} \quad \forall f, g \in C^\infty_p(M) : \quad \alpha f + \beta g \in C^\infty_p(M) \\ &\forall f, g \in C^\infty_p(M) : \quad f \cdot g \in C^\infty_p(M) \\ &(\alpha f + \beta g) \cdot h = \alpha f \cdot h + \beta g \cdot h \end{aligned} \tag{2.3.15} $$
and a tangent vector $t_p$ is a derivation of this algebra.

Definition 2.3.4 A derivation D of an algebra A is a map:

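$$ D : A \to A \tag{2.3.16} $$
that
1. is linear

$$ \forall \alpha, \beta \in \mathbb{R} \quad \forall f, g \in A : \quad D(\alpha f + \beta g) = \alpha D f + \beta D g \tag{2.3.17} $$

2. obeys the Leibniz rule

$$ \forall f, g \in A : \quad D(f \cdot g) = D f \cdot g + f \cdot D g \tag{2.3.18} $$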

That tangent vectors fit into Definition 2.3.4 is clear from their explicit realization as differential operators (2.3.13), (2.3.14). It is also clear that the set of derivations D[A ] of an algebra constitutes a real vector space. Indeed a linear combination of derivations is still a derivation, having set:

∀α, β ∈ R, ∀D1, D2 ∈ D[A ], ∀f ∈ A : (αD1 + βD2)f = αD1f + βD2f (2.3.19)

Hence an equivalent and more abstract definition of the tangent space is the following:

Definition 2.3.5 The tangent space to a manifold M at the point p is the vector space of derivations of the algebra of germs of smooth functions in p:
$$ T_pM \equiv D\big[C^\infty_p(M)\big] \tag{2.3.20} $$
Indeed for any tangent vector (2.3.13) and for any pair of germs $f, g \in C^\infty_p(M)$ we have:

$$ \begin{aligned} t_p(\alpha f + \beta g) &= \alpha\, t_p(f) + \beta\, t_p(g) \\ t_p(f \cdot g) &= t_p(f) \cdot g + f \cdot t_p(g) \end{aligned} \tag{2.3.21} $$

In each coordinate patch a tangent vector is, as we have seen, a first order differential operator singled out by its components, namely by the coefficients $c^\mu$. In the language of tensor calculus the tangent vector is identified with the m-tuple of real numbers $c^\mu$. The relevant point, however, is that such m-tuple representing the

Fig. 2.11 Two coordinate patches

same tangent vector is different in different coordinate patches. Consider two coordinate patches (U, ϕ) and (V, ψ) with non-vanishing intersection. Name $x^\mu$ the coordinates of a point p ∈ U ∩ V in the patch (U, ϕ) and $y^\alpha$ the coordinates of the same point in the patch (V, ψ). The transition function and its inverse are expressed by setting:
$$ x^\mu = x^\mu(y) ; \qquad y^\nu = y^\nu(x) \tag{2.3.22} $$
Then the same first order differential operator can be alternatively written as:
$$ t_p = c^\mu \frac{\partial}{\partial x^\mu} \quad \text{or} \quad t_p = c^\mu \frac{\partial y^\nu}{\partial x^\mu} \frac{\partial}{\partial y^\nu} = \tilde{c}^\nu \frac{\partial}{\partial y^\nu} \tag{2.3.23} $$
having defined:
$$ \tilde{c}^\nu \equiv c^\mu \frac{\partial y^\nu}{\partial x^\mu} \tag{2.3.24} $$
Equation (2.3.24) expresses the transformation rule for the components of a tangent vector from one coordinate patch to another one (see Fig. 2.11). Such a transformation is linear and the matrix that realizes it is the inverse of the Jacobian matrix: $(\partial y/\partial x) = (\partial x/\partial y)^{-1}$. For this reason we say that the components of a tangent vector constitute a contravariant world vector. By definition a covariant world vector transforms instead with the Jacobian matrix. We will see that covariant world vectors are the components of a differential form.
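The transformation rule for the components of a tangent vector can be illustrated with two familiar coordinate patches of the plane. The following sketch assumes Cartesian coordinates (x, y) as the x^μ and polar coordinates (r, θ) as the y^ν, a choice made here only for illustration; it checks that the number t_p f does not depend on the patch used to compute it.

```python
import math

def jacobian_dy_dx(x, y):
    """Matrix of partial derivatives d(r, theta)/d(x, y); rows nu, columns mu."""
    r = math.hypot(x, y)
    return [[x / r, y / r],
            [-y / (r * r), x / (r * r)]]

def transform_components(c, x, y):
    """Components in the polar patch: c_tilde^nu = c^mu * dy^nu/dx^mu."""
    J = jacobian_dy_dx(x, y)
    return [sum(J[nu][mu] * c[mu] for mu in range(2)) for nu in range(2)]

# Test function f = x^2 + y, written in both patches with its partial derivatives.
def grad_cartesian(x, y):
    return [2.0 * x, 1.0]

def grad_polar(r, theta):
    # f = r^2 cos^2(theta) + r sin(theta)
    df_dr = 2.0 * r * math.cos(theta) ** 2 + math.sin(theta)
    df_dtheta = -2.0 * r * r * math.cos(theta) * math.sin(theta) + r * math.cos(theta)
    return [df_dr, df_dtheta]

x0, y0 = 1.0, 1.0
c = [2.0, -1.0]                      # components in the Cartesian patch
c_tilde = transform_components(c, x0, y0)

r0, theta0 = math.hypot(x0, y0), math.atan2(y0, x0)
value_cartesian = sum(ci * gi for ci, gi in zip(c, grad_cartesian(x0, y0)))
value_polar = sum(ci * gi for ci, gi in zip(c_tilde, grad_polar(r0, theta0)))
print(c_tilde, value_cartesian, value_polar)
```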

2.3.2 Differential Forms in a Point p ∈ M

Let us now consider the total differential of a function (better of a germ of a smooth ∀ ∈ C∞ M function) when we evaluate it along a curve. f p ( ) and for each curve c(t) starting at p we have: d = μ ∂ ≡ f c(t) c μ f tpf (2.3.25) dt t=0 ∂x = dcμ | ∂ where we have named tp dt t=0 ∂xμ the tangent vector to the curve in its initial point p. So, fixing a tangent vector means that for any germ f we know its total differential along the curve that admits such a vector as tangent in p. Let us now reverse our viewpoint. Rather than keeping the tangent vector fixed and letting the germ f vary let us keep the germ f fixed and let us consider all possible curves that 50 2 Manifolds and Fibre Bundles depart from the point p. We would like to evaluate the total derivative of the germ df dt along each curve. The solution of such a problem is easily obtained: given the tangent vector tp to the curve in p we have df/dt = tpf . The moral of this tale is the following: the concept of total differential of a germ is the dual of the concept of tangent vector. Indeed we recall from linear algebra that the dual of a vector space is the space of linear functionals on that vector space and our discussion shows that the total differential of a germ is precisely a linear functional on the tangent space TpM .

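Definition 2.3.6 The total differential $df_p$ of a smooth germ $f \in C^\infty_p(M)$ is a linear functional on $T_pM$ such that

$$\begin{aligned} &\forall t_p \in T_pM: & df_p(t_p) &= t_p f \\ &\forall t_p, k_p \in T_pM,\ \forall \alpha, \beta \in \mathbb{R}: & df_p(\alpha t_p + \beta k_p) &= \alpha\, df_p(t_p) + \beta\, df_p(k_p) \end{aligned} \tag{2.3.26}$$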

The linear functionals on a finite dimensional vector space $V$ constitute a vector space $V^*$ (the dual) with the same dimension. This justifies the following

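Definition 2.3.7 We name cotangent space to the manifold $M$ in the point $p$ the vector space $T^*_pM$ of linear functionals (or 1-forms in $p$) on the tangent space $T_pM$:

$$T^*_pM \equiv \mathrm{Hom}(T_pM, \mathbb{R}) = (T_pM)^* \tag{2.3.27}$$

So we name differential 1-forms in $p$ the elements of the cotangent space and $\forall \omega_p \in T^*_pM$ we have:

$$\begin{aligned} &1)\ \forall t_p \in T_pM: & \omega_p(t_p) &\in \mathbb{R} \\ &2)\ \forall \alpha, \beta \in \mathbb{R},\ \forall t_p, k_p \in T_pM: & \omega_p(\alpha t_p + \beta k_p) &= \alpha\,\omega_p(t_p) + \beta\,\omega_p(k_p) \end{aligned} \tag{2.3.28}$$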

The reason why the above linear functionals are named differential 1-forms is that in every coordinate patch $\{x^\mu\}$ they can be expressed as linear combinations of the coordinate differentials:

$$\omega_p = \omega_\mu\, dx^\mu \tag{2.3.29}$$

and their action on the tangent vectors is expressed as follows:

$$t_p = c^\mu \frac{\partial}{\partial x^\mu} \ \Rightarrow\ \omega_p(t_p) = \omega_\mu c^\mu \in \mathbb{R} \tag{2.3.30}$$

Indeed in the particular case where the 1-form is exact (namely it is the differential of a germ), $\omega_p = df_p$, we can write $\omega_p = \frac{\partial f}{\partial x^\mu}\, dx^\mu$ and we have $df_p(t_p) \equiv t_p f = c^\mu\, \partial f/\partial x^\mu$. Hence when we extend our definition to differential forms that are not exact we retain the same rule, namely that the value of the 1-form on a tangent vector is given by (2.3.30).

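Summarizing, in each coordinate patch, a differential 1-form in a point $p \in M$ has the representation (2.3.29) and its coefficients $\omega_\mu$ constitute a covariant vector. Indeed, in complete analogy to (2.3.23), we have

$$\omega_p = \omega_\mu\, dx^\mu \quad \text{or} \quad \omega_p = \omega_\mu \frac{\partial x^\mu}{\partial y^\nu}\, dy^\nu = \omega_\nu\, dy^\nu \tag{2.3.31}$$

having defined:

$$\omega_\nu \equiv \omega_\mu \frac{\partial x^\mu}{\partial y^\nu} \tag{2.3.32}$$

Finally the duality relation between 1-forms and tangent vectors can be summarized writing the rule:

$$dx^\mu\left(\frac{\partial}{\partial x^\nu}\right) = \delta^\mu_\nu \tag{2.3.33}$$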
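A quick numerical check can make the duality tangible: if vector components are transformed with $\partial y/\partial x$ as in (2.3.24) and 1-form components with $\partial x/\partial y$ as in (2.3.32), the number $\omega_\mu c^\mu$ of (2.3.30) should not depend on the patch. The sketch below assumes Cartesian and polar coordinates on the plane as the two patches; all names are illustrative and not taken from the text.

```python
import math

def dy_dx(x1, x2):
    """Matrix dy^nu/dx^mu for polar coordinates y = (r, theta)."""
    r2 = x1 * x1 + x2 * x2
    r = math.sqrt(r2)
    return [[x1 / r, x2 / r], [-x2 / r2, x1 / r2]]

def dx_dy(r, theta):
    """Matrix dx^mu/dy^nu for x = (r cos(theta), r sin(theta))."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def pairing(omega, c):
    """Value omega_mu c^mu of a 1-form on a tangent vector, as in (2.3.30)."""
    return sum(w * v for w, v in zip(omega, c))

def to_polar(omega_x, c_x, x1, x2):
    """Return the components of the same 1-form and vector in the polar patch."""
    r, theta = math.hypot(x1, x2), math.atan2(x2, x1)
    A = dy_dx(x1, x2)      # acts on vector components, rule (2.3.24)
    B = dx_dy(r, theta)    # acts on 1-form components, rule (2.3.32)
    c_y = [sum(c_x[mu] * A[nu][mu] for mu in range(2)) for nu in range(2)]
    omega_y = [sum(omega_x[mu] * B[mu][nu] for mu in range(2)) for nu in range(2)]
    return omega_y, c_y

omega_x, c_x = [2.0, -1.0], [0.5, 3.0]
omega_y, c_y = to_polar(omega_x, c_x, 1.2, -0.7)
print(pairing(omega_x, c_x), pairing(omega_y, c_y))
```

The two printed numbers agree up to rounding because the two transformation matrices are inverse to each other.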

2.4 Fibre Bundles

The next step we have to take is gluing together all the tangent spaces $T_pM$ and cotangent spaces $T^*_pM$ we have discussed in the previous sections. The result of such a gluing procedure is not a vector space, rather it is a vector bundle. Vector bundles are specific instances of the more general notion of fibre bundles. The concept of fibre bundle is absolutely central in contemporary physics and provides the appropriate mathematical framework to formulate modern field theory, since all the fields one can consider are either sections of associated bundles or connections on principal bundles. There are two kinds of fibre-bundles:

1. principal bundles,
2. associated bundles.

The notion of a principal fibre-bundle is the appropriate mathematical concept underlying the formulation of gauge theories that provide the general framework to describe the dynamics of all non-gravitational interactions. The concept of a connection on such principal bundles codifies the physical notion of the bosonic particles mediating the interaction, namely the gauge bosons, like the photon, the gluon or the graviton. Indeed, gravity itself is a gauge theory although of a very special type. On the other hand the notion of associated fibre-bundles is the appropriate mathematical framework to describe matter fields that interact through the exchange of the gauge bosons.

Also from a more general viewpoint and in relation with all sorts of applications the notion of fibre-bundles is absolutely fundamental. As we already emphasized, the points of a manifold can be identified with the possible states of a complex system specified by an m-tuplet of parameters $x_1, \ldots, x_m$. Real or complex functions of such parameters are the natural objects one expects to deal with in any scientific theory that explains the phenomena observed in such a system. Yet, as we already anticipated, calculus on manifolds that are not trivial like the flat $\mathbb{R}^m$ cannot be confined to functions, which is too restrictive a notion. The appropriate generalization of functions is provided by the sections of fibre-bundles. Locally, namely in each coordinate patch, functions and sections are just the same thing. Globally, however, there are essential differences. A section is obtained by gluing together many local functions by means of non-trivial transition functions that reflect the geometric structure of the fibre-bundle.

To introduce the mathematical definition of a fibre-bundle we need to recall the definition of a Lie group which the reader should have met in other basic courses.

Definition 2.4.1 A Lie group $G$ is:

• A group from the algebraic point of view, namely a set with an internal composition law, the product

$$\forall g_1, g_2 \in G: \quad g_1 \cdot g_2 \in G \tag{2.4.1}$$

which is associative, admits a unique neutral element $e$ and yields an inverse for each group element.

• A smooth manifold of finite dimension $\dim G = n < \infty$ whose transition functions are not only infinitely differentiable but also real analytic, namely they admit an expansion in power series.

• In the topology defined by the manifold structure the two algebraic operations of taking the inverse of an element and performing the product of two elements are real analytic (admit a power series expansion).

The last point in Definition 2.4.1 deserves a more extended explanation. To each group element the product operation associates two maps of the group into itself:

$$\begin{aligned} \forall g \in G: \quad L_g : G \to G : \quad g' \to L_g\, g' \equiv g \cdot g' \\ \forall g \in G: \quad R_g : G \to G : \quad g' \to R_g\, g' \equiv g' \cdot g \end{aligned} \tag{2.4.2}$$

respectively named the left translation and the right translation. Both maps are required to be real analytic for each choice of $g \in G$. Similarly the group structure induces a map:

$$(\cdot)^{-1} : G \to G : \quad g \to g^{-1} \tag{2.4.3}$$

which is also required to be real analytic.

Coming now to fibre-bundles let us begin by recalling that a pedagogical and pictorial example of such spaces is provided by the celebrated picture by Escher of an ant crawling on a Möbius strip (see Fig. 2.12). The basic idea is that if we consider a piece of the bundle this cannot be distinguished from a trivial direct product of two spaces, an open subset of the base manifold and the fibre. In Fig. 2.12 the base manifold is the strip and the fibre is the space containing all possible positions of the ant. However, the relevant point is that, globally, the bundle is not a direct product of spaces. If the ant is placed in some orientation at a certain point on the strip, taking her around the strip she will be necessarily reversed at the end of her trip.

Fig. 2.12 Escher’s ant crawling on a Möbius strip provides a pedagogical example of a fibre-bundle

Hence the notion of fibre-bundle corresponds to that of a differentiable manifold $P$ with dimension $\dim P = m + n$ that locally looks like the direct product $U \times F$ of an open manifold $U$ of dimension $\dim U = m$ with another manifold $F$ (the standard fibre) of dimension $\dim F = n$. Essential in the definition is the existence of a map:

$$\pi : P \to M \tag{2.4.4}$$

named the projection from the total manifold $P$ of dimension $m + n$ to a manifold $M$ of dimension $m$, named the base manifold. Such a map is required to be continuous. Due to the difference in dimensions the projection cannot be invertible. Indeed to every point $p \in M$ of the base manifold the projection associates a submanifold $\pi^{-1}(p) \subset P$ of dimension $\dim \pi^{-1}(p) = n$ composed of those points $x \in P$ whose projection on $M$ is the chosen point $p$: $\pi(x) = p$. The submanifold $\pi^{-1}(p)$ is named the fibre over $p$ and the basic idea is that each fibre is homeomorphic to the standard fibre $F$. More precisely for each open subset $U_\alpha \subset M$ of the base manifold we must have that the submanifold

$$\pi^{-1}(U_\alpha)$$

is homeomorphic to the direct product

$$U_\alpha \times F$$

This is the precise meaning of the statement that, locally, the bundle looks like a direct product (see Fig. 2.13).

Fig. 2.13 A fibre-bundle is locally trivial

Explicitly what we require is the following: there should be a family of pairs $(U_\alpha, \phi_\alpha)$ where $U_\alpha$ are open charts covering the base manifold, $\bigcup_\alpha U_\alpha = M$, and $\phi_\alpha$ are maps:

$$\phi_\alpha : \pi^{-1}(U_\alpha) \subset P \to U_\alpha \times F \tag{2.4.5}$$

that are required to be one-to-one, bicontinuous (= continuous, together with their inverses) and to satisfy the property that:

$$\pi \circ \phi_\alpha^{-1}(p, f) = p \tag{2.4.6}$$

Namely the projection of the image in $P$ of a base manifold point $p$ times some fibre point $f$ is $p$ itself. Each pair $(U_\alpha, \phi_\alpha)$ is named a local trivialization. As for the case of manifolds, the interesting question is what happens in the intersection of two different local trivializations. Indeed if $U_\alpha \cap U_\beta \neq \emptyset$, then we also have $\pi^{-1}(U_\alpha) \cap \pi^{-1}(U_\beta) \neq \emptyset$. Hence each point $x \in \pi^{-1}(U_\alpha \cap U_\beta)$ is mapped by $\phi_\alpha$ and $\phi_\beta$ into two different pairs $(p, f_\alpha) \in U_\alpha \times F$ and $(p, f_\beta) \in U_\beta \times F$ with the property, however, that the first entry $p$ is the same in both pairs. This follows from property (2.4.6). It implies that there must exist a map:

$$t_{\alpha\beta} \equiv \phi_\beta \circ \phi_\alpha^{-1} : (U_\alpha \cap U_\beta) \times F \to (U_\alpha \cap U_\beta) \times F \tag{2.4.7}$$

named transition function, which acts exclusively on the fibre points in the sense that:

$$\forall p \in U_\alpha \cap U_\beta,\ \forall f \in F: \quad t_{\alpha\beta}(p, f) = \big(p,\, t_{\alpha\beta}(p).f\big) \tag{2.4.8}$$

where for each choice of the point $p \in U_\alpha \cap U_\beta$,

$$t_{\alpha\beta}(p) : F \to F \tag{2.4.9}$$

is a continuous and invertible map of the standard fibre $F$ into itself (see Fig. 2.14).

Fig. 2.14 Transition function between two local trivializations of a fibre-bundle

The last bit of information contained in the notion of fibre-bundle is related to the structural group. This has to do with answering the following question: where are the transition functions chosen from? Indeed the set of all possible continuous invertible maps of the standard fibre $F$ into itself constitutes a group, so that it is no restriction to say that the transition functions $t_{\alpha\beta}(p)$ are group elements. Yet the group of all homeomorphisms $\mathrm{Hom}(F, F)$ is very large and it makes sense to include into the definition of fibre bundle the request that the transition functions should be chosen within a smaller hunting ground, namely inside some finite dimensional Lie group $G$ that has a well defined action on the standard fibre $F$. The above discussion can be summarized into the following technical definition of fibre bundles.

Definition 2.4.2 A fibre bundle $(P, \pi, M, F, G)$ is a geometrical structure that consists of the following list of elements:

1. A differentiable manifold $P$ named the total space.

2. A differentiable manifold $M$ named the base space.

3. A differentiable manifold $F$ named the standard fibre.

4. A Lie group $G$, named the structure group, which acts as a transformation group on the standard fibre:

$$\forall g \in G; \quad g : F \longrightarrow F \quad \{\text{i.e. } \forall f \in F:\ g.f \in F\} \tag{2.4.10}$$

5. A surjection map $\pi : P \longrightarrow M$, named the projection. If $n = \dim M$, $m = \dim F$, then we have $\dim P = n + m$ and $\forall p \in M$, $F_p = \pi^{-1}(p)$ is an $m$-dimensional manifold diffeomorphic to the standard fibre $F$. The manifold $F_p$ is named the fibre at the point $p$.

6. A covering of the base space $\bigcup_{\alpha \in A} U_\alpha = M$, realized by a collection $\{U_\alpha\}$ of open subsets ($\forall \alpha \in A$, $U_\alpha \subset M$), equipped with a homeomorphism:

$$\phi_\alpha^{-1} : U_\alpha \times F \longrightarrow \pi^{-1}(U_\alpha) \tag{2.4.11}$$

such that

$$\forall p \in U_\alpha,\ \forall f \in F: \quad \pi \cdot \phi_\alpha^{-1}(p, f) = p \tag{2.4.12}$$

The map $\phi_\alpha^{-1}$ is named a local trivialization of the bundle, since its inverse $\phi_\alpha$ maps the open subset $\pi^{-1}(U_\alpha) \subset P$ of the total space into the direct product $U_\alpha \times F$.

7. If we write $\phi_\alpha^{-1}(p, f) = \phi_{\alpha,p}^{-1}(f)$, the map $\phi_{\alpha,p}^{-1} : F \longrightarrow F_p$ is the homeomorphism required by point (6) of the present definition. For all points $p \in U_\alpha \cap U_\beta$ in the intersection of two different local trivialization domains, the composite map $t_{\alpha\beta}(p) = \phi_{\alpha,p} \cdot \phi_{\beta,p}^{-1} : F \longrightarrow F$ is an element of the structure group, $t_{\alpha\beta} \in G$, named the transition function. Furthermore the transition function realizes a smooth map $t_{\alpha\beta} : U_\alpha \cap U_\beta \longrightarrow G$. We have

$$\phi_\beta^{-1}(p, f) = \phi_\alpha^{-1}\big(p,\, t_{\alpha\beta}(p).f\big) \tag{2.4.13}$$

Just as manifolds can be constructed by gluing together open charts, fibre-bundles can be obtained by gluing together local trivializations. Explicitly one proceeds as follows.

1. First choose a base manifold $M$, a typical fibre $F$ and a structural Lie group $G$ whose action on $F$ must be well-defined.

2. Then choose an atlas of open neighborhoods $U_\alpha \subset M$ covering the base manifold $M$.

3. Next, to each non-vanishing intersection $U_\alpha \cap U_\beta \neq \emptyset$ assign a transition function, namely a smooth map:

$$\psi_{\alpha\beta} : U_\alpha \cap U_\beta \to G \tag{2.4.14}$$

from the open subset $U_\alpha \cap U_\beta \subset M$ of the base manifold to the structural Lie group. For consistency the transition functions must satisfy the two conditions:

$$\begin{aligned} \forall U_\alpha, U_\beta \,/\, U_\alpha \cap U_\beta \neq \emptyset: &\quad \psi_{\beta\alpha} = \psi_{\alpha\beta}^{-1} \\ \forall U_\alpha, U_\beta, U_\gamma \,/\, U_\alpha \cap U_\beta \cap U_\gamma \neq \emptyset: &\quad \psi_{\alpha\beta} \cdot \psi_{\beta\gamma} \cdot \psi_{\gamma\alpha} = 1_G \end{aligned} \tag{2.4.15}$$

Whenever a set of local trivializations with consistent transition functions satisfying (2.4.15) has been given, a fibre-bundle is defined. A different and much more difficult question to answer is to decide whether two sets of local trivializations define the same fibre-bundle or not. We do not address such a problem, whose proper treatment is beyond the scope of this course. We just point out that the classification of inequivalent fibre-bundles one can construct on a given base manifold $M$ is a problem of global geometry which can also be addressed with the techniques of algebraic topology and algebraic geometry. Typically inequivalent bundles are characterized by topological invariants that receive the name of characteristic classes.

In physical language the transition functions (2.4.14) from one local trivialization to another one are the gauge transformations, namely group transformations depending on the position in space-time (i.e. the point on the base manifold).
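The consistency conditions (2.4.15) can be checked mechanically on a toy example. The sketch below assumes, purely for illustration, that the structural group is the multiplicative group of non-zero complex numbers and that the transition functions are built as ratios $\psi_{\alpha\beta}(p) = g_\alpha(p)/g_\beta(p)$ of nowhere vanishing functions $g_\alpha$; the three functions and the chart labels are hypothetical. Transition functions of this particular form always satisfy (2.4.15), and they describe a bundle equivalent to the trivial one, so the example tests the bookkeeping rather than producing an interesting bundle.

```python
import cmath

# Hypothetical nowhere-vanishing functions, one per chart label.
g = {
    "alpha": lambda p: cmath.exp(1j * p),
    "beta": lambda p: 2.0 + cmath.sin(p),
    "gamma": lambda p: cmath.exp(-0.5j * p) * (1.0 + p * p),
}

def psi(a, b, p):
    """Transition function psi_ab(p) = g_a(p) / g_b(p)."""
    return g[a](p) / g[b](p)

def satisfies_consistency(p, tol=1e-12):
    """Check both conditions of (2.4.15) at the base point p."""
    charts = list(g)
    inverse_ok = all(
        abs(psi(b, a, p) * psi(a, b, p) - 1) < tol
        for a in charts for b in charts
    )
    cocycle_ok = all(
        abs(psi(a, b, p) * psi(b, c, p) * psi(c, a, p) - 1) < tol
        for a in charts for b in charts for c in charts
    )
    return inverse_ok and cocycle_ok

print(satisfies_consistency(0.3))
```

A non-trivial bundle requires transition functions that cannot be written globally as such ratios, which is exactly the classification problem mentioned above.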

Definition 2.4.3 A principal bundle $P(M, G)$ is a fibre-bundle where the standard fibre coincides with the structural Lie group, $F = G$, and the action of $G$ on the fibre is the left (or right) multiplication (see (2.4.2)):

∀g ∈ G ⇒ Lg : G → G (2.4.16)

The name principal is given to the fibre-bundle in Definition 2.4.3 since it is a “father” bundle which, once given, generates an infinity of associated vector bundles, one for each linear representation of the Lie group $G$. Let us recall the notion of linear representations of a Lie group.

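Definition 2.4.4 Let $V$ be a vector space of finite dimension $\dim V = n$ and let $\mathrm{Hom}(V, V)$ be the space of all linear homomorphisms of the vector space into itself:

$$\begin{aligned} &f \in \mathrm{Hom}(V, V) \,/\, f : V \to V \\ &\forall \alpha, \beta \in \mathbb{R},\ \forall v_1, v_2 \in V: \quad f(\alpha v_1 + \beta v_2) = \alpha f(v_1) + \beta f(v_2) \end{aligned} \tag{2.4.17}$$

A linear representation of the Lie group $G$ of dimension $n$ is a group homomorphism:

$$\begin{cases} \forall g \in G: & g \to D(g) \in \mathrm{Hom}(V, V) \\ \forall g_1, g_2 \in G: & D(g_1 \cdot g_2) = D(g_1) \cdot D(g_2) \\ & D(e) = 1 \\ \forall g \in G: & D(g^{-1}) = [D(g)]^{-1} \end{cases} \tag{2.4.18}$$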
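The three properties listed in (2.4.18) can be verified numerically on a familiar case. The sketch below assumes, as an illustration not taken from the text, the group of real numbers under addition (angles) represented on a two-dimensional real vector space by rotation matrices; the group product is then the sum of angles and the neutral element is the angle zero.

```python
import math

def D(theta):
    """Rotation matrix representing the group element theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-12):
    """Entry-wise comparison of two 2 x 2 matrices."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

identity = [[1.0, 0.0], [0.0, 1.0]]
g1, g2 = 0.4, 1.1

print(close(D(g1 + g2), matmul(D(g1), D(g2))))   # D(g1 . g2) = D(g1) . D(g2)
print(close(D(0.0), identity))                   # D(e) = 1
print(close(matmul(D(g1), D(-g1)), identity))    # D(g^-1) = [D(g)]^-1
```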

Whenever we choose a basis $e_1, e_2, \ldots, e_n$ of the vector space $V$ every element $f \in \mathrm{Hom}(V, V)$ is represented by a matrix $f_i{}^j$ defined by:

$$f(e_i) = f_i{}^j\, e_j \tag{2.4.19}$$

Therefore a linear representation of a Lie group associates to each abstract group element $g$ an $n \times n$ matrix $D(g)_i{}^j$. As it should be known to the student, linear representations are said to be irreducible if the vector space $V$ admits no non-trivial vector subspace $W \subset V$ that is invariant with respect to the action of the group: $\forall g \in G \,/\, D(g)W \subset W$. For simple Lie groups reducible representations can always be decomposed into a direct sum of irreducible representations, namely $V = V_1 \oplus V_2 \oplus \cdots \oplus V_r$ (with $V_i$ irreducible), and irreducible representations are completely defined by the structure of the group. These notions that we have recalled from group theory motivate the definition:

Definition 2.4.5 An associated vector bundle is a fibre-bundle where the standard fibre F = V is a vector space and the action of the structural group on the standard fibre is a linear representation of G on V .

The reason why the bundles in Definition 2.4.5 are named associated is almost obvious. Given a principal bundle and a linear representation of $G$ we can immediately construct a corresponding vector bundle. It suffices to use as transition functions the linear representation of the transition functions of the principal bundle:

$$\psi^{(V)}_{\alpha\beta} \equiv D\big(\psi^{(G)}_{\alpha\beta}\big) \in \mathrm{Hom}(V, V) \tag{2.4.20}$$

For any vector bundle the dimension of the standard fibre is named the rank of the bundle.

Whenever the base-manifold of a fibre-bundle is complex and the transition functions are holomorphic maps, we say that the bundle is holomorphic.

A very important and simple class of holomorphic bundles are the line bundles. By definition these are principal bundles on a complex base manifold $M$ with structural group $\mathbb{C}^* \equiv \mathbb{C} \backslash 0$, namely the multiplicative group of non-zero complex numbers.

Fig. 2.15 The intersection of two local trivializations of a line bundle

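Let $z_\alpha(p) \in \mathbb{C}^*$ be an element of the standard fibre above the point $p \in U_\alpha \cap U_\beta \subset M$ in the local trivialization $\alpha$ and let $z_\beta(p) \in \mathbb{C}^*$ be the corresponding fibre point in the local trivialization $\beta$. The transition function between the two trivializations is expressed by (see Fig. 2.15):

$$z_\alpha(p) = f_{\alpha\beta}(p) \cdot z_\beta(p) \ \Rightarrow\ f_{\alpha\beta}(p) = \frac{z_\alpha(p)}{z_\beta(p)} \neq 0, \quad f_{\alpha\beta}(p) \in \mathbb{C}^* \tag{2.4.21}$$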

2.5 Tangent and Cotangent Bundles

Let $M$ be a differentiable manifold of dimension $\dim M = m$: in Sect. 2.3 we have seen how to construct the tangent spaces $T_pM$ associated with each point $p \in M$ of the manifold. We have also seen that each $T_pM$ is a real vector space isomorphic to $\mathbb{R}^m$. Considering the definition of fibre-bundles discussed in the previous section we now realize that what we actually did in Sect. 2.3 was to construct a vector-bundle, the tangent bundle $TM$ (see Fig. 2.16). In the tangent bundle $TM$ the base manifold is the differentiable manifold $M$, the standard fibre is $F = \mathbb{R}^m$ and the structural group is $\mathrm{GL}(m, \mathbb{R})$, namely the group of invertible real $m \times m$ matrices. The main point is that the transition functions are not newly introduced to construct the bundle; rather they are completely determined by the transition functions relating open charts of the base manifold. In other words, whenever we define a manifold $M$, associated with it there is a unique vector bundle $TM \to M$ which encodes many intrinsic properties of $M$. Let us see how.

Fig. 2.16 The tangent bundle is obtained by gluing together all the tangent spaces

Consider two intersecting local charts $(U_\alpha, \phi_\alpha)$ and $(U_\beta, \phi_\beta)$ of our manifold. A tangent vector, in a point $p \in M$, was written as:

$$t_p = c^\mu(p) \left.\frac{\partial}{\partial x^\mu}\right|_p \tag{2.5.1}$$

Now we can consider choosing smoothly a tangent vector for each point p ∈ M , namely introducing a map:

p ∈ M → tp ∈ TpM (2.5.2)

Mathematically what we have obtained is a section of the tangent bundle, namely a smooth choice of a point in the fibre for each point of the base. Explicitly this just means that the components $c^\mu(p)$ of the tangent vector are smooth functions of the base point coordinates $x^\mu$. Since we use coordinates, we need an extra label denoting in which local patch the vector components are given:

$$\begin{cases} t = c^\mu_{(\alpha)}(x) \left.\dfrac{\partial}{\partial x^\mu}\right|_p & \Rightarrow \text{ in chart } \alpha \\[2mm] t = c^\nu_{(\beta)}(y) \left.\dfrac{\partial}{\partial y^\nu}\right|_p & \Rightarrow \text{ in chart } \beta \end{cases} \tag{2.5.3}$$

having denoted $x^\mu$ and $y^\nu$ the local coordinates in patches $\alpha$ and $\beta$, respectively. Since the tangent vector is the same, irrespective of the coordinates used to describe it, we have:

$$c^\nu_{(\beta)}(y) \frac{\partial}{\partial y^\nu} = c^\mu_{(\alpha)}(x) \frac{\partial y^\nu}{\partial x^\mu} \frac{\partial}{\partial y^\nu} \tag{2.5.4}$$

namely:

$$c^\nu_{(\beta)}(p) = c^\mu_{(\alpha)}(p)\, \frac{\partial y^\nu}{\partial x^\mu}(p) \tag{2.5.5}$$

In formula (2.5.5) we see the explicit form of the transition function between two local trivializations of the tangent bundle: it is simply the inverse Jacobian matrix associated with the transition functions between two local charts of the base manifold $M$. On the intersection $U_\alpha \cap U_\beta$ we have:

$$\forall p \in U_\alpha \cap U_\beta: \quad p \to \psi_{\beta\alpha}(p) = \frac{\partial y}{\partial x}(p) \in \mathrm{GL}(m, \mathbb{R}) \tag{2.5.6}$$

as it is pictorially described in Fig. 2.17.

Fig. 2.17 Two local charts of the base manifold M yield two local trivializations of the tangent bundle T M

2.5.1 Sections of a Bundle

It is now the appropriate time to associate a precise definition to the notion of bundle section that we have implicitly used in (2.5.2).

Definition 2.5.1 Consider a generic fibre-bundle $E \stackrel{\pi}{\longrightarrow} M$ with generic fibre $F$. We name section of the bundle a rule $s$ that to each point $p \in M$ of the base manifold associates a point $s(p) \in F_p$ in the fibre above $p$, namely a map

$$s : M \to E \tag{2.5.7}$$

such that:

$$\forall p \in M: \quad s(p) \in \pi^{-1}(p) \tag{2.5.8}$$

The above definition is illustrated in Fig. 2.18 which also clarifies the intuitive idea standing behind the chosen name for such a concept. It is clear that sections of the bundle can be chosen to be continuous, differen- tiable, smooth or, in the case of complex manifolds, even holomorphic, depending on the properties of the map s in each local trivialization of the bundle. Indeed given a local trivialization and given open charts for both the base manifold M and for the fibre F , the local description of the section reduces to a map:

$$\mathbb{R}^m \supset U \to F_U \subset \mathbb{R}^n \tag{2.5.9}$$

where $m$ and $n$ are the dimensions of the base manifold and of the fibre respectively. We are specifically interested in smooth sections, namely in sections that are infinitely differentiable. Given a bundle $E \stackrel{\pi}{\longrightarrow} M$, the set of all such sections is denoted by:

$$\Gamma(E, M) \tag{2.5.10}$$

Of particular relevance are the smooth sections of vector bundles. In this case to each point of the base manifold $p$ we associate a vector $v(p)$ in the vector space above the point $p$. In particular we can consider sections of the tangent bundle $TM$ associated with a smooth manifold $M$. Such sections correspond to the notion of vector fields.

Fig. 2.18 A section of a fibre bundle

Definition 2.5.2 Given a smooth manifold $M$, we name vector field on $M$ a smooth section $t \in \Gamma(TM, M)$ of the tangent bundle. The local expression of such a vector field in any open chart $(U, \phi)$ is

$$t = t^\mu(x) \frac{\partial}{\partial x^\mu} \quad \forall x \in U \subset M \tag{2.5.11}$$

2.5.1.1 Example: Holomorphic Vector Fields on S2

As we have seen above, the 2-sphere $S^2$ is a complex manifold of complex dimension one covered by an atlas composed of two charts, that of the North Pole and that of the South Pole (see Fig. 2.19), and the transition function between the local complex coordinates in the two patches is the following one:

$$z_N = \frac{1}{z_S} \tag{2.5.12}$$

Correspondingly, in the two patches, the local description of a holomorphic vector field $t$ is given by:

$$t = v_N(z_N) \frac{d}{dz_N}; \qquad t = v_S(z_S) \frac{d}{dz_S} \tag{2.5.13}$$

where the two functions $v_N(z_N)$ and $v_S(z_S)$ are supposed to be holomorphic functions of their argument, namely to admit a Taylor power series expansion:

$$v_N(z_N) = \sum_{k=0}^{\infty} c_k z_N^k; \qquad v_S(z_S) = \sum_{k=0}^{\infty} d_k z_S^k \tag{2.5.14}$$

However, from the transition function (2.5.12) we obtain the relations:

$$\frac{d}{dz_N} = -z_S^2 \frac{d}{dz_S}; \qquad \frac{d}{dz_S} = -z_N^2 \frac{d}{dz_N} \tag{2.5.15}$$

and hence:

$$t = -\sum_{k=0}^{\infty} c_k z_S^{2-k} \frac{d}{dz_S} = \sum_{k=0}^{\infty} d_k z_S^k \frac{d}{dz_S} = -\sum_{k=0}^{\infty} d_k z_N^{2-k} \frac{d}{dz_N} = \sum_{k=0}^{\infty} c_k z_N^k \frac{d}{dz_N} \tag{2.5.16}$$

The only way for (2.5.16) to be self consistent is to have:

$$\forall k > 2: \ c_k = d_k = 0; \qquad c_0 = -d_2, \quad c_1 = -d_1, \quad c_2 = -d_0 \tag{2.5.17}$$

This shows that the space of holomorphic sections of the tangent bundle $TS^2$ is a finite dimensional vector space of dimension three spanned by the three differential operators:

$$L_0 = -z \frac{d}{dz}; \qquad L_1 = -\frac{d}{dz}; \qquad L_{-1} = -z^2 \frac{d}{dz} \tag{2.5.18}$$

We will have more to say about these operators in the sequel.

Fig. 2.19 The 2-sphere

What we have so far discussed can be summarized by stating the transformation rule of vector field components when we change coordinate patch from $x^\mu$ to $x'^\mu$:

$$t'^\mu(x') = t^\nu(x) \frac{\partial x'^\mu}{\partial x^\nu} \tag{2.5.19}$$

Indeed a convenient way of defining a fibre-bundle is provided by specifying the way its sections transform from one local trivialization to another one, which amounts to giving all the transition functions. This method can be used to discuss the construction of the cotangent bundle.
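Returning to the example of holomorphic vector fields on $S^2$, the counting that leads to (2.5.17) can be reproduced with a few lines of code. The sketch below stores a vector field component as a dictionary mapping the power $k$ to its coefficient; the representation and the function names are assumptions made for this illustration only. It uses $z_N = 1/z_S$ and $d/dz_N = -z_S^2\, d/dz_S$, so that the monomial $c_k z_N^k\, d/dz_N$ becomes $-c_k z_S^{2-k}\, d/dz_S$.

```python
def north_to_south(north):
    """Map coefficients {k: c_k} of v_N to the coefficients of v_S.

    The monomial c_k z_N^k d/dz_N is rewritten as -c_k z_S^(2-k) d/dz_S.
    """
    return {2 - k: -c for k, c in north.items() if c != 0}

def is_holomorphic(coeffs):
    """A Taylor expansion around the origin contains no negative powers."""
    return all(power >= 0 for power in coeffs)

quadratic = {0: 1, 1: 2, 2: 3}          # c_0 + c_1 z_N + c_2 z_N^2
print(north_to_south(quadratic))        # coefficients d_k in the south patch
print(is_holomorphic(north_to_south(quadratic)))

cubic = {3: 1}                          # z_N^3 d/dz_N
print(is_holomorphic(north_to_south(cubic)))
```

Only powers $k \leq 2$ survive the holomorphy requirement in both patches, which is the origin of the three-dimensional space spanned by the operators (2.5.18).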

2.5.2 The Lie Algebra of Vector Fields

In Sect. 2.3 we saw that the tangent space $T_pM$ at point $p \in M$ of a manifold can be identified with the vector space of derivations of the algebra of germs (see Definition 2.3.5). After gluing together all tangent spaces into the tangent bundle $TM$ such an identification of tangent vectors with the derivations of an algebra can be extended from the local to the global level. The crucial observation is that the set of smooth functions on a manifold $C^\infty(M)$ constitutes an algebra with respect to point-wise multiplication just as the set of germs at point $p$. The vector fields, namely the sections of the tangent bundle, are derivations of this algebra. Indeed each vector field $X \in \Gamma(TM, M)$ is a map of the algebra $C^\infty(M)$ into itself:

$$X : C^\infty(M) \to C^\infty(M) \tag{2.5.20}$$

that satisfies properties analogous to those mentioned in (2.3.21) for tangent vectors, namely:

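$$\begin{aligned} X(\alpha f + \beta g) &= \alpha X(f) + \beta X(g) \\ X(f \cdot g) &= X(f) \cdot g + f \cdot X(g) \\ &\forall \alpha, \beta \in \mathbb{R}\ (\text{or } \mathbb{C});\ \forall f, g \in C^\infty(M) \end{aligned} \tag{2.5.21}$$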

On the other hand the set of vector fields, renamed for this reason:

$$\mathrm{Diff}(M) \equiv \Gamma(TM, M) \tag{2.5.22}$$

forms a Lie algebra with respect to the following Lie bracket operation:

$$[X, Y]f = X\big(Y(f)\big) - Y\big(X(f)\big) \tag{2.5.23}$$

Indeed the set of vector fields is a vector space with respect to the scalar numbers ($\mathbb{R}$ or $\mathbb{C}$, depending on the type of manifold, real or complex), namely we can take linear combinations of the following form:

$$\forall \lambda, \mu \in \mathbb{R} \text{ or } \mathbb{C},\ \forall X, Y \in \mathrm{Diff}(M): \quad \lambda X + \mu Y \in \mathrm{Diff}(M) \tag{2.5.24}$$

having defined:

$$[\lambda X + \mu Y](f) = \lambda\, X(f) + \mu\, Y(f), \quad \forall f \in C^\infty(M) \tag{2.5.25}$$

Furthermore the operation (2.5.23) is the commutator of two maps and as such it is antisymmetric and satisfies the Jacobi identity. The Lie algebra of vector fields is named $\mathrm{Diff}(M)$ since each of its elements can be interpreted as the generator of an infinitesimal diffeomorphism of the manifold onto itself. As we are going to see, $\mathrm{Diff}(M)$ is a Lie algebra of infinite dimension, but it can contain finite dimensional subalgebras generated by particular vector fields. The typical example will be the case of the Lie algebra of a Lie group: this is the finite dimensional subalgebra $\mathbb{G} \subset \mathrm{Diff}(G)$ spanned by those vector fields defined on the Lie group manifold that have an additional property of invariance with respect to either left or right translations (see Chap. 3).
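The bracket (2.5.23) can be made explicit in one variable. For $X = x(z)\, d/dz$ and $Y = y(z)\, d/dz$ the second derivatives of $f$ cancel in $X(Y(f)) - Y(X(f))$ and one is left with $[X, Y] = (x\, y' - y\, x')\, d/dz$. The sketch below applies this to the three operators of (2.5.18), storing polynomials as coefficient lists $[a_0, a_1, a_2, \ldots]$; this representation and the helper names are assumptions made for the illustration.

```python
def derivative(p):
    """Derivative of the polynomial with coefficient list p."""
    return [k * a for k, a in enumerate(p)][1:] or [0]

def multiply(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def subtract(p, q):
    """Difference p - q of two polynomials."""
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])

def bracket(x, y):
    """Component of [X, Y] for X = x d/dz and Y = y d/dz, i.e. x y' - y x'."""
    return subtract(multiply(x, derivative(y)), multiply(y, derivative(x)))

L0 = [0, -1]        # -z d/dz
L1 = [-1]           # -d/dz
Lm1 = [0, 0, -1]    # -z^2 d/dz

print(bracket(L0, L1))    # component of [L0, L1]
print(bracket(L1, Lm1))   # component of [L1, L-1]
print(bracket(L0, Lm1))   # component of [L0, L-1]
```

With these conventions the three brackets are again linear combinations of $L_0$, $L_1$ and $L_{-1}$, so the three operators close a finite dimensional subalgebra of the kind mentioned above.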

2.5.3 The Cotangent Bundle and Differential Forms

Let us recall that a differential 1-form in the point $p \in M$ of a manifold $M$, namely an element $\omega_p \in T^*_pM$ of the cotangent space over such a point, was defined as a real valued linear functional over the tangent space at $p$, namely

ωp ∈ Hom(TpM , R) (2.5.26) which implies:

∀tp ∈ TpM ωp : tp → ωp(tp) ∈ R (2.5.27)

The expression of $\omega_p$ in a coordinate patch around $p$ is:

$$\omega_p = \omega_\mu(p)\, dx^\mu \tag{2.5.28}$$

where $dx^\mu(p)$ are the differentials of the coordinates and $\omega_\mu(p)$ are real numbers. We can glue together all the cotangent spaces and construct the cotangent bundle by stating that a generic smooth section of such a bundle is of the form (2.5.28) where $\omega_\mu(p)$ are now smooth functions of the base manifold point $p$. Clearly if we change coordinate patch, an argument completely similar to that employed in the case of the tangent bundle tells us that the coefficients $\omega_\mu(x)$ transform as follows:

$$\omega'_\mu(x') = \omega_\nu(x) \frac{\partial x^\nu}{\partial x'^\mu} \tag{2.5.29}$$

and (2.5.29) can be taken as a definition of the cotangent bundle $T^*M$, whose sections transform with the Jacobian matrix rather than with the inverse Jacobian matrix as the sections of the tangent bundle do (see (2.5.19)). So we can write the

Definition 2.5.3 A differential 1-form ω on a manifold M is a section of the cotan- gent bundle, namely ω ∈ Γ(T∗M , M ).

This means that a differential 1-form is a map:

$$\omega : \Gamma(TM, M) \to C^\infty(M) \tag{2.5.30}$$

from the space of vector fields (i.e. the sections of the tangent bundle) to smooth functions. Locally we can write:

$$\Gamma(T^*M, M) \ni \omega = \omega_\mu(x)\, dx^\mu; \qquad \Gamma(TM, M) \ni t = t^\mu(x) \frac{\partial}{\partial x^\mu} \tag{2.5.31}$$

and we obtain

$$\omega(t) = \omega_\mu(x)\, t^\nu(x)\, dx^\mu\!\left(\frac{\partial}{\partial x^\nu}\right) = \omega_\mu(x)\, t^\mu(x) \tag{2.5.32}$$

using

$$dx^\mu\!\left(\frac{\partial}{\partial x^\nu}\right) = \delta^\mu_\nu \tag{2.5.33}$$

which is the statement that coordinate differentials and partial derivatives are dual bases for 1-forms and tangent vectors respectively.

Since $TM$ is a vector bundle it is meaningful to consider the addition of its sections, namely the addition of vector fields, and also their pointwise multiplication by smooth functions. Taking this into account we see that the map (2.5.30) used to define sections of the cotangent bundle, namely 1-forms, is actually an $\mathcal{F}$-linear map, where $\mathcal{F} = C^\infty(M)$ denotes the algebra of smooth functions. This means the following. Considering any $\mathcal{F}$-linear combination of two vector fields, namely:

$$f_1 t_1 + f_2 t_2, \quad f_1, f_2 \in C^\infty(M), \quad t_1, t_2 \in \Gamma(TM, M) \tag{2.5.34}$$

for any 1-form $\omega \in \Gamma(T^*M, M)$ we have:

$$\omega(f_1 t_1 + f_2 t_2)(p) = f_1(p)\, \omega(t_1)(p) + f_2(p)\, \omega(t_2)(p) \tag{2.5.35}$$

where $p \in M$ is any point of the manifold $M$.

It is now clear that the definition of differential 1-form generalizes the concept of total differential of the germ of a smooth function. Indeed in an open neighborhood $U \subset M$ of a point $p$ we have:

$$\forall f \in C^\infty_p(M): \quad df = \partial_\mu f\, dx^\mu \tag{2.5.36}$$

and the value of $df$ at $p$ on any tangent vector $t_p \in T_pM$ is defined to be:

$$df_p(t_p) \equiv t_p(f) = t^\mu\, \partial_\mu f \tag{2.5.37}$$

which is the directional derivative of the local function $f$ along $t_p$ in the point $p$. If rather than the germ of a function we take a global function $f \in C^\infty(M)$ we realize that the concept of 1-form generalizes the concept of total differential of such a function. Indeed the total differential $df$ fits into the definition of a 1-form, since for any vector field $t \in \Gamma(TM, M)$ we have:

$$df(t) = t^\mu(x)\, \partial_\mu f(x) \equiv t f \in C^\infty(M) \tag{2.5.38}$$

A first obvious question is the following. Is any 1-form $\omega = \omega_\mu(x)\, dx^\mu$ the differential of some function? The answer is clearly no and in any coordinate patch there is a simple test to see whether this is the case or not. Indeed, if $\omega^{(1)}_\mu = \partial_\mu f$ for some germ $f \in C^\infty_p(M)$ then we must have:

$$\frac{1}{2}\left(\partial_\mu \omega^{(1)}_\nu - \partial_\nu \omega^{(1)}_\mu\right) = \frac{1}{2}[\partial_\mu, \partial_\nu] f = 0 \tag{2.5.39}$$

The left hand side of (2.5.39) gives the components of what we will name a differential 2-form

$$\omega^{(2)} = \omega^{(2)}_{\mu\nu}\, dx^\mu \wedge dx^\nu \tag{2.5.40}$$

and in particular the 2-form of (2.5.39) will be identified with the exterior differential of the 1-form $\omega^{(1)}$, namely $\omega^{(2)} = d\omega^{(1)}$. In simple words the exterior differential operator $d$ is the generalization on any manifold and to differential forms of any degree of the concept of curl, familiar from ordinary tensor calculus in $\mathbb{R}^3$. Forms whose exterior differential vanishes will be named closed forms. All these concepts need appropriate explanations that will be provided shortly. Yet, already at this intuitive level, we can formulate the next basic question. We saw that, in order to be the total differential of a function, a 1-form must be necessarily closed. Is such a condition also sufficient? In other words are all closed forms the differential of something? Locally the correct answer is yes, but globally it may be no. Indeed in any open neighborhood a closed form can be represented as the differential of another differential form, but the forms that do the job in the various open patches may not glue together nicely into a globally defined one. This problem and its solution constitute an important chapter of geometry, named cohomology. Actually cohomology is a central issue in algebraic topology, the art of characterizing the topological properties of manifolds through appropriate algebraic structures.
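The gap between "closed" and "differential of a function" can be seen numerically on a standard example that is not taken from the text: the 1-form $(-y\, dx + x\, dy)/(x^2 + y^2)$ on the plane with the origin removed. The sketch below estimates the combination $\partial_x \omega_y - \partial_y \omega_x$ of (2.5.39) by central differences and integrates the form around a circle; the step size, the number of integration steps and the function names are arbitrary choices for this illustration.

```python
import math

def omega(x, y):
    """Components (omega_x, omega_y) of (-y dx + x dy) / (x^2 + y^2)."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def curl(x, y, h=1e-5):
    """Central-difference estimate of d_x omega_y - d_y omega_x."""
    d_x_omega_y = (omega(x + h, y)[1] - omega(x - h, y)[1]) / (2 * h)
    d_y_omega_x = (omega(x, y + h)[0] - omega(x, y - h)[0]) / (2 * h)
    return d_x_omega_y - d_y_omega_x

def loop_integral(radius=1.0, steps=2000):
    """Midpoint-rule integral of omega around a circle centred at the origin."""
    total = 0.0
    dt = 2 * math.pi / steps
    for i in range(steps):
        t = (i + 0.5) * dt
        x, y = radius * math.cos(t), radius * math.sin(t)
        wx, wy = omega(x, y)
        # Tangent vector of the circle: (-radius sin t, radius cos t).
        total += (wx * (-radius * math.sin(t)) + wy * (radius * math.cos(t))) * dt
    return total

print(curl(0.8, -0.5))     # vanishes up to discretization error
print(loop_integral())     # close to 2 * pi, not zero
```

If the form were the differential of a single function defined on the whole punctured plane, its integral around any closed loop would vanish; the non-zero result signals that the local primitives (the polar angle in each patch) do not glue into a global one.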

2.5.4 Differential k-Forms

Next we introduce differential forms of degree $k$ and the exterior differential $d$. In a later section, after the discussion of homology, we show how this relates to the important construction of cohomology. For the time being our approach is simpler and down to earth. We have seen that the 1-forms at a point $p \in M$ of a manifold are linear functionals on the tangent space $T_pM$. First of all we discuss the construction of exterior $k$-forms on any vector space $W$, defined to be the $k$-linear antisymmetric functionals on such a space.

2.5.4.1 Exterior Forms

Let $W$ be a vector space of finite dimension over the field $\mathbb{F}$ ($\mathbb{F}$ can either be $\mathbb{R}$ or $\mathbb{C}$ depending on the case). In this section we show how we can construct a sequence of vector spaces $\Lambda_k(W)$ with $k = 0, 1, 2, \ldots, n = \dim W$ defined in the following way:

$$\begin{aligned} \Lambda_0(W) &= \mathbb{F} \\ \Lambda_1(W) &= W^* \\ &\ \ \vdots \\ \Lambda_k(W) &= \text{vector space of } k\text{-linear antisymmetric functionals over } W \end{aligned} \tag{2.5.41}$$

The spaces $\Lambda_k(W)$ contain the linear functionals on the $k$th exterior powers of the vector space $W$. Such functionals are called exterior forms of degree $k$ on $W$. Let $\phi^{(k)} \in \Lambda_k(W)$ be a $k$-form. It describes a map:

$$\phi^{(k)} : W \otimes W \otimes \cdots \otimes W \to \mathbb{F} \qquad (2.5.42)$$

with the following properties:

$$\begin{aligned}
\text{(i)}\quad & \phi^{(k)}(w_1, w_2, \ldots, w_i, \ldots, w_j, \ldots, w_k) = -\phi^{(k)}(w_1, w_2, \ldots, w_j, \ldots, w_i, \ldots, w_k) \\
\text{(ii)}\quad & \phi^{(k)}(w_1, w_2, \ldots, \alpha x + \beta y, \ldots, w_k) = \alpha\, \phi^{(k)}(w_1, w_2, \ldots, x, \ldots, w_k) + \beta\, \phi^{(k)}(w_1, w_2, \ldots, y, \ldots, w_k)
\end{aligned} \qquad (2.5.43)$$

where $\alpha, \beta \in \mathbb{F}$ and $w_i, x, y \in W$. The first of properties (2.5.43) guarantees that the map $\phi^{(k)}$ is antisymmetric in any two arguments. The second property states that $\phi^{(k)}$ is linear in each argument. The sequence of vector spaces $\Lambda_k(W)$:

$$\Lambda(W) \equiv \bigoplus_{k=0}^{n} \Lambda_k(W) \qquad (2.5.44)$$

can be equipped with an additional operation, named exterior product, that to each pair of a $k_1$-form and a $k_2$-form $(\phi^{(k_1)}, \phi^{(k_2)})$ associates a new $(k_1 + k_2)$-form. Namely we have:

$$\wedge : \Lambda_{k_1} \otimes \Lambda_{k_2} \to \Lambda_{k_1 + k_2} \qquad (2.5.45)$$

More precisely we set:

$$\phi^{(k_1)} \wedge \phi^{(k_2)} \in \Lambda_{k_1 + k_2}(W) \qquad (2.5.46)$$

and we write:

$$\phi^{(k_1)} \wedge \phi^{(k_2)}(w_1, w_2, \ldots, w_{k_1 + k_2}) = \sum_{P} (-)^{\delta_P}\, \frac{1}{(k_1 + k_2)!}\, \phi^{(k_1)}(w_{P(1)}, \ldots, w_{P(k_1)})\, \phi^{(k_2)}(w_{P(k_1 + 1)}, \ldots, w_{P(k_1 + k_2)}) \qquad (2.5.47)$$

where $P$ are the permutations of $k_1 + k_2$ objects, namely the elements of the symmetric group $S_{k_1 + k_2}$, and $\delta_P$ is the parity of the permutation $P$ ($\delta_P = 0$ if $P$ contains an even number of exchanges with respect to the identity permutation, while $\delta_P = 1$ if such a number is odd).

In order to make this definition clear, consider the explicit example where $k_1 = 2$ and $k_2 = 1$. We have:

$$\phi^{(2)} \wedge \phi^{(1)} = \phi^{(3)} \qquad (2.5.48)$$

and we find

$$\begin{aligned}
\phi^{(3)}(w_1, w_2, w_3) &= \frac{1}{3!}\Big[ \phi^{(2)}(w_1, w_2)\phi^{(1)}(w_3) - \phi^{(2)}(w_2, w_1)\phi^{(1)}(w_3) \\
&\qquad - \phi^{(2)}(w_1, w_3)\phi^{(1)}(w_2) - \phi^{(2)}(w_3, w_2)\phi^{(1)}(w_1) \\
&\qquad + \phi^{(2)}(w_2, w_3)\phi^{(1)}(w_1) + \phi^{(2)}(w_3, w_1)\phi^{(1)}(w_2) \Big] \\
&= \frac{1}{3}\Big[ \phi^{(2)}(w_1, w_2)\phi^{(1)}(w_3) + \phi^{(2)}(w_2, w_3)\phi^{(1)}(w_1) + \phi^{(2)}(w_3, w_1)\phi^{(1)}(w_2) \Big]
\end{aligned} \qquad (2.5.49)$$

The exterior product we have just defined has the following formal property:

$$\phi^{(k)} \wedge \phi^{(k')} = (-)^{k k'}\, \phi^{(k')} \wedge \phi^{(k)} \qquad \forall \phi^{(k)} \in \Lambda_k(W);\ \forall \phi^{(k')} \in \Lambda_{k'}(W) \qquad (2.5.50)$$

which can be immediately verified starting from Definition (2.5.47). Indeed, assuming for instance that $k_2 > k_1$, it is sufficient to consider the parity of the permutation:

$$\Pi = \begin{pmatrix}
1, & 2, & \ldots, & k_1, & k_1 + 1, & \ldots, & k_2, & k_2 + 1, & \ldots, & k_1 + k_2 \\
k_1 + 1, & k_1 + 2, & \ldots, & k_1 + k_1, & 2k_1 + 1, & \ldots, & k_1 + k_2, & 1, & \ldots, & k_1
\end{pmatrix} \qquad (2.5.51)$$

which is immediately seen to be:

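$$\delta_\Pi = k_1 k_2 \mod 2 \qquad (2.5.52)$$

Setting $P = P'\Pi$ (which implies $\delta_P = \delta_{P'} + \delta_\Pi$) we obtain:

$$\begin{aligned}
\phi^{(k_2)} \wedge \phi^{(k_1)}(w_1, \ldots, w_{k_1 + k_2})
&= \frac{1}{(k_1 + k_2)!} \sum_{P} (-)^{\delta_P}\, \phi^{(k_2)}(w_{P(1)}, \ldots, w_{P(k_2)})\, \phi^{(k_1)}(w_{P(k_2 + 1)}, \ldots, w_{P(k_2 + k_1)}) \\
&= \frac{1}{(k_1 + k_2)!} \sum_{P'} (-)^{\delta_{P'} + \delta_\Pi}\, \phi^{(k_2)}(w_{P'\Pi(1)}, \ldots, w_{P'\Pi(k_2)})\, \phi^{(k_1)}(w_{P'\Pi(k_2 + 1)}, \ldots, w_{P'\Pi(k_2 + k_1)}) \\
&= (-)^{\delta_\Pi}\, \frac{1}{(k_1 + k_2)!} \sum_{P'} (-)^{\delta_{P'}}\, \phi^{(k_2)}(w_{P'(k_1 + 1)}, \ldots, w_{P'(k_1 + k_2)})\, \phi^{(k_1)}(w_{P'(1)}, \ldots, w_{P'(k_1)}) \\
&= (-)^{\delta_\Pi}\, \phi^{(k_1)} \wedge \phi^{(k_2)}(w_1, \ldots, w_{k_1 + k_2})
\end{aligned} \qquad (2.5.53)$$

Here the normalization factor $1/(k_1 + k_2)!$ of (2.5.47) has been written explicitly on both sides.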
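The permutation sum (2.5.47) can be evaluated mechanically. The following is a minimal sketch, not part of the original text: forms are modelled as plain Python functions of integer vectors, and the names `parity`, `wedge`, `shift_permutation`, `phi1` and `phi2` are hypothetical helpers introduced only for this illustration. It reproduces the example (2.5.49), the sign rule (2.5.50) and the parity (2.5.52).

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def parity(perm):
    """Return 0 for an even permutation of (0, ..., n-1) and 1 for an odd one."""
    perm = list(perm)
    swaps = 0
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]  # puts the value j at position j
            swaps += 1
    return swaps % 2

def wedge(phi_a, k_a, phi_b, k_b):
    """Exterior product of a k_a-form and a k_b-form, following (2.5.47)."""
    n = k_a + k_b
    def product(*w):
        total = 0
        for perm in permutations(range(n)):
            sign = -1 if parity(perm) else 1
            args = [w[p] for p in perm]
            total += sign * phi_a(*args[:k_a]) * phi_b(*args[k_a:])
        return Fraction(total, factorial(n))
    return product

def shift_permutation(k1, k2):
    """The permutation Pi of (2.5.51), written with 0-based labels."""
    return tuple(range(k1, k1 + k2)) + tuple(range(k1))

def phi2(u, v):
    """A sample 2-form on R^3 (antisymmetric and bilinear)."""
    return u[0] * v[1] - u[1] * v[0]

def phi1(u):
    """A sample 1-form on R^3."""
    return u[2]

w1, w2, w3 = (1, 2, 3), (0, 1, 4), (5, 6, 0)
phi3 = wedge(phi2, 2, phi1, 1)
```

Exact rational arithmetic (`Fraction`) is used so that the identities can be compared with `==` instead of a floating-point tolerance.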

2.5.4.2 Exterior Differential Forms

It follows that on $T_p\mathcal{M}$ we can construct not only the 1-forms but also all the higher degree $k$-forms. They span the vector space $\Lambda_k(T_p\mathcal{M})$. By gluing together all such vector spaces, as we did in the case of 1-forms, we obtain the vector bundles of $k$-forms. More explicitly we can set:

Definition 2.5.4 A differential $k$-form $\omega^{(k)}$ is a smooth assignment:

$$\omega^{(k)} : p \mapsto \omega^{(k)}_p \in \Lambda_k(T_p\mathcal{M}) \qquad (2.5.54)$$

of an exterior $k$-form on the tangent space at $p$ for each point $p \in \mathcal{M}$ of a manifold.

Let now $(U, \varphi)$ be a local chart and let $\{dx^1_p, \ldots, dx^m_p\}$ be the usual natural basis of the cotangent space $T^*_p\mathcal{M}$. Then in the same local chart the differential form $\omega^{(k)}$ is written as:

$$\omega^{(k)} = \omega_{i_1 \ldots i_k}(x_1, \ldots, x_m)\, dx^{i_1} \wedge \cdots \wedge dx^{i_k} \qquad (2.5.55)$$

where $\omega_{i_1 \ldots i_k}(x_1, \ldots, x_m) \in C^\infty(U)$ are smooth functions on the open neighborhood $U$, completely antisymmetric in the indices $i_1, \ldots, i_k$.

At this point it is obvious that the operation of exterior product, defined on exterior forms, can be extended to exterior differential forms. In particular, if $\omega^{(k)}$ and $\omega^{(k')}$ are a $k$-form and a $k'$-form, respectively, then $\omega^{(k)} \wedge \omega^{(k')}$ is a $(k + k')$-form. As a consequence of (2.5.50) we have:

$$\omega^{(k)} \wedge \omega^{(k')} = (-)^{k k'}\, \omega^{(k')} \wedge \omega^{(k)} \qquad (2.5.56)$$

and in local coordinates we find:

$$\omega^{(k)} \wedge \omega^{(k')} = \omega^{(k)}_{[i_1 \ldots i_k}(x_1, \ldots, x_m)\, \omega^{(k')}_{i_{k+1} \ldots i_{k+k'}]}\, dx^{i_1} \wedge \cdots \wedge dx^{i_{k+k'}} \qquad (2.5.57)$$

where $[\ldots]$ denotes the complete antisymmetrization on the indices.

Let $A_0(\mathcal{M}) = C^\infty(\mathcal{M})$ and let $A_k(\mathcal{M})$ be the $C^\infty(\mathcal{M})$-module of differential $k$-forms. To justify the name module, observe that we can construct the product of a smooth function $f \in C^\infty(\mathcal{M})$ with a differential form $\omega^{(k)}$ setting:

$$\left(f\omega^{(k)}\right)(Z_1, \ldots, Z_k) = f \cdot \omega^{(k)}(Z_1, \ldots, Z_k) \qquad (2.5.58)$$

for each $k$-tuple of vector fields $Z_1, \ldots, Z_k \in \Gamma(T\mathcal{M}, \mathcal{M})$. Furthermore let

$$A(\mathcal{M}) = \bigoplus_{k=0}^{m} A_k(\mathcal{M}) \quad \text{where } m = \dim \mathcal{M} \qquad (2.5.59)$$

Then $A(\mathcal{M})$ is an algebra over $C^\infty(\mathcal{M})$ with respect to the exterior wedge product $\wedge$.

To introduce the exterior differential $d$ we proceed as follows. Let $f \in C^\infty(\mathcal{M})$ be a smooth function: for each vector field $Z \in \mathrm{Diff}(\mathcal{M})$, we have $Z(f) \in C^\infty(\mathcal{M})$ and therefore there is a unique differential 1-form, denoted $df$, such that $df(Z) = Z(f)$. This differential form is named the total differential of the function $f$. In a local chart $U$ with local coordinates $x^1, \ldots, x^m$ we have:

$$df = \frac{\partial f}{\partial x^j}\, dx^j \qquad (2.5.60)$$

More generally we can see that there exists an endomorphism $d$, $(\omega \mapsto d\omega)$, of $A(\mathcal{M})$ into itself with the following properties:

$$\begin{aligned}
\text{(i)}\quad & \forall \omega \in A_k(\mathcal{M}): \quad d\omega \in A_{k+1}(\mathcal{M}) \\
\text{(ii)}\quad & \forall \omega \in A(\mathcal{M}): \quad d\,d\omega = 0 \\
\text{(iii)}\quad & \forall \omega^{(k)} \in A_k(\mathcal{M}),\ \forall \omega^{(k')} \in A_{k'}(\mathcal{M}): \\
& d\left(\omega^{(k)} \wedge \omega^{(k')}\right) = d\omega^{(k)} \wedge \omega^{(k')} + (-1)^k\, \omega^{(k)} \wedge d\omega^{(k')} \\
\text{(iv)}\quad & \text{if } f \in A_0(\mathcal{M}): \quad df = \text{total differential}
\end{aligned} \qquad (2.5.61)$$

In each local coordinate patch the above intrinsic definition of the exterior differential leads to the following explicit representation:

$$d\omega^{(k)} = \partial_{[i_1} \omega_{i_2 \ldots i_{k+1}]}\, dx^{i_1} \wedge \cdots \wedge dx^{i_{k+1}} \qquad (2.5.62)$$

As already stressed, the exterior differential is the generalization of the concept of curl, well known in elementary vector calculus.

In the next section we introduce the notions of homotopy, homology and cohomology that are crucial to understand the global properties of manifolds and Lie groups and will also play an important role in formulating supergravity.
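As a concrete check of (2.5.62) in the simplest case, consider a 1-form on $\mathbb{R}^3$. This worked example is added for illustration and assumes the normalization of (2.5.39) and (2.5.40), in which the antisymmetrization carries a factor $1/2$.

```latex
\begin{align*}
\omega^{(1)} &= \omega_i \, dx^i , \\
d\omega^{(1)} &= \partial_{[i}\omega_{j]} \, dx^i \wedge dx^j
  = \tfrac{1}{2}\left(\partial_i \omega_j - \partial_j \omega_i\right) dx^i \wedge dx^j \\
 &= (\partial_1\omega_2 - \partial_2\omega_1)\, dx^1 \wedge dx^2
  + (\partial_2\omega_3 - \partial_3\omega_2)\, dx^2 \wedge dx^3
  + (\partial_3\omega_1 - \partial_1\omega_3)\, dx^3 \wedge dx^1 , \\
\omega^{(1)} = df \;&\Longrightarrow\;
d\,df = \partial_{[i}\partial_{j]} f \, dx^i \wedge dx^j = 0 .
\end{align*}
```

The three coefficients in the third line are the components of the ordinary curl of the vector $(\omega_1, \omega_2, \omega_3)$, and the last line is property (ii) of (2.5.61) for a function: it holds because partial derivatives commute.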

2.6 Homotopy, Homology and Cohomology

Differential 1-forms can be integrated along differentiable paths on manifolds. The higher differential $p$-forms introduced in Sect. 2.5.4 can be integrated on $p$-dimensional submanifolds. An appropriate discussion of such integrals and of their properties requires the fundamental concepts of algebraic topology, namely homotopy and homology. Also the global properties of Lie groups and their many-to-one relation with Lie algebras can be understood only in terms of homotopy. For this reason we devote the present section to an introductory discussion of homotopy, homology and of its dual, cohomology.

The kind of problems we are going to consider can be intuitively grasped if we consider Fig. 2.20, displaying a closed two-dimensional surface with two handles (actually an oriented, closed Riemann surface of genus $g = 2$) on which we have drawn several different closed 1-dimensional paths $\gamma_1, \ldots, \gamma_6$.

Fig. 2.20 A closed surface with two handles marked by several different closed 1-dimensional paths

Consider first the path $\gamma_5$. It is an intuitive fact that $\gamma_5$ can be continuously deformed to just a point on the surface. Paths with such a property are named homotopically trivial or homotopic to zero. It is also an intuitive fact that neither $\gamma_2$, nor $\gamma_3$, nor $\gamma_1$, nor $\gamma_4$ are homotopically trivial. Paths of such a type are homotopically non-trivial. Furthermore we say that two paths are homotopic if one can be continuously deformed into the other. This is for instance the case of $\gamma_6$, which is clearly homotopic to $\gamma_3$.

Let us now consider the difference between path $\gamma_4$ and path $\gamma_1$ from another viewpoint. Imagine the result of cutting the surface along the path $\gamma_4$. After the cut the surface splits into two separate parts, $R_1$ and $R_2$, as shown in Fig. 2.21. Such a splitting does not occur if we cut the original surface along the path $\gamma_1$. The reason for this different behavior is the following. The path $\gamma_4$ is the boundary of a region on the surface (the region $R_1$ or, equivalently, its complement $R_2$) while $\gamma_1$ is not the boundary of any region. A similar statement is true for the paths $\gamma_2$ or $\gamma_3$. We say that $\gamma_4$ is homologically trivial while $\gamma_1$, $\gamma_2$, $\gamma_3$ are homologically non-trivial.

Fig. 2.21 When we cut a surface along a path that is a boundary, namely it is homologically trivial, the surface splits into two separate parts

Next let us observe that if we simultaneously cut the original surface along $\gamma_1$, $\gamma_2$, $\gamma_3$ the surface splits once again into two separate parts, as shown in Fig. 2.22. This is due to the fact that the sum of the three paths is the boundary of a region: either $R_1$ or $R_2$ of Fig. 2.22. In this case we say that $\gamma_2 + \gamma_3$ is homologous to $-\gamma_1$, since the difference $\gamma_2 + \gamma_3 - (-\gamma_1)$ is a boundary.

Fig. 2.22 The sum of the three paths $\gamma_1, \gamma_2, \gamma_3$ is homologically trivial, namely $\gamma_2 + \gamma_3$ is homologous to $-\gamma_1$

In order to give a rigorous formulation to these intuitive concepts, which can be extended also to higher dimensional submanifolds of any manifold, we proceed as follows.

2.6.1 Homotopy

Let us come back to Definition 2.3.1 of a curve (or path) in a manifold and slightly generalize it.

Definition 2.6.1 Let $[a, b]$ be a closed interval of the real line $\mathbb{R}$ parameterized by the parameter $t$ and subdivide it into a finite number of closed, partial intervals:

$$[a, t_1],\ [t_1, t_2],\ \ldots,\ [t_{n-1}, t_n],\ [t_n, b] \qquad (2.6.1)$$

We name piece-wise differentiable path a continuous map:

$$\gamma : [a, b] \to \mathcal{M} \qquad (2.6.2)$$

of the interval $[a, b]$ into a differentiable manifold $\mathcal{M}$ such that there exists a splitting of $[a, b]$ into a finite set of closed subintervals as in (2.6.1) with the property that on each of these intervals the map $\gamma$ is not only continuous but also infinitely differentiable.

Since we have parametric invariance we can always rescale the interval $[a, b]$ and reduce it to be

$$[0, 1] \equiv I \qquad (2.6.3)$$

Let

$$\sigma : I \to \mathcal{M}, \qquad \tau : I \to \mathcal{M} \qquad (2.6.4)$$

be two piece-wise differentiable paths with coinciding extrema, namely such that (see Fig. 2.23):

$$\sigma(0) = \tau(0) = x_0 \in \mathcal{M}, \qquad \sigma(1) = \tau(1) = x_1 \in \mathcal{M} \qquad (2.6.5)$$

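Definition 2.6.2 We say that $\sigma$ is homotopic to $\tau$ and we write $\sigma \simeq \tau$ if there exists a continuous map:

$$F : I \times I \to \mathcal{M} \qquad (2.6.6)$$

such that:

$$\begin{aligned}
F(s, 0) &= \sigma(s) \quad \forall s \in I \\
F(s, 1) &= \tau(s) \quad \forall s \in I \\
F(0, t) &= x_0 \quad \forall t \in I \\
F(1, t) &= x_1 \quad \forall t \in I
\end{aligned} \qquad (2.6.7)$$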

Fig. 2.23 Two paths with coinciding extrema

In particular if $\sigma$ is a closed path, namely a loop at $x_0$, i.e. if $x_0 = x_1$, and if $\tau$, homotopic to $\sigma$, is the constant loop, that is

$$\forall s \in I : \quad \tau(s) = x_0 \qquad (2.6.8)$$

then we say that $\sigma$ is homotopically trivial and that it can be contracted to a point.

It is quite obvious that the homotopy relation $\sigma \simeq \tau$ is an equivalence relation. Hence we shall consider the homotopy classes $[\sigma]$ of paths from $x_0$ to $x_1$.

Next we can define a binary product operation on the space of paths in the following way. If $\sigma$ is a path from $x_0$ to $x_1$ and $\tau$ is a path from $x_1$ to $x_2$, we can define a path from $x_0$ to $x_2$ traveling first along $\sigma$ and then along $\tau$. More precisely we set:

$$\sigma\tau(t) = \begin{cases} \sigma(2t) & 0 \le t \le \tfrac{1}{2} \\ \tau(2t - 1) & \tfrac{1}{2} \le t \le 1 \end{cases} \qquad (2.6.9)$$

What we can immediately verify from this definition is that if $\sigma \simeq \sigma'$ and $\tau \simeq \tau'$ then $\sigma\tau \simeq \sigma'\tau'$. The proof is immediate and it is left to the reader. Hence without any ambiguity we can multiply the equivalence class of $\sigma$ with the equivalence class of $\tau$, always assuming that the final point of $\sigma$ coincides with the initial point of $\tau$. Relying on these definitions we have a theorem which is very easy to prove but has an outstanding relevance:

Theorem 2.6.1 Let $\pi_1(\mathcal{M}, x_0)$ be the set of homotopy classes of loops in the manifold $\mathcal{M}$ with base in the point $x_0 \in \mathcal{M}$. If the product law of paths is defined as we just explained above, then with respect to this operation $\pi_1(\mathcal{M}, x_0)$ is a group whose identity element is provided by the homotopy class of the constant loop at $x_0$, and the inverse of the homotopy class $[\sigma]$ is the homotopy class of the loop $\sigma^{-1}$ defined by:

$$\sigma^{-1}(t) = \sigma(1 - t), \qquad 0 \le t \le 1 \qquad (2.6.10)$$

(In other words $\sigma^{-1}$ is the same path followed backward.)

Proof Clearly the composition of a loop $\sigma$ with the constant loop (from now on denoted as $x_0$) yields a loop homotopic to $\sigma$, since only the parameterization changes. Hence $x_0$ is effectively the identity element of the group. We still have to show that $\sigma\sigma^{-1} \simeq x_0$. The explicit realization of the required homotopy is provided by the following function:

$$F(s, t) = \begin{cases} \sigma(2s) & 0 \le 2s \le t \\ \sigma(t) & t \le 2s \le 2 - t \\ \sigma^{-1}(2s - 1) & 2 - t \le 2s \le 2 \end{cases} \qquad (2.6.11)$$

Let us observe that having defined $F$ as above we have:

$$\begin{aligned}
F(s, 0) &= \sigma(0) = x_0 \quad \forall s \in I \\
F(s, 1) &= \begin{cases} \sigma(2s) & 0 \le s \le \tfrac{1}{2} \\ \sigma^{-1}(2s - 1) & \tfrac{1}{2} \le s \le 1 \end{cases}
\end{aligned} \qquad (2.6.12)$$

and furthermore:

$$\begin{aligned}
F(0, t) &= \sigma(0) = x_0 \quad \forall t \in I \\
F(1, t) &= \sigma^{-1}(1) = x_0 \quad \forall t \in I
\end{aligned} \qquad (2.6.13)$$

Therefore it is sufficient to check that $F(s, t)$ is continuous. Dividing the square $[0, 1] \times [0, 1]$ into three triangles as in Fig. 2.24, we see that $F(s, t)$ is continuous in each of the triangles and that it is consistently glued on the sides of the triangles. Hence $F$ as defined in (2.6.11) is continuous. This concludes the proof of the theorem.
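The product (2.6.9), the inverse (2.6.10) and the contracting map (2.6.11) translate directly into code. The sketch below is only an illustration under stated assumptions: the manifold is taken to be the plane, the sample loop `sigma` is the unit circle based at $x_0 = (1, 0)$, and all function names are hypothetical.

```python
import math

def sigma(t):
    """A sample loop at x0 = (1, 0): the unit circle traversed once."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def inverse(path):
    """The same path followed backward, as in (2.6.10)."""
    return lambda t: path(1 - t)

def product(first, second):
    """Travel along `first` and then along `second`, as in (2.6.9)."""
    return lambda t: first(2 * t) if t <= 0.5 else second(2 * t - 1)

def homotopy(path):
    """The map F(s, t) of (2.6.11) relating the constant loop to path * inverse(path)."""
    back = inverse(path)
    def F(s, t):
        if 2 * s <= t:
            return path(2 * s)      # outgoing stretch
        if 2 * s <= 2 - t:
            return path(t)          # waiting at the turning point path(t)
        return back(2 * s - 1)      # return stretch
    return F

F = homotopy(sigma)
x0 = sigma(0)
grid = [i / 10 for i in range(11)]
```

At $t = 0$ the map never leaves $x_0$, at $t = 1$ it runs along the whole loop and back, and at intermediate $t$ it turns around at $\sigma(t)$; the conditions (2.6.12) and (2.6.13) can be checked on the sample grid.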

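Theorem 2.6.2 Let $\alpha$ be a path from $x_0$ to $x_1$. Then

$$[\sigma] \xrightarrow{\ \alpha\ } \left[\alpha^{-1}\sigma\alpha\right] \qquad (2.6.14)$$

is an isomorphism of $\pi_1(\mathcal{M}, x_0)$ onto $\pi_1(\mathcal{M}, x_1)$.

Proof Indeed, since

$$[\sigma\tau] \longrightarrow \left[\alpha^{-1}\sigma\alpha\right]\left[\alpha^{-1}\tau\alpha\right] = \left[\alpha^{-1}\sigma\tau\alpha\right] \qquad (2.6.15)$$

we see that $\xrightarrow{\ \alpha\ }$ is a homomorphism. Since the inverse map $\xrightarrow{\ \alpha^{-1}\ }$ also exists, the homomorphism is actually an isomorphism.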

From this theorem it follows that in an arc-wise connected manifold, namely in a manifold where every point is connected to any other by at least one piece-wise differentiable path, the group $\pi_1(\mathcal{M}, x_0)$ is independent, up to isomorphism, of the choice of the base point $x_0$ and we can call it simply $\pi_1(\mathcal{M})$. The group $\pi_1(\mathcal{M})$ is named the first homotopy group of the manifold or simply the fundamental group of $\mathcal{M}$.

Fig. 2.24 The continuous map that realizes the homotopy between the constant loop and the product of any loop with its own inverse

Definition 2.6.3 A differentiable manifold $\mathcal{M}$ which is arc-wise connected is named simply connected if its fundamental group $\pi_1(\mathcal{M})$ is the trivial group consisting only of the identity element.

$$\pi_1(\mathcal{M}) = \mathrm{id} \quad \Leftrightarrow \quad \mathcal{M} = \text{simply connected} \qquad (2.6.16)$$

2.6.2 Homology

The notion of homotopy led us to introduce an internal composition group for paths, the fundamental group $\pi_1(\mathcal{M})$, whose structure is a topological invariant of the manifold $\mathcal{M}$, since it does not change under continuous deformations of the latter. For this group we have used a multiplicative notation since nothing guarantees a priori that it should be Abelian. Generically the fundamental homotopy group of a manifold is non-Abelian. As mentioned above there are higher homotopy groups $\pi_n(\mathcal{M})$ whose elements are the homotopy classes of $S^n$ spheres drawn on the manifold. In this section we turn our attention to another series of groups that also codify topological properties of the manifold and are, on the contrary, all Abelian. These are the homology groups:

Hk(M ); k = 0, 1, 2,...,dim(M ) (2.6.17)

We can grasp the notion of homology if we persuade ourselves that it makes sense to consider linear combinations of submanifolds or regions of dimension $p$ of a manifold $\mathcal{M}$, with coefficients in a ring $\mathcal{R}$ that can be either $\mathbb{Z}$, or $\mathbb{R}$ or, sometimes, $\mathbb{Z}_n$. The reason is that the submanifolds of dimension $p$ are just fit to integrate differential $p$-forms over them. This fact allows us to give a meaning to an expression of the following form:

$$C^{(p)} = m_1 S_1^{(p)} + m_2 S_2^{(p)} + \cdots + m_k S_k^{(p)} \qquad (2.6.18)$$

where $S_i^{(p)} \subset \mathcal{M}$ are suitable $p$-dimensional submanifolds of the manifold $\mathcal{M}$, later on called simplexes, and $m_i \in \mathcal{R}$ are elements of the chosen ring of coefficients. What we systematically do is the following. For each differential $p$-form $\omega^{(p)} \in \Lambda_p(\mathcal{M})$ we set:

$$\int_{C^{(p)}} \omega^{(p)} = \int_{m_1 S_1^{(p)} + m_2 S_2^{(p)} + \cdots + m_k S_k^{(p)}} \omega^{(p)} = \sum_{i=1}^{k} m_i \int_{S_i^{(p)}} \omega^{(p)} \qquad (2.6.19)$$

and in this way we define the integral of $\omega^{(p)}$ on the region $C^{(p)}$.

Next let us give the precise definition of the $p$-simplexes of which we want to take linear combinations.

Fig. 2.25 The standard $p$-simplexes for $p = 0, 1, 2$

Definition 2.6.4 Let us consider the Euclidean space $\mathbb{R}^{p+1}$. The standard $p$-simplex $\Delta^p$ is the set of all points $(t_0, t_1, \ldots, t_p) \in \mathbb{R}^{p+1}$ such that the following conditions are satisfied:

$$t_i \ge 0; \qquad t_0 + t_1 + \cdots + t_p = 1 \qquad (2.6.20)$$

It is easy to see that the standard 0-simplex is a point, namely $t_0 = 1$, the standard 1-simplex is a segment of line, the standard 2-simplex is a triangle, the standard 3-simplex is a tetrahedron and so on (see Fig. 2.25).

Fig. 2.26 The faces of the standard 1-simplex

Fig. 2.27 The faces of the standard 2-simplex

Let us now consider the standard $(p-1)$-simplex $\Delta^{(p-1)}$ and let us observe that there are $(p + 1)$ canonical maps $\phi_i^{(p)}$ that map $\Delta^{(p-1)}$ into $\Delta^p$:

$$\phi_i^{(p)} : \Delta^{(p-1)} \to \Delta^{p} \qquad (2.6.21)$$

These maps are defined as follows:

$$\phi_i^{(p)}(t_0, \ldots, t_{i-1}, t_{i+1}, \ldots, t_p) = (t_0, \ldots, t_{i-1}, 0, t_{i+1}, \ldots, t_p) \qquad (2.6.22)$$

Definition 2.6.5 The $p + 1$ standard simplexes $\Delta^{p-1}$ immersed in the standard $p$-simplex $\Delta^p$ by means of the $p + 1$ maps of (2.6.22) are named the faces of $\Delta^p$ and the index $i$ enumerates them. Hence the map $\phi_i^{(p)}$ yields, as a result, the $i$th face of the standard $p$-simplex.

For instance the two faces of the standard 1-simplex are the two points $(t_0 = 0, t_1 = 1)$ and $(t_0 = 1, t_1 = 0)$, as shown in Fig. 2.26. Similarly the three segments $(t_0 = 0, t_1 = t, t_2 = 1 - t)$, $(t_0 = t, t_1 = 0, t_2 = 1 - t)$ and $(t_0 = t, t_1 = 1 - t, t_2 = 0)$ are the three faces of the standard 2-simplex (see Fig. 2.27).
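The face maps (2.6.22) amount to inserting a zero coordinate, which makes them easy to experiment with. In the minimal sketch below (the function names are hypothetical, and points are plain tuples), the faces of the standard 2-simplex listed above are reproduced, and the composition rule $\phi_i^{(p)} \circ \phi_j^{(p-1)} = \phi_j^{(p)} \circ \phi_{i-1}^{(p-1)}$ for $j < i$, which is what makes the boundary operator nilpotent, is checked on tuples of labels rather than on genuine barycentric coordinates.

```python
from fractions import Fraction

def face(i, point):
    """Face map phi_i of (2.6.22): insert a zero coordinate at position i."""
    return point[:i] + (0,) + point[i:]

def check_face_identity(p):
    """Check phi_i o phi_j == phi_j o phi_{i-1} for all 0 <= j < i <= p."""
    point = tuple(range(1, p))  # p - 1 distinct nonzero labels standing in for coordinates
    return all(
        face(i, face(j, point)) == face(j, face(i - 1, point))
        for i in range(p + 1)
        for j in range(i)
    )

# The three faces of the standard 2-simplex at a sample parameter t = 1/3.
t = Fraction(1, 3)
edge_point = (t, 1 - t)
faces_of_triangle = [face(i, edge_point) for i in range(3)]
```

Each entry of `faces_of_triangle` still has coordinates summing to one, so it lies on the boundary of the standard 2-simplex.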

Definition 2.6.6 Let $\mathcal{M}$ be a differentiable manifold of dimension $m$. A continuous map:

$$\sigma^{(p)} : \Delta^{(p)} \to \mathcal{M} \qquad (2.6.23)$$

of the standard $p$-simplex into the manifold is named a singular $p$-simplex or simply a simplex of $\mathcal{M}$.

Fig. 2.28 $S_1^{(2)}$ and $S_2^{(2)}$ are two distinct 2-simplexes, namely two triangles with vertices respectively given by $(A_0, A_1, A_2)$ and $(B_0, B_1, B_2)$. The 2-simplex $S_3^{(2)}$ with vertices $B_0, A_1, A_2$ is the intersection of the other two: $S_3^{(2)} = S_1^{(2)} \cap S_2^{(2)}$

Clearly a 1-simplex is a continuous path in $\mathcal{M}$, a 2-simplex is a portion of surface immersed in $\mathcal{M}$ and so on. The $i$th face of the simplex $\sigma^{(p)}$ is given by the $(p - 1)$-simplex obtained by composing $\sigma^{(p)}$ with $\phi_i^{(p)}$:

$$\sigma^{(p)} \circ \phi_i^{(p)} : \Delta^{(p-1)} \to \mathcal{M} \qquad (2.6.24)$$

Let R be a commutative ring.

Definition 2.6.7 Let $\mathcal{M}$ be a manifold of dimension $m$. For each $0 \le n \le m$ the group of $n$-chains with coefficients in $\mathcal{R}$, named $C_n(\mathcal{M}, \mathcal{R})$, is defined as the free $\mathcal{R}$-module having a generator for each $n$-simplex in $\mathcal{M}$.

In simple words Definition 2.6.7 states that $C_p(\mathcal{M}, \mathcal{R})$ is the set of all possible linear combinations of $p$-simplexes with coefficients in $\mathcal{R}$:

$$C^{(p)} = m_1 S_1^{(p)} + m_2 S_2^{(p)} + \cdots + m_k S_k^{(p)} \qquad (2.6.25)$$

where $m_i \in \mathcal{R}$. The elements of $C_p(\mathcal{M}, \mathcal{R})$ are named $p$-chains. The concept of $p$-chains gives a rigorous meaning to the intuitive idea that any $p$-dimensional region of a manifold can be constructed by gluing together a certain number of simplexes. For instance a path $\gamma$ can be constructed by gluing together a finite number of segments (better, their homeomorphic images). In the case $p = 2$, the construction of a two-dimensional region by means of 2-simplexes corresponds to a triangulation of a surface. As an example consider the case where the manifold we deal with is just the complex plane $\mathcal{M} = \mathbb{C}$ and let us focus on the 2-simplexes drawn in Fig. 2.28. The chain:

$$C^{(2)} = S_1^{(2)} + S_2^{(2)} \qquad (2.6.26)$$

denotes the region of the complex plane depicted in Fig. 2.29, with the proviso that when we compute the integral of any 2-form on $C^{(2)}$ the contribution from the simplex $S_3^{(2)} = S_1^{(2)} \cap S_2^{(2)}$ (the shadowed area in Fig. 2.29) has to be counted twice, since it belongs both to $S_1^{(2)}$ and to $S_2^{(2)}$. Relying on these notions we can introduce the boundary operator.

Fig. 2.29 Geometrically the chain $S_1^{(2)} + S_2^{(2)}$ is the union of the two simplexes $S_1^{(2)} \cup S_2^{(2)}$

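Definition 2.6.8 The boundary operator $\partial$ is the map:

$$\partial : C_n(\mathcal{M}, \mathcal{R}) \to C_{n-1}(\mathcal{M}, \mathcal{R}) \qquad (2.6.27)$$

defined by the following properties:

1. $\mathcal{R}$-linearity

$$\forall C_1^{(p)}, C_2^{(p)} \in C_p(\mathcal{M}, \mathcal{R}),\ \forall m_1, m_2 \in \mathcal{R}: \qquad \partial\left(m_1 C_1^{(p)} + m_2 C_2^{(p)}\right) = m_1\, \partial C_1^{(p)} + m_2\, \partial C_2^{(p)} \qquad (2.6.28)$$

2. Action on the simplexes

$$\partial\sigma \equiv \sigma \circ \phi_0 - \sigma \circ \phi_1 + \sigma \circ \phi_2 - \cdots = \sum_{i=0}^{p} (-)^i\, \sigma \circ \phi_i \qquad (2.6.29)$$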

The image of a chain $C$ through $\partial$, namely $\partial C$, is called the boundary of the chain. As an exercise we can compute the boundary of the 2-chain $C^{(2)} = S_1^{(2)} + S_2^{(2)}$ of Fig. 2.28, with the understanding that the relevant ring is, in this case, $\mathbb{Z}$. We have:

$$\partial C^{(2)} = \partial S_1^{(2)} + \partial S_2^{(2)} = \overrightarrow{A_1A_2} - \overrightarrow{A_0A_2} + \overrightarrow{A_0A_1} + \overrightarrow{B_1B_2} - \overrightarrow{B_0B_2} + \overrightarrow{B_0B_1} \qquad (2.6.30)$$

where $\overrightarrow{A_1A_2}, \ldots$ denote the oriented segments from $A_1$ to $A_2$ and so on. As one sees, the change in sign is interpreted as the change of orientation (which is the correct interpretation if one thinks of the chain and of its boundary as the support of an integral). With this convention the 1-chain:

$$\overrightarrow{A_1A_2} - \overrightarrow{A_0A_2} + \overrightarrow{A_0A_1} = \overrightarrow{A_1A_2} + \overrightarrow{A_2A_0} + \overrightarrow{A_0A_1} \qquad (2.6.31)$$

is just the oriented boundary of the $S_1^{(2)}$-simplex as shown in Fig. 2.30.

Theorem 2.6.3 The boundary operator $\partial$ is nilpotent, namely it is true that:

$$\partial^2 \equiv \partial \circ \partial = 0 \qquad (2.6.32)$$

Fig. 2.30 The oriented boundary of the $S^{(2)}$ simplex

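Proof It is sufficient to observe that, as a consequence of their own definition, the maps $\phi_i$ defined in (2.6.22) have, for $j < i$, the following property:

$$\phi_i^{(p)} \circ \phi_j^{(p-1)} = \phi_j^{(p)} \circ \phi_{i-1}^{(p-1)} \qquad (2.6.33)$$

Then, for the $p$-simplex $\sigma$ we have:

$$\begin{aligned}
\partial\partial\sigma &= \sum_{i=0}^{p} (-)^i\, \partial\left[\sigma \circ \phi_i\right] \\
&= \sum_{i=0}^{p} \sum_{j=0}^{p-1} (-)^i (-)^j\, \sigma \circ \phi_i^{(p)} \circ \phi_j^{(p-1)} \\
&= \sum_{j < i} (-)^{i+j}\, \sigma \circ \phi_j^{(p)} \circ \phi_{i-1}^{(p-1)} + \sum_{j \ge i} (-)^{i+j}\, \sigma \circ \phi_i^{(p)} \circ \phi_j^{(p-1)}
\end{aligned} \qquad (2.6.34)$$

Renaming the summation indices in the first sum of the last line as $j \to i$, $i - 1 \to j$ turns it into the second sum with the opposite sign. Hence everything in the last line of (2.6.34) cancels identically and this proves the theorem.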

As an illustration we can calculate $\partial\partial S_1^{(2)}$ for the 2-simplex $S_1^{(2)}$ described in Fig. 2.28. We obtain:

$$\partial\partial S_1^{(2)} = A_2 - A_1 - A_2 + A_0 + A_1 - A_0 = 0 \qquad (2.6.35)$$

The nilpotency of the boundary operator $\partial$ that acts on the chains is the counterpart of the nilpotency of the exterior differential $d$ that acts on differential forms, as explained in Sect. 2.5.4. Consider Fig. 2.31. As one sees, the sequence of the vector spaces $C_m$ of $m$-chains can be put into correspondence with the sequence of vector spaces $\Lambda_m$ of differential $m$-forms. The operator:

$$\partial : C_k \to C_{k-1} \qquad (2.6.36)$$

makes you travel along the sequence from left to right, while the exterior derivative operator:

$$d : \Lambda_k \to \Lambda_{k+1} \qquad (2.6.37)$$

causes you to travel along the same sequence in the opposite direction, from right to left. Both $\partial$ and $d$ are nilpotent maps.
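The nilpotency $\partial^2 = 0$ of Theorem 2.6.3 and the exercises (2.6.30) and (2.6.35) can be replayed mechanically. The sketch below rests on an assumption that is not in the text: each simplex is identified with the ordered tuple of its vertex labels, so that the $i$th face is obtained by omitting the $i$th vertex, and a chain with integer coefficients is stored as a dictionary. The function name `boundary` is a hypothetical helper.

```python
from collections import defaultdict

def boundary(chain):
    """Boundary of a chain {ordered vertex tuple: integer coefficient}, following (2.6.29)."""
    result = defaultdict(int)
    for simplex, coeff in chain.items():
        if len(simplex) == 1:
            continue  # a 0-simplex has no faces; its boundary is taken to be zero
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]  # i-th face: omit the i-th vertex
            result[face] += (-1) ** i * coeff
    return {face: c for face, c in result.items() if c != 0}

S1 = {("A0", "A1", "A2"): 1}
S2 = {("B0", "B1", "B2"): 1}
chain = {**S1, **S2}           # the 2-chain S1 + S2 of (2.6.26)

edge_chain = boundary(chain)   # the six oriented segments of (2.6.30)
vertex_chain = boundary(edge_chain)
```

With this encoding `boundary(S1)` returns the three oriented edges of (2.6.31) with signs $+, -, +$, and applying `boundary` twice leaves the empty chain.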

Fig. 2.31 Homology versus cohomology groups

2.6.3 Homology and Cohomology Groups: General Construction

Let $\pi : X \to Y$ be a linear map between vector spaces. We define the kernel of $\pi$, and we denote by $\ker \pi$, the subspace of $X$ whose elements have the property of being mapped into $0 \in Y$ by $\pi$:

$$\ker \pi = \left\{ x \in X \,/\, \pi(x) = 0 \in Y \right\} \qquad (2.6.38)$$

We call image of $\pi$, and we denote by $\mathrm{Im}\, \pi$, the subspace of $Y$ whose elements have the property that they are the image through $\pi$ of some element of $X$:

$$\mathrm{Im}\, \pi = \left\{ y \in Y \,/\, \exists x \in X \,/\, \pi(x) = y \right\} \qquad (2.6.39)$$

A nilpotent operator that acts on a sequence of vector spaces $X_i$ defines a sequence of linear maps $\pi_i$:

$$X_1 \xrightarrow{\ \pi_1\ } X_2 \xrightarrow{\ \pi_2\ } X_3 \longrightarrow \cdots \longrightarrow X_i \xrightarrow{\ \pi_i\ } X_{i+1} \qquad (2.6.40)$$

that have the following property:

$$\mathrm{Im}\, \pi_i \subset \ker \pi_{i+1} \qquad (2.6.41)$$

The inclusion of $\mathrm{Im}\, \pi_i$ in $\ker \pi_{i+1}$ is what has been pictorially described in Fig. 2.31 and applies both to the boundary and to the exterior derivative operator. This situation suggests the following terminology:

Definition 2.6.9 In every space $C_k(\mathcal{M}, \mathcal{R})$ we name cycles the elements of $\ker \partial$, namely the chains $C$ whose boundary vanishes, $\partial C = 0$. Similarly in every space $\Lambda_k(\mathcal{M})$ we name closed forms or cocycles the elements of $\ker d$, namely the differential forms $\omega$ such that $d\omega = 0$.

At the same time:

Definition 2.6.10 In every space $C_k(\mathcal{M}, \mathcal{R})$ we name boundaries all $k$-chains that are the boundary of a $(k + 1)$-chain:

$$C^{(k)} = \text{boundary} \quad \Leftrightarrow \quad \exists\, C^{(k+1)} \,/\, C^{(k)} = \partial C^{(k+1)} \qquad (2.6.42)$$

Similarly in every space $\Lambda_k(\mathcal{M})$ we name exact forms or coboundaries all differential forms $\omega^{(k)}$ that can be written as the exterior derivative of a $(k - 1)$-form: $\omega^{(k)} = d\omega^{(k-1)}$.

Clearly (2.6.41) can be translated by saying that every boundary is a cycle and every coboundary is a cocycle. The reverse statement, however, is not true in general. There are cycles that are not boundaries and there are cocycles that are not coboundaries. The concept of homology (or cohomology) previously discussed in an intuitive way can be formalized in the following way.

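Definition 2.6.11 Consider the $k$-cycles. We say that two cycles $C_1^{(k)}$ and $C_2^{(k)}$ are homologous and we write $C_1^{(k)} \sim C_2^{(k)}$ if their difference is a boundary:

$$C_1^{(k)} \sim C_2^{(k)} \quad \Rightarrow \quad \exists\, C_3^{(k+1)} \,/\, C_1^{(k)} - C_2^{(k)} = \partial C_3^{(k+1)} \qquad (2.6.43)$$

Clearly homology is an equivalence relation since:

$$\left. \begin{aligned} C_1^{(k)} - C_2^{(k)} &= \partial C_a^{(k+1)} \\ C_2^{(k)} - C_3^{(k)} &= \partial C_b^{(k+1)} \end{aligned} \right\} \quad \Rightarrow \quad C_1^{(k)} - C_3^{(k)} = \partial\left( C_a^{(k+1)} + C_b^{(k+1)} \right) \qquad (2.6.44)$$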

Definition 2.6.12 We name $k$th homology group, and we denote by $H_k(\mathcal{M}, \mathcal{R})$, the group of equivalence classes of the $k$-cycles with respect to the $k$-boundaries. Similarly we define the $k$th cohomology group, and we denote by $H^k(\mathcal{M}, \mathcal{R})$, the group of equivalence classes of the $k$-cocycles with respect to the $k$-coboundaries. Indeed we say that two closed forms $\omega$ and $\omega'$ are cohomologous if their difference is an exact form: $\omega \sim \omega' \Rightarrow \exists\, \phi \,/\, \omega - \omega' = d\phi$.
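For a space cut into finitely many simplexes, the dimension of the $k$th homology group can be computed with linear algebra as $\dim \ker \partial_k - \mathrm{rank}\, \partial_{k+1}$. The sketch below makes several assumptions that go beyond the text: it uses simplicial chains of a finite triangulation instead of singular chains, it takes rational coefficients so that ranks are well defined, and all function names are hypothetical. It is meant only to make Definition 2.6.12 tangible on a circle (hollow triangle) and on a 2-sphere (surface of a tetrahedron).

```python
from fractions import Fraction
from itertools import combinations

def rank(matrix):
    """Rank of a matrix (list of rows) over the rationals, by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def boundary_matrix(k_simplexes, faces):
    """Matrix of the boundary map C_k -> C_{k-1}: rows are faces, columns are k-simplexes."""
    index = {f: n for n, f in enumerate(faces)}
    matrix = [[0] * len(k_simplexes) for _ in faces]
    for col, s in enumerate(k_simplexes):
        for i in range(len(s)):
            matrix[index[s[:i] + s[i + 1:]]][col] = (-1) ** i
    return matrix

def betti_numbers(simplexes_by_dim):
    """dim H_k = (number of k-simplexes) - rank(boundary_k) - rank(boundary_{k+1})."""
    top = len(simplexes_by_dim) - 1
    ranks = [0] * (top + 2)  # boundary_0 and boundary_{top+1} are zero maps
    for k in range(1, top + 1):
        ranks[k] = rank(boundary_matrix(simplexes_by_dim[k], simplexes_by_dim[k - 1]))
    return [len(simplexes_by_dim[k]) - ranks[k] - ranks[k + 1] for k in range(top + 1)]

# Hollow triangle (a circle): three vertices and three edges, no 2-simplex.
circle = [list(combinations(range(3), d + 1)) for d in range(2)]
# Surface of a tetrahedron (a 2-sphere): all vertices, edges and triangles on 4 points.
sphere = [list(combinations(range(4), d + 1)) for d in range(3)]
# Filled triangle (a disc): the hollow triangle plus its interior.
disc = circle + [[(0, 1, 2)]]
```

Comparing `circle` with `disc` shows the mechanism of the definition: the edge cycle around the triangle is a non-trivial class until the 2-simplex is added, at which point it becomes a boundary.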

More generally, when we have a sequence of vector spaces $X_i$ as in (2.6.40) and a sequence of linear maps $\pi_i$ satisfying (2.6.41), we define the cohomology groups relative to the operator $\pi$ as:

$$H^i_{(\pi)} \equiv \frac{\ker \pi_i}{\mathrm{Im}\, \pi_{i-1}} \qquad (2.6.45)$$

The relation existing between homology and cohomology is fully contained in the following formula, which generalizes to an arbitrary smooth manifold and to differential forms of any degree the familiar Gauss lemma or Stokes lemma:

$$\int_{\partial C^{(k+1)}} \omega^{(k)} = \int_{C^{(k+1)}} d\omega^{(k)} \qquad (2.6.46)$$

Equation (2.6.46), whose general proof we omit, implies that in the case $C^{(k)}$ is a cycle we have:

$$\int_{C^{(k)}} \left( \omega^{(k)} + d\phi^{(k-1)} \right) = \int_{C^{(k)}} \omega^{(k)} \qquad (2.6.47)$$

namely the integral of a closed differential form along a cycle depends only on the cohomology class and not on the choice of the representative. Similarly, if $\omega^{(k)}$ is a closed form:

$$\int_{C^{(k)} + \partial C^{(k+1)}} \omega^{(k)} = \int_{C^{(k)}} \omega^{(k)} \qquad (2.6.48)$$

namely the integral of a cocycle along a cycle depends only on the homology class of the cycle and not on the choice of the representative inside the class.
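The simplest instance of (2.6.46) can be verified exactly on a triangle in the plane. The sketch below is an illustration under assumptions that are not in the text: the 1-form is chosen to be $\omega = x\, dy$, whose exterior differential is $dx \wedge dy$, the integral of $dx \wedge dy$ over an oriented triangle is taken to be its signed area, and the vertices have integer coordinates so that exact rational arithmetic can be used. The function names are hypothetical.

```python
from fractions import Fraction

def integral_x_dy(p, q):
    """Exact integral of the 1-form x dy along the straight segment from p to q."""
    (x0, y0), (x1, y1) = p, q
    return Fraction(x0 + x1, 2) * (y1 - y0)

def boundary_integral(a0, a1, a2):
    """Integral of x dy over the oriented boundary A1A2 - A0A2 + A0A1 of (2.6.31)."""
    return integral_x_dy(a1, a2) - integral_x_dy(a0, a2) + integral_x_dy(a0, a1)

def signed_area(a0, a1, a2):
    """Signed area of the triangle (a0, a1, a2): the integral of dx ^ dy over it."""
    (x0, y0), (x1, y1), (x2, y2) = a0, a1, a2
    return Fraction((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0), 2)

triangles = [
    ((0, 0), (1, 0), (0, 1)),
    ((2, -1), (5, 3), (-4, 7)),
    ((1, 1), (3, 1), (2, 1)),   # degenerate triangle: both sides vanish
]
```

For each triangle the boundary integral, computed with the signs dictated by the boundary operator, coincides with the area integral, which is the content of the Stokes lemma in this case.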

2.6.4 Relation Between Homotopy and Homology

The relation between homotopy and homology groups of a manifold is provided by a fundamental theorem of algebraic topology that we state without proof:

Theorem 2.6.4 Let $\mathcal{M}$ be a smooth manifold. Then there exists a homomorphism:

$$\chi : \pi_1(\mathcal{M}) \to H_1(\mathcal{M}, \mathbb{Z}) \qquad (2.6.49)$$

that sends the homotopy class of each loop $\gamma$ into the homology class of the 1-simplex $\gamma$. If $\mathcal{M}$ is arc-wise connected, then the map $\chi$ is surjective and the kernel of $\chi$ is the subgroup of commutators in $\pi_1(\mathcal{M})$.

We recall that the subgroup of commutators of a discrete group $G$ is the group $G'$ generated by all elements of the form $x^{-1}y^{-1}xy$ for some $x, y \in G$. From this theorem we have two consequences:

Corollary 2.6.1 If $\pi_1(\mathcal{M})$ is Abelian, then $H_1(\mathcal{M}) \simeq \pi_1(\mathcal{M})$, namely the first homotopy and homology groups coincide.

Corollary 2.6.2 If a manifold M is simply connected (π1(M ) = 1) then also the first homology group is trivial H1(M ) = 0.

The second of the above corollaries implies that in a simply connected manifold every closed loop is homologous to zero, namely it is the boundary of some region.

On the foundations of Differential Geometry, the notions of Manifolds and Fibre Bundles and all the basic concepts introduced in the present chapter there exist many classical textbooks. A short list reflecting just the preferences of the author is the following one [1–3].

References

1. Nakahara, M.: Geometry, Topology and Physics. IOP Publishing Ltd, Adam Hilger, Bristol (1990)
2. Helgason, S.: Differential Geometry, Lie Groups, and Symmetric Spaces. American Mathematical Society, Providence (2001)
3. Nash, C., Sen, S.: Topology and Geometry for Physicists. Academic Press, San Diego (1983)

Chapter 3 Connections and Metrics

My work always tried to unite the truth with the beautiful, but when I had to choose one or the other, I usually chose the beautiful. . . Hermann Weyl

3.1 Introduction

We are now ready to use the tangent bundle in order to introduce what will be the main instrument of differential calculus on a smooth manifold, namely the covariant derivative. In General Relativity we shall mainly use the covariant derivative on the tangent bundle, but it is important to realize that one can define covariant derivatives on general fibre-bundles. Indeed the covariant derivative is the physicist's name for the mathematical concept of connection that we are going to introduce and illustrate. It is also important to stress that, even restricting one's attention to the tangent bundle, the connection used in General Relativity is a particular one, the so-called Levi Civita connection, that arises from a more fundamental object, the metric. As we are going to see soon, a manifold endowed with a metric structure is a space where one can measure lengths, specifically the length of all curves. A generic connection on the tangent bundle is named an affine connection and the Levi Civita connection is a specific affine connection that is uniquely determined by the metric structure. As we shall presently illustrate, every connection has, associated with it, another object (actually a 2-form) that we shall name its curvature. The curvature of the Levi Civita connection is what we name the Riemann curvature of a manifold and will be the main concern of General Relativity. It encodes the intuitive geometrical notion of curvature of a surface or hypersurface. The field equations of Einstein's theory are statements about the Riemann curvature of space-time that is related to its energy-matter content. We should be aware that the notion of curvature applies to generic connections on generic fibre-bundles, in particular on principal bundles. Physically these connections and curvatures are not less important than the Levi Civita connection and the Riemann curvature. They constitute the main mathematical objects entering the theory of fundamental non-gravitational interactions.


Fig. 3.1 A symbolic assessment by C.N. Yang that Mathematics and Physics have a common root

3.2 A Historical Outline

One spring in the late seventies, C.N. Yang, the father of gauge-theories and one among the most extraordinary scientific minds of the XXth century, was invited by Scuola Normale di Pisa to give the traditional yearly series of Fermian Lectures. It was just the eve of the 1983 experimental discovery of the W and Z bosons, which finally confirmed that gauge theories are indeed the language adopted by Nature to express the fundamental forces binding matter together and driving the evolution of the physical world. Yang chose to start his recollection of gauge-theories and his personal assessment of the entire subject from a symbolic picture of the kind shown in Fig. 3.1. Two leaves depart from the same stem: one is Mathematics, the other is Physics. The two leaves live parallel lives but have a common root and, as it happens in non-Euclidean geometry, parallels can and indeed do intersect. They frequently intersect and, equally frequently, interchange the role of guiding pivot. Examples are numberless both in recent and less recent history of science.

This is a relevant but somewhat trivial observation. The most profound allusion in the typical Chinese symbolism of Yang's picture is the common stem of the two leaves. Mathematics and Physics were not as socially and academically separated in the XIXth century as they became in the XXth century, and were further unified by a common denominator: the shared philosophical attitude of both the mathematician and the physicist, the latter regarding himself as a natural philosopher.

The common root of Modern Mathematics and Modern Theoretical Physics is most prominently evident in the case of the concepts of connections and metrics, which constitute the main topics of the present chapter. From the physical side these mathematical notions encode all the fundamental interactions among elementary matter constituents.
From the mathematical side they are the cornerstones of Differential Geometry, Algebraic Topology and allied subjects.

The historical developments of both the notion of a connection and the notion of a metric are quite long, spreading over more than a century. They are strongly intertwined and involve both Physics and Mathematics in an alternate and entangled fashion.

Two fundamental geometrical problems are at the basis of both notions: the problem of length and the problem of parallel transport, namely how we can measure distances in a general continuous space and how we can assess the parallelism of two lines going through different points of that space. It turns out that these two apparently different problems are intimately related. Such a relation is the reason why connections and metrics, although genuinely different and independent mathematical notions, can be, under certain conditions, related in a one-to-one fashion. The case where connection and metric are one-to-one related corresponds to Gravity and is the playground of General Relativity, while the case where the connection stands on its own feet is the case of non-gravitational interactions and encodes all the rest of Physics.

Historically the first notion of connection to be discovered was that of the metric connection, named after its systematizer, the Italian mathematician Levi Civita. The Levi Civita connection is the one appropriate to a Riemannian manifold, namely to a manifold equipped with a metric, and it is analytically described by the 3-index symbols $\left\{{\mu \atop \nu\rho}\right\}$, introduced in the XIXth century by their German inventor, Elwin Bruno Christoffel. The development of such a notion is embedded in the development of Riemannian geometry, embracing the end of the XIXth century and the dawn of the XXth, which is a mathematical tale strongly intertwined with the physical tale of Einstein’s quest for General Covariance and the Theory of Gravity.
It took several decades in the first half of the XXth century and the work of several mathematicians to single out the notion of a connection on a principal fibre bundle, purified from association with a metric. In a completely independent way, in 1954 Yang and Mills introduced the physical notion of non-Abelian gauge fields, which extends to all Lie groups G the structure of the electromagnetic theory based on the simplest of all such groups, namely U(1). In due course the mathematical notion of a connection and the physical one of a Yang-Mills field were recognized to be identical.

Let us sketch the main outline of this crucial, century-long intellectual development.

3.2.1 Gauss Introduces Intrinsic Geometry and Curvilinear Coordinates

The first appearance of a metric is in the 1828 essay of Gauss (see Fig. 3.2) on curved surfaces. Written in Latin, the Disquisitiones Generales circa superficies curvas [1] contains the major revolutionary step forward that was necessary to overcome the confines of Euclidian geometry and found a new differential science of spaces able to treat both flat and curved ones. Up to Gauss’ paper, Geometry was either formulated abstractly in terms of Euclidian axioms or analytically in terms of Cartesian coordinates. By Geometry was meant the study of global properties of plane figures like triangles, squares and other polygons, or solids like the regular polyhedra. All such objects were conceived as immersed in an external space where it was implicitly assumed that one could always define the absolute distance d(A,B) between any two given points A and B. Distance is the basic brick of the whole Euclidian building and it is calculated as the length of the segment with end-points in A and B, lying on the unique straight line which goes through any such pair of distinct points.

Fig. 3.2 Carl Friedrich Gauss (1777–1855) on the left and Georg Friedrich Bernhard Riemann (1826–1866) on the right. Gauss, the King of Mathematicians, was Professor at the University of Göttingen for many decades up to the very end of his long life. His contributions to all fields of Mathematics were enormous and most profound. Bernhard Riemann, another genius and a giant of human thought, had a very short and not too happy life. He was Gauss’ student both at the level of Diploma and of Habilitationsschrift. The whole of his work is contained in no more than 11 papers for a total of little more than 200 pages. Yet each of his contributions was a milestone in Mathematics and set the path for century-long future developments. The foundations of Riemannian geometry were laid in the 16-page-long dissertation of his Habilitation. Riemann had important ties with the Scuola Normale di Pisa and died in Italy at the age of 39

Curved surfaces were obviously known before Gauss, yet their shape and properties were conceived only through their immersion in three-dimensional space, considered unique and absolute, as claimed by Immanuel Kant, who promoted Euclidian geometry to an a priori truth lying at the basis of any sensorial experience. Gauss’ revolutionary starting point was that of reformulating the geometrical study of surfaces from an intrinsic rather than extrinsic viewpoint. He wondered how a little being, confined to live on the surface, might have perceived the geometry of his world. Rather than viewing the global shape of the surface M, inaccessible to his observations, the little creature would have explored its local properties in the vicinity of a point p ∈ M. In order to study curved surfaces in these terms, Gauss understood that it was necessary to abandon Cartesian coordinates as a system of point identification.
In analytic geometry every point $p \in E \equiv \mathbb{R}^3$ of Euclidian space is singled out by a triplet of real numbers X, Y, Z which give the distances of p from the three coordinate planes. As long as the points of the surface M are particular points of E, they admit the labeling in terms of three Cartesian coordinates, yet in such a description there is an excess of superfluous information. Why three coordinates when we are talking about a two-dimensional surface? Two should suffice. Gauss was the first to grasp the notion of curvilinear coordinates and invented Gaussian coordinates. A very simple but revolutionary idea.

Fig. 3.3 The points p of a curved surface M can be labeled with the three Cartesian coordinates X, Y, Z of the Euclidian space in which M is immersed. Yet this information is redundant. The two Gaussian coordinates u and v are given through the construction of two systems of curves U and V on the surface

On the surface M let us consider a family of curves U such that each element of the family never intersects any other element of the same family, at least in the neighborhood $U_p$ of the point p (for instance the lighter curves in Fig. 3.3), and such that the family covers the entire considered neighborhood. Let us next introduce a second family of curves V, with the same properties among themselves, yet such that each element of the family V intersects all elements of the U family, at least in the neighborhood $U_p$ (for instance the darker curves in Fig. 3.3). Once such systems of curves have been constructed, any point $q \in U_p$ in the neighborhood of p can be localized by stating on which U-curve and on which V-curve it lies. Assuming that u and v are the real parameters respectively enumerating the U and V curves, the pair of real numbers (u, v) provides the new (Gaussian) system of coordinates to label surface points. Using these coordinates we no longer need to make any reference to the exterior space in which M is immersed, which might not even exist!

By introducing curvilinear Gaussian coordinates the King of Mathematicians freed the study of surfaces from their immersion in the external Euclidian space E, but he immediately had to cope with a new fundamental problem. Having abolished from the list of one’s mathematical instruments the straight line segments that join any two points A and B of the surface M, how can we calculate their distance?
The great intuitions of Gauss were the tangent plane $T_pM$ and the linear element $ds^2$, namely the metric.

Defining the absolute distance between two points A and B was no longer possible but also not interesting. In the external Euclidian space, the distance between A and B is the length of the segment which joins them, but what relevance does this datum have for the little creature confined to live on the surface M, if such a segment does not lie on it? For the little two-dimensional being the only interesting datum is the length of a road going from A to B: the length of any possible road with such a property! The small ant has exactly the same needs as any contemporary car driver who wants to start on a journey. Both need to know the length of all possible paths from their origin to their destination. Hence the problem addressed by Gauss was to give an answer to the following question: Can we define the length of any curve departing from p ∈ M and arriving at q ∈ M in terms of data completely intrinsic to the surface M? Gauss’ answer was positive and based on the change of perspective at the basis of the new differential geometry.

Fig. 3.4 The tangent plane at a point p of a surface M is a two-dimensional Euclidian space $E^2$ where the notion of distance is defined. In the tangent plane, which approximates infinitesimal regions of the surface surrounding p, we can define the line element as the Euclidian distance between two infinitesimally close points of the surface

Let us reformulate the initial question whether we might define the absolute distance between two arbitrary points A, B ∈ M of the surface, adding the extra condition that A and B should be only infinitesimally apart from each other. Analytically this means that if the Gaussian coordinates of A are (u, v), then those of B should be (u + du, v + dv) where du and dv are infinitesimal. Gauss’ crucial observation is that a very small portion of the surface M around any point p ∈ M can be approximated by a portion of the tangent plane to the surface at the point p, namely $T_pM$. The smaller the considered region of M, the better the approximation (see Fig. 3.4). This being the case we observe that Euclidian geometry makes sense in the tangent plane. Gauss remarked that the distance of two infinitesimally close points can be defined as the length of the infinitesimally short segment which joins them and which lies in the tangent plane.
Recalling Pythagoras’ theorem one might be tempted to say that the squared length of the segment joining A and B, named $ds^2$, is the sum of the squared differences of the Gaussian coordinates, namely $ds^2 = du^2 + dv^2$. Yet this is not necessarily true. In order to apply Pythagoras’ theorem it is required that the axes u and v be orthogonal, which is generically false. Indeed a priori it is by no means guaranteed that the curve u and the curve v, meeting at point A, should intersect there at a right angle. In order to calculate $ds^2$ one has therefore to find the components dx and dy of the infinitesimal segment AB in an orthogonal system of axes x and y. Once these components are known one can apply Pythagoras’ theorem and write $ds^2 = dx^2 + dy^2$. The components dx and dy depend on the Gaussian shifts du and dv linearly:

$$\begin{pmatrix} dx \\ dy \end{pmatrix} = \begin{pmatrix} a(u,v) & b(u,v) \\ c(u,v) & d(u,v) \end{pmatrix} \begin{pmatrix} du \\ dv \end{pmatrix} \tag{3.2.1}$$

with matrix coefficients that vary from place to place on the surface, namely are functions of the Gaussian coordinates u, v. Taking this into account Gauss wrote the line element in the following way:

$$ds^2 = F(u,v)\,du^2 + G(u,v)\,dv^2 + H(u,v)\,du\,dv \tag{3.2.2}$$

where $F = a^2 + c^2$, $G = b^2 + d^2$, $H = 2ab + 2cd$. Formula (3.2.2), written in 1828, provided the first example (a two-dimensional one) of a Riemannian metric, although Riemann was at that time only a two-year-old child.
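The coefficients quoted in (3.2.2) follow from a one-line substitution, which may be worth spelling out. The short derivation below inserts the linear relations (3.2.1) into Pythagoras’ theorem. The oblique-axes illustration that follows, with a constant angle α between the two families of coordinate lines, is our own choice of example and is not taken from Gauss’ essay.

```latex
% Substitute dx = a du + b dv and dy = c du + d dv, from (3.2.1), into ds^2 = dx^2 + dy^2
\begin{aligned}
ds^2 &= (a\,du + b\,dv)^2 + (c\,du + d\,dv)^2 \\
     &= \underbrace{(a^2 + c^2)}_{F}\,du^2 + \underbrace{(b^2 + d^2)}_{G}\,dv^2
        + \underbrace{2\,(ab + cd)}_{H}\,du\,dv .
\end{aligned}

% Illustration (assumed example): straight coordinate lines in a plane meeting at a constant angle \alpha,
% x = u + v\cos\alpha, \quad y = v\sin\alpha, so that a = 1, b = \cos\alpha, c = 0, d = \sin\alpha
\begin{aligned}
F &= 1, \qquad G = \cos^2\alpha + \sin^2\alpha = 1, \qquad H = 2\cos\alpha, \\
ds^2 &= du^2 + dv^2 + 2\cos\alpha\,du\,dv .
\end{aligned}
```

For α = π/2 the mixed term drops out and the naive formula $ds^2 = du^2 + dv^2$ is recovered, which is exactly the orthogonality condition required by Pythagoras’ theorem.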

3.2.2 Bernhard Riemann Introduces n-Dimensional Metric Manifolds

The name of Riemann is associated in Mathematics with so many different and fundamental objects that the contemporary student is instinctively led to think of the scientific production of this giant of human thought as composed of a countless number of papers, books and contributions. Actually the entire corpus of Riemann’s works consists of only 225 pages distributed over 11 articles published during the lifetime of their author, to which one has to add the 102 pages of the 4 posthumous publications. Among the latter there are the 16 pages of the Ueber die Hypothesen, welche der Geometrie zu Grunde liegen [2] which, in 1854, was defended by the candidate in front of the Göttingen Faculty of Philosophy as Habilitationsschrift. The habilitation to teach courses was the traditional first step in the academic career foreseen by most European universities throughout their very long history. In XIXth century Germany the procedure to obtain the habilitation consisted of writing a dissertation on a topic chosen by the Faculty from a list of three proposed by the candidate. The typical time allowed for the preparation of such a dissertation was a couple of months and in the case of Riemann it amounted to exactly seven weeks.

Beset throughout his short life by extreme poverty and by very poor health, which eventually led to his death from pulmonary consumption at the quite young age of thirty-nine, the shy and meek Bernhard Riemann, who was nonetheless quite conscious of his own talents, had already profoundly impressed Gauss with his diploma thesis.
Written in 1851 and entitled Grundlagen für eine allgemeine Theorie der Functionen einer veränderlichen complexen Grösse, which can be translated as Principles of a General Theory of the Functions of One Complex Variable, Riemann’s thesis was completely new and contained all the essentials of the theory of analytic functions as it is taught up to the present day in most universities of the world. Quite openly Gauss told his young student that for many years he had cherished the plan of writing a similar essay on that very topic, yet now he would refrain from doing so since everything relevant to that province of thought had already been said by Riemann.

When three years later Riemann presented to the Göttingen Faculty his three proposals for the theme of his own Habilitationsschrift, two choices were in fields where the young mathematician felt quite confident, while the third, with some hesitation, was just added in order to complete the triplet and with the secret hope that it would be immediately discarded by the academic committee as something too philosophical and ill-defined. The third proposed title was Grundlagen der Geometrie, namely the Principles of Geometry. Remembering the talents of the young Herr Riemann, Gauss was fascinated by the idea of giving him precisely such a challenging subject as the Foundations of Geometry to see what he might come up with. The King of Mathematicians persuaded the Faculty to make such a choice and poor Bernhard was dismayed by the news. He wrote to his father, a poor Lutheran minister, about his concerns on this matter, but he also expressed to him his confidence that he would not come too late and that his merits as an independent researcher would be appreciated.
Riemann had accepted the challenge and in seven weeks he produced such a masterpiece of Mathematics and Philosophy as the Ueber die Hypothesen, welche der Geometrie zu Grunde liegen, that is About the Hypotheses lying at the Foundations of Geometry. With an unparalleled clarity of mind, Riemann began his essay with a profound criticism of the traditional approach to Geometry, rejecting the Kantian dogma that the latter is an a priori datum and rather inclining to the idea that which geometry is the actual one of Physical Space should be determined from experience. He said:

“It is known that geometry assumes, as things given, both the notion of space and the first principles of constructions in space. She gives definitions of them which are merely nominal, while the true determinations appear in the form of axioms. The relation of these assumptions remains consequently in darkness; we neither perceive whether and how far their connection is necessary, nor a priori, whether it is possible. From Euclid to Legendre (to name the most famous of modern reforming geometers) this darkness was cleared up neither by mathematicians nor by such philosophers as concerned themselves with it.”¹

After stating this two-thousand-year-old stalemate, Riemann proceeded to diagnose its cause. Explicitly he said:

“The reason of this is doubtless that the general notion of multiply extended magnitudes (in which space-magnitudes are included) remained entirely unworked. I have in the first place, therefore, set myself the task of constructing the notion of a multiply extended magnitude out of general notions of magnitude. It will follow from this that a multiply extended magnitude is capable of different measure-relations, and consequently that space is only a particular case of a triply extended magnitude.”

¹ The translation of Riemann’s essay from German into English was done by William Clifford.

In contemporary language the multiply extended magnitudes² were simply the manifolds which we discussed in Chap. 2 and the measure relations are just the metric to be discussed in the present chapter and introduced for the first time by Gauss through (3.2.2).

Following the new road opened by Gauss with the Disquisitiones, Riemann introduced n-extended manifolds whose points are labeled by n rather than two curvilinear coordinates $x^i$ and introduced the line element as a generic symmetric quadratic form in the differentials of these coordinates:

$$ds^2 = g_{ij}(x)\,dx^i\,dx^j \tag{3.2.3}$$
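As a hedged illustration of (3.2.3), the block below first rewrites Gauss’ line element (3.2.2) as the n = 2 case and then quotes the familiar metric of a sphere. The identification $x^1 = u$, $x^2 = v$ and the sphere of radius r with polar angles (θ, φ) are our assumptions, chosen only for illustration.

```latex
% n = 2, with x^1 = u and x^2 = v: compare (3.2.2) with ds^2 = g_{ij} dx^i dx^j.
% The mixed term appears twice in the double sum (g_{12} = g_{21}), hence the factor 1/2.
g_{11} = F, \qquad g_{22} = G, \qquad g_{12} = g_{21} = \tfrac{1}{2}\,H .

% Sphere of radius r (assumed example), coordinates x^1 = \theta, x^2 = \varphi:
ds^2 = r^2\,d\theta^2 + r^2\sin^2\theta\,d\varphi^2,
\qquad
g_{ij} = \begin{pmatrix} r^2 & 0 \\ 0 & r^2\sin^2\theta \end{pmatrix} .
```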

The coefficients of this quadratic form, $g_{ij}(x)$, were later known as the Riemannian metric tensor.

Riemann grasped the main point, namely that the geometry of manifolds is encoded in the possible metric tensors or measure relations, as he called them, and made the following bold statement:

“Hence flows as a necessary consequence that the propositions of geometry cannot be derived from general notions of magnitude, but that the properties which distinguish space from other conceivable triply extended magnitudes are only to be deduced from experience. Thus arises the problem, to discover the simplest matters of fact from which the measure-relations of space may be determined; a problem which from the nature of the case is not completely determinate, since there may be several systems of matters of fact which suffice to determine the measure-relations of space.”

In other words, the young genius was aware that the same manifold could support quite different metrics and thought that this applied in particular to Space, i.e. to the 3-dimensional physical world of our sensorial experience. He posed himself the question of what the metric of Space should be and came to the conclusion that such a question could only be answered through experiment. This amounted to saying that the geometry of the world is a matter of Physics and not of a priori Philosophy or Mathematics. Such a statement of Riemann’s must have influenced Einstein quite deeply. Indeed the final outcome of Einstein’s Theory of Relativity is that the geometry of space-time is dynamically determined by its matter content through Einstein’s field equations.
In considering such a question as what is the preferred metric to be selected for a given manifold, Riemann formulated the basic problem of invariants. The matters of fact³ to which he alluded are the intrinsic properties encoded in a given metric tensor, namely its invariants, and he formulated the problem of determining, for instance, the minimal complete set of invariants able to select Euclidian geometry. In his quest for these invariants he came to the notion of the Riemann curvature tensor that he outlined in his very dissertation. The curvature form of connections will be the subject of a subsequent section.

As we already recalled, Riemann died young and had no time to develop the new theory of differential geometry that he had founded. Yet he had the time to come to Italy and, through his contact with the Scuola Normale di Pisa and the research group of Enrico Betti, whom he deeply admired, to plant the seeds of the absolute differential calculus in the Italian Peninsula where, later, they were strongly developed by Gregorio Ricci Curbastro and Tullio Levi Civita.

² In the original German text of Riemann these were named mehrfach ausgedehnter Grössen. In modern scientific German the notion of manifolds is referred to as Mannigfaltigkeiten.

³ Einfachsten Thatsachen in the original German text.

3.2.3 Parallel Transport and Connections

The idea of connections developed in the XIXth century along two different routes which merged only in the XXth. One route comes directly from Riemann and, through Christoffel, Ricci and Levi Civita, led to Einstein. We can name this the metric route. It resulted in the notion of the Levi Civita connection, which is the parallel transport defined by the existence of a metric structure. The other route was independently started in France by the work of Frenet and Serret and led down to Elie Cartan: it is the route of the mobile frames (or repères mobiles as they are named in French). A metric structure is not required in the second approach and the connection deals with the parallel transport of a basis of vectors from one point of a manifold to another one, along a curve. Connections of this type are named Cartan connections. There is obviously a relation between Cartan connections and metric connections. Such a relation is established through a so-called soldering of the frame bundle with the tangent bundle, which also provides the dictionary between two different formulations of General Relativity: the original metric one of Einstein and the vielbein reformulation by Cartan. From a modern perspective the vielbein formulation appears to be the most fundamental since it includes the metric formulation yet, unlike the latter, also allows for the coupling to gravity of the fermionic fields, the spinors.

The further generalization of Cartan connections to principal connections on generic fibre bundles was finally elaborated by Ehresmann in the post-war years of the middle XXth century and this mathematical development came almost in parallel with the introduction of non-Abelian gauge theories in physics: the two leaves of C.N. Yang’s drawing protruding from the same stem!

3.2.4 The Metric Connection and Tensor Calculus from Christoffel to Einstein, via Ricci and Levi Civita

As we emphasized in the previous section, the primary concern of the new differential geometry, founded by Riemann as a generalization of Gauss’ work on surfaces, was that of defining the length of curves on arbitrary manifolds. This leads to the notion of the metric $g_{ik}(x)$, introduced in (3.2.3). Once the metric is established, a natural way arises of transporting vectors along any given curve $x^\mu(\lambda)$. We can say that a vector is parallel-transported along an arc of curve if the angle between the transported vector and the tangent vector to the curve remains constant throughout the entire transport (see Fig. 3.5).

Fig. 3.5 The parallel transport of a vector along a curve is defined through the preservation of its angle with the tangent vector. This definition is possible if we have a metric and hence the notion of scalar product of local vectors

This notion is meaningful if we can define and measure angles. This is precisely what the existence of a metric allows. Indeed, in the presence of $g_{\mu\nu}(x)$, we can generalize the relations of Euclidian geometry by defining the norm of any vector as:

$$\|v\| \equiv \sqrt{v^\mu(x)\,v^\nu(x)\,g_{\mu\nu}(x)} \tag{3.2.4}$$

and the angle between any two vectors $v_1^\mu(x)$ and $v_2^\nu(x)$ as

$$\cos\theta \equiv \frac{\langle v_1, v_2\rangle}{\|v_1\|\,\|v_2\|} \tag{3.2.5}$$

where the scalar product $\langle v_1, v_2\rangle$ is:

$$\langle v_1, v_2\rangle \equiv v_1^\mu(x)\,v_2^\nu(x)\,g_{\mu\nu}(x) \tag{3.2.6}$$

The metric connection is that infinitesimal displacement of a vector X along the direction singled out by another one Y which is so defined as to fulfill the property of preserving angles. It was first conceived by Christoffel.

Elwin Bruno Christoffel was born in 1829 in Montjoie, near Aachen, which was renamed Monschau in 1918 (Fig. 3.6). After attending secondary school, he enrolled at the University of Berlin, where he had such teachers as Eisenstein and Dirichlet. The latter in particular is rightly considered his master. Christoffel’s doctoral dissertation, dealing with the motion of electricity in homogeneous media, was defended in 1856, just two years after Riemann’s presentation of the Ueber die Hypothesen. Having spent a few years out of the academic world, Christoffel returned to Mathematics in 1859, obtaining his habilitation from Berlin University. In the following years he was professor at the Polytechnic of Zürich, at the newly founded Technical University of Berlin and finally at the University of Strasbourg, which had become German after the defeat of Napoleon III in the 1870 war. According to opinions reported by contemporaries, Elwin Bruno Christoffel was one of the most polished teachers ever to occupy a chair. His lectures were meticulously prepared and his delivery was lucid and of the greatest aesthetic perfection... The core of his course was the theory of complex functions that he developed and presented according to Riemann’s approach.

Fig. 3.6 Elwin Bruno Christoffel (Montjoie 1829, 1918)

Although he wrote papers on several different topics like potential theory, differential equations, conformal mappings, and still more, the most relevant and influential of Christoffel’s contributions, with the furthest-reaching consequences, was his invention of the three-index symbols that bear his name. Defined in terms of a metric $g_{\mu\nu}(x)$ and of its inverse $g^{\rho\sigma}(x)$, the symbols:

$$\left\{{\lambda \atop \mu\nu}\right\} \equiv \frac{1}{2}\,g^{\lambda\sigma}\left(\partial_\mu g_{\nu\sigma} + \partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu}\right) \tag{3.2.7}$$

are the first example of connection coefficients, actually those of the Levi Civita connection, which preserves angles along the parallel transport it defines [3]. The Christoffel symbols are the key ingredients in the definition of the covariant derivative of a tensor (in particular of a vector):

$$\nabla_\mu t_{\lambda_1\dots\lambda_n} \equiv \partial_\mu t_{\lambda_1\dots\lambda_n} - \left\{{\rho \atop \mu\lambda_1}\right\} t_{\rho\lambda_2\dots\lambda_n} - \left\{{\rho \atop \mu\lambda_2}\right\} t_{\lambda_1\rho\dots\lambda_n} - \cdots - \left\{{\rho \atop \mu\lambda_n}\right\} t_{\lambda_1\dots\lambda_{n-1}\rho} \tag{3.2.8}$$
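A minimal worked example may help to see (3.2.7) and (3.2.8) at work. We take the Euclidian plane in polar coordinates (r, φ); this choice of manifold and coordinates is ours, made purely for illustration. Only the non-vanishing symbols are listed.

```latex
% Euclidian plane in polar coordinates (assumed example): ds^2 = dr^2 + r^2 d\varphi^2
g_{rr} = 1, \quad g_{\varphi\varphi} = r^2, \quad g^{rr} = 1, \quad g^{\varphi\varphi} = r^{-2} .

% Christoffel symbols from (3.2.7); the only non-constant metric component is g_{\varphi\varphi}
\left\{ {r \atop \varphi\varphi} \right\}
  = \tfrac{1}{2}\, g^{rr} \left( -\partial_r g_{\varphi\varphi} \right) = -r ,
\qquad
\left\{ {\varphi \atop r\varphi} \right\} = \left\{ {\varphi \atop \varphi r} \right\}
  = \tfrac{1}{2}\, g^{\varphi\varphi}\, \partial_r g_{\varphi\varphi} = \frac{1}{r} .

% Covariant derivative of a one-index tensor t_\lambda from (3.2.8), component (\mu,\lambda) = (\varphi,\varphi)
\nabla_\varphi t_\varphi
  = \partial_\varphi t_\varphi - \left\{ {\rho \atop \varphi\varphi} \right\} t_\rho
  = \partial_\varphi t_\varphi + r\, t_r .
```

Although the plane is flat, the symbols do not vanish: they record the curvilinear nature of the coordinates, not the curvature of the space.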

The word tensor was introduced for the first time by Hamilton in 1846, but tensor calculus was developed around 1890 by Gregorio Ricci Curbastro under the title of absolute differential calculus and was made accessible to mathematicians by the publication of Tullio Levi Civita’s 1900 classic text of the same name, originally written in Italian, later republished in French with Ricci [4].

Fig. 3.7 The three Italian fathers of tensor calculus also named, at the time, Absolute Differential Calculus

Gregorio Ricci Curbastro (see Fig. 3.7) was the son of an aristocratic family of Lugo di Romagna. On the house where he was born in 1853 there stands a plaque with the following words: Diede alla scienza il calcolo differenziale assoluto, strumento indispensabile per la teoria della relatività generale, visione nuova dell’universo.⁴ He began his studies at Rome University but he continued them at the Scuola Normale di Pisa and finally graduated from the University of Padova in 1875. Like his younger friend Luigi Bianchi, born in Parma in 1856 and also a student of the Scuola Normale, in the Pisa years he was deeply influenced by the teaching of Ulisse Dini and Enrico Betti, the founder of modern topology. Through Betti, both Ricci and Bianchi captured the seeds of differential geometry planted by Riemann a few years before. After graduation, Ricci obtained a fellowship that allowed him to spend some years in Munich, in Germany. There he came in touch with the new conception and classification of geometries, based on symmetry groups, developed by Felix Klein and magisterially summarized by him in the celebrated Erlangen Programme [5]. These ideas had a similarly strong impact on Luigi Bianchi.

Promoted to the position of full professor at the University of Padova in 1880, Ricci had there an exceptionally talented graduate student: Tullio Levi Civita, who was born in that city in 1873. Ricci, Bianchi and Levi Civita constructed the mathematical language used by Einstein to formulate General Relativity, which is also the most common language for classical differential geometry. The key ingredients of that language are just the tensors $t^{\mu_1\dots\mu_m}_{\lambda_1\dots\lambda_n}$, whose defining property is that of transforming from one coordinate patch to another, according to:

$$\tilde t^{\,\mu_1\dots\mu_m}_{\lambda_1\dots\lambda_n}(\tilde x) = \frac{\partial\tilde x^{\mu_1}}{\partial x^{\rho_1}}\cdots\frac{\partial\tilde x^{\mu_m}}{\partial x^{\rho_m}}\,\frac{\partial x^{\sigma_1}}{\partial\tilde x^{\lambda_1}}\cdots\frac{\partial x^{\sigma_n}}{\partial\tilde x^{\lambda_n}}\; t^{\rho_1\dots\rho_m}_{\sigma_1\dots\sigma_n}(x) \tag{3.2.9}$$

In the contemporary mathematical language utilized in Chap. 2, a tensor with m upper indices and n lower indices is just a section of the tensor product of the mth tensor power of the tangent bundle with the nth tensor power of the cotangent bundle. Hence the absolute differential calculus of Ricci, Levi Civita and Bianchi is just the differential calculus for sections of those fibre bundles whose transition functions are completely determined by the very manifold structure of their base manifold M, namely $TM$ and $T^*M$.

The concept of covariant differentiation was formally developed by Ricci and Levi Civita and, as already stressed above, by using the Christoffel symbols, it realizes the idea of parallel transport preserving the angles defined by a metric structure. Once the covariant differentiation $\nabla_\mu$ is given, one can consider its antisymmetric square and this leads to the Riemann-Christoffel curvature tensor:

$$R_{\lambda\sigma}{}^{\mu}{}_{\nu} \equiv \partial_\lambda \left\{{\mu \atop \sigma\nu}\right\} - \partial_\sigma \left\{{\mu \atop \lambda\nu}\right\} + \left\{{\mu \atop \lambda\theta}\right\}\left\{{\theta \atop \sigma\nu}\right\} - \left\{{\mu \atop \sigma\theta}\right\}\left\{{\theta \atop \lambda\nu}\right\} \tag{3.2.10}$$

which, sketched by Riemann in the Ueber die Hypothesen and analytically defined by Christoffel [3], realizes for an arbitrary manifold the idea of intrinsic curvature devised by Gauss in the 1828 Disquisitiones. Indeed, considering a vector $V^\rho$ we find:

⁴ He gave to science Absolute Differential Calculus, essential instrument of the Theory of General Relativity, a new vision of the Universe.

Fig. 3.8 The images of the same vector parallel transported along two different paths that converge at the same point of a manifold, generically differ by a rotation angle. That angle is a measure of the intrinsic curvature of the manifold and it is the information codified in the Riemann-Christoffel tensor

$$[\nabla_\mu, \nabla_\nu]\,V^\rho = V^\sigma R_{\mu\nu\sigma}^{\rho} \tag{3.2.11}$$

The geometrical meaning of this relation is exemplified in Fig. 3.8. Consider an infinitesimally small rectangle whose two sides are given by the two vectors $X^\mu$ and $Y^\mu$ (also of infinitesimally short length), departing from a given point p. Consider next the parallel transport of a third vector $V^\rho$ to the opposite corner of the rectangle. This parallel transport can be performed along two routes, both arriving at the same destination. The first route follows first X and then Y. The second route does the opposite. The image vectors of these two transports are based at the same point, so they can be compared; in particular they can be subtracted. So doing we find:

$$\begin{aligned}
\nabla_X\nabla_Y V^\rho - \nabla_Y\nabla_X V^\rho &= X^\mu Y^\nu\,[\nabla_\mu, \nabla_\nu]\,V^\rho + \left(\nabla_X Y^\sigma - \nabla_Y X^\sigma\right)\nabla_\sigma V^\rho \\
&= X^\mu Y^\nu \left( V^\sigma R_{\mu\nu\sigma}^{\rho} + T^{\sigma}_{\mu\nu}\,\nabla_\sigma V^\rho \right) + [X, Y]^\sigma\,\nabla_\sigma V^\rho
\end{aligned} \tag{3.2.12}$$

where the Riemann-Christoffel tensor was defined above and where:

$$T^{\rho}_{\mu\lambda} \equiv \left\{{\rho \atop \mu\lambda}\right\} - \left\{{\rho \atop \lambda\mu}\right\} \tag{3.2.13}$$

is another tensor named the torsion. In the case of the Christoffel symbols the torsion is identically zero, yet for more general connections it can be different from zero and Levi Civita correctly singled out the vanishing of the torsion as one of the two axioms from which the metric connection can be derived. In any case comparison of (3.2.12) with Fig. 3.8 illuminates the geometrical meaning of both torsion and curvature. If torsion is zero the parallel transport along the two different paths produces two vectors that differ from one another by a rotation angle and the Riemann tensor $R_{\mu\nu\sigma}^{\rho}$ encodes all these possible angles. If torsion is not zero the two images of parallel transport differ also by a displacement and the torsion tensor $T^{\rho}_{\mu\lambda}$ encodes all such possible displacements. In a flat space, such as a plane, parallel transport produces no rotation angle and no displacement. Hence the Riemann-Christoffel tensor $R_{\mu\nu\sigma}^{\rho}$ measures the intrinsic curvature of a metric manifold.
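To make the last statement concrete, the sketch below evaluates one component of the curvature tensor on the unit 2-sphere and, for comparison, on the flat plane in polar coordinates. Both examples and their coordinates, (θ, φ) and (r, φ), are our assumptions for illustration, and the index reading of definition (3.2.10) that we use is restated in the first comment so that the computation can be checked independently.

```latex
% Definition used, as in (3.2.10):
% R_{\lambda\sigma}{}^{\mu}{}_{\nu} = \partial_\lambda \{{\mu \atop \sigma\nu}\} - \partial_\sigma \{{\mu \atop \lambda\nu}\}
%   + \{{\mu \atop \lambda\theta}\}\{{\theta \atop \sigma\nu}\} - \{{\mu \atop \sigma\theta}\}\{{\theta \atop \lambda\nu}\}

% Unit 2-sphere (assumed example): ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2
\left\{ {\theta \atop \varphi\varphi} \right\} = -\sin\theta\cos\theta ,
\qquad
\left\{ {\varphi \atop \theta\varphi} \right\} = \left\{ {\varphi \atop \varphi\theta} \right\} = \cot\theta .

% Component (\lambda,\sigma,\mu,\nu) = (\theta,\varphi,\theta,\varphi); all other terms vanish
\begin{aligned}
R_{\theta\varphi}{}^{\theta}{}_{\varphi}
 &= \partial_\theta \left\{ {\theta \atop \varphi\varphi} \right\}
  - \left\{ {\theta \atop \varphi\varphi} \right\}\left\{ {\varphi \atop \theta\varphi} \right\} \\
 &= \left( \sin^2\theta - \cos^2\theta \right) + \cos^2\theta = \sin^2\theta \neq 0 .
\end{aligned}

% Flat plane in polar coordinates: \{{r \atop \varphi\varphi}\} = -r, \{{\varphi \atop r\varphi}\} = 1/r
R_{r\varphi}{}^{r}{}_{\varphi}
 = \partial_r (-r) - (-r)\,\frac{1}{r} = -1 + 1 = 0 .
```

The non-vanishing sphere component corresponds to the net rotation that Fig. 3.8 depicts after transport along the two routes, while in the plane the two transports agree even though the Christoffel symbols themselves are not zero.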
In a paper of 1903 [6], Gregorio Ricci introduced a new tensor, later named after him, which is obtained from the Riemann-Christoffel tensor through a contraction of indices. The Ricci tensor is defined as follows:

$$\mathrm{Ric}_{\mu\nu} \equiv \sum_{\rho=1}^{N} R_{\mu\nu\rho}{}^{\rho} \tag{3.2.14}$$

and, on a metric manifold, measures the first deviation of its volume form from the Euclidean value. It was precisely for this reason that its inventor originally considered it. Yet this tensor was destined to play a major role in the development of XXth century scientific thought and in the birth of General Relativity.⁵

Preparatory to this great future of the Ricci tensor were the algebraic and differential identities it satisfies. They were derived by Luigi Bianchi in 1902 [9]. Actually, according to Levi Civita, the same identities had already been discovered by Ricci as early as 1880 but had been discarded by their author as not relevant. The first of the Bianchi identities states that the Ricci tensor is symmetric:

$$\mathrm{Ric}_{\mu\nu} = \mathrm{Ric}_{\nu\mu} \tag{3.2.15}$$

the second, differential identity states that its divergence is equal to one half of the gradient of its trace:

$$\nabla^\mu \mathrm{Ric}_{\mu\nu} = \frac{1}{2} \nabla_\nu R \tag{3.2.16}$$

where, by definition, we have set:

$$R = g^{\mu\nu} \mathrm{Ric}_{\mu\nu} \tag{3.2.17}$$

which is named the curvature scalar, and where we have set $\nabla^\mu = g^{\mu\sigma} \nabla_\sigma$.

The Bianchi identities were precisely the clue that led Einstein, with the help of Marcel Grossmann, to single out the form of the field equations of General Relativity. Combined in a proper way, they suggest the form of a covariantly conserved tensor, the Einstein tensor, which plays the role of left hand side in the propagation equations, the right hand side being already decided on physical grounds, namely the conserved stress-energy tensor.

Actually the identities discovered by Bianchi on the Ricci tensor are the particular form taken, in the case of the Levi Civita connection, by very general and fundamental identities satisfied by the curvature 2-form of any connection on any principal fibre bundle. In the present chapter we will emphasize this concept, which has the highest significance both in Physics and in Mathematics.

⁵ The very first embryonic idea of the Ricci tensor actually appeared as early as 1892 in another publication of its inventor [7].

After his laurea in Mathematics from the University of Pisa, which he obtained in 1877, Bianchi remained in that city for two more years as a student of the Corso di Perfezionamento of the Scuola Normale Superiore. He graduated in 1879, defending a thesis on helicoidal surfaces. Then, following in the steps of Ricci, he went to Germany, first to Munich and then to Göttingen, where he attended courses and seminars given by Felix Klein. As already stated, he was deeply influenced by Klein's group-theoretical view of geometry and one of his major achievements is precisely along that line. In a paper of 1898 [8], Bianchi classified all three-dimensional spaces that admit a continuous group of motions. Actually, so doing, he classified all Lie algebras of dimension three. This classification, which is organized into nine types, turned out to be quite relevant for Cosmology in the framework of General Relativity, since it amounts to a classification of all possible space-times that are spatially homogeneous (see Chap. 5 of Volume 2). From 1882 Bianchi was an internal professor at the Scuola Normale and in 1886 he won the competition for the chair of Projective Geometry at the University of Pisa, where he was full professor for the rest of his life. The same year he published the first edition of his Lezioni di Geometria Differenziale, which is the very first comprehensive treatise on the new discipline pioneered by Riemann and also the first place where the name Differential Geometry appeared. Bianchi died in 1928 and is buried in the Cimitero Monumentale in Piazza dei Miracoli, Pisa. From the late 1880s to the end of his life he was an extremely prominent and influential mathematician of the then flourishing Italian School of Geometry. In 1904 Bianchi was a member of the committee appointed by the Accademia Nazionale dei Lincei to select the winning paper for the 1901 Royal Prize of Mathematics. Ricci's ambitions for that Prize had already been manifested some years before, when he presented his works to the committee then headed by Eugenio Beltrami. Notwithstanding Beltrami's very favorable impressions, the final verdict of the jury on the relevance of tensor analysis had been hesitant and the Prize had not been awarded. The competition of 1904 reached a similar conclusion.
Luigi Bianchi showed great appreciation for the mathematical soundness and breadth of Ricci's methods but concluded that tensor analysis had not yet demonstrated its relevance and essentiality. He used Kronecker's words to say that he preferred new results found with old methods to old results retrieved with new, although very powerful, techniques. These events are quite surprising in view of the fact that two years before, in 1902, Bianchi had published his paper containing those identities on the Ricci tensor for which his name is mostly remembered. The Royal Prize for Mathematics, denied to Ricci Curbastro, was awarded a few years later, in the 1907 edition, to Ricci's former student Tullio Levi Civita, by a committee that once again included Luigi Bianchi, together with other distinguished mathematicians such as Vito Volterra and Corrado Segre. This time the usefulness of tensor methods had been put beyond doubt by the breadth of Levi Civita's results.

Although somewhat dismayed by his failure to win the Royal Prize, Ricci Curbastro ended his life in 1925 surrounded by the appreciation of his colleagues and of his fellow citizens, both as a scientist and as a politician. Indeed he was nominated member of several academies, including the most prestigious one, that of the Lincei, and also held positions in the local administration of his native city, Lugo di Romagna. By contrast his brilliant student Levi Civita, who was professor at the University of Rome La Sapienza, notwithstanding the Royal Prize and other honors, was removed from his chair under the fascist racial laws of 1938 because of his Jewish origin. Depressed and completely isolated from the scientific world, he died from sorrow in 1941.

Fig. 3.9 Elie Cartan (1869–1951). A true giant of XXth century mathematical thought. He completed the theory of Lie Algebras, invented exterior differential calculus, invented the theory of symmetric spaces, introduced the notion of mobile frames, reformulated the theory of General Relativity, and discovered spinors much earlier than physicists. He gave fundamental results in the theory of differential equations and essentially invented the concept of fibre-bundles

3.2.5 Mobile Frames from Frenet and Serret to Cartan

As everyone knows, Einstein's paper on Special Relativity is dated 1905, while that on General Relativity was published at the end of 1915. Einstein's theory of gravitation, which is the main theme of the present book, was formulated in the language of differential geometry as developed by Ricci and Levi Civita and centered around the notion of metric, first introduced by Gauss for two-dimensional surfaces and then extended to all dimensions by Riemann. Yet as early as 1901, a relatively young scholar, who was to be a giant of XXth century mathematical thought, had already introduced a different approach [10] to differential geometry, whose greater depth and power started to be appreciated only later on.

The genius we are referring to is Elie Cartan (see Fig. 3.9) and the above mentioned 1901 paper actually deals with the theory of first order partial differential equations. Just as Sophus Lie, dealing with the same classical problem of analysis, developed the notion of Lie groups, so Cartan, who had already given unparalleled contributions to the completion of Lie's theory, reconsidered differential systems from a new viewpoint, introduced the notion of exterior differential forms and laid the basis of modern differential geometry, as we presented it in Chap. 2. In a subsequent paper of 1904 [19], Cartan introduced a particular set of 1-forms that, in modern scientific literature, bear his name together with that of the German mathematician Ludwig Maurer.⁶ Maurer-Cartan one-forms are associated with Lie groups and, as the reader will appreciate in the sequel of the present chapter, play an essential role in the general theory of connections on fibre-bundles, namely in establishing the notion of gauge-fields, presently identified with the mediators of all fundamental forces of Nature. Cartan's 1904 paper, which preceded both Special and General Relativity, is fundamental not only for the actual result it contained about covariantly closed one-forms on Lie group manifolds, but also for the general viewpoint on geometry that such structures advocated. Using Cartan's phrasing, this viewpoint is that of repères mobiles, mobile frames in English.

⁶ Ludwig Maurer (1859–1927) obtained his Doctorate in 1887 from the University of Strasbourg (at the time under German rule after the defeat of France in the 1870 war) and became professor of Mathematics at the University of Tübingen. His doctoral dissertation Zur Theorie der linearen Substitutionen [20] happens to contain a germ of the idea of Maurer-Cartan forms developed by Cartan in 1904.

Like all great and simple ideas, that of mobile frames developed slowly, through the contribution of more than one mind and, in this case, all the minds were French. Jean Frédéric Frenet was born in Périgueux in 1816 and died in the same city in 1900. He entered the École Normale Supérieure in 1840 but then left it and studied at the University of Toulouse. Frenet's doctoral thesis, submitted in 1847, had the intriguing title Sur les fonctions qui servent à déterminer l'attraction des sphéroïdes quelconques. Programme d'une thèse sur quelques propriétés des courbes à double courbure.⁷ This thesis presented the idea of attaching a frame to each point of an arbitrary curve that develops in three-dimensional space.
As this frame moves along the curve, we can look at its rate of change to determine how the curve turns and twists, two evolutions which completely determine the geometry of the considered curve. At the time of his writing, matrix notation and matrix calculus were not yet in general use and Frenet wrote six formulae which correspond to six entries of a 3 × 3 matrix. The latter obviously has nine entries and it was Serret's historical mission to write all of them independently from Frenet.

Joseph Alfred Serret was three years younger than Frenet. Born in Paris in 1819, he died in Versailles in 1885. While Frenet had dropped out of the École Normale Supérieure, Serret was an accomplished product of the École Polytechnique, from which he graduated in 1848. Appointed professor of Celestial Mechanics at the Collège de France in 1861, Serret was later offered the chair of differential and integral calculus at the Sorbonne. Frenet instead, after his start in Toulouse, was appointed professor of mathematics at the University of Lyon, where he was also director of the astronomical observatory.

The Frenet-Serret formulae, as history decided they should be named, were published (nine by Serret and six by Frenet) in two independent papers with very similar titles, one year apart. Serret's paper [12] appeared in 1851 and was named Sur quelques formules relatives à la théorie des courbes à double courbure. Frenet published in 1852 the results of his 1847 doctoral thesis under the title Sur quelques propriétés des courbes à double courbure [11].

⁷ About the functions which help determine the attraction of general spheroids. Programme for a thesis about some properties of curves with a double curvature.

Fig. 3.10 The moving frames of Frenet and Serret

Consider Fig. 3.10. Given a curve C(s) in three-dimensional Euclidean space, we can easily fix the following rule, which uniquely defines a basis of three orthonormal vectors attached to each point of the curve. Let $\mathbf{r}$ be the standard Cartesian coordinates of $\mathbb{R}^3$. Any curve can be described by giving $\mathbf{r} = \mathbf{r}(s)$, where s is the length parameter:

$$s = \int_0^s \left| \frac{d\mathbf{r}}{ds} \right| ds \tag{3.2.18}$$

By definition the tangent vector to the curve is:

$$\mathbf{T}(s) = \frac{d}{ds} \mathbf{r}(s) \tag{3.2.19}$$

and we can define the normal vector by means of the normalized second derivative:

$$\mathbf{N}(s) = \frac{1}{\left| \frac{d}{ds} \mathbf{T}(s) \right|} \, \frac{d}{ds} \mathbf{T}(s) \tag{3.2.20}$$

Finally we can complete the orthonormal system by means of the vector product of T(s) and N(s), which is historically named the binormal vector:

$$\mathbf{B}(s) = \mathbf{T}(s) \wedge \mathbf{N}(s) \quad \Leftrightarrow \quad B_i(s) = \varepsilon_{ijk} \, T^j(s) \, N^k(s) \tag{3.2.21}$$

Since $E(s) = \{\mathbf{T}(s), \mathbf{N}(s), \mathbf{B}(s)\}$ provides a basis for three-vectors, it is clear that the derivative of this basis can also be reexpressed in terms of itself, namely we can write:

$$\frac{d}{ds} E(s) = \Omega(s) \, E(s) \tag{3.2.22}$$

where Ω(s) is a 3 × 3 matrix which is necessarily antisymmetric since $E E^T = E^T E = \mathbf{1}$. Indeed from such a relation it follows that $\left( \frac{d}{ds} E \right) E^T = \Omega = -E \, \frac{d}{ds} E^T = -\Omega^T$. In the component language of the mid XIXth century, Frenet and Serret proved that this matrix is not only antisymmetric but also has a special normal form, in terms of two parameters that they identified with the curvature κ and the torsion τ of the curve:

$$\Omega(s) = \begin{pmatrix} 0 & \kappa(s) & 0 \\ -\kappa(s) & 0 & \tau(s) \\ 0 & -\tau(s) & 0 \end{pmatrix} \tag{3.2.23}$$
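The normal form (3.2.23) can be checked numerically on a simple curve. The sketch below is an added illustration, not the authors' computation: it assumes a circular helix $\mathbf{r}(s) = (a\cos(s/c), a\sin(s/c), b\,s/c)$ with $c = \sqrt{a^2 + b^2}$, which is parametrized by arc length, builds T, N, B from (3.2.19)-(3.2.21) with central finite differences, and reads off κ from dT/ds = κN and τ from dB/ds = −τN. For this curve one expects κ = a/c² and τ = b/c². All function names and the step size are choices made for the example.

```python
import math

def helix(s, a=2.0, b=1.0):
    """Arc-length parametrized circular helix r(s) (illustrative choice)."""
    c = math.hypot(a, b)
    return (a * math.cos(s / c), a * math.sin(s / c), b * s / c)

def diff(f, s, h=1e-3):
    """Central finite difference of a vector-valued function of s."""
    plus, minus = f(s + h), f(s - h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def normalize(u):
    norm = math.sqrt(dot(u, u))
    return tuple(x / norm for x in u)

def tangent(s):
    # T = dr/ds, renormalized to remove the small finite-difference error
    return normalize(diff(helix, s))

def normal(s):
    # N = (dT/ds) / |dT/ds|
    return normalize(diff(tangent, s))

def binormal(s):
    # B = T ^ N
    return cross(tangent(s), normal(s))

def curvature_and_torsion(s):
    """Read kappa and tau off the Frenet-Serret relations."""
    n = normal(s)
    kappa = dot(diff(tangent, s), n)     # dT/ds =  kappa N
    tau = -dot(diff(binormal, s), n)     # dB/ds = -tau N
    return kappa, tau

if __name__ == "__main__":
    print(curvature_and_torsion(0.7))
```

With the default values a = 2 and b = 1 the two numbers printed should be close to 0.4 and 0.2 at any value of s, in agreement with the statement that constant κ and τ characterize the helix.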

The reason for these names becomes obvious looking at Fig. 3.10. The span of the vectors T, N defines the osculating plane to the curve. If the torsion parameter τ vanishes, the curve lies at all times in such a plane and the tangent and normal vectors undergo, from one point to the next, an infinitesimal rotation of an angle δθ = κ δs over an arc of length δs. When the torsion τ is different from zero, the tangent and normal vectors undergo not only an infinitesimal rotation but also a displacement in the direction of the third vector B, and this displacement is precisely proportional to τ. The geometry of the curve is fully determined by the knowledge of the parameters κ and τ. For instance all planar curves are characterized by τ = 0 and among them the circles are those with constant curvature κ. Similarly, when both κ and τ are non-vanishing but constant we have a helix, and so on.

In these very elementary geometrical facts the astonishing mathematical insight of Elie Cartan perceived a far-reaching perspective of incredible richness. Refining the idea of the moving frames he started a conceptual revolution of the same amplitude as that brought about by Gauss with the Disquisitiones of 1828.

The origins of Cartan were very humble, just like those of the King of Mathematicians and those of Riemann. The environment where Cartan was born was even poorer and deprived of any cultural background, since his father was a plain blacksmith in the mountain village of Dolomieu in Isère. Whereas Gauss could study thanks to the generosity of the Duke of Brunswick, Cartan obtained the very best scientific education available at the time thanks to state stipends that the French Republic had introduced for talented people, independently of their social or economic status. Discovered in his remote village by the school inspector Dubost, Elie was state-supported in order to attend Lycée in Lyon and then entered the École Normale Supérieure of Paris, where he had such masters as Picard, Darboux and Hermite.
Cartan's doctoral dissertation was presented in 1894 and was already a masterpiece [18]. Some years before, in 1888–1890, the German mathematician Wilhelm Karl Joseph Killing (1847–1923), who is also responsible for the key notion of Killing vector fields,⁸ had classified the complex finite dimensional simple Lie algebras, inventing the notions of what were later named Cartan subalgebras, root systems and Cartan matrices. So doing he had discovered the possible existence of the exceptional Lie algebras G2, F4, E6, E7 and E8. Killing, who was very creative but hardly rigorous, had no proof of the actual existence of these algebras and could not produce their explicit realizations. Cartan could. His thesis [18] was a rigorous remake of Killing's papers [13–16] where he also gave the explicit matrix construction of all exceptional Lie algebras, already announced in a paper published by him one year before in German [17].

After such a brilliant start the scientific production of Elie Cartan was an endless accumulation of fundamental results. He brought the theory of Lie algebras and Lie groups to perfection by classifying all their representations and, so doing, in a paper of 1913, he discovered spinors, much earlier than physicists found them necessary to describe the intrinsic angular momentum of fermions. He combined Lie theory with differential geometry, founding, developing and completing the theory of symmetric spaces. Extending his very early work on differential forms, he created the exterior differential calculus, which eventually proved much more powerful and synthetic than the tensor calculus of Ricci and Levi Civita. Right after the creation of General Relativity by Einstein, he started rethinking it in terms of mobile frames and came to the reformulation of gravitational equations which goes under the name of Einstein-Cartan theory [21]. While the original metric formulation is inadequate to incorporate fermionic fields, the new one can do that and is therefore more fundamental. Moreover it is much simpler from the algorithmic point of view and leads to extremely elegant and compact formulae. Cartan's viewpoint is that adopted in the present book. It is centered around the idea of mobile frames which comes down from Frenet and Serret. It also leads to a simple formulation of the notion of connection that Charles Ehresmann, one of the most brilliant students of Cartan, finally brought to mathematical perfection in the fifties of the XXth century.

⁸ We shall discuss this notion at length in Volume 2 in the chapters dealing with symmetries of black-holes and cosmological solutions.

Elie Cartan grasped from the Frenet and Serret formulae the message that the geometry of an m-dimensional manifold $\mathcal{M}$ could be described by attaching an orthonormal frame to each of its points. The evolution of the orthonormal frame from one point to the next, which can occur in m directions, encodes all information about the intrinsic curvature and torsion of the manifold. Relying on the exterior differential calculus he had created, Cartan introduced a system of m one-forms:

$$E^a = E^a_\mu(x) \, dx^\mu \tag{3.2.24}$$

which at each point constitute an orthonormal reference frame for contravariant vectors.
In modern words, $\{E^a\}$ is a basis of sections of the cotangent bundle $T^*\mathcal{M} \stackrel{\pi}{\Longrightarrow} \mathcal{M}$. Calculating the exterior derivative of the forms $E^a$ we can write the following formula, analogous to the Frenet and Serret formulae:

$$dE^a = -\omega^{ab} \wedge E^b + T^a \tag{3.2.25}$$

where $\omega^{ab} = -\omega^{ba}$ is a one-form valued in antisymmetric matrices and

$$T^a = T^a_{bc}(x) \, E^b \wedge E^c \tag{3.2.26}$$

is a 2-form which will be named the torsion. If we have no information on the torsion, (3.2.25) is undetermined, namely there are many solutions for the one-form $\omega^{ab}$. As we will see, the latter falls into the mathematical category of connections. According to the definition we shall provide in the next section, $\omega^{ab}$ is a connection on a principal bundle $P(\mathcal{M}, SO(m))$ where the base manifold is the considered manifold $\mathcal{M}$ and where the structural group is SO(m), namely the Lie group which rotates orthonormal frames among themselves. If we have geometrical or physical reasons to prescribe the value of the torsion, for instance zero ($T^a = 0$), but also any other externally given 2-form, then (3.2.25) uniquely determines the SO(m)-connection $\omega^{ab}$. In this case we say that the SO(m)-bundle is soldered to the tangent bundle and the SO(m)-connection $\omega^{ab}$ becomes a Cartan connection. Just as in the Frenet-Serret case, the geometry of the manifold, in particular its curvature, is revealed through the calculation of the second derivative, which Cartan immediately understood should be another exterior derivative. Indeed, according to the general theory of principal connections developed in the next section, one can calculate the natural 2-form associated with any connection, which in this case reads as follows:

$$R^{ab} = d\omega^{ab} + \omega^{ac} \wedge \omega^{cb} \tag{3.2.27}$$

and the expansion of $R^{ab}$ along the orthonormal frame provides a tensor $R^{ab}{}_{cd}$ which represents the curvature of the manifold:

$$R^{ab} = R^{ab}{}_{cd} \, E^c \wedge E^d \tag{3.2.28}$$

Indeed, as we discuss in a later chapter, $R^{ab}{}_{cd}$ is in one-to-one correspondence with the Riemann tensor introduced in (3.2.10).

In this way Cartan was able to reformulate the entire setup of Riemannian geometry, and with it General Relativity, into the language of exterior differential forms, avoiding the metric tensor of Gauss and Riemann, replaced by the notion of mobile frames. From the physical viewpoint, Cartan's approach is also very natural since it can be rephrased as Einstein's beloved equivalence principle, asserting that we can always find locally inertial reference frames. We will elaborate at length about this point in subsequent chapters.

After digesting the next mathematical sections on principal connections, the reader will also appreciate another important aspect of Cartan's formulation of gravitational theory: it removes the distinction between gravity and the other gauge-theories describing non-gravitational interactions. Adopting Cartan's viewpoint, the fundamental fields describing gravity are also encoded into a connection on a principal bundle, as happens for all other forces. What is special about gravity is the soldering phenomenon, namely the possibility of solving for one part of the connection ($\omega^{ab}$) in terms of the other ($E^a$), by imposing an external condition on the torsion $T^a$. In the Einstein-Cartan formulation the condition on the torsion becomes a field equation stemming from the same variational principle which yields Einstein equations on the Riemann tensor, and so everything is unified in a consistent, powerful scheme. We will appreciate the relevance of this statement in later chapters.
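The structure equations (3.2.25) and (3.2.27) can be tried out on the simplest curved example. The following Python sketch is an added illustration under stated assumptions: it takes the unit 2-sphere with the orthonormal frame $E^1 = d\theta$, $E^2 = \sin\theta \, d\varphi$ and the candidate connection $\omega^{12} = -\cos\theta \, d\varphi$ (these explicit forms are assumptions of the example, not formulas from the text), represents a one-form by its two components in the chart (θ, φ), and evaluates exterior derivatives by central finite differences. It then checks that the torsion $T^a = dE^a + \omega^{ab} \wedge E^b$ vanishes and that $R^{12} = d\omega^{12}$ equals $E^1 \wedge E^2$.

```python
import math

H = 1e-5  # finite-difference step (illustrative choice)

def d(form):
    """Exterior derivative of a one-form a_theta dtheta + a_phi dphi.
    Returns the coefficient of dtheta ^ dphi as a function of (theta, phi)."""
    def coefficient(theta, phi):
        d_theta_a_phi = (form(theta + H, phi)[1] - form(theta - H, phi)[1]) / (2 * H)
        d_phi_a_theta = (form(theta, phi + H)[0] - form(theta, phi - H)[0]) / (2 * H)
        return d_theta_a_phi - d_phi_a_theta
    return coefficient

def wedge(alpha, beta):
    """Coefficient of dtheta ^ dphi in alpha ^ beta."""
    def coefficient(theta, phi):
        a, b = alpha(theta, phi), beta(theta, phi)
        return a[0] * b[1] - a[1] * b[0]
    return coefficient

# Orthonormal frame and connection on the unit sphere (assumed for this example)
def E1(theta, phi):
    return (1.0, 0.0)                     # E^1 = dtheta

def E2(theta, phi):
    return (0.0, math.sin(theta))         # E^2 = sin(theta) dphi

def omega12(theta, phi):
    return (0.0, -math.cos(theta))        # omega^{12} = -cos(theta) dphi

def omega21(theta, phi):
    return (0.0, math.cos(theta))         # omega^{21} = -omega^{12}

def torsion1(theta, phi):
    # T^1 = dE^1 + omega^{12} ^ E^2, from (3.2.25)
    return d(E1)(theta, phi) + wedge(omega12, E2)(theta, phi)

def torsion2(theta, phi):
    # T^2 = dE^2 + omega^{21} ^ E^1
    return d(E2)(theta, phi) + wedge(omega21, E1)(theta, phi)

def curvature12(theta, phi):
    # R^{12} = d omega^{12}; the quadratic term of (3.2.27) vanishes
    # because omega^{11} = omega^{22} = 0.
    return d(omega12)(theta, phi)

if __name__ == "__main__":
    point = (1.0, 0.7)
    print(torsion1(*point), torsion2(*point))
    print(curvature12(*point), wedge(E1, E2)(*point))
```

Both torsion components should print as numbers compatible with zero, and the two numbers on the second line should agree, showing that in this example the zero-torsion condition fixes the connection and that the curvature 2-form is proportional to the area form $E^1 \wedge E^2$ with constant coefficient.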
When Elie Cartan died in 1951, two great mathematicians, Shiing-Shen Chern and Claude Chevalley, joined to write an obituary that is also an impressive summary of Cartan's mathematical work. They said: His death came at a time when his reputation and the influence of his ideas were in full ascent. Undoubtedly one of the greatest mathematicians of this century, his career was characterized by a rare harmony of genius and modesty. They also said: Closely interwoven with Cartan's life as a scientist and teacher has been his family life, which was filled with an atmosphere of happiness and serenity. He had four children, three sons, Henri, Jean, Louis and a daughter, Hélène. Yet fate was quite cruel at least with two of them. Jean Cartan, who studied music and was very early recognized as a prominent and talented composer, was taken from his family by premature death from an incurable illness. Louis Cartan, who was a physicist, joined the Resistance during the German occupation of France and, captured by the Nazis, was beheaded in 1943. His father suspected the terrible truth but learnt about it only in 1945: a deadly blow from which he never recovered. Henri Cartan followed in his father's footsteps and became a very prominent mathematician.

3.3 Connections on Principal Bundles: The Mathematical Definition

After the previous historical introduction we come to the contemporary mathematical definition of a connection on a principal fibre-bundle. The adopted viewpoint is that of Ehresmann and what we are going to define is usually named an Ehresmann connection in the mathematical literature.

To this effect we need a number of mathematical preliminaries on Lie groups, which will be essential not only in the forthcoming discussion of a principal connection but will also play a very important conceptual role in all subsequent developments of the theory and in each of its applications.

3.3.1 Mathematical Preliminaries on Lie Groups

Firstly we recall that on a Lie group G one can define two transitive actions of the same group on itself, the left and the right multiplication respectively. Indeed to each element γ ∈ G we can associate two continuous, infinitely differentiable and invertible maps of G into G, named the left and the right translation, which we introduced in (2.4.2). Secondly we introduce the concepts of pull-back and push-forward of any diffeomorphism mapping a differentiable manifold $\mathcal{M}$ into an open submanifold of another differentiable manifold $\mathcal{N}$. Let φ be any such map:

$$\phi : \mathcal{M} \to \mathcal{N} \tag{3.3.1}$$

The push-forward of φ, denoted $\phi_*$, is a map from the space of sections of the tangent bundle $T\mathcal{M}$ to the space of sections of the tangent bundle $T\mathcal{N}$:

$$\phi_* : \Gamma(T\mathcal{M}, \mathcal{M}) \to \Gamma(T\mathcal{N}, \mathcal{N}) \tag{3.3.2}$$

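Explicitly, if $X \in \Gamma(T\mathcal{M}, \mathcal{M})$ is a vector field over $\mathcal{M}$, we can use it to define a new vector field $\phi_* X \in \Gamma(T\mathcal{N}, \mathcal{N})$ over $\mathcal{N}$ using the following procedure. For any $f \in C^\infty(\mathcal{N})$, namely for any smooth function on $\mathcal{N}$, the action of $\phi_* X$ on such a function is given by:

$$\phi_* X(f) \equiv \left[ X(f \circ \phi) \right] \circ \phi^{-1} \tag{3.3.3}$$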

Fig. 3.11 Graphical description of the concept of push-forward

To clarify this concept let us describe the push-forward in a pair of open charts for both manifolds. Consider Fig. 3.11. On the open neighborhood $U \subset \mathcal{M}$ we have coordinates $x^\mu$, while on the open neighborhood $V \subset \mathcal{N}$ we have coordinates $y^\nu$. In this pair of local charts {U, V} the diffeomorphism (3.3.1) is described by giving the coordinates $y^\nu$ as smooth functions of the coordinates $x^\mu$:

$$y^\nu = y^\nu(x) \equiv \phi^\nu(x) \tag{3.3.4}$$

On the neighborhood V the smooth function $f \in C^\infty(\mathcal{N})$ is described by a real function f(y) of the n coordinates $y^\nu$. It follows that $f \circ \phi$ is a smooth function $\tilde{f}(x)$ on the open neighborhood $U \subset \mathcal{M}$ simply given by:

$$\tilde{f}(x) = f(y(x)) \tag{3.3.5}$$

Any tangent vector $X \in \Gamma(T\mathcal{M}, \mathcal{M})$ is locally described on the open chart U by a first order differential operator of the form:

$$X = X^\mu(x) \frac{\partial}{\partial x^\mu} \tag{3.3.6}$$

which therefore can act on $\tilde{f}(x)$:

$$X \tilde{f}(x) = X^\mu(x) \frac{\partial y^\nu}{\partial x^\mu} \frac{\partial}{\partial y^\nu} f(y(x)) \tag{3.3.7}$$

Considering now the coordinates $x^\mu$ on U as functions of the $y^\nu$ on V, through the inverse of the diffeomorphism φ:

$$x^\mu = x^\mu(y) = \left( \phi^{-1}(y) \right)^\mu \tag{3.3.8}$$

we realize that (3.3.7) defines a new linear first order differential operator acting on any function $f : V \to \mathbb{R}$. This differential operator is the push-forward of X through the diffeomorphism φ:

$$\phi_* X = X^\nu(y) \frac{\overrightarrow{\partial}}{\partial y^\nu} = X^\mu(x(y)) \frac{\partial y^\nu}{\partial x^\mu} \frac{\overrightarrow{\partial}}{\partial y^\nu} \tag{3.3.9}$$
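The component formula (3.3.9) and the abstract definition (3.3.3) can be compared on a concrete map. The sketch below is an added illustration with assumed ingredients: the diffeomorphism φ is taken to be the passage from polar coordinates (r, θ) to Cartesian coordinates (x, y) on the plane without the origin, the vector field is X = ∂/∂θ, and the test function f(x, y) = x²y is arbitrary. All function names are hypothetical, and the derivatives of the test function are approximated by central finite differences.

```python
import math

def phi(r, theta):
    """The map phi: (r, theta) -> (x, y), i.e. y^nu = phi^nu(x) in (3.3.4)."""
    return (r * math.cos(theta), r * math.sin(theta))

def phi_inverse(x, y):
    """The inverse map (3.3.8), valid away from the origin."""
    return (math.hypot(x, y), math.atan2(y, x))

def jacobian(r, theta):
    """Matrix dy^nu/dx^mu: rows nu = (x, y), columns mu = (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def X(r, theta):
    """Components X^mu of the vector field d/dtheta in the chart (r, theta)."""
    return (0.0, 1.0)

def push_forward_components(x, y):
    """Components of phi_* X at the point (x, y), following (3.3.9)."""
    r, theta = phi_inverse(x, y)
    J, Xc = jacobian(r, theta), X(r, theta)
    return tuple(sum(J[nu][mu] * Xc[mu] for mu in range(2)) for nu in range(2))

def apply_by_definition(f, x, y, h=1e-6):
    """(phi_* X)(f) at (x, y), computed as [X(f o phi)] o phi^{-1}, see (3.3.3)."""
    r, theta = phi_inverse(x, y)
    f_tilde = lambda r_, theta_: f(*phi(r_, theta_))     # f o phi
    Xr, Xtheta = X(r, theta)
    d_r = (f_tilde(r + h, theta) - f_tilde(r - h, theta)) / (2 * h)
    d_theta = (f_tilde(r, theta + h) - f_tilde(r, theta - h)) / (2 * h)
    return Xr * d_r + Xtheta * d_theta

def apply_by_components(f, x, y, h=1e-6):
    """The same number obtained from the components of phi_* X."""
    Yx, Yy = push_forward_components(x, y)
    d_x = (f(x + h, y) - f(x - h, y)) / (2 * h)
    d_y = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return Yx * d_x + Yy * d_y

if __name__ == "__main__":
    f = lambda x, y: x * x * y
    print(push_forward_components(1.0, 2.0))
    print(apply_by_definition(f, 1.0, 2.0), apply_by_components(f, 1.0, 2.0))
```

In this example the push-forward of ∂/∂θ is the rotation generator −y ∂/∂x + x ∂/∂y, so at the point (1, 2) the components should be close to (−2, 1), and the two ways of acting on f should return the same number up to finite-difference error.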

Similarly the pull-back of φ, denoted $\phi^*$, is a map from the space of sections of the cotangent bundle $T^*\mathcal{N}$ to the space of sections of the cotangent bundle $T^*\mathcal{M}$:

$$\phi^* : \Gamma(T^*\mathcal{N}, \mathcal{N}) \to \Gamma(T^*\mathcal{M}, \mathcal{M}) \tag{3.3.10}$$

Explicitly, if $\omega \in \Gamma(T^*\mathcal{N}, \mathcal{N})$ is a differential one-form over $\mathcal{N}$, we can use it to define a differential one-form $\phi^*\omega$ over $\mathcal{M}$ as follows. We recall that a one-form is defined once we assign its value on any vector field, hence we set:

$$\forall X \in \Gamma(T\mathcal{M}, \mathcal{M}) : \quad \phi^*\omega(X) \equiv \omega(\phi_* X) \tag{3.3.11}$$

Considering (3.3.9), the local description of the pull-back on a pair of open charts {U, V} is easily derived from the definition (3.3.11). If we name $\omega_\mu(y)$ the local components of the one-form ω on the coordinate patch V:

$$\omega = \omega_\mu(y) \, dy^\mu \tag{3.3.12}$$

the components of the pull-back are immediately deduced:

$$\phi^*\omega = \left( \phi^*\omega \right)_\mu(x) \, dx^\mu \equiv \omega_\nu(y(x)) \frac{\partial y^\nu}{\partial x^\mu} \, dx^\mu \tag{3.3.13}$$
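The component formula (3.3.13) and the defining relation (3.3.11) can be verified on a small example. The following sketch is an added illustration under assumptions of its own: φ is again taken to be the polar-to-Cartesian map on the plane without the origin, and the one-form is ω = −y dx + x dy, for which the pull-back should come out as r² dθ. The function names and the sample vector are hypothetical choices.

```python
import math

def phi(r, theta):
    """The map phi: (r, theta) -> (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian(r, theta):
    """Matrix dy^nu/dx^mu: rows nu = (x, y), columns mu = (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def omega(x, y):
    """Components (omega_x, omega_y) of the one-form -y dx + x dy."""
    return (-y, x)

def pull_back(r, theta):
    """Components of phi^* omega in the chart (r, theta), following (3.3.13)."""
    J = jacobian(r, theta)
    w = omega(*phi(r, theta))
    return tuple(sum(w[nu] * J[nu][mu] for nu in range(2)) for mu in range(2))

def push_forward(r, theta, vector):
    """Components of phi_* X at phi(r, theta), following (3.3.9)."""
    J = jacobian(r, theta)
    return tuple(sum(J[nu][mu] * vector[mu] for mu in range(2)) for nu in range(2))

def pair(form, vector):
    """Value of a one-form on a vector, both given by components."""
    return sum(f * v for f, v in zip(form, vector))

if __name__ == "__main__":
    r, theta = 2.0, 0.7
    X = (0.5, 2.0)                                   # arbitrary vector at (r, theta)
    print(pull_back(r, theta))                       # expected close to (0, r**2)
    lhs = pair(pull_back(r, theta), X)               # (phi^* omega)(X)
    rhs = pair(omega(*phi(r, theta)), push_forward(r, theta, X))  # omega(phi_* X)
    print(lhs, rhs)
```

The two numbers on the last line should coincide, which is the content of (3.3.11) evaluated at one point: pulling the form back and pushing the vector forward give the same pairing.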

3.3.1.1 Left-/Right-Invariant Vector Fields

Let us now consider the case where the manifold $\mathcal{M}$ coincides with the manifold $\mathcal{N}$ and both are equal to a Lie group manifold G. The left and the right translations defined in (2.4.2) are diffeomorphisms and for each of them we can consider both the push-forward and the pull-back. This construction allows us to introduce the notion of left- (respectively right-) invariant vector fields and one-forms over the Lie group manifold G.

Definition 3.3.1 A vector field $X \in \Gamma(TG, G)$ defined over a Lie group manifold G is named left-invariant (respectively right-invariant) if the following condition holds true:

$$\forall \gamma \in G : \quad L_{\gamma *} X = X \quad (\text{respectively } R_{\gamma *} X = X) \tag{3.3.14}$$

Similarly:

Definition 3.3.2 A one-form $\sigma \in \Gamma(T^*G, G)$ defined over a Lie group manifold G is named left-invariant (respectively right-invariant) if the following condition holds true:

$$\forall \gamma \in G : \quad L_\gamma^* \sigma = \sigma \quad (\text{respectively } R_\gamma^* \sigma = \sigma) \tag{3.3.15}$$

Let us recall that the space of sections of the tangent bundle has the structure of an infinite dimensional Lie algebra for any manifold M . Indeed, given any two sections, we can compute their commutator as differential operators and this defines the necessary Lie bracket:

$$\forall X, Y \in \Gamma(T\mathcal{M}, \mathcal{M}) : \quad [X, Y] = Z \in \Gamma(T\mathcal{M}, \mathcal{M}) \tag{3.3.16}$$

Viewed as a Lie algebra the space of sections of the tangent bundle is usually denoted $\mathrm{Diff}_0(\mathcal{M})$, since every vector field can be regarded as the generator of a diffeomorphism infinitesimally close to the identity.

In the case of group manifolds we have the following simple but very fundamental theorem:

Theorem 3.3.1 The two sets of left-invariant and of right-invariant vector fields over a Lie group manifold G close two finite-dimensional Lie subalgebras of $\mathrm{Diff}_0(G)$, respectively named $\mathbb{G}_{L/R}$, which are isomorphic to each other and to the abstract Lie algebra $\mathbb{G}$ of the Lie group G. Furthermore $\mathbb{G}_L$ and $\mathbb{G}_R$ commute with each other.

The proof of this theorem is obtained through a series of steps and through the proof of some intermediate lemmas. Let us begin with the first.

Proof For any diffeomorphism φ the push-forward map $\phi_*$ has the following property:

$$\forall X, Y \in \mathrm{Diff}_0(\mathcal{M}) : \quad \phi_* [X, Y] = [\phi_* X, \phi_* Y] \tag{3.3.17}$$

No proof is required for this lemma since it follows straightforwardly from the definition (3.3.3) of the pushforward. However the consequences of the lemma are far reaching. Indeed it implies that the commutator of two left-invariant (respectively right-invariant) vector fields is still left-invariant (respectively right-invariant). So we have the

Lemma 3.3.1 The two sets $\mathbb{G}_{L/R}$ of left-invariant (respectively right-invariant) vector fields constitute two Lie subalgebras $\mathbb{G}_{L/R} \subset \mathrm{Diff}_0(G)$:

$$\forall X, Y \in \mathbb{G}_{L/R} : \quad [X, Y] \in \mathbb{G}_{L/R} \tag{3.3.18}$$

This being established we can now show that the left-invariant vector fields can be put into one-to-one correspondence with the elements of the tangent space to the group manifold at the identity element, namely with $T_e G$. From this correspondence it will follow that the Lie subalgebra $\mathbb{G}_L$ has dimension equal to the dimension n of the Lie group. The same correspondence can be established also for the right-invariant vector fields and the same conclusion about the dimension of the Lie algebra $\mathbb{G}_R$ can be deduced. The argument goes as follows.

Let $g : [0, 1] \to G$ be a path in G with initial point at the identity element $g(0) = e \in G$. In local coordinates on the group manifold the path g is described as follows:

$$t \in [0, 1] : \quad G \ni g(t) = g\left( \alpha^1(t), \dots, \alpha^n(t) \right) \tag{3.3.19}$$

where $\alpha^i$ denote the group parameters and g(α) denotes the group element identified by the parameters α. Hence we obtain:

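$$\frac{d}{dt} g(t) = \frac{d\alpha^i}{dt} \frac{\partial}{\partial \alpha^i} g(\alpha) \tag{3.3.20}$$

The set of derivatives:

$$c^i = \left. \frac{d\alpha^i}{dt} \right|_{t=0} \tag{3.3.21}$$

constitutes the components of a tangent vector at the identity that we name $X_e$:

$$T_e G \ni X_e \equiv c^i \frac{\overrightarrow{\partial}}{\partial \alpha^i} \tag{3.3.22}$$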

Conversely, for each $X_e \in T_e G$ we can construct a path g(t) that admits $X_e$ as tangent vector at the identity element. Given a tangent vector $\mathbf{t}$ at a point $p \in \mathcal{M}$ of a manifold, constructing a curve on $\mathcal{M}$ which goes through p and admits $\mathbf{t}$ as tangent vector at that point is a problem which admits solutions on any differentiable manifold, a fortiori on a Lie group manifold G. Let therefore g(t) be such a path. To $X_e$ we can associate a left-invariant vector field $X_L$, defining the action of the latter on any smooth function f as follows:

$$\forall f \in C^\infty(G) \text{ and } \forall \rho \in G : \quad X_L f(\rho) \equiv \left. \frac{d}{dt} f\left( \rho \, g(t) \right) \right|_{t=0} \tag{3.3.23}$$

Applying the definition in (3.3.3) the reader can immediately verify that $X_L$ defined above is left-invariant. In a completely analogous way, to the same tangent vector $X_e$ we can associate a right-invariant vector field $X_R$, setting:

$$ \forall f \in C^\infty(G) \ \text{and} \ \forall \rho \in G : \qquad X_R f(\rho) \equiv \left.\frac{d}{dt}\, f\bigl(g(t)\,\rho\bigr)\right|_{t=0} \tag{3.3.24} $$

In this way we have established that to each tangent vector at the identity we can associate both a left-invariant and a right-invariant vector field. On the other hand since each vector field X is a section of the tangent bundle, namely a map:

$$ \forall p \in M : \qquad p \mapsto X_p \in T_pM \tag{3.3.25} $$

it follows that each left-invariant or right-invariant vector field singles out a tangent vector at the identity. The relevant point is that this double correspondence is an isomorphism of vector spaces. Indeed we have:

Lemma 3.3.2 Let G be a Lie group and let $\mathbb{G}_{L/R}$ be the Lie algebra of left-invariant (respectively right-invariant) vector fields. The correspondence:

$$ \forall X \in \mathbb{G}_{L/R} \qquad \pi : X \mapsto X_e \in T_eG \tag{3.3.26} $$

is an isomorphism of vector spaces.

Proof Let us first of all observe that the correspondence (3.3.26) is a linear map. Indeed for a,b ∈ R and X, Y ∈ GL/R we have:

π(aX + bY) = aXe + bYe (3.3.27) In order to show that π is an isomorphism we need to prove that π is both injective and surjective, in other words that:

$$ \ker \pi = 0; \qquad \mathrm{Im}\, \pi = T_eG \tag{3.3.28} $$

The second of the two conditions (3.3.28) was already proved. Indeed we have shown above that to each tangent vector $X_e$ at the identity we can associate a left-invariant (right-invariant) vector field which reduces to that vector at g = e. It remains to be shown that ker π = 0. This follows from a simple observation. If X is left-invariant, then its value at the group element γ ∈ G is determined by its value at the identity by means of a left translation, namely:

Xγ = Lγ ∗Xe (3.3.29)

Therefore if Xe = 0 it follows that Xγ = 0 for all group elements γ ∈ G and hence X = 0 as a vector field. Hence there are no non-trivial vectors in the kernel of the map π and this concludes the proof of the lemma. 

We have therefore:

Corollary 3.3.1 The dimensions of the two Lie algebras GL and GR are equal to each other and equal to that of the abstract Lie algebra G:

dim GL = dim GR = dim G = dim TeG (3.3.30)

In this way we have shown the isomorphism GL ∼ GR ∼ G as vector spaces. It remains to show the same isomorphism at the level of Lie algebras.

Proof We can now complete the proof of Theorem 3.3.1. To this effect we argue in the following way. Utilizing the vector space isomorphism just established, let us choose a basis for $T_eG$ which we denote $t_A$ ($A = 1,\dots,\dim G$). In this way we have:

$$ \forall t \in T_eG : \qquad t = x^A t_A, \quad x^A \in \mathbb{R} \tag{3.3.31} $$

Relying on the constructions (3.3.23) and (3.3.24), to each element of the basis $\{t_A\}$ we can associate the corresponding left- (right-)invariant vector field:

$$ T^{(L)}_A (f)(\rho) = \left.\frac{d}{dt}\, f\bigl(\rho\, g_A(t)\bigr)\right|_{t=0} \tag{3.3.32} $$

$$ T^{(R)}_A (f)(\rho) = \left.\frac{d}{dt}\, f\bigl(g_A(t)\, \rho\bigr)\right|_{t=0} \tag{3.3.33} $$

where $g_A(t)$ denotes a path in G passing through the identity and there admitting the vector $t_A$ as tangent. The established vector space isomorphism guarantees that $T^{(L)}_A$ and $T^{(R)}_A$ constitute a basis respectively for $\mathbb{G}_L$ and $\mathbb{G}_R$. Hence we can write:

$$ \bigl[T^{(L)}_A, T^{(L)}_B\bigr] = C^{(L)\,C}_{AB}\, T^{(L)}_C \tag{3.3.34} $$

$$ \bigl[T^{(R)}_A, T^{(R)}_B\bigr] = C^{(R)\,C}_{AB}\, T^{(R)}_C \tag{3.3.35} $$

where $C^{(L)\,C}_{AB}$ and $C^{(R)\,C}_{AB}$ are constants that, a priori, might be completely different. Equations (3.3.34) and (3.3.35) follow from the fact that we have established that both $\mathbb{G}_L$ and $\mathbb{G}_R$ are Lie algebras of dimension n and the generators $T^{(L/R)}_A$ constitute a basis. Let us now calculate explicitly the commutators of the basis elements using their definitions. On any function $f : G \to \mathbb{R}$, we find:

$$ \bigl[T^{(L)}_A, T^{(L)}_B\bigr](f)(\rho) = \left.\frac{d}{dt}\frac{d}{d\tau}\Bigl[ f\bigl(\rho\, g_B(t)\, g_A(\tau)\bigr) - f\bigl(\rho\, g_A(t)\, g_B(\tau)\bigr)\Bigr]\right|_{t=\tau=0} = C^{(L)\,C}_{AB} \left.\frac{d}{dt}\, f\bigl(\rho\, g_C(t)\bigr)\right|_{t=0} \tag{3.3.36} $$

$$ \bigl[T^{(R)}_A, T^{(R)}_B\bigr](f)(\rho) = \left.\frac{d}{dt}\frac{d}{d\tau}\Bigl[ f\bigl(g_A(t)\, g_B(\tau)\, \rho\bigr) - f\bigl(g_B(t)\, g_A(\tau)\, \rho\bigr)\Bigr]\right|_{t=\tau=0} = C^{(R)\,C}_{AB} \left.\frac{d}{dt}\, f\bigl(g_C(t)\, \rho\bigr)\right|_{t=0} \tag{3.3.37} $$

The above equations (3.3.36) and (3.3.37) hold true at any point ρ ∈ G. Evaluating them at the identity ρ = e and taking their sum we obtain:

$$ \Bigl(C^{(L)\,C}_{AB} + C^{(R)\,C}_{AB}\Bigr) \left.\frac{d}{dt}\, f\bigl(\rho\, g_C(t)\bigr)\right|_{t=0,\ \rho=e} = 0 \tag{3.3.38} $$

which implies:

$$ C^{(L)\,C}_{AB} = -\,C^{(R)\,C}_{AB} \tag{3.3.39} $$

This relation suffices to establish the Lie algebra isomorphism of GL with GR. Indeed they are isomorphic as vector spaces and their structure constants become identical under the very simple change of basis:

$$ T^{(L)}_A \leftrightarrow -\,T^{(R)}_A \tag{3.3.40} $$

This concludes the proof of Theorem 3.3.1. Indeed the last isomorphism advocated in that proposition amounts simply to a definition.

Definition 3.3.3 The Lie algebra GL of the left-invariant vector fields on the Lie group manifold G, isomorphic to that of the right-invariant ones GR, is named the Lie algebra G of the considered Lie group.

By explicit evaluation as in (3.3.37) we can also show that any left-invariant vector field commutes with any right-invariant one and vice-versa. Indeed we find:

$$ \bigl[T^{(L)}_A, T^{(R)}_B\bigr](f)(\rho) = \left.\frac{d}{dt}\frac{d}{d\tau}\Bigl[ f\bigl(g_B(t)\, \rho\, g_A(\tau)\bigr) - f\bigl(g_B(t)\, \rho\, g_A(\tau)\bigr)\Bigr]\right|_{t=\tau=0} = 0 \tag{3.3.41} $$

The interpretation of these relations is indeed very simple. The left-invariant vector fields happen to be the infinitesimal generators of the right translations while the right-invariant ones generate the left translations. Hence the vanishing of the above commutators just amounts to saying that the left-invariant vector fields are indeed invariant under left translations while the right-invariant ones are insensitive to right translations.
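The relations (3.3.34), (3.3.35), (3.3.39) and (3.3.41) can be probed numerically. The following is only a sketch under stated assumptions: the group is taken to be SO(3), the basis `t` of so(3), the test function `f`, the group element `rho` and the helper names are all illustrative choices that do not appear in the text, derivatives are approximated by central finite differences, and commutators are evaluated as compositions of operators, [X, Y] f = X(Y f) - Y(X f).

```python
import numpy as np

# Basis of so(3) with [t_A, t_B] = eps_ABC t_C (an illustrative choice).
t = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    t[a, b, c], t[a, c, b] = -1.0, 1.0

def expm(m, terms=30):
    """Matrix exponential by a truncated Taylor series (adequate for small 3x3 matrices)."""
    out, term = np.eye(3), np.eye(3)
    for k in range(1, terms):
        term = term @ m / k
        out = out + term
    return out

H = 1e-3  # finite-difference step

def left_field(a):
    """T_A^(L): maps f to rho -> d/dt f(rho g_A(t)) at t = 0, as in (3.3.32)."""
    plus, minus = expm(H * t[a]), expm(-H * t[a])
    return lambda f: (lambda rho: (f(rho @ plus) - f(rho @ minus)) / (2 * H))

def right_field(a):
    """T_A^(R): maps f to rho -> d/dt f(g_A(t) rho) at t = 0, as in (3.3.33)."""
    plus, minus = expm(H * t[a]), expm(-H * t[a])
    return lambda f: (lambda rho: (f(plus @ rho) - f(minus @ rho)) / (2 * H))

def commutator(X, Y):
    """[X, Y] acting on functions as X(Y f) - Y(X f)."""
    return lambda f: (lambda rho: X(Y(f))(rho) - Y(X(f))(rho))

W = np.arange(9.0).reshape(3, 3) / 10.0
f = lambda m: np.sin(np.sum(W * m)) + m[0, 1] ** 2   # arbitrary smooth test function
rho = expm(0.3 * t[0] + 0.5 * t[1] - 0.2 * t[2])      # arbitrary group element

TL = [left_field(a) for a in range(3)]
TR = [right_field(a) for a in range(3)]
lhs_L, rhs_L = commutator(TL[0], TL[1])(f)(rho), TL[2](f)(rho)
lhs_R, rhs_R = commutator(TR[0], TR[1])(f)(rho), TR[2](f)(rho)
mixed = commutator(TL[0], TR[1])(f)(rho)

print("[T1L, T2L] f - T3L f :", lhs_L - rhs_L)   # expected to be small
print("[T1R, T2R] f + T3R f :", lhs_R + rhs_R)   # opposite sign, cf. (3.3.39)
print("[T1L, T2R] f         :", mixed)           # cf. (3.3.41)
```

With this operator convention the left-invariant fields close with structure constants eps_ABC and the right-invariant ones with the opposite sign, while the mixed commutator vanishes up to discretization error.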

3.3.1.2 Maurer-Cartan Forms on Lie Group Manifolds

Let us now consider left-invariant (respectively right-invariant) one-forms on the group manifold. They were defined in (3.3.15). Starting from the construction of the left- (right-)invariant vector fields it is very easy to construct an independent set of n differential forms with the invariance property (3.3.15) that are in one-to-one correspondence with the generators of the Lie algebra $\mathbb{G}$. Let us consider the explicit form of the $T^{(L/R)}_A$ as first order differential operators:

$$ T^{(L/R)}_A = \Sigma^{(l/r)\,i}_A(\alpha)\, \frac{\overrightarrow{\partial}}{\partial\alpha^i} \tag{3.3.42} $$

According to the already introduced convention $\alpha^i$ are the group parameters, and the square matrix:

$$ \Sigma^{(l/r)\,i}_A(\alpha) = \begin{pmatrix} * & * & * & * \\ * & * & * & * \\ * & * & * & * \\ * & * & * & * \end{pmatrix}, \qquad i = 1,\dots,n \ \text{(rows)}, \quad A = 1,\dots,n \ \text{(columns)} \tag{3.3.43} $$

whose entries are functions of the parameters can be calculated starting from the constructive algorithm encoded in (3.3.33). In terms of the inverse of the above matrix, denoted $\Sigma^{(l/r)\,A}_i(\alpha)$ and such that:

$$ \Sigma^{(l/r)\,A}_i(\alpha)\, \Sigma^{(l/r)\,i}_B(\alpha) = \delta^A_B \tag{3.3.44} $$

we can define the following set of n differential one-forms:

$$ \sigma^A_{(L/R)} = \Sigma^{(l/r)\,A}_i(\alpha)\, d\alpha^i \tag{3.3.45} $$

which by construction satisfy the relations:

$$ \sigma^A_{(L/R)}\bigl(T^{(L/R)}_B\bigr) = \delta^A_B \tag{3.3.46} $$

From (3.3.46) it follows that all the forms $\sigma^A_{(L/R)}$ are left- (respectively right-)invariant. Indeed we have:

$$ \forall \gamma \in G : \qquad L_\gamma^*\, \sigma^A_{(L)}\bigl(T^{(L)}_B\bigr) \equiv \sigma^A_{(L)}\bigl(L_{\gamma *} T^{(L)}_B\bigr) = \sigma^A_{(L)}\bigl(T^{(L)}_B\bigr) = \delta^A_B \tag{3.3.47} $$

which implies $L_\gamma^*\, \sigma^A_{(L)} = \sigma^A_{(L)}$ since both forms have the same values on a basis of sections of the tangent bundle, as provided by the set of left-invariant vector fields $T^{(L)}_B$. A completely identical proof obviously holds for the right-invariant one-forms $\sigma^A_{(R)}$ defined by the same construction.

We come therefore to the conclusion that on each Lie group manifold G we can construct both a left-invariant $\sigma_{(L)}$ and a right-invariant $\sigma_{(R)}$ Lie algebra valued one-form, defined as follows:

$$ \mathbb{G} \ni \sigma_{(L)} = \sum_{A=1}^{n} \sigma^A_{(L)}\, t_A \tag{3.3.48} $$

$$ \mathbb{G} \ni \sigma_{(R)} = \sum_{A=1}^{n} \sigma^A_{(R)}\, t_A \tag{3.3.49} $$

where, just as before, $\{t_A\}$ denotes a basis of generators of the abstract Lie algebra $\mathbb{G}$.

One may wonder how the Lie algebra forms $\sigma_{(L/R)}$ could be constructed directly without going through the previous construction of the left- (right-)invariant vector fields. The answer is very simple. Let $g(\alpha) \in G$ be a running element of the Lie group parameterized by the parameters α which constitute a coordinate patch for the corresponding group manifold and consider the one-forms:

$$ \theta_L = g^{-1}\, dg = g^{-1}\, \partial_i g \; d\alpha^i; \qquad \theta_R = g\, dg^{-1} = g\, \partial_i g^{-1} \; d\alpha^i \tag{3.3.50} $$

It is immediate to check that such one-forms are left- (respectively right-) invariant. For instance we have:

$$ L_\gamma^*\, \theta_L = (\gamma g)^{-1}\, d(\gamma g) = g^{-1} \gamma^{-1} \gamma\, dg = g^{-1}\, dg = \theta_L \tag{3.3.51} $$

What might not be immediately evident to the reader is why the left- (right-)invariant one-forms $\theta_{L/R}$ introduced in (3.3.50) should be Lie algebra valued. The answer is actually very simple. From his basic courses in Lie group theory the reader should recall that the relation between Lie algebras and Lie groups is provided by the exponential map: every element of a Lie group which lies in the branch connected to the identity can be represented as the exponential of a suitable Lie algebra element:

$$ \forall \gamma \in G_0 \subset G, \ \exists X \in \mathbb{G} \ \text{such that} \ \gamma = \exp X \tag{3.3.52} $$

The Lie algebra element actually singles out an entire one parameter subgroup:

$$ \forall t \in \mathbb{R}, \qquad G_0 \ni \gamma(t) = \exp[tX] \tag{3.3.53} $$

With an obvious calculation we obtain:

$$ \gamma^{-1}(t)\, d\gamma(t) = X\, dt \in \mathbb{G} \tag{3.3.54} $$

namely the left-invariant one-form associated with this one-parameter subgroup lies in the Lie algebra. This result extends to any group element, so that indeed the previously constructed Lie algebra valued left- (right-)invariant one-forms $\sigma_{L/R}$ and the theta forms defined in (3.3.50) are just the very same objects:

σL/R = θL/R (3.3.55)

All the above statements become much clearer when we consider classical groups whose elements are just matrices subject to some defining algebraic condition. Consider for instance the rotation group in N dimensions, SO(N). All elements of this group are orthogonal N × N matrices:

$$ O \in SO(N) \ \Leftrightarrow \ O^T O = \mathbf{1}_{N\times N} \tag{3.3.56} $$

The elements of the orthogonal Lie algebra so(N) are instead antisymmetric matrices:

$$ A \in so(N) \ \Leftrightarrow \ A^T + A = \mathbf{0}_{N\times N} \tag{3.3.57} $$

Calculating the transpose of the matrix $\Theta = O^T dO$ we immediately obtain:

$$ \Theta^T = \bigl(O^T dO\bigr)^T = dO^T\, O = -\,O^T dO = -\,\Theta \ \Rightarrow \ \Theta \in so(N) \tag{3.3.58} $$

which proves that the left-invariant one-form is indeed Lie algebra valued.
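The statement (3.3.58) is easy to test numerically for N = 3. The sketch below assumes an Euler-type chart on SO(3) (the function `O` and the sample point `alpha` are illustrative choices, not taken from the text) and approximates the components $\Theta_i = O^T \partial_i O$ by central finite differences.

```python
import numpy as np

def rot(axis, angle):
    """Rotation matrix about a coordinate axis (0, 1 or 2)."""
    c, s = np.cos(angle), np.sin(angle)
    i, j = [(1, 2), (2, 0), (0, 1)][axis]
    r = np.eye(3)
    r[i, i], r[i, j], r[j, i], r[j, j] = c, -s, s, c
    return r

def O(alpha):
    """Element of SO(3) in an Euler-type chart (illustrative parameterization)."""
    return rot(2, alpha[0]) @ rot(0, alpha[1]) @ rot(2, alpha[2])

def theta(alpha, h=1e-6):
    """Components Theta_i = O^T d_i O of the left-invariant form, by central differences."""
    comps = []
    for i in range(3):
        e = np.zeros(3)
        e[i] = h
        dO = (O(alpha + e) - O(alpha - e)) / (2 * h)
        comps.append(O(alpha).T @ dO)
    return comps

alpha = np.array([0.4, 1.1, -0.7])
Theta = theta(alpha)
max_sym = max(np.abs(T + T.T).max() for T in Theta)
print("largest entry of Theta_i + Theta_i^T:", max_sym)
```

Each component matrix comes out antisymmetric up to the finite-difference error, as (3.3.58) requires.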

3.3.1.3 Maurer Cartan Equations

It is now of the utmost interest to consider the following identity which follows immediately from the definitions (3.3.50) and (3.3.55) of the left- (right-)invariant one-forms:

$$ F[\sigma] \equiv d\sigma_{L/R} + \sigma_{L/R} \wedge \sigma_{L/R} = 0 \tag{3.3.59} $$

To prove the above statement it is sufficient to observe that $g^{-1}\, dg = -\,dg^{-1}\, g$. The reason why we introduced a special name F[σ] for the Lie algebra valued two-form dσ + σ ∧ σ, which turns out to be zero in the case of left- (right-)invariant one-forms σ, is that precisely this combination will play a fundamental role in the theory of connections, representing their curvature. Equations (3.3.59) are named the Maurer Cartan equations of the considered Lie algebra $\mathbb{G}$ and they translate into the statement that left- (right-)invariant one-forms have vanishing curvature.
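For the reader's convenience the intermediate steps can be spelled out. The following is a sketch for the left-invariant form $\theta_L = g^{-1} dg$, assuming that g is matrix valued so that products of forms are ordinary matrix products combined with the wedge product. Differentiating $g^{-1} g = \mathbf{1}$ gives the first line, and the second follows from $d^2 = 0$:

$$ \begin{aligned} d\bigl(g^{-1}\bigr) &= -\,g^{-1}\, dg\; g^{-1} \\ d\theta_L &= d\bigl(g^{-1}\bigr) \wedge dg = -\,g^{-1}\, dg\; g^{-1} \wedge dg = -\,\bigl(g^{-1} dg\bigr) \wedge \bigl(g^{-1} dg\bigr) = -\,\theta_L \wedge \theta_L \end{aligned} $$

The right-invariant form $\theta_R = g\, dg^{-1}$ of (3.3.50) can be treated in the same way, starting from $d\theta_R = dg \wedge dg^{-1}$ and using $g\, dg^{-1} = -\,dg\; g^{-1}$.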

Since the forms σ are Lie algebra-valued, they can be expanded along a basis of generators tA for G, as we already did in (3.3.49) and the same can be done for the curvature two-form F[σ ], namely we get:

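$$ F[\sigma] = F^A[\sigma]\, t_A, \qquad F^A[\sigma] = d\sigma^A + \tfrac{1}{2}\, C^{A}_{\ BC}\, \sigma^B \wedge \sigma^C \tag{3.3.60} $$

where $C^{A}_{\ BC}$ are the structure constants of the Lie algebra:

$$ [t_B, t_C] = C^{A}_{\ BC}\, t_A \tag{3.3.61} $$

and the Maurer Cartan equations (3.3.59) amount to the statement:

$$ F^A[\sigma_{L/R}] = 0 \tag{3.3.62} $$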
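The component form (3.3.60)-(3.3.62) can be checked numerically. The sketch below makes several assumptions that are not in the text: the group is SO(3) with generators $(t_A)_{BC} = -\epsilon_{ABC}$, so that the structure constants are $\epsilon_{ABC}$; the chart `g(alpha)` is a product of three exponentials; and all derivatives are central finite differences. In coordinates the equation $F^A = 0$ reads $\partial_i \sigma^A_j - \partial_j \sigma^A_i + C^{A}_{\ BC}\, \sigma^B_i \sigma^C_j = 0$.

```python
import numpy as np

eps = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[a, b, c], eps[a, c, b] = 1.0, -1.0
t = -eps  # generators (t_A)_{BC} = -eps_ABC, so that [t_B, t_C] = eps_ABC t_A

def expm(m, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small matrices)."""
    out, term = np.eye(len(m)), np.eye(len(m))
    for k in range(1, terms):
        term = term @ m / k
        out = out + term
    return out

def g(alpha):
    """Illustrative chart on SO(3): product of three one-parameter subgroups."""
    return expm(alpha[0] * t[0]) @ expm(alpha[1] * t[1]) @ expm(alpha[2] * t[2])

def sigma(alpha, h=1e-5):
    """sigma[A, i]: components of g^{-1} d_i g = sigma^A_i t_A."""
    out = np.zeros((3, 3))
    ginv = np.linalg.inv(g(alpha))
    for i in range(3):
        e = np.zeros(3)
        e[i] = h
        theta_i = ginv @ (g(alpha + e) - g(alpha - e)) / (2 * h)
        out[:, i] = -0.5 * np.einsum('abc,bc->a', eps, theta_i)  # project on t_A
    return out

def mc_residual(alpha, k=1e-3):
    """F^A_{ij} = d_i sigma^A_j - d_j sigma^A_i + eps_ABC sigma^B_i sigma^C_j."""
    s = sigma(alpha)
    ds = np.zeros((3, 3, 3))  # ds[A, i, j] = d_i sigma^A_j
    for i in range(3):
        e = np.zeros(3)
        e[i] = k
        ds[:, i, :] = (sigma(alpha + e) - sigma(alpha - e)) / (2 * k)
    curl = ds - ds.transpose(0, 2, 1)
    wedge = np.einsum('abc,bi,cj->aij', eps, s, s)
    return curl + wedge

alpha = np.array([0.3, -0.8, 1.2])
residual = np.abs(mc_residual(alpha)).max()
print("largest component of F^A_ij:", residual)
```

The residual is limited only by the nested finite-difference error, which supports the vanishing of $F^A[\sigma_L]$ in this example.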

3.3.2 Ehresmann Connections on a Principal Fibre Bundle

Equipped with the notions developed in the previous section we are finally in a position to introduce the rigorous mathematical definition, due to Ehresmann (see Fig. 3.12), of a connection on a generic principal bundle. Let P(M, G) be a principal fibre-bundle with base-manifold M and structural group G. Let us moreover denote by π the projection:

π : P → M (3.3.63)

Consider the action of the Lie group G on the total space P :

$$ \forall g \in G \qquad g : P \to P \tag{3.3.64} $$

By definition this action is vertical in the sense that

$$ \forall u \in P, \ \forall g \in G : \qquad \pi\bigl(g(u)\bigr) = \pi(u) \tag{3.3.65} $$

namely it moves points only along the fibres. Given any element $X \in \mathbb{G}$, where we have denoted by $\mathbb{G}$ the Lie algebra of the structural group, we can consider the one-dimensional subgroup generated by it

gX(t) = exp[tX],t∈ R (3.3.66) and consider the curve CX(t, u) in the manifold P obtained by acting with gX(t) on some point u ∈ P :

CX(t, u) ≡ gX(t)(u) (3.3.67)

Fig. 3.12 Charles Ehresmann was born in German-speaking Alsace in 1905 to a poor family. His first education was in German, but after Alsace was returned to France in 1918 as a result of Germany's defeat in World War I, Ehresmann attended only French schools and his University education was entirely French. Indeed in 1924 he entered the École Normale Supérieure from which he graduated in 1927. After that, he served as teacher of Mathematics in the French colony of Morocco and then went to Göttingen, which in the late twenties and beginning of the thirties was the major scientific center of the world, at least for Mathematics and Physics. The rise of Nazi power in Germany dismantled the scientific leadership of the country, caused the decay of Göttingen and obliged all the Jewish scientists who so greatly contributed to German culture to emigrate to the United States. Ehresmann also fled from Göttingen to Princeton where he studied for a few years until 1934. In that year he returned to France to obtain his doctorate under the supervision of one of the giants of mathematical thought of the XXth century: Elie Cartan. Charles Ehresmann was a professor at the Universities of Strasbourg and Clermont Ferrand. In 1955 a special chair of Topology was created for him at the University of Paris, which he occupied up to his retirement in 1975. He died in 1979 in Amiens where his second wife, also a mathematician, held a chair. Charles Ehresmann was one of the creators of differential topology. He greatly contributed to the development of the notion of fibre-bundles and of the connections defined over them. He founded the mathematical theory of categories

The vertical action of the structural group implies that:

$$ \pi\bigl(C_X(t,u)\bigr) = p \in M \quad \text{if} \quad \pi(u) = p \tag{3.3.68} $$

These items allow us to construct a linear map # which associates a vector field X# over P to every element X of the Lie algebra G:

$$ \# : \ \mathbb{G} \ni X \mapsto X^\# \in \Gamma(TP, P), \qquad \forall f \in C^\infty(P) \quad X^\# f(u) \equiv \left.\frac{d}{dt}\, f\bigl(C_X(t,u)\bigr)\right|_{t=0} \tag{3.3.69} $$

Focusing on any point u ∈ P of the bundle, the map (3.3.69) reduces to a map:

$$ \#_u : \ \mathbb{G} \to T_uP \tag{3.3.70} $$

from the Lie algebra to the tangent space at u. The map $\#_u$ is not surjective.

Fig. 3.13 The tangent space to a principal bundle P splits at any point u ∈ P into a vertical subspace along the fibres and a horizontal subspace parallel to the base manifold. This splitting is the intrinsic geometric meaning of a connection

We introduce the following:

Definition 3.3.4 At any point u ∈ P(M, G) of a principal fibre bundle we name vertical subspace $V_uP$ of the tangent space $T_uP$ the image of the map $\#_u$:

TuP ⊃ Vu ≡ Im #u (3.3.71)

Indeed any vector t ∈ Im #u lies in the tangent space to the fibre Gp, where p ≡ π(u).

The meaning of Definition 3.3.4 becomes clearer if we consider both Fig. 3.13 and a local trivialization of the bundle including u ∈ P. In such a local trivialization the bundle point u is identified by a pair:

$$ u \ \xrightarrow{\ \text{loc. triv.}\ } \ (p, f) \quad \text{where} \quad \begin{cases} p \in M \\ f \in G \end{cases} \tag{3.3.72} $$

and the curve $C_X(t,u)$ takes the following appearance:

$$ C_X(t,u) \ \xrightarrow{\ \text{loc. triv.}\ } \ \bigl(p,\, e^{tX} f\bigr), \qquad p = \pi(u) \tag{3.3.73} $$

Correspondingly, naming $\alpha^i$ the group parameters, just as in the previous section, and $x^\mu$ the coordinates on the base space M, a generic function $f \in C^\infty(P)$ is just a function f(x, α) of the m coordinates x and of the n parameters α. Comparing (3.3.69) with (3.3.24) we see that, in the local trivialization, the vertical tangent vector $X^\#$ reduces to nothing else but the right-invariant vector field on the group manifold associated with the same Lie algebra element X:

$$ X^\# \ \xrightarrow{\ \text{loc. triv.}\ } \ X_R = X_R^i(\alpha)\, \frac{\overrightarrow{\partial}}{\partial\alpha^i} \tag{3.3.74} $$

As we see from (3.3.74), in a local trivialization a vertical vector contains no derivatives with respect to the base space coordinates. This means that its projection onto the base manifold is zero. Indeed it follows from Definition 3.3.4 that:

$$ X \in V_uP \ \Rightarrow \ \pi_* X = 0 \tag{3.3.75} $$

where $\pi_*$ is the push-forward of the projection map.

Having defined the vertical subspace of the tangent space one would be interested in giving a definition also of the horizontal subspace $H_uP$, which, intuitively, must be somehow parallel to the tangent space $T_pM$ to the base manifold at the projection point p = π(u). The dimension of the vertical space is the same as the dimension of the fibre, namely n = dim G. The dimension of the horizontal space $H_uP$ must be the same as the dimension of the base manifold, m = dim M. Indeed $H_uP$ should be the orthogonal complement of $V_uP$. Easy to say, but orthogonal with respect to what? This is precisely the point. Is there an a priori, intrinsically defined way of defining the orthogonal complement to the vertical subspace $V_u \subset T_uP$? The answer is that there is not. Given a basis $\{v_\mu\}$ of n vectors for the subspace $V_uP$, there are infinitely many ways of finding m extra vectors $\{h_i\}$ which complete this basis to a basis of $T_uP$. The span of any such collection of m vectors $\{h_i\}$ is a possible legitimate definition of the orthogonal complement $H_uP$. This arbitrariness is the root of the mathematical notion of a connection. Providing a fibre bundle with a connection precisely means introducing a rule that uniquely defines the orthogonal complement $H_uP$.

Definition 3.3.5 Let P(M,G) be a principal fibre-bundle. A connection on P is a rule which at any point u ∈ P defines a unique splitting of the tangent space TuP into the vertical subspace VuP and into a horizontal complement HuP satisfying the following properties:

(i) $T_uP = H_uP \oplus V_uP$.
(ii) Any smooth vector field $X \in \Gamma(TP, P)$ separates into the sum of two smooth vector fields $X = X^H + X^V$ such that at any point u ∈ P we have $X^H_u \in H_uP$ and $X^V_u \in V_uP$.
(iii) ∀g ∈ G we have that $H_{gu}P = L_{g*} H_uP$.

The third defining property of the connection states that all the horizontal spaces along the same fibre are related to each other by the push-forward of the left-translation on the group manifold. This beautiful, purely geometrical definition of the connection due to Ehresmann emphasizes that it is just an intrinsic attribute of the principal fibre-bundle. In a trivial bundle, which is a direct product of manifolds, the splitting between vertical and horizontal spaces is done once and for all. The vertical space is the tangent space to the fibre, the horizontal space is the tangent space to the base. In a non-trivial bundle the splitting between vertical and horizontal directions has to be reconsidered at every point, and fixing this ambiguity is the task of the connection.

3.3.2.1 The Connection One-Form

The algorithmic way to implement the splitting rule advocated by Definition 3.3.5 is provided by introducing a connection one-form $\mathbf{A}$, which is just a Lie algebra valued one-form on the bundle P satisfying two precise requirements.

Definition 3.3.6 Let P(M, G) be a principal fibre-bundle with structural Lie group G, whose Lie algebra we denote by $\mathbb{G}$. A connection one-form on this bundle is a Lie algebra valued section of the cotangent bundle, $\mathbf{A} \in \mathbb{G} \otimes \Gamma(T^*P, P)$, which satisfies the following defining properties:

(i) $\forall X \in \mathbb{G} : \ \mathbf{A}\bigl(X^\#\bigr) = X$.
(ii) $\forall g \in G : \ L_g^* \mathbf{A} = \mathrm{Adj}_{g^{-1}} \mathbf{A} \equiv g^{-1}\, \mathbf{A}\, g$.

Given the connection one-form A the splitting between vertical and horizontal subspaces is performed through the following

Definition 3.3.7 At any u ∈ P, the horizontal subspace $H_uP$ of the tangent space to the bundle is provided by the kernel of $\mathbf{A}$:

$$ H_uP \equiv \bigl\{ X \in T_uP \ \big| \ \mathbf{A}(X) = 0 \bigr\} \tag{3.3.76} $$

The actual meaning of Definitions 3.3.6, 3.3.7 and their coherence with Definition 3.3.5 becomes clear if we consider a local trivialization of the bundle. Just as before, let $x^\mu$ be the coordinates on the base-manifold and $\alpha^i$ the group parameters. With respect to this local patch, the connection one-form $\mathbf{A}$ has the following appearance:

$$ \mathbf{A} = \widehat{A} + \overline{A} \tag{3.3.77} $$

$$ \widehat{A} = \widehat{A}_\mu\, dx^\mu = dx^\mu\, \widehat{A}^I_\mu\, T_I \tag{3.3.78} $$

$$ \overline{A} = \overline{A}_i\, d\alpha^i = d\alpha^i\, \overline{A}^I_i\, T_I \tag{3.3.79} $$

where, a priori, both the components $\widehat{A}_\mu(x,\alpha)$ and $\overline{A}_i(x,\alpha)$ could depend on both $x^\mu$ and $\alpha^i$. The second equality in (3.3.78) and (3.3.79) is due to the fact that $\widehat{A}$ and $\overline{A}$ are Lie algebra valued, hence they can be expanded along a basis of generators of $\mathbb{G}$. Actually the dependence on the group parameters, or better on the group element g(α), is completely fixed by the defining axioms. Let us consider first the implications of property (i) in Definition 3.3.6. To this effect let $X \in \mathbb{G}$ be expanded along the same basis of generators, namely $X = X^A T_A$. As we already remarked in (3.3.74), the image of X through the map # is $X^\# = X^A\, T^{(R)}_A$ where $T^{(R)}_A = \Sigma^{(r)\,i}_A(\alpha)\, \overrightarrow{\partial}/\partial\alpha^i$ are the right-invariant vector fields. Hence property (i) requires:

$$ X^A\, \overline{A}^I_i\, \Sigma^{(r)\,i}_A(\alpha)\, T_I = X^A\, T_A \tag{3.3.80} $$

which implies

$$ \overline{A}^I_i\, \Sigma^{(r)\,i}_A(\alpha) = \delta^I_A \tag{3.3.81} $$

Therefore $\overline{A}^I_i = \Sigma^{(r)\,I}_i(\alpha)$ and the vertical component of the connection one-form is nothing else but the right-invariant one-form:

$$ \overline{A} = \sigma_{(R)} \equiv dg\, g^{-1} \tag{3.3.82} $$

Consider now the implications of defining property (ii). In the chosen local trivialization the left action of any element γ of the structural group G is $L_\gamma : (x, g(\alpha)) \to (x, \gamma \cdot g(\alpha))$ and its pull-back acts on the right-invariant one-form $\sigma_{(R)}$ by adjoint transformations:

$$ L_\gamma^* : \ \sigma_{(R)} \to d(\gamma \cdot g)\, (\gamma \cdot g)^{-1} = \gamma \cdot \sigma_{(R)} \cdot \gamma^{-1} \tag{3.3.83} $$

Property (ii) of Definition 3.3.6 requires that the same should be true for the complete connection one-form $\mathbf{A}$. This fixes the g dependence of $\widehat{A}$, namely we must have:

$$ \widehat{A}(x, g) = g \cdot \mathcal{A} \cdot g^{-1} \tag{3.3.84} $$

where

$$ \mathcal{A} = \mathcal{A}^I_\mu(x)\, dx^\mu\, T_I \tag{3.3.85} $$

is a $\mathbb{G}$ Lie algebra valued one-form on the base manifold M. Consequently we can summarize the content of Definition 3.3.6 by stating that in any local trivialization u = (x, g) the connection one-form has the following structure:

$$ \mathbf{A} = g \cdot \mathcal{A} \cdot g^{-1} + dg \cdot g^{-1} \tag{3.3.86} $$
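As a consistency check, sketched here for a matrix group only, one can verify that (3.3.86) reproduces property (i) of Definition 3.3.6. We write $\mathbf{A}$ for the connection one-form on P and $\mathcal{A}$ for the one-form on the base manifold. In the local trivialization the vertical vector $X^\#$ is the right-invariant field $X_R$ of (3.3.74), which has no component along $dx^\mu$ and acts on the group element as $X_R\, g = \frac{d}{dt} e^{tX} g \big|_{t=0} = X g$. Hence:

$$ \mathbf{A}\bigl(X^\#\bigr) = g\, \mathcal{A}(X_R)\, g^{-1} + dg(X_R)\, g^{-1} = 0 + (X g)\, g^{-1} = X $$

Similarly, under $g \to \gamma \cdot g$ with constant γ both terms of (3.3.86) are conjugated by γ, in agreement with (3.3.83).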

Gauge Transformations From (3.3.86) we easily work out the action on the connection one-form of the transition function from one local trivialization to another one. Let us recall (2.4.8) and (2.4.9). In the case of a principal bundle the transition function $t_{\alpha\beta}(x)$ is just a map from the intersection of two open neighborhoods of the base space into the structural group:

$$ t_{\alpha\beta} : \ M \supset U_\alpha \cap U_\beta \to G \tag{3.3.87} $$

and we have:

$$ \text{transition} : \ (x, g_\alpha) \to (x, g_\beta) = \bigl(x,\, g_\alpha \cdot t_{\alpha\beta}(x)\bigr) \tag{3.3.88} $$

Correspondingly, combining this information with (3.3.86) we conclude that:

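$$ \begin{aligned} \mathcal{A}_\alpha &\to \mathcal{A}_\beta \\ \mathbf{A}_\alpha &= g \cdot \mathcal{A}_\alpha \cdot g^{-1} + dg \cdot g^{-1} \\ \mathbf{A}_\beta &= g \cdot \mathcal{A}_\beta \cdot g^{-1} + dg \cdot g^{-1} \\ \mathcal{A}_\beta &= t_{\alpha\beta}(x) \cdot \mathcal{A}_\alpha \cdot t^{-1}_{\alpha\beta}(x) + dt_{\alpha\beta}(x) \cdot t^{-1}_{\alpha\beta}(x) \end{aligned} \tag{3.3.89} $$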

The last equation of (3.3.89) is the most significant one for the bridge from Mathematics to Physics and provides the common stem of Yang's picture in Fig. 3.1. The one-form $\mathcal{A}$ defined over the base-manifold M, which in particular can be space-time, is what physicists name a gauge potential, the prototype of which is the gauge-potential of electrodynamics; the transformation from one local trivialization to another one, encoded in the last of (3.3.89), is what physicists name a gauge transformation. The basic idea about gauge-transformations is that of local symmetry. Some physical system, or rather some physical law, is invariant with respect to the transformations of a group G not only globally but also locally, namely the transformation can be chosen differently from one point of space-time to the next. The basic mathematical structure realizing this physical idea is that of fibre-bundle, and the connection one-form on a fibre-bundle is the appropriate mathematical structure which encompasses all mediators of all physical interactions.
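One property of the transformation law in the last line of (3.3.89) that is easy to test is its compatibility with the group structure: transforming first with $t_1$ and then with $t_2$ should coincide with a single transformation by the product $t_2 t_1$. The sketch below checks this numerically along a single coordinate x. Everything in it is an illustrative assumption: the group is SO(3), the potential `A` and the transition functions `t1`, `t2` are made-up smooth functions, and the derivative of the transition function is a central finite difference.

```python
import numpy as np

# so(3) generators with [L_A, L_B] = eps_ABC L_C (illustrative basis).
L = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    L[a, b, c], L[a, c, b] = -1.0, 1.0

def expm(m, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small matrices)."""
    out, term = np.eye(len(m)), np.eye(len(m))
    for k in range(1, terms):
        term = term @ m / k
        out = out + term
    return out

def gauge_transform(A, t, h=1e-6):
    """Return x -> t A t^{-1} + (dt/dx) t^{-1}, the x-component of the last line of (3.3.89)."""
    def A_new(x):
        tx = t(x)
        dt = (t(x + h) - t(x - h)) / (2 * h)
        tinv = np.linalg.inv(tx)
        return tx @ A(x) @ tinv + dt @ tinv
    return A_new

# Made-up gauge potential component and transition functions along x.
A = lambda x: np.sin(x) * L[0] + x ** 2 * L[1] + 0.5 * L[2]
t1 = lambda x: expm(x * L[2]) @ expm(0.3 * x ** 2 * L[0])
t2 = lambda x: expm(np.cos(x) * L[1])

x0 = 0.7
two_steps = gauge_transform(gauge_transform(A, t1), t2)(x0)
one_step = gauge_transform(A, lambda x: t2(x) @ t1(x))(x0)
print("max difference:", np.abs(two_steps - one_step).max())
print("max symmetric part:", np.abs(one_step + one_step.T).max())
```

The transformed potential also stays antisymmetric, that is so(3)-valued, because both $t \mathcal{A} t^{-1}$ and $dt\, t^{-1}$ lie in the Lie algebra.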

Horizontal Vector Fields and Covariant Derivatives Let us come back to Definition 3.3.7 of horizontal vector fields. In a local trivialization a generic vector field on P is of the form:

$$ X = X^\mu\, \partial_\mu + X^i\, \partial_i \tag{3.3.90} $$

where we have used the shorthand $\partial_\mu = \partial/\partial x^\mu$ and $\partial_i = \partial/\partial\alpha^i$. The horizontality condition in (3.3.76) becomes:

$$ 0 = X^\mu\, \mathcal{A}^A_\mu\; g \cdot T_A \cdot g^{-1} + X^i\, \partial_i g \cdot g^{-1} \tag{3.3.91} $$

Multiplying on the left by $g^{-1}$ and on the right by g, (3.3.91) can be easily solved for $X^i$, obtaining in this way the general form of a horizontal vector field on a principal fibre-bundle P(M, G) endowed with a connection one-form $\mathbf{A} \leftrightarrow \mathcal{A}$. We get:

$$ X^{(H)} = X^\mu \Bigl( \overrightarrow{\partial}_\mu - \mathcal{A}^A_\mu\, T^{(L)}_A \Bigr) = X^\mu \Bigl( \overrightarrow{\partial}_\mu - \mathcal{A}^A_\mu\, \Sigma^{(l)\,i}_A\, \overrightarrow{\partial}_i \Bigr) \tag{3.3.92} $$
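The horizontality of (3.3.92) can be checked directly against (3.3.91). The following is a sketch for a matrix group, using the fact that a left-invariant field acts on the group element as $T^{(L)}_A\, g = \Sigma^{(l)\,i}_A\, \partial_i g = g\, T_A$. Inserting $X^i = -X^\mu\, \mathcal{A}^A_\mu\, \Sigma^{(l)\,i}_A$ into the right-hand side of (3.3.91) gives:

$$ X^\mu\, \mathcal{A}^A_\mu\; g\, T_A\, g^{-1} + X^i\, \partial_i g\; g^{-1} = X^\mu\, \mathcal{A}^A_\mu\; g\, T_A\, g^{-1} - X^\mu\, \mathcal{A}^A_\mu\, \bigl(g\, T_A\bigr)\, g^{-1} = 0 $$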

To obtain the above result we made use of the following identities:

$$ g^{-1}\, \partial_i g = \Sigma^{(l)\,A}_i\, T_A; \qquad T^{(L)}_A = \Sigma^{(l)\,i}_A\, \overrightarrow{\partial}_i; \qquad \Sigma^{(l)\,A}_i\, \Sigma^{(l)\,i}_B = \delta^A_B \tag{3.3.93} $$

which follow from the definitions of left-invariant one-forms and vector fields discussed in previous subsections. Equation (3.3.92) is as relevant for the bridge between Mathematics and Physics as (3.3.89). Indeed from (3.3.92) we realize that the mathematical notion of horizontal vector fields coincides with the physical notion of covariant derivative.

Expressions like that in the bracket of (3.3.92) first appeared in the early decades of the XXth century when Classical Electrodynamics was rewritten in the language of Special Relativity and the quadri-potential $A_\mu$ was introduced. Indeed with the development first of the Dirac equation and then of Quantum Electrodynamics, the covariant derivative of the electron wave-function made its entrance in the language of theoretical physics:

∇μψ = (∂μ − iAμ)ψ (3.3.94)

The operator $\nabla_\mu$ might be assimilated with the action of a horizontal vector field (3.3.92) if we just had a one-dimensional structural group U(1) with a single generator and if we were acting on functions over P(M, U(1)) that are eigenstates of the unique left-invariant vector field $T^{(L)}$ with eigenvalue i. This is indeed the case of Electrodynamics. The group U(1) is composed of all unimodular complex numbers exp[iα] and the left-invariant vector field is simply $T^{(L)} = \overrightarrow{\partial}/\partial\alpha$. The wave-function ψ(x) is actually a section of the principal bundle $P(M_4, U(1))$ since it is a complex spinor whose overall phase factor exp[iα(x)] takes values in the U(1)-fibre over $x \in M_4$. Since $T^{(L)} \exp[i\alpha] = i \exp[i\alpha]$, the covariant derivative (3.3.94) is of the form (3.3.92).

This shows that the connection one-form coincides, at least in the case of electrodynamics, with the physical notion of gauge-potential which, upon quantization, describes the photon-field, namely the mediator of electromagnetic interactions. The 1954 invention of non-Abelian gauge-theories by Yang and Mills started from the generalization of the covariant derivative (3.3.94) to the case of the su(2) Lie algebra. Introducing three gauge potentials $\mathcal{A}^A_\mu$, corresponding to the three standard generators $J_A$ of su(2):

$$ [J_A, J_B] = \epsilon_{ABC}\, J_C; \qquad J_A = i\,\sigma^A; \qquad (A,B,C = 1,2,3) \tag{3.3.95} $$

where $\sigma^A$ denote the 2 × 2 Pauli matrices:

$$ \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \qquad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}; \qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{3.3.96} $$

they wrote:

$$ \nabla_\mu \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \bigl( \partial_\mu - i\, \mathcal{A}^A_\mu\, J_A \bigr) \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} \tag{3.3.97} $$

From the su(2) × u(1) standard model of electro-weak interactions of Glashow, Salam and Weinberg (see Fig. 3.14) we know nowadays that, in appropriate combinations with $\mathcal{A}_\mu$, the three gauge fields $\mathcal{A}^A_\mu$ describe the $W^\pm$ and $Z^0$ particles discovered at CERN in 1983 by the UA1 experiment of Carlo Rubbia. They have spin one and mediate the weak interactions.

Fig. 3.14 Sheldon Glashow, Steven Weinberg and Abdus Salam shared the 1979 Nobel Prize in Physics for their 1968 formulation of a gauge theory of the electro-weak interactions based on the choice of a gauge group G = SU(2) × U(1). In the language of Mathematics this is the structural group of a principal bundle, whose connection one-form contains four fields. These fields correspond to as many elementary particles: the $W^+$, the $W^-$, the photon γ and another neutral massive particle $Z^0$. While the existence of the $W^\pm$ particles could be indirectly inferred from the known weak decays, the $Z^0$ particle had no experimental motivation at the time when the SU(2) × U(1) theory was formulated. Yet in 1973 the neutral current interactions were discovered at CERN. Their effect was revealed by otherwise inexplicable electron tracks photographed at the Gargamelle Bubble Chamber. The direct production of the $W^\pm$ and $Z^0$ particles had to wait for the construction of the Super Proton Synchrotron and the UA1 experiment of 1983 which revealed them. In 1984 Carlo Rubbia and the accelerator engineer Simon van der Meer, who invented the adiabatic cooling system at the basis of the experimental success, obtained the Nobel Prize in Physics for the experimental verification of the gauge theory description of electro-weak interactions

The mentioned examples show that the Lie algebra-valued connection one-form defined by Ehresmann on principal bundles is clearly related to the gauge-fields of particle physics since it enters the construction of covariant derivatives via the notion of horizontal vector fields. On the other hand the same one-form must be related to gravitation as well, since also there one deals with covariant derivatives (3.2.8) sustained by the Levi Civita connection and the Christoffel symbols (3.2.7). The key to understanding the unifying point of view offered by the notion of Ehresmann connection on a principal bundle is obtained by recalling what we emphasized in Chap. 2 when we introduced the very notion of fibre-bundles. Following Definition 2.4.5 let us recall that to each principal bundle P(M, G) one can associate as many associated vector bundles as there are linear representations of the structural group G, namely infinitely many. It suffices to use as transition functions the corresponding linear representations of the transition functions of the principal bundle as displayed in (2.4.20). In all such associated bundles the fibres are vector spaces of dimension r (the rank of the bundle) and the transition functions are r × r matrices. Every linear representation of a Lie group G of dimension r induces a linear representation of its Lie algebra $\mathbb{G}$, where the left- (right-)invariant vector fields $T^{(L/R)}_A$ are mapped into r × r matrices:

$$ T^{(L/R)}_A \to D(T_A) \tag{3.3.98} $$

satisfying the same commutation relations:

$$ \bigl[ D(T_A), D(T_B) \bigr] = C_{AB}^{\ \ \ C}\, D(T_C) \tag{3.3.99} $$

It is therefore tempting to assume that given a connection one-form $\mathbf{A} \leftrightarrow \mathcal{A}$ on a principal bundle one can define a connection one-form on every associated vector bundle by taking its r × r matrix representation $D(\mathbf{A}) \leftrightarrow D(\mathcal{A})$. In which sense does this matrix-valued one-form define a connection on the considered vector bundle? To answer such a question we obviously need first to define connections on generic vector bundles, which is what we do in the next section.
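The content of (3.3.98) and (3.3.99) can be illustrated on the smallest non-Abelian example. The sketch below takes the structure constants $\epsilon_{ABC}$ and checks two matrix representations of different rank. Both choices are assumptions made for illustration: the rank 3 matrices $(D_3(T_A))_{BC} = -\epsilon_{ABC}$, and the rank 2 matrices $D_2(T_A) = -\tfrac{i}{2}\sigma^A$ built from the Pauli matrices (3.3.96). The normalization $-\tfrac{i}{2}$ is the one for which the commutators close with $\epsilon_{ABC}$ in this check; it is chosen here and is not read off from (3.3.95).

```python
import numpy as np

# Structure constants C_AB^C = eps_ABC.
eps = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[a, b, c], eps[a, c, b] = 1.0, -1.0

# Rank 3 representation (assumed): (D3(T_A))_{BC} = -eps_ABC.
D3 = -eps

# Rank 2 representation (assumed normalization): D2(T_A) = -(i/2) sigma^A.
pauli = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
D2 = -0.5j * pauli

def satisfies_algebra(D, C, tol=1e-12):
    """Check [D(T_A), D(T_B)] = C_AB^C D(T_C) for all A, B, as in (3.3.99)."""
    for a in range(3):
        for b in range(3):
            lhs = D[a] @ D[b] - D[b] @ D[a]
            rhs = np.einsum('c,cij->ij', C[a, b], D)
            if not np.allclose(lhs, rhs, atol=tol):
                return False
    return True

ok_rank3 = satisfies_algebra(D3, eps)
ok_rank2 = satisfies_algebra(D2, eps)
print("rank 3 representation closes:", ok_rank3)
print("rank 2 representation closes:", ok_rank2)
```

The same abstract commutation relations are thus realized by matrices of different size r, which is what allows one connection on the principal bundle to act on associated vector bundles of different rank.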

3.4 Connections on a Vector Bundle

Let us now consider a generic vector bundle $E \xrightarrow{\ \pi\ } M$ of rank r. Its standard fibre is an r-dimensional vector space V and the transition functions from one local trivialization to another one are maps:

$$ \psi_{\alpha\beta} : \ U_\alpha \cap U_\beta \to \mathrm{Hom}(V, V) \sim GL(r, \mathbb{R}) \tag{3.4.1} $$

In other words, without further restrictions the structural group of a generic vector bundle is just GL(r, R). The notion of a connection on generic vector bundles is formulated according to the following:

Definition 3.4.1 Let $E \xrightarrow{\ \pi\ } M$ be a vector bundle, $TM \xrightarrow{\ \pi\ } M$ the tangent bundle to the base manifold M (dim M = m) and let Γ(E, M) be the space of sections of E: a connection ∇ on E is a rule that, to each vector field $X \in \Gamma(TM, M)$, associates a map:

$$ \nabla_X : \ \Gamma(E, M) \to \Gamma(E, M) \tag{3.4.2} $$

satisfying the following defining properties:

(a) $\nabla_X (a_1 s_1 + a_2 s_2) = a_1 \nabla_X s_1 + a_2 \nabla_X s_2$
(b) $\nabla_{a_1 X_1 + a_2 X_2}\, s = a_1 \nabla_{X_1} s + a_2 \nabla_{X_2} s$
(c) $\nabla_X (f \cdot s) = X(f) \cdot s + f \cdot \nabla_X s$
(d) $\nabla_{f \cdot X}\, s = f \cdot \nabla_X s$

where $a_{1,2} \in \mathbb{R}$ are real numbers, $s_{1,2} \in \Gamma(E, M)$ are sections of the vector bundle and $f \in C^\infty(M)$ is a smooth function.

The abstract description of a connection provided by Definition 3.4.1 becomes more explicit if we consider bases of sections for the two vector bundles involved. To this effect let $\{s_i(p)\}$ be a basis of sections of the vector bundle $E \stackrel{\pi}{\Longrightarrow} \mathcal{M}$ ($i = 1, 2, \dots, r = \mathrm{rank}\, E$) and let $\{e_\mu(p)\}$ ($\mu = 1, 2, \dots, m = \dim \mathcal{M}$) be a basis of sections for the tangent bundle to the base manifold $T\mathcal{M} \stackrel{\pi}{\Longrightarrow} \mathcal{M}$. Then we can write:

$$\nabla_{e_\mu} s_i \equiv \nabla_\mu s_i = \Theta_{\mu i}{}^{j}\, s_j \tag{3.4.3}$$

where the local functions $\Theta_{\mu i}{}^{j}(p)$ ($\forall p \in \mathcal{M}$) are named the coefficients of the connection. They are necessary and sufficient to specify the connection on an arbitrary section of the vector bundle. To this effect let $s \in \Gamma(E, \mathcal{M})$. By definition of basis of sections we can write:

$$s = c^i(p)\, s_i(p) \tag{3.4.4}$$

and correspondingly we obtain:

$$\nabla_\mu s = \bigl(\nabla_\mu c^i\bigr) \cdot s_i, \qquad \nabla_\mu c^i \equiv \partial_\mu c^i + \Theta_{\mu j}{}^{i}\, c^j \tag{3.4.5}$$

having defined $\partial_\mu c^i \equiv e_\mu(c^i)$. Comparing this with the discussion at the end of Sect. 3.3.2.1, we see that in the language of physicists $\nabla_\mu c^i$ can be identified with the covariant derivatives of the vector components $c^i$. Let us now observe that, for any vector bundle $E \stackrel{\pi}{\Longrightarrow} \mathcal{M}$, if $r \equiv \mathrm{rank}(E)$ is the rank, namely the dimension of the standard fibre, and $m \equiv \dim \mathcal{M}$ is the dimension of the base manifold, then the connection coefficients can be viewed as a set of m matrices, each of them r × r:

$$\forall e_\mu(p) : \quad \Theta_{\mu j}{}^{i} = \bigl(\Theta_{e_\mu}\bigr)_j{}^{i} \tag{3.4.6}$$

Hence, more abstractly, we can say that the connection on a vector bundle associates to each vector field defined on the base manifold a matrix of dimension equal to the rank of the bundle:

$$\forall X \in \Gamma(T\mathcal{M}, \mathcal{M}) : \quad \nabla : X \to \Theta_X = r \times r \text{ matrix depending on the base point } p \in \mathcal{M} \tag{3.4.7}$$
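As a worked check of (3.4.5), the short derivation below applies property (c) of Definition 3.4.1 to the expansion (3.4.4). It is only a sketch that uses the index placement adopted in (3.4.3); no new assumption enters beyond relabelling the summed indices.

```latex
\begin{aligned}
\nabla_\mu s
  &= \nabla_{e_\mu}\bigl(c^i s_i\bigr)
   = e_\mu(c^i)\, s_i + c^i\, \nabla_\mu s_i
   && \text{property (c)} \\
  &= \partial_\mu c^i\, s_i + c^i\, \Theta_{\mu i}{}^{j}\, s_j
   && \text{by (3.4.3)} \\
  &= \bigl(\partial_\mu c^i + \Theta_{\mu j}{}^{i}\, c^j\bigr)\, s_i
   && \text{relabelling } i \leftrightarrow j \text{ in the second term}
\end{aligned}
```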

The relevant question is the following: can we relate the matrix ΘX to a one-form and make a bridge between the above definition of a connection and that provided by the Ehresmann approach? The answer is yes and it is encoded in the following equivalent:

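Definition 3.4.2 Let $E \stackrel{\pi}{\Longrightarrow} \mathcal{M}$ be a vector bundle, $T^*\mathcal{M} \stackrel{\pi}{\Longrightarrow} \mathcal{M}$ the cotangent bundle to the base manifold $\mathcal{M}$ ($\dim \mathcal{M} = m$) and let $\Gamma(E, \mathcal{M})$ be the space of sections of E: a connection ∇ on E is a map:

$$\nabla : \Gamma(E, \mathcal{M}) \to \Gamma(E, \mathcal{M}) \otimes T^*\mathcal{M} \tag{3.4.8}$$

which to each section $s \in \Gamma(E, \mathcal{M})$ associates a one-form ∇s with values in $\Gamma(E, \mathcal{M})$ such that the following defining properties are satisfied:

(a) $\nabla(a_1 s_1 + a_2 s_2) = a_1 \nabla s_1 + a_2 \nabla s_2$
(b) $\nabla(f \cdot s) = df \cdot s + f \cdot \nabla s$

where $a_{1,2} \in \mathbb{R}$ are real numbers, $s_{1,2} \in \Gamma(E, \mathcal{M})$ are sections of the vector bundle and $f \in C^\infty(\mathcal{M})$ is a smooth function.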

The relation between the two definitions of a connection on a vector bundle is now easily obtained by stating that for each section s ∈ Γ(E,M ) and for each vector field X ∈ Γ(TM , M ) we have:

∇Xs =∇s(X) (3.4.9)

The reader can easily verify that, using (3.4.9) and relying on the properties of ∇s established by Definition 3.4.2, those of $\nabla_X s$ required by Definition 3.4.1 are all satisfied. Consider now, just as before, a basis of sections of the vector bundle $\{s_i(p)\}$. According to the second Definition 3.4.2, a connection ∇ singles out an r × r matrix-valued one-form $\Theta \in \Gamma(T^*\mathcal{M}, \mathcal{M})$ through the equation:

$$\nabla s_i = \Theta_i{}^{j}\, s_j \tag{3.4.10}$$

Clearly the connection coefficients introduced in (3.4.7) are nothing else but the values of Θ on each vector field X:

ΘX = Θ(X) (3.4.11)

A natural question which arises at this point is the following. In view of our comments at the end of Sect. 3.3.2.1, could we identify Θ with the matrix representation of $\mathcal{A}$, the principal connection on the corresponding principal bundle? In other words could we write:

$$\Theta = D(\mathcal{A})\,? \tag{3.4.12}$$

For a single generic vector bundle this question seems tautological. Indeed the structural group is simply GL(r, R) and, since the corresponding Lie algebra gl(r, R) is the space of generic matrices, it looks obvious that Θ can be viewed as Lie algebra valued. The real question, however, is another one. The identification (3.4.12) is legitimate if and only if Θ transforms as the matrix representation of $\mathcal{A}$. This is precisely the case. Let us demonstrate it. Consider the intersection of two local trivializations. On $U_\alpha \cap U_\beta$ the relation between two bases of sections is given by:

$$s_i^{(\alpha)} = D\bigl(t_{(\alpha\beta)}\bigr)_i{}^{j}\, s_j^{(\beta)} \tag{3.4.13}$$

where $t_{(\alpha\beta)}$ is the transition function seen as an abstract group element. Equation (3.4.13) implies:

$$\Theta^{(\alpha)} = D\bigl(dt_{(\alpha\beta)}\bigr) \cdot D\bigl(t_{(\alpha\beta)}\bigr)^{-1} + D\bigl(t_{(\alpha\beta)}\bigr) \cdot \Theta^{(\beta)} \cdot D\bigl(t_{(\alpha\beta)}\bigr)^{-1} \tag{3.4.14}$$

which is consistent with the gauge transformation rule (3.3.89) if we identify Θ as in (3.4.12). The outcome of the above discussion is that once we have defined a connection one-form $\mathcal{A}$ on a principal bundle $P(\mathcal{M}, G)$, a connection is induced on any associated vector bundle $E \stackrel{\pi}{\Longrightarrow} \mathcal{M}$. It is simply defined as follows:

$$\forall s \in \Gamma(E, \mathcal{M}) : \quad \nabla s = ds + D(\mathcal{A})\, s \tag{3.4.15}$$

where D() denotes the linear representation of both G and $\mathbb{G}$ that defines the associated bundle.
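The step from (3.4.13) to (3.4.14) can be spelled out as follows. This is a sketch under the index conventions of (3.4.10), with the shorthand $D \equiv D(t_{(\alpha\beta)})$ introduced here only for brevity, and with $dD$ standing for what the text writes as $D(dt_{(\alpha\beta)})$.

```latex
\begin{aligned}
\nabla s_i^{(\alpha)}
  &= \nabla\bigl(D_i{}^{j}\, s_j^{(\beta)}\bigr)
   = dD_i{}^{j}\, s_j^{(\beta)} + D_i{}^{j}\, \Theta^{(\beta)}{}_j{}^{k}\, s_k^{(\beta)}
   && \text{property (b) of Definition 3.4.2} \\
  &= \bigl[\, dD \cdot D^{-1} + D \cdot \Theta^{(\beta)} \cdot D^{-1} \bigr]_i{}^{l}\, s_l^{(\alpha)}
   && \text{using } s_j^{(\beta)} = (D^{-1})_j{}^{l}\, s_l^{(\alpha)} \\
  &\equiv \Theta^{(\alpha)}{}_i{}^{l}\, s_l^{(\alpha)}
   && \text{by (3.4.10) in the chart } U_\alpha
\end{aligned}
```

Reading off the matrix in square brackets reproduces (3.4.14).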

3.5 An Illustrative Example of Fibre-Bundle and Connection

In this section we discuss a simple example that illustrates both the general definition of fibre-bundle and that of principal connection. The chosen example has an intrinsic physical interest since it corresponds to the mathematical description of a magnetic monopole in the standard electromagnetic theory. It is also geometrically relevant since it shows how a differentiable manifold can sometimes be reinterpreted as the total space of a fibre bundle constructed on a smaller manifold.

3.5.1 The Magnetic Monopole and the Hopf Fibration of S3

We introduce our case study, defining a family of principal bundles that depend on an integer $n \in \mathbb{Z}$. Explicitly let $P(S^2, \mathrm{U}(1))$ be a principal bundle where the base manifold is a 2-sphere $\mathcal{M} = S^2$, the structural group is a circle $S^1 \sim \mathrm{U}(1)$ and, by definition of principal bundle, the standard fibre coincides with the structural group $F = S^1$. An element $f \in F$ of this fibre is just a complex number of modulus one, $f = \exp[i\theta]$. To describe this bundle we need an atlas of local trivializations and hence, to begin with, an atlas of open charts of the base manifold $S^2$. We use the two charts provided by the stereographic projection. Defining $S^2$ via the standard quadratic equation in $\mathbb{R}^3$:

$$S^2 \subset \mathbb{R}^3 : \quad \{x_1, x_2, x_3\} \in S^2 \;\leftrightarrow\; x_1^2 + x_2^2 + x_3^2 = 1 \tag{3.5.1}$$

we define the two open neighborhoods of the North and the South Pole, respectively named $H^+$ and $H^-$, that correspond, in $\mathbb{R}^3$ language, to the exclusion of either the point {0, 0, −1} or the point {0, 0, 1}. These neighborhoods are shown in Fig. 3.15. As we already know from Chap. 2, the stereographic projections $s_N$ and $s_S$ map $H^-$ and $H^+$ onto the complex plane $\mathbb{C}$ as follows:

$$\begin{aligned} s_N : H^- \to \mathbb{C}, \qquad z_N &\equiv s_N(x_1, x_2, x_3) = \frac{x_1 + i x_2}{1 - x_3} \\ s_S : H^+ \to \mathbb{C}, \qquad z_S &\equiv s_S(x_1, x_2, x_3) = \frac{x_1 - i x_2}{1 + x_3} \end{aligned} \tag{3.5.2}$$

and on the intersection $H^- \cap H^+$ we have the transition function $z_N = 1/z_S$ (see (2.2.19)).
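The relation $z_N = 1/z_S$ can be checked numerically. The sketch below evaluates both stereographic projections of (3.5.2) at one arbitrarily chosen point of the sphere (the angles 1.1 and 2.3 are illustrative values, not taken from the text) and multiplies the results; the function name is hypothetical.

```python
import math

def stereographic_pair(theta, phi):
    """Return (z_N, z_S) of (3.5.2) for the point of S^2 with polar angles (theta, phi)."""
    x1 = math.sin(theta) * math.cos(phi)
    x2 = math.sin(theta) * math.sin(phi)
    x3 = math.cos(theta)
    z_north = complex(x1, x2) / (1.0 - x3)    # s_N, undefined at the North Pole x3 = 1
    z_south = complex(x1, -x2) / (1.0 + x3)   # s_S, undefined at the South Pole x3 = -1
    return z_north, z_south

# Illustrative point away from both poles.
z_n, z_s = stereographic_pair(theta=1.1, phi=2.3)
product = z_n * z_s
print(abs(product - 1.0) < 1e-12)
```

The product equals $(x_1^2 + x_2^2)/(1 - x_3^2)$, which is 1 on the sphere.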

Fig. 3.15 The two open charts H+ and H− covering the two-sphere

Fig. 3.16 Polar coordinates on the 2-sphere

We construct our fibre-bundle by introducing two local trivializations, respectively associated with the two open charts $H^-$ and $H^+$:

$$\begin{aligned} \phi_N^{-1} : \pi^{-1}\bigl(H^-\bigr) \to H^- \otimes \mathrm{U}(1); \qquad \phi_N^{-1}\{x_1, x_2, x_3\} &= \bigl(z_N, \exp[i\psi_N]\bigr) \\ \phi_S^{-1} : \pi^{-1}\bigl(H^+\bigr) \to H^+ \otimes \mathrm{U}(1); \qquad \phi_S^{-1}\{x_1, x_2, x_3\} &= \bigl(z_S, \exp[i\psi_S]\bigr) \end{aligned} \tag{3.5.3}$$

To complete the construction of the bundle we still need to define a transition function between the two local trivializations, namely a map:

$$t_{SN} : H^- \cap H^+ \to \mathrm{U}(1) \tag{3.5.4}$$

such that

$$\bigl(z_S, \exp[i\psi_S]\bigr) = \left(\frac{1}{z_N},\; t_{SN}(z_N)\, \exp[i\psi_N]\right) \tag{3.5.5}$$

In order to illustrate the geometrical interpretation of the transition function we are going to write, it is convenient to make use of polar coordinates on the two-sphere. Following the conventions of Fig. 3.16, we parameterize the points of $S^2$ by means of the two angles θ ∈ [0, π] and φ ∈ [0, 2π]:

x1 = sin θ cos φ; x2 = sin θ sin φ; x3 = cos θ (3.5.6)

On the intersection $H^+ \cap H^-$ we obtain:

$$z_N = \frac{\sin\theta}{1 - \cos\theta}\, \exp[i\phi]; \qquad z_S = \frac{\sin\theta}{1 + \cos\theta}\, \exp[-i\phi] \tag{3.5.7}$$

and we see that the azimuth angle φ parameterizes the points of the equator, identified by the condition $\theta = \pi/2 \Rightarrow x_3 = 0$. Then we write the transition function as follows:

$$t_{SN}(z_N) = \exp[i n \phi] = \left(\frac{z_N}{|z_N|}\right)^{n} = \left(\frac{|z_S|}{z_S}\right)^{n}; \qquad n \in \mathbb{Z} \tag{3.5.8}$$

The catch in the above equation is that each choice of the integer number n identifies a different inequivalent principal bundle. The difference is that while going around the equator the transition function can cover the structural group S1, once, twice or an integer number of times. The transition function can also go clockwise or anti-clockwise and this accounts for the sign of n. We will see that the integer n characterizing the bundle can be interpreted as the magnetic charge of a magnetic monopole. Before going into that let us first consider the special properties of the case n = 1.
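The statement that the transition function covers the structural group an integer number of times can be illustrated numerically. The sketch below (function names are hypothetical) samples exp[inφ] around the equator and accumulates the phase increments between neighbouring samples; the total divided by 2π is the winding number, whose sign records the orientation.

```python
import cmath
import math

def transition(n, phi):
    """Transition function t_SN = exp[i n phi] of (3.5.8) restricted to the equator."""
    return cmath.exp(1j * n * phi)

def winding_number(n, samples=2000):
    """Count how many times t_SN winds around U(1) as phi runs from 0 to 2*pi."""
    total_phase = 0.0
    for k in range(samples):
        phi_start = 2.0 * math.pi * k / samples
        phi_end = 2.0 * math.pi * (k + 1) / samples
        # Phase of the ratio = small angular step of the image point on the circle.
        total_phase += cmath.phase(transition(n, phi_end) / transition(n, phi_start))
    return round(total_phase / (2.0 * math.pi))

windings = [winding_number(n) for n in (-2, -1, 0, 1, 3)]
print(windings)
```

The number of samples only needs to be large enough for each phase step to stay well below π.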

The Case n = 1 and the Hopf Fibration of $S^3$. Let us name $P_{(n)}(S^2, \mathrm{U}(1))$ the principal bundles we have constructed in the previous section. Here we want to show that the total space of the bundle $P_{(1)}(S^2, \mathrm{U}(1))$ is actually the 3-sphere $S^3$. To this effect we define $S^3$ as the locus in $\mathbb{R}^4$ of the standard quadric:

$$S^3 \subset \mathbb{R}^4 : \quad \{X_1, X_2, X_3, X_4\} \in S^3 \;\leftrightarrow\; X_1^2 + X_2^2 + X_3^2 + X_4^2 = 1 \tag{3.5.9}$$

and we introduce the Hopf map

π : S3 → S2 (3.5.10) which is explicitly given by:

$$\begin{aligned} x_1 &= 2\,(X_1 X_3 + X_2 X_4) \\ x_2 &= 2\,(X_2 X_3 - X_1 X_4) \\ x_3 &= X_1^2 + X_2^2 - X_3^2 - X_4^2 \end{aligned} \tag{3.5.11}$$

It is immediately verified that the $x_i$ (i = 1, 2, 3) given in (3.5.11) satisfy (3.5.1) if the $X_i$ (i = 1, 2, 3, 4) satisfy (3.5.9). Hence this is indeed a projection map from $S^3$ to $S^2$ as claimed in (3.5.10). We want to show that π can be interpreted as the projection map in the principal fibre bundle $P_{(1)}(S^2, \mathrm{U}(1))$. To obtain this result we need first to show that for each point p of the base $S^2$ the fibre over p is diffeomorphic to a circle, namely $\pi^{-1}(p) \sim \mathrm{U}(1)$. To this effect we begin by complexifying the coordinates of $\mathbb{R}^4$

Z1 = X1 + iX2; Z2 = X3 + iX4 (3.5.12) so that the equation defining S3 (3.5.9) becomes the following equation in C2:

$$|Z_1|^2 + |Z_2|^2 = 1 \tag{3.5.13}$$

Then we name $U^- \subset S^3$ and $U^+ \subset S^3$ the two open neighborhoods where we respectively have $Z_2 \neq 0$ and $Z_1 \neq 0$. The Hopf map (3.5.11) can be combined with the stereographic projection of the North Pole in the neighborhood $U^-$ and with the stereographic projection of the South Pole in the neighborhood $U^+$, obtaining:

$$\begin{aligned} \forall \{Z_1, Z_2\} \in U^- \subset S^3; \qquad s_N \circ \pi(Z_1, Z_2) &= \frac{Z_1}{Z_2} \\ \forall \{Z_1, Z_2\} \in U^+ \subset S^3; \qquad s_S \circ \pi(Z_1, Z_2) &= \frac{Z_2}{Z_1} \end{aligned} \tag{3.5.14}$$

From both local descriptions of the Hopf map we see that:

∀λ ∈ C π(λZ1,λZ2) = π(Z1,Z2) (3.5.15)

On the other hand if $(Z_1, Z_2) \in S^3$ we have that $(\lambda Z_1, \lambda Z_2) \in S^3$ if and only if $\lambda \in \mathrm{U}(1)$, namely if $|\lambda|^2 = 1$. This proves the desired result that $\pi^{-1}(p) \sim \mathrm{U}(1)$ for all points p of the base $S^2$. The two equations (3.5.14) can also be interpreted as two local trivializations of a principal U(1) bundle with $S^2$ base manifold whose total space is the 3-sphere $S^3$. Indeed we can rewrite them in the opposite direction as follows:

$$\begin{aligned} \phi_N^{-1} : U^- \equiv \pi^{-1}\bigl(H^-\bigr) \to H^- \otimes \mathrm{U}(1); \qquad [Z_1, Z_2] &\to \left(\frac{Z_1}{Z_2},\; \frac{Z_2}{|Z_2|}\right) \\ \phi_S^{-1} : U^+ \equiv \pi^{-1}\bigl(H^+\bigr) \to H^+ \otimes \mathrm{U}(1); \qquad [Z_1, Z_2] &\to \left(\frac{Z_2}{Z_1},\; \frac{Z_1}{|Z_1|}\right) \end{aligned} \tag{3.5.16}$$

This time the transition function does not have to be invented; rather it is decided a priori by the Hopf map. Indeed we immediately read it from equations (3.5.16):

$$t_{SN} = \frac{Z_1}{|Z_1|}\, \frac{|Z_2|}{Z_2} = \frac{Z_1/Z_2}{|Z_1/Z_2|} = \frac{z_N}{|z_N|} \tag{3.5.17}$$

which coincides with (3.5.8) for n = 1. This is the result we wanted to prove: the 3-sphere is the total space of the principal fibre bundle $P_{(1)}(S^2, \mathrm{U}(1))$.
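The chain of claims about the Hopf map lends itself to a numerical spot check. The sketch below takes one arbitrarily chosen point of $S^3$ (the raw coordinates and the phase 0.7 are illustrative values) and verifies that its image lies on $S^2$, that the North stereographic coordinate equals $Z_1/Z_2$, that a U(1) rotation of $(Z_1, Z_2)$ leaves the image unchanged, and that the transition function agrees with (3.5.17). The helper name `hopf` is hypothetical.

```python
import math

def hopf(z1, z2):
    """Hopf map (3.5.11) written in the complex coordinates (3.5.12)."""
    w = 2 * z1 * z2.conjugate()               # equals x1 + i*x2
    return w.real, w.imag, abs(z1) ** 2 - abs(z2) ** 2

# Illustrative point of R^4, normalised so that it lies on S^3.
raw = (0.3, -0.4, 0.5, 0.6)
norm = math.sqrt(sum(c * c for c in raw))
X1, X2, X3, X4 = (c / norm for c in raw)
Z1, Z2 = complex(X1, X2), complex(X3, X4)

x1, x2, x3 = hopf(Z1, Z2)
radius_sq = x1 ** 2 + x2 ** 2 + x3 ** 2       # should be 1: the image lies on S^2
z_north = complex(x1, x2) / (1.0 - x3)        # s_N applied to the image, cf. (3.5.14)

lam = complex(math.cos(0.7), math.sin(0.7))   # an element of U(1)
rotated = hopf(lam * Z1, lam * Z2)            # same point of S^2, cf. (3.5.15)

t_sn = (Z1 / abs(Z1)) / (Z2 / abs(Z2))        # ratio of the two fibre coordinates in (3.5.16)
print(radius_sq, z_north, Z1 / Z2, t_sn)
```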

The U(1)-Connection of the Dirac Magnetic Monopole. The Dirac (see Fig. 3.17) monopole [22] of magnetic charge n corresponds to introducing a vector potential $A = A_i(x)\, dx^i$ that is a well-defined connection on the principal U(1)-bundle $P_{(n)}(S^2, \mathrm{U}(1))$ we discussed above. Physically the discussion goes as follows. Let us work in ordinary three dimensional space $\mathbb{R}^3$ and assume that in the origin of our coordinate system there is a magnetically charged particle. Then we observe a magnetic field of the form:

$$\mathbf{H} = \frac{m}{r^3}\, \mathbf{r} \quad \Leftrightarrow \quad H_i = \frac{m}{r^3}\, x_i \tag{3.5.18}$$

From the relation $F_{ij} = \varepsilon_{ijk} H_k$ we conclude that the electromagnetic field strength associated with such a magnetic monopole is the following 2-form on $\mathbb{R}^3 - 0$:

$$F = \frac{m}{r^3}\, \varepsilon_{ijk}\, x^i\, dx^j \wedge dx^k \tag{3.5.19}$$

Fig. 3.17 Paul Adrien Maurice Dirac (1902-1984). Together with Einstein, Schrödinger, Pauli and Heisenberg, Dirac is one of the founders of the new vision of the world encoded in XXth century physics. For the revolutionary originality of his contributions he ranks second to no-one, including Einstein. The invention of the Dirac equation for the electron wave-function shares the same degree of intellectual audacity and scientific creativity as the invention of the Einstein equation for the metric. Similarly to Einstein's case, Dirac's mathematical discovery led to the prediction of new, fully unexpected phenomena such as the existence of the positron and, in general, of anti-matter. Paul A.M. Dirac was born in England to a French-speaking father who had immigrated from Switzerland and a British mother. He studied engineering in Bristol and only in 1923 did he join Cambridge University, where he developed most of his career. He was awarded the Nobel Prize in Physics in 1933 together with Erwin Schrödinger for the discovery of new productive forms of atomic theory. He spent the last fourteen years of his life in the United States of America, where he was professor in Florida at Miami Coral Gables and then Tallahassee. Among the many scientific achievements of Dirac there is also the quantum theory of magnetic monopoles. His paper on the subject dates from 1931. There he showed that if magnetically charged particles existed in Nature this would account for the mutual quantization of electric and magnetic charges. Dirac's argument amounts, in contemporary mathematical language, to the discussion presented in this section about fibre-bundles. Quantization of the charge is a topological effect since the exponent in the transition function of a U(1) bundle, namely the number n, is necessarily an integer

Indeed (3.5.19) makes sense everywhere in $\mathbb{R}^3$ except at the origin {0, 0, 0}. Using polar coordinates rather than Cartesian ones, the 2-form F assumes a very simple expression:

$$F = \frac{g}{2}\, \sin\theta\, d\theta \wedge d\phi \tag{3.5.20}$$

namely it is proportional to the volume form on the 2-sphere $\mathrm{Vol}(S^2) \equiv \sin\theta\, d\theta \wedge d\phi$. We leave it as an exercise for the reader to calculate the relation between the constant g appearing in (3.5.20) and the constant m appearing in (3.5.19). Hence the 2-form F can be integrated on any 2-sphere of arbitrary radius and the result, according to Gauss law, is proportional to the charge of the magnetic monopole. Explicitly we have:

$$\int_{S^2} F = \frac{g}{2} \int_0^{\pi} \sin\theta\, d\theta \int_0^{2\pi} d\phi = 2\pi g \tag{3.5.21}$$

Consider now the problem of writing a vector potential for this magnetic field. By definition, in each local trivialization $(U_\alpha, \phi_\alpha)$ of the underlying U(1) bundle we must have:

$$F = dA_\alpha \tag{3.5.22}$$

where $A_\alpha$ is a 1-form on $U_\alpha$. Yet we have just observed that F is proportional to the volume form on $S^2$, which is a closed but not exact form. Indeed if the volume form were exact, namely if there existed a global section ω of $T^*S^2$ such that Vol = dω, then, using Stokes theorem, we would arrive at the absurd conclusion that the volume of the sphere vanishes: indeed $\int_{S^2} \mathrm{Vol} = \int_{\partial S^2} \omega = 0$ since the 2-sphere has no boundary. Hence there cannot be a vector potential globally defined on the 2-sphere. The best we can do is to have two different definitions of the vector potential, one in the North Pole patch and one in the South Pole patch. Recalling that the vector potential is the component of a connection on a U(1) bundle, we multiply it by the Lie algebra generator of U(1) and we write the ansätze for the full connection in the two patches:

$$\begin{aligned} \text{on } U^-; \qquad \mathcal{A} = \mathcal{A}_N &= \frac{i}{2}\, g\, (1 - \cos\theta)\, d\phi \\ \text{on } U^+; \qquad \mathcal{A} = \mathcal{A}_S &= -\frac{i}{2}\, g\, (1 + \cos\theta)\, d\phi \end{aligned} \tag{3.5.23}$$

As one sees, the first definition of the connection does not vanish as long as θ ≠ 0, namely as long as we stay away from the North Pole, while the second definition does not vanish as long as θ ≠ π, namely as long as we stay away from the South Pole. On the other hand if we calculate the exterior derivative of either $\mathcal{A}_N$ or $\mathcal{A}_S$ we obtain the same result, namely i times the 2-form (3.5.20):

$$\mathfrak{F} = d\mathcal{A}_S = d\mathcal{A}_N = i \times \frac{g}{2}\, \sin\theta\, d\theta \wedge d\phi \tag{3.5.24}$$

The transition function between the two local trivializations can now be reconstructed from:

$$\mathcal{A}_S = \mathcal{A}_N + t_{SN}^{-1}\, dt_{SN}, \qquad t_{SN}^{-1}\, dt_{SN} = -\,i\, g\, d\phi \tag{3.5.25}$$

which implies

$$t_{NS} = \exp[i g \phi] \tag{3.5.26}$$

and by comparison with (3.5.8) we see that the magnetic charge g is the integer n classifying the principal U(1) bundle. The reason why it has to be an integer was already noted. On the equator of the 2-sphere the transition function realizes a map of the circle $S^1$ into itself. Such a map winds a certain number of times but cannot wind a non-integer number of times: otherwise its image is no longer a closed circle.

In this way we have seen that the quantization of the magnetic charge is a topological condition.
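Two statements of the monopole discussion can be checked numerically: the flux (3.5.21) and the fact that the two potentials of (3.5.23) differ by the θ-independent one-form −ig dφ. The sketch below uses a midpoint rule for the θ integral; the function names and the sample value g = 3 are illustrative assumptions, and the potentials are represented only by their dφ coefficients.

```python
import math

def flux_through_sphere(g, steps=2000):
    """Midpoint-rule value of the integral of (g/2) sin(theta) dtheta dphi over S^2."""
    d_theta = math.pi / steps
    theta_integral = sum(math.sin((k + 0.5) * d_theta) for k in range(steps)) * d_theta
    return 0.5 * g * theta_integral * 2.0 * math.pi

def a_north(g, theta):
    """Coefficient of dphi in the North-patch connection of (3.5.23)."""
    return 0.5j * g * (1.0 - math.cos(theta))

def a_south(g, theta):
    """Coefficient of dphi in the South-patch connection of (3.5.23)."""
    return -0.5j * g * (1.0 + math.cos(theta))

g = 3  # illustrative integer charge
flux = flux_through_sphere(g)
differences = [a_south(g, th) - a_north(g, th) for th in (0.3, 1.2, 2.5)]
print(flux, 2.0 * math.pi * g, differences)
```

The difference is the same at every sampled latitude, which is what allows it to be written as a pure gauge term on the overlap of the two patches.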

3.6 Riemannian and Pseudo-Riemannian Metrics: The Mathematical Definition

We have discussed at length the notion of connections on fibre bundles. From the historical review presented at the beginning of the chapter we know that connections appeared first in relation with the metric geometry introduced by Gauss and Riemann and developed by Ricci, Levi Civita and Bianchi. It is now time to come to the rigorous modern mathematical definition of metrics. The mathematical environment in which we will work is that provided by a differentiable manifold $\mathcal{M}$ of dimension $\dim \mathcal{M} = m$. As emphasized several times in Chap. 2, we can construct many different fibre-bundles on the same base manifold $\mathcal{M}$, but there are two which are intrinsically defined by the differentiable structure of $\mathcal{M}$, namely the tangent bundle $T\mathcal{M} \stackrel{\pi}{\Longrightarrow} \mathcal{M}$ and its dual, the cotangent bundle $T^*\mathcal{M} \stackrel{\pi}{\Longrightarrow} \mathcal{M}$. Of these bundles we can take the tensor powers (symmetric or antisymmetric or none), which are bundles whose sections transform with the corresponding tensor powers of the transition functions of the original bundle. The sections of these tensor powers are the tensors introduced by Ricci and Levi Civita in their absolute differential calculus, for whose differentiation Christoffel had invented his symbols (see (3.2.7) and (3.2.8)). For brevity let us use a special notation for the space of sections of the tangent bundle, namely for the algebra of vector fields: $\mathfrak{X}(\mathcal{M}) \equiv \Gamma(T\mathcal{M}, \mathcal{M})$. Then we have

Definition 3.6.1 A pseudo-Riemannian metric on a manifold M is a C∞(M ), symmetric, bilinear, non-degenerate form on X(M ), namely it is a map:

$$g(\,,) : \mathfrak{X}(\mathcal{M}) \otimes \mathfrak{X}(\mathcal{M}) \longrightarrow C^\infty(\mathcal{M})$$

satisfying the following properties

(i) $\forall X, Y \in \mathfrak{X}(\mathcal{M}) : \; g(X, Y) = g(Y, X) \in C^\infty(\mathcal{M})$
(ii) $\forall X, Y, Z \in \mathfrak{X}(\mathcal{M}),\; \forall f, h \in C^\infty(\mathcal{M}) : \; g(fX + hY, Z) = f\, g(X, Z) + h\, g(Y, Z)$
(iii) $\{\forall X \in \mathfrak{X}(\mathcal{M}),\; g(X, Y) = 0\} \Rightarrow Y \equiv 0$

Definition 3.6.2 A Riemannian metric on a manifold $\mathcal{M}$ is a pseudo-Riemannian metric which in addition satisfies the following two properties

(iv) $\forall X \in \mathfrak{X}(\mathcal{M}) : \; g(X, X) \geq 0$
(v) $g(X, X) = 0 \Rightarrow X \equiv 0$

The two additional properties satisfied by a Riemannian metric state that it should be not only a symmetric non-degenerate form but also a positive definite one. The definition of pseudo-Riemannian and Riemannian metrics can be summarized by saying that a metric is a section of the second symmetric tensor power of the cotangent bundle, namely we have:

$$g \in \Gamma\Bigl(\textstyle\bigotimes^2_{\mathrm{symm}} T^*\mathcal{M},\, \mathcal{M}\Bigr) \tag{3.6.1}$$

In a coordinate patch the metric is described by a symmetric rank two tensor gμν(x). Indeed we can write:

$$g = g_{\mu\nu}(x)\, dx^\mu \otimes dx^\nu \tag{3.6.2}$$

and we have:

$$\forall X, Y \in \mathfrak{X}(\mathcal{M}) : \quad g(X, Y) = g_{\mu\nu}(x)\, X^\mu(x)\, Y^\nu(x) \tag{3.6.3}$$

if

$$X = X^\mu(x)\, \vec{\partial}_\mu; \qquad Y = Y^\nu(x)\, \vec{\partial}_\nu \tag{3.6.4}$$

are the local expressions of the two vector fields in the considered coordinate patch. The essential point in the definition of the metric is the statement that $g(X, Y) \in C^\infty(\mathcal{M})$, namely that g(X, Y) should be a scalar. This implies that the coefficients $g_{\mu\nu}(x)$ should transform from one coordinate patch to the next with the inverse of the transition functions with which the vector field components transform. This is what (3.6.1) implies. On the other hand, according to the viewpoint started by Gauss and developed by Riemann, Ricci and Levi Civita, the metric is the mathematical structure which allows one to define infinitesimal lengths and, by integration, the length of any curve. Indeed, given a curve $\mathcal{C} : [0, 1] \longrightarrow \mathcal{M}$, let t(s) be its tangent vector at any point of the curve $\mathcal{C}$, parameterized by s ∈ [0, 1]: we define the length of the curve as:

$$s(\mathcal{C}) \equiv \int_0^1 \sqrt{g\bigl(t(s), t(s)\bigr)}\; ds \tag{3.6.5}$$
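As an illustration of the length formula (3.6.5), the sketch below integrates $\sqrt{g(t, t)}$ numerically along a circle of constant latitude on the unit 2-sphere. The round metric $d\theta^2 + \sin^2\theta\, d\phi^2$ and the chosen latitude are assumptions made for the example (they are not taken from this section), the tangent vector is obtained by a central finite difference, and a positive definite metric is assumed so that the square root is real.

```python
import math

def sphere_metric(theta, phi):
    """Components g_{mu nu} of the round metric on the unit 2-sphere in (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def curve_length(curve, metric, steps=4000, eps=1e-6):
    """Midpoint-rule approximation of (3.6.5) for a curve s -> (theta(s), phi(s))."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        ahead, behind = curve(s + eps), curve(s - eps)
        tangent = [(a - b) / (2.0 * eps) for a, b in zip(ahead, behind)]
        g = metric(*curve(s))
        norm_sq = sum(g[m][n] * tangent[m] * tangent[n]
                      for m in range(2) for n in range(2))
        total += math.sqrt(norm_sq) * h
    return total

theta0 = math.pi / 3                      # illustrative latitude

def latitude_circle(s):
    """Circle of constant theta = theta0, traversed once for s in [0, 1]."""
    return (theta0, 2.0 * math.pi * s)

length = curve_length(latitude_circle, sphere_metric)
print(length, 2.0 * math.pi * math.sin(theta0))
```

The numerical value can be compared with the elementary result $2\pi \sin\theta_0$ for the circumference of a latitude circle.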

3.6.1 Signatures

At each point of the manifold $p \in \mathcal{M}$, the coefficients $g_{\mu\nu}(p)$ of a metric g constitute a symmetric non-degenerate real matrix. Such a matrix can always be diagonalized by means of an orthogonal transformation $\mathcal{O}(p) \in \mathrm{SO}(N)$, which obviously varies from point to point, namely we can write:

$$g_{\mu\nu}(p) = \mathcal{O}^T(p) \begin{pmatrix} \lambda_1(p) & 0 & \dots & 0 & 0 \\ 0 & \lambda_2(p) & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & \dots & \dots & \lambda_{N-1}(p) & 0 \\ 0 & \dots & \dots & 0 & \lambda_N(p) \end{pmatrix} \mathcal{O}(p) \tag{3.6.6}$$

where the real numbers $\lambda_i(p)$ are the eigenvalues. Each of them depends on the point $p \in \mathcal{M}$, but none of them can vanish, otherwise the determinant of the metric would be zero, which contradicts one of the defining axioms. Consider next the diagonal matrix:

$$\mathcal{L}(p) = \begin{pmatrix} \frac{1}{\sqrt{|\lambda_1(p)|}} & 0 & \dots & 0 & 0 \\ 0 & \frac{1}{\sqrt{|\lambda_2(p)|}} & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & \dots & \dots & \frac{1}{\sqrt{|\lambda_{N-1}(p)|}} & 0 \\ 0 & \dots & \dots & 0 & \frac{1}{\sqrt{|\lambda_N(p)|}} \end{pmatrix} \tag{3.6.7}$$

The matrix $M = \mathcal{L}^{-1} \cdot \mathcal{O}$, where for brevity the dependence on the point p has been omitted, is such that:

$$g_{\mu\nu} = M^T \begin{pmatrix} \mathrm{sign}[\lambda_1] & 0 & \dots & 0 & 0 \\ 0 & \mathrm{sign}[\lambda_2] & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & \dots & \dots & \mathrm{sign}[\lambda_{N-1}] & 0 \\ 0 & \dots & \dots & 0 & \mathrm{sign}[\lambda_N] \end{pmatrix} M \tag{3.6.8}$$

having denoted by sign[x] the function which takes the value 1 if x > 0 and the value −1 if x < 0. Hence we arrive at the following conclusion and definition.

Definition 3.6.3 Let $\mathcal{M}$ be a differentiable manifold of dimension m endowed with a metric g. At every point $p \in \mathcal{M}$, by means of a transformation $g \to S^T \cdot g \cdot S$, the metric tensor $g_{\mu\nu}(p)$ can be reduced to a diagonal matrix with p entries equal to 1 and m − p entries equal to −1. The pair of integers (p, m − p) is named the signature of the metric g.

The rationale of the above definition is that the signature of a metric is an intrinsic property of g, independent both from the chosen coordinate patch and from the chosen point. The proof of this statement relies on a theorem proved in 1852 by the English Mathematician James Joseph Sylvester (see Fig. 3.18) and named by him the Law of Inertia of Quadratic Forms [23]. According to Sylvester's theorem a symmetric non-degenerate matrix A can always be transformed into a diagonal one with ±1 entries by means of a substitution A → BT · A · B. On the other hand no such transformation can alter the signature

Fig. 3.18 James Joseph Sylvester (1814-1897) was an eminent English Mathematician who gave fundamental contributions in matrix theory, group theory and number theory. Being of Jewish origins, he suffered discrimination several times in the course of his career and also had to sue the Royal Military Academy, which refused to pay his full pension. He crossed the Atlantic Ocean twice to become an American Professor: once in 1843 at the University of Virginia, where he stayed only six months, and a second time in 1877, when he was appointed inaugural professor of mathematics at the newly founded Johns Hopkins University in Maryland. In the USA he founded the American Journal of Mathematics. Towards the end of his long life he returned to England and finally received several honors from the Royal Society, which in 1901, after his death, instituted the Sylvester Medal in his memory

(p, m − p), which is intrinsic to the matrix A. This is what happens for a single matrix. Consider now a point-dependent matrix like the metric tensor g, whose entries are smooth functions. Defining s = 2p − m as the difference between the number of positive and negative eigenvalues of g, it follows that s is also a smooth function. Yet s is an integer by definition. Hence it has to be a constant. The metrics on a differentiable manifold are therefore intrinsically characterized by their signatures. Riemannian are the positive definite metrics with signature (m, 0). Lorentzian are the metrics with signature (1, m − 1), just as the flat metric of Minkowski space. There are also metrics with more elaborate signatures which appear in certain mathematical problems, although they are not immediately relevant for General Relativity.
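Sylvester's law can be seen at work in a small computation. The sketch below reads the signature of a symmetric matrix from the signs of the pivots of a symmetric elimination (an $LDL^T$ factorisation, which is itself a congruence) and checks that a substitution $A \to B^T A B$ with invertible B leaves it unchanged. The function names, the matrix B and the restriction to matrices whose leading pivots are all nonzero are assumptions of this illustration, not part of the text.

```python
def signature(matrix):
    """Return (number of positive pivots, number of negative pivots).

    Elimination without pivoting: assumes every leading pivot is nonzero.
    """
    a = [[float(entry) for entry in row] for row in matrix]
    size = len(a)
    positive = negative = 0
    for k in range(size):
        pivot = a[k][k]
        if pivot == 0.0:
            raise ValueError("zero pivot: this simple sketch does not handle it")
        if pivot > 0:
            positive += 1
        else:
            negative += 1
        for i in range(k + 1, size):
            factor = a[i][k] / pivot
            for j in range(k, size):
                a[i][j] -= factor * a[k][j]
    return positive, negative

def congruence(b, a):
    """Return B^T A B for square matrices given as lists of rows."""
    size = len(a)
    ab = [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
          for i in range(size)]
    return [[sum(b[k][i] * ab[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

eta = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]      # flat Lorentzian metric, m = 3
b = [[2, 1, 0], [0, 1, 1], [1, 0, 3]]          # invertible (determinant 7)
g = congruence(b, eta)                         # same metric in another basis
print(signature(eta), signature(g))
```

Both matrices yield one positive and two negative pivots, namely the Lorentzian signature (1, m − 1) for m = 3.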

3.7 The Levi Civita Connection

Having established the rigorous mathematical notion of both a metric and a connection, we come back to the ideas of Riemannian curvature and Torsion which were heuristically touched upon in the course of our historical outline. In particular we are now in a position to derive from clear-cut mathematical principles the Christoffel symbols anticipated in (3.2.7), the Riemann tensor mentioned in (3.2.10) and the Torsion tensor sketched in (3.2.13). The starting point for the implementation of this plan is provided by a careful consideration of the special properties of affine connections.

3.7.1 Affine Connections

In Definitions 3.4.1, 3.4.2 we fixed the notion of a connection on a generic vector bundle $E \stackrel{\pi}{\Longrightarrow} \mathcal{M}$. In particular we can consider the tangent bundle $T\mathcal{M} \stackrel{\pi}{\Longrightarrow} \mathcal{M}$. A connection on $T\mathcal{M}$ is named affine. It follows that we can give the following:

Definition 3.7.1 Let M be an m-dimensional differentiable manifold, an affine connection on M is a map

∇:X(M ) × X(M ) → X(M ) which satisfies the following properties:

(i) $\forall X, Y, Z \in \mathfrak{X}(\mathcal{M}) : \; \nabla_X(Y + Z) = \nabla_X Y + \nabla_X Z$
(ii) $\forall X, Y, Z \in \mathfrak{X}(\mathcal{M}) : \; \nabla_{(X + Y)} Z = \nabla_X Z + \nabla_Y Z$
(iii) $\forall X, Y \in \mathfrak{X}(\mathcal{M}),\; \forall f \in C^\infty(\mathcal{M}) : \; \nabla_{fX} Y = f\, \nabla_X Y$
(iv) $\forall X, Y \in \mathfrak{X}(\mathcal{M}),\; \forall f \in C^\infty(\mathcal{M}) : \; \nabla_X(fY) = X[f]\, Y + f\, \nabla_X Y$

Clearly also affine connections are encoded into corresponding connection one-forms, which are traditionally denoted by the symbol Γ. In the affine case Γ is gl(m, R)-Lie algebra valued since the structural group of $T\mathcal{M}$ is GL(m, R). Let $\{e_\mu\}$ be a basis of sections of the tangent bundle so that any vector field $X \in \mathfrak{X}(\mathcal{M})$ can be written as follows:

$$X = X^\mu(x)\, e_\mu \tag{3.7.1}$$

The connection one-form is defined by calculating the covariant differentials of the basis elements:

$$\nabla e_\nu = \Gamma_\nu{}^{\rho}\, e_\rho \tag{3.7.2}$$

Introduce the dual basis of $T^*\mathcal{M}$, namely the set of one-forms $\omega^\mu$ such that:

$$\omega^\mu(e_\nu) = \delta^\mu_\nu \tag{3.7.3}$$

The matrix-valued one-form Γ can be expanded along such a basis obtaining:

$$\Gamma = \omega^\mu\, \Gamma_\mu \quad \Rightarrow \quad \Gamma_\nu{}^{\rho} = \omega^\mu\, \Gamma_{\mu\nu}{}^{\rho} \tag{3.7.4}$$

The tri-index symbols $\Gamma_{\mu\nu}{}^{\rho}$ encode, patch by patch, the considered affine connection. According to Definition 3.7.1 these connection coefficients are equivalently defined by setting

$$\nabla_{e_\mu} e_\nu \equiv \nabla e_\nu(e_\mu) = \Gamma_{\mu\nu}{}^{\rho}\, e_\rho \tag{3.7.5}$$

3.7.2 Curvature and Torsion of an Affine Connection

To every connection one-form A on a principal bundle P(M ,G), we can associate a curvature 2-form:

$$\mathfrak{F} \equiv d\mathcal{A} + \mathcal{A} \wedge \mathcal{A} \equiv \mathfrak{F}^I\, T_I = \Bigl( d\mathcal{A}^I + \tfrac{1}{2}\, f_{JK}{}^{I}\, \mathcal{A}^J \wedge \mathcal{A}^K \Bigr)\, T_I \tag{3.7.6}$$

which is $\mathbb{G}$ Lie algebra valued and whose fundamental properties and profound physical meaning will be analyzed in Chap. 5 of this volume. Here we just note that, evaluated on any associated vector bundle $E \stackrel{\pi}{\Longrightarrow} \mathcal{M}$, the connection $\mathcal{A}$ becomes a matrix and the same is true of the curvature $\mathfrak{F}$. In that case the first line of (3.7.6) is to be understood in the sense both of matrix multiplication and of wedge product, namely the element (i, j) of $\mathcal{A} \wedge \mathcal{A}$ is calculated as $\mathcal{A}_i{}^{k} \wedge \mathcal{A}_k{}^{j}$, with summation over the dummy index k. We can apply the general formula (3.7.6) to the case of an affine connection. In that case the curvature 2-form is traditionally denoted with the letter $\mathfrak{R}$ in honor of Riemann. We obtain:

$$\mathfrak{R} \equiv d\Gamma + \Gamma \wedge \Gamma \tag{3.7.7}$$

which, using the basis $\{e_\mu\}$ for the tangent bundle and its dual $\{\omega^\nu\}$ for the cotangent bundle, becomes:

$$\mathfrak{R}_\mu{}^{\nu} = d\Gamma_\mu{}^{\nu} + \Gamma_\mu{}^{\rho} \wedge \Gamma_\rho{}^{\nu} = \tfrac{1}{2}\, \omega^\lambda \wedge \omega^\sigma\, \mathcal{R}_{\lambda\sigma\mu}{}^{\nu} \tag{3.7.8}$$

the four index symbols $\mathcal{R}_{\lambda\sigma\mu}{}^{\nu}$ being, by definition, twice the components of the 2-form along the basis $\{\omega^\nu\}$. In particular, in an open chart $U \subset \mathcal{M}$, whose coordinates we denote by $x^\mu$, we can choose the holonomic basis of sections $e_\mu = \partial_\mu \equiv \partial/\partial x^\mu$, whose dual is provided by the differentials $\omega^\nu = dx^\nu$, and we get:

$$\mathcal{R}_{\lambda\sigma\mu}{}^{\nu} = \partial_\lambda \Gamma_{\sigma\mu}{}^{\nu} - \partial_\sigma \Gamma_{\lambda\mu}{}^{\nu} + \Gamma_{\lambda\mu}{}^{\rho}\, \Gamma_{\sigma\rho}{}^{\nu} - \Gamma_{\sigma\mu}{}^{\rho}\, \Gamma_{\lambda\rho}{}^{\nu} \tag{3.7.9}$$

Comparing (3.7.9) with the Riemann-Christoffel symbols of (3.2.10), we see that the latter could be identified with the components of the curvature two-form of an affine connection Γ if the Christoffel symbols introduced in (3.2.7) were the coefficients of such a connection. Which connection is the one described by the Christoffel symbols and how is it defined? The answer is: the Levi Civita connection. Its definition follows in the next paragraph.

Torsion and Torsionless Connections. The notion of torsion was briefly anticipated in our historical outline. It applies only to affine connections and distinguishes them from general connections on generic fibre bundles. Intuitively torsion has to do with the fact that when we parallel transport vectors along a loop the transported vector can differ from the original one not only through a rotation but also through a displacement. While the infinitesimal rotation angle is related to the curvature tensor, the infinitesimal displacement is related to the torsion tensor. This was explicitly displayed in (3.2.12). Rigorously we have the following:

Definition 3.7.2 Let M be an m-dimensional manifold and ∇ denote an affine connection on its tangent bundle. The torsion T∇ is a map:

T∇ : X(M ) × X(M ) → X(M ) defined as follows:

$$\forall X, Y \in \mathfrak{X}(\mathcal{M}) : \quad T_\nabla(X, Y) = -T_\nabla(Y, X) \equiv \nabla_X Y - \nabla_Y X - [X, Y] \in \mathfrak{X}(\mathcal{M})$$

Given a basis of sections of the tangent bundle $\{\vec{e}_\mu\}$ we can calculate their commutators:

$$[e_\mu, e_\nu] = K_{\mu\nu}{}^{\rho}(p)\, e_\rho \tag{3.7.10}$$

where the point dependent coefficients $K_{\mu\nu}{}^{\rho}(p)$ are named the contorsion coefficients. They do not form a tensor, since they depend on the choice of basis. For instance in the holonomic basis $e_\mu = \partial_\mu$ the contorsion coefficients are zero, while they do not vanish in other bases. Notwithstanding their non-tensorial character they can be calculated in any basis and once this is done we obtain a true tensor, namely the torsion from Definition 3.7.2. Explicitly we have:

$$T_\nabla(e_\mu, e_\nu) = \mathcal{T}_{\mu\nu}{}^{\rho}\, e_\rho, \qquad \mathcal{T}_{\mu\nu}{}^{\rho} = \Gamma_{\mu\nu}{}^{\rho} - \Gamma_{\nu\mu}{}^{\rho} - K_{\mu\nu}{}^{\rho} \tag{3.7.11}$$

Definition 3.7.3 An affine connection ∇ is named torsionless if its torsion tensor vanishes identically, namely if T∇ (X, Y) = 0, ∀X, Y ∈ X(M )

It follows from (3.7.11) that the coefficients of a torsionless affine connection are symmetric in the lower indices in the holonomic basis. Indeed if the contorsion vanishes, imposing zero torsion reduces to the condition:

$$\Gamma_{\mu\nu}{}^{\rho} = \Gamma_{\nu\mu}{}^{\rho} \tag{3.7.12}$$

The Levi Civita Metric Connection Consider now the case where the manifold M is endowed with a metric g. Independently from the signature of the latter (Rie- mannian or pseudo-Riemannian) we can define a unique affine connection which preserves the scalar products defined by g and is torsionless. That affine connection is the Levi Civita connection. Explicitly we have the following

Definition 3.7.4 Let (M ,g) be a (pseudo-)Riemannian manifold. The associated Levi Civita connection ∇g is that unique affine connection which satisfies the fol- lowing two conditions: g (i) ∇ is torsionless, namely T∇g (, ) = 0, (ii) The metric is covariantly constant under the transport defined by ∇g, that is: ∀ ∈ M : = ∇g + ∇g Z, X, Y X( ) Zg(X, Y) g( ZX, Y) g(X, ZY).

The idea behind such a definition is very simple and intuitive. Consider two vec- tor fields X, Y. We can measure their scalar product and hence the angle they form by evaluating g(X, Y). Consider now a third vector field Z and let us parallel trans- port the previously given vectors in the direction defined by Z. For an infinitesimal displacement we have X → X +∇ZX and Y → Y +∇ZY. We can compare the scalar product of the parallel transported vectors with that of the original ones. Im- posing the second condition listed in Definition 3.7.4 corresponds to stating that the scalar product of the parallel transported vectors should just be the increment along Z of the scalar product of the original ones. This is the very intuitive notion of parallelism. It is now very easy to verify that the Christoffel symbols, defined in (3.2.7) are just the coefficients of the Levi Civita connection in the holonomic basis μ eμ = ∂μ ≡ ∂/∂x . As we already remarked, in this case the contorsion vanishes and a torsionsless connection has symmetric coefficients according to (3.7.12). On the other hand the second condition of Definition 3.7.4 translates into:

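\[ \partial_\lambda g_{\mu\nu} = \Gamma^{\sigma}_{\lambda\mu}\, g_{\sigma\nu} + \Gamma^{\sigma}_{\lambda\nu}\, g_{\mu\sigma} \tag{3.7.13} \]
which admits the Christoffel symbols (3.2.7) as unique solution. There is a standard trick to see this and solve (3.7.13) for Γ. Just write three copies of the same equation with cyclically permuted indices:

\[ \partial_\lambda g_{\mu\nu} = \Gamma^{\sigma}_{\lambda\mu}\, g_{\sigma\nu} + \Gamma^{\sigma}_{\lambda\nu}\, g_{\mu\sigma} \tag{3.7.14} \]
\[ \partial_\mu g_{\nu\lambda} = \Gamma^{\sigma}_{\mu\nu}\, g_{\sigma\lambda} + \Gamma^{\sigma}_{\mu\lambda}\, g_{\nu\sigma} \tag{3.7.15} \]
\[ \partial_\nu g_{\lambda\mu} = \Gamma^{\sigma}_{\nu\lambda}\, g_{\sigma\mu} + \Gamma^{\sigma}_{\nu\mu}\, g_{\lambda\sigma} \tag{3.7.16} \]

Next sum (3.7.14) with (3.7.15) and subtract (3.7.16). In the result of this linear combination use the symmetry of $\Gamma^{\sigma}_{\lambda\mu}$ in its lower indices. With this procedure you will obtain that $\Gamma^{\sigma}_{\lambda\mu}$ is equal to the Christoffel symbols.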
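The linear combination just described can be written out explicitly. The following is a sketch of the computation, assuming only the symmetry (3.7.12) and the invertibility of the metric; the final expression is the standard Christoffel formula, which we presume coincides with (3.2.7) (that equation is not reproduced in this section).

```latex
\begin{align*}
\partial_\lambda g_{\mu\nu} + \partial_\mu g_{\nu\lambda} - \partial_\nu g_{\lambda\mu}
 &= \Gamma^{\sigma}_{\lambda\mu} g_{\sigma\nu} + \Gamma^{\sigma}_{\lambda\nu} g_{\mu\sigma}
  + \Gamma^{\sigma}_{\mu\nu} g_{\sigma\lambda} + \Gamma^{\sigma}_{\mu\lambda} g_{\nu\sigma}
  - \Gamma^{\sigma}_{\nu\lambda} g_{\sigma\mu} - \Gamma^{\sigma}_{\nu\mu} g_{\lambda\sigma} \\
 &= 2\, \Gamma^{\sigma}_{\lambda\mu}\, g_{\sigma\nu}
 \qquad \text{(the other four terms cancel in pairs by (3.7.12))} \\
\Longrightarrow \quad
\Gamma^{\sigma}_{\lambda\mu}
 &= \tfrac{1}{2}\, g^{\sigma\nu} \left( \partial_\lambda g_{\mu\nu} + \partial_\mu g_{\nu\lambda} - \partial_\nu g_{\lambda\mu} \right)
\end{align*}
```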

Recalling (3.2.8), which defines the covariant derivative of a generic tensor field according to the tensor calculus of Ricci and Levi Civita, we discover the interpretation of (3.7.13). It just states that the covariant derivative of the metric tensor should be zero:

\[ \nabla_\lambda g_{\mu\nu} = 0 \tag{3.7.17} \]
Hence the Levi Civita connection is that affine torsionless connection with respect to which the metric tensor is covariantly constant.

3.8 Geodesics

Once we have an affine connection we can answer the question that was at the root of the whole development of differential geometry, namely: which lines are straight in a curved space? To use a car driving analogy, the straight lines are obviously those that imply no turning of the steering wheel. In geometric terms, steering the wheel corresponds to changing one’s direction while proceeding along the curve, and such a change is precisely measured by the parallel transport of the tangent vector to the curve along itself. Let C(λ) be a curve [0, 1] → M in a manifold of dimension m and let t be its tangent vector. In each coordinate patch the considered curve is represented as $x^\mu = x^\mu(\lambda)$ and the tangent vector has the following components:

\[ t^\mu(\lambda) = \frac{d}{d\lambda} x^\mu(\lambda) \tag{3.8.1} \]
According to the above discussion we can rightly say that a curve is straight if we have:

\[ \nabla_t\, t = 0 \tag{3.8.2} \]
The above condition immediately translates into a set of m differential equations of the second order for the functions $x^\mu(\lambda)$. Observing that:

\[ \frac{d x^\rho(\lambda)}{d\lambda}\, \partial_\rho \left[ \frac{d x^\mu(\lambda)}{d\lambda} \right] = \frac{d^2 x^\mu}{d\lambda^2} \tag{3.8.3} \]
we conclude that (3.8.2) just coincides with:

\[ \frac{d^2 x^\mu}{d\lambda^2} + \frac{d x^\rho}{d\lambda} \frac{d x^\sigma}{d\lambda}\, \Gamma^{\mu}_{\rho\sigma} = 0 \tag{3.8.4} \]
which is named the geodesic equation. The solutions of these differential equations are the straight lines of the considered manifold and are named the geodesics. A solution is completely determined by the initial conditions which, given the order of the differential system, are 2m. These correspond to giving the values $x^\mu(0)$, namely the initial point of the curve, and the values of $\frac{dx^\rho}{d\lambda}(0)$, namely the initial tangent vector t(0). So we can conclude that at every point p ∈ M there is a geodesic departing along any chosen direction in the tangent space $T_p\mathcal{M}$.

We can define geodesics with respect to any affine connection Γ, yet nothing guarantees a priori that such straight lines should also be the shortest routes from one point to another of the considered manifold M. In the case we have a metric structure, lengths are defined and we can consider the variational problem of calculating extremal curves for which any variation makes them longer. It suffices to apply the standard variational calculus to the length functional (see (3.6.5)):

\[ s = \int d\tau \equiv \int \sqrt{2L}\; d\lambda, \qquad L \equiv \frac{1}{2}\, g_{\mu\nu}(x)\, \frac{d x^\mu}{d\lambda} \frac{d x^\nu}{d\lambda} \tag{3.8.5} \]
Performing a variational calculation we get that the length is extremal if

\[ \delta s = \int \frac{1}{\sqrt{2L}}\, \delta L \; d\lambda = 0 \tag{3.8.6} \]
We are free to use any parameter λ to parameterize the curves. Let us use the affine parameter λ = τ defined by the condition:

\[ 2L = g_{\mu\nu}(x)\, \frac{d x^\mu}{d\tau} \frac{d x^\nu}{d\tau} = 1 \tag{3.8.7} \]
In this case equation (3.8.6) reduces to $\delta \int L \, d\tau = 0$, which is the standard variational equation for a Lagrangian L where the affine parameter τ is the time and $x^\mu$ are the Lagrangian coordinates $q^\mu$. It is a straightforward exercise to verify that the Euler-Lagrange equations of this system:

\[ \frac{d}{d\tau} \frac{\partial L}{\partial \dot{x}^\mu} - \frac{\partial L}{\partial x^\mu} = 0 \tag{3.8.8} \]
coincide with the geodesic equations (3.8.4) where for Γ we use the Christoffel symbols (3.2.7). In this way we reach a very important conclusion.
The Levi Civita connection is that unique affine connection for which also in curved space the curves of extremal length (typically the shortest ones) are straight just as it happens in flat space. This being true, the geodesics can be directly obtained from the variational principle which is the easiest and fastest way.
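As an illustration of how the geodesic equation (3.8.4) can be solved numerically once the connection coefficients are known, the following is a minimal sketch in plain Python. The function names (`rk4_geodesic`, `polar_plane_christoffel`) and the choice of a classical fourth-order Runge-Kutta scheme are our own assumptions, not taken from the text. The test case is the Euclidean plane in polar coordinates, where the geodesics must come out as ordinary straight lines.

```python
import math

def rk4_geodesic(christoffel, x0, v0, lam_max, steps):
    """Integrate d2x^mu/dlam2 + Gamma^mu_{rho sigma} v^rho v^sigma = 0 with classical RK4.

    christoffel(x) must return a nested list gamma[mu][rho][sigma] (hypothetical interface).
    """
    dim = len(x0)

    def rhs(state):
        x, v = state[:dim], state[dim:]
        gamma = christoffel(x)
        acc = [-sum(gamma[m][r][s] * v[r] * v[s]
                    for r in range(dim) for s in range(dim))
               for m in range(dim)]
        return v + acc  # (dx/dlam, dv/dlam) as one flat list

    state = list(x0) + list(v0)
    h = lam_max / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs([y + 0.5 * h * k for y, k in zip(state, k1)])
        k3 = rhs([y + 0.5 * h * k for y, k in zip(state, k2)])
        k4 = rhs([y + h * k for y, k in zip(state, k3)])
        state = [y + h * (a + 2 * b + 2 * c + d) / 6.0
                 for y, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state[:dim], state[dim:]

def polar_plane_christoffel(x):
    """Levi Civita coefficients of ds^2 = dr^2 + r^2 dphi^2, coordinates (r, phi)."""
    r = x[0]
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[0][1][1] = -r        # Gamma^r_{phi phi}
    gamma[1][0][1] = 1.0 / r   # Gamma^phi_{r phi}
    gamma[1][1][0] = 1.0 / r   # Gamma^phi_{phi r}
    return gamma

# Start at (r, phi) = (1, 0) moving along phi with unit speed:
# in Cartesian coordinates this is the straight line x = 1, y = lambda.
(r_end, phi_end), _ = rk4_geodesic(polar_plane_christoffel,
                                   [1.0, 0.0], [0.0, 1.0], 2.0, 2000)
x_end = r_end * math.cos(phi_end)
y_end = r_end * math.sin(phi_end)
```

The same integrator can be reused for any metric by supplying a different `christoffel` function; only the 2m initial data (point and tangent vector) need to be specified, as stated above.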

3.9 Geodesics in Lorentzian and Riemannian Manifolds: Two Simple Examples

Let us now illustrate geodesics in some simple examples which will also be useful to emphasize the difference between Riemannian and Lorentzian manifolds. In a Riemannian manifold the metric is positive definite and there is only one type of geodesics. Indeed the norm of the tangent vector is always positive and the auxiliary condition (3.8.7) defining the affine parameter is unique. In a Lorentzian case, on the other hand, we have three kinds of geodesics depending on the sign of the norm of the tangent vector. Time-like geodesics are those where g(t, t) > 0 and the auxiliary condition is precisely stated as in (3.8.7). However we have also space-like geodesics where g(t, t) < 0 and null-like geodesics where g(t, t) = 0. In these cases the auxiliary condition defining the affine parameter is reformulated as 2L = −1 and 2L = 0, respectively. In General Relativity, time-like geodesics are the world-lines traced in space-time by massive particles that move at a speed less than that of light. Null-like geodesics are the world-lines traced by massless particles moving at the speed of light, while space-like geodesics, corresponding to superluminal velocities, violate causality and cannot be traveled by any physical particle.

3.9.1 The Lorentzian Example of dS2

An interesting toy example that can be used to illustrate in a pedagogical way many aspects of the so far developed theory is given by 2-dimensional de Sitter space. We can describe this pseudo-Riemannian manifold as an algebraic locus in R3, writing the following quadratic equation:

\[ \mathbb{R}^3 \supset \mathrm{dS}_2 : \quad -X^2 + Y^2 + Z^2 = 1 \tag{3.9.1} \]

A parametric solution of the defining locus equation (3.9.1) is easily obtained by the following position:

\[ X = \sinh t; \qquad Y = \cosh t \, \sin\theta; \qquad Z = \cosh t \, \cos\theta \tag{3.9.2} \]
and an overall picture of the manifold is given in Fig. 3.19. The parameters t and θ can be taken as coordinates on the dS2 surface, on which we can define a Lorentzian metric by means of the pull-back of the standard SO(1, 2)-invariant metric on three-dimensional Minkowski space, namely:

\[ ds^2_{\mathrm{dS}_2} = \left. \left( -dX^2 + dY^2 + dZ^2 \right) \right|_{\mathrm{dS}_2} = -dt^2 + \cosh^2 t \; d\theta^2 \tag{3.9.3} \]
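The pull-back can be checked by a direct computation. The following worked step is our own sketch, using only the parametrization (3.9.2); it also shows that the parametrized surface satisfies $-X^2 + Y^2 + Z^2 = \cosh^2 t - \sinh^2 t = 1$.

```latex
\begin{align*}
dX &= \cosh t \, dt \\
dY &= \sinh t \sin\theta \, dt + \cosh t \cos\theta \, d\theta \\
dZ &= \sinh t \cos\theta \, dt - \cosh t \sin\theta \, d\theta \\
dY^2 + dZ^2 &= \sinh^2 t \, dt^2 + \cosh^2 t \, d\theta^2
  \qquad \text{(the mixed terms } dt\, d\theta \text{ cancel)} \\
-dX^2 + dY^2 + dZ^2 &= -\left(\cosh^2 t - \sinh^2 t\right) dt^2 + \cosh^2 t \, d\theta^2
  = -dt^2 + \cosh^2 t \, d\theta^2
\end{align*}
```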

The first thing to note about the above metric is that it describes an expanding two-dimensional universe where the spatial sections at constant time t = const are circles $S^1$. Indeed the angle θ can be regarded as the coordinate on $S^1$ and $d\theta^2 = ds^2_{S^1}$ is the corresponding metric, so that we can write:

\[ ds^2_{\mathrm{dS}_2} = -dt^2 + a^2(t) \; ds^2_{S^1} \tag{3.9.4} \]
where:

\[ a(t) = \cosh t \tag{3.9.5} \]

Fig. 3.19 Two-dimensional de Sitter space is a hyperbolic rotational surface that can be visualized in three dimensions

The reader should remember the paradigm provided by (3.9.4) because this is precisely the structure that we are going to meet in the discussion of relativistic cosmology (see Chap. 5 of Volume 2). The second important thing to note about the metric (3.9.3) is that it has Lorentzian signature. Hence we are not supposed to find just one type of geodesics, rather we have to discuss three types of them:
1. The null geodesics for which the tangent vector is light-like.
2. The time-like geodesics for which the tangent vector is time-like.
3. The space-like geodesics for which the tangent vector is space-like.

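According to our general discussion, the proper length of any curve on dS2 is given by the value of the following integral:

\[ s = \int \sqrt{ -\left( \frac{dt}{d\lambda} \right)^2 + \cosh^2 t \left( \frac{d\theta}{d\lambda} \right)^2 } \; d\lambda \equiv \int \sqrt{2L} \; d\lambda \tag{3.9.6} \]
where λ is any parameter labeling the points along the curve. Performing a variational calculation we get that the length is extremal if

\[ \delta s = \int \frac{1}{\sqrt{2L}} \, \delta L \; d\lambda = 0 \quad \Rightarrow \quad \delta \int L \, d\lambda = 0 \tag{3.9.7} \]
Hence, as long as we use for λ an affine parameter, defined by the auxiliary condition

\[ g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} = -\dot{t}^2 + \cosh^2 t \; \dot{\theta}^2 = k = \begin{cases} -1; & \text{time-like} \\ \phantom{-}0; & \text{null-like} \\ \phantom{-}1; & \text{space-like} \end{cases} \tag{3.9.8} \]
(with the signature (−, +) of the metric (3.9.3) time-like tangent vectors have negative norm, so that the signs are reversed with respect to the convention stated at the beginning of this section), we can just treat, up to an irrelevant overall constant factor:

\[ L = -\dot{t}^2 + \cosh^2 t \; \dot{\theta}^2 \tag{3.9.9} \]
as the Lagrangian of an ordinary mechanical problem. The corresponding Euler-Lagrange equations of motion are:

\[ \begin{aligned} 0 &= \partial_\lambda \frac{\partial L}{\partial \dot{\theta}} - \frac{\partial L}{\partial \theta} = 2\, \partial_\lambda \left( \cosh^2 t \; \dot{\theta} \right) \\ 0 &= \partial_\lambda \frac{\partial L}{\partial \dot{t}} - \frac{\partial L}{\partial t} = -2 \left( \partial_\lambda \dot{t} + \cosh t \, \sinh t \; \dot{\theta}^2 \right) \end{aligned} \tag{3.9.10} \]
The first of the above equations shows that θ is a cyclic variable and hence we have a first integral of the motion:

\[ \text{const} = \ell \equiv \cosh^2 t \; \dot{\theta} \tag{3.9.11} \]
which deserves the name of angular momentum. Indeed the existence of this first integral follows from the SO(2) rotational symmetry of the metric (3.9.3), as we will show when we discuss the concept of isometries and Killing vectors. Thanks to ℓ and to the auxiliary condition (3.9.8), the geodesic equations are immediately reduced to quadratures. Let us discuss the resulting three types of geodesics separately.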
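The role of the two first integrals can be illustrated numerically. The sketch below is our own (function names are hypothetical) and integrates the second-order geodesic equations of the metric (3.9.3), written here as t'' = −cosh t sinh t θ'² and θ'' = −2 tanh t t' θ' (these follow from the Lagrangian (3.9.9)), and then checks that both the angular momentum cosh² t θ' and the norm k = −t'² + cosh² t θ'² stay constant along the solution. The initial data are chosen, as an assumption for the example, to describe a time-like geodesic with k = −1.

```python
import math

def ds2_rhs(state):
    """Geodesic equations of ds^2 = -dt^2 + cosh^2(t) dtheta^2 as a first-order system."""
    t, theta, t_dot, theta_dot = state
    t_ddot = -math.cosh(t) * math.sinh(t) * theta_dot ** 2
    theta_ddot = -2.0 * math.tanh(t) * t_dot * theta_dot
    return [t_dot, theta_dot, t_ddot, theta_ddot]

def rk4_step(rhs, state, h):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(state)
    k2 = rhs([y + 0.5 * h * k for y, k in zip(state, k1)])
    k3 = rhs([y + 0.5 * h * k for y, k in zip(state, k2)])
    k4 = rhs([y + h * k for y, k in zip(state, k3)])
    return [y + h * (a + 2 * b + 2 * c + d) / 6.0
            for y, a, b, c, d in zip(state, k1, k2, k3, k4)]

def first_integrals(state):
    """Return (angular momentum, norm k of the tangent vector)."""
    t, _, t_dot, theta_dot = state
    ell = math.cosh(t) ** 2 * theta_dot
    k = -t_dot ** 2 + math.cosh(t) ** 2 * theta_dot ** 2
    return ell, k

# Time-like initial data at t = 0: theta' = ell0 and t' fixed by k = -1.
ell0 = 0.5
state = [0.0, 0.0, math.sqrt(1.0 + ell0 ** 2), ell0]
ell_start, k_start = first_integrals(state)
for _ in range(4000):
    state = rk4_step(ds2_rhs, state, 5.0e-4)
ell_end, k_end = first_integrals(state)
```

Because both quantities are conserved, one never needs to integrate the second-order system in practice: the two first integrals already reduce the problem to quadratures, which is the route followed in the text.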

3.9.1.1 Null Geodesics

For null-geodesics we have:

\[ 0 = -\dot{t}^2 + \cosh^2 t \; \dot{\theta}^2 \tag{3.9.12} \]

Combining this information with (3.9.11) we immediately get:

\[ \dot{t} = \pm \frac{\ell}{\cosh t}, \qquad \dot{\theta} = \frac{\ell}{\cosh^2 t} \tag{3.9.13} \]
The ratio of the above two equations yields the differential equation of the null orbits:

\[ \frac{d\theta}{dt} = \pm \frac{1}{\cosh t} \tag{3.9.14} \]
which is immediately integrated in the following form:

\[ \tan \frac{\theta + \alpha}{2} = \tanh \frac{t}{2} \tag{3.9.15} \]
where the arbitrary angle α is the integration constant that parameterizes the family of all possible null-like curves on dS2. In order to visualize the structure of such curves in the ambient three-dimensional space, it is convenient to use the following trigonometric and hyperbolic identities:

\[ \begin{aligned} \sinh t &= \frac{2 \tanh \frac{t}{2}}{1 - \tanh^2 \frac{t}{2}}; & \cosh t &= \frac{1 + \tanh^2 \frac{t}{2}}{1 - \tanh^2 \frac{t}{2}} \\ \sin \phi &= \frac{2 \tan \frac{\phi}{2}}{1 + \tan^2 \frac{\phi}{2}}; & \cos \phi &= \frac{1 - \tan^2 \frac{\phi}{2}}{1 + \tan^2 \frac{\phi}{2}} \end{aligned} \tag{3.9.16} \]
Setting $y = \tanh \frac{t}{2} = \tan \frac{\theta + \alpha}{2}$, utilizing the parametric solution of the locus equations (3.9.2) and also (3.9.16), we obtain the form of the null geodesics in $\mathbb{R}^3$:

\[ X = x; \qquad Y = x \cos\alpha - \sin\alpha; \qquad Z = \cos\alpha + x \sin\alpha; \qquad x \equiv \frac{2y}{1 - y^2} \tag{3.9.17} \]
It is evident from (3.9.17) that null geodesics are straight lines, yet straight lines that lie on the hyperbolic dS2 surface (see Fig. 3.20).

Fig. 3.20 The null geodesics on dS2 are straight lines lying on the hyperbolic surface. In this figure we show a family of these straight lines parameterized by the angle α in the range {−π/5, π/7}
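The statement that the null orbits are straight lines of the ambient space can be cross-checked numerically. The following sketch (our own; function names are hypothetical) evaluates the null orbit θ(t) from (3.9.15), maps it to three dimensions with (3.9.2), and compares the result with the straight line (3.9.17), where x = sinh t. It also verifies that the line lies on the surface, in the sense that −X² + Y² + Z² keeps the constant value 1.

```python
import math

def null_point_on_surface(t, alpha):
    """Point of the null orbit (3.9.15), embedded via the parametrization (3.9.2)."""
    theta = 2.0 * math.atan(math.tanh(0.5 * t)) - alpha
    return (math.sinh(t),
            math.cosh(t) * math.sin(theta),
            math.cosh(t) * math.cos(theta))

def null_point_on_line(t, alpha):
    """Same point according to the straight-line form (3.9.17), with x = sinh t."""
    x = math.sinh(t)
    return (x,
            x * math.cos(alpha) - math.sin(alpha),
            math.cos(alpha) + x * math.sin(alpha))

times = [-3.0 + 0.25 * i for i in range(25)]      # t from -3 to 3
angles = [-math.pi / 5, 0.0, math.pi / 7]          # sample values of alpha

# Largest coordinate mismatch between the two descriptions.
max_gap = max(abs(p - q)
              for t in times for a in angles
              for p, q in zip(null_point_on_surface(t, a), null_point_on_line(t, a)))

# Largest violation of the quadratic locus equation along the lines.
max_locus_error = max(abs(-X * X + Y * Y + Z * Z - 1.0)
                      for t in times for a in angles
                      for (X, Y, Z) in [null_point_on_line(t, a)])
```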

3.9.1.2 Time-Like Geodesics

For time-like geodesics we have:

\[ -1 = -\dot{t}^2 + \cosh^2 t \; \dot{\theta}^2 \tag{3.9.18} \]
and following the same steps as in the previous case we obtain the following differential equation for time-like orbits

\[ \frac{dt}{d\theta} = \pm \frac{1}{\ell} \cosh t \, \sqrt{\ell^2 + \cosh^2 t} \tag{3.9.19} \]
which is immediately reduced to quadratures. With the substitution u = sinh t the integral is elementary and, choosing the upper sign, one finds:

\[ \tan(\theta + \alpha) = \frac{2 \ell \sinh(t)}{\sqrt{4\ell^2 + 2\cosh(2t) + 2}} \tag{3.9.20} \]

Equation (3.9.20) provides the analytic form of all time-like geodesics in dS2. The two integration constants are ℓ (the angular momentum) and the angle α. It is instructive to visualize also the time-like geodesics as 3D curves that lie on the hyperbolic surface. To this effect we use the orbit equation (3.9.20) together with the elementary relations $\sin\phi = \tan\phi / \sqrt{1 + \tan^2\phi}$ and $\cos\phi = 1/\sqrt{1 + \tan^2\phi}$, valid for $\cos\phi > 0$, and we obtain (see Fig. 3.21):

\[ \sin(\theta + \alpha) = \frac{\ell \sinh(t)}{\sqrt{\ell^2 + 1} \; \cosh(t)}, \qquad \cos(\theta + \alpha) = \frac{\sqrt{\ell^2 + \cosh^2(t)}}{\sqrt{\ell^2 + 1} \; \cosh(t)} \tag{3.9.21} \]
Changing variable:

\[ t = \operatorname{arcsinh} x \tag{3.9.22} \]
(3.9.21), combined with the parametric description of the surface (3.9.2), yields the parametric form of the time-like geodesics in 3D space:

\[ \begin{aligned} X &= x \\ Y &= \frac{\ell \, x \cos(\alpha) - \sqrt{x^2 + \ell^2 + 1} \; \sin(\alpha)}{\sqrt{\ell^2 + 1}} \\ Z &= \frac{\sqrt{x^2 + \ell^2 + 1} \; \cos(\alpha) + \ell \, x \sin(\alpha)}{\sqrt{\ell^2 + 1}} \end{aligned} \tag{3.9.23} \]

Fig. 3.21 The time-like geodesics on dS2. In this figure we show a family of geodesics parameterized by the value of the angular momentum ℓ in the range {−0.5, 0.5}. The angle α is instead fixed to the value α = −π/3

3.9.1.3 Space-Like Geodesics

For space-like geodesics we have:

\[ 1 = -\dot{t}^2 + \cosh^2 t \; \dot{\theta}^2 \tag{3.9.24} \]
and we obtain the following differential equation for space-like orbits

\[ \frac{dt}{d\theta} = \pm \frac{1}{\ell} \cosh t \, \sqrt{\ell^2 - \cosh^2 t} \tag{3.9.25} \]
which is integrated as follows:

\[ \tan(\theta + \alpha) = \frac{2 \ell \sinh(t)}{\sqrt{4\ell^2 - 2\cosh(2t) - 2}} \tag{3.9.26} \]

As one sees, the difference with respect to the equation describing time-like orbits resides only in two signs. By means of algebraic substitutions completely analogous to those used in the previous case we obtain the parameterization of the space-like geodesics as 3D curves. We find

\[ \begin{aligned} X &= x \\ Y &= \frac{\ell \, x \cos(\alpha) - \sqrt{\ell^2 - 1 - x^2} \; \sin(\alpha)}{\sqrt{\ell^2 - 1}} \\ Z &= \frac{\sqrt{\ell^2 - 1 - x^2} \; \cos(\alpha) + \ell \, x \sin(\alpha)}{\sqrt{\ell^2 - 1}} \end{aligned} \tag{3.9.27} \]
where the two signs of the square root $\sqrt{\ell^2 - 1 - x^2}$ describe the two halves of the curve.

The sign changes with respect to the time-like case have significant consequences. For a given value ℓ of the angular momentum the range of the X coordinate, and hence of the x parameter, is limited by:

\[ -\sqrt{\ell^2 - 1} < x < \sqrt{\ell^2 - 1} \tag{3.9.28} \]

Out of this range the coordinates become imaginary, as is evident from (3.9.27). In Fig. 3.22 we display the shape of a family of space-like geodesics.
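The quadratures for the time-like and space-like orbits can be cross-checked independently of any closed formula by integrating dθ/dt numerically. In the sketch below (our own; all function names are hypothetical) the integrand is the inverse of (3.9.19) for `sign = +1` and of (3.9.25) for `sign = -1`. The closed form used for comparison, θ(t) − θ(0) = arctan(ℓ sinh t / √(ℓ² ± cosh² t)), is our own evaluation of the integral via the substitution u = sinh t and should be read as an assumption to be tested, which is exactly what the code does.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def dtheta_dt(t, ell, sign):
    """d(theta)/dt for time-like (sign=+1) or space-like (sign=-1) orbits."""
    c = math.cosh(t)
    return ell / (c * math.sqrt(ell ** 2 + sign * c ** 2))

def theta_closed_form(t, ell, sign):
    """Candidate closed form of theta(t) - theta(0), see the lead-in paragraph."""
    return math.atan(ell * math.sinh(t)
                     / math.sqrt(ell ** 2 + sign * math.cosh(t) ** 2))

def gap(ell, sign, t_end):
    """Difference between the numerical quadrature and the closed form."""
    numeric = simpson(lambda t: dtheta_dt(t, ell, sign), 0.0, t_end)
    return abs(numeric - theta_closed_form(t_end, ell, sign))

timelike_gap = gap(0.5, +1, 2.0)    # any t is allowed in the time-like case
spacelike_gap = gap(2.0, -1, 1.0)   # requires cosh(t)^2 < ell^2, true for t = 1
```

For the space-like case the upper limit must stay inside the allowed range sinh² t < ℓ² − 1, otherwise the square root in the integrand becomes imaginary.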

3.9.2 The Riemannian Example of the Lobachevskij-Poincaré Plane

The second example we present of geodesic calculation is that relative to the hyperbolic upper plane model of Lobachevskij geometry found by Poincaré. As many of my readers already know, the question of whether non-Euclidian geometries did or did not exist was a central issue of mathematical and philosophical thought for almost two thousand years. The crucial question was whether the Vth postulate of Euclid about parallel lines was independent from the previous ones or not. Many mathematicians tried to demonstrate the Vth postulate and typically came to erroneous or tautological conclusions since, as we now know, the Vth postulate is indeed independent and distinguishes Euclidian from other equally self-consistent geometries. The first attempt at a proof dates back to Posidonius of Rhodes (135–51 B.C.), as early as the first century B.C. This encyclopedic scholar, acclaimed as one of the most erudite men of his epoch, tried to modify the definition of parallelism in order to prove the postulate, but came to inconclusive and contradictory statements. In the modern era the most interesting and deepest attempt at the proof of the postulate is that of the Italian Jesuit Giovanni Girolamo Saccheri (1667–1733). In his book Euclides ab omni naevo vindicatus, Saccheri tried to demonstrate the postulate with a reductio ad absurdum. So doing he actually proved a series of theorems in non-Euclidian geometry whose implications seemed so unnatural and remote from sensorial experience that Saccheri considered them absurd and flattered himself with the presumption of having proved the Vth postulate. The first to discover a consistent model of non-Euclidian geometry was probably Gauss around 1828. However he refrained from publishing his result since he did not wish to hear the screams of Boeotians. With this name he referred to the German philosophers of the time who, following Kant, considered Euclidian Geometry an a priori truth of human thought. Less influenced by post-Kantian philosophy in the remote town of Kazan, of whose University he was for many years the rector, the Russian mathematician Nicolai Ivanovich Lobachevskij (1793–1856) discovered and formulated a consistent axiomatic set up of non-Euclidian geometry where the Vth postulate did not hold true and where the sum of internal angles of a triangle was less than π.
An explicit model of Lobachevskij geometry was first created by Eugenio Beltrami (1836–1900) by means of lines drawn on the hyperbolic surface known as the pseudo-sphere and then analytically realized by Henri Poincaré (1854–1912) some years later. In 1882 Poincaré defined the following two-dimensional Riemannian manifold (M, g), where M is the upper plane:

Fig. 3.22 The space-like geodesics on dS2. In this figure we show a family of geodesics with the value of the angular momentum fixed at ℓ = 2 and the angle α in the range {−π/3, π/3}

\[ \mathbb{R}^2 \supset \mathcal{M} : \quad (x, y) \in \mathcal{M} \; \Leftrightarrow \; y > 0 \tag{3.9.29} \]
and the metric g is defined by the following infinitesimal line-element:

\[ ds^2 = \frac{dx^2 + dy^2}{y^2}. \tag{3.9.30} \]
Lobachevskij geometry is realized by all polygons in the upper plane M whose sides are arcs of geodesics with respect to the Poincaré metric (3.9.30). Let us derive the general form of such geodesics.

This time the metric has Euclidian signature and there is just one type of geodesic curves. Following our variational method the effective Lagrangian is:

\[ L = \frac{1}{2} \, \frac{\dot{x}^2 + \dot{y}^2}{y^2} \tag{3.9.31} \]
where the dot denotes derivatives with respect to the length parameter s. The Lagrangian variable x is cyclic (namely it appears only under derivatives) and from this fact we immediately obtain a first integral of the motion:

\[ \frac{\dot{x}}{y^2} = \frac{1}{R} = \text{const} \tag{3.9.32} \]
The name R given to this conserved quantity follows from its geometrical interpretation that we will next discover. Using the information (3.9.32) in the auxiliary condition:

\[ 2L = 1 \tag{3.9.33} \]
which defines the affine length parameter we obtain:

\[ \dot{y}^2 = \left( 1 - \frac{y^2}{R^2} \right) y^2 \tag{3.9.34} \]
and by eliminating ds between (3.9.32) and (3.9.34) we obtain:

\[ dx = \pm \frac{1}{R} \, \frac{y \, dy}{\sqrt{1 - \frac{y^2}{R^2}}} \tag{3.9.35} \]
which upon integration yields:

\[ \frac{1}{R} \, (x - x_0) = \mp \sqrt{1 - \frac{y^2}{R^2}} \tag{3.9.36} \]
where $x_0$ is the integration constant. Squaring the above relation we get the following one:

\[ (x - x_0)^2 + y^2 = R^2 \tag{3.9.37} \]
that has an immediate interpretation. A geodesic is just the arc lying in the upper plane of any circle of radius R having center in $(x_0, 0)$, namely on some point lying on the real axis.

Fig. 3.23 The geodesics of Poincaré metric in the upper plane compared to the geodesics of the Euclidian metric, namely the straight lines

With this result Lobachevskij geometry is easily visualized. Examples of planar figures with sides that are arcs of geodesics are presented in Fig. 3.23.
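The circular shape of the geodesics can also be seen by integrating the second-order geodesic equations directly. The sketch below is our own: the equations x'' = 2x'y'/y and y'' = (y'² − x'²)/y follow from the Lagrangian (3.9.31), and the function names and the Runge-Kutta scheme are assumptions made for the example. Starting from the point (0, 1) with unit-norm velocity (1, 0), the first integral (3.9.32) gives R = 1, so the curve should stay on the unit circle centered at the origin.

```python
import math

def half_plane_rhs(state):
    """Geodesic equations of ds^2 = (dx^2 + dy^2) / y^2 as a first-order system."""
    x, y, vx, vy = state
    ax = 2.0 * vx * vy / y
    ay = (vy ** 2 - vx ** 2) / y
    return [vx, vy, ax, ay]

def rk4_step(rhs, state, h):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(state)
    k2 = rhs([u + 0.5 * h * k for u, k in zip(state, k1)])
    k3 = rhs([u + 0.5 * h * k for u, k in zip(state, k2)])
    k4 = rhs([u + h * k for u, k in zip(state, k3)])
    return [u + h * (a + 2 * b + 2 * c + d) / 6.0
            for u, a, b, c, d in zip(state, k1, k2, k3, k4)]

# Initial point (0, 1), velocity (1, 0): 2L = (vx^2 + vy^2) / y^2 = 1 and 1/R = vx / y^2 = 1.
state = [0.0, 1.0, 1.0, 0.0]
s_end, steps = 1.5, 3000
for _ in range(steps):
    state = rk4_step(half_plane_rhs, state, s_end / steps)

x_end, y_end, vx_end, vy_end = state
circle_error = abs(x_end ** 2 + y_end ** 2 - 1.0)       # (3.9.37) with x0 = 0, R = 1
inverse_radius = vx_end / y_end ** 2                    # conserved quantity (3.9.32)
```

For these initial data the exact solution is x = tanh s, y = 1/cosh s, which provides a further check on the numerical result.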

Chapter 4
Motion of a Test Particle in the Schwarzschild Metric

The most incomprehensible thing about the Universe is that it is comprehensible. . . Albert Einstein

4.1 Introduction

It is fair to say that modern physics started with the Copernican Revolution and with the remarkable conceptual synthesis of Johannes Kepler (his portrait is given in Fig. 4.1) who summarized two millennia of astronomical observations of the solar system into three simple laws that describe the orbits of planets and their periods of revolution. Indeed Newton’s theory of gravitational interactions and Newtonian mechanics were just invented to explain Kepler’s laws within a unified theory of all possible motions.

It is quite obvious that General Relativity, which aims at replacing Newton’s with a more profound and consistent theory of gravity, should reproduce Kepler’s laws, at least in first approximation. Clearly we expect some modifications and some new effects but, in order to make the new theory successful, they have to be extremely tiny in physical systems of the size of the solar system. On the contrary the same effects are allowed to become very large and even dominant in extremely narrow astrophysical systems like those provided by binaries of compact stars that are presently accessible to astronomical observation and could not even be suspected at the time of Kepler or Newton.

In this chapter we show that these most desirable features characterize the time-like and space-like geodesics of a particular one-parameter space-time metric with Minkowskian signature that is named the Schwarzschild metric after its discoverer. The significance of this result is appreciated through the following reasoning. Once we accept the geometrical model of space-time as the pair (M, g), where M is a differentiable manifold and g is a pseudo-Riemannian metric, the fundamental problem of mechanics, namely the determination of physical trajectories of point-particles given the forces that act on them, is replaced by the geometrical problem of calculating the geodesics for the metric g. There are three kinds of these latter:


Fig. 4.1 Tycho Brahe (1546–1601) on the left and Johannes Kepler (1571–1630) on the right. Tycho Brahe, born Tyge Ottesen Brahe, was a Danish nobleman who received the support of the King of Denmark to pursue his systematic naked-eye astronomical observations by means of various instruments and state-built installations on the island of Hven. He studied astronomy at the University of Copenhagen and when he began his measurements of planetary parallax he achieved unprecedented precision, accurate to the arcminute. He was the first to reveal a new star in the sky. On 11 November 1572 Tycho observed a very bright star, now named SN 1572, which unexpectedly appeared in the constellation of Cassiopeia. The title of his publication De stella nova is responsible for the introduction of the term nova in astronomy. As we know today, SN 1572 was actually a supernova of type Ia, whose remnant is still observable. Because of a disagreement with the new King of Denmark, in 1597 Tycho Brahe left his country, accepting the invitation to Prague of the King of Bohemia Rudolph II, who became Emperor of the Holy Roman Empire. In Prague, Brahe had as student and scientific heir Johannes Kepler. Born in Weil der Stadt, near Stuttgart, Kepler had noble ancestors but the wealth of his family had declined by the time of his coming to this world and his mother was the daughter of a simple inn-keeper. In later years she was accused of witchcraft and escaped burning at the stake just for her courage to deny all charges under torture. Interest in astronomy was raised in Kepler precisely by such a mother, who showed him the 1577 comet. Johannes’s university education was in Tübingen, where he came in touch with Copernican theories and elaborated his personal persuasion of their correctness. His first publication dates back to 1597. In the Mysterium Cosmographicum he attempted a first systematic description of the order of the Universe. In 1599 he became assistant of Tycho Brahe in Prague and when the latter died two years later he inherited all of his precious observational data, which were the basis for the formulation of his famous three laws. The first two appeared in 1609 in the Astronomia Nova, while he discovered the third in 1618 and published it the next year in Harmonice Mundi. Just as his master Brahe, Kepler also had the good fortune of observing a supernova, in 1604. SN 1604 was also of type Ia and it is, to the present time, the last galactic supernova to have been observed (see Fig. 4.2). Imperial Astronomer, Kepler, notwithstanding his crucial discoveries that eventually led to Newton’s theory of gravitation, mixed science, theology and metaphysics in his work, trying to find a divine order in the laws of motion of celestial bodies. He died in 1630 in Regensburg.

1. the time-like geodesics are the possible world-lines followed by massive particles,
2. the light-like geodesics are the possible world-lines followed by massless particles such as the photons,
3. the space-like geodesics cannot be world-lines for any physical particle since you can travel along them only at a speed larger than the speed of light.

Fig. 4.2 The remnant of the supernova SN 1572 (on the left) and of the supernova SN 1604 (on the right), respectively observed by Tycho Brahe and Johannes Kepler. They were both of type Ia, namely they were caused by the explosion of a white dwarf that reached the critical Chandrasekhar mass limit (see later chapters) by swallowing material from the companion normal star in a binary system. SN 1572 is at a distance of 7500 light-years from the Earth in the constellation Cassiopeia. Kepler’s star SN 1604 is instead at a distance of 20000 light-years in the constellation Ophiuchus

Hence the metric g is a substitute for the concept of force field and the calculation of time-like geodesics is a substitute for the solution of the fundamental problem of mechanics in this force field. Retrieving almost Keplerian orbits from the time-like geodesics of the Schwarzschild metric shows that this latter is a correct replacement for Newton’s law of gravitation. Indeed the one parameter occurring in the Schwarzschild metric can be identified with the mass M of the Newtonian source. General Relativity is a field theory for the space-time metric $g_{\mu\nu}(x)$ and it will be a correct theory of gravity if the Schwarzschild metric is a solution of its field equations, actually the unique solution with the symmetry corresponding to Kepler’s problem, namely spherical symmetry. This is what we show in later chapters once Einstein’s field equations have been introduced. In the present chapter we simply assume the Schwarzschild metric and we work out all of its consequences. After a review (see Sect. 4.2) of Kepler’s problem in the context of Newtonian mechanics, in Sect. 4.3 we study the equations for time-like geodesics in Schwarzschild geometry and we show how such a geometrical problem admits a one-to-one map into the previous one. In particular the first integrals of the Newtonian motion, the energy E and the angular momentum ℓ, are mapped into their relativistic analogues $\mathcal{E}$ and L, which are constant along the geodesics and are associated with the same symmetries in both cases: time-independence and spherical symmetry of the gravitational field, respectively.
Writing the differential equation of the orbit we find that it is formally identical to that of Newton’s theory but with a modified central potential $V_{\mathrm{eff}}(r)$. In addition to the attractive term ∼ −1/r and to the centrifugal barrier ∼ 1/r² there is a third attractive term ∼ −1/r³ that is responsible for all the deviations from purely Keplerian motion. At large distances this new term is completely negligible and this explains why Newton’s theory works so well, yet at small distances it dominates and makes dramatic changes in our predictions. It is responsible for qualitatively new effects, in particular the phenomenon of periastron advance in planetary orbits. This is a tiny but measurable effect in the solar system that can become impressively large in narrow binary systems. Historically it provided one of the three tests of General Relativity proposed by Einstein himself: the calculation of the measured and unexplained perihelion advance in the orbit of Mercury. Another test proposed by Einstein was the deflection of light-rays by a gravitational field. This has to do with the light-like geodesics whose calculation we address in Sect. 4.5. In the Newtonian case the general integral of the orbit equation can be given in closed analytical form. For the case of Schwarzschild geometry this is also possible but it involves the use of higher transcendental functions. We will come back to this later on (see Chap. 3 of Volume 2 where we discuss rotating black holes). Here, for pedagogical purposes, we present numerical solutions of the orbit equation that are produced by a short computer package in MATHEMATICA (presented in Appendix B.1) that also plots them graphically. Alternatively we can use perturbation theory and calculate the first order corrections to Keplerian orbits. This is the first example of the post-Newtonian expansion and produces the celebrated formulae for the periastron advance and for the light-ray deflection angle.

4.2 Keplerian Motions in Newtonian Mechanics

As anticipated in the introductory section it is convenient to start with a review of Kepler’s problem in classical Newtonian mechanics. The first thing to do is to fix our conventions for kinematics. We use polar coordinates and we label the points of a 2-sphere by means of two angular coordinates as described in Fig. 4.3. In Newton’s theory the orbit equations are deduced from the equations of energy and angular momentum conservation that, in the polar coordinates we have adopted, take the following form:

\[ E = \frac{1}{2} \mu \left( \frac{dr}{dt} \right)^2 + \underbrace{\frac{1}{2} \frac{\ell^2}{\mu r^2}}_{\text{centrifugal barrier}} \; \underbrace{- \, \frac{G M \mu}{r}}_{\text{Newtonian potential}} \tag{4.2.1} \]
\[ \underbrace{\ell}_{\text{angular momentum}} = \mu r^2 \frac{d\phi}{dt} \tag{4.2.2} \]

Following quite standard conventions one sets:

dr  dr   r˙ ≡ ; r ≡ ;⇒˙r = r (4.2.3) dt dφ μr2 4.2 Keplerian Motions in Newtonian Mechanics 161

Fig. 4.3 Our conventions for the angular coordinates on the S2 sphere are as follows: the azimuthal angle φ takes the values in the range [0, 2π], while the ascension angle θ runs from 0 (the North Pole) to π (the South Pole). The metric ds2 = dθ2 +sin2 θdφ2 is singular at θ = 0and θ = π. These are coordinate singularities that can be removed by redefining θ so that (4.2.1) becomes:

$$\frac{1}{2}\,\frac{\ell^2}{\mu r^4}\,r'^2 + \frac{1}{2}\,\frac{\ell^2}{\mu r^2} - \frac{GM\mu}{r} = E \tag{4.2.4}$$

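Introducing a new variable

$$u \equiv \frac{1}{r} \tag{4.2.5}$$

(4.2.4) becomes:

$$u'^2 + u^2 = C_0 + 2C_1 u \tag{4.2.6}$$

where we have set:

$$C_0 \equiv \frac{2E\mu}{\ell^2}\,;\qquad C_1 \equiv \frac{GM\mu^2}{\ell^2} \tag{4.2.7}$$

Taking a further $\frac{d}{d\phi}$ derivative of (4.2.6) one obtains:

$$2u'\left(u'' + u - C_1\right) = 0 \tag{4.2.8}$$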

There are two kinds of solutions of (4.2.8), those where $u' = 0$ and those where:

$$u'' + u - C_1 = 0 \tag{4.2.9}$$

The first kind of solution corresponds to the circular orbits for which r = const, while the second kind comprises the non-circular ones. Equation (4.2.9) is immediately solved by the change of variable $u = y + C_1$, which reduces it to the familiar equation

$$\frac{d^2y}{d\phi^2} + y = 0 \tag{4.2.10}$$

of harmonic motion, whose general solution is $y = b\cos(\phi - \phi_0)$ and contains two integration constants b and φ0. Therefore the general solution of (4.2.9) can be written as follows:

Fig. 4.4 The Keplerian orbit of a test particle of mass μ around a center of mass M is an ellipse parameterized by the semilatus rectum $a = \frac{r_+ + r_-}{2}$ and the eccentricity $e = \frac{r_+ - r_-}{r_+ + r_-}$, $r_\pm$ being the maximal and minimal distances from the center of mass reached by the test particle while going around its orbit. In the case of the Solar system these points are respectively named the aphelion and the perihelion

$$\frac{1}{u} = r = \frac{a\left(1 - e^2\right)}{1 + e\cos\phi} \tag{4.2.11}$$

where we named:

$$\frac{1}{C_1\left(1 - e^2\right)} \equiv a = \text{semilatus rectum}\,;\qquad \frac{b}{C_1} \equiv e = \text{eccentricity} \tag{4.2.12}$$

In this way the parameter e, whose range for bound orbits is 0 ≤ e < 1, replaces the integration constant b, while the second integration constant φ0 is reabsorbed into the definition of the azimuthal angle φ. The names given to the parameters (4.2.12) have been traditional in astronomy since the time of Kepler and refer to the geometrical interpretation of the solution (4.2.11) as the equation of an ellipse (see Fig. 4.4). In standard usage $a = (r_+ + r_-)/2$ is the semi-major axis, the semilatus rectum proper being $a(1 - e^2) = 1/C_1$. The geometrical parameters a and e are related to the physical constants of motion, namely to the energy E and the angular momentum ℓ.
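The relation between (a, e) and the constants of motion can be made explicit with a short numerical sketch. It is not taken from the text: it assumes that the turning points of (4.2.6), where $u' = 0$, are $u = C_1(1 \pm e)$, which gives $e = \sqrt{1 + C_0/C_1^2}$, and it writes the angular momentum as `ell`. The function names are illustrative only.

```python
import math


def kepler_elements(E, ell, mu, GM):
    """Return (a, e) for a bound Keplerian orbit (E < 0) from energy and angular momentum."""
    C0 = 2.0 * E * mu / ell**2          # first constant of (4.2.7)
    C1 = GM * mu**2 / ell**2            # second constant of (4.2.7)
    # Turning points of (4.2.6): u^2 - 2 C1 u - C0 = 0, i.e. u = C1 (1 +/- e)
    e = math.sqrt(max(0.0, 1.0 + C0 / C1**2))
    a = 1.0 / (C1 * (1.0 - e**2))       # definition (4.2.12)
    return a, e


def kepler_radius(phi, a, e):
    """Orbit equation (4.2.11)."""
    return a * (1.0 - e**2) / (1.0 + e * math.cos(phi))


# Illustrative units with GM = mu = 1: an orbit with a = 2, e = 0.5
a, e = kepler_elements(E=-0.25, ell=math.sqrt(1.5), mu=1.0, GM=1.0)
print(a, e, kepler_radius(0.0, a, e), kepler_radius(math.pi, a, e))
```

Under these assumptions the energy depends on a alone, $E = -GM\mu/(2a)$, while the angular momentum fixes the combination $a(1 - e^2)$.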

4.3 The Orbit Equations of a Massive Particle in Schwarzschild Geometry

Let us now turn to the description of space-time as a differentiable manifold endowed with a pseudo-Riemannian structure and introduce the Schwarzschild (see Fig. 4.5) metric [1]:

$$ds^2 = -\left(1 - \frac{2m}{r}\right)dt^2 + \left(1 - \frac{2m}{r}\right)^{-1}dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) \tag{4.3.1}$$

where t denotes the time coordinate, while r, θ, φ are polar coordinates in the same notations as before. To obtain the geodesic equation for the motion of a massive test particle in the metric (4.3.1) we use the variational principle.

Fig. 4.5 Karl Schwarzschild (1873–1916) was born in Frankfurt am Main to a well-to-do Jewish family. He published his first papers in astronomy when he was still a boy: in 1889–1890 he published two articles on the determination of orbits for binary stars. In 1896 he obtained his doctorate in Astronomy from the University of Munich. After an experience at the observatories of Vienna and Munich, in 1900 he was appointed Director of the Astronomical Observatory of Göttingen where he interacted with such personalities as David Hilbert and Hermann Minkowski. In 1909 he moved to the Observatory of Potsdam. By this time he was already a famous scientist, member of the Prussian Academy of Sciences since 1912. His contributions concerned various fields of physics related to Astronomy. In particular he studied the equilibrium equations of radiation and was the first to discover radiation pressure. At the outbreak of World War I in 1914, notwithstanding his forty years of age, he enlisted in the German Army and fought first on the Western Front in Belgium, then on the Eastern Front in Russia. In 1916, while at the front, he wrote two scientific articles. One, less famous than the other, contains quantization rules similar to those of Bohr and Sommerfeld, discovered by him independently. The second article contains the Schwarzschild solution of the Einstein equations. He had learnt of Einstein’s new theory of General Relativity just a couple of months before, reading issue 25 of the Proceedings of the Prussian Academy of Sciences of November 1915. His simple and elegant solution of the Einstein field equations was sent to Einstein, who answered him with the following words: I have read your paper with the utmost interest. I had not expected that one could formulate the exact solution of the problem in such a simple way. I liked very much your mathematical treatment of the subject. Next Thursday I shall present the work to the Academy with a few words of explanation. The same year, 1916, Schwarzschild died from a rare skin infection contracted on the war front

As we explained in Sect. 3.8, given an explicit metric $g_{\mu\nu}(x)$ we have the proper time functional of (3.8.6) and (3.8.7), which leads to the Euler-Lagrange equations plus the auxiliary condition $2\mathcal{L} = 1$. Following this reasoning, from the Schwarzschild metric (4.3.1) we derive the effective Lagrangian:

$$-2\mathcal{L} = -\left(1 - \frac{2m}{r}\right)\dot t^2 + \left(1 - \frac{2m}{r}\right)^{-1}\dot r^2 + r^2\left(\dot\theta^2 + \sin^2\theta\,\dot\phi^2\right) \tag{4.3.2}$$

where the dot denotes the derivative d/dτ with respect to proper time. The Euler-Lagrange equation of the field θ(τ) yields:

$$\frac{d}{d\tau}\frac{\partial\mathcal{L}}{\partial\dot\theta} - \frac{\partial\mathcal{L}}{\partial\theta} = 0 \quad\Rightarrow\quad \frac{d}{d\tau}\left(-2r^2\dot\theta\right) + 2r^2\sin\theta\cos\theta\,\dot\phi^2 = 0 \tag{4.3.3}$$

From (4.3.3) we conclude that it is consistent to set:

$$\theta = \text{const} = \frac{\pi}{2} \tag{4.3.4}$$

throughout the motion. In other words the Schwarzschild motion is planar, like the Newtonian motion: it takes place in the equatorial plane with respect to the coordinate frame we have adopted (see Fig. 4.3). Taking this into account we immediately get the following reduced Lagrangian:

$$-\mathcal{L}_{\rm reduc.} = \frac{1}{2}\left[-\left(1 - \frac{2m}{r}\right)\dot t^2 + \left(1 - \frac{2m}{r}\right)^{-1}\dot r^2 + r^2\dot\phi^2\right] \tag{4.3.5}$$

It is apparent from (4.3.5) that both the time t and the azimuthal angle φ are cyclic canonical variables, namely they appear in the Lagrangian only through their derivatives with respect to the Lagrangian time τ. This implies the existence of two first integrals of the motion, namely:

$$\begin{aligned}
t\text{-variation} &\;\Rightarrow\; \left(1 - \frac{2m}{r}\right)\frac{dt}{d\tau} = \mathcal{E} = \text{const}\\
\phi\text{-variation} &\;\Rightarrow\; r^2\,\frac{d\phi}{d\tau} = L = \text{const}
\end{aligned} \tag{4.3.6}$$

It is tempting to interpret $\mathcal{E}$ and L as the energy and the angular momentum, respectively. Comparison with the Newtonian theory will confirm such an interpretation. Since $2\mathcal{L} = 1$ along the geodesics, rather than working out the additional Euler-Lagrange equation (for the r-variation) we just insert (4.3.6) into (4.3.5) and we get:

$$1 = \frac{\mathcal{E}^2}{1 - \frac{2m}{r}} - \frac{\dot r^2}{1 - \frac{2m}{r}} - \frac{L^2}{r^4}\,r^2 \tag{4.3.7}$$

so that we obtain:

$$\begin{aligned}
\left(\frac{dr}{d\tau}\right)^2 + V_{\rm eff}(r) &= \mathcal{E}^2\\
\frac{d\phi}{d\tau} &= \frac{L}{r^2}
\end{aligned} \tag{4.3.8}$$

where we have introduced the following effective central potential:

$$V_{\rm eff} \equiv \left(1 - \frac{2m}{r}\right)\left(1 + \frac{L^2}{r^2}\right) = 1 - \frac{2m}{r} + \frac{L^2}{r^2} - \frac{2L^2m}{r^3} \tag{4.3.9}$$

Hence the first equation of (4.3.8) can also be rewritten as follows:

$$\frac{1}{2}\left(\frac{dr}{d\tau}\right)^2 + \frac{L^2}{2r^2} - \frac{m}{r} - \frac{L^2m}{r^3} = \frac{\mathcal{E}^2 - 1}{2} \tag{4.3.10}$$

Equation (4.3.10) can be compared with the equation for energy conservation in the motion of a test particle of mass μ in the Newtonian field generated by a large mass M (see (4.2.1)). For motions that are sufficiently slow, we can identify:

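$$d\tau \simeq \underbrace{c}_{\text{speed of light}}\,dt \tag{4.3.11}$$

so that, multiplying (4.3.10) by $\mu c^2$, we can rewrite it in the following way:

$$-\frac{m\mu c^2}{r} + \frac{1}{2}\,\mu\left(\frac{dr}{dt}\right)^2 + \frac{1}{2}\,\frac{L^2c^2\mu}{r^2} - \frac{c^2L^2m\mu}{r^3} \cong \frac{\left(\mathcal{E}^2 - 1\right)c^2\mu}{2} \tag{4.3.12}$$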

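Furthermore, if the term $\frac{c^2L^2m\mu}{r^3}$ is much smaller than the other terms:

$$\frac{c^2L^2m\mu}{r^3} \ll \frac{1}{2}\,\frac{L^2c^2\mu}{r^2} \tag{4.3.13}$$

then we can make the identifications:

$$\begin{aligned}
L^2c^2\mu \simeq \frac{\ell^2}{\mu} &\;\Rightarrow\; L \simeq \frac{\ell}{\mu c}\\
c^2m\mu = GM\mu &\;\Rightarrow\; m = \frac{GM}{c^2}\\
\frac{\mathcal{E}^2 - 1}{2}\,c^2\mu \simeq E &\;\Rightarrow\; \mathcal{E}^2 \simeq \frac{2E + \mu c^2}{\mu c^2}
\end{aligned} \tag{4.3.14}$$

or

$$\begin{cases}
m = \dfrac{GM}{c^2} = \text{Schwarzschild emiradius}\\[2mm]
L = \dfrac{\ell}{\mu c} = \text{angular momentum per unit mass per speed of light}\\[2mm]
\mathcal{E} = \sqrt{1 + \dfrac{2E}{\mu c^2}} = \text{total energy in rest mass units}
\end{cases} \tag{4.3.15}$$

This comparison with the Newtonian theory suggests the following bifurcation:

$$\begin{aligned}
\mathcal{E} < 1 &\quad \text{bound state} \;\Rightarrow\; \text{closed orbit}\\
\mathcal{E} > 1 &\quad \text{unbound state} \;\Rightarrow\; \text{open orbit}
\end{aligned} \tag{4.3.16}$$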

4.3.1 Extrema of the Effective Potential and Circular Orbits

By means of (4.3.9) the problem of massive particle orbits in the Schwarzschild geometry has been formally reduced to the problem of classical motions in a central potential. Hence, just as in classical Newtonian mechanics, we have to study the extrema of the effective potential (4.3.9) in order to establish the conditions for stable and unstable circular orbits. Indeed if an orbit is circular we have $\frac{dr}{d\tau} = 0$ and the radius r = r0 must be a root of the equation $V_{\rm eff}(r) = \mathcal{E}^2$. The orbit will be stable if r0 is a minimum of the effective potential $V_{\rm eff}$, while it will be unstable if it is a maximum. Equating to zero the first derivative of $V_{\rm eff}$ we obtain:

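$$0 = \frac{1}{2}\,\frac{\partial V_{\rm eff}}{\partial r} = \frac{m}{r^2} - \frac{L^2}{r^3} + \frac{3mL^2}{r^4} \tag{4.3.17}$$

with solutions:

$$r_\pm = \frac{L^2 \pm \sqrt{L^2\left(L^2 - 12m^2\right)}}{2m} \tag{4.3.18}$$

and considering the discriminant $\Delta = L^2\left(L^2 - 12m^2\right)$ we conclude that for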

$$L^2 < 12m^2 \tag{4.3.19}$$

there are no extrema and consequently no stable orbits. Reinstating physical units the stability bound (4.3.19) translates into:

$$r^2\,\frac{d\phi}{dt} < \sqrt{12}\,\frac{GM}{c} \tag{4.3.20}$$

In the case of test particles orbiting around the sun this means:

$$\frac{R^2}{T} < \frac{\sqrt{12}}{2\pi}\,\frac{GM}{c} \tag{4.3.21}$$

where T and R are the period and radius of the orbit, respectively, and M denotes the mass of the sun. Inserting the numerical values of these physical constants we find:

$$\begin{aligned}
\frac{GM}{c} &= \frac{\left(6.670\cdot 10^{-8}\ \mathrm{cm^3\,s^{-2}\,g^{-1}}\right)\left(1.989\cdot 10^{33}\ \mathrm{g}\right)}{2.998\cdot 10^{10}\ \mathrm{cm\,s^{-1}}} \cong 4.43\cdot 10^{15}\ \frac{\mathrm{cm}^2}{\mathrm{s}}\\
1\ \mathrm{s} &= \frac{1\ \text{year}}{31{,}536{,}000} \cong 3.171\cdot 10^{-8}\ \text{years}\\
1\ \mathrm{cm}^2 &= \frac{1\ \mathrm{km}^2}{10^{10}} = 10^{-10}\ \mathrm{km}^2\\
1\ \frac{\mathrm{cm}^2}{\mathrm{s}} &= 3.1536\cdot 10^{-3}\ \frac{\mathrm{km}^2}{\text{year}}
\end{aligned} \tag{4.3.22}$$

Therefore the critical limit in the solar system is given by:

$$\frac{R^2}{T} \geq 7.7\cdot 10^{12}\ \frac{\mathrm{km}^2}{\text{year}} \tag{4.3.23}$$

It is instructive to compare condition (4.3.23) with the actual values of planetary radii and periods. The period of a planet is of the order of the year. Hence to be critical the radius of the orbit should be of the order of $3\cdot 10^{6}$ km while, as we know, the typical distance of a planet from the sun is of the order of $10^8$ km. Alternatively, keeping the distance fixed, for instance inserting the average radius of the Earth orbit around the Sun:

$$r_\oplus = 1.49\times 10^{8}\ \mathrm{km} \tag{4.3.24}$$

we obtain that the critical period for such an orbit would be $T_{\rm crit} \approx 3\cdot 10^{3}$ years. In other words, in order to be critical, the Earth should be so slow as to make a full revolution in about three thousand years rather than in one, namely its angular momentum should be more than three orders of magnitude smaller than the actual one.

These numerical considerations give some appreciation of how far the solar system is from the critical phenomena implied by general relativity and explain why Newtonian mechanics works so well for the physical system it was invented to explain. Nonetheless, the critical region is by no means irrelevant in astrophysical systems. Indeed, as we are going to see, it becomes quite important near compact stars whose mass is of the order of a solar mass $M_\odot$ but whose radius is of the order of kilometers.
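The arithmetic behind the bound (4.3.21) is easy to get wrong in the unit conversions, so it is worth re-evaluating it directly. The following is a minimal sketch, not part of the original text: it assumes a solar mass of 1.989·10³³ g and a year of 31,536,000 s, and the function names are illustrative.

```python
import math

G = 6.670e-8                        # cm^3 g^-1 s^-2, value used in the text
M_SUN = 1.989e33                    # g, assumed value of the solar mass
C = 2.998e10                        # cm s^-1
SECONDS_PER_YEAR = 31_536_000.0     # 365 days

# 1 cm^2/s expressed in km^2/year: 1e-10 km^2 per cm^2, times seconds per year
CM2_PER_S_IN_KM2_PER_YEAR = 1.0e-10 * SECONDS_PER_YEAR


def critical_r2_over_t():
    """Right-hand side of (4.3.21), in km^2 per year."""
    gm_over_c = G * M_SUN / C                                   # cm^2/s
    bound_cgs = math.sqrt(12.0) / (2.0 * math.pi) * gm_over_c   # cm^2/s
    return bound_cgs * CM2_PER_S_IN_KM2_PER_YEAR


def critical_period_years(radius_km):
    """Period at which a circular orbit of the given radius saturates the bound."""
    return radius_km**2 / critical_r2_over_t()


bound = critical_r2_over_t()
print(f"critical R^2/T  = {bound:.3e} km^2/year")
print(f"critical radius = {math.sqrt(bound):.3e} km for T = 1 year")
print(f"critical period = {critical_period_years(1.49e8):.3e} years at the Earth's distance")
```

As a cross-check, the ratio between the Earth's actual value of $R^2/T$ and the bound should be of the order of $\sqrt{r_\oplus/(12\,m)}$, since for a nearly Newtonian circular orbit $L^2 \simeq m r$.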

Minimum and Maximum The second derivative of the potential yields:

$$\frac{r^5}{2}\,\frac{d^2V_{\rm eff}}{dr^2} = 3L^2r - 12mL^2 - 2mr^2 \equiv J(r) \tag{4.3.25}$$

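Inserting the values $r_\pm$ given in (4.3.18), and using the extremum condition $m r_\pm^2 = L^2 r_\pm - 3mL^2$ which reduces J to $J(r_\pm) = L^2\left(r_\pm - 6m\right)$, we find:

$$\begin{aligned}
J(r_+) > 0 &\;\Rightarrow\; r_+ \text{ is a minimum}\\
J(r_-) < 0 &\;\Rightarrow\; r_- \text{ is a maximum}
\end{aligned} \tag{4.3.26}$$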

Explicitly we have:

$$r_+ = \frac{L^2 + \sqrt{L^2\left(L^2 - 12m^2\right)}}{2m} > 6m \tag{4.3.27}$$

The important conclusion that we reach from (4.3.27) is that stable circular orbits exist only for radii r > 6m. In the case of the Sun the numerical value of the Schwarzschild emiradius reads:

$$m_\odot = \frac{GM_\odot}{c^2} \simeq 1.48\ \mathrm{km} \tag{4.3.28}$$

For $L^2 = 12m^2$ we have $r_- = 6m$ while the limit of $r_-$ for $L^2 \to \infty$ is:

$$\lim_{L^2\to\infty} r_- = \lim_{L^2\to\infty}\frac{L^2 - \sqrt{L^2\left(L^2 - 12m^2\right)}}{2m} = 3m \tag{4.3.29}$$

Fig. 4.6 Schwarzschild geometry: Plot of the effective radial potential for $L^2 = 12m^2$. This is the limiting case where no circular orbits are available. Indeed the potential has no extrema

Fig. 4.7 Schwarzschild geometry: Plot of the effective potential for $L^2 = 24m^2$. In this case the stable minimum of the potential is at $r_+ = 6(2+\sqrt{2})m = 20.4853m$, while the unstable maximum is at $r_- = 6(2-\sqrt{2})m = 3.51472m$

It follows that the range of the root $r_-$ is:

$$3m < r_- < 6m \tag{4.3.30}$$

In terms of the dimensionless radius $\rho \equiv r/m$ and of the rescaled angular momentum $\hat\ell \equiv L/(\sqrt{12}\,m)$, the effective potential of (4.3.10) reads:

$$V_{\rm eff} = \frac{1}{2} - \frac{1}{\rho} + \frac{6\hat\ell^2}{\rho^2} - \frac{12\hat\ell^2}{\rho^3} \tag{4.3.32}$$

The structure of the effective potential can be visualized by means of some plots. In Fig. 4.6 we display the limiting case $L^2 = 12m^2$ which admits no circular orbits. The reason is evident from the shape. In this case the potential has neither minima nor maxima. Coming in from infinity it just decreases towards infinitely large negative values as r → 0. In Figs. 4.7 and 4.8 we have plotted the effective potential for $L^2 = 24m^2$, namely for a value of the angular momentum that is just above the stability bound. For this value of the parameters the potential starts developing both a minimum $r_+ = 6(2+\sqrt{2})m = 20.4853m$ and a maximum $r_- = 6(2-\sqrt{2})m = 3.51472m$. As predicted from our general discussion, the maximum $r_-$ corresponding to the unstable circular orbit falls in the interval ]3m, 6m[, while the minimum $r_+$ corresponding to the stable orbit is larger than 6m. What is physically important to note is that $r_+$ grows rapidly with angular momentum and it is already far away from the Schwarzschild emiradius at these small values of $L^2$.

Fig. 4.8 Schwarzschild geometry: Enlargement in the region around the stable minimum of the effective potential for $L^2 = 24m^2$
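The radii quoted above follow directly from (4.3.18), and the nature of each extremum from the sign of J(r) in (4.3.25). A minimal sketch, in units where lengths are measured in Schwarzschild emiradii by default; the function names are illustrative and not taken from the text:

```python
import math


def circular_orbit_radii(L, m=1.0):
    """Return (r_plus, r_minus) from (4.3.18), or None when L^2 < 12 m^2."""
    disc = L**2 * (L**2 - 12.0 * m**2)
    if disc < 0.0:
        return None                      # no extrema: bound (4.3.19)
    root = math.sqrt(disc)
    return (L**2 + root) / (2.0 * m), (L**2 - root) / (2.0 * m)


def J(r, L, m=1.0):
    """Function J(r) of (4.3.25); its sign is that of the second derivative of V_eff."""
    return 3.0 * L**2 * r - 12.0 * m * L**2 - 2.0 * m * r**2


L = math.sqrt(24.0)                      # the case L^2 = 24 m^2 of Figs. 4.7 and 4.8
r_plus, r_minus = circular_orbit_radii(L)
print(r_plus, J(r_plus, L))              # stable minimum, J > 0
print(r_minus, J(r_minus, L))            # unstable maximum, J < 0
```

Scanning L upwards from $\sqrt{12}\,m$ with this function shows $r_+$ moving outwards from 6m while $r_-$ moves inwards towards 3m.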

Energy of a Particle in a Circular Orbit Let us now calculate the energy of a test particle in motion on a circular orbit and compare our result with the Newtonian theory. We start from (4.3.10) that on a circular orbit (r˙ = 0) reduces to:

$$\frac{1}{2}\,\mathcal{E}^2 = \frac{1}{2} - \frac{m}{r} + \frac{L^2}{2r^2} - \frac{mL^2}{r^3} \tag{4.3.33}$$

If the orbit is extremal, namely if r = r± we have the relation (4.3.17) between the radius and the angular momentum that can be solved by:

$$L^2 = \frac{mr^2}{r - 3m} \tag{4.3.34}$$

Inserting (4.3.34) into (4.3.33) we obtain:

$$\mathcal{E}(r) = \frac{r - 2m}{\sqrt{r\left(r - 3m\right)}} \tag{4.3.35}$$

which yields the general relativistic formula for the energy of a test particle in circular motion around a spherically symmetric massive body like the Sun or the Earth. It is very instructive to make a post-Newtonian development of this formula and compare it with the purely Newtonian result. What is the small expansion parameter for the post-Newtonian approximation? This is to be established in the natural units we have adopted. At G = c = 1 both r and the Schwarzschild emiradius m are measures of length and the non-relativistic regime is approached when the actual orbit radius r is big with respect to m. Hence the post-Newtonian development corresponds to a series expansion in the dimensionless parameter:

$$\frac{m}{r} \ll 1 \tag{4.3.36}$$

Hence we can write:

$$\mathcal{E}(r) = \frac{1 - \frac{2m}{r}}{\sqrt{1 - \frac{3m}{r}}} \simeq \left(1 - \frac{2m}{r}\right)\left(1 + \frac{3}{2}\,\frac{m}{r} + \cdots\right) \tag{4.3.37}$$

$$\simeq 1 - \frac{1}{2}\,\frac{m}{r} + \mathcal{O}\!\left(\frac{m^2}{r^2}\right) \tag{4.3.38}$$

and comparing (4.3.38) with (4.3.15) we obtain:

$$\mathcal{E} = \sqrt{1 + \frac{2E}{\mu c^2}} \simeq 1 + \frac{E}{\mu c^2} + \cdots \simeq 1 - \frac{1}{2}\,\frac{m}{r} + \cdots \quad\Rightarrow\quad E = -\frac{1}{2}\,\frac{GM\mu}{r} \tag{4.3.39}$$

The last line in (4.3.39) correctly reproduces the Newtonian result for the energy of a particle of mass μ bound on a circular orbit around a center of mass M. Indeed if we consider the effective Newtonian potential:

$$V^{\rm (Newton)}_{\rm eff} = \frac{1}{2}\,\frac{\ell^2}{\mu r^2} - \frac{GM\mu}{r} \tag{4.3.40}$$

and we calculate its minimum:

$$\frac{\partial V^{\rm (Newton)}_{\rm eff}}{\partial r} = 0 \tag{4.3.41}$$

we obtain the condition of balance between the centrifugal energy and the potential energy:

$$\frac{\ell^2}{\mu r^2} = \frac{GM\mu}{r} \tag{4.3.42}$$

which, inserted in the Newtonian formula (4.2.1) for the particle energy together with the circular orbit condition ($\dot r = 0$), gives back (4.3.39).
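The quality of the post-Newtonian approximation (4.3.38) can be checked numerically against the exact formula (4.3.35). The sketch below is an illustration, not part of the original text; it works in units where m = 1 by default, and its function names are arbitrary.

```python
import math


def energy_exact(r, m=1.0):
    """Energy per unit rest mass on a circular orbit of radius r > 3m, eq. (4.3.35)."""
    return (r - 2.0 * m) / math.sqrt(r * (r - 3.0 * m))


def energy_post_newtonian(r, m=1.0):
    """Leading post-Newtonian approximation, eq. (4.3.38)."""
    return 1.0 - 0.5 * m / r


for r in (10.0, 100.0, 1000.0):
    exact = energy_exact(r)
    approx = energy_post_newtonian(r)
    # The difference should shrink roughly like (m/r)^2
    print(f"r = {r:7.1f} m   exact = {exact:.8f}   approx = {approx:.8f}   diff = {exact - approx:.2e}")
```

At r = 6m, the innermost stable circular orbit, the exact formula gives $\mathcal{E} = \sqrt{8/9}$, noticeably below the approximate value 1 − 1/12.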

4.4 The Periastron Advance of Planets or Stars

Let us now go back to (4.3.10) and to the second of (4.3.6) that are the exact relativistic definitions of energy and angular momentum in Schwarzschild geometry.

Making a substitution similar to the substitution (4.2.3) used in the Newtonian case, namely:

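$$\dot r \equiv \frac{dr}{d\tau}\,;\qquad r' \equiv \frac{dr}{d\phi}\,;\qquad\Rightarrow\quad \dot r = r'\,\frac{L}{r^2} \tag{4.4.1}$$

(4.3.10) becomes:

$$u'^2 + u^2 = C_0 + 2C_1u + \frac{2}{3}\,C_2u^3 \tag{4.4.2}$$

where:

$$C_0 = \frac{\mathcal{E}^2 - 1}{L^2} = \frac{2E\mu}{\ell^2} \tag{4.4.3}$$

$$C_1 = \frac{m}{L^2} = \frac{GM\mu^2}{\ell^2} \tag{4.4.4}$$

$$C_2 = 3m \tag{4.4.5}$$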

The definition (4.4.3), (4.4.4) of the parameters $C_{0,1}$ is consistent with their previous Newtonian definition (4.2.7) thanks to the relation (4.3.15) between the relativistic first integrals $\mathcal{E}$, L and their Newtonian analogues E, ℓ. Equation (4.4.2) is the differential orbit equation in Schwarzschild geometry and replaces the Newtonian equation (4.2.6): the difference resides in the term cubic in u, whose coefficient is $\frac{2}{3}C_2 = 2m$. In full analogy with the procedure followed in the Newtonian case we take a further derivative of (4.4.2) and we obtain:

$$2u'\left(u'' + u - C_1 - C_2u^2\right) = 0 \tag{4.4.6}$$

which admits two kinds of solutions, the already discussed circular orbits ($u' = 0$) and the non-circular ones that satisfy the differential equation:

$$u'' + u - C_1 = C_2u^2 \tag{4.4.7}$$

replacing the Keplerian equation (4.2.9). In the limit where $C_2u^2$ is negligible with respect to the other terms the general solution of (4.4.7) becomes (4.2.11) which, as we know, describes an elliptic orbit of semilatus rectum a and eccentricity e, the constant $C_1$ being:

$$C_1 = \frac{1}{a\left(1 - e^2\right)} \tag{4.4.8}$$

We can appreciate the difference between General Relativity and Newtonian physics if we study numerical solutions of (4.4.7) with the help of a computer programme. To this effect it is convenient to measure the radius r and the semilatus rectum a in units of the Schwarzschild emiradius m by setting

$$u = \frac{\hat u}{m}\,;\qquad a = \hat a\,m \tag{4.4.9}$$

Fig. 4.9 Schwarzschild geometry: Orbit of a massive test particle with Keplerian parameters a = 70m and e = 0.7 after 1 revolution

which, combined with (4.4.8), (4.4.5) reduces (4.4.7) to the form:

$$\hat u'' + \hat u - \frac{1}{\hat a\left(1 - e^2\right)} = 3\hat u^2 \tag{4.4.10}$$

Equation (4.4.10) is of the second order and numerical solutions can be obtained if we feed the computer programme with two initial conditions. The initial conditions appropriate to describe the physical problem we investigate are the following ones:

$$\hat u(0) = \frac{1}{\hat a\left(1 - e\right)} \tag{4.4.11}$$

$$\hat u'(0) = 0 \tag{4.4.12}$$

Equation (4.4.12) fixes the origin of the φ angle at the periastron, namely at the point in the orbit where the derivative of the radius goes to zero. Equation (4.4.11) states that the distance of the test particle from the star at the first periastron is the same as it would be in a Keplerian orbit, namely $r_- = a(1 - e)$ (compare with Fig. 4.4). Given the initial conditions, the subsequent n revolutions of the test particle are numerically determined by the differential equation. It is clear from our previous discussions that the relativistic effects will be evident only in narrow systems where a is small and in eccentric orbits, e → 1. So in order to emphasize the size of these effects we have chosen an example with the following parameters:

a = 70; e = 0.7 (4.4.13)

In Fig. 4.9 we display the shape of the orbit for this system after one revolution. As can be visually appreciated, the orbit is nearly an ellipse but not quite. At φ = 2π the test particle is at a distance r(2π) close to the initial value r(0) but slightly bigger. Furthermore the second periastron, namely the angle φ1 corresponding to the second zero of the derivative $r'$, is not exactly at φ1 = 2π but at a slightly later angle φ1 = 2π + Δφ. The phenomenon is more clearly understood if we allow the computer programme to run for more revolutions. Figure 4.10 displays three revolutions of the same physical system. At each revolution the periastron is advanced by some angle Δφ with respect to 2π. This is further stressed by Fig. 4.11 that shows the orbit after 7 revolutions.

Fig. 4.10 Schwarzschild geometry: Orbit of a massive test particle with Keplerian parameters a = 70m and e = 0.7 after 3 revolutions

Fig. 4.11 Schwarzschild geometry: Orbit of a massive test particle with Keplerian parameters a = 70m and e = 0.7 after 7 revolutions

Fig. 4.12 Schwarzschild geometry: Orbit of a massive test particle with Keplerian parameters a = 70m and e = 0.7 after 20 revolutions

It is interesting to understand which kind of pattern emerges at asymptotically late times after many revolutions. This is revealed by looking at Fig. 4.12 which shows the orbit after 20 revolutions. What we witness is a sort of symmetry restoration mechanism. In Newtonian physics the eccentric orbits break the spherical symmetry of the Hamiltonian. In general relativity, due to the periastron advance, the shape of the orbit tends to restore part of the broken symmetry since the positions of the periastron and apastron are symmetrically distributed on the unit circle.
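The numerical experiment described in this section can be reproduced without the MATHEMATICA package of Appendix B.1. The following is a minimal sketch, not the author's code: it integrates (4.4.10) with the initial conditions (4.4.11), (4.4.12) by a hand-written fourth-order Runge-Kutta scheme, in units where m = 1, and locates the next periastron as the angle where $u'$ changes sign from positive to negative. Step size and function names are illustrative choices.

```python
import math


def rhs(u, du, c1):
    """First-order form of (4.4.10): returns (u', u'') with lengths in units of m."""
    return du, c1 - u + 3.0 * u * u


def rk4_step(u, du, c1, h):
    """Advance (u, u') by one classical Runge-Kutta step of size h in phi."""
    k1u, k1d = rhs(u, du, c1)
    k2u, k2d = rhs(u + 0.5 * h * k1u, du + 0.5 * h * k1d, c1)
    k3u, k3d = rhs(u + 0.5 * h * k2u, du + 0.5 * h * k2d, c1)
    k4u, k4d = rhs(u + h * k3u, du + h * k3d, c1)
    u_new = u + h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
    du_new = du + h * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
    return u_new, du_new


def next_periastron_angle(a_hat, e, h=1.0e-3, phi_max=50.0):
    """Angle of the first periastron after phi = 0 (u' passes from > 0 to <= 0)."""
    c1 = 1.0 / (a_hat * (1.0 - e * e))
    u, du, phi = 1.0 / (a_hat * (1.0 - e)), 0.0, 0.0    # (4.4.11), (4.4.12)
    while phi < phi_max:
        u_new, du_new = rk4_step(u, du, c1, h)
        if du > 0.0 and du_new <= 0.0:
            # linear interpolation of the zero of u' inside the last step
            return phi + h * du / (du - du_new)
        u, du, phi = u_new, du_new, phi + h
    raise RuntimeError("no periastron found before phi_max")


angle = next_periastron_angle(70.0, 0.7)                 # parameters (4.4.13)
advance = angle - 2.0 * math.pi
first_order = 6.0 * math.pi / (70.0 * (1.0 - 0.7**2))    # formula (4.4.31)
print(f"periastron advance per revolution: {advance:.4f} rad")
print(f"first-order estimate (4.4.31):     {first_order:.4f} rad")
```

For an orbit this tight the numerically determined advance comes out larger than the first-order estimate (4.4.31), as expected since m/a is not small here.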

4.4.1 Perturbative Treatment of the Periastron Advance

The orbits that we have obtained by numerical evaluation in the previous section are extremely far from the planetary orbits of the solar system. Indeed, in order to emphasize the phenomenon of the periastron advance we have chosen an orbit size of the order of tens of Schwarzschild emiradii, while the actual order of magnitude of planetary orbits corresponds to hundreds of millions of Schwarzschild emiradii. For Mercury, for instance, which is the innermost planet of the solar system, the Keplerian parameters have the following values:

$$\begin{aligned}
a_{\rm Mercury} &= 55.46\times 10^{6}\ \mathrm{km}\\
e &= 0.2\\
m_\odot &= 1.4\ \mathrm{km}\quad\text{(Schwarzschild emiradius of the Sun)}
\end{aligned} \tag{4.4.14}$$

Large average radii mean that u is small, so that the $u^2$ term in (4.4.10) can be treated as a perturbation of the Newtonian differential equation (4.2.9). Let us therefore set:

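$$u(\phi) = u_0(\phi) + \Delta u(\phi) \tag{4.4.15}$$

where

$$u_0(\phi) \equiv \frac{1}{a\left(1 - e^2\right)}\left(1 + e\cos\phi\right) \tag{4.4.16}$$

is the unperturbed Newtonian solution of (4.2.9) and $\Delta u(\phi) \ll u_0(\phi)$ is a small perturbation. Since $u_0$ already satisfies (4.2.9) with $C_1 = 1/(a(1-e^2))$, disregarding terms of order $(\Delta u(\phi))^2$ and the cross term $6m\,u_0\,\Delta u$, which is of higher order in m, (4.4.7) becomes:

$$\Delta u'' + \Delta u = 3m\left(u_0\right)^2 = \sum_{n=0}^{2} H_n\cos^n\phi \tag{4.4.17}$$

where we have introduced the following constants:

$$H_0 = \frac{3m}{a^2\left(1 - e^2\right)^2}\,;\qquad H_1 = \frac{6em}{a^2\left(1 - e^2\right)^2}\,;\qquad H_2 = \frac{3e^2m}{a^2\left(1 - e^2\right)^2} \tag{4.4.18}$$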

Since (4.4.17) is linear in the unknown function Δu, its solution is the linear combination

$$\Delta u = \sum_{n=0}^{2}\Delta u_n(\phi) \tag{4.4.19}$$

where each term is the solution of an independent differential equation

$$\Delta u_n'' + \Delta u_n = H_n\cos^n\phi \tag{4.4.20}$$

Let us study the form of each $\Delta u_n$. The first contribution $\Delta u_0$ is uninteresting and irrelevant since, for an arbitrary constant k, we have:

$$\Delta u_0 = H_0 + k\cos\phi \tag{4.4.21}$$

so that replacing $u_0 \to u_0 + \Delta u_0$ amounts to an unobservable renormalization of the Keplerian parameters a and e. Next we consider $\Delta u_2$. Here a solution of the inhomogeneous equation is:

$$\Delta u_2 = \frac{H_2}{2} - \frac{H_2}{6}\cos 2\phi \tag{4.4.22}$$

Also the perturbation $u_0 \to u_0 + \Delta u_2$ is essentially unobservable, since $\Delta u_2(\phi)$ is periodic of period 2π. Indeed by replacing $u_0 \to u_0 + \Delta u_2$ we obtain a new orbit which is a slightly deformed ellipse where the periastron and apastron continue to occur at φ = 0 and φ = π as in the Keplerian case. This can be appreciated by looking at Figs. 4.13, 4.14, 4.15. In order to emphasize the extremely small effect of the perturbation we have chosen an ultra narrow and eccentric Keplerian orbit with parameters:

$$\begin{aligned}
\text{semilatus rectum} &= 8\ \text{Schwarzschild emiradii}\\
\text{eccentricity} &= 0.7
\end{aligned} \tag{4.4.23}$$

and in Fig. 4.13 we have compared the plot of the unperturbed function $u_0(\phi)$ with the perturbed one $u_0(\phi) + \Delta u_2(\phi)$. The resulting new orbit is plotted in Fig. 4.14 and compared with the unperturbed one. The periodic character of the perturbation implies that the new orbit looks like a deformed ellipse of slightly changed shape and size. We can compare the actual shape of the perturbed orbit with a purely Keplerian ellipse by defining a renormalized semilatus rectum and eccentricity via the formulae:

$$a_R = \frac{r(\pi) + r(0)}{2}\,;\qquad e_R = \frac{r(\pi) - r(0)}{r(\pi) + r(0)} \tag{4.4.24}$$

where

$$r(\phi) = \frac{1}{u_0(\phi) + \Delta u_2(\phi)} \tag{4.4.25}$$
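The renormalized parameters (4.4.24) are easily evaluated for the example (4.4.23). The sketch below is an illustration, not the author's code: it uses (4.4.16), (4.4.18), (4.4.22) and (4.4.25) in units where m = 1, and it should reproduce the values of $a_R$ and $e_R$ quoted for Fig. 4.15. The function names are arbitrary.

```python
import math


def perturbed_radius(phi, a, e, m=1.0):
    """r(phi) of (4.4.25): Keplerian u0 plus the periodic perturbation Delta u2."""
    p = a * (1.0 - e**2)
    u0 = (1.0 + e * math.cos(phi)) / p                  # (4.4.16)
    h2 = 3.0 * e**2 * m / p**2                          # H2 of (4.4.18)
    du2 = h2 / 2.0 - h2 / 6.0 * math.cos(2.0 * phi)     # (4.4.22)
    return 1.0 / (u0 + du2)


def renormalized_elements(a, e, m=1.0):
    """Renormalized (a_R, e_R) defined in (4.4.24)."""
    r_peri = perturbed_radius(0.0, a, e, m)
    r_apo = perturbed_radius(math.pi, a, e, m)
    return (r_apo + r_peri) / 2.0, (r_apo - r_peri) / (r_apo + r_peri)


a_R, e_R = renormalized_elements(8.0, 0.7)              # parameters (4.4.23)
print(f"a_R = {a_R:.5f} m,  e_R = {e_R:.6f}")
```

Repeating the evaluation with a large value of a shows how quickly the renormalization becomes negligible, since $\Delta u_2/u_0$ scales like m/a.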

Fig. 4.13 Comparison of the unperturbed solution u0(φ) (thin line) and of the perturbed one u0(φ) + Δu2(φ) (thick line) in the case of Keplerian parameters a = 8m, e = 0.7. As one sees, even in this extreme case of an ultra-narrow and ultra-relativistic orbit the perturbation modifies very slightly the shape of the function without changing its behavior and its periodicity

Fig. 4.14 Comparison of the Keplerian orbit of parameters a = 8m, e = 0.7 (thin line) with the perturbed orbit obtained by replacing $u_0 \to u_0 + \Delta u_2$ (thick line)

Fig. 4.15 Comparison between a Keplerian orbit of parameters a = 8m, e = 0.7 perturbed by the periodic perturbation $\Delta u_2$ (thick line) and a Keplerian orbit of parameters $a_R = 5.97683m$, $e_R = 0.624945$ (thin line) that, by definition, has the periastron and apastron located at the same distances from the center of gravity as the perturbed orbit. This comparison shows the deviation from a purely elliptic shape

In Fig. 4.15 we have plotted the Keplerian orbit with parameters $a = a_R = 5.97683m$, $e = e_R = 0.624945$, together with the perturbed orbit of Keplerian parameters a = 8m, e = 0.7. By definition in these two orbits the apastron and the periastron are at the same distance, yet the first is a true ellipse while the second, the perturbed orbit, is not and is slightly squashed. The change in shape is so small in actual astronomical situations, where $a \sim 10^9 m$, that the effect of the periodic perturbation $\Delta u_2$ is absolutely unobservable. We remain with the perturbation $\Delta u_1$ that, being non-periodic, is the most interesting one. For n = 1 a solution of the inhomogeneous equation is:

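$$\Delta u_1 = \frac{H_1}{2}\,\phi\sin\phi \tag{4.4.26}$$

As long as:

$$\beta \equiv \frac{H_1}{2}\times\frac{a\left(1 - e^2\right)}{e} = \frac{3m}{a\left(1 - e^2\right)} \ll 1 \tag{4.4.27}$$

the interpretation of the perturbation $u_0 \to u_0 + \Delta u_1$ is simple. It suffices to note that under the condition (4.4.27) we can write:

$$\cos\left[(1 - \beta)\phi\right] \simeq \cos\phi + \beta\phi\sin\phi + \mathcal{O}\left(\beta^2\right) \tag{4.4.28}$$

in order to conclude that:

$$u_0(\phi) + \Delta u_1(\phi) \simeq \frac{1}{a\left(1 - e^2\right)}\left\{1 + e\cos\left[(1 - \beta)\phi\right]\right\} \tag{4.4.29}$$

which is a periodic function of period

$$T = \frac{2\pi}{1 - \beta} > 2\pi \tag{4.4.30}$$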
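The step from (4.4.26) to (4.4.29) rests on the expansion (4.4.28), whose accuracy over one revolution can be checked numerically. A minimal sketch, not part of the original text, comparing the two sides of (4.4.29) multiplied by $a(1 - e^2)$; the sampling and the function name are illustrative choices.

```python
import math


def max_deviation(beta, e=0.6, samples=2000):
    """Largest difference over 0 <= phi <= 2 pi between
    1 + e cos((1 - beta) phi)                 (shifted-period form, (4.4.29))
    and 1 + e (cos phi + beta phi sin phi)    (u0 + Delta u1 in units of 1/(a (1 - e^2)))."""
    worst = 0.0
    for i in range(samples + 1):
        phi = 2.0 * math.pi * i / samples
        shifted = 1.0 + e * math.cos((1.0 - beta) * phi)
        perturbed = 1.0 + e * (math.cos(phi) + beta * phi * math.sin(phi))
        worst = max(worst, abs(shifted - perturbed))
    return worst


for beta in (1.0e-3, 1.0e-2):
    print(f"beta = {beta:.0e}   max deviation = {max_deviation(beta):.3e}")
```

The deviation should scale like $\beta^2$, which is why the identification is excellent for planetary orbits and only qualitative for the narrow orbits used in the figures.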

This means that at each revolution the periastron (and as a consequence the apastron) occurs not at an angle 2π after the previous periastron but rather after an angle T. For someone who looks at the sky this effect means that the angular position of the periastron advances by an angle:

$$\Delta\phi = T - 2\pi = 2\pi\left(\frac{1}{1 - \beta} - 1\right) \simeq 2\pi\beta = \frac{6\pi m}{a\left(1 - e^2\right)} \tag{4.4.31}$$

at each revolution. We can appreciate this phenomenon by means of numerical plots. As usual, in order to magnify the effect we choose a narrow relativistic orbit at a = 70m with eccentricity e = 0.6. In Fig. 4.16 we compare the plot of the unperturbed Keplerian solution $u_0(\phi)$ with the perturbed solution $u_0(\phi) + \Delta u_1(\phi)$ as interpreted in (4.4.29). The non-integer change in period is clearly seen in this plot. In Fig. 4.17 the classical Keplerian orbit $u_0(\phi)$ and the perturbed one $u_0(\phi) + \Delta u_1(\phi)$ are compared on several revolutions. Comparing Figs. 4.10, 4.11, 4.12, where we have plotted the numerical solution of the exact orbit differential equation at parameters a = 70m, e = 0.7, with Fig. 4.17 we see that the phenomenon of periastron advance featured

Fig. 4.16 Comparison of the unperturbed solution u0(φ) (thin line) and of the perturbed one u0(φ) + Δu1(φ) (thick line) in the case of Keplerian parameters a = 70m, e = 0.6. The shift in period of the perturbed solution is evident from the plot

Fig. 4.17 Comparison of the unperturbed Keplerian orbit of parameters a = 70m, e = 0.6 (thin line) with the orbit perturbed by $u_0(\phi) + \Delta u_1(\phi)$ (thick line). The phenomenon of the periastron advance is evident

by the exact solutions is essentially due to the non-periodic perturbation $\Delta u_1$ of (4.4.26) as we have claimed. Equation (4.4.31) is the final perturbative formula predicted by General Relativity for the periastron anomaly and constitutes one of the classical tests of the theory proposed by Einstein himself. In the case of the solar system we have $m = m_\odot = GM_\odot/c^2$ where $M_\odot$ is the solar mass. The same formula can also be applied to binary stellar systems. In this case we have:

$$m_{\rm binary} \sim \frac{G}{c^2}\left(m_1 + m_2\right) \tag{4.4.32}$$

where $m_{1,2}$ denote the masses of the two companion stars. It is interesting to insert the numerical values of the Keplerian parameters in the case of the planet Mercury (see (4.4.14)). We obtain:

$$\Delta\phi_{\rm Mercury} = 6\pi\,\frac{1.4}{0.96\times 55.46}\,10^{-6} \simeq 6\pi\times 2.6\times 10^{-8} \tag{4.4.33}$$

Since the period of Mercury is $T_{\rm Mercury} = 0.24$ years it follows that in one terrestrial century we have 100/0.24 = 416.6 orbits of Mercury. On the other hand, if we evaluate 6π in arc seconds we have:

$$6\pi = 3\times 60\times 60\times 360\ \text{arcsec} = 3{,}888{,}000\ \text{arcsec} \tag{4.4.34}$$

so that we can write:

$$\Delta\phi_{\rm Mercury}/\text{century} = 3{,}888{,}000\times 416.6\times 2.6\times 10^{-8}\ \text{arcsec/century} \simeq 42''/\text{century} \tag{4.4.35}$$

The value (4.4.35) is in perfect agreement with the historical records of astronomical observations after subtraction of all the Newtonian effects due to the perturbation on Mercury orbit by the motion of the other planets in the solar system.
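The estimate (4.4.33)–(4.4.35) can be repeated without intermediate rounding. The sketch below is illustrative and not part of the original text: it applies formula (4.4.31) to the parameters (4.4.14), and the function names are arbitrary.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def advance_per_orbit(m, a, e):
    """Periastron advance per revolution in radians, formula (4.4.31).
    m and a must be expressed in the same length unit."""
    return 6.0 * math.pi * m / (a * (1.0 - e**2))


def advance_per_century_arcsec(m, a, e, period_years):
    """Accumulated advance over 100 years, in arc seconds."""
    orbits_per_century = 100.0 / period_years
    return advance_per_orbit(m, a, e) * ARCSEC_PER_RADIAN * orbits_per_century


# Parameters (4.4.14), lengths in km, and the period quoted in the text
mercury = advance_per_century_arcsec(m=1.4, a=55.46e6, e=0.2, period_years=0.24)
print(f"Mercury: {mercury:.1f} arcsec per century")
```

The same function applies to a binary system once m is replaced by the combination (4.4.32).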

4.5 Light-Like Geodesics in the Schwarzschild Metric and the Deflection of Light Rays

Let us now go back to the Lagrangian (4.3.2) for time-like geodesics and replace it with its analogue for light-like geodesics. In this case we write:

$$-2\mathcal{L} = -\left(1 - \frac{2m}{r}\right)\left(\frac{dt}{d\lambda}\right)^2 + \left(1 - \frac{2m}{r}\right)^{-1}\left(\frac{dr}{d\lambda}\right)^2 + r^2\left[\left(\frac{d\theta}{d\lambda}\right)^2 + \sin^2\theta\left(\frac{d\phi}{d\lambda}\right)^2\right] \tag{4.5.1}$$

where λ is some affine parameter and we replace the constraint $2\mathcal{L} = 1$ with the new one:

$$0 = \frac{dx^\mu}{d\lambda}\,\frac{dx^\nu}{d\lambda}\,g_{\mu\nu}(x) \tag{4.5.2}$$

Just as in the case of time-like geodesics we can integrate the equation for the canonical variable θ by setting:

$$\theta = \frac{\pi}{2} = \text{const} \tag{4.5.3}$$

In other words the trajectories of photons and massless particles are planar just as those of massive particles. Also in full analogy with the massive case we have that the canonical variables t and φ are cyclic and are respectively associated with two first integrals of the motion, the energy $\mathcal{E}$ and the angular momentum L. Explicitly we find:

$$\begin{aligned}
\left(1 - \frac{2m}{r}\right)\frac{dt}{d\lambda} &= \mathcal{E} = \text{const}\\
r^2\,\frac{d\phi}{d\lambda} &= L = \text{const}
\end{aligned} \tag{4.5.4}$$

Fig. 4.18 The effective radial potential for the motion of massless particles in Schwarzschild geometry. It displays a maximum at r = 3m but no minima. In this figure the plot is given for L = 5

The difference with the massive case emerges when we enforce the new constraint (4.5.2) which yields the relation:

$$0 = \frac{\mathcal{E}^2}{1 - \frac{2m}{r}} - \frac{\dot r^2}{1 - \frac{2m}{r}} - \frac{L^2}{r^4}\,r^2 \tag{4.5.5}$$

replacing (4.3.7). From (4.5.5) we obtain:

$$\begin{aligned}
\left(\frac{dr}{d\lambda}\right)^2 &= \mathcal{E}^2 - V^{\rm photon}_{\rm eff}(r)\\
\frac{d\phi}{d\lambda} &= \frac{L}{r^2}
\end{aligned} \tag{4.5.6}$$

where the effective potential

$$V^{\rm photon}_{\rm eff}(r) = \frac{L^2}{r^2} - \frac{2mL^2}{r^3} \tag{4.5.7}$$

is the light-like counterpart of the massive effective potential (4.3.9). We can immediately note the differences. While the centrifugal barrier $\sim \frac{1}{r^2}$ is common to both cases, the Newtonian attraction $\sim -\frac{1}{r}$ is absent in the light-like case. This corresponds to the obvious fact that in Newton’s theory photons do not feel gravity since they are massless. Yet the short range attractive correction due to General Relativity, $\sim -\frac{1}{r^3}$, applies both to photons and massive particles. It is this correction which is responsible for the bending of light-rays in strong gravitational fields like that generated by the sun. This phenomenon is the light-like counterpart of the periastron advance and constitutes another classical test of General Relativity. The plot of the photon potential $V^{\rm photon}_{\rm eff}(r)$ is displayed in Fig. 4.18. As is evident from its shape there is a maximum at r = 3m but no minima. Hence, in the case of photons there are no stable circular orbits: only open orbits are available that correspond to the aforementioned bending of light-rays. We study such a phenomenon by writing the exact differential equation of the orbit. Proceeding in full analogy with our previous treatment of the time-like geodesics we trade the derivatives with respect to the affine parameter for the derivatives with respect to the azimuthal angle:

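$$\frac{dr}{d\lambda} = \frac{dr}{d\phi}\,\frac{d\phi}{d\lambda} = \frac{dr}{d\phi}\,\frac{L}{r^2} \tag{4.5.8}$$

Inserting (4.5.8) into (4.5.6) and introducing the impact parameter:

$$b \equiv \frac{L}{E} \tag{4.5.9}$$

we obtain:

$$\frac{dr}{d\phi} = \pm\, r^2\left[\frac{1}{b^2} - \frac{1}{r^2}\left(1-\frac{2m}{r}\right)\right]^{1/2} \tag{4.5.10}$$

which, in terms of $u = 1/r$, can be further converted into the exact orbit equation:

$$\frac{d\phi}{du} = \pm\left[\frac{1}{b^2} - u^2 + 2mu^3\right]^{-1/2} \tag{4.5.11}$$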
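The statements about the photon potential (4.5.7) and the turning point of the orbit equation (4.5.10) can be probed numerically. The following is a minimal sketch, assuming units with m = 1 and the value L = 5 quoted in the caption of Fig. 4.18; the function names, the grid search and the bisection are illustrative choices and are not taken from the text.

```python
import math

M = 1.0   # geometric mass m; units with m = 1 are an assumption of this sketch
L = 5.0   # angular momentum, the value quoted in the caption of Fig. 4.18


def v_eff_photon(r, m=M, ang_mom=L):
    """Photon effective potential (4.5.7): L^2/r^2 - 2 m L^2/r^3."""
    return ang_mom**2 / r**2 - 2.0 * m * ang_mom**2 / r**3


def potential_maximum(r_min=2.0, r_max=20.0, steps=180000):
    """Grid search for the radius that maximises V_eff on [r_min, r_max]."""
    best_r, best_v = r_min, v_eff_photon(r_min)
    for i in range(1, steps + 1):
        r = r_min + (r_max - r_min) * i / steps
        v = v_eff_photon(r)
        if v > best_v:
            best_r, best_v = r, v
    return best_r


def closest_approach(b, m=M, tol=1e-13):
    """Radius where dr/dphi = 0 in (4.5.10), i.e. 1/b^2 = u^2 - 2 m u^3.

    Bisection on u = 1/r in (0, 1/(3m)), where u^2 - 2 m u^3 is increasing.
    A root exists there only if b > 3*sqrt(3)*m; otherwise the photon
    passes over the maximum of the potential and has no turning point.
    """
    def f(u):
        return u**2 - 2.0 * m * u**3 - 1.0 / b**2

    lo, hi = 0.0, 1.0 / (3.0 * m)
    if f(hi) <= 0.0:
        raise ValueError("no turning point for this impact parameter")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 2.0 / (lo + hi)


if __name__ == "__main__":
    print("maximum of V_eff at r =", potential_maximum())
    print("closest approach for b = 8.5 m:", closest_approach(8.5))
```

In the flat limit m → 0 the turning point sits at r = b; with m > 0 it moves slightly inwards, which is the first hint of the bending discussed below.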

The name given to the impact parameter (4.5.9) is easily justified if we integrate (4.5.11) in the Newtonian approximation:

$$2mu^3 \ll u^2 \tag{4.5.12}$$

Indeed if we disregard the term $2mu^3$ we obtain:

$$d\phi = \pm\frac{du}{\sqrt{\frac{1}{b^2} - u^2}} \tag{4.5.13}$$

which immediately yields:

$$\phi + \phi_0 = \arcsin[bu] \tag{4.5.14}$$

leading to the equation of the Newtonian orbit of a photon:

$$r = \frac{b}{\sin[\phi + \phi_0]} \tag{4.5.15}$$

Equation (4.5.15) describes a straight line in the $\theta = \frac{\pi}{2}$ plane, as shown in Fig. 4.19. The impact parameter b is geometrically interpreted as the minimal distance from the center of mass reached by the massless particle while traveling along its trajectory. Just as in the massive case, the relativistic perturbation $2mu^3$ changes this conclusion. Squaring both sides of (4.5.11) we obtain:

$$\left(\frac{du}{d\phi}\right)^2 - \left(\frac{1}{b^2} - u^2 + 2mu^3\right) = 0 \tag{4.5.16}$$

Fig. 4.19 The Newtonian trajectory of a photon is a straight line and the impact parameter is the minimal distance from the center of mass reached along such a trajectory

and taking a further $\frac{d}{d\phi}$ derivative of (4.5.16) we find:

$$2u'\left(u'' + u - 3mu^2\right) = 0 \tag{4.5.17}$$

which is solved either by $u' = 0$ or by:

$$u'' + u = 3mu^2 \tag{4.5.18}$$

The first possibility, corresponding to circular orbits, is discarded since, as we have already observed, the effective potential has no minima and hence admits no stable circular orbits. Hence we are left with (4.5.18), which is the light-like counterpart of (4.4.10). Just as in the massive case, the differential equation (4.5.18) can be treated either numerically or perturbatively, when the Newtonian approximation (4.5.12) applies. In order to emphasize the new relativistic phenomenon implied by Schwarzschild geometry, namely the gravitational bending of light rays, we have considered the extreme situation provided by a photon impinging on a center of mass with a tiny impact parameter, namely

$$b = 8.5\,m \quad \text{(i.e. 8.5 Schwarzschild half-radii)} \tag{4.5.19}$$

and choosing some non-vanishing incidence angle ($\phi_0 = \pi/5$ in our example) we have numerically solved (4.5.18) with initial conditions:

$$u(0) = \frac{\sin\phi_0}{b}\,; \qquad u'(0) = \frac{\cos\phi_0}{b} \tag{4.5.20}$$

The resulting photon trajectory is displayed in Fig. 4.20. As one sees, rather than proceeding undisturbed on a straight line, the photon makes a turn around the center of mass and emerges at an angle different from the one of incidence. The exact relativistic orbit is compared with the corresponding Newtonian one in Fig. 4.21.

Fig. 4.20 The trajectory of a photon in Schwarzschild geometry with impact parameter b = 8.5m and incidence angle φ0 = π/5

Fig. 4.21 Comparison between the Newtonian trajectory (thick line) and the exact relativistic trajectory (thin line) of a photon that impinges on a center of mass with impact parameter b = 8.5m and incidence angle φ0 = π/5
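A numerical integration of the orbit equation u'' + u = 3mu² of the kind described above can be sketched as follows. This is not the author's code: it is a plain fourth-order Runge-Kutta scheme, assuming units with m = 1, a fixed step h, and initial data u(0) = sin φ0 / b, u'(0) = cos φ0 / b that match the Newtonian straight line sin(φ + φ0)/b at φ = 0. The function name `exit_angle` and its parameters are hypothetical.

```python
import math


def exit_angle(b, m=1.0, phi0=0.0, h=1e-3, phi_max=4.0 * math.pi):
    """Integrate u'' + u = 3 m u^2 from phi = 0 and return the angle at which
    the photon escapes to infinity (u returns to zero), or None if it falls
    below r = 2m or does not escape before phi_max."""
    def accel(u):
        return -u + 3.0 * m * u * u

    phi = 0.0
    u = math.sin(phi0) / b
    v = math.cos(phi0) / b          # v stands for du/dphi
    while phi < phi_max:
        # classical RK4 step for the first-order system u' = v, v' = accel(u)
        k1u, k1v = v, accel(u)
        k2u, k2v = v + 0.5 * h * k1v, accel(u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, accel(u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, accel(u + h * k3u)
        u_next = u + h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        v_next = v + h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        if u_next > 1.0 / (2.0 * m):
            return None             # photon captured
        if u_next <= 0.0:
            # linear interpolation of the zero of u inside the last step
            return phi + h * u / (u - u_next)
        phi, u, v = phi + h, u_next, v_next
    return None


if __name__ == "__main__":
    for b in (8.5, 30.0, 1000.0):
        phi_exit = exit_angle(b)
        # with phi0 = 0 the Newtonian exit angle is pi, so the excess is the bending
        print(b, phi_exit - math.pi)
```

With φ0 = 0 the unperturbed photon would leave at φ = π, so `exit_angle(b) - math.pi` measures the total bending; for b much larger than m it should approach the perturbative value derived later in this section.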

The perturbative treatment of (4.5.18) follows the same lines as the perturbative treatment of the orbit equation for the massive particle (4.4.10). We set

$$u(\phi) = u_0(\phi) + \Delta u(\phi) \tag{4.5.21}$$

where, according to (4.5.15), the unperturbed solution is:

$$u_0(\phi) = \frac{1}{b}\sin[\phi + \phi_0] \tag{4.5.22}$$

In the post-Newtonian approximation (4.5.12) we obtain the following differential equation for the perturbation:

$$\Delta u'' + \Delta u = 3m\,u_0^2 = \frac{3m}{b^2}\left(1 - \cos^2[\phi + \phi_0]\right) \tag{4.5.23}$$

which admits the following particular solution:

$$\Delta u(\phi) = \frac{3m}{2b^2}\left(1 + \frac{1}{3}\cos 2(\phi + \phi_0)\right) \tag{4.5.24}$$

Hence, choosing our reference frame orientation in such a way that $\phi_0 = 0$, we can write the perturbed solution:

$$u_{\text{pert}}(\phi) = \frac{1}{b}\sin[\phi] + \frac{3m}{2b^2}\left(1 + \frac{1}{3}\cos[2\phi]\right) \tag{4.5.25}$$
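That the particular solution quoted above does solve the perturbation equation Δu'' + Δu = 3m u0² can be checked in a few lines. The worked step below is a sketch that uses only the double-angle identity; the shorthand x ≡ φ + φ0 is introduced here for brevity and is not the author's notation.

```latex
% x \equiv \phi + \phi_0 ; primes denote d/d\phi = d/dx
\begin{align*}
3m\,u_0^2 &= \frac{3m}{b^2}\,\sin^2 x \;=\; \frac{3m}{2b^2}\,\bigl(1 - \cos 2x\bigr), \\
\Delta u &= \frac{3m}{2b^2}\Bigl(1 + \tfrac{1}{3}\cos 2x\Bigr)
\quad\Longrightarrow\quad
\Delta u'' = -\,\frac{3m}{2b^2}\cdot\tfrac{4}{3}\cos 2x, \\
\Delta u'' + \Delta u &= \frac{3m}{2b^2}\Bigl(1 + \tfrac{1}{3}\cos 2x - \tfrac{4}{3}\cos 2x\Bigr)
\;=\; \frac{3m}{2b^2}\,\bigl(1 - \cos 2x\bigr).
\end{align*}
```

The last line coincides with the first, so the source term is reproduced exactly.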

Fig. 4.22 Comparison between the Newtonian and post-Newtonian perturbed trajectory of a photon impinging on a center of mass with impact parameter b = 30m and incidence angle φ0 = 0

The corresponding trajectory is plotted in Fig. 4.22 for an impact parameter b = 30m. In the same figure we have also plotted the corresponding Newtonian trajectory, namely a straight line with the same minimal distance from the center of mass. The angle δ shown in Fig. 4.22 is one half of the total deflection angle of light rays predicted by General Relativity:

Δφlight = 2|δ| (4.5.26)

The angle δ can be evaluated from the post-Newtonian perturbed solution (4.5.25) with the following reasoning. The incoming and outgoing photons are at infinite distance from the center of mass, namely at $r = \infty \Leftrightarrow u = 0$. If we fix $\phi_0 = 0$, the zeros of the unperturbed function $u_0(\phi)$ (see (4.5.22)) are at:

$$r = \infty \;\Leftrightarrow\; u_0 = 0 \;\Rightarrow\; \phi^{(0)}_\infty = \begin{cases} 0 & \text{incoming photon} \\ \pi & \text{outgoing photon} \end{cases} \tag{4.5.27}$$

The zeros of the perturbed function will instead occur at:

$$r = \infty \;\Leftrightarrow\; u_{\text{pert}} = 0 \;\Rightarrow\; \phi^{(\text{pert})}_\infty = \begin{cases} 0 - \delta & \text{incoming photon} \\ \pi + \delta & \text{outgoing photon} \end{cases} \tag{4.5.28}$$

where $\delta \ll 1$ is small. Hence we can use the power series expansion of trigonometric functions and write:

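$$\sin\phi_\infty \sim \phi_\infty = -\delta\,; \qquad \cos 2\phi_\infty \sim 1 \tag{4.5.29}$$

for the incoming photon, so that the condition $u_{\text{pert}} = 0$ becomes:

$$0 = -\frac{\delta}{b} + \frac{3m}{2b^2}\left(1 + \frac{1}{3}\right) \;\Rightarrow\; \delta = 2\,\frac{m}{b} \tag{4.5.30}$$

Inserting (4.5.30) into (4.5.26) we conclude that General Relativity predicts the following total deflection angle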

$$\Delta\phi_{\text{light}} = 4\,\frac{MG}{c^2}\,\frac{1}{b} \tag{4.5.31}$$

for light-rays impinging on a center of mass M with impact parameter b. It is interesting to give a numerical evaluation of this effect in the case of light rays (actually any electromagnetic wave signal) passing close to the Sun. In this case the maximal possible effect occurs when the impact parameter is the minimal conceivable one, namely when b = R equals the radius of our star. The astronomical data are

$$R = 6.96 \times 10^{5}\ \text{km}\,; \qquad m = 1.47\ \text{km} \tag{4.5.32}$$

and imply a maximal deflection angle of

$$\max \Delta\phi^{\text{Sun}}_{\text{light}} = 4 \times \frac{1.47}{6.96} \times 10^{-5} = 8.45 \times 10^{-6}\ \text{rad} = 1.74\ \text{arcsec} \tag{4.5.33}$$

Although tiny, this effect is observable and experiments are in full agreement with the theory. The first measurement was performed during a solar eclipse in 1919. In the seventies it was confirmed with much higher precision by means of radar signals reflected from a satellite orbiting near the Sun.
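Both steps of this estimate lend themselves to a quick numerical check. The sketch below first locates the zero of the perturbed solution u_pert(φ) = sin φ / b + (3m / 2b²)(1 + cos 2φ / 3) near φ = 0 by bisection and compares it with δ = 2m/b, then evaluates the total deflection 4m/b for the solar data quoted above. The function names are hypothetical and units with m = 1 are assumed in the first part.

```python
import math


def delta_from_perturbed_orbit(b, m=1.0, tol=1e-14):
    """Zero of u_pert near phi = 0, returned as delta = -phi_infinity.

    The bracket [-0.5, 0] is valid when b is much larger than m,
    which is the regime where the perturbative solution applies.
    """
    def u_pert(phi):
        return (math.sin(phi) / b
                + 3.0 * m / (2.0 * b * b) * (1.0 + math.cos(2.0 * phi) / 3.0))

    lo, hi = -0.5, 0.0              # u_pert(lo) < 0 < u_pert(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if u_pert(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return -0.5 * (lo + hi)


def deflection_arcsec(m_km, b_km):
    """Total deflection 4 m / b converted from radians to arc seconds."""
    return math.degrees(4.0 * m_km / b_km) * 3600.0


if __name__ == "__main__":
    b = 30.0
    print("delta (root of u_pert):", delta_from_perturbed_orbit(b))
    print("delta (2m/b)          :", 2.0 / b)
    print("Sun, grazing ray [arcsec]:", deflection_arcsec(1.47, 6.96e5))
```

For b = 30m the root differs from 2m/b only at higher order in m/b, consistent with the small-angle expansion used in the text.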

Chapter 5 Einstein Versus Yang-Mills Field Equations: The Spin Two Graviton and the Spin One Gauge Bosons

Our imagination is stretched to the utmost, not, as in fiction, to imagine things which are not really there, but just to comprehend those things which are there. . . Richard Feynman

5.1 Introduction

Enough evidence has been provided in previous chapters that the law of Newtonian mechanics:

$$\mathbf{F} = m\,\mathbf{a} \tag{5.1.1}$$

relating the acceleration a of a point-particle of mass m to the total force F applied to it can be successfully replaced by the statement that the world-line of that particle should be a time-like (or null-like if m = 0) geodesic of the pseudo-Riemannian manifold $(\mathcal{M}, g)$ which describes space-time. In particular we saw that, by choosing g to be the Schwarzschild metric, we can retrieve the entire corpus of Newtonian Astronomy as brought to perfection by the monumental work of Laplace in his Exposition du Système du Monde.¹

The next question is: what fixes the choice of the metric? Newton's theory is composed of two parts. Law (5.1.1) establishes how a particle reacts to the presence of a force-field, while the celebrated attraction law:

$$F = -G\,\frac{mM}{r^2} \tag{5.1.2}$$

determines the force on the particle of mass m generated by the presence of another mass M. In General Relativity, the Einstein field equations, which constitute the topic of the present chapter, replace Newton's law (5.1.2). They are second-order differential equations to be satisfied by the metric tensor $g_{\mu\nu}(x)$, obtained by equating a certain two-index symmetric tensor $G_{\mu\nu}$, constructed with the components of the

¹ See Chap. 2 of Volume 2 for more information on Laplace and his contributions to gravitational theories.


Fig. 5.1 A graphical representation of Eddington’s sheet metaphor which summarizes the basic idea of Einstein Gravity Theory

four-index tensor $R^{\alpha\beta}{}_{\gamma\delta}$, encoding the geometry of space-time, to the stress-energy tensor $T_{\mu\nu}$ encoding the mass-energy content of the latter. The fundamental geometric idea is summarized by the metaphor of Eddington's sheet, shown in Fig. 5.1. Space-time is like a flat elastic sheet which is curved and bent by any heavy mass we might place on it. Light particles, which would just go along straight lines if the sheet were flat, instead bend their paths if the sheet is curved by its mass-energy content.

The path historically followed by Einstein to derive his own field equations was long and complex. Indeed it took about ten years, from the first conception of a generally covariant theory of gravity, to arrive at the publication of the 1915 article containing the final form of the field equations. Considering the matter a posteriori, from the vantage point of contemporary Theoretical Physics, we can present a few different but equally well knit chains of arguments that all lead to the same unique result found by Einstein. These different derivations emerge from the interplay of the two complementary aspects of any field theory, namely

1. the microscopic particle-theory aspect,
2. the macroscopic geometrical aspect,

with the two available formalisms one can utilize to discuss (pseudo-)Riemannian geometry, namely:

1. the original metric formalism of Riemann, codified in the tensor calculus of Ricci and Levi Civita,
2. the vielbein or repère mobile formalism invented by Cartan.

The choice between the two formalisms is not only a matter of taste or convenience, but is intimately related to two fundamental conceptual issues. The first is the question whether gravity is described by a gauge theory, as happens to be true of all the other fundamental non-gravitational interactions. The second issue is even more fundamental and concerns the gravitational interaction of fermionic particles that, by definition, happen to be the quanta of spinorial fields.
In the classical tensor calculus of Ricci and Levi Civita there is no place for spinors, so that the gravitational interaction of, say, two electrons, cannot be described within the original metric formulation of General Relativity: an obvious absurdity. Cartan's vielbein formalism allows instead for the gravitational coupling of spinor fields by means of the spin-connection, which replaces the Levi Civita connection. From this it follows that the vielbein formalism is actually more fundamental than the metric formalism and extends the latter, which is not merely equivalent to the former but rather contained in it.

Finally comes the question of the Lagrangian formulation of Einstein field equations, namely their derivation from a variational principle. Also in this respect the vielbein formalism shows its superiority in comparison with the metric formalism, since within the former framework both the action and its variation are much simpler. Yet the particle interpretation of the theory is much clearer in the metric formulation and the spin two quantum of the gravitational field, the graviton, is best described as an infinitesimal fluctuation of the metric tensor $g_{\mu\nu}(x)$.

In view of these facts the present chapter is organized as follows. First, taking a purely geometric view-point, we recast Riemannian and pseudo-Riemannian geometry in the vielbein formalism à la Cartan. In this framework the derivation of Bianchi identities, which will play a fundamental role in establishing the form of Einstein equations, is particularly simple.

Secondly, using electro-magnetic theory and Yang-Mills theory as a term of comparison, we discuss the linearized form of Einstein field equations and we argue why and how they should be non-linearly extended as a consequence of logical self-consistency.
The full non-linear form of the field equations is then uniquely determined with the help of the previously derived Bianchi identities.

Thirdly we discuss the action principle and we study its structural and symmetry properties, emphasizing similarities and fundamental differences with Yang-Mills Theories.

The main goal of the present chapter will be attained if the reader develops a clear-cut awareness that, without gravity, the other gauge theories of fundamental interactions could not even be formulated. Yet, although a gauge theory in itself, gravity is very specific and all of its peculiarities can be traced back to the spin two of its fundamental quantum, as opposed to the spin one of the non-gravitational interaction messengers.

5.2 Locally Inertial Frames and the Vielbein Formalism

As we anticipated in Sect. 3.2.5, the very idea of a Cartan repère mobile has its distant historical roots in the work of Frenet and Serret. In the context of (pseudo-)Riemannian geometry, it naturally emerges from the concept of Locally Inertial Frames, which is physically related to the Equivalence Principle and to Einstein's Gedankenexperiment of the freely falling lift. The whole matter is summarized in the following mathematical lemma.

Lemma 5.2.1 Let $(\mathcal{M}, g)$ be a (pseudo-)Riemannian manifold of dimension N and $p \in \mathcal{M}$ an arbitrary point of that manifold. Moreover let $U_p$ be an open neighborhood of p, whose points are labeled by coordinates $x^\mu$. In the same neighborhood we can always construct a new coordinate system $x'^\mu$ with the following properties:

(a) The coordinates $x'^\mu$ of the point p vanish: $x'^\mu(p) = 0$.

(b) The value of the metric tensor at p in the x'-coordinates is just equal to the signature of the metric:

$$g'_{\mu\nu}(p) = \begin{pmatrix} \pm 1 & 0 & \cdots & 0 & 0 \\ 0 & \pm 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & 0 & \pm 1 & 0 \\ 0 & \cdots & \cdots & 0 & \pm 1 \end{pmatrix}$$

(c) The coefficients of the Levi Civita connection vanish at p in the x' system: $\Gamma'^{\lambda}{}_{\mu\nu}(p) = 0$.

The coordinate system $x'^\mu$ with the above properties is named a Locally Inertial Frame.

Proof To prove the lemma we just need to consider the transformation properties of the metric tensor and of the Levi Civita connection. For the first we have:

$$g'_{\mu\nu} = \frac{\partial x^\rho}{\partial x'^\mu}\,\frac{\partial x^\sigma}{\partial x'^\nu}\,g_{\rho\sigma} \tag{5.2.1}$$

while for the second, adapting to the case of the tangent bundle the general transformation rule (3.4.14) of a principal connection, we find:

$$\Gamma'^{\lambda}{}_{\mu\nu} = -\frac{\partial x'^\lambda}{\partial x^\sigma}\,\frac{\partial^2 x^\sigma}{\partial x'^\mu\,\partial x'^\nu} + \frac{\partial x'^\lambda}{\partial x^\sigma}\,\Gamma^{\sigma}{}_{\rho\tau}\,\frac{\partial x^\rho}{\partial x'^\mu}\,\frac{\partial x^\tau}{\partial x'^\nu} \tag{5.2.2}$$

Let us now implicitly define the new coordinates $x'^\lambda$ as the solutions of the following quadratic equations:

$$x^\mu = x^\mu(p) + Q^\mu{}_\nu\,x'^\nu + \frac{1}{2}\,\Omega^\mu{}_{\rho\sigma}\,x'^\rho x'^\sigma \tag{5.2.3}$$

where $x^\mu(p)$ are the coordinates of the point $p \in \mathcal{M}$ in the old chart and Q and Ω are coefficients to be determined. By means of the position (5.2.3), condition (a) of the lemma is automatically satisfied: $x'(p) = 0$. Consider next point (b) of the lemma and let us evaluate the metric coefficients $g'_{\mu\nu}$ at p in the new frame. By using (5.2.3) in (5.2.1) we find:

$$g'(p) = Q^T g(p)\,Q \tag{5.2.4}$$

where we purposely introduced matrix notation. Recalling the results of Sect. 3.6.1, we know that, whatever the non-degenerate symmetric matrix g(p) might be, we can always find a non-degenerate matrix Q that reduces g(p) to a diagonal matrix with eigenvalues that are all either plus one or minus one. Hence the coefficients $Q^\mu{}_\nu$ in the transformation (5.2.3) are defined by the existing solution of such a problem. Incidentally let us remark that we actually have an $\frac{N(N-1)}{2}$-dimensional manifold of such solutions. Indeed, suppose that $Q_0$ is such that:

$$g'(p) = Q_0^T\,g(p)\,Q_0 = \mathrm{diag}\bigl(\underbrace{+,\dots,+}_{p\text{-times}},\underbrace{-,\dots,-}_{(N-p)\text{-times}}\bigr) \equiv \eta \tag{5.2.5}$$

and let $\Lambda \in \mathrm{SO}(p, N-p)$ be an arbitrary element of the pseudo-orthogonal group preserving the flat metric η, i.e. $\Lambda^T\eta\,\Lambda = \eta$. For any such Λ, the matrix $Q = Q_0\Lambda$ is another solution of the posed problem. Hence, as far as the linear terms are concerned, the transformation (5.2.3) is fixed up to SO(p, N − p) transformations. Consider next point (c) of the lemma. From (5.2.2) we get:

$$\Gamma'^{\lambda}{}_{\mu\nu}(p) = \frac{\partial x'^\lambda}{\partial x^\theta}(p)\,\Bigl(-\Omega^\theta{}_{\mu\nu} + \Gamma^\theta{}_{\rho\sigma}(p)\,Q^\rho{}_\mu Q^\sigma{}_\nu\Bigr) \tag{5.2.6}$$

Hence it suffices to pose $\Omega^\theta{}_{\mu\nu} = \Gamma^\theta{}_{\rho\sigma}(p)\,Q^\rho{}_\mu Q^\sigma{}_\nu$ and the locally inertial frame that satisfies the conditions of the lemma is explicitly constructed. □
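Step (b) of the proof, finding a matrix Q with Qᵀ g(p) Q = diag(±1), can be made concrete with a small amount of linear algebra. The sketch below uses symmetric Gaussian elimination, which is one possible method and not necessarily the one of Sect. 3.6.1. The sample metric, a two-dimensional metric with components g = [[f, -1], [-1, 0]] and f = 1/2, is a hypothetical illustration introduced here; the function names are likewise assumptions.

```python
import math


def locally_flat_frame(g):
    """Return Q with Q^T g Q = diag(+-1), by symmetric Gaussian elimination.

    Assumes every pivot met along the way is non-zero (no pivoting is done);
    the signs on the diagonal appear in the order of the pivots.
    """
    n = len(g)
    a = [[float(x) for x in row] for row in g]
    q = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        pivot = a[k][k]
        if abs(pivot) < 1e-12:
            raise ValueError("zero pivot: reorder the coordinates first")
        for j in range(k + 1, n):
            factor = a[k][j] / pivot
            for i in range(n):      # column_j -= factor * column_k (on a and q)
                a[i][j] -= factor * a[i][k]
                q[i][j] -= factor * q[i][k]
            for i in range(n):      # row_j -= factor * row_k (on a only)
                a[j][i] -= factor * a[k][i]
        scale = 1.0 / math.sqrt(abs(pivot))
        for i in range(n):          # normalise the k-th diagonal entry to +-1
            q[i][k] *= scale
    return q


def congruence(q, g):
    """Matrix product Q^T g Q written out with explicit sums."""
    n = len(g)
    return [[sum(q[r][i] * g[r][s] * q[s][j]
                 for r in range(n) for s in range(n))
             for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    g_sample = [[0.5, -1.0], [-1.0, 0.0]]   # hypothetical non-diagonal metric
    q_sample = locally_flat_frame(g_sample)
    for row in congruence(q_sample, g_sample):
        print(row)
```

Multiplying the resulting Q on the right by any pseudo-orthogonal Λ gives another valid solution, which is the residual freedom noted in the proof.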

As anticipated, the existence of locally inertial frames is the very content of the Equivalence Principle and the starting point of General Relativity. At any point in space-time, as long as we can make measurements only in a very small neighborhood around us, we cannot decide whether gravitational accelerations felt by falling bodies are due to the presence of a genuine gravitational field or simply due to the non-inertial character of the chosen reference frame. Indeed local accelerations and other bending forces acting at point p on the world line of any test particle crossing through such a point are provided by the connection coefficients $\Gamma^\lambda{}_{\mu\nu}(p)$. As we have seen, the latter can be set to zero by choosing an appropriate reference frame. Hence can we conclude that gravity is just an illusion? By no means! The subtle point in the above discussion is inherent to the word locally which precedes the qualifier inertial. In the proof of the lemma we have shown how to construct, for any point $p \in \mathcal{M}$, a coordinate system in which $\Gamma'^{\lambda}{}_{\mu\nu}$ vanishes at p, yet nothing was stated about the vanishing of $\Gamma'^{\lambda}{}_{\mu\nu}$ at neighboring points. The opposite is true. In the coordinate system where the connection coefficients vanish at p, they are generically non-zero at all other points of the neighborhood $U_p$. Obviously, choosing an infinitesimally close point p + dp we can repeat the above construction and find a new coordinate transformation $x'^\mu(x, p + dp)$ such that $\Gamma'^{\lambda}{}_{\mu\nu}$ vanishes now at p + dp, yet the transformation $x'^\mu(x, p + dp)$ and the old one $x'^\mu(x, p)$ are not the same. This elementary observation is what leads to the concept of vielbein or, as Cartan named it in French, repère mobile.

5.2.1 The Vielbein

The word vielbein comes from the German viel = many and bein = legs. Originally this nomenclature was introduced in classical General Relativity as vierbein, from the German vier = four and the already explained bein. It denoted the mobile reference frame of a four-dimensional Lorentzian space-time. Sometimes the same concept was referred to by means of the more classically sounding word tetrad. With the development of unified theories in diverse dimensions, there appeared in the literature all sorts of German multiple enumerations of legs, i.e. the fünfbein, the zehnbein, the elfbein, up to the sechsundfünfzigbein. It was finally tacitly agreed to use the word vielbein, which covers all possible dimensions.

Consider a (pseudo-)Riemannian manifold $(\mathcal{M}, g)$ of dimension m and, in any local chart $U \subset \mathcal{M}$, let us construct the family of coordinate transformations which, point by point $p \in U$, realize the three conditions mentioned in the lemma. The result is a multiplet of m bi-local functions $\xi^a(y, x)$ (a = 1, ..., m), where $x^\mu$ denote (in the old frame) the coordinates of the point p where the connection is required to vanish and the metric to become η, while $y^\mu$ are the coordinates (also in the old frame) of a generic point belonging to the neighborhood U. By construction we have $\xi^a(x, x) = 0$. Let us now define the following m × m square matrix which smoothly depends on the point p, namely on the coordinates $x^\mu$:

$$E^a_\mu(x) = \left.\frac{\partial \xi^a(y, x)}{\partial y^\mu}\right|_{y=x}\,, \qquad E^\mu_a(x) \equiv \bigl(E^{-1}(x)\bigr)^\mu_a \;\Leftrightarrow\; E^a_\mu(x)\,E^\mu_b(x) = \delta^a_b \tag{5.2.7}$$

where $\delta^a_b$ is the Kronecker delta, namely the identity matrix.
By means of this construction we realize that $E^\mu_a(x)$ are, at each point x, nothing else but the entries of the matrix Q introduced in the proof of Lemma 5.2.1 and that we have:

$$E^\mu_a(x)\,E^\nu_b(x)\,g_{\mu\nu}(x) = \eta_{ab} \tag{5.2.8}$$

Alternatively, inverting the relation, we can write:

$$g_{\mu\nu}(x) = E^a_\mu(x)\,E^b_\nu(x)\,\eta_{ab} \tag{5.2.9}$$

which is more inspiring since it reveals that the metric tensor can be regarded as a quadratic form in terms of another possibly more fundamental object. Indeed Cartan's brilliant idea, of which we gave a first anticipation in Sect. 3.2.5, was precisely the following one: instead of utilizing the metric tensor as the primary field describing gravity, let us introduce a non-degenerate square matrix $E^a_\mu(x)$, which is no longer required to be symmetric, and let us regard (5.2.9) not as an equation, but rather as a definition of the metric tensor $g_{\mu\nu}(x)$ in terms of the new primary object, named the vielbein. Let us single out the distinctive advantages of such a viewpoint. First of all, multiplying by the coordinate differentials, we realize that:

$$E^a \equiv E^a_\mu(x)\,dx^\mu \tag{5.2.10}$$

is a multiplet of m differential one-forms on the base manifold $\mathcal{M}$. Secondly, inspecting (5.2.9), we observe that the same metric tensor can be constructed from a multitude of equivalent vielbein one-forms. Indeed, consider arbitrary smooth mappings from open neighborhoods of the (pseudo-)Riemannian manifold into the pseudo-orthogonal group

$$\Lambda^a{}_b(x) : \mathcal{M} \supset U \to \mathrm{SO}(p, m-p) \;\Leftrightarrow\; \Lambda^T(x)\,\eta\,\Lambda(x) = \eta \tag{5.2.11}$$

It follows that the vielbeins

$$E^a \quad\text{and}\quad E'^a \equiv \Lambda^a{}_b(x)\,E^b \tag{5.2.12}$$

produce the same metric tensor. How do we geometrically interpret this fact?
Over the base manifold $\mathcal{M}$, let us construct a principal bundle $P(\mathcal{M}, \mathrm{SO}(p, m-p))$ (or, in equivalent notation, $P_{so(p,m-p)} \xrightarrow{\pi} \mathcal{M}$), with structural group SO(p, m − p), and let us consider the associated vector bundle

$$V_{so(p,m-p)} \xrightarrow{\pi} \mathcal{M} \tag{5.2.13}$$

provided by the m-dimensional, defining representation of SO(p, m − p). Mathematically the vielbein $E^a$ is a one-form with values in sections of the bundle $V_{so(p,m-p)}$ or, if you prefer, is a section of the product bundle $T^*\mathcal{M} \otimes V_{so(p,m-p)}$, namely:

$$E^a \in \Gamma\bigl(T^*\mathcal{M} \otimes V_{so(p,m-p)},\, \mathcal{M}\bigr) \tag{5.2.14}$$

This observation suggests an obvious idea: why do we not introduce a principal connection on the bundle $P_{so(p,m-p)} \xrightarrow{\pi} \mathcal{M}$? Let us do it.
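The definition of the metric as a quadratic form in the vielbein, g_{μν} = E^a_μ E^b_ν η_ab, and its invariance under a local Lorentz rotation E'^a = Λ^a_b E^b, can be illustrated on the Schwarzschild metric. The sketch below assumes the signature (+,−,−,−), coordinates (t, r, θ, φ), units with m = 1 and a diagonal choice of vielbein; all function names are hypothetical.

```python
import math

ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]   # flat metric, signature (+,-,-,-) assumed


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def schwarzschild_vielbein(r, theta, m=1.0):
    """Diagonal vielbein E^a_mu (row a, column mu) in coordinates (t, r, theta, phi)."""
    f = 1.0 - 2.0 * m / r
    diag = [math.sqrt(f), 1.0 / math.sqrt(f), r, r * math.sin(theta)]
    return [[diag[a] if a == mu else 0.0 for mu in range(4)] for a in range(4)]


def metric_from_vielbein(e):
    """g_{mu nu} = E^a_mu E^b_nu eta_ab, i.e. the matrix product E^T eta E."""
    return matmul(transpose(e), matmul(ETA, e))


def radial_boost(rapidity):
    """A local Lorentz transformation Lambda^a_b mixing the legs a = 0 and a = 1."""
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


if __name__ == "__main__":
    e = schwarzschild_vielbein(4.0, math.pi / 3.0)
    # The rapidity may depend on the point; a single sample value is used here.
    e_rotated = matmul(radial_boost(0.7), e)
    print(max_abs_diff(metric_from_vielbein(e), metric_from_vielbein(e_rotated)))
```

The rotated vielbein is no longer diagonal, yet it reproduces the same metric, which is the redundancy that the principal bundle construction above is designed to handle.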

5.2.2 The Spin-Connection

The Lie algebra so(p, m − p) is made of matrices $\omega^a{}_b$ satisfying the defining condition:

$$\omega^T\eta + \eta\,\omega = 0 \;\Leftrightarrow\; \omega \in so(p, m-p) \tag{5.2.15}$$

If we raise the second index by means of the flat metric $\eta^{ab}$, the rank two tensor $\omega^{ab} \equiv \omega^a{}_c\,\eta^{cb}$ obtained in this way is antisymmetric by construction: $\omega^{ab} = -\omega^{ba}$. Hence a principal connection on the bundle $P_{so(p,m-p)} \xrightarrow{\pi} \mathcal{M}$ is locally described, in each coordinate patch $U_\alpha \subset \mathcal{M}$, by a one-form with a pair of antisymmetric indices:

$$\omega^{ab} = \omega^{ab}_\mu(x)\,dx^\mu\,; \qquad \omega^{ab} = -\omega^{ba} \tag{5.2.16}$$

which, under SO(p, m − p) transition functions from one local trivialization to another one, transforms in the canonical way established by (3.4.14):

$$\tilde\omega^{ab} = \Lambda^a{}_c\,d\Lambda^b{}_d\,\eta^{cd} + \Lambda^a{}_c\,\Lambda^b{}_d\,\omega^{cd} \tag{5.2.17}$$

This transformation of ω, named the spin-connection, is paired with the already introduced transformation of the vielbein (5.2.12), which was our starting point. Once we have a principal connection, according to the general principles extensively discussed in Chap. 3, we can introduce the covariant exterior differential of any section of any associated bundle. In particular we can write the covariant differential of the vielbein one-form, to which we assign the name of torsion two-form:

$$T^a \equiv dE^a + \omega^{ab} \wedge E^c\,\eta_{bc} \tag{5.2.18}$$

At the same time, specializing to the case under consideration the general formula (3.7.6) for the curvature two-form of a principal connection, we can write that of the spin-connection as follows:

$$R^{ab} = d\omega^{ab} + \omega^{ac} \wedge \omega^{db}\,\eta_{cd} \tag{5.2.19}$$

The attentive reader will recognize that (5.2.18) and (5.2.19) are the same as (3.2.25) and (3.2.27) anticipated in Sect. 3.2.5. The same attentive reader will also recognize that, considered together, the torsion two-form and the curvature two-form can be given a further inspiring interpretation, spelled out in the next subsection.
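A short worked example may help to see the two definitions, T^a = dE^a + ω^{ab} ∧ E^c η_{bc} and R^{ab} = dω^{ab} + ω^{ac} ∧ ω^{db} η_{cd}, at work. The sketch below treats the unit two-sphere, assuming a positive-definite flat metric η_ab = δ_ab (so m = 2, which is a choice made here for simplicity and not the Lorentzian case of the text) and imposing vanishing torsion in order to determine the spin-connection.

```latex
% Unit two-sphere, coordinates (\theta,\varphi), flat metric \eta_{ab}=\delta_{ab} (assumed)
\begin{align*}
E^1 &= d\theta, \qquad E^2 = \sin\theta\, d\varphi, \\
T^1 &= dE^1 + \omega^{12}\wedge E^2 = \omega^{12}\wedge \sin\theta\, d\varphi, \\
T^2 &= dE^2 + \omega^{21}\wedge E^1
     = \cos\theta\, d\theta\wedge d\varphi - \omega^{12}\wedge d\theta .
\end{align*}
% Imposing T^1 = T^2 = 0 fixes the single independent component:
\begin{align*}
\omega^{12} &= -\cos\theta\, d\varphi, \\
R^{12} &= d\omega^{12} + \omega^{1c}\wedge\omega^{d2}\,\delta_{cd}
        = d\omega^{12}
        = \sin\theta\, d\theta\wedge d\varphi
        = E^1\wedge E^2 .
\end{align*}
```

The quadratic term drops out because ω^{11} = ω^{22} = 0 by antisymmetry, and the result R^{12} = E^1 ∧ E^2 expresses the constant curvature of the sphere.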

5.2.3 The Poincaré Bundle

Given any flat metric ηab in m-dimensions with signature (p, m − p), the corre- sponding Poincaré group is the semidirect product of the m-dimensional translation group with the pseudo-orthogonal group SO(p, m−p). It is denoted ISO(p, m−p) and, by definition we have:

$$\mathrm{ISO}(p, m-p) = \mathrm{SO}(p, m-p) \ltimes T^m \tag{5.2.20}$$

Naming $P_a$ the generators of the translations and $J_{ab}$ the generators of the subgroup SO(p, m − p), which is the Lorentz group in the case p = 1, the standard commutation relations of the Lie algebra iso(p, m − p) are the following ones:

$$[P_a, P_b] = 0 \tag{5.2.21}$$

$$[J_{ab}, P_c] = -\eta_{ac}\,P_b + \eta_{bc}\,P_a \tag{5.2.22}$$

$$[J_{ab}, J_{cd}] = -\eta_{ac}\,J_{bd} + \eta_{bc}\,J_{ad} - \eta_{bd}\,J_{ac} + \eta_{ad}\,J_{bc} \tag{5.2.23}$$

which are just the generalization to higher dimensions and with an arbitrary flat η-metric of (1.6.6)–(1.6.7) introduced in Chap. 1. From (5.2.21)–(5.2.23) we easily read off the structure constants:

$$[T_I, T_J] = f_{IJ}{}^{K}\,T_K \tag{5.2.24}$$

having collectively denoted $T_I = \{P_a, J_{bc}\}$. Next we can collect the vielbein $E^a$ and the spin-connection $\omega^{ab}$ into a single object, namely into a principal connection on the Poincaré bundle:

$$\Omega = \Omega^I\,T_I \equiv E^a P_a + \omega^{ab} J_{ab} \tag{5.2.25}$$

and following the general prescription displayed in (3.7.6) we can construct the curvature two-form of such a Poincaré connection:

$$\Theta \equiv d\Omega + \Omega \wedge \Omega = \Bigl(d\Omega^K + \frac{1}{2}\,f_{IJ}{}^{K}\,\Omega^I \wedge \Omega^J\Bigr)\,T_K = T^a P_a + R^{ab} J_{ab} \tag{5.2.26}$$

Expanding Θ along the generators we discover that the torsion $T^a$ and the curvature $R^{ab}$ are just the components of the overall Poincaré curvature Θ along the translation and rotation generators respectively.

Thus it appears that the main ingredients apt to describe the gravitational field, once Cartan's viewpoint is adopted, are encoded into a single principal connection on the Poincaré bundle with structural group ISO(1, m − 1).² Such an observation is very tempting since it seems to imply that, at the end of the day, gravitation is explained in terms of a gauge theory just as all the other fundamental interactions of Nature.

Up to this point the similarities are several and it is worth summarizing them for comparison. For this reason we open a digression on classical Electrodynamics and classical Yang-Mills theories. We will see that, notwithstanding the many striking analogies, gravity involves an extra geometric structure, the soldering, which differentiates the gauge theory of gravitational interactions from all the others and makes it unique and special.
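The commutation relations of iso(p, m − p) quoted above, [P_a, P_b] = 0, [J_ab, P_c] = −η_ac P_b + η_bc P_a and [J_ab, J_cd] = −η_ac J_bd + η_bc J_ad − η_bd J_ac + η_ad J_bc, can be verified on an explicit matrix realisation. The sketch below assumes m = 4, p = 1 and (m+1) × (m+1) matrices acting on the column (x^a, 1); this realisation is an assumption chosen to match the sign conventions of those relations and is not taken from the text.

```python
N = 4                                   # dimension m of the flat space
ETA = [1.0, -1.0, -1.0, -1.0]           # diagonal of eta_ab, p = 1 assumed
SIZE = N + 1                            # generators act on the column (x^a, 1)


def eta(a, b):
    return ETA[a] if a == b else 0.0


def zeros():
    return [[0.0] * SIZE for _ in range(SIZE)]


def gen_j(a, b):
    """(J_ab)^i_j = delta^i_a eta_bj - delta^i_b eta_aj (assumed realisation)."""
    mat = zeros()
    mat[a][b] += ETA[b]
    mat[b][a] -= ETA[a]
    return mat


def gen_p(c):
    """Translation generator: a single 1 in row c of the last column."""
    mat = zeros()
    mat[c][N] = 1.0
    return mat


def commutator(x, y):
    return [[sum(x[i][k] * y[k][j] - y[i][k] * x[k][j] for k in range(SIZE))
             for j in range(SIZE)] for i in range(SIZE)]


def combine(*terms):
    """Linear combination sum_k coeff_k * matrix_k."""
    out = zeros()
    for coeff, mat in terms:
        for i in range(SIZE):
            for j in range(SIZE):
                out[i][j] += coeff * mat[i][j]
    return out


def equal(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol
               for i in range(SIZE) for j in range(SIZE))


def check_poincare_algebra():
    idx = range(N)
    ok = all(equal(commutator(gen_p(a), gen_p(b)), zeros())
             for a in idx for b in idx)
    ok = ok and all(
        equal(commutator(gen_j(a, b), gen_p(c)),
              combine((-eta(a, c), gen_p(b)), (eta(b, c), gen_p(a))))
        for a in idx for b in idx for c in idx)
    ok = ok and all(
        equal(commutator(gen_j(a, b), gen_j(c, d)),
              combine((-eta(a, c), gen_j(b, d)), (eta(b, c), gen_j(a, d)),
                      (-eta(b, d), gen_j(a, c)), (eta(a, d), gen_j(b, c))))
        for a in idx for b in idx for c in idx for d in idx)
    return ok


if __name__ == "__main__":
    print("Poincare algebra satisfied:", check_poincare_algebra())
```

The structure constants f_{IJ}^K can be read off from the right-hand sides assembled in `check_poincare_algebra`; changing `ETA` and `N` tests other signatures and dimensions.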

5.3 The Structure of Classical Electrodynamics and Yang-Mills Theories

Let us begin by reviewing Classical Electrodynamics in the most frequently utilized approach. There are some matter fields, typically the spinor field $\psi_\alpha(x)$, describing the electron/positron, or some other electrically charged fermionic particles³ and one considers the standard action for such fields, the free Dirac action:⁴

$$\mathcal{A}_{\text{Dirac}} = \int \bigl(i\,\bar\psi\gamma^\mu\partial_\mu\psi - m\,\bar\psi\psi\bigr)\,d^4x \tag{5.3.1}$$

² From now on we choose p = 1, which is the physically most interesting case.
³ All leptons and quarks fall into this category.
⁴ For the conventions on spinors and gamma-matrix algebra see Appendix A.

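The electrically charged matter fields can also be spin zero fields, namely complex scalars φ(x), in which case the standard free action is the Klein-Gordon one:

$$\mathcal{A}_{\text{KG}} = \int \Bigl(\frac{1}{4}\,\partial^\mu\bar\phi\,\partial_\mu\phi - \frac{1}{2}\,m^2\,\bar\phi\phi\Bigr)\,d^4x \tag{5.3.2}$$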

The matter action, namely $\mathcal{A}_{\text{Dirac}}$ and/or $\mathcal{A}_{\text{KG}}$, has a global symmetry under the transformations of a U(1) Lie group defined as follows:

$$\forall\, \exp[i\theta] \in \mathrm{U}(1): \quad \begin{cases} \psi \to e^{ie\theta}\,\psi \\ \bar\psi \to e^{-ie\theta}\,\bar\psi \\ \phi \to e^{iq\theta}\,\phi \\ \bar\phi \to e^{-iq\theta}\,\bar\phi \end{cases} \tag{5.3.3}$$

where e, q are the electric charges of the considered fields. Indeed as long as the angle θ is constant and independent of the space-time location, performing the replacements (5.3.3) leaves the actions (5.3.1) and (5.3.2) invariant. The basic idea of Electrodynamics is to transform this symmetry from a global into a local one, namely to allow for a space-time dependence of the gauge angle θ → θ(x). The appropriate mathematical apparatus for this operation was developed in Chap. 3. We consider the classical matter fields ψ(x) or φ(x) as sections of a rank one holomorphic vector bundle $E \xrightarrow{\pi} \mathcal{M}$ that has the space-time manifold $\mathcal{M}$ as base manifold, U(1) as structural group and $\mathbb{C}$ as standard fibre. This vector bundle is associated to a principal bundle $P(\mathcal{M}, \mathrm{U}(1))$. Next we introduce a principal connection on this latter bundle, namely a U(1) Lie algebra valued one-form on the total space P, with the structure dictated by (3.3.86):

$$\mathbf{A} = i\,\bigl(A_\mu\,dx^\mu + d\theta\bigr) \tag{5.3.4}$$

In the case of U(1) the Lie algebra is made of just one generator that can be identified with the imaginary unit i. According to the discussion of Sect. 3.3.2.1, the existence of the connection allows us to define the covariant derivatives:

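$$\nabla_\mu\psi \equiv (\partial_\mu - ieA_\mu)\,\psi \;\Rightarrow\; \nabla_\mu\bar\psi \equiv (\partial_\mu + ieA_\mu)\,\bar\psi \tag{5.3.5}$$

$$\nabla_\mu\phi \equiv (\partial_\mu - iqA_\mu)\,\phi \;\Rightarrow\; \nabla_\mu\bar\phi \equiv (\partial_\mu + iqA_\mu)\,\bar\phi \tag{5.3.6}$$

Global symmetry is promoted to a local one if ordinary derivatives are replaced by covariant ones in the actions (5.3.1), (5.3.2), which become:

$$\mathcal{A}_{\text{Dirac}} = \int \bigl(i\,\bar\psi\gamma^\mu\nabla_\mu\psi - m\,\bar\psi\psi\bigr)\,d^4x \tag{5.3.7}$$

$$\mathcal{A}_{\text{KG}} = \int \Bigl(\frac{1}{4}\,\nabla^\mu\bar\phi\,\nabla_\mu\phi - \frac{1}{2}\,m^2\,\bar\phi\phi\Bigr)\,d^4x \tag{5.3.8}$$

Indeed the transformations (5.3.3) with an x-dependent gauge angle θ(x) are interpreted as the transition functions from a local trivialization of the U(1) bundle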

Fig. 5.2 The interaction vertex of electrodynamics

to another one and under such changes of local trivializations the connection one-form changes according to the general rule (3.3.89). In the case of electrodynamics this reduces to:

$$A_\mu \to A_\mu + \partial_\mu\theta \tag{5.3.9}$$

The new Lagrangians (5.3.7), (5.3.8) now contain interaction vertices of the original fields ψ or φ with the gauge field $A_\mu$. For example, in the case of spinor electrodynamics, the interaction vertex is depicted in Fig. 5.2. Hence the gauge field $A_\mu$ should be equipped with a kinetic Lagrangian which defines its propagator. The answer has been known since the time when Maxwell equations were rewritten in compact relativistic notation. The curvature two-form of the electromagnetic U(1)-connection is:

$$F = dA = \frac{1}{2}\,F_{\mu\nu}\,dx^\mu \wedge dx^\nu\,, \qquad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \tag{5.3.10}$$

and its components $F_{\mu\nu}$ encode the electric and magnetic fields according to the following identifications:

$$F_{\mu\nu} = \begin{pmatrix} 0 & E_1 & E_2 & E_3 \\ -E_1 & 0 & B_3 & -B_2 \\ -E_2 & -B_3 & 0 & B_1 \\ -E_3 & B_2 & -B_1 & 0 \end{pmatrix} \tag{5.3.11}$$

Classical Maxwell equations are reproduced by the following differential statements on the field strength:

$$\partial_{[\lambda}F_{\mu\nu]} = 0 \tag{5.3.12}$$

$$\partial^\mu F_{\mu\nu} = e\,J_\nu \tag{5.3.13}$$

where $J_\nu = \bar\psi\gamma_\nu\psi$ is the electric current. There is a fundamental difference between the first equation (5.3.12) and the second (5.3.13). Equation (5.3.12) is just an identity, the Bianchi identity for the curvature of a U(1)-connection:

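$$dF = ddA = 0 \tag{5.3.14}$$

while (5.3.13) is a true dynamical equation which follows upon variation in $\delta A_\nu$ of the following action:

$$\mathcal{A}_{\text{tot}} = \mathcal{A}_{\text{gauge}} + \mathcal{A}_{\text{matter}} \tag{5.3.15}$$

$$\mathcal{A}_{\text{gauge}} = -\frac{1}{4}\int F^{\mu\nu}F_{\mu\nu}\,d^4x \tag{5.3.16}$$

$$\mathcal{A}_{\text{matter}} = \int \bigl(i\,\bar\psi\gamma^\mu\nabla_\mu\psi - m\,\bar\psi\psi\bigr)\,d^4x \tag{5.3.17}$$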
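To see what the gauge action contains in terms of the electric and magnetic fields, one can contract the field strength with itself numerically. The sketch below arranges F_{μν} with first row (0, E1, E2, E3) and spatial block built from B as in the identification above, raises indices with a flat metric of signature (+,−,−,−), which is an assumption of this sketch, and evaluates the integrand −¼ F^{μν}F_{μν}. Under that assumption the result should equal ½(E² − B²). The function names are hypothetical.

```python
ETA_DIAG = [1.0, -1.0, -1.0, -1.0]   # flat metric, signature (+,-,-,-) assumed


def field_strength(e_field, b_field):
    """Lower-index F_{mu nu}: F_{0i} = E_i, F_{12} = B_3, F_{13} = -B_2, F_{23} = B_1."""
    e1, e2, e3 = e_field
    b1, b2, b3 = b_field
    return [[0.0, e1, e2, e3],
            [-e1, 0.0, b3, -b2],
            [-e2, -b3, 0.0, b1],
            [-e3, b2, -b1, 0.0]]


def gauge_lagrangian(e_field, b_field):
    """Integrand -1/4 F^{mu nu} F_{mu nu}, indices raised with the diagonal metric."""
    f = field_strength(e_field, b_field)
    total = 0.0
    for mu in range(4):
        for nu in range(4):
            f_upper = ETA_DIAG[mu] * ETA_DIAG[nu] * f[mu][nu]
            total += f_upper * f[mu][nu]
    return -0.25 * total


if __name__ == "__main__":
    e_vec, b_vec = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0)
    e_sq = sum(x * x for x in e_vec)
    b_sq = sum(x * x for x in b_vec)
    print(gauge_lagrangian(e_vec, b_vec), 0.5 * (e_sq - b_sq))
```

The two printed numbers agree, showing that the quadratic action in the curvature is the familiar difference between electric and magnetic energy densities.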

One important observation is that Maxwell equations are consistent with the conservation of the electric charge since the left hand side of the second equation satisfies the identity:

$$\partial^\nu\bigl(\partial^\mu F_{\mu\nu}\bigr) = 0 \;\Rightarrow\; \partial^\nu J_\nu = 0 \tag{5.3.18}$$

which implies the vanishing of the electric current divergence and hence the global conservation of the electric charge. One might wonder whether the actual structure of Maxwell equations is an accident or whether it is uniquely determined by first principles. The correct answer is the second. Indeed we can easily show that, as long as $A_\mu$ is interpreted as a connection on a principal U(1)-bundle, namely, as long as the transformation (5.3.9) is required to be a symmetry of the field equations, and as long as the latter are assumed to be Lorentz invariant, linear differential equations of the second order, there is no alternative. Indeed let us see what is the most general form of such equations for the field $A_\nu$. Considering all possible index contractions we arrive at the following candidate equation:

$$\Box A_\nu + \alpha\,\partial_\nu\partial^\mu A_\mu + m\,A_\nu = e\,J_\nu \tag{5.3.19}$$

where $\alpha$, $m$ are coefficients to be determined. If we require that this equation should be invariant with respect to the gauge transformation (5.3.9) we obtain:

$$\alpha = -1, \qquad m = 0 \tag{5.3.20}$$

Exactly the same result is obtained by imposing that the divergence of the right hand side should vanish as a consequence of the field equation. In this way, we realize that gauge invariance and electric current conservation imply each other. On the other hand the choice (5.3.20) of the parameters reproduces (5.3.13). Hence the gauge action (5.3.16), which is quadratic in the curvature tensor $F_{\mu\nu}$, is uniquely selected by the fundamental physical principle of electric charge conservation encoding the full structure of electrodynamical interactions.

Let us analyze the geometrical structure of the action (5.3.16). To this effect we need to introduce a new involutive operation acting on differential forms which is named Hodge duality.
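The determination of the coefficients in (5.3.20) can be spelled out in a couple of lines. What follows is only a minimal sketch, assuming that $\theta(x)$ is an arbitrary smooth function, that the current $J_\nu$ is itself gauge invariant, and that $\Box \equiv \partial^\mu\partial_\mu$; the shorthand $\delta_\theta$ for the variation under (5.3.9) is introduced here purely for illustration.

```latex
% Sketch: gauge invariance of the candidate equation (5.3.19).
% \delta_\theta denotes the change under A_\mu -> A_\mu + \partial_\mu\theta (notation assumed here).
\begin{align*}
\delta_\theta\left(\Box A_\nu + \alpha\,\partial_\nu\partial^\mu A_\mu + m\,A_\nu\right)
  &= \Box\,\partial_\nu\theta + \alpha\,\partial_\nu\Box\theta + m\,\partial_\nu\theta \\
  &= (1+\alpha)\,\partial_\nu\Box\theta + m\,\partial_\nu\theta .
\end{align*}
% This vanishes for every \theta(x) only if \alpha = -1 and m = 0, which is (5.3.20).

% Sketch: divergence of the same equation.
\begin{align*}
\partial^\nu\left(\Box A_\nu + \alpha\,\partial_\nu\partial^\mu A_\mu + m\,A_\nu\right)
  &= (1+\alpha)\,\Box\left(\partial^\mu A_\mu\right) + m\,\partial^\mu A_\mu
   = e\,\partial^\nu J_\nu .
\end{align*}
% Requiring \partial^\nu J_\nu = 0 for arbitrary A_\mu again gives \alpha = -1, m = 0.
```

With these values the left hand side of (5.3.19) becomes $\partial^\mu(\partial_\mu A_\nu - \partial_\nu A_\mu) = \partial^\mu F_{\mu\nu}$, in agreement with (5.3.13).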

5.3.1 Hodge Duality

Consider a Riemannian manifold $(\mathcal{M}, g)$ of dimension $m$ and let us construct the graded algebra of differential forms on $\mathcal{M}$, defined by (2.5.59), which we repeat here for the reader's convenience:

$$\mathcal{A}(\mathcal{M}) = \bigoplus_{k=0}^{m}\mathcal{A}_k(\mathcal{M}) \quad\text{where } m = \dim\mathcal{M} \tag{5.3.21}$$

For the above construction of differential forms, the manifold structure is sufficient, the existence of a metric being completely irrelevant. Yet if a metric structure is present, the algebra $\mathcal{A}(\mathcal{M})$ can be equipped with a linear (anti-)involutive operation, named Hodge duality, which maps forms of degree $k$ into forms of degree $m-k$:

$$\star : \mathcal{A}_k(\mathcal{M}) \to \mathcal{A}_{m-k}(\mathcal{M}) \tag{5.3.22}$$

and is explicitly realized as follows. Let $\omega^{(k)} \in \mathcal{A}_k(\mathcal{M})$ be a $k$-form. In any coordinate patch we have:

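$$\omega^{(k)} = \omega_{\mu_1\dots\mu_k}\,dx^{\mu_1}\wedge\dots\wedge dx^{\mu_k} \tag{5.3.23}$$

The Hodge dual of $\omega^{(k)}$, denoted $\star\omega^{(k)}$, is the $(m-k)$-form with the following components:

$$\star\omega^{(k)} = \frac{1}{k!}\,\frac{1}{\sqrt{|\det g|}}\;\omega_{\mu_1\dots\mu_k}\,dx^{\nu_1}\wedge\dots\wedge dx^{\nu_{m-k}}\;\varepsilon^{\mu_1\dots\mu_k}{}_{\nu_1\dots\nu_{m-k}} \tag{5.3.24}$$

In the above equation $\varepsilon^{\mu_1\dots\mu_k}{}_{\nu_1\dots\nu_{m-k}}$ denotes the completely antisymmetric Levi Civita symbol whose first $k$ indices have been raised by means of the metric tensor. As a consequence of the above definition we find:

$$\star^2\omega^{(k)} = (-1)^{(m-k)k}\,\det(\mathrm{sign})\;\omega^{(k)} \tag{5.3.25}$$

where $\det(\mathrm{sign})$ denotes the determinant of the signature of the considered metric, which is either $+1$ or $-1$.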
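As a quick illustration of (5.3.25), the sign can be evaluated in a few familiar cases. This is just arithmetic on the formula above; the choice of examples (a two-form and a one-form in four dimensions, Lorentzian versus Euclidean signature) is ours.

```latex
% Sign of \star^2 on k-forms from (5.3.25): (-1)^{(m-k)k} \det(\mathrm{sign})
\begin{align*}
m=4,\ \text{Lorentzian } (\det(\mathrm{sign})=-1),\ k=2:&\quad (-1)^{2\cdot 2}\,(-1) = -1
   \;\Rightarrow\; \star^2 F = -F \\
m=4,\ \text{Lorentzian } (\det(\mathrm{sign})=-1),\ k=1:&\quad (-1)^{3\cdot 1}\,(-1) = +1 \\
m=4,\ \text{Euclidean } (\det(\mathrm{sign})=+1),\ k=2:&\quad (-1)^{2\cdot 2}\,(+1) = +1
   \;\Rightarrow\; \star^2 F = +F
\end{align*}
```

The first line is the reason why the operation is called (anti-)involutive: on the field strength two-form of four-dimensional Lorentzian electrodynamics the Hodge dual squares to minus the identity.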

5.3.2 Geometrical Rewriting of the Gauge Action

Using the Hodge dual operation we can easily rewrite the gauge action (5.3.16) in the following way:

$$-\frac{1}{4}\int F_{\mu\nu}F^{\mu\nu}\,d^4x = -\frac{1}{4}\int F_{\mu\nu}F_{\rho\sigma}\,g^{\mu\rho}g^{\nu\sigma}\sqrt{|\det g|}\;d^4x \tag{5.3.26}$$

$$= \frac{1}{2}\int F\wedge\star F \tag{5.3.27}$$

What we learnt from the above discussion is that the structure of Maxwell equations is completely determined by the gauge nature of the electromagnetic field, and this fact is just the mathematical transcription of the Physical Conservation Law of Electric Charge. On the other hand the unique structure of Maxwell equations implies an Action Principle (5.3.27) which is quadratic in the curvature $F$ of the principal connection $A$ and cannot be written without the use of the Hodge dual. The latter contains the metric tensor $g_{\mu\nu}$ in disguise. In other words, although the electromagnetic field $A_\mu$ coincides with a principal connection on a U(1)-fibre bundle, whose construction involves only the manifold structure of the space-time base manifold $\mathcal{M}$, the classical dynamics of $A_\mu$, i.e. its propagation equations and interaction with other fields, can neither be formulated nor solved without invoking the metric structure of $\mathcal{M}$. In short the electromagnetic U(1)-bundle needs to be constructed on a (pseudo-)Riemannian base manifold.

This is the first signal of the basic difference between gravity and the other fundamental interactions. All interactions are mediated by gauge fields which happen to be principal connections on fibre bundles, yet the connections involved in gravity are special in that they deal with the tangent bundle $T\mathcal{M}$ of the space-time manifold $\mathcal{M}$, which is the base manifold not only for $T\mathcal{M}$, but also for all the other principal bundles $P(\mathcal{M}, G)$ associated with non-gravitational interactions. The structure (5.3.27) implies an obligatory and a priori predetermined interaction of every other gauge field with the gravitational one, namely with the metric. Here is the source of universality for gravity.

5.3.3 Yang-Mills Theory in Vielbein Formalism

In view of our advertised preference for the Cartan viewpoint and of our announced plan of trading the metric for the vielbein, a natural question arises about the formulation of electrodynamics in vielbein formalism. If we are going to discard the metric tensor, how can we rewrite the action principle (5.3.27) without using Hodge duality? The answer we presently provide to such a question illustrates the basic spirit of the Repère Mobile philosophy: all fundamental operations are lifted to the Lorentz fibres, where they take the same appearance as in flat Minkowski space. What happens in the Lorentz bundle is then transmitted to the tangent bundle through the soldering of the two bundles, which is implemented by the vielbein via the vanishing of the Torsion.

Since there is no difference, in this respect, between the Abelian gauge theory of Electrodynamics and the non-Abelian Yang-Mills theories utilized in the description of electro-weak and strong interactions, we immediately address the most general case. We consider a principal bundle $P(\mathcal{M}, G)$ where $G$ is the structural group (= gauge group in physical parlance) and the $\mathbb{G}$-Lie algebra valued one-form $\mathcal{A} = \mathcal{A}^I T_I$ is a principal connection (i.e. a gauge field in physical parlance). The corresponding curvature two-form is defined below:

$$F \equiv d\mathcal{A} + \mathcal{A}\wedge\mathcal{A} = F^I\,T_I = \left(d\mathcal{A}^I + \frac{1}{2}\,f^I{}_{JK}\,\mathcal{A}^J\wedge\mathcal{A}^K\right)T_I \tag{5.3.28}$$

the bracket in the last expression of (5.3.28) providing the explicit definition of the two-forms $F^I$. Each of these latter can be expanded along a basis of differentials:⁵

$$F^I = F^I_{\mu\nu}\,dx^\mu\wedge dx^\nu \tag{5.3.29}$$

In terms of the field strength $F^I_{\mu\nu}$ the Yang-Mills action in $m$ dimensions takes a form which is the obvious generalization of (5.3.16) and (5.3.27), namely:

$$\mathcal{A}_{YM} = -\int\gamma_{IJ}\,F^I_{\mu\nu}\,F^{J|\mu\nu}\,d^mx = \int\mathrm{Tr}\left(F\wedge\star F\right) \tag{5.3.30}$$

In (5.3.30) the symbol $\gamma_{IJ}$ denotes the $G$-invariant Killing metric on the Lie algebra $\mathbb{G}$. Typically, choosing a linear representation of $\mathbb{G}$, the generators $T_I$ become matrices that can be normalized in such a way that:⁶

$$\mathrm{Tr}(T_I\,T_J) = \gamma_{IJ} \tag{5.3.31}$$

The above equation explains the second identity in (5.3.30). Let us then solve the task of recasting the Yang-Mills action (5.3.30) into the framework of the vielbein formalism. To this effect we begin from the volume form, namely from the generally covariant integration measure on an $m$-dimensional space-time. In metric formalism, if we are supposed to integrate any scalar function $f(x)$ on a (pseudo-)Riemannian manifold, we just write:

$$\int f(x)\,\mathrm{Vol}_g = \int f(x)\,\sqrt{-\det g}\;d^mx \tag{5.3.32}$$

In view of the identification (5.2.9), the above equation can be rewritten as follows:

$$\int f(x)\,\mathrm{Vol}_g = \int f(x)\,\frac{1}{m!}\,E^{a_1}\wedge\dots\wedge E^{a_m}\,\varepsilon_{a_1\dots a_m} \tag{5.3.33}$$

⁵ Note that from now on we omit the factor $\frac{1}{2}$ introduced in the electromagnetic analogue of (5.3.29), namely in (5.3.10). There such a factor was used in order for $F_{\mu\nu}$ to have the traditional normalization $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. Omitting this factor the normalization of $F_{\mu\nu}$ becomes the less traditional one: $F_{\mu\nu} = \frac{1}{2}\left(\partial_\mu\mathcal{A}_\nu - \partial_\nu\mathcal{A}_\mu + \mathcal{A}\mathcal{A}\text{-terms}\right)$. Paying this moderate price all formulae become much neater and free of annoying prefactors.

⁶ An important comment is obligatory at this point. A fundamental theorem of Lie Algebra Theory states that the Killing metric of a compact semi-simple Lie algebra is always negative definite. So, as long as the gauge group $G$ is chosen compact, as is the case in all standard gauge theories of non-gravitational interactions, the kinetic term of the gauge fields defined by the action (5.3.30) turns out to have correct positive definiteness properties and the corresponding quanta have physical propagators. In the case of non-compact gauge algebras the action (5.3.30) introduces negative norm states in the spectrum. This does not mean that non-compact gauge groups are altogether forbidden. Actually they appear in supergravity theories, yet the corresponding kinetic terms have a more sophisticated structure which takes care of unitarity.

where $\varepsilon_{a_1\dots a_m}$ denotes the completely antisymmetric Levi Civita tensor with flat indices. Indeed it suffices to observe that:

$$\frac{1}{m!}\,E^{a_1}\wedge\dots\wedge E^{a_m}\,\varepsilon_{a_1\dots a_m} = \frac{1}{m!}\,E^{a_1}_{\mu_1}\dots E^{a_m}_{\mu_m}\,\varepsilon_{a_1\dots a_m}\,dx^{\mu_1}\wedge\dots\wedge dx^{\mu_m} = \det E\;d^mx = \sqrt{-\det g}\;d^mx \tag{5.3.34}$$

In the above equation the writing $\det E$ denotes the determinant of the square matrix $E^a_\mu(x)$ and the last identity follows from (5.2.9), which implies $\det g = (\det E)^2\times(\det\eta) = -(\det E)^2$.

This being clarified, the gauge action (5.3.30) of a Yang-Mills field in an $m$-dimensional, Lorentz signature, space-time can be rewritten in the vielbein approach, utilizing a first order formalism. Consider the following action functional:

$$\mathcal{A}_{YM} = -\frac{1}{(m-2)!}\int\mathrm{Tr}\left(\mathcal{F}^{ab}F\right)\wedge E^{c_1}\wedge\dots\wedge E^{c_{m-2}}\,\varepsilon_{c_1\dots c_{m-2}ab} + \frac{1}{m!}\int\mathrm{Tr}\left(\mathcal{F}^{ab}\mathcal{F}_{ab}\right)E^{c_1}\wedge\dots\wedge E^{c_m}\,\varepsilon_{c_1\dots c_m} \tag{5.3.35}$$

where $\mathcal{F}_{ab} = \mathcal{F}^I_{ab}\,T_I$ is a 0-form transforming as a section of the second antisymmetric power of the Lorentz vector bundle $V_{so(1,m-1)}\xrightarrow{\;\pi\;}\mathcal{M}$ mentioned in (5.2.13):

$$\mathcal{F}^I_{ab} \in \Gamma\left(\wedge^2 V_{so(1,m-1)},\,\mathcal{M}\right) \tag{5.3.36}$$

In addition, $\mathcal{F}^I_{ab}$ is also a section of an associated vector bundle to the principal gauge bundle $P(\mathcal{M}, G)$, to be precise, the vector bundle of the adjoint representation. This object should be regarded as an independent dynamical variable with respect to which we are supposed to vary the action functional (5.3.35). Performing such a variation we find the equation:

$$0 = -\frac{1}{(m-2)!}\,F^I\wedge E^{c_1}\wedge\dots\wedge E^{c_{m-2}}\,\varepsilon_{c_1\dots c_{m-2}ab} + \frac{2}{m!}\,\mathcal{F}^I_{ab}\,E^{c_1}\wedge\dots\wedge E^{c_m}\,\varepsilon_{c_1\dots c_m} \tag{5.3.37}$$

which admits a unique solution:

$$F^I = \mathcal{F}^I_{ab}\,E^a\wedge E^b \tag{5.3.38}$$

The argument goes as follows. Since the curvature $F^I$ is a horizontal two-form living on the cotangent bundle of the base manifold $\mathcal{M}$, it can always be expanded along a basis of local sections of $T^*\mathcal{M}$, namely along an independent basis of one-forms. The coordinate differentials $dx^\mu$ provide such a basis but, by construction, the vielbein $\{E^a\}$ provide another possible basis. Hence, without loss of generality, we can always pose:

$$F^I = F^I_{ab}\,E^a\wedge E^b \tag{5.3.39}$$

where $F^I_{ab}$ is just the name given to the components of $F^I$ in the vielbein basis. Inserting (5.3.39) into (5.3.37) we easily realize that it reduces to the algebraic condition $F^I_{ab} = \mathcal{F}^I_{ab}$, and (5.3.38) follows.

The obtained result implies that the 0-form field $\mathcal{F}^I_{ab}$ is identified, through its own equation of motion, with the gauge field strength with flattened indices, namely:

$$\mathcal{F}^I_{ab} = F^I_{\mu\nu}\,E^\mu_a\,E^\nu_b \tag{5.3.40}$$

the symbol $E^\mu_a$ denoting the inverse vielbein. Using (5.3.40) we find

$$\mathrm{Tr}\left(\mathcal{F}^{ab}\mathcal{F}_{ab}\right) = \mathrm{Tr}\left(F^{\mu\nu}F_{\mu\nu}\right) \tag{5.3.41}$$

while using (5.3.38) we get:

$$-\frac{1}{(m-2)!}\int\mathrm{Tr}\left(\mathcal{F}^{ab}F\right)\wedge E^{c_1}\wedge\dots\wedge E^{c_{m-2}}\,\varepsilon_{c_1\dots c_{m-2}ab} = -\frac{1}{2}\,\frac{1}{m!}\int\mathrm{Tr}\left(\mathcal{F}^{ab}\mathcal{F}_{ab}\right)E^{c_1}\wedge\dots\wedge E^{c_m}\,\varepsilon_{c_1\dots c_m} \tag{5.3.42}$$

which implies that

$$\mathcal{A}_{YM} = \frac{1}{2}\int\mathrm{Tr}\left(F\wedge\star F\right) \tag{5.3.43}$$

once (5.3.40) is implemented.

In this way we have shown how to rewrite the Yang-Mills action (5.3.30) using neither the operation of Hodge duality nor the metric. The price we had to pay was the introduction of a new auxiliary field which satisfies an algebraic equation of motion and can be eliminated by inserting the solution of the latter back into the action. Furthermore, the action in the form (5.3.35) displays the coupling of the Yang-Mills fields to the Poincaré connection. The spin connection $\omega^{ab}$ is not present, yet the vielbein $E^a$ appears explicitly and is an essential ingredient of the construction.

The reconstruction of classical Yang-Mills theory can now be completed in the set up of (5.3.35) by considering the variation with respect to the connection one-form $\mathcal{A}^I$. This yields the following equation:

$$0 = \nabla\mathcal{F}^{I|ab}\wedge E^{c_1}\wedge\dots\wedge E^{c_{m-2}}\,\varepsilon_{c_1\dots c_{m-2}ab} + (m-2)\,\mathcal{F}^{I|ab}\wedge T^{c_1}\wedge E^{c_2}\wedge\dots\wedge E^{c_{m-2}}\,\varepsilon_{c_1\dots c_{m-2}ab} \tag{5.3.44}$$

where

$$\nabla\mathcal{F}^{I|ab} = d\mathcal{F}^{I|ab} + f^I{}_{JK}\,\mathcal{A}^J\,\mathcal{F}^{K|ab} + \omega^{ac}\,\eta_{cd}\,\mathcal{F}^{I|db} + \omega^{bc}\,\eta_{cd}\,\mathcal{F}^{I|ad} \tag{5.3.45}$$

is the gauge covariant differential of the 0-form field $\mathcal{F}^{I|ab}$. As stated above, this object is both a section of the adjoint vector bundle associated with the principal bundle $P(\mathcal{M}, G)$ and an antisymmetric section of the Lorentz bundle. For this reason its covariant differential involves both the gauge connection $\mathcal{A}^J$ and the spin connection $\omega^{ac}$. Let us analyze (5.3.44). What is the condition under which it reproduces the standard Yang-Mills equation that we obtain by varying the second order action (5.3.30)? The answer is encoded in the concept of soldering.

5.4 Soldering of the Lorentz Bundle to the Tangent Bundle

Suppose that we enforce the following equation:

$$T^a = 0 \tag{5.4.1}$$

What are its consequences? First of all, with reference to (5.3.44), we note that it reduces to just the first addend. Secondly we note that $\nabla\mathcal{F}^{I|ab}$ as defined in (5.3.45) is a one-form on the base manifold $\mathcal{M}$. As such it can be expanded along a basis of sections of the cotangent bundle, in particular along the basis provided by the vielbein $E^a$. Hence we can write:

$$\nabla\mathcal{F}^{I|ab} = \nabla_c\mathcal{F}^{I|ab}\,E^c \tag{5.4.2}$$

The tensor $\nabla_c\mathcal{F}^{I|ab}$ is defined by the position (5.4.2) and it is named the intrinsic covariant derivative of the Yang-Mills field strength. Comparison with the definition (5.3.45) implies that:

$$\nabla_c\mathcal{F}^{I|ab} = E^\mu_c\left(\partial_\mu\mathcal{F}^{I|ab} + f^I{}_{JK}\,\mathcal{A}^J_\mu\,\mathcal{F}^{K|ab} + \omega_\mu^{ae}\,\eta_{ef}\,\mathcal{F}^{I|fb} + \omega_\mu^{be}\,\eta_{ef}\,\mathcal{F}^{I|af}\right) \tag{5.4.3}$$

which justifies its name. Indeed it is the covariant derivative of the tensor $\mathcal{F}^{I|ab}$ where the Greek holonomic index $\mu$ has been anholonomized and converted into a Latin one through multiplication with the inverse vielbein. The meaning of this operation is that $\nabla_c\mathcal{F}^{I|ab}$ is now a section of the adjoint bundle with respect to the internal gauge group $G$ and a section of the triple power of the Lorentz bundle, while it behaves as a bona-fide scalar function with respect to the cotangent bundle $T^*\mathcal{M}$ of the base manifold. Inserting (5.4.2) into (5.3.44) we obtain:

$$\begin{aligned} 0 &= \nabla_d\mathcal{F}^{I|ab}\;\underbrace{E^d\wedge E^{c_1}\wedge\dots\wedge E^{c_{m-2}}}_{\propto\;\varepsilon^{d c_1\dots c_{m-2}p}\,\Phi^{[m-1]}_p}\;\varepsilon_{c_1\dots c_{m-2}ab} \\ &\Downarrow \\ 0 &= \nabla_a\mathcal{F}^{I|ab} \end{aligned} \tag{5.4.4}$$

In the above calculation, which constitutes a prototype for all the following calculations of the present chapter and also for many of the calculations in later chapters, we have denoted by $\Phi^{[m-1]}_p$ a basis of sections for the $\wedge^{[m-1]}[E]$ exterior product of the 1-form valued Lorentz bundle in the vector representation. In other words each element of $\wedge^{[m-1]}[E]$ is an exterior product of $(m-1)$ vielbein and there are exactly $m$ independent such monomials. By means of the Levi Civita symbol they can be enumerated as follows:

$$\Phi^{[m-1]}_p = \varepsilon_{c_1\dots c_{m-1}p}\,E^{c_1}\wedge\dots\wedge E^{c_{m-1}} \tag{5.4.5}$$

Equation (5.4.4) implies that the coefficients of the independent (m − 1)-forms should separately vanish and using the straightforward identity:

$$\varepsilon^{d c_1\dots c_{m-2}p}\times\varepsilon_{c_1\dots c_{m-2}ab} = (-)^{(m-2)}\,(m-2)!\;m!\;\delta^{dp}_{ab} \tag{5.4.6}$$

this implies the last of (5.4.4). In it we recognize the standard dynamical field equation of a Yang-Mills field. Indeed reverting to Greek indices the last of (5.4.4) becomes:

$$\nabla^\mu F^I_{\mu\nu} = 0 \tag{5.4.7}$$

where:

$$\nabla_\rho F^I_{\mu\nu} = \partial_\rho F^I_{\mu\nu} - \Gamma^\sigma_{\rho\mu}F^I_{\sigma\nu} - \Gamma^\sigma_{\rho\nu}F^I_{\mu\sigma} + f^I{}_{JK}\,\mathcal{A}^J_\rho\,F^K_{\mu\nu} \tag{5.4.8}$$

The symbol $\Gamma^\sigma_{\rho\mu}$ denotes the coefficient of the Levi Civita connection as defined in (3.7.14)-(3.7.16) and coincides with the Christoffel symbols (3.2.7).

The natural question that we expect from the attentive reader is the following: "What happened to the spin-connection coefficients $\omega_\mu^{ab}$, which seem to have completely disappeared from the above rewriting of the dynamical equations?" The answer to this quite legitimate question encodes the whole fascination of the soldering mechanism defined by imposing (5.4.1).

Consider the vielbein $E^a_\mu$ and define its complete covariant derivative, which is given below:

$$\nabla_\rho E^a_\mu \equiv \partial_\rho E^a_\mu - \omega_\rho^{ab}\,E^c_\mu\,\eta_{bc} - \Gamma^\sigma_{\rho\mu}\,E^a_\sigma \tag{5.4.9}$$

In writing (5.4.9) we have carefully taken into account that $E^a_\mu$ is both a section of the cotangent bundle $T^*\mathcal{M}$ and a section of the associated vector bundle to the principal Lorentz bundle. In its first quality the vielbein feels the Levi Civita connection $\Gamma^\sigma_{\rho\mu}$, in its second quality it feels the Lorentz connection $\omega_\rho^{ab}$. We assume two equations:

1. The vanishing of the torsion (5.4.1), which can be rephrased as $\mathcal{D}_{[\mu}E^a_{\nu]} = 0$ where $\mathcal{D}_\mu E^a_\nu \equiv \partial_\mu E^a_\nu - \omega_\mu^{ab}\,E^c_\nu\,\eta_{bc}$.
2. The covariant constancy of the metric $\nabla_\rho g_{\mu\nu} = 0$ which, together with the symmetry of $\Gamma^\sigma_{\rho\mu}$ in its lower indices, defines the Levi Civita connection.

Equation $\mathcal{D}_{[\mu}E^a_{\nu]} = 0$ suffices to completely and uniquely determine the coefficients of the spin connection $\omega_\rho^{ab}$ in terms of the vielbein derivatives and of its inverse. A simple counting explains why. The total number of unknowns in an $m$-dimensional manifold is $\frac{1}{2}m^2(m-1)$. The same is the number of linear equations provided by the condition of vanishing torsion. Explicitly the solution can be obtained by means of a trick completely analogous to that used to determine the Levi Civita connection. Introduce the contorsion, defined below:

$$K^a_{pq} \equiv E^\mu_p\,E^\nu_q\,\partial_{[\mu}E^a_{\nu]} \tag{5.4.10}$$

and the latinized components of the spin connection:

$$\omega_r^{ab} = E^\mu_r\,\omega_\mu^{ab} \tag{5.4.11}$$

In terms of these objects the vanishing torsion equation becomes:

$$2K^a_{pq} = \omega_p^{ab}\,\eta_{qb} - \omega_q^{ab}\,\eta_{pb} \tag{5.4.12}$$

By summing three copies of the above equation with cyclically permuted indices, using the antisymmetry of the contorsion in its lower indices and that of the spin connection in its upper ones, the solution for $\omega_p^{ab}$ in terms of $K^a_{pq}$ is explicitly obtained.

This conclusion is quite momentous. It implies that the Lorentz gauge connection is not an independent dynamical object, rather it is completely determined by the vielbein, just as the Levi Civita connection is completely determined by the metric. The equivalence of the metric and vielbein formalisms arises from the already proposed identification of the metric with the quadratic form in the vielbein which was spelled out in (5.2.9).
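The counting and the cyclic-sum trick for the spin connection can be sketched explicitly. The block below is an illustration under stated assumptions: the fully lowered symbols $\omega_{p|ab} \equiv \eta_{ac}\eta_{bd}\,\omega_p^{cd}$ and $K_{a|pq} \equiv \eta_{ab}K^b_{pq}$ are notation introduced here (not in the text), and only the antisymmetries $\omega_{p|ab} = -\omega_{p|ba}$, $K_{a|pq} = -K_{a|qp}$ are used.

```latex
% Counting: \omega_\mu^{ab} has m values of \mu times m(m-1)/2 antisymmetric pairs ab;
% T^a_{\mu\nu} = 0 has m values of a times m(m-1)/2 antisymmetric pairs \mu\nu.
\[
\#\,\text{unknowns} = m\cdot\tfrac{1}{2}m(m-1) = \tfrac{1}{2}m^2(m-1) = \#\,\text{equations}
\]

% Lowering the index a in the vanishing torsion equation and writing three cyclic copies:
\begin{align*}
2K_{a|pq} &= \omega_{p|aq} - \omega_{q|ap} \\
2K_{p|qa} &= \omega_{q|pa} - \omega_{a|pq} = -\omega_{q|ap} - \omega_{a|pq} \\
2K_{q|ap} &= \omega_{a|qp} - \omega_{p|qa} = -\omega_{a|pq} + \omega_{p|aq}
\end{align*}

% First minus second plus third: the \omega_{q|ap} and \omega_{a|pq} terms cancel.
\[
\omega_{p|aq} = K_{a|pq} - K_{p|qa} + K_{q|ap}
\]
```

One can check that the right hand side is antisymmetric under $a \leftrightarrow q$, as it must be. The spin connection is thus a function of the vielbein and of its derivatives alone, in the same way as the Levi Civita connection is fixed by the metric, itself given by the quadratic form (5.2.9).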
Combining the identification (5.2.9) with the covariant constancy of the metric with respect to the Levi Civita connection we obtain:

$$\left(\nabla_\rho E^a_\mu\,E^b_\nu + E^a_\mu\,\nabla_\rho E^b_\nu\right)\eta_{ab} = 0 \tag{5.4.13}$$

Implementing once again the standard trick of rewriting three copies of the above equation with permuted indices

$$\begin{aligned} 0 &= \left(\nabla_\rho E^a_\mu\,E^b_\nu + E^a_\mu\,\nabla_\rho E^b_\nu\right)\eta_{ab} \\ 0 &= -\left(\nabla_\mu E^b_\nu\,E^a_\rho + E^b_\nu\,\nabla_\mu E^a_\rho\right)\eta_{ba} \\ 0 &= -\left(\nabla_\nu E^b_\rho\,E^a_\mu + E^b_\rho\,\nabla_\nu E^a_\mu\right)\eta_{ba} \end{aligned} \tag{5.4.14}$$

and summing them together, upon use of the vanishing torsion equation $\nabla_{[\rho}E^a_{\mu]} = 0$, we obtain that also the symmetric combination of derivatives vanishes:

$$\nabla_\mu E^a_\nu + \nabla_\nu E^a_\mu = 0 \tag{5.4.15}$$

Hence we conclude that:

$$\nabla_\mu E^a_\nu = 0 \tag{5.4.16}$$

Equation (5.4.16) is of the utmost relevance. It implies that, provided the torsion two-form $T^a$ vanishes, the tangent bundle and the Lorentz bundle are completely soldered to each other and we can trade the Levi Civita connection for the spin connection and vice-versa. In practice this means the following. Given any world tensor:

$$T^{\nu_1\dots\nu_s}_{\mu_1\dots\mu_r} \tag{5.4.17}$$

with $r$ covariant and $s$ contravariant indices, which is therefore a section of the $r$th power of the cotangent bundle $T^*\mathcal{M}$ and a section of the $s$th power of the tangent bundle $T\mathcal{M}$, through multiplication with an appropriate number of direct and inverse vielbein we can convert it into a corresponding section of the $(r,s)$th power of the associated Lorentz vector bundle:

$$T^{b_1\dots b_s}_{a_1\dots a_r} = E^{\mu_1}_{a_1}\dots E^{\mu_r}_{a_r}\;E^{b_1}_{\nu_1}\dots E^{b_s}_{\nu_s}\;T^{\nu_1\dots\nu_s}_{\mu_1\dots\mu_r} \tag{5.4.18}$$

Soldering means that the parallel transport by means of the spin connection in the Lorentz bundle is completely equivalent to the parallel transport on the tangent/cotangent bundle by means of the Levi Civita connection. Introducing the curly capital $\mathcal{D}$ as a notation for the Lorentz-covariant differential of any Lorentz-tensor valued $p$-form, namely:

$$\mathcal{D}T^{a_1\dots a_r} \equiv dT^{a_1\dots a_r} - \omega^{a_1b_1}\wedge T^{c_1a_2\dots a_r}\,\eta_{b_1c_1} - \dots - \omega^{a_rb_r}\wedge T^{a_1a_2\dots c_r}\,\eta_{b_rc_r} \tag{5.4.19}$$

and reserving the symbol $\nabla$ for the Levi Civita covariant differential of any world-tensor valued $p$-form:

$$\nabla T^{\mu_1\dots\mu_r} \equiv dT^{\mu_1\dots\mu_r} - \Gamma^{\mu_1}{}_{\nu_1}\wedge T^{\nu_1\mu_2\dots\mu_r} - \dots - \Gamma^{\mu_r}{}_{\nu_r}\wedge T^{\mu_1\mu_2\dots\nu_r} \tag{5.4.20}$$

the soldering (5.4.16) guarantees that we can write:

$$\begin{aligned} \mathcal{D}T^{a_1\dots a_r} &= E^{a_1}_{\mu_1}\dots E^{a_r}_{\mu_r}\;\nabla T^{\mu_1\dots\mu_r} \\ \nabla T^{\mu_1\dots\mu_r} &= E^{\mu_1}_{a_1}\dots E^{\mu_r}_{a_r}\;\mathcal{D}T^{a_1\dots a_r} \end{aligned} \tag{5.4.21}$$

5.4.1 Gravitational Coupling of Spinorial Fields

The results of the previous section show that, as far as tensor fields are concerned, the use of the Levi Civita connection or of the spin connection in the vielbein formalism are completely equivalent. This equivalence does not extend to the case of spinorial fields. As we emphasized more than once, spinorial representations are a specific feature of the orthogonal or pseudo-orthogonal algebras, so that spinorial sections of the tangent or cotangent bundle do not exist. On the other hand, given the principal Lorentz bundle, we can construct all of its associated spinorial bundles. This

Fig. 5.3 The gravitational interaction vertex of spin 1/2 particles

amounts to saying that via the spin connection $\omega^{ab}$ we can easily define the covariant derivative of any spinor field and this, thanks to the soldering condition, which expresses $\omega^{ab}$ in terms of the vielbein, introduces the gravitational coupling of the spinor particles.

We already showed how to rewrite the Yang-Mills Lagrangian in vielbein formalism. Let us now consider the case of the Dirac Lagrangian for a spin one half field. As usual, in preparation for those multi-dimensional theories that constitute the focus of attention in contemporary Theoretical Physics, we deal with the case of spinor fields in $m$ dimensions. For conventions and properties of spinor fields in diverse dimensions we refer the reader to Appendix A. Here for simplicity we focus on the case of a Dirac spinor, which exists in every dimension. In Minkowski space the free Dirac action is the following one, which is the direct generalization to dimension $m$ of the action considered in (5.3.7):

$$\mathcal{A}_{Dirac} = \int i\,\bar\psi\gamma^\mu\partial_\mu\psi\;d^mx \tag{5.4.22}$$

It can be easily recast in a general covariant form and adapted to a curved space-time manifold through the following rewriting:

$$\begin{aligned} \mathcal{A}^{(cov)}_{Dirac} &= \frac{1}{(m-1)!}\int i\,\bar\psi\gamma^{a_1}\mathcal{D}\psi\wedge E^{a_2}\wedge E^{a_3}\wedge\dots\wedge E^{a_m}\,\varepsilon_{a_1\dots a_m} \\ \mathcal{D}\psi &\equiv d\psi - \frac{1}{4}\,\omega^{ab}\,\gamma_{ab}\,\psi \end{aligned} \tag{5.4.23}$$

The reader will notice that the ingredients entering (5.4.23) are just the vielbein $E^a$ and the spin connection $\omega^{ab}$, namely the components of the principal connection on the Poincaré bundle. The above action is manifestly invariant against local Lorentz transformations and it is invariant against general diffeomorphisms on the base manifold $\mathcal{M}$, since it is written in terms of differential forms and their exterior products. When the spin connection $\omega^{ab}$ vanishes and the vielbein reduces to a Kronecker delta, which is the case of flat Minkowski space, the reader can easily verify that the actions (5.4.23) and (5.4.22) coincide. However, just as the U(1) covariant derivative in the case of electrodynamics produced an interaction vertex of the fermion field with the photon field, in the same way the Lorentz-covariant derivative used in the action (5.4.23) generates a non-trivial interaction vertex of the same fermionic field with the graviton, which has a universal structure and is illustrated in Fig. 5.3.

Having clarified the advantages and power of the vielbein formalism, we adopt it throughout most of the following exposition, turning to the fundamental question of Einstein equations, namely to the dynamics of the gravitational field.
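The flat-space check of the covariant Dirac action mentioned above can be sketched as follows. The sketch assumes that both $\varepsilon^{a_1\dots a_m}$ and $\varepsilon_{a_1\dots a_m}$ are treated as numerical permutation symbols with $\varepsilon^{01\dots m-1} = \varepsilon_{01\dots m-1} = +1$, which is the convention under which the volume form reproduces $\det E\,d^mx$ in (5.3.34); with indices raised by $\eta$ an overall factor $\det\eta$ would have to be tracked separately.

```latex
% Flat limit: E^a = \delta^a_\mu dx^\mu and \omega^{ab} = 0, hence \mathcal{D}\psi = \partial_b\psi\, dx^b.
\begin{align*}
\mathcal{A}^{(cov)}_{Dirac}
  &= \frac{1}{(m-1)!}\int i\,\bar\psi\gamma^{a_1}\partial_b\psi\;
     dx^{b}\wedge dx^{a_2}\wedge\dots\wedge dx^{a_m}\,\varepsilon_{a_1a_2\dots a_m} \\
  &= \frac{1}{(m-1)!}\int i\,\bar\psi\gamma^{a_1}\partial_b\psi\;
     \varepsilon^{b\,a_2\dots a_m}\,\varepsilon_{a_1a_2\dots a_m}\;d^mx \\
  &= \int i\,\bar\psi\gamma^{b}\partial_b\psi\;d^mx
\end{align*}
% Identity used in the last step (numerical symbols, sum over a_2 ... a_m):
% \varepsilon^{b a_2\dots a_m}\,\varepsilon_{a_1 a_2\dots a_m} = (m-1)!\,\delta^b_{a_1}
```

The combinatorial factor $(m-1)!$ produced by the contraction of the two permutation symbols is exactly cancelled by the prefactor in (5.4.23), which is why that normalization was chosen.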

5.5 Einstein Field Equations

In approaching the issue of Einstein field equations we follow two alternative paths. First we discuss Bianchi identities, which played a fundamental role in electrodynamics and have a similar fundamental relevance in gravity as well. Then we construct a geometrical action motivating its uniqueness and from that action we derive the field equations. Secondly, following an inverse procedure, we provide the physical arguments that lead to an essentially unique form of the linearized field equations and we argue why and how they should be non-linearly extended. The iterative procedure, known under the name of Noether coupling, which reconstructs the full non-linear structure of the geometrical theory, will be outlined.

Our starting point is provided by the definition of the Poincaré curvatures split into the Torsion and the Lorentz curvature two-form, respectively. Summarizing we write:

$$T^a = \mathcal{D}E^a \equiv dE^a - \omega^{ab}\wedge E^c\,\eta_{bc} \tag{5.5.1}$$

$$R^{ab} = d\omega^{ab} - \omega^{ac}\wedge\omega^{db}\,\eta_{cd} \tag{5.5.2}$$

These two definitions imply the following Bianchi identities:

$$0 = \mathcal{D}T^a + R^{ab}\wedge E^c\,\eta_{bc} \tag{5.5.3}$$

$$0 = \mathcal{D}R^{ab} \tag{5.5.4}$$

from which we can deduce very important consequences in case we assume the soldering condition $T^a = 0$. Let us begin by expanding the curvature two-form $R^{ab}$ along the vielbein basis, which introduces the notion of Riemann tensor:

$$R^{ab} = \mathcal{R}^{ab}{}_{cd}\,E^c\wedge E^d \tag{5.5.5}$$

From the point of view of representation theory of the Lorentz group SO(1, m−1), $m$ being the number of space-time dimensions, the tensor $\mathcal{R}^{ab}{}_{cd}$ is not irreducible, rather it is the tensor product of two irreducible rank two antisymmetric representations, denoted $[1,1]$ in the language of Young tableaux (here $[\lambda_1,\lambda_2,\dots]$ stands for the tableau whose rows have lengths $\lambda_1,\lambda_2,\dots$). A priori the total number of independent components is $\frac{1}{4}m^2(m-1)^2$ and the decomposition into irreducible representations is as follows:

$$\begin{aligned} [1,1]\otimes[1,1] &= [2,2]\oplus[2,1,1]\oplus[1,1,1,1] \\ &= \widehat{[2,2]}\oplus\widehat{[2]}\oplus\bullet\oplus\widehat{[2,1,1]}\oplus[1,1]\oplus[1,1,1,1] \end{aligned} \tag{5.5.6}$$

In the first line of (5.5.6) we just listed the available symmetries of the four-index tensors appearing in the decomposition. These are not yet irreducible, since for the pseudo-orthogonal group we can still construct invariants by taking traces with the invariant metric $\eta_{ab}$. In the second line we enumerated the complete list of irreducible representations, the hat over a Young tableau meaning that the corresponding tensor is irreducible, since it has been made traceless. In this way the total number of irreducible representations contained in $\mathcal{R}^{ab}{}_{cd}$ turns out to be six. However, inserting the soldering condition $T^a = 0$ into the Bianchi identity (5.5.3) we obtain:

$$0 = \mathcal{R}^{ab}{}_{cd}\,E^c\wedge E^d\wedge E^f\,\eta_{bf} \tag{5.5.7}$$

This simply implies that any irreducible tensor contained in $\mathcal{R}^{ab}{}_{cd}$ that has at least three antisymmetric indices should vanish. In other words, the algebraic Bianchi identity (5.5.7) translates into the statement:

$$0 = \widehat{[2,1,1]} = [1,1] = [1,1,1,1] \tag{5.5.8}$$

Taking this into account, the Riemann tensor of a Riemannian torsionless manifold decomposes into the following three irreducible representations of the Lorentz group SO(1, m−1):

$$\mathcal{R}^{ab}{}_{cd} = \underbrace{\widehat{[2,2]}}_{W^{ab}{}_{cd}}\oplus\underbrace{\widehat{[2]}}_{\widehat{\mathrm{Ric}}_{ab}}\oplus\underbrace{\bullet}_{R} \tag{5.5.9}$$

where $W^{ab}{}_{cd}$ is named the Weyl tensor, $\widehat{\mathrm{Ric}}_{ab}$ is the symmetric traceless Ricci tensor and $R$ is the scalar curvature. Explicitly we can write:

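$$\mathcal{R}^{ab}{}_{cd} = W^{ab}{}_{cd} + \frac{2}{m-1}\,\delta^{[a}_{[c}\,\eta^{b]f}\,\widehat{\mathrm{Ric}}_{d]f} + \frac{2}{m(m-1)}\,\delta^{ab}_{cd}\,R \tag{5.5.10}$$

where the hat marks the traceless Ricci tensor and the normalization factors are chosen in such a way that by taking a contraction of the original Riemann tensor, for instance contracting $a\leftrightarrow c$, we obtain:

$$\mathcal{R}^{b}{}_{d} = \widehat{\mathrm{Ric}}_{df}\,\eta^{bf} + \frac{1}{m}\,\delta^b_d\,R \tag{5.5.11}$$

and taking a double contraction we have:

$$\mathcal{R}^{ab}{}_{ab} = R \tag{5.5.12}$$

Instead of the traceless Ricci tensor it is customary to use in General Relativity the full Ricci tensor defined as follows:

$$\eta_{bf}\,\mathcal{R}^{b}{}_{d} \equiv \mathrm{Ric}_{df} = \widehat{\mathrm{Ric}}_{df} + \frac{1}{m}\,\eta_{df}\,R \tag{5.5.13}$$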

Let us next consider the implication of the differential Bianchi identity (5.5.4). Taking into account the vanishing torsion condition and expanding along the vielbein basis we get:

$$0 = \mathcal{D}_f\mathcal{R}^{ab}{}_{cd}\;E^f\wedge E^c\wedge E^d \tag{5.5.14}$$

Explicitly this means:

$$0 = \mathcal{D}_f\mathcal{R}^{ab}{}_{cd} + \mathcal{D}_c\mathcal{R}^{ab}{}_{df} + \mathcal{D}_d\mathcal{R}^{ab}{}_{fc} \tag{5.5.15}$$

If in the above equation we take the double contraction $f\leftrightarrow a$ and $b\leftrightarrow d$ we obtain the following immediate result:

$$\mathcal{D}^a G_{ab} = 0 \tag{5.5.16}$$

$$G_{ab} \equiv \mathrm{Ric}_{ab} - \frac{1}{2}\,\eta_{ab}\,R \tag{5.5.17}$$

So from the study of Bianchi identities, in presence of the soldering equation, we have come to the conclusion that there exists a unique linear combination of the Riemann tensor components which is symmetric and fulfills the property of being divergenceless, $\mathcal{D}^a G_{ab} = 0$. Such a property guarantees that it can function as the left hand side of an equation whose right hand side is a conserved symmetric tensor.

The object $G_{ab}$ is named the Einstein tensor for very good reasons. The intuitive idea, summarized by Eddington's metaphor (see Fig. 5.1), that the space-time curvature is proportional to its energy-matter content, could find a precise mathematical formulation when Einstein discovered the tensor $G_{ab}$. Indeed it was clear to him from the beginning that the conserved current sitting on the right hand side of the gravitational equation had to be the symmetric stress-energy tensor, namely the conserved current of the relativistic momentum $P^\mu$, which substitutes the Newtonian gravitational mass. The left hand side had similarly to be a symmetric conserved tensor, encoding the curvature of space-time, which should be quadratic in the derivatives of the metric $g_{\mu\nu}$. The answer was clearly provided by $G_{ab}$, which allowed Einstein to write an equation of the form [2, 3]:

$$G_{ab} = \frac{4\pi G}{c^2}\,T_{ab} \tag{5.5.18}$$

where $G$ is Newton's constant and $T_{ab}$ is the stress-energy tensor describing all matter components filling space-time. It is very rewarding that (5.5.18) and the soldering condition can be simultaneously obtained as variational equations from a single action principle whose structure can be uniquely determined under a few reasonable and compelling assumptions.
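The double contraction leading from (5.5.15) to (5.5.16) can be written out step by step. The sketch below assumes only the antisymmetry of $\mathcal{R}^{ab}{}_{cd}$ in each index pair, the covariant constancy of $\eta_{ab}$, the symmetry of the full Ricci tensor, and the contraction conventions of (5.5.11) and (5.5.12), namely $\mathcal{R}^{b}{}_{d} \equiv \mathcal{R}^{ab}{}_{ad}$ and $R \equiv \mathcal{R}^{b}{}_{b}$.

```latex
% Step 1: contract f with a in (5.5.15).
\begin{align*}
0 &= \mathcal{D}_a\mathcal{R}^{ab}{}_{cd} + \mathcal{D}_c\mathcal{R}^{ab}{}_{da} + \mathcal{D}_d\mathcal{R}^{ab}{}_{ac} \\
  &= \mathcal{D}_a\mathcal{R}^{ab}{}_{cd} - \mathcal{D}_c\mathcal{R}^{b}{}_{d} + \mathcal{D}_d\mathcal{R}^{b}{}_{c}
\end{align*}
% used: \mathcal{R}^{ab}{}_{da} = -\mathcal{R}^{ab}{}_{ad} = -\mathcal{R}^{b}{}_{d}

% Step 2: contract b with d.
\begin{align*}
0 &= \mathcal{D}_a\mathcal{R}^{ab}{}_{cb} - \mathcal{D}_c R + \mathcal{D}_b\mathcal{R}^{b}{}_{c} \\
  &= 2\,\mathcal{D}_a\mathcal{R}^{a}{}_{c} - \mathcal{D}_c R
\end{align*}
% used: \mathcal{R}^{ab}{}_{cb} = \mathcal{R}^{ba}{}_{bc} = \mathcal{R}^{a}{}_{c}

% Step 3: lower the index with \eta and divide by two.
\[
\mathcal{D}^a\left(\mathrm{Ric}_{ac} - \tfrac{1}{2}\,\eta_{ac}\,R\right) = 0
\]
```

The relative coefficient $-\frac{1}{2}$ between the Ricci tensor and the scalar curvature is therefore not a choice: it is the only one for which the divergence vanishes identically.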

5.6 The Action of Gravity

The Yang-Mills action, as we described it in the previous sections, is quadratic in the curvature field strength so that it leads to second order differential equations for the gauge field $\mathcal{A}_\mu$. Yet in order to be written it requires either the use of the Hodge dual or of the first order formulation based on the vielbein formalism we explained in Sect. 5.3.3. In any case knowledge of the gravitational field is intrinsically involved. In the case of gravity a quadratic action in the Riemann tensor is excluded, since this would lead to differential equations of order four for the gravitational field, which is the metric $g_{\mu\nu}$ or its substitute, the vielbein $E^a_\mu$. This would violate the basic principles of classical mechanics, where positions and velocities of the field should define a complete set of initial data. Hence the gravitational action has to be at most linear in the Riemann curvature tensor. Here is the first striking difference with Yang-Mills theory. Such a difference stems from the fact that the basic dynamical degrees of freedom of the gravitational field are not encoded in the connection itself (the Levi Civita connection), but rather in the metric from which it derives. On the other hand the whole discussion of the present chapter taught us that the vielbein is in its own right a connection, simply of a larger algebra, namely the Poincaré Lie algebra. Yet gravity cannot be simply regarded as the Yang-Mills theory of the Poincaré connection, since it should include, as a built-in principle, the soldering (5.4.1) which allows one to eliminate the spin connection $\omega^{ab}$ in terms of the vielbein.

So we come up with the following apparently very difficult task: we should construct an action functional

$$\mathcal{A}_{grav} = \int\mathcal{L}_{grav}(E,\omega) \tag{5.6.1}$$

for the vielbein $E^a$ and the spin connection $\omega^{ab}$ possessing the following obligatory properties:

(a) $\mathcal{L}_{grav}(E,\omega)$ should be an $m$-form (for $m$-dimensional gravity) constructed only with wedge products of $E^a$, $\omega^{ab}$ and their exterior differentials $dE^a$, $d\omega^{ab}$, without the use of any Hodge duals, which would imply the necessary use of a metric tensor. Fulfilling these specified properties, the action $\mathcal{A}_{grav}$ will be automatically invariant against arbitrary diffeomorphisms of the base manifold $\mathcal{M}$.

(b) $\mathcal{L}_{grav}(E,\omega)$ should be at most linear in the curvature two-form $R^{ab}$.

(c) $\mathcal{L}_{grav}(E,\omega)$ should be invariant against arbitrary local Lorentz transformations defined as follows:

$$\begin{aligned} E'^a &= \Lambda^a{}_b(x)\,E^b && \text{for the vielbein} \\ \omega'^{ab} &= \Lambda^a{}_c(x)\,d\Lambda^b{}_d(x)\,\eta^{cd} + \Lambda^a{}_c(x)\,\Lambda^b{}_d(x)\,\omega^{cd} && \text{for the spin connection} \end{aligned} \tag{5.6.2}$$

(d) The variation of the action with respect to the spin connection one-form $\omega^{ab}$ should just implement the soldering condition, namely should have as unique solution the vanishing of the torsion two-form $T^a = 0$.

(e) The variation of the action with respect to the vielbein $E^a$ should result in the unique algebraic condition that the Einstein tensor vanishes: $G_{ab} = 0$.

Such a hard task has indeed a simple and unique solution. Before presenting it let us illustrate the reasons behind all the listed requirements.

Diffeomorphism invariance encodes the principle of equivalence. Indeed this is the very starting point of the new theory of gravity sought by Einstein, in which the laws of physics should take the same form for any observer, independently of his state of motion. Linearity in the curvature two-form we have already discussed. It is necessary in order to obtain second order field equations for the gravitational field. Invariance under local Lorentz transformations is equally essential. Indeed the physical degrees of freedom of the gravitational field have already been shown to be associated with the metric, which is a symmetric $m \times m$ matrix. Replacing $g_{\mu\nu}$ by the vielbein $E^a_\mu$, which is an $m \times m$ matrix with no prescribed symmetry property, implies that we are enlarging the number of dynamical variables by an extra amount $\frac{1}{2} m(m-1)$ of fictitious ones. The only way this procedure can be correct is if an extra local symmetry is introduced with exactly as many parameters as the variables we have surreptitiously slipped in. This extra symmetry is precisely local Lorentz symmetry, the parameter space of $\mathrm{SO}(1,m-1)$ having just dimension $\frac{1}{2} m(m-1)$. Local Lorentz symmetry can be used to gauge away the antisymmetric part of $E^a_\mu$, leaving the physical degrees of freedom of the metric. Finally, that the soldering condition should follow as a dynamical variational equation, and should not be imposed as an external constraint, is a consistency requirement for the dynamical action one wants to construct. Under such a condition, the coupling to gravity of fermionic fields, which involves the spin connection, will automatically introduce all the needed modifications without any ad hoc constructions. Having clarified the implications of all the constructive requirements, let us present their solution.
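The counting argument above can be checked mechanically. The following is a minimal sketch, not part of the original text; the function names are illustrative assumptions. It verifies that the $m^2$ components of a generic vielbein, minus the $\frac{1}{2}m(m-1)$ local Lorentz parameters, leave exactly the $\frac{1}{2}m(m+1)$ components of a symmetric metric.

```python
def vielbein_components(m: int) -> int:
    """Entries of a generic m x m matrix E^a_mu (no symmetry imposed)."""
    return m * m


def lorentz_parameters(m: int) -> int:
    """Dimension of SO(1, m-1): antisymmetric m x m parameters."""
    return m * (m - 1) // 2


def metric_components(m: int) -> int:
    """Independent entries of a symmetric m x m matrix g_mu_nu."""
    return m * (m + 1) // 2


if __name__ == "__main__":
    for m in range(2, 12):
        # Gauge-fixing local Lorentz symmetry must leave the metric count.
        assert vielbein_components(m) - lorentz_parameters(m) == metric_components(m)
        print(m, vielbein_components(m), lorentz_parameters(m), metric_components(m))
```

In $m = 4$ this is the familiar statement that 16 vielbein components minus 6 Lorentz parameters give the 10 components of the metric.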
Apart from a multiplicative constant, the gravitational action is the following one:

$$\mathcal{A}_{grav} = \frac{1}{\kappa}\int \mathcal{L}_{grav}(E,\omega), \qquad \mathcal{L}_{grav}(E,\omega) = R^{a_1 a_2}[\omega]\wedge E^{a_3}\wedge\dots\wedge E^{a_m}\,\varepsilon_{a_1\dots a_m} \tag{5.6.3}$$

where $\varepsilon_{a_1\dots a_m}$ denotes the completely antisymmetric Levi Civita symbol. The curvature two-form has purposely been written as $R^{a_1 a_2}[\omega]$ in order to emphasize that it should be considered as a functional of the spin connection, regarded as an independent variable.

Let us comment on the physical dimensions of the parameter $\kappa$. First of all, let us note that in $m = 4$ space-time dimensions the following combinations of fundamental constants of Nature have respectively the dimensions of a mass and of a length, the Planck mass and the Planck length:

$$\mu_P = \sqrt{\frac{\hbar c}{G}}\,; \qquad \ell_P = \sqrt{\frac{\hbar G}{c^3}} \tag{5.6.4}$$

where $G$ is Newton's constant, $c$ is the velocity of light and $\hbar$ is Planck's constant. Secondly let us observe that in natural units where $c = 1$, the physical dimension of the m-form integral $\int\mathcal{L}_{grav}$ is $[\int\mathcal{L}_{grav}] = \ell^{m-2}$, while the physical dimension of an action is $[\text{action}] = \mu\times\ell$, having denoted by $\mu$ the mass dimension and by $\ell$ the length dimension. Hence in $D = m$, in order for the gravitational action $\mathcal{A}_{grav}$ to have the correct physical dimensions, the coupling parameter $\kappa$ must be of the form:

$$\kappa = \beta\times\frac{1}{\mu_P}\,\ell_1^{\,m-3-k}\,\ell_2^{\,k} \tag{5.6.5}$$

where $\ell_1$ and $\ell_2$ are some fundamental length parameters and $\beta$ is some conventional numerical parameter. For $D = 4$ the obvious solution is provided by:

$$k = 0\,; \qquad \ell_1 = \ell_P \tag{5.6.6}$$

so that in standard four-dimensional gravity we have:

$$\kappa = \beta\times\frac{\ell_P}{\mu_P} = \beta\,\frac{G}{c^2} \tag{5.6.7}$$

In multi-dimensional gravitational theories where $D = 4 + k$ we can consider other solutions of the above problem if $k$ dimensions are compactified and we have a new fundamental length parameter provided by the compactification radius $R_c$. In that case we can choose:

$$\ell_1 = \ell_P\,; \qquad \ell_2 = R_c \tag{5.6.8}$$

and we get:

$$\kappa = \beta\times\frac{G}{c^2}\,R_c^{\,k} \tag{5.6.9}$$

Having clarified the nature of the gravitational coupling parameter, let us consider the variational equations derived from the proposed action functional and demonstrate that they possess the advocated properties. It is indeed an easy task to perform the variation with respect to the two independent variables $\omega^{ab}$ and $E^a$. Let us begin with the first.
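As a numerical illustration of (5.6.4) and (5.6.7), the sketch below evaluates the Planck mass and length in SI units and checks that their ratio is $G/c^2$. The numerical values of the constants are standard reference values inserted here as assumptions (they are not quoted in the text), and the function names are hypothetical.

```python
import math

# Reference SI values (assumed here; not taken from the text).
G = 6.67430e-11          # Newton's constant, m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light, m/s
HBAR = 1.054_571_817e-34 # reduced Planck constant, J s


def planck_mass() -> float:
    """mu_P = sqrt(hbar c / G), in kg."""
    return math.sqrt(HBAR * C / G)


def planck_length() -> float:
    """l_P = sqrt(hbar G / c^3), in m."""
    return math.sqrt(HBAR * G / C**3)


def kappa_4d(beta: float) -> float:
    """kappa = beta * l_P / mu_P, the D = 4 solution with k = 0."""
    return beta * planck_length() / planck_mass()


def kappa_compactified(beta: float, k: int, r_c: float) -> float:
    """kappa = beta * (G / c^2) * R_c^k for D = 4 + k with radius r_c."""
    return beta * (G / C**2) * r_c**k


if __name__ == "__main__":
    print("Planck mass   [kg]:", planck_mass())
    print("Planck length [m] :", planck_length())
    # hbar drops out of the ratio l_P / mu_P, leaving G / c^2.
    assert math.isclose(kappa_4d(1.0), G / C**2, rel_tol=1e-12)
```

Note that $\hbar$ cancels in the ratio $\ell_P/\mu_P$, which is why the classical coupling $\kappa$ contains no trace of Planck's constant.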

5.6.1 Torsion Equation

Performing the $\delta\omega^{ab}$-variation, by means of a partial integration we obtain:

$$0 = \mathcal{D}E^{a_2}\wedge E^{a_3}\wedge\dots\wedge E^{a_m}\,\varepsilon_{ab a_2\dots a_m} = T^{a_2}\wedge E^{a_3}\wedge\dots\wedge E^{a_m}\,\varepsilon_{ab a_2\dots a_m} \tag{5.6.10}$$

The above equation has the unique solution $T^{a_2} = 0$. To see this it suffices to expand the torsion two-form along the vielbein basis:

$$T^{a_2} = T^{a_2}{}_{pq}\,E^{p}\wedge E^{q} \tag{5.6.11}$$

and substitute back into the original equation. Collecting the coefficients of an independent basis of $(m-1)$-forms we obtain:

$$0 = (m-3)!\times 3!\times\left(T^{t}{}_{ab} + 2\,\delta^{t}_{[a}\,T^{*}{}_{b]*}\right) \tag{5.6.12}$$

where the $*$ symbol denotes saturation of the corresponding indices. Taking a further trace of (5.6.12) we obtain $T^{*}{}_{a*} = 0$ which, inserted back into the equation, implies the full vanishing of the torsion, $T^{a}{}_{bc} = 0$. Hence the soldering of the Lorentz bundle with the tangent bundle is no longer an a priori hypothesis, but rather a dynamical outcome of the classical action principle.

5.6.1.1 Torsionful Connections

Let us now suppose that in dimension $m$, in addition to the action of pure gravity, we also have a contribution from matter, so that the total action is of the form:

$$\mathcal{A}_{tot} = \int\mathcal{L}_{grav}(E,\omega) + \int\mathcal{L}_{matter}(E,\omega,\Phi) \tag{5.6.13}$$

where $\Phi$ collectively denotes the non-gravitational matter fields. Necessarily the variation with respect to $\delta\omega^{ab}$ of the matter Lagrangian $\mathcal{L}_{matter}(E,\omega,\Phi)$ produces an $(m-1)$-form with the following general structure:

$$\frac{\delta\mathcal{L}_{matter}}{\delta\omega^{ab}} = K^{f}{}_{ab}\,\varepsilon_{f\ell_1\dots\ell_{m-1}}\,E^{\ell_1}\wedge\dots\wedge E^{\ell_{m-1}} \tag{5.6.14}$$

The three-index tensor $K^{f}{}_{ab}(\Phi,\mathcal{D}\Phi)$ depends on the matter fields and possibly on their derivatives. In this case, the torsion tensor is not zero; rather it has a very simple expression in terms of $K^{f}{}_{ab}(\Phi,\mathcal{D}\Phi)$:

$$T^{t}{}_{ab}(\Phi,\mathcal{D}\Phi) = -(m-1)(m-2)\left(K^{t}{}_{ab} + \frac{2}{m-2}\,\delta^{t}_{[a}\,K^{*}{}_{b]*}\right) \tag{5.6.15}$$

In the presence of torsion, the spin connection $\omega^{ab}$ is modified in the following way:

$$\omega^{ab} = \mathring{\omega}^{ab}(E,\partial E) + \Delta\omega^{ab}(\Phi,\mathcal{D}\Phi) \tag{5.6.16}$$

where $\mathring{\omega}^{ab}(E,\partial E)$, which depends only on the vielbein and its derivatives, is the Levi Civita part of the spin connection, while $\Delta\omega^{ab}(\Phi,\mathcal{D}\Phi)$, depending on the matter fields and their derivatives, is the following one:

$$\Delta\omega^{ab} = \left(2\,\eta^{q[a}\,T^{b]}{}_{qp} - \eta^{af}\eta^{bg}\eta_{mp}\,T^{m}{}_{fg}\right)E^{p} \tag{5.6.17}$$

Substituting (5.6.16) and (5.6.17) back into the action, we obtain new non-linear interactions of the matter fields that are actually a gravitational effect.

The Torsion of Dirac Fields The simplest example is provided by the case of a 1 spin 2 -field. Adding to the gravitational action the spinor action (5.4.23) the tensor f Kab is immediately calculated: f 1 f 2 f ∗ K =−i ψγ γabψ + 2 δ ψγ γb]γ∗ψ (5.6.18) ab 4(m − 3)! m − 2 [a and for the torsion we find: f 1 f f [a b] T =−i ψγ γabψ + 2η ψγ ψ (5.6.19) ab 4(m − 3)! Via (5.6.17), this spinor bilinear form of the torsion induces quartic interaction of the Fermi fields that are a consequence of Einstein gravity.

Dilaton Torsion Another notable example of the torsion mechanism is provided by the case where we have a dilaton coupling of the Einstein action. Suppose that in addition to the vielbein and the spin connection we also have a scalar field $\varphi$ and that the total action takes the following form:

$$\mathcal{A}_{tot} = \underbrace{\int\exp[-a\varphi]\;R^{a_1 a_2}[\omega]\wedge E^{a_3}\wedge\dots\wedge E^{a_m}\,\varepsilon_{a_1\dots a_m}}_{\text{dilaton gravity}} + \mathcal{A}_{scalar} \tag{5.6.20}$$

$$\mathcal{A}_{scalar} = \int\frac{1}{m!}\,\Phi^{a_1}\,d\varphi\wedge E^{a_2}\wedge\dots\wedge E^{a_m}\,\varepsilon_{a_1\dots a_m} + \int\frac{1}{m!}\left(-\frac{1}{2m}\,\Phi^{m}\Phi_{m} - \frac{1}{m}\,W(\varphi)\right)E^{a_1}\wedge\dots\wedge E^{a_m}\,\varepsilon_{a_1\dots a_m} \tag{5.6.21}$$

As defined in (5.6.21), the scalar action $\mathcal{A}_{scalar}$ plays, for the Klein-Gordon action of a scalar field $\varphi$, the same role that is played by (5.3.35) for the Yang-Mills field, namely it is its transcription in the vielbein formalism without the use of any Hodge dual. Indeed, if we consider the 0-form $\Phi^a$, which belongs to the vector representation of the Lorentz group, as an independent field and we vary $\mathcal{A}_{scalar}$ with respect to it, we obtain the following algebraic condition:

$$\Phi_a = \partial_a\varphi \tag{5.6.22}$$

where, by definition, $d\varphi = \partial_a\varphi\,E^a$. Substituting this result back into (5.6.21) we obtain:

$$\mathcal{A}_{scalar}\;\rightarrow\;\mathcal{A}_{KG} = \frac{1}{m}\int\left(\frac{1}{2}\,\partial_\mu\varphi\,\partial_\nu\varphi\,g^{\mu\nu} - W(\varphi)\right)\sqrt{-\det g}\;d^m x \tag{5.6.23}$$

which is the standard form for the action of a scalar field, with potential $W(\varphi)$, in the background of a metric field $g_{\mu\nu}$.
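The elimination of $\Phi^a$ can be made explicit. The following is a sketch under assumptions not stated in the text: we abbreviate $\Omega \equiv E^{a_1}\wedge\dots\wedge E^{a_m}\,\varepsilon_{a_1\dots a_m}$, use the identity $E^{b}\wedge E^{a_2}\wedge\dots\wedge E^{a_m}\,\varepsilon_{a_1 a_2\dots a_m} = \frac{1}{m}\,\delta^{b}_{a_1}\,\Omega$, and do not track overall orientation signs.

```latex
\begin{align*}
\mathcal{A}_{scalar}
  &= \frac{1}{m\,m!}\int\Big[\Phi^{a}\,\partial_a\varphi
     - \tfrac{1}{2}\,\Phi^{a}\Phi_{a} - W(\varphi)\Big]\,\Omega ,\\[4pt]
\frac{\delta\mathcal{A}_{scalar}}{\delta\Phi^{a}} = 0
  &\;\Longrightarrow\; \partial_a\varphi - \Phi_a = 0 ,\\[4pt]
\mathcal{A}_{scalar}\Big|_{\Phi_a=\partial_a\varphi}
  &= \frac{1}{m\,m!}\int\Big[\tfrac{1}{2}\,\partial^{a}\varphi\,\partial_a\varphi
     - W(\varphi)\Big]\,\Omega .
\end{align*}
```

Since $\Omega/m!$ is, up to orientation, the invariant volume element $\sqrt{-\det g}\,d^m x$, the last line accounts for the overall factor $\frac{1}{m}$ appearing in the Klein-Gordon action.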

On the other hand, varying the gravitational part of the action with respect to the connection field, we easily see that the result is just of the form discussed before, but with a $K^{f}{}_{pq}$ tensor of the following form:

$$K^{f}{}_{pq} = -a\,\frac{1}{m-1}\,\delta^{f}_{[p}\,\partial_{q]}\varphi \tag{5.6.24}$$

Hence, also in this case, there is an addition to the Levi Civita part of the spin connection which is contributed by the matter field, in this case the dilaton. We will see in the second volume that this kind of coupling can be easily reabsorbed by means of a so-called Weyl transformation. Indeed it suffices to introduce a new vielbein:

$$\widehat{E}^{a} = \kappa^{-\frac{1}{m}}\,\exp\left[-\frac{a}{m-2}\,\varphi\right]E^{a} \tag{5.6.25}$$

and, regarded as a functional of the new vielbein $\widehat{E}^{a}$, the gravitational part of the action becomes the standard one written in (5.6.3). Obviously the Weyl transformation (5.6.25) implies modifications of the scalar part of the action.

5.6.2 The Einstein Equation

Let us now consider the variation of the gravitational action (5.6.3) with respect to the vielbein $E^a$. For convenience we consider at once also the contributions from the matter action. Hence we obtain the following variational equation:

$$0 = \frac{m-2}{\kappa}\int\underbrace{R^{a_1 a_2}[\omega]\wedge E^{a_3}\wedge\dots\wedge E^{a_{m-1}}\,\varepsilon_{a_1\dots a_{m-1} f}}_{R_f}\wedge\,\delta E^{f} + \int\underbrace{\frac{\delta\mathcal{L}_{matter}}{\delta E^{f}}}_{T_f}\wedge\,\delta E^{f} \tag{5.6.26}$$

where both $R_f$ and $T_f$ are $(m-1)$-forms belonging to the vector representation of the Lorentz group $\mathrm{SO}(1,m-1)$. Correspondingly we can parameterize them in terms of two rank-two tensors $T_{fg}$ and $X_{fg}$, according to the following formulae:

$$\frac{\delta\mathcal{L}_{matter}}{\delta E^{f}} \equiv T_f = T_{fg}\,\eta^{gk}\,\varepsilon_{k\ell_1\dots\ell_{m-1}}\times E^{\ell_1}\wedge\dots\wedge E^{\ell_{m-1}} \tag{5.6.27}$$

$$R_f = X_{fg}\,\eta^{gk}\,\varepsilon_{k\ell_1\dots\ell_{m-1}}\times E^{\ell_1}\wedge\dots\wedge E^{\ell_{m-1}} \tag{5.6.28}$$

By explicit evaluation we find:

$$X_{fg} = (-)^{(m-1)}\times\frac{1}{(m-1)(m-2)}\;G_{fg} \tag{5.6.29}$$

where $G_{fg}$ is the Einstein tensor defined in (5.5.17). Hence we obtain the equation:

$$G_{fg} = (-)^{m}\,\kappa\,T_{fg} \tag{5.6.30}$$

Hence if we fix the numerical parameter $\beta$ introduced above to the value $\beta = 4\pi$ and we interpret $T_{fg}$ as the stress-energy tensor of the matter system, we get $\kappa = 4\pi G/c^2$, reproducing exactly the Einstein equation anticipated in (5.5.18). In this way we have shown that the gravitational action (5.6.3) does indeed the job it was required to do:

• It realizes the soldering of the Lorentz bundle with the tangent bundle by imposing the vanishing of the torsion $T^a = 0$ as a dynamical equation.
• It yields the Einstein equation (5.5.18).

In the case where there are source terms for the torsion, we showed that these lead to a modification of the Levi Civita spin connection by the addition of a $\Delta\omega^{ab}$ term depending only on the matter fields which, once replaced back into the second order action, produces additional interactions between matter fields that are just a gravitational effect.

5.6.3 Conservation of the Stress-Energy Tensor and Symmetries of the Gravitational Action

A natural question which arises at this point concerns the conservation of the stress-energy tensor as defined by (5.6.27), namely as the variation of the matter action with respect to the vielbein $E^a$. The answer to such a question is related to the symmetries of the gravitational and of the matter actions. The main starting point of Einstein's search for the correct geometrical side of the gravitational field equation was the search for a symmetric tensor, linear in the components of the Riemann tensor, whose divergence, by virtue of the Bianchi identities, should vanish in order to couple to the stress-energy tensor that, in Einstein's approach, was by its own definition a conserved, divergenceless tensor. So let us consider the symmetries of the action functional (5.6.3). As emphasized many times, the multiplet made by the spin connection plus the vielbein $\{\omega^{ab}, E^a\}$ can be regarded as a principal connection on a principal Poincaré bundle that has space-time $\mathcal{M}_m$ as its base manifold. Yet the action we have constructed is not invariant under local gauge transformations of the full Poincaré Lie algebra, whose infinitesimal form is:

$$\delta E^{a} = \mathcal{D}\tau^{a} + \varepsilon^{ab}\,E^{c}\,\eta_{bc} \tag{5.6.31}$$

$$\delta\omega^{ab} = \mathcal{D}\varepsilon^{ab} \tag{5.6.32}$$

where $\tau^a$ is the gauge parameter associated with the translation generator $P_a$, while the antisymmetric $\varepsilon^{ab}$ are the gauge parameters associated with the Lorentz generators $J_{ab}$. Enumerating the requirements that the gravitational action should satisfy, we asked for local Lorentz invariance, which corresponds to formulae (5.6.31), (5.6.32) with $\tau^a = 0$ and whose finite form was displayed in (5.6.2), but we did not enforce gauge translation invariance. Indeed, by means of an integration by parts, an immediate calculation shows that under a gauge translation the action functional (5.6.3) varies as follows:

$$\delta_\tau\mathcal{A}_{grav} = (m-2)\int R^{a_1 a_2}\wedge T^{a_3}\wedge E^{a_4}\wedge\dots\wedge E^{a_{m-1}}\,\tau^{f}\,\varepsilon_{a_1\dots a_{m-1} f} \tag{5.6.33}$$

This variation is not identically zero, yet it vanishes on the shell of the soldering condition $T^a = 0$, which is the variational field equation of the spin connection. Hence from a formal point of view the gravitational action (5.6.3) is gauge invariant only under the Lorentz subgroup of the Poincaré group. Yet, being written solely in terms of differential forms and their wedge products, the action (5.6.3) is invariant under the group of diffeomorphisms $\mathrm{Diff}(\mathcal{M}_m)$, which involves $m$ arbitrary functions, namely as many as the gauge transformations associated with the translation generators $P_a$. Hence, although in a more subtle way than in Yang-Mills theory, the number of local invariances of the gravitational action is just equal to the dimension of the Poincaré Lie algebra, namely $\frac{1}{2}m(m+1)$.

The outcome of the above discussion is that, after implementation of the soldering constraint, namely in second order formalism, the transformation:

$$E^{a}\;\rightarrow\;E^{a} + \mathcal{D}\tau^{a} \tag{5.6.34}$$

is actually a true infinitesimal symmetry of both the gravitational action and the matter action; by consistency the latter must have the same symmetries as the former. The resolution of this apparent paradox is that the gauge transformation (5.6.34) is accompanied by a compensating transformation of the spin connection that keeps the soldering condition $T^a = 0$ true. This observation allows a very simple proof that the stress-energy tensor defined by (5.6.27) is indeed divergenceless. It suffices to note that, in order for the transformation (5.6.34) to be a symmetry of the matter action, it is necessary that:

$$0 = \int\frac{\delta\mathcal{L}_{matter}}{\delta E^{f}}\wedge\mathcal{D}\tau^{f} \equiv \int T_f\wedge\mathcal{D}\tau^{f} \tag{5.6.35}$$

By partial integration this is true if and only if:

$$\mathcal{D}T_f = 0 \tag{5.6.36}$$

Inserting the parameterization (5.6.27) into (5.6.36), developing the derivatives and using the soldering constraint $T^a = 0$, we immediately derive:

$$\mathcal{D}^{f}\,T_{fg} = 0 \tag{5.6.37}$$

which is the conservation law of the stress-energy tensor.

5.6.4 Examples of Stress-Energy-Tensors

It is convenient and instructive to construct a few examples of stress-energy tensors, in particular those associated with the matter Lagrangians we considered in the previous sections.

The Stress-Energy Tensor of the Yang-Mills Field Let us consider the Yang-Mills action in vielbein formalism spelled out in (5.3.35). The variation with respect to the vielbein, $\delta E^u$, produces the following result:

$$T^{YM}_{u} = -\frac{1}{(m-2)!}\,\mathrm{Tr}\big(F^{ab}F_{pq}\big)\,E^{p}\wedge E^{q}\wedge E^{c_1}\wedge\dots\wedge E^{c_{m-3}}\,\varepsilon_{c_1\dots c_{m-3}uab} + \mathrm{Tr}\big(F^{pq}F_{pq}\big)\,E^{c_1}\wedge\dots\wedge E^{c_{m-1}}\,\varepsilon_{c_1\dots c_{m-1}u} \tag{5.6.38}$$

from which we immediately obtain:

$$T^{YM}_{ab} = \frac{4}{(m-1)!}\,(-)^{m-1}\left(F_{a*}F_{b*}\,\eta^{**} - \frac{1}{4}\,\eta_{ab}\,F^{**}F_{**}\right) \tag{5.6.39}$$

where as usual the symbol $*$ denotes a contracted index. The reader will notice that the stress-energy tensor of a Yang-Mills field is traceless in $m = 4$ space-time dimensions.

The Stress-Energy Tensor of a Scalar Field The stress-energy tensor obtained from the scalar Lagrangian Ascalar defined in (5.6.21) is computed in a similarly easy way. First we get:

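$$T^{scalar}_{u} = -\frac{(m-1)}{m!}\,\Phi^{a_1}\,\Phi_{r}\,E^{r}\wedge E^{a_2}\wedge\dots\wedge E^{a_{m-1}}\,\varepsilon_{a_1\dots a_{m-1}u} + \frac{m}{m!}\left(-\frac{1}{2m}\,\Phi^{*}\Phi_{*} - \frac{1}{m}\,W(\varphi)\right)E^{a_1}\wedge\dots\wedge E^{a_{m-1}}\,\varepsilon_{a_1\dots a_{m-1}u} \tag{5.6.40}$$

from which we immediately get:

$$T^{scalar}_{fg} = (-)^{m}\,\frac{1}{m!}\left(\Phi_f\Phi_g - \frac{1}{2}\,\eta_{fg}\,\Phi_{*}\Phi^{*} + \eta_{fg}\,W(\varphi)\right) \tag{5.6.41}$$

Having reviewed these examples, let us now turn to the derivation of Einstein equations from the particle theorist's viewpoint. To do so let us begin with the weak field approximation of the geometrical theory.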

5.7 Weak Field Limit of Einstein Equations

Resuming the discussion of previous sections and focusing on the physically interesting dimension $m = 4$, the Einstein equation can be written as follows:

$$R^{ab}\wedge E^{c}\,\varepsilon_{abcd} = T_d \tag{5.7.1}$$

$$T_d = T_{dg}\,\eta^{gk}\,\varepsilon_{k\ell_1\ell_2\ell_3}\,E^{\ell_1}\wedge E^{\ell_2}\wedge E^{\ell_3} \tag{5.7.2}$$

The weak field approximation consists of assuming that the metric differs by a very small perturbation from the flat Minkowski metric. In other words we set:

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}(x) \quad\text{where } |h_{\mu\nu}(x)| \ll 1 \tag{5.7.3}$$

where $h_{\mu\nu}(x)$ is a symmetric tensor field with respect to the Lorentz group $\mathrm{SO}(1,3)$. Under an infinitesimal diffeomorphism:

$$x^{\mu}\;\rightarrow\;x^{\mu} + \xi^{\mu}(x) \tag{5.7.4}$$

where $\xi^\mu(x)$ denotes a four-vector of arbitrary functions, the transformation of the fluctuation field is immediately derived by linearizing the tensor transformation of the metric field, $g_{\mu\nu}(x) = \hat{g}_{\rho\sigma}(\hat{x})\,\frac{\partial\hat{x}^{\rho}}{\partial x^{\mu}}\frac{\partial\hat{x}^{\sigma}}{\partial x^{\nu}}$, and we find:

$$h_{\mu\nu}\;\rightarrow\;h_{\mu\nu}(x) + \partial_\mu\xi_\nu + \partial_\nu\xi_\mu \tag{5.7.5}$$

To first order in the perturbation $h$ we can write the corresponding vielbein as follows:

$$E^{a} = dx^{a} + \frac{1}{2}\,h^{ab}\,dx^{c}\,\eta_{bc} \tag{5.7.6}$$

Obviously this is not a unique solution. Infinitely many others can be obtained by performing infinitesimal Lorentz rotations. Each of them, however, will influence only the antisymmetric part of the matrix $\delta E^{a}_{\mu} \equiv E^{a}_{\mu} - \delta^{a}_{\mu}$. We have gauge-fixed Lorentz symmetry by imposing that $\delta E^{a}_{\mu}$ is symmetric. Inserting (5.7.6) into the soldering condition $T^a = 0$ we immediately obtain the solution for the spin connection, calculated at first order in the fluctuation field $h$:

$$\omega^{ab} = \frac{1}{2}\,\eta^{af}\eta^{bg}\left(\partial_f h_{g\ell} - \partial_g h_{f\ell}\right)dx^{\ell} \tag{5.7.7}$$

Inserting this result into the expression for the curvature two-form $R^{ab}$ and calculating the Einstein tensor $G_{ab}$ to first order in the perturbation $h$, we obtain:

$$G_{\mu\nu} = \Box h_{\mu\nu} + \partial_\mu\partial_\nu\left(h_{\rho\sigma}\eta^{\rho\sigma}\right) - \partial_\mu\partial^{\rho}h_{\rho\nu} - \partial_\nu\partial^{\rho}h_{\rho\mu} - \eta_{\mu\nu}\,\Box\left(h_{\rho\sigma}\eta^{\rho\sigma}\right) + \eta_{\mu\nu}\,\partial^{\rho}\partial^{\sigma}h_{\rho\sigma} \tag{5.7.8}$$

Notice that at this level of approximation we no longer make any distinction between the Latin and the Greek indices. The tangent bundle is soldered to the principal Lorentz bundle and the unperturbed vielbein is a $\delta^{a}_{\mu}$, which precisely identifies Greek and Latin indices. Hence the linearized field equation of Einstein gravity reduces to:

$$\Box h_{\mu\nu} + \partial_\mu\partial_\nu\left(h_{\rho\sigma}\eta^{\rho\sigma}\right) - \partial_\mu\partial^{\rho}h_{\rho\nu} - \partial_\nu\partial^{\rho}h_{\rho\mu} - \eta_{\mu\nu}\,\Box\left(h_{\rho\sigma}\eta^{\rho\sigma}\right) + \eta_{\mu\nu}\,\partial^{\rho}\partial^{\sigma}h_{\rho\sigma} = \kappa\,T_{\mu\nu} \tag{5.7.9}$$

As the reader can easily check, the left hand side of the above equation is invariant under the gauge transformation (5.7.5).
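The gauge invariance just claimed can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: it works in momentum space, where acting on a plane wave each derivative $\partial_\mu$ is replaced by $i k_\mu$ (the common factors of $i$ and the overall sign are dropped, since only the vanishing of the result matters), it takes $\eta = \mathrm{diag}(1,-1,\dots,-1)$, and all function names are hypothetical.

```python
import random


def minkowski(m):
    """Mostly-minus Minkowski metric diag(1, -1, ..., -1) as nested lists."""
    return [[(1.0 if i == 0 else -1.0) if i == j else 0.0 for j in range(m)]
            for i in range(m)]


def linearized_operator(h, k_lo):
    """Momentum-space image of the differential operator in (5.7.9).

    h    : symmetric m x m polarization h_{mu nu} (lower indices)
    k_lo : wave vector k_mu (lower indices)
    Each partial_mu is replaced by k_mu; an overall factor -1 is dropped.
    """
    m = len(k_lo)
    eta = minkowski(m)  # numerically equal to its own inverse
    k_up = [sum(eta[r][s] * k_lo[s] for s in range(m)) for r in range(m)]
    k2 = sum(k_up[r] * k_lo[r] for r in range(m))
    trace = sum(eta[r][s] * h[r][s] for r in range(m) for s in range(m))
    kh = [sum(k_up[r] * h[r][n] for r in range(m)) for n in range(m)]  # k^rho h_{rho nu}
    kkh = sum(k_up[n] * kh[n] for n in range(m))
    return [[k2 * h[mu][nu] + k_lo[mu] * k_lo[nu] * trace
             - k_lo[mu] * kh[nu] - k_lo[nu] * kh[mu]
             - eta[mu][nu] * k2 * trace + eta[mu][nu] * kkh
             for nu in range(m)] for mu in range(m)]


def max_gauge_residual(m, seed=0):
    """Largest |component| of the operator applied to a pure-gauge h."""
    rng = random.Random(seed)
    k = [rng.uniform(-1, 1) for _ in range(m)]
    xi = [rng.uniform(-1, 1) for _ in range(m)]
    # Pure gauge configuration: delta h_{mu nu} = k_mu xi_nu + k_nu xi_mu.
    dh = [[k[mu] * xi[nu] + k[nu] * xi[mu] for nu in range(m)] for mu in range(m)]
    out = linearized_operator(dh, k)
    return max(abs(x) for row in out for x in row)


if __name__ == "__main__":
    for m in (3, 4, 5, 11):
        print(m, max_gauge_residual(m))
```

The residual is at the level of floating-point rounding for any $k_\mu$ and $\xi_\mu$, in any dimension, which is the momentum-space statement that the operator annihilates pure-gauge configurations.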

So let us temporarily forget the geometrical origin of the gauge invariant linear equation (5.7.9) and investigate its properties.

5.7.1 Gauge Fixing

According to a general strategy, in presence of a local symmetry, such as (5.7.5), we look for gauge fixing conditions to be imposed on the field, in order to break such an invariance and eliminate the unphysical degrees of freedom. The most commonly used gauge-fixing for the gravitational field is the Hilbert-de Donder7 gauge that we also use in the second volume, while discussing gravitational waves. It is defined as follows. We first introduce the following useful combination of the hμν tensor and its trace

$$\gamma_{\mu\nu}(x) = h_{\mu\nu} - \frac{1}{2}\,\eta_{\mu\nu}\,h_{\rho\sigma}(x)\,\eta^{\rho\sigma} \tag{5.7.10}$$

and then we impose the differential constraint:

$$\partial^{\mu}\gamma_{\mu\nu} = 0 \tag{5.7.11}$$

Using this constraint the field equation (5.7.9) simplifies dramatically and reduces to:

$$\Box\,\gamma_{\mu\nu} = \kappa\,T_{\mu\nu} \tag{5.7.12}$$

which will be our starting point in the study of gravitational waves in Chap. 7. The Hilbert-de Donder gauge breaks gauge invariance, yet not completely, because there still exist local transformations that preserve the gauge condition (5.7.11). Indeed consider the gauge transformation of (5.7.11); we have

$$\delta\left(\partial^{\mu}\gamma_{\mu\nu}\right) = \Box\,\xi_\nu \tag{5.7.13}$$
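The reduction of (5.7.9) to (5.7.12), and the result (5.7.13), follow from a few lines of algebra. The following is a sketch of the intermediate steps, using only the definition (5.7.10) rewritten as $h_{\mu\nu} = \gamma_{\mu\nu} + \frac{1}{2}\eta_{\mu\nu}h$, with the shorthand $h \equiv h_{\rho\sigma}\eta^{\rho\sigma}$ and $\partial\cdot\xi \equiv \partial^{\mu}\xi_{\mu}$ (shorthands introduced here for brevity); it holds in any dimension $m$.

```latex
\begin{align*}
\partial^{\rho}h_{\rho\nu}
  &= \partial^{\rho}\gamma_{\rho\nu} + \tfrac{1}{2}\,\partial_\nu h ,
  \qquad
  \partial^{\rho}\partial^{\sigma}h_{\rho\sigma}
   = \partial^{\rho}\partial^{\sigma}\gamma_{\rho\sigma} + \tfrac{1}{2}\,\Box h ,\\[4pt]
\text{l.h.s. of (5.7.9)}
  &= \Box\gamma_{\mu\nu}
     - \partial_\mu\partial^{\rho}\gamma_{\rho\nu}
     - \partial_\nu\partial^{\rho}\gamma_{\rho\mu}
     + \eta_{\mu\nu}\,\partial^{\rho}\partial^{\sigma}\gamma_{\rho\sigma}
  \;\xrightarrow{\;\partial^{\mu}\gamma_{\mu\nu}=0\;}\; \Box\gamma_{\mu\nu} ,\\[4pt]
\delta\gamma_{\mu\nu}
  &= \partial_\mu\xi_\nu + \partial_\nu\xi_\mu - \eta_{\mu\nu}\,\partial\cdot\xi
  \;\Longrightarrow\;
  \delta\big(\partial^{\mu}\gamma_{\mu\nu}\big)
   = \Box\xi_\nu + \partial_\nu\,\partial\cdot\xi - \partial_\nu\,\partial\cdot\xi
   = \Box\xi_\nu .
\end{align*}
```

In the second line the terms $\partial_\mu\partial_\nu h$ cancel among themselves, and the terms proportional to $\eta_{\mu\nu}\Box h$ combine with $\Box h_{\mu\nu}$ into $\Box\gamma_{\mu\nu}$.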

Any quadruplet of functions $\xi^\mu(x)$ which satisfy the following differential equation:

$$\Box\,\xi_\nu = 0 \tag{5.7.14}$$

corresponds to residual gauge transformations that can be utilized to further reduce the number of degrees of freedom of $h_{\mu\nu}$. We have to dispose of all the spurious degrees of freedom in a systematic way in order to single out the true physical ones carried by the gravitational field.

7 Théophile Ernest de Donder (1872-1957) was a Belgian mathematician and physicist. His most famous work, dating from 1923, concerns a correlation between the Newtonian concept of chemical affinity and the Gibbsian concept of free energy. Prof. de Donder was among the very first scientists who studied Einstein's General Relativity and was one of its supporters from its very beginning. He was a close personal friend of Einstein. His main field of activity was the thermodynamics of irreversible processes, and his work can be considered the early basis of the full-fledged development of this subject performed by Ilya Prigogine, the famous Russian-born Belgian physical chemist who received the 1977 Nobel Prize.

The simplest way to do so is to introduce light-cone coordinates and consider the propagation equation of the metric fluctuation in a specific, arbitrarily chosen direction. Null coordinates in an m-dimensional Minkowski space are defined as follows:

$$x^{-} = u = x^{0} - x^{1}\,; \qquad x^{i}_{\perp} = \left(x^{2},\dots,x^{m-1}\right) \tag{5.7.15}$$

$$x^{+} = v = x^{0} + x^{1} \tag{5.7.16}$$

where we have conventionally selected the first as the axis along which the gravitational quantum propagates. Correspondingly the Minkowskian metric takes the form:

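$$ds^{2} = 2\,dx^{+}\otimes dx^{-} - dx^{i}_{\perp}\otimes dx^{i}_{\perp} = \eta_{\mu\nu}\,dx^{\mu}\otimes dx^{\nu} \tag{5.7.17}$$

where:

$$\eta_{\mu\nu} = \begin{pmatrix}
0 & 1 & 0 & 0 & \cdots & 0\\
1 & 0 & 0 & 0 & \cdots & 0\\
0 & 0 & -1 & 0 & \cdots & 0\\
0 & 0 & 0 & -1 & \cdots & 0\\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & 0 & \cdots & 0 & -1
\end{pmatrix} \tag{5.7.18}$$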

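In the vacuum, namely in a space-time region where the stress-energy tensor essentially vanishes, the metric perturbation $\gamma_{\mu\nu}$, subject to the de Donder gauge condition, obeys the d'Alembert free propagation equation:

$$\Box\,\gamma_{\mu\nu} = 0 \tag{5.7.19}$$

Introducing the notation:

$$\partial_{+} = \frac{\partial}{\partial v}\,; \qquad \partial_{-} = \frac{\partial}{\partial u}$$
$$\partial^{+} = \eta^{+-}\,\frac{\partial}{\partial x^{-}} = \partial_{-}\,; \qquad \partial^{-} = \eta^{-+}\,\frac{\partial}{\partial x^{+}} = \partial_{+} \tag{5.7.20}$$
$$\partial^{i}_{\perp} = \eta^{ij}\,\frac{\partial}{\partial x^{j}} = -\partial_{i}$$

the d'Alembertian operator in equation (5.7.19) takes the form:

$$\Box = \partial_{+}\partial_{-} + \partial^{i}_{\perp}\partial^{\perp}_{i} \tag{5.7.21}$$

Plane waves are the simplest solutions of (5.7.19) and correspond to perturbations propagating at the speed of light in the conventionally chosen direction (the first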

Fig. 5.4 A graphical representation of a transverse wave. The amplitude oscillates in the transverse direction to the propagation direction

axis). We have incoming and outgoing waves: in both cases the perturbation $\gamma_{\mu\nu}$ depends only on one of the light-cone coordinates $u$, $v$: outgoing waves depend only on $u$, while incoming waves depend only on $v$. The discussion is entirely similar in both cases, mutatis mutandis. Let us choose outgoing waves and set:

$$h_{\mu\nu}(x) = h_{\mu\nu}(u) \tag{5.7.22}$$

Such an ansatz automatically satisfies (5.7.19). We have to consider the implementation of the Hilbert-de Donder gauge: $\partial^{\mu}\gamma_{\mu\nu} = 0$. Since both $h_{\mu\nu}$ and $\gamma_{\mu\nu}$ depend only on the variable $u$ we have:

γ+ν = const = 0 (5.7.23)

The last of the above equations is fixed by our physically chosen boundary conditions. At infinity, namely at very remote future times and in very distant space locations where the wave has not yet arrived, the metric is just Minkowski. Hence there is no constant part of $\gamma_{\mu\nu}$. An m-tuplet of functions which satisfies the harmonic condition (5.7.14), and correspondingly preserves the Hilbert-de Donder gauge, is given by arbitrary functions $\xi^{\mu}(u)$ of the light-cone coordinate $u$. Let us see which further degrees of freedom of the tensor $\gamma_{\mu\nu}$ can be removed by exploiting such transformations. It suffices to consider the explicit transformation of $\gamma$ component-wise. We find:

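$$\begin{aligned}
\gamma^{\perp}_{ij} &\rightarrow \gamma^{\perp}_{ij} - \eta^{\perp}_{ij}\,\partial^{+}\xi_{+}\,; & \gamma_{++} &\rightarrow \gamma_{++}\\
\gamma_{-j} &\rightarrow \gamma_{-j} + \partial_{-}\xi_{j}\,; & \gamma_{+-} &\rightarrow \gamma_{+-}\\
\gamma_{--} &\rightarrow \gamma_{--} + 2\,\partial_{-}\xi_{-}\,; & \gamma_{+i} &\rightarrow \gamma_{+i}
\end{aligned} \tag{5.7.24}$$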

It is evident that by means of these transformations we can set $\gamma_{-\mu} = 0$. Indeed $\xi_i$ suffices to remove $\gamma_{-i}$, the derivative of $\xi_{-}$ suffices to remove $\gamma_{--}$, and $\gamma_{-+} = \gamma_{+-}$ was already zero on the basis of the previous argument. Furthermore the derivative of $\xi_{+}$ suffices to remove the trace of the transverse tensor $\gamma^{\perp}_{ij}$. Hence in every space-time dimension $m$ the physical degrees of freedom of a quantum of the metric field, hereafter named the graviton, are those of a traceless transverse tensor (see Fig. 5.4):

$$\gamma^{phys}_{\mu\nu} = \begin{pmatrix}
0 & 0 & 0 & \cdots & \cdots & \cdots & 0\\
0 & 0 & 0 & \cdots & \cdots & \cdots & 0\\
0 & 0 & \gamma^{\perp}_{33} & \gamma^{\perp}_{34} & \cdots & \gamma^{\perp}_{3,m-1} & \gamma^{\perp}_{3,m}\\
0 & 0 & \gamma^{\perp}_{43} & \gamma^{\perp}_{44} & \cdots & \gamma^{\perp}_{4,m-1} & \gamma^{\perp}_{4,m}\\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \gamma^{\perp}_{m3} & \gamma^{\perp}_{m4} & \cdots & \gamma^{\perp}_{m,m-1} & -\sum_{i=3}^{m-1}\gamma^{\perp}_{ii}
\end{pmatrix} \tag{5.7.25}$$

The number of these degrees of freedom is easily calculated; it is:

$$\#\,\text{d.o.f.} = \frac{m(m-3)}{2} \tag{5.7.26}$$

In $D = 4$ space-time dimensions the degrees of freedom are 2 and the corresponding gauge-fixed form of the $\gamma_{\mu\nu}$ tensor is the following one:

$$\gamma^{phys}_{\mu\nu}(u) = \begin{pmatrix}
0 & 0 & 0 & 0\\
0 & 0 & 0 & 0\\
0 & 0 & a(u) & b(u)\\
0 & 0 & b(u) & -a(u)
\end{pmatrix} \tag{5.7.27}$$

where the two functions $a(u)$ and $b(u)$ are arbitrary. We shall use this gauge-fixed form of the metric perturbation while studying the emission and propagation of gravitational waves (see Chap. 7 of this volume).
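The count (5.7.26) can be reproduced by elementary bookkeeping. The sketch below is an illustration with hypothetical function names: it counts the independent entries of a symmetric $(m-2)\times(m-2)$ transverse block, subtracts one for the trace condition, and compares with $\frac{1}{2}m(m-3)$ for the dimensions $D = 3,\dots,11$.

```python
def symmetric_components(n: int) -> int:
    """Independent entries of a symmetric n x n matrix."""
    return n * (n + 1) // 2


def graviton_dof(m: int) -> int:
    """Traceless symmetric tensor in the (m - 2) transverse directions."""
    transverse = m - 2
    if transverse < 1:
        return 0
    return symmetric_components(transverse) - 1  # remove the trace


if __name__ == "__main__":
    for m in range(11, 2, -1):
        # Closed formula quoted in the text.
        assert graviton_dof(m) == m * (m - 3) // 2
        print(f"D = {m:2d}: {graviton_dof(m)} degrees of freedom")
```

For $m = 4$ the transverse block is $2\times 2$, with three symmetric entries minus one trace, leaving the two functions $a(u)$ and $b(u)$.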

5.7.2 The Spin of the Graviton

Hence gravitational waves, namely the small deformations of the Minkowski metric, propagate in Minkowski space-time at the speed of light, since they obey the d'Alembert equation. Moreover they are transverse waves, since what varies in time along the line of flight is not the amplitude of the metric coefficient in that direction, but rather all the metric coefficients in the space orthogonal to the line. From the mathematical point of view, the right way to approach this issue is through the little group of invariance of the momentum vector. A plane-wave gravitational disturbance, like an electromagnetic one of the same type, is an object carrying momentum, and this momentum is a light-like vector:

$$p^{\mu}_{(wave)} = \Big(1,\,-1,\,\underbrace{0,\,0,\,\cdots,\,0}_{m-2}\Big) \tag{5.7.28}$$

The little group of invariance of any element $v\in D$ belonging to the carrier vector space of a linear representation of some Lie group $G$ is that subgroup $G(v)\subset G$ which leaves the vector $v$ invariant:

Table 5.1 Degrees of freedom of the graviton in diverse dimensions

D     Little group    # of d.o.f.
11    SO(9)           44
10    SO(8)           35
9     SO(7)           27
8     SO(6)           20
7     SO(5)           14
6     SO(4)           9
5     SO(3)           5
4     SO(2)           2
3     none            0

∀g ∈ G(v) : g · v = v (5.7.29)

Typically the carrier space $D$ is organized in orbits, under the action of $G$, whose little groups are different non-isomorphic subgroups of $G$. In the case of the Lorentz group $\mathrm{SO}(1,m-1)$ and its fundamental m-dimensional representation, the relevant orbits are three, corresponding to the time-like, space-like and null-like vectors. The corresponding little groups are $G_{time} = \mathrm{SO}(m-1)$, $G_{space} = \mathrm{SO}(1,m-2)$ and $G_{null} = \mathrm{SO}(m-2)$ (strictly speaking, the latter is the rotation part of the full null little group, the Euclidean group $\mathrm{ISO}(m-2)$). Therefore the degrees of freedom of the graviton correspond to the following irreducible representation of the little group $\mathrm{SO}(m-2)$:

$$\text{graviton} \equiv \widehat{\square\!\square} \quad\text{(the traceless symmetric two-index tensor)} \tag{5.7.30}$$

where the hat denotes the traceless nature of the symmetric tensor. The number of independent components of such an irreducible representation is the number $\frac{m(m-3)}{2}$ quoted above. Given its relevance in modern multi-dimensional gravity theories, and specifically in supergravities that are the low energy limits of superstring models, in Table 5.1 we list the number of graviton degrees of freedom for all space-time dimensions between $D = 11$ and $D = 3$. In $D = 3$ the graviton disappears. There is no room for its propagation since there is no residual transverse group. Gravity in three space-time dimensions allows only for the Newtonian potential and there are no local propagating degrees of freedom. In $D = 4$ the transverse group is $\mathrm{SO}(2)$, whose unitary irreducible representations are all two-dimensional, as we explained in Sect. 1.6.1. For this reason the graviton has the same number of degrees of freedom as the photon or a Yang-Mills gauge boson, namely 2. Yet, although all two-dimensional, the irreducible representations of $\mathrm{SO}(2)$ are not all equivalent. According to the discussion of Sect. 1.6.1 they are characterized by a half-integer number $s$ that we name the spin of the corresponding particle. In the case of the graviton, just as for the photon, $s$ is an integer, but it is $s = 2$ rather than $s = 1$. Physically this means that we have two states, one of helicity $s = 2$ and one of helicity $s = -2$. The two states are pictorially described in Fig. 5.5.
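The statement $s = 2$ can be illustrated on the gauge-fixed $D = 4$ form of the perturbation, whose transverse block is $\begin{pmatrix} a & b\\ b & -a\end{pmatrix}$. The sketch below is an illustration under the assumption that a rotation $R(\theta)$ in the transverse plane acts on this block as $\gamma \to R\,\gamma\,R^{T}$; the function name is hypothetical. It shows that the pair $(a, b)$ rotates by the angle $2\theta$, twice the angle of the spatial rotation.

```python
import math


def rotate_polarization(a: float, b: float, theta: float):
    """Rotate the transverse block [[a, b], [b, -a]] by angle theta.

    Returns the new pair (a', b') read off from R gamma R^T.
    """
    c, s = math.cos(theta), math.sin(theta)
    rot = [[c, -s], [s, c]]
    gamma = [[a, b], [b, -a]]
    # First R * gamma, then (R * gamma) * R^T.
    r_gamma = [[sum(rot[i][k] * gamma[k][j] for k in range(2)) for j in range(2)]
               for i in range(2)]
    rotated = [[sum(r_gamma[i][k] * rot[j][k] for k in range(2)) for j in range(2)]
               for i in range(2)]
    return rotated[0][0], rotated[0][1]


if __name__ == "__main__":
    a, b, theta = 0.3, -0.7, 0.4
    a_new, b_new = rotate_polarization(a, b, theta)
    # The pair (a, b) transforms as a planar vector rotated by 2 * theta.
    assert math.isclose(a_new, a * math.cos(2 * theta) - b * math.sin(2 * theta))
    assert math.isclose(b_new, a * math.sin(2 * theta) + b * math.cos(2 * theta))
    # A rotation by pi already brings the tensor back to itself.
    print(rotate_polarization(a, b, math.pi))
```

Equivalently, the complex combinations $a \pm i b$ pick up the phases $e^{\pm 2 i\theta}$, which correspond to the two helicity states $\pm 2$; a vector polarization would instead need a full $2\pi$ rotation to return to itself.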

Fig. 5.5 A graphical representation of the two helicity states of the graviton. Given the momentum vector $p^\mu$ that corresponds to propagation of the wave in a given direction, the transverse oscillations can describe a clockwise or anticlockwise circulation around the flight direction axis. These two possibilities correspond to the two available states of the graviton quantum particle

5.8 The Bottom-Up Approach, or Gravity à la Feynman

There is no doubt that Richard Feynman (see Fig. 5.6) has been the most influential, creative and innovative among all the theoreticians living in the third quarter of the XX century. His very original ideas about perturbation theory and his graphical representation of quantum relativistic physical processes not only led to the solution of Quantum Electrodynamics, but also provided a new paradigmatic way of conceiving Quantum Field Theory, which has become part of the forma mentis of all modern physicists. Similarly, his general conception of the path integral provided a new general framework to conceive all quantum theories. Feynman also had a special talent for teaching and his lecture courses at Caltech have become legendary. He used the most unusual analogies and stimulated student thought with unexpected arguments that led to deeper thinking and understanding. In the mid sixties he gave a course on General Relativity and, in order to introduce the Einstein equations of gravity, he resorted to a hypothetical Venusian civilization. The choice of Venus was not random. It was motivated by its terrible atmosphere. Venus is very similar to the planet Earth as far as its size and its distance from the Sun are concerned, yet the average surface temperature is around 450 degrees centigrade and the pressure experienced by someone standing on its soil is about 92 atmospheres. Responsible for such an oven-like climate is the carbon dioxide (CO2), which traps almost all of the outgoing radiation: an extreme version of the greenhouse effect. With such a temperature, all ancient oceans of Venus evaporated and water molecules, after flying to the high atmosphere, dissociated, losing their hydrogen atoms, which flew away into interplanetary space. Furthermore the intensive volcanic activity, ubiquitous on the surface of the planet, fills its diabolic atmosphere with dense clouds of sulfuric acid, which prevent Venusians from seeing the starry night and penetrating the depths of interplanetary and intergalactic space with their naked eye or optical instruments. Under such circumstances a hypothetical Venusian civilization would quite unlikely have produced a Ptolemy, a Copernicus and a Newton, and their scientists might not have discovered the Universal Law of gravitational attraction as an explanation of the observed motion of planets, which they did not observe at all. Yet,

as Feynman advocated, intelligent beings, although by a different route, must have reached the same conclusions about the fundamental laws of nature.

Venusians were very good particle theorists and besides atomic physics they already mastered classical and quantum field theory and knew that all interactions are mediated by the exchange of quanta of an appropriate field. The history of their science did not include a Newton, but the Venusian analogues of Dirac, Pauli and Emmy Noether had lived and produced their results. Moreover every Venusian child was able to drop things and, while sitting on his chair, experience their falling down.

Fig. 5.6 Richard Phillips Feynman (1918-1988) was born in Queens, New York, to a Jewish family originating from Russia and Poland. Just like Albert Einstein, he was a late talker, namely he started uttering words only after his third year of life. Yet, by the age of fifteen he had already learned, by self-teaching, integral and differential calculus. His high school education was in New York and his first University education took place at the Massachusetts Institute of Technology. Then he was accepted by Princeton, where he obtained his Ph.D. degree in 1942, in times of war. His advisor was John Archibald Wheeler and his thesis contained the seeds of what was later to become the Feynman path integral approach to quantum mechanics and quantum field theory. Feynman was among the youngest physicists who took part in the Manhattan project, namely in the famous secret development of the atomic bomb, which happened in the secluded Laboratories of Los Alamos, New Mexico. In the early post-war years he conducted fundamental research on Quantum Electrodynamics that led him to share with Julian Schwinger and Sin-Itiro Tomonaga the 1965 Nobel Prize for the construction of that theory. From the early sixties up to his death in 1988, Feynman held the Richard Chace Tolman professorship in theoretical physics at the California Institute of Technology. Besides Quantum Electrodynamics and the path integral, Feynman gave very important contributions to the theory of superfluidity and to Quantum Chromodynamics, where he introduced the so-called parton model. His personality was very open and brilliant. He had a special talent for teaching, where he resorted to unusual imaginative metaphors both to seduce and to penetrate the minds of his students. He very much liked joking and was typically very gentle, informal and unpretentious with young people who sincerely wanted to discuss physics with him, while he could be at times very harsh and even rude with senior fellow scientists when they showed signs of excessive self-esteem

Hence Venusian scientists knew that there existed a mysterious force of attraction, which they named gravitation. Yet what sort of field was responsible for such an interaction? The reader should be aware that the Venusian scientists had a perfect knowledge of special relativity and of the equivalence of mass and energy, $E = mc^2$, which they experienced daily with their thermonuclear reactors and particle accelerators. In order to investigate the mysterious gravitational phenomenon, Venusians had constructed sophisticated analogues of the Cavendish experiment and they knew that masses attract each other with a force that decreases with the inverse square of the distance. So, sitting at their working desks, the Venusian physicists could summarize the knowledge they had accumulated on gravitation.

(a) Gravitation is a long range interaction, since its force decreases with a $1/r^2$ law. Hence the quantum of the mediator field must be a massless particle traveling at the speed of light.
(b) Mass seems to be the conserved charge and the source of such an interaction.

Yet there were two fundamental obstacles to proceeding further, namely:

(c) mass is equivalent to energy because of special relativity, so energy rather than mass should be the true source that couples to the sought-for gravitational field,
(d) as far as the Venusians knew, mass is associated with no generator of any symmetry and it was difficult to imagine how to relate it to Noether's theorem.

The solution to these two problems was the same and came from the following observation. Energy $E$, and not mass $m$, should be the source of gravitation. Yet, differently from mass, energy is not a scalar invariant; rather it is the time component of a four-vector, to be specific the momentum vector $P^\mu$. Hence the charges of gravitation have to be assumed to be the momenta $P^\mu$ and these, contrary to mass, are associated with the generators of a bona fide symmetry of classical and quantum theories.
It is the symmetry of space-time translations in Minkowski space. Here was the clue. The relevant canonical conserved Noether current was the current of four-momentum, which in Venusian physics was known under the name of stress-energy tensor. Given any Poincaré invariant field theory, via Noether's theorem (1.7.3)–(1.7.4) the Venusians could calculate its canonical stress-energy tensor $T^{\lambda}{}_{\nu}$ which, in most cases, turned out to be symmetric after lowering its upper index with the Lorentz invariant metric $\eta$:

$$T_{\nu\mu} = T_{\mu\nu} \equiv \eta_{\mu\lambda}\, T^{\lambda}{}_{\nu}{}^{(\mathrm{can})} \tag{5.8.1}$$

Some Venusian scientists had observed that there exist exceptional systems where the canonical Noetherian stress-energy tensor turns out to be non-symmetric, yet other Venusian scientists had shown that, without spoiling its two fundamental properties of being conserved and associated with the translation generators, the canonical stress-energy tensor could always be improved (essentially by the addition of some surface terms to the Lagrangian) and made symmetric in all cases. So the Venusians concluded that the mediator of gravitational interactions, which couples to the stress-energy tensor, must be a two-index symmetric tensor field and they named it $h_{\mu\nu} = h_{\nu\mu}$. Venusians also knew that, since it couples to a conserved current, in some way or another the gravitational field $h_{\mu\nu}$ should be endowed with some kind of local symmetries, whose number was predetermined by the number of conserved charges, namely by the number of components of the momentum vector $P^\mu$, i.e. four in 4-dimensional Minkowski space. If gravity existed in higher $m$-dimensional Minkowski space-times, of which some foolish Venusian scientists had started to dream, then the number of gauge transformations should be precisely $m$.

Hence, following the Feynman-Venusian approach, let us consider the most general form of a Lorentz invariant linear equation for a symmetric tensor field $h_{\mu\nu}(x)$ that is quadratic in partial derivatives, describes a massless particle and has the stress-energy tensor as a source. Such an equation is necessarily of the following form:

$$\Box h_{\mu\nu} + \alpha\,\partial_\mu\partial_\nu\left(h_{\rho\sigma}\eta^{\rho\sigma}\right) - \beta\left(\partial_\mu\partial^\rho h_{\rho\nu} + \partial_\nu\partial^\rho h_{\rho\mu}\right) + \gamma\,\eta_{\mu\nu}\,\Box\left(h_{\rho\sigma}\eta^{\rho\sigma}\right) + \delta\,\eta_{\mu\nu}\,\partial^\rho\partial^\sigma h_{\rho\sigma} = \kappa\, T_{\mu\nu} \tag{5.8.2}$$

where $\alpha, \beta, \gamma, \delta$ are some numerical coefficients to be determined. In the electromagnetic case the conservation of the electric current and the gauge invariance of the field equation were shown to be just two equivalent statements. This is essentially true also in the spin-two case, yet there are some additional subtleties which are worth mentioning. To this effect let us compare the relations imposed on the coefficients of (5.8.2) by the two conditions:

(a) Conservation of the stress-energy tensor: $\partial^\mu T_{\mu\nu} = 0$.
(b) Invariance of the left-hand side of the equation under the gauge transformations (5.7.5).

This is easily done. Let us begin with condition (a). If we take the $\partial^\mu$ derivative of both sides of (5.8.2), the left-hand side vanishes, consistently with the assumed vanishing of the right-hand side, if:

$$\beta = 1, \qquad \delta = 1, \qquad \alpha + \gamma = 0 \tag{5.8.3}$$

Consider then condition (b), namely let us suppose that the left-hand side of (5.8.2) is invariant under the transformation (5.7.5). This is true if:

$$\beta = 1, \qquad \alpha = 1, \qquad \gamma + \delta = 0 \tag{5.8.4}$$

The two systems (5.8.3) and (5.8.4) admit the common solution:

$$\beta = 1; \qquad \alpha = 1; \qquad \gamma = -1; \qquad \delta = 1 \tag{5.8.5}$$

which exactly corresponds to the linearization of the Einstein equation displayed in (5.7.9). Hence the conservation of the stress-energy tensor and the gauge invariance conditions (5.8.4) do not exactly imply each other, yet this is not a real problem, since we have so far disregarded the existence of a field redefinition which changes the coefficients of (5.8.2) while preserving the stress-energy conservation. Consider the following transformation:

$$h_{\mu\nu} \to \tilde h_{\mu\nu} + u\,\eta_{\mu\nu}\,\tilde h \tag{5.8.6}$$

where by $\tilde h = \eta^{\rho\sigma}\tilde h_{\rho\sigma}$ we have denoted the trace of the new field $\tilde h_{\mu\nu}$ and $u$ is some numerical parameter. Inserting (5.8.6) into (5.8.2) we find that $\tilde h_{\mu\nu}$ satisfies an equation identical in form to the original one but with new coefficients, related to the old ones in the following way:

$$\tilde\beta = \beta, \qquad \tilde\delta = \delta, \qquad \tilde\alpha = \alpha(1 + 4u) - 2u, \qquad \tilde\gamma = \gamma(1 + 4u) + 2u \tag{5.8.7}$$
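The bookkeeping of conditions (5.8.3), (5.8.4) and of the coefficient map (5.8.7) can be checked mechanically. The following is only a minimal sketch: the function names are hypothetical, the map is copied verbatim from (5.8.7), and exact rational arithmetic is used so that the equalities are tested without rounding.

```python
from fractions import Fraction as F

def conserved(alpha, beta, gamma, delta):
    """Condition (5.8.3): the divergence of the left-hand side of (5.8.2) vanishes."""
    return beta == 1 and delta == 1 and alpha + gamma == 0

def gauge_invariant(alpha, beta, gamma, delta):
    """Condition (5.8.4): invariance under the gauge transformation (5.7.5)."""
    return beta == 1 and alpha == 1 and gamma + delta == 0

def redefine(alpha, beta, gamma, delta, u):
    """Coefficient map (5.8.7) induced by the field redefinition (5.8.6)."""
    return (alpha * (1 + 4 * u) - 2 * u, beta, gamma * (1 + 4 * u) + 2 * u, delta)

# Common solution (5.8.5), in the order (alpha, beta, gamma, delta).
einstein = (F(1), F(1), F(-1), F(1))
assert conserved(*einstein) and gauge_invariant(*einstein)

# Every member of the one-parameter family stays conserved,
# but only u = 0 keeps the undeformed gauge invariance (5.8.4).
for u in (F(0), F(1, 2), F(-1, 3), F(7, 5)):
    member = redefine(*einstein, u)
    assert conserved(*member)
    assert gauge_invariant(*member) == (u == 0)
```

Starting from (5.8.5), the map gives $\tilde\alpha + \tilde\gamma = (\alpha + \gamma)(1 + 4u)$, which is why the conservation condition survives for every value of $u$.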

It is immediately evident from (5.8.7) that if the stress-energy conservation condition (5.8.3) was satisfied by the old coefficients, so it is by the new ones with a tilde. Hence we actually have a one-parameter family of field equations consistent with the conservation of the stress-energy tensor and they are related to each other by the field transformation (5.8.6). In this family, the linearization of the Einstein equation, namely (5.7.9), is gauge-invariant under the transformation (5.7.5), which is the infinitesimal form of diffeomorphisms. The other members of the same family also have a gauge symmetry with the same number of arbitrary functions, the only difference being the following slight modification of the transformation laws:

$$\tilde h_{\mu\nu} \to \partial_\mu\xi_\nu + \partial_\nu\xi_\mu + u\,\eta_{\mu\nu}\,\partial\cdot\xi \tag{5.8.8}$$

where we have denoted $\partial\cdot\xi = \partial^\mu\xi_\mu$. Hence the principle that conservation of the source is equivalent to the existence of a local gauge symmetry for the interaction-messenger field is respected. We just have the freedom of choosing the form of the local gauge transformations displayed in (5.8.8), and the form associated with the linearization of the Einstein equations just corresponds to the simplest choice $u = 0$. Yet we have already seen that the choice $u = \tfrac{1}{2}$ is quite convenient for the maximal simplification of the equation, after gauge-fixing.

So, coming back to Feynman's story, the good Venusian particle physicists arrived at (5.7.9) and could also easily derive the quadratic Lagrangian from which such an equation follows by means of variational calculus. It is as follows:

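$$\mathcal{L}^{(2)}(h) = -\frac{1}{2}\,\partial^\rho h^{\mu\nu}\,\partial_\rho h_{\mu\nu} - \partial_\mu h^{\mu\nu}\,\partial_\nu h_{\rho\sigma}\,\eta^{\rho\sigma} + \partial^\mu h_{\mu\nu}\,\partial_\rho h^{\rho\nu} + \frac{1}{2}\,\partial^\mu h\,\partial_\mu h \tag{5.8.9}$$

where, just as before, $h$ denotes the $\eta$-trace of the $h_{\mu\nu}$ tensor. The Lagrangian (5.8.9) is not only Lorentz but also Poincaré invariant and, as such, the Venusians could easily calculate the stress-energy tensor carried by the gravitational field $h_{\mu\nu}$, which is just the Noether current associated with translational symmetries. They arrived at the following result:

$$T^{\lambda}{}_{\mu}(h) = -\,\partial^\lambda h^{\rho\sigma}\,\partial_\mu h_{\rho\sigma} - \partial^\sigma h\,\partial_\mu h^{\lambda}{}_{\sigma} - \partial_\rho h^{\rho\lambda}\,\partial_\mu h + 2\,\partial_\rho h^{\rho\sigma}\,\partial_\mu h^{\lambda}{}_{\sigma} + \partial^\lambda h\,\partial_\mu h - \delta^{\lambda}_{\mu}\,\mathcal{L}^{(2)}(h) \tag{5.8.10}$$

where $\mathcal{L}^{(2)}(h)$ is the quadratic Lagrangian (5.8.9).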

The reader may note that this canonical Noether tensor, after lowering the upper index with the flat metric, is not automatically symmetric. Namely it falls among those exceptions alluded to above. Also on Earth the fact that the Noether conserved current associated with translations is not automatically symmetric in its two indices is a topic that has generated a large literature over the years [1]. Various proposals were put forward on how to symmetrize the canonical stress-energy tensor in such a way that it is still conserved (divergenceless) and still defines the correct charges, namely the momentum vector $P^\mu$. The most popular and most frequently adopted of these symmetrization procedures is due to Belinfante and Rosenfeld and dates back to the years 1939–1940. We will not dwell on this issue. In this context what is of interest to us and to the Venusians is the following point. We have a symmetrized energy-momentum tensor:

$$T^{(\mathrm{grav})}_{\mu\nu}(h) = \operatorname{symm}\left[\eta_{\mu\lambda}\,T^{\lambda}{}_{\nu}(h)\right] \tag{5.8.11}$$

which means that the gravitational field $h_{\mu\nu}$ also carries energy and momentum. Why should this energy and momentum not be on the same footing as the energy and momentum carried by the other matter fields? Hence why should we not modify the gravity field equation as follows:

$$\Box h_{\mu\nu} + \partial_\mu\partial_\nu\left(h_{\rho\sigma}\eta^{\rho\sigma}\right) - \partial_\mu\partial^\rho h_{\rho\nu} - \partial_\nu\partial^\rho h_{\rho\mu} - \eta_{\mu\nu}\,\Box\left(h_{\rho\sigma}\eta^{\rho\sigma}\right) + \eta_{\mu\nu}\,\partial^\rho\partial^\sigma h_{\rho\sigma} = \kappa\,T_{\mu\nu} + \kappa\,T^{(\mathrm{grav})}_{\mu\nu}(h) \tag{5.8.12}$$

Obviously we should, and the good Venusians did it. The news is that the gravitational equation is no longer a linear one! We can bring the term $\kappa\,T^{(\mathrm{grav})}_{\mu\nu}(h)$ to the left-hand side of the equation and consider the resulting non-linear differential expression as the truly correct form of the propagation equation for the $h_{\mu\nu}$ field. With some effort we can also reconstruct the Lagrangian from which such a non-linear equation (including quadratic terms) derives. It will contain both quadratic and cubic terms in $h_{\mu\nu}$ and have the following structure:

$$\mathcal{L}(h) = \mathcal{L}^{(2)}(h) + \kappa\,\Delta\mathcal{L}^{(3)}(h) \tag{5.8.13}$$

From $\mathcal{L}(h)$, via Noether's theorem, we can calculate once again the canonical stress-energy tensor, which now will involve both quadratic and cubic terms. If we repeat the procedure and add such a correction to the gravitational equation, we reconstruct a new renormalized Lagrangian that now includes up to quartic terms in the field, namely:

$$\mathcal{L}(h) = \mathcal{L}^{(2)}(h) + \kappa\,\Delta\mathcal{L}^{(3)}(h) + \kappa^2\,\Delta\mathcal{L}^{(4)}(h) \tag{5.8.14}$$

It is clear that, with increasing and rapidly divergent algebraic effort, the above procedure can be repeated infinitely many times, generating an $h_{\mu\nu}$ Lagrangian that is developed in power series of the coupling parameter $\kappa$ and which contains an unlimited number of interaction vertices with $n$ legs:

$$\mathcal{L}^{\mathrm{tot}}(h) = \mathcal{L}^{(2)}(h) + \sum_{n=3}^{\infty} \kappa^{\,n-2}\,\Delta\mathcal{L}^{(n)}(h) \tag{5.8.15}$$

Do we have any idea of what the sum of such an infinite series might look like? Of course we do. Introducing the tensor

$$g_{\mu\nu} \equiv \eta_{\mu\nu} + h_{\mu\nu} \tag{5.8.16}$$

and its inverse $g^{\mu\nu}$, the Lagrangian $\mathcal{L}^{\mathrm{tot}}(h)$ turns out to be nothing else but the geometrical Einstein-Palatini Lagrangian:

$$\mathcal{L}^{\mathrm{tot}}(h) = R[g]\,\sqrt{-\det g} \tag{5.8.17}$$

The lesson taught by Feynman with his Venusian story is that, without knowing anything about differential geometry, curvature of space-time and tensor calculus, the correct gravitational action could nonetheless be discovered starting from the crucial observation that the gravitational field is what couples to the current of four-momentum $P^\mu$, namely the stress-energy tensor. This is not so surprising if we reflect that the momentum $P^\mu$ is the generator of translations: if we demand that translations be promoted to a local symmetry, then we are just requiring that the theory we want to construct should be diffeomorphism invariant or, as Einstein formulated it, generally covariant. In simple words, what is a diffeomorphism if not a local translation?

5.9 Retrieving the Schwarzschild Metric from Einstein Equations

Having discussed the Einstein equations in all their aspects, it is now the proper moment to prove that the Schwarzschild metric we used in Chap. 4 to retrieve all Newtonian Physics plus corrections is indeed an exact solution of the Einstein equations. As we are going to show, the Schwarzschild metric is just a vacuum solution, namely it solves the Einstein equations with vanishing stress-energy tensor, being Ricci-flat. In the sequel, we rediscuss the Schwarzschild solution at two levels. In Chap. 6 of this volume, considering the equations of stellar equilibrium, we join the Schwarzschild metric describing the region of space-time external to a spherical star with the interior solution of the Einstein equations driven by the stress-energy tensor of the fluid composing the star. In Chap. 2 of Volume 2 we study Schwarzschild space-time and its analytic extension beyond the horizon as the first spherically symmetric example of a black-hole. Finally in Chap. 3 of Volume 2 the Schwarzschild solution will be incorporated as a particular case of the general Kerr-Newman solution, which corresponds to the most general stationary black-hole.

The Schwarzschild metric belongs to the following class of spherically symmetric, static metrics, whose coefficients can depend only on the radial coordinate $r$:

$$ds^2 = -\exp\left[2a(r)\right]dt^2 + \exp\left[2b(r)\right]dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) \tag{5.9.1}$$

We can easily recast such a metric in the vielbein formalism by writing:

$$E^0 = dt\,\exp\left[a(r)\right]; \qquad E^1 = dr\,\exp\left[b(r)\right]; \qquad E^2 = r\,d\theta; \qquad E^3 = r\,\sin\theta\,d\phi \tag{5.9.2}$$

Calculating the exterior differential of the vielbein (5.9.2) we obtain:

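$$\begin{aligned}
dE^0 &= \dot a\, e^{-b}\, E^1 \wedge E^0 \\
dE^1 &= 0 \\
dE^2 &= \frac{e^{-b}}{r}\, E^1 \wedge E^2 \\
dE^3 &= \frac{e^{-b}}{r}\, E^1 \wedge E^3 + \frac{\cos\theta}{\sin\theta}\,\frac{1}{r}\, E^2 \wedge E^3
\end{aligned} \tag{5.9.3}$$

On the other hand the vanishing torsion equation takes the following explicit form:

$$\begin{aligned}
dE^0 - \omega^{01} \wedge E^1 - \omega^{02} \wedge E^2 - \omega^{03} \wedge E^3 &= 0 \\
dE^1 - \omega^{01} \wedge E^0 - \omega^{12} \wedge E^2 - \omega^{13} \wedge E^3 &= 0 \\
dE^2 - \omega^{02} \wedge E^0 + \omega^{12} \wedge E^1 - \omega^{23} \wedge E^3 &= 0 \\
dE^3 - \omega^{03} \wedge E^0 + \omega^{13} \wedge E^1 + \omega^{23} \wedge E^2 &= 0
\end{aligned} \tag{5.9.4}$$

so that, combining (5.9.4) with (5.9.3), we determine the following unique solution for the Levi-Civita spin connection $\omega^{ab}$:

$$\begin{aligned}
\omega^{01} &= -\dot a\, e^{-b}\, E^0; & \omega^{02} &= \omega^{03} = 0 \\
\omega^{12} &= \frac{e^{-b}}{r}\, E^2; & \omega^{13} &= \frac{e^{-b}}{r}\, E^3 \\
\omega^{23} &= \frac{\cos\theta}{\sin\theta}\,\frac{1}{r}\, E^3
\end{aligned} \tag{5.9.5}$$

where the dot denotes the derivative with respect to the radial coordinate $r$. Inserting this result in the definition of the curvature two-form we obtain:

$$R^{01} = \left(\ddot a - \dot a\,\dot b + \dot a^2\right)\exp[-2b]\; E^0 \wedge E^1 \tag{5.9.6}$$

$$R^{02} = \frac{\dot a}{r}\,\exp[-2b]\; E^0 \wedge E^2 \tag{5.9.7}$$

$$R^{03} = \frac{\dot a}{r}\,\exp[-2b]\; E^0 \wedge E^3 \tag{5.9.8}$$

$$R^{12} = -\frac{\dot b}{r}\,\exp[-2b]\; E^1 \wedge E^2 \tag{5.9.9}$$

$$R^{13} = -\frac{\dot b}{r}\,\exp[-2b]\; E^1 \wedge E^3 \tag{5.9.10}$$

$$R^{23} = -\frac{1 - \exp[-2b]}{r^2}\; E^2 \wedge E^3 \tag{5.9.11}$$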

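From this result we easily read off the components of the Riemann tensor and we can calculate the Einstein tensor, which has the following form:

$$G_{00} = \frac{1}{r^2} - e^{-2b}\left(\frac{1}{r^2} - 2\,\frac{\dot b}{r}\right) \tag{5.9.12}$$

$$G_{11} = -\frac{1}{r^2} + e^{-2b}\left(\frac{1}{r^2} + 2\,\frac{\dot a}{r}\right) \tag{5.9.13}$$

$$G_{22} = G_{33} = e^{-2b}\left(\frac{\dot a}{r} - \frac{\dot b}{r} + \ddot a + (\dot a)^2 - \dot a\,\dot b\right) \tag{5.9.14}$$

$$G_{ab} = 0 \quad \text{otherwise} \tag{5.9.15}$$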

In the vacuum we have to set all the components of $G_{ab}$ to zero. Summing the first two equations we obtain:

$$0 = \dot a + \dot b \tag{5.9.16}$$

Hence the sum of the two functions $a(r)$ and $b(r)$ is a constant. Yet at infinity, namely when $r \to \infty$, the considered metric should approach the Minkowski metric, namely both $a(r)$ and $b(r)$ should tend to zero. This means that the integration constant is zero and we have

$$a(r) = -b(r) \tag{5.9.17}$$
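For completeness, the sum used above can be written out explicitly. This is only a worked intermediate step, based on the component expressions (5.9.12) and (5.9.13): the terms in $1/r^2$ cancel pairwise and only the derivative terms survive.

```latex
\begin{aligned}
G_{00} + G_{11}
&= \left[\frac{1}{r^2} - e^{-2b}\left(\frac{1}{r^2} - \frac{2\dot b}{r}\right)\right]
 + \left[-\frac{1}{r^2} + e^{-2b}\left(\frac{1}{r^2} + \frac{2\dot a}{r}\right)\right] \\
&= \frac{2\,e^{-2b}}{r}\,\left(\dot a + \dot b\right)
\end{aligned}
```

Since $e^{-2b}/r$ does not vanish at finite $r$, setting $G_{00} = G_{11} = 0$ forces $\dot a + \dot b = 0$.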

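Replacing (5.9.17) into the vanishing condition for $G_{22}$ as given in (5.9.14) we get:

$$0 = -\ddot b + 2\,\dot b^2 - 2\,\frac{\dot b}{r} = \frac{e^{2b}}{2r}\,\frac{d^2}{dr^2}\Big\{ r\left[\exp\left(-2b(r)\right) - 1\right]\Big\} \tag{5.9.18}$$

The last equation is immediately integrated: $r\left[\exp(-2b) - 1\right]$ must be a linear function of $r$, and the condition $b \to 0$ at infinity removes the term proportional to $r$, yielding:

$$\exp\left[2b(r)\right] = \left(1 - 2\,\frac{m}{r}\right)^{-1} \tag{5.9.19}$$

$$\exp\left[2a(r)\right] = 1 - 2\,\frac{m}{r} \tag{5.9.20}$$

where $m$ is an integration constant. This solution, uniquely fixed by the boundary conditions at infinity, is just the Schwarzschild metric.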
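The vacuum property of the solution can also be checked numerically. The sketch below is an illustration only: the function names are hypothetical, and the derivatives of $a(r)$ are obtained analytically from $\exp[2a] = 1 - 2m/r$. It evaluates the non-vanishing components (5.9.12)–(5.9.14) of the Einstein tensor on the profile (5.9.19)–(5.9.20).

```python
import math

def schwarzschild_profile(r, m):
    """Return a, b and the r-derivatives entering (5.9.12)-(5.9.14)."""
    f = 1.0 - 2.0 * m / r                  # f = exp(2a) = exp(-2b)
    fp = 2.0 * m / r**2                    # df/dr
    fpp = -4.0 * m / r**3                  # d^2 f / dr^2
    a = 0.5 * math.log(f)
    a1 = 0.5 * fp / f                      # da/dr
    a2 = 0.5 * (fpp / f - (fp / f) ** 2)   # d^2 a / dr^2
    return a, -a, a1, -a1, a2              # b = -a, as in (5.9.17)

def einstein_components(r, m):
    """Evaluate G00, G11 and G22 = G33 from (5.9.12)-(5.9.14)."""
    a, b, a1, b1, a2 = schwarzschild_profile(r, m)
    e = math.exp(-2.0 * b)
    g00 = 1.0 / r**2 - e * (1.0 / r**2 - 2.0 * b1 / r)
    g11 = -1.0 / r**2 + e * (1.0 / r**2 + 2.0 * a1 / r)
    g22 = e * (a1 / r - b1 / r + a2 + a1**2 - a1 * b1)
    return g00, g11, g22

# Outside the horizon (r > 2m) all components vanish up to rounding errors.
for radius in (3.0, 10.0, 250.0):
    assert all(abs(g) < 1e-12 for g in einstein_components(radius, 1.0))
```

The check is restricted to $r > 2m$, where the logarithm defining $a(r)$ is real in these coordinates.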

References

1. Lord, E.A.: A theorem on stress-energy tensors. J. Math. Phys. 17, 37 (1976)
2. Einstein, A.: Die Feldgleichungen der Gravitation. In: Sitzungsberichte der Preussischen Akademie der Wissenschaften zu Berlin, pp. 844–847 (1915)
3. Einstein, A.: Die Grundlage der allgemeinen Relativitätstheorie. Ann. Phys. 49 (1916)

Chapter 6 Stellar Equilibrium: Newton's Theory, General Relativity, Quantum Mechanics

E quindi uscimmo a riveder le stelle
Dante Alighieri

6.1 Introduction and Historical Outline

Einstein used to say that the left hand side of his field equations is carved into solid marble, while the right hand side is merely scribbled on perishable wood. By this he meant to emphasize the profound difference between the symmetric tensor $G_{\mu\nu}$, which is constructed out of geometrical quantities derived from first principles, and the stress-energy tensor $T_{\mu\nu}$, which encodes the matter content of space-time and looks like a black box hiding our ignorance about the intimate structure of the latter. Einstein's dream, pursued by the efforts of all his followers in contemporary theoretical research, is that of bringing the right hand side of the field equations to the left, namely of providing a unification of the metric tensor $g_{\mu\nu}$ with all the other fields that describe both Matter and the other interactions of Nature. Unified theories like Supergravity and the microscopic Superstring Theory from which it derives are presently the leading candidates to fulfill such a task.

Notwithstanding the undoubted logical need of unifying gravity with the other interactions and constructing microscopic quantum theories that encode all fields, the major successes of General Relativity and most of its, by now well established, experimental confirmations are based on the marble-wood dualism that bothered its inventor. In this scheme one takes a drastically simplified description of Matter, by considering it to be a perfect fluid that has no viscosity and is completely described in terms of three local fields:

• the energy density scalar field $\varepsilon(x) \equiv c^2\rho(x)$, which expresses the amount of total energy¹ stored in an infinitesimally small tri-volume around the space-time point $x$,
• the pressure scalar field $p(x)$, which gives the pressure of the fluid at the same space-time point $x$,
• the tetra-velocity vector field $U^\mu(x)$, which expresses the four-velocity of an infinitesimally small element of the fluid located at $\mathbf{x} = x^i$ at time $t = x^0$.

¹By total energy we mean all kinds of energy, thermal, chemical, radiative and so on, including also the rest-mass energy.

The general form of the stress-energy tensor for such an ideal fluid will be derived in the next section. Here we just emphasize that the virtue of such a description resides in that it allows one to single out the general implications of gravitational theory, parameterizing the contribution of non-gravitational fields in a way largely independent of their detailed dynamics. Energy density and pressure are captured as the relevant parameters characterizing the internal state of gravitating matter.

As we outlined in previous chapters, General Relativity appeared at the dawn of the 20th century as the result of a hundred-year-long elaboration of geometrical concepts. This happened almost in parallel with spectacular advances in human knowledge about the structure and content of the Physical Universe, which had remained veiled in mystery in all previous millennia. In particular the distribution, actual distance from Earth and physical structure of stars started to be uncovered, and the huge and unsuspected organization of the cosmos in gigantic clusters of stars, named galaxies, and of the latter in clusters of clusters, was disclosed in the first three decades of the last century. The new geometrical theory of gravitation introduced by Einstein played an essential role in framing our understanding of the new astronomical data and shaping a new vision of the world. So did Quantum Mechanics, which was born a few years after General Relativity. It is quite interesting from the point of view of the History of Science to notice the immediate impact of the newly introduced fundamental principles of physics on the understanding of the new astronomical discoveries, which happened almost in real time.
The principal actors in the tale which constitutes the main topic of the present chapter are Lane and Emden, Sir Arthur Eddington, Subrahmanyan Chandrasekhar, Robert Oppenheimer, Wilhelm Baade, Fritz Zwicky and Anton Hewish.

In 1914, the astronomer W.S. Adams discovered that Sirius B, the faint companion of the most luminous star of the night sky, Sirius A, whose mass had been determined by means of Kepler's third law to be about 0.7 solar masses, had a surprisingly high surface temperature and a radius of about 18,800 kilometers, like a big planet. This was the first hint that the main sequence members do not exhaust the star population, the Universe comprising also compact objects like Sirius B, whose properties were intriguing and gave hints about the destiny of the normal luminous stars, whose life, as Mankind was starting to understand, is long but not endless.

Towards the end of the XIX century a hydrodynamical model of normal stars like our Sun had already emerged and the main equilibrium equation had already been written in the following form:

$$\frac{d}{dr}\,p(r) = -G\,\frac{M(r)\,\rho(r)}{r^2} \tag{6.1.1}$$

$$M(r) = \int_0^r d\xi\; 4\pi\,\xi^2\,\rho(\xi) \tag{6.1.2}$$

where $r$ is the radial distance from the center of a spherical bubble of fluid, $p(r)$ is the pressure at that distance and $\rho(r)$ is the mass-density at the same distance. Equation (6.1.1) expresses the balance between the repulsive forces due to pressure, which tend to inflate the bubble, and the gravitational self-attraction of that large mass of fluid, which tends to concentrate it into smaller volumes.

Fig. 6.1 Sir Arthur Eddington (1882–1944) was one of the most prominent astrophysicists of his time. During the solar eclipse of 1919 he was the first to measure the deflection of light rays predicted by General Relativity, and he was responsible for spreading the knowledge of Einstein's theory within the English-speaking scientific community

Before the discovery of nuclear forces and nuclear reactions, it was not clear what energy source powered the internal core of the Sun and of the other main sequence stars, yet as early as 1870 the American physicist Jonathan Homer Lane² was able to turn the integral-differential equations (6.1.1)–(6.1.2) into a manageable second order non-linear differential equation, the Lane-Emden equation [1], by using a class of equations of state (see (6.4.42)) which are still valid models for the mean behavior of stellar matter.

²Jonathan Homer Lane (1819–1880) was an American physicist who worked most of his life at the US Patent Office. His studies on the thermodynamics of the Sun were published in 1870 and were extended by Robert Emden in 1907. The latter (1862–1940) was of Swiss nationality but served as Professor of Physics and Meteorology at the University of Munich in Germany.

With the discovery of 1914 it became clear that there were states of matter, inside compact stars, that were very different from those so far known to Mankind, and soon the discovery of the Quantum World provided the clue to interpret them. In his 1926 book on the Internal Constitution of the Stars, Eddington (see Fig. 6.1) concluded that "Prof. Adams has killed two birds with one stone: he has carried out a new test of Einstein's General Theory of Relativity and he has confirmed our suspicion that matter 2000 times denser than platinum is not only possible, but is actually present in our Universe." White Dwarfs, as stars of the type of Sirius B came to be known, were a puzzle, since no classical mechanism could be imagined that generates the pressure necessary to counterbalance the gravitational self-attraction of such dense objects. The same year 1926 that witnessed these astrophysical considerations also saw the birth of Fermi-Dirac statistics, by means of the separate papers on quantum ideal gases published by Dirac and Fermi [3, 4]. Almost instantaneously, in the very same year 1926, R.H. Fowler advanced the hypothesis that the electron degeneracy pressure could be the explanation for the missing mechanism sustaining white dwarfs against gravitational collapse.

Fig. 6.2 Subrahmanyan Chandrasekhar (1910–1995). Born in Lahore (India), Subrahmanyan was the nephew of a Nobel Laureate and was to receive in his turn the Nobel Prize in 1983, sharing it with William Alfred Fowler, not to be confused with the British physicist Sir Ralph Howard Fowler, of whom Chandrasekhar was a student. R.H. Fowler was the first to imagine that white dwarfs could be sustained against gravitational collapse by the degeneracy pressure of an electron gas. Chandrasekhar developed this idea in 1930 and came to discover the mass limit which goes under his name and which was also the motivation for his Nobel Prize. Subrahmanyan obtained his doctorate from the University of Cambridge, where he came under the influence of Eddington, worked in Copenhagen in Bohr's group and in 1937 was recruited by the University of Chicago, where he served as professor until his death in 1995 at the age of 85. His classical studies on stellar structure were followed by exhaustive investigations of the mathematical theory of black-holes and, in the very last years of his life, of gravitational waves. Of very gentle character and deep culture, Subrahmanyan Chandrasekhar was affectionately named Chandra by all of his students, colleagues and collaborators

It was the Indian-born American physicist Subrahmanyan Chandrasekhar (see Fig. 6.2) who, building on Fowler's idea, made the first accurate models of White Dwarfs in 1930 [9–11]. Working on this problem Chandra made the momentous discovery that white dwarfs have an upper mass limit that can be understood in terms of first principles and estimated in terms of fundamental physical constants.

Chandra's argument could later be extended in a completely analogous way to neutron stars, whose existence was conjectured just a few years later by Baade and Zwicky. They proposed that the formation of such even more compact astronomical objects was the mechanism lying behind supernova explosions. Neutron stars in the form of pulsars were actually discovered in 1967 by Hewish [8], who won the Nobel Prize for that.

Both White Dwarfs and Neutron Stars are a spectacular manifestation of Quantum Physics on large scales. Without quantum mechanics these objects could not be interpreted in any way. Yet classical General Relativity also imposes its own limit on the mass of possible stars. This is what became apparent through the generally covariant updating of the equilibrium equations (6.1.1)–(6.1.2). Such an updating was constructed in 1939 by Robert Oppenheimer (see Fig. 6.3) and George Michael Volkoff in a paper [2] which built on previous results of Tolman [5, 6]. Working with the exact field equations of Einstein's theory, Oppenheimer and his collaborator derived a much more complicated integral-differential relation than (6.3.62). The consequence of such a general relativistic equation is that there exists a lower limit for the ratio between the radius of a star and its Schwarzschild radius. Below such a limit the star cannot exist and necessarily collapses further: to what? To a black-hole is the answer.

Fig. 6.3 Robert Oppenheimer (1904–1967) is the founder of the American School of Theoretical Physics. He is best known as Director of the Manhattan Project that constructed the Atomic Bomb and for his very strong opinions against the development of the cold war with the Soviet Union, which led him to oppose the development of the Hydrogen Bomb in the USA. Discriminated against during the McCarthy period, Oppenheimer was rehabilitated under the Kennedy and Johnson presidencies. Kennedy honored him with the Enrico Fermi Award. Oppenheimer's notable achievements in physics include the Born-Oppenheimer approximation, work on electron-positron theory, quantum tunneling and in particular the first shaping of the theory of gravitational collapse into black-holes

So the new principles of both General Relativity and Quantum Mechanics drastically changed the picture of stellar equilibrium which had been developed within Newton's theory. This came just at the time when new astronomical objects were discovered that could not have been interpreted without this new theoretical understanding.

In the course of the XXth century a completely new picture of the Universe emerged, which is evolutionary, contrary to the static ones cherished by earlier thinkers and philosophers. Stars evolve and end their life in very unusual states of matter like those of white dwarfs, neutron stars or black-holes. The Universe itself evolves, as we plan to discuss in later chapters. The present one focuses on stellar equilibrium and carefully considers the interplay of Newton's theory, General Relativity and Quantum Mechanics.
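The Newtonian equilibrium equations (6.1.1)–(6.1.2) quoted in this outline are easy to integrate numerically. The following minimal sketch assumes a uniform density profile purely as a test case (it is not a realistic stellar model), uses SI units, and a hypothetical helper named `central_pressure`. For uniform density the result can be compared with the closed form $p_c = \tfrac{2}{3}\pi G \rho^2 R^2$, which follows from (6.1.1) with $M(r) = \tfrac{4}{3}\pi\rho r^3$ and $p(R) = 0$.

```python
import math

G = 6.674e-11  # Newton's constant in SI units (m^3 kg^-1 s^-2)

def central_pressure(rho, radius, steps=10000):
    """Integrate (6.1.1)-(6.1.2) with the midpoint rule, assuming p(radius) = 0.

    rho is a function r -> mass density. Returns (central pressure, total mass).
    """
    dr = radius / steps
    mass = 0.0   # M(r) accumulated up to the inner edge of the current shell
    drop = 0.0   # pressure drop integrated from the center to the surface
    for k in range(steps):
        r_mid = (k + 0.5) * dr
        shell = 4.0 * math.pi * r_mid**2 * rho(r_mid) * dr   # dM from (6.1.2)
        m_mid = mass + 0.5 * shell                           # M at the shell midpoint
        drop += G * m_mid * rho(r_mid) / r_mid**2 * dr       # -dp from (6.1.1)
        mass += shell
    return drop, mass

# Uniform-density test case (assumed values, chosen only for illustration).
rho0, R = 1.0e3, 7.0e6
p_c, M = central_pressure(lambda r: rho0, R)
exact_p = 2.0 / 3.0 * math.pi * G * rho0**2 * R**2
assert abs(p_c - exact_p) / exact_p < 1e-5
```

Any other density profile can be passed as `rho`; an equation of state, which is what the Lane-Emden treatment adds, is deliberately left out of this sketch.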

6.2 The Stress Energy Tensor of a Perfect Fluid

In agreement with the discussion presented in our historical outline we come to the conclusion that in order to write down and solve the relevant Einstein field equations an essential ingredient is provided by the general form of the stress-energy tensor Tμν for a perfect fluid. This is what we derive in the present section. To this effect we make a step back and we consider the same problem in the context of Special Relativity. As we shall see the final result can be immediately promoted to General Relativity by invoking general covariance. A perfect fluid can be idealized as a system of N →∞identical, non-interacting point particles of mass m. At time t the nth particle is characterized by its energy and tri-momentum: 0 = ; i = i = P(n)(t) E(n) P(n) mcγ (v(n))v(n) (i 1, 2, 3) (6.2.1) and also by its position x(n)(t).In(6.2.1) the symbol γ denotes the Lorentz factor: 1 γ(v)≡ (6.2.2) 2 1 − v c2 The stress-energy tensor of the nth particle is given by:3 μ0 = μ T(n) P(n)(t) (6.2.3) i 1 dx T μi = P μ (t) (n) (n) (n) c dt By summing on all the particles contained in the system we obtain its stress-energy tensor: μ0 = μ (3) − T (x) P(n)(t)δ x xn(t) n i (6.2.4) 1 dx T μi = P μ (t)δ(3) x − x (t) (n) (n) n c dt n

3Let us recall that the various components of the stress-energy tensor have the following physical meaning: T 00 is the energy density T k0 is the flux of energy in the kth direction T 0i is the density of ith component of tri-momentum T ki is the flux of ith momentum component in the direction k 6.2 The Stress Energy Tensor of a Perfect Fluid 243

Next, using the relation $P^\mu = \frac{E}{c}\frac{dx^\mu}{dt}$, we can rewrite (6.2.4) as follows:

$$T^{\mu\nu} = \sum_n \frac{P^\mu_{(n)}P^\nu_{(n)}}{E_{(n)}}\,\delta^{(3)}\!\left(\mathbf{x} - \mathbf{x}_{(n)}(t)\right) \tag{6.2.5}$$

In a reference frame where the perfect fluid is at rest it must be homogeneous and isotropic. This corresponds to imposing the following conditions on the spatial part of the stress-energy tensor:

$$T^{ij} = p\,\delta^{ij} \tag{6.2.6}$$

where $p = p(x)$ is a scalar function. Indeed the only symmetric 2-tensor that is invariant with respect to the SO(3) rotation group is the Kronecker delta. Looking at (6.2.5) we easily calculate the physical dimensions of the stress-energy tensor $T^{ij}$. We have:

$$\begin{aligned}
\left[P^i_{(n)}P^j_{(n)}\right] &= m^2\,\ell^2\,t^{-2}\\
\left[E_{(n)}\right] &= m\,\ell^2\,t^{-2}\\
\left[c^2\right] &= \ell^2\,t^{-2}\\
\left[\delta^{3}\!\left(\mathbf{x} - \mathbf{x}_{(n)}(t)\right)\right] &= \frac{1}{\text{Volume}} = \ell^{-3}
\end{aligned} \tag{6.2.7}$$

so that:

$$\left[T^{ij}\right] = m\,\ell^{-1}\,t^{-2} = \frac{\text{Force}}{\text{Area}} = \text{pressure} \tag{6.2.8}$$

Hence by dimensional analysis we conclude that the scalar function $p(x)$ appearing in (6.2.6) is to be interpreted as the pressure of the fluid in its rest frame. Also, by comparison of (6.2.6) with (6.2.5), we obtain:

$$p = \frac{1}{3}\sum_n \frac{\mathbf{P}^2_{(n)}}{E_{(n)}}\,\delta^{(3)}\!\left(\mathbf{x} - \mathbf{x}_{(n)}(t)\right) \tag{6.2.9}$$

which has the following obvious physical interpretation. Since the total energy of the $n$th particle is given by $E_{(n)} = \sqrt{m^2c^4 + \mathbf{P}^2_{(n)}}$, the pressure is due to the fraction of the total energy not due to rest masses, namely the kinetic energy. In the rest frame there is neither energy nor momentum flux, so that $T^{0i} = T^{i0} = 0$. On the other hand the $T^{00}$ component is given by:

$$T^{00} = \sum_n \frac{\left[P^0_{(n)}(t)\right]^2}{E_{(n)}}\,\delta^{(3)}\!\left(\mathbf{x} - \mathbf{x}_{(n)}(t)\right) = \sum_n E_{(n)}\,\delta^{(3)}\!\left(\mathbf{x} - \mathbf{x}_{(n)}(t)\right) = \varepsilon_{\rm tot}(x) \equiv c^2\rho(x) \tag{6.2.10}$$

Indeed the energy density at the space-time point $x = (x^0, \mathbf{x})$ is the sum of the energies of all the particles that happen to be at $\mathbf{x}$ at time $x^0$. Summarizing, in the rest frame of the fluid, the stress-energy tensor takes the following form:

$$T^{\mu\nu} = \begin{pmatrix} c^2\rho & 0 & 0 & 0\\ 0 & p & 0 & 0\\ 0 & 0 & p & 0\\ 0 & 0 & 0 & p \end{pmatrix} \tag{6.2.11}$$

To go back to a generic frame where the fluid has tri-velocity v it suffices to make a special Lorentz transformation with velocity v, namely:

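$$T^{\mu\nu} = \Lambda^{\mu}{}_{\rho}\,\Lambda^{\nu}{}_{\sigma}\,T^{\rho\sigma} \tag{6.2.12}$$

where $T^{\rho\sigma}$ on the right-hand side is the rest-frame tensor (6.2.11) and:

$$\begin{aligned}
\Lambda^{0}{}_{0} &= \gamma(v)\\
\Lambda^{0}{}_{i} &= \Lambda^{i}{}_{0} = \gamma(v)\,\frac{v^i}{c}\\
\Lambda^{i}{}_{j} &= \delta^{i}_{j} + \left(\gamma(v) - 1\right)\frac{v^i v^j}{v^2}
\end{aligned} \tag{6.2.13}$$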

Explicitly performing the transformation (6.2.12) and recalling that in Special Relativity the tetra-velocity is given by:

$$U^\mu = \left(\gamma(v),\ \gamma(v)\,\frac{v^i}{c}\right) \tag{6.2.14}$$

we conclude that in the new generic frame the stress-energy tensor of a perfect fluid is given by the following simple formula:

$$T^{\mu\nu} = \rho c^2\,U^\mu U^\nu + p\left(U^\mu U^\nu - \eta^{\mu\nu}\right) \tag{6.2.15}$$

where $\eta^{\mu\nu} = \mathrm{diag}(+,-,-,-)$ is the standard Minkowski metric in the mostly minus convention.

Equation (6.2.15) gives the stress-energy tensor of a perfect fluid in Special Relativity. Its analogue in General Relativity is easily obtained: it suffices to replace the flat metric $\eta^{\mu\nu}$ with a generic one $g^{\mu\nu}(x)$, obtaining

$$T^{\mu\nu}(x) = \rho c^2\,U^\mu U^\nu + p\left(U^\mu U^\nu - g^{\mu\nu}\right) \tag{6.2.16}$$

which is the starting point for all discussions of General Relativity in the presence of matter.
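The step from the rest-frame form (6.2.11) to the covariant formula (6.2.15) can be checked numerically. The following is a minimal sketch in plain Python; the function names are illustrative choices and not part of the text, and the test velocity is assumed to be non-zero and subluminal. It builds the boost (6.2.13), applies (6.2.12) to the rest-frame tensor and compares the result with (6.2.15).

```python
import math

# Minkowski metric in the mostly minus convention used in the text.
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]


def lorentz_boost(v, c=1.0):
    """Special Lorentz transformation (6.2.13) for a non-zero tri-velocity v."""
    v2 = sum(vi * vi for vi in v)
    gamma = 1.0 / math.sqrt(1.0 - v2 / c**2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma
    for i in range(3):
        lam[0][i + 1] = lam[i + 1][0] = gamma * v[i] / c
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            lam[i + 1][j + 1] = delta + (gamma - 1.0) * v[i] * v[j] / v2
    return lam


def boosted_rest_frame_tensor(rho, p, v, c=1.0):
    """Apply (6.2.12) to the rest-frame tensor diag(c^2 rho, p, p, p)."""
    lam = lorentz_boost(v, c)
    t_rest = [rho * c**2, p, p, p]  # diagonal entries of (6.2.11)
    return [[sum(lam[m][r] * lam[n][r] * t_rest[r] for r in range(4))
             for n in range(4)] for m in range(4)]


def covariant_tensor(rho, p, v, c=1.0):
    """Evaluate the closed formula (6.2.15) with U^mu taken from (6.2.14)."""
    v2 = sum(vi * vi for vi in v)
    gamma = 1.0 / math.sqrt(1.0 - v2 / c**2)
    u = [gamma] + [gamma * vi / c for vi in v]
    return [[rho * c**2 * u[m] * u[n] + p * (u[m] * u[n] - ETA[m][n])
             for n in range(4)] for m in range(4)]


def max_abs_difference(a, b):
    """Largest entry-wise discrepancy between two 4x4 arrays."""
    return max(abs(a[m][n] - b[m][n]) for m in range(4) for n in range(4))


velocity = (0.3, -0.4, 0.1)  # arbitrary test velocity in units of c
print(max_abs_difference(boosted_rest_frame_tensor(2.0, 0.5, velocity),
                         covariant_tensor(2.0, 0.5, velocity)))
```

The printed discrepancy should be at the level of floating-point rounding, which reflects the fact that the boost maps the rest-frame time direction into $U^\mu$ and leaves $\eta^{\mu\nu}$ invariant.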

6.3 Interior Solutions and the Stellar Equilibrium Equation

Let us now choose a static, spherically symmetric metric. As we argued in previous chapters, its most general form is given by (5.9.1), which we repeat here for convenience:

$$ds^2 = -\exp\left[2a(r)\right]dt^2 + \exp\left[2b(r)\right]dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) \tag{6.3.1}$$

By $a(r)$ and $b(r)$ we have denoted two arbitrary functions of the radial coordinate $r$, while $\theta, \phi$ are the usual angular variables as defined in Fig. 4.3. The geometry described by (6.3.1), in addition to SO(3) rotations, admits time translations as a symmetry:

$$t \to t + \Delta; \qquad \Delta = \text{const} \tag{6.3.2}$$

This is the reason why we call it static. This implies that:

$$\vec{\xi} \equiv \xi^\mu\frac{\partial}{\partial x^\mu} = \frac{\partial}{\partial t} \tag{6.3.3}$$

is a Killing vector field.⁴ Such a static, spherically symmetric geometry is just the appropriate description of both the space-time region surrounding and the space-time region containing a spherically symmetric star that is in a state of equilibrium. A reasonable model of such stars is obtained by regarding the matter of which they are composed as a perfect fluid, whose stress-energy tensor we write as in (6.2.16):

$$T^{\mu\nu}(x) = \rho\,U^\mu U^\nu + p\left(U^\mu U^\nu - g^{\mu\nu}\right) \tag{6.3.4}$$

having chosen natural units ($c = 1$). In order to be compatible with the static isotropic nature of the metric (6.3.1), the fluid must be at rest on the surfaces $t = \text{const}$, namely on the surfaces orthogonal to the world-lines generated by the time-like Killing vector field (6.3.3), as shown in Fig. 6.4. This means that the fluid four-velocity $U^\mu$ must be proportional to the Killing vector field:

$$U^\mu = a\,\xi^\mu; \qquad a = \text{some constant} \tag{6.3.5}$$

Since the four-velocity is normalized to unity we have:

$$g_{\mu\nu}U^\mu U^\nu = 1 \;\Rightarrow\; \left(U^0\right)^2 g_{00} = 1 \;\Rightarrow\; U^0 = \left(g_{00}\right)^{-1/2} \tag{6.3.6}$$

which expresses the unique non-vanishing component of the velocity field $U^\mu$ in terms of the metric itself. If we go over to flat intrinsic components, multiplying by the vielbein $V^a_\mu(x)$, we obtain an even simpler result:

$$U^0 = U_0 = V_0^{\mu}\,U_\mu = 1; \qquad U_i = 0 \tag{6.3.7}$$

Correspondingly the flat intrinsic components $T^{ab} \equiv V^a_\mu V^b_\nu T^{\mu\nu}$ of the stress-energy tensor are simply given by:

$$T_{ab} = \eta_{aa'}\,\eta_{bb'}\,T^{a'b'} = \begin{cases} T_{00} = \rho\\ T_{11} = T_{22} = T_{33} = p\\ T_{ab} = 0 \quad \text{otherwise} \end{cases} \tag{6.3.8}$$

⁴ For the mathematical definition and properties of Killing vector fields we refer the reader to Volume 2, Sect. 5.2.1.

Fig. 6.4 The fluid composing the static star must be at rest on the space-like surfaces $t = \text{const}$. Hence the world-lines of the fluid elements must admit the Killing vector field $\xi \equiv \frac{\partial}{\partial t}$ as tangent vector

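The intrinsic components of the Einstein tensor for the spherically symmetric static metric (6.3.1) have already been calculated in (5.9.15). Introducing the convenient definitions:

$$h(r) = \exp\left[2b(r)\right]; \qquad f(r) = \exp\left[2a(r)\right] \tag{6.3.9}$$

Equation (5.9.15) can be rewritten as follows:

$$\begin{aligned}
G_{00} &= G^{0}{}_{0} = \frac{1}{r^2}\left(1 - \frac{1}{h}\right) + \frac{h'}{r\,h^2}\\
G_{11} &= -G^{1}{}_{1} = -\frac{1}{r^2}\left(1 - \frac{1}{h}\right) + \frac{f'}{r\,h\,f}\\
G_{22} &= G_{33} = -G^{2}{}_{2} = \frac{1}{2}\left(r f h\right)^{-1} f' - \frac{1}{2}\left(r h^2\right)^{-1} h' + \frac{1}{2}\left(f h\right)^{-1/2}\frac{d}{dr}\left[\left(f h\right)^{-1/2} f'\right]\\
G_{ab} &= 0 \quad \text{otherwise}
\end{aligned} \tag{6.3.10}$$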

Combining (6.3.10) with (6.3.8) we see that the Einstein equations reduce, in this case, to a system of three ordinary differential equations for the four functions $h(r)$, $f(r)$, $p(r)$, $\rho(r)$. This system is underdetermined unless we specify the nature of the fluid we are considering by writing an equation of state, namely a relation between the pressure and the energy density:

$$p = F(\rho) \tag{6.3.11}$$

The function $F$ appearing in (6.3.11) encodes all the thermodynamical properties of the fluid. Consider the equation:

$$G_{00}(r) = 8\pi\rho(r) = \frac{1}{r^2}\frac{d}{dr}\left[r\left(1 - \frac{1}{h}\right)\right] \tag{6.3.12}$$

By a straightforward integration we obtain:

$$R\left[1 - h^{-1}(R)\right] = 8\pi\int_0^R dr\,r^2\rho(r) + \text{const} \tag{6.3.13}$$

If we have a spherical distribution of mass-energy the integral:

$$M(R) = 4\pi\int_0^R dr\,r^2\rho(r) = \int_{S_R} d^3x\,\rho(x) \tag{6.3.14}$$

is immediately interpreted as the total mass-energy contained in a sphere of radius $R$. Hence solving (6.3.13) with respect to the function $h(R)$ we obtain:

$$h = \left[1 - 2\,\frac{M(R)}{R} - \frac{k}{R}\right]^{-1} \tag{6.3.15}$$

which still contains the undetermined integration constant $k$ introduced in (6.3.13). The latter is fixed by imposing the boundary condition $h(0) = 1$, which corresponds to the regularity of space-time at the origin and requires $k = 0$. In this way we conclude that:

$$h(R) = \left[1 - 2\,\frac{M(R)}{R}\right]^{-1} \tag{6.3.16}$$

Equation (6.3.16) shows that the radial-radial component of the metric has, in the presence of spherically distributed matter, the same form as in the case of the Schwarzschild vacuum metric. The only difference is that the constant parameter $m$ of the Schwarzschild metric (4.3.1) is replaced by the function $M(R)$ introduced in (6.3.14) and representing the total mass contained in a sphere of radius $R$.

At this point a subtle remark on $M(R)$ is in order. Its explicit form has emerged from the integration of one of the Einstein equations and we interpreted it as the total mass contained in a sphere of radius $R$. To be very precise, such an interpretation is not legitimate, since it uses the integration measure of flat space rather than the integration measure determined by the true space-time metric. Given the mass density $\rho(x)$, the proper mass contained in a sphere of radius $R$ is actually defined by the integral:

$$M_p(R) = \int_0^R 4\pi r^2\rho(r)\,\sqrt{\det g_3(r)}\;dr \tag{6.3.17}$$

where $g_3$ is the metric of a 3-dimensional section of space-time at constant time. Having found the solution (6.3.16) we actually have:

$$M_p(R) = \int_0^R 4\pi r^2\rho(r)\left[1 - 2\,\frac{M(r)}{r}\right]^{-1/2} dr \;>\; M(R) \equiv \int_0^R 4\pi r^2\rho(r)\,dr \tag{6.3.18}$$

In other words the proper mass contained in a sphere of radius $R$ is strictly larger than the effective mass contained in the same sphere and determining the metric through the Einstein equations. This is not a discrepancy; rather it encodes a profound physical fact. Indeed the difference:

$$M_p(R) - M(R)$$

has a clear-cut meaning: it is the gravitational binding energy. Just as the mass of a helium nucleus is smaller than the sum of the masses of two protons and two neutrons, in the same way the mass of a star is smaller than the sum of the masses of all its components.

Having clarified the meaning of $M(R)$ we come back to the Einstein equations following from the explicit form of the Einstein tensor (6.3.10). So far we have solved only one of them, namely (6.3.12), which has determined the form of the radial-radial component of the metric $h(r)$. The next equation to be considered is:

$$G_{11}(r) = 8\pi p(r) \tag{6.3.19}$$

which allows for the determination of the time-time component $f(r) = \exp[2a(r)]$. From (6.3.19) we obtain:

$$-\frac{1}{r^2} + \left(1 - 2\,\frac{M(r)}{r}\right)\left(\frac{1}{r^2} + 2\,\frac{a'}{r}\right) = 8\pi p(r) \tag{6.3.20}$$

$$\Downarrow$$

$$a'(r) = \frac{M(r) + 4\pi p(r)\,r^3}{r\left(r - 2M(r)\right)} \tag{6.3.21}$$

Equation (6.3.21) determines $a(r)$, and hence $f(r)$, by means of a simple integration in $r$:

$$f(r) = \exp\left[2\int_0^r d\ell\;\frac{M(\ell) + 4\pi p(\ell)\,\ell^3}{\ell\left(\ell - 2M(\ell)\right)}\right] \tag{6.3.22}$$
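Returning for a moment to the comparison (6.3.18) between proper mass and effective mass, the inequality can be made concrete in the simplest case. The following is a minimal sketch in Python, assuming a uniform density $\rho(r) = \rho_0$ in natural units ($G = c = 1$) and a simple midpoint rule; the function names and the sample values are illustrative assumptions, not taken from the text.

```python
import math


def effective_mass(rho0, radius):
    """Effective mass M(R) of (6.3.14) for a uniform density rho0."""
    return 4.0 / 3.0 * math.pi * rho0 * radius**3


def proper_mass(rho0, radius, steps=20000):
    """Midpoint-rule estimate of the proper mass (6.3.18) for uniform density.

    Assumes 2 M(r) / r < 1 everywhere inside the star, so that the
    square root in the integrand is real.
    """
    h = radius / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        m_r = 4.0 / 3.0 * math.pi * rho0 * r**3
        total += 4.0 * math.pi * r**2 * rho0 / math.sqrt(1.0 - 2.0 * m_r / r)
    return total * h


rho0, radius = 0.01, 1.0  # hypothetical values in natural units
m_eff = effective_mass(rho0, radius)
m_prop = proper_mass(rho0, radius)
print(m_eff, m_prop, (m_prop - m_eff) / m_eff)
```

The last printed number is the fractional binding energy $(M_p - M)/M$, which is positive and grows as the star becomes more compact.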

It is interesting to appreciate the physical meaning of (6.3.21) by considering its non-relativistic approximation. On one side we recall that:

$$f(r) \equiv g_{00} \simeq 1 + \frac{2}{c^2}\,V(r) \simeq 1 + 2a(r) \tag{6.3.23}$$

where $V(r)$ is the gravitational potential. On the other hand the non-relativistic approximation corresponds to the regime where we have:

$$r^3 p(r) \ll M(r); \qquad M(r) \ll r \tag{6.3.24}$$

These conditions (6.3.24) are easily explained as follows. First of all let us recall that a non-relativistic regime is obtained when:

$$\text{kinetic energy} \ll \text{rest energy} \tag{6.3.25}$$

Secondly, consider that

$$r^3 p(r) \sim \text{pressure} \times \text{Volume} \simeq \text{kinetic energy} \tag{6.3.26}$$

which is easily seen to be true if one recalls the equation of state of ideal gases, $pV = nRT$, and that temperature measures the average kinetic energy per degree of freedom. On the other hand $M(r)c^2$ is the rest energy contained in a sphere of radius $r$. Therefore in natural units where $c = G = 1$ the first of (6.3.24) is just the statement (6.3.25). The second condition (6.3.24) is instead the statement that the Schwarzschild radius of the total mass contained in a sphere of radius $r$ is much smaller than the radius $r$ itself. Indeed such a Schwarzschild radius is $r_S(r) = 2\,\frac{G}{c^2}\,M(r)$ and, in natural units, the second of conditions (6.3.24) states that $r_S(r) \ll r$. Inserting the non-relativistic approximations (6.3.23) and (6.3.24) into (6.3.21) we find:

$$\frac{d}{dr}V(r) \simeq \frac{M(r)}{r^2} \tag{6.3.27}$$

which is the correct differential equation for the Newtonian potential generated by a spherical mass distribution.

Having dealt with the equations associated with the first and second independent components of the Einstein tensor (see (6.3.10)), we have determined the two unknown coefficients of the metric in terms of the radial mass-distribution $\rho(r)$. We still have to determine the radial behavior of the pressure $p(r)$, related to the radial mass-distribution $\rho(r)$ by the equation of state (6.3.11). This information is encoded in the third and fourth Einstein equations:

G22 = G33 = 8πp (6.3.28)

Since the Einstein equations are not independent, being related by the Bianchi identity that implies the covariant conservation of the stress-energy tensor, an alternative and simpler way of determining the radial behavior of the pressure is provided by writing such a conservation law. In flat anholonomic indices we have:

$$\begin{aligned}
&\nabla_c T_{ab}\,\eta^{ca} = 0\\
&\qquad\Downarrow\\
&\partial_c T_{ab}\,\eta^{ca} + \omega_{c|af}\,T_{gb}\,\eta^{ca}\eta^{fg} + \omega_{c|bf}\,T_{ga}\,\eta^{ca}\eta^{fg} = 0
\end{aligned} \tag{6.3.29}$$

Using the particular structure of $T_{ab}$ and choosing for instance $b = 1$ we get:

$$\begin{aligned}
&-\partial_1 T_{11} - \omega_{c|a1}\,T_{11}\,\eta^{ca} + \omega_{c|1f}\,T_{ag}\,\eta^{ca}\eta^{fg} = 0\\
&\qquad\Downarrow\\
&-\partial_1 T_{11} - \left(\omega_{0|01} - \omega_{2|21} - \omega_{3|31}\right)T_{11} = 0
\end{aligned} \tag{6.3.30}$$

Inserting the explicit form (5.9.5) of the spin connection for a spherically symmetric static metric, the non-vanishing components with an index 1 are:

$$\begin{aligned}
\omega_{0|01} &= -a'\,e^{-b}\\
\omega_{2|21} &= -\frac{e^{-b}}{r}\\
\omega_{3|31} &= -\frac{e^{-b}}{r}
\end{aligned} \tag{6.3.31}$$

while the intrinsic derivative $\partial_1$ is defined by

$$\partial_1 \equiv e^{-b}\,\frac{\partial}{\partial r} \tag{6.3.32}$$

Inserting (6.3.32) and (6.3.31) into (6.3.30) we find:

$$-\frac{d}{dr}\,p(r) - a'\left(p + \rho\right) = 0 \tag{6.3.33}$$

which combined with (6.3.21) yields the Tolman-Oppenheimer-Volkoff relativistic equation of stellar equilibrium:

$$\frac{d}{dr}\,p(r) = -\left(p + \rho\right)\frac{M(r) + 4\pi r^3 p(r)}{r\left[r - 2M(r)\right]} \tag{6.3.34}$$

6.3.1 Integration of the Pressure Equation in the Case of Uniform Density

A very simple and idealized model of a star corresponds to choosing a uniform density:

$$\rho(r) = \rho_0 = \text{const} \tag{6.3.35}$$

Fig. 6.5 The total force exerted by pressure on the spherical stratum of matter contained between the spherical surface of radius $r$ and the spherical surface of radius $r + dr$ is given by $[p(r + dr) - p(r)] \times 4\pi r^2$. On the other hand the gravitational force exerted on the same stratum of matter by the matter contained in the sphere of radius $r$ is, by Newton's law, $-\rho_0 \times 4\pi r^2\,dr \times \frac{M(r)}{r^2}$. Hence the equilibrium equation is obtained by balancing these two forces

In this case the function $M(r)$ is immediately determined and we obtain:

$$M(r) = \frac{4}{3}\,\pi\rho_0\,r^3 \tag{6.3.36}$$

6.3.1.1 Solution in the Newtonian Case

If we consider the stellar equilibrium problem in the context of Newtonian Physics, what we have to write is simply the following equation:

$$\frac{dp}{dr} = -\rho_0\,\frac{M(r)}{r^2} = -\frac{4}{3}\,\pi\rho_0^2\,r \tag{6.3.37}$$

which expresses the balancing of the repulsive pressure force with the attractive gravitational force (see Fig. 6.5). Equation (6.3.37) is immediately integrated to give:

$$p(r) = -\frac{2}{3}\,\pi\rho_0^2\,r^2 + \text{const} \tag{6.3.38}$$

The integration constant is fixed by imposing the obvious boundary condition that the pressure should vanish where the star ends, namely $p(R) = 0$ if $R$ is the radius of the spherical star. The solution of the differential equation with this boundary condition becomes:

$$p(r) = \frac{2}{3}\,\pi\rho_0^2\left(R^2 - r^2\right) \tag{6.3.39}$$

If we denote the total mass of the star by

$$M \equiv \frac{4}{3}\,\pi\rho_0\,R^3 \tag{6.3.40}$$

then the Newtonian solution (6.3.39) can be rewritten as follows:

$$p(r) = \frac{3}{8\pi}\,\frac{M^2}{R^4}\left[1 - \left(\frac{r}{R}\right)^2\right] \tag{6.3.41}$$

Equation (6.3.41) is written in natural units $G = c = 1$. It is worth reinstating the physical units and correspondingly the fundamental physical constants. Dimension-wise we have:

$$[G] = \ell^3\,t^{-2}\,m^{-1}; \qquad [c] = \ell\,t^{-1} \tag{6.3.42}$$

In natural units we have:

$$[M(r)] = [\ell]; \qquad \left[r^3 p\right] = [\ell] \tag{6.3.43}$$

so that $[p(r)] = [\ell^{-2}]$. The dimension of the physical pressure is:

$$[P(r)] = \frac{\text{Force}}{\text{Area}} = m\,\ell^{-1}\,t^{-2} \tag{6.3.44}$$

Hence we conclude that:

$$p = \frac{G}{c^4}\,P \tag{6.3.45}$$

while we already know that:

$$M = \frac{\mathcal{M}\,G}{c^2} \tag{6.3.46}$$

where $\mathcal{M}$ denotes the mass of the star in physical units. Hence (6.3.41) translates into:

$$P(r) = \frac{3}{8\pi}\,\frac{G\mathcal{M}^2}{R^4}\left[1 - \left(\frac{r}{R}\right)^2\right] \tag{6.3.47}$$

In particular from (6.3.47) we estimate the central pressure of a star of mass $\mathcal{M}$ and radius $R$:

$$P_c = P(0) = \frac{3}{8\pi}\,\frac{G\mathcal{M}^2}{R^4} \tag{6.3.48}$$

Let us feed into (6.3.48) the relevant parameters for the Sun:

$$\begin{aligned}
R_\odot &= 6.96 \times 10^{10}\ \text{cm}\\
\mathcal{M}_\odot &= 1.99 \times 10^{33}\ \text{g}\\
G &= 6.670 \times 10^{-8}\ \text{dyn} \times \text{cm}^2 \times \text{g}^{-2}
\end{aligned} \tag{6.3.49}$$

We get the following value for the central pressure:

$$P_c^{\rm Sun} = 1.343 \times 10^{15}\ \frac{\text{dyn}}{\text{cm}^2} \tag{6.3.50}$$

Let us compare this pressure with the pressure of a weight positioned on the surface of the Earth. The force experienced by somebody holding a kilogram is $9.8 \times 10^5\ \text{dyn} \simeq 10^6\ \text{dyn}$. Hence the central pressure in a uniform density star with a mass and a size of the order of those of the Sun is of the order of $10^9$ kilograms per square centimeter. It is a very large but perfectly finite pressure. In the next section we will see the qualitative difference provided by the integration of the relativistic pressure equation. In General Relativity the central pressure can become infinite if either the mass is too large or the star radius is too small. In other words General Relativity implies that there are critical densities beyond which gravitational attraction is so strong that it cannot be balanced by pressure. For the Sun the average density is:

$$\rho = \frac{3}{4\pi}\,\frac{1.99}{(6.96)^3}\;10^{33}\,10^{-30}\ \text{g}\,\text{cm}^{-3} = 1.41\ \frac{\text{g}}{\text{cm}^3} \tag{6.3.51}$$

which, as we will see, is much below the critical density.
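The two solar estimates above are simple enough to reproduce directly. The sketch below, in Python with CGS units, uses the parameter values quoted in (6.3.49); the function names are illustrative, and the conversion to kilogram-weight per square centimeter assumes the value $9.8 \times 10^5$ dyn mentioned in the text.

```python
import math

G = 6.670e-8      # dyn cm^2 g^-2, value quoted in (6.3.49)
R_SUN = 6.96e10   # cm
M_SUN = 1.99e33   # g
KG_WEIGHT = 9.8e5 # dyn, weight of one kilogram at the Earth's surface


def newtonian_central_pressure(mass, radius, g=G):
    """Central pressure (6.3.48) of a uniform-density Newtonian star."""
    return 3.0 / (8.0 * math.pi) * g * mass**2 / radius**4


def mean_density(mass, radius):
    """Average density of a sphere, as in (6.3.51)."""
    return 3.0 * mass / (4.0 * math.pi * radius**3)


p_c = newtonian_central_pressure(M_SUN, R_SUN)
print(f"central pressure : {p_c:.3e} dyn/cm^2")
print(f"in kg-weight/cm^2: {p_c / KG_WEIGHT:.2e}")
print(f"mean density     : {mean_density(M_SUN, R_SUN):.2f} g/cm^3")
```

The printed values should land close to those quoted in (6.3.50) and (6.3.51), with small differences in the last digit depending on rounding.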

6.3.1.2 Integration of the Relativistic Pressure Equation

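We consider (6.3.34) and we substitute $M(r) = \frac{4}{3}\pi\rho_0 r^3$. Then, using the definition (6.3.40) of the total mass in natural units, we obtain:

$$\frac{dp}{dr} = -\left(p + \rho_0\right)\frac{M\left(\frac{r}{R}\right)^3 + 4\pi\,\frac{r^3}{R^3}\,R^3 p}{r\left[r - 2M\left(\frac{r}{R}\right)^3\right]} \tag{6.3.52}$$

which we can rewrite as follows:

$$\frac{dp}{dr} = -\left(p + \rho_0\right)\frac{M}{R}\;\frac{\left(\frac{r}{R}\right)^3\left(1 + \frac{3p}{\rho_0}\right)}{\frac{r}{R}\left[r - 2M\left(\frac{r}{R}\right)^3\right]} \tag{6.3.53}$$

Dividing by $\rho_0$ we get:

$$\frac{d}{dr}\left(\frac{p}{\rho_0}\right) = -\left(1 + \frac{p}{\rho_0}\right)\left(1 + \frac{3p}{\rho_0}\right)\frac{M}{R}\;\frac{\left(\frac{r}{R}\right)^3}{\frac{r}{R}\left[R\,\frac{r}{R} - 2M\left(\frac{r}{R}\right)^3\right]} \tag{6.3.54}$$

Introducing the rescaled variables

$$\xi = \frac{r}{R}; \qquad h = \frac{p}{\rho_0} \tag{6.3.55}$$

(where $h$ is not to be confused with the metric function $h(r)$ of (6.3.9)), (6.3.54) is rewritten as follows:

$$\frac{d}{d\xi}\,h = -(1 + h)(1 + 3h)\,M\,\frac{\xi^3}{\xi\left(R\xi - 2M\xi^3\right)} \tag{6.3.56}$$

which is immediately reduced to quadratures in the form:

$$\frac{dh}{(1 + h)(1 + 3h)} = -M\,\frac{\xi\,d\xi}{R - 2M\xi^2} \tag{6.3.57}$$

yielding:

$$\frac{1 + 3h}{1 + h} = \frac{1}{A}\,\sqrt{R - 2M\xi^2} \tag{6.3.58}$$

where $A$ is the integration constant. Then it suffices to solve (6.3.58) for $h$, obtaining (after absorbing a factor $\sqrt{R}$ into the constant $A$):

$$p = \rho_0\,\frac{A - \left(1 - 2M\,\frac{r^2}{R^3}\right)^{1/2}}{\left(1 - 2M\,\frac{r^2}{R^3}\right)^{1/2} - 3A} \tag{6.3.59}$$

Fixing the appropriate boundary condition:

$$p(R) = 0 \;\Rightarrow\; A = \left(1 - 2\,\frac{M}{R}\right)^{1/2} \tag{6.3.60}$$

we obtain the final solution of the Tolman-Oppenheimer-Volkoff equation in the case of uniform density:

$$p = \rho_0\,\frac{\left(1 - 2\,\frac{M}{R}\right)^{1/2} - \left(1 - 2M\,\frac{r^2}{R^3}\right)^{1/2}}{\left(1 - 2M\,\frac{r^2}{R^3}\right)^{1/2} - 3\left(1 - 2\,\frac{M}{R}\right)^{1/2}} \tag{6.3.61}$$

It is interesting to compare the behavior of the pressure profile in the two cases: the Newtonian solution encoded in (6.3.41) and the exact General Relativistic solution provided by (6.3.61). We do this in Fig. 6.6.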
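As a cross-check of the closed-form solution (6.3.61), one can integrate the Tolman-Oppenheimer-Volkoff equation (6.3.34) numerically for uniform density and compare. The following is a minimal sketch in Python, in natural units, using a classical fourth-order Runge-Kutta scheme integrated inward from the surface where $p(R) = 0$. The integration scheme, the function names and the sample values $M = 1$, $R = 10$ are choices made for this illustration and are not taken from the text.

```python
import math


def tov_rhs(r, p, rho0):
    """Right-hand side of (6.3.34) with M(r) = (4/3) pi rho0 r^3."""
    if r == 0.0:
        return 0.0  # the ratio behaves like r near the centre
    m_r = 4.0 / 3.0 * math.pi * rho0 * r**3
    return -(p + rho0) * (m_r + 4.0 * math.pi * r**3 * p) / (r * (r - 2.0 * m_r))


def central_pressure_numeric(mass, radius, steps=2000):
    """Integrate (6.3.34) from r = R (p = 0) down to r = 0 with RK4."""
    rho0 = 3.0 * mass / (4.0 * math.pi * radius**3)
    p = 0.0
    for i in range(steps):
        r = radius * (steps - i) / steps
        r_next = radius * (steps - i - 1) / steps  # exactly 0 on the last step
        h = r_next - r
        r_mid = 0.5 * (r + r_next)
        k1 = tov_rhs(r, p, rho0)
        k2 = tov_rhs(r_mid, p + 0.5 * h * k1, rho0)
        k3 = tov_rhs(r_mid, p + 0.5 * h * k2, rho0)
        k4 = tov_rhs(r_next, p + h * k3, rho0)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


def pressure_gr(r, mass, radius):
    """Exact relativistic profile (6.3.61)."""
    rho0 = 3.0 * mass / (4.0 * math.pi * radius**3)
    s = math.sqrt(1.0 - 2.0 * mass * r**2 / radius**3)
    a = math.sqrt(1.0 - 2.0 * mass / radius)
    return rho0 * (a - s) / (s - 3.0 * a)


def pressure_newton(r, mass, radius):
    """Newtonian profile (6.3.41)."""
    return 3.0 / (8.0 * math.pi) * mass**2 / radius**4 * (1.0 - (r / radius)**2)


mass, radius = 1.0, 10.0  # hypothetical star with R = 10 M
print(central_pressure_numeric(mass, radius))
print(pressure_gr(0.0, mass, radius))
print(pressure_newton(0.0, mass, radius))
```

The first two numbers should agree to many digits, while the Newtonian central pressure comes out smaller, in line with the comparison shown in Fig. 6.6.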

6.3.2 The Central Pressure of a Relativistic Star

For a star of uniform density the central pressure predicted by Newton's theory is, as we saw, the following (with $\mathcal{M}$ the mass of the star in physical units):

$$P_c^{N} = \frac{3}{8\pi}\,\frac{G\mathcal{M}^2}{R^4} \tag{6.3.62}$$

The value of the same pressure predicted by General Relativity can be immediately obtained from (6.3.61) and is the following one:

$$P_c^{GR} = \frac{3}{4\pi}\,\frac{\mathcal{M}c^2}{R^3}\;\frac{1 - \sqrt{1 - 2\,\frac{G\mathcal{M}}{c^2R}}}{3\sqrt{1 - 2\,\frac{G\mathcal{M}}{c^2R}} - 1} \tag{6.3.63}$$

The non-relativistic limit is obtained when the actual radius $R$ of the star is much bigger than its Schwarzschild radius, namely when:

$$R \gg \frac{G\mathcal{M}}{c^2} \tag{6.3.64}$$

Fig. 6.6 In both pictures the thinner line corresponds to the exact relativistic solution for the pressure behavior in the case of a uniform density spherical star, while the thicker line depicts the Newtonian solution for the same values of the radius $R$ and of the mass $M$. Measuring distances in units of $GM/c^2$ (one half of the Schwarzschild radius), the only relevant parameter is $R$, the mass being unity by definition. In the first picture the radius $R$ is very small and close to the critical value $R_0 = 9/4$. Here we note a big difference between the Newtonian and the relativistic behavior. In the second picture the radius $R$ is not too close to the critical value $R_0$ and the relativistic behavior, at $R = 50$, already almost coincides with the Newtonian one

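In that case we can expand (6.3.63) in series of the small parameter $\frac{G\mathcal{M}}{c^2R}$ and we obtain:

$$P_c^{GR} = \frac{3}{4\pi}\,\frac{\mathcal{M}c^2}{R^3}\left[\frac{-1 + 1 + \frac{G\mathcal{M}}{c^2R}}{2} + O\!\left(\left(\frac{G\mathcal{M}}{c^2R}\right)^2\right)\right] \tag{6.3.65}$$

$$\phantom{P_c^{GR}} = \frac{3}{8\pi}\,\frac{G\mathcal{M}^2}{R^4} + \cdots \tag{6.3.66}$$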

Hence for normal stars, whose radius is much bigger than the Schwarzschild radius, the Newtonian theory is an extremely accurate description of their behavior and relativistic effects are completely negligible. On the other hand, for small stars relativistic effects become very important and there is a qualitatively new feature. Indeed it is evident from (6.3.63) that the central pressure becomes infinite (there is a pole) when the ratio between the star radius $R$ and the length $\frac{G\mathcal{M}}{c^2}$ (one half of its Schwarzschild radius) approaches the following critical value:

$$\frac{R}{G\mathcal{M}/c^2} \to \frac{9}{4} = 2.25 \tag{6.3.67}$$

What does this mean? It means that in order for a star to sustain its own weight when its radius becomes smaller than 2.25 times $G\mathcal{M}/c^2$, namely smaller than 9/8 of its Schwarzschild radius, one needs an infinite pressure, which is impossible. The star necessarily collapses, all matter falling within the surface $r = \frac{2G\mathcal{M}}{c^2}$ that, as we are going to see in later chapters, is an event horizon. The final equilibrium state reached by the star is that of a black hole.
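The approach to the pole can be explored by taking the ratio of (6.3.63) to (6.3.62), which depends only on the dimensionless combination $x = G\mathcal{M}/(c^2R)$. The sketch below is in Python; the function name and the sample values of $x$ are illustrative assumptions. Writing $s = \sqrt{1 - 2x}$ and using the identity $1 - s = 2x/(1 + s)$, the ratio becomes $4/[(1 + s)(3s - 1)]$, a form that avoids numerical cancellation when $x$ is very small.

```python
import math

CRITICAL_RATIO = 9.0 / 4.0  # value of R / (G M / c^2) at the pole, (6.3.67)


def gr_to_newton_central_pressure(x):
    """Ratio P_c^GR / P_c^N from (6.3.63) and (6.3.62), with x = G M / (c^2 R).

    Only defined below the pole, i.e. for 0 < x < 4/9.
    """
    if not 0.0 < x < 1.0 / CRITICAL_RATIO:
        raise ValueError("x must lie strictly between 0 and 4/9")
    s = math.sqrt(1.0 - 2.0 * x)
    return 4.0 / ((1.0 + s) * (3.0 * s - 1.0))


# Illustrative values: a Sun-like star, a compact star, and one near the pole.
for x in (2.0e-6, 0.1, 0.3, 0.44):
    print(f"x = {x:<8g} ratio = {gr_to_newton_central_pressure(x):.6g}")
```

For small $x$ the ratio behaves as $1 + 2x$, so relativistic corrections for a Sun-like star are of the order of a few parts in a million, while the ratio grows without bound as $x \to 4/9$.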

6.4 The Chandrasekhar Mass-Limit

The conclusion of the last section raises an important question: is the collapse into a black hole the necessary end point in the evolution of any star, when all of its energy is exhausted? The answer would be yes if General Relativity had been the only novelty of XXth century Physics. There was, however, another quite relevant novelty: Quantum Mechanics. Together with Quantum Mechanics came also the Pauli Exclusion Principle. Precisely this latter is responsible for another source of pressure which, quite unexpectedly, offers a star the last chance to survive in an equilibrium state when all other hopes are already lost. The same principle and the same mathematical modeling actually describe two quite different equilibrium states that correspond to the end point in the evolution of stars of medium initial size and of much larger size, respectively. The first of these two equilibrium states is that of a White Dwarf, while the second is that of a Neutron Star. The difference is provided by the nature of the fermions that compose the degenerate Fermi gas filling the star: electrons in the case of white dwarfs, neutrons in the second case. The main difference resides in the mass of such fermions, which determines the actual size of the equilibrium radius. Apart from that, the sustaining mechanism is essentially the same in both cases and it can be understood by studying the rather astonishing properties of a completely degenerate gas of an extremely large number of fermionic particles.

6.4.1 The Degenerate Fermi Gas of Very Many Spin One-Half Particles

Let us consider a system composed of a very large number $N$ of free particles of spin $s = \frac{1}{2}$. If this system is deprived of the energy needed to excite them, all particles will fall into the lowest available energy states. Yet, since the Pauli exclusion principle forbids two fermions from occupying the same level, they will pile up at increasing energy levels. Taking into account the degeneracy 2 of each level, due to spin $\frac{1}{2}$, we can write the following equation:

$$\frac{N}{2} = \int d^3n \tag{6.4.1}$$

where $\vec{n}$ is the 3-vector of integer wave-numbers labeling a quantum state in a cubic box of size $L \times L \times L$. The momentum of such a state is:

$$\vec{p} = \frac{2\pi\hbar}{L}\,\vec{n} \tag{6.4.2}$$

Combining (6.4.1) with (6.4.2) we can write:

$$\frac{N}{2} = \frac{V}{(2\pi\hbar)^3}\int_{|\vec{p}| \le p_F} d^3p = \frac{V}{(2\pi\hbar)^3}\,4\pi\int_0^{p_F} p^2\,dp = \frac{V}{(2\pi\hbar)^3}\,\frac{4\pi}{3}\,p_F^3 \tag{6.4.3}$$

where $V = L^3$ denotes the volume of the box surrounding the fermion system. If we introduce the fermion density:

$$\rho_f = \frac{N}{V} \tag{6.4.4}$$

from (6.4.3) we obtain the following expression for the Fermi momentum $p_F$:

$$p_F = \hbar\left(3\pi^2\rho_f\right)^{1/3} \tag{6.4.5}$$

Next we calculate the ground state energy of a system of relativistic spin one-half particles near the Fermi temperature. Naming $m_f$ the mass of these particles, by definition we obtain:

$$\begin{aligned}
E_0 &= 2\sum_{|\vec{p}| \le p_F} c\,\sqrt{\vec{p}^{\,2} + m_f^2c^2}\\
&= \frac{8\pi c\,V}{(2\pi\hbar)^3}\int_0^{p_F} p^2\sqrt{p^2 + m_f^2c^2}\;dp\\
&= \frac{c\,V}{\pi^2\hbar^3}\,m_f c\int_0^{p_F} p^2\sqrt{1 + \left(\frac{p}{m_f c}\right)^2}\;dp
\end{aligned} \tag{6.4.6}$$

Let us now define the new variable $x = \frac{p}{m_f c}$. Using this notation we obtain the following expression for the ground state energy:

$$E_0 = \frac{V}{\pi^2\hbar^3}\,c^5 m_f^4\int_0^{x_F} x^2\sqrt{1 + x^2}\;dx \tag{6.4.7}$$

Hence:

$$\frac{E_0}{N} = \frac{c^5 m_f^4}{\pi^2\hbar^3}\,\frac{1}{\rho}\,f(x_F) \tag{6.4.8}$$

where the dimensionless function $f(x_F)$ is defined by the following integral:

$$f(x_F) \equiv \int_0^{x_F} x^2\sqrt{1 + x^2}\;dx \tag{6.4.9}$$

If the dimensionless Fermi momentum is very large, namely $x_F \gg 1$, the function $f(x_F)$ can be expanded in inverse powers of $x_F$ and we get:

$$f(x_F) = \frac{1}{4}\,x_F^4\left(1 + \frac{1}{x_F^2} + \cdots\right) \tag{6.4.10}$$

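The limit $x_F \gg 1$ is the one relevant to us, since it corresponds to a degenerate Fermi gas of very high density. Indeed we have:

$$x_F = \frac{p_F}{m_f c} = \frac{\hbar}{m_f c}\left(3\pi^2\rho\right)^{1/3} \tag{6.4.11}$$

and when $\rho_f \gg 1$ the same is true of $x_F$. In Thermodynamics pressure is minus the derivative of the internal energy of a gas with respect to the volume it occupies. Hence the pressure of the Fermi gas is:

$$\begin{aligned}
P_0 &= \frac{c^5 m_f^4}{\pi^2\hbar^3}\left[-f(x_F) + \frac{\partial f(x_F)}{\partial x_F}\,\rho\,\frac{\partial x_F}{\partial\rho}\right]\\
&= \frac{c^5 m_f^4}{\pi^2\hbar^3}\left[-f(x_F) + \frac{1}{3}\,x_F^3\sqrt{1 + x_F^2}\right]
\end{aligned} \tag{6.4.12}$$

where we have used the obvious identities:

$$\frac{\partial}{\partial V} = -\frac{1}{V}\,\rho\,\frac{\partial}{\partial\rho}; \qquad \rho\,\frac{\partial x_F}{\partial\rho} = \frac{1}{3}\,x_F \tag{6.4.13}$$

In the high density limit we have already shown that $f(x_F) \sim \frac{1}{4}x_F^4$, so that we finally obtain:

$$P_0 \sim \frac{c^5 m_f^4}{12\pi^2\hbar^3}\left(x_F^4 - x_F^2\right) \tag{6.4.14}$$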
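To see how quickly the high-density expansions (6.4.10) and (6.4.14) become accurate, one can compare them with the exact expressions. The sketch below is in Python and relies on the standard closed form of the integral (6.4.9), $\int_0^{x} t^2\sqrt{1 + t^2}\,dt = \frac{1}{8}\left[x(2x^2 + 1)\sqrt{1 + x^2} - \operatorname{arcsinh} x\right]$, which is a textbook result not quoted in the text. The function names and sample values of $x_F$ are illustrative.

```python
import math


def f_exact(x_f):
    """Closed form of the integral (6.4.9)."""
    root = math.sqrt(1.0 + x_f * x_f)
    return (x_f * (2.0 * x_f * x_f + 1.0) * root - math.asinh(x_f)) / 8.0


def f_asymptotic(x_f):
    """First two terms of the large-x_F expansion (6.4.10)."""
    return 0.25 * x_f**4 * (1.0 + 1.0 / x_f**2)


def pressure_bracket_exact(x_f):
    """Square bracket of (6.4.12): -f + (1/3) x^3 sqrt(1 + x^2)."""
    return -f_exact(x_f) + x_f**3 * math.sqrt(1.0 + x_f * x_f) / 3.0


def pressure_bracket_asymptotic(x_f):
    """High-density form (x^4 - x^2) / 12 appearing in (6.4.14)."""
    return (x_f**4 - x_f**2) / 12.0


for x_f in (2.0, 5.0, 20.0):
    ratio_f = f_asymptotic(x_f) / f_exact(x_f)
    ratio_p = pressure_bracket_asymptotic(x_f) / pressure_bracket_exact(x_f)
    print(f"x_F = {x_f:5.1f}  f ratio = {ratio_f:.5f}  pressure ratio = {ratio_p:.5f}")
```

Both ratios tend to 1 as $x_F$ grows, while for $x_F$ of order one the expansion (6.4.14) is only a rough guide and the exact bracket of (6.4.12) should be used instead.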

6.4.1.1 Idealized Models of White Dwarfs and Neutron Stars

Let us now consider the schematic structure of the two equilibrium states of compact collapsed stars in which the gravitational attraction is balanced by the basic pressure of a high density Fermi gas.

Fig. 6.7 Sirius is a binary star system composed of a normal main sequence star, Sirius A, twenty-five times more luminous than the Sun, with about two solar masses, and a white dwarf companion, Sirius B, of about 0.6 solar masses that has been compressed into a volume similar to the size of the planet Earth. Sirius B exhausted its nuclear fuel, went through the stage of Red Giant and collapsed into its present state of White Dwarf about 120 million years ago. Sirius B will steadily cool, as the remaining heat is radiated into space over a period of more than two billion years

White Dwarfs As their name indicates, white dwarfs are stars of very small magnitude but very high surface temperature, so that their total luminosity is quite faint although the light they emit is extremely white, corresponding to electronic transitions between very high energy levels. For instance, at about 8.6 light years from the Solar System, in the Canis Major Constellation, there is the most luminous of night stars, well known to Mankind from remotest antiquity, namely Sirius. This latter, named Sirius A, is actually a member of a binary system whose other member, Sirius B, is a white dwarf (see Fig. 6.7). With a mass equal to 0.6 solar masses, Sirius B has a radius of a few thousand kilometers, namely it is as big as the Earth. However the surface temperature of Sirius B is much higher than that of the Sun, namely it is 25,200 K, which makes the faint light it emits so white.

What is the simplest theoretical description of such stars as Sirius B? Imagine that the progenitor which later collapsed into a white dwarf was just a cloud of hydrogen. Gravitation compressed such a cloud until its core reached a sufficiently high temperature to initiate the hydrogen fusion cycle. Protons joined pair by pair into deuterium, then collisions of deuterium nuclei generated tritium and eventually all tritium nuclei fused, pair by pair, into helium nuclei, each time liberating two protons that completed the cycle. For billions of years the fusion cycle went on liberating the energy that made the star luminous and provided the pressure necessary to maintain it in equilibrium against gravitational attraction. Although slow, the process is not eternal and eventually it comes to an end when all the hydrogen is fused into helium.

Fig. 6.8 The fusion cycle of hydrogen is the main engine powering middle sized normal stars. When all hydrogen is fused into helium the star starts cooling down and gravitational collapse begins. Considering the minuteness of the electron mass in comparison with that of the baryons, the total weight of a white dwarf is approximately estimated from the number of its electrons. To each of them we are supposed to add two baryons: one proton and one neutron

At this point of its evolution a main sequence star can be idealized as a cloud of helium that starts compressing into smaller and smaller volumes under the effect of gravitational attraction. The contraction continues until all electrons are stripped away from their nuclei and condense into a Fermi gas of extremely high density. Let us evaluate the electron density of such a star. Taking into account the very small weight of electrons compared with that of the proton and of the nuclei, we get that the total mass of the star is approximated by the following formula:

$$M \simeq \left(m_e + 2m_p\right)N \approx 2m_p N \tag{6.4.15}$$

where $m_p$ is the mass of a nucleon (at this level we can consider the mass of a proton and that of a neutron equal), $m_e$ is the mass of the electron and $N$ is the total number of electrons. Indeed in a helium atom there are two nucleons for each electron (see Fig. 6.8). The radius $R$ of the star can be estimated from its volume via the formula:

$$R = \left(\frac{3V}{4\pi}\right)^{1/3} \tag{6.4.16}$$

Hence in terms of the total mass and of the radius of the star, the density of the fermionic gas, which in this case is the electronic density, is given by:

$$\rho_f = \rho_{el} = \frac{3}{8\pi m_p}\,\frac{M}{R^3} \tag{6.4.17}$$

Correspondingly the Fermi momentum is given by:

$$x_F = \frac{\mathfrak{M}_{wd}^{1/3}}{\mathfrak{R}_{wd}} \tag{6.4.18}$$

where we have defined:

$$\mathfrak{M}_{wd} = \frac{9\pi M}{8m_p} \equiv \frac{M}{k_{wd}\,m_p}; \qquad \mathfrak{R}_{wd} = \frac{R}{\hbar/m_e c} \equiv \frac{R}{\lambda_e} \tag{6.4.19}$$

Apart from the numerical coefficient:

$$k_{wd} = 2 \times \frac{4}{9\pi} \tag{6.4.20}$$

we can say that $\mathfrak{M}_{wd}$ is the mass of the white dwarf star measured in units of baryon masses $m_p$. Similarly $\mathfrak{R}_{wd}$ is the radius of the same star measured in units of the Compton wave length:

$$\lambda_e = \frac{\hbar}{m_e c} \tag{6.4.21}$$

of the particles which compose the high density degenerate Fermi gas, in this case the electrons. Inserting such information into the previous formulae for the high density Fermi gas pressure we obtain the following result:

$$P_0 \sim \frac{\hbar c}{12\pi^2} \times \lambda_e^{-4} \times \left[\left(\frac{\mathfrak{M}_{wd}^{1/3}}{\mathfrak{R}_{wd}}\right)^4 - \left(\frac{\mathfrak{M}_{wd}^{1/3}}{\mathfrak{R}_{wd}}\right)^2\right] \tag{6.4.22}$$

It is very interesting to see that a completely analogous formula holds true also for neutron stars.

Neutron Stars The neutron was experimentally revealed in 1932 by Sir James Chadwick, the British physicist who obtained the 1935 Nobel Prize for this discovery. Only one year after this detection, Wilhelm Baade and Fritz Zwicky⁵ proposed the existence of neutron stars. The two California-based physicists were seeking a theoretical explanation for the enormous energy released in supernova explosions. They argued that what powers supernovae is just the release of the gravitational binding energy of a collapsing normal star. When all thermonuclear fuel is exhausted, a sufficiently large star collapses under gravitational self attraction to such a compressed state that all protons, neutrons and electrons are squeezed into a very small volume and are so close to each other that inverse β-decay takes place systematically. By capturing an electron and releasing a neutrino, all protons are converted into neutrons (see Fig. 6.9). In this way we can conceive the existence of very compact stars, purely made of neutrons, where, once again, gravitational attraction is compensated by the pressure of a degenerate Fermi gas of very high density. Such an equilibrium state, denominated Neutron Star, was predicted by Baade and Zwicky as the result of supernova explosions in a paper of 1934 [7].

Approximately thirty years later, in 1965, Antony Hewish and Samuel Okoye were able to detect strange radio signals coming from the center of the Crab Nebula, where in the year 1054 the most luminous supernova explosion ever recorded in history had taken place.⁶ Hewish (see Fig. 6.10) won the 1974 Nobel Prize in Physics for his role in interpreting the pulsating radio-wave signals coming from the Crab Nebula as the emissions of a pulsar, namely a rotating neutron star in which the magnetic moment is not perfectly aligned with the rotation axis. Under these conditions, which are the generic ones for a rapidly rotating neutron star, the latter becomes a radio-antenna and its regular pulse signal allows for its own detection. Several neutron stars, both galactic and extra-galactic, were discovered in the following years. The neutron star closest to Earth, named Calvera, which is at a distance of about 250 light years from us, was discovered in 2007. Let us calculate the Fermi pressure of a neutron star regarding it as a free Fermi gas of neutrons.

⁵ Wilhelm Heinrich Walter Baade (1893–1960) was a German astronomer who emigrated to the USA in 1931 and mostly worked at Mount Wilson Observatory in California. Born in Bulgaria to Swiss parents, Fritz Zwicky (1898–1974) worked most of the time in the USA, where he obtained his Ph.D. from the California Institute of Technology. Later he was appointed Professor by the same Institute, making important contributions to various areas of Astronomy. He worked in association with the Mount Wilson and Palomar Observatories. Zwicky was also a brilliant engineer and he is considered the father of modern jet propulsion engines. Through his first wife Zwicky became a relative of the US President Franklin Delano Roosevelt.

Fig. 6.9 When all protons, neutrons and electrons are squeezed into a sufficiently small volume, all protons are converted into neutrons by capturing an electron and releasing a neutrino

6 The 1054 supernova was observed by Chinese and Arabic astronomers and its sudden appearance was also recorded in the Chronicles of the St. Gallen Monastery in Switzerland. According to these witnesses SN 1054 was so bright as to be seen in daylight for 23 days and was visible in the night sky for 653 days.

Fig. 6.10 Antony Hewish was awarded the 1974 Nobel Prize for his role in the discovery of pulsars. In 1965, together with Samuel Okoye, he observed strange radio emissions coming from the Crab Nebula. It turned out that this was the radio signal of a rotating neutron star located at the very center of the Nebula. That neutron star is the remnant (picture on the right) of the supernova that exploded in 1054

The total mass of the star is:

$$M = N_n m_n \tag{6.4.23}$$

where $N_n$ is the number of neutrons and $m_n \simeq m_p$ is the neutron mass. Calling $R$ the radius of the compact astronomical object, the neutron density is:

$$\rho_n = \frac{3}{4\pi}\,\frac{M}{m_n}\,\frac{1}{R^3} \tag{6.4.24}$$

Hence the rescaled Fermi momentum takes the form:

$$x_F = \frac{\hbar}{m_n c R}\left(\frac{9\pi}{4}\,\frac{M}{m_n}\right)^{1/3} \tag{6.4.25}$$

and the Fermi pressure is:

$$P_0 \sim \frac{\hbar c}{12\pi^2}\times\lambda_n^{-4}\times\left[\left(\frac{\mathcal{M}_{ns}^{1/3}}{\mathcal{R}_{ns}}\right)^4 - \left(\frac{\mathcal{M}_{ns}^{1/3}}{\mathcal{R}_{ns}}\right)^2\right] \tag{6.4.26}$$

having defined:

$$\mathcal{M}_{ns} = \frac{9\pi M}{4 m_n} \equiv \frac{M}{k_{ns}\, m_n}; \qquad \mathcal{R}_{ns} = \frac{R}{\hbar/m_n c} \equiv \frac{R}{\lambda_n} \tag{6.4.27}$$

and:

$$k_{ns} = 1\times\frac{4}{9\pi}; \qquad \lambda_n = \frac{\hbar}{m_n c} \tag{6.4.28}$$

As we see, (6.4.27) is completely identical in form to (6.4.22). Apart from the minor difference of a factor 2 in the coefficient $k_{ns}$, the quantity $\mathcal{M}_{ns}$ is the mass of the neutron star measured in units of baryon masses, just as $\mathcal{M}_{wd}$ was the mass of the white dwarf measured in the same units. The factor 2 difference also has a nice interpretation. We might write:

$$k = N_B \times \frac{4}{9\pi} \tag{6.4.29}$$

where, by definition, $N_B$ is the number of baryons per fermion of the Fermi gas. In the case of white dwarfs the Fermi gas is made of electrons and for each electron there are two baryons in a star originally made of helium. In the neutron star case, the Fermi gas is made of neutrons and each of them counts for one baryon. Similarly $\mathcal{R}_{ns}$, just as $\mathcal{R}_{wd}$, is the radius of the star measured in units of the Compton wavelength of the fermionic particles out of which the relevant Fermi gas is made. For neutron stars we measure the radius in neutron Compton wavelengths, just as for white dwarfs we measure it in Compton wavelengths of the electron. With the above understanding the formula for the Fermi pressure takes a general form valid for both white dwarfs and neutron stars:

$$P_0 \sim K\times\lambda^{-4}\times\left[\left(\frac{\mathcal{M}^{1/3}}{\mathcal{R}}\right)^4 - \left(\frac{\mathcal{M}^{1/3}}{\mathcal{R}}\right)^2\right]; \qquad K \equiv \frac{\hbar c}{12\pi^2}; \quad \mathcal{M} = \frac{M}{k\, m_b}; \quad \mathcal{R} = \frac{R}{\lambda} \tag{6.4.30}$$
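As a purely illustrative check of the general form (6.4.30), the following minimal sketch evaluates the rescaled Fermi momentum $x_F = \mathcal{M}^{1/3}/\mathcal{R}$ and the pressure $P_0$ for a white dwarf. The function name `fermi_pressure`, the sample mass of 1.2 solar masses and the sample radius of $3\times 10^8$ cm are assumptions made here for illustration; the constants are the CGS values of Table 6.1 and the solar mass is the value quoted later in the text. Since (6.4.30) keeps only the two leading powers of $x_F$, the sketch is presumably meaningful only when $x_F$ is larger than 1.

```python
import math

# CGS constants as listed in Table 6.1 of the text.
HBAR = 1.054571628e-27   # Planck constant, g cm^2 s^-1
C = 2.99792458e10        # speed of light, cm s^-1
M_P = 1.67262164e-24     # proton mass, g
M_E = 0.91093897e-27     # electron mass, g
M_SUN = 1.98892e33       # solar mass in g, as quoted in the text


def fermi_pressure(mass, radius, m_fermion, m_baryon, n_b):
    """Hypothetical helper: return (x_F, P_0) following (6.4.29)-(6.4.30).

    mass, radius : stellar mass (g) and radius (cm)
    m_fermion    : mass of the particles forming the Fermi gas (g)
    m_baryon     : baryon mass m_b (g)
    n_b          : number of baryons per fermion, N_B
    """
    lam = HBAR / (m_fermion * C)            # Compton wavelength lambda
    k = n_b * 4.0 / (9.0 * math.pi)         # coefficient k of (6.4.29)
    script_m = mass / (k * m_baryon)        # rescaled mass
    script_r = radius / lam                 # rescaled radius
    x_f = script_m ** (1.0 / 3.0) / script_r
    big_k = HBAR * C / (12.0 * math.pi ** 2)
    p0 = big_k * lam ** -4 * (x_f ** 4 - x_f ** 2)
    return x_f, p0


# Illustrative white dwarf: electrons form the gas, two baryons per electron.
x_f_wd, p0_wd = fermi_pressure(1.2 * M_SUN, 3.0e8, M_E, M_P, n_b=2)
print(f"x_F = {x_f_wd:.3f}, P_0 = {p0_wd:.3e} dyn/cm^2")
```

With these sample inputs the rescaled momentum comes out somewhat above 2, so the quartic term already dominates over the quadratic one.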

6.4.2 The Equilibrium Equation

Let us now recall that in Newton’s theory the central pressure of a star of mass M, radius R and constant density is given by (6.3.48). Multiplying and dividing by the same quantities the central pressure can be rewritten in the form given below:

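$$P_c = \frac{3G}{8\pi}\times\frac{M^2}{R^4} = \widetilde{K}\times\lambda^{-4}\times\frac{\mathcal{M}^2}{\mathcal{R}^4} \tag{6.4.31}$$

where, by comparison with the original definition (6.3.48), the constant $\widetilde{K}$ (written with a tilde here to distinguish it from the constant $K$ of (6.4.30)) turns out to have the following value:

$$\widetilde{K} = \frac{2N_B^2}{27\pi^3}\, G\, m_B^2 \tag{6.4.32}$$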

Table 6.1 Values of the fundamental constants

| Name of constant | Symbol | Value | Units |
|---|---|---|---|
| Newton constant | $G$ | $6.67428 \times 10^{-8}$ | cm³ g⁻¹ s⁻² |
| Speed of light | $c$ | $2.99792458 \times 10^{10}$ | cm s⁻¹ |
| Planck constant | $\hbar$ | $1.054571628 \times 10^{-27}$ | g cm² s⁻¹ |
| Proton mass | $m_p$ | $1.67262164 \times 10^{-24}$ | g |
| Neutron mass | $m_n$ | $1.6749729 \times 10^{-24}$ | g |
| Electron mass | $m_e$ | $0.91093897 \times 10^{-27}$ | g |

If a neutron star or a white dwarf could be conceived as a constant density object, then the equilibrium equation would simply be obtained by equating $P_c$ as given in (6.4.31) with the Fermi pressure $P_0$ as given in (6.4.30). Obviously the constant density approximation is too rough, since we know that density varies with depth already in normal stars and even more in compact ones. Yet let us remark that on dimensional grounds, as long as we are in a classical non-relativistic theory, namely as long as we use neither the speed of light $c$ nor Planck's constant $\hbar$, the only combination of the available parameters $G$, $M$ and $R$ that has the dimension of a pressure is precisely $GM^2/R^4$, as can be verified by looking at Table 6.1. It follows that, solving the pressure differential equation (6.3.37) with any kind of equation of state for the stellar matter:

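$$p = f(\rho) \tag{6.4.33}$$

the final result for the central or average pressure in a spherical star of mass $M$ and radius $R$ must be:

$$P = \frac{3G\alpha}{8\pi}\times\frac{M^2}{R^4} = \widetilde{K}\times\lambda^{-4}\,\frac{\mathcal{M}^2}{\mathcal{R}^4}; \qquad \widetilde{K} = \frac{2N_B^2}{27\pi^3}\, G\,\alpha\, m_B^2 \tag{6.4.34}$$

where $\alpha$ is a dimensionless numerical coefficient and $\widetilde{K}$ denotes the gravitational coefficient, distinct from the Fermi-pressure constant $K$ of (6.4.30). A very reasonable physical assumption, justified by numerical solutions of the pressure equation in the presence of typical equations of state like: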

$$p = k\rho^{\gamma}; \qquad k, \gamma \in \mathbb{R} \tag{6.4.35}$$

is the following: the coefficient $\alpha$ is of order unity. In this case we can draw very important and interesting conclusions. Equating the average gravitational pressure (6.4.34) with the Fermi pressure (6.4.30), we obtain a determination of the star radius in the form:

$$\mathcal{R} = \mathcal{M}^{1/3}\sqrt{1 - \left(\frac{\mathcal{M}}{\mathcal{M}_0}\right)^{2/3}} \tag{6.4.36}$$

where we have defined the ratio of the Fermi-pressure constant $K$ of (6.4.30) to the gravitational coefficient $\widetilde{K} = \frac{2N_B^2}{27\pi^3} G\alpha m_p^2$ of (6.4.34), raised to the power 3/2:

$$\mathcal{M}_0 = \left(\frac{K}{\widetilde{K}}\right)^{3/2} = \left(\frac{9\pi}{8}\right)^{3/2}\left(\frac{\hbar c}{G m_p^2}\right)^{3/2}\left(\frac{1}{N_B^2\alpha}\right)^{3/2} = 1.46386\times 10^{58}\times\left(\frac{1}{N_B^2\alpha}\right)^{3/2} \tag{6.4.37}$$

The pure number $\mathcal{M}_0$ is a mass measured in units of $\frac{4N_B}{9\pi} m_p$. The corresponding value of such a mass, expressed in grams, is:

$$M_0 = \mathcal{M}_0\,\frac{4N_B}{9\pi}\, m_p = 3.46389\times 10^{33}\, N_B\left(\frac{1}{N_B^2\alpha}\right)^{3/2}\ \text{g} = 1.7416\times N_B\times M_\odot\left(\frac{1}{N_B^2\alpha}\right)^{3/2} \tag{6.4.38}$$

where $M_\odot = 1.98892\times 10^{33}$ g is the mass of the sun. Formula (6.4.38) holds true for both white dwarfs and neutron stars, the only difference being that in the first case $N_B = 2$, while in the second we have $N_B = 1$. In both cases, as long as $\alpha$ is of order unity, $M_0$ is also of order unity in terms of solar masses. What is the meaning of $M_0$? This is evident from (6.4.36). For masses $M$ larger than $M_0$ the radius of the star becomes imaginary, namely the star cannot exist. So $M_0$ is the upper limit for the mass of a white dwarf or a neutron star. It is named the Chandrasekhar mass limit after its discoverer. With minor differences due to $\alpha$ and $N_B$, the mass limit for both types of compact stars is approximately the same. Careful considerations on the equation of state of white dwarfs yield for instance:

$$M_0 = 1.4\, M_\odot \tag{6.4.39}$$

which corresponds to a value $\alpha \simeq 0.46$ of the parameter $\alpha$. We will give a derivation of this result in the next section. The actual radii of white dwarfs and of neutron stars are however very different. The number $\mathcal{R}$ turns out to be approximately the same in both cases, but in the first it measures the radius of the star in units of the Compton wavelength of the electron, while in the second it measures it in units of the Compton wavelength of the neutron:

$$\lambda_e = 3.86159\times 10^{-11}\ \text{cm}; \qquad \lambda_n = 2.10019\times 10^{-14}\ \text{cm} \tag{6.4.40}$$

Hence the ratio between the white-dwarf radii and those of neutron stars is typically a factor $10^3$. Since $\mathcal{R} \sim \mathcal{M}^{1/3} \sim 10^{58/3}$, for white dwarfs and for neutron stars we respectively have:

$$R_{wd} \sim 10^8\ \text{cm}; \qquad R_{ns} \sim 10^5\ \text{cm} \tag{6.4.41}$$

White dwarfs are therefore objects with about one solar mass concentrated in the volume of a small planet (about a thousand kilometers in radius), while neutron stars have a mass of the same magnitude squeezed into a sphere a few kilometers wide.
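The numbers of this section can be checked with a short script. The sketch below, offered only as an illustration under stated assumptions, evaluates the pure number of (6.4.37), the mass limit in grams and in solar masses, the value of $\alpha$ that reproduces a limit of 1.4 solar masses for $N_B = 2$, and the radii that follow from (6.4.36). The function names are hypothetical, the proton mass is used as the baryon mass in both cases (relying on $m_n \simeq m_p$), and the choice of one solar mass with $\alpha = 0.46$ for the radius estimates is an arbitrary sample input.

```python
import math

# CGS constants as listed in Table 6.1 of the text.
G = 6.67428e-8           # Newton constant, cm^3 g^-1 s^-2
HBAR = 1.054571628e-27   # Planck constant, g cm^2 s^-1
C = 2.99792458e10        # speed of light, cm s^-1
M_P = 1.67262164e-24     # proton mass, g
M_N = 1.6749729e-24      # neutron mass, g
M_E = 0.91093897e-27     # electron mass, g
M_SUN = 1.98892e33       # solar mass in g, as quoted in the text


def script_m0(n_b, alpha):
    """Pure number of (6.4.37): the mass limit in units of (4 N_B / 9 pi) m_p."""
    return (9.0 * math.pi / 8.0 * HBAR * C / (G * M_P ** 2)
            / (n_b ** 2 * alpha)) ** 1.5


def mass_limit_grams(n_b, alpha):
    """Mass limit converted to grams, following the first equality of (6.4.38)."""
    return script_m0(n_b, alpha) * 4.0 * n_b / (9.0 * math.pi) * M_P


def radius_cm(mass, n_b, alpha, m_fermion):
    """Radius from (6.4.36), converted to cm with the fermion Compton wavelength."""
    lam = HBAR / (m_fermion * C)
    script_m = mass / (4.0 * n_b / (9.0 * math.pi) * M_P)
    ratio = script_m / script_m0(n_b, alpha)
    if ratio >= 1.0:
        raise ValueError("mass above the limit: no equilibrium radius")
    return lam * script_m ** (1.0 / 3.0) * math.sqrt(1.0 - ratio ** (2.0 / 3.0))


# alpha for which the N_B = 2 limit equals 1.4 solar masses (limit ~ alpha^-3/2).
alpha_wd = (mass_limit_grams(2, 1.0) / (1.4 * M_SUN)) ** (2.0 / 3.0)

# Sample radii for one solar mass and alpha = 0.46 (illustrative inputs).
r_wd = radius_cm(M_SUN, 2, 0.46, M_E)   # electrons, two baryons per electron
r_ns = radius_cm(M_SUN, 1, 0.46, M_N)   # neutrons, one baryon per fermion

print(f"script M_0 (N_B=1, alpha=1) = {script_m0(1, 1.0):.5e}")
print(f"M_0 / M_sun (N_B=1, alpha=1) = {mass_limit_grams(1, 1.0) / M_SUN:.4f}")
print(f"alpha for a 1.4 M_sun white-dwarf limit = {alpha_wd:.3f}")
print(f"R_wd = {r_wd:.2e} cm, R_ns = {r_ns:.2e} cm")
```

The radii land in the ranges $10^8$ cm and $10^5$ cm of (6.4.41), and their ratio is set almost entirely by the ratio of the two Compton wavelengths in (6.4.40).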

6.4.3 Polytropes and the Chandrasekhar Mass

Let us now elaborate a little further on the Newtonian equilibrium equation we considered in previous sections. Equation (6.3.37), written for the case of constant density, can be easily generalized to a radially dependent density in the way presented in (6.1.1). As already stressed, this integro-differential equation can be solved if we supply the additional information of an equation of state, relating pressure to energy density. A class of equations of state that generalizes the well known equation of state⁷ of ideal gases $p = k_B \rho$ is the following one:

$$p = \kappa\rho^{\gamma}; \qquad \gamma = 1 + \frac{1}{n} \tag{6.4.42}$$

Stars whose constituent matter follows an equation of state of type (6.4.42) are named polytropes and the number $n$ is named the polytropic index. In this case the equilibrium equation (6.1.1) can be reduced to a second order non-linear differential equation whose solutions can be determined numerically and in some cases also in closed analytic form. Let us change variables by setting:

$$\rho(r) = \rho_0\,\theta(r)^n \tag{6.4.43}$$

where, by definition, $\rho_0$ is the energy density at the center of the star and $\theta(r)$ is a function of the radial coordinate $r$ whose boundary condition is $\theta(0) = 1$. Similarly, in agreement with (6.4.42), let us set:

$$p(r) = P_c\,\theta(r)^{n+1}; \qquad P_c = \kappa\rho_0^{1+\frac{1}{n}} \tag{6.4.44}$$

where $P_c$ is the central pressure of the star. By means of trivial manipulations the equilibrium equation (6.1.1) becomes:

$$r^2\frac{d}{dr}\theta(r) = -\frac{G}{n+1}\,\frac{\rho_c}{P_c}\,M(r) \tag{6.4.45}$$

where $\rho_c \equiv \rho_0$ is the central density.

7 In Classical Thermodynamics one writes the relation $pV = k_B N T$ where $k_B$ is the Boltzmann constant, $N$ the Avogadro number, $T$ the temperature, $p$ the pressure and $V$ the volume. Dividing by $V$, the quantity $\rho = NT/V$ can be interpreted as the internal energy density, since the temperature measures the average kinetic energy per particle and $N/V$ is the density of particles.

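By taking a second derivative, (6.4.45) is reduced to:

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\theta}{dr}\right) = -\frac{1}{\beta^2}\,\theta^n \tag{6.4.46}$$

where we have defined:

$$\beta^2 \equiv \frac{(n+1)P_c}{4\pi G\rho_c^2} \tag{6.4.47}$$

so that $\beta$ has the dimension of a length. Introducing a rescaled variable $z = r/\beta$, (6.4.46) is turned into the following standard form:

$$\frac{1}{z^2}\frac{d}{dz}\left(z^2\frac{d\theta}{dz}\right) + \theta^n = 0 \tag{6.4.48}$$

which goes under the name of Lane-Emden equation. The boundary conditions that correspond to the considered physical situation are easily found:

$$\theta(0) = 1; \qquad \theta'(0) = 0 \tag{6.4.49}$$

The solution to (6.4.48)–(6.4.49) cannot be written in an explicit analytic form for generic values of $n$, admitting only a numerical determination. Yet for a few values of $n$, $\theta(z)$ is an elementary function. In particular, as the reader can check, we have:

$$\begin{array}{c|c|c|c}
n & \theta(z) & z_0 & I_0[n] \\ \hline
0 & 1 - \dfrac{z^2}{6} & \sqrt{6} & \dfrac{(\sqrt{6})^3}{3} \\
1 & \dfrac{\sin(z)}{z} & \pi & \pi \\
5 & \dfrac{1}{\sqrt{1 + \frac{z^2}{3}}} & \infty & \sqrt{3}
\end{array} \tag{6.4.50}$$

where we have named $z_0$ the first zero of the $\theta(z)$ function and $I_0$ the following integral:

$$I_0[n] \equiv \int_0^{z_0} z^2\,\theta(z)^n\,dz = -z_0^2\,\theta'(z_0) \tag{6.4.51}$$

For $n = 5$ the first zero is pushed to infinity, yet the integral remains finite, as follows from the second expression in (6.4.51) evaluated in the limit $z_0 \to \infty$. For other values of $n$ the above data have to be determined numerically through a computer programme. We also remark that the second identity in (6.4.51) follows from the fact that $\theta(z)$ is assumed to satisfy the differential equation (6.4.48): multiplying (6.4.48) by $z^2$ and integrating from $0$ to $z_0$ gives the stated result. What is the relevance of the integral $I_0$? This is easily understood if we calculate the mass $M$ of the star which is supposedly described by the considered equation of state. We immediately find:

$$M \equiv 4\pi\int_0^R r^2\rho(r)\,dr = 4\pi\beta^3\rho_0\int_0^{z_0} z^2\,\theta(z)^n\,dz = 4\pi\left(\frac{(n+1)\kappa}{4\pi G}\right)^{3/2}\rho_0^{\frac{3-n}{2n}}\, I_0[n] \tag{6.4.52}$$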

The above relation follows from the simple consideration that the first zero of $\theta(z)$, namely $z_0$, defines the radius $R \equiv \beta z_0$ of the star, and it shows that something very special happens for the polytropic index $n = 3$. Indeed for such a value the total mass of the star is independent of the central density $\rho_0$. We are therefore led to consider whether $n = 3$ has a special theoretical interpretation. This is indeed the case. For $n = 3$ the equation of state takes the form:

$$p = \kappa\rho^{4/3} \tag{6.4.53}$$

which is precisely the limiting one for an extremely high density degenerate gas of fermions. How can we see it? This is simple. At very high density ($x_F \gg 1$) we can drop the subleading term in (6.4.14), keeping only the fourth order one. Relying on the form (6.4.11) of the Fermi momentum, we conclude that $p \sim \rho_f^{4/3}$, where $\rho_f$ is the density of the fermionic particles composing the gas: electrons in the white dwarf case, neutrons in the neutron star case. If $N_B$ is the number of baryons per Fermi particle, then we have:

$$\rho_f \simeq \frac{\rho}{N_B m_p} \tag{6.4.54}$$

and we conclude that $p \sim \rho^{4/3}$. Hence the equation of state for degenerate compact stars of very high density is indeed of the polytropic type, with polytropic index $n = 3 \Leftrightarrow \gamma = \frac{4}{3}$. We can therefore calculate the mass $M$ from formula (6.4.52) if we evaluate the coefficient $\kappa$ from (6.4.14) and (6.4.11). By direct substitution we find:

$$\kappa = \frac{1}{4}\sqrt[3]{3}\,\hbar c\,\pi^{2/3}\times\left(\frac{1}{m_p N_B}\right)^{4/3} \tag{6.4.55}$$

which inserted in (6.4.52) yields:

$$M_{Ch} = \frac{1}{2}\sqrt{3\pi}\left(\frac{\hbar c}{G}\right)^{3/2}\left(\frac{1}{N_B m_p}\right)^2 I_0[3] \tag{6.4.56}$$

which is the celebrated Chandrasekhar formula for the upper mass limit of white dwarfs and neutron stars. To evaluate it explicitly we just need the value of $I_0[3]$, which can be calculated from the numerical solution of the Lane-Emden equation in the case $n = 3$ (see Fig. 6.11). We find:

$$z_0[3] \simeq 6.8; \qquad I_0[3] \simeq 2.01 \tag{6.4.57}$$

Inserting this value of $I_0[3]$ and those of the fundamental constants in (6.4.56) we find:

$$M_{Ch}^{wd} \simeq 1.42\, M_\odot; \qquad M_{Ch}^{ns} \simeq 5.72\, M_\odot \tag{6.4.58}$$

where we have denoted by $M_{Ch}^{wd}$ the upper mass limit for white dwarfs and by $M_{Ch}^{ns}$ the same for neutron stars.

Fig. 6.11 The solution of the Lane-Emden equation for the polytropic index $n = 3$. The first zero is at $z_0 \simeq 6.8$
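The numerical determination of $z_0$ and $I_0[n]$ mentioned in this section can be sketched in a few lines. The block below is a minimal illustration, not the programme used for the book: it integrates the Lane-Emden equation with a classical fourth-order Runge-Kutta scheme (an integrator chosen here for simplicity), starting slightly away from the singular point $z = 0$ with the series $\theta \simeq 1 - z^2/6 + n z^4/120$, and evaluates $I_0[n]$ as $-z_0^2\,\theta'(z_0)$. The function names, the step size and the cut-off `z_max` are assumptions. The case $n = 1$ serves as a check against the closed form $\sin(z)/z$, and the $n = 3$ result is then inserted into the Chandrasekhar formula (6.4.56) with the constants of Table 6.1.

```python
import math

# CGS constants as listed in Table 6.1 of the text.
G = 6.67428e-8           # Newton constant, cm^3 g^-1 s^-2
HBAR = 1.054571628e-27   # Planck constant, g cm^2 s^-1
C = 2.99792458e10        # speed of light, cm s^-1
M_P = 1.67262164e-24     # proton mass, g
M_SUN = 1.98892e33       # solar mass in g, as quoted in the text


def lane_emden(n, h=1.0e-3, z_max=50.0):
    """Integrate theta'' + (2/z) theta' + theta^n = 0 with RK4.

    Returns (z0, I0): the first zero of theta and -z0^2 * theta'(z0).
    """
    def rhs(z, theta, dtheta):
        # max(theta, 0) guards against tiny negative values near the zero.
        return dtheta, -max(theta, 0.0) ** n - 2.0 * dtheta / z

    # Series start close to the regular singular point z = 0.
    z = h
    theta = 1.0 - z ** 2 / 6.0 + n * z ** 4 / 120.0
    dtheta = -z / 3.0 + n * z ** 3 / 30.0

    while z < z_max:
        k1t, k1d = rhs(z, theta, dtheta)
        k2t, k2d = rhs(z + h / 2, theta + h / 2 * k1t, dtheta + h / 2 * k1d)
        k3t, k3d = rhs(z + h / 2, theta + h / 2 * k2t, dtheta + h / 2 * k2d)
        k4t, k4d = rhs(z + h, theta + h * k3t, dtheta + h * k3d)
        new_theta = theta + h / 6 * (k1t + 2 * k2t + 2 * k3t + k4t)
        new_dtheta = dtheta + h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        if new_theta <= 0.0:
            # Linear interpolation inside the last step locates the zero.
            frac = theta / (theta - new_theta)
            z0 = z + frac * h
            d0 = dtheta + frac * (new_dtheta - dtheta)
            return z0, -z0 ** 2 * d0
        z, theta, dtheta = z + h, new_theta, new_dtheta
    raise RuntimeError("no zero of theta found below z_max")


def chandrasekhar_mass(n_b, i0):
    """Mass limit in grams from (6.4.56) for N_B baryons per fermion."""
    return (0.5 * math.sqrt(3.0 * math.pi) * (HBAR * C / G) ** 1.5
            * i0 / (n_b * M_P) ** 2)


z0_1, i0_1 = lane_emden(1)   # closed form: z0 = pi, I0 = pi
z0_3, i0_3 = lane_emden(3)   # the case relevant for degenerate stars

m_wd = chandrasekhar_mass(2, i0_3) / M_SUN   # white dwarfs, N_B = 2
m_ns = chandrasekhar_mass(1, i0_3) / M_SUN   # neutron stars, N_B = 1

print(f"n=1: z0 = {z0_1:.5f}, I0 = {i0_1:.5f}")
print(f"n=3: z0 = {z0_3:.4f}, I0 = {i0_3:.4f}")
print(f"M_Ch(wd) = {m_wd:.3f} M_sun, M_Ch(ns) = {m_ns:.3f} M_sun")
```

The $n = 3$ values obtained this way are slightly larger than the rounded figures of (6.4.57), which is why the resulting mass limits differ from (6.4.58) in the second decimal place.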

6.5 Conclusive Remarks on Stellar Equilibrium

Unlike Newtonian theory, Einstein's theory foresees a critical lower limit for the ratio between the radius and the mass of a star, encoded in (6.3.67). Below such a limit no pressure can sustain the star in equilibrium and gravitational collapse continues up to the formation of a black hole. Just above that limit there are two equilibrium states for exhausted stars, where gravitational attraction is balanced by a quantum phenomenon, namely the degeneracy pressure of a Fermi gas: the state of White Dwarf and that of Neutron Star. We can therefore conclude that, depending on the initial size of a normal star, there are three possible destinies at the end of its life-cycle. Supermassive stars end their life as black holes, massive stars as Neutron Stars and medium size ones as White Dwarfs.

The existence of an upper mass limit for Neutron Stars and White Dwarfs which is determined purely in terms of fundamental constants of Nature is of the utmost theoretical relevance. For instance, as we shall discuss in later chapters on Cosmology, supernovae of type Ia, which explode when a white-dwarf member of a binary system reaches the Chandrasekhar mass limit by accreting material from the companion star, constitute a unique instance of very luminous and precise standard candles. They played an essential role in measuring cosmological parameters, opening up, at the beginning of the XXI century, entirely new perspectives on our understanding of the physical Universe.

Altogether, Man's understanding of stellar equilibrium, which is so essential to frame our present picture of the world and of its evolution, was drastically changed, and not only marginally corrected, in the first decades of the XX century by the inputs of General Relativity and of Quantum Mechanics. This showed that the typical stellar mass $M_\odot$ is not a randomly chosen number, but is explained by the fundamental Laws of Nature encoded in the values of fundamental constants.

References

1. Lane, J.H.: On the theoretical temperature of the sun under the hypothesis of a gaseous mass maintaining its volume by its internal heat and depending on the laws of gases known to terrestrial experiment. Am. J. Sci. Arts 50, 57–74 (1870)
2. Oppenheimer, J.R., Volkoff, G.M.: On massive neutron cores. Phys. Rev. 55(374), 374–381 (1939)
3. Fermi, E.: Sulla quantizzazione del gas perfetto monoatomico. Rend. Lincei 3, 145–149 (1926). http://arxiv.org/abs/cond-mat/9912229
4. Dirac, P.A.M.: On the theory of quantum mechanics. Proc. R. Soc., Ser. A 112, 661–677 (1926). http://www.jstor.org/stable/94692
5. Tolman, R.C.: Effect of inhomogeneity on cosmological models. Proc. Natl. Acad. Sci. 20(3), 169–176 (1934)
6. Tolman, R.C.: Static solutions of Einstein's field equations for spheres of fluid. Phys. Rev. 55(374), 364–373 (1939)
7. Baade, W., Zwicky, F.: Remarks on super-novae and cosmic rays. Phys. Rev. 46, 76–77 (1934)
8. Hewish, A., Okoye, S.: Evidence of an unusual source of high radio brightness temperature in the Crab Nebula. Nature 207, 59 (1965)
9. Chandrasekhar, S.: The density of white dwarf stars. Philos. Mag. 11, 592 (1931)
10. Chandrasekhar, S.: The maximum mass of ideal white dwarfs. Astrophys. J. 74, 81 (1931)
11. Chandrasekhar, S.: An Introduction to the Study of Stellar Structure. Dover, New York (1958), (1939), ISBN 0-486-60413-6

Chapter 7 Gravitational Waves and the Binary Pulsars

Like as the waves make towards the pebbled shore,
So do our minutes hasten to their end;
Each changing place with that which goes before,
In sequent toil all forwards do contend.
William Shakespeare

7.1 Introduction

The concept of gravitational waves was born in 1918 with a paper published by Einstein under the title Über Gravitationswellen [1]. In that article the effect of gravitational waves was calculated for the first time, and there appeared a formula evaluating the power of a gravitational antenna:

$$\text{Power} \propto \left|\dddot{Q}\right|^2 \tag{7.1.1}$$

According to Einstein, the energy radiated away per unit time is proportional to the squared modulus of the third time-derivative of the quadrupole moment of the emitting source. Just as electromagnetic waves are produced by accelerated charges, in the same way gravitational waves should be produced by accelerated masses or lumps of energy. There is however a crucial difference, due to the different spin of the fundamental field mediating the interaction. Electromagnetism is mediated by a vector field, which has spin s = 1, while gravitational interactions are transmitted by a symmetric tensor, whose spin is s = 2. Consequently electromagnetic radiation can be produced by a variable electric dipole, while in order to emit gravitational radiation one needs at least a variable quadrupole moment. Einstein was forced to write his 1918 paper in order to correct a serious error he had discovered in his 1916 paper [2], where he had developed the linear approximation scheme to solve the field equations of his theory. In that context he had noticed the existence of plane wave solutions similar to the corresponding wave solutions of Maxwell equations, yet he had overlooked the crucial question of which multipoles are the first to contribute: in modern parlance, he had overlooked the issue of spin.

P.G. Frè, Gravity, a Geometrical Course, DOI 10.1007/978-94-007-5361-7_7, © Springer Science+Business Media Dordrecht 2013

It was clear to Einstein that, due to their extreme weakness,¹ of the order of $(v/c)^5$, there was no hope of detecting gravitational waves in Earth-based laboratories; after some years he reconsidered the whole matter, coming to the conclusion that gravitational waves actually do not exist, being simply gauge artifacts. In 1936, together with Nathan Rosen,² Einstein wrote a paper containing such a conclusion and sent it for publication to the Physical Review. The article was rejected. Quite angrily, Einstein withdrew the manuscript and published it in the Journal of the Franklin Institute with a less provocative title [3]. In the following years Einstein reconsidered the matter once again and, together with Infeld and Hoffmann, developed a systematic post-Newtonian expansion of the field equations of General Relativity, showing that wave radiation does not appear up to the $(v/c)^4$ order. Yet at the next order, $(v/c)^5$, waves pop up and follow the quadrupole formula (7.1.1), as demonstrated by Hu in a 1947 paper [4].

7.1.1 The Idea of GW Detectors

The first attempts to construct experimental apparatus able to detect gravitational waves are due to the American physicist Joseph Weber, the founder of laser and maser physics.³ In the years 1955–1956, Weber worked at the Institute for Advanced Study in Princeton with John Archibald Wheeler⁴ and developed the project of a

1 As we are going to show in the present chapter, the 1918 Einstein formula for the emission power can be retrieved from first principles (see (7.3.94)) and precisely involves the ratio of actual velocities to the speed of light raised to the fifth power.

2 Nathan Rosen (Brooklyn 1909, Haifa 1995) was the author, together with Einstein and Podolsky, of the famous 1935 paper where the possibility that Quantum Mechanics might be incomplete was put forward. In the EPR paper the existence of hidden variables was conjectured and the probabilistic interpretation of Quantum Mechanics questioned. Yet, as is widely known, all experimental tests have always confirmed Quantum Mechanics and rejected any competitor theory.

3 Joseph Weber (1919–2000) was an American physicist. Born in Paterson, New Jersey, he died in Pittsburgh, Pennsylvania. After serving in the Navy during wartime, where he studied electronics, Weber graduated from the University of Maryland at College Park and obtained his Ph.D. with a thesis on microwave spectroscopy. In 1952 he gave a public lecture in Ottawa where he laid down the principles behind the construction of what were later called lasers and masers. These ideas were developed simultaneously by Charles Townes, Nikolay Basov, and Aleksandr Prokhorov, who built working prototypes of these devices and received the Nobel Prize for this work in 1964.

4 John Archibald Wheeler (July 9, 1911–April 13, 2008) was an eminent American theoretical physicist. He ranks among the later collaborators of Albert Einstein and includes Richard Feynman, Kip Thorne, Hugh Everett and Tullio Regge among his Ph.D. students. He tried to achieve Einstein's vision of a unified field theory. He is also known for having coined the terms black hole and wormhole. Like many other American physicists, Wheeler participated in the Manhattan Project for the construction of the Atomic Bomb. For a few decades, General Relativity was somewhat neglected by the mainstream of Physics, being detached from experiment. Wheeler was a key figure in the revival of the subject, leading the school at Princeton, while Dennis Sciama and Yakov Zel'dovich developed the subject in Cambridge and Moscow. The work of Wheeler and his students contributed greatly to the golden age of General Relativity.

Fig. 7.1 The antenna Explorer is a cylinder of Al5056; it weighs 2300 kg, is 3 meters long and has a diameter of 60 cm. It is cooled to the temperature of liquid helium (4.2 K) and it operates at the temperature of 2 K, which is reached by lowering the pressure on the liquid helium reservoir. Its resonance frequencies are around 906 and 923 Hz

gravitational antenna made of a resonant metallic bar, which he further improved during a long visit at the University of Leiden in the Netherlands. In the early 1960s Weber developed the first wave detectors and began publishing papers where he claimed evidence of such a detection. In 1972 one of Weber's bar detectors was sent to the moon on the Apollo 17 lunar mission.

Weber's claims were received with high skepticism by the scientific community and the systematic error inherent in the large noise of his detectors was demonstrated to invalidate his conclusions. Notwithstanding the fact that his efforts were inconclusive, Weber is nonetheless credited as the father of gravitational wave experiments.

The lead in this direction was then taken by the Italians under the stimulus of Edoardo Amaldi. The idea of starting an experiment in Rome aiming to detect gravitational waves was stimulated by the Course on Experimental Tests of Gravitational Theories held in summer 1961 at the Scuola Internazionale E. Fermi in Varenna, where the problem was discussed by J. Weber. The program remained rather vague for practical reasons until 1968, when W. Fairbank spent a few months in Rome at G. Careri's low temperature laboratory. When Fairbank mentioned his intention of starting the development of a low temperature gravitational antenna, Careri, who had long been aware of Edoardo Amaldi's interest in the subject, suggested a first direct contact.
A group formed by Edoardo Amaldi, Massimo Cerdonio, Renzo Marconero and Guido Pizzella was created and a long term research project started, which eventually resulted in the creation of the sophisticated, ultra-cryogenic bar antennae nowadays operating at the sites of CERN and of the National Laboratories of Frascati, known as Explorer, Nautilus and Auriga (Fig. 7.1).

The operating concept of bar antennae is extremely simple. A gravitational wave is a propagating deformation of space-time geometry, which induces a vibrating deformation of macroscopic objects. Orthogonal directions, in the plane transverse to the propagation direction of the wave, are alternately stretched and compressed. Correspondingly, the bar should be compressed and stretched: if the wave frequency is close to its resonance frequency, the bar will resonate and the resonance can be detected by means of sophisticated electronics. The problem is just that of sensitivity. The space displacements to be measured are of the order of $10^{-18}$–$10^{-20}$ cm.

Although this might seem incredible, the present sensitivity of the bar detectors is coming close to the critical sensitivity. No gravitational wave signal has so far been revealed, yet it might happen any day.

While the bar technology was being developed, new impetus to the quest for gravitational waves was brought by the indirect evidence of their existence revealed by a sensational astrophysical discovery of 1974. The latter was made possible by the progress of radio astronomy and the construction of giant radio telescopes such as that of Arecibo.

7.1.2 The Arecibo Radio Telescope

The construction of the Arecibo telescope (see Fig. 7.2), located in United States territory on the Caribbean island of Puerto Rico, was initiated by Professor William E. Gordon of Cornell University, who originally intended to use it for the study of Earth's ionosphere. Later, through contact with other agencies, including the Air Force Cambridge Research Laboratory, the original project was somewhat modified, also with the essential contribution of the two Doundoulakis brothers, chief engineers of the General Bronze Corporation in Garden City, New York. The latter devised and designed the cable suspension system, which is the structure finally realized.

Construction began in the summer of 1960, with the official opening on November 1st, 1963. As the primary dish is spherical, its focus is along a line rather than at a single point, as would be the case for a parabolic reflector; thus complicated line feeds had to be used to carry out observations. The telescope has undergone significant upgrades, the first one in 1974, when a high precision surface was added for the current reflector. A Gregorian reflector system was installed in 1997, incorporating secondary and tertiary reflectors to focus radio waves at a single point. This allowed the installation of a suite of receivers, covering the whole 1–10 GHz range, that could be easily moved onto the focal point, giving Arecibo a new flexibility.

Fig. 7.2 The giant Arecibo radio Telescope in Puerto Rico

Fig. 7.3 The Crab Nebula, the remnant of the 1054 supernova. At the center there is the Crab Pulsar which, detected in 1968, gave the first solid evidence for the existence of neutron stars in our Universe

This long-standing and sophisticated machine, which during the years of the cold war was also used for military purposes, notably for the localization of Soviet radar installations, is responsible for a few fundamental scientific discoveries that have substantially upgraded human knowledge in the fields of Astrophysics and General Relativity.

7.1.2.1 Discovery of the Crab Pulsar

Four years after its coming into operation, Arecibo enabled Lovelace and other researchers to provide the first solid evidence for the existence of neutron stars in our Universe. This came through the detection of the radio signal emitted with a period of 33 milliseconds by the Crab Pulsar PSR0531+21. This compact star is at the center of the Crab Nebula, which is located at a distance of about 6,500 light years from Earth and has a diameter of about 11 light years (Fig. 7.3). Discovered in 1731 by the English doctor and amateur astronomer John Bevis,⁵ the Crab Nebula is just the remnant of the supernova SN 1054, one of the brightest in history, recorded by Chinese and Arab astronomers and also by the monks of St. Gallen monastery. The supernova of 1054 was visible in daytime for 23 days and at night for 653 days. The detection of the Crab Pulsar at the center of the Crab Nebula gave final confirmation to the theory that the core of a gravitationally collapsing star settles down to the equilibrium state of a neutron star if its mass is below the relevant Chandrasekhar limit. The type II supernova explosion is generated by the dramatic bouncing of the infalling matter on the incredibly hard crust of the newly formed neutron star.

5 John Bevis (1693–1771), besides the discovery of the Crab Nebula, is known for his observation of the occultation of Mercury by Venus and for his studies on the eclipses of Jupiter's moons.

Fig. 7.4 The 1993 Nobel laureates R.A. Hulse and J.H. Taylor for their 1974 discovery of the binary pulsar system PSR 1913+16

7.1.2.2 The 1974 Discovery of the Binary System PSR1913+16

The 1993 Nobel Prize for Physics was awarded jointly to Joseph Hooton Taylor and Russell Alan Hulse (see Fig. 7.4) with the following motivation: for the discovery of a new type of pulsar, a discovery that has opened up new possibilities for the study of gravitation. The motivation referred to their 1974 discovery of the binary pulsar system known as PSR1913+16. The first detection and the subsequent thirty-year long monitoring of the system were performed by means of the Arecibo Radio Telescope [5–7].

Born on 29th March 1941 in Philadelphia in a family with Quaker roots, Taylor was Distinguished Professor of Physics at Princeton University from 1980 to his retirement in 2006. Previously he held a chair of Astronomy at the University of Massachusetts, where he was Director of the Five College Radio Astronomy Observatory. Nine years younger than his colleague, Russell A. Hulse was born in New York in 1950. He studied at the University of Massachusetts, from which he received his Ph.D. in 1975. Presently he is a staff member of the Princeton Plasma Physics Laboratory and also visiting professor at the University of Texas, Dallas.

At a very early stage of his scientific career, Hulse engaged with Taylor in a large scale survey for pulsars at the Arecibo facility and the result of their research was the uncovering of the PSR1913+16 system. The latter is made up of a pulsar and a dark companion star. The rotating neutron star emits pulses in the radio wave region that are extremely regular and stable. This allows for a careful reconstruction of the orbital motion of the two stars around their center of mass. Hulse and Taylor had the brilliant idea of using this first discovered binary pulsar system to make high-precision tests of General Relativity. Having noticed a steady shrinking of the period of revolution T, they traced it back to the emission of gravitational waves, providing the first indirect evidence for their existence.
Indeed they showed that their measurements are in astonishingly precise agreement with the Einstein formula for quadrupolar gravitational radiation established in 1918. In the present chapter we plan to analyze in full detail the mechanism of gravitational radiation emission and the evidence for it provided by binary pulsar systems.

7.1.3 The Coalescence of Binaries and the Interferometer Detectors

The discovery of the binary pulsar not only provided indirect evidence for gravitational waves, but also clarified the panorama of their candidate astrophysical sources. Gravitational waves are certainly emitted during the explosions of supernovae, both of type II and of type I. These events, however, are too sporadic and the corresponding emission spectra depend on too many variables for them to be efficiently predicted. Powerful sources of gravitational radiation are also the active galactic nuclei, which are most probably occupied by giant black holes of millions of stellar masses. Stars and matter falling into such holes certainly produce gravitational waves, yet the wavelengths involved are too large for Earth-based detectors.

On the other hand, the binary pulsar provided a new paradigm of a clean, predictable and quite universal astrophysical source of gravitational radiation, whose wavelengths are compatible with Earth-based detectors. Binary systems are quite abundant in the Universe, since about fifty percent of the stars are grouped in pairs. All stars sooner or later collapse and a large fraction of them end their life as neutron stars or as black holes. Hence binary compact star systems must be quite abundant as well. The result of Hulse and Taylor showed that such systems are unstable against the slow loss of energy through gravitational radiation which, however, becomes very large and dramatic in the last few seconds before coalescence. Binary coalescences thus became the preferred astrophysical sources of gravitational radiation to be searched for. New instruments were devised and slowly constructed for the detection of such events: the gravitational interferometers.

The idea of the interferometers is just as simple as that of the resonant bars (Fig. 7.5).
In this case the metric disturbance produced by the wave is supposed to deform the two arms, each a few kilometers long, of a laser interferometer, whose concept is the same as that underlying the apparatus utilized by Michelson and Morley to detect the motion of the Earth relative to the Ether. A highly monochromatic laser beam is split into two orthogonal beams that travel along the two arms of the interferometer, are reflected by mirrors placed at the arm end-points and come back, intersecting at the original splitting point. As long as the two arms are of equal length, the beams come back in phase and do not create any interference. Any gravitational disturbance should deform the lengths and should be revealed by the sudden appearance of an interference.

Fig. 7.5 The schema of a gravitational interferometer

At the present time two gravitational interferometers are in operation and a third is under construction. The existing machines are Virgo, located near Pisa, and Ligo I, located in Louisiana, US. The third machine, Ligo II, is being built in Washington State, US. The first six months of joint observations by Virgo (see Fig. 7.6) and Ligo I took place about two years ago, and so far no event of gravitational wave transit was detected. The direct detection of the elusive waves is just postponed. All experimentalists in this field share the view that a further effort to increase the already fantastic sensitivity of their instruments is necessary, unless we are especially lucky and a binary coalescence suddenly takes place rather close to us.

This being the status of observations, let us now carefully consider the mathematical derivation of gravitational wave emission in the weak field approximation of Einstein theory. A systematic treatment of this problem necessarily begins with a discussion of Green functions, namely the inverse of the d'Alembertian wave operator.

7.2 Green Functions

The mathematical problem associated with all cases of relativistic wave propagation is that of inverting the d'Alembertian operator:

Fig. 7.6 An aerial view of the Virgo interferometer at the EGO site near Cascina, Pisa. EGO is the European Gravitational Observatory, cofinanced by the Italian INFN (Istituto Nazionale di Fisica Nucleare) and by the French CNRS (Centre National de la Recherche Scientifique). The leading scientist of the Virgo project is Prof. Adalberto Giazotto

$$ \Box_x \equiv \frac{\partial^2}{\partial x_0^2} - \sum_{i=1}^{3} \frac{\partial^2}{\partial x_i^2} \tag{7.2.1} $$

since the relevant equations of motion take the form:

$$ \Box_x\, \phi(x) = j(x) \tag{7.2.2} $$

where $\phi(x)$ is the field to be determined and $j(x)$ describes the source emitting the waves. The standard approach to the solution of such a problem is by means of Green functions and integral representations. One writes the desired field, produced by the source $j(x)$, as follows:

$$ \phi(x) = \int G(x - x')\, j(x')\, d^4x' \tag{7.2.3} $$

where the kernel $G(x - x')$ of the above integral representation is a distribution which is supposed to satisfy the following equation:

$$ \Box_x\, G(x - x') = \delta^{(4)}(x - x') \tag{7.2.4} $$

having named $\delta^{(4)}(x - x')$ the Dirac delta function. The problem is therefore turned into that of constructing the Green function $G(x - x')$. Once again there is a time-honored strategy for such a construction, namely Fourier transforms. Physically the Fourier transform is a decomposition of the sought-for object along a complete basis of eigenfunctions of the d'Alembert operator (see Fig. 7.7), provided by the plane waves $\exp[i k_\mu x^\mu]$. Explicitly we set:

Fig. 7.7 Jean Baptiste Le Rond D'Alembert (1717–1783). D'Alembert was born and died in Paris. Illegitimate child of a noblewoman writer and of an officer, he was abandoned by his mother on the steps of the church of St. Jean Le Rond. Raised in a humble adoptive family, he was educated by Jansenists in the Collège des Quatre-Nations at the expense of his natural father, who secretly left him an annuity. One among the top figures of the XVIII century Enlightenment, d'Alembert, who invented such a name for himself, was a mathematician, a physicist, a philosopher and a man of letters. Member of the French Academy of Sciences, friend and collaborator of Denis Diderot and of many other philosophers of that age, he gave outstanding contributions in mathematics, mechanics and optics. He was a member of the team preparing the Encyclopédie, for which he wrote more than a thousand articles. The differential equation of wave propagation, $\partial^2 f/\partial t^2 - \partial^2 f/\partial x^2 = 0$, is one among his many fundamental contributions

$$ G(x - x') = \frac{1}{(2\pi)^4} \int d^4k\, G(k)\, \exp\left[-i k \cdot (x - x')\right] \tag{7.2.5} $$

$$ \delta^{(4)}(x - x') = \frac{1}{(2\pi)^4} \int d^4k\, \exp\left[-i k \cdot (x - x')\right] \tag{7.2.6} $$

In this way (7.2.4) becomes:

$$ \frac{1}{(2\pi)^4} \int d^4k\, \exp\left[-i k \cdot (x - x')\right] \left[ -k^2\, G(k) - 1 \right] = 0 \tag{7.2.7} $$

where $k^2 \equiv k^\mu k_\mu$ is the Lorentzian norm of the momentum vector $k^\mu$. This leads to the following solution for the Green function:

$$ G(x - x') = -\frac{1}{(2\pi)^4} \int d^4k\, \frac{1}{k^2}\, \exp\left[-i k \cdot (x - x')\right] \tag{7.2.8} $$

The problem with (7.2.8) is that it is singular, since there is a pole along the integration path. The recipe to avoid such a singularity can be provided in three different ways, and this leads to three different solutions for the Green function, having distinct physical properties and distinct uses: the advanced, retarded and Feynman propagators. In order to appreciate the bearing of Lorentz signature we compare with the solution of the analogous problem in Euclidean signature, namely with the construction of the Green function for the Laplace operator.

7.2.1 The Laplace Operator and Potential Theory

In potential theory for the case of Newtonian or Coulomb forces, one is confronted with a similar problem, the inversion of the Laplace operator:

$$ \Delta_x \equiv \sum_{i=1}^{3} \frac{\partial^2}{\partial x_i^2} \tag{7.2.9} $$

Indeed, if we possess a solution of the equation:

$$ \Delta_x\, G(x - x') = \delta^{(3)}(x - x') \tag{7.2.10} $$

we can calculate the potential $V(x)$ generated by an arbitrary distribution of masses or charges $\rho(x)$:

$$ \Delta_x V(x) = -\mathrm{const}\; \rho(x) \tag{7.2.11} $$

In a completely analogous way to the relativistic case we obtain the integral representation of the Laplace Green function that follows:

$$ G(x - x') = \frac{1}{(2\pi)^3} \int d^3k\, \frac{1}{|k|^2}\, \exp\left[-i k \cdot (x - x')\right] \tag{7.2.12} $$

Turning to polar coordinates we obtain:

$$ d^3k = k^2\, dk\, \sin\theta\, d\theta\, d\phi \tag{7.2.13} $$

and setting $r \equiv x - x'$:

$$
\begin{aligned}
G(r) &= \frac{1}{(2\pi)^3} \int dk\, \exp[ikr\cos\theta]\, \sin\theta\, d\theta\, d\phi \\
&= \frac{1}{(2\pi)^2} \int_0^\infty dk\, \frac{1}{-ikr} \int_0^\pi d\theta\, \frac{d}{d\theta} \exp[irk\cos\theta] \\
&= \frac{1}{(2\pi)^2} \int_0^\infty dk\, \frac{1}{-ikr} \left( \exp[-irk] - \exp[irk] \right) \\
&= \frac{1}{(2\pi)^2}\, \frac{2}{r} \int_0^\infty dk\, \frac{\sin kr}{k} \\
&= \frac{1}{4\pi |r|}
\end{aligned}
\tag{7.2.14}
$$

As one sees, no singularity was met in this integration and the Green function of the Laplacian has the form of the Newtonian central potential generated by a point-like mass. As a consequence the gravitational or electric potential generated by an arbitrary distribution of masses or charges can be written as:

$$ V(r) = -\int d^3r'\, \frac{\rho(r')}{4\pi |r - r'|} \tag{7.2.15} $$

Let us now return to the relativistic case and let us observe the differences.

7.2.2 The Relativistic Propagators

From (7.2.8), by separating the integration on the various momentum components, we obtain:

$$ G(x - x') = \frac{1}{(2\pi)^4} \int d^3k \int_{-\infty}^{+\infty} dk^0\, \exp\left[i k^0 (x_0 - x'_0)\right] \times \frac{\exp\left[-i \mathbf{k} \cdot (\mathbf{x} - \mathbf{x}')\right]}{(k^0)^2 - |\mathbf{k}|^2} \tag{7.2.16} $$

and the singularities on the integration path of $k^0$ become evident. They occur at $k^0 = \pm|\mathbf{k}|$. In order to give a meaning to the integral it is necessary to give a prescription to deform the integration path in such a way as to avoid the singularities. There exist three possibilities, depicted in Fig. 7.8.

The upper path $C_R$ yields the retarded Green function $G_R(x - x')$, the lower path $C_A$ yields the advanced Green function $G_A(x - x')$, while the middle path $C_F$ yields the Feynman propagator $G_F(x - x')$. As discussed in all courses in Quantum Field Theory and Quantum Electrodynamics, the Feynman choice is the one appropriate for perturbative quantum calculations. This prescription is such that for positive energies the propagator captures the contribution of particles advancing in time, while for negative energies it captures that of anti-particles receding in time. The retarded and advanced prescriptions pertain instead to classical physics. They are both meaningful and simply correspond to the solutions of two different problems: that of emission and that of propagation of classical waves. In the retarded case we pick up the contribution of all events that are in the past light-cone of the event we consider. In this way we predict the value of a field at a certain space-time point knowing the behavior of the source in the past. Alternatively, in the advanced case we pick up the contribution of all events that are in the future light-cone of the considered event. In this way, knowing the value of a field at a certain space-time point, we determine its influence on future events, for instance on those pertaining to an antenna which is supposed to receive signals. Relevant to us, while discussing gravitational waves emitted from astrophysical sources, is the retarded potential.

Fig. 7.8 The integration path in the $k^0$-plane corresponding to the three choices of relativistic propagator, advanced, retarded and Feynman

7.2.2.1 The Retarded Potential

We choose the retarded integration path $C_R$. The integral can be calculated with the residue theorem, using Jordan's lemma and closing the contour in the upper or in the lower half-plane. If $x_0 - x'_0 < 0$ Jordan's lemma allows us to close it in the upper half-plane, while for $x_0 - x'_0 > 0$ we have to close it in the lower half-plane. In the first case the integration contour contains no poles and the integral is simply zero, while in the second case the two poles are both encircled and we have non-vanishing contributions from the residues. This amounts to saying that the retarded potential is proportional to a step function $\theta(x_0 - x'_0)$. Indeed, using polar coordinates as in the previous case of the Laplacian Green function, we obtain:

$$
\begin{aligned}
G(x - x') &= \theta(x_0 - x'_0)\, \frac{2\pi i}{(2\pi)^4} \int \frac{e^{ik(x_0 - x'_0)} + e^{-ik(x_0 - x'_0)}}{2k} \times e^{ikr\cos\theta}\, k^2 \sin\theta\, dk\, d\theta\, d\phi \\
&\quad (\text{changing variables } u = \cos\theta,\ t = x_0 - x'_0) \\
&= \theta(t)\, \frac{i\pi}{(2\pi)^3} \int_0^\infty dk\, k \int_{-1}^{1} du\, e^{ikru} \times \left( e^{ikt} + e^{-ikt} \right) \\
&= \frac{1}{2r}\, \frac{\theta(t)}{(2\pi)^2} \int_{-\infty}^{+\infty} dk \left( e^{-ik(t+r)} + e^{-ik(t-r)} \right) \\
&= \frac{1}{4\pi r}\, \theta(t) \left[ \delta(t + r) + \delta(t - r) \right]
\end{aligned}
\tag{7.2.17}
$$

Restoring the original variables in the final result of the above calculation we obtain the following expression for the retarded Green function:

$$ G_R(x - x') = \frac{1}{4\pi |\mathbf{x} - \mathbf{x}'|}\, \delta\left( x^0 - x'^0 - |\mathbf{x} - \mathbf{x}'| \right) \times \theta\left( x^0 - x'^0 \right) \tag{7.2.18} $$
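As a hedged illustration of (7.2.18): for a point-like source sitting at a fixed position, the integral representation (7.2.3) with the retarded kernel collapses to the source strength evaluated at the retarded time, divided by $4\pi$ times the distance. The Python sketch below assumes units with c = 1. The names `retarded_time`, `retarded_field` and the switch-on source `step_source` are illustrative choices that do not appear in the text.

```python
import math


def retarded_time(t, x, source_pos):
    """Emission time t' fixed by the delta function in (7.2.18): t - t' = |x - x'| (c = 1)."""
    return t - math.dist(x, source_pos)


def retarded_field(t, x, source_pos, strength):
    """Field of a point source j(t', x') = strength(t') * delta^3(x' - source_pos).

    Inserting this source and the kernel (7.2.18) into (7.2.3) leaves
    strength(t_ret) / (4 pi |x - x'|).
    """
    distance = math.dist(x, source_pos)
    return strength(retarded_time(t, x, source_pos)) / (4.0 * math.pi * distance)


def step_source(t_emit):
    """Hypothetical source that is switched on at t' = 0."""
    return 1.0 if t_emit >= 0.0 else 0.0


observer = (3.0, 4.0, 0.0)   # at distance 5 from the origin
origin = (0.0, 0.0, 0.0)
before = retarded_field(4.0, observer, origin, step_source)  # signal not yet arrived
after = retarded_field(6.0, observer, origin, step_source)   # signal has arrived
```

The observer only sees the source after the light travel time has elapsed, which is the causal content of the step function and of the delta function in the retarded kernel.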

7.3 Emission of Gravitational Waves

In Chap. 5 we studied the linearized form of Einstein equations and, after fixing the de Donder gauge, we arrived at the following form:

$$ \Box\, \gamma_{\mu\nu} = -16\pi G\, T_{\mu\nu}, \qquad \partial^\mu \gamma_{\mu\nu} = 0 \tag{7.3.1} $$

where $\gamma_{\mu\nu}$ is a linear redefinition of the metric deformation around its flat Minkowskian average:

$$ g_{\mu\nu}(x) = \eta_{\mu\nu} + h_{\mu\nu}(x) + O(h^2), \qquad \gamma_{\mu\nu}(x) = h_{\mu\nu} - \frac{1}{2}\, \eta_{\mu\nu}\, h_{\rho\sigma}(x)\, \eta^{\rho\sigma} \tag{7.3.2} $$

Using the retarded Green function (7.2.18) to solve these linearized equations (7.3.1), we obtain the following expression for the field $\gamma_{\mu\nu}(x)$ generated by a matter system, whose description is encoded in the stress-energy tensor $T_{\mu\nu}(x)$:

$$ \gamma_{\mu\nu}(x) = -4G \int d^3x'\, \frac{T_{\mu\nu}\left( x^0 - |\mathbf{x} - \mathbf{x}'|,\ \mathbf{x}' \right)}{|\mathbf{x} - \mathbf{x}'|} \tag{7.3.3} $$

As already emphasized, the integral is extended over the past light-cone of the point whose coordinates are $x^\mu$. The above solution represents the disturbance in the gravitational field produced at the point $x^\mu$ by the presence and motion of some matter in another distant region of space-time. Obviously the disturbance is felt only after the time needed for the signal to travel (at the speed of light) the distance separating $x^\mu$ from the region where $T_{\mu\nu}$ is present. Relying on the above formula, our main concern is that of evaluating the energy transported by a linearized gravitational wave and of relating it to the relevant deformations of the source system. This procedure will produce an evaluation of the emission power of a gravitational wave source. At the end of our elaborations we shall discover that such a power, namely the amount of energy emitted per unit time, is proportional to the modulus squared of the third derivative of the source quadrupole moment. This is typically a very small quantity, and that is the main reason why gravitational waves have so far escaped direct measurement.

7.3.1 The Stress Energy 3-Form of the Gravitational Field

In order to calculate the energy transported by the gravitational wave we have to define the stress-energy tensor not of matter, but of the gravitational field itself. In the case of general metrics, the definition of mass, energy and momentum of the gravitational field is an ambiguous problem, since to introduce a stress-energy tensor we need a background reference metric, which is not uniquely given.

Yet, in the case of linearized gravity, the reference metric is uniquely identified by the undeformed Minkowski metric. In this case a simple manipulation of Einstein equations allows one to derive the correct expression of the stress-energy tensor. We start from the field equations in differential form language:

$$ dE^a + \omega^{ab} \wedge E^c\, \eta_{bc} = 0, \qquad R^{ab} \wedge E^c\, \varepsilon_{abcd} = \kappa\, {\star T}_d \tag{7.3.4} $$

where $E^a$ is the vierbein, $\omega^{ab}$ the spin connection and $\kappa = 16\pi G/3$. We insert the definition of the curvature

$$ R^{ab} = d\omega^{ab} + \omega^{ac} \wedge \omega_{c}{}^{b} \tag{7.3.5} $$

in the second of (7.3.4) and we rewrite its left hand side as follows:

$$ d\omega^{ab} \wedge E^c\, \varepsilon_{abcd} = d\left( \omega^{ab} \wedge E^c\, \varepsilon_{abcd} \right) + \omega^{ab} \wedge dE^c\, \varepsilon_{abcd} \tag{7.3.6} $$

Then using the torsion equation, namely the first of (7.3.4), we put:

$$ dE^c = -\omega^{cb} \wedge E_b \tag{7.3.7} $$

and we obtain:

$$ d\omega^{ab} \wedge E^c\, \varepsilon_{abcd} = d\left( \omega^{ab} \wedge E^c\, \varepsilon_{abcd} \right) - \omega^{ab} \wedge \omega^{cf} \wedge E_f\, \varepsilon_{abcd} \tag{7.3.8} $$

Hence Einstein equations can be rewritten as:

$$ d\left( \omega^{ab} \wedge E^c\, \varepsilon_{abcd} \right) = \kappa \left( {\star T}_d - {\star t}_d \right) \tag{7.3.9} $$

where the 3-form

$$ {\star t}_d = \frac{1}{\kappa} \left( \omega^{ab} \wedge \omega^{cf} \wedge E_f - \omega^{af} \wedge \omega_{f}{}^{b} \wedge E^c \right) \varepsilon_{abcd} \tag{7.3.10} $$

can be declared to encode the stress-energy tensor of the gravitational field. Indeed, as a consequence of (7.3.9), the following 3-form

$$ {\star T}_d^{(\mathrm{total})} \equiv {\star T}_d - {\star t}_d \tag{7.3.11} $$

is conserved in the ordinary sense:

$$ d\, {\star T}_d^{(\mathrm{total})} = 0 \tag{7.3.12} $$

Note that this definition is not invariant with respect to local Lorentz transformations, since it depends on the bare field $\omega^{ab}$. However it is invariant against global Lorentz transformations, and therefore it can be used in the asymptotic region of a space-time which is nearly flat.

7.3.2 Energy and Momentum of a Plane Gravitational Wave

Let us consider the perturbed metric (7.3.2) and let us introduce a null coordinate system adapted to Minkowski space:

$$ x^- = u = x^0 - x^1; \qquad x^i = x^2, x^3 \tag{7.3.13} $$

$$ x^+ = v = x^0 + x^1 \tag{7.3.14} $$

Correspondingly the Minkowskian metric takes the form:

$$ ds^2 = 2\, dx^+ \otimes dx^- - dx^i \otimes dx^i = \eta_{\mu\nu}\, dx^\mu \otimes dx^\nu \tag{7.3.15} $$

where:

$$ \eta_{\mu\nu} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{7.3.16} $$

A plane wave corresponds to the case where the metric deformation is a function of only one of the light-cone coordinates $u$, $v$: for outgoing waves only of $u$, for incoming waves only of $v$. Since we are interested in outgoing waves we choose:

$$ h_{\mu\nu}(x) = h_{\mu\nu}(u) \tag{7.3.17} $$

Relying on the relations:

$$
\begin{aligned}
\partial_+ &= \frac{\partial}{\partial v}; \qquad \partial_- = \frac{\partial}{\partial u} \\
\partial^+ &= \eta^{+-}\, \frac{\partial}{\partial x^-} = \partial_-; \qquad \partial^- = \eta^{-+}\, \frac{\partial}{\partial x^+} = \partial_+ \\
\partial^i &= \eta^{ij}\, \frac{\partial}{\partial x^j} = -\partial_i
\end{aligned}
\tag{7.3.18}
$$

from the Hilbert de Donder gauge condition:

$$ \partial^\mu \gamma_{\mu\nu} = 0 \tag{7.3.19} $$

by using $\gamma_{\mu\nu} = \gamma_{\mu\nu}(u)$ we obtain:

$$ \gamma_{+\nu} = \mathrm{const} = 0 \tag{7.3.20} $$

The last of the above equalities is fixed by our physically chosen boundary conditions. At infinity, namely at very remote future times and in very distant space locations where the wave has not yet arrived, the metric is just Minkowski. Hence there is no constant part of $\gamma_{\mu\nu}$. Next we recall the discussion of Chap. 5 about residual gauge transformations. The Lorentz covariant Hilbert de Donder gauge is not complete, since there exist further gauge transformations that preserve it. By using these transformations one can further reduce the form of $\gamma_{\mu\nu}$, making it transverse and traceless, namely of the form:

$$ \gamma_{\mu\nu}(u) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & a(u) & b(u) \\ 0 & 0 & b(u) & -a(u) \end{pmatrix} \tag{7.3.21} $$

where the two functions $a(u)$ and $b(u)$ are arbitrary. We shall use this gauge-fixed form of the metric perturbation in the evaluation of the stress-energy 3-form as given in (7.3.10). To this effect we need to calculate the spin connection associated with the metric deformation $h_{\mu\nu}$.

7.3.2.1 Calculation of the Spin Connection

In order to use the spin-connection formalism we need to give the form of the vielbein first. This requires a further gauge fixing, that of local Lorentz transformations. Indeed the vielbein $E^a$ is defined up to local Lorentz rotations. We fix that gauge by stating that the linearized vielbein is solely parameterized by the symmetric metric fluctuation $h_{\mu\nu}$. This is obtained by setting:

$$ E^a = dx^a + \frac{1}{2}\, h^{ab}\, dx^c\, \eta_{bc} \tag{7.3.22} $$

which yields:

$$ dE^a = \frac{1}{2}\, \partial_c h^{ab}\, dx^c \wedge dx^d\, \eta_{bd} \tag{7.3.23} $$

Then from the vanishing torsion equation, we obtain:

$$ 0 = dE^a + \omega^{ac} \wedge E^f\, \eta_{cf} \tag{7.3.24} $$

that yields:

$$ \omega^{ab}{}_{l}\, \eta_{bm} - \omega^{ab}{}_{m}\, \eta_{bl} = -\frac{1}{2} \left( \partial_l h^{ab}\, \eta_{bm} - \partial_m h^{ab}\, \eta_{bl} \right) \tag{7.3.25} $$

We uniquely solve the above relation by means of the linearized spin connection:

$$ \omega^{ab} = -\frac{1}{2}\, \eta^{af} \eta^{bg} \left( \partial_f h_{gl} - \partial_g h_{fl} \right) dx^l \tag{7.3.26} $$

This result can be made explicit in the light-cone basis as follows:

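$$ \omega^{+-} = -\frac{1}{2} \left( \partial_- h_{+l} - \partial_+ h_{-l} \right) dx^l \tag{7.3.27} $$

$$ \omega^{+i} = -\frac{1}{2} \left( \partial_- h_{il} - \partial_i h_{-l} \right) dx^l \tag{7.3.28} $$

$$ \omega^{-i} = -\frac{1}{2} \left( \partial_+ h_{il} - \partial_i h_{+l} \right) dx^l \tag{7.3.29} $$

$$ \omega^{ij} = -\frac{1}{2} \left( \partial_i h_{jl} - \partial_j h_{il} \right) dx^l \tag{7.3.30} $$

In the plane wave case, denoting the derivative $\partial/\partial u$ by a dot, we find:

$$ \omega^{+-} = \omega^{-i} = \omega^{ij} = 0 \tag{7.3.31} $$

$$ \omega^{+i} = -\frac{1}{2}\, \dot{h}_{ij}\, dx^j \tag{7.3.32} $$

or more explicitly:

$$
\begin{aligned}
\omega^{+2} &= -\frac{1}{2} \left( \dot{a}\, dx^2 + \dot{b}\, dx^3 \right) \\
\omega^{+3} &= -\frac{1}{2} \left( \dot{b}\, dx^2 - \dot{a}\, dx^3 \right)
\end{aligned}
\tag{7.3.33}
$$

Let us now consider the stress-energy tensor 3-form (7.3.10). Since the only non-vanishing component is $\omega^{+i}$, the second addend $\omega^{af} \wedge \omega_{f}{}^{b} \wedge E^c\, \varepsilon_{abcd}$ vanishes identically. Hence

$$ {\star t}_d = \frac{1}{\kappa}\, \omega^{ab} \wedge \omega^{cf} \wedge E_f\, \varepsilon_{abcd} \tag{7.3.34} $$

Now $(ab)$ must necessarily be $(+i)$. Hence $c$ must necessarily be 2 or 3, so that $f = +$, also necessarily. Therefore $d$ cannot be anything else but $-$. Hence only ${\star t}_-$ is non-vanishing. The final result is:

$$ {\star t}_- = -\frac{2}{\kappa}\, \omega^{+2} \wedge \omega^{+3} \wedge E_- = \frac{2}{\kappa} \left[ (\dot{a})^2 + (\dot{b})^2 \right] dx^2 \wedge dx^3 \wedge du \tag{7.3.35} $$

This result can be interpreted by recalling the encoding of the symmetric stress-energy tensor in the stress-energy 3-form. This encoding was discussed in Chap. 5 and takes the following form:

$$ {\star t}_d = T_{dp}\, \eta^{pq}\, \varepsilon_{qrsu}\, E^r \wedge E^s \wedge E^u \tag{7.3.36} $$

Applying the above relation to the stress-energy 3-form of a plane wave as given by (7.3.35), we obtain:

$$ {\star t}_- = 2 \cdot 3\; t_{--}\, \eta^{-+}\, \varepsilon_{+23-}\, E^2 \wedge E^3 \wedge E^- = 6\, t_{--}\, dx^2 \wedge dx^3 \wedge du \tag{7.3.37} $$

Hence by comparison we conclude:

$$ t_{--} = -t_{01} = \frac{1}{3\kappa} \left[ (\dot{a})^2 + (\dot{b})^2 \right] \qquad (\text{all other entries vanish}) \tag{7.3.38} $$

where:

$$ \kappa = \frac{16\pi G}{3} \tag{7.3.39} $$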

7.3.3 Multipolar Expansion of the Perturbation

The next step in our elaboration of the general solution (7.3.3) for the metric perturbation consists of its systematic multipolar expansion. This will enable us to single out the contributions to the wave from the various angular momentum components of the source deformation. Since these contributions have a faster and faster distance decay at increasing angular momentum $\ell$, it follows that for weak fields only the first non-vanishing multipole needs to be considered. Due to the spin of the graviton, which is $s = 2$, the first contributing multipole turns out to be the quadrupole, as opposed to the case of electromagnetic radiation, where the spin $s = 1$ of the photon selects the dipole as the first non-vanishing and dominant contribution. To implement the above outlined programme we begin by performing a partial Fourier (see Fig. 7.9) transform of the metric perturbation with respect to the time coordinate:

$$ \gamma_{\mu\nu}(\omega, \mathbf{x}) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} dt\, e^{i\omega t}\, \gamma_{\mu\nu}(t, \mathbf{x}) \tag{7.3.40} $$

Inserting this in (7.3.3) we get:

$$ \gamma_{\mu\nu}(\omega, \mathbf{x}) = -4G \int d^3r'\, e^{i\omega |\mathbf{r} - \mathbf{r}'|}\, \frac{T_{\mu\nu}(\omega, \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \tag{7.3.41} $$

The Hilbert gauge condition implies:

$$ \partial^0 \gamma_{0\nu} = i\omega\, \gamma_{0\nu} = \partial^i \gamma_{i\nu} \tag{7.3.42} $$

and since $\gamma_{\mu\nu}$ is symmetric we can conclude:

$$ \gamma_{0i} = \frac{i}{\omega}\, \partial_j \gamma_{ji} \tag{7.3.43} $$

$$ \gamma_{00} = \frac{i}{\omega}\, \partial_i \gamma_{i0} = -\frac{1}{\omega^2}\, \partial_i \partial_j \gamma_{ij} \tag{7.3.44} $$

Hence it suffices to determine the spatial components γij . The others are easily retrieved from these.

7.3.3.1 Multipolar Expansion

Let us consider the spatial part of the Green function, which apart from a factor 4π is just the Green function of the Laplace operator discussed in Sect. 7.2.1:

Fig. 7.9 Jean Baptiste Joseph Fourier (March 21, 1768–May 16, 1830) is one among the incredibly large number of scientific geniuses of Revolutionary and Napoleonic France who gave fundamental contributions to the development of Modern Physics, Mathematics and Chemistry. Other members of that group are Laplace, Lagrange, Monge, Carnot (father and son), to mention some. Born at Auxerre, Fourier was educated in a monastery and studied mathematics. He took a prominent part in the Revolution and was appointed professor first at the École Normale Supérieure, then at the École Polytechnique, where he inherited the chair of Laplace. He served as an officer in the Napoleonic Army and was even appointed governor of Egypt. Fourier was elected member of the French Academy of Sciences in 1817. Besides the introduction of the Fourier series and of the Fourier transform, Fourier discovered the equation of heat propagation and was the first to point out the greenhouse effect of the atmosphere. He died in Paris at the age of sixty-two

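$$ G(\mathbf{x} - \mathbf{x}') \equiv \frac{1}{|\mathbf{x} - \mathbf{x}'|} \tag{7.3.45} $$

by means of a trivial manipulation we can rewrite it as follows:

$$ G(\mathbf{x} - \mathbf{x}') = \frac{1}{\left( \mathbf{x} \cdot \mathbf{x} + \mathbf{x}' \cdot \mathbf{x}' - 2\, \mathbf{x} \cdot \mathbf{x}' \right)^{1/2}} \tag{7.3.46} $$

Defining:

$$ R \equiv \sqrt{\mathbf{x} \cdot \mathbf{x}} \tag{7.3.47} $$

we obtain:

$$ G(\mathbf{x} - \mathbf{x}') = \frac{1}{R} \left[ 1 + \frac{\mathbf{x}' \cdot \mathbf{x}' - 2\, \mathbf{x} \cdot \mathbf{x}'}{R^2} \right]^{-1/2} \tag{7.3.48} $$

that we can develop into a power series in $1/R$:

$$
\begin{aligned}
G(\mathbf{x} - \mathbf{x}') &= \frac{1}{R} - \frac{1}{2}\, \frac{1}{R^3} \left( \mathbf{x}' \cdot \mathbf{x}' - 2\, \mathbf{x} \cdot \mathbf{x}' \right) && (7.3.49) \\
&\quad + \frac{3}{8}\, \frac{1}{R^5} \left( \mathbf{x}' \cdot \mathbf{x}' - 2\, \mathbf{x} \cdot \mathbf{x}' \right)^2 + \cdots && (7.3.50)
\end{aligned}
$$

The next step is to reorganize the terms of order $R^{-3}$ and of order $R^{-5}$. In this way we obtain:

$$
\begin{aligned}
G(\mathbf{x} - \mathbf{x}') &= \frac{1}{R} + \frac{\mathbf{x} \cdot \mathbf{x}'}{R^3} && (7.3.51) \\
&\quad + \frac{3}{2}\, \frac{(\mathbf{x} \cdot \mathbf{x}')(\mathbf{x} \cdot \mathbf{x}')}{R^5} - \frac{1}{2}\, \frac{\mathbf{x}' \cdot \mathbf{x}'}{R^3} && (7.3.52) \\
&\quad + \frac{3}{8}\, \frac{(\mathbf{x}' \cdot \mathbf{x}')^2}{R^5} - \frac{3}{2}\, \frac{(\mathbf{x} \cdot \mathbf{x}')(\mathbf{x}' \cdot \mathbf{x}')}{R^5} + \cdots && (7.3.53)
\end{aligned}
$$

Hence we can write:

$$
\begin{aligned}
G(\mathbf{x} - \mathbf{x}') &= \frac{1}{R} && (7.3.54) \\
&\quad + \sum_{k} \frac{x^k x'^k}{R^3} && (7.3.55) \\
&\quad + \frac{1}{2} \sum_{k,\ell} \frac{x^k x^\ell}{R^5} \left( 3\, x'^k x'^\ell - |\mathbf{x}'|^2\, \delta^{k\ell} \right) && (7.3.56) \\
&\quad + \cdots && (7.3.57)
\end{aligned}
$$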

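The first three lines in the above equation respectively define:

(a) the monopole moment,
(b) the dipole moment,
(c) the quadrupole moment.

To see this let us reconsider the general solution of the potential problem in Newtonian theory provided by (7.2.15) and let us insert into it the development (7.3.54) of the kernel. We obtain:

$$
\begin{aligned}
V(\mathbf{x}) &= -\int d^3x'\, \frac{G\, \rho(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|} \\
&= -\frac{GM}{R} - \frac{G}{R^3} \sum_{k} x^k D^k && (7.3.58) \\
&\quad - \frac{G}{2R^5} \sum_{k,\ell} x^k x^\ell\, Q^{k\ell} + \cdots && (7.3.59)
\end{aligned}
$$

where:

$$
\begin{aligned}
M &= \int d^3x'\, \rho(\mathbf{x}') && \text{mass} \\
D^k &= \int d^3x'\, x'^k\, \rho(\mathbf{x}') && \text{dipole} \\
Q^{k\ell} &= \int d^3x' \left( 3\, x'^k x'^\ell - |\mathbf{x}'|^2\, \delta^{k\ell} \right) \rho(\mathbf{x}') && \text{quadrupole}
\end{aligned}
\tag{7.3.60}
$$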
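The multipole expansion (7.3.58)-(7.3.60) can be checked numerically on a small cluster of point masses, for which the integrals in (7.3.60) reduce to finite sums. The following Python sketch is only illustrative: the function names, the two sample masses and the observation point are assumptions chosen for the example, and G is set to 1 by default.

```python
import math


def multipole_moments(masses, positions):
    """Mass M, dipole D^k and quadrupole Q^{kl} of point masses, following (7.3.60)."""
    total_mass = sum(masses)
    dipole = [sum(m * p[k] for m, p in zip(masses, positions)) for k in range(3)]
    quadrupole = [
        [
            sum(
                m * (3.0 * p[k] * p[l] - (sum(c * c for c in p) if k == l else 0.0))
                for m, p in zip(masses, positions)
            )
            for l in range(3)
        ]
        for k in range(3)
    ]
    return total_mass, dipole, quadrupole


def potential_exact(x, masses, positions, G=1.0):
    """Newtonian potential -G * sum_i m_i / |x - x_i|."""
    return -G * sum(m / math.dist(x, p) for m, p in zip(masses, positions))


def potential_multipole(x, total_mass, dipole, quadrupole, G=1.0):
    """Monopole + dipole + quadrupole terms of (7.3.58)-(7.3.59)."""
    R = math.sqrt(sum(c * c for c in x))
    mono = -G * total_mass / R
    dip = -G / R**3 * sum(x[k] * dipole[k] for k in range(3))
    quad = -G / (2.0 * R**5) * sum(
        x[k] * x[l] * quadrupole[k][l] for k in range(3) for l in range(3)
    )
    return mono + dip + quad


masses = [1.0, 2.0]
positions = [(0.1, 0.0, -0.05), (-0.02, 0.08, 0.03)]
far_point = (10.0, -4.0, 7.0)

M, D, Q = multipole_moments(masses, positions)
exact = potential_exact(far_point, masses, positions)
approx = potential_multipole(far_point, M, D, Q)
```

For an observation point much farther away than the size of the source, the truncated expansion should agree with the exact sum up to terms of relative order (source size / R) cubed, and the quadrupole tensor should come out traceless.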

Fig. 7.10 The radiation region

Equipped with this lore let us return to the case of our relativistic wave. We expand the phase factor according to:

$$ \exp\left[ i\omega |\mathbf{x} - \mathbf{x}'| \right] = \exp\left[ i\omega R - i\omega\, \frac{\mathbf{x} \cdot \mathbf{x}'}{R} + O\!\left( \frac{1}{R^2} \right) \right] \tag{7.3.61} $$

Next we define the radiation zone by means of the following two inequalities:

$$ \omega R \gg 1; \qquad \omega |\mathbf{x}'| \ll 1 \tag{7.3.62} $$

namely the distance is very large compared to the wavelength and the extension of the source is small compared to the wavelength (see Fig. 7.10). In the radiation region, which is the region far away from the source, the phase factor can be well approximated by $\exp[i\omega R]$ and taken out of the integral. Correspondingly we get:

$$ \gamma_{ij}(\omega, \mathbf{x}) = -4G\, \frac{\exp[i\omega R]}{R} \int d^3x'\, T_{ij}(\omega, \mathbf{x}') \tag{7.3.63} $$

Using the conservation of the stress-energy tensor $\partial^\mu T_{\mu\nu} = 0$ we rewrite:

$$
\begin{aligned}
\int d^3x'\, T_{ij}(\omega, \mathbf{x}') &= \int d^3x' \left[ \partial^k \left( T_{kj}\, x'_i \right) - \partial^k T_{kj}\, x'_i \right] && (7.3.64) \\
&= -i\omega \int d^3x'\, T_{0j}\, x'_i && (7.3.65) \\
&= -i\, \frac{\omega}{2} \int d^3x' \left( T_{0j}\, x'_i + T_{0i}\, x'_j \right) && (7.3.66)
\end{aligned}
$$

The first term in the second line is dropped because it is a total derivative; the third line corresponds to the explicit symmetrization of the result due to the symmetry of $\gamma_{ij}$. Applying the previous procedure a second time we obtain:

$$ \int d^3x'\, T_{ij}(\omega, \mathbf{x}') = -\frac{\omega^2}{2} \int d^3x'\, T_{00}\, x'_i x'_j \tag{7.3.67} $$

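In addition, if we impose that the perturbation $\gamma_{ij}$ should be traceless in the 3-dimensional sense, we get the semifinal formula:

$$ \gamma_{ij}(\omega, \mathbf{x}) = -\frac{2}{3}\, G \omega^2\, \frac{e^{i\omega R}}{R}\, Q_{ij}(\omega) \tag{7.3.68} $$

$$ Q_{ij}(\omega) = \int d^3x'\, T_{00}(\omega, \mathbf{x}') \left( 3\, x'_i x'_j - |\mathbf{x}'|^2\, \delta_{ij} \right) \tag{7.3.69} $$

Undoing the time Fourier transform we can also write:

$$ \gamma_{ij}(t, \mathbf{x}) = \frac{2}{3}\, G\, \frac{1}{R}\, \frac{\partial^2}{\partial t^2}\, Q_{ij}(t) \tag{7.3.70} $$

where, by comparison with (7.3.54), $Q_{ij}(t)$ is recognized to be the time dependent quadrupole moment.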

7.3.4 Energy Loss by Quadrupole Radiation

When we calculated the stress-energy tensor of the plane wave we expressed it in terms of the following quadratic form:

$$ (\dot{a})^2 + (\dot{b})^2 \equiv (\dot{\gamma}_{23})^2 + \frac{1}{4} \left( \dot{\gamma}_{22} - \dot{\gamma}_{33} \right)^2 \tag{7.3.71} $$

We focus on the structure of such an expression. Given a symmetric traceless tensor in three dimensions

$$ K_{ij} = K_{ji}; \qquad K_{hh} = 0; \qquad i, j, h = 1, 2, 3 \tag{7.3.72} $$

Fig. 7.11 Integration on the solid angle. The unit vector n singles out the infinitesimal solid angle around its direction

we look for an SO(3) invariant way of rewriting the tensor combination appearing in (7.3.71), namely:

$$ U \equiv (K_{23})^2 + \frac{1}{4} \left( K_{22} - K_{33} \right)^2 \tag{7.3.73} $$

This can be done in terms of a unit vector:

$$ n^i = (1, 0, 0) \tag{7.3.74} $$

whose physical meaning is that of propagation axis of the considered gravitational plane wave. We begin with some identities:

$$ \frac{1}{2}\, K_{ij} K_{ij} = \frac{1}{2} \left( K_{11}^2 + K_{22}^2 + K_{33}^2 + 2K_{12}^2 + 2K_{13}^2 + 2K_{23}^2 \right) \tag{7.3.75} $$

$$ -K_{ij}\, n^j\, K_{i\ell}\, n^\ell = -\left( K_{11}^2 + K_{12}^2 + K_{13}^2 \right) \tag{7.3.76} $$

$$ \frac{1}{4} \left( K_{ij}\, n^i n^j \right)^2 = \frac{1}{4}\, K_{11}^2 \tag{7.3.77} $$

Hence we conclude that:

$$ U = \frac{1}{2}\, K_{ij} K_{ij} - K_{i\ell} K_{ik}\, n^\ell n^k + \frac{1}{4} \left( K_{ij}\, n^i n^j \right)^2 \tag{7.3.78} $$

Using this result in our expression for the stress-energy tensor of a plane wave traveling in the $n$ direction we are in a position to write down the energy radiated away per unit time and per unit solid angle by gravitational emission. Indeed in a solid angle unit $d\Omega$ around the direction $n$ we have (see Fig. 7.11):

$$
\begin{aligned}
\frac{dE}{dt\, d\Omega} &= A \left[ \frac{1}{2}\, K_{ij} K_{ij} - K_{i\ell} K_{ik}\, n^\ell n^k \right. && (7.3.79) \\
&\qquad \left. + \frac{1}{4} \left( K_{ij}\, n^i n^j \right)^2 \right] && (7.3.80)
\end{aligned}
$$

where $A$ is a multiplicative constant that we will fix later by comparison with our previous results.
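The equality between the component expression (7.3.73) and the SO(3) invariant expression (7.3.78) relies on the tracelessness of $K_{ij}$, and it can be spot-checked numerically. The Python sketch below is a minimal illustration under that assumption: the helper names are hypothetical, indices are zero-based (so `K[1][2]` stands for $K_{23}$), and the random tensors are only test data.

```python
import random


def random_symmetric_traceless(rng):
    """Random 3x3 symmetric tensor with the trace removed, as required by (7.3.72)."""
    K = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i, 3):
            K[i][j] = K[j][i] = rng.uniform(-1.0, 1.0)
    trace = K[0][0] + K[1][1] + K[2][2]
    for i in range(3):
        K[i][i] -= trace / 3.0
    return K


def u_components(K):
    """U as defined in (7.3.73), valid for propagation along the first axis."""
    return K[1][2] ** 2 + 0.25 * (K[1][1] - K[2][2]) ** 2


def u_invariant(K, n):
    """U rewritten with the unit vector n, following (7.3.78)."""
    kk = sum(K[i][j] ** 2 for i in range(3) for j in range(3))
    kn = [sum(K[i][l] * n[l] for l in range(3)) for i in range(3)]
    knkn = sum(v * v for v in kn)
    nkn = sum(n[i] * kn[i] for i in range(3))
    return 0.5 * kk - knkn + 0.25 * nkn ** 2


rng = random.Random(0)
axis = (1.0, 0.0, 0.0)
differences = []
for _ in range(100):
    K = random_symmetric_traceless(rng)
    differences.append(abs(u_components(K) - u_invariant(K, axis)))
```

If the trace is not removed, the two expressions generally differ, which makes explicit where the condition $K_{hh} = 0$ enters.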

7.3.4.1 Integration on Solid Angles

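We rely on the orthogonality of spherical harmonics:

$$
\begin{aligned}
\frac{1}{4\pi} \int d\Omega\, n^i n^j &= \frac{1}{4\pi} \int \sin\theta\, d\theta\, d\phi\; n^i n^j && (7.3.81) \\
&= \frac{1}{3}\, \delta^{ij} && (7.3.82)
\end{aligned}
$$

which immediately follows from:

$$
\begin{aligned}
n^1 &= \cos\theta && (7.3.83) \\
n^2 &= \sin\theta \cos\phi && (7.3.84) \\
n^3 &= \sin\theta \sin\phi && (7.3.85)
\end{aligned}
$$

Similarly we have:

$$ \frac{1}{4\pi} \int d\Omega\, n^k n^\ell n^m n^r = \mathrm{const} \left( \delta^{k\ell} \delta^{mr} + \delta^{km} \delta^{\ell r} + \delta^{kr} \delta^{\ell m} \right) \tag{7.3.86} $$

The constant can be immediately fixed by taking the trace $\ell = k$ and comparing with the previous result:

$$ \frac{1}{4\pi} \int d\Omega\, n^m n^r = 5\, \mathrm{const}\; \delta^{mr} \quad \rightarrow \quad \mathrm{const} = \frac{1}{15} \tag{7.3.87} $$

Hence integrating on the solid angles we get:

$$
\begin{aligned}
\frac{dE}{dt} = \int \frac{dE}{dt\, d\Omega}\, d\Omega &= 4\pi A \left( \frac{1}{2} - \frac{1}{3} + \frac{2}{15 \cdot 4} \right) K_{ij} K_{ij} && (7.3.88) \\
&= \frac{4\pi A}{5}\, K_{ij} K_{ij} && (7.3.89)
\end{aligned}
$$

Recalling the normalization of the energy density in the $n$-th direction:

$$ t_{0i}\, n^i = -\frac{1}{16\pi G} \left( \dot{a}^2 + \dot{b}^2 \right) \tag{7.3.90} $$

and the normalization of the solution for the metric perturbation:

$$ \gamma_{ij} = \frac{1}{R}\, \frac{2}{3}\, G\, \frac{\partial^2 Q_{ij}}{\partial t^2} \tag{7.3.91} $$

we obtain

$$ \frac{dE}{dt\, d\Omega}\, d\Omega = \frac{1}{16\pi G}\, \frac{1}{R^2}\, \frac{4}{9}\, G^2 \left[ \frac{1}{2} \left( \frac{\partial^3 Q_{ij}}{\partial t^3} \right)^2 + \cdots \right] R^2\, d\Omega \tag{7.3.92} $$

so that we conclude with the identifications

$$ A = \frac{G}{36\pi}; \qquad K_{ij} = \frac{\partial^3 Q_{ij}}{\partial t^3} \tag{7.3.93} $$

and with the celebrated Einstein formula:

$$ \frac{dE}{dt} = \frac{G}{45 c^5} \left( \frac{\partial^3 Q_{ij}}{\partial t^3} \right)^2 \tag{7.3.94} $$

expressing the energy radiated per unit time in terms of the square of the third derivative of the quadrupole moment.⁶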
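The angular averages (7.3.82) and (7.3.87), and the arithmetic that turns (7.3.88) into (7.3.89) and then into the coefficient of (7.3.94), can be reproduced with a few lines of Python. This is only a sketch: `sphere_average` uses a plain midpoint rule on a $(\theta, \phi)$ grid, and the function names and grid sizes are assumptions made for the illustration.

```python
import math
from fractions import Fraction


def sphere_average(f, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of (1/4pi) * integral of f(n) over the unit sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for a in range(n_theta):
        theta = (a + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for b in range(n_phi):
            phi = (b + 0.5) * d_phi
            # Parametrization (7.3.83)-(7.3.85) of the unit vector n
            n = (cos_t, sin_t * math.cos(phi), sin_t * math.sin(phi))
            total += f(n) * sin_t
    return total * d_theta * d_phi / (4.0 * math.pi)


def angular_coefficient():
    """Exact value of the bracket in (7.3.88): 1/2 - 1/3 + 2/(15*4)."""
    return Fraction(1, 2) - Fraction(1, 3) + Fraction(2, 15 * 4)


def quadrupole_power(q_dddot, G=1.0, c=1.0):
    """Einstein formula (7.3.94) for a given 3x3 third time derivative of Q_ij."""
    return G / (45.0 * c**5) * sum(q_dddot[i][j] ** 2 for i in range(3) for j in range(3))


avg_n1n1 = sphere_average(lambda n: n[0] * n[0])        # close to 1/3
avg_n1n2 = sphere_average(lambda n: n[0] * n[1])        # close to 0
avg_n1n1n2n2 = sphere_average(lambda n: (n[0] * n[1]) ** 2)  # close to 1/15
```

With $A = G/(36\pi)$ from (7.3.93), the factor $4\pi A/5$ of (7.3.89) equals $G/45$, which is the prefactor used in `quadrupole_power`.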

2 1 2 ∂ Qij γij = G (7.3.91) R 3 ∂t2 we obtain 3 2 dE 1 1 4 1 ∂ Qij dΩ = G2 +··· R2 dΩ (7.3.92) dt dΩ 16πG R2 9 2 ∂t3 298 7 Gravitational Waves and the Binary Pulsars so that we conclude with the identifications 3 G ∂ Qij A = ; Kij = (7.3.93) 36π ∂t3 and with the celebrated Einstein formula: 3 2 dE G ∂ Qij = (7.3.94) dt 45c5 ∂t3 expressing the energy radiated per unit time in terms of the square of the third deriva- tive of the quadrupole moment.6

7.4 Quadrupole Radiation from the Binary Pulsar System

Having retrieved Einstein's 1918 formula for the emission power of quadrupole radiation, we now consider the analysis of a two-body system like that of the binary pulsar PSR1913+16 discovered by Hulse and Taylor (see Fig. 7.12). Our goal is that of computing the variable quadrupole moment of such a system in order to insert the result into the Einstein formula and obtain predictions about the energy loss through gravitational wave emission. Clearly such energy loss will result in a shrinking of the system and of its revolution period. The rate of that shrinking turns out to be a measurable quantity which can be compared with theoretical predictions, thus providing very stringent tests of General Relativity. Our discussion follows closely and updates the treatment of the same problem presented in the book [8].

7.4.1 Keplerian Parameters of a Binary Star System

We begin with a Kepler-Newton description of the orbit, which will be corrected by General Relativity effects like the periastron advance. From the viewpoint of Newtonian mechanics a two-body system can be reduced to a one-body problem in the case of central forces. For a potential:

$$ V(r_{12}) = -\frac{k}{r_{12}} \tag{7.4.1} $$

introducing the reduced mass:

$$ \mu = \frac{\mu_1 \mu_2}{\mu_1 + \mu_2} \tag{7.4.2} $$

⁶ Note the appearance of the factor $c^5$ in the denominator of the final formula (7.3.94). The speed of light has been reinstated at the appropriate power on the basis of dimensional analysis. In the previous steps of the calculation we always used natural units in which $c = 1$, namely our time variable was actually $t \sim ct$.

Fig. 7.12 Schema of the binary pulsar system PSR1913+16 discovered by Russell Hulse and Joseph Taylor in 1974. For this discovery they received the 1993 Nobel Prize

and naming $r = r_{12}$, the solution of the dynamical problem is given by the elliptic orbits:

$$ r = \frac{a(1 - e^2)}{1 + e\cos\theta} \tag{7.4.3} $$

where the geometrical parameters are related to the mechanical integrals of motion, namely the energy $E$ and the angular momentum $\ell$, by:

$$ a = -\frac{k}{2E}; \qquad e^2 = 1 + \frac{2E\ell^2}{\mu k^2} \tag{7.4.4} $$

In the case of the Newtonian potential:

$$ k = G \mu_1 \mu_2 \tag{7.4.5} $$

and therefore

$$ a = -\frac{\mu_1 \mu_2 G}{2E}; \qquad e^2 = 1 + \frac{2E\ell^2 (\mu_1 + \mu_2)}{\mu_1^3 \mu_2^3 G^2} \tag{7.4.6} $$

Hence if we deduce the geometrical parameters of a binary star system orbit, we can calculate the physical parameters (masses and angular momentum). How do we deduce the geometrical parameters? Help comes from the periastron advance that, in the case of the binary pulsar, can be measured notwithstanding the enormous distance of the system from the Earth (see Fig. 7.13).
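The relations (7.4.4)-(7.4.6) between the integrals of motion and the shape of the orbit can be sketched in Python as follows. The function names and the sample values are illustrative assumptions. The inverse map uses $\ell^2 = \mu k a (1 - e^2)$, which is just (7.4.4) solved for $\ell$ and is not written in this form in the text.

```python
import math


def orbit_from_integrals(E, ell, mu1, mu2, G=1.0):
    """Semi-major axis a and eccentricity e of a bound orbit (E < 0), from (7.4.4)-(7.4.6)."""
    k = G * mu1 * mu2                      # coupling of the Newtonian potential (7.4.5)
    mu = mu1 * mu2 / (mu1 + mu2)           # reduced mass (7.4.2)
    a = -k / (2.0 * E)
    e_squared = 1.0 + 2.0 * E * ell**2 / (mu * k**2)
    return a, math.sqrt(max(e_squared, 0.0))


def integrals_from_orbit(a, e, mu1, mu2, G=1.0):
    """Inverse map: energy and angular momentum from a and e."""
    k = G * mu1 * mu2
    mu = mu1 * mu2 / (mu1 + mu2)
    E = -k / (2.0 * a)
    ell = math.sqrt(mu * k * a * (1.0 - e * e))
    return E, ell


# Round trip on arbitrary sample values (not data of any real system)
E_sample, ell_sample = integrals_from_orbit(2.0, 0.5, 1.0, 3.0)
a_back, e_back = orbit_from_integrals(E_sample, ell_sample, 1.0, 3.0)
```

The round trip returns the starting values of $a$ and $e$, which is a quick consistency check of the two formulas in (7.4.4).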

Fig. 7.13 The periastron advance in the pulsar binary system PSR1913+16

In Chap. 4 we have derived the following perturbative formula for the periastron advance:

$$ \Delta\varphi = 6\pi\, \frac{m}{a(1 - e^2)}; \qquad m = \frac{GM}{c^2} \tag{7.4.7} $$

On the other hand, according to Kepler's third law the period is given by:

$$ T = 2\pi\, a^{3/2} \sqrt{\frac{\mu}{k}} \tag{7.4.8} $$

Kepler's third law follows from integration of the area derivative

$$ \frac{dA}{dt} = \frac{1}{2}\, r^2 \dot{\theta} = \frac{\ell}{2\mu} \quad \rightarrow \quad A = \frac{\ell\, T}{2\mu} = \pi a b \tag{7.4.9} $$

Hence for a binary star system we can write the two equations:

$$ T = 2\pi\, a^{3/2}\, \frac{1}{\sqrt{G(\mu_1 + \mu_2)}} \tag{7.4.10} $$

$$ \Delta\varphi = 6\pi\, \frac{G(\mu_1 + \mu_2)}{c^2}\, \frac{1}{a(1 - e^2)} \tag{7.4.11} $$

and we calculate the angle increase per unit time by dividing through the period:

$$ \left( \frac{\Delta\varphi}{T} \right)_{BS} = 3\, \frac{\left[ G(\mu_1 + \mu_2) \right]^{3/2}}{c^2}\, \frac{1}{a_{BS}^{5/2} \left( 1 - e_{BS}^2 \right)} \tag{7.4.12} $$

Let us make a numerical comparison between the binary system case and the perihelion advance of Mercury, where:

$$ \left( \frac{\Delta\varphi}{T} \right)_{Merc} = 3\, \frac{\left[ G M_S \right]^{3/2}}{c^2}\, \frac{1}{a_M^{5/2} \left( 1 - e_M^2 \right)} \tag{7.4.13} $$

$M_S$ being the solar mass and $a_M$, $e_M$ the geometrical parameters of Mercury's orbit, while $a_{BS}$, $e_{BS}$ are those of the binary star system. Defining the dimensionless factors:

$$ x = \frac{\mu_1 + \mu_2}{M_S} \tag{7.4.14} $$

$$ y = \frac{a_M}{a_{BS}} \tag{7.4.15} $$

$$ z = \frac{1 - e_M^2}{1 - e_{BS}^2} \tag{7.4.16} $$

we obtain the relation

$$ \left( \frac{\Delta\varphi}{T} \right)_{BS} = y^{5/2}\, x^{3/2}\, z \left( \frac{\Delta\varphi}{T} \right)_{Merc} \tag{7.4.17} $$

Numerically we have:

$$ z = 1.547; \qquad y = 28.46; \qquad x = 2.8275 \tag{7.4.18} $$

and hence the factor

$$ f = y^{5/2}\, x^{3/2}\, z \sim 31782 \tag{7.4.19} $$

While for Mercury the angular advance is

$$ 42''/\text{century} \tag{7.4.20} $$

for the pulsar binary system we get:

$$ \Delta\varphi \sim 3.7\ \text{deg/year} \tag{7.4.21} $$

As we see, the overwhelming contribution to this enhancement is due to the narrowness of the system, namely the ratio between the semilati recti.
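The arithmetic behind (7.4.18)-(7.4.21) is short enough to reproduce directly. The Python sketch below takes the three dimensionless ratios and Mercury's 42 arcseconds per century exactly as quoted in the text; the function names are illustrative.

```python
def enhancement_factor(x, y, z):
    """Factor f = y^(5/2) * x^(3/2) * z of (7.4.17) and (7.4.19)."""
    return y**2.5 * x**1.5 * z


def advance_deg_per_year(mercury_arcsec_per_century, factor):
    """Scale Mercury's advance by f and convert arcsec/century to degrees/year."""
    arcsec_per_century = mercury_arcsec_per_century * factor
    return arcsec_per_century / 3600.0 / 100.0


f = enhancement_factor(x=2.8275, y=28.46, z=1.547)
pulsar_advance = advance_deg_per_year(42.0, f)

# Separate contributions, to see which ratio dominates the enhancement
size_part = 28.46**2.5      # from the ratio of orbit sizes
mass_part = 2.8275**1.5     # from the ratio of masses
```

Evaluating the two partial factors separately shows that the ratio of the orbit sizes contributes by far the largest share of $f$, in line with the remark closing the paragraph above.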

7.4.2 Shrinking of the Orbit and Gravitational Waves

We have indirect evidence of the emission of gravitational waves from the decrease of the period T and the consequent shrinking of the orbit, namely the decrease of the semilatus rectum. From (7.4.6), by taking a time derivative, we obtain

Fig. 7.14 The decrease in the period and the shrinking of the orbit for the pulsar binary system PSR1913+16

$$\frac{da}{dt} = \frac{G\mu_1\mu_2}{2E^2}\,\frac{dE}{dt} \tag{7.4.22}$$
so that a shrinking of the orbit corresponds to a decrease of the energy of the system. Such energy is radiated away in the form of gravitational waves. It is extremely interesting to perform an accurate calculation of this energy loss by quadrupole radiation in order to compare it with the experimental data on the reduction of the period (see Fig. 7.14).

7.4.2.1 Calculation of the Tensor

We consider Fig. 7.15 and, calling $\mathbf{r}$ the vector joining one of the two stars with the other, we can write:

$$\mathbf{r}_1 = \frac{\mu_2}{\mu_1+\mu_2}\,\mathbf{r}; \qquad \mathbf{r}_2 = \frac{\mu_1}{\mu_1+\mu_2}\,\mathbf{r} \tag{7.4.23}$$
where, according to Kepler's laws and the Newtonian solution of the dynamical problem, we have:
$$r \equiv |\mathbf{r}| = \frac{a(1-e^2)}{1+e\cos\theta} \tag{7.4.24}$$
Then we define the moment of inertia according to:
$$I^{k\ell} \equiv \int \rho(x)\,x^k x^\ell\,d^3x = \mu_1 r_1^k r_1^\ell + \mu_2 r_2^k r_2^\ell = \frac{\mu_1\mu_2}{\mu_1+\mu_2}\,r^k r^\ell \tag{7.4.25}$$
Then turning to polar coordinates:

$$r^1 \equiv x = r\cos\theta; \qquad r^2 \equiv y = r\sin\theta; \qquad r^3 \equiv z = 0 \tag{7.4.26}$$

Fig. 7.15 The vectors defining the position of the two stars with respect to their center of mass

we obtain:

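$$\begin{aligned}
I_{xx} &= \frac{\mu_1\mu_2}{\mu_1+\mu_2}\,r^2\cos^2\theta \\
I_{yy} &= \frac{\mu_1\mu_2}{\mu_1+\mu_2}\,r^2\sin^2\theta \\
I_{xy} &= \frac{\mu_1\mu_2}{\mu_1+\mu_2}\,r^2\sin\theta\cos\theta \\
I \equiv \mathrm{Tr}\,I &= \frac{\mu_1\mu_2}{\mu_1+\mu_2}\,r^2
\end{aligned} \tag{7.4.27}$$

The angular momentum, on the other hand, is:
$$\ell = \frac{\mu_1\mu_2}{\mu_1+\mu_2}\,r^2\dot\theta \tag{7.4.28}$$

Recalling the relation between the angular momentum $\ell$, the energy E and the geometrical parameters a and e of the orbit, displayed in (7.4.6), by comparison with (7.4.28) we obtain:
$$\begin{aligned}
\dot\theta &= \frac{1}{r^2}\sqrt{(\mu_1+\mu_2)(1-e^2)\,a\,G} \\
\dot r &= e\sin\theta\,\sqrt{\frac{(\mu_1+\mu_2)\,G}{a(1-e^2)}}
\end{aligned} \tag{7.4.29}$$

This information suffices to calculate the third derivative of the inertia tensor which, turning to natural units where G = c = 1, takes the following form:

$$\begin{aligned}
\frac{d^3 I_{xx}}{dt^3} &= 2\,\frac{\mu_1\mu_2}{a(1-e^2)}\left(2\sin 2\theta + 3e\cos^2\theta\sin\theta\right)\dot\theta \\
\frac{d^3 I_{yy}}{dt^3} &= -2\,\frac{\mu_1\mu_2}{a(1-e^2)}\left(2\sin 2\theta + 3e\cos^2\theta\sin\theta + e\sin\theta\right)\dot\theta \\
\frac{d^3 I_{xy}}{dt^3} &= -2\,\frac{\mu_1\mu_2}{a(1-e^2)}\left(2\cos 2\theta - e\cos\theta + 3e\cos^3\theta\right)\dot\theta \\
\frac{d^3 I}{dt^3} &= -2\,\frac{\mu_1\mu_2}{a(1-e^2)}\,e\sin\theta\,\dot\theta
\end{aligned} \tag{7.4.30}$$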

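The quadrupole moment $Q_{ij}$ is related to the moment of inertia by a very simple relation:
$$Q_{ij} = 3\left(I_{ij} - \frac{1}{3}\,\delta_{ij}\,I\right) \tag{7.4.31}$$
so that by means of the above results we can immediately calculate its third derivative. The relevant squared modulus appearing in the Einstein formula (7.3.94) is then easily obtained:
$$\frac{1}{9}\,\dddot{Q}_{ij}\,\dddot{Q}^{ij} = \dddot{I}_{xx}^{\,2} + 2\,\dddot{I}_{xy}^{\,2} + \dddot{I}_{yy}^{\,2} - \frac{1}{3}\,\dddot{I}^{\,2} \tag{7.4.32}$$
and in natural units G = c = 1, we obtain:

$$-\frac{dE}{dt} = \frac{1}{45}\,\bigl|\dddot{Q}_{ij}\bigr|^2 = \frac{8\mu_1^2\mu_2^2}{15\,a^2(1-e^2)^2}\left[12(1+e\cos\theta)^2 + e^2\sin^2\theta\right]\dot\theta^2 \tag{7.4.33}$$

We can average the energy loss over one revolution period, defining:
$$-\left\langle\frac{dE}{dt}\right\rangle = -\frac{1}{T}\int_0^{2\pi}\frac{dE}{dt}\,\frac{d\theta}{\dot\theta} \tag{7.4.34}$$
With straightforward algebra we obtain:

$$\begin{aligned}
-\frac{1}{\dot\theta}\,\frac{dE}{dt} &= \frac{8\mu_1^2\mu_2^2}{15\,a^2(1-e^2)^2}\left[12(1+e\cos\theta)^2 + e^2\sin^2\theta\right]\dot\theta \\
&= \frac{32\,\mu_1^2\mu_2^2}{5\,\bigl(a(1-e^2)\bigr)^{7/2}}\,\sqrt{\mu_1+\mu_2}\;f(\theta,e)
\end{aligned} \tag{7.4.35}$$
where
$$f(\theta,e) \equiv (1+e\cos\theta)^2\left[(1+e\cos\theta)^2 + \frac{1}{12}\,e^2\sin^2\theta\right]$$
Using Kepler's third law we can express the period of revolution in terms of the semilatus rectum:
$$T = \frac{2\pi a^{3/2}}{\sqrt{\mu_1+\mu_2}} \tag{7.4.36}$$
which, inserted in (7.4.34), yields
$$-\left\langle\frac{dE}{dt}\right\rangle = \frac{32}{5}\,\frac{\mu_1^2\mu_2^2(\mu_1+\mu_2)}{(1-e^2)^{7/2}}\,\frac{1}{a^5}\,\frac{1}{2\pi}\int_0^{2\pi} f(\theta,e)\,d\theta \tag{7.4.37}$$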

The integral appearing in (7.4.37) is easily evaluated and one obtains the following final expression for the average energy loss during each revolution (still written in natural units G = c = 1):
$$-\left\langle\frac{dE}{dt}\right\rangle_T = \frac{32}{5}\,\mu_1^2\mu_2^2(\mu_1+\mu_2)\,f(e)\,\frac{1}{a^5} \tag{7.4.38}$$
where:
$$f(e) \equiv \frac{1}{(1-e^2)^{7/2}}\left(1 + \frac{73}{24}\,e^2 + \frac{37}{96}\,e^4\right) \tag{7.4.39}$$
From (7.4.4), relating the semilatus rectum to the energy of the orbit, we work out:
$$-\left\langle\frac{da}{dt}\right\rangle_T = -\frac{2a^2}{\mu_1\mu_2}\left\langle\frac{dE}{dt}\right\rangle_T \tag{7.4.40}$$
and from Kepler's third law we get:

$$\frac{\dot T}{T} = \frac{3}{2}\,\frac{\dot a}{a} \tag{7.4.41}$$
so that
$$\frac{\dot T}{T} = -\frac{96}{5}\,\frac{1}{a^4}\,\mu_1\mu_2(\mu_1+\mu_2)\,f(e) \tag{7.4.42}$$
Using once again Kepler's third law to express the semilatus rectum in terms of the period, and reinstating the fundamental constants by means of dimensional analysis, we obtain the final expression for the derivative of the period as a function of the period, the masses of the two stars composing the system and the eccentricity. We have:

$$\begin{aligned}
\dot T &= -\frac{1}{T^{5/3}}\;u(G,c)\cdot g(\mu_1,\mu_2)\cdot f(e) \\
u(G,c) &= \frac{96}{5}\,(2\pi)^{8/3}\,\frac{G^{5/3}}{c^5} \\
g(\mu_1,\mu_2) &= \frac{\mu_1\mu_2}{(\mu_1+\mu_2)^{1/3}}
\end{aligned} \tag{7.4.43}$$
If we insert in this formula the data of the binary pulsar system PSR1913+16, recalled in Table 7.1, we obtain the following theoretical value for the time derivative of the revolution period:

$$\dot T_{theor} = -2.435\times 10^{-12} \tag{7.4.44}$$
which is to be compared with the measured experimental value:

$$\dot T_{exp} = (-2.30 \pm 0.22)\times 10^{-12} \tag{7.4.45}$$

The remarkably good agreement between the theoretical and experimental values is strong indirect evidence of the emission of gravitational waves.
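The chain (7.4.35)-(7.4.44) can be checked numerically. The sketch below is not the author's code: it assumes standard SI values for G, c and the solar mass (none of them is quoted in the text), takes the rounded inputs of Table 7.1, and uses function names introduced here for illustration. It first compares the closed form (7.4.39) with a direct orbit average of f(θ, e), then evaluates the period derivative from (7.4.43).

```python
import math

# Standard SI constants (assumed values, not quoted in the text)
G = 6.674e-11        # m^3 kg^-1 s^-2
C_LIGHT = 2.998e8    # m/s
M_SUN = 1.989e30     # kg


def f_closed(e):
    """Eccentricity factor f(e) of (7.4.39)."""
    return (1 + 73 / 24 * e**2 + 37 / 96 * e**4) / (1 - e**2) ** 3.5


def f_numeric(e, n=2000):
    """Same factor obtained by averaging f(theta, e) over one turn."""
    total = 0.0
    for k in range(n):
        theta = 2 * math.pi * (k + 0.5) / n      # midpoint rule
        c = 1 + e * math.cos(theta)
        total += c**2 * (c**2 + (e * math.sin(theta)) ** 2 / 12)
    return total / n / (1 - e**2) ** 3.5


def alpha(m1, m2, e):
    """u(G,c) * g(mu1,mu2) * f(e) in s^(5/3); masses in kg."""
    u = 96 / 5 * (2 * math.pi) ** (8 / 3) * G ** (5 / 3) / C_LIGHT**5
    g = m1 * m2 / (m1 + m2) ** (1 / 3)
    return u * g * f_closed(e)


def period_derivative(m1, m2, e, period_s):
    """Dimensionless dT/dt from (7.4.43)."""
    return -alpha(m1, m2, e) / period_s ** (5 / 3)


# Rounded data of Table 7.1 for PSR1913+16
e_bs = 0.62
t_dot = period_derivative(1.441 * M_SUN, 1.387 * M_SUN, e_bs, 7.752 * 3600)

print(f"f(e): closed = {f_closed(e_bs):.6f}, numeric = {f_numeric(e_bs):.6f}")
print(f"dT/dt = {t_dot:.3e}")
```

With these rounded inputs the result comes out close to, but not identical with, the value quoted in (7.4.44); the difference reflects the rounding of the eccentricity and of the constants assumed here.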

Table 7.1 Data of the binary pulsar system PSR1913+16

Right ascension                        19 h 13 m 12.4 s
Declination                            +16° 01′ 08″
Distance                               21,000 light years
Mass of detected pulsar                1.441 × M_Sun
Mass of companion                      1.387 × M_Sun
Rotational period of detected pulsar   59.02 ms
Diameter of each neutron star          20 km
Orbital period                         7.752 h
Eccentricity                           0.62
Semilatus rectum                       1.95 × 10^6 km

7.4.3 The Fate of the Binary System

Defining

$$\alpha \equiv u(G,c)\cdot g(\mu_1,\mu_2)\cdot f(e) \tag{7.4.46}$$
whose numerical value is:

$$\alpha = 0.625128\times 10^{-4}\ \mathrm{s}^{5/3} \tag{7.4.47}$$

the revolution period obeys the following differential equation:

$$\frac{dT}{dt} + \alpha\,T^{-5/3} = 0 \tag{7.4.48}$$
which is immediately integrated. Taking the present instant as initial time and fixing the boundary condition $T(0) = T_0 = 7.751$ h, we obtain:
$$T(t) = \left(T_0^{8/3} - \frac{8\,t\,\alpha}{3}\right)^{3/8} \tag{7.4.49}$$
Hence the period constantly decreases while the orbit radius shrinks, and eventually T will reduce to zero when the two stars come so close as to coalesce. From (7.4.49) we can get a rough estimate of the time needed to reach coalescence. Such an estimate is determined by solving $T(t_f) = 0$ for $t_f$. We obtain:

$$t_f = \frac{3\,T_0^{8/3}}{8\alpha} = 4.2983\times 10^{15}\ \mathrm{s} = 1.382\times 10^{8}\ \text{years} \tag{7.4.50}$$
In other words, the two neutron stars will fall one on top of the other in about 140 million years. Clearly the quadrupole approximation will lose its validity when approaching coalescence. At short distances the non-linear nature of Einstein equations will play an essential role, and the only known methods to calculate gravitational radiation in such situations are numerical.
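As a sanity check on the integration of (7.4.48), the following sketch compares the closed-form solution with a plain fourth-order Runge-Kutta integration and evaluates the coalescence time. It assumes α positive, so that the period decreases; the numerical value is the magnitude quoted in (7.4.47), and the function names and the 365.25-day year are choices made here.

```python
ALPHA = 0.625128e-4          # s^(5/3), magnitude of the value in (7.4.47)
T0 = 7.751 * 3600.0          # s, initial period used in the text
SECONDS_PER_YEAR = 365.25 * 86400.0   # assumed convention for one year


def period_closed(t, t0=T0, alpha=ALPHA):
    """Closed-form solution of dT/dt = -alpha * T^(-5/3) with T(0) = t0."""
    return (t0 ** (8 / 3) - 8 * alpha * t / 3) ** (3 / 8)


def coalescence_time(t0=T0, alpha=ALPHA):
    """Time at which the closed-form period reaches zero."""
    return 3 * t0 ** (8 / 3) / (8 * alpha)


def period_rk4(t_end, steps=1000, t0=T0, alpha=ALPHA):
    """Independent numerical integration of the same equation."""
    def rhs(period):
        return -alpha * period ** (-5 / 3)

    h = t_end / steps
    period = t0
    for _ in range(steps):
        k1 = rhs(period)
        k2 = rhs(period + 0.5 * h * k1)
        k3 = rhs(period + 0.5 * h * k2)
        k4 = rhs(period + h * k3)
        period += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return period


t_f = coalescence_time()
half = 0.5 * t_f
print(f"t_f = {t_f:.4e} s = {t_f / SECONDS_PER_YEAR:.3e} years")
print(f"T(t_f/2): closed = {period_closed(half):.3f} s, RK4 = {period_rk4(half):.3f} s")
```

The integration stops at half the coalescence time on purpose: near the end point the right-hand side diverges and a fixed-step integrator is no longer reliable, which mirrors the remark that the approximation itself breaks down there.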

Fig. 7.16 The double pulsar system PSR J0737-3039A/B, in the representation of an artist

7.4.4 The Double Pulsar

On December 12th 2003, the official announcement of a new exciting discovery appeared in journals and on the Internet. An international team of radio-astronomers, including a strong and driving group of Italians,7 found the Double Pulsar system officially named PSR J0737-3039A/B (see Fig. 7.16). This system is quite similar to the system PSR1913+16, but it has some additional features that make it an extraordinarily precise laboratory to test General Relativity and Neutron Star Physics under extreme strong field conditions. The relevant data are displayed in Table 7.2. The first feature, in contrast with the case of PSR1913+16, is that both members of PSR J0737-3039A/B are pulsars and therefore they are both directly detectable. The second notable feature of the system is its extreme narrowness, which emphasizes all General Relativity effects. The orbit has low eccentricity but the semilatus rectum is less than a million kilometers, which results in a revolution period of just 2.4 hours. The periastron advance is accordingly very high and its measured value perfectly fits the predictions of General Relativity. Similarly, the shortness of the revolution period allowed a rapid measurement of its shrinking with very high statistics, and the indirect evidence of the emission of gravitational waves was tested once again, in excellent agreement with General Relativity.

7 The Italian team participating in the discovery consists of members of INAF, the Istituto Nazionale di Astrofisica, belonging to the Cagliari Pulsar Group and to the Universities of Cagliari and Bologna, including Marta Burgay, Andrea Possenti and Nichi d'Amico. The main international partners of the collaboration were the Jodrell Bank pulsar group in Manchester, the ATNF pulsar group in Sydney (Australia) and the Swinburne pulsar group, also in Australia (Melbourne). Finally, the European Pulsar Timing Array collaboration was also involved. The radio-telescope used for the discovery is the Parkes radio telescope in Australia (see Fig. 7.17). Further observations were carried out at the Northern Cross radio-telescope near Bologna in Italy and in other European radio observatories.

Fig. 7.17 The Parkes radio-telescope located in Australia and used for the discovery of the double pulsar system PSR J0737-3039A/B

Table 7.2 Data of the binary pulsar system PSR J0737-3039A/B

Constellation                    Canis Major
Right ascension                  07 h 37 m 51.247 s
Declination                      −30° 39′ 40.74″
Distance                         2,000 light years
Mass of pulsar A                 1.337 × M_Sun
Mass of pulsar B                 1.250 × M_Sun
Rotational period of pulsar A    23 ms
Rotational period of pulsar B    2.8 s
Diameter of each neutron star    ∼20 km
Orbital period                   2.4 h
Eccentricity                     0.088
Semilatus rectum                 0.86 × 10^6 km

Using the measured parameters we obtain an estimate of the life-time of this system of approximately 80 million years, which is considerably shorter than the life-time of PSR1913+16:

$$t_f = \frac{3\,T_0^{8/3}}{8\alpha} \simeq 80\times 10^{6}\ \text{years} \tag{7.4.51}$$
Last, but not least, the system PSR J0737-3039A/B is much closer to Earth than PSR1913+16: it is only 2000 light years away from us. All these properties make it an extraordinary laboratory of General Relativity which, so far, has confirmed all of its predictions.
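The life-time estimate (7.4.51) follows from the same formulas used for PSR1913+16. The sketch below applies them to the rounded entries of Tables 7.1 and 7.2; the SI constants, the 365.25-day year and the function name `lifetime_years` are assumptions introduced here, so the output should be read as an order-of-magnitude check.

```python
import math

G = 6.674e-11        # m^3 kg^-1 s^-2 (assumed standard value)
C_LIGHT = 2.998e8    # m/s
M_SUN = 1.989e30     # kg
YEAR = 365.25 * 86400.0


def lifetime_years(m1_sun, m2_sun, period_hours, e):
    """Coalescence time t_f = 3 T0^(8/3) / (8 alpha), with alpha from (7.4.43)."""
    m1, m2 = m1_sun * M_SUN, m2_sun * M_SUN
    t0 = period_hours * 3600.0
    f_e = (1 + 73 / 24 * e**2 + 37 / 96 * e**4) / (1 - e**2) ** 3.5
    u = 96 / 5 * (2 * math.pi) ** (8 / 3) * G ** (5 / 3) / C_LIGHT**5
    g = m1 * m2 / (m1 + m2) ** (1 / 3)
    alpha = u * g * f_e
    return 3 * t0 ** (8 / 3) / (8 * alpha) / YEAR


hulse_taylor = lifetime_years(1.441, 1.387, 7.752, 0.62)    # Table 7.1
double_pulsar = lifetime_years(1.337, 1.250, 2.4, 0.088)    # Table 7.2

print(f"PSR1913+16:        {hulse_taylor:.2e} years")
print(f"PSR J0737-3039A/B: {double_pulsar:.2e} years")
```

The shorter period wins over the lower eccentricity: the life-time scales as the period to the power 8/3, while the eccentricity only enters through f(e).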

7.5 Conclusive Remarks on Gravitational Waves

Although very difficult to detect directly, gravitational waves are a must for General Relativity which, almost one hundred years after its birth, is more solid than ever, having passed all possible experimental tests. Moreover General Relativity is the conceptual framework in which modern Cosmology has been understood, and it is entangled in an essential way with all proposed schemes for the unification of all fundamental interactions. The quantum particle responsible for the gravitational interaction is the spin s = 2 graviton, and General Relativity appears to be its only possible low energy effective description. Just as in quantum electrodynamics the spin s = 1 photon is the quantum of the electromagnetic waves, in the same way the graviton makes sense only as the quantum of the gravitational waves, which should also exist and propagate classically. The absence of these classical waves would be a deadly blow not only for General Relativity but for the entire structure of our present understanding of the fundamental physical laws. In that case the whole fabric of Physics should be reconsidered. It is however very rewarding that indirect evidence of gravitational wave emission from binary systems keeps piling up, in simple and complete agreement with the 1918 Einstein perturbative formula. In this respect the recent discovery of the double pulsar system is exceptionally relevant. It is a further confirmation of our standard approach to the interpretation of classical and quantum field theories, and it implies that the final detection of the elusive gravitational waves, although difficult, should come true in a reasonably near future.

References

1. Einstein, A.: Über Gravitationswellen. In: Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften, pp. 154–167. Königlich Preussische Akademie der Wissenschaften, Berlin (1918)
2. Einstein, A.: Näherungsweise Integration der Feldgleichungen der Gravitation. In: Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften, pp. 688–696. Königlich Preussische Akademie der Wissenschaften, Berlin (1916)
3. Einstein, A., Rosen, N.: On gravitational waves. J. Franklin Inst. 223, 43–54 (1937)
4. Hu, N.: Radiation damping in the gravitational field. Proc. R. Ir. Acad. A 51, 87–111 (1957)
5. Hulse, R.A., Taylor, J.H.: A high sensitivity pulsar survey. Astrophys. J. Lett. 191, L59–L61 (1974)
6. Hulse, R.A., Taylor, J.H.: Discovery of a pulsar in a binary system. Astrophys. J. 195, L51–L53 (1975)
7. Hulse, R.A.: The discovery of the binary pulsar. In: Les Prix Nobel, 1993, pp. 58–79. The Nobel Foundation (1994)
8. Straumann, N.: General Relativity and Relativistic Astrophysics. Springer, Berlin (1981)

Chapter 8 Conclusion of Volume 1

In the first volume we have presented the theory of General Relativity, comparing it at all times with the other Gauge Theories that describe non-gravitational interactions. We have also followed the complicated historical development of the ideas and of the concepts underlying both of them. In particular we have traced back the origin of our present understanding of all fundamental interactions as mediated by connections on principal fibre-bundles and emphasized the special status of Gravity within this general scheme. While recalling the historical development we have provided a, hopefully rigorous, exposition of all the mathematical foundations of gravity and gauge theories in a contemporary geometrical approach.

In the last two chapters of Volume 1 we have considered relevant astrophysical applications of General Relativity that also provide some of the most accurate tests of its predictions. In Chap. 6 we considered stellar equilibrium and the mass-limits which combine General Relativity and Quantum Mechanics. In Chap. 7 we considered the emission of gravitational waves and the stringent tests of Einstein's theory that come from the binary pulsar systems.

The further historical and conceptual development of the theory is addressed in Volume 2, which covers the following topics:
1. Extended Space-Times, Causal Structure and Penrose Diagrams.
2. Rotating Black-Holes and Thermodynamics.
3. Cosmology and General Relativity: From Hubble to WMAP.
4. The theory of the inflationary universe.
5. The birth of String Theory and Supersymmetry.
6. The conceptual and algebraic foundations of Supergravity.
7. An introduction to the Bulk-Brane dualism with a glance at brane solutions.
8. An introduction to the Supergravity Bestiary.
9. A bird's-eye review of various types of solutions of higher dimensional supergravities.

P.G. Frè, Gravity, a Geometrical Course, DOI 10.1007/978-94-007-5361-7_8, © Springer Science+Business Media Dordrecht 2013

Appendix A: Spinors and Gamma Matrix Algebra

A.1 Introduction to the Spinor Representations of SO(1,D− 1)

The spinor representations of the orthogonal and pseudo-orthogonal groups have different structure in various dimensions. Starting from the representation of the Dirac gamma matrices one begins with a complex representation whose dimension is equal to the dimension of the gammas. A vector in this complex linear space is named a Dirac spinor. Typically Dirac spinors do not form irreducible representations. Depending on the dimensions, one can still impose SO(1, D − 1) invariant conditions on the Dirac spinor that separate it into irreducible parts. These constraints can be of two types:
(a) A reality condition which maintains the number of components of the spinor but relates them to their complex conjugates by means of linear relations. This reality condition is constructed with an invariant matrix C, named the charge conjugation matrix, whose properties depend on the dimension D.
(b) A chirality condition constructed with a chirality matrix Γ_{D+1} that halves the number of components of the spinor. The chirality matrix exists only in even dimensions.
Depending on which conditions can be imposed, besides Dirac spinors, in various dimensions D, one has Majorana spinors, Weyl spinors and, in certain dimensions, also Majorana-Weyl spinors. In this appendix we discuss the properties of gamma matrices and we present the various types of irreducible spinor representations in all relevant dimensions from D = 4 to D = 11. The upper bound D = 11 is dictated by supersymmetry, since supergravity, i.e. the supersymmetric extension of Einstein gravity, can be constructed in all dimensions up to D = 11, which is maximal in this respect. In the present volume supergravity is not discussed, but some glimpses of it will occur in the second volume and for this reason, while discussing the necessary topic of gamma matrix algebra, we present a complete description of the available spinors in the various relevant dimensions.

A.2 The Clifford Algebra

In order to describe spinors one needs the Dirac gamma matrices. These form the Clifford algebra:

$$\{\Gamma_a, \Gamma_b\} = 2\eta_{ab} \tag{A.2.1}$$
where $\eta_{ab}$ is the invariant metric of SO(1, D − 1), which we always choose according to the mostly minus convention, namely:

$$\eta_{ab} = \mathrm{diag}(+,-,-,\dots,-) \tag{A.2.2}$$

To study the general properties of the Clifford algebra (A.2.1) we use a direct construction method. We begin by fixing the following conventions. $\Gamma^0 = \Gamma_0$, corresponding to the time direction, is Hermitian:
$$\Gamma_0^\dagger = \Gamma_0 \tag{A.2.3}$$
while the matrices $\Gamma_i = -\Gamma^i$, corresponding to space directions, are anti-Hermitian:

$$\Gamma_i^\dagger = -\Gamma_i \tag{A.2.4}$$
In the study of Clifford algebras it is necessary to distinguish the case of even and odd dimensions.

A.2.1 Even Dimensions

When D = 2ν is an even number the representation of the Clifford algebra (A.2.1) has dimension:
$$\dim \Gamma_a = 2^{D/2} = 2^{\nu} \tag{A.2.5}$$
In other words the gamma matrices are $2^\nu \times 2^\nu$. The proof of this statement is easily obtained by iteration. Suppose that we have the gamma matrices $\gamma_a$ corresponding to the case $\nu' = \nu - 1$, satisfying the Clifford algebra (A.2.1) in D − 2 dimensions, and that they are $2^{\nu'}$-dimensional. We can write down a representation for the gamma matrices in D dimensions by means of the following $2^\nu \times 2^\nu$ matrices:
$$\Gamma_a = \begin{pmatrix} 0 & \gamma_a \\ \gamma_a & 0 \end{pmatrix}; \quad \Gamma_{D-2} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}; \quad \Gamma_{D-1} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}; \quad a = 0, 1, \dots, D-3 \tag{A.2.6}$$
which satisfy the correct anticommutation relations and have the correct hermiticity properties specified above. This representation admits the following interpretation in terms of matrix tensor products:

$$\Gamma_a = \gamma_a \otimes \sigma_1; \quad \Gamma_{D-2} = 1 \otimes i\sigma_3; \quad \Gamma_{D-1} = 1 \otimes i\sigma_2 \tag{A.2.7}$$
where $\sigma_{1,2,3}$ denote the Pauli matrices:
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}; \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{A.2.8}$$

To complete the proof of our statement we just have to show that for ν = 2, corresponding to D = 4, we have a 4-dimensional representation of the gamma matrices.

This is well established. For instance we have the representation:
$$\gamma_0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad \gamma_{1,2,3} = \begin{pmatrix} 0 & \sigma_{1,2,3} \\ -\sigma_{1,2,3} & 0 \end{pmatrix} \tag{A.2.9}$$
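The iterative construction can be tested mechanically. The following sketch uses plain Python lists (no external library) and reads the tensor product of (A.2.7) as the ordinary Kronecker product; that reading, and all helper names, are assumptions made here. It starts from the D = 4 matrices (A.2.9), lifts them twice with (A.2.7), and checks the Clifford algebra (A.2.1) together with the hermiticity conventions (A.2.3)-(A.2.4).

```python
def kron(A, B):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, A):
    return [[c * x for x in row] for row in A]


def eye(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def dagger(A):
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]


def equal(A, B, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))


S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I2 = eye(2)

# D = 4 representation (A.2.9), written with 2x2 blocks
gamma4 = [kron(S1, I2)] + [kron(scale(1j, S2), s) for s in (S1, S2, S3)]


def lift(gammas):
    """One step D-2 -> D of the construction (A.2.7)."""
    n = len(gammas[0])
    return ([kron(g, S1) for g in gammas]
            + [kron(eye(n), scale(1j, S3)), kron(eye(n), scale(1j, S2))])


def satisfies_conventions(gammas):
    """Check (A.2.1) with mostly-minus metric, plus (A.2.3) and (A.2.4)."""
    n, dim = len(gammas[0]), len(gammas)
    eta = [1] + [-1] * (dim - 1)
    for a in range(dim):
        sign = 1 if a == 0 else -1
        if not equal(dagger(gammas[a]), scale(sign, gammas[a])):
            return False
        for b in range(dim):
            ab, ba = matmul(gammas[a], gammas[b]), matmul(gammas[b], gammas[a])
            anti = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]
            if not equal(anti, scale(2 * eta[a] if a == b else 0, eye(n))):
                return False
    return True


gamma6 = lift(gamma4)
gamma8 = lift(gamma6)
for name, gs in (("D=4", gamma4), ("D=6", gamma6), ("D=8", gamma8)):
    print(name, "size", len(gs[0]), "ok:", satisfies_conventions(gs))
```

Each lift doubles the matrix size and adds two generators, which is the content of (A.2.5).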

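In D = 2ν one can construct the chirality matrix defined as follows:

$$\Gamma_{D+1} = \alpha_D\,\Gamma_0\Gamma_1\Gamma_2\cdots\Gamma_{D-1}; \qquad |\alpha_D|^2 = 1 \tag{A.2.10}$$
where $\alpha_D$ is a phase-factor to be fixed in such a way that:
$$\Gamma_{D+1}^2 = 1 \tag{A.2.11}$$
By direct evaluation one can verify that:

$$\{\Gamma_a, \Gamma_{D+1}\} = 0; \qquad a = 0, 1, 2, \dots, D-1 \tag{A.2.12}$$

The normalization $\alpha_D$ is easily derived. We have:

$$\Gamma_0\Gamma_1\cdots\Gamma_{D-1} = (-)^{\frac{1}{2}D(D-1)}\,\Gamma_{D-1}\Gamma_{D-2}\cdots\Gamma_0 \tag{A.2.13}$$
so that imposing (A.2.11) results in the following equation for $\alpha_D$:

$$\alpha_D^2\,(-)^{\frac{1}{2}D(D-1)}\,(-)^{(D-1)} = 1 \tag{A.2.14}$$
which has solution:

$$\begin{aligned}
\alpha_D &= 1 \quad \text{if } \nu = 2\mu + 1 \sim \text{odd} \\
\alpha_D &= i \quad \text{if } \nu = 2\mu \sim \text{even}
\end{aligned} \tag{A.2.15}$$
By the same token we can show that the chirality matrix is Hermitian:

$$\Gamma_{D+1}^\dagger = \alpha_D^\star\,(-)^{\frac{1}{2}D(D-1)}\,(-)^{(D-1)}\,\Gamma_0\Gamma_1\Gamma_2\cdots\Gamma_{D-1} = \Gamma_{D+1} \tag{A.2.16}$$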

A.2.2 Odd Dimensions

When D = 2ν + 1 is an odd number, the Clifford algebra (A.2.1) can be represented by $2^\nu \times 2^\nu$ matrices. It suffices to take the matrices $\Gamma_a$ corresponding to the even case $D' = D - 1$ and add to them the matrix $\Gamma_{D'} = i\,\Gamma_{D'+1}$, which is anti-Hermitian and anti-commutes with all the other ones.

A.3 The Charge Conjugation Matrix

Since the $\Gamma_a$ and their transposes $\Gamma_a^T$ satisfy the same Clifford algebra, it follows that there must be a similarity transformation connecting these two representations of the same algebra on the same carrier space. This statement relies on Schur's lemma and it is proved in the following way. We introduce the notation:
$$\Gamma_{a_1\dots a_n} \equiv \Gamma_{[a_1}\Gamma_{a_2}\cdots\Gamma_{a_n]} = \frac{1}{n!}\sum_P (-)^{\delta_P}\,\Gamma_{P(a_1)}\cdots\Gamma_{P(a_n)} \tag{A.3.1}$$
where $\sum_P$ denotes the sum over the n! permutations of the indices and $\delta_P$ the parity of the permutation P, i.e. the number of elementary transpositions of which it is composed. The set of all matrices $1, \Gamma_a, \Gamma_{a_1a_2}, \dots, \Gamma_{a_1\dots a_D}$ constitutes a finite group of $2^{[D/2]}$-dimensional matrices. Furthermore the groups generated in this way by $\Gamma_a$, $-\Gamma_a^T$ or $\Gamma_a^T$ are isomorphic. Hence by Schur's lemma two irreducible representations of the same group, with the same dimension and defined over the same vector space, must be equivalent, that is there must be a similarity transformation that connects the two. The matrix realizing such a similarity is called the charge conjugation matrix. Instructed by this discussion we define the charge conjugation matrix by means of the following equations:
$$\begin{aligned}
C_-\,\Gamma_a\,C_-^{-1} &= -\Gamma_a^T \\
C_+\,\Gamma_a\,C_+^{-1} &= \Gamma_a^T
\end{aligned} \tag{A.3.2}$$

By definition $C_\pm$ connects the representation generated by $\Gamma_a$ to that generated by $\pm\Gamma_a^T$. In even dimensions both $C_-$ and $C_+$ exist, while in odd dimensions only one of the two is possible. Indeed in odd dimensions $\Gamma_{D-1}$ is proportional to $\Gamma_0\Gamma_1\cdots\Gamma_{D-2}$, so that the $C_-$ and $C_+$ of D − 1 dimensions yield the same result on $\Gamma_{D-1}$. This decides which C exists in a given odd dimension.

Another important property of the charge conjugation matrix follows from iterating (A.3.2). Using Schur's lemma one concludes that $C_\pm^T = \alpha\,C_\pm$, so that iterating again we obtain $\alpha^2 = 1$. In other words $C_+$ and $C_-$ are either symmetric or antisymmetric. We do not dwell on the derivation, which can be obtained by explicit iterative construction of the gamma matrices in all dimensions, and we simply collect below the results for the properties of $C_\pm$ in the various relevant dimensions (see Table A.1).

Table A.1 Structure of charge conjugation matrices in various space-time dimensions

D     C_+ (real: C_+* = C_+)            C_- (real: C_-* = C_-)
4     C_+^T = −C_+;  C_+^2 = −1         C_-^T = −C_-;  C_-^2 = −1
5     C_+^T = −C_+;  C_+^2 = −1         (none)
6     C_+^T = −C_+;  C_+^2 = −1         C_-^T = C_-;   C_-^2 = 1
7     (none)                            C_-^T = C_-;   C_-^2 = 1
8     C_+^T = C_+;   C_+^2 = 1          C_-^T = C_-;   C_-^2 = 1
9     C_+^T = C_+;   C_+^2 = 1          (none)
10    C_+^T = C_+;   C_+^2 = 1          C_-^T = −C_-;  C_-^2 = −1
11    (none)                            C_-^T = −C_-;  C_-^2 = −1

A.4 Majorana, Weyl and Majorana-Weyl Spinors

The Dirac conjugate of a spinor ψ is defined by the following operation:

$$\bar\psi \equiv \psi^\dagger\,\Gamma_0 \tag{A.4.1}$$
and the charge conjugate of ψ is defined as:

$$\psi^c = C\,\bar\psi^T \tag{A.4.2}$$
where C is the charge conjugation matrix. When we have such an option we can choose either $C_+$ or $C_-$. By definition a Majorana spinor λ satisfies the following condition:
$$\lambda = \lambda^c = C\,\Gamma_0^T\,\lambda^\star \tag{A.4.3}$$
Equation (A.4.3) is not always self-consistent. By iterating it a second time we get the consistency condition:
$$C\,\Gamma_0^T\,C^\star = \Gamma_0 \tag{A.4.4}$$

There are two possible solutions to this constraint. Either $C_-$ is antisymmetric or $C_+$ is symmetric. Hence, in view of the results displayed above, Majorana spinors exist only in

$$D = 4, 8, 9, 10, 11 \tag{A.4.5}$$

In D = 4, 10, 11 they are defined using the $C_-$ charge conjugation matrix, while in D = 8, 9 they are defined using $C_+$. Weyl spinors, on the contrary, exist in every even dimension; by definition they are the eigenstates of the $\Gamma_{D+1}$ matrix, corresponding to the +1 or −1 eigenvalue. Conventionally the former eigenstates are named left-handed, while the latter are named right-handed spinors:

$$\Gamma_{D+1}\,\psi_{L/R} = \pm\,\psi_{L/R} \tag{A.4.6}$$

In some special dimensions we can define Majorana-Weyl spinors, which are both eigenstates of $\Gamma_{D+1}$ and satisfy the Majorana condition (A.4.3). In order for this to be possible we must have:

$$C\,\Gamma_0^T\,\Gamma_{D+1}^\star\,\psi^\star = \Gamma_{D+1}\,\psi \tag{A.4.7}$$
which implies:
$$C\,\Gamma_0^T\,\Gamma_{D+1}^\star\,\Gamma_0^T\,C^{-1} = \Gamma_{D+1} \tag{A.4.8}$$

With some manipulations the above condition becomes:

$$C\,\Gamma_{D+1}\,C^{-1} = -\Gamma_{D+1}^T \tag{A.4.9}$$
which can be checked case by case, using the definition of $\Gamma_{D+1}$ as the product of all the other gamma matrices. In the range 4 ≤ D ≤ 11 the only dimension where (A.4.9) is satisfied is D = 10, which is the critical dimension for superstrings. This is not a pure coincidence. Summarizing we have:

Spinors in 4 ≤ D ≤ 11

D     Dirac   Majorana   Weyl   Majorana-Weyl
4     Yes     Yes        Yes    No
5     Yes     No         No     No
6     Yes     No         Yes    No
7     Yes     No         No     No
8     Yes     Yes        Yes    No
9     Yes     Yes        No     No
10    Yes     Yes        Yes    Yes
11    Yes     Yes        No     No
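The Majorana and Weyl columns of this summary can be reproduced from Table A.1 with the criterion stated after (A.4.4). The sketch below simply encodes the symmetry of C_+ and C_- as read from Table A.1; the dictionary layout and the function names are illustrative choices, and the Majorana-Weyl column is deliberately left out because it requires the case-by-case check of (A.4.9).

```python
# Symmetry of the charge conjugation matrices as read from Table A.1:
# +1 = symmetric (C^T = C), -1 = antisymmetric (C^T = -C), None = does not exist.
# Each entry is (C_plus, C_minus).
TABLE_A1 = {
    4: (-1, -1),
    5: (-1, None),
    6: (-1, +1),
    7: (None, +1),
    8: (+1, +1),
    9: (+1, None),
    10: (+1, -1),
    11: (None, -1),
}


def majorana_exists(dim):
    """Criterion below (A.4.4): C_- antisymmetric or C_+ symmetric."""
    c_plus, c_minus = TABLE_A1[dim]
    return c_minus == -1 or c_plus == +1


def weyl_exists(dim):
    """The chirality matrix exists only in even dimensions."""
    return dim % 2 == 0


majorana_dims = [d for d in TABLE_A1 if majorana_exists(d)]
weyl_dims = [d for d in TABLE_A1 if weyl_exists(d)]

print("Majorana:", majorana_dims)
print("Weyl:    ", weyl_dims)
```

In D = 10 both alternatives of the criterion hold; the text adopts C_- there.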

A.5 A Particularly Useful Basis for D = 4 γ -Matrices

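In this section we construct a D = 4 gamma matrix basis which is convenient for various purposes. Let us first specify the basis and then discuss its convenient properties. In terms of the standard matrices (A.2.8) we realize the so(1, 3) Clifford algebra:

$$\{\gamma_a, \gamma_b\} = 2\eta_{ab}; \qquad \eta_{ab} = \mathrm{diag}(+,-,-,-) \tag{A.5.1}$$
by setting:

$$\begin{aligned}
\gamma_0 &= \sigma_1\otimes\sigma_3; & \gamma_1 &= i\sigma_2\otimes\sigma_3 \\
\gamma_2 &= i\,1\otimes\sigma_2; & \gamma_3 &= i\sigma_3\otimes\sigma_3 \\
\gamma_5 &= 1\otimes\sigma_1; & C &= i\sigma_2\otimes 1
\end{aligned} \tag{A.5.2}$$
where $\gamma_5$ is the chirality matrix and C is the charge conjugation matrix. In this basis the generators of the Lorentz algebra so(1, 3), namely $\gamma_{ab}$, are particularly simple and nice 4 × 4 matrices. Explicitly we get:
$$\begin{aligned}
\gamma_{01} &= \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}; &
\gamma_{02} &= \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} \\
\gamma_{03} &= \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & -i \\ i & 0 & 0 & 0 \\ 0 & i & 0 & 0 \end{pmatrix}; &
\gamma_{12} &= \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \\
\gamma_{13} &= \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & -i \\ -i & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \end{pmatrix}; &
\gamma_{23} &= \begin{pmatrix} 0 & -i & 0 & 0 \\ -i & 0 & 0 & 0 \\ 0 & 0 & 0 & i \\ 0 & 0 & i & 0 \end{pmatrix}
\end{aligned} \tag{A.5.3}$$
Let us mention some relevant formulae that are easily verified in the above basis:

$$\gamma_0\gamma_1\gamma_2\gamma_3 = i\gamma_5 \tag{A.5.4}$$
and if we fix the convention:

$$\varepsilon_{0123} = +1 \tag{A.5.5}$$
we obtain:

$$\frac{1}{24}\,\varepsilon^{abcd}\,\gamma_a\gamma_b\gamma_c\gamma_d = -i\gamma_5 \tag{A.5.6}$$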
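The statements of this section are quick to verify by direct matrix multiplication. The sketch below does so with plain Python lists, reading the tensor products of (A.5.2) as Kronecker products (an assumption made here; it reproduces the explicit matrix γ01 of (A.5.3)). It also checks that C behaves as the C_- of (A.3.2); helper names are illustrative.

```python
def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, A):
    return [[c * x for x in row] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def equal(A, B, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))


I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I4 = kron(I2, I2)

# Basis (A.5.2)
g0 = kron(S1, S3)
g1 = scale(1j, kron(S2, S3))
g2 = scale(1j, kron(I2, S2))
g3 = scale(1j, kron(S3, S3))
g5 = kron(I2, S1)
C = scale(1j, kron(S2, I2))
gammas = [g0, g1, g2, g3]
eta = [1, -1, -1, -1]


def anticommutator(A, B):
    ab, ba = matmul(A, B), matmul(B, A)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


# Clifford algebra (A.5.1)
clifford_ok = all(
    equal(anticommutator(gammas[a], gammas[b]),
          scale(2 * eta[a] if a == b else 0, I4))
    for a in range(4) for b in range(4))

# Product formula (A.5.4)
chirality_ok = equal(matmul(matmul(g0, g1), matmul(g2, g3)), scale(1j, g5))

# C^2 = -1, hence C^{-1} = -C, and C gamma_a C^{-1} = -gamma_a^T
c_squared_ok = equal(matmul(C, C), scale(-1, I4))
charge_ok = all(
    equal(matmul(matmul(C, g), scale(-1, C)), scale(-1, transpose(g)))
    for g in gammas)

gamma01 = matmul(g0, g1)   # for a != b, gamma_ab reduces to the plain product
print(clifford_ok, chirality_ok, c_squared_ok, charge_ok)
```

The antisymmetry of C and C² = −1 agree with the D = 4 row of Table A.1.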

Appendix B: Mathematica Packages

In this appendix we describe (for pedagogical reasons) the structure of two Mathematica packages constructed by the author that can be used to calculate geometrical quantities relevant to the problems addressed in the main text, and also to draw plots and pictures. The MATHEMATICA notebook files can be downloaded as supplementary material from the Springer distribution site.

B.1 Periastropack

This is a MATHEMATICA package for the calculation and drawing of orbits of massive particles in a Schwarzschild metric. After letting the computer read the programme, the package is initialized by typing: periastro.

We suggest that you type periastro in a separate NoteBook, different from the NoteBook containing the package. This package solves the differential equation numerically and plots the orbit curve in the xy-plane, comparing it with the Keplerian ellipse. It is an interactive programme that asks the user to supply the semilatus rectum of the orbit, expressed in units of the Schwarzschild emiradius, and the eccentricity. Next, after showing the Keplerian orbit and the General Relativity orbit after one revolution, the programme stops and asks whether the user wants to display the orbit for more revolutions, and for how many. It goes back to this question until the user is satisfied and decides to stop.

Programme

Main Programme Periastro
This is the main programme which asks for the inputs of the parameters and then calls the calculation subroutines

periastro := {
  Print["======"];
  Print["We make a comparison between orbits in Newton's Theory"];
  Print["and in Schwarzschild geometry"];
  Print["—————————"];
  Print["Input of geometrical parameters"];
  α = Input["Semilatus rectum in units of Schwarzschild emiradius"];
  ec = Input["eccentricity "];
  lq = α*(1 - ec^2);
  Print["======"];
  Print["PLOT of the ORBIT with the following parameters:"];
  Print["Semilatus rectum = ", α, " m"];
  Print["eccentricity = ", ec];
  Print["======"];
  Print["Keplerian orbit with these parameters"];
  perihelkep;
  Print["======"];
  Print["The Schwarzschild orbit with the same parameters"];
  nn = 1;
  perihelgr;
  Label[ripeto];
  Print["After more revolutions, Yes or No?"];
  flagga = Input["Yes or No"];
  If[flagga === Yes, Goto[plotto], Goto[stoppo]];
  If[flagga === No, Goto[stoppo], {pippo = 0}];
  Label[plotto];
  nn = Input["number of revolutions"];
  perihelgr;
  Goto[ripeto];
  Label[stoppo];
  Print["Finished"];
  Print["—————————"];
};

Subroutine Perihelkep
This is the routine that plots the Keplerian orbit with the chosen parameters

perihelkep :=
  ParametricPlot[
    {α*(1 - ec^2)*Cos[φ]/(1 + ec*Cos[φ]),
     α*(1 - ec^2)*Sin[φ]/(1 + ec*Cos[φ])},
    {φ, 0, 2*Pi}, AxesLabel -> {x, y}, Ticks -> None,
    PlotStyle -> {{Thickness[0.006]}}];

Subroutine Perihelgr
This is the subroutine for the calculation and drawing of the orbit with the same parameters in the Schwarzschild metric

perihelgr := {
  Print["—————————"];
  Print["After ", nn, " revolutions"];
  QQ = NDSolve[
    {uu''[φ] + uu[φ] - 3 uu[φ]^2 == 1/lq, uu[0] == (1 + ec)/lq, uu'[0] == 0},
    {uu[φ]}, {φ, 0, 2 nn π}, MaxSteps -> 500];
  ParametricPlot[
    Evaluate[{Cos[φ]/uu[φ], Sin[φ]/uu[φ]} /. QQ],
    {φ, 0, 2 nn π}, AxesLabel -> {x, y}, AspectRatio -> 1, Ticks -> None,
    PlotStyle -> {{Thickness[0.006], RGBColor[1, 0, 0]}}];
};

Examples

This section contains some examples of orbits calculated with Periastro.

With a = 60 m and ε = 0.6

periastro
======
We make a comparison between orbits in Newton's Theory
and in Schwarzschild geometry
—————————
Input of geometrical parameters
======
PLOT of the ORBIT with the following parameters:
Semilatus rectum = 60 m
eccentricity = 0.6
======
Keplerian orbit with these parameters
======
The Schwarzschild orbit with the same parameters
—————————
After 1 revolutions
After more revolutions, Yes or No?
—————————
After 3 revolutions
After more revolutions, Yes or No?
—————————
After 10 revolutions
After more revolutions, Yes or No?
Finished
—————————
{Null}

In this example we see the phenomenon of periastron advance, highly emphasized by the smallness of the orbit and by its large eccentricity.
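For readers without Mathematica, the size of the advance in this example can be estimated with a short Python sketch. It is not a port of the package: it integrates the same orbit equation used in Perihelgr, u'' + u − 3u² = 1/lq with u(0) = (1 + ec)/lq and u'(0) = 0, by a fixed-step Runge-Kutta scheme, and locates the next periastron as the next maximum of u. The step size, the function names and the stopping rule are choices made here.

```python
import math


def rhs(u, v, lq):
    """First-order form of u'' = 1/lq - u + 3 u^2, with v = u'."""
    return v, 1.0 / lq - u + 3.0 * u * u


def next_periastron(a, ec, h=1e-3, phi_max=200.0):
    """Angle at which u = 1/r reaches its next maximum after phi = 0."""
    lq = a * (1 - ec**2)
    u, v, phi = (1 + ec) / lq, 0.0, 0.0
    while phi < phi_max:
        k1u, k1v = rhs(u, v, lq)
        k2u, k2v = rhs(u + 0.5 * h * k1u, v + 0.5 * h * k1v, lq)
        k3u, k3v = rhs(u + 0.5 * h * k2u, v + 0.5 * h * k2v, lq)
        k4u, k4v = rhs(u + h * k3u, v + h * k3v, lq)
        u_new = u + h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v_new = v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        if v > 0 and v_new <= 0:
            # u' changes sign from + to -: linear interpolation of the zero
            return phi + h * v / (v - v_new)
        u, v, phi = u_new, v_new, phi + h
    raise RuntimeError("no periastron found: the orbit is not bound")


a, ec = 60.0, 0.6
advance = next_periastron(a, ec) - 2 * math.pi
perturbative = 6 * math.pi / (a * (1 - ec**2))    # formula (7.4.7) with m = 1

print(f"numerical advance    = {advance:.3f} rad per revolution")
print(f"perturbative formula = {perturbative:.3f} rad per revolution")
```

For such a tight orbit the numerical advance exceeds the first-order formula noticeably, which is consistent with the strongly rotating rosette produced by the package.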

With a = 10, ε = 0.1

periastro
======
We make a comparison between orbits in Newton's Theory
and in Schwarzschild geometry
—————————
Input of geometrical parameters
======
PLOT of the ORBIT with the following parameters:
Semilatus rectum = 10 m
eccentricity = 0.1
======
Keplerian orbit with these parameters
======
The Schwarzschild orbit with the same parameters
—————————
After 1 revolutions
After more revolutions, Yes or No?
—————————
After 2 revolutions
After more revolutions, Yes or No?
Finished
—————————
{Null}

In this example the test particle, placed at a distance of only 10 Schwarzschild emiradii from the center, falls into the singularity in just two revolutions if its eccentricity is different from zero, no matter how small it is.

B.2 Metrigravpack

This is a MATHEMATICA package for the calculation of the Riemann and Ricci tensors of an arbitrary (pseudo) Riemannian metric in arbitrary space-time dimensions using the standard tensor calculus. It is an interactive package that is initialized and then waits for further inputs by the user.

Metric Gravity

In this section we provide a package to calculate Einstein equations for any given metric in arbitrary dimensions and using the metric formalism

Routines: Metrigrav
This routine is devised to calculate the Levi Civita connection, the Riemann curvature and the Einstein tensor for general manifolds in the metric formalism. The inputs are
(1) the dimension n
(2) the set of coordinates, an n-vector = coordi
(3) the set of differentials, an n-vector = diffe
(4) the metric given as a quadratic differential ds2 = g[[i,j]] dx^i dx^j.
TO START this programme you type mainmetric and then you follow the instructions

Mainmetric

mainmetric := {
  Print["OK I calculate your space, Give me the data"];
  Print["Give me the dimension of your space"];
  mdim = Input["dimension = ?"];
  Print["Your space has dimension n = ", mdim];
  Print["Now I stop and you give me two vectors of dimension ", mdim];
  Print["vector coordi = vector of coordinates"];
  Print["vector diffe = vector of differentials"];
  Print["Next you give me the metric as ds2 = "];
  Print["Then to resume calculation you print metricresume"];
};

firstres := {
  Print["I resume the calculation"];
  Print["First I extract the metric coefficients from your data"];
  gg = Table[0, {i, 1, mdim}, {j, 1, mdim}];
  Do[{
    Do[{
      gg[[i, j]] = 1/2 (Coefficient[ds2, diffe[[i]]*diffe[[j]]]);
      gg[[j, i]] = 1/2 (Coefficient[ds2, diffe[[i]]*diffe[[j]]]);
    }, {j, i + 1, mdim}];
    gg[[i, i]] = Coefficient[ds2, diffe[[i]]^2];
  }, {i, 1, mdim}];
  Print["Then I calculate the inverse metric"];
  ggm = Simplify[Inverse[gg]];
  Print["Done !"];
  Print["and I calculate also the metric determinant"];
  detto = Simplify[Det[gg]];
  Print["Done"];
};
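The extraction rule used by firstres can be read off from the expansion of the line element. The following is a sketch in LaTeX notation; the index placement and the two-variable example are illustrative assumptions, not part of the package.

$$
ds^2 \;=\; \sum_{i,j} g_{ij}\, dx^i dx^j \;=\; \sum_{i} g_{ii}\,(dx^i)^2 \;+\; 2 \sum_{i<j} g_{ij}\, dx^i dx^j
$$

$$
g_{ii} = \text{coefficient of } (dx^i)^2 , \qquad
g_{ij} = g_{ji} = \tfrac{1}{2}\,\bigl[\text{coefficient of } dx^i dx^j\bigr] \quad (i<j)
$$

For instance, with the hypothetical input diffe = {du, dv} and ds2 = 2*du*dv, the rule gives g_uv = g_vu = 1 and g_uu = g_vv = 0. The factor 1/2 in the off-diagonal entries compensates for each mixed product appearing twice in the symmetric sum.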

Metricresume

metricresume := {firstres; metrigrav;}

Routine Metrigrav

metrigrav := {
  holviel = diffe;
  Print["I perform the calculation of the Christoffel symbols"];
  Gam = Table[0, {i, 1, mdim}, {j, 1, mdim}, {k, 1, mdim}];
  Do[Do[Do[{
    Gam[[a, b, c]] = Simplify[
      Sum[1/2 (ggm[[a, m]]*(D[gg[[m, c]], coordi[[b]]] + D[gg[[m, b]], coordi[[c]]] - D[gg[[b, c]], coordi[[m]]])), {m, 1, mdim}],
      Trig -> True];
  }, {c, 1, mdim}], {b, 1, mdim}], {a, 1, mdim}];
  Conne = Table[Sum[holviel[[b]]*Gam[[a, b, c]], {b, 1, mdim}], {a, 1, mdim}, {c, 1, mdim}];
  Print["-----------------"];
  Print["I finished"];
  Print["the Levi Civita connection is given by:"];
  Do[Do[Print["Γ [", i, j, "]=", Conne[[i, j]]], {j, 1, mdim}], {i, 1, mdim}];
  Print["Task finished"];
  Print["The result is encoded in a tensor Gam[a,b,c]"];
  Print["-----------------"];
  Print[" Now I calculate the Riemann tensor"];
  Rie = Table[0, {a, 1, mdim}, {b, 1, mdim}, {f, 1, mdim}, {g, 1, mdim}];
  Print["I tell you my steps :"];
  Do[{
    Print[" a=", a];
    Do[{
      Print[" b=", b];
      Do[Do[{
        urdo = (D[Gam[[a, g, b]], coordi[[f]]] - D[Gam[[a, f, b]], coordi[[g]]]);
        urdo = Simplify[urdo, Trig -> True];
        weggio = Simplify[
          Sum[Gam[[a, f, z]]*Gam[[z, g, b]], {z, 1, mdim}] -
          Sum[Gam[[a, g, z]]*Gam[[z, f, b]], {z, 1, mdim}], Trig -> True];
        Rie[[a, b, f, g]] = Simplify[1/2 (urdo + weggio), Trig -> True];
      }, {f, 1, mdim}], {g, 1, mdim}];
    }, {b, 1, mdim}];
  }, {a, 1, mdim}];
  Print["Finished"];
  Print["-------------------------"];
  Print["Now I evaluate the curvature 2-form of your space"];
  RR = Table[0, {i, 1, mdim}, {j, 1, mdim}];
  Do[Do[{
    RR[[i, j]] = 2*Sum[Rie[[i, j, a, b]]*(holviel[[a]] ** holviel[[b]]), {a, 1, mdim}, {b, a + 1, mdim}]
  }, {i, 1, mdim}], {j, 1, mdim}];
  Print["I find the following answer"];
  Do[Do[Print["R[", i, j, "]=", RR[[i, j]]], {j, 1, mdim}], {i, 1, mdim}];
  Print["The result is encoded in a tensor RR[i,j]"];
  Print["Its components are encoded in a tensor Rie[i,j,a,b]"];
  Print["---------------------------"];
  Print[" Now I calculate the Ricci tensor"];
  ricten = Table[0, {a, 1, mdim}, {b, 1, mdim}];
  ulla = 0; (* counter of non-zero components: set once, before the loop, so that it is not reset at every step *)
  Do[
    ricten[[b, e]] = Simplify[Sum[Rie[[xx, b, xx, e]], {xx, mdim}]];
    If[ricten[[b, e]] =!= 0, {
      Print[b, "", e, "", " non-zero"];
      Print["Ricci[", b, e, "]= ", ricten[[b, e]]];
      ulla = ulla + 1;
    }],
    {b, mdim}, {e, mdim}];
  Print["I have finished the calculation"];
  If[ulla == 0, {
    Print["The Ricci tensor is zero"];
  }, {
    Print[" The tensor ricten[[a,b]] giving the Ricci tensor "];
    Print[" is ready for storing on hard disk"];
  }];
  Print["----------------------------"];
};
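For orientation, the quantities computed by metrigrav can be summarized in formulas. This is a sketch read off from the listing; the placement of indices as upper or lower, and the reading of the `**` product as a wedge product of differentials, are interpretive assumptions.

$$
\Gamma^{a}{}_{bc} \;=\; \tfrac{1}{2} \sum_{m} g^{am} \left( \partial_b g_{mc} + \partial_c g_{mb} - \partial_m g_{bc} \right)
\qquad \text{(Gam[[a,b,c]])}
$$

$$
\Gamma^{a}{}_{c} \;=\; \sum_{b} dx^{b}\, \Gamma^{a}{}_{bc}
\qquad \text{(Conne[[a,c]])}
$$

$$
\mathrm{Rie}^{a}{}_{bfg} \;=\; \tfrac{1}{2} \left( \partial_f \Gamma^{a}{}_{gb} - \partial_g \Gamma^{a}{}_{fb} + \sum_{z} \Gamma^{a}{}_{fz}\, \Gamma^{z}{}_{gb} - \sum_{z} \Gamma^{a}{}_{gz}\, \Gamma^{z}{}_{fb} \right)
\qquad \text{(Rie[[a,b,f,g]])}
$$

$$
R^{i}{}_{j} \;=\; 2 \sum_{a<b} \mathrm{Rie}^{i}{}_{jab}\; dx^{a} \wedge dx^{b}
\qquad \text{(RR[[i,j]])}
$$

$$
\mathrm{Ric}_{be} \;=\; \sum_{x} \mathrm{Rie}^{x}{}_{bxe}
\qquad \text{(ricten[[b,e]])}
$$

Because of the explicit factor 1/2 in Rie, the components stored in Rie and ricten are one half of those obtained in the widespread convention that defines the Riemann tensor without that factor. This should be kept in mind when comparing the output with other references.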

Calculation of the Ricci Tensor of the Reissner Nordström Metric Using Metrigrav

mainmetric
OK I calculate your space, Give me the data
Give me the dimension of your space
Your space has dimension n = 4
Now I stop and you give me two vectors of dimension 4
vector coordi = vector of coordinates
vector diffe = vector of differentials
Next you give me the metric as ds2 =
Then to resume calculation you print metricresume
{Null}
coordi = {t, r, θ, ϕ};
diffe = {dt, dr, dθ, dϕ};
ds2 = -(1 - A/r + Q/r^2)*dt^2 + (1 - A/r + Q/r^2)^(-1)*dr^2 + r^2*Sin[θ]^2*dϕ^2 + r^2*dθ^2;
metricresume
I resume the calculation
First I extract the metric coefficients from your data
Then I calculate the inverse metric
Done !
and I calculate also the metric determinant
Done
I perform the calculation of the Christoffel symbols
-----------------
I finished
the Levi Civita connection is given by:
Γ [11] = dr (-2 Q + A r) / (2 r (Q + r (-A + r)))
Γ [12] = dt (-2 Q + A r) / (2 r (Q + r (-A + r)))
Γ [13] = 0
Γ [14] = 0
Γ [21] = dt (-2 Q + A r) (Q + r (-A + r)) / (2 r^5)
Γ [22] = dr (2 Q - A r) / (2 Q r - 2 A r^2 + 2 r^3)
Γ [23] = dθ (A - (Q + r^2)/r)
Γ [24] = -dϕ (Q + r (-A + r)) Sin[θ]^2 / r
Γ [31] = 0
Γ [32] = dθ / r
Γ [33] = dr / r
Γ [34] = -dϕ Cos[θ] Sin[θ]
Γ [41] = 0
Γ [42] = dϕ / r
Γ [43] = dϕ Cot[θ]
Γ [44] = dr / r + dθ Cot[θ]
Task finished
The result is encoded in a tensor Gam[a,b,c]
-----------------

Now I calculate the Riemann tensor
I tell you my steps:
a=1
b=1
b=2
b=3
b=4
a=2
b=1
b=2
b=3
b=4
a=3
b=1
b=2
b=3
b=4
a=4
b=1
b=2
b=3
b=4
Finished
-------------------------
Now I evaluate the curvature 2-form of your space
I find the following answer
R[11] = 0
R[12] = (-3 Q + A r) dt**dr / (r^2 (Q + r (-A + r)))
R[13] = (2 Q - A r) dt**dθ / (2 r^2)
R[14] = -(-2 Q + A r) dt**dϕ Sin[θ]^2 / (2 r^2)
R[21] = (-3 Q^2 + Q (4 A - 3 r) r + A r^2 (-A + r)) dt**dr / r^6
R[22] = 0
R[23] = (2 Q - A r) dr**dθ / (2 r^2)
R[24] = -(-2 Q + A r) dr**dϕ Sin[θ]^2 / (2 r^2)
R[31] = -(-2 Q + A r) (Q + r (-A + r)) dt**dθ / (2 r^6)
R[32] = (-2 Q + A r) dr**dθ / (2 r^2 (Q + r (-A + r)))
R[33] = 0
R[34] = (-Q + A r) dθ**dϕ Sin[θ]^2 / r^2
R[41] = -(-2 Q + A r) (Q + r (-A + r)) dt**dϕ / (2 r^6)
R[42] = (-2 Q + A r) dr**dϕ / (2 r^2 (Q + r (-A + r)))
R[43] = (Q - A r) dθ**dϕ / r^2
R[44] = 0

The result is encoded in a tensor RR[i,j]
Its components are encoded in a tensor Rie[i,j,a,b]
---------------------------
Now I calculate the Ricci tensor
11 non-zero
Ricci[11] = Q (Q + r (-A + r)) / (2 r^6)
22 non-zero
Ricci[22] = -Q / (2 r^2 (Q + r (-A + r)))
33 non-zero
Ricci[33] = Q / (2 r^2)
44 non-zero
Ricci[44] = Q Sin[θ]^2 / (2 r^2)
I have finished the calculation
The tensor ricten[[a,b]] giving the Ricci tensor
is ready for storing on hard disk
----------------------------
{Null}

MatrixForm[gg]

$$
\begin{pmatrix}
-1 - \frac{Q}{r^2} + \frac{A}{r} & 0 & 0 & 0 \\
0 & \frac{1}{1 + \frac{Q}{r^2} - \frac{A}{r}} & 0 & 0 \\
0 & 0 & r^2 & 0 \\
0 & 0 & 0 & r^2 \sin^2\theta
\end{pmatrix}
$$

MatrixForm[ricten]

$$
\begin{pmatrix}
\frac{Q\,(Q + r(-A + r))}{2 r^6} & 0 & 0 & 0 \\
0 & -\frac{Q}{2 r^2 (Q + r(-A + r))} & 0 & 0 \\
0 & 0 & \frac{Q}{2 r^2} & 0 \\
0 & 0 & 0 & \frac{Q \sin^2\theta}{2 r^2}
\end{pmatrix}
$$
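A quick consistency check of this output is to take the trace of ricten with the inverse metric. The following is a sketch; the abbreviation f is introduced here only for compactness and does not appear in the package.

$$
f(r) \;=\; 1 - \frac{A}{r} + \frac{Q}{r^2} \;=\; \frac{Q + r(-A + r)}{r^2},
\qquad
\mathrm{Ric}_{tt} = \frac{Q f}{2 r^4}, \quad
\mathrm{Ric}_{rr} = -\frac{Q}{2 r^4 f}, \quad
\mathrm{Ric}_{\theta\theta} = \frac{Q}{2 r^2}, \quad
\mathrm{Ric}_{\phi\phi} = \frac{Q \sin^2\theta}{2 r^2}
$$

$$
g^{\mu\nu}\,\mathrm{Ric}_{\mu\nu}
\;=\; \left(-\frac{1}{f}\right)\frac{Q f}{2 r^4}
\;+\; f \left(-\frac{Q}{2 r^4 f}\right)
\;+\; \frac{1}{r^2}\,\frac{Q}{2 r^2}
\;+\; \frac{1}{r^2 \sin^2\theta}\,\frac{Q \sin^2\theta}{2 r^2}
\;=\; -\frac{Q}{2 r^4} - \frac{Q}{2 r^4} + \frac{Q}{2 r^4} + \frac{Q}{2 r^4} \;=\; 0
$$

The curvature scalar therefore vanishes, as expected when the source is a Maxwell field in four dimensions, whose stress energy tensor is traceless. Moreover every component of ricten is proportional to Q, so for Q = 0 the Ricci tensor vanishes identically, as it must for the Schwarzschild metric.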
