PHY483F/1483F Relativity Theory I (2020-21) Department of Physics University of Toronto
Instructor: Prof. A.W. Peet
Sources:-
• M.P. Hobson, G.P. Efstathiou, and A.N. Lasenby, “General relativity: an introduction for physicists” (Cambridge University Press, 2005) [recommended textbook];
• Sean Carroll, “Spacetime and geometry: an introduction to general relativity” (Addison-Wesley, 2004);
• Ray d’Inverno, “Introducing Einstein’s relativity” (Oxford University Press, 1992);
• Jim Hartle, “Gravity: an introduction to Einstein’s general relativity” (Pearson, 2003);
• Bob Wald, “General relativity” (University of Chicago Press, 1984);
• Tomás Ortín, “Gravity and strings” (Cambridge University Press, 2004);
• Noel Doughty, “Lagrangian Interaction” (Westview Press, 1990);
• my personal notes over three decades.
Version: Monday 16th November, 2020 @ 10:55
Licence: Creative Commons Attribution-NonCommercial-NoDerivs Canada 2.5

Contents
1 R10Sep
1.1 Invitation to General Relativity
1.2 Course website
2 M14Sep
2.1 Galilean relativity, 3-vectors in Euclidean space, and index notation
3 R17Sep
3.1 Special relativity and 4-vectors in Minkowski spacetime
3.2 Partial derivative 4-vector
4 M21Sep
4.1 Relativistic particle: position, momentum, acceleration 4-vectors
4.2 Electromagnetism: 4-vector potential and field strength tensor
5 R24Sep
5.1 Constant relativistic acceleration and the twin paradox
5.2 The Equivalence Principle
5.3 Spacetime as a curved Riemannian manifold
6 M28Sep
6.1 Basis vectors in curved spacetime
6.2 Tensors in curved spacetime
6.3 Rules for tensor index gymnastics
7 R01Oct
7.1 Building a covariant derivative
7.2 How basis vectors change: the role of the affine connection
7.3 The covariant derivative and parallel transport
8 M05Oct
8.1 The geodesic equations for test particle motion in curved spacetime
8.2 Example computation for affine connection and geodesic equations
9 R08Oct
9.1 Spacetime curvature
9.2 The Riemann tensor
9.3 Example computations for Riemann
10 R15Oct
10.1 Geodesic deviation
10.2 Tidal forces and taking the Newtonian limit for Christoffels
11 M19Oct
11.1 Newtonian limit for Riemann
11.2 Riemann normal coordinates and the Bianchi identity
11.3 The information in Riemann
12 R22Oct
12.1 Lie derivatives
12.2 Killing vectors and tensors
13 M26Oct
13.1 Maximally symmetric spacetimes
13.2 Einstein’s equations
14 R29Oct
14.1 Birkhoff’s theorem and the Schwarzschild black hole
15 M02Nov
15.1 TOV equation for a star
15.2 Geodesics of Schwarzschild
16 R05Nov
16.1 Causal structure of Schwarzschild
17 M16Nov
17.1 Charged black holes
17.2 Rotating black holes
18 R19Nov
18.1 The Kerr solution
18.2 The Penrose process
19 M23Nov
19.1 Gravitational redshift
19.2 Planetary perihelion precession
20 R26Nov
20.1 Bending of light
20.2 Radar echoes
21 M30Nov
21.1 Geodesic precession of gyroscopes
21.2 Accretion disks
22 R03Dec
22.1 Finding the wave equation for metric perturbations
22.2 Solving the linearized Einstein equations
23 M07Dec
23.1 Gravitational plane waves
23.2 Energy loss from gravitational radiation
1 R10Sep
1.1 Invitation to General Relativity

From a particle physics perspective, the gravitational force is the weakest of the four known forces. So why does gravity dominate the dynamics of the universe? A simple first answer is that there is a lot of matter in the universe that gravitates. Even though the gravitational attraction between any two subatomic particles is weak, if you get enough of them together you can eventually make a black hole!

A slightly more sophisticated answer focuses on the range of the gravitational force and what sources it. The only two long-range forces we know of in Nature are gravity and electromagnetism. By contrast, the strong nuclear force binding atomic nuclei and the weak nuclear force responsible for the fusion reaction powering our Sun are very short-range. Electromagnetic fields are sourced by charges and currents, but the universe is electrically neutral on average, so electromagnetism does not dominate its evolution. Gravity, on the other hand, is sourced by energy-momentum. Since everything has energy-momentum, even the graviton, you can never get away from gravity.

Newton wowed the world a third of a millennium ago with his Law of Universal Gravitation, which explained both celestial and terrestrial observations. Our focus in this course is on explaining Einstein’s famous General Theory of Relativity (GR), which is a generalization of both Newtonian gravity and Special Relativity proven useful for describing the dynamics of the cosmos. By the end of term, you will be familiar with Albert Einstein’s famous equation for the gravitational field g_{µν}(x),

R_{µν} − (1/2) g_{µν} R + Λ g_{µν} = −8πG_N T_{µν} ,  (1.1)

where R_{µν} and R involve (first and) second derivatives of g_{µν}, and T_{µν} is the energy-momentum tensor of all non-gravitational fields, which are collectively known as matter fields. You will also understand how GR gives back Newton’s theory of gravity in the limit where speeds are small and spacetime is weakly curved.
You can think of Einstein’s GR as Gravity 2.0, built on the foundation of Gravity 1.0 established by Newton – an upgrade.

The name for this course is “Relativity Theory 1”. Another name by which it is commonly known is “GR 1”, which stands for “General Relativity 1”. The main thing we learn in this course is how to write the dynamical equations of physics in the language of tensor analysis. Tensor analysis always sounds scary when you start, but it is not much more complicated conceptually than vector analysis, something you have been doing for years. We will show how to take your vector analysis knowledge from flat space and generalize it to spacetime. We begin with flat spacetime, which is pertinent to Special Relativity, and then we build on that knowledge to figure out how to write dynamical equations of physics even in curved spacetime.

Einstein taught us that the speed of light is constant and is the same in all inertial frames of reference. We will therefore adopt the relativistic convention that c = 1 throughout the course. This implies that time is measured in metres, and mass is measured in units of energy, e.g. m_e = 511 keV. We will keep all other physical constants explicit, such as Planck’s constant ℏ characterizing the strength of quantum effects and the Newton constant G_N characterizing the strength of gravity. If you feel queasy about missing factors of c in any equation, they can always be easily restored by using dimensional analysis.
1.2 Course website

Please have a careful read of the course website at https://ap.io/483f/. It contains lots of vital and useful information for all students taking PHY483F/PHY1483F, including the syllabus, online lecture notes, and how to contact me. Almost everything you need to know about the course is contained in the pages listed, and in all the clickable links in those pages. The remaining tiny fraction of information that needs to be hidden behind a UofT firewall for class members only can be found on Quercus, in the Announcements (from the Prof.) and Modules (from the TA).
Remarks in my notes intended for more advanced/interested students are indicated in blue.
2 M14Sep
2.1 Galilean relativity, 3-vectors in Euclidean space, and index notation

Before we review some aspects of Special Relativity and introduce some new ones, let us begin by reminding ourselves of the non-relativistic version of relativity, also known as Galilean relativity. When we want to transform from one inertial frame of reference to another moving at relative velocity v, there are three things we must think about:
(a) how time intervals relate,
(b) how spatial position intervals relate,
(c) how velocities relate.
In Galilean relativity, all clocks are synchronized,
dt′ = dt ,  (2.1)

displacements are related via

dx′ = dx − v dt ,  (2.2)

and velocities u = dx/dt compose by simple addition,

u′ = u − v ,  (2.3)

where v is the relative velocity between the unprimed and primed frames of reference. Einstein upgraded these formulæ when he invented Special Relativity, and you have seen the results before: they are known as Lorentz Transformations. We will get to them soon enough – and we will show you how simple they can look when written in terms of rapidity rather than velocity. But for now, let us inspect more closely how 3-vectors work. This will serve as a pattern for the relativistic case.

Lots of things of interest in physics are vectors, which are in essence things that point. I like to say that a vector has a ‘leg’ that sticks out, telling you where it points. Mathematically, the vector components are what you get when you resolve the vector along an orthonormal basis. In Special and General Relativity, we will need to be scrupulously careful to distinguish where we put our indices (up/down and left/right). For an arbitrary vector v, we write the index telling you which component is which upstairs: v^1, v^2, …, v^d, where d is the spatial dimension. Note that the upstairs index i in v^i is not a power; instead, it specifies which component is being discussed: the ith one. If you think of a contravariant vector as a column vector, the upstairs index i denotes which row of the column vector you are talking about. If you need to take a power of a vector component, the GR convention is to write parentheses around it, e.g. (v^1)^2. Note also that it is common in GR literature to write the vector v as v^i – technically, v^i is a component of v, but letting the index show explicitly rather than suppressing it helps us remember its transformation properties.
Vectors provide a useful notational shorthand, preventing us from having to write out all the components explicitly every time we write a physics equation like F⃗ = m a⃗. Tensor analysis in GR is nothing scary – it is the natural generalization of vector analysis to curved spacetime and multilegged objects. Its underlying idea is twofold:-
• In physics, the most useful dynamical variables transform in well-defined ways under coordinate transformations, and are known as tensors. Example: the momentum vector p^i.

• The laws of physics should be tensorial equations. A Newtonian example you will recognize is

F^i = m a^i .  (2.4)
When we change coordinates, the components of tensors on both sides of the equation change, but the underlying physical relations between them do not.

The natural type of vector we defined above is called a contravariant vector. This is like a column vector. It has a natural counterpart called a covariant vector, also known as a dual vector. This is like a row vector. A covariant vector ω has components ω_i; note that this is a downstairs index rather than an upstairs index. The index i tells you which column of the row vector you are talking about. There is a natural inner product between contravariant vectors v and covariant vectors ω:

ω · v = Σ_i ω_i v^i .  (2.5)

A very useful convention that we will use throughout the course is the Einstein summation convention. This is a notational shorthand in which a repeated index is automatically summed over when it occurs precisely once upstairs and precisely once downstairs. This convention suppresses the unwieldy Σ signs so that it becomes easier to see the wood for the trees. The thing that signals that you are summing over an index is that it is repeated. Note that a repeated (summed-over) index can appear precisely twice in any given term: if it occurs more times, the writer has made a mistake. Summing over a repeated index is also called index contraction, because the result has none of the summed-over indices remaining. In our ω · v example above, the result is a scalar: a tensor with zero legs.

Why is it important to distinguish between contravariant and covariant vectors? In a nutshell, because they transform differently under coordinate transformations. Let us see how this works for a rotation. You may be used to writing a rotation of (say) a displacement vector as a d × d matrix R. Rotation matrices are orthogonal,
R^{−1} = R^T .  (2.6)
Alternatively, we can say that they preserve the Euclidean norm in 3-space:
R^T 1_3 R = 1_3 ,  (2.7)

where 1_3 is the identity matrix. While R transforms contravariant vectors v in Euclidean space as

v → v′ = R · v ,  (2.8)
where the prime indicates the transformed vector and the unprimed vector is the original, the transpose R^T transforms covariant vectors v^T as

v^T → v′^T = v^T · R^T .  (2.9)
However, we strongly discourage writing coordinate transformations in terms of matrices in future, and instead encourage you to get the hang of index notation. Once time is included, coordinates are curvilinear, and spacetime is physically curved, index notation and the Einstein summation convention will help us keep track of indices in a much more succinct way and therefore reduce the error rate when handling tensors. A rotation is expressed in index notation for the contravariant vector components as
v^{i′} = R^{i′}{}_j v^j .  (2.10)
R^{i′}{}_j is the component of the rotation matrix from the i′th row and jth column. Note that the left-right placement of indices here is physically important, as well as the upstairs-downstairs placement. The physics reason why is that rotation matrices are not symmetric, so casually switching their indices makes no sense. Let us write out the above transformation law more explicitly, so that you can see how it encodes matrix multiplication in a disciplined way. For a rotation of the vector v with components v^i about the z-axis, it reads¹

v^{1′} = R^{1′}{}_1 v^1 + R^{1′}{}_2 v^2 + R^{1′}{}_3 v^3 = + cos θ v^1 + sin θ v^2 ,
v^{2′} = R^{2′}{}_1 v^1 + R^{2′}{}_2 v^2 + R^{2′}{}_3 v^3 = − sin θ v^1 + cos θ v^2 ,
v^{3′} = R^{3′}{}_1 v^1 + R^{3′}{}_2 v^2 + R^{3′}{}_3 v^3 = v^3 .  (2.11)
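As a concrete cross-check (not part of the original notes), here is a short numpy sketch of eq. (2.11): the index-notation transformation v^{i′} = R^{i′}{}_j v^j implemented with np.einsum, which carries out the Einstein summation numerically. The angle and vector values are arbitrary choices for illustration.

```python
import numpy as np

# Rotation about the z-axis, with rows labelled by i' and columns by j,
# exactly as in eq. (2.11).
theta = 0.3
R = np.array([
    [ np.cos(theta), np.sin(theta), 0.0],   # row i' = 1
    [-np.sin(theta), np.cos(theta), 0.0],   # row i' = 2
    [ 0.0,           0.0,           1.0],   # row i' = 3
])

v = np.array([1.0, 2.0, 3.0])   # contravariant components v^j

# Einstein summation: the repeated index j (once up, once down) is summed over.
v_prime = np.einsum('ij,j->i', R, v)

print(np.allclose(v_prime, R @ v))          # einsum reproduces matrix multiplication
print(np.allclose(R.T @ R, np.eye(3)))      # orthogonality, R^{-1} = R^T (eq. 2.6)
print(np.isclose(v_prime @ v_prime, v @ v)) # the Euclidean norm is preserved
```

The last check is the numerical content of the statement below eq. (2.14) that a rotated velocity vector gives the same kinetic energy.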
For the covariant vector ω specified by its components ω_i, we have
ω_{i′} = R_{i′}{}^j ω_j .  (2.12)

Note that the left-right and upstairs-downstairs index placements are deliberate and physically meaningful here, as for the contravariant case earlier. R_{i′}{}^j is the i′th column of the jth row of R^T. Rotations are interesting mathematically because they preserve the Euclidean norm. In index notation, this condition reads
R_{j′}{}^i δ^{j′}{}_{k′} R^{k′}{}_ℓ = δ^i_ℓ ,  (2.13)

where

δ^i_ℓ = { 1, i = ℓ ; 0, i ≠ ℓ } .  (2.14)
Mathematically, the tensor δ^i_j with one contravariant index and one covariant index is the identity matrix. Since the identity is a symmetric matrix, we do not have to be picky about left-right index placement on Kronecker deltas like we do for other tensors. More generically, we must be very careful about left-right and up-down index placement on our tensors. A
¹Nitpickers: please note that, like Carroll, we take the point of view that the vector stays fixed while the coordinate system changes under the relevant transformation.
physical implication of the above formula is that even if you rotate your velocity vector, you still get the same kinetic energy for a non-relativistic particle, because the kinetic energy is proportional to the norm of the velocity vector.

Using the components of R^{i′}{}_j given above, use eq. (2.13) to check that you have correctly identified the R_{j′}{}^i. Then, using the transformation laws for contravariant and covariant vectors, eq.s (2.10) and (2.12), show that covariant vector components are transformed in the −θ direction while the contravariant vector components are transformed in the +θ direction. This sign difference might seem rather trivial, but it is anything but! It is our first glimpse of why we need to be very careful to distinguish between upstairs and downstairs indices for vectors – and more generally for tensors, which are multilegged generalizations of contravariant and covariant vectors.

In physics, we often want to find the norm (length) of a vector, or the angle between two distinct vectors via the dot product. In index notation, what we need to be able to do is to convert contravariant vectors into covariant vectors or vice versa. To achieve this, we need extra structure on the space (or later on, the spacetime) in which the vectors live, called a metric tensor g, which must be invertible. In flat Euclidean 3-space in Cartesian coordinates, the role of the metric tensor is played by the Kronecker delta tensor, which is the identity matrix in both upstairs and downstairs components,

δ_{ij} = { 1, i = j ; 0, i ≠ j } ,  (2.15)

and

δ^{ij} = { 1, i = j ; 0, i ≠ j } .  (2.16)

The one-up-one-down version δ^i_j also has the same numerical values. The upstairs spacetime metric is the (left and right) inverse of the downstairs metric,
g^{ij} g_{jk} = g^i{}_k = δ^i_k ,
g_{ij} g^{jk} = g_i{}^k = δ_i{}^k .  (2.17)

As you can see, this equation is easily satisfied for flat Euclidean 3-space in Cartesian coordinates. Soon we will see that this equation must also hold when using more general coordinate systems or when operating in curved spacetime – or both. Converting between contravariant and covariant components of vectors is achieved via

v_i = g_{ij} v^j ,  (2.18)

and

ω^i = g^{ij} ω_j ,  (2.19)

where again we used the Einstein summation convention. The fact that the metric is so trivial in flat Euclidean 3-space in Cartesian coordinates is why people are often very careless with index placement – if you write it out explicitly you will see that (for example) v^2 = v_2 because g_{ij} = δ_{ij} and g^{ij} = δ^{ij}. Reminder: if you need to take a power of a vector component, put parentheses around it to make it unambiguous. For example, (v^1)^2 means the square of the first contravariant component of the vector v.
Notice how if we have a physics equation for contravariant vectors, say F^j = m a^j, we can multiply both sides by δ_{ij} (whose components are just numbers) and sum over the repeated index j to obtain a covariant vector equation F_i = m a_i. This chain of logic only works because we have a metric available – otherwise, we would have no way of converting upstairs-index equations to downstairs-index ones.

As you will recall from Newtonian physics, the kinetic energy is proportional to the square of the velocity vector, i.e. its norm |v|^2 = δ_{ij} v^i v^j. This is a scalar, i.e. invariant under rotations. So is the inner product or dot product of any two contravariant vectors a^i and b^j, formed by using the metric tensor,
a · b = g_{ij} a^i b^j .  (2.20)

In 3 spatial dimensions only, we can build another animal out of two contravariant vectors a^i and b^j by taking an outer product or cross product. We will be able to write an expression for this in our handy index notation by making use of another new object known as the permutation pseudotensor ε_{ijk}, which is antisymmetric under interchange of any two of its indices,

ε_{ijk} = { +1, (ijk) = even permutation of (123) ; −1, (ijk) = odd permutation of (123) ; 0, otherwise } .  (2.21)

There is also an upstairs version ε^{ijk} with the same numerical values, in flat Euclidean 3-space. What is this permutation pseudotensor used for? Well, one of the first things it can do is to help us find the determinant of a matrix,

det(M) = ε^{ijk} M^1{}_i M^2{}_j M^3{}_k .  (2.22)

Applying this formula to an orthogonal transformation matrix allows us to discover that ε_{ijk} is a pseudotensor, rather than a proper tensor, because it does not flip sign under a parity transformation x^i → −x^i. It is also invariant under rotations and translations. When handling expressions containing ε_{ijk} or its upstairs version ε^{ijk}, we may need to know what contractions of these beasts look like. The identities it obeys are very handy to know,
ε^{ijk} ε_{ijk} = 3! ,
ε^{ijk} ε_{ijℓ} = 2! δ^k_ℓ ,
ε^{ijk} ε_{imn} = 1! δ^{jk}{}_{mn} ,
ε^{ijk} ε_{ℓmn} = 0! δ^{ijk}{}_{ℓmn} ,  (2.23)

where the generalized Kronecker deltas are defined by

δ^{jk}{}_{mn} ≡ | δ^j_m  δ^k_m |
               | δ^j_n  δ^k_n |  = δ^j_m δ^k_n − δ^j_n δ^k_m ,  (2.24)

and

δ^{ijk}{}_{ℓmn} ≡ | δ^i_ℓ  δ^j_ℓ  δ^k_ℓ |
                  | δ^i_m  δ^j_m  δ^k_m |
                  | δ^i_n  δ^j_n  δ^k_n |
 = +δ^i_ℓ δ^j_m δ^k_n + δ^i_m δ^j_n δ^k_ℓ + δ^i_n δ^j_ℓ δ^k_m − δ^i_m δ^j_ℓ δ^k_n − δ^i_ℓ δ^j_n δ^k_m − δ^i_n δ^j_m δ^k_ℓ .  (2.25)
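The determinant formula (2.22) and the first two contraction identities in (2.23) are easy to verify numerically. The following numpy sketch (my addition, with an arbitrarily chosen matrix M) builds ε_{ijk} explicitly and checks them with np.einsum.

```python
import numpy as np
from itertools import permutations

# Build the permutation symbol epsilon_{ijk} from the sign of each permutation.
eps = np.zeros((3, 3, 3))
for p in permutations(range(3)):
    # Parity via the number of inversions in the permutation.
    inv = sum(p[a] > p[b] for a in range(3) for b in range(a + 1, 3))
    eps[p] = (-1) ** inv

M = np.array([[2.0, 1.0, 0.0],
              [0.5, 3.0, 1.0],
              [1.0, 0.0, 4.0]])

# det(M) = eps^{ijk} M^1_i M^2_j M^3_k, eq. (2.22): contract each row with eps.
det_eps = np.einsum('ijk,i,j,k->', eps, M[0], M[1], M[2])
print(np.isclose(det_eps, np.linalg.det(M)))

# eps^{ijk} eps_{ijk} = 3! = 6, the first identity in (2.23).
print(np.einsum('ijk,ijk->', eps, eps))

# eps^{ijk} eps_{ijl} = 2! delta^k_l, the second identity in (2.23).
print(np.allclose(np.einsum('ijk,ijl->kl', eps, eps), 2 * np.eye(3)))
```

In Cartesian flat space the upstairs and downstairs ε have the same numerical values, so the same array serves for both in these contractions.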
Using all of this, we can finally write out the components of the outer (cross) product in index notation in 3D,

(a × b)_i = ε_{ijk} a^j b^k .  (2.26)

Notice that in writing the outer product here, we have again used the Einstein summation convention – twice – on both j and k. This makes the expression more compact. Also, since this is a bona fide 3-vector equation, we can raise the index using our (spatial, trivial) metric. As you should convince yourself, the result is (a × b)^i = ε^{ijk} a_j b_k. We can use these expressions to find, for example,
[a × (b × c)]_ℓ = ε_{ℓmn} a^m (b × c)^n
 = ε_{ℓmn} a^m ε^{npq} b_p c_q
 = δ^{pq}{}_{ℓm} a^m b_p c_q
 = (δ^p_ℓ δ^q_m − δ^p_m δ^q_ℓ) a^m b_p c_q
 = a^q b_ℓ c_q − a^p b_p c_ℓ
 = [b(a · c) − c(a · b)]_ℓ ,  (2.27)

which should look familiar from vector calculus classes earlier in your education.

In general spacetime dimension d, we can define a d-dimensional version of the antisymmetric permutation pseudotensor with d indices. Then, the outer product between two contravariant vectors a^i and b^j is more properly thought of as a pseudotensor with (d − 2) legs, because it is formed via the contraction ε_{i₁i₂…i_d} a^{i₁} b^{i₂} of two vectors a and b with the d-legged ε pseudotensor. Note that ε is defined in any dimension as long as the manifold is orientable.
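The BAC-CAB identity (2.27) can be checked numerically too. A minimal numpy sketch (my addition; the random vectors are arbitrary) computes the cross product both ways, via ε_{ijk} a^j b^k as in eq. (2.26) and via numpy's built-in cross product, and then verifies the identity.

```python
import numpy as np
from itertools import permutations

# Permutation symbol, built as before from permutation parity.
eps = np.zeros((3, 3, 3))
for p in permutations(range(3)):
    inv = sum(p[a] > p[b] for a in range(3) for b in range(a + 1, 3))
    eps[p] = (-1) ** inv

rng = np.random.default_rng(1)
a, b, c = rng.standard_normal((3, 3))

# (a x b)_i = eps_{ijk} a^j b^k, eq. (2.26), via einsum.
cross_ab = np.einsum('ijk,j,k->i', eps, a, b)
print(np.allclose(cross_ab, np.cross(a, b)))

# a x (b x c) = b(a.c) - c(a.b), eq. (2.27).
lhs = np.einsum('ijk,j,k->i', eps, a, np.cross(b, c))
rhs = b * (a @ c) - c * (a @ b)
print(np.allclose(lhs, rhs))
```

In Cartesian coordinates the trivial metric means raised and lowered components coincide, which is why plain numpy arrays suffice here.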
3 R17Sep
3.1 Special relativity and 4-vectors in Minkowski spacetime

Let us now turn to studying how to generalize spatial vectors in flat Euclidean space to spacetime vectors in flat Minkowski spacetime, in Cartesian coordinates to begin with. The bedrock principle of the constancy of the speed of light has some fairly dramatic physics implications, chief among them being time dilation and length contraction. Both of these ideas have been rigorously tested experimentally, e.g. in particle collider and cosmic ray contexts, and found to hold true. Also, velocities no longer add simply, obeying a composition law that looks pretty mysterious the first time you see it. Let me now demystify this and Lorentz boosts by using a clever parametrization.

When you first saw Lorentz boosts, probably at the end of first year Newtonian mechanics or in a second year modern physics course, they probably looked like the following. For an infinitesimal Lorentz boost in the x direction in units where c = 1,

dt′ = γ_v (dt − v dx) ,  dx′ = γ_v (dx − v dt) ,  dy′ = dy ,  dz′ = dz ,  (3.1)

where γ_v ≡ 1/√(1 − v²). Using these expressions, you can easily figure out how velocities transform for a Lorentz boost along the x axis,

u′_x = dx′/dt′ = (u_x − v)/(1 − u_x v) ,  u′_y = dy′/dt′ = u_y/[γ_v (1 − u_x v)] ,  u′_z = dz′/dt′ = u_z/[γ_v (1 − u_x v)] .  (3.2)

You can also work out the 3-accelerations
a′_x = a_x/[γ_v³ (1 − u_x v)³] ,
a′_y = a_y/[γ_v² (1 − u_x v)²] + (u_y v) a_x/[γ_v² (1 − u_x v)³] ,
a′_z = a_z/[γ_v² (1 − u_x v)²] + (u_z v) a_x/[γ_v² (1 − u_x v)³] .  (3.3)

Notice that, unlike for Galilean relativity, acceleration is not an invariant in Special Relativity. But whether or not someone is accelerating is an absolute concept: if the acceleration is zero in one frame of reference, then it is also zero in a Lorentz boosted frame of reference. Note: these formulæ are written in older notation that we will not continue using later in this course.

We can write Lorentz boost formulæ in a much prettier way by using the rapidity ζ, which is defined by

v = tanh ζ .  (3.4)

Note that while the speed ranges over v ∈ (−1, +1), the rapidity ranges over ζ ∈ (−∞, +∞). The really awesome thing about rapidity is that it is additive. To add the rapidities, you literally just add them, like for rotation angles: ζ_tot = ζ₁ + ζ₂. It is a simple exercise to recover the relativistic velocity addition law from the definition of rapidity and its additive nature. Give it a go yourself to be sure you understand. Now we are in a position to show you a Lorentz boost along the x direction in rapidity variables – da-daah!

dt′ = + cosh ζ dt − sinh ζ dx ,  dx′ = − sinh ζ dt + cosh ζ dx ,  dy′ = dy ,  dz′ = dz .  (3.5)
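The additivity of rapidity is quick to confirm numerically. This sketch (my addition; the rapidity values are arbitrary) builds the 2×2 boost matrix of eq. (3.5) acting on (dt, dx) and checks that composing two boosts adds their rapidities, which is exactly the relativistic velocity addition law in disguise.

```python
import numpy as np

def boost(z):
    """2x2 boost matrix acting on (dt, dx), from eq. (3.5), c = 1."""
    return np.array([[ np.cosh(z), -np.sinh(z)],
                     [-np.sinh(z),  np.cosh(z)]])

z1, z2 = 0.7, 1.2
v1, v2 = np.tanh(z1), np.tanh(z2)   # v = tanh(zeta), eq. (3.4)

# Boosts compose by adding rapidities, just like rotation angles:
print(np.allclose(boost(z1) @ boost(z2), boost(z1 + z2)))

# tanh(z1 + z2) reproduces the relativistic velocity addition law:
print(np.isclose(np.tanh(z1 + z2), (v1 + v2) / (1 + v1 * v2)))
```

The second check is the "simple exercise" mentioned above: it follows from the hyperbolic-tangent addition formula.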
This looks a bit like a rotation, except for two physically important differences: (1) it mixes temporal and spatial intervals, rather than different spatial intervals, and (2) it involves hyperbolic trig functions, rather than ordinary trig functions. Another difference is that it is not the 3D Euclidean norm that is preserved under Lorentz transformations, but rather the 4D Minkowski norm, also known as the invariant interval
ds² = dt² − dx² − dy² − dz² .  (3.6)
The invariant interval so defined is positive if the points are timelike separated, negative if they are spacelike separated, and zero if they are null separated. This classification works regardless of which inertial reference frame you use, because it is invariant under symmetry transformations of Minkowski spacetime: rotations, [Lorentz] boosts, and translations.

The invariant interval ds² = dt² − |dx⃗|² gives rise to the concept of a light cone. For a point p, this is the cone defined by all light rays emanating from p into the future or the past. Points that are timelike separated from p are inside its light cone (positive ds²), those that are spacelike separated from p are outside it (negative ds²), and those that are null separated from p (zero ds²) lie on the light cone itself. Put more colloquially, if you had just died at point p, then your past light cone and its interior would contain all possible suspects for who had murdered you. If on the other hand you had set off a bomb at p, then your future light cone and its interior would contain beings you could have killed (using any form of explosive, TNT and photon torpedoes included!).

[Figure: pictorial representation of the light cone, for the D = 2 + 1 case. Figure credit: Wikipedia.]
Note that light rays are conventionally drawn at a 45 degree angle on spacetime diagrams, in flat spacetime, to represent the fact that c = 1. In curved spacetime, the story gets more complicated, because the spacetime metric varies with position, rather than being constant.

It might be worth reminding you of the definition of proper time. To set the context, consider two events that are timelike separated. The proper time between two spacetime events measures the time elapsed as seen by an observer for whom the two events occur at the same spatial position. In our signature convention, the invariant interval is positive in the timelike case, so ds² = dτ².
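A small helper function (my addition, not from the notes) makes the timelike/spacelike/null classification of eq. (3.6) concrete, and checks that a boost in rapidity form leaves the interval unchanged. The sample separations and the rapidity value are arbitrary.

```python
import numpy as np

def interval(dt, dx, dy=0.0, dz=0.0):
    """Invariant interval ds^2 = dt^2 - dx^2 - dy^2 - dz^2 (mostly-minus signature, c = 1)."""
    return dt**2 - dx**2 - dy**2 - dz**2

def classify(ds2, tol=1e-12):
    """Label a separation per the light-cone structure of eq. (3.6)."""
    if ds2 > tol:
        return 'timelike'    # inside the light cone; proper time dtau = sqrt(ds^2)
    if ds2 < -tol:
        return 'spacelike'   # outside the light cone
    return 'null'            # on the light cone itself

print(classify(interval(2.0, 1.0)))   # mostly temporal separation: timelike
print(classify(interval(1.0, 2.0)))   # mostly spatial separation: spacelike
print(classify(interval(1.0, 1.0)))   # on the 45-degree light ray: null

# A boost in rapidity form (eq. 3.5) preserves ds^2:
z, dt, dx = 0.9, 2.0, 1.0
dt2 = np.cosh(z) * dt - np.sinh(z) * dx
dx2 = -np.sinh(z) * dt + np.cosh(z) * dx
print(np.isclose(interval(dt2, dx2), interval(dt, dx)))
```

The tolerance parameter is only there to make the floating-point comparison to zero well defined.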
Motivated by the form of the matrices representing Lorentz boost transformations, let us define a relativistic 4-vector x with components x^µ given by
x^0 = (c)t ,  x^i = (x⃗)^i .  (3.7)
Here, µ ∈ {0, 1, …, d}. Notice how time is totally different conceptually than it was in Galilean relativity: it is the zeroth position coordinate, not an invariant. We can then define the invariant interval as

ds² = g_{µν} dx^µ dx^ν .  (3.8)

In flat Minkowski spacetime in Cartesian coordinates, the metric tensor has downstairs components

η_{µν} = { +1, µ = ν = 0 ; −1, µ = ν ∈ {1, 2, …, d} ; 0, µ ≠ ν } .  (3.9)

Its upstairs counterpart, the inverse, has components

η^{µν} = { +1, µ = ν = 0 ; −1, µ = ν ∈ {1, 2, …, d} ; 0, µ ≠ ν } .  (3.10)

The equations expressing the fact that the upstairs and downstairs Minkowski metrics are inverses of each other are
η^{αβ} η_{βγ} = η^α{}_γ = δ^α_γ ,
η_{αβ} η^{βγ} = η_α{}^γ = δ_α{}^γ .  (3.11)

Again, we have used the Einstein summation convention where repeated indices are summed over. Note that we have chosen the mostly minus signature convention here. Be aware that formulæ that you may obtain from various GR textbooks may have been written in the opposite sign convention. This can be quite annoying when you are trying to track down minus sign errors in a calculation. HEL has a useful table on p.193 outlining key signature convention differences with d’Inverno, Misner-Thorne-Wheeler and Weinberg.

The Minkowski metric tensor η is useful for raising and lowering indices. Specifically, for a contravariant vector V we can find its covariant components V_µ by contracting with η_{µν}:

V_µ = η_{µν} V^ν .  (3.12)

Contracting an index means repeating it (precisely once) and summing over it. For example, in the above equation, the index ν is contracted, while the index µ is not. Let us calculate one component, V_0.
V_0 = η_{0ν} V^ν
 = η_{00} V^0 + η_{01} V^1 + η_{02} V^2 + η_{03} V^3
 = (+1)V^0 + (0)V^1 + (0)V^2 + (0)V^3
 = +V^0 .  (3.13)
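Index lowering with η is a one-line einsum. The following sketch (my addition; the component values of V^µ are arbitrary) verifies eq. (3.13) and shows that the spatial components flip sign in the mostly-minus signature.

```python
import numpy as np

# Minkowski metric, mostly-minus signature, eq. (3.9).
eta = np.diag([1.0, -1.0, -1.0, -1.0])

V_up = np.array([5.0, 1.0, 2.0, 3.0])      # contravariant components V^mu

# V_mu = eta_{mu nu} V^nu, eq. (3.12): contract the repeated index nu.
V_down = np.einsum('mn,n->m', eta, V_up)
print(np.allclose(V_down, [5.0, -1.0, -2.0, -3.0]))   # V_0 = +V^0, V_i = -V^i

# Raising with the inverse metric (eq. 3.14) undoes the lowering; numerically
# the inverse of eta equals eta itself, as eq. (3.11) requires.
eta_inv = np.linalg.inv(eta)
print(np.allclose(eta_inv, eta))
print(np.allclose(np.einsum('mn,n->m', eta_inv, V_down), V_up))
```

The same two einsum calls implement index raising and lowering for any invertible metric, not just η.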
To find the contravariant components ω^µ of a covariant vector ω_ν, we need to contract with the upstairs metric η^{µν}:-

ω^µ = η^{µν} ω_ν .  (3.14)

Using the Minkowski metric, we can define a relativistic dot product between two contravariant vectors a^µ and b^ν,

a · b = η_{µν} a^µ b^ν .  (3.15)

Before we move on to defining tensors in a more general way, let us make a couple of comments about the symmetry group of Minkowski spacetime for those who might be interested. We talked about rotations earlier, and noted that they preserved the norm of 3-vectors in flat Euclidean space. A rotation matrix is orthogonal and preserves the Euclidean 3-norm. The group of such matrices in 3D is known as SO(3). What is the analogue condition for 4-vectors in flat Minkowski spacetime? If you work out the algebra, you will find that both rotation and boost transformations written as 4 × 4 matrices Λ preserve the Minkowski norm, Λ^T η Λ = η, where η is the Minkowski metric tensor we defined above. In index notation,

Λ_{k′}{}^i η^{k′}{}_{ℓ′} Λ^{ℓ′}{}_j = η^i{}_j .  (3.16)

Such matrices Λ in D = d + 1 dimensions are said to belong to the group SO(1, d). Rotation and boost matrices together are known as Lorentz transformations, and they form a Lie (continuous) group known as the Lorentz group. If we include translations as well, the resulting group of transformations is known as the Poincaré group ISO(1, d) (mathematically, it is a semidirect product). An interesting fact about the Poincaré group is that, without even looking at an experiment, you can prove theoretically that there are only two² invariants: the mass m and the intrinsic spin s. They are always the same in different inertial frames of reference related by rotations, boosts, or translations. This is why subatomic particles are differentiated by their mass and spin. The third label we use to distinguish subatomic particles, that also respects Poincaré invariance, is the set of conserved charges under whichever gauge symmetries are relevant, e.g.
SU(3) × SU(2) × U(1) of the Standard Model of Particle Physics.

So, how do we define vectors and tensors in flat spacetime? The signature property of a vector and, more generally, of a tensor, is that it transforms in a specific and well-defined way under changes of reference frame, using the spacetime coordinates as the quintessential example. For a single-index tensor V with upstairs components V^µ,
V^{µ′} = (∂x^{µ′}/∂x^ν) V^ν = Λ^{µ′}{}_ν V^ν ,  (3.17)

which is known as a contravariant vector. There are also covariant vectors, which obey
V_{µ′} = (∂x^ν/∂x^{µ′}) V_ν = Λ^ν{}_{µ′} V_ν .  (3.18)

²If you want to know why, and are unafraid of a little Lie group theory, you can find out by reading my PHY2404S notes at https://ap.io/archives/courses/2014-2020/2404s/qft.pdf. I also explain there why helicity is the relevant thing for massless particles and why spin has the character of an angular momentum for massive particles.
Look closely at the above two equations: they are materially different. In the equation for the contravariant vectors, the transformed coordinates x′ appear in the numerator of the Jacobian and the original coordinates x appear in the denominator in the transformation law; for covariant vectors the opposite happens.

Mathematically speaking, contravariant vectors live in the tangent space, which is defined at every point in spacetime. Covariant vectors live in the cotangent space. They obey the usual axioms of vector spaces: associativity and commutativity of addition, existence of identity and inverse under addition, distributivity, and compatibility with scalar multiplication.

A rank (m, n) tensor has m contravariant indices and n covariant indices. In mathematical language, a rank (m, n) tensor is a multilinear map from the direct product of m copies of the cotangent space with n copies of the tangent space into the real numbers. Alternatively, you can think of it as a machine with m slots for covariant vectors and n slots for contravariant vectors to make a scalar. For instance, a rank (0, 1) tensor (a covariant vector) is a machine with one slot for a contravariant vector (a rank (1, 0) tensor), which when inserted will produce a scalar (a rank (0, 0) tensor). The spacetime metric is a (0, 2) tensor; its inverse is a (2, 0) tensor. To find out how the components of a tensor transform, you use the transformation matrices on each index in turn,
T'^{\mu_1' \ldots \mu_m'}{}_{\nu_1' \ldots \nu_n'} = \frac{\partial x^{\mu_1'}}{\partial x^{\lambda_1}} \cdots \frac{\partial x^{\mu_m'}}{\partial x^{\lambda_m}} \, \frac{\partial x^{\sigma_1}}{\partial x^{\nu_1'}} \cdots \frac{\partial x^{\sigma_n}}{\partial x^{\nu_n'}} \, T^{\lambda_1 \ldots \lambda_m}{}_{\sigma_1 \ldots \sigma_n} . (3.19)
Note that each of the indices \lambda_1, \ldots, \lambda_m and \sigma_1, \ldots, \sigma_n in this equation is repeated and summed over, following the Einstein summation convention. So if you were to expand out all the components one by one, this would be a pretty long equation. It's just as well we know how to represent it compactly using index notation! The general idea of tensor analysis is that all laws of physics should be expressible in terms of tensor equations. In tensorial equations, indices can be raised and lowered, as long as this is done consistently on both sides: you should not raise an index on the left-hand side of a tensor equation while failing to do the same on the right-hand side. Every equation should have the same number and type of free indices on both sides. Tensorial equations hold equally well in any frame of reference, even though the components are different in different frames of reference. Now let us turn to a few examples of the utility of tensors in Minkowski spacetime.
3.2 Partial derivative 4-vector
We can use Minkowski spacetime tensors to describe more objects than a massive point particle. For starters, we can form a very important covariant vector out of derivatives,
\partial_\mu \equiv \frac{\partial}{\partial x^\mu} . (3.20)
Its zeroth component describes the time derivative,
\partial_0 = \frac{\partial}{\partial (ct)} = \frac{1}{c} \frac{\partial}{\partial t} , (3.21)
while the spatial parts \partial_i describe spatial derivatives. As you can see, \partial_\mu arises naturally as a covariant vector. It is a straightforward and worthwhile exercise to show that in flat Minkowski spacetime,
\partial^\mu \partial_\mu = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2} . (3.22)
This differential operator appears in relativistic wave equations, e.g. the Maxwell equations, or the equation of motion for a Klein-Gordon (scalar) field \Phi, \partial^\mu \partial_\mu \Phi = -m^2 \Phi. For fun, let us try applying -i\hbar\partial_\mu to a plane wave of the form f(x) = e^{ik\cdot x} and see what happens.
-i\hbar\partial_\mu f(x) = -i\hbar\partial_\mu \exp(ik_\lambda x^\lambda) = -i\hbar\,[ik_\nu \delta^\nu_\mu] \exp(ik_\lambda x^\lambda) = \hbar k_\mu \, f(x) . (3.23)
In other words, -i\hbar\partial_\mu plays the role of the momentum when acting on plane waves of the form f(x) = e^{ik\cdot x}, producing the eigenvalue p_\mu = \hbar k_\mu. In mathematical lingo, we say that the plane wave carries a representation of the translation group. If we only had discrete translation invariance up to a lattice vector, we would end up with Bloch waves instead of continuous-spectrum plane waves.
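A quick finite-difference check of eq (3.23), in units with \hbar = 1 (the wavevector components and the sample point below are arbitrary illustrative numbers):

```python
import numpy as np

# Check that -i d_mu exp(i k_lambda x^lambda) = k_mu exp(i k_lambda x^lambda),
# with hbar = 1.  k_dn holds the covariant components k_mu; values arbitrary.
k_dn = np.array([1.5, -0.4, 0.2, 0.9])
x0 = np.array([0.3, 1.1, -0.7, 0.5])

f = lambda x: np.exp(1j * (k_dn @ x))    # plane wave f(x) = e^{ik.x}

h = 1e-6
eigs = []
for mu in range(4):
    dx = np.zeros(4); dx[mu] = h
    d_mu_f = (f(x0 + dx) - f(x0 - dx)) / (2 * h)   # central difference d_mu f
    eigs.append(((-1j * d_mu_f) / f(x0)).real)     # should equal k_mu

print(np.round(eigs, 5))    # recovers k_mu = (1.5, -0.4, 0.2, 0.9)
```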
4 M21Sep
4.1 Relativistic particle: position, momentum, acceleration 4-vectors For any point particle, massive or massless, we can define its 4-momentum pµ by
p^0 = E , \quad p^i = (\vec{p})^i , (4.1)
where \vec{p} is the relativistic 3-momentum and E is the relativistic energy. For a massive particle, we have
p^0 = \frac{m}{\sqrt{1 - v^2}} = m\cosh\zeta ,
p^i = \frac{m}{\sqrt{1 - v^2}} v^i = m\sinh\zeta \, \hat{v}^i . (4.2)
Check out for yourself what happens to components of the 4-momentum under Lorentz transformations. Notice that the relativistic norm of the momentum 4-vector is a constant,
p^\mu p_\mu = E^2 - |\vec{p}|^2 = m^2 . (4.3)
This is known as the mass shell relation. It holds for any particle, massless or massive. For massless particles like the photon, E^2 = |\vec{p}|^2. The 4-velocity is defined for massive particles only, via
u^\mu = \frac{dx^\mu(\tau)}{d\tau} , (4.4)
where \tau is the proper time. It is related to the momentum 4-vector by
p^\mu = m u^\mu . (4.5)
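The rapidity parametrization in eq (4.2) makes the mass shell relation (4.3) automatic, since \cosh^2\zeta - \sinh^2\zeta = 1. A quick numerical sketch (the mass, rapidity, and direction are arbitrary illustrative values):

```python
import numpy as np

# With p^0 = m cosh(zeta) and |p| = m sinh(zeta), E^2 - |p|^2 = m^2 holds
# for any rapidity zeta.  All numbers here are illustrative only.
m, zeta = 2.0, 1.3
vhat = np.array([0.6, 0.8, 0.0])          # unit 3-vector along the motion

E = m * np.cosh(zeta)                     # p^0, eq (4.2)
p3 = m * np.sinh(zeta) * vhat             # relativistic 3-momentum p^i

shell = E**2 - p3 @ p3                    # p^mu p_mu, eq (4.3)
print(shell)                              # approximately m^2 = 4

# consistency with v = tanh(zeta): E = m / sqrt(1 - v^2)
v = np.tanh(zeta)
print(np.isclose(E, m / np.sqrt(1 - v**2)))   # True
```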
Note that the 4-velocity satisfies
u^\mu u_\mu = +1 , (4.6)
by the mass shell constraint. Work out for yourself how the spatial components of u^\mu relate to the Newtonian velocity – you should find u^0 = \gamma_v , u^i = \gamma_v (\vec{v})^i. The 4-acceleration is defined for massive particles only, via
a^\mu = \frac{du^\mu(\tau)}{d\tau} = \frac{d^2 x^\mu(\tau)}{d\tau^2} , (4.7)
where \tau is proper time. Work out for yourself how this relativistic acceleration a^\mu connects with the Newtonian version of acceleration \vec{a} that you used in first-year undergrad physics. What if we wanted to get a bit more sophisticated and write down an action principle for the point particle? First, let us do a lightning review of some salient points from classical mechanics. In a general dynamical system, our variables are the coordinates
q^a(\lambda) , (4.8)
where the index a labels which coordinate we are discussing and \lambda is a parameter that measures where we are along a particle path. For a non-relativistic system we will pick \lambda = t, Newtonian time. The velocities are \dot{q}^a(\lambda), where
\dot{} = \frac{d}{d\lambda} . (4.9)
We also have the expression for the canonical momenta in terms of the velocities,
p_a = \frac{\partial L}{\partial \dot{q}^a} , (4.10)
which are found from the action, which is a functional of the coordinates,
S = S[q^a(\lambda)] = \int d\lambda \, L(q^a(\lambda), \dot{q}^b(\lambda)) . (4.11)
Using the Lagrangian L and the expressions for the canonical momenta in terms of the velocities, we can form the Hamiltonian H, which depends on the phase space coordinates: the coordinates q^a and their conjugate momenta p_a,
H = H(q^a, p_b) = \sum_a p_a \dot{q}^a - L . (4.12)
The principle of least action \delta S = 0, combined with your knowledge of the calculus of variations, results in the Euler-Lagrange equations
\frac{\partial L}{\partial q^a} - \frac{d}{d\lambda} \frac{\partial L}{\partial \dot{q}^a} = 0 . (4.13)
These equations of motion are equivalent to Hamilton's equations,
\frac{dp_a}{d\lambda} = \{p_a, H\}_{PB} , \quad \frac{dq^a}{d\lambda} = \{q^a, H\}_{PB} , (4.14)
where the Poisson bracket is defined via
\{f, g\}_{PB} = \sum_a \left( \frac{\partial f}{\partial q^a}\frac{\partial g}{\partial p_a} - \frac{\partial g}{\partial q^a}\frac{\partial f}{\partial p_a} \right) . (4.15)
The basic dynamical variables for a non-relativistic point particle are x^i(t), where t is the non-relativistic time and i = 1, 2, 3. These are 3 bona fide independent functions. There is no issue about how to parametrize t, because all observers agree on time, by Galilean relativity. For a free nonrelativistic particle, the Lagrangian is just the kinetic energy,
S_{nonrel} = \int dt \, \frac{1}{2} m |\vec{v}|^2 . (4.16)
This action respects Galilean invariance in Euclidean 3-space. The canonical momenta are
p_i = m v^i , (4.17)
and the Hamiltonian is
H_{NR} = \frac{1}{2m} p^i p_i . (4.18)
This is just the kinetic energy written in terms of momentum rather than velocity.
So, that was all well and good, but what about an action principle for the relativistic point particle? This will be an integral over the worldline of the particle, which is the path it traces out as it moves through spacetime. For relativistic point particles we cannot use the Newtonian kinetic energy, because it is not invariant under Lorentz boosts. We will have to use a generalization that respects Einsteinian relativity. The simplest guess for an action generalizing the above that people typically write for a massive particle is proportional to the arc length,
S^{(1)}_{rel} = -m \int d\tau \sqrt{\eta_{\mu\nu} \frac{dx^\mu(\tau)}{d\tau} \frac{dx^\nu(\tau)}{d\tau}} , (4.19)
where \tau is the proper time (an invariant, unlike the time coordinate). This action has the benefit that, at low speeds, it reduces to the familiar non-relativistic action – up to an additive constant (try it yourself to see how, by doing a Taylor series). It assumes that the particle position x^\mu(\tau) can be parametrized by the proper time \tau.
The drawback of this first choice of relativistic particle action is twofold. First, the particle is assumed to be massive, so that proper time can be used to parametrize the worldline. If we want to write down equations of motion for massless particles like photons, it will not suffice. Second, our 4 dynamical variables x^\mu(\tau) are not actually independent functions. At all points along the evolution, the 4 must obey the mass shell constraint, \dot{x}^\mu \dot{x}_\mu = +1, where \dot{} = d/d\tau. As a result, only 3 of the 4 x^\mu(\tau) are independent functions. It is a physics fib to pretend that all 4 can be independently varied in the action principle. This is why some people substitute the mass shell constraint into the Lagrangian to sidestep the problem.
Suppose further that we tried to use the above Lagrangian L_\tau = -m\sqrt{\dot{x}^\mu \dot{x}_\mu} to find the canonical momenta and Hamiltonian.
What we would end up with is H_\tau \equiv 0. A related fact is that the geometric arc length Lagrangian is 'singular'. What does this mean? Well, if we inspect the Euler-Lagrange equations for general q^a(\lambda), we can rearrange them to see that
\frac{\partial^2 L}{\partial \dot{q}^b \partial \dot{q}^a} \ddot{q}^b = \frac{\partial L}{\partial q^a} - \frac{\partial^2 L}{\partial q^b \partial \dot{q}^a} \dot{q}^b - \frac{\partial^2 L}{\partial \lambda \partial \dot{q}^a} . (4.20)
Everything on the RHS of this equation is a function of \lambda, q^a, and \dot{q}^a. So for our massive relativistic point particle, finding all of the accelerations \ddot{x}^\mu in terms of \tau, x^\mu(\tau), \dot{x}^\mu(\tau) only works if the Hessian tensor
\frac{\partial^2 L}{\partial \dot{x}^\nu \partial \dot{x}^\mu} (4.21)
has maximal rank. It actually has one zero eigenvalue, and this signals the presence of a local gauge symmetry: reparametrization invariance.3
3For more details on this and a number of related topics, see the 1990 textbook “Lagrangian Interaction” by Noel Doughty, intended for senior undergraduates. When I was doing my B.Sc.(Hons) degree in New Zealand, I took a course from Doughty, and his notes and background material were published as this book a year later. I am very grateful to Doughty for helping inspire me to be a theoretical physicist. If you take a peek into the Acknowledgements section, you will see that he thanked me and four of my classmates. :D
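The zero eigenvalue of the Hessian can be exhibited explicitly. For L = -m\sqrt{\dot{x}\cdot\dot{x}}, differentiating twice gives the closed form -m[\eta_{\mu\nu}/\sqrt{s} - \dot{x}_\mu\dot{x}_\nu/s^{3/2}] with s = \dot{x}\cdot\dot{x}, and \dot{x}^\mu is a null eigenvector of it. A numerical sketch (the velocity components are an arbitrary timelike choice):

```python
import numpy as np

# The arc-length Lagrangian L = -m sqrt(xdot.xdot) is singular: its Hessian
# d^2 L / d xdot^mu d xdot^nu annihilates xdot, so it has rank 3, not 4.
eta = np.diag([1.0, -1.0, -1.0, -1.0])
m = 1.0
xdot = np.array([1.2, 0.3, -0.2, 0.4])     # any timelike velocity (xdot.xdot > 0)

s = xdot @ eta @ xdot                      # s = xdot^mu xdot_mu
xd_dn = eta @ xdot                         # xdot_mu
# closed-form Hessian of -m sqrt(s):
H = -m * (eta / np.sqrt(s) - np.outer(xd_dn, xd_dn) / s**1.5)

print(np.round(H @ xdot, 12))              # the zero mode: H . xdot = 0
print(np.linalg.matrix_rank(H))            # 3, i.e. one zero eigenvalue
```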
Let us now mention the correct way to handle a constraint. The key is to introduce a Lagrange multiplier, which in this case we will call e(\lambda), the einbein. In general, a Lagrange multiplier is something that appears in your action principle only via dependence on "coordinates" but not on "velocities": it is not a truly dynamical field. Its only function is to implement the constraint that you need to impose, in a way that respects the symmetries of your system. In our case, we want to preserve invariance under rotations, boosts, and translations. Using this idea, a Lagrangian can be written down that achieves all the things we need. I will refer to it as the einbein Lagrangian4,
S^{(2)}_{rel} = \int d\lambda \left[ \frac{1}{2} e^{-1}(\lambda) \, \eta_{\mu\nu} \frac{dx^\mu(\lambda)}{d\lambda} \frac{dx^\nu(\lambda)}{d\lambda} + \frac{1}{2} e(\lambda) \, m^2 \right] . (4.22)
This action is invariant under reparametrizations \lambda \to \lambda', as long as the einbein transforms as
e \to e' = \frac{d\lambda}{d\lambda'} e . (4.23)
Varying this action S^{(2)}_{rel} w.r.t. e(\lambda) gives its Euler-Lagrange equation, and this produces the mass shell constraint,
\dot{x}^\mu(\lambda) \dot{x}_\mu(\lambda) = +m^2 [e(\lambda)]^2 . (4.24)
It is a good idea to check this yourself, by working through the steps. Along the way, you will need to use the fact that
\frac{\partial x^\alpha}{\partial x^\beta} = \partial_\beta x^\alpha = \delta^\alpha_\beta = \eta^\alpha{}_\beta , (4.25)
and a similar equation for the \dot{x}^\mu s. For massive particles we can pick the proper time parametrization in which e(\lambda) = 1/m; then \dot{x}^\mu \dot{x}_\mu = 1 and \lambda = \tau. For massless particles, a convenient parametrization is e(\lambda) = 1, and so \dot{x}^\mu \dot{x}_\mu = 0 and there is no concept of proper time, just a parameter \lambda. Because we have obtained the mass shell constraint equation directly from the action, we can be confident that we truly have only 3 independent functions x^\mu(\lambda) in our dynamical system, not 4.
Varying S^{(2)}_{rel} w.r.t. x^\mu(\lambda) gives the equations of motion for the relativistic particle position,
\dot{p}_\mu(\lambda) = 0 , (4.26)
where the canonical momenta are
p_\mu(\lambda) = e^{-1}(\lambda) \, \dot{x}_\mu(\lambda) . (4.27)
The above equation of motion \dot{p}_\mu = 0 is equally valid for a massive or massless particle, as long as it is free, rather than acted on by an external force. If it feels an external force, obviously we would generically not expect its momentum to be conserved. The Hamiltonian is
H_\lambda = p_\mu(\lambda) \dot{x}^\mu(\lambda) - L_\lambda = \frac{1}{2} e(\lambda) \left[ p^\mu(\lambda) p_\mu(\lambda) - m^2 \right] . (4.28)
4Sometimes it is alternatively referred to as "1D General Relativity".
This Hamiltonian is proportional to the constraint, and this is the correct answer because it gives all the correct Poisson brackets:
\{x^\mu, p_\nu\}_{PB} = \delta^\mu_\nu , \quad \{x^\mu, x^\nu\}_{PB} = 0 , \quad \{p_\mu, p_\nu\}_{PB} = 0 . (4.29)
If we wanted to canonically quantize a system (something we will not be doing in this course), we would replace classical Poisson brackets with quantum mechanical commutators.
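The variation of the einbein action with respect to e can be sketched symbolically. Since e enters the Lagrangian algebraically (with no \dot{e}), its Euler-Lagrange equation collapses to \partial L/\partial e = 0; writing s as a stand-in for the e-independent combination \dot{x}^\mu \dot{x}_\mu:

```python
import sympy as sp

# Vary the einbein Lagrangian L = (1/2) s / e + (1/2) e m^2 with respect
# to e, where the symbol s stands for xdot^mu xdot_mu.
e, m = sp.symbols('e m', positive=True)
s = sp.symbols('s')

L = sp.Rational(1, 2) * s / e + sp.Rational(1, 2) * e * m**2

# e appears with no velocities, so its equation of motion is dL/de = 0
sol = sp.solve(sp.diff(L, e), s)[0]
print(sol)    # s = m^2 e^2: the mass shell constraint (4.24)
```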
4.2 Electromagnetism: 4-vector potential and field strength tensor
A less trivial example of a special relativistic tensor is Maxwell's electromagnetism. Having played with the Maxwell equations, you know why EM waves travel (in vacuum) at the speed of light. You may already know that, decades before Einstein invented special relativity, Maxwell had baked it into the very fabric of his eponymous equations! What you may not know is that the familiar electric and magnetic field strengths are actually not correctly described by vectors, but instead by a two-index covariant antisymmetric tensor. Specifically, in four spacetime dimensions, the gauge field strength components F_{\mu\nu} are built out of
F_{0i} = +\delta_{ij} E^j ,
F_{ij} = -\mathcal{E}_{ijk} B^k . (4.30)
In this equation, we used the totally antisymmetric permutation symbol in 3 dimensions. The electromagnetic 4-vector gauge potential A_\mu is built out of the scalar potential and the 3-vector potential, with components
A^0 = \Phi ,
A^i = (\vec{A})^i . (4.31)
It is related to the field strength via the covariant curl,
F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . (4.32)
This splits up in 3+1 notation as
\vec{B} = \vec{\nabla} \times \vec{A} ,
\vec{E} = -\vec{\nabla}\Phi - \frac{\partial \vec{A}}{\partial t} . (4.33)
Note that A_\mu(x^\lambda) is the basic dynamical field of electromagnetism. The field strength F_{\mu\nu} is a derived quantity. Using the above definitions, the four Maxwell equations
\vec{\nabla} \times \vec{B} - \frac{\partial \vec{E}}{\partial t} = \vec{J} , \quad \vec{\nabla} \cdot \vec{E} = \rho , (4.34)
\vec{\nabla} \times \vec{E} + \frac{\partial \vec{B}}{\partial t} = \vec{0} , \quad \vec{\nabla} \cdot \vec{B} = 0 , (4.35)
neatly collapse into two manifestly relativistic Maxwell equations,
\partial_\mu F^{\mu\nu} = J^\nu ,
\mathcal{E}^{\mu\nu\lambda\rho} \partial_\nu F_{\lambda\rho} = 0 . (4.36)
Later when we generalize to curved spacetime, the partial derivatives ∂µ will be replaced by covariant derivatives ∇µ. In the above relativistic Maxwell equations, the 4-vector current is built out of the charge density and the 3-vector current, with components
J^0 = \rho ,
J^i = (\vec{j})^i . (4.37)
The 4-vector current obeys a conservation law,
\partial_\mu J^\mu = 0 . (4.38)
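Charge conservation is actually automatic: \partial_\nu \partial_\mu F^{\mu\nu} vanishes identically, because the symmetric pair of partial derivatives is contracted against the antisymmetric F^{\mu\nu}. A symbolic sketch of this identity for an arbitrary gauge potential (the function names A0 ... A3 are placeholders):

```python
import sympy as sp

# d_nu J^nu = d_nu d_mu F^{mu nu} = 0 identically: symmetric derivatives
# contracted with antisymmetric F^{mu nu}.
t, x, y, z = sp.symbols('t x y z')
X = [t, x, y, z]
eta = sp.diag(1, -1, -1, -1)

# arbitrary gauge potential components A_mu(t, x, y, z)
A = [sp.Function('A%d' % mu)(*X) for mu in range(4)]

# F_{mu nu} = d_mu A_nu - d_nu A_mu, then raise both indices with eta
F_dn = sp.Matrix(4, 4, lambda a, b: sp.diff(A[b], X[a]) - sp.diff(A[a], X[b]))
F_up = eta * F_dn * eta

# J^nu = d_mu F^{mu nu}, then take its 4-divergence
J = [sum(sp.diff(F_up[a, b], X[a]) for a in range(4)) for b in range(4)]
div_J = sp.simplify(sum(sp.diff(J[b], X[b]) for b in range(4)))
print(div_J)    # 0
```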
Here we are working in four spacetime dimensions. If we write more generally the spacetime dimension5 D as D = d + 1, then the D-index pseudotensor \mathcal{E}_{\mu_0 \ldots \mu_d} is defined via
\mathcal{E}_{\mu_0 \ldots \mu_d} = +1 , (\mu_0 \ldots \mu_d) = even permutation of (012 \ldots d) ,
\mathcal{E}_{\mu_0 \ldots \mu_d} = -1 , (\mu_0 \ldots \mu_d) = odd permutation of (012 \ldots d) , (4.39)
\mathcal{E}_{\mu_0 \ldots \mu_d} = 0 , otherwise .
If you want the permutation tensor with upstairs indices, you can easily build it by using \eta^{\mu\nu} to raise the indices. Note that our 4-index permutation pseudotensor \mathcal{E}_{\mu\nu\lambda\sigma} obeys some handy identities, in a Minkowski space generalization of what we saw before for Euclidean 3-space. Defining
\delta^{\mu\nu}_{\alpha\beta} = \begin{vmatrix} \delta^\mu_\alpha & \delta^\nu_\alpha \\ \delta^\mu_\beta & \delta^\nu_\beta \end{vmatrix} (4.40)
and
\delta^{\mu\nu\lambda}_{\alpha\beta\gamma} = \begin{vmatrix} \delta^\mu_\alpha & \delta^\nu_\alpha & \delta^\lambda_\alpha \\ \delta^\mu_\beta & \delta^\nu_\beta & \delta^\lambda_\beta \\ \delta^\mu_\gamma & \delta^\nu_\gamma & \delta^\lambda_\gamma \end{vmatrix} (4.41)
and
\delta^{\mu\nu\lambda\sigma}_{\alpha\beta\gamma\delta} = \begin{vmatrix} \delta^\mu_\alpha & \delta^\nu_\alpha & \delta^\lambda_\alpha & \delta^\sigma_\alpha \\ \delta^\mu_\beta & \delta^\nu_\beta & \delta^\lambda_\beta & \delta^\sigma_\beta \\ \delta^\mu_\gamma & \delta^\nu_\gamma & \delta^\lambda_\gamma & \delta^\sigma_\gamma \\ \delta^\mu_\delta & \delta^\nu_\delta & \delta^\lambda_\delta & \delta^\sigma_\delta \end{vmatrix} (4.42)
5We only ever consider spacetimes with one timelike dimension. Currently, it is not generally known how to make sense of quantum theory with two or more timelike dimensions. Which ∂/∂t should we use in the Schrödinger equation?
gives, after quite a bit of algebra,
\mathcal{E}^{\mu\nu\lambda\sigma} \mathcal{E}_{\mu\nu\lambda\sigma} = -4! ,
\mathcal{E}^{\mu\nu\lambda\sigma} \mathcal{E}_{\mu\nu\lambda\delta} = -3! \, \delta^\sigma_\delta ,
\mathcal{E}^{\mu\nu\lambda\sigma} \mathcal{E}_{\mu\nu\gamma\delta} = -2! \, \delta^{\lambda\sigma}_{\gamma\delta} ,
\mathcal{E}^{\mu\nu\lambda\sigma} \mathcal{E}_{\mu\beta\gamma\delta} = -1! \, \delta^{\nu\lambda\sigma}_{\beta\gamma\delta} ,
\mathcal{E}^{\mu\nu\lambda\sigma} \mathcal{E}_{\alpha\beta\gamma\delta} = -0! \, \delta^{\mu\nu\lambda\sigma}_{\alpha\beta\gamma\delta} . (4.43)
The relativistic Lorentz force law can be written very nicely in relativistic tensor notation,
m a^\mu = q F^\mu{}_\nu u^\nu , (4.44)
where u^\mu is the relativistic 4-velocity and a^\mu is the relativistic 4-acceleration. You will work out some aspects of this EM story in your HW1 assignment. In particular, you will be able to compute the effect of a Lorentz boost on the electromagnetic fields \vec{E} and \vec{B}, which many of you will not have seen before. We now turn to the question of what happens for accelerated observers moving with constant relativistic acceleration.
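The permutation-tensor contraction identities (4.43) can be spot-checked numerically. A sketch verifying the first one, building the permutation symbol by brute force and raising indices with \eta (so the all-upstairs symbol picks up an overall sign from det \eta = -1):

```python
import numpy as np
from itertools import permutations

# Verify E^{mu nu lam sig} E_{mu nu lam sig} = -4! in signature (+,-,-,-).
eps_dn = np.zeros((4, 4, 4, 4))
for perm in permutations(range(4)):
    sign, p = 1, list(perm)
    for i in range(4):              # parity = (-1)^(number of inversions)
        for j in range(i + 1, 4):
            if p[i] > p[j]:
                sign = -sign
    eps_dn[perm] = sign

eta = np.diag([1.0, -1.0, -1.0, -1.0])
# raise all four indices with eta
eps_up = np.einsum('ma,nb,lc,sd,abcd->mnls', eta, eta, eta, eta, eps_dn)

full = np.einsum('mnls,mnls->', eps_up, eps_dn)
print(full)    # -24.0 = -4!
```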
5 R24Sep
5.1 Constant relativistic acceleration and the twin paradox
When I was an undergraduate, a professor introduced the idea of the Twin Paradox to us. Could the space traveller twin really live longer by travelling at relativistic speeds? The maddening thing was that he never equipped us with the technology to answer the question! Here is how we can solve that without having to resort to General Relativity: we will only use what we know about Special Relativity, along with a tiny bit of calculus.
We all know that time dilation lengthens time intervals as compared to what is measured in the rest frame. We also know that each observer sees the other person's clock as running slow. So why is there even a difference between what the space twin sees and what the homebody twin sees? Acceleration. The space twin must accelerate in order to turn around and come back to Earth, before they can compare clocks with the homebody twin. This is what makes the space twin physically distinct from the homebody twin, who stays in a relatively boring inertial reference frame while the space twin gallivants around the galaxy.
What do we mean by constant relativistic acceleration, exactly? Without loss of generality, we may take the astronaut acceleration to be pointing along the x^1 direction. Since the rapidity is additive under two successive Lorentz boosts, we may take a guess that constant relativistic acceleration occurs when rapidity increases linearly with proper time. Let the magnitude of the constant relativistic acceleration be g. For an infinitesimal addition to rapidity in the x^1 direction d\zeta, the proposal is
dζ = g dτ . (5.1)
Here we have suppressed the factors of c, which can be easily restored via dimensional analysis. This formula is interesting because it actually holds for any kind of acceleration g(\tau), not just the constant kind. Let us now see why. Our key tool for analysis will be to define the instantaneous inertial rest frame (IIRF) for the accelerating astronaut, which we will denote by primes. This is obviously distinct from the ordinary inertial reference frame (IRF) of the homebody twin, and it is different at each point along the space twin's trajectory because they accelerate. The key physical feature of the IIRF at any \tau is that the astronaut is at rest in that frame at the instant in question: u'_x = 0. And since we know the relationship between 3-velocities and 3-accelerations in different inertial reference frames from our experience with Lorentz transformations, we can figure out what happens for the astronaut's trajectory measured in the lab frame. We have
u'_x = \frac{u_x - v}{1 - u_x v} , \quad a'_x = \frac{a_x}{\gamma_v^3 (1 - u_x v)^3} , (5.2)
where \gamma_v \equiv 1/\sqrt{1 - v^2}. Accordingly, at each instant along the astronaut trajectory,
u_x = v . (5.3)
Therefore, for a general acceleration a'_x = g(\tau) in the IIRF,
a_x = \gamma_v^{-3} \, g(\tau) . (5.4)
We also know from elementary Lorentz transformations that dt = \gamma_v d\tau (this is just like muons from cosmic rays lasting longer in the lab frame than in the muon rest frame because they are whizzing down to Earth at relativistic speed). Remembering that a_x = du_x/dt, rearranging the above equation as a function of v, and integrating gives for the 3-velocity in the x^1 direction
\mathrm{arctanh}[u_x(\tau)] = \int_0^\tau d\sigma \, g(\sigma) . (5.5)
In turn, this can be easily rearranged to give the rapidity along the x^1 direction
\zeta(\tau) = \int_0^\tau d\sigma \, g(\sigma) , (5.6)
where we assumed that \zeta(\tau = 0) = 0. Next, we would like to compute the distance in the homebody frame moved by the astronaut during homebody time dt. This is simply obtained from the speed
dx = u_x \, dt = \tanh\zeta \, dt . (5.7)
To convert to astronaut time, we again use the standard time dilation formula,
dt = cosh ζ dτ . (5.8)
This implies that
dx = \sinh\zeta \, d\tau . (5.9)
If we know \zeta(\tau), we can integrate these equations. It is especially easy to do so for constant acceleration g. The position of the space twin in homebody coordinates becomes
x(\tau) = \frac{1}{g}[\cosh(g\tau) - 1] + x_0 . (5.10)
The time for the space twin in homebody coordinates integrates to
t(\tau) = \frac{1}{g}\sinh(g\tau) + t_0 . (5.11)
Using these equations, you can figure out the physical effect of acceleration on the ageing process. In your first homework assignment, you will find out that acceleration serves to enhance the familiar constant-speed time dilation effect, rather than reduce it. This is because the free particle trajectory actually maximizes proper time elapsed during motion; any acceleration applied reduces it. We will be able to see why this is later on when we study geodesics. Geodesics are, morally speaking, the closest thing to a straight line that is available in curved spacetime. They describe the trajectories of test particles in freefall. Getting back to our equations above, we can see that for the case of a constant relativistic acceleration g, the trajectory of the accelerating space twin satisfies
\left[ x(\tau) - x_0 + \frac{1}{g} \right]^2 - [t(\tau) - t_0]^2 = \frac{1}{g^2} . (5.12)
As you can see by inspection, this is a hyperbola. The asymptotes of the hyperbola are known as acceleration horizons or Rindler horizons. These asymptotes are lines with
a 1:1 slope on a spacetime diagram. We can see why they are horizons by recalling that light rays also move at 45 degrees. An observer on a timelike trajectory going at higher acceleration hugs the hyperbola asymptote more tightly, but still cannot 'see' beyond the Rindler horizon.
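It is easy to check eqs (5.10)-(5.12) by integrating d\zeta = g\,d\tau, dt = \cosh\zeta\,d\tau, dx = \sinh\zeta\,d\tau directly. A numerical sketch in units with c = 1 and x_0 = t_0 = 0 (the choices g = 1 and the step size are arbitrary):

```python
import numpy as np

# Forward-Euler integration of dzeta = g dtau, dt = cosh(zeta) dtau,
# dx = sinh(zeta) dtau for constant g, then compare with the closed forms.
g, dtau, N = 1.0, 1e-4, 20000
zeta = t = x = 0.0
for _ in range(N):
    t += np.cosh(zeta) * dtau
    x += np.sinh(zeta) * dtau
    zeta += g * dtau

tau = N * dtau
print(x, (np.cosh(g * tau) - 1) / g)      # matches eq (5.10)
print(t, np.sinh(g * tau) / g)            # matches eq (5.11)
# hyperbola (5.12): (x + 1/g)^2 - t^2 = 1/g^2
print((x + 1 / g)**2 - t**2)
```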
In fact, the physics is even more interesting than this. The accelerated astronaut not only finds that there are parts of spacetime that they cannot communicate with because of their acceleration, but also that the physics of quantum fields for them is qualitatively and quantitatively different from what the homebody sees. The Minkowski vacuum (the state with no particles), seen in the reference frame of the astronaut with constant acceleration, turns out to have plenty of particles in it, and they can be measured with a detector. Not only that, the spectrum is thermal, at the Rindler temperature. Including factors of c, the formula for this reads
T_{Rindler} = \frac{\hbar g}{2\pi c k_B} . (5.13)
The greater the acceleration, the higher the temperature that the detector will register. This phenomenon of acceleration radiation is known as the Unruh effect. For those who are interested, the physics of particle detectors in curved spacetime is explained nicely in the advanced textbook by Birrell and Davies, "Quantum Field Theory in Curved Space".
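To get a feel for the size of the effect, here is the Rindler temperature (5.13) evaluated for an acceleration of one Earth gravity, using the usual SI values of the constants:

```python
import numpy as np

# Rindler (Unruh) temperature for g = one Earth gravity, in SI units.
hbar = 1.054571817e-34   # J s
c = 2.99792458e8         # m / s
kB = 1.380649e-23        # J / K
g = 9.81                 # m / s^2

T = hbar * g / (2 * np.pi * c * kB)
print(T)    # ~ 4e-20 K: utterly negligible for everyday accelerations
```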
5.2 The Equivalence Principle
Einstein became famous for several different accomplishments. One which is legendary among theoretical physicists is the concept of the Gedankenexperiment (German for thought experiment). It allows us to work out all sorts of imaginative ideas without having to actually spend any money. So imagine, if you will, that you are an astronaut on the space station. Imagine that you are blindfolded and kidnapped and then one of two things happens to you. Either you feel the acceleration due to gravity or you take a ride in a rocket ship capable of that same acceleration. How would you tell the difference?
The gravitational force from a body of mass M on a test mass m_g is
\vec{F}_{grav} = -\frac{G_N M m_g}{r^2} \hat{r} , (5.14)
where m_g is the gravitational mass and G_N is the Newton constant. So we have
\vec{a}_{grav} = -\frac{m_g}{m_i} \frac{G_N M}{r^2} \hat{r} . (5.15)
If m_i = m_g, then this acceleration \vec{a}_{grav} does not depend on the properties of the test mass feeling the gravitational force. The universality of gravitation was first put forward by Newton, centuries before Einstein. Others hypothesized that the acceleration due to gravity should be universal, not depending on the composition of the falling object. This idea has since been tested exquisitely well. It implies that an object's inertial mass m_i (what makes you hard to move in the morning) is equal to its gravitational mass m_g (what responds to gravity), and is known as the Weak Equivalence Principle. When Einstein formulated his theory of General Relativity (GR), he decided to bake the equivalence principle into the very fabric of spacetime. In GR, there is no local experiment
you can do to tell the difference between acceleration due to rockets and acceleration due to gravity. This is known as the Einstein Equivalence Principle.
The really cool thing about the equivalence principle? It implies that every reference frame, including accelerating ones, can be instantaneously approximated by a Lorentz frame. This might seem like mathematical nitpicking, but it is actually a key physics insight, as it implies that locally in spacetime, everything is just Special Relativity. What makes General Relativity interesting and nontrivial is the story of how those individual infinitesimal neighbourhoods are sewn together into the fabric of curved spacetime. It is important to note that this equivalence between gravity and acceleration holds only in an infinitesimal patch about a point. If we have access to a finite sized patch of spacetime, we can distinguish gravity from acceleration by measuring tidal forces. We will develop that story later on when we get to discussing geodesic deviation.
Consider a photon in Earth's gravitational field. If it gets aimed upwards, then after a time interval dt, what is the effect? Well, photons cannot change their speed, as they always go at c. What can change for a photon is its energy (or equivalently the magnitude of its momentum, because the photon mass shell relation is E = |\vec{p}|). It can also change its heading. When a photon moves upwards in a gravitational field, it gains gravitational potential energy, so it must lose kinetic energy (conservation of the total energy is valid near Earth, because there is a time translation symmetry). The photon should therefore suffer a redshift in going upwards. This phenomenon is known as gravitational redshift, and it implies that clocks run slower when they are deeper in a gravitational field. Black holes take this to an extreme, as we will see much later in the course.
Did you know that GPS devices rely on both Special and General Relativity to locate you accurately? They need to take account of the fact that the GPS transmitter satellites are (a) travelling at a measurable fraction of the speed of light, requiring Special Relativistic Doppler corrections, and (b) higher up in Earth’s gravitational field than we are, necessitating General Relativistic corrections. Without those corrections together, you would probably be kilometres off your intended position after a day’s canoeing. So GR does actually touch your life in a measurable way, if you ever use a GPS unit, say in your smartphone.
5.3 Spacetime as a curved Riemannian manifold
Newton conceptualized gravity via forces that act at a distance instantaneously. This instantaneous propagation of gravitational effects is in direct contradiction to the relativistic principle that the speed of light is the upper speed limit for everyone. In Einstein's GR, the speed of propagation of gravitational disturbances is tied to be exactly equal to the speed of light in vacuum. The formalism of GR is designed to express all the effects of gravity in a relativistic way, like gravitational redshift, via geometrical properties of the fabric of spacetime. The mathematical name for the type of geometry used is (pseudo)Riemannian geometry. A (p + q)-dimensional manifold with signature (p, q) is a spacetime that locally looks like a patch of R^{p,q}. For example, for our D = 3 + 1 universe with three large spatial dimensions, this would be R^{3,1}. The manifold is the collection (union) of these patches, known as coordinate charts, along with the transition functions that teach you how to sew the patches together. The manifold needs to be continuous, and in order for us
to compute sensible physical quantities it should also be differentiable. The mathematical concept of the coordinate chart is equivalent to the usual physics idea of a coordinate system or reference frame. As an example of how you might need more than one coordinate chart to cover a manifold, consider a circle S^1. Each coordinate chart must be an open set of R (emphasis on open). So the minimum number of coordinate charts required to cover the 1-sphere is two. For the 2-sphere S^2, you need a projection to get S^2 onto R^2, or a patch thereof like a map. The most commonly used projection is the Mercator projection, which preserves angles rather than area. It is possible to use a different projection that preserves area, such as the Peters projection. However, the price of maintaining areas on the map is that angles are not preserved: countries look funny shaped compared to their Mercator cousins. Because the sphere is curved and the plane is not, you cannot create a map that preserves both angles and areas. The reason why the Mercator projection has been so dominant is a technical one: because it preserves angles, it is optimal for navigation of marine vessels and aeroplanes. But it massively overstates the size of countries closer to the poles. In particular, Western Europe looks more important on Mercator maps than it should, while Africa and Brazil look much smaller. Colonialism also had a role in the dominance of the Mercator projection.
Examples of manifolds include Minkowski space, the sphere, the torus, and 2D Riemann surfaces with arbitrary genus. What about spaces that are not manifolds? Any intersection of lines with k-planes will do. A cone is an example of a non-differentiable manifold, because of what happens at its apex. Some manifolds have a boundary, for instance a line segment. Some manifolds have no boundary. General Relativity treats the fabric of spacetime as a differentiable manifold.
Note that it is also possible to handle discontinuities in the spacetime metric in some situations in GR, but only if the appropriate source of energy-momentum is available at the discontinuity to enforce consistency with Einstein's equations. The formalism for handling this non-differentiable case is known as the Israel junction conditions, and its equations are derived by integrating Einstein's equations across discontinuities in suitably covariant ways. This works a lot like deriving equations for shock waves in fluid mechanics. Spacetime being a differentiable manifold is not enough structure to describe gravity as we see it in experiments. The geometry should be suitably constrained by some physical equations, which should – by the Correspondence Principle – reduce to Newtonian mechanics in the limit of small speeds and weak gravity. Our spacetime manifolds will satisfy the Einstein equations.
6 M28Sep
6.1 Basis vectors in curved spacetime
How do vectors and tensors work when spacetime is curved? We will have to be more careful than before, and the key difference is that the matrices showing us how to transform between different coordinate systems are no longer constant matrices. Suppose that we have coordinates x^\mu on our manifold and that we consider an arbitrary function f of these coordinates. Then the directional derivative of f along a curve parametrized by \lambda is
\frac{df}{d\lambda} = \frac{\partial f}{\partial x^\mu} \frac{dx^\mu}{d\lambda} , (6.1)
so that we can write
\frac{d}{d\lambda} = \frac{dx^\mu}{d\lambda} \partial_\mu . (6.2)
In other words, \{e_\mu = \partial_\mu\} is a set of basis vectors. This story goes deeper. Mathematically, the tangent space T_p(M) at a point p of a manifold M is isomorphic to the space of directional derivative operators on curves through p. It is a vector space, and the Leibniz rule is obeyed. Vector fields can then be defined on M. An example of a vector field would be the wind direction at the surface of the Earth. Take a look at https://earth.nullschool.net/ for a very beautiful interactive visualization of winds on Earth.
What about a basis for covariant vectors living in the cotangent space T^*_p(M)? There is a very natural candidate: the differentials \{e^\mu = dx^\mu\}. Note that these dx^\mu are not the same as the contravariant basis vectors \partial_\mu = \partial/\partial x^\mu; you can tell the difference partly by where the index is placed. The coordinate bases for contravariant and covariant vectors obey a natural inner product,
(\partial_\nu)(dx^\mu) = \frac{\partial x^\mu}{\partial x^\nu} = \delta^\mu_\nu . (6.3)
We do not have to stick to only using partial derivatives and differentials as bases. More generally, we can denote a basis for contravariant vectors living in the tangent space as
\{e_\mu\} . (6.4)
Any contravariant vector v can be expanded in terms of this basis,
v = v^\mu e_\mu . (6.5)
Also, we denote a basis for covariant vectors living in the cotangent space as
\{e^\nu\} . (6.6)
Any covariant vector \omega can be expanded in terms of this basis,
\omega = \omega_\nu e^\nu . (6.7)
Generally, a basis for contravariant vectors e_\mu and a basis for covariant vectors e^\nu must be reciprocal,
e_\nu \cdot e^\mu = \delta^\mu_\nu . (6.8)
What if we wanted to measure distances and angles on our spacetime manifold? In terms of the basis e_\mu we can write the vector displacement ds between a point at x^\mu and another point at x^\mu + dx^\mu in terms of our general basis vectors,
µ ds = eµdx . (6.9)
Accordingly, the line element ds2 is
2 µ ν ds = ds · ds = eµdx · eνdx . (6.10)
From this equation we can identify the metric tensor, denoted by gµν,
gµν = eµ · eν . (6.11)
This is a generalization of the flat Minkowski metric that we encountered in our quick review of Special Relativity, and it tells us how to measure distances and angles. The above line element in curved spacetime obeys a very important principle: it is invariant under arbitrary coordinate changes which are invertible and $C^\infty$, known as diffeomorphisms. The inverse metric is denoted $g^{\mu\nu}$ and it is built in an exactly analogous fashion,
\[ g^{\mu\nu} = e^\mu \cdot e^\nu \,. \qquad (6.12) \]
The downstairs metric $g_{\mu\nu}$ and its inverse the upstairs metric $g^{\nu\lambda}$ obey
\[ g^{\mu\nu} g_{\nu\lambda} = \delta^\mu_\lambda \,. \qquad (6.13) \]
The spacetime metric and its inverse are used all the time in GR, for raising and lowering indices on tensors. Sometimes it is physically useful to use a special basis called the orthonormal basis. In this case, we denote the basis vectors with hats, and they obey
\[ \hat{e}_\mu \cdot \hat{e}_\nu = \eta_{\mu\nu} \,, \qquad \hat{e}^\mu \cdot \hat{e}^\nu = \eta^{\mu\nu} \,. \qquad (6.14) \]
Flat, boring Minkowski spacetime $\mathbb{R}^{1,3}$ written in a spherical polar coordinate basis is not a curved spacetime, but it has a spacetime metric and tensor transformation laws that depend on spacetime position. As an exercise to test your understanding, check explicitly that the line element is
\[ ds^2 = dt^2 - dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta \, d\phi^2 \,, \qquad (6.15) \]
by starting from the expressions for the spherical polar spatial coordinates $\{r, \theta, \phi\}$ in terms of the Cartesian spatial coordinates $\{x^1, x^2, x^3\}$,
\[ x^1 = r \cos\theta \,, \qquad (6.16) \]
\[ x^2 = r \sin\theta \cos\phi \,, \qquad (6.17) \]
\[ x^3 = r \sin\theta \sin\phi \,. \qquad (6.18) \]
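The exercise can be cross-checked symbolically by pulling back the flat spatial metric $\delta_{ij}\, dx^i dx^j$ through the coordinate change above; a minimal sympy sketch:

```python
import sympy as sp

# Pull back delta_ij dx^i dx^j through the parametrization
# x1 = r cos(theta), x2 = r sin(theta) cos(phi), x3 = r sin(theta) sin(phi).
r, theta, phi = sp.symbols('r theta phi', positive=True)
X = sp.Matrix([r * sp.cos(theta),
               r * sp.sin(theta) * sp.cos(phi),
               r * sp.sin(theta) * sp.sin(phi)])
q = [r, theta, phi]
J = X.jacobian(q)            # J[i, a] = dx^i / dq^a
g = sp.simplify(J.T * J)     # induced spatial metric g_ab
print(g)
```

The result is diag$(1, r^2, r^2 \sin^2\theta)$, so the spatial part of the line element is $dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2$, as claimed in eq. (6.15).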
6.2 Tensors in curved spacetime

Tensors work in curved spacetime a lot like they do in flat spacetime. The most important physical difference is that under a change of reference frame represented by
\[ \Lambda^{\mu'}_{\;\;\nu} \equiv \frac{\partial x^{\mu'}}{\partial x^\nu} \qquad (6.19) \]
the new coordinates are related to the old ones by coordinate-dependent factors, rather than simple constants like $\cos\theta$ or $\sinh\zeta$. Our central physics strategy will be to remain focused on the transformation properties of our tensors of interest. That is the essence of what a tensor does: it transforms in very specific, well-defined ways when the coordinate system changes.
Earlier, we introduced bases $e_\mu$ for rank (1,0) vectors and $e^\nu$ for rank (0,1) vectors. Accordingly, a general rank (1,0) vector $v$ in curved spacetime can be written in components as
\[ v = v^\mu e_\mu \,, \qquad (6.20) \]
and a covariant vector $\omega$ in curved spacetime can be written in components
\[ \omega = \omega_\mu e^\mu \,. \qquad (6.21) \]
Under coordinate transformations, their components transform as
\[ v^{\mu'} = \Lambda^{\mu'}_{\;\;\nu} v^\nu \,, \qquad (6.22) \]
and
\[ \omega_{\mu'} = \Lambda^{\;\;\nu}_{\mu'} \omega_\nu \,. \qquad (6.23) \]
The transformation matrices
\[ \Lambda^{\mu'}_{\;\;\nu} = \frac{\partial x^{\mu'}}{\partial x^\nu} \qquad \text{and} \qquad \Lambda^{\;\;\nu}_{\mu'} = \frac{\partial x^\nu}{\partial x^{\mu'}} \qquad (6.24) \]
now generically depend on spacetime position, and they satisfy
\[ \Lambda^{\;\;\sigma}_{\mu'} \Lambda^{\mu'}_{\;\;\nu} = \delta^\sigma_\nu \,, \qquad \Lambda^{\;\;\sigma}_{\mu'} \Lambda^{\nu'}_{\;\;\sigma} = \delta^{\nu'}_{\mu'} \,. \qquad (6.25) \]
Contravariant vectors on a pseudo-Riemannian manifold representing curved spacetime live in the tangent space, which is a vector space. Covariant vectors live in the cotangent space. The collection of all (co)tangent spaces over $M$ is known mathematically as the (co)tangent bundle. One of the key properties of a covariant vector $\omega$ is that we can naturally take its inner product with a contravariant vector $v$ without using the metric, and it yields a scalar:
\[ \omega(v) = \omega_\nu e^\nu \cdot v^\mu e_\mu = \omega_\mu v^\mu \,. \qquad (6.26) \]
This enables us to recognize that another way to think about a covariant vector is that it is a machine that takes a contravariant vector and produces a scalar. Or in mathematical words, it is a linear map from the tangent space into the real numbers $\mathbb{R}$, obeying
\[ (a\omega_1 + b\omega_2)(v) = a\,\omega_1(v) + b\,\omega_2(v) \,, \qquad (6.27) \]
\[ \omega(a v_1 + b v_2) = a\,\omega(v_1) + b\,\omega(v_2) \,. \qquad (6.28) \]
Vectors obey entirely analogous rules. A rank $(m, n)$ tensor in curved spacetime is defined by direct analogy as a multilinear map from a collection of $m$ covariant vectors and $n$ contravariant vectors to $\mathbb{R}$. Its components in a coordinate basis can be extracted from $T$ by slotting in the right number of covariant and contravariant basis vectors,
\[ T^{\mu_1 \ldots \mu_m}_{\;\;\nu_1 \ldots \nu_n} = T(e^{\mu_1}, \ldots, e^{\mu_m}, e_{\nu_1}, \ldots, e_{\nu_n}) \,. \qquad (6.29) \]
Alternatively, it can be written in terms of basis tensors as
\[ T = T^{\mu_1 \ldots \mu_m}_{\;\;\nu_1 \ldots \nu_n} \, e_{\mu_1} \otimes \ldots \otimes e_{\mu_m} \otimes e^{\nu_1} \otimes \ldots \otimes e^{\nu_n} \,, \qquad (6.30) \]
where $\otimes$ denotes the outer product (not the inner product!). Note that in this picture expanding a tensor in components in one basis versus a second basis results in different components, as we would expect; the tensor stays the same. The coordinate transformation law for the components of a rank $(m, n)$ tensor in curved spacetime is
\[ T^{\mu_1' \ldots \mu_m'}_{\;\;\nu_1' \ldots \nu_n'} = \frac{\partial x^{\mu_1'}}{\partial x^{\lambda_1}} \cdots \frac{\partial x^{\mu_m'}}{\partial x^{\lambda_m}} \, \frac{\partial x^{\sigma_1}}{\partial x^{\nu_1'}} \cdots \frac{\partial x^{\sigma_n}}{\partial x^{\nu_n'}} \, T^{\lambda_1 \ldots \lambda_m}_{\;\;\sigma_1 \ldots \sigma_n} \,. \qquad (6.31) \]
Notice that now the Jacobians involved typically depend on spacetime position.
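The reciprocal relation (6.25) between the two Jacobians is easy to check numerically at a single point. Here is a small sketch for plane polar coordinates $(\rho, \phi)$ in two dimensions (the point is chosen arbitrarily for illustration):

```python
import numpy as np

# Forward Jacobian d(rho, phi)/d(x, y) and backward Jacobian
# d(x, y)/d(rho, phi), evaluated at one point.
x, y = 1.3, 0.7
rho = np.hypot(x, y)
phi = np.arctan2(y, x)
L_fwd = np.array([[x / rho, y / rho],             # d rho/dx, d rho/dy
                  [-y / rho**2, x / rho**2]])     # d phi/dx, d phi/dy
L_back = np.array([[np.cos(phi), -rho * np.sin(phi)],   # dx/d rho, dx/d phi
                   [np.sin(phi), rho * np.cos(phi)]])   # dy/d rho, dy/d phi
print(np.allclose(L_fwd @ L_back, np.eye(2)))     # True
```

Note that the matrix entries depend on the point chosen: the transformation matrices are position-dependent, exactly as emphasized above.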
6.3 Rules for tensor index gymnastics

There are very specific rules for manipulating tensors. We already met one of them: the Einstein summation convention. In curved spacetime it works exactly the same way as in flat spacetime: repeated indices are summed over. But let us also make explicit some other specific tensor manipulation rules.
First and foremost among them is the fact that when you write a tensor equation, indices on the LHS and RHS must be exactly matched. For example, $p^\mu = m u^\mu$ is a sensible tensor equation (it has one upstairs index on both sides) while the erroneous $p_\mu = m u^\mu$ is not (on the LHS the $\mu$ index is downstairs while on the RHS it is upstairs).
Second, vertical moves of tensor indices – up or down – can only be made by lowering or raising them with the rank (0,2) metric tensor or its rank (2,0) inverse. Said less pedantically, we raise or lower indices using the metric. For example,
\[ T_\mu^{\;\;\nu}{}_{\lambda\sigma} = g_{\mu\rho} T^{\rho\nu}_{\;\;\;\;\lambda\sigma} \,, \qquad (6.32) \]
and similarly for other raised/lowered components: you use as many factors of the metric/inverse metric as needed (with appropriate contractions) to lower/raise all the requisite indices.
Third, we must always preserve the horizontal ordering of the indices when calculating, for both upstairs and downstairs indices. For example, for a general rank (2,2) tensor,
\[ T^{\mu\nu}_{\;\;\;\;\lambda\sigma} \neq T^{\nu\mu}_{\;\;\;\;\lambda\sigma} \,. \qquad (6.33) \]
(The RHS has the $\mu$ and $\nu$ indices switched compared to the LHS.) Other horizontal switches of indices are equally verboten, unless you know that the tensor has appropriate symmetry properties. The only standard exception to the rule that horizontal index ordering matters is the Kronecker $\delta^\alpha_\beta$ tensor, which is symmetrical by definition.
Let us now make a few remarks about symmetries among tensors. Tensors can have symmetries on their indices, which reduce the number of independent components, but this is not generic. For example, under interchange of its indices a two-index tensor might be symmetric
\[ S_{\mu\nu} = +S_{\nu\mu} \,, \qquad (6.34) \]
or antisymmetric
\[ A_{\mu\nu} = -A_{\nu\mu} \,. \qquad (6.35) \]
For rank two tensors only, an arbitrary tensor $T$ can actually be written as the sum of a symmetric tensor $S$ and an antisymmetric tensor $A$. In components,
\[ T_{\mu\nu} = S_{\mu\nu} + A_{\mu\nu} \,, \qquad (6.36) \]
where
\[ S_{\mu\nu} = \frac{1}{2!} (T_{\mu\nu} + T_{\nu\mu}) \,, \qquad A_{\mu\nu} = \frac{1}{2!} (T_{\mu\nu} - T_{\nu\mu}) \,. \qquad (6.37) \]
This works because the total number of independent components of a 2-index tensor is $D \times D = D^2$, while a symmetric 2-index tensor has $D(D+1)/2$ components and an antisymmetric 2-index tensor has $D(D-1)/2$, so that $D(D+1)/2 + D(D-1)/2 = D^2$. For larger rank, such a split cannot be done, because totally symmetric and totally antisymmetric tensors do not have enough independent components between them to cover the total number.
Any tensor can be symmetrized or antisymmetrized on any number $k$ of upper or lower indices. For symmetrization on $k$ indices, denoted by round parentheses around those indices, we have
\[ T_{(\mu_1 \ldots \mu_k)} \equiv \frac{1}{k!} \left( T_{\mu_1 \mu_2 \ldots \mu_k} + \text{sum over other permutations of } \{\mu_1 \ldots \mu_k\} \right) \,, \qquad (6.38) \]
while for antisymmetrization on $k$ indices, denoted by square brackets around those indices, we have
\[ T_{[\mu_1 \ldots \mu_k]} \equiv \frac{1}{k!} \left( T_{\mu_1 \mu_2 \ldots \mu_k} + \text{alternating sum over other permutations of } \{\mu_1 \ldots \mu_k\} \right) \,, \qquad (6.39) \]
where the alternating sum counts even/odd permutations with a $+/-$ sign.
Suppose you know that a tensor is symmetric with all indices down. How do you work out the symmetry of its counterpart with some or all of its indices up? By using known symmetry properties of the downstairs components and using the metric tensor to raise indices. Remember that the metric tensor itself is symmetric under interchange of its two indices, and so is its inverse. Note also that the contraction of the metric tensor with itself is
\[ g^{\mu\nu} g_{\mu\nu} = g^\mu_{\;\;\mu} = \delta^\mu_\mu = 1 + \ldots + 1 = D \,. \qquad (6.40) \]
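The rank-two decomposition and the component counting are easy to verify numerically; a short numpy sketch ($D = 4$ chosen for illustration):

```python
import numpy as np

# Decompose an arbitrary rank-2 tensor as T_mn = S_mn + A_mn, eq. (6.36).
rng = np.random.default_rng(0)
D = 4
T = rng.normal(size=(D, D))
S = (T + T.T) / 2            # symmetric part, eq. (6.37)
A = (T - T.T) / 2            # antisymmetric part, eq. (6.37)

assert np.allclose(S, S.T) and np.allclose(A, -A.T)
assert np.allclose(S + A, T)
# independent components: D(D+1)/2 + D(D-1)/2 = D^2
print(D * (D + 1) // 2, D * (D - 1) // 2, D * D)   # 10 6 16
```

For higher rank the analogous assertion would fail: the totally symmetric and totally antisymmetric pieces no longer add up to the full component count, just as stated above.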
7 R01Oct
7.1 Building a covariant derivative

Because coordinate change matrices generically depend on spacetime position, simple partial derivatives of tensors are typically not themselves tensors. For example, the partial derivative of a covariant vector $W$, $\partial_\mu W_\nu$, changes under coordinate changes as
\[ \frac{\partial}{\partial x^\mu} W_\nu \;\longrightarrow\; \frac{\partial}{\partial x^{\mu'}} W_{\nu'} = \frac{\partial x^\mu}{\partial x^{\mu'}} \frac{\partial}{\partial x^\mu} \left( \frac{\partial x^\nu}{\partial x^{\nu'}} W_\nu \right) = \frac{\partial x^\mu}{\partial x^{\mu'}} \frac{\partial x^\nu}{\partial x^{\nu'}} \frac{\partial W_\nu}{\partial x^\mu} + W_\nu \frac{\partial x^\mu}{\partial x^{\mu'}} \frac{\partial}{\partial x^\mu} \frac{\partial x^\nu}{\partial x^{\nu'}} \,. \qquad (7.1) \]
Although the first term looks good for tensoriality, we see that the second term ruins the fun for generic changes of coordinates.
At any particular point $p$, we can choose a reference frame (denoted here by hats) in which the first derivatives of the metric can be set to zero in that coordinate system, $\partial_{\hat\sigma} g_{\hat\mu\hat\nu}|_p = 0$. The way to see this mathematically is to use (a) Taylor expansions around a particular point and (b) the tensorial transformation property of the two-index metric tensor $g_{\mu\nu}$. However, this cannot be made to work beyond first order in derivatives, because there are not enough components. Physically, this means that we will need extra structure on our spacetime manifold in order to be able to define covariant derivatives that transform like tensors. The structure we need is known as an affine connection. It will enable us to make covariant versions of partial derivatives $\partial_\mu$, denoted $\nabla_\mu$, designed to transform like tensors. For taking covariant derivatives of tensorial indices relevant to bosonic fields, we will use the Levi-Civita connection or Christoffel symbols $\Gamma^\mu_{\;\;\nu\lambda}$. For taking covariant derivatives of spinorial indices relevant to fermionic fields, a researcher would use a different beast known as the spin connection $\omega_\mu^{\;\;ab}$ (see GR2 for details). We will work on manifolds without torsion, and for this case knowing the metric tensor is sufficient to determine both connections.
7.2 How basis vectors change: the role of the affine connection

In curved spacetime, the partial derivative of a basis vector generically does not appear to lie in the tangent space. But as HEL explain in §3, this can be easily fixed up by defining the derivative in the manifold of the coordinate basis vectors by projecting into the tangent space at the point in question. Then we can expand out the expression for the partial derivative $\partial_\lambda$ of a contravariant basis vector $e_\mu$ in terms of the basis $e_\nu$, with coefficients $\Gamma^\nu_{\;\;\mu\lambda}$:
\[ \partial_\lambda e_\mu = \Gamma^\nu_{\;\;\mu\lambda} e_\nu \,. \qquad (7.2) \]
We can figure out the analogous equation for the covariant basis vectors $e^\mu$ by taking the partial derivative of both sides of the equation $e^\mu \cdot e_\nu = \delta^\mu_\nu$, yielding
\[ \partial_\lambda e^\mu = -\Gamma^\mu_{\;\;\nu\lambda} e^\nu \,. \qquad (7.3) \]
We can find the expressions for the coefficients $\Gamma^\nu_{\;\;\mu\lambda}$ by taking the partial derivative of our metric tensor,
\[ \partial_\lambda g_{\mu\nu} = (\partial_\lambda e_\mu) \cdot e_\nu + e_\mu \cdot \partial_\lambda e_\nu = \Gamma^\sigma_{\;\;\mu\lambda} e_\sigma \cdot e_\nu + e_\mu \cdot \Gamma^\sigma_{\;\;\nu\lambda} e_\sigma \,. \qquad (7.4) \]
Using this, we can now form the combination
\[ \partial_\lambda g_{\mu\nu} + \partial_\nu g_{\lambda\mu} - \partial_\mu g_{\nu\lambda} = 2 \Gamma^\sigma_{\;\;\lambda\nu} g_{\mu\sigma} \,, \qquad (7.5) \]
where in the second step we used the fact that the Christoffels are symmetric under interchange of their lower indices. This fact stems from the assumption that our spacetime has zero torsion⁶. Rearranging the above gives the full expression for the coefficients $\Gamma^\mu_{\;\;\nu\lambda}$ in terms of first derivatives of the metric tensor,
\[ \Gamma^\mu_{\;\;\nu\lambda} = \frac{1}{2} g^{\mu\sigma} \left( \partial_\nu g_{\sigma\lambda} + \partial_\lambda g_{\nu\sigma} - \partial_\sigma g_{\nu\lambda} \right) \,. \qquad (7.6) \]
How about an example? Consider the 2D plane $\mathbb{R}^2$ with Cartesian coordinates $\{x, y\}$. The basis vectors $e_x$ and $e_y$ are maximally boring: they do not change with position. However, if we transform to plane polar coordinates $\{\rho, \phi\}$ given by
\[ x = \rho \cos\phi \,, \quad \rho = \sqrt{x^2 + y^2} \,, \qquad y = \rho \sin\phi \,, \quad \phi = \arctan(y/x) \,, \qquad (7.7) \]
then in plane polar coordinates the basis vectors $e_\rho$ and $e_\phi$ definitely do change with position, which is the generic situation in GR. To see this, recall that
\[ d\mathbf{s} = e_x dx + e_y dy = e_\rho d\rho + e_\phi d\phi \,, \qquad (7.8) \]
and use the above coordinate transformations to identify
\[ e_\rho = +\cos\phi \, e_x + \sin\phi \, e_y \,, \qquad e_\phi = \rho(-\sin\phi \, e_x + \cos\phi \, e_y) \,. \qquad (7.9) \]
It follows quickly from the above that
\[ ds^2 = d\rho^2 + \rho^2 d\phi^2 \,. \qquad (7.10) \]
We can inspect how the basis vectors change with position, to obtain
\[ \frac{\partial e_\rho}{\partial \rho} = 0 \,, \qquad \frac{\partial e_\rho}{\partial \phi} = -\sin\phi \, e_x + \cos\phi \, e_y = \frac{1}{\rho} e_\phi \,, \]
\[ \frac{\partial e_\phi}{\partial \rho} = -\sin\phi \, e_x + \cos\phi \, e_y = \frac{1}{\rho} e_\phi \,, \qquad \frac{\partial e_\phi}{\partial \phi} = -\rho \cos\phi \, e_x - \rho \sin\phi \, e_y = -\rho \, e_\rho \,. \qquad (7.11) \]

⁶ Torsion is a rank (1,2) tensor, and it falls outside the scope of this course.
From eq. (7.2), we have that $\partial_\lambda e_\mu = \Gamma^\nu_{\;\;\mu\lambda} e_\nu$, so we can read off the Christoffels,
\[ \Gamma^\rho_{\;\;\rho\rho} = 0 \,, \qquad \Gamma^\phi_{\;\;\rho\rho} = 0 \,, \]
\[ \Gamma^\rho_{\;\;\rho\phi} = 0 \,, \qquad \Gamma^\phi_{\;\;\rho\phi} = +\frac{1}{\rho} \,, \]
\[ \Gamma^\rho_{\;\;\phi\rho} = 0 \,, \qquad \Gamma^\phi_{\;\;\phi\rho} = +\frac{1}{\rho} \,, \]
\[ \Gamma^\rho_{\;\;\phi\phi} = -\rho \,, \qquad \Gamma^\phi_{\;\;\phi\phi} = 0 \,. \qquad (7.12) \]
Alternatively, we could have obtained these expressions for the Christoffels by taking derivatives of the metric tensor as in eq. (7.6). But I think also showing the explicit effect on the basis vectors as in eq. (7.11) helps us understand the physics better. I recommend drawing yourself some pictures to illustrate explicitly how the plane polar coordinate basis vectors change with position according to the above equations.
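The alternative route through eq. (7.6) can be done mechanically. The sympy sketch below implements the formula for the plane polar metric and reproduces the nonzero entries of eq. (7.12):

```python
import sympy as sp

# Christoffels from eq. (7.6) for ds^2 = d rho^2 + rho^2 d phi^2.
rho, phi = sp.symbols('rho phi', positive=True)
x = [rho, phi]
g = sp.diag(1, rho**2)
ginv = g.inv()

def christoffel(mu, nu, lam):
    # Gamma^mu_{nu lam} = (1/2) g^{mu sigma} (d_nu g_{sigma lam}
    #                      + d_lam g_{nu sigma} - d_sigma g_{nu lam})
    return sp.simplify(sum(
        sp.Rational(1, 2) * ginv[mu, s]
        * (sp.diff(g[s, lam], x[nu]) + sp.diff(g[nu, s], x[lam])
           - sp.diff(g[nu, lam], x[s]))
        for s in range(2)))

print(christoffel(0, 1, 1))   # Gamma^rho_{phi phi} = -rho
print(christoffel(1, 0, 1))   # Gamma^phi_{rho phi} = 1/rho
```

All the other components evaluate to zero, matching eq. (7.12).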
Previously, we noticed that taking the partial derivative of a tensor does not give another tensor, generically. The problem was that the coordinate transformation generally depends on spacetime position. Let us delineate the properties that a covariant derivative ∇ should have. It should be linear, ∇(T + S) = ∇T + ∇S , (7.13) where S, T are arbitrary tensors, and it should obey the Leibniz rule,
∇(T ⊗ S) = (∇T ) ⊗ S + T ⊗ (∇S) . (7.14)
It should also commute with contractions, and we will assume metric compatibility,
\[ \nabla_\sigma g_{\mu\nu} = 0 \,, \qquad (7.15) \]
a very reasonable assumption. The covariant derivative should reduce to the partial derivative when operating upon scalars, because those tensors have no legs. The combination of the first two properties above implies that $\nabla$ can be written as the sum of the partial derivative $\partial$ and a linear transformation, which you can think of as a 'correction' to keep the derivative tensorial. The coefficients of this correction term are known as the connection coefficients or the Christoffel symbols $\Gamma^\mu_{\;\;\alpha\beta}$ (pronounced criss-toff-ill). To see how this works, let us consider the derivative of a contravariant vector $v$,
\[ \partial_\nu v = (\partial_\nu v^\mu) e_\mu + v^\mu (\partial_\nu e_\mu) = (\partial_\nu v^\mu) e_\mu + v^\mu \Gamma^\lambda_{\;\;\mu\nu} e_\lambda = (\partial_\nu v^\mu) e_\mu + v^\lambda \Gamma^\mu_{\;\;\lambda\nu} e_\mu = (\partial_\nu v^\mu + \Gamma^\mu_{\;\;\lambda\nu} v^\lambda) e_\mu \,, \qquad (7.16) \]
where in the third step we relabelled dummy indices. The part in brackets is known as the covariant derivative,
\[ \nabla_\nu v^\mu \equiv \partial_\nu v^\mu + \Gamma^\mu_{\;\;\lambda\nu} v^\lambda \,. \qquad (7.17) \]
By exactly similar logic, we can find the covariant derivative of a covariant vector,
\[ \nabla_\nu \omega_\mu = \partial_\nu \omega_\mu - \Gamma^\lambda_{\;\;\mu\nu} \omega_\lambda \,. \qquad (7.18) \]
If we want to take the covariant derivative of a rank $(m, n)$ tensor, then we just act on each of its legs in turn with the connection,
\[ \nabla_\sigma T^{\mu_1 \ldots \mu_m}_{\;\;\nu_1 \ldots \nu_n} = \partial_\sigma T^{\mu_1 \ldots \mu_m}_{\;\;\nu_1 \ldots \nu_n} + \Gamma^{\mu_1}_{\;\;\sigma\lambda} T^{\lambda \mu_2 \ldots \mu_m}_{\;\;\nu_1 \ldots \nu_n} + \Gamma^{\mu_2}_{\;\;\sigma\lambda} T^{\mu_1 \lambda \mu_3 \ldots \mu_m}_{\;\;\nu_1 \ldots \nu_n} + \ldots - \Gamma^{\lambda}_{\;\;\sigma\nu_1} T^{\mu_1 \ldots \mu_m}_{\;\;\lambda \nu_2 \ldots \nu_n} - \Gamma^{\lambda}_{\;\;\sigma\nu_2} T^{\mu_1 \ldots \mu_m}_{\;\;\nu_1 \lambda \nu_3 \ldots \nu_n} - \ldots \qquad (7.19) \]
How about a quick example? We can use the Christoffels to teach us how to take the covariant Laplacian in 2D plane polar coordinates. If we have a scalar field $\Psi(\rho, \phi)$, then
\[ \nabla_\mu \nabla^\mu \Psi = \nabla_\mu \partial^\mu \Psi = \partial_\mu \partial^\mu \Psi + \Gamma^\mu_{\;\;\mu\nu} \partial^\nu \Psi = (\partial_\rho \partial^\rho + \partial_\phi \partial^\phi) \Psi + \Gamma^\rho_{\;\;\rho\rho} \partial^\rho \Psi + \Gamma^\rho_{\;\;\rho\phi} \partial^\phi \Psi + \Gamma^\phi_{\;\;\phi\rho} \partial^\rho \Psi + \Gamma^\phi_{\;\;\phi\phi} \partial^\phi \Psi = \frac{\partial^2 \Psi}{\partial \rho^2} + \frac{1}{\rho} \frac{\partial \Psi}{\partial \rho} + \frac{1}{\rho^2} \frac{\partial^2 \Psi}{\partial \phi^2} \,. \qquad (7.20) \]
This should look familiar from vector calculus class: we just derived it from first principles.
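As a cross-check of eq. (7.20), one can compare it against the divergence form of the scalar Laplacian, $\nabla_\mu \nabla^\mu \Psi = \frac{1}{\sqrt{g}} \partial_\mu (\sqrt{g}\, g^{\mu\nu} \partial_\nu \Psi)$ – a standard identity that is not derived in these notes, used here only as an independent check. A sympy sketch:

```python
import sympy as sp

# Compare eq. (7.20) with the divergence form of the Laplacian in plane
# polar coordinates, where sqrt(det g) = rho and g^{phi phi} = 1/rho^2.
rho, phi = sp.symbols('rho phi', positive=True)
Psi = sp.Function('Psi')(rho, phi)

lap_eq720 = (sp.diff(Psi, rho, 2) + sp.diff(Psi, rho) / rho
             + sp.diff(Psi, phi, 2) / rho**2)

sqrtg = rho
lap_div = (sp.diff(sqrtg * sp.diff(Psi, rho), rho)
           + sp.diff(sqrtg * sp.diff(Psi, phi) / rho**2, phi)) / sqrtg

print(sp.simplify(lap_eq720 - lap_div))   # 0
```

The two expressions agree identically, as they must.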
One of the most important things to remember is that the connection is not a tensor. It has components labelled with Greek indices, but that does not make it a tensor in and of itself. Indeed, the connection is designed specifically to correct the non-tensorial property of the partial derivative in order to create a new tensor from an old one. Its transformation law under a coordinate transformation is
\[ \Gamma^{\nu'}_{\;\;\mu'\lambda'} = \frac{\partial x^\mu}{\partial x^{\mu'}} \frac{\partial x^\lambda}{\partial x^{\lambda'}} \frac{\partial x^{\nu'}}{\partial x^\nu} \, \Gamma^\nu_{\;\;\mu\lambda} - \frac{\partial x^\mu}{\partial x^{\mu'}} \frac{\partial x^\lambda}{\partial x^{\lambda'}} \frac{\partial^2 x^{\nu'}}{\partial x^\mu \partial x^\lambda} \,. \qquad (7.21) \]
From this, you can see that the difference between two connections is a tensor, because the second term (which is independent of the $\Gamma$s) drops out of their transformation law.
Our connection is metric compatible, meaning that the covariant derivative constructed from it obeys
\[ \nabla_\sigma g_{\mu\nu} = 0 \,, \qquad (7.22) \]
for all values of $\sigma, \mu, \nu$. There are two other useful equations that follow from this one,
\[ \nabla_\sigma g^{\mu\nu} = 0 \,, \qquad (7.23) \]
\[ \nabla_\lambda \epsilon_{\mu_0 \mu_1 \ldots \mu_d} = 0 \,, \qquad (7.24) \]
where the completely antisymmetric tensor density $\epsilon_{\mu_0 \mu_1 \ldots \mu_d}$ is used in integrating covariantly over spacetime. We will not have occasion to use it here, but it will play an important role in deriving Einstein's equations from an action principle in the GR2 course. The fact that our metric-compatible covariant derivative commutes with raising and lowering of indices is very fortunate – if there were torsion, we would have to be scrupulously careful about our index placements.
In discussing covariant derivatives of tensors, it is worth noting here that some people use a different convention than ours. They abbreviate by defining commas after indices to represent partial derivatives, while semicolons represent covariant derivatives. We will stick with keeping $\partial$ and $\nabla$ explicit, because in pages full of long GR equations it is all too easy to lose track of punctuation marks.
7.3 The covariant derivative and parallel transport

Introducing a covariant derivative (as compared to a plain derivative) was a really great idea for doing physics. It allows us to write tensor equations wherever we go. All we need to do is to be sure to write $\nabla$s rather than $\partial$s. But one question worth asking is this: what rate of change does $\nabla$ actually measure? A way to answer that question and get a better handle on $\nabla$ is to ask when $\nabla$ of some tensor is zero. For this we actually have to specify the path along which we hope to compare tensors – because comparing tensors at two different points is, a priori, meaningless in GR. After all, the spacetime metric varies from point to point.
Consider a parametrized curve $x^\mu(\lambda)$, and a vector field $v(\lambda) = v^\mu(\lambda) e_\mu(\lambda)$. The derivative of the vector field $v$ with respect to the curve parameter $\lambda$ is
\[ \frac{dv}{d\lambda} = \frac{dv^\mu}{d\lambda} e_\mu + v^\mu \frac{\partial e_\mu}{\partial x^\sigma} \frac{dx^\sigma}{d\lambda} = \left( \frac{dv^\mu}{d\lambda} + \Gamma^\mu_{\;\;\nu\sigma} v^\nu \frac{dx^\sigma}{d\lambda} \right) e_\mu \equiv \frac{Dv^\mu}{D\lambda} e_\mu \,. \qquad (7.25) \]
The quantity
\[ \frac{D}{D\lambda} \equiv \frac{dx^\sigma}{d\lambda} \nabla_\sigma \qquad (7.26) \]
is known as the directional covariant derivative. This animal is only defined along the path $x^\mu(\lambda)$, and when acting on a tensor it produces another tensor. We say that a tensor $T$ is parallel transported along the path if
\[ \frac{D}{D\lambda} T^{\mu_1 \ldots \mu_k}_{\;\;\nu_1 \ldots \nu_\ell} = \frac{dx^\sigma}{d\lambda} \nabla_\sigma T^{\mu_1 \ldots \mu_k}_{\;\;\nu_1 \ldots \nu_\ell} = 0 \,. \qquad (7.27) \]
This is known as the equation of parallel transport, and it is a proper tensor equation. Now, since we have a metric compatible connection, $\nabla_\sigma g_{\mu\nu} = 0$, parallel transport preserves the inner product of two tensors. For example, for two vectors $V^\mu$ and $W^\mu$,
\[ \frac{D}{D\lambda} (g_{\mu\nu} V^\mu W^\nu) = \left( \frac{D g_{\mu\nu}}{D\lambda} \right) V^\mu W^\nu + g_{\mu\nu} \frac{D V^\mu}{D\lambda} W^\nu + g_{\mu\nu} V^\mu \frac{D W^\nu}{D\lambda} \qquad (7.28) \]
\[ = 0 + 0 + 0 = 0 \,, \qquad (7.29) \]
if the vectors are both parallel transported. You can visualize what parallel transport does by imagining that it keeps the same angle between the vector and the directional derivative along the path $x^\mu(\lambda)$.
36 To see what parallel transporting can imply, consider the two-sphere. Imagine that we start at the North Pole with a vector at an angle. We keep the angle of our vector constant as we move along a line of longitude, (say) the Greenwich meridian, down to the Equator. Then imagine that we turn East and continue parallel transporting our vector some way around the equator. Then we turn North and parallel transport our vector up a second line of longitude, back to the North Pole. If you have visualized this correctly in your mind, you will see that our vector, regardless of the direction it was initially pointing, has undergone a finite rotation. This is because the sphere is (positively) curved.
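The finite rotation can be exhibited numerically. The sketch below parallel transports a vector around a circle of constant colatitude $\theta_0$ on the unit sphere (a slightly different loop from the pole-equator-pole triangle above, chosen because it stays inside one coordinate patch); the transported vector comes back rotated by $2\pi \cos\theta_0$, i.e. with a deficit $2\pi(1 - \cos\theta_0)$ relative to a full turn:

```python
import math

# Parallel transport on the unit 2-sphere, ds^2 = dtheta^2 + sin^2(theta) dphi^2,
# around the loop theta = theta0, phi = lambda in [0, 2 pi]. With
# Gamma^theta_{phi phi} = -sin cos and Gamma^phi_{theta phi} = cot, the
# transport equation dV^mu/dl + Gamma^mu_{nu sigma} V^nu dx^sigma/dl = 0 reads
#   dV^theta/dl = sin(theta0) cos(theta0) V^phi ,
#   dV^phi/dl   = -cot(theta0) V^theta .
theta0 = 1.0

def rhs(v):
    vt, vp = v
    return (math.sin(theta0) * math.cos(theta0) * vp,
            -(math.cos(theta0) / math.sin(theta0)) * vt)

v = (1.0, 0.0)                 # start pointing along e_theta
n = 20000
h = 2 * math.pi / n
for _ in range(n):             # classic RK4 integration around the loop
    k1 = rhs(v)
    k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
    k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
    k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
    v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
         v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

# orthonormal components are (V^theta, sin(theta0) V^phi)
alpha = 2 * math.pi * math.cos(theta0)    # expected rotation angle
print(v[0], math.cos(alpha))              # these agree
print(math.sin(theta0) * v[1], -math.sin(alpha))
```

The deficit $2\pi(1 - \cos\theta_0)$ is exactly the solid angle enclosed by the loop, which is the hallmark of the sphere's (positive) curvature.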
8 M05Oct
8.1 The geodesic equations for test particle motion in curved spacetime

A geodesic is a path $x^\mu(\lambda)$ that parallel transports its own tangent vector. It follows that the equation satisfied by the geodesic is
\[ \frac{D}{D\lambda} \frac{dx^\mu}{d\lambda} = \frac{d^2 x^\mu}{d\lambda^2} + \Gamma^\mu_{\;\;\nu\sigma} \frac{dx^\nu}{d\lambda} \frac{dx^\sigma}{d\lambda} = 0 \,. \qquad (8.1) \]
We can also think about parallel transport in the following way. When we take an ordinary partial derivative, we do it by taking
\[ \lim_{\Delta x^\mu \to 0} \frac{f(x^\mu + \Delta x^\mu) - f(x^\mu)}{\Delta x^\mu} = \frac{\partial f}{\partial x^\mu} \,. \qquad (8.2) \]
In curved spacetime, the result of this is not a tensor. What we do instead is take the covariant derivative, as follows.
1. We take xµ(λ + dλ) as our “x plus an infinitesimal change” and find T there. 2. We parallel transport T back to the original point at xµ(λ), along the path xµ(λ). 3. We compare the parallel-transported-back T to the original T at xµ(λ), and we ‘divide by’ dλ.
The result is DT/Dλ, the covariant rate of change of the tensor with respect to λ at the spacetime point xµ(λ).
Let us now see another way that the geodesic equation can be derived, using a variational approach. Consider a massive point particle in proper time gauge. The relativistic einbein action is, up to a constant that is physically irrelevant at the classical level,
\[ S = -m \int d\tau \, g_{\mu\nu}(x^\lambda) \, \frac{dx^\mu(\tau)}{d\tau} \frac{dx^\nu(\tau)}{d\tau} \,. \qquad (8.3) \]
What happens when we vary
\[ x^\mu \to x^\mu + \delta x^\mu \,? \qquad (8.4) \]
Under such a variation,
\[ g_{\mu\nu} \to g_{\mu\nu} + (\partial_\sigma g_{\mu\nu}) \delta x^\sigma \,. \qquad (8.5) \]
Varying the action, we have
\[ -\frac{1}{m} \delta S = \int d\tau \, \delta \left( g_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} \right) \qquad (8.6) \]
\[ = \int d\tau \left[ (\partial_\sigma g_{\mu\nu}) \delta x^\sigma \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} + 2 g_{\mu\nu} \frac{d \delta x^\mu}{d\tau} \frac{dx^\nu}{d\tau} \right] \qquad (8.7) \]
\[ = \int d\tau \left[ (\partial_\sigma g_{\mu\nu}) \delta x^\sigma \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} - 2 (\partial_\sigma g_{\mu\nu}) \frac{dx^\sigma}{d\tau} \frac{dx^\nu}{d\tau} \delta x^\mu - 2 g_{\mu\nu} \frac{d^2 x^\nu}{d\tau^2} \delta x^\mu \right] \,, \qquad (8.8) \]
where in the second line we used the symmetry of $g_{\mu\nu}$ under $\mu \leftrightarrow \nu$, and in the last step we integrated by parts⁷. We also used the fact that
\[ \delta \frac{dx^\mu}{d\tau} = \frac{d \delta x^\mu}{d\tau} \,. \qquad (8.9) \]
Collecting all the terms, we have
\[ -\frac{1}{m} \delta S = -2 \int d\tau \left[ g_{\mu\sigma} \frac{d^2 x^\sigma}{d\tau^2} + g_{\mu\sigma} \Gamma^\sigma_{\;\;\nu\rho} \frac{dx^\nu}{d\tau} \frac{dx^\rho}{d\tau} \right] \delta x^\mu \,. \qquad (8.10) \]
Demanding that this be zero for arbitrary variations $\delta x^\mu$, we obtain the geodesic equation,
\[ \frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu_{\;\;\nu\rho} \frac{dx^\nu}{d\tau} \frac{dx^\rho}{d\tau} = 0 \,. \qquad (8.11) \]
An affine parameter $\lambda$ is defined to be $\lambda = a\tau + b$ for constants $a, b$. In other words, $\lambda$ is an affine parameter if it is linearly related to $\tau$ (for a massive particle). For a massless particle, we can still define an affine parameter. In fact, our geodesic equation requires just such an affine parametrization, regardless of the particle mass. For either massive or massless particles, the geodesic equation can be written in very compact form in terms of the momentum vector,
\[ p^\nu \nabla_\nu p^\mu = 0 \,. \qquad (8.12) \]
For point particles, we relate the momentum $p^\mu$ to the four-velocity $u^\mu$ via
\[ p^\mu = m u^\mu = m \frac{dx^\mu}{d\tau} \,, \quad m^2 > 0 \,; \qquad p^\mu = u^\mu \,, \quad m^2 = 0 \,. \qquad (8.13) \]
The second formula follows Carroll's convention for defining the "four-velocity" for massless particles. Since $p^\mu p_\mu = 0$ for them, we have a free choice for the proportionality factor.
7We assume that the manifold has sufficiently trivial topology for the integration by parts to work.
There is a central physics point to understand about this extremization. Is it a minimization or a maximization? In fact, the geodesic maximizes proper time. Why? Well, if we were to lower the proper time interval $(\Delta\tau)^2$ along a changed path, we would get closer to $(\Delta\tau)^2 = 0$, which is a null path. To go lower, to $(\Delta\tau)^2 < 0$, we would have to use an illegal spacelike path. So minimizing $(\Delta\tau)^2$ does not make sense, and in fact the proper time is maximized via the variational principle. The timelike geodesic can be a maximum precisely because paths with lower proper time always lie infinitesimally nearby. Carroll has a morally similar argument: he shows that any timelike path can be approximated by a (jaggedy looking) piecewise continuous bunch of null paths, all of the pieces of which have zero invariant interval. Since the geodesic is infinitesimally nearby to null paths with zero proper time, it must maximize proper time.
The physical consequence of this mathematical fact that geodesics maximize proper time is that accelerated observers – those who are not in freefall – measure less proper time than those who are in freefall. This is why the space twin in the Twin Paradox always comes back younger, not older, than the homebody twin. The more you accelerate around with your rockets, the younger you are compared to a homebody who stays on a geodesic.
If all geodesics on a spacetime manifold can be extended as far as they please, then the manifold is said to be geodesically complete. But if some geodesic(s) bang into a singularity, or end prematurely, then the manifold is geodesically incomplete. For spacetimes with matter, this is actually the generic case. Roger Penrose just won part of the 2020 Nobel Prize in Physics for (co-)explaining this.
8.2 Example computation for affine connection and geodesic equations

Let us now work a relatively simple example of calculating Christoffel components for a spacetime with dependence on only one coordinate, $x^0 = t$. We will take the spatially flat⁸ Friedmann-Robertson-Walker ansatz in $D = d + 1$ spacetime dimensions,
\[ ds^2 = dt^2 - a^2(t) |d\vec{x}|^2 \,, \qquad (8.14) \]
where $a(t)$ is the scale factor. Since
\[ ds^2 = g_{\mu\nu} dx^\mu dx^\nu \,, \qquad (8.15) \]
we have
\[ g_{00} = +1 \,, \qquad g_{ij} = -[a(t)]^2 \delta_{ij} \,. \qquad (8.16) \]
Because the metric is diagonal, we can invert it by eye, to obtain
\[ g^{00} = +1 \,, \qquad g^{ij} = -[a(t)]^{-2} \delta^{ij} \,. \qquad (8.17) \]
⁸ For the more general case with nontrivial spatial metric, see Carroll §8.3.
Finding the Christoffels is relatively straightforward, as many of them are zero. Notice that the only coordinate dependence in the metric is on the time coordinate.
First, let us try for $\Gamma^0_{\;\;00}$,
\[ \Gamma^0_{\;\;00} = \frac{1}{2} g^{0\sigma} (\partial_0 g_{0\sigma} + \partial_0 g_{0\sigma} - \partial_\sigma g_{00}) = \frac{1}{2} g^{00} \partial_0 g_{00} = 0 \,, \qquad (8.18) \]
because the metric is diagonal and because $g_{00}$ is a constant. Next up is
\[ \Gamma^0_{\;\;0i} = \frac{1}{2} g^{0\sigma} (\partial_0 g_{i\sigma} + \partial_i g_{0\sigma} - \partial_\sigma g_{0i}) = \frac{1}{2} g^{00} (\partial_0 g_{i0} + \partial_i g_{00} - \partial_0 g_{0i}) = \frac{1}{2} g^{00} \partial_i g_{00} = 0 \,, \qquad (8.19) \]
because the metric is diagonal and because $g_{00}$ is a constant.
A more interesting case is $\Gamma^0_{\;\;ij}$, which is nonzero:
\[ \Gamma^0_{\;\;ij} = \frac{1}{2} g^{0\sigma} (\partial_i g_{j\sigma} + \partial_j g_{i\sigma} - \partial_\sigma g_{ij}) = \frac{1}{2} g^{00} (\partial_i g_{j0} + \partial_j g_{i0} - \partial_0 g_{ij}) = -\frac{1}{2} g^{00} \partial_0 g_{ij} = a \dot{a} \, \delta_{ij} \,, \qquad (8.20) \]
where $\dot{} = d/dt$. Along the way, we again used the fact that the metric is diagonal and $g_{00}$ is a constant.
Now consider $\Gamma^i_{\;\;00}$:
\[ \Gamma^i_{\;\;00} = \frac{1}{2} g^{i\sigma} (\partial_0 g_{0\sigma} + \partial_0 g_{0\sigma} - \partial_\sigma g_{00}) = 0 \,, \qquad (8.21) \]
because the metric is diagonal and because $g_{00}$ is a constant.
Next, let us look at the only other nonzero Christoffel symbol, $\Gamma^i_{\;\;0j}$. We have
\[ \Gamma^i_{\;\;0j} = \frac{1}{2} g^{i\sigma} (\partial_0 g_{j\sigma} + \partial_j g_{0\sigma} - \partial_\sigma g_{0j}) = \frac{1}{2} g^{ik} \partial_0 g_{jk} = \frac{1}{2} \{ [a(t)]^{-2} \delta^{ik} \} \, \partial_0 \{ [a(t)]^2 \delta_{jk} \} = \frac{\dot{a}}{a} \delta^i_j \,. \qquad (8.22) \]
Finally, what about the all-spatial Christoffels $\Gamma^i_{\;\;jk}$? We have
\[ \Gamma^i_{\;\;jk} = \frac{1}{2} g^{i\ell} (\partial_j g_{k\ell} + \partial_k g_{j\ell} - \partial_\ell g_{jk}) = 0 \,, \qquad (8.23) \]
because none of the spatial components of the metric depends on spatial position.
In summary, we have:-
\[ \Gamma^0_{\;\;ij} = a \dot{a} \, \delta_{ij} \,, \qquad (8.24) \]
\[ \Gamma^i_{\;\;j0} = \frac{\dot{a}}{a} \delta^i_j \,, \qquad (8.25) \]
with all other components zero. Notice how it is the "velocity" of the scale factor, $\dot{a}(t)$, that appears here. The quantity
\[ \frac{\dot{a}}{a} = H(t) \qquad (8.26) \]
is known as the Hubble parameter; it is constant if the scale factor grows exponentially. (Whether or not the scale factor can behave in this fashion is determined by the energy-momentum of matter in the spacetime, as we will discover later on in the course.)
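These components can be double-checked mechanically with eq. (7.6); a sympy sketch in 1+3 dimensions:

```python
import sympy as sp

# Christoffels of ds^2 = dt^2 - a(t)^2 |dx|^2 from eq. (7.6).
t, x, y, z = sp.symbols('t x y z')
X = [t, x, y, z]
a = sp.Function('a')(t)
g = sp.diag(1, -a**2, -a**2, -a**2)
ginv = g.inv()

def christoffel(mu, nu, lam):
    return sp.simplify(sum(
        sp.Rational(1, 2) * ginv[mu, s]
        * (sp.diff(g[s, lam], X[nu]) + sp.diff(g[nu, s], X[lam])
           - sp.diff(g[nu, lam], X[s]))
        for s in range(4)))

adot = sp.diff(a, t)
print(christoffel(0, 1, 1))   # Gamma^0_{11} = a adot, eq. (8.24)
print(christoffel(1, 1, 0))   # Gamma^1_{10} = adot/a, eq. (8.25)
print(christoffel(0, 0, 0), christoffel(1, 0, 0), christoffel(1, 2, 3))
```

The last print confirms that representative members of the other families of components vanish, as found by hand above.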
Now let us look at the geodesic equations in this simple spacetime, doing a time-space split like for the Christoffels above. In general, we have
\[ \frac{d^2 x^\mu}{d\lambda^2} + \Gamma^\mu_{\;\;\nu\sigma} \frac{dx^\nu}{d\lambda} \frac{dx^\sigma}{d\lambda} = 0 \,. \qquad (8.27) \]
The 0th component of this equation reads
\[ 0 = \frac{d^2 x^0}{d\lambda^2} + \Gamma^0_{\;\;\nu\sigma} \frac{dx^\nu}{d\lambda} \frac{dx^\sigma}{d\lambda} = \frac{d^2 x^0}{d\lambda^2} + \Gamma^0_{\;\;ij} \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda} = \frac{d^2 x^0}{d\lambda^2} + a \dot{a} \, \delta_{ij} \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda} \,, \qquad (8.28) \]
because all the other terms contributing to the sums over $\nu$ and $\sigma$ involve Christoffel components that are zero. The $i$th component reads
\[ 0 = \frac{d^2 x^i}{d\lambda^2} + \Gamma^i_{\;\;\nu\sigma} \frac{dx^\nu}{d\lambda} \frac{dx^\sigma}{d\lambda} = \frac{d^2 x^i}{d\lambda^2} + \Gamma^i_{\;\;0j} \frac{dx^0}{d\lambda} \frac{dx^j}{d\lambda} + \Gamma^i_{\;\;j0} \frac{dx^j}{d\lambda} \frac{dx^0}{d\lambda} = \frac{d^2 x^i}{d\lambda^2} + \frac{2\dot{a}}{a} \frac{dx^0}{d\lambda} \frac{dx^i}{d\lambda} \,. \qquad (8.29) \]
The first thing to notice about these geodesic equations we have derived is that they are coupled and nonlinear. The equation for $dx^0/d\lambda$ depends on what $dx^i/d\lambda$ are doing, and vice versa. This is why solving for motions of massless particles (photons) or massive particles (like electrons) in the background of a general curved spacetime is generically much more complicated than doing Newton's Laws for non-relativistic physics.
The second thing to notice about our super-simple spacetime is that the spatial geodesic equations actually have a first integral (!). To see this, let us try taking the $\lambda$ derivative of
\[ p_i = a^2(t) \, \delta_{ij} \frac{dx^j}{d\lambda} \,. \qquad (8.30) \]
We have, by the Leibniz rule and the chain rule,
\[ \frac{d}{d\lambda} p_i = \delta_{ij} \frac{d}{d\lambda} \left( a^2(x^0) \frac{dx^j}{d\lambda} \right) = \delta_{ij} \left( 2 a \dot{a} \frac{dx^0}{d\lambda} \frac{dx^j}{d\lambda} + a^2 \frac{d^2 x^j}{d\lambda^2} \right) = \delta_{ij} \, a^2 \left( \frac{d^2 x^j}{d\lambda^2} + \frac{2\dot{a}}{a} \frac{dx^0}{d\lambda} \frac{dx^j}{d\lambda} \right) = 0 \,. \qquad (8.31) \]
Therefore, pi is a conserved quantity along the geodesic. As we will see a bit later in the course, this conservation law arises because our spacetime metric has a symmetry: none of the components of the metric tensor depends on spatial coordinates. This is your first example of how Noether’s Theorem works in General Relativity.
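The conservation law can also be watched in a direct numerical integration. The sketch below integrates the geodesic equations in 1+1 dimensions for the illustrative (and entirely hypothetical, for our purposes) choice $a(t) = t^{2/3}$, and checks that $a^2 \, dx/d\lambda$ stays constant:

```python
# Geodesics of ds^2 = dt^2 - a(t)^2 dx^2 with a(t) = t^(2/3), an
# illustrative choice; the conservation argument does not depend on it.
# Equations: t'' + a (da/dt) (x')^2 = 0 , x'' + 2 (da/dt / a) t' x' = 0 ,
# where primes on t, x mean d/dlambda.
def a(t):    return t ** (2.0 / 3.0)
def adot(t): return (2.0 / 3.0) * t ** (-1.0 / 3.0)

def rhs(s):
    t, x, td, xd = s
    return (td, xd,
            -a(t) * adot(t) * xd * xd,
            -2.0 * (adot(t) / a(t)) * td * xd)

s = (1.0, 0.0, 1.1, 0.5)     # t, x, dt/dlambda, dx/dlambda at lambda = 0
p0 = a(s[0]) ** 2 * s[3]     # p_x = a^2 dx/dlambda, eq. (8.30)
h, n = 1e-4, 20000
for _ in range(n):           # RK4 integration out to lambda = 2
    k1 = rhs(s)
    k2 = rhs(tuple(s[i] + 0.5 * h * k1[i] for i in range(4)))
    k3 = rhs(tuple(s[i] + 0.5 * h * k2[i] for i in range(4)))
    k4 = rhs(tuple(s[i] + h * k3[i] for i in range(4)))
    s = tuple(s[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
              for i in range(4))

p1 = a(s[0]) ** 2 * s[3]
print(abs(p1 - p0) < 1e-7)   # True: p_x is conserved along the geodesic
```

Note that the $t$ equation is fully coupled to the $x$ equation, yet the combination $a^2 \, dx/d\lambda$ is untouched by the integration, exactly as eq. (8.31) promises.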
9 R08Oct
9.1 Spacetime curvature

Einstein's General Theory of Relativity upgraded the way we think about gravitational physics. Instead of imposing Newton's three laws of motion and his force law for universal gravitation, we take the fabric of spacetime itself as the starting point. We worked quite hard already to define tensors on arbitrary spacetimes, by focusing intensely on their transformation properties under changes of reference frame, i.e., changes of coordinates. We also figured out in our last lecture how to define a covariant derivative, with the help of the Levi-Civita connection. We went to all that trouble of wrangling the Christoffel symbols because this enabled us to do two exciting things: (a) to define a derivative $\nabla_\mu$ that is a tensor, even in curved spacetime, and (b) to derive the geodesic equation, which is the equation obeyed by any relativistic particle undergoing freefall in the spacetime in question. Along the way, we learned that geodesics maximize the proper time.
As we alluded to earlier, the Riemann curvature tensor is the mathematical quantity that Albert Einstein discovered was the key to gravitational physics expressed in the language of curved spacetime. He realized that the Riemann tensor, which contains at most two derivatives of the metric tensor, could even be used to build an action principle for general relativity. We will derive the Einstein action and the Einstein equations of motion for the gravitational field in the GR2 course. For now, all we need to keep in mind is that the Riemann tensor encodes a wide variety of gravitational phenomena in its tensor components, including the physics of tidal forces and the motion of particles in spacetime. In particular, we will soon show how in the Newtonian limit of weak gravity and slow speeds we recover familiar expressions from Newtonian physics – without ever having to use the concept of a force! First, we need to develop a bit more formalism.
9.2 The Riemann tensor

Consider an infinitesimal parallelogram, with vectors $A^\mu$ and $B^\nu$ forming the sides. In hand-waving terms, the Riemann curvature is what tells us how much a vector $V^\mu$ gets rotated under parallel transport around the parallelogram. The infinitesimal change in $V$, $\delta V$, is a (1,0) tensor, and so are $A$, $B$, and $V$. Roughly speaking, we expect $\delta V$ to be proportional to $V$ and to the size of the parallelogram. To connect $\delta V$ to $A$, $B$, $V$ we need a (1,3) tensor with which to contract indices naturally, and the role of this is played by the Riemann curvature. The resulting equation from our handwaving is therefore
\[ \delta V^\mu \sim R^\mu_{\;\;\nu\alpha\beta} V^\nu A^\alpha B^\beta \,. \qquad (9.1) \]
While this sketch of Riemann's origin gives us the gist, we now need to be more precise and make a proper definition.
Recall that earlier we found parallel transport to be the right way of thinking about how to compare vectors at different places in spacetime. Combined with our little parallelogram hand-wave just now, this can be used to motivate a mathematical definition of the Riemann tensor as arising from taking commutators of covariant derivatives. On a (1,0) vector $V$, Riemann is defined via⁹
\[ [\nabla_\mu, \nabla_\nu] V^\rho = +R^\rho_{\;\;\lambda\mu\nu} V^\lambda \,, \qquad (9.2) \]
for a torsion-free connection. This formula teaches us how to find the components of the Riemann tensor in terms of Christoffel connection coefficients. Let us write out the pieces individually to see how it works out. First, note that for any (1,1) tensor $T^\rho_{\;\;\nu}$,
∇_µ T^ρ_ν = ∂_µ T^ρ_ν + Γ^ρ_{µσ} T^σ_ν − Γ^λ_{µν} T^ρ_λ . (9.3)
So with T^ρ_ν = ∇_ν V^ρ, we have
∇_µ(∇_ν V^ρ) = ∂_µ(∇_ν V^ρ) + Γ^ρ_{µλ}(∇_ν V^λ) − Γ^λ_{µν}(∇_λ V^ρ) (9.4)
= ∂_µ(∂_ν V^ρ + Γ^ρ_{νλ} V^λ) + Γ^ρ_{µλ}(∂_ν V^λ + Γ^λ_{νσ} V^σ) − Γ^λ_{µν}(∂_λ V^ρ + Γ^ρ_{λσ} V^σ) (9.5)
= ∂_µ∂_ν V^ρ + Γ^ρ_{νλ} ∂_µ V^λ + Γ^ρ_{µλ} ∂_ν V^λ − Γ^λ_{µν} ∂_λ V^ρ
+ (∂_µ Γ^ρ_{νσ}) V^σ + Γ^ρ_{µλ} Γ^λ_{νσ} V^σ − Γ^λ_{µν} Γ^ρ_{λσ} V^σ . (9.6)

Then
∇_µ(∇_ν V^ρ) − (µ ↔ ν) = [∂_µ Γ^ρ_{νσ} + Γ^ρ_{µλ} Γ^λ_{νσ}] V^σ − Γ^λ_{µν} Γ^ρ_{λσ} V^σ
+ Γ^ρ_{νλ} ∂_µ V^λ + Γ^ρ_{µλ} ∂_ν V^λ − Γ^λ_{µν} ∂_λ V^ρ − (µ ↔ ν) (9.7)
= [∂_µ Γ^ρ_{νσ} + Γ^ρ_{µλ} Γ^λ_{νσ} − (µ ↔ ν)] V^σ . (9.8)

Now we can put the pieces together to see the general formula for taking the commutator of covariant derivatives acting on a vector. Using
[∇_µ, ∇_ν] V^ρ = +R^ρ_{σµν} V^σ (9.9)

gives us the formula for the Riemann tensor components,
R^ρ_{σµν} = ∂_µ Γ^ρ_{σν} − ∂_ν Γ^ρ_{σµ} + Γ^λ_{σν} Γ^ρ_{λµ} − Γ^λ_{σµ} Γ^ρ_{λν} . (9.10)

For a covariant derivative acting on a (0,1) tensor, a covariant vector, one finds the same Riemann tensor coefficients and
[∇_µ, ∇_ν] ω_ρ = −R^λ_{ρµν} ω_λ . (9.11)

If you slog through the details, you can compute the commutator of covariant derivatives on a rank (k, ℓ) tensor V as well. This is not much worse than the calculation we have just done, and we suppress the details here. The result is
[∇_ρ, ∇_σ] V^{µ_1...µ_k}_{ν_1...ν_ℓ} = R^{µ_1}_{λρσ} V^{λµ_2...µ_k}_{ν_1...ν_ℓ} + R^{µ_2}_{λρσ} V^{µ_1λµ_3...µ_k}_{ν_1...ν_ℓ} + ...
− R^λ_{ν_1ρσ} V^{µ_1...µ_k}_{λν_2...ν_ℓ} − R^λ_{ν_2ρσ} V^{µ_1...µ_k}_{ν_1λν_3...ν_ℓ} − ... . (9.12)

9 We are using the sign conventions of HEL.
Riemann arises naturally as a rank (1,3) tensor. By doing a partial contraction of two of its indices, we can define the Ricci tensor R_µν, which naturally arises as a rank (0,2) tensor,

R_µν = R^α_{µνα} . (9.13)

Notice that we are contracting the first and fourth indices here to make the Ricci tensor. This is a choice of convention, and we have chosen to use the same convention as the HEL textbook. By contracting the Ricci tensor with the metric, we can form the Ricci scalar R, which has rank (0,0),

R = g^{µν} R_µν . (9.14)

Other kinds of contractions involving Riemann are also possible, such as "Riemann squared" and "Ricci squared". For our purposes in this course, we only need to know about the Ricci tensor and the Ricci scalar – because both of them will appear on the left hand side of Einstein's equations. Note that if you change the signature of our Lorentzian spacetime from mostly minus to mostly plus, the Christoffels Γ^λ_{µν} would stay the same, the Riemann tensor R^ρ_{λµν} would also stay the same, and so would the Ricci tensor R_µν, but the Ricci scalar R would develop a relative minus sign.
9.3 Example computations for Riemann

Suppose that we study 2D Euclidean space in plane polar coordinates {x^1, x^2} = {ρ, ϕ},
ds² = dρ² + ρ² dϕ² . (9.15)
We previously found nonzero Christoffels for this spacetime in these coordinates when we introduced basis vectors,

Γ^2_{12} = 1/ρ , Γ^1_{22} = −ρ . (9.16)

From this, we can find the Riemann tensor using our formula from above,
R^ρ_{σµν} = ∂_µ Γ^ρ_{νσ} − ∂_ν Γ^ρ_{µσ} + Γ^ρ_{µλ} Γ^λ_{νσ} − Γ^ρ_{νλ} Γ^λ_{µσ} . (9.17)

Substituting in gives
R^1_{212} = ∂_1 Γ^1_{22} − ∂_2 Γ^1_{12} + Γ^1_{1λ} Γ^λ_{22} − Γ^1_{2λ} Γ^λ_{12}
= ∂_1 Γ^1_{22} − 0 + 0 − Γ^1_{22} Γ^2_{12}
= ∂_ρ(−ρ) − (−ρ)(1/ρ)
= −1 − (−1) = 0 . (9.18)
All the other components of Riemann that might have been nonzero also turn out to vanish. This result reflects the fact that this 2D space is flat.
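This kind of check is quick to automate with computer algebra, in the spirit of the HW1 exercise. Below is a minimal sympy sketch (the helper names christoffel and riemann are my own); it implements the standard Christoffel formula in terms of metric derivatives and eq. (9.10), and confirms that every Riemann component of the plane-polar metric vanishes.

```python
import sympy as sp

def christoffel(g, coords):
    """Gamma^mu_{nu lam} = (1/2) g^{mu sig} (d_nu g_{sig lam} + d_lam g_{sig nu} - d_sig g_{nu lam})."""
    n = len(coords)
    ginv = g.inv()
    return [[[sp.simplify(sp.Rational(1, 2) * sum(
                ginv[mu, s] * (sp.diff(g[s, lam], coords[nu])
                               + sp.diff(g[s, nu], coords[lam])
                               - sp.diff(g[nu, lam], coords[s]))
                for s in range(n)))
              for lam in range(n)]
             for nu in range(n)]
            for mu in range(n)]

def riemann(Gam, coords, rho, sig, mu, nu):
    """R^rho_{sig mu nu} from eq. (9.10)."""
    n = len(coords)
    expr = sp.diff(Gam[rho][sig][nu], coords[mu]) - sp.diff(Gam[rho][sig][mu], coords[nu])
    expr += sum(Gam[lam][sig][nu] * Gam[rho][lam][mu]
                - Gam[lam][sig][mu] * Gam[rho][lam][nu] for lam in range(n))
    return sp.simplify(expr)

# Plane polar coordinates, eq. (9.15): ds^2 = drho^2 + rho^2 dphi^2
r, ph = sp.symbols('rho phi', positive=True)
coords = [r, ph]
Gam = christoffel(sp.diag(1, r**2), coords)

# Gam reproduces eq. (9.16), and every Riemann component vanishes: the plane is flat
flat = all(riemann(Gam, coords, a, b, c, d) == 0
           for a in range(2) for b in range(2) for c in range(2) for d in range(2))
```

The same two helpers can be pointed at any metric, which is how the curved examples that follow can be checked as well.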
Now suppose we try a spacetime which we already suspect is curved: the two-sphere with coordinates {x^1, x^2} = {θ, φ},

ds² = dθ² + sin²θ dφ² . (9.19)

Computing the Christoffels is straightforward, using either the formula in terms of derivatives of the metric tensor or the formula for how basis vectors change. The only nonzero components turn out to be

Γ^1_{22} = − sin θ cos θ , Γ^2_{12} = cos θ / sin θ . (9.20)

As we will discover next week, there is only one independent component of Riemann in 2D, and it is R^1_{212}. To compute it, we substitute in again,
R^1_{212} = ∂_1 Γ^1_{22} − Γ^2_{21} Γ^1_{22}
= ∂_θ(− sin θ cos θ) − (cos θ / sin θ)(− sin θ cos θ)
= −(cos²θ − sin²θ) + cos²θ
= + sin²θ . (9.21)

If we lift the second index using g^{µν}, we obtain
R^{12}_{12} = +1 . (9.22)

The answer is positive because the sphere is positively curved. All of the other nonzero components of Riemann can be expressed in terms of this one, for instance R^{21}_{21} = +R^{12}_{12}. Finally, let us work a slightly more nontrivial example of calculating Riemann components for a spacetime with dependence on only one coordinate. As with our geodesic equation example at the end of the previous section, we take the spatially flat FRW ansatz in D = d + 1 spacetime dimensions,

ds² = dt² − a²(t) |d~x|² , (9.23)

where a(t) is the scale factor. Most of the components of Riemann for this simple spacetime are actually zero. Let us sketch how to find the ones that are nonzero. We had for the Christoffels
Γ^0_{ij} = a ȧ δ_ij , Γ^i_{j0} = (ȧ/a) δ^i_j . (9.24)

The first group of nonzero Riemann components have one time index up and one down, and two spatial indices:

R^0_{i0j} = ∂_0 Γ^0_{ji} − Γ^0_{jk} Γ^k_{0i}
= (ȧ² + a ä) δ_ij − a ȧ δ_jk (ȧ/a) δ^k_i
= a ä δ_ij . (9.25)
Then we have
R^i_{00j} = ∂_0 Γ^i_{j0} + Γ^i_{0k} Γ^k_{j0}
= ((a ä − ȧ²)/a²) δ^i_j + (ȧ²/a²) δ^i_j
= (ä/a) δ^i_j . (9.26)

The second group of nonzero Riemann components has all spatial indices,
R^i_{jkℓ} = Γ^i_{k0} Γ^0_{ℓj} − Γ^i_{ℓ0} Γ^0_{kj}
= (ȧ/a) δ^i_k (a ȧ δ_{ℓj}) − (ȧ/a) δ^i_ℓ (a ȧ δ_{kj})
= ȧ² (δ^i_k δ_{jℓ} − δ^i_ℓ δ_{jk}) . (9.27)

Notice how we have discovered both "velocity squared" ȧ² terms, which arise via Γ··Γ·· parts in Riemann, and ä "acceleration" terms, which arise via ∂Γ (i.e., ∂²g) parts in Riemann. It is not until you compute the curvature that you see the appearance of the "acceleration" pieces. Notice also how the "acceleration" of the scale factor showed up in the Riemann components involving the time direction; the all-spatial Riemanns gave only "velocity squared" contributions. Since we now have the Riemann tensor, we can contract it to find Ricci. The nonzero components are
R_00 = R^i_{00i}
= +(ä/a) δ^i_i
= +d (ä/a) ,

R_ij = R^0_{ij0} + R^k_{ijk}
= −a ä δ_ij − ȧ² (δ^k_k δ_ij − δ^k_j δ_ik)
= −a ä δ_ij − ȧ² (d − 1) δ_ij
= −δ_ij [a ä + (d − 1) ȧ²] , (9.28)

where d is the spatial dimension (d = 3 in our universe). Contracting the Ricci tensor with the metric tensor gives the Ricci scalar,
R = g^{00} R_00 + g^{ij} R_ij
= +d (ä/a) + (d/a²)[a ä + (d − 1) ȧ²]
= +2d (ä/a) + d(d − 1)(ȧ²/a²) . (9.29)

Using D = d + 1, we can write this in terms of the spacetime dimension D,

R = +2(D − 1)(ä/a) + (D − 1)(D − 2)(ȧ²/a²) . (9.30)
The time evolution of this depends sensitively on the details of how the scale factor evolves. We will need to develop the Einstein equation to see how scale factor evolution is tied to the energy-momentum of the type of matter hanging out in the spacetime. Arbitrary scale factors a(t) are not allowed; the Einstein equations will determine them in terms of the energy density and the pressures.
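The FRW curvature computation above is also a good target for a computer-algebra cross-check. The sketch below (my own helper code, using sympy) builds the Christoffels from the metric (9.23), assembles Riemann via eq. (9.10), contracts with the HEL convention (9.13), and recovers the Ricci scalar (9.30) for d = 3.

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
a = sp.Function('a', positive=True)(t)
coords = [t, x, y, z]

# Spatially flat FRW, mostly-minus signature, eq. (9.23)
g = sp.diag(1, -a**2, -a**2, -a**2)
ginv = g.inv()
n = 4

# Christoffels from the metric
Gam = [[[sp.simplify(sp.Rational(1, 2) * sum(
            ginv[mu, s] * (sp.diff(g[s, lam], coords[nu])
                           + sp.diff(g[s, nu], coords[lam])
                           - sp.diff(g[nu, lam], coords[s])) for s in range(n)))
         for lam in range(n)] for nu in range(n)] for mu in range(n)]

def riemann(rho, sig, mu, nu):
    """R^rho_{sig mu nu}, eq. (9.10)."""
    expr = sp.diff(Gam[rho][sig][nu], coords[mu]) - sp.diff(Gam[rho][sig][mu], coords[nu])
    expr += sum(Gam[lam][sig][nu] * Gam[rho][lam][mu]
                - Gam[lam][sig][mu] * Gam[rho][lam][nu] for lam in range(n))
    return sp.simplify(expr)

# Ricci with the HEL contraction (9.13), then the Ricci scalar (9.14)
ricci = [[sp.simplify(sum(riemann(al, mu, nu, al) for al in range(n)))
          for nu in range(n)] for mu in range(n)]
Rs = sp.simplify(sum(ginv[mu, nu] * ricci[mu][nu] for mu in range(n) for nu in range(n)))

adot, addot = sp.diff(a, t), sp.diff(a, t, 2)
# Expect eq. (9.30) with D = 4: Rs = 6 addot/a + 6 adot^2/a^2
```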
10 R15Oct
10.1 Geodesic deviation

Geodesics are generally not straight lines in curved spacetime. Physically, they deviate from one another because of spacetime curvature. How can we make this intuition mathematically precise? Consider a one-parameter family of geodesics γ_s(λ), where λ is the affine parameter along the geodesic in question. The parameter s ∈ R tells you which geodesic you are referring to. We can choose coordinates s and λ on the manifold as long as the geodesics do not cross.
Then we have two naturally defined vector fields,

S^µ = ∂x^µ/∂s , T^µ = ∂x^µ/∂λ . (10.1)

A useful mnemonic here is that S is for Separation while T is for Tangent. We would now like to build the covariant analogue of the 'relative velocity' between geodesics,

V^µ = T^α ∇_α S^µ , (10.2)

and the 'relative acceleration'

A^µ := T^α ∇_α V^µ . (10.3)

Note that the acceleration of a path away from being a geodesic is different. That would be
T^α ∇_α T^µ . (10.4)
Since our proposed definitions above are tensor equations, they are well-defined. Now, S^µ and T^µ are basis vectors adapted to a coordinate system with coordinates s and λ. Therefore,
[S,T ] = 0 . (10.5)
On our way towards building the relative acceleration vector, we will need an identity for vector fields,
[X, Y]^µ = X^α ∂_α Y^µ − Y^α ∂_α X^µ (10.6)
= X^α ∇_α Y^µ − Y^α ∇_α X^µ . (10.7)
This allows us to relate S-directional derivatives of T to T-directional derivatives of S,

S^α ∇_α T^µ = T^α ∇_α S^µ . (10.8)

Now we can compute the relative acceleration vector.

A^µ = T^α ∇_α (T^σ ∇_σ S^µ) (10.9)
= T^α ∇_α (S^σ ∇_σ T^µ)
= (T^α ∇_α S^σ)(∇_σ T^µ) + T^α S^σ {[∇_σ ∇_α T^µ] + R^µ_{νασ} T^ν}
= (S^α ∇_α T^σ)(∇_σ T^µ) + R^µ_{νασ} T^ν T^α S^σ
+ [S^σ ∇_σ (T^α ∇_α T^µ) − (S^σ ∇_σ T^α)(∇_α T^µ)]
= +R^µ_{νασ} T^ν T^α S^σ , (10.10)

where we used (i) [S, T] = 0, (ii) ∇ obeys the Leibniz rule and Riemann is defined in terms of a commutator of covariant derivatives, (iii) the Leibniz rule and rearranging terms, (iv) relabelling of dummy indices to cancel terms and T being the tangent vector of a geodesic. Summarizing, we have the geodesic deviation equation

A^µ = D²S^µ/Dλ² = (∇_T ∇_T S)^µ = +R^µ_{νασ} T^ν T^α S^σ . (10.11)

Here we see how the Riemann curvature tensor governs the deviation of geodesics in a very precise way. The covariant acceleration deviation of this one-parameter family of geodesics is given by the Riemann tensor contracted with the tangent vector T twice, on its second and third indices, and contracted with the separation vector S once, on its fourth index.
10.2 Tidal forces and taking the Newtonian limit for Christoffels

Remember the tides? If you, like me, have spent any length of time near the ocean, then you know that the water level rises and falls twice a day. But do you know why? Newton first explained this in his Principia. Basically, oceanic water on the side nearest the Moon bulges because it is closer to the Moon than the ocean on the far side and hence feels stronger gravity; the bulge on the far side can be seen to happen through 'centrifugal force'. So we see two tides per day. (Note: distances in the figure are not to scale.)
How do tidal forces work in Newtonian and Einsteinian gravity? Well, you cannot detect curvature using only one test particle, or only one geodesic. You need to use multiples to see the physical effects of curvature of space or spacetime. So let us think about geodesic deviation in the Newtonian limit, even before we recruit the heavy machinery of tensor analysis in curved spacetime and the Riemann tensor. We will soon see how Riemann and the Newtonian potential are connected by the Newtonian limit of weak gravity and slow speeds. In an inertial frame, the equation of motion of the first particle moving in a Newtonian gravitational potential Φ(x^k) is

d²x^i/dt² = −δ^{ij} ∂_j Φ(x^k) . (10.12)

Next, we define the vector y^i to be the separation of the second particle from the first, which is assumed to be small. We have that

d²(x^i + y^i)/dt² = −δ^{ij} ∂_j Φ(x^k + y^k) . (10.13)

Taylor expanding gives
∂_j Φ(x^k + y^k) = ∂_j Φ(x^k) + (∂_ℓ ∂_j Φ(x^k)) y^ℓ + O(y²) , (10.14)

so that the Newtonian trajectory deviation equation is

d²y^i/dt² = −δ^{ij} (∂_j ∂_k Φ) y^k . (10.15)

The left hand side is known as the tidal acceleration, and it is described by the second mixed partial derivatives of the Newtonian potential. For simplicity, let us ignore the fact that the Earth is rotating on its own axis as well as the rotation of the Earth around the Sun. Letting the moon be at (x, y, z) = (0, 0, d), we have for the Newtonian gravitational potential

Φ_m(x, y, z) = −G_N M_m / {x² + y² + (z − d)²}^{1/2} . (10.16)

From this we can calculate the acceleration deviation vector

∂²Φ/∂x^i∂x^j |_0 = +(G_N M_m/d³) diag(1, 1, −2) . (10.17)

Why the asymmetry between the z and x, y directions? Simple. The functional dependence in the denominator is different.

∂²Φ/∂x² |_0 = −G_N M_m ∂_x [2x · (−1/2) {...}^{−3/2}] |_0 (10.18)
= G_N M_m [{...}^{−3/2} − x · 2x · (3/2) {...}^{−5/2}] |_0 (10.19)
= G_N M_m/d³ + 0 , (10.20)
2 ∂ Φ 1 −3/2 2 = − GN Mm∂z 2(z − d) · − {...} (10.21) ∂z 0 2 0 −3/2 3 −5/2 = GN Mm {...} + (z − d) · 2(z − d) · − {...} (10.22) 2 0 G M G M d2 = + N m − 3 N m (10.23) d3 d5 2G M = − N m . (10.24) d3 Another way to write the same set of equations is to use a unit normal vector ni = xi/r pointing in the radial direction; then
2 ∂ Φ GN Mm aij = − i j = − (δij − 3ninj) 3 (10.25) ∂x ∂x 0 r This (tensor) equation tells us that you get stretched in the radial direction and squeezed in the transverse directions. Quite generally, you can think of gravity as a stretchy-squeezy force. This originates in the fact that gravitational intereactions in our universe are transmitted by a spin-two boson known as the graviton. It has a polarization tensor rather than a polarization vector. After symmetries under arbitrary changes of coordinates are taken into account, there are two independent physical polarizations for the graviton in four spacetime dimensions, like there are for the photon. But please do not mistake one for the other: the photon only has spin one, and in dimension other than D = 3+1 the numbers of independent physical polarizations of photons and gravitons will not match. That they do in D = 3 + 1 is an numerical accident. How big are tidal forces, in orders of magnitude? First, we need to figure out which of the solar system bodies is relevant. If you do the calculation using the above formula for tidal accelerations, you find that the Moon is actually the biggest contributor, because although it is much lighter than the Sun (about 27,100,000 times) it is much closer (about 388 times), and it is the cube of the distance that counts. Plugging in the numbers, you will find that the Sun’s tidal acceleration is only about 45% of the Moon’s. So we focus on the Moon. We would like to compare the magnitude to the acceleration due to gravity. So, to get the order of magnitude, we are computing the ratio of the tidal force on a piece of ocean to the g-force, 2 3 GN MM rE rE MM rE −7 3 · ∼ ∼ 10 . (10.26) d GN ME ME d Tidal forces might seem like teeny weeny forces, but when you multiply by entire oceans, you get physical effects that human beings can relate to. We can make a little table comparing what we have found in Newtonian gravity versus Einsteinian General Relativity so far.
What               Newton                               Einstein
gravity            Φ(x^i, t)                            g_{αβ}(x^λ)
test particle EOM  d²x^i/dt² = −δ^{ij} ∂_j Φ            d²x^µ/dλ² = −Γ^µ_{νσ} (dx^ν/dλ)(dx^σ/dλ)
deviation          d²y^i/dt² = −δ^{ij} (∂_j ∂_k Φ) y^k  D²S^µ/Dλ² = +R^µ_{νσρ} T^ν T^σ S^ρ
tidal forces       ∂_i ∂_j Φ                            R^ρ_{σµν} = +∂_µ Γ^ρ_{νσ} − ∂_ν Γ^ρ_{µσ} + Γ^ρ_{µλ} Γ^λ_{νσ} − Γ^ρ_{νλ} Γ^λ_{µσ}
gravity EOM        ∇²Φ = 4πG_N ρ                        ??? (Einstein equations, coming soon!)

In the Newtonian equation of motion for Φ, ρ is the mass density of whatever is sourcing the gravitational field, and G_N is the Newton constant characterizing the strength of gravity.
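Before moving on, the Newtonian tidal tensor is simple enough to check end-to-end with sympy. In the sketch below the astronomical numbers are rough standard values that I supply purely for illustration (treat them as approximate, not as course data):

```python
import sympy as sp

x, y, z, d, G_N, M = sp.symbols('x y z d G_N M', positive=True)

# Moon at (0, 0, d), eq. (10.16)
Phi = -G_N * M / sp.sqrt(x**2 + y**2 + (z - d)**2)

# Hessian of Phi at the origin; expect (G_N M / d^3) diag(1, 1, -2), eq. (10.17)
X = [x, y, z]
hess0 = sp.Matrix(3, 3, lambda i, j: sp.diff(Phi, X[i], X[j])).subs(
    {x: 0, y: 0, z: 0}).applyfunc(sp.simplify)

# Rough astronomical numbers (standard figures, quoted from memory)
M_moon, M_sun, M_earth = 7.35e22, 1.99e30, 5.97e24   # kg
d_moon, d_sun, r_earth = 3.84e8, 1.50e11, 6.37e6     # m

sun_over_moon = (M_sun / M_moon) * (d_moon / d_sun)**3    # Sun's tide vs Moon's, ~0.45
tide_over_g = (M_moon / M_earth) * (r_earth / d_moon)**3  # eq. (10.26), ~1e-7
```

The two ratios at the end reproduce the "45% of the Moon's" and "∼10⁻⁷" estimates quoted in the text.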
In order to see how the covariant geodesic deviation equation reduces to the familiar Newtonian equations, we need to take the Newtonian limit in which gravity is weak and speeds are low. (Recall also that x0 = ct and we will need to put back the factors of c here to make the approximation clear.) Either we can assume staticity, or we can note that ∂0 = ∂t/c, which is a factor 1/c smaller than ∂i. In the Newtonian approximation, we treat the Newtonian potential as a perturbation on 1, and we will ignore terms of order Φ2 compared to terms of order Φ. In the weak-field limit, the line element is diagonal and quite simple,
ds² = (1 + 2Φ/c²) c²dt² − (1 − 2Φ/c²)(dx² + dy² + dz²) . (10.27)
For the moment, you will need to take this equation on faith, as I have not yet developed the machinery required to see how it emerges. What I will do for now is to assume it as an ansatz, and show that it correctly gives back the familiar Newtonian limit in the limit of weak gravity and slow speeds. Later on in the course, I will give a fuller explanation of where this expression for the approximate line element comes from. In the low-speed Newtonian limit, there is no difference between proper time and coordinate time t. The dynamical variables of interest become x^µ(λ) → x^i(t). What this means is that we only need to consider the spatial components of the geodesic deviation equation, as the temporal component takes care of itself automatically. In the limit of slow speeds compared to the speed of light, then, we have
d²y^i/dt² = +R^i_{ttj} y^j . (10.28)
To check that this does reduce to the Newtonian expression we need to compute R^i_{ttj} for the above line element. In the limit of weak gravity, we can find the components of the inverse metric to first order in Φ,
g^{tt} ≃ (1 − 2Φ/c²)/c² , g^{ij} ≃ −δ^{ij} (1 + 2Φ/c²) . (10.29)
For our general Christoffel symbol we have

Γ^µ_{νλ} = (1/2) g^{µσ} (∂_ν g_{σλ} + ∂_λ g_{σν} − ∂_σ g_{νλ}) , (10.30)
so we can pick off the 0 and i parts individually. Assuming that gravity is weak allows us to keep only first order terms in Φ. Assuming that Φ does not depend on time (to first order in small quantities) sets some Christoffels to zero. For example,

Γ^0_{00} = (1/2) g^{00} (∂_0 g_{00}) = 0 , (10.31)

and

Γ^0_{ij} = (1/2) g^{00} (∂_i g_{0j} + ∂_j g_{0i} − ∂_0 g_{ij}) = 0 , (10.32)

and

Γ^i_{0j} = (1/2) g^{ik} (∂_0 g_{kj} + ∂_j g_{k0} − ∂_k g_{0j}) = 0 . (10.33)

Then we have

Γ^0_{0i} = (1/2) g^{00} ∂_i g_{00} ≃ (1/2)(1 − 2Φ/c²) ∂_i (1 + 2Φ/c²) ≃ ∂_i Φ/c² ⇒ Γ^t_{ti} = ∂_i Φ/c² . (10.34)

Another nontrivial component is

Γ^i_{00} = (1/2) g^{ik} (∂_0 g_{k0} + ∂_0 g_{0k} − ∂_k g_{00}) = −(1/2) g^{ik} ∂_k g_{00} = δ^{ik} ∂_k Φ/c² ⇒ Γ^i_{tt} = δ^{ik} ∂_k Φ . (10.35)

Finally, we have

Γ^i_{jk} = (1/2) g^{iℓ} (∂_j g_{ℓk} + ∂_k g_{ℓj} − ∂_ℓ g_{jk}) (10.36)
= −(1/2) δ^{iℓ} (1 + 2Φ/c²)(2/c²)(δ_{ℓk} ∂_j Φ + δ_{ℓj} ∂_k Φ − δ_{jk} ∂_ℓ Φ) (10.37)
⇒ Γ^i_{jk} = −(1/c²)(δ^i_k ∂_j Φ + δ^i_j ∂_k Φ − δ^{iℓ} δ_{jk} ∂_ℓ Φ) . (10.38)
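A mechanical way to check these weak-field Christoffels is to carry a bookkeeping parameter ε along with the potential and expand to first order. Here is a sympy sketch; the helper names and the ε device are my own, not part of the course material:

```python
import sympy as sp

t, x, y, z, c, eps = sp.symbols('t x y z c epsilon', positive=True)
phi = sp.Function('phi')(x, y, z)   # static Newtonian potential
Phi = eps * phi                     # eps tags powers of the small quantity Phi
coords = [t, x, y, z]

# Weak-field line element, eq. (10.27)
g = sp.diag((1 + 2*Phi/c**2) * c**2,
            -(1 - 2*Phi/c**2), -(1 - 2*Phi/c**2), -(1 - 2*Phi/c**2))
ginv = g.inv()

def gamma(mu, nu, lam):
    """Exact Christoffel Gamma^mu_{nu lam} from eq. (10.30)."""
    return sp.Rational(1, 2) * sum(
        ginv[mu, s] * (sp.diff(g[s, lam], coords[nu])
                       + sp.diff(g[s, nu], coords[lam])
                       - sp.diff(g[nu, lam], coords[s])) for s in range(4))

def first_order(expr):
    """Keep only the piece linear in eps (i.e. first order in Phi)."""
    return sp.simplify(sp.diff(expr, eps).subs(eps, 0))

# Gamma^t_{tx} -> d_x phi / c^2, eq. (10.34); Gamma^x_{tt} -> d_x phi, eq. (10.35)
Gam_t_tx = first_order(gamma(0, 0, 1))
Gam_x_tt = first_order(gamma(1, 0, 0))
```

The spatial components (10.38) can be checked the same way, e.g. Γ^x_{xy} → −∂_y φ/c².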
11 M19Oct
11.1 Newtonian limit for Riemann

From the Christoffels we computed last time, we can compute the Riemann components,
R^t_{xtx} = −(1/c²) ∂²Φ/∂x² , (11.1)
R^t_{xty} = −(1/c²) ∂²Φ/∂x∂y , (11.2)
R^x_{yxy} = +(1/c²) (∂²Φ/∂x² + ∂²Φ/∂y²) , (11.3)
R^x_{yxz} = +(1/c²) ∂²Φ/∂y∂z , (11.4)

plus eight more equations from cyclic permutations of (x, y, z). Note that we do not obtain any squares of partial derivatives here in our Riemanns because we are only working to first order in the Newtonian potential Φ. Then, using our geodesic deviation equation in the Newtonian limit, we have

d²y^i/dt² = +R^i_{ttj} y^j . (11.5)
Since we also know that

R^i_{tjt} = ∂_j Γ^i_{tt} − 0 = +∂_j (δ^{ik} ∂_k Φ) = +δ^{ik} ∂_j ∂_k Φ , (11.6)

we can see that the General Relativistic geodesic deviation equation involving Riemann gives back the Newtonian expression, which is exactly what we set out to prove last time.
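This Newtonian-limit Riemann can also be extracted by machine, again expanding the weak-field metric (10.27) to first order in the potential with a bookkeeping parameter ε (a device of my own; the helper names are not from the course):

```python
import sympy as sp

t, x, y, z, c, eps = sp.symbols('t x y z c epsilon', positive=True)
phi = sp.Function('phi')(x, y, z)
Phi = eps * phi                     # eps tags powers of the weak potential
coords = [t, x, y, z]

# Weak-field line element, eq. (10.27)
g = sp.diag((1 + 2*Phi/c**2) * c**2,
            -(1 - 2*Phi/c**2), -(1 - 2*Phi/c**2), -(1 - 2*Phi/c**2))
ginv = g.inv()

def gamma(mu, nu, lam):
    return sp.Rational(1, 2) * sum(
        ginv[mu, s] * (sp.diff(g[s, lam], coords[nu])
                       + sp.diff(g[s, nu], coords[lam])
                       - sp.diff(g[nu, lam], coords[s])) for s in range(4))

def riemann(rho, sig, mu, nu):
    """R^rho_{sig mu nu}, eq. (9.10), exact in eps."""
    expr = sp.diff(gamma(rho, sig, nu), coords[mu]) - sp.diff(gamma(rho, sig, mu), coords[nu])
    expr += sum(gamma(lam, sig, nu) * gamma(rho, lam, mu)
                - gamma(lam, sig, mu) * gamma(rho, lam, nu) for lam in range(4))
    return expr

def first_order(expr):
    return sp.simplify(sp.diff(expr, eps).subs(eps, 0))

# R^x_{txt} and R^x_{tyt} to first order, matching eq. (11.6): R^i_{tjt} = delta^{ik} d_j d_k phi
R_x_txt = first_order(riemann(1, 0, 1, 0))
R_x_tyt = first_order(riemann(1, 0, 2, 0))
```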
To illustrate the abstract concept of geodesic deviation, let us work a very simple example. Suppose that we have a two-sphere of unit radius with line element
dΩ_2² = dθ² + sin²θ dφ² . (11.7)
If you did the Homework 1 assignment, you will already know how to find the Christoffels for this case. There are only two that are nonzero,
Γ^θ_{φφ} = − sin θ cos θ , Γ^φ_{φθ} = + cot θ . (11.8)

Denoting d/dλ by an overdot, we find for the geodesic equation
θ̈ − sin θ cos θ φ̇² = 0 , (11.9)
φ̈ + 2 cot θ θ̇ φ̇ = 0 . (11.10)
These are second order nonlinear ODEs, and solving them can be a battle if you do not choose your initial conditions cleverly.
If we wish, we can use spherical symmetry to pick a particular initial condition to make integrating these equations simpler. We choose the initial conditions

θ(λ)|_{λ=0} = π/2 , θ̇(λ)|_{λ=0} = −Ω_0 , φ(λ)|_{λ=0} = 0 , φ̇(λ)|_{λ=0} = 0 . (11.11)
This corresponds to pointing your tangent vector down a line of longitude. The constant Ω0 is the angular speed with which the polar angle θ is changing with the affine parameter λ. What is Riemann? The only nonzero component on the 2-sphere S2 is
R^θ_{φθφ} = + sin²θ . (11.12)
So the components of the geodesic deviation acceleration are
A^θ = R^θ_{νασ} T^ν T^α S^σ
= R^θ_{φθφ} T^φ T^θ S^φ + R^θ_{φφθ} T^φ T^φ S^θ
= + sin²θ [T^φ T^θ S^φ − (T^φ)² S^θ] , (11.13)

and
A^φ = R^φ_{θθφ} T^θ T^θ S^φ + R^φ_{θφθ} T^θ T^φ S^θ
= +[T^θ T^φ S^θ − (T^θ)² S^φ] . (11.14)
Now we need to specify S and T . Since the tangent vector to a geodesic running down a line of longitude points in the (negative of the) polar direction, and the separation vector between two adjacent such geodesics points in the azimuthal direction, we have that
T^θ = −Ω_0 , T^φ = 0 , S^θ = 0 , S^φ = 1 . (11.15)
So

A^θ = 0 , A^φ = −Ω_0² . (11.16)

The magnitude is what you should expect for an angular acceleration of the type represented here. The minus sign is physical. It is possible to get considerably more sophisticated in discussing the physics of geodesic deviation. In order to derive more precise equations, one studies a congruence of geodesics, which is a set of curves in an open region of spacetime such that every point in the region lies on precisely one curve. The story of how geodesics deviate can be expressed in more sophisticated tensor language by studying the covariant derivative of the four-velocity vector ∇_µ U_ν and decomposing it into three independent parts: (a) the trace part θ, known as the expansion of the congruence, (b) the symmetric traceless part σ_µν, known as the shear of the congruence, and (c) the antisymmetric part ω_µν, known as the rotation of the congruence. Each of these affects the evolution of the others, and the equations obtained are different for massive and massless particles. We will not show the details here because the algebra is too long-winded.
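The little contraction leading to (11.16) can be confirmed mechanically. Below is a short sympy sketch of my own (indices 0 = θ, 1 = φ); the table of Riemann components is filled in by hand from (11.12) plus the symmetries of Riemann, and the deviation formula is eq. (10.11):

```python
import sympy as sp

theta, Omega0 = sp.symbols('theta Omega_0', positive=True)

# Nonzero Riemann components on the unit two-sphere, from eq. (11.12) + symmetries
R = {(0, 1, 0, 1): sp.sin(theta)**2,   # R^theta_{phi theta phi}
     (0, 1, 1, 0): -sp.sin(theta)**2,  # R^theta_{phi phi theta}
     (1, 0, 1, 0): sp.Integer(1),      # R^phi_{theta phi theta}
     (1, 0, 0, 1): sp.Integer(-1)}     # R^phi_{theta theta phi}

def Riem(r, s, m, n):
    return R.get((r, s, m, n), sp.Integer(0))

# Tangent and separation vectors for a line-of-longitude geodesic, eq. (11.15)
T = [-Omega0, 0]
S = [0, 1]

# Geodesic deviation, eq. (10.11): A^mu = R^mu_{nu alpha sigma} T^nu T^alpha S^sigma
A = [sp.simplify(sum(Riem(mu, nu, al, si) * T[nu] * T[al] * S[si]
                     for nu in range(2) for al in range(2) for si in range(2)))
     for mu in range(2)]
# A == [0, -Omega0**2], reproducing eq. (11.16)
```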
11.2 Riemann normal coordinates and the Bianchi identity

Riemann normal coordinates are a handy coordinate system that you can always construct about any point p. They are defined in a smallish patch in the neighbourhood of p, and do not necessarily extend infinitely in all directions, as we will explain when we talk about geodesic deviation soon. But they are a great little coordinate system that you can use to evaluate tensor equations, and to help prove tensor equations. We will use the notational convention that equations written in Riemann normal coordinates have bars over the tensors. Strictly speaking we should also bar all the indices, but this is beyond my typing patience at present, so please imagine barred indices everywhere in your head. A Riemann normal coordinate system is one built using geodesics about a point p. More concretely for our purposes, it is the coordinate system in which the metric is locally Minkowskian, and the Christoffels are zero – at the point p,

Γ̄^µ_{αβ} = 0 . (11.17)
Then, since ∇σgαβ = 0 everywhere, including at p,
∇̄_σ ḡ_{µν} = ∂_σ ḡ_{µν} − Γ̄^λ_{σµ} ḡ_{λν} − Γ̄^λ_{σν} ḡ_{λµ} (11.18)
= ∂_σ ḡ_{µν} + 0 = 0 . (11.19)
Therefore, in a Riemann normal coordinate system, we have the special relations
∂_σ ḡ_{µν} = 0 , (11.20)
Γ̄^α_{λσ} = 0 , (11.21)
R̄^µ_{νσρ} = ∂_σ Γ̄^µ_{νρ} − ∂_ρ Γ̄^µ_{νσ} . (11.22)
As you can imagine, using this coordinate system we can more quickly check tensor equations. This is not a trick – tensor equations are valid in any coordinate system. Therefore, they must hold in any frame, including the Riemann normal coordinate frame in which our tensor components simplify. This conceptual tool can be super handy.
We are now going to make use of this special coordinate system to identify all the symmetries of Riemann. This is an important quest, because it will enable us to compute how many independent components Riemann has in arbitrary spacetime dimension D = d+1. In turn, that helps us understand the physics of this four-legged tensor. Computing it can be arduous for a general spacetime, and this is why I set computer algebra as part of HW1. To help us find the symmetries, it helps to start by using the spacetime metric to build the (0,4) version of Riemann from the natural (1,3) version,
R_{αβµν} = g_{αλ} R^λ_{βµν} . (11.23)
The first symmetry we can notice by inspection of the formula for Riemann in terms of Christoffels. We see immediately that Riemann is antisymmetric upon exchange of its final two indices,

R^ρ_{σµν} = −R^ρ_{σνµ} . (11.24)
In Riemann normal coordinates,
R̄_{ρσµν} = ḡ_{ρλ} [∂_µ Γ̄^λ_{νσ} − ∂_ν Γ̄^λ_{µσ}] (11.25)
= ḡ_{ρλ} [∂_µ ((1/2) ḡ^{λα} (∂_σ ḡ_{να} + ∂_ν ḡ_{σα} − ∂_α ḡ_{νσ})) − (µ ↔ ν)] (11.26)
= (1/2) ḡ_{ρλ} ḡ^{λα} [(∂_µ ∂_σ ḡ_{να} + ∂_µ ∂_ν ḡ_{σα} − ∂_µ ∂_α ḡ_{νσ}) − (µ ↔ ν)] (11.27)
= (1/2) [∂_µ ∂_σ ḡ_{νρ} + ∂_µ ∂_ν ḡ_{σρ} − ∂_µ ∂_ρ ḡ_{νσ} − (µ ↔ ν)] (11.28)
= (1/2) [∂_µ ∂_σ ḡ_{νρ} − ∂_µ ∂_ρ ḡ_{νσ} − (µ ↔ ν)] , (11.29)
where in the third line above we used the fact that ∂_µ ḡ^{λα} = 0 in Riemann normal coordinates, and in the fourth line we used symmetry. Therefore, we can see two additional identities satisfied by Riemann,

R_{ρσµν} = −R_{σρµν} , (11.30)

i.e., Riemann is antisymmetric upon exchange of its first two indices as well as its last two, and

R_{ρσµν} = R_{µνρσ} , (11.31)

i.e., Riemann is symmetric under interchange of the first two indices with the last two. We can also look at a version of Riemann with cyclic permutations on the last three indices,

Q_{ρσµν} := R_{ρσµν} + R_{ρµνσ} + R_{ρνσµ} . (11.32)

Evaluating again in Riemann normal coordinates gives
2Q̄_{ρσµν} = (∂_µ ∂_σ ḡ_{νρ} − ∂_µ ∂_ρ ḡ_{νσ} − ∂_ν ∂_σ ḡ_{µρ} + ∂_ν ∂_ρ ḡ_{µσ})
+ (∂_ν ∂_µ ḡ_{σρ} − ∂_ν ∂_ρ ḡ_{σµ} − ∂_σ ∂_µ ḡ_{νρ} + ∂_σ ∂_ρ ḡ_{νµ})
+ (∂_σ ∂_ν ḡ_{µρ} − ∂_σ ∂_ρ ḡ_{µν} − ∂_µ ∂_ν ḡ_{σρ} + ∂_µ ∂_ρ ḡ_{σν}) (11.33)
= 0 , (11.36)

where every term cancels pairwise against another, using the fact that mixed partial derivatives commute and the fact that the metric is symmetric. Because of the antisymmetry properties, an equivalent way of writing this is

R_{ρ[σµν]} = 0 , (11.37)

and using other symmetries of Riemann it immediately follows from this that
R_{[ρσµν]} = 0 , (11.38)

i.e., the totally antisymmetric part of Riemann vanishes too. With straightforward but tedious algebra of very similar type, we can also derive the Bianchi identity which governs covariant derivatives of Riemann. It can be written in (at least) two mathematically different
but physically identical ways, which are related by the symmetries of Riemann. The first form is

∇_λ R_{ρσµν} + ∇_ρ R_{σλµν} + ∇_σ R_{λρµν} = 0 , (11.39)

and the second form is

∇_{[λ} R_{µν]ρσ} = 0 , (11.40)

which constrains Riemann by relating components at different points. You can think of the Bianchi identity for Riemann as like a Jacobi identity for covariant derivatives,
[[∇_µ, ∇_ν], ∇_λ] + [[∇_ν, ∇_λ], ∇_µ] + [[∇_λ, ∇_µ], ∇_ν] = 0 . (11.41)
11.3 The information in Riemann

Now we have all the ingredients we need in order to compute the number of independent Riemann coefficients. We know that as a (0,4) tensor Riemann satisfies
Rαβγδ = −Rαβδγ , (11.42)
Rαβγδ = −Rβαγδ , (11.43)
Rαβγδ = Rγδαβ , (11.44)
R[αβγδ] = 0 . (11.45)
Suppose that we bunch the indices of Riemann in twos. Then we can think of Riemann as like a symmetric combination of two antisymmetric blocks. Recall that the dimension of an antisymmetric D × D matrix is D(D − 1)/2 while that of a symmetric matrix is D(D + 1)/2. Then the number of components of Riemann should be
n_R(D) = (1/2) [D(D − 1)/2] [D(D − 1)/2 + 1] − (D choose 4) . (11.46)

We obtained this by using the symmetries of the first three identities to compute the tentative total and then subtracting off the number of completely antisymmetric components to satisfy the fourth identity. This process works because the four constraints are independent. Then, with very simple algebra, we obtain

n_R(D) = D²(D² − 1)/12 . (11.47)
Notice a few things about this formula. In one spacetime dimension, nR(1) = 0 and Riemann has no components. This makes sense, as there is only one independent direction, so you cannot build a nonzero commutator of covariant derivatives. There is not enough room in spacetime to build a parallelogram. In two spacetime dimensions, we have nR(2) = 1 and Riemann has just one independent component. This makes gravitational physics in D = 1+1 quite easy compared to higher dimensions. In three spacetime dimensions, we get nR(3) = 6, and in four spacetime dimensions we have nR(4) = 20. This number is, not accidentally, equal to the number of degrees of freedom in the second partial derivatives of the metric that we cannot set to zero by a clever choice of coordinate system when Taylor expanding the metric.
As we keep going up in dimension, n_R(D) grows like a quartic polynomial in D. By the time we get to ten or eleven spacetime dimensions, we are dealing with n_R(10) = 825 or n_R(11) = 1210 independent components! This is why we often use computer algebra in research, when calculating in spacetime dimensions relevant to string theory. Of course, it is also possible with clever techniques to cut through the algebra and find quicker ways to calculate analytically, when your metric is diagonal or sparse in other significant ways.
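The counting is easy to tabulate. A small script of my own implementing (11.46) and comparing against the closed form (11.47):

```python
from math import comb

def n_riemann(D):
    """Independent components of Riemann in D dimensions, eq. (11.46)."""
    m = D * (D - 1) // 2                   # independent antisymmetric index pairs
    return m * (m + 1) // 2 - comb(D, 4)   # symmetric pairs-of-pairs, minus the cyclic identity

# n_riemann reproduces 0, 1, 6, 20 in D = 1..4, and 825, 1210 in D = 10, 11,
# matching the closed form D^2 (D^2 - 1) / 12 of eq. (11.47)
```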
12 R22Oct
12.1 Lie derivatives

So far we have developed covariant derivatives and curvature, which required having a Christoffel connection. An interesting fact is that there are some structures that can be defined on a curved spacetime manifold even without reference to a connection or curvature. We will introduce the idea of the Lie¹⁰ derivative today, because studying it acting on the metric tensor of spacetime will lead us to the General Relativistic version of Noether's Theorem, which is one of the most important ideas of all time in theoretical physics. We will find that a symmetry of the spacetime metric gives an integral of the motion, a conserved quantity, which we can use to help solve for trajectories of test particles in some important cases. The key concept we will need for our discussion of Noether's Theorem is how to take a Lie derivative along the congruence defined by a vector field. So, first things first: what is a congruence? On a spacetime manifold, a congruence is a set of curves that fill the manifold (or more generally some part of it) without intersecting. Therefore, the congruence provides a mapping of a manifold onto itself, in the following sense. If the parameter on the curves is λ, then any tiny ∆λ defines a mapping, where each point is advanced by ∆λ along the same curve in the congruence. This is a 1-1 mapping if the vector field is C¹, and if it is C∞ it is called a diffeomorphism. If there is such a map for any ∆λ, then we have a one-parameter Lie group, and the mapping is called a Lie dragging along the congruence. Suppose that we have a scalar function f defined on our spacetime manifold. Then our above mapping defined by ∆λ lets us define a new function f*_{∆λ} in the obvious way: if a point P on a certain curve in the congruence gets mapped to the point Q, then
f(P) = f*_{∆λ}(Q) . (12.1)
If it happens that we have a function for which the new value f*_{∆λ}(Q) is equal to the old one f(Q), for all Q,

f = f*_{∆λ} , (12.2)

then the function is invariant under the mapping. If it is invariant for all ∆λ, then the function is said to be Lie dragged. In less fancy language,

df/dλ = 0 . (12.3)
Acting on any given tensor, the Lie derivative along some vector field V , written as LV , measures how fast the tensor changes along integral curves of V . Acting on a scalar function f, that is just the directional derivative,
L_V(f) = V^λ ∂_λ f . (12.4)
Note that this is the partial derivative: we have not involved any affine connection here.
10Pronunciation note: “Lie” rhymes with “see”.
What about a vector field? Any vector field V is defined by the congruence of curves for which it is the tangent field,

V^µ = dx^µ/dλ . (12.5)

A familiar example from undergraduate electromagnetism is that the magnetic flux lines are the integral curves of the magnetic field 3-vector. Now suppose that we have two general vector fields X and Y. Recall that any vector V can be expanded in the coordinate basis as V = V^µ ∂_µ. Then we can define the commutator [X, Y] of two vector fields via
[X, Y](f) ≡ X(Y(f)) − Y(X(f)) , (12.6)

where f is an arbitrary function. The neat thing about [X, Y] is that it is a bona fide vector field: it is linear,

[X, Y](af + bg) = a[X, Y]f + b[X, Y]g , (12.7)

and it obeys the Leibniz rule,
[X, Y ](fg) = f[X, Y ]g + g[X, Y ]f . (12.8)
In the coordinate basis, the new vector field [X, Y ] has components
[X, Y]^µ = X^λ ∂_λ Y^µ − Y^λ ∂_λ X^µ . (12.9)
This is a well-defined tensor, because the non-tensorial pieces from the partial derivatives cancel by antisymmetry of the commutator. If you prefer, you can write the above formula with covariant derivatives instead – that way, it looks more tensorial. Suppose that we adapt our coordinate system so that V points entirely along the coordinate basis vector ∂/∂x^d. The utility of choosing this coordinate system is that a diffeomorphism by λ amounts to a coordinate transformation from (x^0, x^1, ..., x^d) to (x^0, x^1, ..., x^d + λ). Then the components of a different vector T^µ pulled back from the transformed point to the original are simply T^µ(x^0, x^1, ..., x^d + λ). In this coordinate system, the Lie derivative then becomes

L_V T^µ = ∂T^µ/∂x^d . (12.10)

This expression is clearly not covariant, but we know that for two vector fields V and T the commutator [V, T] is a well-defined tensor, and in this coordinate system it happens to have components

[V, T]^µ = V^ν ∂_ν T^µ − T^ν ∂_ν V^µ = ∂T^µ/∂x^d . (12.11)
Since both LV T and [V , T ] are vectors (rank (1, 0) tensors), their components must be equal, and so we finally have the formula we want. The Lie derivative of a vector T along the vector field V is LV T = [V , T ] . (12.12) This quantity on the RHS is called the Lie bracket. The equation says that how the vector T changes along integral curves of another vector V is encoded in the commutator of the two vector fields. The formula for the action of the Lie derivative on covariant vectors follows directly from what we have just derived for contravariant vectors and the Leibniz rule.
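The component formula (12.9) is easy to check by machine. Here is a minimal SymPy sketch (the helper name `lie_bracket` is mine) applying it to two vector fields on the Euclidean plane: the rotation generator x∂_y − y∂_x and the translation generator ∂_x:

```python
import sympy as sp

x, y = sp.symbols('x y')
coords = [x, y]

def lie_bracket(X, Y):
    """[X, Y]^mu = X^lam d_lam Y^mu - Y^lam d_lam X^mu, eq. (12.9)."""
    return [sp.simplify(sum(X[l]*sp.diff(Y[m], coords[l])
                            - Y[l]*sp.diff(X[m], coords[l])
                            for l in range(len(coords))))
            for m in range(len(coords))]

V = [-y, x]   # rotation generator x d_y - y d_x
T = [1, 0]    # translation generator d_x

print(lie_bracket(V, T))   # [0, -1], i.e. the vector field -d_y
```

As expected from the algebra of rotations and translations, [V, T] = −∂_y: dragging the translation generator along the rotation flow rotates it.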
For a general rank (k, ℓ) tensor, the Lie derivative is
(L_V T)^{µ1...µk}_{ν1...νℓ} = V^σ ∂_σ T^{µ1...µk}_{ν1...νℓ}
                            − (∂_λ V^{µ1}) T^{λµ2...µk}_{ν1...νℓ} − ...
                            + (∂_{ν1} V^λ) T^{µ1...µk}_{λν2...νℓ} + ... . (12.13)

This equation may make you a bit uncomfortable because it involves partial derivatives. In fact, if you do the straightforward but tedious algebra, you will find that it is just as valid with covariant derivatives replacing the partial ones,
(L_V T)^{µ1...µk}_{ν1...νℓ} = V^σ ∇_σ T^{µ1...µk}_{ν1...νℓ}
                            − (∇_λ V^{µ1}) T^{λµ2...µk}_{ν1...νℓ} − ...
                            + (∇_{ν1} V^λ) T^{µ1...µk}_{λν2...νℓ} + ... . (12.14)

This second version certainly looks more tensorial. But the first version has the advantage that it makes clear that no connection is necessary to define Lie derivatives of tensors: the Lie derivative is an independent structure.
12.2 Killing vectors and tensors

In this section, we will be especially interested in the expression above for the Lie derivative of the metric tensor, which characterizes everything about gravity in our spacetime. We have
(L_V g)_{µν} = ∇_µ V_ν + ∇_ν V_µ . (12.15)

So if

0 = ∇_µ K_ν + ∇_ν K_µ (12.16)

for some vector K, the metric is unchanged along the integral curves of K, i.e., the spacetime has a symmetry. K is known as a Killing vector. This is Noether's Theorem in curved spacetime, and it plays an extremely important role in the physics of GR. So, what is the corresponding conservation law? Consider the quantity K · p. Its covariant derivative is
∇_µ(K_λ p^λ) = (∇_µ K_λ) p^λ + K_λ (∇_µ p^λ) . (12.17)

Contracting this with p^µ gives
p^µ ∇_µ(K_λ p^λ) = p^µ p^λ ∇_µ K_λ + K_λ p^µ (∇_µ p^λ) , (12.18)

and the second term vanishes by the geodesic equation. The first term also vanishes: p^µ p^λ is symmetric in (µ, λ) while ∇_µ K_λ is antisymmetric by the Killing vector equation. So the Killing equation is equivalent to conservation of K · p along geodesics. More generally, if we have a Killing tensor obeying
∇_{(µ} K_{ν1...νℓ)} = 0 , (12.19)

then

p^µ ∇_µ (K_{ν1...νℓ} p^{ν1} ... p^{νℓ}) = 0 . (12.20)
Finding conserved quantities is physically important because it can help us solve for geodesics. Soon, when we introduce black holes, we will see just how crucial conserved quantities can be in analyzing geodesic motion and its physical consequences. So let us derive an alternative form of the geodesic equation which will be handy for future reference. What is the directional covariant derivative of the downstairs version of the tangent vector to the curve x^µ(λ)?

D/Dλ (dx_µ/dλ) = d^2x_µ/dλ^2 − Γ^α_{σµ} (dx^σ/dλ) (dx_α/dλ) . (12.21)

This should be zero for geodesics, giving

d^2x_µ/dλ^2 = (1/2) g^{αβ} (∂_σ g_{βµ} + ∂_µ g_{βσ} − ∂_β g_{σµ}) (dx^σ/dλ) (dx_α/dλ)
            = (1/2) (∂_σ g_{βµ} + ∂_µ g_{βσ} − ∂_β g_{σµ}) (dx^σ/dλ) (dx^β/dλ)
            = (1/2) (∂_µ g_{βσ}) (dx^σ/dλ) (dx^β/dλ) , (12.22)

where in the last step the first and third terms cancelled because they are antisymmetric under relabelling σ ↔ β, which yields (upon relabelling dummy indices)
d/dλ (dx_µ/dλ) = (1/2) (∂_µ g_{αβ}) (dx^α/dλ) (dx^β/dλ) . (12.23)

So if the entire spacetime metric has zero dependence on a particular coordinate x^µ, the corresponding lower-index tangent vector dx_µ/dλ is conserved! For a massive particle, this quantity is none other than p_µ/m. For a massless particle, we can choose a convention in which p^µ = dx^µ/dλ. Therefore,
if ∂_µ g_{αβ} = 0 for some fixed µ and all α, β, then p_µ = constant . (12.24)

Let us do an ultra-simple example of a Killing vector. Consider Minkowski space in 4D, namely R^{3,1} with the flat spacetime metric. In Cartesian coordinates, we obviously have spacetime translation invariance. This implies that all components of p_µ are conserved. As a less trivial example, take our spatially flat FRW universe for which we previously worked out the Christoffels. Notice that the metric depended only on time. Obviously, this means that energy is not conserved. Stop and think on that for a minute. You probably thought that conservation of energy must be true in all circumstances, even for the whole universe. You would be wrong. It requires a symmetry! Since none of the components of the metric depend on spatial coordinates, the spatial momenta p_i are conserved. For our third example of Killing vectors, consider the two-sphere S^2 with round metric
ds^2 = dθ^2 + sin^2θ dφ^2 . (12.25)
How do we find the Killing vectors? We need to solve the D(D + 1)/2 Killing equations,
0 = ∇_µ K_ν + ∇_ν K_µ = ∂_µ K_ν + ∂_ν K_µ − 2 Γ^α_{µν} K_α . (12.26)
First, we need the nonzero Christoffels,

Γ^φ_{φθ} = cos θ / sin θ , Γ^θ_{φφ} = − sin θ cos θ . (12.27)

Then the three independent Killing vector equations involve θθ, φφ, θφ:
0 = ∂_θ K_θ ,
0 = ∂_φ K_φ + sin θ cos θ K_θ ,
0 = ∂_φ K_θ + ∂_θ K_φ − 2 (cos θ / sin θ) K_φ . (12.28)

The first Killing equation teaches us that
K_θ = K_θ(φ) . (12.29)
Taking ∂_φ of the second Killing equation gives, after a little bit of massaging of trig functions and using the third equation,

∂_φ^2 K_θ + K_θ = 0 . (12.30)

We can readily solve this,

K_θ(φ) = A sin φ + B cos φ , (12.31)

where A, B are constants of integration. Using this in the second Killing equation and partially integrating w.r.t. φ to find K_φ gives
K_φ = F(θ) + sin θ cos θ (A cos φ − B sin φ) , (12.32)

where F is an arbitrary function of integration. Substituting this back into the third Killing equation gives, after more trigonometric algebraic massage,

∂_θ F(θ) − 2 (cos θ / sin θ) F(θ) = 0 , (12.33)

which is readily integrated to

F(θ) = C sin^2 θ , (12.34)

where C is a constant of integration. Therefore, the general form of our Killing vectors for the two-sphere is, for the downstairs components,
K_θ = A sin φ + B cos φ ,
K_φ = C sin^2 θ + sin θ cos θ (A cos φ − B sin φ) . (12.35)

If we take A = 0, B = 0, C = 1, we get a Killing vector R with upstairs components

R^θ = 0 , R^φ = 1 . (12.36)

If we take A = 0, B = 1, C = 0, we get a Killing vector S with upstairs components

S^θ = cos φ , S^φ = − cot θ sin φ . (12.37)

If we take A = −1, B = 0, C = 0, we get a Killing vector T with upstairs components

T^θ = − sin φ , T^φ = − cot θ cos φ . (12.38)

As you can check by transforming between spherical polar coordinates and Cartesian coordinates, these three Killing vectors correspond to R = x∂_y − y∂_x, S = z∂_x − x∂_z, T = y∂_z − z∂_y.
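These three vectors can be machine-checked against the Killing equation (12.26). A minimal SymPy sketch (the helper names are mine):

```python
import sympy as sp

th, ph = sp.symbols('theta phi')
coords = [th, ph]
g = sp.diag(1, sp.sin(th)**2)        # round metric on the unit two-sphere
ginv = g.inv()
n = 2

# Christoffels Gamma^a_{bc} for the round metric
Gamma = [[[sp.simplify(sum(ginv[a, d]*(sp.diff(g[d, b], coords[c])
           + sp.diff(g[d, c], coords[b]) - sp.diff(g[b, c], coords[d]))
           for d in range(n))/2) for c in range(n)] for b in range(n)]
         for a in range(n)]

def killing_eq(K_up):
    """All components of nabla_mu K_nu + nabla_nu K_mu, eq. (12.26)."""
    K = [sum(g[m, a]*K_up[a] for a in range(n)) for m in range(n)]  # lower the index
    return [sp.simplify(sp.diff(K[nu], coords[mu]) + sp.diff(K[mu], coords[nu])
                        - 2*sum(Gamma[a][mu][nu]*K[a] for a in range(n)))
            for mu in range(n) for nu in range(n)]

R = [0, 1]
S = [sp.cos(ph), -sp.cos(th)/sp.sin(th)*sp.sin(ph)]
T = [-sp.sin(ph), -sp.cos(th)/sp.sin(th)*sp.cos(ph)]

for K in (R, S, T):
    assert all(e == 0 for e in killing_eq(K))
```

All twelve components vanish identically, so each of R, S, T generates an isometry of the sphere; a non-Killing vector such as ∂_θ alone fails the same test.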
13 M26Oct
13.1 Maximally symmetric spacetimes

Spacetimes are distinguished by how many symmetries they possess. The more symmetric, the more calculable; the less symmetric, the less calculable. Even though maximally symmetric spacetimes possess an unrealistic amount of symmetry for experimental purposes, they are still very useful to study because calculations are easier to complete and they help build intuition.

What are the maximally symmetric spacetimes? We need to specify the spacetime signature^11 in order to get started on this discussion. In Euclidean signature, Riemannian manifolds with maximal symmetry are (up to local isometry) either Euclidean space R^D, the sphere S^D, or hyperbolic space H^D. In Lorentzian signature, there are also three options, and they split up according to the value of the cosmological constant Λ (a.k.a. dark energy density). When Λ = 0, we get Minkowski space R^{d,1}, where D = d + 1. For Λ < 0 we get Anti de Sitter spacetime (AdS), and for Λ > 0 we get de Sitter spacetime (dS).

Recall that Minkowski spacetime is invariant under (d + 1) translations, d(d − 1)/2 rotations, and d boosts. Adding the numbers together gives a total of

(d + 1) + (1/2) d(d − 1) + d = (1/2) (d + 1)(d + 2) = (1/2) D(D + 1) (13.1)

symmetries. We therefore say that a spacetime manifold of dimension D is maximally symmetric if it possesses D(D + 1)/2 independent symmetries.

What equation should the Riemann tensor obey in maximally symmetric spacetimes? It had better be invariant under local Lorentz transformations, because there is no preferred direction in spacetime. There are only a very few tensors which we can use: g_{µν} and ε_{µ1...µD}. The epsilon tensor turns out to have the wrong symmetry properties to build Riemann components, and the metric ends up the winner. The unique combination of metric tensor components possessing the right symmetries to be Riemann is g_{ρν}g_{σµ} − g_{ρµ}g_{σν}, and tracing fixes the constant of proportionality:

R_{ρσµν} = [R / D(D − 1)] (g_{ρν} g_{σµ} − g_{ρµ} g_{σν}) . (13.2)
The Ricci scalar R is constant over the entire manifold for maximally symmetric spacetimes.
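As a concrete check of (13.2), here is a SymPy sketch for the simplest maximally symmetric example, the unit two-sphere (D = 2). One caveat: overall sign and index-ordering conventions for the Riemann tensor differ between books, so the sketch states its convention explicitly in a comment; with that convention the maximally symmetric form holds with the index pairing shown, and the Ricci scalar comes out as the constant R = 2.

```python
import sympy as sp

theta, phi = sp.symbols('theta phi')
coords = [theta, phi]
n = 2
g = sp.Matrix([[1, 0], [0, sp.sin(theta)**2]])   # unit round S^2
ginv = g.inv()

Gamma = [[[sum(ginv[a, d]*(sp.diff(g[d, b], coords[c]) + sp.diff(g[d, c], coords[b])
           - sp.diff(g[b, c], coords[d])) for d in range(n))/2
          for c in range(n)] for b in range(n)] for a in range(n)]

# Riemann, convention R^r_{s m nu} = d_m Gamma^r_{nu s} - d_nu Gamma^r_{m s} + GG - GG
def riem(r, s, m, nu):
    return sp.simplify(sp.diff(Gamma[r][nu][s], coords[m])
                       - sp.diff(Gamma[r][m][s], coords[nu])
                       + sum(Gamma[r][m][l]*Gamma[l][nu][s]
                             - Gamma[r][nu][l]*Gamma[l][m][s] for l in range(n)))

Ric = sp.Matrix(n, n, lambda s, nu: sum(riem(m, s, m, nu) for m in range(n)))
Rs = sp.simplify(sum(ginv[i, j]*Ric[i, j] for i in range(n) for j in range(n)))

lhs = sum(g[0, a]*riem(a, 1, 0, 1) for a in range(n))       # R_{theta phi theta phi}
rhs = Rs/(n*(n - 1))*(g[0, 0]*g[1, 1] - g[0, 1]*g[1, 0])    # maximally symmetric form
assert Rs == 2 and sp.simplify(lhs - rhs) == 0
```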
Anti de Sitter spacetime AdS_{D=d+1} can be embedded in a flat spacetime of one higher dimension with two time directions, R^{d,2}, via

− (t^1)^2 − (t^2)^2 + (x^1)^2 + ... + (x^d)^2 = −L^2 , (13.3)

where L is the radius of curvature of the AdS_D. There are several different coordinate systems in common usage for AdS_D. One of the most useful is global coordinates, in which

^11 If we were in a mathematically picky mood, we would also want to specify the spacetime topology.
t^1 = L cosh ρ cos τ , (13.4)
t^2 = L cosh ρ sin τ , (13.5)
x^i = L sinh ρ x̂^i , where Σ_{i=1}^{d} (x̂^i)^2 = 1 . (13.6)

In general dimension, spherical coordinates are defined via
x̂^1 = cos θ_1 ,
x̂^p = cos θ_p Π_{m=1}^{p−1} sin θ_m , p ∈ {2, ..., d − 1} ,
x̂^d = Π_{m=1}^{d−1} sin θ_m . (13.7)

You can check yourself, either by hand or using SymPy, that the resulting line element of AdS_D in global coordinates is
ds^2 = L^2 (cosh^2ρ dτ^2 − dρ^2 − sinh^2ρ dΩ_{d−1}^2) , (13.8)

where

dΩ_{d−1}^2 = dθ_1^2 + Σ_{ℓ=2}^{d−1} (Π_{m=1}^{ℓ−1} sin^2θ_m) dθ_ℓ^2 . (13.9)

With a further transformation in time and radius to static coordinates,
t = L τ , r = L sinh ρ , (13.10)

we obtain

ds^2 = (1 + r^2/L^2) dt^2 − (1 + r^2/L^2)^{−1} dr^2 − r^2 dΩ_{d−1}^2 . (13.11)
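The SymPy check mentioned above can be made concrete in the simplest interesting case, AdS_3 (d = 2), where the unit vector x̂^i traces a circle. This sketch computes the induced metric from the embedding (13.3)-(13.6), taking the ambient signature (+, +, −, −) to match the mostly-minus conventions of these notes:

```python
import sympy as sp

tau, rho, phi = sp.symbols('tau rho phi')
L = sp.symbols('L', positive=True)
intrinsic = [tau, rho, phi]

# global-coordinate embedding of AdS_3 into R^{2,2}
X = [L*sp.cosh(rho)*sp.cos(tau),    # t^1
     L*sp.cosh(rho)*sp.sin(tau),    # t^2
     L*sp.sinh(rho)*sp.cos(phi),    # x^1
     L*sp.sinh(rho)*sp.sin(phi)]    # x^2
eta = sp.diag(1, 1, -1, -1)         # ambient flat metric, two time directions

# induced metric g_ab = eta_MN (d_a X^M)(d_b X^N)
J = sp.Matrix([[sp.diff(XM, xi) for xi in intrinsic] for XM in X])
g = (J.T*eta*J).applyfunc(sp.simplify)

# matches ds^2 = L^2 (cosh^2 rho dtau^2 - drho^2 - sinh^2 rho dphi^2)
expected = sp.diag(L**2*sp.cosh(rho)**2, -L**2, -L**2*sp.sinh(rho)**2)
assert (g - expected).applyfunc(sp.simplify) == sp.zeros(3, 3)
```

Here the dΩ_{d−1}^2 factor is just dφ^2; in higher d one adds more angles to x̂^i as in (13.7).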
The scale L is the radius of curvature, and it sets the scale for all the physics in AdSD. The physics of Anti de Sitter (or de Sitter) spacetime in D = d + 1 dimensions differs markedly from the physics of Minkowski spacetime. One of the quickest ways to illustrate this is to compare the falloff of partial waves in AdS versus flat spacetime. Solving a wave equation for a simple type of field is a straightforward way to see this. Consider a Klein-Gordon (scalar) field living in flat Minkowski spacetime. Its equation of motion in spherical coordinates {t, r, ΩD−2} is
∇^µ ∇_µ Φ = m^2 Φ . (13.12)
If we write

Φ(t, r, Ω_{D−2}) = e^{−iωt} χ(r) Y_{ℓ,{m}}(Ω_{D−2}) , (13.13)
where the spherical harmonics obey
∇^2_{S^{d−1}} Y_{ℓ,{m}} = −ℓ(ℓ + d − 2) Y_{ℓ,{m}} , (13.14)

and separate variables, we find

[ ∂_r^2 + ((d − 1)/r) ∂_r + ω^2 − ℓ(ℓ + d − 2)/r^2 − m^2 ] χ(r) = 0 . (13.15)

The most physically important thing to understand from this differential equation is that higher partial waves with ℓ > 0 are less important at large radius than the ℓ = 0 mode. A related fact is that when we write out the multipole expansion for electric and magnetic fields in Minkowski spacetime, higher multipole fields fall off with larger powers of radius. This physics is inherent to Minkowski spacetime with Λ = 0. It may surprise you to learn that it does not carry over to other values of the cosmological constant. Suppose that we now consider instead AdS_{d+1} with global coordinates {τ, ρ, Ω_{d−1}},
ds^2 = (L^2 / cos^2ρ) (dτ^2 − dρ^2 − sin^2ρ dΩ_{d−1}^2) , (13.16)

which follows from the global form (13.8) upon substituting tan ρ_new = sinh ρ_old and relabelling ρ_new → ρ. In this set of coordinates, ρ ranges from 0 (the centre of AdS) to π/2 (the boundary) and the coordinate τ ranges from −∞ to +∞. What does the scalar wave equation look like in this spacetime? Anticipating separation of variables again, let us write
Φ(τ, ρ, Ω_{d−1}) = e^{−iωτ} χ(ρ) Y_{ℓ,{m}}(Ω_{d−1}) . (13.17)

Then the equation of motion becomes

[ (tan ρ)^{1−d} ∂_ρ ( (tan ρ)^{d−1} ∂_ρ ) + ω^2 − ℓ(ℓ + d − 2) csc^2ρ − m^2 sec^2ρ ] χ(ρ) = 0 . (13.18)

Notice that as we approach the boundary, the higher angular momentum modes are not suppressed compared to the ℓ = 0 mode. This is the germ of why the AdS/CFT correspondence discovered in the context of string theory in 1997 can work: an observer living on the boundary of the spacetime can see lots of information about what is happening in the interior of the spacetime all the way from the boundary. If we want to know the character of solutions to the above differential equation, we can substitute
χ(ρ) = (cos ρ)^{2h} (sin ρ)^{2b} f(ρ) , (13.19)

which, upon the substitution

y ≡ sin^2ρ , (13.20)

gives

y(1 − y) ∂_y^2 f + [ 2b + d/2 − (2h + 2b + 1) y ] ∂_y f − [ (h + b)^2 − ω^2/4 ] f = 0 . (13.21)

The solutions to this equation are hypergeometric functions, with

h_± = [ d ± √(d^2 + 4m^2) ] / 4 , b = ℓ/2 or b = −ℓ/2 + 1 − d/2 . (13.22)
(For further details, see e.g. hep-th/9805171.)

de Sitter spacetime dS_D can be embedded in R^{D,1} via

t^1 = √(L^2 − r^2) sinh(t/L) , (13.23)
x^i = r x̂^i , where Σ_{i=1}^{d} (x̂^i)^2 = 1 , (13.24)
x^D = √(L^2 − r^2) cosh(t/L) . (13.25)
This gives rise to static coordinates. (Like AdS, dS can alternatively be sliced with flat, positively curved, or negatively curved spatial sections.) In static coordinates, the de Sitter line element becomes
ds^2 = (1 − r^2/L^2) dt^2 − (1 − r^2/L^2)^{−1} dr^2 − r^2 dΩ_{d−1}^2 . (13.26)
This has a cosmological horizon at r = L. We will not have time to develop the similarities and differences between cosmological horizons and black hole horizons in this course.
13.2 Einstein's equations

In plain language, Einstein's equations express the fact that matter tells spacetime how to curve and spacetime tells matter how to move. In PHY484, I will show how to derive Einstein's equations of General Relativity. For now, we will just write them down for you and show you how to use them. They relate a geometrical quantity on the left hand side, built out of the Riemann curvature tensor, to the energy-momentum tensor of any matter fields in the physical system containing gravitation as well. In tensor notation, they read as follows,

R_{αβ} − (1/2) g_{αβ} R + Λ g_{αβ} = −8πG_N T_{αβ} . (13.27)

The quantity Λ is known as the cosmological constant. (Note: you can put back the powers of c very easily by recruiting dimensional analysis.)

A very important characteristic of Einstein's equations is that they are nonlinear. You can see this by eye by recalling the formula for the Christoffels in terms of metric derivatives, which is nonlinear, as well as the formula for the Riemanns in terms of derivatives of Christoffels and contractions of Christoffels, which is also nonlinear. Nonlinearity makes GR qualitatively very different from Newtonian gravity. It is only in the Newtonian limit of GR that the linearity with which you are familiar emerges and shows itself as the superposition principle for the Newtonian potential Φ(x). For generic situations in GR, nonlinearity is present in the partial differential equations for the evolution of spacetime. The mathematics of nonlinear PDEs is hugely complicated compared to linear ones, and for generic spacetimes often no general statements can be made. Symmetry helps enormously with the task of trying to solve the differential equations, classify spacetimes, or find their geodesics.

The energy-momentum tensor on the RHS of Einstein's equations is covariantly conserved. The way to see this is to take covariant derivatives of both sides of the Einstein
equations. The Einstein tensor is defined as

G_{µν} = R_{µν} − (1/2) g_{µν} R . (13.28)
Notice that this is denoted with the big G_{µν}, rather than the small g_{µν} of the metric or the G_N denoting the Newton gravitational constant. By itself, the rank (0,2) Einstein tensor G_{µν} does not look like much. But it obeys an extremely useful identity by virtue of the Bianchi identity for the Riemann tensor. To see this, let us take the first form of our Bianchi identity and contract with two factors of the upstairs metric,
0 = g^{νσ} g^{µλ} (∇_λ R_{ρσµν} + ∇_ρ R_{σλµν} + ∇_σ R_{λρµν}) (13.29)
  = ∇^µ R_{ρµ} − ∇_ρ R + ∇^ν R_{ρν} . (13.30)
Rearranging this expression gives a relationship between the covariant derivative of the Ricci tensor and the covariant derivative of the Ricci scalar,

∇^µ R_{ρµ} = (1/2) ∇_ρ R . (13.31)

This identity is handy because it enables us to prove that
∇^µ G_{µν} = 0 . (13.32)
In other words, the Einstein tensor is covariantly conserved. We also have the metric compatibility condition on our affine connection,
∇_σ g_{µν} = 0 . (13.33)
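Both of these statements can be verified mechanically for a concrete metric. Here is a minimal SymPy sketch (helper names are mine; signature and curvature conventions are stated in the comments) that builds the Einstein tensor of the spatially flat FRW metric and checks that ∇_µ G^µ_ν = 0 for every ν:

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
a = sp.Function('a')(t)
coords = [t, x, y, z]
n = 4

# spatially flat FRW metric in the (+,-,-,-) signature used in these notes
g = sp.diag(1, -a**2, -a**2, -a**2)
ginv = g.inv()

# Christoffels Gamma^m_{bc}
Gamma = [[[sp.simplify(sum(ginv[m, d]*(sp.diff(g[d, b], coords[c])
           + sp.diff(g[d, c], coords[b]) - sp.diff(g[b, c], coords[d]))
           for d in range(n))/2) for c in range(n)] for b in range(n)]
         for m in range(n)]

# Ricci tensor R_{s nu} = R^m_{s m nu}, convention d Gamma - d Gamma + GG - GG
def ricci(s, nu):
    return sp.simplify(sum(
        sp.diff(Gamma[m][nu][s], coords[m]) - sp.diff(Gamma[m][m][s], coords[nu])
        + sum(Gamma[m][m][l]*Gamma[l][nu][s] - Gamma[m][nu][l]*Gamma[l][m][s]
              for l in range(n)) for m in range(n)))

Ric = sp.Matrix(n, n, ricci)
Rs = sp.simplify(sum(ginv[i, j]*Ric[i, j] for i in range(n) for j in range(n)))
G_mix = (ginv*(Ric - g*Rs/2)).applyfunc(sp.simplify)   # G^mu_nu

# contracted Bianchi identity: nabla_mu G^mu_nu = 0 for every nu
for nu in range(n):
    div = sum(sp.diff(G_mix[m, nu], coords[m])
              + sum(Gamma[m][m][l]*G_mix[l, nu] for l in range(n))
              - sum(Gamma[l][m][nu]*G_mix[m, l] for l in range(n))
              for m in range(n))
    assert sp.simplify(div) == 0
```

Pointed at any other metric, the same machinery gives the same answer: the vanishing divergence is an identity, not an accident of FRW.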
Then we have

∇^µ T^{matter}_{µν} = 0 . (13.34)

Covariant conservation of the energy-momentum tensor in GR is mandatory, not voluntary.

How about some examples of energy-momentum tensors? Consider a perfect fluid, which is a spherical cow approximation to real fluids, characterized only by three things: energy density ρ, pressure p, and fluid velocity u^µ. Its energy-momentum tensor is constructed from those three quantities and the metric tensor,

T^{p.f.}_{µν} = (ρ + p/c^2) u_µ u_ν − p g_{µν} . (13.35)

More generally, if we have an action principle for some classical matter (non-gravitational) field coupled to gravity, S_matter, then the energy-momentum tensor is determined by varying the action w.r.t. g^{µν} according to the following recipe^12:
T_{µν}(x^σ) = [ 2 / √(−g(x^σ)) ] δS_matter/δg^{µν}(x^σ) , (13.36)
12I will prove this near the beginning of the GR2 PHY[1]484S course
where (−g) is an abbreviation for the determinant of the downstairs metric,

√−g ≡ √( − det(g_{αβ}) ) . (13.37)
This quantity arises in writing down a general relativistically invariant measure of integration, d^D x √−g. (For the case of spherical coordinates on flat Minkowski spacetime, it is r^2 sin θ, which should be familiar to you from undergraduate multivariable calculus.) A handy formula is

δ√−g = −(1/2) √−g g_{αβ} δg^{αβ} = +(1/2) √−g g^{αβ} δg_{αβ} . (13.38)

For a relativistic massive point particle,

T^{particle}_{µν}(x) = [ m / √(−g(x)) ] ∫ dτ ż_µ ż_ν δ^4(x − z(τ)) . (13.39)
We can see how this arises by starting from the Einbein action in curved spacetime in proper time gauge for a massive particle,
S^{(2)}_{rel} = (1/2) m ∫ dτ [ g_{µν} (dz^µ/dτ) (dz^ν/dτ) + 1 ] . (13.40)
The only part of this action that depends on the spacetime metric is the first term. Also, we will only get a nonzero result when we are on the particle path. How about for a scalar field Φ? For minimal coupling to gravity,
S_scalar[Φ] = ∫ d^D x √−g [ (1/2) ∇^µΦ ∇_µΦ − (1/2) m^2 Φ^2 − V(Φ) ] . (13.41)
It follows that

T^{scalar}_{µν} = ∇_µΦ ∇_νΦ − g_{µν} [ (1/2) (∇Φ)^2 − (1/2) m^2 Φ^2 − V(Φ) ] . (13.42)
For the electromagnetic field A_µ,

S_EM[A_α] = −(1/4) ∫ d^D x √−g F^{µν} F_{µν} . (13.43)

It follows that

T^{EM}_{µν} = −( F_{µλ} F_ν{}^λ − (1/4) g_{µν} F^{λσ} F_{λσ} ) . (13.44)
14 R29Oct
14.1 Birkhoff's theorem and the Schwarzschild black hole

Let us now attack the question of solving the vacuum Einstein equations when we have a static, spherically symmetric spacetime. After a bit of work, we will be able to show that the Schwarzschild black hole possessing mass M is the unique solution. Our methodology follows that of Carroll §5.2, and will involve a few steps. We will first use spherical symmetry to constrain the possible metric components that might be turned on. Then we will use the vacuum Einstein equations to prove that the time dependence must drop out. Then we will solve the remaining vacuum Einstein equations, and we will obtain the Schwarzschild solution. The last piece of the puzzle will be provided by the Newtonian limit, which will connect a mathematically arbitrary constant of integration to the physical quantity G_N M, where M is the mass of the Schwarzschild geometry and G_N is the Newton constant, which has dimensions of length^{D−2} and parametrizes the strength of gravity.

First, let us discuss the definition of a static spacetime in Lorentzian signature. Calling the timelike coordinate x^0, we define a static spacetime as one for which (a) there is no explicit time dependence in the metric and (b) the invariant interval possesses time reversal invariance,

∂g_{µν}(x^λ)/∂x^0 = 0 , (14.1)
ds^2 invariant under x^0 → −x^0 . (14.2)
A spacetime that only obeys the first condition is called a stationary spacetime. In essence, a static spacetime does nothing at all over time, while a stationary spacetime does exactly the same thing at all times. Note that staticity requires that there be no time-space cross terms in the invariant interval, only time-time and space-space components. Isotropy is also a big requirement. Having this much symmetry eliminates a lot of possibly independent components of the metric tensor. In particular, writing in terms of either Cartesian coordinates ~x or spherical polar coordinates r, θ, φ, we can only use three ingredients,
~x · ~x = r^2 ,
d~x · d~x = dr^2 + r^2 dΩ_2^2 ,
~x · d~x = r dr , (14.3)

where

dΩ_2^2 = dθ^2 + sin^2θ dφ^2 . (14.4)

Any other thing we could build from the available ingredients would not respect spherical symmetry. Given the spherical symmetry of our ansatz, it is traditional to use spherical polar coordinates, in which the metric on the S^2 is round – throughout the spacetime. For now, we will allow the metric to have time dependence, but bear in mind that shortly we will find it is disallowed by the Einstein equations. We write the metric as
ds^2 = e^{2α''(t',r')} (dt')^2 − e^{2β''(t',r')} (dr')^2 − 2 e^{2γ''(t',r')} dt' dr' − e^{2δ''(t',r')} (r')^2 dΩ_2^2 . (14.5)
Next, we can change to a new radial coordinate r(t', r') by
r^2 = (r')^2 e^{2δ''(t',r')} . (14.6)
This r is often referred to as the areal radius, because r^2 is the thing in front of dΩ_2^2, the metric on the round two-sphere. Using this areal radius coordinate, we can then adjust the definitions of all functions dependent on time and radius accordingly, to new functions, single primed,
ds^2 = e^{2α'(t',r)} (dt')^2 − e^{2β'(t',r)} dr^2 − 2 e^{2γ'(t',r)} dt' dr − r^2 dΩ_2^2 . (14.7)

In order to get rid of the 2dt'dr term in this line element, we are going to have to work harder. Let us start by trying the simplest proposal for a new time coordinate,
dt =?? e^{2α(t',r)} dt' − e^{2γ(t',r)} dr . (14.8)
If we try to follow this path further, we will find that second mixed partial derivatives of the new t coordinate w.r.t. the old coordinates fail to commute, so the equation (14.8) above is inconsistent. (To see a simple example of how this process works when done right, try transforming from Cartesian coordinates (x, y) on the plane to polar coordinates (r, θ), and checking that mixed second partials commute.) Our simplest proposal for a coordinate change failed. Can we craft a better proposal? As you may recall from the general theory of ODEs/PDEs, the right strategy is to recruit an integrating factor, which here must be a function of both t0 and r: Φ(t0, r). We define a new time coordinate t(t0, r) by
dt = e^{2Φ(t',r)} [ e^{2α(t',r)} dt' − e^{2γ(t',r)} dr ] . (14.9)
The explicit factor of e^{2Φ(t',r)} in front of the [...] piece we wanted is designed precisely such that the right hand side of the above expression is an exact differential. In this case, it can be shown that we can always find such a Φ. Then, using the above equations, we obtain

e^{−4Φ} dt^2 = e^{4α} (dt')^2 − 2 e^{2(α+γ)} dt' dr + e^{4γ} dr^2 . (14.10)

Rearranging this and forming our (dt')^2 and 2dt'dr pieces gives
e^{2α(t',r)} (dt')^2 − 2 e^{2γ(t',r)} dt' dr = e^{−2α(t',r)−4Φ(t',r)} dt^2 − e^{−2α(t',r)+4γ(t',r)} dr^2 . (14.11)
Woohoo – the cross terms in the metric are gone! Redefining our metric ansatz functions according to
e^{2α} = e^{−2α'−4Φ'} , e^{2β} = e^{2β'} + e^{−2α'+4γ'} , (14.12)

gives

ds^2 = e^{2α(t,r)} dt^2 − e^{2β(t,r)} dr^2 − r^2 dΩ_2^2 . (14.13)

The point of all this wrestling with differentials was to show that we can always choose a coordinate system in which off-diagonal metric components are absent, even if our spherically symmetric system is time dependent.
Our next task is going to be to show that the time dependence in the metric functions also has to drop out. For this part, we will need to use the equations of motion for the metric tensor field on spacetime. For the Einstein equations, we need to compute Christoffels to get Riemanns, which we can then contract to get Ricci components, e.g. via SymPy code you wrote for HW1+HW2. We get
Γ^t_{tt} = {∂_t α} ,               Γ^t_{tr} = ∂_r α ,
Γ^t_{rr} = {e^{2(β−α)} ∂_t β} ,    Γ^r_{tt} = e^{2(α−β)} ∂_r α ,
Γ^r_{tr} = {∂_t β} ,               Γ^r_{rr} = ∂_r β ,
Γ^r_{θθ} = −r e^{−2β} ,            Γ^r_{φφ} = −r sin^2θ e^{−2β} ,
Γ^θ_{rθ} = 1/r ,                   Γ^θ_{φφ} = − sin θ cos θ ,
Γ^φ_{rφ} = 1/r ,                   Γ^φ_{θφ} = cos θ / sin θ . (14.14)
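A SymPy computation along the lines of the HW1+HW2 code reproduces this table. A minimal sketch with a few spot checks (index order: 0 = t, 1 = r, 2 = θ, 3 = φ):

```python
import sympy as sp

t, r, th, ph = sp.symbols('t r theta phi')
coords = [t, r, th, ph]
alpha = sp.Function('alpha')(t, r)
beta = sp.Function('beta')(t, r)
n = 4

g = sp.diag(sp.exp(2*alpha), -sp.exp(2*beta), -r**2, -r**2*sp.sin(th)**2)
ginv = g.inv()

# Christoffels Gamma^m_{bc} from the usual metric-derivative formula
Gamma = [[[sp.simplify(sum(ginv[m, d]*(sp.diff(g[d, b], coords[c])
           + sp.diff(g[d, c], coords[b]) - sp.diff(g[b, c], coords[d]))
           for d in range(n))/2) for c in range(n)] for b in range(n)]
         for m in range(n)]

# spot checks against eq. (14.14)
assert sp.simplify(Gamma[0][0][1] - sp.diff(alpha, r)) == 0             # Gamma^t_{tr}
assert sp.simplify(Gamma[1][0][0]
                   - sp.exp(2*(alpha - beta))*sp.diff(alpha, r)) == 0   # Gamma^r_{tt}
assert sp.simplify(Gamma[1][0][1] - sp.diff(beta, t)) == 0              # Gamma^r_{tr}
assert sp.simplify(Gamma[2][3][3] + sp.sin(th)*sp.cos(th)) == 0         # Gamma^theta_{phi phi}
```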
Note that the pieces involving ∂_t have been highlighted with {...} in the above equation so you can clearly see the effect of allowing time dependence. For the Ricci tensor, we obtain

R_tt = e^{2(α−β)} [ −∂_r^2 α − (∂_r α)^2 + ∂_r α ∂_r β − (2/r) ∂_r α ] − ∂_t^2 β + ∂_t α ∂_t β − (∂_t β)^2 ,
R_tr = (2/r) ∂_t β ,
R_rr = ∂_r^2 α + (∂_r α)^2 − ∂_r α ∂_r β − (2/r) ∂_r β + e^{2(β−α)} [ −∂_t^2 β − (∂_t β)^2 + ∂_t α ∂_t β ] ,
R_θθ = e^{−2β} (r ∂_r β − r ∂_r α − 1) + 1 ,
R_φφ = sin^2θ R_θθ . (14.15)

All these tensors must be zero for us to have a solution of the vacuum Einstein equations. Note how some of the Einstein equations have turned out to be second order dynamical equations while others are first order constraints. This is a general feature in GR. First, let us look at R_tr. This must be zero, which demands of β(t, r) that
∂_t β(t, r) = 0 ⇒ β = β(r) . (14.16)

You can see by looking for the {...} pieces in the Riccis that many terms now drop out completely because β is a function of r only. Obviously, this simplifies our life quite a lot! Second, let us notice that the R_θθ = 0 equation (a first order constraint equation) is relatively simple. Let us take a time derivative of it,
∂_t(R_θθ) = 0 = −2 (∂_t β) e^{−2β} [ r ∂_r β − r ∂_r α − 1 ] + e^{−2β} [ r ∂_t ∂_r (β − α) ] . (14.17)
But since ∂tβ = 0 by our Rtr = 0 equation, we have
e^{−2β} r ∂_t ∂_r (β − α) = 0 . (14.18)
Then, using what we know about β(r), we can partially integrate to get
α(t, r) = f(r) + g(t) . (14.19)
Notice how the only remaining place where we have time dependence is in the tt component of the metric. What a stroke of luck! This means that we can absorb it simply by doing a coordinate transformation involving only time (not radius or angular coordinates),
dt̃ = e^{g(t)} dt . (14.20)
Let us redefine our time coordinate to correspond to this t̃ (we drop the tilde, for notational clarity). Then we have

β = β(r) , α = α(r) . (14.21)

Third, let us look at the remaining (more complex) tt and rr Einstein equations,

0 = ∂_r^2 α + (∂_r α)^2 − ∂_r α ∂_r β + (2/r) ∂_r α , (14.22)
0 = −∂_r^2 α − (∂_r α)^2 + ∂_r α ∂_r β + (2/r) ∂_r β . (14.23)

By simply adding these equations together, we obtain
∂_r (α + β) = 0 . (14.24)
This means that

β(r) = const. − α(r) . (14.25)

This constant of integration can be absorbed into the time coordinate, so that β(r) = −α(r). Fourth, we can plug this expression for β(r) in terms of α(r) back into the R_θθ = 0 Einstein equation to obtain

[ 2r ∂_r α + 1 ] e^{2α} = 1 . (14.26)

By quick inspection you can see that this becomes