
PHY483F/1483F Relativity Theory I (2020-21) Department of Physics University of Toronto


Instructor: Prof. A.W. Peet

Sources:-

• M.P. Hobson, G.P. Efstathiou, and A.N. Lasenby, “General relativity: an introduction for physicists” (Cambridge University Press, 2005) [recommended textbook];

• Sean Carroll, “Spacetime and geometry: an introduction to general relativity” (Addison-Wesley, 2004);

• Ray d’Inverno, “Introducing Einstein’s relativity” (Oxford University Press, 1992);

• Jim Hartle, “Gravity: an introduction to Einstein’s general relativity” (Pearson, 2003);

• Bob Wald, “General relativity” (University of Chicago Press, 1984);

• Tomás Ortín, “Gravity and strings” (Cambridge University Press, 2004);

• Noel Doughty, “Lagrangian Interaction” (Westview Press, 1990);

• my personal notes over three decades.

Version: Monday 16th November, 2020 @ 10:55

Licence: Creative Commons Attribution-NonCommercial-NoDerivs Canada 2.5

Contents

1 R10Sep
1.1 Invitation to General Relativity
1.2 Course website

2 M14Sep
2.1 Galilean relativity, 3-vectors in Euclidean space, and index notation

3 R17Sep
3.1 Special relativity and 4-vectors in Minkowski spacetime
3.2 Partial derivative 4-vector

4 M21Sep
4.1 Relativistic particle: position, momentum, acceleration 4-vectors
4.2 […]: 4-vector potential and field strength

5 R24Sep
5.1 Constant relativistic acceleration and the […]
5.2 The […]
5.3 Spacetime as a curved Riemannian […]

6 M28Sep
6.1 […] vectors in curved spacetime
6.2 […] in curved spacetime
6.3 Rules for tensor index gymnastics

7 R01Oct
7.1 Building a […]
7.2 How basis vectors change: the role of the affine connection
7.3 The covariant derivative and […] transport

8 M05Oct
8.1 The geodesic equation for test particle motion in curved spacetime
8.2 Example computation for affine connection and geodesic equations

9 R08Oct
9.1 Spacetime […]
9.2 The Riemann tensor
9.3 Example computations for Riemann

10 R15Oct
10.1 […]
10.2 Tidal […] and taking the Newtonian limit for Christoffels

11 M19Oct
11.1 Newtonian limit for Riemann
11.2 Riemann and the Bianchi identity
11.3 The information in Riemann

12 R22Oct
12.1 Lie derivatives
12.2 Killing vectors and tensors

13 M26Oct
13.1 Maximally symmetric […]
13.2 Einstein’s equations

14 R29Oct
14.1 Birkhoff’s theorem and the Schwarzschild […]

15 M02Nov
15.1 TOV […] for a star
15.2 […] of Schwarzschild

16 R05Nov
16.1 […] of Schwarzschild

17 M16Nov
17.1 Charged black holes
17.2 Rotating black holes

18 R19Nov
18.1 The Kerr solution
18.2 The Penrose process

19 M23Nov
19.1 Gravitational […]
19.2 Planetary perihelion precession

20 R26Nov
20.1 Bending of […]
20.2 Radar echoes

21 M30Nov
21.1 Geodesic precession of gyroscopes
21.2 Accretion disks

22 R03Dec
22.1 Finding the wave equation for metric perturbations
22.2 Solving the linearized Einstein equations

23 M07Dec
23.1 Gravitational plane waves
23.2 Energy loss from gravitational radiation

1 R10Sep

1.1 Invitation to General Relativity

From a particle physics perspective, the gravitational force is the weakest of the four known forces. So why does gravity dominate the dynamics of the universe? A simple first answer is that there is a lot of matter in the universe that gravitates. Even though the gravitational attraction between any two subatomic particles is weak, if you get enough of them together you can eventually make a black hole!

A slightly more sophisticated answer focuses on the range of the gravitational force and what sources it. The only two long-range forces we know of in Nature are gravity and electromagnetism. By contrast, the strong nuclear force binding atomic nuclei and the weak nuclear force responsible for the fusion reaction powering our Sun are very short-range. Electromagnetic fields are sourced by charges and currents, but the universe is electrically neutral on average, so electromagnetism does not dominate its evolution. Gravity, on the other hand, is sourced by energy-momentum. Since everything has energy-momentum, even the graviton, you can never get away from gravity.

Newton wowed the world a third of a millennium ago with his Law of Universal Gravitation, which explained both celestial and terrestrial observations. Our focus in this course is on explaining Einstein’s famous General Theory of Relativity (GR), which is a generalization of both Newtonian gravity and Special Relativity that has proven useful for describing the dynamics of the cosmos. By the end of term, you will be familiar with Einstein’s famous equation for the gravitational field $g_{\mu\nu}(x)$,

$$ R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R + \Lambda g_{\mu\nu} = -8\pi G_N T_{\mu\nu} , \tag{1.1} $$

where $R_{\mu\nu}$ and $R$ involve (first and) second derivatives of $g_{\mu\nu}$, and $T_{\mu\nu}$ is the energy-momentum tensor of all non-gravitational fields, which are collectively known as matter fields. You will also understand how GR gives back Newton’s theory of gravity in the limit where speeds are small and spacetime is weakly curved.

You can think of Einstein’s GR as Gravity 2.0, built on the foundation of Gravity 1.0 established by Newton – an upgrade.

The name for this course is “Relativity Theory 1”. Another name by which it is commonly known is “GR 1”, which stands for “General Relativity 1”. The main thing we learn in this course is how to write the dynamical equations of physics in the language of tensor analysis. Tensor analysis always sounds scary when you start, but it is not much more complicated conceptually than vector analysis, something you have been doing for years. We will show how to take your vector analysis knowledge from flat space and generalize it to spacetime. We begin with flat spacetime, which is pertinent to Special Relativity, and then we build on that knowledge to figure out how to write dynamical equations of physics even in curved spacetime.

Einstein taught us that the speed of light is constant and is the same in all inertial frames of reference. We will therefore adopt the relativistic convention that $c = 1$ throughout the course. This implies that time is measured in metres, and mass is measured in units of energy, e.g. $m_e = 511$ keV. We will keep all other physical constants explicit, such as Planck’s constant $\hbar$ characterizing the strength of quantum effects and the Newton constant $G_N$ characterizing the strength of gravity. If you feel queasy about missing factors of $c$ in any equation, they can always be easily restored by using dimensional analysis.

1.2 Course website

Please have a careful read of the course website at https://ap.io/483f/. It contains lots of vital and useful information for all students taking PHY483F/PHY1483F, including the syllabus, online lecture notes, and how to contact me. Almost everything you need to know about the course is contained in the pages listed, and in all the clickable links in those pages. The remaining tiny fraction of information that needs to be hidden behind a UofT firewall for class members only can be found on Quercus, in the Announcements (from the Prof.) and Modules (from the TA).

Remarks in my notes intended for more advanced or interested students are specially marked.

2 M14Sep

2.1 Galilean relativity, 3-vectors in Euclidean space, and index notation

Before we review some aspects of Special Relativity and introduce some new ones, let us begin by reminding ourselves of the non-relativistic version of relativity, also known as Galilean relativity. When we want to transform from one inertial frame of reference to another moving at relative velocity v, there are three things we must think about:

(a) how time intervals relate,

(b) how spatial position intervals relate,

(c) how velocities relate.

In Galilean relativity, all clocks are synchronized,

$$ dt' = dt , \tag{2.1} $$

displacements are related via

$$ dx' = dx - v\,dt , \tag{2.2} $$

and velocities $u = dx/dt$ compose by simple addition,

$$ u' = u - v , \tag{2.3} $$

where v is the relative velocity between the unprimed and primed frames of reference. Einstein upgraded these formulæ when he invented Special Relativity, and you have seen the results before: they are known as Lorentz Transformations. We will get to them soon enough – and we will show you how simple they can look when written in terms of rapidity rather than velocity. But for now, let us inspect more closely how 3-vectors work. This will serve as a pattern for the relativistic case.

Lots of things of interest in physics are vectors, which are in essence things that point. I like to say that a vector has a ‘leg’ that sticks out, telling you where it points. Mathematically, the vector components are what you get when you resolve the vector along an orthonormal basis.

In Special and General Relativity, we will need to be scrupulously careful to distinguish where we put our indices (up/down and left/right). For an arbitrary vector v, we label its components with an upstairs index: $v^1, v^2, \dots, v^d$, where $i = 1, \dots, d$ and $d$ is the spatial dimension. Note that the upstairs index $i$ used here is not a power; instead, it specifies which component $v^i$ is being discussed: the $i$th one. If you think of a contravariant vector as a column vector, the upstairs index $i$ denotes which row of the column vector you are talking about. If you need to take a power of a vector component, the GR convention is to write parentheses around it, e.g. $(v^1)^2$. Note also that it is common in GR literature to write the vector v as $v^i$ – technically, $v^i$ is a component of v, but letting the index show explicitly rather than suppressing it helps us remember its transformation properties.

Vectors provide a useful notational shorthand, preventing us from having to write out all the components explicitly every time we write a physics equation like $\vec{F} = m\vec{a}$. Tensor analysis in GR is nothing scary – it is the natural generalization of vector analysis to curved spacetime and multilegged objects. Its underlying idea is twofold:-

• In physics, the most useful dynamical variables transform in well-defined ways under coordinate transformations, and are known as tensors. Example: the momentum vector $p^i$.

• The laws of physics should be tensorial equations. A Newtonian example you will recognize is

$$ F^i = m a^i . \tag{2.4} $$

When we change coordinates, the components of tensors on both sides of the equation change, but the underlying physical relations between them do not.

The natural type of vector we defined above is called a contravariant vector. This is like a column vector. It has a natural counterpart called a covariant vector, also known as a dual vector. This is like a row vector. A covariant vector ω has components $\omega_i$; note that this is a downstairs index rather than an upstairs index. The index $i$ tells you which column of the row vector you are talking about. There is a natural inner product between contravariant vectors v and covariant vectors ω:

$$ \omega \cdot v = \sum_i \omega_i v^i . \tag{2.5} $$

A very useful convention that we will use throughout the course is the Einstein summation convention. This is a notational shorthand in which a repeated index is automatically summed over when it occurs precisely once upstairs and precisely once downstairs. This convention suppresses the unwieldy Σ signs so that it becomes easier to see the wood for the trees. The thing that signals that you are summing over an index is that it is repeated. Note that a repeated (summed over) index can appear precisely twice in any given term: if it occurs more times, the writer has made a mistake. Summing over a repeated index is also called index contraction, because what you get for the result has none of the summed-over indices remaining. In our $\omega \cdot v$ example above, the result is a scalar: a tensor with zero legs.

Why is it important to distinguish between contravariant and covariant vectors? In a nutshell, because they transform differently under coordinate transformations. Let us see how this works for a rotation. You may be used to writing a rotation of (say) a displacement vector as a $d \times d$ matrix $R$. Rotation matrices are orthogonal,

$$ R^{-1} = R^T . \tag{2.6} $$

Alternatively, we can say that they preserve the Euclidean norm in 3-space:

$$ R^T \mathbb{1}_3 R = \mathbb{1}_3 , \tag{2.7} $$

where $\mathbb{1}_3$ is the identity matrix. While $R$ transforms contravariant vectors v in Euclidean space as

$$ v \to v' = R \cdot v , \tag{2.8} $$

where the prime indicates the transformed vector and the unprimed vector is the original, the transpose $R^T$ transforms covariant vectors $v^T$ as

$$ v^T \to v'^T = v^T \cdot R^T . \tag{2.9} $$

However, we strongly discourage writing coordinate transformations in terms of matrices in future, and instead encourage you to get the hang of index notation. Once time is included, coordinates are curvilinear, and spacetime is physically curved, index notation and the Einstein summation convention will help us keep track of indices in a much more succinct way and therefore reduce the error rate when handling tensors. A rotation is expressed in index notation for the contravariant vector components as

$$ v^{i'} = R^{i'}{}_{j}\, v^j . \tag{2.10} $$

The $R^{i'}{}_{j}$ is the component of the rotation matrix from the $i'$th row and $j$th column. Note that the left-right placement of indices here is physically important, as well as the upstairs-downstairs placement. The physics reason why is that rotation matrices are not symmetric, so casually switching them makes no sense.

Let us write out the above transformation law more explicitly, so that you can see how it encodes matrix multiplication in a disciplined way. For a rotation of the vector v with components $v^i$ about the z-axis, it reads (footnote 1)

$$ \begin{aligned} v^{1'} &= R^{1'}{}_{1} v^1 + R^{1'}{}_{2} v^2 + R^{1'}{}_{3} v^3 = +\cos\theta\, v^1 + \sin\theta\, v^2 , \\ v^{2'} &= R^{2'}{}_{1} v^1 + R^{2'}{}_{2} v^2 + R^{2'}{}_{3} v^3 = -\sin\theta\, v^1 + \cos\theta\, v^2 , \\ v^{3'} &= R^{3'}{}_{1} v^1 + R^{3'}{}_{2} v^2 + R^{3'}{}_{3} v^3 = v^3 . \end{aligned} \tag{2.11} $$
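As a quick numerical illustration of eqs. (2.10) and (2.11), here is a minimal Python sketch that performs the sum over the repeated index explicitly and checks that the Euclidean norm is unchanged. The function names (`rotation_z`, `transform`, `norm_squared`) and the sample numbers are illustrative choices, not part of the notes.

```python
import math

def rotation_z(theta):
    """Components R^{i'}_j of a rotation about the z-axis, as in eq. (2.11)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(R, v):
    """v^{i'} = R^{i'}_j v^j, with the sum over the repeated index j written out."""
    d = len(v)
    return [sum(R[i][j] * v[j] for j in range(d)) for i in range(d)]

def norm_squared(v):
    """Euclidean norm squared, delta_{ij} v^i v^j, in Cartesian coordinates."""
    return sum(x * x for x in v)

theta = 0.3                  # arbitrary sample angle in radians
v = [1.0, 2.0, 3.0]          # arbitrary sample components v^i
v_prime = transform(rotation_z(theta), v)

print(v_prime)
print(norm_squared(v), norm_squared(v_prime))  # equal up to rounding
```

The row index of the nested list plays the role of the primed upstairs index and the column index plays the role of the unprimed downstairs index, which is why the left-right placement matters.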

For the covariant vector ω specified by its components $\omega_i$, we have

$$ \omega_{i'} = R_{i'}{}^{j}\, \omega_j . \tag{2.12} $$

Note that the left-right and upstairs-downstairs index placements are deliberate and physically meaningful here, as for the contravariant case earlier. $R_{i'}{}^{j}$ is the $i'$th column of the $j$th row of $R^T$. Rotations are interesting mathematically because they preserve the Euclidean norm. In index notation, this condition reads

$$ R_{j'}{}^{i}\, \delta^{j'}{}_{k'}\, R^{k'}{}_{\ell} = \delta^{i}{}_{\ell} , \tag{2.13} $$

where

$$ \delta^{i}{}_{\ell} = \begin{cases} 1, & i = \ell \\ 0, & i \neq \ell \end{cases} . \tag{2.14} $$

Mathematically, the tensor $\delta^i_j$ with one contravariant index and one covariant index is the identity matrix. Since the identity is a symmetric matrix, we do not have to be picky about left-right index placement on Kronecker deltas like we do for other tensors. More generically, we must be very careful about left-right and up-down index placement on our tensors. A physical implication of the above formula is that even if you rotate your velocity vector, you still get the same kinetic energy for a non-relativistic particle, because the kinetic energy is proportional to the squared norm of the velocity vector.

Footnote 1: Nitpickers: please note that, like Carroll, we take the point of view that the vector stays fixed while the coordinate system changes under the relevant transformation.

Using the components of $R^{i'}{}_{j}$ given above, use eq. (2.13) to check that you have correctly identified the $R_{\ell}{}^{k'}$. Then using the transformation laws for contravariant and covariant vectors, eqs. (2.10) and (2.12), show that covariant vector components are transformed in the −θ direction while the contravariant vector components are transformed in the +θ direction.

This sign difference might seem rather trivial, but it is anything but! It is our first glimpse of why we do need to be very careful to distinguish between upstairs and downstairs indices for vectors – and more generally for tensors, which are multilegged generalizations of contravariant and covariant vectors.

In physics, we often want to find the norm (length) of a vector, or the angle between two distinct vectors via the dot product. In index notation, what we need to be able to do is to convert contravariant vectors into covariant vectors or vice versa. To achieve this, we need extra structure on the space (or later on, the spacetime) in which the vectors live, called a metric tensor $g$, which must be invertible. In flat Euclidean 3-space in Cartesian coordinates, the role of the metric tensor is played by the Kronecker delta tensor, which is the identity matrix in both upstairs and downstairs components,

$$ \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} , \tag{2.15} $$

and

$$ \delta^{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} . \tag{2.16} $$

The one-up-one-down version $\delta^i_j$ also has the same numerical values. The upstairs metric is the (left and right) inverse of the downstairs metric,

$$ g^{ij} g_{jk} = g^{i}{}_{k} = \delta^{i}_{k} , \qquad g_{ij} g^{jk} = g_{i}{}^{k} = \delta_{i}^{k} . \tag{2.17} $$

As you can see, this equation is easily satisfied for flat Euclidean 3-space in Cartesian coordinates. Soon we will see that this equation must also hold when using more general coordinate systems or when operating in curved spacetime – or both. Converting between contravariant and covariant components of vectors is achieved via

$$ v_i = g_{ij} v^j , \tag{2.18} $$

and

$$ \omega^i = g^{ij} \omega_j , \tag{2.19} $$

where again we used the Einstein summation convention. The fact that the metric is so trivial in flat Euclidean 3-space in Cartesian coordinates is why people are often very careless with index placement – if you write it out explicitly you will see that (for example) $v^2 = v_2$ because $g_{ij} = \delta_{ij}$ and $g^{ij} = \delta^{ij}$. Reminder: if you need to take a power of a vector component, put parentheses around it to make it unambiguous. For example, $(v^1)^2$ means the square of the first contravariant component of the vector v.

Notice how if we have a physics equation for contravariant vectors, say $F^j = m a^j$, we can multiply both sides by $\delta_{ij}$ (whose components are just numbers) and sum over the repeated index $j$ to obtain a covariant vector equation $F_i = m a_i$. This chain of logic only works because we have a metric available – otherwise, we would have no way of converting upstairs-index equations to downstairs-index ones.

As you will recall from Newtonian physics, the kinetic energy is proportional to the square of the velocity vector, i.e. its squared norm $|v|^2 = \delta_{ij} v^i v^j$. This is a scalar, i.e. invariant under rotations. So is the inner product or dot product of any two contravariant vectors $a^i$ and $b^j$, formed by using the metric tensor,

$$ a \cdot b = g_{ij}\, a^i b^j . \tag{2.20} $$

In 3 spatial dimensions only, we can build another 3-vector out of two contravariant vectors $a^i$ and $b^j$ by taking an outer product or cross product. We will be able to write an expression for this in our handy index notation by making use of another new object known as the permutation pseudotensor $E_{ijk}$, which is antisymmetric under interchange of any two of its indices,

$$ E_{ijk} = \begin{cases} +1 , & (ijk) = \text{even permutation of } (123) , \\ -1 , & (ijk) = \text{odd permutation of } (123) , \\ \phantom{+}0 , & \text{otherwise} . \end{cases} \tag{2.21} $$

There is also an upstairs version $E^{ijk}$ with the same numerical values, in flat Euclidean 3-space. What is this permutation pseudotensor used for? Well, one of the first things it can do is to help us find the determinant of a matrix,

$$ \det(M) = E^{ijk}\, M^{1}{}_{i}\, M^{2}{}_{j}\, M^{3}{}_{k} . \tag{2.22} $$

Applying this formula to an orthogonal transformation matrix allows us to discover that $E_{ijk}$ is a pseudotensor, rather than a proper tensor, because it does not flip sign under a parity transformation $x^i \to -x^i$. It is also invariant under rotations and translations.

When handling expressions containing $E_{ijk}$ or its upstairs version $E^{ijk}$, we may need to know what contractions of these beasts look like. The identities it obeys are very handy to know,

$$ \begin{aligned} E^{ijk} E_{ijk} &= 3! , \\ E^{ijk} E_{ij\ell} &= 2!\, \delta^{k}_{\ell} , \\ E^{ijk} E_{imn} &= 1!\, \delta^{jk}_{mn} , \\ E^{ijk} E_{\ell mn} &= 0!\, \delta^{ijk}_{\ell mn} , \end{aligned} \tag{2.23} $$

where the generalized Kronecker deltas are defined by

$$ \delta^{jk}_{mn} \equiv \begin{vmatrix} \delta^j_m & \delta^k_m \\ \delta^j_n & \delta^k_n \end{vmatrix} = \delta^j_m \delta^k_n - \delta^j_n \delta^k_m , \tag{2.24} $$

and

$$ \delta^{ijk}_{\ell mn} \equiv \begin{vmatrix} \delta^i_\ell & \delta^j_\ell & \delta^k_\ell \\ \delta^i_m & \delta^j_m & \delta^k_m \\ \delta^i_n & \delta^j_n & \delta^k_n \end{vmatrix} = +\delta^i_\ell \delta^j_m \delta^k_n + \delta^i_m \delta^j_n \delta^k_\ell + \delta^i_n \delta^j_\ell \delta^k_m - \delta^i_m \delta^j_\ell \delta^k_n - \delta^i_\ell \delta^j_n \delta^k_m - \delta^i_n \delta^j_m \delta^k_\ell . \tag{2.25} $$
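The determinant formula (2.22) and the contraction identities (2.23) and (2.24) are easy to check by brute force. The following Python sketch does so using only the standard library; indices run over 0, 1, 2 rather than 1, 2, 3, and the helper names (`levi_civita`, `delta`, `det3`) and the sample matrix are illustrative assumptions.

```python
from itertools import product

def levi_civita(i, j, k):
    """E_{ijk} for indices in {0, 1, 2}: +1, -1 or 0 as in eq. (2.21)."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def det3(M):
    """det(M) = E^{ijk} M^1_i M^2_j M^3_k, eq. (2.22), summed over i, j, k."""
    return sum(levi_civita(i, j, k) * M[0][i] * M[1][j] * M[2][k]
               for i, j, k in product(range(3), repeat=3))

# Full contraction: E^{ijk} E_{ijk} = 3!
full_contraction = sum(levi_civita(i, j, k) ** 2
                       for i, j, k in product(range(3), repeat=3))

# Single contraction: E^{ijk} E_{imn} = delta^j_m delta^k_n - delta^j_n delta^k_m
single_contraction_ok = all(
    sum(levi_civita(i, j, k) * levi_civita(i, m, n) for i in range(3))
    == delta(j, m) * delta(k, n) - delta(j, n) * delta(k, m)
    for j, k, m, n in product(range(3), repeat=4)
)

M = [[2, 0, 1],
     [1, 3, 0],
     [0, 1, 4]]   # arbitrary sample matrix

print(det3(M))                 # 25
print(full_contraction)        # 6
print(single_contraction_ok)   # True
```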

Using all of this, we can finally write out the components of the outer (cross) product in index notation in 3D,

$$ (a \times b)_i = E_{ijk}\, a^j b^k . \tag{2.26} $$

Notice that in writing the outer product here, we have again used the Einstein summation convention – twice – on both $j$ and $k$. This makes the expression more compact. Also, since this is a bona fide 3-vector equation, we can raise the index using our (spatial, trivial) metric. As you should convince yourself, the result is $(a \times b)^i = E^{ijk} a_j b_k$. We can use these expressions to find, for example,

$$ \begin{aligned} [a \times (b \times c)]_\ell &= E_{\ell mn}\, a^m (b \times c)^n \\ &= E_{\ell mn}\, a^m E^{npq}\, b_p c_q \\ &= \delta^{pq}_{\ell m}\, a^m b_p c_q \\ &= (\delta^p_\ell \delta^q_m - \delta^p_m \delta^q_\ell)\, a^m b_p c_q \\ &= a^q b_\ell c_q - a^p b_p c_\ell \\ &= [b(a \cdot c) - c(a \cdot b)]_\ell , \end{aligned} \tag{2.27} $$

which should look familiar from vector calculus classes earlier in your education.

In general spacetime dimension $d$, we can define a $d$-dimensional version of the antisymmetric permutation pseudotensor with $d$ indices. Then, the outer product between two contravariant vectors $a^i$ and $b^j$ is more properly thought of as a pseudotensor with $(d - 2)$ legs, because it is formed via the contraction $E_{i_1 i_2 \dots i_d}\, a^{i_1} b^{i_2}$ of two vectors a and b with the $d$-legged $E$ pseudotensor. Note that $E$ is defined in any dimension as long as the manifold is orientable.
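To see eqs. (2.26) and (2.27) at work on concrete numbers, here is a small Python sketch that builds the cross product from the permutation symbol and compares both sides of the double cross product identity. The helper names and the sample vectors are illustrative; indices run over 0, 1, 2, and since the metric is trivial in Cartesian coordinates the code does not distinguish upstairs from downstairs components.

```python
from itertools import product

def levi_civita(i, j, k):
    """E_{ijk} for indices in {0, 1, 2}: +1, -1 or 0."""
    return (i - j) * (j - k) * (k - i) // 2

def cross(a, b):
    """(a x b)_i = E_{ijk} a^j b^k, eq. (2.26), summed over j and k."""
    return [sum(levi_civita(i, j, k) * a[j] * b[k]
                for j, k in product(range(3), repeat=2))
            for i in range(3)]

def dot(a, b):
    """a . b = delta_{ij} a^i b^j in Cartesian coordinates."""
    return sum(x * y for x, y in zip(a, b))

a, b, c = [1, 2, 3], [-1, 0, 4], [2, 5, -3]   # arbitrary sample vectors

lhs = cross(a, cross(b, c))
rhs = [b[l] * dot(a, c) - c[l] * dot(a, b) for l in range(3)]

print(lhs)  # [-25, -55, 45]
print(rhs)  # [-25, -55, 45]
```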

3 R17Sep

3.1 Special relativity and 4-vectors in Minkowski spacetime

Let us now turn to studying how to generalize spatial vectors in flat Euclidean space to spacetime vectors in flat Minkowski spacetime, in Cartesian coordinates to begin with. The bedrock principle of the constancy of the speed of light has some fairly dramatic physics implications, chief among them being time dilation and length contraction. Both of these ideas have been rigorously tested experimentally, e.g. in particle collider and cosmic ray contexts, and found to hold true. Also, velocities no longer add simply, obeying a composition law that looks pretty mysterious the first time you see it. Let me now demystify this and Lorentz boosts by using a clever parametrization.

When you first saw Lorentz boosts, probably at the end of first year Newtonian mechanics or in a second year modern physics course, they probably looked like the following. For an infinitesimal Lorentz boost in the x direction in units where $c = 1$,

$$ dt' = \gamma_v (dt - v\,dx) , \quad dx' = \gamma_v (dx - v\,dt) , \quad dy' = dy , \quad dz' = dz , \tag{3.1} $$

where $\gamma_v \equiv 1/\sqrt{1 - v^2}$. Using these expressions, you can easily figure out how velocities transform for a Lorentz boost along the x axis,

$$ u'_x = \frac{dx'}{dt'} = \frac{u_x - v}{1 - u_x v} , \quad u'_y = \frac{dy'}{dt'} = \frac{u_y}{\gamma_v (1 - u_x v)} , \quad u'_z = \frac{dz'}{dt'} = \frac{u_z}{\gamma_v (1 - u_x v)} . \tag{3.2} $$

You can also work out the 3-accelerations

$$ a'_x = \frac{a_x}{\gamma_v^3 (1 - u_x v)^3} , \quad a'_y = \frac{a_y}{\gamma_v^2 (1 - u_x v)^2} + \frac{(u_y v)\, a_x}{\gamma_v^2 (1 - u_x v)^3} , \quad a'_z = \frac{a_z}{\gamma_v^2 (1 - u_x v)^2} + \frac{(u_z v)\, a_x}{\gamma_v^2 (1 - u_x v)^3} . \tag{3.3} $$

Notice that, unlike for Galilean relativity, acceleration is not an invariant in Special Relativity. But whether or not someone is accelerating is an absolute concept: if the acceleration is zero in one frame of reference, then it is also zero in a Lorentz boosted frame of reference. Note: these formulæ are written in older notation that we will not continue using later in this course.

We can write Lorentz boost formulæ in a much prettier way by using the rapidity ζ, which is defined by

$$ v = \tanh \zeta . \tag{3.4} $$

Note that while the speed ranges over $v \in (-1, +1)$, the rapidity ranges over $\zeta \in (-\infty, +\infty)$. The really awesome thing about rapidity is that it is additive for boosts along the same axis. To add the rapidities, you literally just add them, like for rotation angles: $\zeta_{\rm tot} = \zeta_1 + \zeta_2$. It is a simple exercise to recover the relativistic velocity addition law from the definition of rapidity and its additive nature. Give it a go yourself to be sure you understand.

Now we are in a position to show you a Lorentz boost along the x direction in rapidity variables – da-daah!

$$ dt' = +\cosh\zeta\, dt - \sinh\zeta\, dx , \quad dx' = -\sinh\zeta\, dt + \cosh\zeta\, dx , \quad dy' = dy , \quad dz' = dz . \tag{3.5} $$

This looks a bit like a rotation, except for two physically important differences: (1) it mixes temporal and spatial intervals, rather than different spatial intervals, and (2) it involves hyperbolic trig functions, rather than normal trig functions. Another difference is that it is not the 3D Euclidean norm that is preserved under Lorentz transformations, but rather the 4D Minkowski norm, also known as the invariant interval

$$ ds^2 = dt^2 - dx^2 - dy^2 - dz^2 . \tag{3.6} $$
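The claims about rapidity can be checked numerically. The sketch below, written in Python with hypothetical helper names (`boost`, `interval`, `compose_velocities`) and arbitrary sample numbers, applies the boost (3.5) in the (t, x) plane, confirms that the interval (3.6) is unchanged, and confirms that adding rapidities reproduces the familiar velocity addition law for collinear velocities.

```python
import math

def boost(zeta, dt, dx):
    """Lorentz boost of eq. (3.5) restricted to the (t, x) plane, with c = 1."""
    return (math.cosh(zeta) * dt - math.sinh(zeta) * dx,
            -math.sinh(zeta) * dt + math.cosh(zeta) * dx)

def interval(dt, dx):
    """ds^2 = dt^2 - dx^2, i.e. eq. (3.6) with dy = dz = 0."""
    return dt * dt - dx * dx

def compose_velocities(v1, v2):
    """Combine collinear velocities by adding rapidities, using v = tanh(zeta)."""
    return math.tanh(math.atanh(v1) + math.atanh(v2))

dt, dx = 2.0, 0.5                       # arbitrary sample interval
dt_p, dx_p = boost(0.7, dt, dx)
print(interval(dt, dx), interval(dt_p, dx_p))   # both 3.75 up to rounding

# Two successive boosts along x equal one boost with the summed rapidity.
twice = boost(0.4, *boost(0.7, dt, dx))
once = boost(1.1, dt, dx)
print(twice, once)

v1, v2 = 0.6, 0.7
print(compose_velocities(v1, v2), (v1 + v2) / (1 + v1 * v2))
```

Note that the composed speed stays below 1 however large the individual rapidities become, because tanh never exceeds 1 in magnitude.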

The invariant interval so defined is positive if the points are timelike separated, negative if they are spacelike separated, and zero if they are null separated. This classification works regardless of which inertial reference frame you use, because it is invariant under symmetry transformations of Minkowski spacetime: rotations, [Lorentz] boosts, and translations.

The invariant interval $ds^2 = dt^2 - |d\vec{x}|^2$ gives rise to the concept of a light cone. For a point $p$, this is the cone defined by all light rays emanating from $p$ into the future or the past. Points that are timelike separated from $p$ are inside its light cone (positive $ds^2$), those that are spacelike separated from $p$ are outside it (negative $ds^2$), and those that are null separated from $p$ (zero $ds^2$) lie on the light cone itself. Put more colloquially, if you had just died at point $p$, then your past light cone and its interior would contain all possible suspects for who had murdered you. If on the other hand you had set off a bomb at $p$, then your future light cone and its interior would contain beings you could have killed (using any form of explosive, TNT and photon torpedoes included!).

[Figure: pictorial representation of the light cone, for the D = 2 + 1 case. Figure credit: Wikipedia.]

Note that light rays are conventionally drawn at a 45 degree angle on spacetime diagrams, in flat spacetime, to represent the fact that $c = 1$. In curved spacetime, the story gets more complicated, because the spacetime metric varies with position, rather than being constant.

It might be worth reminding you of the definition of proper time. To set the context, consider two events that are timelike separated. The proper time between two spacetime events measures the time elapsed as seen by an observer for whom the two events occur at the same spatial position. In our signature convention, the invariant interval is positive in the timelike case, so $ds^2 = d\tau^2$.
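The classification of separations and the definition of proper time translate directly into a few lines of code. This Python sketch assumes finite coordinate differences between two events joined by a straight (inertial) worldline in flat spacetime with $c = 1$; the function names and the tolerance parameter are illustrative assumptions.

```python
import math

def interval_squared(dt, dx, dy, dz):
    """ds^2 = dt^2 - dx^2 - dy^2 - dz^2 in the mostly minus convention, c = 1."""
    return dt ** 2 - dx ** 2 - dy ** 2 - dz ** 2

def classify(ds2, tol=1e-12):
    """Return 'timelike', 'spacelike' or 'null' according to the sign of ds^2."""
    if ds2 > tol:
        return "timelike"
    if ds2 < -tol:
        return "spacelike"
    return "null"

def proper_time(dt, dx, dy, dz):
    """Proper time between timelike (or null) separated events: tau = sqrt(ds^2)."""
    ds2 = interval_squared(dt, dx, dy, dz)
    if ds2 < 0:
        raise ValueError("proper time is only defined for timelike or null separation")
    return math.sqrt(ds2)

print(classify(interval_squared(5.0, 3.0, 0.0, 0.0)))  # timelike
print(classify(interval_squared(1.0, 1.0, 0.0, 0.0)))  # null
print(classify(interval_squared(1.0, 2.0, 0.0, 0.0)))  # spacelike
print(proper_time(5.0, 3.0, 0.0, 0.0))                 # 4.0
```

In the timelike example, an observer moving at speed 3/5 between the two events ages 4 units while 5 units of coordinate time elapse, which is the usual time dilation factor of 5/4.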

Motivated by the form of the matrices representing Lorentz boost transformations, let us define a relativistic 4-vector x with components $x^\mu$ given by

$$ x^0 = (c)t , \qquad x^i = (\vec{x})^i . \tag{3.7} $$

Here, $\mu \in \{0, 1, \dots, d\}$. Notice how time is totally different conceptually than it was in Galilean relativity: it is the zeroth position coordinate, not an invariant. We can then define the invariant interval as

$$ ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu . \tag{3.8} $$

In flat Minkowski spacetime in Cartesian coordinates, the metric tensor has downstairs components

$$ \eta_{\mu\nu} = \begin{cases} +1, & \mu = \nu = 0 \\ -1, & \mu = \nu \in \{1, 2, \dots, d\} \\ \phantom{+}0, & \mu \neq \nu \end{cases} . \tag{3.9} $$

Its upstairs counterpart, the inverse, has components

$$ \eta^{\mu\nu} = \begin{cases} +1, & \mu = \nu = 0 \\ -1, & \mu = \nu \in \{1, 2, \dots, d\} \\ \phantom{+}0, & \mu \neq \nu \end{cases} . \tag{3.10} $$

The equations expressing the fact that the upstairs and downstairs Minkowski metrics are inverses of each other are

$$ \eta^{\alpha\beta} \eta_{\beta\gamma} = \eta^{\alpha}{}_{\gamma} = \delta^{\alpha}_{\gamma} , \qquad \eta_{\alpha\beta} \eta^{\beta\gamma} = \eta_{\alpha}{}^{\gamma} = \delta_{\alpha}^{\gamma} . \tag{3.11} $$

Again, we have used the Einstein summation convention where repeated indices are summed over. Note that we have chosen the mostly minus signature convention here. Be aware that formulæ that you may obtain from various GR textbooks may have been written in the opposite sign convention. This can be quite annoying when you are trying to track down minus sign errors in a calculation. HEL has a useful table on p.193 outlining key signature convention differences with d’Inverno, Misner-Thorne-Wheeler and Weinberg.

The Minkowski metric tensor η is useful for raising and lowering indices. Specifically, for a contravariant vector $V^\nu$ we can find its covariant components $V_\mu$ by contracting with $\eta_{\mu\nu}$:

$$ V_\mu = \eta_{\mu\nu} V^\nu . \tag{3.12} $$

Contracting an index means repeating it (precisely once) and summing over it. For example, in the above equation, the index ν is contracted, while the index µ is not. Let us calculate one component, $V_0$.

$$ \begin{aligned} V_0 &= \eta_{0\nu} V^\nu \\ &= \eta_{00} V^0 + \eta_{01} V^1 + \eta_{02} V^2 + \eta_{03} V^3 \\ &= (+1)V^0 + (0)V^1 + (0)V^2 + (0)V^3 \\ &= +V^0 . \end{aligned} \tag{3.13} $$
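The same bookkeeping can be done for all four components at once. The Python sketch below lowers and raises an index with the mostly minus Minkowski metric in 3+1 dimensions and checks the inverse relation (3.11). The names `ETA_DOWN`, `ETA_UP`, `lower` and `raise_index`, and the sample components, are illustrative assumptions.

```python
# Minkowski metric in the mostly minus convention, eqs. (3.9) and (3.10).
ETA_DOWN = [[1, 0, 0, 0],
            [0, -1, 0, 0],
            [0, 0, -1, 0],
            [0, 0, 0, -1]]
ETA_UP = [row[:] for row in ETA_DOWN]  # same numerical values in Cartesian coordinates

def lower(V):
    """V_mu = eta_{mu nu} V^nu, eq. (3.12), with the sum over nu written out."""
    return [sum(ETA_DOWN[mu][nu] * V[nu] for nu in range(4)) for mu in range(4)]

def raise_index(W):
    """W^mu = eta^{mu nu} W_nu, the inverse operation."""
    return [sum(ETA_UP[mu][nu] * W[nu] for nu in range(4)) for mu in range(4)]

# Inverse relation (3.11): eta^{alpha beta} eta_{beta gamma} = delta^alpha_gamma
inverse_ok = all(
    sum(ETA_UP[a][b] * ETA_DOWN[b][c] for b in range(4)) == (1 if a == c else 0)
    for a in range(4) for c in range(4)
)

V = [2, 1, -3, 5]            # arbitrary sample contravariant components V^mu
V_low = lower(V)
print(V_low)                 # [2, -1, 3, -5]
print(raise_index(V_low))    # [2, 1, -3, 5]
print(sum(V_low[mu] * V[mu] for mu in range(4)))  # V_mu V^mu = 4 - 1 - 9 - 25 = -31
print(inverse_ok)            # True
```

Lowering leaves the time component alone and flips the sign of each spatial component, which is exactly what the explicit calculation of $V_0$ shows for the zeroth component.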

To find the contravariant components $\omega^\mu$ of a covariant vector $\omega_\nu$, we need to contract with the upstairs metric $\eta^{\mu\nu}$:-

$$ \omega^\mu = \eta^{\mu\nu} \omega_\nu . \tag{3.14} $$

Using the Minkowski metric, we can define a relativistic dot product between two contravariant vectors $a^\mu$ and $b^\nu$,

$$ a \cdot b = \eta_{\mu\nu}\, a^\mu b^\nu . \tag{3.15} $$

Before we move on to defining tensors in a more general way, let us make a couple of comments about the symmetry group of Minkowski spacetime for those who might be interested. We talked about rotations earlier, and noted that they preserved the norm of 3-vectors in flat Euclidean space. A rotation matrix is orthogonal and preserves the Euclidean 3-norm. The group of such matrices in 3D is known as SO(3). What is the analogous condition for 4-vectors in flat Minkowski spacetime? If you work out the algebra, you will find that both rotation and boost transformations written as 4 × 4 matrices Λ preserve the Minkowski norm, $\Lambda^T \eta \Lambda = \eta$, where η is the Minkowski metric tensor we defined above. In index notation,

$$ \Lambda_{k'}{}^{i}\, \eta^{k'}{}_{\ell'}\, \Lambda^{\ell'}{}_{j} = \eta^{i}{}_{j} . \tag{3.16} $$

Such matrices Λ in $D = d + 1$ dimensions are said to belong to the group SO(1, d). Rotation and boost matrices together are known as Lorentz transformations and they form a Lie (continuous) group known as the Lorentz group. If we include translations as well, the resulting group of transformations is known as the Poincaré group ISO(1, d) (mathematically, it is a semidirect product).

An interesting fact about the Poincaré group is that, without even looking at an experiment, you can prove theoretically that there are only two invariants (footnote 2): the mass $m$ and the intrinsic spin $s$. They are always the same in different inertial frames of reference related by rotations, boosts, or translations. This is why subatomic particles are differentiated by their mass and spin. The third label we use to distinguish subatomic particles, that also respects Poincaré invariance, is the set of conserved charges under whichever gauge symmetries are relevant, e.g. SU(3) × SU(2) × U(1) of the Standard Model of Particle Physics.

So, how do we define vectors and tensors in flat spacetime? The signature property of a vector and, more generally, of a tensor, is that it transforms in a specific and well-defined way under changes of reference frame, using the spacetime coordinates as the quintessential example. For a single-index tensor V with upstairs components $V^\mu$,

$$ V^{\mu'} = \frac{\partial x^{\mu'}}{\partial x^{\nu}}\, V^\nu = \Lambda^{\mu'}{}_{\nu}\, V^\nu , \tag{3.17} $$

which is known as a contravariant vector. There are also covariant vectors which obey

$$ V_{\mu'} = \frac{\partial x^{\nu}}{\partial x^{\mu'}}\, V_\nu = \Lambda_{\mu'}{}^{\nu}\, V_\nu . \tag{3.18} $$

Footnote 2: If you want to know why, and are unafraid of a little Lie group theory, you can find out by reading my PHY2404S notes at https://ap.io/archives/courses/2014-2020/2404s/qft.pdf. I also explain there why helicity is the relevant thing for massless particles and why spin has the character of an angular momentum for massive particles.

11 Look closely at the above two equations: they are materially different. In the equation for the contravariant vectors, the transformed coordinates x0 appear in the numerator of the Jacobian and the original coordinates x appear in the denominator in the transformation law; for covariant vectors the opposite happens. Mathematically speaking, contravariant vectors live in the , which is de- fined at every point in spacetime. Covariant vectors live in the cotangent space. They obey the usual axioms of vector spaces: associativity and commutativity of addition, ex- istence of identity and inverse under addition, distributivity, and compatibility with scalar multiplication. A rank (m, n) tensor has m contravariant indices and n covariant indices. In math- ematical language, a rank (m, n) tensor is a multilinear map from the direct product of m copies of the cotangent space with n copies of the tangent space into the real numbers. Alternatively, you can think of it as a machine with m slots for covariant vectors and n slots for contravariant vectors to make a scalar. For instance, a rank (0, 1) tensor (a covariant vector) is a machine with one slot for a contravariant vector (a rank (1, 0) tensor), which when inserted will produce a scalar (a rank (0, 0) tensor). The spacetime metric is a (0, 2) tensor; its inverse is a (2, 0) tensor. To find out how the components of a tensor transform, you use the transformation matrices on each index in turn,

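$$T^{\mu_1' \ldots \mu_m'}{}_{\nu_1' \ldots \nu_n'} = \frac{\partial x^{\mu_1'}}{\partial x^{\lambda_1}} \cdots \frac{\partial x^{\mu_m'}}{\partial x^{\lambda_m}}\, \frac{\partial x^{\sigma_1}}{\partial x^{\nu_1'}} \cdots \frac{\partial x^{\sigma_n}}{\partial x^{\nu_n'}}\; T^{\lambda_1 \ldots \lambda_m}{}_{\sigma_1 \ldots \sigma_n} . \qquad (3.19)$$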

Note that each of the indices $\lambda_1, \ldots, \lambda_m$ and $\sigma_1, \ldots, \sigma_n$ in this equation is repeated and summed over, keeping to the Einstein summation convention. So if you were to expand out all the components one by one, this would be a pretty long equation. It’s just as well we know how to represent it compactly using index notation!

The general idea of tensor analysis is that all laws of physics should be expressible in terms of tensor equations. In tensorial equations, indices can be raised and lowered, as long as this is done consistently on both sides. In other words, you should not raise an index on the left side of a tensor equation while failing to do the same on the right hand side. Every equation should have the same number and type of free indices on both sides. Tensorial equations hold equally well in any frame of reference, even though the components are different in different frames of reference.

Now let us turn to a few examples of the utility of tensors in Minkowski spacetime.
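As a concrete illustration of the transformation laws (3.17) and (3.18), here is a minimal Python sketch. It assumes units with c = 1, the mostly-minus metric used in these notes, and a boost along x; the function names and sample components are made up for illustration. It checks that the contraction of a contravariant with a covariant vector is the same number in both frames.

```python
import math

# Diagonal Minkowski metric eta_{mu nu} with signature (+,-,-,-), c = 1.
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

def boost_x(zeta):
    """Matrix Lambda^{mu'}_{nu} for a boost along x with rapidity zeta."""
    ch, sh = math.cosh(zeta), math.sinh(zeta)
    return [[ch, -sh, 0.0, 0.0],
            [-sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform_upper(lam, v_up):
    """Contravariant law: V^{mu'} = Lambda^{mu'}_{nu} V^{nu}."""
    return [sum(lam[m][n] * v_up[n] for n in range(4)) for m in range(4)]

def transform_lower(lam_inv, w_down):
    """Covariant law: W_{mu'} = (Lambda^{-1})^{nu}_{mu'} W_{nu}."""
    return [sum(lam_inv[n][m] * w_down[n] for n in range(4)) for m in range(4)]

def lower(v_up):
    """Lower an index with the Minkowski metric."""
    return [sum(ETA[m][n] * v_up[n] for n in range(4)) for m in range(4)]

def contract(v_up, w_down):
    """Scalar V^mu W_mu."""
    return sum(v * w for v, w in zip(v_up, w_down))

zeta = 0.7
lam = boost_x(zeta)
lam_inv = boost_x(-zeta)          # the inverse boost has opposite rapidity

V = [2.0, 0.3, -1.1, 0.5]         # arbitrary sample components V^mu
W = lower([1.5, -0.2, 0.4, 0.9])  # arbitrary sample components W_mu

V_new = transform_upper(lam, V)
W_new = transform_lower(lam_inv, W)

scalar_old = contract(V, W)
scalar_new = contract(V_new, W_new)
print(scalar_old, scalar_new)     # the two numbers agree up to rounding
```

The components of V and W change under the boost, but the scalar built from them does not, which is the sense in which tensorial equations hold in any frame.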

3.2 Partial derivative 4-vector

We can use Minkowski spacetime tensors to describe more objects than a massive particle. For starters, we can form a very important covariant vector out of derivatives,

$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu} . \qquad (3.20)$$

Its zeroth component describes the time derivative

$$\partial_0 = \frac{\partial}{\partial (c)t} = \frac{1}{(c)} \frac{\partial}{\partial t} , \qquad (3.21)$$

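while the spatial parts $\partial_i$ describe spatial derivatives. As you can see, $\partial_\mu$ arises naturally as a covariant vector. It is a straightforward and worthwhile exercise to show that in flat Minkowski spacetime,

$$\partial^\mu \partial_\mu = \frac{1}{(c)^2} \frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2} . \qquad (3.22)$$

This differential operator appears in relativistic wave equations, e.g. the Maxwell equations, or the equation of motion for a Klein-Gordon (scalar) field $\Phi$, which with our mostly-minus signature and $\hbar = c = 1$ reads $\partial^\mu \partial_\mu \Phi = -m^2 \Phi$.

For fun, let us try applying $-i\hbar\partial_\mu$ to a plane wave of the form $f(x) = e^{ik\cdot x}$ and see what happens.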

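$$-i\hbar\,\partial_\mu f(x) = -i\hbar\,\partial_\mu \exp(i k_\lambda x^\lambda) = -i\hbar\,[\, i k_\nu \delta^\nu_\mu \,] \exp(i k_\lambda x^\lambda) = \hbar k_\mu f(x) . \qquad (3.23)$$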

In other words, $-i\hbar\partial_\mu$ is playing the role of the momentum when acting on plane waves of the form $f(x) = e^{ik\cdot x}$, producing the eigenvalue $p_\mu = \hbar k_\mu$. In mathematical lingo, we say that the plane wave carries a representation of the translation group. If we only had discrete translation invariance up to a lattice vector instead, we would end up with Bloch waves instead of continuous spectrum plane waves.
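The eigenvalue statement in (3.23) can be checked numerically. The sketch below is only an illustration: it works in natural units with hbar = 1, uses arbitrary sample values for the components k_mu and the point x^mu, and replaces the derivative by a central finite difference.

```python
import cmath

HBAR = 1.0  # natural units (an assumption of this sketch)

def plane_wave(k_lower, x_upper):
    """f(x) = exp(i k_lambda x^lambda), with the index sum done explicitly."""
    phase = sum(k * x for k, x in zip(k_lower, x_upper))
    return cmath.exp(1j * phase)

def momentum_operator(k_lower, x_upper, mu, step=1e-6):
    """Apply -i hbar d/dx^mu to the plane wave via a central difference."""
    x_plus = list(x_upper)
    x_minus = list(x_upper)
    x_plus[mu] += step
    x_minus[mu] -= step
    derivative = (plane_wave(k_lower, x_plus) - plane_wave(k_lower, x_minus)) / (2 * step)
    return -1j * HBAR * derivative

k = [1.3, -0.4, 0.7, 0.2]   # sample covariant components k_mu
x = [0.5, 1.0, -2.0, 0.3]   # sample spacetime point x^mu

f = plane_wave(k, x)
# Ratio (-i hbar d_mu f) / f should reproduce hbar * k_mu for each mu.
eigenvalues = [momentum_operator(k, x, mu) / f for mu in range(4)]
for mu in range(4):
    print(mu, eigenvalues[mu], HBAR * k[mu])
```

Each ratio comes out real (up to rounding) and equal to the corresponding k_mu, independent of the point x chosen.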

4 M21Sep

4.1 Relativistic particle: position, momentum, acceleration 4-vectors

For any point particle, massive or massless, we can define its 4-momentum $p^\mu$ by

$$p^0 = E , \qquad p^i = (\vec{p}\,)^i , \qquad (4.1)$$

where $\vec{p}$ is the relativistic 3-momentum and $E$ is the relativistic energy. For a massive particle, we have

$$p^0 = \frac{m}{\sqrt{1 - v^2}} = m \cosh\zeta , \qquad p^i = \frac{m}{\sqrt{1 - v^2}}\, v^i = m \sinh\zeta\; \hat{v}^i . \qquad (4.2)$$

Check out for yourself what happens to components of the 4-momentum under Lorentz transformations. Notice that the relativistic norm of the momentum 4-vector is a constant,

$$p^\mu p_\mu = E^2 - |\vec{p}\,|^2 = m^2 . \qquad (4.3)$$

This is known as the mass shell relation. It holds for any particle, massless or massive. For massless particles like the photon, $E^2 = |\vec{p}\,|^2$. The 4-velocity is defined for massive particles only, via

$$u^\mu = \frac{dx^\mu(\tau)}{d\tau} , \qquad (4.4)$$

where $\tau$ is the proper time. It is related to the momentum 4-vector by

$$p^\mu = m u^\mu . \qquad (4.5)$$

Note that the 4-velocity satisfies

$$u^\mu u_\mu = +1 , \qquad (4.6)$$

by the mass shell constraint. Work out for yourself how the components of $u^\mu$ relate to the Newtonian velocity; you should find $u^0 = \gamma_v$, $u^i = \gamma_v (\vec{v}\,)^i$. The 4-acceleration is defined for massive particles only, via

$$a^\mu = \frac{du^\mu(\tau)}{d\tau} = \frac{d^2 x^\mu(\tau)}{d\tau^2} , \qquad (4.7)$$

where $\tau$ is proper time. Work out for yourself how this relativistic acceleration $a^\mu$ connects with the Newtonian version of acceleration $\vec{a}$ that you used in first-year undergrad physics.

What if we wanted to get a bit more sophisticated and write down an action principle for the point particle? First, let us do a lightning review of some salient points from classical mechanics. In a general dynamical system, our variables are the coordinates

$$q^a(\lambda) , \qquad (4.8)$$

where the index $a$ labels which coordinate we are discussing and $\lambda$ is a parameter that measures where we are along a particle path. For a non-relativistic system we will pick $\lambda = t$, Newtonian time. The velocities are $\dot{q}^a(\lambda)$, where the overdot means

$$\dot{\ } = \frac{d}{d\lambda} . \qquad (4.9)$$

We also have the expression for the canonical momenta in terms of the velocities,

$$p_a = \frac{\partial L}{\partial \dot{q}^a} , \qquad (4.10)$$

which are found from the action, which is a functional of the coordinates,

$$S = S[q^a(\lambda)] = \int d\lambda\; L(q^a(\lambda), \dot{q}^b(\lambda)) . \qquad (4.11)$$

Using the Lagrangian L and the expressions for the canonical momenta in terms of the velocities, we can form the Hamiltonian H which depends on the coordinates on phase space, the coordinates and their conjugate momenta,

$$H = H(q^a, p_b) = \sum_a p_a \dot{q}^a - L . \qquad (4.12)$$

The principle of least action $\delta S = 0$, combined with your knowledge of the calculus of variations, results in the Euler-Lagrange equations

$$\frac{\partial L}{\partial q^a} - \frac{d}{d\lambda} \left( \frac{\partial L}{\partial \dot{q}^a} \right) = 0 . \qquad (4.13)$$

These are equivalent to Hamilton’s equations,

$$\frac{dp_a}{d\lambda} = \{p_a, H\}_{\rm PB} , \qquad \frac{dq^a}{d\lambda} = \{q^a, H\}_{\rm PB} , \qquad (4.14)$$

where the Poisson bracket is defined via

$$\{f, g\}_{\rm PB} = \sum_a \left( \frac{\partial f}{\partial q^a} \frac{\partial g}{\partial p_a} - \frac{\partial g}{\partial q^a} \frac{\partial f}{\partial p_a} \right) . \qquad (4.15)$$

The basic dynamical variables for a non-relativistic point particle are $x^i(t)$, where $t$ is the non-relativistic time and $i = 1, 2, 3$. These are 3 bona fide independent functions. There is no issue about how to parametrize $t$, because all observers agree on time, by Galilean relativity. For a free nonrelativistic particle, the Lagrangian is just the kinetic energy,

$$S_{\rm nonrel} = \int dt\; \frac{1}{2} m |\vec{v}\,|^2 . \qquad (4.16)$$

This action respects rotational invariance in Euclidean 3-space. The canonical momenta are

$$p_i = m v_i , \qquad (4.17)$$

and the Hamiltonian is

$$H_{\rm NR} = \frac{1}{2m}\, p^i p_i . \qquad (4.18)$$

This is just the kinetic energy written in terms of momentum rather than velocity.

So, that was all well and good, but what about an action principle for the relativistic point particle? This will be an integral over the worldline of the particle, which is the path it traces out as it moves through spacetime. For relativistic point particles we cannot use the Newtonian kinetic energy, because it is not invariant under Lorentz boosts. We will have to use a generalization that respects Einsteinian relativity. The simplest guess for an action generalizing the above that people typically write for a massive particle is proportional to the arc length of the worldline,

$$S^{(1)}_{\rm rel} = -m \int d\tau\, \sqrt{\eta_{\mu\nu}\, \frac{dx^\mu(\tau)}{d\tau} \frac{dx^\nu(\tau)}{d\tau}} , \qquad (4.19)$$

where $\tau$ is the proper time (an invariant, unlike the time coordinate). This action has the benefit that, at low speeds, it reduces to the familiar non-relativistic action, up to an additive constant (try it yourself to see how, by doing a Taylor series). It assumes that the particle position $x^\mu(\tau)$ can be parametrized by the proper time $\tau$.

The drawback of this first choice of relativistic particle action is twofold. First, the particle is assumed to be massive, so that proper time can be used to parametrize the worldline. If we want to write down equations of motion for massless particles like photons, it will not suffice. Second, our 4 dynamical variables $x^\mu(\tau)$ are not actually independent functions. At all points along the evolution, they must obey the mass shell constraint, $\dot{x}^\mu \dot{x}_\mu = +1$, where the overdot means $d/d\tau$. As a result, only 3 of the 4 $x^\mu(\tau)$ are independent functions. It is a physics fib to pretend that all 4 can be independently varied in the action principle. This is why some people substitute the mass shell constraint into the Lagrangian to sidestep the problem.

Suppose further that we tried to use the above Lagrangian $L_\tau = -m \sqrt{\dot{x}^\mu \dot{x}_\mu}$ to find the canonical momenta and Hamiltonian. What we would end up with is $H_\tau \equiv 0$.

A related fact is that the geometric arc length Lagrangian is ‘singular’. What does this mean? Well, if we inspect the Euler-Lagrange equations for general $q^a(\lambda)$, we can rearrange them to see that

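$$\frac{\partial^2 L}{\partial \dot{q}^b \partial \dot{q}^a}\, \ddot{q}^b = \frac{\partial L}{\partial q^a} - \frac{\partial^2 L}{\partial q^b \partial \dot{q}^a}\, \dot{q}^b - \frac{\partial^2 L}{\partial \lambda\, \partial \dot{q}^a} . \qquad (4.20)$$

Everything on the RHS of this equation is a function of $\lambda$, $q^a$, and $\dot{q}^a$. So for our massive relativistic point particle, finding all of the accelerations $\ddot{x}^\mu$ in terms of $\tau$, $x^\mu(\tau)$, $\dot{x}^\mu(\tau)$ only works if the Hessian tensor

$$\frac{\partial^2 L}{\partial \dot{x}^\nu \partial \dot{x}^\mu} \qquad (4.21)$$

has maximal rank. It actually has one zero eigenvalue, and this signals the presence of a local gauge symmetry: reparametrization invariance.[3]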
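Both claims about the arc length Lagrangian, the vanishing Hamiltonian and the zero eigenvalue of the Hessian, can be checked with a few lines of Python. This is a sketch under stated assumptions: the closed forms for the momenta and the Hessian below were obtained by differentiating $L = -m\sqrt{\dot{x}^\mu \dot{x}_\mu}$ by hand, and the sample velocity is an arbitrary timelike vector, not a normalized 4-velocity.

```python
import math

ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of eta_{mu nu}, signature (+,-,-,-)

def norm_squared(xdot):
    """xdot^mu xdot_mu for upper-index components xdot."""
    return sum(ETA[mu] * xdot[mu] ** 2 for mu in range(4))

def lagrangian(xdot, m):
    """Arc length Lagrangian L = -m sqrt(xdot . xdot)."""
    return -m * math.sqrt(norm_squared(xdot))

def canonical_momenta(xdot, m):
    """p_mu = dL/d(xdot^mu) = -m xdot_mu / sqrt(xdot . xdot)."""
    s = math.sqrt(norm_squared(xdot))
    return [-m * ETA[mu] * xdot[mu] / s for mu in range(4)]

def hessian(xdot, m):
    """W_{mu nu} = -m [eta_{mu nu}/s - xdot_mu xdot_nu / s^3], s = sqrt(xdot . xdot)."""
    s = math.sqrt(norm_squared(xdot))
    low = [ETA[mu] * xdot[mu] for mu in range(4)]
    return [[-m * ((ETA[mu] if mu == nu else 0.0) / s - low[mu] * low[nu] / s ** 3)
             for nu in range(4)] for mu in range(4)]

m = 1.7
xdot = [2.0, 0.5, -0.3, 1.0]   # arbitrary timelike sample velocity

p = canonical_momenta(xdot, m)
H = sum(p[mu] * xdot[mu] for mu in range(4)) - lagrangian(xdot, m)

W = hessian(xdot, m)
# Acting with the Hessian on xdot itself exposes the null direction.
W_on_xdot = [sum(W[mu][nu] * xdot[nu] for nu in range(4)) for mu in range(4)]

print(H)          # vanishes up to rounding
print(W_on_xdot)  # every component vanishes up to rounding
```

The null vector of the Hessian is the velocity itself, which matches the interpretation in terms of reparametrization invariance: rescaling the parameter along the worldline costs nothing.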

[3] For more details on this and a number of related topics, see the 1990 textbook “Lagrangian Interaction” by Noel Doughty, intended for senior undergraduates. When I was doing my B.Sc.(Hons) degree in New Zealand, I took a course from Doughty, and his notes and background material were published as this book a year later. I am very grateful to Doughty for helping inspire me to be a theoretical physicist. If you take a peek into the Acknowledgements section, you will see that he thanked me and four of my classmates. :D

Let us now mention the correct way to handle a constraint. The key is to introduce a Lagrange multiplier, which in this case we will call $e(\lambda)$, the einbein. In general, a Lagrange multiplier is something that appears in your action principle only via dependence on “coordinates” but not on “velocities”: it is not a truly dynamical field. Its only function is to implement the constraint that you need to impose, in a way that respects the symmetries of your system. In our case, we want to preserve invariance under rotations, boosts, and translations. Using this idea, a Lagrangian can be written down that achieves all the things we need. I will refer to it as the einbein Lagrangian,[4]

$$S^{(2)}_{\rm rel} = \int d\lambda \left[ \frac{1}{2}\, e^{-1}(\lambda)\, \eta_{\mu\nu}\, \frac{dx^\mu(\lambda)}{d\lambda} \frac{dx^\nu(\lambda)}{d\lambda} + \frac{1}{2}\, e(\lambda)\, m^2 \right] . \qquad (4.22)$$

This action is invariant under reparametrizations $\lambda \to \lambda'$, as long as the einbein transforms as

$$e \to e' = e\, \frac{d\lambda}{d\lambda'} . \qquad (4.23)$$

Varying this action $S^{(2)}_{\rm rel}$ w.r.t. $e(\lambda)$ gives its Euler-Lagrange equation, and this produces the mass shell constraint,

$$\dot{x}^\mu(\lambda)\, \dot{x}_\mu(\lambda) = +m^2 [e(\lambda)]^2 . \qquad (4.24)$$

It is a good idea to check this yourself, by working through the steps. Along the way, you will need to use the fact that

$$\frac{\partial x^\alpha}{\partial x^\beta} = \partial_\beta x^\alpha = \delta^\alpha_\beta = \eta^\alpha{}_\beta , \qquad (4.25)$$

and a similar equation for the $\dot{x}^\mu$s. For massive particles we can pick the proper time parametrization in which $e(\lambda) = 1/m$; then $\dot{x}^\mu \dot{x}_\mu = 1$ and $\lambda = \tau$. For massless particles, a convenient parametrization is $e(\lambda) = 1$, and so $\dot{x}^\mu \dot{x}_\mu = 0$ and there is no concept of proper time, just a parameter $\lambda$. Because we have obtained the mass shell constraint equation directly from the action, we can be confident that we truly have only 3 independent functions $x^\mu(\lambda)$ in our dynamical system, not 4.

Varying $S^{(2)}_{\rm rel}$ w.r.t. $x^\mu(\lambda)$ gives the equations of motion for the relativistic particle position,

$$\dot{p}_\mu(\lambda) = 0 , \qquad (4.26)$$

where the canonical momenta are

$$p_\mu(\lambda) = e^{-1}(\lambda)\, \dot{x}_\mu(\lambda) . \qquad (4.27)$$

The above equation of motion $\dot{p}_\mu = 0$ is equally valid for a massive or massless particle, as long as it is free, rather than acted on by an external force. If it feels an external force, obviously we would generically not expect its momentum to be conserved. The Hamiltonian is

$$H_\lambda = p_\mu(\lambda)\, \dot{x}^\mu(\lambda) - L_\lambda = \frac{1}{2}\, e(\lambda) \left[ p^\mu(\lambda)\, p_\mu(\lambda) - m^2 \right] . \qquad (4.28)$$

[4] Sometimes it is alternatively referred to as “1D General Relativity”.

This Hamiltonian is proportional to the constraint, and this is the correct answer because it gives all the correct Poisson brackets:

$$\{x^\mu, p_\nu\}_{\rm PB} = \delta^\mu_\nu , \qquad \{x^\mu, x^\nu\}_{\rm PB} = 0 , \qquad \{p_\mu, p_\nu\}_{\rm PB} = 0 . \qquad (4.29)$$

If we wanted to canonically quantize a system (something we will not be doing in this course), we would replace classical Poisson brackets with quantum mechanical commutators.
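The algebra leading from the einbein Lagrangian (4.22) to the Hamiltonian (4.28) is short enough to verify numerically. The sketch below is illustrative only: the mass, velocity, and einbein values are arbitrary samples, and the helper names are invented here. It evaluates the Legendre transform directly, compares it with the constraint form, and confirms that the Hamiltonian vanishes once the einbein solves the mass shell constraint (4.24).

```python
import math

ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of eta_{mu nu}, signature (+,-,-,-)

def dot(a_up, b_up):
    """Minkowski product a^mu b_mu of two upper-index vectors."""
    return sum(ETA[mu] * a_up[mu] * b_up[mu] for mu in range(4))

def einbein_lagrangian(xdot, e, m):
    """Integrand of (4.22): (1/2) e^{-1} xdot.xdot + (1/2) e m^2."""
    return 0.5 * dot(xdot, xdot) / e + 0.5 * e * m ** 2

def momenta_upper(xdot, e):
    """Index-raised version of (4.27): p^mu = xdot^mu / e."""
    return [v / e for v in xdot]

def hamiltonian(xdot, e, m):
    """Legendre transform p_mu xdot^mu - L."""
    p = momenta_upper(xdot, e)
    return dot(p, xdot) - einbein_lagrangian(xdot, e, m)

def constraint_form(xdot, e, m):
    """Right-hand side of (4.28): (1/2) e (p.p - m^2)."""
    p = momenta_upper(xdot, e)
    return 0.5 * e * (dot(p, p) - m ** 2)

m = 2.0
xdot = [3.0, 1.0, 0.5, -0.2]   # arbitrary timelike sample velocity
e_generic = 0.8                # einbein value that does NOT satisfy (4.24)
e_on_shell = math.sqrt(dot(xdot, xdot)) / m   # solves xdot.xdot = m^2 e^2

print(hamiltonian(xdot, e_generic, m), constraint_form(xdot, e_generic, m))
print(hamiltonian(xdot, e_on_shell, m))   # vanishes up to rounding
```

Off shell the two expressions agree but are nonzero; on shell both vanish, which is what “proportional to the constraint” means in practice.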

4.2 Electromagnetism: 4-vector potential and field strength tensor

A less trivial example of a special relativistic tensor is Maxwell’s electromagnetism. Having played with the Maxwell equations, you know why EM waves travel (in vacuum) at the speed of light. You may already know that, decades before Einstein invented special relativity, Maxwell had baked it into the very fabric of his eponymous equations! What you may not know is that the familiar electric and magnetic field strengths are actually not correctly described by vectors, but instead by a two-index covariant antisymmetric tensor. Specifically, in four spacetime dimensions, the gauge field strength components $F_{\mu\nu}$ are built out of

$$F_{0i} = +\delta_{ij}\, (\vec{E}\,)^j , \qquad F_{ij} = -\epsilon_{ijk}\, (\vec{B}\,)^k . \qquad (4.30)$$

In this equation, we used the totally antisymmetric permutation symbol in 3 dimensions. The electromagnetic 4-vector gauge potential $A^\mu$ is built out of the scalar potential and the 3-vector potential, with components

$$A^0 = \Phi , \qquad A^i = (\vec{A}\,)^i . \qquad (4.31)$$

It is related to the field strength via the covariant curl,

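$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \qquad (4.32)$$

This splits up in 3+1 notation as

$$\vec{B} = \vec{\nabla} \times \vec{A} , \qquad \vec{E} = -\vec{\nabla}\Phi - \frac{\partial \vec{A}}{\partial t} . \qquad (4.33)$$

Note that $A_\mu(x^\lambda)$ is the basic dynamical field of electromagnetism. The field strength $F_{\mu\nu}$ is a derived quantity. Using the above definitions, the four Maxwell equations

$$\vec{\nabla} \times \vec{B} - \frac{\partial \vec{E}}{\partial t} = \vec{J} , \qquad \vec{\nabla} \cdot \vec{E} = \rho , \qquad (4.34)$$

$$\vec{\nabla} \times \vec{E} + \frac{\partial \vec{B}}{\partial t} = \vec{0} , \qquad \vec{\nabla} \cdot \vec{B} = 0 , \qquad (4.35)$$

neatly collapse into two manifestly relativistic Maxwell equations,

$$\partial_\mu F^{\mu\nu} = J^\nu , \qquad \epsilon^{\mu\nu\lambda\rho}\, \partial_\nu F_{\lambda\rho} = 0 . \qquad (4.36)$$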

Later when we generalize to curved spacetime, the partial derivatives ∂µ will be replaced by covariant derivatives ∇µ. In the above relativistic Maxwell equations, the 4-vector current is built out of the charge density and the 3-vector current, with components

$$J^0 = \rho , \qquad J^i = (\vec{j}\,)^i . \qquad (4.37)$$

The 4-vector current obeys a conservation law,

$$\partial_\mu J^\mu = 0 . \qquad (4.38)$$
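To make the packaging of the electric and magnetic fields into $F_{\mu\nu}$ concrete, here is a small Python sketch. It assumes the conventions quoted above ($F_{0i} = +E^i$, $F_{ij} = -\epsilon_{ijk} B^k$, mostly-minus metric) and uses arbitrary sample field values. It builds the 4x4 array, confirms antisymmetry, and evaluates the Lorentz invariant $F_{\mu\nu} F^{\mu\nu}$, which with these conventions works out to $2(|\vec{B}|^2 - |\vec{E}|^2)$.

```python
ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of eta^{mu nu}, signature (+,-,-,-)

def levi_civita_3(i, j, k):
    """Permutation symbol for indices in {0, 1, 2}, with epsilon(0,1,2) = +1."""
    return (i - j) * (j - k) * (k - i) // 2

def field_strength(E, B):
    """Covariant components F_{mu nu} from 3-vectors E and B."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]      # F_{0i} = +E^i
        F[i + 1][0] = -E[i]     # antisymmetry
        for j in range(3):
            F[i + 1][j + 1] = -sum(levi_civita_3(i, j, k) * B[k] for k in range(3))
    return F

def raise_both(F):
    """F^{mu nu} = eta^{mu alpha} eta^{nu beta} F_{alpha beta} for a diagonal metric."""
    return [[ETA[m] * ETA[n] * F[m][n] for n in range(4)] for m in range(4)]

def invariant(F):
    """Full contraction F_{mu nu} F^{mu nu}."""
    F_up = raise_both(F)
    return sum(F[m][n] * F_up[m][n] for m in range(4) for n in range(4))

E = [0.3, -1.2, 0.5]   # sample electric field components
B = [0.7, 0.1, -0.4]   # sample magnetic field components

F = field_strength(E, B)
lhs = invariant(F)
rhs = 2.0 * (sum(b * b for b in B) - sum(e * e for e in E))
print(lhs, rhs)   # the two numbers agree up to rounding
```

Because the contraction is a scalar, the combination $|\vec{B}|^2 - |\vec{E}|^2$ is the same in every inertial frame even though $\vec{E}$ and $\vec{B}$ individually are not.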

Here we are working in four spacetime dimensions. If we write more generally the spacetime dimension $D$ as $D = d + 1$,[5] then the $D$-index pseudotensor $\epsilon_{\mu_0 \ldots \mu_d}$ is defined via

$$\epsilon_{\mu_0 \ldots \mu_d} = \begin{cases} +1 , & (\mu_0 \ldots \mu_d) = \text{even permutation of } (012 \ldots d) , \\ -1 , & (\mu_0 \ldots \mu_d) = \text{odd permutation of } (012 \ldots d) , \\ \phantom{+}0 , & \text{otherwise} . \end{cases} \qquad (4.39)$$

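If you want the permutation tensor with upstairs indices, you can easily build it by using $\eta^{\mu\nu}$ to raise the indices. Note that our 4-index permutation pseudotensor $\epsilon_{\mu\nu\lambda\sigma}$ obeys some handy identities, in a generalization of what we saw before for Euclidean 3-space. Defining

$$\delta^{\mu\nu}_{\alpha\beta} = \begin{vmatrix} \delta^\mu_\alpha & \delta^\nu_\alpha \\ \delta^\mu_\beta & \delta^\nu_\beta \end{vmatrix} \qquad (4.40)$$

and

$$\delta^{\mu\nu\lambda}_{\alpha\beta\gamma} = \begin{vmatrix} \delta^\mu_\alpha & \delta^\nu_\alpha & \delta^\lambda_\alpha \\ \delta^\mu_\beta & \delta^\nu_\beta & \delta^\lambda_\beta \\ \delta^\mu_\gamma & \delta^\nu_\gamma & \delta^\lambda_\gamma \end{vmatrix} \qquad (4.41)$$

and

$$\delta^{\mu\nu\lambda\sigma}_{\alpha\beta\gamma\delta} = \begin{vmatrix} \delta^\mu_\alpha & \delta^\nu_\alpha & \delta^\lambda_\alpha & \delta^\sigma_\alpha \\ \delta^\mu_\beta & \delta^\nu_\beta & \delta^\lambda_\beta & \delta^\sigma_\beta \\ \delta^\mu_\gamma & \delta^\nu_\gamma & \delta^\lambda_\gamma & \delta^\sigma_\gamma \\ \delta^\mu_\delta & \delta^\nu_\delta & \delta^\lambda_\delta & \delta^\sigma_\delta \end{vmatrix} \qquad (4.42)$$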

[5] We only ever consider spacetimes with one timelike dimension. Currently, it is not generally known how to make sense of quantum theory with two or more timelike dimensions. Which $\partial/\partial t$ should we use in the Schrödinger equation?

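gives, after quite a bit of algebra,

$$\epsilon^{\mu\nu\lambda\sigma} \epsilon_{\mu\nu\lambda\sigma} = -4! ,$$
$$\epsilon^{\mu\nu\lambda\sigma} \epsilon_{\mu\nu\lambda\delta} = -3!\, \delta^\sigma_\delta ,$$
$$\epsilon^{\mu\nu\lambda\sigma} \epsilon_{\mu\nu\gamma\delta} = -2!\, \delta^{\lambda\sigma}_{\gamma\delta} ,$$
$$\epsilon^{\mu\nu\lambda\sigma} \epsilon_{\mu\beta\gamma\delta} = -1!\, \delta^{\nu\lambda\sigma}_{\beta\gamma\delta} ,$$
$$\epsilon^{\mu\nu\lambda\sigma} \epsilon_{\alpha\beta\gamma\delta} = -0!\, \delta^{\mu\nu\lambda\sigma}_{\alpha\beta\gamma\delta} . \qquad (4.43)$$

The relativistic Lorentz force law can be written very nicely in relativistic tensor notation,

$$m a^\mu = q F^\mu{}_\nu u^\nu , \qquad (4.44)$$

where $u^\mu$ is the relativistic 4-velocity and $a^\mu$ is the relativistic 4-acceleration.

You will work out some aspects of this EM story in your HW1 assignment. In particular, you will be able to compute the effect of a Lorentz boost on the electromagnetic fields $\vec{E}$ and $\vec{B}$, which many of you will not have seen before.

We now turn to the question of what happens for accelerated observers moving with constant relativistic acceleration.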
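The first two contraction identities in (4.43) can be confirmed by brute force, which is a useful sanity check on the overall minus signs. The sketch below is a plain enumeration in Python, assuming the sign convention of (4.39) for the downstairs symbol and the mostly-minus Minkowski metric for raising indices; the helper names are invented for this illustration.

```python
from itertools import permutations, product

ETA = [1, -1, -1, -1]  # diagonal of eta^{mu nu}, signature (+,-,-,-)

def permutation_sign(perm):
    """Return +1 for an even and -1 for an odd permutation of (0, ..., n-1)."""
    p = list(perm)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]   # each swap flips the sign
            sign = -sign
    return sign

# Nonzero components of the downstairs symbol, as in (4.39).
EPS_LOWER = {perm: permutation_sign(perm) for perm in permutations(range(4))}

def eps_lower(idx):
    return EPS_LOWER.get(tuple(idx), 0)

def eps_upper(idx):
    """Raise all four indices with the diagonal metric."""
    factor = 1
    for mu in idx:
        factor *= ETA[mu]
    return factor * eps_lower(idx)

# First identity: full contraction should give -4! = -24.
full_contraction = sum(eps_upper(idx) * eps_lower(idx)
                       for idx in product(range(4), repeat=4))

def three_index_contraction(sigma, delta):
    """Second identity: should give -3! times the Kronecker delta."""
    return sum(eps_upper((m, n, l, sigma)) * eps_lower((m, n, l, delta))
               for m, n, l in product(range(4), repeat=3))

print(full_contraction)
print([[three_index_contraction(s, d) for d in range(4)] for s in range(4)])
```

The minus signs come from raising one timelike and three spacelike indices: the product of the four metric factors is -1 for every nonzero component.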

5 R24Sep

5.1 Constant relativistic acceleration and the twin paradox

When I was an undergraduate, a professor introduced the idea of the Twin Paradox to us. Could the space traveller twin really live longer by travelling at relativistic speeds? The maddening thing was that he never equipped us with the technology to answer the question! Here is how we can solve that without having to resort to General Relativity: we will only use what we know about Special Relativity to solve it, along with a tiny bit of calculus.

We all know that time dilation lengthens time intervals as compared to what is measured in the rest frame. We also know that each observer sees the other person’s clock as running slow. So why is there even a difference between what the space twin sees and what the homebody twin sees? Acceleration. The space twin must accelerate in order to turn around and come back to Earth, before they can compare clocks with the homebody twin. This is what makes the space twin physically distinct from the homebody twin, who stays in a relatively boring inertial reference frame while the space twin gallivants around the galaxy.

What do we mean by constant relativistic acceleration, exactly? Without loss of generality, we may take the astronaut acceleration to be pointing along the $x^1$ direction. Since the rapidity is additive under two successive Lorentz boosts, we may take a guess that constant relativistic acceleration occurs when rapidity increases linearly with proper time. Let the magnitude of the constant relativistic acceleration be $g$. For an infinitesimal addition $d\zeta$ to the rapidity in the $x^1$ direction, the proposal is

dζ = g dτ . (5.1)

Here we have suppressed the factors of $c$, which can be easily restored via dimensional analysis. This formula is interesting because it actually holds for any kind of acceleration $g(\tau)$, not just the constant kind. Let us now see why.

Our key tool for analysis will be to define the instantaneous inertial rest frame (IIRF) for the accelerating astronaut, which we will denote by primes. This is obviously distinct from the ordinary inertial reference frame (IRF) of the homebody twin, and it is different at each point along the space twin’s trajectory because they accelerate. The key physical feature of the IIRF at any $\tau$ is that the astronaut is at rest in that frame at the instant in question: $u'_x = 0$. And since we know the relationship between 3-velocities and 3-accelerations in different inertial reference frames from our experience with Lorentz transformations, we can figure out what happens for the astronaut’s trajectory measured in the lab frame. We have

$$u'_x = \frac{u_x - v}{1 - u_x v} , \qquad a'_x = \frac{a_x}{\gamma_v^3\, (1 - u_x v)^3} , \qquad (5.2)$$

where $\gamma_v \equiv 1/\sqrt{1 - v^2}$. Accordingly, at each instant along the astronaut trajectory,

$$u_x = v . \qquad (5.3)$$

Therefore, for a general acceleration $a'_x = g(\tau)$ in the IIRF,

$$a_x = \gamma_v^{-3}\, g(\tau) . \qquad (5.4)$$

We also know from elementary Lorentz transformations that $dt = \gamma_v\, d\tau$ (this is just like muons from cosmic rays lasting longer in the lab frame than in the muon rest frame because they are whizzing down to earth at relativistic speed). Remembering that $a_x = du_x/dt$, the above equation becomes $du_x/(1 - u_x^2) = g(\tau)\, d\tau$, and integrating gives for the 3-velocity in the $x^1$ direction

$$\text{arctanh}[u_x(\tau)] = \int_0^\tau d\sigma\, g(\sigma) . \qquad (5.5)$$

In turn, this can be easily rearranged to give the rapidity along the $x^1$ direction

$$\zeta(\tau) = \int_0^\tau d\sigma\, g(\sigma) , \qquad (5.6)$$

where we assumed that $\zeta(\tau = 0) = 0$. Next, we would like to compute the distance in the homebody frame moved by the astronaut during homebody time $dt$. This is simply obtained from the speed

$$dx = u_x\, dt = \tanh\zeta\, dt . \qquad (5.7)$$

To convert to astronaut time, we again use the standard time dilation formula,

dt = cosh ζ dτ . (5.8)

This implies that

$$dx = \sinh\zeta\, d\tau . \qquad (5.9)$$

If we know $\zeta(\tau)$, we can integrate these equations. It is especially easy to do so for constant acceleration $g$. The position of the space twin in homebody coordinates becomes

$$x(\tau) = \frac{1}{g} [\cosh(g\tau) - 1] + x_0 . \qquad (5.10)$$

The time for the space twin in homebody coordinates integrates to

$$t(\tau) = \frac{1}{g} \sinh(g\tau) + t_0 . \qquad (5.11)$$

Using these equations, you can figure out the physical effect of acceleration on the ageing process. In your first homework assignment, you will find out that acceleration serves to enhance the familiar constant-speed time dilation effect, rather than reduce it. This is because the free particle trajectory actually maximizes proper time elapsed during motion; any acceleration applied reduces it. We will be able to see why this is later on when we study geodesics. Geodesics are, morally speaking, the closest thing to a straight line that is available in curved spacetime. They describe the trajectories of test particles in freefall.

Getting back to our equations above, we can see that for the case of a constant relativistic acceleration $g$, the trajectory of the accelerating space twin satisfies

$$\left[ x(\tau) - x_0 + \frac{1}{g} \right]^2 - [t(\tau) - t_0]^2 = \frac{1}{g^2} . \qquad (5.12)$$

As you can see by inspection, this is a hyperbola. The asymptotes of the hyperbola are known as acceleration horizons or Rindler horizons. These asymptotes are lines with a 1:1 slope on a spacetime diagram. We can see why they are horizons by recalling that light rays also move at 45 degrees. An observer on a timelike trajectory going at higher acceleration hugs the hyperbola asymptote more tightly, but still cannot ‘see’ beyond the Rindler horizon.
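The constant-acceleration worldline (5.10) and (5.11) is easy to explore numerically. The following Python sketch checks that the worldline lies on the hyperbola (5.12) and then restores factors of c to estimate how much homebody time passes during one year of astronaut time. The choice g = 9.81 m/s^2, the use of a Julian year, and the function names are all assumptions made for this illustration.

```python
import math

C = 299_792_458.0              # speed of light in m/s
YEAR = 365.25 * 24 * 3600.0    # Julian year in seconds

def worldline(tau, g, x0=0.0, t0=0.0):
    """Homebody-frame (t, x) at astronaut proper time tau, in units with c = 1."""
    x = (math.cosh(g * tau) - 1.0) / g + x0   # equation (5.10)
    t = math.sinh(g * tau) / g + t0           # equation (5.11)
    return t, x

def hyperbola_residual(tau, g, x0=0.0, t0=0.0):
    """Left side minus right side of (5.12); vanishes on the worldline."""
    t, x = worldline(tau, g, x0, t0)
    return (x - x0 + 1.0 / g) ** 2 - (t - t0) ** 2 - 1.0 / g ** 2

def homebody_time_years(tau_years, g_si=9.81):
    """Homebody time (in years) elapsed over one leg starting from rest."""
    g = g_si * YEAR / C        # acceleration converted to units of 1/year
    return math.sinh(g * tau_years) / g

def velocity(tau, g):
    """Speed in the homebody frame, u_x = tanh(zeta) with zeta = g tau."""
    return math.tanh(g * tau)

for tau in (0.5, 1.0, 2.0):
    print(tau, hyperbola_residual(tau, g=1.3, x0=0.2, t0=-0.4))

print(homebody_time_years(1.0))   # somewhat more than one year
print(velocity(5.0, 1.0))         # approaches but never reaches 1
```

Because sinh grows exponentially, the mismatch between astronaut time and homebody time becomes dramatic after only a few years of sustained acceleration at this rate.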

In fact, the physics is even more interesting than this. The accelerated astronaut not only finds that there are parts of spacetime that they cannot communicate with because of their acceleration, but also that the physics of quantum fields for them is qualitatively and quantitatively different from what the homebody sees. The Minkowski vacuum (the state with no particles), seen in the reference frame of the astronaut with constant acceleration, turns out to have plenty of particles in it, and they can be measured with a detector. Not only that, the spectrum is thermal, at the Rindler temperature. Including factors of $c$, the formula for this reads

$$T_{\rm Rindler} = \frac{\hbar g}{2\pi c k_B} . \qquad (5.13)$$

The greater the acceleration, the higher the temperature that the detector will register. This phenomenon of acceleration radiation is known as the Unruh effect. For those who are interested, the physics of particle detectors in GR is explained nicely in the advanced GR textbook by Birrell and Davies, “Quantum Fields in Curved Space”.
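To get a feel for the size of the effect in (5.13), here is a short Python estimate. The numerical constants are standard SI values; the choice of Earth-surface gravity as a sample acceleration is just an illustration.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 299_792_458.0        # speed of light, m/s
K_B = 1.380649e-23       # Boltzmann constant, J/K

def rindler_temperature(g):
    """Temperature in kelvin registered by a detector with proper acceleration g (m/s^2)."""
    return HBAR * g / (2.0 * math.pi * C * K_B)

def acceleration_for_temperature(T):
    """Invert (5.13): proper acceleration needed for a Rindler temperature T."""
    return 2.0 * math.pi * C * K_B * T / HBAR

T_earth_gravity = rindler_temperature(9.81)
g_for_one_kelvin = acceleration_for_temperature(1.0)

print(T_earth_gravity)    # of order 1e-20 K
print(g_for_one_kelvin)   # of order 1e20 m/s^2
```

The temperature at everyday accelerations is absurdly small, which is why the effect is so hard to observe directly.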

5.2 The Equivalence Principle

Einstein became famous for several different accomplishments. One which is well known among theoretical physicists is the concept of the Gedankenexperiment (German for thought experiment). It allows us to work out all sorts of imaginative ideas without having to actually spend any money. So imagine, if you will, that you are an astronaut on the space station. Imagine that you are blindfolded and kidnapped and then one of two things happens to you. Either you feel the acceleration due to gravity or you take a ride in a rocket ship capable of that same acceleration. How would you tell the difference?

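The gravitational force from a body of mass $M$ on a test mass $m_g$ is

$$\vec{F}_{\rm grav} = -\frac{G_N M m_g}{r^2}\, \hat{r} , \qquad (5.14)$$

where $m_g$ is the gravitational mass and $G_N$ is the Newton constant. Using Newton’s second law $\vec{F} = m_i \vec{a}$, where $m_i$ is the inertial mass, we have

$$\vec{a}_{\rm grav} = -\left( \frac{m_g}{m_i} \right) \frac{G_N M \hat{r}}{r^2} . \qquad (5.15)$$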

If $m_i = m_g$, then this acceleration $\vec{a}_{\rm grav}$ does not depend on the properties of the test mass feeling the gravitational force.

The universality of gravitation was first put forward by Newton, centuries before Einstein. Others hypothesized that the acceleration due to gravity should be universal, not depending on the composition of the falling object. This idea has since been tested exquisitely well. It implies that an object’s inertial mass (what makes you hard to move in the morning) is equal to its gravitational mass (what responds to gravity), and is known as the Weak Equivalence Principle.

When Einstein formulated his theory of General Relativity (GR) he decided to bake the equivalence principle into the very fabric of spacetime. In GR, there is no local experiment you can do to tell the difference between acceleration due to rockets and acceleration due to gravity. This is known as the Einstein Equivalence Principle.

The really cool thing about the equivalence principle? It implies that every reference frame, including accelerating ones, can be instantaneously approximated by a Lorentz frame. This might seem like mathematical nitpicking, but it is actually a key physics insight, as it implies that locally in spacetime, everything is just Special Relativity. What makes General Relativity interesting and nontrivial is the story of how those individual infinitesimal neighbourhoods are sewn together into the fabric of curved spacetime.

It is important to note that this equivalence between gravity and acceleration holds only in an infinitesimal patch about a point. If we have access to a finite sized patch of spacetime, we can distinguish gravity from acceleration by measuring tidal forces. We will develop that story later on when we get to discussing geodesic deviation.

Consider a photon in Earth’s gravitational field. If it gets aimed upwards, then after a time interval $dt$, what is the effect? Well, photons cannot change their speed, as they always go at $c$. What can change for a photon is its energy (or equivalently the magnitude of its momentum, because the photon mass shell relation is $E = |\vec{p}\,|$). It can also change its heading. When a photon moves upwards in a gravitational field, it gains gravitational potential energy, so it must lose kinetic energy (conservation of the total energy is valid near Earth, because there is a time translation symmetry). The photon should therefore suffer a redshift in going upwards. This phenomenon is known as gravitational redshift, and it implies that clocks run slower when they are deeper in a gravitational field. Black holes take this to an extreme, as we will see much later in the course.

Did you know that GPS devices rely on both Special and General Relativity to locate you accurately? They need to take account of the fact that the GPS transmitters are (a) travelling at a measurable fraction of the speed of light, requiring Special Relativistic Doppler corrections, and (b) higher up in Earth’s gravitational field than we are, necessitating General Relativistic corrections. Without those corrections together, you would probably be kilometres off your intended position after a day’s canoeing. So GR does actually touch your life in a measurable way, if you ever use a GPS unit, say in your smartphone.
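The size of the two GPS clock corrections can be estimated in a few lines. The sketch below is a leading-order, weak-field estimate and not a model of how GPS receivers actually apply the corrections. The orbital radius, Earth radius, and the assumption of a circular orbit are inputs chosen for this illustration, and the rotation of the ground observer is ignored.

```python
C = 299_792_458.0          # speed of light, m/s
GM_EARTH = 3.986004418e14  # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6          # mean Earth radius, m (assumed ground clock position)
R_ORBIT = 2.656e7          # assumed GPS orbital radius, m
SECONDS_PER_DAY = 86400.0

def fractional_rate_offsets(r_orbit=R_ORBIT, r_ground=R_EARTH):
    """Return (special, general) fractional clock-rate offsets of the satellite clock."""
    v_squared = GM_EARTH / r_orbit                       # circular orbit: v^2 = GM/r
    special = -v_squared / (2.0 * C ** 2)                # moving clock runs slow
    general = GM_EARTH / C ** 2 * (1.0 / r_ground - 1.0 / r_orbit)  # higher clock runs fast
    return special, general

special, general = fractional_rate_offsets()
special_us_per_day = special * SECONDS_PER_DAY * 1e6
general_us_per_day = general * SECONDS_PER_DAY * 1e6
net_seconds_per_day = (special + general) * SECONDS_PER_DAY
range_error_km_per_day = net_seconds_per_day * C / 1000.0

print(special_us_per_day)       # negative, several microseconds per day
print(general_us_per_day)       # positive, tens of microseconds per day
print(range_error_km_per_day)   # of order ten kilometres per day
```

The gravitational correction is the larger of the two and has the opposite sign, so the satellite clocks run fast overall. Multiplying the accumulated timing offset by c gives the kilometre-scale position error quoted above.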

5.3 Spacetime as a curved Riemannian manifold

Newton conceptualized gravity via forces that act at a distance instantaneously. This instantaneous propagation of gravitational effects is in direct contradiction to the relativistic principle that the speed of light is the upper speed limit for everyone. In Einstein’s GR, the speed of propagation of gravitational disturbances is tied to be exactly equal to the speed of light in vacuum. The formalism of GR is designed to express all the effects of gravity in a relativistic way, like gravitational redshift, via geometrical properties of the fabric of spacetime.

The mathematical name for the type of geometry used is (pseudo-)Riemannian geometry. A $(p + q)$-dimensional manifold with signature $(p, q)$ is a spacetime that locally looks like a patch of $\mathbb{R}^{p,q}$. For example, for our $D = 3 + 1$ universe with three large spatial dimensions, this would be $\mathbb{R}^{3,1}$. The manifold is the collection (union) of these patches, known as coordinate charts, along with the transition functions that teach you how to sew the patches together. The manifold needs to be continuous, and in order for us to compute sensible physical quantities it should also be differentiable. The mathematical concept of the coordinate chart is equivalent to the usual physics idea of a coordinate system or reference frame.

As an example of how you might need more than one coordinate chart to cover a manifold, consider a circle $S^1$. Each coordinate chart must be an open subset of $\mathbb{R}$ (emphasis on open). So the minimum number of coordinate charts required to cover the 1-sphere is two. For the 2-sphere $S^2$, you need a projection to get $S^2$ onto $\mathbb{R}^2$, or a patch thereof like a map. The most commonly used projection is the Mercator projection, which preserves angles rather than area. It is possible to use a different projection that preserves area, such as the Peters projection. However, the price of maintaining areas on the map is that angles are not preserved: countries look funny shaped compared to their Mercator cousins. Because the sphere is curved and the plane is not, you cannot create a map that preserves both angles and areas. The reason why the Mercator projection has been so dominant is a technical one: because it preserves angles, it is optimal for navigation of marine vessels and aeroplanes. But it massively overstates the size of countries closer to the poles. In particular, Western Europe looks more important on Mercator maps than it should, while Africa and Brazil look much smaller. Colonialism also had a role in the dominance of the Mercator projection.

Examples of manifolds include Minkowski space, the sphere, the torus, and 2D Riemann surfaces with arbitrary genus. What about spaces that are not manifolds? Any intersection of lines with $k$-planes will do. A cone is an example of a non-differentiable manifold, because of what happens at its apex. Some manifolds have a boundary; others have none.

General Relativity treats the fabric of spacetime as a differentiable manifold. Note that it is also possible to handle discontinuities in the spacetime metric in some situations in GR, but only if the appropriate source of energy-momentum is available at the discontinuity to enforce consistency with Einstein’s equations. The formalism for handling this non-differentiable case is known as the Israel junction conditions, and its equations are derived by integrating Einstein’s equations across discontinuities in suitably covariant ways. This works a lot like deriving equations for shock waves in fluid mechanics.

Spacetime being a differentiable manifold is not enough structure to describe gravity as we see it in experiments. The geometry should be suitably constrained by some physical equations, which should (by the Correspondence Principle) reduce to Newtonian mechanics in the limit of small speeds and weak gravity. Our spacetime manifolds will satisfy the Einstein equations.

6 M28Sep

6.1 Basis vectors in curved spacetime

How do vectors and tensors work when spacetime is curved? We will have to be more careful than before, and the key difference is that the matrices showing us how to transform between different coordinate systems are no longer constant matrices.

Suppose that we have coordinates $x^\mu$ on our manifold and that we consider an arbitrary function $f$ of these coordinates. Then the directional derivative of $f$ along a curve parametrized by $\lambda$ is

$$\frac{df}{d\lambda} = \frac{\partial f}{\partial x^\mu} \frac{dx^\mu}{d\lambda} \qquad (6.1)$$

so that we can write

$$\frac{d}{d\lambda} = \frac{dx^\mu}{d\lambda}\, \partial_\mu . \qquad (6.2)$$

In other words, $\{e_\mu = \partial_\mu\}$ is a set of basis vectors. This story goes deeper. Mathematically, the tangent space $T_p(M)$ at a point $p$ of a manifold $M$ is isomorphic to the space of directional derivative operators on curves through $p$. It is a vector space, and the Leibniz rule is obeyed. Vector fields can then be defined on $M$. An example of a vector field would be the wind direction at the surface of the Earth. Take a look at https://earth.nullschool.net/ for a very beautiful interactive visualization of winds on Earth.

What about a basis for covariant vectors living in the cotangent space $T^*_p(M)$? There is a very natural candidate: the differentials $\{e^\mu = dx^\mu\}$. Note that these $dx^\mu$ are not the same as the contravariant basis vectors $\partial_\mu = \partial/\partial x^\mu$; you can tell the difference partly by where the index is placed. The coordinate bases for contravariant and covariant vectors obey a natural inner product,

$$(\partial_\nu)(dx^\mu) = \frac{\partial x^\mu}{\partial x^\nu} = \delta^\mu_\nu . \qquad (6.3)$$

We do not have to stick to only using partial derivatives and differentials as bases. More generally, we can denote a basis for contravariant vectors living in the tangent space as

$$\{e_\mu\} . \qquad (6.4)$$

Any contravariant vector $v$ can be expanded in terms of this basis,

$$v = v^\mu e_\mu . \qquad (6.5)$$

Also, we denote a basis for covariant vectors living in the cotangent space as

$$\{e^\nu\} . \qquad (6.6)$$

Any covariant vector $\omega$ can be expanded in terms of this basis,

$$\omega = \omega_\nu e^\nu . \qquad (6.7)$$

Generally, a basis for contravariant vectors $e_\mu$ and a basis for covariant vectors $e^\nu$ must be reciprocals,

$$e_\nu \cdot e^\mu = \delta^\mu_\nu . \qquad (6.8)$$

What if we wanted to measure distances and angles on our spacetime manifold? In terms of the basis $e_\mu$ we can write the vector displacement $ds$ between a point at $x^\mu$ and another point at $x^\mu + dx^\mu$ in terms of our general basis vectors,

$$ds = e_\mu\, dx^\mu . \qquad (6.9)$$

Accordingly, the line element $ds^2$ is

$$ds^2 = ds \cdot ds = e_\mu dx^\mu \cdot e_\nu dx^\nu . \qquad (6.10)$$

From this equation we can identify the metric tensor, denoted by $g_{\mu\nu}$,

$$g_{\mu\nu} = e_\mu \cdot e_\nu . \qquad (6.11)$$

This is a generalization of the flat Minkowski metric that we encountered in our quick review of Special Relativity, and it tells us how to measure distances and angles. The above line element in curved spacetime obeys a very important principle: it is invariant under arbitrary coordinate changes which are invertible and $C^\infty$, known as diffeomorphisms. The inverse metric is denoted as $g^{\mu\nu}$ and it is built in an exactly similar fashion,
$$g^{\mu\nu} = \mathbf{e}^\mu \cdot \mathbf{e}^\nu . \tag{6.12}$$
The downstairs metric $g_{\mu\nu}$ and its inverse the upstairs metric $g^{\nu\lambda}$ obey
$$g^{\mu\nu} g_{\nu\lambda} = \delta^\mu_\lambda . \tag{6.13}$$
The spacetime metric and its inverse are used all the time in GR, for raising and lowering indices on tensors. Sometimes it is physically useful to use a special basis called the orthonormal basis. In this case, we denote the basis vectors with hats, and they obey
$$\hat{\mathbf{e}}_\mu \cdot \hat{\mathbf{e}}_\nu = \eta_{\mu\nu} , \qquad \hat{\mathbf{e}}^\mu \cdot \hat{\mathbf{e}}^\nu = \eta^{\mu\nu} . \tag{6.14}$$

Flat, boring Minkowski spacetime $\mathbb{R}^{1,3}$ written in a spherical polar coordinate basis is not a curved spacetime, but it has a spacetime metric and tensor transformation laws that depend on spacetime position. As an exercise to test your understanding, check explicitly that the line element is
$$ds^2 = dt^2 - dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta\, d\phi^2 , \tag{6.15}$$
by starting from the expressions for the Cartesian spatial coordinates $\{x^1, x^2, x^3\}$ in terms of the spherical polar spatial coordinates $\{r, \theta, \phi\}$,
$$x^1 = r\cos\theta , \tag{6.16}$$
$$x^2 = r\sin\theta\cos\phi , \tag{6.17}$$
$$x^3 = r\sin\theta\sin\phi . \tag{6.18}$$
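The exercise can also be checked numerically before doing it by hand. The following is a minimal sketch, not part of the original notes: it builds the Jacobian of eqs. (6.16)-(6.18) by central finite differences and forms the Euclidean spatial metric, which should be diag(1, r², r² sin²θ); the overall minus sign of the spatial part in eq. (6.15) comes from the mostly-minus signature. The function names, the step size `h` and the sample point are illustrative assumptions.

```python
import math

def cartesian(r, theta, phi):
    # Eqs. (6.16)-(6.18); note that x^1 (not x^3) is the polar axis here.
    return (r * math.cos(theta),
            r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi))

def spatial_metric(r, theta, phi, h=1e-6):
    """Euclidean metric in (r, theta, phi): sum_k (dx^k/dq^a)(dx^k/dq^b),
    with the Jacobian estimated by central differences (step h is an assumption)."""
    q = [r, theta, phi]
    jac = []  # jac[a][k] = d x^k / d q^a
    for a in range(3):
        qp, qm = list(q), list(q)
        qp[a] += h
        qm[a] -= h
        xp, xm = cartesian(*qp), cartesian(*qm)
        jac.append([(xp[k] - xm[k]) / (2 * h) for k in range(3)])
    return [[sum(jac[a][k] * jac[b][k] for k in range(3)) for b in range(3)]
            for a in range(3)]

# Arbitrary sample point, chosen only for illustration.
r, theta, phi = 2.0, 0.7, 1.3
g = spatial_metric(r, theta, phi)
expected = [1.0, r**2, (r * math.sin(theta))**2]
```

The diagonal of `g` should agree with `expected` and the off-diagonal entries should vanish, up to finite-difference error.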

6.2 Tensors in curved spacetime

Tensors work in curved spacetime a lot like they do in flat spacetime. The most important physical difference is that under a change of reference frame represented by
$$\Lambda^{\mu'}{}_{\nu} \equiv \frac{\partial x^{\mu'}}{\partial x^\nu} \tag{6.19}$$
the new coordinates are related to the old ones by coordinate-dependent factors, rather than simple constants like $\cos\theta$ or $\sinh\zeta$. Our central physics strategy will be to remain focused on the transformation properties of our tensors of interest. That is the essence of what a tensor does: it transforms in very specific, well-defined ways when the coordinate system changes.

Earlier, we introduced bases $\mathbf{e}_\mu$ for rank (1,0) vectors and $\mathbf{e}^\nu$ for rank (0,1) vectors. Accordingly, a general rank (1,0) vector $\mathbf{v}$ in curved spacetime can be written in components as
$$\mathbf{v} = v^\mu\,\mathbf{e}_\mu , \tag{6.20}$$
and a covariant vector $\boldsymbol{\omega}$ in curved spacetime can be written in components as
$$\boldsymbol{\omega} = \omega_\mu\,\mathbf{e}^\mu . \tag{6.21}$$
Under coordinate transformations, their components transform as
$$v^{\mu'} = \Lambda^{\mu'}{}_{\nu}\, v^\nu , \tag{6.22}$$
and
$$\omega_{\mu'} = \Lambda_{\mu'}{}^{\nu}\, \omega_\nu . \tag{6.23}$$
The transformation matrices
$$\Lambda^{\mu'}{}_{\nu} = \frac{\partial x^{\mu'}}{\partial x^\nu} \qquad\text{and}\qquad \Lambda_{\mu'}{}^{\nu} = \frac{\partial x^\nu}{\partial x^{\mu'}} \tag{6.24}$$
now generically depend on spacetime position, and they satisfy

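$$\Lambda_{\mu'}{}^{\sigma}\,\Lambda^{\mu'}{}_{\nu} = \delta^\sigma_\nu , \qquad \Lambda_{\mu'}{}^{\sigma}\,\Lambda^{\nu'}{}_{\sigma} = \delta^{\nu'}_{\mu'} . \tag{6.25}$$
Contravariant vectors on a pseudo-Riemannian manifold representing curved spacetime live in the tangent space, which is a vector space. Covariant vectors live in the cotangent space. The collection of all (co)tangent spaces over $M$ is known mathematically as the (co)tangent bundle. One of the key properties of a covariant vector $\boldsymbol{\omega}$ is that we can naturally take its inner product with a contravariant vector $\mathbf{v}$ without using the metric, and it yields a scalar:
$$\boldsymbol{\omega}(\mathbf{v}) = \omega_\nu\,\mathbf{e}^\nu \cdot v^\mu\,\mathbf{e}_\mu = \omega_\mu v^\mu . \tag{6.26}$$
This enables us to recognize that another way to think about a covariant vector is that it is a machine that takes a contravariant vector and produces a scalar. Or in mathematical words, it is a linear map from the tangent space into the real numbers $\mathbb{R}$, and the pairing $\boldsymbol{\omega}(\mathbf{v})$ is linear in both of its arguments, obeying
$$(a\boldsymbol{\omega}_1 + b\boldsymbol{\omega}_2)(\mathbf{v}) = a\,\boldsymbol{\omega}_1(\mathbf{v}) + b\,\boldsymbol{\omega}_2(\mathbf{v}) , \tag{6.27}$$
$$\boldsymbol{\omega}(a\mathbf{v}_1 + b\mathbf{v}_2) = a\,\boldsymbol{\omega}(\mathbf{v}_1) + b\,\boldsymbol{\omega}(\mathbf{v}_2) . \tag{6.28}$$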

Vectors obey entirely analogous rules.

A rank $(m, n)$ tensor in curved spacetime is defined by direct analogy as a multilinear map from a collection of $m$ covariant vectors and $n$ contravariant vectors to $\mathbb{R}$. Its components in a coordinate basis can be extracted from $T$ by slotting in the right number of covariant and contravariant basis vectors,
$$T^{\mu_1\ldots\mu_m}{}_{\nu_1\ldots\nu_n} = T(\mathbf{e}^{\mu_1},\ldots,\mathbf{e}^{\mu_m},\mathbf{e}_{\nu_1},\ldots,\mathbf{e}_{\nu_n}) . \tag{6.29}$$
Alternatively, it can be written in terms of basis tensors as
$$T = T^{\mu_1\ldots\mu_m}{}_{\nu_1\ldots\nu_n}\; \mathbf{e}_{\mu_1}\otimes\ldots\otimes\mathbf{e}_{\mu_m}\otimes\mathbf{e}^{\nu_1}\otimes\ldots\otimes\mathbf{e}^{\nu_n} , \tag{6.30}$$
where $\otimes$ denotes the outer product (not the inner product!). Note that in this picture expanding a tensor in components in one basis versus a second basis results in different components, as we would expect; the tensor stays the same. The coordinate transformation law for the components of a rank $(m, n)$ tensor in curved spacetime is
$$T^{\mu_1'\ldots\mu_m'}{}_{\nu_1'\ldots\nu_n'} = \frac{\partial x^{\mu_1'}}{\partial x^{\lambda_1}}\ldots\frac{\partial x^{\mu_m'}}{\partial x^{\lambda_m}}\; \frac{\partial x^{\sigma_1}}{\partial x^{\nu_1'}}\ldots\frac{\partial x^{\sigma_n}}{\partial x^{\nu_n'}}\; T^{\lambda_1\ldots\lambda_m}{}_{\sigma_1\ldots\sigma_n} . \tag{6.31}$$
Notice that now the Jacobians involved typically depend on spacetime position.

6.3 Rules for tensor index gymnastics

There are very specific rules for manipulating tensors. We already met one of them: the Einstein summation convention. In curved spacetime it works exactly the same way as in flat spacetime: repeated indices are summed over. But let us also make explicit some other specific tensor manipulation rules.

First and foremost among them is the fact that when you write a tensor equation, indices on the LHS and RHS must be exactly matched. For example, $p^\mu = m u^\mu$ is a sensible tensor equation (it has one upstairs index on both sides) while the erroneous $p_\mu = m u^\mu$ is not (on the LHS the $\mu$ index is downstairs while on the RHS it is upstairs).

Second, vertical moves of tensor indices, up or down, can only be made by lowering or raising them with the rank (0,2) metric tensor or its rank (2,0) inverse. Said less pedantically, we raise or lower indices using the metric. For example,
$$T_\mu{}^{\nu}{}_{\lambda\sigma} = g_{\mu\rho}\,T^{\rho\nu}{}_{\lambda\sigma} , \tag{6.32}$$
and similarly for other raised/lowered components: you use as many factors of the metric/inverse metric as needed (with appropriate contractions) to lower/raise all the requisite indices.

Third, we must always preserve the horizontal ordering of the indices when calculating, for both upstairs and downstairs indices. For example, for a general rank (2,2) tensor,
$$T^{\mu\nu}{}_{\lambda\sigma} \neq T^{\nu\mu}{}_{\lambda\sigma} . \tag{6.33}$$
(The RHS has the $\mu$ and $\nu$ indices switched compared to the LHS.) Other horizontal switches of indices are equally verboten, unless you know that the tensor has appropriate symmetry properties. The only standard exception to the rule that horizontal index ordering matters is the Kronecker $\delta^\alpha_\beta$ tensor, which is symmetrical by definition.

Let us now make a few remarks about symmetries among tensors. Tensors can have symmetries on their indices, which reduce the number of independent components, but this is not generic. For example, under interchange of its indices a two-index tensor might be symmetric
$$S_{\mu\nu} = +S_{\nu\mu} , \tag{6.34}$$
or antisymmetric
$$A_{\mu\nu} = -A_{\nu\mu} . \tag{6.35}$$
For rank two tensors only, an arbitrary tensor $T$ can actually be written as the sum of a symmetric tensor $S$ and an antisymmetric tensor $A$. In components,
$$T_{\mu\nu} = S_{\mu\nu} + A_{\mu\nu} , \tag{6.36}$$
where
$$S_{\mu\nu} = \frac{1}{2!}\left(T_{\mu\nu} + T_{\nu\mu}\right) , \qquad A_{\mu\nu} = \frac{1}{2!}\left(T_{\mu\nu} - T_{\nu\mu}\right) . \tag{6.37}$$
This works because the total number of independent components of a 2-index tensor is $D \times D = D^2$, while a symmetric 2-index tensor has $D(D+1)/2$ components and an antisymmetric 2-index tensor has $D(D-1)/2$, so that $D(D+1)/2 + D(D-1)/2 = D^2$. For larger rank, such a split cannot be done, because totally symmetric and totally antisymmetric tensors do not have enough independent components between them to cover the total number.

Any tensor can be symmetrized or antisymmetrized on any number $k$ of upper or lower indices. For symmetrization on $k$ indices, denoted by round parentheses around those indices, we have
$$T_{(\mu_1\ldots\mu_k)} \equiv \frac{1}{k!}\left(T_{\mu_1\mu_2\ldots\mu_k} + \text{sum over other permutations of } \{\mu_1\ldots\mu_k\}\right) , \tag{6.38}$$
while for antisymmetrization on $k$ indices, denoted by square brackets around those indices, we have
$$T_{[\mu_1\ldots\mu_k]} \equiv \frac{1}{k!}\left(T_{\mu_1\mu_2\ldots\mu_k} + \text{alternating sum over other permutations of } \{\mu_1\ldots\mu_k\}\right) , \tag{6.39}$$
where the alternating sum counts even/odd permutations with a $+/-$ sign.

Suppose you know that a tensor is symmetric with all indices down. How do you work out the symmetry of its counterpart with some or all of its indices up? By using known symmetry properties of the downstairs components and using the metric tensor to raise indices. Remember that the metric tensor itself is symmetric under interchange of its two indices, and so is its inverse. Note also that the contraction of the metric tensor with itself is
$$g^{\mu\nu} g_{\mu\nu} = g^\mu{}_\mu = \delta^\mu_\mu = 1 + \ldots + 1 = D . \tag{6.40}$$
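The rank-two split of eqs. (6.36)-(6.37) and the component counting can be illustrated with a short sketch. It is not from the original notes: the helper names and the sample 4×4 array are illustrative assumptions, and the rank-three counts use the standard formulas D(D+1)(D+2)/6 and D(D−1)(D−2)/6 for totally symmetric and totally antisymmetric tensors.

```python
def split(T):
    """Return (S, A) with S_mn = (T_mn + T_nm)/2 and A_mn = (T_mn - T_nm)/2, eq. (6.37)."""
    D = len(T)
    S = [[0.5 * (T[m][n] + T[n][m]) for n in range(D)] for m in range(D)]
    A = [[0.5 * (T[m][n] - T[n][m]) for n in range(D)] for m in range(D)]
    return S, A

def rank2_counts(D):
    # (symmetric, antisymmetric) independent components of a 2-index tensor
    return D * (D + 1) // 2, D * (D - 1) // 2

def rank3_counts(D):
    # (totally symmetric, totally antisymmetric) components of a 3-index tensor
    return D * (D + 1) * (D + 2) // 6, D * (D - 1) * (D - 2) // 6

# Arbitrary sample components in D = 4, chosen only for illustration.
T = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
S, A = split(T)
sym2, anti2 = rank2_counts(4)   # 10 and 6, which add up to 4**2 = 16
sym3, anti3 = rank3_counts(4)   # 20 and 4, which fall short of 4**3 = 64
```

In D = 4 the rank-three counts add up to 24, well short of 64, which is the counting behind the statement that the split fails beyond rank two.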

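7 R01Oct

7.1 Building a covariant derivative

Because coordinate change matrices generically depend on spacetime position, simple partial derivatives of tensors are typically not themselves tensors. For example, the partial derivative of a covariant vector $W$, $\partial_\mu W_\nu$, changes under coordinate changes as
$$\frac{\partial}{\partial x^\mu} W_\nu \;\longrightarrow\; \frac{\partial}{\partial x^{\mu'}} W_{\nu'} = \frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial}{\partial x^\mu}\left(\frac{\partial x^\nu}{\partial x^{\nu'}} W_\nu\right) = \frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial x^\nu}{\partial x^{\nu'}}\frac{\partial}{\partial x^\mu} W_\nu + W_\nu\,\frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial}{\partial x^\mu}\left(\frac{\partial x^\nu}{\partial x^{\nu'}}\right) . \tag{7.1}$$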

Although the first term looks good for tensoriality, we see that the second term ruins the fun for generic changes of coordinates.

At any particular point $p$, we can choose a reference frame (denoted here by hats) in which the first derivatives of the metric vanish, $\partial_{\hat\sigma} g_{\hat\mu\hat\nu}|_p = 0$. The way to see this mathematically is to use (a) Taylor expansions around a particular point and (b) the tensorial transformation property of the two-index metric tensor $g_{\mu\nu}$. However, this cannot be made to work beyond first order in derivatives, because the coordinate transformation does not have enough free components to cancel all the second derivatives of the metric.

Physically, this means that we will need extra structure on our spacetime manifold in order to be able to define covariant derivatives that transform like tensors. The structure we need is known as an affine connection. It will enable us to make covariant versions of partial derivatives $\partial_\mu$, denoted $\nabla_\mu$, designed to transform like tensors. For taking covariant derivatives of tensorial indices relevant to bosonic fields, we will use the Levi-Civita connection or Christoffel symbols $\Gamma^\mu{}_{\nu\lambda}$. For taking covariant derivatives of spinorial indices relevant to fermionic fields, a researcher would use a different beast known as the spin connection $\omega^\mu{}_{ab}$ (see GR2 for details). We will work on manifolds without torsion, and for this case knowing the metric tensor is sufficient to determine both connections.

7.2 How basis vectors change: the role of the affine connection

In curved spacetime, the partial derivative of a basis vector generically does not appear to lie in the tangent space. But as HEL explain in §3, this can be easily fixed up by defining the derivative in the manifold of the coordinate basis vectors by projecting into the tangent space at the point in question. Then we can expand out the expression for the partial derivative $\partial_\lambda$ of a contravariant basis vector $\mathbf{e}_\mu$ in terms of the basis $\mathbf{e}_\nu$, with coefficients $\Gamma^\nu{}_{\mu\lambda}$:
$$\partial_\lambda \mathbf{e}_\mu = \Gamma^\nu{}_{\mu\lambda}\,\mathbf{e}_\nu . \tag{7.2}$$
We can figure out the analogous equation for the covariant basis vectors $\mathbf{e}^\mu$ by taking the partial derivative of both sides of the equation $\mathbf{e}^\mu \cdot \mathbf{e}_\nu = \delta^\mu_\nu$, yielding
$$\partial_\lambda \mathbf{e}^\mu = -\Gamma^\mu{}_{\nu\lambda}\,\mathbf{e}^\nu . \tag{7.3}$$

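We can find the expressions for the coefficients $\Gamma^\nu{}_{\mu\lambda}$ by taking the partial derivative of our metric tensor,
$$\partial_\lambda g_{\mu\nu} = (\partial_\lambda \mathbf{e}_\mu)\cdot\mathbf{e}_\nu + \mathbf{e}_\mu\cdot\partial_\lambda\mathbf{e}_\nu = \Gamma^\sigma{}_{\mu\lambda}\,\mathbf{e}_\sigma\cdot\mathbf{e}_\nu + \mathbf{e}_\mu\cdot\Gamma^\sigma{}_{\nu\lambda}\,\mathbf{e}_\sigma . \tag{7.4}$$
Using this, we can now form the combination
$$\partial_\lambda g_{\mu\nu} + \partial_\nu g_{\lambda\mu} - \partial_\mu g_{\nu\lambda} = 2\,\Gamma^\sigma{}_{\lambda\nu}\,g_{\mu\sigma} , \tag{7.5}$$
where in the second step we used the fact that the Christoffels are symmetric under interchange of their lower indices. This fact stems from the assumption that our spacetime has zero torsion$^6$. Rearranging the above gives the full expression for the coefficients $\Gamma^\mu{}_{\nu\lambda}$ in terms of first derivatives of the metric tensor,
$$\Gamma^\mu{}_{\nu\lambda} = \frac{1}{2}\,g^{\mu\sigma}\left(\partial_\nu g_{\sigma\lambda} + \partial_\lambda g_{\nu\sigma} - \partial_\sigma g_{\nu\lambda}\right) . \tag{7.6}$$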

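How about an example? Consider the 2D plane $\mathbb{R}^2$ with Cartesian coordinates $\{x, y\}$. The basis vectors $\mathbf{e}_x$ and $\mathbf{e}_y$ are maximally boring: they do not change with position. However, if we transform to plane polar coordinates $\{\rho, \phi\}$ given by
$$x = \rho\cos\phi , \quad \rho = \sqrt{x^2 + y^2} , \qquad y = \rho\sin\phi , \quad \phi = \arctan(y/x) , \tag{7.7}$$
then in plane polar coordinates the basis vectors $\mathbf{e}_\rho$ and $\mathbf{e}_\phi$ definitely do change with position, which is the generic situation in GR. To see this, recall that
$$d\mathbf{s} = \mathbf{e}_x\,dx + \mathbf{e}_y\,dy = \mathbf{e}_\rho\,d\rho + \mathbf{e}_\phi\,d\phi , \tag{7.8}$$
and use the above coordinate transformations to identify
$$\mathbf{e}_\rho = +\cos\phi\;\mathbf{e}_x + \sin\phi\;\mathbf{e}_y , \qquad \mathbf{e}_\phi = \rho\left(-\sin\phi\;\mathbf{e}_x + \cos\phi\;\mathbf{e}_y\right) . \tag{7.9}$$
It follows quickly from the above that
$$ds^2 = d\rho^2 + \rho^2\,d\phi^2 . \tag{7.10}$$
We can inspect how the basis vectors change with position, to obtain
$$\begin{aligned}
\frac{\partial\mathbf{e}_\rho}{\partial\rho} &= 0 , \\
\frac{\partial\mathbf{e}_\rho}{\partial\phi} &= -\sin\phi\;\mathbf{e}_x + \cos\phi\;\mathbf{e}_y = \frac{1}{\rho}\,\mathbf{e}_\phi , \\
\frac{\partial\mathbf{e}_\phi}{\partial\rho} &= -\sin\phi\;\mathbf{e}_x + \cos\phi\;\mathbf{e}_y = \frac{1}{\rho}\,\mathbf{e}_\phi , \\
\frac{\partial\mathbf{e}_\phi}{\partial\phi} &= -\rho\cos\phi\;\mathbf{e}_x - \rho\sin\phi\;\mathbf{e}_y = -\rho\,\mathbf{e}_\rho .
\end{aligned} \tag{7.11}$$

$^6$ Torsion is a rank (1,2) tensor, and it falls outside the scope of this course.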

From eq.(7.2), we have that $\partial_\lambda\mathbf{e}_\mu = \Gamma^\nu{}_{\mu\lambda}\,\mathbf{e}_\nu$, so we can read off the Christoffels,
$$\begin{aligned}
\Gamma^\rho{}_{\rho\rho} &= 0 , & \Gamma^\phi{}_{\rho\rho} &= 0 , \\
\Gamma^\rho{}_{\rho\phi} &= 0 , & \Gamma^\phi{}_{\rho\phi} &= +\frac{1}{\rho} , \\
\Gamma^\rho{}_{\phi\rho} &= 0 , & \Gamma^\phi{}_{\phi\rho} &= +\frac{1}{\rho} , \\
\Gamma^\rho{}_{\phi\phi} &= -\rho , & \Gamma^\phi{}_{\phi\phi} &= 0 .
\end{aligned} \tag{7.12}$$
Alternatively, we could have obtained these expressions for the Christoffels by taking derivatives of the metric tensor as in eq.(7.6). But I think also showing the explicit effect on the basis vectors as in eq.(7.11) helps us understand the physics better. I recommend drawing yourself some pictures to illustrate explicitly how the plane polar coordinate basis vectors change with position according to the above equations.
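The alternative route through eq.(7.6) can be sketched numerically. This is an illustrative check rather than anything from the original notes: the function names are hypothetical, the metric derivatives are estimated by central finite differences with an assumed step `h`, and for brevity the code assumes a diagonal metric so that the inverse metric is just the reciprocal of each diagonal entry.

```python
def polar_metric(x):
    # g_{mu nu} for ds^2 = d rho^2 + rho^2 d phi^2, eq. (7.10); x = (rho, phi)
    rho, phi = x
    return [[1.0, 0.0], [0.0, rho**2]]

def christoffel(metric, x, h=1e-5):
    """Gamma[m][n][l] = Gamma^m_{nl} from eq. (7.6), for a DIAGONAL metric only.
    Derivatives are central differences; h is an assumed step size."""
    D = len(x)
    g = metric(x)
    ginv = [1.0 / g[m][m] for m in range(D)]  # diagonal inverse metric

    def dg(s):
        # partial_s g_{mn} by central differences
        xp, xm = list(x), list(x)
        xp[s] += h
        xm[s] -= h
        gp, gm = metric(xp), metric(xm)
        return [[(gp[m][n] - gm[m][n]) / (2 * h) for n in range(D)]
                for m in range(D)]

    d = [dg(s) for s in range(D)]  # d[s][m][n] = partial_s g_{mn}
    return [[[0.5 * ginv[m] * (d[n][m][l] + d[l][n][m] - d[m][n][l])
              for l in range(D)] for n in range(D)] for m in range(D)]

rho0 = 2.0
Gamma = christoffel(polar_metric, [rho0, 0.3])
# Index 0 is rho and index 1 is phi, so
# Gamma[0][1][1] approximates Gamma^rho_{phi phi} = -rho,
# Gamma[1][0][1] and Gamma[1][1][0] approximate Gamma^phi_{rho phi} = 1/rho.
```

The nonzero entries should reproduce eq.(7.12) at the chosen point, with all other components vanishing up to rounding.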

Previously, we noticed that taking the partial derivative of a tensor does not give another tensor, generically. The problem was that the coordinate transformation generally depends on spacetime position. Let us delineate the properties that a covariant derivative $\nabla$ should have. It should be linear,
$$\nabla(T + S) = \nabla T + \nabla S , \tag{7.13}$$
where $S$, $T$ are arbitrary tensors, and it should obey the Leibniz rule,
$$\nabla(T \otimes S) = (\nabla T)\otimes S + T\otimes(\nabla S) . \tag{7.14}$$
It should also commute with contractions, which is tantamount to assuming that
$$\nabla_\sigma g_{\mu\nu} = 0 , \tag{7.15}$$
a very reasonable assumption. The covariant derivative should reduce to the partial derivative when operating upon scalars, because those tensors have no legs. The combination of the first two properties above implies that $\nabla$ can be written as the sum of the partial derivative $\partial$ and a linear transformation, which you can think of as a 'correction' to keep the derivative tensorial. The coefficients of this correction term are known as the connection coefficients or the Christoffel symbols $\Gamma^\mu{}_{\alpha\beta}$ (pronounced criss-toff-ill). To see how this works, let us consider the derivative of a contravariant vector $\mathbf{v}$,
$$\begin{aligned}
\partial_\nu\mathbf{v} &= (\partial_\nu v^\mu)\,\mathbf{e}_\mu + v^\mu(\partial_\nu\mathbf{e}_\mu) \\
&= (\partial_\nu v^\mu)\,\mathbf{e}_\mu + v^\mu\left(\Gamma^\lambda{}_{\mu\nu}\,\mathbf{e}_\lambda\right) \\
&= (\partial_\nu v^\mu)\,\mathbf{e}_\mu + v^\lambda\left(\Gamma^\mu{}_{\lambda\nu}\,\mathbf{e}_\mu\right) \\
&= \left(\partial_\nu v^\mu + v^\lambda\,\Gamma^\mu{}_{\lambda\nu}\right)\mathbf{e}_\mu ,
\end{aligned} \tag{7.16}$$
where in the third line we relabelled dummy indices. The part in brackets is known as the covariant derivative,
$$\nabla_\nu v^\mu \equiv \partial_\nu v^\mu + \Gamma^\mu{}_{\lambda\nu}\,v^\lambda . \tag{7.17}$$

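By exactly similar logic, we can find the covariant derivative of a covariant vector,
$$\nabla_\nu\omega_\mu = \partial_\nu\omega_\mu - \Gamma^\lambda{}_{\mu\nu}\,\omega_\lambda . \tag{7.18}$$
If we want to take the covariant derivative of a rank $(m, n)$ tensor, then we just act on each of its legs in turn with the connection,
$$\begin{aligned}
\nabla_\sigma T^{\mu_1\ldots\mu_m}{}_{\nu_1\ldots\nu_n} = \partial_\sigma T^{\mu_1\ldots\mu_m}{}_{\nu_1\ldots\nu_n}
&+ \Gamma^{\mu_1}{}_{\sigma\lambda}\,T^{\lambda\mu_2\ldots\mu_m}{}_{\nu_1\ldots\nu_n} + \Gamma^{\mu_2}{}_{\sigma\lambda}\,T^{\mu_1\lambda\mu_3\ldots\mu_m}{}_{\nu_1\ldots\nu_n} + \ldots \\
&- \Gamma^{\lambda}{}_{\sigma\nu_1}\,T^{\mu_1\ldots\mu_m}{}_{\lambda\nu_2\ldots\nu_n} - \Gamma^{\lambda}{}_{\sigma\nu_2}\,T^{\mu_1\ldots\mu_m}{}_{\nu_1\lambda\nu_3\ldots\nu_n} - \ldots
\end{aligned} \tag{7.19}$$
How about a quick example? We can use the Christoffels to teach us how to take the covariant Laplacian in 2D plane polar coordinates. If we have a scalar field $\Psi(\rho, \phi)$, then
$$\begin{aligned}
\nabla_\mu\nabla^\mu\Psi &= \nabla_\mu\partial^\mu\Psi \\
&= \partial_\mu\partial^\mu\Psi + \Gamma^\mu{}_{\mu\nu}\,\partial^\nu\Psi \\
&= (\partial_\rho\partial^\rho + \partial_\phi\partial^\phi)\Psi + \Gamma^\rho{}_{\rho\rho}\,\partial^\rho\Psi + \Gamma^\rho{}_{\rho\phi}\,\partial^\phi\Psi + \Gamma^\phi{}_{\phi\rho}\,\partial^\rho\Psi + \Gamma^\phi{}_{\phi\phi}\,\partial^\phi\Psi \\
&= \frac{\partial^2\Psi}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial\Psi}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2\Psi}{\partial\phi^2} .
\end{aligned} \tag{7.20}$$
This should look familiar from vector calculus class: we just derived it from first principles.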
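A quick numerical sanity check of the final line of eq.(7.20) is sketched below. It is an illustration only and not from the original notes: the two test functions, the finite-difference step `h` and the sample point are assumptions. The function ρ² cos 2φ equals x² − y² and should be harmonic, while ρ² = x² + y² should have Laplacian 4.

```python
import math

def polar_laplacian(f, rho, phi, h=1e-4):
    """Last line of eq. (7.20), with derivatives replaced by central differences."""
    d2_rho = (f(rho + h, phi) - 2 * f(rho, phi) + f(rho - h, phi)) / h**2
    d1_rho = (f(rho + h, phi) - f(rho - h, phi)) / (2 * h)
    d2_phi = (f(rho, phi + h) - 2 * f(rho, phi) + f(rho, phi - h)) / h**2
    return d2_rho + d1_rho / rho + d2_phi / rho**2

def harmonic(rho, phi):
    # rho^2 cos(2 phi) = x^2 - y^2, whose Cartesian Laplacian is 2 - 2 = 0
    return rho**2 * math.cos(2 * phi)

def radial(rho, phi):
    # rho^2 = x^2 + y^2, whose Cartesian Laplacian is 2 + 2 = 4
    return rho**2

lap_harmonic = polar_laplacian(harmonic, 1.5, 0.4)
lap_radial = polar_laplacian(radial, 1.5, 0.4)
```

Agreement with the Cartesian answers (0 and 4) is only up to finite-difference error, so this is a consistency check and not a proof.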

One of the most important things to remember is that the connection is not a tensor. It has components labelled with Greek indices, but that does not make it a tensor in and of itself. Indeed, the connection is designed specifically to correct the non-tensorial property of the partial derivative in order to create a new tensor from an old one. Its transformation law under a coordinate transformation is
$$\Gamma^{\nu'}{}_{\mu'\lambda'} = \frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial x^\lambda}{\partial x^{\lambda'}}\frac{\partial x^{\nu'}}{\partial x^\nu}\,\Gamma^\nu{}_{\mu\lambda} - \frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial x^\lambda}{\partial x^{\lambda'}}\frac{\partial^2 x^{\nu'}}{\partial x^\mu\,\partial x^\lambda} . \tag{7.21}$$
From this, you can see that the difference between two connections is a tensor, because the second term (which is independent of the $\Gamma$s) drops out of their transformation law.

Our connection is metric compatible, meaning that the covariant derivative constructed from it obeys
$$\nabla_\sigma g_{\mu\nu} = 0 , \tag{7.22}$$
for all values of $\sigma, \mu, \nu$. There are two other useful equations that follow from this one,
$$\nabla_\sigma g^{\mu\nu} = 0 , \tag{7.23}$$
$$\nabla_\lambda\,\epsilon_{\mu_0\mu_1\ldots\mu_d} = 0 , \tag{7.24}$$
where the completely antisymmetric Levi-Civita tensor $\epsilon_{\mu_0\mu_1\ldots\mu_d}$ is used in integrating covariantly over spacetime. We will not have occasion to use it here, but it will play an important role in deriving Einstein's equations from an action principle in the GR2 course. The fact that our metric-compatible covariant derivative commutes with raising and lowering of indices is very fortunate: if there were torsion, we would have to be scrupulously careful about our index placements.

In discussing covariant derivatives of tensors, it is worth noting here that some people use a different convention than ours. They abbreviate by defining commas after indices to represent partial derivatives, while semicolons represent covariant derivatives. We will stick with keeping $\partial$ and $\nabla$ explicit, because in pages full of long GR equations it is all too easy to lose track of punctuation marks.

7.3 The covariant derivative and parallel transport

Introducing a covariant derivative (as compared to a plain partial derivative) was a really great idea for doing physics. It allows us to write tensor equations wherever we go. All we need to do is to be sure to write $\nabla$s rather than $\partial$s. But one question worth asking is this: what rate of change does $\nabla$ actually measure?

A way to answer that question and get a better handle on $\nabla$ is to ask when $\nabla$ of some tensor is zero. For this we actually have to specify the path along which we hope to compare tensors, because comparing tensors at two different points is, a priori, meaningless in GR. After all, the spacetime metric varies from point to point.

Consider a parametrized curve $x^\mu(\lambda)$, and a vector field $\mathbf{v}(\lambda) = v^\mu(\lambda)\,\mathbf{e}_\mu(\lambda)$. The derivative of the vector field $\mathbf{v}$ w.r.t. the curve parameter $\lambda$ is
$$\begin{aligned}
\frac{d\mathbf{v}}{d\lambda} &= \frac{dv^\mu}{d\lambda}\,\mathbf{e}_\mu + v^\mu\,\frac{\partial\mathbf{e}_\mu}{\partial x^\sigma}\frac{dx^\sigma}{d\lambda} \\
&= \left(\frac{dv^\mu}{d\lambda} + \Gamma^\mu{}_{\nu\sigma}\,v^\nu\,\frac{dx^\sigma}{d\lambda}\right)\mathbf{e}_\mu \\
&\equiv \frac{Dv^\mu}{D\lambda}\,\mathbf{e}_\mu .
\end{aligned} \tag{7.25}$$
The quantity
$$\frac{D}{D\lambda} \equiv \frac{dx^\sigma}{d\lambda}\,\nabla_\sigma \tag{7.26}$$
is known as the directional covariant derivative. This animal is only defined along the path $x^\mu(\lambda)$, and when acting on a tensor it produces another tensor. We say that a tensor $T$ is parallel transported along the path if
$$\frac{D}{D\lambda}\,T^{\mu_1\ldots\mu_k}{}_{\nu_1\ldots\nu_\ell} = \frac{dx^\sigma}{d\lambda}\left(\nabla_\sigma T^{\mu_1\ldots\mu_k}{}_{\nu_1\ldots\nu_\ell}\right) = 0 . \tag{7.27}$$
This is known as the equation of parallel transport, and it is a proper tensor equation.

Now, since we have a metric compatible connection, $\nabla_\sigma g_{\mu\nu} = 0$, parallel transport preserves the inner product of two vectors. For example, for two vectors $V^\mu$ and $W^\mu$,
$$\frac{D}{D\lambda}\left(g_{\mu\nu}V^\mu W^\nu\right) = \left(\frac{D}{D\lambda}g_{\mu\nu}\right)V^\mu W^\nu + g_{\mu\nu}\left(\frac{D V^\mu}{D\lambda}\,W^\nu + V^\mu\,\frac{D W^\nu}{D\lambda}\right) \tag{7.28}$$
$$= 0 + 0 + 0 = 0 , \tag{7.29}$$
if the vectors are both parallel transported. You can visualize what parallel transport does by imagining that it keeps the same angle between the vector and the directional derivative along the path $x^\mu(\lambda)$.

To see what parallel transporting can imply, consider the two-sphere. Imagine that we start at the North Pole with a vector at an angle. We keep the angle of our vector constant as we move along a line of longitude, (say) the Greenwich meridian, down to the Equator. Then imagine that we turn East and continue parallel transporting our vector some way around the Equator. Then we turn North and parallel transport our vector up a second line of longitude, back to the North Pole. If you have visualized this correctly in your mind, you will see that our vector, regardless of the direction it was initially pointing, has undergone a finite rotation. This is because the sphere is (positively) curved.
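The round trip just described can be sketched in a few lines. The sketch relies on an assumption that is not spelled out in these notes: for a unit sphere embedded in flat $\mathbb{R}^3$, parallel transport along a great-circle arc is the rigid rotation about the axis normal to the plane of that arc, by the angle subtended. All three legs of the trip are great-circle arcs, so no differential equation needs to be integrated. The function names and the choice of a quarter-turn along the Equator are illustrative.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rotate(v, axis, angle):
    """Rodrigues rotation of v about a unit axis by angle (right-hand rule)."""
    c, s = math.cos(angle), math.sin(angle)
    kxv = cross(axis, v)
    kv = dot(axis, v)
    return tuple(v[i] * c + kxv[i] * s + axis[i] * kv * (1 - c) for i in range(3))

def transport_triangle(v0, dphi):
    """Carry the tangent vector v0 from the North Pole around the triangle
    pole -> Equator -> east by dphi -> pole; return (final point, final vector)."""
    north = (0.0, 0.0, 1.0)
    # Leg 1: down the phi = 0 meridian to the Equator (rotation about the y axis).
    axis1 = (0.0, 1.0, 0.0)
    p = rotate(north, axis1, math.pi / 2)
    v = rotate(v0, axis1, math.pi / 2)
    # Leg 2: east along the Equator by dphi (rotation about the z axis).
    axis2 = (0.0, 0.0, 1.0)
    p = rotate(p, axis2, dphi)
    v = rotate(v, axis2, dphi)
    # Leg 3: back up the new meridian to the North Pole (axis = p x north).
    axis3 = cross(p, north)
    p = rotate(p, axis3, math.pi / 2)
    v = rotate(v, axis3, math.pi / 2)
    return p, v

v_start = (1.0, 0.0, 0.0)            # tangent at the pole, pointing along phi = 0
p_end, v_end = transport_triangle(v_start, math.pi / 2)
angle = math.atan2(cross(v_start, v_end)[2], dot(v_start, v_end))
```

With a quarter-turn along the Equator the vector comes back rotated by a quarter-turn; more generally the rotation angle equals the longitude difference `dphi`, which is also the area enclosed by the triangle on the unit sphere.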

8 M05Oct

8.1 The geodesic equations for test particle motion in curved spacetime

A geodesic is a path $x^\mu(\lambda)$ that parallel transports its own tangent vector. It follows that the equation satisfied by the geodesic is
$$\frac{D}{D\lambda}\left(\frac{dx^\mu}{d\lambda}\right) = \frac{d^2x^\mu}{d\lambda^2} + \Gamma^\mu{}_{\nu\sigma}\,\frac{dx^\nu}{d\lambda}\frac{dx^\sigma}{d\lambda} = 0 . \tag{8.1}$$
We can also think about parallel transport in the following way. When we take an ordinary partial derivative, we do it by taking
$$\lim_{\Delta x\to 0}\frac{f(x^\mu + \Delta x^\mu) - f(x^\mu)}{\Delta x^\mu} = \frac{\partial f}{\partial x^\mu} . \tag{8.2}$$
In curved spacetime, the result of this is not a tensor. What we do is instead take the covariant derivative, as follows.

1. We take $x^\mu(\lambda + d\lambda)$ as our "x plus an infinitesimal change" and find $T$ there.
2. We parallel transport $T$ back to the original point at $x^\mu(\lambda)$, along the path $x^\mu(\lambda)$.
3. We compare the parallel-transported-back $T$ to the original $T$ at $x^\mu(\lambda)$, and we 'divide by' $d\lambda$.

The result is DT/Dλ, the covariant rate of change of the tensor with respect to λ at the spacetime point xµ(λ).

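Let us now see another way that the geodesic equation can be derived, using a variational approach. Consider a massive point particle in proper time gauge. The relativistic einbein action is, up to a constant that is physically irrelevant at the classical level,
$$S = -m\int d\tau\; g_{\mu\nu}(x^\lambda)\,\frac{dx^\mu(\tau)}{d\tau}\frac{dx^\nu(\tau)}{d\tau} . \tag{8.3}$$
What happens when we vary
$$x^\mu \to x^\mu + \delta x^\mu \;? \tag{8.4}$$
Under such a variation,
$$g_{\mu\nu} \to g_{\mu\nu} + (\partial_\sigma g_{\mu\nu})\,\delta x^\sigma . \tag{8.5}$$
Varying the action, we have
$$-\frac{1}{m}\,\delta S = \int d\tau\;\delta\left(g_{\mu\nu}\,\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau}\right) \tag{8.6}$$
$$= \int d\tau\left\{(\partial_\sigma g_{\mu\nu})\,\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau}\,\delta x^\sigma + \left[g_{\mu\nu}\,\frac{d\,\delta x^\mu}{d\tau}\frac{dx^\nu}{d\tau} + (\mu\leftrightarrow\nu)\right]\right\} \tag{8.7}$$
$$= \int d\tau\left\{(\partial_\sigma g_{\mu\nu})\,\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau}\,\delta x^\sigma - \left[(\partial_\sigma g_{\mu\nu})\,\frac{dx^\sigma}{d\tau}\frac{dx^\nu}{d\tau}\,\delta x^\mu + g_{\mu\nu}\,\frac{d^2x^\mu}{d\tau^2}\,\delta x^\nu + (\mu\leftrightarrow\nu)\right]\right\} , \tag{8.8}$$
where in the last step we integrated by parts$^7$. We also used the fact that
$$\delta\left(\frac{dx^\mu}{d\tau}\right) = \frac{d\,\delta x^\mu}{d\tau} . \tag{8.9}$$
Collecting all the terms, relabelling dummy indices and using the symmetry of the metric, we have
$$-\frac{1}{m}\,\delta S = -2\int d\tau\left[g_{\mu\sigma}\,\frac{d^2x^\sigma}{d\tau^2} + g_{\mu\sigma}\,\Gamma^\sigma{}_{\nu\rho}\,\frac{dx^\nu}{d\tau}\frac{dx^\rho}{d\tau}\right]\delta x^\mu . \tag{8.10}$$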

Demanding that this be zero for arbitrary variations $\delta x^\mu$, we obtain the geodesic equation,
$$\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\rho}\,\frac{dx^\nu}{d\tau}\frac{dx^\rho}{d\tau} = 0 . \tag{8.11}$$
An affine parameter $\lambda$ is defined to be $\lambda = a\tau + b$ for constants $a, b$. In other words, $\lambda$ is an affine parameter if it is linearly related to $\tau$ (for a massive particle). For a massless particle, we can still define an affine parameter. In fact, our geodesic equation requires just such an affine parametrization, regardless of the particle mass.

For either massive or massless particles, the geodesic equation can be written in very compact form in terms of the momentum vector,
$$p^\nu\,\nabla_\nu\,p^\mu = 0 . \tag{8.12}$$
For point particles, we relate the momentum $p^\mu$ to the four-velocity $u^\mu$ via
$$\begin{aligned}
p^\mu &= m u^\mu = m\,\frac{dx^\mu}{d\tau} , & m^2 &> 0 , \\
p^\mu &= u^\mu , & m^2 &= 0 .
\end{aligned} \tag{8.13}$$
The second formula follows Carroll's convention for defining the "four-velocity" for massless particles. Since $p^\mu p_\mu = 0$ for them, we have a free choice for the proportionality factor.

$^7$ We assume that the manifold has sufficiently trivial topology for the integration by parts to work.

There is a central physics point to understand about this extremization. Is it a minimization or a maximization? In fact, the geodesic maximizes proper time. Why? Well, if we were to lower the proper time interval $(\Delta\tau)^2$ along a changed path, we would get closer to $(\Delta\tau)^2 = 0$, which is a null path. To go lower, to $(\Delta\tau)^2 < 0$, we would have to use an illegal spacelike path. So minimizing $(\Delta\tau)^2$ does not make sense, and in fact the proper time is maximized via the variational principle. The fact that the proper time is maximized happens precisely because it is infinitesimally close to paths with lower proper time. Carroll has a morally similar argument: he shows that for any timelike path, we can approximate it by a (jaggedy looking) piecewise continuous bunch of null paths, all of the pieces of which have zero invariant interval. Since the geodesic is infinitesimally nearby to null paths with zero proper time, it must maximize proper time.

The physical consequence of this mathematical fact that geodesics maximize proper time is that accelerated observers (those who are not in freefall) measure less proper time than those who are in freefall. This is why the space twin in the Twin Paradox always comes back younger, not older, than the homebody twin. The more you accelerate around with your rockets, the younger you are compared to a homebody who stays on a geodesic.

If all geodesics on a spacetime manifold go as far as they please, then the manifold is said to be geodesically complete. But if some geodesic(s) bang into a singularity, or end prematurely, then the manifold is geodesically incomplete. For spacetimes with matter, this is the generic case, actually. Roger Penrose just won part of the 2020 Nobel Prize in Physics for (co-)explaining this.

8.2 Example computation for affine connection and geodesic equations

Let us now work a relatively simple example of calculating Christoffel components for a spacetime with dependence on only one coordinate, $x^0 = t$. We will take the spatially flat$^8$ Friedmann-Robertson-Walker ansatz in $D = d + 1$ spacetime dimensions,
$$ds^2 = dt^2 - a^2(t)\,|d\vec{x}|^2 , \tag{8.14}$$
where $a(t)$ is the scale factor. Since
$$ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu , \tag{8.15}$$
we have
$$g_{00} = +1 , \qquad g_{ij} = -[a(t)]^2\,\delta_{ij} . \tag{8.16}$$
Because the metric is diagonal, we can invert it by eye, to obtain
$$g^{00} = +1 , \qquad g^{ij} = -[a(t)]^{-2}\,\delta^{ij} . \tag{8.17}$$

$^8$ For the more general case with nontrivial spatial metric, see Carroll §8.3.

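Finding the Christoffels is relatively straightforward, as many of them are zero. Notice that the only coordinate dependence in the metric is on the time coordinate.

First, let us try for $\Gamma^0{}_{00}$,
$$\Gamma^0{}_{00} = \frac{1}{2}\,g^{0\sigma}\left(\partial_0 g_{0\sigma} + \partial_0 g_{0\sigma} - \partial_\sigma g_{00}\right) = \frac{1}{2}\,g^{00}\,\partial_0 g_{00} = 0 , \tag{8.18}$$
because the metric is diagonal and because $g_{00}$ is a constant. Next up is
$$\begin{aligned}
\Gamma^0{}_{0i} &= \frac{1}{2}\,g^{0\sigma}\left(\partial_0 g_{i\sigma} + \partial_i g_{0\sigma} - \partial_\sigma g_{0i}\right) \\
&= \frac{1}{2}\,g^{00}\left(\partial_0 g_{i0} + \partial_i g_{00} - \partial_0 g_{0i}\right) \\
&= \frac{1}{2}\,g^{00}\,\partial_i g_{00} = 0 ,
\end{aligned} \tag{8.19}$$
because the metric is diagonal and because $g_{00}$ is a constant.

A more interesting case is $\Gamma^0{}_{ij}$, which is nonzero.
$$\begin{aligned}
\Gamma^0{}_{ij} &= \frac{1}{2}\,g^{0\sigma}\left(\partial_i g_{j\sigma} + \partial_j g_{i\sigma} - \partial_\sigma g_{ij}\right) \\
&= \frac{1}{2}\,g^{00}\left(\partial_i g_{j0} + \partial_j g_{i0} - \partial_0 g_{ij}\right) \\
&= -\frac{1}{2}\,g^{00}\,\partial_0 g_{ij} = a\dot{a}\,\delta_{ij} ,
\end{aligned} \tag{8.20}$$
where the overdot denotes $d/dt$. Along the way, we again used the fact that the metric is diagonal and $g_{00}$ is a constant.

Now consider $\Gamma^i{}_{00}$.
$$\Gamma^i{}_{00} = \frac{1}{2}\,g^{i\sigma}\left(\partial_0 g_{0\sigma} + \partial_0 g_{0\sigma} - \partial_\sigma g_{00}\right) = 0 , \tag{8.21}$$
because the metric is diagonal and because $g_{00}$ is a constant.

Next, let us look at the only other nonzero Christoffel symbol $\Gamma^i{}_{0j}$. We have
$$\begin{aligned}
\Gamma^i{}_{0j} &= \frac{1}{2}\,g^{i\sigma}\left(\partial_0 g_{j\sigma} + \partial_j g_{0\sigma} - \partial_\sigma g_{0j}\right) \\
&= \frac{1}{2}\,g^{ik}\left(\partial_0 g_{jk} + \partial_j g_{0k} - \partial_k g_{0j}\right) \\
&= \frac{1}{2}\,g^{ik}\,\partial_0 g_{jk} \\
&= \frac{1}{2}\left\{-[a(t)]^{-2}\,\delta^{ik}\right\}\partial_0\left\{-[a(t)]^2\,\delta_{jk}\right\} \\
&= \frac{\dot{a}}{a}\,\delta^i_j .
\end{aligned} \tag{8.22}$$
Finally, what about the all-spatial Christoffels $\Gamma^i{}_{jk}$? We have
$$\Gamma^i{}_{jk} = \frac{1}{2}\,g^{i\ell}\left(\partial_j g_{k\ell} + \partial_k g_{j\ell} - \partial_\ell g_{jk}\right) = 0 , \tag{8.23}$$
because none of the spatial components of the metric depends on spatial position. In summary, we have:
$$\Gamma^0{}_{ij} = a\dot{a}\,\delta_{ij} , \tag{8.24}$$
$$\Gamma^i{}_{j0} = \frac{\dot{a}}{a}\,\delta^i_j , \tag{8.25}$$
with all other components zero. Notice how it is the "velocity" of the scale factor $\dot{a}(t)$ that appears here. The quantity
$$\frac{\dot{a}}{a} = H(t) \tag{8.26}$$
is known as the Hubble parameter; it is a genuine constant, the Hubble constant, if the scale factor is exponential. (Whether or not the scale factor can behave in this fashion is determined by the energy-momentum of matter in the spacetime, as we will discover later on in the course.)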

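Now let us look at the geodesic equations in this simple spacetime, doing a time-space split like for the Christoffels above. In general, we have
$$\frac{d^2x^\mu}{d\lambda^2} + \Gamma^\mu{}_{\nu\sigma}\,\frac{dx^\nu}{d\lambda}\frac{dx^\sigma}{d\lambda} = 0 . \tag{8.27}$$
The 0th component of this equation reads
$$\begin{aligned}
0 &= \frac{d^2x^0}{d\lambda^2} + \Gamma^0{}_{\nu\sigma}\,\frac{dx^\nu}{d\lambda}\frac{dx^\sigma}{d\lambda} \\
&= \frac{d^2x^0}{d\lambda^2} + \Gamma^0{}_{ij}\,\frac{dx^i}{d\lambda}\frac{dx^j}{d\lambda} \\
&= \frac{d^2x^0}{d\lambda^2} + a\dot{a}\,\delta_{ij}\,\frac{dx^i}{d\lambda}\frac{dx^j}{d\lambda} ,
\end{aligned} \tag{8.28}$$
because all the other terms contributing to the sums over $\nu$ and $\sigma$ involve Christoffel components that are zero. The $i$th component reads
$$\begin{aligned}
0 &= \frac{d^2x^i}{d\lambda^2} + \Gamma^i{}_{\nu\sigma}\,\frac{dx^\nu}{d\lambda}\frac{dx^\sigma}{d\lambda} \\
&= \frac{d^2x^i}{d\lambda^2} + \Gamma^i{}_{0j}\,\frac{dx^0}{d\lambda}\frac{dx^j}{d\lambda} + \Gamma^i{}_{j0}\,\frac{dx^j}{d\lambda}\frac{dx^0}{d\lambda} \\
&= \frac{d^2x^i}{d\lambda^2} + \frac{2\dot{a}}{a}\,\frac{dx^0}{d\lambda}\frac{dx^i}{d\lambda} .
\end{aligned} \tag{8.29}$$
The first thing to notice about these geodesic equations we have derived is that they are coupled and nonlinear. The equation for $dx^0/d\lambda$ depends on what $dx^i/d\lambda$ are doing, and vice versa. This is why solving for motions of massless particles (photons) or massive particles (like electrons) in the background of a general curved spacetime is generically much more complicated than doing Newton's Laws for non-relativistic physics.

The second thing to notice about our super-simple spacetime is that the spatial geodesic equations actually have a first integral (!). To see this, let us try taking the $\lambda$ derivative of
$$p_i = a^2(t)\,\delta_{ij}\,\frac{dx^j}{d\lambda} . \tag{8.30}$$
We have, by the Leibniz rule and the chain rule,
$$\begin{aligned}
\frac{d}{d\lambda}\,p_i &= \delta_{ij}\,\frac{d}{d\lambda}\left(a^2(x^0)\,\frac{dx^j}{d\lambda}\right) \\
&= \delta_{ij}\left(2a\dot{a}\,\frac{dx^0}{d\lambda}\right)\frac{dx^j}{d\lambda} + \delta_{ij}\,a^2\,\frac{d^2x^j}{d\lambda^2} \\
&= \delta_{ij}\,a^2\left(\frac{d^2x^j}{d\lambda^2} + \frac{2\dot{a}}{a}\,\frac{dx^0}{d\lambda}\frac{dx^j}{d\lambda}\right) \\
&= 0 .
\end{aligned} \tag{8.31}$$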

Therefore, pi is a conserved quantity along the geodesic. As we will see a bit later in the course, this conservation law arises because our spacetime metric has a symmetry: none of the components of the metric tensor depends on spatial coordinates. This is your first example of how Noether’s Theorem works in General Relativity.
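The conservation law can be watched at work by integrating the geodesic equations (8.28) and (8.29) numerically. The following is a minimal sketch under assumptions not made in the notes: one spatial dimension, a power-law scale factor a(t) = t^{2/3} picked purely for illustration, a hand-written fourth-order Runge-Kutta stepper, and arbitrary initial data for a timelike geodesic.

```python
import math

def a(t):
    return t ** (2.0 / 3.0)                   # assumed scale factor, illustration only

def adot(t):
    return (2.0 / 3.0) * t ** (-1.0 / 3.0)    # da/dt for the choice above

def rhs(state):
    # state = (t, x, dt/dlambda, dx/dlambda); eqs. (8.28) and (8.29) with one x
    t, x, ut, ux = state
    return (ut,
            ux,
            -a(t) * adot(t) * ux * ux,
            -2.0 * adot(t) / a(t) * ut * ux)

def rk4_step(state, h):
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, h / 2))
    k3 = rhs(shifted(state, k2, h / 2))
    k4 = rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (q1 + 2 * q2 + 2 * q3 + q4)
                 for s, q1, q2, q3, q4 in zip(state, k1, k2, k3, k4))

def momentum(state):
    t, x, ut, ux = state
    return a(t) ** 2 * ux                     # p_i of eq. (8.30)

def norm(state):
    t, x, ut, ux = state
    return ut ** 2 - a(t) ** 2 * ux ** 2      # g_{mu nu} u^mu u^nu

# Initial data: t = 1, x = 0, dx/dlambda = 0.5, with dt/dlambda fixed by unit norm.
ux0 = 0.5
state = (1.0, 0.0, math.sqrt(1.0 + a(1.0) ** 2 * ux0 ** 2), ux0)
p_start, n_start = momentum(state), norm(state)
for _ in range(2000):
    state = rk4_step(state, 1e-3)
p_end, n_end = momentum(state), norm(state)
```

Both `p_end` and `n_end` should agree with their starting values to within integrator error, even though the coordinate velocity dx/dλ itself decays as the scale factor grows.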

9 R08Oct

9.1 Spacetime curvature

Einstein's General Theory of Relativity upgraded the way we think about gravitational physics. Instead of imposing Newton's three laws of motion and imposing his force law for universal gravitation, we assume that the starting point is the fabric of spacetime. We worked quite hard already to define tensors on arbitrary spacetimes, by focusing intensely on their transformation properties under changes of reference frame, i.e., changes of coordinates. We also figured out in our last lecture how to define a covariant derivative, with the help of the Levi-Civita connection. We went to all that trouble of wrangling the Christoffel symbols because this enabled us to do two exciting things: (a) to define a derivative $\nabla_\mu$ that is a tensor, even in curved spacetime, and (b) to derive the geodesic equation, which is the equation obeyed by any relativistic particle undergoing freefall in the spacetime in question. Along the way, we learned that geodesics maximize the proper time.

As we alluded to earlier, curvature is the mathematical quantity that Albert Einstein discovered was the key to gravitational physics expressed in the language of curved spacetime. He realized that the Riemann tensor, which contains at most two derivatives of the metric tensor, could even be used to build an action principle for general relativity. We will derive the Einstein action and the Einstein equations of motion for the gravitational field in the GR2 course. For now, all we need to keep in mind is that the Riemann tensor encodes a wide variety of gravitational phenomena in its tensor components, including the physics of tidal forces and the motion of particles in spacetime. In particular, we will soon show how in the Newtonian limit of weak gravity and slow speeds, we will recover familiar expressions from Newtonian physics, without ever having to use the concept of a force! First, we need to develop a bit more formalism.

9.2 The Riemann tensor

Consider an infinitesimal parallelogram, with vectors $A^\mu$ and $B^\nu$ forming the sides.

In hand-waving terms, the Riemann curvature is what tells us how much a vector $V^\mu$ gets rotated under parallel transport around the parallelogram. The infinitesimal change in $V$, $\delta V$, is a (1,0) tensor, and so are $A$, $B$, and $V$. Roughly speaking, we expect $\delta V$ to be proportional to $V$ and to the size of the parallelogram. To connect $\delta V$ to $A, B, V$ we need a (1,3) tensor with which to contract indices naturally, and the role of this is played by the Riemann curvature. The resulting equation from our handwaving is therefore
$$\delta V^\mu \sim R^\mu{}_{\nu\alpha\beta}\,V^\nu A^\alpha B^\beta . \tag{9.1}$$
While this sketch of Riemann's origin gives us the gist, we now need to be more precise and make a proper definition.

Recall that earlier we found parallel transport to be the right way of thinking about how to compare vectors at different places in spacetime. Combined with our little parallelogram hand-wave just now, this can be used to motivate a mathematical definition of the Riemann tensor as arising from taking commutators of covariant derivatives. On a (1,0) vector $V$, Riemann is defined via$^{9}$
\[
  [\nabla_\mu, \nabla_\nu]\, V^\rho = + R^\rho{}_{\lambda\mu\nu} V^\lambda , \tag{9.2}
\]
for a torsion-free connection. This formula teaches us how to find the components of the Riemann tensor in terms of Christoffel connection coefficients. Let us write out the pieces individually to see how it works out. First, note that for any (1,1) tensor $T_\nu{}^\rho$,
\[
  \nabla_\mu T_\nu{}^\rho = \partial_\mu T_\nu{}^\rho + \Gamma^\rho{}_{\mu\sigma} T_\nu{}^\sigma - \Gamma^\lambda{}_{\mu\nu} T_\lambda{}^\rho . \tag{9.3}
\]

So with $T_\nu{}^\rho = \nabla_\nu V^\rho$, we have

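\begin{align}
\nabla_\mu(\nabla_\nu V^\rho) &= \partial_\mu(\nabla_\nu V^\rho) + \Gamma^\rho{}_{\mu\lambda}(\nabla_\nu V^\lambda) - \Gamma^\lambda{}_{\mu\nu}(\nabla_\lambda V^\rho) \tag{9.4}\\
&= \partial_\mu(\partial_\nu V^\rho + \Gamma^\rho{}_{\nu\lambda} V^\lambda) + \Gamma^\rho{}_{\mu\lambda}(\partial_\nu V^\lambda + \Gamma^\lambda{}_{\nu\sigma} V^\sigma) - \Gamma^\lambda{}_{\mu\nu}(\partial_\lambda V^\rho + \Gamma^\rho{}_{\lambda\sigma} V^\sigma) \tag{9.5}\\
&= \partial_\mu\partial_\nu V^\rho + \Gamma^\rho{}_{\nu\lambda}\partial_\mu V^\lambda + \Gamma^\rho{}_{\mu\lambda}\partial_\nu V^\lambda - \Gamma^\lambda{}_{\mu\nu}\partial_\lambda V^\rho \notag\\
&\quad + (\partial_\mu \Gamma^\rho{}_{\nu\sigma}) V^\sigma + \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma} V^\sigma - \Gamma^\lambda{}_{\mu\nu}\Gamma^\rho{}_{\lambda\sigma} V^\sigma . \tag{9.6}
\end{align}
Then
\begin{align}
\nabla_\mu(\nabla_\nu V^\rho) - (\mu \leftrightarrow \nu) &= \left(\partial_\mu \Gamma^\rho{}_{\nu\sigma} + \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma}\right) V^\sigma - \Gamma^\lambda{}_{\mu\nu}\Gamma^\rho{}_{\lambda\sigma} V^\sigma \notag\\
&\quad + \left(\Gamma^\rho{}_{\nu\lambda}\partial_\mu V^\lambda + \Gamma^\rho{}_{\mu\lambda}\partial_\nu V^\lambda\right) - \Gamma^\lambda{}_{\mu\nu}\partial_\lambda V^\rho - (\mu \leftrightarrow \nu) \tag{9.7}\\
&= \left[\partial_\mu \Gamma^\rho{}_{\nu\sigma} + \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma} - (\mu \leftrightarrow \nu)\right] V^\sigma . \tag{9.8}
\end{align}
In the last step, every piece that is symmetric under $\mu \leftrightarrow \nu$ drops out: the second derivative $\partial_\mu\partial_\nu V^\rho$, the pair of $\Gamma\,\partial V$ terms taken together, and all terms containing $\Gamma^\lambda{}_{\mu\nu}$, which is symmetric for a torsion-free connection. Now we can put the pieces together to see the general formula for taking the commutator of covariant derivatives acting on a vector. Using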

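\[
  [\nabla_\mu, \nabla_\nu]\, V^\rho = + R^\rho{}_{\sigma\mu\nu} V^\sigma \tag{9.9}
\]
gives us the formula for the Riemann tensor components,
\[
  R^\rho{}_{\sigma\mu\nu} = \partial_\mu \Gamma^\rho{}_{\sigma\nu} - \partial_\nu \Gamma^\rho{}_{\sigma\mu} + \Gamma^\lambda{}_{\sigma\nu}\Gamma^\rho{}_{\lambda\mu} - \Gamma^\lambda{}_{\sigma\mu}\Gamma^\rho{}_{\lambda\nu} . \tag{9.10}
\]
For a covariant derivative acting on a (0,1) tensor, a covariant vector, one finds the same Riemann tensor coefficients and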

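\[
  [\nabla_\mu, \nabla_\nu]\, \omega_\rho = - R^\lambda{}_{\rho\mu\nu}\, \omega_\lambda . \tag{9.11}
\]
If you slog through the details, you can compute the commutator of covariant derivatives on a rank $(k, \ell)$ tensor $V$ as well. This is not much worse than the calculation we have just done, and we suppress the details here. The result is
\begin{align}
[\nabla_\rho, \nabla_\sigma]\, V^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell}
 &= R^{\mu_1}{}_{\lambda\rho\sigma} V^{\lambda \mu_2 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell}
  + R^{\mu_2}{}_{\lambda\rho\sigma} V^{\mu_1 \lambda \mu_3 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} + \ldots \notag\\
 &\quad - R^{\lambda}{}_{\nu_1\rho\sigma} V^{\mu_1 \ldots \mu_k}{}_{\lambda \nu_2 \ldots \nu_\ell}
  - R^{\lambda}{}_{\nu_2\rho\sigma} V^{\mu_1 \ldots \mu_k}{}_{\nu_1 \lambda \nu_3 \ldots \nu_\ell} - \ldots . \tag{9.12}
\end{align}

$^{9}$ We are using the sign conventions of HEL.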

Riemann arises naturally as a rank (1,3) tensor. By doing a partial contraction of two of its indices, we can define the Ricci tensor $R_{\mu\nu}$, which naturally arises as a rank (0,2) tensor,
\[
  R_{\mu\nu} = R^\alpha{}_{\mu\nu\alpha} . \tag{9.13}
\]
Notice that we are contracting the first and fourth indices here to make the Ricci tensor. This is a choice of convention, and we have chosen to use the same convention as the HEL textbook. By contracting the Ricci tensor with the metric, we can form the Ricci scalar $R$, which has rank (0,0),
\[
  R = g^{\mu\nu} R_{\mu\nu} . \tag{9.14}
\]
Other kinds of contractions involving Riemann are also possible, such as “Riemann squared” and “Ricci squared”. For our purposes in this course, we only need to know about the Ricci tensor and the Ricci scalar – because both of them will appear on the left hand side of Einstein’s equations.

Note that if you change the signature of our Lorentzian spacetime from mostly minus to mostly plus, the Christoffels $\Gamma^\lambda{}_{\mu\nu}$ would stay the same, the Riemann tensor $R^\rho{}_{\lambda\mu\nu}$ would also stay the same, and so would the Ricci tensor $R_{\mu\nu}$, but the Ricci scalar $R$ would develop a relative minus sign.

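9.3 Example computations for Riemann

Suppose that we study 2D Euclidean space in plane polar coordinates $\{x^1, x^2\} = \{\rho, \varphi\}$,
\[
  ds^2 = d\rho^2 + \rho^2 d\varphi^2 . \tag{9.15}
\]
We previously found nonzero Christoffels for this spacetime in these coordinates when we introduced basis vectors,
\[
  \Gamma^2{}_{12} = \frac{1}{\rho} , \qquad \Gamma^1{}_{22} = -\rho . \tag{9.16}
\]
From this, we can find the Riemann tensor using our formula from above,
\[
  R^\rho{}_{\sigma\mu\nu} = \partial_\mu \Gamma^\rho{}_{\nu\sigma} - \partial_\nu \Gamma^\rho{}_{\mu\sigma} + \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma} - \Gamma^\rho{}_{\nu\lambda}\Gamma^\lambda{}_{\mu\sigma} . \tag{9.17}
\]
Substituting in gives
\begin{align}
R^1{}_{212} &= \partial_1 \Gamma^1{}_{22} - \partial_2 \Gamma^1{}_{21} + \Gamma^\lambda{}_{22}\Gamma^1{}_{\lambda 1} - \Gamma^\lambda{}_{21}\Gamma^1{}_{\lambda 2} \notag\\
&= \partial_1 \Gamma^1{}_{22} - 0 + 0 - \Gamma^2{}_{21}\Gamma^1{}_{22} \notag\\
&= \partial_\rho(-\rho) - \frac{1}{\rho}(-\rho) \notag\\
&= -1 - (-1) = 0 . \tag{9.18}
\end{align}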

All the other components of Riemann that might have been nonzero are actually zero, also. This result reflects the fact that this 2D spacetime is flat.

Now suppose we try a spacetime which we already suspect is curved: the two-sphere with coordinates $\{x^1, x^2\} = \{\theta, \phi\}$,
\[
  ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2 . \tag{9.19}
\]
Computing the Christoffels is straightforward, using either the formula in terms of derivatives of the metric tensor or the formula for how basis vectors change. The only nonzero components turn out to be
\[
  \Gamma^1{}_{22} = -\sin\theta\cos\theta , \qquad \Gamma^2{}_{12} = \frac{\cos\theta}{\sin\theta} . \tag{9.20}
\]
As we will discover next week, there is only one independent component of Riemann in 2D, and it is $R^1{}_{212}$. To compute it, we substitute in again,
\begin{align}
R^1{}_{212} &= \partial_1 \Gamma^1{}_{22} - \Gamma^2{}_{21}\Gamma^1{}_{22} \notag\\
&= \partial_\theta(-\sin\theta\cos\theta) - \frac{\cos\theta}{\sin\theta}(-\sin\theta\cos\theta) \notag\\
&= -(\cos^2\theta - \sin^2\theta) + \cos^2\theta \notag\\
&= +\sin^2\theta . \tag{9.21}
\end{align}
If we lift the second index using $g^{\mu\nu}$, we obtain
\[
  R^{12}{}_{12} = +1 . \tag{9.22}
\]
The answer is positive because the sphere is positively curved. All of the other nonzero components of Riemann can be expressed in terms of this one, for instance $R^{21}{}_{21} = +R^{12}{}_{12}$.

Finally, let us work a slightly more nontrivial example of calculating Riemann components for a spacetime with dependence on only one coordinate. As with our geodesic equation example at the end of the previous section, we take the spatially flat FRW ansatz in $D = d + 1$ spacetime dimensions,
\[
  ds^2 = dt^2 - a^2(t)\, |d\vec{x}|^2 , \tag{9.23}
\]
where $a(t)$ is the scale factor. Most of the components of Riemann for this simple spacetime are actually zero. Let us sketch how to find the ones that are nonzero. We had for the Christoffels

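\[
  \Gamma^0{}_{ij} = a\dot a\, \delta_{ij} , \qquad \Gamma^i{}_{j0} = \frac{\dot a}{a}\, \delta^i{}_j . \tag{9.24}
\]
The first group of nonzero Riemann components have one time index up and one down, and two spatial indices:
\begin{align}
R^0{}_{i0j} &= \partial_0 \Gamma^0{}_{ji} - \Gamma^0{}_{jk}\Gamma^k{}_{0i} \notag\\
&= (\dot a^2 + a\ddot a)\, \delta_{ij} - a\dot a\, \delta_{jk}\, \frac{\dot a}{a}\, \delta^k{}_i \notag\\
&= a\ddot a\, \delta_{ij} . \tag{9.25}
\end{align}
Then we have
\begin{align}
R^i{}_{00j} &= \partial_0 \Gamma^i{}_{j0} + \Gamma^i{}_{0k}\Gamma^k{}_{j0} \notag\\
&= \frac{a\ddot a - \dot a^2}{a^2}\, \delta^i{}_j + \frac{\dot a^2}{a^2}\, \delta^i{}_j \notag\\
&= \frac{\ddot a}{a}\, \delta^i{}_j . \tag{9.26}
\end{align}
The second group of nonzero Riemann components has all spatial indices,
\begin{align}
R^i{}_{jk\ell} &= \Gamma^i{}_{k0}\Gamma^0{}_{\ell j} - \Gamma^i{}_{\ell 0}\Gamma^0{}_{kj} \notag\\
&= \frac{\dot a}{a}\, \delta^i{}_k\, (a\dot a\, \delta_{\ell j}) - \frac{\dot a}{a}\, \delta^i{}_\ell\, (a\dot a\, \delta_{kj}) \notag\\
&= \dot a^2 \left(\delta^i{}_k \delta_{j\ell} - \delta^i{}_\ell \delta_{jk}\right) . \tag{9.27}
\end{align}
Notice how we have discovered both “velocity squared” $\dot a^2$ terms, which arise via $\Gamma^{\cdot}{}_{\cdot\cdot}\Gamma^{\cdot}{}_{\cdot\cdot}$ parts in Riemann, and “acceleration” $\ddot a$ terms, which arise via $\partial_{\cdot}^2 g_{\cdot\cdot}$ parts in Riemann. It is not until you compute the curvature that you see the appearance of the “acceleration” pieces. Notice also how the “acceleration” of the scale factor showed up in the Riemann components involving the time direction; the all-spatial Riemanns gave only “velocity squared” contributions.

Since we now have the Riemann tensor, we can contract it to find Ricci. The nonzero components are
\begin{align}
R_{00} &= R^i{}_{00i} = +\delta^i{}_i\, \frac{\ddot a}{a} = +d\, \frac{\ddot a}{a} , \notag\\
R_{ij} &= R^0{}_{ij0} + R^k{}_{ijk} \notag\\
&= -a\ddot a\, \delta_{ij} - \dot a^2 \left(\delta^k{}_k \delta_{ij} - \delta^k{}_j \delta_{ik}\right) \notag\\
&= -a\ddot a\, \delta_{ij} - \dot a^2 (d - 1)\, \delta_{ij} \notag\\
&= -\delta_{ij} \left[a\ddot a + (d - 1)\dot a^2\right] , \tag{9.28}
\end{align}
where $d$ is the spatial dimension ($d = 3$ in our universe). Contracting the Ricci tensor with the metric tensor gives the Ricci scalar,
\begin{align}
R &= g^{00} R_{00} + g^{ij} R_{ij} \notag\\
&= +d\, \frac{\ddot a}{a} - \frac{d}{a^2} \left[-\left(a\ddot a + (d - 1)\dot a^2\right)\right] \notag\\
&= +2d\, \frac{\ddot a}{a} + d(d - 1)\, \frac{\dot a^2}{a^2} . \tag{9.29}
\end{align}
Using $D = d + 1$, we can write this in terms of the spacetime dimension $D$,
\[
  R = +2(D - 1)\, \frac{\ddot a}{a} + (D - 1)(D - 2)\, \frac{\dot a^2}{a^2} . \tag{9.30}
\]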

The time evolution of this depends sensitively on the details of how the scale factor evolves. We will need to develop the Einstein equation to see how scale factor evolution is tied to the energy-momentum of the type of matter hanging out in the spacetime. Arbitrary scale factors a(t) are not allowed; the Einstein equations will determine them in terms of the energy density and the pressures.
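The component calculations in this section can be cross-checked numerically. The following is a minimal sketch, not part of the original notes: it evaluates the Riemann formula (9.17) with central finite differences, given a function that returns the Christoffels. The helper names (`riemann`, `sphere_christoffel`, and so on), the step size `h`, and the power-law scale factor $a(t) = t^{2/3}$ are illustrative assumptions.

```python
import math

def riemann(christoffel, x, h=1e-5):
    """Return R[r][s][m][v] = R^r_{s m v} at the point x, following eq. (9.17).

    christoffel(x) must return G with G[r][a][b] = Gamma^r_{ab}.
    Derivatives of the Christoffels are approximated by central differences.
    """
    n = len(x)
    G = christoffel(x)
    dG = []  # dG[m][r][a][b] approximates d_m Gamma^r_{ab}
    for m in range(n):
        xp, xm = list(x), list(x)
        xp[m] += h
        xm[m] -= h
        Gp, Gm = christoffel(xp), christoffel(xm)
        dG.append([[[(Gp[r][a][b] - Gm[r][a][b]) / (2 * h)
                     for b in range(n)] for a in range(n)] for r in range(n)])
    R = [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]
    for r in range(n):
        for s in range(n):
            for m in range(n):
                for v in range(n):
                    val = dG[m][r][v][s] - dG[v][r][m][s]
                    for l in range(n):
                        val += G[r][m][l] * G[l][v][s] - G[r][v][l] * G[l][m][s]
                    R[r][s][m][v] = val
    return R

def empty(n):
    return [[[0.0] * n for _ in range(n)] for _ in range(n)]

def polar_christoffel(x):
    # Eq. (9.16); index 0 is rho, index 1 is the polar angle.
    rho = x[0]
    G = empty(2)
    G[1][0][1] = G[1][1][0] = 1.0 / rho
    G[0][1][1] = -rho
    return G

def sphere_christoffel(x):
    # Eq. (9.20); index 0 is theta, index 1 is phi.
    theta = x[0]
    G = empty(2)
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

P = 2.0 / 3.0  # assumed exponent for the illustrative scale factor a(t) = t**P

def frw_christoffel(x):
    # Eq. (9.24) with d = 3; index 0 is t.
    t = x[0]
    a, adot = t ** P, P * t ** (P - 1)
    G = empty(4)
    for i in (1, 2, 3):
        G[0][i][i] = a * adot
        G[i][i][0] = G[i][0][i] = adot / a
    return G

def frw_ricci_scalar(t):
    R = riemann(frw_christoffel, [t, 0.0, 0.0, 0.0])
    # Ricci convention of eq. (9.13): contract first and fourth indices.
    ricci = [[sum(R[al][m][v][al] for al in range(4)) for v in range(4)]
             for m in range(4)]
    a = t ** P
    return ricci[0][0] - sum(ricci[i][i] for i in (1, 2, 3)) / a ** 2

theta = 0.9
R_sphere = riemann(sphere_christoffel, [theta, 0.3])[0][1][0][1]
R_polar = riemann(polar_christoffel, [1.7, 0.3])[0][1][0][1]
print(R_sphere, math.sin(theta) ** 2)      # compare with eq. (9.21)
print(R_polar)                             # compare with eq. (9.18)
print(frw_ricci_scalar(1.0), 4.0 / 3.0)    # compare with eq. (9.29) for a = t**(2/3)
```

For the assumed scale factor, (9.29) with $d = 3$ gives $R = 6P(2P - 1)/t^2 = 4/(3t^2)$, which is the value the last line compares against.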

10 R15Oct

10.1 Geodesic deviation

Geodesics are generally not straight lines in curved spacetime. Physically, they deviate from one another, because of spacetime curvature. How can we make this intuition mathematically precise? Consider a one-parameter family of geodesics $\gamma_s(\lambda)$, where $\lambda$ is the affine parameter along the geodesic in question. The parameter $s \in \mathbb{R}$ tells you which geodesic you are referring to. We can choose coordinates $s$ and $\lambda$ on the manifold as long as the geodesics do not cross.

Then we have two naturally defined vector fields,
\[
  S^\mu = \frac{\partial x^\mu}{\partial s} , \qquad T^\mu = \frac{\partial x^\mu}{\partial \lambda} . \tag{10.1}
\]
A useful mnemonic here is that $S$ is for Separation while $T$ is for Tangent. We would now like to build the covariant analogue of the ‘relative velocity’ between geodesics,
\[
  V^\mu = T^\alpha \nabla_\alpha S^\mu , \tag{10.2}
\]
and the ‘relative acceleration’
\[
  A^\mu := T^\alpha \nabla_\alpha V^\mu . \tag{10.3}
\]
Note that the acceleration of a path away from being a geodesic is different. That would be
\[
  T^\alpha \nabla_\alpha T^\mu . \tag{10.4}
\]

Since our proposed definitions above are tensor equations, they are well-defined. Now, $S^\mu$ and $T^\mu$ are basis vectors adapted to a coordinate system, with $s$ and $\lambda$. Therefore,
\[
  [S, T] = 0 . \tag{10.5}
\]

On our way towards building the relative acceleration vector, we will need an identity for vector fields,
\begin{align}
[X, Y]^\mu &= X^\alpha \partial_\alpha Y^\mu - Y^\alpha \partial_\alpha X^\mu \tag{10.6}\\
&= X^\alpha \nabla_\alpha Y^\mu - Y^\alpha \nabla_\alpha X^\mu . \tag{10.7}
\end{align}

This allows us to relate $S$-directional derivatives of $T$ to $T$-directional derivatives of $S$,
\[
  S^\alpha \nabla_\alpha T^\mu = T^\alpha \nabla_\alpha S^\mu . \tag{10.8}
\]
Now we can compute the relative acceleration vector.
\begin{align}
A^\mu &= T^\alpha \nabla_\alpha (T^\sigma \nabla_\sigma S^\mu) \tag{10.9}\\
&= T^\alpha \nabla_\alpha (S^\sigma \nabla_\sigma T^\mu) \notag\\
&= (T^\alpha \nabla_\alpha S^\sigma)(\nabla_\sigma T^\mu) + T^\alpha S^\sigma \left\{ [\nabla_\sigma \nabla_\alpha T^\mu] + R^\mu{}_{\nu\alpha\sigma} T^\nu \right\} \notag\\
&= (S^\alpha \nabla_\alpha T^\sigma)(\nabla_\sigma T^\mu) + R^\mu{}_{\nu\alpha\sigma} T^\nu T^\alpha S^\sigma
 + \left[ S^\sigma \nabla_\sigma (T^\alpha \nabla_\alpha T^\mu) - (S^\sigma \nabla_\sigma T^\alpha)(\nabla_\alpha T^\mu) \right] \notag\\
&= + R^\mu{}_{\nu\alpha\sigma} T^\nu T^\alpha S^\sigma , \tag{10.10}
\end{align}
where we used (i) $[S, T] = 0$, (ii) $\nabla$ obeys the Leibniz rule and Riemann is defined in terms of a commutator of covariant derivatives, (iii) the Leibniz rule and rearranging terms, (iv) relabelling of dummy indices to cancel terms and $T$ being the tangent vector of a geodesic. Summarizing, we have the geodesic deviation equation
\[
  A^\mu = \frac{D^2 S^\mu}{D\lambda^2} = (\nabla_T \nabla_T S)^\mu = + R^\mu{}_{\nu\alpha\sigma} T^\nu T^\alpha S^\sigma . \tag{10.11}
\]
Here we see how the Riemann curvature tensor governs the deviation of geodesics in a very precise way. The covariant acceleration deviation of this one-parameter family of geodesics is given by the Riemann tensor contracted with the tangent vector $T$ twice, on its second and third indices, and contracted with the separation vector $S$ once, on its fourth index.

10.2 Tidal forces and taking the Newtonian limit for Christoffels

Remember the tides? If you, like me, have spent any length of time near the ocean, then you know that the water level rises and falls twice a day. But do you know why? Newton first explained this in his Principia. Basically, oceanic water on the near side to the Moon bulges because it is closer to the Moon than ocean on the far side and hence feels stronger gravity; for the bulge on the far side that can be seen to happen through ‘centrifugal force’. So we see two tides per day. (Note: distances in the figure are not to scale.)

How do tidal forces work in Newtonian and Einsteinian gravity? Well, you cannot detect curvature using only one test particle, or only one geodesic. You need several of them to see the physical effects of curvature of space or spacetime. So let us think about geodesic deviation in the Newtonian limit, even before we recruit the heavy machinery of tensor analysis in curved spacetime and the Riemann tensor. We will soon see how Riemann and the Newtonian potential are connected by the Newtonian limit of weak gravity and slow speeds.

In an inertial frame, the equation of motion of the first particle moving in a Newtonian gravitational potential $\Phi(x^k)$ is
\[
  \frac{d^2 x^i}{dt^2} = -\delta^{ij} \partial_j \Phi(x^k) . \tag{10.12}
\]
Next, we define the vector $y^i$ to be the separation of the second particle from the first, which is assumed to be small. We have that
\[
  \frac{d^2}{dt^2}(x^i + y^i) = -\delta^{ij} \partial_j \Phi(x^k + y^k) . \tag{10.13}
\]
Taylor expanding gives
\[
  \partial_j \Phi(x^k + y^k) = \partial_j \Phi(x^k) + (\partial_\ell \partial_j \Phi(x^k))\, y^\ell + O(y^2) , \tag{10.14}
\]
so that the Newtonian trajectory deviation equation is
\[
  \frac{d^2 y^i}{dt^2} = -\delta^{ij} (\partial_j \partial_k \Phi)\, y^k . \tag{10.15}
\]
The left hand side is known as the tidal acceleration, and it is described by the second mixed partial derivatives of the Newtonian potential.

For simplicity, let us ignore the fact that the Earth is rotating on its own axis as well as the rotation of the Earth around the Sun. Letting the Moon be at $(x, y, z) = (0, 0, d)$, we have for the Newtonian gravitational potential
\[
  \Phi_m(x, y, z) = -\frac{G_N M_m}{\{x^2 + y^2 + (z - d)^2\}^{1/2}} . \tag{10.16}
\]
From this we can calculate the matrix of second derivatives at the origin, which controls the acceleration deviation,
\[
  \left( \frac{\partial^2 \Phi}{\partial x^i \partial x^j} \right)_0 = +\frac{G_N M_m}{d^3}\, \mathrm{diag}(1, 1, -2) . \tag{10.17}
\]
Why the asymmetry between the $z$ and $x, y$ directions? Simple. The functional dependence in the denominator is different. Writing $\{\ldots\}$ for $x^2 + y^2 + (z - d)^2$,
\begin{align}
\left. \frac{\partial^2 \Phi}{\partial x^2} \right|_0 &= -G_N M_m\, \partial_x \left[ 2x \cdot \left(-\tfrac{1}{2}\right) \{\ldots\}^{-3/2} \right]_0 \tag{10.18}\\
&= G_N M_m \left[ \{\ldots\}^{-3/2} + x \cdot 2x \cdot \left(-\tfrac{3}{2}\right) \{\ldots\}^{-5/2} \right]_0 \tag{10.19}\\
&= \frac{G_N M_m}{d^3} + 0 , \tag{10.20}
\end{align}
whereas
\begin{align}
\left. \frac{\partial^2 \Phi}{\partial z^2} \right|_0 &= -G_N M_m\, \partial_z \left[ 2(z - d) \cdot \left(-\tfrac{1}{2}\right) \{\ldots\}^{-3/2} \right]_0 \tag{10.21}\\
&= G_N M_m \left[ \{\ldots\}^{-3/2} + (z - d) \cdot 2(z - d) \cdot \left(-\tfrac{3}{2}\right) \{\ldots\}^{-5/2} \right]_0 \tag{10.22}\\
&= +\frac{G_N M_m}{d^3} - 3\, \frac{G_N M_m\, d^2}{d^5} \tag{10.23}\\
&= -\frac{2 G_N M_m}{d^3} . \tag{10.24}
\end{align}
Another way to write the same set of equations is to use a unit vector $n^i = x^i / r$ pointing in the radial direction; then
\[
  a_{ij} = -\left( \frac{\partial^2 \Phi}{\partial x^i \partial x^j} \right)_0 = -(\delta_{ij} - 3 n_i n_j)\, \frac{G_N M_m}{r^3} . \tag{10.25}
\]
This (tensor) equation tells us that you get stretched in the radial direction and squeezed in the transverse directions. Quite generally, you can think of gravity as a stretchy-squeezy force. This originates in the fact that gravitational interactions in our universe are transmitted by a spin-two boson known as the graviton. It has a polarization tensor rather than a polarization vector. After symmetries under arbitrary changes of coordinates are taken into account, there are two independent physical polarizations for the graviton in four spacetime dimensions, like there are for the photon. But please do not mistake one for the other: the photon only has spin one, and in dimensions other than $D = 3 + 1$ the numbers of independent physical polarizations of photons and gravitons will not match. That they do in $D = 3 + 1$ is a numerical accident.

How big are tidal forces, in orders of magnitude? First, we need to figure out which of the solar system bodies is relevant. If you do the calculation using the above formula for tidal accelerations, you find that the Moon is actually the biggest contributor, because although it is much lighter than the Sun (about 27,100,000 times) it is much closer (about 388 times), and it is the cube of the distance that counts. Plugging in the numbers, you will find that the Sun’s tidal acceleration is only about 45% of the Moon’s. So we focus on the Moon. We would like to compare the magnitude to the acceleration due to gravity. So, to get the order of magnitude, we are computing the ratio of the tidal force on a piece of ocean to the g-force,
\[
  \frac{G_N M_M\, r_E}{d^3} \cdot \frac{r_E^2}{G_N M_E} \sim \frac{M_M}{M_E} \left( \frac{r_E}{d} \right)^3 \sim 10^{-7} . \tag{10.26}
\]
Tidal forces might seem like teeny weeny forces, but when you multiply by entire oceans, you get physical effects that human beings can relate to.

We can make a little table comparing what we have found in Newtonian gravity versus Einsteinian General Relativity so far.

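What | Newton | Einstein
gravity | $\Phi(x^i, t)$ | $g_{\alpha\beta}(x^\lambda)$
test particle EOM | $\dfrac{d^2 x^i}{dt^2} = -\delta^{ij} \partial_j \Phi$ | $\dfrac{d^2 x^\mu}{d\lambda^2} = -\Gamma^\mu{}_{\nu\sigma} \dfrac{dx^\nu}{d\lambda} \dfrac{dx^\sigma}{d\lambda}$
deviation | $\dfrac{d^2 y^i}{dt^2} = -\delta^{ij} \partial_j \partial_k \Phi\, y^k$ | $\dfrac{D^2 S^\mu}{D\lambda^2} = +R^\mu{}_{\nu\sigma\rho} T^\nu T^\sigma S^\rho$
tidal forces | $\partial_i \partial_j \Phi$ | $R^\rho{}_{\sigma\mu\nu} = +\partial_\mu \Gamma^\rho{}_{\nu\sigma} - \partial_\nu \Gamma^\rho{}_{\mu\sigma} + \Gamma^\rho{}_{\mu\lambda} \Gamma^\lambda{}_{\nu\sigma} - \Gamma^\rho{}_{\nu\lambda} \Gamma^\lambda{}_{\mu\sigma}$
gravity EOM | $\nabla^2 \Phi = 4\pi G_N \rho$ | ??? (Einstein equations, coming soon!)

In the Newtonian equation of motion for $\Phi$, $\rho$ is the mass density of whatever is sourcing the gravitational field, and $G_N$ is the Newton constant characterizing the strength of gravity.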
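The Newtonian tidal estimates of this section are easy to check numerically. The sketch below is an illustration under stated assumptions rather than part of the original notes: it differentiates the Moon's potential by finite differences in units where $G_N M_m = 1$ and $d = 1$, and then evaluates the two ratios quoted in the text using rounded SI values for masses and distances that are assumed here, not taken from the notes.

```python
import math

def phi_unit(p):
    """Moon's potential (10.16) in units with G_N * M_m = 1 and d = 1."""
    x, y, z = p
    return -1.0 / math.sqrt(x * x + y * y + (z - 1.0) ** 2)

def hessian(f, p, h=1e-3):
    """Matrix of second derivatives of f at p from four-point central differences."""
    n = len(p)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for si in (+1, -1):
                for sj in (+1, -1):
                    q = list(p)
                    q[i] += si * h
                    q[j] += sj * h
                    total += si * sj * f(q)
            H[i][j] = total / (4 * h * h)
    return H

# Tidal matrix at the origin; compare with diag(1, 1, -2) in eq. (10.17).
H = hessian(phi_unit, [0.0, 0.0, 0.0])
for row in H:
    print(["%+.4f" % entry for entry in row])

# Assumed rounded SI values (kg and m), for illustration only.
M_MOON, M_SUN, M_EARTH = 7.342e22, 1.989e30, 5.972e24
D_MOON, D_SUN, R_EARTH = 3.844e8, 1.496e11, 6.371e6

# Tidal acceleration scales as M / d^3, so G_N cancels in both ratios.
sun_to_moon = (M_SUN / M_MOON) * (D_MOON / D_SUN) ** 3
tide_to_g = (M_MOON / M_EARTH) * (R_EARTH / D_MOON) ** 3   # eq. (10.26)

print("Sun tide / Moon tide:", sun_to_moon)
print("Moon tide / g:", tide_to_g)
```

With these inputs the Sun-to-Moon ratio comes out a little under one half and the tide-to-$g$ ratio is of order $10^{-7}$, in line with the estimates above.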

In order to see how the covariant geodesic deviation equation reduces to the familiar Newtonian equations, we need to take the Newtonian limit in which gravity is weak and speeds are low. (Recall also that $x^0 = ct$ and we will need to put back the factors of $c$ here to make the approximation clear.) Either we can assume staticity, or we can note that $\partial_0 = \partial_t / c$, which is a factor $1/c$ smaller than $\partial_i$. In the Newtonian approximation, we treat the Newtonian potential as a perturbation on 1, and we will ignore terms of order $\Phi^2$ compared to terms of order $\Phi$. In the weak-field limit, the line element is diagonal and quite simple,
\[
  ds^2 = (1 + 2\Phi/c^2)\, c^2 dt^2 - (1 - 2\Phi/c^2)(dx^2 + dy^2 + dz^2) . \tag{10.27}
\]

For the moment, you will need to take this equation on faith, as I have not yet developed the machinery required to see how it emerges. What I will do for now is to assume it as an ansatz, and show that it correctly gives back the familiar Newtonian limit in the limit of weak gravity and slow speeds. Later on in the course, I will give a fuller explanation of where this expression for the approximate line element comes from.

In the low-speed Newtonian limit, there is no difference between proper time and coordinate time $t$. The dynamical variables of interest become $x^\mu(\lambda) \to x^i(t)$. What this means is that we only need to consider the spatial components of the geodesic deviation equation, as the temporal component takes care of itself automatically. In the limit of slow speeds compared to the speed of light, then, we have
\[
  \frac{d^2 y^i}{dt^2} = +R^i{}_{ttj}\, y^j . \tag{10.28}
\]

To check that this does reduce to the Newtonian expression we need to compute $R^i{}_{ttj}$ for the above line element. In the limit of weak gravity, we can find the components of the inverse metric to first order in $\Phi$,
\[
  g^{tt} \simeq (1 - 2\Phi/c^2)/c^2 , \qquad g^{ij} \simeq -\delta^{ij}(1 + 2\Phi/c^2) . \tag{10.29}
\]

For our general Christoffel symbol we have
\[
  \Gamma^\mu{}_{\nu\lambda} = \frac{1}{2}\, g^{\mu\sigma} \left( \partial_\nu g_{\sigma\lambda} + \partial_\lambda g_{\sigma\nu} - \partial_\sigma g_{\nu\lambda} \right) , \tag{10.30}
\]

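so we can pick off the 0 and $i$ parts individually. Assuming that gravity is weak allows us to keep only first order terms in $\Phi$. Assuming that $\Phi$ does not depend on time (to first order in small quantities) sets some Christoffels to zero. For example,
\[
  \Gamma^0{}_{00} = \frac{1}{2}\, g^{00} (\partial_0 g_{00}) = 0 , \tag{10.31}
\]
and
\[
  \Gamma^0{}_{ij} = \frac{1}{2}\, g^{00} (\partial_i g_{0j} + \partial_j g_{0i} - \partial_0 g_{ij}) = 0 , \tag{10.32}
\]
and
\[
  \Gamma^i{}_{0j} = \frac{1}{2}\, g^{ik} (\partial_0 g_{kj} + \partial_j g_{0k} - \partial_k g_{0j}) = 0 . \tag{10.33}
\]
Then we have
\[
  \Gamma^0{}_{0i} = \frac{1}{2}\, g^{00} \partial_i g_{00} \simeq \frac{1}{2} (1 - 2\Phi/c^2)\, \partial_i (1 + 2\Phi/c^2) \simeq \partial_i \Phi / c^2 \quad\Rightarrow\quad \Gamma^t{}_{ti} = \partial_i \Phi / c^2 . \tag{10.34}
\]
Another nontrivial component is
\[
  \Gamma^i{}_{00} = \frac{1}{2}\, g^{ik} (\partial_0 g_{k0} + \partial_0 g_{0k} - \partial_k g_{00}) = -\frac{1}{2}\, g^{ik} \partial_k g_{00} = \delta^{ik} \partial_k \Phi / c^2 \quad\Rightarrow\quad \Gamma^i{}_{tt} = \delta^{ik} \partial_k \Phi . \tag{10.35}
\]
Finally, we have
\begin{align}
\Gamma^i{}_{jk} &= \frac{1}{2}\, g^{i\ell} \left( \partial_j g_{\ell k} + \partial_k g_{\ell j} - \partial_\ell g_{jk} \right) \tag{10.36}\\
&= \frac{1}{2}\, \delta^{i\ell} (1 + 2\Phi/c^2)(-2/c^2) \left( \delta_{\ell k} \partial_j \Phi + \delta_{\ell j} \partial_k \Phi - \delta_{jk} \partial_\ell \Phi \right) \tag{10.37}\\
\Rightarrow\quad \Gamma^i{}_{jk} &= \frac{1}{c^2} \left( -\delta^i{}_k \partial_j \Phi - \delta^i{}_j \partial_k \Phi + \delta^{i\ell} \delta_{jk} \partial_\ell \Phi \right) . \tag{10.38}
\end{align}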
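The weak-field Christoffels above can be compared against a direct numerical evaluation of (10.30). The sketch below is only an illustration under assumptions that are not in the notes: it sets $c = 1$, takes a static point-mass potential $\Phi = -\mu/r$ with a small hypothetical value of $\mu$, and differentiates the diagonal metric (10.27) by central differences.

```python
import math

MU = 1e-4  # hypothetical G_N M / c^2 in units of length; small, so gravity is weak

def phi(x):
    """Static potential of a point mass; x = (t, x, y, z) with c = 1."""
    r = math.sqrt(x[1] ** 2 + x[2] ** 2 + x[3] ** 2)
    return -MU / r

def metric_diag(x):
    """Diagonal entries of the weak-field metric (10.27) with c = 1."""
    p = phi(x)
    return [1 + 2 * p, -(1 - 2 * p), -(1 - 2 * p), -(1 - 2 * p)]

def christoffel(x, h=1e-3):
    """G[m][a][b] = Gamma^m_{ab} from eq. (10.30), specialised to a diagonal metric."""
    n = 4
    g = metric_diag(x)
    dg = []  # dg[s][a] approximates d_s g_{aa}
    for s in range(n):
        xp, xm = list(x), list(x)
        xp[s] += h
        xm[s] -= h
        gp, gm = metric_diag(xp), metric_diag(xm)
        dg.append([(gp[a] - gm[a]) / (2 * h) for a in range(n)])

    def d_g(s, a, b):
        # d_s g_{ab}; off-diagonal components vanish identically
        return dg[s][a] if a == b else 0.0

    G = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for m in range(n):
        for a in range(n):
            for b in range(n):
                # the inverse metric is diagonal, so only sigma = m contributes
                G[m][a][b] = 0.5 / g[m] * (d_g(a, m, b) + d_g(b, m, a) - d_g(m, a, b))
    return G

x0 = [0.0, 1.0, 2.0, 2.0]        # sample point with r = 3
G = christoffel(x0)
dphi_x = MU * x0[1] / 27.0       # analytic d_x Phi = MU * x / r^3

print(G[1][0][0], dphi_x)        # Gamma^x_{tt}, compare with eq. (10.35)
print(G[0][0][1], dphi_x)        # Gamma^t_{tx}, compare with eq. (10.34)
print(G[1][2][2], dphi_x)        # Gamma^x_{yy}, compare with eq. (10.38)
print(G[1][1][2], -MU * x0[2] / 27.0)  # Gamma^x_{xy} = -d_y Phi, eq. (10.38)
```

The numerical values agree with the first-order formulas up to relative corrections of order $\Phi$, which is the size of the terms dropped in the weak-field expansion.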

11 M19Oct

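11.1 Newtonian limit for Riemann

From the Christoffels we computed last time, we can compute the Riemann components,
\begin{align}
R^t{}_{xtx} &= -\frac{1}{c^2} \frac{\partial^2 \Phi}{\partial x^2} , \tag{11.1}\\
R^t{}_{xty} &= -\frac{1}{c^2} \frac{\partial^2 \Phi}{\partial x\, \partial y} , \tag{11.2}\\
R^x{}_{yxy} &= +\frac{1}{c^2} \left( \frac{\partial^2 \Phi}{\partial x^2} + \frac{\partial^2 \Phi}{\partial y^2} \right) , \tag{11.3}\\
R^x{}_{yxz} &= +\frac{1}{c^2} \left( \frac{\partial^2 \Phi}{\partial y\, \partial z} \right) , \tag{11.4}
\end{align}
plus eight more equations from cyclic permutations of $(x, y, z)$. Note that we do not obtain any squares of partial derivatives here in our Riemanns because we are only working to first order in the Newtonian potential $\Phi$. Then, using our geodesic deviation equation in the Newtonian limit, we have
\[
  \frac{d^2 y^i}{dt^2} = +R^i{}_{ttj}\, y^j . \tag{11.5}
\]
Since we also know that
\[
  R^i{}_{tjt} = \partial_j \Gamma^i{}_{tt} - 0 = +\partial_j (\delta^{ik} \partial_k \Phi) = +\delta^{ik} \partial_j \partial_k \Phi , \tag{11.6}
\]
and Riemann is antisymmetric in its last two indices, so that $R^i{}_{ttj} = -R^i{}_{tjt} = -\delta^{ik} \partial_j \partial_k \Phi$, we can see that the General Relativistic geodesic deviation equation involving Riemann gives back the Newtonian expression (10.15), which is exactly what we set out to prove last time.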

To illustrate the abstract concept of geodesic deviation, let us work a very simple example. Suppose that we have a two-sphere of unit radius with line element
\[
  d\Omega_2^2 = d\theta^2 + \sin^2\theta\, d\phi^2 . \tag{11.7}
\]

If you did the Homework 1 assignment, you will already know how to find the Christoffels for this case. There are only two that are nonzero,
\[
  \Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta , \qquad \Gamma^\phi{}_{\phi\theta} = +\cot\theta . \tag{11.8}
\]
Denoting $d/d\lambda$ by an overdot, we find for the geodesic equation
\begin{align}
\ddot\theta - \sin\theta\cos\theta\, \dot\phi^2 &= 0 , \tag{11.9}\\
\ddot\phi + 2\cot\theta\, \dot\theta\, \dot\phi &= 0 . \tag{11.10}
\end{align}

These are coupled second order nonlinear ODEs, and solving them can be a battle if you do not choose your initial conditions cleverly.
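When clever initial conditions are not available, the geodesic equations (11.9) and (11.10) can still be integrated numerically. Here is a minimal sketch, assuming a classical fourth-order Runge-Kutta stepper and hypothetical initial data; the function names and step size are illustrative choices. The two conserved quantities used as a sanity check, $\dot\theta^2 + \sin^2\theta\, \dot\phi^2$ and $\sin^2\theta\, \dot\phi$, follow from affine parametrization and from the $\phi$-independence of the metric.

```python
import math

def rhs(state):
    """First-order form of eqs. (11.9)-(11.10); state = [theta, phi, dtheta, dphi]."""
    theta, phi, dtheta, dphi = state
    return [dtheta,
            dphi,
            math.sin(theta) * math.cos(theta) * dphi ** 2,
            -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi]

def rk4_step(state, h):
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * h * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * h * k for s, k in zip(state, k2)])
    k4 = rhs([s + h * k for s, k in zip(state, k3)])
    return [s + h / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def integrate(state, lam_max, h=1e-3):
    for _ in range(int(round(lam_max / h))):
        state = rk4_step(state, h)
    return state

def speed_squared(state):
    theta, _, dtheta, dphi = state
    return dtheta ** 2 + math.sin(theta) ** 2 * dphi ** 2

def angular_momentum(state):
    theta, _, _, dphi = state
    return math.sin(theta) ** 2 * dphi

# Hypothetical generic initial data (a tilted great circle that avoids the poles).
start = [1.0, 0.0, 0.3, 0.5]
end = integrate(start, 5.0)
print(speed_squared(start), speed_squared(end))
print(angular_momentum(start), angular_momentum(end))

# A geodesic launched from the equator along a meridian stays on that meridian.
meridian_end = integrate([math.pi / 2, 0.0, -0.2, 0.0], 5.0)
print(meridian_end[0], math.pi / 2 - 0.2 * 5.0, meridian_end[1])
```

The coordinate singularity at the poles ($\cot\theta$ diverges) means this sketch should only be trusted for geodesics that stay away from $\theta = 0$ and $\theta = \pi$.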

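If we wish, we can use spherical symmetry to pick a particular initial condition to make integrating these equations simpler. We choose the initial conditions
\[
  \theta(\lambda)|_{\lambda=0} = \frac{\pi}{2} , \qquad \dot\theta(\lambda)|_{\lambda=0} = -\Omega_0 , \qquad \phi(\lambda)|_{\lambda=0} = 0 , \qquad \dot\phi(\lambda)|_{\lambda=0} = 0 . \tag{11.11}
\]

This corresponds to pointing your tangent vector down a line of longitude. The constant $\Omega_0$ is the angular speed with which the polar angle $\theta$ is changing with the affine parameter $\lambda$. What is Riemann? The only nonzero component on the 2-sphere $S^2$ is
\[
  R^\theta{}_{\phi\theta\phi} = +\sin^2\theta . \tag{11.12}
\]

So the components of the geodesic deviation acceleration are
\begin{align}
A^\theta &= R^\theta{}_{\nu\alpha\sigma} T^\nu T^\alpha S^\sigma \notag\\
&= R^\theta{}_{\phi\theta\phi} T^\phi T^\theta S^\phi + R^\theta{}_{\phi\phi\theta} T^\phi T^\phi S^\theta \notag\\
&= +\sin^2\theta \left[ T^\phi T^\theta S^\phi - (T^\phi)^2 S^\theta \right] , \tag{11.13}
\end{align}
and
\begin{align}
A^\phi &= R^\phi{}_{\theta\theta\phi} T^\theta T^\theta S^\phi + R^\phi{}_{\theta\phi\theta} T^\theta T^\phi S^\theta \notag\\
&= +\left[ T^\theta T^\phi S^\theta - (T^\theta)^2 S^\phi \right] . \tag{11.14}
\end{align}

Now we need to specify $S$ and $T$. Since the tangent vector to a geodesic running down a line of longitude points in the (negative of the) polar direction, and the separation vector between two adjacent such geodesics points in the azimuthal direction, we have that
\[
  T^\theta = -\Omega_0 , \qquad T^\phi = 0 , \qquad S^\theta = 0 , \qquad S^\phi = 1 . \tag{11.15}
\]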

So
\[
  A^\theta = 0 , \qquad A^\phi = -\Omega_0^2 . \tag{11.16}
\]
The magnitude is what you should expect for an angular acceleration of the type represented here. The minus sign is physical.

It is possible to get considerably more sophisticated in discussing the physics of geodesic deviation. In order to derive more precise equations, one studies a congruence of geodesics, which is a set of curves in an open region of spacetime such that every point in the region lies on precisely one curve. The story of how geodesics deviate can be expressed in more sophisticated tensor language by studying the covariant derivative of the four-velocity vector $\nabla_\mu U_\nu$ and decomposing it into three independent parts: (a) the trace part $\theta$, known as the expansion of the congruence, (b) the symmetric traceless part $\sigma_{\mu\nu}$, known as the shear of the congruence, and (c) the antisymmetric part $\omega_{\mu\nu}$, known as the rotation of the congruence. Each of these affects the evolution of the others, and the equations obtained are different for massive and massless particles. We will not show the details here because the algebra is too long-winded.
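The two-sphere result (11.16) can be reproduced by contracting the Riemann components directly, as in (10.11). The sketch below is illustrative only: the function names and the value of $\Omega_0$ are assumptions. As a cross-check it also uses the fact that neighbouring meridians on the unit sphere are separated by a proper distance proportional to $\sin\theta = \cos(\Omega_0 \lambda)$ along the geodesic with initial data (11.11), whose second derivative at $\lambda = 0$ is $-\Omega_0^2$.

```python
import math

OMEGA0 = 0.7  # hypothetical angular speed along the meridian

def riemann_sphere(theta):
    """R[m][n][a][s] = R^m_{n a s} on the unit two-sphere; index 0 = theta, 1 = phi."""
    s2 = math.sin(theta) ** 2
    R = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    R[0][1][0][1] = s2       # R^theta_{phi theta phi}, eq. (11.12)
    R[0][1][1][0] = -s2      # antisymmetry in the last two indices
    R[1][0][1][0] = 1.0      # R^phi_{theta phi theta}
    R[1][0][0][1] = -1.0
    return R

def relative_acceleration(theta, T, S):
    """A^m = R^m_{n a s} T^n T^a S^s, eq. (10.11)."""
    R = riemann_sphere(theta)
    return [sum(R[m][n][a][s] * T[n] * T[a] * S[s]
                for n in range(2) for a in range(2) for s in range(2))
            for m in range(2)]

# Tangent and separation vectors of eq. (11.15), evaluated on the equator.
A = relative_acceleration(math.pi / 2, [-OMEGA0, 0.0], [0.0, 1.0])
print(A, -OMEGA0 ** 2)

# Cross-check: second derivative of cos(OMEGA0 * lam) at lam = 0 by central differences.
h = 1e-4
second_derivative = (math.cos(OMEGA0 * h) - 2.0 + math.cos(-OMEGA0 * h)) / h ** 2
print(second_derivative)
```

Both numbers agree with $-\Omega_0^2$: neighbouring meridians that start out parallel at the equator accelerate towards each other, which is the physical content of the minus sign.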

11.2 Riemann normal coordinates and the Bianchi identity

Riemann normal coordinates are a handy coordinate system that you can always use based about any point $p$. They are defined in a smallish patch in the neighbourhood of $p$, and do not necessarily extend infinitely in all directions, as we will explain when we talk about geodesic deviation soon. But they are a great little coordinate system that you can use to evaluate tensor equations, and to help prove tensor equations. We will use the notational convention that equations written in Riemann normal coordinates have bars over the tensors. Strictly speaking we should also bar all the indices, but this is beyond my typing patience at present, so please imagine barred indices everywhere in your head.

A Riemann normal coordinate system is one built using geodesics about a point $p$. More concretely for our purposes, it is the coordinate system in which the metric is locally Minkowskian, and the Christoffels are zero – at the point $p$,
\[
  \bar\Gamma^\mu{}_{\alpha\beta} = 0 . \tag{11.17}
\]

Then, since $\nabla_\sigma g_{\alpha\beta} = 0$ everywhere, including at $p$,
\begin{align}
\bar\nabla_\sigma \bar g_{\mu\nu} &= \partial_\sigma \bar g_{\mu\nu} - \bar\Gamma^\lambda{}_{\sigma\mu} \bar g_{\lambda\nu} - \bar\Gamma^\lambda{}_{\sigma\nu} \bar g_{\lambda\mu} \tag{11.18}\\
&= \partial_\sigma \bar g_{\mu\nu} + 0 = 0 . \tag{11.19}
\end{align}

Therefore, in a Riemann normal coordinate system, we have the special relations at $p$
\begin{align}
\partial_\sigma \bar g_{\mu\nu} &= 0 , \tag{11.20}\\
\bar\Gamma^\alpha{}_{\lambda\sigma} &= 0 , \tag{11.21}\\
\bar R^\mu{}_{\nu\sigma\rho} &= \partial_\sigma \bar\Gamma^\mu{}_{\nu\rho} - \partial_\rho \bar\Gamma^\mu{}_{\nu\sigma} . \tag{11.22}
\end{align}

As you can imagine, using this coordinate system we can more quickly check tensor equations. This is not a trick – tensor equations are valid in any coordinate system. Therefore, they must hold in any frame, including the Riemann normal coordinate frame in which our tensor components simplify. This conceptual tool can be super handy.

We are now going to make use of this special coordinate system to identify all the symmetries of Riemann. This is an important quest, because it will enable us to compute how many independent components Riemann has in arbitrary spacetime dimension $D = d + 1$. In turn, that helps us understand the physics of this four-legged tensor. Computing it can be arduous for a general spacetime, and this is why I set computer algebra as part of HW1. To help us find the symmetries, it helps to start by using the spacetime metric to build the (0,4) version of Riemann from the natural (1,3) version,
\[
  R_{\alpha\beta\mu\nu} = g_{\alpha\lambda} R^\lambda{}_{\beta\mu\nu} . \tag{11.23}
\]

The first symmetry we can notice by inspection of the formula for Riemann in terms of Christoffels. We see immediately that Riemann is antisymmetric upon exchange of its final two indices,
\[
  R^\rho{}_{\sigma\mu\nu} = -R^\rho{}_{\sigma\nu\mu} . \tag{11.24}
\]

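In Riemann normal coordinates,
\begin{align}
\bar R_{\rho\sigma\mu\nu} &= \bar g_{\rho\lambda} \left( \partial_\mu \bar\Gamma^\lambda{}_{\nu\sigma} - \partial_\nu \bar\Gamma^\lambda{}_{\mu\sigma} \right) \tag{11.25}\\
&= \bar g_{\rho\lambda}\, \partial_\mu \left[ \frac{1}{2}\, \bar g^{\lambda\alpha} \left( \partial_\sigma \bar g_{\nu\alpha} + \partial_\nu \bar g_{\sigma\alpha} - \partial_\alpha \bar g_{\nu\sigma} \right) \right] - (\mu \leftrightarrow \nu) \tag{11.26}\\
&= \frac{1}{2}\, \bar g_{\rho\lambda} \bar g^{\lambda\alpha} \left( \partial_\mu \partial_\sigma \bar g_{\nu\alpha} + \partial_\mu \partial_\nu \bar g_{\sigma\alpha} - \partial_\mu \partial_\alpha \bar g_{\nu\sigma} \right) - (\mu \leftrightarrow \nu) \tag{11.27}\\
&= \frac{1}{2} \left( \partial_\mu \partial_\sigma \bar g_{\nu\rho} + \partial_\mu \partial_\nu \bar g_{\sigma\rho} - \partial_\mu \partial_\rho \bar g_{\nu\sigma} \right) - (\mu \leftrightarrow \nu) \tag{11.28}\\
&= \frac{1}{2} \left( \partial_\mu \partial_\sigma \bar g_{\nu\rho} - \partial_\mu \partial_\rho \bar g_{\nu\sigma} \right) - (\mu \leftrightarrow \nu) , \tag{11.29}
\end{align}
where in the third line above we used the fact that $\partial_\mu \bar g^{\lambda\alpha} = 0$ in Riemann normal coordinates, and in the fourth line we used symmetry. Therefore, we can see two additional identities satisfied by Riemann,
\[
  R_{\rho\sigma\mu\nu} = -R_{\sigma\rho\mu\nu} , \tag{11.30}
\]
i.e., Riemann is antisymmetric upon exchange of its first two indices as well as its last two, and
\[
  R_{\rho\sigma\mu\nu} = R_{\mu\nu\rho\sigma} , \tag{11.31}
\]
i.e., Riemann is symmetric under interchange of the first two indices with the last two. We can also look at a version of Riemann with cyclic permutations on the last three indices,
\[
  Q_{\rho\sigma\mu\nu} := R_{\rho\sigma\mu\nu} + R_{\rho\mu\nu\sigma} + R_{\rho\nu\sigma\mu} . \tag{11.32}
\]
Evaluating again in Riemann normal coordinates gives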

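\begin{align}
2 Q_{\rho\sigma\mu\nu} &= \left[ (\partial_\mu \partial_\sigma \bar g_{\nu\rho} - \partial_\mu \partial_\rho \bar g_{\nu\sigma}) - (\mu \leftrightarrow \nu) \right]
 + \left[ (\partial_\nu \partial_\mu \bar g_{\sigma\rho} - \partial_\nu \partial_\rho \bar g_{\sigma\mu}) - (\nu \leftrightarrow \sigma) \right] \notag\\
&\quad + \left[ (\partial_\sigma \partial_\nu \bar g_{\mu\rho} - \partial_\sigma \partial_\rho \bar g_{\mu\nu}) - (\sigma \leftrightarrow \mu) \right] \tag{11.33}\\
&= \partial_\rho \left( -\partial_\mu \bar g_{\nu\sigma} + \partial_\nu \bar g_{\mu\sigma} - \partial_\nu \bar g_{\sigma\mu} + \partial_\sigma \bar g_{\nu\mu} - \partial_\sigma \bar g_{\mu\nu} + \partial_\mu \bar g_{\sigma\nu} \right) \notag\\
&\quad + \partial_\sigma \left( \partial_\mu \bar g_{\nu\rho} - \partial_\nu \bar g_{\mu\rho} + \partial_\nu \bar g_{\mu\rho} - \partial_\mu \bar g_{\nu\rho} \right) + \partial_\nu \partial_\mu \bar g_{\sigma\rho} - \partial_\mu \partial_\nu \bar g_{\sigma\rho} \tag{11.34}\\
&= (0) + (0) + (0) \tag{11.35}\\
&= 0 , \tag{11.36}
\end{align}
where each of the three terms in $Q$ is obtained from (11.29) by cyclically relabelling the last three indices, and we have used the fact that mixed partial derivatives commute and the fact that the metric is symmetric. Because of the antisymmetry properties, an equivalent way of writing this is
\[
  R_{\rho[\sigma\mu\nu]} = 0 , \tag{11.37}
\]
and using other symmetries of Riemann it immediately follows from this that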

\[
  R_{[\rho\sigma\mu\nu]} = 0 , \tag{11.38}
\]
i.e., the totally antisymmetric part of Riemann vanishes too. With straightforward but tedious algebra of very similar type, we can also derive the Bianchi identity which governs covariant derivatives of Riemann. It can be written in (at least) two mathematically different but physically identical ways, which are related by the symmetries of Riemann. The first form is
\[
  \nabla_\lambda R_{\rho\sigma\mu\nu} + \nabla_\rho R_{\sigma\lambda\mu\nu} + \nabla_\sigma R_{\lambda\rho\mu\nu} = 0 , \tag{11.39}
\]
and the second form is
\[
  \nabla_{[\lambda} R_{\mu\nu]\rho\sigma} = 0 , \tag{11.40}
\]
which constrains Riemann by relating components at different points. You can think of the Bianchi identity for Riemann as like a Jacobi identity for covariant derivatives,

[[∇µ, ∇ν], ∇λ] + [[∇ν, ∇λ], ∇µ] + [[∇λ, ∇µ], ∇ν] = 0 . (11.41)

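11.3 The information in Riemann

Now we have all the ingredients we need in order to compute the number of independent Riemann coefficients. We know that as a (0,4) tensor Riemann satisfies
\begin{align}
R_{\alpha\beta\gamma\delta} &= -R_{\alpha\beta\delta\gamma} , \tag{11.42}\\
R_{\alpha\beta\gamma\delta} &= -R_{\beta\alpha\gamma\delta} , \tag{11.43}\\
R_{\alpha\beta\gamma\delta} &= R_{\gamma\delta\alpha\beta} , \tag{11.44}\\
R_{[\alpha\beta\gamma\delta]} &= 0 . \tag{11.45}
\end{align}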

Suppose that we bunch the indices of Riemann in twos. Then we can think of Riemann as like a symmetric combination of two antisymmetric blocks. Recall that the number of independent components of an antisymmetric $D \times D$ matrix is $D(D - 1)/2$ while that of a symmetric matrix is $D(D + 1)/2$. Then the number of components of Riemann should be
\[
  n_R(D) = \frac{1}{2} \left[ \frac{1}{2} D(D - 1) \right] \left[ \frac{1}{2} D(D - 1) + 1 \right] - \binom{D}{4} . \tag{11.46}
\]
We obtained this by using the symmetries of the first three identities to compute the tentative total and then subtracting off the number of completely antisymmetric components to satisfy the fourth identity. This process works because the four constraints are independent. Then, with very simple algebra, we obtain
\[
  n_R(D) = \frac{1}{12} D^2 (D^2 - 1) . \tag{11.47}
\]

Notice a few things about this formula. In one spacetime dimension, $n_R(1) = 0$ and Riemann has no components. This makes sense, as there is only one independent direction, so you cannot build a nonzero commutator of covariant derivatives. There is not enough room in spacetime to build a parallelogram. In two spacetime dimensions, we have $n_R(2) = 1$ and Riemann has just one independent component. This makes gravitational physics in $D = 1 + 1$ quite easy compared to higher dimensions. In three spacetime dimensions, we get $n_R(3) = 6$, and in four spacetime dimensions we have $n_R(4) = 20$. This number is, not accidentally, equal to the number of degrees of freedom in the second partial derivatives of the metric that we cannot set to zero by a clever choice of coordinate system when Taylor expanding the metric.

As we keep going up in dimension, $n_R(D)$ proliferates like a quartic polynomial of $D$. By the time we get to ten or eleven spacetime dimensions, we are dealing with $n_R(10) = 825$ or $n_R(11) = 1210$ independent components! This is why we often use computer algebra in research, when calculating in spacetime dimensions relevant to string theory. Of course, it is also possible with clever techniques to cut through the algebra and find quicker ways to calculate analytically, when your metric is diagonal or sparse in other significant ways.
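The counting argument can be checked with a few lines of code. This is a small sketch, with function names chosen here for illustration: one function follows the symmetry-based count (11.46) literally, the other is the closed form (11.47), and the loop confirms they agree and reproduce the values quoted in the text.

```python
from math import comb

def n_riemann_from_symmetries(D):
    """Eq. (11.46): symmetric pairs of antisymmetric index pairs, minus D-choose-4."""
    m = D * (D - 1) // 2              # independent antisymmetric index pairs
    return m * (m + 1) // 2 - comb(D, 4)

def n_riemann_closed_form(D):
    """Eq. (11.47)."""
    return D * D * (D * D - 1) // 12

for D in range(1, 12):
    assert n_riemann_from_symmetries(D) == n_riemann_closed_form(D)
    print(D, n_riemann_closed_form(D))
```

The integer divisions are exact here, because $D(D-1)$ is always even and $D^2(D^2-1)$ is always divisible by 12.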

12 R22Oct

12.1 Lie derivatives

So far we have developed covariant derivatives and curvature, which required having a Christoffel connection. An interesting fact is that there are some structures that can be defined on a curved spacetime manifold even without reference to a connection or curvature. We will introduce the idea of the Lie$^{10}$ derivative today, because studying it acting on the metric tensor of spacetime will lead us to the General Relativistic version of Noether’s Theorem, which is one of the most important ideas of all time in physics. We will find that a symmetry of the spacetime metric gives an integral of the motion, a conserved quantity, which we can use to help solve for trajectories of test particles in some important cases.

The key concept we will need for our discussion of Noether’s Theorem is how to take a derivative along the congruence defined by a vector field. So, first things first, what is a congruence? On a spacetime manifold, a congruence is a set of curves that fill the manifold (or more generally some part of it) without intersecting. Therefore, the congruence provides a mapping of a manifold onto itself, in the following sense. If the parameter on the curves is $\lambda$, then any tiny $\Delta\lambda$ defines a mapping, where each point is advanced by $\Delta\lambda$ along the same curve in the congruence. This is a 1-1 mapping if the vector field is $C^1$, and if it is $C^\infty$ it is called a diffeomorphism. If there is such a map for any $\Delta\lambda$, then we have a one-parameter Lie group, and the mapping is called a Lie dragging along the congruence.

Suppose that we have a scalar function $f$ defined on our spacetime manifold. Then our above mapping defined by $\Delta\lambda$ lets us define a new function $f^*_{\Delta\lambda}$ in the obvious way: if a point $P$ on a certain curve in the congruence gets mapped to the point $Q$, then
\[
  f(P) = f^*_{\Delta\lambda}(Q) . \tag{12.1}
\]

If it happens that we have a function for which the new value $f^*_{\Delta\lambda}(Q)$ is equal to the old one $f(Q)$, for all $Q$,
\[
  f = f^*_{\Delta\lambda} , \tag{12.2}
\]
then the function is invariant under the mapping. If it is invariant for all $\Delta\lambda$, then the function is said to be Lie dragged. In less fancy language,
\[
  \frac{df}{d\lambda} = 0 . \tag{12.3}
\]

Acting on any given tensor, the Lie derivative along some vector field $V$, written as $\mathcal{L}_V$, measures how fast the tensor changes along integral curves of $V$. Acting on a scalar function $f$, that is just the directional derivative,
\[
  \mathcal{L}_V(f) = V^\lambda \partial_\lambda f . \tag{12.4}
\]

Note that this is the partial derivative: we have not involved any affine connection here.

$^{10}$ Pronunciation note: “Lie” rhymes with “see”.

62 What about a vector field? Any vector field V is defined by the congruence of curves for which it is the tangent field, dxµ V µ = . (12.5) dλ A familiar example from undergraduate electromagnetism is that the magnetic flux lines are the integral curves of the magnetic field 3-vector. Now suppose that we have two general vector fields X and Y . Recall that for any vector V , it can be expanded in the coordinate µ basis as V = V ∂µ. Then we can define the commutator [X, Y ] of two vector fields via

[X, Y ](f) ≡ X(Y (f)) − Y (X(f)) , (12.6) where f is an arbitrary function. The neat thing about [X, Y ] is that it is a bona fide vector field: it is linear, [X, Y ](af + bg) = a[X, Y ]f + b[X, Y ]g , (12.7) and it obeys the Leibniz rule,

[X, Y ](fg) = f[X, Y ]g + g[X, Y ]f . (12.8)

In the coordinate basis, the new vector field [X, Y ] has components

µ λ µ λ µ [X,Y ] = X ∂λY − Y ∂λX . (12.9)

This is a well-defined tensor, because the non-tensorial pieces from the partial derivatives cancel by antisymmetry of the commutator. If you prefer, you can write the above formula with covariant derivatives instead; that way, it looks more tensorial.

Suppose that we adapt our coordinate system so that $V$ points entirely along the coordinate basis vector $\partial/\partial x^d$. The utility of choosing this coordinate system is that a diffeomorphism by $\lambda$ amounts to a coordinate transformation from $(x^0, x^1, \ldots, x^d)$ to $(x^0, x^1, \ldots, x^d + \lambda)$. Then the components of a different vector $T^\mu$ pulled back from the transformed point to the original are simply $T^\mu(x^0, x^1, \ldots, x^d + \lambda)$. In this coordinate system, the Lie derivative then becomes
\[ \mathcal{L}_V T^\mu = \frac{\partial T^\mu}{\partial x^d} . \tag{12.10} \]
This expression is clearly not covariant, but we know that for two vector fields $V$ and $T$ the commutator $[V, T]$ is a well-defined tensor, and in this coordinate system it happens to have components
\[ [V, T]^\mu = V^\nu \partial_\nu T^\mu - T^\nu \partial_\nu V^\mu = \frac{\partial T^\mu}{\partial x^d} . \tag{12.11} \]

Since both $\mathcal{L}_V T$ and $[V, T]$ are vectors (rank $(1, 0)$ tensors), their components must be equal in every coordinate system, and so we finally have the formula we want. The Lie derivative of a vector $T$ along the vector field $V$ is
\[ \mathcal{L}_V T = [V, T] . \tag{12.12} \]
The quantity on the RHS is called the Lie bracket. The equation says that how the vector $T$ changes along integral curves of another vector $V$ is encoded in the commutator of the two vector fields. The formula for the action of the Lie derivative on covariant vectors follows directly from what we have just derived for contravariant vectors and the Leibniz rule.
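As a sanity check on the component formula (12.9), here is a minimal Python sketch that evaluates $[X, Y]^\mu$ numerically, approximating the partial derivatives by central differences. The function name `lie_bracket`, the step size `h`, and the two example fields on the plane are illustrative choices, not part of the course material.

```python
def lie_bracket(X, Y, x, h=1e-5):
    """Components [X,Y]^mu = X^l d_l Y^mu - Y^l d_l X^mu at the point x.

    X and Y map a list of coordinates to a list of components.
    Partial derivatives are approximated by central differences of step h.
    """
    n = len(x)

    def partial(V, lam):
        # d_lam V^mu at x, returned as a list over mu
        xp, xm = list(x), list(x)
        xp[lam] += h
        xm[lam] -= h
        Vp, Vm = V(xp), V(xm)
        return [(Vp[mu] - Vm[mu]) / (2 * h) for mu in range(n)]

    Xx, Yx = X(x), Y(x)
    dX = [partial(X, lam) for lam in range(n)]
    dY = [partial(Y, lam) for lam in range(n)]
    return [
        sum(Xx[lam] * dY[lam][mu] - Yx[lam] * dX[lam][mu] for lam in range(n))
        for mu in range(n)
    ]


# Example on the plane with coordinates (x, y): X = d/dx, Y = x d/dy
X = lambda p: [1.0, 0.0]
Y = lambda p: [0.0, p[0]]
bracket = lie_bracket(X, Y, [0.3, -1.2])
```

For $X = \partial_x$ and $Y = x\,\partial_y$ the formula gives $[X, Y] = \partial_y$, so the numerical components in `bracket` should come out close to `[0, 1]`.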

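For a general rank $(k, \ell)$ tensor, the Lie derivative is
\[
\begin{aligned}
(\mathcal{L}_V T)^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} = {} & V^\sigma \partial_\sigma T^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} \\
& - (\partial_\lambda V^{\mu_1})\, T^{\lambda \mu_2 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} - \ldots \\
& + (\partial_{\nu_1} V^\lambda)\, T^{\mu_1 \ldots \mu_k}{}_{\lambda \nu_2 \ldots \nu_\ell} + \ldots .
\end{aligned} \tag{12.13}
\]
This equation may make you a bit uncomfortable because it involves partial derivatives. In fact, if you do the straightforward but tedious algebra, you will find that it is just as valid with covariant derivatives replacing the partial ones,
\[
\begin{aligned}
(\mathcal{L}_V T)^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} = {} & V^\sigma \nabla_\sigma T^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} \\
& - (\nabla_\lambda V^{\mu_1})\, T^{\lambda \mu_2 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} - \ldots \\
& + (\nabla_{\nu_1} V^\lambda)\, T^{\mu_1 \ldots \mu_k}{}_{\lambda \nu_2 \ldots \nu_\ell} + \ldots .
\end{aligned} \tag{12.14}
\]
This equation certainly looked less tensorial written the first way. But the first equation has the advantage that it makes clear that no connection is necessary to define Lie derivatives of tensors. The Lie derivative is an independent structure.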

12.2 Killing vectors and tensors

In this section, we will be especially interested in the expression above for the Lie derivative of the metric tensor, which characterizes everything about gravity in our spacetime. We have
\[ (\mathcal{L}_V g)_{\mu\nu} = \nabla_\mu V_\nu + \nabla_\nu V_\mu . \tag{12.15} \]
So if
\[ 0 = \nabla_\mu K_\nu + \nabla_\nu K_\mu , \tag{12.16} \]
for some vector $K$, then $K$ is known as a Killing vector: the metric is unchanged along its integral curves, i.e., it has a symmetry. This is Noether's Theorem in curved spacetime, and it plays an extremely important role in the physics of GR.

So, what is the corresponding conservation law? Consider the quantity $K \cdot p$. Its covariant derivative is
\[ \nabla_\mu (K_\lambda p^\lambda) = (\nabla_\mu K_\lambda) p^\lambda + K_\lambda (\nabla_\mu p^\lambda) . \tag{12.17} \]
Contracting this with $p^\mu$ gives
\[ p^\mu \nabla_\mu (K_\lambda p^\lambda) = p^\mu p^\lambda \nabla_\mu K_\lambda + K_\lambda p^\mu (\nabla_\mu p^\lambda) , \tag{12.18} \]
and the second term disappears by the geodesic equation. The first term also vanishes: $p^\mu p^\lambda$ is symmetric, so it picks out only the symmetric part of $\nabla_\mu K_\lambda$, which is zero by the Killing vector equation. So the Killing equation is equivalent to conservation of $K \cdot p$ along geodesics.

More generally, if we have a Killing tensor obeying
\[ \nabla_{(\mu} K_{\nu_1 \ldots \nu_\ell)} = 0 \tag{12.19} \]
then
\[ p^\mu \nabla_\mu (K_{\nu_1 \ldots \nu_\ell}\, p^{\nu_1} \ldots p^{\nu_\ell}) = 0 . \tag{12.20} \]

Finding conserved quantities is physically important because it can help us solve for geodesics. Soon, when we introduce black holes, we will see just how crucial conserved quantities can be in analyzing geodesic motion and its physical consequences.

So let us derive an alternative form of the geodesic equation which will be handy for future reference. What is the directional covariant derivative of the downstairs version of the tangent vector to the curve $x^\mu(\lambda)$?
\[ \frac{D}{D\lambda}\left(\frac{dx_\mu}{d\lambda}\right) = \frac{d^2 x_\mu}{d\lambda^2} - \Gamma^\alpha{}_{\sigma\mu} \frac{dx^\sigma}{d\lambda} \frac{dx_\alpha}{d\lambda} . \tag{12.21} \]
This should be zero for geodesics, giving
\[
\begin{aligned}
\frac{d^2 x_\mu}{d\lambda^2} &= \frac{1}{2} g^{\alpha\beta} \left( \partial_\sigma g_{\beta\mu} + \partial_\mu g_{\beta\sigma} - \partial_\beta g_{\sigma\mu} \right) \frac{dx^\sigma}{d\lambda} \frac{dx_\alpha}{d\lambda} , \\
&= \frac{1}{2} \left( \partial_\sigma g_{\beta\mu} + \partial_\mu g_{\beta\sigma} - \partial_\beta g_{\sigma\mu} \right) \frac{dx^\sigma}{d\lambda} \frac{dx^\beta}{d\lambda} , \\
&= \frac{1}{2} \left( +\partial_\mu g_{\beta\sigma} \right) \frac{dx^\sigma}{d\lambda} \frac{dx^\beta}{d\lambda} ,
\end{aligned} \tag{12.22}
\]
where in the last step the first and third terms cancel because they are antisymmetric under $\sigma \leftrightarrow \beta$ while the product of tangent vectors is symmetric. This yields (upon relabelling dummy indices)
\[ \frac{d}{d\lambda}\left(\frac{dx_\mu}{d\lambda}\right) = \frac{1}{2} (\partial_\mu g_{\alpha\beta}) \frac{dx^\alpha}{d\lambda} \frac{dx^\beta}{d\lambda} . \tag{12.23} \]
So if the entire spacetime metric has zero dependence on a particular coordinate $x^\mu$, the corresponding lower-index tangent vector component $dx_\mu/d\lambda$ is conserved! For a massive particle, this quantity is none other than $p_\mu/m$. For the massless particle, we can choose a convention in which $p_\mu = dx_\mu/d\lambda$. Therefore,
\[ \text{if } \partial_\mu g_{\alpha\beta} = 0 \text{ for some particular } \mu \text{ and all } \alpha, \beta , \text{ then } p_\mu = \text{constant} . \tag{12.24} \]

Let us do an ultra-simple example of a Killing vector. Consider Minkowski space in 4D, namely $\mathbb{R}^{3,1}$ with the flat spacetime metric. In Cartesian coordinates, we obviously have spacetime translation invariance. This implies that all components of $p_\mu$ are conserved.

As a less trivial example, take our spatially flat FRW universe for which we previously worked out the Christoffels. Notice that the metric depended only on time. Obviously, this means that energy is not conserved. Stop and think on that for a minute. You probably thought that conservation of energy must be true in all circumstances, even for the whole universe. You would be wrong. It requires a symmetry! Since none of the components of the metric depend on spatial coordinates, the spatial momenta $p_i$ are conserved.

For our third example of Killing vectors, consider the two-sphere $S^2$ with round metric

\[ ds^2 = d\theta^2 + \sin^2\theta \, d\phi^2 . \tag{12.25} \]

How do we find the Killing vectors? We need to solve the $D(D + 1)/2$ Killing equations,
\[ 0 = \nabla_\mu K_\nu + \nabla_\nu K_\mu = \partial_\mu K_\nu + \partial_\nu K_\mu - 2 \Gamma^\alpha{}_{\mu\nu} K_\alpha . \tag{12.26} \]
First, we need the nonzero Christoffels,
\[ \Gamma^\phi{}_{\phi\theta} = \frac{\cos\theta}{\sin\theta} , \qquad \Gamma^\theta{}_{\phi\phi} = -\sin\theta \cos\theta . \tag{12.27} \]
Then the three independent Killing vector equations involve $\theta\theta$, $\phi\phi$, $\theta\phi$:
\[
\begin{aligned}
0 &= \partial_\theta K_\theta , \\
0 &= \partial_\phi K_\phi + \sin\theta \cos\theta \, K_\theta , \\
0 &= \partial_\phi K_\theta + \partial_\theta K_\phi - 2 \frac{\cos\theta}{\sin\theta} K_\phi .
\end{aligned} \tag{12.28}
\]
The first Killing equation teaches us that
\[ K_\theta = K_\theta(\phi) . \tag{12.29} \]
Taking $\partial_\phi$ of the second Killing equation gives, after a little bit of massaging of trig functions and using the third equation,
\[ \partial_\phi^2 K_\theta + K_\theta = 0 . \tag{12.30} \]
We can readily solve this,
\[ K_\theta(\phi) = A \sin\phi + B \cos\phi , \tag{12.31} \]
where $A, B$ are constants of integration. Using this in the second Killing equation and partially integrating w.r.t. $\phi$ to find $K_\phi$ gives
\[ K_\phi = F(\theta) + A \sin\theta \cos\theta \cos\phi - B \sin\theta \cos\theta \sin\phi , \tag{12.32} \]
where $F$ is an arbitrary function of integration. Substituting this back into the third Killing equation gives, after more trigonometric algebraic massage,
\[ \partial_\theta F(\theta) - 2 \frac{\cos\theta}{\sin\theta} F(\theta) = 0 , \tag{12.33} \]
which is readily integrated to
\[ F(\theta) = C \sin^2\theta , \tag{12.34} \]
where $C$ is a constant of integration. Therefore, the general form of our Killing vectors for the two-sphere is, for the downstairs components,
\[
\begin{aligned}
K_\theta &= A \sin\phi + B \cos\phi , \\
K_\phi &= C \sin^2\theta + \sin\theta \cos\theta \, (A \cos\phi - B \sin\phi) .
\end{aligned} \tag{12.35}
\]
If we take $A = 0, B = 0, C = 1$, we get a Killing vector $R$ with upstairs components
\[ R^\theta = 0 , \qquad R^\phi = 1 . \tag{12.36} \]
If we take $A = 0, B = 1, C = 0$, we get a Killing vector $S$ with upstairs components
\[ S^\theta = \cos\phi , \qquad S^\phi = -\cot\theta \sin\phi . \tag{12.37} \]
If we take $A = -1, B = 0, C = 0$, we get a Killing vector $T$ with upstairs components
\[ T^\theta = -\sin\phi , \qquad T^\phi = -\cot\theta \cos\phi . \tag{12.38} \]
As you can check by transforming between spherical polar coordinates and Cartesian coordinates, these three Killing vectors correspond to $R = x\partial_y - y\partial_x$, $S = z\partial_x - x\partial_z$, $T = y\partial_z - z\partial_y$.
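The general solution (12.35) can be spot-checked numerically. The sketch below, in plain Python, evaluates the left-hand sides of the three Killing equations (12.28) at an arbitrary point of the sphere, using central differences for the derivatives. The function names, the sample point, and the sample values of $A$, $B$, $C$ are illustrative assumptions.

```python
import math


def killing_lower(theta, phi, A, B, C):
    """Downstairs components (K_theta, K_phi) from (12.35)."""
    K_theta = A * math.sin(phi) + B * math.cos(phi)
    K_phi = (C * math.sin(theta) ** 2
             + math.sin(theta) * math.cos(theta)
             * (A * math.cos(phi) - B * math.sin(phi)))
    return K_theta, K_phi


def killing_residuals(theta, phi, A, B, C, h=1e-5):
    """Left-hand sides of the three Killing equations (12.28)."""
    def d_theta(i):
        return (killing_lower(theta + h, phi, A, B, C)[i]
                - killing_lower(theta - h, phi, A, B, C)[i]) / (2 * h)

    def d_phi(i):
        return (killing_lower(theta, phi + h, A, B, C)[i]
                - killing_lower(theta, phi - h, A, B, C)[i]) / (2 * h)

    K_theta, K_phi = killing_lower(theta, phi, A, B, C)
    eq_theta_theta = d_theta(0)
    eq_phi_phi = d_phi(1) + math.sin(theta) * math.cos(theta) * K_theta
    eq_theta_phi = (d_phi(0) + d_theta(1)
                    - 2 * math.cos(theta) / math.sin(theta) * K_phi)
    return eq_theta_theta, eq_phi_phi, eq_theta_phi


residuals = killing_residuals(0.7, 1.9, A=0.4, B=-1.3, C=2.1)
```

All three entries of `residuals` should vanish up to finite-difference error, for any choice of the constants and any point away from the poles.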

13 M26Oct

13.1 Maximally symmetric spacetimes

Spacetimes are distinguished by how many symmetries they possess. The more symmetric, the more calculable. The less symmetric, the less calculable. Even though maximally symmetric spacetimes possess an unrealistic amount of symmetry for experimental purposes, they are still very useful to study because calculations are easier to complete and they help build intuition.

What are the maximally symmetric spacetimes? We need to specify the spacetime signature[11] in order to get started on this discussion. In Euclidean signature, Riemannian manifolds with maximal symmetry are (up to local isometry) either: Euclidean space $\mathbb{R}^D$, the sphere $S^D$, or hyperbolic space $H^D$. In Lorentzian signature, there are also three options, and they split up according to the value of the cosmological constant $\Lambda$ (a.k.a. dark energy density). When $\Lambda = 0$, we get Minkowski space $\mathbb{R}^{d,1}$, where $D = d + 1$. For $\Lambda < 0$ we get Anti de Sitter spacetime (AdS), and for $\Lambda > 0$ we get de Sitter spacetime (dS).

Recall that Minkowski spacetime is invariant under $(d + 1)$ translations, $d(d - 1)/2$ rotations, and $d$ boosts. Adding the numbers together gives a total of
\[ (d + 1) + \frac{1}{2} d(d - 1) + d = \frac{1}{2}(d + 1)(d + 2) = \frac{1}{2} D(D + 1) \tag{13.1} \]
symmetries. We therefore say that a spacetime manifold of dimension $D$ is maximally symmetric if it possesses $D(D + 1)/2$ independent symmetries.

What equation should the Riemann tensor obey in maximally symmetric spacetimes? It had better be invariant under local Lorentz transformations, because there is no preferred direction in spacetime. There are only a very few tensors which we can use: $g_{\mu\nu}$ and $\epsilon_{\mu_1 \ldots \mu_D}$. The epsilon tensor turns out to have the wrong symmetry properties to build Riemann components, and the metric ends up the winner. The sole combination of metric tensor components that possesses the right symmetries to be Riemann is antisymmetric, and tracing gives the constant of proportionality,
\[ R_{\rho\sigma\mu\nu} = \frac{R}{D(D - 1)} \left( g_{\rho\nu} g_{\sigma\mu} - g_{\rho\mu} g_{\sigma\nu} \right) . \tag{13.2} \]

The Ricci scalar R is constant over the entire manifold for maximally symmetric spacetimes.

[11] If we were in a mathematically picky mood, we would also want to specify the spacetime topology.

Anti de Sitter spacetime $AdS_{D=d+1}$ can be embedded in a Minkowski spacetime of one higher dimension $\mathbb{R}^{d,2}$, via
\[ -(t^1)^2 - (t^2)^2 + (x^1)^2 + \ldots + (x^d)^2 = -L^2 \tag{13.3} \]
where $L$ is the radius of curvature of the $AdS_D$. There are several different coordinate systems in common usage for $AdS_D$. One of the most useful is global coordinates, in which
\[ t^1 = L \cosh\rho \cos\tau , \tag{13.4} \]
\[ t^2 = L \cosh\rho \sin\tau , \tag{13.5} \]
\[ x^i = L \sinh\rho \, \hat{x}^i , \quad \text{where} \quad \sum_{i=1}^{d} (\hat{x}^i)^2 = 1 . \tag{13.6} \]
In general dimension, spherical coordinates are defined via
\[
\begin{aligned}
\hat{x}^1 &= \cos\theta_1 , \\
\hat{x}^p &= \cos\theta_p \prod_{m=1}^{p-1} \sin\theta_m , \quad p \in \{2, \ldots, d - 1\} , \\
\hat{x}^d &= \prod_{m=1}^{d-1} \sin\theta_m .
\end{aligned} \tag{13.7}
\]
You can check yourself, either by hand or using SymPy, that the resulting line element of $AdS_D$ in global coordinates is
\[ ds^2 = L^2 \left( \cosh^2\rho \, d\tau^2 - d\rho^2 - \sinh^2\rho \, d\Omega_{d-1}^2 \right) , \tag{13.8} \]
where
\[ d\Omega_{d-1}^2 = d\theta_1^2 + \sum_{\ell=2}^{d-1} \left( \prod_{m=1}^{\ell-1} \sin^2\theta_m \right) d\theta_\ell^2 . \tag{13.9} \]
With a further transformation in time and radius to static coordinates,
\[ t = L\tau , \qquad r = L \sinh\rho , \tag{13.10} \]
we obtain
\[ ds^2 = \left( 1 + \frac{r^2}{L^2} \right) dt^2 - \left( 1 + \frac{r^2}{L^2} \right)^{-1} dr^2 - r^2 d\Omega_{d-1}^2 . \tag{13.11} \]
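A quick numerical spot-check of the global-coordinate embedding (13.4)-(13.6) against the hyperboloid constraint (13.3) can be done in a few lines of plain Python. This is only a sketch: the function names and the sample values are illustrative, and the unit vector is built assuming the standard hyperspherical parametrization in which $\hat{x}^p$ carries a factor $\cos\theta_p$ times the product of the sines of the preceding angles.

```python
import math


def unit_vector(thetas):
    """Unit vector on S^{d-1} built from d-1 angles (hyperspherical coordinates)."""
    xhat = []
    sin_prod = 1.0
    for theta in thetas:
        xhat.append(math.cos(theta) * sin_prod)   # cos(theta_p) * prod of earlier sines
        sin_prod *= math.sin(theta)
    xhat.append(sin_prod)                         # last component: product of all sines
    return xhat


def ads_embedding(L, rho, tau, thetas):
    """Embedding coordinates (t1, t2, [x^1..x^d]) of a point of AdS in R^{d,2}."""
    t1 = L * math.cosh(rho) * math.cos(tau)
    t2 = L * math.cosh(rho) * math.sin(tau)
    xs = [L * math.sinh(rho) * xh for xh in unit_vector(thetas)]
    return t1, t2, xs


L = 2.0
t1, t2, xs = ads_embedding(L, rho=0.8, tau=1.1, thetas=[0.5, 2.0, 0.3])  # d = 4
constraint = -t1**2 - t2**2 + sum(x**2 for x in xs)
```

The value of `constraint` should equal $-L^2$ up to rounding, whatever point is chosen.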

The scale $L$ is the radius of curvature, and it sets the scale for all the physics in $AdS_D$.

The physics of Anti de Sitter (or de Sitter) spacetime in $D = d + 1$ dimensions differs markedly from the physics of Minkowski spacetime. One of the quickest ways to illustrate this is to compare the falloff of partial waves in AdS versus flat spacetime, by solving a wave equation for a simple type of field. Consider a Klein-Gordon (scalar) field living in flat Minkowski spacetime. Its equation of motion in spherical coordinates $\{t, r, \Omega_{D-2}\}$ is
\[ \nabla^\mu \nabla_\mu \Phi = m^2 \Phi . \tag{13.12} \]
If we write
\[ \Phi(t, r, \Omega_{D-2}) = e^{-i\omega t} \chi(r) Y_{\ell,\{m\}}(\Omega_{D-2}) , \tag{13.13} \]
where the spherical harmonics obey
\[ \nabla^2_{S^{d-1}} Y_{\ell,\{m\}} = -\ell(\ell + d - 2) Y_{\ell,\{m\}} , \tag{13.14} \]
and separate variables, we find
\[ \left[ \frac{\partial^2}{\partial r^2} + \frac{(d - 1)}{r} \frac{\partial}{\partial r} + \omega^2 - \frac{\ell(\ell + d - 2)}{r^2} - m^2 \right] \chi(r) = 0 . \tag{13.15} \]
The most physically important thing to understand from this radial differential equation is that higher partial waves with $\ell > 0$ are less important at large radius than the $\ell = 0$ mode. A related fact is that when we write out the multipole expansion for electric and magnetic fields in Minkowski spacetime, higher multipole fields fall off with larger powers of radius.

This physics is inherent to Minkowski spacetime with $\Lambda = 0$. It may surprise you to learn that it does not carry over to other values of the cosmological constant. Suppose that we now consider instead $AdS_{d+1}$ with global coordinates $\{\tau, \rho, \Omega_{d-1}\}$,

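\[ ds^2 = L^2 \left( \cosh^2\rho \, d\tau^2 - d\rho^2 - \sinh^2\rho \, d\Omega_{d-1}^2 \right) . \tag{13.16} \]
It is convenient to compactify the radial direction by defining a new radial coordinate via $\tan\rho_{\rm new} = \sinh\rho$, in terms of which $ds^2 = L^2 \sec^2\rho_{\rm new} \left( d\tau^2 - d\rho_{\rm new}^2 - \sin^2\rho_{\rm new} \, d\Omega_{d-1}^2 \right)$. Dropping the subscript, in this set of coordinates $\rho$ ranges from 0 (the interior of AdS) to $\pi/2$ (the boundary) and the coordinate $\tau$ ranges from $-\infty$ to $+\infty$. What does the scalar wave equation look like in this spacetime? Anticipating separation of variables again, let us write
\[ \Phi(\tau, \rho, \Omega_{d-1}) = e^{-i\omega\tau} \chi(\rho) Y_{\ell,\{m\}}(\Omega_{d-1}) . \tag{13.17} \]
Then the equation of motion becomes
\[ \left[ \frac{1}{(\tan\rho)^{d-1}} \partial_\rho \left( (\tan\rho)^{d-1} \partial_\rho \right) + \omega^2 - \ell(\ell + d - 2) \csc^2\rho - m^2 \sec^2\rho \right] \chi(\rho) = 0 . \tag{13.18} \]
Notice that as we approach the boundary, the higher angular momentum modes are not suppressed compared to the $\ell = 0$ mode. This is the germ of why the AdS/CFT correspondence discovered in the context of string theory in 1997 can work: an observer living on the boundary of the spacetime can see lots of information about what is happening in the interior of the spacetime all the way from the boundary.

If we want to know the character of solutions to the above differential equation, we can substitute
\[ \chi(\rho) = (\cos\rho)^{2h} (\sin\rho)^{2b} f(\rho) , \tag{13.19} \]
which, upon the substitution
\[ y \equiv \sin^2\rho , \tag{13.20} \]
gives
\[ y(1 - y) \partial_y^2 f + \left[ 2b + \frac{d}{2} - (2h + 2b + 1) y \right] \partial_y f - \left[ (h + b)^2 - \frac{\omega^2}{4} \right] f = 0 . \tag{13.21} \]
The solutions to this equation are hypergeometric functions, with
\[ h_\pm = \frac{d \pm \sqrt{d^2 + 4m^2}}{4} , \qquad b = +\frac{\ell}{2} , \ -\frac{\ell}{2} + 1 - \frac{d}{2} . \tag{13.22} \]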

(For further details, see e.g. hep-th/9805171.)

de Sitter spacetime $dS_D$ can be embedded in $\mathbb{R}^{D,1}$ via
\[ t^1 = \sqrt{L^2 - r^2} \, \sinh(t/L) , \tag{13.23} \]
\[ x^i = r \hat{x}^i , \quad \text{where} \quad \sum_{i=1}^{d} (\hat{x}^i)^2 = 1 , \tag{13.24} \]
\[ x^D = \sqrt{L^2 - r^2} \, \cosh(t/L) . \tag{13.25} \]

This gives rise to static coordinates. (Like AdS, dS can alternatively be sliced with flat, positively curved, or negatively curved spatial sections.) In static coordinates, the de Sitter line element becomes

\[ ds^2 = \left( 1 - \frac{r^2}{L^2} \right) dt^2 - \left( 1 - \frac{r^2}{L^2} \right)^{-1} dr^2 - r^2 d\Omega_{d-1}^2 . \tag{13.26} \]

This has a cosmological horizon at r = L. We will not have time to develop the similarities and differences between cosmological horizons and black hole horizons in this course.
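The de Sitter embedding can be spot-checked in the same spirit as for AdS. The plain Python sketch below assumes two things that are standard but not spelled out explicitly above: that the spatial embedding coordinates scale with the static radius, $x^i = r\hat{x}^i$, and that the hyperboloid is $-(t^1)^2 + \sum_i (x^i)^2 + (x^D)^2 = L^2$. The function name and the sample values are illustrative.

```python
import math


def ds_embedding(L, t, r, xhat):
    """Embedding coordinates of a point of de Sitter space inside the horizon (r < L)."""
    root = math.sqrt(L**2 - r**2)
    t1 = root * math.sinh(t / L)
    xs = [r * xh for xh in xhat]      # assumption: x^i = r * xhat^i
    xD = root * math.cosh(t / L)
    return t1, xs, xD


L = 3.0
xhat = [0.6, 0.0, 0.8]                # a unit vector, here with d = 3
t1, xs, xD = ds_embedding(L, t=0.7, r=1.2, xhat=xhat)
hyperboloid = -t1**2 + sum(x**2 for x in xs) + xD**2
```

With these assumptions `hyperboloid` should equal $L^2$ up to rounding, for any $r < L$.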

13.2 Einstein's equations

In plain language, Einstein's equations express the fact that matter tells spacetime how to curve and spacetime tells matter how to move. In PHY484, I will show how to derive Einstein's equations of General Relativity. For now, we will just write them down for you and show you how to use them. They relate a geometrical quantity on the left hand side, built out of the Riemann curvature tensor, to the energy-momentum tensor of any matter fields present in the physical system alongside gravity. In tensor notation, they read as follows,
\[ R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} R + \Lambda g_{\alpha\beta} = -8\pi G_N T_{\alpha\beta} . \tag{13.27} \]
The quantity $\Lambda$ is known as the cosmological constant. (Note: you can put back the powers of $c$ very easily by recruiting dimensional analysis.)

A very important characteristic of Einstein's equations is that they are nonlinear. You can see this by eye by recalling the formula for Christoffels in terms of metric derivatives, which is nonlinear, as well as the formula for the Riemanns in terms of derivatives of Christoffels and contractions of Christoffels, which is also nonlinear. Nonlinearity makes GR qualitatively very different from Newtonian gravity. It is only in the Newtonian limit of GR that the linearity with which you are familiar emerges and shows itself as the superposition principle for the Newtonian potential $\Phi(x)$. For generic situations in GR, nonlinearity is present in the partial differential equations for the evolution of spacetime. The mathematics of nonlinear PDEs is hugely complicated compared to linear ones, and for generic spacetimes often no general statements can be made. Symmetry helps enormously with the task of trying to solve the differential equations, classify spacetimes, or find their geodesics.

The energy-momentum tensor on the RHS of Einstein's equations is covariantly conserved. The way to see this is to take covariant derivatives of both sides of the Einstein equations. The Einstein tensor is defined as
\[ G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R . \tag{13.28} \]

Notice that this is denoted with a big $G_{\mu\nu}$, rather than the small $g_{\mu\nu}$ of the metric or the $G_N$ denoting the Newton gravitational constant. By itself, the rank $(0, 2)$ Einstein tensor $G_{\mu\nu}$ does not look like much. But it obeys an extremely useful identity by virtue of the Bianchi identity for the Riemann tensor. To see this, let us take the first form of our Bianchi identity and contract with two factors of the upstairs metric,
\[ 0 = g^{\nu\sigma} g^{\mu\lambda} \left( \nabla_\lambda R_{\rho\sigma\mu\nu} + \nabla_\rho R_{\sigma\lambda\mu\nu} + \nabla_\sigma R_{\lambda\rho\mu\nu} \right) \tag{13.29} \]
\[ \phantom{0} = \nabla^\mu R_{\rho\mu} - \nabla_\rho R + \nabla^\nu R_{\rho\nu} . \tag{13.30} \]
Rearranging this expression gives a relationship between the covariant derivative of the Ricci tensor and the covariant derivative of the Ricci scalar,
\[ \nabla^\mu R_{\rho\mu} = \frac{1}{2} \nabla_\rho R . \tag{13.31} \]
This identity is handy because it enables us to prove that
\[ \nabla^\mu G_{\mu\nu} = 0 . \tag{13.32} \]
In other words, the Einstein tensor is covariantly conserved. We also have the metric compatibility condition on our affine connection,
\[ \nabla^\sigma g_{\mu\nu} = 0 . \tag{13.33} \]

Taking the covariant divergence of Einstein's equations (13.27), we then have
\[ \nabla^\mu T^{\rm matter}_{\mu\nu} = 0 . \tag{13.34} \]
Covariant conservation of the energy-momentum tensor in GR is mandatory, not voluntary.

How about some examples of energy-momentum tensors? Consider a perfect fluid, which is a spherical cow approximation to real fluids, characterized only by three things: energy density $\rho$, pressure $p$, and fluid velocity $u^\mu$. Its energy-momentum tensor is constructed from those three quantities and the metric tensor,
\[ T^{\rm p.f.}_{\mu\nu} = \left( \rho + \frac{p}{c^2} \right) u_\mu u_\nu - p \, g_{\mu\nu} . \tag{13.35} \]
More generally, if we have an action principle for some classical matter (non-gravitational) field coupled to gravity, $S_{\rm matter}$, then the energy-momentum tensor is determined by varying the action w.r.t. $g^{\mu\nu}$ according to the following recipe[12]:
\[ T_{\mu\nu}(x^\sigma) = \frac{2}{\sqrt{-g(x^\sigma)}} \frac{\delta S_{\rm matter}}{\delta g^{\mu\nu}(x^\sigma)} , \tag{13.36} \]
where $g$ is an abbreviation for the determinant of the downstairs metric,
\[ \sqrt{-g} \equiv \sqrt{-\det(g_{\alpha\beta})} . \tag{13.37} \]
This quantity arises in writing down a general relativistically invariant measure of integration, $d^D x \sqrt{-g}$. (For the case of spherical coordinates on flat Minkowski spacetime, it is $r^2 \sin\theta$, which should be familiar to you from undergraduate multivariable calculus.) A handy formula is
\[ \delta\sqrt{-g} = -\frac{1}{2} \sqrt{-g} \, g_{\alpha\beta} \, \delta g^{\alpha\beta} = +\frac{1}{2} \sqrt{-g} \, g^{\alpha\beta} \, \delta g_{\alpha\beta} . \tag{13.38} \]
For a relativistic massive point particle,
\[ T^{\rm particle}_{\mu\nu}(x) = \frac{m}{\sqrt{-g(x)}} \int d\tau \, \dot{z}_\mu \dot{z}_\nu \, \delta^4(x - z(\tau)) . \tag{13.39} \]

[12] I will prove this near the beginning of the GR2 PHY[1]484S course.

We can see how this arises by starting from the Einbein action in curved spacetime in proper time gauge for a massive particle,

Z 1 dzµ(τ) dzν(τ) 1  S(2) = m dτ g + m . (13.40) rel 2 µν dτ dτ 2

The only part of this action that depends on the spacetime metric is the first term. Also, we will only get a nonzero result when we are on the particle path. How about for a scalar field Φ? For minimal coupling to gravity,

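\[ S_{\rm scalar}[\Phi] = \int d^D x \sqrt{-g} \left[ \frac{1}{2} \nabla^\mu \Phi \nabla_\mu \Phi - \frac{1}{2} m^2 \Phi^2 - V(\Phi) \right] . \tag{13.41} \]

It follows that
\[ T^{\rm scalar}_{\mu\nu} = \nabla_\mu \Phi \nabla_\nu \Phi - g_{\mu\nu} \left[ \frac{1}{2} (\nabla\Phi)^2 - V(\Phi) \right] . \tag{13.42} \]

For the electromagnetic field $A_\mu$,
\[ S_{\rm EM}[A_\alpha] = -\frac{1}{4} \int d^D x \sqrt{-g} \, F^{\mu\nu} F_{\mu\nu} . \tag{13.43} \]
It follows that
\[ T^{\rm EM}_{\mu\nu} = -\left( F_{\mu\lambda} F_\nu{}^\lambda - \frac{1}{4} g_{\mu\nu} F^{\lambda\sigma} F_{\lambda\sigma} \right) . \tag{13.44} \]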
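Two facts about the measure factor $\sqrt{-g}$ used in this section are easy to spot-check numerically: that it equals $r^2 \sin\theta$ for flat spacetime in spherical coordinates, and that its first-order variation obeys (13.38). The plain Python sketch below restricts to a diagonal metric and a diagonal perturbation purely for simplicity; the function name, sample point, and perturbation sizes are illustrative assumptions.

```python
import math


def sqrt_minus_g(diag):
    """sqrt(-det g) for a diagonal metric given by its list of diagonal entries."""
    det = 1.0
    for entry in diag:
        det *= entry
    return math.sqrt(-det)


# Flat spacetime in spherical coordinates (t, r, theta, phi), signature (+,-,-,-)
r, theta = 1.7, 0.9
g = [1.0, -1.0, -r**2, -(r * math.sin(theta))**2]
measure = sqrt_minus_g(g)

# First-order variation under a small diagonal perturbation delta g_{aa}
dg = [1e-7, -2e-7, 3e-7, -1e-7]
lhs = sqrt_minus_g([a + b for a, b in zip(g, dg)]) - measure
# For a diagonal metric the inverse is g^{aa} = 1 / g_{aa}
rhs = 0.5 * measure * sum(b / a for a, b in zip(g, dg))
```

Here `measure` should reproduce $r^2 \sin\theta$, and `lhs` should agree with `rhs` up to terms of second order in the perturbation.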

14 R29Oct

14.1 Birkhoff's theorem and the Schwarzschild black hole

Let us now attack the question of solving the vacuum Einstein equations when we have a static, spherically symmetric spacetime. After a bit of work, we will be able to show that the Schwarzschild black hole possessing mass $M$ is the unique solution.

Our methodology follows that of Carroll §5.2, and will involve a few steps. We will first use spherical symmetry to constrain the possible metric components that might be turned on. Then we will use the vacuum Einstein equations to prove that the time dependence must drop out. Then we will solve the remaining vacuum Einstein equations, and we will obtain the Schwarzschild solution. The last piece of the puzzle will be provided by the Newtonian limit, which will connect a mathematically arbitrary constant of integration to the physical quantity $G_N M$, where $M$ is the mass of the Schwarzschild geometry and $G_N$ is the Newton constant, which has dimensions of length$^{D-2}$ and parametrizes the strength of gravity.

First, let us discuss the definition of a static spacetime in Lorentzian signature. Calling the timelike coordinate $x^0$, we define a static spacetime as one for which (a) there is no explicit time dependence in the metric and (b) the invariant interval possesses time reversal invariance,
\[ \frac{\partial}{\partial x^0} g_{\mu\nu}(x^\lambda) = 0 , \tag{14.1} \]
\[ ds^2 \ \text{invariant under} \ x^0 \to -x^0 . \tag{14.2} \]

A spacetime that only obeys the first condition is called a stationary spacetime. In essence, a static spacetime basically does nothing at all over time, while a stationary spacetime does exactly the same thing at all times. Note that staticity requires that there be no time-space cross terms in the invariant interval, only time-time and space-space components.

Isotropy is also a big requirement. Having this much symmetry eliminates a lot of possibly independent components of the metric tensor. In particular, writing in terms of either Cartesian coordinates $\vec{x}$ or spherical polar coordinates $r, \theta, \phi$, we can only use three ingredients,
\[
\begin{aligned}
\vec{x} \cdot \vec{x} &= r^2 , \\
d\vec{x} \cdot d\vec{x} &= dr^2 + r^2 d\Omega_2^2 , \\
\vec{x} \cdot d\vec{x} &= r \, dr ,
\end{aligned} \tag{14.3}
\]
where
\[ d\Omega_2^2 = d\theta^2 + \sin^2\theta \, d\phi^2 . \tag{14.4} \]
Any other thing we could build from the available ingredients would not respect spherical symmetry.

Given the spherical symmetry of our ansatz, it is traditional to use spherical polar coordinates, in which the metric on the $S^2$ is round throughout the spacetime. For now, we will allow the metric to have time dependence, but bear in mind that shortly we will find it is disallowed by the Einstein equations. We write the metric as

\[ ds^2 = e^{2\alpha''(t',r')} (dt')^2 - e^{2\beta''(t',r')} (dr')^2 - 2 e^{2\gamma''(t',r')} dt' dr' - e^{2\delta''(t',r')} (r')^2 d\Omega_2^2 . \tag{14.5} \]

Next, we can change to a new radial coordinate $r(t', r')$ by
\[ r^2 = (r')^2 e^{2\delta''(t',r')} . \tag{14.6} \]
This $r$ is often referred to as the areal radius, because $r^2$ is the thing in front of $d\Omega_2^2$, the metric on the round two-sphere. Using this areal radius coordinate, we can then adjust the definitions of all functions dependent on time and radius accordingly, to new functions, single primed,
\[ ds^2 = e^{2\alpha'(t',r)} (dt')^2 - e^{2\beta'(t',r)} dr^2 - 2 e^{2\gamma'(t',r)} dt' dr - r^2 d\Omega_2^2 . \tag{14.7} \]
In order to be able to get rid of the $2 dt' dr$ term in this line element, we are going to have to work harder. (To reduce clutter, in the next few equations we suppress the primes on $\alpha'$ and $\gamma'$.) Let us start by trying the simplest proposal for a new time coordinate,
\[ dt \stackrel{??}{=} e^{2\alpha(t',r)} dt' - e^{2\gamma(t',r)} dr . \tag{14.8} \]
If we try to follow this path further, we will find that second mixed partial derivatives of the new $t$ coordinate w.r.t. the old coordinates fail to commute, so the equation (14.8) above is inconsistent. (To see a simple example of how this process works when done right, try transforming from Cartesian coordinates $(x, y)$ on the plane to polar coordinates $(r, \theta)$, and checking that mixed second partials commute.)

Our simplest proposal for a coordinate change failed. Can we craft a better proposal? As you may recall from the general theory of ODEs/PDEs, the right strategy is to recruit an integrating factor, which here must be a function of both $t'$ and $r$: $\Phi(t', r)$. We define a new time coordinate $t(t', r)$ by
\[ dt = e^{2\Phi(t,r)} \left[ e^{2\alpha(t',r)} dt' - e^{2\gamma(t',r)} dr \right] . \tag{14.9} \]
The very explicit factor of $e^{2\Phi(t,r)}$ in front of the $[\ldots]$ parts we wanted is designed precisely such that the right hand side of the above expression is an exact differential. In this case, it can be shown that we can always find such a $\Phi(t, r)$. Then, using the above equations, we obtain
\[ e^{-4\Phi} dt^2 = e^{4\alpha} (dt')^2 - 2 e^{2(\alpha+\gamma)} dt' dr + e^{4\gamma} dr^2 . \tag{14.10} \]
Multiplying through by $e^{-2\alpha}$, rearranging, and forming our $(dt')^2$ and $2 dt' dr$ pieces gives
\[ e^{2\alpha(t',r)} (dt')^2 - 2 e^{2\gamma(t',r)} dt' dr = e^{-2\alpha(t,r) - 4\Phi(t,r)} dt^2 - e^{-2\alpha(t,r) + 4\gamma(t,r)} dr^2 . \tag{14.11} \]
Woohoo: the cross terms in the metric are gone! Redefining our metric ansatz functions according to
\[ e^{2\alpha} = e^{-2\alpha' - 4\Phi'} , \qquad e^{2\beta} = e^{2\beta'} + e^{-2\alpha' + 4\gamma'} \tag{14.12} \]
gives
\[ ds^2 = e^{2\alpha(t,r)} dt^2 - e^{2\beta(t,r)} dr^2 - r^2 d\Omega_2^2 . \tag{14.13} \]
The point of all this wrestling with differentials was to show that we can always choose a coordinate system in which off-diagonal metric components are absent, even if our spherically symmetric system is time dependent.

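Our next task is going to be to show that the time dependence in the metric functions also has to drop out. For this part, we will need to use the equations of motion for the metric tensor field on spacetime. For the Einstein equations, we need to compute Christoffels to get Riemanns which we can then contract to get Ricci components, e.g. via SymPy code you wrote for HW1+HW2. We get
\[
\begin{aligned}
\Gamma^t{}_{tt} &= \{\partial_t \alpha\} , & \Gamma^t{}_{tr} &= \partial_r \alpha , \\
\Gamma^t{}_{rr} &= \{e^{2(\beta-\alpha)} \partial_t \beta\} , & \Gamma^r{}_{tt} &= e^{2(\alpha-\beta)} \partial_r \alpha , \\
\Gamma^r{}_{tr} &= \{\partial_t \beta\} , & \Gamma^r{}_{rr} &= \partial_r \beta , \\
\Gamma^r{}_{\theta\theta} &= -r e^{-2\beta} , & \Gamma^r{}_{\phi\phi} &= -r \sin^2\theta \, e^{-2\beta} , \\
\Gamma^\theta{}_{r\theta} &= \frac{1}{r} , & \Gamma^\theta{}_{\phi\phi} &= -\sin\theta \cos\theta , \\
\Gamma^\phi{}_{r\phi} &= \frac{1}{r} , & \Gamma^\phi{}_{\theta\phi} &= \frac{\cos\theta}{\sin\theta} .
\end{aligned} \tag{14.14}
\]
Note that the pieces involving $\partial_t$ have been highlighted with $\{\ldots\}$ in the above equation so you can clearly see the effect of allowing time dependence. For the Ricci tensor, we obtain
\[
\begin{aligned}
R_{tt} &= e^{2(\alpha-\beta)} \left[ -(\partial_r^2 \alpha) - (\partial_r \alpha)^2 + \partial_r \alpha \, \partial_r \beta - \frac{2}{r} (\partial_r \alpha) \right] + \left\{ -(\partial_t^2 \beta) + (\partial_t \alpha)(\partial_t \beta) - (\partial_t \beta)^2 \right\} , \\
R_{tr} &= -\left\{ \frac{2}{r} (\partial_t \beta) \right\} , \\
R_{rr} &= \left[ (\partial_r^2 \alpha) + (\partial_r \alpha)^2 - (\partial_r \alpha)(\partial_r \beta) - \frac{2}{r} (\partial_r \beta) \right] + e^{2(\beta-\alpha)} \left\{ -(\partial_t^2 \beta) - (\partial_t \beta)^2 + (\partial_t \alpha)(\partial_t \beta) \right\} , \\
R_{\theta\theta} &= -\left[ e^{-2\beta} (r \partial_r \beta - r \partial_r \alpha - 1) + 1 \right] , \\
R_{\phi\phi} &= \sin^2\theta \, R_{\theta\theta} .
\end{aligned} \tag{14.15}
\]
All these components must be zero for us to have a solution of the vacuum Einstein equations. Note how some of the Einstein equations have turned out to be second order dynamical equations while others are first order constraints. This is a general feature in GR.

First, let us look at $R_{tr}$. This must be zero, which demands of $\beta(t, r)$ that
\[ \partial_t \beta(t, r) = 0 \quad \Rightarrow \quad \beta = \beta(r) . \tag{14.16} \]
You can see by looking for the $\{\ldots\}$ parts in the Riccis that many terms now drop out completely because $\beta$ is a function of $r$ only. Obviously, this simplifies our life quite a lot!

Second, let us notice that the $R_{\theta\theta} = 0$ equation (a first order constraint equation) is relatively simple. Let us take a time derivative of it,
\[ \partial_t (R_{\theta\theta}) = 0 = -2 (\partial_t \beta) e^{-2\beta} \left[ r \partial_r \beta - r \partial_r \alpha - 1 \right] + e^{-2\beta} \left[ r \partial_t \partial_r (\beta - \alpha) \right] . \tag{14.17} \]
But since $\partial_t \beta = 0$ by our $R_{tr} = 0$ equation, we have
\[ e^{-2\beta} \, r \, \partial_t \partial_r (\beta - \alpha) = 0 . \tag{14.18} \]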

Then, using what we know about $\beta(r)$, we can partially integrate to get
\[ \alpha(t, r) = f(r) + g(t) . \tag{14.19} \]
Notice how the only remaining place where we have time dependence is in the $tt$ component of the metric. What a stroke of luck! This means that we can absorb it simply by doing a coordinate transformation involving only time (not radius or angular coordinates),
\[ d\tilde{t} = dt \, e^{g(t)} . \tag{14.20} \]
Let us redefine our time coordinate to correspond to this $\tilde{t}$ (we drop the tilde, for notational clarity). Then we have
\[ \beta = \beta(r) , \qquad \alpha = \alpha(r) . \tag{14.21} \]

Third, let us look at the remaining (more complex) $tt$ and $rr$ Einstein equations,
\[ 0 = \left[ (\partial_r^2 \alpha) + (\partial_r \alpha)^2 - \partial_r \alpha \, \partial_r \beta + \frac{2}{r} (\partial_r \alpha) \right] , \tag{14.22} \]
\[ 0 = \left[ -(\partial_r^2 \alpha) - (\partial_r \alpha)^2 + (\partial_r \alpha)(\partial_r \beta) + \frac{2}{r} (\partial_r \beta) \right] . \tag{14.23} \]
By simply adding these equations together, we obtain
\[ \partial_r (\alpha + \beta) = 0 . \tag{14.24} \]
This means that
\[ \beta(r) = \text{const.} - \alpha(r) . \tag{14.25} \]
This constant of integration can be absorbed into the time coordinate, so that $\beta(r) = -\alpha(r)$.

Fourth, we can plug this expression for $\beta(r)$ in terms of $\alpha(r)$ back in to the $R_{\theta\theta} = 0$ Einstein equation to obtain
\[ \left[ 2r \partial_r \alpha + 1 \right] e^{2\alpha} = 1 . \tag{14.26} \]
By quick inspection you can see that this becomes
\[ \partial_r \left( r e^{2\alpha(r)} \right) = 1 , \tag{14.27} \]
which can be integrated to give
\[ r e^{2\alpha(r)} = r + c_1 , \tag{14.28} \]
where $c_1$ is a mathematically arbitrary constant. Dividing by $r$ gives
\[ e^{2\alpha(r)} = 1 + \frac{c_1}{r} . \tag{14.29} \]

We are nearly done, but we need one more physical ingredient. We need to know the physical meaning of $c_1$, because it is what controls all the nontrivial radial dependence in our new static spherically symmetric metric satisfying the vacuum Einstein equations. This is where the Newtonian limit comes to our rescue. We know that in regions of weak gravity, far away from the centre of our spacetime near $r \to \infty$, $g_{tt}$ should take the form $g_{tt} \simeq 1 + 2\Phi/c^2$, where $\Phi = -G_N M/r$. This fixes our arbitrary constant of integration to be $c_1 = -2 G_N M$ (in units where $c = 1$).
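The last integration step can be double-checked numerically. The plain Python sketch below takes $e^{2\alpha} = 1 + c_1/r$, differentiates $\alpha$ by a central difference, and evaluates the residual of the $\theta\theta$ equation $[2r\partial_r\alpha + 1]e^{2\alpha} = 1$ at a few radii. The function names and sample values are illustrative, and the choice $c_1 = -2$ corresponds to $G_N M = 1$ in units with $c = 1$.

```python
import math


def alpha(r, c1):
    """Metric function alpha(r) defined by e^{2 alpha} = 1 + c1 / r."""
    return 0.5 * math.log(1.0 + c1 / r)


def theta_theta_residual(r, c1, h=1e-5):
    """Residual of [2 r alpha' + 1] e^{2 alpha} - 1, with alpha' by central difference."""
    dalpha = (alpha(r + h, c1) - alpha(r - h, c1)) / (2 * h)
    return (2 * r * dalpha + 1) * math.exp(2 * alpha(r, c1)) - 1


c1 = -2.0   # G_N M = 1 in units with c = 1 (illustrative choice)
residuals = [theta_theta_residual(r, c1) for r in (3.0, 10.0, 250.0)]
```

Each entry of `residuals` should vanish up to finite-difference error, for any radius outside $r = -c_1$.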

Therefore, we finally obtain the famous Schwarzschild metric in four[13] spacetime dimensions:
\[ ds^2 = \left( 1 - \frac{2 G_N M}{r} \right) dt^2 - \left( 1 - \frac{2 G_N M}{r} \right)^{-1} dr^2 - r^2 d\Omega_2^2 . \tag{14.30} \]
Birkhoff's Theorem says that this is the unique spherically symmetric solution of the vacuum Einstein equations, and that it is necessarily static. We sketched a proof of this en route, when we found that the Einstein equations would not allow time dependence. Note that in the solution we see $G_N$, which is a theory parameter, and $M$, which is a solution parameter.

One of the two most physically intriguing things about this solution, in this coordinate system, is that there is a place where $g_{rr}$ blows up (and $g_{tt}$ goes to zero). This is known as the event horizon. It is located at the Schwarzschild radius
\[ r_S = \frac{2 G_N M}{c^2} , \tag{14.31} \]
where I have temporarily shown the factors of $c$ for physical clarity. Try calculating your own Schwarzschild radius. You do not fit inside this radius, so you are not a black hole.

The second physically intriguing thing about this solution of Einstein's equations is that it has a curvature singularity at $r = 0$ that is not just a coordinate singularity. It is a truly physical singularity, as you can see by computing a curvature invariant like Riemann squared,
\[ R^{\mu\nu\sigma\rho} R_{\mu\nu\sigma\rho} = \frac{48 (G_N M)^2}{r^6} , \tag{14.32} \]
which diverges at $r = 0$. Notice that at $r = r_S = 2 G_N M$, this curvature invariant is small in Planck units for a big black hole,
\[ \ell_P^4 \, R^{\mu\nu\sigma\rho} R_{\mu\nu\sigma\rho}(r = r_S) = \frac{12 \, \ell_P^4}{r_S^4} . \tag{14.33} \]
Conversely, for a Planck mass black hole, the curvature would be Planckian.

A third aspect of this solution should also grab your physical interest: the nonlinearity of gravity manifest in it. Nonlinearity is what allows there to be a nontrivial solution of the vacuum Einstein equations (ones with $T_{\mu\nu} = 0$) at all. Compare to Newtonian gravity, where a zero mass on the RHS of the Laplace equation would result in a zero Newtonian potential!
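To put numbers on the Schwarzschild radius (14.31) and on the horizon value of the curvature invariant (14.32), here is a short plain Python sketch. The rounded SI values of $G_N$, $c$, and the solar mass are standard reference values, and the 70 kg body mass is just an illustrative assumption; the function names are hypothetical.

```python
G_N = 6.674e-11        # Newton constant in m^3 kg^-1 s^-2 (rounded)
C_LIGHT = 299792458.0  # speed of light in m/s
M_SUN = 1.989e30       # solar mass in kg (rounded)


def schwarzschild_radius(mass_kg):
    """r_S = 2 G_N M / c^2, in metres."""
    return 2 * G_N * mass_kg / C_LIGHT**2


def kretschmann(r, GM):
    """Riemann squared, 48 (G_N M)^2 / r^6, in units with c = 1."""
    return 48 * GM**2 / r**6


rs_sun = schwarzschild_radius(M_SUN)     # roughly 3 km
rs_person = schwarzschild_radius(70.0)   # roughly 1e-25 m for an assumed 70 kg person

# In units with G_N M = 1 the horizon sits at r_S = 2, where the invariant is 12 / r_S^4
horizon_value = kretschmann(2.0, 1.0)
```

The Sun's Schwarzschild radius comes out at about three kilometres, while a person's is some ten orders of magnitude smaller than a proton, which is why you are safely not a black hole.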

Mathematically, the mass $M$ of a classical Schwarzschild black hole might in principle take any value from $-\infty$ to $+\infty$, because $r_S$ arose as a mere constant of integration of Einstein's equations. However, physically there are limits to what the mass can be. For starters, $M$ must be finite for a physically reasonable solution. More importantly, the mass must be nonnegative, $M \geq 0$, because the singularity is not covered by a horizon if the Schwarzschild radius is negative! When $M < 0$, the gravitational redshift also walks off into the complex plane, and we are in trouble interpreting what the heck our spacetime might mean. So already at the classical level, we can imagine why taking our black holes to have non-negative mass is a sensible physical precaution. (Of course, the case $M = 0$ is Minkowski spacetime.)

[13] Cousins of Schwarzschild which are asymptotically flat are available in various dimensions for $D > 3$. They have $1/r^{D-3}$ dependence rather than $1/r$ dependence in the metric.

There is a more sophisticated argument available for mass non-negativity that takes into account quantum corrections to classical gravity, first made in 1995 by G.T. Horowitz and R.C. Myers. They argued that if a negative-mass black hole solution were physical, in the sense that quantum corrections somehow 'fixed up' the negative-mass naked singularity into some physical blob with large-but-finite curvature, then the vacuum of quantum gravity would be unstable. Their logic was this: we could reduce the energy of our system from the vacuum state by simply pair-producing more and more blob-antiblob pairs. This works because each blob has negative energy and so does each anti-blob! The existence of negative-mass 'black holes' would therefore thoroughly destabilize the vacuum of quantum gravity, which is the foundation upon which we lay excitations of quantum fields describing the fluctuating degrees of freedom of the system. The result would be a horrible, physically inconsistent mess.

The moral of this mass positivity story is this: do not trust that every mathematical solution of a physically interesting set of PDEs is physical. We must also check that physical boundary conditions are obeyed, and ensure that basic physical principles like stability of the vacuum are preserved. This is why we assume henceforth that $M_{BH} \geq 0$.

15 M02Nov

15.1 TOV equation for a star Let us now see what changes when we allow an energy-momentum tensor in our static, spherically symmetric spacetime. The simplest kind of thing to consider is called a perfect fluid. What is a perfect fluid? Physically, it is a kind of spherical cow approximation, in which we model a system like the ball of gas we call our Sun by a simple macroscopic fluid, described only by its proper energy density ρ and pressure p in the instantaneous rest frame. We ignore shear viscosity, bulk viscosity, and heat conduction. For a perfect fluid, the energy-momentum tensor Tµν can be written in the form  p  T p.f. = ρ + U U − pg . (15.1) µν c2 µ ν µν This obeys the conservation equation µ p.f. ∇ Tµν = 0 . (15.2) In flat Minkowski spacetime in Cartesian coordinates, in the Newtonian limit, conservation of energy-momentum can be seen to reduce to (a) the continuity equation, and (b) the Euler equation, the classical equation of motion for a perfect fluid. For details, see §8.3 of HEL. Here, we work in curved spacetime so our story is more involved. As you can check, in comoving coordinates, we have only the time component of the µ 4-velocity and its magnitude is set by the timelike condition U Uµ = 1. Then the Einstein equations for our static, spherically symmetric star involve only radial dependence, and they are (with c = 1) 1 e−2β 2r∂ β − 1 + e2β = 8πG ρ , (15.3) r2 r N 1 e−2β 2r∂ α + 1 − e2β = 8πG p , (15.4) r2 r N  1  e−2β ∂2α + (∂ α)2 − ∂ α∂ β + (∂ α − ∂ β) = 8πG p . (15.5) r r r r r r r N Note very carefully here the difference between ρ(r) and p(r). Make sure you write ρs in your own handwritten notes in such a way that they are easily distinguishable from ps. Now, we have a set of three coupled ODEs in α(r), β(r), ρ(r), p(r). Without some physical input there are not enough equations to solve the system. But we can do it if we recruit µ conservation of the energy-momentum tensor ∇ Tµν = 0 and provide an equation of state. 
The tt Einstein equation is a function of β only: it does not involve α. This allows us to define a mass function m(r) such that
$$ e^{2\beta} = \left[ 1 - \frac{2G_N m(r)}{r} \right]^{-1} . \tag{15.6} $$
Then in terms of m(r) rather than β(r), the tt Einstein equation becomes $dm/dr = 4\pi r^2 \rho(r)$, which we can immediately integrate to
$$ m(r) = 4\pi \int_0^r d\tilde{r}\, \tilde{r}^2 \rho(\tilde{r}) . \tag{15.7} $$
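As a quick check of the step from the tt Einstein equation (15.3) to dm/dr, here is a short worked derivation. It is only a sketch using the definition (15.6) with c = 1; a prime denotes d/dr, which is shorthand introduced here.

```latex
\begin{align*}
e^{-2\beta} &= 1 - \frac{2G_N m}{r}
\quad\Longrightarrow\quad
-2\beta'\, e^{-2\beta} = -\frac{2G_N m'}{r} + \frac{2G_N m}{r^2} , \\
2r\beta'\, e^{-2\beta} &= 2G_N m' - \frac{2G_N m}{r} , \\
\frac{1}{r^2}\left[ 2r\beta'\, e^{-2\beta} - e^{-2\beta} + 1 \right]
&= \frac{1}{r^2}\left[ 2G_N m' - \frac{2G_N m}{r} + \frac{2G_N m}{r} \right]
 = \frac{2G_N m'}{r^2} = 8\pi G_N\, \rho , \\
\Longrightarrow\quad \frac{dm}{dr} &= 4\pi r^2 \rho(r) .
\end{align*}
```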

You might look at this formula and think “Oh! This is just the natural answer: you take the mass density and multiply by the surface area, and integrate radially.” But that would be too quick, because the spatial volume element in our curved spacetime metric is actually $dr\, d\theta\, d\phi\; r^2 \sin\theta\, e^{\beta(r)}$. So if we wanted to define the true integrated energy density, we would instead calculate
$$ \bar{M} = 4\pi \int_0^R d\tilde{r}\, \frac{\tilde{r}^2 \rho(\tilde{r})}{\sqrt{1 - 2G_N m(\tilde{r})/\tilde{r}}} \tag{15.8} $$
and this is greater than M = m(R) because of the binding energy (a concept which does make sense in GR for spherical stars). The radial Einstein equation becomes

$$ \frac{d\alpha}{dr} = \frac{G_N m(r) + 4\pi G_N r^3 p(r)}{r\,[r - 2G_N m(r)]} . \tag{15.9} $$
To get any further, we need to recruit energy-momentum tensor conservation. With only radial dependence, this gives

$$ [\rho(r) + p(r)]\, \frac{d\alpha(r)}{dr} = -\frac{dp(r)}{dr} , \tag{15.10} $$
which lets us eliminate dα/dr in favour of dp/dr. We obtain

$$ \frac{dp(r)}{dr} = -\frac{[\rho(r) + p(r)]\,[G_N m(r) + 4\pi G_N r^3 p(r)]}{r\,[r - 2G_N m(r)]} \tag{15.11} $$
This is the Tolman-Oppenheimer-Volkoff (TOV) equation for hydrostatic equilibrium in a star, for the static spherically symmetric case in 4D.

In order to actually solve the TOV equation, we need to know one more equation: the equation of state, which is a relationship p = p(ρ). For astrophysical systems, a polytropic equation of state is often employed, which takes the form $p = K\rho^\gamma$ for some constants K, γ.

As a toy model, we can consider an incompressible star with finite constant mass density $\rho_*$ out to some radius R. Then the mass function is easily integrated, and $M = 4\pi R^3 \rho_*/3$. With $r_S = 2G_N M$, this in turn gives
$$ p(r) = \rho_*\, \frac{\sqrt{R^3 - r_S R^2} - \sqrt{R^3 - r_S r^2}}{\sqrt{R^3 - r_S r^2} - 3\sqrt{R^3 - r_S R^2}} \tag{15.12} $$
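The constant-density solution can be cross-checked numerically. The following is a minimal sketch, not part of the original notes: it assumes units with G_N = c = 1, integrates the TOV equation (15.11) inward from the surface (where p = 0) with a hand-written RK4 stepper, and compares against the closed form (15.12). The function names and the sample values R = 1, M = 0.2 are illustrative choices.

```python
import math

G = 1.0  # assumption of this sketch: units with G_N = c = 1


def analytic_pressure(r, R, M):
    """Closed-form interior pressure of eq. (15.12) for a constant-density star."""
    r_s = 2.0 * G * M
    rho_star = 3.0 * M / (4.0 * math.pi * R**3)
    a = math.sqrt(R**3 - r_s * R**2)
    b = math.sqrt(R**3 - r_s * r**2)
    return rho_star * (a - b) / (b - 3.0 * a)


def tov_rhs(r, p, rho_star):
    """Right-hand side dp/dr of the TOV equation (15.11) at constant density."""
    m = 4.0 * math.pi * rho_star * r**3 / 3.0  # mass function, eq. (15.7)
    numerator = (rho_star + p) * (G * m + 4.0 * math.pi * G * r**3 * p)
    return -numerator / (r * (r - 2.0 * G * m))


def integrate_tov_inward(R, M, r_end, steps=2000):
    """RK4 integration of dp/dr from the surface r = R (p = 0) down to r_end."""
    rho_star = 3.0 * M / (4.0 * math.pi * R**3)
    h = (r_end - R) / steps  # negative step: we march inward
    r, p = R, 0.0
    for _ in range(steps):
        k1 = tov_rhs(r, p, rho_star)
        k2 = tov_rhs(r + 0.5 * h, p + 0.5 * h * k1, rho_star)
        k3 = tov_rhs(r + 0.5 * h, p + 0.5 * h * k2, rho_star)
        k4 = tov_rhs(r + h, p + h * k3, rho_star)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        r += h
    return r, p


R_star, M_star = 1.0, 0.2  # sample star well below M_max = 4R/(9G)
r_core, p_numeric = integrate_tov_inward(R_star, M_star, r_end=1e-3)
p_exact = analytic_pressure(r_core, R_star, M_star)
print(p_numeric, p_exact)
```

The integration stops at a small but nonzero radius because the right-hand side of (15.11) is a 0/0 expression at exactly r = 0.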

Integrating again to find $g_{tt}$ yields
$$ e^{\alpha(r)} = \frac{3}{2}\sqrt{1 - \frac{r_S}{R}} - \frac{1}{2}\sqrt{1 - \frac{r_S r^2}{R^3}} , \qquad r < R . \tag{15.13} $$
The pressure increases near the core, even though we have assumed absolute incompressibility of the fluid. In particular, as $M \to M_{\rm max} = 4R/(9G_N)$, the pressure at the core goes to infinity, and for larger M there is no static solution at all. Oops! With our simplistic ansatz, we have managed to evolve ourselves outside the regime of validity of Einstein’s equations. Of course, real stars do not obey such a simplistic model as an incompressible fluid. Still, it is interesting that we can get the right order of magnitude estimate of when a star can be too big to be gravitationally stable. Sometimes, the stellar object collapses gravitationally into a black hole. If the initial configuration had no overall angular momentum, it will settle down eventually to a Schwarzschild solution. If it is rotating, it will settle down to the Kerr black hole, whose metric we will discuss soon.

Stellar evolution produces different endpoints depending on the initial mass of the star in question. For small stars like ours, when they run out of gas for nuclear fusion, they contract and become white dwarfs. If they are somewhat larger, above about $1.4\,M_\odot$, known as the Chandrasekhar limit, then electron degeneracy pressure is not sufficient to hold them up, and they collapse further to become a neutron star (a class that includes pulsars). Above about $3$-$4\,M_\odot$, known as the Oppenheimer-Volkoff limit, even neutron degeneracy pressure is not enough. Bigger stars collapse to produce black holes.

People like to categorize black holes by size. We can distinguish three basic classes by formation mechanism. Stellar mass black holes are produced by collapse of individual stars, and have masses of a few to a few hundred solar masses. We also have supermassive black holes at the centres of most galaxies, at millions to billions of solar masses. The third class is known as primordial black holes because the only way these smaller-mass objects could have been formed would have been in the Big Bang. The density of primordial black holes is small, if there were any at all to begin with, because of the period of inflation which grew the universe by gigantic amounts early in the history of its evolution, diluting them.

15.2 Geodesics of Schwarzschild

We now move to studying geodesics in the Schwarzschild spacetime explicitly. The nonzero Christoffels for this geometry are

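$$ \Gamma^t{}_{tr} = \frac{r_S}{2r(r - r_S)} ; \tag{15.14} $$
$$ \Gamma^r{}_{rr} = -\Gamma^t{}_{tr} , \quad \Gamma^r{}_{tt} = \frac{r_S}{2r^3}(r - r_S) , \quad \Gamma^r{}_{\theta\theta} = -(r - r_S) , \quad \Gamma^r{}_{\phi\phi} = \sin^2\theta\; \Gamma^r{}_{\theta\theta} ; \tag{15.15} $$
$$ \Gamma^\theta{}_{r\theta} = \frac{1}{r} , \quad \Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta ; \tag{15.16} $$
$$ \Gamma^\phi{}_{r\phi} = \frac{1}{r} , \quad \Gamma^\phi{}_{\theta\phi} = \frac{\cos\theta}{\sin\theta} , \tag{15.17} $$
where $r_S$ is the Schwarzschild radius. Then our geodesic equations become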

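$$ \frac{d^2 t}{d\lambda^2} + \frac{r_S}{r(r - r_S)} \frac{dt}{d\lambda} \frac{dr}{d\lambda} = 0 , \tag{15.18} $$
$$ \frac{d^2 r}{d\lambda^2} + \frac{r_S}{2r^3}(r - r_S) \left(\frac{dt}{d\lambda}\right)^2 - \frac{r_S}{2r(r - r_S)} \left(\frac{dr}{d\lambda}\right)^2 - (r - r_S) \left[ \left(\frac{d\theta}{d\lambda}\right)^2 + \sin^2\theta \left(\frac{d\phi}{d\lambda}\right)^2 \right] = 0 , \tag{15.19} $$
$$ \frac{d^2\theta}{d\lambda^2} + \frac{2}{r} \frac{d\theta}{d\lambda} \frac{dr}{d\lambda} - \sin\theta\cos\theta \left(\frac{d\phi}{d\lambda}\right)^2 = 0 , \tag{15.20} $$
$$ \frac{d^2\phi}{d\lambda^2} + \frac{2}{r} \frac{d\phi}{d\lambda} \frac{dr}{d\lambda} + 2\,\frac{\cos\theta}{\sin\theta} \frac{d\theta}{d\lambda} \frac{d\phi}{d\lambda} = 0 . \tag{15.21} $$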

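These equations look rather formidable until you realize that finding the Killing vectors allows you to find first integrals of two out of four of the geodesic equations. This follows because $\partial_t g_{\mu\nu} = 0$ and $\partial_\phi g_{\mu\nu} = 0$. We write the energy
$$ E = p_t = \left(1 - \frac{r_S}{r}\right) \frac{dt}{d\lambda} \tag{15.22} $$
and the angular momentum
$$ L = -p_\phi = r^2 \sin^2\theta\, \frac{d\phi}{d\lambda} . \tag{15.23} $$
The next equation we can recruit is
$$ U^\mu U_\mu = \epsilon , \tag{15.24} $$
where ε = 0 for null geodesics and ε = +1 for timelike geodesics. For either type of geodesic, we have $g_{\mu\nu} U^\mu U^\nu = \epsilon$, or
$$ \epsilon = \left(1 - \frac{r_S}{r}\right) \left(\frac{dt}{d\lambda}\right)^2 - \left(1 - \frac{r_S}{r}\right)^{-1} \left(\frac{dr}{d\lambda}\right)^2 - r^2 \left[ \left(\frac{d\theta}{d\lambda}\right)^2 + \sin^2\theta \left(\frac{d\phi}{d\lambda}\right)^2 \right] . \tag{15.25} $$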

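Substituting in our conserved angular momentum L and energy E gives

$$ \epsilon = \left(1 - \frac{r_S}{r}\right)^{-1} E^2 - \left(1 - \frac{r_S}{r}\right)^{-1} \left(\frac{dr}{d\lambda}\right)^2 - r^2 \left(\frac{d\theta}{d\lambda}\right)^2 - \frac{L^2}{r^2 \sin^2\theta} . \tag{15.26} $$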
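To see the first integrals at work, one can integrate the geodesic equations numerically and watch E, L and ε stay constant. The sketch below is illustrative only: it assumes units with r_S = 1, restricts to the equatorial plane θ = π/2 (so dθ/dλ = 0), uses a hand-written RK4 stepper, and picks an arbitrary bound timelike orbit starting at r = 10 r_S.

```python
import math

R_S = 1.0  # assumption: the Schwarzschild radius sets the unit of length


def geodesic_rhs(state):
    """Equatorial geodesic equations (15.18), (15.19), (15.21) as a first-order system."""
    t, r, phi, t_dot, r_dot, phi_dot = state
    t_ddot = -R_S / (r * (r - R_S)) * t_dot * r_dot
    r_ddot = (-R_S * (r - R_S) / (2.0 * r**3) * t_dot**2
              + R_S / (2.0 * r * (r - R_S)) * r_dot**2
              + (r - R_S) * phi_dot**2)
    phi_ddot = -2.0 / r * phi_dot * r_dot
    return (t_dot, r_dot, phi_dot, t_ddot, r_ddot, phi_ddot)


def rk4_step(state, h):
    """One classical Runge-Kutta step in the affine parameter."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(k1, 0.5 * h))
    k3 = geodesic_rhs(shifted(k2, 0.5 * h))
    k4 = geodesic_rhs(shifted(k3, h))
    return tuple(s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def invariants(state):
    """Energy (15.22), angular momentum (15.23) and norm (15.25) at theta = pi/2."""
    _, r, _, t_dot, r_dot, phi_dot = state
    f = 1.0 - R_S / r
    return f * t_dot, r**2 * phi_dot, f * t_dot**2 - r_dot**2 / f - r**2 * phi_dot**2


# Timelike orbit starting at a radial turning point, with L = 2.5 r_S.
r0, phi_dot0 = 10.0, 2.5 / 10.0**2
t_dot0 = math.sqrt((1.0 + r0**2 * phi_dot0**2) / (1.0 - R_S / r0))  # enforces epsilon = 1
state = (0.0, r0, 0.0, t_dot0, 0.0, phi_dot0)

energy_0, ang_mom_0, eps_0 = invariants(state)
for _ in range(4000):
    state = rk4_step(state, 0.05)
energy_1, ang_mom_1, eps_1 = invariants(state)
print(energy_1 - energy_0, ang_mom_1 - ang_mom_0, eps_1 - eps_0)
```

The printed drifts should be at the level of numerical round-off, which is a useful sanity check on the Christoffels (15.14)-(15.17).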

Our next step is a piece of physics input. We can use rotational symmetry to pick θ = π/2. It is consistent with the geodesic equations to leave dθ/dλ = 0 for all affine time. Then
$$ \frac{1}{2} \left(\frac{dr}{d\lambda}\right)^2 = \frac{1}{2} E^2 - \frac{1}{2} \left(1 - \frac{r_S}{r}\right) \left(\epsilon + \frac{L^2}{r^2}\right) . \tag{15.27} $$
Some textbooks like to help you visualize this setup by making a mapping onto a familiar non-relativistic Newtonian system, as follows,

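$$ m \to 1 , \tag{15.28} $$
$$ |\vec{v}|^2 \to \left(\frac{dr}{d\lambda}\right)^2 , \tag{15.29} $$
$$ E_{\rm tot} \to \frac{E^2}{2} , \tag{15.30} $$
$$ V_{\rm eff}(r) \to \frac{1}{2} \left(1 - \frac{r_S}{r}\right) \left(\epsilon + \frac{L^2}{r^2}\right) = \frac{\epsilon}{2} - \frac{\epsilon\, r_S}{2r} + \frac{L^2}{2r^2} - \frac{r_S L^2}{2r^3} . \tag{15.31} $$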

You can learn everything you need to know about the availability of various types of orbits (for either null or timelike geodesics) by plotting this “effective potential”. Carroll has two great figures in §5.4, Figures 5.4 and 5.5:-

For Newtonian gravity, there are no massless particle orbits. Massive particles can have stable bound orbits, depending on the angular momentum per unit mass. For Einsteinian gravity, photons can orbit, but the orbits are unstable. Any small perturbation and the path flings off back out to infinity (sometimes after buzzing around the black hole horizon a few times) or falls inexorably into the black hole. Massive particles, on the other hand, can have bound orbits, and the outer solution radius gives a stable orbit while the inner one gives an unstable orbit.

Circular orbits can happen when $dV_{\rm eff}/dr = 0$ at $r = r_*$, solving the equation
$$ \frac{\epsilon\, r_S}{2}\, r_*^2 - L^2 r_* + \frac{3 r_S}{2} L^2 \gamma = 0 , \tag{15.32} $$
where (following Carroll) we introduce γ = 1 for GR and γ = 0 for NG (Newtonian gravity). Specifically, for massless geodesics, $r_* = 3 r_S \gamma/2$, and as you can see by evaluating the second derivative of $V_{\rm eff}(r)$, it is an unstable maximum. Massive geodesics provide a richer context.

We find two solutions,
$$ \frac{r_*}{r_S} = \frac{L^2}{r_S^2} \pm \sqrt{\frac{L^4}{r_S^4} - \frac{3L^2\gamma}{r_S^2}} . \tag{15.33} $$
From this you can quickly see that NG has only one solution, at $r_* = 2L^2/r_S$. But for GR the story is a lot more interesting. There are two solutions and, as you can see by computing the second derivative of the effective potential, the outer one is stable while the inner one is unstable. As you can discover by inspecting the negative root of eq.(15.33) carefully, for radii smaller than $r = r_{*,\rm ICO}$, where
$$ r_{*,\rm ICO} = \frac{3 r_S}{2} , \tag{15.34} $$
there are no circular orbits at all, stable or unstable. Nothing can orbit that close without falling across the horizon. Gravity is too strong. The angular momentum at which the stable and unstable orbits coalesce for timelike geodesics is $L^4 = 3 r_S^2 L^2 \gamma$, i.e., where the discriminant in eq.(15.33) vanishes. This is called the ISCO, or the Innermost Stable Circular Orbit,

r∗,ISCO = 3rS = 2r∗,ICO . (15.35)
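The circular-orbit radii are easy to tabulate. Here is a small sketch (function names are illustrative, and r_S = 1 by default) that evaluates the two roots of eq.(15.33) for timelike geodesics and confirms that dV_eff/dr vanishes there. It takes L² rather than L as input so that the discriminant is evaluated without a square-root round trip.

```python
import math


def circular_orbit_radii(L2, r_s=1.0, gamma=1.0):
    """Roots (r_inner, r_outer) of eq. (15.32) with epsilon = 1, via eq. (15.33).

    L2 is the squared angular momentum. Returns None when the discriminant
    is negative, i.e. when there is no circular orbit for this L2.
    """
    disc = L2 * L2 - 3.0 * r_s**2 * L2 * gamma
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return (L2 - root) / r_s, (L2 + root) / r_s


def d_v_eff(r, L2, r_s=1.0, gamma=1.0, eps=1.0):
    """dV_eff/dr from eq. (15.31); gamma multiplies the GR-only 1/r^3 term."""
    return (eps * r_s / (2.0 * r**2) - L2 / r**3
            + 3.0 * gamma * r_s * L2 / (2.0 * r**4))


print(circular_orbit_radii(3.0))             # coalescence: both roots at the ISCO
print(circular_orbit_radii(4.0))             # unstable inner and stable outer orbit
print(circular_orbit_radii(4.0, gamma=0.0))  # Newtonian: outer root is 2 L^2 / r_S
print(circular_orbit_radii(1.0e6)[0])        # inner root tends to 3 r_S / 2 at large L
```

For L² below 3 r_S² the function returns None, matching the statement that the stable and unstable orbits coalesce and disappear there.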

The following image is an artist’s rendition of what the black hole in the Large Magellanic Cloud might look like (credit: Alain Riazuelo / CC BY-SA 2.5.). It looks weird because you are not used to photon trajectories being bent. The strong and nonlinear gravitational effects of the black hole are quite extreme!

16 R05Nov

16.1 Causal structure of Schwarzschild

How do light cones behave in the spacetime of the Schwarzschild black hole? In the original Schwarzschild coordinates, we had the spacetime metric
$$ ds^2 = \left(1 - \frac{r_S}{r}\right) dt^2 - \left(1 - \frac{r_S}{r}\right)^{-1} dr^2 - r^2 d\Omega_2^2 . \tag{16.1} $$
Obviously, we will have to suppress some of our four spacetime coordinates in order to fit a diagram onto a two-dimensional page. It will assist our visualizations to suppress the angular directions and focus attention on the time and radial directions. (Tip: be sure to double-check spacetime diagrams in textbooks to eliminate avoidable confusion over which coordinates are suppressed.) For a null trajectory we have $ds^2 = 0$. For purely radial motion, we can immediately read off the slope of the light cone,

$$ \frac{dt}{dr} = \pm \left(1 - \frac{r_S}{r}\right)^{-1} . \tag{16.2} $$
As we would expect, the magnitude of this tends to unity at r → ∞. Light rays go at 45° on a (t, r) diagram. At slightly smaller radii, it increases a little. What happens at $r \to r_S$ you may not have expected: the magnitude of the slope of the light cone blows up! The light cone is squashed down to have zero opening angle. This is a coordinate singularity. Inside the Schwarzschild radius, $g_{tt}$ and $g_{rr}$ both flip sign, seemingly switching roles. This is a symptom of the fact that this coordinate system does not actually cover the region of the black hole spacetime inside the horizon. Another symptom of the disease we see here is that it appears a photon would take an infinite amount of time to fall into the black hole. It does – in these coordinates.

Redshifting depends on the coordinate system. To do a better job of probing the causal structure of the Schwarzschild black hole spacetime, we are actually better off aiming to answer more invariant questions, like “How much affine parameter does it take before a freely falling particle hits the singularity?” We can also hunt for better coordinate systems which do cover the entire black hole spacetime, not just the region outside the horizon.

Let us start by inspecting what we have so far in our Schwarzschild coordinates. For radial null paths,
$$ \frac{dt}{dr} = \frac{\pm 1}{(1 - r_S/r)} . \tag{16.3} $$
Defining a new radial coordinate $r_*$ by
$$ \frac{dt}{dr_*} = \pm 1 \tag{16.4} $$
gives
$$ \frac{r_*}{r_S} = \int \frac{d(r/r_S)}{(1 - r_S/r)} , \tag{16.5} $$
so that
$$ r_* = r + r_S \ln\left(\frac{r}{r_S} - 1\right) , \tag{16.6} $$
which is known as the tortoise coordinate. This ranges over $r_* \in (-\infty, +\infty)$ as r ranges over $(r_S, \infty)$, whereas the original radial coordinate ranged over $r \in [0, \infty)$. So this tortoise coordinate also only covers the region outside the horizon. The benefit of using these coordinates is that the radial null paths are simple,
$$ t = \pm r_* + c . \tag{16.7} $$
The light cones are all at 45° in tortoise coordinates. Using the tortoise coordinate, our black hole metric becomes
$$ ds^2 = \left(1 - \frac{r_S}{r(r_*)}\right) \left[ dt^2 - dr_*^2 \right] - r^2(r_*)\, d\Omega_2^2 . \tag{16.8} $$
Next, let us try adapting our coordinates to null motion. Define null coordinates in the time-radius plane,

u ≡ t − r∗ , (16.9)

v ≡ t + r∗ . (16.10)
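Since the tortoise coordinate defines r only implicitly, it can help to see the map and its inverse in code. This is a sketch under stated assumptions: r_S = 1 by default, the inverse is found by bisection on the auxiliary variable x = ln(r/r_S − 1) (a choice made here to stay accurate very close to the horizon), and all function names are hypothetical.

```python
import math


def tortoise(r, r_s=1.0):
    """Tortoise coordinate r_* = r + r_s ln(r/r_s - 1), eq. (16.6); needs r > r_s."""
    return r + r_s * math.log(r / r_s - 1.0)


def radius_from_tortoise(r_star, r_s=1.0, iterations=200):
    """Invert eq. (16.6) by bisection on x = ln(r/r_s - 1).

    In terms of x, r_*/r_s = 1 + exp(x) + x, which is monotonically increasing.
    """
    y = r_star / r_s
    x_lo = min(y - 2.0, 0.0) - 1.0   # guarantees 1 + exp(x_lo) + x_lo < y
    x_hi = math.log(max(y, 2.0))     # guarantees 1 + exp(x_hi) + x_hi > y
    for _ in range(iterations):
        x_mid = 0.5 * (x_lo + x_hi)
        if 1.0 + math.exp(x_mid) + x_mid < y:
            x_lo = x_mid
        else:
            x_hi = x_mid
    return r_s * (1.0 + math.exp(0.5 * (x_lo + x_hi)))


def null_coordinates(t, r, r_s=1.0):
    """Null coordinates u = t - r_*, v = t + r_* of eqs. (16.9)-(16.10)."""
    r_star = tortoise(r, r_s)
    return t - r_star, t + r_star


for radius in (1.0001, 1.5, 2.0, 10.0, 100.0):
    print(radius, tortoise(radius), radius_from_tortoise(tortoise(radius)))
```

Notice how quickly r_* runs off to large negative values as r approaches r_S: this is the sense in which the horizon sits infinitely far away in these coordinates.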

Then our black hole spacetime metric takes the form

$$ ds^2 = \left(1 - \frac{r_S}{r(u,v)}\right) dv^2 - 2\, dr\, dv - r^2(u,v)\, d\Omega_2^2 . \tag{16.11} $$
These coordinates are called Eddington-Finkelstein coordinates. As you can check, this metric remains invertible, including at the horizon. Then for radial null motion, we have

$$ \left(1 - \frac{r_S}{r}\right) \left(\frac{dv}{dr}\right)^2 - 2\, \frac{dv}{dr} = 0 , \tag{16.12} $$
so that
$$ \frac{dv}{dr} = \begin{cases} \dfrac{2}{(1 - r_S/r)} & \text{(outgoing)} , \\[2mm] 0 & \text{(ingoing)} . \end{cases} \tag{16.13} $$

Because the first solution is positive, it is relevant for outgoing radial null paths. The second solution is the one relevant for ingoing radial null paths. Notice what this implies about our light cones in (v, r) coordinates. We have that for the ingoing ‘side’ (on a 2D diagram) of the light cones, this always hugs v = const. For the outgoing ‘side’ of the light cones, the slope depends on $r/r_S$. At r → ∞, this slope is 2. If we are at a finite $r > r_S$, then the slope is positive and bigger than 2. At $r = r_S$ the slope becomes infinite, pointing straight up the v-axis. For $r < r_S$ the slope becomes negative, and points towards the inside of the black hole only. This represents the physics that we want in a rather more elegant way than Schwarzschild coordinates did. Infalling photons do not make it out of the black hole once they have crossed the horizon. The following picture is a summary of what we have found out about light cones in Eddington-Finkelstein coordinates.

A nice feature of Eddington-Finkelstein coordinates is that our light cones do not get squished down to infinitely thin pencils. But note carefully that they do turn over at the horizon. Note that with these new coordinates we have managed to cover the region t → +∞ of the black hole spacetime, because at constant v, decreasing r sends t → +∞. So we have extended in one direction. How about the other direction? Are there other coordinates that might restore the symmetry between u and v? Our Eddington-Finkelstein coordinates so far privileged v. Because of that, they are known as ingoing Eddington-Finkelstein coordinates. It turns out that we can alternatively find a second set of Eddington-Finkelstein coordinates, adapted for outgoing rather than ingoing null paths, in which we have

$$ ds^2 = \left(1 - \frac{r_S}{r(u,v)}\right) du^2 + 2\, dr\, du - r^2(u,v)\, d\Omega_2^2 . \tag{16.14} $$

Working back from our definitions, we see that this corresponds to the region t → −∞, as compared to the region t → +∞ which the first set of Eddington-Finkelstein coordinates extended us to. In outgoing Eddington-Finkelstein coordinates (u, r), the slope of the light cones is
$$ \frac{du}{dr} = \begin{cases} -\dfrac{2}{(1 - r_S/r)} & \text{(ingoing)} , \\[2mm] 0 & \text{(outgoing)} . \end{cases} \tag{16.15} $$
The accompanying picture illustrates this.

Can we uncover yet more regions of the Schwarzschild black hole spacetime? It turns out that the answer is yes, if we use another even smarter coordinate system known as Kruskal-Szekeres coordinates. Our first guess for how to get further is furnished by choosing both light-cone coordinates (u, v), in place of (u, r), or (v, r), or $(t, r_*)$, or (t, r). We find immediately that
$$ ds^2 = \left(1 - \frac{r_S}{r(u,v)}\right) du\, dv - r^2(u,v)\, d\Omega_2^2 , \tag{16.16} $$
where
$$ \frac{1}{2}(v - u) = r + r_S \ln\left(\frac{r}{r_S} - 1\right) , \tag{16.17} $$
which implicitly defines r as a function of (u, v). This is looking more promising: our light cones will stay at 45° in these (u, v) coordinates. But there is still one big fly in the ointment: we still have the problem that the horizon is located infinitely far away. To cure this symptom, we make an exponential mapping to bring the horizon to a finite place,
$$ U = -\exp\left(-\frac{u}{2r_S}\right) , \qquad V = +\exp\left(+\frac{v}{2r_S}\right) . \tag{16.18} $$
In these Kruskal-Szekeres coordinates we find
$$ ds^2 = \frac{4 r_S^3}{r(U,V)}\, e^{-r(U,V)/r_S}\, dU\, dV - r^2(U,V)\, d\Omega_2^2 . \tag{16.19} $$
Picking apart the null (U, V) coordinates into time T and radius R coordinates, via
$$ T = \frac{1}{2}(U + V) = \sqrt{\frac{r}{r_S} - 1}\; \exp\left(\frac{r}{2r_S}\right) \sinh\left(\frac{t}{2r_S}\right) , \tag{16.20} $$
$$ R = \frac{1}{2}(V - U) = \sqrt{\frac{r}{r_S} - 1}\; \exp\left(\frac{r}{2r_S}\right) \cosh\left(\frac{t}{2r_S}\right) , \tag{16.21} $$
gives the spacetime metric
$$ ds^2 = \frac{4 r_S^3}{r(T,R)}\, e^{-r(T,R)/r_S} \left( dT^2 - dR^2 \right) - r^2(T,R)\, d\Omega_2^2 , \tag{16.22} $$
where r(T, R) is implicitly defined by
$$ T^2 - R^2 = \left(1 - \frac{r}{r_S}\right) e^{r/r_S} . \tag{16.23} $$

This slick manipulation probably feels like it just happened at 100 km/h. So let us slow down a little, and unpack all of what these new amazing Kruskal coordinates allow us to see for the physics of the Schwarzschild black hole. In Kruskal-Szekeres coordinates,

• Radial null motion occurs along

T = ±R + c1 . (16.24)

• Surfaces of constant r are at
$$ T^2 - R^2 = \left(1 - \frac{r}{r_S}\right) e^{r/r_S} , \tag{16.25} $$

which are hyperbolas in the (T,R) plane.

• Surfaces of constant t are at
$$ \frac{T}{R} = \tanh\left(\frac{t}{2r_S}\right) , \tag{16.26} $$

which are simply straight lines in the (T,R) plane.

• The event horizon is at T = ±R. (16.27) This has two solutions, corresponding physically to having both a black hole horizon and a white hole horizon.

• The singularity is at
$$ T^2 - R^2 = 1 . \tag{16.28} $$
This has two solutions, one which corresponds to a black hole singularity and one which corresponds to a white hole singularity.

• What ranges do our coordinates (U, V ) cover? We see that (U, V ) range over all possible values aside from where the curvature singularity occurs:

$$ -\infty < T < \infty , \qquad -\infty < R < \infty , \qquad T^2 - R^2 < 1 . \tag{16.29} $$
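The properties in the list above can be checked numerically for the exterior region. The sketch below is illustrative only: it assumes r_S = 1 by default, takes R = (V − U)/2 so that R > 0 outside the horizon, and builds (T, R) in two ways, once from the closed forms with sinh and cosh and once from the null coordinates U and V. The function names are hypothetical.

```python
import math


def tortoise(r, r_s=1.0):
    """Tortoise coordinate of eq. (16.6), valid for r > r_s."""
    return r + r_s * math.log(r / r_s - 1.0)


def kruskal_closed_form(t, r, r_s=1.0):
    """(T, R) for the exterior region from the sinh/cosh expressions."""
    amplitude = math.sqrt(r / r_s - 1.0) * math.exp(r / (2.0 * r_s))
    return (amplitude * math.sinh(t / (2.0 * r_s)),
            amplitude * math.cosh(t / (2.0 * r_s)))


def kruskal_from_null(t, r, r_s=1.0):
    """(T, R) built from U = -exp(-u/2r_s), V = +exp(+v/2r_s), eq. (16.18)."""
    r_star = tortoise(r, r_s)
    u, v = t - r_star, t + r_star
    U = -math.exp(-u / (2.0 * r_s))
    V = math.exp(v / (2.0 * r_s))
    return 0.5 * (U + V), 0.5 * (V - U)  # assumption: R = (V - U)/2


t_sample, r_sample = 0.7, 3.0
T1, R1 = kruskal_closed_form(t_sample, r_sample)
T2, R2 = kruskal_from_null(t_sample, r_sample)
hyperbola = (1.0 - r_sample) * math.exp(r_sample)  # right side of (16.25), r_s = 1
print(T1, R1, T2, R2)
print(T1**2 - R1**2, hyperbola)
print(T1 / R1, math.tanh(t_sample / 2.0))
```

Both constructions should agree, constant-r points should land on the hyperbola (16.25), and constant-t points on the straight line (16.26).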

Note: it appears that the U, V may be ill-defined inside the horizon, but it is actually the original t, r coordinates that are ill-defined there. The U, V Kruskal coordinates are well-defined, except of course in the disallowed singular region. This is the really key part of using Kruskal coordinates which allows us to obtain what is known as the maximal analytic extension of the Schwarzschild spacetime. (In the figure below, $r_g$ is our $r_S$, $\tilde{t}$ is our T, and $\tilde{r}$ is our R.)

Notice how the Kruskal diagram actually has extra regions by comparison to the original Schwarzschild coordinate patch. These new extra regions can be abbreviated as II, III, and IV. From region I we can, via future-directed null rays, go into region II. So it makes sense to interpret this part as the region behind the black hole event horizon. And you can see from the picture above that the black hole singularity is in region II.

Suppose, from region I, we followed instead a past-directed null ray. Then what? According to our Kruskal diagram, we would cross a horizon to go into another region – III – with another singularity, the white hole singularity, which can be loosely called the ‘mirror image’ of the singularity in region II under time reversal. The horizon in region III is the white hole horizon that we identified in our list of bullet points above.

By following future-directed null rays from region III, or past-directed null rays from region II, we can see a second asymptotically flat region. But we can never communicate with it! It is a causally separated place unconnected by timelike or null geodesics to the original asymptotic region. Some people like to speak of the Schwarzschild geometry as a “wormhole” connecting two asymptotically flat regions, but it is not physical in any sense to call it a wormhole because it is not traversable. It closes up too quickly for any physical observer (even an electron) to cross from I to IV. For more details, see p.228 of Carroll. A black hole formed in gravitational collapse would involve at most regions I and II. Regions III and IV would not be present; there would be no white hole, only a black hole.

What is a white hole, physically? Mathematically, it is the time-reverse of a black hole. Such a beast cannot actually be formed in gravitational collapse – that produces a black hole with a future horizon, not a white hole with a past horizon. The other interesting fact about white holes, shown by D.M. Eardley in 1974, is that a white hole is unstable to collapsing into a black hole. For these and other reasons, you do not need to worry about the physics of white holes if you are considering classical gravity. Only we quantum gravity theorists need to worry our heads about such things.

17 M16Nov

17.1 Charged black holes

The Reissner-Nordstrøm solution is obtained when we assume staticity and spherical symmetry, and allow an energy-momentum tensor coming from the electromagnetic field. Since there are no known magnetic monopoles that could source a magnetic field, we will stick with an electric field.¹⁴ Then the only nonzero component of $F^{\mu\nu}$ is $F^{tr}$. Let us assume the same metric ansatz as we had for Schwarzschild,

$$ ds^2 = e^{2\alpha(r)} dt^2 - e^{2\beta(r)} dr^2 - r^2 d\Omega_2^2 . \tag{17.1} $$

The covariant source-free Maxwell equation $\nabla_\mu F^{\mu\nu} = 0$ can be rewritten in the form
$$ \frac{1}{\sqrt{-g}}\, \partial_\mu \left( \sqrt{-g}\, F^{\mu\nu} \right) = 0 , \tag{17.2} $$
where $-g$ is shorthand for the negative of the determinant of the downstairs spacetime metric. This Maxwell equation simplifies significantly by virtue of spherical symmetry. In our spacetime ansatz, we have
$$ \sqrt{-g} = e^{\alpha+\beta}\, r^2 \sin\theta , \tag{17.3} $$
so that the Maxwell equation implies

$$ \partial_r \left( r^2 \sin\theta\; e^{\alpha+\beta} F^{tr} \right) = 0 , \tag{17.4} $$
which we can immediately integrate by eye to
$$ F^{tr} = \frac{c_1}{r^2}\, e^{-\alpha-\beta} . \tag{17.5} $$
(Note: if we had been more rigorous and put in a delta function charge source on the RHS of the Maxwell equation, $c_1$ would have been proportional to the electric charge.) The next step in solving the Einstein-Maxwell system is to substitute in the above electric field into the energy-momentum tensor and apply Einstein’s equations. The details are similar in spirit but longer in practice than what we did before in deriving Schwarzschild, so we will not drag you through the algebra. The really nice thing is that, even with the electric field turned on, it turns out that the Einstein equations still furnish the relationship

$$ \alpha = -\beta + \text{const.} , \tag{17.6} $$
between the time-time component of the metric and the space-space component. Integrating up the θθ Einstein equation like we did for Schwarzschild produces the solution,

$$ ds^2_{\rm RN} = -\left( 1 - \frac{2G_N M}{c^2 r} + \frac{G_N Q^2}{4\pi\epsilon_0 c^4 r^2} \right) dt^2 + \left( 1 - \frac{2G_N M}{c^2 r} + \frac{G_N Q^2}{4\pi\epsilon_0 c^4 r^2} \right)^{-1} dr^2 + r^2 d\Omega_2^2 , \tag{17.7} $$

14 If you wanted to do the magnetic case, you would find Fθφ = P sin θ, where P ∝ magnetic charge.

where we have temporarily restored physical constants that we would usually set to unity. We can make this look slightly prettier by defining

$$ \mu \equiv \frac{G_N M}{c^2} , \quad \text{and} \quad q^2 \equiv \frac{G_N Q^2}{4\pi\epsilon_0 c^4} ; \tag{17.8} $$
then
$$ r_\pm = \mu \pm \sqrt{\mu^2 - q^2} . \tag{17.9} $$
The geometry has two event horizons, an outer horizon and an inner horizon. As you can check by computing the full contraction of the Riemann tensor with itself, the curvature singularity is located at r = 0. The “singularities” in the metric at $r = r_\pm$ are just coordinate singularities, like the one we encountered for Schwarzschild. There are three cases for Reissner-Nordstrøm metrics depending on the sign of what is under the square root in the above formula.

1. $\mu^2 < q^2$: This is unphysical. The event horizon walks off into the complex plane and the singularity at the origin is then naked. Oops!

2. $\mu^2 > q^2$: This is physical. It includes the limit of zero charge, which gives back Schwarzschild ($r_+ = 2\mu$, $r_- = 0$). Here there are two horizons, at $r = r_\pm$. The singularity in this case is timelike, as compared to spacelike for Schwarzschild.

3. $\mu^2 = q^2$: This is also physical, and is known as the extremal Reissner-Nordstrøm spacetime. You can think of it as having exquisitely balanced gravitational attraction and electric repulsion.
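The three cases can be captured in a few lines. This is a minimal sketch in the geometrized variables μ and q of eq.(17.8); the function names are illustrative. It uses the fact that the metric function in (17.7) becomes 1 − 2μ/r + q²/r² = (r − r₊)(r − r₋)/r² in these variables.

```python
import math


def rn_horizons(mu, q):
    """Horizon radii (r_minus, r_plus) from eq. (17.9).

    Returns None in the unphysical case mu^2 < q^2 (naked singularity).
    """
    disc = mu * mu - q * q
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return mu - root, mu + root


def rn_metric_function(r, mu, q):
    """The function 1 - 2 mu / r + q^2 / r^2 that multiplies dt^2 in eq. (17.7)."""
    return 1.0 - 2.0 * mu / r + q * q / r**2


print(rn_horizons(1.0, 0.0))  # Schwarzschild limit: r_- = 0, r_+ = 2 mu
print(rn_horizons(5.0, 3.0))  # two distinct horizons
print(rn_horizons(1.0, 1.0))  # extremal: double horizon at r = mu = |q|
print(rn_horizons(1.0, 2.0))  # None: no horizon
```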

The Penrose diagrams for Reissner-Nordstrøm black hole spacetimes are available in HEL §12.6, if you wish to peruse them to obtain intuition. Note however one important caveat on the maximal analytic extensions that display an infinite number (!) of asymptotic regions. The inner horizon has the property that probe perturbations coming in from past null infinity $\mathscr{I}^-$ tend to bunch up there: their magnitude grows out of control. But if the perturbation amplitude were that big, then it would surely backreact on the geometry, from having so much energy-momentum. This would entail changing the solution that we already wrote down. What this teaches us is that the semiclassical perturbation analysis is breaking down. Most likely, the singularity of a real physical charged black hole would become spacelike, covered by only one horizon, not two.

At this point we can make one more advanced comment, concerning the physical realism of charged black hole solutions. Quantum field theory shows you that charged black holes in real astrophysical situations will actually discharge rather quickly, via the Schwinger process, which nucleates charged particle-antiparticle pairs (e.g. electron-positron pairs) in an electric field, about a Compton wavelength apart. So if you mention a charged black hole to an astrophysicist, they tend to burst out laughing. But in some ways the joke is on them, because focusing on the dynamics of charged black holes was what led string theorists to perform the first-ever first-principles computation of the entropy of black holes in 1996, a discovery whose development indirectly helped me get hired! To finesse the astrophysicist’s objection, you can imagine that the “electric” charge we are discussing is not carried by light quanta in the theory.

A note about two neat properties of our extremal Reissner-Nordstrøm black hole. First, we will be able to see pretty quickly that there are multi black hole solutions in this case. This spacetime has one double horizon at $r = r_+ = r_- = |q|$,

$$ ds^2_{\rm ERN} = -\left(1 - \frac{|q|}{r}\right)^2 dt^2 + \left(1 - \frac{|q|}{r}\right)^{-2} dr^2 + r^2 d\Omega_2^2 . \tag{17.10} $$
We can easily define a shifted radial coordinate

ρ := r − |q| . (17.11)

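Then dρ = dr and

$$ \left(1 - \frac{|q|}{r}\right) = \frac{r - |q|}{r} = \frac{\rho}{\rho + |q|} = \left(1 + \frac{|q|}{\rho}\right)^{-1} . \tag{17.12} $$
Defining
$$ H(\rho) = 1 + \frac{|q|}{\rho} , \tag{17.13} $$
we have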

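$$ \begin{aligned} ds^2_{\rm ERN} &= -H^{-2} dt^2 + H^2 d\rho^2 + (\rho + |q|)^2 d\Omega_2^2 \\ &= -H^{-2} dt^2 + H^2 \left( d\rho^2 + \rho^2 d\Omega_2^2 \right) , \end{aligned} \tag{17.14} $$
where the second line uses $(\rho + |q|)^2 = \rho^2 H^2$. This coordinate system is known as isotropic coordinates because the metric in parentheses is the standard Euclidean metric in spherical polar coordinates. We also find that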

$$ \sqrt{G_N}\, A_t = H^{-1} - 1 . \tag{17.15} $$
If we substituted this ansatz for the gauge potential and the metric into Maxwell’s equations and the Einstein equations, we would find that they require only one equation between them,

$$ \vec{\nabla}^2 H = 0 . \tag{17.16} $$

In other words, H is a harmonic function of Cartesian coordinates $\vec{x}$ obtained from the isotropic spherical coordinates. It is actually possible to have multi black hole solutions of this system, because of the exact cancellation between gravitational attraction and electric repulsion between any two of the black hole centres!

$$ H = 1 + \sum_{a=1}^{N} \frac{G_N M_a}{|\vec{x} - \vec{x}_a|} . \tag{17.17} $$
This ability to superpose is extremely niche: it very generally fails in GR, a nonlinear theory.

Another interesting feature of the Reissner-Nordstrøm spacetime is what happens when you take the near-horizon limit $|\vec{x}| \to 0$. This in effect removes the 1 from the harmonic function. If you look carefully at the single-centred black hole metric in this limit, you will find that it produces $AdS_2 \times S^2$, two-dimensional Anti de Sitter spacetime times a two-sphere. This fact is related to the famous AdS/CFT correspondence of string theory.

17.2 Rotating black holes

Now we move to discussing the Kerr black hole, which has not only mass but also angular momentum. Our discussion here will be based largely on HEL §13. Demanding that the spacetime be stationary and axisymmetric requires an ansatz of the form

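$$ ds^2 = e^{2\alpha(r,\theta)} dt^2 - e^{2\gamma(r,\theta)} \left[ d\phi - \omega(r,\theta)\, dt \right]^2 - e^{2\beta(r,\theta)} dr^2 - e^{2\delta(r,\theta)} d\theta^2 . \tag{17.18} $$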

Note how many more functions we have turned on here, and the fact that there is now both r and θ dependence in all our metric functions. Mathematically speaking, this complicates the hell out of the process of solving the Einstein equations, because we now have PDEs in two variables instead of ODEs in r only. We will not actually prove that the Kerr solution solves the vacuum Einstein equations, because the algebra is awful.15 Instead, we will derive some fascinating physical properties of spacetimes of the above form, and just present the Kerr solution, gift wrapped with a bow on top. From the above ansatz we see that

$$ g_{tt} = e^{2\alpha} - e^{2\gamma} \omega^2 . \tag{17.19} $$

Since the metric has an off-diagonal component, $g_{t\phi}$, inverting to find the upstairs metric is slightly more complicated. We can easily read off two of the components,

$$ g^{rr} = -e^{-2\beta} , \tag{17.20} $$
$$ g^{\theta\theta} = -e^{-2\delta} , \tag{17.21} $$
but for the (t, φ) block we need to invert the 2 × 2 matrix. The result is

$$ g^{tt} = e^{-2\alpha} , \tag{17.22} $$
$$ g^{\phi\phi} = -e^{-2\gamma} + \omega^2 e^{-2\alpha} , \tag{17.23} $$
$$ g^{t\phi} = +\omega\, e^{-2\alpha} . \tag{17.24} $$
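The 2 × 2 inversion is quick to confirm numerically. The sketch below reads off the downstairs (t, φ) block from the ansatz (17.18), namely g_tt = e^{2α} − e^{2γ}ω², g_tφ = ω e^{2γ}, g_φφ = −e^{2γ}, inverts it by hand, and compares with (17.22)-(17.24). The sample values of α, γ, ω are arbitrary and the function names are illustrative.

```python
import math


def downstairs_block(alpha, gamma, omega):
    """(g_tt, g_tphi, g_phiphi) read off from the metric ansatz (17.18)."""
    g_tt = math.exp(2.0 * alpha) - math.exp(2.0 * gamma) * omega**2
    g_tphi = omega * math.exp(2.0 * gamma)
    g_phiphi = -math.exp(2.0 * gamma)
    return g_tt, g_tphi, g_phiphi


def invert_symmetric_2x2(a, b, d):
    """Inverse of [[a, b], [b, d]], returned as (a_inv, b_inv, d_inv)."""
    det = a * d - b * b
    return d / det, -b / det, a / det


def quoted_upstairs_block(alpha, gamma, omega):
    """(g^tt, g^tphi, g^phiphi) as quoted in eqs. (17.22)-(17.24)."""
    g_tt_up = math.exp(-2.0 * alpha)
    g_tphi_up = omega * math.exp(-2.0 * alpha)
    g_phiphi_up = -math.exp(-2.0 * gamma) + omega**2 * math.exp(-2.0 * alpha)
    return g_tt_up, g_tphi_up, g_phiphi_up


sample = (0.3, -0.2, 0.7)  # arbitrary (alpha, gamma, omega) at one spacetime point
computed = invert_symmetric_2x2(*downstairs_block(*sample))
quoted = quoted_upstairs_block(*sample)
print(computed)
print(quoted)
```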

It is possible to see one of the most intriguing consequences of GR, known as the dragging of inertial frames, without getting specific about the form of any of the functions in our metric ansatz. Since the metric obeys ∂φgµν = 0, pφ is conserved along a geodesic. Then

$$ p^\phi = g^{\phi\mu} p_\mu = g^{\phi\phi} p_\phi + g^{\phi t} p_t , \tag{17.25} $$
and similarly
$$ p^t = g^{tt} p_t + g^{t\phi} p_\phi . \tag{17.26} $$

Let us specialize to the case of $p_\phi = 0$: no initial angular momentum. This quantity remains zero along the geodesic. Then, recalling our relationship between the momentum and the tangent vector for either massive or massless geodesics,
$$ p^\mu \propto \frac{dx^\mu}{d\lambda} , \tag{17.27} $$

15 If you are a masochist and want to see it for yourself, please make SymPy do it.

we have
$$ \frac{d\phi}{dt} = \frac{p^\phi}{p^t} = \frac{g^{t\phi}}{g^{tt}} = \omega(r,\theta) . \tag{17.28} $$
In other words, ω is the coordinate angular velocity of a particle with zero angular momentum. What we have obtained here might not look like much, but it is physically remarkable. A particle dropped straight inwards from infinity will not end up continuing straight inwards – instead, gravity drags the particle around so it acquires an angular velocity. This effect is known as the dragging of inertial frames.

Our next task is to define a physically important surface known as the stationary limit surface. To get basic intuition for this phenomenon, consider what happens if we assume that a particle/observer could in principle remain at fixed (r, θ, φ). This would require a 4-velocity of the form $[u^\mu] = [u^t, \vec{0}]^T$. Is this compatible with our spacetime? The answer is: not everywhere! In the region where $g_{tt}$ is negative, we see that our assumed 4-velocity is incompatible with the condition that $u^2 = 1$. Oops. The equation $g_{tt} = 0$ delineates the surface inside which a particle/observer cannot stay stationary, and it is called the stationary limit surface.

Let us now dig a little deeper. Imagine photons emitted from (r, θ, φ) purely in the ±φ direction at first, so that only dt and dφ are nonzero along the photon path. Using $ds^2 = 0$, we have

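$$ g_{tt}\, dt^2 + 2 g_{t\phi}\, dt\, d\phi + g_{\phi\phi}\, d\phi^2 = 0 , \tag{17.29} $$
so that
$$ \frac{d\phi}{dt} = -\frac{g_{t\phi}}{g_{\phi\phi}} \pm \sqrt{ \frac{g_{t\phi}^2}{g_{\phi\phi}^2} - \frac{g_{tt}}{g_{\phi\phi}} } \tag{17.30} $$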

If at the emission point $g_{tt}/g_{\phi\phi} < 0$, then dφ/dt is positive (negative) for photons emitted in the +φ (−φ) direction, even though the magnitudes differ. But when $g_{tt} = 0$, we cross over to a different behaviour. In particular, on the surface $g_{tt}(r, \theta) = 0$, known as the stationary limit surface, there are two qualitatively different solutions:
$$ \frac{d\phi}{dt} = -2\, \frac{g_{t\phi}}{g_{\phi\phi}} = 2\omega \qquad \text{or} \qquad \frac{d\phi}{dt} = 0 . \tag{17.31} $$
The first solution corresponds to a photon sent off in the same direction as the source rotation. The second solution shows that frame dragging is so severe that initially the photon does not move at all. This implies that a massive particle, which must always go slower than a photon, also has to rotate with the source. This is true even if it has an arbitrarily large angular momentum with opposite orientation!

As we will find next week when we start talking about experimental successes of GR, the formula for gravitational redshift of an observer at a fixed spatial location in a stationary spacetime is
$$ \left( \frac{\nu_R}{\nu_E} \right)_{\rm stationary,\ fixed} = \sqrt{ \frac{g_{tt}(E)}{g_{tt}(R)} } . \tag{17.32} $$
This is why the stationary limit surface is also known as the infinite redshift surface.
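The change of behaviour across the stationary limit surface can be seen by evaluating both roots of eq.(17.30) directly. The following sketch is illustrative: it takes the three metric components as plain numbers, and the sample values correspond to the ansatz (17.18) with γ = 0 and ω = 0.3 (so g_tφ = 0.3 and g_φφ = −1), with g_tt varied by hand.

```python
import math


def photon_angular_velocities(g_tt, g_tphi, g_phiphi):
    """Both roots d(phi)/dt of eq. (17.30), returned as (smaller, larger)."""
    centre = -g_tphi / g_phiphi
    spread = math.sqrt(centre**2 - g_tt / g_phiphi)
    return centre - spread, centre + spread


omega = 0.3
g_tphi, g_phiphi = omega, -1.0  # ansatz (17.18) with gamma = 0

outside = photon_angular_velocities(0.5, g_tphi, g_phiphi)     # g_tt > 0
on_surface = photon_angular_velocities(0.0, g_tphi, g_phiphi)  # g_tt = 0
inside = photon_angular_velocities(-0.05, g_tphi, g_phiphi)    # g_tt < 0

print(outside)     # one negative root, one positive root
print(on_surface)  # roots 0 and 2 * omega, as in eq. (17.31)
print(inside)      # both roots positive: even photons must co-rotate
```

Inside the surface both roots have the same sign as ω, which is the statement that nothing can remain at fixed φ there.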

18 R19Nov

18.1 The Kerr solution

Today’s material is based on parts of HEL §13. All figures shown are theirs.

How would we find the horizon in our rotating spacetime? The defining property of an event horizon is that it is a null surface. In stationary axisymmetric spacetimes, its equation must be of the form f(r, θ) = 0. Nullness then implies that $g^{\mu\nu} \partial_\mu f\, \partial_\nu f = 0$, or $g^{rr} (\partial_r f)^2 + g^{\theta\theta} (\partial_\theta f)^2 = 0$. In fact, it turns out that it is actually possible to choose our coordinates r and θ such that the equation for the horizon can be put in the form f(r) = 0. In this case, our condition reduces to $g^{rr} (\partial_r f)^2 = 0$, and therefore we see that the event horizon occurs when
$$ g^{rr} = 0 . \tag{18.1} $$

In our previous case of Schwarzschild, this was equivalent to the condition $g_{tt} = 0$, but that only holds for static black holes, not stationary ones.

This is a good place to mention a definition of a horizon associated to Killing vectors. Suppose that we have a Killing vector $\chi^\mu$. If that Killing vector is null along some null hypersurface Σ, then Σ is a Killing horizon of $\chi^\mu$. Note that $\chi^\mu$ is normal to Σ because a null surface cannot have two linearly independent null tangent vectors. Some important facts are as follows.

• Every event horizon Σ in a stationary, asymptotically flat spacetime is a Killing horizon for some Killing vector $\chi^\mu$.

• If the spacetime is static, then $\chi^\mu$ will be the Killing vector $K^\mu = (\partial_t)^\mu$ representing time translations at infinity.

• If the spacetime is stationary but not static, then it will be axisymmetric with a rotational Killing vector $R^\mu = (\partial_\phi)^\mu$, and $\chi^\mu$ will be a linear combination $K^\mu + \Omega_H R^\mu$ for some constant $\Omega_H$.

To prove that the empty space Einstein equations are satisfied, we need to show that the Ricci tensor is zero for metrics of our form with α, β, γ, δ, ω. Take my word for it: this is a very tedious computation. Here is the Kerr metric that emerges after all the calculational dust has settled:-
$$ \begin{aligned} ds^2 &= \left(1 - \frac{2\mu r}{\rho^2}\right) dt^2 + \frac{4\mu a r \sin^2\theta}{\rho^2}\, dt\, d\phi - \frac{\rho^2}{\Delta}\, dr^2 - \rho^2 d\theta^2 - \left( r^2 + a^2 + \frac{2\mu r a^2 \sin^2\theta}{\rho^2} \right) \sin^2\theta\, d\phi^2 \\ &= \frac{\rho^2 \Delta}{\Sigma^2}\, dt^2 - \frac{\Sigma^2 \sin^2\theta}{\rho^2} \left( d\phi - \omega\, dt \right)^2 - \frac{\rho^2}{\Delta}\, dr^2 - \rho^2 d\theta^2 , \end{aligned} \tag{18.2} $$
where
$$ \rho^2 = r^2 + a^2 \cos^2\theta , \qquad \Sigma^2 = (r^2 + a^2)^2 - a^2 \Delta \sin^2\theta , \tag{18.3} $$
$$ \Delta = r^2 - 2\mu r + a^2 , \qquad \omega = \frac{2\mu a r}{\Sigma^2} . \tag{18.4} $$
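It is not obvious by eye that the two lines of eq.(18.2) agree, so here is a quick numerical spot check of the (t, φ) block. This is only a sketch at arbitrary sample points, not a proof, and the function names are illustrative; the dr² and dθ² terms are identical in both forms and are therefore omitted.

```python
import math


def kerr_functions(r, theta, mu, a):
    """rho^2, Delta, Sigma^2 and omega from eqs. (18.3)-(18.4)."""
    rho2 = r * r + a * a * math.cos(theta)**2
    delta = r * r - 2.0 * mu * r + a * a
    sigma2 = (r * r + a * a)**2 - a * a * delta * math.sin(theta)**2
    omega = 2.0 * mu * a * r / sigma2
    return rho2, delta, sigma2, omega


def t_phi_block_first_form(r, theta, mu, a):
    """(g_tt, g_tphi, g_phiphi) read off from the first line of eq. (18.2)."""
    rho2, _, _, _ = kerr_functions(r, theta, mu, a)
    s2 = math.sin(theta)**2
    g_tt = 1.0 - 2.0 * mu * r / rho2
    g_tphi = 2.0 * mu * a * r * s2 / rho2  # half the coefficient of dt dphi
    g_phiphi = -(r * r + a * a + 2.0 * mu * r * a * a * s2 / rho2) * s2
    return g_tt, g_tphi, g_phiphi


def t_phi_block_second_form(r, theta, mu, a):
    """(g_tt, g_tphi, g_phiphi) from expanding the second line of eq. (18.2)."""
    rho2, delta, sigma2, omega = kerr_functions(r, theta, mu, a)
    s2 = math.sin(theta)**2
    g_tt = rho2 * delta / sigma2 - sigma2 * s2 * omega**2 / rho2
    g_tphi = sigma2 * s2 * omega / rho2
    g_phiphi = -sigma2 * s2 / rho2
    return g_tt, g_tphi, g_phiphi


point = (3.0, 1.1, 1.0, 0.6)  # arbitrary (r, theta, mu, a)
first = t_phi_block_first_form(*point)
second = t_phi_block_second_form(*point)
print(first)
print(second)
```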

The coordinate system in which we have presented the Kerr metric is known as the Boyer-Lindquist coordinate system. Note: this was not the coordinate system that Kerr used when he originally derived the black hole solution; he worked in what are now known as Kerr-Schild coordinates.

Where is the singularity of the Kerr spacetime describing a rotating black hole? Computing the full contraction of Riemann with itself shows that only at $\rho^2 = 0$ do we see a physical singularity. This happens at
$$ r^2 + a^2 \cos^2\theta = 0 , \tag{18.5} $$
yielding
$$ r = 0 , \qquad \theta = \frac{\pi}{2} . \tag{18.6} $$
Careful inspection reveals that this singularity is ring shaped. To see this, take the limit $M \to 0$ while keeping $a$ nonzero; the result gives Minkowski spacetime in oblate spheroidal coordinates, which are related to Cartesian coordinates by
$$ x = \sqrt{r^2 + a^2}\, \sin\theta \cos\phi , \qquad y = \sqrt{r^2 + a^2}\, \sin\theta \sin\phi , \qquad z = r \cos\theta . \tag{18.7} $$
In these coordinates, $r = 0$, $\theta = \pi/2$ is the ring $x^2 + y^2 = a^2$ in the plane $z = 0$. This is to be contrasted with Schwarzschild, where the singularity was pointlike.

Where are the horizons? These occur where $g^{rr} \to 0$. This requires $\Delta = 0$, or
$$ r = r_\pm = \mu \pm \sqrt{\mu^2 - a^2} . \tag{18.8} $$
Note that, with factors of $c$ temporarily restored for physical clarity,
$$ \mu = \frac{G_N M}{c^2} , \qquad J = M a c . \tag{18.9} $$
So then we require
$$ \mu \geq |a| \tag{18.10} $$
for the horizons to exist, as cosmic censorship demands.

Where is the stationary limit surface, which bounds the ergoregion? This happens when $g_{tt} \to 0$, at
$$ r_{S\pm} = \mu \pm \sqrt{\mu^2 - a^2 \cos^2\theta} . \tag{18.11} $$
The following figure summarizes these aspects from a side-on perspective.
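To get a feel for eqs. (18.8) and (18.11), here is a minimal Python sketch that evaluates the horizon and stationary-limit radii. The function names and the sample values $\mu = 1$, $a = 0.8$ are illustrative choices, not taken from HEL.

```python
import math

def kerr_horizons(mu, a):
    """Roots of Delta = r^2 - 2*mu*r + a^2 = 0, eq. (18.8): returns (r_minus, r_plus)."""
    if abs(a) > mu:
        raise ValueError("no horizon: eq. (18.10) requires mu >= |a|")
    root = math.sqrt(mu * mu - a * a)
    return mu - root, mu + root

def stationary_limit(mu, a, theta):
    """Radii where g_tt = 0, eq. (18.11): returns (r_S_minus, r_S_plus)."""
    root = math.sqrt(mu * mu - (a * math.cos(theta)) ** 2)
    return mu - root, mu + root

# Illustrative values in units where mu sets the length scale.
mu, a = 1.0, 0.8
r_minus, r_plus = kerr_horizons(mu, a)
_, rs_pole = stationary_limit(mu, a, 0.0)             # on the rotation axis
_, rs_equator = stationary_limit(mu, a, math.pi / 2)  # in the equatorial plane
print(r_minus, r_plus, rs_pole, rs_equator)
```

With these inputs the outer stationary limit surface touches the outer horizon on the rotation axis and sits at $r = 2\mu$ on the equator, so the ergoregion is thickest there.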

18.2 The Penrose process

Previously we started the discussion of frame dragging in GR for the Kerr spacetime. Let us now finish that line of reasoning, which will help lead us into the subject of black hole thermodynamics.

Suppose that you had the ability to fire rockets and wanted to remain fixed at $(r, \theta)$ but rotate around $\phi$. Then the 4-velocity is

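$$ [u^\mu] = u^t\, [1, 0, 0, \Omega]^T , \tag{18.12} $$
where
$$ \Omega = \frac{d\phi}{dt} \tag{18.13} $$
is the angular velocity w.r.t. an observer at infinity. Demanding that the 4-velocity squares to $\epsilon$ gives a quadratic equation for $u^t$:
$$ g_{tt} (u^t)^2 + 2 g_{t\phi}\, u^t u^\phi + g_{\phi\phi} (u^\phi)^2 = (u^t)^2 \left[ g_{tt} + 2 g_{t\phi} \Omega + g_{\phi\phi} \Omega^2 \right] = \epsilon . \tag{18.14} $$
For real solutions for $u^t$, we need
$$ g_{\phi\phi} \Omega^2 + 2 g_{t\phi} \Omega + g_{tt} \geq 0 . \tag{18.15} $$
Since $g_{\phi\phi} < 0$ everywhere, $\Omega$ must lie in the interval $\Omega \in (\Omega_-, \Omega_+)$, where
$$ \Omega_\pm = -\frac{g_{t\phi}}{g_{\phi\phi}} \pm \sqrt{ \frac{g_{t\phi}^2}{g_{\phi\phi}^2} - \frac{g_{tt}}{g_{\phi\phi}} } = \omega \pm \sqrt{ \omega^2 - \frac{g_{tt}}{g_{\phi\phi}} } . \tag{18.16} $$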
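The allowed range in eq. (18.16) can be checked numerically. The sketch below reads $g_{tt}$, $g_{t\phi}$ and $g_{\phi\phi}$ off the Boyer-Lindquist metric (18.2) and evaluates $\Omega_\pm$. The function names and the sample values $\mu = 1$, $a = 0.8$ are illustrative assumptions. It confirms two limits: $\Omega_- = 0$ where $g_{tt} = 0$, and on the outer horizon the two roots merge at $a/(2\mu r_+)$ for every $\theta$ tried.

```python
import math

def kerr_components(mu, a, r, theta):
    """g_tt, g_tphi, g_phiphi from eq. (18.2), mostly-minus signature."""
    s2 = math.sin(theta) ** 2
    rho2 = r * r + (a * math.cos(theta)) ** 2
    g_tt = 1.0 - 2.0 * mu * r / rho2
    g_tphi = 2.0 * mu * a * r * s2 / rho2   # half the coefficient of dt dphi
    g_phiphi = -(r * r + a * a + 2.0 * mu * r * a * a * s2 / rho2) * s2
    return g_tt, g_tphi, g_phiphi

def omega_limits(mu, a, r, theta):
    """Omega_minus and Omega_plus of eq. (18.16)."""
    g_tt, g_tphi, g_phiphi = kerr_components(mu, a, r, theta)
    omega = -g_tphi / g_phiphi
    # Clip tiny negative rounding errors that can appear right on the horizon.
    disc = max(omega * omega - g_tt / g_phiphi, 0.0)
    root = math.sqrt(disc)
    return omega - root, omega + root

mu, a = 1.0, 0.8
r_plus = mu + math.sqrt(mu * mu - a * a)

# On the equatorial stationary limit surface (r = 2*mu) the lower limit vanishes.
sls_lo, sls_hi = omega_limits(mu, a, 2.0 * mu, math.pi / 2)

# On the outer horizon both limits collapse to a / (2 mu r_plus), whatever theta is.
horizon_values = [omega_limits(mu, a, r_plus, th) for th in (0.3, 1.0, math.pi / 2)]
print(sls_lo, sls_hi, horizon_values, a / (2.0 * mu * r_plus))
```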

Notice how $\Omega_-$ can be negative if $g_{tt} > 0$. Where $g_{tt} = 0$, $\Omega_- = 0$ and $\Omega_+ = 2\omega$. This occurs on the stationary limit surface $S_+$, which is outside (or at, for $\theta = 0, \pi$) the event horizon. A special situation ensues when $\omega^2 = g_{tt}/g_{\phi\phi}$, so that $\Omega_\pm = \omega$. This holds at $\Delta = 0$, i.e. at the outer horizon. At $r = r_+$, the angular velocity has to be one value only,
$$ \Omega_H = \omega(r_+, \theta) = \frac{a}{2\mu r_+} . \tag{18.17} $$
This is independent of $\theta$, which is a highly nontrivial physics fact. It is also the maximum allowed value of the angular velocity inside the ergoregion.

Now we have all the ingredients at hand to discuss the Penrose process. Suppose that we have an observer at infinity with fixed position who fires particle $A$ into the Kerr black hole ergoregion. Then the energy of $A$ measured at the emission event $E$ is
$$ E^{(A)} = p^{(A)}(E) \cdot u_{\rm obs} = p_t^{(A)}(E) , \tag{18.18} $$
where the observer 4-velocity is $[u^\mu_{\rm obs}] = [1, \vec{0}\,]^T$. Now suppose that inside the ergoregion, particle $A$ decays into two other particles: $A \to B + C$. Then momentum conservation implies that
$$ p^{(A)}(D) = p^{(B)}(D) + p^{(C)}(D) , \tag{18.19} $$
where $D$ denotes the decay event.

Suppose that $C$ eventually makes it out to infinity. The observer at infinity measures the particle energy at the reception event $R$ to be
$$ E^{(C)} = p_t^{(C)}(R) = p_t^{(C)}(D) \tag{18.20} $$
because $p_t$ is conserved along a geodesic by virtue of stationarity: $\partial_t g_{\mu\nu} = 0$. Similarly, for the original particle,
$$ p_t^{(A)}(D) = p_t^{(A)}(E) . \tag{18.21} $$
Then the time component of the above momentum conservation equation can be rearranged to
$$ E^{(C)} = E^{(A)} - p_t^{(B)}(D) , \tag{18.22} $$
because $p_t^{(B)}$ is conserved along a geodesic.

Now, if $B$ were to escape the ergoregion, $p_t^{(B)}$ would be timelike, and hence proportional to the particle energy as measured by an observer with purely timelike 4-velocity. Since $p_t^{(B)} > 0$, this implies that $E^{(C)} < E^{(A)}$, i.e. you get less energy out than you put in. But if $B$ were to instead fall into the black hole, then it would forever remain in the region where $g_{tt}$ has opposite sign. Then $p_t^{(B)}$ would be interpreted as a component of spatial momentum, which could in principle be either positive or negative. (If it were the energy, it would have to be positive for a physical particle.) If $p_t^{(B)}$ happened to be negative, then $E^{(C)} > E^{(A)}$. This means that we can extract energy from a rotating black hole!

Once $B$ has fallen inside the event horizon, it becomes part of the black hole, whose mass and angular momentum are then
$$ M c^2 \to M c^2 + p_t^{(B)} , \qquad J \to J - p_\phi^{(B)} . \tag{18.23} $$
If we have an observer at fixed $r, \theta$, we already worked out the 4-velocity: $[u^\mu] = u^t [1, 0, 0, \Omega]^T$, where $\Omega = d\phi/dt$ is the angular velocity w.r.t. infinity. This observer measures $B$’s energy to be
$$ E^{(B)} = p_\mu^{(B)} u^\mu = u^t \left( p_t^{(B)} + p_\phi^{(B)} \Omega \right) . \tag{18.24} $$
This quantity must be positive for a physical particle, so
$$ -p_\phi^{(B)} < \frac{p_t^{(B)}}{\Omega} . \tag{18.25} $$
Consider the quantity $L = -p_\phi^{(B)}$. What is it? This is the component of $B$’s angular momentum along the black hole rotation axis: there is a $-$ sign because we are working in mostly minus signature. Now, because $p_t^{(B)} < 0$ for the Penrose process and $\Omega > 0$, this means that $L < 0$, resulting in a loss of angular momentum for the black hole. You can keep extracting energy from a black hole like this until you have spun Kerr down all the way to Schwarzschild.

Earlier, we learned that the angular velocity is maximal at $r = r_+$, when $\Omega = \Omega_H$. So in fact for any observer at fixed $r, \theta$, we have a general bound,
$$ \delta J < \frac{\delta M c^2}{\Omega_H} . \tag{18.26} $$

Let us sketch the calculation of the area of the outer horizon $r_+$ in the Kerr spacetime. Writing
$$ \gamma_{ij}\, dx^i dx^j = -ds^2 (dt = 0,\ dr = 0,\ r = r_+) \tag{18.27} $$
$$ \phantom{\gamma_{ij}\, dx^i dx^j} = (r_+^2 + a^2 \cos^2\theta)\, d\theta^2 + \frac{(r_+^2 + a^2)^2 \sin^2\theta}{r_+^2 + a^2 \cos^2\theta}\, d\phi^2 , \tag{18.28} $$
we define the area $A$ as
$$ A(r) = \int \sqrt{|\gamma|}\, d\theta\, d\phi . \tag{18.29} $$
From the metric, we have
$$ \sqrt{|\gamma|} = (r_+^2 + a^2) \sin\theta , \tag{18.30} $$
so that
$$ A(r_+) = 4\pi (r_+^2 + a^2) . \tag{18.31} $$
A very cute fact about the Penrose process is that the area of the black hole horizon does not shrink when it occurs. What is the physics behind this? The angular momentum is reduced more than the mass each time we do it, and this ensures that the area of the black hole never decreases. To see a few more details, let us define the irreducible mass by
$$ M_{\rm irr}^2 = \frac{A}{16\pi G_N^2} \tag{18.32} $$
$$ \phantom{M_{\rm irr}^2} = \frac{1}{4 G_N^2} (r_+^2 + a^2) \tag{18.33} $$
$$ \phantom{M_{\rm irr}^2} = \frac{1}{2} \left( M^2 + \sqrt{ M^4 - \frac{J^2}{G_N^2} } \right) . \tag{18.34} $$
This might seem a tad unmotivated until we realize how it is affected by changes in $M$ and $J$. We find after some straightforward but boring algebra that
$$ \delta M_{\rm irr} = \frac{a}{4 G_N M_{\rm irr} \sqrt{G_N^2 M^2 - J^2/M^2}} \left( \frac{\delta M}{\Omega_H} - \delta J \right) . \tag{18.35} $$
Look carefully at what this implies. We had earlier that for a Penrose process, $\delta J < \delta M/\Omega_H$ (where both $\delta M$ and $\delta J$ are negative), so
$$ \delta M_{\rm irr} > 0 . \tag{18.36} $$
Therefore, the maximum work you can extract via the Penrose process is
$$ M - M_{\rm irr} = M - \frac{1}{\sqrt{2}} \sqrt{ M^2 + \sqrt{ M^4 - \frac{J^2}{G_N^2} } } , \tag{18.37} $$
and this is maximized to $(1 - 1/\sqrt{2}) \simeq 29\%$ of the original energy for extreme Kerr.

The moral of the story here is that we are discovering relationships between macroscopic variables of the black hole, and this opens the door to discussing black hole thermodynamics (a topic on which I am an expert). I will have more to say about this in Winter/Spring in the second GR course, PHY484S/PHY1484S.
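The 29% figure follows directly from eq. (18.34). A minimal Python sketch, in units with $c = 1$ and with illustrative function names of my own choosing:

```python
import math

def irreducible_mass(M, J, G=1.0):
    """M_irr from eq. (18.34), in units with c = 1."""
    inner = M ** 4 - (J / G) ** 2
    if inner < 0.0:
        raise ValueError("J exceeds the extreme Kerr value G*M^2")
    return math.sqrt(0.5 * (M * M + math.sqrt(inner)))

def max_extractable_fraction(M, J, G=1.0):
    """(M - M_irr) / M, the upper bound on the energy a Penrose process can extract."""
    return 1.0 - irreducible_mass(M, J, G) / M

extreme = max_extractable_fraction(1.0, 1.0)        # J = G M^2, extreme Kerr
schwarzschild = max_extractable_fraction(1.0, 0.0)  # J = 0, nothing to extract
print(extreme, schwarzschild)
```

For extreme Kerr the inner square root vanishes, so $M_{\rm irr} = M/\sqrt{2}$; for $J = 0$ the irreducible mass equals $M$ and the extractable fraction is zero.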

19 M23Nov

The reason we teach GR is not based in theoretical aesthetics, although those are really quite beautiful and many great intellects have fallen in love with it! We teach GR and use it because it works as an experimental description of gravity. In the next few lectures, we will discuss some of the signature experiments that established GR firmly in the minds of humans worldwide. Material on experimental tests, not including gravitational waves, is based pretty closely on Appendix 9A and §10 of the HEL textbook. All the figures displayed for this material are theirs.

19.1 Gravitational redshift

Suppose that we have a stationary spacetime of the form
$$ ds^2 = g_{00}(x^k) (dx^0)^2 + 2 g_{0i}(x^k)\, dx^0 dx^i + g_{ij}(x^k)\, dx^i dx^j . $$
This includes all of the types of black holes we have studied so far: Schwarzschild, Reissner-Nordström, and Kerr. Imagine two different physical observers who are massive and therefore move slower than light. Call them $E$ for emitter and $R$ for receiver, with worldlines $x_E^\mu(\tau_E)$ and $x_R^\mu(\tau_R)$ respectively, where $\tau_E, \tau_R$ are the proper times for those two observers. Now let $E$ moving with 4-velocity $U_E(A)$ emit a photon at event $A$ and $R$ moving with 4-velocity $U_R(B)$ receive it at event $B$.

We can find the energy of a photon in the reference frame of a massive observer by taking the dot product of the photon’s 4-momentum with the observer’s 4-velocity,
$$ E = p_\mu U^\mu . \tag{19.1} $$
This works because we can choose the affine parameter of a null geodesic such that
$$ p^\mu = \frac{dx^\mu}{d\lambda} . \tag{19.2} $$
(Note that this is different from the convention for massive particles, for which the constant of proportionality in the above equation is the rest mass, rather than unity.) Then we have
$$ E(A) = p_\mu(A)\, U_E^\mu(A) , \tag{19.3} $$
$$ E(B) = p_\mu(B)\, U_R^\mu(B) . \tag{19.4} $$

Since in both cases $E = h\nu$, we have
$$ \frac{\nu_R}{\nu_E} = \frac{p_\mu(B)\, U_R^\mu(B)}{p_\mu(A)\, U_E^\mu(A)} . \tag{19.5} $$
Now, since the photon’s 4-momentum is tangent to its geodesic, it is parallel transported along its path. Equivalently, the directional covariant derivative of $p_\mu$ is zero along the geodesic,
$$
\begin{aligned}
0 = \frac{D p_\mu}{D\lambda} = \frac{dx^\sigma}{d\lambda} \nabla_\sigma p_\mu &= \frac{dx^\sigma}{d\lambda} \left( \partial_\sigma p_\mu - \Gamma^\nu{}_{\sigma\mu}\, p_\nu \right) \\
&= \frac{d p_\mu}{d\lambda} - \Gamma^\nu{}_{\mu\sigma}\, p_\nu \frac{dx^\sigma}{d\lambda} .
\end{aligned} \tag{19.6}
$$
We can use this to relate $p_\mu(B)$ to $p_\mu(A)$. Recruiting our convention $p^\mu = dx^\mu/d\lambda$, we have
$$ \frac{d p_\mu}{d\lambda} = \Gamma^\nu{}_{\mu\sigma}\, p_\nu\, p^\sigma . \tag{19.7} $$
Recall that we also have the mass shell relation for the photon,
$$ p^\mu p_\mu = 0 . \tag{19.8} $$
Suppose the emitter $E$ and receiver $R$ are at fixed spatial coordinates. (This would not be true for freely falling observers.) Then the spatial components of the observers’ 4-velocities vanish,
$$ U_E^i = \frac{dx_E^i}{d\tau_E} = 0 , \qquad U_R^i = \frac{dx_R^i}{d\tau_R} = 0 . \tag{19.9} $$
Using $U^\mu U_\mu = 1$ for massive observers gives $U^0 = 1/\sqrt{g_{00}}$, so that
$$ \left. \frac{\nu_R}{\nu_E} \right|_{\rm fixed} = \frac{p_0(B)}{p_0(A)} \sqrt{ \frac{g_{00}(A)}{g_{00}(B)} } . \tag{19.10} $$
If the metric is stationary, i.e. $\partial_0 g_{\mu\nu} = 0$, then $p_0$ is conserved by the geodesic equation. Then since the momentum vector for a photon is equal to the tangent vector, as in eq.(19.2), $p_0$ is constant along a photon geodesic, and so
$$ \left. \frac{\nu_R}{\nu_E} \right|_{\rm fixed,\ stationary} = \sqrt{ \frac{g_{00}(x_E^k)}{g_{00}(x_R^k)} } . \tag{19.11} $$
For Schwarzschild, we obtain
$$ \left. \frac{\nu_R}{\nu_E} \right|_{\rm fixed,\ stationary} = \sqrt{ \frac{1 - 2 G_N m/(c^2 r_E)}{1 - 2 G_N m/(c^2 r_R)} } . \tag{19.12} $$

For the Kerr spacetime, we previously found that the location where $g_{00} = 0$ marks the stationary limit surface (SLS), the surface inside which a stationary observer gets involuntarily dragged around with the rotating black hole spacetime geometry. Here, we see that the SLS is also the location where the gravitational redshift for an observer at a fixed spatial location becomes infinite. The quantity $z$ for the redshift is defined by
$$ \frac{\nu_R}{\nu_E} = \frac{1}{1 + z} . \tag{19.13} $$
If we want to find the redshift for freely falling observers, then we need to solve the geodesic equations for the stationary spacetime in question. The analysis is even more complicated for spacetimes that are not stationary.
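As a concrete illustration of eqs. (19.12) and (19.13), the following Python sketch evaluates the frequency ratio and redshift for static observers in Schwarzschild. The function names are hypothetical, and the solar radius used in the example is an assumed round value; $\mu = G_N m/c^2 \approx 1.48$ km for the Sun is the figure used elsewhere in these notes.

```python
import math

def frequency_ratio(mu, r_emit, r_recv=math.inf):
    """nu_R / nu_E for static observers in Schwarzschild, eq. (19.12).

    mu = G_N m / c^2 is the gravitational radius; all lengths share one unit.
    """
    g00_emit = 1.0 - 2.0 * mu / r_emit
    g00_recv = 1.0 - 2.0 * mu / r_recv
    if g00_emit <= 0.0 or g00_recv <= 0.0:
        raise ValueError("static observers only exist outside r = 2*mu")
    return math.sqrt(g00_emit / g00_recv)

def redshift(mu, r_emit, r_recv=math.inf):
    """z defined by nu_R / nu_E = 1 / (1 + z), eq. (19.13)."""
    return 1.0 / frequency_ratio(mu, r_emit, r_recv) - 1.0

MU_SUN = 1.48e3   # m
R_SUN = 6.96e8    # m, assumed solar radius

z_sun = redshift(MU_SUN, R_SUN)   # emitted at the solar surface, received far away
print(z_sun)
```

In the weak-field regime $z \simeq \mu/r_E$ for a distant receiver, which is a couple of parts per million for light leaving the Sun; as $r_E \to 2\mu$ the redshift grows without bound.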

19.2 Planetary perihelion precession

How will we discover the perihelion advance we are after? We will start by using the geodesic equations derived for the Schwarzschild geometry introduced previously. The analysis can also be done for Kerr, but for our purposes here the non-rotating case will suffice to show the essential physics. We had a conserved energy
$$ E = \left( 1 - \frac{2\mu}{r} \right) \dot{t} , \tag{19.14} $$
and a conserved angular momentum
$$ L = r^2 \dot{\phi} , \tag{19.15} $$
where $\dot{} = d/d\lambda$. Then the norm condition $g_{\mu\nu}\, \dot{x}^\mu \dot{x}^\nu = \epsilon$ (with $\epsilon = 0$ for photons and $+1$ for massive particles) gives
$$ -E^2 + \dot{r}^2 + \left( 1 - \frac{2\mu}{r} \right) \left( \frac{L^2}{r^2} + \epsilon \right) = 0 . \tag{19.16} $$
We saw previously that defining
$$ V_{\rm eff}(r) = \frac{1}{2} \left( \frac{L^2}{r^2} + \epsilon \right) \left( 1 - \frac{2\mu}{r} \right) , \tag{19.17} $$
in analogy with Newtonian experience allows the rewriting
$$ \frac{1}{2} \dot{r}^2 + V_{\rm eff}(r) = \frac{1}{2} E^2 \equiv \mathcal{E} . \tag{19.18} $$
We can combine the knowledge above to find the shape equation,
$$ \frac{d\phi}{dr} = \frac{d\phi}{d\lambda} \left( \frac{dr}{d\lambda} \right)^{-1} , \tag{19.19} $$
giving
$$ \frac{d\phi}{dr} = \pm \frac{L}{r^2} \left[ 2 \left( \mathcal{E} - V_{\rm eff}(r) \right) \right]^{-1/2} . \tag{19.20} $$
Defining the orbit parameter $b$, via
$$ b = \frac{L}{E} , \tag{19.21} $$
gives
$$ \frac{d\phi}{dr} = \frac{1}{r^2} \left[ \frac{1}{b^2} - \left( \frac{1}{r^2} + \frac{\epsilon}{L^2} \right) \left( 1 - \frac{2\mu}{r} \right) \right]^{-1/2} , \tag{19.22} $$
where $\epsilon = 1$ for massive particles. (For photons, we would set $\epsilon = 0$ in this equation.) Now we make a change of variables, to
$$ u = \frac{L^2}{G_N M} \frac{1}{r} = \frac{L^2}{\mu} \frac{1}{r} . \tag{19.23} $$
The radial equation for massive particles (planets, etc.) then turns into
$$ \left( \frac{du}{d\phi} \right)^2 + \frac{L^2}{\mu^2} - 2u + u^2 - \frac{2\mu^2}{L^2} u^3 = \frac{2 \mathcal{E} L^2}{\mu^2} . \tag{19.24} $$
On the face of it, this does not look any simpler than before. The neat trick is to realize that differentiating this again yields a simpler second order equation! Straightforward but unilluminating algebra yields
$$ \frac{d^2 u}{d\phi^2} + u = 1 + \frac{3\mu^2}{L^2} u^2 . \tag{19.25} $$
This equation is the full unadulterated GR result, and involves no approximations. The second term on the RHS of this equation would be absent in the Newtonian computation. For the Newtonian case, you can check that the solution to the shape equation is
$$ u_0 = 1 + e \cos\phi . \tag{19.26} $$
Treating this as the zeroth order approximation to the GR result, we can substitute back
$$ u(\phi) \simeq u_0(\phi) + u_1(\phi) + \ldots \tag{19.27} $$
into eq.(19.25) and obtain a perturbative equation for first order corrections $u_1$. This gives
$$ \frac{d^2 u_1}{d\phi^2} + u_1 \simeq \frac{3\mu^2}{L^2} u_0^2 . \tag{19.28} $$
As you should expect for an inherently nonlinear theory like GR, perturbation theory here is nonlinear. Substituting in the specific form of $u_0$ gives
$$ \frac{d^2 u_1}{d\phi^2} + u_1 \simeq \frac{3\mu^2}{L^2} \left[ \left( 1 + \frac{e^2}{2} \right) + 2e \cos\phi + \frac{e^2}{2} \cos 2\phi \right] . \tag{19.29} $$
You can check by explicitly differentiating that the solution to this is
$$ u_1 \simeq \frac{3\mu^2}{L^2} \left[ \left( 1 + \frac{e^2}{2} \right) + e\, \phi \sin\phi - \frac{e^2}{6} \cos 2\phi \right] . \tag{19.30} $$
Notice that the first term here is a constant displacement and that the third term is oscillatory about zero. The second term that gives rise to a cumulative effect per orbit is the most physically important one.

Figure credit: Mpfiz - Own work, Public Domain.

From here on we just focus on that key second cumulative term on top of the zeroth order Newtonian contribution. We have
$$ u_{\rm key} = 1 + e \cos\phi + \frac{3\mu^2}{L^2}\, e\, \phi \sin\phi . \tag{19.31} $$
This can be rewritten as
$$ u_{\rm key} = 1 + e \cos\left[ (1 - \alpha) \phi \right] , \tag{19.32} $$
where
$$ \alpha = \frac{3\mu^2}{L^2} , \tag{19.33} $$
as you can see by doing a Taylor expansion to first order in small quantities,
$$ \cos[(1 - \alpha)\phi] \simeq \cos\phi + \alpha \phi \sin\phi + O(\alpha^2) . \tag{19.34} $$
Then the precession per orbit is
$$ \Delta\phi \simeq 2\pi\alpha = \frac{6\pi G_N^2 M^2}{L^2} . \tag{19.35} $$
In order to massage this expression a little further, we need to relate $L^2$ to physical quantities we know. For the Newtonian (uncorrected) ellipse, the EOM show that the semimajor axis is
$$ a = \frac{L^2}{\mu (1 - e^2)} , \tag{19.36} $$
so that
$$ \Delta\phi = \frac{6\pi G_N M}{c^2 a (1 - e^2)} . \tag{19.37} $$
The first experimental test of this was with Mercury. The gravitational radius of the Sun, $\mu = G_N M/c^2$, is about 1.48 km; for Mercury, the eccentricity is about $e = 0.2056$ and the semimajor axis is about $a = 5.79 \times 10^{10}$ m. This results in a perihelion advance of about $5 \times 10^{-7}$ radians per orbit, or about 43 seconds of arc per century. Note that the observed value is actually considerably greater, but most of it comes from two prosaic places: (a) precession of the equinoxes in our geocentric coordinate system, and (b) other planets perturbing Mercury’s orbit. The residual amount of 43 seconds of arc per century is perfectly described by GR, to within experimental errors. This was not settled definitively in the experimental realm until the 1960s. For Earth, our perihelion precession is less, only about 4 seconds of arc per century. Mercury is affected most because it is closest to the Sun.
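The Mercury numbers quoted above are easy to reproduce from eq. (19.37). In the sketch below, $\mu$, $a$ and $e$ are the values given in the text, while Mercury's orbital period of 87.97 days is an assumed input that the notes do not state; the function name is illustrative.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def precession_per_orbit(mu, a, e):
    """Perihelion advance per orbit in radians, eq. (19.37), with mu = G_N M / c^2."""
    return 6.0 * math.pi * mu / (a * (1.0 - e * e))

MU_SUN = 1.48e3           # m, gravitational radius of the Sun (from the text)
A_MERCURY = 5.79e10       # m, semimajor axis (from the text)
E_MERCURY = 0.2056        # eccentricity (from the text)
PERIOD_DAYS = 87.97       # assumed orbital period of Mercury

dphi = precession_per_orbit(MU_SUN, A_MERCURY, E_MERCURY)
orbits_per_century = 36525.0 / PERIOD_DAYS
arcsec_per_century = dphi * orbits_per_century * ARCSEC_PER_RAD
print(dphi, arcsec_per_century)
```

The per-orbit angle is tiny, but Mercury completes a little over four hundred orbits per century, which is what turns it into a measurable 43 seconds of arc.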

20 R26Nov

20.1 Bending of light

Now let us focus on the bending of light. To start with, let us remind ourselves first of the Newtonian result. Most people think that because two photons with zero mass should feel zero Newtonian force between them, that implies that photons do not feel gravity. This is incorrect. Newton imagined light as corpuscular, and it feels gravity like any other corpuscle. The gravitational acceleration of a test mass does not depend on the mass.

In Newtonian mechanics, particles in unbound orbits move on hyperbolae rather than ellipses. The incoming path asymptotes in the infinite past to one of the separatrices, and the outgoing path asymptotes in the infinite future to the other separatrix. In principle, the particle could come as close as the radius of the stellar object as it slingshots around the star.

We can estimate the size of the effect just using dimensional analysis. The variables in the problem are: $G_N$ and $c$ (theory constants), $M$ (a solution parameter), and $b$, the radius of closest approach. Since the deflection angle we are looking for is dimensionless, we estimate that
$$ \theta \sim \frac{G_N M}{c^2 b} . \tag{20.1} $$
In principle, $\theta$ could have been any function of the dimensionless RHS. We have chosen a linear functional dependence on purpose, because we expect zero deflection angle when there is no star and because we expect a small effect overall.

We can get more precise and confirm the linear dependence by asking about the gravitational force felt by a corpuscle. Suppose that far from the star it starts out moving along the $x$-axis, and that the star is located along the negative $y$-axis. To first order in small quantities, $p_x$ is unaffected by the gravitational deflection, and the corpuscle develops a small $p_y$ by gravitational attraction. The deflection angle $|\Delta\phi| = -(p_y/p_x)_{\rm final}$ is, to first order in small quantities,
$$
\begin{aligned}
|\Delta\phi| &= -\frac{1}{p_x} \int_{-\infty}^{\infty} dx\, \frac{dp_y}{dx} \\
&= -\frac{1}{p_x c} \int_{-\infty}^{\infty} dx\, \frac{dp_y}{dt} \\
&= -\frac{1}{p_x c} \int_{-\infty}^{\infty} dx\, \frac{G_N M m\, y}{(x^2 + y^2) \sqrt{x^2 + y^2}} \\
&= \frac{2 G_N M}{c^2 b} ,
\end{aligned} \tag{20.2}
$$
where $b$ is the impact parameter. Note that the factor of $m$ for the corpuscle cancelled out: the $m$ in the numerator arising from the gravitational force killed the $m$ in the momentum denominator $p_x = mc$. Overall, we see that the Newtonian angle for deflection of light is small but nonzero.

To analyze the answer in General Relativity, our starting point is again the geodesic equations. For photons executing equatorial motion ($\theta = \pi/2$), we had two Killing vectors giving rise to two conserved quantities and also the tangent vector norm condition,
$$ \left( 1 - \frac{2\mu}{r} \right) \dot{t} = E , \tag{20.3} $$
$$ r^2 \dot{\phi} = L , \tag{20.4} $$
$$ \left( 1 - \frac{2\mu}{r} \right) \dot{t}^2 - \left( 1 - \frac{2\mu}{r} \right)^{-1} \dot{r}^2 - r^2 \dot{\phi}^2 = 0 . \tag{20.5} $$

Substituting in the conserved quantities gives for the radial equation
$$ \dot{r}^2 + \frac{L^2}{r^2} \left( 1 - \frac{2\mu}{r} \right) = E^2 . \tag{20.6} $$
From the above, we can find the GR shape equation for photons moving in a Schwarzschild geometry,
$$ \frac{d\phi}{dr} = \frac{1}{r^2} \left[ \frac{1}{b^2} - \frac{1}{r^2} \left( 1 - \frac{2\mu}{r} \right) \right]^{-1/2} . \tag{20.7} $$
Substituting this time
$$ \tilde{u} = \frac{1}{r} \tag{20.8} $$
into the shape equation, and massaging the algebra a bit further, gives
$$ \frac{d^2 \tilde{u}}{d\phi^2} + \tilde{u} = 3\mu\, \tilde{u}^2 . \tag{20.9} $$
When there is no matter, the RHS of the above shape equation is zero. In that case, the solution is
$$ \tilde{u}(\phi) = \tilde{u}_0(\phi) = \frac{1}{b} \sin\phi , \tag{20.10} $$
where $b$ is the impact parameter. Note how this is different from the timelike trajectories we studied in the previous lecture, which executed elliptical trajectories in the Newtonian limit rather than hyperbolic ones. Here, let us also work perturbatively, writing
$$ \tilde{u}(\phi) \simeq \frac{1}{b} \sin\phi + \tilde{u}_1(\phi) . \tag{20.11} $$
Substituting to find the equation of motion for the perturbation gives
$$ \frac{d^2 \tilde{u}_1(\phi)}{d\phi^2} + \tilde{u}_1(\phi) \simeq \frac{3\mu}{b^2} \sin^2\phi . \tag{20.12} $$
As you can check explicitly, this is solved by
$$ \tilde{u}_1(\phi) \simeq \frac{3\mu}{2b^2} \left( 1 + \frac{1}{3} \cos 2\phi \right) , \tag{20.13} $$
so that
$$ \tilde{u}(\phi) \simeq \frac{1}{b} \sin\phi + \frac{3\mu}{2b^2} \left( 1 + \frac{1}{3} \cos 2\phi \right) . \tag{20.14} $$
This is the equation describing the trajectory of the photon in GR, to first order in perturbations about the Newtonian result.

So let us ask the question: what does the angle tend to as we go very far away from the gravitating body? This amounts to taking $r \to \infty$, which corresponds in our variables to $\tilde{u} \to 0$. In other words, we need to look for solutions of $\tilde{u}(\phi) = 0$. For slight deflections, $\sin\phi \simeq \phi$ and $\cos 2\phi \simeq 1$. Solving for the angle gives a slightly negative answer, corresponding to one of the separatrices of the hyperbola,
$$ \phi \simeq -\frac{2 G_N M}{c^2 b} . \tag{20.15} $$
We are not quite finished. As indicated in the figure, the GR deflection angle for photons is twice the above result,
$$ |\Delta\phi_{\rm GR}| \simeq \frac{4 G_N M}{c^2 b} . \tag{20.16} $$
Notice that this is also twice the Newtonian result for the bending of light. For a grazing deflection by our Sun, it is about 1.75 seconds of arc.

What if we cannot apply a perturbation analysis because the deflection angle is large? Then we would need to use the full GR geodesic equations for photons without any approximations. In that case, integrating the shape equation (20.7) over the incoming and outgoing halves of the trajectory and subtracting the angle $\pi$ swept out by an undeflected straight line, we find
$$ |\Delta\phi_{\rm GR}| = 2 \int_{r_0}^{\infty} \frac{dr}{r^2} \left[ \frac{1}{b^2} - \frac{1}{r^2} \left( 1 - \frac{2\mu}{r} \right) \right]^{-1/2} - \pi , \tag{20.17} $$
where $r_0$ is the point of closest approach. At $r_0$, the bracketed expression $[\ldots]$ in the integrand vanishes.

Historical note: Eddington’s eclipse expedition to measure bending of light while the Sun was blocked by the Moon was accepted in 1919 and made Einstein a rock star, despite poorly understood systematic errors, because it appealed to Western Europeans in the post WWI climate of wanting peace between nations that had been at war.
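To put numbers on the light-bending results of this section, here is a short Python sketch. It finds the root of the first-order trajectory (20.14) by bisection, doubles it as in (20.16), and converts the grazing-Sun value to seconds of arc. The solar radius is an assumed round value and the function names are illustrative; $\mu \approx 1.48$ km for the Sun is the figure quoted earlier in the notes.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def u_tilde_times_b(phi, x):
    """b * u_tilde(phi) from eq. (20.14), with x = mu / b."""
    return math.sin(phi) + 1.5 * x * (1.0 + math.cos(2.0 * phi) / 3.0)

def asymptote_angle(x, lo=-1.0e-3, hi=0.0, steps=100):
    """Bisection for the small negative root of u_tilde(phi) = 0 (valid for x << 1e-3)."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if u_tilde_times_b(mid, x) > 0.0:
            hi = mid      # root lies at more negative phi
        else:
            lo = mid
    return 0.5 * (lo + hi)

MU_SUN = 1.48e3   # m
R_SUN = 6.96e8    # m, assumed solar radius, used as the impact parameter b

x = MU_SUN / R_SUN
deflection_numeric = 2.0 * abs(asymptote_angle(x))   # both asymptotes, as in (20.16)
deflection_formula = 4.0 * x                         # 4 G_N M / (c^2 b)
deflection_newton = 2.0 * x                          # eq. (20.2)
print(deflection_formula * ARCSEC_PER_RAD, deflection_newton * ARCSEC_PER_RAD)
```

The bisection agrees with $4\mu/b$ to far better than the accuracy of the perturbative expansion, and the Newtonian value is exactly half of the GR one.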

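20.2 Radar echoes

One other important test of GR is measuring radar echoes in the solar system, which is about the interplay between distance and time. To analyze this, we need two ingredients. First, one of our geodesic equations for photons from earlier that took the form
$$ \dot{r}^2 + \frac{L^2}{r^2} \left( 1 - \frac{2\mu}{r} \right) = E^2 . \tag{20.18} $$
We also had the energy equation
$$ \left( 1 - \frac{2\mu}{r} \right) \dot{t} = E , \tag{20.19} $$
which is the second ingredient. These can be combined to find the $t$-$r$ shape equation. Using
$$ \left( \frac{dr}{d\lambda} \right)^2 = \left( \frac{dr}{dt} \frac{dt}{d\lambda} \right)^2 = \left( \frac{dr}{dt} \right)^2 E^2 \left( 1 - \frac{2\mu}{r} \right)^{-2} , \tag{20.20} $$
we have that
$$ \left( 1 - \frac{2\mu}{r} \right)^{-3} \left( \frac{dr}{dt} \right)^2 + \frac{(L/E)^2}{r^2} = \left( 1 - \frac{2\mu}{r} \right)^{-1} . \tag{20.21} $$
At the distance of closest approach, which we will call $R$, we have
$$ \left. \left( \frac{dr}{dt} \right)^2 \right|_{r=R} = 0 , \tag{20.22} $$
so that at that point
$$ \frac{(L/E)^2}{R^2} = \left( 1 - \frac{2\mu}{R} \right)^{-1} . \tag{20.23} $$
Then, after a bit of algebra, the expression
$$ \left( \frac{dr}{dt} \right)^2 = \left( 1 - \frac{2\mu}{r} \right)^2 - \frac{(L/E)^2}{r^2} \left( 1 - \frac{2\mu}{r} \right)^3 \tag{20.24} $$
can be massaged into the form
$$ \frac{dr}{dt} = \left( 1 - \frac{2\mu}{r} \right) \left[ 1 - \frac{R^2 (1 - 2\mu/r)}{r^2 (1 - 2\mu/R)} \right]^{1/2} . \tag{20.25} $$
We can integrate this to get the time taken to travel from radial position $R$ to $r$. It helps to begin by expanding the integrand to first order in $\mu/r$. After some algebra, we get
$$ t(r, R) \simeq \int_R^r dr\, \frac{r}{\sqrt{r^2 - R^2}} \left[ 1 + \frac{2\mu}{r} + \frac{\mu R}{r (r + R)} + \ldots \right] . \tag{20.26} $$
Then we integrate, to obtain
$$ t(r, R) \simeq \sqrt{r^2 - R^2} + 2\mu \ln\left( \frac{r + \sqrt{r^2 - R^2}}{R} \right) + \mu \sqrt{ \frac{r - R}{r + R} } + \ldots . \tag{20.27} $$
The first term on the RHS here is just what we would have got if we had drawn a straight line. So the second and third terms are quantifying the bending of photon trajectories.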

Now, suppose that we bounced a radar beam out to Venus and back, grazing the Sun. Then we would have twice the sum of the second and third terms above (twice for there and back), evaluated for both the Earth leg and the Venus leg. Using the approximation that the closest approach distance is much less than the distance of either Earth or Venus from the Sun ($r_E \gg R$, $r_V \gg R$), each leg contributes $2\mu \ln(2r/R) + \mu$, which gives
$$ \Delta t \simeq \frac{4 G_N M}{c^3} \left[ \ln\left( \frac{4 r_E r_V}{R^2} \right) + 1 \right] . \tag{20.28} $$
Note that if we wanted to take into account the gravitational redshift of Earth, this is an order $\mu_E/r_E$ correction to what we have already calculated and therefore negligible.

Experimentally, when Venus is on the opposite side of the Sun to the Earth, the numerical value of the time delay for a grazing passing of the Sun is about 220 µs, if you convert time back from metres to seconds. HEL goes into more detail about the experimental nuances in §10.3. One has to correct for the motion of Venus and Earth in their orbits, their individual gravitational fields, the variance of reflecting surfaces on Venus, and refraction by the Solar corona. After all the experimental dust settles, you get the data agreeing in a pretty way with the GR prediction.
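For a rough numerical feel, the sketch below evaluates the non-straight-line terms of eq. (20.27) for an Earth to Venus round trip grazing the Sun. It also evaluates the large-distance form obtained by setting $\sqrt{r^2 - R^2} \simeq r$ on each leg, namely $4\mu[\ln(4 r_E r_V/R^2) + 1]/c$. The orbital radii, solar radius and speed of light are assumed round values that the notes do not give, and the function names are illustrative.

```python
import math

C = 2.998e8               # m/s
MU_SUN = 1.48e3           # m, gravitational radius of the Sun
R_GRAZE = 6.96e8          # m, assumed closest approach (solar radius)
R_EARTH_ORBIT = 1.496e11  # m, assumed
R_VENUS_ORBIT = 1.082e11  # m, assumed

def excess_path(mu, r, R):
    """Second and third terms of eq. (20.27), in metres of light-travel time."""
    root = math.sqrt(r * r - R * R)
    return 2.0 * mu * math.log((r + root) / R) + mu * math.sqrt((r - R) / (r + R))

def round_trip_delay(mu, r1, r2, R):
    """Excess delay in seconds: two legs out, the same two legs back."""
    return 2.0 * (excess_path(mu, r1, R) + excess_path(mu, r2, R)) / C

def round_trip_delay_far(mu, r1, r2, R):
    """Same quantity in the limit r1, r2 >> R."""
    return 4.0 * mu * (math.log(4.0 * r1 * r2 / (R * R)) + 1.0) / C

delay = round_trip_delay(MU_SUN, R_EARTH_ORBIT, R_VENUS_ORBIT, R_GRAZE)
delay_far = round_trip_delay_far(MU_SUN, R_EARTH_ORBIT, R_VENUS_ORBIT, R_GRAZE)
print(delay, delay_far)
```

With these rounded inputs both expressions come out at a couple of hundred microseconds, the same order as the figure quoted above; the precise number depends on the orbital radii assumed and on which sub-leading terms are kept.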

21 M30Nov

21.1 Geodesic precession of gyroscopes

Precession of gyroscopes is another experimental test of General Relativity. Gyros are interesting because they spin on an axis, and this spin vector $s^\mu$ feels the effects of General Relativity through the physics of parallel transport. Let us see how this works.

The geodesic is a physically special curve because it parallel transports its own tangent vector,
$$ \frac{d u^\mu}{d\lambda} + \Gamma^\mu{}_{\nu\sigma}\, u^\nu u^\sigma = 0 . \tag{21.1} $$
Physically, the spin must be orthogonal to the tangent vector,
$$ g_{\mu\nu}\, s^\mu u^\nu = 0 . \tag{21.2} $$
In other words, the spin cannot have a timelike component in the instantaneous rest frame of the test object. If we want this zero inner product to be conserved at all points along the worldline of the gyro, we need to insist that the spin vector $s^\mu$ be parallel transported,
$$ \frac{d s^\mu}{d\lambda} + \Gamma^\mu{}_{\nu\sigma}\, s^\nu u^\sigma = 0 . \tag{21.3} $$
To demonstrate the effect we are after, it is sufficient to use the approximation that Earth’s gravitational field (in which Gravity Probe B, or GPB, flew) is described by the Schwarzschild metric. This will simplify our computations because there are fewer Christoffel symbols for Schwarzschild than for Kerr.

Imagine that our test gyroscope is orbiting Earth in a circle, in the equatorial plane of our spherical polar coordinate system. Circular motion occurs at fixed $(r, \theta)$, so that $u^1(\lambda) = 0$ and $u^2(\lambda) = 0\ \forall\ \lambda$. Because $\theta = \pi/2$, $\Gamma^\theta{}_{\varphi\varphi}$ and $\Gamma^\varphi{}_{\theta\varphi}$ are zero and $\Gamma^r{}_{\varphi\varphi} = \Gamma^r{}_{\theta\theta}$. So our spin parallel transport equations in $(t, r, \theta, \varphi)$ coordinates become

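$$ \frac{d s^t}{d\lambda} + \Gamma^t{}_{rt}\, s^r u^t = 0 , \tag{21.4} $$
$$ \frac{d s^r}{d\lambda} + \Gamma^r{}_{tt}\, s^t u^t + \Gamma^r{}_{\varphi\varphi}\, s^\varphi u^\varphi = 0 , \tag{21.5} $$
$$ \frac{d s^\theta}{d\lambda} = 0 , \tag{21.6} $$
$$ \frac{d s^\varphi}{d\lambda} + \Gamma^\varphi{}_{r\varphi}\, s^r u^\varphi = 0 , \tag{21.7} $$
where
$$ \Gamma^t{}_{rt} = \frac{\mu}{r^2} \left( 1 - \frac{2\mu}{r} \right)^{-1} , \quad \Gamma^r{}_{tt} = \frac{\mu}{r^2} \left( 1 - \frac{2\mu}{r} \right) , \quad \Gamma^r{}_{\varphi\varphi} = -r \left( 1 - \frac{2\mu}{r} \right) , \quad \Gamma^\varphi{}_{r\varphi} = \frac{1}{r} . \tag{21.8} $$
To proceed further, we need to know something about the normalization of the velocity vector. We can write it as $[u^\mu] = u^t [1, 0, 0, \Omega]^T$, where $\Omega$ is our angular velocity for circular motion.

What is the angular velocity for our case? We actually mentioned the key ingredients already, in passing, when we discussed massless and massive particle geodesics in the Schwarzschild spacetime. In particular, we derived the shape equation for (quasi-)elliptical orbits. Circular orbits are a special case, and the shape equation can easily be rearranged to find $L$. We obtain
$$ L^2 = \frac{\mu R^2}{R - 3\mu} , \tag{21.9} $$
where $R$ is the radius of the circular orbit. Then using the norm condition on the velocity vector gives
$$ E = \frac{1 - 2\mu/R}{\sqrt{1 - 3\mu/R}} . \tag{21.10} $$
We can also find the angular velocity, by using the geodesic equations to find $\varphi(t)$,
$$ \left( \frac{d\varphi}{dt} \right)^2 = \left[ \frac{d\varphi}{d\lambda} \left( \frac{dt}{d\lambda} \right)^{-1} \right]^2 . \tag{21.11} $$
After the dust settles, this gives the very simple expression
$$ \Omega^2 = \frac{\mu}{r^3} . \tag{21.12} $$
The norm of the 4-velocity must be unity, as appropriate to a massive particle (our gyroscope). This gives the equation
$$ u^0 = \left( 1 - \frac{2\mu}{r} - r^2 \Omega^2 \right)^{-1/2} = \left( 1 - \frac{3\mu}{r} \right)^{-1/2} . \tag{21.13} $$
In this system, we have $u^r = 0 = u^\theta$, and so the condition that the spin vector be orthogonal to the velocity vector becomes
$$ \left( 1 - \frac{2\mu}{r} \right) s^t u^t - r^2 s^\varphi u^\varphi = 0 . \tag{21.14} $$
Since $u^\varphi/u^t = d\varphi/dt = \Omega$, we can express $s^t$ in terms of $s^\varphi$,
$$ s^t = \frac{\Omega r^2}{1 - 2\mu/r}\, s^\varphi . \tag{21.15} $$
As you can check for yourself, this means that the first and fourth of the parallel transport equations are equivalent. Then the remaining equations are
$$ \frac{d s^r}{d\lambda} - \frac{r \Omega}{u^t}\, s^\varphi = 0 , \qquad \frac{d s^\theta}{d\lambda} = 0 , \qquad \frac{d s^\varphi}{d\lambda} + \frac{u^t \Omega}{r}\, s^r = 0 . \tag{21.16} $$
We can convert the experimentally relatively unfamiliar affine parameter $\lambda$ to the coordinate time $t$ using $u^t = dt/d\lambda$. Using the third equation to eliminate $s^\varphi$ from the first gives for the set of three
$$ \frac{d^2 s^r}{dt^2} + \frac{\Omega^2}{(u^t)^2}\, s^r = 0 , \qquad \frac{d s^\theta}{dt} = 0 , \qquad \frac{d s^\varphi}{dt} + \frac{\Omega}{r}\, s^r = 0 . \tag{21.17} $$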

This has solution
$$ s^r(t) = s^r(0) \cos\Omega' t , \qquad s^\theta(t) = 0 , \qquad s^\varphi(t) = -\frac{\Omega}{r \Omega'}\, s^r(0) \sin\Omega' t , \tag{21.18} $$
where
$$ \Omega' = \frac{\Omega}{u^t} = \Omega \sqrt{1 - 3\mu/r} . \tag{21.19} $$
Therefore, the spatial part of the spin vector is rotating relative to the radial direction $\hat{r}$ with a coordinate angular speed $-\Omega'$, i.e. in the direction $-\hat{\varphi}$. But the radial direction itself is rotating with coordinate angular speed $+\Omega$. So it is the difference in speeds which gives rise to geodesic precession.

If you revolve once in a coordinate time $t = 2\pi/\Omega$, the final direction of the spatial spin vector is $2\pi + \alpha$, where $\alpha = 2\pi (1 - \Omega'/\Omega)$. Per revolution, then, the angular precession is
$$ \alpha = 2\pi \left[ 1 - \sqrt{1 - \frac{3\mu}{r}} \right] . \tag{21.20} $$

This effect is not very big, but it is cumulative. That means if you can machine almost-perfect gyros and leave them in orbit for a veeeeeeeery long time, then you have a chance of these effects adding up and being measurable.

From the GPB website: “Gravity Probe B, launched 20 April 2004, is a space experiment testing two fundamental predictions of Einstein’s theory of General Relativity (GR), the geodetic and frame-dragging effects, by means of cryogenic gyroscopes in Earth orbit. Data collection started 28 August 2004 and ended 14 August 2005. Analysis of the data from all four gyroscopes results in a geodetic drift rate of −6,601.8 ± 18.3 mas/yr and a frame-dragging drift rate of −37.2 ± 7.2 mas/yr, to be compared with the GR predictions of −6,606.1 mas/yr and −39.2 mas/yr, respectively (‘mas’ is milliarc-second; 1 mas = 4.848 × 10⁻⁹ radians or 2.778 × 10⁻⁷ degrees).”
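Equation (21.20) can be turned into a drift rate for comparison with the geodetic number in the GPB quote. The sketch below is a rough estimate only: the orbital altitude of about 642 km, the mean Earth radius, and the values of $G M_\oplus$ and $c$ are assumptions not stated in these notes, the orbit is taken as exactly circular, and Earth's field is treated as pure Schwarzschild.

```python
import math

C = 2.998e8                        # m/s
GM_EARTH = 3.986e14                # m^3 / s^2, assumed
MU_EARTH = GM_EARTH / (C * C)      # gravitational radius of Earth, a few millimetres
R_ORBIT = 6.371e6 + 6.42e5         # m, assumed: mean Earth radius + ~642 km altitude
SECONDS_PER_YEAR = 365.25 * 86400.0
MAS_PER_RAD = 180.0 * 3600.0 * 1000.0 / math.pi

def precession_per_orbit(mu, r):
    """alpha of eq. (21.20), rewritten to avoid cancellation when mu / r is tiny."""
    x = 3.0 * mu / r
    # 1 - sqrt(1 - x) == x / (1 + sqrt(1 - x)), which is numerically stable for small x.
    return 2.0 * math.pi * x / (1.0 + math.sqrt(1.0 - x))

def orbital_period(gm, r):
    """Coordinate-time period 2*pi / Omega with Omega^2 = GM / r^3, eq. (21.12) in SI units."""
    return 2.0 * math.pi * math.sqrt(r ** 3 / gm)

alpha = precession_per_orbit(MU_EARTH, R_ORBIT)
orbits_per_year = SECONDS_PER_YEAR / orbital_period(GM_EARTH, R_ORBIT)
geodetic_mas_per_year = alpha * orbits_per_year * MAS_PER_RAD
print(alpha, geodetic_mas_per_year)
```

Each orbit contributes only a few nanoradians, but several thousand orbits per year accumulate to a few arcseconds, in the same range as the geodetic drift quoted by the GPB team.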

21.2 Accretion disks

Lastly, let us mention one more experimental test of GR: accretion disks around compact objects. They have matter swirling around the central black hole at millions of kelvin, and tend to emit strongly in the X-ray part of the spectrum. Even at such extreme temperatures, some atoms can retain electrons and then emit radiation as they jump between energy levels, and one such element is iron. Looking at the shape of the broadened iron emission line from the whole accretion disk actually gives a probe of the strong-field regime of GR, as we will now motivate.

There are two types of redshift that operate in this system: gravitational redshift, and Doppler shifting from relative velocity w.r.t. an observer here on Earth. Supposing that we view an accretion disk and black hole system side-on, we would see a range of Doppler shifting depending on which part of the disk we were looking at. This would even happen in the Newtonian approximation! The really key part is the gravitational redshift. The essential reason is that the smallest-possible frequency present in the observed spectrum must have been emitted at the smallest possible value of $r$, so that it could experience maximum redshift on the way out. Knowing the radius of the ISCO, we can then get a handle on the biggest frequency ratio possible.

The ratio of the photon frequency at reception compared to that at emission is given by
$$ \frac{\nu_R}{\nu_E} = \frac{p_\mu(R)\, u_R^\mu}{p_\mu(E)\, u_E^\mu} . \tag{21.21} $$
Using what we derived in the previous experiment’s discussion concerning the angular velocity and the tangent vector norm condition, you can show with straightforward algebra that
$$ \frac{\nu_R}{\nu_E} = \frac{p_0(R)}{p_0(E)\, u_E^0 + p_3(E)\, u_E^3} = \frac{p_0(R)}{p_0(E)} \left( 1 - \frac{3\mu}{r} \right)^{1/2} \left[ 1 \pm \Omega\, \frac{p_3(E)}{p_0(E)} \right]^{-1} , \tag{21.22} $$
where $+$ corresponds to emitting matter on the side of the disk moving towards the observer and $-$ corresponds to matter on the other side. Now, because Schwarzschild is a stationary metric, the downstairs time component of the photon momentum is conserved along a geodesic, so $p_0(R) = p_0(E)$. Our last ingredient is to find the ratio $p_3(E)/p_0(E)$, and this is done using the null photon momentum norm condition. Working in the equatorial plane, we find
$$ \left( 1 - \frac{2\mu}{r} \right)^{-1} (p_0)^2 - \left( 1 - \frac{2\mu}{r} \right) (p_1)^2 - \frac{1}{r^2} (p_3)^2 = 0 . \tag{21.23} $$
To get any further for a general angle between the accretion disk and us, we would need to recruit the full photon geodesic equations. But in two special cases we can actually do a slick avoidance manoeuvre and finesse this issue! When the matter is transverse to the observer (or in a face-on disk), $\varphi = 0, \pi$. Then $p_3(E) = 0$, and so
$$ \frac{\nu_R}{\nu_E} = \sqrt{1 - \frac{3\mu}{r}} . \tag{21.24} $$
When matter moves either directly towards or away from the observer, $\varphi = \pm\pi/2$. Then the radial component of the photon momentum is zero, and so
$$ \frac{\nu_R}{\nu_E} = \frac{\sqrt{1 - 3\mu/r}}{1 \pm 1/\sqrt{r/\mu - 2}} . \tag{21.25} $$
Evaluating these at the Schwarzschild ISCO, $r = 6\mu$, you find that the smallest frequency represented in the iron emission line will be $\nu_R/\nu_E = \sqrt{2}/3 \simeq 0.47$ for edge-on disks and $1/\sqrt{2} \simeq 0.71$ for face-on disks.
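The two special cases (21.24) and (21.25) are simple enough to evaluate directly. The Python sketch below does so at the Schwarzschild ISCO, $r = 6\mu$; the function names are illustrative, and `sign` stands for the $\pm$ in the denominator of (21.25).

```python
import math

def ratio_transverse(r_over_mu):
    """nu_R / nu_E when p_3(E) = 0, eq. (21.24)."""
    return math.sqrt(1.0 - 3.0 / r_over_mu)

def ratio_line_of_sight(r_over_mu, sign):
    """nu_R / nu_E when the radial photon momentum vanishes, eq. (21.25).

    sign = +1 or -1 selects the sign in the denominator.
    """
    return math.sqrt(1.0 - 3.0 / r_over_mu) / (1.0 + sign / math.sqrt(r_over_mu - 2.0))

R_ISCO = 6.0   # Schwarzschild ISCO in units of mu

face_on = ratio_transverse(R_ISCO)
edge_on_min = ratio_line_of_sight(R_ISCO, +1)
edge_on_max = ratio_line_of_sight(R_ISCO, -1)
print(face_on, edge_on_min, edge_on_max)
```

At $r = 6\mu$ the transverse case gives $1/\sqrt{2}$, while the line-of-sight case gives $\sqrt{2}/3$ for one sign and $\sqrt{2}$ for the other, so the edge-on line profile is smeared between a strong redshift and a net blueshift.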

22 R03Dec

The following gravitational waves material is based on §17 and §18 of HEL. All figures shown are from HEL.

22.1 Finding the wave equation for metric perturbations

For this section we will assume that the cosmological constant is zero and that spacetime is approximately flat. We will figure out the equations obeyed by small (perturbative) ripples in the fabric of spacetime about the Minkowski metric, which are known as gravitational waves. To begin, we assume that

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} , \tag{22.1}$$

where $|h_{\mu\nu}| \ll 1$. To first order in small quantities, the inverse metric is

$$g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu} , \tag{22.2}$$

where we raise and lower indices to this order by using the Minkowski metric,

$$h^{\mu\nu} = \eta^{\mu\rho}\eta^{\nu\sigma} h_{\rho\sigma} . \tag{22.3}$$

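It is important to know how the perturbations are affected by changes of coordinates. Under global Lorentz transformations $x'^\mu = \Lambda^\mu{}_\nu x^\nu$, we know that

$$\begin{aligned}
g'_{\mu\nu} &= \frac{\partial x^\rho}{\partial x'^\mu}\frac{\partial x^\sigma}{\partial x'^\nu}\, g_{\rho\sigma} \\
&= \Lambda_\mu{}^\rho \Lambda_\nu{}^\sigma \left(\eta_{\rho\sigma} + h_{\rho\sigma}\right) \\
&= \eta_{\mu\nu} + \Lambda_\mu{}^\rho \Lambda_\nu{}^\sigma h_{\rho\sigma}
\end{aligned} \tag{22.4}$$

because the Minkowski metric is invariant under global Lorentz transformations. Therefore,

$$h'_{\mu\nu} = \Lambda_\mu{}^\rho \Lambda_\nu{}^\sigma h_{\rho\sigma} . \tag{22.5}$$

In other words, $h_{\mu\nu}$ transforms like a tensor under global Lorentz transformations.

We can also ask about how perturbations in spacetime are affected by a general coordinate transformation of the form

$$x'^\mu = x^\mu + \xi^\mu(x) . \tag{22.6}$$

Therefore,

$$\frac{\partial x'^\mu}{\partial x^\nu} = \delta^\mu_\nu + \partial_\nu \xi^\mu . \tag{22.7}$$

By eye, we can see using the above equation that to first order in small quantities the inverse transformation obeys

$$\frac{\partial x^\mu}{\partial x'^\nu} = \delta^\mu_\nu - \partial_\nu \xi^\mu . \tag{22.8}$$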

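Accordingly, under these general coordinate transformations, we have

$$\begin{aligned}
g'_{\mu\nu} &= \left(\delta_\mu^\rho - \partial_\mu \xi^\rho\right)\left(\delta_\nu^\sigma - \partial_\nu \xi^\sigma\right)\left(\eta_{\rho\sigma} + h_{\rho\sigma}\right) \\
&= \eta_{\mu\nu} + \left(h_{\mu\nu} - \partial_\mu \xi_\nu - \partial_\nu \xi_\mu\right)
\end{aligned} \tag{22.9}$$

where we defined $\xi_\mu = \eta_{\mu\nu}\xi^\nu$, and worked to first order in small quantities. Therefore, the transformation law of the perturbations under general coordinate transformations (22.6) is

$$h'_{\mu\nu} = h_{\mu\nu} - \partial_\mu \xi_\nu - \partial_\nu \xi_\mu . \tag{22.10}$$

What are the Christoffels to first order in perturbations? We did this type of approximation earlier when we first linearized a GR expression to recover its Newtonian limit. Here, we obtain

$$\Gamma^\sigma{}_{\mu\nu} = \frac{1}{2}\left(\partial_\nu h^\sigma{}_\mu + \partial_\mu h^\sigma{}_\nu - \partial^\sigma h_{\mu\nu}\right) . \tag{22.11}$$

From this, it follows that, to first order in perturbations,

$$R^\sigma{}_{\mu\nu\rho} = \frac{1}{2}\left(\partial_\nu \partial_\mu h^\sigma{}_\rho + \partial_\rho \partial^\sigma h_{\mu\nu} - \partial_\nu \partial^\sigma h_{\mu\rho} - \partial_\rho \partial_\mu h^\sigma{}_\nu\right) . \tag{22.12}$$

The really neat thing about this expression for the Riemann tensor is that it is invariant under general coordinate transformations (22.6). As you can check, this property also holds for the Ricci tensor and the Ricci scalar. For convenience, let us define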

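$$h = h^\sigma{}_\sigma \tag{22.13}$$

and write

$$\Box^2 = \partial^\mu \partial_\mu . \tag{22.14}$$

Then we obtain

$$R_{\mu\nu} = \frac{1}{2}\left(\partial_\mu \partial_\nu h + \Box^2 h_{\mu\nu} - \partial_\mu \partial_\rho h^\rho{}_\nu - \partial_\nu \partial_\rho h^\rho{}_\mu\right) , \tag{22.15}$$

and

$$R = \Box^2 h - \partial_\mu \partial_\nu h^{\mu\nu} . \tag{22.16}$$

Plugging the above expressions into the Einstein equations yields a second-order PDE for the perturbations. In order to aid in wrangling all the pertinent algebra, it is convenient to define the trace reverse of $h_{\mu\nu}$,

$$\bar{h}_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu} h . \tag{22.17}$$

This obeys the property that

$$\bar{\bar{h}}_{\mu\nu} = h_{\mu\nu} . \tag{22.18}$$

We also have (again to first order in perturbations)

$$\bar{h} = \eta^{\mu\nu}\bar{h}_{\mu\nu} , \tag{22.19}$$

which obeys $\bar{h} = (1 - D/2)\,h$. In $D = 4$, $\bar{h} = -h$. In terms of $\bar{h}_{\mu\nu}$, the Einstein equations become

$$\Box^2 \bar{h}_{\mu\nu} + \eta_{\mu\nu}\,\partial_\rho \partial_\sigma \bar{h}^{\rho\sigma} - \partial_\nu \partial_\rho \bar{h}^\rho{}_\mu - \partial_\mu \partial_\rho \bar{h}^\rho{}_\nu = -\frac{16\pi G_N}{c^4}\, T_{\mu\nu} . \tag{22.20}$$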

On the face of it, this equation does not look very much like a familiar wave equation involving the d'Alembertian!

In order to figure out what our Einstein equations for the perturbations imply physically, it is crucial that we understand how $\bar{h}_{\mu\nu}$ transforms under general coordinate transformations (22.6). First, recall our equation (22.10), $h'_{\mu\nu} = h_{\mu\nu} - \partial_\mu \xi_\nu - \partial_\nu \xi_\mu$. From this, it follows directly that

$$\begin{aligned}
h' &= \eta^{\mu\nu}\left(h_{\mu\nu} - \partial_\mu \xi_\nu - \partial_\nu \xi_\mu\right) \\
&= h - 2\,\partial_\mu \xi^\mu .
\end{aligned} \tag{22.21}$$

Therefore,

$$\begin{aligned}
\bar{h}'^{\mu\rho} &= h'^{\mu\rho} - \frac{1}{2}\eta^{\mu\rho} h' \\
&= \left(h^{\mu\rho} - \partial^\mu \xi^\rho - \partial^\rho \xi^\mu\right) - \frac{1}{2}\eta^{\mu\rho}\left(h - 2\,\partial_\sigma \xi^\sigma\right) \\
&= \bar{h}^{\mu\rho} - \partial^\mu \xi^\rho - \partial^\rho \xi^\mu + \eta^{\mu\rho}\,\partial_\sigma \xi^\sigma .
\end{aligned} \tag{22.22}$$

Taking the divergence of this expression gives

$$\partial_\rho \bar{h}'^{\mu\rho} = \partial_\rho \bar{h}^{\mu\rho} - \Box^2 \xi^\mu . \tag{22.23}$$

So far, this description of algebra manipulations might seem a tad dry. But this is where the real money is to be made in careful observation. Suppose that we are smart enough to choose a coordinate transformation $\xi^\mu$ satisfying $\Box^2 \xi^\mu = \partial_\rho \bar{h}^{\mu\rho}$. Then $\partial_\rho \bar{h}'^{\mu\rho} = 0$, which massively simplifies the Einstein equation. In particular, all the terms on the LHS which did not involve the d'Alembertian operator become equal to zero in this coordinate system. Wow! To summarize, let us drop the primes for clarity, raise the indices with $\eta$, and write our Einstein equation in this awesome new coordinate system,

$$\Box^2 \bar{h}^{\mu\nu} = -\frac{16\pi G_N}{c^4}\, T^{\mu\nu} . \tag{22.24}$$

In order for our metric perturbations to obey this simple wave equation, our coordinate system must obey

$$\partial_\mu \bar{h}^{\mu\nu} = 0 . \tag{22.25}$$

Any further coordinate change $x^\mu \to x^\mu + \xi^\mu$ within this gauge class would be OK, as long as it satisfied $\Box^2 \xi^\mu = 0$. This is very reminiscent of the Lorenz gauge in electromagnetism, $\partial_\mu A^\mu = 0$, which still allows further gauge transformations of the form $A^\mu \to A^\mu + \partial^\mu \lambda$, where $\Box^2 \lambda = 0$. Accordingly, this gauge for metric perturbations is sometimes rather loosely called the Lorenz gauge (often misspelled "Lorentz gauge"). More properly, it is called the de Donder gauge.
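Two of the algebraic claims in this section can be spot-checked numerically: the trace-reverse properties (22.18) and $\bar{h} = -h$ in $D = 4$, and the divergence shift (22.23). The sketch below is an illustration under stated assumptions, not part of the original derivation: it uses random test data, and it checks (22.23) only for a single Fourier mode $\xi^\mu = \epsilon^\mu \exp(i k_\sigma x^\sigma)$, for which $\partial_\mu \to i k_\mu$ and $-\Box^2 \xi^\mu \to k_\sigma k^\sigma\, \epsilon^\mu$. All function names are hypothetical.

```python
import random

ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the Minkowski metric, signature (+,-,-,-)

def trace(h):
    """eta^{mu nu} h_{mu nu} for a 4x4 array of lower-index components."""
    return sum(ETA[m] * h[m][m] for m in range(4))

def trace_reverse(h):
    """hbar_{mu nu} = h_{mu nu} - (1/2) eta_{mu nu} h, as in (22.17)."""
    t = trace(h)
    return [[h[m][n] - (0.5 * ETA[m] * t if m == n else 0.0) for n in range(4)]
            for m in range(4)]

def divergence_shift(k_up, eps_up):
    """Fourier amplitude of d_rho(hbar'^{mu rho} - hbar^{mu rho}) from (22.22)."""
    k_dot_eps = sum(ETA[s] * k_up[s] * eps_up[s] for s in range(4))
    # Amplitude of -d^mu xi^rho - d^rho xi^mu + eta^{mu rho} d_sigma xi^sigma with d -> i k.
    delta = [[-1j * (k_up[m] * eps_up[r] + k_up[r] * eps_up[m]
                     - (ETA[m] * k_dot_eps if m == r else 0.0))
              for r in range(4)] for m in range(4)]
    # Contract with i k_rho (index lowered with eta) to take the divergence.
    return [sum(1j * ETA[r] * k_up[r] * delta[m][r] for r in range(4)) for m in range(4)]

random.seed(0)
# Hypothetical test data: a small random symmetric perturbation and random k, eps.
h = [[0.0] * 4 for _ in range(4)]
for m in range(4):
    for n in range(m, 4):
        h[m][n] = h[n][m] = random.uniform(-1e-3, 1e-3)
hbar = trace_reverse(h)
k = [random.uniform(-1.0, 1.0) for _ in range(4)]
eps = [random.uniform(-1.0, 1.0) for _ in range(4)]
k_squared = sum(ETA[s] * k[s] ** 2 for s in range(4))
shift = divergence_shift(k, eps)

print(trace(hbar) + trace(h))                                  # should vanish in D = 4
print(max(abs(shift[m] - k_squared * eps[m]) for m in range(4)))  # should vanish
```

The second check makes the residual gauge freedom visible: the divergence is unchanged exactly when $k_\sigma k^\sigma = 0$, which is the Fourier-space statement of $\Box^2 \xi^\mu = 0$.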

22.2 Solving the linearized Einstein equations

As always, if we are trying to solve a wave equation, it helps to start by finding the Green's function,

$$\Box^2_x\, G(x^\sigma - y^\sigma) = \delta^{(4)}(x^\sigma - y^\sigma) . \tag{22.26}$$

As explained in detail in HEL §17.6, this is solved by the retarded Green's function

$$G(x^\sigma) = \frac{\delta(x^0 - |\vec{x}|)\,\theta(x^0)}{4\pi|\vec{x}|} , \tag{22.27}$$

as you can check by substituting it in. Note that the retarded Green's function (as compared to, say, the advanced Green's function) is required by causality: we cannot expect a point in spacetime to be influenced by sources in its future light cone, only those in its past light cone. Using the retarded Green's function, we can see immediately that the solution to the Einstein equation for the metric perturbation is

$$\bar{h}^{\mu\nu}(ct, \vec{x}) = -\frac{4G_N}{c^4} \int d^3\vec{y}\; \frac{T^{\mu\nu}(ct_r, \vec{y})}{|\vec{x} - \vec{y}|} . \tag{22.28}$$

HEL Fig. 17.2 is very helpful for visualizing the meaning of the retarded time variable $t_r$ in this equation, which is defined by

$$ct_r = ct - |\vec{x} - \vec{y}| . \tag{22.29}$$

Plodding through the details of how to check whether this satisfies the de Donder gauge condition requires careful attention to the retarded time story, using the chain rule for derivatives, and integration by parts. The result is

$$\frac{\partial \bar{h}^{\mu\nu}}{\partial x^\mu} = -\frac{4G_N}{c^4} \int d^3\vec{y}\; \frac{1}{|\vec{x} - \vec{y}|}\, \frac{\partial T^{\mu\nu}(y^0, \vec{y})}{\partial y^\mu} . \tag{22.30}$$

But since the energy-momentum tensor is conserved in the linearized theory,

$$\partial_\mu T^{\mu\nu} = 0 , \tag{22.31}$$

we have what we need:

$$\partial_\mu \bar{h}^{\mu\nu} = 0 . \tag{22.32}$$

A very important idea from electromagnetism was the multipole expansion. Here, it is the conserved energy-momentum that sources our gravitational wave, rather than the conserved current sourcing the EM wave, but the principle is analogous. In an asymptotically flat spacetime, higher partial waves fall off with higher powers of distance, so the lowest pertinent multipole moment for a compact source dominates the physics of wave propagation far from the source. As you learned in 3rd year EM class, in order to generate EM waves, a time-dependent dipole moment is needed. In order to generate gravitational waves, it turns out that we will need a time-dependent quadrupole moment.

To start our way towards that result, let us Taylor expand the denominator in the integral for $\bar{h}^{\mu\nu}$, with $|\vec{x}| = r$ and small $\vec{y}$,

$$\begin{aligned}
\frac{1}{|\vec{x} - \vec{y}|} &\simeq \frac{1}{r} + (-y^i)\,\partial_i\!\left(\frac{1}{r}\right) + \frac{1}{2!}(-y^i)(-y^j)\,\partial_i \partial_j\!\left(\frac{1}{r}\right) + \ldots \\
&= \frac{1}{r} + y^i\,\frac{x_i}{r^3} + y^i y^j\,\frac{3x_i x_j - \delta_{ij} r^2}{2r^5} + \ldots
\end{aligned} \tag{22.33}$$

Motivated by this, we define the multipoles

$$M^{\mu\nu\sigma_1\sigma_2\ldots\sigma_\ell}(ct_r) = \int d^3\vec{y}\; T^{\mu\nu}(ct_r, \vec{y})\, y^{\sigma_1} y^{\sigma_2} \ldots y^{\sigma_\ell} , \tag{22.34}$$

and obtain

$$\bar{h}^{\mu\nu}(ct, \vec{x}) = -\frac{4G_N}{c^4} \sum_{\ell=0}^{\infty} \frac{(-1)^\ell}{\ell!}\, M^{\mu\nu\sigma_1\sigma_2\ldots\sigma_\ell}(ct_r)\, \partial_{\sigma_1}\partial_{\sigma_2}\ldots\partial_{\sigma_\ell}\!\left(\frac{1}{r}\right) . \tag{22.35}$$
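The Taylor expansion (22.33) can be checked numerically by comparing the exact $1/|\vec{x} - \vec{y}|$ with the sum of monopole, dipole, and quadrupole terms for a field point far from the source. This is a minimal sketch with illustrative, made-up coordinates; the function names are hypothetical.

```python
import math

def exact_inverse_distance(x, y):
    """1/|x - y| for 3-vectors x and y."""
    return 1.0 / math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def multipole_inverse_distance(x, y):
    """Monopole + dipole + quadrupole terms of (22.33), with Euclidean spatial indices."""
    r = math.sqrt(sum(c * c for c in x))
    x_dot_y = sum(a * b for a, b in zip(x, y))
    y_squared = sum(c * c for c in y)
    monopole = 1.0 / r
    dipole = x_dot_y / r ** 3
    quadrupole = (3.0 * x_dot_y ** 2 - y_squared * r ** 2) / (2.0 * r ** 5)
    return monopole + dipole + quadrupole

# Illustrative values: field point at r = 10, source point with |y| of about 0.24.
x = (10.0, 0.0, 0.0)
y = (0.1, 0.2, -0.1)
error = exact_inverse_distance(x, y) - multipole_inverse_distance(x, y)
print(error)  # expected to be of octupole order, roughly |y|^3 / r^4
```

The residual scales like $|\vec{y}|^3/r^4$, which is the statement that each successive multipole is suppressed by another power of (source size)/(distance).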

For the case of a compact source, we can use these general expressions to find approximations for our linearized metric perturbations. First we need to consider what the components of $T^{\mu\nu}$ tell us physically. $T^{00}$ is the energy density of the source particles, and if this is integrated over all space then it gives $Mc^2$, the conserved energy. $T^{0i}$ is the momentum density of source particles, and if this is integrated over all space it gives $P^i c$, which is also conserved at this order in perturbations. The $T^{ij}$ are the internal stresses, and they are not necessarily zero when integrated over all space. Without loss of generality, we may take our spatial coordinates $x^i$ to be in the centre of momentum (CoM) frame of the particles, so that $P^i = 0$. Then in CoM coordinates,

$$\bar{h}^{00} = -\frac{4G_N M}{c^2 r} , \qquad \bar{h}^{0i} = \bar{h}^{i0} = 0 . \tag{22.36}$$

The remaining parts are

$$\bar{h}^{ij}(ct, \vec{x}) = -\frac{4G_N}{c^4 r} \left[\int d^3\vec{y}\; T^{ij}(ct', \vec{y})\right]_{ct' = ct - r} \tag{22.37}$$

It is not especially easy to compute this integral directly. HEL explain carefully in §17.8 that a slightly indirect yet algebraically shorter route can be found by recruiting energy-momentum conservation $\partial_\mu T^{\nu\mu} = 0$. In a 3+1 split, we have

$$\begin{aligned}
0 &= \partial_0 T^{00} + \partial_k T^{0k} \\
0 &= \partial_0 T^{i0} + \partial_k T^{ik} .
\end{aligned} \tag{22.38}$$

These two equations can be used to turn our integral over $T^{ij}$ into integrals over higher moments of $T^{0i}$ and $T^{00}$. The first trick is to consider the integral of $\partial_k(T^{ik} y^j)$ over a volume completely enclosing the source and to use Gauss's theorem. The second conservation equation in (22.38) then yields

$$\int d^3\vec{y}\; T^{ij} = \frac{1}{2c}\frac{d}{dt_r} \int d^3\vec{y}\, \left(T^{i0} y^j + T^{j0} y^i\right) . \tag{22.39}$$

The second trick is to consider the integral of $\partial_k(T^{0k} y^i y^j)$ over the same enclosing volume; combined with the first conservation equation in (22.38), it yields

$$\int d^3\vec{y}\; T^{ij} = \frac{1}{2c^2}\frac{d^2}{dt_r^2} \int d^3\vec{y}\; T^{00}\, y^i y^j . \tag{22.40}$$

Defining the quadrupole moment $I^{ij}$ by

$$I^{ij}(ct) = \int d^3\vec{y}\; T^{00}(ct, \vec{y})\, y^i y^j \tag{22.41}$$

gives the solution

$$\bar{h}^{ij}(ct, \vec{x}) = -\frac{2G_N}{c^6 r} \left[\frac{d^2 I^{ij}(ct')}{dt'^2}\right]_{t' = t_r} . \tag{22.42}$$

This is known as the quadrupole formula.

23 M07Dec

As a warm-up example of solving for gravitational perturbations, we can consider a stationary non-relativistic source which is a perfect fluid. In this case the energy-momentum tensor is constant in time, and the distinction between time and retarded time is irrelevant. Then we have directly that

$$\bar{h}^{\mu\nu}(\vec{x}) = -\frac{4G_N}{c^4} \int d^3\vec{y}\; \frac{T^{\mu\nu}(\vec{y})}{|\vec{x} - \vec{y}|} . \tag{23.1}$$

When our perfect fluid is non-relativistic, all speeds are much smaller than $c$ and to lowest order in perturbation theory we can neglect the pressure. This gives

$$T^{00} = \rho c^2 , \qquad T^{0i} = \rho c\, u^i , \qquad T^{ij} = \rho\, u^i u^j , \tag{23.2}$$

where $\rho$ is the proper density distribution of the source. The solution can be written as

$$\bar{h}^{00} = \frac{4\Phi}{c^2} , \qquad \bar{h}^{0i} = \frac{A^i}{c} , \qquad \bar{h}^{ij} = 0 , \tag{23.3}$$

where

$$\begin{aligned}
\Phi(\vec{x}) &= -G_N \int d^3\vec{y}\; \frac{\rho(\vec{y})}{|\vec{x} - \vec{y}|} \\
A^i(\vec{x}) &= -\frac{4G_N}{c^2} \int d^3\vec{y}\; \frac{\rho(\vec{y})\, u^i(\vec{y})}{|\vec{x} - \vec{y}|} .
\end{aligned} \tag{23.4}$$

We can easily obtain $h_{\mu\nu}$ as functions of $\bar{h}^{\mu\nu}$ using our earlier formula connecting them. The result is

$$h_{00} = h_{11} = h_{22} = h_{33} = \frac{2\Phi}{c^2} , \qquad h_{0i} = \frac{A_i}{c} . \tag{23.5}$$

This provides the derivation that we promised quite a long time ago of the lowest-order Newtonian approximation to the spacetime metric,

$$ds^2 = \left(1 + \frac{2\Phi}{c^2}\right)(c\,dt)^2 + 2\,\frac{A_i}{c}\,(c\,dt)\,dx^i - \left(1 - \frac{2\Phi}{c^2}\right)\delta_{ij}\,dx^i dx^j , \tag{23.6}$$

with the bonus that we now allow for slow rotation of the source. An example of a stationary non-relativistic source would be a rigidly rotating sphere.
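To get a feel for how small the perturbation in (23.5) is, one can evaluate $h_{00} = 2\Phi/c^2$ at the surface of the Earth. The sketch below assumes a spherically symmetric source, so that $\Phi = -G_N M/r$ outside it, and uses rounded textbook values for the constants; none of these numbers appear in the notes.

```python
G_N = 6.674e-11   # Newton's constant in m^3 kg^-1 s^-2 (rounded)
C = 2.998e8       # speed of light in m/s (rounded)

def newtonian_potential(mass, r):
    """Phi = -G_N M / r outside a spherically symmetric mass (special case of (23.4))."""
    return -G_N * mass / r

def h00(mass, r):
    """Metric perturbation h_00 = 2 Phi / c^2 from (23.5)."""
    return 2.0 * newtonian_potential(mass, r) / C ** 2

M_EARTH = 5.972e24  # kg, assumed value
R_EARTH = 6.371e6   # m, assumed value

print(h00(M_EARTH, R_EARTH))  # of order 1e-9, so |h| << 1 as the linearization requires
```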

23.1 Gravitational plane waves

Another simple example of solving for gravitational perturbations is the case of gravitational plane waves. These take the form

$$\bar{h}^{\mu\nu} = \frac{1}{2} A^{\mu\nu} \exp(i k_\rho x^\rho) + \text{c.c.} , \tag{23.7}$$

where the vacuum wave equation $\Box^2 \bar{h}^{\mu\nu} = 0$ requires the wavevector to be null, $k_\rho k^\rho = 0$. The de Donder gauge condition requires that the polarization tensor $A^{\mu\nu}$ obeys

$$k_\mu A^{\mu\nu} = 0 , \tag{23.8}$$

i.e., it is transverse to the direction of propagation of the wave.

Let us count polarizations. We started off with ten components of our symmetric tensor metric perturbations. Fixing de Donder gauge reduces that to six independent components. We can further fix the gauge by doing a coordinate transformation $x^\mu \to x^\mu + \xi^\mu$, as long as we stay within de Donder gauge, which further reduces the number of independent components down to two. Let us see how this works, in more detail. Consider a $\xi^\mu$ of the form

$$\xi^\mu = \epsilon^\mu \exp(i k_\nu x^\nu) . \tag{23.9}$$

This clearly obeys $\Box^2 \xi^\mu = 0$ if $\epsilon^\mu = \text{const}$. Under this transformation, we know how $\bar{h}^{\mu\nu}$ transforms, which tells us that the polarization tensor must also transform as

$$A'^{\mu\nu} = A^{\mu\nu} - i\epsilon^\mu k^\nu - i\epsilon^\nu k^\mu + i\eta^{\mu\nu} \epsilon^\rho k_\rho . \tag{23.10}$$

Let our wavevector lie along the $z$-direction: $\vec{k} = k\hat{z}$. Then by our de Donder gauge condition, $A^{\mu 3} = A^{\mu 0}\ \forall \mu$. Using this and the above two equations, we can straightforwardly show that the components of $\epsilon^\mu$ can always be chosen to ensure that the only nonzero components of the polarization tensor are

$$[A^{\mu\nu}_{TT}] = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & a & b & 0 \\ 0 & b & -a & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{23.11}$$

This cleverly chosen gauge is known as the transverse traceless (TT) gauge. If we wish, we can write the polarization tensor as

$$A^{\mu\nu}_{TT} = a\, e_+^{\mu\nu} + b\, e_\times^{\mu\nu} . \tag{23.12}$$

More generally, we define the TT gauge via

$$\bar{h}^{0i}_{TT} \equiv 0 , \qquad \bar{h}_{TT} \equiv 0 . \tag{23.13}$$

Using this and the de Donder gauge condition $\partial_\mu \bar{h}^{\mu\nu}_{TT} = 0$, we have that

$$\partial_0 \bar{h}^{00}_{TT} = 0 , \qquad \partial_i \bar{h}^{ij}_{TT} = 0 . \tag{23.14}$$

What effect does such a gravitational plane wave have on a bunch of free particles? We can work this out by using the geodesic equation for their motion,

$$\frac{du^\sigma}{d\tau} + \Gamma^\sigma{}_{\mu\nu}\, u^\mu u^\nu = 0 . \tag{23.15}$$

Suppose a particle is initially at rest before the wave comes by. Then $[u^\mu] = c\,[1, \vec{0}]^T$, and so

$$\begin{aligned}
\frac{du^\sigma}{d\tau} &= -c^2\, \Gamma^\sigma{}_{00} \\
&= -\frac{c^2}{2}\, \eta^{\sigma\rho}\left(\partial_0 h_{\rho 0} + \partial_0 h_{0\rho} - \partial_\rho h_{00}\right) \\
&= 0
\end{aligned} \tag{23.16}$$

because we are working in TT gauge. So hey: our coordinate system is adapted to individual particles! But even though the coordinate separation of particles is constant, their physical separation is not, because $\bar{h}^{\mu\nu} \neq 0$. Let us parametrize the coordinate spatial separation between two nearby particles as $S^i$. Then the physical spatial separation is

$$\ell^2 \equiv -g_{ij} S^i S^j = (\delta_{ij} - h_{ij})\, S^i S^j . \tag{23.17}$$

To first order in perturbations, we can define a new physical separation vector $\zeta^i$ by

$$\ell^2 = \delta_{ij}\, \zeta^i \zeta^j , \tag{23.18}$$

or

$$\zeta^i = S^i + \frac{1}{2} h^i{}_k S^k . \tag{23.19}$$

To see the effect of our gravitational plane wave in the $z$ direction, let us inspect two particles in the $(x, y)$ plane. Then $S^3 = 0$. Also, because $h^{k3}_{TT} = 0\ \forall k$, there is no change in their $z$-separation due to the plane wave. Their moving around is going to happen in the $(x, y)$ plane only. Another advantage of TT gauge is that $\bar{h} = 0$, which also implies that $h = 0$. Picking the $e_+^{\mu\nu}$ polarization tensor for definiteness, we find easily that

$$h^{\mu\nu}_{TT} = a\, e_+^{\mu\nu} \cos(k_\rho x^\rho) = a\, e_+^{\mu\nu} \cos[k(x^0 - x^3)] , \tag{23.20}$$

where $k = |\vec{k}| = \omega/c$. So

$$(\zeta^i) = (S^1, S^2, 0)^T - \frac{a}{2} \cos[k(x^0 - x^3)]\, (S^1, -S^2, 0)^T . \tag{23.21}$$

This is illustrated nicely in Fig. 18.1 of HEL.

For the other case of the crossed polarization $e_\times^{\mu\nu}$, Fig. 18.2 of HEL shows how to visualize its effect.

Either way, you can think of gravitational waves as stretchy-squeezy waves.
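The stretching and squeezing can be made concrete by applying (23.19) to a few test particles. The following is a minimal sketch, assuming a wave travelling along $z$ with real amplitudes `a` (plus) and `b` (cross) sharing the same phase $k(x^0 - x^3)$; the function name and the sample amplitude are illustrative, and the amplitude is hugely exaggerated compared with any realistic wave.

```python
import math

def physical_separation(S, a, b, phase):
    """zeta^i from (23.19) for a TT wave along z with plus amplitude a and cross amplitude b.

    Uses h^i_k = -h^{ik} for spatial indices in signature (+,-,-,-),
    with h^{11} = -h^{22} = a cos(phase) and h^{12} = h^{21} = b cos(phase).
    """
    s1, s2, s3 = S
    c = math.cos(phase)
    z1 = s1 - 0.5 * c * (a * s1 + b * s2)
    z2 = s2 - 0.5 * c * (b * s1 - a * s2)
    return (z1, z2, s3)  # the z-separation is untouched because h^{k3} = 0

# Plus polarization at phase 0: squeezed along x, stretched along y.
print(physical_separation((1.0, 0.0, 0.0), 0.2, 0.0, 0.0))
print(physical_separation((0.0, 1.0, 0.0), 0.2, 0.0, 0.0))
# Half a period later the pattern reverses.
print(physical_separation((1.0, 0.0, 0.0), 0.2, 0.0, math.pi))
# Cross polarization mixes x and y, giving the same pattern rotated by 45 degrees.
print(physical_separation((1.0, 0.0, 0.0), 0.0, 0.2, 0.0))
```

Setting `b = 0` reproduces (23.21); evaluating the function around a ring of initial separations traces out the ellipses drawn in the HEL figures.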

23.2 Energy loss from gravitational radiation

There is no local notion of gravitational energy density in GR, because we could always change it via a coordinate transformation. Also, in generic spacetimes in GR, neither energy nor momentum is conserved. But we can still motivate an expression for the energy-momentum tensor of the gravitational field itself in the perturbative approximation, in order to allow us to derive the famous formula for the power radiated by gravitational waves.

We began our perturbative treatment of the spacetime metric from the full Einstein equations,

$$G_{\mu\nu} = -\frac{8\pi G_N}{c^4}\, T_{\mu\nu} . \tag{23.22}$$

Now imagine that we go one step beyond linear order, keeping up to second-order terms in small quantities. Then we have

$$G^{(1)}_{\mu\nu} + G^{(2)}_{\mu\nu} + \ldots = -\frac{8\pi G_N}{c^4}\, T_{\mu\nu} . \tag{23.23}$$

We could try moving the second-order approximation to the Einstein tensor over to the RHS and calling it $t^{\rm grav}_{\mu\nu}$. The problem with this idea is that unfortunately this expression is not gauge invariant. HEL explain in detail how to fix this by averaging over a small region about any given point and writing

$$t^{\rm grav}_{\mu\nu} \equiv \frac{c^4}{8\pi G_N}\, \langle G^{(2)}_{\mu\nu} \rangle . \tag{23.24}$$

After a good deal of fairly unilluminating algebra, the resulting expression becomes

$$\begin{aligned}
t^{\rm grav}_{\mu\nu} = {} & \frac{c^4}{32\pi G_N} \left\langle (\partial_\mu \bar{h}_{\rho\sigma})\, \partial_\nu \bar{h}^{\rho\sigma} - 2\,(\partial_\sigma \bar{h}^{\rho\sigma})\, \partial_{(\mu} \bar{h}_{\nu)\rho} - \frac{1}{2} (\partial_\mu \bar{h})\, \partial_\nu \bar{h} \right\rangle \\
& - \left\langle \bar{h}_{\rho(\mu} T_{\nu)}{}^{\rho} + \frac{1}{4}\, \eta_{\mu\nu}\, h^{\rho\sigma} T_{\rho\sigma} \right\rangle .
\end{aligned} \tag{23.25}$$

The key property of this thing is that it is invariant under gauge transformations, as required. We consider gravitational plane waves in vacuo so that only the top line will appear for us. Now, in TT de Donder gauge, we have $\partial_\mu \bar{h}^{\mu\nu}_{TT} = 0$, $\bar{h}_{TT} = 0$, and $\bar{h}^{\mu\nu}_{TT} = h^{\mu\nu}_{TT}$. So then in vacuo, we have only the first term in the complicated expression above turned on. In our TT gauge, considering only the radiative part of the gravitational field shows that $\bar{h}^{0i}_{TT} = 0$, so that in fact only the spatial components of the perturbations actually appear,

$$t^{\rm grav}_{\mu\nu} = \frac{c^4}{32\pi G_N}\, \left\langle (\partial_\mu h^{TT}_{ij})(\partial_\nu h^{ij}_{TT}) \right\rangle \tag{23.26}$$

Physically, the energy flux (energy/area/time) in the $n^i$ spatial direction is

$$F(\vec{n}) = -c\, t^{0k} n_k = +c\, \delta_{kj}\, t^{0k} n^j , \tag{23.27}$$

in our signature convention, because in general an energy-momentum tensor $t^{\mu\nu}$ encodes the flux of $\mu$-momentum in the $\nu$-direction.

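Let us consider a compact source and aim for the far-field result, choosing $\vec{n}$ to be pointing in the radial direction away from the source. Then we have

$$F(\hat{r}) = -\frac{c^4}{32\pi G_N}\, \left\langle (\partial_t h^{TT}_{ij})\, \partial_r h^{ij}_{TT} \right\rangle . \tag{23.28}$$

But from our quadrupole formula from earlier, we know that

$$\bar{h}^{ij} = -\frac{2G_N}{c^6 r} \left[\ddot{I}^{ij}\right]_r \tag{23.29}$$

where an overdot denotes $d/dt$ and the subscript $r$ means evaluation at retarded time. We need an expression for the TT part of the quadrupole moment. As a first step we define its traceless part,

$$J_{ij} \equiv I_{ij} - \frac{1}{3}\delta_{ij} I , \tag{23.30}$$

where $I = I^i{}_i$. Then

$$h^{ij}_{TT} = \bar{h}^{ij}_{TT} = -\frac{2G_N}{c^6 r} \left[\ddot{J}^{ij}_{TT}\right]_r \tag{23.31}$$

Now, in order to finish this line of reasoning, we need to slow down a little and be careful about how to take $t$ and $r$ derivatives at retarded time. Our definition of retarded time was

$$x^0_r \equiv ct_r = x^0 - |\vec{x} - \vec{y}| , \tag{23.32}$$

and so for any function $f(x^0_r, \vec{y})$, we have

$$\begin{aligned}
\frac{\partial f(x^0_r, \vec{y})}{\partial x^\mu} &= \left[\frac{\partial f(y^0, \vec{y})}{\partial y^0}\right]_r \frac{\partial x^0_r}{\partial x^\mu} , \\
\frac{\partial f(x^0_r, \vec{y})}{\partial y^i} &= \left[\frac{\partial f(y^0, \vec{y})}{\partial y^i}\right]_r + \left[\frac{\partial f(y^0, \vec{y})}{\partial y^0}\right]_r \frac{\partial x^0_r}{\partial y^i} ,
\end{aligned} \tag{23.33}$$

where the subscript $r$ means to evaluate at $y^0 = x^0_r$. We therefore have that

$$\partial_t h^{TT}_{ij} = -\frac{2G_N}{c^6 r} \left[\dddot{J}^{TT}_{ij}\right]_r \tag{23.34}$$

We can also evaluate

$$\partial_r h^{TT}_{ij} = \frac{2G_N}{c^6 r^2} \left[\ddot{J}^{TT}_{ij}\right]_r + \frac{2G_N}{c^7 r} \left[\dddot{J}^{TT}_{ij}\right]_r . \tag{23.35}$$

Far from the source, the second term here dominates over the first, and so our expression for the radiation flux from the gravitational wave source is

$$F(\hat{r}) = \frac{G_N}{8\pi r^2 c^9}\, \left\langle \dddot{J}^{TT}_{ij}\, \dddot{J}^{ij}_{TT} \right\rangle . \tag{23.36}$$

Our last task is to express this in terms of the original quadrupole. For that we need a handy projection tensor,

$$P_{ij} \equiv \delta_{ij} - n_i n_j . \tag{23.37}$$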

Applying this to an arbitrary spatial vector allows one to see that it obeys the properties we expect of a projector. The transverse part of the polarization tensor for the gravitational wave is then $A^{ij}_T = P^i{}_k P^j{}_\ell A^{k\ell}$. To ensure that there is no trace part, we need to form $A^{ij}_{TT} = \left(P^i{}_k P^j{}_\ell - \frac{1}{2} P^{ij} P_{k\ell}\right) A^{k\ell}$. By direct analogy, we find for the quadrupole

$$J^{ij}_{TT} = \left(P^i{}_k P^j{}_\ell - \frac{1}{2} P^{ij} P_{k\ell}\right) J^{k\ell} . \tag{23.38}$$
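The projection (23.38) can be tested numerically on random data. The sketch below builds a random symmetric traceless $J$ and a random unit vector $n$ (both hypothetical test inputs), forms $J_{TT}$, and checks that it is traceless and transverse. It also checks the contraction identity for $J^{TT}_{ij} J^{ij}_{TT}$ that the text uses next. Spatial indices are treated as Euclidean throughout, which is an assumption of this sketch.

```python
import math
import random

def projector(n):
    """P_ij = delta_ij - n_i n_j for a unit 3-vector n, as in (23.37)."""
    return [[(1.0 if i == j else 0.0) - n[i] * n[j] for j in range(3)]
            for i in range(3)]

def tt_part(J, n):
    """J_TT^{ij} = (P^i_k P^j_l - (1/2) P^{ij} P_{kl}) J^{kl}, as in (23.38)."""
    P = projector(n)
    PJP = [[sum(P[i][k] * P[j][l] * J[k][l] for k in range(3) for l in range(3))
            for j in range(3)] for i in range(3)]
    PJ = sum(P[k][l] * J[k][l] for k in range(3) for l in range(3))
    return [[PJP[i][j] - 0.5 * P[i][j] * PJ for j in range(3)] for i in range(3)]

def contract(A, B):
    """Full contraction A_ij B_ij over Euclidean spatial indices."""
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))

random.seed(1)
# Hypothetical test data: random symmetric traceless J and random unit vector n.
J = [[0.0] * 3 for _ in range(3)]
for i in range(3):
    for j in range(i, 3):
        J[i][j] = J[j][i] = random.uniform(-1.0, 1.0)
tr = sum(J[i][i] for i in range(3))
for i in range(3):
    J[i][i] -= tr / 3.0
v = [random.gauss(0.0, 1.0) for _ in range(3)]
norm = math.sqrt(sum(c * c for c in v))
n = [c / norm for c in v]

J_TT = tt_part(J, n)
trace_TT = sum(J_TT[i][i] for i in range(3))
J_TT_dot_n = [sum(J_TT[i][j] * n[j] for j in range(3)) for i in range(3)]

# Contraction identity for traceless J: J.J - 2 (n J J n) + (1/2) (n J n)^2.
nJJn = sum(n[i] * J[i][k] * J[k][j] * n[j]
           for i in range(3) for j in range(3) for k in range(3))
nJn = sum(n[i] * J[i][j] * n[j] for i in range(3) for j in range(3))
rhs = contract(J, J) - 2.0 * nJJn + 0.5 * nJn ** 2

print(trace_TT, J_TT_dot_n, contract(J_TT, J_TT) - rhs)  # all should vanish
```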

Denoting the components of the unit radial vector by $\hat{x}_i$, the projection (23.38) gives

$$J^{TT}_{ij} J^{ij}_{TT} = J_{ij} J^{ij} - 2\, J_i{}^j J^{ik}\, \hat{x}_j \hat{x}_k + \frac{1}{2}\, J^{ij} J^{k\ell}\, \hat{x}_i \hat{x}_j \hat{x}_k \hat{x}_\ell . \tag{23.39}$$

To get the integrated luminosity, we integrate this over $4\pi$ of solid angle. After the boring dust settles, we have (at last!) the famous formula we wanted,

$$\frac{dE}{dt} = -L_{GW} = -\frac{G_N}{5c^9}\, \left\langle \left[\dddot{J}_{ij}\, \dddot{J}^{ij}\right]_r \right\rangle . \tag{23.40}$$

This shows that you not only need a quadrupole (not a monopole or a dipole) to produce gravitational radiation, you also need the third derivative of it turned on. Again, the reason why we use retarded times in this expression is to ensure the correct boundary conditions for our Green's function reflecting causality.

Gravitational radiation was first detected indirectly, through observations of the binary pulsar discovered in 1974 by Russell Hulse and Joseph Taylor, which won them the 1993 Nobel Prize in Physics. The orbital period of the binary decreased over time, at a rate precisely predicted by GR. A far more impressive technological feat was the building of LIGO, the Laser Interferometer Gravitational-Wave Observatory. It won the 2017 Nobel Prize in Physics for Rainer Weiss, Barry Barish, and Kip Thorne for the direct discovery of gravitational waves: tiny ripples in the very fabric of spacetime long thought technologically impossible to detect. Here are a few URLs for checking out their discoveries:-

• https://www.youtube.com/watch?v=FXlg3cr-q44

• https://www.ligo.caltech.edu/news/ligo20160211

• https://www.ligo.caltech.edu/page/four-new-detections-o1-o2-catalog
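As a closing illustration of the luminosity formula (23.40), consider two point masses on a circular orbit, treated in the Newtonian limit. This is a sketch under stated assumptions rather than a result from the notes. Since $I^{ij}$ is built from $T^{00} = \rho c^2$, it equals $c^2$ times the mass quadrupole $Q^{ij}$, so the prefactor $G_N/5c^9$ becomes $G_N/5c^5$ when written in terms of $Q^{ij}$. For reduced mass `mu`, separation `a` and orbital angular frequency `omega`, the orbit-averaged result should agree with the closed form $\frac{32}{5} G_N \mu^2 a^4 \omega^6 / c^5$. The masses and separation below are made-up illustrative values.

```python
import math

G_N = 6.674e-11   # m^3 kg^-1 s^-2 (rounded)
C = 2.998e8       # m/s (rounded)

def quadrupole_dddot(mu, a, omega, t):
    """Third time derivative of Q^{ij} = mu x^i x^j for a circular orbit in the xy-plane.

    The trace of Q is constant in time, so this is already the traceless part.
    """
    amp = 4.0 * mu * a ** 2 * omega ** 3
    s, c = math.sin(2.0 * omega * t), math.cos(2.0 * omega * t)
    return [[amp * s, -amp * c, 0.0],
            [-amp * c, -amp * s, 0.0],
            [0.0, 0.0, 0.0]]

def luminosity(mu, a, omega, samples=1000):
    """Orbit average of (G_N / 5 c^5) * dddot(Q)_ij dddot(Q)_ij, i.e. (23.40) with I = c^2 Q."""
    period = 2.0 * math.pi / omega
    total = 0.0
    for step in range(samples):
        Q3 = quadrupole_dddot(mu, a, omega, period * step / samples)
        total += sum(Q3[i][j] ** 2 for i in range(3) for j in range(3))
    return G_N / (5.0 * C ** 5) * total / samples

def luminosity_closed_form(mu, a, omega):
    """Closed-form circular-binary result, 32/5 G mu^2 a^4 omega^6 / c^5."""
    return 32.0 / 5.0 * G_N * mu ** 2 * a ** 4 * omega ** 6 / C ** 5

# Illustrative (assumed) system: two 1.4 solar-mass stars separated by 2e9 m.
M_SUN = 1.989e30
m1 = m2 = 1.4 * M_SUN
mu = m1 * m2 / (m1 + m2)
a = 2.0e9
omega = math.sqrt(G_N * (m1 + m2) / a ** 3)  # Kepler's third law

print(luminosity(mu, a, omega), luminosity_closed_form(mu, a, omega))
```

The steep $\omega^6$ dependence is why only compact, fast binaries radiate appreciably.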
