
I. GENERAL RELATIVITY – A SUMMARY

A. Pseudo-Riemannian manifolds

Spacetime is a manifold that is continuous and differentiable. This means that we can define scalars, vectors, 1-forms and, in general, tensor fields, and that we can take derivatives at any point. A differentiable manifold is a primitive, amorphous collection of points (events in the case of spacetime). Locally, these points are ordered as points in a Euclidean space. Next, we specify a distance concept by adding a metric $g$, which contains information about how fast clocks run and what the distances between points are. On the surface of the Earth we can determine a metric by drawing small vectors $\overrightarrow{\Delta P}$ on the surface. We state that the length of the vector is given by the inner product

$$\overrightarrow{\Delta P}\cdot\overrightarrow{\Delta P} \equiv \overrightarrow{\Delta P}^{\,2} = (\text{length of } \overrightarrow{\Delta P})^2, \tag{1.1}$$

and use a ruler to determine its value. We now have a definition for the inner product of a small vector with itself. We use linearity to extend this to macroscopic vectors. Next, we can obtain a definition for the inner product of two different vectors by writing

$$\vec A\cdot\vec B = \frac{1}{4}\left[(\vec A+\vec B)^2 - (\vec A-\vec B)^2\right]. \tag{1.2}$$

In summary, when one has a distance concept (a ruler on the surface of the Earth), then one can define an inner product, and from this the metric follows, since it is nothing but $g(\vec A,\vec B) \equiv \vec A\cdot\vec B = g(\vec B,\vec A)$. The metric is symmetric. A differentiable manifold with a metric as additional structure is termed a (pseudo-)Riemannian manifold.

Figure 1: Left: at each point P on the surface of the Earth a tangent space (in this case a tangent plane) exists; right: the tangent plane is a nearly correct image in the vicinity of the point P.

We now want to assign a metric to spacetime. To this end we introduce a local Lorentz frame (LLF). We can achieve this by going into free fall at point P. The equivalence principle states that all effects of gravitation then disappear locally, and that we locally obtain the metric of the special theory of relativity (SRT). This is the Minkowski metric. Thus, we can choose at each point P of the manifold a coordinate system in which the Minkowski metric is valid. While in the SRT this can be a global coordinate system, in general relativity (GR) this is only locally possible. With this procedure we have now found a definition of distance at each point P: with $g_{\mu\nu} = \eta_{\mu\nu}$ we have $ds^2 = \eta_{\mu\nu}dx^\mu dx^\nu$. In essence, we practice SRT at each point P and have a measure for lengths of rods and proper times of ideal clocks. In a LLF the metric is given by $\eta_{\mu\nu} = \mathrm{diag}(-1, +1, +1, +1)$. For a Riemannian manifold all diagonal elements need to be positive. The signature (the sum of the diagonal elements) of the metric of spacetime is +2, and in our case we refer to the manifold as pseudo-Riemannian.

Assume that we draw a coordinate system on the Earth's surface with longitude and latitude. As long as we stay close to a point P, this reference system locally resembles a Cartesian system. Deviations from Cartesian coordinates occur at second order in the distance $|\vec x|$ from the point P. Mathematically, this means that

$$g_{jk} = \delta_{jk} + O\!\left(\frac{|\vec x|^2}{R^2}\right), \tag{1.3}$$

with $R$ the radius of the Earth. A simpler way to understand this is by constructing the tangent plane at point P. Fig. 1 shows that when $\vec x$ denotes the position vector of a point with respect to P, then this corresponds to $\cos|\vec x|$ on the tangent plane. A series expansion yields $\cos x = 1 - \frac{x^2}{2} + \dots$. As a consequence we see that when one considers only first-order derivatives, one observes no influence of the curvature of the Earth. Only when second-order derivatives are taken into account does one observe curvature effects. The same is true for spacetime. In a curved spacetime we cannot define a global Lorentz frame for which $g_{\alpha\beta} = \eta_{\alpha\beta}$. However, it is possible to choose coordinates such that in the vicinity of P this equation is almost valid. This is made possible by the equivalence principle. This is the exact definition of a local Lorentz frame, and for such a coordinate system one has

$$\begin{aligned}
g_{\alpha\beta}(P) &= \eta_{\alpha\beta} && \text{for all } \alpha, \beta;\\
\frac{\partial}{\partial x^\gamma}\, g_{\alpha\beta}(P) &= 0 && \text{for all } \alpha, \beta, \gamma;\\
\frac{\partial^2}{\partial x^\gamma \partial x^\mu}\, g_{\alpha\beta}(P) &\neq 0.
\end{aligned} \tag{1.4}$$

The existence of local Lorentz frames expresses that each curved spacetime has at each point a flat tangent space. All tensor manipulations occur in this tangent space. The above expressions constitute the mathematical definition of the fact that the equivalence principle allows us to choose a LLF at point P.

The metric is used to define the length of a curve. When $d\vec x$ is a small vector displacement on a curve, then the squared length is equal to $ds^2 = g_{\alpha\beta}dx^\alpha dx^\beta$ (we call this the line element). A measure for the length is found by taking the root of the absolute value. This yields $dl \equiv |g_{\alpha\beta}dx^\alpha dx^\beta|^{1/2}$. Integration gives the total length $l$ and we find

$$l = \int_{\text{along the curve}} \left|g_{\alpha\beta}\,dx^\alpha dx^\beta\right|^{1/2} = \int_{\lambda_0}^{\lambda_1} \left|g_{\alpha\beta}\frac{dx^\alpha}{d\lambda}\frac{dx^\beta}{d\lambda}\right|^{1/2} d\lambda, \tag{1.5}$$

where $\lambda$ is the parameter of the curve. The curve has as end points $\lambda_0$ and $\lambda_1$. The tangent vector $\vec V$ of the curve has components $V^\alpha = dx^\alpha/d\lambda$ and we obtain

$$l = \int_{\lambda_0}^{\lambda_1} \left|\vec V\cdot\vec V\right|^{1/2} d\lambda \tag{1.6}$$

for the length of an arbitrary curve.

When we perform integrations in spacetime it is important to calculate volumes. With volume we mean a four-dimensional volume. Suppose that we are in a LLF and have a volume element $dx^0 dx^1 dx^2 dx^3$, with coordinates $\{x^\alpha\}$ in the local Lorentz metric $\eta_{\alpha\beta}$. Transformation theory states that

$$dx^0 dx^1 dx^2 dx^3 = \frac{\partial(x^0, x^1, x^2, x^3)}{\partial(x^{0'}, x^{1'}, x^{2'}, x^{3'})}\, dx^{0'} dx^{1'} dx^{2'} dx^{3'}, \tag{1.7}$$

where the factor $\partial(\ )/\partial(\ )$ is the Jacobian of the transformation of $\{x^{\alpha'}\}$ to $\{x^\alpha\}$. One has

$$\frac{\partial(x^0, x^1, x^2, x^3)}{\partial(x^{0'}, x^{1'}, x^{2'}, x^{3'})} = \det\begin{pmatrix} \dfrac{\partial x^0}{\partial x^{0'}} & \dfrac{\partial x^0}{\partial x^{1'}} & \cdots \\ \dfrac{\partial x^1}{\partial x^{0'}} & \dfrac{\partial x^1}{\partial x^{1'}} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} = \det\left(\Lambda^\alpha{}_{\beta'}\right). \tag{1.8}$$

The calculation of this determinant is rather involved and it is simpler to realize that in terms of matrices the transformation of the components of the metric is given by the equation $(g) = (\Lambda)(\eta)(\Lambda)^T$, where 'T' denotes the transpose. Then the determinants obey $\det(g) = \det(\Lambda)\det(\eta)\det(\Lambda^T)$. For each matrix one has $\det(\Lambda) = \det(\Lambda^T)$ and furthermore we have $\det(\eta) = -1$. We obtain $\det(g) = -[\det(\Lambda)]^2$. We use the notation

$$g \equiv \det(g_{\alpha'\beta'}) \;\to\; \det\left(\Lambda^\alpha{}_{\beta'}\right) = (-g)^{1/2} \tag{1.9}$$

and find

$$dx^0 dx^1 dx^2 dx^3 = \left[-\det(g_{\alpha'\beta'})\right]^{1/2} dx^{0'} dx^{1'} dx^{2'} dx^{3'} = (-g)^{1/2}\, dx^{0'} dx^{1'} dx^{2'} dx^{3'}. \tag{1.10}$$

It is important to appreciate the reasoning we followed in order to obtain the above result. We started in a special coordinate system, the LLF, where the Minkowski metric is valid. We then generalized the result to all coordinate systems.
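As an illustration of Eqs. (1.8) to (1.10), the sketch below checks numerically that the Jacobian determinant equals $(-g)^{1/2}$ in one concrete case. The choice of spherical coordinates $(t, r, \theta, \phi)$ in flat spacetime, the sample point and all function names are assumptions made for this example only; they do not appear in the text above.

```python
import math

def jacobian_spherical(r, theta, phi):
    """Matrix d(t, x, y, z) / d(t, r, theta, phi) for
    x = r sin(theta) cos(phi), y = r sin(theta) sin(phi), z = r cos(theta)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, st * cp, r * ct * cp, -r * st * sp],
        [0.0, st * sp, r * ct * sp, r * st * cp],
        [0.0, ct, -r * st, 0.0],
    ]

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def metric_from_jacobian(lam):
    """g_{a'b'} = Lambda^mu_{a'} Lambda^nu_{b'} eta_{mu nu}, with eta = diag(-1, 1, 1, 1)."""
    eta = [-1.0, 1.0, 1.0, 1.0]
    return [[sum(eta[m] * lam[m][a] * lam[m][b] for m in range(4))
             for b in range(4)] for a in range(4)]

# Arbitrary sample point (an assumption of this example).
r, theta, phi = 2.0, 0.7, 1.3
lam = jacobian_spherical(r, theta, phi)
g = metric_from_jacobian(lam)

jacobian_det = det(lam)            # left-hand side of Eq. (1.9)
sqrt_minus_g = math.sqrt(-det(g))  # right-hand side of Eq. (1.9)
print(jacobian_det, sqrt_minus_g, r**2 * math.sin(theta))
```

Both numbers agree with the familiar volume factor $r^2\sin\theta$, which is what Eq. (1.10) predicts for this coordinate choice.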

B. Tensors and covariant derivative

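Suppose we have a tensor field $T(\ ,\ ,\ )$ with rank 3. This field is a function of location and defines a tensor at each point P. We can expand this tensor in the basis $\{\vec e_\alpha\}$, which gives the (upper-index) components $T^{\alpha\beta\gamma}$. In general we have 64 components for spacetime. However, we can also expand the tensor T in the dual basis $\{\vec e^{\,\alpha}\}$ and we find

$$T(\ ,\ ,\ ) \equiv T^{\alpha\beta\gamma}\, \vec e_\alpha \otimes \vec e_\beta \otimes \vec e_\gamma = T_{\alpha\beta}{}^{\gamma}\, \vec e^{\,\alpha} \otimes \vec e^{\,\beta} \otimes \vec e_\gamma. \tag{1.11}$$

When we want to calculate the components we use the following theorem:

$$T^{\alpha\beta\gamma} = T(\vec e^{\,\alpha}, \vec e^{\,\beta}, \vec e^{\,\gamma}) \quad\text{and}\quad T_{\mu\nu}{}^{\gamma} = T(\vec e_\mu, \vec e_\nu, \vec e^{\,\gamma}). \tag{1.12}$$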

When we have the components of tensor T in a certain order of upper and lower indices, and we want to know the components with some other order of indices, then the metric can be used. One has

$$T_{\mu\nu}{}^{\gamma} = T^{\alpha\beta\gamma} g_{\alpha\mu} g_{\beta\nu} \quad\text{and, for example, also}\quad T^{\alpha\beta\gamma} = g^{\alpha\rho}\, T_\rho{}^{\beta\gamma}. \tag{1.13}$$

Next, we want to discuss contraction. This is rather complicated to treat in our abstract notation. Given a tensor R, we can always write it in terms of a vector basis as

$$R(\ ,\ ,\ ,\ ) = \vec A \otimes \vec B \otimes \vec C \otimes \vec D + \dots \tag{1.14}$$

We discuss contraction only for a tensor product of vectors and use linearity to obtain a mathematical description for arbitrary tensors. For the contraction $C_{13}$ of the first and third index one has

$$C_{13}\left[\vec A \otimes \vec B \otimes \vec C \otimes \vec D(\ ,\ ,\ ,\ )\right] \equiv (\vec A\cdot\vec C)\, \vec B \otimes \vec D(\ ,\ ). \tag{1.15}$$

We can write the above abstract definition in terms of components and find

$$\vec A\cdot\vec C = A^\mu C^\nu\, \vec e_\mu\cdot\vec e_\nu = A^\mu C^\nu g_{\mu\nu} = A^\mu C_\mu \;\to\; C_{13}R = R^{\mu\beta}{}_{\mu}{}^{\delta}\, \vec e_\beta \otimes \vec e_\delta. \tag{1.16}$$

In the same way as above, we see that from two vectors $\vec A$ and $\vec B$ a tensor $\vec A \otimes \vec B$ can be constructed by taking the tensor product, while we can obtain a scalar $\vec A\cdot\vec B$ by taking the inner product. The contraction of the tensor product $\vec A \otimes \vec B$ again yields a scalar, $C\left[\vec A \otimes \vec B\right] = \vec A\cdot\vec B$.
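The index gymnastics of Eqs. (1.13) and (1.16) can be made concrete with a few lines of code. The sketch below works in a local Lorentz frame with $g_{\mu\nu} = \eta_{\mu\nu}$; the two sample vectors and the helper names are hypothetical and serve only to illustrate that lowering an index and contracting the tensor product give the same scalar.

```python
# Minkowski metric in a local Lorentz frame, eta = diag(-1, +1, +1, +1).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def lower_index(vec, metric=ETA):
    """A_mu = g_{mu nu} A^nu."""
    return [sum(metric[m][n] * vec[n] for n in range(4)) for m in range(4)]

def tensor_product(a, b):
    """Components (A x B)^{mu nu} = A^mu B^nu."""
    return [[a_m * b_n for b_n in b] for a_m in a]

def contract(t, metric=ETA):
    """Contract the two slots of T^{mu nu} with the metric: g_{mu nu} T^{mu nu}."""
    return sum(metric[m][n] * t[m][n] for m in range(4) for n in range(4))

# Sample components A^mu and B^mu (arbitrary values chosen for this example).
A = [2.0, 1.0, 0.0, 3.0]
B = [1.0, 4.0, -1.0, 2.0]

A_low = lower_index(A)                           # A_mu
inner = sum(a * b for a, b in zip(A_low, B))     # A_mu B^mu
contracted = contract(tensor_product(A, B))      # C[A x B]
print(A_low, inner, contracted)
```

Only the time component changes sign when the index is lowered, and the inner product computed as $A_\mu B^\mu$ coincides with the contraction of $\vec A \otimes \vec B$.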

From now on we will look at expressions such as $R^{\mu\beta}{}_{\mu}{}^{\delta}$ from a different angle. So far we have viewed these as the components of a tensor; from now on our interpretation is that the indices $\mu$, $\beta$, $\mu$ and $\delta$ label the slots of the abstract tensor R. Thus, $R^{\alpha\beta\gamma\delta}$ represents the abstract tensor $R(\ ,\ ,\ ,\ )$ with as first slot $\alpha$, second slot $\beta$, etc. The above completes our discussion of tensor algebra.

In the following we will discuss tensor analysis. We do this for a tensor field $T(\ ,\ )$ of rank 2, but what we conclude is valid for all tensor fields. The field T is a function of location in the manifold, $T(P)$. We take the derivative of T along the curve $P(\lambda)$. At point P the vector $\vec A$ tangent to the curve is given by $\vec A = \frac{dP}{d\lambda} = \frac{d}{d\lambda}$. The derivative of T along the curve (so in the direction of vector $\vec A$) is given by

$$\nabla_{\vec A} T = \lim_{\Delta\lambda \to 0} \frac{\left[T(P(\lambda + \Delta\lambda))\right]_{\parallel} - T(P(\lambda))}{\Delta\lambda}. \tag{1.17}$$

Notice that the two tensors, $T(P(\lambda+\Delta\lambda))$ and $T(P(\lambda))$, live in two separate tangent spaces. These tangent spaces are almost identical, because $\Delta\lambda$ is small, but nevertheless they are different. We need a way to transport the tensor $T(P(\lambda + \Delta\lambda))$ to point P, where we can determine the derivative, so that we can subtract the tensors. What we need is called parallel transport of $T(P(\lambda + \Delta\lambda))$; the subscript $\parallel$ in Eq. (1.17) denotes the parallel-transported tensor. In a curved manifold we do not observe the effects of curvature when we take first-order derivatives [1]. Parallel transport then has the same meaning as it does in flat space: the components do not change by the process of transporting. So we have found with Eq. (1.17) an expression for the derivative. The original tensor $T(\ ,\ )$ has two slots, and the same is true for the derivative $\nabla_{\vec A}T(\ ,\ )$, since according to Eq. (1.17) the derivative is no more than the difference of two tensors T at different points, divided by the distance $\Delta\lambda$.

[1] We can always construct a local Lorentz frame which is sufficiently flat for what we intend to do. In that system the basis vectors are constant and their derivatives are zero at point P. This constitutes a definition for the covariant derivative. This definition immediately makes the Christoffel symbols disappear and in the LLF one has $V^\alpha{}_{;\beta} = V^\alpha{}_{,\beta}$ at point P. This is valid for every tensor and for the metric, $g_{\alpha\beta;\gamma} = g_{\alpha\beta,\gamma} = 0$ at point P. Since the equation $g_{\alpha\beta;\gamma} = 0$ is a tensor equation, it is valid in each basis. Given that $\Gamma^\mu{}_{\alpha\beta} = \Gamma^\mu{}_{\beta\alpha}$, we find that the connection coefficients are determined by the metric through

$$\Gamma^\alpha{}_{\mu\nu} = \frac{1}{2} g^{\alpha\beta}\left(\frac{\partial}{\partial x^\nu} g_{\beta\mu} + \frac{\partial}{\partial x^\mu} g_{\beta\nu} - \frac{\partial}{\partial x^\beta} g_{\mu\nu}\right). \tag{1.18}$$

Thus, while $\Gamma^\alpha{}_{\mu\nu} = 0$ at P in the LLF, this does not hold for its derivatives, because they contain $g_{\alpha\beta,\gamma\mu}$. So the Christoffel symbols may be zero at point P when we select a LLF, but in general they differ from zero in the neighborhood of this point. The difference between a curved and a flat manifold manifests itself in the derivatives of the Christoffel symbols.

As a next step we can now introduce the concept of gradient. We notice that the derivative $\nabla_{\vec A}T(\ ,\ )$ is linear in the vector $\vec A$. This means that a rank-3 tensor $\nabla T(\ ,\ ,\ )$ exists, such that

$$\nabla_{\vec A} T(\ ,\ ) \equiv \nabla T(\ ,\ ,\vec A). \tag{1.19}$$

This is the definition of the gradient of T. The final slot is by convention used as the differentiation slot. The gradient of T is a linear function of vectors and has one slot more than T itself, and furthermore possesses the property that when one inserts $\vec A$ in the final slot, one obtains the derivative of T in the direction of $\vec A$. We define the components of the gradient as

$$\nabla T \equiv T^{\alpha\beta}{}_{;\mu}\, \vec e_\alpha \otimes \vec e_\beta \otimes \vec e^{\,\mu}. \tag{1.20}$$

It is a convention to place the differentiation index below. In addition, notice that one can bring this index up or down, just like any other index. Furthermore, everything after the semicolon corresponds to a gradient. The components of the gradient are in this case $T^{\alpha\beta}{}_{;\mu}$.

How do we calculate the components of a gradient? The tools for this are the so-called connection coefficients [2]. These coefficients carry this name because in taking the derivative we have to compare the tensor field at two different tangent spaces. The connection coefficients give information about how the basis vectors change between these neighboring tangent spaces. Because we have a basis at point P, we can ask what the derivative of $\vec e_\alpha$ is in the direction of $\vec e_\mu$. One has

$$\nabla_{\vec e_\mu} \vec e_\alpha \equiv \Gamma^\rho{}_{\alpha\mu}\, \vec e_\rho. \tag{1.21}$$

This derivative is itself a vector and we can expand it in our basis at point P, where we want to know the derivative. The expansion coefficients are $\Gamma^\rho{}_{\alpha\mu}$. In the same manner we have

$$\nabla_{\vec e_\mu} \vec e^{\,\rho} = -\Gamma^\rho{}_{\sigma\mu}\, \vec e^{\,\sigma}. \tag{1.22}$$

[2] These are also known as Christoffel symbols.

Notice that we now get a minus sign! The connection coefficients show how basis vectors change from place to place. So when one wants to find the components of a gradient, for example $T^{\alpha\beta}{}_{;\gamma}$, then one has to take into account the change of the basis vectors. The components $T^{\alpha\beta}$ themselves may be constant while only the basis vectors depend on position. One can show that

$$T^\alpha{}_{\beta;\gamma} = T^\alpha{}_{\beta,\gamma} + \Gamma^\alpha{}_{\mu\gamma}\, T^\mu{}_\beta - \Gamma^\mu{}_{\beta\gamma}\, T^\alpha{}_\mu, \quad\text{where}\quad T^\alpha{}_{\beta,\gamma} = \partial_{\vec e_\gamma} T^\alpha{}_\beta = \frac{\partial}{\partial x^\gamma} T^\alpha{}_\beta. \tag{1.23}$$

When we know the metric g, we can calculate the Christoffel symbols, and with them all covariant derivatives. In this manner we find the equations

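$$\begin{aligned}
V^\alpha{}_{;\beta} &= V^\alpha{}_{,\beta} + \Gamma^\alpha{}_{\mu\beta}\, V^\mu,\\
P_{\alpha;\beta} &= P_{\alpha,\beta} - \Gamma^\mu{}_{\alpha\beta}\, P_\mu,\\
T^{\alpha\beta}{}_{;\gamma} &= T^{\alpha\beta}{}_{,\gamma} + \Gamma^\alpha{}_{\mu\gamma}\, T^{\mu\beta} + \Gamma^\beta{}_{\mu\gamma}\, T^{\alpha\mu}.
\end{aligned} \tag{1.24}$$

We introduced the notation $T^{\alpha\beta}{}_{;\mu}$ to underscore the fact that covariant differentiation changes the rank of a tensor. Another notation which we will use in the rest of these notes is $\nabla_\mu T^{\alpha\beta}$. Note that $T^{\alpha\beta}{}_{;\mu} = \nabla_\mu T^{\alpha\beta} = \nabla_{\vec e_\mu} T^{\alpha\beta}$. Similarly, we write $T^{\alpha\beta}{}_{,\mu} = \partial_\mu T^{\alpha\beta} = \partial T^{\alpha\beta}/\partial x^\mu$.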
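The statement that the metric fixes the Christoffel symbols, Eq. (1.18), can be tried out numerically. The following sketch evaluates Eq. (1.18) with central finite differences for the surface of a sphere, $ds^2 = R^2 d\theta^2 + R^2\sin^2\theta\, d\phi^2$. The choice of this two-dimensional metric, the step size and the function names are assumptions of the example; the closed-form values $\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi{}_{\theta\phi} = \cot\theta$ used for comparison are the standard results for this metric.

```python
import math

def metric_sphere(x, radius=1.0):
    """Metric g_{ab} of a 2-sphere in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[radius**2, 0.0], [0.0, (radius * math.sin(theta))**2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix."""
    d = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / d, -g[0][1] / d], [-g[1][0] / d, g[0][0] / d]]

def christoffel(metric, x, h=1e-5):
    """Gamma^a_{mn} from Eq. (1.18); metric derivatives by central differences.
    Returned as gamma[a][m][n]."""
    n = len(x)
    dg = []  # dg[c][a][b] = d g_{ab} / d x^c
    for c in range(n):
        xp, xm = list(x), list(x)
        xp[c] += h
        xm[c] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)]
                   for a in range(n)])
    ginv = inverse_2x2(metric(x))
    return [[[0.5 * sum(ginv[a][b] * (dg[k][b][m] + dg[m][b][k] - dg[b][m][k])
                        for b in range(n))
              for k in range(n)] for m in range(n)] for a in range(n)]

theta0 = 1.0
gamma = christoffel(metric_sphere, [theta0, 0.3])
print(gamma[0][1][1], -math.sin(theta0) * math.cos(theta0))  # Gamma^theta_{phi phi}
print(gamma[1][0][1], math.cos(theta0) / math.sin(theta0))   # Gamma^phi_{theta phi}
```

The numerical values reproduce the closed-form expressions, and the result is symmetric in the two lower indices, as assumed in the derivation of Eq. (1.18).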

C. Geodesics and curvature

When we draw spherical coordinates on a sphere, and follow two lines that are perpendicular to the equator in the direction of the North pole, we observe that two initially parallel lines meet at a point on the curved surface. The fifth postulate of Euclid does not hold for a curved space: parallel lines can intersect. Another illustration of how curvature manifests itself is perhaps more effective. It is outlined in Fig. 2. We start at point P with a tangent vector that points in the horizontal direction. We take a small step in the direction of Q and after each step we project the tangent vector again on the local tangent space. This is our method of parallel transport. After completing the trajectory PQRP, we observe that the final vector is not parallel to the initial vector. This does not occur in a flat space and is an effect of the curvature of the sphere. The consequence is that on a sphere we cannot define vector fields that are parallel in a global sense. The result of the process of parallel transport depends on the path chosen and on the size of the loop.

Figure 2: Parallel transport of a vector $\vec V$ around a triangular path PQRP on the surface of a sphere. By transporting $\vec V$ along the loop PQRP the final vector will be rotated with respect to the initial vector. The angle of rotation depends on the size of the loop, the path chosen, and the curvature of the manifold.

In order to find a mathematical description, we interpret the interval PQ in Fig. 2 as a curve, and view $\lambda$ as the parameter of this curve. The vector field $\vec V$ is defined at each point of the curve. The vector $\vec U = d\vec x/d\lambda$ is the vector tangent to the curve. Parallel transport means that in a local inertial coordinate frame at point P the components of $\vec V$ must be constant along the curve. One has

$$\frac{dV^\alpha}{d\lambda} = U^\beta \partial_\beta V^\alpha = U^\beta \nabla_\beta V^\alpha = 0 \quad\text{at point P}. \tag{1.25}$$

The first equality corresponds to the definition of the derivative of a function (in this case $V^\alpha$) along the curve, the second equality arises from the fact that $\Gamma^\alpha{}_{\mu\nu} = 0$ at point P in these coordinates. The third equality is a frame-independent expression that is valid in any basis. We take this as the coordinate-system-independent definition of the parallel transport of $\vec V$ along $\vec U$. A vector $\vec V$ is parallel transported along a curve with parameter $\lambda$ when

$$U^\beta \nabla_\beta V^\alpha = 0 \;\leftrightarrow\; \frac{d}{d\lambda}\vec V = \nabla_{\vec U} \vec V = 0. \tag{1.26}$$

The last step makes use of the notation for the directional derivative along $\vec U$.

The most important curves in a curved spacetime are the geodesics. Geodesics are lines that are drawn as straight as possible, with as condition that the tangent vectors $\vec U$ of these lines are parallel transported. For a geodesic one has

$$\nabla_{\vec U} \vec U = 0. \tag{1.27}$$

Notice that in a LLF these lines are indeed straight. For the components one has

$$U^\beta \nabla_\beta U^\alpha = U^\beta \partial_\beta U^\alpha + \Gamma^\alpha{}_{\mu\beta}\, U^\mu U^\beta = 0. \tag{1.28}$$

When $\lambda$ is the parameter of the curve, then $U^\alpha = dx^\alpha/d\lambda$ and $U^\beta \partial/\partial x^\beta = d/d\lambda$. We then find

$$\frac{d}{d\lambda}\left(\frac{dx^\alpha}{d\lambda}\right) + \Gamma^\alpha{}_{\mu\beta} \frac{dx^\mu}{d\lambda}\frac{dx^\beta}{d\lambda} = 0. \tag{1.29}$$

Since the Christoffel symbols are known functions of the coordinates $\{x^\alpha\}$, this is a set of non-linear second-order differential equations for $x^\alpha(\lambda)$. These have unique solutions when the initial conditions at $\lambda = \lambda_0$ are given: $x_0^\alpha = x^\alpha(\lambda_0)$ and $U_0^\alpha = (dx^\alpha/d\lambda)_{\lambda_0}$. Thus, by stating the initial position ($x_0^\alpha$) and velocity ($U_0^\alpha$), we obtain a unique geodesic.

By changing the parameter $\lambda$, we mathematically change the curve (but not the path). When $\lambda$ is a parameter of the geodesic, and we define a new parameter $\phi = a\lambda + b$, with $a$ and $b$ constants that do not depend on position on the curve, then we have for $\phi$ also

$$\frac{d^2 x^\alpha}{d\phi^2} + \Gamma^\alpha{}_{\mu\beta} \frac{dx^\mu}{d\phi}\frac{dx^\beta}{d\phi} = 0. \tag{1.30}$$

Only linear transformations of $\lambda$ yield new parameters that satisfy the geodesic equation. We call the parameters $\lambda$ and $\phi$ affine parameters. Finally, we remark that a geodesic is also a curve with extremal length (on a curved surface, for instance, the minimum length between two nearby points). Consequently, we can also derive the expression for a geodesic from the Euler-Lagrange equations. In that case we start from Eq. (1.5). We can also show that the length $s$ along the curve is an affine parameter.
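The geodesic equation (1.29) can be integrated numerically once the Christoffel symbols are known. The sketch below does this for the unit sphere, using the standard values $\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi{}_{\theta\phi} = \cot\theta$ and a classical fourth-order Runge-Kutta step. The integrator, the step count and the initial conditions are choices made for this illustration and are not prescribed by the text. Two properties are checked: the squared length of the tangent vector stays constant along the curve (as it should for an affine parameter), and the path stays on a great circle, that is, in a plane through the center of the sphere.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of Eq. (1.29) on the unit sphere.
    state = (theta, phi, dtheta/dlambda, dphi/dlambda)."""
    theta, _phi, dtheta, dphi = state
    ddtheta = math.sin(theta) * math.cos(theta) * dphi**2          # -Gamma^theta_{phi phi} dphi^2
    ddphi = -2.0 * (math.cos(theta) / math.sin(theta)) * dtheta * dphi  # -2 Gamma^phi_{theta phi} dtheta dphi
    return [dtheta, dphi, ddtheta, ddphi]

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, f):
        return [s + f * ki for s, ki in zip(state, k)]
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(k1, h / 2))
    k3 = geodesic_rhs(shifted(k2, h / 2))
    k4 = geodesic_rhs(shifted(k3, h))
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def integrate(state, lam_end, steps):
    """Integrate from lambda = 0 to lam_end in equal steps."""
    h = lam_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

def speed_squared(state):
    """g(U, U) = dtheta^2 + sin^2(theta) dphi^2 on the unit sphere."""
    theta, _phi, dtheta, dphi = state
    return dtheta**2 + math.sin(theta)**2 * dphi**2

def embed(state):
    """Cartesian position on the unit sphere embedded in 3D space."""
    theta, phi = state[0], state[1]
    return [math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta)]

# Start on the equator with a tilted initial velocity (arbitrary choice).
start = [math.pi / 2, 0.0, 0.5, 1.0]
end = integrate(start, 2.0, 2000)

# Normal of the plane spanned by the initial position (1, 0, 0) and the
# initial 3D velocity (0, 1, -0.5): their cross product is (0, 0.5, 1).
normal = [0.0, 0.5, 1.0]
plane_offset = sum(n * c for n, c in zip(normal, embed(end)))
print(speed_squared(start), speed_squared(end), plane_offset)
```

The initial position and velocity alone determine the whole curve, in line with the uniqueness statement below Eq. (1.29).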

D. Curvature and the Riemann tensor

In Fig. 3 we show two vector fields $\vec A$ and $\vec B$. The vectors are sufficiently small that the curvature of the manifold plays no role in the area where this diagram is drawn. Thus we can assume that the vectors live on the surface instead of in the tangent space. In order to calculate the commutator $[\vec A, \vec B]$, we use a local orthonormal coordinate system. Since we can interpret a vector as a directional derivative, the expression $A^\alpha \partial B^\beta/\partial x^\alpha$ represents the amount by which the vector $\vec B$ changes when it is transported along $\vec A$ (this is represented by the short-dashed line in the upper right corner in Fig. 3). In the same manner $B^\alpha \partial A^\beta/\partial x^\alpha$ represents the change when $\vec A$ is transported along $\vec B$ (this corresponds to the other short-dashed line). For the components of the commutator in a coordinate system one has

$$[\vec A, \vec B] = \left[A^\alpha \frac{\partial}{\partial x^\alpha},\, B^\beta \frac{\partial}{\partial x^\beta}\right] = \left(A^\alpha \frac{\partial B^\beta}{\partial x^\alpha} - B^\alpha \frac{\partial A^\beta}{\partial x^\alpha}\right)\frac{\partial}{\partial x^\beta}. \tag{1.31}$$

Figure 3: The commutator $[\vec A, \vec B]$ of two vector fields. We assume that the vectors are small, such that curvature allows them to live in the manifold.

According to the above equation, the commutator $[\vec A, \vec B]$ corresponds to the difference of the two dashed lines in Fig. 3. It is the fifth line segment that is needed to close the square (this is the geometric meaning of the commutator). Eq. (1.31) is an operator equation, where the final derivative acts on a scalar field (just as in quantum mechanics). In this way we immediately find the components of the commutator in an arbitrary coordinate system: $A^\alpha \partial_\alpha B^\beta - B^\alpha \partial_\alpha A^\beta$. The commutator is useful to make a distinction between a coordinate basis and a non-coordinate basis (also known as a non-holonomic basis) [3].

In the discussion that led to Eq. (1.4), we saw that the effects of curvature become noticeable when we take second-order derivatives (or gradients) of the metric. Riemann's curvature tensor is a measure of the failure of double gradients to close. Take a vector field $\vec A$ and take its double gradients. We then find

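$$\nabla_\mu \nabla_\nu A_\alpha - \nabla_\nu \nabla_\mu A_\alpha = [\nabla_\mu, \nabla_\nu] A_\alpha \equiv -R^\beta{}_{\alpha\mu\nu}\, A_\beta. \tag{1.32}$$

This equation can be seen as the definition of the Riemann tensor. The minus sign appears because $A_\alpha$ carries a lower index; for a vector with an upper index the same commutator gives $[\nabla_\mu, \nabla_\nu] A^\beta = +R^\beta{}_{\alpha\mu\nu}\, A^\alpha$. The Riemann tensor gives the commutator of covariant derivatives. This means that in a curved spacetime we have to be careful with the order in which we take covariant derivatives: they do not commute. We can expand Eq. (1.32) starting from the definition of the covariant derivative,

$$\nabla_\mu \nabla_\nu A_\alpha = \frac{\partial}{\partial x^\mu}(\nabla_\nu A_\alpha) - \Gamma^\beta{}_{\alpha\mu}(\nabla_\nu A_\beta) - \Gamma^\beta{}_{\mu\nu}(\nabla_\beta A_\alpha) \quad\text{and}\quad \nabla_\mu A_\alpha = \frac{\partial}{\partial x^\mu} A_\alpha - \Gamma^\beta{}_{\alpha\mu} A_\beta. \tag{1.33}$$

We now have to differentiate, manipulate indices, etc. At the end we find

$$\nabla_\mu \nabla_\nu A_\alpha - \nabla_\nu \nabla_\mu A_\alpha = -\left(\frac{\partial \Gamma^\beta{}_{\alpha\nu}}{\partial x^\mu} - \frac{\partial \Gamma^\beta{}_{\alpha\mu}}{\partial x^\nu} + \Gamma^\gamma{}_{\alpha\nu}\Gamma^\beta{}_{\gamma\mu} - \Gamma^\gamma{}_{\alpha\mu}\Gamma^\beta{}_{\gamma\nu}\right) A_\beta = -R^\beta{}_{\alpha\mu\nu}\, A_\beta. \tag{1.34}$$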

The Riemann tensor tells us how a vector field changes along a closed path. We can use Eq. (1.18) to express the Riemann tensor in a LLF as

$$R^\alpha{}_{\beta\mu\nu} = \frac{1}{2} g^{\alpha\sigma}\left(\partial_\beta \partial_\mu g_{\sigma\nu} - \partial_\beta \partial_\nu g_{\sigma\mu} + \partial_\sigma \partial_\nu g_{\beta\mu} - \partial_\sigma \partial_\mu g_{\beta\nu}\right). \tag{1.35}$$

We observe that the metric tensor g contains the information about the intrinsic curvature [4]. This curvature becomes manifest when we take second-order derivatives of the metric. With $R_{\alpha\beta\mu\nu} \equiv g_{\alpha\lambda} R^\lambda{}_{\beta\mu\nu}$ and the above expression, we can prove a number of important properties of the Riemann tensor. The Riemann tensor is

• Antisymmetric in the last two indices. One has
$$R(\ ,\ ,\vec A,\vec B) = -R(\ ,\ ,\vec B,\vec A) \quad\text{or}\quad R_{\mu\nu\alpha\beta} = -R_{\mu\nu\beta\alpha}. \tag{1.36}$$

• Antisymmetric in the first two indices. One has
$$R(\vec A,\vec B,\ ,\ ) = -R(\vec B,\vec A,\ ,\ ) \quad\text{or}\quad R_{\mu\nu\alpha\beta} = -R_{\nu\mu\alpha\beta}. \tag{1.37}$$

• Symmetric under exchange of the first and second pair of indices,
$$R(\vec A,\vec B,\vec C,\vec D) = R(\vec C,\vec D,\vec A,\vec B) \quad\text{or}\quad R_{\mu\nu\alpha\beta} = R_{\alpha\beta\mu\nu}. \tag{1.38}$$

• Subject to the so-called Bianchi identities,
$$\nabla_\mu R_{\alpha\beta\gamma\delta} + \nabla_\gamma R_{\alpha\beta\delta\mu} + \nabla_\delta R_{\alpha\beta\mu\gamma} = 0. \tag{1.39}$$

[3] In a coordinate basis the basis vectors are given by the partial derivatives, $\vec e_\alpha = \partial/\partial x^\alpha$, and because partial derivatives commute, one has $[\vec e_\alpha, \vec e_\beta] = 0$. In a non-coordinate basis one has $[\vec e_\mu, \vec e_\nu] = C_{\mu\nu}{}^{\alpha}\, \vec e_\alpha$, with $C_{\mu\nu}{}^{\alpha}$ the so-called commutation coefficients. A coordinate basis is often useful for carrying out calculations, while a non-coordinate basis can be useful for the interpretation of results.

[4] Apart from intrinsic curvature a manifold can also possess extrinsic curvature. Take for example a piece of paper that has no intrinsic curvature, and roll it up into a cylinder. This cylinder has extrinsic curvature, which describes the embedding of a flat sheet of paper in 3D space. GR says nothing about the higher-dimensional spaces in which spacetime may be embedded. GR only deals with the description of curvature measurable within the manifold itself, and this corresponds to the intrinsic curvature of spacetime.

The algebraic symmetries (1.36) to (1.38), together with the cyclic identity $R_{\alpha\beta\gamma\delta} + R_{\alpha\gamma\delta\beta} + R_{\alpha\delta\beta\gamma} = 0$, reduce the $4 \times 4 \times 4 \times 4 = 256$ components of the Riemann tensor to 20 independent ones. The Ricci curvature tensor (Ricci tensor) is defined as the contraction of the Riemann tensor. One has

$$R_{\alpha\beta} \equiv R^\mu{}_{\alpha\mu\beta}. \tag{1.40}$$

For example, in the case of the surface of the Earth this tensor also contains information about the curvature, but in the form of the Riemann tensor integrated over angles. Furthermore, one can show that the Ricci tensor is symmetric. Finally, we have the scalar curvature (the Ricci scalar), defined by

$$R = R^\alpha{}_\alpha. \tag{1.41}$$

We have now defined the tensors we need for the description of phenomena in GR. An impressive mathematical apparatus has been created and we are going to put this to first use in order to pose the field equations (the so-called Einstein equations) of GR. We will try to make this plausible through an analogy with the Newtonian description.
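To see the definitions (1.34), (1.40) and (1.41) at work, the sketch below evaluates the combination of Christoffel symbols that appears in the bracket of Eq. (1.34) for the unit sphere, and then contracts it to the Ricci tensor and the scalar curvature. The Christoffel symbols of the sphere are entered in closed form (standard values) and their derivatives are approximated with central differences; the function names and the step size are assumptions of this example. For a sphere of radius 1 one expects $R^\theta{}_{\phi\theta\phi} = \sin^2\theta$ and a scalar curvature of 2.

```python
import math

def christoffel_sphere(x):
    """Closed-form Gamma^b_{an} of the unit sphere, stored as gamma[b][a][n],
    with coordinates x = (theta, phi)."""
    theta = x[0]
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -math.sin(theta) * math.cos(theta)   # Gamma^theta_{phi phi}
    cot = math.cos(theta) / math.sin(theta)
    gamma[1][0][1] = cot                                  # Gamma^phi_{theta phi}
    gamma[1][1][0] = cot                                  # Gamma^phi_{phi theta}
    return gamma

def riemann(x, h=1e-5):
    """R^b_{a m v} from the bracket of Eq. (1.34), returned as riem[b][a][m][v]."""
    n = 2
    gam = christoffel_sphere(x)
    dgam = []  # dgam[m][b][a][k] = d Gamma^b_{ak} / d x^m
    for m in range(n):
        xp, xm = list(x), list(x)
        xp[m] += h
        xm[m] -= h
        gp, gm = christoffel_sphere(xp), christoffel_sphere(xm)
        dgam.append([[[(gp[b][a][k] - gm[b][a][k]) / (2 * h) for k in range(n)]
                      for a in range(n)] for b in range(n)])
    return [[[[dgam[m][b][a][v] - dgam[v][b][a][m]
               + sum(gam[c][a][v] * gam[b][c][m] - gam[c][a][m] * gam[b][c][v]
                     for c in range(n))
               for v in range(n)] for m in range(n)] for a in range(n)]
            for b in range(n)]

def ricci(riem):
    """R_{ab} = R^m_{a m b}, Eq. (1.40)."""
    n = 2
    return [[sum(riem[m][a][m][b] for m in range(n)) for b in range(n)]
            for a in range(n)]

theta0 = 0.9
riem = riemann([theta0, 0.2])
ric = ricci(riem)
# Scalar curvature, Eq. (1.41): R = g^{ab} R_{ab}, with g^{ab} = diag(1, 1/sin^2(theta)).
scalar = ric[0][0] + ric[1][1] / math.sin(theta0)**2
print(riem[0][1][0][1], math.sin(theta0)**2, scalar)
```

The antisymmetry in the last two indices, Eq. (1.36), is visible directly in the output array: `riem[0][1][1][0]` is the negative of `riem[0][1][0][1]`.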

E. Newtonian description of tidal forces

We try to find a measure of the curvature of spacetime. We start our experiment by dropping a test particle. We decide as observer [5] to go in free fall along with the particle (LLF) and observe that the particle moves along a straight line in spacetime (only in the time direction). There is nothing in the motion of a single particle that betrays curvature. Indeed, in a free-falling coordinate system, the particle is at rest. A single particle is insufficient to discover effects of curvature.

[5] For simplicity we assume that as observer we do not influence the process. Most importantly, we assume that we do not introduce gravitational forces or cause curvature of our own.

Next, we drop two particles. We will study the tidal force on Earth from the perspective of observers that free-fall (LLF) together with the particles. Such observers fall in a straight line towards the center of the Earth. Fig. 4 outlines the situation for two free-falling particles P and Q, and we observe that both particles follow paths that lead to the center of the Earth. From the perspective of the observer that is in free fall with the particles, we see that the particles move towards each other. This is caused by the differential gravitational acceleration of the particles through what are called tidal forces. According to Newton both paths intersect because of gravitation, while according to Einstein this occurs because spacetime is curved. What Newton calls gravitation is called curvature of spacetime by Einstein. Gravitation is a property of the curvature of spacetime.

Figure 4: Left: two free-falling particles move along initially parallel paths towards the center of the Earth. There, both paths intersect; right: lines that are initially parallel on the surface of the Earth at the equator intersect at the North pole.

Figure 5: The trajectories of two free-falling particles in a gravitational field $\Phi$. The three-vector $\vec\xi$ measures the distance between the two particles and is a function of time.

We now want to give a mathematical description of this process that is in agreement with Newton's laws. In order to accomplish this we consider Fig. 5. The Newtonian equations of motion for particles P and Q are

$$\left(\frac{d^2 x_j}{dt^2}\right)_{(P)} = -\left(\frac{\partial \Phi}{\partial x^j}\right)_{(P)} \quad\text{and}\quad \left(\frac{d^2 x_j}{dt^2}\right)_{(Q)} = -\left(\frac{\partial \Phi}{\partial x^j}\right)_{(Q)}, \tag{1.42}$$

with $\Phi$ the gravitational potential. We define $\vec\xi$ as the separation between both particles. For parallel trajectories one has $\frac{d\vec\xi}{dt} = 0$. With $\xi_j = (x_j)_{(P)} - (x_j)_{(Q)}$ we find from a Taylor expansion that to leading order in the small separation $\vec\xi$

$$\frac{d^2 \xi_j}{dt^2} = -\left(\frac{\partial^2 \Phi}{\partial x^j \partial x^k}\right) \xi_k = -E_{jk}\, \xi_k \;\to\; E_{jk} = \left(\frac{\partial^2 \Phi}{\partial x^j \partial x^k}\right), \tag{1.43}$$

with E the gravitational tidal tensor. Notice that the metric for the 3D Euclidean space is given by $\delta_{jk} = \mathrm{diag}(1, 1, 1)$ and that there is no difference between lower and upper indices. Eq. (1.43) is called the equation of Newtonian geodesic deviation. According to Newton, the particles move towards each other and we write

$$\frac{d^2 \vec\xi}{dt^2} = -E(\ ,\vec\xi) \tag{1.44}$$

in abstract notation. It is interesting that the field equation of Newtonian gravitation,

$$\nabla^2 \Phi = 4\pi G \rho, \tag{1.45}$$

can be expressed in terms of second derivatives of $\Phi$, which describe the tidal accelerations in Eq. (1.43). There is an analogous connection in GR.
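A short numerical experiment makes the tidal tensor of Eq. (1.43) tangible. The sketch below assumes the potential of a point mass, $\Phi = -GM/r$ (this specific potential, the units with $GM = 1$ and the function names are choices made for the example), and approximates $E_{jk} = \partial_j \partial_k \Phi$ with central finite differences at a point on the z-axis. For this potential one expects $E_{zz} = -2GM/r^3$ and $E_{xx} = E_{yy} = GM/r^3$, so that the trace vanishes outside the mass, in agreement with Eq. (1.45) for $\rho = 0$.

```python
import math

def potential(pos, gm=1.0):
    """Newtonian potential of a point mass at the origin, Phi = -GM / r."""
    return -gm / math.sqrt(sum(c * c for c in pos))

def tidal_tensor(phi, pos, h=1e-3):
    """E_jk = d^2 Phi / dx^j dx^k by central finite differences."""
    def shifted(j, dj, k=None, dk=0.0):
        p = list(pos)
        p[j] += dj
        if k is not None:
            p[k] += dk
        return p
    e = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        for k in range(3):
            if j == k:
                e[j][k] = (phi(shifted(j, h)) - 2 * phi(pos)
                           + phi(shifted(j, -h))) / h**2
            else:
                e[j][k] = (phi(shifted(j, h, k, h)) - phi(shifted(j, h, k, -h))
                           - phi(shifted(j, -h, k, h))
                           + phi(shifted(j, -h, k, -h))) / (4 * h**2)
    return e

# Evaluate at distance r = 2 on the z-axis (units with GM = 1).
E = tidal_tensor(potential, [0.0, 0.0, 2.0])
trace = E[0][0] + E[1][1] + E[2][2]
print(E[0][0], E[2][2], trace)
```

With Eq. (1.43), the negative $E_{zz}$ means that a radial separation grows, while the positive $E_{xx}$ and $E_{yy}$ mean that particles falling side by side approach each other, as sketched in Fig. 4.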

F. The Einstein equations

We now arrive at the heart of GR, the field equations. We will try to make the field equations plausible in a manner that summarizes all previous statements. In Fig. 6 (left diagram) we start with a discussion of the motion of a particle along a worldline. This worldline is parameterized with the proper time $\tau$ on a clock that is carried by the particle. We can denote the position of the particle at a point of the worldline with $P(\tau)$.

Figure 6: Left: the worldline of a particle is a curve $x^\alpha(\tau)$ that can be parameterized with the proper time $\tau$ of the particle. The velocity $\vec U$ is the vector tangent to the curve. Right: we create a coordinate system $\{x^\alpha\}$. The velocity $\vec U$ now has components $U^\alpha = dx^\alpha/d\tau$.

The velocity $\vec U$ is the tangent vector of the curve and is given by

$$\vec U = \frac{dP}{d\tau} = \frac{d}{d\tau}. \tag{1.46}$$

For the velocity in the LLF at point P one has

$$\vec U^2 = \frac{\overrightarrow{dP}\cdot\overrightarrow{dP}}{d\tau^2} = \frac{-d\tau^2}{d\tau^2} = -1, \tag{1.47}$$

where we have used the definition of the metric [6]. Because this equation yields a number (scalar), it is valid in every coordinate system. We see that the four-velocity vector has unit length, $\vec U^2 = -1$, and points in the direction of time. Notice that these definitions do not use any coordinate system. If a coordinate system is available, the components of the velocity are given by

$$U^\alpha = \frac{dx^\alpha}{d\tau}. \tag{1.48}$$

Thus, the components are derivatives of the coordinates themselves [7].

[6] In the LLF $\overrightarrow{dP}$ corresponds to $(\Delta\tau, \vec 0)$, where $\Delta\tau$ is the proper time, measured with an ideal clock. One has that $\overrightarrow{dP}\cdot\overrightarrow{dP} = -(\Delta\tau)^2$.

[7] The above is valid for a particle with non-zero rest mass. Arguing along the same lines, if the particle is a photon, then $U^\alpha = dx^\alpha/d\lambda$, where now $\lambda$ is an arbitrary affine parameter (in this case there is no notion of proper time), and we have $\vec U^2 = 0$.

When a particle is moving freely and no other forces act than those from the curvature of spacetime, then it must move in a straight line. With this we mean as straight as is possible under the influence of curvature. The particle needs to parallel transport its own velocity. One has

$$\nabla_{\vec U} \vec U = 0, \tag{1.49}$$

and this is, as we have already seen in Eq. (1.27), the abstract expression for a geodesic. What this means is that when we go to a local Lorentz frame, the components of the four-velocity stay constant (and for this reason the directional derivative vanishes) when the particle moves over a small distance.

We now investigate how the geodesic equation is written in an arbitrary coordinate system. This is sketched in the right panel of Fig. 6. In this coordinate system the components of $\vec U$ are given by $U^\alpha = dx^\alpha/d\tau$, and we can write the geodesic equation as

$$(\nabla_\mu U^\alpha)\, U^\mu = 0 \;\to\; \left(\partial_\mu U^\alpha + \Gamma^\alpha{}_{\mu\nu} U^\nu\right) U^\mu = 0. \tag{1.50}$$

Notice that $\nabla_\mu U^\alpha$ is the gradient, of which we then take the inner product with the velocity $U^\mu$ to find the derivative of the velocity in the direction of the velocity. This derivative is then set to zero. In the second step we take advantage of the expression of the covariant derivative in terms of components. We find

$$\underbrace{\partial_\mu U^\alpha\, U^\mu}_{\frac{\partial U^\alpha}{\partial x^\mu}\frac{dx^\mu}{d\tau} \,=\, \frac{dU^\alpha}{d\tau} \,=\, \frac{d}{d\tau}\left(\frac{dx^\alpha}{d\tau}\right)} + \Gamma^\alpha{}_{\mu\nu} \underbrace{U^\nu U^\mu}_{\frac{dx^\nu}{d\tau}\frac{dx^\mu}{d\tau}} = 0 \;\to\; \frac{d^2 x^\alpha}{d\tau^2} + \Gamma^\alpha{}_{\mu\nu} \frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} = 0. \tag{1.51}$$

It is important to realize that we have started from the abstract tensor Eq. (1.49) for a geodesic. After defining an arbitrary coordinate system we have written this equation in terms of coordinates, and the result is expression (1.51). This expression yields four ordinary second-order differential equations for the coordinates $x^0(\tau)$, $x^1(\tau)$, $x^2(\tau)$ and $x^3(\tau)$. These equations are coupled through the connection coefficients. Because we are dealing with second-order differential equations, we need two initial conditions, for example at time $\tau = 0$ the values of both $x^\alpha(\tau = 0)$ and $\frac{dx^\alpha}{d\tau}(\tau = 0) = U^\alpha(0)$. After this the worldline of a free particle (geodesic) is fully determined.
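The normalization $\vec U^2 = -1$ of Eq. (1.47) is a scalar statement, so it should also hold when the components $U^\alpha = dx^\alpha/d\tau$ are written out for a particle that moves with respect to the frame. The sketch below checks this in flat spacetime, using the standard special-relativistic result $U^\alpha = \gamma(1, \vec v)$ with $\gamma = (1 - v^2)^{-1/2}$ in units with $c = 1$. This expression, the sample velocity and the function names are assumptions of the example and are not derived in the text.

```python
import math

def four_velocity(v):
    """Components U^alpha = gamma * (1, v) for a three-velocity v (units c = 1)."""
    speed2 = sum(c * c for c in v)
    gamma = 1.0 / math.sqrt(1.0 - speed2)
    return [gamma] + [gamma * c for c in v]

def minkowski_norm(u):
    """eta_{alpha beta} U^alpha U^beta with eta = diag(-1, +1, +1, +1)."""
    return -u[0]**2 + sum(c * c for c in u[1:])

# Sample three-velocity with |v|^2 = 0.5 (arbitrary choice below the speed of light).
U = four_velocity([0.3, -0.4, 0.5])
norm = minkowski_norm(U)
print(U, norm)
```

The time component equals $\gamma = \sqrt 2$ for this velocity, and the norm comes out as $-1$, independent of the direction of motion.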

Figure 7: The worldlines of particles P and Q are parallel initially. Because of curvature both particles move towards each other. The distance between the particles is given by the spatial vector $\vec\xi$.

We consider in Fig. 7 the geodesic distance between two particles P and Q. This constitutes our starting point in going towards the Einstein equations. Suppose we have two particles that at a certain instant (we choose this instant as $\tau = 0$) are at rest with respect to each other. We define the separation vector $\vec\xi$, which points from one particle to the other. Furthermore, particle P has velocity $\vec U$. The demand that the particles are initially at rest with respect to each other amounts to $\nabla_{\vec U} \vec\xi = 0$ at point P at time $\tau = 0$. In addition, we define $\vec\xi$ such that in the LLF of particle P this vector $\vec\xi$ is purely spatial (it is always possible to make this choice). Then $\vec\xi$ is perpendicular to the velocity $\vec U$, as it points in a direction perpendicular to the time direction. One has $\vec U\cdot\vec\xi = 0$ at point P. Summarizing, we demand at time $\tau = 0$

$$\left.\begin{aligned} \nabla_{\vec U}\, \vec\xi &= 0 \\ \vec U\cdot\vec\xi &= 0 \end{aligned}\right\} \quad\text{at point P for } \tau = 0. \tag{1.52}$$

The second derivative $\nabla_{\vec U}\nabla_{\vec U}\, \vec\xi$ does not vanish, since we know that the effects of curvature become visible when we take second-order derivatives of the metric. This means that the geodesics of the particles are forced together or apart (depending on the metric) as time progresses. One has

$$\nabla_{\vec U}\nabla_{\vec U}\, \vec\xi = -R(\ ,\vec U,\vec\xi,\vec U), \tag{1.53}$$

with $R$ the curvature tensor. This equation describes how two initially parallel geodesics increasingly deviate as time progresses, as a result of curvature. The expression follows from Eqs. (1.24) and (1.32). The second derivative $\nabla_{\vec{U}} \nabla_{\vec{U}} \vec{\xi}$ describes the relative acceleration of the particles. In the LLF of particle P at time $\tau = 0$ one has $U^0 = 1$ and $U^i = 0$. Therefore, we expect

$$(\nabla_{\vec{U}} \nabla_{\vec{U}} \vec{\xi})^j = \frac{\partial^2 \xi^j}{\partial t^2} = -R^j{}_{\alpha\beta\gamma}\, U^\alpha \xi^\beta U^\gamma = -R^j{}_{0k0}\, \xi^k, \qquad (1.54)$$

since the velocity $\vec{U}$ only has a non-vanishing time component in the LLF of particle P, while the separation vector $\vec{\xi}$ only has spatial components $k = 1, 2, 3$. In the LLF the equation for the geodesic deviation takes the form

$$\frac{\partial^2 \xi^j}{\partial t^2} = -R^j{}_{0k0}\, \xi^k, \qquad (1.55)$$

while in Newtonian mechanics we have found (see Eq. (1.43)) that

$$\frac{\partial^2 \xi^j}{\partial t^2} = -E_{jk}\, \xi^k. \qquad (1.56)$$

In a LLF the spatial part of the metric is Cartesian ($\delta_{ij} = \mathrm{diag}(1, 1, 1)$) and the position of the spatial indices is irrelevant. Comparing both expressions yields

$$R_{j0k0} = E_{jk} = \frac{\partial^2 \Phi}{\partial x^j \partial x^k}. \qquad (1.57)$$

We can identify part of the curvature tensor with second derivatives of the Newtonian gravitational potential. According to Newton one has

$$\nabla^2 \Phi = 4\pi G \rho \quad \rightarrow \quad \partial_j \partial_k \Phi\, \delta^{jk} = E_{jk}\, \delta^{jk} = E^j{}_j, \qquad (1.58)$$

and we find for the trace of the gravitational tidal tensor $E^j{}_j = 4\pi G \rho$. In analogy one might expect that in GR one has

$$R^j{}_{0j0} = 4\pi G \rho \; ? \qquad (1.59)$$

as a first guess. However, there is a fundamental problem with Eq. (1.59). It should be an expression that does not depend on the choice of coordinate system, whereas we have constructed the equation in a special system: the LLF. What we need to do is find a relation between tensors. In this context we note that in the LLF one has $R_{0000} = 0$ and $R^0{}_{000} = 0$ because of antisymmetry. Thus one has $R^j{}_{0j0} = 4\pi G \rho \rightarrow R^\mu{}_{0\mu 0} = 4\pi G \rho$. We are still in the LLF (note that also $R_{00} = 4\pi G \rho$, with $R_{00}$ a component of the Ricci tensor). There is another difficulty with Eq. (1.59): on the left of the equal sign we have two indices (which both happen to be 0), while on the right there are none. Thus, one might expect that

$$R_{\alpha\beta} = 4\pi G\, T_{\alpha\beta} \; ? \qquad (1.60)$$

Here, $T_{\alpha\beta}$ represents the stress-energy tensor, with $T_{00} = \rho$ (and this is often the dominating term in the LLF). Einstein made this guess already in 1912, but it is incorrect! These equations have built-in inconsistencies. It is important to understand what is wrong, and it can be explained as follows. Consider the Riemann tensor. Schematically,

$$R^\delta{}_{\alpha\beta\gamma} \approx \partial^\delta \partial_\gamma g_{\alpha\beta} + \text{non-linear terms}. \qquad (1.61)$$

When we contract the first and third index, we obtain

$$R_{\alpha\gamma} \approx \partial^\beta \partial_\gamma g_{\alpha\beta} + \text{non-linear terms}. \qquad (1.62)$$

We see that the proposed equations (1.60) constitute a set of 10 partial differential equations for the 10 components of the metric $g_{\alpha\beta}$ (since the metric is symmetric in $\alpha$ and $\beta$). Also the Ricci tensor is symmetric. This may all appear fine, but we are at liberty to choose the coordinate system in which we are going to work out the equations. We have the freedom to choose $x^0(P)$, $x^1(P)$, $x^2(P)$ and $x^3(P)$. We can use this freedom to set 4 of the 10 components of $g_{\alpha\beta}$, viewed as functions of the coordinates, equal to whatever we like (while preserving the signature), for example $g_{00} = -1$, $g_{01} = g_{02} = g_{03} = 0$. However, our equations (1.60) do not allow this, as we would have 10 partial differential equations for 6 unknowns. What we need are 6 equations for 6 unknowns.

Before we proceed with our quest for the Einstein equations, two remarks are in order. The first remark has to do with the Bianchi identities, $\nabla_\mu R_{\alpha\beta\gamma\delta} + \ldots = 0$. When we define the Einstein tensor

$$G_{\alpha\beta} \equiv R_{\alpha\beta} - \frac{1}{2} R\, g_{\alpha\beta}, \qquad (1.63)$$

with Rαβ the Ricci tensor and R the scalar curvature, then the Bianchi identities ensure that the divergence of the Einstein tensor is equal to zero,

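$$\nabla_\beta G^{\alpha\beta} = 0. \qquad (1.64)$$

The second remark pertains to the well-known conservation laws for energy and momentum. In a LLF one has

$$\partial_\beta T^{\alpha\beta} = 0 \quad \rightarrow \quad \left\{ \begin{array}{l} \dfrac{\partial T^{00}}{\partial t} + \dfrac{\partial T^{0j}}{\partial x^j} = 0, \\[2ex] \dfrac{\partial T^{j0}}{\partial t} + \dfrac{\partial T^{jk}}{\partial x^k} = 0. \end{array} \right. \qquad (1.65)$$

Note that $\partial T^{0j} / \partial x^j$ is the spatial divergence, and conservation of energy states $\partial \rho / \partial t + \mathrm{div}\, \vec{J} = 0$, with $\vec{J}$ the mass-energy flux. In the same manner, $\partial T^{j0} / \partial t$ represents the rate of change of the momentum density and $\partial T^{jk} / \partial x^k$ the divergence of the momentum flux. Since we only take first derivatives, what is valid in flat space in the LLF is also valid for curved spacetime. In this manner we deduce the tensor equation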

$$\nabla_\beta T^{\alpha\beta} = 0. \qquad (1.66)$$

It seems reasonable to assume that Nature has chosen

$$G^{\alpha\beta} = \frac{8\pi G}{c^4}\, T^{\alpha\beta}. \qquad (1.67)$$

These are the Einstein equations. The proportionality factor $8\pi G / c^4$ can be found by taking the Newtonian limit. Before we impose the Einstein equations, we already know that

$$\nabla_\beta G^{\alpha\beta} = 0 = \frac{8\pi G}{c^4}\, \nabla_\beta T^{\alpha\beta}. \qquad (1.68)$$

These are 4 equations and they are in fact the derivatives (divergences) of the Einstein equations. These 4 identities (the divergences of $G^{\alpha\beta}$ and $T^{\alpha\beta}$ vanish) are already satisfied. This puts 4 constraints on the Einstein equations (also called the field equations), so that the field equations only yield 6 new pieces of information. This is exactly what we need.
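As a numerical illustration of Eqs. (1.56) to (1.58), the following Python sketch evaluates the Newtonian tidal tensor $E_{jk} = \partial^2 \Phi / \partial x^j \partial x^k$ for the potential $\Phi = -GM/r$ of a point mass. The closed form $E_{jk} = GM(\delta_{jk}/r^3 - 3 x_j x_k / r^5)$, the function name `tidal_tensor` and the rounded values for the Earth are assumptions introduced here for illustration; they do not come from the text. Outside the mass one has $\rho = 0$, so the trace $E^j{}_j$ is expected to vanish.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24  # kg (rounded, assumed value)
R_EARTH = 6.371e6   # m  (rounded, assumed value)


def tidal_tensor(mass, x, y, z):
    """Return E_jk = d^2 Phi / dx^j dx^k for Phi = -G*mass/r (point mass at the origin)."""
    pos = (x, y, z)
    r = math.sqrt(x * x + y * y + z * z)
    return [
        [
            G * mass * ((1.0 if j == k else 0.0) / r**3 - 3.0 * pos[j] * pos[k] / r**5)
            for k in range(3)
        ]
        for j in range(3)
    ]


# Evaluate at a point on the x-axis, one Earth radius from the centre.
E = tidal_tensor(M_EARTH, R_EARTH, 0.0, 0.0)
trace = E[0][0] + E[1][1] + E[2][2]

print("radial component     E_xx =", E[0][0], "s^-2")
print("transverse component E_yy =", E[1][1], "s^-2")
print("trace (vanishes in vacuum up to rounding) =", trace)
```

With Eq. (1.56), the negative radial component corresponds to a stretching of the separation along the radial direction, while the positive transverse components correspond to a squeezing.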

G. Weak gravitational fields and the Newtonian limit

It is clear that GR describes gravitation in terms of curvature of spacetime and reduces to SRT for local Lorentz frames. However, it is important to explicitly check that the description reduces to the Newtonian treatment when we select the correct boundary conditions.

Without gravitation, spacetime possesses the Minkowski metric $\eta_{\mu\nu}$. Therefore, weak gravitational fields only cause small curvatures of spacetime. We assume that coordinates exist, such that the metric takes the following form,

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} \quad \text{with} \quad |h_{\mu\nu}| \ll 1. \qquad (1.69)$$

Furthermore, we assume that in this coordinate system the metric is stationary, and that we have ∂0gµν = 0. The worldline of a free-falling particle is given by the geodesic expression

$$\frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\sigma}\, \frac{dx^\nu}{d\tau} \frac{dx^\sigma}{d\tau} = 0. \qquad (1.70)$$

We assume that the particle is moving slowly (non-relativistically), such that for the components of the three-velocity one has $dx^i/dt \ll c$ ($i = 1, 2, 3$), with $t$ defined via $x^0 = ct$. In this manner we demand for $i = 1, 2, 3$

$$\frac{dx^i}{d\tau} \ll \frac{dx^0}{d\tau}. \qquad (1.71)$$

We can neglect the three-velocity and find

$$\frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu{}_{00}\, c^2 \left( \frac{dt}{d\tau} \right)^2 = 0. \qquad (1.72)$$

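We use Eq. (1.18) and find

$$\Gamma^\mu{}_{00} = \frac{1}{2} g^{\kappa\mu} \left( \partial_0 g_{0\kappa} + \partial_0 g_{0\kappa} - \partial_\kappa g_{00} \right) = -\frac{1}{2} g^{\kappa\mu}\, \partial_\kappa g_{00} = -\frac{1}{2} \eta^{\kappa\mu}\, \partial_\kappa h_{00}, \qquad (1.73)$$

where we used Eq. (1.69). The last equality is valid to first order in $h_{\mu\nu}$. Since we assumed a stationary metric,

$$\Gamma^0{}_{00} = 0 \quad \text{and} \quad \Gamma^i{}_{00} = \frac{1}{2} \delta^{ij}\, \partial_j h_{00} \quad \text{with } i = 1, 2, 3. \qquad (1.74)$$

Inserting this in Eq. (1.72) yields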

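$$\frac{d^2 t}{d\tau^2} = 0 \quad \text{and} \quad \frac{d^2 \vec{x}}{d\tau^2} = -\frac{1}{2} c^2 \left( \frac{dt}{d\tau} \right)^2 \nabla h_{00}. \qquad (1.75)$$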

The first equation states that $dt/d\tau$ = constant, and using this we can combine the two expressions. This gives the following equation of motion for the particle,

$$\frac{d^2 \vec{x}}{dt^2} = -\frac{1}{2} c^2\, \nabla h_{00}. \qquad (1.76)$$

When we compare this equation with the Newtonian expression for the motion of a particle in a gravitational field (see Eq. (1.42)), we conclude that the expressions are identical when we identify $h_{00} = 2\Phi/c^2$. We find that for a slowly moving particle, GR is equivalent to the Newtonian description when the metric is given by

$$g_{00} = 1 + h_{00} = \left( 1 + \frac{2\Phi}{c^2} \right). \qquad (1.77)$$

We can estimate this correction to the Minkowski metric, since $\Phi/c^2 = -GM/(c^2 r)$, and find $-10^{-9}$ at the surface of the Earth, $-10^{-6}$ at the surface of the Sun, and $-10^{-4}$ at the surface of a white dwarf. We conclude that the weak-field limit is an excellent approximation. Eq. (1.77) also shows that spacetime curvature in general causes the time coordinate $t$ to differ from the proper time. Consider a clock at rest at a certain point in our coordinate system, so that $dx^i/dt = 0$. The proper time interval $d\tau$ between two ticks of this clock is given by $c^2 d\tau^2 = g_{\mu\nu}\, dx^\mu dx^\nu = g_{00}\, c^2 dt^2$, and we find

$$d\tau = \left( 1 + \frac{2\Phi}{c^2} \right)^{1/2} dt. \qquad (1.78)$$

This gives the interval in proper time $d\tau$ that corresponds to an interval $dt$ in coordinate time for a stationary observer in the vicinity of a massive object, in a region with gravitational potential $\Phi$. Since $\Phi$ is negative, this proper time interval is shorter than the corresponding interval for a stationary observer at large distance from the object, where $\Phi \rightarrow 0$ and thus $d\tau = dt$. The spacetime interval is given by

Figure 8: Trajectories of a ball and a bullet in space. Seen in a laboratory the two trajectories have different curvature.

$$ds^2 = -\left( 1 + \frac{2\Phi}{c^2} \right) (c\, dt)^2 + dx^2 + dy^2 + dz^2. \qquad (1.79)$$

This expression describes a geometry of spacetime in which particles move on geodesics in the same manner as particles in a flat space where the Newtonian force of gravity is active. We have found a curved spacetime picture for Newtonian gravitation. The curvature is solely in the time direction. Curvature in time is nothing but the gravitational redshift: time proceeds at different rates at different locations, thus time is curved. This gravitational redshift fully determines the trajectories of particles in a gravitational field. Newtonian gravitation corresponds solely to a curvature of time.

Perhaps the above is counter-intuitive, since nothing seems more natural than the idea that gravitation is a manifestation of the curvature of space. Look for example at the trajectories of two objects in space, as shown in Fig. 8. One of the objects is a ball that is moving with a relatively low speed of 5 m/s; it reaches a height of 5 m. The other object is the bullet from a gun. This bullet moves at a much higher speed (500 m/s). When we study the figure, it seems that the orbit of the ball is more strongly curved than that of the bullet. However, we should not look at the curvature of space, but at the curvature of spacetime. To accomplish this we redraw the trajectories in Fig. 9, but now in Minkowski spacetime. We observe that the trajectories of ball and bullet have a similar curvature in spacetime. However, in reality none of the trajectories has any curvature! They appear curved because we have forgotten that the spacetime in which they are drawn is itself curved. The curvature of spacetime is exactly such that the orbits themselves are completely straight: they are geodesics.

Figure 9: Trajectories of a ball and a bullet in spacetime. Seen in a laboratory both trajectories have the same curvature. We compare the orbital length to the arc length of the circle: (radius) = (horizontal distance)$^2$ / (8 × height).
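Two numerical claims of this section can be checked with a short Python sketch. The first part evaluates $\Phi/c^2 = -GM/(c^2 r)$ at the surface of the Earth, the Sun and a white dwarf. The second part applies the rule (radius) = (horizontal distance)$^2$ / (8 × height) from the caption of Fig. 9 to trajectories drawn in spacetime, where the "horizontal distance" is taken to be $c$ times the flight time. All masses, radii and flight times below are rounded or assumed values chosen for illustration (in particular the white dwarf and the flight times of 2 s and 0.02 s), and the small spatial displacement is neglected compared to $ct$.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
G_SURFACE = 9.81   # gravitational acceleration at the Earth's surface, m/s^2

# (mass in kg, radius in m); the white-dwarf numbers are an assumed typical case
BODIES = {
    "Earth": (5.972e24, 6.371e6),
    "Sun": (1.989e30, 6.957e8),
    "white dwarf": (1.989e30, 6.0e6),
}


def potential_over_c2(mass, radius):
    """Dimensionless Newtonian potential Phi/c^2 = -G M / (c^2 r) at the surface."""
    return -G * mass / (C**2 * radius)


def spacetime_radius(flight_time):
    """Radius of curvature of a thrown object's worldline, using the Fig. 9 rule.

    The height reached by a projectile with total flight time T is g T^2 / 8,
    and the 'horizontal distance' in the spacetime diagram is taken as c T.
    """
    height = G_SURFACE * flight_time**2 / 8.0
    horizontal = C * flight_time
    return horizontal**2 / (8.0 * height)


for name, (mass, radius) in BODIES.items():
    print(f"{name:12s} Phi/c^2 = {potential_over_c2(mass, radius):.1e}")

for label, flight_time in (("ball", 2.0), ("bullet", 0.02)):
    print(f"{label:7s} radius of curvature = {spacetime_radius(flight_time):.2e} m")
```

In this sketch the flight time cancels, so both worldlines have the same radius of curvature $c^2/g$, which is the statement made in the caption of Fig. 9.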

H. Weak-field limit of the Einstein equations

The Einstein equations (1.67) state that the Einstein tensor is proportional to the energy-momentum tensor, $G_{\mu\nu} = \text{constant} \times T_{\mu\nu}$. We want to determine the proportionality factor by taking the weak-field limit. For this we only need to consider the 00-component. We find

$$R_{00} - \frac{1}{2} R\, g_{00} = \text{constant} \times T_{00}. \qquad (1.80)$$

In the weak-field limit spacetime is only slightly curved and coordinates exist for which $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ with $|h_{\mu\nu}| \ll 1$, while the metric is stationary. Thus, we have $g_{00} \approx 1$. In addition, we can use definition (1.34) of the curvature tensor to find $R_{00}$. One has

$$R_{00} = \partial_0 \Gamma^\mu{}_{0\mu} - \partial_\mu \Gamma^\mu{}_{00} + \Gamma^\nu{}_{0\mu} \Gamma^\mu{}_{\nu 0} - \Gamma^\nu{}_{00} \Gamma^\mu{}_{\nu\mu}. \qquad (1.81)$$

In our coordinate system the $\Gamma^\mu{}_{\nu\sigma}$ are small, so that we can neglect the last two terms at first order in $h_{\mu\nu}$. In addition, the metric is stationary in our coordinate system and we have

$$R_{00} \approx -\partial_i \Gamma^i{}_{00}. \qquad (1.82)$$

In our discussion of the Newtonian limit, we found in Eq. (1.74) that $\Gamma^i{}_{00} \approx \frac{1}{2} \delta^{ij} \partial_j h_{00}$ to first order in $h_{\mu\nu}$. Thus, we have

$$R_{00} \approx -\frac{1}{2} \delta^{ij}\, \partial_i \partial_j h_{00}. \qquad (1.83)$$

We can now substitute our approximations for $g_{00}$ and $R_{00}$ in Eq. (1.80) and find that in the weak-field limit

$$\frac{1}{2} \delta^{ij}\, \partial_i \partial_j h_{00} \approx \text{constant} \times \left( T_{00} - \frac{1}{2} T \right). \qquad (1.84)$$

Here, we used that $R = \text{constant} \times T$ with $T \equiv T^\mu{}_\mu$, which follows by writing Eq. (1.67) with mixed components, $R^\mu{}_\nu - \frac{1}{2} \delta^\mu{}_\nu R = \text{constant} \times T^\mu{}_\nu$, and performing a contraction by setting $\mu = \nu$ (note that $\delta^\mu{}_\mu = 4$). In order to proceed we have to make an assumption about the kind of matter that produces the weak gravitational field. For this we take a perfect fluid. For most classical matter distributions one has $P/c^2 \ll \rho$ and we can take the energy-momentum tensor for dust. One has

$$T_{\mu\nu} = \rho\, U_\mu U_\nu, \qquad (1.85)$$

and in this manner we find $T = \rho c^2$. Furthermore, we assume that the particles that constitute the fluid have velocities $\vec{U}$ in our coordinate system that are small compared to $c$. We assume that $\gamma_U \approx 1$ and thus $U_0 \approx c$. Eq. (1.84) then reduces to

$$\frac{1}{2} \delta^{ij}\, \partial_i \partial_j h_{00} \approx \frac{1}{2}\, \text{constant} \times \rho c^2. \qquad (1.86)$$

We note that $\delta^{ij} \partial_i \partial_j = \nabla^2$. In addition, from Eq. (1.77) we have $h_{00} = 2\Phi/c^2$, with $\Phi$ the gravitational potential. Choosing the constant of proportionality as $8\pi G/c^4$, we retrieve the Poisson equation for Newtonian gravitation,

$$\nabla^2 \Phi \approx 4\pi G \rho. \qquad (1.87)$$

This identification verifies our assumption that the proportionality factor between the Einstein tensor and the energy-momentum tensor equals $8\pi G/c^4$.

I. The cosmological constant

The Einstein equations (1.67) are not unique. Einstein quickly discovered that it is impossible to construct a static model of the Universe on the basis of the field equations. These equations always yield solutions that correspond to an expanding or contracting Universe. When Einstein carried out this work in 1916, only our Milky Way was known, which resembles a uniform distribution of fixed stars. By introducing a cosmological constant $\Lambda$, Einstein was able to create static models of the Universe (later all these solutions turned out to be unstable). Subsequently, it was discovered that the Milky Way is only one of many galaxies, while in 1929 Hubble discovered the expansion of the Universe. He determined distances and redshifts of neighboring galaxies and concluded that the Universe is expanding; see Fig. 10. The cosmological constant seemed unnecessary. If Einstein had put more trust in his equations, he could have predicted the expansion of the Universe! Today, we have a different view on these issues; more about this later.

Figure 10: Left: the velocity of a galaxy can be determined from the Doppler effect. The distance is determined from the luminosity of standard candles; right: it appears that galaxies are moving away from us with greater speed at increasing distance. The Hubble constant is H0 = 72 km/s/Mpc. Galaxies do not move through space, but drift on the expanding space.

What Einstein noticed was the following. We know that $\nabla^\mu G_{\mu\nu} = 0$ and also $\nabla^\mu T_{\mu\nu} = 0$. In addition, $\nabla_\mu g^{\mu\nu} = 0$. We can therefore add any constant multiple of $g_{\mu\nu}$ to $G_{\mu\nu}$ and still obtain a consistent set of field equations. It is common to denote the constant of proportionality by $\Lambda$, and we then obtain

$$R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R + \Lambda g_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu}, \qquad (1.88)$$

where $\Lambda$ is a new universal constant of nature, which we call the cosmological constant. In this procedure the 'modified Einstein tensor' $G'_{\mu\nu} = G_{\mu\nu} + \Lambda g_{\mu\nu}$ does not vanish anymore when spacetime is flat! Furthermore, $G'_{\mu\nu}$ is no longer an immediate measure of the curvature. By again writing Eq. (1.88) with mixed indices and then performing a contraction, we

obtain $R = -\frac{8\pi G}{c^4}\, T + 4\Lambda$. Inserting this in Eq. (1.88) yields

$$R_{\mu\nu} = \frac{8\pi G}{c^4} \left( T_{\mu\nu} - \frac{1}{2} T g_{\mu\nu} \right) + \Lambda g_{\mu\nu}. \qquad (1.89)$$

We now carry out the same procedure as in section I H and obtain the field equations in the weak-field limit for Newtonian gravitation

$$\nabla^2 \Phi = 4\pi G \rho - \Lambda c^2. \qquad (1.90)$$

For a spherical mass $M$ we obtain for the gravitational field

$$\vec{g} = -\nabla \Phi = -\frac{GM}{r^2}\, \hat{r} + \frac{c^2 \Lambda r}{3}\, \hat{r}, \qquad (1.91)$$

and we conclude that the cosmological term corresponds to a gravitational repulsion, whose strength increases proportionally to $r$.

Today we have a different view of the cosmological constant. Note that the energy-momentum tensor of a perfect fluid is given by

$$T^{\mu\nu} = \left( \rho + \frac{P}{c^2} \right) U^\mu U^\nu + P g^{\mu\nu}. \qquad (1.92)$$

We imagine that a certain 'substance' exists with the curious equation of state $P = -\rho c^2$. We never encountered such a substance, since it has a negative pressure! With Eq. (1.92), the energy-momentum tensor of this substance is given by

$$T_{\mu\nu} = P g_{\mu\nu} = -\rho c^2 g_{\mu\nu}. \qquad (1.93)$$

Here, we note the following. Firstly, the energy-momentum tensor of this substance only depends on the metric tensor: it is a property of the vacuum itself and we denote by $\rho$ the energy density of the vacuum. Secondly, the expression for $T_{\mu\nu}$ has the same form as the cosmological term in Eq. (1.88), once that term is moved to the right-hand side. We can view the cosmological constant as a universal constant that determines the energy density of the vacuum,

$$\rho_{\rm vacuum}\, c^2 = \frac{\Lambda c^4}{8\pi G}. \qquad (1.94)$$

Denoting the energy-momentum tensor of the vacuum by $T^{\rm vacuum}_{\mu\nu} = -\rho_{\rm vacuum}\, c^2 g_{\mu\nu}$, we can write the modified field equations as

$$R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu} = \frac{8\pi G}{c^4} \left( T_{\mu\nu} + T^{\rm vacuum}_{\mu\nu} \right), \qquad (1.95)$$

with $T_{\mu\nu}$ the energy-momentum tensor of matter and radiation.

If it is the case that $\Lambda \neq 0$, then at least it must be small enough that $\rho_{\rm vacuum}$ has negligible gravitational effects ($|\rho_{\rm vacuum}| < \rho_{\rm matter}$) in situations where Newtonian gravitational theory gives a good description of the data. The systems with the smallest densities to which Newton's laws can be applied are small clusters of galaxies. In this manner we can pose the following limit

$$|\rho_{\rm vacuum}| = \frac{|\Lambda| c^2}{8\pi G} \leq \rho_{\rm cluster} \sim 10^{-26}\ \mathrm{g/cm^3} \qquad (1.96)$$

for the magnitude of the cosmological parameter. It is evident that if $\Lambda$ is sufficiently small, it is completely unimportant on the scale of a star. However, as will become clear later on, it need not be negligible compared to the average matter density of the Universe as a whole (since it mostly consists of empty space!), and at very large scales it can have important effects on the evolution of the curvature of the Universe.

How can we calculate the energy density of the vacuum? The simplest calculations sum over the zero-point energies of all quantum fields known in Nature. The resulting answer exceeds the upper limit on $\Lambda$ that we just determined by about 120 orders of magnitude. This is not understood, and a physical principle must exist that makes the cosmological constant small.

Recent data indicate that the cosmological constant does not vanish. The strongest indications come from measurements of distant Type Ia supernovae, which indicate that the expansion of the Universe is currently accelerating. This is outlined in Fig. 11. Without a cosmological constant we expect that the attractive force between all matter in the Universe should slow down the expansion and perhaps even lead to a contraction of the Universe. However, when the cosmological constant does not vanish, the negative pressure of the vacuum can cause the expansion of the Universe to accelerate.

Figure 11: History of the expansion of the Universe. In the past the effect of the mass density was more important than that of the cosmological constant, and this delayed the expansion of the Universe. However, when the volume of the Universe increases, the density decreases, while the effect of the vacuum energy is constant. When the volume is sufficiently large, the Universe will expand forever.
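The upper limit on $\Lambda$ implied by Eqs. (1.94) and (1.96) can be made explicit with a few lines of Python. This is only a sketch: the function name `lambda_from_density` is introduced here for illustration, the constants are rounded, and the cluster density $10^{-26}$ g/cm$^3$ is the order-of-magnitude value quoted in the text, treated as a mass density.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8    # speed of light, m/s
G_PER_CM3_TO_KG_PER_M3 = 1.0e3  # 1 g/cm^3 = 1000 kg/m^3


def lambda_from_density(rho_kg_m3):
    """Invert Eq. (1.94): Lambda = 8 pi G rho_vacuum / c^2, in units of m^-2."""
    return 8.0 * math.pi * G * rho_kg_m3 / C**2


rho_cluster = 1.0e-26 * G_PER_CM3_TO_KG_PER_M3  # kg/m^3
lambda_max = lambda_from_density(rho_cluster)

print(f"rho_cluster ~ {rho_cluster:.1e} kg/m^3")
print(f"|Lambda|   <~ {lambda_max:.1e} m^-2")
```

The resulting bound is of order $10^{-49}$ m$^{-2}$, which shows how small $\Lambda$ must be in ordinary units for Newtonian gravity to work on the scale of galaxy clusters.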

J. Alternative relativistic theories of gravity

The Einstein equations are not unique, as we have seen in the previous sections. It is also possible to construct radically new theories of gravity. We discuss a few of these in what follows.

1. Scalar theories of gravity

In the Newtonian description of gravitation, the gravitational field is represented by the scalar $\Phi$. This field obeys Poisson's equation $\nabla^2 \Phi = 4\pi G \rho$. Because matter can be described relativistically by the energy-momentum tensor $T_{\mu\nu}$, the only scalar with the dimension of mass density that we can construct is $T^\mu{}_\mu$. Furthermore, position and time are components of the 4-vector $x^\mu$ and we can accommodate the time derivative via $\Box^2 \equiv \nabla^\mu \nabla_\mu = -\partial_{ct}^2 + \nabla^2$. A consistent scalar relativistic theory of gravity is given by the field equation

$$\Box^2 \Phi = -\frac{4\pi G}{c^2}\, T^\mu{}_\mu. \qquad (1.97)$$

This theory turned out to be incorrect (for example, it predicted effects for the orbit of Mercury that were not observed). Furthermore, there is no coupling between gravitation and electromagnetism. Therefore, no gravitational redshift or bending of light by matter is included.

2. Brans - Dicke theory

A theory of gravitation based on a vector field can be excluded, since such a theory predicts that massive particles will repel instead of attract. However, it is possible to formulate a relativistic theory that combines scalar, vector and tensor fields. The most important example of this type of theory is the one formulated by Robert Dicke and Carl Brans in 1961. Brans and Dicke started the construction of their theory from the equivalence principle and obtained a description of gravitation in terms of curvature of spacetime. However, instead of treating the gravitational constant $G$ as a constant of Nature, they introduced a scalar field $\phi$ that determines the strength of $G$. This implies that the scalar field $\phi$ determines the strength of the coupling of matter to gravitation. The coupled equations for the scalar field and the gravitational field can be written as

$$\Box^2 \phi = -4\pi\lambda\, (T^M)^\mu{}_\mu, \qquad R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R = \frac{8\pi}{c^4 \phi} \left( T^M_{\mu\nu} + T^\phi_{\mu\nu} \right). \qquad (1.98)$$

We observe that the effects of matter can be represented by the energy-momentum tensor $T^M_{\mu\nu}$ and a coupling constant $\lambda$ that determines the scalar field. The scalar field determines the magnitude of $G$, and the field equations relate the curvature to the energy-momentum tensors of the scalar field ($T^\phi_{\mu\nu}$) and of matter ($T^M_{\mu\nu}$). Historically, the coupling constant is written as $\lambda = 2/(3 + 2\omega)$. In the limit $\omega \rightarrow \infty$ we obtain $\lambda \rightarrow 0$, and $\phi$ is not influenced by the matter distribution. We can then set $\phi$ equal to $\phi = 1/G$. In this limit we also have $T^\phi_{\mu\nu} \rightarrow 0$, and the Brans-Dicke theory reduces to that of Einstein.

The Brans-Dicke theory is important, since it shows that it is possible to develop alternative theories that are consistent with the equivalence principle. One of the predictions of the Brans-Dicke theory is that the effective gravitational constant $G$ is a function of time and is determined by the scalar field $\phi$. A change in $G$ could influence the orbits of planets

and a reasonable conservative conclusion from the data is that $\omega \geq 500$. This seems to indicate that Einstein's theory is the correct theory of gravitation at least at relatively low energies.
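To get a feeling for what the bound $\omega \geq 500$ means for the coupling constant $\lambda = 2/(3 + 2\omega)$ quoted above, the following Python sketch tabulates $\lambda$ for a few values of $\omega$. The function name `coupling` and the sample values of $\omega$ other than 500 are chosen here purely for illustration.

```python
from fractions import Fraction


def coupling(omega):
    """Brans-Dicke coupling lambda = 2 / (3 + 2*omega) for an integer omega."""
    return Fraction(2, 3 + 2 * omega)


for omega in (5, 50, 500, 5000):
    lam = coupling(omega)
    print(f"omega = {omega:5d}  ->  lambda = {lam} ~ {float(lam):.2e}")
```

For $\omega = 500$ the coupling is $2/1003$, so the scalar field is sourced by matter roughly five hundred times more weakly than in the case $\lambda \sim 1$, and the coupling keeps decreasing towards the Einstein limit as $\omega$ grows.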

3. Torsion theories

In our discussion of curved spacetime we assumed that the manifold has no torsion. This is not a necessary demand, and we can generalize the discussion to a spacetime with a torsion tensor,

$$T^\mu{}_{\nu\sigma} = \Gamma^\mu{}_{\nu\sigma} - \Gamma^\mu{}_{\sigma\nu}, \qquad (1.99)$$

that does not vanish. Typically, torsion is caused by the quantum mechanical spin of particles. Such theories are complicated mathematically. Gravitational theories with spacetime torsion are often called Einstein-Cartan theories and have been investigated extensively. So far no evidence for torsion has been observed.

II. COSMOLOGY

A. The Universe at large scales

General relativity has had an important impact on our understanding of the Universe. Indeed, when looking at the Universe in its entirety, it is necessary to invoke general relativity. Roughly speaking, Newtonian gravity is adequate as long as the mass $M$ of a system is small compared to its size $R$, or more precisely when the natural length scale $GM/c^2$ associated with the mass is small compared to $R$: $GM/(c^2 R) \ll 1$. General relativity becomes important when this condition breaks down. This could happen for moderate values of $GM/c^2$ if $R$ is small, which is the case with neutron stars and black holes, as we shall discuss later. The other case is cosmology: if space is filled with matter of roughly the same density everywhere, then the mass increases with $R^3$, and $GM/(c^2 R)$ must eventually get large.

Suppose we start increasing $R$ from the center of the Sun. The Sun is not a relativistic object in the above sense (check this!), and once $R > R_\odot$, $M$ hardly increases until the next star is reached. Going on in this way, $R$ will eventually encompass the Milky Way galaxy, which contains some $10^{11}$ stars in a radius of roughly 15 kpc. Hence, for the galaxy, $GM/(c^2 R) \sim 10^{-6}$, so that the dynamics of the galaxy does not need a relativistic description. Galaxies themselves congregate in clusters, with a typical diameter of about a Mpc. This is much smaller than the distances our telescopes are capable of seeing, which are of the order of $10^5$ Mpc. As it turns out, averaging over distances of $10^3$ Mpc the Universe appears to be more or less the same everywhere. The density of the Universe at this scale is not known very well, but has been estimated to be at least$^{8}$ $\rho = 10^{-28}\ \mathrm{kg\,m^{-3}}$. With this density, $GM/c^2 = 4\pi G \rho R^3 / (3c^2)$ starts becoming significantly larger than $R$ for $R \sim 10^{27}$ m $\sim 10^4$ Mpc. This is comparable to the distances to which our telescopes can see, hence we need general relativity.

The length scale at which the Universe starts appearing uniform, $10^3$ Mpc, is much smaller than the distance to which we can see, about $10^5$ Mpc. Hence at the largest length scales, the Universe appears homogeneous. Moreover, at such scales the Universe also appears to be isotropic about every point: local observations will not reveal great differences between different directions in the sky. Finally, as shown by Edwin Hubble, the Universe is expanding. It could have been the case that at a given point, one would see a larger recessional velocity in one direction than in some other direction. This is not what we see at our location: all galaxies appear to be receding from us with a velocity $v$ related to their distance $d$ by$^{9}$

$$v = H_0\, d, \qquad (2.1)$$

where $H_0 \simeq 70\ \mathrm{km\,s^{-1}\,Mpc^{-1}}$. Now, assuming that we are at no special location in the Universe, everything should be receding from everybody else, and isotropy continues to hold.
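The order-of-magnitude estimates of this section can be reproduced with a short Python sketch. It evaluates the compactness $GM/(c^2 R)$ for the Sun and for the Milky Way, the radius at which a uniform density $\rho = 10^{-28}$ kg m$^{-3}$ gives $GM/(c^2 R) = 1$, and the recession velocity from Hubble's law (2.1). The constants are rounded, the function names are introduced here for illustration, all stars are assumed to have one solar mass, and a uniform sphere with $M = \frac{4}{3}\pi \rho R^3$ is assumed.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light, m/s
M_SUN = 1.989e30  # kg
R_SUN = 6.957e8   # m
KPC = 3.086e19    # m
MPC = 3.086e22    # m


def compactness(mass, radius):
    """Dimensionless ratio G M / (c^2 R); Newtonian gravity is adequate when << 1."""
    return G * mass / (C**2 * radius)


def relativistic_radius(rho):
    """Radius at which G M / (c^2 R) = 1 for a uniform sphere, M = (4/3) pi rho R^3."""
    return math.sqrt(3.0 * C**2 / (4.0 * math.pi * G * rho))


def hubble_velocity(distance_mpc, h0=70.0):
    """Recession velocity in km/s from v = H0 d, with H0 in km/s/Mpc."""
    return h0 * distance_mpc


print(f"Sun:       GM/(c^2 R) = {compactness(M_SUN, R_SUN):.1e}")
print(f"Milky Way: GM/(c^2 R) = {compactness(1e11 * M_SUN, 15 * KPC):.1e}")

r_rel = relativistic_radius(1e-28)
print(f"Universe:  GM/(c^2 R) = 1 at R = {r_rel:.1e} m = {r_rel / MPC:.1e} Mpc")
print(f"Hubble's law: v(100 Mpc) = {hubble_velocity(100.0):.0f} km/s")
```

Both the Sun and the galaxy come out far below unity, whereas the uniform-density estimate reaches unity at a radius of order $10^{27}$ m, in line with the text.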

8 The reason for the uncertainty is that so far we have only been able to study the Universe in the electromagnetic spectrum, so that a priori we can only “count” the mass that emits electromagnetic radiation. However, there is indirect evidence of copious amounts of dark matter, and the number we give for the density is likely to be a severe underestimate.

9 This relationship, called Hubble’s law, is only valid for relatively small distances (hundreds of Mpc), after which it must be replaced by a different one which properly takes into account the curvature of the Universe, about which more will be said later.

B. The topology of the Universe

Even if the Universe appears homogeneous and isotropic to observers in galaxies, it is possible in principle to have an observer moving at relativistic velocities with respect to these. Such an observer would not see galaxies recede in all directions according to Eq. (2.1). The assumption of homogeneity and isotropy must refer to a particular choice of time coordinate $t$ such that hypersurfaces of constant $t$ are homogeneous and isotropic. Thus, there is a “preferred” way of slicing the 4-dimensional spacetime of the Universe into 3-dimensional slices, with galaxies as “markers”. In what follows, we will assume that

1. Spacetime can be sliced into hypersurfaces of constant time $t$ which are perfectly homogeneous and isotropic; and
2. The mean rest frame of the galaxies agrees with the definition of simultaneity implied by this particular time coordinate $t$.

One can then introduce spatial coordinates $x^i$, $i = 1, 2, 3$, “anchored” to galaxies: each galaxy has fixed $x^i$.$^{10}$ These are called co-moving coordinates. Due to homogeneity, the spatial geometry of a constant-$t$ hypersurface can only depend on $t$, not on the spatial coordinates, so that

$$dl^2 = \tilde{\gamma}_{ij}(t)\, dx^i dx^j. \qquad (2.2)$$

Because of isotropy, all the components of $\tilde{\gamma}_{ij}(t)$ must increase at the same rate, without there being a preferred direction, hence

$$dl^2 = a^2(t)\, \gamma_{ij}\, dx^i dx^j, \qquad (2.3)$$

where $\gamma_{ij}$ is independent of time. The line element for spacetime as a whole is$^{11}$

$$ds^2 = -dt^2 + a^2(t)\, \gamma_{ij}\, dx^i dx^j. \qquad (2.4)$$

Note that there cannot be a cross term involving $dt\, dx^i$ because we would like the notion of simultaneity defined by $t$ = const to agree with that of the local Lorentz frame of a galaxy. Since we also set $g_{00} = -1$, $t$ is the proper time along a line $x^i$ = const, i.e., it is the proper time of the galaxies. Homogeneity and isotropy also imply that on the $t$ = const slices we can choose an origin wherever we please, and $\gamma_{ij}$ must be spherically symmetric about that origin. It is not difficult to see that the most general spherically symmetric line element can be expressed as

$$\gamma_{ij}\, dx^i dx^j = e^{2f(r)} dr^2 + r^2 d\Omega^2, \qquad (2.5)$$

with $d\Omega^2 = d\theta^2 + \sin^2(\theta)\, d\phi^2$. Again demanding homogeneity, we want the scalar curvature ${}^{(3)}R$ of $\gamma_{ij}$ to be a constant. A straightforward calculation shows that

$${}^{(3)}R = -2 \left[ -\frac{1}{r^2} e^{2f} \left( 1 - e^{-2f} \right) e^{-2f} - 2 r e^{-2f} f' r^{-2} \right] = \frac{2}{r^2} \left[ 1 - e^{-2f} (1 - 2 r f') \right] = \frac{2}{r^2} \left[ r \left( 1 - e^{-2f} \right) \right]', \qquad (2.6)$$

10 This is not in contradiction with the expansion of the Universe; it simply implies that the spatial coordinate grid expands in tandem with it.

11 For computational convenience, in this chapter we choose units such that $c = G = 1$.

where a prime denotes $d/dr$. Setting this equal to some constant $K$ we get

$$K = \frac{2}{r^2} \left[ r \left( 1 - e^{-2f} \right) \right]' \qquad (2.7)$$

which can be integrated to

$$\gamma_{rr} = e^{2f} = \left[ 1 - \frac{1}{6} K r^2 - \frac{A}{r} \right]^{-1} \qquad (2.8)$$

where $A$ is an integration constant. Demanding local flatness at $r = 0$, $\gamma_{rr}(r = 0) = 1$, we get $A = 0$. Writing $k = K/6$ we arrive at

$$\gamma_{rr} = \frac{1}{1 - k r^2}. \qquad (2.9)$$

Substituting this back into (2.4), we find the Friedmann-Lemaître-Robertson-Walker (FLRW) line element

$$ds^2 = -dt^2 + a^2(t) \left[ \frac{dr^2}{1 - k r^2} + r^2 d\Omega^2 \right]. \qquad (2.10)$$

Now note that whatever the value of $k$, we can always rescale it by a positive factor, with appropriate redefinition of both $r$ and $a(t)$. However, the sign of $k$ cannot be changed. Hence there are three different values of $k$ we need to consider: $k = +1, 0, -1$. We now discuss these possibilities in turn.

• k = 0. Then at any moment in time t0, the spatial line element is

$$dl^2 = d\tilde{r}^2 + \tilde{r}^2 d\Omega^2, \qquad (2.11)$$

where $\tilde{r} = a(t_0)\, r$. This is the metric of a flat, 3-dimensional Euclidean space. All $t$ = const slices are spatially flat, and the geometry of the Universe is non-trivial only through the time evolution of the 4-metric as a whole. This is the spatially flat FLRW model.

• k = +1. Define a coordinate $\chi(r)$ such that

$$d\chi^2 = \frac{dr^2}{1 - r^2}. \qquad (2.12)$$

Solving for $r$ one finds $r = \sin(\chi)$, so that the spatial line element at a time $t_0$ is

$$dl^2 = a^2(t_0) \left[ d\chi^2 + \sin^2(\chi) \left( d\theta^2 + \sin^2(\theta)\, d\phi^2 \right) \right]. \qquad (2.13)$$

This is the metric of a three-sphere of radius $a(t_0)$, i.e., of a 3-dimensional hypersurface in 4-dimensional Euclidean space whose Cartesian coordinates satisfy

$$x^2 + y^2 + z^2 + w^2 = a^2(t_0). \qquad (2.14)$$

The corresponding 4-metric is the closed FLRW model.

• k = −1. A similar coordinate transformation as above now gives

$$dl^2 = a^2(t_0) \left[ d\chi^2 + \sinh^2(\chi)\, d\Omega^2 \right]. \qquad (2.15)$$

The corresponding 4-metric is the open FLRW model. This model has an interesting property. $\chi$ is a radial coordinate, and when we increase it, the circumferences of the two-dimensional spheres coordinatized by $(\theta, \phi)$ increase as $\sinh(\chi)$. But $\sinh(\chi) > \chi$ for all $\chi > 0$, hence the circumferences increase more rapidly with radius than in flat space. The spatial metric $dl^2$ describes a space which cannot be represented as a 3-dimensional hypersurface of a flat, 4-dimensional Euclidean space. The hyperbolic, flat, and spherical geometries are illustrated in Fig. 12.

Figure 12: An illustration of the hyperspherical, flat, and hyperbolic geometries, with one dimension suppressed (in reality they are of course 3D spaces).
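The difference between the three geometries can be made concrete by comparing how the circumference of a circle grows with the radial coordinate $\chi$: it scales as $\sin\chi$ for $k = +1$, as $\chi$ for $k = 0$, and as $\sinh\chi$ for $k = -1$, following Eqs. (2.11), (2.13) and (2.15). The Python sketch below tabulates these factors. The function names are introduced here for illustration, and for $k = 0$ the coordinate $\chi$ is simply identified with $r$.

```python
import math


def radial_factor(chi, k):
    """Return the factor multiplying d(Omega) in the spatial line element.

    k = +1 (closed): sin(chi); k = 0 (flat): chi; k = -1 (open): sinh(chi).
    """
    if k == 1:
        return math.sin(chi)
    if k == 0:
        return chi
    if k == -1:
        return math.sinh(chi)
    raise ValueError("k must be +1, 0 or -1")


def circumference(chi, k, a=1.0):
    """Circumference of a circle at radial coordinate chi, for scale factor a."""
    return 2.0 * math.pi * a * radial_factor(chi, k)


print(" chi    closed     flat      open")
for chi in (0.1, 0.5, 1.0, 2.0, 3.0):
    row = [circumference(chi, k) for k in (1, 0, -1)]
    print(f"{chi:4.1f}  {row[0]:8.3f} {row[1]:8.3f} {row[2]:9.3f}")
```

For small $\chi$ the three columns agree, which reflects local flatness. At larger $\chi$ the closed model lags behind the flat one and eventually shrinks again, while the open model grows faster.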

C. The dynamics of the Universe

We now turn to the dynamics of the FLRW models as given by the Einstein equations. For this we need to specify an energy-momentum tensor $T_{\mu\nu}$. On cosmic scales, galaxies behave as a “gas” of particles to which we can assign an (average) density $\rho$. Neglecting interactions between galaxies, the pressure $P$ can be set to zero. This leads to

$$T_{\mu\nu} = \rho\, u_\mu u_\nu, \qquad (2.16)$$

where $u^\mu$ is the 4-velocity of any given galaxy; in co-moving coordinates, $u^\mu = (1, 0, 0, 0)$. This is the energy-momentum tensor of a perfect fluid with zero pressure. However, other forms of mass-energy are also present in the Universe. For example, the Cosmic Microwave Background represents a thermal distribution of radiation with a temperature of about 3 K. This radiation can also be represented as a perfect fluid, but with non-zero pressure: for massless thermal radiation one has $P = \rho/3$. There may be other contributions to the total energy-momentum tensor, all of which we shall assume to be in the form of a perfect fluid. This leads us to write

$$T_{\mu\nu} = \rho\, u_\mu u_\nu + P \left( g_{\mu\nu} + u_\mu u_\nu \right). \qquad (2.17)$$

This we substitute into the Einstein equations,

$$G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R = 8\pi T_{\mu\nu}, \qquad (2.18)$$

29 which leads to

Gtt = 8πTtt = 8πρ, 2 Gxx = 8πTxx = 8πa P. (2.19)
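The components $T_{tt} = \rho$ and $T_{xx} = a^2 P$ used in Eq. (2.19) follow directly from Eq. (2.17). A short worked check, assuming the flat metric of Eq. (2.20), for which $g_{tt} = -1$ and $g_{xx} = a^2$, and the co-moving 4-velocity $u^\mu = (1,0,0,0)$:

```latex
\begin{align}
u_\mu &= g_{\mu\nu} u^\nu = (-1, 0, 0, 0), \\
T_{tt} &= \rho\, u_t u_t + P\,(g_{tt} + u_t u_t) = \rho + P\,(-1 + 1) = \rho, \\
T_{xx} &= \rho\, u_x u_x + P\,(g_{xx} + u_x u_x) = P\, g_{xx} = a^2 P .
\end{align}
```

The $yy$ and $zz$ components give the same result as $T_{xx}$, while the off-diagonal components vanish.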

As a result of the high degree of symmetry, these are in fact the only two independent components of the Einstein equations; all the others are either trivially satisfied or equivalent to one of the above. We now need to compute $G_{tt}$ and $G_{xx}$ in terms of $a(t)$. Let us do this explicitly for the case of a flat spatial geometry, where the 4-metric can be written in the form

$$ ds^2 = -dt^2 + a^2(t)\left( dx^2 + dy^2 + dz^2 \right). \tag{2.20} $$

The non-vanishing components of the Christoffel symbols are

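$$ \Gamma^t_{xx} = \Gamma^t_{yy} = \Gamma^t_{zz} = a\dot a, \qquad \Gamma^x_{xt} = \Gamma^x_{tx} = \Gamma^y_{yt} = \Gamma^y_{ty} = \Gamma^z_{zt} = \Gamma^z_{tz} = \frac{\dot a}{a}, \tag{2.21} $$

with $\dot a = da/dt$. The independent Ricci tensor components are then

$$ R_{tt} = -3\frac{\ddot a}{a}, \qquad R_{xx} = a\ddot a + 2\dot a^2. \tag{2.22} $$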
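As a check on Eq. (2.22), the Ricci components can be computed from the Christoffel symbols of Eq. (2.21). The sketch below assumes the common convention $R_{\mu\nu} = \partial_\lambda \Gamma^\lambda_{\mu\nu} - \partial_\nu \Gamma^\lambda_{\lambda\mu} + \Gamma^\lambda_{\lambda\sigma}\Gamma^\sigma_{\mu\nu} - \Gamma^\lambda_{\nu\sigma}\Gamma^\sigma_{\lambda\mu}$; the notes do not state their convention explicitly, but this one reproduces their signs.

```latex
\begin{align}
R_{tt} &= -\partial_t \Gamma^i_{it} - \Gamma^i_{tj}\Gamma^j_{it}
        = -3\,\frac{d}{dt}\!\left(\frac{\dot a}{a}\right) - 3\,\frac{\dot a^2}{a^2}
        = -3\,\frac{\ddot a}{a}, \\
R_{xx} &= \partial_t \Gamma^t_{xx} + \Gamma^\lambda_{\lambda t}\Gamma^t_{xx}
          - \Gamma^t_{xx}\Gamma^x_{tx} - \Gamma^x_{xt}\Gamma^t_{xx} \\
       &= \left(a\ddot a + \dot a^2\right) + 3\dot a^2 - 2\dot a^2
        = a\ddot a + 2\dot a^2 .
\end{align}
```

Here $i, j$ run over the spatial indices only, and $\Gamma^\lambda_{tt} = 0$ has been used in the first line.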

The Ricci scalar is

$$ R = -R_{tt} + 3a^{-2}R_{xx} = 6\left( \frac{\ddot a}{a} + \frac{\dot a^2}{a^2} \right), \tag{2.23} $$

so that

$$ G_{tt} = R_{tt} + \frac{1}{2} R = 3\frac{\dot a^2}{a^2} = 8\pi\rho, \qquad G_{xx} = R_{xx} - \frac{1}{2} a^2 R = -2a\ddot a - \dot a^2 = 8\pi a^2 P. \tag{2.24} $$

Combining these equations we may write

$$ 3\frac{\ddot a}{a} = -4\pi(\rho + 3P). \tag{2.25} $$

Repeating the calculation for the closed and open FLRW Universes, we find that in general they are governed by the following two equations:

$$ \frac{\dot a^2}{a^2} = \frac{8\pi}{3}\rho - \frac{k}{a^2}, \tag{2.26} $$

$$ \frac{\ddot a}{a} = -\frac{4\pi}{3}(\rho + 3P). \tag{2.27} $$

Equation (2.27) immediately leads to a striking prediction. Provided that $\rho > 0$ and $P \geq 0$, the Universe cannot be static. Indeed, one will have $\ddot a < 0$, so that the Universe must always be either expanding ($\dot a > 0$) or contracting ($\dot a < 0$), except perhaps at an instant in time when expansion changes over to contraction. Note the nature of this expansion or

contraction: the distances between all co-moving observers (galaxies) change with time; there is no “center” or preferred point. If at time $t$ the distance between two such observers is $D$, then

$$ v \equiv \frac{dD}{dt} = \frac{D}{a}\frac{da}{dt} = HD, \tag{2.28} $$

where $H(t) = \dot a(t)/a(t)$ is the Hubble parameter. If the observers are sufficiently close (e.g., at a distance of a few hundred Mpc) then we can approximate $H$ by $H_0$, its value at the current epoch, and we retrieve Hubble’s law (2.1).

As found by Hubble, the Universe is currently expanding, $\dot a > 0$. According to Eq. (2.27), $\ddot a < 0$, so moving back in time the Universe must have been expanding faster and faster. If the Universe had always expanded at the current rate, then a time $T = a/\dot a = H_0^{-1}$ ago we would have had $a = 0$. Since the expansion was actually faster, the time at which $a = 0$ was even closer to the present. Thus, under the assumption of homogeneity and isotropy, general relativity makes the prediction that at a time less than $H_0^{-1}$ ago, the Universe was in a singular state: the distance between all points in space was zero, and the density of matter as well as the curvature of spacetime was infinite. This state is referred to as the Big Bang. For a while it was thought that if the assumptions of homogeneity and isotropy were relaxed, the Big Bang could be avoided, but it is now known that singularities are in fact generic features of cosmological solutions.

Eqns. (2.26) and (2.27) allow us to obtain an equation for the time evolution of the mass density. Indeed, multiplying (2.26) by $a^2$, differentiating with respect to $t$, and eliminating $\ddot a$ using (2.27), we find

$$ \dot\rho = -3(\rho + P)\frac{\dot a}{a}. \tag{2.29} $$

Thus, for matter ($P = 0$), we get

$$ \rho\, a^3 = \text{const}, \tag{2.30} $$

which expresses conservation of rest mass. For radiation ($P = \rho/3$), one has

$$ \rho\, a^4 = \text{const}. \tag{2.31} $$

Here the energy density decreases more rapidly than $a^{-3}$ as $a$ grows, because the radiation in each volume element does work on its surroundings as the Universe expands.
Alternatively, in terms of photons, the photon number density decreases as $a^{-3}$, but each photon loses energy as $a^{-1}$ due to the cosmological redshift caused by the expansion; see below.

Let us discuss the qualitative features of the FLRW Universes depending on the value of $k$. If $k = 0$ or $-1$, Eq. (2.26) tells us that $\dot a$ can never become zero. Indeed, for any matter with $P \geq 0$, $\rho$ must decrease at least as rapidly as $a^{-3}$, so that $\rho\, a^2 \to 0$ as $a \to \infty$. We conclude that if $k = 0$ (a flat Universe), the expansion rate $\dot a$ asymptotically approaches zero as $t \to \infty$, while if $k = -1$ we find $\dot a \to 1$ as $t \to \infty$. In both cases the Universe expands forever. The situation is different for $k = +1$. The first term on the right hand side of Eq. (2.26) decreases more rapidly with $a$ than the second term, and since the left hand side must be positive, there is a critical value $a_c$ such that $a \leq a_c$. Furthermore, $a$ cannot asymptotically approach $a_c$ as $t \to \infty$ because the magnitude of $\ddot a$ is bounded from below due to Eq. (2.27). Thus, if $k = +1$, then at a finite time after the Big Bang, the Universe will reach a maximum size $a_c$ after which it will contract again. The same argument as above for the occurrence of a Big Bang now shows that a finite time after the recontraction begins, there must be a Big Crunch where once again $a = 0$. Thus, the spatially closed Universe can only exist for a finite (albeit possibly very long) time.
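The three cases can be illustrated for pressureless matter, where Eq. (2.30) lets us write $(8\pi/3)\rho a^2 = C/a$ with $C$ a constant, so that Eq. (2.26) becomes $\dot a^2 = C/a - k$. The following is a minimal sketch under that assumption; the constant `C` and the function names are hypothetical and not part of the notes.

```python
def adot_squared(a, k, C=1.0):
    """a_dot^2 from Eq. (2.26) for pressureless matter.

    With rho * a^3 = const (Eq. 2.30), (8*pi/3) * rho * a^2 = C / a,
    where C = (8*pi/3) * rho * a^3 is a constant (hypothetical name).
    """
    return C / a - k

def turnaround_scale_factor(k, C=1.0):
    """Scale factor a_c at which a_dot = 0, or None if a_dot never vanishes."""
    if k <= 0:
        return None  # C / a - k stays positive for every finite a > 0
    return C / k

for k in (-1, 0, 1):
    print(f"k={k:+d}  a_c={turnaround_scale_factor(k)}  "
          f"adot^2 at a=1e6: {adot_squared(1e6, k):.6f}")
```

For k = 0 the value of ȧ² tends to zero at large a, for k = −1 it tends to one, and for k = +1 the expression turns negative beyond a_c = C/k, which signals that such values of a are never reached.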

The dynamics of the open, flat, and closed Universes is illustrated in Fig. 13.

Figure 13: The open and flat FLRW metrics represent Universes that expand forever. The closed Universe must recollapse.

Current observations favor k = 0, in which case one has eternal expansion, but the issue is not yet settled. Write the first FLRW equation (2.26) as

$$ H^2 = \frac{8\pi}{3}\rho - \frac{k}{a^2}. \tag{2.32} $$

For the Universe to be spatially flat, the density must be the critical density,

$$ \rho_c = \frac{3H^2}{8\pi}. \tag{2.33} $$

The more general Eq. (2.32) can be written in terms of the critical density as

$$ (\rho_c - \rho)\, a^2 = -\frac{3k}{8\pi}, \tag{2.34} $$

or

$$ \left( \Omega^{-1} - 1 \right) \rho\, a^2 = -\frac{3k}{8\pi}, \tag{2.35} $$

where we defined $\Omega = \rho/\rho_c$. The right hand side is constant, so the left hand side must be as well, irrespective of the value of $k$. For a Universe which mostly contains matter and radiation (Eqns. (2.30) and (2.31)), $\rho$ decreases more quickly than $a^2$ increases, so that $|\Omega^{-1} - 1|$ must increase in tandem and $\Omega$ is driven away from 1 (for an open Universe, $\Omega \to 0$), unless of course $\Omega = 1$ for all time. But in the latter case, the density $\rho$ of the Universe must have been fine-tuned from the beginning to exactly equal the critical density $\rho_c$. This is the flatness problem, which will be discussed in more detail in the next chapter.
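To get a feeling for the fine-tuning implied by Eq. (2.35), one can evolve Ω backwards in the scale factor. The sketch below assumes a single fluid with ρ ∝ a⁻ⁿ (n = 3 for matter, n = 4 for radiation) and normalizes a = 1 at the present epoch; the function name `omega` and the sample value Ω₀ = 0.9 are hypothetical and for illustration only.

```python
def omega(a, omega_0, n):
    """Density parameter at scale factor a (a = 1 today).

    From Eq. (2.35), (1/Omega - 1) * rho * a^2 is constant, so with
    rho ~ a^(-n) one gets (1/Omega - 1) ~ a^(n - 2).
    n = 3 for matter (Eq. 2.30), n = 4 for radiation (Eq. 2.31).
    """
    inv_minus_one = (1.0 / omega_0 - 1.0) * a ** (n - 2)
    return 1.0 / (1.0 + inv_minus_one)

# Illustrative value only: a present-day Omega of 0.9 in a matter Universe.
for a in (1.0, 1e-1, 1e-2, 1e-3):
    print(f"a={a:7.3f}  Omega={omega(a, 0.9, 3):.6f}")
```

The closer one goes to a = 0, the closer Ω must have been to 1, which is the quantitative content of the flatness problem.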

D. Dark energy

We have seen that if $\rho > 0$ and $P \geq 0$, one has $\ddot a < 0$ on account of Eq. (2.27), so that the expansion of the Universe should be slowing down. For about a decade, however, evidence has been accumulating that the expansion of the Universe is actually speeding up. One explanation could be that there is a cosmological constant $\Lambda$ in the Einstein equations:

$$ R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R + \Lambda g_{\mu\nu} = \frac{8\pi G}{c^4} T_{\mu\nu}. \tag{2.36} $$

Einstein originally introduced the cosmological constant because of the then prevalent philosophical bias that the Universe should be forever stationary, whereas in standard general relativity, it needs to either expand or contract. With $\Lambda > 0$, a stationary Universe is possible, although we now know that it would not be stable against small perturbations, again leading to either expansion or contraction. A non-stationary Universe is a prediction of general relativity, with or without a $\Lambda$ term. After Hubble’s observations, Einstein called the introduction of the cosmological constant “the biggest blunder of his life”. However, recently the idea has been resurrected. Observations of distant supernovae revealed that, given their redshift, they are dimmer and hence more distant than they should be, suggesting an accelerated expansion of the Universe. Observations currently favor a small, positive cosmological constant, which would make gravity slightly repulsive on large scales and give spacetime a natural tendency to expand. In the presence of a cosmological constant, Eqns. (2.26) and (2.27) become

$$ \left( \frac{\dot a}{a} \right)^2 = \frac{8\pi}{3}\rho - \frac{k}{a^2} + \frac{\Lambda}{3}, \tag{2.37} $$

$$ \frac{\ddot a}{a} = -\frac{4\pi}{3}(\rho + 3P) + \frac{\Lambda}{3}, \tag{2.38} $$

but Eq. (2.29) continues to hold as stated. Let us assume $\Lambda > 0$ and repeat the reasoning above for the flat, open, and closed Universes. For $k = +1$, it is now possible that the second term in Eq. (2.37) will never come to dominate over the first and third, in which case the turnover point $\dot a = 0$ is never reached and the Universe will expand forever. If a turnover is reached, it will be delayed compared to the case with $\Lambda = 0$. For $k = 0$ and $k = -1$, the Universe will certainly keep expanding forever, just as in the $\Lambda = 0$ case. However, since $\rho$ and $P$ go to zero as $a \to \infty$, there must come a time when the $\Lambda$ term comes to dominate. After that one has $\ddot a > 0$: the expansion of the Universe will be accelerating.

Another possibility is dark energy, a putative new form of matter or energy with positive density but negative pressure. If you were to try to inflate a tyre by pumping dark energy into it, the tyre would get heavier, but it would actually deflate; heuristically one can think of space being expelled from the tyre. At this point nothing is known about the nature of dark energy¹². As a first attempt to model it, one can think of it as being a perfect fluid with equation of state

$$ P_{DE} = w\,\rho_{DE}, \tag{2.39} $$

with $P_{DE}$ the pressure, $\rho_{DE}$ the density, and $w < 0$ the equation-of-state parameter. If $w$ is constant and equal to $-1$, then that is equivalent to having a cosmological constant $\Lambda$ and

12 Not to be confused with dark matter, the origin of which is equally unclear.

there is no need to posit dark energy. However, $w$ could be time-dependent, but even that is something we have no knowledge of at present. From Eq. (2.29) we get

$$ \dot\rho_{DE} = -3(1 + w)\,\rho_{DE}\,\frac{\dot a}{a}, \tag{2.40} $$

and if $w$ is constant (or slowly varying) this integrates to

$$ \rho_{DE} = \rho_{DE,0}\; a^{-3(1+w)}, \tag{2.41} $$

with $\rho_{DE,0}$ the present-day density of dark energy. If $w$ is non-constant then this can be generalized with the replacement

$$ a^{-3(1+w)} \to \exp\left( 3\int_a^1 \frac{da'}{a'}\,[1 + w(a')] \right), \tag{2.42} $$

where we set the present-day value of the scale factor $a$ to 1.

The Hubble parameter $H = \dot a/a$ neatly encapsulates the past history and dynamics of the Universe. By means of Eq. (2.37) we can write

$$ H^2(a) = H_0^2 \left[ \Omega_M a^{-3} + \Omega_R a^{-4} + \Omega_k a^{-2} + \Omega_{DE} \exp\left( 3\int_a^1 \frac{da'}{a'}\,[1 + w(a')] \right) \right]. \tag{2.43} $$

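The dimensionless constants $\Omega_M$, $\Omega_R$, $\Omega_k$, and $\Omega_{DE}$ are defined as

$$ \Omega_M = \frac{8\pi G \rho_{M,0}}{3H_0^2}, \tag{2.44} $$

$$ \Omega_R = \frac{8\pi G \rho_{R,0}}{3H_0^2}, \tag{2.45} $$

$$ \Omega_{DE} = \frac{8\pi G \rho_{DE,0}}{3H_0^2}, \tag{2.46} $$

$$ \Omega_k = -\frac{k}{H_0^2}. \tag{2.47} $$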

Here $\rho_{M,0}$ is the present-day value of the matter density, $\rho_{R,0}$ the same for radiation, and $\rho_{DE,0}$ for the dark energy density. Later on we will talk about methods with which the above constants can be determined. For now we mention that existing data point towards $\Omega_k \simeq 0$, $\Omega_R \simeq 0$, $\Omega_M \simeq 0.27$, and $\Omega_{DE} \simeq 0.73$. In other words, dark energy dominates over matter! As discussed in the previous section, the density of dark energy (or, if $w = -1$, the density associated with the cosmological constant) must be extremely small, but the Universe is mostly devoid of matter, and the average matter density turns out to be smaller than that of dark energy. Note that as $t \to \infty$, having $w = -1$ would imply

$$ \lim_{t\to\infty} H^2(t) = \lim_{a\to\infty} H^2(a) = H_0^2\, \Omega_{DE}, \tag{2.48} $$

and since $H(a) = \dot a/a$, this means

$$ \dot a = H_0 \sqrt{\Omega_{DE}}\; a. \tag{2.49} $$

Hence, at late times (much beyond the present epoch) the Universe will then end up undergoing eternal exponential expansion:

$$ a(t) = a_0 \exp\left[ H_0 \sqrt{\Omega_{DE}}\; t \right]. \tag{2.50} $$
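For constant w the integral in Eq. (2.43) reduces to the power law of Eq. (2.41), and the Hubble parameter can be evaluated directly. The following is a minimal sketch of that special case, with a = 1 today; the function name `hubble` is hypothetical, and H₀ is set to 1 so that the output is in units of H₀.

```python
import math

def hubble(a, H0, omega_m, omega_r, omega_k, omega_de, w=-1.0):
    """H(a) from Eq. (2.43) for constant w, where the integral reduces
    to the power law a^(-3(1+w)) of Eq. (2.41). a = 1 today."""
    e2 = (omega_m * a**-3 + omega_r * a**-4 + omega_k * a**-2
          + omega_de * a**(-3.0 * (1.0 + w)))
    return H0 * math.sqrt(e2)

# Values quoted in the text: Omega_M = 0.27, Omega_DE = 0.73, the rest ~ 0.
for a in (0.1, 0.5, 1.0, 10.0, 1e4):
    print(f"a={a:8.1f}  H/H0={hubble(a, 1.0, 0.27, 0.0, 0.0, 0.73):.6f}")
```

At a = 1 the result is H₀ because the Ω's sum to one, and for large a it approaches H₀√Ω_DE, in line with Eq. (2.48).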

E. Observations

Assuming that $w$ happens to be constant, the parameters to be measured are

$$ (H_0,\ \Omega_M,\ \Omega_R,\ \Omega_{DE},\ w). \tag{2.51} $$

Over the past decade, a great deal of progress has been made in pinning down these quantities, using a number of methods.

1. The Cosmic Microwave Background

It has been observed that the Universe is permeated by (nearly) blackbody radiation with a temperature of ∼ 3 K, the Cosmic Microwave Background (CMB). The most likely explanation is that as the Universe cooled down, eventually, some 370,000 years after the Big Bang, it became possible for electrons and atomic nuclei to form the first atoms without them being immediately knocked apart again. This released radiation, and at the same time the Universe became transparent. The subsequent expansion of the Universe then stretched the typical wavelength of the radiation, thus lowering its temperature, eventually leading to what we see today. Although this radiation is isotropic to about one part in $10^5$ (thus providing a validation of our assumption in previous subsections), departures from isotropy were mapped first by the COBE probe and then in far more detail by WMAP; see Fig. 14. These anisotropies were caused by density fluctuations in the very early Universe, which depend sensitively on the fractions of matter and radiation that were present, and CMB measurements provide good constraints on $\Omega_M H_0^2$ and $\Omega_R H_0^2$.

Figure 14: A map of the temperature variations of the Cosmic Microwave Background, made using the WMAP probe.

2. Baryon acoustic oscillations

Baryon acoustic oscillations (BAO) refer to an overdensity or clustering of baryonic matter (as opposed to radiation) in the early Universe, and the associated “sound waves” in the

primordial plasma. These oscillations were first identified as “wiggles” in the power spectrum of the CMB (Fig. 15). However, the resulting density fluctuations were also the seed for the large-scale clustering of the galaxies which would later form. The spatial correlations in the distribution of galaxies and galaxy clusters can be studied directly, and compared with the “seed” distribution seen in the CMB, providing additional constraints on $\Omega_M H_0^2$ and $\Omega_R H_0^2$, but also on $\Omega_{DE} H_0^2$ and $w$, because the latter influence large scale structure through the effect they have on the way the Universe has been expanding.

Figure 15: The distribution of temperature fluctuations in the CMB. The number l on the horizontal axis indicates angular scale; what is plotted is how much power is contained in the CMB depending on the size of the patch on the sky one considers.

3. Standard candles

As explained above (Eq. 2.43), the Hubble parameter $H$ encapsulates the main parameters one wishes to measure. A practical way to gain direct insight into this function is by measuring luminosity distances $D_L$. Given a localized energy source, luminosity distance is defined through

$$ F = \frac{L}{4\pi D_L^2}, \tag{2.52} $$

where $F$ is the flux measured by an observer, and $L$ is the intrinsic luminosity of the source. If we lived in a static, Euclidean Universe, then $D_L$ would be the Euclidean distance between source and observer. Instead, our Universe is curved, but one can still define a luminosity distance through Eq. (2.52).

Aside from $D_L$, one can also try to measure the redshift $z$ of the source. Consider an observer at $r = 0$, and a galaxy at $r > 0$ in which a signal is emitted which travels at the speed of light (this could be an electromagnetic signal, or a gravitational wave). Suppose a wave crest is emitted at a time $t_{\rm em}$ and picked up by the observer at time $t_{\rm obs}$. Setting $ds^2 = 0$, we have

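$$ \int_{t_{\rm em}}^{t_{\rm obs}} \frac{c\,dt}{a(t)} = \int_0^r \frac{dr}{(1 - kr^2)^{1/2}}. \tag{2.53} $$

If a second wave crest is emitted at a time $t_{\rm em} + \Delta t_{\rm em}$ and absorbed at a time $t_{\rm obs} + \Delta t_{\rm obs}$,

$$ \int_{t_{\rm em}+\Delta t_{\rm em}}^{t_{\rm obs}+\Delta t_{\rm obs}} \frac{c\,dt}{a(t)} = \int_0^r \frac{dr}{(1 - kr^2)^{1/2}}. \tag{2.54} $$

The right hand sides of (2.53) and (2.54) are the same, since observer and source are at fixed co-moving positions. Taking the difference between the two equations, to leading order in $\Delta t_{\rm em}$ and $\Delta t_{\rm obs}$ one has

$$ \Delta t_{\rm obs} = \frac{a(t_{\rm obs})}{a(t_{\rm em})}\, \Delta t_{\rm em}. \tag{2.55} $$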
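The step from Eqs. (2.53) and (2.54) to Eq. (2.55) can be spelled out as follows. This is a short worked derivation; it only assumes that $a(t)$ varies negligibly over the intervals $\Delta t_{\rm em}$ and $\Delta t_{\rm obs}$, which is what "to leading order" means here.

```latex
\begin{align}
0 &= \int_{t_{\rm em}+\Delta t_{\rm em}}^{t_{\rm obs}+\Delta t_{\rm obs}} \frac{c\,dt}{a(t)}
     - \int_{t_{\rm em}}^{t_{\rm obs}} \frac{c\,dt}{a(t)} \\
  &= \int_{t_{\rm obs}}^{t_{\rm obs}+\Delta t_{\rm obs}} \frac{c\,dt}{a(t)}
     - \int_{t_{\rm em}}^{t_{\rm em}+\Delta t_{\rm em}} \frac{c\,dt}{a(t)} \\
  &\simeq \frac{c\,\Delta t_{\rm obs}}{a(t_{\rm obs})} - \frac{c\,\Delta t_{\rm em}}{a(t_{\rm em})} .
\end{align}
```

Solving the last line for $\Delta t_{\rm obs}$ gives Eq. (2.55).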

In an expanding Universe $a(t_{\rm obs}) > a(t_{\rm em})$, so that there is a time dilation effect: the time between wave crests will be larger at the observer. The redshift $z$ is defined by

$$ 1 + z = \frac{a(t_{\rm obs})}{a(t_{\rm em})}. \tag{2.56} $$

The time dilation implies that according to the observer’s clock, a clock at the source will be slower by a factor $1 + z$:

$$ dt_{\rm obs} = (1 + z)\, dt_s, \tag{2.57} $$

where the subscript $s$ refers to the source, so that $dt_s = dt_{\rm em}$. The frequency seen by the observer will then also be lower than at the source,

$$ f_{\rm obs} = \frac{f_s}{1 + z}. \tag{2.58} $$

Energies are similarly affected¹³:

$$ E_{\rm obs} = \frac{E_{\rm em}}{1 + z}. \tag{2.59} $$

Together with (2.57), this allows us to relate the emitted power to the observed one:

$$ \frac{dE_{\rm obs}}{dt_{\rm obs}} = \frac{1}{(1 + z)^2} \frac{dE_{\rm em}}{dt_{\rm em}}. \tag{2.60} $$

The flux at the observer is given by

$$ F = \frac{1}{A} \frac{dE_{\rm obs}}{dt_{\rm obs}}, \tag{2.61} $$

where $A = 4\pi a^2(t_{\rm obs})\, r^2$ is the area over which a wavefront will have spread when the signal reaches the observer. Hence

$$ F = \frac{L}{4\pi a^2(t_{\rm obs})\, r^2\, (1 + z)^2}, \tag{2.62} $$

13 The easiest way to see this is to use the quantum relation E = hf, but it can also be derived classically.

with $L = dE_{\rm em}/dt_{\rm em}$ the intrinsic luminosity of the source. From the definition of luminosity distance, Eq. (2.52), we find

$$ D_L = (1 + z)\, a(t_{\rm obs})\, r. \tag{2.63} $$

In the rest of this chapter we specialize to the case of a flat universe, $k = 0$. Then

$$ \int_{t_{\rm em}}^{t_{\rm obs}} \frac{c\,dt}{a(t)} = r. \tag{2.64} $$

Time and redshift are related as

$$ 1 + z(t) = \frac{a(t_{\rm obs})}{a(t)}, \tag{2.65} $$

and differentiating we get

$$ \frac{dt}{a(t)} = -\frac{1}{a(t_{\rm obs})} \frac{dz}{H(z)}, \tag{2.66} $$

where the Hubble parameter is defined as

$$ H(z) = \frac{\dot a(z)}{a(z)}, \tag{2.67} $$

and a dot denotes differentiation with respect to $t$. Eq. (2.64) can then be written as

$$ a(t_{\rm obs})\, r = c \int_0^z \frac{dz'}{H(z')}, \tag{2.68} $$

so that

$$ D_L = c\,(1 + z) \int_0^z \frac{dz'}{H(z')}. \tag{2.69} $$

For small $z$, Eq. (2.69) yields

$$ H_0 D_L \simeq cz, \tag{2.70} $$

which with $v_{\rm rec} = cz$ again gives Hubble’s law. However, the more general expression (2.69) contains far more information; in fact, it encapsulates the entire past dynamics and geometry of the Universe through its dependence on $H(z)$:

$$ D_L = D_L(z;\ H_0, \Omega_M, \Omega_R, \Omega_k, \Omega_{DE}, w). \tag{2.71} $$
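Eq. (2.69) is straightforward to evaluate numerically. The sketch below is only an illustration under stated assumptions: a flat Universe with radiation neglected and constant w, so that H(z) follows from Eq. (2.43) with a = 1/(1 + z). The value H₀ = 70 km/s/Mpc is an assumed round number that is not quoted in the text, and the function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def hubble_z(z, H0, omega_m, omega_de, w=-1.0):
    """H(z) for a flat Universe (Omega_k = 0), radiation neglected,
    constant w; obtained from Eq. (2.43) with a = 1 / (1 + z)."""
    return H0 * math.sqrt(omega_m * (1 + z)**3
                          + omega_de * (1 + z)**(3 * (1 + w)))

def luminosity_distance(z, H0=70.0, omega_m=0.27, omega_de=0.73,
                        w=-1.0, steps=2000):
    """D_L in Mpc from Eq. (2.69), using the midpoint rule.
    H0 is in km/s/Mpc; the default of 70 is an assumed value."""
    dz = z / steps
    integral = sum(dz / hubble_z((i + 0.5) * dz, H0, omega_m, omega_de, w)
                   for i in range(steps))
    return C_KM_S * (1 + z) * integral

for z in (0.01, 0.1, 0.5, 1.0):
    d_l = luminosity_distance(z)
    print(f"z={z:4.2f}  D_L={d_l:9.1f} Mpc  H0*D_L/(c*z)={70.0 * d_l / (C_KM_S * z):.3f}")
```

The last column tends to 1 for small z, which is the Hubble-law limit of Eq. (2.70), and departs from 1 at larger z, where the dependence on the Ω's and on w becomes visible.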

If one has a sufficient number of sources for which $D_L$ and $z$ can be measured by independent means, then one can fit $D_L$ as a function of $z$ with $H_0$, $\Omega_M$, $\Omega_R$, $\Omega_k$, $\Omega_{DE}$, $w$ as free parameters. This way the values of the latter can be measured, as the numbers that lead to the best fit for $D_L(z)$.¹⁴ One way to make such measurements is to consider Type Ia supernovae. These are believed to always have the same intrinsic luminosity, to within 10% or so. From their spectrum (or that of the host galaxy) one can also infer their redshift with essentially zero uncertainty. Thus, they are standard candles: $D_L$ and $z$ can be measured separately. A

14 Although in the derivation of Eq. (2.69) we assumed a flat Universe, the expression is more generally valid.

problem is that Type Ia supernovae will have been slightly different in the past, when stars had not yet produced as many heavy elements as there are today. The luminosity of Type Ia supernovae needs to be calibrated, which necessitates measuring... their distance. This can be done by looking at the galaxies they live in. For spiral galaxies, there exists a relationship between their intrinsic luminosity and the velocities with which their stars move: the Tully-Fisher relation. The velocities can be established by Doppler shift measurements, which then provides a way to measure luminosity distance. However, this relationship is itself in need of calibration, once again requiring distance measurements by alternative means. The distances to close-by galaxies are measured by observing variable stars called Cepheids. The average brightness of a Cepheid is closely related to the period with which it brightens and dims, from which luminosity distance can be inferred. But this relationship also needs to be calibrated, which is done by parallax measurements on Cepheids in our own galaxy. As the Earth moves around the Sun, the apparent position of a star in the sky traces out an ellipse, and the angular size of this ellipse is a measure of distance. Parallax measurements are accurate out to a few kpc, allowing for the calibration of nearby Cepheids. Beyond that one relies on this Cepheid calibration for distance measurements up to about 10 Mpc. The number of spiral galaxies within that distance is sufficient to calibrate the Tully-Fisher relation, which is then used on its own to measure distances up to tens of Mpc. Finally, Type Ia supernovae are calibrated. Thus, measurements of $D_L$ rely on an entire cosmic distance ladder; see Fig. 16. Each step of the ladder comes with statistical as well as systematic uncertainties, related to, e.g., the absorption of light as it travels to us. Nevertheless, supernovae have proven to be useful in further constraining cosmological parameters.
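Two elementary relations underlie the ladder: the inversion of Eq. (2.52) for a source of known luminosity, and the parallax relation for nearby stars. The following is a minimal sketch; the function names are hypothetical, and the parallax formula d = 1/p (distance in parsec, angle in arcseconds) is the standard small-angle relation rather than something derived in the text.

```python
import math

def luminosity_distance_from_flux(luminosity, flux):
    """Invert Eq. (2.52): D_L = sqrt(L / (4 pi F)).
    Units must be consistent (e.g. W and W/m^2 give metres)."""
    return math.sqrt(luminosity / (4.0 * math.pi * flux))

def parallax_distance_pc(parallax_arcsec):
    """Distance in parsec from the annual parallax angle in arcseconds
    (small-angle relation d = 1 / p, which defines the parsec)."""
    return 1.0 / parallax_arcsec

# A standard candle that appears 16 times fainter is 4 times farther away.
d_near = luminosity_distance_from_flux(1.0, 1.6e-3)
d_far = luminosity_distance_from_flux(1.0, 1.0e-4)
print(d_far / d_near)

# A parallax of one milliarcsecond corresponds to about 1 kpc.
print(parallax_distance_pc(1e-3))
```

The inverse-square dependence is why a 10% spread in intrinsic luminosity translates into roughly a 5% uncertainty in the inferred distance.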

Figure 16: The so-called cosmic distance ladder, which is used to calibrate faraway standard candles such as Type Ia supernovae.

Combining the results from WMAP, BAO, and supernovae, the following information was obtained:

• The age of the Universe is 13.72 ± 0.12 billion years;

• The total matter content of the Universe is $\Omega_M = 0.27 \pm 0.02$. Of this, less than 20% is ordinary, baryonic matter. The rest is dark matter (not to be confused with dark energy!) which doesn’t seem to emit light but makes its presence known through its gravitational pull. The nature of dark matter is as yet unclear;

• The radiation content is tiny, $\Omega_R \sim 5 \times 10^{-5}$; although radiation would have dominated in the very early Universe (due to the $a^{-4}$ dependence of the density on the scale factor), at the current epoch its effects are negligible;

• The dark energy contribution is $\Omega_{DE} = 0.73 \pm 0.02$. Note that $\Omega_M + \Omega_{DE} = 1$ to within the stated accuracies. Since necessarily the sum of all the $\Omega$’s must be 1, this suggests $\Omega_k = 0$, i.e., a flat Universe;

• One has $-1.11 < w < -0.86$. Hence the dark energy could be a cosmological constant (corresponding to $w = -1$), but other values of $w$ (and indeed a time-dependent $w$) are clearly still allowed.

The assumption that at large length scales, the past history and dynamics of the Universe is well described by one of the FLRW spacetimes, together with the above constraints, is referred to as the Standard Model of Cosmology.
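As a rough consistency check on these numbers, the age of a flat Universe containing only matter and a cosmological constant follows from $t_0 = \int_0^1 da / (a\,H(a))$ with $H(a)$ from Eq. (2.43). The sketch below is an illustration under assumptions that are not taken from the text: H₀ = 70 km/s/Mpc is an assumed round value, radiation is neglected, and w = −1. The function names and unit constants are hypothetical choices.

```python
import math

KM_PER_MPC = 3.0856775814913673e19
SECONDS_PER_GYR = 3.15576e16  # 1e9 Julian years

def age_numeric(H0_km_s_mpc, omega_m, omega_de, steps=100000):
    """t0 = integral_0^1 da / (a H(a)) in Gyr, flat model with w = -1,
    evaluated with the midpoint rule."""
    H0 = H0_km_s_mpc / KM_PER_MPC  # convert to 1/s
    da = 1.0 / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da
        # a * H(a) / H0 = sqrt(Omega_M / a + Omega_DE * a^2)
        total += da / math.sqrt(omega_m / a + omega_de * a * a)
    return total / H0 / SECONDS_PER_GYR

def age_analytic(H0_km_s_mpc, omega_m, omega_de):
    """Closed form for the same model:
    t0 = 2 / (3 H0 sqrt(O_DE)) * asinh(sqrt(O_DE / O_M))."""
    H0 = H0_km_s_mpc / KM_PER_MPC
    t0 = (2.0 / (3.0 * H0 * math.sqrt(omega_de))
          * math.asinh(math.sqrt(omega_de / omega_m)))
    return t0 / SECONDS_PER_GYR

print(age_numeric(70.0, 0.27, 0.73), age_analytic(70.0, 0.27, 0.73))
```

With these assumed inputs both functions give an age close to the Hubble time 1/H₀ and within a few percent of the quoted 13.72 billion years; the quoted value comes from a full fit, so exact agreement should not be expected.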

4. Future measurements

Later on we will talk about gravitational waves (GW), “ripples” in the curvature of spacetime which propagate at the speed of light. In the future these could revolutionize our understanding of the Universe, in various ways.

• Primordial gravitational waves should have been created within a Planck time ($\sim 5 \times 10^{-44}$ s) of the Big Bang, to be compared with the CMB which originated some 370,000 years later. Thus, a cosmological GW background could provide information about what happened at the very instant of the Big Bang, when quantum gravity effects were still strong. This might help us in understanding how quantum mechanics and GR are to be reconciled and combined into one theory. Also, phase transitions in the early Universe associated with the decoupling of the electroweak force¹⁵, which happened some $10^{-10}$ s after the Big Bang, could have created a GW background. Such primordial GW may have left an imprint on the electromagnetic Cosmic Microwave Background through its polarization. An important goal of the Planck probe, which was launched in 2009, is to study the polarization of the CMB. Alternatively (or in addition), primordial GW may be found by dedicated gravitational wave detectors that are currently operational or under construction.

• The direct detection of GW may also give us a new kind of standard candle. Binary systems consisting of two neutron stars, a neutron star and a black hole, or two black holes tend to spiral towards each other as they lose orbital energy through GW emission, and eventually they merge to form a single black hole, leading to even more gravitational radiation. Such systems are self-calibrating, in the sense that the distance can be deduced from the gravitational waveform itself, with no reference to any kind

15 Quantum field theory tells us that electromagnetism and the weak nuclear force were once just one single interaction.

of cosmic distance ladder. If the host galaxy in which the coalescence event took place can be separately identified and its redshift z measured, then the relationship (2.71) can be used to constrain cosmological parameters directly. In a later chapter we will study in detail how this is to be done.

III. COSMOLOGICAL INFLATION

The Standard Model of Cosmology yields a global picture of the evolution of the Universe. However, it suffers from several subtle, but serious shortcomings. We will investigate this and discover how cosmological inflation presents a solution. Details of inflation are model dependent. Next, we describe an elementary model for the inflation process. We derive the duration of the inflation epoch and study the connection, through reheating, to the Big Bang model discussed previously.

A. Shortcomings of the Standard Model of Cosmology

When considered by itself, there are a number of problems with the Standard Model of Cosmology:

• Horizon problem: the horizon of an observer is the largest distance over which an influence, any influence, can have travelled in order to reach this observer. It presents an upper limit to the volume of space that can be in causal contact with the observer. We have already seen that no influence can travel faster than light, and thus it follows that the horizon of an observer represents the size of the visible Universe of this observer. Thermal equilibrium between different parts of the Universe can only be established through the exchange of photons of the cosmic microwave background radiation. However, these photons decoupled from matter around $10^5$ years after the Big Bang: after this event it was no longer possible to bring different parts of the Universe into thermal equilibrium. During this decoupling, the horizon of an observer was many times smaller than it is today. Thus, we expect that regions with photons of similar temperatures are much smaller than the present visible Universe. The opposite appears to be true: data show that the entire visible Universe has nearly the same temperature (we have used this fact before to justify the cosmological principle). The resulting paradox is called the horizon problem.

• Flatness problem: data show that the present Universe has a metric that is extremely flat: the flat Robertson-Walker metric. We have already calculated that the expansion of the Universe drives the density away from the critical value: $\Omega \to 0$ when $t$ becomes larger. This implies that, in order to explain the present flatness, the metric of the early Universe must have resembled the perfectly flat Robertson-Walker metric even more closely. This is known as the flatness problem: which mechanism brought the earliest magnitude of flatness so close to that of the perfect Robertson-Walker value? This question can be evaded by assuming that the Universe was always perfectly flat. However, this then leads to the question of why the Universe started off with exactly the critical density. The Standard Model of Cosmology does not answer these questions.

• Missing particles: many of the modern particle physics theories predict the existence of exotic particles that have not been discovered so far. Examples include supersymmetric particles and magnetic monopoles. Typically, these particles are extremely massive and are therefore difficult (or impossible at this moment) to create with accelerators. However, in the early Universe temperatures may have been high enough to allow

natural creation of such particles. So far, none of these particles have been observed¹⁶. This raises the question: if such particles were indeed produced in the early Universe, why do we not find them today?

All these problems can be solved at once when we expand the Standard Model of Cosmology with a new concept: cosmological inflation. This is the assumption that the Universe, immediately after the Big Bang, went through a period of extremely rapid expansion. Mathematically, we can define this as follows: during inflation¹⁷

$$ \dot a(t) > 0, \qquad \ddot a(t) > 0. \tag{3.1} $$

By introducing inflation, we can solve the three shortcomings of the Standard Model. The horizon problem is solved by the fact that during inflation the scale factor $a(t)$ becomes extremely large. This implies that parts of the Universe that already had established thermal equilibrium before the start of inflation, inflated to much larger proportions than the horizon of an observer in this part of the Universe. The consequence is then that after inflation ends, the visible Universe for this observer is fully in equilibrium, exactly as the data show today!

The flatness problem is solved by the fact that every manifold, when expanded to sufficiently large proportions, appears flat to a local observer. This is in complete analogy with the surface of the Earth, which is sufficiently large compared to local observers, and appears flat to us.

Finally, the problem of missing particles is solved in a trivial manner through the introduction of an inflationary period. When the Universe is expanded by vast amounts, all exotic particles that were created before the start of inflation will be distributed over a large volume. This makes the probability to encounter such a particle in an Earth-bound detector very small. Clearly, inflation only explains why the initial exotic particles are not found, and it makes no statement about particles that may have been created after inflation ended. Note that we do not expect such particles to have been created at these later times. Typically, exotic particles have such high masses that their creation is prevented in a Universe where the energy density has significantly decreased because of expansion.
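The flatness argument can be quantified with Eq. (2.35). As a sketch under an explicit assumption: if the energy density stays constant during inflation (as it does for P = −ρ according to Eq. (2.29)), then $|\Omega^{-1} - 1| \propto a^{-2}$, so N e-folds of expansion, with $N = \ln(a_{\rm end}/a_{\rm start})$, suppress the deviation from flatness by $e^{-2N}$. The function names below are hypothetical, and the value N = 60 is only an illustrative input that does not come from the text.

```python
import math

def curvature_suppression(n_efolds):
    """Factor by which |1/Omega - 1| shrinks after n_efolds e-folds of
    expansion at constant rho (from Eq. 2.35: proportional to 1/a^2)."""
    return math.exp(-2.0 * n_efolds)

def efolds_needed(suppression):
    """Number of e-folds N = ln(a_end / a_start) needed to reduce
    |1/Omega - 1| by the given factor (0 < suppression < 1)."""
    return -0.5 * math.log(suppression)

print(curvature_suppression(60.0))   # illustrative N, not from the text
print(efolds_needed(1e-50))
```

Because the suppression is exponential in N, even a modest number of e-folds drives Ω extremely close to 1, whatever its value before inflation.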

B. The dynamics of cosmological inflation

We will now discuss how cosmological inflation can be realized: how can we apply the Friedmann equations to obtain inflation? We have already seen during our discussion of the cosmological constant that energy with an equation of state $P = -\rho$ yields an exponentially expanding Universe. However, this is not the only way to satisfy Eq. (3.1). When we consider the second Friedmann equation (2.27), we see that inflation occurs when the right-hand side of this equation is positive. In terms of the equation of state this means that inflation occurs for all matter and energy that has the property $P = n\rho$ with $n < -\frac{1}{3}$.
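The condition on n follows in one line from Eq. (2.27). A short worked step, assuming $\rho > 0$ and an equation of state $P = n\rho$:

```latex
\begin{align}
\frac{\ddot a}{a} = -\frac{4\pi}{3}\,(\rho + 3P)
                  = -\frac{4\pi}{3}\,(1 + 3n)\,\rho > 0
\quad\Longleftrightarrow\quad 1 + 3n < 0
\quad\Longleftrightarrow\quad n < -\frac{1}{3} .
\end{align}
```

Pressureless matter ($n = 0$) and radiation ($n = 1/3$) therefore always decelerate the expansion, while a cosmological constant ($n = -1$) accelerates it.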

16 Discovery of magnetic monopoles has been claimed by some experimental physicists, however their result could never be reproduced.

17 Note that a Universe with a cosmological constant obeys this definition, and we could denote our present expansion as inflation. However, we will reserve the term inflation for accelerated expansion of the early Universe.

Such energy is not encountered in classical physics. However, in quantum physics it is possible to realize such types of energy, through a description in terms of fields instead of particles. We will now define a field that is constructed in such a manner that it has an energy density and pressure that obey an equation of state in which $n < -\frac{1}{3}$. It is then up to experimental physics to show whether such a field exists in Nature.

The model that is employed the most consists of a scalar field $\Phi(t)$ which only depends on time and not on space. The cosmological principle suggests that all energy and matter should be distributed homogeneously and isotropically. The choice for a vector or tensor field would not be in agreement with the demand for isotropy because of the rotation dependence of such fields. Spatial dependence would be in conflict with the demand for homogeneity. A scalar field has a Lagrangian density given by¹⁸

$$ \mathcal{L} = -\frac{1}{2} g^{\mu\nu} \big( \partial_\mu \Phi(t) \big) \big( \partial_\nu \Phi(t) \big) - V\big( \Phi(t) \big), \tag{3.2} $$

where in the present case none of the spatial derivatives contribute. Every Lagrangian density has a corresponding action $S$, given by

$$ S = \int d^3x\, dt\, \sqrt{-g}\; \mathcal{L}, \tag{3.3} $$

where $g$ is the determinant of the metric. In the present case we employ the flat Robertson-Walker metric, and the determinant is given by

$$ g = -a^6(t). \tag{3.4} $$

With the Euler-Lagrange equations we can now derive the equations of motion for the scalar field $\Phi(t)$. This is a long calculation, although relatively simple since the field only depends on time. The scalar field $\Phi(t)$ evolves in time according to the equation of motion

$$ \ddot\Phi(t) + 3\frac{\dot a(t)}{a(t)} \dot\Phi(t) + c^2\, \partial_\Phi V\big( \Phi(t) \big) = 0. \tag{3.5} $$

Note that the details of the evolution of the scalar field depend on the potential energy density $V\big(\Phi(t)\big)$. The various types of models for cosmological inflation are characterized by the choice of this quantity. Here we will not enter into details and in the following we will not make explicit assumptions for the shape of $V\big(\Phi(t)\big)$.
Consequently, all our conclusions are generic. Each Lagrangian density leads to an energy-momentum tensor    Tµν = ∂µΦ ∂νΦ + gµνL. (3.6) This expression can be used to calculate the pressure and density due to the scalar field. We insert our current Langrangian density and metric, and compare the resulting energy- momentum tensor with that of the Friedmann fluid. The expression for the pressure and density can then be directly read off. We find 1 1 ρ(t) = Φ˙ 2(t) + V Φ(t), 2 c2 1 1 P (t) = Φ˙ 2(t) − V Φ(t). (3.7) 2 c2

18 Note that for a scalar field one has ∇µΦ = ∂µΦ.

Note that the first term can be seen as the kinetic energy density of the scalar field, and the second as the potential energy density. The expression for the density $\rho$ of the total energy is in excellent agreement with our expectation$^{19}$. We consider a Universe that, in its earliest epoch, is filled with the usual types of energy: cold matter, radiation, a cosmological constant, and now add an inflaton field. All these influences determine, through the Friedmann equations, the evolution of the scale factor with time. When the influence of the inflaton field is dominant, the Universe will enter a period of exponential expansion. The result is that the other influences can immediately be neglected: the energy density and pressure of the usual forms of energy and matter are proportional to the inverse of the scale factor to a certain power and therefore will asymptotically (and rapidly for typical inflation models) vanish. Therefore, all terms in the Friedmann equations that are proportional to the pressure and density of these usual forms can be neglected in the description of the Universe in an inflationary period. Consequently, we conclude that the Universe in an inflationary period can be described by the Friedmann equations substituting only the influence of the inflaton field, together with the equation of motion for that inflaton field, see Eq. (3.5). The resulting set of equations of motion are called the inflation equations

$$\ddot\Phi(t) + 3H(t)\dot\Phi(t) + c^2\partial_\Phi V\big(\Phi(t)\big) = 0, \qquad H^2(t) = \frac{8\pi G}{3c^2}\left(\frac{1}{2}\frac{1}{c^2}\dot\Phi^2(t) + V\big(\Phi(t)\big)\right). \tag{3.8}$$

We can summarize inflation cosmology as follows: argue for a particular expression for the potential energy density $V(\Phi(t))$, substitute this into the inflation equations, and solve the resulting set of equations of motion in order to obtain the explicit expression for the scale factor $a(t)$ and the inflaton field $\Phi(t)$. Note that it is not guaranteed that a solution $a(t)$, $\Phi(t)$ indeed describes inflation ($\dot a(t) > 0$, $\ddot a(t) > 0$) for each arbitrary shape of the potential energy density $V(\Phi(t))$. In the next section we will therefore derive criteria that, when obeyed, guarantee inflation.
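The recipe above can be tried numerically. The following is a minimal sketch, not part of the original derivation: it integrates the inflation equations (3.8) with a fourth-order Runge-Kutta step for a quadratic potential $V(\Phi) = \frac{1}{2}\mu^2\Phi^2$. The units ($c = 1$ and $8\pi G/(3c^2) = 1$), the value of `MU`, the initial field value and all function names are assumptions chosen only for illustration.

```python
import math

C = 1.0    # speed of light (assumed units)
K = 1.0    # 8*pi*G / (3*c^2) (assumed units)
MU = 0.1   # curvature of the assumed potential V = 0.5 * MU^2 * Phi^2


def potential(phi):
    return 0.5 * MU**2 * phi**2


def d_potential(phi):
    return MU**2 * phi


def hubble(phi, phi_dot):
    """Second inflation equation: H^2 = K * (phi_dot^2 / (2 c^2) + V)."""
    return math.sqrt(K * (0.5 * phi_dot**2 / C**2 + potential(phi)))


def rhs(state):
    """Time derivatives of the state (Phi, dPhi/dt, ln a)."""
    phi, phi_dot, _ = state
    h = hubble(phi, phi_dot)
    phi_ddot = -3.0 * h * phi_dot - C**2 * d_potential(phi)  # first inflation equation
    return (phi_dot, phi_ddot, h)                            # d(ln a)/dt = H


def rk4_step(state, dt):
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, 0.5 * dt))
    k3 = rhs(shifted(state, k2, 0.5 * dt))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * g + d)
                 for s, a, b, g, d in zip(state, k1, k2, k3, k4))


def integrate(phi0, phi_dot0, t_final, dt=0.01):
    """Return (Phi, dPhi/dt, ln a) at t_final, starting from ln a = 0."""
    state = (phi0, phi_dot0, 0.0)
    for _ in range(int(round(t_final / dt))):
        state = rk4_step(state, dt)
    return state


phi_end, phi_dot_end, ln_a_end = integrate(10.0, 0.0, 20.0)
print("Phi =", phi_end, " dPhi/dt =", phi_dot_end, " ln a =", ln_a_end)
```

With these assumed values the field starts at rest, settles onto a nearly constant small velocity, and $\ln a$ grows roughly linearly in time, which is the exponential expansion discussed in the text.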

C. Simplified inflation equations

Next, we consider the question in which manner a scalar field can be used to realize inflation. This is not a trivial question: we have already seen that inflation occurs when density and pressure lead to an equation of state with $n < -\frac{1}{3}$, but both quantities depend in a non-trivial manner on the magnitude of the scalar field $\Phi(t)$, which itself is dictated by the equations of motion. It is for this reason that at present one often makes the assumption of slow evolution$^{20}$: it is assumed that the scalar field slowly evolves in time, in such a manner that we may assume that the $\dot\Phi$ terms in Eq. (3.7) for density and pressure can be neglected (this implies that the kinetic energy of the field is much smaller than its potential energy),

$$\frac{1}{2}\frac{1}{c^2}\dot\Phi^2(t) \ll V\big(\Phi(t)\big). \tag{3.9}$$

19 The expressions for the density and pressure can also be obtained by assuming that ρ obeys this expression, and then, as we have demonstrated before, use the Friedmann equations to find the corresponding pressure. 20 The so-called Slow Roll Condition.

One then has

$$\rho(t) \approx +V\big(\Phi(t)\big), \qquad P(t) \approx -V\big(\Phi(t)\big). \tag{3.10}$$

Note that, with this assumption, the equation of state features the value $n = -1$, for all choices of the potential energy $V(\Phi(t))$. Previously, we have seen that this value for $n$ leads to an exponential expansion of the Universe. Thus, we have found that with the assumption of slow roll, exponential inflation results independent of the details of the model. The scalar field is called the inflaton field; the particles attributed to this field are called inflatons. In addition, we will assume that $\ddot\Phi(t)$ is much smaller than $3H\dot\Phi(t)$. In this way we can neglect the first term in Eq. (3.8). This implies that we assume that $\dot\Phi(t)$ only slowly changes its magnitude. This condition is important, since it states that the kinetic energy density of the inflatons remains small for a relatively long time. It prevents inflation coming to an end too early. With these assumptions the inflation equations simplify to

$$3H(t)\dot\Phi(t) + c^2\partial_\Phi V\big(\Phi(t)\big) = 0, \qquad H^2(t) - \frac{8\pi G}{3c^2}V\big(\Phi(t)\big) = 0. \tag{3.11}$$

The above equations will be called the Simplified Inflation Equations (SIE). Before we solve these equations, we will derive two important parameters. As discussed, the SIE are valid only when

$$\frac{\frac{1}{2}\frac{1}{c^2}\dot\Phi^2(t)}{V\big(\Phi(t)\big)} \ll 1, \qquad \text{and} \qquad \frac{\ddot\Phi(t)}{3H(t)\dot\Phi(t)} \ll 1 \tag{3.12}$$

are obeyed. Both demands can be rewritten with the help of the two SIE, and cast in an expression that only involves the potential energy density $V(\Phi(t))$. For a given inflation model this allows rapid inspection of the applicability of the simplified inflation equations. The first demand can be written as

$$\frac{\frac{1}{2}\frac{1}{c^2}\dot\Phi^2(t)}{V\big(\Phi(t)\big)} = \frac{c^2}{18H^2(t)}\frac{\big(\partial_\Phi V(\Phi(t))\big)^2}{V\big(\Phi(t)\big)} = \frac{1}{6}\frac{c^4}{8\pi G}\left(\frac{\partial_\Phi V(\Phi(t))}{V(\Phi(t))}\right)^2 \ll 1, \tag{3.13}$$

where in the first step the first SIE is used, and in the second step the second. We define the inflation parameter $\epsilon$ now as

$$\epsilon \equiv \frac{1}{6}\frac{c^4}{8\pi G}\left(\frac{\partial_\Phi V(\Phi(t))}{V(\Phi(t))}\right)^2, \tag{3.14}$$

and conclude that it must be much smaller than 1 in order to allow the use of the first of the two SIE.
The interpretation of the parameter $\epsilon$ can be readily seen: $\epsilon$ measures the slope of the function $V(\Phi(t))$, and demanding that this parameter is small corresponds to the assumption that $V(\Phi(t))$ is flat. This parameter also has a different meaning. Using the SIE it can be demonstrated that

$$-\frac{\dot H(t)}{H^2(t)} = \frac{1}{3}\epsilon. \tag{3.15}$$

When $\epsilon \ll 1$ the left-hand side of this equation is much smaller than 1. This states that inflation occurs. This can be seen as follows: the definition of the Hubble constant, $H(t) \equiv \dot a(t)/a(t)$, states that this equation can be written as

$$1 - \frac{\ddot a(t)a(t)}{\dot a^2(t)} = \frac{1}{3}\epsilon \ll 1 \quad\rightarrow\quad \frac{\ddot a(t)a(t)}{\dot a^2(t)} \gg 0, \tag{3.16}$$

and the inequality guarantees that $\ddot a(t) \gg 0$: rapid expansion of the Universe. The conclusion is that the demand $\epsilon \ll 1$ not only implies that the SIE can be used, but also that inflation is guaranteed. The last remark explains the name inflation parameter. The second demand in Eq. (3.12) can be rewritten as

$$\frac{\ddot\Phi(t)}{3H(t)\dot\Phi(t)} = -\frac{1}{9H^2(t)}\left(c^2\partial^2_\Phi V\big(\Phi(t)\big) + 3\dot H(t)\right) = -\frac{c^4}{24\pi G}\frac{\partial^2_\Phi V(\Phi(t))}{V(\Phi(t))} + \frac{\epsilon}{9} \ll 1, \tag{3.17}$$

where in the first step the time derivative of the first SIE is used, and in the second step the second SIE with the condition given in Eq. (3.15). We know that $\epsilon \ll 1$, and conclude that the second term can be neglected. We define a parameter $\eta$ as

$$\eta \equiv -\frac{c^4}{24\pi G}\frac{\partial^2_\Phi V(\Phi(t))}{V(\Phi(t))}, \tag{3.18}$$

and $\eta$ must be much smaller than 1 in order that the second SIE may be used. The interpretation of the parameter $\eta$ can be seen as follows: $\eta$ is a measure for the change of the slope of $V(\Phi(t))$ in time. Demanding that this parameter is small implies that we assume that $V(\Phi(t))$ remains flat for a long time. In summary, we conclude that we can use the SIE to describe inflation for every choice for the potential energy density $V(\Phi(t))$, as long as it is a function that is nearly flat ($\epsilon \ll 1$) and remains flat for a long time ($\eta \ll 1$). Moreover, this demand guarantees that exponential inflation will result.
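The two criteria lend themselves to a quick numerical check for any candidate potential. The sketch below is an illustration under stated assumptions, not a prescription from the text: the function name `slow_roll_parameters`, the finite-difference step `rel_step`, and the default units $c = G = 1$ are all hypothetical choices. It evaluates $\epsilon$ from Eq. (3.14) and $\eta$ from Eq. (3.18) using central differences for the derivatives of $V$.

```python
import math


def slow_roll_parameters(potential, phi, c=1.0, G=1.0, rel_step=1e-3):
    """Return (epsilon, eta) of Eqs. (3.14) and (3.18) for V = potential(phi)."""
    h = rel_step * max(abs(phi), 1.0)  # finite-difference step (assumed choice)
    v = potential(phi)
    # Central differences for the first and second derivative of V.
    dv = (potential(phi + h) - potential(phi - h)) / (2.0 * h)
    d2v = (potential(phi + h) - 2.0 * v + potential(phi - h)) / h**2
    epsilon = (1.0 / 6.0) * c**4 / (8.0 * math.pi * G) * (dv / v) ** 2
    eta = -c**4 / (24.0 * math.pi * G) * d2v / v
    return epsilon, eta


# Example: a quadratic potential, evaluated at Phi = 3 in units with c = G = 1.
eps, eta = slow_roll_parameters(lambda p: p**2, 3.0)
print("epsilon =", eps, " eta =", eta)
```

Both numbers being much smaller than 1 in magnitude indicates that the SIE may be applied at that field value; a potential with a steep region would instead return values of order 1 or larger there.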

D. Example of an inflation model

We will now solve the simplified inflation equations for a massive inflaton field, or in other words a quantum field that describes particles with mass $m$. From field theory it is known$^{21}$ that the Lagrangian density for a time-dependent scalar field for a particle with mass $m$ can be written as

$$\mathcal{L} = -\frac{1}{2}\dot\Phi^2(t) - \frac{m^2}{2}\left(\frac{c}{\hbar}\right)^2\Phi^2(t), \tag{3.19}$$

where $\hbar$ is the reduced Planck constant ($\hbar = h/2\pi$). The potential energy density $V(\Phi(t))$ of this field can be read off and is given by

$$V\big(\Phi(t)\big) = \frac{m^2}{2}\left(\frac{c}{\hbar}\right)^2\Phi^2(t), \tag{3.20}$$

21 This is the so-called Klein-Gordon equation. It is the relativistic counterpart of the Schrödinger equation for a particle with spin 0.

and the SIE can be written as

$$3H(t)\dot\Phi(t) + m^2\left(\frac{c^4}{\hbar^2}\right)\Phi(t) = 0, \qquad H^2(t) - \frac{8\pi G}{6}\frac{m^2}{\hbar^2}\Phi^2(t) = 0. \tag{3.21}$$

These constitute two coupled differential equations: the solution of one equation influences that of the other, and vice versa. It is often a challenge to solve such systems of coupled differential equations, but in the present case it is not that difficult. In this case we take the square root of the second SIE and find an expression for the Hubble constant $H(t)$,

$$H(t) = \pm\sqrt{\frac{8\pi G}{6}}\,\frac{m}{\hbar}\,\Phi(t), \tag{3.22}$$

(the sign ambiguity results from taking the square root; we will decide later which of the two signs must be chosen to correctly describe inflation). Substituting this into the first of Eqs. (3.21), we obtain

$$\pm 3\sqrt{\frac{8\pi G}{6}}\,\frac{m}{\hbar}\,\Phi(t)\dot\Phi(t) + m^2\frac{c^4}{\hbar^2}\Phi(t) = 0. \tag{3.23}$$

Next, we divide both sides by $3\sqrt{\frac{8\pi G}{6}}\frac{m}{\hbar}\Phi(t)$, and obtain

$$\dot\Phi(t) \pm \frac{mc^4}{3\hbar}\sqrt{\frac{6}{8\pi G}} = 0. \tag{3.24}$$

The resulting differential equation can be solved in a straightforward manner and has as solution

$$\Phi(t) = \Phi_0 \mp \frac{mc^4}{3\hbar}\sqrt{\frac{6}{8\pi G}}\; t. \tag{3.25}$$

Here, $\Phi_0$ is a constant which corresponds to the value of the inflaton field at the start of its time evolution (so at $t = 0$). The expression describes the time evolution of the inflaton field. The next step is to find an expression for the scale factor $a(t)$. This can be accomplished by using the second SIE, and substituting the answer found for the inflaton field. The second SIE can then be written as

$$H(t) = \frac{\dot a(t)}{a(t)} = \pm\sqrt{\frac{8\pi G}{6}}\,\frac{m}{\hbar}\left(\Phi_0 \mp \frac{mc^4}{3\hbar}\sqrt{\frac{6}{8\pi G}}\; t\right) = \pm\sqrt{\frac{8\pi G}{6}}\,\frac{m}{\hbar}\,\Phi_0 - \frac{m^2c^4}{3\hbar^2}\, t. \tag{3.26}$$

We are now at a point where we can make a statement about the choice of sign. For an expanding Universe we must have by definition $\dot a(t) > 0$. Eq. (3.26) shows that this certainly cannot be the case when the lower sign is chosen, since then for all times it follows that $H(t) < 0$. Therefore, from now on we select the upper sign. Notice that with this choice there is still no guarantee that $\dot a(t) > 0$: the right-hand side of Eq. (3.26) can still be negative for certain times $t$. We return to this point later; for now it is only relevant to note that the present choice offers the possibility to find $\dot a(t) > 0$.

We will now solve differential equation (3.26). With our choice of sign, this equation can be written as

$$\dot a(t) = \left(\sqrt{\frac{8\pi G}{6}}\,\frac{m}{\hbar}\,\Phi_0 - \frac{m^2c^4}{3\hbar^2}\, t\right) a(t). \tag{3.27}$$

We propose the solution

$$a(t) = e^{-(\kappa + \lambda t)^2}, \tag{3.28}$$

and choose the constants $\kappa$ and $\lambda$ such that we solve the SIE. To achieve this we substitute Eq. (3.28) into Eq. (3.27), and find

$$-2\kappa\lambda - 2\lambda^2 t = \sqrt{\frac{8\pi G}{6}}\,\frac{m}{\hbar}\,\Phi_0 - \frac{m^2c^4}{3\hbar^2}\, t. \tag{3.29}$$

We can now directly read off the required values for $\kappa$ and $\lambda$. Apparently, we must choose

$$\kappa = -\frac{\sqrt{8\pi G}}{2c^2}\,\Phi_0 \qquad \text{and} \qquad \lambda = \frac{mc^2}{\hbar\sqrt{6}}. \tag{3.30}$$

Together with the expression for the inflaton field $\Phi(t)$ we found earlier, the entire problem is solved. The solution can be written as

$$\Phi(t) = \Phi_0 - \frac{mc^4}{3\hbar}\sqrt{\frac{6}{8\pi G}}\; t, \qquad a(t) = \exp\left[-\left(-\frac{\sqrt{8\pi G}}{2c^2}\,\Phi_0 + \frac{mc^2}{\hbar\sqrt{6}}\, t\right)^2\right]. \tag{3.31}$$

Next, we address the physics content of these functions. We observe that the solution for the inflaton field $\Phi$ decreases with time. However, does it decrease sufficiently slowly? We have seen before that accelerated expansion only results when the function $V(\Phi(t))$, here $V \propto \Phi^2(t)$, is slowly decreasing in time. Is this the case? In the last section we have seen that inflation is guaranteed only when the inflation parameter $\epsilon$ is much smaller than 1. When using the definition of the inflation parameter, we find

$$\epsilon \equiv \frac{1}{6}\frac{c^4}{8\pi G}\left(\frac{\partial_\Phi V(\Phi(t))}{V(\Phi(t))}\right)^2 = \frac{c^4}{12\pi G}\frac{1}{\Phi^2(t)} \ll 1. \tag{3.32}$$

When this condition is obeyed then inflation is guaranteed. Concerning the parameter $\eta$, it turns out to obey the same condition. We find

$$\eta \equiv -\frac{c^4}{24\pi G}\frac{\partial^2_\Phi V(\Phi(t))}{V(\Phi(t))} = -\frac{c^4}{12\pi G}\frac{1}{\Phi^2(t)} = -\epsilon. \tag{3.33}$$

This implies that when we show that one of the two parameters, for example $\epsilon$, is much less than 1 in magnitude, then the same holds for the other parameter. We will now explore when this is the

case. We use Eq. (3.31) for the inflaton field and take the square root of both sides. We now can write the condition as

$$t \ll \frac{3\hbar}{mc^4}\sqrt{\frac{8\pi G}{6}}\,\Phi_0 - \frac{\hbar}{mc^2}\sqrt{6}. \tag{3.34}$$

This yields a time limit: only when less time has elapsed than this limit, inflation will continue. Loosening this condition somewhat,

$$t \ll \frac{3\hbar}{mc^4}\sqrt{\frac{8\pi G}{6}}\,\Phi_0. \tag{3.35}$$

The right-hand side is called $t_{\rm end}$; at this time, inflation will definitely have stopped:

$$t_{\rm end} \equiv \frac{3\hbar}{mc^4}\sqrt{\frac{8\pi G}{6}}\,\Phi_0. \tag{3.36}$$

We use this to re-interpret our solution for the scale factor. The exponent of equation (3.31) can be expanded as

$$-\left(-\frac{\sqrt{8\pi G}}{2c^2}\,\Phi_0 + \frac{mc^2}{\hbar\sqrt{6}}\, t\right)^2 = -\frac{8\pi G}{4c^4}\,\Phi_0^2 + \sqrt{\frac{8\pi G}{6}}\,\frac{m}{\hbar}\,\Phi_0 t - \frac{m^2c^4}{6\hbar^2}\, t^2. \tag{3.37}$$

The first term is a constant, and only yields an additional factor in the expression for the scale factor. The other two terms can be simplified. Our condition for inflation, Eq. (3.35), states that the last term is much smaller than the middle term, and hence can be neglected for $t \ll t_{\rm end}$. The exponent then simplifies to

$$-\left(-\frac{\sqrt{8\pi G}}{2c^2}\,\Phi_0 + \frac{mc^2}{\hbar\sqrt{6}}\, t\right)^2 \approx -\frac{8\pi G}{4c^4}\,\Phi_0^2 + \sqrt{\frac{8\pi G}{6}}\,\frac{m}{\hbar}\,\Phi_0 t. \tag{3.38}$$

When we substitute this into our expression for the scale factor, we find

$$a(t) = e^{-\frac{8\pi G}{4c^4}\Phi_0^2} \cdot e^{\sqrt{\frac{8\pi G}{6}}\frac{m}{\hbar}\Phi_0 t}, \tag{3.39}$$

which is an increasing function with time: inflation! Indeed, the SIE exactly yield the behavior we predicted in the previous section. To conclude this example, we use our model for the massive inflaton field and assign values to the quantities $m$ and $\Phi_0$ that are common in the literature$^{22}$. We use

$$m \approx 10^{13}\ \frac{\rm GeV}{c^2} \approx 10^{-14}\ {\rm kg}, \qquad \Phi_0 \approx 10^{23}\ \frac{\sqrt{\rm m \cdot kg}}{\rm s}. \tag{3.40}$$

When these values are inserted in our expression for the time $t_{\rm end}$, we find that inflation has lasted for about $t \approx 10^{-35}$ s. This is an unimaginably short time, but the expansion of the

22 There are more theoretical and experimental motivations to assume that these values are needed for an inflationary description of our Universe. However, the derivations are advanced and cannot be treated here.

Universe was nevertheless enormous. We calculate the expansion factor by comparing the scale factor at the start of inflation (given by Eq. (3.39), with $t = 0$) to the value of the scale factor at the end of inflation (Eq. (3.31)). We find

$$\frac{a({\rm end})}{a({\rm begin})} = e^{\frac{8\pi G}{4c^4}\Phi_0^2}. \tag{3.41}$$

By taking logarithms of both sides, and inserting values for $m$ and $\Phi_0$, we find the number of e-powers the Universe increased during the inflation period in our model. Several tens of e-powers expansion is found. This is a gigantic expansion in a very short time: inflation!
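The duration estimate quoted in this example can be reproduced with a few lines of arithmetic. The sketch below is only an order-of-magnitude illustration: it evaluates $t_{\rm end}$ from Eq. (3.36) and the inflation parameter from Eq. (3.32) for the round values of Eq. (3.40). The function names are hypothetical, and the physical constants are standard SI values that do not appear in the text.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant [J s]
C = 2.99792458e8        # speed of light [m/s]
G = 6.67430e-11         # Newton's constant [m^3 kg^-1 s^-2]


def inflation_end_time(mass, phi0):
    """t_end of Eq. (3.36) for an inflaton mass [kg] and initial field value phi0."""
    return 3.0 * HBAR / (mass * C**4) * math.sqrt(8.0 * math.pi * G / 6.0) * phi0


def epsilon_quadratic(phi):
    """Inflation parameter of Eq. (3.32) for the quadratic potential."""
    return C**4 / (12.0 * math.pi * G * phi**2)


# Round values taken from Eq. (3.40).
mass = 1e-14   # [kg]
phi0 = 1e23    # [sqrt(m kg) / s]

print("t_end   =", inflation_end_time(mass, phi0), "s")
print("epsilon =", epsilon_quadratic(phi0), "at t = 0")
```

For these inputs the end time comes out within an order of magnitude of $10^{-35}$ s, and the inflation parameter at $t = 0$ is far below 1, so the simplified inflation equations apply at the start of the evolution.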

E. Terminating the inflation period

An inflation model needs to terminate, otherwise a Universe results that is extremely empty, since all matter and energy will be distributed over a vast and always expanding volume. Baryogenesis, nucleosynthesis, and recombination would never occur; stars and planets would not form, and life would not come about. Moreover, the expansion rate would have increased to a magnitude that would not be in agreement with present observations. Therefore, it is important to formulate a mechanism by which inflation terminates. In addition, this mechanism must ensure that at the end of inflation the Universe is filled with photons and becomes radiation dominated. In this manner there is a smooth transition to the Standard Model of Cosmology. Both conditions can be incorporated in the inflation models by making use of the quantum mechanical properties of the inflaton field. Quantum field theory teaches us that each field may have certain interactions with other quantum fields, and that these interactions determine how fast the quanta (particles) of one field transfer into those of the other field. It should therefore be possible to enrich the dynamics of the inflaton field $\Phi(t)$ such that it takes into account the decay of the inflaton field into new particles, which through a subsequent chain of decays yield photons. During an inflation period this decay will have little effect on the evolution of the Universe: the photons created contribute an energy density and pressure to the Friedmann equations, but these scale as $a^{-4}$ and become negligible almost immediately during an era of extreme expansion. This implies that photons are created, but distributed over an extremely vast volume and feature negligible energy density. However, if we impose a mechanism to terminate the extreme expansion, then the transfer of inflatons to photons may become large enough to dominate the evolution of the Universe.
Finally, all inflatons decay to photons, and from then on the Universe will be radiation dominated. Subsequent evolution will be described by the Standard Model. It is not easy to employ quantum field theory to describe the decay of inflatons to other particles. Little is known about the behavior of quantum fields beyond Minkowski spacetime. However, a qualitative description can be obtained, and we will discuss the most conventional model for the transition of inflatons to photons. This model amounts to the following: the potential energy density $V(\Phi(t))$ of the inflaton field features a deep and steep well. When the inflaton field has arrived at this well, it will transfer its potential energy density to kinetic energy density (in the same manner as a marble will transfer its potential energy to kinetic energy when it rolls into a well). Under these conditions the inflation parameter $\epsilon$ is sizable and inflation will terminate. Since the Universe now halts its fast expansion, photons created from the decay of the inflaton field

will from then on contribute to an increasing radiation energy density, and at some point overtake the density due to the inflaton field. At that instant the Universe becomes radiation dominated, and the Standard Model can be used to describe its evolution. This stage is known as the reheating phase$^{23}$. The dynamics of the reheating phase cannot be described by the simplified inflation equations, since this phase does not obey $\epsilon \ll 1$ and $\eta \ll 1$. We need to return to the original inflation equations (3.8). However, even these equations are not quite appropriate since they do not account for the decay of the inflaton field into photons, or for the fact that an increasing energy density due to radiation is present. We correct this by adding a term to the first inflation equation (3.8), which describes the decrease of the inflaton field strength due to the decay into other particles. We assume that the decay of the field strength can be described by a term proportional to its time derivative, in analogy with the friction force on a marble being proportional to the velocity of the marble. This analogy obviously presents no proof of the statement that the decay to other particles can be described by a term proportional to $\dot\Phi(t)$. Advanced quantum mechanical calculations however show that the decay can indeed be described by such a term. Consequently, we will add a term $\Gamma\dot\Phi(t)$ to the first inflation equation, where $\Gamma$ is a measure for the decay rate of the inflaton field,

$$\ddot\Phi(t) + \big(3H(t) + \Gamma\big)\dot\Phi(t) + c^2\partial_\Phi V\big(\Phi(t)\big) = 0. \tag{3.42}$$

Secondly, also the second inflation equation (3.8) needs to be adapted, since it does not account for the fact that as the inflaton field decays, a growing energy density ργ(t) due to radiation will be present. To account for this we add this density and find

$$H^2(t) = \frac{8\pi G}{3c^2}\big(\rho_\Phi(t) + \rho_\gamma(t)\big). \tag{3.43}$$

Furthermore, we add pressure and density to the second Friedmann equation. One finds

$$\frac{\ddot a}{a} = -\frac{4\pi G}{3c^2}\big(\rho_\gamma(t) + 3P_\gamma(t) + \rho_\Phi(t) + 3P_\Phi(t)\big). \tag{3.44}$$

This can be simplified, since we know that radiation pressure is equal to ργ/3. In addition, we can write the inflaton field pressure as

PΦ = n(t)ρΦ. (3.45)

During inflation we have $n = -1$, but during the reheating phase we do not know the value of $n$. We do not even know whether it is a constant at all, and for this reason we keep a time dependence. The second Friedmann equation can be written as

$$\frac{\ddot a}{a} = -\frac{4\pi G}{3c^2}\Big(2\rho_\gamma(t) + \big(3n(t) + 1\big)\rho_\Phi(t)\Big). \tag{3.46}$$

In principle we are done now: Eqs. (3.42), (3.43), and (3.44) describe how the inflaton field transfers its energy into photons, how the inflaton field decays, and how the scale factor

23 The prefix re- is misleading: it suggests that the Universe was heated before. Whether this is indeed the case can probably only be answered by a theory of quantum gravity.

evolves during this process. We only have to enter a decay constant $\Gamma$, the shape $V(\Phi(t))$ of the potential well, and the state parameter $n(t)$, and then solve these equations; in this way the entire evolution of the reheating process is determined. It is common to rewrite these expressions in such a manner that the inflaton field $\Phi(t)$ does not appear. Instead the energy density $\rho_\Phi$ is used (this is a more natural quantity than the field itself). First we take the time derivative of $\rho_\Phi(t) = \frac{1}{2c^2}\dot\Phi^2(t) + V(\Phi(t))$,

$$\dot\rho_\Phi(t) = \frac{1}{c^2}\dot\Phi(t)\ddot\Phi(t) + \dot\Phi(t)\,\partial_\Phi V\big(\Phi(t)\big), \tag{3.47}$$

and insert this into Eq. (3.42). Together with the fact that $\frac{1}{c^2}\dot\Phi^2(t) = \rho_\Phi(t) + P_\Phi(t)$ and the notation in Eq. (3.45), we find the first of Eqs. (3.48). A second equation can be found by taking the time derivative of the first Friedmann equation, and then using the second Friedmann equation to eliminate $\ddot a/a$. The result is

$$\dot\rho_\Phi(t) = -\big(3H(t) + \Gamma\big)\big(n(t) + 1\big)\rho_\Phi(t),$$
$$\dot\rho_\gamma(t) = -4H(t)\rho_\gamma(t) + \Gamma\big(n(t) + 1\big)\rho_\Phi(t),$$
$$H^2(t) = \frac{8\pi G}{3c^2}\big(\rho_\Phi(t) + \rho_\gamma(t)\big). \tag{3.48}$$

These equations are called the reheating equations. In the following section we will solve these equations for a special choice for the functions $\Gamma$, $V(\Phi(t))$ and $n(t)$.

F. The reheating phase: a simple example

The reheating equations constitute a set of three coupled differential equations, and these are in general not straightforward to solve. Therefore, we will assume that the inflaton field, for each oscillation in the potential well, only transfers little energy to the photon field. Furthermore, the size of the Universe only changes little during a single oscillation. In other words, we assume that the time scale on which oscillations take place is much shorter than the time scale for the decay of the inflaton field and the expansion of the Universe. The first assumption imposes a demand on the size of the function $\Gamma$, while the second assumption is motivated by noting that during the reheating phase the Universe is not in a state of exponential expansion. With these assumptions, we can take the energy density per oscillation of the inflaton field constant. One finds

$$\rho_\Phi = \frac{1}{2c^2}\dot\Phi^2(t) + V\big(\Phi(t)\big) = \rho_{\max} = \text{const}. \tag{3.49}$$

We stress again that the above expression holds per oscillation: for each subsequent oscillation $\rho_{\max}$ will have a smaller value than during the previous oscillation. With these assumptions we can simplify the reheating equations: it will be shown that we can now assign a value to the state parameter $n(t)$, when the function $V(\Phi(t))$ is assigned, and that this value will turn out to be constant. In this context, note that the state parameter, and pressure and density are related:

$$n(t) + 1 = \frac{\rho_\Phi(t) + P_\Phi(t)}{\rho_\Phi(t)}. \tag{3.50}$$

The assumption that the oscillations of the inflaton field are fast allows us to replace numerator and denominator by their averages over a single oscillation. Such an average $\langle f(t)\rangle$ is defined as an integral over all values that a function $f(t)$ takes during the oscillation, divided by the period $T$ of the oscillation,

$$\langle f(t)\rangle \equiv \frac{1}{T}\int_0^T f(t)\,dt. \tag{3.51}$$

Since all time dependence is now integrated out, the expression for the state parameter $n(t)$ will yield a constant; this leads to a considerable simplification in solving the reheating equations! Before proceeding, we first need to calculate the time averages in Eq. (3.50). This appears to present a problem: to calculate the integrals, we need the functions $\rho_\Phi(t)$ and $P_\Phi(t)$, and these are precisely the functions we are trying to find by solving the reheating equations. However, there is a way out: we can rewrite the time integrals as integrals over the inflaton field, and we know that the inflaton field oscillates between two extreme values,

$$\Phi_0 \equiv \Phi(t = 0), \qquad \text{and} \qquad \Phi_{1/2} \equiv \Phi\!\left(t = \frac{T}{2}\right). \tag{3.52}$$

Here, $\Phi_0$ is the value of the inflaton field when it starts the oscillation: it is the value of the field at the top of the well. The second is the value the inflaton field assumes when it is halfway through the oscillation: it is the magnitude of the field when it is at the bottom of the well, and when it starts increasing again. It turns out that this information is sufficient for calculating the integrals. The averages can be written as

$$\langle f(t)\rangle \equiv \frac{1}{T}\int_0^T f(t)\,dt = \frac{2}{T}\int_{\Phi_0}^{\Phi_{1/2}} \frac{f(t)}{\dot\Phi(t)}\,d\Phi. \tag{3.53}$$

By assuming that the energy density of the field is constant during a single oscillation, it follows that

$$\rho_\Phi = \frac{1}{2c^2}\dot\Phi^2(t) + V\big(\Phi(t)\big) = \rho_{\max} \quad\rightarrow\quad \dot\Phi(t) = \pm\sqrt{2c^2\Big(\rho_{\max} - V\big(\Phi(t)\big)\Big)}, \tag{3.54}$$

and we can write the average < f(t) > as

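$$\langle f(t)\rangle = \pm\frac{2}{T}\int_{\Phi_0}^{\Phi_{1/2}} f(t)\left[2c^2\Big(\rho_{\max} - V\big(\Phi(t)\big)\Big)\right]^{-1/2} d\Phi. \tag{3.55}$$

This is a useful expression. By inserting the general expressions for the energy density and pressure due to the inflaton field, with $\rho_\Phi + P_\Phi = \dot\Phi^2/c^2 = 2(\rho_{\max} - V)$, we find that the numerator and denominator of Eq. (3.50) can be written as

$$\langle \rho_\Phi(t) + P_\Phi(t)\rangle = \pm\frac{2}{T}\sqrt{\frac{2\rho_{\max}}{c^2}}\int_{\Phi_0}^{\Phi_{1/2}}\left(1 - \frac{V(\Phi(t))}{\rho_{\max}}\right)^{1/2} d\Phi,$$
$$\langle \rho_\Phi(t)\rangle = \pm\frac{2}{T}\sqrt{\frac{\rho_{\max}}{2c^2}}\int_{\Phi_0}^{\Phi_{1/2}}\left(1 - \frac{V(\Phi(t))}{\rho_{\max}}\right)^{-1/2} d\Phi. \tag{3.56}$$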

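The state parameter $n(t)$ is given by

$$n(t) + 1 = \frac{2\displaystyle\int_{\Phi_0}^{\Phi_{1/2}}\left(1 - \frac{V(\Phi)}{\rho_{\max}}\right)^{1/2} d\Phi}{\displaystyle\int_{\Phi_0}^{\Phi_{1/2}}\left(1 - \frac{V(\Phi)}{\rho_{\max}}\right)^{-1/2} d\Phi}. \tag{3.57}$$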
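The oscillation average can also be evaluated numerically for a given well. The sketch below is an illustration under stated assumptions rather than part of the derivation: it assumes a well whose minimum lies at $\Phi = 0$ (so $\Phi_{1/2} = 0$) and whose potential rises monotonically up to the turning point $\Phi_0$, and it works directly from the definition (3.50) with $\dot\Phi$ taken from Eq. (3.54). The function name `state_parameter` and the substitution $\Phi = \Phi_0\cos\theta$, which removes the integrable singularity of $1/\dot\Phi$ at the turning point, are choices made here.

```python
import math


def state_parameter(potential, phi0, c=1.0, steps=2000):
    """Oscillation-averaged n = <P>/<rho> for a well with its minimum at Phi = 0.

    The field runs from the turning point phi0 (where dPhi/dt = 0) to the
    bottom of the well; rho_max = V(phi0) is held fixed during the oscillation.
    """
    rho_max = potential(phi0)
    d_theta = 0.5 * math.pi / steps
    numerator = 0.0    # integral of (rho + P) / |dPhi/dt| over Phi
    denominator = 0.0  # integral of rho_max / |dPhi/dt| over Phi
    for k in range(steps):
        theta = (k + 0.5) * d_theta                # midpoint rule in theta
        phi = phi0 * math.cos(theta)
        d_phi = phi0 * math.sin(theta) * d_theta   # |dPhi| for this slice
        phi_dot = math.sqrt(2.0 * c**2 * (rho_max - potential(phi)))
        numerator += phi_dot / c**2 * d_phi        # rho + P = phi_dot^2 / c^2
        denominator += rho_max / phi_dot * d_phi
    return numerator / denominator - 1.0


# Example: a parabolic well V = 0.5 * Phi^2 with turning point Phi_0 = 2.
print("n =", state_parameter(lambda p: 0.5 * p**2, 2.0))
```

The common prefactor $2/T$ and the sign drop out of the ratio, so neither the period nor the direction of motion needs to be known.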

What remains to be done is to insert the specific shape $V(\Phi(t))$ for the potential well in which the inflaton field oscillates, and the values $\Phi_0$ and $\Phi_{1/2}$ the inflaton field assumes at the highest and lowest point in the well. Next, we choose a specific well and calculate the state parameter. When this is accomplished, it is only a small step to solve the reheating equations. As a follow up to the previous example, we again consider the case of a massive inflaton field. As we have seen before, the potential energy density can be written as

$$V\big(\Phi(t)\big) = \frac{m^2}{2}\left(\frac{c}{\hbar}\right)^2\Phi^2(t), \tag{3.58}$$

where $m$ represents the mass of the inflatons. This function defines a parabolic well, with its lowest point at $\Phi = 0$. Thus, at the bottom of this well,

Φ1/2 = 0. (3.59)

To determine the value of Φ0, we need to consider that when the inflaton is just starting an oscillation, it has not yet obtained kinetic energy density, Φ˙ = 0, and the resulting total energy density is given entirely by the potential energy density. One finds

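$$\rho_\Phi = V(\Phi_0) = \frac{m^2}{2}\left(\frac{c}{\hbar}\right)^2\Phi_0^2. \tag{3.60}$$

We assumed that the total energy density was constant during a single oscillation, $\rho_\Phi = \rho_{\max}$, and we find for the value of the inflaton field at the highest point in the well

$$\Phi_0 = \pm\sqrt{\frac{2\rho_{\max}}{m^2}\left(\frac{\hbar}{c}\right)^2}. \tag{3.61}$$

Next, we evaluate the integrals in the numerator and denominator of Eq. (3.57). When we enter the potential well given by Eq. (3.58), the denominator is given by

$$\int_{\Phi_0}^{\Phi_{1/2}}\left(1 - \frac{1}{\rho_{\max}}\frac{m^2}{2}\left(\frac{c}{\hbar}\right)^2\Phi^2\right)^{-1/2} d\Phi. \tag{3.62}$$

To simplify our notation, we will introduce a variable $x$, for which

$$x \equiv \frac{m}{\sqrt{2\rho_{\max}}}\left(\frac{c}{\hbar}\right)\Phi. \tag{3.63}$$

With this substitution the integral in Eq. (3.62) appears more friendly. We find

$$\frac{\sqrt{2\rho_{\max}}}{m}\left(\frac{\hbar}{c}\right)\int_1^0\left(1 - x^2\right)^{-1/2} dx. \tag{3.64}$$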

We can directly solve this by using a standard integral, and find

$$\frac{\sqrt{2\rho_{\max}}}{m}\left(\frac{\hbar}{c}\right)\int_1^0\left(1 - x^2\right)^{-1/2} dx = -\frac{\sqrt{2\rho_{\max}}}{m}\left(\frac{\hbar}{c}\right)\arcsin(1). \tag{3.65}$$

Here, we used the integral $\int \frac{1}{\sqrt{1 - x^2}}\,dx = \arcsin(x)$. The integral of the numerator of Eq. (3.57) becomes

$$\int_{\Phi_0}^{\Phi_{1/2}}\left(1 - \frac{1}{\rho_{\max}}\frac{m^2}{2}\left(\frac{c}{\hbar}\right)^2\Phi^2\right)^{1/2} d\Phi. \tag{3.66}$$

We again ease the notation and rewrite the integral as

$$\frac{\sqrt{2\rho_{\max}}}{m}\left(\frac{\hbar}{c}\right)\int_1^0\left(1 - x^2\right)^{1/2} dx. \tag{3.67}$$

We can solve the above integral by using $\int (1 - x^2)^{1/2}\,dx = \frac{1}{2}\left\{\arcsin(x) + x\sqrt{1 - x^2}\right\}$. We find

$$\frac{\sqrt{2\rho_{\max}}}{m}\left(\frac{\hbar}{c}\right)\int_1^0\left(1 - x^2\right)^{1/2} dx = -\frac{\sqrt{2\rho_{\max}}}{m}\left(\frac{\hbar}{c}\right)\frac{1}{2}\arcsin(1). \tag{3.68}$$

The value for the state parameter $n$ immediately follows as

$$n(t) + 1 = \frac{-2\,\frac{\sqrt{2\rho_{\max}}}{m}\left(\frac{\hbar}{c}\right)\frac{1}{2}\arcsin(1)}{-\frac{\sqrt{2\rho_{\max}}}{m}\left(\frac{\hbar}{c}\right)\arcsin(1)} = 1 \quad\rightarrow\quad n(t) = 0. \tag{3.69}$$

After this mathematical excursion we return to the physics. We have found that inflatons, during the reheating phase, behave as a pressureless gas. This is exactly what we denoted in the previous chapter as cold matter, for which $n = 0$. Furthermore, notice that the value found for $n$ does not depend on the amplitude of the oscillation. This means that we can use this expression for $n$ for every oscillation, independent of the fact that each next oscillation has a smaller amplitude. With the knowledge that $n = 0$, the reheating equations simplify to

$$\dot\rho_\Phi(t) = -\big(3H(t) + \Gamma\big)\rho_\Phi(t),$$
$$\dot\rho_\gamma(t) = -4H(t)\rho_\gamma(t) + \Gamma\rho_\Phi(t),$$
$$H^2(t) = \frac{8\pi G}{3c^2}\big(\rho_\Phi(t) + \rho_\gamma(t)\big). \tag{3.70}$$

The strategy for solving these equations is as follows: we make an assumption for the shape of the Hubble constant $H(t)$, and enter this into the first reheating equation. This yields a differential equation for $\rho_\Phi$ which is decoupled from the other two equations and can therefore be solved independently. When this equation is solved, we use the solution $\rho_\Phi(t)$ and our assumption for $H(t)$ in the third reheating equation. This results in an algebraic expression for the energy density $\rho_\gamma(t)$. All three functions are now obtained, but since they only constitute a solution when they obey all three equations, we have to enter them in the second reheating equation to check whether all is consistent. If so, then our solution is correct; if not, then our initial assumption for the Hubble constant was incorrect and we have to start all over.
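As a cross-check on any analytic guess, the reheating equations (3.70) can also be integrated directly. The following is a minimal sketch under assumed conditions, not the approach taken in the text: the units ($8\pi G/(3c^2) = 1$), the decay rate `GAMMA`, the initial densities and the function names are all hypothetical choices for illustration. The Hubble constant is taken from the third equation at every step, so no assumption about its shape is needed.

```python
import math

K = 1.0      # 8*pi*G / (3*c^2), set to 1 (assumed units)
GAMMA = 0.1  # decay rate of the inflaton field (assumed value)


def derivatives(rho_phi, rho_gamma):
    """Right-hand sides of the first two equations in (3.70)."""
    hubble = math.sqrt(K * (rho_phi + rho_gamma))  # third equation in (3.70)
    d_rho_phi = -(3.0 * hubble + GAMMA) * rho_phi
    d_rho_gamma = -4.0 * hubble * rho_gamma + GAMMA * rho_phi
    return d_rho_phi, d_rho_gamma


def integrate(rho_phi, rho_gamma, t_final, dt=0.01):
    """Fourth-order Runge-Kutta; returns a list of (t, rho_phi, rho_gamma)."""
    history = [(0.0, rho_phi, rho_gamma)]
    for i in range(1, int(round(t_final / dt)) + 1):
        k1 = derivatives(rho_phi, rho_gamma)
        k2 = derivatives(rho_phi + 0.5 * dt * k1[0], rho_gamma + 0.5 * dt * k1[1])
        k3 = derivatives(rho_phi + 0.5 * dt * k2[0], rho_gamma + 0.5 * dt * k2[1])
        k4 = derivatives(rho_phi + dt * k3[0], rho_gamma + dt * k3[1])
        rho_phi += dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        rho_gamma += dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
        history.append((i * dt, rho_phi, rho_gamma))
    return history


def transition_time(history):
    """First tabulated time at which the radiation density exceeds the inflaton density."""
    for t, rho_phi, rho_gamma in history:
        if rho_gamma > rho_phi:
            return t
    return None


history = integrate(1.0, 0.0, 100.0)
print("radiation overtakes the inflaton field at t =", transition_time(history))
```

Starting from a Universe that contains only inflatons, the radiation density first grows from zero, overtakes the inflaton density at a time of order $\Gamma^{-1}$, and dominates from then on.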

What would be a good assumption for $H(t)$? We can imagine the following: at the start of the reheating phase the only significant form of energy density arises from the inflaton field. We have seen that it behaves as cold matter. According to our calculations the corresponding Hubble constant equals $H(t) = \frac{2}{3t}$. After the inflaton field has decayed to photons, the energy density due to radiation dominates that from inflatons. This density is characterized by a Hubble constant equal to $H(t) = \frac{1}{2t}$. Consequently, it seems reasonable to choose a function $H(t)$ that for small $t$ equals $H(t) = \frac{2}{3t}$, and for increasing $t$ asymptotically approaches $H(t) = \frac{1}{2t}$. In principle, infinitely many functions can be found which obey these demands, and we choose the following as an example:

$$H(t) = \frac{1}{t}\left(\frac{1}{2} + \frac{1}{6}e^{-\alpha t}\right). \tag{3.71}$$

We could insert this function and then (perhaps) discover whether we have the correct description. This is certainly not the case for our chosen function. A solution of the reheating equations has not yet been found and we still do not know the time evolution of the energy densities $\rho_\Phi$, $\rho_\gamma$ and the scale factor $a(t)$. However, we are in fact not so interested in the detailed time evolution. All we want to know is whether our model yields the correct amount of photons, such that the Standard Model can take over the evolution of the Universe. To obtain a satisfactory answer to this question we only need an approximate solution of the reheating equations (3.70). This is much easier than finding a complete solution. The strategy is as follows. We split the problem into two parts: first we consider the time evolution at the start of the reheating phase, thus in the time domain where the inflaton field dominates the energy density, and find approximate solutions. Subsequently, we will consider the end of the reheating phase, the time domain where photons dominate the energy density, and obtain approximate solutions.
In doing this we will also find the time $t_{\rm transition}$ where the energy density due to photons overtakes that due to inflatons. The solutions will contain a number of free constants, and we will assign values to these by demanding that the solutions from the different time domains smoothly join at $t_{\rm transition}$. In this manner we find functions $\rho_\Phi$, $\rho_\gamma$ and $a(t)$ that approximately solve the reheating equations. First, we consider the time domain where the energy density due to inflatons dominates:

$$\rho_\Phi \gg \rho_\gamma. \tag{3.72}$$

In this region the Hubble constant is given by $H(t) = \frac{2}{3t}$, and we write the third reheating equation as

$$\frac{4}{9t^2} = \frac{8\pi G}{3c^2}\,\rho_\Phi. \tag{3.73}$$

An expression for the energy density of the inflaton field immediately follows,

$$\rho_\Phi(t) = \frac{3c^2}{8\pi G}\,\frac{4}{9t^2}. \tag{3.74}$$

We substitute this into the first reheating equation, and find

$$-\frac{8}{9t^3}\,\frac{3c^2}{8\pi G} = -\left(\frac{2}{t} + \Gamma\right)\frac{3c^2}{8\pi G}\,\frac{4}{9t^2}, \tag{3.75}$$

and this can only be true when

$$\Gamma \ll \frac{2}{t}. \tag{3.76}$$

The above states that our solution, see Eq. (3.78), is only valid for times smaller than $t \approx \Gamma^{-1}$. This is what we expect: when the inflaton field rapidly decays to photons (thus when $\Gamma$ is large), then soon (at $t = \Gamma^{-1}$) the energy density $\rho_\Phi$ is smaller than $\rho_\gamma$. The expression found for $\rho_\Phi$ can now be inserted into the second reheating equation, and this yields a differential equation for the energy density $\rho_\gamma$ due to photons. One finds

$$\dot\rho_\gamma(t) = -\frac{8}{3t}\,\rho_\gamma + \Gamma\,\frac{3c^2}{8\pi G}\,\frac{4}{9t^2}. \tag{3.77}$$

The solution is given by

$$\rho_\gamma(t) = \frac{1}{t^2}\left[\frac{\rho_{\gamma<}}{t^{2/3}} + \frac{12}{45}\,\frac{3c^2}{8\pi G}\,\Gamma t\right]. \tag{3.78}$$

Here, $\rho_{\gamma<}$ is a free constant. Its value can be found with the help of the following argument: during inflation all energy density due to normal matter and radiation vanishes. Therefore, we demand that at the start of the reheating phase (this means: at the end of inflation) the energy density $\rho_\gamma$ vanishes. Inflation ended at instant $t_{\rm end}$, given in Eq. (3.36); when we insert this in Eq. (3.78) and demand that the result vanishes, we obtain for the constant $\rho_{\gamma<}$

$$\rho_{\gamma<} = -\frac{12}{45}\,\frac{3c^2}{8\pi G}\,\Gamma\left(\frac{3}{mc^4}\sqrt{\frac{8\pi G}{6}}\,\hbar\,\Phi_0\right)^{5/3}. \tag{3.79}$$

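With this the entire solution is known! At the start of the reheating phase $H(t)$, $\rho_\Phi(t)$ and $\rho_\gamma(t)$ are given by

$$\begin{aligned}
H(t) &= \frac{2}{3t}, \\
\rho_\Phi(t) &= \frac{3c^2}{8\pi G}\,\frac{4}{9t^2}, \\
\rho_\gamma(t) &= \frac{1}{t^2}\left[-\frac{12}{45}\,\frac{3c^2}{8\pi G}\,\Gamma\left(\frac{3}{mc^4}\sqrt{\frac{8\pi G}{6}}\,\hbar\,\Phi_0\right)^{5/3}\frac{1}{t^{2/3}} + \frac{12}{45}\,\frac{3c^2}{8\pi G}\,\Gamma t\right].
\end{aligned} \tag{3.80}$$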

This solution only holds for times $t \ll \Gamma^{-1}$. The second step in our analysis involves solving the reheating equations for times where the photon density exceeds the density due to inflatons. Then

$$\rho_\gamma \gg \rho_\Phi. \tag{3.81}$$

The Hubble constant is now given by $H(t) = \frac{1}{2t}$, and the third reheating equation becomes

$$\frac{1}{4t^2} = \frac{8\pi G}{3c^2}\,\rho_\gamma. \tag{3.82}$$

The expression for the energy density $\rho_\gamma(t)$ directly follows,

$$\rho_\gamma(t) = \frac{3c^2}{8\pi G}\,\frac{1}{4t^2}. \tag{3.83}$$

Next, we use the expression for the Hubble constant to express the first reheating equation as

$$\dot\rho_\Phi(t) = -\left(\frac{3}{2t} + \Gamma\right)\rho_\Phi(t). \tag{3.84}$$

The solution of this differential equation is easy to find, and equals

$$\rho_\Phi(t) = \frac{\rho_{\Phi>}}{t^{3/2}}\,e^{-\Gamma t}. \tag{3.85}$$

Here, $\rho_{\Phi>}$ is again a constant that we can freely assign. We use the argument of continuity: we demand that our solutions for the energy density $\rho_\Phi$ in the two time domains join smoothly at the instant $t_<$ at which the Universe is no longer dominated by the inflaton field. This amounts to demanding

$$\frac{3c^2}{8\pi G}\,\frac{4}{9t_<^2} = \frac{\rho_{\Phi>}}{t_<^{3/2}}\,e^{-\Gamma t_<}. \tag{3.86}$$

Note that this is not entirely correct. On the left side we have an expression for $\rho_\Phi$ that was derived for the time domain where $\rho_\gamma \ll \rho_\Phi$, while the right side of the expression is valid for $\rho_\gamma \gg \rho_\Phi$. We have already seen that the left side can be used when $t < t_<$, and we have to demonstrate that the right side can be used when $t > t_>$, where $t_>$ remains to be determined. It is clear that the above demand can only be imposed when $t_<$ and $t_>$ are not too far apart. It is certainly not guaranteed that this is the case, and we have to check this explicitly. When such is the case, we can consider both times as a single instant, and give it the name $t_{\rm transition}$. For now we assume that we are able to do this, and then equation (3.86) yields

$$\rho_{\Phi>} = \frac{3c^2}{8\pi G}\,\frac{4}{9}\,\frac{e^{\Gamma t_{\rm transition}}}{\sqrt{t_{\rm transition}}}. \tag{3.87}$$

The expressions found can be applied when expression (3.81) is satisfied. One finds

$$\frac{3c^2}{8\pi G}\,\frac{1}{4t^2} \gg \frac{\rho_{\Phi>}}{t^{3/2}}\,e^{-\Gamma t}. \tag{3.88}$$

This condition determines the time $t_>$ after which the derived expressions are valid. Unfortunately, it is impossible to express $t_>$ in a closed formula, but we can assign a numerical value to $t_>$; as indicated before we hope that it is in the neighborhood of $t_<$. Now the entire solution is known! We conclude that at the end of the reheating phase, the functions $H(t)$, $\rho_\Phi(t)$ and $\rho_\gamma(t)$ are given by

$$\begin{aligned}
H(t) &= \frac{1}{2t}, \\
\rho_\Phi(t) &= \left[\frac{3c^2}{8\pi G}\,\frac{4}{9}\,\frac{e^{\Gamma t_<}}{\sqrt{t_<}}\right]\frac{1}{t^{3/2}}\,e^{-\Gamma t}, \\
\rho_\gamma(t) &= \frac{3c^2}{8\pi G}\,\frac{1}{4t^2}.
\end{aligned} \tag{3.89, 3.90}$$

This completes our analysis. We have derived two sets of expressions for the functions $H(t)$, $\rho_\Phi(t)$ and $\rho_\gamma(t)$, where one set is valid at the start (the inflaton dominated part) and the other set in the photon dominated part. We have ensured that the first part correctly joins the values for $\rho_\Phi$ and $\rho_\gamma$ that are valid at the end of inflation, and that the second part correctly joins the first part. In addition, we have investigated when the transition takes place, and found that the first set of solutions can be used as long as $t < t_<$, while the second set can be used when $t > t_>$. Next, we will quantify our results, and estimate how long reheating has lasted. In addition we want to check whether $t_<$ and $t_>$ indeed correctly join.

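We continue our example, and again take for the inflaton mass $m$ and the constant $\Phi_0$ the values

$$m \approx 10^{13}\ \frac{\rm GeV}{c^2} \approx 10^{-14}\ {\rm kg}, \quad \text{and} \quad \Phi_0 \approx 10^{23}\ \frac{\sqrt{\rm m \cdot kg}}{\rm s}. \tag{3.91}$$

For the decay constant $\Gamma$ we assume

$$\Gamma \approx 10^{32}\ {\rm s}^{-1}. \tag{3.92}$$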

How did we obtain this value? It is an assumption! The constant $\Gamma$ was introduced as a measure for the decay rate of inflatons (through some chain of particle decays) to photons. Only little is known about the manner in which inflatons are coupled to other particles, and it is speculative to assume a value for the decay rate. However, there is a general rule in quantum field theory that states that more massive particles decay faster to other particles. We choose $\Gamma$ proportional to $m$ and state

$$\Gamma \approx \frac{mc^2}{\hbar} \tag{3.93}$$

(both sides now have units 1/s). With this assumption we can calculate the times $t_<$ and $t_>$. The first value is found by solving Eq. (3.76), and the second by solving Eq. (3.88). We find

$$t_< \approx 10^{-32}\ {\rm s}, \quad \text{and} \quad t_> \approx 10^{-32}\ {\rm s}. \tag{3.94}$$

These values are (to the extent that our approximations can tell the difference) identical, and we denote the transition point as $t_{\rm transition}$. After this time the inflaton field has transferred so much energy to the photon field that the Universe is radiation dominated. The constant $\rho_{\Phi>}$ is fixed by Eq. (3.87) as

$$\rho_{\Phi>} \approx 10^{42}\ \frac{\rm kg}{{\rm m}\,\sqrt{\rm s}}. \tag{3.95}$$

Now all constants are determined, and the evolution of the reheating phase is known. For all times between the end of inflation, $t_{\rm end}$, and the transition point $t_{\rm transition}$ we can use Eqs. (3.80), and for times after the transition we use Eqs. (3.90). After the Universe has become dominated by radiation, the Standard Model can be used to describe the evolution. The transition time is the time when the reheating phase ended: in our model reheating lasted about $10^{-32}$ seconds.
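The orders of magnitude quoted in Eqs. (3.94) and (3.95) can be reproduced with a few lines of Python. The sketch below is only illustrative: the function names are ours, the transition time inserted in Eq. (3.87) is taken to be $10^{-32}$ s, and we assume that "solving Eq. (3.76)" means taking $t_< = 2/\Gamma$, the time at which the inequality stops holding. The time $t_>$ is then estimated as the instant at which both sides of Eq. (3.88) are equal.

```python
import math

# Physical constants in SI units (rounded values)
C = 2.998e8      # speed of light, m/s
G = 6.674e-11    # Newton's constant, m^3 kg^-1 s^-2

GAMMA = 1e32     # decay constant of Eq. (3.92), 1/s
K = 3 * C**2 / (8 * math.pi * G)   # recurring prefactor 3c^2 / (8 pi G)


def rho_phi_const(t_transition, gamma=GAMMA):
    """Constant rho_{Phi>} of Eq. (3.87), in kg m^-1 s^-1/2."""
    return K * (4.0 / 9.0) * math.exp(gamma * t_transition) / math.sqrt(t_transition)


def t_greater(t_less, gamma=GAMMA):
    """Time at which both sides of Eq. (3.88) are equal, found by bisection.

    Assumption: rho_{Phi>} is fixed at t_less through Eq. (3.86).
    """
    def excess(t):
        # ratio of inflaton to photon energy density, minus one
        rho_phi = rho_phi_const(t_less, gamma) * t**-1.5 * math.exp(-gamma * t)
        rho_gamma = K / (4 * t**2)
        return rho_phi / rho_gamma - 1.0

    lo, hi = t_less, 10 * t_less   # excess > 0 at lo, excess < 0 at hi
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    t_less = 2.0 / GAMMA   # assumed boundary of the inequality (3.76)
    print("t_<        =", t_less, "s")
    print("t_>        =", t_greater(t_less), "s")
    print("rho_{Phi>} =", rho_phi_const(1e-32), "kg m^-1 s^-1/2")
```

Under these assumptions both times come out as a few times $10^{-32}$ s, and $\rho_{\Phi>}$ is of order $10^{42}$ in the units of Eq. (3.95).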

IV. RELATIVISTIC STARS

A. Spherically symmetric stars

Strong gravitational fields exist not only in cosmology near the initial singularity, but also in the interior of very dense stars and black holes. In this section, for simplicity, we consider only idealized stars that are spherically symmetric and static, i.e., unchanging. The requirements of spherical symmetry and staticity imply that 4-dimensional spacetime can be sliced into 3-dimensional spatial hypersurfaces which all have the same geometry, and these are in turn built out of 2-dimensional concentric spheres. The most general spherically symmetric, static spacetime has line element

$$ds^2 = -e^{2\Phi}dt^2 + e^{2\Lambda}dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) \tag{4.1}$$

$$\phantom{ds^2} = -e^{2\Phi}dt^2 + e^{2\Lambda}dr^2 + r^2 d\Omega^2, \tag{4.2}$$

where for brevity we have introduced

$$d\Omega^2 = d\theta^2 + \sin^2\theta\, d\phi^2. \tag{4.3}$$

The flat Minkowski metric has $\Phi = \Lambda = 0$. In general, $\Phi$ and $\Lambda$ will be functions of $r$, but not of $t$ (because of staticity) or of $\theta$, $\phi$ (because of spherical symmetry). We will want to solve the Einstein equations for the line element (4.2). A straightforward calculation shows that the Einstein tensor has components

$$G_{00} = \frac{1}{r^2}\,e^{2\Phi}\,\frac{d}{dr}\left[r\left(1 - e^{-2\Lambda}\right)\right], \tag{4.4}$$

$$G_{rr} = -\frac{1}{r^2}\,e^{2\Lambda}\left(1 - e^{-2\Lambda}\right) + \frac{2}{r}\,\Phi', \tag{4.5}$$

$$G_{\theta\theta} = r^2 e^{-2\Lambda}\left[\Phi'' + (\Phi')^2 + \frac{\Phi'}{r} - \Phi'\Lambda' - \frac{\Lambda'}{r}\right], \tag{4.6}$$

$$G_{\phi\phi} = \sin^2\theta\, G_{\theta\theta}, \tag{4.7}$$

with $' = d/dr$, and all other components vanish. We will also need to specify an energy-momentum tensor for the matter our idealized stars are made of. Any fluid in thermodynamical equilibrium has an equation of state of the form

$$P = P(\rho, S), \tag{4.8}$$

which relates the pressure to the energy density $\rho c^2$ and the specific entropy $S$. In many cases the dependence on $S$ can be neglected (e.g., because the latter is negligibly small and can be considered zero), and this is what we do here:

$$P = P(\rho). \tag{4.9}$$

For a perfect fluid with density $\rho$ and pressure $P$ one has

$$T^{\mu\nu} = (\rho + P)\,U^\mu U^\nu + P\,g^{\mu\nu}, \tag{4.10}$$

where $U^\mu$ is the 4-velocity of the fluid. The latter must satisfy

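$$U_\mu U^\mu = U^\mu U^\nu g_{\mu\nu} = -1, \tag{4.11}$$

and if the star is to be static, $U^\mu$ can only have a $t$ component. This implies

$$U^0 = e^{-\Phi}, \qquad U_0 = g_{00}U^0 = -e^{\Phi}, \tag{4.12}$$

with all other components zero. The energy-momentum tensor has components

$$T_{00} = \rho\,e^{2\Phi}, \tag{4.13}$$

$$T_{rr} = P\,e^{2\Lambda}, \tag{4.14}$$

$$T_{\theta\theta} = r^2 P, \tag{4.15}$$

$$T_{\phi\phi} = \sin^2\theta\, T_{\theta\theta}. \tag{4.16}$$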

This has to satisfy the conservation law

$$\nabla_\mu T^{\mu\nu} = 0. \tag{4.17}$$

Because of the high degree of symmetry, the only one of these equations that does not vanish identically is the one where $\nu = r$. It takes the form

$$(\rho + P)\,\frac{d\Phi}{dr} = -\frac{dP}{dr}. \tag{4.18}$$

This equation tells us what pressure gradient is needed to keep the fluid static in the gravitational field, whose effect depends on $d\Phi/dr$. The (0, 0) component of the Einstein equations can be found from Eqns. (4.4) and (4.13). At this point it is convenient to introduce the function

$$m(r) = \frac{1}{2}\,r\left(1 - e^{-2\Lambda(r)}\right), \tag{4.19}$$

so that

$$g_{rr} = e^{2\Lambda} = \frac{1}{1 - \frac{2m(r)}{r}}. \tag{4.20}$$

Then the (0, 0) Einstein equation implies

$$\frac{dm(r)}{dr} = 4\pi r^2 \rho(r). \tag{4.21}$$

This is exactly what one would obtain in Newtonian theory if $m(r)$ were the mass enclosed by the radius $r$. This interpretation is not tenable here, as we shall see in a moment; nevertheless $m(r)$ is convenient to calculate with. The $(r, r)$ Einstein equation can be found from Eqns. (4.5) and (4.14); it can be put in the form

$$\frac{d\Phi}{dr} = \frac{m(r) + 4\pi r^3 P}{r\,[r - 2m(r)]}. \tag{4.22}$$

We now discuss in turn the exterior solution, where $\rho = P = 0$, and possible interior solutions, which depend on the details of the equation of state $P = P(\rho)$.

B. Exterior solution

Outside the star we have $\rho = P = 0$, and the Einstein equations (4.21) and (4.22) reduce to

$$\frac{dm}{dr} = 0, \tag{4.23}$$

$$\frac{d\Phi}{dr} = \frac{m}{r(r - 2m)}. \tag{4.24}$$

These have the solutions

$$m(r) = M = {\rm const}, \tag{4.25}$$

$$e^{2\Phi(r)} = 1 - \frac{2M}{r}, \tag{4.26}$$

where we have assumed $\Phi \to 0$ as $r \to \infty$, as is required if the spacetime is to be asymptotically flat (i.e., if we move sufficiently far from the center we recover Minkowski spacetime). One has

$$M = m(R), \tag{4.27}$$

where $r = R$ is the coordinate radius of the star's surface. It follows that the exterior spacetime has the following form:

$$ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 + \frac{1}{1 - \frac{2M}{r}}\,dr^2 + r^2 d\Omega^2. \tag{4.28}$$

This is the Schwarzschild metric. Using the geodesic equation, one can show (see below) that at sufficiently large distances, and restoring $G$, a test particle will feel a purely radial inward acceleration $a_r = -GM/r^2$. This justifies the choice of the symbol $M$: it is the mass the star is perceived to have from a large distance away. However, it is not the proper mass of the star. The latter would be

$$M_p = \int \sqrt{{}^{(3)}g}\;\rho\; d^3x \tag{4.29}$$

$$\phantom{M_p} = 4\pi \int_0^R \rho(r)\left(1 - \frac{2m(r)}{r}\right)^{-1/2} r^2\, dr, \tag{4.30}$$

where we used the proper volume element on a $t = 0$ hypersurface, $\sqrt{{}^{(3)}g}\; d^3x$. By contrast, from (4.21) one simply has

$$M = 4\pi \int_0^R \rho(r)\, r^2\, dr. \tag{4.31}$$

Clearly $M < M_p$, the difference being due to the gravitational binding energy. Although we have not shown it, Birkhoff’s theorem says that the metric (4.28) is the only spherically symmetric and asymptotically flat solution of the vacuum Einstein equations, even if one drops the requirement that the spacetime be static! In particular, the exterior spacetime of a spherically symmetric star that is pulsating only radially is necessarily Schwarzschild. We will make use of Birkhoff’s theorem later on, when studying an example of a star undergoing radial collapse to form a black hole.
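To get a feeling for the size of the difference between $M$ and $M_p$, Eqs. (4.30) and (4.31) can be evaluated numerically. The sketch below assumes, purely for illustration, a star of constant density, for which Eq. (4.21) gives $m(r) = 4\pi r^3 \rho/3$; the function name and the midpoint rule are our own choices, and geometrized units ($G = c = 1$) are used.

```python
import math


def masses_uniform_star(rho, R, n=20000):
    """Return (M, M_p) for a constant-density star, with G = c = 1.

    M   follows Eq. (4.31), M_p follows Eq. (4.30); both integrals are
    approximated with the midpoint rule on n radial shells.
    """
    dr = R / n
    M = 0.0
    M_p = 0.0
    for i in range(n):
        r = (i + 0.5) * dr                        # midpoint of shell i
        m_r = 4.0 * math.pi * rho * r**3 / 3.0    # m(r) for constant density
        shell = 4.0 * math.pi * rho * r**2 * dr   # coordinate-volume mass element
        M += shell
        M_p += shell / math.sqrt(1.0 - 2.0 * m_r / r)   # proper-volume weighting
    return M, M_p


if __name__ == "__main__":
    # density chosen such that M = 0.2 for R = 1, i.e. 2M/R = 0.4
    rho = 0.6 / (4.0 * math.pi)
    M, M_p = masses_uniform_star(rho, 1.0)
    print("M =", M, " M_p =", M_p, " M_p / M =", M_p / M)
```

For this rather compact example the proper mass exceeds $M$ by roughly fifteen percent; for less compact stars the two masses approach each other.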

C. Interior solution

Inside the star ($r < R$) one has, in general, $\rho \neq 0$ and $P \neq 0$. Dividing Eq. (4.18) by $(\rho + P)$ and using it to eliminate $d\Phi/dr$ in (4.22), one obtains the Tolman-Oppenheimer-Volkoff (TOV) equation:

$$\frac{dP}{dr} = -\frac{(\rho + P)(m + 4\pi r^3 P)}{r\,(r - 2m)}. \tag{4.32}$$

The TOV equation, together with Eqs. (4.21) and (4.9), is sufficient to solve for $m(r)$, $\rho(r)$, and $P(r)$. The first two are first-order differential equations, requiring two integration constants. It is reasonable to set $m(r = 0) = 0$. The other constant, $P_c = P(r = 0)$, is the central pressure. Thus, given an equation of state, the latter completely determines the stellar model. Once we know $m(r)$, $\rho(r)$, and $P(r)$, we can define the surface of the star as the radius $R$ where $P(R) = 0$ (notice that by the TOV equation, $P(r)$ decreases monotonically). This radius $R$ then determines $M$ through $M = m(R)$. We now further simplify the treatment by positing

ρ = const, (4.33)

which replaces the equation of state (4.9). In a normal star this is not an appropriate assumption, but it is approximately true in the case of a neutron star. With this ansatz, Eq. (4.21) can be integrated immediately to give

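$$m(r) = \frac{4\pi r^3}{3}\,\rho. \tag{4.34}$$

This is of course only valid for $r < R$; outside the star we have

$$M = \frac{4\pi R^3}{3}\,\rho, \tag{4.35}$$

so that $M$ is now set by $\rho$ and $R$. The TOV equation (4.32) becomes

$$\frac{dP}{dr} = -\frac{4}{3}\,\pi r\,\frac{(\rho + P)(\rho + 3P)}{1 - 8\pi r^2 \rho/3}. \tag{4.36}$$

Integrating from an arbitrary central pressure $P_c$ we get

$$\frac{\rho + 3P}{\rho + P} = \frac{\rho + 3P_c}{\rho + P_c}\left(1 - \frac{2m}{r}\right)^{1/2}. \tag{4.37}$$

Since $P(R) = 0$ at the surface, it follows that the radius $R$ is given by

$$R^2 = \frac{3}{8\pi\rho}\left[1 - \left(\frac{\rho + P_c}{\rho + 3P_c}\right)^2\right], \tag{4.38}$$

or

$$P_c = \rho\,\frac{1 - (1 - 2M/R)^{1/2}}{3\,(1 - 2M/R)^{1/2} - 1}. \tag{4.39}$$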

Substituting this into (4.37) one gets

$$P = \rho\,\frac{(1 - 2Mr^2/R^3)^{1/2} - (1 - 2M/R)^{1/2}}{3\,(1 - 2M/R)^{1/2} - (1 - 2Mr^2/R^3)^{1/2}}. \tag{4.40}$$
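As a consistency check, the constant-density TOV equation (4.36) can also be integrated numerically and compared with the analytic radius of Eq. (4.38). The following sketch uses a classical fourth-order Runge-Kutta step in geometrized units ($G = c = 1$); the function names, the step size and the sample values of $\rho$ and $P_c$ are illustrative assumptions, not part of the derivation above.

```python
import math


def tov_rhs(r, P, rho):
    """Right-hand side of Eq. (4.36): dP/dr for constant density (G = c = 1)."""
    return -(4.0 / 3.0) * math.pi * r * (rho + P) * (rho + 3.0 * P) / (
        1.0 - 8.0 * math.pi * r**2 * rho / 3.0)


def integrate_star(rho, P_c, dr=1e-4):
    """Integrate outward from the center until P drops to zero; return the radius R."""
    r, P = 0.0, P_c
    while True:
        # classical RK4 step
        k1 = tov_rhs(r, P, rho)
        k2 = tov_rhs(r + 0.5 * dr, P + 0.5 * dr * k1, rho)
        k3 = tov_rhs(r + 0.5 * dr, P + 0.5 * dr * k2, rho)
        k4 = tov_rhs(r + dr, P + dr * k3, rho)
        P_next = P + dr * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        if P_next <= 0.0:
            # linear interpolation for the zero crossing of P within this step
            return r + dr * P / (P - P_next)
        r, P = r + dr, P_next


def analytic_radius(rho, P_c):
    """Stellar radius from Eq. (4.38)."""
    ratio = (rho + P_c) / (rho + 3.0 * P_c)
    return math.sqrt(3.0 / (8.0 * math.pi * rho) * (1.0 - ratio**2))


if __name__ == "__main__":
    rho, P_c = 0.05, 0.01   # sample values in geometrized units
    print("numerical R:", integrate_star(rho, P_c))
    print("analytic  R:", analytic_radius(rho, P_c))
```

The two radii should agree to well within the step size, which gives some confidence in the closed-form expressions (4.37) to (4.40).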

Using (4.18) we can now solve for Φ. In this case we know the value at r = R, since

$$g_{00}(R) = -(1 - 2M/R). \tag{4.41}$$

Hence

$$e^{\Phi(r)} = \frac{3}{2}\,(1 - 2M/R)^{1/2} - \frac{1}{2}\,(1 - 2Mr^2/R^3)^{1/2}. \tag{4.42}$$

This completes our derivation of the interior geometry in the constant density case. Notice that there is something peculiar about Eq. (4.39): as $M/R \to 4/9$, $P_c \to \infty$. Given a mass $M$, a uniform density star cannot have radius $R \leq 9M/4$, or an infinite pressure would be needed to support it. It was shown by Buchdahl that this is true for any spherically symmetric star model, assuming only that $\rho \geq 0$ and $d\rho/dr \leq 0$, irrespective of the equation of state. If one constructs a star of $R = 9M/4$ and gives it a small radial push inwards, it must collapse onto itself. The radius will become smaller and smaller, leaving nothing but Schwarzschild spacetime. This process is called complete gravitational collapse, and it results in the formation of a black hole.

The most common ultra-compact objects are not black holes but white dwarfs, which form when ordinary stars run out of fuel and fusion processes stop. Electron degeneracy (as a consequence of the Pauli exclusion principle) then provides the pressure to sustain the star. A typical white dwarf is thought to have a mass $M$ of about a solar mass ($M_\odot$) and a radius comparable to that of Earth, of the order of $10^4$ km. More generally, the radius $R$ scales like $M^{-1/3}$. The largest mass (and hence the largest compactness) which the degenerate electron gas can sustain is about $1.4\,M_\odot$ (the Chandrasekhar limit). Supernova explosions can produce objects that have a similar mass but are much more compact: neutron stars, in which almost all electrons have combined with protons to leave neutrons. Typical neutron stars have masses between $1.35\,M_\odot$ and $2.1\,M_\odot$. The radius is determined by the equation of state, which is highly uncertain (see footnote 24), but $\sim 10$ km should be typical. Despite the uncertainties, $5\,M_\odot$ is generally taken to be the largest neutron star mass that can be sustained. Though rare, there are main sequence stars of sufficient mass to produce a remnant of neutron star size but heavier than $5\,M_\odot$ when going supernova. There is no known physical process that would stop such an object from collapsing into a black hole.
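The statements above can be made quantitative with a short numerical sketch. It evaluates the central pressure of Eq. (4.39) as a function of the compactness $M/R$, and converts the typical white dwarf and neutron star values quoted in the text to the dimensionless combination $GM/(Rc^2)$. The function names and the rounded SI constants are our own assumptions; the constant-density formula is of course only a rough guide for real stars.

```python
import math

G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg


def central_pressure_ratio(compactness):
    """P_c / rho from Eq. (4.39); compactness = M/R in geometrized units."""
    if compactness >= 4.0 / 9.0:
        raise ValueError("no static uniform-density star for M/R >= 4/9")
    s = math.sqrt(1.0 - 2.0 * compactness)
    return (1.0 - s) / (3.0 * s - 1.0)


def compactness_si(mass_kg, radius_m):
    """Dimensionless compactness G M / (R c^2) from SI input."""
    return G * mass_kg / (radius_m * C**2)


if __name__ == "__main__":
    for x in (0.1, 0.2, 0.3, 0.4, 0.44):
        print("M/R =", x, " P_c/rho =", central_pressure_ratio(x))
    print("white dwarf :", compactness_si(M_SUN, 1.0e7))
    print("neutron star:", compactness_si(1.4 * M_SUN, 1.0e4))
```

A white dwarf has a compactness of order $10^{-4}$, far from the limit $4/9$, whereas a neutron star with the typical values used here sits at about $0.2$, within roughly a factor of two of it.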

24 In fact, gravitational wave signals from the coalescence of two neutron stars, which are expected to be observed within the next 5 years or so, will be the one sure way to determine the equation of state of bulk nuclear matter.

V. BLACK HOLES

We have seen that the exterior (vacuum) part of the spacetime of a spherically symmetric, static star is the Schwarzschild spacetime, Eq. (4.28):

$$ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 + \left(1 - \frac{2M}{r}\right)^{-1}dr^2 + r^2 d\Omega^2. \tag{5.1}$$

We also noted that the radius $R$ must satisfy $R > 9M/4$ in order that the pressure at the center of the star remain finite. If $R \leq 9M/4$ then the star cannot remain static and must collapse. However, if the collapse happens in a spherically symmetric way, then the outside, vacuum part of spacetime must remain Schwarzschild, as this is the only spherically symmetric and asymptotically flat solution to the vacuum Einstein equations. Following complete gravitational collapse, (4.28) is what the remaining spacetime geometry looks like. The way to understand the structure of a spacetime is to study the paths followed by massless particles (e.g., photons) and by massive particles: null and timelike geodesics, respectively.

A. Conserved quantities

Given that the metric (5.1) has a large amount of symmetry, our task will be simplified greatly by considering constants of the motion. Consider the geodesic equation:

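$$u^\mu \nabla_\mu u^\nu = 0, \tag{5.2}$$

with $u^\mu$ the 4-velocity. Since $\nabla_\mu g_{\nu\rho} = 0$, we can lower the free index, so

$$u^\mu \nabla_\mu u_\nu = 0. \tag{5.3}$$

If the geodesic is timelike (the spacetime path of a massive particle), $u^\mu \partial_\mu = d/d\tau$ with $\tau$ the proper time; in the case of a null geodesic (the path of a photon), $u^\mu \partial_\mu = d/d\lambda$, with $\lambda$ some affine parameter. For a particle, Eq. (5.3) can be written as

$$\frac{du_\nu}{d\tau} = \Gamma^\rho_{\mu\nu}\,u^\mu u_\rho. \tag{5.4}$$

Observe that

$$\begin{aligned}
\Gamma^\rho_{\mu\nu}\,u^\mu u_\rho &= \frac{1}{2}\,g^{\rho\sigma}\left(\partial_\mu g_{\sigma\nu} + \partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu}\right)u^\mu u_\rho \\
&= \frac{1}{2}\left(\partial_\mu g_{\sigma\nu} + \partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu}\right)g^{\rho\sigma}u_\rho u^\mu \\
&= \frac{1}{2}\left(\partial_\mu g_{\sigma\nu} + \partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu}\right)u^\sigma u^\mu.
\end{aligned} \tag{5.5}$$

The product $u^\sigma u^\mu$ is symmetric in $\sigma$ and $\mu$, but the first and third terms inside the brackets are together antisymmetric in these indices; hence they cancel and we are left only with the middle term:

$$\Gamma^\rho_{\mu\nu}\,u^\mu u_\rho = \frac{1}{2}\,\partial_\nu g_{\mu\sigma}\,u^\mu u^\sigma. \tag{5.6}$$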

We see that the geodesic equation can be written as

$$\frac{du_\nu}{d\tau} = \frac{1}{2}\,\partial_\nu g_{\mu\sigma}\,u^\mu u^\sigma \qquad \text{(particles)}. \tag{5.7}$$

The same goes for a null geodesic with the replacement of $\tau$ by the affine parameter $\lambda$. This leads to an important result: if all the components $g_{\mu\sigma}$ are independent of $x^\nu$ for a particular index $\nu$, then $\partial_\nu g_{\mu\sigma} = 0$, and $u_\nu$ is constant along the entire geodesic: it is a constant of the motion. In the Schwarzschild metric (4.28), all metric components are independent of $t$, hence $u_0$ is a constant of the motion. It is convenient to introduce another constant $E$:

$$E \equiv -u_0. \tag{5.8}$$

Note that in the case of a massive particle, $u_0 = p_0/m$ with $p^\mu$ the 4-momentum and $m$ the rest mass; $E$ is then the energy per unit rest mass. In the case of a photon, $E$ is simply the energy. The metric is also independent of the angle $\phi$, meaning $u_\phi$ is a constant of the motion. We define

$$L \equiv u_\phi, \tag{5.9}$$

where for a massive particle $L$ is the orbital angular momentum per unit rest mass. Because of spherical symmetry, the motion will be confined to a plane. Without loss of generality we can take this to be the equatorial plane, $\theta = \pi/2$, and one then has $u^\theta = d\theta/d\lambda = 0$, with $\lambda$ any affine parameter along the orbit. The non-zero components of the 4-velocity $u^\mu$ are then

$$\begin{aligned}
u^0 &= g^{0\nu}u_\nu = g^{00}u_0 = \left(1 - \frac{2M}{r}\right)^{-1}E, \\
u^r &= \frac{dr}{d\tau}, \\
u^\theta &= 0, \\
u^\phi &= g^{\phi\nu}u_\nu = g^{\phi\phi}u_\phi = \frac{1}{r^2}\,L.
\end{aligned} \tag{5.10}$$

B. Elliptical and circular orbits

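We can now use the defining equations for massive particles and photons:

$$\text{massive particle: } u^\mu u_\mu = -1; \qquad \text{photon: } u^\mu u_\mu = 0, \tag{5.11}$$

or

$$u^\mu u_\mu = -\kappa, \tag{5.12}$$

where $\kappa = 1$ for a massive particle and $\kappa = 0$ for a photon. When applied to our case, this yields:

$$-E^2\left(1 - \frac{2M}{r}\right)^{-1} + \left(1 - \frac{2M}{r}\right)^{-1}\left(\frac{dr}{d\tau}\right)^2 + \frac{L^2}{r^2} = -\kappa. \tag{5.13}$$

This equality immediately gives us the equation for the radial part of the orbital motion; rewriting it slightly we have

$$\frac{1}{2}\left(\frac{dr}{d\tau}\right)^2 + \frac{1}{2}\left(1 - \frac{2M}{r}\right)\left(\frac{L^2}{r^2} + \kappa\right) = \frac{1}{2}\,E^2. \tag{5.14}$$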

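Now introduce the effective potential

$$V_{\rm eff}(r) \equiv \frac{1}{2}\left(1 - \frac{2M}{r}\right)\left(\frac{L^2}{r^2} + \kappa\right). \tag{5.15}$$

Then the equation for radial motion (5.14) becomes

$$\frac{1}{2}\left(\frac{dr}{d\tau}\right)^2 + V_{\rm eff}(r) = \frac{1}{2}\,E^2. \tag{5.16}$$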

This is uncannily similar to what one gets in Newtonian gravity, where the radial motion is the same as that of a particle with unit mass moving along a line, with total energy E2/2 and experiencing an effective potential. In fact, let us make an explicit comparison. In GR, the particle has an effective potential

$$V_{\rm eff}(r) = \frac{1}{2}\,\kappa - \kappa\,\frac{M}{r} + \frac{L^2}{2r^2} - \frac{ML^2}{r^3}. \tag{5.17}$$

The first term is an irrelevant constant. The second term is the usual Newtonian potential. The third term also arises in the effective potential in Newtonian theory; it is the usual “centrifugal barrier term”. What is new is the fourth term, an attractive term that dominates the centrifugal barrier term at small $r$. It constitutes a crucial difference: as we will see in a moment, this term makes it impossible to have stable circular orbits arbitrarily close to the center. Let us first consider the trajectories of massive particles ($\kappa = 1$). Circular orbits (i.e., orbits with $dr/d\tau = 0$) are given by extrema of the effective potential:

$$0 = \frac{dV_{\rm eff}}{dr} = \frac{Mr^2 - L^2 r + 3ML^2}{r^4}, \tag{5.18}$$

which has roots

$$R_\pm = \frac{L^2 \pm (L^4 - 12L^2M^2)^{1/2}}{2M}. \tag{5.19}$$

We see that if $L^2 < 12M^2$ there are no extrema; the particle has insufficient angular momentum to be on a circular orbit and will fall inwards. For $L^2 > 12M^2$, one can check that the extremum $R_+$ is a minimum of $V_{\rm eff}(r)$, while $R_-$ is a maximum; see Fig. 17. This means that for such values of the specific angular momentum $L$, there are two possible circular orbits (whereas in Newtonian gravity there would have been only one): a stable one at $r = R_+$ and an unstable one at $r = R_-$. It is not difficult to see that $R_+$ is restricted to the range

$$R_+ > 6M, \tag{5.20}$$

and $R_-$ is restricted to

$$3M < R_- < 6M. \tag{5.21}$$

Thus, for a particle with mass, no stable circular orbits exist at $r < 6M$, and no circular orbits (stable or otherwise) exist for $r < 3M$. The last stable orbit (or innermost stable circular orbit) occurs when $L^2 = 12M^2$, in which case $R_+ = R_- = 6M$; this is illustrated in Fig. 18.
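The roots (5.19) and their stability are easy to explore numerically. The sketch below, with function names of our own choosing, returns the two circular orbit radii for a given value of $L^2$ and evaluates the second derivative of the effective potential (5.17) for $\kappa = 1$, whose sign distinguishes the minimum from the maximum. The angular momentum is passed as $L^2$ so that the limiting case $L^2 = 12M^2$ can be represented exactly.

```python
import math


def circular_orbit_radii(L2, M=1.0):
    """Return (R_minus, R_plus) from Eq. (5.19), or None if L^2 < 12 M^2.

    L2 is the squared specific angular momentum, in geometrized units.
    """
    disc = L2**2 - 12.0 * L2 * M**2
    if disc < 0.0:
        return None            # no extrema: the particle falls inwards
    root = math.sqrt(disc)
    return (L2 - root) / (2.0 * M), (L2 + root) / (2.0 * M)


def d2V_dr2(r, L2, M=1.0):
    """Second derivative of V_eff in Eq. (5.17) for kappa = 1.

    A positive value signals a minimum (stable orbit), a negative value
    a maximum (unstable orbit).
    """
    return -2.0 * M / r**3 + 3.0 * L2 / r**4 - 12.0 * M * L2 / r**5


if __name__ == "__main__":
    for L2 in (11.0, 12.0, 24.0):
        radii = circular_orbit_radii(L2)
        if radii is None:
            print("L^2 =", L2, ": no circular orbits")
        else:
            r_minus, r_plus = radii
            print("L^2 =", L2, ": R- =", r_minus, " R+ =", r_plus,
                  " V''(R-) =", d2V_dr2(r_minus, L2),
                  " V''(R+) =", d2V_dr2(r_plus, L2))
```

For $L^2 = 24M^2$, the value used in Fig. 17, the unstable orbit lies between $3M$ and $6M$ and the stable one well outside $6M$, in line with Eqs. (5.20) and (5.21).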

Figure 17: The effective potential for a massive particle with $L^2 = 24M^2$. The $r$ coordinate is plotted in units of $M$. For comparison the Newtonian effective potential is also plotted (dashed line).

Figure 18: The effective potential for a massive particle with $L^2 = 12M^2$, the smallest angular momentum for which there is a stable orbit (at $r = 6M$). By contrast, in Newtonian gravity stable orbits exist for arbitrarily small angular momentum.

Figure 19: The effective potential for a photon. There is an unstable circular orbit at $r = 3M$, but no stable orbits. The Newtonian effective potential consists of just the repulsive angular momentum term; here there are never any circular orbits.

Repeating the above considerations for photons ($\kappa = 0$), we get

$$0 = \frac{dV_{\rm eff}}{dr} = \frac{-L^2 r + 3ML^2}{r^4}. \tag{5.22}$$

In this case the only root is

$$R = 3M, \tag{5.23}$$

which is an unstable orbit; see Fig. 19. Thus, irrespective of angular momentum, the only circular orbit available to a photon is at $r = 3M$, but the tiniest perturbation will send it out to infinity or in towards the center. The surface $r = 3M$ is referred to as the light sphere. What about the angular motion? Solving for $L$ in the solution $R_+$ of Eq. (5.19), one finds

$$L^2 = Mr\left(1 - \frac{3M}{r}\right)^{-1} \tag{5.24}$$

for the angular momentum of a massive particle on a stable circular orbit at coordinate radius $r$. Recalling that $dr/d\tau = 0$ for circular orbits, one has $E^2 = 2V_{\rm eff}$, hence

$$E^2 = \left(1 - \frac{2M}{r}\right)^2\left(1 - \frac{3M}{r}\right)^{-1}. \tag{5.25}$$

Furthermore,

$$\frac{d\phi}{d\tau} = u^\phi = g^{\phi\phi}L = \frac{1}{r^2}\,L, \tag{5.26}$$

$$\frac{dt}{d\tau} = u^0 = g^{00}(-E) = \left(1 - \frac{2M}{r}\right)^{-1}E. \tag{5.27}$$

The inverse of the angular velocity is then

$$\frac{dt}{d\phi} = \frac{dt/d\tau}{d\phi/d\tau} = \left(\frac{r^3}{M}\right)^{1/2} = \frac{\Delta t}{\Delta\phi}, \tag{5.28}$$

where in the last equality we have used that $dt/d\phi$ is constant, as is evident from the next-to-last equality. The period $P$ is the time $\Delta t$ taken to complete a revolution, i.e., $\Delta\phi = 2\pi$. This is given by

$$P = 2\pi\sqrt{\frac{r^3}{M}}. \tag{5.29}$$

Formally, this is Kepler’s third law! However, the resemblance is superficial, because it refers to the time coordinate $t$, which corresponds to the proper time $\tau$ experienced by an observer only for stationary observers at $r \to \infty$.
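To attach numbers to Eq. (5.29), one can restore $G$ and $c$ and evaluate the coordinate-time period in SI units. The sketch below does this for the innermost stable circular orbit at $r = 6GM/c^2$; the function names, the rounded constants and the sample mass of ten solar masses are illustrative assumptions.

```python
import math

G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg


def orbital_period_si(mass_kg, r_m):
    """Coordinate-time period of Eq. (5.29) with G restored: 2 pi sqrt(r^3 / (G M))."""
    return 2.0 * math.pi * math.sqrt(r_m**3 / (G * mass_kg))


def isco_radius_si(mass_kg):
    """Schwarzschild coordinate radius r = 6 G M / c^2 of the innermost stable orbit."""
    return 6.0 * G * mass_kg / C**2


if __name__ == "__main__":
    mass = 10.0 * M_SUN
    r_isco = isco_radius_si(mass)
    print("ISCO radius:", r_isco / 1e3, "km")
    print("period     :", orbital_period_si(mass, r_isco) * 1e3, "ms")
```

For a black hole of ten solar masses the orbit at $r = 6M$ has a radius of roughly 90 km and a period of a few milliseconds, as measured in the coordinate time of a distant stationary observer.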

C. Radial infall

In the previous subsection we studied particles on circular orbits. What happens if a particle falls radially inwards? From Eq. (5.14), we can integrate to obtain the proper time. Up to an integration constant,

$$\tau = \int d\tau = \int \frac{dr}{\left[E^2 - (1 - 2M/r)(1 + L^2/r^2)\right]^{1/2}}. \tag{5.30}$$

For a radially infalling particle, $0 = u^\phi = L/r^2$, hence $L = 0$. If the particle starts out with zero radial velocity at $r = R$ then $dr/d\tau\,(r = R) = 0$, or

$$E^2 = 1 - \frac{2M}{R}. \tag{5.31}$$

Substituting this in Eq. (5.30) and setting $L = 0$, we get

$$\tau = \int \frac{dr}{\left[2M/r - 2M/R\right]^{1/2}}. \tag{5.32}$$

Using this equation, it is possible to express the behavior of both $r$ and $\tau$ in parametric form, namely

$$r = \frac{R}{2}\,(1 + \cos\eta), \tag{5.33}$$

$$\tau = \frac{R}{2}\left(\frac{R}{2M}\right)^{1/2}(\eta + \sin\eta). \tag{5.34}$$

We see that $\eta = 0$ corresponds to $r = R$. When $\eta = \pi$, $r = 0$, so that the particle will have reached the center. In principle Eq. (5.33) could be continued by increasing $\eta$ even further, but this would be incorrect, as we will discuss momentarily. From the above, it is clear that the particle hits the center after a finite proper time given by

$$\Delta\tau = \frac{\pi}{2}\,R\left(\frac{R}{2M}\right)^{1/2}. \tag{5.35}$$
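Equation (5.35) can be checked against a direct numerical evaluation of the integral (5.32), and converted to SI units to get a feeling for the time scale. In the sketch below the substitution $r = R(1 - s^2)$ is our own choice, made only to remove the integrable singularity of the integrand at $r = R$; the function names and the solar values are likewise illustrative assumptions.

```python
import math

G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
R_SUN = 6.957e8    # solar radius, m


def infall_proper_time(R, M):
    """Closed form of Eq. (5.35) in geometrized units (G = c = 1)."""
    return 0.5 * math.pi * R * math.sqrt(R / (2.0 * M))


def infall_proper_time_numeric(R, M, n=20000):
    """Midpoint-rule evaluation of Eq. (5.32) from r = R down to r = 0.

    With r = R (1 - s^2) the integrand becomes 2 R sqrt(R (1 - s^2) / (2 M))
    on the interval 0 <= s <= 1, which is finite everywhere.
    """
    ds = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        total += 2.0 * R * math.sqrt(R * (1.0 - s * s) / (2.0 * M)) * ds
    return total


def infall_proper_time_si(R_m, mass_kg):
    """Eq. (5.35) with G and c restored: (pi / 2) sqrt(R^3 / (2 G M)), in seconds."""
    return 0.5 * math.pi * math.sqrt(R_m**3 / (2.0 * G * mass_kg))


if __name__ == "__main__":
    print("closed form:", infall_proper_time(10.0, 1.0))
    print("numerical  :", infall_proper_time_numeric(10.0, 1.0))
    print("from the solar radius onto a solar-mass point:",
          infall_proper_time_si(R_SUN, M_SUN) / 60.0, "minutes")
```

A particle released at rest from the radius of the Sun and falling onto a point mass of one solar mass would need about half an hour of proper time to reach the center.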

What about coordinate time $t$? First note that

$$\frac{dr}{d\tau} = \frac{dr}{dt}\,\frac{dt}{d\tau} = \frac{dr}{dt}\,\frac{E}{1 - 2M/r}. \tag{5.36}$$

Combining this with Eq. (5.14) for $dr/d\tau$, we can solve for $dr/dt$, or $dt/dr$. Again specializing to the case $L = 0$ and integrating, one finds

$$t = \int dt = \int \frac{E\, dr}{\left[E^2 - (1 - 2M/r)\right]^{1/2}\,(1 - 2M/r)}. \tag{5.37}$$

This integral is not easy, but it is possible to solve it analytically. We simply state the result (originally due to Khuri):

$$t = \left[\left(\frac{R}{2} + 2M\right)\left(\frac{R}{2M} - 1\right)^{1/2}\eta + \frac{R}{2}\left(\frac{R}{2M} - 1\right)^{1/2}\sin\eta\right] + 2M \ln\left|\frac{(R/2M - 1)^{1/2} + \tan(\eta/2)}{(R/2M - 1)^{1/2} - \tan(\eta/2)}\right|, \tag{5.38}$$

where $\eta$ is the same parameter as before. We now notice something strange. The above expression diverges for $\eta = \bar\eta$ where

$$\tan(\bar\eta/2) = \left(\frac{R}{2M} - 1\right)^{1/2}. \tag{5.39}$$

With some trigonometry, this turns out to imply

$$\cos(\bar\eta) = \frac{4M}{R} - 1. \tag{5.40}$$

However, from Eq. (5.33) we know that this implies $r = 2M$!

D. The Schwarzschild horizon

Let us take stock. Eqns. (5.33) and (5.34) describe the motion of an infalling particle as seen by an observer A moving along with that particle, who will have proper time $\tau$. According to this observer, the particle will reach the center after a finite time given by Eq. (5.35). An observer B who is stationary and at large distance ($r \to \infty$) has a proper time that coincides with coordinate time $t$. According to observer B, the particle will never reach the center; in fact, as $t \to \infty$, the particle’s radial position asymptotically approaches $r = 2M$. Looking at the metric (5.1), one already suspects that the surface $r = 2M$ is special; indeed, as $r \to 2M$, $g_{00} \to 0$ and $g_{rr} \to \infty$. However, $r = 2M$ is only a coordinate singularity. The measurable quantities associated with the geometry of spacetime are not the metric components, but the components of the Riemann tensor, as they determine tidal forces. Now, one can check that the components of $R^\mu{}_{\nu\rho\sigma}$ all remain finite as $r \to 2M$. Hence the infinities in the metric are mere coordinate artefacts, which can be dealt with by choosing a different coordinate system. We leave it as an exercise to find such coordinate systems.

Nevertheless, this surface does have a peculiar property. Let us introduce a new coordinate $\zeta = 2M - r$. Then the line element is

$$ds^2 = \frac{\zeta}{2M - \zeta}\,dt^2 - \frac{2M - \zeta}{\zeta}\,d\zeta^2 + (2M - \zeta)^2 d\Omega^2. \tag{5.41}$$

For $r < 2M$ we have $\zeta > 0$, and $2M - \zeta = r$ is always positive; hence a line on which $(t, \theta, \phi)$ are constant has $ds^2 < 0$: it is timelike. Thus, $\zeta$ (and hence $r$) is a timelike coordinate while $t$ has become spacelike. Now, an infalling particle must follow a timelike worldline with constantly changing $r$. We know from the preceding discussion that a freely falling observer moves toward smaller and smaller $r$, hence the forward time direction corresponds to decreasing $r$ for any observer at $r < 2M$. This also goes for photons; hence light emitted from $r < 2M$ will never escape through the surface $r = 2M$. For this reason it is called a horizon.

The Schwarzschild spacetime does contain a true singularity, namely at $r = 0$, where components of the Riemann tensor do diverge. General relativity loses all predictive power there. Just as with the Big Bang singularity one encounters in cosmology, some other theory will be needed to describe what happens there, which will presumably combine general relativity and quantum mechanics into a theory of quantum gravity.

The name “black hole” was coined by the American physicist John Archibald Wheeler; it emphasizes the inability of light to escape from the horizon. The key theoretical discoveries about the properties of black holes were made in parallel, and almost entirely independently, in the West and in the Soviet Union. Russian scientists instead chose to talk about “frozen stars”. Indeed, even as a black hole forms, a distant observer will never see the surface of the collapsing star disappear behind the horizon. Particles on the star’s surface will travel on geodesics of the external Schwarzschild geometry, and an observer A following them (much like above) will find that they hit the central singularity in a finite time. However, an observer B who is stationary at arbitrarily large $r$ only sees the outer layer of the star approach the horizon asymptotically. To him, it never really disappears, and the star appears “frozen” in place. Note, incidentally, that observer B will never see observer A fall into the horizon either!

To observer B, light emitted from near the black hole will also appear to be redshifted. As discussed above, the $t$ component of a photon’s momentum, $p_0$, is a constant of the motion, which we related to an energy $E = -p_0$. Now consider an observer at finite $r$ who is momentarily at rest. His 4-velocity then has $u^i = 0$ (i.e., the spatial components vanish), but $u^0 = 1/\sqrt{-g_{00}}$ to ensure that $u^\mu u_\mu = -1$. The energy measured by this observer is

$$E^*(r) = -u^\mu p_\mu = \frac{1}{\sqrt{-g_{00}}}\,E = \left(1 - \frac{2M}{r}\right)^{-1/2}E. \tag{5.42}$$

As $r \to \infty$, $E^*(r) \to E$, so that $E$ is the energy measured by a very distant observer B if the photon moves far away from the black hole. Note that $E < E^*(r)$, the difference being the energy the photon spends in climbing out of the black hole’s gravitational field. With the energies $E^*(r)$ and $E$ we can associate frequencies $f^* = E^*/h$ and $f = E/h$, or wavelengths $\lambda^* = c/f^*$ and $\lambda = c/f$. The redshift experienced by a photon as it travels from finite $r$ to infinity is

$$z \equiv \frac{\lambda - \lambda^*}{\lambda^*} = \frac{f^*}{f} - 1, \tag{5.43}$$

and one has

$$z = \frac{1}{\sqrt{1 - 2M/r}} - 1. \tag{5.44}$$

For a photon emitted close to the horizon, $r = 2M + \epsilon$, the redshift becomes arbitrarily large as $\epsilon \to 0$.
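The growth of the redshift (5.44) near the horizon is easily tabulated. The following minimal sketch, with a function name of our own choosing, takes the emission radius in units of $M$ and returns $z$ for a photon received by a stationary observer at infinity.

```python
import math


def gravitational_redshift(r_over_M):
    """Redshift z of Eq. (5.44) for emission at rest at radius r, in units of M."""
    if r_over_M <= 2.0:
        raise ValueError("emission radius must lie outside the horizon r = 2M")
    return 1.0 / math.sqrt(1.0 - 2.0 / r_over_M) - 1.0


if __name__ == "__main__":
    for x in (1000.0, 10.0, 4.0, 3.0, 2.1, 2.001, 2.000001):
        print("r / M =", x, " z =", gravitational_redshift(x))
```

Far from the hole the redshift is approximately $M/r$, at $r = 4M$ it equals $\sqrt{2} - 1$, and it grows without bound as the emission radius approaches $2M$.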

E. Non-spherical black holes

So far we only considered spherically symmetric, asymptotically flat, vacuum black holes, in which case the Schwarzschild black hole is the only possible solution, as stated by Birkhoff’s theorem. However, one can also look for solutions to the Einstein equations that are spherically symmetric and asymptotically flat and contain no matter, but include an electromagnetic field. In special relativity one can write down an energy-momentum tensor for electromagnetism, which generalizes to curved spacetimes by replacing coordinate (partial) derivatives with covariant derivatives. By using this energy-momentum tensor $T_{\mu\nu}$ in the right hand side of the Einstein equations and again specializing to spherical symmetry and asymptotic flatness, once again a unique, static solution is arrived at, namely the Reissner-Nordström black hole. This is a black hole with an electric charge $Q$ at $r = 0$, which causes an electric field that permeates space. These solutions are probably not very relevant in terms of astrophysics, because black holes are formed by the gravitational collapse of ordinary star remnants, which to an excellent approximation are electrically neutral ($Q = 0$). More interesting are axially symmetric, asymptotically flat black holes in vacuum ($T_{\mu\nu} = 0$ again), called Kerr black holes. These represent rotating black holes. Since no matter is involved, what is rotating is space itself. Rotating black holes are surrounded by an ergosphere, a region where it is not possible to stand still, and one needs to move around the horizon in the direction of rotation of the hole; see Fig. 20. The rotational pull of a rotating black hole (not only inside the ergosphere but also outside it) is called frame dragging. The ergosphere has another peculiar property. When a particle enters it from the outside, then according to observers at large distances, its energy can become negative. This leads to an interesting effect called the Penrose process.
Suppose a particle is sent into the ergosphere and made to break up into two particles. Imagine the break-up is arranged in such a way that one particle falls into the horizon, while the other escapes to infinity. Since the particle falling in is adding negative energy to the black hole, the particle that escapes will have a larger (positive) energy than the outgoing particle: energy has been extracted from the black hole. What happens is that part of the rotational kinetic energy of the black hole is taken away. In principle the process can be repeated over and over again until all of the rotational energy is depleted, and one ends up with a non-rotating, Schwarzschild black hole. Now suppose one measures the mass M of the black hole as by holding a particle with mass m at a large distance and looking at the force one needs to exert to keep it in its place (for r → ∞, simply F = −GMm/r2). Then as energy is extracted, M will decrease! This is because the rotational kinetic energy contributes to the measured M. In principle, a rotating black hole can also have charge. However, in general relativity,

74 a stationary (possibly rotating), asymptotically flat black hole spacetime is completely de- termined by three numbers25: the mass M, the rotational angular momentum J, and the electric charge Q. This is the Black Hole Uniqueness Theorem (also known as the No Hair Theorem), and it leads to a strong test of GR. By tracking a small object that is closely orbiting the horizon of a much larger black hole, one can tell from the orbit if the spacetime can indeed be described by only three numbers; if not, then GR is wrong. So far it has not been possible to perform this test effectively, because the smaller object would itself have to be a black hole so that it can orbit sufficiently closely without breaking apart; but then the object is not directly observable. However, later on we will talk about the future space-based gravitational wave observatory LISA, which will pick up gravitational radiation (ripples in the curvature of spacetime) emitted by precisely such systems. The details of the smaller black hole’s orbit can be inferred from the gravitational wave signal it emits, and this will enable high-accuracy tests of the Uniqueness Theorem.

Figure 20: In the case of rotating black holes, the horizon is surrounded by an ergosphere, a region where it is not possible to stand still and one has to rotate along with the hole.

As indicated in Fig. 20, just like Schwarzschild black holes, rotating black holes have a horizon. However, this is only true if the angular momentum J satisfies $J/M^2 \leq 1$. If $J/M^2 > 1$, then there is no horizon and one has a naked singularity. Computer simulations of complete gravitational collapse of rotating clouds of matter have always led to a black hole with a horizon, $J/M^2 \leq 1$. Penrose has conjectured that given some mild conditions on the kind of matter used (essentially positivity of the energy density), one will always end up with such a black hole; this is referred to as the Cosmic Censorship Conjecture. Quite possibly, the conjecture can be proven as a theorem, in which case it too would be a necessary consequence of GR. If a naked singularity were part of a close binary system, then once again one would be able to infer this from the gravitational wave signal. If the conjecture is indeed a theorem, then finding a naked singularity would immediately invalidate GR.
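The horizon condition $J/M^2 \leq 1$ can be made concrete with the standard expression for the outer horizon radius of the Kerr solution, $r_+ = M + \sqrt{M^2 - a^2}$ with $a = J/M$, in units G = c = 1. This expression is not derived in these notes and is quoted here as an assumption; the function name is illustrative only.

```python
import math

def kerr_outer_horizon(M, J):
    """Outer horizon radius r_+ = M + sqrt(M^2 - a^2), with a = J/M (G = c = 1).

    Returns None when J/M^2 > 1, i.e. when the square root has no real value
    and no horizon exists (naked singularity).
    """
    a = J / M
    discriminant = M * M - a * a
    if discriminant < 0.0:
        return None
    return M + math.sqrt(discriminant)

for spin in (0.0, 0.5, 0.9, 1.0, 1.1):   # spin = J/M^2 for M = 1
    print(f"J/M^2 = {spin}: r_+ = {kerr_outer_horizon(1.0, spin)}")
```

For J = 0 the Schwarzschild value r = 2M is recovered, and the horizon shrinks to r = M in the limiting case $J/M^2 = 1$.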

25 Actually, one can also have black holes with, e.g., color charge as in quantum chromodynamics. Such black holes are of no astrophysical interest, although occasionally they are used in thought experiments.

F. Evidence for the existence of black holes

There is a great deal of (admittedly circumstantial) evidence for the existence of black holes. Almost all of this evidence is for black holes in one of two categories:

1. Supermassive black holes have masses between $10^6$ and $10^{10}\,M_\odot$. Most galaxies are believed to harbor a supermassive black hole in their centers, including our own. By tracking stars near the center of our galaxy over a period of years, it has been established that a dark object with a mass of $\sim 4 \times 10^6\,M_\odot$ is present there; see Fig. 21. The object is no larger than an AU (Astronomical Unit, the radius of the Earth’s orbit around the Sun, which is about $1.5 \times 10^8$ km); for comparison, if it is indeed a black hole then its horizon has a size of $6 \times 10^6$ km.

Figure 21: The orbits of stars in a region called Sagittarius A∗, near the center of the Milky Way galaxy, indicate the presence of an object of about 4 million solar masses, confined to a region no wider than the Earth’s orbit around the Sun.

More evidence comes from Active Galactic Nuclei (AGNs), galactic centers that emit copious amounts of radiation. The source is believed to be an accretion disk surrounding a black hole, with the radiation arising from heating due to the compression of gas spiraling towards the center. Magnetic fields lining the accretion disk cause material to be ejected in tight jets perpendicular to the disk. A black hole is the only known explanation given these features and the amount of energy involved. Ordinary galactic centers also exhibit jets, only much smaller than those of AGNs, presumably because the black hole has already devoured most of the matter in its immediate vicinity. Fig. 22 shows a jet near the center of M87.

Figure 22: A jet of material emitted by the center of the galaxy M87.

2. Stellar mass black holes have masses between three and a few tens of $M_\odot$. These probably fuel so-called X-ray binaries, where a stellar mass black hole and a normal star orbit each other. The tug from the black hole rips material away from the star’s outer layers, causing an accretion disk to form. There is evidence that the accretion disk tends to have a well-defined inner edge, consistent with the existence of a last stable orbit. Of course, in the approximation of spherical symmetry, we have seen that in principle there can be stable stars with radius R > 9M/4 while the last stable orbit has a radius of 6M. However, if the dark object in an X-ray binary had a solid surface, then material crashing into it would cause radiation with a very different frequency distribution than the radiation from an accretion disk. This is not seen, suggesting the existence of a horizon.
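The length scales quoted for the object at the center of our galaxy can be checked with a short calculation. The sketch below assumes rounded values for G, c and the solar mass (these constants are not given in the notes) and evaluates both the gravitational radius $GM/c^2$ and the Schwarzschild radius $2GM/c^2$ for a mass of $4 \times 10^6\,M_\odot$.

```python
G = 6.674e-11       # gravitational constant in m^3 kg^-1 s^-2 (assumed value)
C = 2.998e8         # speed of light in m/s (assumed value)
M_SUN = 1.989e30    # solar mass in kg (assumed value)
AU_KM = 1.5e8       # astronomical unit in km, as quoted in the text

def gravitational_radius_km(mass_in_solar_masses):
    """Return GM/c^2 in kilometres for a mass given in solar masses."""
    return G * mass_in_solar_masses * M_SUN / C**2 / 1.0e3

r_g = gravitational_radius_km(4.0e6)   # GM/c^2
r_s = 2.0 * r_g                        # Schwarzschild radius 2GM/c^2

print(f"GM/c^2  = {r_g:.2e} km")
print(f"2GM/c^2 = {r_s:.2e} km")
print(f"fraction of 1 AU: {r_s / AU_KM:.3f}")
```

With these inputs $GM/c^2$ comes out close to the $6 \times 10^6$ km mentioned above, and the Schwarzschild radius is twice that; either way the horizon scale is more than an order of magnitude below 1 AU.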

Supermassive black holes may have come into being through successive mergers of much smaller black holes in the far past; alternatively, they could have been created almost in one go through the collapse of proto-galactic disks. LISA will see gravitational wave signals from massive black holes out to redshifts z > 15 and will be able to clearly distinguish between the two formation mechanisms. One may wonder about the mass gap between the supermassive and the stellar mass black holes. Holes with masses of, say, $10^4\,M_\odot$ are presumed to be rare in the later epochs of the history of the Universe, as they should have turned into supermassive black holes by then through accretion and mergers. It is possible that globular clusters, nearly spherical systems of stars that orbit the centers of galaxies, can generate intermediate mass black holes of hundreds to tens of thousands of $M_\odot$, although right now there is little evidence of this. Here too, gravitational wave detectors will have the last word.

Note that the above evidence is indeed circumstantial. The existence of black holes will only be settled definitively when we attain access to gravitational wave signals from a black hole merging with another compact object.

VI. GRAVITATIONAL WAVES AND THEIR PROPERTIES

A. Dynamical gravity

In Newton’s theory of gravity, any changes in the distribution of matter, no matter how localized, are felt instantaneously at arbitrarily large distances. Despite the huge successes of Newton’s theory, this instantaneous action at a distance was considered unsatisfactory already by some of his contemporaries in the late 17th century. They tried to come up with some dynamical mechanism through which the gravitational force would be communicated, but without success. The issue became especially pressing after the development of special relativity (1905), which imposes a strict speed limit, the speed of light, on how fast communication of any kind can be effected.

Maxwell’s theory of electromagnetism does not have this instantaneous action at a distance. Consider a localized charge and/or current distribution (the “source”) causing electric and magnetic fields E and B. From the Maxwell equations it can be shown that at a time t, the values of E and B at a distance D from the source depend on what the source was doing at a time t − D/c. The time lag, D/c, is the time needed for a signal to cross the distance D if it traveled at the speed of light: electromagnetism obeys Einstein’s speed limit. E and B can be shown to obey a wave equation: the changes in a charge/current distribution are communicated to the rest of space by electromagnetic waves. Thus, unlike the Newtonian gravitational potential, the electromagnetic field does not just “track” its sources; it has dynamics of its own.

After special relativity was developed it was soon speculated that, just like the electromagnetic field, the gravitational field might also be dynamical. Changes in the gravitational field should propagate in a wave-like fashion, no faster than the speed of light, thus eliminating instantaneous action at a distance.
A concrete mathematical implementation of these notions would have to wait for another decade, but the general theory of relativity of 1916 indeed incorporated all these ideas. In particular, GR predicts the existence of gravitational waves.

B. Linearized general relativity

An easy way to gain insight into the nature of gravitational waves is to study them in the regime where gravitational fields are weak. In that case the spacetime metric $g_{\mu\nu}$ can be written as the flat Minkowski metric of special relativity, $\eta_{\mu\nu}$, plus a small correction due to the weak gravitational fields, $h_{\mu\nu}$:

$$ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \qquad |h_{\mu\nu}| \ll 1. \qquad (6.1) $$

Within this approximation, we can write the Einstein equations to first order in $h_{\mu\nu}$, neglecting all higher orders. Recall that the full Einstein equations are

$$ R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R = \frac{8\pi G}{c^4} T_{\mu\nu}, \qquad (6.2) $$

where the Ricci tensor $R_{\mu\nu}$ and the scalar curvature $R$ are contractions of the Riemann tensor:

$$ R_{\mu\nu} = R^{\rho}{}_{\mu\rho\nu}, \qquad R = R^{\mu}{}_{\mu}. \qquad (6.3) $$

The Riemann tensor is in turn constructed from the metric $g_{\mu\nu}$ and its first and second derivatives. The Einstein equations are invariant under general coordinate transformations,

$$ x^\mu \to x'^\mu(x), \qquad (6.4) $$

under which the metric transforms as

$$ g_{\mu\nu}(x) \to g'_{\mu\nu}(x') = \frac{\partial x^\rho}{\partial x'^\mu} \frac{\partial x^\sigma}{\partial x'^\nu}\, g_{\rho\sigma}(x). \qquad (6.5) $$

This invariance is broken when we choose a fixed “background” ηµν as in (6.1). Indeed, the numerical values of the components of the metric depend on the reference frame. What we really mean in writing (6.1) is that there exists a specific reference frame where (6.1) holds in a sufficiently large region of spacetime. It is no surprise that if we want to remain in this reference frame, we will no longer be able to transform the metric at will. However, there still exists a (much more limited) family of transformations which respects our choice of frame (6.1). Consider the following “gauge transformations”:

$$ x^\mu \to x'^\mu = x^\mu + \xi^\mu(x), \qquad (6.6) $$

where all derivatives $|\partial_\rho \xi_\mu|$ are at most of the same order as $|h_{\mu\nu}|$, so that quantities of second order in these derivatives can also be neglected. Substituting into the transformation law of the metric, Eq. (6.5), and keeping only lowest-order terms, we find

$$ h_{\mu\nu}(x) \to h'_{\mu\nu}(x') = h_{\mu\nu}(x) - (\partial_\mu \xi_\nu + \partial_\nu \xi_\mu). \qquad (6.7) $$

We can also perform global (x-independent) Lorentz transformations,

$$ x^\mu \to x'^\mu = \Lambda^\mu{}_\nu x^\nu. \qquad (6.8) $$

It is not difficult to see that $h_{\mu\nu}$ transforms as

$$ h'_{\mu\nu}(x') = \Lambda_\mu{}^\rho \Lambda_\nu{}^\sigma h_{\rho\sigma}(x). \qquad (6.9) $$

Thus, $h_{\mu\nu}$ is a tensor under Lorentz transformations, with the caveat that as far as boosts are concerned, we need to limit ourselves to those that do not spoil the condition $|h_{\mu\nu}| \ll 1$.

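We are now ready to linearize the Einstein equations. To leading order in $h_{\mu\nu}$ and its derivatives, the Riemann tensor is

$$ R_{\mu\nu\rho\sigma} = \frac{1}{2} \left( \partial_\nu \partial_\rho h_{\mu\sigma} + \partial_\mu \partial_\sigma h_{\nu\rho} - \partial_\mu \partial_\rho h_{\nu\sigma} - \partial_\nu \partial_\sigma h_{\mu\rho} \right). \qquad (6.10) $$

An interesting property is that this linearized Riemann tensor is invariant under the gauge transformations (6.7), as can easily be checked. It will be convenient to introduce

$$ \bar{h}_{\mu\nu} = h_{\mu\nu} - \frac{1}{2} \eta_{\mu\nu} h, \qquad (6.11) $$

where

$$ h = \eta^{\mu\nu} h_{\mu\nu}. \qquad (6.12) $$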
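The gauge invariance of the linearized Riemann tensor (6.10) can be verified in Fourier space. The sketch below assumes a single plane-wave mode, $h_{\mu\nu} = e_{\mu\nu} \exp(i k_\rho x^\rho)$, so that each derivative $\partial_\mu$ becomes a factor $i k_\mu$; the gauge term $\partial_\mu \xi_\nu + \partial_\nu \xi_\mu$ then has an amplitude proportional to $k_\mu \xi_\nu + k_\nu \xi_\mu$. The function names and the numerical values of $k_\mu$ and $\xi_\mu$ are arbitrary illustrative choices.

```python
import itertools

def riemann_fourier(e, k):
    """Fourier amplitude of the linearized Riemann tensor, Eq. (6.10).

    With d_mu -> i k_mu, Eq. (6.10) becomes
    R_{mnrs} = -(1/2) (k_n k_r e_ms + k_m k_s e_nr - k_m k_r e_ns - k_n k_s e_mr).
    All indices are lower indices, so no metric is needed for this check.
    """
    idx = range(4)
    return {
        (m, n, r, s): -0.5 * (k[n] * k[r] * e[m][s] + k[m] * k[s] * e[n][r]
                              - k[m] * k[r] * e[n][s] - k[n] * k[s] * e[m][r])
        for m, n, r, s in itertools.product(idx, repeat=4)
    }

def pure_gauge_amplitude(k, xi):
    """Amplitude of d_mu xi_nu + d_nu xi_mu, up to the common factor i."""
    return [[k[m] * xi[n] + k[n] * xi[m] for n in range(4)] for m in range(4)]

def max_abs(components):
    """Largest absolute value among all tensor components."""
    return max(abs(value) for value in components.values())

k = [3, 1, -2, 5]     # arbitrary wave covector (assumption for illustration)
xi = [2, -1, 4, 7]    # arbitrary gauge amplitude (assumption for illustration)

# A pure-gauge perturbation carries no curvature at linear order.
print(max_abs(riemann_fourier(pure_gauge_amplitude(k, xi), k)))
```

Because Eq. (6.10) is linear in $h_{\mu\nu}$, the vanishing of the pure-gauge contribution is equivalent to the invariance of the Riemann tensor under (6.7).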

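Note that $\bar{h} \equiv \eta^{\mu\nu} \bar{h}_{\mu\nu} = h - 2h = -h$, so that Eq. (6.11) can be inverted to give

$$ h_{\mu\nu} = \bar{h}_{\mu\nu} - \frac{1}{2} \eta_{\mu\nu} \bar{h}. \qquad (6.13) $$

Using (6.10) and (6.13), it is not difficult to show that the linearized Einstein equations take the form

$$ \Box \bar{h}_{\mu\nu} + \eta_{\mu\nu} \partial^\rho \partial^\sigma \bar{h}_{\rho\sigma} - \partial^\rho \partial_\nu \bar{h}_{\mu\rho} - \partial^\rho \partial_\mu \bar{h}_{\nu\rho} = -\frac{16\pi G}{c^4} T_{\mu\nu}, \qquad (6.14) $$

where $\Box \equiv \partial_\mu \partial^\mu$ is the usual d’Alembertian. This equation can be further simplified by making use of the residual gauge freedom (6.6). It is easy to see that these transformations act on $\bar{h}_{\mu\nu}$ as

$$ \bar{h}_{\mu\nu} \to \bar{h}'_{\mu\nu} = \bar{h}_{\mu\nu} - (\partial_\mu \xi_\nu + \partial_\nu \xi_\mu - \eta_{\mu\nu} \partial_\rho \xi^\rho). \qquad (6.15) $$

Under this transformation $\partial^\nu \bar{h}_{\mu\nu} \to \partial^\nu \bar{h}_{\mu\nu} - \Box \xi_\mu$, so by choosing $\xi_\mu$ such that $\Box \xi_\mu = \partial^\nu \bar{h}_{\mu\nu}$ we can impose the so-called harmonic gauge$^{26}$:

$$ \partial^\nu \bar{h}_{\mu\nu} = 0. \qquad (6.16) $$

With this condition, the last three terms on the LHS of Eq. (6.14) vanish, and we simply get

$$ \Box \bar{h}_{\mu\nu} = -\frac{16\pi G}{c^4} T_{\mu\nu}. \qquad (6.17) $$

These are the linearized Einstein equations in the harmonic gauge. Note that the harmonic gauge condition, Eq. (6.16), together with (6.17), implies that

$$ \partial^\nu T_{\mu\nu} = 0. \qquad (6.18) $$

This is the expression for energy-momentum conservation in the linearized theory. In the full theory one has $\nabla^\nu T_{\mu\nu} = 0$, with $\nabla^\nu$ the covariant derivative.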
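The trace reversal (6.11) and its inverse (6.13) can be checked numerically. The following minimal sketch assumes the signature (−, +, +, +) used throughout these notes and an arbitrary symmetric test matrix for $h_{\mu\nu}$; the helper names are illustrative.

```python
# Minkowski metric with signature (-, +, +, +); it is its own inverse.
ETA = [[(-1.0 if m == 0 else 1.0) if m == n else 0.0 for n in range(4)]
       for m in range(4)]

def trace(h):
    """Return eta^{mu nu} h_{mu nu}, Eq. (6.12)."""
    return sum(ETA[m][n] * h[m][n] for m in range(4) for n in range(4))

def trace_reverse(h):
    """Return h_{mu nu} - (1/2) eta_{mu nu} h, Eq. (6.11)."""
    tr = trace(h)
    return [[h[m][n] - 0.5 * ETA[m][n] * tr for n in range(4)] for m in range(4)]

# Arbitrary symmetric perturbation used only as test data.
h = [[0.10, 0.02, 0.00, 0.01],
     [0.02, 0.30, 0.05, 0.00],
     [0.00, 0.05, -0.20, 0.04],
     [0.01, 0.00, 0.04, 0.07]]

h_bar = trace_reverse(h)
print(trace(h), trace(h_bar))      # the two traces differ only by a sign
h_back = trace_reverse(h_bar)      # applying the map twice returns h, Eq. (6.13)
```

The map is its own inverse because $\eta^{\mu\nu}\eta_{\mu\nu} = 4$ in four spacetime dimensions, which is exactly what produces $\bar{h} = -h$.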

C. Gravitational waves

Consider an energy-momentum distribution $T_{\mu\nu}$ that is only non-zero inside some spatially finite region V. As we shall discuss later on, the solution to the linearized Einstein equations at an arbitrary spacetime point $(t, \mathbf{x})$ takes the form

$$ \bar{h}_{\mu\nu}(t, \mathbf{x}) = \frac{4G}{c^4} \int_V \frac{T_{\mu\nu}(t - |\mathbf{x} - \mathbf{x}'|/c,\, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\, d^3x'. \qquad (6.19) $$

Note that, unlike the Newtonian potential, the value of $\bar{h}_{\mu\nu}$ at a point $\mathbf{x}$ arbitrarily far from the source does not have instantaneous knowledge of what happens inside V. Rather, there are time lags $|\mathbf{x} - \mathbf{x}'|/c$, these being the times needed for a signal traveling at the speed of light to get from points $\mathbf{x}'$ inside the source to the point $\mathbf{x}$. Just like electromagnetism, gravity does not have instantaneous action at a distance after all.

26 Also called the Lorentz gauge, the Hilbert gauge, or the De Donder gauge.

Outside the source, where $T_{\mu\nu} = 0$, Eq. (6.17) reduces to

$$ \Box \bar{h}_{\mu\nu} = 0, \qquad (6.20) $$

or

$$ \left( -\frac{1}{c^2} \frac{\partial^2}{\partial t^2} + \Delta \right) \bar{h}_{\mu\nu} = 0. \qquad (6.21) $$

This is just a wave equation, for waves traveling at the speed of light. Its solutions can be written as superpositions of plane waves with frequencies $\omega$ and wave vectors $\mathbf{k}$,

$$ A_{\mu\nu} \cos(\omega t - \mathbf{k} \cdot \mathbf{x}), \qquad (6.22) $$

where $\omega = c|\mathbf{k}|$, and $A_{\mu\nu}$ has constant components. Note that solutions of this kind will exist also in the complete absence of sources (i.e., vacuum spacetime with $T_{\mu\nu} = 0$ everywhere). Although the latter solutions are by themselves unphysical, they illustrate that the gravitational field has dynamics of its own, independent of matter.

For weak gravitational fields and small velocities, $|T^{00}| \gg |T^{i0}| \gg |T^{ii}|$, which translates into $|\bar{h}^{00}| \gg |\bar{h}^{i0}| \gg |\bar{h}^{ii}|$. In this regime, $T^{00}/c^2 \simeq \rho$ where $\rho$ is the matter density. The equation (6.17) then reduces to

$$ \Box \bar{h}^{00} \simeq -\frac{16\pi G}{c^2} \rho. \qquad (6.23) $$

For sources moving with 3-velocity $v$ such that $v/c \ll 1$, $(1/c^2)\, \partial^2 \bar{h}^{00}/\partial t^2$ is of order $(v/c)^2\, \partial^2 \bar{h}^{00}/\partial (x^i)^2$, and Eq. (6.23) reduces to

$$ c^2 \Delta \bar{h}^{00} \simeq -16\pi G \rho. \qquad (6.24) $$

With the identification

$$ c^2 \bar{h}^{00} = -4\phi, \qquad (6.25) $$

this becomes

$$ \Delta \phi = 4\pi G \rho, \qquad (6.26) $$

which is the Poisson equation for the gravitational potential $\phi$ in Newton’s theory of gravity. The identification (6.25) is consistent with the motion of point particles in the weak-field, low-velocity regime. In general relativity this motion is governed by the geodesic equation,

$$ \frac{d^2 x^i}{d\tau^2} + \Gamma^i_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = 0. \qquad (6.27) $$

Recall that $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ with $|h_{\mu\nu}| \ll 1$. For $v/c \ll 1$, the proper time $\tau$ will approximately coincide with the coordinate time $t$ associated with the background spacetime whose metric is $\eta_{\mu\nu}$. Moreover, $dx^0/dt \simeq c$ while $dx^i/dt = O(v)$. Hence, to leading order we need only retain the term in (6.27) with $\mu = \nu = 0$, so

$$ \frac{d^2 x^i}{dt^2} \simeq -c^2 \Gamma^i_{00} = c^2 \left( \frac{1}{2} \partial^i h_{00} - \partial_0 h_0{}^i \right). \qquad (6.28) $$

For a non-relativistic source, the time derivative is again of higher order than the spatial derivatives, whence

$$ \frac{d^2 x^i}{dt^2} = \frac{c^2}{2} \partial^i h_{00}. \qquad (6.29) $$

This is an equation in terms of $h_{00}$ rather than $\bar{h}_{00}$. However, note that since $\bar{h}^{00}$ dominates all other components of $\bar{h}^{\mu\nu}$,

$$ h = h^\mu{}_\mu = -\bar{h}^\mu{}_\mu = \bar{h}^{00}, \qquad (6.30) $$

hence from (6.13) and (6.25) we get

$$ c^2 h_{00} = -2\phi. \qquad (6.31) $$

Substituting this into (6.29) we retrieve Newton’s second law for a force with potential φ:

$$ \mathbf{a} = -\nabla \phi, \qquad (6.32) $$

with $\mathbf{a}$ the acceleration 3-vector. Hence, we have retrieved both Newton’s equation for the gravitational potential, Eq. (6.26), and the Newtonian motion of a particle in such a potential, Eq. (6.32).

The above considerations also make it clear why Newtonian gravity appears to have instantaneous action at a distance whereas general relativity does not. Given a density distribution $\rho$ contained within a region V, the solution of Eq. (6.26) that vanishes at infinity is

$$ \phi(t, \mathbf{x}) = -G \int_V \frac{\rho(t, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\, d^3x'. \qquad (6.33) $$

The fact that $\rho(t, \mathbf{x}')$ in the integrand does not include a time lag $|\mathbf{x} - \mathbf{x}'|/c$ is due to the absence of a double time derivative in Eq. (6.26), which in (6.24) could be neglected in the regime where $v/c \ll 1$.

D. Physical degrees of freedom

A priori, $\bar{h}_{\mu\nu}$ has 10 independent components, as it is a symmetric tensor in 4 spacetime dimensions, but most of these are just a gauge artefact and can be eliminated by using transformations of the form (6.15). Indeed, imposition of the harmonic gauge (6.16) already eliminates 4 components, leaving 6. However, this gauge choice still allows for residual freedom. Indeed, the condition (6.16) is not spoiled by a transformation (6.15) with

$$ \Box \xi_\mu = 0. \qquad (6.34) $$

Note that if $\Box \xi_\mu = 0$ then also $\Box \xi_{\mu\nu} = 0$, where

$$ \xi_{\mu\nu} = \partial_\mu \xi_\nu + \partial_\nu \xi_\mu - \eta_{\mu\nu} \partial_\rho \xi^\rho, \qquad (6.35) $$

because $\Box$ commutes with $\partial_\mu$. Hence, we can use the 4 functions $\xi_\mu(x)$ to eliminate 4 more components of $\bar{h}_{\mu\nu}$ without spoiling either the harmonic gauge or the simple form of the linearized Einstein equations (6.17).

In particular, we can choose $\xi_0(x)$ such that the trace $\bar{h} = 0$, in which case $\bar{h}_{\mu\nu}$ simplifies to $h_{\mu\nu}$. Furthermore, we can choose the three functions $\xi_i(x)$, $i = 1, 2, 3$ being a spatial index, so that $h_{0i}(x) = 0$. Since $\bar{h}_{\mu\nu} = h_{\mu\nu}$, the harmonic gauge condition with $\mu = 0$ then becomes

$$ \partial^0 h_{00} + \partial^i h_{0i} = 0. \qquad (6.36) $$

Since we just set $h_{0i} = 0$, this reduces to

$$ \partial^0 h_{00} = 0, \qquad (6.37) $$

so that $h_{00}$ does not depend on time. A time-independent contribution to $h_{00}$ corresponds to the static part of the gravitational interaction, i.e., to the Newtonian potential of the source arising from its total mass without contributions due to motion. The gravitational wave is the time-dependent part, and since this is our focus here we will just set $h_{00} = 0$.$^{27}$ The spatial part of the harmonic gauge (with $\mu = i = 1, 2, 3$) is then

$$ \partial^j h_{ij} = 0, \qquad (6.38) $$

and the condition $h = 0$ becomes $h^i{}_i = 0$. In summary, we have

$$ h_{0\mu} = 0, \qquad h^i{}_i = 0, \qquad \partial^j h_{ij} = 0. \qquad (6.39) $$

We have now used up all of our gauge freedom and are left with two degrees of freedom (10 − 4 − 4 = 2). The gauge in which the conditions (6.39) hold is called the transverse-traceless gauge, or TT gauge. The metric perturbation in the TT gauge is denoted $h^{TT}_{ij}$. Eq. (6.21) has plane wave solutions of the form

$$ h^{TT}_{ij} = e_{ij}(\mathbf{k}) \cos(k_\mu x^\mu), \qquad (6.40) $$

with $k^\mu = (\omega/c, \mathbf{k})$, and $\omega = c|\mathbf{k}|$. The tensor $e_{ij}(\mathbf{k})$ is called the polarization tensor. For a single plane wave with wave vector $\mathbf{k}$, the condition $\partial^j h_{ij} = 0$ becomes $k^j h^{TT}_{ij} = 0$, or $n^j h^{TT}_{ij} = 0$ where $\hat{n} = \mathbf{k}/|\mathbf{k}|$ is the unit vector in the direction of motion. Hence the non-zero components of $h^{TT}_{ij}$ are in the plane that is transverse to $\hat{n}$. Suppose we choose the z axis to lie in the direction of $\hat{n}$. Taking into account symmetry, transversality, and tracelessness of $h^{TT}_{ij}$, we then get

$$ h^{TT}_{ij} = \begin{pmatrix} h_+ & h_\times & 0 \\ h_\times & -h_+ & 0 \\ 0 & 0 & 0 \end{pmatrix}_{ij} \cos[\omega(t - z/c)]. \qquad (6.41) $$

In terms of the line element $ds^2$, a spacetime with a plane wave of the above type traveling through it takes the form

$$ ds^2 = -c^2 dt^2 + dz^2 + \left[ 1 + h_+ \cos[\omega(t - z/c)] \right] dx^2 + \left[ 1 - h_+ \cos[\omega(t - z/c)] \right] dy^2 + 2 h_\times \cos[\omega(t - z/c)]\, dx\, dy. \qquad (6.42) $$
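The three defining properties of the TT perturbation (6.41) can be checked directly. The sketch below builds the 3 × 3 matrix for a wave traveling along z and tests symmetry, tracelessness and transversality with respect to a given unit vector; the function names and the test amplitudes are illustrative assumptions.

```python
import math

def h_tt(h_plus, h_cross, omega, t, z, c=299792458.0):
    """Spatial metric perturbation of Eq. (6.41) for a wave traveling along z."""
    phase = math.cos(omega * (t - z / c))
    return [[h_plus * phase, h_cross * phase, 0.0],
            [h_cross * phase, -h_plus * phase, 0.0],
            [0.0, 0.0, 0.0]]

def is_transverse_traceless(h, n, tol=0.0):
    """Check h_ij = h_ji, h_ii = 0 and n^j h_ij = 0 for a unit vector n."""
    symmetric = all(abs(h[i][j] - h[j][i]) <= tol
                    for i in range(3) for j in range(3))
    traceless = abs(sum(h[i][i] for i in range(3))) <= tol
    transverse = all(abs(sum(n[j] * h[i][j] for j in range(3))) <= tol
                     for i in range(3))
    return symmetric and traceless and transverse

h = h_tt(h_plus=0.3, h_cross=0.1, omega=1.0, t=0.0, z=0.0)
print(is_transverse_traceless(h, n=(0.0, 0.0, 1.0)))   # propagation direction
print(is_transverse_traceless(h, n=(1.0, 0.0, 0.0)))   # a direction in the wave plane
```

Only the two amplitudes $h_+$ and $h_\times$ enter, which reflects the two physical degrees of freedom left after gauge fixing.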

27 Strictly speaking we should retain the Newtonian contribution $h_{00}$, but it will have no effect on gravitational wave detection as discussed below.

E. Effect of gravitational waves on matter

We may now ask what the effect of the perturbation on matter is. The easiest way to understand the action of gravitational waves is to consider the relative motion of two nearby test particles in free fall. A free-falling test particle obeys the geodesic equation,

$$ \frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu_{\nu\rho}(x) \frac{dx^\nu}{d\tau} \frac{dx^\rho}{d\tau} = 0, \qquad (6.43) $$

where $\tau$ is proper time. Now consider two nearby free-falling particles, at $x^\mu(\tau)$ and $x^\mu(\tau) + \zeta^\mu$. The first particle is subject to Eq. (6.43) while the second one obeys

$$ \frac{d^2 (x^\mu + \zeta^\mu)}{d\tau^2} + \Gamma^\mu_{\nu\rho}(x + \zeta) \frac{d(x^\nu + \zeta^\nu)}{d\tau} \frac{d(x^\rho + \zeta^\rho)}{d\tau} = 0. \qquad (6.44) $$

Taking the difference between (6.44) and (6.43), and expanding to first order in $\zeta^\mu$, we get

$$ \frac{d^2 \zeta^\mu}{d\tau^2} + 2 \Gamma^\mu_{\nu\rho} \frac{dx^\nu}{d\tau} \frac{d\zeta^\rho}{d\tau} + \zeta^\sigma \partial_\sigma \Gamma^\mu_{\nu\rho}(x) \frac{dx^\nu}{d\tau} \frac{dx^\rho}{d\tau} = 0. \qquad (6.45) $$

This can be written more succinctly by introducing the covariant derivative of a vector field $V^\mu$ along the curve $x^\mu(\tau)$:

$$ \frac{DV^\mu}{D\tau} = \frac{dV^\mu}{d\tau} + \Gamma^\mu_{\nu\rho} V^\nu \frac{dx^\rho}{d\tau}. \qquad (6.46) $$

Using this and the definition of the Riemann tensor, it is not difficult to recast Eq. (6.45) as

$$ \frac{D^2 \zeta^\mu}{D\tau^2} = -R^\mu{}_{\nu\rho\sigma} \zeta^\rho \frac{dx^\nu}{d\tau} \frac{dx^\sigma}{d\tau}. \qquad (6.47) $$

This is the equation of geodesic deviation, which expresses the relative motion of nearby particles in terms of a tidal force determined by the Riemann tensor. Given a point P along a geodesic, there always exists a coordinate transformation that will make the Christoffel symbols vanish at P:

$$ \Gamma^\mu_{\nu\rho}(P) = 0. \qquad (6.48) $$

This is just the Local Lorentz Frame which we have discussed before. Furthermore, let us consider particles which move non-relativistically, so that their spatial motion $dx^i/d\tau$ is negligible compared to $dx^0/d\tau$. In that case Eq. (6.45) becomes

$$ \frac{d^2 \zeta^i}{d\tau^2} + \zeta^\sigma \partial_\sigma \Gamma^i_{00} \left( \frac{dx^0}{d\tau} \right)^2 = 0. \qquad (6.49) $$

The quantity $\partial_\sigma \Gamma^i_{00}$ is evaluated at the point P, i.e., at $x^i = 0$, and the metric is of the form $g_{\mu\nu} = \eta_{\mu\nu} + O(x^i x^j)$; hence $\zeta^\sigma \partial_\sigma \Gamma^i_{00} = \zeta^j \partial_j \Gamma^i_{00}$. Since at P both $\Gamma^\mu_{\nu\rho} = 0$ and $\partial_0 \Gamma^i_{0j} = 0$, one has $R^i{}_{0j0} = \partial_j \Gamma^i_{00} - \partial_0 \Gamma^i_{0j} = \partial_j \Gamma^i_{00}$. We then get

$$ \frac{d^2 \zeta^i}{d\tau^2} = -R^i{}_{0j0}\, \zeta^j \left( \frac{dx^0}{d\tau} \right)^2. \qquad (6.50) $$

If the test masses are moving non-relativistically then $dx^0/d\tau \simeq c$ and $\tau \simeq t$, the time coordinate associated with the flat background spacetime. We finally arrive at

$$ \ddot{\zeta}^i = -c^2 R^i{}_{0j0}\, \zeta^j, \qquad (6.51) $$

where a dot denotes differentiation with respect to t. To compute the relevant component of the Riemann tensor we use the fact that in the linearized theory, this tensor is gauge invariant so that we may compute it in any gauge. Evaluating (6.10) in the TT gauge we get

$$ R^i{}_{0j0} = R_{i0j0} = -\frac{1}{2c^2} \ddot{h}^{TT}_{ij}. \qquad (6.52) $$

Hence, at the point P, the geodesic deviation equation reduces to

$$ \ddot{\zeta}^i = \frac{1}{2} \ddot{h}^{TT}_{ij} \zeta^j. \qquad (6.53) $$

Let us consider a monochromatic gravitational wave propagating in the z-direction and study its effect on test particles in the (x, y) plane. First we focus on the + polarization. At z = 0 and choosing the origin of time such that $h^{TT}_{ij} = 0$ at t = 0,

$$ h^{TT}_{ij} = h_+ \sin(\omega t) \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}_{ij}. \qquad (6.54) $$

Consider a point particle in the (x, y) plane and write $\zeta^i = (x_0 + \delta x(t),\, y_0 + \delta y(t),\, 0)$, where $(x_0, y_0)$ is the unperturbed position and $\delta x(t)$, $\delta y(t)$ the displacement caused by the gravitational wave. Then from Eq. (6.53) and assuming that $(x_0, y_0)$ is sufficiently close to the origin (0, 0) that it and the particle are on “nearby” geodesics,$^{28}$

$$ \delta\ddot{x} = -\frac{h_+}{2} (x_0 + \delta x)\, \omega^2 \sin(\omega t), \qquad \delta\ddot{y} = +\frac{h_+}{2} (y_0 + \delta y)\, \omega^2 \sin(\omega t). \qquad (6.55) $$

If we further assume small displacements compared with the unperturbed position, $\delta x \ll x_0$ and $\delta y \ll y_0$, then this simplifies further to

$$ \delta\ddot{x} = -\frac{h_+}{2} x_0\, \omega^2 \sin(\omega t), \qquad \delta\ddot{y} = +\frac{h_+}{2} y_0\, \omega^2 \sin(\omega t), \qquad (6.56) $$

which, discarding integration constants, integrates to

$$ \delta x(t) = \frac{h_+}{2} x_0 \sin(\omega t), \qquad \delta y(t) = -\frac{h_+}{2} y_0 \sin(\omega t). \qquad (6.57) $$

28 In the next section, when discussing gravitational wave detectors, we will explain this approximation more carefully.

Completely analogously, for the cross polarization

$$ \delta x(t) = \frac{h_\times}{2} y_0 \sin(\omega t), \qquad \delta y(t) = \frac{h_\times}{2} x_0 \sin(\omega t). \qquad (6.58) $$

The associated deformation of a ring of test particles is shown in Fig. 23.

Figure 23: The deformation of a ring of test particles due to the + and × polarizations.
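The deformation of a ring of test particles can be computed directly from the displacements obtained by integrating Eq. (6.56) twice, namely $\delta x = \tfrac{1}{2}(h_+ x_0 + h_\times y_0)\sin(\omega t)$ and $\delta y = \tfrac{1}{2}(-h_+ y_0 + h_\times x_0)\sin(\omega t)$, where the two polarizations are superposed linearly. The sketch below is illustrative only: the function name is hypothetical and the amplitude 0.2 is hugely exaggerated so that the effect would be visible in a plot.

```python
import math

def deformed_ring(n_points, radius, h_plus, h_cross, omega_t):
    """Positions of test particles, initially on a circle, at wave phase omega_t.

    Uses the small-displacement result of integrating Eq. (6.56) twice,
    for both polarizations superposed linearly.
    """
    s = math.sin(omega_t)
    ring = []
    for k in range(n_points):
        angle = 2.0 * math.pi * k / n_points
        x0, y0 = radius * math.cos(angle), radius * math.sin(angle)
        dx = 0.5 * s * (h_plus * x0 + h_cross * y0)
        dy = 0.5 * s * (-h_plus * y0 + h_cross * x0)
        ring.append((x0 + dx, y0 + dy))
    return ring

# Plus polarization at maximum stretch: elongated along x, squeezed along y.
plus_ring = deformed_ring(4, 1.0, h_plus=0.2, h_cross=0.0, omega_t=math.pi / 2)
# Cross polarization: the same pattern rotated by 45 degrees.
cross_ring = deformed_ring(4, 1.0, h_plus=0.0, h_cross=0.2, omega_t=math.pi / 2)
for point in plus_ring:
    print(f"({point[0]:+.3f}, {point[1]:+.3f})")
```

Half a period later the sign of the sine flips, so the ring is squeezed along x and elongated along y instead.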

F. Energy and momentum carried by gravitational waves

In previous sections we have formulated the linearized version of general relativity. The linearized Einstein equations in vacuum are

$$ R^{(1)}_{\mu\nu} - \frac{1}{2} \eta_{\mu\nu} R^{(1)} = 0, \qquad (6.59) $$

where $R^{(1)}_{\mu\nu}$ is the Ricci tensor up to linear terms in the small perturbation $h_{\mu\nu}$ around the flat background $\eta_{\mu\nu}$; it can be computed from the linearized Riemann tensor, Eq. (6.10). Schematically, the linearized Einstein equations can be written as

$$ G^{(1)}_{\mu\nu}[h_{\rho\sigma}] = 0, \qquad (6.60) $$

where $G^{(1)}_{\mu\nu}$ is the Einstein tensor to first order in $h_{\mu\nu}$ and its derivatives. Given a solution $h_{\mu\nu}$ of the linearized Einstein equations, the metric $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ will generally not be a solution to the full Einstein equations. In fact, typically it will not even solve the second order Einstein equations, in which terms of second order in $h_{\mu\nu}$ and derivatives are also included. Indeed, expanding the Einstein tensor as

$$ G_{\mu\nu}[h_{\rho\sigma}] = G^{(1)}_{\mu\nu}[h_{\rho\sigma}] + G^{(2)}_{\mu\nu}[h_{\rho\sigma}] + \ldots \qquad (6.61) $$

where $G^{(2)}_{\mu\nu}$ collects all second order terms, one will typically have $G^{(2)}_{\mu\nu}[h_{\rho\sigma}] \neq 0$. The second order Einstein equations are

$$ G^{(1)}_{\mu\nu}[h_{\rho\sigma}] + G^{(2)}_{\mu\nu}[h_{\rho\sigma}] = 0. \qquad (6.62) $$

Now suppose we have a solution $h_{\mu\nu}$ of the linearized equations (6.60); then if $G^{(2)}_{\mu\nu}[h_{\rho\sigma}] \neq 0$, the above second order equation clearly does not hold.

To correct the second order equation (6.62), we have to add to $h_{\mu\nu}$ an even smaller correction $h^{(2)}_{\mu\nu}$, which we take to be $O(h^2)$, and which satisfies

$$ G^{(2)}_{\mu\nu}[h_{\rho\sigma}] + G^{(1)}_{\mu\nu}[h^{(2)}_{\rho\sigma}] = 0. \qquad (6.63) $$

We can write this in the form

$$ G^{(1)}_{\mu\nu}[h^{(2)}_{\rho\sigma}] = \frac{8\pi G}{c^4} t_{\mu\nu}, \qquad (6.64) $$

with

$$ t_{\mu\nu} = -\frac{c^4}{8\pi G} G^{(2)}_{\mu\nu}[h_{\rho\sigma}]. \qquad (6.65) $$

The corrected Einstein equations then become

$$ G^{(1)}_{\mu\nu}[h_{\rho\sigma} + h^{(2)}_{\rho\sigma}] = \frac{8\pi G}{c^4} t_{\mu\nu}, \qquad (6.66) $$

with, as before,

$$ t_{\mu\nu} = -\frac{c^4}{8\pi G} G^{(2)}_{\mu\nu}[h_{\rho\sigma}]. \qquad (6.67) $$

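Thus, to second order, $h_{\mu\nu}$ causes the same correction to the spacetime metric as would be produced by additional ordinary matter with stress-energy tensor $t_{\mu\nu}$. Note that $t_{\mu\nu}$ is symmetric, and if $h_{\mu\nu}$ satisfies the linearized Einstein equations then $\partial^\mu t_{\mu\nu} = 0$, hence it is conserved. It is tempting to regard $t_{\mu\nu}$ as the stress-energy tensor of the gravitational field itself, valid to second order in deviation from flatness. However, $t_{\mu\nu}$ is not gauge invariant; it changes under the transformations (6.7). Indeed, in general relativity there is no local notion of the energy density of the gravitational field.

However, if instead of evaluating $t_{\mu\nu}$ at a particular point we average it over a small spatial volume surrounding that point, then we do obtain a gauge-invariant quantity. First we redefine

$$ t_{\mu\nu} = -\frac{c^4}{8\pi G} \left\langle R^{(2)}_{\mu\nu} - \frac{1}{2} \eta_{\mu\nu} R^{(2)} \right\rangle, \qquad (6.68) $$

where $\langle \ldots \rangle$ denotes the average over a bounded spatial volume. The second order contributions to the Ricci tensor are as follows:

$$ R^{(2)}_{\mu\nu} = \frac{1}{2} \Big[ \frac{1}{2} \partial_\mu h_{\rho\sigma} \partial_\nu h^{\rho\sigma} + h^{\rho\sigma} \partial_\mu \partial_\nu h_{\rho\sigma} - h^{\rho\sigma} \partial_\nu \partial_\sigma h_{\rho\mu} - h^{\rho\sigma} \partial_\mu \partial_\sigma h_{\rho\nu} + h^{\rho\sigma} \partial_\rho \partial_\sigma h_{\mu\nu} + \partial^\sigma h^\rho{}_\nu\, \partial_\sigma h_{\rho\mu} - \partial^\sigma h^\rho{}_\nu\, \partial_\rho h_{\sigma\mu} - \partial_\sigma h^{\rho\sigma} \partial_\nu h_{\rho\mu} + \partial_\sigma h^{\rho\sigma} \partial_\rho h_{\mu\nu} - \partial_\sigma h^{\rho\sigma} \partial_\mu h_{\rho\nu} - \frac{1}{2} \partial^\rho h\, \partial_\rho h_{\mu\nu} + \frac{1}{2} \partial^\rho h\, \partial_\nu h_{\rho\mu} + \frac{1}{2} \partial^\rho h\, \partial_\mu h_{\rho\nu} \Big]. \qquad (6.69) $$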

However, due to the averaging in (6.68), the expression for $t_{\mu\nu}$ will end up being quite simple. First we note that, since we assume an integration volume with a boundary, we can use integration by parts for spatial derivatives $\partial_i$ and discard the boundary terms. Now, any time dependence of $h_{\mu\nu}$ will be through a retarded time; for instance, for waves moving in the z-direction, $h_{\mu\nu}$ will depend on time as $h_{\mu\nu}(x^0 - z)$, where $x^0 = ct$. But then $\partial_0 h_{\mu\nu} = -\partial_z h_{\mu\nu}$. Hence one can always replace a time derivative $\partial_0$ by a spatial derivative (in this case $-\partial_z$), perform integration by parts, and replace $-\partial_z$ by $\partial_0$ again. Thus, time and spatial derivatives are on the same footing in the average $\langle \ldots \rangle$ even though only a spatial integral is being taken.

Performing integration by parts and using the gauge condition $\partial_\mu h^{\mu\nu} = 0$, the tracelessness condition $h = 0$, and the field equations $\Box h_{\mu\nu} = 0$, all terms in the average of (6.69) except the first two will vanish. Again integrating by parts, the remaining terms can be combined to get

$$ \langle R^{(2)}_{\mu\nu} \rangle = -\frac{1}{4} \langle \partial_\mu h_{\rho\sigma} \partial_\nu h^{\rho\sigma} \rangle. \qquad (6.70) $$

In (6.68), $\langle R^{(2)} \rangle$ is zero, as can be shown by partial integration and using $\Box h_{\mu\nu} = 0$. Thus, we arrive at

$$ t_{\mu\nu} = \frac{c^4}{32\pi G} \langle \partial_\mu h_{\rho\sigma} \partial_\nu h^{\rho\sigma} \rangle. \qquad (6.71) $$

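The change in $t_{\mu\nu}$ under the gauge transformations (6.7), for which $\delta h^{\rho\sigma} = -(\partial^\rho \xi^\sigma + \partial^\sigma \xi^\rho)$, is

$$ \delta t_{\mu\nu} = \frac{c^4}{32\pi G} \langle \partial_\mu h_{\rho\sigma} \partial_\nu (\delta h^{\rho\sigma}) + \partial_\mu (\delta h_{\rho\sigma}) \partial_\nu h^{\rho\sigma} \rangle = -\frac{c^4}{32\pi G} \langle \partial_\mu h_{\rho\sigma} \partial_\nu (\partial^\rho \xi^\sigma + \partial^\sigma \xi^\rho) + (\mu \leftrightarrow \nu) \rangle = -\frac{c^4}{16\pi G} \langle \partial_\mu h_{\rho\sigma} \partial_\nu \partial^\rho \xi^\sigma + (\mu \leftrightarrow \nu) \rangle. \qquad (6.72) $$

Inside the average $\langle \ldots \rangle$ we can integrate $\partial^\rho$ by parts and then use the gauge condition $\partial^\rho h_{\rho\sigma} = 0$. Therefore $\delta t_{\mu\nu} = 0$, and $t_{\mu\nu}$ is gauge invariant. Hence it only depends on the physical content of the spacetime perturbation $h_{\mu\nu}$, which is encapsulated by the two non-zero components in, e.g., the transverse-traceless gauge. In that gauge, the gravitational energy density is

$$ t^{00} = \frac{c^2}{32\pi G} \langle \dot{h}^{TT}_{ij} \dot{h}^{TT}_{ij} \rangle, \qquad (6.73) $$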

where the dot denotes differentiation with respect to time; note that $\partial_0 = (1/c)\partial_t$. In terms of the two gravitational wave polarizations, $h_+$ and $h_\times$, one has

$$ t^{00} = \frac{c^2}{16\pi G} \langle \dot{h}_+^2 + \dot{h}_\times^2 \rangle. \qquad (6.74) $$

Given a spatial volume V bounded by a surface S, the gravitational energy inside it is

$$ E_V = \int_V d^3x\, t^{00}. \qquad (6.75) $$

The gravitational energy going through S per unit of time is then given by

$$ \frac{dE_{GW}}{dt} = -\int_V d^3x\, \partial_t t^{00}, \qquad (6.76) $$

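where the minus sign indicates that we are interested in the energy leaving the volume. Using conservation of gravitational stress-energy, $\partial_\mu t^{\mu\nu} = 0$, this can be written as

$$ \frac{1}{c} \frac{dE_{GW}}{dt} = \int_V d^3x\, \partial_i t^{0i} = \int_S dA\, n_i t^{0i}, \qquad (6.77) $$

where $dA$ is the infinitesimal surface element and $\hat{n}$ the unit normal to S. If S is a sphere then $\hat{n} = \hat{r}$, the unit vector pointing radially outward, and $dA = r^2 d\Omega$, with $r$ the sphere’s radius and $d\Omega = \sin(\theta)\, d\theta\, d\phi$ in the usual angular coordinates $(\theta, \phi)$. One then has

$$ \frac{dE_{GW}}{dt} = c r^2 \int d\Omega\, t^{0r}, \qquad (6.78) $$

and

$$ t^{0r} = \frac{c^4}{32\pi G} \left\langle \partial^0 h^{TT}_{ij}\, \partial^r h^{TT}_{ij} \right\rangle. \qquad (6.79) $$

If $r$ is sufficiently large, a gravitational wave propagating radially outward has the form

$$ h^{TT}_{ij} = \frac{1}{r} f_{ij}(t - r/c). \qquad (6.80) $$

The derivative with respect to $r$ then gives

$$ \frac{\partial}{\partial r} h^{TT}_{ij} = -\frac{1}{r^2} f_{ij}(t - r/c) + \frac{1}{r} \frac{\partial}{\partial r} f_{ij}(t - r/c). \qquad (6.81) $$

Note that

$$ \frac{\partial}{\partial r} f_{ij}(t - r/c) = -\frac{1}{c} \frac{\partial}{\partial t} f_{ij}(t - r/c), \qquad (6.82) $$

and so

$$ \frac{\partial}{\partial r} h^{TT}_{ij} = -\partial_0 h^{TT}_{ij} + O\!\left( \frac{1}{r^2} \right) = +\partial^0 h^{TT}_{ij} + O\!\left( \frac{1}{r^2} \right). \qquad (6.83) $$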

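Hence, at large distances one has $t^{0r} = t^{00}$, and

$$ \frac{dE_{GW}}{dt} = c r^2 \int d\Omega\, t^{00}. \qquad (6.84) $$

Using our expression (6.73) for the gravitational energy density,

$$ \frac{dE_{GW}}{dt} = \frac{c^3 r^2}{32\pi G} \int d\Omega\, \langle \dot{h}^{TT}_{ij} \dot{h}^{TT}_{ij} \rangle, \qquad (6.85) $$

or in terms of the two polarizations,

$$ \frac{dE_{GW}}{dt} = \frac{c^3 r^2}{16\pi G} \int d\Omega\, \langle \dot{h}_+^2 + \dot{h}_\times^2 \rangle. \qquad (6.86) $$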
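To get a feeling for the numbers, the energy flux per unit area, $c\, t^{00}$ with $t^{00}$ from Eq. (6.74), can be evaluated for a monochromatic wave. The sketch below assumes $h_+(t) = h_+ \cos(\omega t)$ and $h_\times(t) = h_\times \sin(\omega t)$, and that the average over several wavelengths gives the usual factor 1/2 for the squared sine and cosine; the values of G and c and the example amplitude and frequency are assumptions chosen for illustration.

```python
import math

G = 6.674e-11        # gravitational constant in m^3 kg^-1 s^-2 (assumed value)
C = 299792458.0      # speed of light in m/s

def gw_energy_flux(h_plus, h_cross, frequency_hz):
    """Energy flux c * t^00 in W/m^2 of a monochromatic gravitational wave.

    Uses t^00 = c^2 / (16 pi G) * <hdot_+^2 + hdot_x^2>, Eq. (6.74), with
    <hdot^2> = h^2 omega^2 / 2 for a sinusoidal time dependence.
    """
    omega = 2.0 * math.pi * frequency_hz
    mean_square = 0.5 * omega**2 * (h_plus**2 + h_cross**2)
    return C**3 / (16.0 * math.pi * G) * mean_square

flux = gw_energy_flux(h_plus=1e-21, h_cross=0.0, frequency_hz=100.0)
print(f"flux = {flux:.3e} W/m^2")
```

The large prefactor $c^3/G$ means that even a tiny strain amplitude corresponds to a sizeable energy flux, of the order of milliwatts per square metre for the values chosen here.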

Thus, gravitational waves carry away energy, which they can deposit into physical systems.

Just like electromagnetic waves, gravitational waves also carry momentum. Given a volume V, the gravitational momentum inside it is

$$ P^k = \frac{1}{c} \int_V d^3x\, t^{0k}. \qquad (6.87) $$

If V is bounded by a large sphere S with radius $r$ and gravitational waves are going radially outward, the outgoing momentum per unit time is

$$ \frac{dP^k_{GW}}{dt} = -\int_V d^3x\, \partial_0 t^{0k} = r^2 \int_S d\Omega\, t^{0k}. \qquad (6.88) $$

Using the expression (6.71), we arrive at

$$ \frac{dP^k_{GW}}{dt} = -\frac{c^3 r^2}{32\pi G} \int_S d\Omega\, \langle \dot{h}^{TT}_{ij}\, \partial^k h^{TT}_{ij} \rangle. \qquad (6.89) $$

G. The generation of gravitational waves

The field equations of linearized gravity are given by Eq. (6.17):

$$ \Box \bar{h}_{\mu\nu} = -\frac{16\pi G}{c^4} T_{\mu\nu}, \qquad (6.90) $$

where $T_{\mu\nu}$ is the energy-momentum tensor of matter. Since these are linear equations, they can be solved using Green’s functions. The appropriate Green’s function here is the one that solves the equation

$$ \Box_x G(x - x') = \delta^4(x - x'), \qquad (6.91) $$

where $x$, $x'$ are any two spacetime points and derivatives on the LHS are with respect to the components of $x = (ct, \mathbf{x})$. Then for a given $T_{\mu\nu}$, the solution to (6.90) is

$$ \bar{h}_{\mu\nu}(x) = -\frac{16\pi G}{c^4} \int d^4x'\, G(x - x')\, T_{\mu\nu}(x'). \qquad (6.92) $$

Choosing boundary conditions such that there is no incoming radiation from infinity, the retarded Green’s function is the appropriate one. It takes the form

$$ G(x - x') = -\frac{1}{4\pi |\mathbf{x} - \mathbf{x}'|}\, \delta(x^0_{\rm ret} - x'^0), \qquad (6.93) $$

where $x'^0 = ct'$, $x^0_{\rm ret} = c\, t_{\rm ret}$, and the retarded time $t_{\rm ret}$ is given by

$$ t_{\rm ret} = t - \frac{|\mathbf{x} - \mathbf{x}'|}{c}. \qquad (6.94) $$

Eq. (6.92) then becomes

$$ \bar{h}_{\mu\nu}(t, \mathbf{x}) = \frac{4G}{c^4} \int d^3x'\, \frac{1}{|\mathbf{x} - \mathbf{x}'|}\, T_{\mu\nu}\!\left( t - \frac{|\mathbf{x} - \mathbf{x}'|}{c},\, \mathbf{x}' \right). \qquad (6.95) $$

We will want to have solutions for the metric perturbation in a convenient gauge, such as the transverse-traceless gauge. Given an $h_{\mu\nu}$, it can be brought into the TT gauge by acting on it with an appropriate projector. Let $\hat{n}$ be the direction of propagation of a gravitational wave. Then the following operator removes the component of any spatial vector along the direction $\hat{n}$:
$$P_{ij} \equiv \delta_{ij} - n_i n_j. \tag{6.96}$$
Given a spatial vector $v^i$, the vector $w^i = P_{ij} v^j$ is transverse:
$$\hat{n}\cdot\mathbf{w} = n^i P_{ij} v^j = 0. \tag{6.97}$$
$P_{ij}$ is a projector:
$$P_{ik} P_{kj} = P_{ij}. \tag{6.98}$$
Using $P_{ij}$, we now construct
$$\Lambda_{ij,kl}(\hat{n}) = P_{ik} P_{jl} - \frac{1}{2} P_{ij} P_{kl}. \tag{6.99}$$
This is also a projector, in the sense that
$$\Lambda_{ij,kl}\Lambda_{kl,mn} = \Lambda_{ij,mn}. \tag{6.100}$$
It is transverse in all indices: $n^i \Lambda_{ij,kl} = 0$, $n^j \Lambda_{ij,kl} = 0$, etc. It is also traceless with respect to the first and last index pairs:
$$\Lambda_{ii,kl} = \Lambda_{ij,kk} = 0. \tag{6.101}$$
Finally, it is symmetric under the interchange $(i, j) \leftrightarrow (k, l)$:
$$\Lambda_{ij,kl} = \Lambda_{kl,ij}. \tag{6.102}$$
The explicit expression for $\Lambda_{ij,kl}$ is:
$$\Lambda_{ij,kl}(\hat{n}) = \delta_{ik}\delta_{jl} - \frac{1}{2}\delta_{ij}\delta_{kl} - n_j n_l \delta_{ik} - n_i n_k \delta_{jl} + \frac{1}{2} n_k n_l \delta_{ij} + \frac{1}{2} n_i n_j \delta_{kl} + \frac{1}{2} n_i n_j n_k n_l. \tag{6.103}$$
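The properties (6.100)-(6.103) are easy to check numerically. The sketch below builds $\Lambda_{ij,kl}$ from the projector $P_{ij}$ as in (6.99), builds it again from the explicit expression (6.103), and compares the two; the function names and the test direction are illustrative choices, not part of the text.

```python
import numpy as np

def lambda_tensor(n):
    """Lambda_{ij,kl} = P_ik P_jl - (1/2) P_ij P_kl, Eq. (6.99)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)              # make sure n is a unit vector
    P = np.eye(3) - np.outer(n, n)         # Eq. (6.96)
    return (np.einsum("ik,jl->ijkl", P, P)
            - 0.5 * np.einsum("ij,kl->ijkl", P, P))

def lambda_tensor_explicit(n):
    """The same tensor, written out term by term as in Eq. (6.103)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    d = np.eye(3)
    return (np.einsum("ik,jl->ijkl", d, d)
            - 0.5 * np.einsum("ij,kl->ijkl", d, d)
            - np.einsum("j,l,ik->ijkl", n, n, d)
            - np.einsum("i,k,jl->ijkl", n, n, d)
            + 0.5 * np.einsum("k,l,ij->ijkl", n, n, d)
            + 0.5 * np.einsum("i,j,kl->ijkl", n, n, d)
            + 0.5 * np.einsum("i,j,k,l->ijkl", n, n, n, n))

n_hat = np.array([1.0, 2.0, 2.0]) / 3.0    # arbitrary unit vector
Lam = lambda_tensor(n_hat)

print(np.allclose(Lam, lambda_tensor_explicit(n_hat)))             # (6.103)
print(np.allclose(np.einsum("ijkl,klmn->ijmn", Lam, Lam), Lam))    # (6.100)
print(np.allclose(np.einsum("i,ijkl->jkl", n_hat, Lam), 0.0))      # transverse
print(np.allclose(np.einsum("iikl->kl", Lam), 0.0))                # (6.101)
print(np.allclose(Lam, Lam.transpose(2, 3, 0, 1)))                 # (6.102)
```

Each line prints `True` when the corresponding identity holds to floating-point accuracy.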

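One can show the following: if $h_{\mu\nu}$ is a metric perturbation outside the source (where $T_{\mu\nu} = 0$) which solves the linearized Einstein equations and is already in the Lorentz gauge, then the projection
$$h^{\rm TT}_{ij} = \Lambda_{ij,kl} h_{kl} \tag{6.104}$$
is equivalent to performing a gauge transformation that brings $h_{\mu\nu}$ into the TT gauge. Outside the source, the solutions to (6.90) in the TT gauge take the form
$$h^{\rm TT}_{ij}(t, \mathbf{x}) = \frac{4G}{c^4}\Lambda_{ij,kl}(\hat{n})\int d^3x'\, \frac{1}{|\mathbf{x} - \mathbf{x}'|}\, T_{kl}\!\left(t - \frac{|\mathbf{x} - \mathbf{x}'|}{c}, \mathbf{x}'\right). \tag{6.105}$$
Note that $\mathbf{x}$ is the point where $h^{\rm TT}_{ij}$ is being evaluated, while $\mathbf{x}'$ is restricted to be inside the source, where $T_{\mu\nu}(t_{\rm ret}, \mathbf{x}') \neq 0$. We are particularly interested in the behavior of $h^{\rm TT}_{ij}$ far from the source, at a distance $r$ that is much larger than the source's size, $d$. In that case we can expand
$$|\mathbf{x} - \mathbf{x}'| = r - \mathbf{x}'\cdot\hat{n} + O\!\left(\frac{d^2}{r}\right). \tag{6.106}$$
To very good approximation, (6.105) can be written as
$$h^{\rm TT}_{ij}(t, \mathbf{x}) = \frac{4G}{c^4}\Lambda_{ij,kl}(\hat{n})\int d^3x'\, \frac{1}{|\mathbf{x} - \mathbf{x}'|}\, T_{kl}\!\left(t - \frac{r}{c} + \frac{\mathbf{x}'\cdot\hat{n}}{c}, \mathbf{x}'\right). \tag{6.107}$$
To see how further simplifications can be made, it is useful to Fourier-expand the stress tensor:
$$T_{kl}\!\left(t - \frac{|\mathbf{x} - \mathbf{x}'|}{c}, \mathbf{x}'\right) = \int \frac{d^4k}{(2\pi)^4}\, \tilde{T}_{kl}(\omega, \mathbf{k})\, e^{-i\omega(t - r/c + \mathbf{x}'\cdot\hat{n}/c) + i\mathbf{k}\cdot\mathbf{x}'}. \tag{6.108}$$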

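For a typical source, $\tilde{T}_{ij}(\omega, \mathbf{k})$ will only have power up to some maximum frequency $\omega_s$. If the source is non-relativistic then $\omega_s d \ll c$. In addition we have $|\mathbf{x}'| \lesssim d$. Hence the frequencies $\omega$ where $h^{\rm TT}_{\mu\nu}$ receives its main contributions are such that
$$\frac{\omega}{c}\, \mathbf{x}'\cdot\hat{n} \lesssim \frac{\omega_s d}{c} \ll 1. \tag{6.109}$$
Hence, in the exponent of (6.108) we can use $\omega\, \mathbf{x}'\cdot\hat{n}/c$ as an expansion parameter:
$$e^{-i\omega(t - r/c + \mathbf{x}'\cdot\hat{n}/c) + i\mathbf{k}\cdot\mathbf{x}'} = e^{-i\omega(t - r/c) + i\mathbf{k}\cdot\mathbf{x}'}\left[1 - i\frac{\omega}{c}\, x'^i n^i + \frac{1}{2}\left(-i\frac{\omega}{c}\right)^2 x'^i x'^j n^i n^j + \ldots\right]. \tag{6.110}$$
In the time domain, this is equivalent to expanding
$$T_{kl}\!\left(t - \frac{r}{c} + \frac{\mathbf{x}'\cdot\hat{n}}{c}, \mathbf{x}'\right) = T_{kl}(t - r/c, \mathbf{x}') + \frac{x'^i n^i}{c}\, \partial_0 T_{kl} + \frac{1}{2c^2}\, x'^i x'^j n^i n^j\, \partial_0^2 T_{kl} + \ldots, \tag{6.111}$$
where the derivatives in the RHS are evaluated at $(t - r/c, \mathbf{x}')$. Now introduce the multipole moments of the stress tensor $T^{ij}$:
$$S^{ij} = \int d^3x\, T^{ij}(t, \mathbf{x}), \qquad S^{ij,k} = \int d^3x\, T^{ij}(t, \mathbf{x})\, x^k, \qquad S^{ij,kl} = \int d^3x\, T^{ij}(t, \mathbf{x})\, x^k x^l, \qquad \ldots \tag{6.112}$$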

Then substituting the expansion (6.111) into (6.107), we get
$$h^{\rm TT}_{ij} = \frac{1}{r}\frac{4G}{c^4}\Lambda_{ij,kl}(\hat{n})\left[S^{kl} + \frac{1}{c}\, n_m \dot{S}^{kl,m} + \frac{1}{2c^2}\, n_m n_p \ddot{S}^{kl,mp} + \ldots\right]_{\rm ret}, \tag{6.113}$$
where $[\ldots]_{\rm ret}$ indicates that the expression in brackets is being evaluated at the retarded time $t - r/c$. This expansion in multipole moments is in fact an expansion in $v/c$, where $v$ is a characteristic velocity. Indeed, compared to $S^{kl}$, the moment $S^{kl,m}$ has an additional factor $x^m \sim O(d)$, and each time derivative brings in a factor $O(\omega_s)$; combined with the $1/c$ this gives a factor $O(\omega_s d/c)$. Defining $v \equiv \omega_s d$, this means that the term $(1/c)\, n_m \dot{S}^{kl,m}$ is a correction of $O(v/c)$ to the term $S^{kl}$. Similarly the term $(1/2c^2)\, n_m n_p \ddot{S}^{kl,mp}$ is a correction of $O(v^2/c^2)$, and so on.

The expansion (6.113) is not very convenient, as it depends on the moments of the stresses $T^{ij}$, which in practice may be difficult to determine. It would be physically more intuitive to instead have an expansion in moments of the mass density$^{29}$ $(1/c^2)T^{00}$ and the momentum density $(1/c)T^{0i}$. The mass moments are defined as
$$M = \frac{1}{c^2}\int d^3x\, T^{00}(t, \mathbf{x}), \quad M^i = \frac{1}{c^2}\int d^3x\, T^{00}(t, \mathbf{x})\, x^i, \quad M^{ij} = \frac{1}{c^2}\int d^3x\, T^{00}(t, \mathbf{x})\, x^i x^j, \quad M^{ijk} = \frac{1}{c^2}\int d^3x\, T^{00}(t, \mathbf{x})\, x^i x^j x^k, \quad \ldots \tag{6.114}$$
while the momentum density moments are given by
$$P^i = \frac{1}{c}\int d^3x\, T^{0i}(t, \mathbf{x}), \quad P^{i,j} = \frac{1}{c}\int d^3x\, T^{0i}(t, \mathbf{x})\, x^j, \quad P^{i,jk} = \frac{1}{c}\int d^3x\, T^{0i}(t, \mathbf{x})\, x^j x^k, \quad \ldots \tag{6.115}$$

It is indeed possible to express the stress moments (6.112) as combinations of mass and momentum density moments. Let us do this explicitly for the leading-order term in Eq. (6.113). One has
$$\begin{aligned}
S^{ij} &= \int d^3x\, T^{ij} \\
&= \int d^3x\, \delta^i_k \delta^j_l\, T^{kl} \\
&= \int d^3x\, (\partial_k x^i)(\partial_l x^j)\, T^{kl} \\
&= -\int d^3x\, x^i (\partial_l x^j)\, \partial_k T^{kl} \\
&= \int d^3x\, x^i (\partial_l x^j)\, \partial_0 T^{0l}.
\end{aligned} \tag{6.116}$$

29 Although $(1/c^2)T^{00}$ has dimensions of mass density, it also includes kinetic energy contributions. It can be approximated as the rest mass density only in the non-relativistic limit.

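In the next to last line we performed partial integration, with the assumption $T^{kl}(t, \mathbf{x}) \to 0$ as $|\mathbf{x}| \to \infty$, so that the boundary term vanishes. In the last line we used the conservation law $\partial_\mu T^{\mu\nu} = 0$ for $\nu = l$. Going on in the same vein, integrating by parts once more and using $\partial_l T^{0l} = -\partial_0 T^{00}$ and $\partial_0 T^{0i} = -\partial_k T^{ki}$, we find
$$\begin{aligned}
S^{ij} &= -\int d^3x\, x^i x^j\, \partial_0 \partial_l T^{0l} - \int d^3x\, \delta^i_l\, x^j\, \partial_0 T^{0l} \\
&= \int d^3x\, x^i x^j\, \partial_0^2 T^{00} + \int d^3x\, x^j\, \partial_k T^{ki} \\
&= \frac{1}{c^2}\int d^3x\, x^i x^j\, \ddot{T}^{00} - \int d^3x\, T^{ij} \\
&= \ddot{M}^{ij} - S^{ij},
\end{aligned} \tag{6.117}$$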

or
$$S^{ij} = \frac{1}{2}\ddot{M}^{ij}. \tag{6.118}$$
Thus, to leading order in $v/c$, the metric perturbation in the TT gauge takes the form
$$\left[h^{\rm TT}_{ij}(t, \mathbf{x})\right]_{\rm quad} = \frac{1}{r}\frac{2G}{c^4}\Lambda_{ij,kl}(\hat{n})\, \ddot{M}^{kl}(t - r/c). \tag{6.119}$$
This is the mass quadrupole radiation. Note that $\Lambda_{ij,kl}$ contracted with $\ddot{M}^{kl}$ makes the latter traceless, so in the above expression we can replace $M^{kl}$ by
$$Q^{ij} \equiv M^{ij} - \frac{1}{3}\delta^{ij} M_{kk}. \tag{6.120}$$
In the above we have been referring to $(1/c^2)T^{00}$ as the mass density, although of course it includes not just rest mass but also kinetic energy. However, to lowest order in $v/c$ it reduces to the rest mass density, which we denote by $\rho$. The tensor $Q^{ij}$ then becomes the quadrupole tensor from Newtonian theory:
$$Q^{ij} = \int d^3x\, \rho(t, \mathbf{x})\left(x^i x^j - \frac{1}{3} r^2 \delta^{ij}\right). \tag{6.121}$$
In this approximation, we find
$$\left[h^{\rm TT}_{ij}(t, \mathbf{x})\right]_{\rm quad} = \frac{1}{r}\frac{2G}{c^4}\, \ddot{Q}^{\rm TT}_{ij}(t - r/c), \tag{6.122}$$
where $Q^{\rm TT}_{ij}$ is the transverse part of the (already traceless) tensor $Q_{ij}$:
$$Q^{\rm TT}_{ij} = \Lambda_{ij,kl}(\hat{n})\, Q_{kl}. \tag{6.123}$$
Let us summarize. By linearizing general relativity we arrived at the field equations (6.17). Since by assumption these only take into account linear effects in the metric perturbation, they neglect gravitational self-interaction. Going to second order we were able to derive expressions for the momentum and the energy carried by the gravitational field; since this energy is equivalent to mass, the gravitational field affects itself. To second order this is captured by the "corrected" field equations (6.66). We then returned to linearized equations (6.17) and, given an energy-momentum distribution $T_{\mu\nu}$ for the matter, we found their

solutions as a perturbation series in $v/c$, where $v$ is some characteristic velocity. This is also an expansion in moments of the stress tensor $T^{ij}$, Eq. (6.113). However, the stress tensor will often be difficult to calculate and is less intuitive than the mass density $(1/c^2)T^{00}$ and the momentum density $(1/c)T^{0i}$. One could easily rewrite the expression (6.113) in terms of the mass and momentum density moments $M^{ijk\ldots}$ and $P^{i,jk\ldots}$, respectively. We did this explicitly for the leading-order contribution in $v/c$, which relates $h^{\rm TT}_{ij}$ to the mass quadrupole moment: Eq. (6.122).

A first point worth noting is that the leading contribution to $h^{\rm TT}_{ij}$ is indeed due to the time dependence of the mass quadrupole; there is no monopole or dipole gravitational radiation. These contributions would have depended on time derivatives of, respectively, the mass monopole $M$ and the momentum dipole $P^i$.$^{30}$ However,
$$\dot{M} = \frac{1}{c}\int d^3x\, \partial_0 T^{00} = -\frac{1}{c}\int d^3x\, \partial_i T^{0i} = 0, \tag{6.124}$$
where we again assumed that $T^{\mu\nu}(t, \mathbf{x}) \to 0$ as $|\mathbf{x}| \to \infty$, so that when integrating the total divergence $\partial_i T^{0i}$, the boundary terms are zero. Similarly, it is easy to show that $\dot{P}^i = 0$. Hence, the total mass and momentum are conserved, and this is what is responsible for the absence of monopole or dipole radiation.$^{31}$

Let us now look more closely into the quadrupole expression (6.122) itself. What radiation is emitted depends on the direction $\hat{n}$. However, without loss of generality we can set $\hat{n} = \hat{z}$, the unit vector in the $z$-direction, as long as the orientation of the source is kept arbitrary. If we do this, then the projector $P_{ij} = \delta_{ij} - n_i n_j$ becomes
$$P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{6.125}$$
i.e., the projector onto the $(x, y)$ plane. Using the expression (6.99) for $\Lambda_{ij,kl}$ in terms of $P_{ij}$, we get, for any $3 \times 3$ matrix $A_{ij}$,
$$\Lambda_{ij,kl} A_{kl} = \left[P_{ik} P_{jl} - \frac{1}{2} P_{ij} P_{kl}\right] A_{kl} = (PAP)_{ij} - \frac{1}{2} P_{ij}\, {\rm Tr}(PA). \tag{6.126}$$
Using (6.125) we get
$$PAP = \begin{pmatrix} A_{11} & A_{12} & 0 \\ A_{21} & A_{22} & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{6.127}$$

30 Note that the mass dipole moment M i can always be set to zero with a judicious choice of coordinate system. 31 This simple argument is really only valid in the linearized theory. However, also in the full, non-linear theory one can rigorously show that there will be no monopole or dipole radiation.

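while ${\rm Tr}(PA) = A_{11} + A_{22}$. Hence
$$\Lambda_{ij,kl} A_{kl} = \begin{pmatrix} A_{11} & A_{12} & 0 \\ A_{21} & A_{22} & 0 \\ 0 & 0 & 0 \end{pmatrix}_{ij} - \frac{A_{11} + A_{22}}{2}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}_{ij} = \begin{pmatrix} (A_{11} - A_{22})/2 & A_{12} & 0 \\ A_{21} & -(A_{11} - A_{22})/2 & 0 \\ 0 & 0 & 0 \end{pmatrix}_{ij}. \tag{6.128}$$
Thus, when $\hat{n} = \hat{z}$,
$$\Lambda_{ij,kl}\ddot{M}_{kl} = \begin{pmatrix} (\ddot{M}_{11} - \ddot{M}_{22})/2 & \ddot{M}_{12} & 0 \\ \ddot{M}_{21} & -(\ddot{M}_{11} - \ddot{M}_{22})/2 & 0 \\ 0 & 0 & 0 \end{pmatrix}_{ij}. \tag{6.129}$$
Writing the quadrupole expression (6.122) in terms of $M_{ij}$$^{32}$ (Eq. (6.119)) and writing
$$h^{\rm TT}_{ij} = \begin{pmatrix} h_+ & h_\times & 0 \\ h_\times & -h_+ & 0 \\ 0 & 0 & 0 \end{pmatrix}_{ij}, \tag{6.130}$$
we can immediately read off the two gravitational-wave polarizations:
$$h_+ = \frac{1}{r}\frac{G}{c^4}\left(\ddot{M}_{11} - \ddot{M}_{22}\right), \qquad h_\times = \frac{2}{r}\frac{G}{c^4}\, \ddot{M}_{12}, \tag{6.131}$$
where in each case the RHS is computed at the retarded time $t - r/c$.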
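As an illustration of Eq. (6.131), the sketch below evaluates $h_+$ and $h_\times$ for a binary on a circular orbit in the $(x, y)$ plane, seen from the $z$-direction. It assumes two point masses described in the center-of-mass frame by a reduced mass $\mu$ at separation $R$, so that $M^{ij} = \mu\, x^i x^j$, with a fixed orbital angular frequency $\omega$ (no back-reaction). The second time derivative is taken by finite differences. All function names and numerical values are illustrative assumptions.

```python
import numpy as np

G = 6.67430e-11      # Newton's constant, m^3 kg^-1 s^-2
C = 2.99792458e8     # speed of light, m/s

def mass_quadrupole_binary(t, mu, R, omega):
    """M^ij(t) = mu x^i x^j for a circular orbit in the (x, y) plane."""
    pos = np.array([R * np.cos(omega * t), R * np.sin(omega * t), 0.0])
    return mu * np.outer(pos, pos)

def second_time_derivative(func, t, dt):
    """Central finite-difference estimate of d^2 func / dt^2."""
    return (func(t + dt) - 2.0 * func(t) + func(t - dt)) / dt ** 2

def polarizations(M_ddot, r):
    """h_+ and h_x of Eq. (6.131) for propagation along z."""
    h_plus = G / (r * C ** 4) * (M_ddot[0, 0] - M_ddot[1, 1])
    h_cross = 2.0 * G / (r * C ** 4) * M_ddot[0, 1]
    return h_plus, h_cross

# Assumed example values (SI units): reduced mass, separation,
# orbital angular frequency, and distance to the observer.
mu, R, omega, r = 1.4e30, 1.0e5, 600.0, 3.0e24
t0 = 1.3e-3   # (retarded) time at which the source is evaluated

M_ddot = second_time_derivative(
    lambda t: mass_quadrupole_binary(t, mu, R, omega), t0, 1e-6)
h_plus, h_cross = polarizations(M_ddot, r)

# Closed form for this orbit: both polarizations oscillate at twice
# the orbital frequency with amplitude 4 G mu R^2 omega^2 / (r c^4).
amp = 4.0 * G * mu * R ** 2 * omega ** 2 / (r * C ** 4)
print(h_plus, -amp * np.cos(2.0 * omega * t0))
print(h_cross, -amp * np.sin(2.0 * omega * t0))
```

The two numbers on each printed line should agree closely, showing that the gravitational-wave frequency is twice the orbital frequency.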

32 Note that $\Lambda_{ij,kl}\ddot{Q}_{kl} = \Lambda_{ij,kl}\ddot{M}_{kl}$. Although $Q_{ij}$ can be more useful when studying multipole expansions, in our calculations it will be more practical to use $M_{ij}$.

VII. THE DETECTION AND INTERPRETATION OF GRAVITATIONAL WAVES

A. Gravitational wave detectors

We have seen that gravitational waves have the effect of periodically stretching and compressing space; the way they act on arrangements of matter particles can be viewed as a propagating tidal distortion. A simple way to detect these tidal forces would then be to take a large metal bar, and look for the vibrations induced in it by passing gravitational waves. Such devices were first built by American physicist Joseph Weber in the 1960s. Although Weber claimed to have detected gravitational waves, his experiment was duplicated many times, always with a null result. As we shall see, detectors of gravitational waves from astrophysical sources need to be sensitive to relative length changes smaller than $10^{-20}$, while Weber's experiment only reached about $10^{-16}$. Weber's bars were not cooled and used piezo-electric sensors to detect vibrations. More recently, bar detectors have been constructed that are cryogenically cooled, thus largely removing thermal vibrations as a noise source, and equipped with SQUIDs (Superconducting QUantum Interference Devices) to find length changes. Spherical detectors have also been built, such as the MiniGRAIL based at the University of Leiden. A drawback of such resonant detectors is that they are only sensitive around their natural resonant frequency (typically a few kHz).

More recently, detection efforts have focused mainly on interferometric detectors. These are very large (kilometer size) interferometers, the principle behind which is illustrated in Fig. 24. A laser beam is sent through a beam splitter and ends up in two very long resonant cavities, in which luminosity is built up. The light is allowed to interfere where the arms join together. Things are set up in such a way that if no gravitational waves are passing by, the interference is destructive and no light hits the photodetector at the output. When a gravitational wave does come along, it will periodically shorten the light path in one direction and lengthen it in the other direction, changing the interference at the output. This way relative length changes of $10^{-22}$ or better can be measured. Interferometric detectors are also broadband detectors, with good sensitivity between a few tens of Hz and a few kHz.

Figure 24: A schematic representation of an interferometric gravitational wave detector.

Let us compute the response of an interferometric detector to a gravitational wave. We will do so in the limit where the wavelength of the gravitational waves is much larger than the size $L$ of the detector; in terms of the angular frequency of the gravitational waves, $\omega_{\rm gw} L/c \ll 1$. As we shall see, for most astrophysical sources this assumption is more than justified. The beam splitter and the end mirrors will then be on nearby geodesics, where "nearby" indicates being close to each other compared to the length scale over which the metric varies. From Eq. (6.53), the distance of a mirror with respect to the ("nearby") beam splitter evolves as
$$\ddot{\zeta}^i_A = \frac{1}{2}\ddot{h}_{ij}\, \zeta^j_A, \tag{7.1}$$
with $A = 1, 2$ labeling the mirrors. We will assume that the mirrors and the beam splitter are in the $(x, y)$ plane, with the beam splitter at the origin and the mirrors at $(L, 0, 0)$ and $(0, L, 0)$, respectively, when no gravitational wave is present. In the case of the $A = 1$ mirror we will only be interested in its motion in the $x$-direction, and in the case of the $A = 2$ mirror only the motion in the $y$-direction will be important. Write $\zeta^x_1 = L + \delta\zeta^x_1$ and $\zeta^y_2 = L + \delta\zeta^y_2$. Then (7.1) simplifies to
$$\delta\ddot{\zeta}^x_1 = \frac{1}{2}\ddot{h}_{xx}\,(L + \delta\zeta^x_1), \qquad \delta\ddot{\zeta}^y_2 = \frac{1}{2}\ddot{h}_{yy}\,(L + \delta\zeta^y_2). \tag{7.2}$$

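For simplicity we again consider a monochromatic gravitational wave: $h_{ij} = A_{ij}\cos(\omega t)$, so that $\ddot{h}_{ij} = -\omega^2 A_{ij}\cos(\omega t)$. Then Eqs. (7.2) lead to
$$\delta\ddot{\zeta}^x_1 = -\frac{1}{2} A_{xx}\,(L + \delta\zeta^x_1)\, \omega^2\cos(\omega t), \qquad \delta\ddot{\zeta}^y_2 = -\frac{1}{2} A_{yy}\,(L + \delta\zeta^y_2)\, \omega^2\cos(\omega t). \tag{7.3}$$
Since we expect $\delta\zeta^i_A \ll L$, we neglect them in the right hand sides of the above equations, which can then immediately be integrated to
$$\delta\zeta^x_1 = \frac{1}{2} L\, A_{xx}\cos(\omega t), \qquad \delta\zeta^y_2 = \frac{1}{2} L\, A_{yy}\cos(\omega t), \tag{7.4}$$
or
$$\delta\zeta^x_1 = \frac{1}{2} L\, h_{xx}, \qquad \delta\zeta^y_2 = \frac{1}{2} L\, h_{yy}. \tag{7.5}$$
We note that this result is also valid for a completely generic wave which is a superposition of waves with different frequencies, because the way we arrived at it only involved equations that were linear in $h_{ij}$. Effectively, the output of the detector $h(t)$ is the difference between the arm lengths in the $x$ and $y$ directions, relative to the unperturbed arm length $L$:
$$h(t) = \frac{(L + \delta\zeta^x_1) - (L + \delta\zeta^y_2)}{L}, \tag{7.6}$$
or
$$h(t) = \frac{1}{2}\,(h_{xx} - h_{yy}). \tag{7.7}$$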

This is called the strain.

Note that the expression (7.7) holds for a detector with perpendicular arms, but this restriction can easily be dropped. In the above derivation we could just as well have considered an interferometer with arms pointing along unit vectors $\hat{u}$ and $\hat{v}$, with an arbitrary angle between the two. The strain would then have taken the form
$$h(t) = \frac{1}{2}\,(h_{uu} - h_{vv}), \tag{7.8}$$
with $h_{uu} = h_{ij} u^i u^j$ and $h_{vv} = h_{ij} v^i v^j$. It is convenient to introduce a detector tensor $D^{ij}$ which relates the metric perturbation in the TT gauge to the detector response:
$$h(t) = D^{ij} h_{ij}, \tag{7.9}$$
where
$$D^{ij} = \frac{1}{2}\,(u^i u^j - v^i v^j). \tag{7.10}$$
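Equations (7.9) and (7.10) translate directly into a few lines of code. The sketch below builds the detector tensor for two arm directions and contracts it with a TT-gauge perturbation for a wave travelling along $z$; the function names and the numerical amplitudes are illustrative assumptions.

```python
import numpy as np

def detector_tensor(u, v):
    """D^ij = (u^i u^j - v^i v^j) / 2 for arm directions u and v, Eq. (7.10)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    return 0.5 * (np.outer(u, u) - np.outer(v, v))

def strain(D, h):
    """h(t) = D^ij h_ij, Eq. (7.9)."""
    return np.einsum("ij,ij->", D, h)

# Assumed example: wave along z with given polarization amplitudes.
h_plus, h_cross = 1.0e-21, 0.3e-21
h_ij = np.array([[h_plus,  h_cross, 0.0],
                 [h_cross, -h_plus, 0.0],
                 [0.0,      0.0,    0.0]])

# Arms along x and y: the detector responds to h_+ only.
D_xy = detector_tensor([1, 0, 0], [0, 1, 0])
print(strain(D_xy, h_ij))

# Arms rotated by 45 degrees in the (x, y) plane: it responds to h_x only.
D_45 = detector_tensor([1, 1, 0], [-1, 1, 0])
print(strain(D_45, h_ij))
```

The two cases show why a single interferometer cannot separate the polarizations: its output is one linear combination of $h_+$ and $h_\times$ fixed by its orientation.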

Since Eq. (7.9) is linear in $h_+$ and $h_\times$, it is always possible to express it as
$$h(t) = F_+ h_+ + F_\times h_\times, \tag{7.11}$$
where $F_+$ and $F_\times$ are called the beam pattern functions. Note that $h_+$ and $h_\times$ were defined with respect to a frame in which the gravitational wave propagates in the $z$-direction, in which case
$$h'_{ij} = \begin{pmatrix} h'_+ & h'_\times & 0 \\ h'_\times & -h'_+ & 0 \\ 0 & 0 & 0 \end{pmatrix}_{ij}. \tag{7.12}$$
However, an interferometer can have an arbitrary orientation with respect to the propagation direction of the gravitational radiation. Let us call the frame in which (7.12) holds the gravitational wave frame and denote it by $(x', y', z')$. Next, assume an interferometer with $90^\circ$ opening angle, and define a frame $(x, y, z)$ such that one arm points in the $x$-direction and another in the $y$-direction; the $z$-axis is fixed by demanding that $(x, y, z)$ be a right-handed coordinate system.

First note that there is some inherent arbitrariness in the choice of the gravitational wave frame. It will be such that the direction of propagation, $\hat{n}$, is along the $z'$-axis, but the $x'$ and $y'$ axes can be arbitrary. The gravitational radiation emitted by most sources is elliptically polarized, meaning that the amplitudes of $h'_+$ and $h'_\times$ are unequal, and the gravitational wave frame is normally defined such that $x'$ and $y'$ are along the major and minor axes of the associated ellipse, respectively. It is convenient to introduce new axes $x''$ and $y''$ such that $x''$ points in the $(z', z)$ plane. If $\psi$ is the angle between $x'$ and $x''$, then one has
$$\hat{x}'' = \hat{x}'\cos(\psi) - \hat{y}'\sin(\psi), \qquad \hat{y}'' = \hat{x}'\sin(\psi) + \hat{y}'\cos(\psi). \tag{7.13}$$
It is straightforward to show that in the $(x'', y'', z'')$ frame (where $z'' = z'$),
$$h''_{ij} = \begin{pmatrix} h_+ & h_\times & 0 \\ h_\times & -h_+ & 0 \\ 0 & 0 & 0 \end{pmatrix}_{ij}, \tag{7.14}$$

with$^{33}$
$$h_+ = h'_+\cos(2\psi) - h'_\times\sin(2\psi), \qquad h_\times = h'_+\sin(2\psi) + h'_\times\cos(2\psi). \tag{7.15}$$
The angle $\psi$ is referred to as the polarization angle. Next, we need to rotate (7.14) from the $(x'', y'', z'')$ frame to the $(x, y, z)$ frame. This is effected by a rotation by an angle $\theta$ around the $y$-axis followed by a rotation by an angle $\phi$ around the $z$-axis. Note that $(\theta, \phi)$ are the usual spherical coordinates in the $(x, y, z)$ frame; they determine the direction of propagation in the detector frame and hence the sky position of the source. The corresponding rotation matrix is
$$R = \begin{pmatrix} \cos(\phi) & \sin(\phi) & 0 \\ -\sin(\phi) & \cos(\phi) & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} \cos(\theta) & 0 & \sin(\theta) \\ 0 & 1 & 0 \\ -\sin(\theta) & 0 & \cos(\theta) \end{pmatrix}. \tag{7.16}$$
In the $(x, y, z)$ frame, the metric perturbation is
$$h_{ij} = R_{ik}\, h''_{kl}\, R_{jl}. \tag{7.17}$$
With some algebra, one finds that the strain (7.7) in a $90^\circ$ detector is given by
$$h = \frac{1}{2}\,(h_{xx} - h_{yy}) = F_+ h_+ + F_\times h_\times, \tag{7.18}$$
where
$$\begin{aligned}
F_+ &= \frac{1}{2}\left(1 + \cos^2(\theta)\right)\cos(2\phi)\cos(2\psi) - \cos(\theta)\sin(2\phi)\sin(2\psi), \\
F_\times &= \frac{1}{2}\left(1 + \cos^2(\theta)\right)\cos(2\phi)\sin(2\psi) + \cos(\theta)\sin(2\phi)\cos(2\psi).
\end{aligned} \tag{7.19}$$

A number of large interferometers are currently operational: the two LIGO detectors in the US (with 4 km arm length), the Virgo detector in Italy (3 km arms), GEO in Germany (600 m arms), and TAMA in Japan (300 m). Fig. 25 shows an aerial view of Virgo. LIGO and Virgo currently operate at design sensitivity and will remain operational, with occasional upgrades, until 2011. By 2014, Advanced LIGO and Virgo will have been built at the same facilities, with a factor of $\sim 10$ improvement in sensitivity over the initial detectors. Astrophysical estimates suggest that a detection with the existing instruments is plausible, but it is by no means guaranteed. However, the Advanced detectors should see tens to hundreds of signals per year, mostly from inspiraling and colliding binary neutron stars and/or black holes with masses between 3 and 100 solar masses. These and other sources will be discussed later on.

A bit further in the future (around 2020), there will be LISA, the Laser Interferometer Space Antenna. As the name suggests, this will be a space-based detector. It will consist of three spacecraft in orbit around the Sun. The orbits are inclined in such a way that the craft always maintain an equilateral triangle configuration, with a constant distance of $5 \times 10^6$ km between them. The left panel of Fig. 26 shows an artist's rendering of LISA.
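Returning to the beam pattern functions of Eq. (7.19), a short sketch makes their structure explicit. It is written with the last factor of $F_\times$ as $\cos(2\psi)$, so that $F_+$ and $F_\times$ rotate into each other under a change of the polarization angle and $F_+^2 + F_\times^2$ does not depend on $\psi$. The function name and the sample angles are illustrative assumptions.

```python
import numpy as np

def beam_patterns(theta, phi, psi):
    """F_+ and F_x for a 90-degree interferometer, angles in radians."""
    a = 0.5 * (1.0 + np.cos(theta) ** 2) * np.cos(2.0 * phi)
    b = np.cos(theta) * np.sin(2.0 * phi)
    f_plus = a * np.cos(2.0 * psi) - b * np.sin(2.0 * psi)
    f_cross = a * np.sin(2.0 * psi) + b * np.cos(2.0 * psi)
    return f_plus, f_cross

# Wave arriving along the z-axis of the detector frame: maximal response.
print(beam_patterns(0.0, 0.0, 0.0))

# Wave arriving in the detector plane along the bisector of the arms:
# both arms are affected equally, so the detector is blind.
print(beam_patterns(np.pi / 2.0, np.pi / 4.0, 0.3))

# The total response F_+^2 + F_x^2 is independent of psi.
fp1, fc1 = beam_patterns(1.0, 0.4, 0.0)
fp2, fc2 = beam_patterns(1.0, 0.4, 1.1)
print(fp1 ** 2 + fc1 ** 2, fp2 ** 2 + fc2 ** 2)
```

The blind directions are one reason why networks of differently oriented detectors are used.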

33 With minor abuse of notation, in (7.14) we write $h_+ = h''_+$ and $h_\times = h''_\times$.

Figure 25: The French-Italian-Dutch Virgo detector is an interferometer with 3 km arm length. Together with the two LIGO detectors in the US it is poised to make a first gravitational wave detection in the next several years.

As a rule of thumb, the larger the detector, the lower its frequency range; LISA will be sensitive to gravitational waves with frequencies between $10^{-4}$ and 0.1 Hz. This in turn means that it will be picking up signals from sources with a much larger size. LISA will see the inspiral and merger of binary supermassive black holes of a million to a billion solar masses. Most galaxies are believed to have at least one such very massive black hole in their centers (the one in our own galaxy contains some 4.1 million solar masses). Sometimes galaxies merge, in which case their central black holes will sink towards the middle of the newly formed, larger galaxy. They can then form a binary system and eventually they will themselves merge to form a single black hole, emitting copious amounts of gravitational radiation in the process.

Figure 26: Left: An artist’s impression of the Laser Interferometer Space Antenna, to be launched around 2020. Right: A possible lay-out for the Einstein Telescope.

Currently an EU-funded design study is in progress for a very advanced ground-based facility called ET, for Einstein Telescope. ET may consist of several interferometers, possibly V-shaped and arranged in a triangle with 10 km sides. It is to be built some time after 2020. ET will look for similar sources as Advanced LIGO and Virgo, but its sensitivity will be another factor of 10 better. With a much larger number of sources as well as better measurement capabilities, it will be a prime instrument for studying the strong-field dynamics of gravity.

B. Finding signals in noise

So far no gravitational wave detection has taken place, although as we shall see, a first detection will almost certainly happen within a decade from now. The data analysis problem is a formidable one: signals will be buried deep into the noise. A great variety of techniques have been developed to cope with this difficulty. In this chapter we will focus on a technique to search for signals of which the waveform is approximately known, but we emphasize from the outset that many other search methods are known and in use. If an expected source can be modeled to the extent that its gravitational waveform is more or less known, then one can apply matched filtering, and this is what we will discuss here. So far we have pretended that detectors are noise-free and that the output, the strain h(t), is solely determined by the tidal action of gravitational waves on the interferometer arms. In reality, the measured strain, s(t), will be a superposition of the signal, h(t), and the noise,34 n(t): s(t) = n(t) + h(t). (7.20) If the shape of the signal h(t) is more or less known, then one can integrate it against the output and divide by the observation time T :

$$\frac{1}{T}\int_0^T s(t)\, h(t)\, dt = \frac{1}{T}\int_0^T n(t)\, h(t)\, dt + \frac{1}{T}\int_0^T h(t)^2\, dt. \tag{7.21}$$
Both $n(t)$ and $h(t)$ are oscillating functions. However, the second integral in the RHS is positive definite whereas the first one is itself oscillating. Heuristically,
$$\frac{1}{T}\int_0^T h(t)^2\, dt \sim h_0^2, \tag{7.22}$$
where $h_0$ is the characteristic amplitude of the signal. On the other hand,
$$\frac{1}{T}\int_0^T n(t)\, h(t)\, dt \sim \left(\frac{\tau_0}{T}\right)^{1/2} n_0 h_0, \tag{7.23}$$
where $n_0$ is the characteristic amplitude of the noise, and $\tau_0$ is the characteristic period. We see that in the limit $T \to \infty$, the integral (7.23) will go to zero, but not (7.22). In practice, we cannot make $T$ arbitrarily large; indeed, the signals from coalescing binaries will last for at most a few minutes, limiting $T$. Still, it is not necessary to have $h_0 > n_0$ in order to detect the signal; all that is needed is $h_0 > (\tau_0/T)^{1/2}\, n_0$. Coalescing compact

34 There are many noise sources that contribute to n(t). The most important ones are thermal noise due to the thermal vibrations of atoms in the optics of the interferometer, seismic noise, and shot noise due to the fact that light has a particle as well as a wave character. All these phenomena degrade the accuracy with which an interferometer can measure relative length changes.

binaries have a characteristic frequency of $\sim 100$ Hz, or $\tau_0 \sim 10^{-2}$ s. Setting $T = 100$ s, one has $(\tau_0/T)^{1/2} \sim 10^{-2}$. For a continuous signal from, e.g., a millisecond pulsar, the situation is even better in this regard; setting $\tau_0 = 1$ ms and observing for $T = 1$ yr, one finds $(\tau_0/T)^{1/2} \sim 10^{-5}$.

The above already shows that one can potentially dig very deep into the noise. But as it turns out, simply integrating the detector output against the signal is not the best one can do. First define
$$\hat{s} = \int_{-\infty}^{\infty} dt\, s(t)\, K(t), \tag{7.24}$$
where $K(t)$ is called a filter. The signal-to-noise ratio is now defined as $S/N$, where $S$ is the expected value of $\hat{s}$ if the signal is present, and $N$ is its root-mean-square value if the signal is absent. The idea is to find a filter $K(t)$ which maximizes the signal-to-noise ratio$^{35}$ (SNR). By "expected value" we mean the ensemble average: if $f(t)$ is a function related to the noise, then $\langle f(t) \rangle$ is the value of $f(t)$ averaged over infinitely many realizations of the noise.$^{36}$ One has
$$S = \int_{-\infty}^{\infty} dt\, \langle s(t) \rangle\, K(t) = \int_{-\infty}^{\infty} dt\, h(t)\, K(t) = \int_{-\infty}^{\infty} df\, \tilde{h}(f)\, \tilde{K}^*(f). \tag{7.25}$$
In the next-to-last step we have used that $\langle n(t) \rangle = 0$, and in the last step we have Fourier expanded $h$ and $K$; a tilde denotes the Fourier transform and an asterisk complex conjugation. On the other hand,
$$N = \left[\langle \hat{s}^2 \rangle - \langle \hat{s} \rangle^2\right]^{1/2}_{h=0} = \left[\langle \hat{s}^2 \rangle\right]^{1/2}_{h=0} = \left[\int_{-\infty}^{\infty} dt \int_{-\infty}^{\infty} dt'\, K(t)\, K(t')\, \langle n(t)\, n(t') \rangle\right]^{1/2}. \tag{7.26}$$
Again using Fourier decomposition of both filter and noise factors, this can be written as
$$N = \left[\int_{-\infty}^{\infty} df\, \frac{1}{2} S_n(f)\, |\tilde{K}(f)|^2\right]^{1/2}, \tag{7.27}$$
where $S_n(f)$ is defined implicitly through
$$\langle \tilde{n}^*(f)\, \tilde{n}(f') \rangle = \delta(f - f')\, \frac{1}{2} S_n(f). \tag{7.28}$$

35 The detailed motivation for the definition of S/N is the so-called Neyman-Pearson criterion, which maximizes the probability of detection for a given false alarm probability. 36 Weighted by the probability functional of the noise; not every realization will be equally likely.

$S_n(f)$ is called the noise spectral density, with units of Hz$^{-1}$. It describes in which way the noise is correlated with itself.$^{37}$ Integrating over frequency, it also yields the variance of the noise in the time domain:
$$\int_0^{\infty} df\, S_n(f) = \langle n^2(t) \rangle = \langle n^2(t = 0) \rangle, \tag{7.29}$$

where we have used that Sn(−f) = Sn(f), as follows from its definition (7.28). Putting everything together, we find that the signal-to-noise ratio is

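$$\frac{S}{N} = \frac{\int_{-\infty}^{\infty} df\, \tilde{h}(f)\, \tilde{K}^*(f)}{\left[\int_{-\infty}^{\infty} df\, \frac{1}{2} S_n(f)\, |\tilde{K}(f)|^2\right]^{1/2}}. \tag{7.30}$$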
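Before looking for the best filter, it can help to evaluate Eq. (7.30) numerically for a few trial filters. The sketch below discretizes the frequency integrals on a two-sided grid and compares the plain correlation filter $\tilde{K} = \tilde{h}$ with the noise-weighted choice $\tilde{K} = \tilde{h}/S_n$ that the derivation in the text arrives at. The toy spectral density, the toy signal spectrum and all names are assumptions made only for illustration, in arbitrary units.

```python
import numpy as np

def snr_of_filter(h_f, K_f, Sn, df):
    """Discretized Eq. (7.30) on a two-sided frequency grid."""
    S = np.sum(h_f * np.conj(K_f)).real * df
    N = np.sqrt(np.sum(0.5 * Sn * np.abs(K_f) ** 2) * df)
    return S / N

# Toy two-sided frequency grid, noise spectrum and signal spectrum.
f = np.linspace(-512.0, 512.0, 4097)
df = f[1] - f[0]
Sn = 1.0 + (f / 100.0) ** 2                        # even in f, positive
h_f = np.exp(-((np.abs(f) - 150.0) / 30.0) ** 2)   # real and even in f

snr_plain = snr_of_filter(h_f, h_f, Sn, df)        # K~ = h~
snr_weighted = snr_of_filter(h_f, h_f / Sn, Sn, df)  # K~ = h~ / S_n

# Upper bound from the Cauchy-Schwarz inequality.
snr_bound = np.sqrt(np.sum(np.abs(h_f) ** 2 / (0.5 * Sn)) * df)

print(snr_plain, snr_weighted, snr_bound)
```

In this toy setup the noise-weighted filter gives a larger S/N than the plain correlation and saturates the bound, which is the statement proven algebraically in what follows.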

We still need to find the filter K(t) which maximizes S/N. This can most easily be done algebraically, by introducing the following inner product:

$$(a|b) \equiv {\rm Re}\int_{-\infty}^{\infty} df\, \frac{\tilde{a}^*(f)\, \tilde{b}(f)}{\frac{1}{2} S_n(f)} = 4\, {\rm Re}\int_0^{\infty} df\, \frac{\tilde{a}^*(f)\, \tilde{b}(f)}{S_n(f)}. \tag{7.31}$$
With this definition, one can write $S/N$ as
$$\frac{S}{N} = \frac{(\mathbf{K}|h)}{(\mathbf{K}|\mathbf{K})^{1/2}}, \tag{7.32}$$
where $\mathbf{K}$ is the function whose Fourier transform is
$$\tilde{\mathbf{K}}(f) = \frac{1}{2} S_n(f)\, \tilde{K}(f). \tag{7.33}$$
With the definition (7.31), the SNR is just the inner product of the signal waveform with the unit vector $\mathbf{K}/(\mathbf{K}|\mathbf{K})^{1/2}$ built from (7.33). Clearly, to maximize the SNR we need to choose $K(t)$ in such a way that this unit vector points in the same direction as $\tilde{h}(f)$. This is effected by choosing the filter's Fourier transform to be
$$\tilde{K}(f) \propto \frac{\tilde{h}(f)}{S_n(f)}, \tag{7.34}$$
which is the so-called Wiener filter. The proportionality constant is unimportant, as the vector $\mathbf{K}$ only affects the SNR in normalized form. With (7.34), the optimal SNR becomes
$$\frac{S}{N} = (h|h)^{1/2} = \left[4\int_0^{\infty} df\, \frac{|\tilde{h}(f)|^2}{S_n(f)}\right]^{1/2}. \tag{7.35}$$
This result makes intuitive sense. The optimal filter should not just depend on the signal's waveform, but also on the properties of the noise. In particular, at frequencies where there is

37 Note that Eq. (7.28) is not only the definition of Sn(f). It also implies an assumption on the noise, namely that it is stationary, i.e., its different Fourier components are uncorrelated.

more noise (larger $S_n(f)$), the filter should get suppressed; where there is less noise (smaller $S_n(f)$), it should have more weight.

We will have more to say about signal-to-noise ratios later, when discussing individual sources. For now we stress that in practice, the optimum (7.35) is never reached. For many sources we know the waveform family, but the signal could be any waveform from this continuous set.$^{38}$ This is the case with the inspiral and coalescence of compact binary objects such as neutron stars and black holes. These come in a wide variety of masses, and different masses lead to different waveforms. To cover the presumed parameter space, one must search the data with a wide variety of filters $\tilde{h}(f; \theta)/S_n(f)$, each with different values for the parameters $\theta = \{\theta_1, \ldots, \theta_N\}$ characterizing the waveforms.$^{39}$ Of necessity, the number of filters will be finite, and the relevant part of parameter space will only be probed by a sprinkling of points in it. Such a discrete collection of filters $\tilde{h}(f; \theta)/S_n(f)$ is called a template bank.

Next, the noise is not really stationary, as was assumed in the very definition of the spectral density; see Eq. (7.28) and the associated footnote. To take this into account, in practice a new template bank is generated on a regular basis; in the search for binary coalescences with LIGO and Virgo a new bank is made for every 2048 s of data. "Glitches" in the data are often sufficiently similar to a signal from a known waveform family that they generate an SNR above the pre-set threshold; in a typical week there will be thousands of such events. A detection will not be declared unless a candidate signal is seen more or less simultaneously in at least two different detectors, and the templates that give the highest SNRs in individual detectors are associated with similar parameter values.

In practice, the data analyst does not quite define SNR as we did above. In the definition of $S$, Eq. (7.25), he/she has no access to an "expected" value $\langle s(t) \rangle$, only to the data stream that actually came out of the detector, $s(t)$. Regarding $N$, one can take Eq. (7.27) at face value, and construct a noise spectral density $S_n(f)$ from Eq. (7.28) by pretending that the noise is stationary and only considering correlation between noise Fourier modes of the same frequency. Given these considerations and going through similar steps as above, one arrives at
$$\frac{S}{N} = \max_\theta \frac{(h(\theta)|s)}{\sqrt{(h(\theta)|h(\theta))}}. \tag{7.36}$$
This brings us to the issue of parameter estimation. Having the ability to infer, say, the component masses in candidate coalescence signals with good accuracy is important in reducing false alarm rates when comparing near-simultaneous triggers in different detectors. Moreover, when a detection is actually made, we will want to extract as much physical information about the source as we possibly can.

Although the above caveats should be kept in mind, we will continue to make somewhat idealized assumptions about the data. In particular, we will take it to be not only stationary but also Gaussian; i.e., we assume that the probability distribution for different realizations of the noise takes the form
$$P(n) = \mathcal{N}\exp\left[-\frac{1}{2}\int_{-\infty}^{\infty} df\, \frac{|\tilde{n}(f)|^2}{\frac{1}{2} S_n(f)}\right], \tag{7.37}$$

38 There will also be inaccuracies in the modeling of the waveforms; obtaining more faithful waveform families is currently an object of intense research 39 In the case of some parameters it may be possible to maximize the SNR over them analytically, but usually not for all.

where the variance of the Fourier mode with frequency $f$ was taken to be proportional to $(1/2) S_n(f)$, following Eq. (7.28). Using the inner product (7.31), this can be written more succinctly as
$$P(n) = \mathcal{N}\exp\left[-(n|n)/2\right]. \tag{7.38}$$
Now imagine that the conditions for claiming a detection are satisfied. Then the detector output is of the form $s(t) = n(t) + h(t; \theta)$, with $n(t)$ a realization of the noise and $h(t; \theta)$ the signal, with parameter values $\theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$. The latter are a priori unknown, as template waveforms that cause the largest SNR may have different parameter values from the signal. Using the distribution (7.37), we can write down the likelihood that a particular detector output $s(t)$ is seen given that there is a signal with parameters $\theta$ in the data:

P (s|θ) = N exp [−(s − h(θ)|s − h(θ))/2] . (7.39)
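The likelihood (7.39) is straightforward to evaluate once the inner product (7.31) has been discretized. The sketch below does this on a one-sided frequency grid and scans a one-parameter family of templates $h = A\, h_0$ over the amplitude $A$. For a deterministic illustration the "data" are taken to be noise-free, $s = 2 h_0$; the toy spectra, the grid and all names are assumptions, in arbitrary units. The closed-form amplitude estimate used at the end is the one derived in the text as Eq. (7.45).

```python
import numpy as np

def inner_product(a_f, b_f, Sn, df):
    """(a|b) = 4 Re int_0^inf df a~*(f) b~(f) / S_n(f), one-sided grid."""
    return 4.0 * df * np.sum(np.conj(a_f) * b_f / Sn).real

def log_likelihood(s_f, h_f, Sn, df):
    """log P(s|theta) up to a constant: -(s - h | s - h) / 2."""
    residual = s_f - h_f
    return -0.5 * inner_product(residual, residual, Sn, df)

# Toy one-sided frequency grid, noise spectrum and unit-amplitude template.
df = 0.5
f = np.arange(20.0, 500.0, df)
Sn = 1.0 + (f / 100.0) ** 2
h0 = np.exp(-((f - 150.0) / 30.0) ** 2) * np.exp(2j * np.pi * f * 0.01)

true_amplitude = 2.0
s = true_amplitude * h0          # noise-free data, for illustration only

# Brute-force scan of the likelihood over the amplitude.
amplitudes = np.linspace(0.0, 4.0, 401)
logL = np.array([log_likelihood(s, A * h0, Sn, df) for A in amplitudes])
best_amplitude = amplitudes[np.argmax(logL)]

# Closed-form maximum-likelihood amplitude, (h0|s) / (h0|h0).
A_hat = inner_product(h0, s, Sn, df) / inner_product(h0, h0, Sn, df)

print(best_amplitude, A_hat)
```

With noise added to `s`, the peak of the scan would scatter around the true amplitude, with a spread that shrinks as the SNR grows.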

Now we used Bayes’s theorem, which says that for statements A and B, the probability that A is true given B can be written as P (B|A) P (A) P (A|B) = , (7.40) P (B) where P (B|A) is the probability of B if A is given, and P (A), P (B) are the overall prob- abilities for A and B, respectively. Applied to our problem, A could be a set of parameter values θ, and B a realization of the detector output s(t). Hence, P (s|θ) P (θ) P (θ|s) = . (7.41) P (s) Since in the above, s is a given, we will be able to absorb the denominator P (s) in the RHS into an overall prefactor. In the Bayesian approach to statistics, which we are adopting here, the prior probability P (θ) quantifies any prior knowledge as well as assumptions. For instance, neutron star masses are known to be sharply peaked around 1.35 M ; regarding distances, for isotropic sources one can assume a prior distribution p(0)(r) dr ∝ r dr for sources in the galactic disk, or p(0)(r) dr ∝ r2dr for extragalactic sources. If one does not want to include any prior assumptions, one adopts a uniform prior p(0) = const, which can then be absorbed into the normalization factor N . This is what we will do here. Putting everything together, we get  1  P (θ|s) = N exp (h(θ)|s) − (h(θ)|h(θ)) , (7.42) 2 where we also absorbed a factor exp [−(s|s)/2] into the overall prefactor N (again, s is a given). To arrive at error estimates, it is convenient to find the peak of the distribu- ˆ tion P (θ|s). In the case of a flat prior, the location of the peak θML is called the maximum likelihood estimator. If p(0) is constant, then maximizing P (θ|s) is the same as ˆ maximizing P (s|θ), which is given by (7.39). This gives θML an elegant geometric interpret- ation. The family of signal waveforms h(t; θ) can be viewed as a manifold. The addition of noise n(t) takes us out of this manifold, giving an output s(t). Now, Eq. (7.39) tells us that maximizing P (s|θ) is the same as minimizing (s − h(θ)|s − h(θ)). 
The quantity $(s - h(\theta)\,|\,s - h(\theta))$ has a natural interpretation as the square of the distance between the detector output $s$ and the waveform $h(\theta)$ in the manifold of signals. Hence, the maximum likelihood estimator $\hat\theta_{\rm ML}$ is the set of values $\theta$ that gives the minimal distance squared between the detector output and the signal manifold.

The maximum likelihood estimator has another interesting property, which is that it also gives the parameter values which maximize the SNR, Eq. (7.36). To see this, let us return to Eq. (7.42). Maximizing $P(\theta|s)$ is the same as maximizing its logarithm,

$$\log P(\theta|s) = (h(\theta)|s) - \tfrac{1}{2}\,(h(\theta)|h(\theta)) + {\rm const}. \tag{7.43}$$

Now write $h(t;\theta) = A\,h_0(t;\theta')$, where $A$ is an amplitude, and we have made the split $\theta = \{A, \theta'\}$ (i.e., $\theta'$ denotes all remaining parameters). One then has (see footnote 40)

$$\log P(\theta|s) = A\,(h_0(\theta')|s) - \frac{A^2}{2}\,(h_0(\theta')|h_0(\theta')). \tag{7.44}$$

Setting $\partial \log P(\theta|s)/\partial A = 0$, we find the maximum likelihood estimate for $A$,

$$\hat A_{\rm ML} = \frac{(h_0(\theta')|s)}{(h_0(\theta')|h_0(\theta'))}. \tag{7.45}$$

Substituting this back into (7.44), we get

$$\max_A \log P(\theta|s) = \frac{1}{2}\,\frac{(h_0(\theta')|s)^2}{(h_0(\theta')|h_0(\theta'))}. \tag{7.46}$$

Multiplying numerator and denominator by $A^2$ and maximizing over the remaining parameters, we find

$$\max_\theta \log P(\theta|s) = \frac{1}{2}\,\max_\theta \frac{(h(\theta)|s)^2}{(h(\theta)|h(\theta))}. \tag{7.47}$$

Since the maximum likelihood values $\hat\theta_{\rm ML}$ are, by definition, the $\theta$ values that maximize the LHS, the above tells us that $\hat\theta_{\rm ML}$ are also the values that maximize the SNR as given by Eq. (7.36).

If the SNR is large, then parameter uncertainties will be small. In the expression (7.42), one can then expand the exponent around $\hat\theta_{\rm ML}$, writing $\theta = \hat\theta_{\rm ML} + \delta\theta$ and only retaining leading-order terms in $\delta\theta$. Recall that $\theta$ is actually a set of parameters $\theta^i$, $i = 1, \ldots, N$, so similarly we have a set of $\delta\theta^i$. Now, to first order,

$$h(t;\theta) = h(t;\hat\theta_{\rm ML}) + \partial_i h(t;\hat\theta_{\rm ML})\,\delta\theta^i, \tag{7.48}$$

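where $\partial_i = \partial/\partial\theta^i$. In the exponent of (7.42), all zeroth-order terms can be absorbed into $\mathcal{N}$. Any terms linear in the $\delta\theta^i$ vanish since, by definition, $\hat\theta_{\rm ML}$ is the point where the exponent is extremal (it maximizes the exponent). Hence the leading-order non-trivial contributions to the exponent are the quadratic ones, and in that approximation $P(\theta|s)$ takes the form

$$P(\delta\theta) = \mathcal{N} \exp\left[-\tfrac{1}{2}\,\Gamma_{ij}\,\delta\theta^i \delta\theta^j\right], \tag{7.49}$$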

40 In what follows we ignore the constant contribution in the previous equation, as it is irrelevant for the rest of the discussion.

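where

$$\Gamma_{ij} = (\partial_i \partial_j h\,|\,h - s) + (\partial_i h\,|\,\partial_j h). \tag{7.50}$$

In the first term, $h - s = -n$, and in the limit of large signal-to-noise ratio, $|h|$ dominates over $|n|$. Hence, in this limit,

$$\Gamma_{ij} = (\partial_i h\,|\,\partial_j h). \tag{7.51}$$

This is called the Fisher information matrix. One can show the following:

$$\langle \delta\theta^i \delta\theta^j \rangle = (\Gamma^{-1})^{ij}, \tag{7.52}$$

where in this case $\langle \ldots \rangle$ denotes the average with respect to the probability density $P(\delta\theta)$ as in (7.49), with $\Gamma_{ij}$ as in (7.51). For a function $f(\delta\theta)$,

$$\langle f(\delta\theta) \rangle = \int d(\delta\theta^1) \int d(\delta\theta^2) \cdots \int d(\delta\theta^N)\, f(\delta\theta)\, P(\delta\theta). \tag{7.53}$$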

The matrix

$$\Sigma^{ij} \equiv (\Gamma^{-1})^{ij} \tag{7.54}$$

is called the covariance matrix. From Eq. (7.52),

$$\Delta\theta^i \equiv \sqrt{\langle (\delta\theta^i)^2 \rangle} = \sqrt{\Sigma^{ii}}, \tag{7.55}$$

where in the last equality there is no summation over repeated indices. Thus, the square roots of the diagonal elements of $\Sigma$ allow us to estimate the root-mean-square (i.e., 1-$\sigma$) uncertainties on the parameters. The other elements give the covariances between parameters, as Eq. (7.52) also shows:

$$\langle \delta\theta^i \delta\theta^j \rangle = \Sigma^{ij}. \tag{7.56}$$

These are often normalized to give numbers between $-1$ and $1$, called the correlation coefficients:

$$c^{ij} \equiv \frac{\Sigma^{ij}}{\sqrt{\Sigma^{ii}\,\Sigma^{jj}}}. \tag{7.57}$$

If the correlation between two parameters is close to zero, then they are essentially independent from each other as far as the measurement problem is concerned. If $c^{ij} \simeq -1$ (strongly anti-correlated) or $c^{ij} \simeq +1$ (strongly correlated), then it will be hard to disentangle $\theta^i$ and $\theta^j$ from the information contained in the signal. In that case it will be difficult to measure them separately, although a combination of them may still be measurable with good accuracy.

We stress once again that the covariance matrix formalism as outlined above relies on the SNR being large, which often will not be the case. Nevertheless, the formalism is widely used to give an indication of the accuracy with which quantities can be measured using gravitational wave observations. An important application has been in the analysis of signals from inspiraling compact binaries. If more or less simultaneous triggers are seen in two detectors, then one can use the formalism to estimate the uncertainty in the parameters in the individual detectors, assuming a real signal is involved. The parameters associated with the templates that gave the largest SNR give the values of $\hat\theta_{\rm ML}$ in each of the detectors, and the covariance matrices give the 1-$\sigma$ uncertainty intervals around these values (which for safety's sake are then multiplied by some factor). One then compares these intervals in the two detectors for all of the parameters involved.
If none of them overlap, then the event can confidently be dismissed as having been spurious and not a gravitational wave signal. When a first gravitational wave signal is seen, the covariance formalism will give us a rough idea of what kind of source we are dealing with, after which more sophisticated techniques will be used for further investigations.
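The step from the Fisher matrix (7.51) to the 1-$\sigma$ uncertainties (7.55) and correlation coefficients (7.57) is a matrix inversion followed by a normalization. A minimal sketch in plain Python follows; the 2×2 Fisher matrix is made up for illustration, and the helper names `invert` and `uncertainties_and_correlations` are hypothetical, not taken from any analysis code.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [Gamma | 1].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-300:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def uncertainties_and_correlations(fisher):
    """Return (Sigma, 1-sigma uncertainties, correlation coefficients), Eqs. (7.54)-(7.57)."""
    covariance = invert(fisher)
    n = len(fisher)
    one_sigma = [math.sqrt(covariance[i][i]) for i in range(n)]
    correlation = [[covariance[i][j] / (one_sigma[i] * one_sigma[j]) for j in range(n)]
                   for i in range(n)]
    return covariance, one_sigma, correlation

# Made-up Fisher matrix for two parameters (illustration only).
fisher = [[4.0, 2.0],
          [2.0, 3.0]]
covariance, one_sigma, correlation = uncertainties_and_correlations(fisher)
print("1-sigma uncertainties:", one_sigma)
print("correlation coefficient c^12:", correlation[0][1])
```

In this example the off-diagonal correlation coefficient comes out negative even though the off-diagonal Fisher element is positive; the sign flips under inversion for a 2×2 matrix.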

VIII. ASTROPHYSICAL SOURCES OF GRAVITATIONAL RADIATION

A. Inspiral of compact binaries

We have seen how gravitational waves are generated, what their action is on test particles, and what energy they carry. Let us now turn to applications, the first one being the inspiral of compact binaries. First consider two compact objects (neutron stars and/or black holes) in a circular orbit around their center of gravity with a sufficiently large separation $R$ between the bodies, so that the energy carried away by gravitational radiation is very small and the orbits can be considered fixed over at least one period. In addition, let's assume that $R \gg R_{{\rm schw},1,2}$, where the latter are the objects' Schwarzschild radii, set by their masses $m_1$ and $m_2$; in that case they can also be considered as point particles. They move on circles with radii $(\mu/m_i)R$, $i = 1, 2$, with $\mu = m_1 m_2/(m_1 + m_2)$ the reduced mass; setting the associated centripetal force equal to the Newtonian gravitational force acting over the separation $R$, one finds Kepler's law:

$$\omega_{\rm orb}^2 = \frac{GM}{R^3}, \tag{8.1}$$

where $\omega_{\rm orb}$ is the angular orbital frequency and $M = m_1 + m_2$ is the total mass. To leading order in velocity, gravitational radiation results from the time evolution of the quadrupole moment and is described by Eq. (6.122). A brief calculation shows that the quadrupole gravitational wave polarizations are given by

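$$h_+(t) = \frac{4}{r}\left(\frac{GM_c}{c^2}\right)^{5/3}\left(\frac{\pi f_{\rm gw}}{c}\right)^{2/3}\frac{1+\cos^2(\iota)}{2}\,\cos(2\pi f_{\rm gw} t_{\rm ret} + 2\phi),$$
$$h_\times(t) = \frac{4}{r}\left(\frac{GM_c}{c^2}\right)^{5/3}\left(\frac{\pi f_{\rm gw}}{c}\right)^{2/3}\cos(\iota)\,\sin(2\pi f_{\rm gw} t_{\rm ret} + 2\phi). \tag{8.2}$$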

Note how the component masses m1, m2 only enter the amplitude through a particular combination called the chirp mass:

$$M_c = \frac{(m_1 m_2)^{3/5}}{(m_1 + m_2)^{1/5}}. \tag{8.3}$$
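As a quick illustration of (8.3), the chirp mass can be evaluated with a one-line function. This is only a sketch; the function name `chirp_mass` and the example masses are chosen here for illustration. Masses can be given in any unit, and the result is in the same unit.

```python
def chirp_mass(m1, m2):
    """Chirp mass of Eq. (8.3): (m1 m2)^(3/5) / (m1 + m2)^(1/5)."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2

# For equal masses m the expression reduces to m / 2^(1/5), i.e. about 0.87 m.
for m1, m2 in [(1.4, 1.4), (5.0, 5.0), (1.4, 5.0)]:
    print(f"m1 = {m1}, m2 = {m2} (solar masses): chirp mass = {chirp_mass(m1, m2):.3f}")
```

The amplitudes in (8.2) depend on the component masses only through this combination, so two binaries with different $(m_1, m_2)$ but equal chirp mass are indistinguishable at this order.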

The reason for the name will become clear later on. The frequency of the emission, $f_{\rm gw}$, is double the orbital frequency,

$$f_{\rm gw} = 2 f_{\rm orb} = 2\,\frac{\omega_{\rm orb}}{2\pi}. \tag{8.4}$$

This is because for a rigidly rotating system, in the center of mass frame the components of the quadrupole tensor return to the same value after only half a period. $f_{\rm gw}$ is normally called the gravitational wave frequency. This is apt if only quadrupole radiation is being studied, but in reality higher multipole moments will also come into play. These introduce harmonics with frequencies $n f_{\rm orb}$, $n = 1, 2, 3, \ldots$. However, the harmonic with $n = 2$ is the dominant contribution to the gravitational waveform, all others being sub-leading by at least one power of $v/c$.

The waveforms (8.2) can be written in an instructive form by associating a length scale with both the chirp mass and the gravitational wave frequency. With the chirp mass we can associate a characteristic radius,

$$R_c = \frac{2GM_c}{c^2}, \tag{8.5}$$

which would be the Schwarzschild radius of a black hole with mass $M_c$. The gravitational wave frequency can be converted into a wavelength,

$$\lambda = \frac{c}{f_{\rm gw}}. \tag{8.6}$$

The gravitational wave polarizations (8.2) then become

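$$h_+ = A\,\frac{1+\cos^2(\iota)}{2}\,\cos(2\pi f_{\rm gw} t_{\rm ret} + 2\phi),$$
$$h_\times = A\,\cos(\iota)\,\sin(2\pi f_{\rm gw} t_{\rm ret} + 2\phi), \tag{8.7}$$

with

$$A = \left(\frac{2\pi}{\sqrt{2}}\right)^{2/3}\frac{R_c}{r}\left(\frac{R_c}{\lambda}\right)^{2/3}. \tag{8.8}$$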

The more mass involved, the larger $R_c$ and the larger the amplitude. The wavelength of the gravitational radiation, $\lambda$, gives a sense of the size of the binary itself: the tighter the binary, the shorter $\lambda$ and the larger the amplitude. Also note that the radiation is not isotropic: more gets emitted perpendicularly to the orbital plane. If $\iota = 0$ then the overall amplitudes of the two polarizations in (8.7) are equal, and there is only a $\pi/2$ phase difference between the two. On the other hand, inside the orbital plane ($\iota = \pi/2$), the amplitude of $h_+$ is only half as big, and $h_\times$ is identically zero. Perpendicular to the plane, the radiation is an equal admixture of the "plus" and "cross" polarization; this is called circular polarization. By contrast, an observer watching the binary in the orbital plane only sees the components move back and forth on a line segment, and only one of the two polarizations is present, aligned with that segment: linear polarization. For $0 < \iota < \pi/2$ one has elliptical polarization.

One can also look at the way power is emitted in different directions. From Eq. (6.86), we see that the power (see footnote 41) emitted per unit solid angle is

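$$\frac{dP_{\rm gw}}{d\Omega} = \frac{r^2 c^3}{16\pi G}\,\langle \dot h_+^2 + \dot h_\times^2 \rangle. \tag{8.9}$$

Inserting our expressions (8.2) for compact binaries, and keeping in mind that $\langle \cos^2(\omega t + \varphi) \rangle = \langle \sin^2(\omega t + \varphi) \rangle = 1/2$, this becomes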

$$\frac{dP_{\rm gw}}{d\Omega} = \frac{2}{\pi}\,\frac{c^5}{G}\left(\frac{GM_c\,\pi f_{\rm gw}}{c^3}\right)^{10/3} g(\iota), \tag{8.10}$$

where

$$g(\iota) = \left(\frac{1+\cos^2(\iota)}{2}\right)^2 + \cos^2(\iota). \tag{8.11}$$
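The angular factor $g(\iota)$ of (8.11) is easy to explore numerically. The sketch below evaluates it along the orbital axis and in the orbital plane, and integrates it over the sphere with a simple midpoint rule in $\cos\iota$; the function names and the number of integration steps are choices made here for illustration.

```python
import math

def g(iota):
    """Angular dependence of the emitted power, Eq. (8.11)."""
    c2 = math.cos(iota) ** 2
    return ((1.0 + c2) / 2.0) ** 2 + c2

def integrate_over_sphere(func, n=2000):
    """Integrate func(iota) over the unit sphere: 2*pi * integral over cos(iota) in [-1, 1]."""
    dx = 2.0 / n
    total = 0.0
    for k in range(n):
        x = -1.0 + (k + 0.5) * dx          # midpoint in cos(iota)
        total += func(math.acos(x)) * dx
    return 2.0 * math.pi * total

ratio = g(0.0) / g(math.pi / 2.0)          # orbital axis versus orbital plane
sphere_integral = integrate_over_sphere(g)

print("g(0) / g(pi/2)          =", ratio)
print("integral of g over sphere =", sphere_integral)
print("16 pi / 5                 =", 16.0 * math.pi / 5.0)
```

Multiplying the sphere integral $16\pi/5$ by the prefactor $2/\pi$ of (8.10) gives the coefficient $32/5$ of the total radiated power.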

41 Not to be confused with the momentum $P^i$.

Hence eight times as much power is emitted perpendicular to the orbital plane as into the plane. Integrating over the sphere we find the total radiated power:

$$P_{\rm gw} = \int d\Omega\,\frac{dP_{\rm gw}}{d\Omega} = \frac{32}{5}\,\frac{c^5}{G}\left(\frac{GM_c\,\pi f_{\rm gw}}{c^3}\right)^{10/3}. \tag{8.12}$$

So far we have been assuming that the components of the binary are on fixed orbits. For non-relativistic systems, the total orbital energy is

$$E_{\rm orb} = E_{\rm kin} + E_{\rm pot} = -\frac{G m_1 m_2}{2R}. \tag{8.13}$$

Since gravitational waves carry away energy, $E_{\rm orb}$ should become more and more negative, which can only happen if $R$ decreases. In fact, from (8.1),

$$\dot R = -\frac{2}{3}\,R\,\frac{\dot\omega_{\rm orb}}{\omega_{\rm orb}} \tag{8.14}$$

$$\phantom{\dot R} = -\frac{2}{3}\,(R\,\omega_{\rm orb})\,\frac{\dot\omega_{\rm orb}}{\omega_{\rm orb}^2}. \tag{8.15}$$

Note that $\dot R$ is the radial velocity while $R\,\omega_{\rm orb}$ is the tangential velocity. The above tells us that if

$$\dot\omega_{\rm orb} \ll \omega_{\rm orb}^2, \tag{8.16}$$

then the radial motion is very small compared to the tangential motion; when this condition holds, the motion is said to be quasi-circular, and the binary is in the inspiral regime. In what follows we will assume that we are in this regime. Using (8.1), and keeping in mind (8.4), one can eliminate $R$ in the expression for the orbital energy in favor of the gravitational wave frequency $f_{\rm gw}$:

$$E_{\rm orb} = -\left(\frac{G^2 M_c^5\,\pi^2 f_{\rm gw}^2}{8}\right)^{1/3}. \tag{8.17}$$

The loss in orbital energy per unit of time can be equated to the total energy flux of the gravitational wave emission:

$$-\frac{dE_{\rm orb}}{dt} = P_{\rm gw}. \tag{8.18}$$

Using (8.17) and (8.12), this gives an expression for the time evolution of the frequency:

$$\dot f_{\rm gw} = \frac{96}{5}\,\pi^{8/3}\left(\frac{GM_c}{c^3}\right)^{5/3} f_{\rm gw}^{11/3}. \tag{8.19}$$
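Equation (8.19) can be integrated numerically to see how long a signal takes to sweep between two frequencies. The sketch below does this with Simpson's rule on $dt = df/\dot f$ and compares the result with the closed form $\tau(f) = \frac{5}{256}\,(GM_c/c^3)^{-5/3}(\pi f)^{-8/3}$, which follows from integrating (8.19) up to $f \to \infty$. The constant `T_SUN` (the value of $GM_\odot/c^3$ in seconds), the function names, and the example chirp mass and frequency band are assumptions made here for illustration.

```python
import math

T_SUN = 4.925491e-6  # G * M_sun / c^3 in seconds

def fdot(f, chirp_mass_msun):
    """Frequency derivative of Eq. (8.19), in Hz/s, for f in Hz."""
    tm = chirp_mass_msun * T_SUN
    return (96.0 / 5.0) * math.pi ** (8.0 / 3.0) * tm ** (5.0 / 3.0) * f ** (11.0 / 3.0)

def time_between(f_low, f_high, chirp_mass_msun, n=20000):
    """Time (s) to sweep from f_low to f_high: Simpson's rule on the integral of df / fdot."""
    if n % 2:
        n += 1
    h = (f_high - f_low) / n
    total = 1.0 / fdot(f_low, chirp_mass_msun) + 1.0 / fdot(f_high, chirp_mass_msun)
    for k in range(1, n):
        weight = 4.0 if k % 2 else 2.0
        total += weight / fdot(f_low + k * h, chirp_mass_msun)
    return total * h / 3.0

def time_to_coalescence(f, chirp_mass_msun):
    """Closed-form time (s) until the frequency formally diverges, starting from f."""
    tm = chirp_mass_msun * T_SUN
    return (5.0 / 256.0) * tm ** (-5.0 / 3.0) * (math.pi * f) ** (-8.0 / 3.0)

mc = 1.21                      # example chirp mass in solar masses
f_low, f_high = 20.0, 1566.0   # example frequency band in Hz
numeric = time_between(f_low, f_high, mc)
analytic = time_to_coalescence(f_low, mc) - time_to_coalescence(f_high, mc)
print(f"numerical integration: {numeric:.2f} s")
print(f"closed form:           {analytic:.2f} s")
```

Because $\dot f_{\rm gw} \propto f_{\rm gw}^{11/3}$, almost all of the time is spent near the low-frequency end of the band.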

This can be integrated to obtain fgw as a function of time:

$$f_{\rm gw}(\tau) = \frac{1}{\pi}\left(\frac{GM_c}{c^3}\right)^{-5/8}\left(\frac{5}{256}\,\frac{1}{\tau}\right)^{3/8}, \tag{8.20}$$

with $\tau = t_{\rm coal} - t$, where the coalescence time $t_{\rm coal}$ is the time at which $f_{\rm gw}(\tau)$ formally diverges. Note, however, that apart from the emission of gravitational radiation, we are treating the orbital motion in a Newtonian way. In a fully relativistic treatment, there would be an innermost stable circular orbit (ISCO). In the case where $m_1 \gg m_2$ (so that we may view $m_2$ as a test mass in a Schwarzschild geometry with mass $m_1 \simeq M$), the radius of such an orbit is

$$r_{\rm isco} = \frac{6GM}{c^2}. \tag{8.21}$$

Within the Schwarzschild spacetime, massive test particles cannot be on stable circular orbits with a radius smaller than $r_{\rm isco}$. Using (8.1) and (8.4), the gravitational wave frequency at ISCO is

$$f_{\rm gw,isco} = \frac{c^3}{6^{3/2}\,\pi G M}. \tag{8.22}$$

If $m_1$ and $m_2$ are comparable, then one point mass can no longer be considered a test particle in the Schwarzschild geometry of the other, but there will still be an ISCO frequency beyond which the waveforms (8.2) are no longer valid. As it turns out, also in the general case it is safe to use the expression (8.22) as a rule of thumb for the breakdown of the quasi-circular regime. After that, the particles will plunge towards each other and collide.

As the frequency increases, the separation between the point masses shrinks. From (8.14) and (8.20),

$$\frac{\dot R}{R} = -\frac{2}{3}\,\frac{\dot f_{\rm gw}}{f_{\rm gw}} = -\frac{1}{4\tau}, \tag{8.23}$$

where we recall that $\tau = t_{\rm coal} - t$. Integrating this, we obtain

$$R(\tau) = R_0\left(\frac{\tau}{\tau_0}\right)^{1/4} = R_0\left(\frac{t_{\rm coal} - t}{t_{\rm coal} - t_0}\right)^{1/4}, \tag{8.24}$$

with $R_0$ the value of $R$ at some initial time $t_0$, and $\tau_0 = t_{\rm coal} - t_0$.

Note that in the derivation of (8.2), $f_{\rm gw}$ and $R$ were considered constant. If the orbit evolves then the motion will be described by

$$\mathbf{x}_1(t) = \frac{\mu}{m_1}\,R(t)\,\hat{\mathbf{e}}(t), \qquad \mathbf{x}_2(t) = -\frac{\mu}{m_2}\,R(t)\,\hat{\mathbf{e}}(t), \tag{8.25}$$

where $R(t)$ is as in (8.24), and

$$\hat{\mathbf{e}}(t) = \big(\cos(\Phi(t)/2),\ \cos(\iota)\sin(\Phi(t)/2),\ \sin(\iota)\sin(\Phi(t)/2)\big). \tag{8.26}$$

We have defined

$$\Phi(t) = 2\int^{t} dt'\,\omega_{\rm orb}(t') = \int^{t} dt'\,\omega_{\rm gw}(t'), \tag{8.27}$$

with $\omega_{\rm gw}(t) = 2\pi f_{\rm gw}(t)$. When using the quadrupole expressions (6.122) to compute $h_+$, $h_\times$, in the arguments of the trigonometric functions we must now replace $\omega_{\rm gw} t$ by $\Phi(t)$. When computing $\ddot M_{11}$, $\ddot M_{22}$, and $\ddot M_{12}$, time derivatives of $R(t)$ and $\omega_{\rm gw}(t)$ will now also appear. However, in the quasi-circular approximation, $\dot R(t)$ can be neglected. The orbital (and hence gravitational wave) frequency will then also not change significantly over a single orbit, and we can also neglect $\dot\omega_{\rm gw}$. This means that in the expressions (8.2), in the arguments of the cosine and sine we may replace $2\pi f_{\rm gw} t$ by $\Phi(t)$, and in the amplitudes the constant $f_{\rm gw}$ can be replaced by $f_{\rm gw}(t)$, with time dependence as in (8.20). All these are to be evaluated at the retarded time $t_{\rm ret}$. The inspiral gravitational waveforms are then

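$$h_+(t) = \frac{4}{r}\left(\frac{GM_c}{c^2}\right)^{5/3}\left(\frac{\pi f_{\rm gw}(t_{\rm ret})}{c}\right)^{2/3}\frac{1+\cos^2(\iota)}{2}\,\cos(\Phi(t_{\rm ret})),$$
$$h_\times(t) = \frac{4}{r}\left(\frac{GM_c}{c^2}\right)^{5/3}\left(\frac{\pi f_{\rm gw}(t_{\rm ret})}{c}\right)^{2/3}\cos(\iota)\,\sin(\Phi(t_{\rm ret})). \tag{8.28}$$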

Using (8.20) in (8.27), one has

$$\Phi(t) = -2\left(\frac{5GM_c}{c^3}\right)^{-5/8}\tau^{5/8}(t) + \Phi_c, \tag{8.29}$$

where the integration constant $\Phi_c$ is the phase at coalescence, and we recall that $\tau = t_{\rm coal} - t$.

Let us take stock. Gravitational wave emission causes a binary system to lose orbital energy, leading the orbits to shrink. We have derived the gravitational waveforms in the quadrupole approximation, and under the assumption that the system behaves in a Newtonian way, consistent with the assumption of small $v/c$ underlying the quadrupole treatment. In the quasi-circular inspiral regime, where $\dot\omega_{\rm gw} \ll \omega_{\rm gw}^2$, both the waveform amplitudes and the frequency increase monotonically: the binary "chirps". How fast the amplitudes and frequency evolve is determined by a particular combination of the component masses, the chirp mass $M_c$. Eventually our description must break down, because general relativity predicts an innermost stable circular orbit (ISCO). The components of the binary then plunge towards each other and merge.

What kind of objects might be suitable for observation of their gravitational wave signal with existing detectors? The large ground-based interferometric detectors have a frequency range between approximately 20 Hz and 1 kHz, with a strain sensitivity $h \sim 10^{-22}$. From the exercises above we already know that Earth-based experiments involving masses being rotated at technologically feasible speeds will not generate detectable gravitational waves. The motion of planets in our solar system also doesn't generate sufficiently strong gravitational radiation, and in any case the frequency is far too low; for the Earth-Sun system, $f_{\rm gw} = 2/(365 \times 24 \times 3600\ {\rm s}) \simeq 6.3 \times 10^{-8}$ Hz. Hence we need to look for sources in space. Most stars are in fact members of a binary system, so can we detect their gravitational radiation signature?
Let's consider a binary consisting of two ordinary stars with $m_1 = m_2 = 1\,M_\odot$. When the signal from such a binary enters the band, i.e., when $f_{\rm gw} = 20$ Hz, Kepler's law (8.1) tells us that the separation is only about 400 km. Ordinary stars have radii of the order of $10^6$ km. In other words, by the time their gravitational wave frequency would enter the detector band, normal stars would already have merged completely and could no longer be considered separate objects! Clearly the stars would need to be far more compact.

Consider white dwarfs, the embers of normal stars that have spent their nuclear fuel and are kept from collapsing only by electron degeneracy. The maximum mass of a white dwarf is $1.4\,M_\odot$ (the Chandrasekhar limit), and a white dwarf binary at $f_{\rm gw} = 20$ Hz would have a separation of just over 450 km. But a typical white dwarf radius is about $10^4$ km, which doesn't solve the problem (see footnote 42).

Even more compact are neutron stars. These are the result of a very massive star having undergone a supernova. If the density of the remnant object in the center is large enough that electron degeneracy is overcome, electrons and protons are forced to combine into neutrons. Neutron stars also have a typical mass of $1.4\,M_\odot$ but a radius of only 10 km. Binary neutron stars would still be sufficiently well-separated when their gravitational wave signal enters the band of a ground-based detector. At the innermost stable circular orbit, one has $f_{\rm gw} = 1566$ Hz, corresponding by (8.21) to a separation of about 25 km. Hence even then they will still (just) be separate objects, and one can show that the point particle approximation is still an acceptable one.

Finally there are black holes, with typical masses of $\sim 5\,M_\odot$. Two such black holes in a binary would then have $M = 10\,M_\odot$, corresponding to a separation of about 700 km when $f_{\rm gw} = 20$ Hz. Each has a Schwarzschild radius of $2Gm_i/c^2 \simeq 14.8$ km, $i = 1, 2$. At ISCO, $f_{\rm gw,isco} = 438$ Hz, implying by (8.21) a separation of about 90 km.

Clearly, at least in terms of frequency and size, binary neutron stars and black holes (and also mixed systems consisting of a neutron star and a black hole) are potential sources for ground-based detectors. Although several binary neutron stars have been discovered in our galaxy, they are hundreds of millions of years away from merger. Neutron star or black hole binaries that are tight enough that merger is imminent (i.e., such that $f_{\rm gw} \gtrsim 20$ Hz) are thought to be exceedingly rare.
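The size estimates in this discussion all follow from Kepler's law (8.1) with $\omega_{\rm orb} = \pi f_{\rm gw}$, together with the ISCO expressions (8.21) and (8.22). The sketch below evaluates them for the total masses considered above; the rounded values of $G$, $c$ and $M_\odot$ and the function names are assumptions made here for illustration.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg

def separation_at(f_gw, total_mass_msun):
    """Orbital separation in metres from Kepler's law (8.1), using omega_orb = pi * f_gw."""
    omega_orb = math.pi * f_gw
    return (G * total_mass_msun * M_SUN / omega_orb ** 2) ** (1.0 / 3.0)

def isco_radius(total_mass_msun):
    """ISCO radius 6GM/c^2 of Eq. (8.21), in metres."""
    return 6.0 * G * total_mass_msun * M_SUN / C ** 2

def isco_frequency(total_mass_msun):
    """Gravitational wave frequency at ISCO, Eq. (8.22), in Hz."""
    return C ** 3 / (6.0 ** 1.5 * math.pi * G * total_mass_msun * M_SUN)

examples = {
    "two ordinary stars (M = 2)": 2.0,
    "white dwarfs / neutron stars (M = 2.8)": 2.8,
    "black holes (M = 10)": 10.0,
}
for label, mass in examples.items():
    print(f"{label}: separation at 20 Hz = {separation_at(20.0, mass) / 1e3:.0f} km, "
          f"f_isco = {isco_frequency(mass):.0f} Hz, "
          f"r_isco = {isco_radius(mass) / 1e3:.1f} km")
```

By construction, `separation_at(isco_frequency(M), M)` returns `isco_radius(M)`, since (8.22) was obtained by inserting (8.21) into (8.1).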
The expression (8.20) for $f_{\rm gw}$ as a function of time can be inverted; in terms of observationally interesting quantities,

$$\tau = 2.18\ {\rm s}\,\left(\frac{1.21\,M_\odot}{M_c}\right)^{5/3}\left(\frac{100\ {\rm Hz}}{f_{\rm gw}}\right)^{8/3}. \tag{8.30}$$

Here $1.21\,M_\odot$ is the chirp mass of a binary with $m_1 = m_2 = 1.4\,M_\odot$ (two neutron stars). At $f_{\rm gw} = 20$ Hz, this gives $\tau = 160$ s while at $f_{\rm gw,isco} = 1566$ Hz, $\tau = 0.0014$ s; hence the inspiral signal will be in band for just under 3 minutes. In the case of a $(5, 5)\,M_\odot$ binary black hole, $M_c \simeq 4.35\,M_\odot$, and the signal will remain in band for 19 s. The most massive systems a ground-based detector can see with the given lower cut-off frequency have $f_{\rm gw,isco} \simeq 20$ Hz; by (8.22) this corresponds to $M \simeq 200\,M_\odot$. Another interesting quantity is the number of waveform cycles within the detector's band:

$$N_{\rm cyc} = \int_{t_{\rm min}}^{t_{\rm max}} f_{\rm gw}\,dt = \int_{f_{\rm min}}^{f_{\rm max}} \frac{f_{\rm gw}}{\dot f_{\rm gw}}\,df_{\rm gw}. \tag{8.31}$$
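The frequency-domain integral in (8.31) can be evaluated directly once $\dot f_{\rm gw}$ is taken from (8.19). The following sketch does so with Simpson's rule and compares against the result of carrying out the integral analytically, $\frac{1}{32\pi^{8/3}}(GM_c/c^3)^{-5/3}\,(f_{\rm min}^{-5/3} - f_{\rm max}^{-5/3})$. The constant `T_SUN` ($GM_\odot/c^3$ in seconds), the function names, and the example band of 20 Hz to 1 kHz are assumptions made here for illustration.

```python
import math

T_SUN = 4.925491e-6  # G * M_sun / c^3 in seconds

def fdot(f, chirp_mass_msun):
    """Frequency derivative of Eq. (8.19), in Hz/s."""
    tm = chirp_mass_msun * T_SUN
    return (96.0 / 5.0) * math.pi ** (8.0 / 3.0) * tm ** (5.0 / 3.0) * f ** (11.0 / 3.0)

def cycles_numeric(f_min, f_max, chirp_mass_msun, n=20000):
    """Number of cycles from Eq. (8.31): Simpson's rule on the integral of f / fdot."""
    if n % 2:
        n += 1
    h = (f_max - f_min) / n

    def integrand(f):
        return f / fdot(f, chirp_mass_msun)

    total = integrand(f_min) + integrand(f_max)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(f_min + k * h)
    return total * h / 3.0

def cycles_closed_form(f_min, f_max, chirp_mass_msun):
    """Analytic value of the same integral."""
    tm = chirp_mass_msun * T_SUN
    prefactor = tm ** (-5.0 / 3.0) / (32.0 * math.pi ** (8.0 / 3.0))
    return prefactor * (f_min ** (-5.0 / 3.0) - f_max ** (-5.0 / 3.0))

n_numeric = cycles_numeric(20.0, 1000.0, 1.2)
n_closed = cycles_closed_form(20.0, 1000.0, 1.2)
print(f"cycles in band (numerical):   {n_numeric:.0f}")
print(f"cycles in band (closed form): {n_closed:.0f}")
```

The result is dominated by the lower cut-off: changing the upper frequency limit has very little effect, whereas lowering $f_{\rm min}$ increases the number of cycles as $f_{\rm min}^{-5/3}$.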

42 However, binary white dwarfs will be sources for the space-based detector LISA. In that case the sensitivity band starts at $10^{-4}$ Hz, and the corresponding separation is about $1.5 \times 10^6$ km, which is large enough. In fact, LISA will pick up so many signals from binary white dwarfs in our galaxy that they will cause "confusion noise".

Expressing $\dot f_{\rm gw}$ in terms of $f_{\rm gw}$ itself using Eq. (8.19) and integrating, we get

$$N_{\rm cyc} = \frac{1}{32\pi^{8/3}}\left(\frac{GM_c}{c^3}\right)^{-5/3}\left(f_{\rm min}^{-5/3} - f_{\rm max}^{-5/3}\right) \simeq 5 \times 10^3 \left(\frac{20\ {\rm Hz}}{f_{\rm min}}\right)^{5/3}\left(\frac{1.2\,M_\odot}{M_c}\right)^{5/3}. \tag{8.32}$$

Thus, a signal will have hundreds to thousands of cycles in band. This makes it imperative to have a good understanding of the waveform, and especially its phasing, so that the parameters characterizing the sources can be extracted with great accuracy. The expression (8.29) is merely the leading-order contribution in an appropriate expansion parameter, e.g., $\Theta \equiv (5GM_c/(c^3\tau))^{1/8}$. Such an expansion is currently known to 7th order in $\Theta$.

Next we turn to detection and parameter estimation. Given an interferometer with beam pattern functions $F_+$ and $F_\times$, and the signal's polarizations $h_+$ and $h_\times$, the measured strain will be

$$h(t) = F_+(\theta, \phi, \psi)\,h_+(t) + F_\times(\theta, \phi, \psi)\,h_\times(t). \tag{8.33}$$

The sky position $(\theta, \phi)$ enters the strain through the beam pattern functions; the same goes for the polarization angle $\psi$, which gives the orientation of the long axis of the orbit as seen on the sky. The other parameters in the problem enter through the polarizations; these are

$$\{m_1, m_2, \iota, t_c, \Phi_c, r\}. \tag{8.34}$$

Here $(m_1, m_2)$ are the component masses; $\iota$ gives the inclination of the orbital plane with respect to the line of sight; $t_c$ is the time of coalescence, $\Phi_c$ the phase at coalescence, and $r$ the distance. Inserting (8.28), it is convenient to rewrite (8.33) as

$$h(t) = \frac{2}{r}\left(\frac{GM_c}{c^2}\right)^{5/3}\left(\frac{\pi f_{\rm gw}(t_{\rm ret})}{c}\right)^{2/3}\sqrt{F_+^2\,(1+\cos^2(\iota))^2 + F_\times^2\,4\cos^2(\iota)}\ \cos(\Phi(t_{\rm ret}) + \varphi_0), \tag{8.35}$$

with

$$\varphi_0 = \arctan\left(-\frac{F_\times\,2\cos(\iota)}{F_+\,(1+\cos^2(\iota))}\right). \tag{8.36}$$

The calculation of the signal-to-noise ratio, Eq. (7.35), and the Fisher matrix, Eq. (7.51), both require the strain in the Fourier domain. The strain is time-dependent through the time dependence of the polarizations $h_+(t)$ and $h_\times(t)$; hence we will need the Fourier transforms of these. Although one can use numerical methods, such as the Fast Fourier Transform (FFT), it is instructive to have some analytical insight. An analytic expression can be obtained through the stationary phase approximation, which is valid if for a waveform's amplitude $A(t)$ and phase $\phi(t)$, one has $|(1/A)\,dA/dt| \ll |d\phi/dt|$ and $|d^2\phi/dt^2| \ll (d\phi/dt)^2$. We will not go into details here but will just state that the Fourier transform of the strain in the stationary phase approximation is given by

$$\tilde h(f) = -\sqrt{F_+^2\,(1+\cos^2(\iota))^2 + F_\times^2\,4\cos^2(\iota)}\ \sqrt{\frac{5\pi}{96}}\ \frac{c}{r}\left(\frac{GM_c}{c^3}\right)^{5/6}(\pi f)^{-7/6}\ \exp\left[i\left(2\pi f t_c + \Xi(f) - \varphi_0 - \frac{\pi}{4}\right)\right], \tag{8.37}$$

where, at leading order,

$$\Xi(f) = -\Phi_c + \frac{3}{4}\left(\frac{8\pi G M_c f}{c^3}\right)^{-5/3}. \tag{8.38}$$
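The step that combines the two polarizations into a single cosine, (8.35) with the phase offset (8.36), is the usual identity $a\cos\Phi + b\sin\Phi = \sqrt{a^2+b^2}\,\cos(\Phi + \varphi_0)$ with $\tan\varphi_0 = -b/a$. The sketch below checks this for the angular factors only, leaving out the overall time-dependent amplitude of (8.28). It uses `math.atan2` instead of a plain arctangent so that the quadrant of $\varphi_0$ is also correct when $F_+ < 0$; that choice, the function names, and the sample values are assumptions made here for illustration.

```python
import math

def combine(f_plus, f_cross, iota):
    """Return (amplitude factor, phi0) such that
    F+ (1+cos^2 i)/2 cos(Phi) + Fx cos(i) sin(Phi) = amplitude * cos(Phi + phi0)."""
    a = f_plus * (1.0 + math.cos(iota) ** 2) / 2.0
    b = f_cross * math.cos(iota)
    # hypot(a, b) equals (1/2) sqrt(F+^2 (1+cos^2 i)^2 + 4 Fx^2 cos^2 i).
    amplitude = math.hypot(a, b)
    # atan2 picks the branch of arctan(-b/a) for which the identity holds for any sign of a.
    phi0 = math.atan2(-b, a)
    return amplitude, phi0

def strain_direct(f_plus, f_cross, iota, phase):
    """Angular part of Eq. (8.33) with the polarizations of Eq. (8.28)."""
    return (f_plus * (1.0 + math.cos(iota) ** 2) / 2.0 * math.cos(phase)
            + f_cross * math.cos(iota) * math.sin(phase))

def strain_combined(f_plus, f_cross, iota, phase):
    """Same quantity written as a single cosine, as in Eq. (8.35)."""
    amplitude, phi0 = combine(f_plus, f_cross, iota)
    return amplitude * math.cos(phase + phi0)

for f_plus, f_cross, iota in [(0.7, 0.3, 0.4), (-0.5, 0.8, 1.1), (0.2, -0.9, 2.5)]:
    for phase in (0.0, 1.0, 2.5, 4.0):
        direct = strain_direct(f_plus, f_cross, iota, phase)
        combined = strain_combined(f_plus, f_cross, iota, phase)
        print(f"F+={f_plus:+.1f} Fx={f_cross:+.1f} iota={iota} Phi={phase}: "
              f"{direct:+.6f} vs {combined:+.6f}")
```

The factor 1/2 contained in `amplitude` is the same factor 1/2 that relates the prefactor $\sqrt{5\pi/96}$ of (8.37) to the more commonly quoted $\sqrt{5\pi/24}$.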

In this approximation, the optimal SNR (7.35), which we will denote by $\rho$, is given by

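$$\rho = \left[\left(F_+^2\,(1+\cos^2(\iota))^2 + F_\times^2\,4\cos^2(\iota)\right)\frac{5\pi}{24}\,\pi^{-7/3}\,\frac{c^2}{r^2}\left(\frac{GM_c}{c^3}\right)^{5/3}\int_{f_{\rm min}}^{f_{\rm max}}\frac{f^{-7/3}}{S_n(f)}\,df\right]^{1/2}, \tag{8.39}$$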

where [fmin, fmax] is the detector’s sensitivity band and Sn(f) its power spectral density. From the above expression as well as the beam pattern functions (7.19) for an L-shaped interferometer, one sees that for given component masses and distance, the SNR is largest when ι = 0, i.e., when the system is seen face-on, and either θ = 0 or θ = π, i.e., when the direction to the binary is perpendicular to the detector plane. In that case

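$$\rho_{\rm opt} = \left[\frac{5\pi}{6}\,\pi^{-7/3}\,\frac{c^2}{r^2}\left(\frac{GM_c}{c^3}\right)^{5/3}\int_{f_{\rm min}}^{f_{\rm max}}\frac{f^{-7/3}}{S_n(f)}\,df\right]^{1/2}. \tag{8.40}$$

One can also take the average over both sky position and orientation of the binary. A brief calculation shows that

$$\left\langle F_+^2\left(\frac{1+\cos^2(\iota)}{2}\right)^2 + F_\times^2\cos^2(\iota)\right\rangle = \frac{4}{25}, \tag{8.41}$$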

hence the SNR averaged over all angles is

$$\rho_{\rm ave} = \frac{2}{5}\,\rho_{\rm opt}. \tag{8.42}$$

Demanding some minimum SNR $\rho_0$, one can now ask to what distance a binary with given masses can be seen. A good indicator is the angle-averaged horizon distance, obtained by solving for $r$ in the expression for the angle-averaged SNR. This leads to

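$$d_{\rm hor} = \frac{2}{5}\left[\frac{5\pi}{6}\,\pi^{-7/3}\,c^2\left(\frac{GM_c}{c^3}\right)^{5/3}\int_{f_{\rm min}}^{f_{\rm max}}\frac{f^{-7/3}}{S_n(f)}\,df\right]^{1/2}\rho_0^{-1}. \tag{8.43}$$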
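The angular average (8.41), and with it the factor 2/5 in (8.42) and (8.43), can be verified by brute-force integration. The sketch below assumes the standard beam pattern functions of an L-shaped interferometer, $F_+ = \tfrac12(1+\cos^2\theta)\cos 2\phi\cos 2\psi - \cos\theta\sin 2\phi\sin 2\psi$ and $F_\times = \tfrac12(1+\cos^2\theta)\cos 2\phi\sin 2\psi + \cos\theta\sin 2\phi\cos 2\psi$, which are taken here to coincide with Eq. (7.19). It averages uniformly over $\cos\theta$, $\phi$, $\psi$ and $\cos\iota$ with a midpoint rule; the grid sizes and function names are illustrative choices.

```python
import math

def beam_patterns(cos_theta, phi, psi):
    """Standard L-shaped interferometer beam patterns (assumed to match Eq. (7.19))."""
    a = 0.5 * (1.0 + cos_theta ** 2) * math.cos(2.0 * phi)
    b = cos_theta * math.sin(2.0 * phi)
    f_plus = a * math.cos(2.0 * psi) - b * math.sin(2.0 * psi)
    f_cross = a * math.sin(2.0 * psi) + b * math.cos(2.0 * psi)
    return f_plus, f_cross

def midpoints(lo, hi, n):
    """Midpoints of n equal sub-intervals of [lo, hi]."""
    step = (hi - lo) / n
    return [lo + (k + 0.5) * step for k in range(n)]

def angle_average(n_cos=200, n_ang=16):
    """Average of F+^2 ((1+cos^2 i)/2)^2 + Fx^2 cos^2 i over sky position, psi and inclination."""
    # The inclination is independent of the sky angles, so average over it first.
    cos_iotas = midpoints(-1.0, 1.0, n_cos)
    mean_plus = sum(((1.0 + x * x) / 2.0) ** 2 for x in cos_iotas) / n_cos
    mean_cross = sum(x * x for x in cos_iotas) / n_cos

    total = 0.0
    count = 0
    for cos_theta in midpoints(-1.0, 1.0, n_cos):
        for phi in midpoints(0.0, 2.0 * math.pi, n_ang):
            for psi in midpoints(0.0, math.pi, n_ang):
                f_plus, f_cross = beam_patterns(cos_theta, phi, psi)
                total += f_plus ** 2 * mean_plus + f_cross ** 2 * mean_cross
                count += 1
    return total / count

average = angle_average()
print("angle average        =", average)
print("4/25                 =", 4.0 / 25.0)
print("rho_ave / rho_opt    =", math.sqrt(average))
```

The same quantity equals 1 for the optimal configuration ($\iota = 0$, $\theta = 0$), which is why the ratio of the averaged to the optimal SNR is $\sqrt{4/25} = 2/5$.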

The left panel of Fig. 27 shows $d_{\rm hor}$ for initial LIGO and Virgo at design sensitivity, and for Advanced LIGO, as a function of total mass for equal mass binaries, for a minimum SNR of 5. Binary neutron stars with $M_{\rm tot} = 2.8\,M_\odot$ can be seen in initial detectors out to $\sim 20$ Mpc, and to $\sim 300$ Mpc in Advanced LIGO. For binary black holes with $M_{\rm tot} = 10\,M_\odot$, initial detectors are sensitive up to $\sim 50$ Mpc while Advanced LIGO can see them out to about a Gpc.

One might think that the detection rate will scale roughly with the cube of the horizon distance, but this is not the case. Fig. 28 illustrates the reason why: binary coalescence events tend to take place in galaxies, which are not at all uniformly distributed over volume. In our immediate intergalactic vicinity, the richest clusters are the Virgo Cluster ($\sim 20$ Mpc) and the Coma Cluster ($\sim 100$ Mpc). Astrophysicists try to model how frequently close binary systems composed of neutron stars and/or black holes form; combined with the projected horizon distances as in Fig. 27, this gives predictions for detection rates. Table I gives the likely range of detection rates for the initial (currently operational) LIGO/Virgo network, and for the network of Advanced detectors. The uncertainties are very large due to the difficulties in astrophysical modeling, but in the era of advanced detectors (around 2015), regular detections are to be expected. Much effort is currently being put into the construction of the instrumentation, and preparations for extracting science from the first detections.

Figure 27: Angle-averaged horizon distances for initial Virgo (green dashed line), initial LIGO (red dash-dotted line) and Advanced LIGO (solid black line), as a function of total mass, for binaries with equal component masses. The assumed SNR threshold is $\rho_0 = 5$.

Network     Source    N_low (yr^-1)    N_re (yr^-1)    N_high (yr^-1)
Initial     NS-NS     2 × 10^-4        0.02            0.2
Initial     NS-BH     7 × 10^-5        0.0004          0.1
Initial     BH-BH     2 × 10^-4        0.007           0.5
Advanced    NS-NS     0.4              40              400
Advanced    NS-BH     0.2              10              300
Advanced    BH-BH     0.4              20              1000

Table I: Detection rates for the initial (i.e., currently operational) network of ground-based detectors, and for the advanced detectors which will become operational in 2015. There are huge uncertainties due to the difficulties in modeling the formation of binary systems; here we show results from models with low predicted rates (N_low), realistic rates (N_re), and high rates (N_high). The sources are binary neutron stars (NS-NS), binaries consisting of a neutron star and a black hole (NS-BH), and binary black holes (BH-BH). Regular detections are extremely likely once the advanced detectors become available.

The above considerations are for ground-based detectors only. Around 2020, the space-based LISA will be launched. Being a far larger instrument (with the three probes at 5 million km from each other), LISA will be sensitive to gravitational waves with much larger wavelengths, and hence much lower frequencies: between $10^{-4}$ Hz and 0.1 Hz. The sources LISA will have access to will be supermassive black holes such as the ones that lurk in the centers of galaxies. When two galaxies merge, the supermassive black holes will tend to sink to the center of the new galaxy that is formed, and they might then form a binary system. Such systems are already being observed with conventional telescopes; Fig. 29 shows a binary supermassive black hole in the galaxy NGC 6240. To see why LISA is sensitive to these kinds of systems, one can write the expression for the gravitational wave frequency at

Figure 28: The distance reach of initial versus Advanced LIGO for binary neutron stars. The white dots are galaxies; note the strong clustering.

ISCO as

$$f_{\rm gw,isco} = \frac{c^3}{6^{3/2}\,\pi G M} \simeq 4.4 \times 10^{-3}\left(\frac{10^6\,M_\odot}{M}\right)\ {\rm Hz}. \tag{8.44}$$

Given the sensitivity range $f \in [10^{-4}, 0.1]$ Hz, we see that systems will be visible for total masses roughly between $4 \times 10^4\,M_\odot$ and $4 \times 10^7\,M_\odot$. They will also be in the frequency band for a very long time. Consider an equal-mass system with $M = 10^6\,M_\odot$, so that $f_{\rm gw,isco}$ is $4.4 \times 10^{-3}$ Hz. Using the expression (8.30) for the time to coalescence as a function of frequency, at $f_{\rm gw} = 10^{-4}$ Hz one has $\tau = 1.2 \times 10^7$ s, while at $f_{\rm gw,isco}$, $\tau \simeq 500$ s. Hence the signal is in band for four and a half months!

Depending on their component masses, sky position, and orientation, supermassive binaries up to redshifts of $z = 1$ can give signal-to-noise ratios in the thousands, and LISA will be able to see inspirals up to $z > 15$, when black holes were in the process of being formed

Figure 29: A picture of what is almost certainly a binary supermassive black hole in the center of the galaxy NGC 6240, made with the Chandra X-ray satellite. The luminous spots are accretion disks around the black holes. This particular system is still very far from merger, but other, much tighter supermassive binaries are primary sources for LISA.

for the first time. In other words, LISA will see all eligible sources in its past light cone! Although the expected detection rate is again dependent on astrophysical modeling, there is broad agreement that there should be between 20 and 80 detections per year, and LISA may be operational for several years. Due to the large signal-to-noise ratios, LISA will be a prime instrument for studying the dynamics of gravity in the strong field regime and for looking for possible deviations from GR. Such information can be gleaned not only from the inspiral signal, but also from the merger signal itself, and the "ringing" of the single black hole that results from it.

Other interesting sources for LISA are Extreme Mass Ratio Inspirals (EMRIs), where a smaller black hole spirals into a very massive one, possibly on very eccentric orbits exhibiting huge periapsis precession. By studying the gravitational wave signal, the orbits can be traced in great detail, which in turn will provide in-depth information about the geometry of spacetime in the immediate vicinity of the more massive black hole; this is what is illustrated in Fig. 30. This will provide accurate tests of the Black Hole Uniqueness Theorem, which states that black hole geometries are determined only by the mass $M$ and the rotational angular momentum $J$ (see footnote 43).

B. Gravitational waves from spinning neutron stars

Another interesting source of gravitational radiation is a spinning, isolated neutron star (see footnote 44). Neutron stars are axisymmetric to a high degree; in particular they are oblate due to their

43 Technically also by the electric charge $Q$, but astrophysical black holes will be electrically neutral.
44 Or a neutron star which is part of a binary but still very far from merger, so that the inspiral signal is unmeasurable.

Figure 30: Extreme mass ratio inspirals consisting of a smaller black hole spiraling into a bigger one on an eccentric orbit are also sources for LISA. They will allow us to check the Black Hole Uniqueness Theorem.

rotation. Although they are mostly composed of neutrons in a superfluid state, they do have a crust which is composed of dense but ordinary matter. They also have a magnetic field, which is usually misaligned with the axis of rotation. This can be viewed as a precessing dipole, which emits electromagnetic radiation. The radiation carries away rotational kinetic energy and causes the star to spin down. With decreasing centrifugal force, the fluid inside gradually becomes less and less oblate, but sometimes the crust is slow in following. When the crust finally catches up and corrects its shape, it cracks, and material gets redistributed over the surface, causing the formation of "mountains". Due to the extreme surface gravity, no tall structures can sustain themselves, and the mountains are expected to be at most 0.1 mm high. On the other hand, because of the high rotational velocities, even minor deviations from axisymmetry cause a time-dependent quadrupole moment that leads to copious amounts of gravitational radiation.

For our purposes, it will suffice to model neutron stars as rigid bodies (see footnote 45). Such bodies can be characterized by their inertia tensor,

$$I^{ij} = \int d^3x\,\rho(\mathbf{x})\,(r^2\delta^{ij} - x^i x^j), \tag{8.45}$$

with ρ(x) the mass density. Note that this is a symmetric real matrix; hence there exists an orthogonal frame in which it is diagonal. The associated axes are called the principal axes of the body, and the diagonal elements I1, I2, I3 are the principal moments of inertia. The frame which diagonalizes Iij is called the “body frame”. Denoting the coordinates in the

45 For a treatment in classical mechanics, see, e.g., Landau and Lifshitz, Vol. I, 1976.

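body frame by $x'_i$, one has
$$ \begin{aligned} I_1 &= \int d^3x'\, \rho(\mathbf{x}') \left( x_2'^2 + x_3'^2 \right), \\ I_2 &= \int d^3x'\, \rho(\mathbf{x}') \left( x_3'^2 + x_1'^2 \right), \\ I_3 &= \int d^3x'\, \rho(\mathbf{x}') \left( x_1'^2 + x_2'^2 \right). \end{aligned} \tag{8.46} $$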

A simple example is that of an ellipsoid with uniform density ρ, total mass M, and semi-axes a, b, and c. In that case
$$ I_1 = \frac{M}{5}(b^2 + c^2), \qquad I_2 = \frac{M}{5}(c^2 + a^2), \qquad I_3 = \frac{M}{5}(a^2 + b^2). \tag{8.47} $$
If a body rotates with angular velocity ω (which need not be aligned with any of the principal axes of inertia), the angular momentum is given by

$$ J_i = I_{ij}\,\omega_j. \tag{8.48} $$
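As a quick illustration of (8.47) and (8.48), the following sketch evaluates the principal moments of a uniform ellipsoid and checks when J is parallel to ω. The function names and the numerical values (unit mass, semi-axes 1, 1, 0.5) are illustrative assumptions, not taken from the text; the computation is done in the frame where the inertia tensor is diagonal.

```python
import math

def ellipsoid_principal_moments(mass, a, b, c):
    """Principal moments of a uniform ellipsoid with semi-axes a, b, c, Eq. (8.47)."""
    return (mass / 5 * (b**2 + c**2),
            mass / 5 * (c**2 + a**2),
            mass / 5 * (a**2 + b**2))

def angular_momentum(moments, omega):
    """J_i = I_ij omega_j, Eq. (8.48), in the frame where I_ij is diagonal."""
    return tuple(I * w for I, w in zip(moments, omega))

def angle_between_deg(u, v):
    """Angle between two 3-vectors, in degrees."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))

# Hypothetical oblate body: a = b > c, so I3 > I1 = I2.
moments = ellipsoid_principal_moments(mass=1.0, a=1.0, b=1.0, c=0.5)

omega_tilted = (1.0, 0.0, 1.0)   # not along a principal axis
omega_axis = (0.0, 0.0, 1.0)     # along the third principal axis

angle_tilted = angle_between_deg(angular_momentum(moments, omega_tilted), omega_tilted)
angle_axis = angle_between_deg(angular_momentum(moments, omega_axis), omega_axis)

print(f"tilted rotation: angle(J, omega) = {angle_tilted:.2f} deg")
print(f"principal-axis rotation: angle(J, omega) = {angle_axis:.2f} deg")
```

For the tilted rotation the two vectors are misaligned by several degrees, while for rotation about a principal axis the angle vanishes.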

The components of ω and J in the body frame are denoted by $(\omega'_1, \omega'_2, \omega'_3)$ and $(J'_1, J'_2, J'_3)$, respectively. One has $J'_1 = I_1\omega'_1$, $J'_2 = I_2\omega'_2$, $J'_3 = I_3\omega'_3$. Hence, in general the direction of J is different from that of ω, except if $I_1 = I_2 = I_3$, or the body is rotating about one of the principal axes of inertia, e.g., if $\omega'_1 = \omega'_2 = 0$. The rotational kinetic energy is
$$ E_{\rm rot} = \frac{1}{2} I_{ij}\,\omega_i\,\omega_j, \tag{8.49} $$
which in the body frame reduces to
$$ E_{\rm rot} = \frac{1}{2}\left[ I_1 (\omega'_1)^2 + I_2 (\omega'_2)^2 + I_3 (\omega'_3)^2 \right]. \tag{8.50} $$
If $\hat{\omega}$ is the unit vector in the direction of ω, so that $\boldsymbol{\omega} = \omega\,\hat{\omega}$, then
$$ E_{\rm rot} = \frac{1}{2} I \omega^2, \tag{8.51} $$
where
$$ I = I_{ij}\,\hat{\omega}_i\,\hat{\omega}_j \tag{8.52} $$
is the moment of inertia about the axis of rotation.
Here we will assume that the rotation is about a principal axis⁴⁶, which we will take to be the z'-axis so that $\omega'_1 = \omega'_2 = 0$. Neutron stars are oblate, with $I_3 > I_1 \simeq I_2$. The small

46 Starquakes can cause a misalignment, leading to precession of the crust with respect to the superfluid bulk. Calculations show that “pinning” of superfluid vortices to the crust will quickly damp the precession, after which the star will once again rotate about a principal axis.

difference between $I_1$ and $I_2$ is due to minor deformations of the crust (“mountains”). By definition, the body frame is attached to the neutron star and rotates with it; we will call the angular speed $\omega_{\rm rot}$. The spin-down of neutron stars is an extremely slow process, and we will assume that $\omega_{\rm rot}$ is constant. To study the gravitational radiation emitted, we will need to go to a non-rotating frame. First let us introduce a frame (x, y, z) whose origin is the center of mass, such that the z-axis coincides with the z'-axis, but the x' and y' axes rotate with respect to the x and y axes. The two frames are related by a rotation matrix $R_{ij}$,
$$ x'_i = R_{ij}\, x_j, \tag{8.53} $$
with
$$ R_{ij}(t) = \begin{pmatrix} \cos(\omega_{\rm rot} t) & \sin(\omega_{\rm rot} t) & 0 \\ -\sin(\omega_{\rm rot} t) & \cos(\omega_{\rm rot} t) & 0 \\ 0 & 0 & 1 \end{pmatrix}_{ij}. \tag{8.54} $$
The inertia tensor in the body frame is a constant tensor $I'_{ij} = {\rm diag}(I_1, I_2, I_3)$. The same tensor $I_{ij}$ in the non-rotating (x, y, z) frame does not have constant components; it is related to $I'_{ij}$ by the time-dependent rotation matrix $R_{ij}(t)$:

$$ I'_{ij} = (R\, I\, R^T)_{ij} = R_{ik} R_{jl} I_{kl}, \tag{8.55} $$
with $R^T$ the transpose. Conversely,
$$ I = R^T I' R. \tag{8.56} $$
This leads to

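$$ \begin{aligned} I_{11} &= I_1 \cos^2(\omega_{\rm rot} t) + I_2 \sin^2(\omega_{\rm rot} t) = \frac{I_1 + I_2}{2} + \frac{I_1 - I_2}{2} \cos(2\omega_{\rm rot} t), \\ I_{12} &= \frac{I_1 - I_2}{2} \sin(2\omega_{\rm rot} t), \\ I_{13} &= 0, \\ I_{22} &= \frac{I_1 + I_2}{2} - \frac{I_1 - I_2}{2} \cos(2\omega_{\rm rot} t), \\ I_{23} &= 0, \\ I_{33} &= I_3. \end{aligned} \tag{8.57} $$
The quadrupole radiation is generated by the acceleration of the second mass moment, $\ddot{M}_{ij}$ (Eq. (6.119)). From the definition, Eq. (6.114), we see that $M_{ij}$ differs from $I_{ij}$ by a minus sign and a trace term. However,
$$ {\rm Tr}\, I = {\rm Tr}(R^T I' R) = {\rm Tr}(R R^T I') = {\rm Tr}\, I' = I_1 + I_2 + I_3, \tag{8.58} $$
where in the second equality we used the cyclic property of the trace. Hence Tr I is time-independent, and
$$ M_{ij} = -I_{ij} + c_{ij}, \tag{8.59} $$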

where the $c_{ij}$ are constants. In particular,
$$ \begin{aligned} M_{11} &= -\frac{I_1 - I_2}{2} \cos(2\omega_{\rm rot} t) + {\rm const}, \\ M_{12} &= -\frac{I_1 - I_2}{2} \sin(2\omega_{\rm rot} t) + {\rm const}, \\ M_{22} &= \frac{I_1 - I_2}{2} \cos(2\omega_{\rm rot} t) + {\rm const}. \end{aligned} \tag{8.60} $$
The expressions (6.131) only give the radiation emitted in the z-direction. Since we will want to compute the radiation in any direction, we need to perform an additional rotation to re-orient the source. In particular, we want to incline the neutron star over an angle ι with respect to the z-axis. Because the radiation pattern will be axisymmetric, without loss of generality we can just perform a rotation around the x-axis, with rotation matrix

$$ \tilde{R}_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\iota & \sin\iota \\ 0 & -\sin\iota & \cos\iota \end{pmatrix}_{ij}, \tag{8.61} $$

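which acts on $M_{ij}$ as
$$ \tilde{M}_{ij} = (\tilde{R}^T M \tilde{R})_{ij}. \tag{8.62} $$
This gives
$$ \begin{aligned} \tilde{M}_{11} &= -\frac{I_1 - I_2}{2} \cos(2\omega_{\rm rot} t) + {\rm const}, \\ \tilde{M}_{12} &= -\frac{I_1 - I_2}{2} \cos\iota\, \sin(2\omega_{\rm rot} t) + {\rm const}, \\ \tilde{M}_{22} &= \frac{I_1 - I_2}{2} \cos^2\iota\, \cos(2\omega_{\rm rot} t) + {\rm const}. \end{aligned} \tag{8.63} $$
The quadrupole radiation in the new $\tilde{z}$-direction is
$$ h_+ = \frac{1}{r} \frac{G}{c^4} \left( \ddot{\tilde{M}}_{11} - \ddot{\tilde{M}}_{22} \right), \qquad h_\times = \frac{2}{r} \frac{G}{c^4}\, \ddot{\tilde{M}}_{12}, \tag{8.64} $$
or
$$ \begin{aligned} h_+ &= \frac{1}{r} \frac{4 G \omega_{\rm rot}^2}{c^4} (I_1 - I_2)\, \frac{1 + \cos^2\iota}{2}\, \cos(2\omega_{\rm rot} t), \\ h_\times &= \frac{1}{r} \frac{4 G \omega_{\rm rot}^2}{c^4} (I_1 - I_2)\, \cos\iota\, \sin(2\omega_{\rm rot} t). \end{aligned} \tag{8.65} $$
Thus, we have periodic radiation at twice the rotation frequency. The radiation is circularly polarized in the direction of the neutron star’s axis of rotation (ι = 0) and linearly polarized in the equatorial plane (ι = π/2). Define the ellipticity ε by
$$ \epsilon = \frac{I_1 - I_2}{I_3}, \tag{8.66} $$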

where the moments of inertia are assumed to be ordered such that $I_1 \geq I_2$. In the case of a homogeneous ellipsoid with semi-axes a, b, and c and $a \simeq b$, one has
$$ \epsilon = \frac{b - a}{a} + O(\epsilon^2). \tag{8.67} $$
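The rotation algebra behind (8.56) and (8.57), and the ellipsoid approximation (8.67), can be checked numerically. The sketch below is only an illustration: the helper names and the sample values of $I_1$, $I_2$, $I_3$, $\omega_{\rm rot}$, t, a and b are arbitrary assumptions. It builds $I = R^T I' R$ explicitly and compares the result with the closed-form components, using $(I_1 + I_2)/2$ as the constant term in $I_{11}$ and $I_{22}$.

```python
import math

def rotation_z(angle):
    """Rotation matrix of Eq. (8.54) for a given angle omega_rot * t."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def inertia_in_fixed_frame(I1, I2, I3, omega_rot, t):
    """I = R^T I' R, Eq. (8.56), with I' = diag(I1, I2, I3)."""
    R = rotation_z(omega_rot * t)
    I_body = [[I1, 0.0, 0.0], [0.0, I2, 0.0], [0.0, 0.0, I3]]
    return matmul(transpose(R), matmul(I_body, R))

def ellipsoid_ellipticity(a, b):
    """(I1 - I2) / I3 for a uniform ellipsoid, from Eqs. (8.47) and (8.66)."""
    return (b**2 - a**2) / (a**2 + b**2)

# Arbitrary sample values (assumptions for the check only).
I1, I2, I3, omega_rot, t = 1.2, 1.0, 1.5, 2.0, 0.3
inertia = inertia_in_fixed_frame(I1, I2, I3, omega_rot, t)

expected_11 = (I1 + I2) / 2 + (I1 - I2) / 2 * math.cos(2 * omega_rot * t)
expected_12 = (I1 - I2) / 2 * math.sin(2 * omega_rot * t)
expected_22 = (I1 + I2) / 2 - (I1 - I2) / 2 * math.cos(2 * omega_rot * t)

print(inertia[0][0] - expected_11, inertia[0][1] - expected_12,
      inertia[1][1] - expected_22)

# Ellipticity of a nearly axisymmetric ellipsoid versus (b - a) / a.
eps_exact = ellipsoid_ellipticity(a=1.0, b=1.001)
eps_approx = (1.001 - 1.0) / 1.0
print(eps_exact, eps_approx)
```

The differences printed in the first line vanish to machine precision, and the two ellipticity values agree up to a correction of order ε².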

Also define a gravitational wave frequency $f_{\rm gw} = 2 f_{\rm rot} = 2\omega_{\rm rot}/(2\pi)$. In terms of ε and $f_{\rm gw}$, the gravitational wave polarizations can be written as
$$ h_+ = A\, \frac{1 + \cos^2\iota}{2}\, \cos(2\pi f_{\rm gw} t), \qquad h_\times = A \cos\iota\, \sin(2\pi f_{\rm gw} t), \tag{8.68} $$
where
$$ A = \frac{1}{r} \frac{4\pi^2 G}{c^4}\, I_3\, f_{\rm gw}^2\, \epsilon. \tag{8.69} $$
A typical neutron star mass is $M \simeq 1.4\, M_\odot$, and a typical radius $R \simeq 10$ km, so that $I_3 = (2/5) M R^2 \simeq 10^{38}\ {\rm kg\, m^2}$. A high, but not uncommon, rotation frequency is f ∼ 1 kHz. Within our galaxy, most neutron stars will be found in or near the bulge at the center, which is at a distance of about 10 kpc. Very little is known about their ellipticity, but values of $\epsilon \sim 10^{-6}$ may be possible. Using these numbers as a reference, one has
$$ A = 1.04 \times 10^{-25} \left( \frac{10\ {\rm kpc}}{r} \right) \left( \frac{I_3}{10^{38}\ {\rm kg\, m^2}} \right) \left( \frac{f_{\rm gw}}{1\ {\rm kHz}} \right)^2 \left( \frac{\epsilon}{10^{-6}} \right). \tag{8.70} $$
This may seem disappointing at first, as it would appear to be several orders of magnitude below the sensitivity of the detectors. However, as we shall discuss later, with periodic sources (as opposed to short-duration ones, like inspirals) one can analyze the data in such a way that the signal-to-noise ratio increases with the square root of the observation time. This build-up of the signal strength makes it plausible that isolated neutron stars will be detectable, if ε is not too small.
Next we consider the power emitted in gravitational waves. The total power does not depend on the orientation of the source, so we can use the unrotated $M_{ij}$ of Eq. (8.60). Noting that $\dddot{M}_{11} = -\dddot{M}_{22}$ and using the quadrupole formula, we find
$$ P = \frac{2G}{5c^5} \left\langle \dddot{M}_{11}^2 + \dddot{M}_{12}^2 \right\rangle = \frac{32 G}{5 c^5}\, \epsilon^2 I_3^2\, \omega_{\rm rot}^6. \tag{8.71} $$
The rotational kinetic energy, $E_{\rm rot} = (1/2) I_3 \omega_{\rm rot}^2$, will then decrease through gravitational wave emission as
$$ \frac{dE_{\rm rot}}{dt} = -\frac{32 G}{5 c^5}\, \epsilon^2 I_3^2\, \omega_{\rm rot}^6. \tag{8.72} $$
If gravitational wave emission were the dominant mechanism by which rotational energy is lost, then the rotational frequency would decrease as
$$ \dot{\omega}_{\rm rot} = -\frac{32 G}{5 c^5}\, \epsilon^2 I_3\, \omega_{\rm rot}^5. \tag{8.73} $$
In fact, electromagnetic emission is known to dominate.
For instance, the inability to find a gravitational wave signal from the Crab pulsar in existing detector data has allowed researchers to put an upper bound on how much energy it emits in gravitational waves. Given the spin-down rate of the pulsar (which is known from radio observations) and the sensitivity of the detectors, it was determined that no more than 6% of the Crab’s rotational energy loss goes into gravitational radiation, or else a gravitational wave signal should have been seen.⁴⁷
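The reference numbers in this subsection are easy to reproduce. The sketch below evaluates the amplitude (8.69), the radiated power (8.71) and the spin-down rate (8.73) for the fiducial values used in the text (r = 10 kpc, $I_3 = 10^{38}$ kg m², $f_{\rm gw}$ = 1 kHz, ε = 10⁻⁶). The function names and the rounded physical constants are our own choices; with these constants the amplitude comes out at roughly 1.06 × 10⁻²⁵, close to the coefficient quoted in (8.70).

```python
import math

G = 6.674e-11     # Newton's constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8       # speed of light, m/s (rounded)
KPC = 3.0857e19   # one kiloparsec in metres

def strain_amplitude(r_m, I3, f_gw, eps):
    """Amplitude A of Eq. (8.69)."""
    return 4 * math.pi**2 * G * I3 * f_gw**2 * eps / (C**4 * r_m)

def gw_power(I3, f_gw, eps):
    """Radiated power of Eq. (8.71), using omega_rot = pi * f_gw."""
    omega_rot = math.pi * f_gw
    return 32 * G / (5 * C**5) * eps**2 * I3**2 * omega_rot**6

def spin_down_rate(I3, f_gw, eps):
    """d(omega_rot)/dt of Eq. (8.73), if gravitational waves were the only energy loss."""
    omega_rot = math.pi * f_gw
    return -32 * G / (5 * C**5) * eps**2 * I3 * omega_rot**5

# Fiducial values from the text.
I3, f_gw, eps, r = 1e38, 1e3, 1e-6, 10 * KPC

amplitude = strain_amplitude(r, I3, f_gw, eps)
power = gw_power(I3, f_gw, eps)
omega_dot = spin_down_rate(I3, f_gw, eps)

print(f"A = {amplitude:.3e}")
print(f"P = {power:.3e} W")
print(f"omega_dot = {omega_dot:.3e} rad/s^2")
```

As a consistency check, the power equals $-I_3\, \omega_{\rm rot}\, \dot{\omega}_{\rm rot}$, which is just the time derivative of $E_{\rm rot} = (1/2) I_3 \omega_{\rm rot}^2$.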

C. Mapping the geometry of the Universe with gravitational waves

We have seen how certain sources, such as Type Ia supernovae, can serve as standard candles: their luminosity distance can be determined (with some error) from their brightness, and their redshift can be measured separately. With a sufficient number of standard candles, one can use the relation

DL = DL(z; H0, ΩM, ΩR, ΩDE, Ωk, w) (8.74)

to constrain the cosmological parameters ΩM etc. However, standard candles are only as reliable as the calibration of their intrinsic luminosity, which depends on a “cosmic distance ladder” of other kinds of sources. The observation of gravitational waves from the inspiral and merger of compact objects will obviate the need for a cosmic distance ladder. Schematically, the strain such signals induce in a detector looks like

$$ h(t) = \frac{\mathcal{M}_c^{5/3}}{D_L}\, A(\theta, \phi, \psi, \iota)\, \cos(\Phi(t)). \tag{8.75} $$
The angles θ and φ give the sky position of the source, ψ is the polarization angle, and ι is the inclination angle of the orbital plane. If at least one of the compact objects is a neutron star then the merger will produce electromagnetic radiation. Indeed, short, hard gamma ray bursts (GRBs), the brightest events since the Big Bang, are believed to be caused by merging binary neutron stars. If a gamma ray burst is observed electromagnetically in coincidence with the gravitational wave observation of a binary neutron star coalescence, then it will be natural to associate the two events. In many cases the sky location of a gamma ray burst can be determined, so that θ and φ will be known. It is believed that the gamma ray emission of GRBs happens in tight beams perpendicular to the inspiral plane; if this is so then one can assume $\iota \simeq 0$, in which case ψ also disappears from the expression for A. However, with a network of detectors one can measure ι and ψ with reasonable accuracy, so that further assumptions are not necessary. This way, A is known. When measuring the amplitude of the signal, one would then still need to disentangle $\mathcal{M}_c$ and $D_L$. However, from the expression (8.29) we see that, to leading order in small parameters, the phasing Φ(t) is completely determined by $\mathcal{M}_c$. Hence, by studying the time evolution of the signal’s phase one can measure $\mathcal{M}_c$. With θ, φ, ψ, ι, and $\mathcal{M}_c$ known, one can then also determine $D_L$ by measuring the amplitude of the signal.
Another possibility is using binary supermassive black holes observed by the space-based LISA. In that case the signals will be in band for a year or so, during which time LISA moves about the Sun. This motion Doppler-modulates the gravitational waveform, from

47 The actual fraction is likely to be much lower, but this was one of the first important results from gravitational wave astronomy.

which (θ, φ, ι, ψ) can be obtained without the need for an electromagnetic counterpart, and also $\mathcal{M}_c$ and $D_L$. If the determination of the sky position is sufficiently accurate that the host galaxy can be determined, then once again z can be measured.
Inspiral events are self-calibrating; there is no need to take recourse to other kinds of sources to determine their luminosity distance $D_L$. With sky position known, one can also find out in which galaxy the inspiral occurred. The redshift z of a galaxy can be measured with high accuracy by looking at its spectrum. Thus, $D_L$ and z can be inferred independently, and binary neutron star inspirals are standard candles. Because their frequencies happen to fall in the range of audible sound, they are often referred to as standard sirens.
In the rest of this chapter we will discuss how standard sirens could be used to determine the geometry of the Universe. First we need to know how gravitational waves propagate through a FLRW Universe. Previously we linearized general relativity around a flat (Minkowski) background; here we need to linearize around a FLRW spacetime. We will not go into details here, but in fact the derivation follows closely what we did before, except that $\eta_{\mu\nu}$ is replaced by the background metric $\bar{g}_{\mu\nu}$. One defines
$$ \bar{h}_{\mu\nu} = h_{\mu\nu} - \frac{1}{2}\, \bar{g}_{\mu\nu}\, h, \tag{8.76} $$

with $h = \bar{g}^{\mu\nu} h_{\mu\nu}$. The equivalent of the harmonic gauge condition is

$$ \bar{\nabla}^\nu \bar{h}_{\mu\nu} = 0, \tag{8.77} $$

where $\bar{\nabla}$ is the covariant derivative adapted to the background metric. The linearized vacuum Einstein equations then take the form

$$ \bar{\nabla}^\rho \bar{\nabla}_\rho \bar{h}_{\mu\nu} = 0. \tag{8.78} $$
One can also impose a TT gauge, in which $\bar{h}_{\mu\nu} = h_{\mu\nu}$.
It is not difficult to show that in a (matter-dominated) FLRW Universe, a general spherically symmetric solution of Eq. (8.78) that falls off as 1/r approximately takes the form
$$ h_{\mu\nu} \simeq \frac{1}{r\, a(t)}\, H_{\mu\nu}(t - r/c). \tag{8.79} $$

Eq. (8.79) tells us that in the expressions for the inspiral waveforms, Eqns. (8.28), we only need to replace 1/r by 1/(a(t) r) to take into account the effect of the cosmological background. To fully understand the effect of propagation through a FLRW spacetime, let us first look at what happens near the source. Define the local wave zone as a region close enough to the source that the expansion of the Universe can be neglected, but sufficiently far so we are not in the strong-field regime and the quadrupole approximation holds. In the local wave zone, one has

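$$ \begin{aligned} h_+(t_{\rm em}) &= h_0(t_{\rm em,ret})\, \frac{1 + \cos^2\iota}{2}\, \cos\left[ 2\pi \int^{t_{\rm em,ret}} dt'_{\rm em}\, f_{\rm gw,em}(t'_{\rm em}) \right], \\ h_\times(t_{\rm em}) &= h_0(t_{\rm em,ret})\, \cos\iota\, \sin\left[ 2\pi \int^{t_{\rm em,ret}} dt'_{\rm em}\, f_{\rm gw,em}(t'_{\rm em}) \right], \end{aligned} \tag{8.80} $$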

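with
$$ h_0(t_{\rm em,ret}) = \frac{4}{a(t_{\rm em})\, r} \left( \frac{G \mathcal{M}_c}{c^2} \right)^{5/3} \left( \frac{\pi f_{\rm gw,em}(t_{\rm em,ret})}{c} \right)^{2/3}. \tag{8.81} $$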

Here $t_{\rm em}$ is the time measured by a clock at the source, and $t_{\rm em,ret}$ the corresponding retarded time. $f_{\rm gw,em}$ is the frequency associated with the definition of the time $t_{\rm em}$. Far from the local wave zone, at the location of the observer, the waveforms will be as in (8.80), but with $h_0$ replaced by

$$ h_0(t_{\rm obs,ret}) = \frac{4}{a(t_{\rm obs})\, r} \left( \frac{G \mathcal{M}_c}{c^2} \right)^{5/3} \left( \frac{\pi f_{\rm gw,obs}(t_{\rm obs,ret})}{c} \right)^{2/3}. \tag{8.82} $$
The phasing does not change, because

$$ \int^{t_{\rm em,ret}} dt'_{\rm em}\, f_{\rm gw,em}(t'_{\rm em}) = \int^{t_{\rm obs,ret}} dt'_{\rm obs}\, f_{\rm gw,obs}(t'_{\rm obs}). \tag{8.83} $$

This is because at the observer, frequency will have decreased by a factor (1 + z) but time will have increased by that same factor, so the redshift drops out. However, due care must be taken with the factor $[f_{\rm gw,em}(t_{\rm em,ret})]^{2/3}$ in the amplitudes. One has

fgw,em = (1 + z) fgw,obs. (8.84)

Eq. (8.82) then becomes

$$ \begin{aligned} h_0(t_{\rm obs,ret}) &= (1 + z)^{2/3}\, \frac{4}{a(t_{\rm obs})\, r} \left( \frac{G \mathcal{M}_c}{c^2} \right)^{5/3} \left( \frac{\pi f_{\rm gw,obs}(t_{\rm obs,ret})}{c} \right)^{2/3} \\ &= (1 + z)^{5/3}\, \frac{4}{D_L} \left( \frac{G \mathcal{M}_c}{c^2} \right)^{5/3} \left( \frac{\pi f_{\rm gw,obs}(t_{\rm obs,ret})}{c} \right)^{2/3}, \end{aligned} \tag{8.85} $$
where in the second line we used $D_L = (1 + z)\, a(t_{\rm obs})\, r$. Now define
$$ \mathcal{M}_{c,{\rm obs}} = (1 + z)\, \mathcal{M}_c. \tag{8.86} $$
Then
$$ h_0(t_{\rm obs,ret}) = \frac{4}{D_L} \left( \frac{G \mathcal{M}_{c,{\rm obs}}}{c^2} \right)^{5/3} \left( \frac{\pi f_{\rm gw,obs}(t_{\rm obs,ret})}{c} \right)^{2/3}. \tag{8.87} $$
Hence, masses also enter the waveform in redshifted form. This mass dilation in an expanding Universe is reminiscent of mass dilation in special relativity: objects moving with respect to an observer will appear to be more massive. For ground-based detectors that can see out to cosmological distances (such as Einstein Telescope), this effect is quite helpful: for instance, redshift will make binary neutron stars seem more massive, which implies a louder gravitational wave signal and the ability to see such sources at larger distances.
Let us now discuss in more detail the use of compact binary coalescence events as standard sirens. Both space-based and ground-based detectors will be of use here, but both the sources and the methods will be quite different:

• LISA will see between 20 and 80 inspirals of supermassive binary black holes, essentially all such events in its past light cone. These will not necessarily have clear optical counterparts⁴⁸, so that the sky location (needed for measurement of redshift through identification of the host galaxy) will need to be determined from the gravitational wave signal itself. Fortunately LISA’s motion around the Sun causes a Doppler modulation of the signal, which allows for rather good pointing accuracy. Even so, studies have determined that a source will have to be at a redshift z ≲ 1 for host identification to be possible. It is expected that the instrument will see about one source per year that is close enough to be used for cosmography.
• Advanced ground-based instruments will see tens to hundreds of coalescences involving at least one neutron star. If the associated gamma ray burst is observed then the host galaxy and the redshift can be established. Einstein Telescope will see on the order of a million such coalescences out to cosmological distances (see Fig. 31); in the course of several years, a thousand of these may allow for redshift measurement.

Figure 31: The distance and redshift reach of Einstein Telescope. M is the total mass, while $\nu = m_1 m_2 / M^2$ is the so-called symmetric mass ratio; equal component mass systems have ν = 0.25. The red solid curves give distances as functions of physical mass M, the blue dash-dotted ones as functions of the observed mass (1 + z) M.
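The role of the redshifted masses can be made concrete with a few lines of code. The sketch below assumes the standard definition of the chirp mass, $\mathcal{M}_c = M \nu^{3/5}$ with M the total mass, and evaluates the observed chirp mass (8.86) and the amplitude (8.87). The function names, the rounded constants, and the sample source (two 1.4 solar mass neutron stars at $D_L$ = 100 Mpc, observed at 100 Hz) are illustrative assumptions.

```python
import math

G = 6.674e-11      # m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # m/s (rounded)
M_SUN = 1.989e30   # kg
MPC = 3.0857e22    # m

def symmetric_mass_ratio(m1, m2):
    """nu = m1 m2 / M^2, with M = m1 + m2."""
    return m1 * m2 / (m1 + m2) ** 2

def chirp_mass(m1, m2):
    """Chirp mass M_c = M nu^(3/5) (standard definition, assumed here)."""
    return (m1 + m2) * symmetric_mass_ratio(m1, m2) ** 0.6

def observed_chirp_mass(m1, m2, z):
    """Redshifted chirp mass, Eq. (8.86)."""
    return (1 + z) * chirp_mass(m1, m2)

def h0_observed(m1, m2, z, d_lum, f_gw_obs):
    """Amplitude of Eq. (8.87); masses in kg, d_lum in m, f_gw_obs in Hz."""
    mc_obs = observed_chirp_mass(m1, m2, z)
    return (4 / d_lum) * (G * mc_obs / C**2) ** (5 / 3) \
        * (math.pi * f_gw_obs / C) ** (2 / 3)

nu_equal = symmetric_mass_ratio(1.4, 1.4)
mc_bns = chirp_mass(1.4, 1.4)   # in solar masses
h0_local = h0_observed(1.4 * M_SUN, 1.4 * M_SUN, 0.0, 100 * MPC, 100.0)
h0_redshifted = h0_observed(1.4 * M_SUN, 1.4 * M_SUN, 1.0, 100 * MPC, 100.0)

print(f"nu = {nu_equal}, chirp mass = {mc_bns:.4f} M_sun")
print(f"h0 at z = 0: {h0_local:.2e}")
print(f"ratio at fixed D_L and f_gw,obs: {h0_redshifted / h0_local:.3f}")
```

At fixed luminosity distance and observed frequency, the amplitude scales as $(1 + z)^{5/3}$, which is the sense in which redshifted binaries appear more massive and therefore louder.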

As discussed above, cosmography requires an independent measurement of redshift z and luminosity distance DL. The latter is to be measured from the gravitational wave signal. The Fisher matrix formalism which we discussed earlier will give us an indication of how

48 By the time a supermassive binary black hole will be orbiting with sufficient frequency to be in the LISA band, a common accretion disk will have formed. At merger, several percent of the binary’s mass/energy will be radiated away in a matter of hours. This will cause a sudden change in the Newtonian potential, which determines the orbits of matter particles in the accretion disk. The change will lead to a shock wave through the disk, and the attendant heating may cause a sufficiently distinct electromagnetic signal. However, the accretion disk will be so large that such a signal may take several years to be generated. Even then, it may not be distinguishable from the natural variability of quasars.

well this can be done. Recall that the Fisher matrix $\Gamma_{ij}$ is defined as

Γij = (∂ih|∂jh), (8.88)

with $\partial_i h = \partial h / \partial \lambda^i$. The covariance matrix $\Sigma^{ij} = (\Gamma^{-1})^{ij}$ gives 1-σ uncertainties in the parameters $\lambda^i$, as well as their correlation coefficients:
$$ \Delta\lambda^i = \sqrt{\Sigma^{ii}}, \qquad c^{ij} = \frac{\Sigma^{ij}}{\sqrt{\Sigma^{ii}\, \Sigma^{jj}}}, \tag{8.89} $$
where no summation over repeated indices is assumed.
From Eqns. (8.80), (8.82), (8.29), and the discussion around Eq. (8.33) we see that the parameters (θ, φ, ψ, ι, r) only enter the amplitude, not the phase; hence we can expect them to be strongly correlated, at least a priori. $\mathcal{M}_{c,{\rm obs}}$ also appears in the amplitude, but it can be measured from the phase with good accuracy. Since the scheme outlined above requires knowledge of sky position, we may assume that (θ, φ) are known. In the case of LISA, the orientation of the source will appear to be changing as the probes move around the Sun, allowing for a determination of (ψ, ι). For ground-based detectors, if GRBs are indeed strongly beamed⁴⁹ then $\iota \simeq 0$, in which case ψ drops out of the waveform, as radiation emitted perpendicularly to the inspiral plane is circularly polarized. Even if this is too strong an assumption, with a network of detectors seeing the source with different apparent orientations, ι and ψ can be disentangled.
Whatever parameters are known by other means need not be included in (8.88), which only pertains to measurements using the gravitational waveform. Hence (θ, φ) are absent from (8.88); (ι, ψ) will be approximately zero in the case of GRBs, and not that strongly correlated in the case of LISA; and $\mathcal{M}_{c,{\rm obs}}$ can be obtained from the phasing. Very roughly, we may then treat the Fisher matrix as if it is block-diagonal, with $D_L$ in one block. In that case,
$$ \Delta D_L \sim \frac{1}{\sqrt{(\partial_{D_L} h\, |\, \partial_{D_L} h)}}. \tag{8.90} $$
Note that
$$ \frac{\partial h}{\partial D_L} = -\frac{1}{D_L}\, h, \tag{8.91} $$
so
$$ \frac{\Delta D_L}{D_L} \sim \frac{1}{\sqrt{(h|h)}} = \frac{1}{\rho}, \tag{8.92} $$
with ρ the signal-to-noise ratio.
First let’s assume we only want to determine $H_0$, the Hubble parameter at the current era.⁵⁰ Ignoring correlations with the other cosmological parameters ($\Omega_M$, $\Omega_{DE}$, w),

$$ \Delta D_L = \left| \frac{\partial D_L}{\partial H_0} \right| \Delta H_0. \tag{8.93} $$

49 The estimated total beaming angle is ∼ 40°. Note that this is not a solid angle; indeed, the associated surface area on the unit sphere is only about 3%.
50 In this section, for simplicity we will assume ΩR = Ωk = 0.

From Eqns. (2.69), (2.43), and (8.91), we see that

$$ \frac{\Delta H_0}{H_0} = \frac{\Delta D_L}{D_L}. \tag{8.94} $$
First consider ground-based detectors. Most of their sources will be close to the angle-averaged horizon distance, as larger distances correspond to larger volumes and hence to a higher source rate. This means that for the majority of sources, ρ will be not much larger than the threshold SNR required for detection; for concreteness, ρ ∼ 10. Then the above expression together with (8.92) implies that with a single source, $H_0$ can be measured with a relative accuracy of 10%. However, with N sources, the uncertainty will go down roughly as $1/\sqrt{N}$. The Advanced LIGO-Virgo network is expected to see about 40 BNS events per year. Suppose that, over the course of several years, there will be 10 sources with a sufficiently distinct electromagnetic counterpart (not necessarily a gamma ray burst; there could be a non-beamed afterglow); then we gain about a factor of 3, leading to $\Delta H_0 / H_0 \sim 3\%$. Einstein Telescope is expected to see ∼ 1000 identifiable sources over the course of several years; again assuming an SNR close to threshold (ρ ∼ 10), we get $\Delta H_0 / H_0 \sim 0.3\%$.
As explained above, in the case of LISA we may have only one source for which a redshift can be determined. On the other hand, this must then be a relatively close-by source, with z ≲ 1. The SNR will depend sensitively on sky position and orientation, but $\rho \simeq 1000$ is reasonable. Thus, $\Delta H_0 / H_0 \sim 0.1\%$ in that case.
One can also assume that, say, ($H_0$, $\Omega_M$, $\Omega_{DE}$) have already been determined by other means and try to measure the dark energy equation-of-state parameter w. Again assuming no correlations among the cosmological parameters,

$$ \Delta D_L = \left| \frac{\partial D_L}{\partial w} \right| \Delta w, \tag{8.95} $$
or
$$ \Delta w = D_L \left| \frac{\partial D_L}{\partial w} \right|^{-1} \frac{\Delta D_L}{D_L}. \tag{8.96} $$
For definiteness, let us assume the true value of w to be w = −1, corresponding to a cosmological constant. We first consider LISA. For a source at z ∼ 0.5, one has $D_L \simeq 3$ Gpc and $|\partial D_L / \partial w| \sim 500$ Mpc. Again taking $\Delta D_L / D_L \sim 1/\rho \sim 10^{-3}$, this leads to $\Delta w / |w| \sim 0.6\%$.
Sources in ET will be spread out in redshift up to z ∼ 2, but we can make a back-of-the-envelope estimate. The ET sources that contribute the most to estimates of cosmological parameters also happen to be roughly at z ∼ 0.5, with an SNR of about 20. This leads to $\Delta w / |w| \sim 30\%$ for an individual source. Assuming N = 1000 sources and dividing by $\sqrt{N}$, we get $\Delta w / |w| \sim 1\%$, which is actually very close to what was found in more in-depth studies.
In reality these uncertainties will be larger because of weak lensing. The matter between source and observer acts as a lens, leading to magnification or demagnification. Light and gravitational waves are affected in exactly the same way: gravitons and photons both move along lightlike geodesics! This has the effect of corrupting measurements of $D_L$. For the Advanced detectors this will not be much of a problem, since their sources will be relatively close-by (up to about 1 Gpc). However, for sources up to z ∼ 1 (about 3 Gpc), the effect on distance measurements has been estimated to be at the 3 to 5% level. Since this error is independent of the uncertainties due to instrumental noise, it can be combined in quadrature

with $\Delta D_L / D_L$:
$$ \left[ \frac{\Delta D_L}{D_L} \right]_{\rm total} = \left( \left[ \frac{\Delta D_L}{D_L} \right]_{\rm noise}^2 + \left[ \frac{\Delta D_L}{D_L} \right]_{\rm lensing}^2 \right)^{1/2}. \tag{8.97} $$
If, for definiteness, we assume a 4% error, then the above LISA estimates will be completely dominated by weak lensing errors: $[\Delta H_0 / H_0]_{\rm total} \sim 4\%$ and $[\Delta w]_{\rm total} \sim 23\%$. Fortunately, it may be possible to at least partially subtract these effects. Other than (de)magnification, lensing also causes deformations in galaxy images (causing them to be banana-shaped rather than elliptical). By making maps of these deformations one can infer the distribution of matter along the line of sight, model the effect of weak lensing, and at least partially subtract it.
Note that weak lensing will be far less of a problem for ET, because of large number statistics. Repeating the rough estimate above but with $[\Delta D_L / D_L]_{\rm total} = (0.04^2 + 0.05^2)^{1/2} \simeq 6.4\%$, with 1000 sources one gets an error of 1.2%, barely different from the value without weak lensing.
Of course we will want to measure all four parameters $H_0$, $\Omega_M$, $\Omega_{DE}$, and w, which will require at least four measurements. Hence, this is something one cannot do with LISA alone, although LISA observations can of course be combined with those of other gravitational-wave observatories. Advanced LIGO-Virgo could in principle be used, but the number of sources will be too low to independently measure all parameters. For ($\Omega_M$, $\Omega_{DE}$, w) one can always use values obtained from electromagnetic observations. However, as explained above, this would then make the results dependent on the cosmic distance ladder, which is undesirable.
But Einstein Telescope will be suitable. One could first get a high-quality measure of the Hubble constant by using fairly nearby sources that are unaffected by weak lensing and have a high SNR. Careful studies have shown that with only 50 sources up to z = 0.5, one could measure $H_0$ with a relative error of 0.5%.
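The back-of-the-envelope numbers above can be reproduced with a short script. The sketch below assumes a spatially flat Universe with $H_0$ = 70 km/s/Mpc and $\Omega_M$ = 0.3 (values chosen for illustration; they are not specified in the text), computes $D_L$ and $\partial D_L / \partial w$ at z = 0.5 by numerical integration and finite differences, and then applies Eqs. (8.92), (8.96) and (8.97). All function names are our own.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def hubble_ratio(z, omega_m, w):
    """E(z) = H(z)/H0 for a flat Universe with Omega_DE = 1 - Omega_M."""
    omega_de = 1.0 - omega_m
    return math.sqrt(omega_m * (1 + z) ** 3
                     + omega_de * (1 + z) ** (3 * (1 + w)))

def luminosity_distance(z, h0=70.0, omega_m=0.3, w=-1.0, steps=200):
    """D_L in Mpc, integrating 1/E(z) with Simpson's rule (steps must be even)."""
    dz = z / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight / hubble_ratio(k * dz, omega_m, w)
    return (1 + z) * (C_KM_S / h0) * total * dz / 3

def d_dl_dw(z, w=-1.0, step=1e-4):
    """Central finite-difference estimate of dD_L/dw in Mpc."""
    upper = luminosity_distance(z, w=w + step)
    lower = luminosity_distance(z, w=w - step)
    return (upper - lower) / (2 * step)

def total_fractional_error(snr, lensing=0.0):
    """Eq. (8.97), with the noise term 1/SNR from Eq. (8.92)."""
    return math.hypot(1.0 / snr, lensing)

def delta_w(z, frac_err, n_sources=1):
    """Eq. (8.96), reduced by sqrt(N) for N independent sources."""
    ratio = luminosity_distance(z) / abs(d_dl_dw(z))
    return ratio * frac_err / math.sqrt(n_sources)

dl = luminosity_distance(0.5)
slope = d_dl_dw(0.5)
lisa_noise_only = delta_w(0.5, total_fractional_error(1000))
lisa_with_lensing = delta_w(0.5, total_fractional_error(1000, 0.04))
et_with_lensing = delta_w(0.5, total_fractional_error(20, 0.04), n_sources=1000)

print(f"D_L(z=0.5) = {dl:.0f} Mpc, dD_L/dw = {slope:.0f} Mpc")
print(f"LISA, noise only:   dw = {lisa_noise_only:.4f}")
print(f"LISA, with lensing: dw = {lisa_with_lensing:.3f}")
print(f"ET, 1000 sources:   dw = {et_with_lensing:.4f}")
```

With these assumed parameters the distance and its derivative come out close to the 3 Gpc and 500 Mpc used above, and the resulting uncertainties on w agree with the quoted estimates to within the accuracy one expects from such a rough calculation.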
Next, using the rest of the sources and keeping the value of $H_0$ fixed⁵¹, one could find the values of ($\Omega_M$, $\Omega_{DE}$, w) by making a best fit of the measured values of $D_L$ against z. To estimate uncertainties, one could create a simulated “catalog” of sources at different redshifts $z_i$, i = 1, ..., 1000, and assign to each a “measured” distance $\hat{D}_L^{(i)} = D_L(z_i) + \delta D_L^{(i)}$. Here $D_L(z_i)$ is the true distance, while the “measurement error” $\delta D_L^{(i)}$ is drawn from a Gaussian distribution with a width $\Delta D_L(z)$, computed using the Fisher matrix formalism. One then fits the measured distances to the redshift, leading to “measured values” ($\hat{\Omega}_M$, $\hat{\Omega}_{DE}$, $\hat{w}$) for the cosmological parameters. Having repeated this many times with many different simulated catalogs, one gets a distribution for these values, of which the variance can be computed. This is what is shown in Fig. 32. The result for, e.g., the dark energy measurement is competitive with projections for future, dedicated dark energy missions, where 3.5-11% uncertainty is expected. However, we reiterate that gravitational wave standard sirens are self-calibrating and have no need for an elaborate cosmic distance ladder, circumventing possible unknown systematic errors. What they will tell us about the geometry of the Universe is guaranteed to be correct within the stated statistical errors.
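The catalog procedure just described can be sketched in a strongly simplified form. In the toy version below only w is fitted, with $H_0$ and $\Omega_M$ held fixed in a flat Universe; the distance error is a constant 5% rather than a Fisher-matrix estimate; and the fit is a brute-force χ² minimization on a grid. The cosmological values, the number of sources and catalogs, the redshift range, and all function names are assumptions made for illustration, so the resulting scatter should not be compared directly with Fig. 32.

```python
import math
import random
import statistics

C_KM_S = 299792.458  # speed of light in km/s

def luminosity_distance(z, w, h0=70.0, omega_m=0.3, steps=40):
    """Flat-Universe D_L in Mpc via Simpson's rule (steps must be even)."""
    dz = z / steps
    total = 0.0
    for k in range(steps + 1):
        zk = k * dz
        e = math.sqrt(omega_m * (1 + zk) ** 3
                      + (1 - omega_m) * (1 + zk) ** (3 * (1 + w)))
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight / e
    return (1 + z) * (C_KM_S / h0) * total * dz / 3

def simulate_w_scatter(n_sources=100, n_catalogs=50, frac_err=0.05, seed=1):
    """Mean and standard deviation of the best-fit w over simulated catalogs."""
    rng = random.Random(seed)
    redshifts = [rng.uniform(0.1, 2.0) for _ in range(n_sources)]
    true_dl = [luminosity_distance(z, -1.0) for z in redshifts]
    sigma = [frac_err * d for d in true_dl]

    # Model distances on a grid of trial w values, computed once.
    w_grid = [-1.5 + 0.01 * k for k in range(101)]
    models = [[luminosity_distance(z, w) for z in redshifts] for w in w_grid]

    w_hats = []
    for _ in range(n_catalogs):
        # "Measured" distances: true value plus Gaussian error.
        measured = [rng.gauss(d, s) for d, s in zip(true_dl, sigma)]
        chi2 = [sum(((m - t) / s) ** 2
                    for m, t, s in zip(measured, model, sigma))
                for model in models]
        w_hats.append(w_grid[chi2.index(min(chi2))])
    return statistics.mean(w_hats), statistics.stdev(w_hats)

mean_w, std_w = simulate_w_scatter()
print(f"mean fitted w = {mean_w:.3f}, scatter = {std_w:.3f}")
```

The mean of the fitted values recovers the input w = −1, and the spread over catalogs plays the role of the uncertainty. A full analysis would fit ($\Omega_M$, $\Omega_{DE}$, w) simultaneously and use redshift-dependent errors that include weak lensing.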

51 As it turns out, simply measuring (H0, ΩM, ΩDE, w) all at once leads to intolerably large uncertainties; measuring H0 separately solves the problem by avoiding a degeneracy.

Figure 32: Top: a simulated “catalog” of binary neutron star inspirals with “measured” luminosity distances, with errors due to both detector noise and weak lensing. Bottom left: histograms of “measured” values of the cosmological parameters, with weak lensing included (left panels) and corrected for (right panels). With weak lensing, the relative uncertainties are 18%, 4.2%, and 18%, respectively. If weak lensing effects can be subtracted then these numbers become 14%, 3.5%, and 15%, respectively. Bottom right: if we live in a spatially flat (k = 0) Universe, as observations indicate, then ΩM + ΩDE = 1 so that only one of the two is independent. The relative uncertainties on ΩM and w are then 9.4% and 7.6%, respectively (with weak lensing), or 8.1% and 6.6%, respectively (without weak lensing).
