Relativity Theory
Jouko Mickelsson with Tommy Ohlsson and Håkan Snellman
Mathematical Physics, KTH Physics, Royal Institute of Technology, Stockholm 2005. Typeset in LaTeX.
Written by Jouko Mickelsson, 1996. Revised by Tommy Ohlsson, 1998. Revised and extended by Jouko Mickelsson, Tommy Ohlsson, and Håkan Snellman, 1999. Revised by Jouko Mickelsson, Tommy Ohlsson, and Håkan Snellman, 2000. Revised by Tommy Ohlsson, 2001. Revised by Tommy Ohlsson, 2003. Revised by Mattias Blennow, 2005.
Solutions to the problems are written by Tommy Ohlsson and Håkan Snellman, 1999. Updated by Tommy Ohlsson, 2000. Updated by Tommy Ohlsson, 2001. Updated by Tommy Ohlsson, 2003. Updated by Mattias Blennow, 2005. © Mathematical Physics, KTH Physics, KTH, 2005
Printed in Sweden by US–AB, Stockholm, 2005.

Contents

1 Special Relativity
  1.1 Geometry of the Minkowski Space
  1.2 Lorentz Transformations
  1.3 Physical Interpretations
    1.3.1 Lorentz Contraction
    1.3.2 Time Dilation
    1.3.3 Relativistic Addition of Velocities
    1.3.4 The Michelson–Morley Experiment
    1.3.5 The Relativistic Doppler Effect
  1.4 The Proper Time and the Twin Paradox
  1.5 Transformations of Velocities and Accelerations
  1.6 Energy, Momentum, and Mass in Relativity Theory
  1.7 The Spinorial Representation of Lorentz Transformations
  1.8 Lorentz Invariance of Maxwell's Equations
    1.8.1 Physical Consequences of Lorentz Transformations
    1.8.2 The Lorentz Force
    1.8.3 The Energy-Momentum Tensor
  1.9 Problems

2 Some Differential Geometry
  2.1 Manifolds
  2.2 Vector Fields and Tangent Vectors
    2.2.1 Tensor Fields
  2.3 Geodesics
    2.3.1 Affine Connection and Christoffel Symbols
    2.3.2 Parallel Transport
  2.4 Torsion and Curvature
  2.5 Metric and Pseudo-Metric
  2.6 Problems

3 General Relativity
  3.1 The Einstein Field Equations
  3.2 The Newtonian Limit
  3.3 The Schwarzschild Metric
  3.4 Experimental Tests of General Relativity
    3.4.1 The Gravitational Redshift
    3.4.2 The Perihelion Precession of Mercury
    3.4.3 The Bending of Light
    3.4.4 Radar (Laser) Echo Delay
    3.4.5 Black Holes, Binary Star Systems, and Star Evolution
  3.5 Cosmological Models
    3.5.1 The Large Scale Structure of the Universe
    3.5.2 The Robertson–Walker Metric
  3.6 Problems

4 Solutions to Problems
  4.1 Solutions to Problems in Chapter 1
  4.2 Solutions to Problems in Chapter 2
  4.3 Solutions to Problems in Chapter 3

Useful Formulas in Relativity Theory
  Hyperbolic Functions
  The Electromagnetic Field
  Metric, Connection, Curvature, and Torsion
  General Relativity

Chapter 1
Special Relativity
Relativity theory has at least two independent roots. One is the idea that the description of physics should be independent of the observer's state of motion, which goes back at least to Galilei; the other is the long-cherished dream of physicists, since the days of Archimedes, of getting rid of motion by introducing geometry.

The first line of thought says that physics should not depend on the observer's frame of reference, whether it is in a state of motion or not. Galilei seems to have realized this, and suggested that a free-fall experiment performed on a boat, whether moving with constant velocity or at rest, should give the same result. In other words: free-fall experiments cannot tell whether we are in a coordinate system moving with constant speed $v$ or in a coordinate system at rest. Together with the invariance under translations and rotations, this leads to the so-called Galilei invariance of Newton's equations of motion. In one space dimension the Galilei transformation is:
$$x \to x' = x - vt, \tag{1.1}$$
$$t \to t' = t, \tag{1.2}$$
where $v$ is the constant velocity of the observer. In Newton's equation $F = ma$ for a body of mass $m$ acted upon by a force $F$, this invariance is immediately obvious, since the acceleration $a$ is the second derivative of position with respect to time, i.e., $a = \ddot{x}(t)$, and the velocity $v$ is independent of time.

The second root, the geometrization of motion, was discussed already by the presocratic philosophers, and was especially promoted by Archimedes. This geometrization is not the same as that discussed later by Kepler and Galilei, who thought of motion in terms of orbits with geometrical shape: ellipses, parabolas, and hyperbolas. These conic sections are geometrical, but the particles still move along these curves in space.

The idea of the fourth dimension became prominent during the 19th century, especially through the German scientist Gustav Fechner, who wrote about the fourth dimension, although with a slightly different aim than ours. Later the English science fiction writer H. G. Wells introduced time as the fourth dimension in his novel "The Time Machine" from 1895. Even in Einstein's relativity paper of 1905, the four-dimensional formulation was not part of his ideas. His discussion focuses on the relation between electricity and magnetism, and mainly describes the notion of relative time and the concept of simultaneity.
Soon after Einstein's formulation of the theory of relativity, his former teacher Hermann Minkowski at ETH in Zürich realized that the (special) theory of relativity could be formulated very elegantly in a four-dimensional space-time continuum: the Minkowski space. He published his formulation in 1908, only a short time before his untimely death.

To follow Minkowski, we should first transform time to have the same dimension as the three space coordinates. Fortunately this can readily be done, since the speed of light, $c$, is by Einstein's postulate a universal constant of nature, the same for all observers related to each other via Lorentz transformations. (The Lorentz transformations leave Maxwell's equations invariant.) The fourth coordinate is therefore chosen as $x^0 = ct$. The motion of an object with constant velocity is then represented by a straight line in this space, and all theorems about straight lines in four dimensions have something to say about the motion of a free particle. In a similar vein, curved lines represent particles acted upon by forces, i.e., accelerating or decelerating particles. Thus motion can be described by geometry, and the old dream of Archimedes is, in a sense, fulfilled.

In the four dimensions of Minkowski space the particles are not described by orbits, but by world-lines. These represent the space-time history of the particles. If a particle moves regularly in a Keplerian circular orbit in three-dimensional space, its total motion, or world-line, will be a (hyper) spiral in Minkowski space, etc.

The two roots can now be connected to each other by thinking of all transformations that leave the geometrical forms invariant, or of all transformations that are able to superimpose one (four-dimensional) geometrical object onto another one. Below, we start out by studying the mathematical tools relevant to the description of special relativity in Minkowski space.
1.1 Geometry of the Minkowski Space
Let $M = \mathbb{R}^4$ be the four-dimensional real vector space consisting of all 4-tuples $x = (x^0, x^1, x^2, x^3)$ of real numbers. Addition of vectors is defined as usual as $x + y = (x^0 + y^0, x^1 + y^1, x^2 + y^2, x^3 + y^3)$ and multiplication by real scalars $\lambda$ as $\lambda x = (\lambda x^0, \lambda x^1, \lambda x^2, \lambda x^3)$. Normally, in Euclidean geometry, the length of a vector $x \in M$ is defined by the formula
$$|x|^2 = (x^0)^2 + (x^1)^2 + (x^2)^2 + (x^3)^2. \tag{1.3}$$
The Euclidean inner product is $\langle x, y \rangle = x^i y^i$, where $i = 0, 1, 2, 3$. From now on, we shall use the Einstein summation convention: Sum over repeated indices (on the same side of an equation).

Minkowskian geometry differs from Euclidean geometry: The 'length' of a vector $x \in M$ is defined by the formula
$$x^2 = (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2. \tag{1.4}$$
Note that the right-hand side is indefinite; it can be positive, zero, or negative. For this reason, we shall speak only about the length squared $x^2$, and we do not take the square root of this expression (which could be imaginary). We introduce the Minkowskian inner product as
$$x \cdot y = x^0 y^0 - x^1 y^1 - x^2 y^2 - x^3 y^3. \tag{1.5}$$
The Minkowskian inner product is obviously commutative, i.e., $x \cdot y = y \cdot x$. We shall also introduce the metric tensor $\eta = (\eta_{\mu\nu})$, where $\eta_{\mu\nu} = 0$ if $\mu \neq \nu$, $\eta_{00} = 1$, and $\eta_{ii} = -1$ (no summation here!) when $i = 1, 2, 3$. This metric tensor is usually called the Minkowski metric. The inverse of the Minkowski metric is given by $\eta^{-1} = (\eta^{\mu\nu})$, where $\eta^{\mu\nu} = \eta_{\mu\nu}$ (see Footnote 1). The Minkowski metric and its inverse fulfill the relation
$$\eta_{\mu\lambda}\eta^{\lambda\nu} = \eta_\mu{}^\nu = \delta_\mu^\nu, \tag{1.6}$$
where $\delta_\mu^\nu$ is the Kronecker delta, $\delta_\mu^\nu = 1$ if $\mu = \nu$ and $\delta_\mu^\nu = 0$ if $\mu \neq \nu$.

In relativity theory, we often use the convention that the Greek indices run from 0 to 3, whereas the Latin indices take the values 1, 2, 3. With this convention, we can write
$$x \cdot y = x^\mu \eta_{\mu\nu} y^\nu = \eta_{\mu\nu} x^\mu y^\nu.$$
Another often used notational convention: We lower and raise the vector and tensor indices according to the rules
$$x_\mu = \eta_{\mu\nu} x^\nu, \quad A_\mu{}^\nu = \eta_{\mu\lambda} A^{\lambda\nu}, \quad A_{\mu\nu} = \eta_{\mu\lambda}\eta_{\nu\omega} A^{\lambda\omega},$$
$$x^\mu = \eta^{\mu\nu} x_\nu, \quad A^\mu{}_\nu = \eta^{\mu\lambda} A_{\lambda\nu}, \quad A^{\mu\nu} = \eta^{\mu\lambda}\eta^{\nu\omega} A_{\lambda\omega}.$$
Thus, $x_0 = x^0$ and $x_i = -x^i$ for $i = 1, 2, 3$. We can now write $x \cdot y = x^\mu y_\mu = x_\mu y^\mu$. Tensors with $n$ indices, all down, are called covariant tensors of rank $n$, and tensors with $n$ indices, all up, are called contravariant tensors of rank $n$. Tensors with indices both up and down are so-called mixed tensors. Thus, a vector is a tensor of rank 1 (one index) and a scalar is a tensor of rank zero (no indices). For example, $x^\mu$ is a contravariant vector and $x_\mu$ is a covariant vector.

There is still another useful notation for the Minkowskian inner product. If we write the vectors $x \in M$ using the one-column matrix notation,
$$x = \begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}, \tag{1.7}$$
then the inner product is $x \cdot y = x^t \eta y = y^t \eta x$, where $x^t$ is the transposed matrix of $x$, i.e., a one-row matrix.

We say that $x \in M$ is time-like if $x^2 > 0$, light-like if $x^2 = 0$, and space-like if $x^2 < 0$. Note that the light-like vectors form a cone $(x^0)^2 = (x^1)^2 + (x^2)^2 + (x^3)^2$. A non-spacelike vector $x$ is future pointing if $x^0 > 0$ and past pointing if $x^0 < 0$. A sum of two space-like vectors is in general not space-like. However, a sum of two future (past) pointing vectors is again a future (past) pointing vector. Indeed, if $z = x + y$ and $x^2$, $y^2$, $x^0 y^0$ are positive, then
$$z^2 = (x + y)^2 = x^2 + y^2 + 2x \cdot y = x^2 + y^2 + 2(x^0 y^0 - \mathbf{x} \cdot \mathbf{y}) \geq 2(x^0 y^0 - |\mathbf{x}||\mathbf{y}|) \geq 0,$$
where $\mathbf{x} = (x^1, x^2, x^3)$ is the spatial part of the vector $x$ and we have used the Schwarz inequality
$$|\mathbf{x} \cdot \mathbf{y}| \leq |\mathbf{x}||\mathbf{y}| \leq x^0 y^0$$
for any pair of such vectors.

There is another peculiarity of the Minkowski metric. The orthogonal complement of a subspace $V$ can contain non-zero vectors of $V$. For example, if $x \neq 0$ is a light-like vector, then the orthogonal complement $x^\perp$ is spanned by $x$ itself and a pair of space-like vectors.

Footnote 1: It is not generally true for all metrics that the elements of the metric and its inverse are equal, as we will see later on.
Example 1.1 Take $x = (1, 1, 0, 0)$. Then, a basis of $x^\perp$ consists of $x$ and the vectors $(0, 0, 1, 0)$, $(0, 0, 0, 1)$.
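The definitions of this section can be tried out numerically. The following is a minimal sketch, not part of the original notes; the helper names `minkowski_dot` and `classify` are chosen here for illustration only. It evaluates the inner product (1.5), classifies vectors by the sign of the length squared, and checks the orthogonality claims of Example 1.1.

```python
def minkowski_dot(x, y):
    """Minkowskian inner product x.y = x0*y0 - x1*y1 - x2*y2 - x3*y3, Eq. (1.5)."""
    return x[0] * y[0] - sum(a * b for a, b in zip(x[1:], y[1:]))


def classify(x):
    """Return 'time-like', 'light-like', or 'space-like' from the sign of x^2."""
    length_sq = minkowski_dot(x, x)
    if length_sq > 0:
        return "time-like"
    if length_sq < 0:
        return "space-like"
    return "light-like"


# Example 1.1: x is light-like and orthogonal to itself and to two space-like vectors.
x = (1, 1, 0, 0)
basis = [x, (0, 0, 1, 0), (0, 0, 0, 1)]
orthogonal = all(minkowski_dot(x, b) == 0 for b in basis)

print(classify(x), orthogonal, [classify(b) for b in basis[1:]])
```

With integer components the arithmetic is exact, so the light-like vector really does lie in its own orthogonal complement, which cannot happen for a non-zero vector in Euclidean geometry.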
1.2 Lorentz Transformations
We shall study linear transformations $\Lambda : M \to M$ in the Minkowski space. Recall that a transformation is linear if $\Lambda(x + y) = \Lambda(x) + \Lambda(y)$ and $\Lambda(\lambda x) = \lambda\Lambda(x)$ for any vectors $x, y$ and real number $\lambda$. A linear transformation in $M$ can be described by a real $4 \times 4$ matrix $\Lambda = (\Lambda^\mu{}_\nu)$. Note the notational convention widely used in relativity theory: One of the indices is written up and the other down. The vector $x$ is transformed to the vector $x' = \Lambda(x)$ with coordinates
$$x'^\mu = \Lambda^\mu{}_\nu x^\nu. \tag{1.8}$$
In relativity theory, the linear transformations that leave all Minkowskian inner products invariant play a central role. Thus, we shall study transformations $\Lambda : M \to M$ such that
$$(\Lambda x) \cdot (\Lambda y) = x \cdot y \tag{1.9}$$
for all $x, y \in M$. In other words, we require
$$(\Lambda x)^t \eta (\Lambda y) = x^t \eta y \tag{1.10}$$
for all $x, y \in M$. This is equivalent to $\Lambda^t \eta \Lambda = \eta$.

Let us denote by $O(3,1)$ the set of all Lorentz transformations (the transformations preserving Minkowskian inner products). The set $O(3,1)$ is a group. If $A, B \in O(3,1)$, then
$$(AB)^t \eta (AB) = B^t (A^t \eta A) B = B^t \eta B = \eta \tag{1.11}$$
and thus $AB \in O(3,1)$. Thus, we have a well-defined product $(A, B) \mapsto AB$ defined in $O(3,1)$. The matrix product is 1) associative, i.e., $A(BC) = (AB)C$, 2) there is a neutral element $e$, the unit matrix $1$, with the property $eA = Ae = A$ for any $A$, and 3) any element $A \in O(3,1)$ has an inverse $A^{-1}$, the inverse matrix, such that $A^{-1}A = AA^{-1} = e$. Note that the existence of the inverse matrix follows from $\Lambda^t \eta \Lambda = \eta$, where $\Lambda \in O(3,1)$; taking the determinant of both sides and noting that $\det \eta = -1$, we see that $(\det \Lambda)^2 = 1$ and so $\det \Lambda = \pm 1 \neq 0$.

Example 1.2 The transformations $\Lambda$ leaving the coordinate $x^0$ fixed, $x'^0 = x^0$, form a subgroup, denoted by $O(3)$, of $O(3,1)$. This is simply the group of all orthogonal transformations in the 3-dimensional Euclidean space $\mathbb{R}^3$. In particular, it contains the group of ordinary rotations $SO(3)$, i.e., orthogonal transformations $\Lambda$ in $\mathbb{R}^3$ with $\det \Lambda = 1$.
Example 1.3 Consider the following linear transformations in M:
$$x'^0 = x^0 \cosh\theta - x^1 \sinh\theta,$$
$$x'^1 = -x^0 \sinh\theta + x^1 \cosh\theta,$$
$$x'^2 = x^2,$$
$$x'^3 = x^3,$$
where $\theta$ is any real number. The parameter $\theta$ is called the rapidity. Using the identity $\cosh^2\theta - \sinh^2\theta = 1$, one easily sees that indeed $x'^2 = x^2$ for the length squared. Denote by $\Lambda^{(01)}(\theta)$ the above linear transformation. By direct computation, one finds $\Lambda^{(01)}(\theta)\Lambda^{(01)}(\theta') = \Lambda^{(01)}(\theta + \theta')$.

In Example 1.3 we can replace $x^1$ by either of the coordinates $x^2, x^3$, and we could define the linear transformations $\Lambda^{(02)}$ and $\Lambda^{(03)}$ in a similar manner.
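The two properties claimed in Example 1.3 (invariance of the length squared and additivity of the rapidity) can be checked with plain nested lists. This is an illustrative sketch; the function names `boost01`, `matmul`, `apply`, and `minkowski_sq`, and the numerical values, are assumptions made here and do not come from the notes.

```python
import math


def boost01(theta):
    """Matrix of the transformation of Example 1.3 in the x0-x1 plane with rapidity theta."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [[ch, -sh, 0.0, 0.0],
            [-sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(m, x):
    """Matrix-vector product m x for a 4x4 matrix and a 4-vector."""
    return [sum(m[i][k] * x[k] for k in range(4)) for i in range(4)]


def minkowski_sq(x):
    """Length squared x^2 = (x0)^2 - (x1)^2 - (x2)^2 - (x3)^2, Eq. (1.4)."""
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2


theta, theta_prime = 0.7, -0.3   # arbitrary illustrative rapidities
x = [2.0, 1.0, -0.5, 3.0]        # arbitrary illustrative 4-vector

# Invariance of the length squared under the boost.
invariant = math.isclose(minkowski_sq(apply(boost01(theta), x)), minkowski_sq(x))

# Group property: boost(theta) boost(theta') = boost(theta + theta').
product = matmul(boost01(theta), boost01(theta_prime))
expected = boost01(theta + theta_prime)
composes = all(math.isclose(product[i][j], expected[i][j], abs_tol=1e-12)
               for i in range(4) for j in range(4))

print(invariant, composes)
```

Both flags should come out true up to floating-point rounding, which is why the comparisons use a tolerance rather than exact equality.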
1.3 Physical Interpretations
An event in the physical 3-dimensional Euclidean space $\mathbb{R}^3$ has coordinates
$$x = (x^0, x^1, x^2, x^3), \tag{1.12}$$
where $x^0 = ct$ is related to the time $t$ of the event and $x^1, x^2, x^3$ are the Cartesian space coordinates $x, y, z$. Here $c$ is the speed of light (in vacuum), $c \equiv 299\,792\,458$ m/s. Often we measure time using $x^0$ instead of $t$; this means that we use units where $c = 1$. The unit of length is then the distance that light travels in one second; or equivalently, if we use the meter as the basic unit, then one second of time corresponds to about $3 \cdot 10^8$ m. The 'length' of a normal lecture (45 minutes) would then be about $8.1 \cdot 10^{11}$ m!

The trajectory of a point particle in space and time is given by a continuous (piecewise differentiable) curve $x(\tau)$ in $M$. It is assumed that the tangent vector $u = x'(\tau)$ of the curve is either time-like or light-like at any point. By the chain rule,
$$\frac{d\mathbf{x}}{dt} = c\,\frac{d\mathbf{x}/d\tau}{dx^0/d\tau} \tag{1.13}$$
and thus the requirement that $u$ is not space-like, $\left|\frac{dx^0}{d\tau}\right| \geq \left|\frac{d\mathbf{x}}{d\tau}\right|$, means that
$$\left|\frac{d\mathbf{x}}{dt}\right| \leq c,$$
i.e., the speed of a particle at any instant of time is less than or equal to the speed of light.

A particle moving freely without any influence from external forces is moving with constant velocity (Newton's first law). Geometrically, this means that the world-line of the particle is a straight line in $M$ with slope $\geq 1$.
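[Figure: space-time diagram of $M$ with horizontal axis $x^i$ and vertical axis $x^0 = ct$, showing a world-line with slope $\geq 1$ and a light ray with slope $= 1$.]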
A particle, or an observer, at rest in a given coordinate system is represented by a world-line parallel to the time axis, i.e., the $x^0$-axis. Let us assume that an observer $K$ is at rest at the origin of the 3-dimensional space in a coordinate system $S$. The $S$-coordinates are denoted by $x^\mu$. Next, we view the same observer from another coordinate system $S'$, which is related to the coordinate system $S$ by a Lorentz transformation $x' = \Lambda x$ of the same type as in Example 1.3 in the previous section. The world-line of $K$, in the $S'$-coordinates, is given by
$$x'^0 = \tau \cosh\theta, \quad x'^1 = -\tau \sinh\theta, \quad x'^2 = 0, \quad x'^3 = 0,$$
where we have written $x^0 = \tau$ and $x^i = 0$ for $i = 1, 2, 3$. The velocity of $K$ along the $x'^1$-coordinate axis is now
$$v' = \frac{dx'^1}{dt'} = -c \tanh\theta = -c\,\frac{e^\theta - e^{-\theta}}{e^\theta + e^{-\theta}} \leq 0, \tag{1.14}$$
where $\theta \geq 0$ is the so-called rapidity (or boost parameter). The velocity along the other coordinate axes is zero. We can interpret this result by saying that either 1) the observer $K$ is moving with speed $-v'$ along the negative $x'^1$-coordinate axis,

[Figure: coordinate systems $S$ and $S'$ with axes $x^1$ and $x'^1$; the observer $K$ moves with speed $-v'$ in the negative $x'^1$-direction.]

or that 2) the coordinate system $S'$ is moving along the positive $x^1$-coordinate axis with velocity $v = \frac{dx^1}{dt} = c \tanh\theta \geq 0$.

[Figure: coordinate systems $S$ and $S'$ with axes $x^1$ and $x'^1$; the system $S'$ moves with velocity $v$ in the positive $x^1$-direction while the observer $K$ stays at rest in $S$.]

Note that $v = -v'$.

One of the basic principles of the special theory of relativity is that there is no coordinate system which is at absolute rest. All motion with constant speed is relative. Any coordinate system moving with constant velocity is called an inertial frame (or inertial coordinate system). There is no preferred way to choose any particular inertial frame (except for computational purposes in a given problem).
Example 1.4 Another way of writing the linear transformations in Example 1.3 is
$$x'^0 = \gamma(v)\,x^0 - \gamma(v)\,\frac{v}{c}\,x^1,$$
$$x'^1 = -\gamma(v)\,\frac{v}{c}\,x^0 + \gamma(v)\,x^1,$$
$$x'^2 = x^2,$$
$$x'^3 = x^3,$$
where $\gamma(v) = \frac{1}{\sqrt{1 - v^2/c^2}}$. Sometimes one also introduces $\beta = v/c$. Note that $\cosh\theta = \gamma$ and $\sinh\theta = \beta\gamma$. Thus, the rapidity is given by $\theta = \operatorname{artanh}\beta$. The Lorentz transformation in this example is often called the standard configuration Lorentz transformation.
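As a quick consistency check between Example 1.3 and Example 1.4, the sketch below transforms the same event once with the rapidity form and once with the $\gamma(v)$ form. The function names and the sample values ($v = 0.6c$, an arbitrary event) are illustrative assumptions, and units with $c = 1$ are used by default.

```python
import math


def gamma(v, c=1.0):
    """Lorentz factor gamma(v) = 1 / sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def boost_standard(x0, x1, v, c=1.0):
    """Standard configuration transformation of (x0, x1), written with gamma(v)."""
    g = gamma(v, c)
    beta = v / c
    return g * x0 - g * beta * x1, -g * beta * x0 + g * x1


def boost_rapidity(x0, x1, theta):
    """The same transformation written with the rapidity theta, as in Example 1.3."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return x0 * ch - x1 * sh, -x0 * sh + x1 * ch


v = 0.6                       # illustrative speed in units of c
theta = math.atanh(v)         # rapidity theta = artanh(beta)

print(gamma(v), math.cosh(theta), math.sinh(theta))
print(boost_standard(3.0, 1.0, v), boost_rapidity(3.0, 1.0, theta))
```

For $v = 0.6c$ the Lorentz factor is about 1.25, so $\cosh\theta \approx 1.25$ and $\sinh\theta = \beta\gamma \approx 0.75$; the two forms of the transformation agree up to rounding.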
1.3.1 Lorentz Contraction

Let us continue with the example above. Suppose we have a stick of length $\ell$ such that one endpoint is at rest at the origin of the $x$-coordinate system and the other is at the point $(x^1, x^2, x^3) = (\ell, 0, 0)$. Expressed in the $x'$-coordinates, at time $t' = x'^0/c = 0$, the ends are at the positions $(0, 0, 0)$ and $(\ell/\cosh\theta, 0, 0)$. Thus, it seems that the stick is contracted by the factor $1/\cosh\theta$. Remember that the relative speed of the observers was $v = c \tanh\theta$. Using the hyperbolic function identities, we get
$$\ell' = \ell\sqrt{1 - v^2/c^2}. \tag{1.15}$$
This is the so-called Lorentz contraction formula (or length contraction formula). Note that in the other coordinate directions there are no Lorentz contractions, because in our example $x'^2 = x^2$ and $x'^3 = x^3$.
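The position $\ell/\cosh\theta$ of the far end is stated without the intermediate step. A short worked version, using only the transformation of Example 1.3 and the fact that the far end sits at $x^1 = \ell$ for every $x^0$, might read as follows (this derivation is added here for illustration):

```latex
\begin{align*}
x'^0 = 0 \;&\Rightarrow\; x^0\cosh\theta - \ell\sinh\theta = 0
      \;\Rightarrow\; x^0 = \ell\tanh\theta,\\
x'^1 &= -x^0\sinh\theta + \ell\cosh\theta
      = \ell\,\frac{\cosh^2\theta - \sinh^2\theta}{\cosh\theta}
      = \frac{\ell}{\cosh\theta},\\
\frac{1}{\cosh\theta} &= \sqrt{1 - \tanh^2\theta} = \sqrt{1 - v^2/c^2}.
\end{align*}
```

The first line picks out the event on the far end's world-line that is simultaneous with the origin in the $S'$-frame; the contraction comes from this change of simultaneity, not from any physical compression of the stick.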
1.3.2 Time Dilation

Suppose we have synchronized clocks distributed along the $x^1$-coordinate axis in the $S$-frame and the observer $K'$ is moving with constant speed $v$ along the positive $x^1$-coordinate axis. The coordinates in the rest frame $S'$ of the observer $K'$ are denoted by $x'$. At time $t = t' = 0$, the observer $K'$ passes the origin of the $S$-frame. At some later time $t = x^0/c$, we have
$$x'^0 = x^0 \cosh\theta - x^1 \sinh\theta, \tag{1.16}$$
$$0 = x'^1 = -x^0 \sinh\theta + x^1 \cosh\theta, \tag{1.17}$$
for an event at the origin of $K'$. Solving for $x^1$ from the second equation and inserting into the first, we get
$$t' = x'^0/c = \frac{x^0/c}{\cosh\theta} = t\sqrt{1 - v^2/c^2}. \tag{1.18}$$
Thus, the moving clocks (with respect to the $S$-frame) seem to slow down. Another way of writing Eq. (1.18) is
$$t = \frac{t'}{\sqrt{1 - v^2/c^2}}. \tag{1.19}$$
According to Eq. (1.19), an observer $K$ in the $S$-frame measures a longer time interval than the observer $K'$ measures for two events which occur at the same spatial point in $S'$. Equation (1.19) is the so-called time dilation formula.
1.3.3 Relativistic Addition of Velocities

As we have seen above, a Lorentz transformation with a hyperbolic 'angle' $\theta$ in the $x^0x^1$-plane can be interpreted as giving a boost of velocity to a particle, or an observer, by the amount $v = c \tanh\theta$. Suppose that we perform still another (similar) Lorentz transformation by an 'angle' $\theta'$, leading to new coordinates $x'' = \Lambda^{(01)}(\theta')x' = \Lambda^{(01)}(\theta')\Lambda^{(01)}(\theta)x$. By simple multiplication of matrices, we have $\Lambda^{(01)}(\theta')\Lambda^{(01)}(\theta) = \Lambda^{(01)}(\theta'')$, where $\theta'' = \theta + \theta'$. Using the hyperbolic identity
$$\tanh(x + y) = \frac{\tanh x + \tanh y}{1 + \tanh x \tanh y}, \tag{1.20}$$
we get
$$\frac{v''}{c} = \tanh\theta'' = \frac{\tanh\theta + \tanh\theta'}{1 + \tanh\theta\tanh\theta'} = \frac{\frac{v}{c} + \frac{v'}{c}}{1 + \frac{v}{c}\cdot\frac{v'}{c}}, \tag{1.21}$$
i.e.,
$$v'' = \frac{v + v'}{1 + vv'/c^2}. \tag{1.22}$$
Thus, an observer moving with velocity $v'$ along the $x'^1$-axis, when the coordinate system $S'$ is moving with velocity $v$ relative to the coordinate system $S$, is not moving with velocity $v + v'$ relative to $S$, but with the smaller velocity $v''$. The formula (1.22) is called the relativistic addition of velocities. Note especially that if $v \to c$ and/or $v' \to c$, then $v'' \to c$. In the non-relativistic limit $v, v' \ll c$, the classical formula $v'' = v + v'$ is of course regained.

1.3.4 The Michelson–Morley Experiment and the Invariance of the Speed of Light

In an inertial coordinate system $S$, a light ray is represented by a world-line $x(\tau)$ with slope $= 1$. The set of light rays originating from a given point in $M$ forms a light-cone. For example, if we flash a light source at the origin of the coordinate system $S$ at time $x^0 = 0$, then the light rays form the cone $x^2 = 0$, i.e., $x^0 = \sqrt{(x^1)^2 + (x^2)^2 + (x^3)^2}$.

Now, let us perform a Lorentz transformation $\Lambda$, taking us to a new inertial coordinate system $S'$, $x' = \Lambda x$. Since the Minkowski metric is invariant under Lorentz transformations, the points on the cone $x^2 = 0$ correspond in a 1-1 manner to points $x'^2 = 0$, when expressed in the new coordinates.
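The invariance of the light-cone has a simple counterpart in the addition formula (1.22): combining the speed $c$ with any other speed gives $c$ again. The sketch below, with illustrative function names and sample speeds chosen here, evaluates (1.22) directly and via the rapidities, and looks at the two limits mentioned in the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def add_velocities(v, v_prime, c=C):
    """Relativistic addition of collinear velocities, Eq. (1.22)."""
    return (v + v_prime) / (1.0 + v * v_prime / c ** 2)


def add_via_rapidity(v, v_prime, c=C):
    """Same result from theta'' = theta + theta'; requires |v|, |v'| < c."""
    return c * math.tanh(math.atanh(v / c) + math.atanh(v_prime / c))


# Two boosts of c/2 each give 0.8 c rather than c.
print(add_velocities(0.5 * C, 0.5 * C) / C)

# A light ray keeps the speed c, whatever the speed of the frame.
print(add_velocities(C, 0.9 * C) / C)

# Everyday speeds: the classical sum is recovered to very high accuracy.
print(add_velocities(30.0, 40.0))
```

The rapidity version is only defined for speeds strictly below $c$, since $\operatorname{artanh}$ diverges at $\pm 1$; the rational form (1.22) remains usable at $v = c$.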
In particular, the slope of the world-lines in the $x'$-coordinates is again equal to 1. Thus, the speed of light in any inertial coordinate system is equal to the constant $c$. Historically, this was the starting point when Albert Einstein developed the special theory of relativity in 1905.

Already in 1887, A. A. Michelson and E. W. Morley made an important investigation, which laid the experimental foundation of relativity theory. A simplified version of the Michelson interferometer is described in Figure 1.1. According to the principles of classical mechanics, the measured speed of light (or of any other object, for that matter) depends on the velocity of the observer. Suppose that the velocity of a light front traveling along the positive $x^1$-axis is $c$ in some coordinate system $S$. Then, one would expect that the speed of light measured in a coordinate system $S'$ moving along the $x^1$-axis with speed $v$ would be $c \pm v$, depending on whether $S'$ is moving in the negative or the positive $x^1$-direction.

Before Michelson and Morley, it was assumed that there is some preferred coordinate system $S$, where the speed of light is precisely $c$. In other coordinate systems, one would get different values for the speed of light. One said that light (and all electromagnetic radiation) is moving in the 'ether' with speed $c$. For example, assuming that the Sun is at rest in the 'ether' frame, the movement of Earth along its orbit and the Earth's rotation would certainly affect the measured value of the speed of light.

[Figure 1.1: schematic of the interferometer, showing the incoming beam of light, the partially reflecting mirror $M$, the mirror $M_1$ at distance $\ell_1$, the mirror $M_2$ at distance $\ell_2$, and the observation point $P$.]

Figure 1.1: The Michelson–Morley experiment (simplified). A beam of light is split at the partially reflecting mirror $M$. A part of the beam is then reflected at mirror $M_1$ and the other part at mirror $M_2$. The beam from $M_1$ is reflected at $M$ and the beam reflected at $M_2$ will go through $M$. The two beams are again combined and an interference pattern is observed at $P$.

In the Michelson–Morley experiment, the different speeds of light (perpendicular to the motion of Earth or along it) would manifest as phase shifts between the light rays 1 and 2. The time for light to travel from $M$ to $M_1$ and back is
$$t_1 = \frac{\ell_1}{c - v} + \frac{\ell_1}{c + v} = \frac{2\ell_1}{c} \cdot \frac{1}{1 - v^2/c^2},$$
assuming that the line $MM_1$ is along the movement of the laboratory (relative to the 'ether'). Orthogonal to $MM_1$, from $M$ to $M_2$ and back, we get a different result. We have to take into account that in the time $t_2$, which light needs to travel $MM_2M$, the mirror $M$ has traveled a distance $vt_2$. By Pythagoras' theorem, we have
$$ct_2 = 2\left[\ell_2^2 + \left(\frac{vt_2}{2}\right)^2\right]^{1/2}.$$
Solving for $t_2$, we get
$$t_2 = \frac{2\ell_2}{\sqrt{c^2 - v^2}} = \frac{2\ell_2}{c} \cdot \frac{1}{\sqrt{1 - v^2/c^2}}.$$
The difference in transit times is
$$\Delta t = t_2 - t_1 = \frac{2}{c}\left(\frac{\ell_2}{\sqrt{1 - v^2/c^2}} - \frac{\ell_1}{1 - v^2/c^2}\right).$$
Now, let us rotate the whole instrument such that after the rotation $MM_1$ is perpendicular to the motion and $MM_2$ is parallel. Making the same computations with $\ell_1$ and $\ell_2$ interchanged, we get
$$\Delta t' = t_2' - t_1' = \frac{2}{c}\left(\frac{\ell_2}{1 - v^2/c^2} - \frac{\ell_1}{\sqrt{1 - v^2/c^2}}\right).$$
Finally,
$$\Delta t' - \Delta t = \frac{2}{c}\left(\frac{\ell_2 + \ell_1}{1 - v^2/c^2} - \frac{\ell_2 + \ell_1}{\sqrt{1 - v^2/c^2}}\right) \simeq \frac{\ell_1 + \ell_2}{c} \cdot \frac{v^2}{c^2},$$
where, since $v/c \ll 1$, we have expanded in powers of $v/c$ and taken only the leading term.

In the original Michelson–Morley experiment, $\ell_1 = \ell_2$ and $\ell_1 + \ell_2 = 22$ m. Earth's orbital speed around the Sun is about 30 km/s (the rotational speed is much smaller) and so $\Delta t' - \Delta t \simeq 0.73 \cdot 10^{-15}$ s. Although this is a very short time, the relative phase shift $\Delta N = (\Delta t' - \Delta t) \cdot \frac{c}{\lambda}$ is an observable quantity. For the visible light used, the wavelength is $\lambda = 5.5 \cdot 10^{-7}$ m and the relative phase shift is $\Delta N \simeq 0.4$. The phase shift would show up as a displacement of the interference fringes when the instrument is rotated. No such displacement was observed. This experiment has been repeated again and again with different variations in the setup, but the same conclusion remains: The speed of light is not affected by the motion of the observer or the source.
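The numbers quoted for the original experiment can be reproduced from the formulas of this section. The sketch below is only an arithmetic check; the function name `michelson_morley` is an assumption made here, and the input values are the ones stated in the text (arm lengths of 11 m each, $v = 30$ km/s, $\lambda = 5.5 \cdot 10^{-7}$ m).

```python
import math

C = 299_792_458.0  # speed of light in m/s


def michelson_morley(l1, l2, v, wavelength, c=C):
    """Return (exact, leading-order) values of dt' - dt and the phase shift dN."""
    beta_sq = (v / c) ** 2
    total = l1 + l2
    # Exact expression from the ether-theory calculation.
    exact = (2.0 / c) * total * (1.0 / (1.0 - beta_sq) - 1.0 / math.sqrt(1.0 - beta_sq))
    # Leading term of the expansion in v/c.
    approx = total / c * beta_sq
    # Relative phase shift dN = (dt' - dt) * c / lambda.
    return exact, approx, approx * c / wavelength


exact, approx, shift = michelson_morley(11.0, 11.0, 3.0e4, 5.5e-7)
print(exact, approx, shift)
```

The leading-order time difference comes out at roughly $0.73 \cdot 10^{-15}$ s and the phase shift at roughly 0.4, in agreement with the values given above; the exact and leading-order expressions differ only at relative order $v^2/c^2$.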
1.3.5 The Relativistic Doppler Effect

Consider radiation of light in the direction of the positive $x^1$-axis in the coordinate frame $S$; we shall think of the radiation as coming from a fixed source in the $S$-frame. A wave with wavelength $\lambda$ and frequency $\nu$ (which are related by $c = \lambda\nu$) is written as
$$\sin 2\pi\left(\frac{x^1}{\lambda} - \nu t\right).$$
An observer $K'$ moving with velocity $v$ away from the source, along the positive $x^1$-axis, observes the same sinusoidal wave; in his/her coordinates it is written as
$$\sin\left[2\pi\left(\frac{x'^1}{\lambda'} - \nu' t'\right)\right].$$
Since the speed of light is constant, we have $\lambda'\nu' = c = \lambda\nu$. On the other hand, $ct' = ct \cosh\theta - x^1 \sinh\theta$ and $x'^1 = -ct \sinh\theta + x^1 \cosh\theta$ with $\tanh\theta = v/c$. Because the two expressions above represent the same wave, we get the relation
$$\nu' = \nu\,\frac{1 - v/c}{\sqrt{1 - v^2/c^2}} = \nu\sqrt{\frac{c - v}{c + v}}. \tag{1.23}$$
Thus, the observer moving away from the source sees a redshift in frequency of light. Note that since the numerator contains the term v/c linear in the velocity, an observer moving in the opposite direction would see a corresponding blueshift. Only the relative speed matters; so an observer at rest in some inertial frame would observe a redshift when looking at a star moving away. Equation (1.23) is usually called the relativistic Doppler formula.
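The step from "the two expressions represent the same wave" to Eq. (1.23) can be spelled out. The following worked computation is added for illustration; it writes the phase as $2\pi(x^1/\lambda - \nu t)$ and uses only $\lambda\nu = \lambda'\nu' = c$ and the transformation with $\tanh\theta = v/c$.

```latex
\begin{align*}
2\pi\left(\frac{x'^1}{\lambda'} - \nu' t'\right)
  &= -\frac{2\pi\nu'}{c}\left(ct' - x'^1\right)
  = -\frac{2\pi\nu'}{c}\,(\cosh\theta + \sinh\theta)\left(ct - x^1\right)
  = 2\pi\,\nu' e^{\theta}\left(\frac{x^1}{c} - t\right),\\
2\pi\left(\frac{x^1}{\lambda} - \nu t\right)
  &= 2\pi\,\nu\left(\frac{x^1}{c} - t\right)
  \quad\Rightarrow\quad \nu = \nu' e^{\theta},\\
\nu' &= \nu\, e^{-\theta} = \nu\,(\cosh\theta - \sinh\theta)
      = \nu\,\gamma\left(1 - \frac{v}{c}\right)
      = \nu\,\frac{1 - v/c}{\sqrt{1 - v^2/c^2}}.
\end{align*}
```

In terms of the rapidity the Doppler factor is simply $e^{-\theta}$, so successive Doppler shifts along the same line multiply in the same way that rapidities add.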
1.4 The Proper Time and the Twin Paradox
By definition, the world-line $x(\tau)$ of an observer $K$ is everywhere time-like, i.e., the tangent vector $x'(\tau)$ is a time-like vector for any $\tau$. We can normalize the evolution parameter $\tau$ such that $x'(\tau)^2 = 1$. Let us define
$$s(\tau) = \int_{\tau_0}^{\tau} \sqrt{x'(u)^2}\,du, \tag{1.24}$$
where $\tau_0$ is an arbitrarily fixed initial point. As a function of $\tau$, $s$ is monotonically increasing. We shall use $s$ as the evolution parameter instead of $\tau$. By the chain rule, we have
$$\frac{dx}{ds} = \frac{d\tau}{ds}\frac{dx}{d\tau}. \tag{1.25}$$
Thus, we get
$$\left(\frac{dx}{ds}\right)^2 = \left(\frac{ds}{d\tau}\right)^{-2}\left(\frac{dx}{d\tau}\right)^2 = 1. \tag{1.26}$$
The new parameter $s$ is called the proper time of the observer $K$. This is the time, modulo a factor $c$, shown by a clock which $K$ carries on his/her wrist. Namely, let us first assume that $K$ is at rest in an inertial coordinate system $S$. Then his/her world-line, properly normalized, is simply $x(s) = (s, 0, 0, 0)$, so that $t = x^0/c = s/c$. In some other inertial coordinate system $S'$, the same world-line is written as $y = y(s)$, but the inertial coordinates are related by a Lorentz transformation $\Lambda$ such that $y = \Lambda x$. The lengths of vectors are not affected by Lorentz transformations; therefore $y(s) = \Lambda x(s)$ is also a proper time parameterization. It follows that the length of a segment of the world-line, measured using the proper time, does not depend on the coordinates used.
Example 1.5 Consider the world-line $x(\tau) = (c\tau, v^1\tau, v^2\tau, v^3\tau)$ of an observer $K$ moving with constant velocity $\mathbf{v} = (v^1, v^2, v^3)$ in some inertial coordinate system. According to the definition (1.24), the proper time corresponding to the parameter interval $[0, t]$ is equal to
$$s[0, t] = \int_0^t \sqrt{c^2 - (v^1)^2 - (v^2)^2 - (v^3)^2}\,d\tau = ct\sqrt{1 - v^2/c^2}.$$
This is again the time dilation formula, which we derived above using Lorentz transformations.
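The integral in the definition (1.24) can also be evaluated numerically, which is convenient when the velocity is not constant. The sketch below uses a simple midpoint rule; the function `proper_time_length`, the two sample velocity profiles, and the choice of units with $c = 1$ are illustrative assumptions, not part of the notes.

```python
import math


def proper_time_length(velocity_at, t_end, c=1.0, steps=10_000):
    """Midpoint-rule approximation of s = integral of sqrt(c^2 - |v(tau)|^2) over [0, t_end]."""
    h = t_end / steps
    total = 0.0
    for k in range(steps):
        v = velocity_at((k + 0.5) * h)   # 3-velocity at the midpoint of the step
        total += math.sqrt(c * c - sum(comp * comp for comp in v)) * h
    return total


def constant_velocity(tau):
    """World-line of Example 1.5 with v = (0.6, 0, 0) in units of c."""
    return (0.6, 0.0, 0.0)


def reversing_velocity(tau):
    """Hypothetical world-line that moves out at 0.6 c and turns back at tau = 5."""
    return (0.6, 0.0, 0.0) if tau < 5.0 else (-0.6, 0.0, 0.0)


s_constant = proper_time_length(constant_velocity, 10.0)
s_reversing = proper_time_length(reversing_velocity, 10.0)
closed_form = 10.0 * math.sqrt(1.0 - 0.6 ** 2)   # c t sqrt(1 - v^2/c^2) with c = 1

print(s_constant, s_reversing, closed_form)
```

All three numbers should agree (about 8 in these units), because the integrand depends only on the speed and not on the direction of motion.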
Let us now discuss the so-called twin paradox. Suppose an observer $K'$ is moving relative to an observer $K$ (the latter being at rest at the origin of his/her coordinate system) in the following way: First $K'$ is moving with constant speed $v$ along the positive $x^1$-axis during a time $t$ measured using clocks at rest in the coordinate system $S$. Then, the motion is instantly reversed and $K'$ returns to the origin $x = 0$ in the same time $t$. So the time used, according to $K$, is $2t$. Let us compute the time interval according to the watch carried by $K'$. According to the time dilation calculation, the $S'$-clock is retarded by a factor $\sqrt{1 - v^2/c^2}$ when $K'$ is moving from $x^1 = 0$ to $x^1 = vt$. The same is true on the journey back to $x^1 = 0$. Thus, the total time spent according to $K'$ is $2t\sqrt{1 - v^2/c^2}$. Now comes the paradox: According to the principles of relativity theory only the relative speeds matter. So we could perform the same computation with the roles of $K$ and $K'$ interchanged. The result would be that the $S$-clock would show the retarded time $2t\sqrt{1 - v^2/c^2}$! But the clocks can be compared after the journey at the common origin $x = x' = 0$!

The resolution to the apparent paradox is the following: In the special theory of relativity, one has to do the computations in inertial coordinate systems. In the above example, there is no inertial coordinate system such that $K'$ would be at rest in that system. Because of the reversal of the direction of motion at time $t$, there is an infinite acceleration at that instant of time. Assuming that $K$ is not accelerated, we can use his/her coordinate system and we would arrive at the (correct) conclusion that the $S'$-clock is retarded.

The same result can be obtained without any ambiguities, using the proper time method. The proper time length of the world-line traveled by $K'$ is $2ct\sqrt{1 - v^2/c^2}$. (This can be computed in any inertial frame, but most conveniently in the rest frame of $K$.)

1.5 Transformations of Velocities and Accelerations
Let $S$ and $S'$ be two inertial coordinate systems. Assume that $S'$ is moving with constant velocity $v$ along the positive $x^1$-axis of $S$. Let us study a particle that moves parallel to the $x^1$-axis. The motion of the particle is described by the following equations:
$$x^1 = x^1(t) \text{ in } S \quad \text{and} \quad x'^1 = x'^1(t') \text{ in } S'.$$
The relation between $x^1$, $t$, $x'^1$, and $t'$ is, of course, given by the standard configuration Lorentz transformation
$$t' = \gamma(v)\left(t - \frac{vx^1}{c^2}\right), \tag{1.27}$$
$$x'^1 = \gamma(v)\left(x^1 - vt\right), \tag{1.28}$$
$$x'^2 = x^2,$$
$$x'^3 = x^3,$$
where $\gamma(v) = \frac{1}{\sqrt{1 - v^2/c^2}}$.

The velocity of the particle relative to $S'$ is
$$u'^1 = \frac{dx'^1}{dt'} = \frac{dx'^1}{dt}\frac{dt}{dt'} = \frac{dx'^1}{dt}\cdot\frac{1}{\frac{dt'}{dt}}. \tag{1.29}$$
We obtain $\frac{dt'}{dt}$ and $\frac{dx'^1}{dt}$ by differentiating Eqs. (1.27) and (1.28) with respect to $t$:
$$\frac{dt'}{dt} = \gamma(v)\left(1 - \frac{v}{c^2}\frac{dx^1}{dt}\right) = \gamma(v)\left(1 - \frac{u^1 v}{c^2}\right), \tag{1.30}$$
$$\frac{dx'^1}{dt} = \gamma(v)\left(\frac{dx^1}{dt} - v\right) = \gamma(v)\left(u^1 - v\right), \tag{1.31}$$
where we have introduced $u^1 = \frac{dx^1}{dt}$, which is the velocity of the particle relative to $S$. Inserting Eqs. (1.31) and (1.30) into Eq. (1.29), we obtain the formula for transformation of velocity
$$u'^1 = \frac{u^1 - v}{1 - \frac{u^1 v}{c^2}}. \tag{1.32}$$
Compare this result with the formula for relativistic addition of velocities, Eq. (1.22).
Exercise 1.1 Express the velocities
$$u'^2 = \frac{dx'^2}{dt'} \quad \text{and} \quad u'^3 = \frac{dx'^3}{dt'}$$
in the unprimed velocities $u^1$, $u^2$, $u^3$, and $v$.
The acceleration of the particle relative to S′ is

$$a'^1 = \frac{du'^1}{dt'} = \frac{du'^1}{dt}\frac{dt}{dt'} = \frac{du'^1}{dt}\,\frac{1}{dt'/dt}. \tag{1.33}$$

By differentiating Eq. (1.32) with respect to $t$, we obtain

$$\frac{du'^1}{dt} = \frac{a^1}{1 - \frac{u^1 v}{c^2}} + \frac{\left(u^1 - v\right)\frac{v}{c^2}}{\left(1 - \frac{u^1 v}{c^2}\right)^2}\,a^1 = \frac{1 - \frac{v^2}{c^2}}{\left(1 - \frac{u^1 v}{c^2}\right)^2}\,a^1, \tag{1.34}$$

where $a^1 = \frac{du^1}{dt} = \frac{d^2x^1}{dt^2}$ is the acceleration of the particle relative to S. Inserting Eqs. (1.34) and (1.30) into Eq. (1.33), we obtain

$$a'^1 = \frac{a^1}{\gamma(v)^3\left(1 - \frac{u^1 v}{c^2}\right)^3}, \tag{1.35}$$

which is the formula for transformation of acceleration.
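Equation (1.35) can be checked numerically by differentiating Eq. (1.32) with a central finite difference and dividing by $dt'/dt$ from Eq. (1.30). The sketch below does this in units with $c = 1$; the function names and the step size `h` are assumptions of this example.

```python
import math


def gamma(v, c=1.0):
    """Lorentz factor gamma(v) = 1 / sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def transform_velocity(u1, v, c=1.0):
    """Velocity along the x^1-axis as seen from S', Eq. (1.32)."""
    return (u1 - v) / (1.0 - u1 * v / c**2)


def transform_acceleration(a1, u1, v, c=1.0):
    """Acceleration along the x^1-axis as seen from S', Eq. (1.35)."""
    return a1 / (gamma(v, c) ** 3 * (1.0 - u1 * v / c**2) ** 3)


def acceleration_by_finite_difference(a1, u1, v, c=1.0, h=1e-6):
    """Approximate a'^1 = (du'^1/dt) / (dt'/dt), Eq. (1.33).

    During the S-time interval from -h to +h the velocity u^1 changes
    from u1 - a1*h to u1 + a1*h (to first order in h).
    """
    du_prime = (transform_velocity(u1 + a1 * h, v, c)
                - transform_velocity(u1 - a1 * h, v, c))
    dt_prime = 2.0 * h * gamma(v, c) * (1.0 - u1 * v / c**2)  # Eq. (1.30)
    return du_prime / dt_prime
```

In the special case $u^1 = v$, where the particle is momentarily at rest in S′, Eq. (1.35) reduces to $a'^1 = \gamma(v)^3 a^1$.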
Exercise 1.2 Express the accelerations

$$a'^2 = \frac{du'^2}{dt'} = \frac{d^2x'^2}{dt'^2} \quad \text{and} \quad a'^3 = \frac{du'^3}{dt'} = \frac{d^2x'^3}{dt'^2}$$

in the unprimed accelerations and velocities $a^1$, $a^2$, $a^3$, $u^1$, $u^2$, $u^3$, and $v$.
1.6 Energy, Momentum, and Mass in Relativity Theory
In order to save the principle of conservation of energy and momentum in relativity theory, these concepts have to be defined in a different manner as compared to classical Newtonian mechanics. For example, the formula $E_{\mathrm{kin}} = \frac{1}{2}m_0 v^2$ for the kinetic energy or the formula $\mathbf{p} = m_0\mathbf{v}$ for the momentum cannot be maintained. The reason is that the coordinate transformations in relativity theory mix the time and space coordinates in a non-trivial manner. If we try to keep the above formulas, then the conservation of energy and momentum in one coordinate system would contradict the conservation law in some other coordinate system, moving with constant speed relative to the first one.

In order to build equations that are invariant under Lorentz transformations, the basic constituents had better transform in a uniform way. That is, we assume that the building blocks are vectors transforming according to $v'^\mu = \Lambda^\mu{}_\nu v^\nu$, or tensors $T'^{\mu\nu} = \Lambda^\mu{}_{\mu'}\Lambda^\nu{}_{\nu'}T^{\mu'\nu'}$, or tensors of higher rank. A basic postulate in relativity theory is that the energy and momentum of a particle are combined into a 4-component vector $p$, called the 4-momentum, transforming according to $p' = \Lambda p$ under Lorentz transformations. The last three components are the components of the ordinary 3-momentum $\mathbf{p}$, whereas the first component $p^0 = E/c$ is the energy divided by the speed of light.

Let $x(s)$ be the world-line of a point particle in the proper time parameterization. We postulate that the 4-momentum of the particle is $p = m_0 c\,\dot{x}(s)$ at proper time $s$, where $\dot{x}(s) = \frac{dx}{ds}(s)$. Here $m_0$ is called the rest mass of the point particle (or of an extended object; in the latter case the world-line is assumed to describe the center-of-mass motion). The rest mass is assumed to be a true constant; its value is independent of the coordinate system, which is used to measure $m_0$.

Our postulate is consistent with the requirement that $p$ transforms like a 4-vector; the derivative $\dot{x}(s)$ transforms like a vector under coordinate transformations. With these postulates, the conservation of energy and momentum is consistent with coordinate transformations. For example, if we consider a collision of two objects with 4-momenta $p$ and $q$ (in a coordinate system S), then the conservation of momentum is expressed as
$$p_{\mathrm{in}} + q_{\mathrm{in}} = p_{\mathrm{out}} + q_{\mathrm{out}}. \tag{1.36}$$

In some other inertial frame S′, the momenta are measured as $p'$ and $q'$. But both sides of the conservation equation are linear in the individual momenta and thus it is equivalent to

$$p'_{\mathrm{in}} + q'_{\mathrm{in}} = p'_{\mathrm{out}} + q'_{\mathrm{out}}. \tag{1.37}$$

This can be generalized to any number of particles with corresponding momenta before and after the reaction, i.e.,

$$P_{\mathrm{in}} = \sum_{i=1}^{N} p^i_{\mathrm{in}} = \sum_{j=1}^{M} p^j_{\mathrm{out}} = P_{\mathrm{out}}, \tag{1.38}$$

where $N$ and $M$ are the numbers of particles before and after the reaction, respectively. For the cases when $N = 1$ and $M = 2$ ($1 \to 2$) and $N = 2$ and $M = 1$ ($2 \to 1$), we have to be careful so that $P^2$ is the same before and after ($P^2_{\mathrm{in}} = P^2_{\mathrm{out}}$).$^2$ Note that $P^2$ is invariant for any $P = \sum_{i=1}^{N} p^i$.

Suppose that the particle with rest mass $m_0$ is at rest in the coordinate system S′. The 4-momentum is $p' = (m_0 c, 0, 0, 0)$. In a coordinate system S moving with the velocity $\mathbf{v}$ with respect to S′, the 4-momentum is

$$p = \frac{m_0 c}{\sqrt{1 - v^2/c^2}}\left(1, v^1/c, v^2/c, v^3/c\right). \tag{1.39}$$

$^2$For any two 4-vectors $A$ and $B$, the Minkowskian inner product $A \cdot B = A^\mu B_\mu$ is invariant under Lorentz transformations. Especially, $A^2 = A^\mu A_\mu$ is invariant. This is useful in many applications.
We can also introduce another 4-vector $V$, the 4-velocity, according to

$$V = \frac{c}{\sqrt{1 - v^2/c^2}}\left(1, v^1/c, v^2/c, v^3/c\right) = \gamma(v)(c, \mathbf{v}), \tag{1.40}$$

where $\mathbf{v} = (v^1, v^2, v^3)$ is the velocity and $v$ is the absolute value of $\mathbf{v}$, the speed. Thus, we can write

$$p = m_0 V. \tag{1.41}$$

The energy, which in the rest frame is $E' = m_0 c^2$, is equal to $E = \frac{m_0 c^2}{\sqrt{1 - v^2/c^2}}$ in the moving frame. Expanding in powers of $v/c$, we obtain

$$E = m_0 c^2 + \frac{1}{2}m_0 v^2 + \frac{3}{8}m_0 v^2\frac{v^2}{c^2} + \cdots. \tag{1.42}$$

For velocities much smaller than the speed of light, i.e., $v \ll c$, the kinetic energy $T = E - m_0 c^2$ is approximately equal to its classical value $E_{\mathrm{kin}} = \frac{1}{2}m_0 v^2$.

Note in particular that the 3-momentum $\mathbf{p}$ is not equal to $m_0\mathbf{v}$, but is given by

$$\mathbf{p} = \frac{m_0\mathbf{v}}{\sqrt{1 - v^2/c^2}}. \tag{1.43}$$

Sometimes one wants to preserve the classical formula $\mathbf{p} = m\mathbf{v}$. This can be done if one defines the mass as

$$m = \frac{m_0}{\sqrt{1 - v^2/c^2}}. \tag{1.44}$$

Thus, the moving mass is larger than the rest mass. Equation (1.44) is the so called relativistic mass formula. The formula for the energy $E$ together with Eq. (1.44) now leads to one of the most famous equations in physics

$$E = mc^2, \tag{1.45}$$

which says that energy and mass are equivalent. We can now write

$$p = (p^0, \mathbf{p}), \tag{1.46}$$

where $p^0 = E/c = mc$. Using that $p^2$ is an invariant, we have

$$p^2 = \left(\frac{E}{c}\right)^2 - \mathbf{p}^2 = (m_0 c)^2 - \mathbf{0}^2 = p'^2, \tag{1.47}$$

i.e.,

$$E^2 = m_0^2 c^4 + (\mathbf{p}c)^2. \tag{1.48}$$

For massless particles ($m_0 = 0$), we find

$$E = |\mathbf{p}|c. \tag{1.49}$$
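The relations (1.39), (1.42), and (1.48) are easy to verify numerically. The following sketch builds the 4-momentum of a particle from its rest mass and velocity, evaluates the invariant $p^2$, and compares the exact kinetic energy with the first two terms of the expansion (1.42). The helper names and the default $c = 1$ are assumptions made for this illustration.

```python
import math


def four_momentum(m0, v, c=1.0):
    """4-momentum p = m0 * gamma * (c, v^1, v^2, v^3), Eqs. (1.39)-(1.41)."""
    speed_squared = sum(vi * vi for vi in v)
    g = 1.0 / math.sqrt(1.0 - speed_squared / c**2)
    return (m0 * g * c,) + tuple(m0 * g * vi for vi in v)


def minkowski_square(p):
    """Invariant square p^2 = (p^0)^2 - |p|^2."""
    return p[0] ** 2 - sum(pi * pi for pi in p[1:])


def energy(p, c=1.0):
    """Energy E = p^0 * c."""
    return p[0] * c


def kinetic_energy_series(m0, speed, c=1.0):
    """First two kinetic terms of Eq. (1.42): m0 v^2 / 2 + 3 m0 v^4 / (8 c^2)."""
    return 0.5 * m0 * speed**2 + 0.375 * m0 * speed**4 / c**2
```

For example, `four_momentum(2.0, (0.6, 0.0, 0.0))` has $p^2 = (m_0 c)^2$ regardless of the chosen velocity, which is the content of Eq. (1.47).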
1.7 The Spinorial Representation of Lorentz Transformations
Lorentz transformations, which ordinarily are given by real $4 \times 4$ matrices acting on real 4-component vectors in the Minkowski space, can conveniently be parameterized as complex $2 \times 2$ matrices. As a first step, we write a real vector $x \in M$ as a Hermitian complex $2 \times 2$ matrix,

$$x = \begin{pmatrix} x^0 + x^3 & x^1 - ix^2 \\ x^1 + ix^2 & x^0 - x^3 \end{pmatrix}. \tag{1.50}$$

Note that any Hermitian complex $2 \times 2$ matrix can be written in this way. The determinant of the matrix is $\det x = (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 = x^2$.

Next, denote by $SL(2, \mathbb{C})$ the set of all complex $2 \times 2$ matrices with determinant $= 1$. Since $\det(ab) = \det a \det b$, the product of two elements in $SL(2, \mathbb{C})$ is again an element of $SL(2, \mathbb{C})$. Similarly, the inverse $a^{-1}$ is in $SL(2, \mathbb{C})$ for any $a$ in $SL(2, \mathbb{C})$. The determinant of the unit matrix is equal to 1. Thus, $SL(2, \mathbb{C})$ is a group under multiplication of matrices.

Let $a \in SL(2, \mathbb{C})$ and $x$ be a Hermitian complex $2 \times 2$ matrix. Define $x' = axa^*$. Now, $x'^* = (a^*)^* x^* a^* = axa^*$ and so $x'$ is Hermitian. It represents a vector in the Minkowski space. Furthermore, $\det x' = \det(axa^*) = \det a \det x \det a^* = \det x$. This gives $x'^2 = x^2$. In other words, $x \mapsto x'$ is a Lorentz transformation. Let us denote this Lorentz transformation by $L(a)$.

Because of $L(ab)x = (ab)x(ab)^* = a(bxb^*)a^* = L(a)(bxb^*) = L(a)L(b)x$, we have

$$L(ab) = L(a)L(b) \quad \forall\, a, b \in SL(2, \mathbb{C}). \tag{1.51}$$

In the group theory language, this means that the mapping $L : SL(2, \mathbb{C}) \to O(3, 1)$ is a homomorphism. The group $O(3, 1)$ is not connected. First, we can split it into two components: The set $SO(3, 1)$, consisting of Lorentz transformations with determinant $= 1$, and the complementary set, consisting of transformations with determinant $= -1$. The former one is a group with respect to multiplication of matrices. It can be further split into the subset $SO_0(3, 1)$, consisting of Lorentz transformations $\Lambda$ with $\Lambda^{00} > 0$, and its complement. Again, the former is a group under multiplication of matrices.
Proof: Since a Lorentz transformation $\Lambda$ preserves the length of any vector, in particular the length of $(1, 0, 0, 0)$, we get

$$\left(\Lambda^{00}\right)^2 - \left(\Lambda^{10}\right)^2 - \left(\Lambda^{20}\right)^2 - \left(\Lambda^{30}\right)^2 = 1.$$

This implies $\left(\Lambda^{00}\right)^2 \geq 1$, i.e., $\Lambda^{00} \geq 1$ or $\Lambda^{00} \leq -1$. Thus, $SO(3, 1)$ splits into two disconnected components. The Lorentz transformations $\Lambda$ with $\Lambda^{00} \geq 1$ have the property that $x'^0 = \Lambda^{00}x^0 - \Lambda^{0i}x^i > 0$ if $x$ is time-like and $x^0 > 0$. It means that $\Lambda$ preserves the direction of time. Similarly, if $\Lambda^{00} \leq -1$, then one sees that $\Lambda$ reverses the direction of time. A product of two transformations preserving the direction of time preserves the direction of time. It follows that $SO_0(3, 1)$ is a group under composition of Lorentz transformations.

We shall state without proof: The Lorentz transformations $L(a)$ all belong to $SO_0(3, 1)$. Any element $\Lambda \in SO_0(3, 1)$ can be represented as $\Lambda = L(a)$ for some $a \in SL(2, \mathbb{C})$, which is uniquely defined up to the sign, $\pm a$. This means that the mapping $L : SL(2, \mathbb{C}) \to SO_0(3, 1)$ is a 2-1 surjective homomorphism.
The subgroup $SU(2) \subset SL(2, \mathbb{C})$, consisting of unitary matrices, represents rotations. Namely, if $a \in SU(2)$, then $a^{-1} = a^*$ and $\operatorname{tr} x' = \operatorname{tr}(axa^*) = \operatorname{tr}(axa^{-1}) = \operatorname{tr} x$. But $\operatorname{tr} x = 2x^0$, so $x'^0 = x^0$. Since under Lorentz transformations the Minkowski square is preserved, $x'^2 = x^2$, it now follows that $\mathbf{x}'^2 = \mathbf{x}^2$ and $a$ is indeed a rotation.

Because of the constraint $\det a = 1$, the elements of $SL(2, \mathbb{C})$ are parameterized by three independent complex coordinates, i.e., by six real parameters. Three of these real parameters correspond to the rotation parameters (e.g. the Euler angles) and the other three parameters correspond to velocity boosts in different coordinate directions. For example, the boosts in the $x^3$-direction are represented by the matrices

$$a(v) = \begin{pmatrix} \lambda & 0 \\ 0 & 1/\lambda \end{pmatrix} \quad \text{with} \quad \lambda = e^{-\theta/2}, \tag{1.52}$$

since the components transform as $x'^1 = x^1$, $x'^2 = x^2$, $x'^3 = -\frac{1}{2}\left(\lambda^{-2} - \lambda^2\right)x^0 + \frac{1}{2}\left(\lambda^{-2} + \lambda^2\right)x^3$, and $x'^0 = \frac{1}{2}\left(\lambda^{-2} + \lambda^2\right)x^0 - \frac{1}{2}\left(\lambda^{-2} - \lambda^2\right)x^3$ according to $x' = a(v)xa(v)^*$. On the other hand, since $\frac{1}{2}\left(\lambda^{-2} - \lambda^2\right) = \sinh\theta$, $\frac{1}{2}\left(\lambda^{-2} + \lambda^2\right) = \cosh\theta$, and $\tanh\theta = \frac{v}{c}$, $x'$ is just $\Lambda^{(03)}(\theta)x$.

The elements of $SL(2, \mathbb{C})$ are acting naturally on 2-component complex spinors $u = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}$, where $\alpha, \beta \in \mathbb{C}$. Thus, one would expect some relation between the spinors and the vectors in the Minkowski space $M$. Actually, vectors on the light-cone $x^2 = 0$ can be represented as spinors. The matrix elements of a real Lorentz transformation $L = L(a)$ are quadratic polynomials in the complex matrix elements of $a \in SL(2, \mathbb{C})$. Similarly, light-cone vectors $x$ are quadratic polynomials in the spinor components $\alpha$ and $\beta$,

$$x = uu^*, \quad \text{i.e.,} \quad x^0 = \frac{1}{2}\left(\alpha\alpha^* + \beta\beta^*\right), \quad x^3 = \frac{1}{2}\left(\alpha\alpha^* - \beta\beta^*\right), \quad x^1 - ix^2 = \alpha\beta^*.$$

This is really a light-cone vector, since $\det(uu^*) = 0$ for any spinor $u$. Furthermore, if $u' = au$, then $x' = u'u'^* = (au)(au)^* = a(uu^*)a^* = axa^*$ and the spinor transformation of $u$ indeed corresponds to a vectorial Lorentz transformation of $x$.

The correspondence $u \mapsto x = uu^*$ between spinors and light-cone vectors is not one-to-one. If we multiply the spinor $u$ by a complex phase $e^{i\varphi}$, where $\varphi \in \mathbb{R}$, then the vector $x$ remains unchanged. However, it is an easy exercise to show that any light-like vector $x$ with $x^0 > 0$ can be represented by a complex spinor (see Problem 1.50).
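The construction of this section can be tried out with a few lines of plain Python. The sketch below implements the map (1.50) between 4-vectors and Hermitian matrices, the action $x' = axa^*$, the boost matrix (1.52), and the spinor map $x = uu^*$. All function names are hypothetical and chosen only for this illustration; $2 \times 2$ matrices are represented as nested lists.

```python
import math


def vector_to_matrix(x):
    """Eq. (1.50): 4-vector (x0, x1, x2, x3) -> Hermitian 2x2 matrix."""
    x0, x1, x2, x3 = x
    return [[x0 + x3, x1 - 1j * x2],
            [x1 + 1j * x2, x0 - x3]]


def matrix_to_vector(m):
    """Inverse of Eq. (1.50) for a Hermitian 2x2 matrix m."""
    x0 = (m[0][0] + m[1][1]).real / 2
    x3 = (m[0][0] - m[1][1]).real / 2
    lower_left = complex(m[1][0])  # equals x1 + i x2
    return (x0, lower_left.real, lower_left.imag, x3)


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate a^* of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def lorentz_from_sl2c(a, x):
    """Apply L(a) to the 4-vector x via x' = a x a^*."""
    return matrix_to_vector(matmul(matmul(a, vector_to_matrix(x)), dagger(a)))


def boost_x3(theta):
    """Boost matrix of Eq. (1.52) with lambda = exp(-theta/2)."""
    lam = math.exp(-theta / 2)
    return [[lam, 0.0], [0.0, 1.0 / lam]]


def minkowski_square(x):
    """x^2 = (x0)^2 - (x1)^2 - (x2)^2 - (x3)^2."""
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2


def spinor_to_vector(alpha, beta):
    """Light-cone vector x = u u^* for the spinor u = (alpha, beta)."""
    alpha, beta = complex(alpha), complex(beta)
    m = [[alpha * alpha.conjugate(), alpha * beta.conjugate()],
         [beta * alpha.conjugate(), beta * beta.conjugate()]]
    return matrix_to_vector(m)
```

Applying `lorentz_from_sl2c(boost_x3(theta), x)` should reproduce the hyperbolic rotation in the $(x^0, x^3)$-plane, and `spinor_to_vector` should always return a vector with vanishing Minkowski square.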
1.8 Lorentz Invariance of Maxwell’s Equations
Maxwell’s equations in vacuum, with electric constant $\epsilon_0$ (permittivity of free space), magnetic constant $\mu_0$ (permeability of free space), and $\epsilon_0\mu_0 = 1/c^2$, can be written as

$$\nabla \cdot \mathbf{E} = \rho/\epsilon_0, \tag{1.53}$$
$$\nabla \times \mathbf{B} = \mu_0\left(\mathbf{j} + \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right), \tag{1.54}$$
$$\nabla \cdot \mathbf{B} = 0, \tag{1.55}$$
$$\nabla \times \mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}. \tag{1.56}$$

Here $\mathbf{E} = (E^1, E^2, E^3) = (E_x, E_y, E_z)$ and $\mathbf{B} = (B^1, B^2, B^3) = (B_x, B_y, B_z)$ are the electric and magnetic field strength vectors, respectively, $\rho = \rho(x)$ is a charge density, and $\mathbf{j} = \mathbf{j}(x)$ is an electric current density. The values of the electromagnetic constants depend on the system of units chosen. In SI$^3$: $\mu_0 = 4\pi \cdot 10^{-7}$ kg·m/(s²·A²).

In relativity theory, it is convenient to write Maxwell’s equations in a slightly different equivalent way. Let us introduce the real antisymmetric matrix $F = (F^{\mu\nu})$, which combines both the electric and magnetic field strengths:

$$F = \begin{pmatrix} 0 & -E^1 & -E^2 & -E^3 \\ E^1 & 0 & -cB^3 & cB^2 \\ E^2 & cB^3 & 0 & -cB^1 \\ E^3 & -cB^2 & cB^1 & 0 \end{pmatrix}. \tag{1.57}$$

The matrix $F$ is called the electromagnetic field strength tensor. Denoting $\partial_\mu = \frac{\partial}{\partial x^\mu}$ and $\partial^\mu = \eta^{\mu\nu}\partial_\nu$, we can write the first and second of Maxwell’s equations as
$$\partial_\mu F^{\mu\nu} = J^\nu, \tag{1.58}$$

where we have defined $J^0 = \rho/\epsilon_0$ and $J^i = c\mu_0 j^i$, i.e., $J = (J^\mu) = (\rho/\epsilon_0, c\mu_0\mathbf{j})$. The quantity $J$ is called the charge-current density 4-vector (4-current in short). The two remaining of Maxwell’s equations can be written as

$$\partial^\mu F^{\nu\lambda} + \partial^\nu F^{\lambda\mu} + \partial^\lambda F^{\mu\nu} = 0 \tag{1.59}$$

for all $\lambda, \mu, \nu = 0, 1, 2, 3$. Note that the left-hand side of Eq. (1.59) is totally antisymmetric in the indices $\lambda, \mu, \nu$ and therefore there are actually only four independent equations.

The basic postulate in Maxwell’s theory is that the antisymmetric tensor $F$ is really transforming like a tensor under Lorentz transformations, i.e.,

$$F'^{\mu\nu}(x') = \Lambda^\mu{}_\lambda\Lambda^\nu{}_\omega F^{\lambda\omega}(x), \tag{1.60}$$

where $x' = \Lambda x$ and $\Lambda \in O(3, 1)$. (In matrix form, we have $F' = \Lambda F\Lambda^t$.) With this assumption, denoting $\partial'_\mu = \frac{\partial}{\partial x'^\mu}$,

$$\partial'_\mu F'^{\mu\nu}(x') = \Lambda_\mu{}^\alpha\Lambda^\mu{}_\beta\Lambda^\nu{}_\gamma\,\partial_\alpha F^{\beta\gamma}(x) = \delta^\alpha_\beta\Lambda^\nu{}_\gamma\,\partial_\alpha F^{\beta\gamma}(x) = \Lambda^\nu{}_\gamma\,\partial_\alpha F^{\alpha\gamma}(x) = \Lambda^\nu{}_\gamma J^\gamma(x), \tag{1.61}$$

where $\delta^\alpha_\beta$ is the Kronecker delta. Assuming that the 4-current transforms like a vector,

$$J'^\mu(x') = \Lambda^\mu{}_\nu J^\nu(x), \tag{1.62}$$

we get

$$\partial'_\mu F'^{\mu\nu} = J'^\nu, \tag{1.63}$$

which shows that the first set of Maxwell’s equations is Lorentz covariant, i.e., a solution of the equations is Lorentz transformed into another solution. The second (homogeneous) set of equations is proven to be Lorentz covariant in a similar manner.
Exercise 1.3 Prove that Eq. (1.59) is Lorentz covariant.
$^3$SI = Système International d’Unités (International System of Units)
Actually, there are two different, but equivalent points of view when looking at the Lorentz transformation, acting on electromagnetic field strengths. The active point of view is that the observer remains fixed, but we are rotating and accelerating the fields $F^{\mu\nu}$ and the sources, giving rise to the field strengths. The passive point of view is that fields and currents are the same, but we are looking at them in two different coordinate systems, i.e., we are Lorentz transforming the observer. According to the basic philosophy of relativity theory, there is objectively no way to distinguish between these two points of view. It does not matter if the sources are moving and the observer remains at rest or the sources are at rest and the observer is moving.

Because the magnetic field $\mathbf{B}$ is divergence free, we can write $\mathbf{B} = \nabla \times \mathbf{A}$, where the vector field $\mathbf{A}$ is the magnetic vector potential. The field $\mathbf{E} + \partial_t\mathbf{A}$, where $\partial_t = \frac{\partial}{\partial t}$, is then curl free and so

$$\mathbf{E} = -\partial_t\mathbf{A} - \nabla\phi \tag{1.64}$$

for some scalar field $\phi$, the electric potential. We shall introduce a 4-vector $A = (A^\mu) = (\phi, c\mathbf{A})$. Then, we have

$$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu. \tag{1.65}$$

This set of equations is consistent with the Lorentz transformation law, assuming that $A$ transforms like a 4-vector, $A'^\mu(x') = \Lambda^\mu{}_\nu A^\nu(x)$. The 4-vector $A$ is called the 4-vector potential. Conversely, starting from Eq. (1.65), the second set of Maxwell’s equations becomes an identity, whereas the first set can be written as

$$\Box A^\nu - \partial^\nu\left(\partial_\mu A^\mu\right) = J^\nu, \tag{1.66}$$

where $\Box = \partial_\mu\partial^\mu$ is the d’Alembertian operator. The field strengths $F^{\mu\nu}$ do not define uniquely the 4-vector potential $A$. If $A'^\mu = A^\mu + \partial^\mu\chi$ for any scalar field $\chi$, then $F'^{\mu\nu} = F^{\mu\nu}$. On the other hand, it is only the set of electric and magnetic field strengths that contains measurable information about the system. Thus, the 4-vector potential contains redundant degrees of freedom, corresponding to the gauge transformations $A \mapsto A' = A + d\chi$ above.

One can actually take advantage of the gauge degree of freedom. We can choose $\chi$ as a solution of the linear partial differential equation

$$\Box\chi = -\partial_\mu A^\mu. \tag{1.67}$$

Then, for $A' = A + d\chi$,

$$\partial_\mu A'^\mu = 0 \tag{1.68}$$

and the first set of Maxwell’s equations simplifies to

$$\Box A'^\nu = J^\nu. \tag{1.69}$$
The choice of the 4-vector potential satisfying Eq. (1.68) is called the Lorenz gauge$^4$. It has the further advantage that the gauge condition is preserved under Lorentz transformations.

$^4$This gauge condition has been named after the Danish physicist Ludwig Lorenz who first published it. However, it is often erroneously attributed to the Dutch physicist Hendrik Antoon Lorentz (who is the “Lorentz” of the Lorentz transformations).

The simplified Maxwell’s equations$^5$, $\Box A^\mu = J^\mu$, give rise to the equations for the electric potential and the magnetic vector potential in the Lorenz gauge,

$$\Box\phi = \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} - \Delta\phi = \rho/\epsilon_0, \tag{1.70}$$
$$\Box\mathbf{A} = \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} - \Delta\mathbf{A} = \mu_0\mathbf{j}, \tag{1.71}$$

where $\Delta = \nabla^2$ is the Laplacian operator. The solutions to the above equations give the so called Liénard–Wiechert potentials.
1.8.1 Physical Consequences of Lorentz Transformations

The Lorentz transformations corresponding to a velocity boost mix the electric and magnetic components of the field strength tensor. For example, if $x'^0 = x^0\cosh\theta - x^1\sinh\theta$ and $x'^1 = x^1\cosh\theta - x^0\sinh\theta$ is a boost in the $x^1$-coordinate direction, then

$$E'^2 = E^2\cosh\theta - cB^3\sinh\theta, \tag{1.72}$$
$$E'^3 = E^3\cosh\theta + cB^2\sinh\theta, \tag{1.73}$$
$$cB'^2 = cB^2\cosh\theta + E^3\sinh\theta, \tag{1.74}$$
$$cB'^3 = cB^3\cosh\theta - E^2\sinh\theta, \tag{1.75}$$

whereas $E'^1 = E^1$ and $B'^1 = B^1$. The interpretation is this: The Lorentz transformation $\Lambda^{(01)}$ transforms a charge at rest (which is a source for a static electric field) to a charge moving along the $x^1$-axis. But a charge in motion is an electric current. An electric current along the $x^1$-axis is a source for a magnetic field such that the field lines are circles around the $x^1$-axis. For this reason, a static electric field after the transformation contains non-zero magnetic field components. Conversely, an inverse Lorentz transformation maps a pure magnetic field to a new field, which also has non-zero electric field components.
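The mixing described by Eqs. (1.72)-(1.75) can be made concrete with a small function. The sketch below takes the fields as 3-tuples, with the magnetic field supplied as $c\mathbf{B}$ so that both arguments carry the same units; the function name and this calling convention are assumptions of the example.

```python
import math


def boost_fields_x1(E, cB, theta):
    """Transform (E, cB) under a boost of rapidity theta along the x^1-axis.

    Implements Eqs. (1.72)-(1.75); the components along the boost
    direction are unchanged.
    """
    ch, sh = math.cosh(theta), math.sinh(theta)
    E1, E2, E3 = E
    cB1, cB2, cB3 = cB
    E_new = (E1, E2 * ch - cB3 * sh, E3 * ch + cB2 * sh)
    cB_new = (cB1, cB2 * ch + E3 * sh, cB3 * ch - E2 * sh)
    return E_new, cB_new
```

Starting from a purely electric field such as `E = (0.0, 1.0, 0.0)` with `cB = (0.0, 0.0, 0.0)`, the boosted field acquires a magnetic component along the $x^3$-axis, and boosting back with `-theta` recovers the original fields.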
Example 1.6 The electric field $\mathbf{E}$ due to a point charge $q$ at the origin is known to be $\mathbf{E}(\mathbf{x}) = q\mathbf{x}/4\pi\epsilon_0 r^3$. After giving an observer a boost to velocity $v$ along the positive $x^1$-axis, the field is transformed to

$$\mathbf{E}'(x') = \frac{q}{4\pi\epsilon_0 r^3}\left(x^1, x^2\cosh\theta, x^3\cosh\theta\right),$$
$$c\mathbf{B}'(x') = \frac{q}{4\pi\epsilon_0 r^3}\left(0, x^3\sinh\theta, -x^2\sinh\theta\right),$$

where $r = \sqrt{(x^1)^2 + (x^2)^2 + (x^3)^2}$.

Note that in order to compute the field strengths at the point $x'$, we have to write the $x$-coordinates on the right-hand side of the equations in terms of the new $x'$-coordinates. After doing this, we obtain

$$\mathbf{E}'(x') = \frac{q}{4\pi\epsilon_0 r^3}\left(x'^1 + vt', x'^2, x'^3\right)\cosh\theta,$$
$$c\mathbf{B}'(x') = \frac{q}{4\pi\epsilon_0 r^3}\left(0, x'^3, -x'^2\right)\sinh\theta,$$

where $r^2 = \cosh^2\theta\left(x'^1 + vt'\right)^2 + \left(x'^2\right)^2 + \left(x'^3\right)^2$. For small velocities, we deduce the classical formulas

$$E'^1(x') = \frac{q}{4\pi\epsilon_0}\left(x'^1 + vt'\right)\left[\left(x'^1 + vt'\right)^2 + \left(x'^2\right)^2 + \left(x'^3\right)^2\right]^{-3/2},$$
$$E'^2(x') = \frac{q}{4\pi\epsilon_0}\,x'^2\left[\left(x'^1 + vt'\right)^2 + \left(x'^2\right)^2 + \left(x'^3\right)^2\right]^{-3/2},$$
$$E'^3(x') = \frac{q}{4\pi\epsilon_0}\,x'^3\left[\left(x'^1 + vt'\right)^2 + \left(x'^2\right)^2 + \left(x'^3\right)^2\right]^{-3/2},$$
$$cB'^1(x') = 0,$$
$$cB'^2(x') = \frac{q}{4\pi\epsilon_0}\,x'^3\,\frac{v}{c}\left[\left(x'^1 + vt'\right)^2 + \left(x'^2\right)^2 + \left(x'^3\right)^2\right]^{-3/2},$$
$$cB'^3(x') = -\frac{q}{4\pi\epsilon_0}\,x'^2\,\frac{v}{c}\left[\left(x'^1 + vt'\right)^2 + \left(x'^2\right)^2 + \left(x'^3\right)^2\right]^{-3/2}$$

for the electromagnetic field of a point charge moving along the negative $x'^1$-axis with velocity $v$, i.e., we have an electric current in the negative $x'^1$-direction.

We can also obtain the 4-vector potential $A$ for the point charge in a similar manner. We start again from the simple form $A(x) = (A^\mu(x)) = (\phi(x), \mathbf{0})$ with $\phi(x) = \frac{q}{4\pi\epsilon_0 r}$, being the electric potential for a point charge at rest at the origin of the $x$-coordinate system. In the $x'$-coordinate system, moving with velocity $v$ along the positive $x^1$-axis, the potential is

$$A'(x') = (A'^\mu(x')) = (\phi(x)\cosh\theta, -\phi(x)\sinh\theta, 0, 0) = (\phi(x')\cosh\theta, -\phi(x')\sinh\theta, 0, 0).$$

In the $x'$-coordinates, we have

$$\phi(x') = \frac{q}{4\pi\epsilon_0}\left[\cosh^2\theta\left(x'^1 + vt'\right)^2 + \left(x'^2\right)^2 + \left(x'^3\right)^2\right]^{-1/2},$$

from which we obtain, by using Eq. (1.65), the form of the field strengths above.

$^5$In vacuum, the simplified Maxwell’s equations become $\Box A^\mu = 0$, which means that the electromagnetic field, when its quantum nature is fully exploited, will be seen to correspond to massless particles, photons.
The complex plane wave solutions of Maxwell’s equations in vacuum are given by the 4-vector potential $A^\mu(x) = \epsilon^\mu e^{ik \cdot x}$, where $\epsilon$ is the (constant) polarization vector and $k$ is the 4-momentum carried by the plane wave. The field strengths are $F^{\mu\nu} = i\left(k^\mu\epsilon^\nu - k^\nu\epsilon^\mu\right)e^{ik \cdot x}$. It should be noted that the component of the polarization vector $\epsilon$, which is parallel to the 4-momentum $k$ (“the longitudinal polarization”), does not contribute to the field strengths $F^{\mu\nu}$. Thus, we may assume that the longitudinal component of $\epsilon$ vanishes. The 4-current due to the field $F^{\mu\nu}$ is $J^\nu = \partial_\mu F^{\mu\nu} = \left(\epsilon \cdot k\,k^\nu - k^2\epsilon^\nu\right)e^{ik \cdot x}$, so that Maxwell’s equations in vacuum are satisfied if $k^2 = \epsilon \cdot k = 0$. By taking the real and imaginary parts of the complex solutions, one obtains real solutions in terms of cosine and sine functions.

Because of the constraint $\epsilon \cdot k = 0$, there are only two physical independent “transverse” polarization degrees of freedom for any given 4-momentum $k$. A basis in the transverse plane is given by a pair of space-like vectors. For example, let $k = (k^0, k^0, 0, 0)$. Then $(0, 0, 1, 0)$ and $(0, 0, 0, 1)$ define the linearly independent transverse directions.
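The algebra behind the plane wave conditions can be checked on the constant amplitudes alone, leaving out the common factor $e^{ik \cdot x}$. The following sketch, with hypothetical helper names and the metric signature $(+, -, -, -)$, evaluates the amplitude of $F^{\mu\nu}$ (up to the factor $i$) and of the 4-current $J^\nu$ for a given $k$ and $\epsilon$.

```python
ETA = (1.0, -1.0, -1.0, -1.0)  # diagonal of the Minkowski metric


def minkowski_dot(a, b):
    """Inner product a . b = a^mu b_mu."""
    return sum(ETA[m] * a[m] * b[m] for m in range(4))


def field_amplitude(k, eps):
    """Amplitude k^mu eps^nu - k^nu eps^mu of F^{mu nu} (factor i omitted)."""
    return [[k[m] * eps[n] - k[n] * eps[m] for n in range(4)]
            for m in range(4)]


def current_amplitude(k, eps):
    """Amplitude (eps . k) k^nu - k^2 eps^nu of the 4-current J^nu."""
    eps_dot_k = minkowski_dot(eps, k)
    k_squared = minkowski_dot(k, k)
    return [eps_dot_k * k[n] - k_squared * eps[n] for n in range(4)]
```

With `k = (2.0, 2.0, 0.0, 0.0)` and the transverse polarization `eps = (0.0, 0.0, 1.0, 0.0)`, the current amplitude vanishes, while a purely longitudinal polarization `eps = k` gives a vanishing field amplitude.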
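1.8.2 The Lorentz Force

Maxwell’s equations tell us how sources (charges and currents) give rise to electric and magnetic fields. The Lorentz force law describes how the field strengths determine the trajectory of a moving test charge $q$ with rest mass $m_0$. Let us parameterize the trajectory of the charge $q$ as $x = x(s)$, using the proper time parameter $s$. The force law is

$$m_0 c^2\,\ddot{x}^\mu(s) = q\,\dot{x}_\nu(s)F^{\mu\nu}(x(s)). \tag{1.76}$$

This is covariant under Lorentz transformations: Both sides of the equation transform like 4-vectors. In order to understand the physical meaning of the force law, we shall first replace the proper time derivatives by ordinary time derivatives, using $x^0 = ct$. For the space components, we have

$$\dot{\mathbf{x}} = \frac{d\mathbf{x}}{ds} = \frac{d\mathbf{x}}{dt}\frac{dt}{ds} = \frac{1}{c}\mathbf{u}\frac{dx^0}{ds} = \frac{1}{c}\mathbf{u}\,\dot{x}^0. \tag{1.77}$$

Then, from the spatial part of Eq. (1.76) and the definitions of the electromagnetic field strengths, we obtain

$$\begin{aligned}
m_0 c^2\,\ddot{x}^i &= m_0 c^2\frac{d\dot{x}^i}{ds} = m_0 c^2\frac{dt}{ds}\frac{d}{dt}\left(\frac{dx^i}{dt}\frac{dt}{ds}\right) = m_0\frac{dx^0}{ds}\frac{d}{dt}\left(u^i\frac{dx^0}{ds}\right) = m_0\frac{dx^0}{ds}\frac{d}{dt}\left(u^i\dot{x}^0\right) \\
&= q\,\dot{x}_\nu F^{i\nu} = q\left(\dot{x}_0 F^{i0} + \dot{x}_j F^{ij}\right) = q\left(\frac{dx^0}{ds}E^i + \frac{dt}{ds}(\mathbf{u} \times c\mathbf{B})^i\right) \\
&= q\left(\frac{dx^0}{ds}E^i + \frac{dx^0}{ds}(\mathbf{u} \times \mathbf{B})^i\right) = q\frac{dx^0}{ds}(\mathbf{E} + \mathbf{u} \times \mathbf{B})^i.
\end{aligned} \tag{1.78}$$

Thus, the factors $\frac{dx^0}{ds}$ cancel on both sides of Eq. (1.78) to give

$$\frac{d}{dt}\left(m_0\mathbf{u}\,\dot{x}^0\right) = q(\mathbf{E} + \mathbf{u} \times \mathbf{B}). \tag{1.79}$$

Now, what is $m_0\mathbf{u}\,\dot{x}^0$? We defined earlier $p^\mu = m_0 c\,\dot{x}^\mu$. Thus, $\mathbf{p} = m_0 c\,\dot{\mathbf{x}} = m_0\mathbf{u}\,\dot{x}^0$ from Eq. (1.77). This together with Eq. (1.79) leads to

$$\frac{d\mathbf{p}}{dt} = q(\mathbf{E} + \mathbf{u} \times \mathbf{B}). \tag{1.80}$$

For small velocities, this gives the classical formula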
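$$m_0\mathbf{a} = q(\mathbf{E} + \mathbf{u} \times \mathbf{B}). \tag{1.81}$$

Example 1.7 According to Section 1.3, we have $\dot{x}^0 = 1/\sqrt{1 - u^2/c^2} = \cosh\theta$. Show this result, using the fact that the proper time parameter $s$ is defined such that

$$\dot{x}^2 = \left(\dot{x}^0\right)^2 - \dot{\mathbf{x}}^2 = 1. \tag{1.82}$$

Using Eqs. (1.77) and (1.82), we deduce

$$|\mathbf{u}| = \frac{c\,|\dot{\mathbf{x}}|}{\dot{x}^0} = \frac{c\,|\dot{\mathbf{x}}|}{\sqrt{1 + \dot{\mathbf{x}}^2}}$$

and from this we solve

$$|\dot{\mathbf{x}}| = \frac{u/c}{\sqrt{1 - u^2/c^2}}.$$

Thus, we have

$$\dot{x}^0 = \sqrt{1 + \dot{\mathbf{x}}^2} = \frac{1}{\sqrt{1 - u^2/c^2}} = \cosh\theta = \gamma.$$

Example 1.8 Deduce the physical meaning of the time component of the Lorentz force law.

The time component of the Lorentz force law is

$$\begin{aligned}
m_0 c^2\,\ddot{x}^0 &= m_0 c^2\frac{d\dot{x}^0}{ds} = m_0 c^2\frac{dt}{ds}\frac{d}{dt}\left(\frac{1}{\sqrt{1 - u^2/c^2}}\right) = m_0 c^2\frac{dt}{ds}\frac{d\gamma}{dt} \\
&= q\,\dot{x}_\nu F^{0\nu} = q\,\dot{x}_i F^{0i} = q\left(-\dot{x}^i\right)\left(-E^i\right) = q\,\dot{x}^i E^i = q\,\dot{\mathbf{x}} \cdot \mathbf{E} = q\frac{dt}{ds}\mathbf{u} \cdot \mathbf{E}.
\end{aligned}$$

The factors $\frac{dt}{ds}$ cancel on both sides of the above equation and we obtain, after differentiation with respect to $t$:

$$m_0\,\mathbf{u} \cdot \mathbf{a}\,\gamma^3 = q\,\mathbf{u} \cdot \mathbf{E}.$$

For small velocities, this gives

$$\mathbf{u} \cdot (m_0\mathbf{a} - q\mathbf{E}) = 0,$$

which is Newton’s law along the direction of motion, with the force given by $q\mathbf{E}$. Assuming that the electric field $\mathbf{E}$ is constant, we can rewrite this equation as

$$\frac{d}{dt}\left(\frac{m_0 u^2}{2}\right) = \frac{d}{dt}\left(q\,\mathbf{x} \cdot \mathbf{E}\right).$$

This shows that the kinetic energy increases in time as the work performed by the electric field $\mathbf{E}$. The constant magnetic field $\mathbf{B}$ does not contribute to the work.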
Example 1.9 Show that we obtain the Lorentz force as a consequence of Lorentz invariance, by considering the transformation from a coordinate system, where there is only a single electric field E acting on a particle with charge q, to a moving coordinate system! Let the observer at rest see the force F = qE. Making a Lorentz transformation to a coordinate system moving with velocity v = (v, 0, 0) along the x-axis, we obtain, using Eqs. (1.72) and (1.73),
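$$\begin{aligned}
\mathbf{F}' = q\mathbf{E}' &= q\left(E^1, E^2\gamma - cB^3(v/c)\gamma, E^3\gamma + cB^2(v/c)\gamma\right) \\
&= q\left(E^1, E^2\gamma, E^3\gamma\right) + q\gamma\left(0, -vB^3, vB^2\right).
\end{aligned}$$

But $\mathbf{v} \times \mathbf{B} = (0, -vB^3, vB^2)$. Thus,

$$\mathbf{F}' = q\left(E^1, E^2\gamma, E^3\gamma\right) + q\gamma(\mathbf{v} \times \mathbf{B}).$$

For small velocities $v \ll c$, we have $\gamma \simeq 1$ and

$$\mathbf{F}' = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}),$$

which is the Lorentz force.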
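To see the relativistic force law (1.80) at work, one can integrate $d\mathbf{p}/dt = q(\mathbf{E} + \mathbf{u} \times \mathbf{B})$ numerically, recovering the velocity from the momentum through Eqs. (1.43) and (1.48). The sketch below uses a simple midpoint rule in units with $c = 1$; the integrator, its name, and its parameters are assumptions made for this illustration and not a method taken from the text.

```python
import math


def velocity_from_momentum(p, m0, c=1.0):
    """Velocity u = p c^2 / E with E/c = sqrt((m0 c)^2 + p^2), cf. Eq. (1.48)."""
    energy_over_c = math.sqrt((m0 * c) ** 2 + sum(pi * pi for pi in p))
    return tuple(pi * c / energy_over_c for pi in p)


def cross(a, b):
    """Vector product a x b of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lorentz_force(p, q, m0, E, B, c=1.0):
    """Right-hand side of Eq. (1.80) for constant fields E and B."""
    u_cross_B = cross(velocity_from_momentum(p, m0, c), B)
    return tuple(q * (E[i] + u_cross_B[i]) for i in range(3))


def integrate_lorentz_force(q, m0, E, B, p0, t_end, steps, c=1.0):
    """Integrate dp/dt = q (E + u x B) from t = 0 to t_end (midpoint rule)."""
    dt = t_end / steps
    p = tuple(p0)
    for _ in range(steps):
        k1 = lorentz_force(p, q, m0, E, B, c)
        p_mid = tuple(p[i] + 0.5 * dt * k1[i] for i in range(3))
        k2 = lorentz_force(p_mid, q, m0, E, B, c)
        p = tuple(p[i] + dt * k2[i] for i in range(3))
    return p
```

In a constant electric field the momentum grows linearly in time while the speed stays below $c$; in a pure magnetic field the magnitude of the momentum stays constant, in line with the remark that the magnetic field performs no work.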
1.8.3 The Energy-Momentum Tensor

The energy-momentum tensor $T^{\mu\nu}$ of the electromagnetic field tensor $F^{\mu\nu}$ is symmetric and defined as

$$T^{\mu\nu} = \epsilon_0 F^\mu{}_\lambda F^{\lambda\nu} + \frac{\epsilon_0}{4}\eta^{\mu\nu}F_{\lambda\omega}F^{\lambda\omega}. \tag{1.83}$$

It transforms with respect to Lorentz transformations as

$$T'^{\mu\nu}(x') = \Lambda^\mu{}_\lambda\Lambda^\nu{}_\omega T^{\lambda\omega}(x). \tag{1.84}$$

It is possible to show that

$$T^\mu{}_\mu = \eta_{\mu\nu}T^{\mu\nu} = 0. \tag{1.85}$$

In Eq. (1.83), the last term contains the Lorentz invariant quantity

$$F_{\mu\nu}F^{\mu\nu} = 2\left(c^2\mathbf{B}^2 - \mathbf{E}^2\right). \tag{1.86}$$

Using Maxwell’s equations (1.58) and (1.59), we obtain

$$\partial_\mu T^{\mu\nu} = \epsilon_0 J_\mu F^{\mu\nu} = -f^\nu, \tag{1.87}$$

where

$$f = (f^\mu) = (\mathbf{j} \cdot \mathbf{E}/c, \rho\mathbf{E} + \mathbf{j} \times \mathbf{B}).$$

The right-hand side is the Lorentz force density generated by the charge-current density $J = (J^\mu)$. Without external sources (when $J = 0$), the energy-momentum tensor $T$ is conserved, i.e., $\partial_\mu T^{\mu\nu} = 0$. This implies, by Stokes’ theorem, that

$$\int_S T^{\mu\nu}\,dS_\mu = 0, \tag{1.88}$$

where $dS_\mu$ denotes the surface element of a closed 3-dimensional surface in $\mathbb{R}^4$; $dS_\mu$ is a vector orthogonal to the surface and of length equal to the area element on the surface. Taking $S$ as the plane $x^0 = \text{const.}$ for different values of $x^0$, we see that

$$\int_{x^0 = \text{const.}} T^{0\nu}\,d^3x$$

is independent of the time $x^0$ in the case when there are no sources, i.e., for constants $a$ and $b$, we thus have

$$\int_{x^0 = a} T^{0\nu}\,d^3x = \int_{x^0 = b} T^{0\nu}\,d^3x. \tag{1.89}$$

Consider next what happens with sources. We consider the case
$$\mathbf{j}(t, \mathbf{x}') = q\mathbf{u}(t)\,\delta(\mathbf{x}'(t) - \mathbf{x}(t)), \tag{1.90}$$

for a point charge $q$ moving with velocity $\mathbf{u}(t) = \mathbf{u}(\mathbf{x}(t))$ at the point $\mathbf{x}(t)$. The energy change $c\,\delta T^{00}$ that this charge undergoes can be calculated using Eq. (1.89). We have by Stokes’ theorem and by Eq. (1.87)

$$\begin{aligned}
\delta T^{00} &= \int T^{00}(t_2, \mathbf{x}')\,d^3x' - \int T^{00}(t_1, \mathbf{x}')\,d^3x' = \int_S T^{\mu 0}\,dS_\mu \\
&= \iint_{t_1}^{t_2}\partial_\mu T^{\mu 0}\,d^3x'\,dt = -\iint_{t_1}^{t_2} f^0\,d^3x'\,dt \\
&= -\frac{1}{c}\iint_{t_1}^{t_2}\mathbf{j}(t, \mathbf{x}') \cdot \mathbf{E}(\mathbf{x}'(t))\,d^3x'\,dt.
\end{aligned} \tag{1.91}$$

Inserting the expression for $\mathbf{j}(t, \mathbf{x}')$ in the above equation, we find

$$\delta T^{00} = -\frac{1}{c}\iint_{t_1}^{t_2} q\mathbf{u}(t) \cdot \mathbf{E}(\mathbf{x}'(t))\,\delta(\mathbf{x}'(t) - \mathbf{x}(t))\,d^3x'\,dt = -\frac{1}{c}\int_{t_1}^{t_2} q\mathbf{u}(t) \cdot \mathbf{E}(\mathbf{x}(t))\,dt. \tag{1.92}$$

Thus, the difference $c\,\delta T^{00}$ is equal to the work done by the field and thus we may interpret $cT^{00}$ as the energy density carried by the field $F^{\mu\nu}$. Similarly, the spatial components $T^{0i} = T^{i0}$ are interpreted as the components of the momentum density. Both the total energy and the total momentum are conserved when there are no sources.

We saw above that $2\left(c^2\mathbf{B}^2 - \mathbf{E}^2\right)$ is a Lorentz invariant quantity. There is also a second Lorentz invariant:

$$\epsilon_{\mu\nu\lambda\omega}F^{\mu\nu}F^{\lambda\omega} = -8c\,\mathbf{B} \cdot \mathbf{E}, \tag{1.93}$$

where $\epsilon_{\mu\nu\lambda\omega}$ is a totally antisymmetric 4th rank tensor with $\epsilon^{0123} = 1$.
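Both invariants, Eqs. (1.86) and (1.93), can be evaluated directly from the matrix (1.57). The sketch below builds $F^{\mu\nu}$ from $\mathbf{E}$ and $c\mathbf{B}$, lowers indices with the diagonal metric $(+, -, -, -)$, and sums over all permutations for the totally antisymmetric tensor. It assumes the convention $\epsilon^{0123} = +1$, so that $\epsilon_{0123} = -1$; the function names are hypothetical.

```python
from itertools import permutations

ETA = (1.0, -1.0, -1.0, -1.0)  # diagonal of the Minkowski metric


def field_tensor(E, cB):
    """Field strength tensor F^{mu nu} of Eq. (1.57) from E and cB = c * B."""
    E1, E2, E3 = E
    B1, B2, B3 = cB
    return [[0.0, -E1, -E2, -E3],
            [E1, 0.0, -B3, B2],
            [E2, B3, 0.0, -B1],
            [E3, -B2, B1, 0.0]]


def first_invariant(F):
    """F_{mu nu} F^{mu nu}, with both indices lowered by the diagonal metric."""
    return sum(ETA[m] * ETA[n] * F[m][n] ** 2
               for m in range(4) for n in range(4))


def permutation_sign(perm):
    """Sign (+1 or -1) of a permutation of 0..n-1, counted by transpositions."""
    p = list(perm)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign


def second_invariant(F):
    """epsilon_{mu nu lam om} F^{mu nu} F^{lam om} with epsilon_{0123} = -1."""
    total = 0.0
    for perm in permutations(range(4)):
        eps_lower = -permutation_sign(perm)
        total += eps_lower * F[perm[0]][perm[1]] * F[perm[2]][perm[3]]
    return total
```

For given fields, `first_invariant` should equal $2(c^2\mathbf{B}^2 - \mathbf{E}^2)$ and `second_invariant` should equal $-8c\,\mathbf{B} \cdot \mathbf{E}$.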
1.9 Problems
Problem 1.1 Show that
a) every 4-vector (i.e., vector in the Minkowski space), which is orthogonal to a time-like 4-vector, is space-like.
b) the sum of two time-like 4-vectors, which both point into the future, is a time-like 4-vector, which also points into the future.
c) every space-like 4-vector can be written as the difference between two light-like 4-vectors, which point into the future.
d) the inner product of two time-like 4-vectors, which point into the future, is positive.

Problem 1.2 A rod with length of 1 m is inclined 45° in the $xy$-plane with respect to the $x$-axis. An observer with the speed $\sqrt{2/3}\,c$ approaches the rod in the positive direction along the $x$-axis. How long does the rod seem to him and at which angle does he observe it to be inclined relative to his $x$-axis?

Problem 1.3 When the primary cosmic rays hit the atmosphere, muons are created at an altitude of 10 km to 20 km. A muon in the laboratory lives on average the time $\tau_0 = 2.2 \cdot 10^{-6}$ s before it decays into an electron (or a positron) and two neutrinos. Even though a muon can only move $\tau_0 c \approx 660$ m during the time $\tau_0$, a large fraction of the muons will reach the surface of the Earth. How can this be explained? Calculate numerically for a muon, which moves with velocity $0.999c$.

Problem 1.4 An express train passes a station with velocity $v$. A measurement of the length of the train can be performed in the following different ways:
a) A “continuum” of linesmen are ordered to align along the track. The two men that see the front or the end of the train pass in front of them when their watches show 12:30 make a mark where they stand. The distance $L_a$ between the marks is measured.
b) One conductor goes to the front of the train and another one goes to the end. When the watches of the conductors show 12:15 they quickly drive a nail into the track. The linesmen measure the distance $L_b$ between the nails.
c) The station master inspects the receding train through a pair of binoculars. Through the binoculars he sees the front of the train to be at the semaphore A at the same time as its end is at the railway point B. The linesmen measure the distance $L_c$ between A and B.
d) The station master uses a radar to measure the length of the train. The arrival times of the radar pulses reflected from the front and end of the receding train are $t_1$ and $t_2$, respectively. The distance $L_d = (t_1 - t_2)c/2$ is a measure of the length of the train.
Express $L_a$, $L_b$, $L_c$, and $L_d$ in terms of $L_0$, the rest length of the train.
Problem 1.5 A train passes a station just after sunset. The length of the train is L. In the front and in the rear, it has two lanterns. With a switch they are put on simultaneously in the train. A station man observes the train pass with velocity v. Does he see the lanterns go on simultaneously? If not, what is the time difference between the turning on of the two lanterns for the station man, expressed in terms of L and v?
Problem 1.6 A hitch-hiker in the Milky Way sits waiting on a small asteroid when a formidably long express space cruiser passes very close to the asteroid. Just as the rear end is opposite to her, she sees a lantern in the front and in the rear end of the cruiser go on simultaneously. Actually, the rear watch-man also saw them go on, but according to his hydrogen maser wrist-watch he measured a small time difference of $4 \cdot 10^{-9}$ s between the lighting of the forward and rear lanterns. From the type indication on the cruiser, X2000, our hitch-hiker realized that its length was $2 \cdot 10^3$ m. Had she known what you know, she could have calculated the speed of the cruiser. What was it, according to Einstein’s special theory of relativity?
Problem 1.7 Two lamps, that are separated by the distance ℓ in an inertial coordinate system K, are switched on simultaneously (in K). In another inertial coordinate system K′, an observer measures the distance between the lamps to be ℓ′ and sees the lamps go on with the time difference τ. Express ℓ in terms of ℓ′ and τ. Assuming that the inertial coordinate system K′ is moving along the axis connecting the two lamps, find also the expression for the relative velocity v between the two inertial coordinate systems, K and K′.
Problem 1.8 An observer O on a train of length L and velocity v relative to the ground is standing at a distance xL ($0 \le x \le 1$) from the front A of the train. When he sees two lamps at A and at the rear, B, go on simultaneously, he can calculate at which times t1(A) and t2(B) they went on. An observer O′ on the ground can also determine these two times t1′ and t2′ in his frame of reference, at the time when O just passes O′. If he then finds that t1′ = t2′ it turns out that the velocity v of the train can be expressed as a simple function of x. Find this function and show that if v = 0, then x = 1/2.
Problem 1.9 A rod of length l lies in the xz-plane of a coordinate system. If the angle between the rod and the x-axis is θ, calculate the length of the rod as seen by an observer moving with velocity v along the x-axis.
Problem 1.10 A rod moves with velocity v along the positive x-axis in an inertial frame S. An observer at rest in S measures the length of the rod to be L. Another observer moves with the velocity −v along the x-axis. What length, expressed as a function of L and v, will this observer measure for the rod? The measurement is done as usual with the endpoints being measured simultaneously for each observer in their respective frames.
Problem 1.11 Two events A and B with coordinates xA and xB are simultaneous for an observer K in the inertial system S. Another observer, K′, moving with velocity −u along the x-axis of S measures these events not to be simultaneous, but such that B is earlier than A by the amount ∆t. What is the distance L between the events A and B expressed in the frame of K if it is L′ for the observer K′ in S′?
Problem 1.12 An observer S in the system K observes two events $x_\alpha$ and $x_\beta$. The α event takes place at the origin and the β event 2 years later at a distance of 10 light years (ly) forwards along the $x^1$-axis. Another observer S′ in K′ moves with velocity v along the $x^1$-axis of K, passing S at the origin. She instead sees the β event 1 year later than the α event.
a) How far away does she find event β?
b) What is her velocity relative to S?
Problem 1.13 A particle of mass m and energy E falls from zenith to the Earth along the z-axis in the rest frame of observer K. Another observer, K′, moves with velocity v along the positive x-axis of K and will observe the particle to approach him with an angle θ relative to the z′-axis.
a) Calculate the angle θ expressed in the velocity u of the particle and the velocity v of K′.
b) Based on a), give a description of what the starry sky would look like for a space-cruiser moving with high speed in our galaxy.
Problem 1.14 Consider a particle with 4-momentum p = (E/c, p, 0, 0). By making a Lorentz transformation with velocity −v along the 1-axis, show that you can obtain the addition formula for velocities, by expressing the velocity v′′ of the particle in the new system in terms of the velocity v′ in the old system and the velocity v of the motion of the observer.
Problem 1.15 In 1851, Fizeau measured the speed of light in running water. His result can be summarized in the formula
$$u = u_0 + kv,$$
where u is the speed of light in water that runs with velocity v. The speed of light in water at rest is $u_0$ and the drag coefficient k is given by
$$k = 1 - \frac{1}{n^2},$$
where $n = c/u_0$ is the refraction index of water. Explain Fizeau’s result!
Problem 1.16 In 1965, Maarten Schmidt at the Mount Palomar Observatory could identify the strongly redshifted Lyman α line in the spectrum of the quasi-stellar radio source 3C 9. Normally, this line has the wavelength 1215 Å. Schmidt instead found the value 3600 Å for this line in this radio source. It is possible to explain the redshift in terms of the Doppler effect. This would imply that 3C 9 moves with an enormous speed relative to our galaxy. Determine a lower bound for the speed of 3C 9.
Problem 1.17 Consider an equilateral triangle with sides of length ℓ, which is at rest in the inertial coordinate system K. Assume that one of the sides in the triangle is parallel to the $x^1$-axis of K. In an inertial coordinate system K′ moving relative to K with velocity v along the positive $x^1$-axis of K, an observer measures the lengths of the sides and angles in the triangle. What expressions in ℓ and v for the lengths and angles does he/she find?
Problem 1.18 An observer K′ is moving with constant speed v along the positive $x^1$-axis of an observer K. A thin rod is parallel with the $x'^1$-axis and is moving in the direction of the positive $x'^2$-axis with relative velocity u. Show that according to the observer K the rod forms an angle φ with the $x^1$-axis, with
$$\tan\varphi = -\frac{uv/c^2}{\sqrt{1 - v^2/c^2}}.$$
Problem 1.19 A cylinder is rotating around its axis with angular velocity ω (rad/s) in an inertial system, where the center of gravity is at rest. Show that the observer in an inertial system, that moves with velocity v parallel to the direction of the cylinder axis, will perceive the straight line as twisted around the cylinder. Determine the twist-angle per unit length.
Problem 1.20 A fast train (velocity v) is passing a station during the night. As the train passes the station, all the compartment lights are turned on simultaneously with respect to the rest frame of the train. Relative to an observer standing at the station, the lights seem to be turned on at various times. Compute the velocity u of the line separating the illuminated and unilluminated parts of the train.
Problem 1.21 In a non-relativistic approximation, a planet is moving along a circular orbit (radius R, angular velocity ω) around a star. A space ship is passing by the star, orthogonal with respect to the plane of motion of the planet, with velocity v. Compute the orbit of the planet in the rest frame coordinates of the space ship.
Problem 1.22 An observer B is moving with constant velocity v along the positive $x^1$-axis in the rest frame K of an observer A. An observer C is moving with constant velocity v′ along the positive $x'^2$-axis in the rest frame K′ of the observer B. Compute the absolute value of the relative velocity of C with respect to A. What is the time interval ∆t between two events E1 and E2 which occur at the same spatial point with time difference ∆t′′ in the rest frame K′′ of observer C?
Hint: It is sufficient to compute the time coordinate $x''^0$ of C as a function of the coordinates $x^\mu$ of A.
Problem 1.23 Consider a frame K in which light is coming in from a distant source with the angle θ with respect to the x-axis. In a frame K′ moving relative to the frame K with velocity v along the positive x-axis of the frame K, an observer measures the angle θ′ of the incoming light. Show that the angle θ′ can be expressed in θ and v as
$$\theta' = \arctan\frac{\sin\theta}{\gamma(v)\left(\cos\theta + \frac{v}{c}\right)},$$
where $\gamma(v) = \frac{1}{\sqrt{1 - (v/c)^2}}$.
Problem 1.24 A space ship is moving away from Earth. The power of the engines is regulated so that the passengers feel the constant acceleration g. Calculate the distance between the Earth and the space ship (measured in the rest frame of the Earth) as a function of
a) the time on Earth.
b) the time on the space ship.
The commander of the space ship is 40 years of age at the beginning of the voyage. How old is he/she when the space ship reaches the Andromeda Nebula, which lies 2 000 000 light years away from Earth?
Hint: 1 year $\approx \pi \cdot 10^7$ s and $g \approx 10$ m/s$^2$.
Problem 1.25 A rocket (with rest mass m0) starts from rest at the origin of a coordinate system K. Its velocity along the positive x-axis is increased by shooting matter from the rocket with constant velocity w, with respect to the rest frame of the rocket, to the negative x-direction. Compute the rest mass m of the rocket as a function of its velocity v with respect to the origin of K.
Problem 1.26 An electron (rest mass me) collides with a positron (rest mass me). Show that they cannot annihilate into a single photon (a photon has zero rest mass) by using conservation of energy and momentum. Also show that an electron cannot spontaneously emit a photon.
Problem 1.27 An elementary particle with mass M decays into two particles a and b with masses ma and mb, respectively. Calculate the momentum of particle a in the rest frame of particle b.
Problem 1.28 The rest energy of an electron is about 0.51 MeV, i.e., the energy a charged particle, with charge equal to the electron charge, would receive when falling through a potential difference of 0.51 million volts. We assume that the electron is accelerated in a linear accelerator (starting from rest) with a potential difference of $10^6$ V. Compute the final velocity of the electron.
Problem 1.29 A pion with mass $m_\pi$ and energy $E_\pi$ moves along the x-axis. It decays into a muon with mass $m_\mu$ and a neutrino with mass 0. Calculate the energy $E_\mu$ of the muon when it moves at a right angle to the x-axis in terms of the velocity of the incoming pion and the masses.
Problem 1.30 A pion with mass $m_\pi$ decays into an electron with mass m and an antineutrino with mass $m_\nu$. Calculate the velocity of the antineutrino in the rest frame of the electron as a function of the masses of the particles, and determine the limiting value of this velocity as the mass of the antineutrino goes to zero.
Problem 1.31 In June 1998, the Super-Kamiokande Collaboration in Japan reported that it had found evidence for massive neutrinos. Super-Kamiokande measures so-called atmospheric neutrinos, which are produced in hadronic showers resulting from collisions of cosmic rays with nuclei in the upper atmosphere. Two of the dominating processes in the production of atmospheric neutrinos are
$$\pi^+ \to \mu^+ + \nu_\mu,$$
where $\pi^+$ is a pion, $\mu^+$ is an anti-muon, and $\nu_\mu$ is a muon-neutrino, followed by
$$\mu^+ \to e^+ + \bar\nu_\mu + \nu_e,$$
where $e^+$ is a positron, $\bar\nu_\mu$ is an anti-muon-neutrino, and $\nu_e$ is an electron-neutrino.
a) Calculate the kinetic energy of the anti-muon, $T_{\mu^+}$, and the absolute value of the 3-momentum of the muon-neutrino, $p_{\nu_\mu}$, when the pion decays at rest according to the first decay. Despite the small mass of the muon-neutrino, neglect it! The rest mass of the pion is $m_\pi$ and the rest mass of the anti-muon is $m_\mu$.
b) How far will one of the muons, which are produced in the first decay, go (on the average) before it decays according to the second decay? The mean lifetime of a muon at rest is $\tau_\mu$.
Problem 1.32 An anti-muon $\mu^+$ decays into a positron $e^+$ and two neutrinos $\nu_e$ and $\bar\nu_\mu$. The reaction is
$$\mu^+ \longrightarrow e^+ + \nu_e + \bar\nu_\mu.$$
Give an expression for the largest possible total energy of the electron neutrino $\nu_e$ in the rest frame of the muon. You may assume that the neutrino masses are negligible compared to the lepton masses.
Problem 1.33 A ρ-meson with mass $m_\rho \simeq 770$ MeV/$c^2$ sometimes decays into a pair of muons ($\mu^+\mu^-$) with mass $m_{\mu^-} = m_{\mu^+} \simeq 106$ MeV/$c^2$ and a photon, γ. What is the maximal kinetic energy that the $\mu^+$ can have in this decay in the rest frame of the ρ-meson?
Problem 1.34 Consider the reaction $\pi^+ + n \to K^+ + \Lambda$ in the rest frame of n. The rest masses of the particles are $m_{\pi^+}$, $m_n$, $m_{K^+}$, and $m_\Lambda$, respectively. What is the kinetic energy, T, of the $\pi^+$ when the $K^+$ has total energy E and moves off at an angle of 90° to the direction of the incident $\pi^+$? (T should be expressed in $m_{\pi^+}$, $m_n$, $m_{K^+}$, $m_\Lambda$, and E.)
Problem 1.35 A particle with mass M and 4-momentum p = (E, p) moves towards a detector when it suddenly decays and emits a photon in the direction of motion. The energy registered by the detector is ω. Determine what energy the photon had in the rest frame of the decaying particle.
Problem 1.36 An electron moves with constant velocity towards a positron at rest and they annihilate into two photons. The photons go out with angles φ and −φ relative to the direction of the incoming electron.
a) Calculate the angle as a function of the total energy of the electron.
b) Show that in the non-relativistic limit the angle is given by cos φ = v/2c.
Problem 1.37 Two photons with wavelengths $\lambda_1$ and $\lambda_2$, respectively, are scattered against each other according to Figure 1.2. Calculate the wavelength of the photon with scattering angle θ, i.e., express λ as a function of $\lambda_1$, $\lambda_2$, and θ.
Hint: $p = h/\lambda$, where h is Planck’s constant.
Problem 1.38 A K-meson with mass M decays at rest into two charged pions with the same mass m and a photon according to the reaction formula
$$K^0(P) \to \pi^+(p_1) + \pi^-(p_2) + \gamma(k).$$
The momenta of the particles are given in parentheses after each particle symbol. Calculate the speed v of the pions in their common rest frame ($\mathbf{p}_1 + \mathbf{p}_2 = 0$) as a function of the masses of the particles and the photon energy $k^0 = \omega$ in the rest frame of the decaying particle.

Figure 1.2: $\gamma + \gamma \to \gamma + \gamma$. (Diagram labels: incoming wavelengths $\lambda_1$ and $\lambda_2$, outgoing wavelength λ, scattering angle θ.)

Problem 1.39 In an accelerator protons are accelerated until they reach a kinetic energy of 8000 MeV and are then made to collide with protons at rest. If the sum of the kinetic energies of two colliding protons (measured in the center of mass system) is larger than the rest energy of a proton-antiproton pair, then such a pair can be formed according to the reaction formula
$$p + p \to p + p + p + \bar p,$$
where p is a proton and $\bar p$ is an antiproton. Is the energy 8000 MeV sufficient for the reaction to go? The rest mass of the proton is 938 MeV.
Problem 1.40 Protons at rest are bombarded with π-mesons. How large a kinetic energy do the mesons need to have for the reaction
$$\pi^- + p \to \pi^+ + \pi^- + n$$
to take place? The rest masses of the particles are $m_{\pi^-} = m_{\pi^+} \approx 140$ MeV, $m_p \approx 938$ MeV, and $m_n \approx 940$ MeV.
Problem 1.41 A hydrogen atom H, consisting of an electron and a proton with binding energy B = 13.6 eV, can disintegrate into its two constituent particles by being hit by a photon. The reaction is
$$\gamma + H \to p + e.$$
Calculate relativistically the least photon energy in the rest frame of H required for this process to occur, expressed in terms of B and the hydrogen mass $m_H$.
Problem 1.42 A $\Sigma^0$-particle with speed c/3 in the direction towards a gamma detector suddenly decays into a Λ-particle and a photon. The photon continues towards the detector.
a) What energy does the $\Sigma^0$-particle have in the system in which the detector is at rest?
b) What energy does the photon have in the rest system of the $\Sigma^0$-particle?
c) What energy will be registered in the detector?
The mass of the Λ is $m_\Lambda \approx 1115.7$ MeV and that of $\Sigma^0$ is $m_{\Sigma^0} \approx 1192.6$ MeV.
Problem 1.43 In the CELSIUS ring at the The Svedberg Laboratory in Uppsala, one would like to study the reaction
$$p + d \to p + p + n + \eta.$$
The available kinetic energy of the protons is $T_p = 700$ MeV and the deuterons (d) can be considered to be at rest. The rest masses of the particles are $m_p \approx m_n$, $m_d \approx m_p + m_n$, $m_n = 940$ MeV, and $m_\eta = 550$ MeV.
a) Is the reaction possible?
b) If the kinetic energy of the protons in the beam is increased to $T_p = 1350$ MeV, what is the maximum kinetic energy that the η can get in the system in which the nucleons are at rest after the reaction, expressed in terms of the rest masses and the kinetic energies?
Problem 1.44 In elastic scattering of two particles onto each other, the same type of particles are present before as after the collision. Thus, in $e + p \to e + p$ elastic scattering of electrons on protons with corresponding 4-momenta $p_e$, $p_p$, $p_e'$, and $p_p'$, one can form an invariant called t, defined by $t = (p_e - p_e')^2$.
a) Show that, in the center of mass system defined by the total 3-momentum being 0, the quantity −t equals the square of the change of the 3-momentum, i.e., $-t = (\mathbf{p}_e - \mathbf{p}_e')^2$, and express this quantity in terms of the scattering angle θ between the incoming and outgoing electrons and the modulus of the momentum $|\mathbf{p}_e|$ of the incoming electron.
b) Calculate the kinetic energy, $T_p'$, of the outgoing proton in the laboratory system, where the incoming proton is at rest before the collision, in terms of the variable t.
Problem 1.45 Consider elastic scattering of photons on electrons
$$\gamma(k) + e^-(p) \to \gamma(k') + e^-(p'),$$
where k and p are the incoming photon and electron four-momenta and k′ and p′ the corresponding outgoing four-momenta.
a) In the laboratory system, the incoming electron is at rest and the outgoing photon is scattered through the angle θ with respect to the direction of the incoming photon. Use invariants to derive the so-called “Compton formula”, i.e., the difference between the outgoing and incoming photon wavelengths, as a function of θ, in units $c = \hbar = 1$.
b) Derive the angular frequency (energy) of the outgoing photon in the center of mass system in terms of the incoming photon angular frequency (energy) in the laboratory system.
Problem 1.46 What is the kinetic energy T of the pion required to create the resonance ∆(1232) in the reaction
$$\pi + p \to \pi + \Delta,$$
where π is a pion and p is a proton? The proton is at rest before the collision. The result should be expressed in terms of the rest masses of the particles involved.
Problem 1.47 The mass of the meson π0 can be measured by the reaction
$$p + \pi^- \to \pi^0 + n,$$
where p is a proton, $\pi^-$ is a negative pion, and n is a neutron. The uncharged $\pi^0$ meson decays very quickly into two photons and cannot be easily measured. However, the velocity of the final neutron can be measured and is found to be $v_n = (0.89418 \pm 0.00017)$ cm/ns. Derive the formula that expresses the mass of the $\pi^0$ meson as a function of the masses of the proton, $\pi^-$, neutron, and the velocity $v_n$, assuming that the reaction takes place at rest for the incoming particles. Simplify the result by showing that the velocity is small, so that we need to retain only the lowest non-trivial order in $v_n/c$.
Problem 1.48 A particle A with rest mass mA decays into two particles B and C with rest masses mB and mC , respectively. Assume that the particle A has the speed vA before the decay and that the particle B is at rest after the decay, i.e., pB = 0. Express the speed vA in the rest masses mA, mB, and mC .
Problem 1.49 Two particles, 1 and 2, with mass m1 and m2 respectively collide and form a new particle with mass M. Calculate the mass M and the velocity v of this particle in the rest frame of particle 2 as a function of the velocity v1 and the masses m1 and m2.
Problem 1.50 Let x be a light-like vector in Minkowski space. Show that
$$u = N \begin{pmatrix} x^0 + x^3 \\ x^1 + i x^2 \end{pmatrix},$$
where N is a real normalization factor, is a spinor that satisfies $x \propto uu^*$, where x is a complex $2 \times 2$ matrix, so that $\det x = \det(uu^*) = 0$. Normalize this spinor by the requirement that $\operatorname{tr} x = 2x^0$. A Lorentz transformation along the 3-axis is given by
$$a(v) = \begin{pmatrix} e^{-\theta/2} & 0 \\ 0 & e^{\theta/2} \end{pmatrix},$$
where tanh θ = v/c. Show explicitly that this transformation satisfies a(v)u = u(L(a(v))x), where L(a(v))x is the Lorentz transformed vector and u is the normalized spinor.
Problem 1.51 A plane electromagnetic wave moving along the $x^1$-axis has the form
$$E(x) = E_0 \sin 2\pi\left(\frac{x^1}{\lambda} - \nu t\right).$$
Introduce the angular frequency ω = 2πν and show that the argument of the wave can be written in the form $-x_\mu k^\mu$, where $k = (\frac{\omega}{c}, \frac{\omega}{c}, 0, 0)$ is the wave vector of the light wave traveling along the positive $x^1$-axis. Show that this vector is light-like and deduce the Doppler formula by calculating the change in angular frequency ω under a Lorentz transformation along the $x^1$-axis. What does the Doppler formula look like expressed in terms of the Lorentz angle θ? (Give the most concise expression.)
Problem 1.52 An inertial coordinate system K′ is moving relative to another inertial coordinate system K with constant velocity v along the positive $x^1$-axis of K.
a) Assume that a stick of length ℓ is at rest in K such that ∆x = (ℓ, 0, 0). Calculate ∆x′ in K′.
b) Assume that there is a constant electric field E = (0, 0, E) in K (no magnetic field, i.e., B = 0 in K). Calculate E′ and B′ in K′.
Problem 1.53 An observer at rest experiences in frame K only an electric field E. An observer in K′, moving with velocity v along the positive x-axis, will also observe a magnetic field B′. Calculate this field for small velocities (linear terms in v) and show that this magnetic field is perpendicular to both the E-field and to the velocity of the charged particle in the K′-frame.
Problem 1.54 Let K, K′, and K′′ be as in Problem 1.22. Assume that there is a constant electric field E = (0, 1, 0) (in some given physical units) in the coordinate system K. We assume that the magnetic field B vanishes in K. Compute the components of both the electric and magnetic fields in the coordinate systems K′ and K′′.
Problem 1.55 Show by explicit calculation, using the chain rule and the properties of the Lorentz transformations, that
$$\Box A^\mu(x) = 0 \qquad (1.94)$$
is invariant under Lorentz transformations, i.e., if $A^\mu(x)$ is a solution of Eq. (1.94), then $A'^\mu(x')$ is a solution of the same equation in the primed variables x′ = Λx, where Λ is a Lorentz transformation.
Problem 1.56 Compute the electric and magnetic field components due to a point charge q moving with velocity v along the positive x-axis.
Problem 1.57 A particle of mass m and electric charge q is moving in a constant electric field E. Use the Lorentz force law to calculate the velocity of the particle as a function of the displacement r from the origin along the direction of motion. The particle starts off at rest.
Problem 1.58 Through a straight uncharged conductor the current I is flowing. Determine the electromagnetic field in an inertial system K′ that moves parallel to the conductor with velocity v
a) by transforming the electromagnetic field tensor from the rest frame K of the conductor to K′,
b) by transforming the current-density 4-vector from K to K′, and then, knowing the charge of the conductor and its current relative to K, determine the field in K′.
Problem 1.59 Maxwell’s equations can be expressed by means of the 4-vector electromagnetic potential A. When $\partial_\mu A^\mu = 0$ (Lorenz gauge), they take on a simple form. What is this form? Assuming that Maxwell’s equations are in this simple form and furthermore J = 0 (current free), show for a plane wave, $A^\mu = \varepsilon^\mu e^{ik\cdot x}$, where ε is the polarization vector, that
$$\mathbf{E}\cdot\mathbf{k} = \mathbf{B}\cdot\mathbf{k} = 0,$$
i.e., the electric and magnetic fields are perpendicular to the direction of motion.
Problem 1.60 Calculate the Lorentz invariants $F_{\mu\nu}F^{\mu\nu}$ and $\epsilon_{\mu\nu\omega\lambda}F^{\mu\nu}F^{\omega\lambda}$ for a free electromagnetic plane wave $A^\mu(x) = \epsilon^\mu e^{ik\cdot x}$, where $\epsilon^\mu$ is the polarization vector. Give a physical interpretation of your result.
Problem 1.61
a) Show that if the electric and magnetic fields E and B are orthogonal for one observer, they are orthogonal for any observer.
b) Show that E and B are orthogonal for free plane waves with $A^\mu(x) = \varepsilon^\mu e^{ikx}$, where ε is the polarization vector.
c) Show for the plane waves that $\mathbf{E} \times \mathbf{B} = A\mathbf{k}$, where k is the wave vector and A is a non-vanishing expression.
Problem 1.62 An electron with rest mass m0 is moving in a homogeneous magnetic field B = (0, 0,B) and no electric field. Calculate its trajectory if it has velocity u = (u, 0, 0) at time t = 0.
Problem 1.63 Prove that the scalar product $\mathbf{E}\cdot\mathbf{B}$ between the electric and magnetic field vectors is invariant under Lorentz transformations.
Problem 1.64 In an inertial coordinate system K, there is a constant electric field E = (cB, 0, 0) and a constant magnetic field B = (0, B, 0). In another inertial system K′, the same fields are measured to be E′ = (0, 2cB, cB) and the x-component Bx′ = 0. Compute By′ and Bz′ .
Problem 1.65 Observer A measures the electric and magnetic field strengths to be E = (α, α, 0) and B = (0, 0, 2α/c), respectively, where $\alpha \neq 0$. Another observer, observer B, − makes the same measurements and finds E′ = (0, 0, 2α) and B′ = (Bx′, α/c, Bz′). Determine Bx′ and Bz′.
Problem 1.66 Observer A measures the electric and magnetic field strengths to be E = (0, β, β) and B = (2β/c, 0, 0), respectively, where $\beta \neq 0$. Another observer, observer B, − makes the same measurements and finds E′ = (2β, 0, 0) and B′ = (Bx′, By′, β/c). Determine Bx′ and By′.
Problem 1.67 Observer A measures the electric and magnetic field strengths to be E = (α, 0, 0) and B = (α/c, 0, 2α/c), respectively, where $\alpha \neq 0$. Another observer, observer B, makes the same measurements and finds E′ = (Ex′, α, 0) and B′ = (α/c, By′, α/c). Express Ex′ and By′ in terms of α and c. Finally, a third observer, observer C, is moving relative to observer B with constant velocity v along the positive x-axis of observer B. Find the electric and magnetic field strengths, E′′ and B′′, as observer C measures them.
Problem 1.68 The four-momentum of a free particle of mass m is $p^\mu = mc\dot{x}^\mu$.
a) Show that the momentum is conserved (i.e., independent of time) by calculating the Euler–Lagrange variational equations for the Lagrangian $\mathcal{L} = p^2/2m$, where the metric is flat.
b) When the particle moves in an electromagnetic field one can obtain the relevant equations of motion by using the substitution $p \to p + qA/c$, where A = A(x) is the electromagnetic potential and q is the charge of the particle. Show that to lowest non-trivial order in q the equations of motion for the particle give the Lorentz force equations.

Chapter 2
Some Differential Geometry
Differential geometry is used in many areas of physics. It is a tool which can be used to describe local and global properties of spaces which are not vector spaces and which may or may not be curved. Examples of such spaces are spheres, cylinders and hyperboloids. In particular, differential geometry is the language used to describe the general theory of relativity. This chapter is a brief introduction to differential geometry, in which the basic concepts needed in the study of general relativity are introduced.
2.1 Manifolds
In a vector space like $\mathbb{R}^n$ we can use global coordinates. The Cartesian coordinates $x^i$ (i = 1, 2, ..., n) are everywhere defined and determine a point in the vector space in a one-to-one way. The situation is different in the case of a closed surface like the unit sphere $S^2 \subset \mathbb{R}^3$. We can define the spherical coordinates (θ, φ) by $(x, y, z) = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$ for $x^2 + y^2 + z^2 = 1$. Any point on $S^2$ corresponds to some value of the coordinates in the ranges $0 \le \theta \le \pi$ and $0 \le \phi \le 2\pi$. However, the points θ = 0, π are singular in the sense that any value of the coordinate φ corresponds to the same point (north and south poles). Furthermore, φ = 0 and φ = 2π represent the same points on the sphere. This is typical for closed surfaces: There is simply no way to map the points on the surface in a one-to-one manner to points on a vector space or even to any open subset of a vector space.

The above obstruction to setting up global coordinates on a surface leads to the general notion of a manifold. A manifold is defined as a space which locally looks like a piece of a vector space $\mathbb{R}^n$. A manifold can be glued together from a collection of open subsets of $\mathbb{R}^n$. More precisely, a (smooth) manifold M is a set such that it is a union of open subsets $U_\alpha$ with a collection of homeomorphisms $\phi_\alpha : U_\alpha \to V_\alpha$, called coordinate functions, where each $V_\alpha$ is an open subset of $\mathbb{R}^n$. In addition, we require that all coordinate transformations are smooth, i.e., the composite functions $\phi_\alpha \circ \phi_\beta^{-1}$ are functions in (a subset of) $\mathbb{R}^n$ such that all their partial derivatives (of arbitrary order) exist and are continuous functions; this is illustrated in Figure 2.1. An open set $U_\alpha \subset M$ together with a coordinate function $\phi_\alpha$ is called a chart and is denoted by $(U_\alpha, \phi_\alpha)$. A collection of charts $(U_i, \phi_i)$ such that $\bigcup_i U_i = M$ (that is, each point $p \in M$ belongs to at least one of the sets $U_i$) is known as an atlas. The number n is called the dimension of the manifold.
Note that this definition implies that $\mathbb{R}^n$ itself is a manifold.
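The failure of the spherical coordinates (θ, φ) to be one-to-one can be checked numerically. The following is only a small illustrative sketch; the helper names `sphere_point` and `close` are hypothetical and not part of the text.

```python
import math

def sphere_point(theta, phi):
    """Map spherical coordinates (theta, phi) to a point (x, y, z) on the unit sphere."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def close(p, q, tol=1e-12):
    """Compare two points componentwise up to a small tolerance."""
    return all(abs(a - b) < tol for a, b in zip(p, q))

# At theta = 0 every value of phi labels the same point, the north pole.
north = [sphere_point(0.0, phi) for phi in (0.0, 1.0, 2.5)]

# phi = 0 and phi = 2*pi label the same point for any theta.
same = close(sphere_point(1.0, 0.0), sphere_point(1.0, 2 * math.pi))
```

Both checks show different coordinate values landing on one and the same point, which is why a single coordinate patch cannot cover the whole sphere.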
Figure 2.1: Two open sets $U_\alpha$ and $U_\beta$ in a manifold M with homeomorphisms $\phi_\alpha$ and $\phi_\beta$ to the open sets $V_\alpha$ and $V_\beta$ in $\mathbb{R}^n$. The composite function $\phi_\beta \circ \phi_\alpha^{-1}$ is a smooth function.
When constructing physical theories using differential geometry, it is important that the laws of physics do not depend on the specific choice of coordinate functions. This is known as the principle of general covariance and is essential in general relativity.
Example 2.1 The unit sphere $S^2$ is a union of the six hemispheres $U_{i,\pm} = \{(x^1, x^2, x^3) \in S^2 \mid \pm x^i > 0\}$. For example, $U_{3,+}$ is homeomorphic to the unit disk $\{(x^1, x^2) \mid (x^1)^2 + (x^2)^2 < 1\}$ via the projection $\phi_{3,+}(x^1, x^2, x^3) = (x^1, x^2)$. It is easy to show that all the coordinate transformations $\phi_{i,\pm} \circ \phi_{j,\pm}^{-1}$ are smooth.

Example 2.2 The set of all invertible real $2 \times 2$ matrices, to be denoted by GL(2, R), is a manifold. Actually, it is an open subset of $\mathbb{R}^4$. Namely, a non-singular $2 \times 2$ matrix
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
is characterized by the non-vanishing of the determinant,
$$ad - bc \neq 0.$$
The set $ad - bc = 0$ is a closed 3-dimensional surface in $\mathbb{R}^4$, therefore its complement GL(2, R) is an open set. Any open subset of $\mathbb{R}^n$ is a manifold (only one single coordinate system is needed!). This example can be generalized to the complex case GL(2, C), or to any number of dimensions GL(n, R), GL(n, C).
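As a concrete instance of a coordinate transformation between two hemisphere charts of Example 2.1, the sketch below composes the inverse of $\phi_{3,+}$ with a chart $\phi_{1,+}$. The text only writes out $\phi_{3,+}$; the choice $\phi_{1,+}(x^1, x^2, x^3) = (x^2, x^3)$ is an assumption made here by analogy, and the function names are hypothetical.

```python
import math

def phi_3_plus(p):
    """Chart on U_{3,+} = {x^3 > 0}: project onto (x^1, x^2)."""
    return (p[0], p[1])

def phi_3_plus_inv(u):
    """Inverse chart: lift a point of the open unit disk to the upper hemisphere."""
    return (u[0], u[1], math.sqrt(1.0 - u[0] ** 2 - u[1] ** 2))

def phi_1_plus(p):
    """Assumed chart on U_{1,+} = {x^1 > 0}: project onto (x^2, x^3)."""
    return (p[1], p[2])

def transition(u):
    """phi_{1,+} o phi_{3,+}^{-1}, defined where both x^1 > 0 and x^3 > 0."""
    return phi_1_plus(phi_3_plus_inv(u))

u = (0.6, 0.0)          # coordinates of a point in the chart phi_{3,+}
p = phi_3_plus_inv(u)   # the corresponding point on the sphere
v = transition(u)       # the same point in the chart phi_{1,+}, approximately (0.0, 0.8)
```

In these coordinates the transition map reads $(x^1, x^2) \mapsto (x^2, \sqrt{1 - (x^1)^2 - (x^2)^2})$, which is smooth because the square root is evaluated only strictly inside the unit disk.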
Example 2.3 Any smooth surface (any surface without corners or sharp edges) is a manifold. For example, the paraboloid $z = x^2 + y^2$ in $\mathbb{R}^3$ is a manifold. In this case a global coordinate system exists: The points are determined 1-1 by the projection $(x, y) \in \mathbb{R}^2$. The one-sheeted hyperboloid $x^2 + y^2 - z^2 = 1$ is another example; in this case there is no global coordinate system; one has to use at least two different local coordinate systems.

Figure 2.2: The surface of the Earth together with an atlas. In the atlas, there is at least one chart containing each point of the surface (Stockholm for example).
Example 2.4 The surface of the Earth, see Fig. 2.2, is a manifold (just as any sphere is). Any good atlas is going to contain at least one chart where a given location can be found and some regions may be included in more than one chart.

A subset of a manifold may or may not be a manifold; in the former case we call the subset a submanifold. An example of a surface which is not a submanifold is the light-cone $(x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 = 0$ in $\mathbb{R}^4$.

We shall consider mappings between different manifolds. We say that a mapping $f : M \to N$ is smooth if it is smooth in all coordinate systems: This means that the composite mappings $\phi_\alpha \circ f \circ \psi_\beta^{-1}$ are smooth mappings between vector spaces when $\phi_\alpha$ is some local coordinate system on N and $\psi_\beta$ is a local coordinate system on M.

Exercise 2.1 Show that if $f : M \to N$ and $g : N \to P$ are smooth mappings between manifolds, then also $g \circ f : M \to P$ is smooth.

Exercise 2.2 Show that the union of two different intersecting lines in the plane is not a submanifold of $\mathbb{R}^2$.
Exercise 2.3 Show that the plane αx + βy + γz = δ is a submanifold of R3. 42 CHAPTER 2. SOME DIFFERENTIAL GEOMETRY
Exercise 2.4 Show that the function x2 + y is a smooth function on the unit sphere.
2.2 Vector Fields and Tangent Vectors
Let $U = (U_x, U_y, U_z)$ be a smooth vector field in $\mathbb{R}^3$. If $f : \mathbb{R}^3 \to \mathbb{R}$ is any smooth function, then we can define the function
$$Uf = U_x \frac{\partial f}{\partial x} + U_y \frac{\partial f}{\partial y} + U_z \frac{\partial f}{\partial z}, \qquad (2.1)$$
which is also smooth. Let us denote by $C^\infty(M)$ the set of all smooth real valued functions on the manifold M. Above we have actually defined a mapping
$$U : C^\infty(\mathbb{R}^3) \to C^\infty(\mathbb{R}^3).$$
By the standard properties of partial derivatives, this mapping is linear and it satisfies
$$U(fg) = fUg + gUf, \qquad (2.2)$$
Leibniz’ rule for differentiation. Furthermore, Uf = 0 for all functions f if and only if U = 0. We state without proof: Any linear mapping of $C^\infty(\mathbb{R}^3)$ into itself, which in addition satisfies Leibniz’ rule, is a derivative along some vector field. Thus, we have a 1-1 correspondence between vector fields and derivations of the algebra $C^\infty(\mathbb{R}^3)$ of smooth functions.

Motivated by the above consideration, we define: A vector field X on a manifold M is a derivation of the algebra $C^\infty(M)$. We can always rely on the local coordinates to write a vector field on a manifold M, acting on a function f as
$$Xf = \sum_{i=1}^{n} X^i(x) \frac{\partial f}{\partial x^i}. \qquad (2.3)$$
The vector field is then described by its components
X = (X1(x), X2(x),...,Xn(x)). (2.4)
We denote by $D^1(M)$ the set of vector fields on $M$. Because a linear combination of linear maps is linear and a linear combination of derivations is a derivation (check by Leibniz' rule!), we observe that $D^1(M)$ is a vector space. Suppose that $X, Y \in D^1(M)$. Define the commutator $[X,Y]$ by
$$[X,Y]f = X(Yf) - Y(Xf). \tag{2.5}$$
Then clearly $[X,Y](f + g) = [X,Y]f + [X,Y]g$ and $[X,Y](\lambda f) = \lambda [X,Y]f$, where $\lambda$ is a real number. Furthermore,
$$\begin{aligned}
[X,Y](fg) &= X(Y(fg)) - Y(X(fg)) = X(f\,Yg + g\,Yf) - Y(f\,Xg + g\,Xf) \\
&= (Xf)(Yg) + f(XYg) + (Xg)(Yf) + g(XYf) - (Yf)(Xg) - f(YXg) - (Yg)(Xf) - g(YXf) \\
&= f[X,Y]g + g[X,Y]f.
\end{aligned} \tag{2.6}$$
This shows that $[X,Y] \in D^1(M)$.
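The definition (2.5) can be checked numerically on simple examples. The following is a minimal Python sketch, assuming that derivatives are approximated by central differences; the helper names `apply_field` and `commutator`, the step size `h`, and the example fields are illustrative choices and not part of the text.

```python
def apply_field(X, f, h=1e-3):
    """Return the function Xf = sum_i X^i * df/dx^i (central differences, step h)."""
    def Xf(p):
        total = 0.0
        for i, Xi in enumerate(X(p)):
            plus, minus = list(p), list(p)
            plus[i] += h
            minus[i] -= h
            total += Xi * (f(tuple(plus)) - f(tuple(minus))) / (2.0 * h)
        return total
    return Xf


def commutator(X, Y, f):
    """Return the function [X, Y]f = X(Yf) - Y(Xf), cf. Eq. (2.5)."""
    XYf = apply_field(X, apply_field(Y, f))
    YXf = apply_field(Y, apply_field(X, f))
    return lambda p: XYf(p) - YXf(p)


# Example fields in the plane (hypothetical choices for illustration).
X = lambda p: (1.0, 0.0)      # X = d/dx
Y = lambda p: (0.0, p[0])     # Y = x d/dy
f = lambda p: p[0] ** 2 * p[1] + p[1]

# Analytically [X, Y] = d/dy, so [X, Y]f = x^2 + 1.
value = commutator(X, Y, f)((2.0, 3.0))
```

Here `value` should be close to $x^2 + 1 = 5$ at the point $(2, 3)$, up to the discretization error of the finite differences.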
Exercise 2.5 Show that
1. $[X,Y]$ is linear in both arguments,
2. $[X,Y] = -[Y,X]$, and
3. $[X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0$.
The last equation is called the Jacobi identity. A vector space equipped with a product [X,Y ] satisfying 1., 2., and 3. is called a Lie algebra.
Writing $X = X^i \partial_i$ and $Y = Y^i \partial_i$, where $\partial_i = \frac{\partial}{\partial x^i}$, we get a formula for $Z = [X,Y]$,
$$Z^i = X^j \partial_j Y^i - Y^j \partial_j X^i. \tag{2.7}$$
We want to generalize the concept of a tangent vector on a surface. If $S \subset \mathbb{R}^3$ is a smooth surface and $p \in S$ is a point, then a tangent vector at the point $p$ is given as the derivative (with respect to the parameter) of a curve, $v = \dot{x}(s_0)$ with $x(s_0) = p$. The same tangent vector can be obtained from different curves through the point $p$, because the only thing which matters is the first derivative at $p$ with respect to the parameter. The set of all tangent vectors at $p$ spans the tangent plane $T_pS$ to the surface.
In the case of a manifold $M$, we proceed as follows. Let $\alpha(s)$ and $\beta(s)$ be two smooth curves through the point $p \in M$. Choose a coordinate system $\phi(q) = (x^1(q), x^2(q), \ldots, x^n(q))$ in a neighborhood of the point $p$. We say the curves $\alpha$ and $\beta$ are tangential to each other at $p$ if
$$\frac{d}{ds} x^i(\alpha(s)) = \frac{d}{ds} x^i(\beta(s)) \quad \text{for } i = 1, 2, \ldots, n \text{ at } s = s_0. \tag{2.8}$$
Exercise 2.6 Show that the above condition is independent of the choice of a coordinate system, i.e., if the curves are tangential in one coordinate system, then they are tangential in any other coordinate system.
A tangent vector $v$ at the point $p$ is an equivalence class of smooth curves through $p$, a pair of curves being equivalent if they are tangential to each other at $p$, see Fig. 2.3. The set of all tangent vectors at $p$ is denoted by $T_pM$ and is called the tangent space at $p$. As in the case of a tangent plane, the tangent space is a vector space: Given a pair of tangent vectors represented by the curves $\alpha$ and $\beta$, the sum of the tangent vectors is represented by a curve $\gamma$ such that in the local coordinates
$$x(\gamma(s)) = x(\alpha(s)) + x(\beta(s)) - x(p). \tag{2.9}$$
If $\lambda$ is a real number and $v$ is a tangent vector represented by the curve $\alpha$, then the tangent vector $\lambda v$, in local coordinates, is represented by $\lambda(x(\alpha(s)) - x(p)) + x(p)$.
Example 2.5 On the unit sphere $M = S^2$, we use the spherical coordinates $\theta$ and $\phi$, except at the poles $\theta = 0, \pi$. A curve can then be parameterized as $(\theta(s), \phi(s))$. A tangent vector $v \in T_pS^2$ is given by its components $v = (v_\theta, v_\phi)$ with $v_\theta = \dot{\theta}(s_0)$, $v_\phi = \dot{\phi}(s_0)$, and $p = (\theta(s_0), \phi(s_0))$. How would you describe a tangent vector at the poles $\theta = 0, \pi$?
Figure 2.3: Two curves α(s) and β(s) through p define the same tangent vector v at p, while the curve γ(s) defines another tangent vector u.
In a local coordinate system $(x^1, x^2, \ldots, x^n)$, the components of a tangent vector $v \in T_pM$ are given by an $n$-tuple of real numbers $v = (v^1, v^2, \ldots, v^n)$, where $v^i = \frac{dx^i(s)}{ds}$ at $s = s_0$. A tangent vector $v$ associates a real number $v \cdot f$ to any smooth function $f \in C^\infty(M)$,
$$v \cdot f = \left. \frac{df(\alpha(s))}{ds} \right|_{s = s_0}, \tag{2.10}$$
where $\alpha$ is any smooth curve tangential to $v$ at $p = \alpha(s_0)$.
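As a numerical illustration of Eq. (2.10), the following Python sketch evaluates $v \cdot f$ along two different curves in the plane that are tangential to each other at $p = (0, 0)$. It is only a sanity check under stated assumptions, not a proof: the curves, the test function, and the helper name `derivative_along` are hypothetical choices.

```python
import math


def derivative_along(f, curve, s0, h=1e-5):
    """Central-difference estimate of d f(curve(s)) / ds at s = s0, cf. Eq. (2.10)."""
    return (f(curve(s0 + h)) - f(curve(s0 - h))) / (2.0 * h)


# Two different curves through p = (0, 0) with the same velocity (1, 1) at s = 0.
alpha = lambda s: (s, s)
beta = lambda s: (math.sin(s), s + s ** 2)

# Test function f(x, y) = exp(x) cos(y) + x y.
f = lambda p: math.exp(p[0]) * math.cos(p[1]) + p[0] * p[1]

v_f_alpha = derivative_along(f, alpha, 0.0)
v_f_beta = derivative_along(f, beta, 0.0)
# Both estimates should agree with (df/dx + df/dy)(0, 0) = 1.
```

The two estimates coincide up to discretization error because only the first derivatives of the curves at $s_0$ enter.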
Exercise 2.7 Show that $v \cdot f$ does not depend on the choice of the tangential curve $\alpha$.
In local coordinates (use the chain rule!),
$$v \cdot f = \left. \frac{dx^i}{ds} \frac{\partial f}{\partial x^i} \right|_{s = s_0}. \tag{2.11}$$
For this reason, we also use the notation $v = v^i \frac{\partial}{\partial x^i}$ for a tangent vector at $p$. Using Leibniz' rule, we get
$$v \cdot (fg) = f(p)\, v \cdot g + g(p)\, v \cdot f. \tag{2.12}$$
If $X \in D^1(M)$, then at each point $p \in M$ the field $X$ determines a tangent vector $v = X(p)$, according to
$$v = X^i(p) \partial_i, \tag{2.13}$$
in terms of local coordinates. Thus, a vector field can be viewed as a smooth distribution of tangent vectors $X(p) \in T_pM$.
Let $h : M \to N$ be a diffeomorphism between two manifolds $M$ and $N$; i.e., $h$ is smooth, it is one-to-one, and the inverse $h^{-1} : N \to M$ is smooth. Let $X \in D^1(M)$. We define a vector field $X' = h_*X \in D^1(N)$ by the formula
$$X'[f](p) = X[f \circ h](h^{-1}(p)). \tag{2.14}$$
Using local coordinates $x^i$ on $M$ and local coordinates $y^j$ on $N$, we get, by the chain rule for differentiation,
$$X'^j(y) \frac{\partial f}{\partial y^j} = X^i(x) \frac{\partial y^k}{\partial x^i} \frac{\partial f}{\partial y^k}, \tag{2.15}$$
i.e.,
$$X'^j(y) = \frac{\partial y^j}{\partial x^i} X^i(x), \tag{2.16}$$
where we have written the function $h : M \to N$ in terms of coordinates as $y = y(x)$. Note that for a linear transformation $y^j = A^j{}_i x^i$, this gives the familiar formula for the transformation of a vector field,
$$X'^j(y) = A^j{}_i X^i(x). \tag{2.17}$$
Exercise 2.8 Show that $h_*[X,Y] = [h_*X, h_*Y]$.
Hint: Use local coordinates.
We sometimes also use the notation $h_*f = f \circ h^{-1}$ for $f \in C^\infty(M)$. A vector field $X$ can be multiplied by a smooth real valued function $f$, according to the rule
$$(fX) \cdot g = f\, X \cdot g, \tag{2.18}$$
i.e.,
$$(fX)^i(x) = f(x) X^i(x). \tag{2.19}$$
According to the definition (2.14),
$$\begin{aligned}
(h_*(fX) \cdot g)(p) &= \big((fX) \cdot (g \circ h)\big)(h^{-1}(p)) = f(h^{-1}(p))\, \big(X \cdot (g \circ h)\big)(h^{-1}(p)) \\
&= (h_*f)(p)\, ((h_*X) \cdot g)(p)
\end{aligned} \tag{2.20}$$
and so $h_*(fX) = (h_*f)(h_*X)$.
2.2.1 Tensor Fields
The set of all linear functions from $T_pM$ to $\mathbb{R}$ is called the cotangent space of $M$ at $p$ and is denoted by $T_p^*M$. It is easy to show that $T_p^*M$ is a vector space and, given some local coordinates at $p$, it is possible to define a basis $\{dx^i\}$ of $T_p^*M$ such that
$$dx^i(\partial_j) = \delta^i_j. \tag{2.21}$$
An element $\omega \in T_p^*M$ is called a cotangent vector (or covariant vector) and may be written as $\omega_i dx^i$. From Eq. (2.21), it immediately follows that for a vector $X = X^i \partial_i$ and a cotangent vector $\omega = \omega_i dx^i$,
$$\omega(X) = \omega_i X^i. \tag{2.22}$$
Since the cotangent vectors are defined without reference to any specific choice of local coordinates, the relation
$$\omega_j = \omega_i dx^i(\partial_j) = \omega(\partial_j) = \omega'_k dy^k\!\left( \frac{\partial y^\ell}{\partial x^j} \partial'_\ell \right) = \frac{\partial y^\ell}{\partial x^j} \omega'_\ell, \tag{2.23}$$
where $x^i$ and $y^k$ are local coordinates on $M$ and the primes refer to the $y$-coordinates, must hold. If multiplied by $\partial x^j / \partial y^k$, this relation yields
$$\omega'_k = \frac{\partial x^j}{\partial y^k} \omega_j \tag{2.24}$$
for the transformation of the cotangent vector components.
Similar to the above, a tensor of type $(n, m)$ is a multilinear function which maps $n$ elements of $T_p^*M$ and $m$ elements of $T_pM$ to $\mathbb{R}$. The components of such a tensor $T$ can be written as $T^{i_1 \ldots i_n}_{j_1 \ldots j_m}$ and transform according to the rule
$$T'^{k_1 \ldots k_n}_{\ell_1 \ldots \ell_m} = \left( \prod_{\alpha=1}^{n} \frac{\partial y^{k_\alpha}}{\partial x^{i_\alpha}} \right) \left( \prod_{\beta=1}^{m} \frac{\partial x^{j_\beta}}{\partial y^{\ell_\beta}} \right) T^{i_1 \ldots i_n}_{j_1 \ldots j_m}. \tag{2.25}$$
A tensor field of type $(n, m)$ on a manifold $M$ is a smooth assignment of a tensor of type $(n, m)$ to each point in the manifold, that is, given some local coordinates, the functions $T^{i_1 \ldots i_n}_{j_1 \ldots j_m}$ are smooth.
2.3 Geodesics
2.3.1 Affine Connection and Christoffel Symbols
According to the definition, a vector field $X \in D^1(M)$ determines a derivation of the algebra of smooth real valued functions on $M$. This action is linear in $X$ such that $(fX)g = f(Xg)$ for any pair of smooth functions $f$ and $g$. Next, we want to define an action of $X$ on $D^1(M)$ itself, which has similar properties. Let $\nabla_X : D^1(M) \to D^1(M)$ for any $X \in D^1(M)$ be an operator satisfying the following conditions:
1. The map $Y \mapsto \nabla_X Y$ is real linear in $Y$ for any fixed $X$,
2. $\nabla_{fX+gY} Z = f \nabla_X Z + g \nabla_Y Z$ for any vector fields $X, Y, Z$ and any smooth real valued functions $f, g$, and
3. $\nabla_X (fY) = f \nabla_X Y + (X \cdot f) Y$ for any vector fields $X, Y$ and any smooth real valued function $f$.
An operator $\nabla$ satisfying these conditions is called an affine connection on the manifold $M$.
Example 2.6 Let $M = \mathbb{R}^n$ and define
$$\nabla_X Y = (X \cdot Y^i) \frac{\partial}{\partial x^i} = (X \cdot Y^i) \partial_i.$$
Then, $\nabla$ is an affine connection.
Warning! The above example needs a modification when applied to an arbitrary manifold $M$. The difficulty is that the right-hand side depends on the choice of local coordinates and it does not transform like a true vector. If we transform from the coordinates $x^i$ to the coordinates $y^j = y^j(x^1, x^2, \ldots, x^n)$, then in the new coordinates
$$Y'^j(y) = \frac{\partial y^j}{\partial x^i} Y^i(x), \tag{2.26}$$
and therefore
$$(X \cdot Y'^j) \partial'_j = (X \cdot Y^i) \frac{\partial y^j}{\partial x^i} \partial'_j + Y^i \left( X \cdot \frac{\partial y^j}{\partial x^i} \right) \partial'_j. \tag{2.27}$$
The coordinates of the first term on the right-hand side are equal to $\frac{\partial y^j}{\partial x^i} (\nabla_X Y)^i$, but for any non-linear coordinate transformation, we also have a second inhomogeneous term. Choosing local coordinates, the difference
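$$H^i(X,Y) = (\nabla_X Y)^i - X \cdot Y^i \tag{2.28}$$
is linear in both arguments in the extended sense
$$H^i(fX, gY) = fg\, H^i(X,Y), \tag{2.29}$$
for any smooth functions $f$ and $g$. For this reason, we can write
$$H^i(X,Y) = \Gamma^i_{jk} X^j Y^k. \tag{2.30}$$
Here $\Gamma^i_{jk} = \Gamma^i_{jk}(x)$ are smooth (local) functions on $M$. Once again,
$$(\nabla_X Y)^i = X \cdot Y^i + \Gamma^i_{jk} X^j Y^k. \tag{2.31}$$
The functions $\Gamma^i_{jk}$ are called the Christoffel symbols of the affine connection $\nabla$. Let us look at what happens to the Christoffel symbols under a coordinate transformation $y = y(x)$. Let us denote by $\nabla_i$ the covariant derivative $\nabla_{\partial_i}$, where $\partial_i = \frac{\partial}{\partial x^i}$. Then,
$$\nabla_i \partial_j = \Gamma^k_{ij} \partial_k. \tag{2.32}$$
Denoting $\partial'_i = \frac{\partial}{\partial y^i}$ and using $\nabla'_i = \frac{\partial x^a}{\partial y^i} \nabla_a$ (which follows from the second axiom for affine connections), we get
$$\begin{aligned}
\nabla'_i \partial'_j = \Gamma'^k_{ij} \partial'_k &= \frac{\partial x^a}{\partial y^i} \nabla_a \left( \frac{\partial x^b}{\partial y^j} \partial_b \right) = \frac{\partial x^a}{\partial y^i} \frac{\partial x^b}{\partial y^j} \nabla_a \partial_b + \frac{\partial x^a}{\partial y^i} \left( \partial_a \frac{\partial x^b}{\partial y^j} \right) \partial_b \\
&= \frac{\partial x^a}{\partial y^i} \frac{\partial x^b}{\partial y^j} \Gamma^c_{ab} \partial_c + \frac{\partial^2 x^b}{\partial y^i \partial y^j} \partial_b.
\end{aligned} \tag{2.33}$$
The form of the second term after the last equality sign follows from the chain rule for differentiation. Transforming back to the $x$ coordinates on the left-hand side and using again the chain rule, we finally get
$$\Gamma'^k_{ij}(y) = \frac{\partial x^a}{\partial y^i} \frac{\partial x^b}{\partial y^j} \frac{\partial y^k}{\partial x^c} \Gamma^c_{ab}(x) + \frac{\partial y^k}{\partial x^c} \frac{\partial^2 x^c}{\partial y^i \partial y^j}. \tag{2.34}$$
Note that in linear coordinate transformations, the inhomogeneous term containing second derivatives vanishes and the Christoffel symbols transform like components of a third rank tensor.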
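To see the inhomogeneous term of Eq. (2.34) at work, consider the flat plane with vanishing Christoffel symbols in Cartesian coordinates and transform to polar coordinates $(r, \phi)$. The Python sketch below evaluates only the second term of (2.34), with the partial derivatives of $x = r\cos\phi$, $y = r\sin\phi$ entered by hand; the function name `polar_christoffel` and the index layout are illustrative assumptions.

```python
import math


def polar_christoffel(r, phi):
    """Christoffel symbols in polar coordinates y = (r, phi) obtained from
    Eq. (2.34) with vanishing symbols in Cartesian coordinates x = (x, y).
    Returns G with G[k][i][j] = Gamma'^k_{ij}; index 0 = r, 1 = phi."""
    c, s = math.cos(phi), math.sin(phi)
    # dy^k/dx^c: rows k = (r, phi), columns c = (x, y).
    dy_dx = [[c, s],
             [-s / r, c / r]]
    # d^2 x^c / dy^i dy^j for x = r cos(phi), y = r sin(phi): d2x[c][i][j].
    d2x = [
        [[0.0, -s], [-s, -r * c]],   # c = x
        [[0.0, c], [c, -r * s]],     # c = y
    ]
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    for k in range(2):
        for i in range(2):
            for j in range(2):
                G[k][i][j] = sum(dy_dx[k][cc] * d2x[cc][i][j] for cc in range(2))
    return G


G = polar_christoffel(2.0, 0.7)
# Non-zero symbols: G[0][1][1] = -r and G[1][0][1] = G[1][1][0] = 1/r.
```

Although the connection is trivial in Cartesian coordinates, the non-linear change of coordinates produces the non-zero symbols $\Gamma'^r_{\phi\phi} = -r$ and $\Gamma'^\phi_{r\phi} = \Gamma'^\phi_{\phi r} = 1/r$.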
Exercise 2.9 We define the Christoffel symbols on the unit sphere $S^2$, using spherical coordinates $(\theta, \phi)$. When $\theta \neq 0, \pi$, we set
$$\Gamma^\theta_{\phi\phi} = -\frac{1}{2} \sin 2\theta, \qquad \Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta,$$
and all the other $\Gamma$'s are equal to zero. Show that the apparent singularity at $\theta = 0, \pi$ can be removed by a better choice of coordinates at the poles of the sphere. Thus, the above affine connection extends to the whole $S^2$.
The covariant derivative can be generalized to act on arbitrary tensors. Since $\nabla_j$ acts as a derivative on vector fields, it is natural to define the covariant derivative of a function $f \in C^\infty(M)$ as
$$\nabla_j f = \partial_j f. \tag{2.35}$$
We now demand that $\nabla_j$ acts as a derivation on products of arbitrary tensor fields $A$ and $B$, that is
$$\nabla_j (AB) = A \nabla_j B + B \nabla_j A. \tag{2.36}$$
Given the covariant vector field $A_i$ and the contravariant vector field $B^i$, the product $A_i B^i$ is a function in $C^\infty(M)$. It follows that
$$\begin{aligned}
\nabla_j (A_i B^i) &= \partial_j (A_i B^i) = A_i \partial_j B^i + B^i \partial_j A_i \\
&= A_i \nabla_j B^i + B^i \nabla_j A_i = A_i (\partial_j B^i + \Gamma^i_{jk} B^k) + B^i \nabla_j A_i.
\end{aligned} \tag{2.37}$$
By solving for $B^i \nabla_j A_i$ in the above expression, the equation
$$B^i \nabla_j A_i = B^i (\partial_j A_i - \Gamma^k_{ji} A_k) \tag{2.38}$$
is obtained. Since $B^i$ is an arbitrary vector field, it immediately follows that
$$\nabla_j A_i = \partial_j A_i - \Gamma^k_{ji} A_k. \tag{2.39}$$
This result can be generalized to arbitrary tensors as
$$\nabla_k T^{i_1 \ldots i_n}_{j_1 \ldots j_m} = \partial_k T^{i_1 \ldots i_n}_{j_1 \ldots j_m} + \sum_{\alpha=1}^{n} \Gamma^{i_\alpha}_{k\ell}\, T^{i_1 \ldots i_{\alpha-1} \ell\, i_{\alpha+1} \ldots i_n}_{j_1 \ldots j_m} - \sum_{\alpha=1}^{m} \Gamma^{\ell}_{k j_\alpha}\, T^{i_1 \ldots i_n}_{j_1 \ldots j_{\alpha-1} \ell\, j_{\alpha+1} \ldots j_m}. \tag{2.40}$$
2.3.2 Parallel Transport
The tangent vectors at a point $p \in M$ form a vector space $T_pM$. Thus, tangent vectors at the same point can be added. However, at different points $p$ and $q$, there is in general no way to compare the tangent vectors $u \in T_pM$ and $v \in T_qM$. In particular, the sum $u + v$ is ill-defined. An affine connection gives a method to relate tangent vectors at $p$ to tangent vectors at $q$, provided that we have fixed some smooth curve $\gamma(s)$ starting from $p$ and ending at $q$. A curve $\gamma$ defines a distribution of tangent vectors along the curve by
$$X(s) = \dot{x}^i(s) \partial_i, \tag{2.41}$$
where we have chosen a local coordinate system $x^i$. Thus, $X(s) \in T_{\gamma(s)}M$. Consider the system of first order ordinary differential equations given by
$$\nabla_{X(s)} Y^i = \dot{x}^k \nabla_k Y^i = \dot{Y}^i(s) + \Gamma^i_{kj}(x(s))\, \dot{x}^k(s)\, Y^j(s) = 0, \quad i = 1, 2, \ldots, n, \tag{2.42}$$
where $Y(s)$ is an unknown vector field along the curve $x(s)$.
Exercise 2.10 Show that the set of equations (2.42) is coordinate independent in the sense that if the equations are valid in one coordinate system, then they are also valid in any other coordinate system.
Example 2.7 With Euclidean coordinates in $\mathbb{R}^n$, $\dot{x}^k \nabla_k = \dot{x}^k \partial_k$ is a directional derivative in the direction of the curve $x(s)$. The set of equations (2.42) is given by
$$\dot{x}^k \nabla_k Y^i = \dot{x}^k \partial_k Y^i = 0,$$
stating that the components of the vector $Y$ do not change.
A vector field Y along the curve x(s), satisfying Eq. (2.42), is called a parallel vector field. The existence and uniqueness theorem in the theory of first order differential equations (Picard’s theorem) gives the following fundamental theorem in geometry:
Theorem 2.1 Given a tangent vector $v \in T_pM$ at the initial point $p = \gamma(s_0)$ of a smooth curve $\gamma(s)$, then there exists a unique parallel vector field $Y(s)$ along $\gamma(s)$ satisfying the initial condition $Y(s_0) = v$.
Definition: A curve $\gamma(s)$ is a geodesic (geodetic curve) if its tangent vectors $\dot{\gamma}(s)$ at each point are parallel.
Thus, the statement that $\gamma(s)$ is a geodesic means that the coordinate functions $x^i(s)$ satisfy
$$\dot{x}^k \nabla_k \dot{x}^i = \ddot{x}^i(s) + \Gamma^i_{jk}(x(s))\, \dot{x}^j(s)\, \dot{x}^k(s) = 0. \tag{2.43}$$
This condition is a second order ordinary differential equation for the coordinate functions. We can use the existence and uniqueness results from the theory of differential equations to formulate the following important theorem:
Theorem 2.2 Given a point $p \in M$ and a tangent vector $u \in T_pM$, then there exists, in some open neighborhood of $p$, a unique geodesic $\gamma(s)$ such that $\gamma(0) = p$ and $\dot{\gamma}(0) = u$.
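Example 2.8 Let $M = S^2$ and let $\Gamma$ be the affine connection in Exercise 2.9. Then, the coordinates $\theta(s)$ and $\phi(s)$ of a geodesic satisfy
$$\ddot{\theta}(s) - \frac{1}{2} \sin 2\theta(s)\, \dot{\phi}(s) \dot{\phi}(s) = 0,$$
$$\ddot{\phi}(s) + 2 \cot\theta(s)\, \dot{\theta}(s) \dot{\phi}(s) = 0.$$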
Find the general solution to the geodesic equations. The solutions are great circles on the sphere M. For example, θ = αs + β and φ = const.
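The claim that the solutions are great circles can be probed numerically. The following Python sketch integrates the two geodesic equations of Example 2.8 with a classical fourth-order Runge-Kutta scheme and monitors how far the embedded curve in $\mathbb{R}^3$ departs from the plane through the origin spanned by the initial position and velocity. The initial data, step size, and function names are hypothetical choices for illustration.

```python
import math


def geodesic_rhs(state):
    """Right-hand side of the geodesic equations of Example 2.8,
    written as a first-order system for (theta, phi, dtheta, dphi)."""
    theta, phi, dtheta, dphi = state
    ddtheta = 0.5 * math.sin(2.0 * theta) * dphi * dphi
    ddphi = -2.0 * (math.cos(theta) / math.sin(theta)) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)


def rk4_step(rhs, state, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(state)
    k2 = rhs(tuple(x + 0.5 * h * k for x, k in zip(state, k1)))
    k3 = rhs(tuple(x + 0.5 * h * k for x, k in zip(state, k2)))
    k4 = rhs(tuple(x + h * k for x, k in zip(state, k3)))
    return tuple(x + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))


def embed(theta, phi):
    """Point of the unit sphere in R^3 with spherical coordinates (theta, phi)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


# Start on the equator at (theta, phi) = (pi/2, 0) with velocity (0.3, 1.0).
state = (math.pi / 2.0, 0.0, 0.3, 1.0)
# Normal of the plane spanned by the initial position (1, 0, 0) and the
# initial velocity (0, 1, -0.3) in R^3.
normal = (0.0, 0.3, 1.0)

max_deviation = 0.0
h = 1e-3
for _ in range(3000):
    state = rk4_step(geodesic_rhs, state, h)
    x = embed(state[0], state[1])
    max_deviation = max(max_deviation, abs(sum(n * c for n, c in zip(normal, x))))
```

If the integrated curve is a great circle, `max_deviation` stays at the level of the integration error. The chosen initial data keep the curve away from the poles, where the coordinates are singular.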
Let $\nabla$ be a connection on $M$ and $\gamma(s)$ a curve connecting the points $p = \gamma(s_1)$ and $q = \gamma(s_2)$. We define the parallel transport from the point $p$ to the point $q$ along the curve $\gamma$ as a linear map
$$\hat{\gamma} : T_pM \to T_qM.$$
The map is given as follows: Let $u \in T_pM$ and let $X(s)$ be a parallel vector field along $\gamma$ such that $X(s_1) = u$. We set $\hat{\gamma}(u) = X(s_2)$. The map is linear, because the differential equation
$$\dot{X}^i(s) + \Gamma^i_{kj}\, \dot{x}^k(s)\, X^j(s) = 0 \tag{2.44}$$
is linear in $X^i$ and therefore the solution depends linearly on the initial condition $u$.
Example 2.9 If $M = \mathbb{R}^n$ and $\Gamma^i_{jk} = 0$, then the parallel transport $\hat{\gamma}$ is the identity map $u \mapsto u$ for any curve $\gamma$.
Example 2.10 Let $M$ and $\Gamma$ be as in Example 2.8. Let $(\theta, \phi) = (\alpha s + \beta, \phi_0)$. Now, the parallel transport is determined by the equations
$$\dot{X}_\theta = 0, \qquad \dot{X}_\phi + \cot\theta \cdot \dot{\theta} X_\phi = \dot{X}_\phi + X_\phi\, \alpha \cot(\alpha s + \beta) = 0.$$
This set has the solution $X_\theta = \text{const.}$ and $X_\phi = \text{const.} \cdot (\sin(\alpha s + \beta))^{-1}$. If $u$ is the tangent vector $(1, 1)$ at the point $(\theta, \phi) = (\pi/4, 0)$, then the parallel transported vector $v$ at $(\theta, \phi) = (\pi/2, 0)$ is $(1, 1/\sqrt{2})$.
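The result of Example 2.10 can be reproduced by integrating the transport equations numerically. The sketch below is a minimal Python illustration, assuming $\alpha = 1$ and $\beta = \pi/4$ so that $s$ runs from $0$ to $\pi/4$; the function names and the number of steps are hypothetical choices.

```python
import math


def transport_rhs(s, X, alpha=1.0, beta=math.pi / 4.0):
    """Parallel transport equations of Example 2.10 along theta = alpha*s + beta,
    phi = const: dX_theta/ds = 0, dX_phi/ds = -alpha * cot(theta) * X_phi."""
    theta = alpha * s + beta
    return (0.0, -alpha * math.cos(theta) / math.sin(theta) * X[1])


def rk4(rhs, X, s0, s1, n=1000):
    """Integrate dX/ds = rhs(s, X) from s0 to s1 with n Runge-Kutta steps."""
    h = (s1 - s0) / n
    s = s0
    for _ in range(n):
        k1 = rhs(s, X)
        k2 = rhs(s + 0.5 * h, tuple(x + 0.5 * h * k for x, k in zip(X, k1)))
        k3 = rhs(s + 0.5 * h, tuple(x + 0.5 * h * k for x, k in zip(X, k2)))
        k4 = rhs(s + h, tuple(x + h * k for x, k in zip(X, k3)))
        X = tuple(x + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                  for x, a, b, c, d in zip(X, k1, k2, k3, k4))
        s += h
    return X


# Transport u = (1, 1) from theta = pi/4 (s = 0) to theta = pi/2 (s = pi/4).
v = rk4(transport_rhs, (1.0, 1.0), 0.0, math.pi / 4.0)
```

The numerical result `v` should agree with the closed-form answer $(1, 1/\sqrt{2})$ of the example up to the integration error.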
2.4 Torsion and Curvature
Given an affine connection $\nabla$ on a manifold $M$, we can define a third rank tensor field $T = (T^k_{ij})$ as follows. Any pair of vector fields $X$ and $Y$ gives another vector field
$$T(X,Y) = \nabla_X Y - \nabla_Y X - [X,Y]. \tag{2.45}$$
The dependence on $X$ and $Y$ is linear. After choosing local coordinates, we may write
$$T(X,Y)^k = X^i Y^j T^k_{ij}, \tag{2.46}$$
which defines the components $T^k_{ij}$ of the tensor. Since
$$[fX, Y] = f[X,Y] - (Y \cdot f) X \tag{2.47}$$
(by the interpretation of a vector field as a first order linear differential operator), $T(X,Y)$ is linear in the extended sense,
$$T(fX, Y) = T(X, fY) = f\, T(X,Y), \qquad T(X, Y + Z) = T(X,Y) + T(X,Z) \tag{2.48}$$
for any real function $f$. Note further that $T(X,Y) = -T(Y,X)$. Since
$$T(\partial_i, \partial_j)^k = T^k_{ij} = \Gamma^k_{ij} - \Gamma^k_{ji}, \tag{2.49}$$
we see that $T$ is precisely the antisymmetric part (in the lower indices) of the Christoffel symbols. From Eq. (2.49) and the transformation formula (2.34) for the Christoffel symbols, it follows that the components of the torsion $T$ really transform like tensor components under coordinate transformations,
$$T'^k_{ij}(y) = \frac{\partial y^k}{\partial x^p} \frac{\partial x^l}{\partial y^i} \frac{\partial x^m}{\partial y^j} T^p_{lm}(x). \tag{2.50}$$
Next, we define the Riemann curvature tensor R, sometimes just called the curvature. For a triple X,Y,Z of vector fields, we can define a vector field
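$$R(X,Y)Z = [\nabla_X, \nabla_Y] Z - \nabla_{[X,Y]} Z. \tag{2.51}$$
In local coordinates,
$$R(\partial_i, \partial_j) \partial_k = R^m_{kij} \partial_m. \tag{2.52}$$
From the definition (2.51), we get
$$\begin{aligned}
R^m_{kij} \partial_m &= \nabla_i \nabla_j \partial_k - \nabla_j \nabla_i \partial_k \\
&= \nabla_i (\Gamma^m_{jk} \partial_m) - \nabla_j (\Gamma^m_{ik} \partial_m) \\
&= \partial_i \Gamma^m_{jk}\, \partial_m + \Gamma^m_{jk} \Gamma^p_{im}\, \partial_p - \partial_j \Gamma^m_{ik}\, \partial_m - \Gamma^m_{ik} \Gamma^p_{jm}\, \partial_p,
\end{aligned} \tag{2.53}$$
i.e.,
$$R^m_{kij} = \partial_i \Gamma^m_{jk} - \partial_j \Gamma^m_{ik} + \Gamma^p_{jk} \Gamma^m_{ip} - \Gamma^p_{ik} \Gamma^m_{jp}. \tag{2.54}$$
For fixed $i$ and $j$, we may think of $R^{\bullet}_{\bullet ij}$ as a real $n \times n$ matrix, where the replaced upper index is the row index and the replaced lower index is the column index. With this notation,
$$R^{\bullet}_{\bullet ij} = \partial_i \Gamma^{\bullet}_{j\bullet} - \partial_j \Gamma^{\bullet}_{i\bullet} + [\Gamma^{\bullet}_{i\bullet}, \Gamma^{\bullet}_{j\bullet}] = [\partial_i + \Gamma^{\bullet}_{i\bullet},\, \partial_j + \Gamma^{\bullet}_{j\bullet}]. \tag{2.55}$$
The curvature is antisymmetric in $i$ and $j$, i.e.,
$$R^m_{kij} = -R^m_{kji}. \tag{2.56}$$
Using Eq. (2.54), one checks by direct computation that under a coordinate transformation $y = y(x)$,
$$R'^m_{kij}(y) = \frac{\partial y^m}{\partial x^q} \frac{\partial x^r}{\partial y^k} \frac{\partial x^s}{\partial y^i} \frac{\partial x^p}{\partial y^j} R^q_{rsp}(x). \tag{2.57}$$
Thus, $R^m_{kij}$ is really a 4th rank tensor in contrast to the Christoffel symbols $\Gamma^k_{ij}$, which transform inhomogeneously under coordinate transformations.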
Exercise 2.11 Check directly from the definition that
$$T'^k_{ij}(y) = \frac{\partial y^k}{\partial x^m} \frac{\partial x^r}{\partial y^i} \frac{\partial x^s}{\partial y^j} T^m_{rs}(x)$$
under a coordinate transformation $y = y(x)$.
Instead of the direct (completely straightforward, but cumbersome) computation, one can prove the tensorial transformation rule for the curvature and torsion from the extended linearity. For example, in the case of the torsion $T$, we have
$$T'^k_{ij} \partial'_k = T(\partial'_i, \partial'_j) = T\!\left( \frac{\partial x^a}{\partial y^i} \partial_a, \frac{\partial x^b}{\partial y^j} \partial_b \right) = \frac{\partial x^a}{\partial y^i} \frac{\partial x^b}{\partial y^j} T(\partial_a, \partial_b) = \frac{\partial x^a}{\partial y^i} \frac{\partial x^b}{\partial y^j} T^c_{ab} \partial_c. \tag{2.58}$$
Comparing the left- and right-hand sides of Eq. (2.58) and taking into account that $\partial_c = \frac{\partial y^k}{\partial x^c} \partial'_k$, we immediately get the tensorial transformation rule (2.50). The case of the curvature tensor is treated in the same way.
Assume that the torsion $T$ vanishes. From Eq. (2.54), we deduce the first Bianchi identity
$$R^m_{kij} + R^m_{jki} + R^m_{ijk} = 0 \tag{2.59}$$
for all indices. This can also be written as
$$R(X,Y)Z + R(Z,X)Y + R(Y,Z)X = 0 \tag{2.60}$$
for all vector fields $X, Y, Z$ (see Problem 2.4). This identity is in general not true when $T \neq 0$.
Another important tensor in general relativity is the Ricci tensor
$$R_{ij} = R^k_{ikj}. \tag{2.61}$$
Exercise 2.12 Show that $R_{ij}$ transforms like a second rank tensor under coordinate transformations.
The curvature is related to the parallel transport in the following way. Consider a very small parallelogram with vertices at $x$, $x + \delta x$, $x + \delta x + \delta y$, $x + \delta y$. According to the differential equation (2.42), determining a parallel transport, a tangent vector $Y$ at $x$ when parallel transported to the point $x + \delta x$ becomes approximately (in given local coordinates)
$$Y^i(x + \delta x) = Y^i(x) - \Gamma^i_{jk}(x)\, Y^k(x)\, \delta x^j. \tag{2.62}$$
At the next point $x + \delta x + \delta y$, we get
$$\begin{aligned}
Y^i(x + \delta x + \delta y) &= Y^i(x) - \Gamma^i_{jk}(x)\, Y^k(x)\, \delta x^j - \Gamma^i_{jk}(x + \delta x) \left[ Y^k(x) - \Gamma^k_{lm}(x)\, Y^m(x)\, \delta x^l \right] \delta y^j \\
&= Y^i(x) - \Gamma^i_{jk}(x)\, Y^k(x)\, \delta x^j - \Gamma^i_{jk}(x)\, Y^k(x)\, \delta y^j \\
&\quad - \partial_m \Gamma^i_{jk}(x)\, \delta x^m \delta y^j\, Y^k(x) + \Gamma^i_{jk}(x)\, \Gamma^k_{lm}(x)\, Y^m(x)\, \delta x^l \delta y^j.
\end{aligned} \tag{2.63}$$
In the last step, we have dropped the terms which are of third order in the coordinate differentials. In the same way, we can compute the parallel transport of $Y$ from $x$ to $x + \delta y$ and further to $x + \delta y + \delta x$. The parallel transport around the parallelogram is then obtained as a combination of the right-hand side of the above formula and the latter transport (note the direction of motion!); the result is
$$\delta Y^i = R^i_{kmj}(x)\, Y^k(x)\, \delta x^m \delta y^j = \frac{1}{2} R^i_{kmj}(x)\, Y^k(x)\, (\delta x^m \delta y^j - \delta x^j \delta y^m). \tag{2.64}$$
Thus, the change of a vector under parallel transport around the small parallelogram is proportional to the curvature at $x$ and the area of the parallelogram.
Example 2.11 We compute the curvature tensor of the unit sphere $S^2$. Since there are only two independent coordinates, all the non-zero components of $R$ are given by the tensor $R^i_j = R^i_{j\theta\phi} = -R^i_{j\phi\theta}$, where $i, j = \theta, \phi$. Looking at the table (Exercise 2.9) of the Christoffel symbols, we get
$$R^\theta_\phi = \sin^2\theta, \qquad R^\phi_\theta = -1,$$
and the other components $= 0$.
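The components quoted in Example 2.11 can be cross-checked by evaluating Eq. (2.54) numerically with the Christoffel symbols of Exercise 2.9. The Python sketch below is a minimal check under stated assumptions: the $\theta$-derivatives are taken by central differences, and the index layout `R[m][k][i][j]` for $R^m_{kij}$ (with 0 for $\theta$ and 1 for $\phi$) is a convention chosen here for illustration.

```python
import math


def christoffel(theta):
    """Christoffel symbols of Exercise 2.9: G[m][i][j] = Gamma^m_{ij},
    with index 0 = theta and 1 = phi. They depend on theta only."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -0.5 * math.sin(2.0 * theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G


def riemann(theta, h=1e-5):
    """R[m][k][i][j] = R^m_{kij} from Eq. (2.54); the theta-derivative is a
    central difference, and all phi-derivatives vanish."""
    G = christoffel(theta)
    Gp, Gm = christoffel(theta + h), christoffel(theta - h)

    def dG(i, m, j, k):
        # partial_i Gamma^m_{jk}
        return (Gp[m][j][k] - Gm[m][j][k]) / (2.0 * h) if i == 0 else 0.0

    R = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for m in range(2):
        for k in range(2):
            for i in range(2):
                for j in range(2):
                    R[m][k][i][j] = (
                        dG(i, m, j, k) - dG(j, m, i, k)
                        + sum(G[p][j][k] * G[m][i][p] - G[p][i][k] * G[m][j][p]
                              for p in range(2))
                    )
    return R


theta0 = 1.0
R = riemann(theta0)
R_theta_phi = R[0][1][0][1]   # R^theta_{phi theta phi}, expected sin^2(theta0)
R_phi_theta = R[1][0][0][1]   # R^phi_{theta theta phi}, expected -1
```

The same array also exhibits the antisymmetry (2.56) in the last two indices.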
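The second Bianchi identity
$$\partial_i R^{\bullet}_{\bullet jk} + [\Gamma^{\bullet}_{i\bullet}, R^{\bullet}_{\bullet jk}] + \partial_j R^{\bullet}_{\bullet ki} + [\Gamma^{\bullet}_{j\bullet}, R^{\bullet}_{\bullet ki}] + \partial_k R^{\bullet}_{\bullet ij} + [\Gamma^{\bullet}_{k\bullet}, R^{\bullet}_{\bullet ij}] = 0 \tag{2.65}$$
follows from Eq. (2.55) and the Jacobi identity for matrices (and linear operators),
$$[X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0. \tag{2.66}$$
One just needs to insert $X = \nabla_i$, $Y = \nabla_j$, and $Z = \nabla_k$, with $\nabla_i = \partial_i + \Gamma^{\bullet}_{i\bullet}$ etc.
2.5 Metric and Pseudo-Metric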
In order to define distances and inner products between tangent vectors on a manifold $M$, we have to define a metric. A Riemannian metric is an inner product defined in each of the tangent spaces. That is, for each $p \in M$, we have a non-degenerate bilinear mapping
$$g_p : T_pM \times T_pM \to \mathbb{R},$$
which is symmetric, $g_p(u,v) = g_p(v,u)$ for all tangent vectors $u, v \in T_pM$, and positive, $g_p(u,u) > 0$ for all $u \neq 0$, and which depends smoothly on the coordinates of the point $p$. Choosing local coordinates $x^i$ and writing the tangent vectors in the coordinate basis, $u = u^i \partial_i$, we can write a symmetric bilinear mapping as a second rank symmetric tensor,
$$g_p(u,v) = g_{ij} u^i v^j. \tag{2.67}$$
Non-degenerate means that $\det(g_{ij}) \neq 0$. Since $(g_{ij})$ is symmetric, it can be diagonalized. Positivity of the inner product then means that all eigenvalues of $g$ are positive.
In relativity theory, we need a generalization of the Riemannian metric to a pseudo-Riemannian metric (or Lorentzian metric). In this generalization, we shall drop the requirement that the inner product should be positive. In particular, we want to include the Minkowski metric $\eta = (\eta_{\mu\nu})$, which has signature $(1, 3)$, i.e., it has one positive eigenvalue ($= 1$) and three negative eigenvalues ($= -1$).
A metric (or a pseudo-metric) can be used to define distances. If $\gamma(s)$ is a parameterized curve such that its tangent vector at each point on the curve has non-negative length, then we define the length of the curve (between the parameter values $a$ and $b$) as
$$\ell(\gamma) = \int_a^b \sqrt{g_{\gamma(s)}(\dot{\gamma}(s), \dot{\gamma}(s))}\; ds. \tag{2.68}$$
The extremal curves $\gamma(s)$ for the functional $\ell(\gamma)$ are the geodesics for a certain connection (the Levi-Civita connection, see the discussion below and Exercise 2.15).
Recall the Euler–Lagrange variational equations: Let $x(s) = (x^1(s), x^2(s), \ldots, x^n(s))$ be a vector valued function of a real variable $s$ and
$$S(x) = \int_a^b L(x(s), \dot{x}(s), \ddot{x}(s), \ldots)\, ds, \tag{2.69}$$
where $L$ is some (differentiable) function of $x$ and its derivatives $\dot{x}, \ddot{x}, \ldots$. Then the variation of $S$ in the direction $\delta x(s)$ of a variation of the curve $x(s)$ is
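$$\delta S = \sum_{i=1}^{n} \int_a^b \delta x^i(s) \left\{ \frac{\partial L}{\partial x^i} - \frac{d}{ds} \frac{\partial L}{\partial \dot{x}^i} + \frac{d^2}{ds^2} \frac{\partial L}{\partial \ddot{x}^i} - \cdots \right\} ds, \tag{2.70}$$
where we have used partial integration in the variable $s$ in order to factor out the $\delta x^i$'s under the integral sign. The requirement that the variation $\delta S$ vanishes in arbitrary directions $\delta x^i$ in the path space is then equivalent to the Euler–Lagrange equations
$$\frac{\partial L}{\partial x^i} - \frac{d}{ds} \frac{\partial L}{\partial \dot{x}^i} + \frac{d^2}{ds^2} \frac{\partial L}{\partial \ddot{x}^i} - \cdots = 0, \tag{2.71}$$
where $i = 1, 2, \ldots, n$. If $L$ is only a function of $x$ and $\dot{x}$, then Eq. (2.71) reduces to
$$\frac{\partial L}{\partial x^i} - \frac{d}{ds} \frac{\partial L}{\partial \dot{x}^i} = 0, \tag{2.72}$$
where $i = 1, 2, \ldots, n$.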
Example 2.12 If $M = \mathbb{R}^n$, then we can define a constant metric $g_{ij} = \delta_{ij}$. This is the standard Euclidean metric. In general in $\mathbb{R}^n$, a Riemannian metric is given by smooth real functions $g_{ij}(x) = g_{ji}(x)$ such that the matrix $(g_{ij}(x))$ is strictly positive (in the sense that all its eigenvalues are positive) for all $x \in \mathbb{R}^n$.
Example 2.13 If $M \subset \mathbb{R}^n$ is any smooth surface in the Euclidean space, then we can define a metric $g$ as follows. Let $u, v \in T_pM$ be a pair of tangent vectors to the surface at the point $p$. The tangent vectors are also vectors in $\mathbb{R}^n$, thus we may compute the scalar product $u \cdot v$. We set $g_p(u,v) = u \cdot v$. From the fact that the Euclidean metric is positive definite, it follows at once that $g$ is a positive symmetric form.
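Example 2.14 Let $M = S^2 \subset \mathbb{R}^3$. We compute the metric $g$ on $M$, as defined in Example 2.13, in terms of the spherical coordinates $\theta$ and $\phi$. The coordinate vector fields of the spherical coordinates are related to those of the standard coordinates by
$$\partial_\theta = \cos\theta \cos\phi\, \frac{\partial}{\partial x} + \cos\theta \sin\phi\, \frac{\partial}{\partial y} - \sin\theta\, \frac{\partial}{\partial z},$$
$$\partial_\phi = -\sin\theta \sin\phi\, \frac{\partial}{\partial x} + \sin\theta \cos\phi\, \frac{\partial}{\partial y}.$$
From this we obtain the inner products
$$g_{\theta\theta} = g(\partial_\theta, \partial_\theta) = 1, \qquad g_{\phi\phi} = g(\partial_\phi, \partial_\phi) = \sin^2\theta, \qquad g_{\theta\phi} = g_{\phi\theta} = 0.$$
For example, the inner product of the vectors $(1, 2)$ and $(2, -1)$ (in the $\theta$ and $\phi$ coordinates) is $1 \cdot 2 \cdot g_{\theta\theta} + 2 \cdot (-1) \cdot g_{\phi\phi} = 2 - 2\sin^2\theta$, at the point $(\theta, \phi)$. Note that the spherical coordinates are orthogonal: the off-diagonal matrix elements of $g$ are equal to zero.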
Example 2.15 According to Eq. (2.68) and Example 2.14, the distance between two points $a$ and $b$ on a sphere along a curve $\gamma(s) = (\theta(s), \phi(s))$ is given by
$$\ell(\gamma) = \int_a^b \sqrt{g_{\theta\theta}\, \dot{\theta}(s)^2 + g_{\phi\phi}\, \dot{\phi}(s)^2}\; ds = \int_a^b \left[ \dot{\theta}(s)^2 + \sin^2\theta(s)\, \dot{\phi}(s)^2 \right]^{1/2} ds.$$
The Euler–Lagrange equations then give (check this!)
$$\ddot{\theta}(s) - \frac{1}{2} \sin 2\theta(s)\, \dot{\phi}(s)^2 = 0,$$
$$\frac{d}{ds} \left[ \sin^2\theta(s)\, \dot{\phi}(s) \right] = 0,$$
which agrees with the equations in Example 2.8.
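The length integral of Example 2.15 is easy to evaluate numerically for a given curve. The following Python sketch uses a midpoint rule together with central differences for $\dot{\theta}$ and $\dot{\phi}$; the function name `curve_length`, the quadrature parameters, and the two test curves are assumptions made for illustration.

```python
import math


def curve_length(curve, a, b, n=2000, h=1e-6):
    """Midpoint-rule estimate of the length integral of Example 2.15 for a
    curve s -> (theta(s), phi(s)) on the unit sphere."""
    ds = (b - a) / n
    total = 0.0
    for k in range(n):
        s = a + (k + 0.5) * ds
        theta, _ = curve(s)
        t_plus, p_plus = curve(s + h)
        t_minus, p_minus = curve(s - h)
        dtheta = (t_plus - t_minus) / (2.0 * h)
        dphi = (p_plus - p_minus) / (2.0 * h)
        total += math.sqrt(dtheta ** 2 + math.sin(theta) ** 2 * dphi ** 2) * ds
    return total


latitude = lambda s: (math.pi / 3.0, s)   # circle of constant theta
meridian = lambda s: (s, 0.25)            # arc of a great circle

length_latitude = curve_length(latitude, 0.0, 2.0 * math.pi)
length_meridian = curve_length(meridian, 0.2, 1.2)
```

For the circle of latitude $\theta = \pi/3$ the result should be close to $2\pi \sin(\pi/3)$, and for the meridian arc it should be close to the difference of the end values of $\theta$, here $1$.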
Suppose a (pseudo) metric $g$ is given on a manifold $M$. From the metric, we can construct a preferred affine connection, called the Levi-Civita connection. Its Christoffel symbols (in given local coordinates) are given by the formula
$$\Gamma^k_{ij} = \frac{1}{2} g^{kl} (\partial_i g_{jl} + \partial_j g_{il} - \partial_l g_{ij}), \tag{2.73}$$
where $g^{ij}$ are the matrix elements of the inverse matrix $g^{-1}$. One should always be extremely careful when trying to define something with the help of local coordinates. It is not a priori clear that the locally defined Christoffel symbols in various coordinate systems match together to define a connection on the whole manifold $M$. To investigate the patching problem, we compute what happens under a coordinate transformation $y = y(x)$. Since
$$\frac{\partial}{\partial y^i} = \frac{\partial x^k}{\partial y^i} \frac{\partial}{\partial x^k}, \tag{2.74}$$
we get
$$g'_{ij}(y) = g_y\!\left( \frac{\partial}{\partial y^i}, \frac{\partial}{\partial y^j} \right) = \frac{\partial x^k}{\partial y^i} \frac{\partial x^l}{\partial y^j}\, g_x\!\left( \frac{\partial}{\partial x^k}, \frac{\partial}{\partial x^l} \right) = g_{kl}(x) \frac{\partial x^k}{\partial y^i} \frac{\partial x^l}{\partial y^j}, \tag{2.75}$$
and similarly for the inverse matrix,
$$g'^{ij}(y) = g^{kl}(x) \frac{\partial y^i}{\partial x^k} \frac{\partial y^j}{\partial x^l}. \tag{2.76}$$
Inserting this transformation law into the definition (2.73) of the Christoffel symbols, we get
$$\Gamma'^k_{ij}(y) = \frac{\partial y^k}{\partial x^c} \frac{\partial x^a}{\partial y^i} \frac{\partial x^b}{\partial y^j} \Gamma^c_{ab} + \frac{\partial y^k}{\partial x^c} \frac{\partial^2 x^c}{\partial y^i \partial y^j}, \tag{2.77}$$
as expected. Thus, the Christoffel symbols defined in different coordinate systems are compatible and indeed define an affine connection.
Example 2.16 Since the standard Euclidean metric is constant in the standard coordinates, the Christoffel symbols of the Levi-Civita connection vanish.
Example 2.17 The Christoffel symbols computed from the metric defined in Example 2.14 agree with the Christoffel symbols of Exercise 2.9.
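The statement of Example 2.17 can be verified numerically by evaluating Eq. (2.73) for the metric of Example 2.14. The Python sketch below is a minimal check, assuming central differences for the derivatives of the metric and the index convention `G[k][i][j]` for $\Gamma^k_{ij}$ (0 for $\theta$, 1 for $\phi$); these conventions and the function names are illustrative choices.

```python
import math


def metric(theta):
    """Metric of Example 2.14 in coordinates (theta, phi): diag(1, sin^2 theta)."""
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def levi_civita(theta, h=1e-5):
    """G[k][i][j] = Gamma^k_{ij} from Eq. (2.73). The metric is diagonal and
    depends on theta only, so its inverse and derivatives are simple."""
    g = metric(theta)
    g_inv = [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]
    gp, gm = metric(theta + h), metric(theta - h)

    def dg(i, a, b):
        # partial_i g_{ab}; only the theta-derivative (i = 0) is non-zero.
        return (gp[a][b] - gm[a][b]) / (2.0 * h) if i == 0 else 0.0

    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    for k in range(2):
        for i in range(2):
            for j in range(2):
                G[k][i][j] = 0.5 * sum(
                    g_inv[k][l] * (dg(i, j, l) + dg(j, i, l) - dg(l, i, j))
                    for l in range(2)
                )
    return G


theta0 = 0.9
G = levi_civita(theta0)
# Expected: G[0][1][1] = -sin(2*theta0)/2 and G[1][0][1] = G[1][1][0] = cot(theta0).
```

All other components vanish, in agreement with the table in Exercise 2.9.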
The Levi-Civita connection has two characteristic properties. The first property is that its torsion $T = 0$, since $\Gamma^k_{ij} - \Gamma^k_{ji} = 0$ according to Eq. (2.73). The second property is that the parallel transport defined by the Levi-Civita connection is metric compatible in the following sense: Let $X(s)$ and $Y(s)$ be a pair of parallel vector fields along a curve $\gamma(s)$. Then,
$$\frac{d}{ds}\, g_{\gamma(s)}(X(s), Y(s)) = 0, \tag{2.78}$$
i.e., the inner products of parallel vector fields are constant along the curve. This means that the parallel transport $\hat{\gamma} : T_pM \to T_qM$ between the end points of the curve is an isometry.
Theorem 2.3 An affine connection $\nabla$ is compatible with a metric $g$ if and only if
$$Z \cdot g(X,Y) = g(\nabla_Z X, Y) + g(X, \nabla_Z Y)$$
for all vector fields $X, Y, Z$.
A word about the notation: We write $g(X,Y)$ for the real valued smooth function $p \mapsto g_p(X(p), Y(p))$. Remember that a vector field acts on functions as a derivation, so the left-hand side is a well-defined smooth function, too.
Proof: 1) Assume that the condition for $g$ in the theorem is satisfied. Let $X(s)$ and $Y(s)$ be a pair of parallel vector fields along a curve $\gamma(s)$. We shall extend $X$ and $Y$ to vector fields defined in an open neighborhood of the curve. Let $Z$ be some vector field defined in a neighborhood of the curve such that along the curve $Z(\gamma(s)) = \dot{\gamma}(s)$. Since $X$ and $Y$ are parallel along $\gamma$, we have
$$\nabla_Z X = \nabla_Z Y = 0 \quad \text{on the curve } \gamma.$$
Thus,
$$\frac{d}{ds}\, g_{\gamma(s)}(X(s), Y(s)) = Z \cdot g(X,Y) = g(\nabla_Z X, Y) + g(X, \nabla_Z Y) = 0 \quad \text{on } \gamma.$$
2) Assume that $\nabla$ is compatible with $g$. Let $X, Y, Z$ be a triple of vector fields. Let $p \in M$ and $\gamma$ be any curve through $p$ such that at $p$, $\dot{\gamma}(s_1) = Z(p)$. Define vector fields along $\gamma$ by $X(s) = X(\gamma(s))$ and $Y(s) = Y(\gamma(s))$. Let $X_1, X_2, \ldots, X_n$ be an orthonormal basis of tangent vectors at $p$. We define a set of parallel vector fields $X_i(s)$ along $\gamma$ such that at $p = \gamma(s_1)$, we have $X_i(s_1) = X_i$. Any pair of vector fields along $\gamma$ can then be written as
$$X(s) = \alpha^i(s) X_i(s), \qquad Y(s) = \beta^i(s) X_i(s).$$
Now, we have
$$\begin{aligned}
\frac{d}{ds}\, g_{\gamma(s)}(X(s), Y(s)) &= \frac{d}{ds} \left[ \alpha^i(s) \beta^j(s)\, g_{\gamma(s)}(X_i(s), X_j(s)) \right] \\
&= \frac{d}{ds} \left[ \alpha^i(s) \beta^j(s)\, \delta_{ij} \right] = \delta_{ij} \left( \dot{\alpha}^i(s) \beta^j(s) + \alpha^i(s) \dot{\beta}^j(s) \right) \\
&= g_{\gamma(s)}(\dot{\alpha}^i(s) X_i(s), \beta^j(s) X_j(s)) + g_{\gamma(s)}(\alpha^i(s) X_i(s), \dot{\beta}^j(s) X_j(s)) \\
&= g_{\gamma(s)}(\nabla_{\dot{\gamma}} X(s), Y(s)) + g_{\gamma(s)}(X(s), \nabla_{\dot{\gamma}} Y(s)),
\end{aligned}$$
where the last step uses that the fields $X_i(s)$ are parallel, so that $\nabla_{\dot{\gamma}} X(s) = \dot{\alpha}^i(s) X_i(s)$ and similarly for $Y(s)$.
Applying this formula to the vector field $Z$ at $p$, $Z(p) = \dot{\gamma}(s_1)$, we get the condition of the theorem at (the arbitrary point) $p$.
Theorem 2.4 A geodesic of the Levi-Civita connection gives an extremal for the path length between two points. If the points are close enough, then the extremal gives the minimum length. We shall skip the proof of this theorem. It is possible to show that, given a metric, the Levi-Civita connection is the only metric compatible, torsion free connection that exists (see Exercises 2.17 and 2.18).
Exercise 2.13 Let $S \subset \mathbb{R}^3$ be a sphere of radius $r$. Starting from the Euclidean metric in $\mathbb{R}^3$, compute the curvature tensor of $S$. Compare the result with the curvature of the unit sphere computed earlier in Example 2.11.
Exercise 2.14 Compute the curvature tensor on the hyperboloid (x0)2 (x1)2 (x2)2 = r, r > 0, in R3. − − −
Exercise 2.15 Let $x(s)$ be a parameterized curve on a Riemannian manifold $M$ with a metric $g_{\mu\nu}$. Define a function $L(x(s))$ of the path by
$$L = \int_{s_1}^{s_2} g_{\mu\nu}(x(s))\, \dot{x}^\mu \dot{x}^\nu\, ds.$$
Use the Euler–Lagrange variational equations to find a second order differential equation for $x(s)$, satisfied by an extremal of $L$. Compare this with the geodesic equations.
Exercise 2.16 Complete the proof of the second Bianchi identity.
Exercise 2.17 Show that the Levi-Civita connection is metric compatible.
Exercise 2.18 Show that a torsion-free metric compatible connection is the Levi-Civita connection.
Exercise 2.19 Show that
$$\Gamma^\mu_{\mu\nu} = \frac{1}{2} \partial_\nu \ln \det g,$$
where $g = (g_{\mu\nu})$.
Exercise 2.20 Prove the relation
$$g^{\mu\nu} \Gamma^\alpha_{\mu\nu} = -\frac{1}{\sqrt{-g}}\, \partial_\beta \left( \sqrt{-g}\, g^{\alpha\beta} \right).$$
Here $g = \det(g_{\mu\nu}) < 0$.
Exercise 2.21 Suppose a cotangent vector field $X_\alpha$ satisfies Killing's equations
$$\nabla_\alpha X_\beta + \nabla_\beta X_\alpha = 0.$$
We assume that a point particle (mass $m$) is moving along a geodesic. Show that $p^\alpha X_\alpha = \text{const.}$, where $p^\alpha = m \dot{x}^\alpha$ is the 4-momentum of the particle.
2.6 Problems
Problem 2.1 Let $(\Gamma^\lambda_{\mu\nu})$ be the Levi-Civita connection associated to a metric tensor $(g_{\mu\nu})$. Show that $\Gamma^\mu_{\mu\nu} = \frac{1}{2} g^{-1} \partial_\nu g$, where $g = \det(g_{\mu\nu})$.
Problem 2.2 A ship starts from a position in the Atlantic Ocean with coordinates 10° N, 30° W (Cape Verde Islands). It sails directly to the north to the 45° northern latitude (Azores, Portugal) and then it turns abruptly to the west and sails until it hits the 60° western longitude (Nova Scotia, Canada). Suppose a vector is parallel transported along the route of the ship (with help of a gyroscope). Its initial direction is 45° (north-east). What is its final direction?
Problem 2.3 A vector is first parallel transported along a great circle on a sphere from a point A on the equator to the North pole N, then again along a great circle from N to another point B on the equator, and finally, along the equator back to the point A. Use the standard Riemannian metric on the sphere and prove that the vector is rotated in the above process by an angle θ, which is directly proportional to the area of the geodesic triangle ANB.
Problem 2.4 Starting from the definition of the curvature tensor,
$$R(X,Y)Z = [\nabla_X, \nabla_Y] Z - \nabla_{[X,Y]} Z,$$
derive the formula for the components $R^m_{ijk}$ in terms of the Christoffel symbols. Prove the first Bianchi identity
$$R^m_{ijk} + R^m_{jki} + R^m_{kij} = 0$$
in the case when the torsion $T = 0$.
Problem 2.5 Show directly from the definition of parallel transport that in a parallel transport defined by the Levi-Civita connection, $\Gamma^k_{ij} = \frac{1}{2} g^{kl} (\partial_i g_{jl} + \partial_j g_{il} - \partial_l g_{ij})$, the length of a vector is constant.
Problem 2.6 Derive the formula relating the Riemann curvature tensor to the parallel transport around an infinitesimal parallelogram.
Problem 2.7 Consider the vector fields
$$X = x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x} \quad\text{and}\quad Y = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}$$
in the $xy$-plane.
a) Determine the commutator $[X,Y]$.
b) Assume that an affine connection in the plane satisfies $\nabla_X X = -Y$, $\nabla_Y Y = Y$, $\nabla_Y X = X$, and that the torsion tensor $T$ vanishes. Compute the Riemann curvature tensor $R$.
Problem 2.8 Let $x^1$ and $x^2$ be a pair of local coordinates and
$$X = x^2\frac{\partial}{\partial x^1} - x^1\frac{\partial}{\partial x^2}, \qquad Y = x^1\frac{\partial}{\partial x^1} + x^2\frac{\partial}{\partial x^2}$$
be a pair of vector fields in $\mathbb{R}^2\setminus\{0\}$. Assume that
$$\nabla_X X = 0, \qquad \nabla_X Y = X + Y, \qquad \nabla_Y X = X - Y, \qquad \nabla_Y Y = 0.$$
Compute the components $R^{1}_{1ij}$ in the local coordinate basis, where $i,j = 1,2$, of the Riemann curvature tensor.
Problem 2.9 Derive from the definition of covariant differentiation the transformation rule for Christoffel symbols with respect to general coordinate transformations.
Problem 2.10 A manifold $M$ of dimension 3 has a basis of orthonormal vector fields $\{L_1, L_2, L_3\}$ with commutation relations
$$[L_i, L_j] = \epsilon_{ijk}L_k, \quad\text{where } i,j,k = 1,2,3.$$
Determine the Levi-Civita connection $\nabla_i = \nabla_{L_i}$ ($1 \le i \le 3$) and its Riemann curvature tensor $R$.
Hint: The Levi-Civita connection is the unique metric-compatible torsion-free connection. Use the symmetry properties of the Christoffel symbols coming from this, several times, to evaluate them.
Problem 2.11 Compute the curvature tensor on a sphere of radius r in R3, using the standard Riemannian metric.
Problem 2.12 The non-zero Christoffel symbols on a unit sphere $S^2$ are given in the spherical coordinates as
$$\Gamma^{\theta}_{\phi\phi} = -\frac{1}{2}\sin 2\theta, \qquad \Gamma^{\phi}_{\theta\phi} = \Gamma^{\phi}_{\phi\theta} = \cot\theta.$$
a) Compute the Christoffel symbols $\Gamma^{j}_{\phi i}$ and $\Gamma^{j}_{\theta i}$ in the orthonormal basis $e_1 = \partial_\theta$, $e_2 = \frac{1}{\sin\theta}\partial_\phi$, where $\nabla_\phi e_i = \Gamma^{j}_{\phi i}e_j$ and $\nabla_\theta e_i = \Gamma^{j}_{\theta i}e_j$.
b) Prove that the parallel transport of a vector $u = u_1 e_1 + u_2 e_2 = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}$ around a closed loop $\gamma(t)$ on $S^2$ is given by the operation $u' = Ru$, where $R$ is a rotation by an angle $\Omega$ equal to the area of the region bounded by the loop $\gamma$.
Hint: First, write the solution as a line integral of the Christoffel symbols around the loop, and then, apply Stokes' theorem.
Problem 2.13 Let $x, y$ be local coordinates on a surface $S$ with $x + y \neq 0$. Define a metric tensor $g$ by $g_{xx} = 1$, $g_{xy} = g_{yx} = x + y$, and $g_{yy} = 1 + (x+y)^2$. Let $\nabla$ be an affine connection defined by
$$\nabla_x \frac{\partial}{\partial x} = (x+y)\frac{\partial}{\partial x} - \frac{\partial}{\partial y},$$
$$\nabla_x \frac{\partial}{\partial y} = \left(2 + (x+y)^2\right)\frac{\partial}{\partial x} - (x+y)\frac{\partial}{\partial y},$$
$$\nabla_y \frac{\partial}{\partial x} = (x+y)(x+y+1)\frac{\partial}{\partial x} - (x+y+1)\frac{\partial}{\partial y},$$
$$\nabla_y \frac{\partial}{\partial y} = \left((x+y+1)(1+(x+y)^2) + 1\right)\frac{\partial}{\partial x} - (x+y)(x+y+1)\frac{\partial}{\partial y}.$$
a) Compute the Christoffel symbols in the orthonormal basis
$$e_1 = \frac{\partial}{\partial x}, \qquad e_2 = -(x+y)\frac{\partial}{\partial x} + \frac{\partial}{\partial y}.$$
(The result is very simple.)
b) Consider the parallel transport of a pair of vectors starting from the point $(x,y) = (1,1)$, counterclockwise along the full circle with center at $(x,y) = (2,2)$ and radius $r = \sqrt{2}$. Assume that the initial angle between the vectors is $\pi/3$. What is the angle after the parallel transport around the loop?
Problem 2.14 Fix a metric on the paraboloid $z = x^2 + y^2$ induced by the standard Euclidean metric in $\mathbb{R}^3$. Compute the components of the curvature tensor on the paraboloid.
Hint: Use polar coordinates in the $xy$-plane.
Problem 2.15 Let $M$ be the hyperboloid $t^2 - x^2 - y^2 = -1$ in $\mathbb{R}^3$. We define a pseudo-Riemannian metric $g$ on $M$ as the restriction of the Minkowski metric $ds^2 = dt^2 - dx^2 - dy^2$ to the surface $M$.
a) Show that the metric $g$ has signature $(+-)$ (1 time-like and 1 space-like direction at each point on $M$).
b) Write explicitly the geodesic equations on $M$ in terms of the cylindrical coordinates $(t, r, \phi)$, where $x = r\cos\phi$ and $y = r\sin\phi$. Compute the distance between the points $(0, 1, 0)$ and $(1, \sqrt{2}, 0)$.
Problem 2.16 Consider the pseudo-Riemannian metric
$$ds^2 = (dx^1)^2 + (dx^2)^2 - (dx^3)^2 - (dx^4)^2$$
in $\mathbb{R}^4$. This induces a pseudo-Riemannian metric $g$ on the surface
$$S:\quad (x^1)^2 + (x^2)^2 - (x^3)^2 - (x^4)^2 = 1.$$
a) Show that the metric $g$ on $S$ is Lorentzian, i.e., it has one time-like and two space-like directions at each point.
b) Construct a pair of constants of motion for freely falling bodies by integrating once the geodesic equations on $S$.
Problem 2.17 Let M be a Lorentzian manifold of dimension n = 3. Assume that there is an orthogonal basis of vector fields X,Y,Z such that
1. $g(X,X) = -g(Y,Y) = -g(Z,Z) = 1$,
2. $[X,Y] = Z$, $[Y,Z] = -X$, $[Z,X] = Y$,
where $g$ is the metric tensor. Compute the Christoffel symbols of the Levi-Civita connection and the Riemann curvature tensor in this basis.
Hint: Use the symmetry properties of the Christoffel symbols coming from the torsion free property of the connection together with $\nabla_X g = \nabla_Y g = \nabla_Z g = 0$.
Chapter 3
General Relativity
In general relativity, the Minkowski space of special relativity is generalized to a four-dimensional space-time, which is a manifold $M$ equipped with a Lorentzian metric $g$ with signature $(1,3)$. The proper time of an observer is the Lorentzian length of the corresponding world line. Space-like, time-like, and null vectors are defined in the same way as in the case of special relativity. However, it is not as easy to define future and past pointing vectors. If there exists a global time-like vector field $t^\mu$, a non-space-like vector $A^\nu$ can be defined as future pointing if
$$g_{\mu\nu}t^\mu A^\nu > 0. \tag{3.1}$$
Unfortunately, it is not always possible to find such a vector field $t^\mu$. A space-time where a time direction may be defined in the above way is called time oriented.
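As a small illustration of the criterion (3.1), the following sketch classifies 4-vectors and tests time orientation in flat Minkowski space, where the constant field $t^\mu = (1,0,0,0)$ can serve as the global time-like vector field. The names `ETA`, `inner`, `classify`, and `is_future_pointing` are chosen here for illustration only and do not appear in the text.

```python
# Sketch: classify 4-vectors and test time orientation in Minkowski space.
# ETA, inner, classify, and is_future_pointing are illustrative names.

# Minkowski metric with signature (1, 3): one positive and three negative directions.
ETA = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]

def inner(g, a, b):
    """Return g_{mu nu} a^mu b^nu for a 4x4 metric g and 4-vectors a, b."""
    return sum(g[m][n] * a[m] * b[n] for m in range(4) for n in range(4))

def classify(g, a):
    """Classify a vector by the sign of its Lorentzian norm g(a, a)."""
    norm = inner(g, a, a)
    if norm > 0:
        return "time-like"
    if norm < 0:
        return "space-like"
    return "null"

def is_future_pointing(g, t, a):
    """Apply Eq. (3.1): a non-space-like a is future pointing if g(t, a) > 0."""
    if classify(g, t) != "time-like":
        raise ValueError("t must be time-like")
    if classify(g, a) == "space-like":
        raise ValueError("Eq. (3.1) applies only to non-space-like vectors")
    return inner(g, t, a) > 0

t = (1, 0, 0, 0)                                   # chosen time orientation
print(is_future_pointing(ETA, t, (2, 1, 0, 0)))    # time-like vector with positive x^0
print(is_future_pointing(ETA, t, (-1, 1, 0, 0)))   # null vector with negative x^0
```

In a curved space-time the same test would be applied point by point with the local values of $g_{\mu\nu}$ and $t^\mu$; the difficulty mentioned above is the existence of a globally defined $t^\mu$, which this flat-space sketch does not address.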
3.1 The Einstein Field Equations
The Einstein tensor is defined as
$$G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R, \tag{3.2}$$
where $R = g^{\mu\nu}R_{\mu\nu}$ is the Ricci scalar. We assume that the metric tensor $g_{\mu\nu}$ is pseudo-Riemannian with signature $(1,3)$ (one positive direction and three negative directions) and $g^{\mu\nu}$ is its inverse. The connection is the Levi-Civita connection computed from the metric and $R_{\mu\nu} = R^{\lambda}_{\mu\lambda\nu}$ is the Ricci tensor.
Exercise 3.1 Writing $R_{\alpha\beta\mu\nu} = g_{\alpha\lambda}R^{\lambda}_{\beta\mu\nu}$, show that
$$R_{\alpha\beta\mu\nu} = -R_{\beta\alpha\mu\nu} = -R_{\alpha\beta\nu\mu} = R_{\mu\nu\alpha\beta}.$$
Show that this implies that $R_{\mu\nu}$ is symmetric.
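The Einstein tensor is symmetric. Furthermore, its covariant divergence vanishes,
$$\nabla_\mu G^{\mu\nu} = \partial_\mu G^{\mu\nu} + \Gamma^{\mu}_{\mu\alpha}G^{\alpha\nu} + \Gamma^{\nu}_{\mu\alpha}G^{\mu\alpha} = 0. \tag{3.3}$$
This is seen as follows. First, taking $Z = \partial_\alpha$, $X = \partial_\mu$, $Y = \partial_\nu$ in Theorem 2.3, we obtain
$$\partial_\alpha g_{\mu\nu} = \Gamma^{\beta}_{\alpha\mu}g_{\beta\nu} + \Gamma^{\beta}_{\alpha\nu}g_{\mu\beta} = \Gamma_{\alpha\mu\nu} + \Gamma_{\alpha\nu\mu}. \tag{3.4}$$
This can also be written as
$$(\nabla_\alpha g)_{\mu\nu} = 0. \tag{3.5}$$
For the inverse metric tensor $g^{\mu\nu} = (g^{-1})_{\mu\nu}$, one gets
$$\partial_\alpha g^{\mu\nu} = -\Gamma^{\mu}_{\alpha\beta}g^{\beta\nu} - \Gamma^{\nu}_{\alpha\beta}g^{\mu\beta}. \tag{3.6}$$
Note the difference in sign between Eq. (3.4) for the metric tensor and Eq. (3.6) for its inverse.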
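Equation (3.6) is stated without proof. One way to check it, sketched here under the assumption that only Eq. (3.4) and the identity $g^{\mu\beta}g_{\beta\gamma} = \delta^{\mu}_{\gamma}$ are used, is to differentiate that identity and solve for the derivative of the inverse metric:

$$
\begin{aligned}
0 &= \partial_\alpha\left(g^{\mu\beta}g_{\beta\gamma}\right) = \left(\partial_\alpha g^{\mu\beta}\right)g_{\beta\gamma} + g^{\mu\beta}\,\partial_\alpha g_{\beta\gamma}, \\
\partial_\alpha g^{\mu\nu} &= -g^{\mu\beta}\left(\partial_\alpha g_{\beta\gamma}\right)g^{\gamma\nu} \\
&= -g^{\mu\beta}\left(\Gamma^{\lambda}_{\alpha\beta}g_{\lambda\gamma} + \Gamma^{\lambda}_{\alpha\gamma}g_{\beta\lambda}\right)g^{\gamma\nu} \\
&= -\Gamma^{\nu}_{\alpha\beta}g^{\mu\beta} - \Gamma^{\mu}_{\alpha\gamma}g^{\gamma\nu}.
\end{aligned}
$$

The second line follows by contracting the first with $g^{\gamma\nu}$, the third inserts Eq. (3.4), and the last uses $g_{\lambda\gamma}g^{\gamma\nu} = \delta^{\nu}_{\lambda}$ and $g^{\mu\beta}g_{\beta\lambda} = \delta^{\mu}_{\lambda}$. After renaming the summation index $\gamma \to \beta$, the result agrees with Eq. (3.6).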
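Exercise 3.2 For any vector field $X = X^\mu\partial_\mu$ the components of the covariant derivatives are $(\nabla_\nu X)^\mu = \partial_\nu X^\mu + \Gamma^{\mu}_{\nu\alpha}X^\alpha$. Show that the covariant divergence is given by
$$(\nabla_\mu X)^\mu = (-\det g)^{-1/2}\,\partial_\mu\left((-\det g)^{1/2}X^\mu\right).$$
In the relativity theory literature, it is customary to use the abbreviation $X^{\mu}{}_{;\nu} = (\nabla_\nu X)^\mu$ for the covariant differentiation of vector (and higher order tensor) indices. With this notation, we can write the second Bianchi identity as
$$R_{\alpha\beta\mu\nu;\lambda} + R_{\alpha\beta\nu\lambda;\mu} + R_{\alpha\beta\lambda\mu;\nu} = 0. \tag{3.7}$$
Contracting the $\alpha$ and $\mu$ indices in this identity with the metric tensor, we get
$$g^{\alpha\mu}\left(R_{\alpha\beta\mu\nu;\lambda} + R_{\alpha\beta\nu\lambda;\mu} + R_{\alpha\beta\lambda\mu;\nu}\right) = 0. \tag{3.8}$$
By the definition of the Ricci tensor, this can be written as
$$R_{\beta\nu;\lambda} + R^{\mu}_{\beta\nu\lambda;\mu} - R_{\beta\lambda;\nu} = 0, \tag{3.9}$$
where we have taken into account that the covariant derivative of $g_{\mu\nu}$ vanishes, implying that the multiplication with the components of the metric tensor commutes with covariant differentiation; in particular, index raising and lowering commutes with covariant derivatives. Contracting Eq. (3.9) once again with $g^{\beta\nu}$, we get
$$g^{\beta\nu}\left(R_{\beta\nu;\lambda} + R^{\mu}_{\beta\nu\lambda;\mu} - R_{\beta\lambda;\nu}\right) = 0. \tag{3.10}$$
Using the results of Exercise 3.1, we get
$$g^{\beta\nu}R^{\mu}_{\beta\nu\lambda;\mu} = g^{\beta\nu}g^{\alpha\mu}R_{\alpha\beta\nu\lambda;\mu} = -g^{\beta\nu}g^{\alpha\mu}R_{\beta\alpha\nu\lambda;\mu} = -g^{\mu\alpha}R^{\nu}_{\alpha\nu\lambda;\mu} = -g^{\mu\alpha}R_{\alpha\lambda;\mu} = -R^{\mu}{}_{\lambda;\mu}. \tag{3.11}$$
Inserting this into the second term in Eq. (3.10), we obtain
$$R_{;\lambda} - R^{\mu}{}_{\lambda;\mu} - R^{\nu}{}_{\lambda;\nu} = 0. \tag{3.12}$$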
Note that since $R$ is a scalar, $R_{;\mu} = \partial_\mu R$. An equivalent form of the previous equation is
$$\left(2R^{\mu}{}_{\lambda} - \delta^{\mu}_{\lambda}R\right)_{;\mu} = 0. \tag{3.13}$$
Raising the index $\lambda$ and dividing by 2 finally leads to
$$\left(R^{\mu\nu} - \frac{1}{2}g^{\mu\nu}R\right)_{;\mu} = 0. \tag{3.14}$$
We have now shown that the covariant divergence of the Einstein tensor vanishes.
Einstein's gravitational field equations are written simply as
$$G^{\mu\nu} = 8\pi\frac{G}{c^4}T^{\mu\nu}, \tag{3.15}$$
where $G$ on the right-hand side (not to be confused with the Einstein tensor!) is Newton's gravitational constant and $T^{\mu\nu}$ is the energy-momentum (stress-energy) tensor. It describes the distribution of energy in space-time. For example, the electromagnetic field gives a contribution to $T^{\mu\nu}$ defined by $T^{\mu\nu}_{\rm EM} = \epsilon_0 F^{\mu}{}_{\lambda}F^{\lambda\nu} + \frac{\epsilon_0}{4}g^{\mu\nu}F_{\lambda\omega}F^{\lambda\omega}$ (compare with Eq. (1.83)). Another example is the energy-momentum tensor of a perfect fluid. A perfect fluid is characterized by a 4-velocity field $u$, a scalar density field $\rho_0$, and a scalar pressure field $p$. The energy-momentum tensor is defined as (when $c = 1$)
$$T^{\mu\nu} = (\rho_0 + p)u^\mu u^\nu - p\,g^{\mu\nu}. \tag{3.16}$$
A special case of this is $p = 0$, which can be viewed as the energy-momentum tensor of a flow of non-interacting dust particles. Normally, $\rho_0$ and $p$ are not independent, but they are related by an equation of state of the form $p = p(\rho_0, T)$, where $T$ is the temperature. The requirement that the covariant divergence of the energy-momentum tensor vanishes leads to the equations of motion for the perfect fluid. In fact, in the case of Minkowski space-time and in the non-relativistic limit, one gets from $\nabla_\mu T^{\mu i} = 0$ (for $i = 1, 2, 3$) the classical Navier–Stokes equation without viscosity (the Euler equation)
$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u}\right) = -\nabla p \tag{3.17}$$
and from $\nabla_\mu T^{\mu 0} = 0$ the classical continuity equation
$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{u}) = 0. \tag{3.18}$$
Here $\rho = \rho_0(1 - u^2)^{-1}$.
The Einstein field equations can be derived by varying the action
$$\mathcal{S} = \int\sqrt{-\bar{g}}\,(R + \mathcal{L})\,d^4x, \tag{3.19}$$
where $\bar{g} = \det g$, $R$ is the Ricci scalar, and $\mathcal{L}$ is the Lagrangian matter density. If there is no matter present, then $\mathcal{L} = 0$ and the resulting Euler–Lagrange equations are solved by, for example, the Minkowski metric $g_{\mu\nu} = \eta_{\mu\nu}$. Thus, the Minkowski space describes a space-time where there is no matter present.
Let $S$ be some space-like surface with a time-like normal unit vector field $n^\mu$, $n^0 > 0$. Then,
$$\int_S (-\det g)^{1/2}\,T^{\mu\nu}n_\nu\,d^3x$$
gives the energy and momentum contained in $S$. Equation (3.3), together with Einstein's equations (3.15), implies $\nabla_\mu T^{\mu\nu} = 0$, which leads to the following conservation law of energy and momentum. Suppose that the metric $g_{\alpha\beta}$ does not depend on a particular coordinate $x^\mu$. Then,
$$0 = \partial_\mu g_{\alpha\beta} = \Gamma_{\mu\beta\alpha} + \Gamma_{\mu\alpha\beta}. \tag{3.20}$$
Thus, for this value of $\mu$, $\Gamma_{\mu\alpha\beta}$ is antisymmetric in the last two indices. Now,
$$(\nabla_\nu T)^{\nu}{}_{\mu} = \partial_\nu T^{\nu}{}_{\mu} + \Gamma^{\nu}_{\nu\lambda}T^{\lambda}{}_{\mu} - \Gamma^{\lambda}_{\nu\mu}T^{\nu}{}_{\lambda}. \tag{3.21}$$
The third term on the right-hand side is equal to $-\Gamma_{\nu\mu\lambda}T^{\nu\lambda}$ and it vanishes because the second factor is symmetric in its indices, whereas the first factor is antisymmetric in $\lambda$ and $\nu$ by the remark above (note that the Christoffel symbols of a Levi-Civita connection are always symmetric in the two lower indices). On the other hand, the sum of the first two terms is $(-\det g)^{-1/2}\,\partial_\nu\left[(-\det g)^{1/2}T^{\nu}{}_{\mu}\right]$, according to the result of Exercise 3.2. Thus, for fixed $\mu$, $J^\nu = (-\det g)^{1/2}T^{\nu}{}_{\mu}$ is conserved in the usual sense,
$$\partial_\nu J^\nu = 0. \tag{3.22}$$
In order to avoid convergence problems with the infinite integrals, we assume that all energy and momentum are contained in a compact region $K$ in space-time. Consider a surface $S$, consisting of two space-like components $S_1$ and $S_2$ and some surface $S_3$ 'far away' such that $T$ vanishes on $S_3$. Using Gauss' law and the current conservation, we conclude that the surface integral of $(-\det g)^{1/2}T^{\nu}{}_{\mu}n_\nu$ over $S$ vanishes. In other words,
$$\int_{S_1}(-\det g)^{1/2}\,T^{\nu}{}_{\mu}n_\nu\,d^3x = \int_{S_2}(-\det g)^{1/2}\,T^{\nu}{}_{\mu}n_\nu\,d^3x. \tag{3.23}$$
We have taken into account that, since $n$ is future pointing, one of the normal vector fields on $S_1$ and $S_2$ is outward directed and the other inward directed. Equation (3.23) tells us that the stress-energy, in the $\mu$-direction, on $S_1$ is the same as the corresponding quantity on $S_2$; one could think of $S_i$ as a fixed time slice at time $t_i$ and one obtains the usual law of conservation of energy or momentum.
Often one uses units in which $G = 1$ and $c = 1$ so that one does not need to write explicitly the coefficient $G/c^4$ in Einstein's equations.
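To get a feeling for the size of the coefficient that disappears in units with $G = c = 1$, the following sketch evaluates $8\pi G/c^4$ in SI units and converts a mass to the corresponding length $GM/c^2$. The numerical values of $G$, $c$, and the solar mass are standard reference values supplied here as assumptions; they are not given in the text, and the function names are illustrative.

```python
import math

# Reference values in SI units (assumed inputs, not taken from the text).
G = 6.67430e-11        # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0      # speed of light, m/s
M_SUN = 1.98847e30     # solar mass, kg

def einstein_coupling(g=G, c=C):
    """Coefficient 8*pi*G/c^4 multiplying T^{mu nu} in Eq. (3.15), in s^2 m^-1 kg^-1."""
    return 8.0 * math.pi * g / c**4

def mass_as_length(mass, g=G, c=C):
    """Mass expressed as the length G*M/c^2 in metres, as in units with G = c = 1."""
    return g * mass / c**2

print(f"8 pi G / c^4  = {einstein_coupling():.4e} s^2 m^-1 kg^-1")
print(f"G M_sun / c^2 = {mass_as_length(M_SUN):.1f} m")
```

The coupling is of order $10^{-43}$ in SI units, which is one way to see why ordinary energy densities curve space-time so weakly; in geometric units the solar mass corresponds to a length of roughly one and a half kilometres.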
3.2 The Newtonian Limit
It is known that the Newtonian gravitational theory is valid for fields that can produce only velocities much smaller than the speed of light. Since the components $T^{0i}$ and $T^{ij}$ are related to spatial momenta and $T^{00}$ is related to energy, this condition says that $|T^{00}|$ is much larger than the other components. Because of Einstein's equations, the same is true for the components of the Einstein tensor. Furthermore, we expect that for weak gravitational fields the metric $g_{\mu\nu}$ differs slightly from the Minkowski metric $\eta_{\mu\nu}$,
$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} \tag{3.24}$$
for a small perturbation $h_{\mu\nu}$. Next, we compute the connection, curvature, and finally the Ricci tensor to first order in the perturbation $h_{\mu\nu}$. A straightforward computation, starting from the definitions of the various tensors and using the harmonic gauge condition $\partial_\mu h^{\nu\mu} = \eta^{\nu\mu}\partial_\mu h/2$, gives
$$G_{\mu\nu} = -\frac{1}{2}\Box\left(h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h\right), \tag{3.25}$$
where $h = \eta^{\mu\nu}h_{\mu\nu}$ and $\Box = \partial_\mu\partial^\mu$ is the d'Alembert operator. Thus, Einstein's equations, in this approximation, are linear,
$$-\frac{1}{2}\Box\left(h^{\mu\nu} - \frac{1}{2}\eta^{\mu\nu}h\right) = 8\pi\frac{G}{c^4}T^{\mu\nu}. \tag{3.26}$$
Taking into account the remark in the beginning of this section, only the 00-component is relevant,
$$\Box\left(h^{00} - \frac{1}{2}h\right) = -16\pi\frac{G}{c^2}\rho, \tag{3.27}$$
where $\rho = T^{00}/c^2$ is the matter density in the rest system of the source. We can also drop the time derivatives (in the system of coordinates where the source is slowly moving, because $\partial_0 = \frac{1}{c}\partial_t$) and so the only relevant equation becomes
$$\nabla^2\left(h^{00} - \frac{1}{2}h\right) = 16\pi\frac{G}{c^2}\rho. \tag{3.28}$$
This means that
$$h^{00} - \frac{1}{2}h = \frac{4}{c^2}\phi, \tag{3.29}$$
where $\phi$ is the gravitational potential for the matter distribution $\rho$. (Compare Eq. (3.28) with the Newtonian equation $\nabla^2\phi = 4\pi G\rho$, where $\phi = -GM/r$ for a point mass $M$.)
Since all the other components of $h^{\mu\nu} - \frac{1}{2}\eta^{\mu\nu}h$ vanish at this order of approximation, we finally get
$$h^{\mu\mu} = h_{\mu\mu} = \frac{2}{c^2}\phi \quad\text{(no summation!)} \tag{3.30}$$
for all $\mu = 0, 1, 2, 3$. For a point mass at the origin, this expression is equal to $-2GM/c^2r$.
Next, we shall compute the geodesics for the metric $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ in the linear approximation (we neglect higher order terms in $h_{\mu\nu}$). For small velocities, the time component $\dot{x}^0(s)$ of the 4-velocity is much larger than the spatial components. For this reason, we can approximate the geodesic equations of motion as
$$\frac{d^2x^\mu}{ds^2} + \Gamma^{\mu}_{00}\left(\frac{dx^0}{ds}\right)^2 = 0. \tag{3.31}$$
In the linear approximation,
$$\Gamma^{0}_{00} = \frac{1}{c^2}\partial^0\phi, \qquad \Gamma^{i}_{00} = -\frac{1}{c^2}\partial^i\phi. \tag{3.32}$$
Thus, the geodesic equations become
$$\ddot{x}^0 + \frac{1}{c^2}\partial^0\phi\,(\dot{x}^0)^2 = 0, \qquad \ddot{x}^i - \frac{1}{c^2}\partial^i\phi\,(\dot{x}^0)^2 = 0. \tag{3.33}$$
In the coordinate system where the source is at rest, $\partial^0\phi = 0$, so the first equation says that we can choose the time $t$ as the geodesic parameter, $x^0(s) = s = ct$, and then the second equation becomes
$$\frac{d^2x^i}{dt^2} = \partial^i\phi. \tag{3.34}$$
The right-hand side (after multiplication by the mass $m$ of the test particle) is the gravitational force of the source on $m$, so this equation is just Newton's second law, $m\mathbf{a} = \mathbf{F}$, where $\mathbf{F} = -\nabla\Phi$, $\nabla = (\partial_1, \partial_2, \partial_3) = -(\partial^1, \partial^2, \partial^3)$, and $\Phi = m\phi$.
3.3 The Schwarzschild Metric
The basic problem in Newtonian celestial mechanics is to solve the equations of motion outside of a spherically symmetric mass distribution (orbits of the planets around the Sun, orbits of satellites around the Earth). In general relativity, the first natural problem is to search for spherically symmetric solutions of Einstein's equations. Actually, there is a unique 1-parameter family of spherically symmetric vacuum solutions that are asymptotically flat, meaning that at large distances from the source the metric tends to the flat Minkowski metric $ds^2 = \eta_{\mu\nu}dx^\mu dx^\nu = (dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2$. This is the content of Birkhoff's theorem (which we are not going to prove). The line element of the metric is given as
$$ds^2 = g_{\mu\nu}dx^\mu dx^\nu = \left(1 - \frac{2GM}{c^2r}\right)(dx^0)^2 - \left(1 - \frac{2GM}{c^2r}\right)^{-1}dr^2 - r^2d\Omega^2, \tag{3.35}$$
where $d\Omega^2$ is the angular part of the Euclidean metric in $\mathbb{R}^3$, $d\Omega^2 = d\theta^2 + \sin^2\theta\,d\phi^2$. It is clear from Eq. (3.35) that for large distances $r$ the metric approaches the Minkowski metric. The line element (3.35) is called the Schwarzschild metric.
When $r > 2GM/c^2$ the Schwarzschild metric is supposed to describe the gravitational field outside of a spherically symmetric star. The other disconnected region $r < 2GM/c^2$ is the Schwarzschild black hole. The singularity at $r = r_S = 2GM/c^2$, the Schwarzschild event horizon, is actually due to a bad choice of coordinates. There is a way to glue the inside solution to the outside solution in a smooth way by a suitable choice of coordinates; the complete discussion of this was first given by Kruskal and Szekeres in 1960. The Kruskal–Szekeres metric is given as follows. The coordinates are denoted by $(u, v, \theta, \phi)$. The latter two are the ordinary spherical coordinates on a unit sphere. The coordinates $(u, v)$ are restricted to the region $L \subset \mathbb{R}^2$ defined by
$$uv < \frac{2GM}{c^2e}.$$
The metric is then
$$ds^2 = \frac{16\mu^2}{r}\,e^{(2\mu - r)/2\mu}\,du\,dv - r^2d\Omega^2, \tag{3.36}$$
where $\mu = GM/c^2$ and $r$ (as well as the time $t = t(u,v)$, see below) is a function of $u, v$. The coordinate $r$ is defined by the equation
$$uv = (2\mu - r)\,e^{(r - 2\mu)/2\mu}. \tag{3.37}$$
[Figure: diagram in the $(u, v)$ plane showing the light cone at a point $p$ and the regions $K_1$ and $K_2$.]
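Equation (3.37) defines $r$ only implicitly. Since its right-hand side is strictly decreasing in $r$ for $r > 0$ and equals $2\mu/e$ at $r = 0$, every value $uv < 2\mu/e$ corresponds to exactly one $r > 0$, which can be found numerically. The following is a minimal sketch using bisection; the function names `kruskal_uv` and `kruskal_r`, the tolerance, and the choice of bisection are assumptions made here for illustration.

```python
import math

def kruskal_uv(r, mu):
    """Right-hand side of Eq. (3.37): the product uv as a function of r."""
    return (2.0 * mu - r) * math.exp((r - 2.0 * mu) / (2.0 * mu))

def kruskal_r(uv, mu, rel_tol=1e-13, max_iter=200):
    """Invert Eq. (3.37) for r > 0 by bisection (illustrative helper)."""
    if mu <= 0:
        raise ValueError("mu = GM/c^2 must be positive")
    if uv >= 2.0 * mu / math.e:
        raise ValueError("uv must lie in the region uv < 2*mu/e")

    def f(r):
        # f is strictly decreasing for r > 0, because
        # d(uv)/dr = -(r / (2 mu)) * exp((r - 2 mu) / (2 mu)) < 0.
        if (r - 2.0 * mu) / (2.0 * mu) > 700.0:   # avoid overflow in exp
            return -math.inf
        return kruskal_uv(r, mu) - uv

    lo, hi = 0.0, 2.0 * mu      # f(lo) = 2*mu/e - uv > 0 inside the allowed region
    while f(hi) > 0:            # widen the bracket; only needed when uv < 0 (r > 2 mu)
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo <= rel_tol * hi:
            break
    return 0.5 * (lo + hi)

mu = 1.0
for r in (0.5, 2.0, 3.0, 10.0):
    # Round trip: compute uv from r, then recover r from uv.
    print(r, kruskal_r(kruskal_uv(r, mu), mu))
```

In this sketch $uv = 0$ returns $r = 2\mu$, the event horizon, while $uv > 0$ gives $r < 2\mu$ and $uv < 0$ gives $r > 2\mu$.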