1. RELATIVISTIC

The one truth of which the human mind can be certain – indeed, this is the meaning of consciousness itself – is the recognition of its own existence. That we may be secure in this truth is assured us by Descartes’ famous axiom, even if everything else, including Descartes, is a figment of our imagination. Nothing else can be proved. Our most fundamental belief, then, is that the universe exists around us, consisting of three spatial dimensions and time, and that while we can move about in the three spatial dimensions, time flows inexorably onward, everywhere the same. Newton himself said it: “…absolute, true, and mathematical time, of itself, and from its own nature, flows equably and without relation to anything external.” But he was wrong. The theory of relativity shows us that time and space do not have the meaning we thought they had. In the words of Weyl, “…we are to discard our belief in the objective meaning of simultaneity; it was the great achievement of Einstein in the field of the theory of knowledge that he banished this dogma from our minds, and this is what leads us to rank his name with that of Copernicus” (italics his).

But the discovery of relativity by Einstein in 1905 was not a bolt from the blue. People had been concerned about the nature of space and time for hundreds of years before that, becoming more and more disturbed by the inconsistencies in our understanding of the physical world toward the end of the 19th century. Nevertheless, even after the true nature of space and time became clear, the theory of relativity so contradicted our most fundamental belief that it was rejected for years.
Einstein himself never received the Nobel Prize for this work that was, in the words of Bertrand Russell, “probably the greatest synthetic achievement of the human intellect up to the time.” Some sixteen years afterward he was reluctantly awarded the Nobel Prize for a lesser work because the greatest scientist of the century, known to more people than the President of the United States, could not be completely ignored. Yet the truth Einstein taught us displayed once again nature’s tendency to assume the most beautiful, symmetric form, in spite of our objections. And while the truth is a merger of space and time that prevents us from ordering events absolutely in time, it does not result in chaos, but preserves those features that cannot logically be denied. The principle of causality is never violated, and as each of us progresses through this four-dimensional space-time, our individual perception of time as moving always forward is not contradicted.

1.1. Einstein’s Postulates

Einstein's solution to the dilemma of the speed of light was as beautiful as it was radical. He chose the most symmetric form for nature, stating that all inertial reference frames are equivalent. He embodied this concept in his two postulates of relativity, which state:

1. The laws of nature are identical in all inertial frames of reference. That is, if we transform the mathematical equations of physics from one inertial reference frame to another they remain in the same form.

2. The speed of light c is the same to all observers at rest in inertial frames of reference. A more general statement of this principle might be that the influence of one particle is not felt instantaneously by another. Instead, the influence propagates at some (maximum) velocity c.

These two postulates can be summed up in a single postulate that states that there is no experiment that can be done to distinguish the absolute velocity of any inertial reference frame.

Einstein’s postulates appear beautifully simple and symmetric. In fact they are deceptively simple, since within them lie profound consequences and more than a few startling paradoxes. Fundamental to all the paradoxes is the fact that simultaneity is no longer an objective reality, as Weyl points out, but rather a subjective one that depends on the observer. A simple example, shown in Figure 1, illustrates this. Consider the following "gedanken experiment" (thought experiment; Einstein loved gedanken experiments):

Figure 1

On the moon, Buffy and Bubba, representing the lunar colonies Alpha and Beta, are having a green-cheese eating contest. When the winner is declared, the news is radioed to the folks back home in Alpha and Beta, which are equidistant from the site of the contest. Each colony receives the news, which travels at the speed of light, at exactly the same time. Right? Of course. But Hilda and Wolfgang, who are passing the moon in their space ship on their way back to earth after a vacation, watch the events on the moon and come to a different conclusion. Since, as they view it, the moon is moving to the left, the news reaches Beta, which is moving toward the contest, before it reaches Alpha. Right? Of course. Who is right? Well, they both are. There is no absolute meaning to the concept of simultaneity. In fact, let’s check in with Edgar and Eloise, who are in a space ship going the opposite direction from Wolfie and Hildie, just starting their vacation. As viewed by E and E, the moon is going the opposite direction and the news reaches Alpha before it reaches Beta. We can’t even get agreement on which of two events (the news arriving at Alpha and at Beta) occurred first. Time just doesn’t work that way, although the effects are usually so small that you never noticed it.

You may (you should!) wonder what has happened to cause and effect. For example, if event A causes event B, what happens if someone else observes events A and B to occur in the reverse order? This is similar to the logical difficulty that occurs when people travel back in time. Can Dr. No travel back in time and kill his mother so that he himself is never born? Well, of course not, regardless of whether you liked “Back to the Future” or not. In fact, the theory of relativity does not violate causality. Two events can be reordered in time by other observers only if the events happen so far apart in space and so close together in time that neither light nor anything else (which must travel slower than light) can get from the first event to the second. Therefore, event A cannot have any influence on (or cause) event B, and the principle of causality is not violated. Clearly, the two events called “news reaching Alpha” and “news reaching Beta” are too far apart in distance to be connected by a single light pulse. It takes two light pulses to reach the two events, so it is OK that they can be reordered in time by different observers. One of these two events can never cause or influence the other.

1.2. Time dilation

Let’s do another gedanken experiment. This time we put a simple (in concept, at least) clock on the space ship with Wolfie and Hildie. The clock sends a short laser pulse up to a mirror, and when the pulse strikes the mirror and returns, the clock ticks once and sends out the next pulse. If the distance to the mirror is $L$, the round-trip distance traveled by the laser pulse is $d = 2L$. If $c$ is the (universal) velocity of light, the clock ticks once in the time $\Delta t = d/c = 2L/c$. But what do Buffy and Bubba, standing on the moon, think of this?

Figure 2

As they see it, the light makes a triangular trip up and down as the laser and the mirror move to the right at the velocity $v$. The total distance the light travels in one tick is found from Pythagoras' theorem:

$$d'^2 = (2L)^2 + (v\,\Delta t')^2 \tag{1.1}$$

But light travels at the velocity $c$, so the time for the clock to tick once is $\Delta t' = d'/c$. Substituting $d' = c\,\Delta t'$ and $2L = c\,\Delta t$ into (1.1) gives $c^2\Delta t'^2 = c^2\Delta t^2 + v^2\Delta t'^2$. The moving clock goes too slow (that is, it is observed by Buffy and Bubba to take too long to tick) by the factor

$$\frac{\Delta t'}{\Delta t} = \gamma \tag{1.2}$$

where

$$\gamma = \frac{1}{\sqrt{1-\beta^2}} \tag{1.3}$$

$$\beta = \frac{v}{c} \tag{1.4}$$

are the relativistic parameters.

But this isn’t just a case of a clock going too slow. The clock is just fine. In fact, everything on the moving space ship is going too slow. Wolfie and Hildie’s hearts beat too slowly, and they are aging too slowly, at least according to Buffy and Bubba. Wolfie and Hildie don’t see anything wrong. There is no experiment they can do to detect their motion, after all. Should Bubba and Buffy be jealous that Wolfie and Hildie are getting old slower? Not at all; Wolfie and Hildie are not enjoying the extra time. Their thoughts, their days, everything is going slower for them. They don’t experience any extra time. In fact, when you think of it, Wolfie and Hildie see a clock belonging to Buffy and Bubba going too slow compared with their clock. After all, they see themselves as stationary and Buffy and Bubba moving past them on a (very large) space ship. So they see Buffy and Bubba getting old slower than they are. Each one sees the other’s clock as running slower than their own! This is known as time dilation; time on a moving space ship is observed to be stretched out. Once again, this is not a problem caused by bad clocks. This is the nature of time itself!

How can this happen? The paradox is resolved when we consider how the comparison is done. When Buffy and Bubba watch Wolfie and Hildie’s clock, they do it (or can do it, and all methods are equivalent) by watching one clock on the space ship as it passes close to two clocks in two separate places on the moon. When Wolfie and Hildie do the comparison, they watch one moon clock as it goes past two separate clocks on their space ship. It turns out that since the two pairs of observers can’t agree on simultaneity, they have set their clocks incorrectly (relative to one another, in some sense) when they moved them into place for the comparisons. In any event, there is no paradox since the same clocks are not being used in the two measurements.
There is one way to get around the problem of multiple clocks. Let’s do another gedanken experiment. This time we have a pair of twins. One twin gets in a space ship and flies to Alpha Centauri and back at high speed. Her twin brother on earth knows that her clock and her life processes move slower than his. When she returns to earth he is not surprised to see that she is younger than he is. But shouldn’t she see the same thing? That is, since he (on earth) was moving relative to her space ship, shouldn’t she see that he is younger? Well, the correct answer is that she is younger than he is. The symmetry of the situation is broken because she had to accelerate to fly away, decelerate to turn around at Alpha Centauri, and then accelerate and decelerate again to return home. Therefore, her clocks behave differently. After all, Einstein’s postulates apply only to coordinate systems travelling at a constant velocity. Actually, this gedanken experiment has already been done. In very careful experiments, two atomic clocks were flown around the earth in opposite directions. When they returned to the original laboratory and were compared with "stay-at-home" clocks, they were slower (younger) by about a tenth of a microsecond, just the amount Einstein would have predicted (actually, the effects of gravity had to be taken into account since the planes were flying high above the earth)!
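The light-clock argument can be checked numerically. The following is a minimal sketch, assuming an illustrative mirror distance of 1 m and a speed of 0.6c (neither value comes from the text); the function names are hypothetical. It solves the Pythagorean relation for the observed tick and compares the ratio of ticks with the factor gamma.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(beta):
    """Relativistic factor 1/sqrt(1 - beta^2) for a speed beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def light_clock_ticks(mirror_distance, beta):
    """Return (tick in the clock's rest frame, tick seen by an observer
    who watches the clock move past at speed beta*c).

    The observed tick solves (c*dt')^2 = (2L)^2 + (v*dt')^2 for dt'.
    """
    v = beta * C
    dt_rest = 2.0 * mirror_distance / C
    dt_observed = 2.0 * mirror_distance / math.sqrt(C * C - v * v)
    return dt_rest, dt_observed


# Illustrative values (assumptions): L = 1 m, beta = 0.6
dt, dt_prime = light_clock_ticks(mirror_distance=1.0, beta=0.6)
ratio = dt_prime / dt
print(f"observed/rest tick ratio = {ratio:.6f}, gamma = {gamma(0.6):.6f}")
```

For beta = 0.6 both numbers come out as 1.25, independent of the mirror distance chosen.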


1.3. Length contraction

Just as time intervals viewed in the lab frame coordinate system seem longer than those measured in the moving frame, distances measured in the lab frame seem shorter than those measured in the moving frame. Going back to Figure 1, let $L_0$ be the distance from the transmitting tower to the colony Beta. Buffy and Bubba, on the moon, can measure this with a long tape measure. They also know that the time it takes for the space ship to get from the tower to the colony Beta is $\Delta t_0 = L_0/v$, where $v$ is the velocity of the space ship. Meanwhile, Hilda and Wolfgang, on the space ship, compute the distance by measuring the time $\Delta t'$ to get from the tower to Beta and using the formula $L' = v\,\Delta t'$. But due to time dilation, they get a different answer for the length:

$$\frac{L'}{L_0} = \frac{v\,\Delta t'}{v\,\Delta t_0} = \frac{1}{\gamma} \tag{1.5}$$

The length they measure is shorter! That is, moving objects (as viewed in this case from the space ship) appear shorter in the direction of motion. This is called length contraction. Note that the lengths of objects in directions transverse to the direction of motion are unaffected. If they are the same height and standing on the same level, Bubba and Wolfgang will each look directly into each other’s eyes as they pass by. Neither will appear taller than the other.
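The same result can be reproduced step by step from the two travel times. This is a small sketch under assumed numbers (a tower-to-colony distance of 100 km and a ship speed of 0.8c are illustrative only, and the function name is hypothetical).

```python
import math

C = 299_792_458.0  # speed of light in m/s


def contracted_length(rest_length, beta):
    """Length measured from the moving ship, following the argument above:
    the ship's single clock records the dilated-away (shorter) travel time."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    v = beta * C
    dt_moon = rest_length / v   # travel time according to moon clocks
    dt_ship = dt_moon / g       # travel time on the ship's own clock
    return v * dt_ship          # L' = v * dt'


# Illustrative values (assumptions): L0 = 100 km, beta = 0.8
L0 = 100.0e3
L_ship = contracted_length(L0, beta=0.8)
print(f"L' = {L_ship / 1e3:.3f} km, L'/L0 = {L_ship / L0:.3f}")
```

With beta = 0.8 the factor 1/gamma is 0.6, so the 100 km baseline is measured as 60 km from the ship.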

1.4. Doppler shift Associated with time dilation and length contraction is the Doppler shift of light. As in the case of sound waves, the motion of a source of waves causes a change in the frequency and wavelength of the waves, but the formulas are different for light waves.

Figure 3.

Consider the gedanken experiment pictured in Figure 3. A space ship traveling at velocity $v$ emits two laser pulses in the forward direction separated by the time (in its coordinate system) $\Delta t_0$. An observer standing in front of the oncoming space ship sees the pulses arrive with a time separation $\Delta t$. To compute the time interval observed, we use the Minkowski diagram in Figure 3. In this diagram, we plot the time $ct$ and distance $x$, both measured in the lab frame. If the first pulse is emitted at time $t_1 = 0$ and $x_1 = 0$, the trajectory $x = ct$ of the pulse is a straight line at 45 degrees in this diagram. The trajectory of the space ship is $x = vt = \beta ct$, as shown. Because of time dilation, the clock on the space ship runs slow, so the time at which the second pulse is emitted is $t_2 = \gamma\,\Delta t_0$, and the position is $x_2 = vt_2 = \gamma\beta c\,\Delta t_0$. The time interval between the two parallel light lines representing the two pulses in Figure 3 is therefore

$$\Delta t = t_2 - \frac{x_2}{c} = \gamma\,\Delta t_0 - \gamma\beta\,\Delta t_0 = \gamma(1-\beta)\,\Delta t_0 = \sqrt{\frac{1-\beta}{1+\beta}}\,\Delta t_0. \tag{1.6}$$

The time interval seen by the stationary observer is smaller than that of the observer in the space ship. If there is a series of pulses emitted at the frequency $f_0$ in the space ship frame, the frequency measured by the stationary observer is Doppler shifted by the factor

$$\frac{f}{f_0} = \sqrt{\frac{1+\beta}{1-\beta}} \tag{1.7}$$

Note that as the velocity of the emitter approaches $c$, the Doppler shift approaches infinity.
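As a quick numerical check of the Doppler factor, the sketch below (with hypothetical function names and an assumed speed of 0.6c) computes the received pulse spacing both from the geometric argument, gamma times (1 - beta), and from the closed-form square root, and confirms that the frequency ratio is the reciprocal.

```python
import math


def doppler_factor(beta):
    """Frequency ratio f/f0 for a source approaching at speed beta = v/c."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def received_interval(dt0, beta):
    """Pulse spacing seen by the stationary observer: emission is delayed by
    gamma*dt0, but the second pulse starts closer by gamma*beta*c*dt0."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * dt0 - g * beta * dt0


beta = 0.6                           # illustrative speed (assumption)
dt = received_interval(1.0, beta)    # emitted spacing of 1 time unit
factor = doppler_factor(beta)
print(f"received spacing = {dt:.6f}, f/f0 = {factor:.6f}")
```

At beta = 0.6 the spacing is halved and the frequency doubled; the factor grows without bound as beta approaches 1.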

One of the most interesting applications of the Doppler shift was Hubble’s discovery of the expanding universe. By looking at the frequency (color) of the emission from certain atoms in distant galaxies, Hubble discovered that all distant galaxies are moving away from us, moving faster at larger distances. To explain this, he postulated that we live in a universe that started as a point and has been expanding for about 13 billion years since then. This has, of course, all sorts of ramifications for science – and even religion!

1.5. Intervals

To discuss the concepts of relativistic kinematics it is useful to introduce Minkowski space. For two spatial dimensions and time this is shown in Figure 4. Each point $(t, \mathbf{r})$ in this 4-dimensional space-time is called an event, and the path of a particle, $[t(s), \mathbf{r}(s)]$ for some parameter $s$, is called the world line of the particle.

Figure 4

We now consider the relationship between two events viewed in the reference frames $K$ and $K'$ when the reference frame $K'$ is moving at the constant velocity $\mathbf{v}$ relative to $K$. We note at the beginning that observers in both reference frames agree on the relative velocity $v$. That is, they agree on the absolute magnitude $v = |\mathbf{v}|$, since by symmetry neither could observe a greater velocity than the other, but of course they differ on the sign of the vector $\mathbf{v}$. By symmetry, again, it is clear that the transformation from $K$ to $K'$ differs from the inverse transformation (from $K'$ to $K$) only in the sign of $\mathbf{v}$.

Central to the discussion is what we call the interval between two events. In ordinary 3-dimensional Euclidean geometry the distance between two infinitesimally separated points is

$$dl^2 = dx^2 + dy^2 + dz^2 > 0. \tag{1.8}$$

For two infinitesimally separated events in 4-dimensional Minkowski space, however, we define the intervals

$$ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2 \tag{1.9}$$

and

$$ds'^2 = c^2dt'^2 - dx'^2 - dy'^2 - dz'^2 \tag{1.10}$$

in $K$ and $K'$, respectively. These expressions are called the metric equations for the two reference frames, and the quantity $ds$ represents some sort of distance between the two events in 4-dimensional space-time. Because of the minus signs in these expressions the geometry of Minkowski space is not Euclidean, but is called pseudo-Euclidean. Since the square of the interval can be positive or negative some authors treat time as an imaginary coordinate, but we use a different approach here.

The rationale for defining the interval as we have done here clearly comes from the fact that the speed of light is the same in all inertial reference frames. If the two events separated by $ds$ correspond to the passage of a signal traveling at the velocity of light, then the interval as we define it vanishes in all reference frames:

$$ds^2 = ds'^2 = 0. \tag{1.11}$$

Put another way, $ds^2 = 0$ in one frame strictly implies that $ds'^2 = 0$ in any other, and vice versa. More generally we note that since uniform motion in one inertial reference frame implies uniform motion in another, the transformation between $K$ and $K'$ must be linear, so it must be true that

$$ds'^2 = a\,ds^2, \tag{1.12}$$

where $a$ is some constant that depends, at most, on the relative motion of $K$ and $K'$. But $a$ cannot depend on the 4-dimensional coordinates themselves since space-time is presumed to be homogeneous, so it can depend at most on the velocity, $a = a(\mathbf{v})$. Since the velocity $\mathbf{v}$ introduces a special direction into the discussion we might expect the interval to transform differently depending on the orientation of the interval relative to $\mathbf{v}$. But since all the spatial components of the interval enter (1.12) quadratically, the interval does not change if any of their signs are reversed, which would change the orientation of the interval relative to $\mathbf{v}$. Equivalently, the transformation of the interval must be indifferent to the direction of $\mathbf{v}$, so that

$$a(\mathbf{v}) = a(v). \tag{1.13}$$


But as noted earlier, the inverse transformation is obtained from the forward transformation merely by changing the sign of $\mathbf{v}$, which does not affect $a(v)$. Therefore, we see that

$$ds^2 = \frac{ds'^2}{a(v)} = a(v)\,ds'^2, \tag{1.14}$$

from (1.12). It follows that

$$a^2(v) = 1, \tag{1.15}$$

or $a = \pm 1$. We can discard the negative sign since two successive transformations must give the same result as a single transformation with the same final velocity, so that $a^2 = a$. Therefore, we conclude that $a = 1$ and the interval $ds^2$ is an invariant of the transformation between inertial coordinate systems. As we see shortly, this is all we need to define the transformation between them.

It is convenient to classify intervals in the following way:

$$ds^2 > 0, \quad \text{timelike interval}, \tag{1.16}$$

$$ds^2 < 0, \quad \text{spacelike interval}, \tag{1.17}$$

$$ds^2 = 0, \quad \text{lightlike (null) interval}. \tag{1.18}$$

Since the interval is invariant, the characterization of an interval as timelike, spacelike, or lightlike is independent of the coordinate system in which the events are viewed. For example, suppose that two events occur in the same place but at different times in the coordinate system $K'$, so that $dr' = 0$. Then $ds'^2 = c^2dt'^2 > 0$, and the interval is timelike. In another reference frame $K$, the interval is still timelike, even though the events occur in different places at different times. Conversely, if the interval between two events is timelike in the reference frame $K$, then there exists a reference frame $K'$ in which the events occur at the same place. Specifically, if

$$ds^2 = c^2dt^2 - dr^2 > 0, \tag{1.19}$$

in $K$, then in a coordinate system $K'$ moving at the velocity

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} \tag{1.20}$$

relative to $K$ the events occur at the same place. In the moving reference frame $K'$ the events are separated by the time

$$dt'^2 = \frac{ds'^2}{c^2} = \frac{ds^2}{c^2} = dt^2 - \frac{dr^2}{c^2}. \tag{1.21}$$

In the same way, if two events viewed in the reference frame $K'$ occur simultaneously at two different points, then $ds'^2 = -dr'^2 < 0$, so the interval is spacelike. Viewed in the reference frame $K$, the events appear at two different places and two different times but the interval is still spacelike. Conversely, if two events are separated by a spacelike interval in the reference frame $K$, then in some other reference frame $K'$ the events are simultaneous. In this reference frame, the separation of the events is

$$dr'^2 = -ds'^2 = -ds^2 = dr^2 - c^2dt^2. \tag{1.22}$$

Based on the classifications (1.16) - (1.18), we can divide Minkowski space into the regions shown in Figure 5. For an event located anywhere inside the light cone the interval $s^2 = c^2t^2 - r^2 > 0$ is timelike. Thus, in any other reference frame $K'$ the event occurs in the same time order relative to the event at the origin as it does in the frame $K$, and may be regarded as in the absolute past or absolute future relative to the event at the origin. On the other hand, events located outside the light cone are related to the origin by spacelike intervals. Consequently, there is some reference frame $K'$ in which the events are simultaneous. A second transformation to a frame $K''$ moving with respect to the frame $K'$ will separate the events in time. However, by symmetry we see that if a relative velocity in one direction places event A before B in $K''$, then a relative velocity in the opposite direction will place event B first. Thus, events separated by spacelike intervals cannot be absolutely time ordered. We say that events in the region outside the light cone are “elsewhere” relative to the event at the origin. For example, in the parable of the cheese-eating contest discussed earlier the arrival of the news at Alpha and Beta represents two events separated by a spacelike interval. Thus, the news arrived simultaneously at Alpha and Beta as observed by the colonists on the moon, but was observed by the space travelers to arrive first at Alpha or first at Beta depending on the relative velocity of their space ship.

For an object such as a clock at rest in an inertial reference frame $K'$, the distance $dr'^2$ between two events along its world line vanishes, so the invariant interval $ds'^2 = c^2dt'^2$ is just the time between the events (multiplied by $c$).
Viewed from the laboratory frame $K$ the interval is the same, so if the frame $K'$ moves with the velocity $v$ relative to the laboratory frame, then the interval in the laboratory frame is

$$ds^2 = c^2dt^2 - dr^2 = (c^2 - v^2)\,dt^2 = c^2dt'^2. \tag{1.23}$$

Therefore, compared with the time $dt$ elapsed on a clock in the laboratory frame the time elapsed on the moving clock is less, amounting to

$$dt' = \sqrt{1-\beta^2}\,dt = \frac{dt}{\gamma} \le dt, \tag{1.24}$$

where

$$\beta = \frac{v}{c}, \tag{1.25}$$

and

$$\gamma = \frac{1}{\sqrt{1-\beta^2}}. \tag{1.26}$$

Figure 5

This shows that a clock at rest in the moving coordinate system indicates less elapsed time than clocks at rest in the laboratory reference frame to which it is compared. This phenomenon is called time dilation. Ample experimental evidence now exists to confirm this effect. It has, of course, nothing to do with the failure of moving clocks to perform correctly. It has to do with the nature of time itself, or, more precisely, the subjective nature of simultaneity.

Many physical phenomena, such as the decay rate of subatomic particles, can be used as clocks. For example, when cosmic rays strike the upper atmosphere of the earth, they create a variety of particles including muons. Ordinarily, muons have a mean lifetime of 2.2 μs, so even at the velocity of light they would travel, on average, only about 660 m. However, due to time dilation a muon traveling at $\beta = 0.999$, which corresponds to an energy of 2.4 GeV, lives for about 50 μs, and travels about 15 km. This accounts for the fact that large numbers of muons created in the upper atmosphere are observed at the earth’s surface. In the same way, the subatomic particles created by accelerators in high-energy physics experiments would not be observable except that their brief lifetimes are extended by time dilation.

We call the time elapsed on a clock moving with an object the proper time $d\tau$ for the object. This is the time actually experienced by the object. For objects that are not in uniform motion the motion within any brief interval may be regarded as uniform, and the elapsed proper time between two points on the world line of the object is

$$\tau_2 - \tau_1 = \int_{t_1}^{t_2} \frac{dt}{\gamma(t)} \le t_2 - t_1. \tag{1.27}$$

Note carefully that in this expression the terms $\tau_1$ and $\tau_2$ refer to the time elapsed in an accelerating coordinate system, but $t$ is the time in an inertial reference frame.

A few remarks are in order. In the first place, the proper time, defined by the invariant interval in the rest frame of the moving object, is an invariant. That is, all observers agree on the proper time elapsed along the world line of the object. Physically, this corresponds simply to the fact that the clock moving with the object has an indicator (hands, or even a digital readout) on it to indicate the time. All observers, regardless of their relative motion, agree on what the indicator shows. That is, all observers agree on the numbers showing on the digital readout. In the case of subatomic particles, the time is indicated by the number of particles that have – or have not – decayed. The number of particles is the same to all observers.

In the second place, when viewed from the rest frame of the moving object, clocks in the laboratory frame are moving with the velocity $-\mathbf{v}$ so they indicate an elapsed time that is less than that in the object’s rest frame. Thus, to an observer in the laboratory frame the clock in the object rest frame appears to be slow, while viewed from the object rest frame a clock in the laboratory frame appears to be slow. The paradox is resolved by recognizing that when the time indicated on the clock in the rest frame is observed from the laboratory frame, the measurements are made by two observers at different places in the laboratory frame. They observe the moving clock as it passes by and compare the time indicated on the moving clock with that on their individual clocks. This is shown in Figure 6. Conversely, when the elapsed time on a clock in the laboratory frame is measured by observers in the moving frame, the laboratory clock is compared with two separate clocks in the moving frame. Thus, the measurements are not identical, and for this reason they do not give the same results.

Figure 6

To avoid the problem of measuring elapsed time on a moving clock by comparing a single moving clock with two “stationary” clocks, we start with two clocks that are initially at rest in the laboratory frame. We then accelerate one clock to a high velocity, bring it to rest again, and then return it to its original position in the laboratory next to the other clock. When the two clocks are compared, it is found that the clock that has been accelerated and decelerated indicates a smaller elapsed time than the stationary clock, in accordance with (1.27). This is called the “twin paradox” since it is often stated in the form of an allegory in which two twins are compared. One twin stays on earth and grows old, while the other becomes an astronaut, flies off at high speed to a nearby star, and when she returns she is younger than her twin sister. It is easy to understand that the earthbound twin observes the astronaut’s clock as progressing too slowly, but why doesn’t the astronaut observe her twin sister’s clock as progressing too slowly? In this case, the paradox is resolved by recognizing that the astronaut twin must accelerate to leave the earth, decelerate and turn around after she arrives at the star, and accelerate back toward earth and then decelerate again upon reaching home. The other twin does not accelerate at all, and this is what breaks the symmetry.
More to the point, (1.27) involves transformations from one inertial frame to a sequence of inertial frames, each of which describes the moving object for a brief period. Since the earthbound twin remains in an inertial frame of reference, this equation provides a valid description of her observations of her astronaut sister. On the other hand, the astronaut twin is not in an inertial frame, and she cannot use (1.27) to compute her sister’s age. In actual fact, while the astronaut is traveling at constant velocity she does see her sister aging more slowly than herself. However, when she accelerates to turn around at the outbound end of her trip she observes her sister aging at an accelerated rate.

Although technology has not reached the stage where the astronaut experiment can actually be tried, experimental confirmation of the twin paradox does exist. In careful experiments using atomic clocks it has been observed that a clock that is flown around the world in an airplane arrives back at the laboratory “younger” than a clock that remains at home. However, the difference in this case is only hundreds of nanoseconds, and the effects of gravity (accounted for in general relativity) are of the same order of magnitude as the time dilation discussed here.
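The cosmic-ray muon figures quoted earlier in this section can be reproduced with a few lines of arithmetic. This sketch takes the 2.2 μs rest-frame lifetime and beta = 0.999 from the text; the muon rest energy of about 0.1057 GeV is a standard value supplied here as an assumption, since the text does not state it.

```python
import math

C = 299_792_458.0              # speed of light in m/s
MUON_LIFETIME = 2.2e-6         # s, rest-frame figure quoted in the text
MUON_REST_ENERGY_GEV = 0.1057  # GeV, standard value (not given in the text)

beta = 0.999
g = 1.0 / math.sqrt(1.0 - beta * beta)

dilated_lifetime = g * MUON_LIFETIME         # lifetime seen from the ground
range_no_dilation = C * MUON_LIFETIME        # upper bound without dilation
range_dilated = beta * C * dilated_lifetime  # distance covered at beta*c
energy_gev = g * MUON_REST_ENERGY_GEV        # total energy, gamma * m c^2

print(f"gamma              = {g:.1f}")
print(f"dilated lifetime   = {dilated_lifetime * 1e6:.0f} microseconds")
print(f"range, no dilation = {range_no_dilation:.0f} m")
print(f"range, dilated     = {range_dilated / 1e3:.1f} km")
print(f"energy             = {energy_gev:.2f} GeV")
```

The results (gamma of roughly 22, a lifetime near 50 μs, a range near 15 km, and an energy near 2.4 GeV) agree with the rounded values in the text.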


One final remark: if we draw the twins’ world lines on a Minkowski diagram we get paths like those shown in Figure 7. For the astronaut the elapsed time is given by (1.27). The elapsed time for the clock at rest is larger than this. In terms of the intervals,

$$\Delta s(\text{earth}) = \int_{t_1}^{t_2} c\,dt \ge \int_{t_1}^{t_2} \frac{c\,dt}{\gamma(t)} = \Delta s'(\text{astronaut}). \tag{1.28}$$

That is, the interval along the straight line is larger. We see, therefore, that pseudo-Euclidean geometry is different from what we are used to in Euclidean geometry. A straight line in 4-dimensional space-time (called a geodesic) is the longest interval between two events, rather than the shortest.
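The proper-time integral (1.27) is easy to evaluate numerically for any velocity history. The sketch below uses a plain midpoint rule and two hypothetical round-trip profiles, both invented for illustration: an abrupt turnaround at a constant speed of 0.8c, and a smooth sinusoidal profile with the same peak speed. The earth-frame duration of 10 years is likewise an assumption.

```python
import math


def proper_time(beta_of_t, t1, t2, steps=20_000):
    """Midpoint-rule estimate of the integral of dt / gamma(t) from t1 to t2,
    where beta_of_t(t) is the traveler's speed (in units of c) as seen
    from the inertial frame."""
    h = (t2 - t1) / steps
    total = 0.0
    for i in range(steps):
        b = beta_of_t(t1 + (i + 0.5) * h)
        total += math.sqrt(1.0 - b * b) * h
    return total


T = 10.0  # years elapsed on earth (illustrative assumption)


def abrupt(t):
    """Out at 0.8c for half the trip, back at 0.8c for the other half."""
    return 0.8 if t < T / 2 else -0.8


def smooth(t):
    """Gentle acceleration throughout; the position returns to zero at t = T."""
    return 0.8 * math.sin(2.0 * math.pi * t / T)


tau_abrupt = proper_time(abrupt, 0.0, T)
tau_smooth = proper_time(smooth, 0.0, T)
print(f"earth: {T:.2f} y, abrupt trip: {tau_abrupt:.2f} y, "
      f"smooth trip: {tau_smooth:.2f} y")
```

Both travelers return having aged less than the 10 years that pass on earth; the abrupt profile gives 6 years, and the smooth one, which spends less time near peak speed, lies between 6 and 10.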

Figure 7.

1.6. The Lorentz Transformation

To find the Lorentz transformation that relates the coordinates in two inertial reference frames we look for the most general linear transformation that leaves the interval $ds$ invariant. The transformation must be linear because uniform motion in one frame must correspond to uniform motion in the other, as noted earlier. To reduce the algebra, we make two simplifications. In the first place we assume that the axes of the two frames are parallel at all times, and that the origins coincide at time $t = t' = 0$, as shown in Figure 8. In the second place we assume that the reference frame $K'$ moves at the velocity $v$ in the $\hat{x}$ (and $\hat{x}'$) direction relative to the frame $K$. More general transformations can be obtained by rotations of $K$ or $K'$ and simple corrections to the origins of the times and distances.

Figure 8.


Before Einstein discovered relativity, everyone, including Newton and Galileo, thought that space and time were independent, and the transformation from $K$ to $K'$ was simply

$$t' = t \tag{1.29}$$

$$x' = x - vt \tag{1.30}$$

$$y' = y \tag{1.31}$$

$$z' = z \tag{1.32}$$

This is called a Galilean transformation. Now we know that things are not so simple.

On physical grounds we see that in the directions transverse to the relative motion the coordinates y and z transform into themselves. Since the transformation is linear we may write yay' = , (1.33) z ' = bz , (1.34) for some constants a and b . But from the symmetry of the forward and backward transformations we see that ab==1, so that yy' = , (1.35) zz' = . (1.36) In the longitudinal direction the most general linear transformation is ct' =+ mct nx , (1.37) x ' =+pct qx , (1.38) for some constants m , n , p and q . To preserve the interval between the origin and the point ()ct,r , we require that

ct22−= x 2 ct 2'' 2 − x 2 =()() mct + nx22 − pct + qx =−()mpctnqx2222222 +−( ) +2( mnpqxct −) . (1.39) From the first term on the right-hand side we see that mp22− =1, (1.40) so we can write m = coshζ , p = −sinhζ , (1.41) for some ζ , and from the second term we see that nq22− =−1, (1.42) so we can write n =−sinhψ , q = coshψ , (1.43) for some ψ . From the third term we find that


mn−=− pq coshζ sinhψζψ + sinh cosh = 0 , (1.44) so that ψ = ζ . The most general transformation that preserves the interval is therefore ct'coshsinh=− ctζ x ζ (1.45) xx'coshsinh=−ζ ct ζ , (1.46) where ζ is called the “boost parameter,” or the “.” The transformation (1.45) and (1.46) resembles a of coordinates except that the sin and cos are replaced by sinh and cosh. In fact, a rotation of coordinates is the most general linear homogeneous transformation that preserves the length dx22+=+ dy dx'' 2 dy 2 in Euclidian geometry. Thus, the Lorentz transformation has the form of a “pseudo-rotation” in pseudo-Euclidian space. To determine the boost parameter ζ we consider the motion of the origin of the K ' frame, as viewed in the K frame. Since the origin ( x'0= ) of the moving frame is at the position x = vt in the laboratory frame, we see from (1.46) that xvtct'== 0 coshζ − sinhζ , (1.47) so

$$ \tanh\zeta = \frac{v}{c} = \beta . \tag{1.48} $$

Therefore, the coefficients in the transformation are

$$ \cosh\zeta = \frac{1}{\sqrt{1 - \tanh^2\zeta}} = \frac{1}{\sqrt{1 - \beta^2}} = \gamma , \tag{1.49} $$
$$ \sinh\zeta = \tanh\zeta\,\cosh\zeta = \beta\gamma , \tag{1.50} $$

and the complete Lorentz transformation is

$$ ct' = \gamma\,(ct - \beta x) , \tag{1.51} $$
$$ x' = \gamma\,(x - \beta ct) , \tag{1.52} $$
$$ y' = y , \tag{1.53} $$
$$ z' = z . \tag{1.54} $$

For a boost to a coordinate system moving to the right, $\tanh\zeta = \beta > 0$. The relation between the new and old coordinate axes is shown in Figure 9, where we see that the new axes $x' = 0$ and $ct' = 0$ are tilted toward the light line in the upper right quadrant. For a boost to a frame moving to the left, the axes are tilted away from this same light line.

From Figure 7 it is easy to see how events separated by a spacelike interval can be reordered in time. For example, the point B is elsewhere with respect to the origin of the stationary system K, and occurs later ($ct > 0$) in that system. In the K' system the point B lies below the axis $ct' = 0$ and therefore occurs earlier ($ct' < 0$).
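As a numerical check of (1.48)-(1.52), the following minimal Python sketch boosts a single event and confirms that the interval is preserved while the time order of a spacelike-separated event is reversed. The function names `boost` and `interval`, the value β = 0.6, and the sample event are illustrative assumptions, not part of the derivation; ct and x are measured in the same units.

```python
import math

def boost(ct, x, beta):
    """Apply the boost (1.51)-(1.52) along x and return (ct', x')."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (ct - beta * x), gamma * (x - beta * ct)

def interval(ct, x):
    """Interval s^2 = (ct)^2 - x^2 between the origin and the event (ct, x)."""
    return ct**2 - x**2

beta = 0.6                      # illustrative speed v/c
zeta = math.atanh(beta)         # boost parameter from (1.48)
gamma = math.cosh(zeta)         # (1.49)
beta_gamma = math.sinh(zeta)    # (1.50)

# Event B: spacelike with respect to the origin, and later than it in K.
ct_B, x_B = 1.0, 2.0
ct_Bp, x_Bp = boost(ct_B, x_B, beta)

print("gamma =", gamma, " beta*gamma =", beta_gamma)
print("K : ct  =", ct_B, "  x  =", x_B, "  s^2 =", interval(ct_B, x_B))
print("K': ct' =", ct_Bp, "  x' =", x_Bp, "  s^2 =", interval(ct_Bp, x_Bp))

# Boosting again with -beta undoes the transformation.
ct_back, x_back = boost(ct_Bp, x_Bp, -beta)
```

With these sample numbers the event has ct > 0 in K but ct' < 0 in K', while s² is the same in both frames to within rounding.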


Figure 9

The inverse transformation can be found by solving for x and ct, but it is easier simply to use the symmetry of the forward and inverse transformations. If we just change the direction of motion to −v we immediately get

$$ ct = \gamma\,(ct' + \beta x') , \tag{1.55} $$
$$ x = \gamma\,(x' + \beta ct') , \tag{1.56} $$
$$ y = y' , \tag{1.57} $$
$$ z = z' . \tag{1.58} $$

In the limit $c \to \infty$ we recover Galilean relativity. To see this we write the Lorentz transformation explicitly in terms of the velocity and get

$$ t' = \frac{t - \dfrac{v}{c^2}\,x}{\sqrt{1 - \dfrac{v^2}{c^2}}} \;\xrightarrow{\;c \to \infty\;}\; t , \tag{1.59} $$

and

$$ x' = \frac{x - vt}{\sqrt{1 - \dfrac{v^2}{c^2}}} \;\xrightarrow{\;c \to \infty\;}\; x - vt . \tag{1.60} $$

Finally, we note that Galilean transformations commute but Lorentz transformations, in general, do not. That is, if we transform first into a frame K' moving at the velocity $\mathbf{v}_1$ with respect to K and then into a frame K'' moving at the velocity $\mathbf{v}_2$ with respect to K', we get a Galilean transformation directly into the frame moving at the velocity $\mathbf{v}_3 = \mathbf{v}_1 + \mathbf{v}_2$. It doesn’t matter which transformation comes first. The same is not true for Lorentz transformations: boosts along different directions give different results depending on the order in which they are applied.

As we saw earlier, time dilation is the phenomenon that makes a moving clock appear to go slow, and length contraction is the phenomenon that makes a moving object shrink in the direction of motion. To see how time dilation and length contraction arise, we examine the measurement process by which each is determined. To observe time dilation we consider the


progress of a moving clock, that is, a single clock at rest at the point x' in the frame K'. Differentiating the inverse transformation (1.55) with respect to t, keeping x' constant, we get

$$ \frac{dt'}{dt} = \frac{1}{\gamma} < 1 . \tag{1.61} $$

That is, the moving clock always appears to go slower than the stationary clocks to which it is compared. Time dilation refers to the fact that the ticks of the moving clock appear farther apart as viewed from the stationary frame.

To observe length contraction, we measure the length of a moving rod by determining the positions of the two ends at a single time t in the stationary frame K. For an infinitesimal rod the corresponding length in the stationary frame is found by differentiating the forward transformation (1.52) with respect to x, holding t constant, to get

$$ \frac{dx'}{dx} = \gamma > 1 . \tag{1.62} $$

Length contraction refers to the fact that the tick marks on the moving length scale appear closer together than those on the stationary scale. That is, a rod of length dx' appears to have a length $dx = dx'/\gamma < dx'$ in the stationary frame. Note that the factor 1/γ appears in time dilation where the factor γ appears in length contraction. This is because in the first case a coordinate (x') was held fixed in the moving frame, while in the second case a coordinate (ct) was held fixed in the stationary frame.

We have already introduced the proper time dτ, which in the present discussion corresponds to dt' (with x' constant). We similarly define the proper length as the length observed in the reference frame in which the object is at rest, $d\lambda = dx'$ (with t' constant). We saw previously that the proper time is invariant,

$$ ds'^2 = c^2dt'^2 - dx'^2 = c^2d\tau^2 \quad (\text{holding } dx' = 0) . \tag{1.63} $$

In the same way we see that the proper length is an invariant,

$$ ds'^2 = c^2dt'^2 - dx'^2 = -d\lambda^2 \quad (\text{holding } dt' = 0) . \tag{1.64} $$

Physically, the Lorentz invariance of proper time just means that all observers agree on what a clock at rest in the moving frame indicates. Likewise, the invariance of proper length just means that all observers agree on the numbers that appear on a ruler or dial gauge at rest in the moving frame.

Clearly, relativistic velocities do not add in the same way that nonrelativistic velocities do, for if we are traveling at the velocity $v = 0.75c$ through a railroad station and another train passes us at $0.75c$, the passengers waiting in the station do not see the faster train traveling at $1.5c$. To see how velocities add we consider a reference frame K' that is moving at the velocity v with respect to the frame K, and a particle moving in the frame K' at the velocity $\mathbf{v}'$, as shown in Figure 10. The inverse transformation between K and K' is given by (1.55)-(1.58). For an infinitesimal movement

$$ d\mathbf{r}' = \mathbf{v}'\,dt' \tag{1.65} $$


Figure 10

in the moving frame the motion in the stationary frame is

$$ dt = \gamma\,dt'\left(1 + \frac{v\,v'_x}{c^2}\right) , \tag{1.66} $$
$$ dx = \gamma\,dt'\,(v + v'_x) , \tag{1.67} $$
$$ dy = v'_y\,dt' , \tag{1.68} $$
$$ dz = v'_z\,dt' . \tag{1.69} $$

Dividing these expressions we obtain the relations

$$ v_x = \frac{dx}{dt} = \frac{v + v'_x}{1 + \dfrac{v\,v'_x}{c^2}} , \tag{1.70} $$
$$ v_y = \frac{dy}{dt} = \frac{v'_y}{\gamma\left(1 + \dfrac{v\,v'_x}{c^2}\right)} , \tag{1.71} $$
$$ v_z = \frac{dz}{dt} = \frac{v'_z}{\gamma\left(1 + \dfrac{v\,v'_x}{c^2}\right)} . \tag{1.72} $$

These are called the Einstein velocity-addition laws. Clearly, the addition of velocities transverse to one another is different from the addition of velocities that are parallel to one another.

As an example we consider a relativistic particle that in some decay process emits another particle with velocity v' at the angle θ' in the x'y' plane. In the laboratory frame the velocity of the secondary particle is

$$ v_x = \frac{v + v'\cos\theta'}{1 + \dfrac{v\,v'\cos\theta'}{c^2}} , \tag{1.73} $$
$$ v_y = \frac{v'\sin\theta'}{\gamma\left(1 + \dfrac{v\,v'\cos\theta'}{c^2}\right)} . \tag{1.74} $$
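The velocity-addition laws (1.70)-(1.72) are easy to evaluate numerically. The sketch below, with a hypothetical helper `add_velocities` and velocities expressed in units of c, works the two-train example (0.75c combined with 0.75c) and also checks that light emitted transversely in K' still travels at c in K.

```python
import math

def add_velocities(v, vpx, vpy=0.0, vpz=0.0, c=1.0):
    """Velocity in K of a particle moving at (vpx, vpy, vpz) in K',
    where K' moves at speed v along x relative to K; see (1.70)-(1.72)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c)**2)
    denom = 1.0 + v * vpx / c**2
    return ((v + vpx) / denom,
            vpy / (gamma * denom),
            vpz / (gamma * denom))

# Two trains: we move at 0.75c, the other train passes us at 0.75c.
train_vx, _, _ = add_velocities(0.75, 0.75)
print("speed of the faster train seen from the station:", train_vx, "c")

# Light emitted along y' in the moving frame.
light_vx, light_vy, _ = add_velocities(0.75, 0.0, 1.0)
light_speed = math.hypot(light_vx, light_vy)
print("speed of the light pulse in K:", light_speed, "c")
```

The parallel combination gives 1.5/1.5625 = 0.96c rather than 1.5c, and the transverse light pulse is tilted forward but keeps the speed c.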


Figure 11

In the laboratory frame the emission angle is

$$ \tan\theta = \frac{v_y}{v_x} = \frac{v'\sin\theta'}{\gamma\,(v + v'\cos\theta')} . \tag{1.75} $$

For highly relativistic motion of the primary particle, $\gamma \gg 1$, we see that most of the emission appears in the forward direction in the laboratory frame with $\theta \le O(1/\gamma)$. When the emitted particles are photons, so that $v' = c$, all radiation that is emitted in the forward hemisphere ($\theta' \le \pi/2$) in the rest frame of the primary particle is emitted inside the cone

$$ \tan\theta \le \frac{1}{\beta\gamma} \tag{1.76} $$

in the laboratory frame, as illustrated in Figure 11. This effect is quite pronounced in the radiation from synchrotron radiation sources and high-energy physics experiments where the particle energy corresponds to $\gamma \gg 1$, and often $\gamma > 10^3$.
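To get a feel for the size of the cone in (1.76), the following sketch evaluates (1.75) for photons emitted by a source with γ = 1000. The helper name `lab_angle` and the sample angles are illustrative assumptions; velocities are in units of c.

```python
import math

def lab_angle(theta_p, beta, beta_p=1.0):
    """Laboratory emission angle from (1.75).
    theta_p: emission angle in the rest frame of the source,
    beta: speed of the source, beta_p: speed of the emitted particle
    (both in units of c; beta_p = 1 for photons)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return math.atan2(beta_p * math.sin(theta_p),
                      gamma * (beta + beta_p * math.cos(theta_p)))

gamma = 1000.0
beta = math.sqrt(1.0 - 1.0 / gamma**2)

# Edge of the forward hemisphere in the rest frame: theta' = pi/2.
theta_edge = lab_angle(math.pi / 2, beta)
print("cone half-angle [rad]:", theta_edge, " 1/(beta*gamma):", 1.0 / (beta * gamma))

# Smaller rest-frame angles map to smaller laboratory angles.
for theta_p in (0.0, 0.3, 1.0, 1.5):
    print(theta_p, "->", lab_angle(theta_p, beta))
```

For γ = 1000 the entire forward hemisphere of the rest frame is squeezed into a cone of roughly one milliradian.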

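1.7. Transformation of electromagnetic fields

In the nonrelativistic case, we can find the transformation laws for electric and magnetic fields from the Lorentz force law:

$$ \mathbf{F} = q\,(\mathbf{E} + \mathbf{v} \times \mathbf{B}) \tag{1.77} $$

In a coordinate system moving at the velocity $\mathbf{V}$, the new velocity is $\mathbf{v}' = \mathbf{v} - \mathbf{V}$ and the force is

$$ \mathbf{F} = q\,(\mathbf{E} + \mathbf{V} \times \mathbf{B} + \mathbf{v}' \times \mathbf{B}) = q\,(\mathbf{E}' + \mathbf{v}' \times \mathbf{B}') \tag{1.78} $$

where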

$$ \mathbf{E}' = \mathbf{E} + \mathbf{V} \times \mathbf{B} \tag{1.79} $$
$$ \mathbf{B}' = \mathbf{B} \tag{1.80} $$

In relativistic physics, things are more complicated. The electric and magnetic fields are not 4-vectors, and they transform differently. In contrast with the case of 4-vectors, the longitudinal components of the electric and magnetic fields are unchanged by the boost while the transverse components are changed and mixed. That the longitudinal electric field should remain unchanged by the boost can be understood on physical grounds if we consider the field due to a parallel-plate capacitor whose axis (normal to the plates) is aligned parallel to the boost, as


Figure 12.

shown in Figure 12. Since transverse lengths are unaffected by the boost, the charge density on the plates is unchanged, but the separation between the plates is reduced by length contraction. However, the field is independent of the separation of the plates, so the electric field in the direction of the boost is unchanged. Similarly, if we consider the magnetic field of a solenoid aligned along the direction of the boost, as shown in Figure 13, we see that the winding density of the solenoid is increased by the Lorentz contraction, while the current in the solenoid is decreased by time dilation. The effects cancel and leave the longitudinal magnetic field unchanged.

Figure 13.

Thus, we find that there is no change in the longitudinal electric field and magnetic field. To see how the transverse electric field changes, we consider the field of a parallel-plate capacitor aligned with its axis perpendicular to the boost, as shown in Figure 14. Due to the Lorentz contraction the charge density on the plates is increased by the factor γ , which increases the electric field accordingly. The magnetic field is slightly more complicated. The magnetic field in the reference frame K ' has new terms that depend on the electric field in K that appear because the charges that give rise to the electric field in K are moving in K ' and constitute a current. In the parallel-plate capacitor shown in Figure 14, there is no magnetic field in the rest frame of the capacitor, but in the laboratory frame the charged plates constitute a surface current density γσ v , where σ is the charge density of the plates in the rest frame. This is proportional

Figure 14.


to the transverse electric field and is responsible for the new terms in the magnetic fields. If we evaluate the rest of the components we get

$$ E'_x = E_x , \tag{1.81} $$
$$ E'_y = \gamma\,(E_y - vB_z) , \tag{1.82} $$
$$ E'_z = \gamma\,(E_z + vB_y) , \tag{1.83} $$
$$ B'_x = B_x , \tag{1.84} $$
$$ B'_y = \gamma\left(B_y + \frac{v}{c^2}E_z\right) , \tag{1.85} $$
$$ B'_z = \gamma\left(B_z - \frac{v}{c^2}E_y\right) . \tag{1.86} $$

The inverse transformation is obtained by changing $v \to -v$ in (1.81)-(1.86). In the nonrelativistic limit $v/c \ll 1$ we recover the transformations which we derived from the Lorentz force law. This does not mean that the Lorentz force law is valid only in the nonrelativistic limit. Rather, the factor γ and the higher-order terms in $B'_y$ and $B'_z$ appear in the transformed fields because the forces in the new coordinate system are altered by time dilation and length contraction, and the fields must change in the new coordinate system to reflect this.

As an example of the transformation of electromagnetic fields we consider the field of a point charge in uniform motion. In the moving frame the electric field lines are directed radially away from the charge, but as observed in the laboratory frame the field is altered. The form of the resulting field is suggested schematically in Figure 15. Due to Lorentz contraction, the electric field is compressed in the longitudinal direction but gets stronger in the transverse direction where the lines of force are now closer together. The magnetic field lines are circles about the direction of motion of the charge.
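The transformation (1.81)-(1.86) can be checked numerically. The sketch below is a minimal illustration in SI units; the function name `transform_fields` and the sample field values are assumptions made for the example. It also evaluates the two combinations E·B and E² − c²B², which are standard Lorentz invariants of the electromagnetic field (a known result that is not derived in this section), and it treats a capacitor-like case with a purely transverse electric field.

```python
import math

C = 299_792_458.0  # speed of light in m/s (SI units assumed throughout)

def transform_fields(E, B, v):
    """Fields seen in a frame moving at speed v along x, per (1.81)-(1.86)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C)**2)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    E_new = (Ex, gamma * (Ey - v * Bz), gamma * (Ez + v * By))
    B_new = (Bx, gamma * (By + v * Ez / C**2), gamma * (Bz - v * Ey / C**2))
    return E_new, B_new

def dot(a, b):
    """Ordinary 3-vector dot product."""
    return sum(x * y for x, y in zip(a, b))

# Illustrative field values (V/m and T) and boost speed.
E = (1.0e3, 2.0e3, -0.5e3)
B = (1.0e-6, -2.0e-6, 3.0e-6)
v = 0.8 * C

E_p, B_p = transform_fields(E, B, v)
E_back, B_back = transform_fields(E_p, B_p, -v)   # inverse: v -> -v

print("E.B         :", dot(E, B), "->", dot(E_p, B_p))
print("E^2 - c^2B^2:", dot(E, E) - C**2 * dot(B, B),
      "->", dot(E_p, E_p) - C**2 * dot(B_p, B_p))

# Capacitor-like case: transverse E only, no B in the original frame.
E_cap, B_cap = transform_fields((0.0, 1.0e3, 0.0), (0.0, 0.0, 0.0), v)
print("transverse E grows by gamma:", E_cap[1], " new B_z:", B_cap[2])
```

In the capacitor-like case the transverse electric field is multiplied by γ and a magnetic field proportional to that electric field appears, as described in the text.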

Figure 15.


2. RELATIVISTIC

Beauty, at least in physics, is perceived in the simplicity and compactness of the equations that describe the phenomena we observe about us. Dirac has emphasized this point and said “It is more important to have beauty in one’s equations than to have them fit experiment…. It seems that if one is working from the point of view of getting beauty in one’s equations, and if one has really a sound insight, one is on a sure line of progress.” In this sense the beauty of physics lies in the fact that it can all be derived from the postulates of relativity together with just one hypothesis, which we call Hamilton’s principle. This includes all of mechanics and all of electricity and magnetism. In fact, if we postulate other interactions, such as the Yukawa potential, the mathematical form of these interactions is very restricted. The flexibility in the choice of natural laws is very limited. In the future, as so-called “grand unified theories” are developed, it is expected that even this limited flexibility will be removed.

One of the remarkable developments of modern physics has been the growing perception that the laws of physics are inevitable. Hawking may have gone beyond the realm of pure physics when he asked the question “Did God have any choice?” in the way She wrote the laws of physics. However, it seems that if the universe consists of three spatial dimensions and time, and we require causality, then there is little choice in the laws of physics.

Hamilton’s principle states that as a system moves from its configuration at time t1 to that at time t2 it does so along the trajectory for which a quantity called the action is stationary (in the simplest cases, a minimum). Unfortunately, we don’t have time to pursue this line of thought here, but it will be sufficient for our purposes to consider those quantities that must be conserved from time t1 to time t2. If we apply Einstein’s postulates to these conserved quantities, we can learn a great deal about the laws of physics.

2.1. 4-vectors

To illustrate the problem with Newtonian mechanics, consider the elastic collision of two particles of equal mass m shown in Figure 16. Viewed in the lab frame K, the particles are incident with equal and opposite velocities, so that $\mathbf{v}(2) = -\mathbf{v}(1)$. After the collision, the y components of the velocities are reversed, and by symmetry we see that $\bar{\mathbf{v}}(2) = -\bar{\mathbf{v}}(1)$, where a bar indicates the value after the collision. In Newtonian mechanics we would explain this by the conservation of momentum and energy,

$$ \mathbf{p}(1) + \mathbf{p}(2) = \bar{\mathbf{p}}(1) + \bar{\mathbf{p}}(2) \tag{1.87} $$
$$ E(1) + E(2) = \bar{E}(1) + \bar{E}(2) \tag{1.88} $$

where

$$ \mathbf{p} = m\mathbf{v} \tag{1.89} $$


Figure 16

$$ E = \frac{1}{2}mv^2 \tag{1.90} $$

If now we view this collision in a coordinate system moving in the x-direction with the velocity $V = v_x$ of particle 1, the collision looks like the right-hand diagram in Figure 16. In this frame of reference particle 1 simply goes up and down. Momentum is still conserved in the x direction since the velocities in this direction are the same before and after the collision. But what about the y direction? If we use the velocity transformation laws, we find that for particle 1 the y component of the velocity is

$$ v_y(1) = \frac{v'_y(1)}{\gamma(V)\left(1 + \dfrac{V\,v'_x(1)}{c^2}\right)} = \frac{v'_y(1)}{\gamma(v_x)} \tag{1.91} $$

since $v'_x(1) = 0$, but for particle 2 we find

$$ v_y(2) = \frac{v'_y(2)}{\gamma(V)\left(1 + \dfrac{V\,v'_x(2)}{c^2}\right)} \tag{1.92} $$

But whereas $v_y(1)$ and $v_y(2)$ were equal and opposite in the lab frame, the corresponding quantities $v'_y(1)$ and $v'_y(2)$ do not have the same magnitude in the moving frame. Thus, the total momentum in the y direction is not zero,

$$ p'_y(1) + p'_y(2) \neq 0 \tag{1.93} $$

and since the y components of the momenta are reversed by the collision, the total Newtonian momentum is not conserved.
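The failure of Newtonian momentum conservation in the moving frame can be seen with a few numbers. The sketch below uses illustrative lab-frame velocities (in units of c) and a hypothetical helper `to_moving_frame`, which applies the velocity-addition laws with v replaced by −V to pass from the lab frame to the frame moving with particle 1.

```python
import math

def to_moving_frame(vx, vy, V, c=1.0):
    """Velocity components seen from a frame moving at speed V along x
    (velocity-addition laws with v -> -V)."""
    gamma = 1.0 / math.sqrt(1.0 - (V / c)**2)
    denom = 1.0 - V * vx / c**2
    return (vx - V) / denom, vy / (gamma * denom)

m = 1.0
v1 = (0.6, 0.3)      # particle 1 in the lab frame (illustrative values)
v2 = (-0.6, -0.3)    # particle 2: equal and opposite
V = v1[0]            # move along x with particle 1

v1p = to_moving_frame(*v1, V)
v2p = to_moving_frame(*v2, V)

newtonian_py_lab = m * v1[1] + m * v2[1]
newtonian_py_moving = m * v1p[1] + m * v2p[1]

print("particle 1 in K':", v1p)
print("particle 2 in K':", v2p)
print("sum of m*v_y in K :", newtonian_py_lab)
print("sum of m*v_y in K':", newtonian_py_moving)
```

In the lab frame the y components of m v cancel, but in the moving frame they do not, so reversing them in the collision changes the Newtonian total.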

What went wrong? The problem (as always) lies in the nature of time. In the velocity-addition laws we divided the distance increments by the time increment, and this isn’t the same for all the particles. Since we are dealing with a 4-dimensional Minkowski space-time, we need to generalize our vectors to four dimensions. We therefore define the position 4-vector as the set of four quantities

$$ r^\alpha = (ct, x, y, z) = (ct, \mathbf{r}) \tag{1.94} $$


where $\alpha = 0 \ldots 3$ and $\mathbf{r}$ is the 3-dimensional position vector. Under a Lorentz transformation, this 4-vector position transforms in accordance with Einstein’s postulates of relativity. In particular, the length of the vector $s^2 = (ct)^2 - r^2$, which we call the interval, is invariant. We call such an invariant quantity a 4-scalar.

In three dimensions, any set of three quantities, such as the velocity, that transforms the same as the position is called a vector. Extending this to Minkowski space, we can define the 4-vector velocity as follows. For an increment of position

$$ dr^\alpha = (c\,dt, dx, dy, dz) = (c, \mathbf{v})\,dt \tag{1.95} $$

corresponding to the increment of proper time

$$ d\tau = \frac{dt}{\gamma} \tag{1.96} $$

we define the 4-vector velocity

$$ u^\alpha = \frac{dr^\alpha}{d\tau} = \gamma\,(c, \mathbf{v}) \tag{1.97} $$

Clearly, this is a 4-vector since the numerator is a good 4-vector and the denominator is invariant under a Lorentz transformation. We note that in the limit of low velocity, $\gamma \to 1$ and the spacelike part $\mathbf{u} \to \mathbf{v}$, the nonrelativistic velocity, as it should. Curiously, the timelike component of the 4-vector velocity, $u^0 = \gamma c$, is never smaller than c.

It’s straightforward from here to define the 4-vector momentum

$$ p^\alpha = m u^\alpha = (\gamma mc, \gamma m\mathbf{v}) \tag{1.98} $$

where m is called the rest mass. It is invariant, the same in all coordinate systems. The spacelike part is clearly the relativistic generalization of the ordinary momentum, for

$$ \mathbf{p} = \gamma m\mathbf{v} \;\xrightarrow{\;\gamma \to 1\;}\; m\mathbf{v} \tag{1.99} $$

But what about that timelike component? Well, at low velocities we can expand

$$ \gamma mc = \frac{mc}{\sqrt{1 - \dfrac{v^2}{c^2}}} \;\xrightarrow{\;v \ll c\;}\; mc\left(1 + \frac{1}{2}\frac{v^2}{c^2}\right) = \frac{1}{c}\left(mc^2 + \frac{1}{2}mv^2\right) \tag{1.100} $$

The second term in the parentheses is just the familiar kinetic energy. The first term is called the rest energy,

$$ E_{\text{rest}} = mc^2 \tag{1.101} $$

arguably the most famous equation in all of physics. We see, therefore, that the timelike component of the 4-vector momentum is just the total energy E divided by the speed of light c.

The 4-vector momentum is therefore

$$ p^\alpha = \left(\frac{E}{c}, \mathbf{p}\right) \tag{1.102} $$

where

$$ \mathbf{p} = \gamma m\mathbf{v} \tag{1.103} $$
$$ E = \gamma mc^2 \tag{1.104} $$

Finally, we can make one more use of the powerful machinery of the Lorentz transformation. Since it leaves the length of a 4-vector invariant, we can evaluate the length in any coordinate system. Applying this to the 4-vector momentum, we get

$$ \frac{E^2}{c^2} - p^2 = \gamma^2m^2c^2 - \gamma^2m^2v^2 = m^2c^2 \tag{1.105} $$

where the last expression on the right is the value in the rest frame of the particle, the frame in which $v = 0$. Rearranging this gives us the very useful formula

$$ E^2 = p^2c^2 + m^2c^4 \tag{1.106} $$

which expresses the energy in terms of the momentum.
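A short numerical check of (1.102)-(1.106) and of the low-velocity expansion (1.100) is sketched below. The helper `four_momentum` is a hypothetical name, motion is taken along a single axis, and units with m = c = 1 are an illustrative choice.

```python
import math

def four_momentum(m, v, c=1.0):
    """Return (E/c, p) for rest mass m and speed v along one axis,
    following (1.102)-(1.104)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c)**2)
    return gamma * m * c, gamma * m * v

m, c = 1.0, 1.0   # illustrative units with c = 1

for v in (0.0, 0.1, 0.6, 0.99):
    p0, p = four_momentum(m, v, c)
    length2 = p0**2 - p**2                             # left side of (1.105)
    E_from_p = math.sqrt(p**2 * c**2 + m**2 * c**4)    # (1.106)
    print("v =", v, " E =", p0 * c, " E from p =", E_from_p,
          " invariant length^2 =", length2)

# Low-velocity limit (1.100): E - mc^2 approaches the Newtonian kinetic energy.
v_slow = 1.0e-3
E_slow = four_momentum(m, v_slow, c)[0] * c
kinetic = E_slow - m * c**2
print("relativistic kinetic energy:", kinetic, " (1/2) m v^2:", 0.5 * m * v_slow**2)
```

The invariant length stays equal to m²c² for every speed, while the energy itself grows without bound as v approaches c.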

In 4-vector notation, then, the conservation of momentum for N particles of arbitrary mass becomes

$$ \sum_{n=1}^{N} p_n^\alpha = \sum_{n=1}^{N} \bar{p}_n^\alpha , \qquad \alpha = 0 \ldots 3 \tag{1.107} $$

This replaces the separate energy and momentum conservation laws of Newton, and even includes the case when the masses before and after the collision or other event are not conserved. In this case, some of the mass is exchanged for kinetic energy.
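The conservation law (1.107) can be checked for the symmetric elastic collision of two equal masses, viewed from the frame that moves along x with particle 1. The sketch below uses illustrative velocities in units of c; the helper names `to_moving_frame`, `four_momentum`, and `total` are assumptions made for the example.

```python
import math

C = 1.0  # units with c = 1 (illustrative)

def to_moving_frame(vx, vy, V):
    """Velocity seen from a frame moving at speed V along x
    (velocity-addition laws with v -> -V)."""
    g = 1.0 / math.sqrt(1.0 - (V / C)**2)
    denom = 1.0 - V * vx / C**2
    return (vx - V) / denom, vy / (g * denom)

def four_momentum(m, vx, vy):
    """(E/c, p_x, p_y) = gamma * m * (c, v_x, v_y), per (1.98)."""
    g = 1.0 / math.sqrt(1.0 - (vx**2 + vy**2) / C**2)
    return g * m * C, g * m * vx, g * m * vy

def total(momenta):
    """Component-wise sum of a collection of 4-momenta."""
    return tuple(sum(parts) for parts in zip(*momenta))

m, V = 1.0, 0.6
before_lab = [(0.6, 0.3), (-0.6, -0.3)]
after_lab = [(0.6, -0.3), (-0.6, 0.3)]    # y components reversed

before = total([four_momentum(m, *to_moving_frame(vx, vy, V)) for vx, vy in before_lab])
after = total([four_momentum(m, *to_moving_frame(vx, vy, V)) for vx, vy in after_lab])

print("total (E/c, p_x, p_y) before, in K':", before)
print("total (E/c, p_x, p_y) after,  in K':", after)
```

Unlike the Newtonian sum of m v_y, the y component of the total relativistic momentum vanishes in the moving frame as well, and all three totals are the same before and after the collision.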

As an example of the conservation of 4-vector momentum (what we used to call the conservation of momentum and energy), we examine the emission of a gamma ray by a nucleus of mass $m^*$. Before the emission, the energy is $E = m^*c^2$, and the momentum is $p = 0$. After the photon is emitted, the total energy is $E' = \gamma mc^2 + h\nu$, where hν is the photon energy (the quantity h is called Planck’s constant, and ν is the frequency; we’ll learn about this later) and γ is the relativistic factor for the nucleus, which now has mass m.

Since a photon has no rest mass, the energy relation (1.106) becomes $E_{\text{photon}} = p_{\text{photon}}\,c = h\nu$. The total momentum is now $p' = \gamma mv - h\nu/c$. The conservation equations are therefore

$$ E = h\nu + \gamma mc^2 = m^*c^2 \tag{1.108} $$
$$ p = \gamma mv - \frac{h\nu}{c} = 0 \tag{1.109} $$

We can eliminate hν from these equations, and recalling that $\gamma = 1/\sqrt{1 - v^2/c^2}$ we can solve for $\beta = v/c$. The result is simply

$$ \frac{m^*}{m} = \sqrt{\frac{1 + \beta}{1 - \beta}} \tag{1.110} $$
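To see (1.108)-(1.110) at work, the sketch below inverts (1.110) for the recoil speed, β = (R² − 1)/(R² + 1) with R = m*/m, and then verifies both conservation equations. The function name `recoil_beta` and the mass values are illustrative assumptions, and units with c = 1 are used.

```python
import math

def recoil_beta(m_star, m):
    """Recoil speed beta = v/c of the final nucleus, from inverting (1.110)."""
    r2 = (m_star / m)**2
    return (r2 - 1.0) / (r2 + 1.0)

c = 1.0                      # units with c = 1
m, m_star = 1.0, 1.001       # illustrative masses: excited nucleus is 0.1% heavier

beta = recoil_beta(m_star, m)
gamma = 1.0 / math.sqrt(1.0 - beta**2)
h_nu = gamma * m * beta * c**2                    # photon energy from (1.109)

energy_after = h_nu + gamma * m * c**2            # left side of (1.108)
momentum_after = gamma * m * beta * c - h_nu / c  # (1.109)

print("recoil beta      :", beta)
print("photon energy    :", h_nu)
print("energy after     :", energy_after, " vs m* c^2 =", m_star * c**2)
print("momentum after   :", momentum_after)
```

Because the nucleus recoils, the photon energy comes out slightly smaller than the mass difference (m* − m)c².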