P. Nelson PHYS240/250 Spring 2005

Survey of

Galileo believed that the Earth moved around the Sun. Many found this proposition absurd. If the Earth moves, why doesn’t it feel like we’re moving? Why aren’t we thrown off? Galileo patiently constructed arguments about how you can play ping-pong on a ship moving on a calm sea and never notice that the ship is moving. While he didn’t have it completely straight, his successors (Huygens and Newton) eventually elevated this idea to the status of a fundamental principle, which we have come to call the Principle of Relativity:

R1 No experiment done within an isolated system can determine whether or how fast that system is moving. In fact, if we put all our apparatus in a box, the results of any experiment will be the same regardless of whether that box is at rest relative to the Sun or moving in any direction in a straight line at uniform speed.

Around the turn of the twentieth century the relativity principle seemed once again to be in doubt: Maxwell’s equations didn’t seem to obey it. Some people wanted to abandon relativity; others wanted to modify Maxwell. What Einstein realized was very simple: There is a little freedom in how we interpret the Principle of Relativity. The obvious interpretation used by Newton isn’t the right one, though it’s very nearly right in everyday life. There’s another interpretation that lets us have both Maxwell and relativity. Moreover it has experimentally testable consequences, which have since given us great confidence that Einstein got it right.

Einstein’s theory is not something you can deduce from pure thought. There is nothing logically inconsistent with Newton’s theory. Nature just doesn’t happen to work that way. To get at the truth we have to do experiments.

1. Prolog: Rotations

Let’s go back to high school for a moment. You know how to take any figure in geometry and rotate it. A square remains a square, an isosceles triangle remains an isosceles triangle, etc.: the identity and character of a figure don’t change if we just rotate it. The situation is similar if we leave a 3-dimensional figure unchanged but view it from a different angle. Its perspective will change; maybe its width will seem to shrink while its depth seems to increase. But it’s the same object. Millions of years of evolution have given us brains that automatically compensate for changes in perspective, so that we know it’s the same object.

Mathematically, if we set up Cartesian axes we can label every point in the plane by two numbers $(x, y)$. Then the same point viewed from a rotated point of view will be labeled by two different numbers $(x', y')$. We can find the new coordinates using trigonometry, and the fact that the new coordinate axes are rotated by some angle $\theta$ relative to the old ones. There’s a simple, precise formula expressing this:

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{1.1}$$

© 2004 P. Nelson


In this formula I introduced matrix notation; it simply means that $x' = (\cos\theta)\,x + (\sin\theta)\,y$ and $y' = -(\sin\theta)\,x + (\cos\theta)\,y$. When $\theta = 0$ we just get $x' = x$, $y' = y$. To think about this conceptually, imagine digging up all the streets in Manhattan and laying down a new, rotated, square grid of streets. Then if the Empire State Building is at a point $P$, it will still be in the same location (the same point $P$) after the new grid is laid down, but the coordinates of that point (which street and which avenue) will no longer be the same as they were before.

Now, certainly there are many other coordinate systems we could use to label points in the plane, besides the two Cartesian systems just described. For example, we could define $(x, y)$ using axes that are not at right angles, or we could even use polar coordinates $(r, \theta)$. But there is something special about a Cartesian system: the distance between two points $P_1$ and $P_2$ is given by the simple formula $d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$. If we describe the points using the rotated coordinate system, the formula has exactly the same form: $d = \sqrt{(x_1' - x_2')^2 + (y_1' - y_2')^2}$. But it’s not true that $d = \sqrt{(r_1 - r_2)^2 + (\theta_1 - \theta_2)^2}$! In short,

In Euclidean geometry certain coordinate systems are special (namely the family of Cartesian systems), but any coordinate system within this class is just as good as any other one.

What’s this got to do with relativity? In classical physics, the fundamental entities are events. An event is specified by a location in space and a moment in time. A trajectory is a continuous string of events. We think of events as points in a geometrical space, called spacetime. To do analytical work, we must uniquely assign four numbers to each event; this implies a coordinate choice on spacetime. (Physicists sometimes use the phrases frame of reference, or just frame, or even observer, to denote “coordinate choice on spacetime.”) So far, Newton would agree with us.
But many coordinate choices are possible on a space, and things will in general look different in different coordinate systems. Newton knew that there were many ways to choose the spatial coordinates x (differing by rotations), and that his equations were equally valid in any of these Cartesian coordinate systems. But it seemed necessary that there was only one correct, universal choice for time. To Einstein, this necessity was not obvious.
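The claim that rotated Cartesian labels preserve the distance formula, while polar labels do not, is easy to check numerically. The following is only a minimal sketch: the two sample points, the angle, and all function names are illustrative choices, not part of the notes.

```python
import math

def rotate(point, theta):
    """Coordinates of the same point in axes rotated by theta, as in (1.1)."""
    x, y = point
    return (math.cos(theta) * x + math.sin(theta) * y,
            -math.sin(theta) * x + math.cos(theta) * y)

def cartesian_distance(p, q):
    """d = sqrt((x1 - x2)^2 + (y1 - y2)^2)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def to_polar(point):
    """Polar labels (r, theta) of a point given by Cartesian labels."""
    x, y = point
    return (math.hypot(x, y), math.atan2(y, x))

def naive_polar_distance(p, q):
    """The same formula, misapplied to the polar labels (r, theta)."""
    (r1, t1), (r2, t2) = to_polar(p), to_polar(q)
    return math.hypot(r1 - r2, t1 - t2)

# Two arbitrary sample points and an arbitrary rotation angle.
P1, P2 = (3.0, 1.0), (-1.0, 2.0)
theta = 0.7

d_old = cartesian_distance(P1, P2)
d_new = cartesian_distance(rotate(P1, theta), rotate(P2, theta))
d_polar = naive_polar_distance(P1, P2)

print(d_old, d_new)   # these two agree
print(d_polar)        # this one does not
```

The first two numbers coincide up to rounding, whatever angle is chosen; the third is simply a different number, which is the sense in which the Cartesian family is special.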

2. 16th through 18th Centuries

Notice what relativity does not say: it does not say that “everything is relative,” nor even that “all motion is relative.” For sure if your ship speeds up relative to the shore it will affect your ping-pong game! If it turns it will spoil your serves! The ball will appear to swerve to the side due to a “centrifugal force.” The only kind of motion we can say is undetectable is uniform straight-line motion of one system relative to another.

Or think of it this way: anybody can determine that the Earth rotates without looking up at the sky. Even if you’re stuck in a deep mine shaft, just make a Foucault pendulum and you can see it. The principle of relativity only applies to uniform straight-line motion. The thing that you can’t detect is the fact that the Earth is also whizzing around the Sun at high speed, but small acceleration.

Even with this caveat, the principle of relativity is a powerful, and in the 16th century unreasonable, law. To make it clear why it’s so powerful, let’s rephrase it this way:

R2 Suppose you do a lot of experiments in your lab on Earth. You figure out various laws of Nature describing those experiments. Your friend then sets up an identical lab in an airplane.¹ She does the same experiments while flying in uniform straight-line motion relative to you. She will get exactly the same results. All the laws of Nature will look precisely the same to her.

That’s what it means to say “relative uniform straight-line motion cannot be detected.” Any experiment that might try to detect it internally to her moving laboratory is governed by the same laws of motion, and so gives the same results as it would in your fixed laboratory. So who’s to say which of you is fixed, which is moving?

The principle of relativity constrains the form of Newton’s laws. For instance, an isolated object at rest stays that way. So, by relativity, an isolated object in uniform straight-line motion at velocity $v$ stays that way too: that’s Newton’s First Law. After all, the latter object seems to be at rest to an observer moving at uniform velocity $v$, and all the laws of physics must be the same for that observer, so he knows it will stay at rest relative to him.

A less trivial illustration comes from the structure of Newton’s Third Law. To get concrete, consider two heavy balls joined by a spring. They’re floating there in free space bouncing, isolated from everything else. Each has a position $x(t)$, which we want to find. To keep it simple suppose they both have the same mass $m$. Newton says $F = ma$, right? Well yes, but let’s be a bit more precise than that. Newton says

There is at least one frame of reference where the motion of the balls $x_1(t)$, $x_2(t)$ satisfies the equations

$$m\,\frac{d^2x_1}{dt^2} = -K\,(x_1 - x_2), \qquad m\,\frac{d^2x_2}{dt^2} = +K\,(x_1 - x_2). \tag{2.1}$$

Here $K$ is the usual spring constant. Why the big fuss? Well, there are many different ways to assign coordinates to points in space. For instance, if I felt like it I could label each point not by its value of $x$ but by the value of $x' = e^{x}$. Then definitely each ball has some motion $x'_{1,2}(t)$, but these functions don’t obey Newton’s law in its usual form (2.1). Newton is just saying that there’s always some way of labeling spacetime points that gives the law of motion his very simple form.

On the other hand, although most arbitrary changes of reference frame make Newton’s Third Law look ugly, still there are a few changes that leave it looking exactly the same.

Try this [1]: (a) Let $x' = x + (1\,\mathrm{cm})$. Change variables in (2.1) to see what it says about the functions $x_1'(t)$ and $x_2'(t)$. The new equation has exactly the same form, so we conclude that shifting the coordinate system doesn’t matter. This is a little like relativity but not quite; it’s another important feature of Newton’s laws called translation invariance. (b) Show that Newton’s laws have a similar invariance under rigid rotations too.

1 In particular, she sets up a frame of reference using clocks and meter sticks identical to mine. Logicians will notice that I’ve slipped in a tricky concept: What does it mean for two different things to be “identical”? The statement is tacitly implying that the two apparatuses came off the same factory line, the one in the plane was brought up to speed gently enough that the acceleration doesn’t break it, all relevant environmental factors like air pressure have been carefully made equal, other differences like cosmic rays are irrelevant to this particular apparatus, and so on.


What about relativity? It seemed simple enough to Newton. Consider the transformation to

$$x' = x - Vt, \qquad t' = t. \tag{2.2}$$

The new frame of reference $x'$ is moving relative to the old one $x$: for example, the motionless trajectory $\{x = \text{const}\}$ takes on the form $\{x' = \text{const} - Vt'\}$ in the new frame, which is not motionless. It seemed obvious that an observer moving uniformly to the right at velocity $V$ relative to the first observer would see the balls as being located at $x'_{1,2}(t')$ obtained from the first observer’s $x_{1,2}(t)$ by the Galilean transformation. Then the key question is, would both observers agree that Newton’s law, in its usual form, governs the motion?

Try this [2]: Solve (2.2) for $x$ and $t$ in terms of $x'$ and $t'$. Substitute into (2.1) to find what law of motion the uniformly moving observer sees. It’s just the same form as (2.1). For example, both observers agree that the balls bounce with the frequency $\omega = \sqrt{2K/m}$. That is, the bounce frequency cannot be used as a way to tell which frame you’re in.

So we conclude that Newton’s law of motion obeys one form of the Principle of Relativity, because it’s invariant under Galilean transformations. (We’ll call this Galilean Relativity, though really it was Newton who first understood it clearly.)

On the other hand, if the second observer is accelerating, so that $x' = x - \tfrac{1}{2}At^2$, then things are quite different:

Try this [3]: Again solve for $x$ in terms of $x'$, substitute into (2.1), and see what happens to its form.

So indeed accelerated frames are not the same as uniformly moving frames.

One key feature of the Galilean transformation (2.2) is that if we transform from one frame $F_1$ to a new one $F_2$ moving at speed $V$ relative to $F_1$, then transform again to a third frame $F_3$ moving at speed $W$ relative to $F_2$, the combined transformation is just the same as a single Galilean transformation; frame $F_3$ is moving relative to frame $F_1$ with some other speed (namely $V + W$).

Try this [4]: That’s obvious, right? But work through the equations anyway. Write $x''$, the coordinate in $F_3$, in terms of $x'$, the coordinate in $F_2$. Then substitute $x'$ in terms of $x$, the coordinate in $F_1$, and show that $x''$ in terms of $x$ is again a Galilean transformation.
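The conclusions of exercises [2] and [3] can be checked numerically. The sketch below takes one exact solution of (2.1), re-expresses it in a uniformly moving frame and in an accelerated frame, and measures by finite differences how badly each version violates (2.1). The numerical values of $m$, $K$, $V$, $A$, the particular solution, and all function names are illustrative assumptions.

```python
import math

m, K = 2.0, 5.0                  # illustrative mass and spring constant
omega = math.sqrt(2 * K / m)     # bounce frequency sqrt(2K/m)

# One exact solution of (2.1): the center drifts uniformly, the spring oscillates.
def x1(t):
    return 1.0 + 0.3 * t + 0.25 * math.cos(omega * t)

def x2(t):
    return 1.0 + 0.3 * t - 0.25 * math.cos(omega * t)

def second_derivative(f, t, h=1e-3):
    """Central-difference estimate of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2

def max_residual(xa, xb, times):
    """Largest violation of the two equations (2.1) along the trajectories."""
    worst = 0.0
    for t in times:
        r1 = m * second_derivative(xa, t) + K * (xa(t) - xb(t))
        r2 = m * second_derivative(xb, t) - K * (xa(t) - xb(t))
        worst = max(worst, abs(r1), abs(r2))
    return worst

def boosted(f, V):
    """Trajectory seen from a frame moving at velocity V, using (2.2)."""
    return lambda t: f(t) - V * t

def accelerated(f, A):
    """Trajectory seen from a frame with acceleration A: x' = x - A t^2 / 2."""
    return lambda t: f(t) - 0.5 * A * t**2

times = [0.1 * k for k in range(1, 50)]
V, A = 4.0, 1.5

res_rest = max_residual(x1, x2, times)
res_boost = max_residual(boosted(x1, V), boosted(x2, V), times)
res_accel = max_residual(accelerated(x1, A), accelerated(x2, A), times)

print(res_rest, res_boost)   # both tiny: (2.1) holds in both frames
print(res_accel)             # close to m * A: (2.1) fails in the accelerated frame
```

In the accelerated frame the mismatch comes out as $mA$, which is just the familiar fictitious force that the accelerating observer would have to add to (2.1) to make it work.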

3. Uh-oh

What bothered Einstein and others was that Maxwell’s equations for electrodynamics did not remain invariant under Galilean transformations. You can see this yourself using a simple example. In a plane wave the two transverse components Ey and Ez of the electric field each obey the electromagnetic wave equation

$$c^2\,\frac{\partial^2 E}{\partial x^2} - \frac{\partial^2 E}{\partial t^2} = 0. \tag{3.1}$$

Convince yourself that you get a solution if you take $E(x, t)$ to be of the form $f(x \pm ct)$ for any function $f$.
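If you’d like numerical reassurance, here is a minimal finite-difference sketch. It checks that a pulse of the form $f(x \pm ct)$ satisfies (3.1), and that the same pulse rewritten in the Galilean coordinates of (2.2) does not satisfy an equation of the same form. The Gaussian pulse shape, the choice of units with $c = 1$, the value of $V$, and the function names are all assumptions made for illustration.

```python
import math

c = 1.0   # units chosen so that c = 1 (an illustrative choice)

def f(s):
    """An arbitrary smooth pulse shape; any twice-differentiable f would do."""
    return math.exp(-s * s)

def wave_residual(E, x, t, h=1e-3):
    """c^2 E_xx - E_tt, estimated with central differences."""
    E_xx = (E(x + h, t) - 2 * E(x, t) + E(x - h, t)) / h**2
    E_tt = (E(x, t + h) - 2 * E(x, t) + E(x, t - h)) / h**2
    return c**2 * E_xx - E_tt

def E_right(x, t):
    return f(x - c * t)

def E_left(x, t):
    return f(x + c * t)

V = 0.5 * c

def E_galilean(xp, tp):
    # The same right-moving pulse, with x = x' + V t' and t = t' from (2.2).
    return f(xp + V * tp - c * tp)

points = [(0.3, 0.2), (-0.5, 1.0), (1.2, 0.7)]
res_right = max(abs(wave_residual(E_right, x, t)) for x, t in points)
res_left = max(abs(wave_residual(E_left, x, t)) for x, t in points)
res_galilean = max(abs(wave_residual(E_galilean, x, t)) for x, t in points)

print(res_right, res_left)   # both near zero
print(res_galilean)          # clearly nonzero
```

The third residual is nonzero because in the primed coordinates the pulse travels at $c - V$, while the equation being tested still insists on speed $c$.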


Try this [5]: (a) Reexpress $x, t$ in terms of $x', t'$ (from (2.2)). Show that $E = f(x \pm ct) = f(x' + Vt' \pm ct')$ does not obey the equation $c^2\,\frac{\partial^2 E}{\partial x'^2} - \frac{\partial^2 E}{\partial t'^2} = 0$. (b) Repeat the analysis of exercise [2]: Take the wave equation for $E(x, t)$ and substitute $x$ in terms of $x'$ from (2.2). Does the new equation have the same form as the old? (Use the chain rule to express $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial t}$ in terms of $\frac{\partial}{\partial x'}$, $\frac{\partial}{\partial t'}$ and then substitute.)

Actually, we need not have gone to so much trouble. The EM wave equation predicts a single, universal speed for waves, and the mere existence of any object or process with only one allowed speed contradicts Galilean invariance.

By itself, this observation isn’t so shocking. After all, sound obeys the same wave equation as light (with a much smaller $c_{\rm sound}$), and nobody worries about that. But these are physically very different situations. In fact (3.1) is only correct for sound moving through air at rest relative to the observer. If lightning strikes a mile away, you’ll hear the thunder sooner if the wind is blowing towards you, later if it’s blowing away. If we write a more general form of the wave equation to describe sound moving through air in motion, then we’ll find it has Galilean invariance after all.

Inverting this logic, suppose we knew or suspected a priori that physics had Galilean invariance, and we were presented with the non-invariant equation (3.1) and told that it describes sound propagation. We’d then be led to conclude that some physical quantity was missing from the equation: the air speed $V_0$. We could even try to find the complete equation, including the effect of the air velocity, by fiddling with the equation until it becomes Galilean-invariant!² We could then use the upgraded equation to make falsifiable experimental predictions about the propagation of sound under various wind conditions. For sound, those predictions turn out to be true.

The important thing about light, however, is that unlike sound it can move through vacuum!³ So we cannot fix up the EM wave equation by introducing the speed of the medium. In the late nineteenth century, the resolution to this puzzle seemed obvious: The state prepared by a vacuum pump must actually be filled with some invisible medium, and (3.1) is only valid in the special case where the observer is at rest with respect to the medium. This “æther hypothesis” made testable predictions, which turned out to be false.⁴

Einstein decided to bite the bullet and live with the consequences of the hypothesis that the EM wave equation was complete as it stands, and in particular not Galilean-invariant. He reasoned that, although the speed of sound depends on the temperature, pressure, and composition of the air, the speed of light in vacuum was a much more universal thing: there is only one kind of vacuum, and since the EM wave equation has no dispersion, there’s only one kind of light too. He raised $c$ up to the status of a universal, fundamental constant.⁵

So here’s the puzzle: the EM wave equation (3.1) says that light waves travel at velocity $c$,

2 Or we could just rederive it more carefully for the general case where the air (or violin string, etc.) is moving relative to the observer at speed $V_0$. Either way, the full equation is $\left[\left(\frac{\partial}{\partial t} + V_0\frac{\partial}{\partial x}\right)^2 - c^2\frac{\partial^2}{\partial x^2}\right]E = 0$. You can show that this equation is form-invariant under (2.2) (remember that $V_0' = V_0 - V$).
3 In fact, (3.1) is only correct in this situation: light travels slower than $c$ in glass.
4 Eventually the combination of Michelson-Morley, aberration of starlight, and the transverse character of light convinced people that the classical æther theory was untenable.
5 In other work published the same year (1905) he helped do the same thing for $k_B$ and $\hbar$.

period. If this is to be a law of Nature, apparently it must be true in every good reference frame. But in different good reference frames, according to Galileo, the same object will appear to have different velocities, a contradiction.

More abstractly, if (3.1) is to be consistent with the Principle of Relativity, apparently it must have Galilean invariance as it stands. And as you just showed, it doesn’t. Since (3.1) came from Maxwell’s equations, they too lack Galilean invariance. Again, this is no problem for sound: if we switch to a frame of reference moving with respect to the air, then (3.1) no longer holds. But light doesn’t require air.

Once more: Maxwell’s equations aren’t consistent with the Principle of Relativity as interpreted by Galileo. To most people it seemed clear that this was no problem: Maxwell’s equations were less than 20 years old. “Obviously” they just needed to be changed. Unfortunately, when people tried changing them to fix up this blemish, the modified equations predicted new electromagnetic phenomena that were sought but not found.

4. Einstein’s Formulation of the Puzzle

Einstein’s great insight was to realize that

(a) Galilean invariance was extremely well tested, but only for things moving much more slowly than light. On the other hand, the paradox involves light itself. So maybe the correct laws of Nature are not invariant under Galilean transformations, but rather under some other transformations that happen to look like Galilean ones for speeds much smaller than $c$.

(b) The Galilean transformation is not the only way to implement the Principle of Relativity. It contains a hidden assumption. Changing that assumption has some weird consequences, but did let Einstein retain both the EM wave equation and relativity. And it turned out to be testable, and right.

Einstein sharpened up the Principle of Relativity a little, to say

R3 There are some good frames of reference in which the laws of nature have a simple form, and moreover those laws all have the same form in each good frame.⁶ All these good frames are moving at constant velocity relative to each other.

Thus any object that appears to be moving at constant velocity $v$ in one good frame will also appear to be moving at some other constant velocity $v'$ in any other. Thus again it’s impossible to tell which frame is “really” at rest, since the laws governing all experiments in one good frame are the same in any other one. The proposed law R3 doesn’t overturn the other two formulations R1–R2; it’s just more specific.

We saw that Newton’s laws obey R3; the good frames are all related by Galilean transformations. An object moving at constant velocity, $x(t) = vt$, will in the new frame of reference obey $x'(t') = (v - V)t'$, so it now moves with constant velocity $v' = v - V$. Again, Newtonian physics satisfies the Principle of Relativity because it’s invariant under Galilean transformations.

What Einstein realized was that this is not the only way to satisfy R3. A theory can instead be invariant under some other set of transformations and still have R3!
The bizarre thing is that whatever these transformations are, if an object happens to be moving in any direction at speed c in one good frame it must also be moving at speed c in any other one!

6 Most books use the term inertial frame to refer to the “good” frames. That’s because in these frames the law of inertia applies: things at rest stay at rest, unlike in accelerated frames.


(Otherwise a light pulse in the second frame would not obey the Maxwell equations in that frame.) But objects moving much slower than $c$ should have their velocity change as usual, since after all Newtonian physics is quite accurate at ordinary velocities. (Certainly a train seems to be moving slower when viewed from a car moving in the same direction!)

Again: We are looking for a set of transformations of reference frame (coordinates on spacetime), with two apparently contradictory properties:

· Any trajectory at speed $c$ must also appear in the new frame to be moving at $c$.
· But a trajectory at speed $|v| \ll c$ must transform to a trajectory with velocity $v' \approx v - V$.

It may seem impossible to find any set of transformations that accommodates both these requirements, but we’re about to do just that. The key lies in drawing pictures.
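To see concretely why the two requirements above look contradictory, here is a tiny sketch applying the Galilean velocity rule $v' = v - V$ that follows from (2.2). The train and car speeds and the function name are made up for illustration.

```python
C = 299_792_458.0   # speed of light in vacuum, in m/s

def galilean_velocity(v, V):
    """Velocity seen from a frame moving at V, according to (2.2): v' = v - V."""
    return v - V

V = 30.0                                 # a car moving at 30 m/s (illustrative)
train = galilean_velocity(40.0, V)       # a 40 m/s train, seen from the car
light = galilean_velocity(C, V)          # a light flash, seen from the car

print(train)       # 10.0: the second requirement is satisfied
print(light - C)   # -30.0: the first requirement is violated
```

The Galilean rule handles the train just as everyday experience demands, but it cannot leave the speed of the light flash alone.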

5. A Pictorial Solution

Again: Einstein realized that the choice of Galilean transformations to implement relativity made a hidden assumption.⁷ This was to assume that all good frames of reference agree on the meaning of time. They assign different spatial coordinates $x' \neq x$ to a given event, but all agree on the time: $t' = t$ (see (2.2)).

Upon closer reasoning, Einstein realized that he couldn’t figure out a way in practice for real-life observers to make sure that their clocks remained synchronized with some hypothetical universal time (and hence with each other). It may be simple to synchronize two clocks at the instant when they pass by each other, but later, when they’re separated in space, the idea of synchronization becomes more tricky: to compare times, one would have to send a message to the other, for example via a radio signal, but radio signals don’t travel instantaneously. Once again Einstein bit the bullet and decided that

· “Time” has only a personal meaning; it’s a number that a given observer attributes to an event using his own personal clock.
· If two observers are in uniform motion relative to each other, then the readings on their (identical) clocks will be related by whatever invariance governs the mechanisms of those clocks.
· No known experiments rule out the possibility that these invariance transformations change the value of $t$ attributed to an event.

So let’s look into the Merely Mathematical question of whether we can get out of our difficulties by proposing that the good frames of reference are all related by a set of transformations that change both the space and time coordinates of an event. Then of course we’ll have to look into the experimental question of whether these transformations actually are invariances of Nature.

The pictures we will draw are called space-time diagrams. I’ll draw an entire little story on the $xt$ plane. Each point on this plane is an event. A curve in this plane is a string of events, or the trajectory of a moving object. We sometimes call such a curve the world-line of the object. A straight line is the trajectory of an object moving at constant velocity given by the slope.

We are interested in things moving at or near the speed of light, so instead of plotting $x$ vs. $t$ let’s plot $x$ vs. $ct$. On such a graph light rays move on lines of slope 1, while everyday objects

7 Just to keep the record straight, Newton actually was careful to bring this assumption out explicitly. But after a couple of centuries people just took it for granted.

like cars move on lines of extremely steep slope. For example, 1 m/sec is a line that rises about 300 000 000 meters for every meter to the right.

We can look at an object from more than one frame of reference. For example my friend in the airplane can look at her own experiments, or fly through my lab and look at mine (hey, watch it!). The object and its world-line are always the same, but the coordinate axes are different in the two frames. For example if my friend is at rest relative to me, but some distance away, her $t'$ coordinate axis will just be shifted horizontally in the $xt$ plane compared to mine, since her $t'$ axis (the locus of points with $x' = 0$) is different from my $x = 0$ axis. Similarly if our watches are identical but she’s using Pacific time, then our $x$-axes will be shifted too. From now on I’ll forget such rather trivial shifts, supposing we’ve arranged that there is some event where we both agree that $x = x' = ct = ct' = 0$; our coordinate systems have the same origin.

Please take a moment to convince yourself that when my friend is moving relative to me at uniform speed, the Galilean transformation means that her $t'$-axis is bent over relative to mine, whereas her $x'$-axis is the same as mine:

[Figure: Galilean Transformation]

We can see geometrically what we saw earlier algebraically: any straight line will have different slopes when expressed in the two coordinate systems. In particular a light-flash trajectory of slope 1, which ticks off equal units of $x$ for equal units of $ct$, won’t tick off equal units of $x'$ in equal units of $ct'$. That’s the problem.⁸

Let’s try to find an alternative to the Galilean transformation. First, whatever the transformation we’re looking for, it had better be linear, since it’s supposed to take straight lines to straight lines (see R3). So it’s just a matter of changing the coordinate axes. A familiar example is the rotation transformation, eqn. (1.1):

$$\begin{pmatrix} x' \\ ct' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ ct \end{pmatrix} \tag{1.1}$$

[Figure: Rotation]

8 We could try to fix this problem by rescaling the tick marks on the $t'$ axis. This would work for the path moving to the right (shown), but would then fail for a similar path moving to the left. The construction below works for all light-flash paths.


It’s linear, all right.

Try this [6]: Show that it also has the nice feature that if you do a rotation, then another rotation, the combined transformation is again of the form (1.1) (with angle $\theta_1 + \theta_2$). [Compare question [4] above.]

But it’s not quite what we want! Any straight line will of course change slope under a rotation. The line I drew above bisects the $xt$ axes but not the $x't'$ axes. We want one class of straight lines, those of slope 1, to keep their slope unchanged.

We’re almost there. There’s one class of linear transformations that change the slope of most lines but keep slope-1 lines fixed. Simply bend the $t'$ axis to the right a bit and bend the $x'$ axis up a bit!⁹

[Figure: Lorentz Transformation]

That’s Einstein’s solution. An object at rest in the new frame of reference is indeed moving in the old one (that is, the $t'$ axis has some slope away from the old $t$ axis). But the light ray, which bisects the $xt$ axes, also bisects the $x't'$ axes. So let’s try letting

$$\begin{pmatrix} x' \\ ct' \end{pmatrix} \stackrel{?}{=} \begin{pmatrix} 1 & -\beta \\ -\beta & 1 \end{pmatrix} \begin{pmatrix} x \\ ct \end{pmatrix} \qquad \text{(provisional guess)} \tag{5.1}$$

where $\beta$ is some constant. Note that (5.1) looks a little like (1.1), except for one all-important sign change: (1.1) bent the $x'$-axis down whereas we need to bend it up.

The key observation is that (5.1) creates a new frame of reference moving uniformly relative to the original one. So if the laws of Nature turn out to be invariant under (5.1) then we can say that they obey the Principle of Relativity R3.

Key exercise [7]: (a) Return to exercise [5] above. Take a plane-wave solution to the EM wave equation in the $x, t$ frame, rewrite the same solution in terms of $x', t'$ (see eqn. (5.1)), and show that it also solves a wave equation of the same form in these variables. (b) Substitute (5.1) into the EM wave equation (3.1) to show that it really is an invariance as desired. (Compare exercise [5]b above.) (c) Then substitute into Newton’s law (2.1) to show it’s not invariant under (5.1). (d) But show that (5.1) becomes approximately the same as the Galilean transformation with velocity $V = \beta c$ when we take $\beta \ll 1$. Since that means $V \ll c$, this limiting case corresponds to everyday experience.

We are very close, but (5.1) has some remaining flaws:

Try this [8]: Show that: (a) If you transform once with $\beta$, then transform again with $-\beta$, you don’t get back your original coordinate choice. Compare with the corresponding property of Galilean transformations [4]. (b) If you transform once with $\beta_1$, then transform again with $\beta_2$, you don’t get something of the form (5.1). (Again compare to the Galilean case, where the combined

9 Of course we could equally well bend the t axis to the left and the x axis down; this corresponds to negative values of β.

transformation is itself a Galilean transformation.)

So (5.1) doesn’t quite make sense. Luckily we can fix this flaw without messing up the good properties ([7] above) by multiplying $x'$ and $t'$ by an overall constant. Then light rays still tick off equal amounts of $x'$ in equal amounts of $ct'$. Einstein’s famous proposal had actually been constructed earlier by Lorentz, but as a pure abstraction, a mathematical curiosity about the Maxwell equations. It’s

$$\begin{pmatrix} x' \\ ct' \end{pmatrix} = \frac{1}{\sqrt{1-\beta^2}} \begin{pmatrix} 1 & -\beta \\ -\beta & 1 \end{pmatrix} \begin{pmatrix} x \\ ct \end{pmatrix} \qquad \text{Lorentz Transformation} \tag{5.2}$$

We often use the abbreviation $\gamma = (1 - \beta^2)^{-1/2}$. This factor is never smaller than one; it’s very close to one for velocities much smaller than $c$, but it goes to infinity as $v \to c$. How could one have guessed this formula? Well, it’s not that hard to find the right multiplier just by playing around with the math.¹⁰ But there’s a deeper answer in Sect. 9 below.

Try this [9]: Show that: (a) The Lorentz transformation has all the nice properties you found in [7]a,b. (b) If you do two Lorentz transformations in succession, you again get a Lorentz transformation, much as in exercises [4] and [6]. (c) The peculiar thing is that now the combined transformation has velocity $\beta c$, where $\beta$ is given by the weird formula

$$\beta = \frac{\beta_1 + \beta_2}{1 + \beta_1\beta_2}. \tag{5.3}$$

Sorry, this takes a bit of algebra.

The first key thing to notice about (5.3) is that if $\beta_1 = 1$ (so $V_1 = c$), then $\beta = 1$ too, regardless of $\beta_2$. That’s just what we wanted: something moving at $c$ in one good frame moves at $c$ in any other good frame. Also, if $\beta_1 \le 1$ and $\beta_2 \le 1$ then $\beta \le 1$ as well. So it’s not inconsistent to require that

Nothing can move faster than $c$,

since this law is true in any good frame if it’s true in any other one. (We’ll see later why we must require this.) Another key point is

At ordinary velocities $\beta \ll 1$ the Lorentz transformation (5.2) becomes approximately the same as the Galilean transformation (2.2), and (5.3) reduces to the ordinary rule for adding velocities.

Finally I’ll state without proof:

In three space dimensions, choose axes $\hat{x}, \hat{y}, \hat{z}$ such that $\hat{x}$ is along the direction of $V$. Then augment (5.2) by defining $y' = y$, $z' = z$. The resulting 4-dimensional transformation again has the property that any straight-line trajectory of constant speed equal to $c$, viewed in the new frame, is again a straight line of constant speed equal to $c$.
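The algebra in exercises [8] and [9] can be spot-checked with a few lines of plain Python. This is only a numerical sketch for two sample values of $\beta$, not a proof; the helper names and the sample values are illustrative assumptions.

```python
import math

def provisional(beta):
    """Matrix of the provisional guess (5.1), acting on the column (x, ct)."""
    return [[1.0, -beta], [-beta, 1.0]]

def lorentz(beta):
    """Matrix of the Lorentz transformation (5.2)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta], [-gamma * beta, gamma]]

def matmul(A, B):
    """Product of two 2x2 matrices: first apply B, then A."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(M, event):
    """New coordinates (x', ct') of an event (x, ct)."""
    x, ct = event
    return (M[0][0] * x + M[0][1] * ct, M[1][0] * x + M[1][1] * ct)

def add_velocities(b1, b2):
    """Combined velocity from (5.3), in units of c."""
    return (b1 + b2) / (1.0 + b1 * b2)

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

b1, b2 = 0.6, 0.7                     # sample velocities in units of c
identity = [[1.0, 0.0], [0.0, 1.0]]

# [8]a: undoing (5.1) with -beta leaves a stray factor (1 - beta^2).
undo_provisional = matmul(provisional(-b1), provisional(b1))
# With the multiplier of (5.2), -beta really is the inverse.
undo_lorentz = matmul(lorentz(-b1), lorentz(b1))

# [9]b,c: two boosts in succession equal one boost with beta from (5.3).
composed = matmul(lorentz(b2), lorentz(b1))
combined = lorentz(add_velocities(b1, b2))

# A point on a slope-1 light ray stays on a slope-1 line.
light_event = apply(lorentz(b1), (1.0, 1.0))

print(close(undo_provisional, identity), close(undo_lorentz, identity))
print(close(composed, combined))
print(light_event, add_velocities(1.0, b2))
```

For these sample values the stray factor left behind by (5.1) is $1 - 0.6^2 = 0.64$, which is exactly what the multiplier $1/\sqrt{1-\beta^2}$, applied once in each direction, is designed to cancel.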

6. Summary

So here is the situation so far. We’ve got the Principle of Relativity R3, and two incompatible ways to implement it:

10 Require that the multiplier depend only on the speed |β|, and that the transformation with β replaced by −β should be the inverse to (5.2).


i) Galileo’s Relativity: There are many good frames of reference. All laws of Nature look the same in any good frame. Any good frame is related to any other one by a Galilean transformation. ii) Einstein’s Relativity: There are many good frames of reference. All laws of Nature look the same in any good frame. Any good frame is related to any other one by a Lorentz transformation. We’ve seen that Newton’s laws obey (i) but not (ii), while electrodynamics has it the other way around. What’s wrong with that? Well, suppose there were a frame in which both Newton’s laws and Maxwell’s laws were valid in their usual forms. There would be many other frames where Newton’s laws looked the same, but Maxwell’s didn’t, and many frames where it’s the other way round, but no other frames where both looked the same.11 So we could detect absolute motion: the true rest frame is the one in which both look simple at once. The existence of one “best” frame then contradicts relativity in any of its three forms R1–R3. Something has got to give. Either relativity, or Newton, or Maxwell has to go. In the end it’s an experimental question. In light of the failures to fix up Maxwell, Einstein proposed that Newton’s laws were the ones needing repair. It was reasonable since the change will be very minor in the realm of everyday experience. In fact, the above reasoning should make it clear that any new kind of particle or force must also have Lorentz invariance. So Einstein’s proposal wasbold indeed: it applied even to phenomena he had not yet seen nor even imagined. And indeed every new law of Nature discovered since then has proved to be Lorentz invariant. It’s one of the biggest successful generalizations in the ! The existence of laws of this sweeping generality is a miracle, the basic miracle of physics. It’s what gives physical law a different character from the rules governing other branches of science. 
Again: What’s revolutionary about Einstein’s logic is not just the factual content of his proposal, but also the method. Till then, the general approach had been to propose a law of Nature, then test it. Instead Einstein went to the next higher level, writing a transformation principle that’s proposed to be an invariance of all laws of Nature, whatever they may turn out to be. If this kinematic principle has promising consequences, then we can go about the job of finding specific equations of motion compatible with it.

Think back to the prolog to these notes, §1. Just as when rotating we describe the same thing from different perspectives by transformed coordinates, so when we move we describe the same events by transformed coordinates. Millions of years of evolution have not, however, equipped us with accurate hardwired intuition about velocities close to c. We have to write equations, draw diagrams, and tell little stories about light rays in order to understand relationships no more complicated than those in rotations, relationships about how the same situation will look from different perspectives. In fact, while time isn’t “just like” space (x and t are not equivalent directions), still the fact is that space and time mix under change of good reference frame. It’s weird, but it’s experimentally testable.

7. Geometrical implications

1. The Galilean transformations have a nice property: both the original and the primed observers agree that event R is simultaneous with P, whereas S precedes P and Q follows P (figure panel (a)). Geometrically, this is a matter of whether you’re above or below the x axis.

11 Except for trivial, constant translations and spatial rotations.

But we are making the working hypothesis that the good reference frames are related by Lorentz, not Galilean, transformations. An observer moving to the right (panel (b)) would disagree with the original observer, saying that R precedes P (it lies below the $x'$ axis)!12 Similarly, a leftward-moving observer (panel (c)) would say that S follows P. Interestingly, however, all observers agree that T is later than P, and U is earlier. That’s because these points lie beyond the wavy lines at 45° to the axis, and we can never bend the $x'$ axis past these. Algebraically, we say that the temporal ordering of P and Q is unambiguous only if $|t_Q - t_P| > |x_Q - x_P|/c$.

2. Even though we have arranged for the speed of light to be invariant, nevertheless the frequency of a given light wave won’t look the same in two frames. Here I’ve drawn diagonal lines representing the trajectories of successive wave crests for a wave moving to the right:

12 An even faster-moving observer would also say that Q precedes P!!

These wave crests lie on the lines $\{ct = x + ncT\}$ in the $xt$ plane, where $n = \ldots, 0, 1, 2, \ldots$. We want the period $T'$ in the primed coordinate system. This is the value of $t'$ when the second wave crest intersects the $t'$ axis. To find the intersection, solve $x' = 0$ with $ct = x + cT$, or $\gamma(\beta x' + ct') = \gamma(x' + \beta ct') + cT$. The intersection is then $x' = 0$, $t' = T/(\gamma(1-\beta))$. This value of $t'$ is the period of the wave:

$$T' = T\sqrt{\frac{1+\beta}{1-\beta}} \qquad \text{(Relativistic Doppler shift)} \tag{7.1}$$

Note the sign of the effect: if we move to the right, and the wave overtakes us, we see a longer period than that measured on the light source. Eqn. (7.1) is another experimentally testable conclusion—it’s different from the Doppler shift seen in Galilean-invariant systems (like sound). It’s also the key to interpreting the shifted spectra of distant galaxies: they’re receding from us, and hence their spectral lines are shifted to longer periods (toward the red).
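The derivation of (7.1) can be spot-checked numerically. The sketch below is only an illustration, not part of the original notes: it assumes units with c = 1 by default, assumes the Lorentz transformation has the form $x' = \gamma(x - \beta ct)$, $ct' = \gamma(ct - \beta x)$, and uses function names invented here. It locates the event where the crest $ct = x + cT$ meets the moving observer’s worldline, transforms it, and compares the resulting $t'$ with the closed form.

```python
import math

def lorentz(x, ct, beta):
    """Coordinates (x', ct') in a frame moving at speed beta*c along +x (assumed form)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (x - beta * ct), g * (ct - beta * x)

def doppler_period_closed_form(T, beta):
    """Eqn. (7.1): period seen by the moving observer."""
    return T * math.sqrt((1.0 + beta) / (1.0 - beta))

def doppler_period_geometric(T, beta, c=1.0):
    """Intersect crest n = 1 (ct = x + cT) with the worldline x = beta*ct,
    then read off t' of that event."""
    ct = c * T / (1.0 - beta)   # solves ct = beta*ct + cT
    x = beta * ct
    x_p, ct_p = lorentz(x, ct, beta)
    assert abs(x_p) < 1e-12     # the event lies on the t' axis
    return ct_p / c

T, beta = 1.0, 0.6
print(doppler_period_geometric(T, beta), doppler_period_closed_form(T, beta))
```

For β = 0.6 both routes give a period about twice the emitted one, the lengthening described in the text.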

8. Dynamical implications

I haven’t talked much about clocks and rods and trains and all that traditional stuff. It’s not the main point. Einstein cooked up all those stories to convince himself and others that (ii) was reasonable, that indeed a moving observer really would disagree with me about the times of events. You have to think very closely about the actual process of measurement, a little like the way Heisenberg later came to realize that exact measurement of position and momentum was a practical impossibility. It’s fun, but it’s not the main thing. In practice we do not spend much time pole-vaulting into rooms shorter than our pole while moving near the speed of light, etc. Instead of such parlor games let’s see what practical (or at least testable) consequences flow from Einstein’s proposal.

1. First of all, we have found a neat geometrical end-run that allows us to reconcile relativity with the existence of a class of objects with one universal velocity in vacuum (for example light pulses, as demonstrated by the Michelson–Morley experiment). We’d be in big trouble, though, if we ever found a second kind of particle with a different, also universal, velocity! There’s no way that two different slopes could both be preserved by a nontrivial family of linear changes of variables. Electrons come with all different allowed speeds, so that’s no problem. Indeed no other fixed-speed particles are known besides photons and gravitons, and gravitons haven’t yet been detected.13

13 Until recently it was thought that neutrinos were another class of fixed-speed (massless) particles. Indeed the neutrinos coming from Supernova 1987A spent many years getting here, but they all arrived at nearly the same time: there was hardly any dispersion in their arrival, and hence all had nearly the same velocity. Originally this fact was interpreted as meaning that they all moved at exactly c; sure enough, the neutrinos all arrived within a few hours of the light of the explosion, after a trip of many years. After the discovery that neutrinos actually have a tiny mass, we now know that they don’t have a fixed universal speed after all; one can in principle create them at rest. It’s just extremely unlikely that a nuclear reaction, releasing millions of eV, would create a neutrino with kinetic energy comparable to its mass, a few eV. It’s much more probable that such a neutrino would be created with kinetic energy much bigger than its rest mass, and hence speed very close to c. Thus the small dispersion in arrival times of the neutrinos is now interpreted as giving an upper bound on their rest mass, a bound obeyed by the subsequent experimental measurement of that mass.

2. In water light travels fast, but not quite at speed c (the index of refraction reduces the speed). This fact permits an elegant experimental test of the very weird formula (5.3)! Suppose a light flash moves with speed $\beta_1 c$ in still water. When it moves through water that is moving with speed $\beta_2 c$ relative to the lab, its speed should then be $\beta c$ in the lab frame, with β given by (5.3). Max von Laue noticed that this observation explained data taken years earlier by Fizeau and others.

3. It may seem hard to make further testable predictions without knowing how to fix up Newton’s laws of motion. Einstein realized that the best way to try to guess a new law consistent with Lorentz invariance was to focus on conservation laws. He asked, can the world have Lorentz invariance and still have four laws of conservation, which reduce to the usual conservation of p and E in the everyday world?

Einstein seems to be making a big assumption here: Although he says that some cherished results are only approximately valid (for example the old addition of velocities formula), nevertheless he’s insisting that there should be versions of the conservation laws that, although slightly different from Newton’s, are exact. If he’s right, however, we can find the correct form of the conservation laws by demanding that, like all other laws of Nature, they should be invariant under Lorentz transformations. Then we can ask if the modified laws are experimentally testable.

According to Newton, the quantity $\mathbf{p}_{\rm tot} = \sum_i m_i \mathbf{v}_i$ summed over all the particles in an isolated system doesn’t change in time. For instance in our two-ball system (2.1), each ball keeps reversing its motion and yet the total momentum is fixed. [You should review how this follows from (2.1).] Now in order to be a law of Nature, the conservation of momentum must be true in every good frame of reference. If we move to a new frame moving at constant velocity $\mathbf{V}$ relative to the original one, the Galilean transformation says that every velocity $\mathbf{v}_i$ changes to $\mathbf{v}_i' = \mathbf{v}_i - \mathbf{V}$, so the momentum seen in the new frame differs by a constant, $-(\sum_i m_i)\mathbf{V}$. So if it doesn’t change with time in the old frame, the same is true in the new frame. That’s good.

Unfortunately this tidy situation fails if the true invariances are Lorentz transformations. As usual let’s think only of one space dimension. Consider a collision in which one object changes velocity from $v_1$ to $u_1$, and another from $v_2$ to $u_2$. You can calculate that even if in one frame $m_1 v_1 + m_2 v_2 = m_1 u_1 + m_2 u_2$, in another frame related by a Lorentz transformation we won’t in general have $m_1 v_1' + m_2 v_2' = m_1 u_1' + m_2 u_2'$. The Newtonian conservation law is not Lorentz invariant, and so it cannot be a law of Nature if all laws of Nature are Lorentz invariant.

Luckily there’s a cure. There is a good candidate for a conserved vector quantity, but it’s not $\sum_i m_i \mathbf{v}_i$. Instead it’s

$$\mathbf{p} = \sum_i \frac{m_i \mathbf{v}_i}{\sqrt{1-(v_i/c)^2}} \qquad \text{(Relativistic Momentum)} \tag{8.1}$$

The following section will show that if (8.1) is conserved in one frame, it is also conserved in any Lorentz-transformed frame. Thus the claim that the laws of Nature are such that relativistic momentum is conserved in all isolated systems is a legitimate proposed law, and hence worth testing experimentally.
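The contrast between the Newtonian and relativistic momentum sums can be illustrated numerically in one space dimension. The sketch below is an illustration under stated assumptions, not part of the original notes: velocities are in units of c, the velocity transformation is taken to be the standard one-dimensional rule $\beta' = (\beta - \beta_f)/(1 - \beta\beta_f)$ (standing in for (5.3)), and the two collisions and all function names are made up for the example. In collision A the Newtonian sum is conserved in the original frame; in collision B the sum (8.1) is conserved there. Both are then viewed from a frame moving at $\beta_f = 0.5$.

```python
import math

def gamma(beta):
    return 1.0 / math.sqrt(1.0 - beta * beta)

def boost_velocity(beta, frame_beta):
    """Velocity (units of c) seen from a frame moving at frame_beta.
    Assumed one-dimensional velocity-addition rule, standing in for (5.3)."""
    return (beta - frame_beta) / (1.0 - beta * frame_beta)

def newtonian_p(masses, betas):
    return sum(m * b for m, b in zip(masses, betas))

def relativistic_p(masses, betas):
    return sum(m * gamma(b) * b for m, b in zip(masses, betas))

def change_in_boosted_frame(momentum, masses, before, after, frame_beta):
    """Total momentum after minus before, as seen from the boosted frame."""
    b = [boost_velocity(v, frame_beta) for v in before]
    a = [boost_velocity(u, frame_beta) for u in after]
    return momentum(masses, a) - momentum(masses, b)

masses = (1.0, 3.0)
frame_beta = 0.5

# Collision A: velocities reverse; the Newtonian sum is zero before and after.
before_a, after_a = (0.6, -0.2), (-0.6, 0.2)

# Collision B: velocities reverse; the relativistic sum (8.1) is zero before and after.
u2 = -masses[0] * gamma(0.6) * 0.6 / masses[1]   # required gamma*beta of particle 2
beta2 = u2 / math.sqrt(1.0 + u2 * u2)
before_b, after_b = (0.6, beta2), (-0.6, -beta2)

newton_change = change_in_boosted_frame(newtonian_p, masses, before_a, after_a, frame_beta)
rel_change = change_in_boosted_frame(relativistic_p, masses, before_b, after_b, frame_beta)
print(newton_change, rel_change)
```

In the boosted frame the Newtonian sum changes by a clearly nonzero amount, while the relativistic sum is unchanged up to rounding. Collision B also conserves energy, since each particle keeps its speed.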


Notice that you cannot push a baseball up to the speed of light, for then its momentum would become infinite. Indeed it soon became clear experimentally that electrons in accelerators don’t obey the usual cyclotron law once their velocity gets up near c. They get harder and harder to push, just as Einstein predicted.14 On the other hand, at small velocities the relativistic momentum just approaches the old formula; it ties on to everyday experience.

What about energy? Again you can show by working an example that if a collision has the property that $\sum_i \frac{1}{2} m_i v_i^2$ is conserved in one good frame of reference, nevertheless the corresponding quantity won’t in general be conserved in another one. So conservation of this quantity cannot be a law of Nature either, if Einstein’s form of the Principle of Relativity is the right one. But the choice

$$E = \sum_i \frac{m_i c^2}{\sqrt{1-(v_i/c)^2}} \qquad \text{(Relativistic Energy)} \tag{8.2}$$

does work. In this formula the $m_i$ are constants intrinsic to the various objects. That is, an object’s mass doesn’t change when we change our viewpoint.15 Just as before, Einstein’s proposal is that the laws of Nature are such that relativistic energy is conserved in any isolated system.

Try this [10]: A subatomic particle of mass M is at rest in the lab. Then it disintegrates into two lighter particles (“electrons”), each of mass m. They fly away along the $\pm\hat{x}$ directions. Thus the relativistic momentum equals zero before and after the disintegration, at least in this frame of reference. (a) Find the velocities ±v. (b) Now transform all three velocities (0, +v, and −v) into a new reference frame moving along the $+\hat{x}$-axis at speed V = −v. Using (8.1)–(8.2), confirm that relativistic energy and momentum also balance in this new frame. [If you’re still not convinced, you can try it again for an arbitrary choice of V.]

At first sight (8.2) seems unrelated to the Newtonian $\frac{1}{2}mv^2$. But consider ordinary velocities $v \ll c$. Then rearranging we get

$$E \approx \sum_i m_i c^2 + \sum_i \tfrac{1}{2} m_i v_i^2 + \cdots . \tag{8.3}$$

The dots represent terms that are negligibly small at small velocities. The second term is also much smaller than the first, but if every participant in a collision keeps its identity then the first term is just a constant. So the second term is conserved all by itself, and in the limit it’s just the old Newtonian kinetic energy.

We can make the rather ugly formulas (8.1)–(8.2) look nicer by dividing them: for a single particle we find

$$\mathbf{p}/E = \mathbf{v}/c^2 . \tag{8.4}$$

This formula is valid regardless of the mass or velocity. The only way around the prohibition against light speed is for a particle to have zero mass, since then p needn’t blow up. Conversely, any massless particle must move at speed c, since otherwise it would have no energy, no momentum, no existence at all. For massless particles, (8.4) reduces to

$$E = pc \qquad \text{(massless particles)} \tag{8.5}$$
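For reference, the step from (8.2) to (8.3) is a binomial expansion in the small quantity $(v/c)^2$. The lines below write it out one order further than the text needs, as a sketch of where the dots come from; the only assumption is $v_i \ll c$ for every particle.

```latex
\[
\frac{1}{\sqrt{1-(v/c)^2}}
  = 1 + \frac{1}{2}\frac{v^2}{c^2} + \frac{3}{8}\frac{v^4}{c^4} + \cdots
\]
\[
E = \sum_i m_i c^2
  + \sum_i \tfrac{1}{2} m_i v_i^2
  + \sum_i \tfrac{3}{8}\, m_i \frac{v_i^4}{c^2} + \cdots
\]
```

Each successive term is smaller than the one before by a factor of order $(v/c)^2$, which is why only the first two survive at everyday speeds.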

14 This effect is important in engineering medical accelerators, as well as in high-power microwave oscillators for radar, etc.

15 Older books sometimes emphasize this by calling m the rest mass of the object.


Historically, Compton’s verification that interactions of electrons with light obey momentum and energy conservation with (8.5) clinched Einstein’s controversial idea that light consists of particles.16 Another useful formula comes when we eliminate v. You should show yourself that (8.1)–(8.2) imply

$$E^2 = (pc)^2 + (mc^2)^2 \qquad \text{(always)}. \tag{8.6}$$

When $m \to 0$ this again gives (8.5), whereas in the opposite limit $p \ll mc$ we get the nonrelativistic kinetic energy (8.3). (Please work this out.) Eqn. (8.6) also shows why a convenient unit for mass in high-energy and nuclear physics is MeV/$c^2$; for example, the electron mass is $m_e = 0.51\ \mathrm{MeV}/c^2$. Similarly, the convenient unit for momentum is MeV/$c$.
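A quick numerical check of (8.6) in the MeV units just mentioned may help build confidence before doing the algebra. This is a minimal sketch, not a substitute for the derivation: it treats a single particle, takes the electron value 0.51 MeV quoted above, and the function names are invented for the example.

```python
import math

ELECTRON_MC2_MEV = 0.51  # m_e c^2 in MeV, the value quoted in the text

def energy_from_momentum(pc_mev, mc2_mev):
    """Eqn. (8.6): E = sqrt((pc)^2 + (mc^2)^2), all quantities in MeV."""
    return math.hypot(pc_mev, mc2_mev)

def energy_momentum_from_beta(beta, mc2_mev):
    """Eqns. (8.1)-(8.2) for one particle: returns (E, pc) in MeV."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * mc2_mev, g * mc2_mev * beta

for beta in (0.001, 0.5, 0.99):
    E, pc = energy_momentum_from_beta(beta, ELECTRON_MC2_MEV)
    # (8.6) should reproduce E, and (8.4) says pc/E equals beta.
    print(beta, E, energy_from_momentum(pc, ELECTRON_MC2_MEV), pc / E)

# Massless limit: (8.6) reduces to E = pc, eqn. (8.5).
print(energy_from_momentum(2.0, 0.0))
```

Working in MeV, MeV/c and MeV/c² means no explicit factors of c appear in the code, which is the convenience the text points out.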

9. The Amazing Secret of Spacetime Geometry

If you’ve done the problems so far, then you know that the rather complicated and mysterious formula (5.2) does what we want it to do. You may wonder why it works, or whether it has a deeper meaning. I also claimed that the equally complicated and mysterious formulas (8.1)–(8.2) have the property that if all four are conserved in any one good frame of reference, then the same will be true in any other such frame, even though the velocities transform via the complicated and mysterious eqn. (5.3). All of these mysteries became much clearer when H. Minkowski and others invented the concept of spacetime geometry.

Let’s go back to the humble example of rotations. The distance between any two points is a geometrical property of the points, independent of any coordinate system used to describe them. I mentioned that the formula for this distance always retains the same form, $d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$, regardless of which Cartesian coordinate system we choose. You can check this using formula (1.1).

We found that Lorentz transformations, like rotations, are linear transformations. We found ourselves led to study transformations that mix space and time. So it makes sense to think of (x, y, z, t) as a single vector in a four-dimensional “spacetime.” So far that hasn’t got much content. Newton would agree that events are specified by four numbers. To keep it simple, let’s drop two space dimensions and think of x, t only, as we did earlier. You can easily check that under a Lorentz transformation the quantity $(\Delta x)^2 + (c\Delta t)^2$ changes form, but we already knew that: Lorentz transformations aren’t rotations. Instead,

Lorentz transformations are precisely those linear transformations that preserve the form of $(c\Delta t)^2 - (\Delta x)^2$. (9.1)

16 Actually (8.5) is required by classical electrodynamics. Multiplying both sides by the flux of photons, it relates the flux of energy to that of momentum in a way already derivable from Maxwell’s equations. Einstein also knew that (8.5) was needed in order for quantum theory to make sense. Dividing Einstein’s relation $E = \hbar\omega$ by de Broglie’s relation $p = \hbar k$ and using the EM wave equation $\omega = kc$ again gives us (8.5). It all fits together. It’s no accident that Einstein’s light-quantum and relativity papers both appeared the same year.


You can check this too; along the way you’ll see why the funny prefactor in (5.2) was needed. In the special case of a light ray, for which $x = \pm ct$, the assertion (9.1) just reduces to the fact we already had, that $x' = \pm ct'$.

This is the secret of spacetime: it’s a vector space with a peculiar kind of distance-like function. The Lorentz transformations are the rotation-like transformations preserving the form of the distance function.

In fact, another way to guess (5.2) is to think some more about rotations. Let’s let $y = ict$, so that we’re seeking linear transformations that preserve the form of $(\Delta x)^2 + (\Delta y)^2$. These are just rotations. The twist is that we want rotations that keep x, t real, i.e. that keep x real but y purely imaginary. These turn out to be rotations by an angle θ that is an imaginary number! Taking $\theta = i\eta$ in (1.1), and using $\cos(i\eta) = \cosh\eta$ and $\sin(i\eta) = i\sinh\eta$, gives the transformation

$$\begin{pmatrix} x' \\ ct' \end{pmatrix} = \begin{pmatrix} \cosh\eta & -\sinh\eta \\ -\sinh\eta & \cosh\eta \end{pmatrix} \begin{pmatrix} x \\ ct \end{pmatrix}.$$

Amazingly this is exactly the same as the Lorentz transformation if we choose η so that $\tanh\eta = \beta$.

The quantity $(c\Delta t)^2 - (\Delta x)^2$ is so useful that we give it a name: it’s called the invariant interval between $(x_1, t_1)$ and $(x_2, t_2)$:

$$\Delta s^2 = (c\Delta t)^2 - (\Delta \mathbf{x})^2 . \tag{9.2}$$
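Both the invariance of the interval and the earlier statement about temporal ordering (unambiguous only if $|\Delta t| > |\Delta x|/c$) can be checked with a few lines of code. The sketch below is illustrative only: it works in one space dimension, assumes the Lorentz transformation has the form $x' = \gamma(x - \beta ct)$, $ct' = \gamma(ct - \beta x)$, and uses function names and sample separations made up for the example.

```python
import math

def boost(x, ct, beta):
    """Lorentz transformation along x for a frame moving at speed beta*c (assumed form)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (x - beta * ct), g * (ct - beta * x)

def boost_rapidity(x, ct, eta):
    """The same transformation written as a hyperbolic 'rotation' with tanh(eta) = beta."""
    return (math.cosh(eta) * x - math.sinh(eta) * ct,
            math.cosh(eta) * ct - math.sinh(eta) * x)

def interval(dx, dct):
    """Interval (9.2) in one space dimension."""
    return dct * dct - dx * dx

# Separations (dx, c*dt) of an event from a reference event at the origin.
separation_a = (1.0, 2.0)   # |c dt| > |dx|
separation_b = (2.0, 1.0)   # |c dt| < |dx|

for beta in (-0.9, -0.6, 0.0, 0.6, 0.9):
    for dx, dct in (separation_a, separation_b):
        dx_p, dct_p = boost(dx, dct, beta)
        # Columns: beta, original separation, interval before, interval after,
        # and whether the event is still later than the reference event.
        print(beta, (dx, dct), interval(dx, dct), interval(dx_p, dct_p), dct_p > 0)
```

The interval comes out the same in every frame for both separations. For the first separation the time ordering never changes, while for the second a sufficiently fast rightward-moving observer sees the order reversed.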

Notice that $\Delta s^2$ is sometimes a negative number. Also notice that in (9.2) we promoted x to a vector; it turns out that (9.1) is still true after this generalization.

Returning to an earlier section, we see that the temporal ordering of two events P and Q is unambiguous if $\Delta s_{PQ}^2 \ge 0$. We say that the separation of the events is timelike if $\Delta s_{PQ}^2 > 0$, or lightlike if it’s exactly zero. Otherwise, we call the separation spacelike, and the ordering is ambiguous (dependent on which Lorentz frame we choose). When two events are spacelike separated, then it makes no sense to assert that one “caused” the other one, because neither can be said to precede the other! So we’d better disallow any physical process that could communicate between spacelike-separated events, or in other words any signal moving faster than light.17

What about momentum? The reason mv didn’t work was because the velocity has that messy transformation law (5.3). The reason for that is that v is the quotient of two quantities, ∆x and ∆t, each of which transforms under (5.2). Instead let’s try defining

$$\mathbf{p} = mc\,\frac{d\mathbf{x}}{ds}, \qquad E/c = mc\,\frac{d(ct)}{ds}. \tag{9.3}$$

Here we think of our particle’s trajectory as a curve in spacetime. Instead of thinking as usual of the position as a function of time x(t), let’s rather think of both position and time as changing from one point to the next; the derivatives above tell how they change when we move an interval ds along the curve.18 In fact you can show that the beautiful formulas (9.3) imply the ugly (8.1)–(8.2). Other facts, for instance eqn. (8.4), also follow at once from (9.3). Try this [11]: Rederive the key formulas (8.4), (8.6) starting from (9.3) and (9.2) (not starting from (8.1), (8.2)).

17 We already noted that such a prohibition is consistent with the relativistic velocity addition formula, Eqn. (5.3). Now we see that the speed limit is also necessary to avoid a physically nonsensical confusion about causality.

18 Thus s is a bit like the arc length of a curve in ordinary space.


So what? Because ds is invariant under Lorentz transformations, eqns. (9.3) tell us at once that the four quantities $(p_x, p_y, p_z, E/c)$ undergo the same linear transformation as $(x, y, z, ct)$. Thus momentum and energy combine to a single vector in 4-space just like position and time. Furthermore, if several vectors add up to zero, then after transforming them all with some linear operation they still add up to zero!! What this means in our case is just that if energy and momentum are conserved in one good frame they’ll be conserved in any other good frame too. That’s what we wanted to show.

Try this [12]: Now that we know how the values of momentum and energy transform when viewed in a different good frame, consider a massless particle of momentum p and energy E = pc. Apply a Lorentz transformation to find $p'$ and $E'$ in a frame moving at speed βc relative to the first one. Does your answer look familiar? That is, does your answer resemble some other formula in this handout? Make an Insightful Comment.
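The claim that $(p, E/c)$ transforms like $(x, ct)$ can be checked for a massive particle in one space dimension by computing the boosted momentum and energy in two ways. This is a sketch under stated assumptions, not part of the original notes: units with c = 1 by default, the Lorentz transformation taken as $x' = \gamma(x - \beta ct)$, $ct' = \gamma(ct - \beta x)$, the velocity rule $\beta' = (\beta - \beta_f)/(1 - \beta\beta_f)$ standing in for (5.3), and function names invented here.

```python
import math

def gamma(beta):
    return 1.0 / math.sqrt(1.0 - beta * beta)

def four_momentum(m, beta, c=1.0):
    """(p, E/c) for a particle of mass m and velocity beta*c, from (8.1)-(8.2)."""
    g = gamma(beta)
    return g * m * beta * c, g * m * c

def boost_pair(a, b, frame_beta):
    """Apply the (x, ct) Lorentz transformation (assumed form) to any pair (a, b)."""
    g = gamma(frame_beta)
    return g * (a - frame_beta * b), g * (b - frame_beta * a)

def boost_velocity(beta, frame_beta):
    """Assumed one-dimensional velocity-addition rule, standing in for (5.3)."""
    return (beta - frame_beta) / (1.0 - beta * frame_beta)

m, beta, frame_beta = 2.0, 0.3, -0.7

# Route 1: transform (p, E/c) exactly as one would transform (x, ct).
p1, e1 = boost_pair(*four_momentum(m, beta), frame_beta)

# Route 2: transform the velocity first, then recompute p and E/c from it.
p2, e2 = four_momentum(m, boost_velocity(beta, frame_beta))

print((p1, e1), (p2, e2))
```

The two routes agree to rounding error, and the combination $(E/c)^2 - p^2$ equals $(mc)^2$ in both frames, consistent with (8.6).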

10. The World

Einstein went farther. What if the participants in a collision don’t retain their identity? What if collisions among nuclei, for example, weren’t just rearrangements of indestructible little marbles? What if the masses $m_i$ coming into a collision don’t add up to the same value as the sum of the masses $m_i'$ of the particles coming out? Einstein was out on a limb here; before the advent of mass spectroscopy there was no evidence that this ever did in fact happen. But the answer was clear to him: by conservation of E, any mass defect ∆m between incoming and outgoing states must appear as a change in kinetic energy of $(\Delta m)c^2$. And a very big change, too: $c^2 \sim 10^{17}\ \mathrm{m^2/sec^2}$, or $10^{17}$ J/kg. Einstein immediately grasped that even a fraction of a percent change in mass could account for the enormous energies that seemed to come from nowhere in radioactive decay.19

This is definitely a practical result. Everybody immediately realized that if you could slowly release the energy equivalent of a gram of matter, you’d get $10^{14}$ J of energy, plenty to run a big city for a long time. Everybody also realized that if you could do the same conversion in a few milliseconds, you could instantly burn that same city to the ground. Nobody knew at the time how either of these could be done in practice. But within a few decades the outlines began to form. All three belligerents in the last World War embarked on urgent crash programs to develop such weapons, with the explicit aim of using them on each other. An entire world vanished forever on 16 July, 1945.

19 In 1905: “Bodies whose energy content is variable to a high degree, for example radium salts,” may perhaps be used to test this prediction. In 1907: “It is possible that radioactive processes may become known in which a considerably larger percentage of the mass of the initial atom is converted into radiations... than is the case for radium.” By the way, the discovery of radioactive energy release also solved the Main Problem of geology at that time, which was why the Earth hadn’t cooled down long ago.
