<<

J. Broida UCSD Fall 2009 Phys 130B QM II Supplementary Notes on and Maxwell’s Equations

1 The

This is a derivation of the Lorentz transformation of Special Relativity. The basic idea is to derive a relationship between the coordinates x,y,z,t as seen by observer and the coordinates x′,y′,z′,t′ seen by observer ′ moving at a V withO respect to along the positive x axis. O O y′ y V

′ ′ x O x O These observers are assumed to be inertial. In other words, they are moving at a constant velocity with respect to each other and in the absence of any external forces or (which is somewhat redundant). In particular, there is no rotational or gravitational field present. Our derivation is based on two assumptions: 1. The : is the same for all observers in all inertial coordinate systems. 2. The speed of c in a vacuum is the same for all observers independently of their relative motion or the motion of the light source. We first show that this transformation must be of the form t′ = at + bx (1.1a) x′ = dt + ex (1.1b) y′ = y z′ = z where we assume that the origins coincide at t = t′ = 0. The above figure shows the coordinate systems displaced simply for ease of visualization.

1 The first thing to note is that the y and z coordinates are the same for both observers. (This is only true in this case because the relative motion is along the x axis only. If the motion were in an arbitrary direction, then each spatial coordinate of ′ would depend on all of the spatial coordinates of . However, this is the caseO that is used in almost all situations, at least at anO elementary level.) To see that this is necessary, suppose there is a yardstick at the origin of each aligned along each of the y- and y′-axes, and suppose there is a paintbrush at the end of each yardstick pointed towards the other. If ′’s yardstick along the y′-axis gets shorter as seen by , then when the originsO pass each other ’s yardstick will get paint on it. ButO by the Principle of Relativity, ′ shouldO also see ’s yardstick get shorter and hence ′ would get paint on hisO yardstick. SinceO this clearly can’t happen, there canO be no change in a direction perpendicular to the direction of motion. The next thing to notice is that the transformation equations are linear. This is a result of space being homogeneous. To put this very loosely, “things here are the same as things there.” For example, if there is a yardstick lying along the x axis between x = 1 and x = 2, then the length of this yardstick as seen by ′ should be the same as another yardstick lying between x = 2 and x = 3.O But if there were a nonlinear dependence, say ∆x′ goes like ∆x2, then the first yardstick would have a length that goes like 22 12 = 3 while the second would have a length that goes like 32 22 = 5. Since this− is also not the way the world works, equations (1.1) must be− linear as shown. We now want to figure out what the coefficients a,b,d and e must be. First, let look at the origin of ′ (i.e., x′ = 0). Since ′ is moving at a speed V alongO the x-axis, the x coordinateO corresponding toOx′ =0 is x = Vt. Using this in (1.1b) yields x′ =0= dt + eV t or d = eV . Similarly, ′ looks at (i.e., x = 0) and it has the coordinate x′ = Vt′ −with respect to O′ (since movesO in the negative x′ direction as seen by −′). Then from (1.1b)O we have OVt′ = dt and from (1.1a) we have t′ = at, andO hence t′ = dt/V = at so that −aV = d = eV and thus also a = e and d = aV . Using− these results in −equations (1.1)− now gives us −

b t′ = a t + x (1.2a) a   x′ = a(x Vt) (1.2b) − y′ = y z′ = z.

Now let a move along the x-axis (and hence also along the x′-axis) and pass both origins when they coincide at t = t′ = 0. Then the x coordinate of the photon as seen by is x = ct, and the x′ coordinate as seen by ′ is x′ = ct′. Note that theO value of c is the same for both observers. ThisO is assumption (2). Using these in equations (1.2) yields

b bc t′ = a t + ct = at 1+ a a    

2 V x′ = a(ct Vt)= cat 1 − − c   so that V bc cat 1 = x′ = ct′ = cat 1+ − c a     and therefore V/c = bc/a or − b V = . a −c2 So now equations (1.2) become

V t′ = a t x (1.3a) − c2   x′ = a(x Vt) (1.3b) − y′ = y z′ = z.

We still need to determine a. To do this, we will again use the Principle of Relativity. Let look at a clock situated at ′. Then ∆x = V ∆t and from (1.3a), and O′ will measure intervals relatedO by O O V V 2 ∆t′ = a ∆t ∆x = a 1 ∆t. − c2 − c2     Now let ′ look at a clock at (so ∆x = 0). Then ∆x′ = V ∆t′ so (1.3b) yields O O − V ∆t′ = ∆x′ = a(0 V ∆t)= aV ∆t − − − and hence 1 ∆t = ∆t′. a By the Principle of Relativity, the relative factors in the time measurements must be the same in both cases. In other words, sees ′’s time related to his by the factor a(1 V 2/c2), and ′ sees ’s timeO relatedO to his by the factor 1/a. This means that− O O 1 V 2 = a 1 a − c2   or 1 a = V 2 1 2 − c q

3 and therefore the final Lorentz transformation equations are

V t 2 x t′ = − c V 2 1 2 − c q x Vt x′ = − (1.4) V 2 1 2 − c q y′ = y z′ = z. It is very common to define the dimensionless variables V 1 1 β = and γ = = . (1.5) c V 2 1 β2 1 2 − c − q p In terms of these variables, equations (1.4) become β t′ = γ t x − c   x′ = γ(x βct) (1.6) − y′ = y z′ = z. Since c is a universal constant, it is essentially a conversion factor between units of time and units of length. Because of this, we may further change to units where c = 1 (so time is measured in units of length) and in this case the Lorentz transformation equations become t′ = γ(t βx) − x′ = γ(x βt) − (1.7a) y′ = y z′ = z. These equations give the coordinates as seen by ′ in terms of those of . Ifwe want the coordinates as seen by in terms of thoseO of ′, then we let Oβ β and we have O O →− t = γ(t′ + βx′) x = γ(x′ + βt′) (1.7b) y = y′ z = z′. Note that 0 β 1 so that 1 γ < . We also see that ≤ ≤ ≤ ∞ 1 γ2 = 1 β2 − 4 so that γ2 γ2β2 =1. − Then recalling the hyperbolic trigonometric identities

cosh2 θ sinh2 θ =1 − and 1 tanh2 θ = sech2 θ − we may define a parameter θ (sometimes called the ) by

β = tanh θ so that γ = cosh θ and γβ = sinh θ. In terms of θ, equations (1.7a) become

t′ = (cosh θ)t (sinh θ)x − x′ = (sinh θ)t + (cosh θ)x − (1.8) y′ = y z′ = z which looks very similar to a in the xt-plane, except that now we have instead of the usual trigonometric ones. However, note that both sinh terms have the same sign. Next, let us consider motion as seen by both observers. In this case we write displacements in both space and time as

dt′ = γ(dt βdx) − dx′ = γ(dx βdt) − dy′ = dy dz′ = dz.

Then the velocity v′ of a particle along the x′-axis as seen by ′ is x O dx′ dx βdt dx/dt β v β v′ = = − = − = x − . (1.9a) x dt′ dt βdx 1 β(dx/dt) 1 βv − − − x Alternatively, we may write

′ ′ ′ ′ ′ dx dx + βdt dx /dt + β vx + β vx = = ′ ′ = ′ ′ = ′ . (1.9b) dt dt + βdx 1+ β(dx /dt ) 1+ βvx These last two equations are called the relativistic velocity addition law. It ′ ′ should be obvious that for motion along the x-axis we have vy = vy and vz = vz.

5 Be sure to understand what these equations mean. They relate the velocity of an object as seen by two different observers whose along their ′ ′ common x-axis is β. Note that even if vx = 1 (corresponding to vx = c), the velocity as seen by is still just vx = 1. This is quite different from the classical Galilean addition ofO , because nothing can go faster than light (c = 1). One of the most important aspects of Lorentz transformations is that they leave the quantity t2 x2 y2 z2 . In other words, using equations (1.7a) you can easily− show− that− t′2 x′2 y′2 z′2 = t2 x2 y2 z2. (1.10) − − − − − − Note that setting this equal to zero, we get the equation of an outgoing sphere of light as seen by either observer. (Don’t forget that if c = 1, then t becomes ct.) We refer to this as the invariance of the interval because6 it can be written as (∆t′)2 (∆x′)2 (∆y′)2 (∆z′)2 = (∆t)2 (∆x)2 (∆y)2 (∆z)2. − − − − − − If the primed frame is the rest frame of a particle, then we have dx′ = dy′ = dz′ = 0 and dt′ measures the time interval as seen by the particle, called the . Because of this, we sometimes write dτ 2 := dt2 dx2 dy2 dz2. − − − This is frequently also called the proper distance (or ) and written ds2. The difference between dτ and ds is if c = 1, we have 6 ds2 = c2dt2 dx2 = c2dt2(1 v2/c2)= c2dt2/γ2 := c2dτ 2 − − so that dτ 2 = ds2/c2. Since we are working with c = 1, we will write proper distance as ds2 = dt2 dx2 dy2 dz2. − − − Now let’s go to some modern notation. In units with c = 1, we first define our four spacetime components as the vector x0 t x1 x xµ = = .  x2   y   x3   z          (If c = 1 then x0 = ct.) This vector is an element of a 4-dimensional vector space6 called . Then we have ds2 = (dx0)2 (dx1)2 (dx2)2 (dx3)2 − − − or, defining the Lorentz (or Minkowski) metric 1 1 g = (1.11) µν  − 1  −  1   −    6 we write (using the summation convention)

2 µ ν ds = gµν dx dx . (1.12)

Let me note that most particle use this metric, which we can also write as simply gµν = diag(1, 1, 1, 1), but most relativists us the metric g = diag( 1, 1, 1, 1), and you− need− − to be careful when reading equations. µν − Many physicists also use the symbol ηµν rather than gµν when dealing with the Lorentz metric (as opposed to more general metrics used in ). µ µ Vectors x in Minkowski space are classified as timelike if xµx > 0, space- µ µ like if xµx < 0 and null (or lightlike) if xµx = 0. Light rays are null, and hence we see that there are nonzero vectors with zero norm. Because of this, the Minkowski metric is not positive definite, and we say that Minkowski space is semi-Riemannian. From we know that the metric defines an inner product, and ν we can use this to raise or lower indices, for example, xµ = gµν x . In the case µν of the Lorentz metric, we have the inverse metric with components g = gµν . 0 i Furthermore, there is no difference between x and x0, but xi = x for i = 1,..., 3. (Again, be careful because this is the opposite of what you− get if you use the other metric.) Using this notation, equation (1.10) is written

′µ ′ν µ ν gµν x x = gµν x x where the metric is the same in both frames. Lowering indices, we write this in its most compact form as ′ ′µ µ xµx = xµx 2 µ and we say that the length x = xµx is an invariant. It is also important to understand the the product of two vectors aµ and bµ is written in the equivalent forms

3 a b = g aµbν = a bν = a b0 + a bi = a b0 aibi = a b0 a b. · µν ν 0 i 0 − 0 − · i=1 X Note that the summation convention means that repeated Greek indices are to be summed from 0 to 3, and repeated Latin indices are to be summed from 1 to 3. We now write our Lorentz transformation equations as

′µ µ ν x =Λ ν x (1.13) where we have defined the Lorentz transformation γ βγ 0 0 βγ− γ 0 0 Λµ = . (1.14) ν  − 0 010   0 001      7 (You should be aware that some authors put the prime on the indices and ′ ′ µ µ ν 2 write this in the form x = Λ ν x .) Using this, we write the invariant x as ′ ′µ µ α β α xµx =ΛµαΛ βx x . But this must equal x xα, and hence we have

µ T µ T ν µ ΛµαΛ β = (Λ )αµΛ β = (Λ )α gνµΛ β = gαβ (1.15) which can be written in matrix form as

ΛT gΛ= g.

In fact, this can be taken as the definition of a Lorentz transformation Λ. Since α α T α µ α gβ = δβ , we can write equation (1.15) as (Λ ) µΛ β = δβ which shows that Λ is an orthogonal transformation, i.e., ΛT = Λ−1. This is actually just what equations (1.8) say—a Lorentz transformation is a rotation in Minkowski space. −1 µ T µ µ Since (Λ ) ν = (Λ ) ν =Λν , we see from equation (1.13) that

−1 α ′µ −1 α µ ν α (Λ ) µx = (Λ ) µΛ ν x = x or α α ′µ x =Λµ x . (1.16) Equations (1.13) and (1.16) then give us the very useful results

∂x′µ ∂xν Λµ = and Λ ν = (Λ−1)ν = . (1.17) ν ∂xν µ µ ∂x′µ In order to define velocity in an invariant manner, we define the 4-velocity in terms of the proper time by dxµ uµ := . (1.18) dτ Note we can write dτ 2 = dt2 dx2 = dt2(1 v2). Here v is the velocity of a particle as seen by . If we let− ′ be the particle− rest frame, then v is just β and we have dτ 2 = Odt2(1 v2)=Odt2/γ2 so that − dt = γ. dτ Then dxµ dt dxµ dxµ uµ = = = γ dτ dτ dt dt and we can write the 4-velocity as the vector

γ uµ = (1.19) " γv # which has the magnitude

u uµ = γ2 γ2v2 = γ2(1 v2)=1. µ − −

8 0 µ µ 2 (Again, with c = 1 we have x = ct so that u = γ[c, v] and uµu = c .) µ 6 Since Λ ν is a constant matrix, we have dx′µ dxν u′µ = =Λµ =Λµ uν dτ ν dτ ν so that uµ transforms in exactly the same manner as xµ. We call any vector that transforms in this way a 4-vector, which justifies the term ‘4-velocity’ used above. Similarly, we define the 4- by

1 pµ := muµ = mγ (1.20) " v # so that 2 µ 2 p = p pµ = m . (If c = 1 then p2 = m2c2. Let me also emphasize that the mass m in all of our equations6 is the constant rest mass. We never talk about a “relativistic mass” γm that many older books use where they write our mass as m0 and then define m = γm0.) Expanding the square root we have m 1 p0 = mγ = = m 1+ v2 + √1 v2 2 · · · −   which is the sum of a rest term m (= mc) and a kinetic energy mv2/2 (= (1/2)mv2/c) plus higher order terms in v (= v/c). Because of this, we see that p0 is the total energy p0 = mγ = E (= E/c) of the particle, so using mγv = p as the classical momentum, we have

E pµ = . (1.21) " p # Therefore m2 = p2 = E2 p2 or − E2 = p2 + m2. (1.22) (If c = 1, then pµ = mγ[c, v] = [E/c, p] and this becomes E2 = p2c2 + m2c4.) Now,6 the gradient operator is defined as ∇ = ∂/∂x so that

i i ∇ = ∂/∂x := ∂i.

µ Let us define ∂µ = ∂/∂x . Then

∂ ∂0 0 µ ∂µ = and ∂ = . (1.23) " ∇ # " ∇ # − Using equation (1.17) we have ∂ ∂xν ∂ ∂′ = = =Λ ν ∂ µ ∂x′µ ∂x′µ ∂xν µ ν

9 or, equivalently, ′µ µ ν ∂ =Λ ν ∂ which shows that indeed ∂µ transforms as a 4-vector (which is implied by the notation). The operator

∂2 ∂ ∂µ = (∂ )2 + ∂ ∂i = (∂ )2 ∇2 = ∇2 µ 0 i t − ∂t2 − is called the d’Alembertian, and is frequently written as . In quantum we have the momentum operator defined by p = i~∇ and the energy operator defined by E = i~(∂/∂t). Then pi = i~∇i = −i~∂ =+i~∂i and we can define the relativistic momentum operator− − i pµ = i~∂µ. ~ 2 2 2 2 ∇2 Using units with = 1, the expression E p m = 0 becomes ∂t + m2 =0 or − − − − (∂2 ∇2 + m2)φ(x) = ( + m2)φ(x)=0 t − which is known as the Klein-Gordon equation. Even though the two reference frames relative to which we are describing motion must be inertial, there is no reason we can’t describe the motion of an accelerated object. As you might guess, we define the 4- of an object by duµ aµ = . dτ µ Since u uµ = 1, it follows that the 4-acceleration is always orthogonal to the 4-velocity because d duµ 0= uµu =2u =2u aµ. dτ µ µ dτ µ We also define the 4-force dpµ f µ = = maµ dτ so that dpµ dpµ d(γm)/dt f 0 f µ = = γ = γ = (1.24) dτ dt " dp/dt # " γfc # where fc = dp/dt is the classical force on the particle. Since the 4-force obviously µ obeys f uµ = 0, we have 0= f µu = γf 0 γ2f v µ − c · and therefore f 0 = γf v (1.25) c · which, to within the factor of γ, is just the classical power (i.e., the rate at µ 0 2 which work is done). (And if c = 1 we have 0 = f uµ = f γc γ fc v so that f 0 = (γ/c)f v.) 6 − · c ·

10 2 Maxwell’s Equations

Experimentally, it is found that the charge to mass ratio e/mγ of a particle moving at velocity β obeys the law e e = (1 β2)1/2 . mγ m −

(The two sides of this equation refer to different measurements, so it’s not as triv- ial of a statement as it looks at first.) Therefore, e is a constant, and we have that charge is an invariant quantity. What we would now like to know is how and behave. Since charge density is charge/volume, we must find out how volumes transform. Let frame 2 (the primed frame) be in motion with respect to frame 1 (the unprimed frame) along their mutual x-axis, and consider a small cube of side l0 at rest in frame 2. In its rest frame, this cube has volume dτ0 (not to be confused with proper time). From (1.7a) we have 1 1 dτ = dxdydz = dx′dy′dz′ = dτ (2.1) 1 γ γ 0 where we used dx′ = γ(dx βdt) together with dt = 0 for a measurement made in frame 1. Thus we have −

dτ = dτ (1 β2)1/2 . 1 0 − Now suppose the volume is also moving with respect to frame 2, and let this motion be along the x2 axis. Letting v2x be the velocity of the box with respect to frame 2, and similarly for v1x, we have

dτ = dτ (1 v2 )1/2 and dτ = dτ (1 v2 )1/2 . 2 0 − 2x 1 0 − 1x But from (1.9b) we have v2x + β v1x = 1+ βv2x and therefore

v + β 2 1/2 dτ = dτ 1 2x 1 0 − 1+ βv   2x   1+2βv + β2v2 v2 2βv β2 1/2 = dτ 2x 2x − 2x − 2x − 0 (1 + βv )2  2x  (1 β2)(1 v2 ) 1/2 [(1 β2)(1 v2 )]1/2 = dτ − − 2x = dτ − − 2x 0 (1 + βv )2 0 1+ βv  2x  2x (1 β2)1/2 = − dτ2 1+ βv2x

11 2 1/2 where in going to the last line we used dτ2 = dτ0(1 v2x) . Rearranging, this is − dτ2 dτ1 = . (2.2a) γ(1 + βv2x) Reversing the frame point of view, we clearly also have dτ dτ = 1 . (2.2b) 2 γ(1 βv ) − 1x Be sure to understand what these equations say. The velocities v1x and v2x are the observed velocities of a box with rest volume dτ0 moving along the common x-axis as seen in frames 1 and 2, which are moving with velocity β with respect to each other. Now suppose that we have dn charges of Q coulombs each. As we stated above, Q is an invariant. Obviously, dn is also an invariant since it is just the number of charges. Then the charge densities as observed in frames 1 and 2 are Q dn Q dn ρ1 = and ρ1 = dτ1 dτ1 so that ρ1 dτ1 = ρ2 dτ2 where dτ1 and dτ2 are the same volume containing the charge Q dn as observed in the two different reference frames. Then, using equations 2.2, we have

dτ2 ρ1 = ρ2 = ρ2γ(1 + βv2x) dτ1 (2.3) dτ1 ρ2 = ρ1 = ρ1γ(1 βv1x) . dτ2 − The J := ρv is defined as the charge per area-time, so we can write J1x = ρ1v1x and J2x = ρ2v2x. Using these definitions in equations (2.3) yields the transformation of charge density

ρ1 = γ(ρ2 + βJ2x) (2.4) ρ = γ(ρ βJ ) . 2 1 − 1x Now using equations (1.9) and (2.3) we also have

v2x + β J1x = ρ1v1x = ρ2γ(1 + βv2x) 1+ βv2x or

J1x = γ(J2x + βρ2) (2.5a) J = γ(J βρ ) . 2x 1x − 1 Similarly, J1y = ρ1v1y = ρ2γ(1 + βv2x)v1y .

12 But dy1 dy2 v2y v1y = = = . dt1 γ(dt2 + βdx2) γ(1 + βv2x) Hence J1y = ρ2v2y = J2y and similarly J1z = J2z . (2.5b) Comparing equations (2.4) and (2.5) with equations (1.7), we see that we have a 4-current density ρ J µ = . (2.6) " J # (And again, if c = 1 this becomes J µ = [ρc, J].) Now that we6 have shown that J µ does indeed define a 4-vector, let me show another way to arrive at this conclusion that is analogous to the definition of 4-momentum. From (2.1) we can write dτ = dτ0/γ where dτ0 is at rest in frame 2, and γ = (1 β2)−1/2 where β is the velocity of frame 2 with respect to frame − 1. Then the invariance of charge implies ρdτ = ρ0dτ0 or ρdτ0/γ = ρ0dτ0, and hence ρ = γρ0 .

This is analogous to the expression mrelativistic = γmrest or simply m = γm0. Now we also have J = ρv = ρ0γv. Recalling that the 4-velocity is given by

γ uµ = " γv # we see that letting v be the velocity of the charge, i.e., v = β, then (2.6) is the same as µ µ J = ρ0u which is analogous to pµ = muµ. In other words, we have shown

ρ µ γ µ J = = ρ0 = ρ0u . (2.7) " J # " γv #

Since we saw earlier that the derivative operator ∂µ transforms as a 4-vector (technically, a co-vector), and we just showed that J µ is a 4-vector, it follows µ that ∂µJ is a . But

∂ρ ∂ J µ = ∂ ρ + ∂ J i = + ∇ J µ 0 i ∂t · and therefore the continuity equation may be written in the covariant form

µ ∂µJ =0 . (2.8)

13 Now recall Maxwell’s equations:

∇ E =4πρ (2.9a) · ∇ B = 0 (2.9b) · ∂B ∇ E = (2.9c) × − ∂t ∂E ∇ B =4πJ + (2.9d) × ∂t Using B = ∇ A, equation (2.9c) implies × ∂A E = ∇φ − − ∂t so that equations (2.9a) and (2.9d) then imply (using the identity ∇ ∇ A = ∇(∇ A) ∇2A) × × · − ∂ ∇2φ + (∇ A)= 4πρ (2.10a) ∂t · − ∂2A ∂φ ∇2A ∇ ∇ A + = 4πJ . (2.10b) − ∂t2 − · ∂t −   The gauge transformation A A′ = A + ∇Λ leaves the physical field B = ∇ A unchanged, so if E = ∇→φ ∂A/∂t is also to remain unchanged, we must× have φ φ′ = φ ∂Λ/∂t−. This− gives us the freedom to choose (φ, A) such as to satisfy→ the Lorentz− gauge (or Lorentz condition) ∂φ ∇ A + =0 . · ∂t In other words, we demand that the new potentials (φ′, A′) satisfy

∂φ′ 0= ∇ A′ + · ∂t ∂φ ∂2Λ = ∇ A + ∇2Λ+ . · ∂t − ∂t2 Thus, if we can find a Λ that satisfies

∂2Λ ∂φ ∇2Λ = ∇ A + − ∂t2 − · ∂t   the gauge transformed fields will satisfy the Lorentz condition. Fortunately, this is a straightforward problem to solve. All we need to do is find the Green’s function for the wave equation, and then Λ will be the integral of the Green’s function the quantity on the right. (Very briefly, if you have a linear operator L(x) acting on a function f(x) such that L(x)f(x) = g(x), and if you find a Green’s function G(x, x′) defined by L(x)G(x, x′) = δ(x x′), then the −

14 solution to the problem is essentially f(x)= G(x, x′)g(x′) dx′. You can easily verify that acting on this with L(x) will yield L(x)f(x)= g(x).) In any case, choosing the Lorentz gauge,R equations (2.10) become the well- known wave equations

∂2φ ∇2φ = 4πρ (2.11a) − ∂t2 − ∂2A ∇2A = 4πJ . (2.11b) − ∂t2 − Note that an equivalent way of writing these is

φ = 4πJ 0 − A = 4πJ . − So if we define the 4-potential Aµ by

φ Aµ = " A # then the wave equations may be written in the concise form

Aµ = 4πJ µ (2.12) − where the Lorentz condition becomes

µ ∂µA =0 .

That Aµ is indeed a 4-vector follows because  is a Lorentz invariant quantity and J µ is a 4-vector. Thus Aµ must transform as a 4-vector so that (2.12) is covariant (i.e., so that both sides transform the same way). Also, Aµ is unchanged even if c = 1. This is because the right side of (2.11b) becomes ( 4π/c)J and J 0 = ρc6 . − Now we need a bit of terminology. Recall that a 4-vector was defined as a quantity vµ that under a Lorentz transformation transformed as

vµ v′µ =Λµ vν . → ν As we shall see below, it is also possible to have quantities with more than one index such that under a Lorentz transformation, each index transforms like a 4-vector. For example, a quantity F µν (not necessarily the electromagnetic field ) with two indices that transforms like

F µν F ′µν =Λµ Λν F αβ → α β is called a (second rank) tensor. Higher rank are defined in the obvious manner. Note also that all of the indices need not be superscripts (such indices

15 are called contravariant). We can equally have subscripts (called covariant) that transform like ′ α β Fµν =Λµ Λν Fαβ . µ And we can have a mixed tensor like F ν . Indices are raised and lowered by µν using the metric gµν and its inverse g . At last we are ready to write Maxwell’s equations in covariant form. It is not hard to show that even though ∂µ transforms as a 4-vector under a Lorentz µ transformation Λ ν , as does Aµ, the quantity ∂µAν does not transform as a second-rank tensor. However, the antisymmetric quantity Fµν defined by

F := ∂ A ∂ A (2.13a) µν µ ν − ν µ does indeed transform as a tensor. This is called the electromagnetic field tensor. Equivalently, we may consider the contravariant version

F µν = ∂µAν ∂ν Aν . (2.13b) − I claim that equations (2.9a) and (2.9d) can be written in the form

µν ν ∂µF = J . (2.14)

To see this, first recall that we are using the metric g = diag(1, 1, 1, 1) 0 0 i i − i − − so that ∂/∂t = ∂/∂x = ∂0 = ∂ and ∇ := ∂/∂x = ∂i = ∂ . Using E = ∇ϕ ∂A/∂t we have − − − Ei = ∂iA0 ∂0Ai = F i0 − and also B = ∇ A so that × B1 = ∇2A3 ∇3A2 = ∂2A3 + ∂3A2 = F 23 − − − plus cyclic permutations 1 2 3 1. Then the electromagnetic field tensor is given by → → → 0 E1 E2 E3 E1 − 0 −B3 −B2 F µν = . (2.15)  E2 B3 − 0 B1  −  E3 B2 B1 0   −  (Be sure to note that this is the form of F µν for the metric diag(1, 1, 1, 1). If you use the metric diag( 1, 1, 1, 1) then all entries of F µν change− − sign.− In − µ addition, you frequently see the matrix F ν which also has different signs.) Now, for the ν = 0 component of (2.14) we have

0 µ0 i0 i J = ∂µF = ∂iF = ∂iE which is Coulomb’s law ∇ E = ρ . ·

16 Next consider the ν = 1 component of (2.14). This is

1 µ1 01 21 31 J = ∂µF = ∂0F + ∂2F + ∂3F = ∂ E1 + ∂ B3 ∂ B2 − 0 2 − 3 = ∂ E1 + (∇ B)1 − t × and therefore we have ∂E ∇ B = J . × − ∂t Finally, I leave it as an exercise for you to show that equations (2.9b) and (2.9c) can be written as (note the superscripts are cyclic permutations)

∂µF νσ + ∂ν F σµ + ∂σF µν =0 or simply ∂[µF νσ] =0 . (2.16)

Remark: There is another interesting way to arrive at the electromagnetic field tensor that we now describe, but you are free to skip over it. First we need to give a more careful definition of a tensor. To begin, given a V , we can define the dual space V ∗ as the vector space of linear functionals on V . In other words, α V ∗ means that α : V R is a from V to R. (We restrict consideration∈ to real vector spaces.)→ Members of the dual space are frequently called covectors. If V has a e1,...,en , we define the n linear functionals ω1,...,ωn by { } { } i i ω (ej )= δj .

I will show that these n linear functionals form a basis for V ∗, i.e., that they are linearly independent and span V ∗. To show this, let α V ∗ be arbitrary but fixed. Note that for any v V , using the linearity of α ∈we have ∈

i i i α(v)= α(v ei)= v α(ei)= aiv where we have defined the scalars ai by ai = α(ei). On the other hand,

i i j j i j i i ω (v)= ω (v ej)= v ω (ej )= v δj = v

i i i and hence we see that α(v)= aiv = aiω (v) so that α = aiω . This shows that the ωi span V ∗. i To show they are linearly independent, suppose we have ciω = 0 for some i i of scalars ci. Then for any j = 1,...,n we have 0 = ciω (ej ) = ciδj = cj which proves linear independence. Thus we have shown that any α V ∗ can be written in the form ∈ i i α = aiω = α(ei)ω .

17 As an example, consider the space V = R2 consisting of all column vectors of the form v1 v = . v2   Relative to the standard basis we have 1 0 v = v1 + v2 = v1e + v2e . 0 1 1 2     ∗ i If φ V , then φ(v) = φiv , and we may represent φ by the row vector ∈ i φ = (φ1, φ2). In particular, if we write the dual basis as ω = (ai,bi), then we have P 1 1= ω1(e ) = (a ,b ) = a 1 1 1 0 1   0 0= ω1(e ) = (a ,b ) = b 2 1 1 1 1   1 0= ω2(e ) = (a ,b ) = a 1 2 2 0 2   0 1= ω2(e ) = (a ,b ) = b 2 2 2 1 2   so that ω1 = (1, 0) and ω2 = (0, 1). Note, for example,

v1 ω1(v)=(1, 0) = v1 v2   as it should. As another very important example, let V be an inner product space. If a,b V , then the inner product of a and b is the number a,b . Then given a fixed∈ vector a, the quantity a, is a linear functional on Vh becausei it takes a vector b V and gives backh a number:·i ∈ a, : b a,b R . h ·i → h i ∈ Since a, is in V ∗, let us denote it by α, so that α(b)= a,b . Givenh ·i a basis e for V , let us define the numbers g h by i { i} ij g := e ,e = e ,e = g . ij h i ji h j ii ji This is the proper definition of the metric. Then

a,b = aie ,bje = aibj e ,e = aibj g = bjg ai . h i h i j i h i j i ij ji But we also have j j j α(b)= α(b ej)= b α(ej )= b aj

18 and therefore bj a = α(b)= a,b = bj g ai . j h i ji Hence we define i aj = gjia . This is called lowering an index. Since the inner product is nondegenerate by definition, the metric gij must be nonsingular, and hence we can define its inverse which we denote by gij . Multiplying this last equation by gkj we then have

kj kj i k i k g aj = g gjia = δi a = a and thus we define raising an index by

k kj a = g aj . Now that we have an understanding of dual spaces, we are in a to define tensors carefully. So, a tensor T is just a multilinear map

T : V ∗s V r = V ∗ V ∗ V V R . × ×···× × ×···× → By multilinear, we mean that it is linear in each variable separately. This tensor is said to have covariant order r, and contravariant order s, or simply to be a tensor of type (s, r). In other words, T takes as its argument s covectors and r vectors. Since it is multilinear, we see that

(1) (s) T (α ,...,α , v(1), . . . ,v(r))

(1) i1 (s) is j1 ir = T (ai1 ω ,...,ais ω , v(1)ej1 ,...,v(r)eir )

(1) (s) j1 jr i1 is = a a v v T (ω ,...,ω ,e 1 ,...,e r ) i1 · · · is (1) · · · (r) j j

(1) (s) j1 jr i1···is = a a v v T 1 r i1 · · · is (1) · · · (r) j ···j where the last line defines the components of T . Thus we see that if we know the components of T , then we know the result of T acting on an arbitrary collection of vectors and covectors. What happens to the components of T under a change of coordinates? From a practical standpoint, this is what really defines a tensor. A change of basis in V is of the form e e¯ = e pj i → i j i where (pj ) is called the transition matrix. Then any x V can be written i ∈ in terms of its components with respect to either ei ore ¯i, and we have

j i i j x = x ej =x ¯ e¯i =x ¯ ej p i and therefore we must have j j i x = p ix¯

19 or i −1 i j x¯ = (p ) j x . From these we easily see that ∂xi ∂x¯i pi = and (p−1)i = . j ∂x¯j j ∂xj When V undergoes a change of basis, what about V ∗? Let us write in general ωi ω¯i = bi ωj . Since we must also haveω ¯i(¯e )= δi , we see that → j j j i i i k k i k i l k i l i k δj =ω ¯ (¯ej )=¯ω (ekp j )= p j ω¯ (ek)= p j b lω (ek)= p j b lδk = b kp j

i −1 i so that b k = (p ) k. In other words, ωi ω¯i = (p−1)i ωj . → j Finally we are in a position to derive the general transformation law of a tensor. We have

i1···is i1 is T j1···jr = T (¯ω ,..., ω¯ , e¯j1 ,..., e¯jr )

−1 i1 k1 −1 is ks l1 lr = T ((p ) k1 ω ,..., (p ) ks ω ,el1 p j1 ,...,elr p jr )

−1 i1 −1 is l1 lr k1 ks = (p ) (p ) p p T (ω ,...,ω ,e 1 ,...,e r ) k1 · · · ks j1 · · · jr l l −1 i1 l1 k1···ks = (p ) p T 1 r k1 · · · j1 · · · l ···l or i1 is l1 lr i1···is ∂x¯ ∂x¯ ∂x ∂x k1···ks T j1···jr = T l1···lr . ∂xk1 · · · ∂xks ∂x¯j1 · · · ∂x¯jr This is the classical transformation law of a type (s, r) tensor. In the particular case of a second rank tensor F µν under a Lorentz transfor- mation, we have µ µ ν x¯ =Λ ν x so that ∂x¯µ =Λµ ∂xν ν and therefore µ ν µν ∂x¯ ∂x¯ F = F αβ =Λµ Λν F αβ . ∂xα ∂xβ α β Let us now return to the physics. We know that the law is dp F = q(E + v B)= × dt so consider (now τ is the proper time again) dp dt dp dp = = γ . dτ dτ dt dt

20 In terms of the 4-velocity

γ u0 uµ = = " γv # " u # we can write dp = γq(E + v B)= q(γE + γv B) dτ × × = q(u0E + u B) . (2.17) × Also note that if W = p0 is the energy of the particle, then the change in energy is the rate at which work is done, so that dW dr = F = F v = q(E + v B) v = qE v dt · dt · × · · and therefore dp0 dW dW = = γ = qE γv = qE u . (2.18) dτ dτ dt · · Combining equations (2.17) and (2.18), we see that we can define a linear map uµ dpµ/dτ of a 4-vector to another 4-vector, and hence there exists a → µ second rank mixed tensor F ν such that dpµ = qF µ uν . (2.19) dτ ν Comparing (2.19) with (2.17) and (2.18) allows us to pick out the components µ of F ν : 0 Ex Ey Ez E 0 B B µ  x z y  F ν = − .  Ey Bz 0 Bx   −   E B B 0   z y − x   µ  α Finally, we note that we can also write F ν in the alternate forms Fµν = gµαF ν µν να µ and F = g F α.

Now that we have the electromagnetic field tensor, it is straightforward to derive the transformation laws for the E and B fields. Starting from F µν which ′µν µ ν αβ defines the fields E and B, we have F = Λ αΛ βF which then gives the fields E′ and B′ in terms of E and B. In matrix notation, we can write this as

F ′ =ΛF ΛT .

Using equations (1.14) and (2.15), it is easy to multiply out the matrices and

21 show that 0 E′1 E′2 E′3 E′1 − 0 −B′3 −B′2 F ′µν =  E′2 B′3 − 0 B′1  −  E′3 B′2 B′1 0   −    0 E1 γ(E2 βB3) γ(E3 + βB2) − − − − 1 3 2 2 3  E 0 γ(B βE ) γ(B + βE )  = − − . 2 3 3 2 1  γ(E βB ) γ(B βE ) 0 B   − − −   γ(E3 + βB2) γ(B2 + βE3) B1 0     −  This was for the special case of a boost along the x-axis, i.e., β = βˆx. it is not hard to see that we can write down the field transformation laws for a boost β in an arbitrary direction (but with the coordinate axes still parallel) if we write this in terms of components parallel and perpendicular to the boost. This yields

E′ = γ(E + β B) E′ = E (2.20a) ⊥ ⊥ × k k B′ = γ(B β E) B′ = B (2.20b) ⊥ ⊥ − × k k

22