8 Lorentz Invariance and Special Relativity
The principle of special relativity is the assertion that all laws of physics take the same form as described by two observers moving with respect to each other at constant velocity v. If the dynamical equations for a system preserve their form under such a change of coordinates, then they must show a corresponding symmetry. Newtonian mechanics satisfied this principle in the form of Galilei relativity, for which the relation between the coordinates was simply r′(t) = r(t) vt, and for which time in the two frames was identical. However, Maxwell’s field equations− do not preserve their form under this change of coordinates, but rather under a modified transformation: the Lorentz transformations.
8.1 Space-time symmetries of the wave equation Let us first study the space-time symmetries of the wave equation for a field component in the absence of sources: 1 ∂2 2 ψ =0 (411) − ∇ − c2 ∂t2 k l As we discussed last semester spatial rotations x′ = Rklx are realized by the field transfor- 1 1 mation ψ′(x′,t) = ψ(x,t) = ψ(R− x′,t). Then k′ ψ′ = Rlk− lψ = Rkl lψ, and the wave equation is invariant under rotations because 2∇is a rotational∇ scalar. If∇ we have set up a fixed Cartesian coordinate system we may build∇ up any rotation by a sequence of rotations about any of the three axes. Instead of specifying the axis of one of these basic rotations, it is more convenient to specify the plane in which the coordinate axes rotate. For example, we describe a rotation by angle θ about the z-axis as a rotation in the xy-plane.
′ = cos θ sin θ , ′ = sin θ + cos θ . (412) ∇x ∇x − ∇y ∇y ∇x ∇y Then it is easy to see from the properties of trig functions that
2 2 2 2 2 2 ′ + ′ + ′ = + + , (413) ∇x ∇y ∇z ∇x ∇y ∇z under a rotation in any of the three planes, and through composition under any spatial rotation. There must be a similar symmetry in the xt-, yt-, and zt-planes. But because of the relative minus sign we have to use hyperbolic trig functions instead of trig functions: ∂ ∂ ∂ x′ = cosh λ x + sinh λ , = +sinh λ x + cosh λ (414) ∇ ∇ c∂t c∂t′ ∇ c∂t 2 2 2 2 The invariance of x ∂ /c ∂t under this transformation then follows. Clearly there are analogous symmetries∇ − in the yt- and zt-planes. These transformations will replace the Galilei boosts of Newtonian relativity.
85 c 2010 by Charles Thorn
Now let us interpret this symmetry in terms of how the coordinates transform:
x′ = x cosh λ ct sinh λ γ(x vt) (415) − ≡ − ct′ = ct cosh λ x sinh λ γ(ct vx/c) (416) − ≡ − 2 2 where v = c tanh λ. From the identity cosh− = 1 tanh , we see that γ cosh λ = 1/ 1 v2/c2. We see from the first equation that− the origin of the primed≡ coordinate system−x = 0 corresponds to x = vt which means that the origin of the primed system is p ′ moving on the x-axis at the speed v. We say that the primed system is boosted by velocity v = vxˆ with respect to the unprimed system. It is also essential for the invariance of the wave equation that the time t transform as shown. The three rotations in the xy-, yz-, and zx-plane together with the three boosts in the xt-, yt-, and zt-planes generate the group of Lorentz transformations. If the only entities in the physical world were fields satisfying the wave equation, we would have to identify the Lorentz boosts with transformations to new inertial frames. As we shall see, Maxwell’s equations are also invariant under Lorentz transformations, provided that the electric and magnetic fields are appropriately transformed among themselves. Indeed we already know that the electric and magnetic fields transform as vectors under ordinary rotations. To discuss their transformation under boosts we will have to develop the formalism of 4-vectors and 4-tensors.
8.2 Einstein’s Insights Einstein’s contribution to special relativity was not the discovery of Lorentz transformations (which were already well-known). Rather, he recognized that the clash between Lorentz invariance and the Galilei invariance of Newtonian mechanics was inconsistent with the physical principle of relativity. The laws of Maxwell electrodynamics and Newtonian me- chanics cannot both look the same to different inertial observers, since they require different transformation rules. One or the other required modification, and Einstein realized that the defect was in Newtonian mechanics, for which inertial frames are related by Galilei transformations
r′ = r vt, t′ = t (417) − For v << c Lorentz transformations are indistinguishable from Galilei transformations, which accounts for the lack of experimental evidence at the time that Newtonian mechanics was wrong. On the other hand experiments explicitly designed to show that Maxwell’s equations took different forms in different inertial frames had repeatedly obtained null results. Einstein recognized this as direct evidence that Lorentz transformations gave the correct relationship between inertial frames. If this is the case, Newtonian mechanics has to be modified. Accordingly, Einstein developed a new relativistic (i.e. Lorentz invariant) particle mechanics. An alternative way to discover the necessary modifications to Newtonian mechanics is to describe massive matter with Lorentz invariant wave equations. Unlike light, which always
86 c 2010 by Charles Thorn
travels at speed c, material bodies can be brought to rest. The wave equation for light describes wave packets with definite wave number with group velocity vg = ckˆ, which always has magnitude c. However consider the (Klein-Gordon) wave equation 1 ∂2 2 + + µ2 ψ =0 (418) −∇ c2 ∂t2 ik r iωt 2 2 2 2 Then a plane wave e · − will solve this equation if ω = c (k + µ ), so the group velocity 2 2 is vg = ck/ k + µ , which vanishes as k 0. On the other hand the new wave equation is transparently Lorentz invariant, in the same→ way that the massless wave equation is. p Of course, in Einstein’s time the idea that material bodies were wave packets must have seemed a little far-fetched. so he, instead, worked to directly modify the laws of particle mechanics. Ironically, Einstein was actually the first to realize that because of quantum mechanics electromagnetic waves display particle-like properties: the photon can be thought of as a plane wave with energy E = ~ω and momentum p = ~k. Applying this insight to our massive wave packet leads, with ~µ mc, to ≡ p2 p4 p 2 2 2 2 E = c p + m c mc + + O , vg = c (419) ∼ 2m m3c2 p2 + m2c2 p where the last form on the right displays the low momentum limit pp mc. This shows that the Newtonian kinetic energy is obtained at low momentum and that≪ m is the Newtonian mass. At p = 0 we have Einstein’s famous E = mc2! We can easily solve for p in terms of v: v2 p2 mc2 vE = , E = = γmc2, p = = γmv (420) c2 p2 + m2c2 1 v2/c2 c2 − As we shall soon see this modificationp of the relation between momentum and velocity is sufficient to make particle mechanics consistent with Lorentz invariance. In particular we shall see that with the Lorentz force law and this relation, Newton’s equation of motion in the form dp = F = q(E + v B) (421) dt × is valid in all inertial frames connected by Lorentz transformations. But to see this clearly, we need to develop the machinery of 4-vectors and 4-tensors and their transformation laws.
8.3 Some Kinematical Aspects of Lorentz transformations Time Dilatation Let us consider a clock moving down the x-axis according to x(t) = vt,y(t)= z(t) = 0. The clock is at rest at the origin of coordinates in the primed frame with spacetime coordinates given by
2 x′ = γ(x vt), t′ = γ(t vx/c ), y′ = y, z′ = z (422) − −
87 c 2010 by Charles Thorn
The time registered on the clock is t′, which is related to the time in the unprimed frame by
2 2 2 2 2 t′ = γ(v)(t vx(t)/c )= γ(v)(t v t/c )= t 1 v /c (423) − − − where we took into account the fact that the x coordinate ofp the clock is changing in the unprimed frame, x(t) = vt. Thus the observer in the unprimed frame judges the clock to run more slowly than in its rest frame by a factor of 1/ 1 v2/c2 = γ(v). Now consider a clock at rest at the origin in the unprimed− frame. Now the relation p between the times in the two frames is simply t′ = γt. And the primed observer also judges this clock to run more slowly by the same factor γ(v). Each observer concludes that the other observer’s clocks slow down! This effect is literally confirmed daily in accelerator experiments where fast unstable particles are able to travel distances many times their lifetime times c. More generally, consider a clock in arbitrary motion, described by a trajectory r(t) in the unprimed frame. At a given moment we can consider a primed frame in which the clock is instantaneously at rest. Then assuming at that moment r˙ = vxˆ the Lorentz transformation 2 of an infinitesimal time interval dt′ to the Lab frame reads dt′ = γ(dt vdx/c ) = γ(1 v2/c2)dt = dt 1 v2/c2. By rotational invariance, if r˙ = v is in a general− direction this− − 2 2 implies dt′ = dt 1 v /c . Then a finite lapse of time in the instantaneous rest frame p − ∆T = t′ t′ is given by the integral 2 − 1 p t2 1 dr 2 ∆T = dt 1
Invariant intervals and the Light Cone Points in spacetime are more precisely thought of as events. By construction Lorentz transformations leave the quantity x x = x2 c2t2 invariant. But since all events are subject to the same transformation, the “interval”· between− 2 two events s12 =(x1 x2) (x1 x2) is also invariant. Intervals can be positive (space-like), − · − µ negative (time-like) or zero (light-like). If one of the two events is at the origin x1 = 0, the events light-like separated from the origin lie on a double cone with vertices at the origin. Events within the cone have time-like separation from the origin, and those outside have space-like separation. There can be no causal connection between space-like separated events because they cannot be connected by light signals. We may define the proper time τ of a body in motion through dτ 2 = dx dx/c2 = dt2 dr2/c2 = dt2(1 r˙ 2/c2). Note that a clock moving with the body registers− the· body’s proper− time. −
88 c 2010 by Charles Thorn
Length Contraction Consider a rod of length L at rest in the unprimed frame parallel to the x-axis, with ends, say at x1 = 0 and x2 = L. In the primed system it is moving in the negative x direction at speed v: x′ = γ( vt), x′ = γ(L vt). Its length will be judged to 1 − 2 − be L′ = x2′ (t′) x1′ (t′) where the coordinates are measured at the same t′. But these will − 2 2 correspond to different values of t: t′ = γ(t )= γ(t vL/c ), so t = t + Lv/c . Then 1 2 − 2 1 2 2 2 2 L′ = γ(L vt )+ γvt = γL(1 v /c )= L 1 v /c (425) − 2 1 − − A moving object is therefor observed to be contracted by ap factor 1/γ in the direction of motion.
We can also consider the rod at rest in the primed frame, in which case x1′ =0,x2′ = L. At a particular time t in the unprimed frame these coordinates are related to the unprimed ones by x1′ = γ(v)(x1 vt),x2′ = γ(v)(x2 vt), so L = x2′ x1′ = γ(v)(x2 x1), so we reach the same conclusion, that− length is perceived− to be contracted− by motion x − x = L 1 v2/c2. 2 − 1 − p Simultaneity is relative Simultaneous events (x1,t), (x2,t) in the unprimed frame cor- 2 respond to events (x1′ ,t1′ ), (x2′ ,t2′ ) where t2′ t1′ = γv(x1 x2)/c in the primed frame. Of course two simultaneous events at the same− spatial point− are simultaneous in all frames.
Addition of velocities Suppose a particle is moving with constant velocity (Vx,Vy,Vz) 2 in the primed frame, so x′ = Vxt′, y′ = Vyt′, z′ = Vzt′. Then x vt = Vx(t vx/c ), y = V γ(t vx/c2), z = V γ(t vx/c2). − − y − z −
v + Vx Vy Vz x = 2 t, y = 2 t, z = 2 t (426) 1+ vVx/c γ(1 + vVx/c ) γ(1 + vVx/c )
Calling the components of velocity in the unprimed frame (Ux,Uy,Uz) we see that
v + Vx Vy Vz Ux = 2 ,Uy = 2 ,Uz = 2 (427) 1+ vVx/c γ(1 + vVx/c ) γ(1 + vVx/c ) (v + V )2 +(V 2 V 2)(1 v2/c2) U 2 x x = − 2 2 − (1 + vVx/c ) 2 2 2 2 2 2 (1 V /c )(1 v /c ) = c c − −2 2 (428) − (1 + vVx/c )
From the last form we see that U 2 c2 when either v2 c2 or V 2 c2 or both. Thus one never achieves a velocity greater than→ c by adding two velocities,→ no→ matter how close they are to c.
8.4 Space-time Tensors and their Transformation Laws Let us introduce a unified notation xµ for space-time coordinates where the index assumes the values µ = 0, 1, 2, 3 and we identify x0 = ct, x1 = x, x2 = y, x3 = z. The factor of c in x0 assures that all components of xµ have dimensions of length. Then a general Lorentz
89 c 2010 by Charles Thorn
µ µ ν transformation can be expressed as a linear transformation x′ = Λ νx , where the 4 4 µ 2 2 2 × matrix Λ ν is restricted by the requirement that x x x c t is invariant under the transformation. To express the consequences of this invari· ≡ance,− it is convenient to introduce the diagonal matrix ηµν, which has nonvanishing components η11 = η22 = η33 = η00 = 1. Then we can write x x = η xµxν, and the invariance condition becomes − · µν µ ν ρ σ ρ σ µ ν x′ x′ = η Λ Λ x x = η x x = x x, η Λ Λ = η (429) · µν ρ σ ρσ · µν ρ σ ρσ An immediate consequence of the second equation is (detΛ)2 = 1 implying detΛ = 1. µ µ ± Notice that with the stated condition on Λ, it follows that v′ w′ = v w for any v ,w which µ µ ρ µ µ ρ · · transform like coordinates: v′ =Λ ρv and w′ =Λ ρw .
Matrix Notation We can arrange the components of xµ in a column vector
x0 x1 x = (430) x2 x2 and the components of Λµ in a 4 4 matrix ν × 0 0 0 0 Λ 0 Λ 1 Λ 2 Λ 3 1 1 1 1 Λ 0 Λ 1 Λ 2 Λ 3 Λ= 2 2 2 2 . (431) Λ 0 Λ 1 Λ 2 Λ 3 Λ3 Λ3 Λ3 Λ3 0 1 2 3 µ µ ν µ ν Then the equation x′ =Λ νx becomes simply x′ =Λx, and the equation ηµνΛ ρΛ σ = ηρσ reads 1 0 0 0 −0 100 ΛT ηΛ= η, η = (432) 0 010 0 001 As an example a boost v in the x-direction and a rotation in the xy plane are given by
γ γv/c 0 0 10 0 0 γv/c− γ 0 0 0 cos θ sin θ 0 Bx = Rxy = (433) − 0 0 10 0 sin θ −cos θ 0 0 0 01 00 0 1 respectively.
Contravariant Tensors We define a contravariant 4-vector vµ as a 4 component entity µ µ ν whose components transform just like the coordinates v′ = Λ νv . We also stipulate that 4-vectors are unchanged under space-time translations. For this reason the four coordinates
90 c 2010 by Charles Thorn
xµ are not components of a 4-vector, because they change under translations x x + a. But displacements, which are differences of coordinates are 4-vectors. In particular,→ the differentials dxµ are contravariant 4-vectors. Another important contravariant 4-vector is the energy momentum 4-vector of a particle pµ = (E/c, p) = γm(c, v). The easiest way to see that it transforms as a 4-vector is to consider the boost which takes the particle from rest to velocity v = vxˆ. Let the unprimed frame be the one where the particle is at rest p1,2,3 = 0 and p0 = mc. Then the primed frame should move in the negative x direction with speed v. If pµ is a contravariant 4-vector, its components in the new frame should then be
0 0 v 1 1 1 v 0 p′ = γ(p + p )= γmc, p′ = γ(p + p )= γmv (434) c c which are precisely the values we have found for a particle with this velocity. An ordinary rotation can then point the velocity in any direction. Incidentally, in our discussion of the Doppler shift we shall see that the frequency and wave number of a plane wave kµ =(ω/c, k) also transform like a 4-vector. This ensures that the Einstein/deBroglie quantum relations pµ = ~kµ are compatible with Lorentz invariance.
ik r iωt Doppler Shift We examine how a plane wave e · − in the unprimed frame looks in the primed frame. Thinking of kµ = (ω/c, k) as a 4-vector, the plane wave can be written ik x ik′ x′ e · = e · . thus in the primed frame, we also have a plane wave but with frequency and wave numbers
x x x 2 y y z z ω′ = γ(ω vk ), k′ = γ(k vω/c ), k′ = k k′ = k (435) − − x y z Writing k = ω cos θ/c, k = ω sin θ cos ϕ/c, k = ω sin θ sin ϕ/c, and similarly for k′,θ′,ϕ′ = ϕ, these transformation laws lead to v sin θ ω′ = γω(1 cos θ), tan θ′ = (436) − c γ(cos θ v/c) − If θ = 0 the primed observer is moving away from the source of light and we see in this case ω′ = ω (c v)/(c + v) < ω which means a red shift. On the other hand, if θ = π the ob- − server is moving toward the source of light and we find a blue shift ω = ω (c + v)/(c v) > p ′ ω. Note that there is a relativistic Doppler effect when θ = π/2 when ω = γω− > ω. p ′ This transverse Doppler shift is a blue shift. In contrast, at θ′ = π/2, cos θ = v/c, and 2 2 ω′ = ω 1 v /c , a red shift! − Covariantp Tensors In special relativity, there is another type of 4-vector called a covari- µ ant 4-vector, which transforms as the 4-gradient operator ∂µ ∂/∂x . To see that the transformation is different we use the chain rule to write ≡ ν ∂ ∂x′ ∂ ν 1 µ ′ ′ − ∂µ = µ = µ ν =Λ µ∂ν, ∂ν = ∂µ(Λ ) ν (437) ∂x ∂x ∂x′ 1 ν More generally any covariant vector vµ transforms as vµ′ = vν(Λ− ) µ. We see two differences 1 with the contravariant transformation law: first Λ− occurs and secondly the first rather
91 c 2010 by Charles Thorn
than the second index is summed, i.e. the transpose occurs. Note that because of this µ difference the scalar vµw is invariant under all linear transformations, not just Lorentz 1 T transformations. The condition for ordinary rotations R− = R obliterates the distinction between the two kinds of vectors. But for Lorentz transformations, we have instead the condition
1 1 T Λ− = η− Λ η 1 µ µσ ρ ρ σµ (Λ− ) ν = η Λ σηρν = ηνρΛ ση (438)
σµ where η has identical components to ησµ but we raise the indices because we are visualizing 1 it as η− . Because of the 1 entries in η, covariant and contravariant 4-vectors have slightly different transformation laws.− We also see from this exercise that a contravariant vector can ν be converted to a covariant one using the matrix η: vµ ηµνv . Or the other way around µ µν 1,2,3 0 ≡ v = η vν. We see that v = v1,2,3 but v = v0. Although the distinction is very slight we shall maintain it throughout. The following− forms for the scalar product of two four vectors summarize the distinctions we are making:
v w = η vµwν = vµw = v wν = ηµνv w (439) · µν µ ν µ ν From now on the summation convention for repeated indices will always involve the contrac- tion of an upper (contravariant) index with a lower (covariant) index. Once we have understood contravariant and covariant 4-vectors, multi-index tensors (e.g. µν T λ are an easy generalization. Each contravariant (upper) index transforms as a contravari- ant vector and each covariant (lower) index like a covariant vector. A tensor field also has its space-time argument transformed:
µν µ ν 1 κ ρσ 1 T ′ λ (x′)=Λ ρΛ σ(Λ− ) λT κ(Λ− x′) (440)
8.5 Lorentz covariance of Maxwell’s equations Because tensors transform homogeneously under Lorentz transformations, a set of dynamical equations that set a tensor to zero will automatically retain its form in all inertial frames. Thus we seek to cast Maxwell’s equations in this way. The first step is to identify the electric and magnetic fields as the components of some tensor. We start by expressing them in terms of potentials ∂A E = φ k = ∂ ( φ) c∂ A c∂ A c∂ A (441) k −∇k − ∂t k − − 0 k ≡ k 0 − 0 k ǫ Bk = ǫ ǫkmn A = ∂ A ∂ A (442) ijk ijk ∇m n i j − j i
We already know that ∂µ transforms as a covariant 4-vector. So the above forms strongly suggest that we should interpret Aµ =( φ/c, Ak) also as a covariant four vector and Fµν = c(∂ A ∂ A ) as a second rank covariant− 4-tensor. Since it is antisymmetric, there are µ ν − ν µ
92 c 2010 by Charles Thorn
precisely 6 independent components, corresponding to the 6 components of E and B. We may write out the components of F as a 4 4 matrix array as follows: µν × 0 E1 E2 E3 E1 −0 −cB3 −cB2 Fµν = (443) E2 cB3 0 −cB1 − E3 cB2 cB1 0 − µν µ ν ρσ Let us work out what the tensor transformation law F ′ = Λ ρΛ σF implies for the 0 1 0 electric and magnetic fields. Consider boosts in the x direction, so Λ 0 = Λ 1 = γ, Λ 1 = Λ1 = γv/c, and Λ2 =Λ3 = 1. Then 0 − 2 3 1 01 0 1 01 1 0 10 01 1 E′ = F10′ = F ′ =Λ 0Λ 1F +Λ 0Λ 1F = F = E (444) 1 23 2 3 23 23 1 cB′ = F23′ = F ′ =Λ 2Λ 3F = F = cB (445) 2 02 2 0 02 0 12 E′ = F ′ =Λ 2(Λ 0F +Λ 1F ) = γ E2 vB3 = γ E2 +(v B)2 (446) 2 31 − 3 1 31 1 30× cB′ = F ′ =Λ 3(Λ1F +Λ 0F ) v v = γ cB2 + E3 = γ cB2 ( E)2 (447) c − c × 3 3 3 3 3 v 3 E′ = γ E +(v B) , cB′ = γ cB ( E) (448) × − c × We can infer from these formulas the boosts in a general direction v, by defining and as k ⊥ the components of a vector parallel to and perpendicular to v respectively. Then
E′ = E , B′ = B (449) k k k k v E′ = γ(E + v B), B′ = γ B E (450) ⊥ ⊥ × ⊥ ⊥ − c2 × Since we have used potentials, the homogeneous Maxwell equations are automatically satis- fied. The remaining ones are ρ = ǫ E = ǫ ∂ F = ǫ ∂ F k0 = ǫ ∂ F 0k (451) 0∇· 0 k k0 − 0 k 0 k i 1 m i 1 J = ǫikm∂kB cǫ0∂0E = ∂kFik cǫ0∂0Fi0 µ0 − cµ0 − ik i0 = cǫ0∂kF + cǫ0∂0F (452) Thus we see that if we interpret J µ = (cρ, J k) as a contravariant 4-vector, Maxwell’s equa- tions assume the form5 1 ǫ ∂ F µν = J µ ρµ, (453) 0 ν c ≡ 5 In SI units electric and magnetic fields have different units. Since we have defined Fµν to have the µν µ µ units of the electric field, ǫ0∂ν F has dimensions of charge density as does J /c ρ . Perhaps we should ≡ have defined a field strength Bµν = ∂µAν ∂ν Aµ = Fµν /c with dimensions of the magnetic field. Then − 2 µν µν µ Maxwell’s equations would assume the form ǫ0c ∂ν B = ∂ν B /µ0 = J , which resembles the magnetic Amp`ere-Maxwell equation.
93 c 2010 by Charles Thorn
which resembles the Gauss-Maxwell equation. This puts the equations in manifestly Lorentz covariant form, so they take this form in all inertial frames connected by Lorentz transfor- µ ∂ρ mations. Note that charge conservation 0 = ∂µJ = J + ∂t is a direct consequence of the antisymmetry of F µν = F νµ. Its validity in all inertial∇· frames follows from the 4-vector interpretation of J µ. − Although we have used potentials to identify the fields as components of Fµν, we can dispense with them by restoring the homogeneous Maxwell equations. We cast them in covariant form as follows: 1 0 = B = (∂ F + ∂ F + ∂ F ) ∇· c 1 23 2 31 3 12 ∂Bi 1 0 = ( E)i + = ǫijk∂ F + ∂ ǫijkF ∇× ∂t j k0 2 0 jk 1 = ǫijk(∂ F + ∂ F + ∂ F ) (454) 2 j k0 k 0j 0 jk These 4 equations are the various components of the covariant condition
∂µFνρ + ∂ρFµν + ∂νFρµ =0 (455) One can check directly that the left side is antisymmetric under the interchange of any pair of indices, which means that no information is lost if one multiplies by the completely antisymmetric 4-tensor ǫκµνρ and sums over repeated indices:
κµνρ κµνρ κµ 0= ǫ ∂ F = ∂ (ǫ F ) 2∂ F ∗ (456) µ νρ µ νρ ≡ µ κµ κµνρ where F ∗ ǫ Fνρ/2 is known as the dual of Fµν. This dual transformation interchanges the roles of≡ electric and magnetic fields. κ λ µ ν ρστω κλµν Incidentally, from the definition of determinants Λ ρΛ σΛ τ Λ ωǫ = ǫ detΛ, we see that ǫ is invariant under Lorentz transformations (actually under all linear transformations!) with unit determinant. The defining condition of Lorentz transformations shows that trans- forming ηµν as a covariant tensor leaves it unchanged. Thus we can freely use both ηµν and ǫκλµν in the process of constructing new tensors (or pseudotensors) from old ones. Some examples: ηµρηνσF F = F F µν = 2(c2B2 E2) (457) µν ρσ µν − 1 µνρσ µν ǫ F F = F F ∗ = 4cE B (458) 2 µν ρσ µν − ·
are two important scalar fields. The first times ǫ0/4 is just the Lagrangian density for the electromagnetic field. − ǫ 1 S = 0 dtd3x(E2 c2B2)= dtd3x(E2/c2 B2) 2 − 2µ − Z 0 Z ǫ ǫ = 0 dtd3xηµρηνσF F = 0 dtd3xF F µν − 4 µν ρσ − 4 µν Z Z ǫ 1 = 0 d4xF F µν = d4xB Bµν (459) −4c µν −4cµ µν Z 0 Z 94 c 2010 by Charles Thorn
where we understand d4x = d(ct)d3x. These are the only two invariants quadratic in the fields. Finally we need to confirm that the equations of motion for a charged particle in electric and magnetic fields are the same in all inertial frames. Before we worry about a particle moving in fields we should say a word about the motion of free particles. Then Newton’s equation just says that the three momentum p of a free particle is a constant. Then the energy E = p2c2 + m2c4 is also a constant and we have the Lorentz covariant statement that pµ is constant. More generally, there will be conservation of the total energy and p momentum of a system made up of several interacting particles. For example a scattering µ µ process in which two particles of momenta p1 ,p2 collide and emerge with altered momenta µ µ µ µ µ µ p1′ ,p2′ , will respect the conservation laws p1 + p2 = p1′ + p2′ in all inertial systems. In this context one can define Lorentz invariant scalar products like (p + p ) (p + p ) which have 1 2 · 1 2 the same values in all frames. Two common frames are the Lab frame where p2 = 0 and the center of mass frame where p1 + p2 = 0. By evaluating the invariant in these two frames we learn that 1 (p + p )2 = E2 = p2 + p2 +2p p = m2c2 m2c2 2m E 1 2 −c2 CM 1 2 1 · 2 − 1 − 2 − 2 1,Lab E = c m2c2 + m2c2 +2m E c 2m E (460) CM 1 2 2 1,Lab ∼ 2 1,Lab q p at high energies. To make heavy particles we need high ECM which means much higher E1,Lab because of the square root. This is why colliding beams are much more effective than single beams colliding with stationary targets in producing novel heavy particles. Turning to the motion of a particle in an electromagnetic field, we seek to cast the Newton equation dp = q(E + v B), p = γmv (461) dt × as a 4-tensor equation. We have already seen that pµ =(E/c, p) is a contravariant 4-vector. It is convenient to divide by the mass to form the “velocity” 4-vector U µ =(γc,γv). Then it is natural to consider the quantity
ν k FµνU = γ(cFµ0 + Fµkv ) (462) F U ν = γ(cEj + cǫ vkBl)= cγ(E + v B)j (463) jν jkl × F U ν = γF vk = γv E (464) 0ν 0k − · Then Newton’s equation can be written dpj q = F U ν (465) dt γc jν which are the spatial components of the 4-vector equation dp q γ µ = F U ν (466) dt c µν
95 c 2010 by Charles Thorn
But this equation has a time component which seems independent of the Newton equation: dp 1 dE q d 0 = = v E, p2c2 + m2c4 = qv E (467) dt − c dt −c · dt · which, since qv E is just the rate at which workp is done on the particle by the electric field, we recognize· as the relativistic work energy theorem. It is an easy consequence of the Newton equation: d pc2 p2c2 + m2c4 = q(E + v B)= qv E (468) dt γmc2 · × · p and so is not really an independent equation. Finally, we note that dt/γ = dt 1 v2/c2 dτ is just the time interval dt as measured by a clock in the instantaneous particle rest− frame.≡ p The measure of time given by τ is therefore a Lorentz invariant notion and τ is accordingly known as the particle’s proper time. With this concept we can write Newton’s equation in manifestly Lorentz covariant form: dp dU µ q µ = m = F U ν (469) dτ dτ c µν Here Lorentz covariance implicitly requires that the charge q has the same value for all inertial observers. Since it is a property of the particle we could simply assume it. However, the statement should be consistent with the 4-vector transformation properties of J µ. For µ this we need the local statement of charge conservation: ∂µJ = 0, from which we can write
4 µ 3 µ 0 = d x∂µJ = d SnµJ (470) Z I where on the right we have used Gauss’ theorem to write the integral of a divergence as a boundary integral. Now we assume all components of J µ vanish rapidly enough at spatial infinity, so that any part of the boundary at spatial infinity can be dropped. We may freely choose the spatially finite parts of the closed contour. Consider the region bounded by two hypersurfaces, one at t = t1 and the other at t′ = t2′ , where t,t′ are times in two different inertial frames. Then the relation becomes
3 0 3 0 0= d xJ (x,t ) d x′J ′ (x′,t′ )= Q(t ) Q′(t′ ) (471) 1 − 2 1 − 2 Z Z This equation can be read in two ways. First, if the primed observer is in fact the same as the unprimed observer, it reads Q(t2)= Q(t1), i.e. charge is conserved in all inertial frames. Then if we take the case that the primed observer is boosted, we learn that Q′ = Q, i.e. charge is a scalar and has the same value at all times in all inertial frames. This completes the demonstration that Maxwell’s electrodynamics takes the same form in all inertial frames connected by Lorentz transformations. This statement is concisely summarized by: 1 dp q q ǫ ∂ F µν = J µ, ǫµνρσ∂ F =0, µ = F U ν = F pν (472) 0 ν c ν ρσ dτ c µν mc µν
96 c 2010 by Charles Thorn
We should emphasize, that apart from justifying the choice of any convenient inertial frame for the solution of a given problem, the covariant formalism does not really offer new tech- niques for solving problems. For that one simply settles on a convenient frame and solves the equations as we did last semester.
8.6 Action Principles We have already mentioned the action for electromagnetic fields coupled to sources which can be put in covariant form as follows6: ǫ S = dtd3x 0 (E2 c2B2) ρφ + J A fields 2 − − · Z 1 h ǫ i = d4x 0 F F µν + J µA (473) c − 4 µν µ Z h i Maxwell’s equations follow from δS = 0 under variations of the vector potential δAµ, so δF = c(∂ δA ∂ δA ): µν µ ν − ν µ 1 ǫ 1 δS = d4x 0 F µνδF + J µδA = d4x [ ǫ F µν∂ δcA + J µδA ] c − 2 µν µ c − 0 µ ν µ Z Z 1 h i = d4x [ c∂ (ǫ F µνδA )+(J ν + ǫ ∂ cF µν)δA ] (474) c − µ 0 ν 0 µ ν Z
Taking δAµ = 0 at the space-time boundaries, but otherwise arbitrary, then δS = 0 implies µν µ Maxwell’s equations, ǫ0∂νF = J /c. The relativistic version of the action for a particle moving in fixed fields can also be put in covariant form:
v2 S = mc2 dt 1 q dt(φ(r(t),t) v A(r(t)),t) part − − c2 − − · Z r Z = mc2 dτ + q dτU µA (475) − µ Z Z The last form shows that the action is a scalar, assuming the same form in all inertial reference frames. This is another way of understanding why Newton’s equation is covariant. The complete Maxwell electrodynamics follows from the combined action ǫ S = 0 d4xF F µν mc2 dτ + q dτU µA (476) −4c µν − µ Z Z Z 6 4 µνρσ One could also add a term θ d xǫ Fµν Fρσ to the action. This term is invariant under proper Lorentz transformations but it is odd under parity and also odd under time reversal. If we want to keep parity or time reversal symmetry, it wouldR be forbidden. But even if it is allowed, it has no effect on the classical µνρσ µνρσ equations of motion because one can write ǫ Fµν Fρσ = 2∂µ(ǫ Aν Fρσ) showing that its variation will not contribute to δS under the conditions of Hamilton’s principle.
97 c 2010 by Charles Thorn
which is also a Lorentz scalar. It therefore implies covariant equations of motion, which of course we have already shown directly. The current due to the particle is, by equating the last term to (1/c) d4xJ µA , yielding J µ = qc dτU µδ4(x x(τ)). Explicitly, we have µ − R R γ ρc = J 0 = qc2 dτγδ(ct x0(τ))δ(x x(τ)) = qc2 δ(x x(τ(t))) − − dx0/dτ − Z ρ = qδ(x x(τ(t))) (477) − dx dx J = qc dτγ δ(ct x0(τ))δ(x x(τ)) = q δ(x x(τ)) (478) dt − − dt − Z where τ(t) is implicitly defined by ct = x0(τ(t)). These are the familiar equations we used last semester.
8.7 Some particle motions in electromagnetic fields
The motion of a charged particle in uniform fields Fµν(x) =constant is an important case. The covariant form of the equations of motion dU µ q dxµ = F µ U ν,U µ = ,U U µ = c2 (479) dτ mc ν dτ µ − reduces to a set of 4 coupled first order differential equations for U µ(τ) with constant coeffi- cients. If we employ matrix notation, the solution can be expressed as U(τ)= eqFτ/mcU(0). This formal solution can be made more concrete by expressing it as a linear combination of 4 “normal mode” solutions in each of which all components of U have a common exponen- µ µ qrτ/mc tial τ dependence U = Vr e . Then one needs to solve a matrix eigenvalue problem µ FVr = rVr, with the matrix elements of F given by F ν. This is a straightforward problem in linear algebra. One determines the possible eigenvalues r from the characteristic equation det(F rI) = 0. The Lorentz transformation rules − µ µ ρ 1 σ 1 µ F ′ ν =Λ ρF σ(Λ− ) ν = (ΛF Λ− ) ν (480)
Show that det(F ′ rI) = det(F rI) = 0, so the characteristic equation may be evaluated in any Lorentz frame,− for example,− the one in which the electric and magnetic fields are both parallel to the x-axis. In that frame 0 E 0 0 E 0 0 0 F = , det(F rI)=(r2 E2)(r2 + B2) (481) 00 0 cB − − 0 0 cB 0 − so the four possible eigenvalues are r = E and r = icB. The general solution will then µ µ qrτ/mc ± µ ±2 be U = r crVr e . The constraint UµU = c will be independent of τ because µ ν − ηµνV V ′ = 0 unless r + r′ = 0. A simple further integration yields r r P c mc c mc xµ(τ) = xµ(0) + r V µ(eqrτ/mc 1) Xµ + r V µeqrτ/mc (482) qr r − ≡ qr r r r X X 98 c 2010 by Charles Thorn
To represent the trajectory in space as a function of the frame’s time one would have to invert ct = x0(τ) to get τ(t) and then obtain x(τ(t)). In the general case this last step is complicated. A general feature of the motion can be brought out by noticing that c c m2c2 (x(τ) X)2 = r s V V eq(r+s)τ/mc − q2rs r · s r,s X 2 2 2 crc rm c (x(τ) X) = − Vr V r (483) − − q2r2 · − r X which is to say that xµ(τ) lies on a hyperboloid. We next find the motion in a few special situations. Uniform electric field The equation of motion can be immediately integrated to yield p(t)= p0 + qE(t t0), where without loss of generality we can take p0 E = 0 and t0 = 0. Then since v = dr−/dt = pc/ p2 + m2c2, we have · t p p0c + qcEt′ r(t) = r0 + dt′ 2 2 0 2E 2 p 2 2 Z q t′ + 0 + m c p c ctEˆ + 0 pln t + O(1) (484) ∼ q E | | which shows that at large times the particle’s speed approaches that of light. Boosted uniform electric field A boost parallel to E leaves the field invariant and simply maps the solution on to the analogous solution in the primed frame. To see this, take E = Exˆ and p0 = p0yˆ. Then taking a boost in the x-direction the transformed momentum is
v 2 2 2 2 2 2 p′ = γ qEt q E t + p + m c , p′ = p , p′ =0 (485) x − c 0 y 0 z q 2 The time in the new frame t′ = γ(t vx(t)/c ) can be obtained by computing − c x(t) = x + q2E2t2 + p2 + m2c2 p2 + m2c2 (486) 0 qE 0 − 0 q q vx0 v 2 2 2 2 2 2 2 2 2 t′ = γ t ( q E t + p + m c p + m c ) (487) − c2 − qcE 0 − 0 q q p′ = p0 + qEt′ + const (488)
A boost perpendicular to E generates a magnetic field in the primed frame, E′ = γE 2 2 ⊥ ⊥ and B′ = v E′ /c . This is a new physical situation: we can express v = c B′/E′ and ⊥ − 2× 2 ⊥ 2 γ =1/ 1 c B′ /E′ . Consider the above solution in the special case p0 = 0, with E = Eyˆ and putting−x = z = 0, p 0 0 x(t) = z(t)=0 t qcEt′ 2 2 2 2 2 y(t) = y0 + dt′ = Y + cyˆ t + m c /q E (489) 2 2 2 2 2 0 q E t′ + m c Z p p 99 c 2010 by Charles Thorn
2 in a frame moving with velocity vxˆ in the x-direction, so E′ = γEyˆ and B′ = γvEz/cˆ = 2 − vE′z/cˆ . − 2 2 c B′ t′ = γ(t vx(t)/c )= γt, x′ = γ(x(t) vt)= vt′ = t′, z′ =0 − − − − E′ 2 2 2 2 t′ m c γ ′ ′ ′ y (t ) = y(t)= y(t /γ)= Y + c 2 + 2 2 sγ q E′ c2B 2 m2c2 = Y + c t 2 1 ′ + (490) ′ − E 2 q2(E 2 c2B 2) s ′ ′ − ′ This gives us the solution for motion in perpendicular electric and magnetic fields. There is 2 accelerated motion in the direction of E and a uniform drift at speed c B′/E′ in the direction of E′ B′. × Uniform magnetic field dp = qv B (491) dt × Dotting both sides with p shows that p2 and hence v2 and γ are constants. Thus we can write the equation of motion as dv/dt = v ω, where ω = qB/mγ. Thus v is constant k and v rotates in the transverse plane with angular× frequency ω . Explicitly, with B = Bzˆ, ⊥ | | vz = constant, vx = v cos ωt, vy = v sin ωt, and ⊥ ⊥ v v z = v t, x = x + ⊥ sin ωt, y = y ⊥ cos ωt (492) z 0 ω 0 − ω
Thus the particle moves in a circle of radius RB = v /ω = mv γ/qB = p /qB. ⊥ ⊥ ⊥ Boosted uniform magnetic field Again a boost parallel to the field just scales the field without altering the physical situation. So we just examine a boost perpendicular to the field, which we take in the z-direction B = Bzˆ. So our canonical boost in the x-direction v = vxˆ does what we want. In the primed frame B′ = γBzˆ And an electric field E′ = E′yˆ = γv B = v B′ = vB′yˆ appears. So we have perpendicular electric and magnetic fields × × − with E′ < c B′ . To find the motion in such fields all we have to do is boost the solution we found| | with| E| = 0. For simplicity let’s just consider the circular motion in the xy-plane:
v0 v0 x(t)= cos ωBt, y(t)= sin ωBt, z(t)=0 (493) ωB −ωB The boost is then
v0 vv0 v0 x′ = γ cos ω t vt , t′ = γ t cos ω t , y′ = sin ω t, z′ =0 ω B − − c2ω B −ω B B B B We see from the x′ equation that there is an average drift in the negative x′-direction. To 2 express the boosted solution in primed frame variables, we note that v = E′ B′/B′ − ×
100 c 2010 by Charles Thorn
so the average drift v of the gyrating particle is in the direction of E′ B′. Also ωB = − 2 2 × qB/mγ0 = qB′/mγ0γ, where γ0 =1/ 1 v0/c . Ideally we would eliminate t in favor of t′, but we can only do that implicitly. We can− do a partial elimination however p 2 2 v0 v0 x′ = vt′ + γ(1 v /c ) cos ωBt = vt′ + cos ωBt (494) − − ωB − ωBγ
A simple statement can be made at the times when cos ωBt = 0, i.e. the times when x′ = vt′, and y′ = v /ω Y , its extreme values. These are at t′ = γt = γ(n +1/2)π/ω . − ± 0 B ≡± 0 B The special case v0 = 0 is interesting. The particle is at rest in the unprimed frame, where E = 0. In the primed frame it moves uniformly at velocity v. This is consistent with the − equations of motion since E′ = v B′ so that F = q(E′ v B′) = 0. The magnetic force cancels the electric force. This means× that perpendicular− electric× and magnetic fields can be 2 used as a velocity filter: only particles with velocity V = E′ B′/B′ will pass through undeflected. × Drift in Inhomogeneous Magnetic Fields. It is not generally possible to find analytic solutions for nonuniform magnetic fields. But if the fields vary slowly one may use the solution in uniform fields as the starting point for an approximate solution. First consider the case where the magnetic field is in the z direction, but is consistently stronger in one transverse direction, say in the x direction: B zˆ(B + xB′). if we ignore the B′ term, a ≈ 0 simple motion would follow a circle in the xy plane at constant speed. Including the xB′ term would tighten the circle for x> 0 but loosen the circle for x< 0. Thus there should be a systematic drift parallel to the y-axis. To estimate this drift velocity, we write v = v0 +v1, where v0 is the solution for uniform field. Then
dv1 q [xB′v zˆ + v B zˆ]+ O(v B′) (495) dt ≈ m 0 × 1 × 0 1
We can identify the drift velocity as v1 , the time average over a cycle of the zeroth order motion. Averaging the above equationh givesi
[B′ xv + v B ] zˆ] 0 (496) h 0i h 1i 0 × ≈ Presuming the drift velocity is in the xy plane, we conclude
B′ vdrift xv0 (497) ≈−B0 h i The average involves the zeroth order solutions
x(t) = a cos ω t, y = a sin ω t, B − B vx = aω sin ω t, vy = aω cos ω t (498) 0 − B b 0 − B B with ωB = qB/mγ. We then see that
B′ 2 B B 2 v ⊥ drift a ωByˆ = ∇ ×2 a ωB (499) ≈ 2B0 − 2B
101 c 2010 by Charles Thorn
where we wrote the last form in a way that doesn’t depend on the choice of coordinate axes. The sign of ωB matters: it is the same as the sign of the charge. Adiabatic Invariants for slowly varying magnetic fields. I will just say a few words about the use of Adiabatic invariants to explain some interesting features of the motion if the gyrating particle has a component of velocity in the direction of the B field. The canonical formalism of classical mechanics draws attention to the action integrals
Ik = pkdqk (500) I over canonical coordinates that are periodic as the particle follows its trajectory. If some parameter λ in the Hamiltonian varies sufficiently slowly with time, i.e. λ˙ λ/T , then even after a long time T during which the Hamiltonian makes a finite change,≪ the values of the Ik will remain unchanged. An example is the helical motion of a particle about a magnetic field line which is periodic in the coordinates perpendicular to the field. The corresponding action integral is
I = (γmv + qA) dl =2πγmω a2 + q dSn B (501) · B · I Z where we used Stokes theorem. The normal n is in the direction such that the contour is in a counterclockwise sense looking down on it. The sign of the first integral is written assuming the velocity is in the same sense. This requires that the B is antiparallel to n. Thus
I = 2πγmω a2 qπa2B = πa2qB (502) B − Now consider a particle initially gyrating around a field line with a component of initial velocity parallel to the field in a direction of a slowly increasing field. If the field is static the particle’s energy γmc2 will be constant. The adiabatic invariant I will also be constant, so a will decrease as 1/√B. Meanwhile the transverse velocity v = aωB = aqB/γm will ⊥ increase as √B. Thus the parallel velocity will decrease as
2 2 2 2 2 B v = v0 v = v0 v0 (503) k − ⊥ − ⊥ B0 If B increases enough v will drop to 0 and the particle will be reflected back from regions k of strong magnetic field.
8.8 Electrodynamics of a Scalar Field Up until now we have developed electrodynamics by modeling the currents and charges with point particles. In such a description two very different types of physical entities, particles and fields must interact with each other in a consistent way. While this hybrid approach to electrodynamics can achieve some level of success, it engenders many nasty complications. A more coherent approach is to try describing all entities in terms of fields. This is possible
102 c 2010 by Charles Thorn
because quantization endows fields with particle-like properties. Indeed the fields of the standard model have the scope to describe all phenomena seen in experiment thus far. These fields include, in addition to electromagnetic fields, nonabelian gauge fields which mediate the weak and strong interactions, fermionic quark and lepton fields, and finally scalar fields which generate mass. We limit our field description of charged matter to scalar fields. We start with a free scalar field, φ(x). The only linear field equation, second order in 1 derivatives and consistent with Lorentz invariance, φ′(x′)= φ(Λ− x′), is
1 ∂2 ( ∂2 + µ2)φ = µ2 2 + φ =0 (504) − −∇ c2 ∂t2 An action which leads to this field equation is easy to write down: 1 S = dtd3x ( ∂ φ∂µφ µ2φ2) (505) φ 2 − µ − Z δS = dtd3x( ∂ δφ∂µφ µ2φδφ)= dtd3xδφ(∂2 µ2)φ =0 (506) φ − µ − − Z Z We have already considered such a field when we initially motivated Einstein’s modification of particle mechanics. The key was to notice that a packet of plane waves which have the 2 2 2 2 2 2 frequency wave number relation ω = c (k + µ ) have a group velocity vg = k/ k + µ , which with the Planck-Einstein connection p = ~k, E = ~ω, leads to the usual Einstein p modification of momentum p = γmv and E = m2c4 + p2c2. Now we explore the scalar field in its own right. p Before turning to electrodynamics let’s review some basics of Hamilton’s principle for fields in a more general setting. In a field theory the action dtL is a space time integral rather than just a time integral. We generally restrict the integrand, the Lagrangian density, R to be a function of the fields ψi and their space-time derivatives 1 S = d4x (ψ ,∂ ψ ) (507) c L i µ i Z ∂ ∂ ∂ ∂ δS = d4x δψ L + ∂ δψ L = d4x δψ L ∂ L i ∂ψ µ i ∂(∂ ψ ) i ∂ψ − µ ∂(∂ ψ ) " i i i µ i # i i µ i Z X X Z X where we have dropped a surface term. Thus in this general context the field equations following from δS = 0 are ∂ ∂ ∂µ L = L (508) ∂(∂µψi) ∂ψi The analogue of a particle Hamiltonian in field theory is the conserved canonical energy momentum tensor which is constructed as follows ∂ T µν = ∂µψ L + ηµν (509) − i ∂(∂ ψ ) L i ν i X 103 c 2010 by Charles Thorn
∂ ∂ ∂ T µν = ∂µ∂ ψ L + ∂µψ ∂ L + ∂µ ν − ν i ∂(∂ ψ ) i ν ∂(∂ ψ ) L i ν i ν i X ∂ ∂ = ∂µ∂ ψ L + ∂µψ L + ∂µ = ∂µ + ∂µ =0 (510) − ν i ∂(∂ ψ ) i ∂ψ L − L L i ν i i X For the simple scalar field theory we introduced above we have ∂ ∂ L = ∂νφ, L = µ2φ (511) ∂(∂νφ) − ∂φ − 1 T µν = ∂µφ∂νφ ηµν [∂ φ∂ρφ + µ2φ2] (512) − 2 ρ 1 T 00 = [φ˙2 +( φ)2 + µ2φ2 (513) 2 ∇ In this case T µν is symmetric in its indices, but the construction doesn’t guarantee that. µν 2 µ ν Another example is the free electromagnetic field, with = ǫ0FµνF /4= ǫ0c ∂µAν(∂ A ∂νAµ)/2. Then L − − −
∂ νρ ∂ L = ǫ0cF , L =0 ∂(∂νAρ) − ∂Aρ T µν = ǫ c∂µA F νρ + ηµν = ǫ F µ F νρ + ηµν + ǫ c∂ AµF νρ 0 ρ L 0 ρ L 0 ρ = ǫ F µ F νρ + ηµν + ǫ c∂ (AµF νρ) ǫ cAµ∂ F νρ (514) 0 ρ L 0 ρ − 0 ρ ǫ F µ F νρ + ηµν + ǫ c∂ (AµF νρ) (515) → 0 ρ L 0 ρ because of the equations of motion. The first two terms in the final form are symmetric under µ νρ µ ν. The last term satisfies ∂νǫ0c∂ρ(A F ) = 0 by virtue of the antisymmetry of F . It follows↔ that the first two terms are conserved by themselves. Furthermore the contribution of the last term to the total energy-momentum vanishes:
d3xǫ c∂ (AµF 0ρ)= d3xǫ c∂ (AµF 0k)= ǫ c dSn (AµE) 0 (516) 0 ρ 0 k 0 · → Z Z I µν µ νρ µν for localized fields. Therefore the improved energy momentum tensor TS = ǫ0F ρF +η is just as good as the canonical one: indeed it is better since it is both gauge invariant andL symmetric. It is also traceless (in 4 dimensions):
T µν = ǫ F µ F νρ + ηµν (517) S 0 ρ L T µ = ǫ F F µρ +4 =0 (518) Sµ 0 µρ L The reason that symmetric is better is that the simple intuitive construction of the angular momentum/boost tensor
µνρ = xµT νρ xνT µρ (519) M S − S
104 c 2010 by Charles Thorn
µνρ νµ µν is automatically conserved ∂ρ = TS TS = 0. If the canonical one were used one would need to discover an extraM term to make− it conserved. The densities of energy and momentum computed from TS are the familiar ones
2 00 2 ǫ0 2 2 2 ǫ0 2 ǫ0c 2 ǫ0 2 1 2 T = ǫ0E (E c B )= E + B = E + B (520) − 2 − 2 2 2 2µ0 T i0 = ǫ F i F 0ρ = ǫ F i F 0k = ǫ cǫ BlE = ǫ c(E B)i (521) 0 ρ 0 k 0 ikl k 0 × = c(D B)i = cgi (522) × 1 1 = (E H)i = Si (523) c × c the factors of c just reflecting the fact that T i0 has dimensions of energy density, whereas g has dimensions of momentum density and S has dimensions of energy flux. The next question is how to set up a consistent interaction between a scalar field and the electromagnetic fields. For starters we need to find a current J ν built out of the scalar ν field which is conserved. ∂νJ = 0. With a single real field one possibility would be ∂νφ, but then current conservation would require µ = 0. To have µ> 0 we need to make φ complex. Then we can form the combination J = iQ(φ∗∂ φ φ∂ φ∗). Now we check ν − ν − ν ν 2 2 2 2 2 ∂ J = iQ(φ∗∂ φ φ∂ φ∗)= iQ(µ µ ) φ =0 (524) ν − − − − | | by the field equation for φ. Consider the combination
ν ν ν ν 2 ν ∂ φ∗∂ φ + A J = (∂ φ∗ + iQA φ∗)(∂ φ iQA φ)+ Q A A φ∗φ (525) − ν ν − ν ν − ν
We can make the first term on the right gauge invariant under Aν Aν + ∂νΛ, provided φ eiQΛφ, because then → → (∂ iQ(A + ∂ Λ))eiQΛφ = eiQΛ(∂ iQA )φ (526) ν − ν ν ν − ν The last term is not gauge invariant so to make a gauge invariant Lagrangian we should drop it and just keep the first term. Thus the complete Action should be
1 4 ǫ0 ρν ν ν 2 1 4 S = d x F F (∂ + iQA )φ∗(∂ iQA )φ µ φ∗φ d x (527) c − 4 ρν − ν ν − − ≡ c L Z Z Any function of φ∗φ can be added to the integrand without disturbing gauge invariance or 2 Lorentz invariance. Thus we could replace µ φ∗φ with U(φ∗φ), so we write more generally − − ǫ0 ρν ν ν = F F (∂ + iQA )φ∗(∂ iQA )φ U(φ∗φ) (528) L − 4 ρν − ν ν − − ǫ0 ρν µ = F F (D φ)∗D φ U(φ∗φ) (529) − 4 ρν − µ − ν ν where we introduced the shorthand Dµ ∂ iQA for the gauge covariant derivative. The dimension of q = Q~ is Coulombs. To see≡ this,− simply recall from quantum mechanics that p (~/i) and p and qA have the same dimensions. ↔ ∇
105 c 2010 by Charles Thorn
We now need to rederive the field equations. We need:
∂ νρ ∂ ν ν ν ν ρ L = ǫ0cF , L = iQ[φ∗(∂ iQA )φ φ(∂ + iQA )φ∗] J ∂(∂νAρ) − ∂Aρ − − − ≡
∂ ν ν ∂ ν ν L = (∂ iQA )φ, L = (∂ + iQA )φ∗ (530) ∂(∂νφ∗) − − ∂(∂νφ) −
∂ ρ ρ ∂ ρ ρ L = iQAρ(∂ iQA )φ φU ′(φ∗φ), L = iQAρ(∂ + iQA )φ∗ φ∗U ′(φ∗φ) ∂φ∗ − − − ∂φ − Then the equations of motion are
νρ ρν ρ 2 ǫ0c∂νF = +ǫ0c∂νF = J , Jρ = iQ[φ∗∂ρφ φ∂ρφ∗] 2Q Aρφ∗φ ν− ν ρ ρ − − 2 − ∂ν(∂ iQA )φ = iQAρ(∂ iQA )φ + φU ′(φ∗φ), (∂ iQA) φ = φU ′(φ∗φ) (531) ν − ν −ρ ρ − 2 ∂ (∂ + iQA )φ∗ = iQA (∂ + iQA )φ∗ + φ∗U ′(φ∗φ), (∂ + iQA) φ∗ = φ∗U ′(φ∗φ) ν − ρ Notice that the last line is just the complex conjugate of the second line. Notice also the very interesting feature that the current Jρ has a linear term in Aρ. This arose from our consistent treatment of gauge invariance. It is easy to check that this current is gauge invariant. Finally for completeness we quote the improved symmetric energy momentum tensor whose derivation is assigned in the homework:
µν µ νρ µ µ ν ν ν ν µ µ T = ǫ F F +(∂ + iQA )φ∗(∂ iQA )φ +(∂ + iQA )φ∗(∂ iQA )φ S 0 ρ − − µν ǫ0 ρσ ρ ρ η F F +(∂ + iQA )φ∗(∂ iQA )φ + U(φ∗φ) (532) − 4 ρσ ρ ρ − h i In particular the energy density is given by
00 ǫ0 2 2 2 T = (E + c B )+(∂ + iQA )φ∗(∂ iQA )φ +( + iQA)φ∗ ( iQA)φ + U(φ∗φ) S 2 0 0 0 − 0 ∇ · ∇− 8.9 Lorentz Invariant Superconductivity: The Higgs Mechanism We have introduced the scalar field into our discussion of electromagnetism with two pur- poses in mind. The first was to give you a glimpse into the only known systematic way of consistently coupling charged matter to the electromagnetic field. However, to properly understand charged particles in terms of the excitations of a field–scalar or other wise– re- quires quantum mechanics: the quantum field has particle-like properties, as first realized by Einstein in his explanation of the photoelectric effect in terms of photons. The particles that make up atoms and molecules and indeed all materials must be understood in this way. One starts with quantum field theory, and then imagines a classical limiting description in which the particle aspects of the quantum field theory instead of the classical field/wave aspects dominate the approximation. Then one is back to studying the dynamics of charged point particles interacting with the electromagnetic fields. We shall use the classical scalar field to understand two related phenomena. The first is the Higgs mechanism, which provides a way to consistently give the photon a mass. Then we will use the insights gained to explain some aspects of superconductivity.
106 c 2010 by Charles Thorn
The way a scalar field can generate a mass for the photon can be seen from the behavior 2 of the current for constant nonzero φ φ0, Jρ 2Q Aρφ0∗φ0. In this limit Maxwell’s equations read → → −
µν 2 µ 2 ν 2 2 2 µ ǫ c∂ F +2Q A φ∗φ = ǫ c ∂ ∂ A +( ǫ c ∂ +2Q φ∗φ )A =0 0 ν 0 0 0 µ ν − 0 0 0 2Q2φ φ ∂ Aν = 0, ∂2 + 0∗ 0 Aµ =0 (533) ν − ǫ c2 0 which shows that the components of Aµ satisfy a massive wave equation with mass parameter 2 2 2 2 2 µ = 2Q φ0∗φ0/ǫ0c = 2µ0Q φ0 . So the question is: under what circumstances such a constant solution exists? Plugging| | a constant ansatz into the equation for φ at A = 0 shows 2 that one must have φ0U ′( φ0 ) = 0. That is φ0 > 0 must be an extremum of U. A simple | 2| 2 2 | | example would be U = λ( φ φ ) /4. If U ′ were a (positive) constant the only constant | | −| 0| solution would be φ = 0. It is true that φU ′ is also zero at φ = 0 in the example, but the value of U at φ = 0 is higher than at φ = φ0. Of course a finite difference in energy density translates to a total energy difference proportional to the volume of the system, a very large energy difference indeed! After finding a constant solution, we can then describe the excitations by writing φ = φ0 + h + ia and then expanding the action in powers of Aν,h. One can conveniently choose a = 0 as a gauge condition. Then
2 2 2 2 3 U = λ(h +2φ0h) /4= λφ0h + O(h ) ǫ = 0 F F ρσ Q2φ2A2 (∂h)2 λφ2h2 + O(A2h,h3) (534) L − 4 ρσ − 0 − − 0 ~ 2 ~2 2 2 4 So the theory describes a massive photon, with mass MA = µ/c or MA = 2 Q φ0/ǫ0c , 2 ~2 2 2 interacting with a massive scalar Mh = λφ0/c . The Higgs mechanism is important in elementary particle physics not to give a mass to the photon, for which there is no evidence, but rather to give mass to the gauge bosons that mediate the weak interactions, which are extremely short range. Indeed the effect of the mass term on the analogue of the Coulomb potential is dramatic. Instead of Poisson’s equation one has e µr ( 2 + µ2)φ = δ(r), φ = − (535) −∇ 4πr The weak interaction vector bosons have a mass roughly 100 times the proton mass, which 16 corresponds to µ 1/(10− cm)! A particle associated with the scalar field h was discovered in 2012. Finding the∼ “Higgs” particle was one of the goals of the large hadron collider (LHC) still running at CERN in Geneva. From the quartic potential model for U the Higgs (mass)2 2 ~2 2 2 ~ is MH = λφ0/c compared to the vector boson mass (Note that Q = e/ , where e is 2 2 2 4 2 2 ~2 2 2 ~ − the electron charge.) MZ = 2e φ0/ǫ0c . The ratio MH/MZ = λǫ0 c /2e = λ c/8πα was not determined by the properties of any of the particles previously observed. But notice that if MH > 10MZ, λ~c/8π > 100α = O(1), and the electroweak theory would cease to be perturbative. Fortunately for theoretical physics, MH waas measured to be only 125 GeV 1.4M so λ~c/8π 2α 1. ≈ Z ≈ ≪
107 c 2010 by Charles Thorn
In condensed matter physics, the Higgs mechanism is in essence a model of supercon- ductivity, which is caused by the condensation, at low enough temperature, of pairs of electrons (Cooper pairs) which are quasi bound states with total spin S = 0 and total charge q = Q~ = 2e. In this case the scalar field provides a phenomenological description of the − 2 4 Cooper pair condensate. The energy density difference U(0) U(φ0) = λφ0/4 models the energy gain when the pairs condense, which is reflected in the−scalar field acquiring a finite background value φ0 = 0. In this application the scalar field is only present in the material. The induced current6 in the material, in the approximation that the scalar field is just its 2 2 uniform background value, is just Jρ 2Q φ0Aρ. Its conservation in this approximation ≈ −ρ specifies the Lorenz gauge condition ∂ρA = 0. We can relate this form for the current to the effective magnetic permeability of the superconducting material. Recall that for a static situation, the magnetization M is related to the bound current by J = M. Thus for our model superconductor we can write ∇× ( M) = 2Q2φ2 A ∇× ∇× − 0∇× 2M + ( M) = 2Q2φ2B (536) −∇ ∇ ∇· − 0 1 1 Remembering that M = B/µ0 H = (µ0− µ− )B. This means that in Fourier space, − 1 − 1 2 2 2 2 2 2 assuming for simplicity that M =0, (µ0− µ− )= 2Q φ0/k , or µ(k )= µ0k /(k + 2 2 ∇· − − 2µ0Q φ0). In particular µ 0 for k = 0, i.e. for a uniform applied field. This is why we say that a superconductor→ is a perfect diamagnet. The fact that µ = 0 for finite k means that a spatially varying magnetic field would be allowed to exist to some6 extent within the superconductor. Notice that µ µ0 at very short wavelength (k ). To understand this application→ a little better, let’s consider→ the ∞ interface between a “superconductor” filling the half space z < 0, with a static uniform electric or magnetic field in otherwise empty space for z > 0. The static vector potential satisfies
( 2 + M 2)Aµ = 0, z< 0; 2Aµ =0, z> 0 (537) −∇ −∇ At the interface z = 0 we require that Aµ and ∂Aµ/∂z be continuous. A solution with Aµ(z) a function of z only is easily found:
Aµ = aµeMz, z< 0; Aµ(z)= aµ(Mz + 1), z> 0 (538)
the Lorenz gauge condition requires a zˆ = 0. With this solution the uniform fields for z > 0 are E = cMa0zˆ and B =z ˆ · a. Thus they automatically fulfill the conditions − × Et = 0, Bn = 0 expected of a perfect conducting diamagnet. However, with M finite, 1 the fields penetrate into the superconductor, a distance M − called the London penetration depth. The mass of the scalar determines another length scale characterizing the “size” of the scalar particle. In conventional superconductivity this would be the size of the Cooper pair which is actually quite large compared to the atomic size. This second length scale is called the coherence length.
Meissner Effect
108 c 2010 by Charles Thorn
The expulsion of magnetic field from the interior of a superconductor is the Meissner effect. At a rough level postulating that µ = 0 in superconducting material accounts for the effect. We have just seen that interpreting the effect in terms of an effective photon mass incorporates the more subtle effect of a finite penetration depth for the magnetic field. But so far we have not allowed the scalar field to vary spatially, though the field equations certainly allow it. Superconductivity occurs when φ = φ0, but we can imagine a region in which superconductivity is destroyed because φ 0 in that region. The energy cost of destroying superconductivity in a volume V is ∆E =[≈ U(0) U( φ 2)]V λφ4V/4 for our − | 0| → 0 quartic model. This energy difference determines a critical magnetic field Hc via 1 µ H2 U(0) U( φ 2). (539) 2 0 c ≡ − | 0|
For H >Hc in a region of superconductor, there is enough magnetic energy to drive that region of the superconductor normal. Of course, at the boundary of this region, the field φ should smoothly vary from 0 to φ0, so there will be some surface energy arising from the ( φ)2 term in the energy. To go further we need to look to the field equations. ∇ Magnetic Flux tubes Let us consider static solutions (φ˙ = A˙ = 0) of the equations of motion, with E = 0 (A0 = 0). In this case the energy density of the system is proportional to the Lagrangian density, so Hamilton’s principle for static solutions is equivalent to extremizing the total energy, the density of which reduces to
00 ǫ0 2 2 λ 2 2 T c B +( + iQA)φ∗ ( iQA)φ + (φ∗φ φ ) (540) → 2 ∇ · ∇− 4 − 0 where we have assumed the simple quartic model for U. A very interesting solution describes a magnetic flux vortex tube which penetrates the superconductor. This solution describes a real physical phenomenon in which magnetic fields penetrate a finite sized superconductor through narrow tubes in which the superconductor has been driven normal. To make a tractable problem we assume that the superconductor fills all of space and that the vortex tube is centered on the z-axis and extends from z = to z =+ . All the fields will be assumed independent of z, and we will be interested in−∞ the energy per∞ unit length. Since we are looking for a static z-independent solution it is enough to extremize the energy per unit length for the axially symmetric situation you worked out in Problem Set 4:
2 2 ∞ 2 [(ρA)′] 2 m 2 λ 2 2 2 T =2π dρ ǫ c + ρf ′ + ρ QA f + ρ (f φ ) (541) 0 2ρ ρ − 4 − 0 Z0 " # where A = A(ρ)φˆ and φ = f(ρ)eimϕ. As you showed, such a potential produces a magnetic field B =[ρA]′/ρ. In Set 5 I have asked you to find an estimate for T based on the variational principle. Here we will discuss the necessary conditions on the fields at ρ 0, . Since each of the → ∞
109 c 2010 by Charles Thorn
four terms in the integrand is positive, there are 4 independent integrals that must be finite. At ρ we must have f φ and QA m/ρ. By Stokes theorem, the last behavior → ∞ → 0 ∼ shows that the magnetic flux ΦB = dl A = 2πm/Q. This is why we call this solution a magnetic flux tube. In order for the first· integral to converge at ρ = 0 you will show that H necessarily A 0 as ρ 0. If this is so the convergence of the third term requires f(0) = 0. Thus on the axis→ of the→ flux tube the superconductor has been driven normal. If all these conditions are met T will be finite. It then remains to find its minimum for each integer value m. Clearly if m = 0 the minimum is for f = φ0,A = 0 for which T = 0. This is just the ground state of the superconductor. For m = 0 the minimum is for a flux tube with total flux 2πm/Q. Without going any further we know6 a lot about the solution. As ρ varies from 0 to f rises from 0 to φ0. Meanwhile B(ρ) falls from some nonzero value to zero. Your exercise∞ is to flesh out these qualitative conclusions with a concrete variational approximate solution.
110 c 2010 by Charles Thorn
9 Propagation of Plane waves in Materials
9.1 Oscillator model for frequency dependence of a dielectric The dielectric constant and magnetic permeability describe the response of a material to applied fields. As such they depend on two space-time points: the location and time the field is applied and the location and time the response is measured. For example, in full generality we should write, in the linear response approximation,
k 4 1 j k 4 j E (x)= d yǫkj− (x,y)D (y), B (x)= d yµkj(x,y)H (y) (542) Z Z We shall generally be interested in homogeneous and isotropic materials, in which case 1 ǫkj− = δkjǫ and µkj = δkjµ will depend only on x y, but the relationship will still, in general, be non-local in spacetime −
4 1 4 E(x)= d yǫ− (x y)D(y), B(x)= d yµ(x y)H(y) (543) − − Z Z Convolution integrals of this type can be diagonalized by simple Fourier transforms: 4 4 d k ik x d k ik x E(x) = E˜ (k)e · , D(x)= D˜ (k)e · (2π)4 (2π)4 Z 4 Z 1 d k 1 ik (x y) ǫ− (x y) = ǫ˜− (k)e · − (544) − (2π)4 Z and similarly for B, H,µ. Then we have
1 E˜ (k)=˜ǫ− (k)D˜ (k), B˜ (k)=˜µ(k)H˜ (k) (545) We shall not attempt a serious treatment of the theory of matter in this course. Instead we will use simplified models. In an insulator, all of the charged particles are locally bound within atoms and molecules. The main qualitative fact we need to account for is that if an electron is pulled by the action of an electric field, say, the binding mechanism supplies a restoring force. For many purposes an adequate model for the restoring force is a damped harmonic oscillator force. Then the equation of motion for the electron in an applied electric field, assuming nonrelativistic motion, would be e r¨ + γr˙ + ω2r = E(r,t) (546) 0 −m iωt We make the simplifying assumption that E E0e− , so that the steady state solution 2 2 1 ≈ is r = ( e/m)(ω ω iγω)− E. If there are N molecules per unit volume and each − 0 − − molecule has Z electrons fj of which have binding frequency ωj and damping parameter γj, so 2 2 2 1 f = Z, then the dipole moment density will be P =(e /m)N f (ω ω iγ ω)− E. j j j j j − − j Then D = ǫ0E + P ǫE implies that P ≡ P 2 2 2 2 e N f e N fj(ω ω + iγjω) ǫ(ω)= ǫ + j = ǫ + j − (547) 0 m ω2 ω2 iγ ω 0 m (ω2 ω2)2 + γ2ω2 j j j j j j X − − X − 111 c 2010 by Charles Thorn
The essential structure of the frequency dependence given by this simple minded model is a feature of more serious quantum mechanical calculations, where the ωj,fj,γj are effective quantum mechanical parameters. The resonance behavior given by the denominators is an ubiquitous feature of metastable excited states in quantum mechanics. ikz iωt The consequences of such frequency dependence for a plane wave e − propagating in the dielectric can be read off from the relation k = ω√ǫµ (ω/c) ǫ(ω)/ǫ0. Since ǫ is complex, the wave number will also be complex k = β + iα/2:→ p 2 2 2 2 α ω ǫ ω ǫ β = 2 Re , αβ = 2 Im (548) − 4 c ǫ0 c ǫ0 2 2 2 e N fj(ω ω ) Re ǫ = ǫ + j − (549) 0 m (ω2 ω2)2 + γ2ω2 j j j X − e2N f γ ω Im ǫ = j j (550) m (ω2 ω2)2 + γ2ω2 j j j X − Thus the imaginary part of ǫ, which is manifestly positive, implies an attenuation factor αz/2 e− multiplying the phase of the plane wave. This represents a loss of energy from the plane wave to the medium. Since the model result for Im ǫ is proportional to the damping coefficients of the oscillator, this outcome makes good sense. For ω < ωj for all j, Re ǫ is manifestly larger than ǫ0, corresponding to a n index of refraction larger than 1. Assuming none of the ω vanish, limit ω 0 reads j → e2N f ǫ(0) = ǫ + j (551) 0 m ω2 j j X which is characteristic of an insulator. However as ω increases past the resonant frequencies, more and more terms become negative, until eventually ǫ becomes less than ǫ0. Far above the largest ω , ǫ ǫ (1 ω2/ω2) where the plasma frequency is given by ω2 = N e2/ǫ m, j ∼ 0 − p p e 0 where Ne = NZ is just the total number density of electrons in the material. For such high frequencies, the restoring forces have no time to act and all electrons are effectively free. For insulating materials, this characteristic form only applies for ω ωp, so that the index of refraction is only slightly less than 1. ≫
9.2 Conductivity The simple resonance model discussed above can be used as a model of conductivity if one of the ωj, say ω0 = 0. Then separating their contribution from the rest gives
e2N f e2N f e2Nf ǫ(ω) = ǫ + j 0 ǫ (ω) 0 0 m ω2 ω2 iγ ω − mω ω + iγ ≡ b − mω(ω + iγ ) j j j 0 0 X − − e2N free i e , as ω 0 (552) ∼ mωγ0 →
112 c 2010 by Charles Thorn
free where we have identified Nf0 Ne as the number density of free (i.e. unbound) electrons. To interpret this behavior, notice≡ that the displacement current e2N free ∂D/∂t iωD˜ = iωǫ(ω)E˜ = iωǫ (ω)E˜ + e E˜ (553) →− − − b m(γ iω) 0 − The last term can be interpreted in the zero frequency limit as Ohm’s law e2N free e2N free e E˜ σE˜ , σ = e (554) m(γ iω) → mγ 0 − 0 So this model reproduces Ohm’s law in the quasi-static limit. In the formula for the con- ductivity, 1/γ0 can be interpreted as the mean time τ that an electron travels freely between collisions with impurities or lattice perturbations. Then the average drift velocity in an free 2 free electric field would be vd = eEτ/m, so J = eNe vd =(e Ne τ/m)E σE. We can write the dielectric− constant for a− conductor in terms of the effective≡ plasma frequency ω2 e2N free/mǫ , p ≡ e 0 2 free 2 e Ne ωp = (555) mǫ0 2 ωp ǫ(ω) = ǫb(ω) ǫ0 (556) − ω(ω + iγ0)
Since ωp is typically many orders of magnitude larger than γ0, there is a range of frequencies for which the second term is real negative and large. In that range the dielectric constant is negative and the wave number k = ω ǫ/ǫ0/c is imaginary meaning that em waves cannot enter the conductor. In this frequency range light incident on such a metal is completely p reflected. However at some critical frequency ǫ(ωc) = 0 and above that value the metal starts to transmit radiation reducing the reflectivity drastically.
9.3 Plasmas and the Ionosphere An ionized gas is called a plasma. For a dilute (tenuous) plasma the charged particles are essentially free with negligible damping. In our resonance model none of the ωj = 0 appear and the formula collapses to 6
2 2 ǫ(ω) ωp ωp = 1 1 2 (557) ǫ0 − ω(ω + iγ0) ≈ − ω where the approximate form neglects damping. This is of course the high frequency behavior for all materials, but for a tenuous plasma it is also valid for ω <<ωp for which ǫ< 0. Then
ǫ 2 2 ωp kc = ω = ω ωp iωp, α =2 (558) ǫ0 − → c r q 1 In typical laboratory plasmas α− is .002 cm to .2 cm.
113 c 2010 by Charles Thorn
2 2 2 2 If we rewrite the relation between frequency and wave number in the form ω = ωp +k c , we see that the group velocity of a wave packet propagating in a plasma will be vg = 2 kω(k) = c k/ω(k). Traveling waves with frequency ω > ωp are supported by a plasma, ∇and they travel at a speed less than light. Oscillations at the plasma frequency can exist when k = 0, i.e. these disturbances are at rest in the plasma.
Plasma in a Magnetic field In the ionosphere the presence of the earth’s magnetic field alters the propagation of waves in an interesting way. To see how we return to the equation of motion for a particle in a background uniform magnetic field in addition to the em wave:
iωt mr¨ = er˙ B eEe− − × 0 − mω2r = ieωr B eE (559) − 0 0 × 0 − To solve this equation first note that dotting both sides with B yields r B = eE B /mω2. 0 0· 0 · 0 Then taking the cross product of both sides with B0 gives;
mω2r B = ieω(B r B B2r ) eE B − 0 × 0 0 0 · 0 − 0 0 − × 0 = ieω(B eE B /mω2 B2r ) eE B 0 · 0 − 0 0 − × 0 Plugging this back into the first equation then yields
mω2 − (eE mω2r ) = ieω(B eE B /mω2 B2r ) eE B ieω − 0 0 · 0 − 0 0 − × 0 mω2 mω2 ie2ω (mω2r )+ ieω(B2r ) = eE + B (E B ) eE B ieω 0 0 0 ieω mω2 0 · 0 − × 0
Introduce ωB = eB0/m and find e e ie (ω2 ω2 )r = E (ω E ω ) E ω − B 0 m − mω2 B · B − mω × B ω2 ω ω ω P r p E E B B B E = eN 0 = ǫ0 2 2 i (560) − − ω ωB − × ω − ω ω · − D = ǫ0E + P ω2 ω ω ω E p E E B B B E = ǫ0 2 2 i (561) − ω ωB − × ω − ω ω · − This is an example where the effective dielectric constant is a matrix, not just a number. We next use Maxwell’s equations to determine how a plane wave propagates through this medium. Plugging Faraday’s law for a plane wave B = k E/ω into the Ampere-Maxwell equation yields ×
k (k E) = ω2µ D × × − 0 ω2 ω2 ω ω ω 2 E ˆˆ E E p E E B B B E k ( kk ) = 2 2 2 i (562) − · c − ω ωB − × ω − ω ω · − 114 c 2010 by Charles Thorn
The right side of this equation dotted into kˆ must be zero, which puts a constraint on components of the electric field:
ω2 ω ω ω ˆ E p E E B B B E 0 = k 2 2 i · − ω ωB − × ω − ω ω · − ω2 ω kˆ ω ω = E kˆ p kˆ i B kˆ · B B (563) · − ω2 ω2 − ω × − ω ω " − B !# (When B = 0 this is simply the constraint that E is perpendicular to the propagation direction. But when B = 0 it causes complications in the solution for general k, ωB. We consider two simpler6 cases. First consider propagation parallel to the magnetic field, kˆ = ωB/ωB. Then it easily follows (assuming ω = ωp) that the constraint becomes k E = 0, and the equation reduces to 6 ·
ω2 ω2 ω k2E E p E B E ˆ = 2 2 2 i k (564) c − ω ωB − ω × − We want to find two (necessarily complex) solutions for E such that E kˆ is a multiple of E. Taking k in the z-direction, one notes that (ˆx iyˆ) zˆ = i(ˆx × iyˆ) so that the eigenvectors are the circularly polarized states and the± relation× between± k ±and ω is different for each:
2 2 2 2 ω ωp ωB ω ǫ ∓ k = 2 1 2 2 1 2 (565) ∓ c − ω ωB ± ω ≡ c ǫ0 −
2 xˆ iyˆ ωp E = ± , ǫ = ǫ0 1 (566) ∓ √2 ∓ − ω(ω ωB) ∓ Notice that the relative phase of the two polarization states will not be maintained in space. In particular, if E starts out at z = 0 linearly polarized, say in the x direction, then at z = 0, we have 6 E E = [(ˆx + iyˆ)eiωn−z/c +(ˆx iyˆ)eiωn+z/c]= Eeiω(n++n−)z/2c[ˆx cos θ(z)+ˆy sin θ(z)] (567) 2 −
where θ(z) = ωz(n+ n )/2c. Thus the direction of linear polarization rotates with z − (Faraday rotation). − The second simplifying assumption is to take a general propagation direction but assume ω ω. Then the effect of the magnetic field is small, and it is appropriate to perturb B ≪ around B = 0. With k = kzˆ, the two zeroth order solutions are E0 = (ˆx iyˆ) with ∓ degenerate k2 =(ω2 ω2)/c2. Then the first order equation reads ± 0 − p 2 ω2 ω 2 2 2 ˆˆ 2 ω p B (k k0)E0 + k0(E1 kk E1 ) = k0E1 + iE0 − ∓ ∓ − · ∓ ∓ c2 −ω2 − ∓ × ω 115 c 2010 by Charles Thorn
Taking the scalar product of both sides withx ˆ iyˆ, the E terms drop out, and one finds ∓ 1 ω2 ω ω2 ω 2(k2 k2) = p (ˆx iyˆ) i(ˆx iyˆ) B = i p [(ˆx iyˆ) (ˆx iyˆ)] B − 0 c2 ∓ · ± × ω c2 ∓ × ± · ω ω2 zˆ ω ω2 ω2 ω2 zˆ ω k2 = k2 p · B = 1 p p · B (568) 0 ∓ c2 ω c2 − ω2 ∓ ω2 ω ω2 zˆ ω ǫ = 1 p 1 · B (569) ∓ − ω2 ± ω which agrees with the first case when the magnetic field is in the z-direction.
9.4 Group Velocity The frequency dependence of ǫ leads to a frequency dependent group velocity as we have already discussed in the context of wave packets. In our previous discussion our packets were treated as superpositions of a narrow band of wave numbers with the frequency a function of k. Then the formula for the group velocity is vg = kω(k). In our discussion of frequency dependence we have presented the wave number as a∇ function of frequency kc = ωn(ω) where n = ǫµ/ǫ0µ0 is the index of refraction. Then we can express the group velocity in terms of frequency derivatives: c = v (n + ωdn/dω) or p g c v = (570) g n + ωdn/dω
An interesting example is the ǫ mode of a plasma with magnetic background field. We have − in the low frequency limit
ωp dn ωp 2√ωωB n , n + ω − , vg = c (571) − ≈ √ωωB − dω ≈ 2√ωωB ωp
Lower frequency components travel more slowly, so a broad frequency pulse will be received smeared out in time with the later arrivals lower in frequency (whistlers). In our resonance model of dispersion (i.e. frequency dependent index of refraction), ǫ and hence n increase with frequency except for narrow windows surrounding the resonance locations ωj. This is called normal dispersion and from the formula for the group velocity we see that vg < c/n for normal dispersion. However, as one passes through a resonance, the index of refraction starts falling, and for narrow resonances there is a narrow range where the decline is rapid, dn/dω is large and negative (anomalous dispersion). In a region of anomalous dispersion vg can get large, even larger than the speed of light–it can even become negative! There is no violation of causality though. The interpretation of vg as the speed of a signal requires that the frequency width of the packet is much smaller than the scale over which the integrand varies. But in regions of anomalous dispersion the integrand is rapidly varying so the width has to be extremely tiny corresponding to a huge spread in the duration of the packet in time and a corresponding fuzziness in the location of any wave
116 c 2010 by Charles Thorn
front. To establish an actual signal velocity there must be a sharp wave front, which requires a broad range of frequencies. The regions of anomalous dispersion contribute very little to such a broad packet. The variation in the group velocity with frequency produces a spreading of the wave packet. The current homework assignment includes a problem where this can be explicitly seen using a Gaussian wave packet. But the phenomenon is inevitable because a band of 2 2 wave numbers ∆k has a spread of group velocities ∆vg = ∆kd ω/dk , and an additional contribution to the width in space of ∆ktd2ω/dk2. The initial width is at best 1/∆k, so the width after a time t is at least
1 d2ω 2 ∆x(t)= +∆k2t2 (572) ∆k2 dk2 s This matches the behavior you found for a Gaussian wave packet. Notice that the narrower the band of wave numbers, the longer it takes for the spreading to take effect. Of course at the same time the initial width of the packet is correspondingly large. In scattering experiments the only practical limit on the size of the wave-packets is the size of the apparatus.
9.5 Causality and Dispersion Relations As we have mentioned the relation between D and E in the linear response approximation is a nonlocal one of the form
3 D(r,t)= ǫ E(r,t)+ ǫ d r′dt′G(r r′,t t′)E(r′,t′) (573) 0 0 − − Z where we have further assumed translation invariance in space and time. Here G is the Fourier transform of the wave number and frequency dependent dielectric constant
3 d qdω iq r iωt ǫ G(r,t)= (ǫ(q, ω) ǫ )e · − (574) 0 (2π)4 − 0 Z The physical interpretation is that a field applied at r′,t′ induces a response at r,t. In order for causality to be respected, the response cannot occur unless enough time has elapsed for a light signal to travel from r′ to r. That is
G(r r′,t t′)=0, unless c(t t′) > r r′ (575) − − − | − | Considering the formula for ǫ
ǫ(q, ω) 3 ∞ iq r+iωt 1= d r dtG(r,t)e− · (576) ǫ0 − r /c Z Z| | we see that the dependence on ω can be extended into the upper half complex plane because the iωt term in the exponent acquires a negative real part, exponentially improving the con- vergence of the t integral. Viewed as an analytic function of ω, there will be no singularities
117 c 2010 by Charles Thorn
in the upper half plane. Singularities in the lower half plane are allowed by causality, because then iωt has a positive real part causing worse convergence of the t integration. In the remaining discussion we ignore the q dependence. Assuming ǫ is independent of q implies that the response is spatially local, i.e.
dω iωt ǫ G(r,t)= (ǫ(ω) ǫ )e− δ(r) ǫ G(t)δ(r) (577) 0 (2π) − 0 ≡ 0 Z This will be valid if the distortion of the system due to the applied field is over a region small compared to the scale over which the field varies appreciably. In the case of a wave packet this scale is the typical wavelength. Let us now recall our resonance model for the frequency dependence of ǫ. Each resonance contribution had the form f j (578) ω2 ω2 iγ ω j − − j which has pole locations
γ γ2 ω = i j ω2 j (579) − 2 ± s j − 4
and we see that these poles are in the lower half plane for positive γj. So in this model causality demands γj > 0. Of course in our model positive γj corresponds to a damped as opposed to a runaway oscillator. But it is nice to know that this sort of unphysical runaway behavior is forbidden in models that simply respect causality. Analyticity is a very powerful mathematical fact about functions. This is manifested by Cauchy’s theorem, dz f(z ) f(z)= ′ ′ (580) 2πi z z IC ′ − where C is a closed curve within a simply connected region of analyticity for f(z). Mathematical Detour; We can understand Cauchy’s theorem in terms of known features of two dimensional elctrostatics. Write z = x + iy and f = f1 + if2. Then dzf(z)= dxf dyf + i(dxf + dyf ) dl f + i(dl f˜) (581) 1 − 2 2 1 ≡ · · where the two-vector f =(f , f )T and f˜ =(f ,f )T . Then by Stokes theorem 1 − 2 2 1 dzf(z) = dA [ f] + i[ f˜] (582) ∇× z ∇× z I Z But if f(z)= f(x + iy) is analytic it satisfies ∂f ∂f i = ∂x ∂y ∂f ∂f ∂f ∂f i 1 2 = 1 + i 2 (583) ∂x − ∂x ∂y ∂y
118 c 2010 by Charles Thorn
which are the Cauchy-Rieman equations: ∂f ∂f 1 = 2 ∂x ∂y ∂f ∂f 2 = 1 (584) − ∂x ∂y
They easily imply that f1 and f2 each satisfy the 2D Laplace equation. They also imply
f = ∂ ( f ) ∂ f =0 ∇× x − 2 − y 1 f˜ = ∂ (f ) ∂ f =0 (585) ∇× x 1 − y 2 which together give the Cauchy theorem
dzf(z)=9 (586) IC for any closed curve C within a simply connected domaain of analyticity. The Cauchy formula follows by first using the Cauchy theorem to deform any closed curve that encloses the point z′ = z to an infinitesimal circle Cδ of radius δ centered on the point z′ = z. Then dz f(z ) dz f(z ) ′ ′ = ′ ′ 2πi z z 2πi z z IC ′ − ICδ ′ − 2π idθδeiθf(z) = f(z) (587) → 2πiδeiθ Z0 Dispersion Relations for Dielectrics Because ǫ(ω) is analytic in the whole upper half complex ω plane, we can pick C to follow the real axis and close on a large semicircle at ω = . The behavior of ǫ(ω) at large ω depends on the behavior of G(t) as t 0. ∞ →
ǫ(ω) ∞ iωt 1 ∞ d iωt iG(0) 1 ∞ iωt 1 = dtG(t)e = dtG(t) e = dtG′(t)e ǫ − iω dt ω − iω 0 Z0 Z0 Z0 iG(0) G (0) iG (0) = ′ + ′′ + (588) ω − ω2 ω3 ··· It is reasonable to assume a smooth turn-on of G at t = 0 in which case the first term is 2 absent and then ǫ ǫ0 +O(ω− ). at large ω. This is easily fast enough that the contribution from the semicircle→ at infinity to the Cauchy integral vanishes and we have
ǫ(ω) ∞ dω ǫ(ω ) 1= ′ ′ 1 , Im ω > 0 (589) ǫ0 − 2πi(ω′ ω) ǫ0 − Z−∞ − We want to take ω to the real axis, but we have to be careful about the resulting singularity in the integrand at ω′ = ω. If there is finite conductivity there is also a pole in ǫ(ω′) at ω′ = 0, ǫ iσ/ω. This pole was not enclosed by the original Cauchy contour C, so when we take the contour∼ to the real axis, it must pass above the conductivity pole. We can handle
119 c 2010 by Charles Thorn
these singularities by deforming the contour near ω′ = ω into a semicircle of radius δ about ω into the lower half plane (lhp). Similarly we deform the contour near ω′ = 0 into a semicircle of radius δ about 0 into the upper half plane (uhp). Then the contour follows the real axis from to δ, a semicircle in the uhp from δ to +δ, the real axis from +δ to ω δ, a semicircle−∞ in the− lhp from ω δ to ω + δ and then,− finally, the real axis from ω + δ to− + . The real axis contribution, in− the limit δ 0 is what is traditionally called the Principal∞ Part. The semicircle contributions are just→ half the full residues of the respective poles. The residue of the pole at 0 contributes negatively because the contour in that case is clockwise.
ǫ(ω) ∞ dω ǫ(ω ) 1 ǫ(ω) iσ 1 = P ′ ′ 1 + 1 + (590) ǫ0 − 2πi(ω′ ω) ǫ0 − 2 ǫ0 − 2ǫ0ω Z−∞ − ǫ(ω) iσ ∞ dω ǫ(ω ) = 1+ + P ′ ′ 1 (591) ǫ0 ǫ0ω πi(ω′ ω) ǫ0 − Z−∞ − Taking the real and imaginary parts of both sides leads to
ǫ(ω) ∞ dω ǫ(ω ) Re = 1+ P ′ Im ′ (592) ǫ0 π(ω′ ω) ǫ0 Z−∞ − ǫ(ω) σ ∞ dω ǫ(ω ) Im = P ′ Re ′ 1 (593) ǫ0 ǫ0ω − π(ω′ ω) ǫ0 − Z−∞ − These dispersion relations are also known as Kramers-Kronig relations after the physicists who first introduced them. They reflect the consequences of a few very basic and fundamental physical principles. For real ω, it follows from the reality of G(t) that ǫ(ω)∗ = ǫ( ω). Hence Re ǫ is even in ω, whereas Im ǫ is odd. Thus one can write the dispersion integrals− with range 0 <ω< . ∞ ǫ(ω) ∞ dω 2ω ǫ(ω ) Re = 1+ P ′ ′ Im ′ (594) ǫ π ω 2 ω2 ǫ 0 Z0 ′ − 0 ǫ(ω) σ ∞ dω 2ω ǫ(ω ) Im = P ′ Re ′ 1 (595) ǫ ǫ ω − π ω 2 ω2 ǫ − 0 0 Z0 ′ − 0 9.6 Causal Propagation To set up a test of causal wave propagation, we imagine a wave with a sharp wave front timed to arrive, at normal incidence, at the surface of a dispersive material, located at x = 0, at time t = 0. For t< 0, the wave is entirely in the region x< 0 and can be written
iωx/c iωt Incident wave : ψ(x,t) = dωA(ω)e − , x< 0. (596) Z A(ω) = dtψ(0,t)eiωt (597) Z Because ψ(0,t)=0 for t< 0, A(ω) can be taken to be analytic in the upper half ω plane.
120 c 2010 by Charles Thorn
For x > 0 we have, from the normal incidence refraction formula At(ω)=2A(ω)/(1 + n(ω)),
2 iωn(ω)x/c iωt Refracted wave : ψ(x,t)= dω A(ω)e − , x> 0 (598) 1+ n(ω) Z The analyticity in the uhp of A, ǫ and n, means that the integrand for x> 0 is analytic in the uhp. If the integrand vanishes rapidly enough as ω , we can close the contour. To decide → ∞ iωn(ω)x/c iωt iω(x/c t) when we can do this we need the fact that n 1 as ω , so that e − e − . Thus we can close the contour in the uhp when→ x>ct→. Then∞ analyticity in the uhp→ implies, via Cauchy’s theorem, that ψ(x,t)=0 for t < x/c. That is, no signal can propagate faster than light.
121 c 2010 by Charles Thorn
10 Waveguides and Cavities
10.1 The approximation of perfect conductors In our study of waveguides and cavities, we shall make the simplifying assumption of perfect conductivity. That is time varying fields (ω = 0) are strictly zero in the body of the conductor and surface currents and charge densities are6 instantly adjusted to maintain zero fields when fields just outside the conductor are non-zero. Then Maxwell’s equations imply for the fields just outside the conductor the boundary conditions
n B =0, E =0, n D =Σ, n H = K (599) · k · × k where Σ is the free surface charge density and K is the free surface current density. To appreciate when this assumption is valid we briefly recall the behavior of time-varying fields near the boundary of conductors with large but finite σ in the quasistatic approxi- mation, where the displacement current is neglected. Then, with J = σE, the Maxwell iωt equations read (assuming time dependence e− in all fields)
B = µσE, E = iωB, B =0, E =0 (600) ∇× ∇× ∇· ∇· 2i 1 i 2 2B = iµσωB B = − B (601) −∇ ≡ δ2 − δ Suppose we have a non-zero magnetic field parallel to the surface just outside the conductor. Then instead of falling abruptly to zero just inside the conductor, the field drops off expo- (1 i)ξ/δ nentially B(ξ) = B(0)e− − , where ξ is a coordinate that measures the normal depth into the conductor. The skin depth δ = 2/µσω, which measure how deeply the fields penetrate, goes to zero as σ . Here we are assuming that over a scale δ the fields p are essentially constant in the→ directions ∞ parallel to the surface of the conductor. In this situation the corresponding electric field and current density are
1 1 i (1 i)ξ/δ E = B = − n B(0)e− − µσ ∇× µσδ ×
ω (1 i)ξ/δ = (1 i)n B(0)e− − (602) 2µσ − × r ωσ (1 i)ξ/δ ∞ B(0) J = (1 i)n B(0)e− − , K = dξJ = n (603) 2µ − × × µ r Z0 where n is the outward normal to the conductor. Notice that at ξ = 0 the electric field is tangential to the surface and small. (For a perfect conductor the tangential component of E would be 0.) This surface tangential field is simply proportional to the effective surface current ωµ 1 i E = (1 i)K = − K (604) k 2σ − σδ r
122 c 2010 by Charles Thorn
The proportionality constant has the units of resistance and is called the surface impedance Zs = (1 i)/σδ. Similarly Faraday’s law shows that there would be a small normal component to B at− the surface. The last expression for the effective surface current matches that for the boundary condi- tion at a perfect conductor. We see that the effect of finite conductivity is to spread out the abrupt transition for a perfect conductor to a smooth but rapid transition over a distance of the skin depth. Finite conductivity is also responsible for Ohmic losses. These can be evaluated either via the time averaged Poynting vector at the surface Power 1 = n S = n Re E B∗ Area ·h i 2µ · × 1 ω = n Re (1 i)(n B(0)) B∗(0) (605) 2µ 2µσ · − × × r ωδ ωµδ 1 = B(0) 2 = H(0) 2 = H(0) 2 (606) −4µ| | − 4 | | −2σδ | | or as the integral dξRe J E∗/2. These losses go to 0 as σ . The advantage of the last expression is that H· is continuous across the interface and→ ∞ so in the limit of high R conductivity it may be taken as its value just outside the conductor.
10.2 Waveguides A typical waveguide is a long hollow conductor within which electromagnetic waves can propagate. The fields satisfy Maxwell’s equations in the nonconducting hollow region. The effect of the conducting wave guide is, in the limit of a perfect conductor, just the imposition of boundary conditions as discussed in the previous section. We first consider a long cylinder with uniform cross section. The interior of the cylinder is filled with a material with uniform ǫ,µ. We also restrict attention to a perfect conductor, so the boundary conditions on the inner cylindrical wall are E = 0 and n B = 0. Maxwell’s equations imply that each k component of E, B satisfies the wave equation:·
∂2 ∂2 2 + ǫµ E =0, 2 + ǫµ B =0 (607) −∇ ∂t2 −∇ ∂t2 iωt but they also imply relations among those components. For harmonic time variation e− , Faraday’s and the Ampere-Maxwell equations become 1 1 B = E, E = B (608) iω ∇× −iωµǫ∇× These formula’s automatically impose the divergence equations E = B = 0. With cylindrical geometry we choose the z-axis parallel to the interior∇· walls of the∇· cylinder. The walls of the cylinder projected onto the xy-plane will be a closed curve, which for the moment we allow to be arbitrary in shape. Because of this generality we cannot find a plane wave
123 c 2010 by Charles Thorn
solution in all three coordinates, but we can find solutions with plane wave z dependence ikz iωt ikz iωt e − . In effect we can find a 1D plane wave solution. So we assume the form f(x,y)e − where f is any of the field components. We write = +ˆz∂/∂z, and also E = Ezzˆ+E , ∇ ∇⊥ ⊥ and B = Bzzˆ+ B . Then we write out the curl and take its scalar and vector product with ⊥ zˆ:
E = E + Ez zˆ + ikzˆ E (609) ∇× ∇⊥ × ⊥ ∇⊥ × × ⊥ zˆ E =z ˆ E (610) ·∇× ·∇⊥ × ⊥ zˆ ( E) =z ˆ ( Ez zˆ)+ ikzˆ (ˆz E )= Ez ikE (611) × ∇× × ∇⊥ × × × ⊥ ∇⊥ − ⊥ zˆ B =z ˆ B , zˆ ( B)= Bz ikB (612) ·∇× ·∇⊥ × ⊥ × ∇× ∇⊥ − ⊥ Employing these decompositions in the Maxwell equations then gives 1 1 Bz = zˆ E , Ez = zˆ B (613) iω ·∇⊥ × ⊥ −iωµǫ ·∇⊥ × ⊥ 1 1 zˆ B = ( Ez ikE ), zˆ E = ( Bz ikB ) (614) × ⊥ iω ∇⊥ − ⊥ × ⊥ −iωµǫ ∇⊥ − ⊥ E + ikEz = 0, B + ikBz =0 (615) ∇⊥ · ⊥ ∇⊥ · ⊥ The equations on the middle line can be rearranged by taking the cross product of say the first one withz ˆ as k 1 B + zˆ E = zˆ Ez − ⊥ ω × ⊥ iω ×∇⊥ k 1 B +ˆz E = Bz (616) −ωµǫ ⊥ × ⊥ −iωµǫ∇⊥ k2 1 k B 1 = zˆ Ez + Bz (617) ⊥ ω2µǫ − iω ×∇⊥ iω2µǫ∇⊥ k2 k 1 zˆ E 1 = zˆ Ez + Bz (618) × ⊥ ω2µǫ − iω2µǫ ×∇⊥ iωµǫ∇⊥ which shows that, except in the case k = ω√µǫ, we can express E , B in terms of Ez,Bz. ⊥ ⊥ TEM Modes That exceptional case implies± that all components of E and B satisfy the 2D Laplace equation. Since Ez is parallel to the waveguide walls, the perfect conductor boundary conditions require Ez = 0 on all boundaries so the uniqueness theorem of the Laplace equation implies Ez = 0 everywhere inside the guide. Then the above equations imply
k 2 2 Ez = 0, B = zˆ E , Bz =0, for k = ω µǫ (619) ⊥ ω × ⊥ ∇⊥
The third equation shows that Bz is independent of x,y. This constant is not directly restricted by the perfect conductor boundary condition n B = 0 since B is a parallel · z component of B. However 0 = B = B + ikBz show by Gauss’ theorem that ∇· ∇⊥ · ⊥ i dxdy Bz = dl n B =0 (620) k · ⊥ Z I 124 c 2010 by Charles Thorn
7 by the perfect conductor boundary condition on B . Then Bz, being a constant, must in ⊥ fact be 0. Thus the exceptional solution is purely transverse (Ez = Bz = 0) and so is called the TEM (Transverse Electric Magnetic) mode. In order for it to exist, the electric field must satisfy the 2D electrostatic equations E = 0, E = 0, which is equivalent to ⊥ ⊥ ⊥ ⊥ 2 φ = 0, with the boundary condition∇E ·= 0 equivalent∇ × to φ =constant on the boundary. k There∇⊥ is no such solution for a simply connected 2D region, so it can only be relevant to a situation at least as complicated as a coaxial waveguide. A simple example waveguide that supports a TEM mode is a pair of concentric conduct- ing cylinders of radii a
ik ik B = Bz = Bz (622) ⊥ ω2µǫ k2 ∇⊥ γ2 ∇⊥ iω− iω ω ωµ zˆ E = Bz = Bz = B = H (623) × ⊥ ω2µǫ k2 ∇⊥ γ2 ∇⊥ k ⊥ k ⊥ 2 2 2 − ( + k ω µǫ)Bz = 0, n Bz =0 onboundaries (624) −∇⊥ − ·∇⊥
In this case the Neumann boundary conditions on Bz are equivalent to the Bn = E = 0, as k can be easily seen by dotting the first two lines with n. Similarly the TM mode solutions are iωµǫ iωµǫ B = zˆ Ez = zˆ Ez (625) ⊥ ω2µǫ k2 ×∇⊥ γ2 ×∇⊥ − ik ik k zˆ E = zˆ Ez = zˆ Ez = H (626) × ⊥ ω2µǫ k2 ×∇⊥ γ2 ×∇⊥ ωǫ ⊥ 2 2 2 − ( + k ω µǫ)Ez = 0, Ez =0 onboundaries (627) −∇⊥ −
Here Dirichlet conditions are necessary since Ez is parallel to the conducting surface. But this is also sufficient since it implies n Ez = 0. Then dotting the first two equations with ×∇⊥ n shows that Bn = E = 0 on the boundaries, as required by perfect conductor boundary k conditions. In both TM and TE cases we have to solve the Helmholtz equation, but with Neumann or Dirichlet conditions in TE and TM cases respectively.
7Alternatively we can write 1 1 dxdy Bz = dxdyzˆ ( E)= dl E = 0. (621) iω · ∇× iω · Z Z I
125 c 2010 by Charles Thorn
We can regard the Helmholtz equation as an eigenvalue problem for the transverse Lapla- cian
2 2 2 2 2 ψa = γaψa, γ = ω µǫ k (628) −∇⊥ − 2 where γa are the possible eigenvalues. Since the eigenvalues are a discrete set of numbers, fixed for given geometry of waveguide and boundary conditions, we can interpret ωa(k) = 2 2 k + γa/√µǫ as the dispersion formula for 1D waves along the waveguide. The group velocity of these waves is then v = k/(ω µǫ). p g a 10.3 Rectangular Waveguide Let the cross section be a rectangle of dimensions a b with a>b. The eigenfunctions are × 2 2 z mπx nπy m n TE : Bmn = B0 cos cos , γmn = π 2 + 2 (629) a b r a b 2 2 z mπx nπy m n TM : Emn = E0 sin sin , γmn = π 2 + 2 (630) a b r a b The cutoff frequencies of TE and TM modes have a large overlap. The difference is that the modes with m or n equal to 0 are possible for TE but not TM. In particular the lowest cutoff frequency is the TE mode γ10/√µǫ = π/(a√µǫ). For this mode Ez = By = Ex = 0 and πx ik π πx ω iω π πx B = B cos , B = B sin , E = B = B sin (631) z 0 a x −γ2 a 0 a y − k x γ2 a 0 a
10.4 Energy Flow and Attenuation To calculate the energy and energy flow in a wave guide we use the formulas for the time- iωt averaged energy density and Poynting vectors valid for harmonic time dependence e− : 1 ǫ µ 1 u = Re E E∗ + H H∗ , S = ReE H∗ (632) h i 2 2 · 2 · h i 2 × We first evaluate these expressions assuming perfect conductivity. The evaluation is slightly different for TE and TM modes. TE Modes For the former 1 STE = Re[E H∗ + E zHˆ z∗] h i 2 ⊥ × ⊥ ⊥ × 1 k = Re E (ˆz E∗ )+ E zBˆ z∗ 2µ ω ⊥ × × ⊥ ⊥ × 1 k iω = Re zˆ E E∗ + Bz∗ Bz 2µ ω ⊥ · ⊥ k2 ω2µǫ ∇⊥ − 1 ωk iω = Re zˆ Bz Bz∗ + Bz∗ Bz 2µ (k2 ω2µǫ)2 ∇⊥ ·∇⊥ k2 ω2µǫ ∇⊥ − − 126 c 2010 by Charles Thorn
The total power transported along the guide is
P = dxdyzˆ S TE ·h i Z 1 ωk 2 1 ωk 2 2 = d x Bz Bz∗ = d x Bz (633) 2µ (k2 ω2µǫ)2 ∇⊥ ·∇⊥ 2µ γ2 | | − Z Z Finally, the time averaged energy density and energy per unit length are
1 ǫ 1 1 2 uTE = Re E E∗ + B B∗ + Bz h i 2 2 ⊥ · ⊥ 2µ ⊥ · ⊥ 2µ| | 2 2 1 ω ǫµ + k 1 2 = Re Bz Bz∗ + Bz (634) 2 2µ(ω2µǫ k2)2 ∇⊥ ·∇⊥ 2µ| | − 2 2 2 2 1 ω ǫµ + k 1 2 T d x u = d x B B∗ + B ≡ h TEi 2 2µ(ω2µǫ k2) z z 2µ| z| Z Z − 1 ω2ǫµ ωµǫ P = d2x B 2 = P = TE (635) 2µ γ2 | z| k TE v Z g
The relation PTE = vgT precisely corresponds to energy flowing down the guide with the group velocity. TM Modes The corresponding evaluation for TM modes is similar; 1 STM = Re[E H∗ + Ezzˆ H∗ ] h i 2 ⊥ × ⊥ × ⊥ 1 ωµǫ = Re E (ˆz E∗ )+ Ezzˆ B∗ 2µ k ⊥ × × ⊥ × ⊥ 1 h ωµǫ iωµǫ i = Re zˆ E E∗ + Ez Ez∗ 2µ k ⊥ · ⊥ k2 ω2µǫ ∇⊥ − ǫ ωk iω = Re zˆ Ez Ez∗ + Ez Ez∗ 2 (k2 ω2µǫ)2 ∇⊥ ·∇⊥ k2 ω2µǫ ∇⊥ − − The total power transported along the guide is
P = dxdyzˆ S TM ·h i Z ǫ ωk 2 ǫ ωk 2 2 = d x Ez Ez∗ = d x Ez (636) 2 (k2 ω2µǫ)2 ∇⊥ ·∇⊥ 2 γ2 | | − Z Z Finally, the time averaged energy density and energy per unit length are
1 ǫ 1 ǫ 2 uTM = Re E E∗ + B B∗ + Ez h i 2 2 ⊥ · ⊥ 2µ ⊥ · ⊥ 2| | 2 2 1 ǫ(ω ǫµ + k ) ǫ 2 = Re Ez Ez∗ + Ez (637) 2 2(ω2µǫ k2)2 ∇⊥ ·∇⊥ 2| | − 127 c 2010 by Charles Thorn
2 2 2 2 ǫ ω ǫµ + k 1 2 T d x u = d x E E∗ + E ≡ h TEi 2 2(ω2µǫ k2) z z 2| z| Z Z − ǫ ω2ǫµ ωµǫ P = d2x E 2 = P = TM (638) 2 γ2 | z| k TM v Z g Resistive Losses Now that we have the solutions for perfect conductivity we can use them to get a first order estimate of the losses that occur with large but finite conductivity. The Ohmic losses are concentrated within a depth δ from the surface of the conductor and the 2 loss per unit area was found in Eq.(??) to be ωδB B∗/4µc = H /2σδ. We express the k k result in terms of H because that field is continuous· acrossk the| interface.| Then the only k properties of the conductor that come in are δ, σ. The loss per unit length is the integral of this quantity along a ribbon of width dz that encircles the waveguide. For the TM modes the parallel component of the magnetic field just outside the conductor’s surface is approximately the value for perfect conductivity and can be obtained from the cross product TM Modes: iωǫ iωǫ n H = − n (ˆz Ez)= − (ˆzn Ez) (639) × ⊥ k2 ω2µǫ × ×∇⊥ k2 ω2µǫ ·∇⊥ − 2 2 − dPTM 1 ω ǫ 2 = dl n Ez dz −2σδ (k2 ω2µǫ)2 | ·∇⊥ | 2 2− I 1 ω ǫ 2 = dl n Ez (640) −2σδ γ4 | ·∇⊥ | I TE Modes For TE modes, there are two contributions to the parallel magnetic field: Hz and the parallel component of H which can be inferred from ⊥ ik n H = − n Hz (641) × ⊥ k2 ω2ǫ ×∇⊥ − 2 dPTE 1 2 k 2 = dl Hz + n Hz dz −2σδ (k2 ω2µǫ)2 | ×∇⊥ | I 2 − 1 2 k 2 = dl Hz + n Hz (642) −2σδ γ4 | ×∇⊥ | I 2βz The attenuation constant β is defined according to an exponential law P (z)= P (0)e− , so that 1 dP β = (643) −2P dz 2 2 2 4 2 γ dlHz +(k /γ ) dl n Hz β = | ×∇⊥ | (644) TE 2σδ µωk d2x H 2 H H z 2 | | 1 ωǫ dl n Ez β = | ·∇⊥ R | (645) TM 2σδ γ2k d2x E 2 H | z| Since the integrals in these expressions dependR only on the geometry of the wave guide, the 1/2 frequency dependence of the β’s is explicit (remembering that δ ω− ) and assuming that ∼
128 c 2010 by Charles Thorn
µ,µc,ǫ,σ are all frequency independent). Then βTM is proportional to
2 1/2 ω 1 ω ω0 − = 1 2 (646) δk √µǫδ0 ω0 − ω r
where we have written ω0 = γ/√µǫ for the cutoff frequency and δ0 for the skin depth at cutoff frequency. This expression blows up at the cutoff frequency, grows as √ω at high frequency, and has a minimum at ω = ω0√3. The frequency dependence of βTE is more involved because of the two contributions:
1/2 ω ω2 − ω2 ω2 β = 1 0 I 0 + I 1 0 (647) TE ω − ω2 1 ω2 2 − ω2 r 0 It also blows up at cutoff and increases as √ω at high frequency, but its minimum depends on the details of the wave guide.
10.5 Resonant Cavities Waveguides, because they leave one direction (z in our discussion) unrestricted, allow waves to propagate in that direction. The walls of the guide prevent propagation in the x and y directions. These waves were described mathematically by assuming the dependence ikz iωt f(x,y)e − for the various components of the fields and then finding solutions for a con- 2 2 tinuous range of kc = ω ωc . Except for TEM modes, the confinement of fields in the xy dimensions generally produces− a discrete set of lower cutoffs ω > 0 on the allowed frequency p c of propagation. It is instructive to invert the relationship between ω and k: ω = c k2 + γ2, which is reminiscent of the relativistic relation of energy to momentum. If we compare this 2 2 2 p to the relation for 3D space ω = c kx + ky + kz , we can interpret the effect of the waveguide 2 2 walls as quantizing the possible values of kx + ky. Similarly, we could have set up a guide that allowed propagation in twop directions by considering waves in the region between two ikyy+ikzz iωt parallel planes, say at x = 0 and x = L. Then the wave ansatz would be f(x)e − , 2 2 2 and we would obtain a dispersion law ω = c ky + kz + γ , where now the effect of the walls is just to quantize the values of kx. Notice that the effect of confining motion in one dimension has the effect of dimensional reduction.p The extra dimension is not completely lost but makes itself felt through the spectrum of “masses” represented by the cutoff fre- quencies. If the extra dimension is very small, it would take a huge energy (~ω) to excite these extra modes. Some very speculative notions about high energy physics include the hypothesis that our space-time actually has extra dimensions that are extremely tiny, that we will only discover in extremely high energy experiments by measuring the spectrum of cutoff frequencies, which to us will be a spectrum of particle masses. Returning to reality, we now consider confinement in all three dimensions, that is we consider electromagnetic fields in cavities. Then we can expect that the frequency itself will be quantized. Again we simplify life by first assuming perfectly conducting walls, and consider the effect of finite conductivity later as a small perturbation. Since we have already studied cylindrical waveguides, the easiest cavities to discuss are cylinders with conducting
129 c 2010 by Charles Thorn
end plates at fixed values of z. But of course cavities of any shape could be considered. One of the homework problems asks you to investigate a spherical cavity. To describe a cylindrical cavity we can start with our known solutions for a cylindrical waveguide and restrict the z dependence to satisfy the appropriate boundary conditions at z = 0 and z = L. We use the fact that for a given frequency, we get a waveguide solution for 2 2 both signs of k = ω ωc /c. Thus each field component will have the general dependence f(x,y)(A sin kz +±B cos kz−). We can maintain the distinction between TM and TE modes. p At the endplates we must have Ex,y = 0and Bz = 0. For TE modes, which are given in terms of Bz, we see that these boundary conditions imply Bz = Bz(x,y)sin(nπz/L). Because Ex,y are determined in terms of ∂x,yBz, those components of E will automatically vanish as well. Thus the frequency will be quantized to the discrete values ω = c γ2 + n2π2/L2. The effect of the endplates has been to quantize k = nπ/L. The x,y components of the fields are ± p nπz B = B (x,y)sin z z L i nπ iπnz/L nπ iπnz/L nπ nπz B = Bz(x,y) e − e− = Bz(x,y)cos ⊥ γ2 ∇⊥ 2iL − 2iL γ2L∇⊥ L iω nπz zˆ E = Bz(x,y)sin TEModes (648) × ⊥ γ2 ∇⊥ L
For TM modes Bz = 0 and the solution is specified in terms of Ez. For the wave guide ikz solution with z dependence e± we had
ik ikz E = Eze± (649) ⊥ ±γ2 ∇⊥ to get a linear combination of these to vanish at z = 0,L we clearly need the two terms to appear with the same coefficients in Ez (so they appear with opposite coefficients in E ) to ⊥ get E =0 at z = 0. Then E =0 at z = l requires k = nπ/L. Thus we have ⊥ ⊥ nπz iωµǫ nπz Ez = Ez(x,y)cos , B = zˆ Ez(x,y)cos L ⊥ γ2 ×∇⊥ L nπ nπz zˆ E = zˆ Ez(x,y)sin TMModes (650) × ⊥ −γ2L ×∇⊥ L
imϕ For the right circular cylinder the transverse solutions are Bessel functions Jm(γmnρ)e . For TM modes Dirichlet conditions on Ez imply that Rγmn xmn the zeroes of Jm(x). 2 2 2 2 2≡ Then the cavity frequencies are ωmnp = c xmn/R + p π /L . The lowest TM frequency is ω = cx /R 2.405c/R. For TE modes the Neumann conditions of B imply that 010 01 ≈ p z Rγmn = ymn the zeroes of Jm′ (x). For TE modes p = 1 is the lowest allowed z mode, and y11 1.841 is the lowest zero of the various Jm′ . Thus the lowest TE frequency is ≈ 2 2 2 2 ω111 = c y11/R + π /L . since y11 130 c 2010 by Charles Thorn by cos2(nπz/L) or sin2(nzπ/L) and integrating z from 0 to L, which amounts, for n =0 to multiplying the energy per unit length by L/2. For the n = 0 mode one instead multiplies6 by L. This gives L γ2 + n2π2/L2 µL n2π2 U = d2x B 2 = 1+ d2x H 2 (651) TE 4µ γ2 | z| 4 γ2L2 | z| Z Z ǫL n2π2 ǫL U = 1+ d2x E 2, n = 0; U = d2x E 2, n =0 (652) TM 4 γ2L2 | z| 6 TM 2 | z| Z Z The rate of energy loss can be obtained by integrating the rate per unit length for a waveguide (times cos2 kz or sin2 kz) over z from 0 to L, and then adding the contribution from the two endplates. 2 2 1 L ω ǫ 2 2 2 PTM = dl n Ez +2 d x H −2σδ 2 γ4 | ·∇⊥ | | ⊥| 2 2 I Z2 2 1 L ω ǫ 2 ω ǫ 2 2 = dl n Ez +2 d x Ez −2σδ 2 γ4 | ·∇⊥ | γ4 |∇⊥ | 2I2 Z ǫ n π L 2 2 2 = 1+ dl n Ez + d x Ez , n =0 (653) −σδµ γ2L2 4γ2 | ·∇⊥ | | | 6 I Z The last term in square brackets is the contribution of the ends, and the first term in square brackets is multiplied by 2 for the case n = 0. The quality factor Q of a resonant cavity is defined by EnergyStored Q = ω . (654) PowerLoss 2 2 2 2 Writing Cξ/A = dl n Ez /γ d x Ez , with C the circumference and A the cross ⊥ sectional area of the cylinder,| ·∇ we| can write| | H R ωσµδL µL QTM = = , n =0 4(1 + ξCL/4A) 2µcδ(1 + ξCL/4A) 6 ωσµδL µL QTM = = , n =0. (655) 2(1 + ξCL/2A) µcδ(1 + ξCL/2A) For TE modes, there are two contributions to the parallel magnetic field on the cylinder wall, Hz and the parallel component of H , as well as the two endplate contributions: ⊥ 2 2 2 2 1 L 2 n π 2 n π 2 2 PTE = dl Hz + n Hz +2 d x Hz −2σδ 2 L2γ4 | ×∇⊥ | L2γ4 |∇⊥ | I 2 2 2 2 Z 1 L 2 n π 2 n π 2 2 = dl Hz + n Hz + d x Hz (656) −σδ 4 L2γ4 | ×∇⊥ | L2γ2 | | I Z The expression for QTE is more cumbersome because of the extra contribution. There are now two parameters associated with the transverse geometry. 2 2 Cζ dl n Hz Cη dl Hz = | ×∇⊥ | , = | | (657) A γ2 d2x H 2 A d2x H 2 H | z| H | z| R 131 R c 2010 by Charles Thorn Then we can write ωσδµL L2γ2 + n2π2 Q = TE 4 (L2γ2η + n2π2ζ)(LC/4A)+ n2π2 µL L2γ2 + n2π2 = 2 2 2 2 2 2 (658) 2µcδ (L γ η + n π ζ)(LC/4A)+ n π The quality factor controls how rapidly an excited cavity loses its energy. From its definition we have − dU ω0 ω0t = U, U(t)= U(0)e Q (659) dt − Q Where we write ω0 for the eigenfrequency of the mode excited. Such exponential damping corresponds to damping in the fields at half the rate. So the effect of finite conductivity iω0t ω0t/2Q i(ω0+∆ω)t is to replace harmonic time dependence in the fields e− by e− − . This time dependence corresponds to a smoothed out frequency profile ∞ dt iωt ω0t/2Q i(ω0+∆ω)t E0 1 E(ω) = e E e− − = (660) 2π 0 2π ω /2Q + i(ω +∆ω ω) Z0 0 0 − E 2 1 E(ω) 2 = | 0| (661) | | 4π2 (ω ω ∆ω)2 + ω2/4Q2 − 0 − 0 So instead of showing a single discrete frequency ω0, damping spreads out the frequency profile in the energy density over a peak with width at half maximum of Γ = ω0/Q. As Q the profile sharpens to a delta function centered at ω . → ∞ 0 10.6 Perturbation of Boundary Conditions We can get a little more information about the effects of finite conductivity by considering how its presence alters the boundary value problem for the Helmholtz equation. We have seen that the tangential electric field is not strictly zero near the conductor, but rather 1 i 1 i E − K = − n H (662) k ≈ σδ σδ × Thus, taking the simpler case of TM modes, we see that instead of Ez = 0 on the walls of the cylinder, we should have 1 i 1 i iωǫ Ezzˆ − n H zˆ − n Ez ≈ σδ × ⊥ ≈ σδ γ2 ·∇⊥ 2 1+ i ωǫ ω µcδǫ Ez 2 n Ez0 = (1+ i) 2 n Ez0 (663) ≈ σδ γ0 ·∇⊥ 2γ0 ·∇⊥ where we added a 0 subscript on quantities valid in the perfect conductor limit. On the end plates instead of E = 0, we should have ⊥ 2 1 i 1 i iωǫ ω µcδǫ E − zˆ H − 2 Ez0 = (1 + i) 2 Ez0 ⊥ ≈ σδ ± × ⊥ ≈∓ σδ γ0 ∇⊥ ∓ 2γ0 ∇⊥ 2 2 ∂Ez ω µcδǫ 2 ω µcδǫ = E (1 + i) 2 Ez0 = (1 + i) Ez0 (664) ∂z −∇⊥ · ⊥ ≈± 2γ0 ∇⊥ ∓ 2 132 c 2010 by Charles Thorn where the upper (lower) sign applies to the endplate at z = L (z = 0). Denoting the perfect conductor solutions with a 0 subscript, we have for infinite and finite conductivity ∂E ( 2 ǫµω2)E = 0, E =0 onwalls, z0 =0 onendplates (665) −∇ − 0 z0 z0 ∂z 2 2 2 ω µcδǫ ( ǫµω )Ez = 0, Ez = (1+ i) 2 n Ez0 on walls, −∇ − 2γ0 ·∇⊥ ∂E ω2µ δǫ z = (1 + i) c E on endplates (666) ∂z ∓ 2 z0 We now use Green’s Theorem 2 2 2 dzd x(E E∗ E∗ E ) = dS( n) (E E∗ E∗ E )) z∇ z0 − z0∇ z − · z∇ z0 − z0∇ z Z I 2 2 3 µǫ(ω ω ) d xE E∗ = dz dl( n) (E E∗ E∗ E )) − 0 z z0 − · z∇ z0 − z0∇ z Z Z I z=L 2 ∂Ez∗0 ∂Ez d x Ez Ez∗0 − ∂z − ∂z z=0 Z z=L 2 ∂Ez = dz dl( n) Ez Ez∗0 + d xE z∗0 − · ∇ ∂z z=0 Z I Z where n is the normal vector pointing out from the conductor or into the hollow region, and in the last line we used the fact that Ez0 = 0 on the cylinder walls and ∂Ez0/∂z =0 on the endplates. Putting everything together we have 2 2 2 3 ω0µcδ L 2 2 2 (ω ω0) d xEz0Ez∗0 (1 + i) dl n Ez0 + d x Ez0 − ≈ − µ 4γ2 | ·∇⊥ | | | Z 0 I Z ω µ δ ξLC ω ω ω (1 + i) 0 c +1 = (1 + i) 0 (667) − 0 ≈ − Lµ 4A − 2Q From the time dependence implied by finite Q, we infer that ω ω0 = ∆ω iω0/2Q from which we infer that − − ω µ δ ξLC ω Lµ ∆ω = 0 c +1 = 0 , Q = (668) − Lµ 4A −2Q 2δµ (1 + ξCL/4A) c The result for Q agrees with that obtained from our energy analysis and the result for ∆ω is some new information. 10.7 Excitation of Waveguide Modes One way to set up waves in waveguides is to apply an oscillating current distribution at some spot within the guide. This current would in turn produce fields oscillating at the same frequency, which far from the current source would be in a superposition of the allowed propagating modes at that frequency. That is, the modes with γ2 < ω2µǫ. Close to the source 133 c 2010 by Charles Thorn all the other modes will be present to form a proper solution of the Maxwell equations in the presence of the source. This process is a mini-radiation problem in which the outgoing waves are limited to two directions +z and z. If we assume the driving source is localized in z, say J µ = 0 when z > L/2, then the fields− will solve the sourceless Maxwell equations in the whole region to| the| right z >L/2 and in the whole region to the left z < L/2. In both cases the solution should be a superposition of outgoing waves. This means− that the fields on the right will be superpositions of wave guide modes with eikz dependence for real 2 2 z k k = ω µǫ γ or e− | | for imaginary k. similarly on the left they will be superpositions of e ikz or e+−z k . −p | | An effective way to describe this physical situation is to expand the fields in waveguide normal modes. As we have seen these modes divide into TM and TE modes, in which all components of the field are expressed in terms of Ez and Hz respectively. As discussed in J, Problem 8.18, different modes of each type are orthogonal in the sense that d2xE E = 0, or d2xH H =0, λ = µ (669) zλ zµ zλ zµ 6 Z Z if λ = µ. For λ = µ the integrals are nonzero and we may establish a convenient normalization for them.6 Because the Helmholtz equation and boundary conditions are real, the modes could be chosen to be real. However, it will be convenient to allow some modes to be purely imaginary, and for later convenience adopt the normalizations γ2 d2xE E = λ δ , TMModes (670) zλ zµ −k2 λµ Z λ γ2 d2xH H = λ δ , TEModes (671) zλ zµ −k2Z2 λµ Z λ λ µω µ ω√µǫ Z = = , λ TE λ k ǫ k ∈ r k µ k Zλ = = , λ TM (672) ǫω ǫ ω√ǫµ ∈ r 1 The wave impedance Zλ (it has dimensions of resistance) is defined so that H = Zλ− zˆ E ⊥ × ⊥ for both kinds of modes. We see that this normalization implies that for TM modes Ez is imaginary for real kλ and Ez is real for imaginary kλ. For TE modes, however, Hz is imaginary for both cases. Finally note that, with these conventions, the dimensions of the mode functions Ezλ are taken to be 1/Length, instead of the physical dimensions of an electric field (Force/Charge). This is a convenient choice for their use as expansion functions, and simply means that the physical units of the electric field will be restored by the units of the expansion coefficients. However, the dimensions of Hzλ/Ezλ remain 1/Ohms, the same as for physical fields. The purpose of these strange normalization conditions on the z-components was to ar- range simple orthonormality conditions on E : ⊥ 2 2 1 d xE λ E µ = δλµ, and d xH λ H µ = δλµ (673) ⊥ · ⊥ ⊥ · ⊥ Z2 Z Z λ 134 c 2010 by Charles Thorn It is also easy to check that the transverse fields for TE modes are orthogonal to those for TM modes. Notice that the signs of the right sides of these equations reflect that fact that E is always real with our conventions, whereas H is real for propagating modes (kλ real) ⊥ ⊥ but imaginary for nonpropagating modes (kλ imaginary). Finally the orthogonality relation 2 1 d xE λ H µ = δλµ (674) ⊥ × ⊥ Z Z λ will be useful in what follows. We can use the same set of mode fields to construct both right and left moving waves, except that the relative sign between the longitudinal and transverse fields flips in going from 2 2 2 2 right moving to left moving. Thus if we fix kλ =+ ω ǫµ γλ if real and k = i γλ ω ǫµ if imaginary, the right (+) moving and left ( ) moving modes− will be defined via − − p p + ikz + ikz Eλ [Eλ +ˆzEλz]e , Hλ [Hλ +ˆzHλz]e (675) ⊥ ⊥ ≡ ikz ≡ ikz Eλ− [Eλ zEˆ λz]e− , Hλ− [ Hλ +ˆzHλz]e− (676) ≡ ⊥ − ≡ − ⊥ This notation subsumes both TE and TM modes, the index λ distinguishing different modes. If λ is a TE (TM) mode then Eλz =0(Hλz = 0). Each mode is associated with its own cutoff frequency γλ/√µǫ. Returning to the waveguide excitation problem, we can expand the solution of Maxwell’s equations outside the source in a compete set of waveguide modes + + + E = E + E− = Aλ Eλ + Aλ−Eλ− (677) Xλ Xλ + + + H = H + H− = Aλ Hλ + Aλ−Hλ− (678) Xλ Xλ Maxwell’s equations imply that the expansion coefficients will be a fixed set of constants throughout any region of z completely free of sources. However, if two such regions are separated by a region containing sources, the expansion coefficients will differ for the two regions. Since we are assuming a localized source we have one set of expansion coefficients ARλ± in the region to the right of the source and another set ALλ± to the left of the source. For + + the analysis of the waveguide excitation, we shall impose that E =0(ALλ = 0 for all λ) everywhere to the left of the source and E− =0(ARλ− = 0 for all λ) everywhere to the right + of the source. The problem is to relate the nonzero coefficients, ALλ− ,ARλ to the properties of the source. We can then use a trick with the Poynting vector to relate the expansion coefficients on the left to those on the right and a volume integral of the expressions J Eλ±. It is based on the identity · (E H±) = ( E) H± E ( H±) ∇· × λ ∇× · λ − · ∇× λ = iωµH H± + iωǫE E± · λ · λ (E± H) = ( E±) H E± ( H) ∇· λ × ∇× λ · − λ · ∇× = iωµH H± + iωǫE E± J E± · λ · λ − · λ E H± E± H = J E± (679) ∇· × λ − λ × · λ 135 c 2010 by Charles Thorn We can now integrate this over the volume within the waveguide between L/2 + 2 + + A d xzˆ E H± E± H Rµ · µ × λ − λ × Rµ µ Z X + 2 = ARµ d xzˆ (E µ ( )H λ E λ H µ) · ⊥ × ± ⊥ − ⊥ × ⊥ µ X Z + 1 1 0 = ARµ δλµ δλµ = 2 + (680) ±Z − Z Z ARλ µ λ λ − λ X Similarly at z = L/2, E+ = H+ = 0, so that contribution is − 2 A− d x( zˆ) E− H± E± H− Lµ − · µ × λ − λ × µ µ Z X 2 = ALµ− d xzˆ (E µ ( )H λ + E λ H µ) − · ⊥ × ± ⊥ ⊥ × ⊥ µ X Z 2 1 1 A− = A− δ + δ = Zλ Lλ (681) − Lµ ±Z λµ Z λµ − µ λ λ 0 X Putting the two contributions together gives Zλ 3 Zλ 3 ikλz A± = d xJ E∓ = d x (J E J E ) e∓ (682) R λ λ λ z λz L − 2 · − 2 ⊥ · ⊥ ± Z Z Thus the strength of the excitation of each mode, propagating or not is determined by this 8 2 2 single integral . Far down the wave guide only the propagating modes with γλ < ω ǫµ will survive. For a fixed frequency, there will be only a finite number of these. 8 ± More generally if we had not set A L = 0, the left side of these equations would be the differences R λ ± ± A R A L . So, for instance, if J = 0, the equation would simply say that the coefficients on the left and L λ R λ − ± ± right would be the same: ARλ = ALλ. 136 c 2010 by Charles Thorn 11 Radiation from Localized Sources We shall use Green function methods to calculate how electromagnetic waves are emitted from time varying sources. Last semester we obtained the Green function for the wave equation 2 2 1 ∂ + G(r,t; r′,t′) = δ(r r′)δ(t t′) (683) −∇ c2 ∂t2 − − δ( r r′ /c t + t′) G(r,t; r′,t′) = | − | − (684) 4π r r | − ′| where a choice has been made so that G is zero for t µ µν 2 µ ν ν µ 1 µ J = cǫ0∂νF = c ǫ0∂ν(∂ A ∂ A )= ∂ ∂A (685) − −µ0 · 1 ∂2 ∂ ∂Aµ = 2 + Aµ = µ J µ (686) − · −∇ c2 ∂t2 0 3 Jµ(r′,t R/c) A (r,t) = µ d r′ − , R r r′ (687) µ 0 4πR ≡| − | Z where we note that the source contributes at the retarded time. Radiation, by definition, is electromagnetic energy that escapes to large distances, and is determined by the large r behavior of the integral. We can write R r r r′/r ≈ − · µ0 3 A (r,t) d r′J (r′,t R/c), r a (688) µ ∼ 4πr µ − ≫ Z If we Fourier analyze J(r,t) in time we get for the component with frequency ω, iωt iωt µ0 3 ik r r′ A (r, ω)e− e− d r′J (r′, ω)e | − | µ ∼ 4πr µ ikr Z µ0e 3 ikr r′/r A (r, ω) d r′J (r′, ω)e− · (689) µ ∼ 4πr µ Z The simplest radiating physical systems are point particles with harmonic trajectories, for instance moving in a circle of radius a in the xy-plane at constant speed ωa, r(t) = a(cos(ωt), sin(ωt, 0). The current for such a particle is dr J = q δ(r r(t)) (690) dt − iωt 1 J iJ = iqaωe± δ(r a)δ(z) δ(ϕ ωt) x ± y ± − a − iωt = iqωe± δ(r a)δ(z)δ(ϕ ωt), J =0 (691) ± − − z 137 c 2010 by Charles Thorn The angle delta function in this formula must be understood as requiring ϕ = ωt mod 2π, which means it has the Fourier representation ∞ 1 in(ϕ ωt) δ(ϕ ωt)= e − (692) − 2π n= X−∞ Thus we see that although the motion has the frequency ω, the current oscillates at all the harmonics nω of this fundamental frequency. ∞ iqω iωt in(ϕ ωt ϕ0) J iJ = e± δ(r a)δ(z) e − − (693) x ± y ± 2π − n= X−∞ But notice that the higher harmonics are accompanied with high powers of eiϕ, which means that they contribute only to higher multipole moments. In the homework problems you are asked to analyze dipole and quadrupole arrangements of such revolving charges. To calculate the Poynting vector which describes the energy flow of radiation, we need the fields. 1 1 H = B = A (694) µ0 µ0 ∇× ∂A E = φ = iωA φ − ∂t −∇ −∇ c2 = iωA ( A) (695) − iω ∇ ∇· where the last line uses the Lorenz gauge condition 0 = ∂A0/∂(ct)+ A iωφ/c2 + A. In the radiation zone, characterized by the fields falling off as 1/r, the∇· only→− contribution to∇· the spatial derivatives which does not add a further factor of 1/r is when they act on the phase factor eikr = ikreikr/r ikeikr. Thus, in the radiation zone characterized by kr 1, it is valid to∇ substitute ≡ik. ≫ ∇ → ikr e 3 ik r′ H i k d r′J(r′, ω)e− · (696) ∼ 4πr × ikr Z µ0e 3 2 2 ik r′ E d r′ iω J(r′, ω) ic kk J(r′, ω) e− · ∼ 4πωr − · ikrZ e 3 ik r′ i k k d r′J(r′, ω)e− · = Z kˆ H (697) ∼ − 4πǫ ωr × × − 0 × 0 Z where Z0 = µ0c = µ0/ǫ0 377Ω is sometimes called the impedance of the vacuum. Note that, had we not used the≈ Lorenz gauge condition to relate φ to A, the integrand would p involve 2 2 kc iωJ(r′, ω) ikc ρ(r′, ω) = iωJ(r′, ω) ′ J(r′, ω) − − ω ∇ · kc2 c2 iωJ i k J = i k (k J) (698) → − ω · − ω × × 138 c 2010 by Charles Thorn where we used current conservation J = iωρ in the first line and the arrow denotes the change in the integrand after an integration∇· by parts. Finally the flux of energy in the radiation is given by the time averaged Poynting vector, which at large r is 1 S = Re E H∗ h i 2 × Z0 Z0k Re (k H) H∗ = H H∗ ∼ −2k × × 2k · 2 dP 2 Z0 3 ik r′ = r rˆ S k d r′J(r′, ω)e− · dΩ ·h i∼ 32π2 × Z 2 Z0 3 ik r′ = k k d r′J(r′, ω)e− · (699) 32π2k2 × × Z The main point of using the formula on the last line is that the vector inside the absolute value points in the direction of the electric field and hence determines the polarization of the radiation. 11.1 Long Wavelength Limit We have just seen that the intensity of radiation involves the Fourier transform of the current density 3 ik r d rJ(r, ω)e− · (700) Z iωt When only one frequency contributes the normalization is that Re(Je− ) is the actual physical current density. Suppose the source is localized within a region of size a. Then the exponent is of order ka =2πa/λ. If the wavelength is large compared to the size of the source, then ka 1 and we may expand the exponential, relating the terms of the expansion to various multipole≪ moments. We examine the first few terms of this expansion. ik r 1 2 e− · =1 ik r (k r) + (701) − · − 2 · ··· The first term contributes ∂rk d3rJ k(r)= d3r J l(r)= d3rrk J(r)= iω d3rrkρ = iωpk (702) ∂rl − ∇· − − Z Z Z Z where p is the electric dipole moment. The next term contributes kl i ik d3rrlJ k = i d3r(rlJ k + rkJ l) d3r(r kJ k rkk J) − l − 2 − 2 · − · Z Z Z kl i = i d3rJ m (rkrl) d3r[k (J r)]k − 2 ∇m − 2 × × Z Z 139 c 2010 by Charles Thorn kl kl = i d3rrkrl J + ik m = ω d3rrkrlρ + ik m 2 ∇· × − 2 × Z Z kl Q δ = ω kl + kl d3rr2ρ + i[k m]k (703) − 2 3 3 × Z 3 k l where we have recalled the definition of the electric quadrupole moment Qkl = d x(3r r r2δ )ρ. Finally, we have − kl R k 3 k ik r k k ω l ωk 3 2 d rJ (r)e− · = iωp + i[k m] Q k d rr ρ (704) − × − 6 kl − 6 Z Z The last term, which is not expressed in terms of a multipole moment does not contribute to radiation which involves the vector product of the current with k. The three terms that do contribute are electric dipole, magnetic dipole, and electric quadrupole respectively. Each type of radiation has a characteristic angular distribution. For electric and magnetic dipole radiation one has 2 2 ω 2 ω 2 2 2 2 k (k p) = k(k p) k p = ω (k p p∗ k pk p∗) k2 | × × | k2 | · − | · − · · 1 2 2 2 k (k (k m)) = k (k m m∗ k mk m∗) (705) k2 | × × × | · − · · Only if the components of p or m are all relatively real can we identify the right sides of these equations with ω2k2 p 2 sin2 α or k4 m 2 sin2 α, where α is the angle between k and the dipole moment. | | | | dP Z dP Z = 0 ω2(k2 p 2 k p 2), = 0 k2(k2 m 2 k m 2) (706) dΩ 32π2 | | −| · | dΩ 32π2 | | −| · | The only qualitative difference between electric and magnetic dipole radiation is the po- larization, i.e. the orientation of the electric field. For the electric dipole it points in the direction k (k p) which lies in the plane defined by k and p. For the magnetic dipole it points in× the direction× perpendicular to the plane defined by k and m. The total power i j 2 radiated requires dΩk k =4πk /3δij R Z Z Z P = 0 ω2k2 p 2 = 0 k4c2 p 2, P = 0 k4 m 2 (707) 12π | | 12π | | 12π | | respectively. Quadrupole radiation has a more complicated angular distribution. We have to square i j l the vector with components V = ǫijkk Qklk 2 j l m p j l m p V = ǫ k Q k ǫ k Q∗ k =(δ δ δ δ )k Q k k Q∗ k | | ijk kl imn np jm kn − jn km kl np 2 l p m l 2 = k Q Q∗ k k Q k k (708) kl kp −| ml | 2 4 dP ω k Z0 l p m l 2 = (Q Q∗ kˆ kˆ Q kˆ kˆ ) (709) dΩ 1152π2 ml mp −| ml | 140 c 2010 by Charles Thorn where kˆ = (sin θ cos ϕ, sin θ sin ϕ, cos θ). To see the angular distribution in a simple example, take Qij diagonal with Q33 = ˆj ˆn 2 2 2 2 ˆiˆj 2Q11 = 2Q22. Then QijQink k /Q11 = sin θ +4cos θ = 1+3cos θ and Qijk k /Q11 = sin− 2 θ 2cos− 2 θ =1 3cos2 θ. then − − dP c2k6Z c2k6Z = 9 0 Q2 sin2 θ cos2 θ = 0 Q2 sin2 θ cos2 θ. (710) dΩ 1152π2 11 128π2 11 This shows a four lobed structure with vanishing radiation at θ =0,π/2. To integrate over angles one can use rotational invariance and symmetry to argue that m l m l n p dΩk k = C1δml, dΩk k k k = C2(δmlδnp + δmnδlp + δmpδln) (711) Z Z Then tracing over ml determines C1 =4π/3, 5C2 = C1 so C2 =4π/15. Then we find 2 4 2 4 4ω k Z0 1 2 1 2 2 ω k Z0 2 P = Qml ( TrQ +2 Qml ) = Qml (712) 1152π 3 | | − 15 | | | | ! 1440π | | Xml Xml Xml for the total readiated power. 11.2 Beyond the Multipole Expansion If the wavelength of radiation is comparable to the size of the source, the multipole expansion 3 ik r is no longer valid, and we have to try to evaluate the full Fourier transform d xJe− · . For a line antenna we can for example prescribe a simple form for J =zIδ ˆ (x)δ(y)sin(kd/2 k z ) R for d/2