
Physics 7C Spring 2015 Discussion Section Notes

Kevin T. Grosvenor$^{a,b}$

$^a$ Berkeley Center for Theoretical Physics and Department of Physics, University of California, Berkeley, CA 94720-7300, USA

$^b$ Theoretical Physics Group, Lawrence Berkeley National Laboratory, Berkeley, CA 94720-8162, USA

Abstract: Some discussion section notes for Physics 7C.

Contents

1. Vectors

2. The Wave Equation

3. Solving the Wave Equation
   3.1. Electromagnetic Plane Waves

4. Poynting Vector and Flux
   4.1. Red Laser Pointer

5. Ray Tracing Diagrams for Mirrors

6. Ray Tracing Diagrams for Lenses

7. Compound Optical Systems
   7.1. Two-Lens Problem
   7.2. Two-Lens Demonstration

8. Midterm 1 Quiz

9. Interference
   9.1. Laser Wavelength Measurement via Metal Ruler

10. Thin-Film Interference

11. Relativity
   11.1. How to Measure the Length of a Moving Object
   11.2. Relativistic Train
   11.3. Passing Trains

12. Midterm 2 Quiz

13. Energy and Momentum
   13.1. 4-Vectors
   13.2. Colliding Photons

14. Quantum Mechanics
   14.1. The Wacky World of the Double Slit
   14.2. Blackbody Radiation and the Ultraviolet Catastrophe
   14.3. Stefan-Boltzmann Law
   14.4. Bohr Model
   14.5. Time-Evolution in 1D Infinite Square Well

15. Final Review
   15.1. Human Eye Optics
   15.2. Optical Fiber
   15.3. Modified Michelson Interferometer
   15.4. Diffraction Grating
   15.5. Optical Spectroscopy
   15.6. Relativity and Current-Carrying Wires
   15.7. Pi Decay
   15.8. Relativistic Doppler Effect
   15.9. Quantum Tunneling and Frustrated Total Internal Reflection
   15.10. Wavefunction Shapes

16. Final Exam Solutions
   16.1. The Pole Vaulter Paradox
   16.2. Pion Decay

1. Vectors

For our purposes, a vector will be something that has several components (usually three; or four in relativity). We must be able to add two vectors (component by component) and we must be able to multiply a vector by a real number. There are a slew of other requirements, but they are usually trivially satisfied, at least for the main vector space we will care about: $\mathbb{R}^3$, or $\mathbb{R}^n$ for general dimension. The symbol $\mathbb{R}^n$ means the set of all $n$-component expressions, $\mathbf{A} = (A_1, A_2, \ldots, A_n)$, such that each component is a real number (i.e. $A_i \in \mathbb{R}$ for $i = 1, \ldots, n$). In three-dimensional space, let us replace $x, y, z$ with $x_1, x_2, x_3$ in order to make it easier to generalize to any dimension. We often denote the unit vector in the direction of $x_i$ by $\hat{\mathbf{x}}_i$, whose components are all zero except for the $i$th one, which is a one. Then, $\mathbf{A}$ may be written, in a general dimension, $n$,

$$\mathbf{A} = \sum_{i=1}^{n} A_i \hat{\mathbf{x}}_i. \tag{1.1}$$

So that we don’t have to keep writing summation signs everywhere, we will usually follow Einstein’s convention that repeated indices are summed over, unless stated otherwise. Then, (1.1) becomes neater:

$$\mathbf{A} = A_i \hat{\mathbf{x}}_i. \tag{1.2}$$

The dimensionality is nowhere to be found now, so you must make sure you know what it is from context.

Next, we introduce the Kronecker delta symbol:

$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} \tag{1.3}$$

$\mathbb{R}^n$ has what’s called an inner product structure. This is a map $(\cdot, \cdot): \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$. That is, you take two vectors, put one in the first slot and the other in the second slot of $(\cdot, \cdot)$, and you will get a real number, called their inner product. It is also often called their dot product, especially in three dimensions, and we will denote it by $\mathbf{A} \cdot \mathbf{B}$. It is defined by

$$\mathbf{A} \cdot \mathbf{B} = \delta_{ij} A_i B_j = A_i B_i, \tag{1.4}$$

where you need to keep in mind that repeated indices are summed over.

In addition, $\mathbb{R}^3$ has a special structure called a cross product. This is given by a map $(\cdot \times \cdot): \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3$, so it takes two vectors and spits out another vector.$^1$ In order to talk about cross products, we must introduce the Levi-Civita symbol, and to do that, we must understand cyclic indices. There is a mnemonic for this. Think of a clock that goes from 1 through 3 instead of 1 through 12. Starting at any of the numbers, if you traverse the clock in a clockwise fashion, then the order is declared to be cyclic. If you traverse the clock in the counter-clockwise direction, then the order is anti-cyclic. So, (123), (231) and (312) are cyclic, whereas (132), (213), (321) are anti-cyclic. By the way, and for example, (231) is the permutation that sends 1 (the first slot) to 2 (the first number appearing), 2 to 3 and 3 to 1. The Levi-Civita symbol is defined to be

$$\epsilon_{ijk} = \begin{cases} 1 & \text{if } (ijk) \text{ is cyclic}, \\ -1 & \text{if } (ijk) \text{ is anti-cyclic}, \\ 0 & \text{if any index is repeated}. \end{cases} \tag{1.5}$$

Roughly speaking, the Levi-Civita symbol is to the cross product what the Kronecker delta is to the dot product:

$$\mathbf{A} \times \mathbf{B} = \epsilon_{ijk} \hat{\mathbf{x}}_i A_j B_k. \tag{1.6}$$

Let us take a moment to ensure that this definition of the cross product agrees with the definition of the cross product that we are likely to have learned before, namely

$$\mathbf{A} \times \mathbf{B} = (A_y B_z - A_z B_y)\hat{\mathbf{x}} + (A_z B_x - A_x B_z)\hat{\mathbf{y}} + (A_x B_y - A_y B_x)\hat{\mathbf{z}}. \tag{1.7}$$

To aid in comparison, let us first rewrite (1.7) in terms of our new notation where $\hat{\mathbf{x}}$ is replaced with $\hat{\mathbf{x}}_1$, and $\hat{\mathbf{y}}$ with $\hat{\mathbf{x}}_2$ and so on. Also, the $x$-component is the 1-component, the $y$-component is the 2-component and so on. Then,

$$\mathbf{A} \times \mathbf{B} = (A_2 B_3 - A_3 B_2)\hat{\mathbf{x}}_1 + (A_3 B_1 - A_1 B_3)\hat{\mathbf{x}}_2 + (A_1 B_2 - A_2 B_1)\hat{\mathbf{x}}_3. \tag{1.8}$$

$^1$ Actually, the result of a cross product is what’s called a pseudovector, but no matter.

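Now, let us expand out the right hand side of (1.6) to see that it is in fact the same as the right hand side of (1.8):

$$\begin{aligned} \epsilon_{ijk} \hat{\mathbf{x}}_i A_j B_k &= \epsilon_{123} \hat{\mathbf{x}}_1 A_2 B_3 + \epsilon_{132} \hat{\mathbf{x}}_1 A_3 B_2 + \epsilon_{231} \hat{\mathbf{x}}_2 A_3 B_1 + \epsilon_{213} \hat{\mathbf{x}}_2 A_1 B_3 \\ &\quad + \epsilon_{312} \hat{\mathbf{x}}_3 A_1 B_2 + \epsilon_{321} \hat{\mathbf{x}}_3 A_2 B_1 \\ &= \hat{\mathbf{x}}_1 A_2 B_3 - \hat{\mathbf{x}}_1 A_3 B_2 + \hat{\mathbf{x}}_2 A_3 B_1 - \hat{\mathbf{x}}_2 A_1 B_3 + \hat{\mathbf{x}}_3 A_1 B_2 - \hat{\mathbf{x}}_3 A_2 B_1. \end{aligned} \tag{1.9}$$

Here, we used $\epsilon_{123} = \epsilon_{231} = \epsilon_{312} = 1$ and $\epsilon_{132} = \epsilon_{213} = \epsilon_{321} = -1$. Now, it is quite easy to see that (1.8) is the same as (1.9), just organized slightly more neatly.

Okay, so far all we have done is introduce a bunch of notation in order to express the dot product and the cross product more compactly. However, this notation is actually useful once we have to deal with multiple products (like multiple cross products). For such purposes, the following identity is very useful:

$$\epsilon_{ijk} \epsilon_{i\ell m} = \delta_{j\ell} \delta_{km} - \delta_{jm} \delta_{k\ell}. \tag{1.10}$$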

This identity is actually reasonably easy to understand. Remember that $i$, $j$ and $k$ have to take on different values or else the Levi-Civita symbol would be zero anyway (also $i$, $\ell$ and $m$ have to take on different values). Well, since they can only take on three different values, namely 1, 2 or 3, either $j$ is equal to $\ell$ and $k$ is equal to $m$, or $j$ is equal to $m$ and $k$ is equal to $\ell$. There just aren’t any other possibilities! That’s what the right hand side of the equation says, except for the signs, which you can figure out by plugging in some specific set of values for the indices, say $i = 1$, $j = \ell = 2$ and $k = m = 3$, and then try $i = 1$, $j = m = 2$ and $k = \ell = 3$.

In fact, this can be extended to any dimension. In $n + 1$ dimensions, we write

$$\epsilon_{i i_1 \cdots i_n} \epsilon_{i j_1 \cdots j_n} = \begin{vmatrix} \delta_{i_1 j_1} & \delta_{i_1 j_2} & \cdots & \delta_{i_1 j_n} \\ \delta_{i_2 j_1} & \delta_{i_2 j_2} & \cdots & \delta_{i_2 j_n} \\ \vdots & \vdots & \ddots & \vdots \\ \delta_{i_n j_1} & \delta_{i_n j_2} & \cdots & \delta_{i_n j_n} \end{vmatrix}, \tag{1.11}$$

where the vertical lines surrounding the matrix mean “take the determinant”.

The identity (1.10) will allow you to deal with situations involving multiple cross products. A number of very important vector identities can be proven using these formulas. For example, let us prove the “BAC-CAB” rule:

$$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B}(\mathbf{A} \cdot \mathbf{C}) - \mathbf{C}(\mathbf{A} \cdot \mathbf{B}). \tag{1.12}$$

Of course, you could prove this identity component by component by literally expanding both sides out completely. It will take some time and is tedious, but should be pretty straightforward. On the other hand, it is quite easy to prove in our new notation. Define

$$\mathbf{D} = \mathbf{B} \times \mathbf{C} = \epsilon_{ijk} \hat{\mathbf{x}}_i B_j C_k. \tag{1.13}$$

Of course, as in (1.2), we can write $\mathbf{D}$ in terms of its components:

$$\mathbf{D} = D_i \hat{\mathbf{x}}_i. \tag{1.14}$$

Comparing (1.13) with (1.14) gives us the components of $\mathbf{D}$ in terms of the components of $\mathbf{B}$ and $\mathbf{C}$:

$$D_i = \epsilon_{ijk} B_j C_k. \tag{1.15}$$

Getting back to the left hand side of (1.12), we have

$$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{A} \times \mathbf{D} = \epsilon_{\ell m i} \hat{\mathbf{x}}_\ell A_m D_i. \tag{1.16}$$

Notice that I have used $\ell, m, i$ instead of the customary $i, j, k$. These are just dummy indices, so it does not matter what you call them. The reason why I have written $i$ last is because that index is the same as the index on $\mathbf{D}$ and I have already written $D_i$ in (1.15), so might as well keep that index as $i$. The reason why I have used $\ell$ and $m$ instead of $j$ and $k$ is because $j$ and $k$ already appear in (1.15), so I don’t want to use them again or else I will get confused as to which pair of $i$’s are supposed to be summed over and, similarly, which pairs of $j$’s are supposed to be summed over. Plugging (1.15) into (1.16) gives

$$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \epsilon_{\ell m i} \hat{\mathbf{x}}_\ell A_m \epsilon_{ijk} B_j C_k = \epsilon_{ijk} \epsilon_{\ell m i} \hat{\mathbf{x}}_\ell A_m B_j C_k. \tag{1.17}$$

Now, we see why (1.10) might be useful since we have a product of two Levi-Civita symbols here with a pair of indices being summed over, namely the index $i$. However, we have a bit of a problem: in (1.10) the index being summed over, namely $i$, is the first index of both Levi-Civita symbols. In (1.17) it is the first index in one Levi-Civita symbol, but the last in the other. No matter: we can always cyclically permute the indices of the Levi-Civita symbol without changing it:

$$\epsilon_{\ell m i} = \epsilon_{i \ell m} = \epsilon_{m i \ell}. \tag{1.18}$$

Think about it: this is just like rotating our 3-hour clock in the clockwise direction. That doesn’t change anything. On the other hand, if you switch any two indices, you get a minus sign:

$$\epsilon_{\ell m i} = -\epsilon_{\ell i m} = -\epsilon_{i m \ell} = -\epsilon_{m \ell i}. \tag{1.19}$$

In any case, we can safely replace $\epsilon_{\ell m i}$ with $\epsilon_{i \ell m}$ in (1.17). The result is

$$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \epsilon_{ijk} \epsilon_{i \ell m} \hat{\mathbf{x}}_\ell A_m B_j C_k. \tag{1.20}$$

Now, we can use (1.10) directly:

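$$\begin{aligned} \mathbf{A} \times (\mathbf{B} \times \mathbf{C}) &= (\delta_{j\ell} \delta_{km} - \delta_{jm} \delta_{k\ell}) \hat{\mathbf{x}}_\ell A_m B_j C_k \\ &= \hat{\mathbf{x}}_j A_k B_j C_k - \hat{\mathbf{x}}_k A_j B_j C_k \\ &= (B_j \hat{\mathbf{x}}_j)(A_k C_k) - (C_k \hat{\mathbf{x}}_k)(A_j B_j) = \mathbf{B}(\mathbf{A} \cdot \mathbf{C}) - \mathbf{C}(\mathbf{A} \cdot \mathbf{B}). \end{aligned} \tag{1.21}$$

That’s the “BAC-CAB” rule proven! There are some vector product identities on the inside of the front cover of Griffiths’ E&M textbook. It would be great if you could try to prove some of those using these same techniques. Also, to apply the above result, try to prove the following derivative identity:

$$\nabla \times (\nabla \times \mathbf{A}) = \nabla(\nabla \cdot \mathbf{A}) - \nabla^2 \mathbf{A}, \tag{1.22}$$

where $\nabla$ is the gradient operator

$$\nabla \equiv \hat{\mathbf{x}}_i \partial_i \equiv \hat{\mathbf{x}}_i \frac{\partial}{\partial x_i}, \tag{1.23}$$

and $\nabla^2$ is the Laplacian,

$$\nabla^2 = \nabla \cdot \nabla = \partial_i \partial_i = \sum_{i=1}^{3} \frac{\partial^2}{\partial x_i^2}. \tag{1.24}$$

Remember the names of these derivatives: $\nabla f$ is the gradient of a function $f$, $\nabla \cdot \mathbf{A}$ is the divergence of the vector function $\mathbf{A}$, and $\nabla \times \mathbf{A}$ is the curl. The identity (1.22) will be useful in deriving the wave equation that electromagnetic waves satisfy from Maxwell’s equations.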
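If you would like a quick sanity check on the index gymnastics of this section before attempting the exercises, here is a minimal Python sketch. It is not part of the course material: the helper names (`levi_civita`, `delta`, `cross`, `dot`) are illustrative choices, the indices run over 0, 1, 2 as stand-ins for 1, 2, 3, and the random test vectors are arbitrary. It checks the identity (1.10) for every choice of free indices and spot-checks the BAC-CAB rule (1.12) numerically.

```python
import itertools
import random


def levi_civita(i, j, k):
    """Levi-Civita symbol (1.5) for indices in {0, 1, 2}."""
    # The product is +2 for cyclic orders, -2 for anti-cyclic ones,
    # and 0 whenever two indices coincide.
    return (j - i) * (k - i) * (k - j) // 2


def delta(i, j):
    """Kronecker delta (1.3)."""
    return 1 if i == j else 0


def dot(a, b):
    """Dot product (1.4): A_i B_i."""
    return sum(a[i] * b[i] for i in range(3))


def cross(a, b):
    """Cross product via (1.6): (A x B)_i = eps_ijk A_j B_k."""
    return [
        sum(levi_civita(i, j, k) * a[j] * b[k]
            for j in range(3) for k in range(3))
        for i in range(3)
    ]


# Identity (1.10): eps_ijk eps_ilm = delta_jl delta_km - delta_jm delta_kl,
# with the repeated index i summed over.
identity_holds = all(
    sum(levi_civita(i, j, k) * levi_civita(i, l, m) for i in range(3))
    == delta(j, l) * delta(k, m) - delta(j, m) * delta(k, l)
    for j, k, l, m in itertools.product(range(3), repeat=4)
)

# BAC-CAB rule (1.12) on three arbitrary vectors.
random.seed(0)
A, B, C = ([random.uniform(-1, 1) for _ in range(3)] for _ in range(3))
lhs = cross(A, cross(B, C))
rhs = [B[i] * dot(A, C) - C[i] * dot(A, B) for i in range(3)]
bac_cab_error = max(abs(x - y) for x, y in zip(lhs, rhs))

print("identity (1.10) holds:", identity_holds)
print("largest BAC-CAB mismatch:", bac_cab_error)
```

A numerical spot check like this is no substitute for the proof, but it is a cheap way to catch a misplaced index or sign.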

2. The Wave Equation

We will now derive the wave equation satisfied by electromagnetic waves traveling through vacuum. Maxwell’s equations in vacuum read

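$$\begin{aligned} \nabla \cdot \mathbf{E} &= 0, & \nabla \times \mathbf{E} &= -\dot{\mathbf{B}}, \\ \nabla \cdot \mathbf{B} &= 0, & \nabla \times \mathbf{B} &= \frac{1}{c^2} \dot{\mathbf{E}}. \end{aligned} \tag{2.1}$$

To compare these to equations involving $\mu_0$ and $\epsilon_0$, remember that $\mu_0 \epsilon_0 = \frac{1}{c^2}$. Using the two equations above involving the curl, we find

$$\nabla \times (\nabla \times \mathbf{E}) = \nabla \times (-\dot{\mathbf{B}}) = -\partial_t (\nabla \times \mathbf{B}) = -\partial_t \left( \frac{1}{c^2} \dot{\mathbf{E}} \right) = -\frac{1}{c^2} \ddot{\mathbf{E}}. \tag{2.2}$$

Note that in the second step, I swapped the order of the curl and the time derivative. That is perfectly fine - you are free to take derivatives in whichever order you please. On the other hand, we can also use the vector derivative identity (1.22) applied to $\mathbf{E}$ along with Maxwell’s equation $\nabla \cdot \mathbf{E} = 0$:

$$\nabla \times (\nabla \times \mathbf{E}) = \nabla(\nabla \cdot \mathbf{E}) - \nabla^2 \mathbf{E} = -\nabla^2 \mathbf{E}. \tag{2.3}$$

Setting (2.2) and (2.3) equal to each other derives the wave equation:

$$-\frac{1}{c^2} \ddot{\mathbf{E}} = -\nabla^2 \mathbf{E} \implies \Box \mathbf{E} = 0. \tag{2.4}$$

The differential operator, $\Box$, often called the d’Alembertian (or just “box”), is defined as

$$\Box \equiv -\frac{1}{c^2} \frac{\partial^2}{\partial t^2} + \nabla^2. \tag{2.5}$$

The same derivation shows that $\mathbf{B}$ satisfies the exact same wave equation,

$$\Box \mathbf{B} = 0. \tag{2.6}$$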

In fact, you see this type of wave equation all over the place where you expect to see waves of some sort - the wave equation is not special to electromagnetic waves. The things that change from situation to situation are the quantity that satisfies the wave equation and the propagation speed. For electromagnetic waves in vacuum, the electric and magnetic fields satisfy the wave equation and the speed $c$ is the speed of light in vacuum. For sound in air, $c$ would be the speed of sound in air and the quantity satisfying the wave equation would be the displacement of air molecules along the direction of propagation of the sound wave relative to their equilibrium position. For transverse waves on a string, $c$ would be the speed of those particular waves, and the quantity satisfying the wave equation is the displacement of the string up or down transverse to the direction in which it is stretched.

The propagation speed is related to properties of the material through which the wave is propagating. For electromagnetic waves in vacuum, the speed is related to the magnetic permeability, $\mu_0$, and electric permittivity, $\epsilon_0$, of vacuum via $c = 1/\sqrt{\mu_0 \epsilon_0}$. In fact, the discovery of this relationship between the speed of light and the electromagnetic properties of vacuum led Maxwell to the discovery that light is an electromagnetic wave.

Let us assume for simplicity that air is a diatomic ideal gas. Let $\Psi(t, x)$ be the displacement relative to equilibrium of air molecules as a function of time along the direction $x$, which is the direction of propagation of the sound wave. Then, one can show that $\Psi$ satisfies the wave equation

$$\Box \Psi = \left( -\frac{1}{c^2} \frac{\partial^2}{\partial t^2} + \frac{\partial^2}{\partial x^2} \right) \Psi(t, x) = 0, \tag{2.7}$$

where the speed of sound in air, $c$, is related to the temperature, $T$, of the air and the mass, $m$, of the air molecule via

$$c = \sqrt{\frac{7kT}{5m}}. \tag{2.8}$$

Here $k$ is Boltzmann’s constant.

For transverse waves on a string, let $\Psi(t, x)$ be the up and down (transverse) displacement of a point at position $x$ along the string as a function of time. One can show that $\Psi(t, x)$ satisfies the exact same wave equation as (2.7), but where the propagation speed is related to the tension (a force), $T$, in the string and the mass per unit length, $\mu$, of the string via

$$c = \sqrt{\frac{T}{\mu}}. \tag{2.9}$$

Aside: You may have learned Maxwell’s equations in integral form and with charges and currents:

$$\oint \mathbf{E} \cdot d\mathbf{a} = \frac{Q_{\text{enc}}}{\epsilon_0}, \tag{2.10a}$$

$$\oint \mathbf{B} \cdot d\mathbf{a} = 0, \tag{2.10b}$$

$$\oint \mathbf{E} \cdot d\boldsymbol{\ell} = -\dot{\Phi}_B, \tag{2.10c}$$

$$\oint \mathbf{B} \cdot d\boldsymbol{\ell} = \mu_0 I_{\text{enc}} + \frac{1}{c^2} \dot{\Phi}_E. \tag{2.10d}$$

The first two integrals are done over a closed surface, which is the boundary enclosing some volume of space. Then, $Q_{\text{enc}}$ is the charge inside that volume of space. The last two integrals are done over a closed loop, which is the boundary of some open surface. Then,

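$\Phi_E$ and $\Phi_B$ are the electric and magnetic fluxes through that open surface (i.e., the surface integral of the electric and magnetic fields over that open surface), and $I_{\text{enc}}$ is the current piercing that open surface.

Let $\rho$ and $\mathbf{J}$ be the volume charge and current densities. The volume integral of $\rho$ gives the total charge inside that volume, and the surface integral of $\mathbf{J}$ gives the total current piercing that surface. Denote a volume of space by $V$ and the closed surface (or collection of closed surfaces) which is its boundary by $\partial V$. Denote an open surface by $S$ and the closed loop (or collection of closed loops) which is its boundary by $\partial S$. Then, we can write Maxwell’s equations as

$$\oint_{\partial V} \mathbf{E} \cdot d\mathbf{a} = \int_V \frac{\rho}{\epsilon_0} \, d^3x, \tag{2.11a}$$

$$\oint_{\partial V} \mathbf{B} \cdot d\mathbf{a} = 0, \tag{2.11b}$$

$$\oint_{\partial S} \mathbf{E} \cdot d\boldsymbol{\ell} = -\int_S \dot{\mathbf{B}} \cdot d\mathbf{a}, \tag{2.11c}$$

$$\oint_{\partial S} \mathbf{B} \cdot d\boldsymbol{\ell} = \int_S \left( \mu_0 \mathbf{J} + \frac{1}{c^2} \dot{\mathbf{E}} \right) \cdot d\mathbf{a}. \tag{2.11d}$$

Now, we can make use of the divergence theorem and Stokes’ theorem,

$$\oint_{\partial V} \mathbf{E} \cdot d\mathbf{a} = \int_V (\nabla \cdot \mathbf{E}) \, d^3x, \qquad \oint_{\partial S} \mathbf{E} \cdot d\boldsymbol{\ell} = \int_S (\nabla \times \mathbf{E}) \cdot d\mathbf{a}, \tag{2.12}$$

to write Maxwell’s equations as

$$\int_V (\nabla \cdot \mathbf{E}) \, d^3x = \int_V \frac{\rho}{\epsilon_0} \, d^3x, \tag{2.13a}$$

$$\int_V (\nabla \cdot \mathbf{B}) \, d^3x = 0, \tag{2.13b}$$

$$\int_S (\nabla \times \mathbf{E}) \cdot d\mathbf{a} = -\int_S \dot{\mathbf{B}} \cdot d\mathbf{a}, \tag{2.13c}$$

$$\int_S (\nabla \times \mathbf{B}) \cdot d\mathbf{a} = \int_S \left( \mu_0 \mathbf{J} + \frac{1}{c^2} \dot{\mathbf{E}} \right) \cdot d\mathbf{a}. \tag{2.13d}$$

Since these equations hold for arbitrary $V$ and $S$, we must have

$$\begin{aligned} \nabla \cdot \mathbf{E} &= \frac{\rho}{\epsilon_0}, & \nabla \times \mathbf{E} &= -\dot{\mathbf{B}}, \\ \nabla \cdot \mathbf{B} &= 0, & \nabla \times \mathbf{B} &= \mu_0 \mathbf{J} + \frac{1}{c^2} \dot{\mathbf{E}}. \end{aligned} \tag{2.14}$$

These are Maxwell’s equations in differential form. Then, Maxwell’s equations in vacuum are simply these without any charges or currents: $\rho = 0$ and $\mathbf{J} = 0$. By the way, you should now be able to derive

$$\Box \mathbf{E} = \mu_0 \dot{\mathbf{J}} + \frac{1}{\epsilon_0} \nabla \rho, \qquad \Box \mathbf{B} = -\mu_0 \nabla \times \mathbf{J}. \tag{2.15}$$

Indeed, these reduce to the wave equation in vacuum when we set $\rho = 0$ and $\mathbf{J} = 0$.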
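Returning for a moment to the three propagation speeds quoted before the aside, it can help to put numbers into them. The following Python sketch does so under assumptions that are mine rather than the notes’: the physical constants are standard reference values typed in by hand, air is treated as a single species with an assumed mean molecular mass of about 28.97 atomic mass units at an assumed temperature of 293.15 K, and the string tension and mass density are made-up illustrative numbers.

```python
import math

MU_0 = 4 * math.pi * 1e-7           # vacuum permeability, T*m/A (conventional value)
EPSILON_0 = 8.8541878128e-12        # vacuum permittivity, F/m
K_B = 1.380649e-23                  # Boltzmann's constant, J/K
ATOMIC_MASS_UNIT = 1.66053907e-27   # kg


def light_speed():
    """Speed of light from c = 1 / sqrt(mu_0 * epsilon_0)."""
    return 1.0 / math.sqrt(MU_0 * EPSILON_0)


def sound_speed(temperature_kelvin, molecule_mass_kg):
    """Speed of sound in a diatomic ideal gas, eq. (2.8)."""
    return math.sqrt(7 * K_B * temperature_kelvin / (5 * molecule_mass_kg))


def string_wave_speed(tension_newtons, mass_per_length_kg_per_m):
    """Speed of transverse waves on a string, eq. (2.9)."""
    return math.sqrt(tension_newtons / mass_per_length_kg_per_m)


# Assumed inputs (illustrative, not taken from the notes).
air_molecule_mass = 28.97 * ATOMIC_MASS_UNIT   # assumed mean mass of an "air molecule"
room_temperature = 293.15                      # assumed 20 degrees Celsius, in kelvin

c_light = light_speed()
c_sound = sound_speed(room_temperature, air_molecule_mass)
c_string = string_wave_speed(100.0, 0.01)      # assumed 100 N tension, 10 g per metre

print(f"light in vacuum: {c_light:.4e} m/s")
print(f"sound in air:    {c_sound:.1f} m/s")
print(f"string waves:    {c_string:.1f} m/s")
```

The same wave equation therefore covers speeds that differ by roughly six orders of magnitude; only the value of $c$ and the meaning of $\Psi$ change.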

3. Solving the Wave Equation

For simplicity, let us try to solve the one-dimensional wave equation first. Then we can generalize our solution to three dimensions. One-dimensional wave equations look like (2.7), where there is only one spatial dimension, which we have called x. This equation is often described as a “linear differential equation”. That may strike you as peculiar; if anything, the equation looks quadratic. People often get around this confusion by saying that the more “advanced” meaning of the word “linear” differential equation is that the sum of two arbitrary solutions to the equation is itself a solution. Even though there is nothing wrong with that statement, there is no real need for such a redefinition. If a differential equation is truly linear then there must exist a change of variables that really truly makes the equation look linear. For the one-dimensional wave equation, the change of variables is

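$$\tau \equiv \tfrac{1}{2}(x + ct), \qquad \sigma \equiv \tfrac{1}{2}(x - ct). \tag{3.1}$$

One can write the derivatives with respect to the old variables in terms of the new ones. Define $\partial_t = \frac{\partial}{\partial t}$ and $\partial_x = \frac{\partial}{\partial x}$ as well as $\partial_\tau = \frac{\partial}{\partial \tau}$ and $\partial_\sigma = \frac{\partial}{\partial \sigma}$. Then,

$$\frac{1}{c} \partial_t = \frac{1}{c} \frac{\partial \tau}{\partial t} \partial_\tau + \frac{1}{c} \frac{\partial \sigma}{\partial t} \partial_\sigma = \frac{\partial_\tau - \partial_\sigma}{2}, \tag{3.2a}$$

$$\partial_x = \frac{\partial \tau}{\partial x} \partial_\tau + \frac{\partial \sigma}{\partial x} \partial_\sigma = \frac{\partial_\tau + \partial_\sigma}{2}. \tag{3.2b}$$

We could also invert these relations:

$$\partial_\tau = \partial_x + \frac{1}{c} \partial_t, \qquad \partial_\sigma = \partial_x - \frac{1}{c} \partial_t. \tag{3.3}$$

Therefore, the one-dimensional d’Alembertian can be written as

$$-\frac{1}{c^2} \partial_t^2 + \partial_x^2 = \left( \partial_x + \frac{1}{c} \partial_t \right) \left( \partial_x - \frac{1}{c} \partial_t \right) = \partial_\tau \partial_\sigma. \tag{3.4}$$

So, we see that the one-dimensional wave equation (2.7) can be written as

$$\partial_\tau \partial_\sigma \Psi = 0. \tag{3.5}$$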

Now it is clear that this is a linear differential equation - it is linear in $\tau$ and $\sigma$ separately! It is also very easy to solve now - either $\partial_\tau \Psi = 0$ or $\partial_\sigma \Psi = 0$. Of course, you could have both, but that just means $\Psi$ is a constant, which is uninteresting, and does not actually describe a traveling wave. In other words, $\Psi$ can be an arbitrary function of $\tau$, as long as it does not depend at all on $\sigma$, or $\Psi$ can be an arbitrary function of $\sigma$, as long as it does not depend at all on $\tau$. In general, $\Psi$ can be a sum of these two things. Thus, the most general solution can be written as

$$\Psi(t, x) = \Psi_L(\tau) + \Psi_R(\sigma). \tag{3.6}$$

The part of Ψ which just depends on τ is called “left-moving” (hence the subscript L) and the part which just depends on σ is called “right-moving” (hence the subscript R).
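To see the general solution (3.6) at work, one can pick any left-mover and any right-mover and check numerically that their sum is annihilated by the d’Alembertian. The sketch below is only an illustration under assumed choices: the wave speed `C = 2.0`, the particular profiles (a Gaussian of $x + ct$ and a sine of $x - ct$), the step size `h` and the sample points are all arbitrary, and the second derivatives are estimated with simple central differences rather than computed exactly.

```python
import math

C = 2.0  # assumed wave speed, arbitrary units


def psi(t, x):
    """A left-mover (function of x + ct) plus a right-mover (function of x - ct)."""
    left_moving = math.exp(-(x + C * t) ** 2)
    right_moving = math.sin(x - C * t)
    return left_moving + right_moving


def box_psi(t, x, h=1e-3):
    """Central-difference estimate of (-(1/c^2) d^2/dt^2 + d^2/dx^2) psi."""
    d2_dt2 = (psi(t + h, x) - 2 * psi(t, x) + psi(t - h, x)) / h ** 2
    d2_dx2 = (psi(t, x + h) - 2 * psi(t, x) + psi(t, x - h)) / h ** 2
    return -d2_dt2 / C ** 2 + d2_dx2


# Arbitrary (t, x) sample points at which to test the wave equation.
sample_points = [(0.0, 0.0), (0.3, -0.5), (1.0, 0.7)]
max_residual = max(abs(box_psi(t, x)) for t, x in sample_points)

print("largest |box psi| over the sample points:", max_residual)
```

The residual is not exactly zero because of the finite step size, but it should be tiny compared with the individual second derivatives, which are of order one here.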

An interesting set of solutions are called plane wave solutions:

$$\Psi_L(\tau) = A \sin(2k\tau + \varphi), \qquad \Psi_R(\sigma) = A \sin(2k\sigma + \varphi). \tag{3.7}$$

Here, $|A|$ is the constant amplitude, $k$ the constant wave number, and $\varphi$ the constant phase shift. I do not mean here that the left-moving part and the right-moving part of the general solution have to have the same amplitude, wave number and phase shift - that certainly need not be the case! If you want, you can put a subscript $L$ or $R$ on $A$, $k$ and $\varphi$. I haven’t done so because it will clutter these expressions unnecessarily. In general, one can treat the left- and right-moving parts completely independently of each other.

As far as solving the wave equation is concerned, there is absolutely nothing special about the sine function. The utility of these “plane wave” solutions is that any arbitrary solution to the wave equation can be written as some sort of superposition of plane wave solutions. Therefore, we don’t actually lose any generality by focusing only on these special solutions.

In terms of the old time and position variables, (3.7) reads

$$\Psi_L(t, x) = A \sin(kx + \omega t + \varphi), \qquad \Psi_R(t, x) = A \sin(kx - \omega t + \varphi), \tag{3.8}$$

where

$$\omega = kc. \tag{3.9}$$

The wave number, $k$, is related to the wavelength via $k = \frac{2\pi}{\lambda}$. The angular frequency is $\omega$, which is related to the frequency, $\nu$, via $\omega = 2\pi\nu$. The relation (3.9), which relates $\omega$ and $k$, is in general called a dispersion relation. We could also have used cosines instead, but cosines and sines are related by a $\pi/2$ phase shift. So, with an arbitrary phase shift, $\varphi$, including cosines would be redundant.

Actually, when we increase the number of space dimensions from one, it becomes more convenient to fix the sign of the $\omega t$ term to be negative and allow $k$ to be positive or negative:

$$\Psi(t, x) = A \sin(kx - \omega t + \varphi). \tag{3.10}$$

With this convention, we should write $|k| = \frac{2\pi}{\lambda}$ and $\omega = |k|c$ because it is possible for $k$ to be negative. If $k$ is positive, then this wave is propagating in the $+x$ direction (right-moving) and if $k$ is negative, then it is moving in the $-x$ direction (left-moving).

For example, below, I have graphed the Gaussian pulse function $e^{-k^2(x+ct)^2}$ over $x$ at the times $t = 0, 4, 8, 12$. I have set $k = 1$ and $c = 1$ just for convenience. You can see that the Gaussian pulse moves to the left over time. Indeed, $e^{-k^2(x+ct)^2}$ is a function of $x + ct$, or of $\tau$, which is the so-called left-moving coordinate, but not of $x - ct$, or of $\sigma$, which is the so-called right-moving coordinate.

[Figure: four panels showing the Gaussian pulse $e^{-k^2(x+ct)^2}$ plotted against $x$ (horizontal axis from $-15$ to $0$, vertical axis from $0$ to $1$), one panel for each of $t = 0, 4, 8, 12$, with $k = 1$ and $c = 1$.]
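If the plots are not in front of you, the leftward motion of the pulse is easy to reproduce. The short Python sketch below is an illustration, not the code used to make the original figure: the grid spacing of 0.01 on the interval from $-15$ to $0$ and the function names are my own assumptions. It evaluates the pulse with $k = 1$ and $c = 1$ and reports where its maximum sits at each of the four times.

```python
import math

K = 1.0  # wave number used for the figure
C = 1.0  # propagation speed used for the figure


def gaussian_pulse(t, x):
    """The pulse exp(-k^2 (x + ct)^2), a function of x + ct only."""
    return math.exp(-(K * (x + C * t)) ** 2)


# Assumed sampling grid on the plotted interval -15 <= x <= 0.
x_grid = [-15 + 0.01 * n for n in range(1501)]


def peak_position(t):
    """Grid point at which the pulse is largest at time t."""
    return max(x_grid, key=lambda x: gaussian_pulse(t, x))


for time in (0, 4, 8, 12):
    print(f"t = {time:2d}: peak near x = {peak_position(time):.2f}")
```

The peak should track $x = -ct$, which is just the statement that the pulse depends only on the left-moving coordinate $\tau$.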

The generalization of (3.10) to higher dimensions is

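$$\Psi(t, \mathbf{x}) = A \sin(\mathbf{k} \cdot \mathbf{x} - \omega t + \varphi), \tag{3.11}$$

where the wavenumber, $k$, becomes a wavevector, $\mathbf{k}$, and $\mathbf{x} = (x, y, z) = x\hat{\mathbf{x}} + y\hat{\mathbf{y}} + z\hat{\mathbf{z}}$. The direction of $\mathbf{k}$, which is $\hat{\mathbf{k}} \equiv \frac{\mathbf{k}}{|\mathbf{k}|}$, is the direction of propagation of the wave.

The cosine and sine functions may be written in terms of complex exponentials:

$$\cos x = \frac{e^{ix} + e^{-ix}}{2}, \qquad \sin x = \frac{e^{ix} - e^{-ix}}{2i}. \tag{3.12}$$

The inverse relationships are

$$e^{\pm ix} = \cos x \pm i \sin x. \tag{3.13}$$

Therefore, instead of considering plane waves of the form (3.11), it is often computationally simpler to consider the form

$$\Psi(t, \mathbf{x}) = A e^{i(\mathbf{k} \cdot \mathbf{x} - \omega t + \varphi)}. \tag{3.14}$$

In this form, derivatives of plane waves simply turn into multiplication:

$$\dot{\Psi} = -i\omega\Psi, \qquad \nabla\Psi = i\mathbf{k}\Psi. \tag{3.15}$$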
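The rule (3.15) is easy to test numerically with complex arithmetic. Here is a minimal Python sketch; the wavevector, frequency, evaluation point and finite-difference step are arbitrary assumed values, and `plane_wave` is a hypothetical helper name. It compares central-difference derivatives of the complex plane wave (3.14) with the products $-i\omega\Psi$ and $i\mathbf{k}\Psi$.

```python
import cmath


def plane_wave(t, x, k, omega, amplitude=1.0, phase=0.0):
    """Complex plane wave A exp(i(k.x - omega t + phase)) of eq. (3.14)."""
    k_dot_x = sum(k_i * x_i for k_i, x_i in zip(k, x))
    return amplitude * cmath.exp(1j * (k_dot_x - omega * t + phase))


# Arbitrary illustrative values (no dispersion relation is needed for this check).
k = (1.0, -2.0, 0.5)
omega = 3.0
t0 = 0.4
x0 = (0.1, 0.2, -0.3)
h = 1e-6  # finite-difference step

psi0 = plane_wave(t0, x0, k, omega)

# Time derivative by central differences, compared with -i omega psi.
time_derivative = (plane_wave(t0 + h, x0, k, omega)
                   - plane_wave(t0 - h, x0, k, omega)) / (2 * h)
time_error = abs(time_derivative - (-1j * omega * psi0))


def shifted(point, axis, step):
    """Copy of `point` with one coordinate moved by `step`."""
    moved = list(point)
    moved[axis] += step
    return tuple(moved)


# Gradient by central differences, compared component-wise with i k psi.
gradient = [
    (plane_wave(t0, shifted(x0, axis, h), k, omega)
     - plane_wave(t0, shifted(x0, axis, -h), k, omega)) / (2 * h)
    for axis in range(3)
]
gradient_error = max(abs(gradient[axis] - 1j * k[axis] * psi0) for axis in range(3))

print("time-derivative mismatch:", time_error)
print("gradient mismatch:       ", gradient_error)
```

Both mismatches should be at the level of the finite-difference error, which is why replacing $\partial_t$ by $-i\omega$ and $\nabla$ by $i\mathbf{k}$ is a safe shortcut for plane waves.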

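3.1. Electromagnetic Plane Waves

Let us consider plane wave solutions to the wave equation satisfied by $\mathbf{E}$ and $\mathbf{B}$:

$$\mathbf{E} = \mathbf{E}_0 \, e^{i(\mathbf{k} \cdot \mathbf{x} - \omega t + \varphi)}, \qquad \mathbf{B} = \mathbf{B}_0 \, e^{i(\mathbf{k}' \cdot \mathbf{x} - \omega' t + \varphi')}. \tag{3.16}$$

Here, $|\mathbf{E}_0|$ and $|\mathbf{B}_0|$ are the amplitudes of the electric and magnetic fields, respectively. Since the fields separately satisfy the wave equation, at this point, there is no need for their wavevectors, frequencies and phase shifts to be the same. That is why we have put primes on those quantities in the magnetic field. However, Maxwell’s equations imply that they are actually the same. Maxwell’s equations read

$$\mathbf{k} \cdot \mathbf{E}_0 \, e^{i(\mathbf{k} \cdot \mathbf{x} - \omega t + \varphi)} = 0, \tag{3.17a}$$

$$\mathbf{k}' \cdot \mathbf{B}_0 \, e^{i(\mathbf{k}' \cdot \mathbf{x} - \omega' t + \varphi')} = 0, \tag{3.17b}$$

$$\mathbf{k} \times \mathbf{E}_0 \, e^{i(\mathbf{k} \cdot \mathbf{x} - \omega t + \varphi)} = \omega' \mathbf{B}_0 \, e^{i(\mathbf{k}' \cdot \mathbf{x} - \omega' t + \varphi')}, \tag{3.17c}$$

$$\mathbf{k}' \times \mathbf{B}_0 \, e^{i(\mathbf{k}' \cdot \mathbf{x} - \omega' t + \varphi')} = -\frac{\omega}{c^2} \mathbf{E}_0 \, e^{i(\mathbf{k} \cdot \mathbf{x} - \omega t + \varphi)}. \tag{3.17d}$$

These equations must hold for all $t$ and $\mathbf{x}$. Either one of the last two equations immediately implies that

$$\mathbf{k}' = \mathbf{k}, \qquad \omega' = \omega, \qquad \varphi' = \varphi. \tag{3.18}$$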

Now, we can write Maxwell’s equations more simply as

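$$\begin{aligned} \mathbf{k} \cdot \mathbf{E} &= 0, & \mathbf{k} \times \mathbf{E} &= \omega \mathbf{B}, \\ \mathbf{k} \cdot \mathbf{B} &= 0, & \mathbf{k} \times \mathbf{B} &= -\frac{\omega}{c^2} \mathbf{E}. \end{aligned} \tag{3.19}$$

We have already noted that the wavevector, $\mathbf{k}$, points in the direction of propagation of the wave. Maxwell’s equations in the above form say that $\mathbf{E}$ and $\mathbf{B}$ are perpendicular to $\mathbf{k}$. Thus, electromagnetic waves are transverse waves. In addition, $\mathbf{E}$ and $\mathbf{B}$ are perpendicular to each other with directions related via

$$\mathbf{E} \times \mathbf{B} \propto \mathbf{k}, \tag{3.20}$$

and magnitudes related via

$$|\mathbf{k}||\mathbf{E}| = \omega|\mathbf{B}| \implies |\mathbf{E}| = c|\mathbf{B}|. \tag{3.21}$$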
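These algebraic relations can be checked on a concrete example. In the Python sketch below, everything numerical is an assumption made for illustration: a 500 nm wavelength, a propagation direction along $(1, 2, 2)/3$, and an electric amplitude of 50 V/m along $(2, -2, 1)/3$, which is perpendicular to that direction. The magnetic amplitude is built from $\mathbf{k} \times \mathbf{E} = \omega\mathbf{B}$, and the remaining relations in (3.19) through (3.21) are then tested.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


# Assumed illustrative wave: 500 nm light travelling along (1, 2, 2)/3.
wavelength = 500e-9
k_magnitude = 2 * math.pi / wavelength
k = tuple(k_magnitude * n / 3 for n in (1, 2, 2))
omega = norm(k) * C_LIGHT                      # dispersion relation omega = |k| c

# Assumed electric amplitude, chosen perpendicular to k: (2, -2, 1).(1, 2, 2) = 0.
E0 = tuple(50.0 * n / 3 for n in (2, -2, 1))   # V/m

# Magnetic amplitude from k x E = omega B in (3.19).
B0 = tuple(component / omega for component in cross(k, E0))

# Relative size of k.B (should vanish: B is transverse).
transversality = abs(dot(k, B0)) / (norm(k) * norm(B0))
# |E| / |B| (should equal c, eq. (3.21)).
amplitude_ratio = norm(E0) / norm(B0)
# Relative residual of k x B = -(omega / c^2) E in (3.19).
k_cross_b = cross(k, B0)
ampere_residual = norm(tuple(kb + omega / C_LIGHT ** 2 * e
                             for kb, e in zip(k_cross_b, E0))) / norm(k_cross_b)
# Cosine of the angle between E x B and k (should be +1, eq. (3.20)).
poynting_direction = cross(E0, B0)
alignment = dot(poynting_direction, k) / (norm(poynting_direction) * norm(k))

print("k.B (relative):      ", transversality)
print("|E|/|B| divided by c:", amplitude_ratio / C_LIGHT)
print("k x B residual:      ", ampere_residual)
print("cos angle(E x B, k): ", alignment)
```

Note that once $\mathbf{E}_0$ is chosen perpendicular to $\mathbf{k}$, the other three relations follow automatically from the dispersion relation $\omega = |\mathbf{k}|c$.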

4. Poynting Vector and Flux

Suppose you had a pipe with cross-sectional area, $A$, and through which water of mass density, $\rho$, is flowing at a speed, $v$. A reasonable question to ask would be “How much water (i.e. mass) is passing through the pipe in a given amount of time?” If we divide this quantity by the cross-sectional area through which the water flows, then we get the mass flux (of water mass) with units of $\frac{\text{mass}}{\text{area} \cdot \text{time}}$. Well, in a given time, $\Delta t$, the water travels a distance $\Delta x = v\Delta t$. This means that all the water a distance less than or equal to $\Delta x$ to the left of a particular point along the pipe will pass through that point in the given time $\Delta t$. The volume of this region is $A\Delta x = Av\Delta t$ and so the mass of water contained in this region is $\rho Av\Delta t$. This is the total mass that passes through the area $A$ in a time interval $\Delta t$. Therefore, the flux is just this divided by the area, $A$, and the time interval, $\Delta t$:

$$\Phi = \rho v. \tag{4.1}$$

Let us make a formal analogy in the case of electromagnetic waves. In this case, we would like to measure the energy flux (how much energy flows per unit area per unit time). For the water example, when we wanted mass flux, we multiplied mass density by speed, as in Eqn. (4.1). Therefore, if we want the energy flux, we need to multiply the energy density by the speed. The speed of the electromagnetic wave is just $c$. The electric and magnetic fields carry energy density given by

$$u_E = \tfrac{1}{2} \epsilon_0 |\mathbf{E}|^2, \qquad u_B = \frac{1}{2\mu_0} |\mathbf{B}|^2. \tag{4.2}$$

The total energy density is just the sum of these two. Using Eqn. (3.21), we may write

$$u \equiv u_E + u_B = \frac{1}{\mu_0 c} |\mathbf{E}||\mathbf{B}|. \tag{4.3}$$

Therefore, the energy flux is

$$\Phi = \frac{1}{\mu_0 c} |\mathbf{E}||\mathbf{B}| c = \frac{1}{\mu_0} |\mathbf{E}||\mathbf{B}|. \tag{4.4}$$

Since $\mathbf{E}$ and $\mathbf{B}$ are perpendicular to each other, we could write $\Phi = \frac{1}{\mu_0} |\mathbf{E} \times \mathbf{B}|$. Since we also know that $\mathbf{E} \times \mathbf{B} \propto \hat{\mathbf{k}}$, which is the direction of propagation of the wave, we define the Poynting vector,

$$\mathbf{S} \equiv \frac{1}{\mu_0} \mathbf{E} \times \mathbf{B}, \tag{4.5}$$

whose magnitude is simply the energy flux and whose direction is the propagation direction. Since the momentum and energy of light are related via $p = E/c$ (here $E$ denotes energy, not the electric field), the momentum flux is

$$\boldsymbol{\mathcal{P}} \equiv \mathbf{S}/c. \tag{4.6}$$

The magnitude, $\mathcal{P}$, of this is the rate at which momentum is passing through some area, per unit area. If it is perfectly absorbed by a surface, then it is equal to the radiation pressure exerted on that surface. If it is perfectly reflected back by a surface, then the radiation pressure exerted on the surface is twice as big.

4.1. Red Laser Pointer

The output of a red laser pointer ($\lambda = 635$ nm) has a beam power of 10.0 mW and a beam diameter of 1.00 mm. It is propagating in vacuum in the $+x$ direction and is polarized in the $y$ direction. Write down an expression for the electric and magnetic fields in the beam and the Poynting vector as a function of time and position. If this beam illuminates a surface that absorbs 40% and reflects 60%, find the net force on the surface due to radiation pressure. [Assume uniform irradiance across the beam’s cross-section. Note: the polarization is the direction of the electric field with $+y$ and $-y$ directions being counted as the same.]

SOLUTION:

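The wavevector is $\mathbf{k} = \frac{2\pi}{\lambda} \hat{\mathbf{x}} \approx (10^7 \text{ rad/m}) \hat{\mathbf{x}}$. It is in the $\hat{\mathbf{x}}$ direction because that is the direction of propagation. Therefore, $\mathbf{k} \cdot \mathbf{x} = (10^7 \text{ rad/m}) x$. The angular frequency is related to the wavevector via $\omega = |\mathbf{k}|c \approx 3 \times 10^{15}$ rad/s. Since the polarization is in the $y$ direction, the amplitude of the electric field is $\mathbf{E}_0 = E_0 \hat{\mathbf{y}}$. Now, $\mathbf{E} \times \mathbf{B} \propto \hat{\mathbf{k}} = \hat{\mathbf{x}}$, which implies that $\mathbf{B} \propto \hat{\mathbf{z}}$, or $\mathbf{B}_0 = B_0 \hat{\mathbf{z}}$. Since $|\mathbf{B}| = |\mathbf{E}|/c$, we can also write $\mathbf{B}_0 = \frac{1}{c} E_0 \hat{\mathbf{z}}$. The amplitude of the Poynting vector is $\mathbf{S}_0 = \frac{1}{\mu_0} \mathbf{E}_0 \times \mathbf{B}_0 = \frac{1}{\mu_0 c} E_0^2 \hat{\mathbf{x}} = \epsilon_0 c E_0^2 \hat{\mathbf{x}}$. As usual, since $\mathbf{E}$ and $\mathbf{B}$ oscillate identically in time, the average Poynting vector is half of its amplitude: $\langle \mathbf{S} \rangle = \frac{1}{2} \epsilon_0 c E_0^2 \hat{\mathbf{x}}$. The irradiance is simply the magnitude of the average Poynting vector: $I = |\langle \mathbf{S} \rangle| = \frac{1}{2} \epsilon_0 c E_0^2$. This is equal to the power per unit area:

$$\frac{1}{2} \epsilon_0 c E_0^2 = I = \frac{P}{A} = \frac{P}{\pi r^2},$$

where $r$ is the radius of the beam.

Solving for $E_0$ and using $\epsilon_0 = 8.9 \times 10^{-12}$ J·V$^{-2}$·m$^{-1}$ gives

$$E_0 = \sqrt{\frac{2P}{\pi r^2 \epsilon_0 c}} = 3.10 \times 10^3 \ \frac{\text{V}}{\text{m}}.$$

Note that $r = 5 \times 10^{-4}$ m (half the diameter). We also therefore have

$$B_0 = \frac{E_0}{c} = 1.03 \times 10^{-5} \text{ T}.$$

Here, T stands for tesla, which is the SI unit for magnetic fields. Therefore, the electric and magnetic fields are

$$\mathbf{E} = (3.10 \times 10^3 \text{ V/m}) \, \hat{\mathbf{y}} \cos\left[ (10^7 \text{ rad/m}) x - (3 \times 10^{15} \text{ rad/s}) t + \phi \right],$$

$$\mathbf{B} = (1.03 \times 10^{-5} \text{ T}) \, \hat{\mathbf{z}} \cos\left[ (10^7 \text{ rad/m}) x - (3 \times 10^{15} \text{ rad/s}) t + \phi \right].$$

There is an arbitrary phase, $\phi$, which we can dial to whatever we want depending on when we choose $t = 0$ to be. Since $\phi$ is arbitrary, you could have used sines instead of cosines. You could also have used the complex exponential form, if you wish, as long as you keep in the back of your mind that the actual fields are the real parts or the imaginary parts since the fields can’t be complex.

Recall that half of the amplitude of the Poynting vector is equal to the irradiance, $P/\pi r^2 = 1.27 \times 10^4$ W·m$^{-2}$. Thus,

$$\mathbf{S} = (2.55 \times 10^4 \text{ W/m}^2) \, \hat{\mathbf{x}} \cos^2\left[ (10^7 \text{ rad/m}) x - (3 \times 10^{15} \text{ rad/s}) t + \phi \right].$$

The average of the $\cos^2$ term is just 1/2. The average momentum flux is

$$\langle \boldsymbol{\mathcal{P}} \rangle = \langle \mathbf{S} \rangle / c = \mathbf{S}_0 / 2c = (4.24 \times 10^{-5} \text{ N/m}^2) \, \hat{\mathbf{x}}.$$

By momentum conservation, if this is absorbed by a surface, then this momentum is transferred to the surface. If it is perfectly reflected, then the momentum flux of the beam after the reflection is $-\boldsymbol{\mathcal{P}}$. Conservation of momentum implies that $2\boldsymbol{\mathcal{P}}$ must be transferred to the surface so that the total is still $\boldsymbol{\mathcal{P}}$. That is, the reflected case gives twice as much pressure as the absorbed case. Therefore, the radiation pressure on the surface is

$$0.4|\boldsymbol{\mathcal{P}}| + 0.6|2\boldsymbol{\mathcal{P}}| = 1.6|\boldsymbol{\mathcal{P}}| = 6.79 \times 10^{-5} \text{ N/m}^2.$$

The force would be this pressure multiplied by the beam area:

$$F = (6.79 \times 10^{-5} \text{ N/m}^2)[\pi \times (5 \times 10^{-4} \text{ m})^2] = 5.33 \times 10^{-11} \text{ N}.$$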
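The arithmetic in this solution is easy to reproduce. The following Python sketch simply re-runs the steps above with the numbers given in the problem; the variable names and the four-significant-figure values of $\epsilon_0$ and $c$ are my own assumptions, so the last digit of each result may differ slightly from the rounded values quoted in the text.

```python
import math

EPSILON_0 = 8.854e-12   # vacuum permittivity, F/m (assumed rounded value)
C_LIGHT = 2.998e8       # speed of light, m/s (assumed rounded value)

# Data from the problem statement.
wavelength = 635e-9     # m
beam_power = 10.0e-3    # W
beam_radius = 0.50e-3   # m (half of the 1.00 mm diameter)
absorbed_fraction = 0.4
reflected_fraction = 0.6

wavenumber = 2 * math.pi / wavelength          # rad/m
angular_frequency = wavenumber * C_LIGHT       # rad/s

beam_area = math.pi * beam_radius ** 2
irradiance = beam_power / beam_area            # average |S|, W/m^2

# Irradiance = (1/2) epsilon_0 c E0^2, solved for E0; then B0 = E0 / c.
e_amplitude = math.sqrt(2 * irradiance / (EPSILON_0 * C_LIGHT))
b_amplitude = e_amplitude / C_LIGHT

# Average momentum flux, then pressure with reflection counted twice.
momentum_flux = irradiance / C_LIGHT
pressure = (absorbed_fraction + 2 * reflected_fraction) * momentum_flux
force = pressure * beam_area

print(f"k     = {wavenumber:.3e} rad/m")
print(f"omega = {angular_frequency:.3e} rad/s")
print(f"E0    = {e_amplitude:.3e} V/m")
print(f"B0    = {b_amplitude:.3e} T")
print(f"<S>   = {irradiance:.3e} W/m^2")
print(f"P_rad = {pressure:.3e} N/m^2")
print(f"F     = {force:.3e} N")
```

One thing the code makes obvious is that the beam area cancels in the final force: $F = 1.6\,P/c$, so the force depends only on the beam power and the absorbed and reflected fractions, not on the beam diameter.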

5. Ray Tracing Diagrams for Mirrors

Consider a concave mirror whose radius of curvature is 10.0 cm. Draw ray-tracing diagrams when the object is

(a) real and sits 20.0 cm in front of the mirror;

(b) real and sits 7.0 cm in front of the mirror;

(c) real and sits 2.0 cm in front of the mirror;

(d) virtual and sits 10.0 cm behind the mirror.

I’ll leave it to you to check that these diagrams agree numerically with the results of the formulae $\frac{1}{d_i} + \frac{1}{d_o} = \frac{2}{R}$ and $m = -\frac{d_i}{d_o}$.
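For that numerical check, a few lines of Python are enough. The sketch below assumes a particular sign convention, which you should compare against the one used in your course: $d_o > 0$ for a real object in front of the mirror and $d_o < 0$ for a virtual object behind it, $d_i > 0$ for a real image in front of the mirror, and $R > 0$ for a concave mirror. The function name `mirror_image` is a hypothetical helper.

```python
def mirror_image(object_distance, radius_of_curvature):
    """Image distance and magnification from 1/d_i + 1/d_o = 2/R and m = -d_i/d_o.

    Assumed sign convention: distances in front of the mirror are positive.
    An object exactly at the focal point (d_o = R/2) has no finite image and
    would raise ZeroDivisionError here.
    """
    inverse_image_distance = 2.0 / radius_of_curvature - 1.0 / object_distance
    image_distance = 1.0 / inverse_image_distance
    magnification = -image_distance / object_distance
    return image_distance, magnification


RADIUS = 10.0  # cm, concave mirror from the problem statement

# Object distances in cm for parts (a) through (d); negative means virtual.
mirror_cases = {"a": 20.0, "b": 7.0, "c": 2.0, "d": -10.0}

for label, d_o in mirror_cases.items():
    d_i, m = mirror_image(d_o, RADIUS)
    kind = "real" if d_i > 0 else "virtual"
    print(f"({label}) d_o = {d_o:6.1f} cm -> d_i = {d_i:6.2f} cm ({kind}), m = {m:5.2f}")
```

A negative magnification means the image is inverted relative to the object, and $|m| < 1$ means it is reduced, which you can compare directly with your diagrams.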

6. Ray Tracing Diagrams for Lenses

Consider a lens whose focal length has magnitude 10.0 cm. Draw ray-tracing diagrams for the following scenarios:

(a) The object is real and sits 20.0 cm in front of the converging lens;

(b) The object is real and sits 20.0 cm in front of the diverging lens;

(c) The object is virtual and sits 5.0 cm behind the converging lens;

(d) The object is virtual and sits 5.0 cm behind the diverging lens.

Again, I leave it to you to check that these diagrams agree with the results of the formulae $\frac{1}{d_i} + \frac{1}{d_o} = \frac{1}{f}$ and $m = -\frac{d_i}{d_o}$.
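The same kind of check works for the four lens scenarios. As before, this is a sketch under an assumed sign convention that you should confirm against your course’s: $d_o > 0$ for a real object in front of the lens, $d_o < 0$ for a virtual object behind it, $d_i > 0$ for a real image behind the lens, $f > 0$ for a converging lens and $f < 0$ for a diverging one. The name `thin_lens_image` is a hypothetical helper.

```python
def thin_lens_image(object_distance, focal_length):
    """Image distance and magnification from 1/d_i + 1/d_o = 1/f and m = -d_i/d_o.

    Assumed sign convention: d_o > 0 for real objects, d_i > 0 for real images,
    f > 0 for converging lenses and f < 0 for diverging lenses.
    """
    inverse_image_distance = 1.0 / focal_length - 1.0 / object_distance
    image_distance = 1.0 / inverse_image_distance
    magnification = -image_distance / object_distance
    return image_distance, magnification


# (object distance, focal length) in cm for parts (a) through (d).
lens_cases = {
    "a": (20.0, +10.0),   # real object, converging lens
    "b": (20.0, -10.0),   # real object, diverging lens
    "c": (-5.0, +10.0),   # virtual object, converging lens
    "d": (-5.0, -10.0),   # virtual object, diverging lens
}

for label, (d_o, f) in lens_cases.items():
    d_i, m = thin_lens_image(d_o, f)
    kind = "real" if d_i > 0 else "virtual"
    print(f"({label}) d_o = {d_o:5.1f} cm, f = {f:+5.1f} cm -> "
          f"d_i = {d_i:6.2f} cm ({kind}), m = {m:5.2f}")
```

Parts (a) and (d) are worth remembering: they give magnifications of $-1$ and $+2$ respectively.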

7. Compound Optical Systems

Compound optical systems just have more than one lens and/or mirror, called optical elements. Light goes from one optical element to the next. The image of optical element 1 becomes the object for optical element 2; the image of optical element 2 becomes the object for optical element 3; and so on. It is only in this case that a virtual object can arise because the image of the previous optical element may very well lie behind the following optical element. Keep in mind that if the system contains mirrors, it is possible for one physical lens or mirror to play the role of multiple optical elements. For example, if you have a lens in front of and positioned parallel to a mirror, then light can come from an object on one side of the lens, go through the lens, hit the mirror, bounce back, and then go through the lens again! In this case, the lens plays the role of optical elements 1 and 3, while the mirror plays the role of optical element 2. In principle, you can imagine with multiple mirrors, you can even have one physical mirror playing the role of infinitely many optical elements. If you have ever been in a house of mirrors, you know well how this can come about and what it’s like.

7.1. Two-Lens Problem

You have two lenses whose focal lengths have magnitude 10.0 cm, one converging and one diverging. You want to place an object 20.0 cm in front of the first lens in such a way as to produce a final image which is real, upright and twice as large as the object. Where can you place the lenses in order to do this, and must you place the original object right side up or up side down? Where is the final image?

SOLUTION:

Part (d) of the previous section produces a linear magnification of 2. Part (a) of the previous section produces a linear magnification of −1. Combined appropriately, these could produce a linear magnification of −2. We want the image to be upright. With a negative linear magnification, the object would have to be up side down if we are going to combine parts (a) and (d) to achieve the objective. So, the first lens will be the converging one, and the object sits 20.0 cm to the left of this lens and up side down. The image of this lens sits 20.0 cm to the right of this lens, is right side up and equal in size with the object. This image becomes the object for the second lens. According to part (d), we want this object to be virtual and 5.0 cm to the right of the diverging lens. Therefore, place the diverging lens 15.0 cm to the right of the converging lens. The final image will be 10.0 cm to the right of the diverging lens, which is 25.0 cm to the right of the converging lens, or 45.0 cm to the right of the original object.
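The bookkeeping in this solution (the image of lens 1 becomes the object of lens 2) can be written as a short chain of thin-lens calculations. The Python sketch below is one way to do it, with the same assumed sign convention as for a single lens: real objects and real images have positive distances, and a negative object distance for lens 2 means the object is virtual. The function names are hypothetical helpers.

```python
def thin_lens_image(object_distance, focal_length):
    """Image distance and magnification from 1/d_i + 1/d_o = 1/f and m = -d_i/d_o."""
    image_distance = 1.0 / (1.0 / focal_length - 1.0 / object_distance)
    return image_distance, -image_distance / object_distance


def two_lens_system(object_distance, f1, f2, separation):
    """Chain two thin lenses that are `separation` apart along the axis.

    Returns the final image distance measured from lens 2 and the total
    linear magnification (the product of the two individual magnifications).
    """
    d_i1, m1 = thin_lens_image(object_distance, f1)
    # The first image sits d_i1 behind lens 1. Measured from lens 2, the
    # object distance is positive if that image lies in front of lens 2 and
    # negative (a virtual object) if it lies behind lens 2.
    d_o2 = separation - d_i1
    d_i2, m2 = thin_lens_image(d_o2, f2)
    return d_i2, m1 * m2


# Numbers from the solution: converging lens first, diverging lens 15.0 cm later.
final_image_distance, total_magnification = two_lens_system(
    object_distance=20.0, f1=+10.0, f2=-10.0, separation=15.0
)
image_from_object = 20.0 + 15.0 + final_image_distance

print(f"final image: {final_image_distance:.1f} cm to the right of lens 2")
print(f"total magnification: {total_magnification:.1f}")
print(f"distance from original object: {image_from_object:.1f} cm")
```

The total magnification of $-2$ is why the object has to be placed up side down in order for the final image to come out upright.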

The black ray goes through the vertex of the first lens. It heads towards the would-be image of the first lens, but hits the second lens first and gets bent upwards towards the final image. It is not one of the rays with simple rules for the second lens, but nevertheless we know that it must end up at the final image. The blue ray goes parallel to the axis first, hits the first lens, gets bent towards the focal point of the first lens. It heads towards the would-be image of the first lens, but hits the second lens first and gets bent upwards towards the final image. Again, this is not one of the rays with simple rules for the second lens, but nevertheless we know that it must end up at the final image. The red ray goes through the secondary focal point of the first lens, hits the first lens and comes out parallel to the axis. It heads towards the would-be image of the first lens, but hits the second lens first and gets bent upwards towards the final image. This is one of the rays with simple rules for the second lens: the red ray on the right of the second lens looks like it went through the focal point (the square dot) of the second lens. The cyan and green rays are the other two with simple rules for the second lens. We have just continued them backwards since we know they must originate from the object. The green ray corresponds to the black ray in part (d) of the previous section and the cyan ray corresponds to the red ray in part (d) of the previous section.

7.2. Two-Lens Demonstration

The previous problem is a warm-up to begin understanding the “demonstration” that I brought in to section involving one converging and one diverging lens. The converging lens will be “Lens 1” and the diverging lens will be “Lens 2”. The focal length of the converging lens is f1 = +15.5 cm and the focal length of the diverging lens is f2 = −15.0 cm. I asked you to hold the converging lens at arm’s length away from you and look at some object somewhat far away across the room. What you found was that the image that you see is smaller than and up-side-down relative to the original object. You can easily see this with a ray-tracing diagram. Consider what happens to the diagram in part (a) of Section 6 when you move the object further and further to the left of the lens. The path of the blue ray does not change! However, for example, the black ray gets closer and closer to the horizontal axis. Therefore, where the black and blue rays intersect approaches the primary focal point from the right side. The image remains up-side-down relative to the object and it gets smaller and smaller.

I had you hold the converging lens at arm’s length because the image of that lens becomes the object for the lens of your eye. Therefore, it is as if your eye is looking at a small up-side-down object whose distance in front of you is roughly equal to your arm length minus a bit more than the focal length of the converging lens. If you were to hold the converging lens too close to your eye, the image it produces would be very close to your eye and your eye would have to strain as much as it does when you try to look at any object very close to your eye.

The other reason why I had you hold the converging lens at arm’s length is that I then wanted you to place the diverging lens very close to the converging lens but closer to you. The main difference between this set-up and the one in the previous two-lens problem is that the image of the first lens is a bit farther away from the second lens than the secondary focal point of the second lens, whereas in the previous problem, the image of the first lens is within the secondary focal point of the second lens. At this point, you now see an upright image that is maybe a little larger than the original object and certainly much larger than the image you saw with just the converging lens. Then, I asked you to keep the converging lens fixed, but very slowly move the diverging lens closer and closer towards your eye. You described the image you saw as getting larger and larger. Then, at some point the image gets smaller and smaller and is up-side-down.

How can we make sense of this phenomenon? Let’s do it mathematically first. Let the original object be “Object 1” and let it have distance do1 relative to the converging lens, which is “Lens 1”. I asked you to look at an object “far away”. But, what does that mean? Far away compared to what? The object distance is a DISTANCE; it has units, namely meters. It can’t just be “big”, it has to be big compared to something.
In this case, we want it to be big relative to the focal length of Lens 1. That is, we want do1 ≫ f1. Therefore, it makes sense to define the ratio

$$ a_{o1} \equiv \frac{d_{o1}}{f_1}. \tag{7.1} $$

This number is positive because the original object is real (and so do1 > 0) and the first lens is converging (and so f1 > 0). A far-away object means large ao1, or ao1 ≫ 1. I can now literally say “large ao1” with impunity because ao1 has no units; it’s just a number. Similarly, define

$$ a_{i1} \equiv \frac{d_{i1}}{f_1}. \tag{7.2} $$

Then, the lens equation for the first lens reads

$$ \frac{1}{d_{o1}} + \frac{1}{d_{i1}} = \frac{1}{f_1} \implies \frac{1}{a_{o1} f_1} + \frac{1}{a_{i1} f_1} = \frac{1}{f_1} \implies \frac{1}{a_{o1}} + \frac{1}{a_{i1}} = 1. \tag{7.3} $$

Solving for ai1 gives

$$ a_{i1} = \frac{1}{1 - \frac{1}{a_{o1}}}. \tag{7.4} $$

Now, 1/ao1 ≪ 1 since ao1 ≫ 1. Therefore, we can Taylor expand the above result:

$$ a_{i1} = 1 + \frac{1}{a_{o1}} + \left(\frac{1}{a_{o1}}\right)^2 + \cdots. \tag{7.5} $$

Since di1 = ai1f1, we see that the image produced by the converging lens is real (that is, di1 > 0) and the image is located just a bit further away from the lens than one focal length. The further away the original object is, the bigger ao1 is, the closer ai1 gets to 1 (from above), the closer the image of the first lens gets to its focal point. To see that the image is up-side-down and small, calculate the transverse magnification:

$$ m_1 = -\frac{d_{i1}}{d_{o1}} = -\frac{a_{i1} f_1}{a_{o1} f_1} = -\frac{a_{i1}}{a_{o1}} = -\frac{\frac{1}{a_{o1}}}{1 - \frac{1}{a_{o1}}} = -\frac{1}{a_{o1}} - \left(\frac{1}{a_{o1}}\right)^2 - \cdots. \tag{7.6} $$

Indeed, m1 is negative and small if ao1 is large. Now, we place the diverging Lens 2, with focal length f2 < 0, after Lens 1. We will write f2 as −|f2| instead so that we never forget that it is actually negative! Note that the image of Lens 1 may be behind Lens 2 (i.e., on the opposite side of Lens 2 as the side from which the light is coming towards Lens 2). Define

$$ \delta \equiv \frac{\text{distance between Lens 1 and Lens 2}}{f_1}. \tag{7.7} $$

At the start of the demo, δ is small. Then,

do2 = δf1 − di1 = (δ − ai1)f1. (7.8)

Note that if δ < ai1, then do2 < 0, but if δ > ai1, then do2 > 0. That is, if the distance between the two lenses is less than the distance of the image of Lens 1 from Lens 1, then image 1 is a virtual object 2 for the second lens. However, if the distance between the two lenses is greater than the distance of the image of Lens 1 from Lens 1, then image 1 is a real object 2 for the second lens. Again, define

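$$ a_{o2} \equiv \frac{d_{o2}}{|f_2|} = (\delta - a_{i1})\frac{f_1}{|f_2|}, \qquad a_{i2} \equiv \frac{d_{i2}}{|f_2|}. \tag{7.9} $$

A priori, we do not yet know whether the final image of Lens 2 is real or virtual. Therefore, ai2 may be positive, in which case the image is real, or negative, in which case the image is virtual. Then, the lens equation for Lens 2 reads

$$ \frac{1}{d_{o2}} + \frac{1}{d_{i2}} = \frac{1}{f_2} \implies \frac{1}{a_{o2}|f_2|} + \frac{1}{a_{i2}|f_2|} = \frac{1}{-|f_2|} \implies \frac{1}{a_{o2}} + \frac{1}{a_{i2}} = -1. \tag{7.10} $$

Solving for ai2 gives

$$ a_{i2} = -\frac{1}{1 + \frac{1}{a_{o2}}} = -\frac{1}{1 + \dfrac{1}{\left(\delta - \frac{1}{1 - \frac{1}{a_{o1}}}\right)\frac{f_1}{|f_2|}}}. \tag{7.11} $$

At this point, let us plug in some appropriate numbers,

$$ f_1 = 15.5\ \text{cm}, \qquad |f_2| = 15.0\ \text{cm}, \qquad a_{o1} \approx 50. \tag{7.12} $$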

The value ao1 = 50 corresponds to an object 50 × 15.5 cm = 7.75 m away from Lens 1. Then, (7.11) becomes

$$ a_{i2} = \frac{1.02 - \delta}{\delta - 0.053} = \frac{15.8\ \text{cm} - \delta f_1}{\delta f_1 - 0.82\ \text{cm}}, \tag{7.13} $$

where we multiplied numerator and denominator by f1 = 15.5 cm to get the final expression. When δf1 < 0.82 cm, meaning that the two lenses are less than 0.82 cm apart, we have ai2 < 0, which means that the final image of the two-lens system is virtual. Also, ai2 has a fairly large magnitude, starting out as about −19.4 at δ = 0 (when the lenses are right on top of each other) and going to −∞ as we increase the separation of the two lenses towards 0.82 cm. If we separate the lenses a bit further, then ai2 becomes huge and positive; this is a real image now very far away from the lenses. As we increase the separation, ai2 decreases until we reach a separation of 15.8 cm, at which point the image is technically exactly at the location of the second lens. If we increase the separation even further, the image becomes virtual again since ai2 becomes negative again. The transverse magnification of Lens 2 is

$$ m_2 = -\frac{d_{i2}}{d_{o2}} = -\frac{a_{i2}}{a_{o2}} = \frac{1}{1 + \left(\delta - \frac{1}{1 - \frac{1}{a_{o1}}}\right)\frac{f_1}{|f_2|}}. \tag{7.14} $$

The total transverse magnification is

$$ m = m_1 m_2 = -\frac{\frac{1}{a_{o1}}}{1 - \frac{1}{a_{o1}} + \left[\left(1 - \frac{1}{a_{o1}}\right)\delta - 1\right]\frac{f_1}{|f_2|}}. \tag{7.15} $$

Again, plugging in the numbers gives

$$ m = \frac{0.020}{0.053 - \delta} = \frac{0.31\ \text{cm}}{0.82\ \text{cm} - \delta f_1}. \tag{7.16} $$

Indeed, this agrees with our previous description of our observations. When the lenses are very close together (δ ≈ 0), the image is upright (m > 0). As we increase the separation towards 0.82 cm, the image gets bigger and bigger (m increases). Just beyond 0.82 cm separation, m becomes huge and negative, and then m remains negative and approaches zero as the separation is increased further.

To summarize, when the lenses are very close together, the image is virtual and pretty far away in front of you. As you increase the separation of the lenses by moving Lens 2 towards you, the image grows and appears to move further away from you. As you cross 0.82 cm of separation, the image goes from infinitely large and upright infinitely far in front of you to infinitely large and up-side-down infinitely far behind you. As you increase the separation, the image remains up-side-down but gets smaller and smaller in size. At a separation of 15.8 cm, the image goes from being real to virtual, but it still keeps on getting smaller and remains up-side-down.

As a side note, an image in front of you is a real object for the lens of your eye. An image behind you is a virtual object for the lens of your eye. While you cannot see a real object behind you, it may be possible to see a virtual object behind you. A virtual object behind you is nothing more than the image of optical elements in front of you. Please make sure you understand this; ask us about it if this is unclear.
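The behaviour of ai2 and m with lens separation can also be tabulated numerically. The sketch below simply evaluates (7.4), (7.9), (7.11), (7.6) and (7.14) with the numbers in (7.12); the function name `two_lens_demo` and the sample separations are illustrative choices, not part of the demonstration itself.

```python
def two_lens_demo(sep_cm, f1=15.5, f2_abs=15.0, a_o1=50.0):
    """Return (a_i2, m) for a given lens separation in cm."""
    delta = sep_cm / f1                      # (7.7)
    a_i1 = 1.0 / (1.0 - 1.0 / a_o1)          # (7.4)
    a_o2 = (delta - a_i1) * f1 / f2_abs      # (7.9)
    # Algebraically the same as (7.11), written so that a_o2 = 0 is harmless.
    a_i2 = -a_o2 / (1.0 + a_o2)
    m1 = -a_i1 / a_o1                        # (7.6)
    m2 = 1.0 / (1.0 + a_o2)                  # (7.14)
    return a_i2, m1 * m2

# Sample separations chosen to avoid the singular point near 0.82 cm.
for sep in (0.0, 0.5, 0.8, 1.0, 5.0, 15.0, 20.0):
    a_i2, m = two_lens_demo(sep)
    print(f"separation = {sep:5.1f} cm   a_i2 = {a_i2:9.2f}   m = {m:8.3f}")
```

The signs of the printed values should follow the summary above: m is positive and growing below about 0.82 cm, negative and shrinking in magnitude beyond it, and ai2 changes sign a second time near 15.8 cm.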

Now, let us try to understand this phenomenon using ray tracing diagrams. We have already discussed what the diagram looks like for the converging lens, Lens 1. An up-side-down and small image is formed a bit further away from Lens 1 than one focal length. We draw this image as dashed because it is a virtual object for the diverging lens, Lens 2. Initially, this virtual object is a bit further away from Lens 2 than one focal length. The diagram might look like

As the lens is moved further to the right, the diagram might look like

Indeed, the image is upright further to the left and bigger. The image keeps moving further to the left and increases in size until the virtual object sits right at the secondary focal point of the lens. At this point, the outgoing light rays are exactly parallel and therefore look like they are coming from a very large image infinitely far to the left:

Notice that the red light ray doesn’t really change as the lens approaches the virtual object. The black ray just gets steeper and steeper. This continues as we move the lens even further to the right. It is easy to see that this means that the black and red rays will actually converge to the right of the lens, at a real and up-side-down image. This image is at first very large and very far to the right, but gets smaller and smaller and closer and closer to the lens as we move the lens to the right. The second “funky” point that we discovered earlier, where the image again changes from being real to virtual, is the point when Lens 2 passes the image formed by Lens 1 and the second object goes from being virtual to real.

8. Midterm 1 Quiz

(1) An electromagnetic wave is propagating in the +zˆ direction. At some time and point in space, the electric field points in the yˆ direction. In which direction does the magnetic field point? Answer: B ∝ −xˆ at this point in space and at this time.

(2) Why are electromagnetic plane waves called “plane waves”? Explain with a drawing. Answer: The electric fields are identical (as are the magnetic fields) at all points on the same plane perpendicular to the direction of motion of the plane wave.

(3) A cube of index of refraction n sits in air. A light ray inside the cube hits one face and gets totally internally reflected. It then hits an adjacent face and also gets totally internally reflected. Calculate the minimum possible value of n. Answer: If $\theta$ is the angle of incidence on the first face, then $\theta' = \frac{\pi}{2} - \theta$ is the angle of incidence on the second face. We need both $\theta$ and $\theta'$ to be greater than or equal to the critical angle, $\sin^{-1}(1/n)$. Thus, $\sin\theta \geq \frac{1}{n}$ and $\sin\theta' \geq \frac{1}{n}$. However, $\sin\theta' = \cos\theta = \sqrt{1 - \sin^2\theta} \leq \sqrt{1 - \left(\frac{1}{n}\right)^2}$. Therefore, $\frac{1}{n} \leq \sqrt{1 - \left(\frac{1}{n}\right)^2}$. We can square both sides of the inequality without changing the direction of the inequality since both sides are manifestly positive: $\left(\frac{1}{n}\right)^2 \leq 1 - \left(\frac{1}{n}\right)^2$, or $\frac{1}{n} \leq \frac{1}{\sqrt{2}}$. Inverting gives $n \geq \sqrt{2} \approx 1.4$.

(4) If a lens is cut in half through a plane perpendicular to its surface, does it show only half an image? Answer: It still shows a full image, just a dimmer one.

(5) If your near-point distance is N, how close can you stand to a mirror and still be able to focus on your image? Answer: The image is virtual and is the same distance behind the mirror as you are in front of the mirror. Therefore, you should stand no closer than N/2 from the mirror.

(6) When you open your eyes underwater, everything looks blurry. Explain. Answer: Your eyes have an index of refraction roughly equal to that of water. Therefore, if submerged in water, they cannot refract light and will not be able to focus light rays to form real images on the retina.

(7) Would you benefit more from a magnifying glass if your near-point distance is 25 cm or if it is 15 cm? Explain. Answer: The angular magnification of a magnifying glass is M = N/f, where N is the near point and f is the focal length of the lens. Therefore, the larger your near point is, the more you can benefit from the magnifying glass.

(8) When you use a simple magnifying glass, does it matter whether you hold the object to be examined closer to the lens than its focal length or farther away? Explain.

Answer: Yes, it matters crucially. You must keep the object just within one focal length of the lens in order to produce a large virtual image very far in front of your eyes. If the object is beyond one focal length from the lens, then a real image is produced on your side of the lens, likely behind you, and therefore you will not be able to see it clearly.

(9) Is the final image produced by a telescope real or virtual? Explain. Answer: Virtual and far away in front of you.

(10) Two people are stranded on a deserted island. Both people wear glasses, though one is nearsighted and the other is farsighted. Which person’s glasses should be used to focus the rays of the Sun and start a fire? Explain. Answer: Whoever has the converging lenses. A far-sighted person is able to converge parallel light rays (coming from faraway objects) just fine, but is unable to focus diverging light rays (from nearby objects) strongly enough to form a clear image at the retina. Therefore, the far-sighted person needs converging lenses to help “beef up” their eyes’ converging power. A near-sighted person has strongly converging eyes able to converge the diverging light rays from nearby objects, but too strongly converges parallel light rays from faraway objects. Therefore, the near-sighted person needs diverging lenses to “handicap” their eyes’ converging power.

(11) You have two lenses: lens 1 with a focal length of 0.45 cm and lens 2 with a focal length of 1.9 cm. If you construct a microscope with these lenses, which one should you use as the objective? Explain. Answer: You want the one with a shorter focal length to act as the eyepiece because that acts as a magnifying glass on the image of the objective and because the angular magnification of a magnifying glass is inversely proportional to its focal length. Therefore, you want the 1.9 cm focal length lens to act as the objective lens and the 0.45 cm lens to act as the eyepiece.

(12) Why is it restful to your eyes to gaze off into the distance? Answer: I don’t know! But here’s some information on the matter. Most of the refraction in your eye is actually performed by the cornea, which is a pocket of fluid at the front of the eye which covers the lens and iris (Wikipedia says that 2/3 of the eye’s refractive power comes from the cornea). As far as I know, nothing happens to the cornea as our eyes adjust between looking at nearby and faraway things. They can be reshaped temporarily or permanently, but by external methods. On the other hand, the adjustments we make to clearly image objects at various distances are made to the lens via ciliary muscles which are connected to the edge of the lens by tendons called zonules. Muscles can only pull (i.e., contract). Pulling on the lens flattens it out and reduces its converging power. This is what you want to do when looking at faraway things since the light rays are reaching your eye basically parallel. Relaxing the ciliary muscles a bit allows the lens to bulge a bit more at the center, which increases its converging power. This is what you want to do when looking at nearby objects since the light rays are diverging when they reach your eye. In fact, it happens that the ciliary muscles are most contracted when looking at distant objects. So, why does your eye feel relaxed when gazing off into the distance, which is precisely when your ciliary muscles are most contracted? I don’t know! The best I can guess is that that’s just the state that our eyes are used to. Also, a nonzero lever arm between the ciliary muscles and the lens might account for this as well, but I don’t think there is one.

However, there is one thing that this helps us understand: the fact that our eyes can only properly focus diverging light rays, not converging ones. To properly focus already-converging light rays, we must decrease the converging power of our eye’s lens even further compared to when we are looking at faraway things. Well, that would require the ciliary muscles to pull even harder on the lens. But, they can’t because for some reason the eye is “designed” so that the ciliary muscles are most contracted when looking at faraway things. In other words, we were not “designed” to see images produced behind our eyes. I suppose that might make sense evolutionarily. I can’t imagine an environmental stressor that would select that ability since lenses and such are recent human inventions.

9. Interference

9.1. Laser Wavelength Measurement via Metal Ruler

Devise and explain a method for measuring the wavelength of a laser pointer chiefly using a finely graded metal ruler (e.g. with 1/32 inch markings or smaller).

SOLUTION:

Consider reflecting the laser off of the ruler at a shallow angle. If there were no notches on the surface of the ruler, then each point on the ruler where the light hits becomes a source for outwardly spreading spherical waves. The superposition of these waves produces wavefronts that travel in the direction of specular reflection. That is, on a far-away screen, we get constructive interference only around the point of specular reflection, as expected. Imagine we make wide notches with narrow reflective bands in between. Then, consider the following diagram showing two adjacent light beams headed towards a far-away screen (e.g. a wall) having reflected off of two adjacent reflective bands.

The optical path length difference is d(cos α − cos β), which must be set equal to mλ for some integer m for constructive interference. For a fixed α (angle at which we shine the laser on the ruler), this gives discrete values for β where bright spots occur (i.e. we get a diffraction pattern). The claim is that we see the same thing if instead we have narrow non-reflective notches with wider reflective bands. Can you think of why? Hint: superposition. This goes under the name of Babinet’s principle, by the way. Consider the following setup

This gives us an expression for the wavelength

$$ \lambda = \frac{d}{m}\left[\frac{1}{\sqrt{1 + (s_0/L)^2}} - \frac{1}{\sqrt{1 + (s_m/L)^2}}\right]. $$

If you make α very small and use only low orders, then we can assume that sm/L ≪ 1:

$$ \lambda \approx \frac{d\,(s_m^2 - s_0^2)}{2mL^2}. $$

As an example, when I did this experiment at home, I used the marks on the ruler that were d = 0.5 mm apart and the distance to the wall was L = 105 cm. I found s0 = 10.5 cm and s1 = 11.9 cm. Assuming small angles, this gives

$$ \lambda = \frac{(5 \times 10^{-4}\ \text{m})\left[(11.9\ \text{cm})^2 - (10.5\ \text{cm})^2\right]}{2 \times 1 \times (105\ \text{cm})^2} \approx 711\ \text{nm}. $$

That’s not bad! The wavelength should be around 635 nm. Before we rejoice, however, we should note that a millimeter difference in any sm makes a huge difference in the final answer. For example, if I change s1 to 11.8 cm, I get λ = 657 nm! So, unless I can measure sm and L with very high precision, the uncertainties are likely to swamp the final measurement of λ anyway.
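The sensitivity to s1 is easy to reproduce numerically. The sketch below evaluates both the full expression and the small-angle approximation for the two values of s1 quoted above; the function names are illustrative, and all lengths are converted to metres.

```python
import math

def wavelength_exact(d, L, s0, sm, m):
    """lambda = (d/m) * (cos(alpha) - cos(beta_m)), with tan(alpha) = s0/L and tan(beta_m) = sm/L."""
    cos_alpha = 1.0 / math.sqrt(1.0 + (s0 / L) ** 2)
    cos_beta = 1.0 / math.sqrt(1.0 + (sm / L) ** 2)
    return d / m * (cos_alpha - cos_beta)

def wavelength_small_angle(d, L, s0, sm, m):
    """Small-angle approximation: lambda = d (sm^2 - s0^2) / (2 m L^2)."""
    return d * (sm ** 2 - s0 ** 2) / (2.0 * m * L ** 2)

d, L, s0 = 0.5e-3, 1.05, 0.105   # ruler spacing, wall distance, specular spot height (m)

for s1 in (0.119, 0.118):
    approx = wavelength_small_angle(d, L, s0, s1, 1)
    exact = wavelength_exact(d, L, s0, s1, 1)
    print(f"s1 = {s1 * 100:.1f} cm: small-angle {approx * 1e9:.0f} nm, full formula {exact * 1e9:.0f} nm")
```

The full expression comes out somewhat below the small-angle value for these inputs, which gives a rough sense of how much the approximation itself contributes compared with the measurement uncertainty in s1.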

Note: In section, I claimed that if the laser light reflects off of a smooth metallic surface, then we only get specular reflection. This is true only if the region on the surface that is illuminated is much wider than the wavelength of the light. Well, our laser beam has a width of a few millimeters, which will obviously do since its wavelength is on the order of $10^{-4}$ mm! As calculated above, the extra optical path length travelled by one beam relative to another that hits the surface a distance x to the left of it is x(cos α − cos β). So, the phase shift is $\phi = 2\pi\frac{x}{\lambda}(\cos\alpha - \cos\beta)$. Let a be the width of the illuminated region and let x run from −a/2 to a/2 with the “zero phase” corresponding to x = 0. The intensity is proportional to

$$ I \propto \left|\int_{-a/2}^{a/2} e^{2\pi i \frac{x}{\lambda}(\cos\alpha - \cos\beta)}\,dx\right|^2 \propto \frac{\sin^2 A}{A^2}, $$

where $A = \pi\frac{a}{\lambda}(\cos\alpha - \cos\beta)$. In the limit a/λ ≫ 1, which is the case for our beam, the central peak has a width of order λ/a in cos α − cos β, and the intensity (suitably normalized) becomes a delta function:

$$ I \xrightarrow{\;a/\lambda \to \infty\;} \delta(\cos\alpha - \cos\beta). $$

Thus, the intensity vanishes everywhere except when A = 0, or when cos α = cos β, or α = β, which is the condition for specular reflection!
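To get a feel for how sharply the reflected intensity concentrates around the specular direction when the illuminated width is much larger than the wavelength, one can evaluate the squared magnitude of the integral in closed form. This is only an illustrative sketch: the function name, the normalization to 1 at the specular direction, and the sample values of a/λ are assumptions made here (a/λ = 5000 is roughly a 3 mm beam at 635 nm).

```python
import math

def relative_intensity(u, a_over_lambda):
    """Squared magnitude of the integral of exp(2*pi*i*x*u/lambda) over a width a,
    normalized to 1 at u = 0, where u = cos(alpha) - cos(beta)."""
    A = math.pi * a_over_lambda * u
    if A == 0.0:
        return 1.0
    return (math.sin(A) / A) ** 2

u_test = 0.0051  # a fixed small departure from the specular condition
for a_over_lambda in (1.0, 10.0, 100.0, 5000.0):
    first_zero = 1.0 / a_over_lambda   # the central peak ends where A = pi
    print(f"a/lambda = {a_over_lambda:7.1f}: central peak half-width in u = {first_zero:.4f}, "
          f"relative intensity at u = {u_test} is {relative_intensity(u_test, a_over_lambda):.2e}")
```

For a/λ of order one the intensity at this small offset is essentially undiminished, while for a beam thousands of wavelengths wide it is already strongly suppressed, consistent with purely specular reflection.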

10. Thin-Film Interference

A piece of paper is wedged between the ends of two sheets of glass. The setup is illuminated at normal incidence by cyan laser light (λ = 500 nm). Excluding the point where the two glass sheets meet, you count about 400 dark interference fringes. Calculate the thickness of the sheet of paper.

SOLUTION:

Let t be the thickness of the paper. Let ℓ be the length of the glass sheets. Let x be the horizontal coordinate starting at 0 at the point where the two glass sheets meet and increasing to the right up to ℓ, the length of the glass sheets. Let t(x) be the thickness of the air gap between the two glass sheets as a function of the coordinate x. By similar triangles,

$$ \frac{t(x)}{x} = \frac{t}{\ell} \implies t(x) = \frac{t}{\ell}\,x. $$

The two beams whose interference we care about are the ones shown below.

Of course, these rays are actually right on top of each other since the incidence is normal and since the paper is presumably very very thin, any refraction at the interfaces is negligible. Since ray 1 reflects off of a glass-air interface (high to low index of refraction), it does NOT receive a π reflection phase shift. On the other hand, ray 2 does because it reflects off of an air-glass interface (low to high index of refraction). Thus,

ϕref,1 = 0 and ϕref,2 = π =⇒ ∆ϕref = π.

We will set ϕpath,1 = 0 since the only difference in path between 1 and 2 is that 2 goes through a thickness, t(x), twice whereas 1 does not. Thus,

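$$ \varphi_{\text{path},1} = 0 \quad\text{and}\quad \varphi_{\text{path},2} = \frac{2\pi}{\lambda/n_{\text{air}}}\,2t(x) = \frac{4t(x)\,\pi}{\lambda} \implies \Delta\varphi_{\text{path}} = \frac{4t(x)\,\pi}{\lambda}. $$

Therefore,

$$ \Delta\varphi_{\text{tot}} = \Delta\varphi_{\text{path}} + \Delta\varphi_{\text{ref}} = \left(\frac{4t(x)}{\lambda} + 1\right)\pi. $$

For the interference to be destructive (dark fringe), we must have

$$ \Delta\varphi_{\text{tot}} = \left(\frac{4t(x)}{\lambda} + 1\right)\pi = (2m + 1)\pi, $$

where m is an integer. Remember: odd numbers of π are destructive while even numbers of π are constructive. Thus,

$$ t(x) = \frac{t}{\ell}\,x = \frac{m\lambda}{2}. $$

The maximum value of t(x) is t, which occurs when x = ℓ. The maximum value of m is 400 according to the problem statement. Therefore,

$$ t = \frac{m_{\max}\lambda}{2} = \frac{400 \times 5 \times 10^{-7}\ \text{m}}{2} = 10^{-4}\ \text{m} = 0.1\ \text{mm}. $$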

11. Relativity

Newtonian mechanics is incorrect! Time is not just time is not just time! The passage of time depends on your state of motion. How much time it takes to get from one point to the next depends on the path you take. How big something is depends on its state of motion. The order of events depends on the state of motion of the observer. All of these statements might seem absurd, and they surely would do to most physicists before Einstein. However, they are all direct consequences of two seemingly innocuous postulates often summarized by the pithy statement that “the laws of physics are identical between all inertial reference frames.” Actually, the statement that the speed of light is constant in all inertial reference frames is already counter-intuitive because it implies that the speed of light is the same no matter the state of motion of the source of that light. Certainly, the same cannot be said of other projectiles like balls and bullets!

11.1. How to Measure the Length of a Moving Object

Before we explore some of the counter-intuitive consequences of the principle of special relativity, we will need to know how to measure the length of a moving object. Suppose an object (like a train) of unknown length is moving left-to-right relative to you and your friends at some unknown speed. Can you devise a plan with your friends to measure the length of the train? Here is a method which some of you suggested in discussion section. You and one of your friends synchronize clocks and decide to stand some fixed known distance apart along the direction in which the train is moving such that the train reaches you before your friend. You note the time when the front of the train passes you and when the back passes you, and your friend notes when the front of the train passes them. Then, you come back together. You can determine the speed of the train via

$$ \text{speed of train} = \frac{\text{distance between you and your friend}}{\text{time front passes friend} - \text{time front passes you}}. \tag{11.1} $$

Then, you can determine the length of the train via

length of train = speed of train × (time back passes you − time front passes you). (11.2)

This requires you and your friend to synchronize your clocks. This is trickier than it may seem at first. Remember that the passage of time depends on your state of motion and certainly you and/or your friend have to be moving at some point if you first synchronize your clocks when you are together and only then move apart! Even if you didn’t believe in this relativity business, you must agree that your experimental method had better not depend on your own prejudices, whether they be ultimately correct or incorrect. Fortuitously, there is a simple way for you and your friend to synchronize clocks after you are already apart and no longer moving relative to each other. You shine a light at your friend at some time that you set to be “zero”. The light moves relative to either of you at the same speed of about $3 \times 10^8$ m/s. Since you know how far apart you are, when your friend receives the light, he knows exactly how long ago your clock read “zero” and he can set his clock appropriately.

Actually, we can line up you and millions of your friends (all with synchronized clocks) along a line parallel and very close to the path of the train. Then, each one of you can record the times when the front and back of the train pass you. If you then come together at the end, you will find that the “front” times pair up between pairs of friends a distance apart equal to the length of the train (as measured by you and your friends). The same can be said of the “back” times. That is, you can pick one specific time and at that time exactly one of you or your friends will have recorded the back passing them and one will have recorded the front passing them. The distance between these two people is the measured length of the train. This method allows us to measure the length of the train without technically measuring the speed of the train first, even though the speed of the train could easily be determined from this data as well. The problem in the next subsection will show why I prefer this method.

This discussion of synchronization of clocks brings up an interesting subtlety in statements about relativity. Often questions are asked like “what do you observe?” or “what does someone in such and such reference frame observe?” The “what” might be a time or position or whatever else. Such statements are lazy and possibly misleading. Usually, what is meant is not what is observed by one person at one instant in time, but rather what this person can infer once he gathers time and position records from very many observers scattered everywhere (technically, the limit of infinitely many such observers packed infinitely closely). That is, you imagine a lattice of synchronized clocks everywhere in space and you are asked what you can infer if you could take the readings from all of those clocks after the process of interest is over. This way, you do not have to take into consideration the finite amount of time it might take light from some event of interest to reach you, which would drastically complicate matters. So, for example, when you are asked what is the length of a moving object that is observed by the person standing still, the question is really what would be measured by the army of friends as described previously. The question is not asking what you (the one person standing still) actually see. That would be very different and far more complicated because light from different points along the object takes different amounts of time to reach your eyes!

11.2. Relativistic Train

People at rest relative to a train measure the length of the train to be L0 (this is the train’s so-called proper length). Alice stands at the back of the train, Bob at the front and Charlie at the middle. They have synchronized their clocks relative to each other. The train travels at a speed v relative to the platform where the Stationmaster stands. These people all decide to set the origin of space and time to be when Charlie passes the Stationmaster.

(a) At the moment C passes S, C turns on a lightbulb. What time will A and B read on their clocks when they see the light?

(b) What is the length of the train as measured in the reference frame of S? (You are not being asked to derive the result. Just take it as an assumption.)

(c) In the reference frame of S, at what time(s) does the light reach A and B? What does this tell you about simultaneity?

(d) In the reference frame of S, what time(s) registers on the clocks of A and B when the light reaches A and B, respectively? What does this tell you about the synchronization of clocks?

(e) A and B hold up mirrors to reflect the light back to C. What time does C measure when he sees the reflections? What time is measured in the S reference frame? What does this tell you about the ticking rate of the clocks on the train as observed by the S reference frame?

SOLUTION:

(a) Let $S'$ be the rest frame of the train, which is the same as the reference frames of A, B and C. Time and space coordinates measured in this frame will likewise be primed. S can also stand for the reference frame of the Stationmaster and coordinates in this frame will be unprimed. Event 0 is when C passes S and turns on the lightbulb. By agreement, the spacetime coordinates of this event in either reference frame are identically zero:

$$ (ct_0, x_0) = (ct'_0, x'_0) = (0, 0). \tag{11.3} $$

Event 1 is when the light reaches A. Relative to A, the light must travel a distance of $L_0/2$. Therefore,

$$ (ct'_1, x'_1) = \left(\frac{L_0}{2}, -\frac{L_0}{2}\right). \tag{11.4} $$

Event 2 is when the light reaches B. Similarly,

$$ (ct'_2, x'_2) = \left(\frac{L_0}{2}, \frac{L_0}{2}\right). \tag{11.5} $$

That is, the light reaches A and B at the same time, $\frac{L_0}{2c}$, as measured on the train.

(b) In reference frame S, the length of the train is contracted by a factor of γ:

$$L = \frac{L_0}{\gamma}, \quad \text{where } \gamma = \frac{1}{\sqrt{1-\beta^2}} \text{ and } \beta = \frac{v}{c}. \tag{11.6}$$

(c) In reference frame S, A actually moves towards the light that is headed towards her. The speed of the light is still c (postulate 2). Therefore, the relative speed between A and the light is $c + v = (1 + \beta)c$. The distance that must be covered is not $L_0/2$, but $L/2$. Therefore, the time is

$$t_1 = \frac{L/2}{c+v} \implies ct_1 = \frac{L_0/2}{(1+\beta)\gamma} = \sqrt{\frac{1-\beta}{1+\beta}}\,\frac{L_0}{2}. \tag{11.7}$$

By the same argument, the relative speed between the light and Bob as measured in S is $c - v = (1 - \beta)c$. The distance that must be covered is still $L/2$. Therefore, the time is

$$t_2 = \frac{L/2}{c-v} \implies ct_2 = \frac{L_0/2}{(1-\beta)\gamma} = \sqrt{\frac{1+\beta}{1-\beta}}\,\frac{L_0}{2}. \tag{11.8}$$

Note that, even though $ct'_1 = ct'_2$, so that these two events (the light reaching Alice and the light reaching Bob) occur simultaneously in the S′ reference frame, they do not occur simultaneously in the S reference frame. In S, event 1 happens first, then event 2. The time difference is

$$c\Delta t \equiv ct_2 - ct_1 = \sqrt{\frac{1+\beta}{1-\beta}}\,\frac{L_0}{2} - \sqrt{\frac{1-\beta}{1+\beta}}\,\frac{L_0}{2} = \beta\gamma L_0. \tag{11.9}$$

Events that are simultaneous in one reference frame may not be simultaneous in another reference frame. This is “loss of simultaneity”.

(d) Whether it is Alice looking at her watch or someone at rest on the platform standing right next to Alice when the light reaches her, they must agree on what Alice’s watch reads at this moment, which we have already determined is $ct'_1 = L_0/2$. The same can be said of Bob’s watch. Therefore, observers in S see Alice’s watch read $ct'_1 = L_0/2$ when their own watches read $ct_1 = \sqrt{\frac{1-\beta}{1+\beta}}\,\frac{L_0}{2}$, and they see Bob’s watch read $ct'_2 = L_0/2$ when their own watches read $ct_2 = \sqrt{\frac{1+\beta}{1-\beta}}\,\frac{L_0}{2}$. It follows that the clocks of A and B are no longer synchronized in the S reference frame! The clock at the back of the train (Alice’s) is systematically ahead of the clock at the front of the train (Bob’s). Measured in S, Bob’s clock reaches a given reading later than Alice’s by $c\Delta t = \beta\gamma L_0$. Because the train’s clocks tick slowly by a factor of γ in S (see part (e)), this means that at any one instant in S, Alice’s clock reads ahead of Bob’s by $\beta L_0$. This is “loss of synchronicity”.

(e) In S′, after the light has reflected off of A’s or B’s mirror, it has to travel a further distance $L_0/2$ before returning to C. Therefore, the two reflections reach C at the same time. Let us call this event 3. Then,

$$(ct'_3, x'_3) = (L_0, 0). \tag{11.10}$$

As measured in reference frame S, the return trip of the light from B to C takes as much time as it took for the light to go from C to A. The return trip of the light from A to C takes as much time as it took for the light to go from C to B. Therefore, the two reflections come back to C at the same time in S as well, namely

$$ct_3 = ct_1 + ct_2 = \gamma L_0. \tag{11.11}$$

That is, between event 0, when C turns on the light, and event 3, when the light returns to C, a time $ct'_3 = L_0$ has elapsed in S′ while a time $ct_3 = \gamma L_0$ has elapsed in S. This is “time dilation”. S observes clocks in S′ tick more slowly than his own.
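The chain of results (11.6) to (11.11) can be checked numerically. The following is a minimal sketch, assuming units in which lengths are measured in multiples of $L_0$ and times are quoted as $ct$; the function name `train_events` and the choice β = 0.6 are illustrative and not part of the original problem.

```python
import math

def train_events(beta, L0=1.0):
    """Return (gamma, ct1, ct2, ct3) in frame S for a train of proper length L0."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    L = L0 / gamma                  # contracted length, Eq. (11.6)
    ct1 = (L / 2) / (1 + beta)      # light meets Alice, Eq. (11.7)
    ct2 = (L / 2) / (1 - beta)      # light catches up with Bob, Eq. (11.8)
    ct3 = ct1 + ct2                 # both reflections return to Charlie, Eq. (11.11)
    return gamma, ct1, ct2, ct3

beta = 0.6  # illustrative speed, v = 0.6c
gamma, ct1, ct2, ct3 = train_events(beta)

print("ct1 =", ct1)
print("ct2 =", ct2)
print("ct2 - ct1 =", ct2 - ct1, " vs beta*gamma*L0 =", beta * gamma)
print("ct3 =", ct3, " vs gamma*L0 =", gamma)
```

For any 0 < β < 1 the two comparisons printed at the end should agree to floating-point precision, which is just Eqs. (11.9) and (11.11) again.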

11.3. Passing Trains

This problem is a combination of several problems taken from “Introduction to Classical Mechanics” by David Morin.

Charlie stands on a platform while Alice, in one train, and Bob, in another train, pass by going in the same direction. Both trains have a proper length L. A’s speed is 4c/5, and B’s speed is 3c/5. A starts out behind B.

(a) In C’s reference frame, how long does it take for A to overtake B (i.e., the time between the front of A passing the back of B, and the back of A passing the front of B)?

(b) Same question, but in the reference frame of A.

(c) Same question, but in the reference frame of B.

(d) David moves from the back of B’s train to the front at a constant speed, such that he coincides with both the event of the front of A passing the back of B and the event of the back of A passing the front of B. How long does the overtaking process take in D’s reference frame?

(e) Verify that the interval between the two events E1 = front of A passes back of B, and E2 = back of A passes front of B, is the same in all reference frames A, B, C and D.

SOLUTION:

(a) Let $x_{iS}$ and $ct_{iS}$ denote the position and time of event $E_i$ ($i = 1, 2$) in some reference frame S, which can be A, B, C or D. Set the origin of coordinates to be event $E_1$:

$$x_{1A} = x_{1B} = x_{1C} = x_{1D} = ct_{1A} = ct_{1B} = ct_{1C} = ct_{1D} = 0. \tag{11.12}$$

Let $\gamma_{SS'}$ be the gamma factor associated with the motion of reference frame S as viewed by reference frame S′. By definition, $\gamma_{SS} = 1$ for any S. The gamma factors of A and B as viewed by C are

$$\gamma_{AC} = \frac{1}{\sqrt{1-(4/5)^2}} = \frac{5}{3}, \qquad \gamma_{BC} = \frac{1}{\sqrt{1-(3/5)^2}} = \frac{5}{4}. \tag{11.13}$$

Let $L_{AS}$ and $L_{BS}$ be the length of A’s and B’s train, respectively, in the reference frame S. By definition, $L_{AA} = L_{BB} = L$ are the proper lengths. $L_{AC}$ and $L_{BC}$ are length contracted by the appropriate gamma factor:

$$L_{AC} = \frac{L}{\gamma_{AC}} = \frac{3L}{5}, \qquad L_{BC} = \frac{L}{\gamma_{BC}} = \frac{4L}{5}. \tag{11.14}$$

Let $x^{\text{front}}_{AS}(t_S)$ and $x^{\text{front}}_{BS}(t_S)$ be the position of the front of train A and B, respectively, in reference frame S as a function of the time in reference frame S, and similarly define $x^{\text{back}}_{AS}(t_S)$ and $x^{\text{back}}_{BS}(t_S)$. Then,

$$x^{\text{front}}_{AC}(t_C) = \frac{4ct_C}{5}, \qquad x^{\text{back}}_{BC}(t_C) = \frac{3ct_C}{5}, \tag{11.15a}$$

$$x^{\text{back}}_{AC}(t_C) = -\frac{3L}{5} + \frac{4ct_C}{5}, \qquad x^{\text{front}}_{BC}(t_C) = \frac{4L}{5} + \frac{3ct_C}{5}. \tag{11.15b}$$

Indeed, $t_{1C}$ is defined to be the time $t_C$ when $x^{\text{front}}_{AC} = x^{\text{back}}_{BC}$, which is $t_{1C} = 0$ and happens when $x^{\text{front}}_{AC} = x^{\text{back}}_{BC} = 0$. The overtake is complete when $x^{\text{back}}_{AC} = x^{\text{front}}_{BC}$:

$$-\frac{3L}{5} + \frac{4ct_{2C}}{5} = x^{\text{back}}_{AC}(t_{2C}) = x^{\text{front}}_{BC}(t_{2C}) = \frac{4L}{5} + \frac{3ct_{2C}}{5}. \tag{11.16}$$

Solving for $t_{2C}$ gives

$$ct_{2C} = 7L. \tag{11.17}$$

Just plug this back into $x^{\text{back}}_{AC}$ or $x^{\text{front}}_{BC}$ to get $x_{2C}$, the position where the back of A passes the front of B as viewed in C’s reference frame. This is

$$x_{2C} = 5L. \tag{11.18}$$

Aside: We can also determine $ct_{2C}$ as follows. A must travel farther than B by an excess distance equal to the sum of their lengths (as viewed in C’s reference frame), which is $7L/5$. The relative speed between A and B as viewed in C’s reference frame is $c/5$. Therefore, the overtaking time is

$$t_{2C} = \frac{7L/5}{c/5} = \frac{7L}{c} \implies ct_{2C} = 7L. \tag{11.19}$$

(b) We need to know the speed of B as viewed by A. From A’s perspective, C is moving with velocity $v_{CA} = -4c/5$. From C’s perspective, B is moving with velocity $v_{BC} = 3c/5$. Therefore, from A’s perspective, B is moving with velocity

$$v_{BA} = \frac{v_{BC} + v_{CA}}{1 + \frac{v_{BC} v_{CA}}{c^2}} = \frac{\frac{3c}{5} + \left(-\frac{4c}{5}\right)}{1 + \frac{3}{5}\left(-\frac{4}{5}\right)} = \frac{-\frac{c}{5}}{1 - \frac{12}{25}} = -\frac{5c}{13}. \tag{11.20}$$

The associated gamma factor is

$$\gamma_{BA} = \frac{1}{\sqrt{1 - \left(-\frac{5}{13}\right)^2}} = \frac{13}{12}. \tag{11.21}$$

Therefore, the length of train B as measured in A’s reference frame is

$$L_{BA} = \frac{L}{\gamma_{BA}} = \frac{12L}{13}. \tag{11.22}$$

Then,

$$x^{\text{front}}_{AA}(t_A) = 0, \qquad x^{\text{back}}_{BA}(t_A) = -\frac{5ct_A}{13}, \tag{11.23a}$$

$$x^{\text{back}}_{AA}(t_A) = -L, \qquad x^{\text{front}}_{BA}(t_A) = \frac{12L}{13} - \frac{5ct_A}{13}. \tag{11.23b}$$

Again, $t_{2A}$ is defined to be the time when $x^{\text{back}}_{AA} = x^{\text{front}}_{BA}$:

$$-L = x^{\text{back}}_{AA}(t_{2A}) = x^{\text{front}}_{BA}(t_{2A}) = \frac{12L}{13} - \frac{5ct_{2A}}{13}. \tag{11.24}$$

Solving for $t_{2A}$ gives

$$ct_{2A} = 5L. \tag{11.25}$$

Furthermore,

$$x_{2A} = -L. \tag{11.26}$$

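(c) From B’s perspective, A is moving with velocity

$$v_{AB} = \frac{v_{AC} + v_{CB}}{1 + \frac{v_{AC} v_{CB}}{c^2}} = \frac{\frac{4c}{5} + \left(-\frac{3c}{5}\right)}{1 + \frac{4}{5}\left(-\frac{3}{5}\right)} = \frac{\frac{c}{5}}{1 - \frac{12}{25}} = \frac{5c}{13}. \tag{11.27}$$

It should not be a surprise that $v_{AB} = -v_{BA}$! The calculation is then the mirror image of part (b), with the roles of the two trains exchanged. Therefore, in this reference frame,

$$ct_{2B} = 5L, \quad \text{and} \quad x_{2B} = L. \tag{11.28}$$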

(d) In C’s reference frame, D must travel the distance $x_{2C} = 5L$ in the time $t_{2C} = 7L/c$. Therefore, the velocity of D with respect to C is

$$v_{DC} = \frac{5L}{7L/c} = \frac{5c}{7}. \tag{11.29}$$

The velocities of A and B as viewed by D are

$$v_{AD} = \frac{v_{AC} + v_{CD}}{1 + \frac{v_{AC} v_{CD}}{c^2}} = \frac{\frac{4c}{5} + \left(-\frac{5c}{7}\right)}{1 + \frac{4}{5}\left(-\frac{5}{7}\right)} = \frac{c}{5}, \tag{11.30a}$$

$$v_{BD} = \frac{v_{BC} + v_{CD}}{1 + \frac{v_{BC} v_{CD}}{c^2}} = \frac{\frac{3c}{5} + \left(-\frac{5c}{7}\right)}{1 + \frac{3}{5}\left(-\frac{5}{7}\right)} = -\frac{c}{5}. \tag{11.30b}$$

It should not be a surprise that $v_{BD} = -v_{AD}$ (why not?). In fact, instead of determining $v_{DC}$ as we did in (11.29), we could have determined $v_{DC}$ by insisting that $v_{AD} = -v_{BD}$. The associated gamma factors are equal:

$$\gamma_{AD} = \gamma_{BD} = \frac{1}{\sqrt{1 - (1/5)^2}} = \frac{5}{2\sqrt{6}}. \tag{11.31}$$

The lengths of A and B as viewed by D are equal and given by

$$L_{AD} = L_{BD} = \frac{2\sqrt{6}\,L}{5}. \tag{11.32}$$

From D’s perspective, each train travels a distance equal to its own (contracted) length during the overtaking process. Thus,

$$t_{2D} = \frac{L_{AD}}{v_{AD}} = \frac{2\sqrt{6}\,L/5}{c/5} = \frac{2\sqrt{6}\,L}{c}. \tag{11.33}$$

Both events occur at the position of D, which is just the origin in D’s own reference frame. Therefore,

$$ct_{2D} = 2\sqrt{6}\,L, \quad \text{and} \quad x_{2D} = 0. \tag{11.34}$$

(e) In A and B:

$$(ct_{2A})^2 - (x_{2A})^2 = (ct_{2B})^2 - (x_{2B})^2 = 25L^2 - L^2 = 24L^2. \tag{11.35}$$

In C:

$$(ct_{2C})^2 - (x_{2C})^2 = 49L^2 - 25L^2 = 24L^2. \tag{11.36}$$

In D:

$$(ct_{2D})^2 - (x_{2D})^2 = 24L^2 - 0 = 24L^2. \tag{11.37}$$
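As a cross-check on parts (a) through (e), one can Lorentz transform event $E_2$ directly from C’s frame into the other three frames instead of redoing the kinematics each time. The sketch below assumes the standard boost $ct' = \gamma(ct - \beta x)$, $x' = \gamma(x - \beta ct)$ for a frame moving at velocity βc relative to C, and sets L = 1; the helper name `boost` and the dictionary layout are illustrative.

```python
import math

def boost(ct, x, beta):
    """Coordinates of the event (ct, x) in a frame moving at velocity beta*c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (ct - beta * x), gamma * (x - beta * ct)

L = 1.0
ct2C, x2C = 7 * L, 5 * L                 # Eqs. (11.17) and (11.18)

# Velocity of each frame relative to C, in units of c.
frame_beta = {"A": 4 / 5, "B": 3 / 5, "C": 0.0, "D": 5 / 7}

results = {}
for name, beta in frame_beta.items():
    ct, x = boost(ct2C, x2C, beta)
    results[name] = (ct, x, ct**2 - x**2)  # last entry is the interval
    print(f"{name}: ct2 = {ct:.6f} L, x2 = {x:.6f} L, interval = {ct**2 - x**2:.6f} L^2")
```

The printed coordinates should reproduce (11.25), (11.26), (11.28) and (11.34) up to rounding, with the interval equal to $24L^2$ in every frame.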

12. Midterm 2 Quiz

(1) The first missing order in the interference/diffraction pattern produced by a double-slit setup is the fifth interference fringe. What is the ratio of the center-to-center distance between the two slits and the slit width?

Answer: Center-to-center slit separation = d and slit width = a. We are asked for d/a. Angle relation for fifth interference maximum: d sin θ = 5λ. Angle relation for first diffraction minimum: a sin θ = λ. Take the ratio of the two equations: d/a = 5.

(2) A thin layer of oil sits on top of some water in a beaker. Looking from above, where are you most likely to see the highest density of interference rings and why?

Answer: The interference is between light reflected off of the air-oil interface and light reflected off of the oil-water interface. The latter travels twice the thickness of the oil more than the former (and you also have to take into account the different indices of refraction, and possible reflection phase shifts). If the thickness of the oil film were absolutely constant, then there wouldn’t be interference fringes or rings. Instead, the entire layer would appear to be some constant brightness somewhere between complete constructive and complete destructive interference. You only get fringes or rings if the thickness of the oil changes as a function of position. In a very clean sample, the thickness of the oil is probably going to be changing most rapidly near the edge of the beaker due to surface tension (this is the so-called meniscus). Therefore, one expects to see the highest density of rings near the edge.

(3) Fighter jets flying towards a radar tower on a coast find that they can remain well-hidden if they fly very low, close to the water. Why?

Answer: The wavelength of the radar signal is large compared to the characteristic size of structure on the surface of the water, such as waves, etc. So, the water surface pretty much looks flat from the radar signal’s perspective and acts like a flat mirror. Direct radar signals can now interfere with reflected ones. The reflected ones look like they are coming from a virtual radar tower at the same horizontal position as the original radar tower, but as far below sea level as the original tower is above sea level. However, the signal from this virtual tower appears to be phase shifted by π relative to the original tower from the beginning because the actual radar signal is phase shifted by π upon reflection at the air-water interface.

In summary, this is like a two-slit interference problem, where the slit separation is about twice the height of the tower above sea level and where the light coming from one of the slits is already phase shifted by π relative to the other slit right from the start. In this case, the middle point of the interference pattern directly ahead of the two slits would have destructive rather than constructive interference. The midway point is the surface of the water. Thus, the radar signal is weak near the surface of the water.

Of course, there are other angles at which the radar signal is weak, but if you are in a fighter jet, you might not know the details of the radar towers set up on the enemy shore (e.g., their locations, their heights, etc.). The radar signal will be weak near the surface of the water regardless of such details. Therefore, that’s the pilot’s safest bet.

(4) What effect does putting a quarter-wave plate in front of just one slit in a two-slit setup have on the interference pattern?

Answer: I’m assuming here that the incident light is polarized along the direction of the optical axis of the quarter-wave plate. This problem would be much harder otherwise. In this case, there is an initial phase difference of π/2 (or a quarter wave) between the two slits. The interference/diffraction pattern will look the same, just shifted in the direction towards the slit that was covered with the quarter-wave plate. The shift is such that the new central maximum lies halfway between the old central maximum and the old first minimum.

(5) Four lightning beams strike a passing train, two at the front and two at the back. In the frame where the train is passing by, all four lightning strikes happen simultaneously. Order the events in the train’s reference frame.

Answer: In the train’s reference frame the two lightning strikes at the back happen simultaneously and the two lightning strikes at the front happen simultaneously. However, the two at the front happen before the two at the back. Remember that synchronized clocks on the train are not synchronized from the perspective of the frame in which the train is moving; the clock at the back of the train is systematically ahead of the clock at the front of the train. In the frame where the train is passing, a clock at the back of the train reads a later time than does a clock at the front when the lightning beams strike. Therefore, in the train reference frame, the lightning strikes the front before the back.

(6) If I’m on a train traveling at 4c/5 relative to you and I shoot a rocket forwards at speed 3c/5 relative to me, then at what speed is the rocket moving relative to you?

Answer: The train is reference frame S′ and you are in reference frame S. The speed of S′ relative to S as measured in S is $v = \frac{4c}{5}$. The speed of the rocket as measured in S′ is $u' = \frac{3c}{5}$. The speed of the rocket as measured in S is

$$u = \frac{u' + v}{1 + \frac{u'v}{c^2}} = \frac{\frac{3c}{5} + \frac{4c}{5}}{1 + \frac{3}{5} \cdot \frac{4}{5}} = \frac{\frac{7c}{5}}{1 + \frac{12}{25}} = \frac{35c}{37}.$$

(7) If I’m on a train traveling at 4c/5 relative to you and I shine light forward, then at what speed is the light moving relative to you?

Answer: c. If you plug in $u' = c$ in the previous problem, you will get $u = c$.
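The answers to (6) and (7) can be reproduced exactly with rational arithmetic. This is a small sketch, assuming all speeds are expressed in units of c; the function name `add_velocities` is illustrative.

```python
from fractions import Fraction

def add_velocities(u_prime, v):
    """Relativistic velocity addition; both speeds are in units of c."""
    return (u_prime + v) / (1 + u_prime * v)

rocket = add_velocities(Fraction(3, 5), Fraction(4, 5))   # question (6)
light = add_velocities(Fraction(1), Fraction(4, 5))       # question (7)
print(rocket, light)  # 35/37 1
```

The same function with $u' = 3/5$ and $v = -4/5$ gives $-5/13$, which is the relative velocity used in Eq. (11.20) of the Passing Trains problem.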

(8) “Derive” time dilation using the relativistic clock example.

Answer: A train (reference frame S′) moves at speed v relative to reference frame S. Transverse to the direction of motion of the train, light is sent from one side of the train to the other and reflected back, for a total distance of, say, 2h. The time it takes for this round trip in S′ is $t' = \frac{2h}{c}$. In S, the total speed of the light beam is still c, but the component of this velocity along the direction in which the train is moving is now v. Therefore, the transverse component is $\sqrt{c^2 - v^2} = \frac{c}{\gamma}$. The transverse distance that must be covered is still 2h. Therefore, the round trip time in S is $t = \frac{2h}{c/\gamma} = \gamma t'$.

(9) “Derive” length contraction using time dilation.

Answer: A light signal is sent from the back of the train to the front and back. Let $L_0$ be the proper length of the train (measured in its own rest frame). Then the round trip time in S′ is $t' = \frac{2L_0}{c}$. Let L be the length of the train in S. The relative speed between the light beam and the front of the train in S is c − v. The relative speed between the back of the train and the light beam after reflection is c + v. Therefore, the round trip time in S is

$$t = \frac{L}{c-v} + \frac{L}{c+v} = \frac{2Lc}{c^2 - v^2} = \frac{2\gamma^2 L}{c}.$$

From time dilation, we have $\frac{2\gamma^2 L}{c} = t = \gamma t' = \frac{2\gamma L_0}{c}$, which gives $L = \frac{L_0}{\gamma}$.

(10) What are the two effects involved in the relativistic Doppler effect?

Answer: (1) The standard Doppler effect, whereby the wavelength of the signal is shortened if the source is moving towards you and lengthened if moving away; and (2) Time dilation, whereby the time it takes for a new wavefront to be created by the moving source increases relative to when it is at rest.

(11) A train and a tunnel both have proper lengths L. The train moves toward the tunnel at speed v. A bomb is located at the front of the train. The bomb is designed to explode when the front of the train passes the far end of the tunnel. A deactivation sensor is located at the back of the train. When the back of the train passes the near end of the tunnel, the sensor tells the bomb to disarm itself. Does the bomb explode?

Answer: Yes, the bomb explodes. Let us first consider the train reference frame, in which the answer is obvious. In this frame, the train has length L and the tunnel has length L/γ < L and is heading towards the train at speed v. Therefore, it is clear that the back end of the tunnel will pass the front of the train before the front end of the tunnel reaches the back of the train.

In the tunnel frame, the tunnel has length L and the train has length L/γ < L. Therefore, the back of the train reaches the near end of the tunnel before the front of the train reaches the back of the tunnel. You might be tempted to say that the bomb is then deactivated before it can explode. However, you have to keep in mind that the deactivator at the back of the train needs to send a signal to the bomb at the front of the train saying that it has reached the front end of the tunnel and that the bomb should therefore disarm itself. At best, that signal can travel at the speed of light. It will take time for that signal to reach the bomb at the front of the train. If that time is longer than the time it takes for the front of the train to reach the back of the tunnel, then it will be too late and the bomb will explode.

Let the front of the tunnel correspond to x = 0 and let t = 0 be when the back of the train passes the front of the tunnel. Henceforth, the signal sent by the deactivator travels forward at the speed of light, its worldline described by $x_s = ct$ (the s subscript stands for “signal”). At t = 0, the front of the train is at x = L/γ, since that is the length of the train in the tunnel reference frame. The trajectory of the front of the train is $x_b = \frac{L}{\gamma} + vt$ (the b subscript stands for “bomb”). Which one reaches x = L (the back of the tunnel) first? Well, the time it takes for the signal is $t_s = L/c$, whereas the bomb takes $t_b = \left(L - \frac{L}{\gamma}\right)/v = \frac{t_s}{\beta}\left(1 - \frac{1}{\gamma}\right)$, where β ≡ v/c. We claim that $t_b < t_s$ and so the bomb explodes.

To prove this, start with the inequality β < 1, which just says that the train must be moving at less than the speed of light. Multiply by 2β and add 1 to both sides to get $1 + 2\beta^2 < 1 + 2\beta$. Now, subtract $2\beta + \beta^2$ from both sides to get $1 - 2\beta + \beta^2 < 1 - \beta^2$. Rewrite the left hand side as $(1-\beta)^2$, then take the positive square root of both sides to get $1 - \beta < \sqrt{1 - \beta^2} = \frac{1}{\gamma}$. This final inequality can be rearranged to read $\frac{1}{\beta}\left(1 - \frac{1}{\gamma}\right) < 1$. But the left hand side is just $t_b/t_s$, and so $t_b < t_s$.

Below are spacetime diagrams in both reference frames in the case β = 4/5. Note that we have set t = t′ = 0 when the front of the train lines up with the front of the tunnel. But, note that we have not set x = x′ = 0 to be the position of this event. The spatial origins of the frames are different: x′ = 0 for the center of the train and x = 0 for the center of the tunnel. In the train frame, the explosion happens before the deactivation signal is sent. In the tunnel frame, those two events occur in the opposite order.

However, in both reference frames, the bomb explodes; it certainly cannot be the case that the bomb explodes in one frame whereas it does not in the other! Assuming that the explosion “signal” (i.e. the fires, etc.) travels at the speed of light, the red shaded regions represent the region of the train that is engulfed in fire, or at least the region that is aware of the fact that the explosion has occurred.
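The inequality $t_b < t_s$ in the tunnel frame can also be spot-checked numerically. The sketch below simply evaluates the ratio $t_b/t_s = \frac{1}{\beta}\left(1 - \frac{1}{\gamma}\right)$ for a few speeds; the function name `bomb_to_signal_ratio` and the sample values of β are illustrative choices.

```python
import math

def bomb_to_signal_ratio(beta):
    """Return t_b / t_s in the tunnel frame for a train moving at beta*c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (1.0 - 1.0 / gamma) / beta

sample_betas = (0.1, 0.5, 0.8, 0.99, 0.9999)
ratios = [bomb_to_signal_ratio(b) for b in sample_betas]
for b, r in zip(sample_betas, ratios):
    print(f"beta = {b}: t_b / t_s = {r:.6f}")
```

For β = 4/5, the case drawn in the spacetime diagrams, the ratio is 1/2: the front of the train reaches the far end of the tunnel in half the time the signal needs. The ratio approaches 1 only in the limit β → 1.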

13. Energy and Momentum

Relativistic energy and momentum are given by

$$E = \gamma mc^2, \qquad \mathbf{p} = \gamma m\mathbf{v}, \tag{13.1}$$

where γ is the usual gamma factor associated with v. You can argue for these forms using somewhat cryptic collision arguments together with energy and momentum conservation, as is done in your textbook. I would like to discuss 4-vectors instead.

13.1. 4-Vectors

We can combine the time and space coordinates of an event, as measured in some reference frame S, into a column of four numbers:

$$\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}.$$

Then, we know how these coordinates transform when we change reference frames: they change via a Lorentz transformation. For example, if the reference frame S′ is moving with speed v in the $+\hat{x}$ direction relative to S, then the primed coordinates for the event, measured in S′, are related to the unprimed coordinates for the event, measured in S, via

$$\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \gamma & \beta\gamma & 0 & 0 \\ \beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix}. \tag{13.2}$$

Incidentally, if S′ is moving with speed v in the $+\hat{y}$ direction instead, then

$$\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \gamma & 0 & \beta\gamma & 0 \\ 0 & 1 & 0 & 0 \\ \beta\gamma & 0 & \gamma & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix},$$

and similarly if S′ is moving in the $+\hat{z}$ direction. Any collection of four numbers that can be combined into a column and transforms in this way from one reference frame to the next is called a 4-vector.

If we consider two events, each with its own set of coordinates (both measured in the same reference frame S), then we can also write down the difference between those coordinates. This is the spacetime displacement from event 1 to event 2:

$$\begin{pmatrix} c\Delta t \\ \Delta x \\ \Delta y \\ \Delta z \end{pmatrix} = \begin{pmatrix} c(t_2 - t_1) \\ x_2 - x_1 \\ y_2 - y_1 \\ z_2 - z_1 \end{pmatrix}.$$

The same can be done for any 4-vector (i.e., one can consider differences in a pair of 4-vectors measured in the same reference frame). The same argument that leads to the invariance of the interval $(c\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2$ implies that the analogous combination of components is invariant for any 4-vector. Furthermore, if you multiply any 4-vector by a scalar, which is some number which does not change from one reference frame to the next (e.g., mass), then the result is still a 4-vector.

For example, between any two events that can be connected by something traveling at a speed less than the speed of light, there is one particular reference frame, S∗, in which the two events occur at the exact same location in space. The time between the two events in that particular reference frame is called the proper time, denoted ∆τ. This proper time is invariant under Lorentz transformation. This statement is essentially tautological: It is true by fiat, because the proper time is defined with respect to the particular reference frame S∗. This is the same reason why mass is invariant: Mass is defined as the energy (up to factors of c) in the rest frame of the object. If you were to ask me what is the mass of an object that is moving, I would say it is the same mass that the object would have if it were not moving. Note that we are talking about mass here, not this bizarre thing called the relativistic mass, γm, which you should expeditiously excise from your minds.

Therefore, the spacetime displacement between two events, measured in some reference frame S, can be divided by the proper time between those two events and the result is still a 4-vector, since the latter is a scalar:

c∆t/∆τ ∆x/∆τ      .  ∆y/∆τ  ∆z/∆τ

The time between those two events measured in any other reference frame, S, is equal to ∆t = γ∆τ, where γ is the gamma factor associated with the velocity at which S∗ is moving relative to S (this is time dilation). Thus,

c∆t/∆τ  γc  ∆x/∆τ  γ∆x/∆t       =   .  ∆y/∆τ  γ∆y/∆t ∆z/∆τ γ∆z/∆t

Taking the limit as all these deltas become really small turns the ratios into derivatives. We recognize these derivatives to be the components of the velocity of S∗ relative to S. This defines the 4-velocity:  γc  γv   x   . (13.3) γvy γvz

– 41 – Finally, we can multiply by the scalar mass of some hypothetical object which moves between the two events. The result is also a 4-vector and it is called the energy-momentum 4-vector: E/c  γmc   p  γmv   x   x   =   , (13.4)  py  γmvy pz γmvz which are precisely the definitions (13.1) given at the start. For free, we have the invariance of the interval associated with this 4-vector, which is

E2 − |p|2. (13.5) c2 Usually, this is actually multiplied by the constant c2 to get E2 − |p|2c2.
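To see the 4-vector machinery at work, the following sketch builds the first two components of the energy-momentum 4-vector (13.4) in a frame S′, boosts them to S with the matrix of Eq. (13.2), and checks that the invariant (13.5) is unchanged. It assumes units with c = 1 and motion along x only; the mass m = 2 and the speeds 0.3 and 0.5 are arbitrary illustrative values, and all function names are hypothetical.

```python
import math

def gamma_of(beta):
    return 1.0 / math.sqrt(1.0 - beta**2)

def four_momentum(m, beta):
    """(E/c, p_x) for a mass m moving along x at speed beta (c = 1), Eq. (13.4)."""
    g = gamma_of(beta)
    return g * m, g * m * beta

def boost_to_unprimed(e_prime, p_prime, beta):
    """x-boost of Eq. (13.2): components measured in S' -> components in S."""
    g = gamma_of(beta)
    return g * (e_prime + beta * p_prime), g * (beta * e_prime + p_prime)

def invariant(e, p):
    """E^2/c^2 - p^2 with c = 1, Eq. (13.5)."""
    return e**2 - p**2

m = 2.0
e_prime, p_prime = four_momentum(m, 0.3)             # object moves at 0.3c in S'
e, p = boost_to_unprimed(e_prime, p_prime, 0.5)      # S' moves at 0.5c in S

print("invariant in S':", invariant(e_prime, p_prime))
print("invariant in S: ", invariant(e, p))
print("speed in S (p/E):", p / e)
```

Both invariants should equal $m^2$ (here 4, up to rounding), and the ratio p/E in S should match the velocity-addition result (0.3 + 0.5)/(1 + 0.3 · 0.5).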

13.2. Colliding Photons

[Goldstein, Poole & Safko 7.22 ] A photon of energy E2 collides at angle θ with a photon of energy E1. Determine the minimum value of E2 permitting the formation of a pair of particles of mass m, as a function of E1, m and θ.

SOLUTION:

Expectations: We should expect that we would need to pump in more energy if we are to create heavier particles. Therefore, if m gets larger, we expect that $E_2$ must get larger as well: $E_2 \sim m^{\#}$, where # is some positive exponent. If the first photon already has a lot of energy ($E_1$ is large), then the second photon shouldn’t need as much energy, and vice versa. Therefore, if $E_1$ is big, then $E_2$ can be small, and if $E_1$ is small then $E_2$ should be big: $E_2 \sim 1/E_1^{\#}$. In fact, we can do a bit better than that. Since $E_2$ must have units of energy, and $mc^2$ and $E_1$ have units of energy, we ought to have $E_2 \sim mc^2 \left(\frac{mc^2}{E_1}\right)^{\#}$, where # is some positive exponent. Actually, to be most conservative, all we can really say is that $E_2 \sim mc^2 f\!\left(\frac{mc^2}{E_1}\right)$, where f(x) is an increasing function of x for x > 0.

If θ → 0, the collision is very weak and the incoming energies must be huge in order to produce something. Thus, we expect that $E_2 \to \infty$ as θ → 0. The opposite scenario is θ → π, which corresponds to a head-on collision. This is the “best-case scenario” since it is the strongest collision. Thus, $E_2$ should be minimal at this angle: $E_2 \to \min E_2$ as θ → π. In summary,

$$E_2 = \frac{mc^2 f\!\left(\frac{mc^2}{E_1}\right)}{g(\theta)}, \tag{13.6}$$

where f is an increasing function and g is a function which goes to zero as θ goes to zero and attains a maximum as θ goes to π.

Center of Momentum Frame Picture: It is difficult to describe exactly what happens in the lab frame, S, which is the frame in which the drawing above is drawn. The two masses are in general moving in all sorts of possible directions with all sorts of possible energies. However, the picture is very simple in the center of momentum frame, S0. This is the frame where the total momentum of the system is always exactly 0. So, in this frame, the two photons undergo a head-on collision with both photons coming in with the same energy and equal and opposite momenta. If there is insufficient energy to produce the two masses, m, then the photons could just pass each other, or they could turn into something else. If there is more energy than is the minimum required, then the two masses, m, will be produced and the remaining energy is distributed evenly between the two of them as their kinetic energies. So, the two masses fly off in opposite directions with equal energy and equal and opposite momenta. At the absolute critical case, the two photons collide and all of their energy is used up to produce two masses, m, just sitting there in the center of momentum frame... not moving.

Method 1 (Relativistic Invariant): The relativistic invariant in the COM frame, S′, is $E'^2 - p'^2c^2$, where E′ is the total energy and p′ is the magnitude of the total momentum vector in the COM frame. Well, by definition, p′ = 0. Thus, the relativistic invariant in the COM frame is just $E'^2$.

The total energy in the lab frame, S, is $E = E_1 + E_2$. We have to break up the momentum vectors of each photon into their components to calculate the magnitude of the total momentum vector, $\mathbf{p}$. The horizontal component of $\mathbf{p}$ is $\frac{E_1}{c} + \frac{E_2}{c}\cos\theta$ and the vertical component is $\frac{E_2}{c}\sin\theta$. Therefore,

$$p \equiv |\mathbf{p}| = \sqrt{\left(\frac{E_1}{c} + \frac{E_2}{c}\cos\theta\right)^2 + \left(\frac{E_2}{c}\sin\theta\right)^2} = \frac{1}{c}\sqrt{E_1^2 + E_2^2 + 2E_1E_2\cos\theta}.$$

I would like to rewrite this by adding and subtracting $2E_1E_2$ under the square root. Adding $2E_1E_2$ to $E_1^2 + E_2^2$ completes the square to give $(E_1 + E_2)^2$. Thus,

$$p = \frac{1}{c}\sqrt{(E_1 + E_2)^2 - 2E_1E_2(1 - \cos\theta)} = \frac{1}{c}\sqrt{(E_1 + E_2)^2 - 4E_1E_2\sin^2\frac{\theta}{2}},$$

where I used the trigonometric identity $\sin^2\frac{\theta}{2} = \frac{1 - \cos\theta}{2}$. This last step is certainly not necessary; it is just my habit to do this whenever I see 1 − cos θ, even though it is not always useful. The relativistic invariant calculated in the lab frame, S, is thus

$$E^2 - p^2c^2 = (E_1 + E_2)^2 - (E_1 + E_2)^2 + 4E_1E_2\sin^2\frac{\theta}{2} = 4E_1E_2\sin^2\frac{\theta}{2}.$$

This is equal to the relativistic invariant in the COM frame, which we have already determined to be just $E'^2$ because p′ = 0. Thus,

$$E' = 2\sqrt{E_1E_2}\,\sin\frac{\theta}{2}. \tag{13.7}$$

As we have already discussed above, in the critical case, all of the total energy in the COM frame, E′, is used up to form two masses, m, at rest. Thus,

$$E' = 2\sqrt{E_1E_2}\,\sin\frac{\theta}{2} = 2mc^2 \implies E_2 = \frac{m^2c^4}{E_1\sin^2(\theta/2)}. \tag{13.8}$$

Notice that this does satisfy all of the expectations we stated in the beginning!
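To get a feel for the numbers, here is a minimal sketch that evaluates Eq. (13.8). The choice of an electron-positron pair ($mc^2 \approx 0.511$ MeV) and of $E_1 = 1$ MeV is purely illustrative and not part of the original problem; the function names are hypothetical. Energies are in MeV and angles in radians.

```python
import math

def threshold_E2(E1, mc2, theta):
    """Minimum E2 from Eq. (13.8); E1 and mc2 share units, theta in radians."""
    return mc2**2 / (E1 * math.sin(theta / 2)**2)

def com_energy(E1, E2, theta):
    """Total energy in the COM frame, Eq. (13.7)."""
    return 2.0 * math.sqrt(E1 * E2) * math.sin(theta / 2)

mc2 = 0.511   # MeV, electron rest energy (illustrative choice of particle)
E1 = 1.0      # MeV (illustrative)

for theta_deg in (180, 90, 30):
    theta = math.radians(theta_deg)
    E2 = threshold_E2(E1, mc2, theta)
    print(f"theta = {theta_deg:3d} deg: E2_min = {E2:.4f} MeV, "
          f"E' = {com_energy(E1, E2, theta):.4f} MeV")
```

The threshold is smallest for the head-on collision (θ = 180°) and grows as θ decreases, while the COM energy at threshold stays at $2mc^2$, in line with the expectations listed at the start of the solution.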

Method 2 (Transform to COM frame): The COM frame is moving relative to the lab frame along the direction of the total momentum vector, p, in the lab frame. Therefore, we will set that direction to be the +x-direction. Note that this is not the horizontal direction, which is what you might have been tempted to call the +x-direction instead. With this choice of coordinates, the total momentum vector, p, does not have any y or z components and thus we can neglect y and z altogether. The x-component of the total momentum vector is therefore just the magnitude of the total momentum vector. The top two components of the momentum 4-vector in the lab frame are

$$p^\mu = \begin{pmatrix} E/c \\ p \end{pmatrix} = \begin{pmatrix} \frac{E_1 + E_2}{c} \\ \frac{1}{c}\sqrt{(E_1 + E_2)^2 - 4E_1E_2\sin^2\frac{\theta}{2}} \end{pmatrix}.$$

Notice the notation here. The upper Greek index on $p^\mu$ just indicates that this is a 4-momentum vector and the components are $p^0 = E/c$, $p^1 = p_x$, $p^2 = p_y$ and $p^3 = p_z$. Technically, I should write down the y and z components, but they are both 0.

Let us rewrite $p^\mu$ by factoring out $(E_1 + E_2)^2$ from the square root in p:

$$p^\mu = \frac{E_1 + E_2}{c}\begin{pmatrix} 1 \\ \sqrt{1 - \frac{4E_1E_2\sin^2(\theta/2)}{(E_1 + E_2)^2}} \end{pmatrix} \equiv \frac{E_1 + E_2}{c}\begin{pmatrix} 1 \\ A \end{pmatrix}. \tag{13.9}$$

Note that I just called the whole mess in the square root A, so that I don’t have to keep writing it. All we know is that the COM frame moves in the +x-direction relative to the lab frame. But, we don’t know how fast it is moving. Let us set its speed to be βc, with corresponding γ factor. We will have to determine what β has to be for the COM frame. We boost $p^\mu$ to get the 4-momentum vector in the COM frame:

$$\begin{pmatrix} E'/c \\ p' \end{pmatrix} = p'^\mu = \gamma\begin{pmatrix} 1 & -\beta \\ -\beta & 1 \end{pmatrix}\frac{E_1 + E_2}{c}\begin{pmatrix} 1 \\ A \end{pmatrix} = \gamma\,\frac{E_1 + E_2}{c}\begin{pmatrix} 1 - \beta A \\ A - \beta \end{pmatrix}.$$

For the COM frame, we know that p′ = 0. But, we see above that p′ ∝ A − β. Therefore, the β parameter that takes us from the lab frame to the COM frame must be β = A. Plugging that back in to the equation above gives

$$\begin{pmatrix} E'/c \\ p' \end{pmatrix} = \frac{1}{\sqrt{1 - A^2}}\,\frac{E_1 + E_2}{c}\begin{pmatrix} 1 - A^2 \\ 0 \end{pmatrix} = \frac{E_1 + E_2}{c}\sqrt{1 - A^2}\begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$

Plugging in the definition of A in Eqn. (13.9) gives

$$\begin{pmatrix} E'/c \\ p' \end{pmatrix} = \frac{2\sqrt{E_1E_2}\,\sin(\theta/2)}{c}\begin{pmatrix} 1 \\ 0 \end{pmatrix},$$

which gives precisely the same E′ as we found in method 1 in Eqn. (13.7).
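The agreement between the two methods can be confirmed numerically. The sketch below carries out the boost of Method 2 for sample photon energies and compares the resulting COM energy with Eq. (13.7). It assumes c = 1 and A defined as the ratio pc/E of the lab-frame totals; the sample values $E_1 = 1$, $E_2 = 2$, θ = 120° and the function names are illustrative.

```python
import math

def com_by_boost(E1, E2, theta):
    """Boost the lab-frame (E, p) along p with beta = A; returns (beta, E', p')."""
    total_E = E1 + E2
    p = math.sqrt(total_E**2 - 4 * E1 * E2 * math.sin(theta / 2)**2)
    A = p / total_E                      # the quantity A of Eq. (13.9)
    beta = A                             # speed of the COM frame in units of c
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    E_prime = gamma * (total_E - beta * p)
    p_prime = gamma * (p - beta * total_E)
    return beta, E_prime, p_prime

def com_by_invariant(E1, E2, theta):
    """Method 1, Eq. (13.7)."""
    return 2.0 * math.sqrt(E1 * E2) * math.sin(theta / 2)

E1, E2, theta = 1.0, 2.0, math.radians(120)
beta, E_prime, p_prime = com_by_boost(E1, E2, theta)

print("beta of COM frame:", beta)
print("E' from boost:    ", E_prime)
print("E' from invariant:", com_by_invariant(E1, E2, theta))
print("p' in COM frame:  ", p_prime)
```

Note that the boost is singular at θ = 0, where A = 1: two photons traveling in exactly the same direction have no center of momentum frame.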

14. Quantum Mechanics

For me, the double slit experiment is the gateway to quantum mechanics. This is not historically how the field developed. I would say that the historical development is closer to the way your textbook presents the material, with Planck’s discovery of the Planck distribution, derived from his clever insight that light came in discrete units (later called photons), whose energy was directly proportional to the frequency of the light, the proportionality constant being Planck’s constant. Then, Einstein ran with this idea to explain the photoelectric effect, etc.

However, I think that the double slit experiment, more so than either the Planck distribution or the photoelectric effect, really captures a broad scope of the weird and wonderful phenomena that propelled quantum mechanics in the early days and which were the subject of many a heated debate. I hope I’ll be able to convince you of this, but in the meantime, please accept my apologies for presenting material now that is in a later chapter of your textbook.

14.1. The Wacky World of the Double Slit

Imagine performing the double slit experiment with light that is weak enough that photons arrive at the screen at a low enough rate that you (or, more accurately, the detectors on a screen) can actually distinguish the arrival event of each single photon. Surely, we would have to conclude that light is made up of bona fide particles in this case since you can see when each one arrives at a particular point on the screen.

If you were to cover one of the slits, then photons pass through the other slit, theoretically one at a time, and they just go straight through to the screen. You would expect to see dots form on the screen (if you used photographic film or something like that) right around the point on the screen directly in front of the slit. These dots would pile up over time as you exposed the film longer and longer.

If you were to have both slits open, you might think that you would just get two regions on the screen, one directly in front of each of the slits, where photons pile up over time. After all, if one photon goes through the slits at a time, then it either goes straight in front of one slit or the other, right? Surprisingly, that’s not what happens at all. Instead, you will observe the same old interference pattern that you see when you shine a strong light source through the slits; it just takes time for the pattern to build up as you expose the film longer and longer!

In real life, these experiments were first done with electrons rather than photons. For the time being, let us postpone discussing why you can use electrons instead of photons in the double slit experiment. Below are pictures taken from the original papers of the first experiments to actually observe this effect. If you want to see a video of this done in 2012, see http://iopscience.iop.org/1367-2630/15/3/033018/media/njp458349movie2.mov.

Figure 1: Time lapse exposures in the double-slit experiment performed using electrons. (a) P. G. Merli, G. F. Missiroli and G. Pozzi. “On the statistical aspect of electron interference phenomena.” American Journal of Physics 44 306, (1976). (b) A. Tonomura, J. Endo, T. Matsuda, T. Kawasaki and H. Ezawa. “Demonstration of single-electron build-up of an interference pattern.” American Journal of Physics 57 117, (1989).
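The build-up of the pattern one hit at a time can be mimicked with a short simulation. This is only a sketch under assumptions that are mine and not part of the experiments cited above: the arrival probability is taken to be an idealized pattern of $\cos^2$ fringes under a Gaussian envelope, the units are arbitrary, and the function names are hypothetical. Each particle lands at a single random position; the fringes only emerge in the histogram of many hits.

```python
import math
import random

def intensity(x, fringe_spacing=1.0, envelope_width=3.0):
    """Idealized two-slit pattern: cos^2 fringes under a Gaussian envelope (maximum 1)."""
    fringes = math.cos(math.pi * x / fringe_spacing) ** 2
    envelope = math.exp(-(x / envelope_width) ** 2)
    return fringes * envelope

def sample_hits(n_hits, rng, x_max=6.0):
    """Draw single-particle arrival positions by rejection sampling from intensity(x)."""
    hits = []
    while len(hits) < n_hits:
        x = rng.uniform(-x_max, x_max)
        if rng.random() < intensity(x):
            hits.append(x)
    return hits

def histogram(hits, x_max=6.0, n_bins=48):
    """Count hits in equal-width bins across [-x_max, x_max]."""
    counts = [0] * n_bins
    width = 2 * x_max / n_bins
    for x in hits:
        idx = min(int((x + x_max) / width), n_bins - 1)
        counts[idx] += 1
    return counts

rng = random.Random(0)
for n in (20, 200, 5000):
    counts = histogram(sample_hits(n, rng))
    peak = max(counts)
    # Crude text rendering: denser characters mean more hits in that bin.
    row = "".join(" .:*#"[min(4, 5 * c // (peak + 1))] for c in counts)
    print(f"{n:5d} hits: {row}")
```

With only a handful of hits the row looks like random dots; with thousands, bright and dark bands appear, which is the qualitative behavior seen in the time-lapse exposures.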

Please take a moment to contemplate how amazing this is. The electrons are passing through the slits one at a time. What on earth are they interfering with? How do they know to land with a greater probability in some regions of the screen than in others? To me, this experiment is the definitive demonstration of wave-particle duality. How can something be a wave and a particle at the same time? Well, here it is, in all its glory.

To understand this phenomenon, we will develop the rudiments of the wavefunction picture of quantum mechanics and the so-called Copenhagen interpretation. But, let us leave that for another day. For now, consider the following thought experiment. Suppose you put a light source behind the double slit shooting light across each of the slits. There is then a detector on each side that detects this light. When an electron passes through, it may interact with the light and cause a decrease in the intensity of the light that is measured at the detectors. Basically, the electron cuts off the light beam for an instant as it passes by. The point of this whole setup is for us to experimentally verify which slit each electron goes through. The question is: does this have any effect on the pattern that you observe on the wall, and, if so, what is the effect?

If you think there might be an effect, you might wonder how great an effect this might have. Could I not just make the observation light arbitrarily weak so as to perturb the system minimally? The answer turns out to be pretty catastrophic. If you can determine which slit each electron passes through, then the interference pattern will be completely destroyed. You will end up with a wash of electrons on the screen mostly concentrated at the two points on the screen directly in front of the slits! You can imagine turning the observation light on and off, effectively destroying and then reviving the interference pattern at will!

We will not resolve this seeming paradox at the moment. But, let me just tell you the punchline, and you will see how it works later on. The point is that you cannot simply make the observation light arbitrarily weak. If you do, you will not be able to determine the position of the passing electrons with sufficient resolution to determine which slit each passed through. Furthermore, you will find out that you cannot really use arbitrarily high momentum electrons in this experiment. It turns out that the momentum of the electrons and the momentum of the photons you would have to use to observe those electrons will be comparable; they are both very small, but nevertheless comparable in magnitude with each other. Therefore, when they interact (e.g., collide), the photon may have a large effect on the final momentum of the electron and may deflect it significantly. This will completely destroy the interference pattern. Therefore, there is no hidden mini-demon whose job it is to confound your efforts to measure the electrons and observe interference at the same time. You yourself are destroying the interference pattern by perturbing the system too strongly.

14.2. Blackbody Radiation and the Ultraviolet Catastrophe

We learn from the photoelectric effect that light may be thought of as being built out of particles called photons, even though it behaves like a wave in most familiar situations. Somehow, very many photons conspire to produce wave-like behavior. This concept of photons is what starts us down the road towards blackbody radiation, although historically the ideas of Planck about blackbody radiation, which we are about to describe, preceded, and in fact inspired, Einstein’s explanation of the photoelectric effect.

One thoroughly embarrassing problem that remained before Planck came on the scene is called the ultraviolet catastrophe. Consider a thermally insulated cavity of volume $V$ containing radiation. The energy associated with an electric field is proportional to the square of the electric field. The equipartition theorem states that, at thermal equilibrium at temperature $T$, the average energy associated with a quadratic degree of freedom, such as this, is $\sim kT$ (or $kT/2$; it really doesn’t matter for this discussion). However, there are technically infinitely many possible modes of radiation inside a cavity, with arbitrarily short wavelength. If each mode is to possess an average energy of $kT$, then the total energy would be infinite! Schroeder describes just how embarrassing this conclusion is: if it were correct, you would expect to be blasted with an infinite amount of radiation every time you open the oven door to check the cookies!

The classical assumption is that each mode can have any non-negative energy, $E$. From 7B, we know that the probability for a mode to have energy $E$ is proportional to the Boltzmann factor: $P(E) \propto e^{-E/kT}$. Calculating the average energy per mode as you did for an ideal gas in 7B for such a continuous spectrum produces the equipartition theorem and leads to the UV catastrophe as described above.

Planck’s neat idea was that electromagnetic energy is not continuously distributed, but is quantized in integer units of $h\nu$, where $\nu$ is the frequency of radiation and $h$ is Planck’s constant. He proposed that light was absorbed and emitted by matter in quanta called photons. So, a single mode with frequency $\nu$ can have an energy of $0$, or $h\nu$, or $2h\nu$, etc. But it cannot have an energy between these values, like $h\nu/2$, since that would correspond to half a photon! This leads to the Planck distribution and eventually to the Stefan-Boltzmann law of radiation, which states that the average intensity of radiation from a blackbody at temperature $T$ is proportional to $T^4$, with a proportionality constant given by the Stefan-Boltzmann constant.
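To see how quantization tames the high-frequency modes, here is a minimal numerical sketch. It assumes only the Boltzmann weights $e^{-E/kT}$ and the allowed energies $E_n = n h\nu$ stated above; the function names and the truncation at `n_max` are my own choices. It sums the weights directly and compares with the closed form $x/(e^x - 1)$ for $x = h\nu/kT$, which is the standard result of that geometric sum.

```python
import math

def planck_mean_energy(x, n_max=2000):
    """Mean energy per mode in units of kT, with allowed energies E_n = n*h*nu.

    x = h*nu / (k*T). Sums the Boltzmann weights exp(-n*x) directly,
    truncated at n_max terms.
    """
    weights = [math.exp(-n * x) for n in range(n_max)]
    z = sum(weights)
    return sum(n * x * w for n, w in enumerate(weights)) / z

def planck_closed_form(x):
    """Closed form of the same sum, x / (exp(x) - 1), in units of kT."""
    return x / math.expm1(x)

for x in (0.01, 0.1, 1.0, 5.0, 20.0):
    print(f"h nu / kT = {x:5.2f}:  <E>/kT = {planck_mean_energy(x):.6f}  "
          f"(closed form {planck_closed_form(x):.6f}, equipartition 1)")
```

For $h\nu \ll kT$ the mean energy approaches the equipartition value $kT$, while for $h\nu \gg kT$ it is exponentially suppressed, so the short-wavelength modes no longer contribute an infinite total energy.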

14.3. Stefan-Boltzmann Law

The Stefan-Boltzmann law gives the irradiance of a graybody at temperature $T$:

\[ I = \epsilon \sigma T^4, \quad \text{where} \quad \sigma = \frac{2\pi^5 k_B^4}{15 h^3 c^2} = 5.67 \times 10^{-8}\ \frac{\mathrm{W}}{\mathrm{m^2\,K^4}}, \tag{14.1} \]

and $\epsilon$ is the emissivity of the graybody, which is a number between 0 and 1, with 1 corresponding to a perfect blackbody. On the other hand, the absorptivity, $a$, of an object measures the fraction of the light incident on the object that the object absorbs. At equilibrium, $a = \epsilon$, meaning that what radiation the object absorbs, it emits, so that it neither heats up (if it emits less than it absorbs) nor cools down (if it emits more than it absorbs).

Assume that the sun is a blackbody of temperature 5800 K and radius $7 \times 10^8$ m, located $1.5 \times 10^{11}$ m from the earth. Assume that the earth is a graybody, which absorbs part of the radiation incident upon it from the sun, and then re-radiates it isotropically. Neglect any other effects which could heat the earth. Calculate the surface temperature of the earth under these assumptions.

SOLUTION:

Let $R_S$ be the radius of the sun, $R_{ES}$ be the earth-sun distance, $R_E$ the radius of the earth, $T_S$ the temperature of the sun, $T_E$ the temperature of the earth, and $\epsilon$ the emissivity of the earth, which, at equilibrium, is also the absorptivity of the earth. The power being radiated by the sun is just its irradiance multiplied by its surface area:

\[ P_S = (\sigma T_S^4)(4\pi R_S^2). \]

By the time this light reaches the distance of the earth, the power has spread over a sphere with radius $R_{ES}$. Thus, the irradiance of sunlight at the earth, which we call $I_{ES}$, is

\[ I_{ES} = \frac{P_S}{4\pi R_{ES}^2} = (\sigma T_S^4) \left( \frac{R_S}{R_{ES}} \right)^2. \]

This light is travelling radially outwards from the sun, and so only the cross-sectional area of the earth from the view of the sun is actually absorbing the light. This cross-sectional area is $\pi R_E^2$. Furthermore, not all of that light is absorbed: only a fraction $\epsilon$ of it is absorbed. Thus, the power absorbed by the earth is

\[ P_E^{(\mathrm{abs})} = \epsilon \pi R_E^2 I_{ES} = \epsilon \pi (\sigma T_S^4) \left( \frac{R_S R_E}{R_{ES}} \right)^2. \]

On the other hand, the power re-radiated by the earth is isotropic and radiated by all the surface area of the earth:

\[ P_E^{(\mathrm{rad})} = (\epsilon \sigma T_E^4)(4\pi R_E^2). \]

At equilibrium, $P_E^{(\mathrm{abs})} = P_E^{(\mathrm{rad})}$, and solving for $T_E$ gives

\[ T_E = T_S \sqrt{\frac{R_S}{2 R_{ES}}} = 280\ \mathrm{K} \approx 7^\circ\mathrm{C}. \]

That’s quite cold, but it’s supposed to represent an average surface temperature for the earth. However, even if it were a good value, we would have to take it with a heap of salt since we didn’t even take into account the fact that the earth has an atmosphere!
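As a quick check of the arithmetic, the final formula can be evaluated directly. This is a minimal sketch using only the numbers given in the problem statement; the function and variable names are mine. Note that the emissivity and the radius of the earth both cancel, so neither appears.

```python
import math

T_SUN_K = 5800.0        # temperature of the sun, from the problem statement
R_SUN_M = 7.0e8         # radius of the sun
R_EARTH_SUN_M = 1.5e11  # earth-sun distance

def equilibrium_temperature(t_sun, r_sun, r_orbit):
    """T_E = T_S * sqrt(R_S / (2 R_ES)); emissivity and the earth's radius cancel."""
    return t_sun * math.sqrt(r_sun / (2.0 * r_orbit))

t_earth = equilibrium_temperature(T_SUN_K, R_SUN_M, R_EARTH_SUN_M)
print(f"T_E = {t_earth:.0f} K = {t_earth - 273.15:.0f} deg C")
```

Because $T_E$ scales only as the square root of $R_S/R_{ES}$, the result is fairly insensitive to the rounded input values.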

14.4. Bohr Model

Some time after Planck’s discovery of his model of blackbody radiation (1900) and Einstein’s explanation of the photoelectric effect (1905), Niels Bohr proposed an explanation for atomic spectra: the so-called Bohr model of the atom (1913). I will not reproduce the derivation of the radii, speeds and energies of the electron in its various orbitals in the Bohr model. However, I will mention the way I remember the orbital energy and radius.

A special case of the virial theorem says that for orbital paths in the presence of a central force, which is proportional to the inverse square of the radial distance, the average potential energy along the orbit, $\langle V \rangle$, is $-2$ times the average kinetic energy, $\langle T \rangle$. Therefore, the average total energy is $\langle E \rangle = -\langle T \rangle = \langle V \rangle / 2$. The convention for potential energy here is that $V \to 0^-$ as $r \to \infty$. That is, the potential energy is negative and approaches zero from below at large distances.

This statement of the virial theorem is particularly powerful for circular orbits because these orbits have constant kinetic and potential energies and therefore, we can just get rid of the averages and the result still holds! An electron orbiting a proton in hydrogen is in the presence of the Coulomb force, which is an inverse square force and therefore satisfies the conditions of the special case of the virial theorem discussed above. Therefore, the energy is simply the negative of the kinetic energy or half the potential energy:

\[ E = -T = -\frac{L^2}{2I} = -\frac{L^2}{2mr^2}, \quad \text{and} \quad E = \frac{V}{2} = -\frac{e^2}{8\pi\epsilon_0 r} = -\frac{\alpha \hbar c}{2r}. \tag{14.2} \]

I have introduced the dimensionless fine structure constant,

\[ \alpha = \frac{e^2}{4\pi\epsilon_0 \hbar c} \approx \frac{1}{137}. \tag{14.3} \]

Set the above two expressions for $E$ equal to each other and solve for $r$:

\[ r = \frac{L^2}{\alpha \hbar m c}. \tag{14.4} \]

Finally, we use Bohr’s postulate: angular momentum comes in integer units of $\hbar$, that is, $L = n\hbar$:

\[ r_n = \frac{n^2 \hbar^2}{\alpha \hbar m c} = \frac{n^2 \hbar}{\alpha m c}. \tag{14.5} \]

I actually prefer writing this as

\[ r_n = \frac{n^2 \hbar c}{\alpha m c^2}, \tag{14.6} \]

because I always remember that $mc^2 = 0.511$ MeV for the electron. It’s not really that important because I never remember what $\hbar c$ is anyway. For future reference, $hc = 1.24\ \mu\mathrm{eV \cdot m}$ (that’s micro-electron volts times meters), so $\hbar c = hc/2\pi \approx 0.197\ \mu\mathrm{eV \cdot m}$.

Plugging this back into the expression $E = -\frac{\alpha \hbar c}{2r}$ gives the energies

\[ E_n = -\frac{\alpha^2 m c^2}{2n^2}. \tag{14.7} \]

This is the only expression for the orbital energy I ever remember because it is nice and succinct. I always remember that the hydrogen energy levels go like $E \sim 1/n^2$. The only sensible unit of energy in this problem is $mc^2$, the rest mass energy of the electron. In fact, we are assuming in our analysis above that the electron is non-relativistic. This means that the energy levels should be very small compared to the rest mass energy of the electron. That is, they should be measured in units of the electron rest mass energy and in those units, they should be small. Indeed, this is the case because $\alpha^2$ is a small number.

I remember the factor of $\alpha^2$ via an argument from quantum field theory. Don’t worry, you don’t have to understand the details of the argument or how it is derived in quantum field theory. A qualitative picture will suffice. Worst case scenario: this can just serve as a memory aid. The interaction between the electron and the proton, or indeed any two charged objects, happens via the exchange of a photon. For example, the simplest such exchange might look like

[Feynman diagram: an incoming electron $e^-$ and an incoming proton $p^+$ exchange a single photon and then go out; each of the two vertices is labeled $\alpha$.] (14.8)

This is what is called a Feynman diagram. It is supposed to denote an electron and a proton coming in, interacting via the exchange of a photon, and then going out. In the diagram, time goes upwards.
The diagram makes it seem as though the electron and proton go away from each other after the interaction (i.e., repel). This is just the conventional way this diagram is drawn; in fact, the diagram does not usually live in position space anyway, but rather in momentum space. But, if you like, there is nothing wrong with drawing the outgoing electron and proton lines to be heading towards each other rather than away.

Each vertex denotes a local interaction between a charge and the photon and counts as one factor of $\alpha$. There are two vertices in the above diagram, and therefore the overall interaction strength goes like $\alpha^2$. Of course, you can have ever-more complicated diagrams with more and more photon lines. You can even have internal loops consisting of electrons and positrons and all sorts of other particles. However, these will necessarily come in with ever-more factors of $\alpha$ and since $\alpha$ is small, these are ever-smaller effects. The energy, (14.7), is sometimes called the tree-level energy because it is derived from the above tree-level Feynman diagram, which contains no loops.

Finally, there is the pesky factor of 2 in the denominator. If you remember everything else and, in addition, remember that, for hydrogen, when you plug in $n = 1$, you are supposed to get $-13.6$ eV, then you can’t miss the factor of 2, since otherwise you would get $-27.2$ eV instead.

Now, consider the following problem:

(a) The power radiated by an accelerated charge $e$ is given in classical physics by the formula

\[ P = \frac{1}{4\pi\epsilon_0} \frac{2e^2}{3c^3} a^2 \quad \text{(SI units)}, \]

where $a$ is the acceleration. Using this formula, calculate the power radiated by an electron in a Bohr orbit characterized by the quantum number $n$. (According to the correspondence principle, when $n$ is very large this should agree with a proper quantum mechanical calculation.)

(b) The decay rate for an electron in an orbit may be defined to be the power radiated, $P$, divided by the energy emitted in the decay. (The decay rate is the inverse of the lifetime.) Use the Bohr theory expression for the energy radiated, and the expression for $P$ from part (a), to calculate the “correspondence” value of the decay rate when the electron makes a transition from orbit $n$ to orbit $n - 1$. What is the value of this decay rate when $n = 2$? (This will not agree exactly with the true quantum theory, since the correspondence principle will not hold when $n$ is not $\gg 1$.) What is the decay rate when the transition is from an orbit $n$ to an orbit $n - m$?

(c) Use the value of the “lifetime” of an electron in an n = 2 Bohr orbit, calculated in part (b), to estimate the uncertainty in the energy of the n = 2 energy level. How does it compare with the energy of that level?

SOLUTION:

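(a) The acceleration in a circular orbit is related to tangential speed and radius via

\[ a = \frac{v^2}{r}. \]

The radius $r_n$ of the $n$th orbit is given in (14.5). The speed in this orbit is given by solving for $v$ in the equation $L = mvr$ and plugging in $L = n\hbar$:

\[ r_n = \frac{n^2 \hbar}{\alpha m c}, \qquad v_n = \frac{n\hbar}{m r_n} = \frac{\alpha c}{n}. \tag{14.9} \]

The expression for $v_n$ is particularly nice because it shows you that the electron is pretty non-relativistic, since $\alpha$ is a small number, so $v \ll c$. Therefore, the acceleration in the $n$th orbit is

\[ a_n = \frac{v_n^2}{r_n} = \frac{\alpha^3 m c^3}{n^4 \hbar}. \tag{14.10} \]

Using the definition of $\alpha$ in (14.3), we can write the power as

\[ P = \frac{2\alpha\hbar}{3} \left( \frac{a}{c} \right)^2. \]

Therefore, the power radiated by an electron in the $n$th Bohr orbital is

\[ P_n = \frac{2\alpha\hbar}{3} \left( \frac{\alpha^3 m c^2}{n^4 \hbar} \right)^2 = \frac{2\alpha^7 m^2 c^4}{3 n^8 \hbar}. \tag{14.11} \]

If we plug in $n = 2$, we will get

\[ P_2 = \frac{2 \left( \frac{1}{137} \right)^7 (0.511\ \mathrm{MeV})^2}{3 (2^8)(6.58 \times 10^{-16}\ \mathrm{eV \cdot s})} = 1.14 \times 10^9\ \frac{\mathrm{eV}}{\mathrm{s}}. \]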

(b) Classically, there would be a continuum of orbital states between $n = 2$ and $n = 1$ and the electron could radiate continuously and decay continuously. Its orbit would very quickly spiral inwards and the electron would crash into the proton. There would be no stable atoms at all and no chemistry or life could possibly exist. Clearly, that’s wrong! You could say that our very existence is evidence for quantum mechanics.

The model we are suggesting in this problem is that the electron sort of waits until it would have radiated away the difference in energy between the $n = 2$ and $n = 1$ orbitals had it been radiating continuously at the rate $P_2$, and then at that point it radiates that whole energy difference at once. At the rate $P_2$, the time it would take to radiate away the energy difference between $n = 2$ and $n = 1$ is

\[ \Delta t_2 \equiv \frac{\Delta E_{2 \to 1}}{P_2} = \frac{\frac{-13.6\ \mathrm{eV}}{2^2} - \frac{-13.6\ \mathrm{eV}}{1^2}}{1.14 \times 10^9\ \frac{\mathrm{eV}}{\mathrm{s}}} \approx 10^{-8}\ \mathrm{s}. \tag{14.12} \]

This is the average lifetime of the $n = 2$ orbital. The decay rate, $\gamma$, is just the inverse of this.

(c) The energy-time uncertainty relation is

\[ \Delta E\, \Delta t \geq \frac{\hbar}{2}. \]

If you use the smallest bound for our estimate and $\Delta t$ in (14.12), you get

\[ \Delta E_2 = \frac{\hbar}{2 \Delta t_2} = \frac{6.58 \times 10^{-16}\ \mathrm{eV \cdot s}}{2 \times 10^{-8}\ \mathrm{s}} = 3.3 \times 10^{-8}\ \mathrm{eV}. \tag{14.13} \]

Since E2 is of order eV, we can say that we know the energy of the orbital to a high precision since the uncertainty is so small in comparison.
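The numbers in this section are easy to reproduce. The following is a minimal sketch, not a definitive calculation: it uses $\alpha = 1/137$, $mc^2 = 0.511$ MeV and $\hbar = 6.58 \times 10^{-16}$ eV·s as in the solution above, plus the value $\hbar c = 197.3$ eV·nm, which is an assumption of mine (the standard value) and does not appear in the solution. The function names are hypothetical. Because it does not round $\Delta t_2$ to $10^{-8}$ s, the uncertainty it prints is slightly larger than the one in (14.13).

```python
ALPHA = 1.0 / 137.0    # fine structure constant, Eqn. (14.3)
MC2_EV = 0.511e6       # electron rest energy in eV
HBAR_EV_S = 6.58e-16   # hbar in eV*s, as used in the solution
HBAR_C_EV_NM = 197.3   # hbar*c in eV*nm (assumed standard value, not from the notes)

def radius_nm(n):
    """Radius of Bohr orbit n, Eqn. (14.6)."""
    return n**2 * HBAR_C_EV_NM / (ALPHA * MC2_EV)

def energy_ev(n):
    """Energy of Bohr orbit n, Eqn. (14.7)."""
    return -ALPHA**2 * MC2_EV / (2 * n**2)

def radiated_power_ev_per_s(n):
    """Classical power radiated in orbit n, Eqn. (14.11)."""
    return 2 * ALPHA**7 * MC2_EV**2 / (3 * n**8 * HBAR_EV_S)

def lifetime_s(n):
    """Time to radiate E_n - E_(n-1) at the constant rate P_n, as in Eqn. (14.12)."""
    return (energy_ev(n) - energy_ev(n - 1)) / radiated_power_ev_per_s(n)

def energy_uncertainty_ev(n):
    """Smallest Delta E allowed by Delta E * Delta t >= hbar / 2."""
    return HBAR_EV_S / (2 * lifetime_s(n))

print(f"r_1 = {radius_nm(1):.4f} nm, E_1 = {energy_ev(1):.2f} eV")
print(f"P_2 = {radiated_power_ev_per_s(2):.3g} eV/s")
print(f"lifetime of n = 2: {lifetime_s(2):.2g} s, decay rate {1 / lifetime_s(2):.2g} per s")
print(f"Delta E_2 = {energy_uncertainty_ev(2):.2g} eV")
```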

14.5. Time-Evolution in 1D Infinite Square Well

First, let us prove that the wavefunctions of the one-dimensional infinite square well of length $L$ are orthonormal. Recall that the wavefunctions and the energies of a particle of mass $m$ occupying the corresponding states are labeled by a positive integer, $n$:

\[ \psi_n(x) = \sqrt{\frac{2}{L}} \sin\frac{n\pi x}{L}, \qquad E_n = \frac{n^2 \pi^2 \hbar^2}{2mL^2}. \tag{14.14} \]

We would like to prove that

\[ \int_0^L \psi_m(x)^* \psi_n(x)\, dx = \delta_{mn}, \tag{14.15} \]

where $\delta_{mn}$ equals 1 if $m = n$ and zero if $m \neq n$. This is called the Kronecker delta. Note that the complex conjugation is actually immaterial in this case because the wavefunctions happen to be real. However, this is not always the case, so it’s a good idea to keep the complex conjugation in when you write the orthonormality condition in general. Let us write out the left hand side:

\[ \int_0^L \psi_m(x)^* \psi_n(x)\, dx = \frac{2}{L} \int_0^L \sin\frac{m\pi x}{L} \sin\frac{n\pi x}{L}\, dx = 2 \int_0^1 \sin(m\pi\xi) \sin(n\pi\xi)\, d\xi, \]

where we changed the integration variable to $\xi \equiv x/L$, for convenience. We can use the trigonometric identity

2 sin α sin β = cos(α − β) − cos(α + β).

Using this identity, we can write the integral we are calculating as

\begin{aligned}
\int_0^L \psi_m(x)^* \psi_n(x)\, dx &= \int_0^1 \Big( \cos[(m-n)\pi\xi] - \cos[(m+n)\pi\xi] \Big)\, d\xi \\
&= \left. \frac{\sin[(m-n)\pi\xi]}{(m-n)\pi} \right|_0^1 - \left. \frac{\sin[(m+n)\pi\xi]}{(m+n)\pi} \right|_0^1 \\
&= \mathrm{sinc}[(m-n)\pi] - \mathrm{sinc}[(m+n)\pi]. \qquad (14.16)
\end{aligned}

Here $\mathrm{sinc}\, u \equiv (\sin u)/u$, with $\mathrm{sinc}\, 0 = 1$. Since $m$ and $n$ are both positive integers, so is $m + n$. Therefore, $\sin[(m+n)\pi] = 0$ and therefore $\mathrm{sinc}[(m+n)\pi] = 0$, since the denominator $(m+n)\pi \neq 0$. On the other hand, $m - n$ can be any integer: positive, negative, or zero. If $m - n \neq 0$, then we still have $\mathrm{sinc}[(m-n)\pi] = 0$, but if $m - n = 0$, then $\mathrm{sinc}[(m-n)\pi] = \mathrm{sinc}\, 0 = 1$. This proves the desired relation: that this integral is equal to zero except when $m = n$, in which case it is equal to 1. This is precisely the orthonormality condition, Eqn. (14.15).

It turns out that this orthonormality condition is all we need to prove that the wavefunctions, $\psi_n(x)$, form a complete basis. The completeness condition says that any wavefunction that satisfies Schrödinger’s equation (in this case, for the one-dimensional infinite square well potential) may be written as a superposition of the basis wavefunctions. We would like to prove this now. Suppose we have an arbitrary wavefunction, $\psi(x)$, that satisfies the one-dimensional infinite square well potential Schrödinger equation. We would like to write it as a superposition:

\[ \psi(x) = \sum_{n=1}^{\infty} C_n \psi_n(x). \tag{14.17} \]

Let us multiply both sides by $\psi_m(x)^*$ and integrate from $x = 0$ to $x = L$:

\[ \int_0^L \psi_m(x)^* \psi(x)\, dx = \sum_{n=1}^{\infty} C_n \int_0^L \psi_m(x)^* \psi_n(x)\, dx = \sum_{n=1}^{\infty} C_n \delta_{mn} = C_m. \tag{14.18} \]

This gives us a formula for calculating the expansion coefficients, Cm. There are some technicalities regarding whether or not the integral expression on the LHS for Cm makes any sense, but these are mathematical qualms and do not, to my knowledge, arise in any meaningful physical situation. Thus, we have shown that the wavefunctions, ψn(x), furnish a complete basis for all appropriate wavefunctions. [Note: this is very similar to Fourier’s theorem, which claims that any “sufficiently nice” function may be written as a superposition of sines and cosines or complex exponentials.] Now, here comes the true utility of these basis wavefunctions. By construction, they are what are called energy eigenstates because they have well-defined energies given in Eqn. (14.14). This is useful because it is easy to write down the time evolution of a state that has a well-defined energy. If ψ(x) is the wavefunction of a state that has energy E, then the time evolution of that state is

\[ \psi(x, t) = e^{-iEt/\hbar} \psi(x). \tag{14.19} \]

Often, one defines $\omega \equiv E/\hbar$ so that the exponential can be written $e^{-i\omega t}$. We may apply this to the basis wavefunctions. The energies are $E_n$ given earlier. Define $\omega_n \equiv E_n/\hbar$. Then,

\[ \psi_n(x, t) = e^{-i\omega_n t} \psi_n(x). \tag{14.20} \]

If $\psi(x)$ does not have a well-defined energy, then this simple relation no longer holds. However, we can write $\psi(x)$ as a superposition of the basis states and evolve each term:

\[ \psi(x, t) = \sum_{n=1}^{\infty} C_n e^{-i\omega_n t} \psi_n(x). \tag{14.21} \]

Voila! We are able to time-evolve $\psi(x)$ even though it does not have a well-defined energy!

Let’s work out an example. Suppose the particle is located somewhere on the left hand half of the infinite square well, but most likely to be found in the middle of the left half. Suppose its wavefunction is

\[ \psi(x) = \begin{cases} \frac{2}{\sqrt{L}} \sin\frac{2\pi x}{L}, & 0 \leq x \leq \frac{L}{2}, \\ 0, & \text{elsewhere.} \end{cases} \tag{14.22} \]

I have made sure that the integral of $|\psi(x)|^2$ is 1, which has to be the case since $|\psi(x)|^2\, dx$ is supposed to represent the probability for the particle to be located in a region of size $dx$ around the point $x$ and so the integral is the probability for the particle to be anywhere, which had better be 1. Note that this wavefunction looks very much like the $n = 2$ basis wavefunction, but only on the left half of the well. Let us use the formula for the expansion coefficient, Eqn. (14.18):

\begin{aligned}
C_m &= \int_0^L \psi_m(x)^* \psi(x)\, dx \\
&= \frac{2\sqrt{2}}{L} \int_0^{L/2} \sin\frac{m\pi x}{L} \sin\frac{2\pi x}{L}\, dx \\
&= \sqrt{2} \int_0^1 \sin\left( \frac{m}{2}\pi\xi \right) \sin(\pi\xi)\, d\xi \\
&= \frac{1}{\sqrt{2}} \left[ \mathrm{sinc}\left( \left( \frac{m}{2} - 1 \right)\pi \right) - \mathrm{sinc}\left( \left( \frac{m}{2} + 1 \right)\pi \right) \right]. \qquad (14.23)
\end{aligned}

Note that we changed the variable of integration to $\xi \equiv 2x/L$. Note that for $m$ even but $m \neq 2$, this formula gives $C_m = 0$. We also have $C_2 = 1/\sqrt{2}$. There’s no real point to simplifying this when $m$ is odd. We have

\[ \psi(x) = \sum_{n=1}^{\infty} C_n \psi_n(x). \tag{14.24} \]

Below is a diagram of the wavefunction and its expansion. The blue is $\psi(x)$ and the purple is the result of adding the first 10 terms in the expansion. Of course, the fit is not perfect because the sum must go to $\infty$, but it’s not bad for just the first 10 terms.
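The orthonormality relation (14.15), the coefficient formula (14.23) and the quality of the truncated expansion can all be checked numerically. The following is a minimal sketch in units where $L = 1$ (my choice), using a simple midpoint rule for the integrals; the function names are hypothetical.

```python
import math

def sinc(u):
    """sin(u)/u, with the value 1 at u = 0."""
    return 1.0 if u == 0.0 else math.sin(u) / u

def psi_n(n, x):
    """Basis wavefunction of Eqn. (14.14) with L = 1."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)

def psi_initial(x):
    """Initial wavefunction of Eqn. (14.22) with L = 1."""
    return 2.0 * math.sin(2.0 * math.pi * x) if 0.0 <= x <= 0.5 else 0.0

def integrate(f, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

def overlap(m, n):
    """Left hand side of the orthonormality condition, Eqn. (14.15)."""
    return integrate(lambda x: psi_n(m, x) * psi_n(n, x), 0.0, 1.0)

def coeff_closed(m):
    """Expansion coefficient C_m from the closed form, Eqn. (14.23)."""
    return (sinc((m / 2 - 1) * math.pi) - sinc((m / 2 + 1) * math.pi)) / math.sqrt(2.0)

def coeff_numeric(m):
    """Expansion coefficient C_m from the integral, Eqn. (14.18)."""
    return integrate(lambda x: psi_n(m, x) * psi_initial(x), 0.0, 1.0)

def partial_sum(x, n_terms):
    """Truncated version of the expansion, Eqn. (14.24)."""
    return sum(coeff_closed(n) * psi_n(n, x) for n in range(1, n_terms + 1))

print("overlap(2, 2) =", round(overlap(2, 2), 6), " overlap(2, 3) =", round(overlap(2, 3), 6))
for m in range(1, 6):
    print(f"C_{m}: closed form {coeff_closed(m):+.6f}, numerical {coeff_numeric(m):+.6f}")
for x in (0.25, 0.5, 0.75):
    print(f"x = {x}: psi = {psi_initial(x):.3f}, 10-term sum = {partial_sum(x, 10):.3f}")
```

The closed form and the direct integral should agree to within the accuracy of the midpoint rule, and the 10-term sum should track $\psi(x)$ closely except near the kink at $x = L/2$.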

Now, we can evolve this state through time:

\[ \psi(x, t) = \sum_{n=1}^{\infty} C_n e^{-i\omega_n t} \psi_n(x). \tag{14.25} \]

We can take the complex square of this (i.e. multiply it with its complex conjugate) and the result is supposed to be the probability density, $P(x, t)$. Where $P(x, t)$ is big is where the particle is likely to be found if a measurement of its position is to be made. Below are snapshots of $P(x, t)$ at various moments in time. Notice that the particle tends to swish back and forth from left to right and back again. We have shown half a period, where the particle starts from being just on the left half to being just on the right half. This takes $t = \pi/\omega_1$ worth of time.
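The swishing can be reproduced with a few lines of code. This is a sketch under my own choices, not the code used to make the snapshots: units with $L = 1$ and $\omega_1 = 1$ (so that $\omega_n = n^2$), a truncation of the sum at 60 terms, and hypothetical function names. It prints the probability of finding the particle in the left half of the well at several times during the half period.

```python
import cmath
import math

def sinc(u):
    """sin(u)/u, with the value 1 at u = 0."""
    return 1.0 if u == 0.0 else math.sin(u) / u

def coeff(m):
    """Expansion coefficient C_m from Eqn. (14.23)."""
    return (sinc((m / 2 - 1) * math.pi) - sinc((m / 2 + 1) * math.pi)) / math.sqrt(2.0)

N_TERMS = 60  # truncation of the infinite sum (an arbitrary choice)
COEFFS = [coeff(n) for n in range(1, N_TERMS + 1)]

def psi(x, t):
    """Truncated Eqn. (14.25) with L = 1 and omega_1 = 1, so omega_n = n^2."""
    total = 0j
    for n, c in enumerate(COEFFS, start=1):
        phase = cmath.exp(-1j * n * n * t)
        total += c * phase * math.sqrt(2.0) * math.sin(n * math.pi * x)
    return total

def density(x, t):
    """Probability density P(x, t) = |psi(x, t)|^2."""
    return abs(psi(x, t)) ** 2

def prob_left(t, steps=400):
    """Probability of finding the particle in 0 <= x <= 1/2 (midpoint rule)."""
    h = 0.5 / steps
    return sum(density((i + 0.5) * h, t) for i in range(steps)) * h

for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
    t = frac * math.pi
    print(f"t = {frac:4.2f} pi/omega_1:  P(left half) = {prob_left(t):.3f}")
```

At $t = \pi/\omega_1$ every phase factor is $e^{-i n^2 \pi} = (-1)^n$, and since $\psi_n(L - x) = -(-1)^n \psi_n(x)$, the state is the mirror image of the initial one (up to an overall sign), which is why the particle ends up entirely on the right half.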

By the way, this state has no well-defined energy. However, it does have an average energy. The interpretation is that if one were to prepare a very large number of identical systems all in this initial state and then one were to take a measurement of the energy for all the identical setups, one would get different measurements for each setup, but the average energy is well-defined. This average energy is just the sum of the products of the probability for the particle to be in the wavefunction $\psi_n$ and the energy of that state, $E_n$:

\[ \langle E \rangle_\psi = \sum_{n=1}^{\infty} |C_n|^2 E_n. \tag{14.26} \]

Since $C_n$ is the coefficient of the wavefunction, $\psi_n$, in $\psi$, its complex square is the probability for the particle to be in the wavefunction $\psi_n$. Note that this is an average energy as described earlier. It is not the energy of the state. The state does not have a well-defined energy. This is in contrast to the energy eigenstate with wavefunction $\psi_n(x)$. If we prepared a large number of identical systems all in the initial wavefunction $\psi_n(x)$ and we measured the energy of each system separately, we would always measure $E_n$.

It might be tempting to define $\langle \omega \rangle_\psi \equiv \langle E \rangle_\psi / \hbar$ and then say that the time evolution of $\psi(x)$ is simply

\[ \psi(x, t) = e^{-i \langle \omega \rangle_\psi t} \psi(x) \quad \text{(INCORRECT!)}. \tag{14.27} \]

However, this is incorrect because we cannot interpret $\langle E \rangle_\psi$ as the energy of the wavefunction $\psi(x)$. This wavefunction does not have a well-defined energy. The only way we can time-evolve the state is to write it as a superposition of the basis of energy eigenstates and then time-evolve each piece separately, as in Eqn. (14.25).
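For the example state (14.22), the average energy (14.26) can be evaluated numerically from the coefficients (14.23). The sketch below works in units of $E_1$, so that $E_n = n^2$, and simply truncates the sum; the function names and cutoffs are my own. The partial sums appear to approach $4E_1 = E_2$, which is what one gets from integrating $\frac{\hbar^2}{2m}|\psi'(x)|^2$ directly for this wavefunction, and the probabilities $|C_n|^2$ sum to 1.

```python
import math

def sinc(u):
    """sin(u)/u, with the value 1 at u = 0."""
    return 1.0 if u == 0.0 else math.sin(u) / u

def coeff(m):
    """Expansion coefficient C_m from Eqn. (14.23)."""
    return (sinc((m / 2 - 1) * math.pi) - sinc((m / 2 + 1) * math.pi)) / math.sqrt(2.0)

def total_probability(n_terms):
    """Partial sum of |C_n|^2, which should approach 1."""
    return sum(coeff(n) ** 2 for n in range(1, n_terms + 1))

def mean_energy(n_terms):
    """Partial sum of Eqn. (14.26) in units of E_1, using E_n = n^2 E_1."""
    return sum(coeff(n) ** 2 * n ** 2 for n in range(1, n_terms + 1))

for n_terms in (10, 100, 1000, 10000):
    print(f"{n_terms:6d} terms: sum |C_n|^2 = {total_probability(n_terms):.6f}, "
          f"<E>/E_1 = {mean_energy(n_terms):.4f}")
```

The energy sum converges slowly because $|C_n|^2 E_n$ only falls off like $1/n^2$ for odd $n$, even though the probabilities themselves fall off like $1/n^4$.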

15. Final Review

15.1. Human Eye Optics

Let’s consider the optics of human eyes. A human eye can be simplified as one convex lens projecting images onto a screen (the retina). The focal length of the human eye lens is variable. Let’s assume that the distance between the retina and the lens of the eye is 25 mm, and the diameter of the pupil (which is the effective diameter of the lens of an eye) is 3 mm.

(a) If you are reading a book that is 300 mm away from your eyes and an arrow of 1 cm size on the book forms a clear image on your retina, what is the focal length of the lens of your eye? What is the actual image size of the arrow on your retina?

(b) What is the smallest object you can identify on the book based on the diffraction limit of the eye? Assume the illumination light wavelength to be 600 nm.

(c) In order to achieve the diffraction-limited resolution in part (b), how small must the “pixel” on your retina be?

SOLUTION:

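(a) The lens equation reads $\frac{1}{f} = \frac{1}{s_o} + \frac{1}{s_i}$, and so

\[ f = \frac{s_o s_i}{s_o + s_i} = \frac{(300\ \mathrm{mm})(25\ \mathrm{mm})}{(300 + 25)\ \mathrm{mm}} = 23.1\ \mathrm{mm}. \]

The transverse magnification is

\[ M_T = -\frac{s_i}{s_o} = -\frac{25\ \mathrm{mm}}{300\ \mathrm{mm}} = -\frac{1}{12} = -0.0833. \]

The negative sign means the image is upside down. The size of the image on the retina is

\[ |y_i| = |M_T|\, y_o = \frac{1}{12} \cdot 1\ \mathrm{cm} = \frac{1}{12}\ \mathrm{cm} = 0.833\ \mathrm{mm}. \]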

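(b) The angular resolution is set by diffraction at the pupil. The Rayleigh criterion gives a minimum resolvable angle of $1.22\,\lambda/d$; here we use the more conservative full angular width of the central diffraction spot, which is twice that:

\[ \Delta\theta = 2.44\, \frac{\lambda}{d}, \]

where $d$ is the diameter of the pupil. This converts to a spatial resolution, $\Delta x$, on the book by use of the small angle approximation: $\Delta x = L \Delta\theta$, where $L = 300$ mm is the distance from the eye to the book (also the object distance). Thus,

\[ \Delta x = 2.44\, \frac{\lambda L}{d} = 2.44\, \frac{(6 \times 10^{-4}\ \mathrm{mm})(300\ \mathrm{mm})}{3\ \mathrm{mm}} = 0.15\ \mathrm{mm}. \]

(c) According to part (a), the image size for an object the size in part (b) is

\[ |y_i| = |M_T|\, y_o = \frac{1}{12} \cdot 0.15\ \mathrm{mm} = 12\ \mu\mathrm{m}. \]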

That’s a very small pixel! However, if one rod is to serve as one pixel, measurements done on some animals show that rods are on the order of microns in size.
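The three parts of this problem chain together, so it is convenient to compute them in one place. This is a minimal sketch that uses only the numbers from the problem statement and the same factor of 2.44 as in the solution to part (b); the variable and function names are mine.

```python
S_O_MM = 300.0          # object distance (eye to book)
S_I_MM = 25.0           # image distance (lens to retina)
PUPIL_MM = 3.0          # pupil diameter
WAVELENGTH_MM = 600e-6  # 600 nm expressed in mm
ARROW_MM = 10.0         # 1 cm arrow on the page

def thin_lens_focal_length(s_o, s_i):
    """Solve 1/f = 1/s_o + 1/s_i for f."""
    return s_o * s_i / (s_o + s_i)

f_mm = thin_lens_focal_length(S_O_MM, S_I_MM)
magnification = -S_I_MM / S_O_MM
arrow_image_mm = abs(magnification) * ARROW_MM

# Part (b): smallest resolvable feature on the book, with the factor 2.44 used in the text.
delta_x_mm = 2.44 * WAVELENGTH_MM * S_O_MM / PUPIL_MM

# Part (c): size of that feature's image on the retina, converted to microns.
pixel_um = abs(magnification) * delta_x_mm * 1000.0

print(f"f = {f_mm:.1f} mm, M_T = {magnification:.4f}, arrow image = {arrow_image_mm:.3f} mm")
print(f"smallest feature on the book = {delta_x_mm:.3f} mm, retina pixel = {pixel_um:.1f} um")
```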

15.2. Optical Fiber

An optical fiber can be considered as a glass waveguide guiding light through total internal reflection. Light can be coupled in from the end surface of the fiber. What is the range of angle $\theta$ of the input light so that it can be guided in the fiber? ($n_{\mathrm{glass}} = 1.4$).

SOLUTION:

Consider the following diagram:

In order to get total internal reflection at θ2, we must have

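\[ \sin\theta_2 \geq \frac{1}{n}, \]

where $n = 1.4$ is the index of refraction of the glass. We have taken this optical fiber to be surrounded by air with index of refraction $\approx 1$. Then,

\[ \cos\theta_2 = \sqrt{1 - \sin^2\theta_2} \leq \sqrt{1 - \frac{1}{n^2}} = \frac{1}{n} \sqrt{n^2 - 1}. \]

Since $\theta_1 = \frac{\pi}{2} - \theta_2$, we have

\[ \sin\theta_1 = \cos\theta_2 \leq \frac{1}{n} \sqrt{n^2 - 1}. \]

Using Snell’s law, we get

\[ \sin\theta = n \sin\theta_1 \leq \sqrt{n^2 - 1} = 0.98 \implies 0 \leq \theta \leq 78^\circ. \]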
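The final inequality is simple enough to wrap in a small function. This is a sketch for the case treated above, a bare glass fiber surrounded by air (index taken as 1) on all sides; the function name is mine. When $\sqrt{n^2 - 1} \geq 1$ the bound on $\sin\theta$ is no restriction at all, so the function caps the acceptance angle at $90^\circ$.

```python
import math

def max_acceptance_angle_deg(n_glass):
    """Largest input angle (from the fiber axis) that is still guided.

    Assumes the fiber is surrounded by air with index 1, so that
    sin(theta) <= sqrt(n_glass^2 - 1), as derived in the text.
    """
    bound = math.sqrt(n_glass**2 - 1.0)
    if bound >= 1.0:
        return 90.0  # every input angle is guided
    return math.degrees(math.asin(bound))

for n in (1.1, 1.4, 1.5):
    print(f"n = {n}: theta_max = {max_acceptance_angle_deg(n):.1f} degrees")
```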

15.3. Modified Michelson Interferometer

The Michelson interferometer in the diagram has a birefringent plate in one arm. The birefringent plate has a thickness of 20 µm and the refractive index difference between the high and low refractive index directions is 0.01. A beam of y-polarized light with wavelength 800 nm is incident on the Michelson interferometer. Initially, the high-refractive-index direction of the birefringent plate is along the y-direction and an interference pattern shown in the diagram is generated at the screen. Points A, B and C denote the interference first maximum, first minimum, and second maximum in the pattern, respectively. Describe the interference pattern under the following conditions. Give your reasoning.

(a) The birefringent plate is rotated by 45◦ from its original position.

(b) The birefringent plate is rotated by 90◦ from its original position.

Now the wavelength of the light is changed to 400 nm. Again, in terms of I0, what will be the light intensity at positions A, B and C under the following conditions? Give your reasoning.

(c) The birefringent plate is returned to its original position (high refractive index direction along the y-direction).

(d) The birefringent plate is rotated by 45◦ from its original position.

(e) The birefringent plate is rotated by 90◦ from its original position.

[Hint: First determine what kind of waveplate the birefringent material is for 800 nm and 400 nm light, respectively.]

SOLUTION:

(a) Let us assume, for simplicity, that the bright bands near the center have roughly the same intensity, so that the intensity at C in the diagram in the problem is also $I_0$. As hinted in the problem, we should first work out what type of waveplate the birefringent plate is for 800 nm and for 400 nm. The number of wavelengths that fit in a plate of index of refraction $n$ and thickness $t$ is

\[ N = \frac{t}{\lambda/n} = \frac{tn}{\lambda}. \]

Therefore, the difference in the number of wavelengths that fit in the plate for the fast and slow directions of the waveplate is

\[ \Delta N = \frac{t \Delta n}{\lambda} = \frac{(2 \times 10^{-5}\ \mathrm{m})(10^{-2})}{\lambda} = \frac{2}{\lambda/(100\ \mathrm{nm})}. \]

For λ = 800 nm, we have ∆N = 1/4 and so the waveplate is a quarter-waveplate (qwp) and for λ = 400 nm, we have ∆N = 1/2 and so the waveplate is a half-waveplate (hwp). After the qwp, the light is circularly polarized. Reflection off of the mirror does not change the rotation of the circular polarization, but it does reverse the direction of propagation of the light. Thus, left circular polarization (lcp) turns into right circular polarization (rcp) and vice-versa. Thus, when the light passes through the qwp again, it becomes linearly polarized light, but polarized in the x-direction rather than the y-direction. Upon reflection off of the center half-silvered mirror, this x-polarization becomes z-polarization. Meanwhile, the light from arm 1 remains y-polarized. Hence, the two light beams are polarized in different directions and cannot interfere. We do not observe rings, just one big bright spot.

(b) The beam in arm two will be phase shifted less than before by a quarter of a wavelength for each time it passes through the qwp. That is a total of half a wavelength. Therefore, where the interference used to be constructive, it will now be destructive and vice-versa. A and C will now be dark and B will be bright.

(c) Twice passing through a qwp produces the image in the problem. Twice passing through a hwp produces a phase shift that is twice as large as with the qwp. The bright fringes in the pattern when the plate is a qwp occur when the relative phase shift between the two arms is 2πn for an integer n and the dark fringes are when it 1  is 2π n + 2 . If we multiply either one of these by 2, we will always get an integer multiply of 2π. Thus, the bright fringes in the pattern when the plate is a hwp occur at both the bright and the dark fringes in the pattern when the plate is a qwp. Of course, there will be dark fringes in between. In other words, the pattern when the plate is a hwp is twice is tight as when the plate is a qwp. A, B and C will all be bright with one dark fringe in between A and B and in between B and C.

(d) Any component of the light perpendicular to the high-index-of-refraction axis of the hwp will be phase shifted less relative to the component parallel to this axis by π each time it passes through the hwp, which adds up to 2π (going twice through the hwp). That’s the same as no phase shift at all. Therefore, there will be absolutely no difference if we rotate the hwp by any angle: same as (c).

(e) Same as (c) for the reason stated in (d).

15.4. Diffraction Grating

Diffraction gratings can separate different wavelengths into different directions. They can be understood as multiple-slit structures. Consider a grating with 600 lines (i.e. slits) per mm. Now we shine a red beam with wavelength 632 nm at normal incidence onto the grating.

(a) How many strong outgoing beams will be observed? (Hint: the largest diffraction angle will be 90° in this case.) What are their respective outgoing angles?

(b) The outgoing beam with the smallest non-zero angle is called the first order diffraction beam. Now, we have two incident light beams with wavelengths at 632.00 nm and 632.01 nm, respectively. What is the angular separation between their first order diffraction beams?

SOLUTION:

(a) The condition for strong maxima for the diffraction grating is the same as that for the double-slit: $d \sin\theta_m = m\lambda$, where $d$ is the separation distance between the centers of successive slits. Solve for $m$:

\[ m = \frac{d \sin\theta_m}{\lambda} \le \frac{d}{\lambda} = \frac{(1/600)\ \text{mm}}{632\ \text{nm}} = 2.64. \]

Note that we used the fact that, no matter what $\theta$ is, $\sin\theta$ is always $\le 1$. Since $m$ is an integer, we conclude that the largest value for $m$ is

\[ m_{\max} = 2 \implies \text{There will be 5 strong outgoing beams.} \]

These five correspond to $m = 0$, $m = \pm 1$ and $m = \pm 2$. The angles are $\theta_m = \sin^{-1}(m\lambda/d)$:

\[ \theta_0 = 0^\circ, \qquad \theta_{\pm 1} = \pm 22.3^\circ, \qquad \theta_{\pm 2} = \pm 49.3^\circ. \]
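As a quick numerical check of part (a), here is a minimal sketch that counts the allowed orders and evaluates $\theta_m = \sin^{-1}(m\lambda/d)$. The variable names are my own choices and are not part of the problem.

```python
import math

lines_per_mm = 600           # grating density from the problem
wavelength = 632e-9          # m
d = 1e-3 / lines_per_mm      # slit spacing in m

# Largest order allowed by |sin(theta)| <= 1
m_max = math.floor(d / wavelength)

# Outgoing angle in degrees for each allowed order m
angles = {m: math.degrees(math.asin(m * wavelength / d))
          for m in range(-m_max, m_max + 1)}

for m, theta in angles.items():
    print(f"m = {m:+d}: theta = {theta:.2f} deg")
```

The number of entries in `angles` is the number of strong outgoing beams.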

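(b) Since the differences are going to be very small, we would need quite a few significant figures to calculate this on a calculator. Instead, we will write $\theta_1^{(1)}$ for the angle of the first order diffraction beam for wavelength $\lambda^{(1)} = 632.00$ nm and $\theta_1^{(2)}$ for $\lambda^{(2)} = 632.01$ nm. Define $\Delta\lambda \equiv \lambda^{(2)} - \lambda^{(1)} = 0.01$ nm $= 10$ pm (picometers). Define $\Delta\theta_1 \equiv \theta_1^{(2)} - \theta_1^{(1)}$, which we know is very small. Thus, we may Taylor expand $\sin\theta_1^{(2)}$ around $\theta_1^{(1)}$:

\[ \sin\theta_1^{(2)} \approx \sin\theta_1^{(1)} + (\Delta\theta_1)\cos\theta_1^{(1)}. \]

By our formula from part (a),

\[ \sin\theta_1^{(1)} = \frac{\lambda^{(1)}}{d}, \qquad \sin\theta_1^{(2)} = \frac{\lambda^{(2)}}{d}. \]

The second equation is expanded as

\[ \sin\theta_1^{(1)} + (\Delta\theta_1)\cos\theta_1^{(1)} = \frac{\lambda^{(1)}}{d} + \frac{\Delta\lambda}{d}. \]

Using the previous equation for $\theta_1^{(1)}$, we get

\[ (\Delta\theta_1)\cos\theta_1^{(1)} = \frac{\Delta\lambda}{d}. \]

Thus,

\[ \Delta\theta_1 = \frac{\Delta\lambda}{d\cos\theta_1^{(1)}} = \frac{10^{-11}\ \text{m}}{\left((1/600)\times 10^{-3}\ \text{m}\right)\cos 22.3^\circ} = 6.5\times 10^{-6}\ \text{rad} = (3.7\times 10^{-4})^\circ. \]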

Your calculator could probably have given you the angles to within more than four decimal places, in which case you could have just calculated $\theta_1^{(2)}$ as you did $\theta_1^{(1)}$ and taken the difference. If you do the calculation the way I did above, you need to remember that the $\Delta\theta_1$ you get just from plugging in the numbers is going to be in radians, not in degrees. I had to convert to degrees at the end.
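To see that the Taylor expansion and the brute-force difference agree, here is a minimal sketch in double precision. The variable names are mine; the inputs are the ones given in the problem.

```python
import math

d = 1e-3 / 600                 # slit spacing in m (600 lines per mm)
lam1 = 632.00e-9               # m
lam2 = 632.01e-9               # m

theta1 = math.asin(lam1 / d)   # first-order angle for lam1, in radians

# Direct difference of the two first-order angles
dtheta_direct = math.asin(lam2 / d) - theta1

# First-order Taylor estimate: dtheta = dlambda / (d cos(theta1))
dtheta_taylor = (lam2 - lam1) / (d * math.cos(theta1))

print(f"direct: {dtheta_direct:.3e} rad")
print(f"taylor: {dtheta_taylor:.3e} rad = {math.degrees(dtheta_taylor):.2e} deg")
```

Both numbers come out in radians, which is why the conversion to degrees is a separate step.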

15.5. Optical Spectroscopy

(30 points) Optical spectroscopy is widely used to determine the properties of materials. The figure below is a reflection spectrum from a thin transparent film. It displays the reflectivity of the thin film as a function of light frequency for normally incident light. Based on this spectrum, determine the thickness and refractive index of the thin film.

SOLUTION:

The thin film is surrounded by air. Let ray 1 be the ray that reflects off of the front (air-to-film) interface and ray 2 the one that reflects off of the back (film-to-air) interface. Then,

\[ \varphi_{\text{ref},1} = \pi, \qquad \varphi_{\text{ref},2} = 0, \qquad \Delta\varphi_{\text{ref}} = -\pi. \]

The difference in path between 1 and 2 is that in addition to the path of 1, ray 2 goes through the thickness, $t$, of the film twice. Let $n$ be the index of refraction of the film. Setting $\varphi_{\text{path},1} = 0$, we have

\[ \Delta\varphi_{\text{path}} = \frac{2\pi}{\lambda/n}\cdot 2t \implies \Delta\varphi_{\text{tot}} = \left(\frac{4nt}{\lambda} - 1\right)\pi. \]

Let us write this in terms of frequency instead of wavelength. We write $\frac{1}{\lambda} = \frac{\nu}{c}$. Thus,

\[ \Delta\varphi_{\text{tot}} = \left(\frac{4nt\nu}{c} - 1\right)\pi. \]

For destructive interference, we set this equal to (2m − 1)π, for some integer m. Thus,

\[ \frac{2nt\nu}{c} = m \implies \frac{2nt\,\Delta\nu}{c} = \Delta m = 1, \]

where $\Delta\nu = 12.5\times 10^{12}$ Hz is the frequency separation between adjacent minima, and $\Delta m = 1$ since $m$ increases in unit steps (it is always an integer). Thus,

\[ nt = \frac{c}{2\Delta\nu} = 1.2\times 10^{-5}\ \text{m}. \tag{15.1} \]

We will work out the value of $n$ by calculating the maximum reflectivity. We can think of rays 1 and 2 as being two separate light sources, each of intensity $RI_0$, where $I_0$ is the intensity of the incident light. Technically, the second light ray has intensity $TRTI_0 = (1-R)^2RI_0$. However, assuming $RI_0$ instead, we will find $R \approx 0.1$, which we will assume is sufficiently small to justify the approximation. This is just a simplifying assumption; you could very well do the more exact calculation if you wished. The rays are polarized the same way, so we can write the total measured intensity from the superposition of the two rays as

\[ I = I_1 + I_2 + 2\sqrt{I_1I_2}\cos\Delta\varphi \approx 2RI_0(1 + \cos\Delta\varphi), \]

where we calculated $\Delta\varphi$ earlier.

Let us derive this result about adding intensities. Recall that the intensity is proportional to the (complex) square of the total electric field: $I = \frac{\epsilon_0 c}{2}|E|^2$. The proportionality constant will not really matter here, but there it is anyway. The reflectance, $r$, tells you how much of the electric field gets reflected. Therefore, the amplitude of the electric field in ray 1 is $rE_0$ and in ray 2 is approximately also $rE_0$, where $E_0$ is the incident electric field. However, the rays have different phases relative to each other. The phase of ray 2 relative to ray 1 is $\Delta\varphi$. Therefore, if the electric field in ray 1 is represented by $rE_0$, then the electric field in ray 2 is $rE_0e^{i\Delta\varphi}$. The total electric field in the sum of the two is $rE_0\left(1 + e^{i\Delta\varphi}\right)$. The total intensity is proportional to the square of this total electric field:

\[ I = \frac{\epsilon_0 c}{2}|r|^2|E_0|^2\left(1 + e^{i\Delta\varphi}\right)\left(1 + e^{-i\Delta\varphi}\right) = 2RI_0\left(1 + \cos\Delta\varphi\right), \]

where the reflection coefficient, $R$, is the square of the reflectance, $R = |r|^2$, and where we have denoted $\frac{\epsilon_0 c}{2}|E_0|^2$ as $I_0$, the incident intensity.

At total constructive interference, $\Delta\varphi = 2m\pi$ for some integer $m$, and so $I = 4RI_0$. Therefore, the reflectivity of the film at constructive interference is $4R$, that is, four times the reflectivity of just one of the air-film interfaces. According to the graph, this reflectivity is equal to 0.4. Therefore, $R = 0.1$. But $R$ is just the square of the reflectance, whose formula we are given, and which simplifies at normal incidence to $r = \frac{1-n}{1+n}$:

\[ \left(\frac{n-1}{n+1}\right)^2 = r^2 = R = 0.1 \implies n = \frac{1 + \sqrt{0.1}}{1 - \sqrt{0.1}} = 1.925. \]

Plugging this n into Eqn. (15.1) gives the thickness:

\[ t = 6.23\ \mu\text{m}. \]
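The arithmetic in this solution is short enough to script. The sketch below assumes, as in the text, that the minima are spaced by $\Delta\nu = 12.5 \times 10^{12}$ Hz and that the peak reflectivity read off the graph is 0.4; it also uses the rounded value $c = 3 \times 10^8$ m/s. The variable names are mine.

```python
import math

c = 3.0e8            # m/s, rounded
delta_nu = 12.5e12   # Hz, spacing between adjacent reflectivity minima
R_peak = 0.4         # peak reflectivity read off the spectrum

# Optical thickness from the fringe spacing, Eq. (15.1)
nt = c / (2 * delta_nu)

# Single-interface reflectivity, using the small-R approximation I_peak = 4 R I_0
R = R_peak / 4

# Invert ((n - 1)/(n + 1))^2 = R for n > 1
n = (1 + math.sqrt(R)) / (1 - math.sqrt(R))

t = nt / n
print(f"nt = {nt:.2e} m, n = {n:.3f}, t = {t * 1e6:.2f} um")
```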

15.6. Relativity and Current-Carrying Wires

Recall from E&M that an infinite straight wire containing a linear charge density, $\lambda$, generates an electric field whose magnitude, as a function of the radial distance, $r$, from the wire, is $E = \frac{\lambda}{2\pi\epsilon_0 r}$. Recall, as well, that an infinite straight wire carrying a current, $I$, generates a magnetic field whose magnitude is $B = \frac{\mu_0 I}{2\pi r}$. The direction of the magnetic field rotates around the wire in a right-handed fashion (if your right thumb points in the direction of the current, then your right fingers wrap around the wire in the direction of the magnetic field). If a positive charge, $q$, is moving with speed, $v$, parallel to a current-carrying wire a distance $r$ away, then the charge experiences a magnetic force with magnitude $F_m = qvB = qv\frac{\mu_0 I}{2\pi r}$. The force is attractive if the point charge moves parallel to the current and repulsive if it moves anti-parallel (opposite). In either case, the point charge starts accelerating in the radial direction.

So far, we have been discussing the picture in the “lab frame”. What if we were to consider the picture from the frame that is moving relative to the lab frame along with the point charge, so that the point charge looks to be at rest in the horizontal direction in this frame? Call this frame, $S'$, the horizontal rest frame of the point charge. In this frame, the charge is not moving initially. Therefore, it cannot experience a magnetic force. Yet, according to our analysis in the lab frame, it has to start accelerating in the radial direction. Let’s see how.

(a) The wire is made up of an immobile lattice of heavy positive ions and a sea of free mobile electrons. Suppose that each atom gives up one electron so that each ion has charge $+e$. The wire is neutral in the lab frame and so the linear charge densities of the positive ions and the electrons are $\lambda$ and $-\lambda$, respectively. On average, how far apart along the wire are adjacent positive ions or adjacent electrons as measured in the lab frame?

(b) Let v be the speed of the electrons along the wire. Calculate the current.

(c) The external point charge is at a radial distance, $r$, from the wire. For simplicity, suppose that it is moving with the same speed, $v$, and direction as the mobile electrons in the wire. Remember that the direction of current is opposite to the direction of motion of the electrons. Therefore, this is the case when the external point charge moves opposite to the current and thus experiences a repulsive magnetic force. Calculate this magnetic force in the lab frame.

(d) In the frame $S'$, the external point charge and the electrons in the wire are at rest (since they all have the same velocity). The positive ions are now moving backwards with speed $v$. In $S'$, on average, how far apart along the wire are adjacent positive ions? What about adjacent electrons? What is the net charge density measured in this frame? [Note: Charge is a relativistic invariant.]

(e) Use your result from part (d) to calculate the force experienced by the external point charge in the frame $S'$. Show that it also points radially away (i.e. is repulsive), but is now purely electric rather than purely magnetic. You should find that the force in $S'$ is bigger than in the lab frame. Can you think of a reason why this should be the case? [Hint: Think time dilation. Note: You just need to argue why the force in $S'$ should be bigger than in $S$; you need not explain the factor.]

SOLUTION:

(a) The distance between adjacent positive ions is $d_+ = e/\lambda_+ = e/\lambda$. This is also the distance between adjacent electrons: $d_- = -e/\lambda_- = e/\lambda$, since $\lambda_- = -\lambda$.

(b) The linear number density of the electrons is $n = 1/d_- = \lambda/e$, since there is one electron per $d_-$ length of wire. The current is $I = n(-e)v = -\lambda v$. The minus sign just reminds us that the current is in the opposite direction to the motion of the electrons.

(c) Using the magnitude of the current, $|I| = \lambda v$,

\[ F_m = qv\frac{\mu_0 |I|}{2\pi r} = \frac{\mu_0 q\lambda v^2}{2\pi r} = \frac{\beta^2 q\lambda}{2\pi\epsilon_0 r}, \]

where $\beta \equiv v/c$ and we used the fact that $\mu_0 c^2 = 1/\epsilon_0$.

(d) Since the positive ions are at rest in the lab frame, $d_+$ is their proper separation. Their separation measured in $S'$ would be contracted by a factor of $\gamma = \left[1 - (v/c)^2\right]^{-1/2}$. Thus, $d_+' = d_+/\gamma = e/\gamma\lambda$. The situation for the electrons is exactly the opposite. They are at rest in $S'$, so their separation in $S'$ is their proper separation. Their separation in the lab frame, which is $d_- = e/\lambda$, is contracted relative to their proper separation. Thus, the separation of the electrons in $S'$ is bigger than in the lab frame: $d_-' = \gamma d_- = \gamma e/\lambda$. The wire is no longer neutral when viewed in $S'$: the positive linear charge density is $\lambda_+' = e/d_+' = \gamma\lambda$, while the negative charge density is $\lambda_-' = -e/d_-' = -\lambda/\gamma$. These do not cancel anymore: the net charge density is

\[ \lambda_{\text{net}}' = \lambda_+' + \lambda_-' = \left(\gamma - \frac{1}{\gamma}\right)\lambda = \gamma\beta^2\lambda. \]
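The identity $\gamma - 1/\gamma = \gamma\beta^2$ used in part (d) is easy to check numerically. Here is a minimal sketch that works in units of $\lambda$; the function name `net_density_ratio` and the sample speeds are my own choices.

```python
import math

def net_density_ratio(beta):
    """Net charge density in the charge's rest frame, in units of lambda."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    lam_plus = gamma           # ion spacing is contracted: density up by gamma
    lam_minus = -1.0 / gamma   # electron spacing is stretched: density down by gamma
    return lam_plus + lam_minus

for beta in (0.1, 0.5, 0.9):   # sample drift speeds v/c, chosen for illustration
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    print(f"beta = {beta}: net = {net_density_ratio(beta):.6f}, "
          f"gamma*beta^2 = {gamma * beta ** 2:.6f}")
```

For realistic drift speeds, $\beta$ is tiny, so the net density is tiny as well; the large sample values are only there to make the agreement visible.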

(e) Even though the external charge does not feel any magnetic force in the frame $S'$, the wire is now positively charged with charge density $\gamma\beta^2\lambda$ in this frame. Therefore, it produces an electric field with magnitude $\frac{\gamma\beta^2\lambda}{2\pi\epsilon_0 r}$ radially outwards. This exerts a repulsive force on the external point charge equal to the charge times the electric field:

\[ F_e' = \frac{\gamma\beta^2 q\lambda}{2\pi\epsilon_0 r}. \]

Note that the force is repulsive both in the lab frame and in the frame $S'$, but the force in $S'$ is bigger by a factor of $\gamma$. This makes sense because $S'$ measures the initial proper time for the external charge. The time in the lab frame will be dilated relative to this by a factor of $\gamma$. In $S'$, the force measured on the charge is greater than in $S$, but the time for acceleration is also shorter, so the radial impulse (force times time) is the same in both frames.

15.7. Pi Decay

The neutral meson $\pi$ has a rest mass of 135 MeV/$c^2$ and a half-life of $8.2\times 10^{-17}$ s. In one experiment, high energy $\pi$ mesons are generated. Then each meson decays into two photons: $\pi \to \gamma + \gamma$. Consider the following questions in the lab frame.

(a) After traveling $10^{-6}$ m, only one percent of the $\pi$ mesons are left. Calculate the velocity, kinetic energy and momentum of the generated $\pi$ mesons.

(b) If the two γ photons are produced in the forward and backward directions, respectively, what are the energies of the two photons?

SOLUTION:

(a) Let $\tau' = 8.2\times 10^{-17}$ s be the proper half-life of the mesons (measured in their own rest frame). Let $v$ be their speed measured in the lab frame, with corresponding $\beta$ and $\gamma$ factors. Then, the apparent half-life measured in the lab frame is dilated relative to the meson rest frame: $\tau = \gamma\tau'$. Suppose there are $N_0$ mesons at the start. Then, the number of un-decayed mesons remaining after time $t$ in the lab frame is $N = 2^{-t/\tau}N_0$. In the same time, the mesons will have traveled a distance $d = \beta ct$, and so we can write $N$ as a function of $d$ instead of $t$ by solving for $t$ in terms of $d$ as $t = d/\beta c$; that is, $N = 2^{-d/\beta c\tau}N_0$. Let $\alpha = 1/100$ be the fraction of remaining mesons after the mesons travel a distance $d = 10^{-6}$ m measured in the lab frame. Then,

\[ \alpha = 2^{-d/\beta c\tau} = 2^{-d/\beta\gamma c\tau'} = e^{-d\ln 2/\beta\gamma c\tau'}. \]

Solving for $\beta\gamma$ gives

\[ \frac{v/c}{\sqrt{1 - (v/c)^2}} = \beta\gamma = -\frac{d\ln 2}{c\tau'\ln\alpha} = 6.12. \]

Do not be thrown off by the minus sign; $\ln\alpha$ is negative since $\alpha$ is a number less than 1. Now, solving for $v$ gives

\[ v = -\frac{\dfrac{d\ln 2}{c\tau'\ln\alpha}}{\sqrt{1 + \left(\dfrac{d\ln 2}{c\tau'\ln\alpha}\right)^2}}\; c = 0.987\,c = 2.96\times 10^8\ \text{m/s}. \]

The momentum is

\[ p = \beta\gamma m_\pi c = -\frac{d\ln 2}{c\tau'\ln\alpha}\, m_\pi c = 826\ \text{MeV}/c. \]

The kinetic energy is

\[ T = \sqrt{(pc)^2 + (m_\pi c^2)^2} - m_\pi c^2 = 702\ \text{MeV}. \]
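Here is a minimal sketch of the part (a) arithmetic. I assume the rounded value $c = 3 \times 10^8$ m/s, which appears to be what reproduces the quoted numbers, and I work in MeV with $c = 1$ for the momentum and energy. The variable names are mine.

```python
import math

c = 3.0e8             # m/s, rounded (assumption)
tau_rest = 8.2e-17    # s, proper half-life
m_pi = 135.0          # MeV/c^2
d = 1e-6              # m, distance traveled in the lab frame
alpha = 0.01          # surviving fraction after distance d

# beta * gamma from alpha = 2^(-d / (beta gamma c tau'))
beta_gamma = -d * math.log(2) / (c * tau_rest * math.log(alpha))
gamma = math.sqrt(1 + beta_gamma ** 2)
beta = beta_gamma / gamma

p = beta_gamma * m_pi                            # MeV/c
T = math.sqrt(p ** 2 + m_pi ** 2) - m_pi         # MeV

print(f"beta*gamma = {beta_gamma:.2f}, v = {beta:.3f} c = {beta * c:.2e} m/s")
print(f"p = {p:.0f} MeV/c, T = {T:.0f} MeV")
```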

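(b) Method 1: Let $E_1$ and $E_2$ be the energies of the forward- and backward-moving photons, respectively. The energy and momentum of the pion and the two photons are

\[ \begin{pmatrix} E_\pi/c \\ p_\pi \end{pmatrix} = \begin{pmatrix} \gamma m_\pi c \\ \beta\gamma m_\pi c \end{pmatrix}, \qquad \begin{pmatrix} E_1/c \\ p_1 \end{pmatrix} = \frac{E_1}{c}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \begin{pmatrix} E_2/c \\ p_2 \end{pmatrix} = \frac{E_2}{c}\begin{pmatrix} 1 \\ -1 \end{pmatrix}. \]

Conservation of energy and momentum reads

\[ E_1 + E_2 = \gamma m_\pi c^2, \qquad E_1 - E_2 = \beta\gamma m_\pi c^2. \]

Adding the two equations and dividing by 2 gives $E_1$. Subtracting the second from the first and dividing by 2 gives $E_2$:

\[ E_1 = \sqrt{\frac{1+\beta}{1-\beta}}\,\frac{m_\pi c^2}{2} = 831\ \text{MeV}, \qquad E_2 = \sqrt{\frac{1-\beta}{1+\beta}}\,\frac{m_\pi c^2}{2} = 5.48\ \text{MeV}. \]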

Method 2: We could use the energy conservation equation from the previous method, which reads $E_1 + E_2 = \gamma m_\pi c^2$, and couple it with the calculation of the relativistic invariant for the pion. The relativistic invariant for the pion is

\[ E_\pi^2 - p_\pi^2c^2 = (\gamma m_\pi c^2)^2 - (\beta\gamma m_\pi c)^2c^2 = m_\pi^2c^4. \]

The relativistic invariant for the two photons together is

\[ (E_1 + E_2)^2 - (p_1 + p_2)^2c^2 = (E_1 + E_2)^2 - (E_1 - E_2)^2 = 4E_1E_2. \]

Therefore,

\[ m_\pi^2c^4 = 4E_1E_2. \]

Indeed, if you multiply the expressions for $E_1$ and $E_2$ found in method 1, you get $\frac{1}{4}m_\pi^2c^4$, which is consistent with the above equation. So, our two equations, for the two unknowns, $E_1$ and $E_2$, are

\[ E_1 + E_2 = \gamma m_\pi c^2, \qquad E_1E_2 = \left(\frac{m_\pi c^2}{2}\right)^2. \]

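Solve for $E_2$ using the first equation: $E_2 = \gamma m_\pi c^2 - E_1$, and plug this into the second equation. One gets a quadratic equation, which, after rearranging things, reads

\[ E_1^2 - \gamma m_\pi c^2E_1 + \left(\frac{m_\pi c^2}{2}\right)^2 = 0. \]

Complete the square by adding and subtracting a term $\left(\frac{\gamma m_\pi c^2}{2}\right)^2$:

\[ E_1^2 - \gamma m_\pi c^2E_1 + \left(\frac{\gamma m_\pi c^2}{2}\right)^2 - \left(\frac{\gamma m_\pi c^2}{2}\right)^2 + \left(\frac{m_\pi c^2}{2}\right)^2 = \left(E_1 - \frac{\gamma m_\pi c^2}{2}\right)^2 - \left(\frac{m_\pi c^2}{2}\right)^2(\gamma^2 - 1) = 0. \]

We write the factor $\gamma^2 - 1$ as

\[ \gamma^2 - 1 = \frac{1}{1-\beta^2} - 1 = \frac{1 - (1-\beta^2)}{1-\beta^2} = \frac{\beta^2}{1-\beta^2} = (\beta\gamma)^2. \]

Therefore, our quadratic equation for $E_1$ reads

\[ \left(E_1 - \frac{\gamma m_\pi c^2}{2}\right)^2 - \left(\frac{\beta\gamma m_\pi c^2}{2}\right)^2 = 0. \]

Using the standard factorization of the difference of two squares, we get

\[ \left[E_1 - \gamma(1+\beta)\frac{m_\pi c^2}{2}\right]\left[E_1 - \gamma(1-\beta)\frac{m_\pi c^2}{2}\right] = 0. \]

The first root is the same as the $E_1$ we found in method 1 since $\gamma(1+\beta) = \sqrt{\frac{1+\beta}{1-\beta}}$. The second root is the $E_2$ we found earlier. We know to pick the first solution for $E_1$ because it is the larger of the two and the forward-moving photon had better have the higher energy.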

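Method 3: In the rest frame of the pion, the energy and momentum of the pion and the photons are

\[ \begin{pmatrix} E_\pi'/c \\ p_\pi' \end{pmatrix} = m_\pi c\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} E_1'/c \\ p_1' \end{pmatrix} = \frac{m_\pi c}{2}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \begin{pmatrix} E_2'/c \\ p_2' \end{pmatrix} = \frac{m_\pi c}{2}\begin{pmatrix} 1 \\ -1 \end{pmatrix}. \]

The photons must go off in opposite directions with the exact same magnitude of momentum because, in the rest frame of the pion, the initial momentum is 0. Since the photons have the same magnitude of momentum, they have the same energy, which is half the rest-mass energy of the pion since that is the initial energy in the rest frame of the pion. We simply have to transform back to the lab frame. For example, for the forward-moving photon:

\[ \begin{pmatrix} E_1/c \\ p_1 \end{pmatrix} = \begin{pmatrix} \gamma & \beta\gamma \\ \beta\gamma & \gamma \end{pmatrix}\frac{m_\pi c}{2}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \gamma(1+\beta)\frac{m_\pi c}{2}\begin{pmatrix} 1 \\ 1 \end{pmatrix}. \]

Writing $\gamma(1+\beta) = \sqrt{\frac{1+\beta}{1-\beta}}$ shows that this gives the same energy, $E_1$, as found previously. Similarly, one can find $E_2$.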
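The three methods should of course agree numerically. The sketch below takes $\beta\gamma = 6.12$ from part (a) as its input (a rounded value, so the last digit of the results can shift slightly) and works in MeV with $c = 1$. The helper `boost` is my own name for the 2×2 Lorentz transformation used in method 3.

```python
import math

m_pi = 135.0          # MeV/c^2
beta_gamma = 6.12     # rounded result of part (a)
gamma = math.sqrt(1 + beta_gamma ** 2)
beta = beta_gamma / gamma

# Method 1: add and subtract the two conservation equations
E1 = (gamma * m_pi + beta_gamma * m_pi) / 2
E2 = (gamma * m_pi - beta_gamma * m_pi) / 2

# Method 2: roots of E^2 - gamma m E + (m/2)^2 = 0
disc = math.sqrt((gamma * m_pi) ** 2 - m_pi ** 2)
E1_quad = (gamma * m_pi + disc) / 2
E2_quad = (gamma * m_pi - disc) / 2

def boost(E, p):
    """Transform (E, p) from the pion rest frame to the lab frame (c = 1)."""
    return gamma * E + beta_gamma * p, beta_gamma * E + gamma * p

# Method 3: boost the rest-frame photons, each with energy m_pi / 2
E1_boost, _ = boost(m_pi / 2, +m_pi / 2)
E2_boost, _ = boost(m_pi / 2, -m_pi / 2)

print(f"E1 = {E1:.1f} MeV, E2 = {E2:.2f} MeV")
```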

15.8. Relativistic Doppler Effect

Suppose you direct a laser beam with frequency $f_0$ at an atom moving towards you with a velocity $u$.

(a) What is the light frequency felt by the atom in its rest frame?

(b) The atom will be driven by the laser beam and re-radiate. What is the frequency of the light radiated by the atom in its reference frame? If you observe this atom radiation, what light frequency will you see? What is the corresponding light wavelength?

(c) Suppose you now direct the laser beam towards a mirror moving towards you with a velocity u. What will be the light frequency that you observe? Briefly explain why.

SOLUTION:

(a) The frequency is Doppler shifted upwards (i.e. it should increase) because the atom is moving towards you, the initial source. The atom sees a frequency, $f'$, given by

\[ f' = \sqrt{\frac{1 + \frac{u}{c}}{1 - \frac{u}{c}}}\; f_0. \]

(b) When the light hits the atom, the oscillating electric field will deform the electron clouds, giving the atom a dipole moment that is oscillating at the same frequency as the light. The frequency of dipole radiation is the same as the frequency of oscillation of the dipole. Hence, the atom radiation has frequency $f_{\text{rerad}}' = f'$. Now, the atom becomes the source of the radiation and it is moving towards you, which means that the frequency you observe is Doppler shifted upwards relative to $f'$. This shift has the same factor, assuming the atom slows down a negligible amount (due to radiation pressure). You observe a re-radiation frequency, $f_{\text{rerad}}$, of

\[ f_{\text{rerad}} = \sqrt{\frac{1 + \frac{u}{c}}{1 - \frac{u}{c}}}\; f_{\text{rerad}}' = \left(\frac{1 + \frac{u}{c}}{1 - \frac{u}{c}}\right) f_0. \]

The corresponding wavelength is

\[ \lambda_{\text{rerad}} = \frac{c}{f_{\text{rerad}}} = \left(\frac{1 - \frac{u}{c}}{1 + \frac{u}{c}}\right)\frac{c}{f_0}. \]
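To make the double shift concrete, here is a minimal sketch with made-up numbers: the laser frequency `f0` and the speed `beta = u/c` below are purely illustrative and are not given in the problem.

```python
import math

def doppler_factor(beta):
    """Relativistic Doppler factor for source and observer approaching at speed beta*c."""
    return math.sqrt((1 + beta) / (1 - beta))

c = 3.0e8        # m/s, rounded
f0 = 4.74e14     # Hz, hypothetical laser frequency
beta = 0.1       # hypothetical atom speed u/c

f_atom = doppler_factor(beta) * f0       # frequency seen in the atom's rest frame
f_rerad = doppler_factor(beta) * f_atom  # re-radiated frequency seen back in the lab
lam_rerad = c / f_rerad

print(f"f' = {f_atom:.3e} Hz, f_rerad = {f_rerad:.3e} Hz, lambda = {lam_rerad * 1e9:.0f} nm")
```

Applying the factor twice gives $(1+\beta)/(1-\beta)$ with no square root, which is the same result one gets for reflection off a moving mirror.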

(c) The mirror is nothing more than a large collection of atoms all of which do exactly the same thing as the one atom that this problem has been about until now. Thus, the frequency observed will be exactly the same as that for just one atom. We could use the fact that in the mirror’s rest frame, the reflected wave has the exact same frequency (energy) as the incident wave, just with the exact opposite momentum. However, that fact is derived microscopically from the induced oscillating dipoles on the mirror surface, anyway, which is what we have done here.

15.9. Quantum Tunneling and Frustrated Total Internal Reflection

A plane wave with wavevector $k$ is sent in from $x = -\infty$ traveling to the right towards a step potential,

\[ V(x) = \begin{cases} 0, & x < 0, \\ V_0, & x > 0. \end{cases} \]

(a) What is the value of V0 above which the region x > 0 is classically forbidden? Assume that this is the case for the rest of the problem.

(b) The time-independent wavefunction in the region x < 0 (Region I) is given by

\[ \psi_I(x) = Ae^{ikx} + Be^{-ikx}, \]

where the A term represents the incoming plane wave moving to the right and the B term represents the reflected plane wave moving to the left. Justify the nomenclature here: why can we call these plane waves and why is the A term moving to the right and the B term moving to the left? [Hint: Determine the full time-dependent wavefunction.]

(c) Determine the general form of the time-independent wavefunction in the region $x > 0$ (Region II), $\psi_{II}(x)$.

(d) Calculate the transmission coefficient, defined to be the square of the ratio of the amplitudes of the transmitted and incident plane waves:

\[ T \equiv \left|\frac{F}{A}\right|^2. \]

Also calculate the reflection coefficient $R \equiv |B/A|^2$ and check that $R + T = 1$.

CAUTION: Thanks to Carlin for reminding me of the following. The transmission coefficient above should actually be multiplied by a factor of the ratio of the transmitted and incident wavevectors, $k_{\text{transmitted}}/k_{\text{incident}}$. For this problem, this doesn’t make a difference because this ratio is equal to 1, but in general you have to keep this factor in. This is because the transmission and reflection coefficients are defined to be the ratio of the transmitted and reflected fluxes to the incident flux. That is, the ratio of the rates of particle flow in the transmitted and reflected beams relative to the incident beam. Therefore, we need to multiply the square amplitudes by the speed of propagation, and then take the ratio. Since reflected and incident waves have the same wavevector and speed, we never need to worry about this as far as $R$ is concerned. However, if $V$ in the transmitted region is not the same as $V$ in the incident region, then the wavevectors in the transmitted and incident regions will be different. To be honest, partly because of this complication, I hardly ever calculate $T$ directly. Instead, I calculate $R$ and then $T$ is just $1 - R$.

(e) Despite your answer to part (d), the wavefunction is not zero in the region $x > 0$. If the barrier has finite extent, it is possible for the incoming particles to tunnel through the barrier to the other side. Let the barrier have length $L$ so that for $x > L$, the potential once again vanishes. Write down the general form of the time-independent wavefunction in the region $0 < x < L$ (Region II), $\psi_{II}(x)$, and in the region $x > L$ (Region III), $\psi_{III}(x)$. Write down the equations you would need to solve in order to calculate the transmission and reflection coefficients. (I’m not asking you to actually solve them.)

Note: This is similar to the phenomenon of frustrated total internal reflection. In this case, if you bring a refractive material close to the boundary of another where you have total internal reflection set up, then it is possible to get some light to tunnel through the air gap and come out the other side! Below is an image showing this effect. The green laser comes in from the right through the first prism at an angle that should make the beam be totally internally reflected. There is a small air gap between the triangular and the eye-shaped prisms. Nevertheless, some of the light is able to cross that gap and emerge in the eye-shaped prism.

Image source: University of Vermont http://www.uvm.edu/~dahammon/

SOLUTION

(a) The kinetic energy is related to the wavevector via

\[ T = \frac{\hbar^2k^2}{2m}. \]

Since $V = 0$ in the region $x < 0$, the total energy is simply equal to the kinetic energy there:

\[ E = \frac{\hbar^2k^2}{2m}. \]

The region $x > 0$ is classically forbidden if the potential energy there is greater than the total energy $E$, since that technically means that the kinetic energy is negative! Thus, the region $x > 0$ is classically forbidden if

\[ V_0 > E = \frac{\hbar^2k^2}{2m}. \tag{15.2} \]

(b) The full time-dependent wavefunction in region I is

\[ \Psi_I(t, x) = e^{-iEt/\hbar}\,\psi_I(x) = Ae^{-i(\omega t - kx)} + Be^{-i(\omega t + kx)}, \tag{15.3} \]

where

\[ \omega = \frac{E}{\hbar} = \frac{\hbar k^2}{2m}. \tag{15.4} \]

Now, these really are plane waves. If we track the zero phase (when the exponent is zero), for the $A$ term, as $t$ increases the $x$ position of the zero phase point also increases. Therefore, this plane wave moves to the right. The opposite is true for the $B$ term, which therefore moves to the left.

(c) The solutions to the time-independent Schrödinger equation in region II are real exponentials, rather than complex exponentials. These are growing and decaying exponentials. However, since the wavefunction cannot blow up as $x \to \infty$, the only allowed solution is the exponentially decaying one:

\[ \psi_{II}(x) = Ce^{-\kappa x}, \quad \text{where} \quad \kappa = \frac{\sqrt{2m(V_0 - E)}}{\hbar} = \sqrt{\frac{2mV_0}{\hbar^2} - k^2}. \tag{15.5} \]

(d) Both $\psi$ and $\psi'$ (the derivative of $\psi$) must be continuous everywhere. In particular, we must match $\psi$ and $\psi'$ in regions I and II at $x = 0$:

\[ A + B = C, \tag{15.6a} \]
\[ ik(A - B) = -\kappa C. \tag{15.6b} \]

Eliminate $C$ and solve for $B/A$:

\[ \frac{B}{A} = -\frac{\kappa + ik}{\kappa - ik} \implies R = \left|\frac{B}{A}\right|^2 = 1. \]

That is, we have 100% reflection, despite the fact that $\psi_{II}(x) \neq 0$! The wavefunction in the classically forbidden region is called the evanescent wave.
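If you would like to see the matching conditions satisfied with actual numbers, here is a minimal sketch using Python's built-in complex arithmetic. The values of `k` and `kappa` are arbitrary illustrative choices (in inverse length units), and I set $A = 1$.

```python
def step_amplitudes(k, kappa):
    """Return (B/A, C/A) for the step potential with E < V0, taking A = 1."""
    b = -(kappa + 1j * k) / (kappa - 1j * k)
    c = 1 + b                     # continuity of psi at x = 0
    return b, c

k, kappa = 1.0, 2.5               # hypothetical wavevector and decay constant
b, c = step_amplitudes(k, kappa)

R = abs(b) ** 2
# Residual of the derivative-matching condition ik(A - B) = -kappa C
residual = 1j * k * (1 - b) + kappa * c

print(f"R = {R:.12f}, |C/A| = {abs(c):.4f}, residual = {abs(residual):.1e}")
```

The reflection coefficient is 1 for any real `k` and `kappa`, since the numerator and denominator of $B/A$ are complex conjugates of each other, while `c` is non-zero: that is the evanescent wave.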

(e) Now, we are allowed to have both the exponentially decaying as well as the exponentially growing solutions in region II because it is just a finite region. Meanwhile, the solution in Region III is again the same plane wave solution as in region I. However, we only want the transmitted wave; there is no incoming wave from the right. Thus,

\[ \psi_{II}(x) = Ce^{-\kappa x} + De^{\kappa x}, \qquad \psi_{III}(x) = Fe^{ikx}. \tag{15.7} \]

Now, we have four equations: matching $\psi$ and $\psi'$ both at $x = 0$ and at $x = L$:

\[ A + B = C + D, \tag{15.8a} \]
\[ ik(A - B) = -\kappa(C - D), \tag{15.8b} \]
\[ Ce^{-\kappa L} + De^{\kappa L} = Fe^{ikL}, \tag{15.8c} \]
\[ -\kappa\left(Ce^{-\kappa L} - De^{\kappa L}\right) = ikFe^{ikL}. \tag{15.8d} \]

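Eliminate $C$ from the first two, the middle two, and the last two equations:

\[ (\kappa + ik)A + (\kappa - ik)B = 2\kappa D, \tag{15.9a} \]
\[ ik(A - B)e^{-\kappa L} + \kappa Fe^{ikL} = \kappa D\left(e^{\kappa L} + e^{-\kappa L}\right), \tag{15.9b} \]
\[ 2\kappa De^{\kappa L} = (\kappa + ik)Fe^{ikL}. \tag{15.9c} \]

Eliminate $D$ by substituting (15.9c) into (15.9a) and (15.9b):

\[ (\kappa + ik)Ae^{\kappa L} + (\kappa - ik)Be^{\kappa L} = (\kappa + ik)Fe^{ikL}, \tag{15.10a} \]
\[ ik(A - B)e^{-\kappa L} + \kappa Fe^{ikL} = (\kappa + ik)Fe^{ikL}\,e^{-\kappa L}\,\frac{e^{\kappa L} + e^{-\kappa L}}{2}. \tag{15.10b} \]

Eliminate $B$ and solve for $F/A$. After a lot of algebra,

\[ \frac{F}{A} = -\frac{4ik\kappa\, e^{-ikL}}{(\kappa - ik)^2e^{\kappa L} - (\kappa + ik)^2e^{-\kappa L}}. \tag{15.11} \]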

After a lot more algebra, and plugging in the expressions for $k$ and $\kappa$ in terms of $E$ and $V_0$, one finds the transmission coefficient

\[ T = \left|\frac{F}{A}\right|^2 = \left[1 + \left(\frac{mV_0}{\hbar^2}\right)^2\frac{\sinh^2(\kappa L)}{(k\kappa)^2}\right]^{-1}. \tag{15.12} \]

This is the tunneling probability and it is not zero! It has the correct behavior that $T \to 0$ as $V_0 \to \infty$ or $L \to \infty$.
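Since the algebra between Eqs. (15.11) and (15.12) is skipped, a numerical comparison is reassuring. The sketch below works in units where $\hbar = m = 1$ (an assumption made purely for convenience) and uses illustrative values of $E$, $V_0$ and $L$ of my own choosing. The phase factor in the numerator of Eq. (15.11) has unit modulus, so it is dropped when taking $|F/A|^2$.

```python
import math

def transmission_from_amplitude(E, V0, L):
    """|F/A|^2 from Eq. (15.11), in units where hbar = m = 1."""
    k = math.sqrt(2.0 * E)
    kappa = math.sqrt(2.0 * (V0 - E))
    denom = ((kappa - 1j * k) ** 2 * math.exp(kappa * L)
             - (kappa + 1j * k) ** 2 * math.exp(-kappa * L))
    # The overall phase factor has unit modulus, so it drops out here.
    return abs(4.0 * k * kappa / denom) ** 2

def transmission_closed_form(E, V0, L):
    """Eq. (15.12) with hbar = m = 1."""
    k = math.sqrt(2.0 * E)
    kappa = math.sqrt(2.0 * (V0 - E))
    return 1.0 / (1.0 + (V0 * math.sinh(kappa * L) / (k * kappa)) ** 2)

E, V0 = 1.0, 2.0                  # hypothetical energies with E < V0
for L in (0.5, 1.0, 2.0):         # hypothetical barrier widths
    print(L, transmission_from_amplitude(E, V0, L), transmission_closed_form(E, V0, L))
```

The two columns agree, and the transmission drops rapidly as the barrier gets wider.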

15.10. Wavefunction Shapes

Below is a picture of an infinite potential well with a non-flat bottom. Explain your answers to the following questions.

(a) For some arbitrary allowed energy, E, rank positions A, B and C by the classical kinetic energy of the particle at these positions from largest to smallest.

(b) Repeat for de Broglie wavelength.

(c) Repeat for the amount of time a classical particle spends traversing an interval of width δx at each position.

(d) Repeat for the spacings between the zeros of the wavefunction in the regions near each point. Assume that the energy level is sufficiently high that the wavefunction oscillates many times between the two walls.

(e) Repeat for the amplitude of the wavefunction in the region near each point.

(f) Sketch a plausible wavefunction for some high energy level.

SOLUTION:

(a) B > C > A, since K = E − V .

(b) A > C > B, since $K \propto p^2$ and $p \propto \lambda^{-1}$.

(c) A > C > B, since the particle moves slower where it has less kinetic energy.

(d) A > C > B; same as (b).

(e) A > C > B; same as (c).

(f) We want the amplitude and wavelength to get slightly larger near the sides.

CAUTION: Thanks to Yufan for bringing the following to my attention. I said that amplitudes and wavelengths tend to be smaller in regions where the difference between $E$ and $V$ is bigger. The statement about the wavelength is certainly correct. However, the statement about the amplitude only holds for bound states; it does not hold for plane wave states. Our intuitive arguments above for the amplitude technically require that the particle be going back and forth many many times. This is only the case for bound states. Then, it certainly is true that given two regions of space of the same size, the particle is less likely to be found in the region in which it is traveling faster. Therefore, the amplitude will tend to be smaller in those regions. However, for one-dimensional scattering problems, where you send in a plane wave on the left and then study the reflected and transmitted waves, this argument doesn’t really hold. The particle has one pass; it does not go back and forth. So for example, the amplitude of the transmitted wave for a step barrier is smaller than the amplitude of the incident wave even though the wavevector, and therefore the speed, is smaller in the transmitted region.

16. Final Exam Solutions

16.1. The Pole Vaulter Paradox

A pole vaulter is running with a pole at $v = \frac{\sqrt{3}}{2}c$. Her pole has a proper length of $L$. She runs into a barn with proper length $\frac{L}{2}$ with doors on the front and back. When the pole vaulter runs into the barn, a farmer tries to close both front and back doors at the same time, but only for an instant, and then reopens them.

(a) What is the length of the pole from the farmer’s perspective? What is the length of the barn from the pole vaulter’s perspective? From the farmer’s perspective can he close the barn doors at the same time? [15 pts]

(b) Are the doors closed at the same time for the pole vaulter? What is the expression for the time interval of the door closings in the pole vaulter’s frame? What is the interpretation of the sign of the expression? [10 pts]

(c) In the pole vaulter’s frame give an expression for what the time interval would have to be to avoid an accident. Comparing the answers of (b) and (c), is there an accident? [5 pts]

SOLUTION:

(a) The $\gamma$ factor associated with the speed $v = \frac{\sqrt{3}}{2}c$ is

\[ \gamma = \frac{1}{\sqrt{1 - \left(\frac{v}{c}\right)^2}} = \frac{1}{\sqrt{1 - \frac{3}{4}}} = 2. \]

The length of the pole from the farmer’s perspective is

\[ \text{pole length to farmer} = \frac{L}{\gamma} = \frac{L}{2}. \]

The length of the barn from the pole vaulter’s perspective is

\[ \text{barn length to pole vaulter} = \frac{L/2}{\gamma} = \frac{L}{4}. \]

From the farmer’s perspective he can close the barn doors at the same time, neglecting the fact that the pole is exactly the same length as the barn from his perspective and neglecting timing and reaction time issues. In other words, there is one instant in time in the farmer’s frame of reference when the pole is entirely within the barn.

(b) No, the doors are not closed at the same time for the pole vaulter. Let $\Delta t$ and $\Delta x$ be the time interval and the spatial distance in the reference frame of the barn and farmer between the two events (back door closing, then front door closing). Let $\Delta t'$ and $\Delta x'$ be the corresponding intervals in the reference frame of the pole vaulter. In the reference frame of the barn and farmer, the two events are simultaneous and are separated in space by the proper length of the barn:

\[ \Delta t = 0, \qquad \Delta x = \frac{L}{2}. \]

The invariant interval is

\[ (\Delta s)^2 = (c\Delta t)^2 - (\Delta x)^2 = 0 - \left(\frac{L}{2}\right)^2 = -\frac{L^2}{4}. \tag{16.1} \]

In the reference frame of the pole vaulter, the two events are separated in space by the proper length of the pole:

\[ \Delta x' = L. \]

Therefore,

\[ (\Delta s)^2 = (c\Delta t')^2 - (\Delta x')^2 = (c\Delta t')^2 - L^2. \tag{16.2} \]

Setting Eqns. (16.1) and (16.2) equal and solving for $\Delta t'$ gives

\[ \Delta t' = \frac{\sqrt{3}}{2}\frac{L}{c}. \tag{16.3} \]

We could have also used a Lorentz transformation:

\[ c\Delta t' = \gamma c\Delta t + \beta\gamma\Delta x = 0 + \frac{\sqrt{3}}{2}\cdot 2\cdot\frac{L}{2} = \frac{\sqrt{3}}{2}L \implies \Delta t' = \frac{\sqrt{3}}{2}\frac{L}{c}. \]

Our definition for $\Delta t'$ means that if it is positive, then the back door closes before the front door closes. This makes sense, of course, since the front of the pole reaches the back of the barn before the back of the pole reaches the front of the barn.

(c) In the pole vaulter’s reference frame, when the front of the pole aligns with the back of the barn, the length of the pole that is inside the barn is just the length of the barn as measured by the pole vaulter, which is $\frac{L}{4}$. Therefore, the length of the pole outside of the barn is $\frac{3L}{4}$. Therefore, the minimum time interval between the back door of the barn closing and the front door closing to avoid an accident is the time it takes for the front of the barn to travel the remaining distance $\frac{3L}{4}$ to the back of the pole. This time is

\[ \Delta t_{\min}' = \frac{3L/4}{\sqrt{3}\,c/2} = \frac{\sqrt{3}}{2}\frac{L}{c}. \tag{16.4} \]

The time (16.3) is just equal to (16.4). Therefore, an accident is just about avoided.
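The numbers in this problem are clean enough to check in a few lines. The sketch below sets $L = 1$ and $c = 1$ (my choice of units) and follows the same sign convention for the Lorentz transformation as in part (b).

```python
import math

L = 1.0                       # proper length of the pole (unit choice)
c = 1.0                       # speed of light (unit choice)
beta = math.sqrt(3) / 2
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)

pole_in_barn_frame = L / gamma          # contracted pole, seen by the farmer
barn_in_pole_frame = (L / 2) / gamma    # contracted barn, seen by the vaulter

# Door closings in the barn frame: simultaneous, separated by the barn length
dt, dx = 0.0, L / 2
dt_prime = gamma * (dt + beta * dx / c)           # Lorentz transformation, part (b)

# Minimum delay in the vaulter's frame for the barn to clear the rest of the pole
dt_min = (L - barn_in_pole_frame) / (beta * c)    # part (c)

print(f"gamma = {gamma:.3f}, dt' = {dt_prime:.4f} L/c, dt'_min = {dt_min:.4f} L/c")
```

Both times come out to $\frac{\sqrt{3}}{2} L/c \approx 0.866\, L/c$, which is the borderline case.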

16.2. Pion Decay

A positive pion decays into a muon and a neutrino, $\pi^+ \to \mu^+ + \nu$. The pion rest mass is $m_\pi = 140$ MeV/$c^2$, the muon rest mass is $m_\mu = 106$ MeV/$c^2$, and the neutrino has a mass $m_\nu \approx 0$. Assume that the pion starts off at rest.

(a) Using conservation of relativistic momentum and energy, find an expression for the momentum of the muon that depends only on $m_\pi$ and $m_\mu$. [10 pts]

(b) Show that the following expression is correct. [20 pts]

\[ \frac{u}{c} = \frac{(m_\pi/m_\mu)^2 - 1}{(m_\pi/m_\mu)^2 + 1}. \]

SOLUTION:

(a) Let $p_\mu$ and $p_\nu$ be the magnitudes of the momenta of the muon and neutrino, respectively. Since the pion starts off at rest, the initial momentum is zero. Therefore, conservation of momentum implies that

\[ p_\mu = p_\nu. \tag{16.5} \]

Since the neutrino is taken to be massless,

\[ E_\nu = \sqrt{p_\nu^2c^2 + m_\nu^2c^4} = p_\nu c = p_\mu c, \tag{16.6} \]

where we plugged in (16.5) to get the final equality. Energy conservation reads

\[ m_\pi c^2 = E_\mu + E_\nu = \sqrt{p_\mu^2c^2 + m_\mu^2c^4} + p_\mu c. \tag{16.7} \]

Isolating the square root on one side and squaring gives

\[ p_\mu^2c^2 - 2m_\pi c^3p_\mu + m_\pi^2c^4 = p_\mu^2c^2 + m_\mu^2c^4. \]

Solving for $p_\mu$ gives

\[ p_\mu = \frac{m_\pi^2c^4 - m_\mu^2c^4}{2m_\pi c^3} = \left[\left(\frac{m_\pi}{m_\mu}\right)^2 - 1\right]\frac{m_\mu}{2m_\pi}\,m_\mu c. \tag{16.8} \]

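(b) The energy of the muon is

\[ \begin{aligned}
E_\mu &= \sqrt{p_\mu^2c^2 + m_\mu^2c^4} \\
&= \sqrt{\left[\left(\frac{m_\pi}{m_\mu}\right)^2 - 1\right]^2\left(\frac{m_\mu}{2m_\pi}\right)^2m_\mu^2c^4 + m_\mu^2c^4} \\
&= \frac{m_\mu}{2m_\pi}\,m_\mu c^2\sqrt{\left[\left(\frac{m_\pi}{m_\mu}\right)^2 - 1\right]^2 + \left(\frac{2m_\pi}{m_\mu}\right)^2} \\
&= \frac{m_\mu}{2m_\pi}\,m_\mu c^2\sqrt{\left(\frac{m_\pi}{m_\mu}\right)^4 - 2\left(\frac{m_\pi}{m_\mu}\right)^2 + 1 + 4\left(\frac{m_\pi}{m_\mu}\right)^2} \\
&= \frac{m_\mu}{2m_\pi}\,m_\mu c^2\sqrt{\left[\left(\frac{m_\pi}{m_\mu}\right)^2 + 1\right]^2} \\
&= \left[\left(\frac{m_\pi}{m_\mu}\right)^2 + 1\right]\frac{m_\mu}{2m_\pi}\,m_\mu c^2.
\end{aligned} \tag{16.9} \]

Let $u$ be the speed of the muon in the rest frame of the pion. Let $\beta = \frac{u}{c}$ and $\gamma$ be the associated gamma factor. Then,

\[ E_\mu = \gamma m_\mu c^2, \qquad p_\mu = \gamma m_\mu u = \beta\gamma m_\mu c. \]

Therefore,

\[ \frac{u}{c} = \beta = \frac{p_\mu c}{E_\mu} = \frac{(m_\pi/m_\mu)^2 - 1}{(m_\pi/m_\mu)^2 + 1}. \]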
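Plugging in the given masses is a useful sanity check on Eqs. (16.8) and (16.9). Here is a minimal sketch in MeV with $c = 1$; the variable names are mine.

```python
m_pi = 140.0     # MeV/c^2
m_mu = 106.0     # MeV/c^2

ratio_sq = (m_pi / m_mu) ** 2

# Eq. (16.8): muon momentum, and Eq. (16.9): muon energy (c = 1)
p_mu = (ratio_sq - 1) * m_mu / (2 * m_pi) * m_mu
E_mu = (ratio_sq + 1) * m_mu / (2 * m_pi) * m_mu

E_nu = p_mu                  # massless neutrino carries equal and opposite momentum
beta = p_mu / E_mu           # muon speed in units of c

print(f"p_mu = {p_mu:.2f} MeV/c, E_mu = {E_mu:.2f} MeV, u/c = {beta:.4f}")
```

The muon and neutrino energies add up to the pion rest energy, and the muon comes out only mildly relativistic, at a bit over a quarter of the speed of light.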