
J. Broida UCSD Fall 2009 Phys 130B QM II

1 Angular Momentum in Quantum Mechanics

As is the case with most operators in quantum mechanics, we start from the classical definition and make the transition to quantum mechanical operators via the standard substitution $x \to x$ and $p \to -i\hbar\nabla$. Be aware that I will not distinguish a classical quantity such as $x$ from the corresponding quantum mechanical operator $x$. One frequently sees a new notation such as $\hat{x}$ used to denote the operator, but for the most part I will take it as clear from the context what is meant. I will also generally use $x$ and $r$ interchangeably; sometimes I feel that one is preferable over the other for clarity purposes. Classically, angular momentum is defined by

$$\mathbf{L} = \mathbf{r} \times \mathbf{p}\,.$$

Since in QM we have $[x_i, p_j] = i\hbar\delta_{ij}$, it follows that $[L_i, L_j] \neq 0$. To find out just what this commutation relation is, first recall that the components of the vector cross product can be written (see the handout Supplementary Notes on Mathematics)

$$(\mathbf{a} \times \mathbf{b})_i = \varepsilon_{ijk}\, a_j b_k\,.$$

Here I am using a sloppy summation convention where repeated indices are summed over even if they are both in the lower position, but this is standard when it comes to angular momentum. The Levi-Civita permutation symbol has the extremely useful property that

$$\varepsilon_{ijk}\varepsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}\,.$$

Also recall the elementary commutator identities

$$[ab, c] = a[b, c] + [a, c]b \qquad\text{and}\qquad [a, bc] = b[a, c] + [a, b]c\,.$$

Using these results together with $[x_i, x_j] = [p_i, p_j] = 0$, we can evaluate the commutator as follows:

$$\begin{aligned}
[L_i, L_j] &= [(\mathbf{r}\times\mathbf{p})_i,\ (\mathbf{r}\times\mathbf{p})_j] = [\varepsilon_{ikl} x_k p_l,\ \varepsilon_{jrs} x_r p_s] \\
&= \varepsilon_{ikl}\varepsilon_{jrs}[x_k p_l, x_r p_s] = \varepsilon_{ikl}\varepsilon_{jrs}\bigl(x_k[p_l, x_r p_s] + [x_k, x_r p_s]p_l\bigr) \\
&= \varepsilon_{ikl}\varepsilon_{jrs}\bigl(x_k[p_l, x_r]p_s + x_r[x_k, p_s]p_l\bigr) = \varepsilon_{ikl}\varepsilon_{jrs}\bigl(-i\hbar\delta_{lr}\, x_k p_s + i\hbar\delta_{ks}\, x_r p_l\bigr) \\
&= -i\hbar\,\varepsilon_{ikl}\varepsilon_{jls}\, x_k p_s + i\hbar\,\varepsilon_{ikl}\varepsilon_{jrk}\, x_r p_l = +i\hbar\,\varepsilon_{ikl}\varepsilon_{ljs}\, x_k p_s - i\hbar\,\varepsilon_{jrk}\varepsilon_{kil}\, x_r p_l \\
&= i\hbar(\delta_{ij}\delta_{ks} - \delta_{is}\delta_{jk})\, x_k p_s - i\hbar(\delta_{ji}\delta_{rl} - \delta_{jl}\delta_{ri})\, x_r p_l \\
&= i\hbar(\delta_{ij}\, x_k p_k - x_j p_i) - i\hbar(\delta_{ij}\, x_l p_l - x_i p_j) \\
&= i\hbar(x_i p_j - x_j p_i)\,.
\end{aligned}$$

But it is easy to see that

$$\varepsilon_{ijk} L_k = \varepsilon_{ijk}(\mathbf{r}\times\mathbf{p})_k = \varepsilon_{ijk}\varepsilon_{krs}\, x_r p_s = (\delta_{ir}\delta_{js} - \delta_{is}\delta_{jr})\, x_r p_s = x_i p_j - x_j p_i$$

and hence we have the fundamental angular momentum commutation relation

$$[L_i, L_j] = i\hbar\,\varepsilon_{ijk} L_k\,. \tag{1.1a}$$

Written out, this says that

$$[L_x, L_y] = i\hbar L_z \qquad [L_y, L_z] = i\hbar L_x \qquad [L_z, L_x] = i\hbar L_y\,.$$

Note that these are just cyclic permutations of the indices $x \to y \to z \to x$. Now the total angular momentum squared is $L^2 = \mathbf{L}\cdot\mathbf{L} = L_i L_i$, and therefore

$$[L^2, L_j] = [L_i L_i, L_j] = L_i[L_i, L_j] + [L_i, L_j]L_i$$

$$= i\hbar\,\varepsilon_{ijk} L_i L_k + i\hbar\,\varepsilon_{ijk} L_k L_i\,.$$

But

$$\varepsilon_{ijk} L_k L_i = \varepsilon_{kji} L_i L_k = -\varepsilon_{ijk} L_i L_k$$

where the first step follows by relabeling $i$ and $k$, and the second step follows by the antisymmetry of the Levi-Civita symbol. This leaves us with the important relation

$$[L^2, L_j] = 0\,. \tag{1.1b}$$
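As a quick sanity check of (1.1a) and (1.1b), here is a short SymPy sketch of my own (it is not part of the original notes): it represents each $L_i$ as the differential operator $-i\hbar\,\varepsilon_{ijk}x_j\partial_k$ acting on a test function $f$ and verifies the commutators symbolically.

```python
# Verify [Lx, Ly] = i*hbar*Lz and [L^2, Lz] = 0 symbolically with SymPy.
import sympy as sp

x, y, z, hbar = sp.symbols('x y z hbar')
f = sp.Function('f')(x, y, z)

Lx = lambda g: -sp.I*hbar*(y*sp.diff(g, z) - z*sp.diff(g, y))
Ly = lambda g: -sp.I*hbar*(z*sp.diff(g, x) - x*sp.diff(g, z))
Lz = lambda g: -sp.I*hbar*(x*sp.diff(g, y) - y*sp.diff(g, x))
L2 = lambda g: Lx(Lx(g)) + Ly(Ly(g)) + Lz(Lz(g))

print(sp.simplify(Lx(Ly(f)) - Ly(Lx(f)) - sp.I*hbar*Lz(f)))  # -> 0, i.e. (1.1a)
print(sp.simplify(L2(Lz(f)) - Lz(L2(f))))                    # -> 0, i.e. (1.1b)
```

The other two relations in (1.1a) follow by cycling the definitions, exactly as in the text.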

Because of these commutation relations, we can simultaneously diagonalize $L^2$ and any one (and only one) of the components of $\mathbf{L}$, which by convention is taken to be $L_3 = L_z$. The construction of these eigenfunctions by solving the differential equations is at least outlined in almost every decent QM text. (The old book Introduction to Quantum Mechanics by Pauling and Wilson has an excellent detailed description of the power series solution.) Here I will follow the algebraic approach that is both simpler and lends itself to many more advanced applications. The main reason for this is that many particles have an intrinsic angular momentum (called spin) that is without a classical analogue, but nonetheless can be described mathematically exactly the same way as the above "orbital" angular momentum.

In view of this generality, from now on we will denote a general (Hermitian) angular momentum operator by $\mathbf{J}$. All we know is that it obeys the commutation relations

$$[J_i, J_j] = i\hbar\,\varepsilon_{ijk} J_k \tag{1.2a}$$

and, as a consequence,

$$[J^2, J_i] = 0\,. \tag{1.2b}$$

Remarkably, this is all we need to compute the most useful properties of angular momentum. To begin with, let us define the ladder (or raising and lowering) operators

$$J_+ = J_x + iJ_y \qquad J_- = (J_+)^\dagger = J_x - iJ_y\,. \tag{1.3a}$$

Then we also have

$$J_x = \frac{1}{2}(J_+ + J_-) \qquad\text{and}\qquad J_y = \frac{1}{2i}(J_+ - J_-)\,. \tag{1.3b}$$

Because of (1.2b), it is clear that

$$[J^2, J_\pm] = 0\,. \tag{1.4}$$

In addition, we have

$$[J_z, J_\pm] = [J_z, J_x] \pm i[J_z, J_y] = i\hbar J_y \pm \hbar J_x$$

so that

$$[J_z, J_\pm] = \pm\hbar J_\pm\,. \tag{1.5a}$$

Furthermore,

$$[J_z, J_\pm^2] = J_\pm[J_z, J_\pm] + [J_z, J_\pm]J_\pm = \pm 2\hbar J_\pm^2$$

and it is easy to see inductively that

$$[J_z, J_\pm^k] = \pm k\hbar\, J_\pm^k\,. \tag{1.5b}$$

It will also be useful to note

$$J_+ J_- = (J_x + iJ_y)(J_x - iJ_y) = J_x^2 + J_y^2 - i[J_x, J_y] = J_x^2 + J_y^2 + \hbar J_z$$

and hence (since $J_x^2 + J_y^2 = J^2 - J_z^2$)

$$J^2 = J_+ J_- + J_z^2 - \hbar J_z\,. \tag{1.6a}$$

Similarly, it is easy to see that we also have

$$J^2 = J_- J_+ + J_z^2 + \hbar J_z\,. \tag{1.6b}$$

Because $J^2$ and $J_z$ commute they may be simultaneously diagonalized, and we denote their (un-normalized) simultaneous eigenfunctions by $Y_\alpha^\beta$ where

$$J^2 Y_\alpha^\beta = \hbar^2\alpha\, Y_\alpha^\beta \qquad\text{and}\qquad J_z Y_\alpha^\beta = \hbar\beta\, Y_\alpha^\beta\,.$$

Since $J_i$ is Hermitian we have the general result

$$\langle J_i^2\rangle = \langle\psi|J_i^2\psi\rangle = \langle J_i\psi|J_i\psi\rangle = \|J_i\psi\|^2 \ge 0$$

and hence $\langle J^2\rangle - \langle J_z^2\rangle = \langle J_x^2\rangle + \langle J_y^2\rangle \ge 0$. But $J_z^2 Y_\alpha^\beta = \hbar^2\beta^2\, Y_\alpha^\beta$ and hence we must have

$$\beta^2 \le \alpha\,. \tag{1.7}$$

Now we can investigate the effect of $J_\pm$ on these eigenfunctions. From (1.4) we have

$$J^2(J_\pm Y_\alpha^\beta) = J_\pm(J^2 Y_\alpha^\beta) = \hbar^2\alpha\,(J_\pm Y_\alpha^\beta)$$

so that $J_\pm$ doesn't affect the eigenvalue of $J^2$. On the other hand, from (1.5a) we also have

$$J_z(J_\pm Y_\alpha^\beta) = (J_\pm J_z \pm \hbar J_\pm)Y_\alpha^\beta = \hbar(\beta \pm 1)\,J_\pm Y_\alpha^\beta$$

and hence $J_\pm$ raises or lowers the eigenvalue $\hbar\beta$ by one unit of $\hbar$. And in general, from (1.5b) we see that

$$J_z\bigl((J_\pm)^k Y_\alpha^\beta\bigr) = (J_\pm)^k(J_z Y_\alpha^\beta) \pm k\hbar\,(J_\pm)^k Y_\alpha^\beta = \hbar(\beta \pm k)\,(J_\pm)^k Y_\alpha^\beta$$

so the $k$-fold application of $J_\pm$ raises or lowers the eigenvalue of $J_z$ by $k$ units of $\hbar$. This shows that $(J_\pm)^k Y_\alpha^\beta$ is a simultaneous eigenfunction of both $J^2$ and $J_z$ with corresponding eigenvalues $\hbar^2\alpha$ and $\hbar(\beta \pm k)$, and hence we can write

$$(J_\pm)^k Y_\alpha^\beta = Y_\alpha^{\beta\pm k} \tag{1.8}$$

where the normalization is again unspecified.

Thus, starting from a state $Y_\alpha^\beta$ with a $J^2$ eigenvalue $\hbar^2\alpha$ and a $J_z$ eigenvalue $\hbar\beta$, we can repeatedly apply $J_+$ to construct an ascending sequence of eigenstates with $J_z$ eigenvalues $\hbar\beta, \hbar(\beta+1), \hbar(\beta+2), \dots$, all of which have the same $J^2$ eigenvalue $\hbar^2\alpha$. Similarly, we can apply $J_-$ to construct a descending sequence $\hbar\beta, \hbar(\beta-1), \hbar(\beta-2), \dots$, all of which also have the same $J^2$ eigenvalue $\hbar^2\alpha$. However, because of (1.7), both of these sequences must terminate. Let the upper $J_z$ eigenvalue be $\hbar\beta_u$ and the lower eigenvalue be $-\hbar\beta_l$. Thus, by definition,

$$J_z Y_\alpha^{\beta_u} = \hbar\beta_u\, Y_\alpha^{\beta_u} \qquad\text{and}\qquad J_z Y_\alpha^{-\beta_l} = -\hbar\beta_l\, Y_\alpha^{-\beta_l} \tag{1.9a}$$

with

$$J_+ Y_\alpha^{\beta_u} = 0 \qquad\text{and}\qquad J_- Y_\alpha^{-\beta_l} = 0 \tag{1.9b}$$

and where, by (1.7), we must have

$$\beta_u^2 \le \alpha \qquad\text{and}\qquad \beta_l^2 \le \alpha\,.$$

By construction, there must be an integral number $n$ of steps from $-\beta_l$ to $\beta_u$, so that

$$\beta_l + \beta_u = n\,. \tag{1.10}$$

(In other words, the eigenvalues of $J_z$ range over the $n$ intervals $-\beta_l,\ -\beta_l+1,\ -\beta_l+2,\ \dots,\ -\beta_l+(\beta_l+\beta_u) = \beta_u$.)

Now, using (1.6b) we have

$$J^2 Y_\alpha^{\beta_u} = J_- J_+ Y_\alpha^{\beta_u} + (J_z^2 + \hbar J_z)Y_\alpha^{\beta_u}\,.$$

Then by (1.9b) and the definition of $Y_\alpha^\beta$, this becomes

$$\hbar^2\alpha\, Y_\alpha^{\beta_u} = \hbar^2\beta_u(\beta_u + 1)\, Y_\alpha^{\beta_u} \qquad\text{so that}\qquad \alpha = \beta_u(\beta_u + 1)\,.$$

In a similar manner, using (1.6a) we have

$$J^2 Y_\alpha^{-\beta_l} = J_+ J_- Y_\alpha^{-\beta_l} + (J_z^2 - \hbar J_z)Y_\alpha^{-\beta_l}$$

or

$$\hbar^2\alpha\, Y_\alpha^{-\beta_l} = \hbar^2\beta_l(\beta_l + 1)\, Y_\alpha^{-\beta_l} \qquad\text{so also}\qquad \alpha = \beta_l(\beta_l + 1)\,.$$

Equating both of these equations for $\alpha$ and recalling (1.10), we conclude that

$$\beta_u = \beta_l = \frac{n}{2} := j$$

where $j$ is either integral or half-integral, depending on whether $n$ is even or odd. In either case, we finally arrive at

$$\alpha = j(j+1) \tag{1.11}$$

and the eigenvalues of $J_z$ range from $-\hbar j$ to $\hbar j$ in integral steps of $\hbar$.

We can now label the eigenvalues of $J_z$ by $\hbar m$ instead of $\hbar\beta$, where the integer or half-integer $m$ ranges from $-j$ to $j$ in integral steps. Thus our eigenvalue equations may be written

$$J^2 Y_j^m = j(j+1)\hbar^2\, Y_j^m \qquad\qquad J_z Y_j^m = m\hbar\, Y_j^m\,. \tag{1.12}$$

We say that the states $Y_j^m$ are angular momentum eigenstates with angular momentum $j$ and $z$-component of angular momentum $m$. Note that (1.9b) is now written

$$J_+ Y_j^j = 0 \qquad\text{and}\qquad J_- Y_j^{-j} = 0\,. \tag{1.13}$$

Since $(J_\pm)^\dagger = J_\mp$, using equations (1.6) we have

$$\begin{aligned}
\langle J_\pm Y_j^m | J_\pm Y_j^m\rangle &= \langle Y_j^m | J_\mp J_\pm Y_j^m\rangle = \langle Y_j^m | (J^2 - J_z^2 \mp \hbar J_z)Y_j^m\rangle \\
&= \hbar^2[j(j+1) - m^2 \mp m]\,\langle Y_j^m | Y_j^m\rangle \\
&= \hbar^2[j(j+1) - m(m \pm 1)]\,\langle Y_j^m | Y_j^m\rangle\,.
\end{aligned}$$

We know that $J_\pm Y_j^m$ is proportional to $Y_j^{m\pm 1}$. So if we assume that the $Y_j^m$ are normalized, then this equation implies that

$$J_\pm Y_j^m = \hbar\sqrt{j(j+1) - m(m \pm 1)}\; Y_j^{m\pm 1}\,. \tag{1.14}$$

If we start at the top state $Y_j^j$, then by repeatedly applying $J_-$, we can construct all of the states $Y_j^m$. Alternatively, we could equally well start from $Y_j^{-j}$ and repeatedly apply $J_+$ to also construct the states.

Let us see if we can find a relation that defines the $Y_j^m$. Since $Y_j^j$ is defined by $J_+ Y_j^j = 0$, we will only define our states up to an overall normalization factor. Using (1.14), we have

$$J_- Y_j^j = \hbar\sqrt{j(j+1) - j(j-1)}\; Y_j^{j-1} = \hbar\sqrt{2j}\; Y_j^{j-1}$$

or

$$Y_j^{j-1} = \hbar^{-1}\frac{1}{\sqrt{2j}}\, J_- Y_j^j\,.$$

Next we have

$$(J_-)^2 Y_j^j = \hbar^2\sqrt{2j}\,\sqrt{j(j+1) - (j-1)(j-2)}\; Y_j^{j-2} = \hbar^2\sqrt{(2j)\,2\,(2j-1)}\; Y_j^{j-2}$$

or

$$Y_j^{j-2} = \hbar^{-2}\frac{1}{\sqrt{(2j)(2j-1)\,2}}\,(J_-)^2 Y_j^j\,.$$

And once more should do it:

$$(J_-)^3 Y_j^j = \hbar^3\sqrt{(2j)(2j-1)\,2}\,\sqrt{j(j+1) - (j-2)(j-3)}\; Y_j^{j-3} = \hbar^3\sqrt{(2j)(2j-1)(2)(3)(2j-2)}\; Y_j^{j-3}$$

or

$$Y_j^{j-3} = \hbar^{-3}\frac{1}{\sqrt{2j(2j-1)(2j-2)\,3!}}\,(J_-)^3 Y_j^j\,.$$

Noting that $m = j - 3$ so that $3! = (j-m)!$ and $2j - 3 = 2j - (j-m) = j + m$, it is easy to see we have shown that

$$Y_j^m = \hbar^{m-j}\sqrt{\frac{(j+m)!}{(2j)!\,(j-m)!}}\;(J_-)^{j-m}\, Y_j^j\,. \tag{1.15a}$$
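As an aside (my own addition, not in the original notes), equations (1.12) and (1.14) are all that is needed to build explicit matrices for the angular momentum operators in the basis of states with fixed $j$. The NumPy sketch below does so for a few values of $j$, with $\hbar = 1$ and the basis ordered as $m = j, j-1, \dots, -j$ (both are my choices), and then checks the algebra (1.2a) and the $J^2$ eigenvalue (1.11):

```python
# Build (2j+1)-dimensional matrices of Jz and J+- from (1.12) and (1.14).
import numpy as np

def jmatrices(j):
    m = np.arange(j, -j - 1, -1)                 # m = j, j-1, ..., -j
    Jz = np.diag(m).astype(complex)
    # (1.14): J- Y_j^m = sqrt(j(j+1) - m(m-1)) Y_j^{m-1}
    c = np.sqrt(j*(j + 1) - m[:-1]*(m[:-1] - 1))
    Jminus = np.diag(c, k=-1).astype(complex)
    Jplus = Jminus.conj().T                      # (1.3a): J- = (J+)^dagger
    Jx = (Jplus + Jminus)/2                      # (1.3b)
    Jy = (Jplus - Jminus)/(2*1j)
    return Jx, Jy, Jz

for j in (0.5, 1.0, 1.5, 2.0):
    Jx, Jy, Jz = jmatrices(j)
    assert np.allclose(Jx@Jy - Jy@Jx, 1j*Jz)                  # (1.2a)
    J2 = Jx@Jx + Jy@Jy + Jz@Jz
    assert np.allclose(J2, j*(j + 1)*np.eye(int(2*j + 1)))    # (1.11)
print("angular momentum algebra verified")
```

For $j = 1/2$ this reproduces exactly the spin matrices derived in Section 2 below.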

6 j And an exactly analogous argument starting with Yj− and applying J+ repeatedly shows that we could also write

$$Y_j^m = \hbar^{-m-j}\sqrt{\frac{(j-m)!}{(2j)!\,(j+m)!}}\;(J_+)^{j+m}\, Y_j^{-j}\,. \tag{1.15b}$$

It is extremely important to realize that everything we have done up to this point depended only on the commutation relation (1.2a), and hence applies to both integer and half-integer angular momenta. While we will return in a later section to discuss spin (including the half-integer case), for the rest of this section we restrict ourselves to integer values of angular momentum, and hence we will be discussing orbital angular momentum.

The next thing we need to do is to actually construct the angular momentum wave functions $Y_l^m(\theta, \phi)$. (Since we are now dealing with orbital angular momentum, we replace $j$ by $l$.) To do this, we first need to write $\mathbf{L}$ in spherical coordinates. One way to do this is to start from $L_i = (\mathbf{r}\times\mathbf{p})_i = \varepsilon_{ijk} x_j p_k$ where $p_k = -i\hbar(\partial/\partial x_k)$, and then use the chain rule to convert from Cartesian coordinates $x_i$ to spherical coordinates $(r, \theta, \phi)$. Using

$$x = r\sin\theta\cos\phi \qquad y = r\sin\theta\sin\phi \qquad z = r\cos\theta$$

so that

$$r = (x^2 + y^2 + z^2)^{1/2} \qquad \theta = \cos^{-1}(z/r) \qquad \phi = \tan^{-1}(y/x)$$

we have, for example,

$$\begin{aligned}
\frac{\partial}{\partial x} &= \frac{\partial r}{\partial x}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial x}\frac{\partial}{\partial\theta} + \frac{\partial\phi}{\partial x}\frac{\partial}{\partial\phi} \\
&= \frac{x}{r}\frac{\partial}{\partial r} + \frac{xz}{r^3\sin\theta}\frac{\partial}{\partial\theta} - \frac{y}{x^2}\cos^2\!\phi\,\frac{\partial}{\partial\phi} \\
&= \sin\theta\cos\phi\,\frac{\partial}{\partial r} + \frac{\cos\theta\cos\phi}{r}\frac{\partial}{\partial\theta} - \frac{\sin\phi}{r\sin\theta}\frac{\partial}{\partial\phi}
\end{aligned}$$

with similar expressions for $\partial/\partial y$ and $\partial/\partial z$. Then using terms such as

$$L_x = yp_z - zp_y = -i\hbar\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right)$$

we eventually arrive at

$$L_x = -i\hbar\left(-\sin\phi\,\frac{\partial}{\partial\theta} - \cot\theta\cos\phi\,\frac{\partial}{\partial\phi}\right) \tag{1.16a}$$

$$L_y = -i\hbar\left(\cos\phi\,\frac{\partial}{\partial\theta} - \cot\theta\sin\phi\,\frac{\partial}{\partial\phi}\right) \tag{1.16b}$$

$$L_z = -i\hbar\,\frac{\partial}{\partial\phi}\,. \tag{1.16c}$$

However, another way is to start from the gradient in spherical coordinates (see the section on vector calculus in the handout Supplementary Notes on Mathematics)

$$\nabla = \hat{\mathbf{r}}\,\frac{\partial}{\partial r} + \hat{\boldsymbol{\theta}}\,\frac{1}{r}\frac{\partial}{\partial\theta} + \hat{\boldsymbol{\phi}}\,\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}\,.$$

Then $\mathbf{L} = \mathbf{r}\times\mathbf{p} = -i\hbar\,\mathbf{r}\times\nabla = -i\hbar\, r(\hat{\mathbf{r}}\times\nabla)$ so that (since $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\phi}}$ are orthonormal)

$$\begin{aligned}
\mathbf{L} &= -i\hbar\, r\left(\hat{\mathbf{r}}\times\hat{\mathbf{r}}\,\frac{\partial}{\partial r} + \hat{\mathbf{r}}\times\hat{\boldsymbol{\theta}}\,\frac{1}{r}\frac{\partial}{\partial\theta} + \hat{\mathbf{r}}\times\hat{\boldsymbol{\phi}}\,\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}\right) \\
&= -i\hbar\left(\hat{\boldsymbol{\phi}}\,\frac{\partial}{\partial\theta} - \hat{\boldsymbol{\theta}}\,\frac{1}{\sin\theta}\frac{\partial}{\partial\phi}\right).
\end{aligned}$$

If we write the unit vectors in terms of their Cartesian components (again, see the handout on vector calculus)

$$\hat{\boldsymbol{\theta}} = (\cos\theta\cos\phi,\ \cos\theta\sin\phi,\ -\sin\theta)$$
$$\hat{\boldsymbol{\phi}} = (-\sin\phi,\ \cos\phi,\ 0)$$

then

$$\mathbf{L} = -i\hbar\left[\hat{\mathbf{x}}\left(-\sin\phi\,\frac{\partial}{\partial\theta} - \cot\theta\cos\phi\,\frac{\partial}{\partial\phi}\right) + \hat{\mathbf{y}}\left(\cos\phi\,\frac{\partial}{\partial\theta} - \cot\theta\sin\phi\,\frac{\partial}{\partial\phi}\right) + \hat{\mathbf{z}}\,\frac{\partial}{\partial\phi}\right]$$

which is the same as we had in (1.16). Using these results, it is now easy to write the ladder operators $L_\pm = L_x \pm iL_y$ in spherical coordinates:

$$L_\pm = \pm\hbar\, e^{\pm i\phi}\left(\frac{\partial}{\partial\theta} \pm i\cot\theta\,\frac{\partial}{\partial\phi}\right). \tag{1.17}$$

To find the eigenfunctions $Y_l^m(\theta, \phi)$, we start from the definition $L_+ Y_l^l = 0$. This yields the equation

$$\frac{\partial Y_l^l}{\partial\theta} + i\cot\theta\,\frac{\partial Y_l^l}{\partial\phi} = 0\,.$$

We can solve this by the usual approach of separation of variables if we write $Y_l^l(\theta, \phi) = T(\theta)F(\phi)$. Substituting this and dividing by $TF$ we obtain

$$\frac{1}{T\cot\theta}\frac{\partial T}{\partial\theta} = -\frac{i}{F}\frac{\partial F}{\partial\phi}\,.$$

Following the standard argument, the left side of this is a function of $\theta$ only, and the right side is a function of $\phi$ only. Since varying $\theta$ won't affect the right side, and varying $\phi$ won't affect the left side, it must be that both sides are equal to a constant, which I will call $k$. Now the $\phi$ equation becomes

$$\frac{dF}{F} = ik\,d\phi$$

which has the solution $F(\phi) = e^{ik\phi}$ (up to normalization). But $Y_l^l$ is an eigenfunction of $L_z = -i\hbar(\partial/\partial\phi)$ with eigenvalue $l\hbar$, and hence so is $F(\phi)$ (since $T(\theta)$ just cancels out). This means that

$$-i\hbar\,\frac{\partial e^{ik\phi}}{\partial\phi} = k\hbar\, e^{ik\phi} := l\hbar\, e^{ik\phi}$$

and therefore we must have $k = l$, so that (up to normalization)

$$Y_l^l = e^{il\phi}\, T(\theta)\,.$$

With $k = l$, the $\theta$ equation becomes

$$\frac{dT}{T} = l\cot\theta\, d\theta = l\,\frac{\cos\theta}{\sin\theta}\,d\theta = l\,\frac{d\sin\theta}{\sin\theta}\,.$$

This is also easily integrated to yield (again, up to normalization)

$$T(\theta) = \sin^l\theta\,.$$

Thus, we can write

$$Y_l^l = c_l\,(\sin\theta)^l e^{il\phi}$$

where $c_l$ is a normalization constant, fixed by the requirement that

$$\int |Y_l^l|^2\, d\Omega = 2\pi\, |c_l|^2\int_0^\pi (\sin\theta)^{2l}\sin\theta\, d\theta = 1\,. \tag{1.18}$$

I will go through all the details involved in doing this integral. You are free to skip down to the result if you wish (equation (1.21)), but this result is also used in other physical applications. First I want to prove the relation

$$\int \sin^n x\, dx = -\frac{1}{n}\sin^{n-1}x\cos x + \frac{n-1}{n}\int \sin^{n-2}x\, dx\,. \tag{1.19}$$

This is done as an integration by parts (remember the formula $\int u\, dv = uv - \int v\, du$) letting $u = \sin^{n-1}x$ and $dv = \sin x\, dx$ so that $v = -\cos x$ and $du = (n-1)\sin^{n-2}x\cos x\, dx$. Then (using $\cos^2 x = 1 - \sin^2 x$ in the third line)

$$\begin{aligned}
\int \sin^n x\, dx &= \int \sin^{n-1}x\sin x\, dx \\
&= -\sin^{n-1}x\cos x + (n-1)\int \sin^{n-2}x\cos^2 x\, dx \\
&= -\sin^{n-1}x\cos x + (n-1)\int \sin^{n-2}x\, dx - (n-1)\int \sin^n x\, dx\,.
\end{aligned}$$

Now move the last term on the right over to the left, divide by $n$, and the result is (1.19).

We need to evaluate (1.19) for the case where $n = 2l + 1$. To get the final result in the form we want, we will need the basically simple algebraic result

$$(2l+1)!! = \frac{(2l+1)!}{2^l\, l!} \qquad l = 1, 2, 3, \dots \tag{1.20}$$

where the double factorial is defined by

$$n!! = n(n-2)(n-4)(n-6)\cdots\,.$$

There is nothing fancy about the proof of this fact. Noting that $n = 2l + 1$ is always odd, we have

$$\begin{aligned}
n!! &= 1\cdot 3\cdot 5\cdot 7\cdot 9\cdots(n-4)(n-2)\,n \\
&= \frac{1\cdot 2\cdot 3\cdot 4\cdot 5\cdot 6\cdot 7\cdot 8\cdot 9\cdots(n-4)(n-3)(n-2)(n-1)\,n}{2\cdot 4\cdot 6\cdot 8\cdots(n-3)(n-1)} \\
&= \frac{1\cdot 2\cdot 3\cdot 4\cdot 5\cdot 6\cdot 7\cdot 8\cdot 9\cdots(n-4)(n-3)(n-2)(n-1)\,n}{(2\cdot 1)(2\cdot 2)(2\cdot 3)(2\cdot 4)\cdots\bigl(2\cdot\frac{n-3}{2}\bigr)\bigl(2\cdot\frac{n-1}{2}\bigr)} \\
&= \frac{n!}{2^{\frac{n-1}{2}}\bigl(\frac{n-1}{2}\bigr)!}\,.
\end{aligned}$$

Substituting $n = 2l + 1$ we arrive at (1.20).

Now we are ready to do the integral in (1.18). Since the limits of integration are $0$ and $\pi$, the first term on the right side of (1.19) always vanishes, and we can ignore it. Then we have

$$\begin{aligned}
\int_0^\pi (\sin x)^{2l+1}\, dx &= \left(\frac{2l}{2l+1}\right)\int_0^\pi (\sin x)^{2l-1}\, dx \\
&= \left(\frac{2l}{2l+1}\right)\left(\frac{2l-2}{2l-1}\right)\int_0^\pi (\sin x)^{2l-3}\, dx \\
&= \left(\frac{2l}{2l+1}\right)\left(\frac{2l-2}{2l-1}\right)\left(\frac{2l-4}{2l-3}\right)\int_0^\pi (\sin x)^{2l-5}\, dx \\
&= \cdots = \left(\frac{2l}{2l+1}\right)\left(\frac{2l-2}{2l-1}\right)\left(\frac{2l-4}{2l-3}\right)\times\cdots\times\left(\frac{2l-(2l-2)}{2l-(2l-3)}\right)\int_0^\pi \sin x\, dx \\
&= \frac{2^l\, l(l-1)(l-2)\cdots(l-(l-1))}{(2l+1)!!}\cdot 2 = 2\,\frac{2^l\, l!}{(2l+1)!!} = 2\,\frac{(2^l\, l!)^2}{(2l+1)!}
\end{aligned} \tag{1.21}$$

where we used $\int_0^\pi \sin x\, dx = 2$ and (1.20). Using this result, (1.18) becomes

$$4\pi\, |c_l|^2\,\frac{(2^l\, l!)^2}{(2l+1)!} = 1$$

and hence

$$c_l = (-1)^l\left[\frac{(2l+1)!}{4\pi}\right]^{1/2}\frac{1}{2^l\, l!} \tag{1.22}$$

where we included a conventional arbitrary phase factor $(-1)^l$. Putting this all together, we have the top orbital angular momentum state

$$Y_l^l(\theta, \phi) = (-1)^l\left[\frac{(2l+1)!}{4\pi}\right]^{1/2}\frac{1}{2^l\, l!}\,(\sin\theta)^l e^{il\phi}\,. \tag{1.23}$$

To construct the rest of the states $Y_l^m(\theta, \phi)$, we repeatedly apply $L_-$ from equation (1.17) to finally obtain

$$Y_l^m(\theta, \phi) = (-1)^l\left[\frac{(2l+1)!}{4\pi}\right]^{1/2}\frac{1}{2^l\, l!}\left[\frac{(l+m)!}{(2l)!\,(l-m)!}\right]^{1/2} e^{im\phi}\,(\sin\theta)^{-m}\,\frac{d^{\,l-m}}{d(\cos\theta)^{l-m}}\,(\sin\theta)^{2l}\,. \tag{1.24}$$

It's just not worth going through this algebra also.
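If you would rather not take (1.24) on faith, here is a small check I have added (it is not in the original notes): it implements (1.24) verbatim in SymPy and compares it numerically with SymPy's built-in spherical harmonics Ynm at arbitrary sample angles. This assumes SymPy's Ynm uses the same Condon-Shortley phase convention as (1.24), which is the case in current SymPy.

```python
# Compare (1.24) against sympy.Ynm at sample angles.
import numpy as np
import sympy as sp

theta, phi, u = sp.symbols('theta phi u')

def Y(l, m):
    # d^{l-m}/d(cos theta)^{l-m} (sin theta)^{2l}, written with u = cos(theta)
    d = sp.diff((1 - u**2)**l, u, l - m).subs(u, sp.cos(theta))
    return ((-1)**l * sp.sqrt(sp.factorial(2*l + 1)/(4*sp.pi)) / (2**l*sp.factorial(l))
            * sp.sqrt(sp.factorial(l + m)/(sp.factorial(2*l)*sp.factorial(l - m)))
            * sp.exp(sp.I*m*phi) * sp.sin(theta)**(-m) * d)

t0, p0 = 0.7, 1.3                     # arbitrary test angles
for l in range(3):
    for m in range(-l, l + 1):
        lhs = complex(Y(l, m).subs({theta: t0, phi: p0}).evalf())
        rhs = complex(sp.Ynm(l, m, theta, phi).expand(func=True)
                      .subs({theta: t0, phi: p0}).evalf())
        assert np.isclose(lhs, rhs)
print("(1.24) reproduces the standard spherical harmonics")
```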

2 Spin

It is an experimental fact that many particles, and the electron in particular, have an intrinsic angular momentum. This was originally deduced by Goudsmit and Uhlenbeck in their analysis of the famous sodium D line, which arises in the transition from the $1s^2 2s^2 2p^6 3p$ excited state to the ground state. What initially appears as a strong single line is slightly split in the presence of a magnetic field into two closely spaced lines (the Zeeman effect). This (and other lines in the Na spectrum) indicates a doubling of the number of states available to the valence electron. To explain this "fine structure" of atomic spectra, Goudsmit and Uhlenbeck proposed in 1925 that the electron possesses an intrinsic angular momentum in addition to the orbital angular momentum due to its motion about the nucleus. Since magnetic moments are the result of current loops, it was originally thought that this was due to the electron spinning on its axis, and hence this intrinsic angular momentum was called spin. However, a number of arguments can be put forth to disprove that classical model, and the result is that we must assume that spin is a purely quantum phenomenon without a classical analogue. As I show at the end of this section, the classical model says that the magnetic moment $\boldsymbol{\mu}$ of a particle of charge $q$ and mass $m$ moving in a circle is given by

$$\boldsymbol{\mu} = \frac{q}{2mc}\,\mathbf{L}$$

where $\mathbf{L}$ is the angular momentum, with magnitude $L = mvr$. Furthermore, the energy of such a charged particle in a magnetic field $\mathbf{B}$ is $-\boldsymbol{\mu}\cdot\mathbf{B}$. Goudsmit and Uhlenbeck showed that the ratio of magnetic moment to angular momentum of the electron was in fact twice as large as it would be for an orbital angular momentum. (This factor of 2 is explained by the relativistic Dirac theory.) And since we know that a state with angular momentum $l$ is $(2l+1)$-fold degenerate, the splitting implies that the electron has an angular momentum $\hbar/2$.

From now on, we will assume that spin is described by the usual angular momentum theory, and hence we postulate a spin operator $\mathbf{S}$ and corresponding eigenstates $|s\, m_s\rangle$ such that

$$[S_i, S_j] = i\hbar\,\varepsilon_{ijk} S_k \tag{2.1a}$$
$$S^2|s\, m_s\rangle = s(s+1)\hbar^2\,|s\, m_s\rangle \tag{2.1b}$$
$$S_z|s\, m_s\rangle = m_s\hbar\,|s\, m_s\rangle \tag{2.1c}$$

$$S_\pm|s\, m_s\rangle = \hbar\sqrt{s(s+1) - m_s(m_s \pm 1)}\;|s\, m_s \pm 1\rangle\,. \tag{2.1d}$$

(I am switching to the more abstract bra-ket notation for the eigenstates because the states themselves are a rather abstract concept.) We will sometimes drop the subscript $s$ on $m_s$ if there is no danger of confusing this with the eigenvalue of $L_z$, which we will also sometimes write as $m_l$. Be sure to realize that particles can have a spin $s$ that is any integer multiple of $1/2$, and there are particles in nature that have spin 0 (e.g., the pion), spin 1/2 (e.g., the electron, neutron, proton), spin 1 (e.g., the photon, but this is a little bit subtle), spin 3/2 (the $\Delta$'s), spin 2 (the hypothesized graviton) and so forth.

Since the $z$ component of electron spin can take only one of the two values $\pm\hbar/2$, we will frequently denote the corresponding orthonormal eigenstates simply by $|\hat{\mathbf{z}}\,\pm\rangle$, where $|\hat{\mathbf{z}}\,+\rangle$ is called the spin up state, and $|\hat{\mathbf{z}}\,-\rangle$ is called the spin down state. An arbitrary spin state $|\chi\rangle$ is of the form

$$|\chi\rangle = c_+|\hat{\mathbf{z}}\,+\rangle + c_-|\hat{\mathbf{z}}\,-\rangle\,. \tag{2.2a}$$

If we wish to think in terms of explicit matrix representations, then we will write this in the form

$$\chi = c_+\chi_+^{(z)} + c_-\chi_-^{(z)}\,. \tag{2.2b}$$

Be sure to remember that the normalized states $|s\, m_s\rangle$ belong to distinct eigenvalues of a Hermitian operator, and hence they are in fact orthonormal.

To construct the matrix representation of spin operators, we need to first choose a basis for the space of spin states. Since for the electron there are only two possible states for the $z$ component, we need to pick a basis for a two-dimensional space, and the obvious choice is the standard basis

$$\chi_+^{(z)} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \qquad\text{and}\qquad \chi_-^{(z)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \tag{2.3}$$

With this choice of basis, we can now construct the $2\times 2$ matrix representation of the spin operator $\mathbf{S}$.

Note that the existence of spin has now led us to describe the electron by a multi-component state vector (in this case, two components), as opposed to the scalar wave functions we used up to this point. These two-component states are frequently called spinors.

The states $\chi_\pm^{(z)}$ were specifically constructed to be eigenstates of $S_z$ (recall that we simultaneously diagonalized $J^2$ and $J_z$), and hence the matrix representation of $S_z$ is diagonal with diagonal entries that are precisely the eigenvalues $\pm\hbar/2$. In other words, we have

$$\langle\hat{\mathbf{z}}\,\pm|S_z|\hat{\mathbf{z}}\,\pm\rangle = \pm\frac{\hbar}{2}$$

so that

$$S_z = \frac{\hbar}{2}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}. \tag{2.4}$$

(We are being somewhat sloppy with notation and using the same symbol $S_z$ to denote both the operator and its matrix representation.) That the vectors defined in (2.3) are indeed eigenvectors of $S_z$ is easy to verify:

$$\frac{\hbar}{2}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \frac{\hbar}{2}\begin{bmatrix} 1 \\ 0 \end{bmatrix} \qquad\text{and}\qquad \frac{\hbar}{2}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = -\frac{\hbar}{2}\begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$

To find the matrix representations of $S_x$ and $S_y$, we use (2.1d) together with $S_\pm = S_x \pm iS_y$ so that $S_x = (S_+ + S_-)/2$ and $S_y = (S_+ - S_-)/2i$. From

$$S_\pm|\hat{\mathbf{z}}\, m_s\rangle = \hbar\sqrt{3/4 - m_s(m_s \pm 1)}\;|\hat{\mathbf{z}}\, m_s \pm 1\rangle$$

we have

$$S_+|\hat{\mathbf{z}}\,+\rangle = 0 \qquad\qquad S_-|\hat{\mathbf{z}}\,+\rangle = \hbar\,|\hat{\mathbf{z}}\,-\rangle$$
$$S_+|\hat{\mathbf{z}}\,-\rangle = \hbar\,|\hat{\mathbf{z}}\,+\rangle \qquad\qquad S_-|\hat{\mathbf{z}}\,-\rangle = 0\,.$$

Therefore the only non-vanishing entry in $S_+$ is $\langle\hat{\mathbf{z}}\,+|S_+|\hat{\mathbf{z}}\,-\rangle$, and the only non-vanishing entry in $S_-$ is $\langle\hat{\mathbf{z}}\,-|S_-|\hat{\mathbf{z}}\,+\rangle$. Thus we have

$$S_+ = \hbar\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \qquad\text{and}\qquad S_- = \hbar\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}.$$

Using these, it is easy to see that

$$S_x = \frac{\hbar}{2}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \qquad\text{and}\qquad S_y = \frac{\hbar}{2}\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}. \tag{2.5}$$

And from (2.4) and (2.5) it is easy to calculate $S^2 = S_x^2 + S_y^2 + S_z^2$ to see that

$$S^2 = \frac{3}{4}\hbar^2\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

which agrees with (2.1b). It is conventional to write $\mathbf{S}$ in terms of the Pauli spin matrices $\boldsymbol{\sigma}$ defined by

$$\mathbf{S} = \frac{\hbar}{2}\,\boldsymbol{\sigma}$$

where

$$\sigma_x = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \qquad \sigma_y = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} \qquad \sigma_z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}. \tag{2.6}$$

Memorize these. The Pauli matrices obey several relations that I leave to you to verify (recall that the anticommutator is defined by $[a, b]_+ = ab + ba$):

$$[\sigma_i, \sigma_j] = 2i\,\varepsilon_{ijk}\sigma_k \tag{2.7a}$$
$$\sigma_i\sigma_j = i\,\varepsilon_{ijk}\sigma_k \qquad\text{for } i \neq j \tag{2.7b}$$
$$[\sigma_i, \sigma_j]_+ = 2I\delta_{ij} \tag{2.7c}$$

$$\sigma_i\sigma_j = I\delta_{ij} + i\,\varepsilon_{ijk}\sigma_k\,. \tag{2.7d}$$

Given three-component vectors $\mathbf{a}$ and $\mathbf{b}$, equation (2.7d) also leads to the extremely useful result

$$(\mathbf{a}\cdot\boldsymbol{\sigma})(\mathbf{b}\cdot\boldsymbol{\sigma}) = (\mathbf{a}\cdot\mathbf{b})I + i(\mathbf{a}\times\mathbf{b})\cdot\boldsymbol{\sigma}\,. \tag{2.8}$$

We will use this later when we discuss rotations.
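If you would rather let the computer do the verification of (2.7) and (2.8), here is a short NumPy check of my own (the closed-form Levi-Civita expression and the random test vectors are my choices):

```python
# Verify (2.7a), (2.7c), (2.7d) and (2.8) numerically.
import numpy as np

I2 = np.eye(2)
sig = [np.array([[0, 1], [1, 0]], complex),      # sigma_x
       np.array([[0, -1j], [1j, 0]]),            # sigma_y
       np.array([[1, 0], [0, -1]], complex)]     # sigma_z

def eps(i, j, k):                                # Levi-Civita symbol on {0,1,2}
    return (i - j)*(j - k)*(k - i)/2

for i in range(3):
    for j in range(3):
        rhs = sum(eps(i, j, k)*sig[k] for k in range(3))
        assert np.allclose(sig[i]@sig[j] - sig[j]@sig[i], 2j*rhs)        # (2.7a)
        assert np.allclose(sig[i]@sig[j] + sig[j]@sig[i], 2*I2*(i == j)) # (2.7c)
        assert np.allclose(sig[i]@sig[j], I2*(i == j) + 1j*rhs)          # (2.7d)

a, b = np.random.default_rng(1).normal(size=(2, 3))
adot = sum(a[i]*sig[i] for i in range(3))
bdot = sum(b[i]*sig[i] for i in range(3))
cross = np.cross(a, b)
assert np.allclose(adot@bdot,
                   np.dot(a, b)*I2 + 1j*sum(cross[i]*sig[i] for i in range(3)))  # (2.8)
```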

From the standpoint of physics, what equations (2.2) say is that if we have an electron (or any spin one-half particle) in an arbitrary spin state $\chi$, then the probability is $|c_+|^2$ that a measurement of the $z$ component of spin will result in $+\hbar/2$, and the probability is $|c_-|^2$ that the measurement will yield $-\hbar/2$. We can say this because (2.2) expresses $\chi$ as a linear superposition of eigenvectors of $S_z$.

But we could equally well describe $\chi$ in terms of eigenvectors of $S_x$. To do so, we simply diagonalize $S_x$ and use its eigenvectors as a new set of basis vectors for our two-dimensional spin space. Thus we must solve $S_x v = \lambda v$ for $\lambda$ and then the eigenvectors $v$. This is straightforward. The eigenvalue equation is $(S_x - \lambda I)v = 0$, so in order to have a non-trivial solution we must have

$$\det(S_x - \lambda I) = \begin{vmatrix} -\lambda & \hbar/2 \\ \hbar/2 & -\lambda \end{vmatrix} = \lambda^2 - \hbar^2/4 = 0$$

so that $\lambda = \pm\hbar/2$ as we should expect. To find the eigenvector corresponding to $\lambda_+ = +\hbar/2$ we solve

$$(S_x - \lambda_+ I)v = \frac{\hbar}{2}\begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} = 0$$

so that $a = b$ and the normalized eigenvector is (we now write $v = \chi^{(x)}_\pm$)

$$\chi_+^{(x)} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix}. \tag{2.9a}$$

For $\lambda_- = -\hbar/2$ we have

$$(S_x - \lambda_- I)v = \frac{\hbar}{2}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} c \\ d \end{bmatrix} = 0$$

so that $c = -d$ and the normalized eigenvector is now (where we arbitrarily choose $c = +1$)

$$\chi_-^{(x)} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix}. \tag{2.9b}$$

To understand just what this means, suppose we have an arbitrary spin state (normalized so that $|\alpha|^2 + |\beta|^2 = 1$)

$$\chi = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}.$$

Be sure to understand that this is a vector in a two-dimensional space, and it exists independently of any basis. In terms of the basis (2.3) we can write

$$\chi = \alpha\begin{bmatrix} 1 \\ 0 \end{bmatrix} + \beta\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \alpha\chi_+^{(z)} + \beta\chi_-^{(z)}$$

so that the probability is $|\alpha|^2$ that we will measure the $z$ component of spin to be $+\hbar/2$, and $|\beta|^2$ that we will measure it to be $-\hbar/2$.

Alternatively, we can express $\chi$ in terms of the basis (2.9):

$$\begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \frac{a}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix} + \frac{b}{\sqrt{2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} a/\sqrt{2} + b/\sqrt{2} \\ a/\sqrt{2} - b/\sqrt{2} \end{bmatrix}$$

or

$$\alpha = a/\sqrt{2} + b/\sqrt{2} \qquad\text{and}\qquad \beta = a/\sqrt{2} - b/\sqrt{2}\,.$$

Solving for $a$ and $b$ in terms of $\alpha$ and $\beta$ we obtain

$$a = \frac{1}{\sqrt{2}}(\alpha + \beta) \qquad\text{and}\qquad b = \frac{1}{\sqrt{2}}(\alpha - \beta)$$

so that

$$\chi = \left(\frac{\alpha + \beta}{\sqrt{2}}\right)\chi_+^{(x)} + \left(\frac{\alpha - \beta}{\sqrt{2}}\right)\chi_-^{(x)}\,. \tag{2.10}$$

Thus the probability of measuring the $x$ component of spin to be $+\hbar/2$ is $|\alpha + \beta|^2/2$, and the probability of measuring the value to be $-\hbar/2$ is $|\alpha - \beta|^2/2$.

(Remark: What we just did was nothing more than the usual change of basis in a vector space. We started with a basis

$$e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \qquad\text{and}\qquad e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

(which we chose to be the eigenvectors of Sz) and changed to a new basis

$$\bar{e}_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix} \qquad\text{and}\qquad \bar{e}_2 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix}$$

(which were the eigenvectors of $S_x$). Since $\bar{e}_1 = (e_1 + e_2)/\sqrt{2}$ and $\bar{e}_2 = (e_1 - e_2)/\sqrt{2}$, we see that this change of basis is described by the transition matrix defined by $\bar{e}_i = e_j\, p^j{}_i$, or

$$P = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} = P^{-1}\,.$$

Then a vector

$$\chi = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}$$

can be written in terms of either the basis $\{e_i\}$ as

$$\chi = \alpha\begin{bmatrix} 1 \\ 0 \end{bmatrix} + \beta\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \chi^1 e_1 + \chi^2 e_2$$

or in terms of the basis $\{\bar{e}_i\}$ as $\chi = \bar{\chi}^1\bar{e}_1 + \bar{\chi}^2\bar{e}_2$ where $\bar{\chi}^i = (p^{-1})^i{}_j\,\chi^j$. This then immediately yields

$$\chi = \frac{1}{\sqrt{2}}(\alpha + \beta)\,\bar{e}_1 + \frac{1}{\sqrt{2}}(\alpha - \beta)\,\bar{e}_2$$

which is just (2.10).)

Now we ask how to incorporate spin into the general solution to the Schrödinger equation for the hydrogen atom. Let us ignore terms that couple spin with the orbital angular momentum of the electron. (This "L–S coupling" is relatively small compared to the electron binding energy, and can be ignored to first order. We will, however, take this into account when we discuss perturbation theory.) Under these conditions, the Hamiltonian is still separable, and we can write the total stationary state wave function as a product of a spatial part $\psi_{nlm_l}$ times a spin part $\chi(m_s)$. Thus we can write the complete hydrogen atom wave function in the form

$$\Psi_{nlm_lm_s} = \psi_{nlm_l}\,\chi(m_s)\,.$$

Since the Hamiltonian is independent of spin, we have

$$H\Psi = H[\psi_{nlm_l}\chi(m_s)] = \chi(m_s)H\psi_{nlm_l} = E_n[\chi(m_s)\psi_{nlm_l}] = E_n\Psi$$

so that the energies are unchanged. However, because of the spin function, we have doubled the number of states corresponding to a given energy.

A more mathematically correct way to write these complete states is as the tensor (or direct) product

$$|\Psi\rangle = |\psi_{nlm_l}\,\chi(m_s)\rangle := |\psi_{nlm_l}\rangle\otimes|\chi(m_s)\rangle\,.$$

In this case, the Hamiltonian is properly written as $H\otimes I$ where $H$ acts on the vector space of spatial wave functions, and $I$ is the identity operator on the vector space of spin states. In other words,

$$(H\otimes I)(|\psi_{nlm_l}\rangle\otimes|\chi(m_s)\rangle) := H|\psi_{nlm_l}\rangle\otimes I|\chi(m_s)\rangle\,.$$

This notation is particularly useful when treating two-particle states, as we will see when we discuss the addition of angular momentum.

(Remark: You may recall from linear algebra that given two vector spaces $V$ and $V'$, we may define a bilinear map $V\times V' \to V\otimes V'$ that takes ordered pairs $(v, v') \in V\times V'$ and gives a new vector denoted by $v\otimes v'$. Since this map is bilinear by definition, if we have the linear combinations $v = \sum x_i v_i$ and $v' = \sum y_j v'_j$, then $v\otimes v' = \sum x_i y_j\,(v_i\otimes v'_j)$. In particular, if $V$ has basis $\{e_i\}$ and $V'$ has basis $\{e'_j\}$, then $\{e_i\otimes e'_j\}$ is a basis for $V\otimes V'$, which is then of dimension $(\dim V)(\dim V')$ and called the direct (or tensor) product of $V$ and $V'$. Then, if we are given two operators $A \in L(V)$ and $B \in L(V')$, the direct product of $A$ and $B$ is the operator $A\otimes B$ defined on $V\otimes V'$ by $(A\otimes B)(v\otimes v') := A(v)\otimes B(v')$.)

2.1 Supplementary Topic: Magnetic Moments

Consider a particle of charge $q$ moving in a circular orbit. It forms an effective current

$$I = \frac{\Delta q}{\Delta t} = \frac{q}{2\pi r/v} = \frac{qv}{2\pi r}\,.$$

By definition, the magnetic moment has magnitude

$$\mu = \frac{I}{c}\times\text{area} = \frac{qv}{2\pi rc}\cdot\pi r^2 = \frac{qvr}{2c}\,.$$

But the angular momentum of the particle is $L = mvr$, so we conclude that the magnetic moment due to orbital motion is

$$\mu_l = \frac{q}{2mc}\,L\,. \tag{2.11}$$

The ratio of $\mu$ to $L$ is called the gyromagnetic ratio.

While the above derivation of (2.11) was purely classical, we know that the electron also possesses an intrinsic spin angular momentum. Let us hypothesize that the electron magnetic moment associated with this spin is of the form

$$\mu_s = g\,\frac{-e}{2mc}\,S\,.$$

The constant $g$ is found by experiment to be very close to 2. (However, the relativistic Dirac equation predicts that $g$ is exactly 2. Higher order corrections in quantum electrodynamics predict a slightly different value, and the measurement of $g - 2$ is one of the most accurate experimental results in all of physics.)

Now we want to show that the energy of a magnetic moment in a uniform magnetic field is given by $-\boldsymbol{\mu}\cdot\mathbf{B}$, where $\boldsymbol{\mu}$ for a loop of area $A$ carrying current $I$ is defined to have magnitude $IA$ and to point perpendicular to the loop, in the direction of your thumb if the fingers of your right hand curl along the direction of the current. To see this, we simply calculate the work required to rotate a current loop from its equilibrium position to the desired orientation. Consider the figure shown below, where the current flows counterclockwise out of the page at the bottom and into the page at the top.

[Figure: a rectangular current loop with sides of length $a$ tilted at angle $\theta$ in the uniform field $\mathbf{B}$; forces $F_B$ act on the top and bottom sides, and $\boldsymbol{\mu}$ is normal to the loop.]

Let the loop have length a on the sides and b across the top and bottom, so its area is ab. The magnetic force on a current-carrying wire is

$$\mathbf{F}_B = I\int d\mathbf{l}\times\mathbf{B}$$

and hence the forces on the opposite "$a$ sides" of the loop cancel, and the force on the top and bottom "$b$ sides" is $F_B = IbB$. The equilibrium position of the loop is horizontal, so the potential energy of the loop is the work required to rotate it from $\theta = 0$ to some value $\theta$. This work is given by $W = \int\mathbf{F}\cdot d\mathbf{r}$ where $\mathbf{F}$ is the force that I must apply against the magnetic field to rotate the loop. Since the loop is rotating, the force I must apply at the top of the loop is in the direction of $\boldsymbol{\mu}$ and perpendicular to the loop, and hence has magnitude $F_B\cos\theta$. Then the work I do is (the factor of 2 takes into account both the top and bottom sides)

$$W = \int\mathbf{F}\cdot d\mathbf{r} = 2\int F_B\cos\theta\,(a/2)\,d\theta = IabB\int_0^\theta\cos\theta\, d\theta = \mu B\sin\theta\,.$$

But note that $\boldsymbol{\mu}\cdot\mathbf{B} = \mu B\cos(90^\circ + \theta) = -\mu B\sin\theta$, and therefore

$$W = -\boldsymbol{\mu}\cdot\mathbf{B}\,. \tag{2.12}$$

In this derivation, I never explicitly mentioned the torque on the loop due to $\mathbf{B}$. However, we see that

$$\|\mathbf{N}\| = \|\mathbf{r}\times\mathbf{F}_B\| = 2(a/2)F_B\sin(90^\circ + \theta) = IabB\sin(90^\circ + \theta) = \mu B\sin(90^\circ + \theta) = \|\boldsymbol{\mu}\times\mathbf{B}\|$$

and therefore

$$\mathbf{N} = \boldsymbol{\mu}\times\mathbf{B}\,. \tag{2.13}$$

Note that $W = \int\|\mathbf{N}\|\,d\theta$. We also see that

$$\frac{d\mathbf{L}}{dt} = \frac{d}{dt}\,\mathbf{r}\times\mathbf{p} = \mathbf{r}\times\frac{d\mathbf{p}}{dt} = \mathbf{r}\times\mathbf{F}$$

where we used $\mathbf{p} = m\mathbf{v}$ and $\dot{\mathbf{r}}\times\mathbf{p} = \mathbf{v}\times\mathbf{p} = 0$. Therefore, as you should already know,

$$\frac{d\mathbf{L}}{dt} = \mathbf{N}\,. \tag{2.14}$$

3 Mathematical Digression: Rotations and Linear Transformations

Let’s take a look at how the spatial rotation operator is defined. Note that there are two ways to view symmetry operations such as translations and rotations. The first is to leave the coordinate system unchanged and instead move the physical system. This is called an active transformation. Alternatively, we can leave the physical system alone and change the coordinate system, for example by translation or rotation. This is called a passive transformation. In the case of an active transformation, we have the following situation:

[Figure: the vector $\mathbf{r}$, at angle $\phi$ from the $x_1$-axis, is rotated by $\theta$ into $\bar{\mathbf{r}}$ in the $x_1x_2$-plane.]

Here the vector $\mathbf{r}$ is rotated by $\theta$ to give the vector $\bar{\mathbf{r}}$ where, of course, $\|\mathbf{r}\| = \|\bar{\mathbf{r}}\| = r$. We define a linear transformation $T$ by $\bar{\mathbf{r}} = T(\mathbf{r})$. (This is linear because it is easy to see that rotating the sum of two vectors is the sum of the rotated vectors.) From the diagram, the components of $\bar{\mathbf{r}}$ are given by

$$\bar{x}_1 = r\cos(\theta + \phi) = r\cos\theta\cos\phi - r\sin\theta\sin\phi = (\cos\theta)x_1 - (\sin\theta)x_2$$

$$\bar{x}_2 = r\sin(\theta + \phi) = r\sin\theta\cos\phi + r\cos\theta\sin\phi$$

$$= (\sin\theta)x_1 + (\cos\theta)x_2$$

or

$$\begin{bmatrix} \bar{x}_1 \\ \bar{x}_2 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}. \tag{3.1}$$

Since $T$ is a linear transformation, it is completely specified by defining its values on a basis because

$$T(\mathbf{r}) = T\Bigl(\sum_i x_i e_i\Bigr) = \sum_i x_i\, Te_i\,.$$

But $Te_i$ is just another vector in $\mathbb{R}^2$, and hence it can be expressed in terms of the basis $\{e_i\}$ as

$$Te_i = \sum_j e_j a_{ji}\,. \tag{3.2}$$

Be sure to note which indices are summed over in this expression. The matrix $(a_{ji})$ is called the matrix representation of the linear transformation $T$ with respect to the basis $\{e_i\}$. You will sometimes see this matrix written as $[T]_e$.

It is very important to realize that a linear transformation $T$ takes the $i$th basis vector into the $i$th column of its matrix representation. This is easy to see if we write out the components of (3.2). Simply note that with respect to the basis $\{e_i\}$, we have

$$e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \qquad\text{and}\qquad e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

and therefore

$$Te_i = e_1 a_{1i} + e_2 a_{2i} = a_{1i}\begin{bmatrix} 1 \\ 0 \end{bmatrix} + a_{2i}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} a_{1i} \\ a_{2i} \end{bmatrix}$$

which is just the $i$th column of the matrix $A = (a_{ji})$.

As an example, let $V$ have a basis $\{v_1, v_2, v_3\}$, and let $T : V \to V$ be the linear transformation defined by

$$Tv_1 = 3v_1 + v_3$$
$$Tv_2 = v_1 - 2v_2 - v_3$$
$$Tv_3 = v_2 + v_3\,.$$

Then the representation of $T$ (relative to this basis) is

$$[T]_v = \begin{bmatrix} 3 & 1 & 0 \\ 0 & -2 & 1 \\ 1 & -1 & 1 \end{bmatrix}.$$
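(A two-line numerical restatement of the column rule, added by me: applying the matrix above to the $i$th standard basis vector returns its $i$th column.)

```python
# The i-th basis vector goes to the i-th column of [T]_v.
import numpy as np

T = np.array([[3,  1, 0],
              [0, -2, 1],
              [1, -1, 1]])
e2 = np.array([0, 1, 0])
print(T @ e2)    # [ 1 -2 -1], i.e. T v2 = v1 - 2 v2 - v3
```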

Now let $V$ be an $n$-dimensional vector space, and let $W$ be a subspace of $V$. Let $T$ be an operator on $V$, and suppose $W$ has the property that $Tw \in W$ for every $w \in W$. Then we say that $W$ is $T$-invariant (or simply invariant when the operator is understood).

What can we say about the matrix representation of $T$ under these circumstances? Well, let $W$ have the basis $\{w_1, \dots, w_r\}$, and extend this to a basis $\{w_1, \dots, w_r, v_1, \dots, v_{n-r}\}$ for $V$. Since $Tw \in W$ for any $w \in W$, we must have

$$Tw_i = \sum_{j=1}^r w_j a_{ji}$$

for some set of scalars $\{a_{ji}\}$. But for any $v_i$ there is no such restriction, since all we know is that $Tv_i \in V$. So we have

$$Tv_i = \sum_{j=1}^r w_j b_{ji} + \sum_{k=1}^{n-r} v_k c_{ki}$$

for scalars $\{b_{ji}\}$ and $\{c_{ki}\}$. Then since $T$ takes the $i$th basis vector to the $i$th column of the matrix representation $[T]$, we must have

$$[T] = \begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$$

where $A$ is an $r\times r$ matrix, $B$ is an $r\times(n-r)$ matrix, and $C$ is an $(n-r)\times(n-r)$ matrix. Such a matrix is said to be a block matrix, and $[T]$ is in block triangular form.

Now let $W$ be an invariant subspace of $V$. If it so happens that the subspace of $V$ spanned by the rest of the vectors $\{v_1, \dots, v_{n-r}\}$ is also invariant, then the matrix representation of $T$ will be block diagonal (because all of the $b_{ji}$ in the above expansion will be zero). As we shall see, this is in fact what happens when we add two angular momenta $\mathbf{J}_1$ and $\mathbf{J}_2$ and look at the representation of the rotation operator with respect to the total angular momentum states (where $\mathbf{J} = \mathbf{J}_1 + \mathbf{J}_2$). By choosing our states to be eigenstates of $J^2$ and $J_z$ rather than $J_{1z}$ and $J_{2z}$, the rotation operator becomes block diagonal rather than a big mess. This is because rotations don't change $j$, so for fixed $j$, the $(2j+1)$-dimensional space spanned by $\{|j\ j\rangle, |j\ j-1\rangle, \dots, |j\ {-j}\rangle\}$ is an invariant subspace under rotations. This change of basis is exactly what Clebsch-Gordan coefficients accomplish.

Let us go back to our specific example of rotations. If we define $\bar{\mathbf{r}} = T(\mathbf{r})$, then on the one hand we have

$$\bar{\mathbf{r}} = \sum_j \bar{x}_j e_j$$

while on the other hand, we can write

$$\bar{\mathbf{r}} = T(\mathbf{r}) = \sum_i x_i\, T(e_i) = \sum_i\sum_j e_j a_{ji} x_i\,.$$

Since the $e_j$ are a basis, they are linearly independent, and we can equate these last two equations to conclude that

$$\bar{x}_j = \sum_i a_{ji} x_i\,. \tag{3.3}$$

Note which indices are summed over in this equation.

Comparing (3.3) with (3.1), we see that the matrix in (3.1) is the matrix representation of the linear transformation $T$ defined by

$$(\bar{x}_1, \bar{x}_2) = T(x_1, x_2) = ((\cos\theta)x_1 - (\sin\theta)x_2,\ (\sin\theta)x_1 + (\cos\theta)x_2)\,.$$

Then the first column of $[T]$ is

$$T(e_1) = T(1, 0) = (\cos\theta, \sin\theta)$$

and the second column is

$$T(e_2) = T(0, 1) = (-\sin\theta, \cos\theta)$$

so that

$$[T] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

as in (3.1).

Using (3.3) we can make another extremely important observation. Since the length of a vector is unchanged under rotations, we must have $\sum_j \bar{x}_j\bar{x}_j = \sum_i x_i x_i$. But from (3.3) we see that

$$\sum_j \bar{x}_j\bar{x}_j = \sum_{i,k}\Bigl(\sum_j a_{ji}a_{jk}\Bigr)x_i x_k$$

and hence it follows that

$$\sum_j a_{ji}a_{jk} = \sum_j a^T_{ij}a_{jk} = \delta_{ik}\,. \tag{3.4a}$$

In matrix notation, this is

$$A^T A = I\,.$$

Since we are in a finite-dimensional space, this implies that $AA^T = I$ also. In other words, $A^T = A^{-1}$. Such a matrix is said to be orthogonal. In terms of components, this second condition is

$$\sum_j a_{ij}a^T_{jk} = \sum_j a_{ij}a_{kj} = \delta_{ik}\,. \tag{3.4b}$$

It is also quite useful to realize that an orthogonal matrix is one whose rows (or columns) form an orthonormal set of vectors. If we let $A_i$ denote the $i$th row of $A$, then from (3.4b) we have

$$A_i\cdot A_k = \sum_j a_{ij}a_{kj} = \delta_{ik}\,.$$

Similarly, if we let $A^i$ denote the $i$th column of $A$, then (3.4a) may be written

$$A^i\cdot A^k = \sum_j a_{ji}a_{jk} = \delta_{ik}\,.$$

Conversely, it is easy to see that any matrix whose rows (or columns) form an orthonormal set must be an orthogonal matrix.

In the case of complex matrices, repeating the above arguments using $\|x\|^2 = \sum_i x_i^* x_i$, it is not hard to show that $A^\dagger A = AA^\dagger = I$ where $A^\dagger = A^{*T}$. In this case, the matrix $A$ is said to be unitary. Thus a complex matrix is unitary if and only if its rows (or columns) form an orthonormal set under the standard Hermitian inner product.

To describe a passive transformation, consider a linear transformation $P$ acting on the basis vectors. In other words, we perform a change of basis defined by

$$\bar{e}_i = P(e_i) = \sum_j e_j p_{ji}\,.$$

In this situation, the linear transformation $P$ is called the transition matrix.

Suppose we have a linear operator $A$ defined on our space. Then with respect to a basis $\{e_i\}$ this operator has the matrix representation $(a_{ij})$ defined by (dropping the parentheses for simplicity)

$$Ae_i = \sum_j e_j a_{ji}\,.$$

And with respect to another basis $\{\bar{e}_i\}$ it has the representation $(\bar{a}_{ij})$ defined by

$$A\bar{e}_i = \sum_j \bar{e}_j\bar{a}_{ji}\,.$$

But $\bar{e}_i = Pe_i$, so the left side of this equation may be written

$$A\bar{e}_i = APe_i = A\Bigl(\sum_j e_j p_{ji}\Bigr) = \sum_j p_{ji}Ae_j = \sum_{j,k} p_{ji}e_k a_{kj} = \sum_{j,k} e_k a_{kj}p_{ji}$$

while the right side is

$$\sum_j \bar{e}_j\bar{a}_{ji} = \sum_j (Pe_j)\bar{a}_{ji} = \sum_{j,k} e_k p_{kj}\bar{a}_{ji}\,.$$

Equating these two expressions and using the linear independence of the $e_k$, we have

$$\sum_j p_{kj}\bar{a}_{ji} = \sum_j a_{kj}p_{ji}$$

which in matrix notation is just $P\bar{A} = AP$. Since each basis may be written in terms of the other, the matrix $P$ must be nonsingular so that $P^{-1}$ exists, and hence we have shown that

$$\bar{A} = P^{-1}AP\,. \tag{3.5}$$

This extremely important equation relates the matrix representation of an operator $A$ with respect to a basis $\{e_i\}$ to its representation $\bar{A}$ with respect to a basis $\{\bar{e}_i\}$ defined by $\bar{e}_i = Pe_i$. This is called a similarity transformation. In fact, this is exactly what you do when you diagonalize a matrix. Starting with a matrix $A$ relative to a given basis, you first find its eigenvalues, and then use these to find the corresponding eigenvectors. Letting $P$ be the matrix whose columns are precisely these eigenvectors, we then have $P^{-1}AP = D$ where $D$ is a diagonal matrix with diagonal entries that are just the eigenvalues of $A$.
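As a concrete illustration (my own, tying this back to Section 2), this is exactly the computation we did for $S_x$: the columns of $P$ are the eigenvectors (2.9), and $P^{-1}S_xP$ is diagonal with the eigenvalues $\pm\hbar/2$. With $\hbar = 1$:

```python
# Diagonalize S_x from (2.5) by a similarity transformation.
import numpy as np

Sx = np.array([[0, 1], [1, 0]])/2
evals, P = np.linalg.eigh(Sx)        # columns of P are the eigenvectors
D = np.linalg.inv(P) @ Sx @ P
assert np.allclose(D, np.diag(evals))
print(np.round(D, 12))               # diag(-1/2, +1/2)
```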

4 Angular Momentum and Rotations

4.1 Angular Momentum as the Generator of Rotations

Before we begin with the physics, let me first prove an extremely useful mathematical result. For any complex number (or matrix) $x$, let $f_n(x) = (1 + x/n)^n$, and consider the limit

$$L = \lim_{n\to\infty} f_n(x) = \lim_{n\to\infty}\left(1 + \frac{x}{n}\right)^n.$$

Since the logarithm is a continuous function, we can interchange limits and the log, so we have

$$\ln L = \ln\lim_{n\to\infty}\left(1 + \frac{x}{n}\right)^n = \lim_{n\to\infty}\ln\left(1 + \frac{x}{n}\right)^n = \lim_{n\to\infty} n\ln\left(1 + \frac{x}{n}\right) = \lim_{n\to\infty}\frac{\ln(1 + x/n)}{1/n}\,.$$

As $n\to\infty$, both the numerator and denominator go to zero, so we apply l'Hôpital's rule and take the derivative of both numerator and denominator with respect to $n$:

$$\ln L = \lim_{n\to\infty}\frac{(-x/n^2)/(1 + x/n)}{-1/n^2} = \lim_{n\to\infty}\frac{x}{1 + x/n} = x\,.$$

Exponentiating both sides, we have proved

$$\lim_{n\to\infty}\left(1 + \frac{x}{n}\right)^n = e^x\,. \tag{4.1}$$

As I mentioned above, this result also applies if $x$ is an $n\times n$ complex matrix $A$. This is a consequence of the fact (which I state without proof) that for any such matrix, the series

$$\sum_{n=0}^\infty\frac{A^n}{n!}$$

converges, and we take this as the definition of $e^A$. (For a proof, see one of my linear algebra books or almost any book on elementary real analysis or advanced calculus.)
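(A quick numerical confirmation of (4.1) for a matrix argument, added by me; the random test matrix is an arbitrary choice.)

```python
# Compare (1 + A/n)^n with expm(A) for a random 3x3 matrix.
import numpy as np
from numpy.linalg import matrix_power
from scipy.linalg import expm

A = np.random.default_rng(0).normal(size=(3, 3))
for n in (10, 1000, 100000):
    err = np.max(np.abs(matrix_power(np.eye(3) + A/n, n) - expm(A)))
    print(n, err)    # the error shrinks roughly like 1/n
```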

Now let us return to physics. First consider the translation of a wave function $\psi(x)$ by an amount $a$.

[Figure: the graph of $\psi$, centered at $x$, is translated into the graph of $\psi'$, centered at $x + a$.]

This results in a new function ψ′(x + a)= ψ(x), or

$$\psi'(x) = \psi(x - a)\,.$$

For infinitesimal $a$, we expand this in a Taylor series to first order in $a$:

$$\psi(x - a) = \psi(x) - a\frac{d\psi}{dx}\,.$$

Using $p = -i\hbar(d/dx)$, we can write this in the form

$$\psi(x - a) = \left(1 - \frac{ia}{\hbar}\,p\right)\psi(x)\,.$$

For a finite translation, we consider a sequence of $n$ infinitesimal translations $a/n$ and let $n\to\infty$. Applying (4.1) we have

$$\psi'(x) = \psi(x - a) = \lim_{n\to\infty}\left(1 - \frac{iap/\hbar}{n}\right)^n\psi(x) = e^{-iap/\hbar}\,\psi(x)\,. \tag{4.2}$$

Thus we see that the translated states $\psi'(x) = \psi(x - a)$ are the result of applying the operator

$$T_a = e^{-iap/\hbar}$$

to the wave function $\psi(x)$. In the case of a translation in three dimensions, we have the general translation operator

$$T_\mathbf{a} = e^{-i\mathbf{a}\cdot\mathbf{p}/\hbar}\,. \tag{4.3}$$

We say that p generates translations. Note that the composition of a translation by a and a translation by b yields a translation by a + b as it should:

$$T_\mathbf{a}T_\mathbf{b} = e^{-i\mathbf{a}\cdot\mathbf{p}/\hbar}\,e^{-i\mathbf{b}\cdot\mathbf{p}/\hbar} = e^{-i(\mathbf{a}+\mathbf{b})\cdot\mathbf{p}/\hbar} = T_{\mathbf{a}+\mathbf{b}}\,.$$
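Here is a small numerical illustration of (4.2) that I have added: in momentum space $p$ is diagonal, so applying $e^{-iap/\hbar}$ there rigidly shifts a wave packet by $a$. The grid, the Gaussian packet, and $\hbar = 1$ are my own choices.

```python
# Translate a Gaussian by a = 3 using exp(-i a p) in momentum space.
import numpy as np

x = np.linspace(-10, 10, 1024, endpoint=False)
dx = x[1] - x[0]
psi = np.exp(-x**2)                        # packet centered at 0
k = 2*np.pi*np.fft.fftfreq(len(x), dx)     # momentum grid (p = k for hbar = 1)
a = 3.0
psi_shifted = np.fft.ifft(np.exp(-1j*a*k)*np.fft.fft(psi))
assert np.allclose(psi_shifted.real, np.exp(-(x - a)**2), atol=1e-8)
```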

Also, noting that $T_{-\mathbf{a}}T_\mathbf{a} = T_\mathbf{a}T_{-\mathbf{a}} = T_0 = 1$, we see that $T_\mathbf{a}^{-1} = T_{-\mathbf{a}}$. Thus we have shown that the composition of translations is a translation, that translating by 0 is the identity transformation, and that every translation has an inverse. Any collection of objects together with a composition law obeying closure, the existence of an identity, and the existence of an inverse to every object in the collection is said to form a group, and this group in particular is called the translation group. The composition of group elements is called group multiplication. In the case of the translation group, it is also important to realize that $T_\mathbf{a}T_\mathbf{b} = T_\mathbf{b}T_\mathbf{a}$. Any group with the property that multiplication is commutative is called an abelian group. Another example of an abelian group is the set of all rotations about a fixed axis in $\mathbb{R}^3$. (But the set of all rotations in three dimensions is most definitely not abelian, as we are about to see.)

Now that we have shown that the linear momentum operator $\mathbf{p}$ generates translations, let us show that the angular momentum operator is the generator of rotations. In particular, we will show explicitly that the orbital angular momentum operator $\mathbf{L}$ generates spatial rotations in $\mathbb{R}^3$. In the case of the spin operator $\mathbf{S}$, which acts on the very abstract space of spin states, we will define rotations by analogy.

Let me make a slight change of notation and denote the rotated position vector by a prime instead of a bar. If we rotate a vector $\mathbf{x}$ in $\mathbb{R}^3$, then we obtain a new vector $\mathbf{x}' = R(\theta)\mathbf{x}$ where $R(\theta)$ is the matrix that represents the rotation. In two dimensions this is

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}.$$

If we have a scalar wavefunction $\psi(\mathbf{x})$, then under rotation we obtain a new wavefunction $\psi_R(\mathbf{x})$, where $\psi(\mathbf{x}) = \psi_R(R(\theta)\mathbf{x}) = \psi_R(\mathbf{x}')$. (See the figure below.)

[Figure: the wavefunction $\psi$ centered on $\mathbf{x}$ is carried by the rotation through $\theta$ into $\psi_R$ centered on $\mathbf{x}'$.]

Alternatively, we can write

$$\psi_R(\mathbf{x}) = \psi(R^{-1}(\theta)\mathbf{x})\,.$$

Since $R$ is an orthogonal transformation (it preserves the length of $\mathbf{x}$) we know that $R^{-1}(\theta) = R^T(\theta)$ (this really should be written as $R(\theta)^{-1} = R(\theta)^T$, but it rarely is), and in the case where $\theta \ll 1$ we then have

$$R^{-1}(\theta)\mathbf{x} = \begin{bmatrix} 1 & \theta \\ -\theta & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x + \theta y \\ -\theta x + y \end{bmatrix}.$$

Expanding $\psi(R^{-1}(\theta)\mathbf{x})$ with these values for $x$ and $y$ we have (letting $\partial_i = \partial/\partial x_i$ for simplicity)

$$\psi_R(\mathbf{x}) = \psi(x + \theta y,\ y - \theta x) = \psi(\mathbf{x}) - \theta[x\partial_y - y\partial_x]\psi(\mathbf{x})$$

or, using $p_i = -i\hbar\partial_i$, this is

$$\psi_R(\mathbf{x}) = \psi(\mathbf{x}) - \frac{i}{\hbar}\theta[xp_y - yp_x]\psi(\mathbf{x}) = \left(1 - \frac{i}{\hbar}\theta L_z\right)\psi(\mathbf{x})\,.$$

Thus we see that angular momentum is indeed the generator of rotations. For finite $\theta$ we exponentiate this to write $\psi_R(\mathbf{x}) = e^{-(i/\hbar)\theta L_z}\psi(\mathbf{x})$, and in the case of an arbitrary angle $\boldsymbol{\theta}$ in $\mathbb{R}^3$ this becomes

$$\psi_R(\mathbf{x}) = e^{-(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{L}}\,\psi(\mathbf{x})\,. \tag{4.4}$$

In an abstract notation we write this as

$$|\psi_R\rangle = U(R)|\psi\rangle$$

where $U(R) = e^{-(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{L}}$. For simplicity and clarity, we have written $U(R)$ rather than the more complete $U(R(\boldsymbol{\theta}))$, which we continue to do unless the more complete notation is needed.

What we just did was for orbital angular momentum. In the case of spin there is no classical counterpart, so we define the spin angular momentum operator $\mathbf{S}$ to obey the usual commutation relations, and then the spin states will transform under the rotation operator $e^{-(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}}$. (We will come back to justifying this at the end of this section.) It is common to use the symbol $\mathbf{J}$ to stand for any type of angular momentum operator, for example $\mathbf{L}$, $\mathbf{S}$ or $\mathbf{L} + \mathbf{S}$, and this is what we shall do from now on. In the case where $\mathbf{J} = \mathbf{L} + \mathbf{S}$, $\mathbf{J}$ is called the total angular momentum operator. (The example above applied to a scalar wavefunction $\psi$, which represents a spinless particle. Particles with spin are described by vector wavefunctions $\boldsymbol{\psi}$, and in this case the spin operator $\mathbf{S}$ serves to mix up the components of $\boldsymbol{\psi}$ under rotations.)

The angular momentum operators $J^2 = \mathbf{J}\cdot\mathbf{J}$ and $J_z$ commute, and hence they have simultaneous eigenstates, denoted by $|jm\rangle$, with the property that (with $\hbar = 1$)

$$J^2|jm\rangle = j(j+1)|jm\rangle \qquad\text{and}\qquad J_z|jm\rangle = m|jm\rangle$$

where $m$ takes the $2j+1$ values $-j \le m \le j$. Since the rotation operator is given by $U(R) = e^{-i\boldsymbol{\theta}\cdot\mathbf{J}}$, we see that $[U(R), J^2] = 0$. Then

$$J^2 U(R)|jm\rangle = U(R)J^2|jm\rangle = j(j+1)\,U(R)|jm\rangle$$

so that the magnitude of the angular momentum can't change under rotations. However, $[U(R), J_z] \neq 0$, so the rotated state will no longer be an eigenstate of $J_z$ with the same eigenvalue $m$.

Note that acting to the right we have the matrix element

$$\langle j'm'|J^2 U(R)|jm\rangle = \langle j'm'|U(R)J^2|jm\rangle = j(j+1)\,\langle j'm'|U(R)|jm\rangle$$

while acting to the left gives

$$\langle j'm'|J^2 U(R)|jm\rangle = j'(j'+1)\,\langle j'm'|U(R)|jm\rangle$$

and therefore

$$\langle j'm'|U(R)|jm\rangle = 0 \quad\text{unless } j = j'\,. \tag{4.5}$$

We also make note of the fact that acting with $J^2$ and $J_z$ in both directions yields

$$\langle j'm'|J^2|jm\rangle = j'(j'+1)\,\langle j'm'|jm\rangle = j(j+1)\,\langle j'm'|jm\rangle$$

and

$$\langle jm'|J_z|jm\rangle = m'\,\langle jm'|jm\rangle = m\,\langle jm'|jm\rangle$$

so that (as you should have already known)

$$\langle j'm'|jm\rangle = \delta_{j'j}\,\delta_{m'm}\,. \tag{4.6}$$

In other words, the states $|jm\rangle$ form a complete orthonormal set, and the state $U(R)|jm\rangle$ must be of the form

$$U(R)|jm\rangle = \sum_{m'}|jm'\rangle\langle jm'|U(R)|jm\rangle = \sum_{m'}|jm'\rangle\, D^{(j)}_{m'm}(\theta) \tag{4.7}$$

where

$$D^{(j)}_{m'm}(\theta) := \langle jm'|U(R)|jm\rangle = \langle jm'|e^{-i\boldsymbol{\theta}\cdot\mathbf{J}}|jm\rangle\,. \tag{4.8}$$

(Notice the order of subscripts in the sum in equation (4.7). This is the same as the usual definition of the matrix representation $[T]_e = (a_{ij})$ of a linear operator $T : V \to V$ defined by $Te_i = \sum_j e_j a_{ji}$.)

Since for each $j$ there are $2j+1$ values of $m$, we have constructed a $(2j+1)\times(2j+1)$ matrix $D^{(j)}(\theta)$ for each value of $j$. This matrix is referred to as the $j$th irreducible representation of the rotation group. The word "irreducible" means that there is no subset of the space of states $\{|j\ j\rangle, |j\ j-1\rangle, \dots, |j\ {-j}\rangle\}$ that transforms into itself under all rotations $U(R(\theta))$. Put another way, a representation is irreducible if the vector space on which it acts has no invariant subspaces.

Now, it is a general result of the theory of group representations that any representation of a finite group or compact Lie group is equivalent to a unitary representation, and any reducible unitary representation is completely reducible. Therefore, any representation of a finite group or compact Lie group is either already irreducible or else it is completely reducible (i.e., the space on which the operators act can be put into block diagonal form where each block corresponds to an invariant subspace). However, at this point we don't want to get into the general theory of representations, so let us prove directly that the representations $D^{(j)}(\theta)$ of the rotation group are irreducible.

Recall that the raising and lowering operators $J_\pm$ are defined by

$$J_\pm|jm\rangle = (J_x \pm iJ_y)|jm\rangle = \sqrt{j(j+1) - m(m \pm 1)}\;|j, m \pm 1\rangle\,.$$

In particular, the operators $J_\pm$ don't change the value of $j$ when acting on the states $|jm\rangle$. (This is just the content of equation (1.4).)

Theorem 4.1. The representations $D^{(j)}(\theta)$ of the rotation group are irreducible. In other words, there is no subset of the space of states $|jm\rangle$ (for fixed $j$) that transforms among itself under all rotations.

Proof. Fix $j$ and let $V$ be the space spanned by the $2j+1$ vectors $|jm\rangle := |m\rangle$. We claim that $V$ is irreducible with respect to rotations $U(R)$. This means that given any $|u\rangle \in V$, the set of all vectors of the form $U(R)|u\rangle$ (i.e., for all rotations $U(R)$) spans $V$. (Otherwise, if there exists $|v\rangle$ such that $\{U(R)|v\rangle\}$ didn't span $V$, then $V$ would be reducible since the collection of all such $U(R)|v\rangle$ would define an invariant subspace.)

To show $V$ is irreducible, let $\widetilde{V} = \mathrm{Span}\{U(R)|u\rangle\}$ where $|u\rangle \in V$ is arbitrary but fixed. For infinitesimal $\boldsymbol{\theta}$ we have $U(R(\boldsymbol{\theta})) = e^{-i\boldsymbol{\theta}\cdot\mathbf{J}} = 1 - i\boldsymbol{\theta}\cdot\mathbf{J}$, and in particular $U(R(\varepsilon\hat{\mathbf{x}})) = 1 - i\varepsilon J_x$ and $U(R(\varepsilon\hat{\mathbf{y}})) = 1 - i\varepsilon J_y$. Then

$$J_\pm|u\rangle = (J_x \pm iJ_y)|u\rangle = \left\{\frac{1}{i\varepsilon}\bigl[1 - U(R(\varepsilon\hat{\mathbf{x}}))\bigr] \pm i\,\frac{1}{i\varepsilon}\bigl[1 - U(R(\varepsilon\hat{\mathbf{y}}))\bigr]\right\}|u\rangle = \frac{1}{\varepsilon}\Bigl\{\pm\bigl[1 - U(R(\varepsilon\hat{\mathbf{y}}))\bigr] - i + iU(R(\varepsilon\hat{\mathbf{x}}))\Bigr\}|u\rangle \in \widetilde{V}$$

by definition of $\widetilde{V}$ and vector spaces. Since $J_\pm$ acting on $|u\rangle$ is a linear combination of rotations acting on $|u\rangle$ and this is in $\widetilde{V}$, we see that $(J_\pm)^2$ acting on $|u\rangle$ is again some other linear combination of rotations acting on $|u\rangle$ and hence is also in $\widetilde{V}$. So in general, we see that $(J_\pm)^n|u\rangle$ is again in $\widetilde{V}$.

By definition of $V$, we may write (since $j$ is fixed)

$$|u\rangle = \sum_m |m\rangle\langle m|u\rangle = |m_0\rangle\langle m_0|u\rangle + |m_0+1\rangle\langle m_0+1|u\rangle + \cdots + |j\rangle\langle j|u\rangle$$

where $m_0$ is simply the smallest value of $m$ for which $\langle m|u\rangle \neq 0$ (and not all of the terms up to $\langle j|u\rangle$ are necessarily nonzero). Acting on this with $J_+$ we obtain (leaving off the constant factors and noting that $J_+|j\rangle = 0$)

$$J_+|u\rangle \sim |m_0+1\rangle\langle m_0|u\rangle + |m_0+2\rangle\langle m_0+1|u\rangle + \cdots + |j\rangle\langle j-1|u\rangle \in \widetilde{V}\,.$$

Since $\langle m_0|u\rangle \neq 0$ by assumption, it follows that $|m_0+1\rangle \in \widetilde{V}$.

We can continue to act on $|u\rangle$ with $J_+$ a total of $j - m_0$ times, at which point we will have shown that $|m_0 + (j - m_0)\rangle = |j\rangle := |jj\rangle \in \widetilde{V}$. Now we can apply $J_-$ $2j+1$ times to $|jj\rangle$ to conclude that the $2j+1$ vectors $|jm\rangle$ all belong to $\widetilde{V}$, and thus $V = \widetilde{V}$. (This is because we have really just applied the combination of rotations $(J_-)^{2j+1}(J_+)^{j-m_0}$ to $|u\rangle$, and each step along the way is just some vector in $\widetilde{V}$.)

As the last subject of this section, let us go back and show that $e^{-(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}}$ really represents a rotation in spin space. Let $R$ denote the unitary rotation operator, and

let $A$ be an arbitrary observable. Rotating our system, a state $|\psi\rangle$ is transformed into a rotated state $|\psi_R\rangle = R|\psi\rangle$. In the original system, a measurement of $A$ yields the result $\langle\psi|A|\psi\rangle$. Under rotations, we measure a rotated observable $A_R$ in the rotated states. Since the physical results of a measurement can't change just because of our coordinate description, we must have

$$\langle\psi_R|A_R|\psi_R\rangle = \langle\psi|A|\psi\rangle\,.$$

But

$$\langle\psi_R|A_R|\psi_R\rangle = \langle\psi|R^\dagger A_R R|\psi\rangle$$

and hence we have $R^\dagger A_R R = A$ or

$$A_R = RAR^\dagger = RAR^{-1}\,. \tag{4.9}$$

(Compare this to (3.5). Why do you think there is a difference?)

Now, what about spin one-half particles? To say that a measurement of the spin in a particular direction $\hat{\mathbf{m}}$ can only yield one of two possible results means that the operator $\mathbf{S}\cdot\hat{\mathbf{m}}$ has only the two eigenvalues $\pm\hbar/2$. Let us denote the corresponding states by $|\hat{\mathbf{m}}\,\pm\rangle$. In other words, we have

$$(\mathbf{S}\cdot\hat{\mathbf{m}})|\hat{\mathbf{m}}\,\pm\rangle = \pm\frac{\hbar}{2}|\hat{\mathbf{m}}\,\pm\rangle\,.$$

(This is just the generalization of (2.1c) for the case of spin one-half. It also applies to arbitrary spin if we write $(\mathbf{S}\cdot\hat{\mathbf{m}})|\hat{\mathbf{m}}\, m_s\rangle = m_s|\hat{\mathbf{m}}\, m_s\rangle$.)

Let us rotate the unit vector $\hat{\mathbf{m}}$ by $\boldsymbol{\theta}$ to obtain another unit vector $\hat{\mathbf{n}}$, and consider the operator

$$e^{-(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}}(\mathbf{S}\cdot\hat{\mathbf{m}})e^{(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}}\,.$$

Acting on the state $e^{-(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}}|\hat{\mathbf{m}}\,\pm\rangle$ we have

$$\bigl[e^{-(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}}(\mathbf{S}\cdot\hat{\mathbf{m}})e^{(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}}\bigr]e^{-(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}}|\hat{\mathbf{m}}\,\pm\rangle = e^{-(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}}(\mathbf{S}\cdot\hat{\mathbf{m}})|\hat{\mathbf{m}}\,\pm\rangle = \pm\frac{\hbar}{2}\,e^{-(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}}|\hat{\mathbf{m}}\,\pm\rangle\,.$$

Therefore, if we define the rotated state

$$|\hat{\mathbf{n}}\,\pm\rangle := e^{-(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}}|\hat{\mathbf{m}}\,\pm\rangle \tag{4.10a}$$

we see from (4.9) that we also have the rotated operator

$$\mathbf{S}\cdot\hat{\mathbf{n}} := e^{-(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}}(\mathbf{S}\cdot\hat{\mathbf{m}})e^{(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}} \tag{4.10b}$$

with the property that

$$(\mathbf{S}\cdot\hat{\mathbf{n}})|\hat{\mathbf{n}}\,\pm\rangle = \pm\frac{\hbar}{2}|\hat{\mathbf{n}}\,\pm\rangle \tag{4.10c}$$

as it should. This shows that $e^{-(i/\hbar)\boldsymbol{\theta}\cdot\mathbf{S}}$ does indeed act as the rotation operator on the abstract spin states.

What we have shown, then, is that starting from an eigenstate $|\hat{\mathbf{m}}\,\pm\rangle$ of the spin operator $\mathbf{S}\cdot\hat{\mathbf{m}}$, we rotate by an angle $\boldsymbol{\theta}$ to obtain the rotated state $|\hat{\mathbf{n}}\,\pm\rangle$ that is an eigenstate of $\mathbf{S}\cdot\hat{\mathbf{n}}$ with the same eigenvalues.

For spin one-half we have $\mathbf{S} = \boldsymbol{\sigma}/2$ (with $\hbar = 1$), and using (2.8), you can show that the rotation operator becomes

$$e^{-i\boldsymbol{\sigma}\cdot\boldsymbol{\theta}/2} = \sum_{n=0}^\infty\frac{(-i\boldsymbol{\sigma}\cdot\boldsymbol{\theta}/2)^n}{n!} = I\cos\frac{\theta}{2} - i\,\boldsymbol{\sigma}\cdot\hat{\boldsymbol{\theta}}\,\sin\frac{\theta}{2}\,. \tag{4.11}$$

Example 4.1. Let us derive equations (2.9) using our rotation formalism. We start by letting $\hat{\mathbf{m}} = \hat{\mathbf{z}}$ and $\hat{\mathbf{n}} = \hat{\mathbf{x}}$, so that $\boldsymbol{\theta} = (\pi/2)\hat{\mathbf{y}}$. Then the rotation operator is

$$R(\boldsymbol{\theta}) = e^{-i\pi\sigma_y/4} = I\cos\frac{\pi}{4} - i\sigma_y\sin\frac{\pi}{4} = \begin{bmatrix} \cos\pi/4 & -\sin\pi/4 \\ \sin\pi/4 & \cos\pi/4 \end{bmatrix} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}.$$

Acting on the state

$$|\hat{\mathbf{m}}\,+\rangle = |\hat{\mathbf{z}}\,+\rangle$$

we have

$$R(\boldsymbol{\theta})|\hat{\mathbf{z}}\,+\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = |\hat{\mathbf{x}}\,+\rangle$$

which is just (2.9a). Similarly, it is also easy to see that

$$R(\boldsymbol{\theta})|\hat{\mathbf{z}}\,-\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \frac{1}{\sqrt{2}}\begin{bmatrix} -1 \\ 1 \end{bmatrix} = |\hat{\mathbf{x}}\,-\rangle$$

which is the same as (2.9b) up to an arbitrary phase.

So far we have shown that the rotation operator $R(\boldsymbol{\theta}) = R((\pi/2)\hat{\mathbf{y}})$ indeed takes the eigenstates of $S_z$ into the eigenstates of $S_x$. Let us also show that it takes $S_z = \sigma_z/2$ into $S_x = \sigma_x/2$. This is a straightforward computation:

$$e^{-i\pi\sigma_y/4}\,\sigma_z\, e^{i\pi\sigma_y/4} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 0 & 2 \\ 2 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \sigma_x$$

as it should.
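Here is a numerical re-run of Example 4.1 that I have added: it exponentiates $-i(\pi/4)\sigma_y$ directly (rather than using the closed form (4.11)) and confirms both the state rotations and the conjugation $R\sigma_z R^{-1} = \sigma_x$.

```python
# Check Example 4.1: exp(-i*pi/4 * sigma_y) rotates z-eigenstates to x-eigenstates.
import numpy as np
from scipy.linalg import expm

sy = np.array([[0, -1j], [1j, 0]])
R = expm(-1j*np.pi/4 * sy)                       # rotation by pi/2 about y
zup, zdn = np.array([1, 0]), np.array([0, 1])
xup = np.array([1, 1])/np.sqrt(2)
xdn = np.array([1, -1])/np.sqrt(2)
assert np.allclose(R @ zup, xup)                 # (2.9a)
assert np.allclose(R @ zdn, -xdn)                # (2.9b) up to a phase
sz = np.diag([1.0, -1.0])
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
assert np.allclose(R @ sz @ R.conj().T, sx)      # R sigma_z R^dagger = sigma_x
```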

4.2 Spin Dynamics in a Magnetic Field

As we mentioned at the beginning of Section 2, a particle of charge $q$ and mass $m$ moving in a circular orbit has a magnetic moment given classically by

$$\mu_l = \frac{q}{2mc}\,L\,.$$

This relation is also true in quantum mechanics for orbital angular momentum, but for spin angular momentum we must write

$$\mu_s = g\,\frac{q}{2mc}\,S \tag{4.12}$$

where the constant $g$ is called a g-factor. For an electron, $g$ is found by experiment to be very close to 2. (The Dirac equation predicts that $g = 2$ exactly, and higher order corrections in QED show a slight deviation from this value.) And for a proton, $g = 5.59$. Furthermore, electrically neutral particles such as the neutron also have magnetic moments, which you can think of as arising from some sort of internal charge distribution or current. (But don't think too hard.)

In the presence of a magnetic field $\mathbf{B}(t)$, a magnetic moment $\boldsymbol{\mu}$ feels an applied torque $\boldsymbol{\mu}\times\mathbf{B}$, and hence possesses a potential energy $-\boldsymbol{\mu}\cdot\mathbf{B}$. Because of this torque, the magnetic moment (and hence the particle) will precess about the field. (Recall that torque is equal to the time rate of change of angular momentum, and hence a non-zero torque means that the angular momentum vector of the particle must be changing.) Let us see what we can learn about this precession.

We restrict consideration to particles essentially at rest (i.e., with no angular momentum) in a uniform magnetic field $\mathbf{B}(t)$, and write $\boldsymbol{\mu}_s = \gamma\mathbf{S}$ where $\gamma = gq/2mc$ is called the gyromagnetic ratio. (If the field isn't uniform, there will be a force equal to the negative gradient of the potential energy, and the particle will move in space. This is how the Stern-Gerlach experiment works, as we will see below.) In the case of a spin one-half particle, the Hamiltonian is given by

$$H_s = -\boldsymbol{\mu}_s\cdot\mathbf{B}(t) = -\frac{gq}{2mc}\,\mathbf{S}\cdot\mathbf{B}(t) = -\frac{gq\hbar}{4mc}\,\boldsymbol{\sigma}\cdot\mathbf{B}(t)\,. \tag{4.13}$$

To understand how spin behaves, we look at this motion from the point of view of the Heisenberg picture, which we now describe. (See Griffiths, Example 4.3 for a somewhat different approach.)

The formulation of quantum mechanics that we have used so far is called the Schrödinger picture (SP). In this formulation, the operators are independent of time, but the states (wave functions) evolve in time. The stationary state solutions to the Schrödinger equation

$$H|\Psi(t)\rangle = i\hbar\,\frac{\partial}{\partial t}|\Psi(t)\rangle$$

(where $H$ is independent of time) are given by

$$|\Psi(t)\rangle = e^{-iEt/\hbar}|\psi\rangle = e^{-iHt/\hbar}|\psi\rangle \tag{4.14}$$

where $H|\psi\rangle = E|\psi\rangle$ and $|\psi\rangle$ is independent of time.

In the Heisenberg picture (HP), the states are independent of time, and the operators evolve in time according to an equation of motion. To derive the equations of motion for the operators in this picture, we freeze out the states at time $t = 0$:

$$|\psi_H\rangle := |\Psi(0)\rangle = |\psi\rangle = e^{+iHt/\hbar}|\Psi(t)\rangle\,. \tag{4.15a}$$

In the SP, the expectation value of an observable $\mathcal{O}$ (possibly time dependent) is given by

$$\langle\mathcal{O}\rangle = \langle\Psi(t)|\mathcal{O}|\Psi(t)\rangle = \langle\psi|e^{iHt/\hbar}\,\mathcal{O}\,e^{-iHt/\hbar}|\psi\rangle\,.$$

In the HP, we want the same measurable result for $\langle\mathcal{O}\rangle$, so we have

$$\langle\mathcal{O}\rangle = \langle\psi_H|\mathcal{O}_H|\psi_H\rangle = \langle\psi|\mathcal{O}_H|\psi\rangle\,.$$

Equating both versions of these expectation values, we conclude that

$$\mathcal{O}_H = e^{iHt/\hbar}\,\mathcal{O}\,e^{-iHt/\hbar} \tag{4.15b}$$

where $\mathcal{O}_H = \mathcal{O}_H(t)$ is the representation of the observable $\mathcal{O}$ in the HP. Note in particular that the Hamiltonian is the same in both pictures because

$$H_H = e^{iHt/\hbar}He^{-iHt/\hbar} = e^{iHt/\hbar}e^{-iHt/\hbar}H = H\,.$$

Now take the time derivative of (4.15b) and use the fact that $H$ commutes with $e^{\pm iHt/\hbar}$:

$$\frac{d\mathcal O_H}{dt} = \frac{i}{\hbar}\,He^{iHt/\hbar}\,\mathcal O\,e^{-iHt/\hbar} - \frac{i}{\hbar}\,e^{iHt/\hbar}\,\mathcal O\,e^{-iHt/\hbar}\,H + e^{iHt/\hbar}\,\frac{\partial\mathcal O}{\partial t}\,e^{-iHt/\hbar}$$

$$= \frac{i}{\hbar}\,[H, \mathcal O_H] + e^{iHt/\hbar}\,\frac{\partial\mathcal O}{\partial t}\,e^{-iHt/\hbar}\,.$$
If $\mathcal O$ has no explicit time dependence (the operator $pt$, for example, does have explicit time dependence), then this simplifies to the Heisenberg equation of motion
$$\frac{d\mathcal O_H}{dt} = \frac{i}{\hbar}\,[H, \mathcal O_H]\,. \tag{4.16}$$

Also note that if $H$ is independent of time, then so is $H_H = H$, and (4.16) then shows that $\langle H\rangle = \text{const}$ and energy is conserved.

Returning to our problem, we use the Hamiltonian (4.13) in the equation of motion (4.16) to write
$$\frac{dS_i}{dt} = \frac{i}{\hbar}\,[H_s, S_i] = -\frac{i}{\hbar}\frac{gq}{2mc}B_j[S_j, S_i] = -\frac{i}{\hbar}\frac{gq}{2mc}B_j\,i\hbar\,\varepsilon_{jik}S_k = +\frac{gq}{2mc}\,\varepsilon_{ikj}S_kB_j = \frac{gq}{2mc}(\mathbf S\times\mathbf B)_i$$
or
$$\frac{d\mathbf S(t)}{dt} = \frac{gq}{2mc}\,\mathbf S(t)\times\mathbf B(t) = \boldsymbol\mu_s(t)\times\mathbf B(t)\,. \tag{4.17}$$
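As a sanity check on (4.17), the following sketch verifies the matrix identity $(i/\hbar)[H_s, S_i] = \gamma(\mathbf S\times\mathbf B)_i$ componentwise for a spin one-half particle. The numerical values of $\gamma = gq/2mc$ and $\mathbf B$ are arbitrary stand-ins, and units with $\hbar = 1$ are assumed:

```python
import numpy as np

hbar = 1.0
S = [hbar / 2 * np.array(m) for m in
     ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]
gamma = 0.7                        # stand-in for gq/2mc
B = np.array([0.3, -1.2, 0.8])     # arbitrary uniform field

H = -gamma * sum(Bj * Sj for Bj, Sj in zip(B, S))   # H = -gamma S.B

for i in range(3):
    lhs = (1j / hbar) * (H @ S[i] - S[i] @ H)       # (i/hbar)[H, S_i]
    j, k = (i + 1) % 3, (i + 2) % 3
    rhs = gamma * (S[j] * B[k] - S[k] * B[j])       # gamma (S x B)_i
    assert np.allclose(lhs, rhs)
print("dS/dt = gamma S x B holds componentwise")
```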

This is just the operator version of the classical equation that says the time rate of change of angular momentum is equal to the applied torque. From this equation we see that the spin of a positively charged particle ($q > 0$) will precess in the negative sense about $\mathbf B$, while the spin of a negatively charged particle ($q < 0$) will precess in the positive sense.

Figure 1: Precession for $q < 0$ (the spin $\mathbf S$ and moment $\boldsymbol\mu_s$ precessing about $\mathbf B$).

Let us specialize to the case where $\mathbf B$ is both uniform and independent of time, and write $\mathbf B = B_0\hat{\mathbf z}$. Then the three components of (4.17) become
$$\frac{dS_x(t)}{dt} = \frac{gqB_0}{2mc}S_y(t) \qquad \frac{dS_y(t)}{dt} = -\frac{gqB_0}{2mc}S_x(t) \qquad \frac{dS_z(t)}{dt} = 0\,. \tag{4.18}$$
Defining
$$\omega_0 = \frac{gqB_0}{2mc} \tag{4.19}$$
we can combine the $S_x$ and $S_y$ equations to obtain
$$\frac{d^2S_x(t)}{dt^2} = -\omega_0^2\,S_x(t)$$
and exactly the same equation for $S_y(t)$. These equations have the solutions

$$S_x(t) = a\cos\omega_0t + b\sin\omega_0t$$
$$S_y(t) = c\cos\omega_0t + d\sin\omega_0t$$
$$S_z(t) = S_z(0)\,.$$

Clearly, $S_x(0) = a$ and $S_y(0) = c$. Also, from the equations of motion (4.18) we have $\omega_0S_y(0) = (dS_x/dt)(0) = \omega_0b$ and $-\omega_0S_x(0) = (dS_y/dt)(0) = \omega_0d$, so that our solutions are

$$S_x(t) = S_x(0)\cos\omega_0t + S_y(0)\sin\omega_0t$$
$$S_y(t) = S_y(0)\cos\omega_0t - S_x(0)\sin\omega_0t \tag{4.20}$$
$$S_z(t) = S_z(0)\,.$$

These can be written in the form $S_x(t) = A\cos(\omega_0t + \delta_x)$ and $S_y(t) = A\sin(\omega_0t + \delta_y)$ where $A^2 = S_x(0)^2 + S_y(0)^2$. Since these are the parametric equations of a circle, we see that $\mathbf S$ precesses about the $\mathbf B$ field, with a constant projection along the $z$-axis.

For example, suppose the spin starts out at $t = 0$ in the $xy$-plane as an eigenstate of $S_x$ with eigenvalue $+\hbar/2$. This means that
$$\langle S_x(0)\rangle = \langle x{+}|S_x|x{+}\rangle = \frac{\hbar}{2} \qquad \langle S_y(0)\rangle = \langle S_z(0)\rangle = 0\,.$$
Taking the expectation value of equations (4.20) we see that
$$\langle S_x(t)\rangle = \frac{\hbar}{2}\cos\omega_0t \qquad \langle S_y(t)\rangle = -\frac{\hbar}{2}\sin\omega_0t \qquad \langle S_z(t)\rangle = 0\,. \tag{4.21}$$
This clearly shows that the spin stays in the $xy$-plane, and precesses about the $z$-axis. The direction of precession depends on the sign of $\omega_0$, which in turn depends on the sign of $q$ as defined by (4.19).
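Here is a small numerical illustration of (4.21), again a sketch with $\hbar = 1$ and an arbitrary value of $\omega_0$: we evolve the $S_x$ eigenstate $\chi(0) = (1, 1)/\sqrt 2$ under $H = -\omega_0S_z$ and compute the spin expectation values at several times. (For this diagonal Hamiltonian, $e^{-iHt}$ can be written down directly, so no matrix exponential routine is needed.)

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])

omega0 = 2.0                                          # stand-in for gqB0/2mc
chi0 = np.array([1, 1], dtype=complex) / np.sqrt(2)   # S_x eigenstate, +1/2

for t in np.linspace(0.0, 3.0, 7):
    # e^{-iHt} = e^{+i omega0 t S_z} = diag(e^{i omega0 t/2}, e^{-i omega0 t/2})
    U = np.diag([np.exp(1j * omega0 * t / 2), np.exp(-1j * omega0 * t / 2)])
    chi = U @ chi0
    Sx = (chi.conj() @ (sx / 2) @ chi).real   # should be (1/2) cos(w0 t)
    Sy = (chi.conj() @ (sy / 2) @ chi).real   # should be -(1/2) sin(w0 t)
    assert np.isclose(Sx, 0.5 * np.cos(omega0 * t))
    assert np.isclose(Sy, -0.5 * np.sin(omega0 * t))
print("precession follows (4.21)")
```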

How does this look from the point of view of the SP? From (4.14), the spin state evolves according to
$$|\chi(t)\rangle = e^{-iHt/\hbar}|\chi(0)\rangle = e^{i(gqB_0t/2mc)S_z/\hbar}|\chi(0)\rangle = e^{i\omega_0tS_z/\hbar}|\chi(0)\rangle\,.$$
Comparing this with (4.10a), we see that in the SP the spin precesses about $-\hat{\mathbf z}$ with angular velocity $\omega_0$.

Now let us consider the motion of an electron in an inhomogeneous magnetic field. Since for the electron we have $q = -e < 0$, we see from (4.12) that $\boldsymbol\mu_s$ is anti-parallel to the spin $\mathbf S$. The potential energy is $H_s = -\boldsymbol\mu_s\cdot\mathbf B$, and as a consequence, the force on the electron is

$$\mathbf F = -\nabla H_s = \nabla(\boldsymbol\mu_s\cdot\mathbf B)\,.$$
Suppose we have a Stern-Gerlach apparatus set up with the inhomogeneous $\mathbf B$ field in the $z$ direction (i.e., pointing up).

In our previous discussion with a uniform $\mathbf B$ field, there would be no translational force because $\nabla B_i = 0$. But now this is no longer true, and as a consequence, a particle with a magnetic moment will be deflected. The original S-G experiment was done with silver atoms, which have a single valence electron, but we can consider this as simply an isolated electron.

It is easy to see in general what is going to happen.

Suppose that the electron is in an eigenstate of $S_z$ with eigenvalue $+\hbar/2$. Since the spin points up, the magnetic moment $\boldsymbol\mu_s$ is anti-parallel to $\mathbf B$, and hence $H_s = -\boldsymbol\mu_s\cdot\mathbf B > 0$. Since the electron wants to minimize its energy, it will travel towards a region of smaller $B$, which is up. Obviously, if the spin points down, then $\boldsymbol\mu_s$ is parallel to $\mathbf B$ and $H_s < 0$, so the electron will decrease its energy by moving to a region of larger $B$, which is down.

Another way to look at this is from the force relation. If $\boldsymbol\mu_s$ is anti-parallel to $\mathbf B$ (i.e., spin up), then since the gradient of $B$ is negative (the $B$ field is decreasing in the positive $z$ direction), the force on the electron will be in the positive direction, and hence the electron will be deflected upwards. Similarly, if the spin is down, then the force will be negative and the electron is deflected downwards.

It is worth pointing out that a particle with angular momentum $l$ will split into $2l + 1$ distinct components. So if the magnetic moment is due to orbital angular momentum, then there will necessarily be an odd number of beams coming out of a S-G apparatus. Therefore, the fact that an electron beam splits into two beams shows that its angular momentum can not be due to orbital motion, and hence is a good experimental verification of the existence of spin.

Suppose that the particle enters the S-G apparatus at $t = 0$ with its spin aligned along $\hat{\mathbf x}$, and emerges at time $t = T$. If the magnetic field were uniform, then the expectation value of $\mathbf S$ at $t = T$ would be given by equations (4.21) as
$$\langle\mathbf S\rangle_T^{\rm unif} = \frac{\hbar}{2}(\cos\omega_0T)\,\hat{\mathbf x} - \frac{\hbar}{2}(\sin\omega_0T)\,\hat{\mathbf y}\,. \tag{4.22}$$
Interestingly, for the inhomogeneous field this expectation value turns out to be zero. To see this, let the particle enter the apparatus with its spin along the $\hat{\mathbf x}$ direction, and let its initial spatial wave function be the normalized wave packet $\psi_0(\mathbf r, t)$. In terms of the eigenstates of $S_z$, the total wave function at $t = 0$ is then (from (2.9a))

$$\Psi(\mathbf r, 0) = \frac{1}{\sqrt 2}\begin{bmatrix} \psi_0(\mathbf r, 0)\\ \psi_0(\mathbf r, 0)\end{bmatrix}\,. \tag{4.23a}$$

When the particle emerges at $t = T$, it is a superposition of localized spin up wave packets $\psi_+(\mathbf r, T)$ and spin down wave packets $\psi_-(\mathbf r, T)$:

$$\Psi(\mathbf r, T) = \frac{1}{\sqrt 2}\begin{bmatrix} \psi_+(\mathbf r, T)\\ \psi_-(\mathbf r, T)\end{bmatrix}\,. \tag{4.23b}$$
Since the spin up and spin down states are well separated, there is no overlap between these wave packets, and hence we have

$$\psi_+(\mathbf r, T)\,\psi_-(\mathbf r, T) \approx 0\,. \tag{4.23c}$$
Since the $\mathbf B$ field is essentially in the $z$ direction, there is no torque on the particle in the $z$ direction, and from (4.17) it follows that the $z$ component of spin is conserved.

In other words, the total probability of finding the particle with spin up at $t = T$ is the same as it is at $t = 0$. The same applies to the spin down states, so we have
$$\int|\psi_+(\mathbf r, T)|^2\,d^3r = \int|\psi_0(\mathbf r, 0)|^2\,d^3r$$
and
$$\int|\psi_-(\mathbf r, T)|^2\,d^3r = \int|\psi_0(\mathbf r, 0)|^2\,d^3r$$
and therefore
$$\int|\psi_+(\mathbf r, T)|^2\,d^3r = \int|\psi_-(\mathbf r, T)|^2\,d^3r\,. \tag{4.24}$$
From (4.23a), the expectation value of $\mathbf S$ at $\mathbf r$ and $t = 0$ is given by
$$\langle\mathbf S\rangle_{\mathbf r,0} = \Psi^\dagger(\mathbf r, 0)\,\frac{\hbar}{2}\boldsymbol\sigma\,\Psi(\mathbf r, 0) = \Psi^\dagger(\mathbf r, 0)\,\frac{\hbar}{2}(\sigma_x\hat{\mathbf x} + \sigma_y\hat{\mathbf y} + \sigma_z\hat{\mathbf z})\,\Psi(\mathbf r, 0) = \frac{\hbar}{2}\,|\psi_0(\mathbf r, 0)|^2\,\hat{\mathbf x}$$
and hence the net expectation value at $t = 0$ is (since $\psi_0$ is normalized)
$$\langle\mathbf S\rangle_0 = \int\langle\mathbf S\rangle_{\mathbf r,0}\,d^3r = \frac{\hbar}{2}\,\hat{\mathbf x}\,.$$
(This should have been expected since the particle entered the apparatus with its spin in the $\hat{\mathbf x}$ direction.) And from (4.23b), using (4.23c) we have for the exiting particle
$$\langle\mathbf S\rangle_{\mathbf r,T} = \Psi^\dagger(\mathbf r, T)\,\frac{\hbar}{2}\boldsymbol\sigma\,\Psi(\mathbf r, T) = \Psi^\dagger(\mathbf r, T)\,\frac{\hbar}{2}(\sigma_x\hat{\mathbf x} + \sigma_y\hat{\mathbf y} + \sigma_z\hat{\mathbf z})\,\Psi(\mathbf r, T)$$

$$= \frac{\hbar}{4}\left\{|\psi_+(\mathbf r, T)|^2 - |\psi_-(\mathbf r, T)|^2\right\}\hat{\mathbf z}$$
(the factor of $1/2$ coming from the normalization in (4.23b)). Integrating over all space, we see from (4.24) that

$$\langle\mathbf S\rangle_T^{\rm inhom} = 0 \tag{4.25}$$
as claimed.

Why does the uniform field give a non-zero value for $\langle\mathbf S\rangle_T$ whereas the inhomogeneous field gives zero? The answer lies with (4.19). Observe that the precessional frequency $\omega_0$ depends on the field $B$. In the Stern-Gerlach apparatus, the $B$ field is weaker up high and stronger down low. This means that particles lower in the beam precess at a faster rate than particles higher in the beam.

Since (4.25) is essentially an average of (4.22), we see that the different phases due to the different values of $\omega_0$ will cause the integral to vanish. Of course, this assumes that there is sufficient variation $\Delta\omega_0$ in precessional frequencies that the trigonometric functions average to zero. This will be true as long as
$$(\Delta\omega_0)\,T \geq 2\pi\,.$$
In fact, this places a constraint on the minimum amount of inhomogeneity that the S-G apparatus must have in order to split the beam. From (4.19) we have

$$\Delta\omega_0 = -\frac{e}{m_ec}\,\Delta B_0 \approx -\frac{e}{m_ec}\,\frac{\partial B_0}{\partial z}\,\Delta z$$
where $\Delta z$ is the distance between the poles of the magnet. Therefore, in order for the apparatus to work we must have
$$\frac{e}{m_ec}\,\frac{\partial B_0}{\partial z}\,T \geq \frac{2\pi}{\Delta z}\,.$$
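The dephasing argument above is easy to see numerically. The following rough sketch (all numbers are arbitrary stand-ins, $\hbar = 1$) averages the uniform-field result (4.22) over a beam with a spread $\Delta\omega_0$ of precession frequencies; once $(\Delta\omega_0)T \gtrsim 2\pi$, the average spin washes out:

```python
import numpy as np

T = 1.0                    # transit time through the apparatus
omega0_mean = 50.0         # mean precession frequency
rng = np.random.default_rng(0)

for spread in [0.5, 2 * np.pi, 8 * np.pi]:       # Delta omega0 across the beam
    w = rng.uniform(omega0_mean - spread / 2,
                    omega0_mean + spread / 2, 100_000)
    Sx = 0.5 * np.mean(np.cos(w * T))            # beam average of (1/2) cos(w T)
    Sy = -0.5 * np.mean(np.sin(w * T))           # beam average of -(1/2) sin(w T)
    print(f"(Dw0) T = {spread * T:6.2f}:  <Sx> = {Sx:+.4f}  <Sy> = {Sy:+.4f}")
```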

5 The Addition of Angular Momentum

The basic idea is simply to express the eigenstates of the total angular momentum operator $\mathbf J$ of two particles in terms of the eigenstates of the individual angular momentum operators $\mathbf J_1$ and $\mathbf J_2$, where $\mathbf J = \mathbf J_1 + \mathbf J_2$. (Alternatively, this could be the addition of spin and orbital angular momenta of a single particle, etc.) Each particle's angular momentum operator satisfies (where the subscript $a$ labels the particle and where we take $\hbar = 1$)

$$[J_{ai}, J_{aj}] = i\sum_{k=1}^3\varepsilon_{ijk}J_{ak}$$
with corresponding eigenstates $|j_am_a\rangle$. And since we assume that $[\mathbf J_1, \mathbf J_2] = 0$, it follows that

$$[J_i, J_j] = [J_{1i} + J_{2i},\ J_{1j} + J_{2j}] = [J_{1i}, J_{1j}] + [J_{2i}, J_{2j}]$$
$$= i\sum_{k=1}^3\varepsilon_{ijk}J_{1k} + i\sum_{k=1}^3\varepsilon_{ijk}J_{2k} = i\sum_{k=1}^3\varepsilon_{ijk}\big(J_{1k} + J_{2k}\big) = i\sum_{k=1}^3\varepsilon_{ijk}J_k$$
and hence $\mathbf J$ is just another angular momentum operator with eigenstates $|jm\rangle$.

Next we want to describe simultaneous eigenstates of the total angular momentum. From $[\mathbf J_1, \mathbf J_2] = 0$, it is clear that $[J_1^2, J_2^2] = 0$, and since $[J_a^2, J_{az}] = 0$, we see that we can choose our states to be simultaneous eigenstates of $J_1^2$, $J_2^2$, $J_{1z}$ and $J_{2z}$. We denote these states by $|j_1j_2m_1m_2\rangle$.
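This is easy to verify numerically for two spin one-half particles. The sketch below (with $\hbar = 1$) builds $J_i = J_{1i}\otimes 1 + 1\otimes J_{2i}$ as Kronecker products and checks that the total angular momentum obeys the same commutation relations:

```python
import numpy as np

s = [np.array(m, dtype=complex) / 2 for m in
     ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]
I2 = np.eye(2)

# Total J_i = J_{1i} (x) 1 + 1 (x) J_{2i} on the 4-dimensional product space
J = [np.kron(si, I2) + np.kron(I2, si) for si in s]

for i in range(3):
    j, k = (i + 1) % 3, (i + 2) % 3
    comm = J[i] @ J[j] - J[j] @ J[i]
    assert np.allclose(comm, 1j * J[k])   # [J_i, J_j] = i eps_ijk J_k
print("total J is itself an angular momentum operator")
```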

Alternatively, we can choose our states to be simultaneous eigenstates of $J^2$, $J_1^2$, $J_2^2$ and $J_z$. That $[J^2, J_z] = 0$ follows directly because $\mathbf J$ is just an angular momentum operator. And the fact that $[J_a^2, J_z] = 0$ follows because $\mathbf J_a$ is an angular momentum operator, $J_z = J_{1z} + J_{2z}$ and $[\mathbf J_1, \mathbf J_2] = 0$. Finally, to show that $[J^2, J_a^2] = 0$, we simply observe that

$$J^2 = (\mathbf J_1 + \mathbf J_2)^2 = J_1^2 + J_2^2 + 2\,\mathbf J_1\cdot\mathbf J_2 = J_1^2 + J_2^2 + 2(J_{1x}J_{2x} + J_{1y}J_{2y} + J_{1z}J_{2z})\,.$$

It is now easy to see that $[J^2, J_a^2] = 0$ because $[\mathbf J_1, \mathbf J_2] = 0$ and $[J_a^2, J_{ai}] = 0$ for $i = x, y, z$. We denote these simultaneous eigenstates by $|j_1j_2jm\rangle$.

However, let me emphasize that even though $[J^2, J_z] = 0$, it is not true that $[J^2, J_{az}] = 0$. This means that we can not specify $J^2$ in the states $|j_1j_2m_1m_2\rangle$, and we can not specify either $J_{1z}$ or $J_{2z}$ in the states $|j_1j_2jm\rangle$.

We know that the angular momentum operators and states satisfy (with $\hbar = 1$)

$$J^2|jm\rangle = j(j + 1)|jm\rangle \tag{5.1a}$$
$$J_z|jm\rangle = m|jm\rangle \tag{5.1b}$$
$$J_\pm|jm\rangle = \sqrt{j(j + 1) - m(m \pm 1)}\,|j, m \pm 1\rangle\,. \tag{5.1c}$$
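For concreteness, here is a sketch (with $\hbar = 1$; the helper name angular_momentum is mine) that builds the matrices of $J_z$ and $J_\pm$ for an arbitrary $j$ in the basis $\{|j,j\rangle, |j,j{-}1\rangle, \ldots, |j,-j\rangle\}$ from the matrix elements in (5.1c), and then checks (5.1a) together with the standard commutators:

```python
import numpy as np

def angular_momentum(j):
    """Jz, J+, J- for spin j in the basis |j,j>, ..., |j,-j> (hbar = 1)."""
    m = np.arange(j, -j - 1, -1)          # m = j, j-1, ..., -j
    Jz = np.diag(m).astype(complex)
    # <j, m+1| J_+ |j, m> = sqrt(j(j+1) - m(m+1)), from (5.1c)
    Jp = np.diag(np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    return Jz, Jp, Jp.conj().T

j = 3 / 2
Jz, Jp, Jm = angular_momentum(j)
J2 = Jp @ Jm + Jz @ Jz - Jz                                    # J^2 = J+J- + Jz^2 - Jz
assert np.allclose(J2, j * (j + 1) * np.eye(int(2 * j + 1)))   # (5.1a)
assert np.allclose(Jz @ Jp - Jp @ Jz, Jp)                      # [Jz, J+] = J+
assert np.allclose(Jp @ Jm - Jm @ Jp, 2 * Jz)                  # [J+, J-] = 2 Jz
print("(5.1a)-(5.1c) consistent for j =", j)
```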

Let us denote the two individual angular momentum states by $|j_1m_1\rangle$ and $|j_2m_2\rangle$. Then the two-particle basis states are denoted by the various forms (see Part I, Section 1 in the handout Supplementary Notes on Mathematics)
$$|j_1j_2m_1m_2\rangle = |j_1m_1\rangle|j_2m_2\rangle = |j_1m_1\rangle\otimes|j_2m_2\rangle\,.$$

When we write $\langle j_1j_2m_1m_2|j_1'j_2'm_1'm_2'\rangle$ we really mean

$$(\langle j_1m_1|\otimes\langle j_2m_2|)(|j_1'm_1'\rangle\otimes|j_2'm_2'\rangle) = \langle j_1m_1|j_1'm_1'\rangle\langle j_2m_2|j_2'm_2'\rangle\,.$$
Since these two-particle states form a complete set, we can write the total combined angular momentum state of both particles as

$$|j_1j_2jm\rangle = \sum_{j_1'j_2'}\sum_{m_1m_2}|j_1'j_2'm_1m_2\rangle\langle j_1'j_2'm_1m_2|j_1j_2jm\rangle\,. \tag{5.2}$$

However, we now show that many of the matrix elements $\langle j_1'j_2'm_1m_2|j_1j_2jm\rangle$ vanish, and hence the sum will be greatly simplified. For our two-particle operators we have

$$J_z = J_{1z} + J_{2z} \qquad\text{and}\qquad J_\pm = J_{1\pm} + J_{2\pm}\,.$$

While we won't really make any use of it, let me point out the correct way to write operators when dealing with tensor (or direct) product states. In this case we should properly write operators in the form

$$\mathbf J = \mathbf J_1\otimes 1_2 + 1_1\otimes\mathbf J_2\,.$$
Then acting on the two-particle states we have, for example,

$$J_z\big(|j_1m_1\rangle\otimes|j_2m_2\rangle\big) = \big(J_{1z}\otimes 1 + 1\otimes J_{2z}\big)\big(|j_1m_1\rangle\otimes|j_2m_2\rangle\big)$$
$$= J_{1z}|j_1m_1\rangle\otimes|j_2m_2\rangle + |j_1m_1\rangle\otimes J_{2z}|j_2m_2\rangle = (m_1 + m_2)\big(|j_1m_1\rangle\otimes|j_2m_2\rangle\big)\,.$$
As I said, while we won't generally write out operators and states in this form, you should keep in mind what is really going on.
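The same statement in Kronecker-product form makes a compact numerical check; this sketch (two spin one-half particles, $\hbar = 1$) confirms that each product state is an eigenstate of $J_z = J_{1z}\otimes 1 + 1\otimes J_{2z}$ with eigenvalue $m_1 + m_2$:

```python
import numpy as np

up = np.array([1, 0], dtype=complex)       # |1/2, +1/2>
dn = np.array([0, 1], dtype=complex)       # |1/2, -1/2>
Sz = np.diag([0.5, -0.5]).astype(complex)
I2 = np.eye(2)

Jz = np.kron(Sz, I2) + np.kron(I2, Sz)     # J_{1z} (x) 1 + 1 (x) J_{2z}

for psi1, m1 in [(up, 0.5), (dn, -0.5)]:
    for psi2, m2 in [(up, 0.5), (dn, -0.5)]:
        prod = np.kron(psi1, psi2)         # |m1> (x) |m2>
        assert np.allclose(Jz @ prod, (m1 + m2) * prod)
print("J_z eigenvalue on each product state is m1 + m2")
```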

Now, for our two-particle states we have
$$J_a^2|j_1j_2m_1m_2\rangle = j_a(j_a + 1)|j_1j_2m_1m_2\rangle$$
$$J_{az}|j_1j_2m_1m_2\rangle = m_a|j_1j_2m_1m_2\rangle$$
where $a = 1, 2$. Taking the matrix element of $J_1^2$ acting to both the left and the right, we see that

$$\langle j_1'j_2'm_1m_2|J_1^2|j_1j_2jm\rangle = j_1'(j_1' + 1)\,\langle j_1'j_2'm_1m_2|j_1j_2jm\rangle = \langle j_1'j_2'm_1m_2|j_1j_2jm\rangle\,j_1(j_1 + 1)$$
or
$$\big[j_1'(j_1' + 1) - j_1(j_1 + 1)\big]\,\langle j_1'j_2'm_1m_2|j_1j_2jm\rangle = 0\,.$$
Since this result clearly applies to $J_2^2$ as well, we must have

$$\langle j_1'j_2'm_1m_2|j_1j_2jm\rangle = 0 \quad\text{if}\quad j_1'\neq j_1\ \text{or}\ j_2'\neq j_2\,.$$
In other words, equation (5.2) has simplified to

$$|j_1j_2jm\rangle = \sum_{m_1m_2}|j_1j_2m_1m_2\rangle\langle j_1j_2m_1m_2|j_1j_2jm\rangle\,.$$

Next, from $J_z = J_{1z} + J_{2z}$ we can let $J_{1z} + J_{2z}$ act to the left and $J_z$ act to the right, so that

$$\langle j_1j_2m_1m_2|J_z|j_1j_2jm\rangle = (m_1 + m_2)\,\langle j_1j_2m_1m_2|j_1j_2jm\rangle = \langle j_1j_2m_1m_2|j_1j_2jm\rangle\,m\,.$$
This shows that

$$\langle j_1j_2m_1m_2|j_1j_2jm\rangle = 0 \quad\text{unless}\quad m = m_1 + m_2$$

and hence equation (5.2) has become

$$|j_1j_2jm\rangle = \sum_{\substack{m_1m_2\\ m_1+m_2=m}}|j_1j_2m_1m_2\rangle\langle j_1j_2m_1m_2|j_1j_2jm\rangle\,.$$

Finally, we regard the values of $j_1$ and $j_2$ as fixed and understood, and we simply write the total angular momentum state as

$$|jm\rangle = \sum_{\substack{m_1m_2\\ m_1+m_2=m}}|m_1m_2\rangle\langle m_1m_2|jm\rangle\,. \tag{5.3}$$

The complex numbers $\langle m_1m_2|jm\rangle$ are called Clebsch-Gordan coefficients. They are really nothing more than the elements of the unitary transition matrix that takes us from the $\{|m_1m_2\rangle\}$ basis to the $\{|jm\rangle\}$ basis. (This is just the statement from linear algebra that given two bases $\{e_i\}$ and $\{\bar e_i\}$ of a vector space $V$, there is a nonsingular transition matrix $P$ such that $\bar e_i = \sum_j e_jp_{ji}$. Moreover, $\bar e_i$ is just the $i$th column of $P$. If both bases are orthonormal, then $P$ is in fact a unitary matrix. See Theorem 15 in Supplementary Notes on Mathematics.)

Since (normalized) eigenfunctions corresponding to distinct eigenvalues of a hermitian operator are orthonormal, we know that

$$\langle j'm'|jm\rangle = \delta_{jj'}\delta_{mm'} \qquad\text{and}\qquad \langle m_1'm_2'|m_1m_2\rangle = \delta_{m_1m_1'}\delta_{m_2m_2'}\,.$$
Therefore, taking the inner product of equation (5.3) with itself we see that

$$\sum_{\substack{m_1m_2\\ m_1+m_2=m}}\big|\langle m_1m_2|jm\rangle\big|^2 = 1\,. \tag{5.4}$$

Given $j_1$ and $j_2$, this holds for any resulting values of $j$ and $m$. (This is really just another way of saying that the columns of the unitary transition matrix are normalized.)

Now, we know that $-j_1 \leq m_1 \leq j_1$ and $-j_2 \leq m_2 \leq j_2$. Therefore, from $m = m_1 + m_2$, the maximum value of $m$ must be $j_1 + j_2$. And since $-j \leq m \leq j$, it follows that the maximum value of $j$ is $j_{\max} = j_1 + j_2$. Corresponding to this maximum value $j_1 + j_2$ of $j$, we have a multiplet of $2(j_1 + j_2) + 1$ values of $m$, i.e., $-(j_1 + j_2) \leq m \leq j_1 + j_2$. On the other hand, since there are $2j_1 + 1$ possible values of $m_1$, and $2j_2 + 1$ possible values of $m_2$, the total number of (not necessarily distinct) possible $m_1 + m_2 = m$ values is $(2j_1 + 1)(2j_2 + 1)$.

The next highest possible value of $m$ is $j_1 + j_2 - 1$, so that there is a $j$ state with $j = j_1 + j_2 - 1$ and a corresponding multiplet of $2(j_1 + j_2 - 1) + 1 = 2(j_1 + j_2) - 1$ possible $m$ values. We continue lowering the $j$ values by one, and for each such value $j = j_1 + j_2 - k$ there are $2(j_1 + j_2 - k) + 1$ possible $m$ values. However, the total number of $m$ values in all multiplets must equal $(2j_1 + 1)(2j_2 + 1)$.

Example 5.1. Consider the case $j_1 = 1$, $j_2 = 2$. Then the total number of possible $m$ values is
$$(2j_1 + 1)(2j_2 + 1) = 3\cdot 5 = 15\,.$$
We can arrange these values of $m = m_1 + m_2$ in a table with $m_1$ across the top and $m_2$ down the side:

                m_1 =  1    0   -1
    m_2 =  2           3    2    1
    m_2 =  1           2    1    0
    m_2 =  0           1    0   -1
    m_2 = -1           0   -1   -2
    m_2 = -2          -1   -2   -3

The distribution of $m$ values is as follows:

    value of m = m_1 + m_2:   3   2   1   0  -1  -2  -3
    number of occurrences:    1   2   3   3   3   2   1

We have $m_{\max} = 1 + 2 = 3$ so that $j_{\max} = j_1 + j_2 = 3$, and there are seven values of $m$ in this multiplet: $3, 2, 1, 0, -1, -2, -3$. If we eliminate these, then the next highest value of $m$ is 2, so there must exist a $j = 2$ state with the five $m$ values $2, 1, 0, -1, -2$. Eliminating these, the highest remaining $m$ value is 1, so there is a $j = 1$ state with the three $m$ values $1, 0, -1$. This exhausts all of the $m$ values. Note that summing the number of $m$ values in each multiplet we have $7 + 5 + 3 = 15$, as it must.
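The counting in this example is easy to automate. The short sketch below tallies the $m = m_1 + m_2$ values and then peels multiplets off from the top, reproducing $j = 3, 2, 1$ (integer $j_1$, $j_2$ are assumed here so that plain ranges suffice):

```python
from collections import Counter

j1, j2 = 1, 2
counts = Counter(m1 + m2 for m1 in range(-j1, j1 + 1)
                         for m2 in range(-j2, j2 + 1))
print(dict(sorted(counts.items())))   # {-3: 1, -2: 2, -1: 3, 0: 3, 1: 3, 2: 2, 3: 1}
assert sum(counts.values()) == (2 * j1 + 1) * (2 * j2 + 1)   # 15 in total

js = []
while counts:
    j = max(counts)                   # largest surviving m starts a multiplet
    js.append(j)
    for m in range(-j, j + 1):        # remove one copy of each m = -j, ..., j
        counts[m] -= 1
        if counts[m] == 0:
            del counts[m]
print("multiplets: j =", js)          # [3, 2, 1]
```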

The point of this example is to illustrate the origin of the two ways of getting an $m$ value of $j_1 + j_2 - 1$: one way is the multiplet corresponding to $j = j_1 + j_2$, and the second way is the multiplet corresponding to $j = j_1 + j_2 - 1$. Similarly, the case $m = j_1 + j_2 - 2$ enters into three multiplets: $j = j_1 + j_2$, $j = j_1 + j_2 - 1$ and $j = j_1 + j_2 - 2$.

In any case, for each nonnegative integer $k$, the state with $j = j_1 + j_2 - k$ contains $2(j_1 + j_2 - k) + 1$ values of $m$, subject to the constraint
$$\sum_{k=0}^b\,\big[2(j_1 + j_2 - k) + 1\big] = (2j_1 + 1)(2j_2 + 1)\,. \tag{5.5}$$

The minimum value $j_{\min} = j_1 + j_2 - b$ is defined by the integer $b$, which is to be determined. Recalling the formula

$$\sum_{k=n}^m k = \frac{m - n + 1}{2}\,(m + n)$$

we have
$$\sum_{k=0}^b\,\big[2(j_1 + j_2 - k) + 1\big] = 2(j_1 + j_2)(b + 1) - 2\sum_{k=0}^bk + (b + 1)$$
$$= 2(j_1 + j_2)(b + 1) - (b + 1)b + (b + 1) = (b + 1)\big[2(j_1 + j_2) + 1 - b\big]\,.$$
Then equation (5.5) becomes
$$(b + 1)\big[2(j_1 + j_2) + 1 - b\big] = (2j_1 + 1)(2j_2 + 1)$$
or
$$b^2 - 2(j_1 + j_2)\,b + 4j_1j_2 = 0\,.$$
Solving for $b$ using the quadratic formula yields
$$b = (j_1 + j_2) \pm \sqrt{(j_1 + j_2)^2 - 4j_1j_2} = (j_1 + j_2) \pm (j_1 - j_2)$$
so that $b = 2j_1$ or $b = 2j_2$. Since $j_{\min} = j_1 + j_2 - b$, we see that either $j_{\min} = j_2 - j_1$ or $j_{\min} = j_1 - j_2$. If $j_1 \neq j_2$ then one of these is negative and must be rejected, and hence we conclude that $j_{\min} = |j_1 - j_2|$. Thus we arrive at $j_{\min} \leq j \leq j_{\max}$ or
$$|j_1 - j_2| \leq j \leq j_1 + j_2\,. \tag{5.6}$$
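As a cross-check of (5.6), this sketch verifies for several pairs $(j_1, j_2)$, including half-integer values handled exactly with fractions.Fraction, that the multiplet dimensions $2j + 1$ for $|j_1 - j_2| \leq j \leq j_1 + j_2$ add up to the product-space dimension $(2j_1 + 1)(2j_2 + 1)$:

```python
from fractions import Fraction

half = Fraction(1, 2)
for j1 in [half, 1, 3 * half, 2, 5 * half]:
    for j2 in [half, 1, 3 * half, 2]:
        jmin, jmax = abs(j1 - j2), j1 + j2
        # jmax - jmin = 2 min(j1, j2) is always a nonnegative integer
        total = sum(2 * (jmin + k) + 1 for k in range(int(jmax - jmin) + 1))
        assert total == (2 * j1 + 1) * (2 * j2 + 1), (j1, j2)
print("sum of (2j+1) over |j1-j2| <= j <= j1+j2 equals (2j1+1)(2j2+1)")
```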

Equations (5.1c), (5.3), (5.4) and (5.6) are all we need to calculate the Clebsch-Gordan coefficients. The procedure is best illustrated by a specific example.

Example 5.2. Let us consider the case $j_1 = j_2 = 1/2$, because you (should) already know the answer. In this case we have $m_1 = \pm 1/2$ and $m_2 = \pm 1/2$, so according to equation (5.6) we know that $0 \leq j \leq 1$.

First we have $j_{\max} = j_1 + j_2 = 1$, and the maximum value of $m$ is $m_1 + m_2 = 1$ also. From equation (5.3) we have the single term
$$|11\rangle = \big|\tfrac12\,\tfrac12\big\rangle\big\langle\tfrac12\,\tfrac12\big|11\big\rangle$$
and by equation (5.4) we must have

$$\big|\big\langle\tfrac12\,\tfrac12\big|11\big\rangle\big|^2 = 1\,.$$


This specifies the topmost state up to a phase factor. Choosing this phase to be 1 (the Condon-Shortley convention), we are left with
$$\big\langle\tfrac12\,\tfrac12\big|11\big\rangle = +1\,.$$

Hence equation (5.3) becomes
$$|11\rangle = \big|\tfrac12\,\tfrac12\big\rangle \tag{5.7a}$$
where the state on the left-hand side of this equation is the $|jm\rangle$ state, and the state on the right-hand side is the $|m_1m_2\rangle$ state.

To construct the next lowest state, we act on the left-hand side of this last result with $J_-$, and on the right-hand side with $J_- = J_{1-} + J_{2-}$. Using equation (5.1c), we obtain
$$\sqrt 2\,|10\rangle = \sqrt{\tfrac34 + \tfrac14}\,\Big(\big|{-\tfrac12}\ \tfrac12\big\rangle + \big|\tfrac12\ {-\tfrac12}\big\rangle\Big)$$
or
$$|10\rangle = \frac{1}{\sqrt 2}\Big(\big|{-\tfrac12}\ \tfrac12\big\rangle + \big|\tfrac12\ {-\tfrac12}\big\rangle\Big)\,. \tag{5.7b}$$
Since (by equation (5.3))

$$|10\rangle = \big|\tfrac12\ {-\tfrac12}\big\rangle\big\langle\tfrac12\ {-\tfrac12}\big|10\big\rangle + \big|{-\tfrac12}\ \tfrac12\big\rangle\big\langle{-\tfrac12}\ \tfrac12\big|10\big\rangle$$
we see that
$$\big\langle\tfrac12\ {-\tfrac12}\big|10\big\rangle = \big\langle{-\tfrac12}\ \tfrac12\big|10\big\rangle = \frac{1}{\sqrt 2}\,.$$
We now act on equation (5.7b) with $J_- = J_{1-} + J_{2-}$ again to obtain
$$\sqrt 2\,|1\,{-1}\rangle = \frac{1}{\sqrt 2}\bigg[\sqrt{\tfrac34 + \tfrac14}\,\Big(\big|{-\tfrac12}\ {-\tfrac12}\big\rangle + \big|{-\tfrac12}\ {-\tfrac12}\big\rangle\Big)\bigg]$$
or
$$|1\,{-1}\rangle = \big|{-\tfrac12}\ {-\tfrac12}\big\rangle \tag{5.7c}$$
and hence
$$\big\langle{-\tfrac12}\ {-\tfrac12}\big|1\,{-1}\big\rangle = 1\,.$$
This completes the $j = 1$ multiplet which, for obvious reasons, is called the triplet state.

Now for the $j = 0$ state. We know from equation (5.3) that
$$|00\rangle = \big|\tfrac12\ {-\tfrac12}\big\rangle\big\langle\tfrac12\ {-\tfrac12}\big|00\big\rangle + \big|{-\tfrac12}\ \tfrac12\big\rangle\big\langle{-\tfrac12}\ \tfrac12\big|00\big\rangle := a\,\big|\tfrac12\ {-\tfrac12}\big\rangle + b\,\big|{-\tfrac12}\ \tfrac12\big\rangle\,.$$
To solve for $a$ and $b$, we first see that from equation (5.4) we have $a^2 + b^2 = 1$. Next, we note that the $|jm\rangle = |1\,0\rangle$ state is orthogonal to the $|0\,0\rangle$ state, and therefore
$$0 = \langle 10|00\rangle = \frac{1}{\sqrt 2}(a + b)$$

so that $a = -b$. Therefore $2a^2 = 1$, so that $a = -b = 1/\sqrt 2$ and we have
$$|00\rangle = \frac{1}{\sqrt 2}\Big(\big|\tfrac12\ {-\tfrac12}\big\rangle - \big|{-\tfrac12}\ \tfrac12\big\rangle\Big) \tag{5.7d}$$
which is called the singlet state. We have thus constructed the well-known wave functions for two electrons:

$$\text{triplet (symmetric) states:}\qquad
\begin{cases}
\ |11\rangle = \big|\tfrac12\,\tfrac12\big\rangle\\[4pt]
\ |10\rangle = \dfrac{1}{\sqrt 2}\Big(\big|\tfrac12\ {-\tfrac12}\big\rangle + \big|{-\tfrac12}\ \tfrac12\big\rangle\Big)\\[4pt]
\ |1\,{-1}\rangle = \big|{-\tfrac12}\ {-\tfrac12}\big\rangle
\end{cases}$$
$$\text{singlet (antisymmetric) state:}\qquad
|00\rangle = \frac{1}{\sqrt 2}\Big(\big|\tfrac12\ {-\tfrac12}\big\rangle - \big|{-\tfrac12}\ \tfrac12\big\rangle\Big)\,.$$
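Finally, the whole construction can be checked by brute force: build $J^2$ on the product space of two spin one-half particles and diagonalize it. The sketch below (with $\hbar = 1$; eigenvectors come out only up to an overall phase, and the basis order is $|{+}{+}\rangle, |{+}{-}\rangle, |{-}{+}\rangle, |{-}{-}\rangle$) recovers one $j = 0$ state and three $j = 1$ states, with the singlet matching (5.7d):

```python
import numpy as np

s = [np.array(m, dtype=complex) / 2 for m in
     ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]
I2 = np.eye(2)
J = [np.kron(si, I2) + np.kron(I2, si) for si in s]
J2 = sum(Ji @ Ji for Ji in J)

evals, evecs = np.linalg.eigh(J2)
print(np.round(evals.real, 10))   # [0. 2. 2. 2.]: j(j+1) = 0 once, 2 three times
singlet = evecs[:, 0]             # the j = 0 eigenvector
print(np.round(singlet, 4))       # proportional to (0, 1, -1, 0)/sqrt(2), cf. (5.7d)
```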

While we will soon return to discuss the symmetry of wave functions in detail, it should be fairly clear that exchanging electrons 1 and 2 in any of the triplet states leaves the wave function unchanged, and hence the states are referred to as symmetric. Similarly, exchanging electrons 1 and 2 in the singlet state changes the overall sign of the state, and hence this is called an antisymmetric state.
