
Principles of Quantum Mechanics

University of Cambridge Part II Mathematical Tripos

David Skinner

Department of Applied Mathematics and Theoretical Physics,
Centre for Mathematical Sciences,
Wilberforce Road,
Cambridge CB3 0WA
United Kingdom

[email protected] http://www.damtp.cam.ac.uk/people/dbs26/

Abstract: These are the lecture notes for the Principles of Quantum Mechanics course given to students taking Part II Maths in Cambridge during Michaelmas Term of 2019. The main aim is to discuss quantum mechanics in the framework of Hilbert space, following Dirac. Along the way, we talk about transformations and symmetries, angular momentum, composite systems, dynamical symmetries, and perturbation theory (both time–independent and time–dependent, degenerate and non–degenerate). We'll finish with a discussion of various interpretations of quantum mechanics, entanglement and decoherence.

Contents

1  Introduction

2  Hilbert Space
   2.1  Definition of Hilbert Space
        2.1.1  Examples
        2.1.2  Dual Spaces
        2.1.3  Dirac Notation and Continuum States
   2.2  Operators
   2.3  Composite Systems
        2.3.1  Tensor Product of Hilbert Spaces
   2.4  Postulates of Quantum Mechanics
        2.4.1  The Generalized Uncertainty Principle

3  The Harmonic Oscillator
   3.1  Raising and Lowering Operators
   3.2  Dynamics of Oscillators
        3.2.1  Coherent States
        3.2.2  Anharmonic Oscillations
   3.3  The Three-Dimensional Isotropic Harmonic Oscillator

4  Transformations and Symmetries
   4.1  Transformations of States and Operators
        4.1.1  Continuous Transformations
   4.2  Translations
   4.3  Rotations
        4.3.1  Translations Around a Circle
        4.3.2  Spin
   4.4  Time Translations
        4.4.1  The Heisenberg Picture
   4.5  Dynamics
        4.5.1  Symmetries and Conservation Laws
   4.6  Parity

5  Angular Momentum
   5.1  Angular Momentum Eigenstates (A Little Representation Theory)
   5.2  Rotations and Orientation
        5.2.1  Rotation of Diatomic Molecules
   5.3  Spin
        5.3.1  Large Rotations
        5.3.2  The Stern–Gerlach Experiment
        5.3.3  and Projective Representations
        5.3.4  Spin Matrices
        5.3.5  Paramagnetic Resonance and MRI Scanners
   5.4  Orbital Angular Momentum
        5.4.1

6  Addition of Angular Momentum
   6.1  Combining the Angular Momenta of Two States
        6.1.1  j ⊗ 0 = j
        6.1.2  1/2 ⊗ 1/2 = 1 ⊕ 0
        6.1.3  1 ⊗ 1/2 = 3/2 ⊕ 1/2
        6.1.4  The Classical Limit
   6.2  Angular Momentum of Operators
        6.2.1  The Wigner–Eckart Theorem
        6.2.2  Dipole Moment Transitions
        6.2.3  The Runge–Lenz Vector in Hydrogen
        6.2.4  Enhanced Symmetry of the 3d Isotropic Harmonic Oscillator
   6.3  Angular Momentum of the Isotropic Oscillator
   6.4  The Radial Hamiltonian for the Coulomb Potential

7  Identical Particles
   7.1  Bosons and Fermions
        7.1.1  Pauli's Exclusion Principle
        7.1.2  The Periodic Table
        7.1.3  White Dwarfs, Neutron Stars and Supernovae
   7.2  Exchange and Parity in the Centre of Momentum Frame
        7.2.1  Identical Particles and Inelastic Collisions

8  Perturbation Theory I: Time Independent Case
   8.1  An Analytic Expansion
        8.1.1  Fine Structure of Hydrogen
        8.1.2  Hyperfine Structure of Hydrogen
        8.1.3  The Ground State of Helium
        8.1.4  The Quadratic Stark Effect
   8.2  Degenerate Perturbation Theory
        8.2.1  The Linear Stark Effect
   8.3  Does Perturbation Theory Converge?

9  Perturbation Theory II: Time Dependent Case
   9.1  The Interaction Picture
   9.2  Fermi's Golden Rule
        9.2.1  Ionization by Monochromatic Light
        9.2.2  Absorption and Stimulated Emission
        9.2.3  Spontaneous Emission
        9.2.4  Selection Rules

10 Interpreting Quantum Mechanics
   10.1  The Density Operator
        10.1.1  Von Neumann Entropy
        10.1.2  Reduced Density Operators
        10.1.3  Decoherence
   10.2  Quantum Mechanics or Hidden Variables?
        10.2.1  The EPR Gedankenexperiment
        10.2.2  Bell's Inequality

Acknowledgments

Nothing in these lecture notes is original. In particular, my treatment is heavily influenced by the textbooks listed below, especially Weinberg's Lectures on Quantum Mechanics, Hall's Quantum Theory for Mathematicians and Binney & Skinner's The Physics of Quantum Mechanics (though I feel more justified in plagiarising this one). I've also borrowed from previous versions of the lecture notes for this course, by Profs. R. Horgan and A. Davis, which are available online. I am supported by the European Union under an FP7 Marie Curie Career Integration Grant.

Preliminaries

These are the notes for the course on Principles of Quantum Mechanics offered in Part II of the Maths Tripos, so I'll feel free to assume you took the IB courses on Quantum Mechanics, Methods and Linear Algebra last year. For Part II, I'd expect the material presented here to complement the courses on Classical Dynamics, Statistical Physics and perhaps Integrable Systems very well on the Applied side, in addition to being a prerequisite for Applications of Quantum Mechanics next term. On the pure side, the courses Linear Analysis, Functional Analysis and Representation Theory are probably the most closely related to this one.

Books & Other Resources

There are many textbooks and reference books available on Quantum Mechanics. Different ones emphasise different aspects of the theory, or different applications in physics, or give prominence to different mathematical structures. QM is such a huge subject nowadays that it is probably impossible for a single textbook to give an encyclopaedic treatment (and absolutely impossible for me to do so in a course of 24 lectures). Here are some of the ones I've found useful while preparing these notes; you may well prefer different ones.

• Weinberg, S., Lectures on Quantum Mechanics, CUP (2013).
  This is perhaps the single most appropriate book for the course. Written by one of the great physicists of our times, this book contains a wealth of information. Highly recommended.

• Binney, J.J. and Skinner, D., The Physics of Quantum Mechanics, OUP (2014).
  The number 1 international bestseller... Contains a much fuller version of these notes (with better diagrams!). Put it on your Christmas list.

• Dirac, P., Principles of Quantum Mechanics, OUP (1967).
  The notes from an earlier version of this course! Written with exceptional clarity and insight, this is a genuine classic of theoretical physics by one of the founders of Quantum Mechanics.

• Messiah, A., Quantum Mechanics, Vols 1 & 2, Dover (2014).
  Another classic textbook, originally from 1958. It provides a comprehensive treatment of QM, though the order of the presentation is slightly different to the one we'll follow in this course.

• Shankar, R., Principles of Quantum Mechanics, Springer (1994).
  A much-loved textbook, with particular emphasis on the physical applications of quantum mechanics.

• Hall, B., Quantum Theory for Mathematicians, Springer (2013).
  If you're more interested in the functional analysis & representation theory aspects of QM rather than the physical applications, this could be the book for you. Much of it is at a more advanced level than we'll need.

• Sakurai, J.J., Modern Quantum Mechanics, Addison–Wesley (1994).
  By now, another classic book on QM, presented from a perspective very close to this course.

• Townsend, J., A Modern Approach to Quantum Mechanics, 2nd edition, University Science Books (2012).
  Very accessible, but also fairly comprehensive, textbook on QM. Starts by considering a spin-1/2 particle as a simple example of a two–state system.

Textbooks are expensive, so check what’s available in your college library before shelling out – any reasonably modern textbook on QM will be adequate for this course. There are also lots of excellent resources available freely online, including these:

• Here are the lecture notes from another recent version of this course.
• Here are the lecture notes from the Part II Further Quantum Mechanics course in the Cavendish.

• These are the lecture notes from Prof. W. Taylor's graduate course on QM at MIT. The first half of his course covers material that is relevant here, while scattering theory will be covered in next term's Applications of Quantum Mechanics Part II course.

• This is the lecture course on Quantum Mechanics from Prof. R. Fitzpatrick at the University of Texas, Austin.

1 Introduction

In classical mechanics, a particle's motion is governed by Newton's Laws. These are second order o.d.e.s, so to determine the fate of our particle we must specify two initial conditions. We could take these to be the particle's initial position x(t₀) and velocity v(t₀), or its initial position x(t₀) and momentum p(t₀). Once these are specified the motion is determined (provided of course we understand how to describe the forces that are acting). This means that we can solve Newton's Second Law to find the values (x(t), p(t)) for t > t₀¹. So as time passes, our classical particle traces out a trajectory in the space M of possible positions and momenta, sketched in figure 1. The space M is known as phase space and in our case, for motion in three dimensions, M is just R⁶. In general M comes with a rich geometry known as a Poisson structure; you'll study this structure in detail if you're taking the Part II courses on Classical Dynamics or Integrable Systems, and we'll touch on it later in this course, too. Classical observables are represented by functions

    f : M → R ,    f : (x, p) ↦ f(x, p) .

For example, we may be interested in the kinetic energy T = p²/2m, potential energy V(x), angular momentum L = x × p, or a host of other possible quantities. A priori, these functions are defined everywhere over M, but if we want to know about the energy or angular momentum of our specific particle then we should evaluate them not at some random point (x, p) ∈ M, but along the particle's phase space trajectory. For example, if at time t the particle has position x(t) and momentum p(t), then its angular momentum is x(t) × p(t). Thus the values of the particle's energy, angular momentum etc. may depend on time, though of course our definition of these quantities does not². In this way,

¹ We can certainly do this numerically, at least for t in a sufficiently small open subset around t₀.
² Technically, we take the pullback of f by the embedding map ı : [t₀, ∞) → M.

Figure 1: A particle's trajectory in phase space. Observables are represented by functions f : R^{2d} → R, evaluated along a given particle's trajectory.

everything we could possibly want to know about a single, pointlike particle is encoded in its phase space trajectory.

But that is not our World. To the best of our current experimental knowledge, our World is a quantum, not classical, one. Initially, these experiments were based on careful studies of atomic spectroscopy and blackbody radiation, but nowadays I'd prefer to say that the best evidence of quantum mechanics is simply that we use it constantly in our everyday lives. Each time you listen to music on your stereo, post a photo on Instagram or make a call on your phone you're relying on technology that's only become possible due to our understanding of the quantum structure of matter. Whenever you plug something into the mains, you're using electricity that's in part generated by nuclear reactions in which quantum mechanics is essential, while much of modern medicine relies on new drugs designed with the benefit of the improved understanding of chemistry that quantum mechanics provides.

In such a quantum world, instead of a phase space trajectory, everything we could want to know about a particle is encoded in a vector ψ in Hilbert space H. As you met in IB Quantum Mechanics, this state vector evolves in time according to Schrödinger's equation. In the quantum world, observables are represented by certain operators O. The operators you saw in IB QM had the same sort of form as observables in classical mechanics, such as the kinetic energy operator T̂ = p̂²/2m or angular momentum operator L̂ = x̂ × p̂. However, rather than being functions, these operators are (roughly) linear maps

    O : H → H .

Again, in the first instance these operators are defined throughout H, but if we're interested in knowing about the energy or angular momentum of our particular quantum particle, then we should find out what happens when they act on the specific ψ ∈ H that describes the state of our particle at time t. (See figure 2.)


Figure 2: In Quantum Mechanics, complete knowledge of a particle's state is determined by a vector in Hilbert space H. Observables are represented by Hermitian linear operators O : H → H.

In the following chapters we'll study what Hilbert space is and what its operators do in a more general framework than you saw last year, building your insight into the mathematical structure of quantum mechanics. Much of this is just linear algebra, but the Hilbert spaces we'll care about in QM are often infinite-dimensional, so we also make contact with Functional Analysis. Furthermore, although last year you 'guessed' the form of quantum operators by analogy with their classical counterparts, we'll see that at a deeper level many of them can be understood to have their origins in symmetries of space and time; the operators just reflect the way these symmetry transformations act on Hilbert space, rather than on (non-relativistic) space-time. In this way, Quantum Mechanics makes contact with Representation Theory.

So, mathematically, much of Quantum Mechanics boils down to a mix of Functional Analysis and Representation Theory. It's even true that it provides a particularly interesting example of these subjects. But this is not the reason we study it. We study Quantum Mechanics in an effort to understand and appreciate our World, not some abstract mathematical one. You're all intimately familiar with vector spaces, and you're (hopefully!) also very good at solving Sturm–Liouville type eigenfunction / eigenvalue problems. But the real skill is in understanding how this formalism relates to the world we see around us.

It's not obvious. Newton's laws are (at least generically) non-linear differential equations and we can't usually superpose solutions. General Relativity teaches us that space-time is not flat. So it's not at all clear that our particle should in fact be described by a vector in a Hilbert space, any more than it was obvious to Aristotle that bodies actually stay in uniform motion unless acted on by a force, or clear to the Ancients that the arrival of solar eclipses, changes of the weather, or any other natural phenomenon are actually governed by calculable Laws, rather than the whims of various gods. For this reason, instead of emphasising how weird and different Quantum Mechanics is, I'd prefer to make you appreciate how it actually underpins the physics you're already familiar with. No matter how good you are at solving eigenvalue problems, if you don't see how these relate to your everyday physical intuition, knowledge you've built up since first opening your eyes and learning to crawl, then you haven't really understood the subject.

Let's begin.

2 Hilbert Space

The realm of Quantum Mechanics is Hilbert space³, so we'll begin by exploring the properties of these spaces. This chapter will necessarily be almost entirely mathematical; the physics comes later.

2.1 Definition of Hilbert Space

Hilbert space H is a vector space over C that is equipped with an inner product, and which is complete with respect to the norm this inner product defines. Let's take a moment to understand what this means; much of it will be familiar from IB Linear Algebra or IB Methods.

Saying that H is a vector space means that it is a set on which we have an operation + of addition, obeying

    commutativity    ψ + φ = φ + ψ
    associativity    ψ + (φ + χ) = (ψ + φ) + χ                               (2.1)
    identity         ∃! o ∈ H  s.t.  ψ + o = ψ

for all ψ, φ, χ ∈ H. Furthermore, since it is a vector space over C, we can multiply our vectors by numbers a, b, c, ... ∈ C, called scalars. This multiplication is

    distributive over H    c(ψ + φ) = cψ + cφ
                                                                             (2.2)
    distributive in C      (a + b)ψ = aψ + bψ .

In addition, H comes equipped with an inner product. This is a map ( · , · ) : H × H → C that obeys

    conjugate symmetry       (φ, ψ) = (ψ, φ)*
    linearity                (φ, aψ) = a (φ, ψ)
    additivity               (φ, ψ + χ) = (φ, ψ) + (φ, χ)                    (2.3)
    positive–definiteness    (ψ, ψ) ≥ 0 for all ψ ∈ H, with equality iff ψ = o .

Note that the first two of these imply (aφ, ψ) = a* (φ, ψ), so that ( · , · ) is antilinear in its first (leftmost) argument⁴. Note also that (ψ, ψ) = (ψ, ψ)*, so that this quantity is necessarily real. Whenever we have an inner product, we can define⁵ the norm of a state to be

    ‖ψ‖ = √(ψ, ψ) .                                                          (2.4)

The properties (2.3) also ensure that the Cauchy–Schwarz inequality |(φ, ψ)|² ≤ (φ, φ)(ψ, ψ) holds.

³ In this course, we'll focus on Dirac's formulation of QM, which is based on Hilbert space. This is by far the most commonly used approach. However, there are some approaches to QM (notably deformation quantization and the theory of C*-algebras) in which Hilbert spaces do not play a prominent role. We won't discuss any of them.
⁴ In the maths literature, the inner product is often taken to be linear in the left entry and antilinear in the right. We'll follow the QM literature, which always uses the opposite convention.
⁵ A norm is a more fundamental notion than an inner product: there are normed vector spaces that do not have inner products, while every inner product space has a norm, defined as in (2.4).
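The inner product axioms (2.3) and the Cauchy–Schwarz inequality are easy to spot-check numerically in the concrete case H = C^n treated in the examples below. Here is a minimal Python sketch of such a check (assuming only numpy; it uses the physicists' convention of antilinearity in the first slot throughout):

    import numpy as np

    rng = np.random.default_rng(0)

    def ip(phi, psi):
        # physicists' convention: antilinear in the first (leftmost) argument;
        # np.vdot conjugates its first argument, which is exactly this convention
        return np.vdot(phi, psi)

    phi = rng.normal(size=4) + 1j * rng.normal(size=4)
    psi = rng.normal(size=4) + 1j * rng.normal(size=4)
    a = 2.0 - 3.0j

    assert np.isclose(ip(phi, psi), np.conj(ip(psi, phi)))          # conjugate symmetry
    assert np.isclose(ip(a * phi, psi), np.conj(a) * ip(phi, psi))  # antilinearity in first slot
    assert abs(ip(phi, psi))**2 <= ip(phi, phi).real * ip(psi, psi).real  # Cauchy-Schwarz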

As always, a set of vectors {φ₁, φ₂, ..., φ_n} is linearly independent iff the only solution to

    c₁φ₁ + c₂φ₂ + ··· + c_n φ_n = o                                          (2.5)

for c_i ∈ C is c₁ = c₂ = ··· = c_n = 0. The dimension of the vector space is the largest possible number of linearly independent vectors we can find. If there is no such largest number, then we say the vector space has infinite dimension. A set of vectors {φ₁, φ₂, ..., φ_n} is orthonormal if

    (φ_i, φ_j) = 0 when i ≠ j,  and  (φ_i, φ_j) = 1 when i = j.              (2.6)

An orthonormal set {φ₁, φ₂, ..., φ_n} forms a basis of an n-dimensional Hilbert space if every ψ ∈ H can be uniquely expressed as a sum ψ = Σ_{a=1}^n c_a φ_a, with some coefficients c_a ∈ C. We can determine these coefficients by taking the inner product with φ_b, since

    (φ_b, ψ) = (φ_b, Σ_a c_a φ_a) = Σ_a c_a (φ_b, φ_a) = c_b                 (2.7)

using the linearity of the inner product and orthonormality of the φ_a.

Quantum mechanics makes use of both finite– and infinite–dimensional Hilbert spaces, as we'll see. In the infinite dimensional case, we have to decide what we mean by an 'infinite linear combination' of (e.g. basis) vectors, as infinite sums such as Σ_{a=1}^∞ c_a φ_a might not converge. To handle this, one requires that H is complete in the norm (2.4), meaning that every Cauchy sequence {S₁, S₂, ...} converges in H. We say that the sum Σ_{a=1}^∞ c_a ψ_a converges to a vector Ψ if

    lim_{N→∞} ‖S_N − Ψ‖ = 0                                                  (2.8)

where S_N = Σ_{a=1}^N c_a ψ_a is the sum of just the first N terms. The interplay between the vector space structure and this requirement of completeness can be very subtle in infinite dimensions — as you'll see if you take the Part II course on Linear Analysis or (next term) Functional Analysis. In this course, we'll largely ignore such subtleties, not because they're not interesting, but because they're a distraction from all the interesting physics we need to learn!
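To see (2.7) and (2.8) in action in the simplest setting, the following sketch (again just numpy) builds a random orthonormal basis of C^n via a QR decomposition, extracts the coefficients c_a = (φ_a, ψ), and watches the partial sums S_N converge to ψ in norm:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 8

    # a random orthonormal basis of C^n: the columns of Q from a QR decomposition
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    basis, _ = np.linalg.qr(M)

    psi = rng.normal(size=n) + 1j * rng.normal(size=n)
    c = np.array([np.vdot(basis[:, a], psi) for a in range(n)])  # c_a = (phi_a, psi), eq (2.7)

    for N in range(1, n + 1):
        S_N = basis[:, :N] @ c[:N]                 # partial sum of the first N terms
        print(N, np.linalg.norm(S_N - psi))        # decreases to ~0 at N = n, cf. (2.8)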

2.1.1 Examples

It turns out that all finite dimensional (complex) Hilbert spaces are isomorphic to C^n for some n, with norm

    ‖u‖ = √( Σ_{i=1}^n |u_i|² )                                              (2.9)

and inner product

    (v, u) = Σ_{i=1}^n v_i* u_i                                              (2.10)

where {u_i} are complex numbers – the components of the vector u in the canonical basis. Being able to differentiate vectors in C^n requires that we can take limits, and to do this we need to know when the limits exist. Given an infinite sequence {u⁽¹⁾, u⁽²⁾, ...} of vectors, their sum Σ_{a=1}^∞ u⁽ᵃ⁾ is absolutely convergent to a vector in C^n iff the sum of the lengths Σ_{a=1}^∞ ‖u⁽ᵃ⁾‖ converges as a series of real numbers. Just as with real numbers, if a sequence of vectors converges absolutely, then there is some U ∈ C^n to which it converges, in the sense that

    ‖ U − Σ_{a=1}^N u⁽ᵃ⁾ ‖ → 0   as   N → ∞.                                 (2.11)

This expresses the completeness of the space C^n – heuristically, there are no points 'missing'. A simple infinite dimensional generalisation of this is the space of infinite sequences of complex numbers u = (u₁, u₂, u₃, ...) such that

    Σ_{i=1}^∞ |u_i|² < ∞ .                                                   (2.12)

This space is known as ℓ² and, heuristically, you can think of it as 'C^n with n = ∞'. The inner product between two such sequences u and v is defined as

    (v, u) = Σ_{i=1}^∞ v_i* u_i                                              (2.13)

as an obvious generalisation of the finite dimensional case (2.10). The Cauchy–Schwarz inequality gives |(v, u)| ≤ ‖v‖ ‖u‖ < ∞, so this inner product converges provided the norms do. Again, the completeness of ℓ² with respect to this norm enables us to meaningfully take limits and, ultimately, differentiate vectors u ∈ ℓ².

We will often also meet infinite dimensional vector spaces that are spaces of functions, such as the wavefunctions you dealt with throughout IB QM. Given two functions ψ : R^d → C and φ : R^d → C, we can define linear combinations aψ + bφ for any a, b ∈ C in the obvious way

    (aψ + bφ)(x) = aψ(x) + bφ(x)                                             (2.14)

where the multiplication and addition on the rhs are just those in C, so spaces of functions are naturally infinite dimensional vector spaces (as you saw in IB Methods). To turn this space of functions into a Hilbert space, we first give it the norm

    ‖ψ‖ = √( ∫_{R^d} |ψ(x)|² d^d x )                                         (2.15)

and require that ‖ψ‖ < ∞ – i.e. that the integral converges. In physics, functions for which ‖ψ‖ < ∞ holds are called normalisable. Note that just asking for (2.15) to converge is not a very strong restriction, and so our normed function space contains a very wide class of functions, including all piecewise continuous functions that decay sufficiently rapidly as

|x| → ∞, and even some functions that are singular for some discrete values of x, provided these singularities are not strong enough to cause the integral (2.15) to diverge⁶. Finally, we take the inner product between two functions ψ, φ whose norms are finite to be

    (φ, ψ) = ∫_{R^d} φ*(x) ψ(x) d^d x                                        (2.16)

and again this converges by Cauchy–Schwarz. In physics, we often call ψ(x) the wavefunction of the particle, and sometimes call (φ, ψ) the overlap integral between two wavefunctions.

⁶ The requirement that the space be complete in the norm (2.15) is rather subtle. If ‖ψ − φ‖ = 0, then we must identify ψ and φ as the same object in our space. This does not necessarily mean that they're identical as functions, because e.g. they could take different values at some discrete points x_i ⊂ R, as the non-zero value of ψ − φ at these discrete points would not contribute to (2.15). In particular, any function that is non-zero only at a discrete set of points should be identified with the zero function. The resulting space is known as L²(R^d, d^d x), or sometimes just L² for short. (The L stands for Lebesgue, and this is an example of a more general type of normed function space.) L²(R^d, d^d x) consists of equivalence classes of Cauchy sequences of functions that are convergent in the norm (2.15). In this course we'll mostly gloss over such technicalities, and they're certainly non-examinable. For a deeper discussion of Hilbert space, see the Part II Linear Analysis & Functional Analysis courses.
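As a concrete illustration of the overlap integral (2.16), here is a sketch approximating it by simple quadrature for two unit-width Gaussians separated by x₀ in one dimension, where the exact overlap works out to be e^{−x₀²/4}:

    import numpy as np

    x = np.linspace(-20, 20, 4001)
    dx = x[1] - x[0]
    x0 = 1.5

    psi = np.pi**-0.25 * np.exp(-x**2 / 2)           # normalised Gaussian
    phi = np.pi**-0.25 * np.exp(-(x - x0)**2 / 2)    # shifted copy

    overlap = np.sum(np.conj(phi) * psi) * dx         # eq (2.16) in d = 1
    print(overlap, np.exp(-x0**2 / 4))                # agree to quadrature accuracy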

2.1.2 Dual Spaces

As with any vector space, the dual H* of a Hilbert space H is the space of linear maps H → C. That is, an element ϕ ∈ H* defines a map ϕ : ψ ↦ ϕ(ψ) ∈ C for every ψ ∈ H, such that

    ϕ : aψ₁ + bψ₂ ↦ aϕ(ψ₁) + bϕ(ψ₂)                                          (2.17)

for all ψ₁, ψ₂ ∈ H and a, b ∈ C.

One way to construct such a map is to use the inner product: given some state φ ∈ H we can define an element (φ, · ) ∈ H* which acts by

    (φ, · ) : ψ ↦ (φ, ψ)                                                     (2.18)

i.e. we take the inner product of ψ with our chosen element φ ∈ H. The linearity properties of the inner product transfer to ensure that (φ, · ) is indeed a linear map — note that since the inner product is antilinear in its first entry, it's important that our chosen element φ sits in this first entry. In IB Linear Algebra you proved that, in finite dimensions, every element of H* arises this way. That is, any linear map ϕ : H → C can be written as (φ, · )

for some fixed choice of φ ∈ H. This means in particular that the inner product ( · , · ) provides a vector space isomorphism

    H* ≅ H .                                                                 (2.19)

Comfortingly, the same result also holds in infinite dimensions⁷, but it's non–trivial to prove and is known as the Riesz Representation Theorem. The isomorphism H* ≅ H is what is special about Hilbert spaces among various other infinite dimensional vector spaces, and makes them especially easy to handle.

⁷ We should really be more careful here, though we won't be concerned with the following subtleties in this course. In infinite dimensions we distinguish the algebraic dual space – the space of all linear functionals ϕ : H → C – from the continuous dual, where the number ϕ(ψ) is required to vary continuously as ψ varies in H. The Riesz Representation Theorem applies to the continuous dual. For example, the algebraic dual also contains distributions such as the Dirac δ, acting as δ : ψ ↦ δ[ψ] ≡ ψ(0) " = ∫ δ(x) ψ(x) dx ", with ψ ∈ L²(R, dx). Certainly δ(x) is not itself square-integrable, so is not in the Hilbert space. Moreover, since L²(R, dx) contains discontinuous functions, nor does δ[ψ] vary continuously with ψ. Functional analysis is almost always interested in the continuous dual, and this is often called just the dual space.

2.1.3 Dirac Notation and Continuum States

From now on, in this course we'll use a notation for Hilbert spaces that was introduced by Dirac and is standard throughout the theoretical physics literature. Dirac denotes an element of H as |ψ⟩, where the symbol '| ⟩' is known as a ket. An element of the dual space is written ⟨φ| and the symbol '⟨ |' is called a bra. The relation between the ket |φ⟩ ∈ H and the bra ⟨φ| ∈ H* is what we would previously have written as φ vs (φ, · ). The inner product between two states |ψ⟩, |φ⟩ ∈ H is then written ⟨φ|ψ⟩, forming a bra-ket or bracket. Note that this implicitly uses the isomorphism H* ≅ H provided by the inner product, building it into the notation. Recall also that (at least in all physics courses!) ⟨φ|ψ⟩ is antilinear in |φ⟩.

Given an orthonormal basis {|e_a⟩} of H, at least for the cases H ≅ C^n or H ≅ ℓ², in Dirac notation we can expand a general ket |ψ⟩ as

    |ψ⟩ = Σ_a ψ_a |e_a⟩                                                      (2.20)

in terms of this basis. Then ⟨χ|ψ⟩ = Σ_{a,b} χ_b* ψ_a ⟨e_b|e_a⟩ = Σ_a χ_a* ψ_a as usual.

It's very useful to be able to extend this idea also to function spaces. In this case, we introduce a 'continuum basis' with elements |a⟩ labelled by a continuous variable a, normalised so that

    ⟨a′|a⟩ = δ(a′ − a)                                                       (2.21)

using the Dirac δ-function. Then we write

    |ψ⟩ = ∫ ψ(a) |a⟩ da                                                      (2.22)

to expand a general |ψ⟩ in terms of the |a⟩'s. The point of the normalization (2.21) is that

    ⟨χ|ψ⟩ = ∫∫ χ*(b) ψ(a) ⟨b|a⟩ db da = ∫∫ χ*(b) ψ(a) δ(b − a) db da = ∫ χ*(a) ψ(a) da    (2.23)

which is just the inner product (and also norm) we gave L²(R, da) before. Indeed, a key example of a 'continuum basis' is the position basis {|x⟩}, where x ∈ R. Expanding a general state |ψ⟩ as an integral

    |ψ⟩ = ∫_R ψ(x′) |x′⟩ dx′ ,                                               (2.24)

we see that the complex coefficients are

    ⟨x|ψ⟩ = ∫_R ψ(x′) ⟨x|x′⟩ dx′ = ψ(x).                                     (2.25)

In other words, the position space wavefunctions we're familiar with are nothing but the coefficients of a state |ψ⟩ ∈ H in a particular (position) continuum basis.

As always, we could equally choose to expand this same vector in any number of different bases. For example, our state |ψ⟩ = ∫_R ψ(x) |x⟩ dx from above can equally be expanded in the momentum basis as |ψ⟩ = ∫_R ψ̃(p) |p⟩ dp, where the new coefficients ψ̃(p) = ⟨p|ψ⟩ are the momentum space wavefunction. Later, we'll show that ⟨x|p⟩ = e^{ixp/ħ}/√(2πħ), so these two sets of coefficients are related by

    ⟨x|ψ⟩ = ∫_R ψ̃(p) ⟨x|p⟩ dp = (1/√(2πħ)) ∫_R e^{ixp/ħ} ψ̃(p) dp
                                                                             (2.26)
    ⟨p|ψ⟩ = ∫_R ψ(x) ⟨p|x⟩ dx = (1/√(2πħ)) ∫_R e^{−ixp/ħ} ψ(x) dx .

(In the first line here, we expanded |ψ⟩ in the momentum basis, then took the inner product with |x⟩, while in the second we expanded the same |ψ⟩ in the position basis and took the inner product with |p⟩.) Equation (2.26) is just the statement that the position and momentum space wavefunctions are each other's Fourier transforms, again familiar from IB QM. The real point I wish to make is that the fundamental object is the abstract vector |ψ⟩ ∈ H. All the physical information about a quantum system is encoded in its state vector |ψ⟩; the wavefunctions ψ(x) or ψ̃(p) are merely the expansion coefficients in some basis. Like any other choice of basis, this expansion may be useful for some purposes and unhelpful for others.

Finally, a technical⁸ point. Although using 'continuum bases' such as {|x⟩} or {|p⟩} is convenient, because it emphasizes the similarities between the infinite and finite–dimensional cases, it blurs the analysis. In particular, if ⟨x′|x⟩ = δ(x′ − x) then the norm

    ‖ |x⟩ ‖² = δ(x − x) = δ(0) ???                                           (2.27)

so, whatever these objects |x⟩ are, they certainly do not lie in our Hilbert space. It is possible to make good mathematical sense of them by appropriately enlarging our spaces to include spaces of distributions, but for the most part (and certainly in this course) physicists are content to say that continuum states such as |x⟩ are allowable as basis elements, but call them non–normalizable states: actual physical particles are never represented by a non-normalizable state. If you're interested in a more detailed discussion, I recommend the book by Hall listed above.

⁸ And definitely non-examinable, at least in this course.
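Before moving on to operators, the Fourier relation (2.26) is easy to check numerically for a state whose transform is known in closed form. The sketch below (setting ħ = 1) computes ⟨p|ψ⟩ for a unit-width Gaussian by direct quadrature of the second line of (2.26) and compares it against the exact answer ψ̃(p) = π^{−1/4} e^{−p²/2}:

    import numpy as np

    x = np.linspace(-20, 20, 4001)
    dx = x[1] - x[0]
    psi = np.pi**-0.25 * np.exp(-x**2 / 2)    # Gaussian position-space wavefunction, hbar = 1

    def psi_tilde(p):
        # <p|psi> = (1/sqrt(2 pi)) \int e^{-ixp} psi(x) dx, the second line of (2.26)
        return np.sum(np.exp(-1j * x * p) * psi) * dx / np.sqrt(2 * np.pi)

    for p in (0.0, 0.5, 1.0):
        print(psi_tilde(p), np.pi**-0.25 * np.exp(-p**2 / 2))   # numeric vs exact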

2.2 Operators

A linear operator A is a map A : H → H that is compatible with the vector space structure in the sense that

    A( c₁|φ₁⟩ + c₂|φ₂⟩ ) = c₁ A|φ₁⟩ + c₂ A|φ₂⟩ ,                              (2.28)

in Dirac notation. All the operators we meet in Quantum Mechanics will be linear⁹, so henceforth we'll just call them 'operators'. Operators form an algebra: given two such linear operators A and B, we define their sum αA + βB as

⁹ Non-linear operators do play an important role in Quantum Field Theory, where interactions can change the number of particles.

    (αA + βB) : |φ⟩ ↦ αA|φ⟩ + βB|φ⟩                                          (2.29)

for all α, β ∈ C and all |φ⟩ ∈ H, and take their product AB to be the composition

    AB : |φ⟩ ↦ A ∘ B |φ⟩ = A( B|φ⟩ )                                         (2.30)

for all |φ⟩ ∈ H. One can check that both the sum and product of two linear operators is again a linear operator in the sense of (2.28)¹⁰. The operator algebra is associative, so A(BC) = (AB)C, but not commutative: in general AB ≠ BA, so the order in which the operators act is important. The difference between these two orderings is known as the commutator

¹⁰ In functional analysis, a linear operator A : H → H is bounded if ‖A|ψ⟩‖ ≤ M ‖|ψ⟩‖ for some fixed M > 0, and the space of such bounded linear operators on H is denoted B(H). Bounded linear operators thus map normalisable states to normalisable states, and so act on the whole Hilbert space H. This usually makes them the nicest ones to deal with. One of the reasons we'll be largely ignoring the analysis aspects of Hilbert space in this course is that the operators we commonly deal with in quantum mechanics, such as the position and momentum operators X and P, are unbounded. (For example, even when its wavefunction is correctly normalised, a particle can be located at arbitrarily large x ∈ R³, or have arbitrarily large momentum.) It's of course possible to handle such unbounded operators rigorously, but doing so involves an extra layer of technicality that would take us too far afield here. For further discussion, see the Part II course on Analysis of Functions, or the recommended book by Hall.

    [A, B] = AB − BA .                                                       (2.31)

This commutator obeys the following properties:

    antisymmetry       [A, B] = −[B, A]
    linearity          [α₁A₁ + α₂A₂, B] = α₁[A₁, B] + α₂[A₂, B]   ∀ α₁, α₂ ∈ C
    Leibniz identity   [A, BC] = [A, B]C + B[A, C]                           (2.32)
    Jacobi identity    [A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0 .
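Since these identities hold for arbitrary linear operators, they can be spot-checked with random matrices; a minimal numerical sketch:

    import numpy as np

    rng = np.random.default_rng(2)

    def rand_op(n=5):
        return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

    def comm(A, B):
        return A @ B - B @ A

    A, B, C = rand_op(), rand_op(), rand_op()

    assert np.allclose(comm(A, B), -comm(B, A))                          # antisymmetry
    assert np.allclose(comm(A, B @ C), comm(A, B) @ C + B @ comm(A, C))  # Leibniz identity
    jacobi = comm(A, comm(B, C)) + comm(B, comm(C, A)) + comm(C, comm(A, B))
    assert np.allclose(jacobi, 0)                                        # Jacobi identity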

Each of these may also be verified analytically, directly from the definitions (2.31) & (2.28).

Eigenstates of operators arise often in quantum mechanics, as I'm sure you remember from IB QM. A state ψ ∈ H is said to be an eigenstate of an operator A if

    A|ψ⟩ = a_ψ |ψ⟩                                                           (2.33)

where the number a_ψ ∈ C is known as the eigenvalue. Thus, if |ψ⟩ is an eigenstate of A, acting on |ψ⟩ with A returns exactly the same vector, except that it is rescaled by a number that may depend both on the state and the particular operator. The set of all eigenvalues of an operator A is sometimes called the spectrum of A, while the number of linearly independent eigenstates having the same eigenvalue is called the degeneracy of this eigenvalue.

One place where Dirac notation is particularly convenient is that it allows us to label states by their eigenvalues. We let |q⟩ denote an eigenstate of some operator Q with eigenvalue q, so that Q|q⟩ = q|q⟩. For example, a state that is an eigenstate of the position operator X with position x ∈ R³ (representing a particle that is definitely located at x) is written |x⟩, while the state |p⟩ is an eigenstate of the momentum operator P with eigenvalue p. There's a potential confusion here: |x⟩ does not refer to the function x, but rather to a state in H whose only support is at x. We saw above that the wavefunction corresponding to |x⟩ was really a δ-function.

In this course, I'll mostly use the (unconventional!) convention that operators are written using capital letters, with their eigenvalues being labelled by the same letter in lowercase. However, I won't stick to this religiously; a notable and deeply ingrained exception is to use E for the eigenvalues of the Hamiltonian operator H.

The extra structure of the inner product on H allows us to also define the adjoint A† of an operator A by

    ⟨φ|A†|ψ⟩ = ⟨ψ|A|φ⟩*   for all |φ⟩, |ψ⟩ ∈ H .                             (2.34)

One can easily¹¹ check that

    (A + B)† = A† + B† ,   (AB)† = B†A† ,   (αA)† = α*A†   and   (A†)† = A.  (2.35)

Note that it follows that [A, B]† = −[A†, B†]. Note also that the adjoint of the eigenvalue equation A|a⟩ = a|a⟩ is

    ⟨a|A† = a* ⟨a| .                                                         (2.36)

An operator Q is called Hermitian¹² if Q† = Q, so that ⟨φ|Q|ψ⟩ = ⟨ψ|Q|φ⟩* for all |φ⟩, |ψ⟩ ∈ H.

¹¹ At least in the finite dimensional case. We'll assume without being careful that it holds in the infinite dimensional case too.
¹² The terminology 'Hermitian' is used in the physics literature, coming from the fact that if H is finite dimensional, we can represent linear operators obeying Q† = Q by Hermitian matrices. In functional analysis, we'd have to be a little more careful about exactly which states we allow our operator to act on. For example, differentiating a non-smooth but square–integrable function will typically sharpen its singularities and may render it non-normalisable. Thus the momentum operator −iħ d/dx may take a state in H ≅ L²(R, dx) out of L²(R, dx), and to keep analytic control we should first limit the states on which we allow the momentum operator to act. Being more careful, we'd say that operators obeying ⟨ψ|Q|φ⟩ = ⟨φ|Q|ψ⟩* for all |φ⟩, |ψ⟩ ∈ H are symmetric, while symmetric operators for which the domain Dom(Q) ⊆ H of the operator is the same as the domain Dom(Q†) of the adjoint are called self-adjoint. The distinction between operators that are self-adjoint and those that are merely symmetric is a little technical, but has important consequences for their spectrum and the existence of eigenstates. See e.g. the Wikipedia page on self-adjoint operators for a basic discussion with examples. We'll largely ignore such subtleties in our course.

Hermitian operators are very special and have a number of important properties. Firstly, suppose |q⟩ is an eigenstate of a Hermitian operator Q, with eigenvalue q.

Then

    q ⟨q|q⟩ = ⟨q|Q|q⟩ = ⟨q|Q|q⟩* = q* ⟨q|q⟩ ,                                 (2.37)

so q = q* and the eigenvalues of a Hermitian operator are real. Second, suppose |q₁⟩ and |q₂⟩ are both eigenstates of Q with distinct eigenvalues q₁ and q₂. Then

    (q₁ − q₂) ⟨q₁|q₂⟩ = ⟨q₁|Q†|q₂⟩ − ⟨q₁|Q|q₂⟩ = 0 ,                          (2.38)

where the first equality uses the reality of q₁ and the second uses the fact that Q is Hermitian. Since q₁ ≠ q₂ by assumption, we must have ⟨q₁|q₂⟩ = 0, so eigenstates of Hermitian operators with distinct eigenvalues are orthogonal and, perhaps after rescaling, we can choose them to be orthonormal.

In the finite dimensional case, you proved in IB Linear Algebra that the set of eigenvectors of a given Hermitian operator forms a basis of H; that is, they form a complete, orthonormal set. This property allows us to express Hermitian operators in a form that is often useful. If {|n⟩} is an orthonormal basis of eigenstates of a Hermitian operator Q, with eigenvalues {q_n}, then we can write

    Q = Σ_n q_n |n⟩⟨n|                                                       (2.39)

where we think of this as acting on some |ψ⟩ ∈ H by

    Q|ψ⟩ = Σ_n q_n |n⟩ ⟨n|ψ⟩ .                                               (2.40)

Note that indeed Q|ψ⟩ ∈ H, since the ⟨n|ψ⟩ are just complex numbers, whereas each term in the sum also involves |n⟩ ∈ H. In particular, expressing |ψ⟩ in this basis as |ψ⟩ = Σ_m c_m |m⟩ gives

    Q|ψ⟩ = Σ_{n,m} q_n c_m |n⟩ ⟨n|m⟩ = Σ_n q_n c_n |n⟩                       (2.41)

assuming the basis is orthonormal. One reason this representation of Q is useful is that it allows us to define functions of operators. We set

    f(Q) = Σ_n f(q_n) |n⟩⟨n|                                                 (2.42)

which makes sense provided f(q_n) is defined for all eigenvalues q_n of the original operator. For example, the inverse of an operator Q is the operator

    Q⁻¹ = Σ_n (1/q_n) |n⟩⟨n|                                                 (2.43)

which makes sense provided q_n ≠ 0 for all n, while

    ln Q = Σ_n ln(q_n) |n⟩⟨n|                                                (2.44)

is meaningful whenever the eigenvalues of Q are all positive.
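In finite dimensions, (2.39)–(2.44) amount to nothing more than the eigendecomposition of a Hermitian matrix. The following sketch builds f(Q) = Σ_n f(q_n)|n⟩⟨n| from the spectral data of a randomly generated positive Hermitian Q and checks (2.39) and (2.43):

    import numpy as np

    rng = np.random.default_rng(3)
    n = 4
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q = M @ M.conj().T + np.eye(n)     # Hermitian, with strictly positive eigenvalues

    q, V = np.linalg.eigh(Q)           # eigenvalues q_n; columns of V are the eigenstates |n>

    def f_of_Q(f):
        # f(Q) = sum_n f(q_n) |n><n| , eq (2.42)
        return sum(f(q[k]) * np.outer(V[:, k], V[:, k].conj()) for k in range(n))

    assert np.allclose(f_of_Q(lambda s: s), Q)                 # reconstructs Q, eq (2.39)
    assert np.allclose(f_of_Q(lambda s: 1 / s) @ Q, np.eye(n)) # the inverse, eq (2.43)
    lnQ = f_of_Q(np.log)                                       # well-defined here, eq (2.44)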

A particularly important example of this is that we can represent the identity operator 1_H on H as

    1_H = Σ_n |n⟩⟨n|    or    1_H = ∫ |q⟩⟨q| dq                              (2.45)

where {|n⟩} or {|q⟩} are any bases of H, respectively for discrete or continuous labels. This is often convenient for moving between different bases. For example, if {|p⟩} denotes a complete set of eigenstates of the momentum operator P, with P|p⟩ = p|p⟩, then to represent this operator acting on a state |ψ⟩ in the position basis we write

    ⟨x|P|ψ⟩ = ∫ ⟨x|P|p⟩ ⟨p|ψ⟩ dp = ∫ ⟨x|p⟩ p ψ̃(p) dp
            = (1/√(2πħ)) ∫ e^{ixp/ħ} p ψ̃(p) dp
            = −iħ ∂/∂x [ (1/√(2πħ)) ∫ e^{ixp/ħ} ψ̃(p) dp ] .                  (2.46)

In the first equality here we've squeezed the identity operator 1_H = ∫ |p⟩⟨p| dp in between the momentum operator P and the general state |ψ⟩, then used the fact that |p⟩ is an eigenstate of P, and finally used our previous expression for ⟨x|p⟩ and standard properties of Fourier transforms. We recognise the integral in the final expression as the position space wavefunction of |ψ⟩, so altogether we have

    ⟨x|P|ψ⟩ = −iħ ∂/∂x ⟨x|ψ⟩ ,                                               (2.47)

or P̂ψ(x) = −iħ ∂ψ/∂x in IB notation. This and similar manipulations will be used repeatedly throughout the course, so it's important to make sure you're comfortable with them.

If H is finite dimensional, we can represent any linear operator on H by a matrix. We pick an orthonormal basis {|n⟩} and define

    A_nm := ⟨n|A|m⟩                                                          (2.48)

as usual. In particular, inserting the identity operator in the form 1_H = Σ_n |n⟩⟨n| shows that the matrix elements of the product operator AB are

    (AB)_km = ⟨k|AB|m⟩ = Σ_n ⟨k|A|n⟩ ⟨n|B|m⟩ = Σ_n A_kn B_nm ,                (2.49)

which just corresponds to the usual rule of matrix multiplication. Hermitian operators are represented by Hermitian matrices, i.e. ones whose elements obey A_nm = A*_mn, hence the name¹³.

¹³ Though see the previous footnote for subtleties in the infinite dimensional case.

2.3 Composite Systems

We often have to deal with systems with more than a single degree of freedom. This could be just a single particle moving in three dimensions rather than one, or a composite system such as a Hydrogen atom consisting of both an electron and a proton, or a diamond which consists of a very large number of carbon atoms. In the case of a macroscopic object, our system could be made up of a huge number of constituent parts, each able to behave differently, at least in principle. It should be intuitively clear that the more complicated our system is, meaning the more independent degrees of freedom it possesses, the larger a Hilbert space we'll need if we wish to encode all its possible states. We'll now understand how quantum mechanics handles systems with more than one degree of freedom by taking the tensor product of the Hilbert spaces of its constituent parts.

2.3.1 Tensor Product of Hilbert Spaces

Hilbert spaces are vector spaces over C, so we can try to define their tensor product in the same way as we would for any pair of vector spaces. Recall from IB Linear Algebra that if H₁ and H₂ are two finite dimensional Hilbert spaces, with {|e_a⟩}_{a=1}^m a basis of H₁ and {|f_α⟩}_{α=1}^n a basis of H₂, then the tensor product H₁ ⊗ H₂ is again a vector space over C, spanned by all pairs of elements |e_a⟩ ⊗ |f_α⟩ chosen from the two bases¹⁴.

¹⁴ If you're a purist (and the PQM Tripos examiners are not), then there's something slightly unsatisfactory about this definition of the tensor product vector space, which is that it apparently depends on a choice of basis on both H₁ and H₂. We can do better, using just the abstract vector space structure, as follows. For any pair of vector spaces V and W, we first define the free vector space F(V × W) to be the set of all possible pairs of elements chosen from V and W. This is a vector space, with the sum of a pair (v, w) and a pair (v′, w′) denoted simply by (v, w) + (v′, w′). (Do not confuse the pair (v, w) ∈ F(V × W) with an inner product! Here v and w live in different spaces.) By itself, F(V × W) does not care that V and W are already vector spaces. For example, if {e_i} is a basis for V and v = c₁e₁ + c₂e₂ for some coefficients c₁, c₂, the elements (v, w) and c₁(e₁, w) + c₂(e₂, w) are nonetheless distinct in F(V × W). We'd like to teach the free vector space about the structure inherited from V and W individually. We thus define the tensor product V ⊗ W to be the quotient F(V × W)/∼ under the equivalence relations

    (v, w) + (v′, w) ∼ (v + v′, w) ,   (v, w) + (v, w′) ∼ (v, w + w′)   and   c(v, w) ∼ (cv, w) ∼ (v, cw)

for all v ∈ V, w ∈ W and all c ∈ C. We denote the equivalence class of the pair (v, w) by v ⊗ w.

It follows that

    dim(H₁ ⊗ H₂) = dim(H₁) dim(H₂)                                           (2.50)

provided these are finite. This should be familiar from IB Linear Algebra.

It's important to stress that, as with any vector space, a general element of H₁ ⊗ H₂ is a linear combination of these basis elements. In particular, although general elements of H₁ and H₂ can be written

    |ψ₁⟩ = Σ_{a=1}^m c_a |e_a⟩    and    |ψ₂⟩ = Σ_{α=1}^n d_α |f_α⟩ ,        (2.51)


it is not true that a general element of H₁ ⊗ H₂ necessarily takes the form |ψ₁⟩ ⊗ |ψ₂⟩. Rather, a general element of H₁ ⊗ H₂ may be written as

In particular, elements of the form ψ1 ψ2 1 2 are specified by only dim 1 + | i ⊗ | i ∈ H ⊗ H H dim 2 complex coefficients, vastly fewer than is required to specify a generic element (2.52). H Elements of the form ψ φ are sometimes called simple, while in physics generic ele- | i ⊗ | i ments of the form (2.52) are said to be entangled. In section ?? we’ll explore some of the vast resources this entanglement opens up to quantum mechanics; you’ll see much more of it next term if you take the course on & Computation.

In order to make H₁ ⊗ H₂ a Hilbert space, rather than just a vector space, we must give it an inner product. We do this in the obvious way: if ⟨ | ⟩₁ and ⟨ | ⟩₂ denote the inner products on H₁ and H₂, we first define the inner product between each pair of basis elements of H₁ ⊗ H₂ by

    ( ⟨e_a| ⊗ ⟨f_α| )( |e_b⟩ ⊗ |f_β⟩ ) = ⟨e_a|e_b⟩₁ ⟨f_α|f_β⟩₂ ,             (2.53)

and then extend it to arbitrary states |Ψ⟩ by linearity. Note in particular that if {|e_a⟩} and {|f_α⟩} are orthonormal bases of their respective Hilbert spaces, then {|e_a⟩ ⊗ |f_α⟩} will also be an orthonormal basis of H₁ ⊗ H₂.

The most common occurrence of this is simply a single particle moving in more than one dimension. For example, suppose a quantum particle moves in R², described by Cartesian coordinates (x, y). If {|x⟩}_{x∈R} is a complete set of position eigenstates for the x-direction, and {|y⟩}_{y∈R} likewise a complete set for the y-direction, then the state of the particle can be expanded as

    |ψ⟩ = ∫_{R×R} ψ(x, y) |x⟩ ⊗ |y⟩ dx dy                                    (2.54)

where {|x⟩ ⊗ |y⟩}_{x,y∈R} form a continuum basis of the tensor product space. Note that in general, our wavefunction is not a simple product ψ(x, y) = ψ₁(x)ψ₂(y) of wavefunctions in the two directions separately, though it may be possible to write it as the sum of such products, as you met repeatedly when studying separation of variables in IB Methods. In this continuum case, the inner product between any two states is

    ⟨χ|ψ⟩ = ∫_{R²×R²} χ*(x′, y′) ψ(x, y) ⟨x′|x⟩ ⟨y′|y⟩ dx′ dy′ dx dy
          = ∫_{R²×R²} χ*(x′, y′) ψ(x, y) δ(x − x′) δ(y − y′) dx′ dy′ dx dy   (2.55)
          = ∫_{R²} χ*(x, y) ψ(x, y) dx dy .

This is exactly the structure of L²(R², d²x), and we identify¹⁵ L²(R², d²x) ≅ L²(R, dx) ⊗ L²(R, dy).

¹⁵ There's actually one more – again non-examinable! – step in the infinite dimensional case: we must take the completion of L²(R, dx) ⊗ L²(R, dy) in the norm associated to this inner product. There is no problem if our general state ψ(x, y) can be written as a finite sum of products of square-integrable wavefunctions, say φ_a(x) and ρ_b(y) in each of the two directions separately, for then

    ∫_{R²} |ψ(x, y)|² dx dy = Σ_{a,a′,b,b′} ∫_{R²} φ*_{a′}(x) ρ*_{b′}(y) φ_a(x) ρ_b(y) dx dy
                            = Σ_{a,a′} ( ∫_R φ*_{a′}(x) φ_a(x) dx ) Σ_{b,b′} ( ∫_R ρ*_{b′}(y) ρ_b(y) dy ) < ∞

with convergence guaranteed by the properties of the inner product on each copy of L²(R, dx). However, there are functions in L²(R², d²x) that cannot be so expressed as a finite sum. A simple example is the function f : R² → C given by

    f : (x, y) ↦ 1 when √(x² + y²) ≤ a,  and  0 otherwise.                   (2.56)

It is clear that this function is square-integrable over R², but it cannot be written as a finite sum of products of functions of x and y separately. It thus lies in the completion of L²(R, dx) ⊗ L²(R, dy), but not in this space itself. The Hilbert space completion of H₁ ⊗ H₂ is often denoted H₁ ⊗̂ H₂, so more correctly we have L²(R², d²x) ≅ L²(R, dx) ⊗̂ L²(R, dy). We'll ignore such subtleties in this course, saving a proper treatment for the Functional Analysis course.

More commonly, physics considers quantum systems that live in R³, described by a Hilbert space H ≅ L²(R³, d³x) obtained from the tensor product of three copies of the Hilbert space for a one-dimensional system. We can expand a general state in this Hilbert space as

    |ψ⟩ = ∫_{R³} ψ(x, y, z) |x⟩ ⊗ |y⟩ ⊗ |z⟩ dx dy dz = ∫_{R³} ψ(x) |x⟩ d³x   (2.57)

where the last expression is just a convenient shorthand. Going further, to describe a composite system such as a Hydrogen atom that consists of two particles (an electron and proton), we need to take the tensor product of the individual Hilbert spaces of each particle. Thus (neglecting the spin of the electron and proton) the atom is described by a state

    |Ψ⟩ = ∫_{R⁶} Ψ(x_e, x_p) |x_e⟩ ⊗ |x_p⟩ d³x_e d³x_p                       (2.58)

where Ψ(x_e, x_p) = ( ⟨x_e| ⊗ ⟨x_p| )|Ψ⟩ is the amplitude for the atom's electron to be found at x_e while its proton is at x_p.

Once we know the Hilbert spaces appropriate to describe the fundamental constituents of our system, we can build up the Hilbert space for the combined system by taking tensor products. We should then ask 'What is the correct Hilbert space to use to describe the fundamental particles in our system?'. Ultimately, this question can only be answered by carrying out an experiment. For example, experiments performed by Stern & Gerlach (see section 5.3.2) showed that a single electron is in fact described by a Hilbert space H_e = L²(R³, d³x) ⊗ C², formed from the tensor product of the electron's position space wavefunction with the two-dimensional Hilbert space C² describing the electron's 'internal' degrees of freedom. Thus, letting {|↑⟩, |↓⟩} be an orthonormal basis of C², a generic state of the electron (whether or not it's part of a Hydrogen atom) is

    |Ψ⟩ = |φ⟩ ⊗ |↑⟩ + |χ⟩ ⊗ |↓⟩ ,                                            (2.59)

given in the position representation by

    ⟨x_e|Ψ⟩ = φ(x_e) |↑⟩ + χ(x_e) |↓⟩                                        (2.60)

involving a pair of wavefunctions φ, χ ∈ L²(R³, d³x). We'll investigate this in detail in section 5.3.

Let's now understand how operators act on our composite system. Given linear operators A : H₁ → H₁ and B : H₂ → H₂, we define the linear operator A ⊗ B : H₁ ⊗ H₂ → H₁ ⊗ H₂ by

    (A ⊗ B) : |e_a⟩ ⊗ |f_α⟩ ↦ (A|e_a⟩) ⊗ (B|f_α⟩)                            (2.61)

and we extend this definition to arbitrary states in H₁ ⊗ H₂ by linearity. In particular, the operator A acting purely on H₁ corresponds to the operator A ⊗ 1_{H₂} when acting on the tensor product, and similarly 1_{H₁} ⊗ B acts non–trivially just on the second factor in the tensor product. Note that since they act on separate Hilbert spaces,

    [ A ⊗ 1_{H₂} , 1_{H₁} ⊗ B ] = 0                                          (2.62)

for any operators A, B, even if these operators happen not to commute when acting on the same Hilbert space.

As an example, let's again consider the case of the Hydrogen atom, where the Hamiltonian takes the form

    H = (P_e²/2m_e) ⊗ 1_p + 1_e ⊗ (P_p²/2m_p) + V(X_e, X_p) .                (2.63)

The kinetic terms for the electron and proton each act non–trivially only on one factor in the tensor product, while the Coulomb interaction

    V(X_e, X_p) = − e² / |X_e − X_p|                                         (2.64)

depends non–trivially on the location of each particle. In this case, since V depends only on the relative separation of the two, it's actually more convenient to view the tensor product differently, writing

    H_Hyd = H_e ⊗ H_p ≅ H_com ⊗ H_rel                                        (2.65)

to split it in terms of the states describing the behaviour of the centre of mass and those describing the relative states. Defining the centre of mass and relative operators

    X_com = (m_e X_e + m_p X_p) / (m_e + m_p) ,    P_com = P_e + P_p
                                                                             (2.66)
    X = X_e − X_p ,    P = (m_p P_e − m_e P_p) / (m_e + m_p) ,

the Hamiltonian becomes

    H = (P_com²/2M) ⊗ 1_rel + 1_com ⊗ ( P²/2µ − e²/|X| )                     (2.67)

where M = m_e + m_p is the total mass and µ = m_e m_p / M the reduced mass. Such calculations should be familiar from studying planetary orbits in IA Dynamics & Relativity.

Henceforth we'll often omit the ⊗ symbol — both in states and in operators — when the meaning is clear.
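Returning briefly to (2.61) and (2.62): on C^m ⊗ C^n the operator A ⊗ B is represented by the Kronecker product of the corresponding matrices, and a quick numerical check (a sketch, assuming numpy) confirms that operators acting on different factors always commute:

    import numpy as np

    rng = np.random.default_rng(4)
    A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))   # acts on H1 = C^2
    B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))   # acts on H2 = C^3
    I1, I2 = np.eye(2), np.eye(3)

    AI = np.kron(A, I2)      # A (x) 1_{H2}
    IB = np.kron(I1, B)      # 1_{H1} (x) B

    assert np.allclose(AI @ IB, IB @ AI)          # eq (2.62)
    assert np.allclose(AI @ IB, np.kron(A, B))    # both products equal A (x) B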

2.4 Postulates of Quantum Mechanics

So far, we've just been doing linear algebra, talking about Hilbert spaces in a fairly abstract way. Let's now begin to connect this to some physics.

The first postulate of quantum mechanics says that our system is described (up to a redundancy discussed below) by some state |ψ⟩ ∈ H, and that any complete set of orthogonal states {|φ₁⟩, |φ₂⟩, ...} is in one-to-one correspondence with all the possible outcomes of the measurement of some quantity. Further, if a system is prepared to be in some general state

    |ψ⟩ = Σ_a c_a |φ_a⟩                                                      (2.68)

then the probability that the measurement will yield an outcome corresponding to the state |φ_b⟩ is

    Prob( |ψ⟩ → |φ_b⟩ ) = |⟨φ_b|ψ⟩|² / ( ⟨ψ|ψ⟩ ⟨φ_b|φ_b⟩ )                    (2.69)

or equivalently Prob( |ψ⟩ → |φ_b⟩ ) = |c_b|² ⟨φ_b|φ_b⟩ / ⟨ψ|ψ⟩. The denominators in (2.69) allow for the possibility that the states |ψ⟩ and |φ_b⟩ may have arbitrary norm. Notice that Prob( |ψ⟩ → |φ_b⟩ ) ≥ 0 and also, since ⟨ψ|ψ⟩ = Σ_{a,b} c*_a c_b ⟨φ_a|φ_b⟩ = Σ_b |c_b|² ⟨φ_b|φ_b⟩ by the orthogonality of the |φ_b⟩'s, that

    Σ_b Prob( |ψ⟩ → |φ_b⟩ ) = Σ_b |c_b|² ⟨φ_b|φ_b⟩ / ⟨ψ|ψ⟩ = ⟨ψ|ψ⟩ / ⟨ψ|ψ⟩ = 1 .   (2.70)

Thus the probabilities sum to one.

Let me make some remarks. Firstly, note that we've not specified exactly which Hilbert space we should use; is H to be C^n, or ℓ², or L²(R³, d³x), or something else? In fact, the appropriate choice depends on the system we're trying to describe. For example, a single point particle with no internal structure could be described using H ≅ L²(R³, d³x)¹⁶, whereas to specify the state of a more complicated system with several degrees of freedom we'll need a larger Hilbert space, encoding for example the location of its centre of mass, but also details of the system's orientation. We saw how to do this in section 2.3.

¹⁶ Note that we only require H to be isomorphic to the space L²(R³, d³x) of normalizable position-space wavefunctions. We could equally well describe our structureless particle using a momentum space wavefunction, and indeed L²(R³, d³p) ≅ L²(R³, d³x), with the isomorphism provided by the Fourier transform.

Second, the fact that quantum mechanics only predicts the probabilities of obtaining a given experimental outcome – even in principle – is one of its most puzzling features. Originally, this probabilistic interpretation of wavefunctions was due to Max Born, and (2.69) is sometimes known as the Born rule. He found that, solving the Schrödinger equation for scattering a particle off a generic obstacle in R³, the particle's wavefunction would typically


become very spread out so as to have appreciable magnitude over a wide region. This is in contrast to our experience of particles bouncing off targets, each in a certain specific way. For example, recall that when performing his gold foil experiment to probe the structure of the atom, Rutherford usually observed α-particles to plough straight through, but occasionally saw them rebound back, revealing the presence of the dense nucleus. Nucleus or not, each α-particle was observed at a specific angle from the target, and did not itself 'spread out'. Reconciling this observation with the behaviour of the wavefunction is what led Born to suggest that quantum mechanics should be inherently probabilistic. His reasoning was perhaps not totally sound, and we'll explore the interpretative meaning of QM further and from a more modern perspective in chapter 10.

As a final remark on this postulate, to clean up the basic probability rule (2.69), it's usually convenient to assume our states are normalised, meaning that

    ⟨ψ|ψ⟩ = 1 ,                                                              (2.71)

and to expand them in an orthonormal basis {|φ_b⟩}. In this case, the above probability is simply

    Prob( |ψ⟩ → |φ_b⟩ ) = |⟨φ_b|ψ⟩|² ,                                       (2.72)

while the quantity ⟨φ_b|ψ⟩ ∈ C itself is often called the probability amplitude, or just amplitude for short. In what follows, we'll almost always use this simpler expression, but it's important to recall that it only holds in the case of a properly normalised state expanded in an orthonormal basis. Even when all states are properly normalised, we're still free to redefine |ψ⟩ → e^{iα}|ψ⟩ for some constant phase α. This phase drops out of both the normalisation condition and the Born rule (2.72) for the probability. Taking into account both the normalisation condition and the phase freedom, in quantum mechanics physical states do not correspond to elements |ψ⟩ ∈ H but rather to rays through the origin. That is, provided |ψ⟩ ≠ 0, the entire family of states

    |ψ_λ⟩ = λ|ψ⟩    for λ ∈ C*                                               (2.73)

define the same physical system — by tuning |λ| = 1/√⟨ψ|ψ⟩ we can always find a member of this family that's correctly normalised, and the (constant) phase of λ cannot affect the probability of any outcome of a physical experiment. (This is the redundancy referred to above.) Note in particular that the zero vector 0 – the unique element of H with vanishing norm – never represents a physical state. Geometrically, the equivalence |ψ⟩ ∼ λ|ψ⟩ for all |ψ⟩ ∈ H \ {0} and all λ ∈ C* means that physical states actually correspond to elements of the projective Hilbert space PH. As with any projective space, it's often most convenient to work simply with normalised vectors |ψ⟩ ∈ H themselves, and recall that the overall phase is irrelevant.

The second postulate of quantum mechanics states that observable quantities are represented by Hermitian linear operators. In particular, upon measurement of a quantity corresponding to a Hermitian operator Q, a state is certain to return the definite value q iff it is an eigenstate of Q with eigenvalue q. Let {|n⟩} be a complete, orthonormal set of

The second postulate of quantum mechanics states that observable quantities are represented by Hermitian linear operators. In particular, upon measurement of a quantity corresponding to a Hermitian operator Q, a state is certain to return the definite value q iff it is an eigenstate of Q with eigenvalue q. Let $|n\rangle$ be a complete, orthonormal set of eigenstates of some Hermitian operator Q, with $Q|n\rangle = q_n|n\rangle$. Then we can expand an arbitrary state $|\psi\rangle$ in this basis as $|\psi\rangle = \sum_n c_n|n\rangle$. The expectation value of Q in this state is
$$ \langle Q\rangle_\psi = \langle\psi|Q|\psi\rangle = \sum_{m,n} \bar c_m c_n\,\langle m|Q|n\rangle = \sum_n q_n\,|c_n|^2\,, \qquad(2.74) $$
using the orthonormality of the basis. This is just the sum of values of Q possessed by the states $|n\rangle$, weighted by the probability that $|\psi\rangle$ agrees with $|n\rangle$.

Since operators representing observables are Hermitian, for any such operator we have

$$ \langle\psi|Q^2|\psi\rangle = \langle\psi|Q^\dagger Q|\psi\rangle = \|Q|\psi\rangle\|^2 \geq 0 \qquad(2.75) $$
and hence
$$ 0 \leq \langle\psi|(Q - \langle Q\rangle_\psi)^2|\psi\rangle = \langle Q^2\rangle_\psi - \langle Q\rangle_\psi^2\,. \qquad(2.76) $$
This shows that $\langle Q^2\rangle_\psi \geq \langle Q\rangle_\psi^2$, with equality iff $|\psi\rangle$ is an eigenstate of Q. We define the rms deviation $\Delta_\psi Q$ of Q from its mean $\langle Q\rangle_\psi$ in a state $|\psi\rangle$ by
$$ \Delta_\psi Q = \sqrt{\langle\psi|(Q - \langle Q\rangle_\psi)^2|\psi\rangle}\,. \qquad(2.77) $$
This is just the usual definition familiar from probability. As always, it gives us a measure of how widely spread around the mean the outcomes of measuring Q in the state $|\psi\rangle$ will be. This implies that we can be sure of the value we'll obtain when measuring an observable quantity only if we somehow know that our state is an eigenstate of the corresponding operator before carrying out the measurement.

Let me emphasize that these two postulates do not say anything about how the physical process of actually carrying out a measurement is described in the formalism of QM. (Nor do they even tell us what constitutes a 'measurement'.) In particular, we do not say that measuring the observable corresponding to some Hermitian operator Q has anything to do with the mathematical operation of acting on our state $|\psi\rangle$ with Q. According to the Copenhagen interpretation, if we measure the observable corresponding to Q and find the result q, then immediately after this measurement, our system must be in the state $|q\rangle$, because we've just learned that it does indeed have this value. The wavefunction is assumed to have collapsed from whatever it was before we measured it to $|q\rangle$. The Copenhagen interpretation is the most widely accepted version of quantum mechanics, but it's not uncontroversial. We'll try to understand measurement from a deeper perspective in section 10.2.

The final postulate of quantum mechanics is that the state $|\psi\rangle$ of our system evolves in time according to the Schrödinger equation
$$ i\hbar\,\frac{\partial}{\partial t}|\psi\rangle = H|\psi\rangle\,, \qquad(2.78) $$
where H is some distinguished operator known as the Hamiltonian.

2.4.1 The Generalized Uncertainty Principle

Suppose we have two Hermitian operators A and B. Let's define $|\psi_A\rangle$ to be the state $(A - \langle A\rangle_\psi)|\psi\rangle$, so that $(\Delta_\psi A)^2 = \||\psi_A\rangle\|^2$, and similarly $|\psi_B\rangle = (B - \langle B\rangle_\psi)|\psi\rangle$. Then the Cauchy–Schwarz inequality says

$$ (\Delta_\psi A)^2\,(\Delta_\psi B)^2 = \||\psi_A\rangle\|^2\,\||\psi_B\rangle\|^2 \geq |\langle\psi_A|\psi_B\rangle|^2\,. \qquad(2.79) $$
Expanding the rhs, we have

$$ \langle\psi_A|\psi_B\rangle = \langle\psi|(A - \langle A\rangle_\psi)(B - \langle B\rangle_\psi)|\psi\rangle = \langle\psi|AB|\psi\rangle - \langle A\rangle_\psi\,\langle B\rangle_\psi\,. \qquad(2.80) $$
Since we're considering Hermitian operators A, B, we have $\overline{\langle\psi|AB|\psi\rangle} = \langle\psi|BA|\psi\rangle$, so
$$ \mathrm{Im}\,\langle\psi_A|\psi_B\rangle = \frac{1}{2i}\,\langle\psi|[A,B]|\psi\rangle\,. \qquad(2.81) $$
Combining this with the Cauchy–Schwarz inequality gives the generalised uncertainty relation
$$ (\Delta_\psi A)^2\,(\Delta_\psi B)^2 \geq |\langle\psi_A|\psi_B\rangle|^2 \geq \frac{1}{4}\,|\langle[A,B]\rangle_\psi|^2\,. \qquad(2.82) $$
In particular, if $[A,B] \neq 0$ we cannot in general find states that simultaneously have definite values for both the quantities represented by A and B.

As a particular case, recall from IB QM that the position and momentum operators obey the commutation relation $[X,P] = i\hbar$. Using this in (2.82) gives

$$ \Delta_\psi X\,\Delta_\psi P \geq \frac{\hbar}{2}\,, \qquad(2.83) $$
so that no state can have a definite value of both position and momentum.[17] This is the original uncertainty principle of Heisenberg.

[17] The inequality is in fact saturated by states whose position space wavefunctions are Gaussians, so despite the fact that our derivation above neglected several positive semi-definite terms, the bound on the minimum uncertainty cannot be improved.
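One can watch (2.83) being saturated numerically. The sketch below is an illustration only, assuming we may represent X and P by finite matrices in the energy basis of the harmonic oscillator of the next chapter (truncated to n_max levels), and computes $\Delta_\psi X\,\Delta_\psi P$ for the oscillator's Gaussian ground state:

    import numpy as np

    hbar, m, omega, n_max = 1.0, 1.0, 1.0, 40
    A = np.diag(np.sqrt(np.arange(1, n_max)), k=1)   # lowering operator: A|n> = sqrt(n)|n-1>
    X = np.sqrt(hbar/(2*m*omega)) * (A + A.T)        # X in terms of A, A† (see section 3.2)
    P = 1j*np.sqrt(m*omega*hbar/2) * (A.T - A)       # P = i sqrt(m w hbar/2)(A† - A)

    def spread(Q, psi):
        """rms deviation Delta_psi Q of (2.77) for a normalised state psi."""
        mean = np.real(psi.conj() @ Q @ psi)
        return np.sqrt(np.real(psi.conj() @ Q @ Q @ psi) - mean**2)

    psi = np.zeros(n_max); psi[0] = 1.0              # the Gaussian ground state |0>
    print(spread(X, psi) * spread(P, psi))           # 0.5 = hbar/2: the bound is saturated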

3 The Harmonic Oscillator

I now want to use Dirac's formalism to study a simple system – the one-dimensional harmonic oscillator – with which you should already be familiar. Our aim here is not to learn new things about harmonic oscillators; indeed, we'll mostly just recover results you've known about since you first heard of simple harmonic motion. Rather, our aim is to get accustomed to this more abstract approach to QM, seeing how it's a very powerful way to think about the subject. The steps we follow in our treatment of the harmonic oscillator will form a good prototype for the way we'll approach many other problems in QM. We'll see these same steps repeated in various contexts throughout the course.

3.1 Raising and Lowering Operators

The Hamiltonian of a harmonic oscillator of mass m and classical frequency ω is
$$ H = \frac{P^2}{2m} + \frac{1}{2}m\omega^2 X^2\,, \qquad(3.1) $$
where X and P are the position and momentum operators, respectively. To analyse this, we begin by defining the dimensionless combination
$$ A = \frac{1}{\sqrt{2m\hbar\omega}}\,(m\omega X + iP)\,. \qquad(3.2) $$
A is not a Hermitian operator, but since X and P are both Hermitian, we see that the adjoint of A is
$$ A^\dagger = \frac{1}{\sqrt{2m\hbar\omega}}\,(m\omega X - iP)\,. \qquad(3.3) $$
Roughly, the motivation for introducing these operators is that they allow us to 'factorize' the Hamiltonian. More precisely, we have
$$ A^\dagger A = \frac{1}{2m\hbar\omega}\,(m\omega X - iP)(m\omega X + iP) = \frac{1}{2m\hbar\omega}\left(P^2 + m^2\omega^2 X^2 + im\omega[X,P]\right) = \frac{H}{\hbar\omega} - \frac{1}{2}\,, \qquad(3.4) $$
so we can write our Hamiltonian as
$$ H = \hbar\omega\left(A^\dagger A + \frac{1}{2}\right) = \hbar\omega\left(N + \frac{1}{2}\right)\,, \qquad(3.5) $$
where
$$ N = A^\dagger A\,. \qquad(3.6) $$
A and A† are often called lowering and raising operators, respectively, whilst N is often called the number operator. (The reason for these names will become apparent soon.) Notice that computing the spectrum of H is obviously equivalent to computing the spectrum of N.

Whenever we're presented with some new operators, the first thing to do is to work out their commutation relations. In this case, the fundamental commutation relation $[X,P] = i\hbar$ shows that
$$ [A, A^\dagger] = \frac{1}{2m\hbar\omega}\left(m^2\omega^2[X,X] - im\omega[X,P] + im\omega[P,X] + [P,P]\right) = -\frac{im\omega}{2m\hbar\omega}\left([X,P] - [P,X]\right) = 1\,, \qquad(3.7) $$
whilst $[A,A] = 0 = [A^\dagger, A^\dagger]$ trivially. It will also be useful to compute commutators involving N. We have
$$ [N, A^\dagger] = [A^\dagger A, A^\dagger] = A^\dagger[A, A^\dagger] + [A^\dagger, A^\dagger]A = A^\dagger\,, \qquad(3.8) $$
where the final step uses (3.7). Similarly, we learn that $[N, A] = -A$ by taking the adjoint of (3.8).

Let's see what these commutators teach us about the possible energy levels. Suppose that $|n\rangle$ is a correctly normalised eigenstate of N, so that $N|n\rangle = n|n\rangle$ and $\langle n|n\rangle = 1$. We have
$$ NA^\dagger|n\rangle = (A^\dagger N + [N, A^\dagger])|n\rangle = A^\dagger(N + 1)|n\rangle = (n+1)\,A^\dagger|n\rangle\,, \qquad(3.9) $$
so $A^\dagger|n\rangle$ is an eigenstate of N with eigenvalue n + 1. Similarly,
$$ NA|n\rangle = (AN + [N,A])|n\rangle = (n-1)\,A|n\rangle\,, \qquad(3.10) $$
so $A|n\rangle$ is an eigenstate of N with eigenvalue n − 1. Acting with A† thus raises the eigenvalue of N by one unit, whilst acting with A lowers it by one, giving these operators their names.

The above algebra shows that if we can find just one energy eigenstate $|n\rangle$, then we can construct a whole family of them by repeatedly applying either A† or A. The energies of these new states will be $(n + k + \frac{1}{2})\hbar\omega$ for some $k \in \mathbb{Z}$. However, we can't yet conclude that the energy levels are quantized, because we don't yet know that there isn't a continuum of possible starting-points $|n\rangle$ (or indeed any!).

This brings us to the second key step: we must now investigate the norm of our states. Since N is Hermitian all its eigenvalues must certainly be real, but in fact they're also non-negative, because

$$ n = n\,\langle n|n\rangle = \langle n|N|n\rangle = \langle n|A^\dagger A|n\rangle = \|A|n\rangle\|^2 \geq 0\,, \qquad(3.11) $$
with n = 0 iff $A|n\rangle = 0$, by properties of the norm. If $A|n\rangle \neq 0$ then it is an eigenstate of N with eigenvalue n − 1, but we've just shown that there are no states in $\mathcal{H}$ with negative eigenvalue for N, so this lowering process must terminate. That will be the case iff n is a non-negative integer.

Putting all this together, we have a ground state $|0\rangle$ of energy $\frac{1}{2}\hbar\omega$ and an infinite tower of excited states $|n\rangle$ of energy
$$ E_n = \left(n + \frac{1}{2}\right)\hbar\omega\,, \quad\text{where } n \in \mathbb{N}_0\,. \qquad(3.12) $$

These energy levels should be familiar from IB QM.

Acting on an energy eigenstate with the raising operator does not necessarily give us an excited state that is correctly normalised. In d = 1, energy eigenstates of the harmonic oscillator are non-degenerate,[18] so we must have $A^\dagger|n\rangle = c_n|n+1\rangle$ for some constant $c_n$ (which may depend on n). Taking the norm of both sides shows that
$$ |c_n|^2 = \|A^\dagger|n\rangle\|^2 = \langle n|AA^\dagger|n\rangle = \langle n|N + 1|n\rangle = n + 1\,. \qquad(3.13) $$
Therefore, the correctly normalised state is

$$ |n+1\rangle = \frac{1}{\sqrt{n+1}}\,A^\dagger|n\rangle = \frac{1}{\sqrt{(n+1)!}}\,(A^\dagger)^{n+1}|0\rangle\,. \qquad(3.14) $$
Likewise, you can check that $|n-1\rangle = \frac{1}{\sqrt{n}}\,A|n\rangle$ for $n \geq 1$, whilst $A|0\rangle = 0$.

We've seen that everything about the energy levels of the harmonic oscillator follows from i) the algebra (i.e. the commutation relations) of raising and lowering operators, together with ii) considering the norm. Again, we'll follow these same two steps in analysing the eigenvalues and eigenstates of various operators throughout this course. This also parallels what you did in IB QM: there, by looking for series solutions of Schrödinger's equation, you could find an energy eigenstate for all $E \in \mathbb{R}$. However, these eigenstates were only normalizable if the series you found terminated. It was requiring this termination (i.e. normalizability) that led to quantization of the energies.

I hope I've persuaded you that the algebraic approach above is somewhat cleaner and more efficient than looking for series solutions to Schrödinger's equation. Nothing has been lost, and we can easily use Dirac's formalism to recover the explicit wavefunctions of our eigenstates. For example, in the position representation, the defining equation $A|0\rangle = 0$ of the ground state becomes
$$ 0 = \sqrt{2m\hbar\omega}\,\langle x|A|0\rangle = \langle x|(m\omega X + iP)|0\rangle = \left(m\omega x + \hbar\frac{\mathrm{d}}{\mathrm{d}x}\right)\psi_0(x)\,, \qquad(3.15) $$
where $\psi_0(x) = \langle x|0\rangle$ is the position space wavefunction of our ground state. This is a first order o.d.e. whose solution is
$$ \psi_0(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4}\exp\left(-\frac{m\omega}{2\hbar}\,x^2\right)\,, \qquad(3.16) $$

[18] Here is a sketch of the proof. Suppose $\psi_1$ and $\psi_2$ are two normalizable solutions of the d = 1 Schrödinger equation $-\hbar^2\psi''/2m + V(x)\psi = E\psi$ with the same energy. Then the Wronskian $W(\psi_1, \psi_2) = \psi_1\psi_2' - \psi_2\psi_1'$ must be constant $\forall\,x$, as follows by differentiating and using the TISE. Since $\psi_1$ and $\psi_2$ are both normalizable, they must decay to zero as $|x| \to \infty$. Provided their derivatives remain finite, $\lim_{x\to\infty} W(\psi_1, \psi_2) = 0$ and hence it will vanish everywhere. This implies that $\psi_2 \propto \psi_1$, so our two normalizable solutions would be linearly dependent and agree (up to a constant phase) once normalized. Must the derivatives remain finite as $x \to \infty$? A heuristic argument says that for any energy eigenstate $E = \langle T\rangle + \langle V\rangle$ and $\langle T\rangle = \|P|\psi\rangle\|^2/2m$, so if E is finite and the potential is bounded from below, we must have $\|P|\psi\rangle\|^2 = \hbar^2\int_{\mathbb{R}}|\psi'(x)|^2\,\mathrm{d}x < \infty$ also. This heuristic should not be taken as fully convincing; a more careful argument is given in pp. 96–105 of Messiah, A., Quantum Mechanics, vol. 1. The careful proof shows that in d = 1 energy eigenstates with energy E are indeed non-degenerate provided $\exists\,x_0 \in \mathbb{R}$ s.t. $V(x) > E\ \forall\,x > x_0$ – in other words, if the particle does not have enough energy to exist classically anywhere beyond $x = x_0$.

where the overall constant is fixed (up to phase) by the normalization requirement $\langle 0|0\rangle = \int_{\mathbb{R}}|\psi_0(x)|^2\,\mathrm{d}x = 1$. This is the same Gaussian wavefunction for the ground state of the harmonic oscillator familiar from IB QM. If needed, the wavefunctions of the excited states may be found by acting with $A^\dagger = (m\omega X - iP)/\sqrt{2m\hbar\omega}$, which in the position representation is the differential operator $\frac{1}{\sqrt{2m\hbar\omega}}\left(-\hbar\frac{\mathrm{d}}{\mathrm{d}x} + m\omega x\right)$. You can check (or just Google) that these operators generate the usual Hermite polynomials.
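If you want to see the algebra at work numerically, the following sketch (assuming, for illustration, units with ħ = m = ω = 1 and a truncation of $\mathcal{H}$ to its first 60 levels) builds $H = P^2/2m + \frac{1}{2}m\omega^2 X^2$ out of matrix representations of A and A† and recovers the spectrum (3.12):

    import numpy as np

    hbar = m = omega = 1.0
    n_max = 60
    A = np.diag(np.sqrt(np.arange(1, n_max)), k=1)   # lowering operator in the |n> basis
    X = np.sqrt(hbar/(2*m*omega)) * (A + A.T)        # X = sqrt(hbar/2mw)(A + A†)
    P = 1j*np.sqrt(m*omega*hbar/2) * (A.T - A)       # P = i sqrt(mw hbar/2)(A† - A)
    H = P @ P/(2*m) + 0.5*m*omega**2 * X @ X

    E = np.linalg.eigvalsh(H)                        # eigenvalues in ascending order
    print(E[:5])                                     # [0.5, 1.5, 2.5, 3.5, 4.5] = (n + 1/2) hbar w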

3.2 Dynamics of Oscillators

At this stage, we've learned everything about the set $\{|n\rangle\}$ of energy eigenstates of the quantum harmonic oscillator and their corresponding energies $E_n = (n + \frac{1}{2})\hbar\omega$. However, we still haven't said anything about the physics of how quantum oscillators actually behave. Classically, we know that a harmonic oscillator would undergo periodic motion with a period $T = 2\pi/\omega$. Furthermore, the energy of the classical oscillator is independent of the period, but is proportional to the square of the amplitude of oscillation. To what extent is the same true of our quantum oscillator?

To say anything about the motion of a quantum system, we need to examine the TDSE. To get started, first suppose our system is prepared at time t = 0 to be in some energy eigenstate $|n\rangle$. Then by the TDSE, at a later time t it will have evolved to
$$ |n, t\rangle = e^{-i(n+1/2)\omega t}\,|n\rangle\,. \qquad(3.17) $$
Consequently, no eigenstate has a time dependence which oscillates at the classical frequency ω. Even worse, no matter which energy eigenstate our oscillator is in, the expected position of the oscillator at any given time t is
$$ \langle n,t|X|n,t\rangle = e^{+i(n+1/2)\omega t}\,\langle n|X|n\rangle\,e^{-i(n+1/2)\omega t} = \langle n|X|n\rangle\,, \qquad(3.18) $$
where we've used the linearity / antilinearity properties of the inner product. The rhs is independent of time, so none of these states move – our oscillator does not appear to be oscillating!

To find some interesting dynamics, we must consider not a single energy eigenstate, but rather a superposition. This is a much more realistic assumption: there's no practical way we could prepare a macroscopic system to be in just one energy eigenstate. Let's now suppose our oscillator is prepared at t = 0 to be in some generic state $|\psi, 0\rangle = \sum_n c_n|n\rangle$. The $c_n$ should be chosen so that $\langle\psi|\psi\rangle = 1$, but are otherwise arbitrary. Then at time t this state will have evolved to
$$ |\psi, t\rangle = \sum_{n=0}^\infty c_n\,e^{-iE_n t/\hbar}\,|n\rangle \qquad(3.19) $$
by the TDSE. Now let's examine where we expect to find such a generic state. We have
$$ \langle\psi,t|X|\psi,t\rangle = \left(\sum_m \bar c_m\,e^{iE_m t/\hbar}\langle m|\right)X\left(\sum_n c_n\,e^{-iE_n t/\hbar}|n\rangle\right) = \sum_{n,m} \bar c_m c_n\,e^{i(m-n)\omega t}\,\langle m|X|n\rangle\,. \qquad(3.20) $$

To evaluate the inner product $\langle m|X|n\rangle$, we write[19]
$$ X = \sqrt{\frac{\hbar}{2m\omega}}\left(A + A^\dagger\right) \qquad(3.21) $$
in terms of the raising and lowering operators. Recalling that $A^\dagger|n\rangle = \sqrt{n+1}\,|n+1\rangle$ and $A|n\rangle = \sqrt{n}\,|n-1\rangle$, we have
$$ \langle m|X|n\rangle = \sqrt{\frac{\hbar}{2m\omega}}\left(\sqrt{n}\,\langle m|n-1\rangle + \sqrt{n+1}\,\langle m|n+1\rangle\right)\,, \qquad(3.22) $$
showing that $\langle m|X|n\rangle$ is non-zero only when $m = n \pm 1$. The double sum in (3.20) thus reduces to
$$ \langle\psi,t|X|\psi,t\rangle = \sqrt{\frac{\hbar}{2m\omega}}\sum_{n=1}^\infty \sqrt{n}\left(\bar c_{n-1}c_n\,e^{-i\omega t} + \bar c_n c_{n-1}\,e^{+i\omega t}\right) = \sum_n x_n\cos(\omega t + \phi_n)\,, \qquad(3.23) $$
where the $x_n \in \mathbb{R}_{>0}$ and phases $\phi_n \in \mathbb{R}$ are defined by
$$ x_n\,e^{i\phi_n} = \sqrt{\frac{2n\hbar}{m\omega}}\,\bar c_n c_{n-1}\,. \qquad(3.24) $$
Equation (3.23) shows that $\langle X\rangle$ oscillates sinusoidally at exactly the classical frequency ω whenever our oscillator is prepared in any generic[20] superposition of energy eigenstates. In particular, as for the classical oscillator, the frequency of oscillation is independent of the amplitude. In the calculation above, this occurs because the separation between every adjacent pair of energy levels is always ħω.

For a macroscopic oscillator, the only non-negligible amplitudes will be those where $n \approx n_{cl}$ for some $n_{cl} \gg 1$. Consequently, a measurement of the energy is certain to yield some value close to $E_{n_{cl}} = (n_{cl} + \frac{1}{2})\hbar\omega \approx n_{cl}\hbar\omega$. Similarly, in any energy eigenstate we have
$$ \langle n|X^2|n\rangle = \frac{\hbar}{2m\omega}\,\langle n|AA^\dagger + A^\dagger A|n\rangle = \frac{E_n}{m\omega^2}\,. \qquad(3.25) $$
Thus for our macroscopic oscillator, $\langle\psi|X^2|\psi\rangle$ will be very close to $E_{n_{cl}}/(m\omega^2)$. Classically, the time average of $x^2$ is proportional to the average potential energy, which for a harmonic oscillator is just half the total energy. Thus, classically we have $\overline{x^2} = E/(m\omega^2)$, in agreement with the quantum result. The correspondence principle requires that the quantum and classical results agree for large values of E. That this result agrees even for individual energy eigenstates (even low-lying ones) is a coincidence due to special symmetries of the harmonic oscillator.

[19] To do this calculation using the techniques of IB QM, you'd have worked in the position representation and said
$$ \langle m|X|n\rangle = \int_{-\infty}^\infty \left(H_m(x)\,e^{-x^2/2\alpha}\right)^*\,x\,H_n(x)\,e^{-x^2/2\alpha}\,\mathrm{d}x\,, $$
where $H_n(x)$ are Hermite polynomials of degree n and $\alpha = \hbar/m\omega$. This is correct, but the integral looks rather unpleasant to evaluate. Fortunately, our operator formalism means we never even have to try!

[20] Here, 'generic' simply means we must have non-zero amplitudes $c_n = \langle n|\psi\rangle$ for at least one pair of adjacent energy levels.
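Here is a small numerical sketch of (3.23), assuming an arbitrary (randomly chosen) superposition of the first dozen levels and units ħ = m = ω = 1: $\langle X\rangle$ returns to its initial value after one classical period 2π/ω and flips sign after half a period, whatever the coefficients.

    import numpy as np

    hbar = m = omega = 1.0
    n_max = 12
    A = np.diag(np.sqrt(np.arange(1, n_max)), k=1)
    X = np.sqrt(hbar/(2*m*omega)) * (A + A.T)
    E = (np.arange(n_max) + 0.5) * hbar * omega      # the spectrum (3.12)

    rng = np.random.default_rng(1)
    c = rng.normal(size=n_max) + 1j*rng.normal(size=n_max)
    c /= np.linalg.norm(c)                           # a generic normalised superposition

    for t in [0.0, np.pi/omega, 2*np.pi/omega]:      # t = 0, half period, full period
        psi_t = c * np.exp(-1j*E*t/hbar)             # time evolution (3.19)
        print(np.real(psi_t.conj() @ X @ psi_t))     # <X>(t): v, -v, v for some value v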

3.2.1 Coherent States

To be written: 2019

3.2.2 Anharmonic Oscillations

Suppose that, instead of the pure harmonic oscillator, we have a potential that has a minimum at x = 0 around which it grows approximately quadratically while $|x| \ll a$, but then asymptotes to a constant value at large $|x|$. Particles of low energy will have position space wavefunctions that are supported near the minimum of V(x), so we'd expect the low-lying energy levels to be roughly equal to those of a corresponding harmonic oscillator. Particles of higher energy would start to see the fact that the potential is not purely harmonic. In fact, for any such asymptotically constant potential with a single extremum, the separation between energy levels gets smaller and smaller as we approach the asymptotic value $\lim_{x\to\infty}V(x)$ of the potential (beyond which we have a continuum of non-bound states).[21]

Let's prepare a state to be in two adjacent energy levels, say

$$ |\psi\rangle = c_n|n\rangle + c_{n+1}|n+1\rangle\,, $$
where $|n\rangle$ is the n-th energy eigenstate of our anharmonic potential, whatever it may be. As long as the potential is symmetric around x = 0, its eigenstates will have definite parity and so $\langle n|X|n\rangle = 0$ for all $|n\rangle$. Therefore, at time t,
$$ \langle\psi,t|X|\psi,t\rangle = \bar c_n c_{n+1}\,e^{i(E_n - E_{n+1})t/\hbar}\,\langle n|X|n+1\rangle + \text{c.c.}\,. \qquad(3.26) $$
This is again a sinusoidally oscillating function of time, but we no longer expect the energy levels to be equally spaced, so now the period $T = 2\pi\hbar/(E_{n+1} - E_n)$ will depend on n. In fact, since the energy levels get closer together as n increases, if we give our particle a larger amplitude of oscillation – and hence more energy – it will take longer to execute a complete oscillation. This is just what we'd expect classically.

As an example, consider the Pöschl–Teller potential

$$ V(X) = -V_0\,\mathrm{sech}^2(\kappa X) \qquad(3.27) $$
for some constant $V_0 > 0$ and length scale $a = 1/\kappa$ (see figure 3). This has bound state energy levels[22]
$$ E_n = -\frac{\hbar^2\kappa^2}{2m}\,(\nu - n)^2 \quad\text{for } 0 \leq n < \nu\,, \qquad(3.28) $$
where ν is the positive root of $\nu(\nu + 1) = 2mV_0/\hbar^2\kappa^2$. The separation between adjacent bound state energy levels is thus

$$ E_{n+1} - E_n = \left(2(\nu - n) - 1\right)\frac{\hbar^2\kappa^2}{2m} \qquad(3.29) $$
and decreases as n increases towards ν. (These formulæ break down when $n \geq \nu$, where we enter the continuum of non-normalizable states.)

[21] For a proof, see e.g. chapter 6 of S. Gustafson & I.M. Sigal, Mathematical Concepts of Quantum Mechanics, Springer (2010). (Both this result and its proof are non-examinable in this course.)

[22] Exercise: Try to prove this. As for the harmonic oscillator, the spectrum can be found algebraically, though doing so will require a certain amount of ingenuity. The required method is similar to one of the questions on Problem Sheet 1.
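As a quick numerical look at these formulae – a sketch only, assuming illustrative units ħ = m = κ = 1 and a depth V₀ = 55 chosen so that ν = 10 exactly – the following tabulates the bound-state spacings and shows them shrinking towards the continuum:

    import numpy as np

    hbar = m = kappa = 1.0
    V0 = 55.0                                             # chosen so that nu = 10 exactly
    nu = 0.5*(-1 + np.sqrt(1 + 8*m*V0/(hbar*kappa)**2))   # positive root of nu(nu+1) = 2mV0/hbar^2 kappa^2
    n = np.arange(int(np.ceil(nu)))                       # bound states have 0 <= n < nu
    E = -(hbar*kappa)**2/(2*m) * (nu - n)**2              # the spectrum (3.28)
    print(np.diff(E))                                     # spacings (3.29): 9.5, 8.5, ..., decreasing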

Figure 3: The Pöschl–Teller potential $V(x) = -V_0\,\mathrm{sech}^2(\kappa x)$ and its energy levels. Figure by Nicoguaro, taken from Wikipedia.

If $V_0 \gg \hbar^2\kappa^2/2m$ then the potential is much deeper than it is wide, and contains many bound states. In this regime, we have $\nu \approx \sqrt{2mV_0/\hbar^2\kappa^2} \gg 1$ and so from (3.29) we find that a superposition of low-lying states will oscillate with a frequency

$$ \omega \approx \frac{\hbar\kappa^2}{m}\,\nu \approx \kappa\sqrt{\frac{2V_0}{m}}\,, \qquad(3.30) $$

as adjacent low-lying energy levels are always separated by roughly $\hbar^2\kappa^2\nu/m$. Now, when $|x| \ll 1/\kappa$, we may approximate $-V_0\,\mathrm{sech}^2(\kappa x) \approx -V_0 + V_0\kappa^2 x^2$. We see that this frequency is indeed just what we'd expect for the corresponding harmonic oscillator. On the other hand, a superposition of states clustered around $n = \nu/2$ will oscillate at around half this frequency, as neighbouring energy levels in this region are only separated by about half as much.

If we include a wider range of states in our initial superposition, then we'll instead find a sequence of terms[23]

$$ \langle\psi,t|X|\psi,t\rangle = \cdots + \bar c_{n-1}c_n\,e^{i(E_{n-1}-E_n)t/\hbar}\,\langle n-1|X|n\rangle + \bar c_{n+1}c_n\,e^{i(E_{n+1}-E_n)t/\hbar}\,\langle n+1|X|n\rangle + \bar c_{n+3}c_n\,e^{i(E_{n+3}-E_n)t/\hbar}\,\langle n+3|X|n\rangle + \cdots\,. \qquad(3.31) $$
Since this is no longer a harmonic oscillator, we do not generically expect $\langle n+3|X|n\rangle = 0$. Let's define a frequency $\Omega_{n_{cl}} = (E_{n_{cl}+1} - E_{n_{cl}})/\hbar$, where $n_{cl} \gg 1$. Provided the $c_n$'s are clustered around this large central value $n = n_{cl}$ sufficiently tightly that the difference

[23] Note that for a symmetric potential, we again have $\langle n+2k|X|n\rangle = 0$ by parity.

between adjacent energy levels is roughly constant over the range of n for which the $c_n$ are appreciable, then, to reasonable accuracy, all the terms that contribute to (3.31) oscillate with frequencies that are integer multiples of $\Omega_{n_{cl}}$. Thus the motion will be periodic, but anharmonic, just as we expect classically.

If we release the anharmonic oscillator from some large extension $x_0$, then initially the wavefunction will be a superposition of many energy levels, with coefficients that ensure $\langle x|\psi\rangle = \sum_n c_n\langle x|n\rangle$ is sharply peaked around $x_0$. At time t, this state will evolve to
$$ |\psi,t\rangle = e^{-iE_{n_{cl}}t/\hbar}\sum_n c_n\,e^{i(E_{n_{cl}} - E_n)t/\hbar}\,|n\rangle\,. $$
Since the separation between energy levels varies, the frequencies appearing in this sum are not all integer multiples of any Ω. Consequently, after a time of order 2π/Ω, most terms in the sum will have not quite returned to their original values, so the wavefunction at t = 2π/Ω will be less sharply peaked around $x_0$. With each subsequent passing of time 2π/Ω, the wavefunction will become more and more diffusely spread. Classically, if we release an oscillator with a rather uncertain value for its energy, then for a pure harmonic oscillator we always know where to find the oscillator at later times, since its period is independent of the energy. However, for an anharmonic oscillator, the period depends on the energy, so after a long time our oscillator is equally likely to be located anywhere.
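The contrast with the harmonic case is easy to see numerically. The sketch below is an illustration only, assuming Gaussian amplitudes $c_n$ centred on a large $n_{cl}$ and comparing the equally spaced harmonic spectrum against Pöschl–Teller-like levels $E_n \propto -(\nu - n)^2$ (with ħ = 1); it tracks the self-overlap $|\langle\psi,0|\psi,t\rangle|$ at multiples of the 'period' 2π/Ω:

    import numpy as np

    nu, n_cl, width = 60.5, 30, 4.0
    n = np.arange(int(nu))
    c = np.exp(-0.5*((n - n_cl)/width)**2)        # Gaussian amplitudes around n_cl
    c /= np.linalg.norm(c)

    E_harm = n + 0.5                              # equally spaced levels (hbar*omega = 1)
    E_anh = -0.5*(nu - n)**2                      # Poschl-Teller-like levels

    for E, label in [(E_harm, "harmonic  "), (E_anh, "anharmonic")]:
        Omega = E[n_cl + 1] - E[n_cl]             # local level spacing / hbar
        overlaps = [abs(np.sum(np.abs(c)**2 * np.exp(-1j*E*2*np.pi*k/Omega)))
                    for k in range(4)]
        print(label, np.round(overlaps, 3))       # harmonic stays at 1; anharmonic decays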

3.3 The Three-Dimensional Isotropic Harmonic Oscillator

Let's now consider an oscillator in $\mathbb{R}^3$ that is isotropic in the sense that the potential depends only on the radial distance from the origin. If the oscillator is harmonic, its Hamiltonian is
$$ H = \frac{\mathbf{P}^2}{2m} + \frac{1}{2}m\omega^2\mathbf{X}^2 = \frac{1}{2m}\left(P_x^2 + P_y^2 + P_z^2\right) + \frac{1}{2}m\omega^2\left(X^2 + Y^2 + Z^2\right)\,. \qquad(3.32) $$
This is just the sum of the Hamiltonians for three one-dimensional harmonic oscillators, each with the same frequency ω. We introduce raising and lowering operators
$$ \mathbf{A}^\dagger = \frac{1}{\sqrt{2m\hbar\omega}}\,(m\omega\,\mathbf{X} - i\mathbf{P}) \quad\text{and}\quad \mathbf{A} = \frac{1}{\sqrt{2m\hbar\omega}}\,(m\omega\,\mathbf{X} + i\mathbf{P})\,, \qquad(3.33) $$
one for each Cartesian direction in $\mathbb{R}^3$. Their commutation relations are
$$ [A_i, A_j^\dagger] = \delta_{ij}\,, \qquad [A_i, A_j] = 0\,, \qquad [A_i^\dagger, A_j^\dagger] = 0\,, \qquad(3.34) $$
as an obvious generalization of (3.7), and again the Hamiltonian may be written as
$$ H = \hbar\omega\sum_{i=1}^3\left(A_i^\dagger A_i + \frac{1}{2}\right) = \hbar\omega\left(\mathbf{A}^\dagger\cdot\mathbf{A} + \frac{3}{2}\right) \qquad(3.35) $$
in terms of these raising and lowering operators. As above, for any energy eigenstate $|E_n\rangle$ we have
$$ \frac{E_n}{\hbar\omega} - \frac{3}{2} = \left\langle E_n\left|\frac{H}{\hbar\omega} - \frac{3}{2}\right|E_n\right\rangle = \langle E_n|\mathbf{A}^\dagger\cdot\mathbf{A}|E_n\rangle = \sum_{i=1}^3\|A_i|E_n\rangle\|^2 \geq 0\,, \qquad(3.36) $$

so the ground state has energy 3ħω/2 – three times the ground state energy of a one-dimensional quantum harmonic oscillator with the same frequency.

We'll call this ground state $|0\rangle$. Every component of the lowering operator **A** annihilates $|0\rangle$, so its position space wavefunction obeys
$$ \langle\mathbf{x}|A_x|0\rangle = \langle\mathbf{x}|A_y|0\rangle = \langle\mathbf{x}|A_z|0\rangle = 0 \qquad(3.37) $$
and hence is given by
$$ \langle\mathbf{x}|0\rangle = \left(\frac{m\omega}{\pi\hbar}\right)^{3/4}\exp\left(-\frac{m\omega x^2}{2\hbar}\right)\exp\left(-\frac{m\omega y^2}{2\hbar}\right)\exp\left(-\frac{m\omega z^2}{2\hbar}\right) = \left(\frac{m\omega}{\pi\hbar}\right)^{3/4}\exp\left(-\frac{m\omega r^2}{2\hbar}\right)\,, \qquad(3.38) $$
where again the overall constant is fixed by the normalization requirement in $\mathbb{R}^3$. Note that this ground state is spherically symmetric.

Just as in 1d, we can obtain excited states by acting on $|0\rangle$ with any component of the raising operators $\mathbf{A}^\dagger$. Acting on any energy eigenstate by any of the components of $\mathbf{A}^\dagger$ raises the state's energy by one unit of ħω. Since the ground state has energy $\frac{3}{2}\hbar\omega$, all of the states
$$ |\mathbf{n}\rangle = |n_x\rangle\otimes|n_y\rangle\otimes|n_z\rangle = \frac{(A_x^\dagger)^{n_x}\,(A_y^\dagger)^{n_y}\,(A_z^\dagger)^{n_z}}{\sqrt{n_x!\,n_y!\,n_z!}}\,|0\rangle \qquad(3.39) $$
involving a fixed total number $N = n_x + n_y + n_z$ of raising operators are degenerate, each with energy
$$ E_N = \left(N + \frac{3}{2}\right)\hbar\omega\,. \qquad(3.40) $$
The total degeneracy of the N-th energy level is (N+1)(N+2)/2, the number of ways to partition N into three non-negative integers $(n_x, n_y, n_z)$ (think 'stars & bars'). This large degeneracy relies in part on the oscillator being isotropic, so that the potential is spherically symmetric. Even so, the degeneracy is rather greater than we'd expect for a generic central potential. We'll understand the origin of this large degeneracy more fully in section ??.
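The counting is easy to check directly; the sketch below simply enumerates the triples $(n_x, n_y, n_z)$ for the first few levels:

    from itertools import product

    for N in range(6):
        states = [t for t in product(range(N + 1), repeat=3) if sum(t) == N]
        assert len(states) == (N + 1)*(N + 2)//2    # 'stars & bars' count
        print(N, len(states))                        # 1, 3, 6, 10, 15, 21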

4 Transformations and Symmetries

In this chapter, we'll return to examine the sinews of how the formalism of QM works. While classical physics is firmly rooted in space-time, we've seen that quantum mechanics takes place in the more abstract Hilbert space. Here, we'll understand how transformations such as spatial translations or rotations of $\mathbb{R}^3$ affect states in Hilbert space, linking together the world of quantum mechanics with our familiar experience in 'normal' space. In so doing, we'll obtain a deeper appreciation of the origin of many of the most common commutation relations, including all the ones I expect you're familiar with. For example, we'll understand why $[X,P] = i\hbar$, a fact which probably seemed rather mysterious in IB QM.

4.1 Transformations of States and Operators

Suppose we move our physical system about in some way, perhaps translating it through $\mathbb{R}^3$ or rotating it around some chosen origin. In QM, we'd expect our system to be described by a different state before the transformation compared to after the transformation. For example, suppose a particle is prepared to be in a state $|\psi\rangle$ whose position space wavefunction is strongly peaked around the origin $\mathbf{0} \in \mathbb{R}^3$. If we translate our system through a vector **a**, we should expect to find our particle in a new state $|\psi'\rangle$ whose wavefunction is strongly peaked around $\mathbf{x} = \mathbf{a}$. This suggests that the effect of any given spatial transformation should be represented on Hilbert space by a linear operator

$$ U: \mathcal{H} \to \mathcal{H}\,, \qquad U: |\psi\rangle \mapsto |\psi'\rangle = U|\psi\rangle\,, \qquad(4.1) $$
where the operator in question depends on the specific type of transformation. Whatever state we start with, after applying the transformation, we still expect to find it somewhere in space. Thus, properly normalised states should remain so after applying U, or in other words

$$ 1 = \langle\psi|\psi\rangle = \langle\psi'|\psi'\rangle = \langle\psi|U^\dagger U|\psi\rangle \quad\text{for all } |\psi\rangle \in \mathcal{H}\,. \qquad(4.2) $$
We'll now show that the requirement that U preserves the norm of every state fixes it to be a unitary operator.[24] The required trick is a form of the polarization identity. We first write $|\psi\rangle$ as $|\psi\rangle = |\phi\rangle + \lambda|\chi\rangle$ for some states $|\phi\rangle, |\chi\rangle \in \mathcal{H}$ and some $\lambda \in \mathbb{C}$. Equation (4.2) becomes

$$ \langle\phi|\phi\rangle + \bar\lambda\langle\chi|\phi\rangle + \lambda\langle\phi|\chi\rangle + |\lambda|^2\langle\chi|\chi\rangle = \langle\phi|U^\dagger U|\phi\rangle + \bar\lambda\langle\chi|U^\dagger U|\phi\rangle + \lambda\langle\phi|U^\dagger U|\chi\rangle + |\lambda|^2\langle\chi|U^\dagger U|\chi\rangle\,, \qquad(4.3) $$
where we recall that the inner product is antilinear in its left entry. By assumption U preserves the norm of every state, so $\langle\phi|U^\dagger U|\phi\rangle = \langle\phi|\phi\rangle$ and $\langle\chi|U^\dagger U|\chi\rangle = \langle\chi|\chi\rangle$. Thus our

[24] Because physical states are represented by rays in $\mathcal{H}$, rather than actual vectors, there is one other possibility: transformations may be represented by operators that are both anti-linear, in the sense that $U(c_1|\psi_1\rangle + c_2|\psi_2\rangle) = \bar c_1 U|\psi_1\rangle + \bar c_2 U|\psi_2\rangle$, and anti-unitary, meaning that $\langle\chi|U^\dagger U|\phi\rangle = \langle\phi|\chi\rangle = \overline{\langle\chi|\phi\rangle}$ for all $|\phi\rangle, |\chi\rangle \in \mathcal{H}$. Such antiunitary operators turn out to be related to reversing the direction of time; we won't discuss them further in this course. Wigner showed that all transformations are represented by either linear, unitary operators or anti-linear, anti-unitary ones.

identity simplifies to
$$ \lambda\left(\langle\phi|\chi\rangle - \langle\phi|U^\dagger U|\chi\rangle\right) = \bar\lambda\left(\langle\chi|U^\dagger U|\phi\rangle - \langle\chi|\phi\rangle\right)\,. \qquad(4.4) $$
In order for this to hold for every $|\psi\rangle \in \mathcal{H}$, it must continue to hold as we vary λ. In particular, as the phase of λ varies, the only way for (4.4) to be satisfied is if

$$ \langle\phi|\chi\rangle = \langle\phi|U^\dagger U|\chi\rangle\,. \qquad(4.5) $$
Finally, for this to hold for arbitrary states $|\phi\rangle$ and $|\chi\rangle$ we must have $U^\dagger U = 1$. Multiplying through on the right by $U^{-1}$ shows that

$$ U^\dagger = U^{-1}\,, \qquad(4.6) $$
which says that the adjoint of U is equal to its inverse. Operators with this property are said to be unitary.[25]

There's one further condition on our operators U. Transformations of space such as translations and rotations form a group. For example, the composition of two rotations is again a rotation, the trivial rotation forms the identity, and the inverse of a rotation is a rotation of the same amount around the same axis, but in the opposite sense (e.g. clockwise instead of anticlockwise). Let's suppose that our spatial transformations form a group G. To reflect this group structure, our operators should[26] provide a homomorphism from G to the group of unitary operators in the sense that, for all $g_1, g_2 \in G$,
$$ U(g_2)\circ U(g_1) = U(g_2\cdot g_1) \quad\text{and}\quad U(1) = 1_{\mathcal{H}}\,, \qquad(4.7) $$
where $\cdot$ denotes the group multiplication in G and $\circ$ denotes the composition of linear operators acting on $\mathcal{H}$. (We'll typically suppress these composition symbols from now on.) Note that if $U_1$ and $U_2$ are each unitary operators, then $(U_2U_1)^{-1} = U_1^{-1}U_2^{-1} = U_1^\dagger U_2^\dagger = (U_2U_1)^\dagger$, and so the composite operator $U_2U_1$ is also unitary. Note also that the identity operator $1_{\mathcal{H}}$ is trivially unitary.

Mathematically, the search for homomorphisms from G to a group of linear operators on a vector space is the subject of Representation Theory. In our case, the vector space in question is our chosen Hilbert space $\mathcal{H}$. Beyond just being a vector space, this already comes with the structure of an inner product, and we're asking that our representation of G on $\mathcal{H}$ preserves this inner product. Thus, mathematically, the structure of QM involves classifying unitary representations of various groups of transformations of $\mathbb{R}^3$.

We can also understand how transformations act on operators. Suppose A is some operator whose matrix elements we're interested in. If our transformation maps $|\psi\rangle \to |\psi'\rangle$, then the expectation value of A will be mapped as

$$ \langle\psi|A|\psi\rangle \mapsto \langle\psi'|A|\psi'\rangle = \langle\psi|U^\dagger(\theta)\,A\,U(\theta)|\psi\rangle\,. \qquad(4.8) $$

[25] If (and only if!) you're doing some functional analysis, note that unitary operators are certainly bounded, and in fact have unit norm in the operator topology. They are thus somewhat better behaved than generic Hermitian (or symmetric) operators.

[26] This statement is not quite accurate – there's an important refinement that we'll return to later in the course.

Consequently, we can find the expectation value A will take after the transformation by working with the original states, but instead transforming the operator as

$$ A \mapsto A' = U^\dagger(\theta)\,A\,U(\theta) = U^{-1}(\theta)\,A\,U(\theta)\,, \qquad(4.9) $$
using the fact that U is unitary. This is known as a similarity transform. Note that

$$ [A', B'] = [U^{-1}AU,\, U^{-1}BU] = U^{-1}[A, B]\,U\,, \qquad(4.10) $$
so the commutator of two similarity transforms is the similarity transform of the commutator. Also, similarity transforms do not change the spectrum of any operator: if $|\psi\rangle$ is an eigenstate of A with eigenvalue a, then $U^{-1}|\psi\rangle$ is an eigenstate of A' with the same eigenvalue.
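Both facts are easy to verify numerically. The sketch below – with arbitrarily chosen matrices, purely for illustration – conjugates a Hermitian 'observable' by a unitary $U = e^{-i\theta T}$ and confirms the spectrum is unchanged:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(4, 4)); A = A + A.T      # an arbitrary Hermitian 'observable'
    T = rng.normal(size=(4, 4)); T = T + T.T      # an arbitrary Hermitian generator

    w, V = np.linalg.eigh(T)                      # build the unitary U = exp(-i theta T)
    U = V @ np.diag(np.exp(-0.3j*w)) @ V.conj().T

    A_prime = U.conj().T @ A @ U                  # the similarity transform (4.9)
    assert np.allclose(np.linalg.eigvalsh(A_prime), np.linalg.eigvalsh(A))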

4.1.1 Continuous Transformations

A particularly important class of transformations are those that depend smoothly on some parameter θ. For example, we can smoothly vary both the axis about which and the angle through which we rotate, or the magnitude and direction of a translation vector. If θ = 0 is the trivial transformation, represented on $\mathcal{H}$ by the identity operator, then for an infinitesimal transformation we have
$$ U(\delta\theta) = 1 - i\,\delta\theta\,T + O(\delta\theta^2)\,, \qquad(4.11) $$
where T is some operator that is independent of θ. (The factor of −i in this equation is just a convention for later convenience.) T is called the generator of the transformation. For such infinitesimal transformations, the condition (4.6) that U is unitary becomes

$$ 1 + i\,\delta\theta\,T^\dagger + O(\delta\theta^2) = 1 + i\,\delta\theta\,T + O(\delta\theta^2)\,, \qquad(4.12) $$
which to first order in δθ gives
$$ T = T^\dagger\,. \qquad(4.13) $$
Thus the generator T is Hermitian and hence a good candidate for an observable quantity.

A finite transformation can be generated by repeatedly performing an infinitesimal one. Specifically, if we set δθ = θ/N and transform N times with U(δθ), then in the limit $N \to \infty$ we have
$$ U(\theta) = \lim_{N\to\infty}\left(1 - \frac{i\theta}{N}\,T\right)^N = e^{-i\theta T}\,, \qquad(4.14) $$
where the exponential of an operator may be defined by (2.42), or equivalently by its power series expansion. This form of the unitary operator U(θ) is especially useful when states are expressed in terms of the basis of eigenstates of the Hermitian generator T.

We obtain an important equation by using (4.11) to evaluate $|\psi'\rangle = U(\delta\theta)|\psi\rangle$. Subtracting $|\psi\rangle$ from both sides and dividing through by δθ, in the limit $\delta\theta \to 0$ we obtain
$$ i\,\frac{\partial|\psi\rangle}{\partial\theta} = T|\psi\rangle\,. \qquad(4.15) $$
There is no assumption here that the wavefunction $\psi(x) = \langle x|\psi\rangle$ should be differentiable as a function of space. We merely say that as our transformation varies smoothly away

from the identity, so too does the state $|\psi\rangle$ vary smoothly inside $\mathcal{H}$. The rate at which $|\psi\rangle$ varies is governed by the generator.

For an infinitesimal transformation, our transformation law (4.9) for operators becomes
$$ U^\dagger(\delta\theta)\,A\,U(\delta\theta) = A + i\,\delta\theta\,[T, A] + O(\delta\theta^2)\,, \qquad(4.16) $$
so
$$ \delta A = i\,\delta\theta\,[T, A]\,. \qquad(4.17) $$
Thus, while the rate of change of states themselves is given by the action of the generator, the rate of change of operators under a transformation is determined by the commutator of the generator with the operator. We'll see that most of the commutation relations we meet in quantum mechanics (and certainly all the familiar ones) arise this way. Furthermore, because Hermitian operators represent observable quantities with which we are familiar, in practice it's often easier to understand how a transformation should act on an operator, rather than on the more abstract notion of a state in Hilbert space.

The above discussion has been rather abstract. Let's ground it by looking at a few of the most important examples of transformations and their corresponding generators.
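As a sanity check on (4.14), the following sketch (with an arbitrarily chosen Hermitian generator T, purely for illustration) composes N small transformations and compares the result against the exact exponential:

    import numpy as np

    rng = np.random.default_rng(2)
    T = rng.normal(size=(3, 3)); T = T + T.T      # an arbitrary Hermitian generator
    theta, N = 0.5, 100_000

    # the exact exponential e^{-i theta T}, built from the eigendecomposition of T
    w, V = np.linalg.eigh(T)
    U_exact = V @ np.diag(np.exp(-1j*theta*w)) @ V.conj().T

    # compose N infinitesimal transformations (1 - i theta T / N)^N, cf. (4.14)
    U_steps = np.linalg.matrix_power(np.eye(3) - 1j*theta*T/N, N)
    assert np.allclose(U_steps, U_exact, atol=1e-3)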

4.2 Translations

When we move an object around, we expect to find it in a new place. Specifically, suppose $\langle\psi|\mathbf{X}|\psi\rangle = \mathbf{x}_0$ for some normalised state $|\psi\rangle$. Since $\mathbf{x}_0 \in \mathbb{R}^3$ just labels a spatial point, it must behave under translations and rotations like any vector. Thus, after the translation we expect our object to be described by a new state $|\psi'\rangle$ with $\langle\psi'|\mathbf{X}|\psi'\rangle = \mathbf{x}_0 + \mathbf{a}$. The general theory of the previous section asserts that this translation is represented on Hilbert space by a unitary operator[27] $U(\mathbf{a})$ so that $|\psi'\rangle = U(\mathbf{a})|\psi\rangle$. Therefore we can write the expected location of the translated state as
$$ \langle\psi'|\mathbf{X}|\psi'\rangle = \langle\psi|\mathbf{X}|\psi\rangle + \mathbf{a} = \langle\psi|(\mathbf{X} + 1_{\mathcal{H}}\,\mathbf{a})|\psi\rangle = \langle\psi|U^{-1}(\mathbf{a})\,\mathbf{X}\,U(\mathbf{a})|\psi\rangle\,, \qquad(4.18) $$
where $1_{\mathcal{H}}$ represents the identity operator on $\mathcal{H}$. (We'll often drop the identity operator where it's unambiguous.) Since this is true for any state $|\psi\rangle$, the same argument as above shows that
$$ U^{-1}(\mathbf{a})\,\mathbf{X}\,U(\mathbf{a}) = \mathbf{X} + \mathbf{a}\,1_{\mathcal{H}}\,. \qquad(4.19) $$
Let's be clear about what this equation means. In this course, we're interested in Quantum Mechanics on $\mathbb{R}^3$, so the (boldface) position operator $\mathbf{X}: \mathcal{H} \to \mathcal{H}^3$ is really a collection of three operators corresponding to the three coordinates of $\mathbb{R}^3$. In more detail, taking the standard Cartesian basis of $\mathbb{R}^3$, (4.19) says
$$ \begin{pmatrix} U^{-1}(\mathbf{a})\,X\,U(\mathbf{a}) \\ U^{-1}(\mathbf{a})\,Y\,U(\mathbf{a}) \\ U^{-1}(\mathbf{a})\,Z\,U(\mathbf{a}) \end{pmatrix} = \begin{pmatrix} X + a_x 1_{\mathcal{H}} \\ Y + a_y 1_{\mathcal{H}} \\ Z + a_z 1_{\mathcal{H}} \end{pmatrix}\,. \qquad(4.20) $$

[27] Strictly, we should label this as $U_{\text{trans}}(\mathbf{a})$ to emphasise that it is the unitary operator generating translations, as opposed to anything else we might associate to **a**. We typically drop such additional labels. Exactly which unitary operator we're talking about should hopefully be clear from the context.

In other words, conjugating the X (component!) operator by $U(\mathbf{a})$ gives a new operator $U^{-1}(\mathbf{a})\,X\,U(\mathbf{a}): \mathcal{H} \to \mathcal{H}$ that we identify with $X + a_x 1_{\mathcal{H}}$, and similarly for Y and Z. If $U(\mathbf{a})$ acted on the position operator in any other way, it could not represent a translation, so equation (4.19) may be taken as the defining property of the translation operator, distinguishing it from other unitary transformations.

For an infinitesimal translation δ**a**, equation (4.11) states that the translation operator can be written as[28]
$$ U(\delta\mathbf{a}) = 1 - \frac{i}{\hbar}\,\delta\mathbf{a}\cdot\mathbf{P} + O(\delta a^2) \qquad(4.21) $$
for some Hermitian operator **P**. I've named this operator **P**, so I'm not going to be able to fool any of you that it will turn out to be anything other than the momentum operator. But that's not how I want you to think of it yet. By definition, **P**/ħ is the generator of infinitesimal translations.

The translation generator **P**/ħ must have units of 1/(length) so that it makes sense to add the second term of (4.21) to the first. Because we've included a factor of ħ, our **P** must have dimensions of (mass × velocity). In fact, this is just a convention. We could choose to measure masses in units of 1/(length × velocity), using ħ as a conversion factor for our units. In relativity, the speed of light c provides a further natural conversion factor between lengths and times, so in units of ħ/c, masses are equivalent to inverse lengths. In natural units, we work with length and time scales adapted to Nature rather than our civilisation, so we set ħ = 1 and c = 1. In these units, the translation generator is precisely **P**. Incidentally, you may be worried about setting ħ = 1 when it is such a 'small' number; ħ ≈ 1.0547 × 10⁻³⁴ J s/rad. But this is quite wrong. Each atom in the Universe knows that ħ is not at all small or insignificant. It only appears small when measured in units such as Joules and seconds that are relevant for steam engines and pocketwatches. The real, deep question is not why ħ appears small, but why humans are so big.

Plugging (4.21) into the defining equation (4.19) for the translation operator we find

$$ \frac{i}{\hbar}\left[\delta\mathbf{a}\cdot\mathbf{P},\,\mathbf{X}\right] = \delta\mathbf{a}\,, \qquad(4.22) $$
and since this holds for any infinitesimal translation,

$$ [X_i, P_j] = i\hbar\,\delta_{ij}\,. \qquad(4.23) $$

These, as I’m sure you recognise, are the fundamental commutation relations of quantum mechanics. I’m prepared to bet that the first time you met them (and perhaps even up until now) you thought they were a weird and mysterious feature of quantum mechanics. Here they’re revealed as little more than common sense: asking where you are and then translating is not the same as first translating and then asking where you are. What is weird and quantum about this equation is not that X and P don’t commute, but the fact that they are operators acting on a Hilbert space.

[28] Here, $\delta\mathbf{a}\cdot\mathbf{P} = \sum_{i=1}^3 \delta a_i P_i$ involves the standard dot product on $\mathbb{R}^3$. Note that the result is still an operator on $\mathcal{H}$, a linear combination of the three operators **P**.
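You can even watch (4.23) hold numerically. The sketch below is purely illustrative, assuming a 1d particle on a periodic grid with P realised spectrally via the FFT (anticipating the position representation derived in (4.35) below); applied to a smooth, well-localised wavefunction, the commutator returns $i\hbar\,\psi$:

    import numpy as np

    hbar, L, n = 1.0, 20.0, 512
    x = np.linspace(-L/2, L/2, n, endpoint=False)
    k = 2*np.pi*np.fft.fftfreq(n, d=L/n)                 # discrete wavenumbers

    def P_op(f):
        # the momentum operator P = -i hbar d/dx, applied spectrally
        return -1j*hbar*np.fft.ifft(1j*k*np.fft.fft(f))

    psi = np.exp(-x**2)                                  # a smooth, localised test state
    comm = x*P_op(psi) - P_op(x*psi)                     # [X, P] acting on psi
    assert np.allclose(comm, 1j*hbar*psi, atol=1e-8)     # equals i hbar psi, cf. (4.23)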

By repeatedly performing the same infinitesimal translation, we can write

$$ U(\mathbf{a}) = \exp\left(-\frac{i}{\hbar}\,\mathbf{a}\cdot\mathbf{P}\right) \qquad(4.24) $$
for a translation through the finite vector **a**. Since U(**a**) provides a homomorphism of this group of translations to the group of linear operators on $\mathcal{H}$, we have
$$ U(\mathbf{b})\,U(\mathbf{a}) = U(\mathbf{b} + \mathbf{a}) = U(\mathbf{a} + \mathbf{b}) = U(\mathbf{a})\,U(\mathbf{b})\,, \qquad(4.25) $$
where in the second equality we used the fact that **a** + **b** = **b** + **a**, stating that the order in which we perform two translations doesn't matter.[29] Using (4.21) to expand the far left & far right of this equation to lowest non-trivial order in $a_i b_j$ shows that the generators **P** obey
$$ [P_i, P_j] = 0 \quad\text{for all } i, j\,, \qquad(4.26) $$
and so commute with each other. Again, the fundamental meaning of this equation is simply that translations form an Abelian group – the order of translations doesn't matter. It only says anything about our ability to simultaneously know all three components of a particle's momentum once we identify **P** as corresponding to momentum, which we haven't yet done.

Now let's think about what the translation operator does to states. Let $|\mathbf{x}\rangle$ represent an eigenstate of the position operator with eigenvalue **x**, so that $\mathbf{X}|\mathbf{x}\rangle = \mathbf{x}|\mathbf{x}\rangle$, with normalisation condition
$$ \langle\mathbf{x}'|\mathbf{x}\rangle = \delta^3(\mathbf{x}' - \mathbf{x}) \qquad(4.27) $$
as usual for continuum states.[30] This state represents a particle which is definitely located at $\mathbf{x} \in \mathbb{R}^3$. Applying the translation operator,
$$ \mathbf{X}\left(U(\mathbf{a})|\mathbf{x}\rangle\right) = \left([\mathbf{X}, U(\mathbf{a})] + U(\mathbf{a})\,\mathbf{X}\right)|\mathbf{x}\rangle\,. \qquad(4.28) $$
We can evaluate the commutator by multiplying (4.19) through by U(**a**) on the left, finding $[\mathbf{X}, U(\mathbf{a})] = U(\mathbf{a})\,\mathbf{a}$. The constant **a** and the eigenvalue **x** each commute with the translation operator, so
$$ \mathbf{X}\left(U(\mathbf{a})|\mathbf{x}\rangle\right) = (\mathbf{x} + \mathbf{a})\left(U(\mathbf{a})|\mathbf{x}\rangle\right)\,. \qquad(4.29) $$
As anticipated, our new state $U(\mathbf{a})|\mathbf{x}\rangle$ is certainly located at **x** + **a**. Consequently, $U(\mathbf{a})|\mathbf{x}\rangle$ must be proportional to the state $|\mathbf{x} + \mathbf{a}\rangle$, the normalised state that is definitely located

[29] Here this is of course the statement that the group of spatial translations is $(\mathbb{R}^3, +)$, where the group operation '+' is commutative. Groups for which the group operation is commutative are often called Abelian.

[30] Technically, $|\mathbf{x}\rangle$ is a non-normalizable state, which really means it doesn't live in the Hilbert space $L^2(\mathbb{R}^3, \mathrm{d}^3x)$. This is related to the fact that the position operator **X** is unbounded – there is no constant M such that $\|\mathbf{X}|\psi\rangle\| \leq M\,\||\psi\rangle\|$ for all $|\psi\rangle \in \mathcal{H}$. Naïvely, this means that 'the eigenvalues of **X** can be arbitrarily large', which is physically reasonable since our particle may be located very far away. However, in general, unbounded operators do not have any eigenstates in $\mathcal{H}$, so do not really have eigenvalues. The distinction between bounded and unbounded operators is tremendously important in functional analysis, but will not play a role in this course.

at **x** + **a**. Setting $U(\mathbf{a})|\mathbf{x}\rangle = c\,|\mathbf{x} + \mathbf{a}\rangle$ for some $c \in \mathbb{C}$ and taking the inner product with another position eigenstate $|\mathbf{x}'\rangle$, we have[31]
$$ c\,\delta^3(\mathbf{x}' - \mathbf{x} - \mathbf{a}) = c\,\langle\mathbf{x}'|\mathbf{x} + \mathbf{a}\rangle = \langle\mathbf{x}'|U(\mathbf{a})|\mathbf{x}\rangle = \left(U^{-1}(\mathbf{a})|\mathbf{x}'\rangle\right)^\dagger|\mathbf{x}\rangle = \frac{1}{\bar c}\,\langle\mathbf{x}' - \mathbf{a}|\mathbf{x}\rangle = \frac{1}{\bar c}\,\delta^3(\mathbf{x}' - \mathbf{a} - \mathbf{x})\,, \qquad(4.30) $$
where we used the fact that $U(\mathbf{a})$ is unitary and the fact that if $U(\mathbf{a})|\mathbf{x}\rangle = c\,|\mathbf{x} + \mathbf{a}\rangle$ then $U^{-1}(\mathbf{a})|\mathbf{x} + \mathbf{a}\rangle = c^{-1}|\mathbf{x}\rangle$. Comparing both sides shows that $|c|^2 = 1$, so c is a pure phase and without loss of generality we can set c = 1.

Now suppose we consider translating an arbitrary state $|\psi\rangle$. The position space wavefunction of this state is simply its coefficient

$$ \psi(\mathbf{x}) = \langle\mathbf{x}|\psi\rangle \qquad(4.31) $$
in the position basis. Thus the wavefunction $\psi_{\text{trans}}(\mathbf{x})$ of the translated state $|\psi_{\text{trans}}\rangle = U(\mathbf{a})|\psi\rangle$ is
$$ \psi_{\text{trans}}(\mathbf{x}) = \langle\mathbf{x}|\psi_{\text{trans}}\rangle = \langle\mathbf{x}|U(\mathbf{a})|\psi\rangle = \left(U^{-1}(\mathbf{a})|\mathbf{x}\rangle\right)^\dagger|\psi\rangle = \langle\mathbf{x} - \mathbf{a}|\psi\rangle = \psi(\mathbf{x} - \mathbf{a})\,. \qquad(4.32) $$
Eq. (4.32) says that the wavefunction of the translated state takes the same value at **x** as the original wavefunction took at **x** − **a**. This is completely natural as we've translated our state through **a**! In particular, for an infinitesimal translation δ**a**, on the one hand we can Taylor expand the translated wavefunction to find[32]

$$ \psi_{\text{trans}}(\mathbf{x}) - \psi(\mathbf{x}) = -\delta\mathbf{a}\cdot\nabla\psi(\mathbf{x})\,, \qquad(4.33) $$
while on the other we expand the operator $U(\delta\mathbf{a})$ to find
$$ \psi_{\text{trans}}(\mathbf{x}) - \psi(\mathbf{x}) = \langle\mathbf{x}|\left(1 - \frac{i}{\hbar}\,\delta\mathbf{a}\cdot\mathbf{P}\right)|\psi\rangle - \langle\mathbf{x}|\psi\rangle = -\frac{i}{\hbar}\,\delta\mathbf{a}\cdot\langle\mathbf{x}|\mathbf{P}|\psi\rangle\,, \qquad(4.34) $$
using (4.21) to lowest order in δ**a**. Consequently, for any state $|\psi\rangle$ the momentum operator **P** acts in the position representation as

$$ \langle\mathbf{x}|\mathbf{P}|\psi\rangle = -i\hbar\,\nabla\psi(\mathbf{x})\,. \qquad(4.35) $$
This is just what you would have said in IB QM last year, here derived from the point of view of the effect a translation of $\mathbb{R}^3$ has on states in Hilbert space.
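Here is a quick numerical sketch of (4.32) and (4.35), assuming a 1d wavefunction sampled on a periodic grid (an illustration only): applying $e^{-iaP/\hbar}$ in Fourier space, where P acts as multiplication by ħk, indeed shifts ψ(x) to ψ(x − a).

    import numpy as np

    hbar = 1.0
    L, n = 20.0, 512
    x = np.linspace(-L/2, L/2, n, endpoint=False)
    k = 2*np.pi*np.fft.fftfreq(n, d=L/n)          # wavenumbers; P acts as hbar*k here

    psi = np.exp(-(x - 1.0)**2)                    # a Gaussian centred at x = 1
    a = 3.0
    # U(a) = exp(-i a P / hbar) acts on Fourier coefficients as exp(-i a k)
    psi_trans = np.fft.ifft(np.exp(-1j*a*k) * np.fft.fft(psi))

    assert np.allclose(psi_trans, np.exp(-(x - 1.0 - a)**2), atol=1e-8)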

31Note the slight awkwardness of the Dirac notation here. Because operators act to the right, we have to write hχ|U|φi = (U −1|χi)†|φi for any unitary operator U, as we used in the third equality. In mathematical notation we would simply write (χ, Uφ) = (U −1χ, φ). 32At least if the wavefunction is sufficiently smooth. 33As with the position operator, the momentum operator P is unbounded, so the ‘momentum eigenstates’ |pi are necessarily non-normalizable, as we’ll soon see explicitly.

– 37 – Figure 4: An infinitesimal rotation transforms v v + δα v. 7→ × becomes −ia·P/~ ψp(x a) = x a p = x U(a) p = x e p − h − | i h | | i h | | i (4.36) −ia·p/~ −ia·p/~ = e x p = e ψp(x) , h | i where in going to the second line we used the fact that p is an eigenstate of P. Comparing | i the first and last expression, we deduce that the position space wavefunction of a momentum eigenstate must take the form ip·x/~ ψp(x) = C e (4.37) −3/2 for some x-independent factor C. It’s convenient to choose C to be (2π~) since in this case we have 3 3 Z Z d x d y 0 p0 p = d3y d3x p0 y y x x p = e−ip ·y/~ δ3(y x) eip·x/~ h | i h | ih | ih | i (2π~)3 − 3 (4.38) Z d x 0 = e−i(p −p)·x/~ = δ3(p0 p) (2π~)3 − so that the momentum eigenstates are normalised in the same way as the position eigen- states (and in the same way as for any continuum states). Again, the form

$$ \psi_{\mathbf{p}}(\mathbf{x}) = \frac{1}{(2\pi\hbar)^{3/2}}\,e^{i\mathbf{p}\cdot\mathbf{x}/\hbar} \qquad(4.39) $$
of the wavefunction for momentum eigenstates is familiar, but here we've derived it from first principles rather than using the position space representation (4.35) of the momentum operator. (We assumed this result earlier when showing that the position and momentum space wavefunctions are each other's Fourier transforms.) Note also that $|\psi_{\mathbf{p}}(\mathbf{x})|^2 = 1/(2\pi\hbar)^3$, so $\int_{\mathbb{R}^3}|\psi_{\mathbf{p}}(\mathbf{x})|^2\,\mathrm{d}^3x$ diverges. Like position eigenstates, momentum eigenstates are also non-normalizable.

4.3 Rotations

We now consider the effect rotating $\mathbb{R}^3$ has on a quantum state $|\psi\rangle \in \mathcal{H}$. For a vector $\mathbf{v} \in \mathbb{R}^3$, a rotation anticlockwise around the axis $\hat{\boldsymbol{\alpha}}$ by an amount $|\boldsymbol{\alpha}|$ is a linear transformation
$$ R(\boldsymbol{\alpha}): \mathbb{R}^3 \to \mathbb{R}^3\,, \qquad R(\boldsymbol{\alpha}): \mathbf{v} \mapsto \mathbf{v}' = R(\boldsymbol{\alpha})\mathbf{v} \qquad(4.40) $$

that obeys
$$ \mathbf{v}'\cdot\mathbf{v}' = \mathbf{v}\cdot\mathbf{v} \quad\text{and}\quad \det(R(\boldsymbol{\alpha})) = +1\,. \qquad(4.41) $$
The first of these conditions says that rotations preserve lengths, and implies that $R(\boldsymbol{\alpha})$ must be an orthogonal transformation. The second ensures that $R(\boldsymbol{\alpha})$ preserves the orientation. For infinitesimal rotations, figure 4 shows that

$$ \mathbf{v}' = \mathbf{v} + \delta\boldsymbol{\alpha}\times\mathbf{v} + O(\delta\alpha^2)\,. \qquad(4.42) $$
(Note that this obeys (4.41).) Given two orthogonal transformations $R(\boldsymbol{\alpha})$ and $R(\boldsymbol{\beta})$, the composite $R(\boldsymbol{\beta})R(\boldsymbol{\alpha})$ is again an orthogonal transformation with unit determinant. Spatial rotations form a group, known as SO(3), with the identity being the trivial rotation and the inverse of $R(\boldsymbol{\alpha})$ being a rotation of the same amount around the same axis $\boldsymbol{\alpha}$, but now clockwise. However, in general $R(\boldsymbol{\beta})R(\boldsymbol{\alpha}) \neq R(\boldsymbol{\alpha})R(\boldsymbol{\beta})$, so the order in which we apply rotations is important and the rotation group is non-Abelian.

As for any group of transformations, in quantum mechanics the group of rotations is represented on $\mathcal{H}$ by unitary operators. We denote the unitary operator corresponding to $R(\boldsymbol{\alpha})$ as $U(\boldsymbol{\alpha})$. $U(\boldsymbol{\alpha})$ acts on the position operator as

$$ U^{-1}(\boldsymbol{\alpha})\,\mathbf{X}\,U(\boldsymbol{\alpha}) = R(\boldsymbol{\alpha})\mathbf{X}\,, \qquad(4.43) $$
where the lhs involves composition of operators in $\mathcal{H}$, while the rhs here is the usual action of a rotation on the three components of **X**. For example, if $\boldsymbol{\alpha} = \alpha\hat{\mathbf{z}}$ so that the rotation is around the z-axis in $\mathbb{R}^3$, then in more detail (4.43) says
$$ \begin{pmatrix} U^{-1}(\boldsymbol{\alpha})\,X\,U(\boldsymbol{\alpha}) \\ U^{-1}(\boldsymbol{\alpha})\,Y\,U(\boldsymbol{\alpha}) \\ U^{-1}(\boldsymbol{\alpha})\,Z\,U(\boldsymbol{\alpha}) \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} X \\ Y \\ Z \end{pmatrix}\,. \qquad(4.44) $$

Thus, conjugating the operator X by this rotation operator changes it to a new operator $U^{-1}(\boldsymbol{\alpha})\,X\,U(\boldsymbol{\alpha})$ on Hilbert space, which we identify with the usual linear combination $X\cos\alpha - Y\sin\alpha$. Again, $U(\boldsymbol{\alpha})$ must act on **X** this way if it is indeed to correspond to a rotation.

For an infinitesimal transformation, following (4.11) we can write
$$ U(\delta\boldsymbol{\alpha}) = 1 - \frac{i}{\hbar}\,\delta\boldsymbol{\alpha}\cdot\mathbf{J} + O(\delta\alpha^2) \qquad(4.45) $$
for some Hermitian generators **J**/ħ. Since angles are dimensionless, **J**, like ħ, must have dimensions of (length × momentum). Later we will see that **J** corresponds to the angular momentum operator, though by definition we have that **J**/ħ is the generator of rotations. Using this and (4.42) in (4.43) shows that we have
$$ \frac{i}{\hbar}\left[\delta\boldsymbol{\alpha}\cdot\mathbf{J},\,\mathbf{X}\right] = \delta\boldsymbol{\alpha}\times\mathbf{X}\,, \qquad(4.46) $$
and since this is true for any axis of rotation, we have the commutation relations
$$ [J_i, X_j] = i\hbar\sum_k \epsilon_{ijk}X_k\,. \qquad(4.47) $$

These relations are nothing more than the statement that the operator **X** transforms as a vector under rotations. They are the infinitesimal version of (4.43) – they must hold if **J** is indeed to generate rotations.

Just as for translations, the mutual commutation relations among the components of **J** themselves follow from the fact that the $U(\boldsymbol{\alpha})$s provide a homomorphism from the group[34] SO(3) of rotations to the space of unitary operators on $\mathcal{H}$. However, the non-Abelian nature of the rotation group means we should expect these commutators to be non-trivial in general. Performing two infinitesimal rotations through angles $\delta\boldsymbol{\alpha}$ and $\delta\boldsymbol{\beta}$, for any vector **v** we have
$$ R(\delta\boldsymbol{\beta})R(\delta\boldsymbol{\alpha})\mathbf{v} = R(\delta\boldsymbol{\beta})(\mathbf{v} + \delta\boldsymbol{\alpha}\times\mathbf{v}) = (\mathbf{v} + \delta\boldsymbol{\alpha}\times\mathbf{v}) + \delta\boldsymbol{\beta}\times(\mathbf{v} + \delta\boldsymbol{\alpha}\times\mathbf{v}) = \mathbf{v} + \delta\boldsymbol{\alpha}\times\mathbf{v} + \delta\boldsymbol{\beta}\times\mathbf{v} + \delta\boldsymbol{\beta}\times(\delta\boldsymbol{\alpha}\times\mathbf{v}) \qquad(4.48) $$
to lowest non-trivial order in $\delta\boldsymbol{\alpha}$ and $\delta\boldsymbol{\beta}$. Consequently, the difference between these rotations and the two rotations performed in the opposite order is

$$ \left[R(\delta\boldsymbol{\beta})R(\delta\boldsymbol{\alpha}) - R(\delta\boldsymbol{\alpha})R(\delta\boldsymbol{\beta})\right]\mathbf{v} = \delta\boldsymbol{\beta}\times(\delta\boldsymbol{\alpha}\times\mathbf{v}) - \delta\boldsymbol{\alpha}\times(\delta\boldsymbol{\beta}\times\mathbf{v}) = (\delta\boldsymbol{\beta}\times\delta\boldsymbol{\alpha})\times\mathbf{v} = R(\delta\boldsymbol{\beta}\times\delta\boldsymbol{\alpha})\mathbf{v} - \mathbf{v}\,, \qquad(4.49) $$
using standard properties of the vector triple product. The rhs again involves a rotation, through the angle $\delta\boldsymbol{\beta}\times\delta\boldsymbol{\alpha}$. Applying the homomorphism U we obtain
$$ [U(\delta\boldsymbol{\beta}), U(\delta\boldsymbol{\alpha})] = U(\delta\boldsymbol{\beta}\times\delta\boldsymbol{\alpha}) - 1 \qquad(4.50) $$
for our operators in Hilbert space. Using (4.45), this is
$$ -\frac{1}{\hbar^2}\left[\delta\boldsymbol{\beta}\cdot\mathbf{J},\,\delta\boldsymbol{\alpha}\cdot\mathbf{J}\right] = -\frac{i}{\hbar}\,(\delta\boldsymbol{\beta}\times\delta\boldsymbol{\alpha})\cdot\mathbf{J}\,, \qquad(4.51) $$
and since this must hold for arbitrary successive infinitesimal rotations, we have finally
$$ [J_i, J_j] = i\hbar\sum_k \epsilon_{ijk}J_k \qquad(4.52) $$
as the commutation relations among the rotation generators.

Again, there's nothing 'weird' or 'quantum' about (4.52), beyond the fact that they involve operators on Hilbert space. In particular, their form just reflects the fact that the order of rotations around different axes matters, and that the difference between the two orderings is itself a rotation around the axis perpendicular to the original two. Compared to the relations $[P_i, P_j] = 0$, the non-triviality of the commutation relations (4.52) arises purely because SO(3) is a non-Abelian group, unlike the group of translations. These non-trivial commutation relations do not prevent us from exponentiating (4.45) to write $U(\boldsymbol{\alpha}) = e^{-i\boldsymbol{\alpha}\cdot\mathbf{J}/\hbar}$ for a finite rotation around a fixed axis $\boldsymbol{\alpha}$, because the exponentiation always involves the same component $\hat{\boldsymbol{\alpha}}\cdot\mathbf{J}$ of the rotation generator, which certainly commutes with itself.

[34] Again, later we'll see that this statement needs to be slightly refined.
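The algebra (4.52) can be checked directly in the defining (vector) representation of SO(3), where the generators have components $(J_i)_{jk} = -i\hbar\,\epsilon_{ijk}$ – a standard fact, verified here numerically as a sketch:

    import numpy as np

    hbar = 1.0
    eps = np.zeros((3, 3, 3))                     # the Levi-Civita symbol
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k], eps[i, k, j] = 1.0, -1.0
    J = [-1j*hbar*eps[i] for i in range(3)]       # (J_i)_jk = -i hbar eps_ijk

    for i in range(3):
        for j in range(3):
            comm = J[i] @ J[j] - J[j] @ J[i]
            # check [J_i, J_j] = i hbar sum_k eps_ijk J_k, cf. (4.52)
            assert np.allclose(comm, 1j*hbar*sum(eps[i, j, k]*J[k] for k in range(3)))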

The combined group of translations and rotations of Euclidean $\mathbb{R}^3$ is sometimes known as E(3), or else ISO(3). For translations through $\mathbf{a}_1$ and $\mathbf{a}_2$, and rotations $R_1$ and $R_2$, E(3) has the group composition law

$$ (\mathbf{a}_2, R_2)\cdot(\mathbf{a}_1, R_1) = (\mathbf{a}_2 + R_2\mathbf{a}_1,\ R_2 R_1)\,, \qquad(4.53) $$
where we note that the second rotation also acts on the first translation. Clearly, both $\mathbb{R}^3$ and SO(3) are subgroups of E(3), but the fact that rotations act non-trivially on previous translations means that E(3) is the semi-direct product $E(3) \cong \mathbb{R}^3 \rtimes SO(3)$. This group law shows that rotations and translations do not commute: translating any vector **v** through **a** and then rotating around $\hat{\boldsymbol{\alpha}}$ gives us $R(\boldsymbol{\alpha})(\mathbf{v} + \mathbf{a})$, whereas first rotating then translating gives $\mathbf{a} + R(\boldsymbol{\alpha})\mathbf{v}$ instead. In particular, for infinitesimal translations and rotations we have
$$ R(\delta\boldsymbol{\alpha})(\mathbf{v} + \delta\mathbf{a}) - \left(R(\delta\boldsymbol{\alpha})\mathbf{v} + \delta\mathbf{a}\right) = \delta\boldsymbol{\alpha}\times\delta\mathbf{a}\,, \qquad(4.54) $$
independent of **v**. Applying the homomorphism U, in Hilbert space we obtain commutation relations[35]
$$ [U_R(\delta\boldsymbol{\alpha}), U_T(\delta\mathbf{a})] = U_T(\delta\boldsymbol{\alpha}\times\delta\mathbf{a}) - 1\,, \qquad(4.55) $$
or equivalently
$$ [J_i, P_j] = i\hbar\sum_k \epsilon_{ijk}P_k \qquad(4.56) $$
in terms of the rotation and translation generators.

More generally, any 3-component linear operator $\mathbf{V}: \mathcal{H} \to \mathcal{H}$ is said to transform as a vector under rotations if it obeys

$$ U^{-1}(\boldsymbol{\alpha})\,\mathbf{V}\,U(\boldsymbol{\alpha}) = R(\boldsymbol{\alpha})\mathbf{V} \qquad(4.57) $$
for any $\boldsymbol{\alpha}$. (Depending on its behaviour under parity transformations, discussed in section 4.6, such a **V** could either be a vector or pseudovector operator.) Just as above, for infinitesimal rotations this implies the commutation relations
$$ [J_i, V_j] = i\hbar\sum_k \epsilon_{ijk}V_k \qquad(4.58) $$
among the components of **V** and **J**. We see that **X**, **P** and **J** itself[36] each transform as vectors under rotations. This of course is just what we'd expect for position, momentum and angular momentum in classical mechanics. On the other hand, if an operator S obeys

$$ U^{-1}(\boldsymbol{\alpha})\,S\,U(\boldsymbol{\alpha}) = S \qquad(4.59) $$

[35] Here, for the sake of clarity, we have labelled the translation operator by $U_T$ and the rotation operator by $U_R$. In particular, the operator on the rhs of (4.55) is a translation.

[36] This is always true of **X** and **P**, but turns out to be a three-dimensional coincidence for **J**: the rotation group SO(d) of $\mathbb{R}^d$ has dimension $d(d-1)/2$ and the generators are generically represented by antisymmetric matrices $J_{ij}$. When d = 3, a 3 × 3 antisymmetric matrix has 3 independent components, so we can equivalently package the $J_{ij}$ as vectors through $J_i = \epsilon_{ijk}J_{jk}$.

for any $\boldsymbol{\alpha}$, so that it is unchanged by the rotation operator, we say S is a scalar operator.[37] The corresponding infinitesimal version is

$$ [\mathbf{J}, S] = 0\,. \qquad(4.60) $$

Just as we can form a scalar by taking the Euclidean inner product $\mathbf{v}\cdot\mathbf{w}$ of two vectors $\mathbf{v}, \mathbf{w} \in \mathbb{R}^3$, so we can form scalar operators from the Euclidean inner product of two vector operators. Indeed, if $U^{-1}(\boldsymbol{\alpha})\,\mathbf{V}\,U(\boldsymbol{\alpha}) = R(\boldsymbol{\alpha})\mathbf{V}$ and similarly for **W**, then

$$ U^{-1}(\boldsymbol{\alpha})\,(\mathbf{V}\cdot\mathbf{W})\,U(\boldsymbol{\alpha}) = \left(U^{-1}(\boldsymbol{\alpha})\,\mathbf{V}\,U(\boldsymbol{\alpha})\right)\cdot\left(U^{-1}(\boldsymbol{\alpha})\,\mathbf{W}\,U(\boldsymbol{\alpha})\right) = (R(\boldsymbol{\alpha})\mathbf{V})\cdot(R(\boldsymbol{\alpha})\mathbf{W}) = \mathbf{V}\cdot\mathbf{W} \qquad(4.61) $$
by the standard rotational invariance of the dot product. As always, we can write this in terms of commutators by considering infinitesimal rotations:
$$ \left[J_i,\,\sum_j V_j W_j\right] = \sum_j [J_i, V_j]W_j + \sum_j V_j[J_i, W_j] = i\hbar\sum_{jk}\epsilon_{ijk}V_k W_j + i\hbar\sum_{jk}V_j\,\epsilon_{ijk}W_k = i\hbar\sum_{jk}\epsilon_{ijk}\left(-V_j W_k + V_j W_k\right) = 0\,, \qquad(4.62) $$
where in going to the final line we relabelled dummy indices $j \leftrightarrow k$ in the first term and used the antisymmetry of $\epsilon_{ijk}$. Note that our calculations always preserved the order of **V** and **W** – our result holds irrespective of whether **V** and **W** commute.

An important special case of (4.60) is to take $S = \mathbf{J}^2 = \mathbf{J}\cdot\mathbf{J}$. The fact that the rotation generators obey the commutation relations

$$ [J_i, J_j] = i\hbar\sum_k \epsilon_{ijk}J_k \quad\text{and}\quad [J_i, \mathbf{J}^2] = 0 \qquad(4.63) $$
means that we cannot find a complete set of simultaneous eigenstates of all of the $J_i$s, but we can find a complete set of simultaneous eigenstates of $\mathbf{J}^2$ and any one component of **J**. Looking ahead to our interpretation of **J** as the angular momentum operator, Born's 2nd postulate of QM tells us that we cannot say a particle has a definite angular momentum vector, but we can know the magnitude of the angular momentum and the amount aligned along any given axis.

4.3.1 Translations Around a Circle

In IB QM, you defined the orbital angular momentum operator $\mathbf{L} = \mathbf{X}\times\mathbf{P}$, which also has the commutation relations (4.58) with itself and with **X** and **P**. We'll understand the

[37] More precisely, we say that S transforms as a scalar under rotations – depending on its behaviour under parity transformations it could be either a scalar or pseudoscalar operator.

relation between **J** and **L** below, but it's important to realise that in general $\mathbf{J} \neq \mathbf{L}$. We can understand **L** from the present perspective as follows.

When a system is displaced through the vector **a**, its state is transformed by the unitary translation operator $U(\mathbf{a}) = e^{-i\mathbf{a}\cdot\mathbf{P}/\hbar}$. We now imagine successively performing N translations, each along one edge of a regular N-sided polygon that lies in the plane $\mathbf{n}\cdot\mathbf{x} = 0$ and is centred on the origin. In the limit $N \to \infty$ that our regular polygon has an infinite number of sides, this corresponds to translating our system around a circular path. Thus, we can move our system around a circle by applying a succession of small translations, each in a slightly different direction. Specifically, if the system initially lies at some location **x**, making an angle α with some axis in the plane $\mathbf{n}\cdot\mathbf{x} = 0$ of our polygon, then when we move it to lie at an angle α + δα we translate it through $\delta\mathbf{a} = \delta\alpha\,\mathbf{n}\times\mathbf{x}$. Thus the associated unitary translation operator obeys

$$ U^{-1}(\delta\mathbf{a})\,\mathbf{X}\,U(\delta\mathbf{a}) = \mathbf{X} + \delta\alpha\,(\mathbf{n}\times\mathbf{X}) \qquad(4.64) $$
and so can be written as
$$ U(\delta\mathbf{a}) = 1 - \frac{i}{\hbar}\,\delta\alpha\,(\mathbf{n}\times\mathbf{X})\cdot\mathbf{P} + O(\delta\alpha^2) = 1 - \frac{i}{\hbar}\,\delta\alpha\,\mathbf{n}\cdot\mathbf{L} + O(\delta\alpha^2)\,, \qquad(4.65) $$
where, interchanging dot and cross, **L** is the Hermitian operator

$$ \mathbf{L} = \mathbf{X}\times\mathbf{P}\,. \qquad(4.66) $$
We now see that **L**/ħ is the generator of such circular translations. We recall from IB QM that the components of **L** obey the commutation relations[38]
$$ [L_i, X_j] = i\hbar\sum_k \epsilon_{ijk}X_k\,, \qquad [L_i, P_j] = i\hbar\sum_k \epsilon_{ijk}P_k \qquad\text{and}\qquad [L_i, L_j] = i\hbar\sum_k \epsilon_{ijk}L_k\,, $$
all of which follow from the more primitive commutation relations $[X_i, P_j] = i\hbar\,\delta_{ij}$ and $[X_i, X_j] = 0 = [P_i, P_j]$. Equation (4.65) contains only one operator, $\mathbf{n}\cdot\mathbf{L}$, so it inevitably commutes with itself and we can exponentiate to find $U(\mathbf{a}_{\text{circ}}) = e^{-i\alpha\,\mathbf{n}\cdot\mathbf{L}/\hbar}$ for finite translations around our circle.

4.3.2 Spin

If our system is completely described by the Hilbert space $L^2(\mathbb{R}^3, \mathrm{d}^3x)$, so that we completely specify the state of the system by giving (say) its position space wavefunction ψ(**x**), then **L** and **J** are indistinguishable. However, if our system has any internal structure, its Hilbert space will be more complicated. Translating it around a circular arc is then not the same thing as rotating it through the same angle: the difference is best explained by a picture, which you can find in figure 5. Thus, when $\mathcal{H} \neq L^2(\mathbb{R}^3, \mathrm{d}^3x)$ the action of **J** and **L** may not coincide. We define the spin operator **S** to be the difference

$$\mathbf S = \mathbf J - \mathbf L \qquad (4.67)$$

³⁸Check that you're able to derive these!

Figure 5: The rotation operator $U(\alpha)$ swings a system around the origin and also rotates its orientation in $\mathbb R^3$, while a circular translation merely moves the system around a circular path without affecting its orientation. The difference is a rotation of the body around its own centre of mass, reorienting it without changing its location.

so that $\mathbf J = \mathbf L + \mathbf S$. From figure 5 we see that the difference between rotating an object around some fixed origin and translating its centre of mass along a circular path is a rotation around the object's own centre of mass. We thus expect (and will confirm below) that $\mathbf S$ generates rotations of a body around its own centre of mass, reorienting it in space. This is why $\mathbf S$ is called the spin operator.

In the case of a macroscopic body, made up of many constituent particles, it's reasonable to suppose that the spin operator just accounts for the difference between translating the body as a whole and translating each individual particle around its own arc, with slightly different radii according to where in the body the particle is located. That is,

$$\mathbf S \overset{?}{=} \sum_a \mathbf X_a\times\mathbf P_a - \mathbf X\times\mathbf P = \sum_a \mathbf x_a\times\mathbf p_a\,, \qquad (4.68)$$

where $\mathbf X_a$ and $\mathbf P_a$ are position and translation operators for each individual particle, and $\mathbf x_a$ and $\mathbf p_a$ are their positions and momenta relative to the centre of mass³⁹. However, for an object consisting of many particles such a description is clearly going to be very cumbersome. More fundamentally, we do not know what the 'fundamental' constituents of our object really are. (For example, if I sit on a merry-go-round, is the 'right' quantum description of my motion given in terms of my cells, or my atoms, or protons and neutrons, or quarks and gluons, or bits of string, or ...?) It's thus crucial that we can consider objects as a whole in quantum mechanics, just as we can classically. For rotations, understanding $\mathbf S$ will allow us to do this. In addition, as we'll see in section 5.3.2, one of the surprises of quantum mechanics is that even fundamental particles such as electrons and photons may have an 'intrinsic' spin that (as far as we know) is not related to any composite structure.

³⁹We'll understand how to properly describe the quantum mechanics of a system with many degrees of freedom (e.g. many particles) in section 2.3.

Fortunately, commutation relations involving the spin operator $\mathbf S$ are easy to obtain from the ones we already have for $\mathbf J$ and $\mathbf L$. Firstly, the fundamental relations

$$[J_i, X_j] = i\hbar\sum_k \epsilon_{ijk}X_k \qquad\text{and}\qquad [J_i, P_j] = i\hbar\sum_k \epsilon_{ijk}P_k \qquad (4.69)$$

show that

$$[J_i, L_j] = i\hbar\sum_k \epsilon_{ijk}L_k\,, \qquad (4.70)$$

so that the angular momentum operator $\mathbf L$ also transforms as a vector under rotations. Therefore we have

$$[S_i, S_j] = [J_i - L_i, J_j - L_j] = [J_i, J_j] - [J_i, L_j] - [L_i, J_j] + [L_i, L_j] = i\hbar\sum_k \epsilon_{ijk}(J_k - L_k) = i\hbar\sum_k \epsilon_{ijk}S_k\,, \qquad (4.71)$$

where we used the fact that $[L_i, L_j] = i\hbar\sum_k \epsilon_{ijk}L_k$. This algebra is the same as that obeyed by the components of $\mathbf J$ and $\mathbf L$. It confirms that the spin operators $\mathbf S$ generate some form of rotation. In particular, it immediately follows from (4.71) that

$$[S_i, \mathbf S^2] = 0\,, \qquad (4.72)$$

just as for $\mathbf J$ and $\mathbf L$. On the other hand, since $[L_i, X_j] = i\hbar\sum_k \epsilon_{ijk}X_k$ and $[L_i, P_j] = i\hbar\sum_k \epsilon_{ijk}P_k$, we have

$$[S_i, X_j] = [J_i, X_j] - [L_i, X_j] = 0\,, \qquad [S_i, P_j] = [J_i, P_j] - [L_i, P_j] = 0\,. \qquad (4.73)$$

If we view $\mathbf X$ and $\mathbf P$ as referring to the system's centre of mass, then these commutation relations confirm that the spin operator has nothing to do with our object's location in or motion through space, but is purely to do with rotating its intrinsic orientation. Finally, since $\mathbf S$ commutes with both $\mathbf X$ and $\mathbf P$, it also commutes with $\mathbf L$:

$$[S_i, L_j] = 0 \qquad \forall\, i, j\,. \qquad (4.74)$$

This allows us to factorize the operator $U(\alpha)$ describing finite rotations as

$$U(\boldsymbol\alpha) = e^{-i\boldsymbol\alpha\cdot\mathbf J/\hbar} = e^{-i\boldsymbol\alpha\cdot(\mathbf L+\mathbf S)/\hbar} = e^{-i\boldsymbol\alpha\cdot\mathbf L/\hbar}\,e^{-i\boldsymbol\alpha\cdot\mathbf S/\hbar} = e^{-i\boldsymbol\alpha\cdot\mathbf S/\hbar}\,e^{-i\boldsymbol\alpha\cdot\mathbf L/\hbar}\,. \qquad (4.75)$$

As in figure 5, these equations confirm that we can think of a quantum rotation as consisting of a translation of a body's centre of mass along an arc centred on the origin, together with a simultaneous rotation of the body around its own centre of mass by the same amount. The order in which we perform these two operations makes no difference.

4.4 Time Translations

It's not only transformations of space that are represented on $\mathcal H$ by unitary operators. Consider translations in time. These again form an Abelian group, since sending $t_0 \to t_0 + t \to t_0 + t + t'$ gives the same end result as $t_0 \to t_0 + t' \to t_0 + t' + t$.

If we want the total probability of our particle being found somewhere in $\mathbb R^3$ to remain 1 at all times, then time translation must also be represented by a unitary operator $U(t): \mathcal H \to \mathcal H$. Since we can translate through an arbitrarily small time, the time translation operator takes the form

$$U(t) = \exp\left(-\frac{i}{\hbar}Ht\right)\,, \qquad (4.76)$$

where $H/\hbar$ must be Hermitian and is, by definition, the generator of time translations. This generator must have dimensions of 1/(time), so the conventional factor of $\hbar$ means that $H$ itself has dimensions of energy. Of course, $H$ will turn out to be the Hamiltonian, but for now I'd like you to think of it more abstractly, just as the generator of time translations.

The fact that $U(t)$ represents the effect of a time translation on $\mathcal H$ means that if a particle is in some state $|\psi(0)\rangle$ at time $t_0 = 0$, translating forward to time $t$ we will find it in the state

$$|\psi(t)\rangle = U(t)|\psi(0)\rangle = e^{-iHt/\hbar}|\psi(0)\rangle\,. \qquad (4.77)$$

In particular, the difference between $|\psi(t)\rangle$ and the state we find a short time later is

$$|\psi(t+\delta t)\rangle - |\psi(t)\rangle = -\frac{i}{\hbar}\,\delta t\,H|\psi(t)\rangle + O(\delta t^2)\,, \qquad (4.78)$$

or, taking the limit $\delta t \to 0$,

$$i\hbar\,\frac{\partial}{\partial t}|\psi(t)\rangle = H|\psi(t)\rangle\,. \qquad (4.79)$$

This, famously, is the Time Dependent Schrödinger Equation, which I'll often abbreviate to TDSE in what follows. It's just the infinitesimal version of the statement that all states in $\mathcal H$ evolve in time according to the action of a unitary operator $U(t)$.

4.4.1 The Heisenberg Picture

In IB QM, we were used to the idea that states evolve in time according to the TDSE. However, just as we did for spatial transformations, we can instead work in a picture where the states are time independent and instead the operators evolve in time. To see how this works, suppose $O_S$ is some operator that contains no explicit time dependence. Then the amplitude for the state $O_S|\psi(t)\rangle$ to agree at time $t$ with the state $|\chi(t)\rangle$ is

$$\langle\chi(t)|O_S|\psi(t)\rangle = \langle\chi(0)|U^{-1}(t)\,O_S\,U(t)|\psi(0)\rangle\,. \qquad (4.80)$$

The right hand side here only makes explicit reference to the initial values of the states $|\psi\rangle$ and $|\chi\rangle$. Since it holds true for any pair of states, just as above we can obtain the same results by always working with the initial states and instead evolving the operators as

$$O_H(t) = U^{-1}(t)\,O_S\,U(t)\,. \qquad (4.81)$$

The version of quantum mechanics where we use the TDSE to evolve the states in time, leaving operators unaltered, is known as the Schrödinger picture, whereas the version where the states are fixed at their initial values and instead the operators evolve in time is called the Heisenberg picture.
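Here is a minimal numerical illustration (my addition; the two-level Hamiltonian is invented for the example) that $|\psi(t)\rangle = U(t)|\psi(0)\rangle$ with $U(t) = e^{-iHt/\hbar}$ satisfies the TDSE (4.79) and preserves the norm:

```python
# Check that psi(t) = expm(-i*H*t/hbar) psi(0) obeys the TDSE (4.79),
# using a central finite difference, and that its norm is preserved.
import numpy as np
from scipy.linalg import expm

hbar = 1.0
H = np.array([[1.0, 0.3], [0.3, -0.5]])      # any Hermitian H will do
psi0 = np.array([1.0, 0.0], complex)

def psi(t):
    return expm(-1j*H*t/hbar) @ psi0

t, dt = 1.3, 1e-6
lhs = 1j*hbar*(psi(t + dt) - psi(t - dt))/(2*dt)   # i*hbar d|psi>/dt
rhs = H @ psi(t)
print(np.max(np.abs(lhs - rhs)))                   # ~1e-8
print(abs(np.linalg.norm(psi(t)) - 1.0))           # ~0: unitarity
```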

In fact, all this has a precise analogue in classical mechanics. Classically, there are also two ways of thinking about time evolution. On the one hand, we can think of a particle moving in some way through phase space $M$. If we know its location $(x(t), p(t)) \in M$ for every time $t$, we can compute any quantity we wish, represented by some function $f: M \to \mathbb R$, by evaluating $f$ at the location of our particle, obtaining the value $f(x(t), p(t))$. (This is the perspective we took in the Introduction.) However, Newton's Laws are deterministic so, given a force, the entire trajectory is determined by the initial conditions $(x_0, p_0)$. This suggests a perspective in which the 'state' of our particle is simply a choice of initial conditions. These initial conditions do not themselves evolve; rather, it is the quantities we measure that vary in time. Thus, instead of thinking of a physical quantity $f$ as a map from phase space, we treat it just as a map from time, so $f: [t_0, \infty) \to \mathbb R$. You'll examine these classical pictures further if you're taking the Classical Dynamics course, in the context of Hamiltonian mechanics. (It's really for this reformulation of classical mechanics, not the particular $H$, that Hamilton is famous.) Differentiating (4.81) wrt $t$ shows that

$$\frac{d}{dt}O_H(t) = \frac{d}{dt}\left(U^{-1}(t)\,O_S\,U(t)\right) = \frac{i}{\hbar}\left(U^{-1}(t)\,HO_S\,U(t) - U^{-1}(t)\,O_SH\,U(t)\right) = \frac{i}{\hbar}[H, O_H(t)]\,, \qquad (4.82)$$

where in the last step we used the fact that $[U(t), H] = 0$, since $U(t)$ depends only on $H$. This is Heisenberg's equation of motion. It's completely equivalent to Schrödinger's equation (4.79), and is also just a (very important!) special case of our general argument of how operators transform under the action of unitary groups of transformations. More generally, if the original operator $O_S$ had some explicit time dependence of its own, independent of any particular particle's motion, then we would obtain a further term in this equation, modifying it to

$$\frac{d}{dt}O_H(t) = U^{-1}(t)\,\frac{\partial O_S}{\partial t}\,U(t) + \frac{i}{\hbar}[H, O_H(t)]\,. \qquad (4.83)$$

For example, we may wish to understand the behaviour of a charged particle in the presence of an applied electric field. If the electric field is itself changing in time, then the potential in which the particle moves will be time dependent even in the Schrödinger picture.
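As a sanity check (mine, with a randomly chosen Hamiltonian), one can confirm Heisenberg's equation (4.82) numerically by comparing a finite-difference derivative of $O_H(t)$ with $(i/\hbar)[H, O_H(t)]$:

```python
# Numerical check of the Heisenberg equation dO_H/dt = (i/hbar)[H, O_H]
# for a random Hermitian 2x2 Hamiltonian.
import numpy as np
from scipy.linalg import expm

hbar = 1.0
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2)) + 1j*rng.standard_normal((2, 2))
H = (A + A.conj().T)/2                       # random Hermitian Hamiltonian
O_S = np.array([[1, 0], [0, -1]], complex)   # Schroedinger-picture operator

def O_H(t):
    U = expm(-1j*H*t/hbar)
    return U.conj().T @ O_S @ U

t, dt = 0.7, 1e-6
lhs = (O_H(t + dt) - O_H(t - dt))/(2*dt)       # numerical dO_H/dt
rhs = (1j/hbar)*(H @ O_H(t) - O_H(t) @ H)      # (i/hbar)[H, O_H(t)]
print(np.max(np.abs(lhs - rhs)))               # ~1e-9: they agree
```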

4.5 Dynamics

So far, we've simply defined some unitary operators $U(\mathbf a)$, $U(\boldsymbol\alpha)$ and $U(t)$ that are responsible for translating, rotating or evolving our system through space and time. The commutation relations of the generators $\mathbf P$ and $\mathbf J$, and their commutation relations with $\mathbf X$, were determined purely by the properties of the corresponding group of transformations of $\mathbb R^3$. However, while we've seen that commutators such as $[H, \mathbf X]$, $[H, \mathbf P]$ and $[H, \mathbf J]$ are important in the Heisenberg picture, telling us how these operators change in time, we haven't yet given any way to actually calculate what such commutators should be. Similarly, although we've said that in the Schrödinger picture, states evolve in time according to

$|\psi(t)\rangle = U(t)|\psi(0)\rangle$, we haven't given any way to work out what the action of the generator $H$ on a state should actually be.

To do so, we must specify the dynamics: we must provide a relation $H = H(\mathbf X, \mathbf P)$ giving $H$ in terms of the other operators whose commutators we already understand. The simplest (non-trivial) such relation is⁴⁰

$$H = \frac{1}{2m}\mathbf P^2\,, \qquad (4.84)$$

where $m$ is a constant with dimensions of mass. Equation (4.84) relates how a state evolves in time (via $H$) to how its location is translated through space (via $\mathbf P$); in other words, it's telling us something about how particles move! In particular, with this form of Hamiltonian, the difference between the expected location of a state $|\psi\rangle$ at an initial time and a short time $\delta t$ later is

$$\langle\psi|U^{-1}(\delta t)\,\mathbf X\,U(\delta t)|\psi\rangle - \langle\psi|\mathbf X|\psi\rangle = \frac{i}{\hbar}\,\delta t\,\langle\psi|[H, \mathbf X]|\psi\rangle + O(\delta t^2) = \frac{\delta t}{m}\,\langle\psi|\mathbf P|\psi\rangle + O(\delta t^2)\,. \qquad (4.85)$$

In the limit $\delta t \to 0$ we see that the expected velocity of the particle is $\langle\psi|\mathbf P|\psi\rangle/m$. Equivalently, since this is true for all $|\psi\rangle \in \mathcal H$, we can write

$$\frac{d}{dt}\mathbf X(t) = \frac{\mathbf P(t)}{m} \qquad (4.86)$$

in the Heisenberg picture. Furthermore, since $[H, \mathbf P] = 0$ for this Hamiltonian, $\mathbf P(t) = \mathbf P(0)$ and all states travel in uniform motion.

In the real world, we observe that particles do not always travel with constant velocities: they may slow down, speed up, or change direction as they encounter various obstacles. These obstacles are typically located at various different points in space. To allow for this, we generalise our dynamical relation (4.84) to

$$H = \frac{1}{2m}\mathbf P^2 + V(\mathbf X)\,. \qquad (4.87)$$

The first term on the right is the kinetic term and is the contribution to the Hamiltonian due to the state's travelling through space. The second, potential term is the contribution due to the state's location. This familiar form is still a very special case of the general

⁴⁰Note that this is the first time we've pinned ourselves down to a non-relativistic theory. Up to this point, everything we've said holds good in relativistic quantum mechanics and, suitably interpreted, also in quantum field theory. In the lectures we studied translations and rotations, but in the first problem set you'll extend this to also consider Galilean boosts and will show that the non-relativistic Hamiltonian is compatible with the Galilean algebra. If one instead wishes to study a relativistic version of quantum theory, one starts by finding unitary operators representing the action of the Poincaré group $\mathbb R^{3,1}\rtimes SO(3,1)$ of special relativity. The relation (4.84) between $H$ and $\mathbf P$ does not respect the commutation relations appropriate for Poincaré symmetry, but of course the dynamical relation $H^2 = c^2\mathbf P^2 + m^2c^4$ does. The difficulty with relativistic quantum mechanics lies not with symmetries, but with interactions. The proper treatment of these requires Quantum Field Theory, a far deeper subject.

statement $H = H(\mathbf X, \mathbf P)$, and later in the course we'll meet examples of Hamiltonians that don't fit this form. Repeating the calculation of (4.86), we still find that $d\mathbf X(t)/dt = \mathbf P(t)/m$, but now $[H, \mathbf P] \neq 0$, so the rate of motion through space will not always be the same. Rather, using (4.82), the Heisenberg picture operators obey

$$\frac{d}{dt}\mathbf P(t) = \frac{i}{\hbar}[H, \mathbf P(t)] = -\boldsymbol\nabla V(t)\,, \qquad (4.88)$$

where $\boldsymbol\nabla V(t) = U^{-1}(t)\,(\partial V/\partial\mathbf X)\,U(t) = \boldsymbol\nabla V(\mathbf X(t))$. Thus the motion slows down or speeds up according to the gradient of $V(\mathbf X)$.

Most of our common intuition about momentum comes from equation (4.88). We 'feel' that a tennis ball travelling with a certain speed has less momentum than a cannonball travelling with the same speed, and less than the same tennis ball travelling faster, because we've known from an early age that it will cause us less damage to stand in the way of the first tennis ball than either of the other two. This is really a statement about what our unfortunate bodies will have to do in order to bring these projectiles to rest (or how much effort we will have to exert to launch them in the first place). In other words, our intuitive notion of momentum is built on a feeling for the energy our body will gain as we slow the projectile down during the impact. This energy then excites the atoms that ultimately make up our nerves, muscles and bones, becoming dissipated through our bodies. Ultimately, it is the dynamical relation between the Hamiltonian and other operators that justifies us identifying the operator $\mathbf P/\hbar$ (by definition, the generator of translations) with our pre-existing notion of momentum (in units of $\hbar$). Only after we specify our dynamics do commutation relations such as $[X_i, P_j] = i\hbar\,\delta_{ij}$ tell us, via Born's 2nd postulate of QM, that a particle cannot simultaneously be in a state of well-defined position and of well-defined momentum.

We'll often find it useful to write the kinetic term in a slightly different form. Keeping careful track of the operator ordering, you'll show in the problem sets that

$$\mathbf L^2 = (\mathbf X\times\mathbf P)\cdot(\mathbf X\times\mathbf P) = \mathbf X^2\mathbf P^2 - \mathbf X^2 P_r^2\,, \qquad (4.89)$$

where for $\hat{\mathbf X} = \mathbf X/|\mathbf X|$,

$$P_r = \frac{1}{2}\left(\hat{\mathbf X}\cdot\mathbf P + \mathbf P\cdot\hat{\mathbf X}\right) \qquad (4.90)$$

is the radial momentum operator. Consequently, we can write the Hamiltonian (4.87) in terms of the generator of circular translations and the radial momentum operator $P_r$ as

$$H = \frac{P_r^2}{2m} + \frac{1}{2m\mathbf X^2}\,\mathbf L^2 + V(\mathbf X)\,, \qquad (4.91)$$

where we note that $[\mathbf L, \mathbf X^2] = 0$, so the order of the operators in the second term is irrelevant provided we keep $\mathbf X^2$ together as a composite operator. (In particular, in the position representation, $\mathbf X^2$ acts as multiplication by $\mathbf x^2 = r^2$, which is certainly unaffected by translation in a circle.) Most of our intuitive feeling about angular momentum (mine comes from happy childhood hours spent on merry-go-rounds) is encapsulated in this

relation between the Hamiltonian and $\mathbf L^2$. Again, it's the form of the Hamiltonian which is the basis of our identification of the generator $\mathbf L/\hbar$ of circular translations with angular momentum (in units of $\hbar$).
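To see the dynamical relations (4.86) and (4.88) in action, here is a small numerical sketch (my own; the Gaussian packet and harmonic potential are arbitrary illustrative choices) that evolves a one-dimensional wavepacket with a split-step method and checks $d\langle X\rangle/dt = \langle P\rangle/m$:

```python
# Split-step evolution of a 1d wavepacket, checking d<X>/dt = <P>/m.
import numpy as np

hbar, m = 1.0, 1.0
N, L_box = 2048, 40.0
dx = L_box/N
x = np.linspace(-L_box/2, L_box/2, N, endpoint=False)
k = 2*np.pi*np.fft.fftfreq(N, d=dx)            # angular wavenumbers
V = 0.5*x**2                                   # harmonic potential (arbitrary)
psi = np.exp(-(x + 3.0)**2/2 + 1j*0.5*x)       # displaced packet, <P> != 0
psi /= np.sqrt(np.sum(abs(psi)**2)*dx)

def expect_X(p):
    return float(np.sum(x*abs(p)**2)*dx)

def expect_P(p):
    dp = np.fft.ifft(1j*k*np.fft.fft(p))       # d(psi)/dx via FFT
    return float(np.real(np.sum(p.conj()*(-1j*hbar)*dp)*dx))

dt = 1e-3
def step(p):                                   # one Strang split step
    p = np.exp(-0.5j*V*dt/hbar)*p
    p = np.fft.ifft(np.exp(-0.5j*hbar*k**2*dt/m)*np.fft.fft(p))
    return np.exp(-0.5j*V*dt/hbar)*p

psi_mid = step(psi)                            # state at t = dt
psi_plus = step(psi_mid)                       # state at t = 2*dt
dXdt = (expect_X(psi_plus) - expect_X(psi))/(2*dt)
print(dXdt, expect_P(psi_mid)/m)               # agree to O(dt^2)
```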

4.5.1 Symmetries and Conservation Laws

One of the immediate uses of the Heisenberg picture is to deduce a relation between symmetries of the Hamiltonian and conserved quantities⁴¹. If it happens that an operator $Q$ commutes with the Hamiltonian, then it also commutes with $e^{-iHt/\hbar}$, so the Heisenberg picture operator

$$Q(t) = U^{-1}(t)\,Q\,U(t) = U^{-1}(t)\,U(t)\,Q = Q\,, \qquad (4.92)$$

coinciding with the Schrödinger operator for all $t$. Infinitesimally, provided $Q$ has no explicit time dependence, this is

$$\frac{d}{dt}Q(t) = \frac{i}{\hbar}[H, Q(t)] = \frac{i}{\hbar}U^{-1}(t)\,[H, Q]\,U(t) = 0\,. \qquad (4.93)$$

Operators that are time independent even in the Heisenberg picture are said to be conserved. We've shown that conserved operators are just those that commute with the Hamiltonian.

Suppose we prepare a particle to be in an eigenstate of some conserved operator at time $t = 0$, so $Q|\psi(0)\rangle = q|\psi(0)\rangle$. Then at any later time we have

$$Q|\psi(t)\rangle = Q\,U(t)|\psi(0)\rangle = U(t)\,Q|\psi(0)\rangle = q\,U(t)|\psi(0)\rangle = q|\psi(t)\rangle\,, \qquad (4.94)$$

so our particle remains in an eigenstate of $Q$ at all subsequent times (provided the state evolves according to the TDSE). For this reason, it's usually sensible to expand our states in a basis of eigenstates of a maximal set of conserved operators, rather than a maximal commuting set of any old operators. We label the states in this basis by their corresponding eigenvalues, because the same labelling will remain valid at subsequent times. For example, provided $H$ has no explicit time dependence⁴², $[H, H] = 0$ trivially, so the Hamiltonian itself is always conserved, and it is often useful to work in a basis of energy eigenstates.

By far the most important source of conserved quantities is symmetries of the Hamiltonian. These are transformations that leave the Hamiltonian invariant. We've seen that a generic unitary operator $U(\theta) = e^{-i\theta T}$ representing some transformation of space on $\mathcal H$ acts on operators as in equation (4.9). In particular, its action on the Hamiltonian is

$$H \mapsto U^{-1}(\theta)\,H\,U(\theta)\,. \qquad (4.95)$$

If the Hamiltonian is invariant under this transformation, i.e. if the transformation is a symmetry, then

$$U^{-1}(\theta)\,H\,U(\theta) = H\,, \qquad (4.96)$$

⁴¹Those of you taking the Classical Dynamics course will see a corresponding relation in Hamilton's approach to classical physics.
⁴²Even if the Hamiltonian does contain explicit time dependence, for example because it describes the dynamics of a charged particle in a varying electric field, $[H(t), H(t)] = 0$ trivially. However, $[H(t_0), H(t)]$ may not vanish as the form of the Hamiltonian itself changes.

or equivalently

$$[T, H] = 0 \qquad (4.97)$$

for an infinitesimal symmetry transformation, where $T$ is the Hermitian generator. This is exactly the same condition (4.93) we had for the generator $T$ to be conserved, so symmetries of the Hamiltonian correspond to conserved quantities⁴³.

For example, if the Hamiltonian is translationally invariant then $[\mathbf P, H] = 0$, so for such Hamiltonians, momentum will be conserved. Note that it's not necessary for the Hamiltonian to be independent of each particle's position $\mathbf x_a$ if $H$ is to be translationally invariant, so long as any potential term $V(\mathbf x_a - \mathbf x_b)$ depends only on the relative positions. Similarly, if the Hamiltonian is rotationally invariant then $[\mathbf J, H] = 0$, so angular momentum will be conserved. In this case, $J_z$, $\mathbf J^2$ and $H$ form a set of mutually commuting operators, so we can expand a general state in a basis $\{|n, \ell, m, \ldots\rangle\}$ where the labels $(n, \ell, m)$ refer to the eigenvalues of $H$, $\mathbf J^2$ and $J_z$. We've left open the possibility that our states have further conserved properties, indexed by further labels. These will often contain information about the 'internal' state of our system.
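A tiny numerical illustration of (4.94) (my addition; the operators below are invented so that $[H, Q] = 0$ by construction) confirms that an eigenstate of a conserved operator stays an eigenstate under time evolution:

```python
# If [Q, H] = 0, an eigenstate of Q remains an eigenstate with the
# same eigenvalue under evolution by U(t) = expm(-i*H*t/hbar).
import numpy as np
from scipy.linalg import expm

hbar = 1.0
Q = np.diag([1.0, 1.0, -1.0])            # "conserved" observable
H = np.array([[0.0, 0.4, 0.0],           # block diagonal in Q's
              [0.4, 1.0, 0.0],           # eigenspaces, so [H, Q] = 0
              [0.0, 0.0, 2.0]])
psi0 = np.array([1.0, 0.0, 0.0])         # Q psi0 = +1 psi0
psit = expm(-1j*H*3.7/hbar) @ psi0
print(np.allclose(Q @ psit, psit))       # True: still a +1 eigenstate of Q
```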

4.6 Parity

Not all transformations are continuous. In physics, the most prominent example of a discrete transformation is the parity transformation $\mathcal P$, acting on $\mathbb R^3$ as $\mathcal P: \mathbf x \mapsto -\mathbf x$. Since $\det(\mathcal P) = -1$, this is different from a rotation, so may have consequences that cannot be deduced by considering only rotations. On Hilbert space, parity transformations will still be represented by a unitary operator $\Pi: \mathcal H \to \mathcal H$, but this operator does not involve any parameter that may be taken infinitesimally small. Consequently there's no generator associated with parity transformations.

Parity transformations form the group $G = \mathbb Z_2$, since carrying out a non-trivial parity transformation twice just brings us back to where we were. Thus $\mathcal P^2 = 1_{\mathbb R^3}$, and applying our homomorphism shows that

$$\Pi^2 = 1_{\mathcal H} \qquad (4.98)$$

for the parity operator on Hilbert space, too. Thus, the spectrum of $\Pi$ is just $\{+1, -1\}$. Note that since $\Pi$ is unitary, its quantum numbers $\eta$ behave multiplicatively, whilst those of a Hermitian generator of a continuous symmetry (such as $\mathbf P$) behave additively.

Since $\Pi$ is supposed to represent a reflection in $\mathbb R^3$, it must act on the position operator as

$$\Pi^{-1}\,\mathbf X\,\Pi = \mathcal P\mathbf X = -\mathbf X\,, \qquad (4.99)$$

or equivalently

$$\{\Pi, \mathbf X\} = \Pi\,\mathbf X + \mathbf X\,\Pi = 0\,, \qquad (4.100)$$

where the bracket $\{A, B\}$ is sometimes called the anticommutator of $A$ and $B$. Translating through $\mathbf a$ and then applying the parity operator $\mathcal P: \mathbb R^3 \to \mathbb R^3$ is the same as first applying

⁴³If you're taking the Classical Dynamics or Integrable Systems courses, you'll meet the corresponding statement in classical mechanics in the guise of Nöther's theorem.

$\mathcal P$, then translating through $-\mathbf a$, so

$$\Pi^{-1}\,U(\mathbf a)\,\Pi = U(-\mathbf a)\,, \qquad (4.101)$$

showing that

$$\Pi^{-1}\,\mathbf P\,\Pi = -\mathbf P\,, \qquad (4.102)$$

so the translation generators $\mathbf P$ also anticommute with the parity operator. More generally, if $\mathbf V$ is any operator that transforms as a vector under rotations, in the sense of (4.57), we say $\mathbf V$ is a vector operator if also

$$\Pi^{-1}\,\mathbf V\,\Pi = -\mathbf V\,, \qquad (4.103)$$

so that $\{\Pi, \mathbf V\} = 0$. If instead

$$\Pi^{-1}\,\mathbf V\,\Pi = +\mathbf V\,, \qquad (4.104)$$

or equivalently $[\Pi, \mathbf V] = 0$, then $\mathbf V$ is a pseudovector operator (provided it transforms appropriately under rotations). The most prominent example of a pseudovector operator is the rotation generator $\mathbf J$: since the parity transformation $\mathcal P$ acts on $\mathbb R^3$ as $-I_{3\times 3}$, rotations obey $\mathcal P R(\boldsymbol\alpha)\mathcal P = R(\boldsymbol\alpha)$, or $\mathcal P^{-1}R(\boldsymbol\alpha)\mathcal P = R(\boldsymbol\alpha)$. This gives

$$\Pi^{-1}\,U(\boldsymbol\alpha)\,\Pi = U(\boldsymbol\alpha) \qquad (4.105)$$

for the parity and rotation operators in Hilbert space, or $\Pi^{-1}\mathbf J\,\Pi = \mathbf J$ for the rotation generator. Likewise, the orbital angular momentum operator obeys

$$\Pi^{-1}\,\mathbf L\,\Pi = \Pi^{-1}(\mathbf X\times\mathbf P)\Pi = (\Pi^{-1}\mathbf X\,\Pi)\times(\Pi^{-1}\mathbf P\,\Pi) = (-\mathbf X)\times(-\mathbf P) = +\mathbf L\,, \qquad (4.106)$$

so is a pseudovector, as in classical mechanics. It follows that the spin operator $\mathbf S = \mathbf J - \mathbf L$ is also a pseudovector.

Similarly, operators $S$ that are invariant under rotations are scalar operators if in addition $\Pi^{-1}S\,\Pi = S$, so that they are unchanged under parity, and are pseudoscalar operators if instead $\Pi^{-1}S\,\Pi = -S$. (Notice that the signs for scalars and pseudoscalars are the opposite way round to those for vectors and pseudovectors.)

As usual, the parity of a system will be conserved if $[H, \Pi] = 0$. As a simple example, if our system is governed by the dynamical relation $H = \mathbf P^2/2m + V(\mathbf X)$ where the potential is an even function, then

$$\Pi^{-1}H\,\Pi = \frac{1}{2m}(\Pi^{-1}\mathbf P\,\Pi)\cdot(\Pi^{-1}\mathbf P\,\Pi) + \Pi^{-1}V(\mathbf X)\Pi = \frac{\mathbf P^2}{2m} + V(\Pi^{-1}\mathbf X\,\Pi) = \frac{\mathbf P^2}{2m} + V(\mathbf X) = H\,, \qquad (4.107)$$

so parity is conserved for such Hamiltonians.

Now let's understand how $\Pi$ acts on a generic state $|\psi\rangle \in \mathcal H$, starting with the case where $\mathcal H \cong L^2(\mathbb R^3, d^3x)$, so that all the information about our system is contained in e.g.

its position space wavefunction $\langle\mathbf x|\psi\rangle$. Note that if $|\mathbf x\rangle$ is an eigenstate of the position operator, then from (4.103) the parity-reversed state $|\mathbf x'\rangle = \Pi|\mathbf x\rangle$ obeys

$$\mathbf X|\mathbf x'\rangle = \mathbf X\,\Pi|\mathbf x\rangle = -\Pi\,\mathbf X|\mathbf x\rangle = -\mathbf x\,\Pi|\mathbf x\rangle = -\mathbf x|\mathbf x'\rangle \qquad (4.108)$$

and hence $\Pi|\mathbf x\rangle = c|-\mathbf x\rangle$ for some constant $c$. Since $\Pi^2 = \mathrm{id}$, at most $c$ can be a sign⁴⁴. None of the properties of the $\Pi$ operator above fix this sign⁴⁵, and we fix the ambiguity by declaring $\Pi$ to be the operator such that

$$\Pi|\mathbf x\rangle = +|-\mathbf x\rangle\,. \qquad (4.109)$$

More generally, given any state $|\psi\rangle \in L^2(\mathbb R^3, d^3x)$, the wavefunction of the parity transformed state $\Pi|\psi\rangle$ is

$$\langle\mathbf x|\Pi|\psi\rangle = \langle-\mathbf x|\psi\rangle = \psi(-\mathbf x)\,, \qquad (4.110)$$

since $\Pi^\dagger = \Pi^{-1} = \Pi$. Thus, the wavefunction of the new state takes the same value at $\mathbf x$ as the original one did at $-\mathbf x$. For example, if our particle is completely described by the state $|n, \ell, m\rangle$ with position space wavefunction $\langle\mathbf x|n, \ell, m\rangle = R_n(|\mathbf x|)\,Y_\ell^m(\hat{\mathbf x})$ (where the radial part $R_n(r)$ of the wavefunction determines the energy level $n$), applying the parity operator gives

$$\langle\mathbf x|\Pi|n, \ell, m\rangle = \langle-\mathbf x|n, \ell, m\rangle = R_n(|\mathbf x|)\,Y_\ell^m(-\hat{\mathbf x}) = (-1)^\ell\,\langle\mathbf x|n, \ell, m\rangle\,, \qquad (4.111)$$

where we recall from IB QM (or Methods) that the spherical harmonics $Y_\ell^m(\hat{\mathbf x})$ obey $Y_\ell^m(-\hat{\mathbf x}) = (-1)^\ell\,Y_\ell^m(\hat{\mathbf x})$.

When we have a complicated object, such as a macroscopic system with some internal structure, the Hilbert space is not simply $L^2(\mathbb R^3, d^3x)$ and parity transformations can act in a slightly more complicated way. This is clear classically: the parity transformation of a book initially located at $\mathbf x$ is not simply a book located at $-\mathbf x$, but rather a mirror image (simultaneously left-right, up-down and front-back) of the book. As with the distinction between $\mathbf J$ and $\mathbf L$, it's tempting to try to account for this by declaring that $\mathbf X$ just refers to the location of the macroscopic object's centre of mass, and define a set of parity operators $\Pi_a$, one for each constituent particle, each acting as $\Pi_a^{-1}\mathbf X_a\Pi_a = -\mathbf X_a$ so as to reflect that one particle. Just as with spin and rotations, the problem with this (aside from inefficiency) is that it's dangerous to assume we know what the correct choice of Hilbert space actually is for a subatomic particle: does the electron have an internal structure?

Whatever the choice of $\mathcal H$ on which the full $\Pi$ acts, its eigenstates must obey $\Pi|\psi\rangle = \eta|\psi\rangle$ with either $\eta = +1$ or $\eta = -1$. These signs can certainly arise from the behaviour of the position space wavefunction as an even/odd function, but there can also be 'intrinsic' parities that tell us something about the internal structure of the system. In the subatomic world, such intrinsic parities depend just on the 'species' of particle our state describes (i.e. whether it's an electron, a proton, a photon or something else).
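The spherical harmonic parity property used in (4.111) is easy to verify numerically; the sketch below (mine) uses scipy's legacy sph_harm, whose arguments are (order $m$, degree $\ell$, azimuthal angle, polar angle), and applies parity as (polar, azimuthal) $\to$ ($\pi -$ polar, azimuthal $+ \pi$):

```python
# Numerical check that Y_l^m(-x) = (-1)^l Y_l^m(x).
import numpy as np
from scipy.special import sph_harm

az, pol = 1.1, 0.6                 # an arbitrary direction on the sphere
for l in range(4):
    for m in range(-l, l + 1):
        lhs = sph_harm(m, l, az + np.pi, np.pi - pol)   # Y_l^m at -x
        rhs = (-1)**l * sph_harm(m, l, az, pol)
        assert np.allclose(lhs, rhs)
print("Y_l^m(-x) = (-1)^l Y_l^m(x) verified for l = 0..3")
```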

⁴⁴Note that it is important here that $\Pi|\mathbf x\rangle = c|-\mathbf x\rangle$ for all values of $\mathbf x$. See e.g. Weinberg, The Quantum Theory of Fields vol. 1, sec. 2.6 for a fuller discussion.
⁴⁵The operator $\Pi' = -\Pi$ obeys $(\Pi')^2 = +\mathrm{id}$ and $\{\Pi', \mathbf X\} = 0$, so is just as good a choice for the parity operator as $\Pi$ itself.

Figure 6: Most atomic transitions are due to the absorption or emission of a photon from a single electron in the atom. The photon's angular frequency $\omega = \Delta E/\hbar$.

The intrinsic parity of a specific type of particle needs to be determined experimentally. For example, consider transitions between different energy levels of an atom. These are usually mediated by electromagnetic interactions: during such a transition, the atom's electrons either emit or absorb a photon whose energy $\hbar\omega$ is equal to the difference in energy $\Delta E$ between the two levels involved in the transition. As we'll study in more detail later, when the corresponding wavelength is much larger than the typical size of the atom, the transition rate $\Gamma(i \to f)$ is given by

$$\Gamma(i \to f) = \frac{4(\Delta E)^3}{c^3\hbar^4}\,|\langle f|\mathbf D|i\rangle|^2\,, \qquad (4.112)$$

where $|i\rangle$ and $|f\rangle$ are the initial and final states of the atom, and $\mathbf D = \sum_a e_a\mathbf X_a$ is the operator corresponding to the atom's electric dipole moment. (The sum runs over all the electrons in the atom.) Reversing the parity of space affects all the electrons equally, so the parity operator obeys $\Pi^{-1}\mathbf X_a\Pi = -\mathbf X_a$ for each $\mathbf X_a$. Thus $\Pi^{-1}\mathbf D\,\Pi = -\mathbf D$.

Now suppose that the initial and final states of the atom are eigenstates of $\Pi$, with eigenvalues $\eta_i$ and $\eta_f$, respectively. Then

$$\langle f|\mathbf D|i\rangle = \eta_i\eta_f\,\langle f|\Pi^{-1}\mathbf D\,\Pi|i\rangle = -\eta_i\eta_f\,\langle f|\mathbf D|i\rangle\,, \qquad (4.113)$$

so the amplitude for the transition will vanish unless the initial and final atomic states have opposite parity, $\eta_i\eta_f = -1$. If our initial and final states kept track of the photon as well as the atom, we'd be able to say overall parity is conserved in such transitions if we assign an intrinsic parity $-1$ to the photon that is emitted or absorbed in the transition. The intrinsic parity of the photon is related to the fact that the electromagnetic vector potential $\mathbf A$, whose quantum version provides the description of the photon, indeed transforms as a vector, not a pseudovector. (Note that we cannot use this experiment to determine the intrinsic parity of the electron, since we have the same number of electrons in the initial and final state.)

In the most common case, just a single electron is involved in the transition, and in this case (4.111) shows that $\eta_i\eta_f = \eta_e^2\,(-1)^{\ell_i+\ell_f}$, where $\ell_{i,f}$ are the total orbital angular

momentum quantum numbers of the initial and final states of this electron, and $\eta_e$ is the intrinsic parity of the electron. Since necessarily $\eta_e^2 = 1$, in any radiative atomic transition involving just a single electron, the orbital angular momentum quantum number $\ell$ must change by an odd number. We'll explore intrinsic parity further in section 7.2.1 and in the second problem set.

In fact, $[H, \Pi] = 0$ holds in general for particles travelling in the presence of any type of electromagnetic interaction. Having $[H, \Pi]$ be zero means that if at time $t = 0$ you set up a system to be a mirror image of another system, then their subsequent evolution will be identical, in the sense that if you observe them at a later time, they will still be mirror images of one another. Hence, when $[H, \Pi] = 0$ it is impossible to tell whether a system is being observed directly or through a mirror. One of the major surprises of twentieth-century physics was an experiment by Wu et al. in 1957, which showed that in fact the Hamiltonian of our Universe has $[H, \Pi] \neq 0$: you can see things in a mirror that are impossible in our World!
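The parity selection rule can also be seen by brute-force integration of dipole matrix elements between spherical harmonics. The following sketch (my addition; it anticipates the selection rules discussed later in the notes) shows that $\langle \ell', m|\cos\theta|\ell, m\rangle$ vanishes unless $\ell' - \ell$ is odd (in fact, unless $\ell' = \ell \pm 1$):

```python
# Dipole-type matrix elements <l1, m| cos(theta) |l2, m> by quadrature;
# non-zero only when l1 and l2 have opposite parity.
import numpy as np
from scipy.special import sph_harm
from scipy.integrate import dblquad

def dipole_z(l1, l2, m=0):
    def integrand(phi, theta):          # phi azimuthal, theta polar
        Y1 = sph_harm(m, l1, phi, theta)
        Y2 = sph_harm(m, l2, phi, theta)
        return np.real(np.conj(Y1)*np.cos(theta)*Y2*np.sin(theta))
    val, _ = dblquad(integrand, 0, np.pi, 0, 2*np.pi)
    return val

for l2 in range(4):
    print(f"<1,0|cos(theta)|{l2},0> = {dipole_z(1, l2):+.4f}")
# only l2 = 0 and l2 = 2 give non-zero answers
```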

5 Angular Momentum

In the previous chapter we obtained the fundamental commutation relations among the position, momentum and angular momentum operators, together with an understanding of how a dynamical relation $H = H(\mathbf X, \mathbf P)$ allows us to understand how such quantities evolve in time. However, with the exception of the parity operator, we've not yet said anything about the spectrum (the set of possible eigenvalues) of these operators, nor have we been specific about the precise Hilbert space on which they act. Answering such questions is an important part of the subject of representation theory in mathematics: the study of how group actions can be realised on a given vector space. In physics, understanding this will provide essential information about the nature of our quantum system.

To a large extent, both questions are determined by the operator algebra itself. For example, taking the trace of the commutation relations $[X_i, P_j] = i\hbar\,\delta_{ij}\,1_{\mathcal H}$ in any finite-dimensional Hilbert space gives

$$i\hbar\,\dim(\mathcal H)\,\delta_{ij} = \mathrm{tr}_{\mathcal H}(X_iP_j - P_jX_i) = 0 \qquad (5.1)$$

for all $i, j$ by the cyclic property of the trace. So there is no realisation of the position and translation operators on any non-trivial finite-dimensional Hilbert space (and if $\dim(\mathcal H) = 0$ then $\mathbf X$ and $\mathbf P$ necessarily act trivially). The above argument fails in an infinite-dimensional Hilbert space, where neither $\mathrm{tr}_{\mathcal H}(1_{\mathcal H})$ nor $\mathrm{tr}_{\mathcal H}(X_iP_j)$ is defined. Thus, if we wish to discuss the position and momentum of a quantum system, we're necessarily in the world of function spaces⁴⁶. On the other hand, taking the trace of the commutation relations

$$[J_i, J_j] = i\hbar\sum_k \epsilon_{ijk}J_k \qquad (5.2)$$

for the rotation generators just gives $i\hbar\,\epsilon_{ijk}\,\mathrm{tr}_{\mathcal H}(J_k) = \mathrm{tr}_{\mathcal H}(J_iJ_j - J_jJ_i) = 0$, so it is possible to represent each component of $\mathbf J$ on a finite-dimensional Hilbert space in terms of a traceless matrix⁴⁷. This is often a useful thing to do, particularly when discussing the internal structure of a system.

In this chapter, we'll consider finite-dimensional Hilbert spaces $\mathcal H_j$ whose elements transform among themselves under rotations. In mathematics, the $\mathcal H_j$ are known as irreducible representations of the rotation group, whilst in physics they're called multiplets of definite total angular momentum. This chapter will also substantiate the link between the rotation generators $\mathbf J$ and the physics of angular momentum, by examining a simple Hamiltonian for a rotating diatomic molecule. We'll also flesh out the details of exactly how the orientation of a system is encoded in the amplitudes for it to be found in different eigenstates of appropriate angular momentum operators.

⁴⁶Even here we should really be more careful. As mentioned in a previous footnote, $\mathbf X$ and $\mathbf P$ are not defined as linear operators on the whole of a Hilbert space such as $L^2(\mathbb R^3, d^3x)$, because a state that is initially square-integrable may not remain so after the application of the unbounded operator $\mathbf X$ or $\mathbf P$.
⁴⁷This traceless condition in $\mathcal H$ is closely related to, but distinct from, the fact that the generators of $SO(3)$ can also be represented by antisymmetric (hence traceless) matrices acting on $\mathbb R^3$.

5.1 Angular Momentum Eigenstates (A Little Representation Theory)

Let's begin by seeing how the algebra $[J_i, J_j] = i\hbar\sum_k \epsilon_{ijk}J_k$ determines the spectrum of the angular momentum operators. Since no two components of $\mathbf J$ commute, we cannot find a complete set of simultaneous eigenstates of two components of $\mathbf J$. However, since $[\mathbf J, \mathbf J^2] = 0$ we can find a complete set of simultaneous eigenstates of (any) one component of $\mathbf J$ and $\mathbf J^2$. Without loss of generality, we orient our coordinate system so that the chosen component of $\mathbf J$ is $J_z$. We let $|\beta, m\rangle$ denote a simultaneous eigenstate of $J_z$ and $\mathbf J^2$, where

$$\mathbf J^2|\beta, m\rangle = \beta\hbar^2|\beta, m\rangle \qquad\text{and}\qquad J_z|\beta, m\rangle = m\hbar|\beta, m\rangle\,, \qquad (5.3)$$

and the factors of $\hbar$ are for later convenience. We also choose all the states $\{|\beta, m\rangle\}$ to be properly normalised. Since they are eigenstates of Hermitian operators, they must then be orthonormal:

$$\langle\beta', m'|\beta, m\rangle = \delta_{\beta\beta'}\,\delta_{mm'}\,. \qquad (5.4)$$

Finally, since we're looking for the simplest possible way to realise our commutation relations, we'll assume that the states $\{|\beta, m\rangle\}$ are non-degenerate; that is, we want a Hilbert space where $|\beta, m\rangle$ is the unique eigenstate of $\mathbf J^2$ and $J_z$ with the given eigenvalues. I'll comment more on this assumption in the next chapter.

We now define

$$J_\pm = J_x \pm iJ_y\,, \qquad (5.5)$$

which obey $J_\pm^\dagger = J_\mp$ and so are each other's adjoints. As they are built from linear combinations of the components of $\mathbf J$, clearly $J_\pm$ each commute with $\mathbf J^2$, while their commutation relations with $J_z$ are

$$[J_z, J_\pm] = [J_z, J_x] \pm i[J_z, J_y] = i\hbar(J_y \mp iJ_x) = \pm\hbar\,J_\pm\,. \qquad (5.6)$$

We learn that

$$\mathbf J^2\,(J_\pm|\beta, m\rangle) = J_\pm\,\mathbf J^2|\beta, m\rangle = \beta\hbar^2\,J_\pm|\beta, m\rangle \qquad (5.7)$$

and

$$J_z\,(J_\pm|\beta, m\rangle) = ([J_z, J_\pm] + J_\pm J_z)|\beta, m\rangle = (m \pm 1)\hbar\,(J_\pm|\beta, m\rangle)\,. \qquad (5.8)$$

This shows that the new states $J_\pm|\beta, m\rangle$, if not zero, are still eigenstates of both $\mathbf J^2$ and $J_z$, with the same eigenvalue $\beta\hbar^2$ for $\mathbf J^2$. However, their $J_z$ eigenvalue is shifted up or down (respectively for $J_+|\beta, m\rangle$ and $J_-|\beta, m\rangle$) by one unit of $\hbar$. We see that the role of $J_\pm$ in angular momentum is similar to the role of $A$ and $A^\dagger$ for the energy of the harmonic oscillator. Again, $J_+$ and $J_-$ are often called raising and lowering operators. Their role is to rotate our system, aligning more or less of its total angular momentum along the $z$-axis without changing the total angular momentum available. (See figure 7.)

Just as for the harmonic oscillator, examining the algebra of our raising and lowering operators has told us the separation between angular momentum eigenstates, given a starting point $|\beta, m\rangle$. To fix our initial states, we must examine the norm. By assumption, $|\beta, m\rangle$ itself is correctly normalised and we compute

$$\|J_+|\beta, m\rangle\|^2 = \langle\beta, m|J_-J_+|\beta, m\rangle = \langle\beta, m|(J_x - iJ_y)(J_x + iJ_y)|\beta, m\rangle = \langle\beta, m|(\mathbf J^2 - J_z^2 - \hbar J_z)|\beta, m\rangle = \hbar^2\big(\beta - m(m+1)\big)\,, \qquad (5.9)$$

Figure 7: The angular momentum raising and lowering operators $J_\pm$ realign the system's angular momentum to place more or less of it along the $z$-axis.

where we've used the fact that $J_+^\dagger = J_-$. Similarly, we find $\|J_-|\beta, m\rangle\|^2 = \hbar^2(\beta - m(m-1))$. But since this is a norm, whatever state $|\beta, m\rangle$ we started with, we must have

$$\|J_+|\beta, m\rangle\|^2 = \hbar^2\big(\beta - m(m+1)\big) \geq 0\,, \qquad (5.10)$$

with equality iff $J_+|\beta, m\rangle = 0$ is the trivial state. This shows that it cannot be possible to always keep applying $J_+$, repeatedly raising the $J_z$ eigenvalue $m$ whilst leaving $\beta$ unchanged. There must be some maximum value of $m$, let's call it $j$, for which $J_+|\beta, j\rangle = 0$. By (5.10) this can only be the case if $\beta$ obeys

$$\beta = j(j+1)\,, \qquad (5.11)$$

and so is fixed in terms of the maximum allowed value of $m$. Similarly, for a generic $m$ value we find

$$\|J_-|\beta, m\rangle\|^2 = \hbar^2\big(\beta - m(m-1)\big)\,, \qquad (5.12)$$

so it also cannot be possible to keep lowering the $J_z$ eigenvalue whilst remaining in the Hilbert space. There must be some minimum value of $m$, say $j'$, for which $J_-|\beta, j'\rangle = 0$, and by (5.12) this can only be the case if

$$\beta = j'(j'-1)\,. \qquad (5.13)$$

Applying $J_+$ or $J_-$ doesn't change the $\mathbf J^2$ eigenvalue $\beta$, so these two values of $\beta$ must agree. Comparing them, we obtain a quadratic equation for $j'$ that we solve in terms of $j$, finding $j' = j+1$ or $j' = -j$. Since the minimum value of $m$ can't be greater than the maximum value, we must choose the root $j' = -j$. The $\mathbf J^2$ eigenvalue $\beta\hbar^2 = j(j+1)\hbar^2$ is determined by $j$, so we henceforth label our states as $|j, m\rangle$; this labelling is simply less cluttered than $|j(j+1), m\rangle$ would be.

Finally, we note that since applying $J_-$ repeatedly will take us from the highest state $|\beta, j\rangle$ through $|\beta, j-1\rangle, \ldots$ down to $|\beta, -j\rangle$, it must be that $2j$ is a non-negative integer. In other words, the possible values for $j$ are

$$j \in \{0,\ 1/2,\ 1,\ 3/2,\ \ldots\}\,. \qquad (5.14)$$

Once the total angular momentum quantum number $j$ is fixed, we have

$$m \in \{-j,\ -j+1,\ \ldots,\ j-1,\ j\}\,. \qquad (5.15)$$

Thus there's a total of $2j+1$ states in any given $j$ multiplet, and $\mathcal H_j$ is the Hilbert space of dimension $2j+1$ spanned by the $\{|j, m\rangle\}$ of fixed $j$. We can move between states in $\mathcal H_j$ using the raising and lowering operators $J_+$ and $J_-$, which obey

$$J_+|j, m\rangle = \hbar\sqrt{j(j+1) - m(m+1)}\;|j, m+1\rangle\,, \qquad J_-|j, m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\;|j, m-1\rangle\,. \qquad (5.16)$$

These operators only change the $J_z$ eigenvalue, not the $\mathbf J^2$ one. They just realign a given system, placing more ($J_+$) or less ($J_-$) of its angular momentum along the $z$-axis. Although we know we can choose multiplets with different values of $j$, we've not yet discovered a mathematical operator that alters this $\mathbf J^2$ eigenvalue; this will be done in section 6.2.1.
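The matrix elements (5.16) let us build explicit $(2j+1)$-dimensional matrices for any multiplet. Here is a sketch (my own) that constructs $J_z$ and $J_\pm$ and verifies two of the algebra relations:

```python
# Construct the (2j+1)-dimensional matrices of Jz and J+/- from (5.16),
# then verify [Jz, J+] = hbar*J+ and [J+, J-] = 2*hbar*Jz.
import numpy as np

def angular_momentum_matrices(j, hbar=1.0):
    m = np.arange(j, -j - 1, -1)              # basis ordering m = j, ..., -j
    Jz = hbar*np.diag(m)
    # <j, m+1| J+ |j, m> = hbar*sqrt(j(j+1) - m(m+1)) on the superdiagonal
    off = hbar*np.sqrt(j*(j + 1) - m[1:]*(m[1:] + 1))
    Jp = np.diag(off, k=1)
    return Jz, Jp, Jp.conj().T                # Jz, J+, J- = (J+)^dagger

Jz, Jp, Jm = angular_momentum_matrices(j=3/2)
print(np.allclose(Jz @ Jp - Jp @ Jz, Jp))     # [Jz, J+] = hbar*J+
print(np.allclose(Jp @ Jm - Jm @ Jp, 2*Jz))   # [J+, J-] = 2*hbar*Jz
```

The same construction works for any $j \in \{0, 1/2, 1, \ldots\}$; for $j = 1/2$ it reproduces the Pauli matrices up to the factor $\hbar/2$.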

5.2 Rotations and Orientation

Eigenfunctions of angular momentum encode information about an object's orientation: a classical object has a well-defined orientation because it is in a superposition of very many nearby angular momentum eigenstates, involving many different values of $j$ as well as of $m$.

Note that, since $J_x$ and $J_y$ can be written in terms of the raising and lowering operators $J_\pm$, rotations around an arbitrary axis transform states in a given $\mathcal H_j$ into other states in the same $\mathcal H_j$. Since $J_x = (J_+ + J_-)/2$, we see that when $j > 0$, the state $|j, m\rangle$ is never an eigenstate of $J_x$. Hence, for $j > 0$, we can never be certain of the outcome of measurements of both $J_x$ and $J_z$. This argument applies equally to $J_y$, so it's impossible to be certain of the outcome of measurements of more than one component of $\mathbf J$. If $j = 0$, every component of $\mathbf J$ yields zero, but a null vector in $\mathbb R^3$ has no direction. Consequently, there are no states in which the vector $\mathbf J$ has a well-defined direction. This situation contrasts with the momentum vector $\mathbf P$, which can have a well-defined direction since all its components commute with one another.

In the mathematical literature, states with $m = j$ are known as highest weight states. They play a key role in the representation theory of any group $G$, because once we know the highest weight state, the rest of the multiplet can be constructed by applying lowering operators such as $J_-$. Physically, a highest weight state is one in which the body's angular momentum is most nearly aligned along the $z$-axis. In this state, the ratio of the squared angular momentum that lies in the $xy$-plane to that parallel to the $z$-axis is

$$\frac{\langle j, j|J_x^2 + J_y^2|j, j\rangle}{\langle j, j|J_z^2|j, j\rangle} = \frac{\langle j, j|\mathbf J^2 - J_z^2|j, j\rangle}{\hbar^2 j^2} = \frac{1}{j}\,. \qquad (5.17)$$

As $J_x = (J_+ + J_-)/2$ we have $\langle j, j|J_x|j, j\rangle = 0$ and similarly $\langle j, j|J_y|j, j\rangle = 0$. Thus we have no information about the direction in the $xy$-plane along which this planar angular momentum points.

Macroscopic bodies, for which $j \gg 1$, have only a tiny fraction of their total angular momentum in the $xy$-plane when they're in the state $|j, j\rangle$, so the uncertain direction of this component of angular momentum does not lead to significant uncertainty in the total direction of the angular momentum vector. By contrast, when $j = 1/2$, even in the state $|1/2, 1/2\rangle$ there's twice as much angular momentum associated with the $xy$-plane as with the $z$-axis, and the uncertain direction of this planar component makes it impossible to say anything more specific than that the angular momentum vector lies somewhere in the northern, rather than southern, hemisphere. Even when $j = 1$ and $m = 1$, the state $|1, 1\rangle$ has as much angular momentum along the $xy$-plane as it has parallel to the $z$-axis, so the direction of the angular momentum vector is still very uncertain. This is no surprise: since $\dim(\mathcal H_j) = 2j+1$, when $j = \frac12$ there are only two independent states the orientation of our body can take. With such a limited Hilbert space it's no wonder we only have a fuzzy notion of where the body's angular momentum lies.
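The ratio (5.17) makes the approach to the classical limit quantitative; a one-line check (my addition):

```python
# <Jx^2 + Jy^2>/<Jz^2> = (j(j+1) - j^2)/j^2 = 1/j in the state |j, j>:
# the planar component of angular momentum becomes negligible as j grows.
for j in (0.5, 1, 2, 10, 1e6):
    print(j, (j*(j + 1) - j**2)/j**2)   # equals 1/j
```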

5.2.1 Rotation of Diatomic Molecules

Our deduction of the possible eigenvalues and eigenstates of $\mathbf J^2$ and $J_z$ came from considering rotations: the states $|j, m\rangle$ simply enable us to describe what happens when an object is rotated around some axis, as we'll understand in more detail and with many examples later in the chapter. In particular, it's important to understand that a priori these states have nothing to do with the energy levels of any given particle. As always, there's no way to tell how rotating an object may or may not change its energy until we specify a form for the Hamiltonian. In this section, we'll choose a simple form of dynamical relation $H = H(\mathbf J)$ that will enable us to understand an important part of the dynamics of a diatomic molecule.

For some purposes, a diatomic molecule such as CO can be considered to consist of two point masses, the nuclei of the oxygen and carbon atoms, joined by a 'light rod' provided by the electrons. Following the analogous formula in classical mechanics, we model the dynamics of this molecule by the Hamiltonian

$$H = \frac{1}{2}\left(\frac{J_x^2}{I_x} + \frac{J_y^2}{I_y} + \frac{J_z^2}{I_z}\right)\,, \qquad (5.18)$$

where $I_x$ is the moment of inertia about the $x$-axis, and similarly for $I_y$ and $I_z$. Though motivated by classical mechanics, the best justification for this form of Hamiltonian is ultimately that it agrees with detailed experiments.

In our axisymmetric case, we choose coordinates whose origin is at the centre of mass of the molecule (somewhere between the C and O atoms) and orient our coordinates so that the $z$-axis lies along the molecule's axis of symmetry. Then $I_x = I_y = I$, say, whilst $I_z$ is different. In fact, since the centres of mass of the C and O atoms lie along the $z$-axis, the molecule's moment of inertia around this axis is negligible, $I_z \ll I$ (see figure 8). We can thus rewrite our Hamiltonian as

$$H = \frac{1}{2}\left(\frac{\mathbf J^2}{I} + \left(\frac{1}{I_z} - \frac{1}{I}\right)J_z^2\right)\,. \qquad (5.19)$$



Figure 8: A simple model of carbon monoxide. By symmetry, $I_x = I_y$ whilst $I_z$ is much smaller.

The virtue of expressing $H$ in terms of $\mathbf J^2$ and $J_z$ is that our knowledge of the spectrum of these operators immediately allows us to write down the spectrum of this Hamiltonian: the state $|j, m\rangle$ is an energy eigenstate with

$$E_{jm} = \frac{\hbar^2}{2}\left(\frac{j(j+1)}{I} + \left(\frac{1}{I_z} - \frac{1}{I}\right)m^2\right)\,, \qquad (5.20)$$

with $|m| \leq j$. Since $I_z \ll I$ for our diatomic molecule, the coefficient of $m^2$ is very much greater than that of $j(j+1)$, so states with $m \neq 0$ will only occur very far above the ground state. Consequently, the low-lying states have energies of the form

$$E_j = \frac{\hbar^2}{2I}\,j(j+1) \qquad (5.21)$$

for some $j$. As we saw in the previous section, only discrete values of $j$ are allowed, so the energy levels are quantized. One can excite a CO molecule, causing it to rotate faster, only by supplying a definite amount of energy. Similarly, for the molecule to relax down to a state of lower angular momentum, it must emit a quantized lump of energy.

2 j~ Eγ = (Ej Ej−1) = . (5.22) ± − ± I Using the relation E = ~ω, the angular frequencies in the rotation spectrum of the molecule are j~ ωj = (5.23) I 12 In the case of CO, 2π~/I evaluates to a frequency ν 113.1724 GHz and spectral lines ∼ occur at multiples of this frequency. In the classical limit of large j, the molecule’s total

angular momentum $|\mathbf J| \approx j\hbar$. This is related to the angular frequency $\Omega$ at which the molecule rotates by $|\mathbf J| = I\Omega$. Comparing to (5.23), we see that in the classical limit $j \gg 1$ we have $\omega_j \simeq \Omega$, so the frequency of the emitted radiation is just the frequency at which the molecule rotates.

Measurements of the radiation at 113.1724 GHz provide one of the two most important probes of interstellar gas⁴⁸. In denser, cooler regions, hydrogen atoms combine to form H₂ molecules, which are bisymmetric and do not have an electric dipole moment when they rotate. Consequently, these molecules, which together with similarly uncommunicative helium make up the vast majority of cold interstellar gas, lack readily observable spectral lines. Astronomers are thus obliged to study the interstellar medium through the rotation spectrum of the few parts in 10⁶ of CO it contains.
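The rigid-rotor ladder is easy to play with numerically. In this sketch (my own; the fundamental line frequency is the value quoted in the text), the implied moment of inertia follows from $\nu_j = j\hbar/2\pi I$ and the higher lines sit at integer multiples:

```python
# Rigid-rotor spectrum of CO: infer I from the fundamental line, then
# list the first few rotational lines nu_j = j*hbar/(2*pi*I).
import numpy as np

hbar = 1.054571817e-34          # J s
nu1 = 113.1724e9                # Hz, j = 1 -> 0 line quoted in the text
I = hbar/(2*np.pi*nu1)          # implied moment of inertia of the molecule
print(f"I = {I:.3e} kg m^2")    # ~1.5e-46 kg m^2
for j in range(1, 5):
    print(f"j = {j} -> {j-1}:  nu = {j*nu1/1e9:.4f} GHz")
```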

5.3 Spin

In section 4.3.2 we saw that the orbital angular momentum operator $\mathbf L$ and spin operator $\mathbf S$ each obey an algebra identical to that of the rotation generators,

$$[L_i, L_j] = i\hbar\sum_k \epsilon_{ijk}L_k\,, \qquad [L_i, \mathbf L^2] = 0\,, \qquad [S_i, S_j] = i\hbar\sum_k \epsilon_{ijk}S_k\,, \qquad [S_i, \mathbf S^2] = 0\,, \qquad (5.24)$$

as well as $[S_i, L_j] = 0$. Since the algebra of the $J$s was all we used to deduce their spectra, it follows immediately that $(\mathbf L^2, L_z)$ and $(\mathbf S^2, S_z)$ have the same possible spectra as we found for $(\mathbf J^2, J_z)$. It's traditional to label eigenstates of $(\mathbf L^2, L_z)$ as $|\ell, m\rangle$ and those of $(\mathbf S^2, S_z)$ as $|s, \sigma\rangle$, where $\ell$ and $s$ correspond to the eigenvalues of $\mathbf L^2$ and $\mathbf S^2$, respectively, and $m$ and $\sigma$ label the eigenvalues of $L_z$ and $S_z$. (Unfortunately, it's traditional to label the eigenvalues of both $J_z$ and $L_z$ by the same letter $m$, even though they mean different things and may take different values in any given state. Which is meant is usually clear from the context.)

5.3.1 Large Rotations

Our claim that the possible values of $(j, m)$ are

$$j \in \{0,\ 1/2,\ 1,\ 3/2,\ \ldots\} \qquad\text{and then}\qquad m \in \{-j,\ -j+1,\ \ldots,\ j-1,\ j\} \qquad (5.25)$$

was based on examining the algebra $[J_i, J_j] = i\hbar\sum_k \epsilon_{ijk}J_k$, and likewise for $(\ell, m_\ell)$ or $(s, \sigma)$. In turn, this algebra originally came from considering the behaviour of objects under very small rotations. We now check whether these spectra are also compatible with large rotations.

Suppose we rotate the state $|j, m\rangle$ through an amount $\alpha$ around the $z$-axis. Then

$$|j, m\rangle \to U(\alpha\hat{\mathbf z})|j, m\rangle = e^{-i\alpha J_z/\hbar}|j, m\rangle = e^{-im\alpha}|j, m\rangle\,, \qquad (5.26)$$

since it is an eigenstate of $J_z$. Rotations through $2\pi$ around any axis return us to our starting point, so are equivalent to no rotation. Thus, for $U(\boldsymbol\alpha)$ to be a homomorphism

⁴⁸The other key probe is the hyperfine line of atomic hydrogen that will be discussed in chapter 8.

from $SO(3)$ to the group of unitary operators on $\mathcal H$, we must have $U(2\pi\hat{\boldsymbol\alpha}) = 1_{\mathcal H}$. In particular, when $\hat{\boldsymbol\alpha} = \hat{\mathbf k}$, from above we must have $e^{-2\pi im} = 1$. This is true if $m$ is an integer, but not if it is an odd half-integer. Since $m$ is an integer or (odd) half-integer iff $j$ is, we conclude that odd half-integer values of $j$ are in fact not compatible with the behaviour of objects under large rotations: they're ruled out by our basic requirement that $U(\boldsymbol\alpha)$ represents the action of $SO(3)$ on $\mathcal H$.

The fact that global properties of rotations rule out some of the eigenvalues allowed by the rotation algebra is an example of a phenomenon familiar from IB Methods. The behaviour of a function in the neighbourhood of a point may be governed by some differential equation. The associated differential operator (if it's linear) typically has a large spectrum, which is cut down by boundary conditions or periodicity conditions. For example, on any open set $U \subset \mathbb R$ the linear operator $-i\,d/dx$ has eigenfunctions $e^{ikx}$ for any $k \in \mathbb C$. However, if we know that globally $\psi: S^1 \to \mathbb C$, so is periodic, then $k$ must be quantized in units of the inverse radius of the circle. The only difference in our case is that the non-trivial algebra of the rotation generators already restricted the possible eigenvalues of $\mathbf J^2$ and $J_z$ to be quantized in units of $\hbar$. Global properties of the rotation group still remove some of the eigenvalues that were allowed locally. More succinctly, states where $j \in \mathbb N_0 + \frac12$ do form representations of the rotation algebra $\mathfrak{so}(3)$, but not of the rotation group $SO(3)$. In fact, we'll see that our arguments above have been rather too hasty.
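The failure for half-integer $j$ is visible directly in the matrices; a quick check (my addition) for $j = 1/2$:

```python
# A 2*pi rotation acts as -1 on a spin-1/2 state:
# exp(-2*pi*i*Jz/hbar) = -identity when j = 1/2.
import numpy as np
from scipy.linalg import expm

hbar = 1.0
Jz = hbar*np.diag([0.5, -0.5])
U = expm(-2j*np.pi*Jz/hbar)
print(np.allclose(U, -np.eye(2)))   # True: a 2*pi rotation is not the identity
```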

5.3.2 The Stern–Gerlach Experiment

Quantization of angular momentum in units of $\hbar$ was originally proposed by Bohr & Sommerfeld as a means by which the stability of atomic orbits could be understood; we've now seen how this quantization arises automatically in the full mathematical framework of quantum mechanics. However, back in 1922 it still seemed very mysterious, so Stern and Gerlach designed and conducted experiments to check whether angular momentum is really quantized in Nature.

In the Stern–Gerlach experiments, a beam of uncharged atoms of the same type is passed through a region of slowly varying magnetic field. If the atoms have mass $M$, the Hamiltonian for this process is⁴⁹

$$H = \frac{\mathbf P^2}{2M} - \boldsymbol\mu\cdot\mathbf B\,, \qquad (5.27)$$

where $\mathbf B$ is the applied magnetic field and $\boldsymbol\mu$ is known as the magnetic dipole moment of the atom. Ultimately, the reason the atom can be treated as a magnetic dipole is because of detailed properties of the distribution of its electrons. It would take us too far afield to explain this precisely, but the dipole moment arises because the atom has an orientation: a perfectly spherically symmetric atom cannot have any dipole moment, as there is no preferred direction for the 'north' or 'south' poles. Thus the dipole moment is

49The potential energy resulting from coupling a magnetic dipole to an applied magnetic field is similar to the (perhaps more familiar) energy −p · E of an electric dipole p in an applied electric field E. Note that the atoms carry no net electric charge, so there’s no Lorentz force.

Figure 9: A schematic picture of the Stern–Gerlach experiment, borrowed from the Astronomy Cast site.

only non-zero for atoms that transform non-trivially under $\mathbf S$. If the atom is in the spin $s$ representation, we can write $\boldsymbol\mu = (\mu/\hbar s)\,\mathbf S$. Choosing our coordinates so that the $z$-axis points along the direction of the magnetic field, the Hamiltonian becomes

$$H = \frac{\mathbf P^2}{2M} - \frac{\mu}{\hbar s}B\,S_z\,, \qquad (5.28)$$

where $B = |\mathbf B|$. The equations of motion (obtained, say, using the Heisenberg picture) tell us that the expectation values of position and momentum in any state $|\psi\rangle$ obey

$$\frac{d}{dt}\langle\mathbf X\rangle = \frac{\langle\mathbf P\rangle}{M} \qquad\text{and}\qquad \frac{d}{dt}\langle\mathbf P\rangle = \frac{\mu}{\hbar s}(\boldsymbol\nabla B)\,\langle S_z\rangle\,. \qquad (5.29)$$

The second of these equations is analogous to the classical $\mathbf F = -\boldsymbol\nabla V$.

We see that the force experienced by any given atom depends on its value of $S_z$. For a spin $s$ particle, the possible values are $\hbar\{-s, -s+1, \ldots, s-1, s\}$ and in particular can be of either sign. If an atom is in the state $|s, s\rangle$ with all its spin aligned along the direction of $\mathbf B$, then it will experience a force pushing it in the direction of increasing magnetic field. On the other hand, those atoms in the state $|s, -s\rangle$ will be pushed in the direction of decreasing $B$, while atoms with $\sigma = 0$ (when $s$ is an integer) are unaffected. In total, when an initial beam of atoms in which the spins are randomly aligned passes through the region of magnetic field, it will be split into $2s+1$ different trajectories. Thus, if the atoms are perfectly spherical, they will pass through unaffected, whereas if they have $s = 1$ the beam will be split into three, corresponding to the three possible eigenvalues $\{\hbar, 0, -\hbar\}$; if they have $s = 2$ the beam will split into five, and so on. Finally, if the atoms have very large spin (and so a very definite orientation in space) the beam will be split into so many paths that we can no longer distinguish the individual paths, seeing instead a broad smear. This is the same result we'd expect to find if angular momentum were not quantised, where the amount by which an atom is deflected would depend smoothly on its orientation with respect to $\mathbf B$.

Stern and Gerlach's original experiment in fact used silver atoms. Their magnetic field was controlled by a dial. Initially, as $B = 0$, the atoms all passed straight through. As they

turned up the magnetic field, the beam split into two separate paths. As $B$ was further increased, the separation between these two beams increased, but no further splitting was observed. Thus, not only is angular momentum quantized as Bohr & Sommerfeld had predicted, but silver atoms have spin $s = \frac12$!
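A toy sketch of the beam splitting (my own; the magnet strength, flight times and units are all made-up illustrative parameters, not data from the experiment) follows directly from (5.29): each $S_z$ eigenvalue $\sigma\hbar$ feels a constant transverse force inside the magnet, so an unpolarised spin-$s$ beam fans out into $2s+1$ spots:

```python
# Classical-trajectory toy model of Stern-Gerlach splitting.
import numpy as np

def deflections(s, mu_dBdz_over_M=1.0, t_magnet=1.0, t_drift=2.0):
    sigmas = np.arange(-s, s + 1)                    # Sz/hbar eigenvalues
    a = (mu_dBdz_over_M/s)*sigmas                    # acceleration per eigenvalue
    return 0.5*a*t_magnet**2 + a*t_magnet*t_drift    # transverse displacement

print(deflections(s=1/2))   # two spots, as Stern and Gerlach saw for silver
print(deflections(s=2))     # five spots
```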

5.3.3 Spinors and Projective Representations

This section lies beyond the Schedules and is non-examinable.

The Stern–Gerlach experiment shows that, despite our arguments, half-integer values of $s$ actually arise in Nature. In fact, the chemical properties of the elements and the structure of the periodic table, together with properties of materials such as metals, conductors and insulators, depend crucially on particles having half-integer values of spin. These values are in conflict with our current mathematical formalism, so we must have made a mistake, imposing too strong a condition that ruled out the possibility of half-integer spins.

The error lay in our claim that we needed to represent the action of a transformation group $G$ on Hilbert space $\mathcal H$, rather than just on projective Hilbert space $\mathbb P\mathcal H$. (Recall that states which differ only by an overall constant, which does not need to have modulus 1 provided we use the general form (2.69) of the Born rule, yield the same results in all experiments. Thus physical systems are represented by states in projective Hilbert space.) Projectively, it's enough to require

$$U(g_2)\circ U(g_1) = e^{i\phi(g_2, g_1)}\,U(g_2\cdot g_1)\,, \qquad (5.30)$$

rather than (4.7), where $\phi(g_2, g_1)$ is a real phase. This phase does not affect which ray in $\mathcal H$ the state lies in, so leaves the physics unchanged. The operator algebra is associative, so we must have

$$U(g_3)\circ\big(U(g_2)\circ U(g_1)\big) = \big(U(g_3)\circ U(g_2)\big)\circ U(g_1)\,, \qquad (5.31)$$

which implies the phases obey

$$\phi(g_2, g_1) + \phi(g_3, g_2\cdot g_1) = \phi(g_3, g_2) + \phi(g_3\cdot g_2, g_1)\,. \qquad (5.32)$$

Phase factors $e^{i\phi(g_2, g_1)}$ obeying this condition are known as cocycles on the group $G$. One possible solution of (5.32) is to take

$$\phi(g_2, g_1) = \beta(g_2\cdot g_1) - \beta(g_2) - \beta(g_1) \qquad (5.33)$$

for some arbitrary (smooth) function $\beta: G \to \mathbb R$. This solution is 'trivial' because, if $\phi(g_2, g_1)$ takes this form, then we can define a new unitary transformation operator $U'(g) = e^{i\beta(g)}U(g)$ which obeys our original condition (4.7). By agreeing to work with the new operator, the phases never arise. The interesting question is whether there are other, non-trivial solutions to (5.32).

A theorem in group cohomology⁵⁰ states that it's possible for non-trivial projective representations to arise for groups that are not simply connected⁵¹. In fact, this is the case for $SO(3)$. The topology of the rotation group can be seen by viewing each rotation as parametrised by a vector $\boldsymbol\alpha$. If we allow the axis of rotation $\hat{\boldsymbol\alpha}$ to point in any direction, then rotations with $|\boldsymbol\alpha| \in (0, \pi)$ are uniquely specified. However, when rotating through $\pi$ we get the same rotation whether rotating around $\hat{\boldsymbol\alpha}$ or $-\hat{\boldsymbol\alpha}$. Thus, we can picture the space of all rotations as a solid, three-dimensional ball $|\boldsymbol\alpha| \leq \pi$, but with antipodal points on the surface $|\boldsymbol\alpha| = \pi$ identified.

This description shows that $SO(3)$ contains smooth, closed paths, beginning and ending at the identity rotation (the origin of the 3-ball), that cannot be continuously shrunk to a point. For example, consider paths which start and end at the identity rotation, i.e. the centre of the ball. Figure ?? shows a loop which is contractible; it can obviously be shrunk to a point. On the other hand, figure ?? shows a loop for which the angle of rotation starts at zero, smoothly increasing to $\pi$ at the point $A$. At this point it reaches the boundary and reappears at the antipodal point $A'$, before continuing to increase to $2\pi$ back at the identity. It should be intuitively clear that this loop cannot be shrunk to a point whilst keeping both its ends fixed at the identity. Next, consider the loop in figure ??, along which the angle of rotation increases from $0$ to $4\pi$, reaching the boundary of the ball twice, once at $A$, reappearing at $A'$, and then again at $B$, reappearing at $B'$. By moving the second antipodal pair $B$ and $B'$ (as shown in figure ??), the section of the path between $A'$ and $B$ can be pulled across the boundary, smoothly deforming the loop back to the situation in figure ??. Thus, after two complete rotations the situation becomes simple once more. In general, the angle along any closed path must increase by an integer multiple of $2\pi$. The resulting loop will be contractible if this integer is even, but non-contractible if it is odd.

The topological properties of paths in $G$ become important when we consider the behaviour of $U(g)$ as the group element $g$ varies around a loop $L$, beginning and ending at some fixed $g_0$. It is quite possible for $U(g)$ to change smoothly with $g$ along $L$ in such a way that the operators at each end of the loop do not coincide: they may differ by a phase

$$U(g_0) \;\xrightarrow{\;L\;}\; \alpha_L\,U(g_0)$$
where the phase $\alpha_L$ depends on the loop. This is consistent with the projective homomorphism (5.30). To investigate the possibilities for $\alpha_L$, let's specialize to the case in which the operators act on a Hilbert space of finite dimension N, so that each U(g) can be regarded as an $N\times N$ matrix. We can use up some of the freedom in (5.30) by requiring that each such matrix has unit determinant. This does not fix things completely, but the residual freedom is discrete:
$$\det U(g) = +1 \ \text{ for all } g\in G \quad\Rightarrow\quad e^{iN\phi(g_1,g_2)} = 1\,.$$

50Unfortunately, I won't prove this here. If you're interested, you can find a proof (given in the context of quantum mechanics) in either Weinberg's The Quantum Theory of Fields, vol. 1 (chapter 2, appendix B), or else in Hall's Quantum Theory for Mathematicians.
51There's one other possibility: that the Lie algebra $\mathfrak g$ contains central elements (vectors $e \in \mathfrak g$ s.t. $[e, \mathfrak g] = 0$) that cannot be removed by a redefinition of the generators. This case is not relevant for us.

If G is simply connected, meaning that any closed loop is contractible, then the determinant condition implies $\alpha_L = 1$ for all loops L. This follows straightforwardly using continuity: $\alpha_L$ must be an $N^{\rm th}$ root of unity, but if it varies continuously as we vary our loop L, and if all such loops are contractible to the single point $g_0$, then we must have the same matrix at the beginning and end of each loop, so $\alpha_L = 1$.

If G is not simply connected, the same argument still shows that $\alpha_L = 1$ for any contractible L, but there may also be non-contractible loops with $\alpha_L \neq 1$. In this case, we can at least deduce that $\alpha_L = \alpha_{L'}$ whenever the loops L and L′ are in the same homotopy class, meaning that they can be smoothly deformed into one another. Furthermore, if loops L and L′ are traversed successively, then $U(g_0) \to \alpha_L U(g_0) \to \alpha_L\alpha_{L'} U(g_0)$, so there are self-consistency constraints.

We can now give a unique definition of U(g) for all g in a simply connected G. Recalling that $U(e) = {\rm id}_{\mathcal H}$, we define $U(g_0)$ by choosing any path from the identity e to $g_0$ and demanding that U(g) changes smoothly along this path. The values along the path are unique by the determinant and continuity conditions, but the end result $U(g_0)$ is also unique, because by traversing them in opposite directions, any two paths $e \to g_0$ can be combined to form a closed loop at $g_0$. This loop is contractible since our group is simply connected. With this definition, continuity also ensures that all cocycles are equal to one, so we have a genuine representation (not just a projective representation) of G.

Carrying out the same construction when G is not simply connected, we'll encounter paths from e to $g_0$ which cannot be smoothly deformed into one another. Thus, starting from $U(e) = {\rm id}_{\mathcal H}$, in general we obtain different values for $U(g_0)$ depending on the path we take. In this way we're forced to consider multi-valued functions on G (just as when defining a continuous square root in the complex plane). The ambiguity, or multi-valuedness, in $U(g_0)$ can be resolved only by keeping track of the path we used to reach $g_0$, just as for the complex square root we must keep track of how many times we've encircled the origin to be sure which branch we're on. Such a multi-valued definition inevitably means that non-trivial cocycles appear.

As we saw above, the rotation group G = SO(3) is not simply connected, but there are just two topological classes of loops, depending on whether the net angle of rotation is an even or odd multiple of $2\pi$. Any loop L in the first class is contractible and so has $\alpha_L = 1$. Any loop L′ in the second class is non-contractible, but if we traverse L′ twice it becomes contractible again. Thus $\alpha_{L'}^2 = 1$ and $\alpha_{L'} = \pm 1$. The finite-dimensional spaces $\mathcal H_j$ on which the rotation operators act are nothing but the multiplets of total angular momentum $j(j+1)\hbar^2$, with a basis $\{|j,m\rangle\}$ where
$$m \in \{-j,\, -j+1,\, \ldots,\, j-1,\, j\}\,.$$
Thus $\dim\mathcal H_j = 2j+1$. From the determinant condition $(\alpha_L)^{2j+1} = 1$ for any loop, but from our topological considerations $\alpha_L = \pm1$. If j is an integer then $2j+1$ is odd and it

follows that $\alpha_L = 1$ for all loops, contractible or not. However, if j is an odd half-integer then $2j+1$ is even and $\alpha_L = -1$ is possible for non-contractible loops. This is exactly the behaviour we found above: under the rotation operator $U(\alpha\hat{\mathbf z}) = e^{-i\alpha J_z/\hbar}$, a generic state in the j multiplet transforms as

$$|\psi\rangle = \sum_{m=-j}^{j} c_m\,|j,m\rangle \;\mapsto\; U(\alpha)|\psi\rangle = \sum_{m=-j}^{j} c_m\,e^{-i\alpha m}\,|j,m\rangle\,. \tag{5.34}$$

Any state with j (and hence all m) odd-half-integral changes sign under a rotation through an odd multiple of $2\pi$. The discussion of this section shows that the origin of this unexpected sign is the non-trivial topology of the rotation group – in our new terminology, we have a projective representation of SO(3). These projective representations play an important role in physics, and are often called spinors.

We have, finally, obtained the correct mathematical framework in which to describe rotations and angular momentum in quantum mechanics. The history of the subject developed rather differently to the exposition we've given here. By the 1920s, careful studies of atomic spectroscopy had revealed that many spectral lines were in fact doubled, composed of two lines, very close in frequency. In 1924 (even before Schrödinger published his famous equation) Pauli proposed that this doubling indicated that, in addition to the quantum numbers n, ℓ, m labelling their energy levels, electrons possessed a further quantum number that took just two values. A year later, Uhlenbeck & Goudsmit suggested that this could be associated to some form of internal angular momentum. Their idea was initially treated with suspicion. If one wished to suppose the electron was a small, rotating sphere, then for it to have the needed angular momentum $\hbar/2$ and yet keep its radius small enough that its finite size would not have been detected by experiment52, the surface of the sphere would need to be travelling faster than light. Nevertheless, the Stern–Gerlach experiment showed that particles could indeed have spin, contributing to the Hamiltonian in the presence of a magnetic field in just the same way as would a classical spinning magnetic dipole. Heisenberg, Jordan and C. G. Darwin53 then showed that the internal spin of the electron exactly accounted for the fine splitting of spectral lines that had puzzled Pauli, as we'll see in section 8.1.1.

We now understand that spin is an intrinsic property of fundamental particles: the Hilbert space of a fundamental particle is not simply $\mathcal H_{\rm spat} \cong L^2(\mathbb R^3, d^3x)$ describing its spatial wavefunction, but rather a tensor product $\mathcal H_{\rm spat}\otimes\mathcal H_s$. However we may excite, crash into, or generally interfere with the motion of an electron or W boson, for as long as they remain electrons and W bosons, their spin will always be $s = \frac12$ and $s = 1$, respectively.

52To date, no experiment has ever detected a finite size, or any other non-trivial internal spatial structure, in the electron.
53Charles Galton Darwin, grandson of Charles Robert Darwin.

5.3.4 Spin Matrices

Since $\sigma \in \{-s, -s+1, \ldots, s-1, s\}$, states of definite total spin s can be described by a finite dimensional Hilbert space $\mathcal H_s \cong \mathbb C^{2s+1}$. As always, once we pick a basis on $\mathcal H$ we can

describe the action of linear operators such as $\mathbf S$ explicitly in terms of matrices. Let's now carry this out for the first few spin representations, working in the basis $\{|s,\sigma\rangle\}$ of eigenstates of $S_z$.

The simplest case is s = 0, for which the only possible value of σ is also zero. This state obeys $e^{-i\boldsymbol\alpha\cdot\mathbf S/\hbar}|0,0\rangle = |0,0\rangle$ for any $\boldsymbol\alpha$. Hence a spin-zero object, like a perfect sphere, is completely unchanged under any rotation. In view of this, spin-zero particles are known as scalar particles. The discovery of the Higgs boson, announced on 4th July 2012 at CERN in Geneva, was the first time a fundamental scalar particle had been observed in Nature.

The next case is $s = \frac12$. Electrons, neutrinos and quarks are fundamental particles of spin $\frac12$, whilst protons, neutrons and the silver atoms used by Stern & Gerlach are examples of composite spin-half particles. When $s = \frac12$, σ can only take one of the two values $\pm\frac12$. For shorthand, we let $|{\uparrow}\rangle$ denote the state $|s,\sigma\rangle = |\frac12,\frac12\rangle$ and $|{\downarrow}\rangle$ denote $|\frac12,-\frac12\rangle$. A generic state of a spin-half system can thus be expanded as

$$|\psi\rangle = a\,|{\uparrow}\rangle + b\,|{\downarrow}\rangle \tag{5.35}$$
in this basis, where $a, b \in \mathbb C$ and $|a|^2 + |b|^2 = 1$ so that $|\psi\rangle$ is correctly normalised. In this basis, we can write the spin operators themselves as the $2\times2$ matrices
$$S_x = \begin{pmatrix} \langle{\uparrow}|S_x|{\uparrow}\rangle & \langle{\uparrow}|S_x|{\downarrow}\rangle \\ \langle{\downarrow}|S_x|{\uparrow}\rangle & \langle{\downarrow}|S_x|{\downarrow}\rangle \end{pmatrix}, \qquad S_y = \begin{pmatrix} \langle{\uparrow}|S_y|{\uparrow}\rangle & \langle{\uparrow}|S_y|{\downarrow}\rangle \\ \langle{\downarrow}|S_y|{\uparrow}\rangle & \langle{\downarrow}|S_y|{\downarrow}\rangle \end{pmatrix}, \qquad S_z = \begin{pmatrix} \langle{\uparrow}|S_z|{\uparrow}\rangle & \langle{\uparrow}|S_z|{\downarrow}\rangle \\ \langle{\downarrow}|S_z|{\uparrow}\rangle & \langle{\downarrow}|S_z|{\downarrow}\rangle \end{pmatrix}. \tag{5.36}$$

Because $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$ are eigenstates of $S_z$, evaluating $S_z$ in this basis is immediate. To also evaluate $S_x$ and $S_y$, we note that $S_x = (S_+ + S_-)/2$ and $S_y = (S_+ - S_-)/2i$, where $S_\pm$ are the spin raising and lowering operators defined just as for $J_\pm$. Using (5.16) with $j = s = \frac12$ gives $S_+|{\downarrow}\rangle = \hbar\,|{\uparrow}\rangle$ and $S_-|{\uparrow}\rangle = \hbar\,|{\downarrow}\rangle$. In this way, we obtain
$$S_x = \frac\hbar2\begin{pmatrix}0&1\\1&0\end{pmatrix}, \qquad S_y = \frac\hbar2\begin{pmatrix}0&-i\\i&0\end{pmatrix}, \qquad S_z = \frac\hbar2\begin{pmatrix}1&0\\0&-1\end{pmatrix}. \tag{5.37}$$
The coefficients of $\hbar/2$ here are known as Pauli matrices54 and usually written as $(\sigma_x, \sigma_y, \sigma_z)$. Thus, for $s = 1/2$, we can write $\mathbf S = \frac\hbar2\,\boldsymbol\sigma$.

Proceeding to spin-one, we find three possible values $\sigma \in \{-1, 0, 1\}$. Thus the spin-one Hilbert space is three (complex) dimensional, and when s = 1 we can represent each component of $\mathbf S$ by a $3\times3$ matrix. In this case, the spin raising and lowering operators $S_\pm$ act as
$$S_+|{-}1\rangle = \sqrt2\,\hbar\,|0\rangle\,, \quad S_+|0\rangle = \sqrt2\,\hbar\,|{+}1\rangle\,, \quad S_-|{+}1\rangle = \sqrt2\,\hbar\,|0\rangle\,, \quad S_-|0\rangle = \sqrt2\,\hbar\,|{-}1\rangle\,, \tag{5.38}$$

54Do not confuse this vector of Pauli matrices with the eigenvalue label σ of $S_z$.

as follows from (5.16) with j = s = 1. Using these results, one can check that
$$S_x = \frac{\hbar}{\sqrt2}\begin{pmatrix}0&1&0\\1&0&1\\0&1&0\end{pmatrix}, \qquad S_y = \frac{\hbar}{\sqrt2}\begin{pmatrix}0&-i&0\\i&0&-i\\0&i&0\end{pmatrix}, \qquad S_z = \hbar\begin{pmatrix}1&0&0\\0&0&0\\0&0&-1\end{pmatrix}. \tag{5.39}$$
Sadly, these matrices don't have any special name. Just as for spin-half, we can use this representation to express the state $|1,\theta\rangle$ in which a measurement of $\mathbf n\cdot\mathbf S$ along some axis $\mathbf n$ is certain to yield $+\hbar$. Examples of fundamental particles with spin-one are the somewhat unimaginatively named W and Z bosons55. These are responsible for the weak interactions that, among other things, allow two Hydrogen nuclei to fuse into Deuterium, powering nuclear fusion in the cores of stars.

In just the same way, we can represent each component of the spin operator by a $(2s+1)\times(2s+1)$ matrix for any finite s. Since our basis is adapted to $S_z$, the matrix representation of $S_z$ will be

$$S_z = \hbar\,{\rm diag}(s,\, s-1,\, \ldots,\, -s+1,\, -s)\,. \tag{5.40}$$

Matrices for $S_x$ and $S_y$ may be constructed using the spin raising and lowering operators as above. Since $S_\pm$ only change the z-component of the spin by one unit, these matrices are very sparse. One finds that they have non-zero entries only along the subleading diagonals, given by
$$(S_x)_{\sigma'\sigma} = \frac{\hbar}{2}\Big(\alpha(\sigma)\,\delta_{\sigma'-1,\sigma} + \alpha(\sigma-1)\,\delta_{\sigma'+1,\sigma}\Big)\,, \qquad (S_y)_{\sigma'\sigma} = \frac{\hbar}{2i}\Big(\alpha(\sigma)\,\delta_{\sigma'-1,\sigma} - \alpha(\sigma-1)\,\delta_{\sigma'+1,\sigma}\Big)\,, \tag{5.41}$$
where $\alpha(\sigma) = \sqrt{(s-\sigma)(s+\sigma+1)}$. Note that, whatever the value of s, we always have three matrices $(S_x, S_y, S_z)$ describing reorientations around the three independent axes in $\mathbb R^3$. Note also that each of these matrices is indeed traceless, as required by $i\hbar\,\epsilon_{ijk}\,{\rm tr}_{\mathcal H}(S_k) = {\rm tr}_{\mathcal H}([S_i, S_j]) = 0$, as we said at the beginning of the chapter.
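As a quick numerical aside (ours, not part of the original exposition; it assumes NumPy and SciPy, and the helper name spin_matrices is ours), here is a minimal sketch of the construction (5.40)–(5.41). It builds the spin matrices for a few values of s, checks the commutation relation $[S_x, S_y] = i\hbar S_z$, and exhibits the spinor sign of section 5.3.3: a rotation through $2\pi$ about $\hat{\mathbf z}$ acts as $(-1)^{2s}$.

import numpy as np
from scipy.linalg import expm

hbar = 1.0  # work in units where hbar = 1

def spin_matrices(s):
    """(2s+1)x(2s+1) matrices for Sx, Sy, Sz in the basis |s,s>, |s,s-1>, ..., |s,-s>,
    with ladder matrix elements alpha(sigma) = sqrt((s-sigma)(s+sigma+1)) as in (5.41)."""
    dim = int(round(2 * s)) + 1
    sigmas = np.array([s - k for k in range(dim)])
    Sz = hbar * np.diag(sigmas)
    Sp = np.zeros((dim, dim))
    for k in range(1, dim):
        sig = sigmas[k]
        Sp[k - 1, k] = hbar * np.sqrt((s - sig) * (s + sig + 1))  # S+|s,sig> ~ |s,sig+1>
    Sm = Sp.T
    return (Sp + Sm) / 2, (Sp - Sm) / 2j, Sz   # dividing by the literal 2j gives (S+ - S-)/2i

for s in (0.5, 1, 1.5, 2):
    Sx, Sy, Sz = spin_matrices(s)
    assert np.allclose(Sx @ Sy - Sy @ Sx, 1j * hbar * Sz)   # [Sx, Sy] = i hbar Sz
    U = expm(-2j * np.pi * Sz / hbar)                       # rotation by 2*pi about z
    print(s, round(U[0, 0].real, 6))                        # +1 for integer s, -1 for half-integer s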

55Although the photon carries one unit of $\hbar$ in intrinsic angular momentum, it has only two possible spin states, corresponding to left- or right-circularly polarized light. Because the photon is massless, one needs to consider representations of the Poincaré group, rather than the spatial rotation group, in order to describe it accurately. We won't do this in this course.

In elementary particle physics (and also in Tripos questions), one rarely encounters spins higher than 1. Nonetheless, it's interesting to consider the limit $s \gg 1$ of very high spins to see how our classical intuition emerges. For example, an electric motor that is roughly 1 cm in diameter and weighs about 10 g might spin at $\sim$100 revolutions per second. Its intrinsic angular momentum is then $\sim 10^{-3}\,{\rm kg\,m^2\,s^{-1}} \approx 10^{31}\,\hbar$. The classical world thus involves huge values of s!

Let's now show that, when $s \gg 1$, there is very little uncertainty in the direction of a system's spin. Suppose our system is in the state $|s,s\rangle$, so that its spin is maximally aligned with the z-axis. Let's compute $\langle s,s|\,\mathbf n\cdot\mathbf S\,|s,s\rangle$ where $\mathbf n = (\sin\theta, 0, \cos\theta)$. That is, we want to know how much spin we expect to measure along the direction in the xz-plane, inclined at angle θ to the z-axis. Classically, this would be just $\hbar s\cos\theta$, the projection of s onto this axis. Quantum mechanically, we have

$$\langle s,s|\,\mathbf n\cdot\mathbf S\,|s,s\rangle = \sin\theta\,\langle s,s|S_x|s,s\rangle + \cos\theta\,\langle s,s|S_z|s,s\rangle = \hbar s\cos\theta\,, \tag{5.42}$$
using the fact that $S_x = (S_+ + S_-)/2$. Thus, on average the quantum result agrees with the classical intuition. This holds for any value of s, but now let's ask what the uncertainty in this result is. We compute

$$\begin{aligned}
\langle s,s|(\mathbf n\cdot\mathbf S)^2|s,s\rangle &= \sin^2\theta\,\langle s,s|S_x^2|s,s\rangle + \sin\theta\cos\theta\,\langle s,s|S_xS_z + S_zS_x|s,s\rangle + \cos^2\theta\,\langle s,s|S_z^2|s,s\rangle\\
&= \frac14\sin^2\theta\,\langle s,s|S_+S_-|s,s\rangle + \hbar^2 s^2\cos^2\theta\\
&= \hbar^2\left(\frac s2\,\sin^2\theta + s^2\cos^2\theta\right),
\end{aligned} \tag{5.43}$$
with all other terms vanishing. Consequently
$$\sqrt{\langle(\mathbf n\cdot\mathbf S)^2\rangle - \langle\mathbf n\cdot\mathbf S\rangle^2} = \hbar\sqrt{\frac s2}\;|\sin\theta| \tag{5.44}$$
and so, in the classical limit of large s, the uncertainty is small ($\sim 1/\sqrt s$) compared to $\langle\mathbf n\cdot\mathbf S\rangle$.

5.3.5 Paramagnetic Resonance and MRI Scanners

In the presence of an external magnetic field, a classical magnetic dipole $\boldsymbol\mu$ experiences a torque
$$\frac{\partial\mathbf L}{\partial t} = \boldsymbol\mu\times\mathbf B\,. \tag{5.45}$$
The dipole will thus turn until it aligns itself along the direction of the field, minimizing its energy. This is familiar from a compass. However, suppose the dipole is already spinning around its centre of mass, such that $\boldsymbol\mu = \gamma\mathbf L$, where the constant γ is known as the gyromagnetic ratio. Then instead one finds that the torque causes the dipole to precess around $\mathbf B$. As in our discussion of the Stern–Gerlach experiment, particles such as a proton or electron do have magnetic dipole moments $\boldsymbol\mu = (2\mu/\hbar)\,\mathbf S$ proportional to their spin. We'll see that in quantum mechanics, this spin does indeed precess around the direction of an applied magnetic field. This is the basis of MRI scanners, which have become an enormously important diagnostic tool for both chemistry and medicine.

We can understand the basic principles of an MRI machine by using our spin matrices. Unlike the beam of silver atoms in the Stern–Gerlach experiment, here the protons are not free to move, because they're held in place by the electromagnetic binding forces of a complex molecule, and this molecule is also held in a (roughly) fixed place in our cells. Thus we take the Hamiltonian to be
$$H = -\boldsymbol\mu\cdot\mathbf B = -\frac{2\mu B}{\hbar}\,S_z\,, \tag{5.46}$$

with no kinetic term. Again, we've chosen the direction of the magnetic field to define the $\hat{\mathbf z}$ axis. In particular, for a spin-$\frac12$ particle such as a proton, the eigenvalues of this Hamiltonian are $E_\pm = \mp\mu B$, where $B = |\mathbf B|$.

Initially, we cannot expect the protons in our body to have their spins already aligned along $\hat{\mathbf B}$. Suppose instead that some proton has its spin aligned along some axis $\mathbf n = (0, \sin\theta, \cos\theta)$ inclined at angle θ to $\hat{\mathbf B}$. That is, we consider a proton in the state $|\theta{\uparrow}\rangle$ defined by
$$\mathbf n\cdot\mathbf S\,|\theta{\uparrow}\rangle = \frac\hbar2\,|\theta{\uparrow}\rangle\,. \tag{5.47}$$
We can always choose to expand this state in terms of our basis $\{|{\uparrow}\rangle, |{\downarrow}\rangle\}$, representing states of definite spin along $\hat z$, as

$$|\theta{\uparrow}\rangle = a\,|{\uparrow}\rangle + b\,|{\downarrow}\rangle \tag{5.48}$$
where $|a|^2 + |b|^2 = 1$. In terms of our matrices (5.37) for spin-$\frac12$, the eigenvalue equation (5.47) becomes
$$\frac\hbar2\begin{pmatrix}0&-i\\i&0\end{pmatrix}\sin\theta\begin{pmatrix}a\\b\end{pmatrix} + \frac\hbar2\begin{pmatrix}1&0\\0&-1\end{pmatrix}\cos\theta\begin{pmatrix}a\\b\end{pmatrix} = \frac\hbar2\begin{pmatrix}\cos\theta & -i\sin\theta\\ i\sin\theta & -\cos\theta\end{pmatrix}\begin{pmatrix}a\\b\end{pmatrix} = \frac\hbar2\begin{pmatrix}a\\b\end{pmatrix}. \tag{5.49}$$
Solving this eigenvalue problem together with the normalisation condition yields $a = \cos\frac\theta2$ and $b = i\sin\frac\theta2$ (up to a possible phase), so a proton whose spin is aligned along the $\mathbf n$ axis is in state
$$|\theta{\uparrow}\rangle = \cos\frac\theta2\,|{\uparrow}\rangle + i\sin\frac\theta2\,|{\downarrow}\rangle\,. \tag{5.50}$$

Note that $|\theta{\uparrow}\rangle$ has the expected behaviour at θ = 0 and θ = π, and that it yields $-|{\uparrow}\rangle$ as θ is continuously increased to $2\pi$. (It's straightforward to generalise this example to the case of a state with spin aligned along an arbitrary axis $\mathbf n$.) Applying the time evolution operator $U(t) = e^{-iHt/\hbar}$ with the Hamiltonian (5.46), we find that at time t the proton's state has evolved to

$$|\psi(t)\rangle = U(t)|\theta{\uparrow}\rangle = \cos\frac\theta2\,e^{-iHt/\hbar}|{\uparrow}\rangle + i\sin\frac\theta2\,e^{-iHt/\hbar}|{\downarrow}\rangle = \cos\frac\theta2\,e^{+i\omega t/2}|{\uparrow}\rangle + i\sin\frac\theta2\,e^{-i\omega t/2}|{\downarrow}\rangle\,, \tag{5.51}$$
where $\omega = 2\mu B/\hbar$ is the Larmor frequency. One can check that this state is an eigenstate of the spin operator aligned along the axis $\mathbf n(t) = (\sin\theta\sin\omega t,\, \sin\theta\cos\omega t,\, \cos\theta)$, so that at time t, the proton's spin is definitely aligned along $\mathbf n(t)$. Consequently, as time passes, the spin of any proton will precess around the direction of $\mathbf B$ with frequency ω that is independent of the angle of inclination θ – i.e. independent of the proton's initial orientation (provided it was not pointing exactly along $\hat{\mathbf B}$ in the first place). This is exactly the same behaviour as we found classically.
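As an illustrative check (ours, not in the original notes; toy units with $\hbar = \mu = B = 1$ are assumed), the following sketch evolves the state (5.50) with the Hamiltonian (5.46) and verifies that at each time it is the spin-up eigenstate along the precessing axis $\mathbf n(t)$:

import numpy as np
from scipy.linalg import expm

hbar, mu, B = 1.0, 1.0, 1.0
omega = 2 * mu * B / hbar                                   # Larmor frequency
Sx = hbar / 2 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = hbar / 2 * np.array([[0, -1j], [1j, 0]])
Sz = hbar / 2 * np.diag([1.0 + 0j, -1.0])
H = -(2 * mu * B / hbar) * Sz                               # (5.46)

theta = 0.7
psi0 = np.array([np.cos(theta / 2), 1j * np.sin(theta / 2)])   # (5.50)

for t in np.linspace(0, 2 * np.pi / omega, 7):
    psi = expm(-1j * H * t / hbar) @ psi0
    n_t = np.array([np.sin(theta) * np.sin(omega * t),
                    np.sin(theta) * np.cos(omega * t),
                    np.cos(theta)])
    S_n = n_t[0] * Sx + n_t[1] * Sy + n_t[2] * Sz
    assert np.allclose(S_n @ psi, hbar / 2 * psi)           # spin points along n(t)
print("spin precesses about B at the Larmor frequency")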

When a material that contains chemically bound hydrogen atoms is immersed in a strong magnetic field, over time the protons will emit radiation (not accounted for by our above Hamiltonian) so as to sit in the ground state. Thus, eventually the precession will cease and most of the protons' spins will be aligned along the direction of $\mathbf B$.

Figure 10: An image of a cross-section of the head, produced by an MRI scan. Different regions contain different concentrations of organic molecules, so react more or less strongly to the resonant magnetic field. MRI scanners are one of the most important diagnostic tools in modern medicine.

Now suppose that, in addition to the static external $\mathbf B$ field, we apply a small additional magnetic field $\mathbf b\cos(\omega t)$ that varies at the same Larmor frequency $\omega = 2\mu B/\hbar$ (which corresponds to radio frequencies for μ the dipole moment of the proton and B the strength of a typical magnetic field available in hospitals). The Hamiltonian felt by the protons will then be
$$H = -\boldsymbol\mu\cdot\big(\mathbf B + \mathbf b\cos(\omega t)\big) = -\frac{2\mu}{\hbar}\big(B\,S_z + b\cos(\omega t)\,S_x\big) = -\mu\begin{pmatrix} B & b\cos(\omega t)\\ b\cos(\omega t) & -B \end{pmatrix} \tag{5.52}$$
if the new field is in the $\hat{\mathbf x}$-direction. The TDSE for this system is thus
$$\begin{pmatrix}\dot a\\ \dot b\end{pmatrix} = -\frac i\hbar\,H\begin{pmatrix}a\\ b\end{pmatrix} = \frac{i\mu}{\hbar}\begin{pmatrix} B & b\cos(\omega t)\\ b\cos(\omega t) & -B \end{pmatrix}\begin{pmatrix}a\\ b\end{pmatrix}. \tag{5.53}$$
The applied radiation has just the right energy to excite these protons into the state where their spin is aligned against $\mathbf B$. Consequently, such radiation is readily absorbed by the sample, whereas radiation at nearby frequencies is not. As we have seen, interference between the two states causes the spin to precess around the direction of $\mathbf B$, and this precessing magnetic moment couples resonantly to the applied radiation field.

MRI scanners are typically tuned to determine the concentration of a single type of atom, usually Hydrogen. However, in a complex molecule, not every Hydrogen nucleus

(proton) feels the same magnetic field, because of additional contributions from the electrons that bind the atoms together in the molecule. For example, in methanol (CH₃OH) the magnetic field experienced by the proton that is attached to the oxygen atom differs from those experienced by the protons attached to the carbon atom. Also, of the 3 protons bound to the carbon atom, one is on the opposite side of the carbon from the oxygen, while the other two lie between the carbon and oxygen atoms. The first thus feels a different magnetic field to the other two. Now, the frequency ω of precession is proportional to the strength of the magnetic field at the location of the proton, so for any fixed strength of applied magnetic field, methanol has three different resonant frequencies. Clues to the chemical structure of a substance can thus be obtained by determining the frequencies at which magnetic resonance occurs in a given imposed field.

If we choose $\mathbf B$ to have a spatial gradient, then only a thin slice of our sample material will have ω tuned to its resonant frequency, so we excite transitions to higher energy levels only in this thin slice. Varying this field in an orthogonal direction as the nuclear spins decay back down to the ground state allows us to recover three dimensional images.
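To see the resonance quantitatively, here is a small sketch (ours, assuming SciPy and toy units $\hbar = \mu = B = 1$ with a weak drive, b much smaller than B) that integrates the TDSE (5.53) numerically. Starting from the ground state, the population of the flipped state approaches 1 when the drive is at the Larmor frequency, but stays small when it is detuned:

import numpy as np
from scipy.integrate import solve_ivp

hbar, mu, B, b = 1.0, 1.0, 1.0, 0.05
omega0 = 2 * mu * B / hbar                   # Larmor (resonant) frequency

def tdse(t, psi, omega):
    # H(t) = -mu * ((B, b cos(omega t)), (b cos(omega t), -B)), as in (5.52)
    H = -mu * np.array([[B, b * np.cos(omega * t)],
                        [b * np.cos(omega * t), -B]])
    return -1j * H @ psi / hbar

psi0 = np.array([1.0 + 0j, 0.0 + 0j])        # spin aligned with B (the ground state)
T = np.pi * hbar / (mu * b)                  # roughly one Rabi half-period
for omega in (omega0, 1.3 * omega0):
    sol = solve_ivp(tdse, (0, T), psi0, args=(omega,), rtol=1e-8, atol=1e-10)
    print(omega / omega0, abs(sol.y[1, -1])**2)   # ~1 on resonance, small off resonance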

5.4 Orbital Angular Momentum

The topological considerations that allowed us to admit half-integer values of s do not apply to ℓ. To understand this, recall from section 4.3.1 that $\mathbf L$ could be interpreted as the generator of circular translations – transformations that translate a state around a circle in space, without adjusting its orientation. Unlike the space $S^3/\mathbb Z_2$ of rotations, the space of such circular paths in $\mathbb R^3$ is contractible (see figure ??). Consequently, translation around a circular path always leaves our state unchanged. In particular, we must have

$$e^{-2\pi i\,\hat{\mathbf z}\cdot\mathbf L/\hbar}\,|\ell,m\rangle = e^{-2\pi i m}\,|\ell,m\rangle = |\ell,m\rangle\,, \tag{5.54}$$
so that $m \in \mathbb Z$ and hence $\ell \in \mathbb N_0$.

5.4.1 Spherical Harmonics

The commutation relations of the Ls are exactly the same as those of the Ss, so for any finite ℓ we could choose to represent the orbital angular momentum operators by the same $(2\ell+1)\times(2\ell+1)$ matrices as we obtained above. However, in practical applications we're usually interested in states whose orbital angular momentum quantum number ℓ may change, perhaps as a result of the particle being excited from one energy level to another. Thus it's more convenient to use a formalism that allows us to treat all values of ℓ simultaneously. Furthermore, we often want to know about $\mathbf L$ at the same time as knowing about linear momentum $\mathbf P$ or position $\mathbf X$, and we have seen that the $[X_i, P_j]$ commutation relations do not have any finite dimensional representation, so we can only represent them as operators acting on (wave)functions. Let's now see how to reconstruct eigenfunctions of orbital angular momentum – the Legendre polynomials and spherical harmonics you met in IB – from the operator formalism. In the position representation, the orbital angular momentum operators become

$$\mathbf L = \mathbf X\times\mathbf P = -i\hbar\,\mathbf x\times\nabla \tag{5.55}$$

so that in particular
$$L_z = -i\hbar\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right). \tag{5.56}$$
In terms of spherical polar coordinates we have $(x, y, z) = (r\sin\theta\cos\phi,\, r\sin\theta\sin\phi,\, r\cos\theta)$, so
$$\frac{\partial}{\partial\phi} = \frac{\partial x}{\partial\phi}\frac{\partial}{\partial x} + \frac{\partial y}{\partial\phi}\frac{\partial}{\partial y} + \frac{\partial z}{\partial\phi}\frac{\partial}{\partial z} = -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}\,, \tag{5.57}$$
so that $L_z = -i\hbar\,\partial/\partial\phi$. Thus, using these coordinates, the eigenvalue equation for $L_z$ becomes
$$\langle\mathbf x|L_z|\ell,m\rangle = -i\hbar\,\frac{\partial}{\partial\phi}\langle\mathbf x|\ell,m\rangle = m\hbar\,\langle\mathbf x|\ell,m\rangle\,, \tag{5.58}$$
or $-i\,\partial\psi_{\ell m}(\mathbf x)/\partial\phi = m\,\psi_{\ell m}(\mathbf x)$, where $\psi_{\ell m}(\mathbf x) = \langle\mathbf x|\ell,m\rangle$ is the position space wavefunction. This is solved by
$$\psi_{\ell m}(\mathbf x) = K(r,\theta)\,e^{im\phi} \tag{5.59}$$
for some function $K(r,\theta)$. Since $m \in \mathbb Z$, $\psi_{\ell m}$ is a single–valued function of the azimuthal angle φ. This is often given as a further reason why only integer values of m (and hence ℓ) should be allowed. A straightforward, though somewhat tedious, calculation shows that the raising and lowering operators take the form
$$L_\pm = L_x \pm iL_y = \hbar\,e^{\pm i\phi}\left(\pm\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\phi}\right) \tag{5.60}$$
in the position representation. The condition $L_+\psi_{\ell\ell} = 0$ then fixes

$$\psi_{\ell\ell}(\mathbf x) = R(r)\,\sin^\ell\theta\;e^{i\ell\phi}\,. \tag{5.61}$$

Applying the lowering operators, one finds that all the other $\psi_{\ell m}(\mathbf x)$s are of the form

$$\psi_{\ell m}(\mathbf x) = R(r)\,Y_\ell^m(\theta,\phi)\,, \tag{5.62}$$
where the spherical harmonic
$$Y_\ell^m(\theta,\phi) = (-1)^m\sqrt{\frac{(2\ell+1)}{4\pi}\,\frac{(\ell-m)!}{(\ell+m)!}}\;P_\ell^m(\cos\theta)\,e^{im\phi} \tag{5.63}$$
is given in terms of the associated Legendre polynomial

$$P_\ell^m(x) = \frac{1}{2^\ell\,\ell!}\,(1-x^2)^{m/2}\,\frac{d^{\ell+m}}{dx^{\ell+m}}(x^2-1)^\ell\,. \tag{5.64}$$
In particular, the spherical harmonics with m = 0 are proportional to the ordinary Legendre polynomials:
$$Y_\ell^0(\theta) = \sqrt{\frac{2\ell+1}{4\pi}}\,P_\ell(\cos\theta)\,, \tag{5.65}$$
which is an odd or even polynomial in $\cos\theta$, according to whether ℓ is odd or even. In particular, the $P_\ell^m$s are only single–valued as functions on $S^2$ when ℓ is an integer, providing

another reason why the half–integer values allowed by general considerations of the algebra must in fact be discarded for $\mathbf L$.

Figure 11: The first few spherical harmonics $Y_\ell^m(\theta,\phi)$. Each row contains pictures of the $2\ell+1$ spherical harmonics at fixed ℓ, with m increasing from $-\ell$ to ℓ along each row.

We won't be concerned with the detailed form of these spherical harmonics, though you may wish to note the orthogonality condition
$$\int_{S^2} \overline{Y_{\ell'}^{m'}(\theta,\phi)}\;Y_\ell^m(\theta,\phi)\,\sin\theta\,d\theta\,d\phi = \delta_{\ell\ell'}\,\delta_{mm'} \tag{5.66}$$
and the fact that
$$r^2\nabla^2\,Y_\ell^m(\theta,\phi) = -\ell(\ell+1)\,Y_\ell^m(\theta,\phi)\,, \tag{5.67}$$
where $\nabla^2$ is the Laplacian. Furthermore, since $\mathbf L$ is invariant under parity, $\Pi^{-1}\mathbf L\,\Pi = +\mathbf L$, it follows that so too are the raising and lowering operators $L_\pm$. Therefore, all states in a given ℓ multiplet have the same parity. To determine what that parity is, note that in spherical polar coordinates the transformation $\mathbf x \mapsto -\mathbf x$ becomes
$$(r, \theta, \phi) \mapsto (r,\, \pi-\theta,\, \phi+\pi)\,, \tag{5.68}$$
so in particular $\cos\theta \mapsto -\cos\theta$. Since $P_\ell(-\cos\theta) = (-1)^\ell\,P_\ell(\cos\theta)$, we see that the parity of $Y_\ell^0$ is odd or even, according to whether ℓ is odd or even, and hence

$$Y_\ell^m(\pi-\theta,\, \phi+\pi) = (-1)^\ell\;Y_\ell^m(\theta,\phi) \tag{5.69}$$
for all spherical harmonics with a given value of ℓ.
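As a quick numerical check (ours, not part of the notes; it assumes SciPy, whose sph_harm takes its arguments in the order (m, ℓ, azimuth, polar)), we can verify the orthogonality relation (5.66) and the parity property (5.69) directly:

import numpy as np
from scipy.special import sph_harm
from scipy.integrate import dblquad

# orthogonality (5.66): integrate conj(Y_l1^m1) Y_l2^m2 sin(theta) over the sphere
def overlap(l1, m1, l2, m2):
    f = lambda th, ph: (np.conj(sph_harm(m1, l1, ph, th)) *
                        sph_harm(m2, l2, ph, th) * np.sin(th)).real
    return dblquad(f, 0, 2 * np.pi, 0, np.pi)[0]   # ph in [0,2pi], th in [0,pi]

print(overlap(2, 1, 2, 1))   # ~1
print(overlap(2, 1, 3, 1))   # ~0

# parity (5.69): Y_l^m(pi - theta, phi + pi) = (-1)^l Y_l^m(theta, phi)
th, ph, l, m = 0.7, 1.1, 3, 2
lhs = sph_harm(m, l, ph + np.pi, np.pi - th)
print(np.allclose(lhs, (-1)**l * sph_harm(m, l, ph, th)))   # True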

– 76 – It’s occasionally useful to have an alternative form of the spherical harmonics. Consider 3 the polynomial in R 3 X i1 i2 i` ψ(x) = ψi i ...i x x x (5.70) 1 2 ` ··· i1,i2,...,i`=1 that is homogeneous of degree `. The coefficients ψi1i2...i` C are necessarily totally ∈ 2 symmetric in their indices, and the polynomial is harmonic, i.e. ψ(x) = 0 if ψi i ...i are ∇ 1 2 ` traceless on any pair of indices. Finally, notice that so far we’ve said nothing about the radial profile of the wavefunc- tion, R(r). Since this function is certainly spherically symmetric, the rotation generators cannot tell us anything about it and to determine R(r) we’d need further information, such as the Hamiltonian.

6 Addition of Angular Momentum

Back in section 2.3.1, we understood how to describe composite systems using the tensor product of the Hilbert spaces of the individual subsystems. However, in many circumstances, the basis of the tensor product formed by taking all possible pairs of basis elements from the individual subspaces is not the most convenient. A key case is when the subspaces transform under the action of a group G. We'd like to understand the effect of a G-transformation on the combined space, and the 'obvious' tensor product basis usually obscures this.

In this chapter, we'll see how this works for the case of the rotation group G = SO(3). Specifically, suppose two subsystems are each in states of definite total angular momentum – meaning mathematically that they transform in irreducible representations of SO(3) – with total angular momentum quantum numbers $j_1$ and $j_2$, respectively. We'd like to express the tensor product of the subsystems in terms of a sum of Hilbert spaces,
$$\mathcal H_{j_1}\otimes\mathcal H_{j_2} = \bigoplus_j \mathcal H_j\,, \tag{6.1}$$
where each $\mathcal H_j$ describes a state of the whole system with definite total angular momentum quantum number j56. Physically, to understand this decomposition is to understand how to add the angular momentum of the two subsystems so as to find the possible values of the combined angular momentum of the whole system. For example, in a Hydrogen atom, both the proton and electron carry angular momentum $\hbar/2$ by virtue of their spins, and further angular momentum may be present depending on the electron's orbit around the proton. If we wish to treat the atom as a whole, then we'll be concerned with the angular momentum of the combined system rather than that of the electron and proton individually. An even more familiar example is your bike: the total angular momentum comes from a combination of the back and front wheels, which can rotate independently (at least in principle, though I don't recommend trying this while you're riding along!).

Our second task in this chapter is to fill in a gap left in our knowledge: while we know how to use the raising and lowering operators $J_\pm$ to realign a given amount of angular momentum along or away from some axis, we've not yet learnt how to perform a mathematical operation that changes the total angular momentum of our state. In section 6.2.1, we'll understand that, as well as states, operators themselves can carry angular momentum. Applying such an operator to a state yields a new state whose total angular momentum can differ from that of the original state. We'll illustrate this using various physical examples, including radiative transitions induced by an electric dipole moment, and the special 'dynamical' symmetries present in the Coulomb and harmonic oscillator potentials.

56In quantum mechanics we're concerned with Hilbert space, but in fact the inner product plays very little role in this story. The whole of this section carries over more generally to the case of representations of a general Lie group G, where we decompose the tensor product $R_{j_1}\otimes R_{j_2}$ of two irreducible representations (vector spaces) $R_{j_1}$ and $R_{j_2}$ in terms of a sum $\bigoplus_j R_j$ of further irreducible representations. You can learn more by taking the Part II Representation Theory course next term (and more still in the Part III courses).

Figure 12: Two gyroscopes, of individual angular momenta $j_1$ and $j_2$, may be aligned relative to one another so that their total angular momentum ranges between $j_1 + j_2$ and $|j_1 - j_2|$. For fixed relative alignment, the total angular momentum may be chosen to be aligned along any axis.

6.1 Combining the Angular Momenta of Two States

Suppose we have two subsystems ('gyros') enclosed in a box, where the first system has total angular momentum quantum number $j_1$ and the second has total angular momentum quantum number $j_2$. As we saw in section 5.1, a basis of states of the first system is

$$\{|j_1, m_1\rangle\} \quad\text{where}\quad m_1 \in \{-j_1,\, -j_1+1,\, \ldots,\, j_1\}\,, \tag{6.2}$$
while the second system may be described in terms of the basis

$$\{|j_2, m_2\rangle\} \quad\text{where}\quad m_2 \in \{-j_2,\, -j_2+1,\, \ldots,\, j_2\}\,. \tag{6.3}$$
From the general discussion of section 2.3.1, the state of the combined system can be expressed as
$$|\psi\rangle = \sum_{m_1=-j_1}^{j_1}\sum_{m_2=-j_2}^{j_2} c_{m_1m_2}\,|j_1,m_1\rangle\otimes|j_2,m_2\rangle \tag{6.4}$$
with some coefficients $c_{m_1m_2}$. We can choose $m_1$ and $m_2$ independently, so there are a total of $(2j_1+1)(2j_2+1)$ states here – this is the dimension of the tensor product $\mathcal H_{j_1}\otimes\mathcal H_{j_2}$. We want to understand how the states (6.4) behave under rotations; that is, we'd like to understand which linear combinations in (6.4) correspond to a definite amount of angular momentum for the system as a whole.

Let's first consider the corresponding classical situation, where we may wish to know the combined angular momentum of two gyroscopes. Although the total angular momentum of the individual components is fixed, that of the whole system is variable because it depends on the relative orientation of the two gyros: if they are aligned with each other, the whole system might be expected to have total angular momentum labelled by $j_1+j_2$, while if the two subsystems are aligned exactly against each other, then you may expect the system as a whole to have total angular momentum $|j_1-j_2|$. (See figure 12.) It's important

to realise that we're not saying anything at all about how the individual subsystems may or may not be coupled to one another dynamically – that is, we're not assuming any form of interaction between them in the Hamiltonian. Rather, we're just considering the different relative alignments of their angular momenta that are possible in principle. Quantum mechanically, the angular momentum operator for the combined system is

$$\mathbf J = \big(\mathbf J_1\otimes 1_{\mathcal H_{j_2}}\big) + \big(1_{\mathcal H_{j_1}}\otimes\mathbf J_2\big)\,, \tag{6.5}$$
where $\mathbf J_1$ and $\mathbf J_2$ are the angular momentum operators for the two subsystems. It follows that
$$\mathbf J^2 = \big(\mathbf J_1^2\otimes 1_{\mathcal H_{j_2}}\big) + \big(1_{\mathcal H_{j_1}}\otimes\mathbf J_2^2\big) + 2\,(\mathbf J_1\otimes\mathbf J_2)\,, \tag{6.6}$$
where in the final term we take the scalar product of the two spin operators over their spatial vector indices. We'll often abuse notation by writing

$$\mathbf J = \mathbf J_1 + \mathbf J_2 \qquad\text{and}\qquad \mathbf J^2 = \mathbf J_1^2 + \mathbf J_2^2 + 2\,\mathbf J_1\cdot\mathbf J_2\,, \tag{6.7}$$
with the tensor product and identity operators being understood. We can rewrite the operator (6.6) in a way that allows us to understand its action on states $|j_1,m_1\rangle|j_2,m_2\rangle$, which form our basis in (6.4) (we've again dropped the ⊗ symbol). Using the fact that
$$J_x = \frac{J_+ + J_-}{2} \qquad\text{and}\qquad J_y = \frac{J_+ - J_-}{2i}$$
we have

$$\begin{aligned}
2\,\mathbf J_1\cdot\mathbf J_2 &= 2\big(J_{1x}J_{2x} + J_{1y}J_{2y} + J_{1z}J_{2z}\big)\\
&= 2\left(\frac{J_{1+}+J_{1-}}{2}\,\frac{J_{2+}+J_{2-}}{2} + \frac{J_{1+}-J_{1-}}{2i}\,\frac{J_{2+}-J_{2-}}{2i} + J_{1z}J_{2z}\right)\\
&= J_{1+}J_{2-} + J_{1-}J_{2+} + 2\,J_{1z}J_{2z}\,.
\end{aligned} \tag{6.8}$$

Using this to eliminate $\mathbf J_1\cdot\mathbf J_2$ from (6.6) allows us to write the total angular momentum operator as
$$\mathbf J^2 = \mathbf J_1^2 + \mathbf J_2^2 + J_{1+}J_{2-} + J_{1-}J_{2+} + 2\,J_{1z}J_{2z}\,, \tag{6.9}$$
where we now understand how each of the terms on the rhs acts on any state of the form $|j_1,m_1\rangle|j_2,m_2\rangle$.

Let's consider the action of this operator on states of the whole system. We'll start by examining the state $|j_1,j_1\rangle|j_2,j_2\rangle$ in which both gyros are maximally aligned with the $\hat{\mathbf z}$-axis. Since $J_z = J_{1z} + J_{2z}$ we have

$$J_z\,|j_1,j_1\rangle|j_2,j_2\rangle = (j_1+j_2)\hbar\;|j_1,j_1\rangle|j_2,j_2\rangle\,, \tag{6.10}$$
so indeed this state is an eigenstate of $J_z$ with the expected eigenvalue. Also, using (6.9) we have
$$\begin{aligned}
\mathbf J^2\,|j_1,j_1\rangle|j_2,j_2\rangle &= \big(\mathbf J_1^2 + \mathbf J_2^2 + J_{1+}J_{2-} + J_{1-}J_{2+} + 2J_{1z}J_{2z}\big)\,|j_1,j_1\rangle|j_2,j_2\rangle\\
&= \big(j_1(j_1+1) + j_2(j_2+1) + 2j_1j_2\big)\,\hbar^2\;|j_1,j_1\rangle|j_2,j_2\rangle\\
&= (j_1+j_2)(j_1+j_2+1)\,\hbar^2\;|j_1,j_1\rangle|j_2,j_2\rangle\,,
\end{aligned} \tag{6.11}$$

– 80 – where we’ve used the fact that j1, j1 j2, j2 is annihilated by both J1+ and J2+. Thus, | i| i setting j = j1 + j2, we may write

$$|j,j\rangle = |j_1,j_1\rangle|j_2,j_2\rangle\,, \tag{6.12}$$
since $|j_1,j_1\rangle|j_2,j_2\rangle$ satisfies both the defining equations for a state of total angular momentum labelled by $j_1+j_2$, with it all aligned along $\hat{\mathbf z}$. (This is a highest weight state for $j = j_1+j_2$.)

Now that we've found one mutual eigenstate of the combined $\mathbf J^2$ and $J_z$, we may easily construct others by applying the lowering operator $J_- = J_{1-} + J_{2-}$. Acting on the lhs and using equation (5.16), we have
$$J_-|j,j\rangle = \hbar\sqrt{j(j+1) - j(j-1)}\;|j,j-1\rangle = \hbar\sqrt{2j}\;|j,j-1\rangle\,. \tag{6.13}$$
On the other hand, applying $J_- = J_{1-} + J_{2-}$ to the rhs of (6.12) gives

$$\begin{aligned}
(J_{1-} + J_{2-})\,|j_1,j_1\rangle|j_2,j_2\rangle &= \hbar\left[\sqrt{j_1(j_1+1) - j_1(j_1-1)}\;|j_1,j_1{-}1\rangle|j_2,j_2\rangle + \sqrt{j_2(j_2+1) - j_2(j_2-1)}\;|j_1,j_1\rangle|j_2,j_2{-}1\rangle\right]\\
&= \hbar\sqrt{2j_1}\;|j_1,j_1{-}1\rangle|j_2,j_2\rangle + \hbar\sqrt{2j_2}\;|j_1,j_1\rangle|j_2,j_2{-}1\rangle\,.
\end{aligned} \tag{6.14}$$
Comparing the two sides, we learn that
$$|j,j{-}1\rangle = \sqrt{\frac{j_1}{j}}\;|j_1,j_1{-}1\rangle|j_2,j_2\rangle + \sqrt{\frac{j_2}{j}}\;|j_1,j_1\rangle|j_2,j_2{-}1\rangle\,. \tag{6.15}$$

Note that the lhs is an eigenstate of $J_z$ with eigenvalue $(j-1)\hbar$, and indeed the rhs is an eigenstate of $J_{1z}+J_{2z}$ with eigenvalue $(j_1+j_2-1)\hbar$. A further application of $J_-$ to the lhs and $J_{1-}+J_{2-}$ to the rhs would produce an expression for $|j,j{-}2\rangle$, and so on.

Since $[\mathbf J^2, J_-] = 0$, all the states we produce by acting on $|j,j\rangle$ with $J_-$ have total angular momentum quantum number $j = j_1+j_2$. This corresponds to the individual angular momenta of the subsystems always being aligned with each other, with the different $|j,m\rangle$ telling us how closely their mutual direction of alignment coincides with the z-axis.

It's perfectly possible for the individual subsystems not to line up with each other. In such a configuration the net total angular momentum of the combined system will be less than the maximum value $j_1+j_2$. Let's seek an expression for the state $|j{-}1, j{-}1\rangle$ in which the total angular momentum of the whole system is just less than the maximum possible, but where this angular momentum still points along the $\hat{\mathbf z}$-axis. It's trivial to verify that any simple state $|j_1,m_1\rangle|j_2,m_2\rangle$ is an eigenstate of $J_z = J_{1z}+J_{2z}$, with eigenvalue $(m_1+m_2)\hbar$. We require $m_1+m_2 = j-1 = j_1+j_2-1$, so either $(m_1,m_2) = (j_1-1,\, j_2)$ or $(m_1,m_2) = (j_1,\, j_2-1)$. Also, the state $|j{-}1,j{-}1\rangle$ must be orthogonal to the state $|j,j{-}1\rangle$ we found in (6.15), since they are each eigenstates of the Hermitian operator $\mathbf J^2$ with distinct eigenvalues. Therefore we must have
$$|j{-}1,j{-}1\rangle = \sqrt{\frac{j_2}{j}}\;|j_1,j_1{-}1\rangle|j_2,j_2\rangle - \sqrt{\frac{j_1}{j}}\;|j_1,j_1\rangle|j_2,j_2{-}1\rangle\,, \tag{6.16}$$

with a relative sign between the two terms. Again, this is a highest weight state, now with $j = j_1+j_2-1$. Given this state, we may now proceed to construct all the states $|j{-}1, m\rangle$ with $-j+1 \le m \le j-1$ by applying the lowering operator $J_-$.

Figure 13: Decomposing the tensor product of two irreducible representations into a sum of irreducible representations, drawn as semicircles. The picture on the left shows states obtained by adding a system with $j_1 = 2$ to one with $j_2 = 1$, while that on the right is for $j_1 = 1$ and $j_2 = 1/2$.

Figure 13 helps us to organise our results. States of the combined system with well–defined angular momentum are marked by dots. The radius of each semicircle is proportional to $\sqrt{j'}$, where $j'(j'+1)\hbar^2$ is the state's eigenvalue of $\mathbf J^2$. The height of a state above the centre of the semicircles is proportional to m. The outer semicircle thus contains all states with $j' = j_1+j_2$; going inwards we meet the states with $j' = j_1+j_2-1$, then those with $j' = j_1+j_2-2$, and so on. To construct $|j{-}2, j{-}2\rangle$ we note that it must be in the span of $\{|j_1,j_1{-}2\rangle|j_2,j_2\rangle,\; |j_1,j_1{-}1\rangle|j_2,j_2{-}1\rangle,\; |j_1,j_1\rangle|j_2,j_2{-}2\rangle\}$, and that $|j,j{-}2\rangle$ and $|j{-}1,j{-}2\rangle$ also lie in the span of these states. Having constructed both $|j,j{-}2\rangle$ and $|j{-}1,j{-}2\rangle$, the state $|j{-}2,j{-}2\rangle$ must be the unique orthogonal combination.

From our classical considerations, it's reasonable to conjecture that the smallest possible total angular momentum quantum number of the whole system is $|j_1-j_2|$, coming when the two gyros are aligned against one another. Let's check this by working out the total number of states the conjecture leads to. Without loss of generality, we can assume $j_1 \ge j_2$. If the combined system has j values running from $j_1+j_2$ down to $j_1-j_2$, then we count a total of
$$\sum_{j=j_1-j_2}^{j_1+j_2}(2j+1) = 2\left[\sum_{j=j_1-j_2}^{j_1+j_2} j\right] + (2j_2+1) = (2j_1)(2j_2+1) + (2j_2+1) \tag{6.17}$$

$$= (2j_1+1)(2j_2+1)\,,$$
in agreement with the dimension of the tensor product space we found earlier using the 'obvious' tensor product basis in (6.4). We've thus accounted for all the possible states of the system. The numbers

$$C_{j,m}(j_1, m_1;\, j_2, m_2) = \langle j,m|\,\big(|j_1,m_1\rangle\otimes|j_2,m_2\rangle\big) \tag{6.18}$$

are known as Clebsch–Gordan coefficients. Physically, they represent the amplitude that, when the total system is in state $|j,m\rangle$, we will find the subsystems to be in states $|j_1,m_1\rangle$ and $|j_2,m_2\rangle$ when we peer inside the box. For example, from equation (6.15) we see that $C_{3,2}(2,1;\,1,1) = \sqrt{2/3}$, so if we know our box contains a gyro of spin 2 and a gyro of spin 1, and the box is in state $|3,2\rangle$, then there's a 2/3 chance that the second gyro has its spin maximally aligned with the z-axis with the first significantly inclined, but only a 1/3 chance that it is the first gyro that is maximally aligned with the $\hat{\mathbf z}$-axis.

Let's take a look at a few examples to familiarise ourselves with the above general formalism.
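Before turning to those examples, a quick numerical aside (ours, not in the notes; the helper name jmats is ours): the lowering-operator construction is easy to automate. The sketch below builds $J_- = J_{1-} + J_{2-}$ on the tensor product with Kronecker products and lowers the highest weight state once, reproducing the coefficients in (6.15) for $j_1 = 2$, $j_2 = 1$:

import numpy as np

hbar = 1.0

def jmats(j):
    """Jz and J- on the (2j+1)-dim multiplet, basis ordered |j,j>, |j,j-1>, ..., |j,-j>."""
    d = int(round(2 * j)) + 1
    m = np.array([j - k for k in range(d)])
    Jz = hbar * np.diag(m)
    Jm = np.zeros((d, d))
    for k in range(d - 1):
        Jm[k + 1, k] = hbar * np.sqrt(j * (j + 1) - m[k] * (m[k] - 1))  # (5.16)
    return Jz, Jm

j1, j2 = 2, 1
(Jz1, Jm1), (Jz2, Jm2) = jmats(j1), jmats(j2)
d1, d2 = 2 * j1 + 1, 2 * j2 + 1
Jm = np.kron(Jm1, np.eye(d2)) + np.kron(np.eye(d1), Jm2)   # J- = J1- + J2-

top = np.zeros(d1 * d2)
top[0] = 1.0                       # |2,2>|1,1>, the highest weight state (6.12)
state = Jm @ top
state /= np.linalg.norm(state)     # this is |3,2>
print(np.round(state[1], 4), np.round(state[3], 4))
# 0.5774 0.8165, i.e. sqrt(1/3) on |2,2>|1,0> and sqrt(2/3) on |2,1>|1,1>, as in (6.15)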

6.1.1 $j\otimes 0 = j$

First, the trivial case: if one of the subsystems (say the second) has $j_2 = 0$, then it always lies in the state $|j_2, m_2\rangle = |0,0\rangle$. Hence the tensor product is trivial:

$$\mathcal H = \mathcal H_{j_1}\otimes\mathcal H_0 = \mathcal H_{j_1}\otimes\mathbb C \cong \mathcal H_{j_1}\,,$$
because we have no choice about the state of the second subsystem. The states

$$\{|j,m\rangle = |j_1,m_1\rangle|0,0\rangle\}$$
therefore form a basis of the full system, and immediately furnish a single irreducible representation with $j = j_1$.

6.1.2 $\frac12\otimes\frac12 = 1\oplus 0$

The first non–trivial case is when $j_1 = j_2 = 1/2$. Physically this is relevant, e.g., to the case of the ground state of the Hydrogen atom, where to understand the spin of the whole atom we must combine the spins of both the proton and the electron. From above, we have

$$|1,1\rangle_H = |{\uparrow}\rangle_e\,|{\uparrow}\rangle_p\,, \tag{6.19}$$
where the subscripts refer to Hydrogen, the electron and the proton, respectively. In this state, the spins of the electron and proton are aligned with each other, and they each have the maximal possible spin along the $\hat{\mathbf z}$-axis. Applying $J_{-H} = J_{-e} + J_{-p}$ we obtain
$$|1,0\rangle_H = \frac{1}{\sqrt2}\big(|{\uparrow}\rangle_e|{\downarrow}\rangle_p + |{\downarrow}\rangle_e|{\uparrow}\rangle_p\big)\,, \tag{6.20}$$
while applying $J_{-H}$ a second time gives

$$|1,-1\rangle_H = |{\downarrow}\rangle_e\,|{\downarrow}\rangle_p\,. \tag{6.21}$$
Since both the electron and proton are in the state $|{\downarrow}\rangle$ here, any further applications of either lowering operator will annihilate the rhs, in agreement with the action of the total lowering operator $J_{-H}$ on the state $|1,-1\rangle_H$ of the whole atom.

Let me point out a perhaps surprising feature of this multiplet. In state (6.20), we are certain to find that the z-components of the spins of the electron and proton are 'antiparallel'. This may seem surprising given that the atom is still in a spin-1 state, corresponding to

the spins of the subsystems being aligned. The resolution of the paradox is that while the z-components are indeed antiparallel, the components in the xy-plane are aligned, although their precise direction is unknown to us. Similarly, in both the $|1,1\rangle$ and $|1,-1\rangle$ states, while the z-components of the electron and proton spins were aligned, their components in the xy-plane were in fact antiparallel. The poor alignment in the xy-plane accounts for the fact that $\sqrt{\langle\mathbf J^2\rangle} = \sqrt2\,\hbar$ for the atom, which is less than the sum of $\sqrt{\langle\mathbf J^2\rangle} = \sqrt{3/4}\,\hbar$ for the electron and proton individually.

The remaining state of the atom is $|0,0\rangle$, in which the atom has no angular momentum. This should be orthogonal to (6.20), so we find
$$|0,0\rangle_H = \frac{1}{\sqrt2}\big(|{\uparrow}\rangle_e|{\downarrow}\rangle_p - |{\downarrow}\rangle_e|{\uparrow}\rangle_p\big)\,. \tag{6.22}$$
The change in sign on the rhs of this equation ensures that the electron and proton spins are antiparallel in the xy-plane as well as in the z-direction. In fact, one can check that this state may also be written as
$$|0,0\rangle_H = -\frac{1}{\sqrt2}\big(|{\uparrow_x}\rangle_e|{\downarrow_x}\rangle_p - |{\downarrow_x}\rangle_e|{\uparrow_x}\rangle_p\big)\,, \tag{6.23}$$
where $|{\uparrow_x}\rangle$ and $|{\downarrow_x}\rangle$ are eigenstates of $J_x$. Similarly,
$$|1,0\rangle_H = \frac{1}{\sqrt2}\big(|{\uparrow_x}\rangle_e|{\uparrow_x}\rangle_p - |{\downarrow_x}\rangle_e|{\downarrow_x}\rangle_p\big)\,, \tag{6.24}$$
as one can check straightforwardly.

Altogether, combining two states each with angular momentum 1/2, we've found a triplet of states with total angular momentum j = 1:
$$|1,1\rangle = |{\uparrow}\rangle|{\uparrow}\rangle\,, \qquad |1,0\rangle = \frac{1}{\sqrt2}\big(|{\uparrow}\rangle|{\downarrow}\rangle + |{\downarrow}\rangle|{\uparrow}\rangle\big)\,, \qquad |1,-1\rangle = |{\downarrow}\rangle|{\downarrow}\rangle \tag{6.25}$$
and a singlet with total angular momentum j = 0:
$$|0,0\rangle = \frac{1}{\sqrt2}\big(|{\uparrow}\rangle|{\downarrow}\rangle - |{\downarrow}\rangle|{\uparrow}\rangle\big)\,. \tag{6.26}$$
Note that each state in the triplet is symmetric under exchange of the angular momenta of the two subsystems, while the singlet state is antisymmetric under this exchange.
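The identities (6.23) and (6.24) are quick to confirm numerically. A minimal sketch (ours, using $|{\uparrow_x}\rangle = (|{\uparrow}\rangle + |{\downarrow}\rangle)/\sqrt2$ and $|{\downarrow_x}\rangle = (|{\uparrow}\rangle - |{\downarrow}\rangle)/\sqrt2$) with Kronecker products:

import numpy as np

up, dn = np.array([1.0, 0.0]), np.array([0.0, 1.0])
upx, dnx = (up + dn) / np.sqrt(2), (up - dn) / np.sqrt(2)   # Sx eigenstates

singlet = (np.kron(up, dn) - np.kron(dn, up)) / np.sqrt(2)    # (6.26)
triplet0 = (np.kron(up, dn) + np.kron(dn, up)) / np.sqrt(2)   # |1,0>

# (6.23): the singlet keeps its antisymmetric form in the Sx basis (up to sign)
print(np.allclose(singlet, -(np.kron(upx, dnx) - np.kron(dnx, upx)) / np.sqrt(2)))
# (6.24): |1,0> = (|up_x up_x> - |dn_x dn_x>)/sqrt(2)
print(np.allclose(triplet0, (np.kron(upx, upx) - np.kron(dnx, dnx)) / np.sqrt(2)))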

6.1.3 $1\otimes\frac12 = \frac32\oplus\frac12$

If our subsystems have $j_1 = 1$ and $j_2 = \frac12$, then the total system can have $j = \frac32$ or $j = \frac12$. Let's start as always with the highest state
$$\left|\tfrac32,\tfrac32\right\rangle = |1,1\rangle|{\uparrow}\rangle\,. \tag{6.27}$$
Applying the lowering operator, we find successively the states
$$\begin{aligned}
\left|\tfrac32,\tfrac12\right\rangle &= \sqrt{\tfrac23}\,|1,0\rangle|{\uparrow}\rangle + \sqrt{\tfrac13}\,|1,1\rangle|{\downarrow}\rangle\,,\\
\left|\tfrac32,-\tfrac12\right\rangle &= \sqrt{\tfrac13}\,|1,-1\rangle|{\uparrow}\rangle + \sqrt{\tfrac23}\,|1,0\rangle|{\downarrow}\rangle\,,\\
\left|\tfrac32,-\tfrac32\right\rangle &= |1,-1\rangle|{\downarrow}\rangle\,,
\end{aligned} \tag{6.28}$$

Figure 14: Classically, the sum of two vectors lies on a sphere centred on the end of one of these vectors, at a location determined by their relative orientation.

which complete the $j = \frac32$ multiplet. The remaining states are
$$\left|\tfrac12,\tfrac12\right\rangle = \sqrt{\tfrac13}\,|1,0\rangle|{\uparrow}\rangle - \sqrt{\tfrac23}\,|1,1\rangle|{\downarrow}\rangle \qquad\text{and}\qquad \left|\tfrac12,-\tfrac12\right\rangle = \sqrt{\tfrac23}\,|1,-1\rangle|{\uparrow}\rangle - \sqrt{\tfrac13}\,|1,0\rangle|{\downarrow}\rangle\,. \tag{6.29}$$
The first of these is obtained by requiring it to be orthogonal to $\left|\tfrac32,\tfrac12\right\rangle$, and the second follows by applying the lowering operator to the first.

6.1.4 The Classical Limit

We should expect to recover a classical picture when we combine two systems each with large amounts of total angular momentum. Let's see how this occurs. Classically, if we add two angular momentum vectors $\mathbf j_1$ and $\mathbf j_2$ then the resultant vector has magnitude

$$\mathbf j^2 = (\mathbf j_1 + \mathbf j_2)^2 = \mathbf j_1^2 + \mathbf j_2^2 + 2\,\mathbf j_1\cdot\mathbf j_2\,. \tag{6.30}$$
If we know nothing about the relative orientation of $\mathbf j_1$ and $\mathbf j_2$, then all points on a sphere of radius $|\mathbf j_2|$ centred on the end of $\mathbf j_1$ are equally likely. Consequently, the probability dP that $\mathbf j_1$ and $\mathbf j_2$ have relative alignment in the range $(\theta, \theta+d\theta)$ is proportional to the area of the band shown in figure 14. The magnitudes of $\mathbf j_1$ and $\mathbf j_2$ are fixed, so $d(\mathbf j^2) = 2|\mathbf j_1||\mathbf j_2|\,d\cos\theta$. Hence the probability of this alignment is
$$dP = \frac{2\pi\sin\theta\,d\theta}{4\pi} = \frac{|\mathbf j|\,d|\mathbf j|}{2|\mathbf j_1||\mathbf j_2|} \tag{6.31}$$
and hence $dP/d|\mathbf j| = |\mathbf j|/2|\mathbf j_1||\mathbf j_2|$.

In the quantum case, suppose $j_1, j_2 \gg 1$. Then the fraction of states in the combined system with some amount j of angular momentum is
$$\frac{2j+1}{(2j_1+1)(2j_2+1)} \approx \frac{j}{2j_1j_2}\,, \tag{6.32}$$
provided $|j_1-j_2| \le j \le j_1+j_2$. Thus, if we know nothing about the original state of the subsystems, the probability that the combined system has total angular momentum j agrees with the classical probability computed above.
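A quick Monte Carlo check of this agreement (ours, not in the notes): sample the relative orientation uniformly, histogram the resulting $|\mathbf j|$, and compare with the linear density $|\mathbf j|/2|\mathbf j_1||\mathbf j_2|$ from (6.31), which has the same j-dependence as the quantum fraction (6.32).

import numpy as np

rng = np.random.default_rng(0)
j1, j2, N = 40, 25, 200_000

cos_t = rng.uniform(-1, 1, N)                       # uniformly random relative alignment
j_cl = np.sqrt(j1**2 + j2**2 + 2 * j1 * j2 * cos_t)
hist, edges = np.histogram(j_cl, bins=13, range=(j1 - j2, j1 + j2), density=True)
centres = 0.5 * (edges[:-1] + edges[1:])

# classical density dP/dj = j / (2 j1 j2), cf. (6.31)
print(np.allclose(hist, centres / (2 * j1 * j2), rtol=0.05))   # True, up to MC noise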

6.2 Angular Momentum of Operators

Suppose that $\mathbf V$ is a vector operator, so that in particular $U^{-1}(\boldsymbol\alpha)\,\mathbf V\,U(\boldsymbol\alpha) = R(\boldsymbol\alpha)\mathbf V$ under rotations, or
$$[J_i, V_j] = i\hbar\sum_k\epsilon_{ijk}V_k \tag{6.33}$$
infinitesimally. We define the spherical components of $\mathbf V$ by

$$V^{+1} = -\frac{1}{\sqrt2}\,(V_x + iV_y)\,, \qquad V^{-1} = \frac{1}{\sqrt2}\,(V_x - iV_y)\,, \qquad V^0 = V_z\,. \tag{6.34}$$
Then the commutation relations (6.33) are equivalent to

$$[J_z, V^m] = m\hbar\,V^m\,, \qquad [J_\pm, V^m] = \hbar\sqrt{2 - m(m\pm1)}\;V^{m\pm1}\,. \tag{6.35}$$

6.2.1 The Wigner–Eckart Theorem

6.2.2 Dipole Moment Transitions

6.2.3 The Runge–Lenz Vector in Hydrogen

For a spinless particle of mass57 M in a generic central potential, the Hamiltonian takes the form
$$H = \frac{\mathbf P^2}{2M} + V(|\mathbf X|)\,. \tag{6.36}$$
By definition, any such Hamiltonian is rotationally invariant, so it follows that

$$[\mathbf L, H] = 0 \qquad\text{and}\qquad [\mathbf L^2, H] = 0\,. \tag{6.37}$$

Since we also have $[L_i, \mathbf L^2] = 0$, we can find a complete set of simultaneous eigenstates of H, $\mathbf L^2$ and $L_z$, so we can label our states as $|n, \ell, m\rangle$. The Hamiltonian does not involve $L_z$ except through $\mathbf L^2$, so the energy level $E_{n,\ell,m}$ of such a state must certainly be independent of m. However, the Hamiltonian does involve $\mathbf L^2$ through the kinetic operator, so we do not generically expect the energy levels to be independent of ℓ. Physically, if we give our particle more or less angular momentum, it will whizz round faster or slower. Typically, we'd expect this to change the energy.

Nonetheless, it occasionally happens that a Hamiltonian is invariant under transformations beyond those inherited from the transformations of the space $\mathbb R^3$ in which the quantum system lives. If the algebra of the corresponding generators closes, meaning that the commutator of any pair of generators is equal to another generator (perhaps including the Hamiltonian), then we say we have a dynamical symmetry. In such cases, it may be possible to change the total angular momentum of a state without changing its energy, so that states of different orbital angular momentum quantum number ℓ can be degenerate.

Hamiltonians with dynamical symmetry are very rare: for single–particle quantum systems moving on $\mathbb R^3$ with a central potential, the only cases are the harmonic oscillator and motion in a Coulomb potential.

57We denote the mass with a capital M so as to avoid any confusion with the z angular momentum quantum number m.

However, these two cases are so important that it's worth considering them from this point of view. (The treatment below closely follows section 4.8 of Weinberg – I think I'll modify it when I write the Wigner–Eckart part.)

The first case we'll study is the Coulomb potential, relevant e.g. to the hydrogen atom. Classically, conservation of angular momentum in any central potential implies that the motion is confined to a plane, but the Kepler problem has a further conserved quantity known as the Runge–Lenz vector $\mathbf r$, given by
$$\mathbf r = -e^2\,\frac{\mathbf x}{|\mathbf x|} + \frac1M\,\mathbf p\times\mathbf L \tag{6.38}$$
for a particle of mass M. Conservation of this vector implies that classical bound states have closed orbits; the motion is not just in a plane, but follows a fixed ellipse. (The observation of a gradual shift in Mercury's orbit – the precession of its perihelion – was one of the early confirmations of General Relativity's modifications to Newton's Laws of Gravity.)

Now let's consider the quantum mechanical case. We construct the Runge–Lenz operator
$$\mathbf R = -e^2\,\frac{\mathbf X}{|\mathbf X|} + \frac{1}{2M}\big(\mathbf P\times\mathbf L - \mathbf L\times\mathbf P\big) = -e^2\,\frac{\mathbf X}{|\mathbf X|} + \frac1M\,\mathbf P\times\mathbf L - \frac{i\hbar}{M}\,\mathbf P\,. \tag{6.39}$$
Classically, the two terms in brackets on the first line of (6.39) are equal; we've written it this way to ensure that $\mathbf R^\dagger = \mathbf R$. Since $\mathbf R$ is a vector operator built from $\mathbf X$ and $\mathbf P$, it follows that its commutation relations with $\mathbf L$ are
$$[L_i, R_j] = i\hbar\sum_k\epsilon_{ijk}R_k\,, \tag{6.40}$$
so that it transforms as a vector under rotations. Therefore, the Wigner–Eckart theorem says that
$$\langle n', \ell', m'|R_1^k|n, \ell, m\rangle = C_{\ell',m'}(1, k;\, \ell, m)\,\langle n'\|R\|n\rangle\,. \tag{6.41}$$
Furthermore, a somewhat tedious calculation (which I'll spare you, and which I don't expect you to reproduce) shows that $[H, \mathbf R] = 0$, so the Runge–Lenz vector is conserved. Thus the reduced matrix element $\langle n'\|R\|n\rangle$ vanishes unless $n' = n$.

The orbital angular momentum $\mathbf L = \mathbf X\times\mathbf P$ is orthogonal to each of the three terms in this expression, so $\mathbf R\cdot\mathbf L = \mathbf L\cdot\mathbf R = 0$. We also have
$$\begin{aligned}
\mathbf L^2 = (\mathbf X\times\mathbf P)\cdot\mathbf L = \mathbf X\cdot(\mathbf P\times\mathbf L) &= (\mathbf P\times\mathbf L)\cdot\mathbf X + \sum_{i,j,k=1}^3\epsilon_{ijk}\,[X_i, P_jL_k]\\
&= (\mathbf P\times\mathbf L)\cdot\mathbf X + 2i\hbar\,\mathbf P\cdot\mathbf X\,,
\end{aligned} \tag{6.42}$$

as well as

$$\mathbf P\cdot(\mathbf P\times\mathbf L) = 0\,, \qquad (\mathbf P\times\mathbf L)\cdot\mathbf P = 2i\hbar\,\mathbf P^2 \qquad\text{and}\qquad (\mathbf P\times\mathbf L)^2 = \mathbf P^2\mathbf L^2\,, \tag{6.43}$$
where we've been careful to keep track of operator orderings. Using these results, we find that the length-squared of the Runge–Lenz operator is given by

$$\mathbf R^2 = Z^2e^4 + \frac{2H}{\mu}\big(\mathbf L^2 + \hbar^2\big)\,. \tag{6.44}$$
The point of this expression is that it relates $\mathbf R^2$ to H; since we know the spectrum of $\mathbf L^2$, we can find the eigenvalues of H if we can find the eigenvalues of $\mathbf R^2$. A rather long-winded calculation (that I won't expect you to reproduce) shows that the components of the Runge–Lenz operator obey the commutation relations
$$[R_i, R_j] = -\frac{2i\hbar}{\mu}\,H\sum_k\epsilon_{ijk}L_k\,, \tag{6.45}$$
where we note that H commutes with $\mathbf L$. We now introduce the operators
$$\mathbf A_\pm = \frac12\left(\mathbf L \pm \sqrt{-\frac{\mu}{2H}}\;\mathbf R\right) \tag{6.46}$$
built from the orbital angular momentum operator, the Runge–Lenz operator and the Hamiltonian, where we recall that functions (such as the square root) of an operator are defined as in (2.42). The point of introducing these strange-looking combinations is that the commutation relations (6.40) & (6.45) show that the $\mathbf A_\pm$ obey the algebra
$$[A_{\pm i}, A_{\pm j}] = i\hbar\sum_k\epsilon_{ijk}A_{\pm k} \qquad\text{and}\qquad [\mathbf A_\pm, \mathbf A_\mp] = 0\,, \tag{6.47}$$
which we recognise58 as two independent copies of the angular momentum algebra so(3). In particular, just as for the total angular momentum, the eigenvalues of $\mathbf A_\pm^2$ take the form $a_\pm(a_\pm+1)\hbar^2$ where $a_\pm \in \{0, 1/2, 1, 3/2, \ldots\}$, with the eigenvalues of any component of $\mathbf A_\pm$ running from $-\hbar a_\pm$ to $+\hbar a_\pm$ in steps of $\hbar$.

Since $[\mathbf A_\pm, H] = 0$ we can find simultaneous eigenstates of H, $\mathbf A_\pm^2$ and, say, $A_{+z}$ and $A_{-z}$. However, taking the square of (6.46) and using the fact that $\mathbf L\cdot\mathbf R = \mathbf R\cdot\mathbf L = 0$,
$$\mathbf A_+^2 = \mathbf A_-^2 = \frac14\left(\mathbf L^2 - \frac{\mu}{2H}\,\mathbf R^2\right) = -\frac{Z^2e^4\mu}{8H} - \frac{\hbar^2}{4}\,. \tag{6.48}$$
In particular, we cannot choose these eigenvalues of $\mathbf A_+^2$ and $\mathbf A_-^2$ independently, but must have $a_+ = a_-$. Calling the common eigenvalue a, a simultaneous eigenstate of H and $\mathbf A_\pm^2$ obeys
$$-\frac{Z^2e^4\mu}{8E} = \left(a(a+1) + \frac14\right)\hbar^2 = \frac{(2a+1)^2}{4}\,\hbar^2\,. \tag{6.49}$$

58For bound states of the Coulomb potential, the spectrum of H is negative-definite, so the operators $\mathbf A_\pm$ are indeed Hermitian when acting on such states. At energies above the ionisation threshold, the square root in (6.46) means that $\mathbf A_\pm$ can no longer be treated as Hermitian, and this algebra is then better described as sl(2).

For a any non-negative half-integer, $2a+1 \in \{1, 2, 3, \ldots\}$, so the energy levels can be written
$$E_n = -\frac{Z^2e^4\mu}{2\hbar^2 n^2} \qquad\text{where } n\in\mathbb N\,. \tag{6.50}$$
These are the energy levels of hydrogenic atoms that you obtained in IB QM. The eigenvalues of $A_{+z}$ and $A_{-z}$ can be chosen independently, and each runs from $-a\hbar$ to $a\hbar$ in steps of $\hbar$. Thus the degeneracy of the state with energy $E_n$ is $(2a+1)^2 = n^2$. Again, this is bigger than the degeneracy of a generic central potential, and the degenerate states transform in representations of the enhanced so(3) × so(3) symmetry algebra. The diagonal part of this, generated by $\mathbf L = \mathbf A_+ + \mathbf A_-$, corresponds to spatial rotations.

Incidentally, it's useful to note that the energy levels of a hydrogenic atom can be written as
$$E_n = -\frac{Z^2\alpha^2}{2n^2}\,mc^2\,, \tag{6.51}$$
where the dimensionless number
$$\alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \tag{6.52}$$
is known as the fine structure constant. Experimentally, $\alpha \approx 1/137$. The significance of this is that, at least for small atomic number Z, $|E_n|/mc^2 \ll 1$, providing a posteriori justification of our use of non–relativistic quantum mechanics to study the hydrogen atom.
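As a tiny numerical illustration (ours, not in the notes), evaluating (6.51) with $\alpha \approx 1/137.036$ and the electron rest energy recovers the familiar Rydberg-scale levels:

# hydrogenic energy levels from (6.51): E_n = -Z^2 alpha^2 m c^2 / (2 n^2)
alpha = 1 / 137.036          # fine structure constant
me_c2 = 0.510998950e6        # electron rest energy m c^2 in eV
for Z in (1, 2):
    for n in (1, 2, 3):
        E = -Z**2 * alpha**2 * me_c2 / (2 * n**2)
        print(f"Z={Z} n={n}  E = {E:9.2f} eV")
# Z=1, n=1 gives E ~ -13.6 eV, the familiar Rydberg energy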

6.2.4 Enhanced Symmetry of the 3d Isotropic Harmonic Oscillator

As a second example, consider an isotropic 3d harmonic oscillator, with Hamiltonian

$$H = \frac{\mathbf P^2}{2M} + \frac12\,M\omega^2\mathbf X^2\,, \tag{6.53}$$
as in section 3.3. The fact that H is quadratic in $\mathbf X$ as well as in $\mathbf P$ means it has an enhanced symmetry beyond the obvious SO(3) rotational invariance. We can see this from the form
$$H = \hbar\omega\left(\sum_i A_i^\dagger A_i + \frac32\right) \tag{6.54}$$
in terms of raising and lowering operators. Written this way, H is clearly invariant under the transformations

$$A_i \;\mapsto\; A_i' = \sum_{j=1}^3 u_{ij}\,A_j\,, \qquad A_i^\dagger \;\mapsto\; (A_i')^\dagger = \sum_{k=1}^3 A_k^\dagger\,(u^\dagger)_{ki} \tag{6.55}$$
for any $3\times3$ unitary matrix u. These matrices must be unitary if the new $A'^\dagger$ is indeed to be the Hermitian conjugate of the new $A'$. It's important to realise that, although they are unitary, the matrices $u_{ij}$ do not act on the Hilbert space of the harmonic oscillator, but just on the spatial indices of the vector operators $\mathbf A$ and $\mathbf A^\dagger$. They thus mix $\mathbf X$ and $\mathbf P$ together59.

59For those taking Integrable Systems: classically these are symmetries of phase space that do not come from symmetries of space itself, and reflect the fact that the harmonic oscillator is (super-)integrable.

Note also that the new raising and lowering operators obey the same commutation relations as the original ones.

A $3\times3$ unitary matrix has 9 independent real components, whereas a rotation in three dimensions is completely specified by only 3 independent Euler angles, so the U(3) symmetry of this Hamiltonian is much bigger than the SO(3) rotational invariance of a generic central potential. The fact that the Hamiltonian is invariant under the U(3) transformations (6.55) implies that there should be a total of nine conserved quantities corresponding to the nine generators of this symmetry. It's easy to check that these generators form the tensor operator T with components $T_{ij} = A_i^\dagger A_j$. We have [H, T] = 0 and hence each component $T_{ij}$ is conserved. To understand these operators, we decompose T into its irreducible parts as in section 6.2.1, obtaining

$$T_{ij} = \mathbf A^\dagger\!\cdot\mathbf A\;\frac{\delta_{ij}}{3} \;+\; \frac{A_i^\dagger A_j - A_j^\dagger A_i}{2} \;+\; \left[\frac{A_i^\dagger A_j + A_j^\dagger A_i}{2} - \mathbf A^\dagger\!\cdot\mathbf A\;\frac{\delta_{ij}}{3}\right]. \tag{6.56}$$

The trace $\mathbf A^\dagger\cdot\mathbf A$ is (up to a constant) just the Hamiltonian itself, which is certainly conserved. Recalling the definition (3.33) of the raising and lowering operators, the antisymmetric part can be combined into a vector with components

$$\sum_{j,k}\epsilon_{ijk}\,A_j^\dagger A_k = \frac{1}{2M\hbar\omega}\sum_{j,k}\epsilon_{ijk}\big(M\omega X_j - iP_j\big)\big(M\omega X_k + iP_k\big) = \frac{i}{2\hbar}\sum_{j,k}\epsilon_{ijk}\big(X_jP_k - P_jX_k\big) = \frac{i}{\hbar}\,L_i\,, \tag{6.57}$$
where in the last equality we used the fact that $\epsilon_{ijk}(X_jP_k - P_jX_k) = 2\,\epsilon_{ijk}X_jP_k$, with the ε-symbol ensuring that the components of $\mathbf X$ and $\mathbf P$ commute. Thus

$$\mathbf L = -i\hbar\,\big(\mathbf A^\dagger\times\mathbf A\big)\,, \tag{6.58}$$
so that the vector part of T is nothing but the usual orbital angular momentum operator for the 3d oscillator. It is no surprise that this is conserved.

The remaining five conserved quantities – the symmetric, traceless part $(T_{ij}+T_{ji})/2 - \delta_{ij}\sum_k T_{kk}/3$ of T – generate transformations that mix $\mathbf X$ and $\mathbf P$, appropriately rescaled. They're less familiar, as they're special to the simple harmonic oscillator. The five components of this traceless symmetric part correspond to the spin-2 part of the operator T.

We can see this enhanced degeneracy by examining the excited states of the oscillator. Just as in 1d, these may be obtained by acting on $|0\rangle$ with any component of the raising operators $\mathbf A^\dagger$. Since the ground state has energy $3\hbar\omega/2$ and each application of a raising operator increases the energy by one unit of $\hbar\omega$, each of the states

$$A_{i_N}^\dagger\cdots A_{i_2}^\dagger A_{i_1}^\dagger\,|0\rangle \tag{6.59}$$
has the same energy $(N + 3/2)\hbar\omega$, where we can choose any $i_1, i_2, \ldots, i_N \in \{1, 2, 3\}$. We often call the (correctly normalised) such state $|n_x, n_y, n_z\rangle$, where $n_x$ is the total number

of times we've applied the x-component of $\mathbf A^\dagger$, and similarly for $n_y$ and $n_z$. Of course, $N = n_x + n_y + n_z$. Since the components of the raising operators all commute with one another, we can consider the indices $i_1, \ldots, i_N$ in (6.59) to be symmetrized. Consequently, the degeneracy of the $N^{\rm th}$ energy level is the same as the number of independent components of a totally symmetric rank N tensor in 3 dimensions, which is $(N+1)(N+2)/2$. (Think 'stars & bars', or the number of ways to partition N into three non-negative integers $(n_x, n_y, n_z)$.) This is the larger degeneracy that we promised. It's easy to derive this degeneracy in a number of ways; our derivation here makes clear that the degenerate energy states transform in representations of the larger su(3) algebra.
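The counting is easy to confirm by brute force; a short sketch (ours):

# degeneracy of the N-th level of the 3d isotropic oscillator:
# count partitions N = nx + ny + nz and compare with (N+1)(N+2)/2
for N in range(8):
    count = sum(1 for nx in range(N + 1) for ny in range(N + 1 - nx))
    assert count == (N + 1) * (N + 2) // 2   # nz = N - nx - ny is then fixed
    print(N, count)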

6.3 Angular Momentum of the Isotropic Oscillator

We can find the spectrum of $H_\ell$ using raising and lowering operators, in a similar fashion to our treatment of the full H above. We introduce
$$A_\ell = \frac{1}{\sqrt{2\mu\hbar\omega}}\left(iP_r + \mu\omega R - \frac{(\ell+1)\hbar}{R}\right) \tag{6.60}$$
and a short computation using $[R, P_r] = i\hbar$ shows that
$$H_\ell = \hbar\omega\left(A_\ell^\dagger A_\ell + \ell + \frac32\right). \tag{6.61}$$

This is very similar to the expression (3.35) above, but now we find the commutation relations

[A_ℓ, A_ℓ†] = 1 + (ℓ+1)ℏ/(μωR²) = (H_{ℓ+1} − H_ℓ)/ℏω + 1   (6.62)

in place of (3.34) for the raising and lowering operators in Cartesian coordinates. In consequence, the commutator of H_ℓ with A_ℓ is

[A_ℓ, H_ℓ] = ℏω [A_ℓ, A_ℓ†A_ℓ] = ℏω [A_ℓ, A_ℓ†] A_ℓ = (H_{ℓ+1} − H_ℓ + ℏω) A_ℓ ,   (6.63)

or equivalently

H_{ℓ+1} A_ℓ − A_ℓ H_ℓ = −ℏω A_ℓ .   (6.64)

This is very close to showing that A_ℓ is a lowering operator, but the presence of H_{ℓ+1} means that the algebra does not close unless we consider all values of ℓ. To understand the implications of this, let |E(ℓ)⟩ be an eigenstate of H_ℓ, with H_ℓ|E(ℓ)⟩ = E|E(ℓ)⟩, where we emphasise that the energies depend on the total orbital angular momentum quantum number ℓ. Then

H_{ℓ+1} (A_ℓ|E(ℓ)⟩) = (E − ℏω) A_ℓ|E(ℓ)⟩ ,   (6.65)

which implies that A_ℓ|E(ℓ)⟩ is an eigenstate of H_{ℓ+1} with eigenvalue E − ℏω. The Hamiltonian H_{ℓ+1} is the one appropriate for the radial part of wavefunctions with total angular momentum labelled by ℓ + 1, so applying A_ℓ to |E(ℓ)⟩ has created a state with lower energy, but with a radial wavefunction corresponding to a state of greater orbital angular momentum.

Figure 15: The action of the operators A_ℓ and A_ℓ† on the states of the isotropic three-dimensional oscillator.

By a now familiar argument, in any H_ℓ eigenstate we have

E/ℏω − ℓ − 3/2 = ⟨E(ℓ)| ( H_ℓ/ℏω − ℓ − 3/2 ) |E(ℓ)⟩ = ⟨E(ℓ)| A_ℓ†A_ℓ |E(ℓ)⟩ = ‖A_ℓ|E(ℓ)⟩‖² ≥ 0 .   (6.66)

Thus, for a given energy E the maximum orbital angular momentum our state can have is

ℓ_max = E/ℏω − 3/2 .   (6.67)

As ℓ_max is a non-negative integer (since it is a possible value for ℓ at given E) we see that, in agreement with the previous subsection, the ground state energy is 3ℏω/2, which occurs when ℓ_max = 0, meaning that this ground state is necessarily spherically symmetric.

For any given energy, the state |E(ℓ_max)⟩ which saturates the bound (6.66) obeys A_{ℓ_max}|E(ℓ_max)⟩ = 0 and has the maximum possible total angular momentum out of all the states with this energy. It is thus the quantum equivalent of a circular orbit. In the position representation we have

0 = ⟨r| A_{ℓ_max} |E(ℓ_max)⟩ = (ℏ/√(2μℏω)) ( ∂/∂r + 1/r − (ℓ_max + 1)/r + μωr/ℏ ) ⟨r|E(ℓ_max)⟩   (6.68)

where we use |r⟩ to denote a state that is definitely located at radius r. This equation is solved by

⟨r|E(ℓ_max)⟩ = C r^{ℓ_max} e^{−r²/4r_0²}   (6.69)

where r_0 = √(ℏ/2μω) and C is a constant. Note that the exponential here is just the product of the exponentials we'd expect for Cartesian oscillators. This wavefunction varies with r, so quantum mechanically even a 'circular' orbit has some radial kinetic energy. In the limit of large ℓ_max in which classical mechanics applies, the radial kinetic energy is negligible compared to the tangential kinetic energy. The full wavefunction is ⟨r| (|E(ℓ_max)⟩ ⊗ |ℓ_max, m⟩) and so also involves the spherical harmonic Y_{ℓ_max}^m(θ, φ). Bearing in mind that d³x = r² sinθ dφ dθ dr, the radial probability density P(r) for this state is P(r) ∼ r^{2ℓ_max+2} e^{−r²/2r_0²}. For r/r_0 ≲ √(2ℓ_max + 2) this rises as r^{2ℓ_max+2}, and it falls as the

Gaussian takes over when r/r_0 > √(2ℓ_max + 2). The rms deviation in the radial location is ∼ r_0, which is a small fraction of the expected radial location ⟨R⟩ when ℓ_max is large.

We can obtain the radial wavefunctions of more eccentric orbits using A_ℓ†. Taking the adjoint of (6.64) we obtain

H_ℓ A_ℓ† − A_ℓ† H_{ℓ+1} = ℏω A_ℓ† .   (6.70)

Acting with this on |E(ℓ+1)⟩ we have

H_ℓ (A_ℓ†|E(ℓ+1)⟩) = (E + ℏω) (A_ℓ†|E(ℓ+1)⟩) ,   (6.71)

so that A_ℓ†|E(ℓ+1)⟩ is a state of increased energy, whose radial wavefunction is appropriate for orbital angular momentum ℓ. Thus A_ℓ†|E(ℓ+1)⟩ = c |(E+ℏω)(ℓ)⟩ for some constant c. The radial wavefunctions of the eccentric orbits follow by writing this equation in the position representation, though we won't perform this calculation here. The way the energy raising / angular momentum lowering operators A_ℓ† and the energy lowering / angular momentum raising operators A_ℓ act on the states is summarised in figure 15.

In the previous section we found that there were (N+2)(N+1)/2 degenerate states in the N-th excited level of the isotropic oscillator, each with energy E = (N + 3/2)ℏω. From the point of view of the present section, with energy E = (N + 3/2)ℏω we have ℓ_max = N and this degeneracy arises as

Σ_ℓ Σ_{m=−ℓ}^{ℓ} 1 = Σ_ℓ (2ℓ + 1) = (N + 1)(N + 2)/2 ,   (6.72)

where ℓ runs over ℓ_max, ℓ_max − 2, … down to 1 or 0 (each application of A_ℓ† changes the energy by one unit of ℏω and ℓ by one, so at fixed energy only ℓ of the same parity as N occur, as in figure 15), and there are 2ℓ + 1 possible states with total orbital angular momentum ℓ.
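As a cross-check on this counting (again an illustrative sketch, not part of the notes), one can confirm numerically that the parity-restricted sum in (6.72) reproduces the Cartesian count of the previous section:

```python
# Degeneracy of the N-th level two ways: summing 2l+1 over l = N, N-2, ...
# versus the Cartesian 'stars & bars' count (N+1)(N+2)/2.
for N in range(10):
    radial_count = sum(2 * l + 1 for l in range(N, -1, -2))
    assert radial_count == (N + 1) * (N + 2) // 2
    print(N, radial_count)
```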

6.4 The Radial Hamiltonian for the Coulomb Potential

For some purposes, it's useful to study hydrogen from a point of view that emphasises the fact that the potential is central, so energy eigenstates may also be chosen to have definite angular momentum. In the first problem set you showed that the kinetic part of the Hamiltonian could be written as

T = (1/2μ) ( P_r² + L²/R² ) ,   (6.73)

where R² = X·X and P_r = ½(X̂·P + P·X̂) is the radial momentum. Thus, when acting on an orbital angular momentum eigenstate |ℓ, m⟩, the Hamiltonian of a hydrogenic atom with atomic number⁶⁰ Z reduces to

H_ℓ = P_r²/2μ + ℓ(ℓ+1)ℏ²/(2μR²) − Ze²/(4πε₀R) .   (6.74)

As in IA Dynamics & Relativity, this is just what we'd find for a particle moving in one dimension under the influence of the effective potential

V_eff(R) = ℓ(ℓ+1)ℏ²/(2μR²) − Ze²/(4πε₀R) .   (6.75)

⁶⁰The atomic number is the number of protons in the atom's nucleus.

H_ℓ governs the oscillations of the particle around the minima of this potential. Note that since H_ℓ depends only on the radius R and radial momentum P_r, in position space its eigenstates will determine the radial dependence R_E(r) = ⟨r|E(ℓ)⟩ of the full wavefunction. A basis of eigenstates |E, ℓ, m⟩ of the full Hamiltonian H can be chosen to be (tensor) products of the eigenstates |E(ℓ)⟩ of H_ℓ and the eigenstates |ℓ, m⟩ of L² and L_z, whose position space wavefunction Y_ℓ^m(θ, φ) = ⟨r̂|ℓ, m⟩ is a spherical harmonic, as in section 5.4.1.

All three terms in H_ℓ must have the same dimensions, and in particular the ratio of the two terms in V_eff must be dimensionless. Thus

(ℏ²/μR²) × (4πε₀R/e²) = (4πε₀ℏ²/μe²) × (1/R)  is dimensionless.   (6.76)

It follows that the Bohr radius

a_0 ≡ 4πε₀ℏ²/μe²   (6.77)

is the characteristic length scale of the hydrogen atom. We have

H_ℓ = (ℏ²/2μ) ( P_r²/ℏ² + ℓ(ℓ+1)/R² − 2Z/(a_0R) )   (6.78)

in terms of a_0. Let's define a new operator A_ℓ by

A_ℓ = (a_0/√2) ( (i/ℏ)P_r − (ℓ+1)/R + Z/((ℓ+1)a_0) ) .   (6.79)

Then one finds

A_ℓ†A_ℓ = (a_0²/2) ( −(i/ℏ)P_r − (ℓ+1)/R + Z/((ℓ+1)a_0) ) ( (i/ℏ)P_r − (ℓ+1)/R + Z/((ℓ+1)a_0) )
        = (a_0²/2) { P_r²/ℏ² + ( Z/((ℓ+1)a_0) − (ℓ+1)/R )² + (i/ℏ) [P_r, (ℓ+1)/R] } .   (6.80)

Using the commutator [P_r, R] = −iℏ this gives

A_ℓ†A_ℓ = (a_0²/2) ( P_r²/ℏ² + Z²/((ℓ+1)²a_0²) + (ℓ+1)²/R² − 2Z/(a_0R) − (i/ℏ)((ℓ+1)/R²)[P_r, R] )
        = (a_0²/2) ( P_r²/ℏ² + Z²/((ℓ+1)²a_0²) + ℓ(ℓ+1)/R² − 2Z/(a_0R) )   (6.81)
        = (a_0²μ/ℏ²) H_ℓ + Z²/(2(ℓ+1)²) .

Had we instead evaluated A_ℓA_ℓ†, the only difference would have been the sign in front of the commutator, so we would instead find

A_ℓA_ℓ† = (a_0²μ/ℏ²) H_{ℓ+1} + Z²/(2(ℓ+1)²) ,   (6.82)

which now involves the Hamiltonian appropriate for states whose orbital angular momentum is given by ℓ + 1. Rearranging (6.81) allows us to write the radial Hamiltonian as

H_ℓ = (ℏ²/μa_0²) ( A_ℓ†A_ℓ − Z²/(2(ℓ+1)²) ) ,   (6.83)

while taking the difference of (6.81) and (6.82) gives the commutation relations

[A_ℓ, A_ℓ†] = (a_0²μ/ℏ²) (H_{ℓ+1} − H_ℓ) .   (6.84)

The final result we need is the commutation relation

[A_ℓ, H_ℓ] = (ℏ²/μa_0²) [A_ℓ, A_ℓ†A_ℓ] = (ℏ²/μa_0²) [A_ℓ, A_ℓ†] A_ℓ = (H_{ℓ+1} − H_ℓ) A_ℓ ,   (6.85)

where the final step uses (6.84). Cancelling the H_ℓA_ℓ term on both sides, this simplifies to

A_ℓ H_ℓ = H_{ℓ+1} A_ℓ ,   (6.86)

and we also have A_ℓ†H_{ℓ+1} = H_ℓA_ℓ† by taking the Hermitian conjugate.

Now let's understand these operators. Let |E, ℓ⟩ be an eigenstate of H_ℓ and L², with H_ℓ|E, ℓ⟩ = E|E, ℓ⟩ and L²|E, ℓ⟩ = ℓ(ℓ+1)ℏ²|E, ℓ⟩. Then

H_{ℓ+1} (A_ℓ|E, ℓ⟩) = A_ℓH_ℓ|E, ℓ⟩ = E (A_ℓ|E, ℓ⟩) ,   (6.87)

so that A_ℓ|E, ℓ⟩ is an eigenstate of H_{ℓ+1} with the same energy E. Thus, acting on |E, ℓ⟩, A_ℓ creates a new state with the same energy, but where the effective potential corresponds to the angular momentum quantum number ℓ + 1. Acting on the new state A_ℓ|E, ℓ⟩ with A_{ℓ+1} again creates a new state with the same energy, but even more orbital angular momentum labelled by ℓ + 2, and so on. We already know that our energy levels are not infinitely degenerate, so this process must eventually terminate. That is, just as in the harmonic oscillator, there must be a maximum value ℓ_max such that

A_{ℓ_max}|E, ℓ_max⟩ = 0 ,   (6.88)

and again, by analogy with the classical case where a circular orbit has the highest possible orbital angular momentum for fixed energy, we can view this state as the quantum analogue of a circular orbit. Taking the norm and using equation (6.81) we have

0 = ‖A_{ℓ_max}|E, ℓ_max⟩‖² = ⟨E, ℓ_max| A_{ℓ_max}†A_{ℓ_max} |E, ℓ_max⟩ = (a_0²μ/ℏ²) E + Z²/(2(ℓ_max+1)²) ,   (6.89)

or equivalently

E = −(Z²ℏ²/2μa_0²) (1/n²) ,   (6.90)

where n ≡ ℓ_max + 1 labels the energy level. This agrees with the spectrum of the Hamiltonian computed earlier using the Runge–Lenz vector (or indeed in IB QM). The virtue

of the Runge–Lenz derivation is that it makes transparent the role of the additional, dynamical symmetry of hydrogen, explaining why the energy levels fall into multiplets of so(3) × so(3). The virtue of the present derivation is that it makes transparent how the degeneracy arises: aside from simply rotating our system, we can trade radial kinetic energy and Coulomb potential for different orbital kinetic energy whilst keeping the total energy constant.
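The operator identity (6.81) at the heart of this derivation can be verified symbolically in the position representation. The sketch below (illustrative only, not part of the notes; it assumes sympy and the representation P_r = −iℏ(∂_r + 1/r) used above) applies A_ℓ†A_ℓ to an arbitrary radial function and checks that the result agrees with (a_0²μ/ℏ²)H_ℓ + Z²/2(ℓ+1)² acting on the same function:

```python
import sympy as sp

r, a0, hbar, mu, Z, l = sp.symbols('r a_0 hbar mu Z ell', positive=True)
f = sp.Function('f')(r)  # an arbitrary radial wavefunction

def Pr(psi):
    # radial momentum P_r = -i*hbar*(d/dr + 1/r) in the position representation
    return -sp.I * hbar * (sp.diff(psi, r) + psi / r)

def A(psi):
    # the lowering operator A_l of (6.79)
    return a0 / sp.sqrt(2) * (sp.I / hbar * Pr(psi) - (l + 1) / r * psi
                              + Z / ((l + 1) * a0) * psi)

def Adag(psi):
    # its Hermitian conjugate (P_r is self-adjoint w.r.t. the measure r^2 dr)
    return a0 / sp.sqrt(2) * (-sp.I / hbar * Pr(psi) - (l + 1) / r * psi
                              + Z / ((l + 1) * a0) * psi)

def H(psi):
    # the radial Hamiltonian H_l of (6.78), written in terms of a_0
    return hbar**2 / (2 * mu) * (Pr(Pr(psi)) / hbar**2
                                 + l * (l + 1) / r**2 * psi
                                 - 2 * Z / (a0 * r) * psi)

residual = Adag(A(f)) - (a0**2 * mu / hbar**2 * H(f)
                         + Z**2 / (2 * (l + 1)**2) * f)
print(sp.simplify(sp.expand(residual)))  # prints 0, confirming (6.81)
```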

7 Identical Particles

One of the most striking facts that helps us understand the Universe is that, apart from details of their motion, all electrons are identical. This fact, though convenient, has no explanation within Quantum Mechanics. In Quantum Field Theory, one learns that particles are simply quantised modes of excitation of a field. We thus replace the vast number of electrons our Universe contains with a single quantum field that has hugely many excitations. From this perspective, the fact that all elementary particles of the same species are identical is more or less a tautology, no more surprising than the fact that one factor of x in the monomial xⁿ is just the same as another.

7.1 Bosons and Fermions

Consider a system consisting of N particles, individually described by some state in a Hilbert space ℋ_a where a = 1, …, N, and suppose |α_a⟩ ∈ ℋ_a gives a complete specification of the state of the a-th particle, so α_a collectively labels all the quantum numbers describing the particle's energy, orbital angular momentum, spin etc.. In the case that the particles are distinguishable – meaning they are each of a different species, such as an electron, a proton or a neutron – a complete set of states of the full system is then labelled by linear combinations of states

|α_1; α_2; …; α_N⟩ = |α_1⟩|α_2⟩⋯|α_N⟩ ∈ ⊗_{a=1}^N ℋ_a ,   (7.1)

in accordance with the discussion of section 2.3.1. However, suppose particles 1 and 2 are indistinguishable; for example, they might both be electrons. The state (7.1) thus describes a situation where there is one electron with quantum numbers given by α_1 and another whose same properties are labelled by α_2. Since both electrons are identical, it can make no difference which is which. Consequently, there can be no physical difference between this state and the state in which these two particles are exchanged. This does not imply that the two states are identical, but does imply that they can only differ by a phase:

|α_2; α_1; …⟩ = e^{iφ} |α_1; α_2; …⟩ ,   (7.2)

where φ is independent⁶¹ of the particular values of the labels α_a, but may depend on the species of particle we exchange. If we exchange the pair of particles once again we find

|α_1; α_2; …⟩ = e^{iφ} |α_2; α_1; …⟩ = e^{2iφ} |α_1; α_2; …⟩ ,   (7.3)

⁶¹There's a (non-examinable!) exception to this in the case of two spatial dimensions, where the phase may depend on the homotopy class of the path the particles take whilst they are being exchanged. In 2+1 dimensions, these paths can become braided (tangled) and the multi-particle state transforms in a representation of the braid group, first studied in the maths literature by Emil Artin. This leads to the possibility of anyonic statistics, which play an important role in some condensed matter systems. Fortunately, we're interested in quantum mechanics on R³, and in higher dimensions it turns out the paths can always be disentangled, so anyons do not arise.

using the fact that the phase is independent of the particular values of the quantum numbers α_a. Thus e^{2iφ} = 1, so

e^{iφ} = ±1 .   (7.4)

The choice of sign depends only on the species of the particle. Identical particles which are symmetric under exchange are called⁶² bosons, while those that are antisymmetric are said to be fermions.

If all N of the particles in our system are identical, then the argument immediately extends to say that the state must be either completely symmetric or completely antisymmetric under exchange of any pair, the choice being determined by whether the particles are bosons or fermions. That is, for bosons the state of the system is actually described by a vector

|α_1; α_2; …; α_N⟩ ∈ Sym^N ℋ ⊂ ℋ^{⊗N} ,   (7.5)

whereas for fermions the state lies in the antisymmetric part

|α_1; α_2; …; α_N⟩ ∈ ⋀^N ℋ ⊂ ℋ^{⊗N} .   (7.6)

As a simple example, a pair of identical fermions with quantum numbers α_1 and α_2 live in the antisymmetric state

|ψ⟩_fermi = (1/√2) ( |α_1; α_2⟩ − |α_2; α_1⟩ ) ,   (7.7)

whereas the system would have to be described by the symmetric combination

|ψ⟩_bose = (1/√2) ( |α_1; α_2⟩ + |α_2; α_1⟩ )   (7.8)

if instead these particles were identical bosons.

A further fact that also comes from quantum field theory is that particles with integer spin are always bosons, whilst particles with (odd) half-integer spin are always fermions. Among the elementary particles, examples of bosons thus include the W and Z bosons (hence the name!) and the photon, while examples of fermions include electrons, neutrinos and quarks. You'll learn about how this connection between spin and statistics arises if you take the Part III QFT course.

The above considerations apply to composite particles as well as to elementary ones. When we exchange a pair of identical composite particles, we exchange all of their constituents. Thus, if the composites are made up from an odd number of fermions and any number of bosons, upon exchange we'll acquire an odd number of minus signs, so the whole state must be antisymmetric. For example, a proton consists of three (valence) quarks and many gluons, so is a fermion. So too is a neutron. On the other hand, composites consisting of an even number of fermions and any number of bosons must be symmetric under

⁶²They're named after Satyendra Nath Bose and Enrico Fermi, respectively, who first studied their properties.

exchange with an identical composite. For example, a hydrogen atom consists of a proton and an electron, each of which is a fermion, so the composite hydrogen atom is itself a boson. This behaviour is consistent with addition of angular momentum. When we add the angular momenta of an odd number of particles each with odd half-integer spin, the total angular momentum is also an odd half-integer, whereas when we combine the angular momenta of an even number of particles each with odd half-integer spin, the total angular momentum is an integer. Even without quantum field theory, it would be impossible for all integer spin particles to be fermions, because a composite consisting of an even number of such integer spin particles would also have integer spin, but would be a boson.
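The two-particle combinations (7.7) and (7.8) are simple enough to check directly. A small numpy sketch (illustrative only; the single-particle states are random stand-ins) builds them as tensor products and confirms that the exchange operator has eigenvalue −1 on the fermionic state and +1 on the bosonic one:

```python
import numpy as np

d = 4                       # dimension of a toy single-particle Hilbert space
rng = np.random.default_rng(0)
a1, a2 = rng.normal(size=d), rng.normal(size=d)   # two single-particle states

both = np.kron(a1, a2)      # |alpha_1; alpha_2>
swapped = np.kron(a2, a1)   # |alpha_2; alpha_1>
fermi = (both - swapped) / np.sqrt(2)
bose = (both + swapped) / np.sqrt(2)

def exchange(psi):
    # exchanging the particles swaps the two tensor factors
    return psi.reshape(d, d).T.reshape(d * d)

assert np.allclose(exchange(fermi), -fermi)  # antisymmetric: eigenvalue -1
assert np.allclose(exchange(bose), +bose)    # symmetric: eigenvalue +1
```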

7.1.1 Pauli's Exclusion Principle

The fact that the state of identical fermions must be totally antisymmetric has an immediate, striking consequence: no two fermions can be put in exactly the same state simultaneously. This fact is known as Pauli's exclusion principle. As an example, since we must always have ψ_{σ,σ'}(x, x') = −ψ_{σ',σ}(x', x), if both fermions happen to have the same S_z eigenvalue we find ψ_{σ,σ}(x, x') = −ψ_{σ,σ}(x', x). Hence the spatial wavefunction of the system necessarily vanishes at x = x', so there is zero probability that the two particles are located at the same place.

More generally, the state |Ψ⟩ of N identical fermions is represented in terms of the determinants

⟨α_1, α_2, …, α_N|Ψ⟩ = (1/√N!) det
    | ψ_1(α_1)  ψ_1(α_2)  ⋯  ψ_1(α_N) |
    | ψ_2(α_1)  ψ_2(α_2)  ⋯  ψ_2(α_N) |
    |    ⋮          ⋮      ⋱     ⋮    |
    | ψ_N(α_1)  ψ_N(α_2)  ⋯  ψ_N(α_N) |   (7.9)

that in this context are known as Slater determinants. Again, |α_1, α_2, …, α_N⟩ represents a state in which some fermion has quantum numbers α_1, another has α_2 and so on, where the labels α_i denote a complete set of quantum numbers for single particle states, including (say) their momentum, spin-z value, etc., while ψ_1(α_1) = ⟨α_1|ψ_1⟩ is the amplitude for the 'first' fermion to have quantum numbers α_1. If α_i = α_j for any i ≠ j, then two columns of this determinant coincide. The amplitude ⟨…, α_i, …, α_i, …|Ψ⟩ thus vanishes identically, so there is zero probability of any two fermions being found in an identical state.

It's worth stressing that we've learned only that the state of a pair of identical fermions must be antisymmetric under exchange of every one of their quantum numbers. The Hilbert space of a single such spin-s fermion is itself a tensor product ℋ_fermion = ℋ_spat ⊗ C^{2s+1}, including a factor that describes the fermion's spatial wavefunction (including details of its likely position, orbital angular momentum etc.) and a separate factor ℋ_spin ≅ C^{2s+1} spanned by the possible spin states {|s⟩, |s−1⟩, …, |−s⟩} of this spin-s particle. We must exchange both the spin and spatial parts of the state in order to find a physically equivalent state.
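Numerically, a Slater amplitude is literally a determinant, so its antisymmetry and the exclusion principle follow from familiar properties of determinants. A brief illustrative sketch (assuming numpy; the single-particle amplitudes are random stand-ins):

```python
import numpy as np
from math import factorial, sqrt

N = 3
rng = np.random.default_rng(1)
psi = rng.normal(size=(N, 5))   # psi[i, a]: amplitude for |psi_i> to carry
                                # quantum numbers alpha_a (toy numbers)

def slater(labels):
    # <alpha_labels | Psi> = det[ psi_i(alpha_j) ] / sqrt(N!), as in (7.9)
    return np.linalg.det(psi[:, labels]) / sqrt(factorial(N))

amp = slater([0, 1, 2])
assert np.isclose(slater([1, 0, 2]), -amp)  # exchange flips the sign
assert np.isclose(slater([0, 0, 2]), 0.0)   # repeated state: amplitude vanishes
```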

In the example of our electron pair, ψ_{σ,σ'}(x, x') is certainly equivalent to ψ_{σ',σ}(x', x), since both correspond to the electron at x having spin σ and the electron found at x' having spin σ'. However, antisymmetry of the total state does not imply any relation between the states ψ_{σ,σ'}(x, x') and ψ_{σ',σ}(x, x'): these are distinct physical possibilities because the electron at x has different spin in the two cases. We saw in section 6.1.2 that the spin wavefunctions of the two electrons could be combined into a triplet of spin-1 states

|1, 1⟩ = |↑⟩|↑⟩ ,  |1, 0⟩ = (1/√2)( |↑⟩|↓⟩ + |↓⟩|↑⟩ ) ,  |1, −1⟩ = |↓⟩|↓⟩ ,   (7.10)

which are symmetric, and a single antisymmetric state

|0, 0⟩ = (1/√2)( |↑⟩|↓⟩ − |↓⟩|↑⟩ )   (7.11)

of spin zero. Because the overall state of the electron pair must be antisymmetric, the spatial wavefunction must be antisymmetric if the electron pair is in any of the spin-1 states, while the spatial wavefunction must be symmetric if the electron pair is in the spin-0 state. Thus, for a given spatial wavefunction ψ(x, x') the pair has four possible states, and the states

(1/√2)( ψ(x, x') − ψ(x', x) ) ⊗ { |1, 1⟩, |1, 0⟩, |1, −1⟩ }  and  (1/√2)( ψ(x, x') + ψ(x', x) ) ⊗ |0, 0⟩   (7.12)

span a basis of the two-electron Hilbert space.

In general, as always, an electron pair can be in a superposition of the four states in (7.12), with different spatial wavefunctions in each term of the superposition. Similarly, an N-fermion system may be in a state |Ψ⟩ that is a superposition of different states |α_1, α_2, …, α_N⟩, with different (distinct) values of the α_i's, where each of the amplitudes ⟨α_1, α_2, …, α_N|Ψ⟩ is given in terms of the single-particle states by Slater determinants (7.9).

7.1.2 The Periodic Table

This section closely follows section 4.5 of Weinberg.

One of the outstanding successes of the early years of quantum mechanics was an understanding of the structure of the Periodic Table of the Elements. The exact dynamics of a heavy atom, with large atomic number Z, is very complicated because each electron not only feels an attractive Coulomb potential −Ze²/r from the Z protons in the nucleus, but also a repulsive electromagnetic interaction from all the other electrons. However, it turns out that a reasonable approximation – known as the Hartree approximation – is to imagine that each electron moves in a (roughly) central potential V(r) created by the net effect of the protons and the other electrons, while neglecting the fact that the response of any given electron to this potential will itself modify the potential. Near the nucleus, V(r) ≈ −Ze²/r as the electron feels the full strength of the attraction of the protons, while

at large distances V(r) ≈ −e²/r since the protons' charge is shielded by the (Z − 1) other electrons.

As for any central potential, [L², H] = 0 and [L, H] = 0, so we can label the single-electron states by their values of ℓ and m, together with another integer n labelling the energy levels. However, because V(r) is not precisely ∝ 1/r, unlike the pure Coulomb potential, states with different ℓ do not have identical energy. Because V(r) ∼ 1/r near the nucleus, the single-electron wavefunctions behave as r^ℓ for small r, just as in hydrogen. This means that states with high values of ℓ are less likely to be found at small r where the potential is deepest, and so typically have slightly larger energy than those of lower ℓ, compared to the pure Coulomb case. Numerical calculations show that the states of roughly equal energy are

1s, 2s, 2p, 3s, 3p, 4s, 3d, 4p, 5s, 4d, 5p, 6s, 4f, 5d, 6p, 7s, 5f, 6d, 7p, …

with energy increasing as one proceeds down the list. Here, the number corresponds to the label n, while s, p, d, f, … stand for sharp, principal, diffuse, fundamental, … and simply correspond to ℓ = 0, 1, 2, 3, …. This historic notation is standard in atomic spectroscopy, which is still of great importance in the pharmaceutical industry.

Since electrons have spin-½ they are fermions. Pauli's exclusion principle prevents the electrons in a heavy atom from all burying down into the lowest energy 1s state, and forces them to gradually fill up the higher energy levels. There are 2(2ℓ + 1) distinct states with orbital angular momentum ℓ, with the extra factor of 2 coming from the two possible spin states of the electron. Thus, the number of possible states in each line of the above table is 2, 2 + 6 = 8, 2 + 6 = 8, 2 + 10 + 6 = 18, 2 + 10 + 6 = 18 and 2 + 14 + 10 + 6 = 32, etc.

The first two elements, hydrogen and helium, have electrons just in the ground state 1s. The 8 elements from lithium to neon have electrons in the n = 1 and n = 2 states, while the next 8 elements from sodium to argon have electrons in states with n = 1, 2, 3. The first excited states of hydrogen and helium are obtained by promoting (one of) their 1s electrons up to the 2s state, and lie 10.2 eV and 19.8 eV above the corresponding ground states. These energy differences correspond to the frequency of UV radiation, so this transition cannot be stimulated by optical frequency radiation. By contrast, the first excited state of lithium is obtained by promoting a 2s electron to a 2p state, and lies a mere 1.85 eV above its ground state. This corresponds to the frequency of rather deep red light. Elements that lie beyond helium play an important role in astronomical measurements, even though they are present only in trace amounts compared to hydrogen and helium, because their absorption spectra contain lines at easily observed optical frequencies.

The chemical properties of an element are largely determined by the number of electrons

in its highest energy level, since these are least tightly bound. Atoms with no electrons outside filled energy levels are particularly stable chemically. These elements are called noble gases and include helium (Z = 2), neon (Z = 2 + 8 = 10), argon (Z = 2 + 8 + 8 = 18), krypton (Z = 2 + 8 + 8 + 18 = 36), xenon (Z = 2 + 8 + 8 + 18 + 18 = 54) and radon (Z = 2 + 8 + 8 + 18 + 18 + 32 = 86).

Elements with either a few more or a few less electrons than required to fill a shell have their chemical properties determined by this number, known as the valence and counted positive for extra electrons and negative for fewer. If there is just one electron in the highest energy level, then this electron is easily stripped away, so the element is chemically reactive. These are the alkali metals and include lithium (Z = 2 + 1 = 3), sodium (Z = 2 + 8 + 1 = 11), potassium (Z = 2 + 8 + 8 + 1 = 19) etc. If a large number of such atoms combine to form a solid crystal, then it is energetically favourable for the valence electrons to be shared throughout the solid rather than clinging to their original atom. These electrons thus form a sort of fluid that is free to flow within the crystal when stimulated by a small external force. This gives the crystal high electrical and high thermal conductivity, making it a metal. (You'll study this in much more detail if you take the Applications of Quantum Mechanics course next term.) Atoms with two electrons more than the noble gases are also chemically reactive, though not as reactive as the alkalis. They are known as the alkaline metals and include beryllium (Z = 2 + 2 = 4), magnesium (Z = 2 + 8 + 2 = 12), calcium (Z = 2 + 8 + 8 + 2 = 20) etc.

On the other hand, atoms with one electron missing from their highest energy level tend to strongly attract other electrons and so are chemically reactive non-metals, often reacting violently when brought into contact with metals. These elements are called halogens and include fluorine (Z = 2 + 8 − 1 = 9), chlorine (Z = 2 + 8 + 8 − 1 = 17), bromine (Z = 2 + 8 + 8 + 18 − 1 = 35), iodine (Z = 2 + 8 + 8 + 18 + 18 − 1 = 53) etc. Elements with two electrons missing from their highest energy level include oxygen (Z = 2 + 8 − 2 = 8), sulfur (Z = 2 + 8 + 8 − 2 = 16), etc. These elements are again chemically reactive, though less so than the halogens.

The inclusion of the 4f and 5f states in the sixth and seventh energy levels, respectively, is responsible for the long sequence of rare earths in the middle of the periodic table. Numerical calculations show that the wavefunctions of the 2(2·3 + 1) = 14 different 4f states have small probability to lie outside the wavefunctions of the two 6s states, despite having slightly higher energy. Consequently, these elements are all rather similar chemically. The same is true of the 14 different 5f states compared to the wavefunctions of the two 7s states. The 14 elements in which the 4f states are being filled run from lanthanum (Z = 2 + 8 + 8 + 18 + 18 + 2 + 1 = 57) to ytterbium (Z = 70) and are known as lanthanides, while the next chemically similar sequence are the actinides, running from actinium (Z = 2 + 8 + 8 + 18 + 18 + 32 + 2 + 1 = 89) to nobelium (Z = 102). Beyond this point the nuclei themselves become so unstable that the element tends to undergo radioactive decay before it has a chance to participate in any chemical reaction.
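Since each orbital nℓ holds 2(2ℓ + 1) electrons, the noble-gas atomic numbers quoted above follow mechanically from the filling order. A short illustrative Python sketch (the row grouping follows the list of levels given above):

```python
# Capacity of each orbital is 2*(2l+1); rows of the periodic table group the
# filling order 1s | 2s 2p | 3s 3p | 4s 3d 4p | 5s 4d 5p | 6s 4f 5d 6p.
L = {'s': 0, 'p': 1, 'd': 2, 'f': 3}
rows = [['1s'], ['2s', '2p'], ['3s', '3p'], ['4s', '3d', '4p'],
        ['5s', '4d', '5p'], ['6s', '4f', '5d', '6p']]

Z = 0
for row in rows:
    Z += sum(2 * (2 * L[orb[1]] + 1) for orb in row)
    print(Z)   # prints 2, 10, 18, 36, 54, 86: He, Ne, Ar, Kr, Xe, Rn
```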


Figure 16: The Periodic Table of the Elements. For formatting reasons, the lanthanides and actinides are traditionally shown below the rest of the table. (Figure from Wikimedia.)

7.1.3 White Dwarfs, Neutron Stars and Supernovae

7.2 Exchange and Parity in the Centre of Momentum Frame

Let's now consider the effects of exchanging two identical particles on their spatial wavefunction. Letting X_1, P_1 and X_2, P_2 denote the position and momentum operators of the two particles, exchanging 1 ↔ 2 implies that

X_com = (X_1 + X_2)/2 ↦ (X_2 + X_1)/2 = X_com
P_com = P_1 + P_2 ↦ P_2 + P_1 = P_com ,   (7.13)

whilst

X_rel = X_1 − X_2 ↦ X_2 − X_1 = −X_rel
P_rel = (P_1 − P_2)/2 ↦ (P_2 − P_1)/2 = −P_rel .   (7.14)

Thus, exchange acts trivially on the centre-of-momentum coordinates, but acts on the relative coordinates just like a parity transformation. Since Y_ℓ^m(−x) = (−1)^ℓ Y_ℓ^m(x) under parity, we see that if two identical particles have relative orbital angular momentum ℓ, the spatial part of the wavefunction will be either symmetric or antisymmetric under exchange according to whether ℓ is even or odd. The behaviour of the entire state under exchange is determined by whether the particles in question are bosons or fermions. In the case of identical bosons, the spins must be combined to form a symmetric overall spin state when ℓ is even, or an antisymmetric overall spin state when ℓ is odd, so as to ensure the overall state is always symmetric under exchange. For fermions, the opposite holds so as to ensure overall antisymmetry. In particular, if a pair of identical neutrons is produced in a state where their relative orbital angular momentum is ℓ in the centre of momentum frame, their spatial wavefunction acquires a factor of (−1)^ℓ under exchange of the two neutrons.

7.2.1 Identical Particles and Inelastic Collisions

The requirement that the state describing N identical bosons/fermions be totally symmetric/antisymmetric under exchange of any pair of identical particles has important consequences in collision processes where such particles are created or destroyed.

For example, consider an exotic type of 'atom' consisting of a spin-0 pion (denoted by π⁻) bound electromagnetically to a deuterium nucleus (denoted by D⁺ – itself consisting of a proton and a neutron, bound together by the strong nuclear force). The bound state energy levels of this atom due to the electromagnetic Coulomb attraction between the π⁻ and D⁺ have the same form as in hydrogen and, in particular, the ground state is |1, 0, 0⟩, an s-wave having ℓ = 0. Because the pion is much heavier than the electron, the Bohr radius of the D⁺π⁻ 'atom' is much smaller than that of hydrogen and the π⁻ wavefunction closely hugs the D⁺ nucleus. The D⁺ is known to have spin-1 whilst the π⁻ is spin zero, so the 1s ground state of the D⁺π⁻ 'atom' has total angular momentum j = 1.

Now, as well as their electromagnetic interactions, the π⁻ and D⁺ also interact via the short-range strong nuclear force. This causes the 'atom' to be unstable, with the π⁻ rapidly

being absorbed by the D⁺, causing the system to disintegrate into a pair of neutrons (each denoted N):

π⁻ + D⁺ ⟶ N + N .   (7.15)

(In particle physics terminology, processes such as this, where different types of particles appear in the initial and final states, are known as inelastic. An elastic process is one in which the initial & final states contain the same particles.)

Neutrons are fermions with spin-½, so the final state must be antisymmetric under their exchange. One possibility is for their spins to combine into one of the triplet

|↑⟩|↑⟩ ,  (1/√2)( |↑⟩|↓⟩ + |↓⟩|↑⟩ ) ,  |↓⟩|↓⟩

of symmetric spin-1 states, combined with a spatial wavefunction that is antisymmetric under exchange. From above, this will be the case if the relative orbital angular momentum of the two neutrons is odd. The other possibility is for the spins to form the antisymmetric spin-0 state

(1/√2)( |↑⟩|↓⟩ − |↓⟩|↑⟩ )

combined with a relative spatial wavefunction of even ℓ.

Total angular momentum is conserved, so the spin and orbital angular momentum of the final state must combine to give j = 1, as for the initial ground state of the π⁻D⁺. In section 6 we learnt that when we combine systems with angular momenta ℓ and s, the combined system could have total angular momentum j ∈ {ℓ + s, ℓ + s − 1, …, |ℓ − s|} depending on the relative alignment of the two subsystems. Thus, in the case that the neutrons' spins combine to the antisymmetric spin-0 state, then j = ℓ. However, since this case requires ℓ even for fermionic statistics, we see it is ruled out. The remaining case is that the neutrons' spins combine to give net spin-1, so that

j ∈ { ℓ + 1, ℓ, ℓ − 1 } .

Since fermionic statistics here require that ℓ is odd, the only possibility to get j = 1 is if ℓ = 1 and the spin and orbital angular momenta of the neutron pair are neither perfectly aligned nor perfectly anti-aligned. Thus the final state has j = ℓ = s = 1.

These considerations also help us to determine the intrinsic parity of the pion. Recall from section 4.6 that the parity operator Π acts on a state |x⟩ as

Π|x⟩ = η|−x⟩ ,   (7.16)

where the value of η ∈ {+1, −1} is known as the intrinsic parity that, like spin, depends on the type of particle that the state |x⟩ represents. More generally, if |x_1, κ_1; x_2, κ_2; ⋯; x_N, κ_N⟩ describes an N-particle state in which a particle of type κ_a is definitely located at x_a, then the parity operator acts as

Π|x_1, κ_1; x_2, κ_2; ⋯; x_N, κ_N⟩ = η_1η_2⋯η_N |−x_1, κ_1; −x_2, κ_2; ⋯; −x_N, κ_N⟩ ,   (7.17)

where η_k = ±1 is the intrinsic parity of the k-th particle. (This holds whether or not the particles are distinguishable.) Like the spin of fundamental particles, these intrinsic parities are independent of any details of the spatial wavefunction. Intrinsic parities thus have no effect in elastic processes, where the same particles are present in both the initial and final states. The intrinsic parity of a particle can often be determined by examining inelastic processes in which the particle participates.

In the example π⁻D⁺ → NN above, equating the parities of the initial and final states gives

η_π η_D = (−1)^ℓ η_N² = −1 ,   (7.18)

since the initial atomic state |1, 0, 0⟩ has ℓ = 0 and hence no parity other than the intrinsic parities of the pion and deuteron. Provided parity is conserved by the strong nuclear interactions causing this decay, we conclude that the π⁻ and D⁺ must have opposite intrinsic parity. Now, the D⁺ is predominantly an s-wave bound state of a proton and neutron, so η_D = (−1)^ℓ η_P η_N = η_P η_N, and furthermore the proton and neutron can always be chosen to have the same intrinsic parity since they are related by an 'isospin' symmetry⁶³. Thus η_D = +1 and hence the pion must have intrinsic parity η_π = −1.

As we mentioned in section 4.5.1, one of the great surprises of particle physics came in the 1950s. Cosmic rays were found to contain various types of particles, including two of similar mass called the τ⁺ and the θ⁺. Both particles decay quickly, the predominant channels being

θ⁺ ⟶ π⁺ + π⁰  and  τ⁺ ⟶ π⁺ + π⁺ + π⁻ .   (7.19)

The angular distribution of the pions in the final states could be observed in a cloud chamber or bubble chamber, and studying these patterns showed that the pions were produced in an ℓ = 0 state in both cases. Since η_π = −1 (irrespective of the electric charge of the pion, for the same isospin reason as above), the θ⁺ should have intrinsic parity η_θ = η_π² = +1, while the τ⁺ should have η_τ = −1. However, as measurements improved the masses and lifetimes of the two particles became indistinguishable. The puzzle was resolved in 1956 when Lee and Yang proposed that the two particles were in fact one and the same, but that parity was not conserved in their decay process. Their proposal was largely ignored until Lee persuaded his colleague, Madame Chien-Shiung Wu, to test it using the β-decay of a certain isotope of cobalt. Wu's experiments showed that parity is indeed violated in the weak interactions. The fact that x → −x is not a symmetry of Nature can be accommodated in QFT, though the deep reason for it remains mysterious⁶⁴.

⁶³The nucleon has an (approximate) 'internal' SU(2) symmetry known as isospin that rotates protons into neutrons; they are different states of the same nucleon, somewhat like the two different spin states |↑⟩, |↓⟩ of the same electron.

⁶⁴There are various natural ways for parity violation to originate in string theory, but needless to say, none of them have been verified experimentally.
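The angular momentum bookkeeping in the π⁻D⁺ → NN argument is compact enough to enumerate by machine. An illustrative sketch (not part of the notes) that scans the (s, ℓ) combinations allowed by Fermi statistics for two identical neutrons and keeps those that can carry j = 1:

```python
# Two identical neutrons: total spin s=0 needs even l (symmetric spatial part),
# s=1 needs odd l. Keep combinations that can carry total j = 1.
j_target = 1
for s in (0, 1):
    for l in range(4):
        if (s == 0 and l % 2 == 0) or (s == 1 and l % 2 == 1):
            if j_target in range(abs(l - s), l + s + 1):
                print(f"s={s}, l={l} can give j=1")
# Only s=1, l=1 survives: the final state has j = l = s = 1, as argued above.
```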

8 Perturbation Theory I: Time Independent Case

We've now come about as far as we can (in this course) relying purely on symmetry principles. The dynamics of systems of genuine physical interest is rarely simple enough to hope that we can solve it exactly, just using the general constraints of symmetry. In these circumstances we need to make some sort of approximation, treating our system as being 'close to' some other system whose dynamics is sufficiently simple to be controllable. That is, we treat the difference between our actual system, whose experimental properties we care about, and our model system, whose description is simple enough that we can handle it, as a perturbation. The whole art of a theoretical physicist lies in striking this balance; if one's model system is too simplistic, it might not provide a reasonable guide to the behaviour of the real case⁶⁵, while if the model is overly complicated it may itself prove impossible to understand.

There's nothing inherently quantum mechanical about the need to approximate a complicated system by a simpler one, but, fortunately, perturbation theory in quantum mechanics often turns out to be considerably easier than in classical dynamics, largely because of the vector space nature of ℋ. In this chapter and the next, we'll study various techniques to handle quantum mechanical perturbation theory, beginning with the simplest cases.

8.1 An Analytic Expansion

Let H be the Hamiltonian of the experimental system we wish to understand, and H_0 be the Hamiltonian of our model system whose eigenstates and eigenvalues we already know. We hope that ∆H = H − H_0 is in some sense 'small', so that it may be treated as a perturbation. More specifically, we look for a parameter — let's call it λ — that our true Hamiltonian depends on, such that at λ = 0, H = H_0. For λ ∈ [0, 1], define

H_λ = H_0 + λ∆H .   (8.1)

We can think of H_λ as the Hamiltonian of an apparatus that is equipped with a dial that allows us to vary λ. At λ = 0 the system is our model case, and at λ = 1 it's the case of genuine interest.

We now seek the eigenstates |E_λ⟩ of H_λ. This may look as though we've made the problem even harder — we now need to find the eigenstates not just of our model and experimental systems, but of a 1-parameter family of interpolating systems. Our key assumption is that since H_λ depends analytically on λ, so too do its eigenstates. In essence, this amounts to the assumption that small changes in the system will lead to only small changes in the outcome. Every mountain climber knows that this assumption can be dreadfully false, and we'll see that it can easily fail in QM too, but for now let's see where it takes us.

⁶⁵Milk production at a dairy farm was low, so the farmer wrote to the local university, asking for help from academia. A multidisciplinary team of professors was assembled, headed by a theoretical physicist, and two weeks of intensive on-site investigation took place. The scholars then returned to the university, notebooks crammed with data, where the task of writing the report was left to the team leader. Shortly thereafter the physicist returned to the farm, saying to the farmer, "I have the solution, but it works only in the case of spherical cows in a vacuum".

If indeed |E_λ⟩ depends analytically on λ, then we can expand it as

|E_λ⟩ = |α⟩ + λ|β⟩ + λ²|γ⟩ + ⋯   (8.2)

and similarly expand the eigenvalues

E(λ) = E^(0) + λE^(1) + λ²E^(2) + ⋯ .   (8.3)

Plugging these expansions into the defining equation H_λ|E_λ⟩ = E(λ)|E_λ⟩ we obtain

(H_0 + λ∆H) ( |α⟩ + λ|β⟩ + λ²|γ⟩ + ⋯ ) = ( E^(0) + λE^(1) + λ²E^(2) + ⋯ ) ( |α⟩ + λ|β⟩ + λ²|γ⟩ + ⋯ ) .   (8.4)

Since we require this to hold as λ varies, it must hold for each power of λ separately. Thus we find an infinite system of equations

H_0|α⟩ = E^(0)|α⟩ ,
H_0|β⟩ + ∆H|α⟩ = E^(0)|β⟩ + E^(1)|α⟩ ,   (8.5)
H_0|γ⟩ + ∆H|β⟩ = E^(0)|γ⟩ + E^(1)|β⟩ + E^(2)|α⟩ ,
  ⋮

and so on.

The first of these equations simply states that |α⟩ is an eigenstate of the model Hamiltonian H_0 with eigenvalue E^(0). This is not surprising; under our analyticity assumptions the terms of O(λ⁰) are all that would survive when the dial is set to λ = 0, which is indeed the model system. Henceforth, we relabel |α⟩ → |n⟩ and E^(0) → E_n to reflect this understanding. (The notation |n⟩ is intended to imply the n-th energy eigenstate of the model Hamiltonian H_0, with eigenvalue E_n; we will distinguish the higher-order corrections to these states with superscripts.)

To determine the first-order correction E_n^(1) to the n-th energy level of the unperturbed Hamiltonian, we contract the second of equations (8.5) with ⟨α| ≡ ⟨n| to find

⟨n|H_0|β⟩ + ⟨n|∆H|n⟩ = E_n⟨n|β⟩ + E_n^(1) .   (8.6)

Since (as for any Hamiltonian) the model Hamiltonian is Hermitian,

⟨n|H_0|β⟩ = (⟨β|H_0|n⟩)* = E_n⟨n|β⟩ ,   (8.7)

so in the case that the unperturbed state is |n⟩, equation (8.6) becomes

E_n^(1) = ⟨n|∆H|n⟩ .   (8.8)

In other words, to first order in λ, the change in the energy of our system as we move away from the model system is given by the expectation value ⟨∆H⟩_n of the change in the Hamiltonian when the system is in its original state |n⟩.

To find the perturbed state, to first order in λ we must understand |β⟩. We can expand |β⟩ in the complete set {|n⟩} of eigenstates of the original system as

|β⟩ = Σ_n b_n |n⟩ .   (8.9)

Using this expression in the second of equations (8.5) and contracting with ⟨m| (where |m⟩ ≠ |n⟩, the initial state we're perturbing) gives

(E_n − E_m) b_m = ⟨m|∆H|n⟩ ,   (8.10)

and so, provided E_m ≠ E_n,

b_m = ⟨m|∆H|n⟩ / (E_n − E_m) .   (8.11)

This will hold provided the energy levels of our model system are non-degenerate; we'll examine how to handle the more general case including degeneracy in section 8.2. In the non-degenerate case, equation (8.11) determines all the expansion coefficients b_m in |β⟩ except b_n. Fortunately, one can argue that b_n = 0 from the requirement that |E_λ⟩ remains correctly normalised — I'll leave this as an exercise, but the essential idea is that if we move a point on a unit sphere, then to first order r·δr = 0, so that the variation is only non-zero in directions orthogonal to the original vector. With b_n = 0 we have

|β⟩ = Σ_{m≠n} ( ⟨m|∆H|n⟩ / (E_n − E_m) ) |m⟩   (8.12)

as the first-order perturbation of the state when the unperturbed state is |n⟩.

We can also examine the second-order perturbation, E_n^(2), of the n-th energy level. To do so, contract the third of equations (8.5) with ⟨n|. Using the facts that ⟨n|H_0|γ⟩ = E_n⟨n|γ⟩ and ⟨n|β⟩ = 0 we have

E_n^(2) = ⟨n|∆H|β⟩ = Σ_{m≠n} ⟨n|∆H|m⟩⟨m|∆H|n⟩ / (E_n − E_m)
        = Σ_{m≠n} |⟨n|∆H|m⟩|² / (E_n − E_m) ,   (8.13)

using our expression (8.12) for |β⟩. We could go on to higher order in λ, next finding |γ⟩ in terms of the original states {|n⟩} and then finding the third-order energy shift E_n^(3) etc., but in practice the summations become increasingly messy and we hope (!) that the first few terms already provide a good guide to the behaviour of the system near λ = 1. (In high-energy quantum field theory, modern experiments typically cost many millions of dollars, so it's especially important to have extremely accurate theoretical predictions to compare to. Consequently, there's a whole industry of people whose life's work is to compute higher and higher order terms in perturbation series such as these.) Fortunately, in the Tripos you'll never be asked to do anything beyond 2nd order.

Combining our results shows that, to second order in λ, the energy levels of the perturbed system are given by

E_n(λ) = E_n + λ⟨n|∆H|n⟩ + λ² Σ_{m≠n} |⟨n|∆H|m⟩|² / (E_n − E_m) + O(λ³) ,   (8.14)

where E_n are the energies of our model system. Recall that this expression – much beloved of Tripos Examiners – is derived under the assumptions i) that the new energies E(λ) and new states |E_λ⟩ are analytic at λ = 0 and ii) that the model system is non-degenerate, so E_m = E_n iff |m⟩ = |n⟩.

Let's now take a look at the use of this formula in a number of examples.
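Formula (8.14) is easy to test numerically against exact diagonalisation. The following illustrative sketch (assuming numpy; the matrices are random stand-ins for H_0 and ∆H) compares the second-order prediction with the exact ground state energy of H_λ for a small λ:

```python
import numpy as np

rng = np.random.default_rng(2)
dim, lam, n = 6, 0.05, 0
E = np.sort(rng.uniform(0, 10, dim))        # non-degenerate model energies E_n
H0 = np.diag(E)
V = rng.normal(size=(dim, dim))
dH = (V + V.T) / 2                          # a Hermitian perturbation

# second-order perturbation theory, eq. (8.14), in the H0 eigenbasis
E_pt = (E[n] + lam * dH[n, n]
        + lam**2 * sum(dH[n, m]**2 / (E[n] - E[m])
                       for m in range(dim) if m != n))

E_exact = np.linalg.eigvalsh(H0 + lam * dH)[n]
print(E_pt, E_exact)   # agree up to O(lambda^3) corrections
```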

8.1.1 Fine Structure of Hydrogen

Our treatment of the hydrogen atom in the last chapter assumed the electron was moving non-relativistically in the Coulomb field of the proton. Since |E|/μc² = α²/2n² ≪ 1, non-relativistic quantum mechanics should indeed be a good approximation. Nonetheless, better agreement with experiment is obtained by describing the electron using the relativistic Dirac equation⁶⁶. The energy levels obtained by assuming non-relativistic motion in a pure Coulomb potential are often called the gross structure of the hydrogen atom, whilst the small relativistic corrections to these energies implied by the Dirac equation are known as fine structure. In section 6.2.3, we understood that gross structure energies are independent of ℓ because of an enhanced symmetry of the pure Coulomb potential, generated by the Runge–Lenz operator. We thus expect that relativistic corrections will lift the degeneracy among states of different ℓ.

The Dirac equation itself is beyond the scope of this course, but some of its consequences are easy to understand in perturbation theory. Firstly, expanding the relativistic dispersion relation around the non-relativistic limit gives

E = √(p²c² + μ²c⁴) ≈ μc² + p²/2μ − p⁴/8μ³c² + ⋯ .   (8.15)

This shows that relativistic effects alter the kinetic energy by

∆H = −p⁴/8μ³c²   (8.16)

at first order. From (8.8), the first-order shift in the energy of state |n, ℓ, m⟩ of hydrogen due to this modified kinetic term is thus

E¹_{n,ℓ} = ⟨n, ℓ, m|∆H|n, ℓ, m⟩ .   (8.17)

(In making this claim, we should note that our perturbation (8.16) is rotationally invariant, so in particular [L_z, ∆H] = 0 and [L², ∆H] = 0. Therefore ⟨n, ℓ', m'|∆H|n, ℓ, m⟩ = 0 unless

⁶⁶A better approximation still — in precise agreement with the most accurate measurements ever performed in any branch of science — comes from quantum field theory.

ℓ' = ℓ and m' = m. Thus our perturbation does not mix degenerate states of the gross structure, so non-degenerate perturbation theory is sufficient.)

To compute (8.17), first write ∆H = −(H_0 − V(r))²/2μc², where H_0 is the gross structure Hamiltonian and V(r) the usual Coulomb potential. Then

E¹_{n,ℓ} = −(1/2μc²) ( (E_n)² − 2E_n⟨V(r)⟩ + ⟨V(r)²⟩ ) ,   (8.18)

where E_n = −½μc²α²/n² is the original energy of our state and ⟨V(r)⟩ = ⟨n, ℓ, m|V(r)|n, ℓ, m⟩. The virial theorem for the 1/r potential tells us that 2⟨T⟩ = −⟨V⟩, so ⟨V⟩ = 2E_n. Next, to compute ⟨V²⟩, observe that this 1/r² term just modifies the effective potential due to the electron's orbital angular momentum. Consider the Hamiltonian for the radial part of the hydrogen atom, with an effective potential

V_eff(r) = ℏ²ℓ(ℓ+1)/(2μr²) + α/r² − e²/(4πε₀r) = ℏ²ℓ'(ℓ'+1)/(2μr²) − e²/(4πε₀r) .   (8.19)

Here we've introduced ℓ' to absorb the α/r² term into the angular momentum contribution. Of course ℓ' does not correspond to any actual angular momentum; it's just a trick to help us compute ⟨V²⟩. With this effective potential, repeating the calculations of section 6.4 would lead to just the same energy levels as before, but now in terms of ℓ'. That is, in place of (6.90) we'd find

E(ℓ') = −½μc²α² (1/(ℓ'+1)²) ,   (8.20)

using our new (non-integer) ℓ'. This is the exact result for a hydrogen atom with an additional 1/r² term in its potential. However, in our perturbative context, it's only appropriate to keep the answer accurate to first order – there may be other effects we haven't yet accounted for (such as expanding (8.15) further) that contribute at higher order. Expanding ℓ' around ℓ we find

E(ℓ') = E_n + ½μc² ( α⁴ / (n³(ℓ+½)) ) + ⋯ ,   (8.21)

where n = ℓ + 1. Combining this with the other terms in (8.18) gives

E¹_{n,ℓ} = −½μc² (α⁴/n⁴) ( n/(ℓ+½) − 3/4 )   (8.22)

as the first-order change in the energy of state |n, ℓ, m⟩ due to the relativistic correction to kinetic energy. Notice that this effect is suppressed by an additional power of α² ∼ v²/c² compared to the gross structure, and that the degeneracy among states of different ℓ is lifted, as we expected.

There's a second consequence of the Dirac equation which comes in at the same order as the above kinetic terms. Due to its spin, the electron has a magnetic dipole moment

m = −(e/2μ) S .   (8.23)

When placed in a magnetic field, this dipole has energy U = −m·B. Relativistically, the electron in a hydrogen atom does experience a magnetic field because it moves through the electric field E produced by the proton. If the electron has velocity v = p/(μγ), then classically the Lorentz transformation of E produces a magnetic field

B = (γ/c²) v × E = (1/μc²) p × ( e r̂ / 4πε₀r² ) = −(e/4πε₀μc²) (1/r³) L .   (8.24)

(A careful treatment of this argument must also account for the electron's gyromagnetic ratio g ≈ 2 and for Thomas precession, the purely kinematic relativistic precession of the electron's instantaneous rest frame; the two effects combine to produce the coefficient below, and both emerge automatically from the Dirac equation.) Consequently, in QM the Hamiltonian of hydrogen receives a further fine-structure correction

H_SO = −m·B = (αℏ/2μ²c) (1/r³) S·L ,   (8.25)

known as spin–orbit coupling. Since the coefficient of the operator S·L is positive, spin–orbit coupling lowers the total energy when the spin and orbital angular momentum are antiparallel. In particular, S·L annihilates any state of hydrogen in which ℓ = 0, since then all components of L act trivially. As an important special case, spin–orbit coupling does not affect the ground state. However, excited states with ℓ ≠ 0 do generically feel the effects of spin–orbit coupling.

We first compute the effect of acting on our electron states with L·S. To do this, clearly we must include a specification of the electron's spin. This could be done using the basis

|n, ℓ, m⟩ ⊗ |↑⟩  and  |n, ℓ, m⟩ ⊗ |↓⟩ ,

but it turns out to be more convenient to instead combine the electron's orbital and spin angular momenta⁶⁷ into J = L + S and label states by |n, j, m_j; ℓ⟩, where j = ℓ + ½ or j = ℓ − ½ are the two possible values for the combined angular momentum, and |m_j| ≤ j. To make use of this, we write

S·L = ½ ( J² − L² − S² ) .   (8.26)

Thus

S·L |n, j, m_j; ℓ⟩ = (ℏ²/2) ( j(j+1) − ℓ(ℓ+1) − 3/4 ) |n, j, m_j; ℓ⟩
                  = (ℏ²/2) ℓ |n, j, m_j; ℓ⟩        when j = ℓ + ½ ,
                  = −(ℏ²/2) (ℓ+1) |n, j, m_j; ℓ⟩   when j = ℓ − ½ .   (8.27)

Consequently, according to our general expression (8.8), the first-order change in the energy of the |n, j, m_j; ℓ⟩ state of hydrogen due to spin–orbit coupling is

E¹_{n,j;ℓ} = (ℏ²/4μ²c²) (e²/4πε₀) { ℓ or −(ℓ+1) } ⟨1/r³⟩_{n,j;ℓ} ,   (8.28)

⁶⁷Since [J, L²] = 0, it's possible to find a basis of simultaneous eigenstates of J², J_z and L². In this basis, we do not necessarily know the z-components of the electron's orbital and spin angular momenta separately.

with the factor of ℓ or −(ℓ+1) chosen depending on whether j = ℓ ± ½.

To compute the expectation value ⟨1/r³⟩_{n,j;ℓ} we could just perform the integral

∫_{R³} |ψ_{n,ℓ,m}(x)|² (1/r³) d³x .

However, there's a nifty trick that allows us to short-circuit the evaluation of this integral. Recall that for states of definite ℓ, the Hamiltonian for the unperturbed atom can be written

H_ℓ = P_r²/2μ + ℏ²ℓ(ℓ+1)/(2μr²) − e²/(4πε₀r) ,   (8.29)

where P_r = (X̂·P + P·X̂)/2 is the radial momentum, given by

−iℏ ( ∂/∂r + 1/r )

in the position representation. Now, the expectation value ⟨[P_r, H_ℓ]⟩ = 0 when evaluated in any normalizable eigenstate of H_ℓ. In our case, the commutator evaluates to

[P_r, H_ℓ] = −iℏ ( −ℏ²ℓ(ℓ+1)/(μr³) + e²/(4πε₀r²) )   (8.30)

and therefore, provided ℓ ≠ 0,

⟨1/r³⟩_{n,j;ℓ} = (1/(a_0 ℓ(ℓ+1))) ⟨1/r²⟩_{n,j;ℓ} = (1/a_0³) ( 1/(ℓ(ℓ+½)(ℓ+1) n³) ) ,   (8.31)

where a_0 = ℏ/(αμc) is the Bohr radius and the second equality follows from our previous result (8.21) for ⟨1/r²⟩. Putting this result together with the other factors in (8.28) shows that the first-order energy shifts due to spin–orbit coupling are

E¹_{n,j;ℓ} = +½ α⁴μc² (1/2n³) ( 1/((ℓ+½)(ℓ+1)) )   (8.32a)

when j = ℓ + ½, whereas when j = ℓ − ½ we have

E¹_{n,j;ℓ} = −½ α⁴μc² (1/2n³) ( 1/(ℓ(ℓ+½)) ) .   (8.32b)

(We also recall that E¹_{n,j;ℓ} = 0 if ℓ = 0.) As anticipated, states whose spin and orbital angular momenta are anti-aligned (j = ℓ − ½) have lower energy and so are more tightly bound than in the pure Coulomb case, whereas if the spin and orbital angular momentum are aligned with each other (j = ℓ + ½) the state is less tightly bound. Notice also that the deviation from a pure Coulombic interaction partially lifts the degeneracy between states of different ℓ.

Equations (8.32a)–(8.32b) show that the spin–orbit contribution to the energy is suppressed by a factor of α² compared to the gross structure energy −½μα²c²/n². This is the same suppression as we found for the P⁴ correction to the kinetic term, so it does not

really make sense to consider these two effects on the energy separately. Combining (8.22) with (8.32a) & (8.32b), we find the net energy levels

" 2 ! # 1 2 2 1 α 3 1 En,j;` = µα c 2 3 1 + (8.33) −2 n − n 4n − j + 2 ··· incorporating fine structure correct to order α4. This formula holds regardless of whether j = ` + 1 or j = ` 1 . Even better, although the spin–orbit term only contributes only 2 − 2 when ` = 0, it turns out that there is another term68 that contributes only when ` does 6 equal zero! Even better, this remaining term has the effect that (8.33) is the correct fine 1 structure energy levels even for states with ` = 0 where j = 2 just from the spin. Proceeding down the periodic table, if we attach a single electron to an atomic nu- cleus containing Z protons, the bare Coulomb interaction increases by a factor of Z. The formula (8.33) remains valid provided we replace α e2 by Zα. Hydrogen has Z = 1, ∼ so in the n = 1 ground state the fine structure contribution is suppressed by a factor of α2 1/(137)2 compared to the gross structure, justifying our treatment of the fine struc- ∼ ture as a small perturbation. However, for heavier elements with higher Z, the suppression is only by (Zα)2 and the small value of α can be negated by a sufficiently large Z. In particular, from (8.33) we see that the energy difference between the states with principal quantum number n but j = ` 1 is ± 2 4 4 1 2 1 Z α E 1 E 1 = µc 3 . (8.34) n,`+ 2 ;` − n,`− 2 ;` 2 n `(` + 1)

This gives the splitting between the two 2p states of hydrogen to be 4.53 × 10⁻⁵ eV, in excellent agreement with the measured value of 4.54 × 10⁻⁵ eV. These splittings reach around 10% of the gross energy as one reaches the middle of the periodic table, making perturbative treatments less reliable.
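The quoted number is quick to reproduce from (8.34). A small illustrative check (standard values of α and the electron rest energy; the reduced-mass distinction between μc² and m_ec² is neglected here):

```python
alpha = 1 / 137.035999        # fine structure constant
mu_c2 = 510998.95             # electron rest energy in eV (mu ~ m_e here)
n, l, Z = 2, 1, 1             # the 2p level of hydrogen

# eq. (8.34): splitting between j = l + 1/2 and j = l - 1/2
dE = 0.5 * mu_c2 * Z**4 * alpha**4 / n**3 / (l * (l + 1))
print(f"{dE:.3e} eV")         # ~ 4.53e-05 eV, as quoted in the text
```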

8.1.2 Hyperfine Structure of Hydrogen

A proton is a spin-½ particle, so like the electron it has a magnetic dipole moment m_p proportional to its spin. A magnetic dipole m_p placed at the origin generates its own magnetic field

B = (2μ₀/3) m_p δ³(x) + (μ₀/4πr³) ( 3(m_p·r̂)r̂ − m_p ) ,

which is (roughly) the field you saw traced out by iron filings when first playing with magnets. The dipole moment m_e of the electron sits inside this field, leading to a new contribution

H_hfs = −m_e·B   (8.35)

⁶⁸This remaining relativistic effect is known as the Darwin term and gives a contribution (απℏ³/2μ²c) δ³(x). The origin of this term is rather subtle, and properly lies within QFT rather than QM. Because of the δ-function, it only affects states which have non-zero probability to be found at r = 0. The only such states are those with ℓ = 0, since all others are prevented from reaching the origin by the effective potential ℓ(ℓ+1)ℏ²/(2mr²).

depending on both dipoles. We'll just investigate the effect of this perturbation on the ℓ = 0 states of hydrogen, where it turns out that only the first term ∼ δ³(x) in B has a non-vanishing contribution. For these ℓ = 0 states, one finds

H_hfs = (4/3) (m/M) α⁴μc² (1/n³ℏ²) S·I ,   (8.36)

where I is the spin operator for the proton, labelled using a different letter just to avoid confusion with the electron spin operator S. The splittings in energy levels caused by this perturbation are suppressed not just by a factor of α⁴, but by an additional factor of the electron-to-proton mass ratio m/M ≈ 1/1836. The splittings caused by accounting for nuclear spin are thus much smaller than those of the fine structure considered above, and the perturbation (8.35) is known as the hyperfine contribution to the energy levels.

We can compute the eigenvalues in much the same way as for the spin–orbit contribution to fine structure. Let F = I + S denote the total spin of the proton and electron. Then

S·I = ½ ( F² − I² − S² ) = (ℏ²/2) ( f(f+1) − 3/2 ) .   (8.37)

This is −3ℏ²/4 when the proton and electron spins are antiparallel so f = 0, and +ℏ²/4 when f = 1. Plugging these into (8.36) shows that the difference between the energies for the n = 1 ground state of hydrogen is

E_{n=1;f=1} − E_{n=1;f=0} = (4/3) (m/M) α⁴μc² ≈ 5.88 × 10⁻⁶ eV ,   (8.38)

corresponding to a frequency of 1.4 GHz and a wavelength of 21 cm. Thus transitions between the energy levels split by the hyperfine structure cause the hydrogen atom to have a microwave emission line.

The 21 cm line provides the most powerful way of tracing diffuse gas in interstellar and intergalactic space. This wavelength is much longer than the size of typical specks of dust, so 21 cm radiation can propagate with little absorption right through clouds of dust and gas that do absorb visible light. It was radiation at 1.4 GHz that first revealed the large-scale structure of our galaxy. The line is intrinsically very narrow, so the temperature and radial (with the Earth as origin) velocity of the hydrogen that emitted the radiation can be accurately measured from the Doppler shift and broadening of the observed spectral line.

The hyperfine structure of hydrogen leading to this 21 cm emission line was first predicted theoretically in 1944 by the Dutch physicist H.C. van de Hulst as part of his doctoral thesis, carried out in Utrecht which was then under Nazi occupation. 21 cm radiation from our galaxy was first detected experimentally in 1951 by three groups working independently in the USA, Australia and the Netherlands. The Dutch group used a German radio antenna left over from the war.
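The conversion from the quoted energy splitting to the celebrated frequency and wavelength is elementary arithmetic, checked in the illustrative sketch below (standard values of h and c; note, as a caveat, that reproducing the 5.88 × 10⁻⁶ eV figure itself from first principles also requires the proton's gyromagnetic factor g_p ≈ 5.59, absorbed into the proportionality between m_p and I):

```python
h = 4.135667696e-15   # Planck's constant in eV s
c = 2.99792458e8      # speed of light in m/s

dE = 5.88e-6                      # hyperfine splitting of (8.38), in eV
nu = dE / h                       # transition frequency
print(f"{nu/1e9:.2f} GHz")        # ~ 1.42 GHz
print(f"{100*c/nu:.1f} cm")       # ~ 21.1 cm
```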

8.1.3 The Ground State of Helium After hydrogen, helium is the next most abundant element, making up about a quarter of the ordinary matter in the Universe. Much of this helium was created during the period of nucleosynthesis, a tiny fraction of a second after the Big Bang.

– 115 – Let’s use perturbation theory to find an estimate of the ground state energy of helium. Working in the centre of mass frame and treating the nucleus as stationary at the origin, the (gross structure) Hamiltonian is P2 P2 2e2 2e2 e2 1 H = 1 + 2 + (8.39) 2m 2m − 4π0 X1 − 4π0 X2 4π0 X1 X2 | | | | | − | where (X1, X2) are the position operators and (P1, P2) the momentum operators for the two electrons. The terms describing the electrons’ kinetic energy and attraction to the nucleus are just the sum of those for Hydrogen, except with e2 replaced by 2e2 since there are two protons in the nucleus of helium. We’ll take these terms to be the model Hamiltonian H0, treating the remaining electron-electron repulsion as a perturbation. Before proceeding, we should ask what is the small, dimensionless parameter in which we’re performing our perturbative expansion. One might hope that, as for the fine structure 2 of hydrogen, this is the fine structure constant α = e /~c 1/137. However, it’s clear ≈ that this cannot be the case here, because both the electron–electron interaction and the electron–proton interactions both scale as e2 α. Furthermore, the virial theorem says that ∼ in any state Ψ , for the 1/r hydrogenic Coulomb potential, 2 T Ψ = V Ψ so the kinetic | i h i −h i terms in the hydrogenic term are likewise of order α. In fact, we’ll perform a perturbation in the inverse atomic number, taking λ = 2/Z and treating this as a continuous parameter. We’re interested in values λ (0, 1] corresponding to Z ( , 2]. Note that when Z 1, ∈ ∈ ∞  corresponding to a nucleus with a large positive charge, the two electrons’ attraction to this nucleus should certainly overwhelm their mutual repulsion. In the unperturbed Hamiltonian, single–electron states are just the usual hydrogenic states n, `, m with energy | i 1 1 4 mc2α2 , − × 2 n2 where the factor of 4 arises because each electron’s attraction to the helium nucleus is twice as strong as it was for hydrogen. Since electrons are fermions, the total state of the neutral helium atom must be antisymmetric under their exchange. In particular, the ground state is 1 Ψ0 = 1, 0, 0 1, 0, 0 ( ) (8.40) | i | i ⊗ | i ⊗ √2 |↑ i|↓ i − |↓ i|↑ i 1 where the final factor is the antisymmetric spin state of the two spin- 2 electrons. (Note that, at the level of the gross structure of helium, the Hamiltonian (8.39) is independent of the spins of the two electrons, so we are free to seek simultaneous energy and spin eigenstates.) This state is an eigenstate of the model Hamiltonian H0 with energy 2 2 4α 2 4α 2 2 2 E1 = mc mc = 4α mc 108.8 eV , (8.41) − 2 − 2 − ≈ − being the sum of the energies of the two electrons individually. We now take account of the electron–electron repulsion. To use perturbation theory, we set 2 2 e 1 α~c λ = and ∆H = = . (8.42) Z 4π0 X1 X2 X1 X2 | − | | − |

– 116 – Our ∆H is chosen so that i) it is independent of the expansion parameter and ii) λ∆ is precisely the electron–electron potential when λ = 1, appropriate for helium. Following the general results above, the first order shift in the ground state energy is Z 0 0 0 0 3 0 3 0 3 3 Ψ0 ∆ Ψ0 = Ψ0 x , x x , x ∆H(X1, X2) x1, x2 x1, x2 Ψ0 d x d x d x1 d x2 h | | i h | 1 2ih 1 2| | ih | i 1 2 Z α~c 3 3 = Ψ0(x1, x2) Ψ0(x1, x2) d x1 d x2 x1 x2 Z | − | α~c 2 2 3 3 = ψ100(x1) ψ100(x2) d x1 d x2 , x1 x2 | | | | | − | (8.43) Here ψ100(x) are the single–electron n = 1 wavefunctions

3   2 1 2 −|x|/a2 ψ100(x) = e , (8.44) √π a2

~ where the length scale a2 = 2αmc is half the Bohr radius in hydrogen. In going to the last line of (8.43), we’ve used the fact that the interaction ∆ is independent of spin, and the spin states in (8.40) have norm 1. Performing the integral is straightforward but somewhat tedious69, and isn’t the sort of thing I’m going to ask you to reproduce in an exam. One finds

5 2 2 Ψ0 ∆H Ψ0 = α mc (8.45) h | | i 4 As we suspected, this perturbation is also of order α2 just like the leading–order term. Including the factor of λ = 2/Z, to first–order in perturbation theory the ground state of helium is   2 2 5 2 E1(1/Z)Z=2 = 4α mc 1 + 74.8 eV . (8.46) − − 16 Z ··· Z=2 ≈ −

69 In case you’re curious: Choose the z-axis of x2 to be aligned with whatever direction x1 is. Then p 2 2 |x1 − x2| = |r1 + r2 − 2r1r2 cos θ2|, independent of φ2. So " # Z Z 2 −2r2/a2 4α~c 2 r2 sin θ2 e 3 hΨ0|∆H|Ψ0i = |ψ100(x1)| dr2 dθ2 d x1 . a3 p 2 2 2 |r1 + r2 − 2r1r2 cos θ2| The integral in square brackets can be done using

Z π Z π q ( sin θ2 dθ2 1 d 2 2 2/r1 when r1 > r2 = |r + r − 2r1r2 cos θ2| dθ2 = . p 2 2 r r dθ 1 2 0 |r1 + r2 − 2r1r2 cos θ2| 1 2 0 2 2/r2 when r1 < r2

Thus the radial integral dr2 must be broken into two regions. Letting ρ1 := 2r1/a2 be a rescaled radial coordinate, we have

8α c Z Z r1 r2 Z ∞  hΨ |∆|Ψ i = ~ |ψ (x )|2 2 e−2r2/a2 dr + r e−2r2/a2 dr d3x 0 0 a3 100 1 r 2 2 2 1 2 0 1 r1 Z −ρ1 Z ∞ 4α~c 2 2 − e (2 + ρ1) 3 2α~c −ρ1 −ρ1  5α~c = |ψ100(x1)| d x1 = ρ1 e 2 − e (2 + ρ1) dρ1 = . a2 ρ1 a2 0 4a2 This is the value that was used in the text. And no, I’m not going to derive the second–order result for you.

– 117 – Figure 17: First ionisation energies of the ground states of the first few elements. Figure by Agung Karjono at Wikipedia.

This is in significantly better agreement with the experimental value of 79.0 eV than ≈ − our zeroth–order estimate (8.41). Including also the second–order term70 leads to   2 2 5 1 25 1 3 E1(1/Z) = 4α mc 1 + + (1/Z ) (8.47) − − 8 Z 256 Z2 O giving E1(1/2) 77.5 eV. Given the crudity of our expansion this is in remarkably close ≈ − agreement with the experimental value. In heavy atoms, the ground state energy itself is difficult to measure experimentally since it involves computing the energy required to strip off all the electrons from the nucleus. Fortunately, for the same reason it’s also rarely the directly relevant quantity. Instead, chemists and spectroscopists are usually more interested in the first ionisation energy, defined to be the energy required to liberate a single electron from the atom. Hydrogen only has one electron, so its first ionization energy is just the 13.6 eV which we must supply to eject this electron from the ground state of the atom. In the case of helium, once one electron is stripped away the remaining electron sees the full force of the Coulomb potential from the Z = 2 nucleus, so will have binding energy 54.4 eV. The first ionisation − energy is thus the difference ( 54.4 + 79.0) eV = 24.6 eV. This is significantly greater ≈ − than the ionisation energy of hydrogen, and in fact helium has the greatest first ionisation energy of any of the elements, reflecting its status as the first noble gas (see figure 17).

8.1.4 The Quadratic Stark Effect We’ll now consider a different type of perturbation, caused not by relativistic corrections to the Hamiltonian of the isolated atom, but by allowing the atom to interact with some external system.

70We could calculate this using the result (8.13) for the second-order energy shift. I’ll just quote the answer.

– 118 – Suppose a hydrogen atom is placed in an external electric field E = ∇φ. Most electric − fields we can generate in the laboratory are small compared to the strength of the Coulomb field felt by the electron due to the nucleus71, so perturbation theory should provide a good estimate of the energy shifts such an external field induces. By definition of the electrostatic potential Φ, the field changes the energy of the atom by δE = e(φ(xp) φ(xe)) where xp − and xe are the locations of the proton and electron. We assume the external field varies only slowly over scales of order the Bohr radius, so

δE e r ∇φ = e r E (8.48) ≈ − · − · where r = xe xp. Let’s choose the direction of the electric field to define the z-axis, − and set = E so as to avoid confusion with the energy. Thus E = zˆ and the effect of E | | E the external electric field is to add a new term to the Hamiltonian H0 of the unperturbed atom: H = H0 e X3 (8.49) − E where X3 is the position operator for the z-coordinate of the electron relative to the proton. In this section, we’ll just consider the effect of this electric field on the ground state n, `, m = 1, 0, 0 of the hydrogen atom. This state is non-degenerate for the unperturbed | i | i Hamiltonian (and our perturbation is blind to spin), so again non-degenerate perturbation theory will suffice. From (8.8) the first order change in the ground state energy is

(1) E = e 1, 0, 0 X3 1, 0, 0 , (8.50) 1 − E h | | i −1 However, applying the parity operator Π and noting that Π X3Π = X3 we have − −1 1, 0, 0 X3 1, 0, 0 = 1, 0, 0 Π X3Π 1, 0, 0 = 1, 0, 0 X3 1, 0, 0 , (8.51) h | | i −h | | i −h | | i where the last equality uses the fact that Π 1, 0, 0 = 1, 0, 0 since the ground state wave- | i | i function is spherically symmetric. Thus 1, 0, 0 X3 1, 0, 0 = 0 and parity symmetry ensures h | | i that, to first order in the external electric field, the ground state energy is unaffected. To see a non-trivial effect, we must go one order further. The second order shift in the ground state energy is given by (8.13) in general. In our case, the sum is to be taken over all states of hydrogen except the ground state, so

∞ ` 2 (2) 2 2 X X X 1, 0, 0 X3 n, `, m E1 = e |h | | i| . (8.52) E E1 En n=2 `

n0, `0, m0 X n, `, m = 0 unless `0 ` = 1 . h | | i | − |

Applying this to our case, 1, 0, 0 X3 n, `, m = 0 unless ` = 1, which greatly simplifies h | | i the sums in (8.52). Furthermore, [Lz,X3] = 0 so X3 n, 0, 0 is an eigenstate of Lz with | i 71With modern high intensity lasers, electric fields comparable to the nuclear Coulomb potential can be generated. The behaviour of a hydrogen atom in such a laser must be studied by other methods.

– 119 – eigenvalue zero and hence it is orthogonal to n, 1, m whenever m = 0. In summary, to | i 6 this order the electric field causes the ground state 1, 0, 0 to mix only with states n, 1, 0 . | i | i Thus the only non-vanishing terms in (8.52) are

∞ 2 (2) 2 2 X 1, 0, 0 X3 n, 1, 0 E1 = e |h | | i| (8.53) E E1 En n=2 − involving just a single sum. This is the second-order change in the ground state energy due to the presence of the external electric field. This change in the ground state energy is known as the quadratic Stark effect, after it’s discoverer and since it comes in at order 2. There’s a good physical reason why the E change in energy comes in at this order: the hydrogen atom is electrically neutral and the unperturbed ground state 1, 0, 0 is spherically symmetric, so the atom in this state | i has no electric dipole moment. In response to the applied electric field, the electronic wavefunction changes by an amount that is proportional to the coefficients bm, which are themselves proportional to . In other words, the applied electric field E polarizes the E atom, generating a dipole moment p E. The energy of any dipole p in an electric field is ∝ Udip = p E and since the dipole moment itself is , for the ground state of hydrogen −2 · ∼ E Udip . Higher energy states n, `, m can have a permanent dipole moment due to the ∼ E | i electron’s elliptical orbits, but to study this we need degenerate perturbation theory.

8.2 Degenerate Perturbation Theory Our derivation of the coefficients m ∆ n bm = h | | i (8.54) En Em − of the first-order shift in the state breaks down if there are states m that have the same | i energy as the state n that we’re attempting to perturb. Indeed, naively applying (8.54) | i in this case appears to show that the small perturbation can cause an extremely dramatic shift in the state of the system. There’s nothing particularly surprising about this. Consider a marble lying in the bottom of a bowl at the minimum of the gravitational potential. If we tilt the bowl slightly, the marble will move a little, resettling in the bottom of the tilted bowl as it adjusts to the new minimum of the potential. However, if the marble initially lies at rest on a smooth table, so that it initially has the same energy no matter where on the table it lies, then tilting the table even very slightly will lead to a large change in the marble’s location. To handle this degenerate situation, we first observe that the only states that will acquire a significant amplitude as a result of the perturbation are those that are initially degenerate with the original state. In other words, to good approximation, the state to which the system changes will be a linear combination of those having the same zeroth– order energy as our initial state. In many situations, there are only a finite number of these. We then diagonalise the matrix r ∆ s of the perturbing Hamiltonian in the subspace h | | i

– 120 – spanned by these degenerate states. Since this subspace was completely degenerate wrt H0, in this subspace the eigenstates of the perturbation ∆ are in fact eigenstates of the full Hamiltonian H0 + ∆. Let’s see how this works in more detail. Suppose V is an N–dimensional subspace ⊂ H with H0 ψ = EV ψ for all ψ V . For r = 1,...,N, we let r be an orthonormal | i | i | i ∈ {| i} basis of V and define PV to be the projection operator PV : V . We can write this H → projection operator as N X PV = r r (8.55) | ih | r=1 in terms of our orthonormal basis. Similarly, we let P⊥ = 1 PV denote the projection − onto V ⊥, where V ⊥ = χ : ψ χ = 0 ψ V . (8.56) { | i ∈ H h | i ∀ | i ∈ } 2 2 Note that, as projection operators, PV = PV and P⊥ = P⊥ and that PV P⊥ = P⊥PV = 0. Also, we have [H0,PV ] = 0 and [H0,P⊥] = 0 (8.57) since V was defined in terms of a degenerate subspace of H0. Now let’s consider a perturbed Hamiltonian Hλ = H0 + λ∆. Any eigenstate ψλ of | i the full Hamiltonian obeys

0 = (H0 E(λ) + λ∆) ψλ − | i = (H0 E(λ) + λ∆)(PV + P⊥) ψλ (8.58) − | i = (EV E(λ) + λ∆)PV ψλ + (H0 E(λ) + λ∆)P⊥ ψλ − | i − | i where in the second line we’ve separated out the terms in V from those in V ⊥. Acting on the left with either PV or P⊥ and using (8.57), we obtain the two equations

0 = (EV E(λ) + λ PV ∆) PV ψλ + λ PV ∆P⊥ ψλ (8.59a) − | i | i and 0 = λP⊥∆PV ψλ + (H0 E(λ) + λ P⊥∆) P⊥ ψλ , (8.59b) | i − | i in which we have separated out the effects of the perturbation within V and its complement. Suppose now that we’re perturbing a state ψ0 that, in the absence of the perturbation, | i lies in V . We write 2 ψλ = α + λ β + λ γ + | i | i | i | i ··· (8.60) E(λ) = E(0) + λE(1) + λ2E(2) + ··· just as before, noting that P⊥ ψλ is necessarily at least first–order in λ. To zeroth order, | (0)i equation (8.59a) tells us that E = EV as expected. At first order, this equation becomes

(1) (1) ( E + PV ∆) α = 0 or equivalently PV ∆PV α = E α . (8.61) − | i | i | i This is a remarkable equation! It tells us that, if our assumption that the perturbation ∆ causes only a small change in the energies and states of the system is to hold and if

– 121 – our zeroth-order state ψ0 lies in some degenerate subspace, then ψ0 can’t be just any | i | i eigenstate of H0, but must also be an eigenstate of PV ∆PV – the perturbation, restricted to the degenerate subspace V . In other words, in degenerate systems, states that are stable against small perturbations are those that are already eigenstates of the perturbation. If we started with some other ψ0 V that was not an eigenstate of ∆, then as soon as | i ∈ the perturbation is turned on, the state would rapidly change into a ∆ eigenstate72. This large response to a small change in H would invalidate our use of perturbation theory; it’s the quantum analogue of the fact that the location of a marble on a perfectly flat table is extremely sensitive to any tilt.

Let’s now suppose that we’ve chosen our basis r of V in such a way that PV ∆Pv r = (1) {| i} | i Er , so that r is indeed an eigenstate of the perturbation with some eigenvalue we’ll |i (1) | i call Er . Then choosing our initial state α to be r , by (8.61) the first order shift in its | i | i energy is (1) E = r PV ∆PV r = r ∆ r . (8.62) r h | | i h | | i This is exactly what we would have found in the non-degenerate case, but here it’s very im- portant that we’re considering the effects of perturbations when we start with an eigenstate of ∆. In fact, since r is an eigenstate of both H0 and ∆, we have | i (1) Hλ r = (H0 + λ∆) r = (EV + λE ) r (8.63) | i | i r | i so that in the subspace V we’ve solved the perturbed spectrum exactly. In other words, within V we’re not really doing perturbation theory at all! Fortunately, in problems of interest dim(V ) is typically rather small and finding the exact spectrum of the full Hamil- tonian Hλ within this finite dimensional subspace is usually much easier than finding the full spectrum. As we’ve seen, degeneracy in the energy spectrum often arises as a result of some symmetry. If this symmetry is dynamical, such as the enhanced U(d) symmetry of the d-dimensional harmonic oscillator, or the SU(2) SU(2) symmetry the Runge-Lenz vector × implies for the gross structure of hydrogen, it will almost inevitably by broken by a pertur- bation. Even symmetries such as rotations that are inherited from transformations of space may be broken in the presence of external fields. For example, although the ground state of an atom is typically spherically symmetric, the laboratory in which one conducts an ex- periment likely won’t be and any small stray electric and magnetic fields in the laboratory will perturb the atom away from pure spherical symmetry. For this reason, perturbations typically lift degeneracy. As we’ve seen, even very small perturbations can have a dramatic effect in immediately singling out a preferred set of states in an otherwise degenerate sys- tem; we’ll return to this point in section 10.1.3 when trying to understand ‘collapse of the wavefunction’ in quantum mechanics. If you take the Applications of Quantum Mechanics course next term, you’ll see that the lifting of degeneracy is also important in giving crys- tals their electronic properties: when a large number of identical metal atoms are brought

72To justify this, in particular to understand what happens “as a perturbation is turned on”, we’ll need to consider time-dependent perturbation theory. See chapter9.

– 122 – together, ignoring the inter-atomic interactions, all the valence electrons in the different atoms will be degenerate. The small effect of the potential from nearby atoms breaks this degeneracy and makes these valence electrons prefer to delocalise throughout the crystal.

8.2.1 The Linear Stark Effect As an application of degenerate perturbation theory, let’s again consider the Stark effect but now for the 2s state of our atom. This state is degenerate with the three73 2p states, so here our degenate subspace

V = span 2, 0, 0 , 2, 1, 1 , 2, 1, 0 , 2, 1, 1 . (8.64) {| i | i | i | − i} 74 As before, the electric field induces a perturbation is e X3 and we find a diagonalise this E perturbation within V . Parity implies

0 0 2, 0, 0 X3 2, 0, 0 = 0 and 2, 1, m X3 2, 1, m = 0 m, m 1, 0, 1 , (8.65) h | | i h | | i ∀ ∈ { − } while since [Lz,X3] = 0 we also have 2, 0, 0 X3 2, 1, 1 = 0. Thus the matrix elements of h | | ± i ∆ in the basis (8.64) are   0 0 a 0    0 0 0 0  ∆ V = e  ∗  , (8.66) | E a 0 0 0  0 0 0 0 where Z 2 a = 2, 0, 0 X3 2, 1, 0 = 2π ψ2,0,0(r) ψ2,1,0(r, θ) r cos θ r d(cos θ) = 3a0 , (8.67) h | | i − with a0 the Bohr radius. The eigenvalues of ∆ in this subspace are thus 3e a0, 0, 0, 3e a0 { E − E } with corresponding eigenstates 1 1 ( 2, 0, 0 2, 1, 0 ) , 2, 1, 1 , 2, 1, 1 and ( 2, 0, 0 + 2, 1, 0 ) . (8.68) √2 | i − | i | i | − i √2 | i | i

In the presence of the electric field, the n = 2 state of lowest energy is ( 2, 0, 0 + 2, 1, 0 )/√2 | i | i and we conclude that even for a tiny electric field, the n = 2 states will jump to this preferred state. From our discussion of the quadratic Stark effect above, we know that a change in energy requires the dipole moment D of an atom to be independent of . Since ∝ E E this n = 2 state has a firs-order energy shift of 3e a0, we conclude that the n = 2 − E states of hydrogen do indeed have a permanent electric dipole moment of magnitude 3ea0. Classically, for an electron in an elliptical orbit, this result is expected because the electron

73Since the applied electric field does not coupled to spin, we ignore the effect of spin in this discussion – at this order, the Stark effect will not lift the degeneracy of the gross structure of hydrogen wrt the electron’s spin. 74 A reminder: we define the z-direction to be the direction of the electric field, and X3 is the corresponding operator.

– 123 – would spend more time at the apocentre (the point of its orbit furthest from the atom’s centre of mass) than at the pericentre (the point nearest to the centre of mass). If electron’s orbit was a perfect Keplerian ellipse, the atom would have a permanent electric dipole moment aligned parallel to the orbit’s major axis. However, any small deviation from the 1/r potential will cause the major axis of the ellipse to precess, and the time–averaged dipole moment would vanish. For hydrogen, the potential does differ from 1/r, though only very slightly due to the effects of fine structure. Hence even a weak external field can prevent precession and give rise to a steady electric dipole. Quantum mechanically, in the presence of the electric field the lowest energy n = 2 state ( 2, 0, 0 + 2, 1, 0 )/√2 is no longer an eigenstate of L2, because the field applies | i | i torque to the atom, changing it’s total orbital angular momentum. However, this lowest energy state is still an eigenstate of Lz with eigenvalue zero. Recalling that we took the z-direction to be the direction of the applied field, we see that the angular momentum is perpendicular to E, as expected from the classical picture of an ellipse with major axis aligned along E. Note also that the electric field does not lift the degeneracy between the 2, 1, 1 and 2, 1, 1 states. These two states have their orbital angular momentum | i | − i maximally aligned or anti-aligned with E; classically, they correspond to orbits confined to a plane perpendicular to E.

8.3 Does Perturbation Theory Converge? We began our study of perturbation theory by assuming that the states and energy eigen- values of the full Hamiltonian depended analytically on a dimensionless parameter λ con- trolling the perturbation. Even when the individual coefficients of powers of λ are finite, this is often not the case because the infinite perturbative series itself may fail to converge, or may converge only for some range of λ. The issue is that we really have an expansion in

m ∆ n λ h | | i where m = n , Em En 6 − and the coefficients of λ may grow too rapidly. Heuristically, the condition for convergence is thus that the typical energy splitting m ∆ n induced by the perturbation should be |h | | i| much smaller than the initial energy difference Em En. However, a detailed criterion − is often hard to come by since the higher terms in the perturbation expansion involve complicated sums (or integrals) over many different intermediate states. To illustrate this in a simple context, let’s consider in turn the following three pertur- bations of a 1d harmonic oscillator potential

 2  λm ω x0X 2 − P 1 2 2 1 2 2 H = + mω X + + 2 λm ω X (8.69) 2m 2  +λX4 where x0 and  are constants. Of course, the first two can be solved exactly — we’d never really use perturbation theory to study them.

– 124 – In the first case, we have

2 2 P 1 2 2 λ 2 2 H = + m ω (X λx0) mω x (8.70) 2m 2 − − 2 0 from which we easily see that the exact energies are

  2 1 λ 2 2 En(λ) = n + ~ω mω x0 , (8.71) 2 − 2 with corresponding position space wavefunction x nλ = x λx0 n , just a translation of h | i h − | i the usual harmonic oscillator wavefunction n . If we instead tackled this problem using | i perturbation theory, we’d find

2 2 2 2 4 2 X k X n 3 En(λ) = En λmω x0 n X n + λ m ω x0 |h | | i| + O(λ ) − h | | i (n k)~ω k6=n − (8.72)   2 1 λ 2 2 3 = n + ~ω mω x0 + O(λ ) . 2 − 2

To obtain this result, we note that the first–order term vanishes (e.g. by parity), while since X is a linear combination of creation and annihilation operators A† and A, the only k = n + 1 and k = n 1 terms can contribute to the sum in the second-order term. | i | i | i | − i Going further, we’d find that there are no higher corrections in λ – the O(λ3) terms are in fact zero – though this is not easy to see directly. Thus, in this case, the perturbative result converges to the exact result, and the radius of convergence is infinite. This reflects the 2 fact that the perturbation λm ω x0X didn’t really change the character of the original − Hamiltonian. No matter how large λ is, for large enough x the perturbation remains negligible. Turning to the second case, it’s again immediate that the exact energy levels are

 1 En(λ) = n + ~ω(λ) (8.73) 2 where ω(λ) = ω√1 + λ is the modified frequency. This has a branch cut starting at λ = 1, − so the energy is only analytic in the disc λ < 1. Again using perturbation theory, we find | |

λ 2 2 3 En(λ) = En + mω n X n + O(λ ) 2 h | | i    2  (8.74) 1 λ λ 3 = n + ~ω 1 + + O(λ ) 2 2 − 8 agreeing to this order with the Taylor expansion of the exact answer. Continuing further, we’d find that this does indeed converge provided λ < 1, and that it then | | converges to the exact answer. The physical reason why the perturbation series diverges when λ 1 is simply that if λ = 1, the ‘perturbation’ has completely cancelled the | | ≥ − original harmonic oscillator potential, so we’re no longer studying a system that can be treated as a harmonic oscillator in the first instance. Once λ < 1 the harmonic oscillator −

– 125 – potential is turned upside down, and we do not expect our system to possess any stable bound states. Finally, consider the case 4 H = HSHO + λX . (8.75) I do not know whether this model has been solved exactly, but it can be treated perturba- tively. After a fair amount of non-trivial calculation75 one obtains the series

∞ X n E0(λ) = ~ω + (λ) an (8.76) n=1 for the ground state energy including the quartic interaction, where the coefficients behave as n+1   ( 1) √6 n 1 95 1 −2 an = − 3 Γ(n + ) 1 + O(n ) . (8.77) π3/2 2 − 72 n On account of the Γ-function, these grow faster than factorially with n, so the series (8.76) has zero radius of convergence in λ! Once again, this is easy to see from the form of the perturbed Hamiltonian: even though we may only care about λ > 0, our assumption that the perturbation expansion is analytic in λ at λ = 0 means that, if it converges, it will do so for a disc λ D C. For any λ R<0, the Hamiltonian of the quartic oscillator ∈ ⊂ ∈ is unbounded below, so there cannot be any stable bound states that are analytic in λ at λ = 0. Let me comment that even when perturbative series do not converge, they may still provide very useful information as an asymptotic series. You’ll learn far more about these if you take the Part II course on Asymptotic Methods next term, but briefly, we say a PN n + series SN (λ) = n=0 anλ is asymptotic to an exact function S(λ) as λ 0 (written + → SN (λ) S(λ) as λ 0 ) if ∼ → N 1 X n lim S(λ) anλ = 0 . (8.78) λ→0+ λN − n=0 In other words, if we just include a fixed number N of terms in our series, then for small enough λ 0 these first N terms differ from the exact answer by less that λN for any  > 0 ≥ (so the difference is o(N)). However, if we instead try to fix λ and improve our accuracy by including more and more terms in the series, then an asymptotic series will eventually diverge. Most of the perturbative series one meets in the quantum world (including most Feynman diagram expansions of Quantum Field Theory) are only asymptotic series. Just as in our toy examples above, the radius of convergence of such series is often associated with interesting physics.

75I don’t expect you to reproduce it – if you’re curious you can find the details in Bender, C. and Wu, T.T., Anharmonic Oscillator II: A Study of Perturbation Theory in Large Order, Phys. Rev. D7, 1620-1636 (1973).

– 126 – 9 Perturbation Theory II: Time Dependent Case

In this chapter we’ll study perturbation theory in the case that the perturbation varies in time. Unlike the time–independent case, where we mostly wanted to know the bound state energy levels of the system including the perturbation, in the time–dependent case our interest is usually in the rate at which a quantum system changes its state in response to the presence of the perturbation.

9.1 The Interaction Picture Consider the Hamiltonian H(t) = H0 + ∆S(t) (9.1) where H0 is a time-independent Hamiltonian whose spectrum and eigenstates we under- stand, and ∆S(t) is a perturbation that we now allow to depend on time. (The footnote on ∆S(t) is to remind us that this operator has genuine, explicit time dependence even in the Schr¨odingerpicture.) The idea is that H0 models a system in quiescent state, and we want to understand how eigenstates of H0 respond as they start to notice the perturbation. In the absence of the time-dependent perturbation, the time evolution operator would be −iH t/ U0(t) = e 0 ~ as appropriate for the model Hamiltonian H0. We define the interaction picture state ψI(t) as | i −1 ψI(t) = U (t) ψS(t) , (9.2) | i 0 | i where ψS(t) is the Schr¨odingerpicture state evolving in accordance with the TDSE for | i the full Hamiltonian (9.1). Thus, in the absence of ∆S(t), we’d simply have ψI(t) = −1 | i U (t) U0(t) ψ(0) = ψ(0) so the state ψI(0) would in fact be independent of time. The 0 | i | i | i presence of the perturbation means that our states do not evolve purely according to U0(t), so ψI(t) has non-trivial time dependence. | i To calculate the evolution of ψI(t) , we differentiate | i ∂ iH0t/~ iH0t/~ ∂ i~ ψI(t) = H0 e ψS(t) + e i~ ψS(t) ∂t| i − | i ∂t| i iH0t/~ iH0t/~ (9.3) = H0 e ψS(t) + e (H0 + ∆(t)) ψS(t) − | i | i −1 = U (t) ∆(t) U0(t) ψI(t) , 0 | i where we’ve used the full TDSE to evaluate ∂ ψS /∂t, and in the last line we’ve written | i our state back in terms of the interaction picture. Similarly, the expectation value of an operator AS in the Schr¨odingerpicture becomes −1 ψS(t) AS ψS(t) = ψI(t) U (t)ASU0(t) ψI(t) (9.4) h | | i h | 0 | i in terms of the interaction picture states, so we define the interaction picture operator AI(t) by −1 AI(t) = U0 (t)ASU0(t) (9.5) again using just the evolution operators of the unperturbed, rather than full, Hamiltonian. Infinitesimally, this is

d i −1 ∂AI(t) AI(t) = [H0,AI(t)] + U0 (t) U0(t) (9.6) dt ~ ∂t

– 127 – where the final term comes from the explicit time dependence present even in the Schr¨odinger picture operator. In this sense, the interaction picture is ‘halfway between’ the Schr¨odinger picture, in which states evolve and operators are time independent, and the Heisenberg pic- ture, in which states are always taken at their initial value ψ(0) , but operators evolve | i according to the full Hamiltonian. In the interaction picture, states evolve by just the time-dependent perturbation as ∂ i~ ψI(t) = ∆I(t) ψI(t) (9.7) ∂t| i | i −1 where ∆I(t) = U0 (t)∆(t)U0(t) is the perturbation in the interaction picture, whilst oper- ators also evolve, but only via the model Hamiltonian. It will turn out to be useful to introduce an interaction picture time evolution operator UI(t) defined so that

ψI(t) = UI(t) ψI(0) for any initial state ψI(0) . (9.8) | i | i | i ∈ H From (9.3) we find that UI(t) satisfies the differential equation ∂ i~ UI(t) = ∆I(t) UI(t) . (9.9) ∂t

Note that since the operator ∆I(t) itself depends on time, we cannot immediately integrate this to write UI(t) in terms of the exponential of ∆I(t): in particular it is generally not correct to claim  Z t  ? i 0 0 UI(t) = exp ∆I(t ) dt , (9.10) −~ 0 0 0 because since the operator ∆I(t) itself now depends on time, in general [∆I(t), ∆I(t )] = 0 6 and in expanding the exponential as a power series, we’d have to commute the operators at different times through one another. So far our considerations have been exact, but to make progress we must now approx- imate. To do so, it’ll be convenient to rewrite (9.9) as an integral equation76 Z t i 0 0 0 UI(t) = 1 ∆I(t ) UI(t ) dt (9.11) − ~ 0 in which the operator UI(t) we wish to understand also appears on the rhs inside the 0 integral. We now replace the UI(t ) in the integral (9.11) by its value according to its own equation, obtaining i Z t  i 2 Z t Z t1  UI(t) = 1 ∆I(t1) dt1 + ∆I(t1) ∆I(t2) UI(t2) dt2 dt1 . (9.12) − ~ 0 −~ 0 0

The term containing UI(t) on the rhs now has two explicit powers of the perturbation ∆(t), so may be hoped to be of lower order in our perturbative treatment. Iterating this procedure gives ∞ n X  i  Z t Z tn−1 UI(t) = dt1 dtn ∆I(t1) ∆I(t2) ∆I(tn) (9.13) − ··· ··· n=0 ~ 0 0

76 To obtain this equation, we’ve just integrated (??) using the initial condition UI(0) = 1.

– 128 – in general. Note that the operators ∆(ti) at different times do not necessarily commute — they appear in this expression in a time–ordered sequence, with the integrals taken over the range 0 tn tn−1 t1 t . ≤ ≤ ≤ · · · ≤ ≤ Instead of the na¨ıve expression (9.10), we often write the series (9.13) as  Z t  i 0 0 UI(t) = T exp ∆I(t ) dt (9.14) −~ 0 for short, where the time–ordered exponential 77 T exp( ) is defined by the series (9.13). ··· It’s a good exercise to check that this time–ordered exponential is equal to the usual exponential in the case that [∆(t1), ∆(t2)] = 0 for all t1 and t2. −1 The interaction picture perturbation is ∆I(t) = U0 (t)∆S(t) U0(t), so we can also write (9.13) as

t tn−1 ∞  nZ Z X i −1 UI(t) = dt1 dtn U0 (t1)∆S(t1) U0(t1 t2) ∆S(t2) U0(tn−1 tn) ∆S(tn) U0(tn) −~ ··· − ··· − n=0 0 0 (9.15) −1 using the fact that U (ti)U0(ti−1) = U0(ti−1 ti) . In this form we see that the perturba- 0 − tive expansion of UI(t) treats the effect of the perturbation as a sequence of impulses: we evolve according to H0 for some time tn 0 then feel the effect ∆S(tn) of the perturbation ≥ just at this time, before evolving according to H0 for a further time (tn−1 tn) 0 and − ≥ feeling the next impulse from the perturbation, and so on. Finally, we integrate over all the possible intermediate times at which the effects of the perturbation were felt, and sum over how many times it acted. This method, closely related to Green’s functions in pdes, is also the basis of Feynman diagrams in QFT, where H0 is usually taken to be a free Hamiltonian. Now let’s put these ideas to work. Suppose that at t = 0 we’re in the state ψ(0) = P | i an(0) n , described in terms of some superposition of the basis n of eigenstates of the n | i {| i} unperturbed Hamiltonian. Then a later time t the interaction picture state will be ψI(t) = P | i UI(t) ψ(0) and can again be described by a superposition an(t) n . Contracting with | i n | i a state k gives h | ak(t) = k UI(t) ψI(0) (9.16) h | | i in the interaction picture. Thus, using (9.13) we have Z t i X 0 0 ak(t) ak(0) an(0) k ∆I(t ) n dt ≈ − ~ n 0 h | | i Z t i X −1 0 0 0 0 = ak(0) an(0) k U0 (t )∆S(t ) U0(t ) n dt (9.17) − ~ n 0 h | | i t i Z 0 X i(Ek−En)t /~ 0 0 = ak(0) an(0) e k ∆S(t ) n dt , − ~ n 0 h | | i 77Time–ordered exponentials were introduced by Freeman Dyson and play a prominent role in QFT. They’re also familiar in differential geometry in the context of computing the holonomy of a connection around a closed path. (This is not a coincidence, but it would take us too far afield to explain here.)

– 129 – to first non-trivial order in ∆, where we used the fact that H0 n = En n . In particular, | i | i if at t = 0 we begin in an eigenstate m of H0, so that | i ( 1 if n = m an(0) = , (9.18) 0 else then the amplitude for our perturbation to caused the system to make a transition into a different H0 eigenstate k = m is | i 6 | i t i Z 0 i(Ek−Em)t /~ 0 0 ak(t) = e k ∆S(t ) m dt (9.19) −~ 0 h | | i to first order in the perturbation. Henceforth, we’ll drop the subscript on the Schr¨odinger picture perturbation – note that this is the form in which the perturbation enters the original Hamiltonian.

9.2 Fermi’s Golden Rule To go further, we must specify how the perturbation ∆(t) actually depends on time. We’ll first consider small perturbations of the form

∆(t) = ∆ e−iωt + ∆† e+iωt (9.20) for some time–independent operator ∆. Thus our Schr¨odingerpicture perturbation ∆(t) oscillates with fixed frequency ω. Note that ∆(t) should be Hermitian, as it’s a term in the full Hamiltonian, so we’ve included both an e−iωt and e+iωt term, and wlog we may take ω > 0. We’ll see later that these two terms are responsible for different physics. We call such perturbations monochromatic in anticipation of the case that they describe an applied electromagnetic field, corresponding to light of frequency ω. Performing the time integral in (9.19) in this monochromatic case is trivial and yields

k ∆ m h i k ∆† m h i i(Ek−Em−~ω)t/~ i(Ek−Em+~ω)t/~ ak(t) = h | | i e 1 + h | | i e 1 . Ek Em ~ω − Ek Em + ~ω − − − − (9.21) Each of these terms vanishes at t = 0 and then grow linearly in t for times small compared −1 −1 to either Ek Em ~ω or Ek Em + ~ω , respectively, where the oscillatory nature | − − | | − | of the exponentials reveals itself. Now suppose the final state k into which we are making | i a transition has energy very close to Em +~ω, so that in particular Ek > Em corresponding to the atom absorbing energy from the perturbation. In this case, the timescale Ek Em −1 | − − ~ω is very long and the first term on the rhs of (9.21) grows for a long time. On the | other hand, if the final state has Ek very close to Em ~ω, corresponding to emission − where the atom loses energy to the perturbation, then it is the second term which grows for a long time. In the case of absorption, the small denominator of the first term means that this term will dominate at late times. Thus k ∆ m h i i(Ek−Em−~ω)t/~ ak(t) h | | i e 1 (9.22) ≈ Ek Em ~ω − − −

– 130 – Figure 18: As t , the function sin2(xt)/x2t πδ(x). → ∞ →

−1 for times large compared to Ek Em + ~ω , so that the probability that the system has th| − | made a transition to the k eigenstate of H0 is

2   2 4 k ∆ m 2 (Ek Em ~ω)t ak(t) |h | | i| 2 sin − − . (9.23) | | ≈ (Ek Em ~ω) 2~ − − correct to order ∆ 2. | | Now, for x = 0 we have 6 sin2(xt) 1 lim lim = 0 , (9.24) t→∞ x2t ≤ t→∞ x2t

2 2 whilst limx→0 sin (xt)/x t = t which diverges if we then send t . The integral of this → ∞ function is Z sin2(xt) 2 dx = π , (9.25) R x t independent of t, so we see that

sin2(xt) lim = π δ(x) (9.26) t→∞ x2t (see figure 18). This shows that at late times, the transition rate is

∂ 2 2π 2 Γ(m k) = lim ak(t) k ∆ m δ(Ek Em ~ω) (9.27a) → t→∞ ∂t| | ≈ ~ |h | | i| − − in the case that the system absorbs energy from the perturbation. In the opposite case that the system loses energy back into the perturbation, we have Ek < Em and the corresponding rate of transition is

2π 2 Γ(m k) m ∆ k δ(Ek Em + ~ω) . (9.27b) → ≈ ~ |h | | i| − with an opposite sign in the δ-function. These results were essentially obtained by Dirac and proved to be so useful in describing atomic transitions that Fermi dubbed them ‘golden rules’. The name stuck and now (9.27a)&(9.27b) are known as Fermi’s golden rule.

– 131 – The δ-function in (9.27a) of course says that the perturbation must supply the energy difference Ek Em between the two energy levels of our atomic transition. This δ-function − has the effect that, in practice, a truly monochromatic perturbation cannot cause transi- tions between two discrete energy levels of a bound system such as an atom, because the frequency of the perturbation would need to be perfectly tuned to the energy difference78. There are thus two different classes of circumstances in which we can use Fermi’s golden rule. On the one hand, we may be interested not in transitions to some specific final state k , but to any one of a range of eigenstates of H0. A typical example would be | i transitions between an initial bound state of an atom to any of the continuum of positive- energy (non-bound) states. In these circumstances, we let ρ(Ek) δEk denote the number of (final) states with energy in the interval (Ek,Ek + δEk). The function ρ(Ek) — which needs to be calculated in each case — is known as the density of states; note that it must have dimensions of 1/(energy) in order for ρ(Ek) δEk to be a number of states. The total rate of transition from m is then | i Z Z 2π 2 Γ(m k) ρ(Ek) dEk = k ∆ m δ(Ek Em ~ω) ρ(Ek) dEk → ~ |h | | i| − − (9.28) 2π 2 = Em + ~ω ∆ m ρ(Em + ~ω) . ~ |h | | i| We see that we always make transitions through an energy of exactly ~ω, with the density of states telling us how many states of energy Em + ~ω there actually are. The second set of circumstances is when our perturbation is not purely monochromatic, but contains a range of different frequencies. In this case we can make transitions between two bound states, because there is always some perturbation of exactly the right frequency ω = Ek Em/~ even though the rhs takes discrete values as k varies over different bound − states as the final state. To handle this situation, we decompose the perturbation as a Fourier transform Z ∆(t) = ∆(ω) e−iωt dt . (9.29) R Everything goes through as above, but we now find the probability of being in state k at | i time t is 2 2 i ak(t) = (9.30) | | ~ 9.2.1 Ionization by Monochromatic Light A common example of the use of Fermi’s golden rule is to calculate the rate at which an applied monochromatic electromagnetic field can ionize an atom, say hydrogen, stripping off the electron from a bound state into the continuum of ionized states. We’ll assume that the electromagnetic field itself may be treated classically – this assumption is valid e.g. in a laser, or at the focus of the antenna of a radio telescope where the field is large.

78 −1 2 2 At times that are of order |Ek − Em − ~ω| , before our replacement of sin (xt)/x t by a δ-function was valid, no tuning was needed. However, in this case the probability Pk(t) does not continue to increase with time once the perturbation is ended, and there is no overall transition rate.

– 132 – In the vacuum, the electromagnetic field is divergence free and is entirely generated by Faraday’s Law ∇ E = ∂B/∂t. The whole electromagnetic field can be described by a × − vector potential A via 1 ∂A B = ∇ A and E = . (9.31) × − c ∂t In the monochromatic case, Faraday’s equation is solved by

A(x, t) =  cos(k x ωt) , (9.32) · − where  is a constant vector describing the polarization of the wave and k is the wavevector. From the vacuum Maxwell equation ∇ E = 0, we learn that · k  = 0 , (9.33) · saying that the wave is transverse. The coupling of an elementary particle of charge q to such a vector potential is given by the minimal coupling rule P P qA(X, t) (9.34) → − in the Hamiltonian79, so

(P qA)2 q q2A2 H = − + V (X) = H0 (P A + A P) + . (9.35) 2µ − 2µ · · 2µ In our case of atomic transitions, an electron has charge q = e so to first–order in e the − perturbing Hamiltonian is e ∆(t) = (P  cos(k X ωt) + cos(k X ωt)  P) . (9.36) 2µ · · − · − · The component  P of the momentum operator in the direction of the polarization of the · electromagnetic wave commutes with cos(k X ωt) by the transversality condition (9.33) · − (recall that  itself is a constant vector). Thus e ∆(t) =  P cos(k X ωt) µ · · − e   (9.37) =  P ei(k·X−ωt) + e−i(k·X−ωt) , 2µ · simplifying the perturbation. Suppose that electromagnetic wavelength is much larger than the Bohr radius of the atom. This is a good approximation provided the atomic transition occurs between states that are separated in energy by much less than αµc2, as will be the case for waves with frequencies that are less than those of soft X-rays. In this approximation, k X 1 for ·  79This is called minimal coupling because more complicated modifications are possible if the charged particle is not elementary, or in other more complicated circumstances. You’ll learn more about this in the Part II Classical Dynamics course, or if you take AQM next term.

– 133 – all locations in the atom or molecule at which there is significant probability of finding the electron. To lowest order, we can thus approximate e ∆(t)  P e−iωt + e+iωt . (9.38) ≈ 2µ · We’ve retained the full time–dependence, because large values of t cannot be ruled out in the way we excluded large values of the electron’s position. Finally, since the momentum operator occurs in H0 only in the kinetic term, we can write this perturbation as

ie −iωt +iωt ∆(t) = [H0,  X] e + e (9.39) 2~ · The purpose of rewriting the perturbation in this way is that we know how the unperturbed Hamiltonian H0 acts on the original eigenstates. Note that the perturbation is proportional to the dipole operator D = e X whose matrix elements between the initial and final − states we will need to take. This result, known as the dipole approximation, arose from our approximation of the vector potential as constant over the scale of the atom. Since the electric field E ω  we can interpret the transition as occuring due to the interaction ∝ between the applied electric field and this dipole. ie We must now calculate the matrix element f ∆ i = (Ef Ei) f  X i . Suppose h | | i 2~ − h | · | i that the hydrogen atom is initially in the ground state so that i = 100 , with electronic | i | i wavefunction 1 x 100 = e−r/a , (9.40) h | i √πa3 where a is the Bohr radius. If the energy of the ejected electron is sufficiently great, we can ignore its Coulomb attraction back to the ion and treat the final electron as a free particle 2 2 of momentum ~k, so f = k with energy E = ~ k /2µ and wavefunction | i | i 1 x k = eik·x (9.41) h | i (2π~)3/2 as in (4.35). Thus the matrix element we seek is determined by the integral 1 Z k  X −ik·x  x −r/a 3 100 = 3/2 3 1/2 e e d x . (9.42) h | · | i (2π~) (πa ) R3 · Performing this integral is rather messy80; when the dust settles one finds

8√2 i a7/2 k  X 100 =  k . (9.43) h | · | i π ~3/2(1 + k2a2)3 · 80If you insist: First note that for any function f(r) of the radius, Z 4π Z ∞ e−ik·x f(r) d3x = r f(r) sin kr dr k 0 with r = |x| and k = |k|. Differentiating this result along  · ∂/∂k gives

Z Z ∞ 5 −ik·x −r/a 3 4π −r/a 32πa −i e  · x e d x = 3  · k r e (− sin kr + kr cos kr) dr = 2 2 3  · k , k 0 (1 + k a ) where the final radial integral is done by parts.

– 134 – The fact that this result had to be proportional to  k is clear: the only direction in which · the expectation value k X 100 can point is that of k. h | | i We define the differential rate dΓ|100i→|ki by

2π 2 2 dΓ|100i→|ki = k ∆ 100 δ(Ep E100 ~ω) p dp dΩ (9.44) ~ |h | | i| − − 2 2 −1 where dΩ = sin θ dθ dφ. For a free particle of mass µ we have p dp = p (dEp/dp) dEp = µp dEp, so

dΓ|100i→|pi 2π 2 = p ∆ 100 δ(Ep E100 ~ω) µp dEp dΩ ~ |h | | i| − − 6 3 2 2 7 2 (9.45) 2 µp e  a (Ep E100) 2 = − cos θ δ(Ep E100 ~ω) dEp , π ~3 (1 + k2a2)6 − − where  =  . On the support of the δ-function we can replace Ep E100 by ~ω, and the | | − remaining effect of the δ–function is to soak up the dEp integral. Since we’re assuming the energy of the final state electron is so great that we can neglect it’s attraction back to the proton, we must have k2 1/a2. Thus we simplify our differential rate (9.45) to  8 2 dΓ|100i→|pi 2 µ~ 2 2 2 e cos θ . (9.46) dΩ ≈ π k9a5 E 2 2 where = ω/2 is the magnitude of the electric field and ~ k = 2µEk is frozen to E 2µ(~ω + E100) 2µ~ω. ≈ 9.2.2 Absorption and Stimulated Emission In our derivation of Fermi’s golden rule (9.28), we took the frequency ω of the perturbation to be fixed and assumed a continuum of final states. Now let’s consider the opposite case of transitions into a discrete set of final states, such as the different bound state energy levels of an atom, but where we have a continuous range of perturbing frequencies. This case is relevant e.g. when our atom is immersed not in a monochromatic source such as a laser, but in a thermal bath of radiation. We’ll assume that, while the actual value of the perturbation experienced by our system may be subject to rapid fluctuations, the statistical properties of these fluctuations remain constant in time so that

k ∆(t1) m m ∆(t2) k = fkm(t1 t2) (9.47) h | | i h | | i − where the bar indicates the average over fluctuations. We can write the function fkm(t1 t2) − as a Fourier transform Z −iω(t1−t2) fkm(t1 t2) = Fkm(ω) e dω (9.48) − where Fkm(ω) represents the average strength of the frequency-ω contribution of our per- turbation. From equation (9.19) for the case that an(0) = δnm (so that the system is initially in state m ), we have to first order | i 1 Z t Z t 2 i(Ek−Em)(t1−t2)/~ ak(t) = 2 k ∆(t1) m m ∆(t2) k e dt1 dt2 . (9.49) | | ~ 0 0 h | | i h | | i

– 135 – Averaging over the fluctuations, equation (9.47) gives

1 Z Z t Z t  2 i(Ek−Em−~ω)(t1−t2)/~ ak(t) = 2 Fkm(ω) e dt1 dt2 dω | | ~ 0 0 2 1 Z Z t i(Ek−Em−~ω)t1/~ (9.50) = 2 Fkm(ω) e dt1 dω ~ 0 Z 2 sin ((Ek Em ~ω)t/2~) = 4 Fkm(ω) − − 2 dω . (Ek Em ~ω) − − Just as before, for large times we can replace sin2(xt)/x2 tδ(x) in this integral. Thus → we find the rate of transitions

2 ak(t) 2π Γ( m k ) = | | = Fkm ((Ek Em)/~) (9.51) | i → | i t ~2 − in the case that the states m and k are discrete but we have a continuous range of | i | i perturbing frequencies. The classic application of the above results is to an atom in a bath of radiation. As before, we’ll assume the radiation has wavelengths than the typical atomic size, so that  in the dipole approximation the Hamiltonian may be taken to be X H = H0 + e E(t) Xr , (9.52) r · where H0 is the Hamiltonian of the unperturbed atom, E(t) is the fluctuating electric field th due to the radiation and Xr is the position operator for the r electron in the atom. As in (9.47) take the fluctuating electric field to be correlated as Z −iω(t1−t2) Ei(t1)Ej(t2) = δij (ω) e dω . (9.53) R P

The presence of δij on the rhs means that fluctuations in different directions are uncorre- lated, whilst fluctuations that are aligned are correlated equally no matter their direction. This just says that there is no preferred direction for the electric field and holds e.g. for a thermal bath of radiation. We can understand the meaning of the function (ω) as follows. P As you’ll learn if you’re taking the Part II Electrodynamics course, the energy density of an electromagnetic field is 1 ρ(x, t) = E2(x, t) + B2(x, t) (9.54) 8π and in a purely radiative (source–free) field E2 = B2 so ρ = E2/4π. In our case, equa- tion (9.53) gives 1 3 Z ρ = E2(t) = (ω) dω (9.55) 4π 4π R P so 3 (ω)/2π represents the average energy density in the radiation (at any time) due to P frequencies in the range ( ω , ω + d ω ). | | | | | |

– 136 – From (9.47) we have that perturbation itself is correlated as

2 Z 2 X −iω(t1−t2) k ∆(t1) m m ∆(t2) k = e k Xr m (ω) e dω (9.56) h | | i h | | i r h | | i R P where n and m are states of the unperturbed atom and the mod-square includes the | i | i Euclidean inner product (dot product) of the two k Xr m matrix elements. Thus h | | i 2

2 X Fkm(ω) = e k Xr m (ω) (9.57) r h | | i P and the rate at which the radiation stimulates the atom to make a transition from state m to a state k | i | i 2 2πe2 X X Γ( m k ) = 2 k r m (ωkm) (9.58) | i → | i ~ r h | | i P where ωkm = (Ek Em)/~. − Now, if the fluctuating electric field is real, then the power function must obey (ω) = P ( ω) and (ω) = ∗(ω). In particular, the rate Γ( m k ) in (9.58) is an even function P − P P | i → | i of ωkm and consequently, for fixed Ek Em , this rate is the same whether Ek > Em or | − | Ek < Em. For Ek > Em, the atom absorbs energy from the radiation, exciting to a higher energy level, whilst if Ek < Em the atom emits energy back into the radiation as it decays to a lower energy state. The important result we’ve obtained is that the rate for stimulated emission is the same as the rate of absorption.

9.2.3 Spontaneous Emission Our treatment above calculated the rate at which an atom would decay (or excite) due to being immersed in a bath of radiation. However, we currently cannot understand decays of an isolated atom: if the atom is prepared in any eigenstate of H0, quantum mechanics says that in the absence of a perturbation it will just stay there. Remarkably, Einstein was able to relate the rate of spontaneous decay of an atom to the rate of stimulated decay that we calculated above. Einstein defined Am→k(ω) as the rate at which an atom would spontaneously decay from a state of energy Em to a state of lower energy Ek, where ω = (Em Ek)/~ is the − energy of the emitted photon. If the atom is immersed in a bath of photons with ρ(ω) dω modes of frequency in the range (ω, ω +dω), we let ρ(ω) Bk→m(ω) denote the rate at which energy is absorbed from the radiation bath, exciting the atom from k m . Similarly, | i → | i let ρ(ω) Bm→k denote the rate of decay of the excited atom due to stimulation by the presence of the radiation bath. We calculated values for Bm→k and Bk→m above, but let’s pretend that (like Einstein) we do not know this yet. In thermodynamic equilibrium these rates must balance, so if there are nk atoms in state k and nm in state m we must have | i | i

nm (Am→k(ω) + ρ(ω) Bm→k(ω)) = nk ρ(ω) Bk→m(ω) . (9.59)

– 137 – We now borrow two results you’ll derive in the Part II Statistical Mechanics course, next term. First, when a gas of atoms is in equilibrium at temperature T , the relative numbers of atoms in states with energies Em and Ek is given by the Boltzmann distribution

n e−Ek/kBT k ~ω/kBT = −E /k T = e , (9.60) nm e m B −23 −1 where kB 1.38 10 JK is Boltzmann’s constant. Furthermore, if radiation is in ≈ × equilibrium at the same temperature then the density of states at frequency ω is

3 ~ω 1 ρ(ω) = , (9.61) π2c3 e~ω/kBT 1 − which is essentially the Bose–Einstein distribution. Using these in (9.59) gives

ω3 1   ~ ~ω/kBT Am→k(ω) = e Bk→m(ω) Bm→k(ω) . (9.62) π2c3 e~ω/kBT 1 − − However, Am→k is supposed to be the rate of spontaneous emission of radiation from the atom, so it cannot depend on any properties of the radiation bath. In particular, it must be independent of the temperature T . This is possible iff

Bk→m(ω) = Bm→k(ω) (9.63) so that the rates of stimulated emission and absorption agree. In this case, we have

3 ~ω A (ω) = B (ω) (9.64) m→k π2c3 m→k so that knowing the rate Bk→m(ω) at which a classical electromagnetic wave of frequency ω = (Em Ek)/~ is absorbed by an atom, we can also calculate the rate Am→k(ω) at which − the atom spontaneously decays, even in the absence of radiation. Now, in equation (9.58) we found from a first–principles calculation that 2 2πe2 X X Γ( m k ) = 2 k r m (ωkm) | i → | i ~ r h | | i P (9.65) 2 (2πe)2 X X = 2 k r m ρ(ωkm) 3~ r h | | i where in the second line we’ve used the fact that (ω) = 2πρ(ω)/3 as obtained in (9.55). P Consequently, for our atom Einstein’s B coefficient is 2 (2πe)2 X X Bk→m(ω) = Bm→k(ω) = 2 k r m . (9.66) 3~ r h | | i for stimulated emission or absorption. From Einstein’s statistical argument we thus learn that 2 4e2ω3 X X Am→k(ω) = 3 k r m (9.67) 3~c r h | | i

– 138 – is the rate at which an excited atom must spontaneously decay, even in the absence of any applied radiation. Ingenious though it is, there’s something puzzling about this calculation of the rate of spontaneous decay. Where does our quantum mechanical calculation – stating that isolated excited atoms do not decay – go wrong? The answer is that while we treated the energy levels of the atom fully quantum mechanically, the electromagnetic field itself was treated classically. Spontaneous emission is only possible once one also quantizes the electromagnetic field, since it is fluctuations in the zero–point value of this field that allow the atom to emit a photon and decay; heuristically, the fluctuating value of the quantum electromagnetic field – even in the vacuum – ‘tickles’ the excited atom prompting the decay. Treating the EM field classically is appropriate if the radiation comes from a high–intensity laser, where the energy density of the field is so high that the terms B(ω)ρ(ω) dominate in (9.59). By contrast, emission of light from the atoms in a humble candle is an inherently quantum phenomenon, occurring purely by spontaneous emission once the flame is lit — candlelight shines even in ambient darkness. Einstein gave his statistical argument in 1916, when Bohr’s model of the atom (the first thing you met in 1B QM) was still the deepest understanding physicists had of quantum theory. This was at the same time as he was revolutionising our understanding of gravity, space and time, which obviously wasn’t enough to keep him fully occupied. The quantum treatment of stimulated emission given in the previous section is due to Dirac in 1926, the same year as Schr¨odingerfirst published his famous equation. A first–principles calculation of the spontaneous emission rate Am→k, requiring the quantization of the electromagnetic field, was also given by Dirac just one year later. Things move fast. Dirac’s 1927 results agreed with those we’ve found via Einstein’s argument. The necessity also to quantize the electromagnetic field heralded the arrival of Quantum Field Theory. Let’s use these results to calculate the typical lifetime of an excited state of an isolated atom. When the radiation density ρ(ω) is very small, the number of atoms nm in the excited state obeys ∂nm = Am→k nm (9.68) ∂t − so the atom decays exponentially with a characteristic timescale A−1 . For hydrogen or ∼ m→k an alkali atom, typically the outermost electron changes its energy level in the transition, so we can drop the sum in (9.67) to find

\[
A_{m\to k}(\omega) = \frac{4 e^2 \omega^3}{3\hbar c^3}\, |\langle k|X|m\rangle|^2 \qquad (9.69)
\]
where $X$ is the position of the outermost electron. Unless $\langle k|X|m\rangle$ vanishes for some reason (we'll consider possible reasons momentarily), we'd expect this matrix element to be of the typical size $a_0$ characteristic of the atom, so $|\langle k|X|m\rangle|^2 \sim a_0^2$. Hence the characteristic lifetime of the excited atomic state is roughly

\[
\tau = A_{m\to k}^{-1} \sim \frac{3\hbar c^3}{4 e^2 \omega^3 a_0^2} = \frac{3 c^3 \mu}{4\hbar \omega^3 a_0} \qquad (9.70)
\]

where $\mu$ is the mass of the electron. Optical light has a frequency of order $\omega \sim 10^{15}\,$Hz, so the timescale $\tau$ is around $10^7$ times longer than the timescale $\hbar/E_{200}$ associated to the energy $E_{200}$ of the first excited state of hydrogen. Thus around $10^7$ oscillations of the hydrogen atom occur before it radiates a photon, decaying down into the ground state.
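As a rough numerical sanity check of this estimate (not part of the original argument), we can evaluate the second form of (9.70) for the hydrogen 2p → 1s transition. The SI constants and the choice of the Lyman-α frequency below are my own illustrative inputs; the point is only to confirm the order of magnitude (the measured 2p lifetime is about 1.6 ns).

    # Order-of-magnitude check of tau ~ 3 c^3 mu / (4 hbar omega^3 a0), eq. (9.70)
    hbar = 1.055e-34   # J s
    c    = 2.998e8     # m / s
    mu   = 9.109e-31   # electron mass, kg
    a0   = 5.292e-11   # Bohr radius, m
    eV   = 1.602e-19   # J

    omega = 10.2 * eV / hbar                            # Lyman-alpha, ~1.5e16 rad/s
    tau = 3 * c**3 * mu / (4 * hbar * omega**3 * a0)    # lifetime estimate
    print(f"tau ~ {tau:.1e} s")                         # ~9e-10 s, about 10^7 periods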

9.2.4 Selection Rules

In the previous sections we've found that, in the dipole approximation that the wavelength of any radiation is large compared to the scale of the atom, transition amplitudes in hydrogenic atoms are proportional to the matrix elements $\langle n',\ell',m'|X|n,\ell,m\rangle$. We now consider a set of conditions that are necessary if this matrix element is not to vanish.

Firstly, in the dipole approximation the perturbing operator is independent of any details of the electron's spin, so the initial and final spin states must be the same. (We've been tacitly assuming this by suppressing any mention of these spins thus far.) Next, the parity operator $\Pi$ acts as^{81}

\[
\Pi|n,\ell,m\rangle = (-1)^{\ell}\,|n,\ell,m\rangle \quad\text{and}\quad \Pi^{-1} X \Pi = -X\,, \qquad (9.71)
\]
so since $\Pi^{-1} = \Pi$ we have

\[
\langle n',\ell',m'|X|n,\ell,m\rangle = \langle n',\ell',m'|\Pi\Pi^{-1}X\Pi\Pi^{-1}|n,\ell,m\rangle = (-1)^{\ell'+\ell+1}\,\langle n',\ell',m'|X|n,\ell,m\rangle\,. \qquad (9.72)
\]
Consequently, parity imposes

\[
\langle n',\ell',m'|X|n,\ell,m\rangle = 0 \quad\text{unless}\quad \ell + \ell' \text{ is odd}, \qquad (9.73)
\]
as we saw in section 4.6. This is known as a selection rule on allowed transitions.

We can obtain further selection rules by considering angular momentum. In the second problem set you showed that

\[
[L^2, [L^2, X]] = 2\hbar^2 (L^2 X + X L^2) \qquad (9.74)
\]
and hence that
\[
\left((\beta - \beta')^2 - 2(\beta + \beta')\right)\langle n',\ell',m'|X|n,\ell,m\rangle = 0 \qquad (9.75)
\]
where $\beta = \ell(\ell+1)$ and similarly for $\beta'$. Thus the matrix element $\langle n',\ell',m'|X|n,\ell,m\rangle$ vanishes unless $(\beta - \beta')^2 = 2(\beta + \beta')$, which implies $|\ell - \ell'| = 1$ or $\ell = \ell' = 0$. The parity selection rule already rules out this latter case, so we have

\[
\langle n',\ell',m'|X|n,\ell,m\rangle = 0 \quad\text{unless}\quad |\ell - \ell'| = 1\,, \qquad (9.76)
\]
strengthening our previous rule (9.73).

A final selection rule comes from noticing that, since $X$ is a vector operator independent of spin, $[J_i, X_j] = [L_i, X_j] = i\hbar \sum_k \epsilon_{ijk} X_k$. In particular, if we define

\[
X_\pm = X_1 \pm i X_2 \qquad (9.77)
\]

^{81} The electron has intrinsic parity $\eta_e = +1$, but our result (9.73) would hold even if it didn't, as the electron appears in both the initial and final states.

then
\[
[J_3, X_3] = 0 \quad\text{and}\quad [L_3, X_\pm] = \pm\hbar\, X_\pm\,. \qquad (9.78)
\]
We therefore find

\[
0 = \langle n',\ell',m'|[L_3, X_3]|n,\ell,m\rangle = (m' - m)\,\hbar\, \langle n',\ell',m'|X_3|n,\ell,m\rangle \qquad (9.79a)
\]
and, since $X_\pm|n,\ell,m\rangle$ is an eigenstate of $L_3$ with eigenvalue $(m \pm 1)\hbar$,
\[
0 = (m' - m \mp 1)\,\hbar\, \langle n',\ell',m'|X_\pm|n,\ell,m\rangle\,. \qquad (9.79b)
\]
Combining these two we have the further selection rule

\[
\langle n',\ell',m'|X|n,\ell,m\rangle = 0 \quad\text{unless}\quad |m - m'| \le 1\,, \qquad (9.80)
\]
as you also derived in the second problem sheet. More specifically, for transitions with $m' = m$ only $X_3$ can contribute, so the matrix element $\langle n',\ell',m'|X|n,\ell,m\rangle$ points in the $\hat{z}$ direction. On the other hand, transitions with $|m' - m| = 1$ cannot receive any contribution from $X_3$, so the matrix element lies in the $xy$-plane and in fact
\[
\langle n',\ell',m+1|X|n,\ell,m\rangle \propto \begin{pmatrix} 1 \\ -i \\ 0 \end{pmatrix} \quad\text{while}\quad \langle n',\ell',m-1|X|n,\ell,m\rangle \propto \begin{pmatrix} 1 \\ +i \\ 0 \end{pmatrix}. \qquad (9.81)
\]

Finally, we remark that these selection rules were derived under the assumption that the transition is due purely to the dipole approximation, where the perturbation involves matrix elements of the dipole moment operator $eX$. The effects both of higher orders in perturbation theory, and of interactions beyond those of an electric field with wavelength $\lambda \gg a_0$, mean that transitions that are 'forbidden' according to the above selection rules may in fact occur. They will typically do so, however, at a much smaller rate.
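These selection rules are easy to verify numerically, since the angular part of $\langle n',\ell',m'|X|n,\ell,m\rangle$ reduces to integrals of three spherical harmonics. The sketch below uses SymPy's Gaunt coefficient for this purpose; the proportionality between the matrix element and the Gaunt coefficient (up to radial factors, constants and phases) is the only input, and the scan ranges are arbitrary choices of mine.

    # Scan low values of (l, m) -> (l', m') and print the dipole-allowed pairs.
    # The components of X correspond to Y_{1,q} with q = -1, 0, +1, so the
    # angular integral is gaunt(l', 1, l, -m', q, m), up to a phase.
    from sympy.physics.wigner import gaunt

    for l in range(3):
        for m in range(-l, l + 1):
            for lp in range(3):
                for mp in range(-lp, lp + 1):
                    if any(gaunt(lp, 1, l, -mp, q, m) != 0 for q in (-1, 0, 1)):
                        print(f"l={l}, m={m}  ->  l'={lp}, m'={mp}")
    # every printed pair obeys |l - l'| = 1 and |m - m'| <= 1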

10 Interpreting Quantum Mechanics

In this final chapter, I want to explore the process of measurement in quantum mechanics.

10.1 The Density Operator

To get started, we must first come clean about an unrealistic feature of our treatment so far. Up to this point, we've assumed that we know the precise quantum state our system is in. For macroscopic objects this assumption is totally implausible; we can't possibly hope to discover the quantum state of the $\sim 10^{23}$ carbon atoms in a diamond, or the $\sim 10^5$ atoms in a protein molecule. In fact, even in purely classical systems there's always some uncertainty in our knowledge of the system. In reality, we cannot know the momentum of a single particle with infinite precision, as all our measurements will be subject to some experimental error. To handle this situation in quantum mechanics, we need a way to describe systems even when we're not sure which state they're in.

Suppose we know only that our system is in one of the states $\{|\alpha\rangle\}$, and that the probability it is in state $|\alpha\rangle$ is $p_\alpha$. It's important to be clear that we're not saying that the system is in the state
\[
|\phi\rangle = \sum_\alpha \sqrt{p_\alpha}\, |\alpha\rangle\,, \qquad (10.1)
\]
because $|\phi\rangle$ is itself a well–defined quantum state. Rather, we're admitting that we don't know the true state of the system, which could be any of the states $\{|\alpha\rangle\}$. Indeed, these states do not need to form a complete set, and do not even need to be orthogonal, although we will take them each to be correctly normalized, $\langle\alpha|\alpha\rangle = 1$.

We define the density operator $\rho : \mathcal{H} \to \mathcal{H}$ by
\[
\rho = \sum_\alpha p_\alpha\, |\alpha\rangle\langle\alpha| \qquad (10.2)
\]
where the $p_\alpha$ are the probabilities introduced above. Thus $\rho$ projects onto state $|\alpha\rangle$ with probability $p_\alpha$. The density operator has the following three properties. First, since probabilities are real, $\rho^\dagger = \rho$, so the density operator is Hermitian. Second,

\[
\mathrm{tr}_{\mathcal{H}}\,\rho = 1 \qquad (10.3)
\]
since the probabilities sum to one, and third

\[
\langle\chi|\rho|\chi\rangle \ge 0 \quad\text{for all}\quad |\chi\rangle \in \mathcal{H} \qquad (10.4)
\]
since probabilities are non–negative. We often write this property as $\rho \ge 0$ for short. In fact, these three properties can be taken to be the defining properties of a density operator, in the sense that any operator obeying these three properties is the density operator for some system. To see this, diagonalize such an operator: positivity and the unit trace ensure that its eigenvalues are non–negative and sum to one, so they may be interpreted as the probabilities of the corresponding orthonormal eigenstates.

The definition (10.2) of the density operator is reminiscent of the definition $Q = \sum_j q_j\, |q_j\rangle\langle q_j|$ of an arbitrary operator in terms of its eigenstates. However, $\rho$ should not be

considered as corresponding to an observable, because the $p_\alpha$ are subjective rather than objective: they have nothing to do with the state of the system itself; rather, they just quantify our knowledge or ignorance of that state. For example, if all our records of the results of measurements become lost, our values of the $p_\alpha$ will change, but the physical system itself will not. By contrast, the spectrum $\{q_j\}$ of $Q$ is determined by the Laws of Nature, independent of our knowledge of them.

Note that, even though the states $\{|\alpha\rangle\}$ may not form a complete set or even be orthogonal, we're always free to expand $\rho$ in terms of some complete basis $\{|i\rangle\}$, as
\[
\rho = \sum_{i,j} |i\rangle\langle i|\rho|j\rangle\langle j| = \sum_{i,j,\alpha} |i\rangle\, p_\alpha\, \langle i|\alpha\rangle\langle\alpha|j\rangle\, \langle j| = \sum_{i,j} \rho_{ij}\, |i\rangle\langle j| \qquad (10.5)
\]
where $\rho_{ij} = \sum_\alpha p_\alpha \langle i|\alpha\rangle\langle\alpha|j\rangle$ are the matrix elements of $\rho$ in this basis.

The density operator allows us to obtain the expectation value of any observable of our system, accounting for both the quantum and classical uncertainties. To see this, let $Q$ be a Hermitian operator corresponding to some observable, and let $\{|q_i\rangle\}$ be an orthonormal basis of $Q$-eigenstates^{82}. Now consider the quantity $\bar{Q}$ defined by

\[
\bar{Q} = \mathrm{tr}_{\mathcal{H}}(\rho\, Q)\,. \qquad (10.6)
\]

Using the $|q_i\rangle$ basis to perform the trace, we have
\[
\bar{Q} = \sum_i \langle q_i|\rho\, Q|q_i\rangle = \sum_{\alpha,i} p_\alpha\, \langle q_i|\alpha\rangle\langle\alpha|Q|q_i\rangle = \sum_\alpha p_\alpha \left(\sum_i q_i\, |\langle q_i|\alpha\rangle|^2\right) = \sum_\alpha p_\alpha\, \langle\alpha|Q|\alpha\rangle\,. \qquad (10.7)
\]
This expression combines the quantum expectation value of $Q$ in the state $|\alpha\rangle$ (which may not be an eigenstate of $Q$), together with our lack of knowledge of the system's state, represented by the $p_\alpha$s.

In the Schrödinger picture, the time evolution of $\rho$ is given by
\[
i\hbar\,\frac{d\rho}{dt} = i\hbar \sum_n p_n \left(\frac{\partial|n\rangle}{\partial t}\langle n| + |n\rangle\frac{\partial\langle n|}{\partial t}\right) = \sum_n p_n \left(H|n\rangle\langle n| - |n\rangle\langle n|H\right) = [H, \rho\,]\,. \qquad (10.8)
\]
This is the quantum analogue of Liouville's equation $d\rho/dt = \{H, \rho\}$ in Classical Dynamics, which governs the time evolution of a probability density $\rho$ on phase space. To obtain the time evolution of an arbitrary expectation value (of a quantity that has no explicit time dependence in the Schrödinger picture) we use (10.8) to find
\[
i\hbar\,\frac{d}{dt}\,\mathrm{tr}_{\mathcal{H}}(\rho Q) = i\hbar \sum_a \langle a|\frac{d\rho}{dt}\, Q|a\rangle = \sum_a \langle a|(H\rho - \rho H)Q|a\rangle = \mathrm{tr}_{\mathcal{H}}(\rho\,[Q, H]) \qquad (10.9)
\]

^{82} Our discussion here is adapted to the case of $\dim(\mathcal{H})$ finite, but it can be extended to infinite dimensional Hilbert spaces.

where the last equality uses the cyclicity of the trace. We know from before that the rate of change of the expectation value of $Q$ in any pure quantum state is given by the expectation value of $[Q, H]/i\hbar$. Equation (10.9) states that – even when our knowledge of the quantum state is imprecise – the expected rate of change of $\bar{Q}$ is the appropriately weighted average of the rates of change of $Q$ for each of the possible states of the system.

The density operator and the operators for the Hamiltonian and other observables encapsulate a complete, self–contained theory of quantum dynamics. If we have incomplete knowledge of the system's quantum state, then use of this formalism is mandatory. If we do happen to know that our system is initially in the precise quantum state $|\psi\rangle$, we can still use this apparatus by setting $\rho = |\psi\rangle\langle\psi|$ rather than using the TDSE, though in this case use of the density operator is optional.
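As a concrete illustration of this formalism (the two-state ensemble below is my own toy example, not one from the text), the following sketch builds a qubit density operator from a pair of non-orthogonal states, checks the three defining properties, and evaluates an expectation value via (10.6):

    import numpy as np

    up   = np.array([1, 0], dtype=complex)                # |0>
    plus = np.array([1, 1], dtype=complex) / np.sqrt(2)   # (|0> + |1>)/sqrt(2)

    # rho = sum_alpha p_alpha |alpha><alpha| with p = (1/2, 1/2), as in (10.2)
    rho = 0.5 * np.outer(up, up.conj()) + 0.5 * np.outer(plus, plus.conj())

    print(np.allclose(rho, rho.conj().T))               # Hermitian
    print(np.isclose(np.trace(rho).real, 1.0))          # unit trace, (10.3)
    print(np.all(np.linalg.eigvalsh(rho) >= -1e-12))    # positive, (10.4)

    sigma_z = np.diag([1.0, -1.0])
    print(np.trace(rho @ sigma_z).real)                 # tr(rho Q) = 0.5, cf. (10.7)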

10.1.1 Von Neumann Entropy

When our knowledge of the state is incomplete, we say that the system is in an impure or mixed state, and correspondingly we refer to a regular quantum state $|\psi\rangle$ as pure. This terminology is somewhat misleading, because it is really just our knowledge that is incomplete — the system itself is presumably in some perfectly well–defined quantum state, it's just that we don't know which one.

If we're definitely in some state $|\psi\rangle$, so the system is pure, then $\rho = |\psi\rangle\langle\psi|$. For such pure states it is easy to see that $\rho^n = \rho$, using the fact that $|\psi\rangle$ is normalized. If our density operator has $p_m = 0.99999999$ for some state $|m\rangle$, with the remaining probability spread in some way among other states, then although our knowledge of the system's state is imperfect, the effects of the impurity are likely to be negligible. On the other hand, if the density operator has $p_n \le 10^{-20}$ for all states $|n\rangle$, so that a tremendous number of states are equally probable, then our knowledge of the true quantum state of the system is very poor indeed. We would like to have a way to quantify how much knowledge, or information, about a state we have once the probability distribution $\{p_i\}$ has been specified.

To achieve this, define the von Neumann entropy $S(\rho)$ associated to a density operator by
\[
S(\rho) = -\mathrm{tr}_{\mathcal{H}}(\rho \ln \rho)\,, \qquad (10.10)
\]
where we recall that $\ln\rho = \sum_j \ln(p_j)\, |j\rangle\langle j|$ in terms of the eigenvalues $p_j$ and orthonormal eigenstates $|j\rangle$ of $\rho$. Thus we have
\[
-\mathrm{tr}_{\mathcal{H}}(\rho\ln\rho) = -\sum_n \langle n|\left(\sum_i p_i\, |i\rangle\langle i|\right)\left(\sum_j \ln(p_j)\, |j\rangle\langle j|\right)|n\rangle = -\sum_n p_n \ln(p_n) \qquad (10.11)
\]
in terms of the probabilities appearing in the density operator. In this form, it's easy to see that $S(\rho) \ge 0$, with $S(\rho) = 0$ iff $\rho$ describes a pure state, where only one of the $p_n$s is non-zero (and hence equal to 1), as we have complete certainty about which state our system is in.

Also easy to see is that the maximum value of $S(\rho)$ is $\ln\dim(\mathcal{H})$, which is attained iff $\rho = \rho_{\mathrm{max}} = 1_{\mathcal{H}}/\dim(\mathcal{H})$. To see this, use the method of Lagrange multipliers to impose the

constraint $\mathrm{tr}_{\mathcal{H}}\,\rho = 1$ and vary $S(\rho) - \lambda\,(\mathrm{tr}_{\mathcal{H}}(\rho) - 1)$ with respect to the probabilities and the Lagrange multiplier $\lambda$. At an extremum,

\[
0 = -\mathrm{tr}_{\mathcal{H}}\left(\delta\rho \ln \rho + \rho\,\rho^{-1}\delta\rho + \lambda\,\delta\rho\right), \qquad
0 = \delta\lambda\,\left(\mathrm{tr}_{\mathcal{H}}(\rho) - 1\right). \qquad (10.12)
\]
In the first equation we've used the fact that $\mathrm{tr}(\rho\,\rho^{-1}\delta\rho) = \mathrm{tr}(\rho\,\delta\rho\,\rho^{-1})$ by cyclicity, so the order of the variation in the logarithm doesn't matter inside the trace. These equations must hold for arbitrary variations $\delta\rho$ and $\delta\lambda$, so the first tells us that, at an extremum of the von Neumann entropy,
\[
\rho = \rho_{\mathrm{max}} = e^{-\lambda - 1}\, 1_{\mathcal{H}} \qquad (10.13)
\]
for some constant $e^{-\lambda-1}$. Taking the trace, the second equation fixes the constant of proportionality so that
\[
\rho_{\mathrm{max}} = \frac{1}{\dim(\mathcal{H})}\, 1_{\mathcal{H}}\,, \qquad (10.14)
\]
with the corresponding maximum entropy being

\[
S(\rho_{\mathrm{max}}) = -\mathrm{tr}_{\mathcal{H}}\left(\rho_{\mathrm{max}} \ln \rho_{\mathrm{max}}\right) = -\frac{\mathrm{tr}_{\mathcal{H}}(1_{\mathcal{H}})}{\dim(\mathcal{H})}\, \ln\left(\dim(\mathcal{H})^{-1}\right) = \ln \dim(\mathcal{H})\,. \qquad (10.15)
\]
The form of $\rho_{\mathrm{max}}$ shows the entropy is maximised when all states are equally likely – meaning we have no idea which state our system is actually in.

One of the main uses of the density operator and von Neumann entropy is in Quantum Statistical Mechanics. As an example, suppose we wish to extremize the entropy subject to both $\mathrm{tr}_{\mathcal{H}}(\rho) = 1$ and $\mathrm{tr}_{\mathcal{H}}(\rho H) = U$, saying that we know our system has a fixed average energy $U$. Then, using two Lagrange multipliers $\lambda$ and $\beta$, at an extremum we have

\[
0 = \delta\left[S(\rho) - \lambda\,(\mathrm{tr}(\rho) - 1) - \beta\,(\mathrm{tr}(\rho H) - U)\right] \qquad (10.16)
\]
which gives the three conditions

\[
0 = -\mathrm{tr}_{\mathcal{H}}\left[\delta\rho\,(\ln \rho + 1 + \beta H + \lambda)\right], \qquad
0 = \delta\lambda\,(\mathrm{tr}(\rho) - 1)\,, \qquad
0 = \delta\beta\,(\mathrm{tr}_{\mathcal{H}}(\rho H) - U)\,. \qquad (10.17)
\]
Since these must hold for arbitrary variations, the first equation gives

\[
\rho = e^{-\beta H}\, e^{-\lambda - 1} \qquad (10.18)
\]
at a maximum of $S(\rho)$ with fixed energy. The second two conditions just enforce our constraints: to ensure $\mathrm{tr}_{\mathcal{H}}(\rho) = 1$ we must set $e^{\lambda + 1} = Z(\beta)$, where the constant

\[
Z(\beta) = \mathrm{tr}_{\mathcal{H}}\left(e^{-\beta H}\right) \qquad (10.19)
\]

is known as the partition function of our system. Thus, in a state of maximum entropy for fixed average energy, the density operator takes the form

\[
\rho = \frac{1}{Z(\beta)}\, e^{-\beta H} = \frac{1}{Z(\beta)} \sum_n e^{-\beta E_n}\, |E_n\rangle\langle E_n|\,, \qquad (10.20)
\]
where in the final expression we have inserted a complete set of $H$ eigenstates. This form of density operator is known as the Gibbs distribution. It plays a fundamental role in quantum statistical mechanics. $\beta$ is usually written as $1/k_B T$, where $T$ is called the temperature and $k_B$ the Boltzmann constant. For fixed average energy $U$ of the system, the temperature is determined by the constraint $\mathrm{tr}_{\mathcal{H}}(\rho H) = U$. In other words, the temperature $T$ is determined by the average energy of the system. You'll work much more with the density operator and entropy if you take the Part II Statistical Mechanics course next term.
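For a concrete example of the Gibbs distribution (the harmonic-oscillator spectrum and the truncation below are illustrative choices of mine), one can build $\rho$ in the energy eigenbasis, where it is diagonal, and read off $Z(\beta)$ and the average energy:

    import numpy as np

    # Truncated harmonic oscillator, E_n = n + 1/2 in units where hbar*omega = 1
    N = 40
    E = np.arange(N) + 0.5
    beta = 2.0                                  # beta = 1/(k_B T)

    Z = np.sum(np.exp(-beta * E))               # partition function (10.19)
    rho = np.diag(np.exp(-beta * E)) / Z        # Gibbs state (10.20)

    U = np.sum(np.diag(rho) * E)                # average energy tr(rho H)
    print(Z, np.exp(-beta / 2) / (1 - np.exp(-beta)))   # agree up to truncation
    print(U)                                    # the energy fixed by T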

10.1.2 Reduced density operators

If our system comprises two (or more) identifiable subsystems A and B, then $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$, so the full Hilbert space is the tensor product of the Hilbert spaces of the subsystems. Let's suppose A describes the system we're really interested in, whilst B is the 'environment'. That is, B describes the quantum state of everything in the Universe except our immediate object of study A. Of course, we can't hope to know the precise quantum state of B.

We now show that, even if A starts in a pure state, once it has entangled itself with B, it will be in an impure state. Let $\{|i\rangle\}$ denote a complete set of orthonormal states for A and $\{|r\rangle\}$ a complete orthonormal set for B, so that $\{|i\rangle \otimes |r\rangle\}$ is a basis of $\mathcal{H}_A \otimes \mathcal{H}_B$. As in (10.5) we can expand the density operator for the entire system in terms of this basis as
\[
\rho_{AB} = \sum_{ij,rs} \rho_{ir,js}\; |i\rangle\otimes|r\rangle\; \langle j|\otimes\langle s|\,. \qquad (10.21)
\]
Now consider an observable property purely of subsystem A, represented on the full Hilbert space by some Hermitian operator $Q \otimes 1_B$. The expectation value of $Q$ is

\[
\bar{Q} = \mathrm{tr}_{\mathcal{H}_A\otimes\mathcal{H}_B}(\rho_{AB}\; Q\otimes 1_B)
= \sum_{k,t} \langle k|\otimes\langle t| \left(\sum_{ij,rs} \rho_{ir,js}\; |i\rangle\otimes|r\rangle\;\langle j|\otimes\langle s|\right) (Q\otimes 1_B)\; |k\rangle\otimes|t\rangle
= \sum_{ij} \langle j|Q|i\rangle \left(\sum_r \rho_{ir,jr}\right), \qquad (10.22)
\]
where in going to the last line we've used the orthonormality of the bases. Comparing this expression to (10.6) suggests we define a reduced density operator $\rho_A$ of the subsystem A by

\[
\rho_A = \mathrm{tr}_{\mathcal{H}_B}(\rho_{AB})\,, \qquad (10.23)
\]
taking the trace only over subsystem B. Then
\[
\rho_A = \sum_r \langle r|\rho_{AB}|r\rangle = \sum_{ij} |A; i\rangle \left(\sum_r \rho_{ir,jr}\right) \langle A; j|\,, \qquad (10.24)
\]

where the second equality uses our expression (10.21) for $\rho_{AB}$. We can write our expression (10.22) for $\bar{Q}$ in terms of the reduced density operator as
\[
\bar{Q} = \sum_i \langle i|\rho_A Q|i\rangle = \mathrm{tr}_{\mathcal{H}_A}(\rho_A Q)\,. \qquad (10.25)
\]

That is, for any operator of the form $Q \otimes 1_B$ we have
\[
\mathrm{tr}_{\mathcal{H}_A\otimes\mathcal{H}_B}(\rho_{AB}\; Q\otimes 1_B) = \mathrm{tr}_{\mathcal{H}_A}(\rho_A Q) \qquad (10.26)
\]
in terms of the reduced density operator. The reduced density operator enables us to obtain expectation values of subsystem A's observables without bothering about the states of B.

For an example of a situation in which the reduced density operator is useful, consider an atom placed in a low-intensity field of radiation. Every so often, a photon (a quantum of electromagnetic radiation) may come along and scatter off the atom, or be absorbed by the atom into an excited state which subsequently decays, re-emitting the photon, or even cause the atom to be temporarily ionized. If we wish to keep track of the whole system, then as more and more photons interact with the atom, we need to use a larger and larger Hilbert space encompassing further and further tensor products of the Hilbert spaces of individual photons. However, if our interest is just in the state of the atom, it's enough to keep track of the atom's reduced density operator, which refers solely to the Hilbert space of the atom.
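In a finite-dimensional setting the partial trace (10.23) is easy to implement: reshaping $\rho_{AB}$ into a rank-4 tensor with indices $(i, r, j, s)$ and summing over $r = s$ realises (10.24) directly. The function and test state below are an illustrative sketch of mine:

    import numpy as np

    def reduced_A(rho_AB, dA, dB):
        # (rho_A)_{ij} = sum_r rho_{ir,jr}, as in (10.24)
        return np.einsum('irjr->ij', rho_AB.reshape(dA, dB, dA, dB))

    # Sanity check: for a product state rho_AB = rho_A (x) rho_B,
    # tracing out B must return rho_A exactly.
    rho_A = np.array([[0.75, 0.25], [0.25, 0.25]])
    rho_B = np.eye(2) / 2
    rho_AB = np.kron(rho_A, rho_B)
    print(np.allclose(reduced_A(rho_AB, 2, 2), rho_A))   # True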

10.1.3 Decoherence

We now show a very important result: even if a quantum system A is initially pure, meaning we know exactly which state it's in, interactions with an unmonitored environment B generically cause A to become impure.

Suppose both A and B start in pure quantum states $|\phi(0)\rangle$ and $|\chi(0)\rangle$, respectively. Initially then,

\[
\rho_{AB}(0) = |\psi(0)\rangle\langle\psi(0)| = \left(|\phi(0)\rangle \otimes |\chi(0)\rangle\right)\left(\langle\phi(0)| \otimes \langle\chi(0)|\right) \qquad (10.27)
\]
and the whole system will evolve unitarily in time via the operator $U_{AB}(t)$ built from the Hamiltonian of the full system. If the full Hamiltonian does not couple A and B, so that $H = H_A \otimes I + I \otimes H_B$, then
\[
U_{AB}(t) = e^{-iHt/\hbar} = e^{-i(H_A + H_B)t/\hbar} = e^{-iH_A t/\hbar} \otimes e^{-iH_B t/\hbar} = U_A(t) \otimes U_B(t)\,, \qquad (10.28)
\]
using the fact that $[H_A, H_B] = 0$ since they act on different Hilbert spaces. Thus, in this case

\[
|\psi(t)\rangle = U_{AB}(t)|\psi(0)\rangle = U_A(t)|\phi(0)\rangle \otimes U_B(t)|\chi(0)\rangle = |\phi(t)\rangle \otimes |\chi(t)\rangle \qquad (10.29)
\]
and the density operator at time t will be

\[
\rho_{AB}(t) = U_{AB}(t)\, |\psi(0)\rangle\langle\psi(0)|\, U_{AB}^{-1}(t) = \left(|\phi(t)\rangle \otimes |\chi(t)\rangle\right)\left(\langle\phi(t)| \otimes \langle\chi(t)|\right). \qquad (10.30)
\]

Consequently, tracing over $\mathcal{H}_B$ we have that the reduced density operator
\[
\rho_A(t) = \mathrm{tr}_{\mathcal{H}_B}\,\rho_{AB}(t) = |\phi(t)\rangle\langle\phi(t)| \qquad (10.31)
\]
at time t. In other words, if A starts in a pure state and does not interact with the environment, then it will remain in a pure state.

In every realistic case, subsystems are coupled to each other: however weak it may be, there is some term $H_{AB}$ in the Hamiltonian that is not diagonal with respect to the splitting $\mathcal{H}_A \otimes \mathcal{H}_B$. In the presence of an interaction term $H_{AB}$, the time evolution operator is generically not a product of the time evolution operators of the two subsystems, and states of A and B will typically become entangled. Let's demonstrate this assertion in the simplest case, where A and B each have only two possible states. Suppose that at time t they have evolved into the entangled state
\[
|\psi(t)\rangle = \frac{1}{\sqrt{2}}\left(|A;0\rangle|B;0\rangle + |A;1\rangle|B;1\rangle\right). \qquad (10.32)
\]
In this $|\psi(t)\rangle$, the states of A and B are correlated. Before making any measurements, there is a probability 1/2 that A is in either of its two states. However, if we discover (by performing some measurement) that B is in state $|B;0\rangle$ then A must certainly be in state $|A;0\rangle$, whilst if we learn that B is in state $|B;1\rangle$, A must be in state $|A;1\rangle$.

However, suppose we don't keep track of the state of B. Then we should trace over the states in $\mathcal{H}_B$, giving
\[
\rho_A = \frac{1}{2} \sum_{a=0,1} \langle B;a|\left(|A;0\rangle|B;0\rangle + |A;1\rangle|B;1\rangle\right)\left(\langle A;0|\langle B;0| + \langle A;1|\langle B;1|\right)|B;a\rangle = \frac{1}{2}\left(|A;0\rangle\langle A;0| + |A;1\rangle\langle A;1|\right), \qquad (10.33)
\]
which is the density operator of an impure state. In tracing over $\mathcal{H}_B$ we've lost the information carried by the correlations between the two subsystems, so now we only know that A is equally likely to be in either state. Thus, even when the whole system is in a pure state, if we only measure part of it, the state can appear to be impure.

As another example, system A could represent a hydrogen atom which we monitor closely, whilst system B could describe an electromagnetic field to which the atom couples, but which we do not measure precisely. If our atom starts in its first excited state and the electromagnetic field in its ground state then initially the atom, field and whole system are all in pure states. After some time, coupling between the atom and field means the whole system will evolve into a pure but entangled state

\[
|\psi(t)\rangle = a_0(t)\, |A;0\rangle|F;1\rangle + a_1(t)\, |A;1\rangle|F;0\rangle \qquad (10.34)
\]
where $|A;n\rangle$ represents the $n^{\mathrm{th}}$ excited state of the atom, while $|F;n\rangle$ is the state of the field when it contains n photons of the frequency associated with transitions between the atom's ground and first excited state. If we fail to monitor the electromagnetic field, then we must describe the atom by its reduced density operator

\[
\rho_A = |a_0(t)|^2\, |A;0\rangle\langle A;0| + |a_1(t)|^2\, |A;1\rangle\langle A;1|\,. \qquad (10.35)
\]

Figure 19: Strong subadditivity of the von Neumann entropy.

This density operator shows that the atom is now in an impure state. In general, interactions between an experimental system and the wider environment mean that the state of the whole Universe rapidly becomes entangled. Since we don't keep track of all the details of the environment, sooner or later we're obliged to describe our experimental system by its reduced density operator, which will be impure. The tendency for subsystems to evolve from pure quantum states to impure states through interactions with the environment is known as quantum decoherence. Trying to isolate a system from the environment so as to prevent it from becoming impure is one of the main challenges to be overcome in building a practical quantum computer.

Perhaps the most important properties of entropy come from considering its behaviour on subsystems. If $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ then
\[
S(\rho_{AB}) \le S(\rho_A) + S(\rho_B) \qquad (10.36)
\]
where $\rho_A$ and $\rho_B$ are the reduced density matrices for the two subsystems. The inequality is saturated if and only if the two subsystems are uncorrelated (unentangled), so that $\rho_{AB} = \rho_A \otimes \rho_B$. This property is known as subadditivity and you'll explore it further in the final problem sheet. It tells us that the entropy of a whole is no greater than the sum of entropies of its parts. It follows from subadditivity that if $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_C$ then
\[
S(\rho_{ABC}) \le S(\rho_A) + S(\rho_{BC}) \le S(\rho_A) + S(\rho_B) + S(\rho_C) \qquad (10.37)
\]
but in fact something stronger is true: we have

\[
S(\rho_{ABC}) + S(\rho_B) \le S(\rho_{AB}) + S(\rho_{BC})\,, \qquad (10.38)
\]
which is known as strong subadditivity and was proved in 1973 by Elliott Lieb and Mary Beth Ruskai. To interpret it, consider AB and BC as subsystems of ABC, with $AB \cap BC = B$. Strong subadditivity states that the entropy of the whole is no greater than the sum of the entropies of the overlapping subsystems AB and BC, even when the entropy of the overlap B is removed.
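The two-state example (10.32)–(10.33) and the subadditivity inequality (10.36) can both be checked numerically. In the sketch below (my own illustration), the entangled state gives $S(\rho_{AB}) = 0$ for the pure whole but $S(\rho_A) + S(\rho_B) = 2\ln 2$ for the parts, so the inequality is strict precisely because of the entanglement:

    import numpy as np

    def entropy(rho):
        p = np.linalg.eigvalsh(rho)
        p = p[p > 1e-12]              # convention: 0 ln 0 = 0
        return float(-np.sum(p * np.log(p)))

    def partial_trace(rho_AB, dA, dB, keep):
        T = rho_AB.reshape(dA, dB, dA, dB)
        return np.einsum('irjr->ij', T) if keep == 'A' else np.einsum('iris->rs', T)

    psi = np.zeros(4)
    psi[0] = psi[3] = 1 / np.sqrt(2)          # the entangled state (10.32)
    rho_AB = np.outer(psi, psi)

    rho_A = partial_trace(rho_AB, 2, 2, 'A')  # = identity/2, cf. (10.33)
    rho_B = partial_trace(rho_AB, 2, 2, 'B')
    print(entropy(rho_AB))                    # 0: the whole is pure
    print(entropy(rho_A) + entropy(rho_B))    # 2 ln 2: subadditivity is strict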

10.2 Quantum Mechanics or Hidden Variables?

10.2.1 The EPR Gedankenexperiment

Einstein was never happy with the probabilistic nature of quantum mechanics. In the EPR thought experiment, an electron and a positron^{83} are produced in a state with net spin 0, perhaps by the decay of some nucleus from a spin-0 excited state to a lower spin-0 state. The electron and positron travel in opposite directions, each carrying the same amount of momentum. At some distance from the decaying nucleus Alice detects the electron and measures the component of its spin in a direction $\mathbf{a}$ of her choice. Since electrons have spin-$\frac{1}{2}$, Alice inevitably discovers either $+\hbar/2$ or $-\hbar/2$. Meanwhile Bob, who is sitting at a similar distance on the opposite side of the nucleus, detects the positron and measures its spin in some direction $\mathbf{b}$ of his choice.

We are free to choose the z-axis to be aligned with Alice's direction $\mathbf{a}$. Since the electron–positron system has combined spin zero, it must be in the state
\[
|\psi\rangle = \frac{1}{\sqrt{2}}\left(|{\uparrow}\rangle|{\downarrow}\rangle - |{\downarrow}\rangle|{\uparrow}\rangle\right) \qquad (10.39)
\]
that entangles the separate spins of the electron and positron. According to the Copenhagen interpretation, when Alice measures $+\hbar/2$ for the electron spin, the system collapses into the state
\[
|\psi'\rangle = |{\uparrow}\rangle|{\downarrow}\rangle\,. \qquad (10.40)
\]
Thus, whilst before Alice's measurement the probability for the positron to have spin $+\hbar/2$ along $\mathbf{a}$ was 1/2, after she has measured the electron spin there is no chance that the positron also has spin up along the same axis. The state of the positron corresponding to definitely having spin $+\hbar/2$ along the $\mathbf{b}$-axis is
\[
|{\uparrow_b}\rangle = \cos\left(\frac{\theta}{2}\right) e^{-i\phi/2}\, |{\uparrow}\rangle + \sin\left(\frac{\theta}{2}\right) e^{i\phi/2}\, |{\downarrow}\rangle \qquad (10.41)
\]
as we found in the second problem sheet, where $\theta = \cos^{-1}(\mathbf{a}\cdot\mathbf{b})$. Given that after Alice's measurement the positron is certainly in state $|{\downarrow}\rangle$, it follows from (10.41) that the probability Bob measures spin up along $\mathbf{b}$ is
\[
|\langle{\uparrow_b}|{\downarrow}\rangle|^2 = \sin^2\left(\frac{\theta}{2}\right). \qquad (10.42)
\]

In particular, there is only a small probability he would find spin-up along a direction closely aligned with Alice's choice $\mathbf{a}$.

We've supposed that Alice measures first, but if the electron and positron are far apart when the measurements are made, a light signal sent to Bob by Alice when she makes her measurement would not have arrived at Bob by the time he makes his measurement, and vice versa. In these circumstances, relativity tells us that the order in which the measurements appear to be made depends on the velocity of the observer who is judging

^{83} The positron is the antiparticle of the electron, predicted in the relativistic theory by Dirac's equation. It has the same mass and spin as an electron, but opposite sign electric charge.

the matter. Consequently, for consistency, the predictions of quantum mechanics must be independent of who is supposed to make the first measurement and thus collapse the state. This condition is satisfied by the above discussion, since the final probability depends only on $\mathbf{a}\cdot\mathbf{b}$ and is thus symmetric between Alice and Bob.

What bothered EPR is that after Alice's measurement there is a direction ($\mathbf{a}$) along which Bob can never find $+\hbar/2$ for the positron's spin, and this direction depends on what exactly Alice chooses to measure. This fact seems to imply that the positron somehow 'knows' what Alice measured for the electron, and the collapse of the entangled wavefunction
\[
\frac{1}{\sqrt{2}}\left(|{\uparrow}\rangle|{\downarrow}\rangle - |{\downarrow}\rangle|{\uparrow}\rangle\right) \;\longrightarrow\; |{\uparrow}\rangle|{\downarrow}\rangle
\]
apparently confirms this suspicion. Since relativity forbids news of Alice's work on the electron from influencing the positron at the time of Bob's measurement, EPR argued that the required information must have travelled out with the positron in the form of a hidden variable, which was correlated at the time of the nuclear decay with a matching hidden variable in the electron. These hidden variables would then explain the probabilistic nature of quantum mechanics — QM would contain no uncertainties once replaced by a 'better' theory taking these hidden variables into account.
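Before turning to Bell's argument, the conditional probability (10.42) is simple to verify directly from the spin state (10.41); the particular angles below are arbitrary choices of mine:

    import numpy as np

    theta, phi = 1.1, 0.4     # an arbitrary direction b
    up_b = np.array([np.cos(theta / 2) * np.exp(-1j * phi / 2),
                     np.sin(theta / 2) * np.exp(+1j * phi / 2)])   # (10.41)
    down = np.array([0.0, 1.0])   # positron state after Alice finds spin-up

    prob = abs(np.vdot(up_b, down))**2
    print(np.isclose(prob, np.sin(theta / 2)**2))   # True, as in (10.42)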

10.2.2 Bell's Inequality

Remarkably, Bell was able to show that the predictions of any theory of hidden variables are in conflict with the predictions of quantum mechanics.

Suppose we assume that the result of measuring the electron's spin in the $\mathbf{a}$-direction is completely determined by the values taken by hidden variables, in addition to $\mathbf{a}$. That is, if we knew these variables, we could predict with certainty the result of measuring the component of the electron's spin along any direction $\mathbf{a}$; Alice is only uncertain of the outcome because she does not yet know the values of the hidden variables. We take these variables to be the components of some n-dimensional vector $\mathbf{v}$. Thus the result of measuring the electron's spin along $\mathbf{a}$ is a function $\sigma_e(\mathbf{a}, \mathbf{v})$ that takes values $\pm\hbar/2$ only. Similarly, the result of measuring the positron's spin along $\mathbf{b}$ is some function $\sigma_p(\mathbf{b}, \mathbf{v})$. We have
\[
\sigma_e(\mathbf{a}, \mathbf{v}) + \sigma_p(\mathbf{a}, \mathbf{v}) = 0 \qquad (10.43)
\]
by conservation of angular momentum. Let's suppose that $\mathbf{v}$ has a probability distribution $\rho(\mathbf{v})$, such that the probability $dP$ that $\mathbf{v}$ lies in the infinitesimal volume $d^n v$ is

\[
dP = \rho(\mathbf{v})\, d^n v\,. \qquad (10.44)
\]

Then the expectation value of interest is
\[
\langle \sigma_e(\mathbf{a}, \mathbf{v})\, \sigma_p(\mathbf{b}, \mathbf{v})\rangle = \int \rho(\mathbf{v})\, \sigma_e(\mathbf{a}, \mathbf{v})\, \sigma_p(\mathbf{b}, \mathbf{v})\, d^n v = -\int \rho(\mathbf{v})\, \sigma_e(\mathbf{a}, \mathbf{v})\, \sigma_e(\mathbf{b}, \mathbf{v})\, d^n v\,. \qquad (10.45)
\]

Now suppose Bob sometimes measures the spin of the positron parallel to $\mathbf{b}'$ rather than $\mathbf{b}$. The fact that $\sigma_p(\mathbf{b}, \mathbf{v})^2 = \hbar^2/4$ allows us to write
\[
\langle\sigma_e(\mathbf{a}, \mathbf{v})\,\sigma_p(\mathbf{b}, \mathbf{v})\rangle - \langle\sigma_e(\mathbf{a}, \mathbf{v})\,\sigma_p(\mathbf{b}', \mathbf{v})\rangle
= \int \rho(\mathbf{v})\,\sigma_e(\mathbf{a}, \mathbf{v})\left[\sigma_p(\mathbf{b}, \mathbf{v}) - \sigma_p(\mathbf{b}', \mathbf{v})\right] d^n v
= -\int \rho(\mathbf{v})\,\sigma_e(\mathbf{a}, \mathbf{v})\,\sigma_e(\mathbf{b}, \mathbf{v})\left[1 - \frac{4}{\hbar^2}\,\sigma_e(\mathbf{b}, \mathbf{v})\,\sigma_e(\mathbf{b}', \mathbf{v})\right] d^n v\,. \qquad (10.46)
\]
The factor $\left[1 - \frac{4}{\hbar^2}\,\sigma_e(\mathbf{b}, \mathbf{v})\,\sigma_e(\mathbf{b}', \mathbf{v})\right]$ is non–negative, while the product $\sigma_e(\mathbf{a}, \mathbf{v})\,\sigma_e(\mathbf{b}, \mathbf{v})$ fluctuates between $\pm\hbar^2/4$. Hence we obtain the bound
\[
\left|\langle\sigma_e(\mathbf{a}, \mathbf{v})\,\sigma_p(\mathbf{b}, \mathbf{v})\rangle - \langle\sigma_e(\mathbf{a}, \mathbf{v})\,\sigma_p(\mathbf{b}', \mathbf{v})\rangle\right|
\le \frac{\hbar^2}{4} \int \rho(\mathbf{v})\left[1 - \frac{4}{\hbar^2}\,\sigma_e(\mathbf{b}, \mathbf{v})\,\sigma_e(\mathbf{b}', \mathbf{v})\right] d^n v
= \frac{\hbar^2}{4} + \langle\sigma_e(\mathbf{b}, \mathbf{v})\,\sigma_p(\mathbf{b}', \mathbf{v})\rangle\,. \qquad (10.47)
\]
This is Bell's inequality. It must hold for any three unit vectors $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{b}'$ if the probabilistic nature of QM really comes from some underlying hidden variables.

Bell's inequality can be tested experimentally as follows: for a large number of trials, Alice measures the electron's spin along $\mathbf{a}$, while Bob measures the positron's spin along $\mathbf{b}$ in half these trials and along $\mathbf{b}'$ in the other half. From the results of these trials, the LHS of (10.47) can be estimated. The RHS is estimated from a new series of trials in which Alice measures the electron's spin along $\mathbf{b}$ and Bob measures the positron's spin along $\mathbf{b}'$. If hidden variables exist, the results will obey Bell's inequality.

We now show that quantum mechanics itself violates Bell's inequality. Since the random variables $\sigma_e(\mathbf{a})$ and $\sigma_p(\mathbf{b})$ can each take only the values $\pm\hbar/2$, there are just four cases to consider. We find

\[
\langle\sigma_e(\mathbf{a})\,\sigma_p(\mathbf{b})\rangle = \frac{\hbar^2}{4}\left[P_A(\uparrow_a) P_B(\uparrow_b|\uparrow_a) - P_A(\uparrow_a) P_B(\downarrow_b|\uparrow_a) - P_A(\downarrow_a) P_B(\uparrow_b|\downarrow_a) + P_A(\downarrow_a) P_B(\downarrow_b|\downarrow_a)\right], \qquad (10.48)
\]
where $P_B(\uparrow_b|\uparrow_a)$ is the probability that Bob measures the positron's spin to be $+\hbar/2$ along $\mathbf{b}$, given that Alice has found the electron's spin to be up along $\mathbf{a}$, so that the positron's spin must be down along $\mathbf{a}$ as a result of the collapse of the wavefunction after Alice's measurement. Since nothing is known about the orientation of the electron before Alice makes her measurement, quantum mechanics gives
\[
P_A(\uparrow_a) = P_A(\downarrow_a) = \frac{1}{2}\,. \qquad (10.49)
\]
Above, we calculated that
\[
P_B(\uparrow_b|\uparrow_a) = \sin^2\left(\frac{\theta}{2}\right) \quad\text{and}\quad P_B(\downarrow_b|\uparrow_a) = \cos^2\left(\frac{\theta}{2}\right) \qquad (10.50)
\]

and similarly for $P_B(\downarrow_b|\downarrow_a)$ and $P_B(\uparrow_b|\downarrow_a)$. Thus we have
\[
\langle\sigma_e(\mathbf{a})\,\sigma_p(\mathbf{b})\rangle = \frac{\hbar^2}{4}\left(\sin^2\frac{\theta}{2} - \cos^2\frac{\theta}{2}\right) = -\frac{\hbar^2}{4}\cos\theta = -\frac{\hbar^2}{4}\,\mathbf{a}\cdot\mathbf{b} \qquad (10.51)
\]
according to quantum mechanics. Using this correlation on either side of Bell's inequality we find
\[
\mathrm{LHS} = \frac{\hbar^2}{4}\,|\mathbf{a}\cdot(\mathbf{b} - \mathbf{b}')| \quad\text{whereas}\quad \mathrm{RHS} = \frac{\hbar^2}{4}\,(1 - \mathbf{b}\cdot\mathbf{b}')\,. \qquad (10.52)
\]
In particular, if $\mathbf{a}\cdot\mathbf{b} = 0$ and $\mathbf{b}' = \mathbf{b}\cos\alpha + \mathbf{a}\sin\alpha$ we find
\[
\mathrm{LHS} = \frac{\hbar^2}{4}\,|\sin\alpha| \quad\text{whereas}\quad \mathrm{RHS} = \frac{\hbar^2}{4}\,(1 - \cos\alpha) \qquad (10.53)
\]
and it is easy to see that Bell's inequality is violated for all $0 < \alpha < \pi/2$. The predictions of quantum mechanics are thus inconsistent with the existence of hidden variables. Experiments^{84} have been carried out to determine whether Nature herself obeys Bell's inequalities or violates them — and quantum mechanics triumphs.

^{84} Freedman, S. & Clauser, J., Experimental Test of Local Hidden-Variable Theories, Phys. Rev. Lett. 28, 938 (1972). Freedman & Clauser's experiment used a modified version of Bell's inequalities, based on the polarization states of photons rather than the spins of electrons. See also Aspect, A., Grangier, P. & Roger, G., Experimental Realization of the Einstein–Podolsky–Rosen–Bohm Gedankenexperiment: A New Violation of Bell's Inequalities, Phys. Rev. Lett. 49, 91 (1982).
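The violation in (10.53) is easy to tabulate. The short sketch below (my own illustration, with $\hbar^2/4$ set to 1) compares the two sides of Bell's inequality for the configuration $\mathbf{a}\cdot\mathbf{b} = 0$, $\mathbf{b}' = \mathbf{b}\cos\alpha + \mathbf{a}\sin\alpha$, using the quantum correlation (10.51):

    import numpy as np

    # LHS = |a.(b - b')| = |sin(alpha)|,  RHS = 1 - b.b' = 1 - cos(alpha)
    for alpha in np.linspace(0, np.pi / 2, 7):
        lhs = abs(np.sin(alpha))
        rhs = 1 - np.cos(alpha)
        verdict = "violated" if lhs > rhs + 1e-12 else "obeyed"
        print(f"alpha = {alpha:.3f}:  LHS = {lhs:.3f}, RHS = {rhs:.3f}  -> {verdict}")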
