
Foundations of Quantum Mechanics. Lecture notes

Fabio Deelan Cunden, April 23 2019

CONTENTS

1. Tests, probabilities and quantum theory
2. The Old Quantum Theory
3. Bell's inequalities
4. Two-level systems
5. The Schrödinger equation and wave mechanics
6. Exactly solvable models
7. ...
8. Mathematical Foundations of ...
9. ...
10. Identical particles
References

Note. I created these notes for the course ACM30210 - Foundations of Quantum Mechanics I taught at University College Dublin in 2018. The exposition follows very closely the classical books listed at the end of the document. You can help me continue to improve these notes by emailing me with any comments or corrections you have.

1. Tests, probabilities and quantum theory

Quantum theory is a set of rules allowing the computation of probabilities for the outcomes of tests which follow specified preparations. A preparation is an experimental procedure that is completely specified, like a recipe in a good cookbook. Preparation rules should be unambiguous, but they may involve stochastic processes, such as thermal fluctuations, provided that the statistical properties of the stochastic process are known. A test starts like a preparation, but it also includes a final step in which information, previously unknown, is supplied to an observer.

In order to develop a theory, it is helpful to establish some basic notions and terminology. A quantum system is an equivalence class of preparations. For example, there are many equivalent macroscopic procedures for producing what we call a photon, or a free electron, etc. While quantum systems are somewhat elusive, quantum states can be given a clear operational definition, based on the notion of test. Consider a given preparation and a set of tests. If these tests are performed many times, after identical preparations, we find that the statistical distribution of outcomes of each test tends to a limit: each outcome has a definite probability. We can then define a state as follows: a state is characterised by the probabilities of the various outcomes of every conceivable test.

This definition is highly redundant. We shall see that these probabilities are not independent. One can specify – in many different ways – a restricted set of tests such that, if the probabilities of the outcomes of these tests are known, it is possible to predict the probabilities of the outcomes of every other test. (A geometric analogy is the definition of a vector by its projections on every axis. These projections are not independent: it is sufficient to specify a finite number of them, on a complete, linearly independent set of axes.)

As a simple example of the definition of a state, suppose that a photon is said to have right-handed polarisation. Operationally, this means that if we subject that photon to a specific test (namely, a quarter wave plate followed by a suitably oriented calcite crystal) we can predict with certainty that the photon will exit in a particular channel. For any other test, consisting of arbitrarily arranged calcite crystals and miscellaneous optically active media, we can then predict probabilities for the various exit channels. (These probabilities are computed in the same way as the classical beam intensities.) Note that the word 'state' does not refer to the photon by itself, but to an entire experimental setup involving macroscopic instruments.

The essence of quantum theory is to provide a mathematical representation of states, together with rules for computing the probabilities of the various outcomes of any test. This set of rules can be consistently organised assuming that there is associated with each physical system a set of quantities, constituting a non-commutative algebra in the technical mathematical sense, the elements of which are the physical quantities themselves.

2. The Old Quantum Theory

There are two main groups of experimental phenomena which are inconsistent with classical physics, namely:

(i) when the internal energy of an atom changes, owing to emission or absorption of light, it does not do so evenly or continuously but in 'quantised' steps;
(ii) a beam of electrons exhibits interference phenomena similar to those of a light wave.

(i) Quantum states. A great wealth of experimental data, chiefly derived from spectroscopy, shows that an atom cannot exist in states of continuously varying energy but only in different discrete states of energy, referred to as 'discrete energy levels'. These levels and the spacings between them are different for the various chemical elements, but are identical for all atoms of the same element. The emitted energy (in the form of light) is confined to spectral lines of sharply defined frequencies, and, moreover, the frequencies are not related to one another by integral factors, as overtones, but instead show an interesting additive relation, expressed in the Ritz-Rydberg combination principle. The change from a level E_2 to another E_1 is associated with the emission if E_2 > E_1 (or absorption if E_2 < E_1) of light, whose frequency ν is determined by the relation

(2.1) E_2 − E_1 = hν,

where h is a universal constant known as Planck's constant. The above relation is the Bohr frequency rule. It is important to notice that the experimentally measurable quantities are the spacings (frequencies) and not the single energy levels! Note that Bohr's assumption is not only in agreement with the existence of sharp spectral lines, but contains in addition the combination principle. If we order the energy levels E_0 < E_1 < E_2 < ···, then in accordance with (2.1) each frequency is the difference of two terms

ν(i → j) = (1/h)(E_i − E_j),   i > j.

Consequently, generally speaking, there will occur, in addition to the frequencies ν(i → j) and ν(j → k), the frequency

(2.2) ν(i → k) = ν(i → j) + ν(j → k)

obtained from them by addition (combination).

(ii) Interference of waves. It is well known that light waves when reflected or refracted from a regularly spaced grating interfere and form what is called a 'diffraction' pattern visible on a screen. This phenomenon was used, for instance, to prove the wave nature of X-rays. For this purpose the regular spacing is provided by the atoms of a crystal, as an ordinary grating would be too coarse. From the diffraction pattern one can determine the wavelength of the light. Davisson and Germer (1927) carried out a similar experiment with a beam of electrons, all having the same velocity, or momentum p, passing through a crystal, etc. The diffraction pattern so found is very similar to that produced by X-rays. This experiment shows that waves are associated with a beam of electrons. The wavelength of the electrons is then found to be inversely proportional to the momentum, λ ∝ 1/p: the slower the electrons, the longer the wavelength. The proportionality constant can be measured and it turns out to be equal to Planck's constant:

(2.3) λ = h/p.

This relation was suggested by de Broglie in his doctoral thesis (1924), and later confirmed in the experiment by Davisson and Germer. This relation was also fundamental in the formulation of Heisenberg's famous uncertainty relation

(2.4) ∆x ∆p ≥ ℏ/2.

EXAMPLE 2.1. Hydrogen atoms in a discharge lamp emit a series of lines in the visible part of the spectrum. This series is called the Balmer Series after the Swiss teacher Johann Balmer. Here are the lines observed by Balmer (in convenient units)

5/36, 3/16, 21/100, 2/9.

In 1885, Balmer found by trial and error a formula to describe the wavelengths of these lines. Balmer suggested that his formula may be more general and could describe spectra from other elements. Then in 1889, Johannes Robert Rydberg found several series of spectra that would fit a more general relationship, similar to Balmer's empirical formula. A sample of lines is

..., 11/900, 9/400, 5/144, 7/144, 16/225, 1/12   (infrared),
5/36, 3/16, 21/100, 2/9   (visible),
3/4, 8/9, 15/16, 24/25, 35/36, ...   (ultraviolet).

The actual spectral lines are given by the numbers above multiplied by the Rydberg constant

(2.5) R = 2π²µe⁴/(h³c),

where µ is the reduced mass of the electron and the nucleus. The numerical value of the constant R agrees with values obtained from spectroscopic data. The values for hydrogen, ionized helium, and infinite nuclear mass are

R_H = 109677.759 ± 0.05 cm⁻¹
R_He = 109722.403 ± 0.05 cm⁻¹
R_∞ = 109737.42 ± 0.06 cm⁻¹.
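As a quick numerical illustration of the last statement, multiplying the 'visible' fractions by R_H gives wavenumbers whose reciprocals are the Balmer wavelengths, all in the visible range. A minimal Python sketch (the variable names are mine):

from fractions import Fraction

R_H = 109677.759                         # Rydberg constant for hydrogen, in cm^-1
visible = [Fraction(5, 36), Fraction(3, 16), Fraction(21, 100), Fraction(2, 9)]

for f in visible:
    wavenumber = float(f) * R_H          # spectral line in cm^-1
    wavelength_nm = 1.0e7 / wavenumber   # 1 cm = 1e7 nm
    print(f"{f}: {wavenumber:10.1f} cm^-1  ->  {wavelength_nm:6.1f} nm")

The output (approximately 656, 486, 434 and 410 nm) reproduces the familiar visible Balmer lines.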

EXERCISE 2.1. Try to find Balmer's empirical formula. If you succeed, try to find Rydberg's generalisation to explain the above sequence. Hint: rearrange the sequence as a table

3/4
8/9      5/36
15/16    3/16     7/144
24/25    21/100   16/225   9/400
35/36    2/9      1/12     5/144    11/900
...      ...      ...      ...      ...      ...

TABLE 1. Relations between experimental interpretation and theoretical inferences.

Diffraction (Young 1803, Laue 1912)           | Electromagnetic waves
Blackbody radiation (Planck 1900)             | Electromagnetic quanta (Einstein 1904)
Combination principle (Ritz-Rydberg 1908)     |
Franck-Hertz experiment (1913)                | Discrete values for physical quantities
Stern and Gerlach (1922)                      |
Davisson and Germer (1927)                    | Matter waves

2.1. Quantisation rules. In 1915 W. Wilson and A. Sommerfeld discovered independently a simple method of quantisation which was soon applied in the discussion of several atomic phenomena with good success.

The first step of the method consists in solving the classical equations of motion defined by the classical Hamiltonian H = H(p, q). The assumption is then introduced that only those classical orbits are allowed as stationary states for which the following conditions are satisfied:

(2.6) ∮ p_k dq_k = n_k h,   n_k an integer,

for all k = 1, ..., N (N being the number of degrees of freedom of the system). These integrals, which are called action integrals, can be calculated only for integrable systems or, in the language of old quantum theory, 'conditionally periodic systems'.

EXAMPLE 2.2 (Quantisation of the harmonic oscillator). A classical harmonic oscillator is defined by the Hamiltonian (total energy)

(2.7) H(p, q) = p²/(2m) + (1/2) mω²q²,   (p, q) ∈ ℝ².

The solutions of the Hamilton equations

(2.8) dq/dt = ∂H/∂p = p/m,   dp/dt = −∂H/∂q = −mω²q

are oscillations about the point (p, q) = (0, 0),

(2.9) q(t) = A sin(ωt + φ),   p(t) = mωA cos(ωt + φ),

where A and φ are arbitrary constants. It is easy to see that the (p, q)-plane is partitioned into ellipses of equation

q²/A² + p²/(m²ω²A²) = 1.

The action integral can be evaluated explicitly:

(2.10) ∮ p dq = πmωA² = hn.

Applying the quantisation rule, we see that the amplitude A is restricted to the quantised values A_n = (hn/(πmω))^{1/2}. The corresponding energy values are

(2.11) E_n = (1/2) mω²A_n² = (h/2π) ωn = ℏωn.

The energy levels predicted by the old quantum theory are integral multiples of ℏω.
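As a sanity check of (2.10)-(2.11), the action integral can also be evaluated numerically along the orbit (2.9); imposing ∮ p dq = nh then reproduces E_n = nℏω. A rough sketch (the values of m, ω and n are arbitrary choices, in units where h = 2π, i.e. ℏ = 1):

import numpy as np

m, omega, n = 1.0, 2.0, 3
h = 2 * np.pi                      # units with hbar = 1
hbar = 1.0

def trap(y, x):
    # simple trapezoidal rule
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# amplitude selected by the quantisation rule pi*m*omega*A^2 = n*h
A = np.sqrt(n * h / (np.pi * m * omega))

# evaluate the action integral over one period of the orbit (2.9)
t = np.linspace(0.0, 2 * np.pi / omega, 200001)
q = A * np.sin(omega * t)
p = m * omega * A * np.cos(omega * t)
action = trap(p * np.gradient(q, t), t)

E = 0.5 * m * omega**2 * A**2
print(action / h)                  # ~ n
print(E / (hbar * omega))          # ~ n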

3. Bell’s inequalities

3.1. An experiment with polarizers. In a beam of monochromatic (fixed color) light we put a pair of polarizing filters, each of which can be rotated around the axis formed by the beam. As is well known, the light emerging from both filters changes in intensity when the filters are rotated relative to each other. Starting from the orientation where the resulting intensity is maximal, and rotating one of the filters through an angle α, the light intensity decreases with α, vanishing for α = π/2.

If we call the intensity of the beam before the filters I_0, after the first I_1, and after the second I_2, then I_1 = I_0/2 (we assume the original beam to be unpolarized), and

(3.1) I_2 = I_1 cos²α.

So far the phenomenon is described well by classical physics. During the last century, however, it has been observed that for very low intensities (monochromatic) light comes in small packages, which were called photons, whose energy depends on the color, but not on the total intensity. So the intensity must be proportional to the number of these photons, and formula (3.1) must be given a statistical meaning: a photon coming from the first filter has probability cos²α to pass through the second.

Thinking along the lines of classical probability, we may associate to a polarization filter in the direction α a random variable P_α taking value 0 if the photon is absorbed by the filter and value 1 if the photon passes through. Then, for two filters in the directions α and β these random variables should be correlated as follows:

(3.2) E[P_α] = E[P_β] = 1/2   and   E[P_α P_β] = P(P_α = 1 and P_β = 1) = (1/2) cos²(α − β).

Here we hit on a difficulty: the function on the right hand side is not a possible correlation function! To see this, consider three polarizing filters having polarization directions α_1, α_2, and α_3 respectively. They should give rise to random variables P_1, P_2 and P_3 satisfying

E[P_i P_j] = (1/2) cos²(α_i − α_j).

PROPOSITION 3.1 (Bell's three variable inequality). For any three 0-1-valued random variables P_1, P_2, and P_3 on the same probability space, the following inequality holds:

P(P1 = 1,P3 = 0) ≤ P(P1 = 1,P2 = 0) + P(P2 = 1,P3 = 0)

PROOF.

P(P_1 = 1, P_3 = 0) = P(P_1 = 1, P_2 = 0, P_3 = 0) + P(P_1 = 1, P_2 = 1, P_3 = 0)
                    ≤ P(P_1 = 1, P_2 = 0) + P(P_2 = 1, P_3 = 0). □

In our example,

P(P_i = 1, P_j = 0) = P(P_i = 1) − P(P_i = 1, P_j = 1) = (1/2) sin²(α_i − α_j).

Bell's inequality thus reads

(1/2) sin²(α_1 − α_3) ≤ (1/2) sin²(α_1 − α_2) + (1/2) sin²(α_2 − α_3),

which is clearly violated for the choices α_1 = 0, α_2 = π/6 and α_3 = π/3.

The above calculation could be summarized as follows: we are looking for a family of 0-1-valued random variables (P_α)_{0≤α<π} on the same probability space with P(P_α = 1) = 1/2, satisfying

P(P_α ≠ P_β) = sin²(α − β).

Now, on the space of 0-1-valued random variables on a probability space, the function (X, Y) ↦ P(X ≠ Y) equals the L¹-distance of X and Y. On the other hand, the function (α, β) ↦ sin²(α − β) does not satisfy the triangle inequality. Therefore no family (P_α)_{0≤α<π} exists which meets the above requirement.
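The violation is easy to check numerically; the following minimal sketch evaluates both sides of Bell's three variable inequality for the angles used above, and also the failed triangle inequality for sin².

import numpy as np

a1, a2, a3 = 0.0, np.pi / 6, np.pi / 3

def pneq(a, b):
    # P(P_i = 1, P_j = 0) demanded by the photon statistics
    return 0.5 * np.sin(a - b) ** 2

lhs = pneq(a1, a3)
rhs = pneq(a1, a2) + pneq(a2, a3)
print(lhs, rhs, lhs <= rhs)          # 0.375  0.25  False: the inequality is violated

# equivalently, (alpha, beta) -> sin^2(alpha - beta) fails the triangle inequality
d = lambda a, b: np.sin(a - b) ** 2
print(d(a1, a3) <= d(a1, a2) + d(a2, a3))   # False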

3.2. An improved experiment. On closer inspection the above example is not very convincing. Indeed, when two polarizers are arranged on the optical bench, why should not the random variable for the second polarizer depend on the angle of the first? In fact we can do a better experiment using a clever experimental technique. It is possible to build a device that produces pairs of photons, such that the members of each pair move in opposite directions and show opposite behaviour towards parallel polarization filters: if one passes the filter, then the other is surely absorbed. With these photon pairs, the very same experiment can be performed, but this time the polarizers are far apart, each one acting on its own photon. (This is the optical version of the Gedankenexperiment proposed and discussed by Einstein, Podolsky, Rosen and Bohm.) The same outcomes are found, violating Bell's three variable inequality.

3.3. The decisive experiment. Advocates of classical probability could still find serious fault with the argument given so far. So the argument has to be tightened still further. This brings us to the experiment (conceptually devised by Bell in the sixties) which was actually performed by A. Aspect and his team at Orsay, Paris, in 1982. In this experiment a random choice out of two different polarization measurements was performed on each side of the pair-producing device, say in the direction α_1 or α_2 on the left and in the direction β_1 or β_2 on the right, giving rise to four random variables P_1 = P(α_1), P_2 = P(α_2) and Q_1 = Q(β_1), Q_2 = Q(β_2), two of which are measured and compared at each trial.

PROPOSITION 3.2 (Bell’s four variable inequality). For any four 0-1-valued random variables P1, P2, Q1, and Q2 on the same probability space, the following inequality holds:

P(P1 = Q1) ≤ P(P1 = Q2) + P(P2 = Q1) + P(P2 = Q2).

Quantum mechanics predicts, and the experiment of Aspect showed, that

P(P(α) = Q(β) = 1) = P(P(α) = Q(β) = 0) = (1/2) sin²(α − β).

Hence

P(P(α) = Q(β)) = sin²(α − β).

So Bell's four variable inequality reads

sin²(α_1 − β_1) ≤ sin²(α_1 − β_2) + sin²(α_2 − β_1) + sin²(α_2 − β_2),

which is clearly violated for the choices α_1 = 0, α_2 = π/3, β_1 = π/2 and β_2 = π/6.

There does not exist, on any classical probability space, a quadruple P1, P2, Q1, and Q2 of random variables with the correlations measured in this experiment.
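The same kind of numerical check can be made for the four variable inequality, with the angles quoted above:

import numpy as np

a1, a2, b1, b2 = 0.0, np.pi / 3, np.pi / 2, np.pi / 6

# P(P(alpha) = Q(beta)) predicted by quantum mechanics and measured by Aspect
s = lambda x, y: np.sin(x - y) ** 2

lhs = s(a1, b1)
rhs = s(a1, b2) + s(a2, b1) + s(a2, b2)
print(lhs, rhs, lhs <= rhs)     # 1.0  0.75  False: no classical model reproduces the data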

4. Two-level systems

Consider the historic Stern-Gerlach experiment whose purpose was to determine the magnetic moment of atoms, by measuring the deflection of a neutral atomic beam by a magnetic field. If the mean magnetic field is oriented along the e_1-axis, i.e. B = B e_1, then the classical equations of motion predict a deflection of the beam along the e_1-axis proportional to µ_1 = µ · e_1. The surprising result found by Gerlach and Stern was that µ_1 could take only two values, ±µ.

This result is extremely surprising from the point of view of classical physics, because Gerlach and Stern could have chosen different orientations for their magnet, for example e_2 and e_3, making angles of ±120° with e_1, to measure µ_2 = µ · e_2 or µ_3 = µ · e_3, respectively. As the outcome of the experiment cannot be affected by merely rotating the magnet, they would have found, likewise, µ_2 = ±µ or µ_3 = ±µ. This creates, however, an apparent contradiction because

µ_1 + µ_2 + µ_3 = µ · (e_1 + e_2 + e_3) = 0.

Obviously, µ1, µ2 and µ3 cannot all be equal to ±µ, and also sum up to zero.

Of course, it is impossible to measure in this way the values of µ_1, µ_2 and µ_3 of the same atom (this is a purely classical impossibility). Quantum theory tells us that this is not a defect of this particular experimental method for measuring a magnetic moment: no experiment whatsoever can determine µ_1, µ_2 and µ_3 simultaneously.

EXAMPLE 4.1. Consider an experiment with sequential Stern-Gerlach (SG) apparatuses. We measure the magnetic moment (spin) S_z of a neutral beam in the direction e_z. Two beams of equal intensity emerge from the apparatus.

Suppose that we block the S_z = −1/2 beam (the bottom beam, say) and we let only the +1/2 beam go into a second identical SG apparatus. This apparatus lets the whole beam out of the top output, and nothing comes out of the bottom output. We say that the states with spin +1/2 have no component along S_z = −1/2.

Now, suppose that we measure the S_x component of the S_z = +1/2 beam. Classically, an object with angular momentum along the z axis has no component of angular momentum along the x axis; these are orthogonal directions. Here, however, about half of the beam exits through the top S_x = +1/2 output, and the other half through the bottom S_x = −1/2 output.

Now we block the top beam and we only have an S_x = −1/2 output. The beam is fed into an SG apparatus along e_z. We find that half of the particles make it out of the third SG apparatus with S_z = +1/2 and the other half with S_z = −1/2.

4.1. Spin and Pauli matrices. One can consider experiments with a few Stern-Gerlach apparatuses in series measuring the magnetic moment (the spin) in the directions e_x, e_y, e_z, and so on. A mathematical framework consistent with the outcomes of the experiments and their probabilities is as follows. The experiment suggests that the spin state can be described using normalised vectors in ℂ². In particular we can describe a state using two basis vectors, for instance

(4.1) |↑_z⟩ and |↓_z⟩.

The first corresponds to a spin +1/2 in the z-direction ('spin up along z'). The second corresponds to the state of spin −1/2 ('spin down along z'). A magnet measuring the spin in the z direction is represented by an operator S_z for which |↑_z⟩ and |↓_z⟩ are eigenvectors with eigenvalues ±1/2, respectively:

(4.2) S_z |↑_z⟩ = +(1/2) |↑_z⟩
(4.3) S_z |↓_z⟩ = −(1/2) |↓_z⟩.

It is customary to represent the two vectors in the computational basis

(4.4) |↑_z⟩ = (1, 0)^T,   |↓_z⟩ = (0, 1)^T.

A state can be represented as a superposition (i.e. normalised linear combination)

(4.5) |ψ⟩ = a |↑_z⟩ + b |↓_z⟩ = (a, b)^T,   a, b ∈ ℂ,   |a|² + |b|² = 1.

The probability to measure a state φ given that the system is prepared in the state ψ is |⟨φ|ψ⟩|². The state |↑_z⟩ entering the second Stern-Gerlach apparatus must have zero overlap with |↓_z⟩ since no such down spins emerge. Moreover the overlap of |↑_z⟩ with itself must be one, as all states emerge from the second Stern-Gerlach top output. Indeed,

⟨↑_z|↓_z⟩ = (1  0)(0, 1)^T = 0,   ⟨↑_z|↑_z⟩ = (1  0)(1, 0)^T = 1,   etc.

The operator S_z can be represented as the 2 × 2 matrix:

(4.6) S_z = (1/2) (1 0; 0 −1).

When we measure the z-component of a state |↑_x⟩ we get |↑_z⟩ or |↓_z⟩ with equal probability 1/2; hence

(4.7) |⟨↑_z|↑_x⟩| = |⟨↓_z|↑_x⟩| = 1/√2.

Therefore we can write

(4.8) |↑_x⟩ = (1/√2) |↑_z⟩ + (1/√2) e^{iδ_1} |↓_z⟩.

The state |↓_x⟩ must be orthogonal to |↑_x⟩; this leads to

(4.9) |↓_x⟩ = (1/√2) |↑_z⟩ − (1/√2) e^{iδ_1} |↓_z⟩.

We can now give a matrix representation of S_x:

(4.10) S_x = (1/2) (0 e^{−iδ_1}; e^{iδ_1} 0).

A similar argument with S_x replaced by S_y leads to

(4.11) S_y = (1/2) (0 e^{−iδ_2}; e^{iδ_2} 0).

Is there any way of determining δ_1 and δ_2? We can consider a sequential Stern-Gerlach experiment with a measurement along e_x followed by a measurement along e_y, leading to

(4.12) |⟨↑_y|↑_x⟩| = |⟨↓_y|↑_x⟩| = 1/√2.

Thus we obtain

(4.13) (1/2) |1 ± e^{i(δ_1−δ_2)}| = 1/√2,

which is satisfied only if

δ2 − δ1 = ±π/2.

We see that the matrix elements of S_z, S_x and S_y cannot all be real. The standard representation is δ_1 = 0 and δ_2 = π/2, so that S_i = (1/2) σ_i, i = x, y, z, with

(4.14) σ_x = (0 1; 1 0),   σ_y = (0 −i; i 0),   σ_z = (1 0; 0 −1).
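A quick numerical consistency check of this representation with numpy: the three operators S_i = σ_i/2 have eigenvalues ±1/2, and the overlap |⟨↑_z|↑_x⟩|² = 1/2 reproduces the 50/50 splitting of Example 4.1.

import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

for s in (sx, sy, sz):
    print(np.linalg.eigvalsh(s / 2))        # eigenvalues -0.5 and +0.5

up_z = np.array([1, 0], dtype=complex)
w, v = np.linalg.eigh(sx / 2)
up_x = v[:, np.argmax(w)]                   # eigenvector of S_x with eigenvalue +1/2

print(abs(np.vdot(up_z, up_x)) ** 2)        # 0.5: half the beam in each output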

EXERCISE 4.1. (1) Verify the following commutation relations of the Pauli matrices σx, σy and σz.

[σ_x, σ_y] = 2iσ_z,   [σ_y, σ_z] = 2iσ_x,   [σ_z, σ_x] = 2iσ_y.

(2) Compute the anticommutators {σ_x, σ_y}, {σ_y, σ_z}, {σ_z, σ_x}.
(3) Compute σ_x², σ_y² and σ_z².
(4) Show that S² = S_x² + S_y² + S_z² is a multiple of the identity operator.

The description and analysis of spin states that point in arbitrary directions, as specified by a unit vector n goes as follows. Let

n = (nx, ny, nz) = (sin θ cos φ, sin θ sin φ, cos θ) , 0 ≤ θ ≤ π, 0 ≤ φ < 2π, be a unit vector. Here θ and φ are the familiar polar and azimuthal angles. We can consider S = (Sx,Sy,Sz), and define the spin operator Sn along n as

Sn = n · S = nxSx + nySy + nzSz.

EXERCISE 4.2. (1) Check that, just like S_x, S_y and S_z, the eigenvalues of S_n are ±1/2.
(2) Show that the eigenvectors |±; n⟩ of S_n can be written as

(4.15) |+; n⟩ = cos(θ/2) |↑_z⟩ + sin(θ/2) e^{iφ} |↓_z⟩
(4.16) |−; n⟩ = sin(θ/2) |↑_z⟩ − cos(θ/2) e^{iφ} |↓_z⟩.

The unit vector n denotes a location on the unit sphere, or 'Bloch sphere'. Each point on the surface of the sphere represents a (pure) state, i.e. a normalised vector |ψ⟩ ∈ ℂ². For example, |↑_z⟩ corresponds to the north pole (θ = 0), etc.

5. The Schrödinger equation and wave mechanics

In the first half of 1926, E. Schrödinger showed that the rules of quantisation can be replaced by another postulate, in which there occurs no mention of whole numbers. Instead, the introduction of integers arises in the same way as, for example, in a vibrating string, for which the number of nodes is an integer. Schrödinger postulated a sort of wave equation and applied it to a number of problems, including the hydrogen atom and the harmonic oscillator.

Schrödinger's approach to dynamics differs from that of classical mechanics in its aims as well as in its method. Instead of attempting to find equations, such as Newton's equations, which enable a prediction to be made of the exact positions and momenta of the particles of a system at any time, he devised a method of calculating a function Ψ of the coordinates and time (and not the momenta), with the aid of which probabilities for the coordinates or other quantities can be predicted.

The Schrödinger equation enables one to determine a certain function Ψ (of the coordinates and time) called wavefunction or probability amplitude. The square of the modulus of a wavefunction is interpreted as a probability measure for the coordinates of the system in the state represented by this wavefunction.

Besides yielding the probability amplitude Ψ, the Schrödinger equation provides a method of calculating values of the energy of the stationary states of the system. No arbitrary postulates concerning quantum numbers are required in the calculations; instead, integers enter automatically in the process of finding satisfactory solutions of the wave equation.

For our purposes, the Schrödinger equation, the auxiliary restrictions upon the wavefunction and its probabilistic interpretation are conveniently taken as fundamental postulates.

5.1. The Schrödinger equation. Consider a system with one degree of freedom, consisting of a particle of mass m that can move along a straight line, and let us assume that the system is further described by a potential energy function V(x) throughout the region −∞ < x < +∞. The classical Hamiltonian of the system is

(5.1) H(p, x) = p²/(2m) + V(x)

and by the conservation of energy the Hamiltonian is constant on solutions of the Hamilton equations: H(p, x) = W.

Schrödinger's idea was to replace p by the differential operator (ℏ/i) ∂/∂x and W by −(ℏ/i) ∂/∂t, and introduce the function Ψ(x, t) on which these operators can operate:

H((ℏ/i) ∂/∂x, x) Ψ(x, t) = W Ψ(x, t).

For this system the Schrödinger equation is

(5.2) −(ℏ/i) ∂Ψ(x, t)/∂t = −(ℏ²/2m) ∂²Ψ(x, t)/∂x² + V(x) Ψ(x, t).

REMARK. The analogy between the terms in the wave equation and the clas- sical energy equation has only a formal significance.

5.2. The time-independent Schrödinger equation. In order to solve (5.2) let us (as usual in the analysis of partial differential equations) first study the nontrivial solutions Ψ (if any exist) which can be expressed as a product of two functions, one involving the time alone and the other the coordinate alone:

(5.3) Ψ(x, t) = ψ(x)ϕ(t).

On introducing this in Eq. (5.2) we get

(1/ψ(x)) [−(ℏ²/2m) ∂²ψ(x)/∂x² + V(x) ψ(x)] = −(ℏ/i) (1/ϕ(t)) ∂ϕ(t)/∂t.

The right-hand side of this equation is a function of time t alone and the left-hand side a function of x alone. It is consequently necessary that the quantity to which each side is equal depend on neither x nor t; that is, that it be a constant.

Let us call it E. Then,

(5.4) −(ℏ/i) ∂ϕ(t)/∂t = E ϕ(t),
      −(ℏ²/2m) ∂²ψ(x)/∂x² + V(x) ψ(x) = E ψ(x).

The second equation is customarily written as a spectral problem for the Schrödinger operator −(ℏ²/2m) d²/dx² + V(x), i.e.,

(5.5) [−(ℏ²/2m) d²/dx² + V(x)] ψ(x) = E ψ(x).

Eq. (5.5) is often called the time independent Schrödinger equation, or amplitude equation, inasmuch as ψ(x) determines the amplitude of Ψ(x, t).

It is found that typically the equation possesses various solutions, corresponding to various values of the constant E. Let us denote these values of E by attaching a subscript n, and similarly the amplitude corresponding to E_n as ψ_n(x). The corresponding equation for ϕ(t) can be integrated at once to give

(5.6) ϕ_n(t) = e^{−iE_n t/ℏ}.

The general solution of the Schrödinger equation is a linear combination of all the particular solutions with arbitrary coefficients

(5.7) Ψ(x, t) = Σ_n a_n Ψ_n(x, t) = Σ_n a_n ψ_n(x) e^{−iE_n t/ℏ}.

5.3. Wave functions: discrete and continuous sets of energy values. It is found that satisfactory solutions ψ_n(x) exist only for certain values of the parameter E_n. The values E_n are called eigenvalues of the Schrödinger operator. It will be shown later that the physical interpretation of the wavefunction requires that the values E_n represent the energy of the system in its various stationary states. In addition to being a solution of the Schrödinger equation, the wavefunction must be single-valued, continuous and square integrable.

For a given system, the energy levels E_n may occur only as a set of discrete values, or as a set of values covering a continuous range, or as both. From analogy with spectroscopy it is often said that in these three cases the energy values comprise a discrete spectrum, a continuous spectrum, or both. The way in which the postulates regarding the wave equation and its acceptable solutions lead to the selection of definite energy values may be understood by the qualitative consideration of a simple example. Let us consider, for a system of one degree of freedom, that the potential-energy function satisfies V(x) → ∞ as |x| → ∞. For a given value of the energy parameter E, the wave equation is

(5.8) (ℏ²/2m) d²ψ(x)/dx² = (V(x) − E) ψ(x).

For sufficiently large |x|, the quantity V(x) − E will be positive. Hence in this region the curvature ∂²ψ/∂x² will be positive if ψ is positive, and negative if ψ is negative. Suppose that at an arbitrary point x = c the function ψ has a certain value and a certain slope ∂ψ/∂x. The behaviour of the function, as it is continued both to the right and to the left, is completely determined by the values assigned to these two quantities. We see that, for a given value of E, only by a very careful selection of the slope of the function at the point x = c can the function be made to behave properly for large values of x. Similarly for large negative values of x. This selection is such as to cause the wave function to approach the value zero asymptotically with increasing |x|. In view of the sensitiveness of the curve to the parameter E, an infinitesimal change from this satisfactory value will cause the function to behave improperly. We conclude that the parameter E and the slope at the point x = c (for a given value of the function itself at this point) can have only certain values if ψ is to be an acceptable wave function. For each satisfactory value of E there is one (or, in certain cases, more than one) satisfactory value of the slope, by the use of which the corresponding wave function can be built up. For this system the characteristic values E_n of the energy form a discrete set, and only a discrete set, inasmuch as for every value of E, no matter how large, V(x) − E is positive for sufficiently large values of |x|.

It is customary to number the energy levels for such a system as E_0 (the lowest), E_1 (the next), and so on, corresponding to wavefunctions ψ_0(x), ψ_1(x), etc. The integer n, which is written as a subscript in E_n and ψ_n(x), is called the quantum number.

Let us now consider a system in which the potential-energy function remains finite as x → ∞ or as x → −∞ or at both limits. For a value of E smaller than both V(+∞) and V(−∞) the argument presented above is valid. Consequently the energy levels will form a discrete set in this region. If E is greater than V(+∞), however, a similar argument shows that the curvature will be such as always to return the wave function to the x axis, about which it will oscillate. Hence any value of E greater than V(+∞) or V(−∞) will be an allowed value, corresponding to an acceptable wave function, and the system will have a continuous spectrum of energy values in this region.
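The qualitative argument above can be made concrete numerically: discretising −(ℏ²/2m)ψ'' + V(x)ψ = Eψ on a finite grid and diagonalising the resulting matrix exhibits a discrete set of low-lying eigenvalues when V(x) → ∞ as |x| → ∞. A rough sketch (units ℏ = m = 1 and the confining potential V(x) = x²/2, for which Section 6.1 predicts the levels n + 1/2, are arbitrary choices; so are the grid parameters):

import numpy as np

L, N = 10.0, 2000
x = np.linspace(-L, L, N)
dx = x[1] - x[0]
V = 0.5 * x**2                      # confining potential, V -> infinity as |x| -> infinity

# second-order finite differences for -(1/2) d^2/dx^2 + V(x)
main = 1.0 / dx**2 + V
off = -0.5 / dx**2 * np.ones(N - 1)
H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

print(np.linalg.eigvalsh(H)[:5])    # approximately 0.5, 1.5, 2.5, 3.5, 4.5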

5.4. Probability measures. Let Ψ(x, t) be a normalised solution of the Schrödinger equation; that is,

∫_{−∞}^{+∞} Ψ*(x, t) Ψ(x, t) dx = 1.

For any t, the quantity Ψ*(x, t)Ψ(x, t) dx = |Ψ(x, t)|² dx is a probability measure. It can be interpreted as the probability that the system represented by the state Ψ(x, t) is in the region between x and x + dx at time t.

It is also convenient to normalise the individual amplitude functions ψ_n(x),

∫_{−∞}^{+∞} ψ_n*(x) ψ_n(x) dx = 1.

Moreover, the independent solutions of the amplitude equation can always be chosen in such a way that they form an orthonormal system; that is,

∫_{−∞}^{+∞} ψ_m*(x) ψ_n(x) dx = δ_mn.

Using this relation, it is found that a wavefunction Ψ(x, t) = Σ_n a_n Ψ_n(x, t) is normalised when the coefficients a_n ∈ ℂ satisfy the condition

Σ_n a_n* a_n = 1.

5.5. Stationary states. Let us consider the probability measure Ψ*Ψ for a system in the state Ψ(x, t) = Σ_n a_n ψ_n(x) e^{−iE_n t/ℏ}, with conjugate Ψ*(x, t) = Σ_n a_n* ψ_n*(x) e^{iE_n t/ℏ}. Then,

Ψ*(x, t) Ψ(x, t) = Σ_n a_n* a_n ψ_n*(x) ψ_n(x) + Σ_{m≠n} a_m* a_n ψ_m*(x) ψ_n(x) e^{i(E_m − E_n)t/ℏ}.

In general, then, the probability function and hence the properties of the system depend on the time, inasmuch as the time enters in the exponential factors of the double sum. Only if the coefficients are zero for all except one value of E_n is Ψ*Ψ independent of t. In such a case the wavefunction will contain only a single term Ψ_n(x, t) = ψ_n(x) e^{−iE_n t/ℏ}. For such a state the probability measure Ψ*Ψ is independent of time, and the state is called a stationary state.

5.6. Average values. If we inquire as to what average value would be expected on measurement at a given time t of the coordinate x of the system in a physical situation represented by the wavefunction Ψ(x, t), the above interpretation of Ψ*Ψ leads to the answer

⟨x⟩ = ∫_{−∞}^{+∞} x Ψ*(x, t) Ψ(x, t) dx.

A similar integral gives the average value predicted for x², x³, or any function f(x) of the coordinate x:

(5.9) ⟨f(x)⟩ = ∫_{−∞}^{+∞} f(x) Ψ*(x, t) Ψ(x, t) dx.

In order that the same question can be answered for a more general dynamical function f(p, x) involving the momentum p as well as the coordinate x, we assume that the average value of f(p, x) predicted for a system in the physical situation represented by the wavefunction Ψ is given by the integral

(5.10) ⟨f(p, x)⟩ = ∫_{−∞}^{+∞} Ψ*(x, t) f((ℏ/i) ∂/∂x, x) Ψ(x, t) dx,

in which the operator f((ℏ/i) ∂/∂x, x), obtained from f(p, x) by replacing p by (ℏ/i) ∂/∂x, acts on the function Ψ(x, t).

REMARK. In some cases further considerations are necessary in order to determine the exact form of the operator f((ℏ/i) ∂/∂x, x), but we shall not encounter such difficulties.

In general, the result of a measurement of f will not be given by the average value ⟨f⟩; it is instead described by a probability distribution. Even if the system is in a stationary state Ψ_n(x, t) = ψ_n(x) e^{−iE_n t/ℏ}, the variance Var(f) = ⟨f²⟩ − ⟨f⟩² of an arbitrary f is, in general, not zero. The energy of the system, corresponding to the Hamiltonian function H(p, x), has, however, a definite value for a stationary state Ψ_n of the system, equal to E_n. To prove this, we evaluate ⟨H⟩ and Var(H). The average value of H for the state Ψ_n is given by

⟨H⟩ = ∫_{−∞}^{+∞} ψ_n*(x) [−(ℏ²/2m) ∂²/∂x² + V(x)] ψ_n(x) dx = E_n.

By a similar computation ⟨H^r⟩ = E_n^r, r = 0, 1, 2, .... In particular, this shows that Var(H) = 0; thus, the energy of the system in the state Ψ_n has a definite value E_n.

REMARK. Note that the probabilistic interpretation of Ψ(x, t) is unchanged if we multiply the wavefunction by a complex number α of modulus |α| = 1. Thus, a state is an equivalence class of wavefunctions of the form e^{iθ} Ψ(x, t), θ ∈ ℝ. (This is especially relevant when discussing symmetries of a quantum system. More on this later.)

EXERCISE 5.1 (Probability conservation law). Show that, if Ψ(x, t) is a solution of the Schrödinger equation

iℏ ∂Ψ(x, t)/∂t = H Ψ(x, t),   x ∈ ℝⁿ,

with H = −(ℏ²/2m) Δ + V(x), then the probability density |Ψ(x, t)|² satisfies the continuity equation (conservation of probability):

∂|Ψ|²/∂t + ∇ · J = 0,

where the probability current is

J = (iℏ/2m) (Ψ ∇Ψ* − Ψ* ∇Ψ).

EXERCISE 5.2 (Ehrenfest theorem). Prove the Ehrenfest theorem: if H(p, q) is the classical Hamiltonian of the system, then the expectation values ⟨q⟩ and ⟨p⟩ of the quantised variables satisfy the classical equations of motion

d⟨q⟩/dt = +⟨∂H/∂p⟩,
d⟨p⟩/dt = −⟨∂H/∂q⟩.

(This is an instance of the correspondence principle.)

6. Exactly solvable models

6.1. The quantum harmonic oscillator. You may be familiar with several examples of harmonic oscillators from classical mechanics, such as particles on a spring or the pendulum for small deviations from equilibrium, etc. The classical Hamiltonian of a particle in a potential V(x) = (1/2) mω²x² is

(6.1) H = p²/(2m) + (1/2) mω²x².

The corresponding quantum Hamiltonian is

(6.2) H = −(ℏ²/2m) d²/dx² + (1/2) mω²x².

Using dimensionless coordinates, the corresponding Schrödinger equation is

(6.3) [−(1/2) d²/dx² + (1/2) x²] ψ(x) = E ψ(x).

There is a very useful procedure to solve (6.3) based on the determination of the form of ψ in the regions of large |x|, and the subsequent discussion, by the introduction of a factor in the form of a power series (which later reduces to a polynomial), of the value of ψ for generic x ∈ ℝ. This procedure may be called the polynomial method.

6.1.1. Eigenvalues and eigenfunctions of the harmonic oscillator. The first step is the asymptotics of the solution of (6.3) when |x| ≫ 1. For large |x|, E is negligibly small relative to x²/2, and the asymptotic form of the Schrödinger equation becomes

d²ψ/dx² = x²ψ.

This equation is asymptotically satisfied by e^{−x²/2} and e^{x²/2}. Of the two asymptotic solutions, the second tends rapidly to infinity with increasing values of |x|; the first, however, leads to a satisfactory treatment of the problem. We now proceed to obtain a solution of (6.3) throughout the whole configuration space x ∈ ℝ, based upon the asymptotic solution, by introducing as a factor a power series in x and determining its coefficients by substitution and recursion.

Let ψ(x) = H(x) e^{−x²/2}. Then

d²ψ/dx² = e^{−x²/2} [(x² − 1) H(x) − 2x H'(x) + H''(x)].

If ψ(x) solves the Schrödinger equation, then we get a differential equation for H(x):

(6.4) H''(x) − 2x H'(x) + (2E − 1) H(x) = 0.

We now represent H(x) as a power series

(6.5) H(x) = Σ_{k≥0} a_k x^k,   H'(x) = Σ_{k≥0} k a_k x^{k−1},   H''(x) = Σ_{k≥0} k(k − 1) a_k x^{k−2}.

On substitution of these expressions, Eq. (6.4) assumes the form

Σ_{k≥0} [(k + 2)(k + 1) a_{k+2} − 2k a_k + (2E − 1) a_k] x^k = 0.

In order for this series to vanish for all values of x (i.e., for H(x) to be a solution of (6.4)), the coefficients of each power of x must vanish:

(k + 2)(k + 1) a_{k+2} + (2E − 2k − 1) a_k = 0, or

(6.6) a_{k+2} = − [2(E − k − 1/2) / ((k + 2)(k + 1))] a_k.

This recursion formula enables one to calculate successively the coefficients a_2, a_3, a_4, ... in terms of a_0 and a_1, which are arbitrary. Note that if a_0 = 0 (resp. a_1 = 0) only odd (resp. even) powers appear. For arbitrary values of the parameter E, the above series consists of an infinite number of terms and does not correspond to a satisfactory wavefunction. To see this, note that for large enough values of k, we have

a_{k+2} ≃ (2/k) a_k.

But this is the same recursion as for the series coefficients of e^{x²},

e^{x²} = 1 + x² + x⁴/2! + x⁶/3! + ··· + x^k/(k/2)! + x^{k+2}/((k+2)/2)! + ···,

so that the higher terms of the series for H(x) differ from those of e^{x²} only by a multiplicative constant. Therefore, for large values of |x|, H(x) ∼ e^{x²}, and the product e^{−x²/2} H(x) will diverge like e^{x²/2}, thus making it unacceptable as a wavefunction.

We must therefore restrict E to values which will cause the series for H(x) to terminate, leaving a polynomial. The value of E which causes the series to break off after the nth term is seen from (6.6) to be

(6.7) E = n + 1/2.

It is, moreover, also necessary that the value of either a_0 or of a_1 be put equal to zero, according as n is odd or even. The solutions are thus either odd or even functions of x. The normalised solutions of the Schrödinger equation may be written in the form

(6.8) ψ_n(x) = N_n e^{−x²/2} H_n(x).

H_n(x) is the polynomial of degree n determined by the recursion (6.6) with E = n + 1/2, and N_n is a constant which is adjusted so that ∫ |ψ_n(x)|² dx = 1 (this condition fixes the initial value a_0 or a_1 of the recursion). The polynomials H_n(x), called Hermite polynomials, did not originate with Schrödinger's work but were well known to mathematicians in connection with other problems. They can be defined using the formula

(6.9) H_n(x) = e^{x²} (−d/dx)^n e^{−x²}.

Another definition makes use of the generating function

(6.10) S(x, y) = Σ_{n=0}^∞ (H_n(x)/n!) y^n = e^{x² − (y−x)²}.

EXERCISE 6.1. Show the equivalence of the two definitions (6.9)-(6.10) of Hn(x).

It is easy to show that S(x, y) satisfies the following identities:

(6.11) ∂S/∂y = −2(y − x) S
(6.12) ∂S/∂x = 2y S.

EXERCISE 6.2. Using (6.11) and (6.12) show that the Hermite polynomials satisfy the three-term recursion

(6.13) Hn+1(x) − 2xHn(x) + 2nHn−1(x) = 0

(with initial conditions H_0(x) = 1, H_1(x) = 2x), and the differential equation

(6.14) H_n''(x) − 2x H_n'(x) + 2n H_n(x) = 0.

In particular, the latter is just (6.4) if we put E = n + 1/2.

The functions

(6.15) ψ_n(x) = N_n e^{−x²/2} H_n(x),   with N_n = 1/√(n! 2^n √π),

are called Hermite functions. They satisfy the orthogonality relation (why?)

(6.16) ∫_{−∞}^{+∞} ψ_n(x) ψ_m(x) dx = δ_nm.

EXAMPLE 6.1. By using the generating function S we can evaluate certain integrals involving ψ_n which are of importance. For example, we may study the integral which determines the probability of transition from the state n to the state m. This is

⟨n|x|m⟩ = ∫_{−∞}^{+∞} ψ_n(x) x ψ_m(x) dx = N_n N_m ∫_{−∞}^{+∞} x H_n(x) H_m(x) e^{−x²} dx.

Using the generating functions S(x, y) and S(x, z) we obtain the relation

∫_{−∞}^{+∞} x S(x, y) S(x, z) e^{−x²} dx = Σ_{n,m} (y^n z^m/(n! m!)) ∫_{−∞}^{+∞} x H_n(x) H_m(x) e^{−x²} dx
  = e^{2yz} ∫_{−∞}^{+∞} x e^{−(x−y−z)²} dx
  = e^{2yz} (y + z) √π
  = √π [ (y + 2y²z + 2²y³z²/2! + ··· + 2^n y^{n+1} z^n/n! + ···) + (z + 2yz² + 2²y²z³/2! + ··· + 2^n y^n z^{n+1}/n! + ···) ].

Hence, comparing the coefficients of y^n z^m, we see that ⟨n|x|m⟩ is zero except for m = n ± 1, and

⟨n|x|n+1⟩ = √((n+1)/2),   ⟨n|x|n−1⟩ = √(n/2).

6.1.2. Creation and annihilation operators. The harmonic oscillator is of importance for general theory, because it forms a cornerstone in the theory of radiation. An alternative method of solution is based on the creation and annihilation operators. They are defined as

(6.17) a* = (1/√2)(x − ip) = (1/√2)(x − ∂_x)
(6.18) a = (1/√2)(x + ip) = (1/√2)(x + ∂_x).

Note that a and a* do not commute:

(6.19) [a, a*] = 1.

Using the properties of the Hermite functions it is easy to verify that

(6.20) a ψ_n(x) = √n ψ_{n−1}(x)
(6.21) a* ψ_n(x) = √(n+1) ψ_{n+1}(x).

We can also show that the number operator N = a*a satisfies

(6.22) N ψ_n(x) = n ψ_n(x).

We have the following interpretations: a* creates a 'quantum' of energy, a annihilates a 'quantum' of energy, N counts the number of quanta of energy. We write

(6.23) a* |n⟩ = √(n+1) |n+1⟩
(6.24) a |n⟩ = √n |n−1⟩
(6.25) N |n⟩ = n |n⟩.

Note that H = a*a + 1/2 = N + 1/2, and H |n⟩ = (n + 1/2) |n⟩. The eigenstates |n⟩ are obtained by the repeated action of the creation operator on the ground state:

(6.26) |n⟩ = (1/√(n!)) (a*)^n |0⟩.

EXERCISE 6.3. Prove by induction that

[a, (a*)^n] = n (a*)^{n−1},   for all n ∈ ℕ.
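A truncated matrix representation of a and a* makes the relations (6.19)-(6.25) easy to check numerically; the sketch below is a rough illustration (the cutoff size nmax is arbitrary, and the last row/column is affected by the truncation, so the commutator check stays away from the edge).

import numpy as np

nmax = 20
a = np.diag(np.sqrt(np.arange(1, nmax)), 1)    # a|n> = sqrt(n)|n-1>
adag = a.conj().T                              # a*|n> = sqrt(n+1)|n+1>
N = adag @ a
H = N + 0.5 * np.eye(nmax)                     # H = a*a + 1/2 in the units of (6.3)

comm = a @ adag - adag @ a
print(np.allclose(comm[:-1, :-1], np.eye(nmax - 1)))   # [a, a*] = 1 away from the cutoff
print(np.diag(N)[:5])                                  # 0, 1, 2, 3, 4
print(np.diag(H)[:5])                                  # 0.5, 1.5, 2.5, 3.5, 4.5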

6.2. A quantum particle in a box. The classical Hamiltonian (energy) of a particle in a box [0, L] is

(6.27) H = p²/(2m).

The corresponding quantum Hamiltonian is formally

(6.28) H = −(ℏ²/2m) d²/dx².

This corresponds to a force-free (V = 0) situation in the interior of the box. We have to take into account the walls of the box, i.e. the boundary conditions (b.c.). We may assume that the wavefunction vanishes at the walls: ψ(0) = 0 and ψ(L) = 0 (so-called Dirichlet b.c.). The eigenfunctions of the Schrödinger operator H with Dirichlet b.c. satisfy the problem

(6.29) −(ℏ²/2m) d²ψ(x)/dx² = E ψ(x)   for 0 < x < L,
       ψ(x) = 0   at x = 0 and x = L.

We see that a generic solution ψ is of the form

ψ(x) = A e^{iαx} + B e^{−iαx}

with α² = 2mE/ℏ² and constants A and B. Imposing the boundary conditions:

ψ(0) = 0 ⇒ A + B = 0,
ψ(L) = 0 ⇒ A e^{iαL} + B e^{−iαL} = 0.

This imposes A = −B and αL = πk, k = 1, 2, .... The value of A is fixed by the normalisation condition ∫_0^L |ψ(x)|² dx = 1. Therefore we find that the Schrödinger equation in a box with Dirichlet b.c. has eigenfunctions

(6.30) ψ_k = √(2/L) sin(πk x/L)   with eigenvalues E_k = (π²ℏ²/(2mL²)) k²,   k = 1, 2, ....

Note that a particle in a box has a minimum energy

(6.31) E_1 = π²ℏ²/(2L²m) > 0.

This is a manifestation of the uncertainty principle. If we know that a particle is localised in [0, L], then the momentum p_x has uncertainty ∆p_x ≥ 2πℏ/L. This amounts to not knowing whether the particle moves with a momentum |p_x| = πℏ/L in the positive or negative sense. The energy corresponding to this momentum is E = p_x²/2m = π²ℏ²/(2L²m) = E_1. This is a simple method to obtain a rough estimate of the minimum energy of a system. (It can be used to estimate the energy of electrons in an atom or of nucleons in a nucleus if the atomic or nuclear radius, i.e., the size of the box, is known.)

EXAMPLE 6.2. Consider the Hamiltonian H of a free particle in a box of length L with Dirichlet b.c. Suppose that a particle (of mass m) is in the state ψ(x), where

ψ(x) = A x(L − x)

if 0 ≤ x ≤ L and zero otherwise. The normalisation constant is A = √(30/L⁵). Suppose that we measure H in the state ψ(x). The mean value of the energy is

⟨H⟩ = ∫ ψ*(x) H ψ(x) dx = (Aℏ²/m) ∫_0^L ψ(x) dx = 5ℏ²/(mL²).

To find the most probable outcome of the measurement we expand the wavefunction ψ(x) in the eigenfunctions of the Hamiltonian, ψ(x) = Σ_{n≥1} c_n ψ_n(x), where

c_n = ∫ ψ_n*(x) ψ(x) dx.

A simple calculation reveals that the most probable value is E_1 = π²ℏ²/(2L²m), with probability |c_1|² = 960/π⁶ ≃ 0.998.
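The numbers quoted in Example 6.2 are easy to reproduce numerically; here is a small sketch (units ℏ = m = L = 1 are an arbitrary choice):

import numpy as np

L = 1.0
A = np.sqrt(30 / L**5)
x = np.linspace(0, L, 20001)
dx = x[1] - x[0]
psi = A * x * (L - x)
phi1 = np.sqrt(2 / L) * np.sin(np.pi * x / L)    # ground state of the box

# <H> = (A hbar^2/m) * integral of psi, with hbar = m = 1
meanH = A * np.sum(psi) * dx
print(meanH, 5 / L**2)               # both equal 5 hbar^2/(m L^2)

c1 = np.sum(phi1 * psi) * dx
print(c1**2, 960 / np.pi**6)         # ~ 0.9986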

EXAMPLE 6.3 (Quantum fractals). Let H_D be the Hamiltonian of a free particle in the box [0, L] with Dirichlet b.c. The simplicity of the spectrum and eigenfunctions is deceptive and could lead us to think that the dynamics, which is the solution to the Schrödinger equation, is very simple. In fact, this belief is false, as shown by M. V. Berry: the dynamics is instead very intricate. Consider the evolution iℏ ∂_t ψ = H_D ψ with initial condition ψ(x, 0) = √(1/L) (uniform initial state). The wavefunction ψ(x, t) at time t ∈ ℝ is given by ψ(x, t) = θ(ξ, τ), where ξ = x/L, τ = 2πℏt/(mL²), and

θ(ξ, τ) = const · Σ_{k∈ℤ} (1/(k + 1/2)) e^{2πiξ(k + 1/2) − iπτ(k + 1/2)²}.

(This is related to a Jacobi theta function.) It is easy to check that the wavefunction is quasiperiodic, θ(ξ, τ + 1) = e^{−iπ/4} θ(ξ, τ). Thus, at integer times τ the wave function comes back (up to a phase) to its initial flat form: these are the quantum revivals. More generally, at rational values of τ the graph of |θ(ξ, τ)|² is piecewise constant and there is a partial reconstruction of the initial wavefunction. On the other hand, it can be proved that at irrational times the wavefunction is a fractal in space and time and forms a beautifully intricate quantum carpet.

EXERCISE 6.4. Different boundary conditions correspond to different Schrödinger operators, i.e., different physical systems. Show that the stationary states ψ_k(x) and energy levels E_k of a particle in a box with Neumann b.c. (ψ'(0) = 0 and ψ'(L) = 0) are

ψ_0(x) = √(1/L),   ψ_k(x) = √(2/L) cos(πk x/L),   k = 1, 2, ...,

with E_k = (π²ℏ²/(2L²m)) k²,   k = 0, 1, 2, ....

EXERCISE 6.5. Diagonalise the Hamiltonian −(ℏ²/2m) d²/dx², 0 ≤ x ≤ L, with periodic b.c. ψ(0) = ψ(L) (quantum particle in a ring).

REMARK. The strict positivity of the energy is a property of the Dirichlet boundary conditions. Indeed, with the Neumann ones the energy of the ground state is 0, and, more surprisingly, it is even negative with Robin boundary conditions.

6.3. Free evolution of a quantum particle in ℝⁿ. The evolution of the state of a particle of mass m > 0 in n dimensions in the absence of a potential (i.e., free evolution, V(x) = 0) is governed by the Hamiltonian

(6.32) H = −(ℏ²/2m) Δ.

The operator H is essentially self-adjoint, and we will identify it with its unique self-adjoint extension. If ψ_0 is the state at time t = 0, the state ψ(x, t) at time t is given by the solution of the free Schrödinger equation:

(6.33) iℏ ∂ψ/∂t + (ℏ²/2m) Δψ = 0,   x ∈ ℝⁿ, t ∈ ℝ,

with initial condition ψ(x, 0) = ψ_0(x) (here we can assume ψ_0 ∈ L²(ℝⁿ) ∩ L¹(ℝⁿ)).

Let us first consider the momentum of a free particle. The wavefunction in momentum representation is the Fourier transform of the state in position representation,

(6.34) ψ̂(p, t) = (1/(2π)^{n/2}) ∫_{ℝⁿ} ψ(x, t) e^{−ipx} dx.

It is not hard to show, using the Schrödinger equation, that

(6.35) ψ̂(p, t) = e^{−itℏ|p|²/2m} ψ̂_0(p).

Thus, |ψ̂(p, t)|² = |ψ̂_0(p)|². Hence the probability density of the momentum p is not changing at all in time. This is what we expect, since the momentum of a free particle is a constant of motion.

The free Schrödinger equation (6.33) has an explicit solution as inverse Fourier transform of (6.34):

(6.36) ψ(x, t) = (1/(2π)^{n/2}) ∫_{ℝⁿ} ψ̂_0(p) e^{(tℏ/(2mi))|p|² + ipx} dp.

The above formula can be interpreted as a superposition of plane waves with amplitude determined by the initial condition (in momentum space) ψ̂_0(p). Using the convolution theorem for the Fourier transform, we can write

(6.37) ψ(x, t) = (2m/(4πiℏt))^{n/2} ∫_{ℝⁿ} e^{im|x−y|²/(2ℏt)} ψ_0(y) dy = ∫_{ℝⁿ} G(x, t; y, 0) ψ_0(y) dy.

We see that the solution is given by the convolution of the free particle propagator

(6.38) G(x, t; x', t') = (m/(2πiℏ(t − t')))^{n/2} e^{im|x−x'|²/[2ℏ(t−t')]}

and the initial condition ψ_0. It is easy to show using this explicit representation of the solution that for any ψ_0 ∈ L²(ℝⁿ), ‖ψ(·, t)‖_∞ → 0 as t → ∞. So, the probability density |ψ(x, t)|² of the position flattens as t → ∞. In other words, there is no limiting probability distribution for the position of the particle; it spreads out across the whole space.
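In one dimension the flattening of |ψ(x, t)|² can be observed numerically by evolving an initial Gaussian with (6.35) and transforming back with an FFT; the sketch below is a rough illustration (units ℏ = m = 1, and the grid and the initial width are arbitrary choices).

import numpy as np

N, Lbox = 4096, 200.0
x = np.linspace(-Lbox / 2, Lbox / 2, N, endpoint=False)
dx = x[1] - x[0]
psi0 = (1 / np.pi) ** 0.25 * np.exp(-x**2 / 2)   # normalised Gaussian

p = 2 * np.pi * np.fft.fftfreq(N, d=dx)          # momentum grid
psi0_hat = np.fft.fft(psi0)

for t in (0.0, 1.0, 5.0, 20.0):
    # multiply by the free propagator in momentum space, eq. (6.35)
    psi_t = np.fft.ifft(np.exp(-1j * t * p**2 / 2) * psi0_hat)
    print(t, np.abs(psi_t).max())                # the sup norm decays as t grows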

6.4. Scattering processes. Problems for which the energy eigenvalues are continuously distributed usually arise in connection with the collision of a particle with a force field. The method of approach is different from that employed for systems with discrete levels. There the boundary conditions were used to determine the discrete energy levels of the particle. In a scattering problem, the energy is specified in advance, and the behaviour of the wavefunction at great distances (the boundary values) is found in terms of it. This asymptotic behaviour can then be related to the amount of scattering of the particle by the force field. We discuss now a couple of exact solutions for simplified models that are routinely used for approximate calculations on more complicated systems.

Consider the one-dimensional scattering of a particle by a potential V(x). Suppose that V(x) is nondecreasing and that

lim_{x→+∞} V(x) = V_0 > 0,   lim_{x→−∞} V(x) = 0.

In classical mechanics, a particle coming from the left with energy E < V0 can not reach +∞: the potential is a barrier. In quantum mechanics there are new phenomena:

(i) If E < V_0 the particle can penetrate the barrier. In general, there is a nonzero probability that the particle is in the classically forbidden region {x : E < V(x)};
(ii) If E > V_0 there is the possibility that the particle is reflected by the barrier.

These properties will be shown by calculating the probabilities of transmission and reflection. The wavefunction of a particle moving to the right has the asymptotic form

(6.39) ψ(x) ∼ α e^{ik_2 x}   as x → +∞,   with k_2 = √(2m(E − V_0))/ℏ

(a particle with energy E, moving in a constant potential V_0, with positive momentum p). For large negative x, the particle is approximately free. The wave function has the asymptotic form

(6.40) ψ(x) ∼ e^{ik_1 x} + β e^{−ik_1 x}   as x → −∞,   with k_1 = √(2mE)/ℏ

(a superposition of an incident wave with intensity 1 and a reflected wave). The probability current J ∝ (ψ ∂_x ψ* − ψ* ∂_x ψ) (see Exercise 5.1) is

incident wave:     |J| ∝ k_1,
reflected wave:    |J| ∝ |β|² k_1,
transmitted wave:  |J| ∝ |α|² k_2.

The transmission and reflection coefficients are defined as

T = |J_t|/|J_i| = (k_2/k_1) |α|²,   R = |J_r|/|J_i| = |β|².

Since the probability is conserved, we must have R + T = 1.

EXAMPLE 6.4. Consider a particle moving along the x-axis with energy E ≥ V_0 > 0 in a potential

V(x) = V_0 if x > 0,   V(x) = 0 if x < 0.

We compute the reflection coefficient. The (exact, not asymptotic!) solution of the Schrödinger equation is

ψ(x) = α e^{ik_2 x} if x > 0,   ψ(x) = e^{ik_1 x} + β e^{−ik_1 x} if x < 0.

If we impose the continuity of ψ and ψ' at x = 0, we determine

β = (k_1 − k_2)/(k_1 + k_2).

Therefore

R = |β|² = ((k_1 − k_2)/(k_1 + k_2))².

Since k_1 ≠ k_2, in general there is reflection, even if the energy E > V_0 is large enough for a classical particle to overcome the barrier. If E = V_0 (k_2 = 0), there is total reflection. There is total reflection even if E < V_0 (k_2 purely imaginary). Finally, if E ≫ V_0, then R ∼ (V_0/4E)² → 0: there is total transmission.
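The limiting behaviours just described are easy to tabulate numerically; a small sketch (in units ℏ = 2m = 1, so that k_1 = √E and k_2 = √(E − V_0); the value V_0 = 1 is an arbitrary choice):

import numpy as np

V0 = 1.0

def R(E):
    k1, k2 = np.sqrt(E), np.sqrt(E - V0)
    return ((k1 - k2) / (k1 + k2)) ** 2

for E in (1.0001, 1.5, 2.0, 10.0, 100.0):
    print(E, R(E), (V0 / (4 * E)) ** 2)   # R -> 1 near E = V0, and R ~ (V0/4E)^2 for E >> V0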

EXAMPLE 6.5. Consider a potential barrier: the potential energy is positive in the interval 0 < x < a and is zero outside this region,

V(x) = V_0 if 0 < x < a,   V(x) = 0 otherwise.

We examine the case E > V_0 first. In this case

ψ(x) = e^{ik_1 x} + A e^{−ik_1 x} if x < 0,
ψ(x) = B e^{ik_2 x} + B' e^{−ik_2 x} if 0 < x < a,
ψ(x) = C e^{ik_1 x} if x > a.

Imposing the condition that ψ and ψ' are continuous at x = 0 and x = a, we obtain a linear system for the coefficients A, B, B', C:

1 + A = B + B'
k_1(1 − A) = k_2(B − B')
C e^{ik_1 a} = B e^{ik_2 a} + B' e^{−ik_2 a}

C k_1 e^{ik_1 a} = k_2 (B e^{ik_2 a} − B' e^{−ik_2 a}).

If we set κ = k_1/k_2, then

B = 2κ(1 + κ) / [(1 + κ)² − (1 − κ)² e^{2ik_2 a}],
B' = 2κ(1 − κ) e^{2ik_2 a} / [(1 + κ)² − (1 − κ)² e^{2ik_2 a}],
C = 4κ e^{i(k_2 − k_1)a} / [(1 + κ)² − (1 − κ)² e^{2ik_2 a}].

It follows that the transmission coefficient is

T = |C|² = 4k_1²k_2² / [4k_1²k_2² + (k_1² − k_2²)² sin²(k_2 a)] = [1 + V_0² sin²(k_2 a)/(4E(E − V_0))]^{−1}.

Note a new phenomenon: the barrier becomes transparent (T = 1) if k_2 a = nπ. In the case E < V_0 we have k_2 = i√(2m|E − V_0|)/ℏ = i x_2. Using the previous formulae we find the transmission coefficient

T = 4k_1²x_2² / [4k_1²x_2² + (k_1² + x_2²)² sinh²(x_2 a)] = [1 + V_0² sinh²(x_2 a)/(4E(V_0 − E))]^{−1}.

In classical mechanics the particle would not penetrate the barrier. In quantum mechanics, the solution of the Schrödinger equation shows that, contrary to classical expectations, there is a finite probability that an incident particle will be transmitted (T > 0). This effect goes under the name of tunnel effect.

When x_2 a → ∞ it is easy to find the asymptotic formula

T ∼ (16E(V_0 − E)/V_0²) e^{−√(8m(V_0 − E)) a/ℏ}.
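A quick numerical comparison of the exact expression for T with the asymptotic formula shows how rapidly the tunnelling probability decays with the barrier width; the sketch is a rough illustration (units ℏ = 2m = 1, so x_2 = √(V_0 − E); the values of V_0 and E are arbitrary choices).

import numpy as np

V0, E = 2.0, 1.0
x2 = np.sqrt(V0 - E)

def T_exact(a):
    return 1.0 / (1.0 + V0**2 * np.sinh(x2 * a)**2 / (4 * E * (V0 - E)))

def T_asym(a):
    return 16 * E * (V0 - E) / V0**2 * np.exp(-2 * x2 * a)

for a in (1.0, 2.0, 5.0, 10.0):
    print(a, T_exact(a), T_asym(a))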

EXERCISE 6.6. Find an expression for the transmission coefficient in the limit V0 → +∞, a → 0, with V0a = K constant. (This corresponds to a scattering process by a δ-potential.)

6.5. The one-dimensional crystal. A crystal consisting of a vast number of positive ions and electrons is a paradigmatic example of a many-body quantum mechanical system. It is possible to derive some of its salient features from a simplified model of a quantum particle moving in a periodic potential

(6.41) H = −(ℏ²/2m) d²/dx² + V(x),   with V(x + d) = V(x),   x ∈ [0, Nd].

In words, we model the crystal lattice of ions as a crystal ring with N sites with a periodic potential with period d (the lattice spacing).

The fact that the potential V(x) is periodic of period d does not mean that the eigenfunctions have to be periodic with the lattice period. The Schrödinger equation remains unchanged if we replace x with x + d, but this leaves a constant factor in its eigenfunctions undetermined, so that

ψ(x + d) = µ_1 ψ(x),

where µ_1 is a constant. If we move over one more lattice spacing, the same argument holds true. Hence

ψ(x + 2d) = µ_1 µ_2 ψ(x).

If we continue N times, we get back to where we were, so that

ψ(x + Nd) = µ_1 µ_2 ··· µ_N ψ(x) = ψ(x)   ⇒   µ_1 µ_2 ··· µ_N = 1.

Because of the symmetry of the crystal ring, it cannot make any difference at which lattice point we started, and we conclude that all the µ_k's must be equal. Hence ψ(x + Nd) = µ^N ψ(x), or

µ = e^{2πin/N},   n = 0, 1, ..., N − 1.

We have, therefore, ψ(x + md) = e^{2πinm/N} ψ(x). This statement can also be expressed in the following form

(6.42) ψ(x) = e^{2πin x/(Nd)} u(x) = e^{ikx} u(x),

where k = 2πn/(Nd) and u(x) is periodic of period d. Eigenfunctions of this form are called Bloch waves and it can be proved that they form a basis (this is known as Bloch's theorem or Floquet's theorem). Although ψ(x) is not d-periodic, it is easy to see that the probability density |ψ(x)|² is periodic with the lattice period.

If we want to go on with the solution of the Schrödinger equation, we have to specify V(x). A mathematically convenient choice is the Kronig-Penney (1931) potential (a succession of square well potentials of width a and depth v). Considering a single period of the potential, the periodic solution u(x) is

(6.43) u(x) = A e^{i(α−k)x} + B e^{−i(α+k)x},   α² = 2mE/ℏ²,   if 0 ≤ x < a,
       u(x) = C e^{(β−ik)x} + D e^{−(β+ik)x},   β² = 2m(v − E)/ℏ²,   if a < x ≤ d,

with the periodicity conditions u(0) = u(d), u'(0) = u'(d), and the continuity conditions of u(x) and u'(x) at x = a. These conditions provide four linear homogeneous equations to determine the four constants A, B, C and D. To have a nontrivial solution the determinant of the coefficient matrix of the linear system must vanish. For E < v we get

[(β² − α²)/(2αβ)] sinh(β(d − a)) sin(αa) + cosh(β(d − a)) cos(αa) = cos(kd).

The problem simplifies in the limit a → d, v → ∞ with v(d − a) = K fixed (the potential becomes a periodic array of repulsive δ-functions). In this limit, it turns out that the condition for the existence of nontrivial solutions of the Schrödinger equation is

(6.44) P sin(αd) + cos(αd) = cos(kd),   with α = √(2mE/ℏ²) and P = Km/(αℏ²).

The left-hand side (a function of E) can assume values larger than one, the right-hand side cannot (and it does not depend on E). Therefore, the relation can be satisfied only for those values of E = ℏ²(αd)²/(2md²) for which the left-hand side remains between −1 and +1. In other words, in a crystal only certain energy bands are allowed. (A particle trapped in a potential well can have only discrete energy values. In a periodic potential these discrete levels are replaced by energy bands.) It is customary to present the results as a plot of the allowed energies E against kd. Note that the larger K, i.e., the higher the potential separating the regions of zero potential, the narrower the energy bands. For K → ∞ the crystal reduces to a series of independent quantum boxes, and the energy bands become discrete energy levels E = ℏ²π²n²/(2md²). It is instructive to compare the results with that of a free electron. We can obtain this result by letting v = 0 (i.e. K = 0), so that E = ℏ²k²/(2m).
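The band structure implied by (6.44) can be visualised by scanning the energy and recording where |P sin(αd) + cos(αd)| ≤ 1; the sketch below is a rough illustration (units ℏ = 2m = 1, so α = √E and P = K/(2α); the values of d and K are arbitrary choices).

import numpy as np

d, K = 1.0, 10.0

E = np.linspace(1e-6, 200.0, 400000)
alpha = np.sqrt(E)
P = K / (2 * alpha)                      # P = K m/(alpha hbar^2) with m = 1/2, hbar = 1
f = P * np.sin(alpha * d) + np.cos(alpha * d)

allowed = np.abs(f) <= 1.0               # energies belonging to an allowed band
edges = np.flatnonzero(np.diff(allowed.astype(int)) != 0)
print(E[edges][:8])                      # edges of the first few energy bands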

EXERCISE 6.7. Derive the relation (6.44) by considering the one-dimensional Schrödinger equation with potential
V(x) = K Σ_{n∈ℤ} δ(x − nd).

6.6. The hydrogen atom. We consider the hydrogen atom as a system of an electron in three dimensions moving in the attractive Coulomb potential generated by a fixed nucleus. Let us, for generality, ascribe charge +Z > 0 to the nucleus and unit charge −1 to the electron. The classical potential energy of the system is −Z/|x|, in which |x| = √(x₁² + x₂² + x₃²) is the distance between the electron and the nucleus. For convenience we can assume that the nucleus is fixed at the origin of the coordinate system. The classical Hamiltonian is
(6.45)  H(p, x) = p²/(2m) − Z/|x|,  p = (p₁, p₂, p₃) ∈ ℝ³,  x = (x₁, x₂, x₃) ∈ ℝ³.
The corresponding Schrödinger operator, in suitable units, is
(6.46)  H = −Δ − Z/|x|,
and the time-independent Schrödinger equation is
(6.47)  −Δψ(x) − (Z/|x|) ψ(x) = Eψ(x).

The spherical symmetry of the problem suggests the use of spherical coordinates
(6.48)  x₁ = r sinθ cosϕ,  x₂ = r sinθ sinϕ,  x₃ = r cosθ,  with r > 0, 0 ≤ θ ≤ π, and 0 ≤ ϕ < 2π.

EXERCISE 6.8. Show that the Laplacian in spherical coordinates is
(6.49)  Δ = (1/r²) ∂/∂r (r² ∂/∂r) + (1/(r² sinθ)) ∂/∂θ (sinθ ∂/∂θ) + (1/(r² sin²θ)) ∂²/∂ϕ².

The Schrödinger equation in spherical coordinates for a generic radially symmetric potential V(x) = V(|x|) is
(1/r²) ∂/∂r (r² ∂ψ/∂r) + (1/(r² sinθ)) ∂/∂θ (sinθ ∂ψ/∂θ) + (1/(r² sin²θ)) ∂²ψ/∂ϕ² + (E − V(r)) ψ = 0.
The above partial differential equation can be separated into ordinary differential equations using the ansatz ψ(r, θ, ϕ) = R(r)Θ(θ)Φ(ϕ). Introducing this into the equation, multiplying by r², and dividing by RΘΦ we get
(1/R) ∂/∂r (r² ∂R/∂r) + r²(E − V(r)) + (1/(Θ sinθ)) ∂/∂θ (sinθ ∂Θ/∂θ) + (1/(Φ sin²θ)) ∂²Φ/∂ϕ² = 0.

On multiplying through by sin²θ, the term (1/Φ) ∂²Φ/∂ϕ², which could only be a function of the independent variable ϕ, is seen to be equal to terms independent of ϕ. Hence this term must be equal to a constant, which we call −m²:
∂²Φ/∂ϕ² + m²Φ = 0.
The equation in r and θ can then be written as
(1/R) ∂/∂r (r² ∂R/∂r) + r²(E − V(r)) + (1/(Θ sinθ)) ∂/∂θ (sinθ ∂Θ/∂θ) − m²/sin²θ = 0.
If we set the θ terms equal to the constant −l(l + 1) (and consequently the r terms equal to l(l + 1)), we eventually obtain
−∂²Φ/∂ϕ² = m²Φ,
−(1/sinθ) ∂/∂θ (sinθ ∂Θ/∂θ) + (m²/sin²θ) Θ = l(l + 1)Θ,
−(1/r²) ∂/∂r (r² ∂R/∂r) + ( l(l + 1)/r² + V(r) ) R(r) = E R(r).

These equations are now to be solved in order to determine the allowed values of the energy. The sequence of solution is the following: We first find that the Φ- equation possesses acceptable solutions only for certain values of the parameter m. Introducing these in the Θ-equation, we find that it then possesses acceptable solutions only for certain values of l(l + 1). (The form m2 and l(l + 1) chosen for the constants is for convenience later on.) Finally, we introduce these values of l in the R-equation and find that it possesses acceptable solutions only for certain values of E. These are the energy levels for the stationary states of the system.

REMARK. Note that the above considerations are valid for a generic radially symmetric potential V (x) = V (|x|).

6.6.1. The Φ-equation. The solutions of the Φ-equation are

(6.50)  Φ_m(ϕ) = (1/√(2π)) e^{imϕ}.
In order for the function to be periodic, m must be an integer, i.e., m ∈ ℤ. The constant m is called the magnetic quantum number. The factor 1/√(2π) is chosen in order to satisfy the normalisation condition. In fact, the Φ_m form an orthonormal system,
(6.51)  ∫₀^{2π} Φ_m*(ϕ) Φ_n(ϕ) dϕ = δ_{nm}.
It may be pointed out that, except for the lowest eigenvalue m² = 0, all eigenvalues are doubly degenerate (the two functions Φ_m and Φ_{−m} satisfy the same differential equation if m > 0).

6.6.2. The Θ-equation. Now we turn to the equation for Θ(θ). In order to solve the equation, it is convenient to introduce the new independent variable t = cosθ ∈ [−1, 1] and consider the function P(t) = Θ(θ(t)). The equation transforms into the somewhat simpler equation
−d/dt [ (1 − t²) dP/dt ] + ( m²/(1 − t²) ) P(t) = l(l + 1) P(t).
For l = 0, 1, 2, ..., the above equation is solved by the associated Legendre polynomials
(6.52)  P_l^m(t) = (−1)^m (1 − t²)^{m/2} (d^m/dt^m) [ (1/(2^l l!)) (d^l/dt^l) (t² − 1)^l ],  l ∈ ℤ₊, |m| ≤ l.
l is called the azimuthal quantum number.

6.6.3. The R-equation. The equation in r is
(1/r²) ∂/∂r (r² ∂R/∂r) + ( E − V(r) − l(l + 1)/r² ) R = 0.

For generic V(r) the problem is intractable. The special case V(r) = −Z/r is solvable. In this case the R-equation is
(1/r²) ∂/∂r (r² ∂R/∂r) + ( E + Z/r − l(l + 1)/r² ) R = 0.
Let us consider the case E < 0. Introducing the symbols
α² = −E,  λ = Z/(2α),
and the new independent variable ρ = 2αr > 0, the equation for S(ρ) = R(r(ρ)) becomes
(1/ρ²) ∂/∂ρ (ρ² ∂S/∂ρ) + ( −1/4 − l(l + 1)/ρ² + λ/ρ ) S = 0.
As in the treatment of the harmonic oscillator, we first discuss the asymptotic equation. For ρ large, the equation approaches the form
(1/ρ²) ∂/∂ρ (ρ² ∂S/∂ρ) = (1/4) S,
the acceptable solution of which is S = e^{−ρ/2}. We now assume that the solution of the complete equation in the whole region ρ > 0 has the form S(ρ) = e^{−ρ/2} ρ^s L(ρ), in which L(ρ) is a power series in ρ with a non-vanishing constant term,
L(ρ) = Σ_k a_k ρ^k,  a₀ ≠ 0.
After some calculations we find the condition s = l and a differential equation for L(ρ):
ρL″ + (2(l + 1) − ρ) L′ + (λ − l − 1) L = 0.
We now introduce the series representation of L and obtain the recursion

(2(k + 1)(l + 1) + k(k + 1)) a_{k+1} + (λ − l − 1 − k) a_k = 0.
It can be shown, by an argument similar to that used for the harmonic oscillator, that for any values of λ and l the series whose coefficients are determined by this formula leads to a function S unacceptable as a wavefunction, unless it breaks off. The condition that it breaks off after the term in ρ^{n′} is λ − l − 1 − n′ = 0, or λ = n, where n = n′ + l + 1. n′ is called the radial quantum number and can assume the values n′ = 0, 1, 2, .... n is called the total quantum number.

6.6.4. Energy levels and eigenfunctions. Summarising, if we introduce the quantum numbers n, l, and m as subscripts (using n in preference to n′), the wavefunctions we have found as acceptable solutions may be written as

(6.53)  ψ_{nlm}(r, θ, ϕ) = R_{nl}(r) Θ_{lm}(θ) Φ_m(ϕ),  with

(6.54)  Φ_m(ϕ) = (1/√(2π)) e^{imϕ},

(6.55)  Θ_{lm}(θ) = √( (2l + 1)(l − |m|)! / (2 (l + |m|)!) ) P_l^{|m|}(cosθ),

(6.56)  R_{nl}(r) = −√( Z³ (n − l − 1)! / (2n⁴ [(n + l)!]³) ) e^{−Zr/(2n)} (Zr/n)^l L_{n+l}^{2l+1}(Zr/n).

The wavefunctions corresponding to distinct sets of values for n, l, and m are independent and the energy value corresponding to ψnlm is

(6.57)  E_n = −( Z/(2n) )².

In particular, the lowest eigenvalue E₁ = −Z²/4 is simple and the corresponding eigenfunction ψ₁₀₀(r, θ, ϕ) = √(Z³/(8π)) e^{−Zr/2} is positive. The allowed values of these quantum numbers we have determined to be

m = 0, ±1, ±2, ...
l = |m|, |m| + 1, |m| + 2, ...
n = l + 1, l + 2, l + 3, ...

This we may rewrite as

total quantum number n = 1, 2, 3, ...
azimuthal quantum number l = 0, 1, 2, ..., n − 1
magnetic quantum number m = −l, ..., −1, 0, +1, ..., +l

There are consequently 2l + 1 independent wavefunctions with given values of n and l, and n² independent wavefunctions with a given value of n, that is, with the same energy value.
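As a quick consistency check (a sketch, not part of the notes), the following Python lines tabulate the energies E_n = −(Z/(2n))² of (6.57) and verify that the number of independent wavefunctions with a given n is Σ_{l=0}^{n−1}(2l + 1) = n².

```python
# Check the degeneracy count n^2 and the energies E_n = -(Z/(2n))^2 of (6.57).
Z = 1
for n in range(1, 6):
    degeneracy = sum(2 * l + 1 for l in range(n))   # l = 0, ..., n-1; m = -l, ..., +l
    E_n = -(Z / (2 * n)) ** 2
    print(f"n={n}: E_n={E_n:.4f}, degeneracy={degeneracy} (= n^2 = {n**2})")
```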

7. Matrix Mechanics

Heisenberg formulated and successfully attacked the problem of calculating values of the frequencies and intensities of the spectral lines which a system could emit or absorb; that is, of the energy levels and the electric-moment integrals which we have been discussing. He did not use wave functions and wave equations, but instead developed a formal mathematical method for calculating values of these quantities. Heisenberg invented the new type of algebra as he needed it; it was immedi- ately pointed out by Born and Jordan, however, that in his new quantum mechanics Heisenberg was making use of quantities called matrices, and that his newly in- vented operations were those of matrix algebra which had been already discussed by mathematicians.

7.1. Matrices, their relation to wavefunctions, and the rules of matrix algebra. Let us consider a set of orthogonal wavefunctions Ψ₀, Ψ₁, ..., Ψ_n, ... and a dynamical quantity f(q_i, p_i). The corresponding operator is f_op = f(q_i, (ℏ/i) ∂/∂q_i). In the foregoing sections we have often made use of integrals such as
(7.1)  f_{mn} = ∫ Ψ_m* f_op Ψ_n;
for example, we have given f_{nn} the physical interpretation of the average value of f when the system is in the nth stationary state. Let us now arrange the numbers f_{mn} (the values of the integrals) in a square array F = (f_{mn}), ordered according to m and n.

We can construct similar arrays G = (gmn), H = (hmn) etc. for other dy- namical quantities. It is found that the symbols F , G, H etc. representing such arrays can be manipulated by an algebra closely related to ordinary algebra, differ- ing from it mainly in the process of multiplication. The rules of this algebra can be easily derived from the properties of wave functions, which we already know. Now let us derive some rules of the new algebra. For example, the sum of two such arrays is an array each of whose elements is the sum of the corresponding elements of the two arrays

(7.2)  {f + g}_{mn} = {g + f}_{mn} = f_{mn} + g_{mn}.
The addition of arrays is commutative: F + G = G + F. On the other hand, multiplication is not commutative: the product FG is not necessarily equal to the product GF. Let us evaluate the mn-th element of the array FG. It is
{fg}_{mn} = ∫ Ψ_m* f_op g_op Ψ_n.

Now we can express the quantity g_op Ψ_n in terms of the functions Ψ_k with constant coefficients, obtaining
g_op Ψ_n = Σ_k g_{kn} Ψ_k.
That the coefficients are the quantities g_{kn} is seen on multiplying by Ψ_k* and integrating. Introducing this into the integral for {fg}_{mn} we obtain
{fg}_{mn} = Σ_k ∫ Ψ_m* f_op Ψ_k g_{kn};
since ∫ Ψ_m* f_op Ψ_k = f_{mk}, this becomes
(7.3)  {fg}_{mn} = Σ_k f_{mk} g_{kn}.
We see that the arrays F, G, etc. can be manipulated according to the rules of matrices. The non-commutative nature of matrix multiplication is of great importance in matrix mechanics.
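The product rule (7.3) and the non-commutativity can be illustrated numerically. The sketch below (an assumption for illustration, not from the notes) uses the truncated position and momentum matrices in the harmonic-oscillator basis, in units ℏ = m = ω = 1; truncation only spoils the last diagonal element of the commutator.

```python
import numpy as np

N = 8                                    # truncation size
n = np.arange(1, N)
a = np.diag(np.sqrt(n), k=1)             # annihilation operator matrix: <m|a|n> = sqrt(n) delta_{m,n-1}
x = (a + a.T) / np.sqrt(2)               # x_{mn} = <m|x|n>
p = (a - a.T) / (1j * np.sqrt(2))        # p_{mn} = <m|p|n>

xp = x @ p                               # {xp}_{mn} = sum_k x_{mk} p_{kn}, i.e. rule (7.3)
print(np.allclose(xp, np.einsum('mk,kn->mn', x, p)))   # True
print(np.round((xp - p @ x).diagonal(), 3))            # ~ i on all but the last entry
```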

7.2. Diagonal matrices and their physical interpretation. A diagonal matrix is a matrix whose elements f_{mn} are all zero except those with m = n. The unit matrix, 1, is a special kind of diagonal matrix, all the diagonal elements being equal to unity. A constant matrix is equal to a constant times the unit matrix. Application of the rule for matrix multiplication shows that the square (or any power) of a diagonal matrix is also a diagonal matrix, its diagonal elements being the squares (or other powers) of the corresponding elements of the original matrix. In discussing the physical interpretation of the wave equation, we saw that the fundamental postulate regarding physical interpretation requires a dynamical quantity f to have a definite value for a system in the state represented by the wavefunction Ψ_n only when (f^r)_{nn} is equal to (f_{nn})^r for all values of r. We can now express this in terms of matrices: if the dynamical quantity f is represented by a diagonal matrix F, then this dynamical quantity has the definite value f_{nn} for the state corresponding to the wavefunction Ψ_n of the set Ψ₀, Ψ₁, ....

EXAMPLE 7.1. The solutions Ψ_n = ψ_n(x) e^{−iE_n t/ℏ} of the Schrödinger equation correspond to a diagonal energy matrix H = diag(E₀, E₁, E₂, ...), so that a system in a physical condition (i.e. a state) represented by one of these wavefunctions has a fixed value of the total energy. In the case of a system with one degree of freedom no other quantity (except functions of H, such as H²) is represented by a diagonal matrix; with more degrees of freedom there are other diagonal matrices.

EXAMPLE 7.2. The surface-harmonic wavefunctions Θ_{lm}(θ)Φ_m(ϕ) for the hydrogen atom, and for any other spherically symmetric system, make the matrices for the square of the total angular momentum and for the z component of the angular momentum diagonal. These quantities thus have definite values for Θ_{lm}(θ)Φ_m(ϕ).

7.3. Matrix elements and their physical interpretation. The principle of interference, which applies to all quantum systems (photons, electrons, atoms, . . . ), suggests that transition probabilities Pnm from a state ψn to a state ψm are the squares of transition amplitudes, and that the latter combine in a linear fashion. Likewise, for the observable f we interpret the matrix element fnm as transition amplitudes. We will not discuss their use in this module.

7.4. The Heisenberg Uncertainty Relations. Given a state |ψi, one can give the probabilities for the possible outcomes of a measurement of an observable A. The probability distribution has a mean or expectation value

⟨A⟩ = ⟨ψ| A |ψ⟩, and an uncertainty or deviation about this mean
ΔA = √( ⟨ψ| (A − ⟨A⟩)² |ψ⟩ ).

There are states for which ΔA = 0, and these are the eigenstates of A. If we consider two Hermitian operators A and B, they will have some uncertainties ΔA and ΔB in a given state. The Heisenberg uncertainty relations provide a lower bound on the product of uncertainties ΔA ΔB. Generally the lower bound depends not only on the operators but also on the state. Of particular interest are those cases in which the lower bound is independent of the state. Let A and B be two Hermitian operators, with commutator

[A, B] = iC.

It is easy to verify that C is also Hermitian. Fix a state |ψ⟩. Note that the pair A′ = A − ⟨A⟩ and B′ = B − ⟨B⟩ has the same commutator as A and B (verify this). Then, by the Schwarz inequality:

(ΔA)²(ΔB)² = ⟨ψ| A′² |ψ⟩ ⟨ψ| B′² |ψ⟩ = ⟨A′ψ|A′ψ⟩ ⟨B′ψ|B′ψ⟩ ≥ |⟨A′ψ|B′ψ⟩|².
We can write the above inequality in terms of commutators and anticommutators:
(ΔA)²(ΔB)² ≥ |⟨ψ| A′B′ |ψ⟩|² = |⟨ψ| ( ½{A′, B′} + ½[A′, B′] ) |ψ⟩|².
Since [A′, B′] = iC has pure imaginary expectation value, and {A′, B′} has real expectation value, we get

(7.4)  (ΔA)²(ΔB)² ≥ ¼ ⟨ψ| {A′, B′} |ψ⟩² + ¼ |⟨ψ| C |ψ⟩|².

This is the general uncertainty relation between any two Hermitian operators and is evidently state dependent. Consider now canonically conjugate operators, for which C = ℏ. In this case,
(ΔA)²(ΔB)² ≥ ¼ ⟨ψ| {A′, B′} |ψ⟩² + ¼ℏ² ≥ ¼ℏ²,
or

(7.5)  ΔA ΔB ≥ ℏ/2.
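A small numerical sanity check (a sketch; the operators and the state below are arbitrary choices, not from the notes): for two Hermitian matrices and a random normalised state, the product of uncertainties is never smaller than ½|⟨ψ|[A, B]|ψ⟩|, in agreement with (7.4).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4

def random_hermitian(d):
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (M + M.conj().T) / 2

A, B = random_hermitian(d), random_hermitian(d)
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi /= np.linalg.norm(psi)

def uncertainty(Op):
    mean = np.vdot(psi, Op @ psi).real            # <Op>
    return np.linalg.norm((Op - mean * np.eye(d)) @ psi)   # sqrt(<psi|(Op-<Op>)^2|psi>)

lhs = uncertainty(A) * uncertainty(B)
rhs = 0.5 * abs(np.vdot(psi, (A @ B - B @ A) @ psi))       # (1/2)|<[A,B]>|
print(lhs >= rhs - 1e-12)   # True: the commutator term alone already bounds the product
```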

8. Mathematical Foundations of Quantum Mechanics

In classical probability theory one assumes that the events concerning a statistical experiment form a Boolean σ-algebra and defines a probability measure as a completely additive non-negative function which assigns the value unity to the identity element of the σ-algebra. Typically, the σ-algebra is the Borel σ-algebra B_X of a nice topological space X. Under very general conditions it turns out that all probability measures on B_X constitute a convex set whose extreme points are the degenerate probability measures. We observe that B_X admits a null element, namely ∅, a unit element, namely X, a partial order ⊂, and the operations of union (∪), intersection (∩) and complementation (ᶜ). Quantum theory can be thought of as a non-commutative version of probability theory where the σ-algebra B_X of events is replaced by a lattice of projections on a Hilbert space H. Most of the computations of quantum mechanics are done in such a lattice. Let H be a real or complex separable Hilbert space and let P(H) denote the set of all orthogonal projection operators on H, where 0 denotes the null operator and I denotes the identity operator. If P₁, P₂ ∈ P(H) we say that P₁ ≤ P₂ if the range of P₁ is contained in the range of P₂. Then ≤ makes P(H) a partially ordered set. For any linear operator A on H let R(A) denote its range. For two orthogonal projections P₁, P₂, let P₁ ∨ P₂ be the orthogonal projection onto the closed linear span of R(P₁) ∪ R(P₂), and let P₁ ∧ P₂ be the orthogonal projection onto R(P₁) ∩ R(P₂). For any P ∈ P(H), I − P is the orthogonal projection onto the orthogonal complement R(P)⊥ of the range of P. We may compare 0, I, ≤, ∨, ∧ and the map P ↦ I − P in P(H) with ∅, X, ⊂, ∪, ∩ and the complementation A ↦ X \ A of standard probability theory on B_X. The chief distinction lies in the fact that ∪ distributes over ∩ but ∨ need not distribute over ∧.

8.1. The postulates of quantum theory. A consistent framework for quantum theory can be introduced through a sequence of postulates.
P1 The state of a physical system is described by a normalised vector ψ in a separable complex Hilbert space H.

P2 To each (real-valued) observable corresponds a self-adjoint operator A on H, or on a suitable subspace of H (in fact, the observables of a physical system are elements of a non-commutative C*-algebra).
P3 If the observable A is measured, then the possible outcomes of the measurement are the eigenvalues of A.
P4 Suppose we measure A on the state ψ. Then the probability that the observed value is λ ∈ ℝ is
Σ_{i: λ_i = λ} |⟨x_i, ψ⟩|²,
where {x_i} are the normalised eigenvectors of A and {λ_i} the corresponding eigenvalues.
P5 The state of a (closed) system evolves in time as

ψ(t) = U(t)ψ0 for some strongly continuous one-parameter unitary group that only de- pends on the system (and not on the state).

8.2. Hilbert spaces. Suppose H is a complex vector space. A map ⟨·|·⟩: H × H → ℂ is called a sesquilinear form if it is conjugate linear in the first argument and linear in the second. A positive definite sesquilinear form is called a scalar product. Associated with every scalar product is a norm
(8.1)  ‖ψ‖ = √(⟨ψ|ψ⟩).
The triangle inequality follows from the Cauchy-Schwarz inequality:
(8.2)  |⟨ψ|ϕ⟩| ≤ ‖ψ‖‖ϕ‖,
with equality if and only if ψ and ϕ are parallel. If H is complete with respect to the above norm, it is called a Hilbert space. It is no restriction to assume that H is complete since one can easily replace it by its completion.

EXAMPLE 8.1. H = ℂⁿ with the usual scalar product
(8.3)  ⟨ψ|ϕ⟩ = Σ_{j=1}^n ψ_j* ϕ_j
is a (finite dimensional) Hilbert space. Similarly, the set of all square summable sequences ℓ²(ℕ) is a Hilbert space with scalar product
(8.4)  ⟨ψ|ϕ⟩ = Σ_{j∈ℕ} ψ_j* ϕ_j.

EXAMPLE 8.2. The space H = L²(M, dμ) is a Hilbert space with scalar product given by
(8.5)  ⟨ψ|ϕ⟩ = ∫_M ψ*(x) ϕ(x) dμ(x).

(Note that the Example 8.1 is a special case of the this one; take M = R and µ a sum of Dirac measures.)

REMARK. Even though the elements of L2(M, dµ) are, strictly speaking, equivalence classes of functions, we will still call them functions for notational convenience. However, note that for f ∈ L2(M, dµ) the value f(x) is not well- defined (unless there is a continuous representative and different continuous func- tions are in different equivalence classes, e.g., in the case of Lebesgue measure).

A vector ψ ∈ H is called normalised, or a unit vector, if ‖ψ‖ = 1. Two vectors ψ, ϕ ∈ H are called orthogonal (ψ ⊥ ϕ) if ⟨ψ|ϕ⟩ = 0. Suppose ϕ is a unit vector. Then the projection of ψ in the direction of ϕ is given by ψ∥, defined via

(8.6)  ψ∥ = ⟨ϕ|ψ⟩ ϕ,

and ψ⊥, defined via

(8.7)  ψ⊥ = ψ − ⟨ϕ|ψ⟩ ϕ,

is perpendicular to ϕ. These results can also be generalised to more than one vector. A set of vectors {ϕ_j} is called an orthonormal set if ⟨ϕ_i|ϕ_j⟩ = δ_{ij}.

EXERCISE 8.1. Show that every orthonormal set is linearly independent.

THEOREM 8.1 (Pythagorean theorem). Suppose {ϕ_j}_{j=0}^n is an orthonormal set. Then every ψ ∈ H can be written as
(8.8)  ψ = ψ∥ + ψ⊥,  where ψ∥ = Σ_{j=0}^n ⟨ϕ_j|ψ⟩ ϕ_j,
and ψ∥ and ψ⊥ are orthogonal. Moreover, ⟨ϕ_j|ψ⊥⟩ = 0 for all 0 ≤ j ≤ n. In particular,
(8.9)  ‖ψ‖² = Σ_{j=0}^n |⟨ϕ_j|ψ⟩|² + ‖ψ⊥‖².
Moreover, every ψ̂ in the span of {ϕ_j}_{j=0}^n satisfies
(8.10)  ‖ψ − ψ̂‖ ≥ ‖ψ⊥‖,
with equality holding if and only if ψ̂ = ψ∥. In other words, ψ∥ is uniquely determined as the vector in the span of {ϕ_j}_{j=0}^n closest to ψ. From (8.9) we obtain Bessel's inequality
(8.11)  Σ_{j=0}^n |⟨ϕ_j|ψ⟩|² ≤ ‖ψ‖²

with equality holding if and only if ψ lies in the span of {ϕ_j}_{j=0}^n. A scalar product can be recovered from its norm by virtue of the polarization identity
(8.12)  ⟨ϕ|ψ⟩ = ¼( ‖ϕ + ψ‖² − ‖ϕ − ψ‖² + i‖ϕ − iψ‖² − i‖ϕ + iψ‖² ).

A bijective linear operator U ∈ L(H1, H2) is called unitary if U preserves scalar products:

(8.13) hUϕ|Uψi2 = hϕ|ψi1 , for all ψ, ϕ ∈ H1. By the polarization identity, this is the case if and only if U preserves norms: kUψk2 = kψk1 for all ψ ∈ H1 . The two Hilbert spaces H1 and H2 are called unitarily equivalent in this case.

EXERCISE 8.2. The shift operator 2 2 S : ` (N) → ` (N), (a1, a2, a3,... ) 7→ (0, a1, a2,... ) satisfies kSak = kak. Is it unitary?

By continuity of the scalar product we see that Theorem 8.1 can be gener- alised to arbitrary orthonormal sets {ϕj}j∈J . Note that from Bessel’s inequality, it follows that the map ψ 7→ ψk is continuous. An orthonormal set which is not a proper subset of any other orthonormal set is called an orthonormal basis due to the following result.

THEOREM 8.2. For an orthonormal set {ϕj}j∈J , the following conditions are equivalent:

(i) {ϕ_j}_{j∈J} is a maximal orthonormal set;
(ii) For every ψ ∈ H we have
(8.14)  ψ = Σ_{j∈J} ⟨ϕ_j|ψ⟩ ϕ_j;
(iii) For every ψ ∈ H we have Parseval's relation

(8.15)  ‖ψ‖² = Σ_{j∈J} |⟨ϕ_j|ψ⟩|²;

(iv) hϕj|ψi = 0 for all j ∈ J implies ψ = 0.

EXAMPLE 8.3. The set of functions

(8.16)  ϕ_m(x) = (1/√(2π)) e^{imx},  m ∈ ℤ,
forms an orthonormal basis for H = L²(0, 2π). The corresponding orthogonal expansion is just the ordinary Fourier series.

A Hilbert space is separable if and only if there is a countable orthonormal N basis. In fact, if H is separable, then there exists a countable total set {ψj}j=0. Here N ∈ N if H is finite dimensional and N = ∞ otherwise. After throwing away some vectors, we can assume that ψn+1 cannot be expressed as a linear com- bination of the vectors ψ0, . . . , ψn. Now we can construct an orthonormal basis as follows: we begin by normalising ψ0,

(8.17)  ϕ₀ = ψ₀ / ‖ψ₀‖.

Next we take ψ1 and remove the component parallel to ϕ0 and normalise again:

(8.18)  ϕ₁ = (ψ₁ − ⟨ϕ₀|ψ₁⟩ ϕ₀) / ‖ψ₁ − ⟨ϕ₀|ψ₁⟩ ϕ₀‖.
Proceeding like this, we define recursively
(8.19)  ϕ_n = ( ψ_n − Σ_{j=0}^{n−1} ⟨ϕ_j|ψ_n⟩ ϕ_j ) / ‖ ψ_n − Σ_{j=0}^{n−1} ⟨ϕ_j|ψ_n⟩ ϕ_j ‖.
This procedure is known as Gram-Schmidt orthogonalization. The result of the procedure is the orthonormal basis {ϕ_j}_{j=0}^N.
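A minimal numpy sketch of the Gram-Schmidt recursion (8.17)-(8.19); the input vectors are an arbitrary choice for illustration, not from the notes.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalise a list of linearly independent vectors as in (8.17)-(8.19)."""
    basis = []
    for psi in vectors:
        # subtract the components parallel to the phi_j already constructed
        for phi in basis:
            psi = psi - np.vdot(phi, psi) * phi
        basis.append(psi / np.linalg.norm(psi))
    return basis

vecs = [np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 1.0])]
phis = gram_schmidt(vecs)
print(np.round([[np.vdot(a, b) for b in phis] for a in phis], 10))  # identity matrix
```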

EXERCISE 8.3. In L²(−1, 1), we can orthogonalise the monomials f_n(x) = xⁿ. The resulting polynomials are, up to normalisation, equal to the Legendre polynomials ϕ_n(x) = √(n + 1/2) P_n(x),
(8.20)  P₀(x) = 1,  P₁(x) = x,  P₂(x) = ½(3x² − 1), ...

THEOREM 8.3. If H is separable, then every orthonormal basis is countable.

In quantum theory it is assumed that the Hilbert space associate to a quantum system is separable. In particular, it can be shown that L2(M, dµ) is separable. Moreover, it turns out that, up to unitary equivalence, there is only one separable infinite dimensional

Hilbert space: Let H be an infinite dimensional Hilbert space and let {ϕj}j∈N be any orthogonal basis. Then the map U : H → `2( ), ψ 7→ (hϕ |ψi) is unitary. N j j∈N In particular,

THEOREM 8.4. Any separable infinite dimensional Hilbert space is unitarily 2 equivalent to ` (N).

EXERCISE 8.4. Let {ϕj} be some orthonormal basis. Show that a bounded linear operator A is uniquely determined by its matrix elements Ajk = hϕj|Aϕki with respect to this basis. 43

8.3. The projection theorem and the Riesz lemma. Let M ⊂ H be a subset. Then M ⊥ = {ψ ∈ H: hϕ|ψi = 0, ∀ϕ ∈ M} is called the orthogonal comple- ment of M. By continuity of the scalar product it follows that M ⊥ is a closed ⊥ linear subspace and by linearity that (span(M)) = M ⊥. For example, we have H⊥ = {0} since every vector in H⊥ must be in particular orthogonal to all vectors in some orthonormal basis.

REMARK. Note that if M ⊂ H is closed, it is a Hilbert space and has an orthonormal basis {ϕj}j∈J .

THEOREM 8.5 (Projection theorem). Let M be a closed linear subspace of a Hilbert space H. Then every ψ ∈ H can be uniquely written as ψ = ψk + ψ⊥ with ⊥ ψk ∈ M and ψ⊥ ∈ M . One writes

(8.21) M ⊕ M ⊥ = H in this situation.

Theorem 8.5 implies that to every ψ ∈ H we can assign a unique vector ψ∥ ∈ M closest to ψ. The rest, ψ − ψ∥, lies in M⊥. The operator P_M defined by P_M ψ = ψ∥ is called the orthogonal projection corresponding to M. Note that

(8.22)  P_M² = P_M,  and  ⟨P_M ψ|ϕ⟩ = ⟨ψ|P_M ϕ⟩.
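A small sketch (the subspace below is an arbitrary illustrative choice): building the orthogonal projection onto the span of a few vectors and checking the properties in (8.22).

```python
import numpy as np

# Projection onto M = span{v1, v2} in R^4, built from an orthonormal basis of M.
rng = np.random.default_rng(1)
V = rng.normal(size=(4, 2))              # columns span M
Q, _ = np.linalg.qr(V)                   # orthonormal basis of M (Gram-Schmidt, in effect)
P = Q @ Q.conj().T                       # P_M = sum_j |phi_j><phi_j|

psi = rng.normal(size=4)
psi_par, psi_perp = P @ psi, psi - P @ psi
print(np.allclose(P @ P, P),             # P_M^2 = P_M
      np.allclose(P, P.conj().T),        # self-adjoint
      np.isclose(psi_par @ psi_perp, 0)) # psi_par is orthogonal to psi_perp
```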

Clearly, PM ⊥ ψ = ψ − PM ψ = (I − PM )ψ. Finally we turn to linear functionals, that is, to linear operators `: H → C. By the Cauchy-Schwarz inequality we know that `ϕ : ψ 7→ hϕ|ψi is a bounded linear functional (with norm kϕk). It turns out that, in a Hilbert space, every bounded linear functional can be written in this way.

THEOREM 8.6 (Riesz lemma). Suppose ` is a bounded linear functional on a Hilbert space H. Then there is a unique vector ϕ ∈ H such that `(ψ) = hϕ|ψi for all ψ ∈ H. In other words, a Hilbert space is equivalent to its own dual space ∗ H ' H via the map ϕ 7→ `ϕ = hϕ|·i which is a conjugate linear isometric bijection between H and H∗.

PROOF. If ` = 0 we can choose ϕ = 0. Otherwise Ker(`) = {ψ ∈ H: `(ψ) = 0} is a proper subspace of H and we can find a unit vector ϕ˜ ∈ Ker(`)⊥. For every ψ ∈ H, `(ϕ ˜)ψ − `(ψ)ϕ ˜ ∈ Ker(`) and hence

0 = hϕ˜|`(ϕ ˜)ψ − `(ψ)ϕ ˜i = `(ϕ ˜) hϕ˜|ψi − `(ψ).

Therefore we can choose ϕ = ℓ(ϕ̃)* ϕ̃. To see uniqueness, let ϕ₁, ϕ₂ be two such vectors. Then ⟨ϕ₁ − ϕ₂|ψ⟩ = ⟨ϕ₁|ψ⟩ − ⟨ϕ₂|ψ⟩ = ℓ(ψ) − ℓ(ψ) = 0 for all ψ ∈ H. Therefore ϕ₁ − ϕ₂ ∈ H⊥ = {0}. □

In view of the isomorphism H* ≃ H, one often uses Dirac's notation and denotes vectors ψ ∈ H by the ket symbol |ψ⟩ and linear functionals ℓ_ϕ = ⟨ϕ|·⟩ ∈ H* using the bra symbol ⟨ϕ|. The contraction ℓ_ϕ(ψ) of a covector ϕ and a vector ψ is denoted by the bracket (scalar product) ⟨ϕ|ψ⟩. One writes the above isomorphism as H ∋ |ψ⟩ ↦ ⟨ψ| ∈ H*.

The ket-bra |ψ⟩⟨ϕ| denotes the rank-one linear operator |v⟩ ↦ ℓ_ϕ(v) |ψ⟩ = ⟨ϕ|v⟩ |ψ⟩.
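In a finite-dimensional coordinate representation the ket-bra is simply an outer product; a tiny numpy sketch (the vectors are arbitrary illustrative choices):

```python
import numpy as np

psi = np.array([1.0, 1j]) / np.sqrt(2)
phi = np.array([1.0, 0.0])
ketbra = np.outer(psi, phi.conj())        # |psi><phi| as a 2x2 matrix

v = np.array([0.3, 0.4 - 0.2j])
print(np.allclose(ketbra @ v, np.vdot(phi, v) * psi))   # (|psi><phi|)v = <phi|v> psi
```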

EXERCISE 8.5. Suppose U : H → H is unitary and M ⊂ H. Show that UM ⊥ = (UM)⊥.

8.4. The C*-algebra of bounded linear operators. Observables like momentum and spin component are to be represented by operators acting on states (vectors). Almost all operators that appear in quantum theory are linear. A linear operator A: H → H is called bounded if the operator norm
(8.23)  ‖A‖ = sup_{‖ψ‖=1} ‖Aψ‖
is finite. We write A ∈ L(H).

EXAMPLE 8.4. An orthogonal projection PM 6= 0 has norm one.

LEMMA 8.7.
(8.24)  ‖A‖ = sup_{‖ψ‖=‖ϕ‖=1} |⟨ϕ|Aψ⟩|.

We start by introducing a conjugation for operators on a Hilbert space H. Let A ∈ L(H). Then the adjoint operator A* is defined via
(8.25)  ⟨ϕ|A*ψ⟩ = ⟨Aϕ|ψ⟩.

EXAMPLE 8.5. If H = ℂⁿ, then a linear operator is represented by a matrix A = (a_{ij})_{i,j=1}^n. Note that
‖A‖ = sup_{‖ψ‖=1} √( Σ_{ijk} a_{ik}* a_{ij} ψ_k* ψ_j ).
Then
⟨Aϕ|ψ⟩ = Σ_i ( Σ_j a_{ij} ϕ_j )* ψ_i = Σ_{ij} a_{ij}* ϕ_j* ψ_i = Σ_i ϕ_i* Σ_j a_{ji}* ψ_j.
Hence A* = (a_{ji}*)_{i,j=1}^n.

LEMMA 8.8. Let A, B ∈ L(H) and α ∈ ℂ. Then
(i) (A + B)* = A* + B*, (αA)* = α*A*;
(ii) A** = A;

(iii) (AB)* = B*A*;
(iv) ‖A*‖ = ‖A‖ and ‖A‖² = ‖A*A‖ = ‖AA*‖.

REMARK. As a consequence of kA∗k = kAk, observe that taking the adjoint is continuous.

The algebra L(H) of bounded operators with the involution ∗ forms a C∗- algebra.

EXAMPLE 8.6. The continuous functions C(I) on a bounded interval I ⊂ R together with complex conjugation form a commutative C∗-algebra.

An operator A ∈ L(H) is called normal if AA∗ = A∗A, self-adjoint if A = A∗, unitary if A∗A = AA∗ = I, an (orthogonal) projection if A = A∗ = A2, and positive if A = BB∗ for some B ∈ L(H). Clearly both self-adjoint and unitary elements are normal.

EXERCISE 8.6. Consider the Pauli matrices σ₁, σ₂, σ₃ ∈ L(ℂ²):
(8.26)  σ₁ = ( 0 1 ; 1 0 ),  σ₂ = ( 0 −i ; i 0 ),  σ₃ = ( 1 0 ; 0 −1 ).
They are Hermitian, σᵢ* = σᵢ, and also unitary, σᵢ*σᵢ = σᵢσᵢ* = σᵢ² = 1.

EXERCISE 8.7. Compute the adjoint of the shift operator
S: ℓ²(ℕ) → ℓ²(ℕ),  (a₁, a₂, a₃, ...) ↦ (0, a₁, a₂, ...).

8.5. Resolvents and spectra. Let A be a (densely defined) closed operator. The resolvent set of A is the subset of ℂ defined by
(8.27)  ρ(A) = {z ∈ ℂ: (A − z)⁻¹ ∈ L(H)}.
The spectrum of A is the complement of the resolvent set of A:
(8.28)  σ(A) = ℂ \ ρ(A).
In particular, if (A − λ)ψ = 0 for some nonzero vector ψ, then λ ∈ σ(A). In this case ψ is called an eigenvector of A corresponding to the eigenvalue λ. We can characterise the spectra of self-adjoint operators.

PROPOSITION 8.9. Let A be self-adjoint. Then all eigenvalues are real and eigenvectors corresponding to distinct eigenvalues are orthogonal.

PROOF. If Aψ_j = λ_jψ_j, j = 1, 2, we have
λ₁‖ψ₁‖² = ⟨ψ₁|λ₁ψ₁⟩ = ⟨ψ₁|Aψ₁⟩ = ⟨Aψ₁|ψ₁⟩ = ⟨λ₁ψ₁|ψ₁⟩ = λ₁*‖ψ₁‖²
and
(λ₁ − λ₂)⟨ψ₁|ψ₂⟩ = ⟨ψ₁|Aψ₂⟩ − ⟨Aψ₁|ψ₂⟩ = 0. □

The result does not imply that two linearly independent eigenfunctions to the same eigenvalue λ are orthogonal. However, it is no restriction to assume that they are since we can use Gram-Schmidt to find an orthonormal basis for Ker(A−λ). If H is finite dimensional, we can always find an orthonormal basis of eigenvectors. (In the infinite dimensional case this is no longer true in general.)

THEOREM 8.10 (Spectral theorem for self-adjoint bounded operators). Let A be bounded and self-adjoint. Then
A = Σ_k λ_k |ψ_k⟩⟨ψ_k|,
where λ_k ∈ ℝ and |ψ_k⟩ ∈ H are the eigenvalues and corresponding normalised eigenvectors of A. Moreover, {|ψ_k⟩} is an orthonormal basis of H. The above spectral decomposition of A provides a unique norm continuous *-homomorphism from the set of all bounded measurable functions B(ℝ) to L(H): for every function f ∈ B(ℝ), we have
f(A) = Σ_k f(λ_k) |ψ_k⟩⟨ψ_k|.
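A numpy sketch of the functional calculus f(A) = Σ_k f(λ_k)|ψ_k⟩⟨ψ_k| for a Hermitian matrix (the matrix and the functions f are arbitrary illustrative choices):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])            # a bounded self-adjoint (Hermitian) operator
evals, evecs = np.linalg.eigh(A)                  # A = sum_k lambda_k |psi_k><psi_k|

def f_of_A(f):
    return sum(f(lam) * np.outer(v, v.conj()) for lam, v in zip(evals, evecs.T))

print(np.allclose(f_of_A(lambda x: x), A))         # f(x) = x reproduces A itself
print(np.allclose(f_of_A(lambda x: x**2), A @ A))  # f(x) = x^2 gives A^2
```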

REMARK. We can also characterise the spectra of unitary operators. If U is a unitary operator, then all eigenvalues have modulus one and eigenvectors corre- sponding to distinct eigenvalues are orthogonal.

8.6. Time evolution. Now let us turn to the time evolution of such a quantum mechanical system. Given an initial state ψ(0) of the system, there should be a unique ψ(t) representing the state of the system at time t ∈ R. We will write (8.29) ψ(t) = U(t)ψ(0). Moreover, it follows from physical experiments that superposition of states holds; that is, U(t)(α1ψ1(0) + α2ψ2(0)) = α1ψ1(t) + α2ψ2(t). In other words, U(t) must be a linear operator. Moreover, since ψ(t) is a normalised state kψ(t)k = 1, we have (8.30) kU(t)ψk = kψk. Therefore U(t) is a unitary operator. Next, since we have assumed uniqueness of solutions to the initial value problem, we must have (8.31) U(0) = 1,U(t + s) = U(t)U(s). A family of unitary operators U(t) having this property is called a one-parameter unitary group. In addition, it is natural to assume that this group is strongly continuous; that is, (8.32) lim U(t)ψ = ψ, for all ψ ∈ H. t→0 47

Each such group has an infinitesimal generator, defined by

(8.33)  Hψ = lim_{t→0} (i/t)(U(t)ψ − ψ),  for all ψ ∈ D(H),
where
(8.34)  D(H) = { ψ ∈ H: lim_{t→0} (i/t)(U(t)ψ − ψ) exists }.

This operator is called the Hamiltonian and corresponds to the energy of the sys- tem. If ψ(0) ∈ D(H), then ψ(t) is a solution of the Schrodinger¨ equation (in suitable units)

(8.35)  i (d/dt) ψ(t) = Hψ(t).

EXAMPLE 8.7. Suppose that H = ℂ and consider the scalar equation
(8.36)  i (d/dt) ψ(t) = Hψ(t),  ψ(0) = ψ₀ ∈ ℂ,

where H ∈ ℝ. It is immediate to integrate the equation and find ψ(t) = e^{−iHt}ψ₀.

Recall that the exponential eA of a matrix A is defined by the series

e^A = Σ_{k≥0} A^k / k!.

EXAMPLE 8.8. Suppose that H = ℂⁿ and consider the vector equation
(8.37)  i (d/dt) ψ(t) = Hψ(t),  ψ(0) = ψ₀ ∈ ℂⁿ,

where H ∈ ℂ^{n×n} is Hermitian, H* = H. Using the definition of the exponential of a matrix, it is easy to verify that the solution of the problem is ψ(t) = e^{−iHt}ψ₀.
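A short numerical sketch of Example 8.8 (the Hamiltonian and initial state below are arbitrary choices, not from the notes): ψ(t) = e^{−iHt}ψ₀ is computed from the eigendecomposition of H, and the norm is conserved because e^{−iHt} is unitary.

```python
import numpy as np

H = np.array([[1.0, 0.5], [0.5, -1.0]])        # a Hermitian 2x2 Hamiltonian (illustrative)
psi0 = np.array([1.0, 0.0], dtype=complex)

def evolve(t):
    evals, V = np.linalg.eigh(H)               # H = V diag(evals) V^*
    U = V @ np.diag(np.exp(-1j * evals * t)) @ V.conj().T   # U(t) = exp(-iHt)
    return U @ psi0

for t in (0.0, 0.5, 1.0):
    psi_t = evolve(t)
    print(t, np.round(psi_t, 3), np.linalg.norm(psi_t))     # the norm stays equal to 1
```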

There is a one-to-one correspondence between one-parameter strongly contin- uous unitary groups of operators on H and self-adjoint operators on H.

THEOREM 8.11 (Stone's theorem). Let H be self-adjoint and let U(t) = exp(−iHt).
(i) U(t) is a strongly continuous one-parameter unitary group.
(ii) The limit lim_{t→0} (i/t)(U(t)ψ − ψ) exists if and only if ψ is in the domain of H, ψ ∈ D(H), in which case lim_{t→0} (i/t)(U(t)ψ − ψ) = Hψ.
(iii) U(t)D(H) = D(H) and HU(t) = U(t)H.

8.7. Orthogonal sums and tensor products. Given two Hilbert spaces H1 and H2, we define their orthogonal sum H1⊕H2 to be the set of all pairs (ψ1, ψ2) ∈ H1 × H2 together with the scalar product

(8.38)  ⟨(ϕ₁, ϕ₂)|(ψ₁, ψ₂)⟩ = ⟨ϕ₁|ψ₁⟩₁ + ⟨ϕ₂|ψ₂⟩₂.

It is left as an exercise to verify that H1⊕H2 is again a Hilbert space. Moreover, H1 can be identified with {(ψ1, 0): ψ1 ∈ H1}, and we can regard H1 as a subspace of H1 ⊕ H2, and similarly for H2. It is customary to write ψ1 + ψ2 instead of (ψ1, ψ2). ∞ More generally, for a countable collection (Hj)j=1 of Hilbert spaces, the set

(8.39)  ⨁_{j=1}^∞ H_j = { Σ_j ψ_j : ψ_j ∈ H_j,  Σ_{j=1}^∞ ‖ψ_j‖_j² < ∞ },
with scalar product

(8.40)  ⟨ Σ_{j=1}^∞ ϕ_j | Σ_{j=1}^∞ ψ_j ⟩ = Σ_{j=1}^∞ ⟨ϕ_j|ψ_j⟩
is a Hilbert space.

EXAMPLE 8.9. ⨁_{j=1}^∞ ℂ = ℓ²(ℕ).

Similarly, if H and K are two Hilbert spaces, we define their tensor product as follows: we start with the set of all finite linear combinations of elements of H × K:
(8.41)  F(H, K) = { Σ_{j=1}^n α_j(ψ_j, ϕ_j) : (ψ_j, ϕ_j) ∈ H × K, α_j ∈ ℂ }.

Since we want (ψ1 +ψ2)⊗ϕ = ψ1 ⊗ϕ+ψ2 ⊗ϕ, ψ⊗(ϕ1 +ϕ2) = ψ⊗ϕ1 +ψ⊗ϕ2, and (αψ) ⊗ ϕ = ψ ⊗ (αϕ), we consider the quotient F (H, K)/N(H, K), where

(8.42)  N(H, K) = span{ Σ_{j,k=1}^n α_jβ_k(ψ_j, ϕ_k) − ( Σ_{j=1}^n α_jψ_j, Σ_{k=1}^n β_kϕ_k ) },
and write ψ ⊗ ϕ for the equivalence class of (ψ, ϕ). Next, define

(8.43)  ⟨ψ₁ ⊗ ϕ₁|ψ₂ ⊗ ϕ₂⟩ = ⟨ψ₁|ψ₂⟩ ⟨ϕ₁|ϕ₂⟩,
which extends to a sesquilinear form on F(H, K)/N(H, K). It is easy to show that this sesquilinear form is a scalar product. The completion of F(H, K)/N(H, K) with respect to the induced norm is called the tensor product H ⊗ K of H and K.

LEMMA 8.12. If {ψ_j} and {ϕ_k} are orthonormal bases for H and K, respectively, then {ψ_j ⊗ ϕ_k} is an orthonormal basis for H ⊗ K.

EXAMPLE 8.10. H ⊗ ℂⁿ = Hⁿ. In particular, ℂⁿ ⊗ ℂᵐ = ℂⁿᵐ.

EXAMPLE 8.11. L2(M, dµ) ⊗ L2(N, dν) = L2(M × N, dµ × dν).
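In coordinates (H = ℂⁿ, K = ℂᵐ) the tensor product of vectors is the Kronecker product; a small numpy sketch of Lemma 8.12 and Example 8.10 (the dimensions are arbitrary choices):

```python
import numpy as np

# Tensor products of basis vectors of C^2 and C^3 via the Kronecker product.
e = np.eye(2)        # orthonormal basis {psi_j} of C^2
f = np.eye(3)        # orthonormal basis {phi_k} of C^3

products = [np.kron(e[j], f[k]) for j in range(2) for k in range(3)]
G = np.array([[np.vdot(a, b) for b in products] for a in products])
print(np.allclose(G, np.eye(6)))   # {psi_j (x) phi_k} is an orthonormal basis of C^2 (x) C^3 = C^6
```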

9. Angular momentum and spin

In classical mechanics, the components of the angular momentum of a point particle are related to the coordinates x, y and z and momenta px, py and pz by

Lx = ypz − zpy,

Ly = zpx − xpz,

Lz = xpy − ypx. To find quantum mechanical operators for the orbital angular momentum, use is made of the assumption that the correspondence principle must be satisfied. Thus any relation which appears in classical mechanics must be valid after quantisa- tion of the canonical variables. As an example, the resulting expression for the z-component of orbital angular momentum is

Lz = x (−i~∂y) − y (−i~∂x) = −i~ (x∂y − y∂x) .

EXERCISE 9.1. Verify the commutation relations

[Lz, x] = i~y, [Lz, y] = −i~x, [Lz, z] = 0, [Lz, px] = i~py, [Lz, py] = −i~px, [Lz, pz] = 0.

The commutation relations among the various components of the angular mo- mentum can be obtained:

[L_x, L_y] = iℏL_z,  [L_y, L_z] = iℏL_x,  [L_z, L_x] = iℏL_y,  etc.
Note that the operators for the three components of angular momentum do not commute with one another (they are not simultaneously measurable). The commutation relations can be written in compact notation using the Levi-Civita antisymmetric tensor:
[L_j, L_k] = iℏ ε_{jkl} L_l,  j, k, l = 1, 2, 3.
Another physical quantity of considerable interest is the square of the magnitude of the angular momentum. The corresponding operator is defined through
L² = L_x² + L_y² + L_z².
It is easy to check that L² commutes with all three of the components:
[L², L_x] = [L², L_y] = [L², L_z] = 0.
Since the z-component and the square of the angular momentum commute with each other, it is possible to diagonalise both operators simultaneously. We then have
L²ψ = aψ,

L_zψ = bψ.
Of course ⟨L²⟩ ≥ ⟨L_z²⟩, so that a ≥ b². It is useful at this point to define two operators which play a role similar to that of the ladder operators of the simple harmonic oscillator. They are

L_± = L_x ± iL_y.
It can be shown that [L_z, L_±] = ±ℏL_±.

This equation says that L+ and L− play the role of ladder operators with the regard to the eigenvalues of Lz. In fact,

L_z(L_+ψ) = (b + ℏ)(L_+ψ).
Since L² commutes with all three components of the angular momentum, we get
L²(L_+ψ) = a(L_+ψ).
Thus the operator L_+, operating on a simultaneous eigenfunction of L_z and L², generates a new simultaneous eigenfunction of these two operators for which the eigenvalue of L² is left unchanged but the eigenvalue of L_z is increased by ℏ. The eigenvalue b has an upper bound; otherwise the inequality a ≥ b² would be violated. If we assume that b is the largest eigenvalue satisfying the inequality, then it must be that L_+ψ = 0. If we multiply both sides by L_− we get
L_−L_+ψ = (L² − L_z² − ℏL_z)ψ = 0,
so that a = b(b + ℏ). In a similar manner it is easy to show that L_−ⁿψ is an eigenfunction of L_z:
L_z(L_−ⁿψ) = (b − nℏ)L_−ⁿψ.
Take n to be the largest integer for which the inequality a ≥ (b − nℏ)² is satisfied. In this case L_−L_−ⁿψ = 0. If we multiply by L_+ we get
L_+L_−L_−ⁿψ = (L² − L_z² + ℏL_z)L_−ⁿψ = 0.
From this, a = (b − nℏ)² − (b − nℏ)ℏ. Combining this with the previous relation between a and b we finally get
(9.1)  b = (n/2)ℏ = lℏ.
From this, l is nonnegative and either integer or half-integer (depending on whether n is even or odd). We will show later that for the orbital angular momentum, l takes only integral values l = 0, 1, 2, 3, ....

The corresponding value of a is

a = l(l + 1)ℏ².

We can summarise as

L² Y_{lm} = l(l + 1)ℏ² Y_{lm},  L_z Y_{lm} = mℏ Y_{lm}.
l can take on nonnegative integral values, and m can take positive or negative integral values such that |m| ≤ l. Note that these properties result directly from the commutation relations of the angular momentum, and hence follow from the algebraic properties of the operators alone. Note that only one component of the angular momentum may be precisely specified at a time. Although a simultaneous knowledge of the other two components is impossible, it is possible to say something about their expectation values. For example, for a particle in the angular-momentum state Y_{lm} (eigenfunction of L² and L_z),

⟨L_x⟩ = ⟨L_y⟩ = 0.
Also,
⟨L_x²⟩ = ⟨L_y²⟩ = ½⟨L² − L_z²⟩ = ½( l(l + 1) − m² )ℏ².
Note that even when the angular momentum is "parallel" to the z-axis (m = l), the x- and y-components are still not zero. It is helpful to visualise the results of this section with the aid of a geometrical model. Consider the length of the angular momentum vector (L_x, L_y, L_z) to be √(l(l + 1)) ℏ. The 2l + 1 allowed projections of this vector on the z-axis are given by mℏ, with m = 0, ±1, ±2, ..., ±l. Note that the projection on the z-axis never exceeds the length of the vector. The angular momentum thus may be visualised as lying on the surface of a cone having the z-axis for its axis and an altitude of mℏ. All positions on the surface are equally likely.

9.1. Orbital angular momentum. Consider now the orbital angular momentum wavefunctions which are simultaneously eigenfunctions of L² and L_z. It is helpful to introduce spherical coordinates in the usual fashion: x = r sinθ cosφ, y = r sinθ sinφ, z = r cosθ. The operator L_z in spherical coordinates takes the form
L_z = −iℏ ∂/∂φ.
We see that the simultaneous eigenfunctions of L_z and L² are

(9.2)  Y_{lm} = exp(imφ)Θ(θ),
where m must take integer values if the resulting function is to be single-valued. In a similar manner, the operator L² takes the form
L² = −ℏ²[ (1/sinθ) ∂/∂θ (sinθ ∂/∂θ) + (1/sin²θ) ∂²/∂φ² ].
It can be seen that the operator for L² is essentially the angular part of the Laplacian operator:
−(ℏ²/2m) Δ = −(ℏ²/2m) (1/r) ∂²/∂r² (r ·) + L²/(2mr²).

The wavefunctions Y_{lm}(θ, φ) are called spherical harmonics. They form an orthonormal set of functions on the sphere in the sense that
(9.3)  ⟨Y_{lm}|Y_{l′m′}⟩ = ∫ Y_{lm}*(θ, φ) Y_{l′m′}(θ, φ) sinθ dθ dφ = δ_{ll′} δ_{mm′}.

Consequently, one can expand any wavefunction ψ(r, θ, φ) as
ψ(r, θ, φ) = Σ_{l,m} a_{lm}(r) Y_{lm}(θ, φ).
The matrix elements for the z-component of the angular momentum in this representation are (writing Y_{lm} = |l, m⟩)

(9.4)  (L_z)_{lm,l′m′} = ⟨l, m| L_z |l′, m′⟩ = mℏ δ_{ll′} δ_{mm′}.
In a similar manner, the matrix elements of the square of the angular momentum are

(9.5)  (L²)_{lm,l′m′} = ⟨l, m| L² |l′, m′⟩ = l(l + 1)ℏ² δ_{ll′} δ_{mm′}.

2 The (infinite) matrices representing Lz and L have thus been evaluated in the representation in which they are diagonal

 0  −1  0   +1     −2   −1   0     +1  L =  +2 , z ~  −3     −2   −1   0     +1   +2   +3  . . . 53

 0  2  2   2     6   6   6     6  L2 = 2  6 . ~  12     12   12   12     12   12   12  . . .

The next problem is that of calculating the matrix elements of the operators Lx and Ly. To do this, use is made of the ladder operators L± that satisfy

L_+Y_{lm} = c Y_{l,m+1},
L_−Y_{lm} = c′ Y_{l,m−1}.

c and c′ are the only nonzero matrix elements of L_+ and L_−:

c = ⟨Y_{l,m+1}|L_+Y_{lm}⟩,
c′ = ⟨Y_{l,m−1}|L_−Y_{lm}⟩.

It turns out that
L_−Y_{lm} = ℏ √( l(l + 1) − m(m − 1) ) Y_{l,m−1},
L_+Y_{lm} = ℏ √( l(l + 1) − m(m + 1) ) Y_{l,m+1}.
In matrix form, in the basis ordered as above, L_− is block diagonal, each l block having nonzero entries only immediately below the main diagonal: for l = 1 the nonzero entries are ℏ√2, ℏ√2; for l = 2 they are ℏ√4, ℏ√6, ℏ√6, ℏ√4; for l = 3 they are ℏ√6, ℏ√10, ℏ√12, ℏ√12, ℏ√10, ℏ√6; and so on.

Note that L_− and L_+ are not self-adjoint.
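A short numpy sketch (units ℏ = 1 are an assumption for illustration) that builds L_z and L_± in an l block from the formulas above and checks the commutation relations and L² = l(l + 1)·1:

```python
import numpy as np

def angular_momentum_matrices(l):
    """Return Lz, L+, L- in the basis |l, m>, m = l, l-1, ..., -l (units hbar = 1)."""
    m = np.arange(l, -l - 1, -1)
    Lz = np.diag(m.astype(float))
    # <l, m+1 | L+ | l, m> = sqrt(l(l+1) - m(m+1))
    Lp = np.diag(np.sqrt(l * (l + 1) - m[1:] * (m[1:] + 1)), k=1)
    return Lz, Lp, Lp.conj().T

Lz, Lp, Lm = angular_momentum_matrices(1)
Lx, Ly = (Lp + Lm) / 2, (Lp - Lm) / (2j)
L2 = Lx @ Lx + Ly @ Ly + Lz @ Lz
print(np.allclose(Lx @ Ly - Ly @ Lx, 1j * Lz))      # [Lx, Ly] = i Lz
print(np.allclose(L2, 1 * (1 + 1) * np.eye(3)))     # L^2 = l(l+1) on the l = 1 block
```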

EXERCISE 9.2. Prove the above formulae. (Hint: use the fact that L_+ and L_− are Hermitian adjoints, L_−* = L_+, and the identity L² = L_+L_− + L_z² − ℏL_z to compute first ⟨Y_{lm}|L_+L_−Y_{lm}⟩ = [l(l + 1) − m(m − 1)]ℏ².)

From the definitions L_x = (L_+ + L_−)/2 and L_y = (L_+ − L_−)/(2i) we get the nonzero matrix elements of L_x and L_y:

⟨Y_{l,m−1}|L_x Y_{lm}⟩ = √((l + m)(l − m + 1)) ℏ/2,
⟨Y_{l,m+1}|L_x Y_{lm}⟩ = √((l + m + 1)(l − m)) ℏ/2,
⟨Y_{l,m−1}|L_y Y_{lm}⟩ = i√((l + m)(l − m + 1)) ℏ/2,
⟨Y_{l,m+1}|L_y Y_{lm}⟩ = √((l + m + 1)(l − m)) ℏ/(2i).

9.2. Spin. Thus far, only orbital angular momentum has been dealt with explicitly. However, it was seen that the formalism based on the commutation relations permits either half-integral or integral values of l: the restriction to integral l-values resulted from the explicit form of the operators L_z = xp_y − yp_x, etc., and the requirement of a single-valued wavefunction. Experiments confirmed that particles have an internal degree of freedom that obeys the commutation relations of an angular momentum but is not associated with the 'motion' of the particles. This degree of freedom is called the spin angular momentum of the particle. For spin angular momentum, it is found empirically that the quantum number may take on either integral or half-integral values. The relations obtained for the spin can be derived from the commutation relations. We therefore have the result that the simultaneous eigenfunctions of the square of the spin, denoted by S², and the z-component of the spin, denoted by S_z, are given by
S² ψ_{s m_s} = s(s + 1)ℏ² ψ_{s m_s},

S_z ψ_{s m_s} = m_s ℏ ψ_{s m_s},
where s may take on either integral or half-integral values depending on the nature of the particle, and m_s = −s, −s + 1, ..., +s − 1, +s. It is also found that for a given particle s is fixed (a number characterising the particle), while L² can assume any value l(l + 1)ℏ². Note that the spin is a purely quantum observable. This can be seen heuristically: as ℏ → 0, S² → 0. In the classical limit ℏ → 0, but l can tend to infinity in such a way that l(l + 1)ℏ² remains finite; this is not possible for the spin, since s is fixed.

9.3. Total angular momentum. Consider the problem of the addition of two kinds of angular momenta, the orbital angular momentum and the spin of a particle.

The relations which we will obtain, however, are valid for any two commuting angular momenta (e.g., total spin of a system of two particles). The total angular momentum J can be written as the sum of the orbital and the spin angular momenta: J = L + S, where J has components

Jx = Lx + Sx,Jy = Ly + Sy,Jz = Lz + Sz.

Since the angular and spin momenta commute, [Lj,Sk] = 0, the total angular momentum J satisfies the same commutation relations of L and S. Therefore J 2 and Jz commute

J² ψ_{j m_j} = j(j + 1)ℏ² ψ_{j m_j},
J_z ψ_{j m_j} = m_j ℏ ψ_{j m_j}.

Consider first the eigenvalues of

Jz = Lz + Sz.

They are given by the sum mj = ml +ms of the eigenvalues of Lz and Sz (because 2 2 2 2 [Lz,Sz] = 0). L and S have eigenvalues l(l + 1)~ and s(s + 1)~ , respectively, and the square of the total angular momentum is

J² = L² + S² + 2(L_xS_x + L_yS_y + L_zS_z).

We want to determine the eigenvalues of J². First, note that J², L², S², J_z all commute with one another. It is also clear that L², L_z, S², S_z form another mutually commuting set of observables. Therefore we have (at least) two alternative sets of four mutually commuting operators. If we fix the values of l and s, the corresponding subspace has dimension

(2l + 1) × (2s + 1).

(This is the number of possible ‘orientations’ of ml and ms.)

9.4. Sum of angular momenta. Consider a system characterised by two angular momenta J₁ and J₂ with [J₁, J₂] = 0. We want to determine the eigenvalues of

J = J₁ + J₂ = J₁ ⊗ 1 + 1 ⊗ J₂.

From the previous considerations we have that

J_{1z} has eigenvalues ℏm₁,
J_{2z} has eigenvalues ℏm₂,
J_z has eigenvalues ℏm = ℏ(m₁ + m₂),
J₁² has eigenvalues ℏ²j₁(j₁ + 1),
J₂² has eigenvalues ℏ²j₂(j₂ + 1),
J² has eigenvalues ℏ²j(j + 1).

A complete orthonormal set of the Hilbert space H = H1 ⊗ H2 of the system is

{|j₁, j₂; m₁, m₂⟩} = {|j₁, m₁⟩ ⊗ |j₂, m₂⟩}.
Consider the subspace of H generated by the system

S_{j₁j₂} = {|j₁, j₂; m₁, m₂⟩ : |m₁| ≤ j₁, |m₂| ≤ j₂},
for fixed values of j₁ and j₂. The dimension of this subspace is (2j₁ + 1)(2j₂ + 1). Consider another basis of this subspace, namely

Tj1j2 = {|j1, j2; j, mi}. 2 Note that j and m are defined by the conditions that ~ j(j + 1) is an eigenvalue of 2 J , and ~m is an eigenvalue of Jz. The eigenvalue m assume the 2j + 1 values, from −j to +j. It remains to find the range of values of j.

We first compute the largest value of jmax of j. Suppose that j = jmax and m = j; then m = mmax = (m1 + m2)max = (m1)max + (m2)max = j1 + j2. Hence,

jmax = j1 + j2 and

|j1, j2; m1 = j1, m2 = j2i = |j1, j2; j = j1 + j2, m = ji .

We now show that the second largest value of j is j = j₁ + j₂ − 1. Consider the eigenvalue ℏm = ℏ(j₁ + j₂ − 1). There are two corresponding eigenstates in the set S_{j₁j₂}:

|j₁, j₂; m₁ = j₁ − 1, m₂ = j₂⟩  and  |j₁, j₂; m₁ = j₁, m₂ = j₂ − 1⟩.

In other words, the subspace generated by the eigenstates of Jz with eigenvalue ~m = ~(j1 + j2 − 1) has dimension 2. Therefore, there are two eigenstates of Jz in the set Tj1j2 . One is

|j1, j2; j = j1 + j2, m = j − 1i ; the other must be an eigenstate of J 2 corresponding to another eigenvalue and the only possibility is

|j₁, j₂; j = j₁ + j₂ − 1, m = j⟩.

2 Thus, j1 + j2 − 1 is an eigenvalue of J . This reasoning can be reiterated: j can assume values j1 + j2, j1 + j2 − 1, j1 + j2 − 2, etc. until a minimum value jmin.

To determine jmin we reason as follows. The subspace generated by Sj1j2 has dimension (2j1 + 1)(2j2 + 1). The number of vectors in the second basis Tj1j2 must be the same. Therefore

(2j₁ + 1)(2j₂ + 1) = Σ_{j=j_min}^{j₁+j₂} (2j + 1).

This is an equation in the unknown jmin whose solution is

j_min = |j₁ − j₂|.
We conclude that, for the sum of two angular momenta, J² = (J₁ + J₂)² has eigenvalues ℏ²j(j + 1) where j = j₁ + j₂, j₁ + j₂ − 1, ..., |j₁ − j₂|.

9.4.1. Clebsch-Gordan coefficients. We found that

Sj1j2 = {|j1, j2; m1, m2i : |m1| ≤ j1, |m2| ≤ j2}. and

T_{j₁j₂} = {|j₁, j₂; j, m⟩ : j = j₁ + j₂, j₁ + j₂ − 1, ..., |j₁ − j₂|, |m| ≤ j}
are two orthonormal sets of the same Hilbert space describing two angular momenta (the angular momenta of two particles, or the spin and orbital angular momentum of the same particle, etc.). Therefore, any vector in T_{j₁j₂} can be written as a linear combination of vectors in S_{j₁j₂}:
(9.6)  |j₁, j₂; j, m⟩ = Σ_{m₁,m₂} |j₁, j₂; m₁, m₂⟩ ⟨j₁, j₂; m₁, m₂|j₁, j₂; j, m⟩.

The coefficients ⟨j₁, j₂; m₁, m₂|j₁, j₂; j, m⟩ of this expansion are called Clebsch-Gordan (CG) coefficients. For convenience, we can suppress the labels j₁, j₂ in the kets.

Suppose without loss of generality that j1 ≥ j2. From the properties of addi- tion of angular momenta we have

⟨j₁, j₂; m₁, m₂|j, m⟩ ≠ 0  ⟹  j₁ − j₂ ≤ j ≤ j₁ + j₂,
⟨j₁, j₂; m₁, m₂|j, m⟩ ≠ 0  ⟹  m = m₁ + m₂.
(The first condition is called the triangle condition, for geometrically it means that we must be able to form a triangle with sides j₁, j₂ and j.) Moreover, we can always choose the CG coefficients so that

⟨j₁, j₂; m₁, m₂|j, m⟩ ∈ ℝ,
⟨j₁, j₂; j₁, j − j₁|j, j⟩ > 0.
Another useful property is

⟨j₁, j₂; m₁, m₂|j, m⟩ = (−1)^{j₁+j₂−j} ⟨j₁, j₂; −m₁, −m₂|j, −m⟩.

This relation halves the work in the computations. The calculations of the CG coefficients proceed as follows. First, from the identity

|j1, j2; m1 = j1, m2 = j2i = |j1, j2; j = j1 + j2, m = ji we have

hj1, j2; j1, j2|j1 + j2, j1 + j2i = 1.

Then we apply the operator J_− = J_{1,−} ⊗ 1 + 1 ⊗ J_{2,−} to both sides, recalling that
J_− |j, m⟩ = ℏ √( j(j + 1) − m(m − 1) ) |j, m − 1⟩,
J_{1,−} |j₁, m₁⟩ = ℏ √( j₁(j₁ + 1) − m₁(m₁ − 1) ) |j₁, m₁ − 1⟩,
J_{2,−} |j₂, m₂⟩ = ℏ √( j₂(j₂ + 1) − m₂(m₂ − 1) ) |j₂, m₂ − 1⟩.
The other coefficients can be found by symmetry. If we arrange the CG coefficients into a matrix, we find that it is unitary. This follows from the fact that it relates one orthonormal set to another. Since the CG coefficients are all real, this matrix is orthogonal. Note that the relation (9.6) can be inverted (using the orthogonality of the CG coefficient matrix). We can write
(9.7)  |j₁, j₂; m₁, m₂⟩ = Σ_j ξ_j |j₁, j₂; j, m = m₁ + m₂⟩,
where ξ_j = ⟨j₁, j₂; j, m = m₁ + m₂|j₁, j₂; m₁, m₂⟩. In this way, the tensor product |j₁, m₁⟩ ⊗ |j₂, m₂⟩ can be written as a linear combination of the vectors |j, m = m₁ + m₂⟩, where j = j₁ + j₂, ..., |j₁ − j₂|.

When j₁ = j₂ = 1/2, we write symbolically
½ ⊗ ½ = 1 ⊕ 0;
when j₁ = 1, j₂ = 1/2:
1 ⊗ ½ = 3/2 ⊕ ½;
if j₁ = 1, j₂ = 1:
1 ⊗ 1 = 2 ⊕ 1 ⊕ 0,
and so on.
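A quick check of the counting argument above (a sketch; the pairs (j₁, j₂) are chosen to match the decompositions just listed): the dimensions of the two sides agree precisely because j runs from |j₁ − j₂| to j₁ + j₂.

```python
from fractions import Fraction as F

def allowed_j(j1, j2):
    """j = j1+j2, j1+j2-1, ..., |j1-j2|, in integer steps."""
    j, js = j1 + j2, []
    while j >= abs(j1 - j2):
        js.append(j)
        j -= 1
    return js

for j1, j2 in [(F(1, 2), F(1, 2)), (1, F(1, 2)), (1, 1)]:
    lhs = (2 * j1 + 1) * (2 * j2 + 1)
    rhs = sum(2 * j + 1 for j in allowed_j(j1, j2))
    print(j1, j2, allowed_j(j1, j2), lhs == rhs)   # True in each case
```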

EXAMPLE 9.1. Let J = J₁ + J₂ be the sum of two (commuting) angular momenta. Suppose that j₁ = 1/2 and j₂ = 1/2. If we measure J², the possible

outcomes are ℏ²j(j + 1) with j = 0, 1. The nonzero CG coefficients are
⟨½, ½; +½, +½ | 1, 1⟩ = 1,
⟨½, ½; +½, −½ | 1, 0⟩ = +1/√2,
⟨½, ½; −½, +½ | 1, 0⟩ = +1/√2,
⟨½, ½; +½, −½ | 0, 0⟩ = +1/√2,
⟨½, ½; −½, +½ | 0, 0⟩ = −1/√2.
The CG coefficients can be represented in a square matrix:

( |1, 1⟩ )     ( 1    0      0     0 )   ( |+½, +½⟩ )
( |1, 0⟩ )  =  ( 0  1/√2   1/√2    0 )   ( |+½, −½⟩ )
( |1, −1⟩ )    ( 0    0      0     1 )   ( |−½, +½⟩ )
( |0, 0⟩ )     ( 0  1/√2  −1/√2    0 )   ( |−½, −½⟩ )
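A numpy sketch (units ℏ = 1) recovering the singlet-triplet decomposition of Example 9.1: diagonalising J² = (J₁ + J₂)² on ℂ² ⊗ ℂ², built from the Pauli matrices of (8.26), gives the eigenvalues j(j + 1) = 2 (three times) and 0 (once), and the eigenvectors reproduce the CG matrix above up to phases.

```python
import numpy as np

# Spin-1/2 operators S = sigma/2 (hbar = 1), built from the Pauli matrices.
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2
I2 = np.eye(2)

# Total spin J = J1 (x) 1 + 1 (x) J2 on C^2 (x) C^2.
J = [np.kron(s, I2) + np.kron(I2, s) for s in (sx, sy, sz)]
J2 = sum(Ji @ Ji for Ji in J)

evals = np.round(np.linalg.eigvalsh(J2), 6)
print(evals)          # [0., 2., 2., 2.]  ->  j = 0 (singlet) and j = 1 (triplet)
```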

10. Identical particles

Although the definition of identical particles is the same classically and quan- tum mechanically, the implications are different in the two cases. When a system contains a number of particles of the same kind, e.g. a number of electrons, the par- ticles are indistinguishable one from another. No observable change is made when two of them are interchanged. This circumstance gives rise to some curious phe- nomena in quantum mechanics having no analogue in the classical theory, which arise from the fact that in quantum mechanics a transition may occur resulting in merely the interchange of two identical particles, which transition then could not be detected by any observational means. A satisfactory theory ought, of course, to count two indistinguishable states as the same state and to deny that any transition does occur when two identical particles are swapped. Consider, for simplicity, two quantum particles. Their state is represented by the wavefunction ψ(x1, x2). Consider the operator P that swap the two particles P ψ(x1, x2) = ψ(x2, x1). If the particles are indistinguishable, then ψ(x2, x1) = iα e ψ(x1, x2). If we swap again the particles we get

(10.1)  P²ψ(x₁, x₂) = e^{2iα}ψ(x₁, x₂) = ψ(x₁, x₂),
so that e^{iα} = ±1. This can be extended to an arbitrary number N of indistinguishable particles. The state is ψ(x₁, ..., x_N). If we permute the i-th and the j-th particles,

P_{ij}ψ(x₁, ..., x_i, ..., x_j, ..., x_N) = ψ(x₁, ..., x_j, ..., x_i, ..., x_N) = e^{iα}ψ(x₁, ..., x_i, ..., x_j, ..., x_N).
Again, the only possibilities (if we assume that the wavefunction is single-valued) are e^{iα} = ±1. In other words, the state of N indistinguishable particles is totally symmetric (e^{iα} = +1) or totally antisymmetric (e^{iα} = −1) under permutations. Indistinguishable particles with symmetric states are called bosons; particles with antisymmetric states are called fermions.

10.1. Bosonic and Fermionic Hilbert Spaces. Two identical bosons will always have symmetric state vectors and two identical fermions will always have antisymmetric state vectors. Let us call the Hilbert space of symmetric vectors V_S and the Hilbert space of antisymmetric vectors V_A. We now examine the relations between these two spaces and the tensor product V₁ ⊗ V₂. The space V₁ ⊗ V₂ consists (in finite dimension) of all linear combinations of vectors of the form |v₁v₂⟩ = |v₁⟩ ⊗ |v₂⟩. Suppose that the vectors {|e_i⟩} form a basis of V₁ (and of V₂). To each pair of vectors |e_ie_j⟩ and |e_je_i⟩ (i ≠ j) there correspond one bosonic vector (1/√2)(|e_ie_j⟩ + |e_je_i⟩) and one fermionic vector (1/√2)(|e_ie_j⟩ − |e_je_i⟩). If e_i = e_j, the vector |e_ie_i⟩ is already symmetric, i.e. bosonic. There is no corresponding fermionic vector (the Pauli principle). We express this relation as

(10.2)  V₁ ⊗ V₂ = V_S ⊕ V_A.
The case of N = 2 particles lacks one feature that is found at larger N. For every N! product vectors, we get only two acceptable (bosonic or fermionic) vectors. Hence, the tensor product V^{⊗N} for N ≥ 3 is larger (in dimensionality) than V_S ⊕ V_A.
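A small numpy sketch of (10.2) for two particles on ℂ^d (the dimension d below is an arbitrary choice): the symmetriser and antisymmetriser are complementary orthogonal projections, with ranks d(d + 1)/2 and d(d − 1)/2 adding up to d².

```python
import numpy as np

d = 3
I = np.eye(d * d)
# Swap operator P on C^d (x) C^d: P (u (x) v) = v (x) u.
P = I.reshape(d, d, d, d).transpose(1, 0, 2, 3).reshape(d * d, d * d)

S = (I + P) / 2          # projection onto the symmetric (bosonic) subspace V_S
A = (I - P) / 2          # projection onto the antisymmetric (fermionic) subspace V_A

print(np.allclose(S + A, I))                              # V1 (x) V2 = V_S (+) V_A
print(int(np.trace(S).real), int(np.trace(A).real))       # d(d+1)/2 = 6 and d(d-1)/2 = 3
```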

References Here are the main sources I used during the preparation of these lecture notes.

[1] L. Pauling and E. B. Wilson, Introduction to Quantum Mechanics, McGraw-Hill Book Company, 1935.
[2] A. Peres, Quantum Theory: Concepts and Methods, Fundamental Theories of Physics 72, Kluwer Academic Publishers, 2002.
[3] H. Maassen, Quantum Probability and Quantum Information Theory, in F. Benatti, M. Fannes, R. Floreanini, and D. Petritis, editors, Quantum Information, Computation and Cryptography, 808, 65-108, Springer, Berlin, 2010.
[4] G. Nardulli, Meccanica Quantistica I: Principi, Collana di fisica e scienze esatte diretta da Sergio Ratti, FrancoAngeli, 2001.
[5] K. R. Parthasarathy, Mathematical Foundations of Quantum Mechanics, Texts and Readings in Mathematics 35, Hindustan Book Agency, New Delhi, 2011.
[6] L. I. Schiff, Quantum Mechanics, 3rd edition, McGraw-Hill, 1968.
[7] J. J. Sakurai, Modern Quantum Mechanics, Addison-Wesley, 1994.

[8] R. Shankar, Principles of Quantum Mechanics, 2nd edition, Kluwer Academic, 1994.
[9] G. Teschl, Mathematical Methods in Quantum Mechanics: With Applications to Schrödinger Operators, 2nd edition, Graduate Studies in Mathematics 157, American Mathematical Society, Providence, Rhode Island, 2010.
[10] H. Weyl, The Theory of Groups and Quantum Mechanics, Dover Publications, 1950.