<<

Physics 449/451 - Course Notes

David L. Feder

September 13, 2011

Contents

1 Energy in Thermal Physics (First Law of Thermodynamics)
  1.1 Thermal Equilibrium
  1.2 The Ideal Gas
    1.2.1 Thermodynamic Derivation
    1.2.2 Mechanical Derivation
  1.3 Equipartition of Energy
  1.4 Heat and Work
  1.5 Compression Work: the Adiabat
  1.6 Heat Capacity

2 The Second Law of Thermodynamics (aka The Microcanonical Ensemble)
  2.1 Two-State Systems (aka Flipping Coins)
    2.1.1 Lots and lots of trials
    2.1.2 Digression: Statistics
  2.2 Flow toward equilibrium
  2.3 Large Systems
    2.3.1 Discrete Random Walks
    2.3.2 Continuous Random Walks
    2.3.3 Quantum Walks and Quantum Computation
  2.4 Entropy
    2.4.1 Boltzmann
    2.4.2 Shannon Entropy
    2.4.3 von Neumann Entropy

3 Equilibrium
  3.1 Temperature
  3.2 Entropy, Heat, and Work
    3.2.1 Thermodynamic Approach
    3.2.2 Statistical Approach
  3.3 Paramagnetism
  3.4 Mechanical Equilibrium and Pressure
  3.5 Diffusive Equilibrium and Chemical Potential

4 Engines and Refrigerators
  4.1 Heat Engines
  4.2 Refrigerators
  4.3 Real Heat Engines
    4.3.1 Stirling Engine
    4.3.2 Steam Engine
    4.3.3 Internal Combustion Engine
  4.4 Real Refrigerators
    4.4.1 Home Fridges
    4.4.2 Liquefaction of Gases and Going to Absolute Zero

5 Free Energy and Chemical Thermodynamics
  5.1 Free Energy as Work
    5.1.1 Independent variables S and V
    5.1.2 Independent variables S and P
    5.1.3 Independent variables T and V
    5.1.4 Independent variables T and P
    5.1.5 Connection to Work
    5.1.6 Varying particle number
  5.2 Free Energy as Force toward Equilibrium

6 Boltzmann Statistics (aka The Canonical Ensemble)
  6.1 The Boltzmann Factor
  6.2 Z and the Calculation of Anything
    6.2.1 Example: Pauli Paramagnet Again!
    6.2.2 Example: Particle in a Box (1D)
    6.2.3 Example: Particle in a Box (3D)
    6.2.4 Example: Harmonic Oscillator (1D)
    6.2.5 Example: Harmonic Oscillator (3D)
    6.2.6 Example: The rotor
  6.3 The ... (reprise)
    6.3.1 Density of States
  6.4 The Maxwell Speed Distribution
    6.4.1 Interlude on Averages
    6.4.2 Molecular Beams
  6.5 ... (Already covered in Sec. 6.2)
  6.6 Gibbs' Paradox

7 ...
  7.1 Chemical Potential Again
  7.2 Grand Partition Function
  7.3 Grand Potential

8 Virial Theorem and the Grand Canonical Ensemble
  8.1 Virial Theorem
    8.1.1 Example: ideal gas
    8.1.2 Example: Average temperature of the sun
  8.2 Chemical Potential
    8.2.1 Free energies revisited
    8.2.2 Example: Pauli Paramagnet
  8.3 Grand Partition Function
    8.3.1 Examples
  8.4 Grand Potential

9 Quantum Counting
  9.1 Gibbs' Paradox
  9.2 Chemical Potential Again
  9.3 Arranging Indistinguishable Particles
    9.3.1 Bosons
    9.3.2 Fermions
    9.3.3 ...!
  9.4 Emergence of Classical Statistics

10 Quantum Statistics
  10.1 Bose and Fermi Distributions
    10.1.1 Fermions
    10.1.2 Bosons
    10.1.3 Entropy
  10.2 Quantum-Classical Transition
  10.3 Entropy and Equations of State

11 Fermions
  11.1 3D Box at zero temperature
  11.2 3D Box at low temperature
  11.3 3D isotropic harmonic trap
    11.3.1 Density of States
    11.3.2 Low Temperatures
    11.3.3 Spatial Profile
  11.4 A Few Examples
    11.4.1 Electrons in Metals
    11.4.2 Electrons in the Sun
    11.4.3 Ultracold Fermionic Atoms in a Harmonic Trap

12 Bosons
  12.1 Quantum Oscillators
  12.2 Phonons
  12.3 Blackbody Radiation
  12.4 Bose-Einstein Condensation
    12.4.1 BEC in 3D
    12.4.2 BEC in Lower Dimensions
    12.4.3 BEC in Harmonic Traps

Introduction

The purpose of these course notes is mainly to give your writing hand a break. I tend to write lots of equations on the board, because I want to be rigorous with the material. But I write very quickly and my handwriting isn't pretty (this is probably a huge understatement). So these course notes contain (hopefully) all the equations that I will be writing on the board, so that when you take notes during class you can focus on the concepts and my mistakes, rather than furiously trying to scribble down everything I am writing, which will probably contain mistakes anyhow. Not to say that these notes don't contain mistakes! These notes also have occasional non-mathematical expressions (i.e. sentences).

Chapter 1

Energy in Thermal Physics (First Law of Thermodynamics)

This chapter deals with very fundamental concepts in thermodynamics, many of which you can intuit from your experience.

1.1 Thermal Equilibrium

Some questions to ponder:

• What are the ways that you measure room temperature?
• What are the ways to measure temperatures that are much hotter? Colder?
• What exactly is temperature?
• What is absolute zero?
• How do systems reach a given temperature?
• What does it mean to say a system is in equilibrium?
• What is thermal equilibrium?

1.2 The Ideal Gas

1.2.1 Thermodynamic Derivation

Robert Boyle (1627-1691) was an Irish alchemist (!) who helped to establish chemistry as a legitimate field. After much observing, he found in 1662 that gases tended to obey the following equation:

P V = k,     (1.1)

where P is the pressure, V is the volume, and k is some constant that depends on the specific gas. This equation was known as Boyle's Law. In 1738 Daniel Bernoulli derived it using Newton's

equations of motion (see more about this in the next section), under the assumption that the gas was made up of particles too tiny to see, but no one paid any attention because these particles were not believed to actually exist. Later, Joseph-Louis Gay-Lussac (1778-1850) observed that at constant pressure, one always has V ∝ T, or V_1 T_2 = V_2 T_1. This is known as Charles' Law after some guy named Charles. Benoît Paul Émile Clapeyron (1799-1864) put the two laws together to obtain

P V = P_0 V_0 (267 + T),     (1.2)

in which the temperature is measured in degrees Celsius. The number 267 came from observations of Gay-Lussac. This was pretty impressive, since absolute zero is known today to be -273.15°C. Lorenzo Romano Amedeo Carlo Avogadro di Quaregna (Quaregga) e di Cerreto (1776-1856), otherwise known as Avogadro, showed in 1811 that the P_0 V_0 out front of Clapeyron's equation was related to the 'amount of substance' of the gas, and wrote:

P V = nR (267 + T),     (1.3)

where n is the number of moles of the gas, and R = 8.31 J/mol/K is a universal constant (independent of the type of gas). It's easier to think of the number of particles N (atoms or molecules) rather than the number of moles, so one can write N = n N_A, where N_A = 6.02214179(30) × 10^23 is known as Avogadro's number and corresponds to the number of atoms in a mole of gas. Finally, if we measure temperature in units of Kelvin (K = 273.15 + °C) and make the substitution R = N_A k_B, where k_B = 1.381 × 10^-23 J/K is Boltzmann's constant, we finally obtain the ideal gas law:

P V = NkBT. (1.4)

This is nice because we don’t have to worry about moles. What’s a mole anyhow??

That’s about all the history you’re going to get!

The ideal gas law (1.4) is called the equation of state for the ideal gas, because it relates the three state variables (thermodynamic coordinates) P, V, and T at equilibrium. These are macroscopic properties of the system, and their relationship doesn't depend on how the system was changed to get to where it is. I could write it instead as P = P(V,T), i.e. the pressure is given by some function that depends on two variables V and T. All systems at equilibrium have such constitutive equations, relating parameters of the system over which one has external control. The van der Waals (non-ideal) gas, for example, was found to have the equation of state

N k_B T = (P + a/V²)(V - b),     (1.5)

where a and b are molecule-dependent constants. Magnets have a similar equation that relates the temperature, magnetic field B, and the amount of magnetization M. See Section 3.3 for more details.

1.2.2 Mechanical Derivation

The ideal gas law can also be derived from first principles, which is what both Bernoulli (approximately) and Boltzmann did (thus the konstant with his initial on it) by assuming an atomistic theory. We'll do a good job of this in Chapter 3, but a rough job of it can be done right now. Let's assume that there is a single molecule or atom, bouncing around elastically in a volume V = AL,

Figure 1.1: One molecule bouncing into a piston.

where the length along the x-direction is L, as shown in Fig. 1.1. After many bounces against the piston, the average pressure directly on it is

P = F_x,on piston / A = -F_x,on molecule / A = -(m/A) ⟨Δv_x/Δt⟩.     (1.6)

Let's set Δt = 2L/v_x, the time it takes to make a full round trip. When it undergoes one elastic collision, its change in velocity is Δv_x = -2v_x. Putting these together gives

P = -(m/A) ⟨Δv_x/Δt⟩ = (2m v_x)/(2AL/v_x) = m v_x²/V.     (1.7)

If there were actually lots of molecules, each colliding with each other and the walls totally elastically, then we can forget the averageness of the pressure to obtain

P V = m [⟨v_x²⟩_1 + ⟨v_x²⟩_2 + ...] = m Σ_{i=1}^{N} ⟨v_x²⟩_i = m N ⟨v_x²⟩ = N k_B T,     (1.8)

using the ideal gas law at the end. So the mean kinetic energy per molecule in the x-direction is equivalent to temperature:

K_x/N = (1/2) m ⟨v_x²⟩ = (1/2) k_B T.     (1.9)

This means that the mean kinetic energy is

K = N [(1/2) m ⟨v_x²⟩ + (1/2) m ⟨v_y²⟩ + (1/2) m ⟨v_z²⟩] = (N/2) m ⟨v²⟩ = (3/2) N k_B T.     (1.10)

The root-mean-square (RMS) speed (which is usually close to the average value) of each atom/molecule is defined as v_RMS = √⟨v²⟩ and is equal to √(3 k_B T/m) in this case. Suppose we have oxygen at room

temperature. Diatomic oxygen has atomic mass 32, which is 5.297 × 10^-26 kg. Assuming room temperature is 300 K, one obtains v_RMS = 484 m/s. This is really fast! How fast? Consider that the speed of sound in air at standard temperature and pressure is only 343 m/s. Is it reasonable that the mean speed of the atoms is higher than the sound speed?
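A quick numerical check of this estimate, using the values quoted above (a minimal Python sketch; the constants are rounded):

```python
import math

k_B = 1.381e-23    # Boltzmann's constant, J/K
m_O2 = 5.297e-26   # mass of a diatomic oxygen molecule, kg (atomic mass 32)
T = 300.0          # assumed room temperature, K

v_rms = math.sqrt(3 * k_B * T / m_O2)
print(f"v_RMS = {v_rms:.0f} m/s")   # roughly 484 m/s, as quoted above
```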

1.3 Equipartition of Energy

The equipartition theorem states that 'every degree of freedom contributes an energy of (1/2) N k_B T to the total energy of the system.'

To make sense of this, we need to talk about what a degree of freedom actually is. We'll be much more rigorous about it in Section 1.13, but because this will be months from now it makes sense to talk about it now. A degree of freedom tells you about what the atom or molecule can actually do. Maybe it can move around (translate), or jiggle (vibrate) or turn around (rotate), or scream, or shine, or a host of other things. Each one of these capabilities contributes an energy of (1/2) k_B T to its energy. Likewise, suppose all your particles were constrained in some way to move only in one dimension; then the total translational energy would be smaller by k_B T per particle than for molecules moving in free space. The real value of the equipartition theorem is that it allows you to calculate the total energy without having to do any work. So lazy people use it all the time in the real world!

It helps to remember your classical mechanics, though. For example, the kinetic energy of a rotating body is given by something that looks like

K_rot = (1/2) (I_xx ω_x² + I_yy ω_y² + I_zz ω_z²),     (1.11)

where the I_ii are the moments of inertia in the diagonalized inertial basis, and the ω_i are the angular frequencies in radians per second. Each of these terms is quadratic, so each one corresponds to a degree of freedom. But recall that the energy of an oscillator is made up of both a kinetic energy and a potential energy:

E_osc = (1/2m)(p_x² + p_y² + p_z²) + (m/2)(ω_x² x² + ω_y² y² + ω_z² z²),     (1.12)

where I have already used that ω_i = √(k_i/m). Because there are two kinds of quadratic terms comprising the energy in each dimension, the three-dimensional oscillator contributes 3 k_B T per particle!

So the total energy of the system (in some limit that will be explained more clearly later on) is

U = (f/2) N k_B T,     (1.13)

where f is the number of degrees of freedom.
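As a rough illustration of Eq. (1.13), here is a small sketch that evaluates U for one mole of gas at an assumed 300 K, for two choices of f (whether all of these degrees of freedom are actually active at a given temperature is exactly the issue raised by the hydrogen heat capacity in Sec. 1.6):

```python
k_B = 1.381e-23   # Boltzmann's constant, J/K
N_A = 6.022e23    # particles per mole
T = 300.0         # assumed temperature, K

for label, f in [("monatomic, f = 3", 3), ("diatomic (translation + rotation), f = 5", 5)]:
    U = 0.5 * f * N_A * k_B * T   # Eq. (1.13)
    print(f"{label}: U = {U:.0f} J per mole")
```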

Many examples of equipartition will be covered in class.

It is interesting to note that, even though these results were derived under the assumption of an ideal gas, the equipartition theorem remains correct even when the gases are interacting. In fact, it can be applied to a huge variety of physical systems, many of which would surprise you.

1.4 Heat and Work

So far we have seen a connection between temperature and energy through the equation of state of the ideal gas and the idea of molecules bouncing around in a box. Using the equipartition theorem, it looks like the temperature is closely related to the heat of a system, if one considers the heat as being somehow related to the amount of energy a system has. In fact, the word ‘heat’ in thermodynamics is not the same as what we think of as the amount of ‘hotness,’ i.e. is not equivalent to the temperature.

In fact, 'heat' is the amount of energy that spontaneously flows between two systems: heat moves from the system at higher temperature to the system at lower temperature. You might worry that this definition is circular, though, because temperature is itself a measure of a system's tendency to either absorb or release energy. But now we can define temperature slightly better through the equipartition theorem: it is a measure of the total energy of the system U, i.e. T ≈ 2U/(f k_B), where f is the number of degrees of freedom.

D*: Heat: The heat Q is the energy added to the system by a spontaneous (i.e. not forced) process. (* The D stands for a definition.)

Please take careful note of the sign convention: Q is positive if heat is added to the system, and negative if it is removed. Of course, we could also change a system's energy by doing something to it. This is called 'work,' and it roughly corresponds to what you naturally think of as work: it takes work to do something to something! But it corresponds to anything where forces are present; it doesn't have to be you doing the work.

D: Work: The work W done on a system is the energy added to the system by a forced process.

Again, the work is positive if energy is added to the system; one says that work is done on the system. If energy is expended (released) by the system by the work, then one says that work is done by the system. But probably these concepts are familiar to you, because they are the same ones used in classical mechanics. Work in that case was defined as

W = ∫ F · dr,     (1.14)

and it is the same now. Of course, in situations where the applied force doesn't change with position, it reduces to W = F · Δr. Anyhow, heat and work together yield the First Law of Thermodynamics:

ΔU = Q + W.     (1.15)

This is really just a statement of the conservation of energy, so it is weird that it is called a Law, but there you go. Processes of heat transfer include conduction, convection, and radiation.

1.5 Compression Work: the Adiabat

There are many ways to do work on a system (or for a system to do work). One of the simplest is to squeeze it, such as pushing or pulling the piston in Fig. 1.1. In the case where we push on the piston (force to the left), the work is W = -F Δx. Now, F ≈ P A as long as the pressure is always uniform and at equilibrium during the operation. So we have W = -P A Δx. But A Δx = ΔV is the change in volume of the system, so

W = -P ΔV.     (1.16)

In our case ΔV < 0, so work W > 0 is done on the system, as expected, because we are the ones doing the work!

If the pressure changes during the course of the process, then the above result will no longer be correct. This is entirely possible because the volume is changing and generally this will change the pressure. In this case one needs to know the equation of state P = P (V,T ) and determine

W = -∫_{V_i}^{V_f} P(V,T) dV.     (1.17)

Let’s consider an ideal gas. Obviously, squeezing the container will pump energy into the system, because the work (1.17) is positive when dV < 0. But let’s suppose that somehow we manage to keep the temperature completely constant. Then the work done is

W = -N k_B T ∫_{V_i}^{V_f} dV/V = -N k_B T ln(V_f/V_i) > 0.     (1.18)

Because the temperature is closely tied to the total energy of the system, this is equivalent to stating that the total energy must remain constant. In this case, we must have as much heat flowing out of the system as there is work being done on the system, i.e.

Q = N k_B T ln(V_f/V_i) < 0,     (1.19)

which is negative because heat is flowing out. This process is called an isotherm.

Suppose now that we don't let any heat escape at all. Then the internal energy and temperature must both go up, ΔU = W. Using the equipartition theorem (1.13), we have

ΔU = (f/2) N k_B ΔT = -P ΔV.     (1.20)

Because things can vary during the course of the changes, it is better to write this as a differential equation, (f/2) N k_B dT = -P dV. Using the ideal gas formula again gives

(f/2) dT/T = -dV/V,     (1.21)

which can be integrated to yield (f/2) ln(T_f/T_i) = -ln(V_f/V_i) = ln(V_i/V_f). This yields

(T_f/T_i)^{f/2} = V_i/V_f  ⇒  V T^{f/2} = constant  ⇒  P V^γ = constant,     (1.22)

where γ = (f + 2)/f is the adiabatic exponent, and the process is called an adiabat.
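For a feel for the numbers, here is a small sketch applying Eq. (1.22) to a diatomic gas compressed adiabatically (the compression ratio and starting temperature are assumptions made for illustration):

```python
f = 5                     # degrees of freedom of a diatomic gas
gamma = (f + 2) / f       # adiabatic exponent, 1.4
T_i = 300.0               # assumed initial temperature, K
r = 10.0                  # assumed compression ratio V_i / V_f

T_f = T_i * r**(2 / f)    # from V T^(f/2) = constant
P_ratio = r**gamma        # from P V^gamma = constant
print(f"T_f = {T_f:.0f} K, P_f/P_i = {P_ratio:.1f}")   # about 754 K and 25
```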

A few comments here are in order. Since no heat is flowing in or out of the system in an adiabatic process, then in the absence of any friction we could completely recover the original state of the system by simply reversing the procedure, i.e. by pulling the piston back to where it was before. This will allow the system to do the work, and the total energy and temperature will go back to where they were at the beginning. Though we haven't talked about entropy yet, it turns out that such a process conserves the entropy, because nothing has changed when we get back to where we started. The process is then called isentropic. I mention this because often when people are talking about adiabatic processes, they sometimes actually mean isentropic, and vice versa.

The adiabatic heating of a gas on compression has many practical consequences. When gases are compressed (eventually into liquids) for use in scuba gear or industrial processes, they rapidly heat. So this needs to be performed under cold conditions where the compressing gases can quickly cool, with metal (i.e. thermally conducting) tanks under cold water in the case of scuba gear.

You might be interested to know that this compression is precisely the principle behind diesel engines: when the diesel gas is hot enough it spontaneously explodes. This is why diesel engines don’t need spark plugs and why it is better to have one of those cars if you need to ford a deep river on your way to the mountains! Here’s another application: the Chinooks. When the air descends from the tops of the mountains into the foothills, the air pressure increases because of the lower altitude, which heats it as an adiabatic process. It is largely for this reason that Chinook winds are warm. A similar thing happens in Antarctica with the katabatic winds, which can blow continuously for months. Unfortunately for researchers down there, in that case the air loses its adiabatic heat due to equilibration with the glaciers.

1.6 Heat Capacity

The heat capacity of an object is not exactly what it sounds like, i.e. how much heat it can hold. But it's close: it is how much heat is needed to raise its temperature by one degree. Of course, the properties of the object can change as it heats or cools (like it might freeze, catch fire, etc.), so the heat capacity needs to be carefully defined. Here's the way the book does it: the heat capacity is C ≡ Q/ΔT. In books you often hear about the 'specific heat' instead, which is the heat capacity per unit mass, c ≡ C/m. The problem with this definition is that Q = ΔU - W, so that the heat capacity depends on how the heat is added because of the work term. To make things simpler we assume that the applied work is zero. Assuming the work is of the form W = -P ΔV, this means the volume is kept constant. So we finally have

C_V = (ΔU/ΔT)_V = (∂U/∂T)_V.     (1.23)

Most real objects actually prefer to expand when heated, keeping their pressure constant. This effect is pretty small in solids and liquids (see more on this below) but is important in gases. So it is convenient to include the work done by the object during expansion:

C_P = [ΔU - (-P ΔV)]/ΔT |_P = (∂U/∂T)_P + P (∂V/∂T)_P.     (1.24)

In solid-state physics people usually just say 'heat capacity' or 'specific heat' and you should always assume they are talking about C_V. You should convince yourself that for an ideal gas C_P =

C_V + N k_B = C_V + R if N = N_A. Often in circumstances where the system is allowed to do volume-changing work it is useful to define the enthalpy:

H ≡ U + P V.     (1.25)

Then it's obvious that C_P = (∂H/∂T)_P. The enthalpy doesn't really mean anything particularly physical to me; it seems more like a mathematical device. Some textbooks say that it can be thought of as the total energy that you need to create something out of nothing, i.e. it includes the energy you need to displace the air to put something there. The enthalpy is hugely popular in chemistry, so because this is a physics class I'll try to avoid it as much as possible!

Let’s calculate the heat capacity for an ideal gas. Using the equipartition theorem (1.13) we have

C_V = ∂/∂T [(f/2) N k_B T] = (f/2) N k_B = (f/2) R,     (1.26)

where the last equality follows if N = N_A (see Sec. 1.2). In fact, for many people this equation is actually the equipartition theorem, i.e. that every degree of freedom contributes N k_B/2 to the heat capacity. The chemists (who have moles on the brain) state that every degree of freedom contributes R/2. With this fact in mind, let's consider the heat capacity of a gas of hydrogen (recall that hydrogen is diatomic, as are all the HOBFINC gases: hydrogen, oxygen, bromine, fluorine, iodine, nitrogen, and chlorine): It is clear from this figure that the heat capacity is strongly dependent on

Figure 1.2: The heat capacity of hydrogen gas.

the temperature. Is there an error in the ideal gas result above? Or perhaps is there something wrong with what we call a degree of freedom? I will justify the shape of this curve properly in Sec. 6.3.

At the risk of getting ahead of ourselves, it turns out that at a phase transition one can add heat to a system without having it increase its temperature at all (phase transitions will be considered quite a bit in Statistical Mechanics II). Think about a pot of water just at the boiling point. You keep the

burner on, but it stays at 100°C. Its heat capacity is infinite! A similar thing happens when you cool water down to the freezing point: a huge amount of heat leaves the system without the temperature dropping below 0°C. In fact, a sudden spike in the energy as a function of temperature is used as a signature of a phase transition. In 1908 (almost exactly 100 years ago!), Kamerlingh Onnes was cooling helium and found that it turned into a liquid at 4.2 K. Cooled even further, it started going bananas at T_λ = 2.17 K, with a huge release of energy, but below this it stayed a liquid. Only later was it realized that it had turned into a superfluid! These days, the fluid-superfluid transition is called the 'lambda point' because of the shape of the heat capacity curve:

Figure 1.3: Lambda point of liquid helium, showing the energy released at the superfluid transition.

Most materials have a so-called 'cooling curve' that shows how the cooling occurs as a function of time. Fig. 1.4 shows a curve for a generic metal as it cools from a liquid to solid. Notice the nonlinear shape of the curve even far from the phase transition. For the hottest cup of coffee (or tea or hot chocolate as per your preference!), is it better to add the milk right away when you pour it, or just before you drink it? Do you need to use the graph to figure this out?

The heat capacity is not a very useful quantity at phase transitions because of this divergence. But it would still be nice to be able to make quantitative statements at this point. Instead, one uses the 'latent heat of formation,' or the 'latent heat' for short:

L ≡ Q/m,     (1.27)

which measures the amount of heat needed (or released) to fully transform the substance from one phase to the other. One assumes that the pressure is constant and that no other work is done either. The latent heat for water to boil is L = 2260 J/g, or 540 cal/g (1 J ≈ 0.239 cal). Contrast this with the 418.4 J (100 cal) needed to bring each gram of water from 0° to 100°!
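A small sketch putting those two numbers side by side (the starting temperature and amount of water are assumptions made for illustration):

```python
c_water = 4.184   # specific heat of liquid water, J/(g K)
L_vap = 2260.0    # latent heat of vaporization, J/g (quoted above)
mass = 1000.0     # assumed: one litre of water, in grams

Q_warm = mass * c_water * (100.0 - 20.0)   # warm from 20 C to the boiling point
Q_boil = mass * L_vap                      # turn the boiling water into steam
print(f"heating: {Q_warm/1e3:.0f} kJ, boiling: {Q_boil/1e3:.0f} kJ")
# boiling takes roughly 6-7 times more energy than heating the water up
```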

While we're on the topic of phase transitions, you might be interested in looking at the generic phase diagram of many materials. What important biological consequences are there of water's 'anomalous' behaviour?

Figure 1.4: Cooling curve for a generic metal as it goes from liquid to solid.

Figure 1.5: Generic phase diagram. The dotted line corresponds to the 'anomalous' behaviour of water.

Chapter 2

The Second Law of Thermodynamics (aka The Microcanonical Ensemble)

2.1 Two-State Systems (aka Flipping Coins)

D Classical Probability: All events are equally likely.

D Statistical Probability: Probability that an event occurs is the measured relative frequency of occurrence. Q*: Is this definition circular? (* The Q stands for a question.)

D Trials: Experiments or tests.

D Events: Results of above, designated ei.

The collection of events forms an abstract space called event space. Each point i in event space is assigned a probability p_i = p(e_i), which is the probability of event i, so that

Σ_i p_i = 1.

The classical probability of event i is p_i = 1/Ω ∀ i, when there are Ω points in event space. For a single coin, Ω_c = 2; for a die, Ω_d = 6. Suppose we toss our coin four times. How many events are there? Ω = Ω_c^4 = 2^4 = 16. Each event is listed in the table below. The probability of obtaining the result HHHH is just as likely as obtaining HTTH or HTHT. Does this seem right?


Combination   (#H, #T)
HHHH          (4,0)
HHHT          (3,1)
HHTH          (3,1)
HHTT          (2,2)
HTHH          (3,1)
HTHT          (2,2)
HTTH          (2,2)
HTTT          (1,3)
THHH          (3,1)
THHT          (2,2)
THTH          (2,2)
THTT          (1,3)
TTHH          (2,2)
TTHT          (1,3)
TTTH          (1,3)
TTTT          (0,4)

Table 2.1: All the possible outcomes of tossing a coin four times.

Because the result of each coin flip is completely independent of the results of any previous coin flips, the probability of obtaining H or T is always 1/2 every time. So the probability of getting any of the 4-coin-flip events is p_i = (1/2) × (1/2) × (1/2) × (1/2) = 1/16.

D Microstate: Each compound event. For this example, there are 16 microstates. All microstates are formed from single events that are themselves equally likely.

Theorem: For a system in equilibrium, all microstates are equally possible.

Ergodic Hypothesis (Boltzmann 1871, Ehrenfest 1911): Given a sufficiently long time, all microstates will be observed and all points in event space will be accessed. THIS IS THE UNPROVEN FOUNDATION OF STATISTICAL MECHANICS!

Now, how many microstates have 4 heads and no tails, or (4, 0)? Or (3, 1), (2, 2), (1, 3), or (0, 4)? If we don’t care what order the faces come up, then we have #(4, 0) = #(0, 4) = 1; #(3, 1) = #(1, 3) = 4; #(2, 2) = 6. Recognize the pattern?
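These counts are easy to confirm by brute force; a short sketch that enumerates all sixteen microstates and sorts them into macrostates:

```python
from itertools import product
from collections import Counter

flips = list(product("HT", repeat=4))                        # all 2^4 microstates
macrostates = Counter((s.count("H"), s.count("T")) for s in flips)
print(len(flips))         # 16
print(dict(macrostates))  # {(4,0): 1, (3,1): 4, (2,2): 6, (1,3): 4, (0,4): 1}
```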

D Macrostate: Collection of microstates with some common property. In the coin case above, the macrostates correspond to all events that share the same number of heads and tails.

Theorem: The state of the system (the macrostate that is actually observed) is the macrostate with the largest number of microstates.

2.1.1 Lots and lots of trials

How to assign values to the macrostate when the number of trials (N) gets really huge? Suppose that we have a weighted coin, i.e. where the probability that it comes up heads is p(H) ≡ p and the probability of seeing tails is p(T) ≡ q = 1 - p, so that p + q = 1. Then the microstate probability P_micro is

P_micro = P_micro(n_1) = p^{n_1} q^{n_2} = p^{n_1} (1 - p)^{N - n_1},

where n_1 and n_2 are the number of p and q events, respectively, for a given trial. The total number of coin tosses is N = n_1 + n_2.

Then the macrostate probability is P_macro(n_1) = Ω(n_1) p^{n_1} q^{N - n_1}, where Ω(n_1) is the number of ways of arranging n_1 events with p and N - n_1 events with q. We also know that

Σ_{n_1} P_macro(n_1) = Σ_{n_1} Ω(n_1) p^{n_1} q^{N - n_1} = 1.     (2.1)

So now let's use the binomial theorem:

(p + q)^N = 1 = Σ_{n=0}^{N} [N!/(n!(N - n)!)] p^n q^{N - n}.     (2.2)

Comparison of Eqs. (2.1) and (2.2) shows that the number of ways of distributing microstates in a macrostate is given by:

Ω(N, n) = N!/[n!(N - n)!] ≡ (N choose n).     (2.3)

These are the binomial coefficients, which is why those numbers (1, 4, 6, 4, 1) appeared when counting microstates in the 4-coin macrostates above. By the way, the right-hand side of the above equation is read 'N choose n.' Note that if p = q = 1/2 then

Σ_{n=0}^{N} N!/[n!(N - n)!] = 2^N.     (2.4)

The binomial coefficients for increasing values of N are shown in Fig. 2.1.
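A one-line check of Eqs. (2.3) and (2.4), using Python's built-in binomial coefficient:

```python
from math import comb

N = 4
omegas = [comb(N, n) for n in range(N + 1)]   # Omega(N, n) for n = 0..N
print(omegas)                # [1, 4, 6, 4, 1], the counts from the 4-coin example
print(sum(omegas), 2**N)     # both 16, as in Eq. (2.4)
```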

What is the intuitive reason for this expression for Ω? Suppose we have 4 objects, labeled A, B, C, and D. How many ways can these be arranged? Let's see: ABCD, ABDC, ACBD, ACDB, ADBC, ADCB, BACD, BADC, BCAD, BCDA, BDAC, BDCA, CABD, CADB, CBAD, CBDA, CDAB, CDBA, DABC, DACB, DBAC, DBCA, DCAB, DCBA. So there's 4! = 24 ways. In general, N objects can be arranged N! ways. Why? Well, for the example above there's 4 ways of arranging the first letter, 3 ways of arranging the second, 2 ways for the third, and only one way for the last: i.e., 4 × 3 × 2 × 1 = 4!.

But A, B, C, and D might not be completely distinguishable! Suppose all of these letters actually represented heads (H). Then obviously there would be only one way of arranging them. If n of the letters stand for heads, then one would need to divide by n!, which is just the number of ways of arranging the previously distinguishable letters, which are now all heads. But we also need to divide

Figure 2.1: Normalized binomial coefficients for increasing N. These are the binomial coefficients divided by the total, 2^N.

by (N - n)!, which is the number of ways of arranging the previously distinguishable letters that now stand for tails. And so the total number of arrangements is N!/[n!(N - n)!].

From the above discussion, it should now be obvious how to extend this analysis to 'coins' of more than two sides (whatever they would look like!). A three-sided coin would have

Ω(N, n_1, n_2) = N!/[n_1! n_2! (N - n_1 - n_2)!],

while an m-sided coin would have

Ω(N, n_1, n_2, ..., n_{m-1}) = N!/[∏_{i=1}^{m} n_i!] = N!/[∏_{i=1}^{m-1} n_i! (N - Σ_{i=1}^{m-1} n_i)!].     (2.5)

This equation is going to be important later in the term.

I'll cover some examples of calculating probabilities and Ω's in class.

2.1.2 Digression: Statistics

Suppose that there is some variable u that can have values u_1, u_2, ..., u_m appearing with probabilities {p(u_1), p(u_2), ..., p(u_m)} = {p_1, p_2, ..., p_m}. Then

D Mean Value (Average): of u is designated ⟨u⟩ and is defined as

⟨u⟩ ≡ [u_1 p(u_1) + u_2 p(u_2) + ... + u_m p(u_m)] / [p(u_1) + p(u_2) + ... + p(u_m)] = Σ_{i=1}^{m} u_i p(u_i) / Σ_{i=1}^{m} p(u_i) = Σ_i u_i p_i,     (2.6)

because Σ_{i=1}^{m} p_i = 1 by the definition of probability.

In general, ⟨f(u)⟩ = Σ_{i=1}^{m} p_i f(u_i) = Σ_{i=1}^{m} p_i f_i.

D Standard Deviation about the mean: of u is designated (Δu)² and is defined as

(Δu)² ≡ Σ_{i=1}^{m} p_i (u_i - ⟨u⟩)² ≡ ⟨(u - ⟨u⟩)²⟩
      = ⟨u²⟩ - 2⟨u⟩⟨u⟩ + ⟨u⟩² = ⟨u²⟩ - ⟨u⟩².     (2.7)

Note that the average of ⟨u⟩² is just ⟨u⟩², since ⟨u⟩ is a constant. The standard deviation is also called the fluctuations about the mean, the second moment of u about the mean, and the dispersion of u.

D Root-Mean Square (RMS) value: This is just what it sounds like. It is the square root of the mean of the square of the function u: RMS(u) ≡ √⟨u²⟩.

2.2 Flow toward equilibrium

The discussion of flipping coins in the previous section must seem a bit out of place after all the discussion of gases in Chapter 1. In fact, it relates closely with the behaviour of a one-dimensional gas (which might also seem a bit artificial!), as will be seen in the following section. A simple extension relates it to a real 3D gas. But before we get to that, it turns out that the simple model of a two-sided coin corresponds closely to something called the Pauli paramagnet. This will be studied completely later on in Sec. 3.3, but it is useful to introduce it here, to illustrate how equilibrium works.

The Pauli paramagnet is a collection of tiny atomic magnetic spins that can either point up or down. One could imagine iron atoms this way. If you associate ↑ with heads and ↓ with tails, for example, then the correspondence with flipping coins is explicit. If you apply a magnetic field, all the spins will want to align themselves with the field, for example all pointing up. If you do this at high temperature and then cool the metal down, the spins will be frozen in the up position, and you get a bar magnet, something where all the little spins add up to a macroscopic magnet. For many metals the spin directions aren't frozen even at low temperature, and quickly become disordered, which is why permanent magnets tend to be iron. If you heat up the metal, though, then the spins will again become disordered and the total magnetization will go to zero, that is on average there will be as many ↑ as ↓ atoms.

With our statistical mechanics hats on, we should ask: why is the configuration with equal numbers of ↑ as ↓ atoms the most likely? Consider again Fig. 2.1, which shows the binomial coefficients

Ω_N(n_1) defined in Eq. (2.3) as a function of the number n_1 = n_↑ of ↑ atoms for a given total number N. Recall that each of these numbers corresponds to the number of microstates in a given macrostate. It is clear from the figure that the largest macrostate corresponds to n_↑ = N/2, i.e. where there are equal numbers of up and down spins. The most populated macrostate is also the one the system is most likely to be in.

If another magnet that was at the same temperature – though I am sweeping issues of temperature under the rug at the moment – was brought into contact with the first (totally disordered) magnet, then what would happen? Suppose that the first magnet has NA spins in it, and the second has NB spins. The most likely macrostates for each system separately correspond to

Ω_A = N_A!/[(N_A/2)!]²;   Ω_B = N_B!/[(N_B/2)!]²,     (2.8)

i.e. the configurations with equal numbers of spin up and down atoms. Just before contact, the multiplicity of the macrostate for the total system is the product of the two, Ω_total = Ω_A Ω_B. Just after contact, it becomes

Ω_total = (N_A + N_B)! / {[(N_A + N_B)/2]!}².     (2.9)

This corresponds to a total system with N = N_A + N_B atoms, half of which are in spin up and the other half in spin down. Let's put in some numbers for concreteness. Suppose each system has N_{A(B)} = 10, so that Ω_{A(B)} = 252. Then before contact we have Ω_A Ω_B = 63 504, which is a pretty big number. But after contact we have instead Ω_total = 184 756, which is almost three times bigger! Even if we had artificially managed to keep all the spins aligned in system B prior to contact, the state with the largest occupation of the total system has a mixture of ups and downs in total.
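The numbers above are just binomial coefficients, so they are easy to verify; a quick sketch:

```python
from math import comb

N_A = N_B = 10
omega_A = comb(N_A, N_A // 2)                    # 252
omega_B = comb(N_B, N_B // 2)                    # 252
before = omega_A * omega_B                       # 63 504 microstates before contact
after = comb(N_A + N_B, (N_A + N_B) // 2)        # 184 756 microstates after contact
print(before, after, round(after / before, 1))   # the ratio is about 2.9
```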

The above example illustrates that nature will adjust itself to a new environment by spontaneously 'flowing' to another state which has a higher 'multiplicity' of the macrostate. This is one perspective on the Second law of thermodynamics: systems will flow toward a state with the largest number of microstates. As we will see in the next few sections, these concepts are closely related to the increase of entropy in the universe. There is also an apparent paradox in the above example: the spins in system A need to somehow 'know' about the spins in system B in order to maximize the value of Ω, because Ω_total > Ω_A Ω_B. What do you think that this means?

2.3 Large Systems

2.3.1 Discrete Random Walks

Random (or drunken) walks underpin many important algorithms in physics and computer science, among other things. They also provide a nice description of how classical particles diffuse toward some kind of equilibrium. Suppose that there is a walker that at every step will randomly choose between moving right or left. Maybe she flips a coin and moves one step to the right if it comes up heads, and left otherwise. The question is: what is the probability of finding her m steps away after N steps are taken? Put another way, if she is a very precise drunk, so that the length of step is always exactly ℓ, then what is the probability of finding her a distance mℓ from where she started?

We know that N = n_1 + n_2, where n_1 and n_2 are the number of steps to the right and left, respectively. Evidently, the distance after N steps is given by the difference, m = n_1 - n_2. These two equations can be inverted, n_1 = (1/2)(N + m) and n_2 = (1/2)(N - m), so that the probability of being m steps away is

P_N(m) = [N!/(n_1! n_2!)] p^{n_1} q^{n_2} = N!/{[(N + m)/2]! [(N - m)/2]!} p^{(N+m)/2} q^{(N-m)/2}.

This is not very pretty. But if p = q = 1/2, then

p^{(N+m)/2} q^{(N-m)/2} = (1/2)^N

and the result becomes a bit nicer,

P_N(m) = [N!/(n_1! n_2!)] p^{n_1} q^{n_2} = N!/{[(N + m)/2]! [(N - m)/2]!} (1/2)^N.

It's nice to know some other things, like the mean number of steps taken to the right, and the standard deviation. Let's calculate these now.

⟨n_1⟩ = Σ_{n_1=0}^{N} p(n_1) n_1 = Σ_{n_1=0}^{N} [N!/(n_1!(N - n_1)!)] p^{n_1} q^{N - n_1} n_1     (2.10)
      = Σ_{n_1=0}^{N} [N!/(n_1!(N - n_1)!)] q^{N - n_1} [n_1 p^{n_1}]
      = Σ_{n_1=0}^{N} [N!/(n_1!(N - n_1)!)] q^{N - n_1} [p (d/dp)(p^{n_1})]     (2.11)
      = p (d/dp) Σ_{n_1=0}^{N} [N!/(n_1!(N - n_1)!)] p^{n_1} q^{N - n_1}
      = p (d/dp) (p + q)^N     (from the binomial theorem)
      = p N (p + q)^{N-1} = pN     (since p + q = 1).

So the mean number of steps to the right is just the total number of steps times the probability of taking a step to the right. No surprises here! Obviously we also know that ⟨n_2⟩ = qN and so ⟨m⟩ = N(p - q).

Now let's calculate the standard deviation (Δn_1)² = ⟨n_1²⟩ - ⟨n_1⟩². Because we already know ⟨n_1⟩, we only need to obtain ⟨n_1²⟩. But now, instead of a factor of n_1 appearing in the expression (2.10), there is now a factor of n_1². It would be nice to use a derivative trick like in line (2.11) above. Using

(d/dp)(p^{n_1}) = n_1 p^{n_1 - 1}   and   (d²/dp²)(p^{n_1}) = n_1 (n_1 - 1) p^{n_1 - 2}

d2 n2 = p2 (p + q)N + pN 1 dp2 2 N 2 = p N(N 1)(p + q) − + pN − = p2N(N 1)+ pN.  −  So (n )2 = n2 n 2 = p2N 2 p2N + pN p2N 2 = Np(1 p)= Npq. 1 1 − 1 − − −

The square root of the standard deviation is therefore Δn_1 = √(Npq). The relative width of the distribution is therefore

Δn_1/⟨n_1⟩ = √(Npq)/(Np) = (1/√N) √(q/p) = 1/√N   when p = q.

The upshot is that as the number of steps N increases, ⟨n_1⟩ also increases but the relative width of the distribution actually sharpens like N^{-1/2}.
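A short Monte Carlo sketch of these results (the number of steps and trials are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N, trials, p = 100, 100_000, 0.5

n1 = rng.binomial(N, p, size=trials)          # rightward steps in each trial
print(n1.mean(), N * p)                       # about 50.0 versus Np = 50
print(n1.std(), np.sqrt(N * p * (1 - p)))     # about 5.0 versus sqrt(Npq) = 5
```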

2.3.2 Continuous Random Walks

What is the limiting distribution as the number of steps approaches infinity (N → ∞), but the step size goes to zero? This transforms the discrete random walk discussed above into a continuous walk. The problem, though, is that when N gets huge, factors like N! and (N - n)! appearing in the binomial distribution above get insanely huge, and it becomes difficult to know what the result is of dividing them. One way to get around the problem is to take the logarithms of both sides, since logs of insanely huge numbers are often tractable large numbers. Here goes:

ln[P_N(n_1)] = ln{ [N!/(n_1!(N - n_1)!)] p^{n_1} q^{N - n_1} }
             = ln(N!) - ln(n_1!) - ln[(N - n_1)!] + n_1 ln(p) + (N - n_1) ln(q).

Now use Stirling's formula to handle the factorials,

D Stirling's formula: ln(n!) ≈ n ln(n) - n for n ≫ 1.

One then obtains

ln[P_N(n_1)] = N ln(N) - N - n_1 ln(n_1) + n_1 - (N - n_1) ln(N - n_1) + (N - n_1) + n_1 ln(p) + (N - n_1) ln(q).

And so

(d/dn_1) ln[P_N(n_1)] = -ln(n_1) - 1 + ln(N - n_1) + 1 + ln(p) - ln(q) = ln[(N - n_1) p / (n_1 q)].     (2.12)

D The most probable value for n_1 is designated ñ_1 and is defined by (d/dn_1) ln[P_N(n_1)] = 0.

Using this gives

(N - ñ_1) p / (ñ_1 q) = 1   because ln(1) = 0.

Straightforward manipulation gives ñ_1 = Np = ⟨n_1⟩, as expected.

What is the probability P_N(n_1) away from the most probable value ñ_1? Taylor series around n_1 = ñ_1:

ln[P_N(n_1)] = ln[P_N(ñ_1)] + (n_1 - ñ_1) (d/dn_1) ln[P_N(n_1)]|_{n_1=ñ_1} + [(n_1 - ñ_1)²/2] (d²/dn_1²) ln[P_N(n_1)]|_{n_1=ñ_1} + ...

⇒ P_N(n_1) ≈ P_N(ñ_1) exp{ [(n_1 - ñ_1)²/2] (d²/dn_1²) ln[P_N(n_1)]|_{n_1=ñ_1} }.

The result (2.12) can be used to evaluate the mess in the curly brackets:

(d²/dn_1²) ln[P_N(n_1)] = -1/n_1 - 1/(N - n_1).

At n_1 = ñ_1,

(d²/dn_1²) ln[P_N(ñ_1)] = -1/(Np) - 1/(N - Np)
                        = -1/(Np) - 1/(Nq)
                        = -(p + q)/(Npq) = -1/(Npq).

So the second derivative is negative, indicating that n_1 = ñ_1 is a maximum. Does this expression look familiar?

So finally one obtains the expression for the probability distribution of a continuous random walk:

P_N(n_1) = P_N(ñ_1) exp{ -(n_1 - Np)²/(2Npq) }
         = [1/√(2π(Δn_1)²)] exp{ -(n_1 - ⟨n_1⟩)²/[2(Δn_1)²] },     (2.13)

where the prefactor of the exponential is the normalization constant, since

∫ dn_1 P_N(n_1) ≡ 1.

Fig. 2.1 shows that the distribution of probabilities (assuming p = q = 1/2) gets closer to a Gaussian as N increases.

The continuous random walk discussed above assumes that space is continuous, but the number of steps taken N is still discrete. We can think of the number of steps as being equivalent to time: each step takes a certain amount of time, after all! What does the random walk look like when both space and time are continuous? If the probability of hopping left or right is the same, then the probability of being at the site m on the nth step, P_n(m), satisfies the equation

P_{n+1}(m) = (1/2) P_n(m - 1) + (1/2) P_n(m + 1).

So we also know that

P_{n+1}(m) - P_n(m) = (1/2) P_n(m - 1) + (1/2) P_n(m + 1) - P_n(m).     (2.14)

Now, if the distance travelled is x = ma with a the step size, and the elapsed time is t = nτ with τ the time needed to make a step, then we can convert this discrete equation to a continuous equation when a → 0 and τ → 0, and the probability P_n(m) is transformed into P(x, t). To make further progress we need to remember how derivatives are defined:

∂P(x, t)/∂t ≡ lim_{τ→0} [P(x, t + τ) - P(x, t)]/τ,

and

∂²P(x, t)/∂x² ≡ lim_{a→0} [P(x + a, t) + P(x - a, t) - 2P(x, t)]/a².

Comparison of these expressions with Eq. (2.14) gives

τ ∂P(x, t)/∂t = (a²/2) ∂²P(x, t)/∂x²

or alternatively

∂P(x, t)/∂t = D ∂²P(x, t)/∂x²,

where the diffusion constant D ≡ a²/2τ. Thus, diffusion of a gas is exactly the same as the gas doing a random walk!
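Here is a minimal sketch that iterates the hopping rule leading to Eq. (2.14) and checks that the spread of the walker grows diffusively, ⟨x²⟩ = 2Dt (step size, time step, and number of steps are arbitrary choices):

```python
import numpy as np

a, tau, steps = 1.0, 1.0, 200      # step size, time per step, number of steps
D = a**2 / (2 * tau)               # diffusion constant defined above

P = np.zeros(2 * steps + 1)        # P_n(m) on sites m = -steps ... +steps
P[steps] = 1.0                     # start with all probability at the center
for _ in range(steps):
    P = 0.5 * (np.roll(P, 1) + np.roll(P, -1))   # P_{n+1}(m) = [P_n(m-1) + P_n(m+1)]/2

m = np.arange(-steps, steps + 1)
print(np.sum(P * (m * a)**2), 2 * D * steps * tau)   # both 200: <x^2> = 2 D t
```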

2.3.3 Quantum Walks and Quantum Computation

The discrete-time quantum walk on the line behaves much as the classical version, except that instead of the walker choosing to move right or left at each step, she moves in both directions simultaneously, with some probability amplitude. A simple approach is to allow the walker to carry a 'quantum coin,' which can be put into a superposition of heads and tails. So the walker needs to keep track of both her coin coordinates, and her spatial coordinates. The coin states will be encoded in a spin degree of freedom, with |0⟩_C and |1⟩_C representing heads and tails, respectively. On a 1D lattice with M + 1 points, her spatial coordinate will be encoded in the vector |j⟩_S, 0 ≤ j ≤ M. So, the total wavefunction can be written

|ψ⟩ = Σ_{σ=0}^{1} Σ_{j=0}^{M} α_{σj} |σ⟩_C ⊗ |j⟩_S.

The quantum walk is effected by repeated application of the operator U, which consists of first applying the flip operator F on the coin, and then shifting the walker with S in a direction conditional on the spin state: right if |0⟩_C and left if |1⟩_C. For N steps, the resulting operator is U = [S(F ⊗ I)]^N, where I is the M-dimensional identity operator and

F = [cos(θ)|0⟩_C + sin(θ)|1⟩_C]⟨0|_C + [-sin(θ)|0⟩_C + cos(θ)|1⟩_C]⟨1|_C;
S = Σ_j (|0⟩_C⟨0|_C ⊗ |j + 1⟩_S⟨j|_S + |1⟩_C⟨1|_C ⊗ |j - 1⟩_S⟨j|_S),

where θ is some angle defining the fairness of the coin, with θ = π/4 giving the balanced coin. (Note that in 1D, a real coin operator is completely general, but this is not the case for more interesting graphs). Because the coin operator isn't really randomizing (the operation is perfectly unitary and the evolution of the wavefunction is coherent), I prefer to call this a 'quantum walk' rather than a 'quantum random walk.' The analysis of this discrete-time quantum walk is a bit involved, so I'll simply show the results in Fig. 2.2 for the random and quantum walks after 20 timesteps. In both of these cases, I chose a balanced coin and started the walks with full probability at the central vertex, and for the quantum case the initial coin state was chosen to be |0⟩_C + i|1⟩_C, which gives a nice symmetric pattern.
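A minimal numerical sketch of the walk just described (the lattice size and printing details are assumptions; the coin F and conditional shift S follow the definitions above, with the same balanced coin and initial state):

```python
import numpy as np

M, steps, theta = 41, 20, np.pi / 4               # sites, timesteps, balanced coin
F = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # coin flip in the {|0>, |1>} basis

psi = np.zeros((2, M), dtype=complex)             # psi[coin, site]
psi[:, M // 2] = [1 / np.sqrt(2), 1j / np.sqrt(2)]   # initial coin state |0> + i|1>

for _ in range(steps):
    psi = F @ psi                                 # apply F to the coin at every site
    shifted = np.zeros_like(psi)
    shifted[0, 1:] = psi[0, :-1]                  # coin |0>: shift one site to the right
    shifted[1, :-1] = psi[1, 1:]                  # coin |1>: shift one site to the left
    psi = shifted

prob = (np.abs(psi) ** 2).sum(axis=0)             # probability at each vertex
print(prob.round(3))                              # strongly peaked away from the center
```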

Figure 2.2: Comparison of the discrete-time random walk and quantum walk on a line. The solid and dashed lines show the probability of finding the classical and quantum walkers at a given vertex, respectively, after 20 timesteps. (Axes: probability versus vertex.)

It is clear that the probability distributions for the two walks are markedly different. The classical distribution remains peaked at the origin, and the half-width at half-max is around 5 ≈ √20 lattice spaces. In contrast, the quantum probability distribution is strongly peaked away from the center, and the spread is much greater. Fig. 2.3 compares the rms width of the resulting distributions as a function of the number of steps. While the classical rms value spreads like the square root of the number of steps, so that Δj ∼ √N or alternatively Δx ∼ √t where x and t are position and time respectively, the quantum value spreads linearly in the number of steps, √⟨x²⟩ ∼ t. Thus, in principle the quantum walk provides a quadratic speedup in the ability to access a given vertex over the classical random walk; that is, starting the walker from the central vertex, a quantum walk will give a high probability of hitting the vertex at either end quadratically faster than will a random walk. This polynomial speedup is a bit misleading, though. There is a classical strategy for reaching one of the end vertices starting from the center that also scales linearly with the number of steps. It is simply to move in only one direction! This is a cautionary note when discussing speedups using quantum walks: there can be a classical algorithm that looks nothing like a random walk that can still perform as well.

Figure 2.3: Comparison of the rms spread √⟨x²⟩ for the discrete-time random walk (solid line) and quantum walk (dashed line), as a function of timestep.

2.4 Entropy

Entropy is the fundamental postulate of statistical mechanics (at least, after the ergodic principle!). Boltzmann suggested that the entropy of a system S is somehow related to the probability of being in a set of microstates, so that entropy and equilibrium are closely related concepts. A completely independent formulation by Shannon links the concept of entropy to information capacity of a channel. Yet a third concept of entropy is that of the purity of quantum states. We’ll explore all these ideas in detail now.

2.4.1 Boltzmann

Suppose S = φ(Ω), i.e. the entropy is some function of the number of microstates in a macrostate. How to determine the function φ(Ω)? Consider two independent systems A and B, so that S_A = φ(Ω_A) and S_B = φ(Ω_B). The combined system obviously has entropy S_AB = φ(Ω_AB). Boltzmann assumed that entropy is an additive quantity, i.e. that the entropy of the joint system is simply the sum of the entropies of each: φ(Ω_AB) = φ(Ω_A) + φ(Ω_B).

But we already know that Ω_AB = Ω_A Ω_B. Why? Because probabilities for compound events are always multiplicative: the total macrostate probability P_AB = Ω_AB p_AB = P_A P_B = Ω_A p_A Ω_B p_B = Ω_A Ω_B p_A p_B. But p_A p_B = p_AB, which immediately yields Ω_AB = Ω_A Ω_B. This leaves us with the important behaviour for φ:

φ(Ω_A Ω_B) = φ(Ω_AB) = φ(Ω_A) + φ(Ω_B).

The only function that has these properties is the logarithm, so φ(Ω) ∝ log(Ω), or

S = k_B ln(Ω),

where k_B is Boltzmann's constant (in principle we don't yet know what value this takes). This equation underlies all of statistical mechanics.

At this point you might object to the cavalier way in which I simply substituted a log with a ln. There's nothing about the previous discussion that can tell the difference, though. After all, ln(x) = log_e(x) = log_10(x)/log_10(e) ≈ 2.3 log_10(x). So changing the base of the log just gives a constant factor. If all of statistical mechanics were in base-10 instead of base-e, then this would change the value of Boltzmann's constant, but that's about it. The nice thing about working with lns instead of logs is that the inverse of the former is an exponential, which is nice to work with. Just a bit below you'll encounter situations where base-2 is more natural.

Recall that ñ_1 in the random walk example considered earlier was found by (d/dn_1) ln[P_N(n_1)] = 0, i.e. by asking where is the maximum probability of finding n_1. But ln[P_N(n_1)] ∝ ln[Ω_N(n_1)] ∝ S. So the most likely event is determined by maximizing the entropy. If you believe the theorem that the state of the system is the macrostate with the largest number of microstates, then this yields another form of the Second law of thermodynamics, namely that the equilibrium state of a system is found by maximizing the entropy. Maybe this gives a clue as to why entropy must keep increasing with time. . . .

Let's now return to a general system where any given trial can have m outcomes, Eq. (2.5). Recall that m = 2 for a coin, m = 6 for a die, etc. What is the explicit expression for the entropy?

ln(Ω) = ln[N! / ∏_{i=1}^{m} n_i!]
      = N ln(N) - N - ln(∏_i n_i!)                       using Stirling's formula, since N ≫ 1
      = N ln(N) - N - Σ_i [n_i ln(n_i) - n_i]             if n_i ≫ 1 ∀ i
      = N ln(N) - N - Σ_i [N p_i ln(N p_i) - N p_i]       maximizing Ω_N gives n_i = N p_i
      = N ln(N) - N - N ln(N) Σ_i p_i - N Σ_i p_i ln(p_i) + N Σ_i p_i     but Σ_i p_i = 1
      = -N Σ_i p_i ln(p_i).

So,

S = -N k_B Σ_i p_i ln(p_i).     (2.15)

Let's also calculate the entropy for an ideal gas of N monatomic atoms in a box of volume V, to show how this alone can yield the equation of state. First we need to find Ω: how many ways can we distribute the atoms in the box? A crude method is to subdivide the box into M boxlets of volume ΔV each, so that M ΔV = V. So, there are M ways of distributing each atom within the volume. Assuming that an arbitrary number of atoms can be in each boxlet (this is the definition of an ideal gas), then the total number of ways to distribute N atoms is

Ω = (V/ΔV)^N.

This immediately yields S = k_B ln Ω = N k_B ln(V/ΔV). If we were to change the volume, but keep the boxlet size fixed, then ΔS = S_f - S_i = N k_B ln(V_f/ΔV) - N k_B ln(V_i/ΔV) = N k_B ln(V_f/V_i), which is independent of the arbitrary choice of boxlet size ΔV. This immediately implies the existence of the following thermodynamic relation: P = T(∂S/∂V), because from it we would immediately obtain the equation of state for the ideal gas, P = N k_B T/V. Justification of this equation, though, will have to wait for Sec. 3.1.

2.4.2 Shannon Entropy

This expression for S above was ‘derived’ using pretty hokey assumptions. It would be nice if something so fundamental to statistical mechanics could be obtained in a more satisfying way. In fact, there is an information theoretic way to think about entropy that was developed completely independently by Claude Shannon in the midtwentieth century.

Suppose that we have a radio antenna that broadcasts digital signals encoded in the bitstrings X_1, X_2, ..., X_m, where for example X_1 = 0110010111.... The question Shannon posed is: what are the resources required to represent all of these bitstrings? An alternative way of asking the question is: how much memory is required to store all of the bitstrings? Shannon found that the quantity H(X), now called the Shannon entropy, gives the mean number of bits per string required. Formally, if the probability of the antenna to produce bitstring X_i is p_i, then the Shannon entropy is

H(X) = H(p_1, p_2, ..., p_m) ≡ -Σ_{i=1}^{m} p_i log_2(p_i).     (2.16)

D Shannon Entropy formally quantifies the resources required to store information. Alternatively, it measures the uncertainty about the value of bitstring X before the measurement is made. Note that the information content doesn't depend on what the actual variables are (could be coins, dice, drunks, spins, etc.)

Example (handout in class)

D Shannon's Noiseless Coding Theorem states that the Shannon entropy gives the lower bound on the ability to compress information.

Clearly, the Shannon entropy for a balanced coin (aka a true coin, where the probability of coming up heads is 1/2) is

H_coin = -(1/2) log_2(1/2) - (1/2) log_2(1/2) = log_2(2) = 1,

so that only one bit need be used to represent the behaviour of the coin. In other words, we can encode heads as a 0 and tails as a 1. But the Shannon entropy for a balanced die is

H_die = -(1/6) log_2(1/6) × 6 ≈ 2.585 < 3.

And so using three bits to represent six numbers is suboptimal, because two of the eight possible three-bit words (110 and 111) go unused.
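The same numbers drop out of a direct evaluation of Eq. (2.16); a tiny sketch:

```python
import math

def shannon_entropy(probs):
    # H = -sum_i p_i log2(p_i), with the convention 0 log 0 = 0
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))   # balanced coin: 1.0 bit
print(shannon_entropy([1/6] * 6))    # balanced die: about 2.585 bits
print(shannon_entropy([1, 0]))       # two-headed coin: 0 bits (no uncertainty)
```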

Let's return to general binary systems (unfair coins), where the probability of obtaining one result (heads) is p and the other (tails) is q = 1 - p. The associated binary entropy is concave, that is:

H_bin[p x_1 + (1 - p) x_2] ≥ p H_bin(x_1) + (1 - p) H_bin(x_2),

where 0 ≤ {p, x_1, x_2} ≤ 1. This concavity can be most easily seen graphically by plotting H_bin as a function of p:

Example What can this concavity of entropy tell you? Suppose that Alice has both a U.S. and a Canadian quarter, but both have been rigged so that the probabilities of obtaining heads are p_U and p_C, respectively. Alice's friend Bob knows that she tends to flip the U.S. coin with probability q and the Canadian coin with probability 1 - q. Alice then tells Bob the result of her coin toss (heads or tails).

Q How much information has Bob gained, on average?

A Bob gets the result, and information about which coin was flipped. Suppose p_U = 1/3 and p_C = 5/6. If heads come up, Bob can say that it was more likely to have been a Canadian coin that was flipped,

(Plot: the binary entropy H_bin(p) as a function of p.)

The concavity of binary entropy also tells us how certain we can be about the fairness of the coin. An entropy of zero implies p = 0 or p = 1 (note that 0log(0) = 0), corresponding to a completely unfair coin that is either twoheaded or twotailed. An entropy of one corresponds to a perfectly 1 balanced coin with p = 2 . So the entropy gives a measure of how uncertain the result of a given coin toss will be: when Hbin = 1, we are most uncertain about the result. The same is true of H and S in general: they provide a measure of the uncertainty in the result of testing the system (i.e. the events resulting from a series of trials on compound objects). This is why you might have heard that entropy is a measure of uncertainty.

2.4.3 von Neumann Entropy

So far, we have seen that entropy gives a measure of the number of ways of classifying compound events (like enumerating results of coin tosses, or distributing atoms in space), and a measure of the resources required to represent a certain quantity of information (or the information gained on making a series of measurements). In fact, entropy also gives a measure of the entanglement of a quantum system and of how mixed it is. Suppose Alice has a quantum mechanical coin, where the state |0⟩ represents heads and |1⟩ represents tails. Each of these states is called a basis vector. Unlike classical coins, quantum coins can be in a state that is simultaneously heads and tails, with different weights in each. If Alice's coin is in a pure state, then it can be written as a single wavefunction |ψ_A⟩ = (a|0⟩ + b|1⟩), where it is assumed that the wavefunction is normalized, |a|² + |b|² = 1.

Before discussing quantum mechanical entropy, it's important to introduce the concept of the density matrix ρ. If Alice's coin is in the pure state above, then her density matrix is

ρ_A = |ψ_A⟩⟨ψ_A| = (a|0⟩ + b|1⟩)(a*⟨0| + b*⟨1|)
    = |a|² |0⟩⟨0| + ab* |0⟩⟨1| + ba* |1⟩⟨0| + |b|² |1⟩⟨1| = ( |a|²  ab* ; ba*  |b|² ).

In general, though, her density matrix might not be derivable from a single pure-state wavefunction at all, so that it would be a complete mixture of heads and tails:

ρ = α|0⟩⟨0| + β|0⟩⟨1| + γ|1⟩⟨0| + δ|1⟩⟨1| = ( α  β ; γ  δ ).

Pure and mixed states can be distinguished as follows. First, diagonalize the density matrix; i.e. find its eigenvalues and write them on the diagonal. Then sum up all the diagonal elements. This is called taking the trace, and is denoted Tr(ρ). Both pure and mixed states have Tr(ρ) = 1, but only pure states also satisfy Tr(ρ²) = 1; mixed states have Tr(ρ²) < 1. Let's work out some examples. If |ψ_A⟩ = |0⟩, then

ρ = |0⟩⟨0| = ( 1 ; 0 )(1  0) = ( 1  0 ; 0  0 );   ρ² = ( 1  0 ; 0  0 );   Tr(ρ) = Tr(ρ²) = 1.

If |ψ_A⟩ = (|0⟩ + |1⟩)/√2, then

ρ = (1/2)(|0⟩ + |1⟩)(⟨0| + ⟨1|) = (1/2)( 1 ; 1 )(1  1) = (1/2)( 1  1 ; 1  1 ) = ρ².

Diagonalizing this density matrix gives eigenvalues 0 and 1, which again gives Tr(ρ) = Tr(ρ²) = 1. But the mixed state

ρ = (1/2)(|0⟩⟨0| + |1⟩⟨1|) = (1/2)( 1  0 ; 0  1 );   ρ² = (1/4)( 1  0 ; 0  1 )

has Tr(ρ) = 1 while Tr(ρ²) = 1/2.
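A short numpy sketch of the purity test just described, for the pure state (|0⟩ + |1⟩)/√2 and the half-and-half mixed state:

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)

rho_pure = np.outer(plus, plus.conj())                                # |+><+|
rho_mixed = 0.5 * (np.outer(ket0, ket0.conj()) + np.outer(ket1, ket1.conj()))

for rho in (rho_pure, rho_mixed):
    print(np.trace(rho).real, np.trace(rho @ rho).real)   # Tr(rho), Tr(rho^2)
# both have Tr(rho) = 1, but Tr(rho^2) is 1.0 for the pure state and 0.5 for the mixed one
```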

D: The von Neumann Entropy of a binary quantum state described by density matrix ρ is

S(ρ) = −Tr(ρ log_2 ρ).

For all pure states, S(ρ) = 0, and for mixed states of this binary system 0 < S(ρ) ≤ 1.
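To make the trace test and this definition concrete, here is a small numerical sketch (again plain Python with numpy, my own illustration rather than part of the original notes) applied to the pure and mixed 2×2 examples worked out above.

```python
import numpy as np

def vn_entropy(rho):
    """von Neumann entropy S = -Tr(rho log2 rho), computed from the eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]                           # convention: 0*log2(0) = 0
    return float(-np.sum(evals * np.log2(evals))) + 0.0    # +0.0 avoids printing -0.0

rho_pure  = np.array([[1.0, 0.0], [0.0, 0.0]])        # |0><0|
rho_equal = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])  # (|0>+|1>)(<0|+<1|)/2, still pure
rho_mixed = 0.5 * np.eye(2)                           # (|0><0| + |1><1|)/2

for name, rho in [("pure |0>", rho_pure), ("pure (|0>+|1>)/sqrt(2)", rho_equal),
                  ("maximally mixed", rho_mixed)]:
    print(f"{name:24s} Tr(rho) = {np.trace(rho):.2f}   "
          f"Tr(rho^2) = {np.trace(rho @ rho):.2f}   S = {vn_entropy(rho):.2f}")
# Both pure states give Tr(rho^2) = 1 and S = 0; the mixed state gives 1/2 and 1.
```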

Now suppose Bob has a quantum coin described by the wavefunction |ψ_B⟩ = c|0⟩ + d|1⟩, where |c|² + |d|² = 1. If Alice has her pure-state wavefunction, then the total wavefunction of both parties is said to be separable, |ψ⟩ = |ψ_A⟩|ψ_B⟩, because it can be written as a product of two wavefunctions that you can 'take apart' without damaging either one. The combined state of the system is

|ψ⟩ = |ψ_A⟩|ψ_B⟩ = (a|0⟩ + b|1⟩)(c|0⟩ + d|1⟩) = ac|00⟩ + ad|01⟩ + bc|10⟩ + bd|11⟩.

Clearly, if the combined state were instead

|ψ⟩ = α|00⟩ + β|01⟩ + γ|11⟩,   |α|² + |β|² + |γ|² = 1,

there would be no way to separate the components into two independent wavefunctions belonging to Alice and Bob. This is an entangled state. A maximally entangled pure state that Alice and Bob can share is

|ψ⟩ = (|00⟩ + |11⟩)/√2.   (2.17)

If Alice measures her coin and finds that it is in state |0⟩, then Bob is guaranteed to also measure |0⟩. Note that this result doesn't depend on who does the measuring first. Also, they would need to communicate classically to compare notes to detect the correlated results. So, there is no violation of causality here. This is the famous EPR paradox of early quantum mechanics.

The combined state shared by Alice and Bob can also be written in the form of a density matrix, but now it will be a 4×4 matrix instead of a 2×2. Now the question is: what information does Alice have about the state of her coin after Bob measures the state of his coin? Alice's density matrix is obtained by tracing over Bob's part of the two-coin density matrix, so that it is defined as ρ_A ≡ Tr_B(ρ). The von Neumann entropy of this two-coin system is defined as

S(ρ) = −Tr(ρ_A log_2 ρ_A) = −Tr(ρ_B log_2 ρ_B),

so that the entropy measure doesn't depend on whether Alice or Bob measures their coin state first. It's easy to check that separable pure states shared by Alice and Bob give zero von Neumann entropy. Suppose |ψ_A⟩ = |0⟩ while |ψ_B⟩ = |1⟩ for simplicity. Then |ψ⟩ = |01⟩ and ρ = |01⟩⟨01|. Tracing over Bob's state is accomplished by turning his information 'inside out,' that is by turning his outer products into inner products,

Tr_B(ρ) = (|0⟩⟨0|)(⟨1|1⟩) = |0⟩⟨0| = [[1, 0], [0, 0]],

which has zero von Neumann entropy because log(1) = 0 and 0 log(0) = 0. But the maximally entangled state (2.17) gives an entropy of one:

ρ = ½(|00⟩⟨00| + |00⟩⟨11| + |11⟩⟨00| + |11⟩⟨11|);

ρ_A = Tr_B(ρ) = ½[(|0⟩⟨0|)(⟨0|0⟩) + (|0⟩⟨1|)(⟨0|1⟩) + (|1⟩⟨0|)(⟨1|0⟩) + (|1⟩⟨1|)(⟨1|1⟩)]
    = ½[|0⟩⟨0| + |1⟩⟨1|] = ½ [[1, 0], [0, 1]]   because ⟨0|1⟩ = ⟨1|0⟩ = 0.

But this has the same form as the mixed-state example above, so that S(ρ) = 1. So, the maximally entangled pure state also has an entropy of one! The von Neumann entropy also provides a measure of the entanglement of a pure bipartite quantum state.
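The partial trace is just as easy to do numerically. This sketch (Python with numpy, my own illustration) builds the separable state |01⟩ and the maximally entangled state of Eq. (2.17) as four-component vectors, traces out Bob, and evaluates the entropy of Alice's reduced density matrix, reproducing the values of zero and one found above.

```python
import numpy as np

def entanglement_entropy(psi):
    """Entanglement entropy of a two-qubit pure state psi (components ordered
    |00>, |01>, |10>, |11>): trace out Bob, then S(rho_A) = -Tr(rho_A log2 rho_A)."""
    psi = np.asarray(psi, dtype=complex).reshape(2, 2)    # index order (Alice, Bob)
    rho_A = psi @ psi.conj().T                            # Tr_B |psi><psi|
    evals = np.linalg.eigvalsh(rho_A).real
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals))) + 0.0   # +0.0 avoids printing -0.0

sep = np.array([0, 1, 0, 0])                 # |01>: Alice has |0>, Bob has |1>
epr = np.array([1, 0, 0, 1]) / np.sqrt(2)    # (|00> + |11>)/sqrt(2), Eq. (2.17)

print("separable |01>      :", entanglement_entropy(sep))   # 0.0
print("maximally entangled :", entanglement_entropy(epr))   # 1.0
```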

So, why is the entropy of the universe steadily increasing??

Chapter 3

Equilibrium

3.1 Temperature

Consider two isolated systems A and B, characterized by U_A and U_B, respectively. When these are brought together, dS ≥ 0. This is because Ω is maximized at equilibrium, so dΩ ≥ 0 or d ln(Ω) ≥ 0, and we have already justified that S = k_B ln(Ω). This means that

dS = (∂S_A/∂U_A) ΔU_A + (∂S_B/∂U_B) ΔU_B ≥ 0.

Because this is a closed system, ΔU_A = U_A^(f) − U_A^(i) ≡ Q and ΔU_B = U_B^(f) − U_B^(i) ≡ −Q. Using this we obtain

(∂S_A/∂U_A − ∂S_B/∂U_B) Q ≥ 0.

What does this mean? Suppose Q > 0, so that energy moves into system A. We know then that

∂S_A/∂U_A ≥ ∂S_B/∂U_B.

But we also know from 'everyday experience' that if energy (heat) flows into system A, then system B must have initially been hotter than system A, i.e. T_A^(i) ≤ T_B^(i). It therefore seems reasonable to make the association ∂S_A/∂U_A ∝ 1/T_A^(i). In fact we can take the analogy further and define temperature this way:

1/T ≡ ∂S/∂U = ∂/∂U [k_B ln(Ω)].   (3.1)

Note that in taking the partial derivative one must ensure that the number of particles remains fixed, as does any variable that would be associated with work (like volume, magnetic field, etc.). Also, I haven't proven that this definition is consistent with other things that we now know. We'll see below that if I know the explicit form for the entropy in terms of the internal energy U, it yields an expression for the temperature consistent with expectations. Before proceeding further, it is useful to introduce the new parameter β ≡ 1/(k_B T), which has units of inverse energy and is widely used in statistical mechanics. Then Eq. (3.1) reads

β ≡ ∂ ln(Ω)/∂U.   (3.2)


Unfortunately, an expression for S that is explicitly dependent on U is not always easy to come by! For this reason, it is often simpler to obtain expressions that relate S to other thermodynamic variables, such as volume, magnetic field, etc. So far, we know that temperature, or rather β, is related to the change in the entropy (or alternatively the change in the number of states Ω in the macrostate) as the energy of the system is changed. But suppose that Ω and U also depend on some other variable x, so that Ω = Ω(U, x) and U = U(x). Then

∂ ln(Ω)/∂x = [∂ ln(Ω)/∂U](∂U/∂x) ≡ [∂ ln(Ω)/∂U] X = βX,

where I have defined a generalized force X as

X ≡ ∂U/∂x = (1/β) ∂ ln(Ω)/∂x = T ∂S/∂x.   (3.3)

A simple example of a generalized force would be pressure. Since the pressure changes as the volume changes, it seems reasonable that the generalized force associated with the variable x = V will be the pressure X = P, i.e.

P = (1/β) ∂ ln(Ω)/∂V = T ∂S/∂V.   (3.4)

Let's use this idea to obtain the equation of state of an ideal gas, as was done in Section 2.1. We had

S = k_B ln(Ω) = N k_B ln(V/V_0),

where V_0 is an arbitrary reference volume. Using the definition of the pressure (3.4), we have

P = T ∂S/∂V = N k_B T ∂/∂V [ln(V/V_0)] = N k_B T/V,

as expected. Note that this result also does not depend on the arbitrary choice of V_0. Also, this is the first hint that our definition of the temperature (3.1) was probably o.k., since we correctly reproduced the ideal gas law.

You might be interested to learn that the correspondence between a generalized variable x and its associated generalized force X has close parallels with the idea of 'canonically conjugate variables' in classical and quantum mechanics. For example, position and momentum, or time and energy, are conjugate variables much like volume and pressure. Together, they constitute what is known as 'phase space.' In statistical mechanics, Eq. (3.3) indicates that inverse temperature and energy are the conjugate variables. In fact, there is a close relationship between inverse temperature and imaginary time that might be surprising: real gases with density ρ(r,t) diffuse through space in time through the diffusion equation, which looks something like this:

∂ρ(r,t)/∂t = D ∇²ρ(r,t),

where D is the diffusion coefficient, which measures how quickly particles diffuse. As you might expect, D increases with temperature. Meanwhile, the Schrödinger equation for the motion of quantum mechanical particles is

iℏ ∂ψ(r,t)/∂t = −ℏ ∂ψ(r,t)/∂(it) = −(ℏ²/2m) ∇²ψ(r,t),

where now ρ(r,t) = |ψ(r,t)|². This looks much like the diffusion equation in imaginary time t̃ if it → t̃. A more rigorous correspondence between t̃ and β⁻¹ requires quantum field theory, unfortunately. I will come back to this point in Chapter 6, but in the meantime I hope that I have given you a further hint at the relationship between (the arrow of) time, temperature, and entropy....

3.2 Entropy, Heat, and Work

3.2.1 Thermodynamic Approach

Suppose now that the system A comes into contact with a heat reservoir B, where there are many more microstates in B than in A. Then the change in β of system B if it absorbs heat Q from A is going to be tiny,

|∂β_B/∂U_B| Q ≪ β_B.

Assuming ∂β_B/∂U_B ∼ β_B/U_B, the inequality becomes β_B Q/U_B ≪ β_B, or Q/U_B ≪ 1. So now we can evaluate the change in the entropy due to the addition of the heat Q by expanding in a Taylor series around small Q:

ΔS = k_B ln[Ω_B(U_B + Q)] − k_B ln[Ω_B(U_B)] ≈ Q k_B ∂ ln(Ω_B)/∂U_B + k_B (Q²/2) ∂² ln(Ω_B)/∂U_B² + ...
   = k_B Q β_B + k_B (Q²/2) ∂β_B/∂U_B + ...
   ≈ k_B Q β_B + k_B (Q²/2)(β_B/U_B) = Q k_B β_B [1 + Q/(2U_B)] ≈ Q/T.

So, Q = T ΔS, or

dS = Q/T   (3.5)

for infinitesimal variations. This expression was in fact the thermodynamic definition for entropy introduced by Rudolf Clausius (sounds very Christmas!) in 1865, as the thing that changes by Q/T when heat is added to a system at temperature T.
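To see how good the reservoir approximation actually is, here is a quick numerical sketch (plain Python; the particle number, temperature, and heat are arbitrary illustrative values of my own choosing) for a reservoir whose multiplicity scales as Ω_B ∝ U_B^(3N/2), the ideal-gas-like form that will appear in Eq. (3.10) below.

```python
import numpy as np

# Toy reservoir with Omega_B(U) proportional to U**(3*N/2); all values are illustrative.
kB = 1.380649e-23          # J/K
N  = 1e24                  # reservoir particle number
T  = 300.0                 # K
U  = 1.5 * N * kB * T      # equipartition energy of the reservoir (~6200 J)
Q  = 1.0                   # J of heat absorbed by the reservoir

# Exact change: Delta S = kB*[ln Omega(U+Q) - ln Omega(U)] = (3N/2) kB ln(1 + Q/U)
dS_exact  = 1.5 * N * kB * np.log1p(Q / U)
dS_approx = Q / T          # leading term of the Taylor expansion, Eq. (3.5)

print(f"Q/U       = {Q/U:.2e}")              # ~1.6e-4, so the expansion is justified
print(f"exact dS  = {dS_exact:.8e} J/K")
print(f"Q/T       = {dS_approx:.8e} J/K")    # agrees with the exact result to ~1 part in 1e4
```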

Some textbooks have a much simpler derivation. Since dS = dU/T then if you add heat Q but don’t do any work dU = Q so that dS = Q/T . But this is really unsatisfactory, because in fact dS = Q/T even if the volume is changing, as long as the reservoir remains a reservoir during the process.

This is the first time I have mentioned the idea of a reservoir, and you might be curious about it. Basically, a reservoir is some extremely large system that can absorb or release energy, particles, etc. without affecting its own macroscopic properties at all. In the present context it is used to fix the temperature/internal energy of the smaller system A. This is the spirit of the Microcanonical PHYS 449 Course Notes 2009 35

Ensemble: the system of interest (A) has a fixed energy and number of particles, set to the value of some well-defined reservoir.

The Third Law of Thermodynamics states that as T → 0⁺, S → S_0, i.e. that the entropy approaches a limiting value as the temperature goes to absolute zero.

3.2.2 Statistical Approach

Suppose that some system exhibits periodic motion, i.e. that a particle moves so that it comes back to where it started. In classical mechanics, there is something called the Action J, which is a constant of the periodic motion. It is defined as:

J = ∮ p dx,   (3.6)

where p is the momentum and dx is the element of length. In quantum mechanics, J isn't just conserved, but it is exactly equal to nh, where h is Planck's constant and n is an arbitrary positive integer. This principle is called Bohr-Sommerfeld quantization, and can be used to obtain the energy spectrum for a range of interesting quantum systems.

Example: Particle in a Box. The particle is confined in a one-dimensional box of length L, and is travelling from one side to the other with momentum p = mv. When it hits the wall, it bounces back, and so on. The energy is obviously E = ½mv² = p²/2m, so p = √(2mE). The action is therefore

J = ∫_0^{2L} √(2mE) dx = √(2mE) · 2L ≡ nh

⇒ E_n(1D) = n²h²/(2m(4L²)) = n²π²ℏ²/(2mL²).   (3.7)

Quantum mechanics tells us that the accessible states of a system are quantized: the energy levels above for a particle in a box are what heads and tails were for a coin. If we set E_n = p_n²/2m then from Eq. (3.7) one also obtains the quantization of momentum, p_n = (πℏ/L)n. In a three-dimensional box of length L, the energy levels are

E_n(3D) = π²ℏ²(n_x² + n_y² + n_z²)/(2mL²).   (3.8)

It is useful to estimate how many energy levels are occupied for air in a one-metre-cubed box at room temperature. Air is mostly made of nitrogen, m_N = 14.0067 amu × 1.6605×10⁻²⁷ kg/amu = 2.33×10⁻²⁶ kg. With ℏ = 1.05457×10⁻³⁴ J·s, I obtain E_n = 2.36×10⁻⁴² n² J. The characteristic temperature is simply this divided by Boltzmann's constant k_B = 1.3807×10⁻²³ J/K to give T_n = E_n/k_B = 1.7×10⁻¹⁹ n² K. Since room temperature is around 300 K, something like 2×10²¹ states are occupied! We'll do a better job of this a bit later. The above quantum interlude suggests that the accessible states of a physical system are quantized energy levels, which I'll now denote ε_i. So, the mean energy U is given by

U ≡ Σ_i n_i ε_i   (3.9)

and the mean energy per particle is

U/N = (1/N) Σ_i n_i ε_i = Σ_i (n_i/N) ε_i = Σ_i p_i ε_i.

These simple expressions lead us to our first explicit connection between statistical mechanics and thermodynamics:

dU = Σ_i (ε_i dn_i + n_i dε_i) = Σ_i ε_i dn_i + Σ_i n_i dε_i = Q (heat absorbed by system) + W (work done on system).
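It is also worth double-checking the level-counting estimate made two paragraphs back for nitrogen in a one-metre box. The short sketch below (plain Python, simply my own re-computation of the quoted numbers) reproduces the coefficient in E_n, the characteristic temperature, and the 'something like 2×10²¹' figure.

```python
import numpy as np

# Particle-in-a-box estimate for nitrogen in a 1 m^3 box, in SI units.
hbar = 1.05457e-34          # J s
kB   = 1.3807e-23           # J/K
amu  = 1.6605e-27           # kg
m_N  = 14.0067 * amu        # kg, one nitrogen atom as in the text
L    = 1.0                  # m

E1 = np.pi**2 * hbar**2 / (2 * m_N * L**2)   # E_n = E1 * n^2 along one axis, cf. Eq. (3.8)
T1 = E1 / kB                                 # characteristic temperature, T_n = T1 * n^2
n2_room = 300.0 / T1                         # value of n^2 at which T_n reaches 300 K

print(f"E1        = {E1:.2e} J")     # ~2.4e-42 J, as quoted
print(f"T1        = {T1:.2e} K")     # ~1.7e-19 K
print(f"n^2(300K) = {n2_room:.1e}")  # ~1.8e21, i.e. 'something like 2e21'
```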

Unfortunately, knowing the single-particle energy levels in a box (3.8) is not enough to directly determine the equation of state. For this one needs to know how to construct the entropy and therefore the population of the macrostate Ω. A rough estimate can be obtained as follows. First, we have p_n = (πℏ/L)n = √(2mE_n), so that n = (L/πℏ)√(2mE_n). Suppose that each particle accesses many of these n levels, and that there are N particles overall. Then it seems reasonable to assume that the number of states in the macrostate is proportional to the number of occupied energy levels n, which corresponds to the 'volume in n-space' with n the radius of the hypersphere:

Ω ∝ (4π/3) n^{3N} = (4π/3)(L/πℏ)^{3N}(2mE_n)^{3N/2} = (4π/3) [V^N/(πℏ)^{3N}] (2mE_n)^{3N/2}.   (3.10)

If we now make the identification E_n → U then Ω ∝ V^N U^{3N/2}. The temperature is now immediately found:

1/(k_B T) = ∂ ln(Ω)/∂U = (3N/2) ∂ ln(U)/∂U = 3N/(2U).   (3.11)

Notice that the proportionality factors are unimportant for the present purpose, since only the overall U-dependence of the entropy is needed. Rearranging immediately gives U = (3/2)N k_B T, which is just the result of the equipartition theorem. Likewise, the pressure can be found from

P = k_B T ∂ ln(Ω)/∂V = N k_B T ∂ ln(V)/∂V = N k_B T/V,   (3.12)

which is simply the equation of state for the ideal gas again.
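The algebra leading from Ω ∝ V^N U^{3N/2} to Eqs. (3.11) and (3.12) is short enough to verify symbolically. Here is a minimal sketch using sympy (my own choice of tool); the dropped proportionality constants are irrelevant, as noted above.

```python
import sympy as sp

U, V, N, kB, T = sp.symbols('U V N k_B T', positive=True)

# ln(Omega) for Omega proportional to V**N * U**(3N/2), constants dropped.
lnOmega = N*sp.log(V) + sp.Rational(3, 2)*N*sp.log(U)

beta = sp.diff(lnOmega, U)                            # 1/(kB T) = d ln(Omega)/dU, Eq. (3.11)
print("U =", sp.solve(sp.Eq(beta, 1/(kB*T)), U)[0])   # 3*N*T*k_B/2, i.e. equipartition

P = kB*T*sp.diff(lnOmega, V)                          # P = kB T d ln(Omega)/dV, Eq. (3.12)
print("P =", sp.simplify(P))                          # N*T*k_B/V, the ideal gas law
```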

3.3 Paramagnetism

Suppose that we have N atoms arranged in a crystal. All but the outermost of the electrons in a given atom are inert, hybridized with those of neighbouring atoms. The outermost electrons have total spin J. This means that if the atom has only one outermost electron the total spin would be J = 1/2; if there were two then J = 1; three electrons would give J = 3/2, etc. In quantum mechanics the spin projection m_J can take the values m_J = −J, −J+1, ..., J−1, J. So one electron can give m = −1/2 and +1/2 at each site, while two electrons give m = −1, 0, and 1, etc. A lot of quantum mechanical work shows that in a magnetic field B, the energy of these outermost electrons is

ε_J = g_J μ_B m_J B,

where g_J is the J-dependent Landé g-factor and μ_B = eℏ/2m_e = 9.2741×10⁻²⁴ A·m² is the Bohr magneton. But you don't really have to know all this: it is sufficient to assume that ε_J = c m_J, where c is some arbitrary constant.

Consider the concrete example of J = 1/2, so that there are two energies, ε_1 = −ε and ε_2 = ε. If there are n_1 spins with energy −ε and n_2 = N − n_1 spins with energy +ε, then according to Eq. (2.15) we have

S = −N k_B [ (n_1/N) ln(n_1/N) + (n_2/N) ln(n_2/N) ].

But we also can make use of the definition of the mean energy, Eq. (3.9):

U = n_1(−ε) + (N − n_1)ε = ε(N − 2n_1) = Nε(1 − 2n_1/N)

⇒ U/(Nε) = 1 − 2n_1/N

⇒ n_1/N = (1/2)(1 − U/(Nε))   and obviously   n_2/N = 1 − n_1/N = (1/2)(1 + U/(Nε)).

If we define x ≡ U/(Nε), then the expression for the entropy becomes

S = −N k_B [ ((1 + x)/2) ln((1 + x)/2) + ((1 − x)/2) ln((1 − x)/2) ].

Now what? We have above that dQ = T dS, or 1/T = ∂S/∂U. Using ∂S/∂U = (∂S/∂x)(∂x/∂U) = (1/Nε)(∂S/∂x), we obtain

1/T = (1/(Nε)) (−N k_B) ∂/∂x [ ((1 + x)/2) ln((1 + x)/2) + ((1 − x)/2) ln((1 − x)/2) ]
    = −(k_B/ε) [ (1/2) ln((1 + x)/2) − (1/2) ln((1 − x)/2) + ((1 + x)/2)(1/(1 + x)) − ((1 − x)/2)(1/(1 − x)) ]
    = −(k_B/(2ε)) ln[(1 + x)/(1 − x)].

We're making progress! Inverting the last line above, we have

(1 + x)/(1 − x) = exp(−2ε/(k_B T))

⇒ 1 + x = (1 − x) exp(−2ε/(k_B T))

⇒ (1 + x) exp(βε) = (1 − x) exp(−βε)   since β = 1/(k_B T).

Rearranging, one obtains

[exp(βε) − exp(−βε)]/2 = −x [exp(βε) + exp(−βε)]/2   ⇒   x = −tanh(βε).

Finally, we are done, because this immediately gives

U = −Nε tanh(ε/(k_B T)).

Recall that U = n_1(−ε) + n_2 ε = ε(n_2 − n_1). This means that the net spin of the system is

n_1 − n_2 = −U/ε = N tanh(ε/(k_B T)).

What are the limiting values of the net spin in the limit of high and low magnetic field, or high and low temperature? Why?

This isn’t quite the equation of state for the Pauli paramagnet yet, though. The magnetization M is defined as the net spin times the Bohr magneton, so that

M = μ_B(n_1 − n_2) = N μ_B tanh(ε/(k_B T)) = N μ_B tanh(g_{1/2} μ_B B/(2 k_B T)).   (3.13)

This is the equation of state for a magnetic system, relating the magnetization to the magnetic field and temperature, rather than relating the pressure to the volume and temperature. Where the energy levels for a particle in a box depended on volume, giving pressure as the generalized force, now the energies depend on magnetic field, and the generalized force is the magnetization.
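To get a feel for the limiting behaviour asked about above, here is a short numerical sketch (Python with numpy; the values of ε and N are arbitrary illustrative choices of mine) of the net spin N tanh(ε/(k_B T)) as the temperature is varied.

```python
import numpy as np

kB  = 1.380649e-23     # J/K
eps = 1.0e-23          # J, half the level splitting (illustrative value)
N   = 1.0e22           # number of spins (illustrative value)

for T in [0.01, 0.1, 1.0, 10.0, 100.0]:                  # K
    frac = np.tanh(eps / (kB * T))                        # (n1 - n2)/N
    print(f"T = {T:7.2f} K   (n1 - n2)/N = {frac:.4f}")
# Low T (kB*T << eps): tanh -> 1, so the spins fully align and n1 - n2 -> N.
# High T (kB*T >> eps): tanh(x) ~ x, so n1 - n2 ~ N*eps/(kB*T) -> 0 (Curie-like behaviour).
```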

3.4 Mechanical Equilibrium and Pressure

Recall that we made a connection between statistical mechanics and thermodynamics by considering the mean energy of a system of particles, U = Σ_i n_i ε_i. Also, for a particle in a box, remember that the energy levels ε_i were functions of the box volume. So

dU = Σ_i (ε_i dn_i + n_i dε_i) = Σ_i [ ε_i (∂n_i/∂S) dS + n_i (∂ε_i/∂V) dV ] = Q + W
   = [∂/∂S (Σ_i n_i ε_i)] dS + [∂/∂V (Σ_i n_i ε_i)] dV = T dS − P dV.

In obtaining the first half of the final result, I used the fact that 1/T = ∂S/∂U or T = ∂U/∂S. I recognized that the second term corresponds to the work W = −P dV according to Eq. (1.16). So this implies that

P = −∂U/∂V.   (3.14)

This is another handy expression for the pressure, in circumstances where Eq. (3.4) is not convenient. Note, however, that in the calculation of the total energy for a 3D gas of particles performed in Sec. 3.2.2, the equipartition total energy U = (3/2)N k_B T doesn't explicitly depend on volume, only on the temperature! So this equation won't be useful in this context.

3.5 Diffusive Equilibrium and Chemical Potential

Now suppose that the energy of some small system (I’ll call it a subsystem) also depends on the (fluctuating) number of particles N in it. Then

dU = T dS − P dV + (∂U/∂N) dN ≡ T dS − P dV + μ dN.

This can also be inverted to give

dS = (dU + P dV − μ dN)/T.

So the chemical potential for the system is defined as

μ ≡ ∂U/∂N = −T ∂S/∂N,   N = Σ_{i ∈ subsystem} n_i.

What does the chemical potential mean? Consider the change in the entropy of the entire system (subsystem plus reservoir) with the number of particles in the subsystem N_s, in terms of the change of the entropies of the reservoir S_R and subsystem S_s:

dS_system = (∂S_R/∂N_s) dN_s + (∂S_s/∂N_s) dN_s = dN_s [ (∂S_R/∂N_R)(∂N_R/∂N_s) + ∂S_s/∂N_s ].

But N_R = N − N_s, so ∂N_R/∂N_s = −1, giving

dS_system = dN_s [ ∂S_s/∂N_s − ∂S_R/∂N_R ] = −(dN_s/T)(μ_s − μ_R).

Now, equilibrium corresponds to maximizing entropy, or dS = 0. This means that the condition for equilibrium between the subsystem and the reservoir is that the chemical potentials for each should be equal, μ_s = μ_R. But even more important, as equilibrium is being approached, the entropy is changing with time like

dS_system/dt = −(dN_s/dt)(μ_s − μ_R)/T ≥ 0

because the entropy must increase toward equilibrium (unless they are already at equilibrium). If initially μ_R > μ_s, then clearly dN_s/dt ≥ 0 to satisfy the above inequality. This means that in order to reach equilibrium when the chemical potential for the reservoir is initially larger than that of the subsystem, particles must flow from the reservoir into the subsystem. So the chemical potential provides some measure of the number imbalance between two systems that are not at equilibrium. What else does it tell you?

To clarify the various meanings of the chemical potential, let's return first to the Pauli paramagnet. Recall that in the microcanonical ensemble, the entropy for the spin-1/2 case was

S = −N k_B [ (n_1/N) ln(n_1/N) + (1 − n_1/N) ln(1 − n_1/N) ].

Assuming that we have some subsystem with number N in contact with a reservoir at temperature T , the chemical potential is

μ = −T ∂S/∂N = k_B T ln((N − n_1)/N) = k_B T ln(1 − n_1/N) = k_B T ln(n_2/N)

after a bit of algebra. What does this mean? First, you can see that the chemical potential has units of energy. In this case, μ = 0 when the number of spin-up atoms is zero, n_1 = 0. The chemical potential is less than zero for any other value of n_1, and |μ| ≫ 0 for n_1 → N. What does μ = 0 mean? Suppose that the total number of particles is not zero. Then a zero chemical potential means that the change in the number of particles in the reservoir is not related to the change in the number of particles in the subsystem; alternatively, the entropy is invariant under changes in the number of particles. This implies that a zero chemical potential means that the system doesn't conserve the number of particles. For the Pauli paramagnet, I can keep increasing the number of atoms with spin down, and as long as I don't create a single spin up, then the system's entropy doesn't change: it remains exactly zero.
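The 'bit of algebra' can be checked symbolically; here is a minimal sympy sketch (my own verification), differentiating the entropy with respect to N while holding the number of spin-up atoms n_1 fixed.

```python
import sympy as sp

N, n1, kB, T = sp.symbols('N n_1 k_B T', positive=True)

# Spin-1/2 paramagnet entropy from the text, with n2 = N - n1.
S  = -N*kB*((n1/N)*sp.log(n1/N) + (1 - n1/N)*sp.log(1 - n1/N))
mu = sp.simplify(-T*sp.diff(S, N))

print(mu)   # equivalent to k_B*T*log(1 - n_1/N), i.e. kB*T*ln(n2/N) as claimed
```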

We'll return to the chemical potential again in Chapter 7.

Chapter 4

Engines and Refrigerators

We’re surrounded by machines, devices that consume energy. How is this energy produced? How do engines actually work? How can we reduce our environmental impact without sacrificing our technological progress? These are the kinds of questions that will be addressed in this chapter.

4.1 Heat Engines

Engines of various kinds are central to our lives in Canada. It turns out that one can understand many of the main characteristics of engines without really knowing anything about how they actually work. One only really needs to invoke the first and second laws of thermodynamics.

Suppose that we have some engine as shown in Fig. 4.1. Some unspecified device is sandwiched

Figure 4.1: Theoretical heat engine.

between two reservoirs, one at a temperature T_h, called the heat 'source,' and the other at a lower temperature T_c ≤ T_h, called the heat 'sink.' The device (called the engine) pulls heat out of the hotter reservoir and dumps it into the colder reservoir. We would like to know how much power ('work' in thermodynamics language) the engine could provide, simply using considerations of thermodynamics. We already know from the conservation of energy (aka the First Law) that in

principle, we can't get more energy out than we put in. So it seems reasonable to define the efficiency as e = |W|/|Q_h|, with e = 1 when the work is equal to the heat pulled in, and e = 0 if no work is done at all. I'm using absolute values everywhere because all I care about are the magnitudes of the energies, not about which direction they are flowing. Using conservation of energy we have |Q_h| = |Q_c| + |W|, so the efficiency is

e = (|Q_h| − |Q_c|)/|Q_h| = 1 − |Q_c|/|Q_h|.   (4.1)

Unfortunately, this isn't the whole story. The change in the entropy during the process has to be positive semidefinite (a fancy way of saying dS ≥ 0). The entropy 'from' the hot reservoir is |Q_h|/T_h, and that 'to' the cold reservoir is |Q_c|/T_c. This isn't very precise language. But mathematically we must have

dS = |Q_c|/T_c − |Q_h|/T_h ≥ 0   ⇒   |Q_c|/|Q_h| ≥ T_c/T_h.   (4.2)

So the efficiency is

e ≤ 1 − T_c/T_h.   (4.3)

This is a bit depressing. It says that the only way that one can achieve a perfect engine, even in principle, is if the hot reservoir had infinite temperature and the cold reservoir was at absolute zero! For example, the maximum efficiency of a steam engine, based only on the temperature difference between boiling and freezing water, would be a mere 27%. So at least 73% of the energy consumed would be irretrievably lost. In fact, the steam engine doesn't use only water but also steam, so the maximum theoretical efficiency of a steam engine is a bit better than this at 48%; this will be explained at greater length in Section 4.3.

In fact, even from a purely theoretical point of view, the maximum efficiency is significantly worse than Eq. (4.3) would predict. First, it is never possible to have real reservoirs that exactly maintain their temperature near the location where the heat is extracted or dumped; the reservoirs will take some finite amount of time to equilibrate to the same temperature everywhere, even if they are assumed to be so huge that the final temperature at the end will be the same as it was before. Second, the engine temperature at the moment the heat is extracted from the hotter reservoir might not be exactly T_h, which will lead to an increase in entropy; likewise when it dumps energy into the colder reservoir. And of course entropy could easily be increased by the various (as yet unspecified) processes inside the engine, and heat could be produced by friction which would either heat the engine, or be released to another (uncontrolled) reservoir, changing the energy conservation formula. Given this set of issues, it's a wonder that engines exist at all!

You might be interested to know that entropy was discovered experimentally by experimentalists in the early to mid-19th century who were studying gases in the context of engine design. At the time, steam engines were in wide use but their scientific properties were basically unknown. Nicolas Léonard Sadi Carnot (1796–1832), in particular, showed that the efficiencies were significantly worse than the simple relation (4.1) would indicate, and was the first to identify the quantity Q/T as important, though its significance was not appreciated for some time after his death. Speaking of death, Carnot died during the cholera epidemic that ravaged Europe, and because of fears of contamination most of his scientific writings were buried with him. So unfortunately his general work is not that well-known. Clausius picked up the ball and coined the term 'entropy' in 1865, but nobody understood what it really was until Boltzmann came along and developed a full statistical theory of it in 1877.

To make further theoretical progress (I’ll give some practical advice for engine building later), it is important to at least figure out how to design the most efficient engine, subject to the constraint (4.3). A crucial concept is reversibility. We saw in Section 3.1 that if two systems are in thermal contact, then the flow of heat Q from the hotter one to the cooler one is always associated with an increase of entropy. The only time the entropy change is zero is if the temperatures are the same. But suppose that the temperature difference was only infinitesimal. The heat could flow with only a negligible increase in entropy. Likewise, if the reservoir temperatures were suddenly reversed the heat would flow in the opposite direction, still without significantly increasing the total entropy. Here’s the official definition:

D: Reversible Process: A process that can be reversed by changing the conditions only infinitesimally.

So, if we had a reversible engine, the inequality in Eq. (4.3) would turn into an equality, and this would be our best-case scenario. The engine is made up of a gas, and the process is called the Carnot Cycle. It runs like this:

1. Set the temperature of the gas T_gas ≲ T_h;

2. Keep the gas at T_gas so it will expand as it absorbs heat (recall that the total energy is conserved, so if heat is coming in it had better do some −P dV work to let some energy out);

3. When the engine dumps heat into the cold reservoir, we want T_c ≲ T_gas. So we need to allow the gas to adiabatically expand as it cools (recall from Section 1.5 that adiabatic compression – compression without heat flow – heats the gas, so expansion must cool it);

4. Dump the heat from the gas into the cold reservoir isothermally (the reverse of step 2);

5. Adiabatically compress the gas to raise the temperature from Tc to Th (the reverse of step 3).

You might worry about steps 3 and 5. The adiabatic expansion of the gas has the gas doing the work, in principle; but to make sure it doesn't expand too much and get too cold, T_gas < T_c, you probably have to do some work to stop it. The easiest way is to simply put the gas 'bin' in a larger box that prevents the gas from expanding forever, so you don't have to expend any real work in stopping it. Except you might worry that I am transferring momentum in this process. Where does that energy come from? Likewise, the adiabatic compression step 5 has the bin doing work on the gas. Where is this energy coming from?

Just because the Carnot engine is the most efficient, it doesn't mean it is the most useful. In fact, it is probably the most useless engine, because being essentially reversible the amount of work you can extract from it is also infinitesimal, i.e. essentially zero. So I wouldn't build one of these in a fuel crisis if I were you! You're much better off coming up with a scheme with a huge temperature difference between the heat source and sink, which will put the efficiency higher to begin with!

4.2 Refrigerators

A fridge is just an engine in reverse, literally. Simply take all the arrows in Fig. 4.1 and reverse their directions, so now the work is pointing into the engine, and the heat is flowing out of the cold reservoir and into the hot reservoir. It is pretty obvious from our day-to-day experience, and from the mathematics we have developed so far on the statistical mechanics side of things, that this is very unlikely to happen spontaneously. So clearly we would need to do some work to make it happen; thus the fact that we need to plug our fridge into the wall!

The figure of merit for a fridge is no longer called the efficiency, to avoid confusion; it is now called the 'coefficient of performance.' It is defined as c = |Q_c|/|W|, i.e. the ratio of the heat pulled out of the cold reservoir to the amount of work needed to accomplish it. Using the first law gives

c = |Q_c|/(|Q_h| − |Q_c|) = 1/(|Q_h|/|Q_c| − 1).   (4.4)

Now the second law is simply Eq. (4.2) but with the hot and cold labels reversed, c ↔ h, so |Q_h|/|Q_c| ≥ T_h/T_c. Combining this with Eq. (4.4) gives

c ≤ 1/(T_h/T_c − 1) = T_c/(T_h − T_c).   (4.5)

This result is a bit more interesting than the engine efficiency, because it is somewhat counterintuitive. Notice that if T_h ≫ T_c the coefficient of performance goes to zero, so that the fridge doesn't work at all! This is the opposite of the engine case, where this was the best-case scenario. What do you think is going on? Likewise, the best situation is when the temperature of the hot reservoir is only slightly larger than that of the cold reservoir, T_h → T_c⁺. In this case the coefficient of performance can be arbitrarily large!

Consider your fridge, for which the interior is usually set at 4°C, or 277 K. The inside of your house is a pretty decent reservoir, so let's assume that the kitchen is kept at 20°C, or 293 K. Then c ≈ 17. This means that for every Joule of electrical energy consumed, 17 Joules of heat are removed from the interior of the fridge, and 18 Joules of heat are dumped into the kitchen. But consider instead a deep freezer, which is set around −20°C, or 253 K. Now one obtains c ≈ 6, which is much less efficient. Things get much worse as you attempt to get much colder. This is part of the reason why in practice it is difficult to cool things down to absolute zero (the real reason is quantum mechanics). To get to around 1 K, you need to cool very slowly, so that at any stage the difference between the hot and cold reservoirs remains small. More about these issues in the discussion of real fridges below.
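The two coefficient-of-performance numbers quoted here follow directly from Eq. (4.5); a trivial check (plain Python, my own illustration):

```python
def cop(T_cold, T_hot):
    """Maximum coefficient of performance c = Tc/(Th - Tc), Eq. (4.5)."""
    return T_cold / (T_hot - T_cold)

kitchen = 293.0                                            # K, i.e. 20 C
print("fridge   (+4 C):", round(cop(277.0, kitchen), 1))   # ~17.3
print("freezer (-20 C):", round(cop(253.0, kitchen), 1))   # ~6.3
```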

4.3 Real Heat Engines

4.3.1 Stirling Engine

The real engine that perhaps most closely resembles the idealized engine shown in Fig. 4.1 is the Stirling Engine, shown in Fig. 4.2. This was originally invented by a priest named Robert Stirling (1790–1878) in 1818, as an alternative to the steam engine, which had a tendency at the time to give you steam burns or explode. The idea was to make sure that the engine material, in this case an ideal (or nearly ideal) gas, was completely sealed, and that the pressures would be much lower than those in steam engines. In fact, soon after his invention a better steel-making process was invented that eliminated the explosion problem, and the Stirling engines never really took off. But it turns out that there is a lot of current interest again, for reasons that I will get to once I explain the principle.

Figure 4.2: Stirling engine. Thanks to wikipedia!

As shown in Fig. 4.2, the Stirling engine works with a sealed gas in a single chamber made up of three compartments. The upper-right compartment is in contact with a heat source while the lower compartment is in contact with a cold reservoir; these are connected by a long 'regenerator' over which the temperature of the gas changes slowly from hot to cold. Surrounding the regenerator (not shown) is a material with a large heat capacity, so that it is always almost exactly at the same temperature as the gas at each point. Like the Carnot engine, the Stirling engine has four distinct steps:

1. Some external heat source (top right) dumps heat into the upper compartment. This causes rapid isothermal expansion (P dV work). This pushes the upper piston to the left, turning the flywheel (upper left) clockwise; 2. The turning flywheel also pulls the lower piston acting on the cold compartment, decreasing the pressure locally; 3. The combined lower pressure in the cold compartment and the flywheel’s momentum pushing the hot piston to the right pushes the gas from the hot compartment to the cold compartment, through the regenerator. Heat is absorbed in the process by the regenerator to bring the gas temperature from Th to Tc. 4. The cold piston moves down, isothermally compressing the cold gas and releasing heat into the cold reservoir. The hot piston meanwhile is moving out and the gas is pulled back from cold compartment toward the hot compartment through the regenerator; it absorbs heat from the regenerator during this process to bring the temperature from Tc to Th.

It is clear that there is no net energy transferred to or from the regenerator on each cycle, so the engine efficiency (not including friction, losses, etc.) is entirely determined only by the difference in temperature between the hot and cold compartments. So, in principle this engine has the potential to achieve something close to maximum efficiency e = 1! It is for this reason that Stirling engines have enjoyed something of a renaissance in recent years: their potential efficiency is a boon in this relatively energy-starved age. Furthermore, Stirling engines have the potential to effectively boost the efficiency of many traditional engines, simply by employing their necessary waste heat as a power source. In fact, the most efficient home furnaces have a Stirling-type engine attached for this reason. They are also central to a solar-power generation scheme, where solar light is collected by reflective dishes and concentrated on a Stirling engine. The efficiency rivals that of the best solid-state solar cells and there are no environmental costs whatsoever.

Stirling engines have some downsides, though. They need to withstand huge temperature gradients without melting or warping. They have moving parts which can wear over time. The amount of generated torque is not as high as in other engines (like steam) because of the low gas pressures. They take some time to get going, so they wouldn't be that useful for transportation. The moving parts need to be optimized for the cycle frequency, so they aren't well-suited to environments where the relative temperatures vary widely. And with the huge temperature differences needed for efficiency, it is difficult to find cold reservoirs with a sufficiently high heat capacity to keep things going.

4.3.2 Steam Engine

The steam engine has been the workhorse of industry for centuries, and remains so today: nuclear and coal electric power plants in fact power giant steam engines that turn turbines to generate the electricity. The steam engine is known as an external combustion engine because the heat comes from a source outside. It runs on the four-step Rankine cycle, which is shown in Fig. 4.3. The steps are as follows:

Figure 4.3: Rankine cycle. Thanks again to wikipedia!

1. Water is pumped up (actively, this takes work) from low pressure to high pressure before it is fed to the boiler;

2. The water in the boiler is heated at constant pressure by an external source, turning it into vapour. The increased temperature means its volume expands, so the vapour travels up the tube toward the turbine;

3. The steam expands adiabatically in the turbine, turning the turbine and thereby generating the power. As it does so it cools, ending up at the original low pressure; 4. The partially condensed vapour is further cooled in the condenser, a pipe network in contact with the cold reservoir.

The condenser temperatures are around 30°C, and the maximum high temperature corresponds to the point at which the steel starts to warp, around 550°C. This gives a Carnot-type efficiency upper bound of around e = 63%, though most coal-fired plants operate around 40%. In practice, the actual operation of the steam engine is more complicated than the Rankine cycle.

4.3.3 Internal Combustion Engine

The internal combustion engine comes in two main flavours: gasoline (gas or petrol) and diesel, with the latter more efficient than the former. Their cycles are shown in Figures 4.4 and 4.5. Let's examine the gas engine first:

Figure 4.4: The gasoline engine. Thanks to Encyclopedia Brittanica online.

1. A mixture of vapourized gas and air is sucked into a cylinder as the piston moves outward. It is then compressed adiabatically as the piston moves inward, which raises its temperature;

2. A spark plug ignites the mixture, raising the temperature and pressure at constant volume;

3. (power stroke) The high-pressure gas pushes the piston outward, expanding adiabatically and producing work;

4. The hot gases are pushed out (exhaust) by the inward stroke of the piston and replaced by a new mixture at a lower temperature and pressure in step 1. No work is done in this process.

So by this description, the piston actually moves in and out twice over the course of a single cycle. It is called the Otto cycle, named after inventor Nikolaus August Otto. Anyhow, it seems like a suitable name!

Let's analyze the efficiency of this cycle now. Recall the adiabatic relation (1.22), P V^γ = constant. Using the ideal gas law P V = N k_B T gives T V^{γ−1} = new constant. So for the adiabatic compression (step 1) we have T_1 V_2^{γ−1} = T_c V_1^{γ−1}. Likewise, for the adiabatic expansion (step 3) we have T_3 V_1^{γ−1} = T_h V_2^{γ−1}. Therefore we must have

T_1/T_c = (V_1/V_2)^{γ−1} = T_h/T_3.   (4.6)

The efficiency is then found using Eq. (4.1), but what are the values for Q_h and Q_c? Recall the definition of the heat capacity (1.23), C_V = ∂U/∂T. In the event that no work is done then Q = ∫C_V dT = C_V(T_f − T_i) if the heat capacity is assumed to be temperature-independent. This is true pretty much only for an ideal gas, cf. Eq. (1.26). Putting everything together gives

e = 1 − |Q_c|/|Q_h| = 1 − C_V(T_3 − T_c)/[C_V(T_h − T_1)] = 1 − T_c(T_h/T_1 − 1)/(T_h − T_1) = 1 − T_c/T_1 = 1 − (V_2/V_1)^{γ−1}
  = 1 − (T_c/T_h)(T_h/T_1) = 1 − (T_c/T_h)(T_3/T_c).   (4.7)

Because the ratios T_3/T_c = T_h/T_1 ≥ 1, in general the efficiency of the Otto cycle is lower than that of the Carnot cycle. In automobile parlance, the ratio V_1/V_2 between the maximum and minimum volume of the cylinder is known as the compression ratio. For regular cars, its value is something like 8–10, while for sports cars it is something like 11 or 12. Together with the value of γ = (f + 2)/f = 7/5 for air (assuming it is diatomic oxygen so that f = 5), one obtains automobile efficiencies of

e ≈ 1 − (1/8)^{2/5} ≈ 56%,   (4.8)

which turns out to be optimistic. In reality, most gas engines get something like 30% efficiency.
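The ideal Otto-cycle estimate of Eq. (4.8) is easy to evaluate for a few compression ratios; here is a short sketch (plain Python, my own illustration, using γ = 7/5 as in the text; the ratio r = 20 is included only to show the trend, since a real diesel runs on a different cycle).

```python
gamma = 7.0 / 5.0        # diatomic gas, as in the text

for r in [8, 10, 12, 20]:                   # compression ratios V1/V2
    e = 1.0 - (1.0 / r)**(gamma - 1.0)      # Otto-cycle efficiency, Eq. (4.8)
    print(f"r = {r:2d}:  e = {100*e:.0f}%")
# r = 8 gives ~56% as quoted, well above the ~30% that real gasoline engines achieve.
```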

Figure 4.5: The diesel engine. Thanks again to Encyclopedia Brittanica online!

Fig. 4.5 shows the corresponding cycle for the fuel-injection type of diesel engine. It is much like the cycle for the gas engine, except that no spark plugs are needed. Rather, on the intake step only regular air is pumped into the cylinder, and the diesel vapour is only injected when the air is at sufficiently high pressure (and therefore temperature) following the compression step. At this stage it spontaneously ignites and explodes. It frankly sounds rather dangerous to me! The advantage is that diesel engines are more efficient than gas ones, closer to e ≈ 40%, because of the much higher compression ratios used (closer to 20). Also, you won't have trouble starting your car after fording a river that turned out to be deeper than you had hoped. The downside is that they are sometimes difficult to start in cold weather. Maybe this is why they are more popular in Europe and Asia than in Canada. Or perhaps it is because they sound like trucks and stink.

4.4 Real Refrigerators

4.4.1 Home Fridges

The kind of fridge that you probably have in your kitchen runs on a cycle that is almost exactly the reverse of the Rankine engine shown in Fig. 4.3. The working substance again transitions from a liquid to a gas and back, but now at much lower temperature. The preferred working substance is called HFC-134a, which is a cousin of freon without the ozone-layer-damaging chlorine. Here is the procedure:

1. The gas, initially at around room temperature, is compressed adiabatically, raising its temperature and pressure to become a superheated high-pressure gas. This is where the energy from the wall plug is needed;

2. It passively releases heat and gradually liquefies in the condenser, a network of pipes in contact with the 'hot' reservoir (actually at room temperature). These are usually under the fridge. The liquid is still hot (a bit warmer than room temperature) but stays a liquid because of the high pressures;

3. It passes through a throttler (a porous plug), after which it has substantially lowered its temperature and pressure. This process actually requires energy as well;

4. It absorbs heat from the inside of the fridge and turns back into a gas in the evaporator, a network of pipes in contact with the 'cold' reservoir. These are usually in the back of the fridge, behind the back false wall. Of course the 'cold' reservoir is understood to be at a higher temperature than the gas.

Let's look a bit more closely at the throttler process. Because no heat flows during the process, the change in energy is

U_f − U_i = Q + W = 0 + W_left + W_right = P_i V_i − P_f V_f,   (4.9)

where the negative sign in the last relation follows from the fact that the fluid is doing P V work on the piston to the right. So we have U_i + P_i V_i = U_f + P_f V_f, or conservation of enthalpy, H_i = H_f. Of course we already knew that from the discussion near Eq. (1.25). Suppose that the liquid is actually an ideal gas. Then H = (f/2)N k_B T + N k_B T = (f + 2)N k_B T/2, which means that if the enthalpies are the same before and after the throttling process, the temperature must not have changed! So clearly we can't use an ideal gas as the working liquid. Of course, an ideal gas can't also be a liquid, which it needs to be in this cycle. To become a liquid, the gas molecules must attract each other enough, and this additional attractive potential energy is enough to make sure that the temperatures do change: denser liquids have more of the attractive (negative potential) energy contribution. After the throttling the potential energy is higher, so to conserve energy the kinetic energy (which is related to the temperature) is lower. The properties of real interacting gases will only be analyzed next term.

There are times when having a compressor is not convenient, either because of its noise or the lack of an available source of electricity. In these cases, the energy comes from a heat source instead, such as solar power or burning kerosene. Rather than developing an engine that converts heat directly to electricity and using this, one can make an absorption refrigerator that uses the heat directly. I won't bother going into details; suffice it to say that it uses (toxic) ammonia and (explosive) hydrogen gas. It turns out Einstein and his student Leo Szilard worked on a version that was less dangerous, but it hasn't seen wide production even though there are indications that it could be incredibly efficient.

4.4.2 Liquefaction of Gases and Going to Absolute Zero

This part is almost all hand-wavey pictures and will be done in class.

Chapter 5

Free Energy and Chemical Thermodynamics

5.1 Free Energy as Work

5.1.1 Independent variables S and V

Recall that dU = Q + W = T dS − P dV. Clearly then U is a simultaneous function of the two parameters S and V, i.e. U = U(S, V). So we can write:

dU = (∂U/∂S)_V dS + (∂U/∂V)_S dV ≡ T dS − P dV,   (5.1)

which immediately yields the following two equations:

T = (∂U/∂S)_V,   P = −(∂U/∂V)_S.   (5.2)

The first of these we used extensively in Chapter 3, or rather its inverse relationship 1/T =

(∂S/∂U)_V, though I didn't make a big deal of the fact that we could only evaluate the expression assuming constant volume. The second is sort of obvious using W = −P dV if we assume that the total energy only depends on work done on it, but the above is more rigorous.

The fact that dU is an exact differential of the quantity U allows us to derive another useful relation. If we took a second derivative of U, the result can't depend on the order, i.e.

∂²U/∂V∂S ≡ ∂²U/∂S∂V

(∂/∂V)_S (∂U/∂S)_V ≡ (∂/∂S)_V (∂U/∂V)_S

⇒ (∂T/∂V)_S = −(∂P/∂S)_V.   (5.3)

This is a pretty handy formula, though the restriction of constant entropy for the first, or alternatively finding the pressure as a function of entropy, is a bit difficult in practice.


5.1.2 Independent variables S and P

Similar equations can be obtained by making a Legendre transformation. Consider for example the quantity PV. Evidently we have d(PV) = P dV + V dP, so

dU = T dS − P dV = T dS − d(PV) + V dP   ⇒   d(U + PV) = dH = T dS + V dP,   (5.4)

which is the constitutive equation for the enthalpy H that we first encountered in Eq. (1.25), rather than the internal energy U. Following the arguments above, we have H = H(S, P) so that

dH = (∂H/∂S)_P dS + (∂H/∂P)_S dP ≡ T dS + V dP,   (5.5)

which immediately yields the following two equations:

T = (∂H/∂S)_P,   V = (∂H/∂P)_S.   (5.6)

Likewise, using the second-derivative trick above, one readily obtains:

(∂T/∂P)_S = (∂V/∂S)_P.   (5.7)

5.1.3 Independent variables T and V

One can go on doing similar transformations for all the relevant variables, which is a bit of a dull exercise. But in fact there are two particular choices which are the most useful in practice, namely having the independent variables T and V, or the variables T and P. For the first of these, consider the quantity TS, for which we have d(TS) = T dS + S dT. Then

dU = T dS − P dV = d(TS) − S dT − P dV   ⇒   d(U − TS) ≡ dF = −S dT − P dV,   (5.8)

which is the constitutive equation for the Helmholtz free energy F ≡ U − TS. Most of the time that physicists talk about the free energy, it is this one, and often it is simply called the 'free energy,' period. Now, F = F(T, V) so that

dF = (∂F/∂T)_V dT + (∂F/∂V)_T dV ≡ −S dT − P dV,   (5.9)

which immediately yields the following two equations:

S = −(∂F/∂T)_V,   P = −(∂F/∂V)_T.   (5.10)

These are very useful equations! And again, using the second-derivative trick above, one finds:

(∂S/∂V)_T = (∂P/∂T)_V.   (5.11)

Thus, knowing how the pressure varies with temperature (at constant volume) is enough to tell you how the entropy varies with volume (at constant temperature)! Consider an ideal gas, P = N k_B T/V; then ∂P/∂T = N k_B/V, which yields ΔS = N k_B ln(V_f/V_i), just the expression that we obtained at the beginning of Chapter 3 when considering the number of macrostates, S = N k_B ln(V/V_0).
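This ideal-gas example can be reproduced symbolically. Here is a minimal sympy sketch (my own verification) that starts from the equation of state, applies the Maxwell relation (5.11), and integrates to get the entropy change of an isothermal expansion.

```python
import sympy as sp

T, V, Vi, Vf, N, kB = sp.symbols('T V V_i V_f N k_B', positive=True)

P = N*kB*T/V                     # ideal-gas equation of state
dSdV = sp.diff(P, T)             # (dS/dV)_T = (dP/dT)_V = N kB / V, Eq. (5.11)

dS = sp.integrate(dSdV, (V, Vi, Vf))
print("Delta S =", sp.simplify(dS))   # N*k_B*(log(V_f) - log(V_i)) = N kB ln(Vf/Vi)
```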

5.1.4 Independent variables T and P

The other useful case comes from considering a combination of PV and TS:

dG ≡ d(U − TS + PV) = T dS − P dV − T dS − S dT + P dV + V dP = −S dT + V dP,   (5.12)

where I have introduced the Gibbs free energy G ≡ U − TS + PV. The Gibbs free energy is now a function of T and P, so that

dG = (∂G/∂T)_P dT + (∂G/∂P)_T dP ≡ −S dT + V dP,   (5.13)

which immediately yields the following two equations:

S = −(∂G/∂T)_P,   V = (∂G/∂P)_T.   (5.14)

These are also very useful equations! And again, using the second-derivative trick, one finds:

(∂S/∂P)_T = −(∂V/∂T)_P.   (5.15)

Again, consider the ideal gas, V = N k_B T/P; then ∂V/∂T = N k_B/P, which yields ΔS = −N k_B ln(P_f/P_i), which is not an expression we thought of obtaining previously.

Just so you know, Eqs. (5.7), (5.11), and (5.15), together with the related equation

(∂T/∂V)_S = −(∂P/∂S)_V   (5.16)

are collectively called Maxwell's relations, not to be confused with Maxwell's equations in electromagnetism! That Maxwell certainly got around. Of course, all of them can be obtained directly from the original expression for the internal energy by various transformations. They are all basically restatements of the fact that the various thermodynamic quantities (temperature, pressure, volume, entropy) are all related to one another. But you should keep in mind that none of these were derived knowing exactly how they relate, i.e. no equation of state was used. Likewise, the quantities U, H, F, and G are called thermodynamic potentials.

5.1.5 Connection to Work

O.k., so now that we have derived all of these relations, what can we do with them? Consider the Helmholtz free energy F = U − TS. Clearly dF = dU − T dS − S dT = Q + W − T dS − S dT = T dS − P dV − T dS − S dT = −P dV − S dT. If the change was made isothermally then dT = 0, and so the change in the free energy is exactly the same as the work, dF = −P dV = W. In deriving this and the expressions above, I used the fact that Q = T dS, as I derived in Eq. (3.5). In fact, close inspection of this equation shows that this is only an equality if the amount of heat added is very small; in general one has dS ≥ Q/T when the heat is not infinitesimal. So in general T dS > Q and therefore

dF ≤ W   (5.17)

at constant temperature.

If we're working at constant pressure instead, then it is more convenient to work with the Gibbs free energy: dG = W_other − S dT only if dP = 0. I have added the W_other to indicate that other (non-P dV) forms of work could be included. In this case we instead have

dG ≤ W_other   (5.18)

at constant temperature and pressure. Values for the (changes in) Gibbs free energy for many systems, particularly those in chemistry, can be found in tables. There are many examples in chemistry that I find boring, so I won't discuss them.

5.1.6 Varying particle number

If the number of particles in the system is not fixed, then we also need to include the chemical potential μ (see Section 8.2). Then the change in internal energy, and the changes in the enthalpy, Helmholtz, and Gibbs free energies, are given respectively by

dU = T dS − P dV + μ dN;
dH = T dS + V dP + μ dN;
dF = −S dT − P dV + μ dN;
dG = −S dT + V dP + μ dN.

The first of these we saw back in Section 8.2. The rest of them are pretty trivially changed from their cases seen above without allowing for changes to the total particle number! So we obtain the following rules:

μ = (∂U/∂N)_{S,V} = (∂H/∂N)_{S,P} = (∂F/∂N)_{T,V} = (∂G/∂N)_{T,P}.   (5.19)

If there is more than one kind of particle, then one would need to insert the additional conditions, i.e. the chemical potential for species 1 would be

μ_1 = (∂F/∂N_1)_{T,V,N_2,N_3,...}.

5.2 Free Energy as Force toward Equilibrium

We have now pretty firmly established that entropy increases as equilibrium is reached. How about the free energies discussed in the previous section? Suppose that we decompose our total system into two pieces, corresponding to a subsystem and a reservoir. The reservoir is defined as something which can absorb or release energy without changing its temperature (recall Section 3.2.1). The change in the entropy of the total system can be written

dS_total = dS + dS_R,

where the reservoir is labelled with an 'R' and the subsystem has no label. We can rewrite this as

dS_total = dS + (1/T_R)(dU_R + P_R dV_R − μ_R dN_R).

We can always assume that the temperature of the reservoir remains constant, i.e. that the temperature is defined by that of the reservoir, T = T_R. If furthermore the volume V_R and the number of

particles N_R in the reservoir are fixed (which also implies that V and N are fixed in the subsystem, because we assume that the total system is well-defined), then this relation becomes

dS_total = dS + dU_R/T.

Now, since dU + dU_R = 0 for the closed total system, this becomes

dS_total = dS − dU/T = −(1/T)(dU − T dS) = −dF/T.

This says that if the total entropy of the total system increases, then the Helmholtz free energy of the subsystem must decrease! The same result holds for the Gibbs free energy. That is, equilibrium corresponds to the minimization of the free energies. This is just a restatement of something that you have probably heard for a long time: a system will seek its lowest (free) energy. But be careful: the Helmholtz free energy is U − TS; if the entropy is increasing, the free energy is going to decrease at constant temperature even if the internal energy stays constant or even increases! So this is a result specific to the free energies, not the internal energy.

It is convenient to classify the various thermodynamic quantities according to whether they are 'extensive' or 'intensive.' These are terms that are used very often in the literature, and so I should let you know what they mean. Extensive quantities are those that get larger when you increase the amount of stuff that you started with, and intensive quantities are those that remain the same under this operation. Obvious things that will change are N and V, and obvious things that won't are T and P. Here is a list:

• Extensive quantities: V, N, S, U, H, F, G, mass;

• Intensive quantities: T, P, μ, density.

If you multiply an intensive and an extensive quantity together, you get an extensive quantity; likewise, dividing two extensive quantities yields an intensive quantity. Multiplying two extensive quantities together gives nonsense, so don't do this! But there is nothing wrong with multiplying two intensive quantities. The mnemonic I use to remember which is which is that the 'in' in intensive stands for an 'in'ternal quantity, i.e. something that doesn't change when I change an 'ex'ternal parameter.

Chapter 6

Boltzmann Statistics (aka The Canonical Ensemble)

In the preceding chapter, we have seen the microcanonical ensemble, where the total number of particles N = Σ_i n_i and the energy U = Σ_i n_i ε_i are fixed (constant). Suppose now that the total energy is not fixed, but rather that energy can be exchanged with some 'bath,' or energy reservoir: this is the canonical ensemble. This is the situation when the system is no longer thermally isolated (i.e. your mug of steaming something, rather than your vacuum thermos full of hot something that feels cool to the outside). We saw in the previous chapter that changes in the heat are associated with the rearrangement of particles in energy states; more heat means more particles are found in higher quantum energy levels. So now the question is: how to calculate the n_i?

6.1 The Boltzmann Factor

To find the n_i in the canonical ensemble, we again use our old trick, maximizing the entropy: d ln(Ω) = 0 and d² ln(Ω) < 0. First, the first:

d ln(Ω) = (∂ ln(Ω)/∂n_1) dn_1 + (∂ ln(Ω)/∂n_2) dn_2 + ... = Σ_i (∂ ln(Ω)/∂n_i) dn_i = 0.   (6.1)

Recall that in general we can write Ω = N!/Π_i n_i!, which means

ln(Ω) = N ln(N) − N − Σ_i n_i ln(n_i) + Σ_i n_i = N ln(N) − Σ_i n_i ln(n_i)

as we have seen before. Now,

∂ ln(Ω)/∂n_j = −∂/∂n_j [Σ_i n_i ln(n_i)]   (note the derivative is with respect to the variable n_j)


= −Σ_i δ_ij [ln(n_i) + 1] = −[ln(n_j) + 1].

Substituting this into Eq. (6.1), we have:

d ln(Ω) = −Σ_i [ln(n_i) + 1] dn_i = 0.   (6.2)

Because N = Σ_i n_i and U = Σ_i n_i ε_i are constants, we also know that

dN = Σ_i dn_i = 0   and   dU = Σ_i ε_i dn_i = 0.

You might object to the second equation here, for two reasons. First, I have assumed that the variation of the mean energy is zero, even though I started out by stating that the total energy isn't fixed! But while energy is allowed to flow between the system and the reservoir, the mean energy is assumed to be a constant at equilibrium. Part of this process is to determine what the mean energy U actually is, if the various n_i are allowed to vary. Second, you might object that I haven't included any variations in the energy levels themselves (i.e. I haven't included a term that looks like

Σ_i n_i dε_i). I am explicitly assuming here that the container walls themselves are fixed, so that the quantum energy levels are always well-defined; it's just that now the walls can conduct heat.

O.k., now I have three equations that I need to satisfy simultaneously:

$$\sum_i dn_i = 0; \qquad \sum_i\epsilon_i\,dn_i = 0; \qquad \text{and} \qquad \sum_i\ln(n_i)\,dn_i = 0.$$
How can I solve for the $n_i$? This is just like a classical mechanics problem where I have to solve for the dynamics of some object, subject to some constraints. The first and second of the above equations are the holonomic equations of constraint for the third equation! If you remember, these equations of constraint could be incorporated using the method of undetermined multipliers, otherwise known as Lagrange multipliers. Because I have two equations of constraint, I need two Lagrange multipliers: call them $\alpha$ and $\beta$. Now the equation I need to solve is

$$\sum_i\left[\ln(n_i) + \alpha + \beta\epsilon_i\right]dn_i = 0.$$
I have effectively added zero to my original equation, twice! Now, the $dn_i$ variations are completely arbitrary, but no matter what I choose for the various $dn_i$ the left-hand side always sums to zero. The only way to guarantee this is if the term in the square brackets is itself zero,

$$\ln(n_i) + \alpha + \beta\epsilon_i = 0.$$
Inverting gives
$$n_i = \exp\left[-\beta\epsilon_i - \alpha\right] = A\exp(-\beta\epsilon_i), \qquad (6.3)$$
where $A = \exp(-\alpha)$ is a constant. How to find $\alpha$ and $\beta$? Use the equations of constraint again! Since $N = \sum_i A\exp(-\beta\epsilon_i)$, knowing $\beta$ and given $N$, $A$ immediately follows. How about $\beta$? Well, because we haven't allowed any work to be done (the volume of the system is assumed fixed), the change in mean energy is just due to the heat:

$$dU = \sum_i\epsilon_i\,dn_i \qquad (6.4)$$
$$= T\,dS = k_B T\,d\ln(\Omega) = -k_B T\sum_i\ln(n_i)\,dn_i. \qquad (6.5)$$

Comparing (6.4) and (6.5) immediately shows that $\epsilon_i = -k_B T\ln(n_i)$. Inverting this gives
$$n_i = \exp\left(-\frac{\epsilon_i}{k_B T}\right). \qquad (6.6)$$

Comparison between Eq. (6.3) and (6.6) shows that β =1/kBT , as I derived using a hokey method last chapter. On the other hand, I also used the identity dU = T dS here, which was derived using the same hokey method, so I’m not sure if I’m any further along, really. . . .

Now that I know the explicit form for $n_i$, I can put it back into the equation for the total number of particles: $N = A\sum_i\exp(-\epsilon_i/k_B T)$. Because the probability of a given outcome is simply equal to the population of a given energy state, divided by the total number of particles, I obtain:
$$p_i = \frac{n_i}{N} = \frac{A\exp(-\epsilon_i/k_B T)}{A\sum_i\exp(-\epsilon_i/k_B T)} \equiv \frac{\exp(-\epsilon_i/k_B T)}{Z},$$
where the partition function is defined as

$$Z \equiv \sum_i\exp\left(-\frac{\epsilon_i}{k_B T}\right).$$
The Z stands for Zustandssumme, or sum over states. Of course, $N = AZ$.
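
To make the machinery concrete, here is a minimal numerical sketch (my own, not from the notes; the energy values are made up purely for illustration) that evaluates $Z$ and the occupation probabilities $p_i$ for a small set of levels and checks that the probabilities sum to one.

import numpy as np

def boltzmann_probabilities(energies, T, kB=1.0):
    """Return the partition function Z and probabilities p_i = exp(-eps_i/kB T)/Z."""
    energies = np.asarray(energies, dtype=float)
    beta = 1.0 / (kB * T)
    weights = np.exp(-beta * energies)   # Boltzmann factors
    Z = weights.sum()                    # Z = sum_i exp(-beta * eps_i)
    return Z, weights / Z

# Three arbitrary (hypothetical) energy levels, in units where kB = 1
levels = [0.0, 1.0, 2.0]
Z, p = boltzmann_probabilities(levels, T=1.5)
print("Z =", Z)
print("p =", p, " sum =", p.sum())   # probabilities sum to 1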

6.2 Z and the Calculation of Anything

Once you know Z, you know everything. In other words, knowing the occupation of the various energy levels using formula (6.6) allows you to calculate any thermodynamic quantity of interest. First, let’s write the entropy in terms of Z:

$$S = k_B\ln(\Omega) = k_B\left[N\ln(N) - \sum_i n_i\ln(n_i)\right]$$
$$= k_B\left\{N\ln(N) - \sum_i n_i\left[\ln(A) - \beta\epsilon_i\right]\right\} \qquad \text{using Eq. (6.3)}$$
$$= k_B\left\{N\ln(N) - N\ln(A) + \beta U\right\} \qquad \text{using the definition of } U.$$

Now, $N = A\sum_i\exp(-\beta\epsilon_i)$ so $\ln(N) = \ln(A) + \ln\left[\sum_i\exp(-\beta\epsilon_i)\right] = \ln(A) + \ln(Z)$. Inserting this above we obtain
$$S = k_B\left[N\ln(A) + N\ln(Z) - N\ln(A) + \beta U\right].$$
Finally, we have the result for the entropy in terms of the partition function:
$$S = N k_B\ln(Z) + \frac{U}{T}. \qquad (6.7)$$
Another important quantity that immediately follows, without even knowing the explicit dependence of $U$ on $Z$, is the Helmholtz free energy:

$$F = U - TS = -N k_B T\ln(Z). \qquad (6.8)$$

We still don't know the explicit form for $U$ as a function of $Z$, so here goes. We already know that $U = \sum_i\epsilon_i n_i = A\sum_i\epsilon_i\exp(-\beta\epsilon_i)$. This expression can be 'simplified' using the fact that
$$\epsilon_i\exp(-\beta\epsilon_i) = -\frac{\partial}{\partial\beta}\exp(-\beta\epsilon_i) = -\frac{\partial T}{\partial\beta}\frac{\partial}{\partial T}\exp(-\beta\epsilon_i) = \frac{1}{k_B\beta^2}\frac{\partial}{\partial T}\exp(-\beta\epsilon_i).$$
Putting it together we have

$$U = \frac{A}{k_B\beta^2}\frac{\partial}{\partial T}\sum_i\exp(-\beta\epsilon_i) = A k_B T^2\frac{\partial Z}{\partial T} = \frac{N k_B T^2}{Z}\frac{\partial Z}{\partial T}.$$
Now we're finally done:
$$U = N k_B T^2\frac{\partial\ln(Z)}{\partial T}. \qquad (6.9)$$
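
As a quick sanity check on Eq. (6.9) — my own sketch, not part of the notes — one can compare the mean energy computed directly as a Boltzmann-weighted average with $N k_B T^2\,\partial\ln(Z)/\partial T$ evaluated by a finite difference; the two agree for any set of levels (the specific levels below are arbitrary test values).

import numpy as np

kB, N = 1.0, 1.0
levels = np.array([0.0, 0.7, 1.3, 2.1])   # arbitrary energy levels for the test

def lnZ(T):
    return np.log(np.sum(np.exp(-levels / (kB * T))))

def U_direct(T):
    w = np.exp(-levels / (kB * T))
    return N * np.sum(levels * w) / np.sum(w)   # U = N <eps>

T, h = 2.0, 1e-5
U_from_lnZ = N * kB * T**2 * (lnZ(T + h) - lnZ(T - h)) / (2 * h)  # Eq. (6.9), numerically
print(U_direct(T), U_from_lnZ)   # the two values agree to high precision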

The text by Schroeder has a different approach to these relationships, given on pp. 247-248. He starts with the definition of the Helmholtz free energy $F \equiv U - TS$, and uses the relation (5.10):
$$\left(\frac{\partial F}{\partial T}\right)_{V,N} = -S.$$
Note that in taking this partial derivative you must also keep $N$ fixed, for reasons you'll appreciate in a few moments. Rearranging the equation for $F$ gives the right-hand side of the above equation, $-S = (F - U)/T$. So this yields a differential equation for $F$:
$$\left(\frac{\partial F}{\partial T}\right)_{V,N} = \frac{F - U}{T}.$$

Schroeder then posits a solution for the free energy, $F = -N k_B T\ln(Z)$, and checks if it is indeed a solution to the differential equation:

$$-\frac{\partial}{\partial T}\left[N k_B T\ln(Z)\right]_{V,N} = -N k_B\ln(Z) - N k_B T\left(\frac{\partial\ln(Z)}{\partial T}\right)_{V,N} = \frac{F}{T} - \frac{U}{T},$$
if we identify the mean energy as

$$U \overset{!}{=} N k_B T^2\left(\frac{\partial\ln(Z)}{\partial T}\right)_{V,N},$$
which it in fact is, as shown above. So we're done.

Some simple examples of how to think about the partition function will be covered in class.

Another very important thermodynamic quantity is the specific heat, or the heat capacity. This is often what is really measured in an experiment. The specific heat at constant volume $C_V$ is defined as
$$C_V = T\left(\frac{\partial S}{\partial T}\right)_V.$$

We can also express the specific heat in terms of the mean energy $U$:
$$\frac{\partial S}{\partial T} = \frac{\partial}{\partial T}\left[N k_B\ln(Z) + N k_B T\frac{\partial\ln(Z)}{\partial T}\right] = 2N k_B\frac{\partial\ln(Z)}{\partial T} + N k_B T\frac{\partial^2\ln(Z)}{\partial T^2}$$
$$= \frac{1}{T}\frac{\partial}{\partial T}\left[N k_B T^2\frac{\partial\ln(Z)}{\partial T}\right] = \frac{1}{T}\frac{\partial U}{\partial T}. \qquad (6.10)$$
Inserting this into the above expression for the specific heat we obtain
$$C_V = \left(\frac{\partial U}{\partial T}\right)_V.$$

We can also express $C_V$ in terms of the Helmholtz free energy $F$, using Eqs. (6.7), (6.8), and (6.9):
$$\left(\frac{\partial F}{\partial T}\right)_V = -N k_B\ln(Z) - N k_B T\frac{\partial\ln(Z)}{\partial T} = -N k_B\ln(Z) - \frac{U}{T} = -S. \qquad (6.11)$$
This immediately yields
$$C_V = -T\left(\frac{\partial^2 F}{\partial T^2}\right)_V.$$
Which of these three expressions you choose to use depends in large part on which one is easiest to calculate for a given problem!

The connection between the Helmholtz free energy and the partition function means that we can carry over all the relationships obtained in Sec. 5.1.3. One of these is simply Eq. (6.11):
$$S = -\left(\frac{\partial F}{\partial T}\right)_V = N k_B\left(\frac{\partial[T\ln(Z)]}{\partial T}\right)_V = N k_B\ln(Z) + N k_B T\left(\frac{\partial\ln(Z)}{\partial T}\right)_V,$$
which we already knew from combining Eqs. (6.7) and (6.9). The other one is
$$P = -\left(\frac{\partial F}{\partial V}\right)_T = N k_B T\left(\frac{\partial\ln(Z)}{\partial V}\right)_T.$$
The last one follows from Eq. (8.2):
$$\mu = \left(\frac{\partial F}{\partial N}\right)_{T,V} = -k_B T\left(\frac{\partial[N\ln(Z)]}{\partial N}\right)_{T,V} = -k_B T\ln(Z) - N k_B T\left(\frac{\partial\ln(Z)}{\partial N}\right)_{T,V}.$$
Because $Z = N/A$, this last relation can be solved directly: $\partial\ln(N/A)/\partial N = \partial\ln(N)/\partial N = 1/N$, so that $\mu = -k_B T\left[\ln(Z) + 1\right]$. At high temperatures, the arguments of the exponentials in $Z$ will all be small, which implies that $Z > 1$ and hence $\ln(Z) > 0$. This means that the chemical potential for classical particles is always negative. At low temperatures it might approach zero or become positive. This possibility will be explored next term when we discuss quantum statistics.

6.2.1 Example: Pauli Paramagnet Again!

You'll see that using the canonical ensemble is much easier! Again, we assume a spin-1/2 system where we have two energy levels $\epsilon$ and $-\epsilon$. The partition function is therefore
$$Z = \exp(\epsilon/k_B T) + \exp(-\epsilon/k_B T) = 2\cosh(\epsilon/k_B T). \qquad (6.12)$$

Using Eq. (6.9), we obtain
$$U = N k_B T^2\frac{\partial\ln(Z)}{\partial T} = \frac{N k_B T^2}{Z}\frac{\partial}{\partial T}\,2\cosh(\epsilon/k_B T) = \frac{N k_B T^2}{Z}\,2\sinh(\epsilon/k_B T)\left(-\frac{\epsilon}{k_B T^2}\right)$$
$$= -N\epsilon\tanh\left(\frac{\epsilon}{k_B T}\right). \qquad (6.13)$$
Meanwhile, the entropy is
$$S = N k_B\ln(Z) + \frac{U}{T} = N k_B\ln\left[\exp\left(\frac{\epsilon}{k_B T}\right) + \exp\left(-\frac{\epsilon}{k_B T}\right)\right] + \frac{U}{T}.$$

What about the magnetization? The energy levels now depend on the magnetic field rather than the volume, as was the case for the gas in the box. The magnetization for the magnetic system is the generalized force associated with doing work on the system, in the form of changing the magnetic field (and therefore the accessible energy states), just as the pressure is the generalized force due to changing the volume. So,
$$dW = \sum_i n_i\,d\epsilon_i = \sum_i n_i\frac{\partial\epsilon_i}{\partial B}\,dB \equiv -M\,dB,$$
where the negative sign is a convention. To make further progress, we use the definition of the Helmholtz free energy $F = U - TS$: $dF = dU - S\,dT - T\,dS$. But $dU = dQ + dW = T\,dS - M\,dB$ as shown just now, so $dF = T\,dS - M\,dB - T\,dS - S\,dT = -M\,dB - S\,dT$. We are left with the important relation
$$M = -\frac{\partial F}{\partial B} = N k_B T\frac{\partial\ln(Z)}{\partial B} = \frac{N k_B T}{Z}\frac{\partial Z}{\partial B}.$$
It is easy to show that using this with the expression (6.12) reproduces Eq. (3.13). It's also important to note that the same reasoning gives the definition of the pressure in terms of the partition function:
$$P = -\frac{\partial F}{\partial V} = N k_B T\frac{\partial\ln(Z)}{\partial V} = \frac{N k_B T}{Z}\frac{\partial Z}{\partial V}. \qquad (6.14)$$

At very low temperatures, $T \to 0$, $k_B T \ll \epsilon$, so the second term in the square brackets becomes vanishingly small. Also, $\tanh(\epsilon/k_B T) \to 1$ in the expression for $U$ in Eq. (6.13). Putting these together gives
$$S(T\to 0) \approx N k_B\frac{\epsilon}{k_B T} - \frac{N\epsilon}{T} \to 0.$$
So, the entropy of the Pauli paramagnet goes to zero at zero temperature. This is because all of the spins align, so the disorder of the system vanishes (Boltzmann philosophy); alternatively, it becomes trivially easy to describe the state of the system (Shannon philosophy). In the opposite limit of very high temperatures, $T \gg 0$, we obtain $\epsilon/k_B T \to 0$ and so $U \to -N\epsilon(\epsilon/k_B T)$, because the tanh function has a linear slope near the origin. Meanwhile, $Z \to 2$ because each of the exponentials approaches 1. So the entropy is
$$S(T\gg 0) \to N k_B\ln(2) - \frac{N\epsilon^2}{k_B T^2} \approx N k_B\ln(2).$$
Does this make sense?
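
A short numerical sketch of the two-level paramagnet (my own illustration, with $\epsilon$ and $T$ in arbitrary units): it evaluates $U$ and $S$ from the formulas above and confirms the two limits just discussed, $S \to 0$ at low temperature and $S \to N k_B\ln 2$ at high temperature.

import numpy as np

kB, N, eps = 1.0, 1.0, 1.0

def thermodynamics(T):
    x = eps / (kB * T)
    Z = 2 * np.cosh(x)               # Eq. (6.12)
    U = -N * eps * np.tanh(x)        # Eq. (6.13)
    S = N * kB * np.log(Z) + U / T   # Eq. (6.7)
    return U, S

for T in [0.05, 1.0, 50.0]:
    U, S = thermodynamics(T)
    print(f"T={T:6.2f}  U={U:8.4f}  S={S:8.4f}")
# S approaches 0 as T -> 0, and N*kB*ln(2) = 0.6931 as T -> infinity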

6.2.2 Example: Particle in a Box (1D)

Recall that the energy levels for particles of mass $m$ in a one-dimensional box of length $L$ were found using Bohr-Sommerfeld quantization to be

$$\epsilon_n = \frac{\hbar^2\pi^2 n^2}{2mL^2}.$$
The partition function for this system is now

$$Z = \sum_{n=0}^{\infty}\exp(-\gamma n^2), \qquad \text{where} \qquad \gamma \equiv \frac{\hbar^2\pi^2}{2mL^2 k_B T}.$$
In general, we're at high temperatures so that $\gamma \ll 1$. To give you an idea of how small $\gamma$ is, let's again assume that we are dealing with nitrogen in a 1 m box at room temperature, as described in Section 3.2.2. There we saw that $k_B T/\epsilon_n \sim 2\times 10^{21}$, so $\gamma \sim 10^{-22}$. Obviously, with this incredibly small coefficient in the exponential, this is a series that converges very slowly. It makes sense to convert it to an integral:
$$Z \approx \int_0^{\infty}dn\,\exp(-\gamma n^2).$$
Substituting $x = \sqrt{\gamma}\,n$ converts this integral to

$$Z = \frac{1}{\sqrt{\gamma}}\int_0^{\infty}dx\,e^{-x^2} = \frac{\sqrt{\pi}}{2\sqrt{\gamma}} = \sqrt{\frac{\pi}{4\gamma}} = L\sqrt{\frac{m k_B T}{2\pi\hbar^2}} \equiv \frac{L}{\lambda_D}, \qquad (6.15)$$
where the de Broglie wavelength $\lambda_D$ is defined as

$$\lambda_D \equiv \sqrt{\frac{2\pi\hbar^2}{m k_B T}}.$$

This is an important length scale that I'll discuss in a moment. But first, now that we have the partition function, let's calculate some thermodynamic properties, such as the mean energy and the heat capacity:
$$U = \frac{N k_B T^2}{Z}\frac{\partial Z}{\partial T} = \frac{N k_B T^2\lambda_D}{L}\,\frac{1}{2T}\,\frac{L}{\lambda_D} = \frac{1}{2}N k_B T, \qquad C_V = \frac{\partial U}{\partial T} = \frac{1}{2}N k_B.$$
Note that the specific heat has the same units as the entropy. Hmmmmmm.
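
Since $\gamma$ is so tiny for a real gas, the series can't be summed term by term in practice, but the sum-to-integral replacement itself is easy to test at a modest value of $\gamma$. The sketch below is my own, with $\gamma$ chosen artificially large so the sum converges quickly; it compares the direct sum with $\sqrt{\pi/4\gamma} = L/\lambda_D$.

import numpy as np

gamma = 1e-3   # artificially large compared to ~1e-22 for nitrogen in a 1 m box
n = np.arange(0, 5000)
Z_sum = np.sum(np.exp(-gamma * n**2))    # direct sum over levels
Z_int = np.sqrt(np.pi / (4 * gamma))     # integral approximation, Eq. (6.15)
print(Z_sum, Z_int, "relative difference:", abs(Z_sum - Z_int) / Z_int)
# They differ by roughly 1/(2*Z_int) ~ 2%, the n = 0 endpoint correction,
# which becomes negligible as gamma -> 0 (i.e. at high temperature).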

Now let's get to what $\lambda_D$ means. Suppose that the atoms in the 1D box only have kinetic energy (there's no gravity, and they all bounce elastically off the walls). Then the mean energy per particle is simply $U/N = \frac{1}{2}m\bar{v}^2$, where $\bar{v}$ is the mean speed of an atom. Putting the above result together gives $\frac{1}{2}m\bar{v}^2 = \frac{1}{2}k_B T$, or $\bar{v} = \sqrt{k_B T/m}$. Alternatively, the mean momentum is

$\bar{p} = m\bar{v} = \sqrt{m k_B T}$. But we already know that $\sqrt{m k_B T} = \sqrt{2\pi\hbar^2}/\lambda_D$ from the definition of the de Broglie wavelength. So we obtain

$$\bar{p} = \frac{\sqrt{2\pi\hbar^2}}{\lambda_D} = \frac{1}{\sqrt{2\pi}}\frac{h}{\lambda_D}.$$

But, apart from a constant factor of √2π, this is just de Broglie’s relation p = h/λ showing that particles are waves in quantum mechanics! In other words, the ‘wavelength’ of a particle is inversely proportional to the root of both its mass and its temperature: the lower the temperature, or the lighter the particle, the longer its wavelength. So at sufficiently low temperatures we might expect small particles to ‘fuzz out’ and behave very nonclassically! This is indeed what happens, but this interesting story will have to wait until later in the term.

6.2.3 Example: Particle in a Box (3D)

In three dimensions, the quantum energy levels are almost identical to those in 1D:

$$\epsilon_{n_x,n_y,n_z} = \frac{\hbar^2\pi^2}{2mL^2}\left(n_x^2 + n_y^2 + n_z^2\right), \qquad n_x, n_y, n_z = 0, 1, 2, \dots.$$

The lowest few accessible energy levels are therefore (where for clarity I define $\epsilon \equiv \frac{\hbar^2\pi^2}{2mL^2}$):

energy level   nx  ny  nz
0              0   0   0
ǫ              1   0   0
ǫ              0   1   0
ǫ              0   0   1
2ǫ             1   1   0
2ǫ             1   0   1
2ǫ             0   1   1
3ǫ             1   1   1
4ǫ             2   0   0
4ǫ             0   2   0
4ǫ             0   0   2

The various energy levels are all equally spaced, and the lowest energy level is unique. But the second lowest, with energy $\epsilon = \frac{\hbar^2\pi^2}{2mL^2}$, is triply degenerate, as are the third and fifth levels $2\epsilon$ and $4\epsilon$. So we should write the partition function for the 3D case, including these degeneracy factors, as

$$Z = \sum_m g_m\exp(-\beta\epsilon_m) = 1 + 3\exp(-\beta\epsilon) + 3\exp(-2\beta\epsilon) + \exp(-3\beta\epsilon) + 3\exp(-4\beta\epsilon) + \dots,$$
where $g_m$ is the degeneracy factor. While formally correct, it is very difficult to figure out how on earth to sum this series to get a closed-form expression for $Z$.

Our task is simplified considerably by noticing that the energies for each dimension are additive, which means that the partition function can be written more conveniently as a product of the partition functions in each dimension:

$$Z = \sum_{n_x,n_y,n_z}\exp\left[-\gamma(n_x^2 + n_y^2 + n_z^2)\right] = \sum_{n_x=0}^{\infty}\exp(-\gamma n_x^2)\sum_{n_y=0}^{\infty}\exp(-\gamma n_y^2)\sum_{n_z=0}^{\infty}\exp(-\gamma n_z^2) = (Z_{1D})^3 = \frac{L^3}{\lambda_D^3} = \frac{V}{\lambda_D^3}.$$
Generalizing the calculation of the 1D mean energy to the 3D case, we obtain
$$U = \frac{3}{2}N k_B T \qquad \text{and} \qquad C_V = \frac{3}{2}N k_B.$$
Let's now obtain the equation of state using the equation for the pressure in terms of the free energy (6.14). First calculate $F$:

$$F = -N k_B T\ln(Z) = -N k_B T\left[\ln(V) - 3\ln(\lambda_D)\right] = -N k_B T\left[\ln(V) - \frac{3}{2}\ln\left(\frac{2\pi\hbar^2}{m k_B}\right) + \frac{3}{2}\ln(T)\right].$$
So,
$$P = -\frac{\partial F}{\partial V} = \frac{N k_B T}{V},$$
which is just the expected equation of state for an ideal gas in a cubic box. Putting this result together with that for the mean energy gives
$$PV = \frac{2}{3}U.$$
So the work $dW$ done by changing the volume can be immediately obtained:
$$dW = -P\,dV = -\frac{2U}{3V}\,dV.$$

6.2.4 Example: Harmonic Oscillator (1D)

Before we can obtain the partition function for the one-dimensional harmonic oscillator, we need to find the quantum energy levels. Because the system is known to exhibit periodic motion, we can again use Bohr-Sommerfeld quantization and avoid having to solve Schrödinger's equation. The total energy is
$$E = \frac{p^2}{2m} + \frac{kx^2}{2} = \frac{p^2}{2m} + \frac{m\omega^2 x^2}{2},$$
where $\omega = \sqrt{k/m}$ is the classical oscillation frequency. Inverting this gives $p = \sqrt{2mE - m^2\omega^2 x^2}$. Insert this into Eq. (3.6):
$$\oint p\,dx = \oint\sqrt{2mE - m^2\omega^2 x^2}\,dx = nh,$$
where the integral is over one full period of oscillation. Let $x = \sqrt{2E/m\omega^2}\sin(\theta)$ so that $m^2\omega^2 x^2 = 2mE\sin^2(\theta)$. Then
$$\oint p\,dx = \sqrt{2mE}\sqrt{\frac{2E}{m\omega^2}}\int_0^{2\pi}\cos^2(\theta)\,d\theta = \frac{2E}{\omega}\,2\pi\,\frac{1}{2} = \frac{2\pi E}{\omega} = nh.$$

So, again making the switch $E \to \epsilon_n$, we obtain
$$\epsilon_n = \frac{h\omega}{2\pi}\,n = n\hbar\omega.$$
The full solution to Schrödinger's equation (a lengthy process involving Hermite polynomials) gives $\epsilon_n = \hbar\omega(n + \frac{1}{2})$. Except for the constant factor, Bohr-Sommerfeld quantization has done a fine job of determining the energy states of the harmonic oscillator.

Armed with the energy states, we can now obtain the partition function:

$$Z = \sum_n\exp(-\epsilon_n/k_B T) = \sum_n\exp(-\beta\epsilon_n) = 1 + \exp(-\beta\hbar\omega) + \exp(-2\beta\hbar\omega) + \dots.$$
But this is just a geometric series: if I make the substitution $x \equiv \exp(-\beta\hbar\omega)$, then $Z = 1 + x + x^2 + x^3 + \dots$. But I also know that $xZ = x + x^2 + x^3 + \dots$. Since both $Z$ and $xZ$ have an infinite number of terms, I can subtract them and all terms cancel except the first: $Z - xZ = 1$, which immediately yields $Z = 1/(1 - x)$, or
$$Z = \frac{1}{1 - \exp(-\beta\hbar\omega)}. \qquad (6.16)$$

$$U = N k_B T^2\frac{\partial\ln(Z)}{\partial T} = \frac{N k_B T^2}{Z}\frac{\partial Z}{\partial T} = N k_B T^2\left[1 - \exp(-\beta\hbar\omega)\right]\frac{\exp(-\beta\hbar\omega)}{\left[1 - \exp(-\beta\hbar\omega)\right]^2}\frac{\hbar\omega}{k_B T^2}$$
$$= N\hbar\omega\,\frac{\exp(-\beta\hbar\omega)}{1 - \exp(-\beta\hbar\omega)} = \frac{N\hbar\omega}{\exp(\beta\hbar\omega) - 1} \equiv N\hbar\omega\,n(T), \qquad \text{where } n(T) \equiv \frac{1}{\exp(\hbar\omega/k_B T) - 1} \text{ is the occupation factor.}$$
At very high temperatures, $k_B T \gg \hbar\omega$, $\exp(\hbar\omega/k_B T) \approx 1 + (\hbar\omega/k_B T)$, so $n(T) \to k_B T/\hbar\omega$ and
$$U(T\gg 0) \to N k_B T \qquad \text{and} \qquad C_V(T\gg 0) \to N k_B.$$
Notice that these high-temperature values are exactly twice those found for the one-dimensional particle in a box, even though the energy states themselves are completely different from each other.
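
Here is a small check — my own sketch, in units where $\hbar\omega = k_B = 1$ — that the occupation-factor form of $U$ reproduces a direct numerical sum over the oscillator levels, and that $U \to N k_B T$ at high temperature.

import numpy as np

hbar_omega, kB, N = 1.0, 1.0, 1.0

def U_closed(T):
    # U = N * hbar*omega * n(T), with n(T) the occupation factor
    return N * hbar_omega / (np.exp(hbar_omega / (kB * T)) - 1.0)

def U_sum(T, nmax=2000):
    n = np.arange(nmax)
    w = np.exp(-n * hbar_omega / (kB * T))   # Boltzmann weights for eps_n = n*hbar*omega
    return N * np.sum(n * hbar_omega * w) / np.sum(w)

for T in [0.5, 1.0, 10.0, 100.0]:
    print(T, U_sum(T), U_closed(T))   # identical; approaches N*kB*T at large T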

6.2.5 Example: Harmonic Oscillator (3D)

By analogy with the three-dimensional box, the energy levels for the 3D harmonic oscillator are simply

$$\epsilon_{n_x,n_y,n_z} = \hbar\omega(n_x + n_y + n_z), \qquad n_x, n_y, n_z = 0, 1, 2, \dots.$$

Again, because the energies for each dimension are simply additive, the 3D partition function can be written as the product of three 1D partition functions, i.e. $Z_{3D} = (Z_{1D})^3$. Because almost all thermodynamic quantities are related to $\ln(Z_{3D}) = \ln(Z_{1D})^3 = 3\ln(Z_{1D})$, almost all quantities will simply be multiplied by a factor of 3. For example, at high temperature $U_{3D} = 3N k_B T = 3U_{1D}$ and $C_V(3D) = 3N k_B = 3C_V(1D)$.

One can think of atoms in a crystal as N point masses connected to each other with springs. To a first approximation, we can think of the system as N harmonic oscillators in three dimensions.

In fact, for most crystals, the specific heat is measured experimentally to be about $2.76 N k_B$ at room temperature, accounting for 92% of this simple classical picture. It is interesting to consider the expression for the specific heat at low temperatures. At low temperature, the mean energy goes to $U \to 3N\hbar\omega\exp(-\hbar\omega/k_B T)$, so that the specific heat approaches
$$C_V \to -\frac{3N\hbar\omega}{k_B T^2}(-\hbar\omega)\exp\left(-\frac{\hbar\omega}{k_B T}\right) = 3N k_B\left(\frac{\hbar\omega}{k_B T}\right)^2\exp\left(-\frac{\hbar\omega}{k_B T}\right).$$
This expression was first derived by Einstein, and shows that the specific heat falls off exponentially at low temperature. It provided a tremendous boost to the field of statistical mechanics, because it was fully consistent with experimental observations of the day. Unfortunately, it turns out to be wrong: better experiments revealed that $C_V \propto T^3$ at low temperatures, not exponential. This is because the atoms are not independent oscillators, but rather coupled oscillators, and the low-lying excitations are travelling lattice vibrations (now known as phonons). Actually, even $C_V \propto T^3$ is wrong at very low temperatures! The electrons that can travel around in crystals also contribute to the specific heat, so in fact $C_V(T\to 0) \propto T$.

6.2.6 Example: The rotor

Now let's consider the energies associated with rotation. In classical mechanics, the rotational kinetic energy is
$$T = \frac{1}{2}\,\boldsymbol{\omega}\cdot\mathbf{I}\cdot\boldsymbol{\omega},$$
where $\mathbf{I}$ is the moment of inertia tensor and $\boldsymbol{\omega}$ is the angular velocity vector. In the frame of the inertial ellipsoid, this can be rewritten
$$T = \frac{L_x^2}{2I_{xx}} + \frac{L_y^2}{2I_{yy}} + \frac{L_z^2}{2I_{zz}},$$
where $L_j$ is the angular momentum along direction $\hat{j}$ and $I_{jj}$ is the corresponding moment of inertia. Suppose that we have a spherical top, so that $I_{xx} = I_{yy} = I_{zz} = I$:

$$T = \frac{1}{2I}\left(L_x^2 + L_y^2 + L_z^2\right) = \frac{L^2}{2I}.$$
In the quantum version, the kinetic energy is almost identical, except now the angular momentum is an operator, denoted by a little hat:
$$T = \frac{\hat{L}^2}{2I}.$$
The eigenvalues of this operator are $\ell(\ell+1)\hbar^2/2I$, where $\ell = -L, -L+1, -L+2, \dots, L-1, L$, so that $\ell$ can take one of $2L+1$ possible values.

For a linear molecule (linear top), the partition function for the rotor can then be written as

$$Z = \sum_{L=0}^{\infty}\sum_{\ell=-L}^{L}\exp\left[-\frac{\ell(\ell+1)\hbar^2}{2I k_B T}\right] \approx \sum_{L=0}^{\infty}(2L+1)\exp\left[-\frac{L(L+1)\hbar^2}{2I k_B T}\right],$$
where the second form assumes that the contributions from the different $\ell$ values are more or less equal. This assumption should be pretty good at high temperatures, where the argument of the exponential is small. In this case, there are simply $2L+1$ terms for each value of $L$. Again, because we are at high temperatures the discrete nature of the eigenvalues is not important, so we can approximate the sum by an integral:

$$Z \approx \int_0^{\infty}(2L+1)\exp\left[-\frac{L(L+1)\hbar^2}{2I k_B T}\right]dL.$$
We can make the substitution $x = L(L+1)$ so that $dx = (2L+1)\,dL$, which is just the term already in the integrand. So the partition function becomes

$$Z = \int_0^{\infty}\exp\left(-\frac{\hbar^2 x}{2I k_B T}\right)dx = \frac{2I k_B T}{\hbar^2}.$$
Again, I can calculate the mean energy

$$U = N k_B T^2\frac{\partial\ln(Z)}{\partial T} = \frac{N k_B T^2}{Z}\frac{\partial Z}{\partial T} = N k_B T^2\,\frac{\hbar^2}{2I k_B T}\,\frac{2I k_B}{\hbar^2} = N k_B T.$$
This is exactly the contribution that we expected from the equipartition theorem: there are two ways the linear top can rotate, so there should be two factors of $(1/2)N k_B T$ contributing to the energy.
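
The replacement of the rotational sum by an integral can be checked numerically. The sketch below is mine, not from the notes; it introduces a convenient rotational temperature $\theta_r \equiv \hbar^2/(2I k_B)$ (my notation) and compares the discrete sum $\sum_L(2L+1)\exp[-L(L+1)\theta_r/T]$ with the high-temperature result $2I k_B T/\hbar^2 = T/\theta_r$.

import numpy as np

def Z_rotor_sum(T, theta_r, Lmax=2000):
    L = np.arange(Lmax)
    return np.sum((2 * L + 1) * np.exp(-L * (L + 1) * theta_r / T))

def Z_rotor_integral(T, theta_r):
    return T / theta_r        # = 2 I kB T / hbar^2

theta_r = 1.0   # rotational temperature hbar^2/(2 I kB), arbitrary units
for T in [2.0, 10.0, 100.0]:
    print(T, Z_rotor_sum(T, theta_r), Z_rotor_integral(T, theta_r))
# The sum approaches T/theta_r as T/theta_r grows, i.e. the integral
# approximation becomes excellent at high temperature.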

For a spherical top, each of the energy levels is $(2L+1)$-fold degenerate. The partition function for the rotor can then be written as

$$Z = \sum_{L=0}^{\infty}\sum_{\ell=-L}^{L}(2L+1)\exp\left[-\frac{\ell(\ell+1)\hbar^2}{2I k_B T}\right] \approx \sum_{L=0}^{\infty}(2L+1)^2\exp\left[-\frac{L(L+1)\hbar^2}{2I k_B T}\right],$$
where the second form assumes that the contributions from the different $\ell$ values are more or less equal. This assumption should be pretty good at high temperatures, where the argument of the exponential is small. In this case, there are simply $(2L+1)^2$ terms for each value of $L$. Again, because we are at high temperatures so that the discrete nature of the eigenvalues is not important, we can approximate the sum by an integral:

$$Z \approx \int_0^{\infty}(2L+1)^2\exp\left[-\frac{L(L+1)\hbar^2}{2I k_B T}\right]dL.$$
At high temperatures, one needs large values of $L$ before the argument of the exponentials becomes significant, so it is reasonable to make the substitutions $L(L+1) \to L^2$ and $(2L+1)^2 \to 4L^2$. This yields
$$Z = \int_0^{\infty}4L^2\exp\left(-\frac{\hbar^2 L^2}{2I k_B T}\right)dL = \sqrt{\pi}\left(\frac{2I k_B T}{\hbar^2}\right)^{3/2}.$$
Again, I can calculate the mean energy

$$U = N k_B T^2\frac{\partial\ln(Z)}{\partial T} = \frac{N k_B T^2}{Z}\frac{\partial Z}{\partial T} = N k_B T^2\,\frac{1}{\sqrt{\pi}}\left(\frac{\hbar^2}{2I k_B T}\right)^{3/2}\sqrt{\pi}\,\frac{3}{2}\left(\frac{2I k_B}{\hbar^2}\right)^{3/2}T^{1/2} = \frac{3}{2}N k_B T.$$
For the spherical top, there are now three contributions to the total energy, which accounts for the additional factor of $(1/2)N k_B T$ over the linear top.

6.3 The Equipartition Theorem (reprise)

The examples presented in the previous section show that the hightemperature limits of the mean energy for the particleinthebox and harmonic oscillator problems were very similar: the mean energies were all some multiple of NkBT/2 and the specific heat some multiples of NkB/2. Notice that for both the particle in the box and the harmonic oscillator, both quantities were three times larger going from 1D to 3D, i.e. where the number of degrees of freedom increased by a factor of 3. Perhaps U and CV at high temperatures provide some measure of the number of degrees of freedom of the particles in a given system.

To make further progress with this idea, we need to revisit the hokey derivation of the equation of state for an ideal gas presented at the end of Section 2.4.1. Recall that, in order to enumerate all the accessible states in a volume $V$, we subdivided the volume into little 'volumelets' of size $\Delta V$, each of which defined an accessible site that a particle can occupy. Then we stated that the entropy was given by $S = k_B\ln(V/\Delta V)$. But quantum mechanics tells us that we can't know both the exact position and momentum of a particle at the same time. The relationship between the uncertainty in the position $\Delta x$ and the momentum $\Delta p$ is quantified in the Heisenberg uncertainty principle $\Delta x\,\Delta p \geq h$, where $h$ again is Planck's constant. For our purposes, this means that Planck's constant sets a fundamental limit on the size of the volumelets one can partition the system into. In a sense, the Bohr-Sommerfeld quantization condition already stated this: the integral of momentum over space is minimally Planck's constant.

Mathematically, this means that we can write the general partition function in one dimension as a continuous function
$$Z_{1D} \to \frac{1}{h}\int_{-\infty}^{\infty}dx\int_{-\infty}^{\infty}dp_x\exp\left[-\frac{\epsilon(x,p_x)}{k_B T}\right],$$
where the accessible classical states $\epsilon$ are explicitly assumed to be functions of both position $x$ and momentum $p_x$. In three dimensions, it would be
$$Z_{3D} \to \frac{1}{h^3}\int d^3r\int d^3p\,\exp\left[-\frac{\epsilon(\mathbf{r},\mathbf{p})}{k_B T}\right].$$
The six dimensions $(\mathbf{r},\mathbf{p})$ together constitute phase space, and the two three-dimensional integrals are denoted phase-space integrals.

To make these ideas more concrete, let's again calculate the partition function for a particle in a 1D box of length $L$. The classical energy is entirely kinetic, so $\epsilon(p) = p^2/2m$, and

$$Z_{\rm box} = \frac{1}{h}\int_{-L/2}^{L/2}dx\int_{-\infty}^{\infty}dp\,\exp\left(-\frac{p^2}{2m k_B T}\right) = \frac{L}{h}\sqrt{2\pi m k_B T} = L\sqrt{\frac{m k_B T}{2\pi\hbar^2}} = \frac{L}{\lambda_D},$$
consistent with Eq. (6.15). As found previously, $U(T\gg 0) \to N k_B T/2$ and $C_V(T\gg 0) \to N k_B/2$. Let's also consider the 1D harmonic oscillator, for which the total energy is $\epsilon(x,p) = p^2/2m + m\omega^2 x^2/2$. Now the partition function is

$$Z_{\rm h.o.} = \frac{1}{h}\int_{-\infty}^{\infty}dx\int_{-\infty}^{\infty}dp\,\exp\left(-\frac{\frac{p^2}{2m} + \frac{m\omega^2 x^2}{2}}{k_B T}\right) = \frac{1}{h}\int_{-\infty}^{\infty}dx\,\exp\left(-\frac{m\omega^2 x^2}{2k_B T}\right)\int_{-\infty}^{\infty}dp\,\exp\left(-\frac{p^2}{2m k_B T}\right).$$

The second one of these we already did, giving us L/λD. The first one is equally easy to integrate, since it’s also a Gaussian integrand. So we obtain altogether

$$Z_{\rm h.o.} = \frac{1}{h}\sqrt{\frac{2\pi k_B T}{m\omega^2}}\sqrt{2\pi m k_B T} = \frac{2\pi k_B T}{h\omega} = \frac{k_B T}{\hbar\omega},$$
which is identical to the high-temperature limit of the 1D harmonic oscillator partition function (6.16):
$$\lim_{T\gg 0}\frac{1}{1 - \exp(-\beta\hbar\omega)} \approx \frac{1}{1 - (1 - \beta\hbar\omega)} = \frac{k_B T}{\hbar\omega}.$$
The mean energy in this limit is

$$U(T\gg 0) = \frac{N k_B T^2}{Z}\frac{\partial Z}{\partial T} \approx N k_B T^2\,\frac{\hbar\omega}{k_B T}\,\frac{k_B}{\hbar\omega} = N k_B T.$$

But we already know that the second of the two integrals contributed $N k_B T/2$ to the mean energy $U$ at high temperature, because this was the result for the particle in the 1D box. This means that the first integral also contributed $N k_B T/2$ to the mean energy. This is very much analogous to each spatial integral contributing a factor of $N k_B T/2$ to the mean energy in going from 1D to 3D. If you think of the kinetic and potential energy terms in the classical expression for the energy as each standing for one degree of freedom, then one can finally state the Equipartition Theorem: each degree of freedom that contributes a term quadratic in position or momentum to the classical single-particle energy contributes an average energy of $k_B T/2$ per particle. Many examples showing how one can predict the value of $U$ at high temperatures will be covered in class.
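
The equipartition statement is easy to verify directly from the classical Boltzmann weight. The sketch below is my own (with $m = \omega = k_B = 1$): it samples $x$ and $p$ for the 1D harmonic oscillator from the Gaussian factors of $\exp[-\epsilon(x,p)/k_B T]$ and confirms that each quadratic degree of freedom carries $k_B T/2$ on average.

import numpy as np

m, omega, kB, T = 1.0, 1.0, 1.0, 2.0
rng = np.random.default_rng(0)

# The classical weight exp(-eps(x,p)/kB T) factorizes into two Gaussians.
x = rng.normal(0.0, np.sqrt(kB * T / (m * omega**2)), size=1_000_000)
p = rng.normal(0.0, np.sqrt(m * kB * T), size=1_000_000)

print("<p^2/2m>        =", np.mean(p**2 / (2 * m)))             # ~ kB*T/2 = 1.0
print("<m w^2 x^2 / 2> =", np.mean(0.5 * m * omega**2 * x**2))  # ~ kB*T/2 = 1.0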

6.3.1 Density of States

Recall that for a particle in a 3D box, the energy levels are given by

$$\epsilon_n = \frac{\hbar^2\pi^2}{2mL^2}\left(n_x^2 + n_y^2 + n_z^2\right).$$
We can think of the energy levels as giving us the coordinates of objects on the surface of a sphere in Cartesian coordinates, except only in the first octant (because $n_x, n_y, n_z \geq 0$). So, instead, we can think of the energies as continuous, $\epsilon = \gamma n^2$, where $\gamma = \hbar^2\pi^2/2mL^2$ and $n$ is the length of the vector in 'energy space.' So, $n = \sqrt{\epsilon/\gamma}$ and therefore

p 1 dǫ dn = . 2√γ √ǫ In spherical coordinates, we can write 1 dn = d3n = dn dn dn = n2 sin(θ)dndθdφ 4πn2dn (integrating over angles) x y z → 8   π ǫ 1 dǫ π√ǫdǫ = = g(ǫ)dǫ. 2 γ 2√γ √ǫ 4γ3/2 ≡   PHYS 449 Course Notes 2009 70

So, the density of states per unit energy $g(\epsilon)$ for a particle in a 3D box is given by
$$g(\epsilon) \equiv \frac{dN}{d\epsilon} = \frac{\pi}{4\gamma^{3/2}}\sqrt{\epsilon} = \frac{m\sqrt{2m}}{2\hbar^3\pi^2}\,V\sqrt{\epsilon} = \frac{V(2m)^{3/2}}{4\hbar^3\pi^2}\sqrt{\epsilon},$$
where in the last step I have put back in the explicit form for $\gamma$. The most important thing to notice is that the density of states per unit energy for the 3D box goes like $\sqrt{\epsilon}$.

If you found this derivation confusing, here's another one. One way to think about the energy levels for a particle in a box is that they are functions of $k$, which is proportional to the particle momentum (the Fourier transform of the coordinate):
$$\epsilon_k = \frac{\hbar^2}{2m}\left(\frac{\pi n}{L}\right)^2 \equiv \frac{\hbar^2 k^2}{2m},$$
which is just the usual expression for the kinetic energy if you recognize that $p \equiv \hbar k$. This is known as the free-particle dispersion relation. Now, the energy sphere is in 'k-space,' or 'Fourier space.' Then the density of states is the number of single-particle states in a volume element in k-space, times the density of points:
$$g(\epsilon)\,d\epsilon = \left(\frac{L^3}{\pi^3}\right)k^2\sin(\theta)\,d\theta\,d\phi\,dk.$$
Integrating over the angles (only the first octant contributes) gives
$$g(\epsilon)\,d\epsilon = \frac{V}{2\pi^2}k^2\,dk \qquad \Rightarrow \qquad g_{3D}(\epsilon) = \frac{V}{2\pi^2}k^2\frac{dk}{d\epsilon}$$
for the 3D box. Now, $\epsilon = \hbar^2 k^2/2m$ so $k = \sqrt{2m\epsilon}/\hbar$. Putting it together we obtain
$$g(\epsilon)\,d\epsilon = \frac{V}{2\pi^2}k^2\,dk = \frac{V}{2\pi^2}\,\frac{2m\epsilon}{\hbar^2}\,\frac{\sqrt{2m}}{\hbar}\,\frac{1}{2\sqrt{\epsilon}}\,d\epsilon = \frac{V(2m)^{3/2}\sqrt{\epsilon}}{4\pi^2\hbar^3}\,d\epsilon,$$
as we obtained above.

It is straightforward to generalize the 3D box to the 2D and 1D cases. Going through the same procedure gives
$$g_{2D}(\epsilon) = \frac{Ak}{2\pi}\frac{dk}{d\epsilon}; \qquad g_{1D}(\epsilon) = \frac{L}{\pi}\frac{dk}{d\epsilon}.$$
Explicitly using the free-particle dispersion relation as above shows that the density of states per unit volume is a constant (independent of energy) in 2D, and goes like $1/\sqrt{\epsilon}$ in 1D.

What is all this helping us to do? Let’s consider how many states are occupied at room temperature. The total number of states G(ǫ) is the integral of the density of states up to energy state ǫ:

$$G(\epsilon) = \int_0^{\epsilon}g(\epsilon')\,d\epsilon' = \frac{V(2m)^{3/2}}{4\pi^2\hbar^3}\int_0^{\epsilon}\sqrt{\epsilon'}\,d\epsilon' = \frac{V(2m)^{3/2}}{6\pi^2\hbar^3}\,\epsilon^{3/2} = \frac{\pi}{6}\left(\frac{2mL^2\epsilon}{\hbar^2\pi^2}\right)^{3/2}.$$
At room temperature, $\epsilon = \frac{3}{2}k_B T$ where $T = 300$ K. So this means that the total number of states accessible for nitrogen in a 1 m box at room temperature is approximately $G(\epsilon) \approx (\pi/6)(10^{20})^{3/2} \sim 10^{30}$. But the density of air at sea level is about 1.25 kg/m$^3$, or about $10^{25}$ molecules per m$^3$. So even with these rough estimates it is clear that there are far more states accessible to the molecules in the air at room temperature than there are particles. Alternatively, the probability that a given energy state has a particle in it is something like $10^{-5}$.
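
The $\sqrt{\epsilon}$ density of states can also be verified by brute force: count the triples $(n_x, n_y, n_z)$ with $\gamma(n_x^2 + n_y^2 + n_z^2) \leq \epsilon$ and compare with $G(\epsilon) = (\pi/6)(\epsilon/\gamma)^{3/2}$. A minimal sketch of mine (with $\gamma = 1$ for convenience and $\epsilon$ small enough that the direct count is feasible):

import numpy as np

gamma = 1.0
eps_max = 2000.0          # count all states with energy <= eps_max (in units of gamma)

nmax = int(np.sqrt(eps_max / gamma)) + 1
n = np.arange(nmax + 1)
nx, ny, nz = np.meshgrid(n, n, n, indexing="ij")
energies = gamma * (nx**2 + ny**2 + nz**2)

G_count = np.count_nonzero(energies <= eps_max)       # direct count of states
G_formula = (np.pi / 6.0) * (eps_max / gamma) ** 1.5  # continuum estimate
print(G_count, G_formula)
# The direct count is slightly larger (a few percent here) because of states
# on the coordinate planes, a surface correction that shrinks as eps grows.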

6.4 The Maxwell Speed Distribution

Let's make the assumption that the energy levels are so closely spaced that we are left with a more-or-less continuous distribution of them, i.e.

$$N = \sum_i n_i \;\Rightarrow\; \int d\epsilon\,n(\epsilon) = \int d\epsilon\,g(\epsilon)\,A\exp(-\beta\epsilon) = \frac{A(2m)^{3/2}V}{4\hbar^3\pi^2}\int_0^{\infty}d\epsilon\,\sqrt{\epsilon}\exp(-\beta\epsilon), \qquad \text{in three dimensions.}$$
Because $Z = N/A$, this expression also quickly yields $Z = V/\lambda_D^3$ in the usual way. Alternatively, we can set $A = N/Z$ and obtain

$$n(\epsilon) = A\,g(\epsilon)\exp(-\beta\epsilon) = \frac{N}{V}\left(\frac{2\pi\hbar^2}{m k_B T}\right)^{3/2}\frac{(2m)^{3/2}V}{4\hbar^3\pi^2}\sqrt{\epsilon}\exp(-\beta\epsilon) = \frac{2\pi\sqrt{\epsilon}\,N}{(\pi k_B T)^{3/2}}\exp(-\beta\epsilon).$$
Now suppose that the energy levels were simply classical kinetic energy states: $\epsilon = mv^2/2$, so that $\sqrt{\epsilon} = \sqrt{m/2}\,v$ and $d\epsilon = mv\,dv$. Then,

$$n(v)\,dv = N\sqrt{\frac{2}{\pi}}\left(\frac{m}{k_B T}\right)^{3/2}v^2\,dv\,\exp\left(-\frac{\beta m v^2}{2}\right).$$
This is the Maxwell distribution of velocities. The important thing to notice is that for small velocities the distribution increases quadratically, $n(v)\,dv \propto v^2$, while for large velocities it decreases exponentially, $n(v)\,dv \propto \exp(-\beta m v^2/2)$. This means that the distribution is not even, i.e. it is not symmetric around any given velocity. As shown below, this will have interesting consequences for the statistics.

Let's obtain the mean velocity $\bar{v}$ first:

$$\bar{v} = \frac{\int v\,n(v)\,dv}{\int n(v)\,dv} = \frac{\int_0^{\infty}v^3\exp\left(-\frac{\beta m v^2}{2}\right)dv}{\int_0^{\infty}v^2\exp\left(-\frac{\beta m v^2}{2}\right)dv},$$
where I've cancelled all the common constant terms. To turn these into integrals we can evaluate, let's set $x^2 \equiv \beta m v^2/2$, so that $v = x\sqrt{2k_B T/m}$. Then
$$\bar{v} = \sqrt{\frac{2k_B T}{m}}\,\frac{\int_0^{\infty}x^3\exp(-x^2)\,dx}{\int_0^{\infty}x^2\exp(-x^2)\,dx}.$$
Now it is very useful to know the following integrals:

$$\int_0^{\infty}x^{2n}\exp(-a^2x^2)\,dx = \frac{(2n)!\sqrt{\pi}}{n!\,2\,(2a)^{2n}\,a} \qquad \text{and} \qquad \int_0^{\infty}x^{2n+1}\exp(-a^2x^2)\,dx = \frac{n!}{2a^{2n+2}}.$$
In the current case, we have $a = 1$ and

$$\bar{v} = \sqrt{\frac{2k_B T}{m}}\,\frac{1/2}{\sqrt{\pi}/4} = \sqrt{\frac{8k_B T}{\pi m}} \approx 1.596\sqrt{\frac{k_B T}{m}}.$$

Now let's calculate the RMS velocity, $v_{\rm rms} \equiv \sqrt{\overline{v^2}}$:
$$\overline{v^2} = \frac{\int v^2\,n(v)\,dv}{\int n(v)\,dv} = \frac{\int_0^{\infty}v^4\exp\left(-\frac{\beta m v^2}{2}\right)dv}{\int_0^{\infty}v^2\exp\left(-\frac{\beta m v^2}{2}\right)dv} = \frac{2k_B T}{m}\,\frac{\int_0^{\infty}x^4\exp(-x^2)\,dx}{\int_0^{\infty}x^2\exp(-x^2)\,dx} = \frac{2k_B T}{m}\,\frac{3\sqrt{\pi}/8}{\sqrt{\pi}/4} = \frac{3k_B T}{m}$$
$$\Rightarrow \qquad v_{\rm rms} = \sqrt{\frac{3k_B T}{m}} \approx 1.73\sqrt{\frac{k_B T}{m}}.$$

The most probable speed $\tilde{v}$ corresponds to the point at which the distribution is a maximum:

$$\frac{\partial}{\partial v}\left[N\sqrt{\frac{2}{\pi}}\left(\frac{m}{k_B T}\right)^{3/2}v^2\exp\left(-\frac{\beta m v^2}{2}\right)\right] = 0$$
$$\Rightarrow \qquad 2\tilde{v}\exp\left(-\frac{\beta m\tilde{v}^2}{2}\right) - \tilde{v}^2\,\frac{m\tilde{v}}{k_B T}\exp\left(-\frac{\beta m\tilde{v}^2}{2}\right) = 0 \qquad \Rightarrow \qquad \tilde{v} = \sqrt{\frac{2k_B T}{m}} \approx 1.414\sqrt{\frac{k_B T}{m}}.$$
The amazing thing about the Maxwell-Boltzmann distribution of velocities is that all the quantities $v_{\rm rms}$, $\bar{v}$, and $\tilde{v}$ are different. This is in contrast to the Gaussian distribution seen early on in the term. The various values are in the ratio

$$v_{\rm rms} : \bar{v} : \tilde{v} \approx 1.224 : 1.128 : 1.$$
Why do you think that $v_{\rm rms} > \bar{v} > \tilde{v}$?
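
These three characteristic speeds are easy to confirm by sampling. A small sketch of mine (with $m = k_B = T = 1$) draws velocity vectors from three independent Gaussians, forms the speeds, and compares $\bar{v}$, $v_{\rm rms}$, and the histogram peak with $\sqrt{8/\pi}$, $\sqrt{3}$, and $\sqrt{2}$.

import numpy as np

m, kB, T = 1.0, 1.0, 1.0
rng = np.random.default_rng(1)

# Each Cartesian component is Gaussian with variance kB*T/m; the speed is the norm.
v_xyz = rng.normal(0.0, np.sqrt(kB * T / m), size=(2_000_000, 3))
v = np.linalg.norm(v_xyz, axis=1)

v_mean = v.mean()                 # expect sqrt(8 kB T / pi m) ~ 1.596
v_rms = np.sqrt((v**2).mean())    # expect sqrt(3 kB T / m)    ~ 1.732
hist, edges = np.histogram(v, bins=200)
v_tilde = 0.5 * (edges[hist.argmax()] + edges[hist.argmax() + 1])  # expect sqrt(2) ~ 1.414

print(v_mean, v_rms, v_tilde)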

One factoid that you might find interesting is that the numbers you get for air are surprisingly large. Assuming that air is mostly nitrogen molecules, with $m = 4.65\times 10^{-26}$ kg, at a temperature of 273 K one obtains $\bar{v} = 454$ m/s and $v_{\rm rms} = 493$ m/s. But the speed of sound in air is 331 m/s at 273 K. So the molecules are moving considerably faster than the sound speed. Does this make sense?

There is an interesting application of this to the composition of the Earth's atmosphere. First, let's estimate the velocity a particle at the Earth's surface would need to fully escape the gravitational field. The balance of kinetic and potential energies implies $(1/2)mv^2 = mgR$, where $R$ is the radius of the Earth. (Actually I think that this implies that the particle is at the Earth's center, where all of its mass would be concentrated?) In any case, we obtain $v_{\rm escape} = \sqrt{2gR} \approx 11000$ m/s. The mean velocity for hydrogen molecules (mass of $1.66\times 10^{-27}$ kg) from the formula $\bar{v} = \sqrt{8k_B T/\pi m}$ is about 1600 m/s. But the Maxwell-Boltzmann velocity distribution has a very long tail at high velocities, which means that there are approximately $2\times 10^9$ hydrogen molecules that travel at more than 6 times the average velocity. So there are lots of hydrogen molecules escaping forever all the time. Thankfully, our supply is continually replenished by protons bombarding us from the sun. Much more serious is helium, which is being lost irretrievably, with no replenishing. In fact, the U.S. Government has been stockpiling huge reserves of liquid helium for years in preparation for a worldwide shortage. But over the past few years the current administration has softened its policy on helium conservation and these stockpiles are slowly dwindling. In any case, for oxygen and nitrogen molecules, whose mean velocities are about a factor of 4 lower than that of hydrogen, very few of these can actually escape. Whew!

6.4.1 Interlude on Averages

When the various averages were calculated above, we made explicit use of the particular form of the energy, $\epsilon = mv^2/2$. Of course, if the energy levels are different, or the dimension of the problem is not 3D like it was above, then the way we take averages is going to be different. So how does one take averages in general using the canonical ensemble?

Recall that the general form for the average of something we can measure, call it B, is

$$\bar{B} = \sum_i p_i B_i,$$

$$p_i = \frac{g_i\exp(-\beta\epsilon_i)}{Z},$$
where $Z$ is the usual partition function, and I have explicitly inserted the degeneracy factor. So the average of $B$ is defined as

$$\bar{B} = \frac{\sum_i g_i B_i\exp(-\beta\epsilon_i)}{\sum_i g_i\exp(-\beta\epsilon_i)} \approx \frac{\int dk\,g(k)B(k)\exp\left[-\beta\epsilon(k)\right]}{\int dk\,g(k)\exp\left[-\beta\epsilon(k)\right]} = \frac{\int_0^{\infty}d\epsilon\,g(\epsilon)B(\epsilon)\exp(-\beta\epsilon)}{\int_0^{\infty}d\epsilon\,g(\epsilon)\exp(-\beta\epsilon)}.$$
Thus, the dimensionality, the dependence of the energy $\epsilon$ on the wavevector $k$ or velocity $v$, and the inherent degeneracy of a given energy level are all buried in the density of states per unit energy, $g(\epsilon)$. In order to calculate any average, this must be found first.

For example, suppose that our fundamental excitations were ripples on the surface of water, where $\epsilon(k) = \alpha k^{3/2}$. This is an effectively 2D system, so we use the expression for the 2D density of states,

$$g(\epsilon) = \frac{Ak}{2\pi}\frac{dk}{d\epsilon} = \frac{Ak}{2\pi}\frac{d}{d\epsilon}\left(\frac{\epsilon}{\alpha}\right)^{2/3} = \frac{A}{2\pi}\left(\frac{\epsilon}{\alpha}\right)^{2/3}\frac{2}{3\alpha^{2/3}\epsilon^{1/3}} = \frac{A}{3\pi\alpha^{4/3}}\,\epsilon^{1/3}.$$
So this is what we would use to evaluate averages. Of course, the constant terms would disappear, but the energy dependence of the density of states would not. For example, the mean energy per particle for this problem would be

$$\bar{U} = \frac{\int_0^{\infty}d\epsilon\,\epsilon^{4/3}\exp(-\beta\epsilon)}{\int_0^{\infty}d\epsilon\,\epsilon^{1/3}\exp(-\beta\epsilon)} = \frac{(k_B T)^{7/3}\,\Gamma\left(\frac{7}{3}\right)}{(k_B T)^{4/3}\,\Gamma\left(\frac{4}{3}\right)} = \frac{4k_B T}{3}.$$

6.4.2 Molecular Beams

One of these is operational in Nasser Moazzen-Ahmadi's lab, so it's good that you're learning about it! Suppose that we have an oven containing lots of hot molecules. There's a small hole in one end, out of which shoot the molecules. On the same table in front of the hole is a series of walls with small holes lined up horizontally with the exit hole of the oven. The idea here is that most of the molecules moving off the horizontal axis will hit the various walls, and only those molecules moving in a narrow cone around the horizontal axis will make it to the screen. The question is: what distribution of velocities do the molecules have that hit the screen?

Evidently, once the molecules leave the last pinhole, they spread out and form a cone whose base is area A on the screen. The number of molecules in the cone with velocities between v and v + dv, and in angles between θ and θ + dθ and between φ and φ + dφ is

$$\text{number of particles} = Avt\cos(\theta)\,n(v)\,dv\,\frac{d\Omega}{4\pi} = Avt\cos(\theta)\,n(v)\,dv\,\frac{\sin(\theta)\,d\theta\,d\phi}{4\pi}.$$

$$f(v)\,dv = \frac{v\,n(v)\,dv}{4\pi}\int_0^{\pi/2}d\theta\,\cos(\theta)\sin(\theta)\int_0^{2\pi}d\phi = \frac{v\,n(v)\,dv}{4}.$$
For the first integral, set $x = \sin(\theta)$ so that $dx = \cos(\theta)\,d\theta$. The integral is therefore $x^2/2 = \sin^2(\theta)/2\big|_0^{\pi/2} = 1/2$. And the second integral is $2\pi$. Using the Maxwell-Boltzmann distribution of velocities, we obtain the flux density as

$$f(v) = \frac{N\pi}{8}\left(\frac{2m}{\pi k_B T}\right)^{3/2}v^3\exp\left(-\frac{mv^2}{2k_B T}\right) = \frac{N\lambda_D^3 m^3}{8\pi^2\hbar^3}\,v^3\exp\left(-\frac{mv^2}{2k_B T}\right).$$

What is the point? I’m not really sure, actually. Sometimes it’s good to know the distribution of velocities hitting a screen to interpret results of an experiment. One thing we can do right now is to derive the equation of state for an ideal gas using it! A crude way to do this is to assume that every time the molecule strikes the surface with velocity v, and bounces off elastically, the screen picks up a momentum 2mv cos(θ), accounting for the angle off the horizontal axis. The mean pressure is therefore the integral over the pressure flux:

$$P = \frac{1}{4\pi V}\int_0^{\infty}dv\int_0^{\pi/2}d\theta\int_0^{2\pi}d\phi\;2mv\cos(\theta)\,v\cos(\theta)\,n(v)\sin(\theta)$$
$$= \frac{1}{3V}\int_0^{\infty}mv\,n(v)\,v\,dv = \frac{m}{3V}\int_0^{\infty}v^2 n(v)\,dv = \frac{Nm}{3V}\,\overline{v^2} = \frac{Nm}{3V}\,\frac{3k_B T}{m} = \frac{Nk_B T}{V}.$$
A slightly less hokey derivation (maybe) is to say that the pressure is the mean force along the horizontal per unit area:

$$P = \frac{F}{A} = \int dv\,f(v)\,m v_x^2 = \frac{N}{V}\,m\overline{v_x^2}.$$
But $\overline{v^2} = \overline{v_x^2} + \overline{v_y^2} + \overline{v_z^2} = 3\overline{v_x^2}$. So
$$P = \frac{Nm}{3V}\,\overline{v^2} = \frac{Nk_B T}{V}.$$

6.5 (Already covered in Sec. 6.2)

6.6 Gibbs' Paradox

It turns out that much of what we have done so far is fundamentally wrong. One of the first people to realize this was Gibbs (of Free Energy fame!), so it is called Gibbs’ Paradox. Basically, he showed that there was a problem with the definition of entropy that we have been using so far. The only way to resolve the paradox is using quantum mechanics, and we’ll cover this in the next chapter. Of course, we have been using quantum mechanics already in order to define the accessible energy levels that particles can occupy. But so far, we haven’t been concerned with the particles themselves. In this section we’ll see the paradox, and next chapter we’ll resolve it using quantum mechanics.

Recall that the entropy in the canonical ensemble is defined as

$$S = N k_B\ln(Z) + \frac{U}{T} = N k_B\ln(Z) + N k_B T\frac{\partial\ln(Z)}{\partial T}.$$
As shown in Section 6.2.3, the partition function for a monatomic ideal gas in a 3D box is simply $Z = V/\lambda_D^3$, where $V$ is the volume and $\lambda_D = \sqrt{2\pi\hbar^2/m k_B T}$ is the de Broglie wavelength. Plugging this in, we obtain
$$\ln(Z) = \ln(V) - \frac{3}{2}\ln\left(\frac{2\pi\hbar^2}{m k_B}\right) + \frac{3}{2}\ln(T).$$
Evidently, $\partial\ln(Z)/\partial T = (1/Z)\partial Z/\partial T = 3/2T$, so

$$S = N k_B\left[\ln(V) - \frac{3}{2}\ln\left(\frac{2\pi\hbar^2}{m k_B}\right) + \frac{3}{2}\ln(T) + \frac{3}{2}\right] \equiv N k_B\left[\ln(V) + \frac{3}{2}\ln(T) + \sigma\right],$$
where
$$\sigma = \frac{3}{2}\left[1 - \ln\left(\frac{2\pi\hbar^2}{m k_B}\right)\right]$$
is some constant we don't really care about. Now to the paradox.

Consider a box of volume V containing N atoms. Now, suppose that a barrier is inserted that divides the box into two regions of equal volume V/2, each containing N/2 atoms. Now the total entropy is the sum of the entropies for each region,

$$S = \frac{N}{2}k_B\left[\ln\left(\frac{V}{2}\right) + \frac{3}{2}\ln(T) + \sigma\right] + \frac{N}{2}k_B\left[\ln\left(\frac{V}{2}\right) + \frac{3}{2}\ln(T) + \sigma\right]$$
$$= N k_B\left[\ln\left(\frac{V}{2}\right) + \frac{3}{2}\ln(T) + \sigma\right] = N k_B\left[\ln(V) + \frac{3}{2}\ln(T) + \sigma\right] - N k_B\ln(2).$$
In other words, simply putting in a barrier seems to have reduced the entropy of the particles in the box by $N k_B\ln(2)$. Recall that this is the same as the high-temperature entropy for the two-level Pauli paramagnet (a.k.a. the coin). So it seems to suggest that the two sides of the partition, once the barrier is up, take the place of 'heads' or 'tails,' in that there are two kinds of states atoms can occupy: the left or right partitions. But here's the paradox: putting in the barrier hasn't done any work on the system, or added any heat, so the entropy should be invariant! And simply removing the barrier brings the entropy back where it was before. Simply preventing the particles in the left partition from accessing locations in the right partition (and vice versa) shouldn't change the entropy, because in reality you can't tell the difference between atoms on the left or on the right.

Clearly, we are treating particles on the left and right as distinguishable objects, which has given rise to this paradox. How can we fix things? The cheap fix is to realize that because all of the particles are fundamentally indistinguishable, the $N!$ permutations of the particles among themselves shouldn't lead to physically distinct realizations of the system. So, our calculation of $Z$ must be too large by a factor of $N!$:
$$Z_N \equiv Z_{\rm correct}^N = \frac{Z_{\rm old}^N}{N!},$$
where I explicitly make use of the fact that the $N$-particle partition function is the $N$-fold product of the single-particle partition function $Z_{\rm old}$. With this 'ad hoc' solution,

$$\ln(Z_N) = N\ln(Z_{\rm old}) - \ln(N!) = N\ln(Z_{\rm old}) - N\ln(N) + N.$$

$$S = N k_B\left[\ln(V) + \frac{3}{2}\ln(T) + \sigma - \ln(N) + 1\right] = N k_B\left[\ln\left(\frac{V}{N}\right) + \frac{3}{2}\ln(T) + \sigma + 1\right].$$
Now, if we partition the box into two regions of volume $V/2$, each with $N/2$ particles, then $\ln(V/N) \to \ln\left[(V/2)/(N/2)\right] = \ln(V/N)$ and the combined entropy of the two regions is identical to the original entropy. So Gibbs' paradox is resolved.
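
The resolution can be illustrated numerically: with the $N!$ correction the entropy is extensive, so splitting the gas into two half-boxes leaves the total unchanged, whereas the uncorrected expression loses $N k_B\ln 2$. A sketch of mine (the nitrogen-like mass and the values of $T$, $V$, $N$ are chosen purely for illustration):

import numpy as np

kB = 1.380649e-23       # J/K
hbar = 1.054571817e-34  # J*s
m = 4.65e-26            # kg, roughly an N2 molecule
T, V, N = 300.0, 1.0, 1.0e25

lam = np.sqrt(2 * np.pi * hbar**2 / (m * kB * T))   # de Broglie wavelength

def S_uncorrected(N, V):
    return N * kB * (np.log(V / lam**3) + 1.5)        # S = N kB [ln Z + 3/2], Z = V/lam^3

def S_corrected(N, V):
    return N * kB * (np.log(V / (N * lam**3)) + 2.5)  # Gibbs-corrected (extensive) form

for S in (S_uncorrected, S_corrected):
    whole = S(N, V)
    halves = 2 * S(N / 2, V / 2)
    print(S.__name__, (halves - whole) / (N * kB))    # -ln(2) without the correction, 0 with it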

The equations for the mean energy, entropy, and free energy in terms of the proper partition function $Z_N$ are now
$$U = k_B T^2\frac{\partial\ln(Z_N)}{\partial T}; \qquad S = k_B\ln(Z_N) + k_B T\frac{\partial\ln(Z_N)}{\partial T}; \qquad F = -k_B T\ln(Z_N).$$
That is, the expressions are identical to the ones we've seen before, except the factors of $N$ in front have disappeared, and $Z$ is replaced by $Z_N$.

The justification for the replacement of $Z^N$ by $Z^N/N!$ was pretty hokey, however. Most of the next chapter will be devoted to a formal quantum mechanical justification for it.

Chapter 7

Grand Canonical Ensemble

In the canonical ensemble, we considered a small subsystem with heat-conducting walls, in contact with a gigantic system at equilibrium with itself. This gigantic system could be considered as a heat bath, or reservoir, because if the small system's temperature was lower (or higher), heat could flow into (or out of) the subsystem out of (or into) the reservoir, and it wouldn't affect the temperature of the reservoir at all. Now suppose that we draw only imaginary walls between our subsystem and the reservoir, so that not only heat but also particles can flow between them. How do we properly describe the subsystem, i.e. how do we enumerate the number of ways the particles can occupy the accessible (energy) quantum states, if both the energy and the particle number are fluctuating? The answer is that we must now use the Grand Canonical Ensemble.

7.1 Chemical Potential Again

Now let’s return to the particle in the 3D box. Including the Gibbs’ correction, the entropy is

$$S = N k_B\left[\ln\left(\frac{V}{N}\right) + \frac{3}{2}\ln\left(\frac{m k_B T}{2\pi\hbar^2}\right) + \frac{5}{2}\right].$$
Now, $U = \frac{3}{2}N k_B T$, so $k_B T = 2U/3N$. Inserting this into the above expression for the entropy gives
$$S = N k_B\left[\ln\left(\frac{V}{N}\right) + \frac{3}{2}\ln\left(\frac{mU}{3\pi\hbar^2 N}\right) + \frac{5}{2}\right].$$
Now we can take the derivative with respect to $N$ to obtain the chemical potential:
$$\mu = -k_B T\left[\ln\left(\frac{V}{N}\right) + \frac{3}{2}\ln\left(\frac{mU}{3\pi\hbar^2 N}\right) + \frac{5}{2}\right] + k_B T\left(1 + \frac{3}{2}\right)$$
$$= -k_B T\left[\ln\left(\frac{V}{N}\right) + \frac{3}{2}\ln\left(\frac{m k_B T}{2\pi\hbar^2}\right)\right] = k_B T\ln\left(n\lambda_D^3\right),$$
where $n = N/V$ is the density. Note that the subsystem stops conserving the number of particles when $n\lambda_D^3 = 1$, or when the mean interparticle separation $\ell \equiv n^{-1/3}$ becomes comparable to the de Broglie wavelength, $\ell \sim \lambda_D$. But we already know that something quantum happens on the length scale of the de Broglie wavelength. We'll come back to this a bit later. For very classical systems,


$\ell \gg \lambda_D$, which means that the chemical potential is usually large and negative for the particle in the 3D box.
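
To get a feel for the numbers, here is a short sketch of mine (standard constants, a nitrogen-like mass, room temperature, and an atmospheric-like number density — all chosen only for illustration) evaluating $\mu = k_B T\ln(n\lambda_D^3)$; it indeed comes out large and negative on the scale of $k_B T$.

import numpy as np

kB = 1.380649e-23       # J/K
hbar = 1.054571817e-34  # J*s
m = 4.65e-26            # kg (roughly N2)
T = 300.0               # K
n = 2.5e25              # m^-3, about the number density of air at sea level

lam = np.sqrt(2 * np.pi * hbar**2 / (m * kB * T))   # de Broglie wavelength
mu = kB * T * np.log(n * lam**3)                    # classical chemical potential

print("lambda_D    =", lam, "m")
print("n*lambda^3  =", n * lam**3)     # ~1e-7: deep in the classical regime
print("mu / (kB T) =", mu / (kB * T))  # ~ -16, i.e. large and negative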

How can you calculate the chemical potential from the free energy? I'm sure that you were burning to find this out. So let's do it. The free energy is $F = U - TS$, so $dF = dU - T\,dS - S\,dT = T\,dS - P\,dV + \mu\,dN - T\,dS - S\,dT$, so
$$\mu = \frac{\partial F}{\partial N} = -k_B T\frac{\partial\ln(Z_N)}{\partial N}.$$

Let's check that this gives the right answer for the particle in the 3D box. In that case, $Z_N = (V/\lambda_D^3)^N/N!$, so $\ln(Z_N) = N\ln(V/\lambda_D^3) - N\ln(N) + N$. So $\partial\ln(Z_N)/\partial N = \ln(V/\lambda_D^3) - \ln(N) - 1 + 1$, or $\mu = k_B T\ln(n\lambda_D^3)$. Suppose that the energies of the particles were all larger by a constant amount $\Delta$, so that $\epsilon_k = (\hbar^2 k^2/2m) + \Delta$. Then

$$Z = \sum_k\exp\left[-\beta\left(\frac{\hbar^2 k^2}{2m} + \Delta\right)\right] = \exp(-\beta\Delta)\sum_k\exp\left(-\frac{\beta\hbar^2 k^2}{2m}\right) = \frac{V}{\lambda_D^3}\exp(-\beta\Delta).$$
So, the chemical potential is now $\mu = k_B T\ln(n\lambda_D^3) + \Delta$. Clearly, the shift in the energies has led to exactly the same shift in the chemical potential. This additional piece can be thought of as an 'external chemical potential' that increases the total value. If the shift had depended on position or momentum, though, then we would have needed to use the equipartition theorem to evaluate the contribution of the additional piece to the chemical potential. Also, if we had included vibration and rotation for molecules, say, then the chemical potential would have increased as well. These would be 'internal chemical potential' contributions. Would these tend to increase or decrease the chemical potential?

7.2 Grand Partition Function

Recall that in the derivation of the Boltzmann distribution in the canonical ensemble, we maximized the entropy (or the number of microstates) subject to the two constraints $N = \sum_i n_i$ and $U = \sum_i n_i\epsilon_i$. So we had
$$d\ln(\Omega) + \alpha\left(dN - \sum_i dn_i\right) + \beta\left(dU - \sum_i\epsilon_i\,dn_i\right) = 0,$$
where $\alpha$ and $\beta$ were the Lagrange multipliers fixing the two constraints (alternatively, they are unknown multiplicative constants in front of terms that are zero). So we immediately have the following relations:
$$\alpha = \frac{\partial\ln(\Omega)}{\partial N}; \qquad \beta = \frac{\partial\ln(\Omega)}{\partial U}.$$
But above we have $\partial S/\partial N = -\mu/T = k_B(\partial\ln(\Omega)/\partial N)$ because $S = k_B\ln(\Omega)$. So we immediately obtain $\alpha = -\mu/k_B T = -\beta\mu$. So the chemical potential is in fact the Lagrange multiplier that fixes the number of particles in the total system. Putting these together, we have
$$\frac{\partial\ln(\Omega)}{\partial N} = -\frac{\mu}{k_B T}; \qquad \frac{\partial\ln(\Omega)}{\partial U} = \frac{1}{k_B T}.$$

$$\Rightarrow\quad \ln(\Omega) = \beta U - \beta\mu N + \text{const.} \quad\Rightarrow\quad \Omega = A\exp\left[\beta(U - \mu N)\right],$$
where $A = \exp(\text{const.})$.

So far, this enumeration has been for the reservoir, which is what fixes the temperature. To find $\Omega$ for the subsystem, we note that $U_R = U - E_s$ and $N_R = N - N_s$. So
$$\Omega_R = A\exp\left[\beta(U_R - \mu N_R)\right] = A\exp\left[\beta(U - E_s) - \beta\mu(N - N_s)\right] = A\exp\left[\beta(U - \mu N)\right]\exp\left[-\beta(E_s - \mu N_s)\right] = \Omega_{\rm system}\,\Omega_{\rm subsystem}.$$
Finally we obtain
$$\Omega_s = A\exp\left[-\beta(E_s - \mu N_s)\right].$$
The physical system is actually comprised of a very large number of these subsystems, all of which have the same temperature and chemical potential at equilibrium, but all of which have different energies $E_i$ and numbers of particles $N_i$. The total number of ways of distributing the particles is therefore the sum of all of the subsystems' contributions, $W = \sum_i\Omega_i$. So the probability of occupying a given subsystem is the fraction of the distribution of a given subsystem over all of them,
$$p_i = \frac{\Omega_i}{W} = \frac{\exp\left[-\beta(E_i - \mu N_i)\right]}{\sum_i\exp\left[-\beta(E_i - \mu N_i)\right]} \equiv \frac{\exp\left[-\beta(E_i - \mu N_i)\right]}{\Xi},$$
where $\Xi$ is the grand partition function,

$$\Xi \equiv \sum_i\exp\left[-\beta(E_i - \mu N_i)\right].$$

Examples

It is important to note that the way one obtains the grand partition function is different from the way it was done in the canonical ensemble. Suppose that the subsystem contains three accessible energy levels 0, $\epsilon$, and $2\epsilon$. Now suppose that there is only one atom in the total system. How many ways can I distribute atoms in my energy states? I might have no atoms in any energy state ($E_s = 0$, $N_s = 0$). I might have one atom in state 0 ($E_s = 0$, $N_s = 1$). I might have one atom in state $\epsilon$ ($E_s = \epsilon$, $N_s = 1$), or I might have one atom in state $2\epsilon$ ($E_s = 2\epsilon$, $N_s = 1$). So my grand partition function is:

$$\Xi = \exp\left[-\beta(0 - 0\mu)\right] + \exp\left[-\beta(0 - 1\mu)\right] + \exp\left[-\beta(\epsilon - 1\mu)\right] + \exp\left[-\beta(2\epsilon - 1\mu)\right]$$
$$= 1 + \exp(\beta\mu)\left[1 + \exp(-\beta\epsilon) + \exp(-2\beta\epsilon)\right].$$
Suppose instead that there are an unknown number of particles in the total system, but that each of these three energy levels in my subsystem can only accommodate up to two particles. Then the grand partition function is

$$\Xi = 1 + \exp(\beta\mu)\left[1 + \exp(-\beta\epsilon) + \exp(-2\beta\epsilon)\right] + \exp(2\beta\mu)\left[1 + \exp(-\beta\epsilon) + 2\exp(-2\beta\epsilon) + \exp(-3\beta\epsilon) + \exp(-4\beta\epsilon)\right],$$
where in the second term I have recognized that with two particles we can have $E_s = 0$ (both in state 0), $E_s = \epsilon$ (one in state 0, the other in state $\epsilon$), $E_s = 2\epsilon$ (either both in state $\epsilon$, or one in state 0

while the other is in state $2\epsilon$, thus the factor of two in front of this term), $E_s = 3\epsilon$ (one in state $\epsilon$, the other in state $2\epsilon$), and $E_s = 4\epsilon$ (both in state $2\epsilon$).

Now suppose that the number of particles in our system is totally unknown, and also there is no restriction on the number of particles that can exist in a particular energy level. Then the grand partition function is

$$\Xi = 1 + \exp(\beta\mu)\left[1 + \exp(-\beta\epsilon) + \exp(-2\beta\epsilon)\right] + \exp(2\beta\mu)\left[1 + \exp(-\beta\epsilon) + \exp(-2\beta\epsilon)\right]^2 + \dots$$
$$= 1 + e^{\beta\mu}Z + e^{2\beta\mu}Z^2 + \dots = \sum_{n=0}^{\infty}\left(e^{\beta\mu}Z\right)^n = \frac{1}{1 - e^{\beta\mu}Z},$$
where I have used Euler's solution for the geometric series in the last line. In other words, the grand canonical ensemble for the fully unrestricted case is simply a linear combination of the canonical partition functions for no particles, one particle, two particles, etc., suitably weighted by their fugacities. In fact, this classical grand partition function leads to slightly paradoxical results, which will be discussed at length next term. Anyhow, that's how one constructs the grand partition function in practice!
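
The one-atom example above is easy to reproduce by brute force. The following sketch is my own ($\epsilon$, $\mu$, and $T$ are arbitrary test values): it enumerates the allowed $(E_s, N_s)$ microstates for at most one particle in the three levels 0, $\epsilon$, $2\epsilon$, sums $\exp[-\beta(E_s - \mu N_s)]$, and recovers the closed form $1 + e^{\beta\mu}(1 + e^{-\beta\epsilon} + e^{-2\beta\epsilon})$.

import numpy as np

eps, mu, T, kB = 1.0, -0.5, 1.0, 1.0
beta = 1.0 / (kB * T)
levels = [0.0, eps, 2 * eps]

# Microstates with at most one particle: the empty state plus one particle in each level.
states = [(0.0, 0)] + [(E, 1) for E in levels]   # (E_s, N_s) pairs
Xi_enumerated = sum(np.exp(-beta * (E - mu * N)) for E, N in states)

Xi_closed = 1.0 + np.exp(beta * mu) * sum(np.exp(-beta * E) for E in levels)
print(Xi_enumerated, Xi_closed)   # identical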

7.3 Grand Potential

As we did for the canonical ensemble, we can obtain thermodynamic quantities using the grand partition function instead of the regular partition function. The entropy is defined as

$$S = -k_B\sum_i p_i\ln(p_i) = -k_B\sum_i p_i\left[-\beta(E_i - \mu N_i) - \ln(\Xi)\right] = \frac{U}{T} - \frac{\mu N}{T} + k_B\ln(\Xi),$$
where $U = \sum_i p_i E_i$ and $N = \sum_i p_i N_i$ are the mean energy and mean particle number for the subsystem, respectively. This expression can be inverted to yield
$$U - \mu N - TS = -k_B T\ln(\Xi) \equiv \Phi_G,$$
where $\Phi_G$ is the grand potential. Sometimes this is also written as $\Omega_G$ in statistical mechanics books. Recall that in the canonical ensemble we had $U - TS = F = -k_B T\ln(Z_N)$, where $F$ is the Helmholtz free energy. So the grand potential is simply related to the free energy by $\Phi_G = F - \mu N$. Anyhow, the following thermodynamic relations immediately follow:

$$S = -\frac{\partial\Phi_G}{\partial T}; \qquad N = -\frac{\partial\Phi_G}{\partial\mu} = k_B T\frac{\partial\ln(\Xi)}{\partial\mu}.$$
Two other relations that follow in analogy with the results for the canonical ensemble are:

$$P = -\frac{\partial\Phi_G}{\partial V}; \qquad U = -\frac{\partial\ln(\Xi)}{\partial\beta} = k_B T^2\frac{\partial\ln(\Xi)}{\partial T}.$$

Chapter 8

Virial Theorem and the Grand Canonical Ensemble

8.1 Virial Theorem

Before launching into the theory of quantum gases, it's useful to learn a powerful result in statistical mechanics, called the virial theorem of Clausius. Consider the quantity $G \equiv \sum_i\mathbf{p}_i\cdot\mathbf{r}_i$, where the sum is over the particles. The time derivative of this quantity is
$$\frac{dG}{dt} = \sum_i\left(\dot{\mathbf{p}}_i\cdot\mathbf{r}_i + \dot{\mathbf{r}}_i\cdot\mathbf{p}_i\right) = \sum_i\dot{\mathbf{p}}_i\cdot\mathbf{r}_i + \sum_i m_i v_i^2 = 2K + \sum_i\dot{\mathbf{p}}_i\cdot\mathbf{r}_i = 2K + \sum_i\mathbf{F}_i\cdot\mathbf{r}_i,$$
where I am using $K$ to represent kinetic energy rather than $T$, to avoid confusion with temperature. Also, in the last equation I have used Newton's equation $\mathbf{F} = \dot{\mathbf{p}}$. Now, the time average of some fluctuating quantity (call it $A(t)$) is defined as $\langle A(t)\rangle_t$:
$$\langle A(t)\rangle_t \equiv \frac{1}{\tau}\int_0^{\tau}A(t)\,dt.$$
We can use this to define the following time average:

$$\left\langle\frac{dG}{dt}\right\rangle_t = \frac{1}{\tau}\int_0^{\tau}\frac{dG}{dt}\,dt = \frac{G(\tau) - G(0)}{\tau}.$$
Now here comes the crucial part. If $G(t)$ is a periodic function with period $P$, then clearly for $\tau = P$ the time average must be zero, since $G(\tau) = G(0)$. Likewise, if $G$ has finite values for all times, then as the elapsed time $\tau$ gets to be very large the average will also go to zero. This means that
$$\left\langle\frac{dG}{dt}\right\rangle_t = 0$$
as long as $G$ is finite. Finally we obtain the virial theorem

$$\langle K\rangle_t = -\frac{1}{2}\left\langle\sum_i\mathbf{F}_i\cdot\mathbf{r}_i\right\rangle_t. \qquad (8.1)$$


Apparently, the word ‘virial’ comes from the Latin word for ‘force’ though I remember from my high school Latin that force was ‘fortis’ meaning strength. Oh well. You see anyhow where the force comes into the picture.

8.1.1 Example: ideal gas

As an example of the power of the virial theorem, let's derive the equation of state for an ideal gas (again! this probably won't be the last time either). We've already seen from the equipartition theorem that for a particle in a 3D box, $U = K = (3/2)N k_B T$. Neglecting particle interactions, the only forces at equilibrium are those of the particles on the box walls (and vice versa) when they collide elastically, $d\mathbf{F}_i = -P\hat{n}\,dA$ for all particles, where $\hat{n}$ is the normal to the wall surface (this is nothing but a restatement of $P = dF/dA$). So

$$\sum_i\mathbf{F}_i\cdot\mathbf{r}_i = -P\int\hat{n}\cdot\mathbf{r}\,dA = -P\int\nabla\cdot\mathbf{r}\,dV = -3PV,$$
where I used the divergence theorem in the last bit. Plugging into the virial theorem immediately gives $(3/2)N k_B T = (3/2)PV$, which is what we wanted.

8.1.2 Example: Average temperature of the sun

Now let's use the virial theorem to estimate the temperature of the sun. The mass of the sun is $M_{\odot} \approx 2\times 10^{30}$ kg and its radius is $R_{\odot} \approx 7\times 10^8$ m. Assuming that the sun is made up entirely of hydrogen with mass $1.67\times 10^{-27}$ kg, this gives $N \approx 1.2\times 10^{57}$ atoms and therefore a number density of roughly $10^{30}$ m$^{-3}$. The force on a particle of mass $m$ due to the gravity of the sun is $\mathbf{F} = -(GM_{\odot}m/R_{\odot}^2)\hat{r}$, where $G \approx 6.674\times 10^{-11}$ m$^3$/kg/s$^2$ is the gravitational constant, and $M_{\odot}$ and $R_{\odot}$ are the mass and radius of the sun, respectively. The mass density of the sun (assuming that it's constant everywhere, which isn't likely to be the case!) is $\rho = M_{\odot}/V = M_{\odot}/(4\pi R_{\odot}^3/3)$. Now, we'd like to sum up all the gravitational forces from the sun's center to the surface. The radial-dependent mass is therefore
$$M(r) = \frac{M_{\odot}}{\frac{4\pi R_{\odot}^3}{3}}\,\frac{4\pi r^3}{3} = M_{\odot}\left(\frac{r}{R_{\odot}}\right)^3.$$
Also, for the test mass we have $dm = \rho\,dV = \rho\,r^2\sin(\theta)\,dr\,d\theta\,d\phi$. Putting it all together, we have

$$\langle K\rangle = -\frac{1}{2}\sum_i\mathbf{F}_i\cdot\mathbf{r}_i = \frac{1}{2}\int r^2\sin(\theta)\,dr\,d\theta\,d\phi\,\frac{G}{r^2}\,M_{\odot}\left(\frac{r}{R_{\odot}}\right)^3\rho\,\hat{r}\cdot\mathbf{r},$$
where $\langle K\rangle$ is the mean kinetic energy. But $\hat{r}\cdot\mathbf{r} = (\mathbf{r}/r)\cdot\mathbf{r} = r$, so
$$\langle K\rangle = \frac{GM_{\odot}\rho}{2R_{\odot}^3}\,4\pi\int_0^{R_{\odot}}r^4\,dr = \frac{2\pi GM_{\odot}\rho R_{\odot}^2}{5} = \frac{3}{10}\frac{GM_{\odot}^2}{R_{\odot}},$$
where in the last part I used the expression for the density $\rho$. Because $\langle K\rangle = (3/2)N k_B T$ by equipartition, we obtain $T \approx (1/5)(GM_{\odot}^2/N k_B R_{\odot}) \approx 5\times 10^6$ K. In fact, except for the factor 5 I could have guessed this from dimensional analysis, because there is only one way to combine $G$, $M_{\odot}$, and $R_{\odot}$ in a way that gives units of energy. In fact, the temperature of the sun varies from about 10 million degrees at the center to about 6000 degrees at the surface. So the virial theorem has done a pretty good job of estimating the average temperature.
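
The arithmetic of this estimate fits in a few lines. A sketch of mine, using standard values for $G$, $M_{\odot}$, $R_{\odot}$, and the hydrogen mass, evaluating $T = 2\langle K\rangle/(3N k_B)$ with $\langle K\rangle = (3/10)GM_{\odot}^2/R_{\odot}$:

G = 6.674e-11        # m^3 kg^-1 s^-2
M_sun = 2.0e30       # kg
R_sun = 7.0e8        # m
m_H = 1.67e-27       # kg
kB = 1.381e-23       # J/K

N = M_sun / m_H                   # number of hydrogen atoms
K = 0.3 * G * M_sun**2 / R_sun    # <K> = (3/10) G M^2 / R from the virial theorem
T = 2 * K / (3 * N * kB)          # equipartition: <K> = (3/2) N kB T
print(f"N = {N:.2e}, T = {T:.2e} K")   # about 5e6 K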

8.2 Chemical Potential

Suppose that the energy of some small system (I'll call it a subsystem) also depends on the (fluctuating) number of particles $N$ in it. Then
$$dU = T\,dS - P\,dV + \frac{\partial U}{\partial N}\,dN \equiv T\,dS - P\,dV + \mu\,dN.$$
This can also be inverted to give
$$dS = \frac{dU + P\,dV - \mu\,dN}{T}.$$
So the chemical potential for the system is defined as
$$\mu \equiv \left(\frac{\partial U}{\partial N}\right)_{S,V} = -T\left(\frac{\partial S}{\partial N}\right)_{U,V}, \qquad N = \sum_{i\,\in\,\text{subsystem}}n_i.$$
But remember to be careful when using the second definition of the chemical potential, because

T = (∂U/∂S)V,N .

This reminds me about generalized forces: here $S$ and $T$ are conjugate variables linked through the total energy $U$. In a similar way, $P$ and $V$ are conjugate variables because $P = -(\partial U/\partial V)_{S,N}$. Likewise in magnetic systems for the variables $M$ and $B$. In a similar way, the first definition of the chemical potential above shows that $\mu$ and $N$ are also conjugate variables: $\mu$ is the generalized force associated with the variable $N$, i.e. it is a 'force' that fixes the value of $N$ for a given system.

What does the chemical potential mean? Consider the change in the entropy of the entire system (subsystem plus reservoir) with the number of particles in the subsystem $N_s$, in terms of the change of the entropies of the reservoir $S_R$ and subsystem $S_s$:
$$dS_{\rm system} = \frac{\partial}{\partial N_s}(S_R + S_s)\,dN_s = \frac{\partial S_R}{\partial N_s}\,dN_s + \frac{\partial S_s}{\partial N_s}\,dN_s = dN_s\left(\frac{\partial S_R}{\partial N_R}\frac{\partial N_R}{\partial N_s} + \frac{\partial S_s}{\partial N_s}\right).$$
But $N_R = N - N_s$ so $\partial N_R/\partial N_s = -1$, giving
$$dS_{\rm system} = dN_s\left(\frac{\partial S_s}{\partial N_s} - \frac{\partial S_R}{\partial N_R}\right) = -\frac{dN_s}{T}\left(\mu_s - \mu_R\right).$$
If you didn't like this sloppy math, then how about this:

$$dS_{\rm system} = \frac{\partial S_s}{\partial N_s}\,dN_s + \frac{\partial S_R}{\partial N_R}\,dN_R.$$
Since $dN_R = -dN_s$, this gives
$$dS_{\rm system} = dN_s\left(\frac{\partial S_s}{\partial N_s} - \frac{\partial S_R}{\partial N_R}\right),$$
which is what we obtained above.

Equilibrium corresponds to maximizing entropy, or dS = 0. This means that the condition for equilibrium between the subsystem and the reservoir is that the chemical potentials for each should

be equal, $\mu_s = \mu_R$. But even more important, as equilibrium is being approached, the entropy is changing with time like
$$\frac{dS_{\rm system}}{dt} = -\frac{dN_s}{dt}\,\frac{\mu_s - \mu_R}{T} \geq 0,$$
because the entropy must increase toward equilibrium (unless they are already at equilibrium). If initially $\mu_R > \mu_s$, then clearly $dN_s/dt \geq 0$ to satisfy the above inequality. This means that in order to reach equilibrium when the chemical potential for the reservoir is initially larger than that of the subsystem, particles must flow from the reservoir into the subsystem. So the chemical potential provides some measure of the number imbalance between two systems that are not at equilibrium.

8.2.1 Free energies revisited

If the number of particles in the system is not fixed, then we also need to include the chemical potential $\mu$. Then the change in internal energy, and the changes in the enthalpy, Helmholtz, and Gibbs free energies, are given respectively by

$$dU = T\,dS - P\,dV + \mu\,dN;$$
$$dH = T\,dS + V\,dP + \mu\,dN;$$
$$dF = -S\,dT - P\,dV + \mu\,dN;$$
$$dG = -S\,dT + V\,dP + \mu\,dN.$$
The first of these we saw back in Section 8.2. The rest of them are pretty trivially changed from their cases seen above without allowing for changes to the total particle number! So we obtain the following rules:
$$\mu = \left(\frac{\partial U}{\partial N}\right)_{S,V} = \left(\frac{\partial H}{\partial N}\right)_{S,P} = \left(\frac{\partial F}{\partial N}\right)_{T,V} = \left(\frac{\partial G}{\partial N}\right)_{T,P}. \qquad (8.2)$$
If there is more than one kind of particle, then one would need to insert the additional conditions, i.e. the chemical potential for species 1 would be

∂F = . 1 ∂N  1 T,V,N2,N3,...

Let’s explore this last expression a bit more:

∂F ∂N ln(Z) ∂ ln(Z) = = k T = k T ln(Z) Nk T . ∂N − B ∂N − B − B ∂N  T,V  T,V  T,V Because Z = N/A, this last relation can be solved directly: ∂ ln(N/A)/∂N = ∂ ln(N)/∂N =1/N so that = kBT [ln(Z) + 1]. At high temperatures, the arguments of the exponentials in Z will all be small, which− implies that Z > 0. This means that the chemical potential for classical particles is always negative. At low temperatures it might approach zero or get positive. This possibility will be explored a bit later in the term when we discuss quantum statistics. PHYS 449 Course Notes 2009 85

8.2.2 Example: Pauli Paramagnet To clarify the various meanings of the chemical potential, let’s return first to the Pauli paramagnet. 1 Recall that in the microcanonical ensemble, the entropy for the spin 2 case was n n n n S = Nk 1 ln 1 + 1 1 ln 1 1 . − B N N − N − N h       i Assuming that we have some subsystem with number N in contact with a reservoir at temperature T , the chemical potential is

∂S N n1 n2 = T = kBT ln = kB T ln 1 = kBT ln − ∂N − N n1 − N N  −      after a bit of algebra. What does this mean? First, you can see that the chemical potential has units of energy. In this case, = 0 when the number of spinup atoms is zero, n1 = 0. The chemical potential is less than zero for any other value of n1, and 0 for n1 N. What does = 0 mean? Suppose that the total number of particles is not zero.| |The ≫ n a zero→ chemical potential means that the change in the number of particles in the reservoir is not related to the change in the number of particles in the subsystem; alternatively, the entropy is invariant under changes in the number of particles. This implies that a zero chemical potential means that the system doesn’t conserve the number of particles. For the Pauli paramagnet, I can keep increasing the number of atoms with spin down, and as long as I don’t create a single spin up, then the system’s entropy doesn’t change: it remains exactly zero.

8.3 Grand Partition Function

Recall that in the derivation of the Boltzmann distribution in the canonical ensemble, we maximized the entropy (or the number of microstates) subject to the two constraints N = i ni and U = niǫi. So we had i P P d ln() + α dN dn + β dU ǫ dn =0, − i − i i i ! i ! X X where α and β were the Lagrange multipliers fixing the two constraints (alternatively, they are unknown multiplicative constants in front of terms that are zero). So we immediately have the following relations: ∂ ln() ∂ ln() α = ; β = . ∂N ∂U But above we have ∂S/∂N = /T = k (∂ ln()/∂N) because S = k ln(). So we immediately − B B obtain α = /kBT = β. So the chemical potential is in fact the Lagrange multiplier that fixes the number− of particles− in the total system. Putting these together, we have ∂ ln() ∂ ln() 1 = kBT ; = . − ∂N ∂U kBT ln() = βU βN + const. ⇒ − = A exp [β (U N)] , ⇒ − where A = exp(const.). PHYS 449 Course Notes 2009 86

So far, this enumeration has been for the reservoir, which contains the fixed temperature. To find for the subsystem, we note that U = U E and N = N N . So R − s R − s = A exp [β (U N )] = A exp [β (U E ) β (N N )] R R − R − s − − s = A exp [β (U N)] exp [ β (E N )]= . − − s − s system subsystem Finally we obtain = A exp [ β (E N )] . s − s − s The physical system is actually comprised of a very large number of these subsystems, all of which have the same temperature and chemical potential at equilibrium, but all of which have different energies Ei and number of particles Ni. The total number of ways of distributing the particles is therefore the sum of all of the subsystems’ contributions, = i s. So the probability of occupying a given subsystem is the fraction of the distribution of a given subsystem over all of them, P exp [ β (E N )] exp [ β (E N )] p = i = − i − i − i − i , i exp [ β (E N )] ≡ Ξ i − i − i where Ξ is the grand partitionP function,

Ξ exp [ β (E N )] . ≡ − i − i i X

Here’s an alternate derivation if you didn’t like that one. The ratio of probabilities for two macrostates is the same as the ratio of their number of microstates,

SR(s2)/kB P (s2) R(s2) e 1 = = = exp [SR(s2) SR(s1)] . P (s ) (s ) eSR(s1)/kB k − 1 R 1  B  We know that 1 1 dS = (dU + P dV dN )= (dU + P dV dN ) R T R R − R −T s s − s 1 = E(s ) E(s ) [N(s ) N(s )] . −T { 2 − 1 − 2 − 1 } So we obtain 1 exp [Es(s2) Ns(s2)] P (s ) kB T 2 = − − . P (s1) n 1 o exp [Es(s1) Ns(s1)] − kB T − The probability of occupying any state s isn then o

1 exp k T [Es(s) Ns(s)] P (s)= − B − , n Ξ o where Ξ is given above. PHYS 449 Course Notes 2009 87

8.3.1 Examples It is important to note that the way one obtains the grand partition function is different from the way it was done in the canonical ensemble. Suppose that the subsystem contains three accessible energy levels 0, ǫ, and 2ǫ. Now suppose that there is only one atom in the total system. How many ways can I distribute atoms in my energy states? I might have no atoms in any energy state (Es = 0, Ns = 0). I might have one atom in state 0 (Es = 0, Ns = 1). I might have one atom in state ǫ (Es = ǫ, Ns = 1), or I might have one atom in state 2ǫ (Es = 2ǫ, Ns = 1). So my grand partition function is:

Ξ = exp[ β(0 0)] + exp[ β(0 1)] + exp[ β(ǫ 1)] + exp[ β(2ǫ 1)] − − − − − − − − = 1+exp(β)[1 + exp( βǫ) + exp( 2βǫ)] . − − Suppose instead that there are an unknown number of particles in the total system, but that each of these three energy levels in my subsystem can only accommodate up to two particles. Then the grand partition function is

Ξ=1 + exp(β)[1+ exp( βǫ) + exp[ 2βǫ)] − − + exp(2β) [1 + exp( βǫ)+2exp( 2βǫ) + exp( 3βǫ) + exp( 4βǫ)] , − − − − where in the second line I have recognized that with two particles we can have Es = 0 (both in state 0), Es = ǫ (one in state 0, the other in state ǫ), Es = 2ǫ (either both in state ǫ, or one in state 0 while the other is in state 2ǫ, thus the factor of two out front of this term), Es =3ǫ (one in state ǫ, the other in state 2ǫ), and Es =4ǫ (both in state 2ǫ).

Now suppose that the number of particles in our system is totally unknown, and also there is no restriction on the number of particles that can exist in a particular energy level. Then the grand partition function is

Ξ=1 + exp(β)[1 + exp( βǫ) + exp[ 2βǫ)] + exp(2β)[1 + exp( βǫ) + exp[ 2βǫ)]2 − − − − = 1+ eβZ + e2βZ2 + ... ∞ n 1 = eβZ = , 1 eβZ n=0 − X  where I have used Euler’s solution in the last line. In other words, the grand canonical ensemble for the fully unrestricted case is simply a linear combination of the canonical partition functions for no particles, one particle, two particles, etc., suitably weighted by their fugacities. In fact, this classical grand partition function leads to slightly paradoxical results, which will be discussed at length next term. Anyhow, that’s how one constructs the grand partition function in practice!

8.4 Grand Potential

As we did for the canonical ensemble, we can obtain thermodynamic quantities using the grand partition function instead of the regular partition function. The entropy is defined as

U N S = k p ln(p )= k p [ β(E N ) ln(Ξ)] = + k ln(Ξ), − B i i − B i − i − i − T − T B i i X X PHYS 449 Course Notes 2009 88

where U = i piEi and N = i piNi are the mean energy and mean particle number for the subsystem, respectively. This expression can be inverted to yield P P U N TS = k T ln(Ξ) Φ , − − − B ≡ G where ΦG is the grand potential. Sometimes this is also written as G in statistical mechanics books. Recall that in the canonical ensemble we had U TS = F = k T ln(Z ), where F is the − − B N Helmholtz free energy. So the grand potential is simply related to the free energy by ΦG = F N. Anyhow, the following thermodynamic relations immediately follow: −

∂Φ ∂Φ ∂ ln(Ξ) S = G ; N = G = k T . − ∂T − ∂ B ∂ Two other relations that follow in analogy with the results for the canonical ensemble are:

∂Φ ∂ ln(Ξ) ∂ ln(Ξ) P = G ; U = = k T 2 . − ∂V − ∂β B ∂T Chapter 9

Quantum Counting

9.1 Gibbs’ Paradox

It turns out that much of what we have done so far is fundamentally wrong. One of the first people to realize this was Gibbs (of Free Energy fame!), so it is called Gibbs’ Paradox. Basically, he showed that there was a problem with the definition of entropy that we have been using so far. The only way to resolve the paradox is using quantum mechanics, which we’ll see later in this chapter. Of course, we have been using quantum mechanics already in order to define the accessible energy levels that particles can occupy. But so far, we haven’t been concerned with the particles themselves. In this section we’ll see the paradox, and over the next sections we’ll resolve it using quantum mechanics.

Recall that the entropy in the canonical ensemble is defined as

U ∂ ln(Z) S = Nk ln(Z)+ = Nk ln(Z)+ Nk T . B T B B ∂T

3 As we saw last term, the partition function for a monatomic ideal gas in a 3D box is Z = V/λD, 2 where V is the volume and λD = 2πh /mkBT is the de Broglie wavelength. Plugging this in, we obtain q 3 2πh2 3 ln(Z) = ln(V ) ln + ln(T ). − 2 mk 2  B  Evidently, ∂ ln(Z)/∂T = (1/Z)∂Z/∂T =3/2T so

3 2πh2 3 3 3 S = Nk ln(V ) ln + ln(T )+ Nk ln(V )+ ln(T )+ σ , B − 2 mk 2 2 ≡ B 2   B     where 3 2πh2 σ = 1 ln 2 − mk   B  is some constant we don’t really care about. Now to the paradox.

Consider a box of volume V containing N atoms. Now, suppose that a barrier is inserted that divides the box into two regions of equal volume V/2, each containing N/2 atoms. Now the total

89 PHYS 449 Course Notes 2009 90 entropy is the sum of the entropies for each region,

N V 3 N V 3 S = k ln + ln(T )+ σ + k ln + ln(T )+ σ 2 B 2 2 2 B 2 2         V 3 3 = Nk ln + ln(T )+ σ = Nk ln(V )+ ln(T )+ σ . B 2 2 B 2       In other words, simply putting in a barrier seems to have reduced the entropy of the particles in the box by a factor of NkB ln(2). Recall that this is the same as the hightemperature entropy for the twolevel Pauli paramagnet (a.k.a. the coin). So it seems to suggest that the two sides of the partition once the barrier is up take the place of ‘heads’ or ‘tails’ in that there are two kinds of states atoms can occupy: the left or right partitions. But here’s the paradox: putting in the barrier hasn’t done any work on the system, or added any heat, so the entropy should be invariant! And simply removing the barrier brings the entropy back where it was before. Simply reducing the capability of the particles in the left partition from accessing locations in the right partition (and vice versa) shouldn’t change the entropy, because in reality you can’t tell the difference between atoms on the left or on the right.

Clearly, we are treating particles on the left and right as distinguishable objects, which has given rise to this paradox. How can we fix things? The cheap fix is to realize that because all of the particles are fundamentally indistinguishable, the N! permutations of the particles among themselves shouldn’t lead to physically distinct realizations of the system. So, our calculation of Z must be too large by a factor of N!: ZN Z ZN = old , N ≡ correct N! where I explicitly make use of the fact that the Nparticle partition function is Nfold product of the singleparticle partition function Zold. With this ‘ad hoc’ solution,

ln(Z )= N ln(Z ) ln(N!) = N ln(Z ) N ln(N)+ N. N old − old − Now the entropy of the total box is

3 V 3 S = Nk ln(V )+ ln(T )+ σ ln(N)+1 = Nk ln + ln(T )+ σ +1 . B 2 − B N 2       Now, if we partition the box into two regions of volume V/2, each with N/2 particles, then ln(V/N) ln(V/N) and the combined entropy of the two regions is identical to the original entropy. So Gibbs’→ paradox is resolved.

The equations for the mean energy, entropy, and free energy in terms of the proper partition function ZN are now ∂ ln(Z ) ∂ ln(Z ) U = k T 2 N ; S + k ln(Z )+ k T N ; F = k T ln(Z ). B ∂T B N B ∂T − B N That is, the expression are identical to the ones we’ve seen before, except the N factors in front have disappeared, and the Z is replaced by a ZN . PHYS 449 Course Notes 2009 91

9.2 Chemical Potential Again

Now let’s return to the particle in the 3D box. Including the Gibbs’ correction, the entropy is

V 3 mkBT 5 S = NkB ln + ln + . N 2 2πh2 2       3 Now, U = 2 NkBT , so kB T =2U/3N. Inserting this into the above expression for the entropy gives V 3 mU 5 S = NkB ln + ln + . N 2 3πh2N 2       Now we can take the derivative with respect to N to obtain the chemical potential: V 3 mU 5 3 = kBT ln + ln + + kBT 1+ . − N 2 3πh2N 2 2         V 3 mkBT 3 = kBT ln + ln = kBT ln nλ , − N 2 2πh2 D       where n = N/V is the density. Note that the subsystem stops conserving the number of particles 3 when nλD = 1, or when the mean interparticle separation ℓ becomes comparable to the de Broglie 1/3 wavelength, ℓ n− λD. But we already know that something quantum happens on the length scale of the de≡ Broglie∼ wavelength. We’ll come back to this a bit later. For very classical systems, ℓ λD which means that the chemical potential is usually large and negative for the particle in the 3D≫ box.

How can you calculate the chemical potential from the free energy? I’m sure that you were burning to find this out. So let’s do it. The free energy is F = U TS so dF = dU T dS SdT = T dS P dV + dN T dS SdT , so − − − − − − ∂F ∂ ln (Z ) = = k T N . ∂N − B ∂N

Let’s check that this gives the right answer for the particle in the 3D box. In that case, ZN = 3 N 3 3 (V/λD) /N! so ln(ZN )= N ln(V/λD) N ln(N)+N. So ∂ ln(ZN )/∂N = ln(V/λD) ln(N) 1+1, 3 − − − or = kBT ln(nλD). Suppose that the energies of the particles were larger by a constant factor of 2 2 , so that ǫk = (h k /2m) + . Then h2k2 βh2k2 V Z = exp β + = exp( β) exp = 3 exp( β). − 2m − − 2m λD − Xk    Xk   So, the chemical potential is now 3 = kB T ln(nλD) + . Clearly, the shift in the energies has led to exactly the same shift in the chemical potential. This additional piece can be thought of as an ‘external chemical potential’ that increases the total value. If the shift had depended on position or momentum, though, then we would have needed to use the equipartition theorem to evaluate the contribution of the additional piece to the chemical potential. Also, if we had included vibration and rotation for molecules, say, then the chemical potential would have increased as well. These would be ‘internal chemical potential’ contributions. Would these tend to increase or decrease the chemical potential? PHYS 449 Course Notes 2009 92

9.3 Arranging Indistinguishable Particles

The way we have derived statistical mechanics so far was to count the ways of arranging various outcomes into bins, and call the mostoccupied bin the equilibrium state of the system. In the microcanonical ensemble, these outcomes tended to be things like ‘headness’ or ‘tailness’, and the bins were the number of heads and tails after N trials. In the canonical ensemble, the outcomes are usually designated as the accessible (quantum) energy levels of a given system, and the bins are the various ways of N particles occupying these levels. But both of these approaches assumed that we could tell the difference between various outcomes (call them sides of a coin, or the energy level, etc.). With completely indistinguishable particles, things get a bit more problematic.

As an example, consider a threesided coin. Or even better a threelevel system like a spin1 particle that can be in the states , , and , with energies 1, 0, and 1, respectively. Suppose we have three of these particles. Then↑ ◦ we obtain↓ the following table, where− the ‘ways’ column denotes the number of ways we could have obtained the number of , , and in the state (i.e. the number of microstates in the macrostate). ↑ ◦ ↓

state energy ways 3 1 | ↑↑↑ 2 3 | ↑↑ ◦ 1 3 | ↑↑↓ 1 3 | ↑ ◦◦ 0 6 |↑◦↓ 0 1 |◦◦◦ 1 3 |↑↓↓ −1 3 |◦◦↓ −2 3 |◦ ↓↓ −3 1 |↓↓↓ −

We started with 10 distinguishable macrostates. But if the macrostates are the various values of the total energy of the macrostates, then we actually have only 7 macrostates, labeled by energies 3, 2, 1, 0, 1, 2, 3 , with 1, 3, 6, 7, 6, 3, 1 microstates in each, respectively. In fact, we’ve briefly {run into this− problem− − } of degeneracy{ before. Now,} the state with zero energy is sevenfold degenerate (g = 7) when there are three particles. How do we count states properly taking the degeneracy g into account?

9.3.1 Bosons

Suppose that we want to enumerate the number of ways to arrange two particles that you can’t distinguish into four degenerate levels. How many ways are there to do this? Here’s another table, where each column represents one of the four degenerate levels, and the number tells you how many particles are in that level: In this example, there are 10 combinations. If you remember your binomial coefficients, this number 5 corresponds to . Why would this be? Let’s do another example to find some kind of trend. 2 How about three particles also with g = 4: PHYS 449 Course Notes 2009 93

0 0 0 2 0 0 1 1 0 1 0 1 1 0 0 1 0 0 2 0 0 1 1 0 1 0 1 0 0 2 0 0 1 1 0 0 2 0 0 0

6 This gives me 20 combinations, or . O.k., this is enough for a trend. The lower number of the 3 ‘choose’ bracket is definitely turning out to be the number of particles. What is the upper number? Well, the number N + g 1 seems to work! So it looks like the number of ways of arranging n − i particles into gi degenerate levels of energy state ǫi is

(b) ni + gi 1 (ni + gi 1)! i = − = − . ni n !(g 1)!   i i − I haven’t imposed any kind of restrictions about how many particles can exist in one of the degenerate levels. These kinds of particles are called ‘bosons,’ because they obey ‘BoseEinstein statistics,’ after the two guys who figured it out; thus the (b) in the superscript of i above.

If there are M energy levels in total, each with their own degeneracy factors gi, then the total number of ways of distributing all the bosons is

(n + g 1)! (b) = (b) = i i − . (9.1) i n !(g 1)! i i i i Y Y − (b) Note that if gi = 1 for a given level ǫi, so that there is only one state per energy level, then i = 1, i.e. that there is only one way to arrange the ni particles. How do we reconcile this with the counting method we used before? If all we get is one microstate in every macrostate, how do we maximize the entropy and obtain our equilibrium state? Let’s postpone this quandary for a moment, and first introduce fermions.

9.3.2 Fermions

Suppose that there is another kind of particle that refuses to share its quantum state with another particle; that is, each state within a degenerate energy level can only hold one particle. Particles with this restrictions are called ‘fermions,’ because the satisfy ‘FermiDirac statistics.’ How many ways of arranging the particles are there now? Let’s consider the same examples as was done above for bosons. First, N = 2, and g = 4: 4 This gives = 6 = . And the case with N = 3 and g = 2 is: 2   PHYS 449 Course Notes 2009 94

0 0 0 3 0 0 1 2 0 1 0 2 1 0 0 2 0 0 2 1 0 1 1 1 1 0 1 1 0 2 0 1 1 1 0 1 2 0 0 1 0 0 3 0 0 1 2 0 1 0 2 0 0 2 1 0 1 1 1 0 2 0 1 0 0 3 0 0 1 2 0 0 2 1 0 0 3 0 0 0

0 0 1 1 0 1 0 1 1 0 0 1 0 1 1 0 1 0 1 0 1 1 0 0

4 Now we only have four combinations, or . Again, this is enough to see the trend. The top 3 number is clearly the degeneracy, because it hasn’t changed. The bottom number looks like the number of particles. So for fermions we obtain g ! (f) = (f) = i . (9.2) i n !(g n )! i i i i i Y Y − (f) For fermions, i = 1 when gi = ni, i.e. when there is exactly one particle in each of the degenerate states within the energy level ǫi. PHYS 449 Course Notes 2009 95

0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 0

9.3.3 Anyons!

As discussed in the previous two subsections, the number of ways of arranging ni bosons and fermions into an energy level with degeneracy gi are respectively given by (n + g 1)! g ! (b) = i i − ; (f) = i . i n !(g 1)! i n !(g n )! i i − i i − i These two forms for i look fairly similar to one another. In fact, we can write only one expression for both of them, that depends only on a parameter I’ll call α: [g + (n 1)(1 α)]! (a) = i i − − , i n ![g 1 α(n 1)]! i i − − i − where clearly for α = 0 we recover the expression for bosons, and for α = 1 we obtain that for fermions. But what about values of α in between? As discussed above, an unlimited number of bosons can occupy any state, independently of whether that state is already occupied by another particle or not; whereas only a single fermion can occupy a given state. So, intuitively, there might max exist particles where only a finite number of them could occupy a given state, i.e. 1 < ni < . In fact, we’ve already discussed these: particles that have repulsive interactions will probably ‘push’∞ other particles away from the state they already occupy.

In the late 1980’s, Duncan Haldane came up with the idea of ‘fractional exclusion statistics’ (FES), where the occupation of a given state depends on the number of particles already in that state. In two dimensions, these particles correspond to ‘anyons,’ because they obey ‘any’ statistics between (and also including) bosonic and fermionic statistics. These exotic particles are known to be the ones responsible for the fractional quantum Hall effect, for example. The ‘’ is the reason for (a) the ‘a’ superscript in the i expression above.

(b) (f) Recall that for bosons, i = 1 only if gi = 1, and for fermions, i = 1 only if gi = ni; in other words, the distribution saturates for these values of the degeneracy gi. For what values of gi do particles obeying F.E.S. saturate? For (a) = 1, this means that we require g +(n 1)(1 α)= n , i i 1 − − i and gi 1 α(ni 1) = 0 because 0! = 1 (notice that both of these equations state the same thing). So we− obtain− the− critical value of α for saturation: g 1 α = i − , c n 1 i − which means that α must be a rational fraction. Here’s a table of results for F.E.S. particles: In other words, as the denominator of the fraction α increases, more and more particles can fit into the state with degeneracy gi, so that anyons with F.E.S. approach Fermi and Bose statistics in the two limits. PHYS 449 Course Notes 2009 96

α gi ni gi ni gi ni gi ni 1 1 1 2 2 3 3 4 4 1 2 1 0 2 3 3 5 4 7 1 3 1 0 2 4 3 7 4 10 1 4 1 0 2 5 3 9 4 13 0 1 * 2 * 3 * 4 *

9.4 Emergence of Classical Statistics

Another way to think about the much larger number of energies than particles at everyday temper atures is this. Suppose you were to define an accessible energy state as an energy region that spans something like 105 quantum energy states. Then there would be on average 1 atoms per region, but there would be a huge degeneracy for this region, because the atom could be sitting in any of the 105 nearby energy levels. (I’m assuming that they are so closelyspaced as to be degenerate for our purposes). Then, we’re in the situation where g 1. Now let’s go back to our expressions for the i ≫ number of ways of distributing ni particles in gi states: [g + (n 1)(1 α)]! (a) = i i − − , i n ![g 1 α(n 1)]! i i − − i − If g 1, then i ≫ [g + n (1 α)]! [g + n n α)]! (a) i i − i i − i i ≈ n !(g αn )! ≈ n !(g αn )! i i − i i i − i (g + n αn )(g + n αn 1) (g αn + 1)(g αn )(g αn 1) (1) = i i − i i i − i − i − i i − i i − i − n !(g αn )(g αn 1) (1) i i − i i − i − gni i , ≈ ni! because not only is gi 1 but also gi ni. Notice that this final result doesn’t depend on alpha and therefore on the kind≫ of statistics. So≫ the quantum way of counting at high temperatures reduces to the simple universal result gni gni = i = i i . n ! n ! i i i i Y Q Q In fact, this isn’t very different from the microcanonical result for a situation where there are various ni outcomes of N trials, micro = N!/ i ni! except that the factor of N! is replaced by gi . But as we will see now, this new factor resolves Gibbs’ paradox: Q g ln() = ln gni ln n ! = [n ln(g ) n ln(n )+ n ]= n ln i + n . i − i i i − i i i i n i i ! i ! i i i Y Y X X     This means that ∂ g g n d ln() = 0 = n ln i dn = ln i i ) dn . ∂n i n i n − n i i i i i i i X    X     PHYS 449 Course Notes 2009 97

g ln i dn =0. ⇒ n i i i X   This equation isn’t much different from the maximization of entropy condition used previously to derive the Boltzmann distribution! Again, we have the supplementary conditions N = i ni or i dni = 0 and that of constant mean energy dU = i ǫidni = 0. Again, I use the method of Lagrange multipliers to obtain the single equation P P P g ln i α βǫ dn =0 n − − i i i i X     from which I get gi = exp [α + βǫi] ni or n = g exp[ βǫ α]= g A exp( βǫ ). i i − i − i − i Now the degeneracy factor has appeared more naturally than the ad hoc addition we did last time. In the continuous notation of the previous subsection, the distribution is

n(ǫ)= Ag(ǫ) exp( βǫ), − where g(ǫ) is just the density of states per unit energy, rather than the degeneracy. The total particle number for a 3D box is then A(2m)3/2V N = √ǫ exp( βǫ)dǫ 4h3π2 − Z which determines the unknown constant A in terms of N and the box volume V .

Now the entropy is written

g S = k ln() = k n ln i + N = k n [ln(g ) ln(A) ln(g )+ βǫ ]+ k N B B i n B i i − − i i B " i i # i X   X = k [βU N ln(A)+ N] . B − But N = AZ which means that ln(A) = ln(N) ln(Z) and the entropy becomes − ∂ S = k [βU N ln(N)+ N ln(Z)+ N]= k βNk T 2 ln(Z)+ N ln(Z) ln(N!) B − B B ∂T −   U ZN U = Nk ln(Z)+ k ln(N!) = k ln + , B T − B B N! T   which is different from the expression assuming distinguishable particles by a factor of ln(N!). This is exactly the term that we needed to resolve Gibbs’ paradox! Note that the expression for U is the same:

g ǫ exp( βǫ ) ∂ U = g Aǫ exp( βǫ )= N i i i − i = Nk T 2 ln g exp( βǫ ) i i − i g exp( βǫ ) B ∂T i − i i i i i ( " i #) X P − X ∂ P∂ ZN = Nk T 2 ln(Z)= k T 2 ln B ∂T B ∂T N!   PHYS 449 Course Notes 2009 98 where I got away with adding the N! term in the log because the derivative with respect to T would make this vanish! And of course I also have ZN F = k T ln . − B N!   Chapter 10

Quantum Statistics

Recall that the way one counts the number of ways quantum particles can be distributed in accessible quantum states is very different from the way classical particles are distributed in their quantum states. But we more or less dropped that story for a while in order to resolve Gibbs’ paradox. At large temperatures, when the number of states that are occupied is much larger than the number of particles, we recover the usual Boltmann distribution. But you might not remember a subtle point. We assumed that the effective degeneracy of a single level is large: we grouped many very closely spaced singlydegenerate quantum states into another state, which we called the highly degenerate quantum state. But we also found that at high temperatures, the mean number of levels occupied is also much larger than the number of particles in the system. In this chapter, we’ll relax the second assumption about high temperatures, so that the number of particles can begin to approach the degeneracy of one of these compound levels. In this case, we shouldn’t recover the classical Boltzmann distribution, but rather the quantum distributions for bosons and fermions.

10.1 Bose and Fermi Distributions

In Section 5.1.3 we saw that the number of ways of arranging ni quantum particles in a single energy level ǫi with degeracy gi is given by [g + (n 1)(1 α)]! (a) = i i − − , i n ![g 1 α(n 1)]! i i − − i − where α = 0 for bosons and α = 1 for fermions, and any other α corresponds to particles with fractional exclusion statistics. The total number of ways of arranging the particles is then

[g + (n 1)(1 α)]! = i i − − . n ![g 1 α(n 1)]! i i i i Y − − − Let’s follow the same procedure as we did previously in deriving the Boltzmann distribution, by maximing the entropy (or ln()) subject to the constraints N = i ni and U = i ǫini. Now, in the grand canonical ensemble we know that the Lagrange multiplier fixing the first condition is nothing but the chemical potential . Putting this together we haveP P

(1 α) ln[g +(n 1)(1 α)]+(1 α) ln(n ) 1+α ln[g 1 α(n 1)]+α+β βǫ dn =0 { − i i − − − − i − i − − i − − i} i i X

99 PHYS 449 Course Notes 2009 100

(1 α) ln[g + (n 1)(1 α)] ln(n )+ α ln[g 1 α(n 1)] + β βǫ =0 ⇒ − i i − − − i i − − i − − i because the variations of ni are all independent and arbitrary.

10.1.1 Fermions

For fermions (α = 1) we have

ln(n ) + ln[g n ]+ β βǫ =0 − i i − i − i g ln i 1 = β(ǫ ). ⇒ n − i −  i  This immediately gives the Fermi-Dirac (FD) distribution function g nFD = i , (10.1) i exp [β(ǫ )]+1 i − otherwise known as the FermiDirac occupation factor. A simpler way to derive the FD distribution is to use the grand canonical partition function. Because each state within a degenerate level can hold either no particles (with energy zero), or one particle (with energy Ei = ǫi and particle number Ni = 1), the FD partition function is

Ξ = 1 + exp[ β(ǫ )] gi , Ξ= 1 + exp[ β(ǫ )] gi . i { − i − } { − i − } i Y The singleparticle partition function is taken to the power of gi, because each of the gi states is assumed to be independent. The grand potential is therefore

Φ = k T ln(Ξ) = k T g ln 1 + exp[ β(ǫ )] . (10.2) G − B − B i { − i − } i X Now, the mean number of particles in the system is

∂Φ ∂ ln(Ξ) g β exp[ β(ǫ )] g N = G = k T = k T i − i − = i . − ∂ B ∂ B 1 + exp[ β(ǫ )] exp[β(ǫ )]+1 i i i i X − − X −

Because N = i ni, we immediately obtain

P gi nFD = i exp [β(ǫ )]+1 i − in agreement with the expression above.

The FermiDirac distribution function for various inverse temperatures β is shown in Fig. 10.1. As temperature increases (lower β), the distribution becomes flatter and shows a longer exponential tail, which begins to look like the Boltzmann distribution. At lower temperature (larger β), the distribution approaches a steplike function, signifying full occupation for levels ǫ< and zero occupation for levels larger than the chemical potential. PHYS 449 Course Notes 2009 101

1

0.8

) 0.6 ε ( FD n 0.4

0.2

0 0 5 10ε 15 20

Figure 10.1: The fermion distribution function for various temperatures. The solid, dashed, and dotdashed lines correspond to β = 10, 1, and 0.5, respectively. The chemical potential is set at 10 and g = 1.

In order to get a better feeling for the Fermi distribution (10.1), let’s modify the way it looks a little. Consider n gi : i − 2 1 1 gi 1 2 exp [β(ǫi )] 2 gi 1 exp [β(ǫi )] ni = gi − − − = − − . − 2 exp [β(ǫi )]+1 2 exp [β(ǫi )]+1 −   − Pulling a factor of exp [β(ǫ )/2] out of both the top and the bottom gives i −

gi gi exp [ β(ǫi )/2] exp [β(ǫi )/2] ni = − − − − − 2 2 exp [β(ǫi )/2] + exp [ β(ǫi )/2]   − − − g β = i tanh (ǫ ) . − 2 2 i −     So we are finally left with a slightly more intuitive expression (I hope!) for the Fermi distribution function that now involves a tanh: g β n = i 1 tanh (ǫ ) . (10.3) i 2 − 2 i −    Recall that the tanh function has the following property: 1 x 0 tanh(ax)= −+1 x ≪ 0 ( 0 x ≫=0 Also, tanh(ax) ax when x 0. So the value of a determines how quickly the function changes ∼ ∼ from its value of 1 to +1 as x passes through zero. In our case, x = ǫi and a = β. This means that the Fermi distribution− is equal to g for ǫ and is exactly zero− when ǫ . Also, as the i i ≪ i ≫ PHYS 449 Course Notes 2009 102

temperature gets lower (β increases), the slope through ǫi = gets larger and larger, approaching a vertical line when T = 0. So the Fermi distribution at zero temperature looks just like a step (Heavisidetheta) function:

gi ǫi < ni(T =0)= giΘ(ǫi ) . − ≡ 0 ǫi > n This means that exactly gi particles occupy each energy level ǫi up to the Fermi energy EF = , and no energy levels are occupied above this.

10.1.2 Bosons

For bosons (α = 0) we have

ln(g 1+ n ) ln(n )+ β βǫ =0. i − i − i − i

Now, the degeneracy of a given level is still assumed to be large, so that gi 1. In this case we obtain ≫ g ln i +1 = β(ǫ ). n i −  i  This can be inverted to give gi = exp [β(ǫi )] 1, ni − − or finally the Bose-Einstein (BE) distribution: g nBE = i , (10.4) i exp [β(ǫ )] 1 i − − otherwise known as the BoseEinstein occupation factor. Note that the BE and FD distributions differ only in the sign of the ‘1’ in the denominator! As we’ll see later, this small difference leads to profoundly different kinds of behaviour. But one thing is immediately clear: the chemical potential for bosons cannot ever exceed the value of the lowest energy level ǫ0. If it did, the value of the distribution function for this level n0 would be a negative number, which is unphysical. So for bosons we must have < 0. Again, we can derive the BE distribution within the grand canonical ensemble instead. There is no restriction to how many bosons can occupy a single state within a degenerate level. So the grand partition function for a given level is now a sum over contributions with no particles, one particle (with energy Ei = ǫi and number Ni = 1), two particles (with energy Ei =2ǫi and Ni = 2), etc.:

Ξ = 1 + exp[ β(ǫ )] + exp[ 2β(ǫ )] + exp[ 3β(ǫ )] + ... gi . i { − i − − i − − i − } Obviously, this is an infinite geometric series, 1 + x + x2 + ... =1/(1 x): − 1 gi Ξ = . i 1 exp[ β(ǫ )]  − − i −  The grand potential is

Φ = k T ln(Ξ) = k T g ln 1 exp[ β(ǫ )] . (10.5) G − B B i { − − i − } i X PHYS 449 Course Notes 2009 103

10

8

) 6 ε ( FD n 4

2

0 10 12 14 ε 16 18 20

Figure 10.2: The bosons distribution function for various temperatures. The solid and dashed lines correspond to β =0.1 and 1, respectively. The chemical potential is set at 10 and g = 1.

The mean number of particles is

∂Φ g β exp[ β(ǫ )] g N = G = k T i − i − = i . − ∂ B 1 exp[ β(ǫ )] exp[β(ǫ )] 1 i i i i X − − − X − −

Again, we know that N = i ni so

P gi nBE = , i exp[β(ǫ )] 1 i − − also in agreement with the expression above.

The BoseEinstein distribution function for various inverse temperatures β is shown in Fig. 10.2. As temperature increases (lower β), the distribution shows a longer exponential tail, which again begins to look like the Boltzmann distribution. At lower temperature (larger β), the distribution tightens up, with more probability near the chemical potential.

Incidentally, for particles with fractional exclusion statistics, we can write the distribution as g nFES = i , i w exp[β(ǫ )] + α { i − } where w(x) is a function of x = exp[β(ǫ )] satisfying the following equation: i − α 1 α w(x) [1 + w(x)] − = x.

If α = 1 (fermions), then w = x as expected. For α = 0 (bosons), we obtain 1 + w = x, again reproducing the expected result. For rational fractions α = p/q with p and q mutually prime, this PHYS 449 Course Notes 2009 104

2

1.5 ) ε (

1 (Boltz,FD,BE) n 0.5

0 0 5 10ε 15 20

Figure 10.3: The Boltzmann, FermiDirac, and BoseEinstein distribution functions, shown as solid, dashed, and dotdashed lines, respectively, are plotted at the same temperature (β = 1) and chemical potential ( = 10). The degeneracy factor is g = 1. implies a polynomial of order q. The distribution for semions, which satisfy α =1/2, is

semion gi ni = . exp[2β(ǫ )] + 1 i − 4 q

The Boltzmann, FermiDirac, and BoseEinstein distributions are plotted together in Fig. 10.3. Notice that they all coincide for large energies levels ǫ, and also at high temperature.

10.1.3 Entropy As discussed at the end of last term and in the review chapter above, the entropy can be obtained in terms of the grand potential using ∂Φ S = G . − ∂T The grand potential for bosons and fermions, derived above, are

Φ = k T g ln 1 exp[ β(ǫ )] , G ∓ B i { ± − i − } i X where the upper sign corresponds to fermions, and the lower sign to bosons. The derivative is carried out more easily in terms of β, so we’ll define the entropy instead as ∂Φ ∂ 1 S = k β2 G = k β2 g ln 1 exp[ β(ǫ )] B ∂β ∓ B ∂β β i { ± − i − } i X (ǫ ) exp[ β(ǫ )] = g k ln 1 exp[ β(ǫ )] k β ∓ i − − i − i ± B { ± − i − }∓ B 1 exp[ β(ǫ )] i i X  ± − −  PHYS 449 Course Notes 2009 105

β(ǫ ) = k g ln 1 exp[ β(ǫ )] + i − . B i ± { ± − i − } exp[β(ǫ )] 1 i i X  − ±  This is a pretty ugly expression, but it can be made to look nicer using the expressions for the FD and BE distributions g n = i . i exp[β(ǫ )] 1 i − ± This can be inverted to give g exp[β(ǫ )] g g g g n = i i − ± i ∓ i = i i ∓ i exp[β(ǫ )] 1 1 exp[ β(ǫ )] i − ± ± − i − which yields g n ln 1 exp[ β(ǫ )] = ln i ∓ i . { ± − i − } − g  i  We’re getting there! We also know that

gi gi ni exp[β(ǫi )] = 1= ∓ − ni ∓ ni β(ǫ )] = ln(g n ) ln(n ). ⇒ i − i ∓ i − i Putting all of these into the expression for the entropy gives n n S = k g ln(g n ) ln(g )+ i ln(g n ) i ln(n ) B i ∓ i ∓ i ± i g i ∓ i − g i i i i X   = k (n g ) ln(g n ) n ln(n ) g ln(g ) . B { i ∓ i i ∓ i − i i ± i i } i X For fermions, the resulting entropy takes the form:

SFD = k n ln(n ) + (g n ) ln(g n )+ g ln(g ) , (10.6) − B { i i i − i i − i i i } i X which reduces so something intuitive when gi = 1:

SFD(g =1)= k n ln(n )+(1 n ) ln(1 n ) . i − B { i i − i − i } i X In other words, it is the sum of two contributions, corresponding to each level either empty, or containing one particle. This is very reminiscent of the entropy for a twostate system (coins or spins)! Note that the entropy is zero if each state is occupied by one particle (ni = 1). This is the ground state of the system at zero temperature, and it reflects the Pauli exclusion principle. For bosons, the form of the entropy is less intuitive looking, but still nice:

SBE = k (g + n ) ln(g + n ) n ln(n ) g ln(g ) . (10.7) B { i i i i − i i − i i } i X Note that for bosons, the entropy is zero only if all states are unoccupied, i.e. ni = 0 i. But this must be nonsense, because we have a finite number of particles in our system! Where∀ have all the particles gone? This is the origin of BoseEinstein condensation, which we’ll get to below. Notice also that there is a complete symmetry in the BoseEinstein case between ni and gi. PHYS 449 Course Notes 2009 106

10.2 Quantum-Classical Transition

Last term we found that the chemical potential at high temperatures is large and negative. This means that one might expect exp[β(ǫi )] 1 at high temperatures, but only if kBT at high temperatures as well. Supposing that− the magnitude≫ of the chemical potential gets| large| ≫ faster than kBT gets large (checked below), then both the Bose and Fermi distributions become the Boltzmann distribution: nBE,FD(T 0) g exp(β)exp( βǫ ), i ≫ ≈ i − i where A = exp(β) is called the fugacity. But remember that we defined A = N/Z in the canonical ensemble. So this immediately gives N = k T ln , B Z   where Z is understood as the hightemperature limit of the canonical partition function. This is consistent with the results of the examples (Pauli paramagnet and particles in a 3D box) discussed last term. If /k T 1, then this implies that N Z or Z N. For a 3D box, this means B ≪− ≪ ≫ V 1 3 1 or n 3 NλD ≫ ≪ λD at high temperatures. But the mean distance between particles d = (1/n)1/3 so this criterion is equivalent to saying that the classical limit is equivalent to stating that the mean distance between particles must be much larger than the de Broglie wavelength, d λ . ≫ D What does this criterion for the world to be classical come from? Suppose we use the equipartition 2 theorem, which states that the mean kinetic energy of each particle is U/N = p /2m = (3/2)kBT . Then p = √3mkBT h/d, where h/d is a measure of the largest quantum wavevector in the system. The condition is equivalent≫ to

2 h λD d = or d λD. ≫ s3mkBT √6π ≫ So, the transition from the classical to the quantum regimes occurs when the particles get sufficiently close together to begin to detect each others’ wavelike natures.

Now let’s tie up that important loose end, and check if kB T at high temperature. Consider the mean particle (fermion or boson) density for a 3D box:| | ≫

2 N 1 ∞ k dk n = = , V 2π2 exp[β(ǫ )] 1 Z0 − ± 2 2 2 2 2 where ǫk =h k /2m. Making the change of variables k = (2mkBT/h )x , we obtain

3/2 2 1 2mk T ∞ x dx n = B , (10.8) 2π2 h2 exp(x2 η) 1   Z0 − ± where η /kBT . We assume that the mean density in the box does not change as a function of temperature.≡ So let’s suppose that we start increasing the temperature. Clearly, the integral PHYS 449 Course Notes 2009 107 has to decrease like 1/T if n is conserved; at very large temperatures, the integral must approach zero. But the only thing in the integral that knows about the temperature is η = /kBT . For this to be true, we clearly require η at high temperatures, or /kBT 1. This justifies the connection between the grand canonical→ −∞ function for indistinguish|able| quantum≫ particles and the canonical distribution function for classical particles discussed above. At large (but not infinite) temperature, the integral becomes

3/2 3/2 1 2mkBT ∞ 2 2 1 2mkBT √π n 2 2 dx x exp[ (x η)] = 2 2 exp . ≈ 2π h − − 2π h kBT 4   Z0     Inverting this gives 3 2πh2 = ln(n)+ ln = k T ln nλ3 k T 2 mk T ⇒ B D B  B   as we found last term.

10.3 Entropy and Equations of State

To obtain the equations of state for quantum particles, let’s calculate the pressure. Obviously, for a 3D box at high temperature we’re expecting the result to be P = nkBT , where n is the density. What about at lower temperatures? How do you think the quantum statistics will affect the pressure?

The pressure is defined in terms of the grand potential as ∂Φ P = G , − ∂V and the grand potential for fermions and bosons is

k T V ∞ Φ = k T g ln 1 exp[ β(ǫ )] B k2dk ln 1 exp[ β(ǫ )] . G ∓ B i { ± − i − }≈∓ 2π2 { ± − k − } i 0 X Z 2 2 In the second line I have explicitly assumed a 3D box, and that the energy levels ǫk = h k /2m are sufficiently closely spaced relative to the temperature that the energy levels are more or less continuously distributed. The pressure is therefore

k T ∞ P = B k2dk ln 1 exp[ β(ǫ )] . ± 2π2 { ± − k − } Z0 To make progress, let’s integrate this expression by parts once. Recall to integrate by parts we use the rule b b b udv = uv vdu. a − Za Za

The first term on the right hand side is called the surface contribution. In the pressure expression, 2 3 we’ll let u = ln 1 exp[ β(ǫk )] and dv = k dk. Now, v = k /3 which is zero for k = 0, and u = 0 when k ={ ±because− the− argument} of the ln becomes one. So the surface term vanishes. The pressure expression∞ becomes

2 4 2 4 k T ∞ h k exp[ β(ǫ )] 1 ∞ h k P = B ( β) − k − dk = n dk. ∓ 2π2 3m ∓ 1 exp[ β(ǫ )] 2π2 3m k Z0 ± − − Z0 PHYS 449 Course Notes 2009 108

Now you see why integrating by parts makes such an improvement! Notice that this expression is correct for either bosons or fermions; you only have to use the appropriate expression for nk.

We can divide the factor of exp[ β(ǫ )] in the numerator into the denominator, which makes − k − the integrand explicitly depend on the distribution functions nk. Making the same substitution 2 2 2 x = (h /2mkBT )k as was done in the evaluation of the mean particle density, and simplifying a bit, one obtains 3/2 4 k T 2mk T ∞ x P = B B dx, 3π2 h2 exp(x2 η) 1   Z0 − ± where as before η = /kBT . Note that the prefactor is almost exactly the same as the right hand side of the expression for the particle density, Eq. (10.8). So finally we have

4 ∞ x 2kBT n 0 exp(x2 η) 1 dx P = −2 ± . (10.9) 3 ∞ x R0 exp(x2 η) 1 dx − ± Let’s check first if this reduces to the known equationR of state at high temperatures, when η : → −∞ 2 4 x 1 ∞ − 2nkB T 0 x e− dx 2nkB T 3√π √π P (T 1) 2 = = nkBT, ≫ ≈ 3 ∞ x2e x dx 3 8 4 R0 −   as expected. R

For arbitrary values of η, the pressure cannot be evaluated analytically, but it is relatively straight forward to obtain the firstorder correction from the ideal gas expression. Because η 1, one can write ≪− 1 1 1 = exp( x2 + η) 1 exp( x2 + η) . exp(x2 η) 1 exp(x2 η) 1 exp( x2 + η) ≈ − ∓ − − ±  −  ± −   Inserting this into the expression for the pressure gives

2 2 ∞ x4 e x eηe 2x dx 3√π η 3√π η 1 2k T n 0 − − 2k T n 8 e 1 e P B ∓ = B ∓ 32√2 nk T ∓ 4√2 . 2 x2 η 2x2 √π √π B η 1 ≈ 3 R ∞ x e− e e− dx 3 eη ≈ 1 e 0 ∓ 4 ∓ 8√2 ∓ 2√2 R  η β 3 The results below Eq. (10.8) suggest that we simply make the identification e = e = nλD. The pressure then becomes nλ3 P nk T 1 D . ≈ B ± √  4 2  The two terms in this expression are called the first and second virial coefficients, respectively. The first virial coefficient simply reflects the ideal gas law, as we found above anyhow. The second is a densitydependent correction that arises strictly from the . The contribution to the pressure is positive for fermions, but negative for bosons. For fermions, the pressure increases 3 as the temperature is lowered (i.e. nλD increases) because the Pauli exclusion principle effectively pushes the particle apart: the Pauli principle acts like a repulsive twobody interaction. In contrast, the second virial coefficient for bosons is negative, indicating that they prefer to stay closer together at lower temperature; this presages the idea of BoseEinstein condensation, discussed in detail below. PHYS 449 Course Notes 2009 109

In general, the deviation of the ideal gas law for noninteracting particles is one important signature for the onset of quantum degeneracy.

In exactly two dimensions, the second virial coefficient for particles with arbitrary exclusions statis tics can be calculated analytically, and the resulting equation of state is exactly

2α 1 P = nk T 1+ − nλ2 . B 4 D   For fermions (α = 1), the second virial coefficient is clearly positive, while for bosons (α = 0), it is 1 negative. Interestingly, it is exactly zero for semions (α = 2 ), which are halfway between the two limits.

The hightemperature correction to the total energy can be calculated in a similar fashion. We have

∂ ln(Ξ) ∂ Φ ∂Φ U = k T 2 = k T 2 G =Φ T G . B ∂T B ∂T −k T G − ∂T  B  We could go through the same procedure as above, integrating by parts, etc. But we can save lots of time by observing that U = i niǫi, so that

2 4 P V ∞ h k 3 U = n dk = P V 2π2 2m k 2 Z0 when comparing to the expression for the pressure above. So one obtains

3 1 U Nk T 1 nλ3 . ≈ 2 B ± 25/2 D   Chapter 11

Fermions

11.1 3D Box at zero temperature

2 2 For a threedimensional box, the highest energy level is EF = h kF /2m, where kF is the Fermi wavevector, or the radius of the Fermi sphere in momentum space (recally p =hk ). What is the Fermi energy at zero temperature in this case? We know that the mean number of particles is therefore i F gV kF N = n = k2dk, i 2π2 i=0 0 X Z where I’ve kept a factor of g out front in case each energy level has any intrinsic degeneracy over 1 and above the usual density of states. For example, if I assume that each fermion has spin 2 , then I could place exactly two particles (g = 2) in each energy level (one with spin up, one with spin down) and still satisfy the Pauli exclusion principle. Integrating, and making use of the fact that 2 2 EF =h kF /2m, I obtain gV 2mE 3/2 N = F , (11.1) 6π2 h2   which can be inverted to give h2 6π2n 2/3 E = . (11.2) F 2m g   Here, n = N/V is the mean density. This expression immediately gives the value for the Fermi wavector at zero temperature: 6π2n 1/3 k = . F g  

Let’s also calculate the mean energy for the fermions at zero temperature: i F gV kF h2k2 gV h2k5 6 gV k3 h2k2 3 U = g ǫ n = k2 dk = F = F F = NE . (11.3) i i i 2π2 2m 10π22m 10 6π2 2m 5 F i=0 0 X Z       The pressure for a is meanwhile

2 4 2 kF 3 2 2 g ∞ h k gh gV k h k 2 2nE P = n dk = k4dk = F F = F . 2π2 3m k 6mπ2 6π2 2m 5V 5 Z0 Z0    

110 PHYS 449 Course Notes 2009 111

But wait a second! These fermions are noninteracting, i.e. are ideal gases. So why is there any pressure at zero temperature? It is called Fermi pressure, and arises solely from the Pauli exclusion principle: you can only squeeze some number of fermions into a given volume before you will get some resistance!

11.2 3D Box at low temperature

At low but finite temperature, we need to evaluate expressions involving the full Fermi occupation factors, i.e.

3 ∞ ∞ X = Xini = d kg(k)n(k)X(k)= g(ǫ)n(ǫ)X(ǫ)dǫ = n(ǫ)Y (ǫ)dǫ, (11.4) i X Z Z−∞ Z−∞ where Y (ǫ) g(ǫ)X(ǫ). Notice that I have explicitly inserted the density of states into the integral expressions,≡ but that the inherent degeneracy of states g (for example originating in the spin degree of freedom) is already included in the ni. Here it is also assumed that Y (ǫ) vanishes when ǫ and is at most a power of ǫ for ǫ . If we define K(ǫ) in terms of Y (ǫ) as → −∞ → ∞ ǫ K(ǫ) Y (ǫ′)dǫ′, ≡ Z−∞ so that Y (ǫ)= dK(ǫ)/dǫ, then one can integrate Eq. (11.4) by parts to obtain

∞ ∞ ∞ ∂n n(ǫ)Y (ǫ)dǫ = n(ǫ)dK(ǫ)= K(ǫ) dǫ. − ∂ǫ Z−∞ Z−∞ Z−∞   The surface term is zero because K(ǫ = ) = 0 and n(ǫ = ) is exponentially suppressed compared with K(ǫ = ). Now, n(ǫ) at low−∞ temperatures is constant∞ for most ǫ, and varies rapidly only near ǫ = . So it is∞ reasonable to expand K(ǫ) in a Taylor series around ǫ = , so that we have

n n ∞ ∞ ∞ (ǫ ) ∂ K(ǫ) ∂n n(ǫ)Y (ǫ)dǫ = K()+ − n dǫ. n! ∂ǫ ǫ= − ∂ǫ Z−∞ Z−∞ " n=1 #   X

The leading term is simply gK(), and only even terms will appear at higherorder because ∂n/∂ǫ is an even function of ǫ . So we have − 2n 2n 1 ∞ ∞ ∞ (ǫ ) ∂n ∂ − Y (ǫ) n(ǫ)Y (ǫ)dǫ = g Y (ǫ′)dǫ′ + − dǫ 2n 1 . (2n)! − ∂ǫ ∂ǫ − ǫ= Z−∞ Z−∞ n=1 Z−∞   X

Finally, making the substitution x (ǫ )/kBT , one obtains ≡ − 2n 1 ∞ ∞ 2n ∂ − n(ǫ)Y (ǫ)dǫ = g Y (ǫ)dǫ + g an(kB T ) 2n 1 Y (ǫ) , ∂ǫ − ǫ= Z−∞ Z−∞ n=1 X where 2n ∞ x d 1 1 an x dx = 2 2(n 1) ζ(2n), ≡ (2n)! −dx e +1 − 2 − Z−∞     with the Riemann zeta functions defined as 1 1 1 ζ(n) 1+ + + + .... (11.5) ≡ 2n 3n 4n PHYS 449 Course Notes 2009 112

In practice we don’t care about anything except the first correction term, n = 1, for which a1 = ζ(2) = π2/6. So 2 ∞ π 2 n(ǫ)Y (ǫ)dǫ g Y (ǫ)dǫ + g (k T ) Y ′(). (11.6) ≈ 6 B Z−∞ Z−∞ That was a lot of work to obtain a very simple correction!

We are finally able to calculate the finitetemperature expressions for the mean number, the mean energy, and the pressure. Using the 3D density of states seen last term, the mean number is 3/2 3/2 2 ∞ gV (2m) ∞ 1/2 gV (2m) 1/2 π 2 1 1/2 N = g(ǫ)n(ǫ)dǫ = ǫ n(ǫ)dǫ ǫ dǫ + (kB T ) − 2π2 2h3 ≈ 2π2 2h3 6 2 Z0 Z0 Z0  gV (2m)3/2 π2 k T 2 = 3/2 1+ B . (11.7) 6π2 h3 8 "   # This means that as we raise the temperature from zero at constant chemical potential, the mean particle number will increase. Alternatively, one can assume that the mean number of particles N remains constant as temperature is increased from zero, so that the Fermi energy EF (11.2) remains welldefined at all temperatures, then the chemical potential must also vary: gV 2mE 3/2 gV (2m)3/2 π2 k T 2 N = F = 3/2 1+ B . 6π2 h2 6π2 h3 8   "   # π2 k T 2 E 1 B . (11.8) ⇒ ≈ F − 12 E "  F  # That is, at fixed Fermi energy, the chemical potential decreases from EF at zero temperature. This is consistent with the picture of the chemical potential we found last term, where the chemical potential becomes increasingly negative as the temperature is increased. But it is important to point out that in the fermion case, the chemical potential isn’t zero at zero temperature; rather it is the highestoccupied energy level.

Next let’s calculate the mean energy: 3/2 3/2 2 ∞ gV (2m) ∞ 3/2 gV (2m) 3/2 π 2 3 U = g(ǫ)ǫn(ǫ)dǫ = ǫ n(ǫ)dǫ ǫ dǫ + (kB T ) √ 2π2 2h3 ≈ 2π2 2h3 6 2 Z0 Z0 Z0  gV (2m)3/2 5π2 k T 2 = 5/2 1+ B . (11.9) 10π2 2h3 8 "   # While correct, this expression isn’t so nice because it is difficult to compare it to the zerotemperature result (11.3). Substituting into Eq. (11.7) gives

2 2 5π kB T 2 3 1+ 8 3 π2 k T U N N 1+ B . 2  2 ≈ 5 π kB T ≈ 5 " 2 # 1+ 8     Finally, inserting the expression (11.8) gives

3 π2 k T 2 π2 k T 2 3 5π2 k T 2 U NE 1 B 1+ B = NE 1+ B . ≈ 5 F − 12 E 2 5 F 12 E "  F  # "   # "  F  # PHYS 449 Course Notes 2009 113

This means that the specific heat at low temperatures is proportional to the temperature, CV = ∂U/∂T T . This is in contrast to the classical (Boltzmann) gas, where the specific heat at low temperature∼ was zero. It’s also the reason for the lineartemperature specific heat observed in clean crystals at very low temperatures, because electrons are fermions. Again, using the relation P =2U/3V gives the correction to the pressure:

2nE 5π2 k T 2 P F 1+ B . (11.10) ≈ 5 12 E "  F  # The finitetemperature correction term is exactly the same as for the mean energy above. This is nothing but the second virial coefficient, which is clearly positive for all finite temperatures. Recall that at very high temperatures, the pressure expression is instead

1 P (T )= nk T 1+ nλ3 . → ∞ B 25/2 D   11.3 3D isotropic harmonic trap 11.3.1 Density of States Almost all of the spinpolarized fermionic atoms that have been cooled to ultralow temperatures have been trapped by magnetic fields or focused laser beams. The confining potentials are generally 3D harmonic traps. So let’s consider this case in more detail. You might be interested to note that Fermi’s original paper on fermionic particles considered this case, not the 3D box case above. As we saw previously, ignoring the zeropoint energy in each dimenion the eigenvalues (accessible energy states) are given by ǫ(nx,ny,nz)= nxhωx + nyhωy + nzhωz. In order to evaluate the various integrals, we first need to obtain the density of states per unit energy. A rough way to do this is to simply set k = n , so that ǫ2 = k2(hω )2 + k2(hω )2 + k2(hω )2 k2(hω)2, where ω = (ω ω ω )1/3 i i x x y y z z ≡ x y z is the mean frequency, and dki/ǫi = 1/hω. Because ki = ni now rather than ki = πni/L, the 3D density of states is given by k2 dk ǫ2 g(ǫ)= = . (11.11) 2 dǫ 2(hω)3

Another (more rigorous) way to obtain the density of states is to ask how many states G(ǫ) are enclosed in an energy surface of radius ǫ = ǫx + ǫy + ǫz. The result is

ǫ ǫ ǫx ǫ ǫx ǫy 3 1 − − − ǫ G(ǫ)= dǫx dǫy dǫz = 3 . hωxhωyhωz 6h ω ω ω Z0 Z0 Z0 x y z The density of states is then

dG(ǫ) ǫ2 ǫ2 g(ǫ)= = 3 = 3 , dǫ 2h ωxωyωz 2(hω) in agreement with the guess above.

Here are another couple of ways to see this, if we assume that the trap is isotropic (ωx = ωy = ωz). We know that the partition function for the onedimensional oscillator with energy levels given by PHYS 449 Course Notes 2009 114

ǫn =hωn, n =0, 1, 2,... (neglecting the zeropoint energy) is 1 Z = , x = exp( βhω). 1D 1 x − − Because the partition function for the sdimensional harmonic oscillator is the sfold product of the onedimensional partition functions, we can write

1 s(s 1) s(s 1)(s 2) Z = = 1+ sx + − x2 + − − x3 + ... sD (1 x)s 2 6 −   ∞ n + s 1 ∞ = xn = g(s,n)xn. (11.12) n − n=0 n=0 X   X So, the degeneracy of the sdimensional oscillator is

n + s 1 (n + s 1)! g(s,n)= − = − . n n!(s 1)!   −

For three dimensions (s = 3), g(n) = (n + 1)(n + 2)/2. This expression becomes simpler g(n′) = n′(n′ + 1)/2 if we make the replacement n′ = n +1 and count from 1 instead of zero. For large n′, 2 this becomes g(n′)= n′ /2, as found above.

Alternatively, the degeneracy for the three dimensional isotropic trap can be found by counting the number of ways we can distribute M distinct energy quanta into a common (degenerate) energy level, just like in the Planck case which we’ll see in the next chapter. We have WN = (N + M 1)!/[M!(N 1)!]. If M = 0 then W = 1, i.e. there is only one way to fill up each level; this− is − N equivalent to the onedimensional case (s = 1). If M = 1 then WN = N!/(N 1)! = N, i.e. there are N ways to fill up each level indexed by N; this is equivalent to s =2. If M−= 2 then there are WN = (N + 1)!/[2(N 1)!] = N(N + 1)/2, ways to fill up each level indexed by N; this is equivalent to s = 3 and clearly s−= M + 1 in this notation.

11.3.2 Low Temperatures

Armed with the density of states, we are in a position to calculate the finitetemperature expressions for the mean number, the mean energy, and the pressure for the 3D isotropic oscillator, as we did for the 3D case above. The mean number is

2 ∞ g ∞ g π N = g(ǫ)n(ǫ)dǫ = ǫ2n(ǫ)dǫ ǫ2dǫ + (k T )2 2(hω)3 ≈ 2(hω)3 3 B Z0 Z0 Z0  g 3 k T 2 = 1+ π2 B . (11.13) 6 hω " #     At zero temperature this expression defines the Fermi energy:

1/3 6N E = hω. (11.14) F g   PHYS 449 Course Notes 2009 115

Again, if we fix N at all temperatures then

g E 3 g 3 k T 2 N = F = 1+ π2 B 6 hω 6 hω " #       which can be inverted to yield π2 k T 2 E 1 B . (11.15) ≈ F − 3 E "  F  #

Next let’s calculate the mean energy:

2 ∞ g ∞ g π U = g(ǫ)ǫn(ǫ)dǫ = ǫ3n(ǫ)dǫ ǫ3dǫ + (k T )22 2(hω)3 ≈ 2(hω)3 2 B Z0 Z0 Z0  g 3 k T 2 = 1+2π2 B . (11.16) 8 hω " #     Substituting into Eq. (11.13) gives

2 2 kB T 2 3 1+2π 3 k T U N N 1+ π2 B .  2 ≈ 4 2 kB T ≈ 4 " # 1+ π     Finally, inserting the expression (11.15) gives

3 π2 k T 2 k T 2 3 2π2 k T 2 U NE 1 B 1+ π2 B NE 1+ B . ≈ 4 F − 3 E ≈ 4 F 3 E "  F  # "   # "  F  # The pressure follows directly from here as it did for the 3D case.

11.3.3 Spatial Profile

The above analysis doesn’t tell us anything about the spatial profile of the confined fermions. In principle, one must put a gi fermions in each available energy state ǫi. So one would need to take into consideration each level and its associated degeneracy, then calculate the appropriate spatial wavefunction to build up the total density profile. This is a lot of work! Thankfully, there is a better way to accomplish this task when the number of particles is large, so that a detailed knowledge of the singleparticle wavefunctions is not necessary. We can use the local density approximation, which assumes that the particles at any given region of the trap behave locally as if there was no external potential at all. In other words, the Fermi occupation factor would be expressed as 1 1 n(r, k)= . 2π3 exp β h2k2/2m + mω2r2/2 +1 − This means that the spatial density is given by 

\[ n(\mathbf{r}) = \int d^3k\, n(\mathbf{r},\mathbf{k}) = \frac{g(2m)^{3/2}}{4\pi^2\hbar^3}\int_0^\infty \frac{\sqrt{\epsilon}\,d\epsilon}{\exp\left[\beta\left(\epsilon + m\omega^2r^2/2 - \mu\right)\right] + 1}. \]

At sufficiently low temperatures we can approximate the Fermi occupation factor by a Θ-function, as discussed just below Eq. (10.3), and set μ ≈ E_F. Then we can write
\[ n(\mathbf{r}) = \frac{g(2m)^{3/2}}{4\pi^2\hbar^3}\int_0^{E_F - m\omega^2r^2/2}\sqrt{\epsilon}\,d\epsilon = \frac{g(2m)^{3/2}}{6\pi^2\hbar^3}\left(E_F - \frac{1}{2}m\omega^2r^2\right)^{3/2}. \]
The Fermi energy can then be used to define the radius of the cloud R through E_F = (1/2)mω²R², so one obtains
\[ n(\mathbf{r}) = \frac{g}{6\pi^2}\left(\frac{m\omega}{\hbar}\right)^3 R^3\left[1 - \left(\frac{r}{R}\right)^2\right]^{3/2}. \tag{11.17} \]
Using the definition of the Fermi energy (11.14), the radius of the cloud is

\[ R = \sqrt{\frac{2E_F}{m\omega^2}} = \sqrt{\frac{2\hbar}{m\omega}}\left(\frac{6N}{g}\right)^{1/6} = \sqrt{2}\left(\frac{6N}{g}\right)^{1/6}\ell, \tag{11.18} \]
where I have defined the harmonic oscillator length ℓ ≡ √(ħ/mω) (the meaning of this will be discussed below). So finally the density is

\[ n(\mathbf{r}) = \frac{2}{\pi^2\ell^3}\sqrt{\frac{gN}{3}}\left[1 - \left(\frac{r}{R}\right)^2\right]^{3/2}. \]
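Purely as a cross-check (my own, not part of the notes), the following sketch integrates the zero-temperature profile above over the cloud and confirms that it returns the total particle number N; the numbers chosen are arbitrary.

```python
import numpy as np
from scipy.integrate import quad

g, N, ell = 1.0, 1.0e6, 1.0                       # work in units of the oscillator length
R = np.sqrt(2.0)*(6.0*N/g)**(1.0/6.0)*ell         # Eq. (11.18)

def density(r):
    """Zero-temperature density profile of the trapped Fermi gas."""
    return (2.0/(np.pi**2*ell**3))*np.sqrt(g*N/3.0)*(1.0 - (r/R)**2)**1.5

total, _ = quad(lambda r: 4.0*np.pi*r**2*density(r), 0.0, R)
print(total/N)     # should be very close to 1
```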

Where does the length ℓ come from? For a one-dimensional harmonic oscillator the force is F_i = −kr_i î = −mω²r_i î, and the virial theorem (8.1) states that on average
\[ K = -\frac{1}{2}\left\langle\sum_i \mathbf{F}_i\cdot\mathbf{r}_i\right\rangle = \frac{m\omega^2}{2}\left\langle\sum_i r_i^2\right\rangle = \frac{m\omega^2\ell^2}{2}, \]
where ℓ² is a length scale describing the mean-square displacement of the oscillator. Meanwhile the kinetic energy term is
\[ \left\langle\sum_i\frac{p_i^2}{2m}\right\rangle = \frac{\hbar^2}{2m}\left\langle\sum_i k_i^2\right\rangle = \frac{\hbar^2}{2m\ell^2}, \]
where ℓ⁻² is the mean-square size of the oscillator in the wavevector representation.* Alternatively, you can think of the ℓ⁻² factor as simply ensuring the correct units. Equating the kinetic and potential contributions, as the virial theorem requires for a harmonic potential, gives ℓ⁴ = ħ²/m²ω², or ℓ = √(ħ/mω), as the characteristic size of the harmonic oscillator.

*The reason for this is that the particle distribution in the harmonic oscillator is a Gaussian, so its Fourier transform (the momentum-space distribution) is also a Gaussian, whose mean-square size is the inverse of that of the original Gaussian.

At high temperatures, the analysis above won't apply because many of the fermions below the Fermi energy will be excited into levels above the Fermi energy (cf. Fig. 10.1). In this case, the fermions will behave essentially as Boltzmannons (!). To find the density, we can either treat the potential as an external chemical potential, so that

\[ n(\mathbf{r}) = \frac{1}{\lambda_D^3}\exp\left[\frac{\mu - V_{\rm ext}(\mathbf{r})}{k_BT}\right] = \frac{1}{\lambda_D^3}\exp\left(\frac{\mu}{k_BT}\right)\exp\left(-\frac{x^2}{R_x^2}\right)\exp\left(-\frac{y^2}{R_y^2}\right)\exp\left(-\frac{z^2}{R_z^2}\right), \tag{11.19} \]

where R_i² = 2k_BT/mω_i² and the ω_i are the trapping frequencies in the x, y, and z directions. Alternatively, we could use the definition of the total number in terms of the canonical partition function, N = AZ, and then use the equipartition form for Z,

\[ N = AZ = \frac{1}{\lambda_D^3}\exp\left(\frac{\mu}{k_BT}\right)\int d^3r\,\exp\left(-\frac{x^2}{R_x^2}\right)\exp\left(-\frac{y^2}{R_y^2}\right)\exp\left(-\frac{z^2}{R_z^2}\right), \]
and then use the normalization condition N = ∫d³r n(r) to obtain the chemical potential. In any case we obtain the same thing. The effective volume of the cloud is V = (4π/3)R_i³ = (4π/3)(2k_BT/mω²)^{3/2}, assuming all frequencies are equal.

11.4 A Few Examples

Let's consider a few examples of fermions in familiar contexts. First and foremost, we need to determine whether we must treat them quantum mechanically or classically; in other words, we first determine whether the particles are quantum degenerate. If so, we can apply the results obtained above to calculate various properties of interest. Below are three examples of Fermi-degenerate systems whose temperatures span 14 orders of magnitude!

11.4.1 Electrons in Metals

Typical electron densities in metals are n = 10^29 m^-3. Using the mass of an electron, 9.109×10^-31 kg, the de Broglie wavelength at room temperature (300 K) is found to be λ_D = 4.303×10^-9 m. The mean interparticle separation is d = n^-1/3 = 2×10^-10 m. This means that λ_D ≫ d, and the electrons must be treated quantum mechanically. Another way to think about it is to use the Fermi temperature, T_F ≡ E_F/k_B, which must be larger than the actual temperature of the system for it to be quantum degenerate. Using the same parameters, I obtain E_F = 2×10^-18 J (or about 12 eV) and T_F = 1.5×10^5 K. This is of course much larger than room temperature. So the electrons have something like 500 times the energy of atoms in the room.

Let's calculate some other things while we're at it. The Fermi velocity corresponds to the mean velocity of fermions near the Fermi energy, and is defined as v_F = ħk_F/m, where k_F = √(2mE_F)/ħ is the Fermi wavevector. Putting in numbers gives v_F ≈ 2.1×10^6 m/s, which is pretty fast! The electron pressure is given by Eq. (11.10) at low temperatures. Since at room temperature k_BT/E_F ≈ 0.002, we are justified in neglecting the finite-temperature correction. The pressure is then P ≈ 8×10^10 N/m². This is phenomenally high, almost a million atmospheres! It's amazing that metals don't simply explode. . . . Or is it?
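For concreteness, here is a small Python sketch (my own, not from the notes) that reproduces the estimates above starting from n = 10^29 m^-3. The degeneracy factor g is left as a parameter; the values quoted in the text correspond to g = 1, and the low-temperature pressure is taken as the standard degenerate-gas result P = (2/5)nE_F, which I am assuming is the content of Eq. (11.10).

```python
import numpy as np

hbar, kB, me = 1.0546e-34, 1.381e-23, 9.109e-31   # SI units
n, T, g = 1.0e29, 300.0, 1.0                      # electron density, temperature, degeneracy

lam_D = np.sqrt(2.0*np.pi*hbar**2/(me*kB*T))      # thermal de Broglie wavelength
d = n**(-1.0/3.0)                                 # mean interparticle spacing

kF = (6.0*np.pi**2*n/g)**(1.0/3.0)                # Fermi wavevector
EF = hbar**2*kF**2/(2.0*me)                       # Fermi energy
TF = EF/kB                                        # Fermi temperature
vF = hbar*kF/me                                   # Fermi velocity
P = 0.4*n*EF                                      # degeneracy pressure, (2/5) n E_F

print(f"lambda_D = {lam_D:.2e} m, d = {d:.2e} m")
print(f"E_F = {EF:.2e} J, T_F = {TF:.2e} K, v_F = {vF:.2e} m/s, P = {P:.2e} Pa")
```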

11.4.2 Electrons in the Sun

The sun was already considered in the context of the virial theorem, Sec. 8.1. Recall that the mass of the sun is M_⊙ = 3×10^30 kg, and the average temperature is something like 5×10^6 K. The number of hydrogen atoms gives a reasonable estimate for the number of electrons. With the mass of hydrogen assumed to be 1.67×10^-27 kg, I obtain N_e = 1.8×10^57 electrons. The radius of the sun is 3×10^7 m, giving a density of 1.6×10^34 m^-3. The Fermi temperature is therefore T_F = 2.7×10^8 K, which is something like 50 times larger than the mean temperature. So the electrons in the sun are also quantum degenerate! Note that one can also define the velocity of electrons at the Fermi

surface, v_F ≡ ħk_F/m ≈ 10^8 m/s for electrons in the sun. This is already one-third of the velocity of light, so if the sun were much hotter we'd have to treat the electrons relativistically. In fact, if a star is heavier than about 1.4 M_⊙, then the electron degeneracy pressure will no longer be able to stabilize it against collapse. This so-called Chandrasekhar limit is quite easy to calculate when the electrons are treated relativistically, as is needed for hot white dwarfs. There is a similar 'Tolman–Oppenheimer–Volkoff' limit for neutron stars that are stabilized by the Fermi pressure associated with neutrons, in the range 1.4 M_⊙ ≲ M ≲ 3.5 M_⊙; the actual values are hard to compute because the strong force is really a pain to deal with.

11.4.3 Ultracold Fermionic Atoms in a Harmonic Trap

If we had a very good knowledge of the number of fermions in our trap, then we could use the Fermi energy (11.14) and compare it to k_BT to test if the atoms were in fact quantum degenerate. In many experiments, the number of atoms is N ∼ 10^6 and trapping frequencies are of order ω/2π ∼ 100 Hz. Then with g = 1 (the atoms are spin-polarized, which means that all the spins are pointing in the same direction), one obtains E_F ≈ 1.2×10^-29 J, or a Fermi temperature T_F ≡ E_F/k_B ≈ 1 µK. This is very cold! But these days, experiments are operating around 100 nK, or T = 0.1 T_F. So ultracold fermions in traps are in fact strongly quantum degenerate.
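The numbers quoted here follow directly from Eq. (11.14); a minimal sketch (my own addition) with the stated N, ω, and g:

```python
import numpy as np

hbar, kB = 1.0546e-34, 1.381e-23
N, g = 1.0e6, 1.0
omega = 2.0*np.pi*100.0                  # trap frequency, rad/s

EF = (6.0*N/g)**(1.0/3.0)*hbar*omega     # Eq. (11.14)
TF = EF/kB
print(f"E_F = {EF:.2e} J, T_F = {TF*1e6:.2f} microkelvin")   # ~1.2e-29 J, ~0.9 uK
```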

Another way to estimate the temperature required to reach Fermi degeneracy is to use the Boltzmann spatial profile (11.19). The density is n = 10^6/R_i³ ≈ 3×10^19 m^-3, giving a mean spacing between particles of d ≈ 3×10^-7 m. Fermi degeneracy is reached when λ_D = d, which in this case corresponds to T = T_F ∼ 10^-6 K, the same estimate we obtained above. Interestingly, the same density is found using the low-temperature Fermi distribution (11.17). For the parameters given above one obtains ℓ ≈ 1.6 µm and therefore R ≈ 22 µm. The volume is then 4πR³/3 ≈ 4.3×10^-14 m³, which gives a density of n ≈ 2×10^19 m^-3. This is almost identical to the much rougher estimate above, and so the qualitative results are the same.

Chapter 12

Bosons

That minus sign in the denominator of the Bose-Einstein distribution function really leads to totally different behaviour, as we'll now see.

12.1 Quantum Oscillators

Last term we obtained the partition function for the harmonic oscillator, and associated observables. To jog your memory, I'll repeat some of this derivation here. The Bohr-Sommerfeld quantization procedure yields the energy eigenvalues ε_n = nħω, which is off by only a constant factor. The exact expression is ε_n = ħω(n + 1/2), but the additional factor makes no contribution to any statistical properties.

One can then obtain the (canonical) partition function:
\[ Z = \sum_n \exp(-\epsilon_n/k_BT) = \sum_n \exp(-\beta\epsilon_n) = 1 + \exp(-\beta\hbar\omega) + \exp(-2\beta\hbar\omega) + \ldots. \]
But this is just a geometric series: if I make the substitution x ≡ exp(−βħω), then Z = 1 + x + x² + x³ + .... But I also know that xZ = x + x² + x³ + .... Since both Z and xZ have an infinite number of terms, I can subtract them and all terms cancel except the first: Z − xZ = 1, which immediately yields Z = 1/(1 − x), or
\[ Z = \frac{1}{1 - \exp(-\beta\hbar\omega)}. \tag{12.1} \]
Now I can calculate the mean energy:
\[ U = Nk_BT^2\frac{\partial\ln Z}{\partial T} = \frac{Nk_BT^2}{Z}\frac{\partial Z}{\partial T} = Nk_BT^2\left[1 - \exp(-\beta\hbar\omega)\right]\frac{\exp(-\beta\hbar\omega)}{\left[1 - \exp(-\beta\hbar\omega)\right]^2}\frac{\hbar\omega}{k_BT^2} = \frac{N\hbar\omega\exp(-\beta\hbar\omega)}{1 - \exp(-\beta\hbar\omega)} = \frac{N\hbar\omega}{\exp(\beta\hbar\omega) - 1} \tag{12.2} \]
\[ \equiv N\hbar\omega\,n(T), \qquad\text{where } n(T) = \frac{1}{\exp(\hbar\omega/k_BT) - 1} \text{ is the occupation factor.} \]
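A quick numerical sanity check (my addition): summing the geometric series directly and comparing with the closed forms (12.1) and (12.2). Units with ħω = k_B = 1 are assumed.

```python
import numpy as np

T = 0.7                       # temperature in units of hbar*omega/k_B
x = np.exp(-1.0/T)            # x = exp(-beta*hbar*omega)

Z_sum = sum(x**n for n in range(200))        # truncated geometric series
Z_closed = 1.0/(1.0 - x)                     # Eq. (12.1)

# mean energy per oscillator: sum of eps_n exp(-beta eps_n)/Z vs the closed form (12.2)
U_sum = sum(n*x**n for n in range(200))/Z_sum
U_closed = 1.0/(np.exp(1.0/T) - 1.0)

print(Z_sum, Z_closed)       # agree to machine precision
print(U_sum, U_closed)
```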

Notice that the occupation factor is in fact identical to the Bose-Einstein distribution function, Eq. (10.4), with the identification ε_i = ħω. There is in fact a very close connection between bosons and oscillators, which you might already have anticipated. For example, photons are quantized packets of light, but light is also a wave (i.e. an oscillating field). Photons are also bosons, having integer (unit) spin.

Einstein constructed a model of a solid in 1907, where he assumed that the atoms making up the solid were all able to oscillate independently around their equilibrium positions. (This is the 'Einstein solid' model discussed in Schroeder's textbook in Chapter 2.2.) The point was to understand experimental results obtained early in the 20th century, which showed that the heat capacity decreases rapidly at low temperatures. Assuming each oscillator can vibrate in three directions independently, the mean energy becomes
\[ U = \frac{3N\hbar\omega}{\exp(\hbar\omega/k_BT) - 1}. \]
The specific heat is then

\[ C_V = \frac{\partial U}{\partial T} = 3Nk_B\left(\frac{\hbar\omega}{k_BT}\right)^2\frac{\exp(\hbar\omega/k_BT)}{\left[\exp(\hbar\omega/k_BT) - 1\right]^2}. \]
At low temperature, the exponential term becomes large, so we obtain

\[ C_V(T\to 0) \approx 3Nk_B\left(\frac{\hbar\omega}{k_BT}\right)^2\exp\left(-\frac{\hbar\omega}{k_BT}\right), \]
which is indeed exponentially small at low temperature. This was a great early success of the quantum theory, because it was the only theory that worked, and it helped vindicate the idea of quantum mechanics. Unfortunately, better experiments subsequently showed that the heat capacity at low temperatures is in fact not exponential, but rather goes like T³. A better theory is evidently needed, which is discussed in the next section.
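To make the comparison concrete, here is a small sketch (my own addition) evaluating the Einstein heat capacity at a few low temperatures and comparing its decay with a T³ law; the Einstein temperature ħω/k_B is set to 1.

```python
import numpy as np

def cv_einstein(t):
    """Einstein heat capacity per 3N k_B at reduced temperature t = k_B T/(hbar omega)."""
    x = 1.0/t
    return x**2*np.exp(x)/(np.exp(x) - 1.0)**2

for t in [0.2, 0.1, 0.05]:
    print(t, cv_einstein(t), t**3)
# The Einstein result falls off exponentially, far faster than the observed T^3 behaviour.
```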

12.2 Phonons

In fact the atoms in a crystal are not able to oscillate completely independently, because the bonds from site to site are by definition strong. A better model, originally proposed by Debye in 1912, is to imagine a regular array of N masses m connected to each other by springs of length a with spring constant K. Suppose that the coordinate η_j describes the displacement of the mass at site j. The Hamiltonian (energy function) for the system is then
\[ H = \sum_j \frac{1}{2}\left[m\dot\eta_j^2 + K\left(\eta_{j+1} - \eta_j\right)^2\right]. \]
The force can be obtained by using
\[ F_i = -\frac{\partial V}{\partial\eta_i} = -\frac{\partial}{\partial\eta_i}\sum_j\frac{K}{2}\left(\eta_{j+1} - \eta_j\right)^2 = -K\sum_j\left[\left(\eta_{j+1} - \eta_j\right)\delta_{i,j+1} - \left(\eta_{j+1} - \eta_j\right)\delta_{i,j}\right] = -K\left(2\eta_i - \eta_{i+1} - \eta_{i-1}\right), \]
so that the equations of motion become

\[ m\ddot\eta_j - K\left(\eta_{j+1} + \eta_{j-1} - 2\eta_j\right) = 0. \]

With equilibrium positions identified as x = ja, one can posit that the solutions are travelling waves, η_j ∝ exp[i(kx − ωt)]. Imposing periodic boundary conditions η_0 ≡ η_N requires 1 = exp(ikNa), which implies kNa = 2πn, or k = (2πn/N)(1/a). These crystal momenta are in fact identical to the quantized momenta seen for the particle in a box, k = 2πn/L, except that now we identify the total length L = Na with the product of the number of sites and the lattice spacing. Inserting the travelling-wave solution into the equation of motion gives
\[ -m\omega^2 - K\left(e^{ika} + e^{-ika} - 2\right) = 0. \]
This can be simplified by noting that
\[ \sin^2\left(\frac{ka}{2}\right) = -\frac{1}{4}\left(e^{ika/2} - e^{-ika/2}\right)^2 = -\frac{1}{4}\left(e^{ika} + e^{-ika} - 2\right). \]
We then obtain the dispersion relation for phonons,
\[ \omega^2(k) = \frac{4K}{m}\sin^2\left(\frac{ka}{2}\right). \tag{12.3} \]
For small wavevector k (momentum), or long wavelength, the frequency is linear: ω(k→0) ≈ 2√(K/m)|k|a/2 = √(K/m)|k|a. This is the same as the dispersion relation for a regular wave, ω = ck, where c = a√(K/m) is the wave velocity. It is quite different from the free-particle dispersion ε ∝ k² encountered in the context of the 3D box.
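A short sketch (my addition) comparing the full phonon dispersion (12.3) with its long-wavelength linear limit ω = ck, using arbitrary values for K, m, and a:

```python
import numpy as np

K, m, a = 1.0, 1.0, 1.0                       # arbitrary spring constant, mass, spacing
c = a*np.sqrt(K/m)                            # long-wavelength sound speed

def omega(k):
    """Full dispersion relation, Eq. (12.3)."""
    return 2.0*np.sqrt(K/m)*np.abs(np.sin(0.5*k*a))

for k in [0.01, 0.1, 1.0, np.pi/a]:
    print(k, omega(k), c*k)    # the two agree for small k and differ near the zone edge
```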

In the canonical ensemble, the mean energy is given by
\[ U = \sum_k \hbar\omega_k n_k = \frac{V}{2\pi^2}\int_0^\infty k^2\,dk\,\frac{\hbar\omega(k)}{\exp\left[\hbar\omega(k)/k_BT\right] - 1}. \]
You might wonder why I blithely used the Bose distribution factor here. In principle, there are only two kinds of distributions that I am allowed to use and still be consistent with quantum mechanics: Bose or Fermi. Because there is no restriction on how many vibrational quanta can occupy a given mode, it is natural to use the Bose one.

Nevertheless, it will be hard to evaluate this expression using the full dispersion relation (12.3) obtained above. So instead let's assume that at low temperatures only the lowest-energy states will be excited, so that we can make the identification ω(k) ≈ ck. The energy then becomes
\[ U \approx \frac{3V}{2\pi^2}\int_0^\infty \frac{\hbar c\,k^3\,dk}{\exp(\hbar ck/k_BT) - 1} = \frac{3V}{2\pi^2}\left(\frac{k_BT}{\hbar c}\right)^3 k_BT\int_0^\infty\frac{x^3\,dx}{e^x - 1}, \tag{12.4} \]
where the factor of 3 comes from the three different polarizations: two transverse ones like the photon, and one longitudinal. There are 3N phonon modes in total; a one-dimensional chain with N sites has N modes, and in 3D there are 3N. The last integral is related to a couple of special functions:
\[ \int_0^\infty\frac{x^n\,dx}{e^x - 1} = \zeta(n+1)\,\Gamma(n+1), \]
where Γ(n) is Euler's gamma function and the zeta function was defined earlier (11.5). For n = 3 one obtains Γ(4)ζ(4) = π⁴/15, so that the heat capacity at low temperature becomes
\[ C_V(T\to 0) \approx \frac{\partial}{\partial T}\left[\frac{3V}{30\pi^2}\left(\frac{\pi k_BT}{\hbar c}\right)^4\hbar c\right] = \frac{2\pi^2}{5}Vk_B\left(\frac{k_BT}{\hbar c}\right)^3. \]

This now has the desired T³ dependence, due to the phonons. Because nothing was assumed in this derivation except for the fact that the dispersion was linear, the same result is true for any system characterized by travelling waves.

In metals at very low temperatures the T³ contribution becomes very small, and the heat capacity instead goes like T due to the electrons in the crystal, as discussed in Section 11.2. In insulators the electrons are not mobile and this contribution doesn't exist.

At higher temperatures, one needs to worry about higher-energy modes in the phonon distribution function. In principle, we need to include the full k-dependence of the phonon dispersion relation (12.3), but there is another problem: if the integrals run over arbitrarily large wavevectors, we count far more than the 3N modes that actually exist. But we have overlooked an important fact: phonons with wavelengths shorter than the lattice spacing cannot exist. So there is in fact a natural cutoff at high frequency (short wavelength) that removes this problem in practice.

This cutoff is called the Debye frequency ω_D, and it is determined using the linear, low-temperature form of the dispersion relation. With ω = ck, the 3D density of states becomes
\[ 3\,\frac{Vk^2\,dk}{2\pi^2} = \frac{3V\omega^2\,d\omega}{2\pi^2c^3}, \]
where the factor of 3 again comes from the polarizations. So
\[ 3N = \int_0^{\omega_D}\frac{3V\omega^2}{2\pi^2c^3}\,d\omega = \frac{V\omega_D^3}{2\pi^2c^3}, \]
which defines the Debye frequency

\[ \omega_D \equiv \left(\frac{6\pi^2c^3N}{V}\right)^{1/3} = c\left(6\pi^2 n\right)^{1/3}. \tag{12.5} \]
The Debye temperature is then defined as θ_D ≡ ħω_D/k_B = (ħc/k_B)(6π²n)^{1/3}.

Let's use this to rederive the low-temperature heat capacity of the phonon gas, but this time let's use the grand partition function, just for fun. When the chemical potential is zero, the free energy coincides with the grand potential (10.5):

\[ F = -k_BT\ln\Xi = k_BT\sum_i g_i\ln\left\{1 - \exp(-\beta\epsilon_i)\right\} = \frac{3k_BTV}{2\pi^2c^3}\int_0^{\omega_D}\omega^2\ln\left[1 - e^{-\hbar\omega/k_BT}\right]d\omega. \]
Setting x = ħω/k_BT gives

\[ F = \frac{3Vk_BT}{2\pi^2}\left(\frac{k_BT}{\hbar c}\right)^3\int_0^{\theta_D/T}x^2\ln\left(1 - e^{-x}\right)dx = -9Nk_BT\left(\frac{T}{\theta_D}\right)^3\frac{\pi^4}{45}, \]
where the value of the integral is obtained under the assumption θ_D/T ≳ 20; at higher temperatures it will be smaller, but then the linear approximation to the dispersion relation will not be applicable anyway. The specific heat is then found to be
\[ C_V = -T\frac{\partial^2F}{\partial T^2} = \frac{12\pi^4}{5}Nk_B\left(\frac{T}{\theta_D}\right)^3, \]
which is easily shown to be identical to the expression found earlier.
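A numerical sanity check (my own addition): the integral ∫₀^∞ x² ln(1 − e^{−x}) dx should equal −π⁴/45, and the resulting low-temperature coefficient of Nk_B(T/θ_D)³ in C_V should be 12π⁴/5 ≈ 234.

```python
import numpy as np
from scipy.integrate import quad

val, _ = quad(lambda x: x**2*np.log(1.0 - np.exp(-x)), 0.0, 60.0)
print(val, -np.pi**4/45.0)            # both approximately -2.1646

# coefficient of N k_B (T/theta_D)^3 in the low-temperature specific heat
print(12.0*np.pi**4/5.0)              # approximately 233.8
```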


Figure 12.1: The blackbody distribution u(ν) is plotted as a function of the frequency ν.

12.3 Blackbody Radiation

A blackbody is a body that looks completely black when it is cold, i.e. it absorbs light perfectly at all wavelengths. Conversely, these same bodies when heated emit light at all wavelengths, which might make their ‘blackbody’ moniker somewhat confusing. Of course, there is no real material that is a perfect blackbody, but there are several systems that come very close. One obvious one is coal. Perhaps the best current one is actually the cosmic background radiation, believed to be a relic of the Big Bang. As the universe has expanded (effectively adiabatically), the temperature has cooled such that the current temperature is on the order of a few degrees Kelvin. Alternatively, the fabric of spacetime has stretched to such an extent that the radiation emitted from the Big Bang (more or less isotropically) has become severely redshifted.

The distribution of wavelengths of light λ (or frequencies ν = c/λ) from a blackbody was investigated extensively in the 19th century, and was found to be a universal curve characterized by the temperature. This is plotted in Fig. 12.1. As a model of a natural blackbody, experimentalists studied a small 3D box (cavity) with a hole in it. The heat radiation absorbed by the box is transformed into light inside, which was considered to rattle around inside, coming to equilibrium, and eventually escaping out of the small hole. It was found that the amount of radiation escaping from the hole was proportional to the area of the hole A, and was related to the temperature through the Stefan-Boltzmann law:
\[ \frac{dQ}{dt} = A\sigma T^4, \]
with σ ≈ 5.67×10^-8 W/m²K⁴. There was no microscopic explanation for this result, however, nor did anyone know the expression for the distribution of wavelengths. In 1896 Wilhelm Wien suggested
\[ u(\lambda) = \frac{c_1}{\lambda^5}e^{-c_2/\lambda T}, \]
which worked well at short wavelengths but terribly at long wavelengths.

The Rayleigh-Jeans theory (1905) attempted to do a better job in deriving the blackbody distribution. They started by assuming that the cavity walls contained charged particles that oscillated about their equilibrium positions. Using the equipartition theorem, they proposed that each oscillator has k_BT of energy. Assuming a 3D cubic cavity of length L, the density of states is g(k)dk = (L³/π²)k²dk (the factor of two relative to the usual L³k²dk/2π² comes from the fact that light has two polarization directions). Using the relationship between wavevector and wavelength for photons, k = 2π/λ, one has k² = 4π²/λ² and dk = −(2π/λ²)dλ, so that k²dk = −(8π³/λ⁴)dλ. The density of states is then g(λ)dλ = (8πL³/λ⁴)dλ. The distribution is then
\[ u(\lambda) = \frac{8\pi k_BT}{\lambda^4}. \]
Unfortunately this does not agree well with the observed distribution, because there is too much radiation implied at short wavelengths (λ → 0): you would be microwaved looking at a burning log! Even worse, the total energy density is infinite:

\[ u = \int_0^\infty u(\lambda)\,d\lambda = \infty, \]
whereas it should go like T⁴. This was called the 'ultraviolet catastrophe' by Ehrenfest in 1911, and it is because the integral diverges in the ultraviolet (short-wavelength) limit.

To get around these problems, Planck in 1900 suggested a different form for the distribution function. His reasoning went as follows. Suppose that there are N oscillators (electrons) in the walls of the cavity vibrating at frequency ν. The total energy is U_N = NU and the total entropy is S_N = NS = k_B ln(Ω) (we are clearly in the microcanonical ensemble here!). How can one distribute the energy U_N among the N oscillators? Suppose the energy is made up of discrete elements, U_N = Mε, where M ≫ 1. (This is the quantum hypothesis.) Then the number of ways of distributing M indistinguishable energy elements among N distinguishable oscillators is
\[ \Omega = \frac{(N - 1 + M)!}{M!\,(N-1)!}, \]
which we also derived in Eq. (11.12). You will now also recognize this as the number of ways of arranging indistinguishable bosons, Eq. (9.1). The Boltzmann entropy is then
\[ S_N = k_B\ln\Omega = Nk_B\left[\left(1 + \frac{M}{N}\right)\ln\left(1 + \frac{M}{N}\right) - \frac{M}{N}\ln\frac{M}{N}\right], \]
so that per oscillator, using M/N = U/ε,
\[ S = k_B\left[\left(1 + \frac{U}{\epsilon}\right)\ln\left(1 + \frac{U}{\epsilon}\right) - \frac{U}{\epsilon}\ln\frac{U}{\epsilon}\right]. \]
This is strongly reminiscent of the Bose-Einstein entropy (10.7). Now, using the microcanonical definition of the temperature gives

\[ \frac{1}{T} = \left(\frac{\partial S}{\partial U}\right)_V = k_B\left[\frac{1}{\epsilon}\ln\left(1 + \frac{U}{\epsilon}\right) + \frac{1}{\epsilon} - \frac{1}{\epsilon}\ln\frac{U}{\epsilon} - \frac{1}{\epsilon}\right] = \frac{k_B}{\epsilon}\ln\left(\frac{\epsilon}{U} + 1\right). \]
Inverting gives
\[ U = \frac{\epsilon}{\exp\left(\epsilon/k_BT\right) - 1}, \]
which is just the same result I obtained previously for a set of N oscillators, Eq. (12.2), except that in that case I used the canonical ensemble.

Planck then connected the frequencies of oscillation to the energies of the (light) standing modes in the cavity. He used the classical wave equation for the light modes,
\[ \frac{1}{c^2}\frac{\partial^2 y}{\partial t^2} = \nabla^2 y, \]
and assumed that the light had to vanish at the walls, so that y(x,y,z) = A sin(k_x x) sin(k_y y) sin(k_z z) with k_n = nπ/L_n. Inserting into the wave equation gives ∂²y/∂t² = −c²k²y = −ω²(k)y, or ω(k) = ck, a linear spectrum just as found for phonons. Then with the final assumption that ε(k) ∝ ck, or ε = ħck, one obtains
\[ U(k) = \frac{\hbar ck}{\exp\left(\hbar ck/k_BT\right) - 1} \quad\Rightarrow\quad U(\lambda) = \frac{hc/\lambda}{\exp\left(\frac{hc}{\lambda k_BT}\right) - 1}, \]
with the identification k = 2π/λ. Combining this with the density of states found in the Rayleigh-Jeans theory gives the full Planck distribution function
\[ u(\lambda) = \frac{8\pi hc}{\lambda^5}\,\frac{1}{\exp\left(\frac{hc}{\lambda k_BT}\right) - 1}. \]
This fit the experimental data perfectly! For short wavelengths λ → 0 it reproduces Wien's prediction, u(λ→0) ≈ (8πhc/λ⁵)e^{-hc/k_BTλ}. At long wavelengths one obtains u(λ→∞) ≈ 8πk_BT/λ⁴ → 0, which is the same as the Rayleigh-Jeans result.
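To see the limits explicitly, here is a small sketch (my addition) evaluating the Planck distribution alongside the Wien and Rayleigh-Jeans forms at short and long wavelengths, for an arbitrarily chosen temperature.

```python
import numpy as np

h, c, kB = 6.626e-34, 2.998e8, 1.381e-23
T = 5000.0                                    # arbitrary temperature in kelvin

def planck(lam):
    return 8.0*np.pi*h*c/lam**5/(np.exp(h*c/(lam*kB*T)) - 1.0)

def wien(lam):
    return 8.0*np.pi*h*c/lam**5*np.exp(-h*c/(lam*kB*T))

def rayleigh_jeans(lam):
    return 8.0*np.pi*kB*T/lam**4

for lam in [1e-7, 1e-6, 1e-4]:                # short, intermediate, long wavelengths (m)
    print(lam, planck(lam), wien(lam), rayleigh_jeans(lam))
# Wien tracks Planck at short wavelengths; Rayleigh-Jeans tracks it at long wavelengths.
```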

It is straightforward to calculate the total energy density of the radiation:

\[ u \equiv \int_0^\infty u(\lambda)\,d\lambda = \int_0^\infty\frac{8\pi hc}{\lambda^5}\,\frac{d\lambda}{\exp\left(\frac{hc}{\lambda k_BT}\right) - 1}. \]
With the replacement z = hc/λk_BT, then dλ = −(hc/k_BTz²)dz and
\[ u = 8\pi\frac{(k_BT)^4}{(hc)^3}\int_0^\infty\frac{z^3}{e^z - 1}\,dz. \]
The integral we have seen before, in Eq. (12.4), so the result is
\[ u = \frac{8\pi^5}{15}\frac{(k_BT)^4}{(hc)^3}. \]
This is both finite (good thing!) and also proportional to T⁴, in accordance with the Stefan-Boltzmann law. The constant in front,
\[ a = \frac{8\pi^5k_B^4}{15(hc)^3}, \]
is proportional to the Stefan-Boltzmann constant (the flux out of the hole is proportional to the energy density, with a = 4σ/c), which allows one to obtain the ratio h³/k_B⁴ experimentally. Furthermore, the maximum of the distribution occurs when hc/λk_BT ≈ 4.965, which allows one to obtain the ratio h/k_B experimentally. These two facts together uniquely determine both Planck's and Boltzmann's constants, neither of which was known prior to this work.
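Two quick numerical checks (my own): the value of the integral used above, and the location of the Planck maximum that fixes the constant 4.965 quoted above.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

val, _ = quad(lambda z: z**3/(np.exp(z) - 1.0), 0.0, 80.0)
print(val, np.pi**4/15.0)                     # both approximately 6.4939

# Maximum of u(lambda): d/dx [x^5/(e^x - 1)] = 0 with x = hc/(lambda k_B T),
# which reduces to 5(1 - e^{-x}) = x
f = lambda x: 5.0*(1.0 - np.exp(-x)) - x
print(brentq(f, 1.0, 10.0))                   # approximately 4.965
```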

This is the end of the story, more or less. It is interesting to note, though, that while Planck quantized the vibrational levels in the walls of the cavity, he used this only as a 'trick' to get a sensible answer. At the time he didn't believe that the energies were really discrete. He certainly didn't seem to realize that his theory also implied the discreteness of the light energies. The theory was developed earlier than (and therefore independently of) Einstein's theory of the photoelectric effect, where the quantization of light was key. In fact, the relationship between oscillators and bosons was really only tied together properly much later (in 1924) by Bose. But this is another interesting story, discussed in the next section.

12.4 Bose-Einstein Condensation

The prediction of Bose-Einstein condensation (BEC) in 1924 remained more or less a curiosity in physics for about 70 years, until gaseous BECs were first produced in the laboratory. In the meantime, it formed the theoretical foundation for phenomena such as superfluidity in liquid helium and superconductivity, and led to the development of the laser; but in reality the BEC of noninteracting particles envisaged by Bose and Einstein is very different from the phenomena in these condensed matter systems. BEC is essentially what happens when the temperature is lowered sufficiently that the bosons reach quantum degeneracy (that is, the de Broglie wavelength approaches the mean interparticle distance).

The basic problem with attaining a gaseous BEC is that when gases get cold enough, they condense into very ordinary liquids, and not into BECs. The trick that took 70 years is to try to keep the particles from ordinary condensation before reaching BEC. It turned out that spin-polarized gases were ideal candidates, because the spin-spin interaction is van der Waals-like, which means only very weakly attractive when the gases are sufficiently dilute. But if they are very dilute, it means that the mean separation is very large, which means ridiculously cold temperatures are needed to achieve BEC (on the order of nanokelvins). The achievement of BEC followed years of experimental progress on trapping and cooling of atoms with lasers, which took place mostly in the '70s. This is a fascinating story, and if these notes ever become a book, I'll tell it in detail!

12.4.1 BEC in 3D

The mean number of atoms in a 3D box is
\[ N = \sum_i\frac{1}{\exp[\beta(\epsilon_i - \mu)] - 1} = \frac{V}{2\pi^2}\int_0^\infty\frac{k^2\,dk}{\exp[\beta(\epsilon_k - \mu)] - 1}. \]
Making the substitution x = (ħ²/2mk_BT)k² gives the mean density N/V:

\[ n = \frac{1}{4\pi^2}\left(\frac{2mk_BT}{\hbar^2}\right)^{3/2}\int_0^\infty\frac{\sqrt{x}\,dx}{z^{-1}e^x - 1}, \]
where z ≡ exp(βμ) is the fugacity, as seen previously. Now, the mean density is fixed. So as the temperature decreases, clearly the integral also needs to increase in order to keep up. The only thing depending on the temperature is the fugacity, which appears in the denominator: exp(−βμ) needs to keep decreasing, which means that μ needs to keep increasing (i.e. approach zero from below). But recall that the chemical potential for the Bose case can never exceed the lowest accessible energy level ε₀; otherwise the occupation of a given level could be negative. At some point as the temperature drops, μ will hit zero, and the integral will no longer be able to properly count particles.

The particles will suddenly start disappearing! This is the phenomenon of Bose-Einstein condensation, and the temperature at which this occurs is called the Bose-Einstein condensation (BEC) transition temperature, T_c.

Where have they gone? Notice that the integrand is proportional to √x ∝ k. The lowest accessible energy level, ε_{k=0} = 0, isn't strictly included in the integral! The particles that appear to be disappearing are simply piling up in the ground state of the system, the zero-energy k = 0 state. The density of atoms in the ground state is denoted by n₀. If the total density n is known (or assumed), then the number of atoms in the ground state can only be inferred from the number that we can count: n₀ = n − n′, where n′ is the value of the integral above.

It's easy to calculate the value of T_c: simply set μ = 0, or z = 1. Then we can use the handy integral
\[ J_{\nu-1} \equiv \int_0^\infty\frac{x^{\nu-1}\,dx}{e^x - 1} = \Gamma(\nu)\,\zeta(\nu). \]
In our case, ν = 3/2, with ζ(3/2) ≈ 2.612 and Γ(3/2) = (1/2)Γ(1/2) = √π/2. So for a 3D box, the BEC transition temperature occurs when
\[ n = \frac{1}{4\pi^2}\left(\frac{2mk_BT_c}{\hbar^2}\right)^{3/2}\Gamma\left(\tfrac{3}{2}\right)\zeta\left(\tfrac{3}{2}\right). \]
Inverting this gives a simple relationship between the transition temperature and the particle density:

\[ k_BT_c \approx 3.31\,\frac{\hbar^2}{m}\,n^{2/3}. \tag{12.6} \]
How does the BEC temperature relate to the quantum degeneracy temperature, at which the de Broglie wavelength becomes comparable to the interparticle separation, λ_D = d? We have √(2πħ²/mk_BT_d) = (1/n)^{1/3}, or ħ²n^{2/3}/mk_BT_d = 1/2π. Thus T_d/T_c = 2π/3.31, which means that the degeneracy and BEC transition temperatures are basically the same, apart from some factors of order unity. Alternatively, the BEC condition is that the phase-space density be larger than 2.612:

\[ n\lambda_D^3 \geq 2.612. \]
What is the fraction of atoms in the condensate for temperatures T < T_c? Setting μ = 0, the non-condensed density is

\[ n' = \frac{1}{4\pi^2}\left(\frac{2mk_BT}{\hbar^2}\right)^{3/2}\Gamma\left(\tfrac{3}{2}\right)\zeta\left(\tfrac{3}{2}\right) = n\left(\frac{T}{T_c}\right)^{3/2}. \]
Together with n₀ = n − n′, we obtain
\[ \frac{n_0}{n} = 1 - \frac{n'}{n} = 1 - \left(\frac{T}{T_c}\right)^{3/2}. \tag{12.7} \]
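As an illustration (my own addition, with an arbitrarily chosen atomic mass and density meant to be typical of a dilute alkali gas), the sketch below evaluates Eq. (12.6) and the condensate fraction (12.7):

```python
import numpy as np

hbar, kB = 1.0546e-34, 1.381e-23
m = 1.44e-25                 # assumed atomic mass (roughly that of rubidium-87), kg
n = 1.0e20                   # assumed density, m^-3

Tc = 3.31*hbar**2*n**(2.0/3.0)/(m*kB)        # Eq. (12.6)
print(f"T_c = {Tc*1e9:.0f} nK")              # a few hundred nanokelvin

for T in [0.25*Tc, 0.5*Tc, 0.9*Tc]:
    print(T/Tc, 1.0 - (T/Tc)**1.5)           # condensate fraction, Eq. (12.7)
```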

12.4.2 BEC in Lower Dimensions

Is there a BEC transition in 2D or 1D systems? The issue is somewhat interesting, and relevant to modern physics, so let's explore it briefly. The mean particle number is
\[ N = \int_0^\infty\frac{g(\epsilon)\,d\epsilon}{\exp[\beta(\epsilon - \mu)] - 1}, \]

where in two dimensions the density of states g_2D(ε) = Am/2πħ² is independent of energy, and in one dimension it is given by g_1D(ε) = √(m/8π²ħ²)(L/√ε). In two dimensions, we obtain
\[ N_{2D} = \frac{mA}{2\pi\hbar^2}\int_0^\infty\frac{d\epsilon}{z^{-1}\exp(\beta\epsilon) - 1} = \frac{mAk_BT}{2\pi\hbar^2}\int_0^\infty\frac{dx}{z^{-1}e^x - 1}\quad(\text{substituting } x = \beta\epsilon) \]
\[ = \frac{A}{\lambda_D^2}\Big[\ln\left(1 - ze^{-x}\right)\Big]_0^\infty = -\frac{A}{\lambda_D^2}\ln(1 - z). \]

What happens when z → 1⁻, i.e. when μ → 0⁻? Then
\[ \lim_{\mu\to 0}n_{2D} = \frac{1}{\lambda_D^2}\ln\left(\frac{k_BT}{|\mu|}\right), \]
which diverges logarithmically! So it would seem that the non-condensed density can grow without bound as μ → 0⁻ at any finite temperature, implying that BEC is impossible. But there is a loophole: BEC can happen at exactly zero temperature. When μ approaches zero as the temperature approaches zero, the ratio μ/k_BT actually remains well-defined and finite. The transition is called the Kosterlitz-Thouless transition. The BEC is pure at T = 0; as the temperature increases, vortex-antivortex pairs are spontaneously produced, destroying the nice properties of the BEC.
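A one-line numerical check (my addition) of the 2D integral used above:

```python
import numpy as np
from scipy.integrate import quad

z = 0.9                                            # fugacity, 0 < z < 1
val, _ = quad(lambda x: 1.0/(np.exp(x)/z - 1.0), 0.0, 60.0)
print(val, -np.log(1.0 - z))                       # both approximately 2.3026
```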

In one dimension, the story is more or less the same. The mean density is given by

\[ n_{1D} = \sqrt{\frac{m}{8\pi^2\hbar^2}}\int_0^\infty\frac{d\epsilon}{\sqrt{\epsilon}\left[z^{-1}\exp(\beta\epsilon) - 1\right]}. \]
The same substitution x = βε, and also z ≡ e^{−α}, give
\[ n_{1D} = \sqrt{\frac{mk_BT}{2\pi\hbar^2}}\,\frac{1}{\sqrt{4\pi}}\int_0^\infty\frac{dx}{\sqrt{x}\left[\exp(x + \alpha) - 1\right]} \approx \frac{1}{\lambda_D}\,\frac{1}{\sqrt{4\pi}}\int_0^\infty\frac{x^{-1/2}\,dx}{x + \alpha}, \]
where in the last line I have recognized that the dominant contribution to the integral comes from the singular portion in the vicinity of x = 0. This last integral is easy to evaluate, giving

\[ n_{1D} = \frac{1}{\lambda_D}\,\frac{1}{\sqrt{4\pi}}\,\frac{\Gamma\left(\tfrac{1}{2}\right)^2}{\sqrt{\alpha}} = \frac{1}{\lambda_D}\sqrt{\frac{\pi}{4\alpha}}. \]
Clearly, this diverges for α → 0⁺, or μ → 0⁻, and so there is no BEC in 1D at finite temperature either. In fact, μ/k_BT also diverges, so there is no BEC at any temperature in one dimension.

Actually, the absence of BEC in one and two dimensions at finite temperature is only true in infinite systems. I know that I've been considering particles in various boxes where the length has been well defined, but really it isn't. The lowest energy state always has k = 0, or k = πn/L with n = 0. This means that the lowest energy state has a characteristic length scale proportional to k⁻¹ = ∞. In 1D and 2D, this gives rise to a so-called infrared divergence in the perturbation-theory expansion of the gas, and therefore the singular behaviour. This is the same 'infrared catastrophe' that you might have heard of in the context of quantum electrodynamics (QED), and it's no surprise: the mediators of the electromagnetic field are photons, which have Bose-Einstein statistics. The problem goes away when the bosons are truly confined in finite volumes, such as in harmonic oscillator traps, discussed in the next subsection. In this case, BEC persists at finite temperatures.

12.4.3 BEC in Harmonic Traps

The spin-polarized atomic gases that have been coaxed into BEC have all been trapped in 3D harmonic potentials, just like the ultracold fermions. So we can use some of the results found in that case, most importantly the density of states (11.11). Assuming that the energy level spacing is much smaller than the temperature (which isn't really true for current experiments, but you have to start somewhere!), the mean number of atoms is

\[ N = \frac{1}{2(\hbar\omega)^3}\int_0^\infty\frac{\epsilon^2\,d\epsilon}{\exp[\beta(\epsilon - \mu)] - 1}. \]
The BEC critical temperature is found by setting μ = 0, as for the box (if we had kept the zero-point motion, then we would set μ = ħ(ω_x + ω_y + ω_z)/2 rather than zero). With the substitution x = βε, the mean number becomes:

\[ N = \frac{1}{2}\left(\frac{k_BT_c}{\hbar\omega}\right)^3\int_0^\infty\frac{x^2\,dx}{e^x - 1} = \frac{1}{2}\left(\frac{k_BT_c}{\hbar\omega}\right)^3\Gamma(3)\,\zeta(3), \]
where Γ(3) = 2 and ζ(3) ≈ 1.202. Inverting gives the transition temperature as a function of the number of atoms:
\[ k_BT_c \approx 0.94\,\hbar\omega\,N^{1/3}. \]
Notice the different power law compared to that for a BEC in a 3D box, Eq. (12.6). The temperature dependence of the condensate number is then

\[ \frac{N_0}{N} = 1 - \left(\frac{T}{T_c}\right)^3, \]
which again differs from the 3D box case, Eq. (12.7).
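For the same hypothetical trap parameters used earlier (N ∼ 10^6 atoms, ω/2π ∼ 100 Hz), a short sketch (my addition) of the trap transition temperature and condensate fraction:

```python
import numpy as np

hbar, kB = 1.0546e-34, 1.381e-23
N, omega = 1.0e6, 2.0*np.pi*100.0

Tc = 0.94*hbar*omega*N**(1.0/3.0)/kB          # harmonic-trap result above
print(f"T_c = {Tc*1e9:.0f} nK")               # roughly 450 nK

for T in [0.5*Tc, 0.9*Tc]:
    print(T/Tc, 1.0 - (T/Tc)**3)              # condensate fraction in the trap
```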

The first test that the atoms truly had formed a BEC was that the atomic cloud showed a bimodal distribution. That is, the non-condensed atoms formed a cloud that looked more or less classical, and was identical to the semiclassical (equipartition) profile of the Fermi gas (11.19):

\[ n'(\mathbf{r}) = \frac{1}{\lambda_D^3}\exp\left[\frac{\mu - V_{\rm ext}(\mathbf{r})}{k_BT}\right] = \frac{1}{\lambda_D^3}\exp\left(\frac{\mu}{k_BT}\right)\exp\left(-\frac{x^2}{R_x^2}\right)\exp\left(-\frac{y^2}{R_y^2}\right)\exp\left(-\frac{z^2}{R_z^2}\right), \]
where R_i² = 2k_BT/mω_i². On the other hand, the atoms in the BEC all occupy the lowest energy state of the 3D harmonic potential. This is the solution of the 3D Schrödinger equation, and I won't bore you with the details, but give you the answer instead:
\[ n_0(\mathbf{r}) = \frac{N_0}{\pi^{3/2}a_xa_ya_z}\exp\left(-\frac{x^2}{a_x^2} - \frac{y^2}{a_y^2} - \frac{z^2}{a_z^2}\right), \]
where a_i = √(ħ/mω_i). The prefactor guarantees that ∫d³r n₀(r) = N₀. Now, it is clear that both the BEC and non-condensate densities are Gaussian, but the length scales over which the Gaussians vary are very different. The length scales for the condensate depend only on the trap parameters, and are therefore quite small (of order microns); the length scales for the non-condensate depend on temperature, and for T ∼ T_c are on the order of tens of microns. To sum up: because of the big N₀ factor in the BEC density, the total density looks like a sharp, small spike in the centre of an extended non-condensate density. This bimodal distribution can easily be seen simply by taking a picture of the atomic cloud with a CCD!
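Finally, a small sketch (my own, with the same assumed trap parameters and a rubidium-like mass) comparing the two Gaussian length scales that produce the bimodal profile:

```python
import numpy as np

hbar, kB = 1.0546e-34, 1.381e-23
m = 1.44e-25                         # assumed atomic mass (roughly rubidium-87), kg
omega = 2.0*np.pi*100.0              # assumed isotropic trap frequency
T = 4.5e-7                           # temperature near the trap T_c estimated above

a = np.sqrt(hbar/(m*omega))          # condensate (ground-state) width
R = np.sqrt(2.0*kB*T/(m*omega**2))   # thermal-cloud width

print(f"a = {a*1e6:.2f} micron, R = {R*1e6:.1f} micron")   # ~1 micron vs tens of microns
```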